# Mobile App for Lottery Addiction

Many people start playing the lottery for fun, but for some it becomes a habit and then an addiction. They use up their savings, take out loans, and end up in debt, or turn to risky behaviour such as theft.

The medical institute is developing a mobile app that helps gamblers estimate their chances of winning. This project develops the logical core of the app: the functions that calculate probabilities, which the team of engineers will incorporate into the app. The project answers questions such as:
* What is the probability of winning the big prize with a single ticket?
* What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
* What is the probability of having at least five (or four, or three) winning numbers on a single ticket?

The first version of the app uses historical data from the national 6/49 lottery in Canada. The dataset has 3,665 drawings, dating from 1982 to 2018. The dataset is available [here](https://www.kaggle.com/datascienceai/lottery-dataset).

# Core functions

The two functions used most often are:
* factorial() for calculating factorials
* combinations() for calculating combinations

In [1]:
# factorial function and combination function

def factorial(n):
    total_product = 1
    for i in range(n, 0, -1):
        total_product *= i
    return total_product

def combinations(n, k):
    numerator = factorial(n)
    denominator = factorial(k)*factorial(n-k)
    return numerator / denominator
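
As a quick cross-check, the two helpers can be compared against Python's standard library. The sketch below assumes Python 3.8 or later (for `math.comb`); the name `combinations_exact` is hypothetical and not part of the notebook. It uses floor division so the count stays an exact integer instead of a float.

```python
import math

def combinations_exact(n, k):
    # Hypothetical integer-only variant: n! / (k! * (n - k)!) always divides
    # evenly, so floor division returns the exact count without using floats.
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# Number of possible 6/49 tickets, checked against the standard library.
total_tickets = combinations_exact(49, 6)
print(total_tickets)                      # 13983816
print(total_tickets == math.comb(49, 6))  # True
```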


# One ticket Probability

The engineering team wants the app to report the probability of winning the big prize with a given ticket. The function used by the app will work as follows:
* Inside the app, the user inputs six different numbers from 1 to 49.
* Under the hood, the six numbers will come as a Python list, which will serve as the single input to our function.
* The engineering team wants the function to present the probability value in a friendly way.

A ticket wins the big prize if all six of its numbers match the six numbers drawn from the set of 49. The function below takes a list of 6 numbers and returns a message stating the probability of winning.

In [2]:
def one_ticket_probability(user_numbers):
    possible_outcomes = combinations(49, 6)
    prob_one_ticket = (1 / possible_outcomes)*100
    return ('''Your chances to win the big prize with the numbers {} are {:.7f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(user_numbers,
                    prob_one_ticket, int(possible_outcomes)))

In [3]:
# Testing the above function
test_input_1 = [2, 23, 10, 34, 40, 15]
one_ticket_probability(test_input_1)

'Your chances to win the big prize with the numbers [2, 23, 10, 34, 40, 15] are 0.0000072%.\nIn other words, you have a 1 in 13,983,816 chances to win.'

In [4]:
test_input_2 = [9, 26, 41, 7, 15, 6]
one_ticket_probability(test_input_2)

'Your chances to win the big prize with the numbers [9, 26, 41, 7, 15, 6] are 0.0000072%.\nIn other words, you have a 1 in 13,983,816 chances to win.'

The function first calculates the total number of possible combinations of 6 numbers chosen from 1 to 49, then calculates the probability that the combination entered by the user is the winning one. Every combination is equally likely, so the result is the same for any valid ticket, as the two tests show.
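
The requirements state that the user enters six different numbers from 1 to 49, but the function itself does not check this. A minimal sketch of such a check is shown here; `validate_ticket` is a hypothetical helper, not something the engineering team specified.

```python
def validate_ticket(user_numbers):
    """Hypothetical helper: return True if the list is a valid 6/49 ticket."""
    return (
        len(user_numbers) == 6
        and len(set(user_numbers)) == 6  # no repeated numbers
        and all(isinstance(n, int) and 1 <= n <= 49 for n in user_numbers)
    )

print(validate_ticket([2, 23, 10, 34, 40, 15]))  # True
print(validate_ticket([2, 2, 10, 34, 40, 15]))   # False (repeated number)
print(validate_ticket([0, 23, 10, 34, 40, 50]))  # False (out of range)
```

The app could call a check like this before passing the list to one_ticket_probability().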

# Exploring the historical data

The institute wants the historical data from the Canadian national 6/49 lottery to be used when reporting a user's chance of having the winning numbers on their ticket.

In [5]:
# Reading the dataset

import pandas as pd
data_01 = pd.read_csv('649.csv')
data_01.shape

(3665, 11)

In [6]:
data_01.head(3)

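| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |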


In [7]:
data_01.tail(3)

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


# Function for checking historical data

The function compares the numbers on a ticket with the historical lottery data to find out whether that ticket would ever have won. The engineering team wants the following to be taken into consideration:
* Inside the app, the user inputs six different numbers from 1 to 49.
* Under the hood, the six numbers will come as a Python list and serve as an input to our function.
* The engineering team wants us to write a function that prints:
  * the number of times the combination selected occurred in the Canada data set; and
  * the probability of winning the big prize in the next drawing with that combination.

The extract_numbers() function will go over each row of the dataframe and extract the six winning numbers as a Python set.

In [8]:
def extract_numbers(row):
    row = row[4:10]
    row = set(row.values)
    return row

winning_numbers = data_01.apply(extract_numbers, axis = 1)
winning_numbers.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object
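
extract_numbers() selects the six drawn numbers by position. A possible alternative, sketched here with only the standard library, selects them by column label so the bonus number cannot be picked up by accident if the column order changes. The two sample rows are copied from the dataset preview; `NUMBER_COLUMNS` and `extract_numbers_by_name` are hypothetical names used for illustration.

```python
import csv
import io

# Two rows copied from the dataset preview; the real data lives in 649.csv.
sample_csv = """PRODUCT,DRAW NUMBER,SEQUENCE NUMBER,DRAW DATE,NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER
649,1,0,6/12/1982,3,11,12,14,41,43,13
649,2,0,6/19/1982,8,33,36,37,39,41,9
"""

NUMBER_COLUMNS = [f"NUMBER DRAWN {i}" for i in range(1, 7)]

def extract_numbers_by_name(record):
    # Pick the six drawn numbers by column label, leaving out the bonus number.
    return {int(record[col]) for col in NUMBER_COLUMNS}

reader = csv.DictReader(io.StringIO(sample_csv))
winning_sets = [extract_numbers_by_name(record) for record in reader]

print(winning_sets[0] == {3, 11, 12, 14, 41, 43})  # True
print(13 in winning_sets[0])                       # False (bonus number excluded)
```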

In [9]:
# Generating a function for comparing user numbers and historical data

def check_historical_occurrence(user_numbers, historical_numbers):
    user_nums = set(user_numbers)
    occurrences = historical_numbers == user_nums
    
    if occurrences.sum() == 0:
        print('''The combination {} has never occurred.
This doesn't mean it's more likely to occur now. Your chances to win the big prize in the next drawing using the combination {} are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.'''.format(user_numbers, user_numbers))
    
    else:
        print('''The number of times combination {} has occurred in the past is {}.
Your chances to win the big prize in the next drawing using the combination {} are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.'''.format(user_numbers, occurrences.sum(),
                                                                            user_numbers))
    

In [10]:
# testing the historical occurrence function

test_numbers_1 = [33, 5, 17, 28, 35, 43]
check_historical_occurrence(test_numbers_1, winning_numbers)

The combination [33, 5, 17, 28, 35, 43] has never occurred.
This doesn't mean it's more likely to occur now. Your chances to win the big prize in the next drawing using the combination [33, 5, 17, 28, 35, 43] are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.


In [11]:
test_numbers_2 = [33, 36, 37, 39, 8, 41]
check_historical_occurrence(test_numbers_2, winning_numbers)

The number of times combination [33, 36, 37, 39, 8, 41] has occurred in the past is 1.
Your chances to win the big prize in the next drawing using the combination [33, 36, 37, 39, 8, 41] are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.


The check_historical_occurrence function compares the winning numbers from each row of the historical dataset with the list of numbers entered by the user. A single ticket of 6 numbers from 1 to 49 has a very small chance of winning the big prize (1 in 13,983,816). The next step is to explore how playing more than one ticket changes a user's chances of winning.
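
The same comparison can be sketched without pandas, which also shows where the 0.0000072% figure in the messages comes from. This is an illustration under assumptions: `count_occurrences` is a hypothetical helper, and `history` holds only three draws copied from the dataset preview.

```python
import math

def count_occurrences(user_numbers, historical_sets):
    """Hypothetical helper: how many past draws match the ticket exactly."""
    ticket = set(user_numbers)
    return sum(1 for drawn in historical_sets if drawn == ticket)

# Three draws copied from the dataset preview (not the full history).
history = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
]

# The chance in the next drawing does not depend on past occurrences.
next_draw_pct = 100 / math.comb(49, 6)

print(count_occurrences([33, 36, 37, 39, 8, 41], history))  # 1
print(count_occurrences([33, 5, 17, 28, 35, 43], history))  # 0
print(f"{next_draw_pct:.7f}%")                              # 0.0000072%
```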

# Multi ticket probability 

Lottery players often play more than one ticket to increase their chance of winning. The next focus of the project is a function that calculates a player's chances of winning the lottery for any number of different tickets.

The engineering team wants the following to be considered when developing the multi_ticket_probability function, which gives the probability of winning with a given number of tickets:
* The user will input the number of different tickets they want to play (without inputting the specific combinations they intend to play).
* Our function will see an integer between 1 and 13,983,816 (the maximum number of different tickets).
* The function should print information about the probability of winning the big prize depending on the number of different tickets played.

In [12]:
# Multi ticket probability function

def multi_ticket_probability(n_tickets):
    possible_outcomes = combinations(49, 6)
    probability_tickets = (n_tickets / possible_outcomes)*100
    
    if n_tickets == 1:
        print('''Your chances to win the big prize with one ticket are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(probability_tickets, int(possible_outcomes)))
    
    else:
        combinations_simplified = round(possible_outcomes / n_tickets)   
        print('''Your chances to win the big prize with {:,} different tickets are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(n_tickets, probability_tickets,
                                                               combinations_simplified))

In [13]:
# Testing the above function

test_inputs = [1, 10, 100, 10000, 1000000, 6991908, 13983816]
for i in test_inputs:
    multi_ticket_probability(i)
    print('----------------------------------')

Your chances to win the big prize with one ticket are 0.000007%.
In other words, you have a 1 in 13,983,816 chances to win.
----------------------------------
Your chances to win the big prize with 10 different tickets are 0.000072%.
In other words, you have a 1 in 1,398,382 chances to win.
----------------------------------
Your chances to win the big prize with 100 different tickets are 0.000715%.
In other words, you have a 1 in 139,838 chances to win.
----------------------------------
Your chances to win the big prize with 10,000 different tickets are 0.071511%.
In other words, you have a 1 in 1,398 chances to win.
----------------------------------
Your chances to win the big prize with 1,000,000 different tickets are 7.151124%.
In other words, you have a 1 in 14 chances to win.
----------------------------------
Your chances to win the big prize with 6,991,908 different tickets are 50.000000%.
In other words, you have a 1 in 2 chances to win.
----------------------------------
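Your chances to win the big prize with 13,983,816 different tickets are 100.000000%.
In other words, you have a 1 in 1 chances to win.
----------------------------------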

The multi_ticket_probability function takes the number of tickets entered by the user, uses the combinations function to calculate the probability that one of those tickets has the winning numbers, and displays the user's chance of winning.

From the above results, a lottery player who wants to be certain of winning would have to purchase all 13,983,816 different tickets. Buying that many tickets is unrealistic, so players might instead buy a smaller number of tickets, considering that they can also win other prizes.
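
The relationship can also be read in the other direction: how many different tickets are needed to reach a given chance of winning. The sketch below is a hypothetical helper (`tickets_needed` is not part of the requirements) and assumes Python 3.8 or later for `math.comb`.

```python
import math
from fractions import Fraction

TOTAL_TICKETS = math.comb(49, 6)  # 13,983,816 possible tickets

def tickets_needed(target_probability):
    """Hypothetical helper: fewest different tickets giving at least the target chance."""
    # Fraction keeps the arithmetic exact, avoiding float rounding at the boundary.
    return math.ceil(Fraction(target_probability) * TOTAL_TICKETS)

print(tickets_needed(Fraction(1, 100)))  # 139839
print(tickets_needed(Fraction(1, 2)))    # 6991908
print(tickets_needed(1))                 # 13983816
```

Even a 1% chance of winning the big prize requires almost 140,000 different tickets.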

# Fewer than six winning numbers function

There are smaller prizes in the 6/49 lottery if a player's ticket matches two, three, four or five of the six numbers drawn. The user might want to know the probability of having two, three, four or five of the winning numbers.
The engineering team wants the following considered when adding this capability to the app:
* Inside the app, the user inputs:
    * six different numbers from 1 to 49; and
    * an integer between 2 and 5 that represents the number of winning numbers expected
* Our function prints information about the probability of having the inputted number of winning numbers.


In [25]:
# Less than six function

def probability_less_6(input_val):
    # Ways to choose which input_val of the 6 ticket numbers are drawn
    combination_ticket = combinations(6, input_val)
    # The remaining drawn numbers must come from the 43 numbers not on the ticket
    combination_other_nums = combinations(43, 6 - input_val)
    n_successful_out = combination_ticket*combination_other_nums
    total_possible_outcome = combinations(49, 6)
    prob_win_input = (n_successful_out / total_possible_outcome)*100
    combinations_simplified = round(total_possible_outcome / n_successful_out)
    
    print('''Your chances of having {} winning numbers with this ticket are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(input_val, prob_win_input,
                                                               int(combinations_simplified)))
   

Testing the function above for the following list of possible inputs:



In [27]:
test_list = [2, 3, 4, 5]
for i in test_list:
    probability_less_6(i)
    print('----------------------------')

Your chances of having 2 winning numbers with this ticket are 13.237803%.
In other words, you have a 1 in 8 chances to win.
----------------------------
Your chances of having 3 winning numbers with this ticket are 1.765040%.
In other words, you have a 1 in 57 chances to win.
----------------------------
Your chances of having 4 winning numbers with this ticket are 0.096862%.
In other words, you have a 1 in 1,032 chances to win.
----------------------------
Your chances of having 5 winning numbers with this ticket are 0.001845%.
In other words, you have a 1 in 54,201 chances to win.
----------------------------


The probability_less_6 function accepts the expected number of winning numbers, an integer from 2 to 5. It counts the successful outcomes by multiplying the number of ways to choose that many of the 6 numbers on the ticket by the number of ways to choose the remaining drawn numbers from the 43 numbers that are not on the ticket. The probability of having exactly that many winning numbers is the ratio of the successful outcomes to the total number of ways of picking 6 numbers from 1 to 49.

From the results, matching exactly 2 winning numbers is the most likely of these outcomes, with about a 1 in 8 chance. The lottery player may want to try several combinations, perhaps including historical winning numbers, to find out how many winning numbers (2, 3, 4, or 5) each ticket would have had.
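
The opening questions of the project ask about having at least five (or four, or three) winning numbers on a single ticket. One way to sketch that calculation is to sum the counts for each exact number of matches from the requested value up to six. In the sketch below, `ways_exactly` and `probability_at_least` are hypothetical names, and Python 3.8 or later is assumed for `math.comb`.

```python
import math

def ways_exactly(k):
    # k of the 6 ticket numbers are drawn; the other 6 - k drawn numbers
    # come from the 43 numbers that are not on the ticket.
    return math.comb(6, k) * math.comb(43, 6 - k)

def probability_at_least(k):
    """Hypothetical helper: chance that a ticket matches k or more drawn numbers."""
    successful = sum(ways_exactly(j) for j in range(k, 7))
    return successful / math.comb(49, 6)

# Sanity check: the counts for 0 to 6 matches cover every possible draw.
print(sum(ways_exactly(j) for j in range(7)) == math.comb(49, 6))  # True

for k in (3, 4, 5):
    print(f"At least {k} winning numbers: {probability_at_least(k) * 100:.6f}%")
```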

# Conclusion

The project has developed the logic for the main functions of the app, based on the medical institute's requirements for an app that helps gamblers make decisions about spending money on tickets by showing their odds of winning the big prize or other prizes. The four functions below were designed as a result.
* one_ticket_probability(): calculates the probability of winning the big prize with a single ticket
* check_historical_occurrence(): checks whether a certain combination has occurred in the Canada lottery data set
* multi_ticket_probability(): calculates the probability for any number of tickets between 1 and 13,983,816
* probability_less_6(): calculates the probability of having two, three, four or five winning numbers

Possible features for a second version of the app include:

* Making the outputs even easier to understand by adding fun analogies (for example, we can find probabilities for strange events and compare them with the chances of winning the lottery; for instance, we can output something along the lines of "You are 100 times more likely to be the victim of a shark attack than to win the lottery").
* Combining one_ticket_probability() and check_historical_occurrence() to output information on probability and historical occurrence at the same time.
* Combining multi_ticket_probability() and probability_less_6() so that users can find out their chance of winning prizes if they decide to buy a certain number of tickets.