# Mobile App for Lottery Addiction

Many people start playing the lottery for fun, but for some the habit escalates into an addiction that can lead to financial hardship and other problems.
We seek to create an application that helps users better estimate their chances of winning. We use historical data from the national 6/49 lottery game in Canada.



In [11]:
#Creating our core functions that we will use repeatedly in this project
def factorials(n):
    fact = 1
    for i in range(n,0,-1):
        fact *= i
    return fact

In [12]:
def combinations(n,k):
    numerator = factorials(n)
    denominator = factorials(k) * (factorials(n-k))
    return numerator/denominator
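
As a quick cross-check, Python 3.8+ ships `math.factorial` and `math.comb`, which should agree with the hand-written helpers above. The sketch below re-declares the two functions so that it runs on its own; it is an illustration, not a replacement for the notebook's code.

```python
from math import comb, factorial

def factorials(n):
    # Same loop as the notebook's helper: n * (n-1) * ... * 1
    fact = 1
    for i in range(n, 0, -1):
        fact *= i
    return fact

def combinations(n, k):
    # True division, so the result is a float even when it is a whole number
    return factorials(n) / (factorials(k) * factorials(n - k))

assert factorials(6) == factorial(6) == 720
# comb returns an int; the float 13983816.0 still compares equal to it
assert combinations(49, 6) == comb(49, 6) == 13983816
```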

In [13]:
def one_ticket_probability(user_numbers):
    total_outcomes = combinations(49,6)
    ticket_probability = (1/total_outcomes)*100
    return "The probability of a successful ticket with the numbers {} is {:.7f}%. Meaning your chance of winning is 1 in {}!".format(user_numbers, ticket_probability, int(total_outcomes))

In [14]:
n = [1,27,3,9,5,6]
one_ticket_probability(n)

'The probability of a successful ticket with the numbers [1, 27, 3, 9, 5, 6] is 0.0000072%. Meaning your chance of winning is 1 in 13983816!'

In [15]:
import pandas as pd
# read_csv already returns a DataFrame, so no further conversion is needed
data = pd.read_csv("649.csv")

In [16]:
data.head()

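| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |
| 3 | 649 | 4 | 0 | 7/3/1982 | 3 | 9 | 10 | 13 | 20 | 43 | 34 |
| 4 | 649 | 5 | 0 | 7/10/1982 | 5 | 14 | 21 | 31 | 34 | 47 | 45 |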


In [21]:
# The extract_numbers function returns the six winning numbers of one draw as a set.
def extract_numbers(row):
    # Positions 4 through 9 hold NUMBER DRAWN 1 to NUMBER DRAWN 6
    return set(row.iloc[4:10].values)

winning_numbers = data.apply(extract_numbers, axis=1)

# The check_historical_occurrence function compares the user's numbers with winning numbers from the past
def check_historical_occurrence(user_numbers, historical_numbers):

    user_numbers_set = set(user_numbers)
    # Element-wise comparison: True for every past draw equal to the user's set
    check_occurrence = historical_numbers == user_numbers_set
    n_occurrences = check_occurrence.sum()

    if n_occurrences == 0:
        return "The combination {} did not match past winning results. {}".format(user_numbers, one_ticket_probability(user_numbers))
    else:
        return "The combination {} matched a previous winning result. {}".format(user_numbers, one_ticket_probability(user_numbers))


In [22]:
test_input_3 = [33, 36, 37, 39, 81, 41]
check_historical_occurrence(test_input_3, winning_numbers)

'The combination [33, 36, 37, 39, 81, 41] did not match past winning results. The probability of a successful ticket with the numbers [33, 36, 37, 39, 81, 41] is 0.0000072%. Meaning your chance of winning is 1 in 13983816!'

The first function above goes through the entire dataset and returns only the six winning numbers of each draw, as a set. The result is what the second function receives as its historical numbers.

The second function takes a user's selected numbers and compares them with each set of historical numbers to see whether they match any past winning combination.

The reason we use sets is that set equality ignores order, so the user's numbers match a past draw whenever they contain the same six numbers, regardless of the order in which they were entered.
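
To illustrate why sets make this comparison convenient, here is a minimal, pandas-free sketch. The two past draws are the first two rows shown in the `data.head()` output above; the helper `count_matches` is hypothetical and only mirrors the idea behind `check_historical_occurrence`.

```python
# First two draws from the data.head() output, stored as sets
past_draws = [
    {3, 11, 12, 14, 41, 43},  # draw 1
    {8, 33, 36, 37, 39, 41},  # draw 2
]

def count_matches(user_numbers, historical_sets):
    """Count past draws whose six numbers equal the user's, in any order."""
    user_set = set(user_numbers)
    return sum(1 for draw in historical_sets if draw == user_set)

# Same numbers as draw 1, entered in reverse order
print(count_matches([43, 41, 14, 12, 11, 3], past_draws))  # 1
print(count_matches([1, 2, 3, 4, 5, 6], past_draws))       # 0
```

With lists instead of sets, the first call would find no match, because list equality depends on order.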

# Multiple tickets

So far we've been able to estimate the chances of winning when an individual buys a single lottery ticket. Lottery addicts, however, rarely buy only one ticket, so we need to be able to estimate one's chances when buying more than one.

We want to create a function whose input is the number of different tickets a user wants to play: an integer between 1 and the total number of different tickets (13,983,816). The function should report the probability of winning the big prize depending on how many tickets are played.

In [31]:
def multi_ticket_probability(n_tickets):
    total_possible_outcomes = combinations(49,6) #49 total numbers, and 6 numbers sampled
    success_probability = (n_tickets/total_possible_outcomes)*100
    return "The chance of winning with {} tickets is {:.7f}%, or {} in {}".format(n_tickets,success_probability,n_tickets,int(total_possible_outcomes))

In [36]:
multi_ticket_probability(10000)

'The chance of winning with 10000 tickets is 0.0715112%, or 10000 in 13983816'
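
The function above does not check that its input lies in the stated range. A minimal sketch of such a check follows. The name `validated_multi_ticket_probability` and the choice to raise `ValueError` are our assumptions, and `math.comb` stands in for the `combinations` helper so that the block runs on its own. As in the original, the calculation assumes every ticket is different.

```python
from math import comb

TOTAL_OUTCOMES = comb(49, 6)  # 13,983,816 possible tickets

def validated_multi_ticket_probability(n_tickets):
    """Return the winning probability (in percent) for n_tickets distinct tickets."""
    # bool is a subclass of int, so reject it explicitly
    if isinstance(n_tickets, bool) or not isinstance(n_tickets, int):
        raise ValueError("n_tickets must be an integer")
    if not 1 <= n_tickets <= TOTAL_OUTCOMES:
        raise ValueError("n_tickets must be between 1 and {:,}".format(TOTAL_OUTCOMES))
    return n_tickets / TOTAL_OUTCOMES * 100

print("{:.7f}%".format(validated_multi_ticket_probability(10000)))  # 0.0715112%
```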

# Probabilities for winning numbers

Now that we've established the chances for multiple tickets, we can look to determine the probability for two, three, four, or five winning numbers.

For context, 6/49 lotteries (where our data comes from) award smaller prizes for matching 2, 3, 4, or 5 of the 6 numbers. Thus, users might be interested in knowing the chances of having 2, 3, 4, or 5 winning numbers.

The function that presents these probabilities should work as follows:
 - The ticket still consists of 6 different numbers ranging from 1 to 49.
 - The input is an integer between 2 and 5 that represents the number of winning numbers expected.
 - The function reports the probability of having exactly that number of winning numbers.

To compute this probability for n winning numbers, we can proceed as follows:
 - Find the number of n-number combinations that can be formed from the 6 numbers on the ticket. We can use the combinations function we wrote at the beginning.
 - Find the number of ways the remaining 6 - n drawn numbers can be chosen from the 43 numbers that are not on the ticket.
 - Multiply these two values to find the total number of successful outcomes.
 - Lastly, divide the total number of successful outcomes by the total number of possible outcomes.
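
Written as a formula, the steps above amount to the following, where the symbol $n$ (our notation, not a name used in the code) is the number of winning numbers on the ticket:

$$
P(\text{exactly } n \text{ winning numbers}) = \frac{\binom{6}{n}\,\binom{43}{6-n}}{\binom{49}{6}}
$$

For example, with $n = 5$ the numerator is $6 \times 43 = 258$ and the denominator is $13{,}983{,}816$, which gives roughly $0.0018450\%$.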

In [103]:
def probability_less_6(number_combinations):
    total_outcomes = combinations(49,6)
    # Ways to pick the matching numbers from the 6 on the ticket
    total_num_combinations = combinations(6,number_combinations)
    # Ways to pick the remaining drawn numbers from the 43 not on the ticket
    possible_successful_outcomes = combinations(43, 6-number_combinations)
    total_num_successful_outcomes = total_num_combinations * possible_successful_outcomes

    success_probability = (total_num_successful_outcomes / total_outcomes)*100
    return "Your chances of having {} winning numbers with this ticket are {:.7f}%. In other words, you have a 1 in {:,} chance to win.".format(
        number_combinations, success_probability, round(total_outcomes/total_num_successful_outcomes))

In [104]:
test = [2,3,4,5]
for i in test:
    print(probability_less_6(i))
    print('--------------------------')

Your chances of having 2 winning numbers with this ticket are 13.2378029%. In other words, you have a 1 in 8 chance to win.
--------------------------
Your chances of having 3 winning numbers with this ticket are 1.7650404%. In other words, you have a 1 in 57 chance to win.
--------------------------
Your chances of having 4 winning numbers with this ticket are 0.0968620%. In other words, you have a 1 in 1,032 chance to win.
--------------------------
Your chances of having 5 winning numbers with this ticket are 0.0018450%. In other words, you have a 1 in 54,201 chance to win.
--------------------------


For the possible successful outcomes, the non-matching numbers are drawn from the 49 - 6 = 43 numbers that are not on the ticket, because we are interested in outcomes that match exactly n numbers, not at least n. For example, take the 5-number combination (1,2,3,4,5) from the ticket (1,2,3,4,5,6). There are 44 six-number outcomes that contain it, but one of those, (1,2,3,4,5,6), must be removed because it matches all six numbers rather than exactly five.
 - We basically take out the six-number combination that wins the grand prize, since here we are only interested in outcomes that match fewer than six numbers.

So for each 5-number combination there are 43 possible outcomes. In general, for each n-number combination there are combinations(43, 6-n) possible outcomes.
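
As a sanity check on this counting argument, the sketch below uses the standard library's `math.comb` (rather than the `combinations` function defined earlier) to count the draws with exactly n matches for every n from 0 to 6. If the reasoning holds, these counts should add up to the total number of possible draws. The helper name `exact_match_count` is ours and purely illustrative.

```python
from math import comb

def exact_match_count(n_matches, ticket_size=6, pool_size=49):
    """Number of draws sharing exactly n_matches numbers with one ticket."""
    # Choose which ticket numbers are matched
    matching = comb(ticket_size, n_matches)
    # Choose the rest of the draw from numbers that are not on the ticket
    non_matching = comb(pool_size - ticket_size, ticket_size - n_matches)
    return matching * non_matching

counts = {n: exact_match_count(n) for n in range(7)}
total_draws = comb(49, 6)

# Every draw matches the ticket in exactly one of 0..6 numbers
assert sum(counts.values()) == total_draws
print(counts[5])  # 258, i.e. 6 five-number combinations times 43 outcomes each
```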