# Project: Analyzing Lottery Probabilities

In this project, I use Python to analyze the probabilities of winning the lottery, together with historical data from Canada's national 6/49 lottery. The dataset covers 3,665 drawings from 1982 to 2018.

Questions that will be answered include:

- What is the probability of winning the lotto with one ticket?
- What is the probability of winning with 40 tickets?
- What is the probability of having 5 out of 6 winning numbers on a single ticket?

### Defining Core Functions

To answer probability questions for different lottery scenarios, I will repeatedly need to count combinations. Because the numbers are drawn without replacement and their order does not matter, the number of possible outcomes is C(n, k) = n! / (k! (n - k)!), so both a factorial function and a combinations function are needed.

In [1]:
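def factorial(n):
    # Multiply n * (n-1) * ... * 1; returns 1 when n is 0
    final_result = 1
    for i in range(n, 0, -1):
        final_result *= i
    return final_result

def combinations(n, k):
    # Number of ways to choose k items from n when order does not matter
    numerator = factorial(n)
    denominator = factorial(k) * factorial(n-k)
    # The division is always exact, so integer division avoids float rounding
    return numerator // denominator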
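
As a sanity check, the two helpers can be compared against the standard library. The sketch below assumes Python 3.8 or later, where `math.comb` is available. The names `factorial_check` and `combinations_check` are hypothetical and are used only so the notebook's own functions are not shadowed.

```python
import math

def factorial_check(n):
    """Iterative factorial, written the same way as the helper above."""
    result = 1
    for i in range(n, 0, -1):
        result *= i
    return result

def combinations_check(n, k):
    """Number of ways to choose k items from n, ignoring order."""
    # Integer division is an assumption made here to keep the result an int
    return factorial_check(n) // (factorial_check(k) * factorial_check(n - k))

# Compare against the standard library implementations
assert factorial_check(6) == math.factorial(6)
assert combinations_check(49, 6) == math.comb(49, 6)
print(combinations_check(49, 6))  # 13983816
```

If both assertions pass, the hand-written helpers agree with `math.factorial` and `math.comb` for the 6/49 case.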

### Defining One Ticket Probability Function

This function calculates the probability of winning the big prize with a single ticket. The user passes in the numbers they want to play; because every six-number combination is equally likely, the result does not depend on which numbers are chosen.

In [2]:
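def one_ticket_probability(user_numbers):
    # Total number of possible six-number tickets in a 6/49 lottery
    n_outcomes = combinations(49, 6)
    # Exactly one of those outcomes matches the ticket
    succ_outcomes = 1/n_outcomes
    # Note: the value printed below is a percentage, not a raw probability
    percentage_form = succ_outcomes * 100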
    
    print('''The chances to win the big prize with the numbers {} are {:.7f}. In other words you have a 1 in {:,} chances to win.'''.format(user_numbers, 
                                                   percentage_form, int(n_outcomes)))

### Testing One Ticket Probability Function

I use two separate inputs to test the function.

In [3]:
test_input_1 = [2, 43, 69, 42, 8, 24]
one_ticket_probability(test_input_1)

The chances to win the big prize with the numbers [2, 43, 69, 42, 8, 24] are 0.0000072. In other words you have a 1 in 13,983,816 chances to win.


In [4]:
test_input_2 = [5, 12, 20, 33, 16, 9, 7]
one_ticket_probability(test_input_2)

The chances to win the big prize with the numbers [5, 12, 20, 33, 16, 9, 7] are 0.0000072. In other words you have a 1 in 13,983,816 chances to win.


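As expected, the probability of winning the big prize is minuscule. Note that the value printed above is a percentage (about 0.0000072%), and that the function does not validate its input: the first test includes 69, which is outside the 1 to 49 range, and the second contains seven numbers, yet both report the same odds.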
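
Since `one_ticket_probability` accepts any list, a small input check could be run before calling it. The sketch below uses a hypothetical helper, `is_valid_ticket`, and assumes the standard 6/49 rule of six distinct whole numbers between 1 and 49. It is not part of the original notebook.

```python
def is_valid_ticket(numbers, pool_size=49, picks=6):
    """Return True if `numbers` holds `picks` distinct integers in 1..pool_size."""
    unique = set(numbers)
    return (
        len(numbers) == picks          # right amount of numbers
        and len(unique) == picks       # no duplicates
        and all(isinstance(n, int) and 1 <= n <= pool_size for n in unique)
    )

print(is_valid_ticket([2, 43, 69, 42, 8, 24]))     # False: 69 is out of range
print(is_valid_ticket([5, 12, 20, 33, 16, 9, 7]))  # False: seven numbers
print(is_valid_ticket([3, 11, 12, 14, 41, 43]))    # True
```

Both test inputs used above would be rejected by this check, while the first historical draw in the dataset passes.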

### Reading in the Historical Lottery Drawings Dataset

In [5]:
import pandas as pd
historical_drawings = pd.read_csv('649.csv')
print(historical_drawings.shape)
historical_drawings.head(3)

(3665, 11)


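| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |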


In [6]:
historical_drawings.tail(3)

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


### Writing a Function to Extract Winning Lottery Numbers from Rows

In [7]:
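def extract_numbers(row):
    # Positions 4 to 9 hold NUMBER DRAWN 1 through NUMBER DRAWN 6;
    # .iloc makes the positional slice explicit and skips the bonus number
    row = row.iloc[4:10]
    # A set ignores the order in which the numbers were drawn
    row = set(row.values)
    return row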

winning_numbers = historical_drawings.apply(extract_numbers, axis=1)
winning_numbers.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

### Function to Check Historical Occurrences

This function lets the user input their own lottery numbers to see whether that six-number combination has ever won in the roughly 36 years covered by the dataset.

In [15]:
def check_historical_occurence(user_numbers, historical_numbers):
    '''
    user_numbers: a Python list
    historical_numbers: a Pandas Series of sets
    '''
    # Convert to a set so the order of the numbers does not matter
    user_numbers_set = set(user_numbers)
    # Boolean Series: True where a past draw matches the ticket exactly
    check_occurences = historical_numbers == user_numbers_set
    n_occurences = check_occurences.sum()
    if n_occurences == 0:
        print('''The combination {} has never occurred. 
'''.format(user_numbers))
    else:
        print('''The number of times combination {} has occurred in the past
is {}.
'''.format(user_numbers, n_occurences))

### Testing the Function

Two examples check whether a given combination has won in the past.

In [16]:
test_input_3 = [33, 37, 5, 10, 34, 12]
check_historical_occurence(test_input_3, winning_numbers)

The combination [33, 37, 5, 10, 34, 12] has never occurred. 



In [17]:
test_input_5 = [33, 36, 37, 39, 8, 41]
check_historical_occurence(test_input_5, winning_numbers)

The number of times combination [33, 36, 37, 39, 8, 41] has occurred in the past
is 1.
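
The same lookup can be sketched without pandas, which may help when checking many tickets against the same history. This is an alternative approach, not the notebook's method: the hypothetical `count_occurrences` helper stores each draw as a `frozenset` in a `collections.Counter`, and the sample data is limited to the three draws shown in the dataset preview earlier.

```python
from collections import Counter

def count_occurrences(user_numbers, historical_draws):
    """Count how many past draws match `user_numbers` exactly, ignoring order."""
    # frozenset is hashable, so each draw can serve as a Counter key
    counts = Counter(frozenset(draw) for draw in historical_draws)
    # Counter returns 0 for combinations that never appeared
    return counts[frozenset(user_numbers)]

# First three draws from the dataset preview (bonus number excluded)
sample_draws = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
]

print(count_occurrences([33, 36, 37, 39, 8, 41], sample_draws))  # 1
print(count_occurrences([33, 37, 5, 10, 34, 12], sample_draws))  # 0
```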



### Writing a Function to Determine Multi-Ticket Probability

This function calculates the probability of winning the big prize with any number of distinct tickets, from a single ticket up to all 13,983,816 possible combinations.

In [11]:
def multi_ticket_probability(n_tickets):
    # Total number of possible six-number tickets
    n_outcomes = combinations(49, 6)
    # Each distinct ticket covers exactly one outcome
    succ_outcomes = n_tickets/n_outcomes
    percentage_form = succ_outcomes * 100
    
    if n_tickets == 1:
        print('''Your chances to win the big prize with one ticket are
{:.6f}%'''.format(percentage_form))
    
    else:
        print('''The probability of winning the lotto with {:,} tickets
are {:.6f}%.'''.format(n_tickets, percentage_form))

In [12]:
test_input_6 = [1, 10, 100, 1000, 1000000, 6991908, 13983816]

for test_input in test_input_6:
    multi_ticket_probability(test_input)
    print('-----------------------')

Your chances to win the big prize with one ticket are
0.000007%
-----------------------
The probability of winning the lotto with 10 tickets
are 0.000072%.
-----------------------
The probability of winning the lotto with 100 tickets
are 0.000715%.
-----------------------
The probability of winning the lotto with 1,000 tickets
are 0.007151%.
-----------------------
The probability of winning the lotto with 1,000,000 tickets
are 7.151124%.
-----------------------
The probability of winning the lotto with 6,991,908 tickets
are 50.000000%.
-----------------------
The probability of winning the lotto with 13,983,816 tickets
are 100.000000%.
-----------------------
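
The relationship also works in reverse: given a target probability, how many tickets are needed? The sketch below is a hedged illustration using hypothetical helpers `win_probability` and `tickets_needed`. It assumes every ticket is a different combination, as the function above implicitly does, and uses `fractions.Fraction` to keep the arithmetic exact (`math.comb` requires Python 3.8 or later).

```python
import math
from fractions import Fraction

TOTAL_OUTCOMES = math.comb(49, 6)  # 13,983,816 possible tickets

def win_probability(n_tickets):
    """Exact jackpot probability with n_tickets distinct tickets."""
    return Fraction(n_tickets, TOTAL_OUTCOMES)

def tickets_needed(target_probability):
    """Smallest number of distinct tickets whose probability reaches the target."""
    return math.ceil(Fraction(target_probability) * TOTAL_OUTCOMES)

print(float(win_probability(1_000_000)) * 100)  # roughly 7.15 (percent)
print(tickets_needed(Fraction(1, 2)))           # 6991908
```

The second result matches the 6,991,908-ticket test case above, which gives a 50% chance.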


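### Defining a Function for Fewer Than 6 Winning Numbers

This function calculates the probability that a single ticket matches exactly 2, 3, 4, or 5 of the six winning numbers.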

In [13]:
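def probability_less_6(n_winning_numbers):
    # Ways to pick which of the 6 winning numbers appear on the ticket
    n_combinations = combinations(6, n_winning_numbers)
    # Ways to fill the remaining spots from the 43 non-winning numbers
    n_combinations_remaining = combinations(43, 6-n_winning_numbers)
    succ_outcomes = n_combinations * n_combinations_remaining
    
    n_combinations_total = combinations(49, 6)
    prob = succ_outcomes/n_combinations_total
    
    percentage = prob * 100
    # Express the probability as "1 in N", rounded to the nearest integer
    combo_simplified = round(n_combinations_total/succ_outcomes)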
    print('''Your chances of having {} winning numbers with this ticket
are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(n_winning_numbers, percentage, int(combo_simplified)))

In [14]:
for test_input in [2, 3, 4, 5]:
    probability_less_6(test_input)
    print('----------------------')

Your chances of having 2 winning numbers with this ticket
are 13.237803%.
In other words, you have a 1 in 8 chances to win.
----------------------
Your chances of having 3 winning numbers with this ticket
are 1.765040%.
In other words, you have a 1 in 57 chances to win.
----------------------
Your chances of having 4 winning numbers with this ticket
are 0.096862%.
In other words, you have a 1 in 1,032 chances to win.
----------------------
Your chances of having 5 winning numbers with this ticket
are 0.001845%.
In other words, you have a 1 in 54,201 chances to win.
----------------------
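
One way to check these figures is to compute the probability for every possible number of matches, from 0 to 6, and confirm that the results sum to exactly 1. The sketch below is an illustration under the same 6/49 assumptions; `prob_exact_matches` is a hypothetical helper built on `math.comb` (Python 3.8 or later) and `fractions.Fraction`.

```python
import math
from fractions import Fraction

def prob_exact_matches(k, pool=49, picks=6):
    """Exact probability that one ticket matches exactly k of the drawn numbers."""
    # Choose k of the winning numbers and the rest from the non-winning ones
    ways = math.comb(picks, k) * math.comb(pool - picks, picks - k)
    return Fraction(ways, math.comb(pool, picks))

distribution = {k: prob_exact_matches(k) for k in range(7)}

# The seven cases are mutually exclusive and cover every outcome
assert sum(distribution.values()) == 1

for k, p in distribution.items():
    print(f"{k} matches: {float(p) * 100:.6f}%")
```

Using exact fractions means the sum check is not affected by floating-point rounding.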
