# Build a lottery app to help addicts understand the odds of winning

- What is the probability of winning the grand prize with a single ticket?
- What is the probability of winning the grand prize with 40 tickets?
- What is the probability of matching at least five of the winning numbers on a single ticket?

Six numbers are drawn from a set of 49.

Numbers range from 1 to 49.

Numbers are drawn without replacement.
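
As a quick sanity check on these rules, here is a minimal sketch (not part of the original notebook, standard library only). Because numbers are drawn without replacement, the chance that all six draws land on your ticket is 6/49 × 5/48 × ... × 1/44, which should equal one over the number of possible 6-number combinations. The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Chance that each successive draw lands on one of the ticket's remaining numbers.
# Draw 1: 6 good numbers out of 49; draw 2: 5 out of 48; ... draw 6: 1 out of 44.
sequential = Fraction(1)
for i in range(6):
    sequential *= Fraction(6 - i, 49 - i)

# Number of unordered 6-number subsets of {1, ..., 49}.
total_outcomes = comb(49, 6)

print(total_outcomes)
print(sequential == Fraction(1, total_outcomes))
```

Using `Fraction` keeps the arithmetic exact, so the comparison is not affected by floating-point rounding.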



In [1]:
import pandas as pd

In [2]:
# Define a Function to Calculate Factorials

def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)

In [3]:
# Define a Function to Calculate Combinations

def combinations(n,k):
    # Integer division keeps the count exact; the quotient is always a whole number
    return factorial(n) // (factorial(k) * factorial(n-k))
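
One way to gain confidence in hand-rolled helpers like these is to compare them with the standard library. The sketch below assumes Python 3.8 or later (for `math.comb`) and redefines both helpers so it runs on its own; it uses floor division so the comparison is between exact integers.

```python
from math import comb

def factorial(n):
    # Same recursive definition as the notebook's helper.
    return 1 if n == 0 else n * factorial(n - 1)

def combinations(n, k):
    # Floor division keeps the result an exact integer.
    return factorial(n) // (factorial(k) * factorial(n - k))

# Compare against the standard library for every k from 0 to 49.
mismatches = [k for k in range(50) if combinations(49, k) != comb(49, k)]

print(mismatches)
print(combinations(49, 6))
```

An empty `mismatches` list means the two implementations agree for every ticket size drawn from 49 numbers.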

In [4]:
# Define a function to calculate the probability of winning the Grand Prize for any single ticket.

def single_ticket_probability(list_num,set_size = 49):
    num_combinations = combinations(set_size,len(list_num))
    probability = 1 / num_combinations * 100
    
    print(f'The probability of winning the grand prize with numbers {list_num} is {probability:.7f}%.')
    print()
    print(f'The chance of winning is 1 in {int(num_combinations):,}.')

In [5]:
ticket = [1,2,3,4,5,6]

single_ticket_probability(ticket)

The probability of winning the grand prize with numbers [1, 2, 3, 4, 5, 6] is 0.0000072%.

The chance of winning is 1 in 13,983,816.


In [7]:
# Define a Function to Parse Dates

def custom_date_parser(date):
    return pd.to_datetime(date, format='%m/%d/%Y')

# Import the Data

lotto_data = pd.read_csv('649.csv', parse_dates=['DRAW DATE'],date_parser=custom_date_parser)
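
If you're on pandas 2.0 or later, the `date_parser` argument may trigger a deprecation warning. A version-independent alternative is to read the column as text and convert it afterwards with `pd.to_datetime`. The sketch below assumes the raw file stores dates as zero-padded `MM/DD/YYYY` and uses an in-memory sample built from the first two draws rather than the real `649.csv`.

```python
import io
import pandas as pd

# Two illustrative rows in the layout the notebook's output suggests
# (an assumed stand-in for the real file).
sample_csv = io.StringIO(
    "PRODUCT,DRAW NUMBER,SEQUENCE NUMBER,DRAW DATE,"
    "NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,"
    "NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER\n"
    "649,1,0,06/12/1982,3,11,12,14,41,43,13\n"
    "649,2,0,06/19/1982,8,33,36,37,39,41,9\n"
)

sample = pd.read_csv(sample_csv)
# Convert the date column after loading, with the same explicit format string.
sample['DRAW DATE'] = pd.to_datetime(sample['DRAW DATE'], format='%m/%d/%Y')

print(sample.dtypes['DRAW DATE'])
```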

In [8]:
lotto_data.head(3)

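| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 1982-06-12 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 1982-06-19 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 1982-06-26 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |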


In [9]:
lotto_data.info()



<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3665 entries, 0 to 3664
Data columns (total 11 columns):
 #   Column           Non-Null Count  Dtype         
---  ------           --------------  -----         
 0   PRODUCT          3665 non-null   int64         
 1   DRAW NUMBER      3665 non-null   int64         
 2   SEQUENCE NUMBER  3665 non-null   int64         
 3   DRAW DATE        3665 non-null   datetime64[ns]
 4   NUMBER DRAWN 1   3665 non-null   int64         
 5   NUMBER DRAWN 2   3665 non-null   int64         
 6   NUMBER DRAWN 3   3665 non-null   int64         
 7   NUMBER DRAWN 4   3665 non-null   int64         
 8   NUMBER DRAWN 5   3665 non-null   int64         
 9   NUMBER DRAWN 6   3665 non-null   int64         
 10  BONUS NUMBER     3665 non-null   int64         
dtypes: datetime64[ns](1), int64(10)
memory usage: 315.1 KB


In [10]:
# Define a function to extract historical winning numbers

def extract_winning_num(row):
    # Positions 4 through 9 hold NUMBER DRAWN 1 to NUMBER DRAWN 6 (the bonus number is excluded)
    return set(row.iloc[4:10].values)


winning_nums = lotto_data.apply(extract_winning_num,axis=1)

In [11]:
winning_nums.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

In [36]:
# Write a Function to Check Historical Data

def check_historical_nums(list_num, historical_winners = winning_nums):
    list_num = set(list_num)
    # Compare the ticket against each past draw; True where all six numbers match
    compare_sets = list_num == historical_winners
    sum_instances = compare_sets.sum()
    noun = 'occasion' if sum_instances == 1 else 'occasions'
    
    if sum_instances == 0:
        print(f"The combination {list_num} has never occurred.")
    else:
        print(f"The combination of {list_num} has occurred in the past on {sum_instances} {noun}.")

    # Past draws are independent of future ones, so the rest of the message is the same either way
    print()
    print("This doesn't change your chances of winning.")
    print()
    print('The odds of winning the Grand Prize with any single ticket are 1 in 13,983,816.')

In [37]:
check_historical_nums([34, 5, 14, 47, 21, 31])

The combination of {34, 5, 14, 47, 21, 31} has occurred in the past on 1 occasion.

This doesn't change your chances of winning.

The odds of winning the Grand Prize with any single ticket are 1 in 13,983,816.
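
To put the historical check in perspective, a rough sketch (not from the original notebook): the dataset holds 3,665 draws, so even if every past draw were a different combination, only a tiny share of all possible combinations could ever have come up. The draw count is taken from the `lotto_data.info()` output; the variable names are illustrative.

```python
from math import comb

total_combinations = comb(49, 6)
draws_on_record = 3665  # row count reported by lotto_data.info()

# Upper bound: assumes every past draw was a distinct combination.
share_seen = draws_on_record / total_combinations

print(f"{share_seen:.4%}")
```

Since each draw is independent of the ones before it, a combination's history tells you nothing about whether it will come up next.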


In [49]:
def multi_ticket_probability(num_tickets):
    num_combinations = int(combinations(49,6))
    # Express num_tickets / num_combinations as "1 in N", rounded to the nearest whole number
    simplified = round(num_combinations/num_tickets)

    if num_tickets == 1:
        print(f"The odds of winning the Grand Prize with 1 ticket are 1 in {num_combinations:,}.")

    else:
        print(f"The odds of winning the Grand Prize with {num_tickets:,} different tickets are 1 in {simplified:,}.")

In [50]:
multi_ticket_probability(2)

The odds of winning the Grand Prize with 2 different tickets are 1 in 6,991,908.


In [54]:
test_cases = [1,10,100,1000,1000000,5000000]

for num in test_cases:
    multi_ticket_probability(num)
    print('---------------------------------------')

The odds of winning the Grand Prize with 1 ticket are 1 in 13,983,816.
---------------------------------------
The odds of winning the Grand Prize with 10 different tickets are 1 in 1,398,382.
---------------------------------------
The odds of winning the Grand Prize with 100 different tickets are 1 in 139,838.
---------------------------------------
The odds of winning the Grand Prize with 1,000 different tickets are 1 in 13,984.
---------------------------------------
The odds of winning the Grand Prize with 1,000,000 different tickets are 1 in 14.
---------------------------------------
The odds of winning the Grand Prize with 5,000,000 different tickets are 1 in 3.
---------------------------------------
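
The test cases above skip the 40-ticket case posed at the top of the notebook. A minimal standalone sketch for it, assuming all 40 tickets carry different number combinations (the variable names are illustrative):

```python
from fractions import Fraction
from math import comb

total_combinations = comb(49, 6)
num_tickets = 40  # assumes all 40 tickets carry different number combinations

# Each distinct ticket covers exactly one of the equally likely outcomes.
probability = Fraction(num_tickets, total_combinations)

print(f"{float(probability):.7%}")
print(f"1 in {round(total_combinations / num_tickets):,}")
```

The same reasoning only holds while the number of distinct tickets does not exceed the number of possible combinations.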


In [62]:
def matching_odds(num_matches):
    # Ways to pick num_matches of the 6 winning numbers ...
    n_combinations_ticket = combinations(6, num_matches)
    # ... times the ways to fill the rest of the ticket from the 43 non-winning numbers
    n_combinations_remaining = combinations(43, 6 - num_matches)
    successful_outcomes = n_combinations_ticket * n_combinations_remaining
    
    n_combinations_total = combinations(49, 6)    
    combinations_simplified = round(n_combinations_total/successful_outcomes)    
    print(f"The chances of having exactly {num_matches:,} matching numbers with this ticket are 1 in {combinations_simplified:,}.")

In [63]:
test_cases = [2,3,4,5,6]

for num in test_cases:
    matching_odds(num)
    print('-----------------------------------')

The chances of having exactly 2 matching numbers with this ticket are 1 in 8.
-----------------------------------
The chances of having exactly 3 matching numbers with this ticket are 1 in 57.
-----------------------------------
The chances of having exactly 4 matching numbers with this ticket are 1 in 1,032.
-----------------------------------
The chances of having exactly 5 matching numbers with this ticket are 1 in 54,201.
-----------------------------------
The chances of having exactly 6 matching numbers with this ticket are 1 in 13,983,816.
-----------------------------------
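
The figures above are for an exact number of matches, while the question at the top asks about matching at least five numbers. One way to get that is to add the counts for exactly five and exactly six matches. This is a standalone sketch; `ways_to_match` is a hypothetical helper, not a function from the notebook.

```python
from fractions import Fraction
from math import comb

def ways_to_match(k, picks=6, pool=49):
    # Hypothetical helper: number of tickets sharing exactly k numbers
    # with the winning draw.
    return comb(picks, k) * comb(pool - picks, picks - k)

total = comb(49, 6)
at_least_five = ways_to_match(5) + ways_to_match(6)
probability = Fraction(at_least_five, total)

print(at_least_five)
print(f"1 in {round(total / at_least_five):,}")
```

As a consistency check, the counts for zero through six matches should add up to the total number of combinations.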
