# A Mobile App for Lottery Addiction

In this project, we will build functions that enable users to answer questions about the [6/49 lottery](https://en.wikipedia.org/wiki/Lotto_6/49) like:

- What is the probability of winning the big prize with a single ticket?
- What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
- What is the probability of having exactly five (or four, three, or two) winning numbers on a single ticket?

We will also compare tickets against historical data from the [national 6/49 lottery game in Canada from 1982 to 2018](https://www.kaggle.com/datasets/datascienceai/lottery-dataset).

In [1]:
# Create function to calculate factorials
def factorial(n):
    answer = 1
    for i in range(n, 0, -1):
        answer *= i
    return answer

# Create function to calculate the number of combinations
def combinations(n, k):
    num = factorial(n)
    den = factorial(k) * factorial(n - k)
    # The quotient is always a whole number, so integer division keeps it exact
    return num // den
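
As a sanity check, the hand-written helpers above can be compared with `math.comb` from Python's standard library (available since Python 3.8). The sketch below is an aside rather than part of the original notebook, and `combinations_exact` is a hypothetical name used only for illustration.

```python
import math

def combinations_exact(n, k):
    """Number of ways to choose k items from n, as an exact integer."""
    # Floor division is safe here because the quotient is always whole.
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# Both approaches agree on the number of possible 6/49 tickets.
print(combinations_exact(49, 6))  # 13983816
print(math.comb(49, 6))           # 13983816
```

Working with integers avoids any rounding when the count is later formatted with thousands separators.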

In [2]:
# Calculate the probability of winning the big prize with one ticket
def one_ticket_probability(user_numbers):
    # `user_numbers` is the list of six numbers on the ticket
    num_possible_outcomes = int(combinations(49, 6))
    probability = 1 / num_possible_outcomes * 100
    print("The numbers you chose are: {}\n\n"
          "The probability of winning the big prize with your ticket is: {:.7f}%.\n\n"
          "In other words, you have a 1 in {:,} chances of winning.".format(
              user_numbers, probability, num_possible_outcomes))
    

In [3]:
# Test function
one_ticket_probability([4,5,6,34,43,7])

The numbers you chose are: [4, 5, 6, 34, 43, 7]

The probability of winning the big prize with your ticket is: 0.0000072%.

In other words, you have a 1 in 13,983,816 chances of winning.
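
The figure printed above follows from a simple count. Assuming every combination of six numbers is equally likely to be drawn, a single ticket is one of $\binom{49}{6}$ possible outcomes, which can be written out as:

```latex
P(\text{big prize})
  = \frac{1}{\binom{49}{6}}
  = \frac{6!\,43!}{49!}
  = \frac{1}{13{,}983{,}816}
  \approx 7.15 \times 10^{-8}
  \approx 0.0000072\%
```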


We will now turn to the historical drawing data.

In [4]:
# Open data file
import pandas as pd
df = pd.read_csv('649.csv')
print(df.shape)

(3665, 11)


In [5]:
print(df.head(3))

   PRODUCT  DRAW NUMBER  SEQUENCE NUMBER  DRAW DATE  NUMBER DRAWN 1  \
0      649            1                0  6/12/1982               3   
1      649            2                0  6/19/1982               8   
2      649            3                0  6/26/1982               1   

   NUMBER DRAWN 2  NUMBER DRAWN 3  NUMBER DRAWN 4  NUMBER DRAWN 5  \
0              11              12              14              41   
1              33              36              37              39   
2               6              23              24              27   

   NUMBER DRAWN 6  BONUS NUMBER  
0              43            13  
1              41             9  
2              39            34  


In [6]:
print(df.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3665 entries, 0 to 3664
Data columns (total 11 columns):
PRODUCT            3665 non-null int64
DRAW NUMBER        3665 non-null int64
SEQUENCE NUMBER    3665 non-null int64
DRAW DATE          3665 non-null object
NUMBER DRAWN 1     3665 non-null int64
NUMBER DRAWN 2     3665 non-null int64
NUMBER DRAWN 3     3665 non-null int64
NUMBER DRAWN 4     3665 non-null int64
NUMBER DRAWN 5     3665 non-null int64
NUMBER DRAWN 6     3665 non-null int64
BONUS NUMBER       3665 non-null int64
dtypes: int64(10), object(1)
memory usage: 315.0+ KB
None


In [7]:
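# Extract the six winning numbers of each drawing as a set, so that users can
# compare their ticket against historical data and see if they would have ever won by now.

def extract_numbers(row):
    # Positions 4 through 9 hold NUMBER DRAWN 1 to NUMBER DRAWN 6
    row = row.iloc[4:10]
    return set(row.values)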

winning_numbers = df.apply(extract_numbers, axis=1)
winning_numbers.head(3)


0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
dtype: object

In [8]:
def check_historical_occurrence(user_ticket, winning_numbers):
    user_ticket = set(user_ticket)
    # Count the past drawings whose six numbers all match the ticket
    num_matches = winning_numbers.apply(lambda draw: draw == user_ticket).sum()

    num_possible_outcomes = int(combinations(49, 6))
    probability = 1 / num_possible_outcomes * 100

    print("Your numbers {} have won {} time(s) in the past.".format(user_ticket, num_matches)
          + "\n\nThe probability of you winning the big prize in the next drawing with your ticket is: {:.7f}%.\n\nIn other words, you have a 1 in {:,} chances of winning the next drawing.".format(probability, num_possible_outcomes))

check_historical_occurrence([3, 1, 12, 14, 41, 43], winning_numbers)

Your numbers {1, 3, 41, 43, 12, 14} have won 0 time(s) in the past.

The probability of you winning the big prize in the next drawing with your ticket is: 0.0000072%.

In other words, you have a 1 in 13,983,816 chances of winning the next drawing.
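
The comparison relies on sets, so the order in which a user types the numbers does not matter. The sketch below illustrates the same idea without pandas, using only the three draws shown in the `df.head(3)` output earlier; `count_past_wins` and `sample_draws` are hypothetical names introduced for this example.

```python
# The first three historical draws, copied from the df.head(3) output.
sample_draws = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
]

def count_past_wins(user_ticket, draws):
    """Return how many draws match all six numbers on the ticket."""
    ticket = set(user_ticket)
    if len(ticket) != 6:
        raise ValueError("a ticket needs six distinct numbers")
    # Set equality ignores the order in which the numbers were entered.
    return sum(1 for draw in draws if draw == ticket)

print(count_past_wins([3, 1, 12, 14, 41, 43], sample_draws))   # 0
print(count_past_wins([43, 41, 14, 12, 11, 3], sample_draws))  # 1
```

The first ticket differs from the first draw by a single number (1 instead of 11), so it does not count as a win.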


In [13]:
# Find the probability of winning the big prize when playing `num` different tickets
def multi_ticket_probability(num):
    num_possible_outcomes = int(combinations(49,6))
    successful_outcomes = num
    probability = successful_outcomes / num_possible_outcomes * 100
    ratio = round(num_possible_outcomes/num)
    print("Based on the number of {} ticket(s) you plan on playing, the chances of you winning the big prize is {:.7f}% or 1 in {:,} chances.".format(num,probability, ratio))
    
test_inputs = [1, 10, 100, 10000, 1000000, 6991908, 13983816]
for test_input in test_inputs:
    multi_ticket_probability(test_input)
    print("-------------------")


Based on the number of 1 ticket(s) you plan on playing, the chances of you winning the big prize is 0.0000072% or 1 in 13,983,816 chances.
-------------------
Based on the number of 10 ticket(s) you plan on playing, the chances of you winning the big prize is 0.0000715% or 1 in 1,398,382 chances.
-------------------
Based on the number of 100 ticket(s) you plan on playing, the chances of you winning the big prize is 0.0007151% or 1 in 139,838 chances.
-------------------
Based on the number of 10000 ticket(s) you plan on playing, the chances of you winning the big prize is 0.0715112% or 1 in 1,398 chances.
-------------------
Based on the number of 1000000 ticket(s) you plan on playing, the chances of you winning the big prize is 7.1511238% or 1 in 14 chances.
-------------------
Based on the number of 6991908 ticket(s) you plan on playing, the chances of you winning the big prize is 50.0000000% or 1 in 2 chances.
-------------------
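Based on the number of 13983816 ticket(s) you plan on playing, the chances of you winning the big prize is 100.0000000% or 1 in 1 chances.
-------------------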
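
The calculation above only holds when every ticket carries a different combination and the count does not exceed the number of possible outcomes. A minimal sketch of a variant that enforces this and returns an exact value instead of printing is shown below; it assumes Python 3.8+ for `math.comb`, and `multi_ticket_chance` and `TOTAL_OUTCOMES` are hypothetical names, not part of the original notebook.

```python
from fractions import Fraction
from math import comb

TOTAL_OUTCOMES = comb(49, 6)  # 13,983,816 possible tickets

def multi_ticket_chance(n_tickets):
    """Exact big-prize probability for n_tickets distinct tickets in one draw."""
    if not 1 <= n_tickets <= TOTAL_OUTCOMES:
        raise ValueError("n_tickets must be between 1 and 13,983,816")
    return Fraction(n_tickets, TOTAL_OUTCOMES)

chance = multi_ticket_chance(40)
print(chance)         # 5/1747977
print(float(chance))  # the same probability as a float
```

Returning a `Fraction` keeps the result exact and leaves the formatting (percentage, "1 in N") to the caller.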

In [19]:
# Find the probability of having exactly 2, 3, 4, or 5 winning numbers
def probability_less_6(n_winning_numbers):
    # Ways to pick the matching numbers from the 6 that were drawn
    n_combinations = combinations(6, n_winning_numbers)
    # Ways to pick the remaining numbers from the 43 that were not drawn
    n_remaining_combinations = combinations(43, 6 - n_winning_numbers)
    tot_successful_outcomes = n_combinations * n_remaining_combinations
    tot_possible_outcomes = combinations(49, 6)
    probability = round(tot_successful_outcomes / tot_possible_outcomes * 100, 4)
    print("The probability of you having {} winning numbers is {}%".format(n_winning_numbers, probability))
    
possible_inputs = [2,3,4,5]
for i in possible_inputs:
    probability_less_6(i)
    print("------------")
    

The probability of you having 2 winning numbers is 13.2378%
------------
The probability of you having 3 winning numbers is 1.765%
------------
The probability of you having 4 winning numbers is 0.0969%
------------
The probability of you having 5 winning numbers is 0.0018%
------------
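
The percentages above are for matching exactly that many numbers. If the question is instead "at least k winning numbers", the exact-match probabilities for k, k+1, ..., 6 can be summed. The sketch below shows one way to do this with `math.comb` (Python 3.8+); `prob_exactly` and `prob_at_least` are hypothetical helper names.

```python
from math import comb

def prob_exactly(k):
    """Probability that exactly k of the 6 numbers on a ticket are drawn."""
    return comb(6, k) * comb(43, 6 - k) / comb(49, 6)

def prob_at_least(k):
    """Probability that k or more of the 6 numbers on a ticket are drawn."""
    return sum(prob_exactly(j) for j in range(k, 7))

for k in (2, 3, 4, 5):
    print(f"exactly {k}: {prob_exactly(k):.4%}   at least {k}: {prob_at_least(k):.4%}")
```

As a consistency check, the exact-match probabilities for k = 0 through 6 should add up to 1.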
