# Mobile App for Lottery Addiction

In this fictional scenario, a medical institute that aims to prevent and treat gambling addictions wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. The institute has a team of engineers that will build the app, but they need us to create the logical core of the app and calculate probabilities.

For the first version of the app, they want us to focus on the 6/49 lottery and build functions that enable users to answer questions like:

* What is the probability of winning the big prize with a single ticket?
* What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
* What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

For this project, we will be using the [data set](https://www.kaggle.com/datascienceai/lottery-dataset) from Kaggle for drawings from 1982 to 2018.

## One-ticket Probability

For the first version of the app, we want players to be able to calculate the probability of winning the big prize with the various numbers they play on a single ticket (for each ticket a player chooses six numbers out of 49). So, we'll start by building a function that calculates the probability of winning the big prize for any given ticket.

In [1]:
import math as m  # We will be using m.factorial()

def comb(n: int, k: int) -> int:
    '''
    Returns the number of ways to choose k items from n without regard
    to order: n! / (k! * (n - k)!)
    '''
    # Floor division keeps the result an exact int, matching the annotation
    return m.factorial(n) // (m.factorial(k) * m.factorial(n - k))

def one_ticket_probability(numbers: list[int]) -> None:
    '''
    Takes a list of 6 integers
    
    Prints the probability of winning in percentage format as well as the odds
    '''
    n_comb = comb(49,6)
    probability = 1/n_comb #49 possible numbers and 6 sampled without replacement
    print('''Your chances to win the big prize with the numbers {} are {:.7f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(numbers,
                    probability*100, int(n_comb)))

In [2]:
one_ticket_probability([1,2,3,4,5,6])

Your chances to win the big prize with the numbers [1, 2, 3, 4, 5, 6] are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.
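
As a cross-check on the hand-written `comb()` helper, the standard library offers `math.comb`, which computes the same binomial coefficient exactly. The sketch below is an addition to the notebook and assumes Python 3.8 or later.

```python
import math

# math.comb(n, k) is the standard-library binomial coefficient (Python 3.8+)
total_outcomes = math.comb(49, 6)

# The same value written out with factorials, using integer division
by_factorials = math.factorial(49) // (math.factorial(6) * math.factorial(43))

print(f'{total_outcomes:,}')            # 13,983,816
print(total_outcomes == by_factorials)  # True
```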


## Historical Data Check for Canada Lottery

In the previous section, we wrote a function that tells users the probability of winning the big prize with a single ticket. For the first version of the app, however, users should also be able to compare their ticket against the historical lottery data in Canada and determine whether they would ever have won by now.

In [3]:
import pandas as pd

lottery = pd.read_csv('649.csv')
print('The (row, column) count is: ' + str(lottery.shape))

The (row, column) count is: (3665, 11)


In [4]:
print(lottery.head(1))

   PRODUCT  DRAW NUMBER  SEQUENCE NUMBER  DRAW DATE  NUMBER DRAWN 1  \
0      649            1                0  6/12/1982               3   

   NUMBER DRAWN 2  NUMBER DRAWN 3  NUMBER DRAWN 4  NUMBER DRAWN 5  \
0              11              12              14              41   

   NUMBER DRAWN 6  BONUS NUMBER  
0              43            13  


In [5]:
print(lottery.tail(3))

      PRODUCT  DRAW NUMBER  SEQUENCE NUMBER  DRAW DATE  NUMBER DRAWN 1  \
3662      649         3589                0  6/13/2018               6   
3663      649         3590                0  6/16/2018               2   
3664      649         3591                0  6/20/2018              14   

      NUMBER DRAWN 2  NUMBER DRAWN 3  NUMBER DRAWN 4  NUMBER DRAWN 5  \
3662              22              24              31              32   
3663              15              21              31              38   
3664              24              31              35              37   

      NUMBER DRAWN 6  BONUS NUMBER  
3662              34            16  
3663              49             8  
3664              48            17  


## Historical Data Check

The engineering team told us that we need to be aware of the following details:

* Inside the app, the user inputs six different numbers from 1 to 49.
* Under the hood, the six numbers will come as a Python list and serve as an input to our function.

The engineering team wants us to write a function that prints:
* The number of times the combination selected occurred in the Canada data set; and
* The probability of winning the big prize in the next drawing with that combination.
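
The input constraints listed above (six different numbers, each from 1 to 49) could be checked before any lookup runs. The following is a minimal sketch; `validate_ticket` is a hypothetical helper that is not part of the engineering team's specification.

```python
def validate_ticket(numbers: list[int]) -> None:
    '''
    Hypothetical helper (not part of the original notebook): raises ValueError
    unless `numbers` holds six different integers from 1 to 49.
    '''
    if len(numbers) != 6:
        raise ValueError('A ticket needs exactly six numbers.')
    if len(set(numbers)) != 6:
        raise ValueError('The six numbers must all be different.')
    if any(not isinstance(n, int) or not 1 <= n <= 49 for n in numbers):
        raise ValueError('Each number must be an integer from 1 to 49.')

validate_ticket([3, 41, 11, 12, 43, 14])  # a valid ticket passes silently
```

Raising an exception rather than printing keeps the check reusable by every function that accepts a ticket.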

In [6]:
def extract_numbers(row: pd.Series) -> set[int]:
    ''' 
    Expects as input a row of the lottery dataframe and returns a set containing all 
    of the six winning numbers.
    
    The six drawn numbers must sit at positions 4-9 of the row; the bonus number is excluded
    '''
    return set(row.iloc[4:10])

winning_numbers = lottery.apply(extract_numbers, axis=1)
winning_numbers.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

In [9]:
def check_historical_occurrence(user_numbers: list[int], historical_numbers: pd.Series) -> None:
    '''
    user_numbers: python list of 6 integers
    historical_numbers: pandas Series of sets of 6 integers (one set per drawing)
    
    Prints the number of past occurrences of the given combination together with 
    the probability and odds of winning the next drawing 
    '''
    
    user_numbers_set = set(user_numbers)
    check_occurrence = historical_numbers == user_numbers_set
    n_occurrences = check_occurrence.sum()
    
    if n_occurrences == 0:
        print('''The combination {} has never occurred.
This doesn't mean it's more likely to occur now. Your chances to win the big prize in the next drawing using the combination {} are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.'''.format(user_numbers, user_numbers))
        
    else:
        print('''The number of times combination {} has occurred in the past is {}.
Your chances to win the big prize in the next drawing using the combination {} are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.'''.format(user_numbers, n_occurrences,
                                                                            user_numbers))

In [10]:
check_historical_occurrence([1,2,3,4,5,6], winning_numbers)

The combination [1, 2, 3, 4, 5, 6] has never occurred.
This doesn't mean it's more likely to occur now. Your chances to win the big prize in the next drawing using the combination [1, 2, 3, 4, 5, 6] are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.


In [11]:
check_historical_occurrence([3, 41, 11, 12, 43, 14], winning_numbers)

The number of times combination [3, 41, 11, 12, 43, 14] has occurred in the past is 1.
Your chances to win the big prize in the next drawing using the combination [3, 41, 11, 12, 43, 14] are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.
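
The check works because sets ignore order: a ticket matches a drawing whenever both contain the same six numbers, however they are listed. The sketch below illustrates this with plain Python lists instead of pandas. The drawings in `toy_draws` are made up, and `count_occurrences` is a hypothetical helper, not the notebook's function.

```python
# Made-up drawings for illustration only; the real ones come from the data set
toy_draws = [
    [3, 11, 12, 14, 41, 43],
    [8, 33, 36, 37, 39, 41],
    [43, 3, 14, 11, 41, 12],  # same numbers as the first row, different order
]

def count_occurrences(user_numbers: list[int], draws: list[list[int]]) -> int:
    '''Hypothetical helper: counts the drawings whose six numbers equal the ticket, ignoring order.'''
    user_set = set(user_numbers)
    return sum(set(drawn) == user_set for drawn in draws)

print(count_occurrences([3, 41, 11, 12, 43, 14], toy_draws))  # 2
print(count_occurrences([1, 2, 3, 4, 5, 6], toy_draws))       # 0
```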


## Multi-ticket Probability

Lottery addicts usually play more than one ticket on a single drawing, thinking that this might increase their chances of winning significantly. Our purpose is to help them better estimate their chances of winning.

We've talked with the engineering team and they gave us the following information:

* The user will input the number of different tickets they want to play (without inputting the specific combinations they intend to play).
* Our function will see an integer between 1 and 13,983,816 (the maximum number of different tickets).
* The function should print information about the probability of winning the big prize depending on the number of different tickets played.

In [12]:
def multi_ticket_probability(n_tickets: int) -> None:
    '''
    n_tickets = integer of number of tickets to be played that should be no greater than 13,983,816
    
    Prints the probability and odds of winning
    '''
    
    n_comb = comb(49,6)
    probability = n_tickets/n_comb #49 possible numbers and 6 sampled without replacement
    
    if n_tickets == 1:
        print('''Your chances to win the big prize with one ticket are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(probability*100, int(n_comb)))
    
    else:
        combinations_simplified = round(n_comb / n_tickets)   
        print('''Your chances to win the big prize with {:,} different tickets are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(n_tickets, probability*100,
                                                               combinations_simplified))

In [13]:
for i in [1, 10, 100, 10000, 1000000, 6991908, 13983816]:
    multi_ticket_probability(i)

Your chances to win the big prize with one ticket are 0.000007%.
In other words, you have a 1 in 13,983,816 chances to win.
Your chances to win the big prize with 10 different tickets are 0.000072%.
In other words, you have a 1 in 1,398,382 chances to win.
Your chances to win the big prize with 100 different tickets are 0.000715%.
In other words, you have a 1 in 139,838 chances to win.
Your chances to win the big prize with 10,000 different tickets are 0.071511%.
In other words, you have a 1 in 1,398 chances to win.
Your chances to win the big prize with 1,000,000 different tickets are 7.151124%.
In other words, you have a 1 in 14 chances to win.
Your chances to win the big prize with 6,991,908 different tickets are 50.000000%.
In other words, you have a 1 in 2 chances to win.
Your chances to win the big prize with 13,983,816 different tickets are 100.000000%.
In other words, you have a 1 in 1 chances to win.
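
The rounded "1 in N" figures hide some detail (1,000,000 tickets give roughly 1 in 13.98, printed as 1 in 14). If exact values are ever needed, the standard-library `fractions` module can carry them. This is a sketch only; `exact_multi_ticket_probability` is a hypothetical helper, and `math.comb` assumes Python 3.8 or later.

```python
from fractions import Fraction
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # 13,983,816 possible tickets

def exact_multi_ticket_probability(n_tickets: int) -> Fraction:
    '''Hypothetical helper: exact chance of winning with n_tickets different tickets.'''
    if not 1 <= n_tickets <= TOTAL_OUTCOMES:
        raise ValueError('n_tickets must be between 1 and 13,983,816.')
    return Fraction(n_tickets, TOTAL_OUTCOMES)

print(exact_multi_ticket_probability(10))       # 5/6991908
print(exact_multi_ticket_probability(6991908))  # 1/2
```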


## Fewer Winning Numbers

For extra context, in most 6/49 lotteries there are smaller prizes if a player's ticket matches two, three, four, or five of the six numbers drawn. As a consequence, users might be interested in knowing the probability of having two, three, four, or five winning numbers.

These are the engineering details we'll need to be aware of:

* Inside the app, the user inputs:
  * six different numbers from 1 to 49; and
  * an integer between 2 and 5 that represents the number of winning numbers expected
* Our function prints information about the probability of having the inputted number of winning numbers.
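
The counting argument behind an exact match of k numbers can be sketched as follows. A drawing shares exactly k numbers with a ticket when k of the ticket's 6 numbers are drawn and the other 6 - k drawn numbers come from the 43 numbers that are not on the ticket. `exact_match_outcomes` is a hypothetical name used only for this illustration, and `math.comb` assumes Python 3.8 or later.

```python
import math

def exact_match_outcomes(k: int) -> int:
    '''Hypothetical helper: number of possible draws sharing exactly k numbers with one fixed ticket.'''
    # k of the ticket's 6 numbers are drawn; the other 6 - k drawn numbers
    # come from the 43 numbers that are not on the ticket
    return math.comb(6, k) * math.comb(43, 6 - k)

counts = [exact_match_outcomes(k) for k in range(7)]
print(counts)  # [6096454, 5775588, 1851150, 246820, 13545, 258, 1]

# The seven cases are exhaustive and mutually exclusive, so they cover every draw
print(sum(counts) == math.comb(49, 6))  # True
```

The final check is a useful guard: if the counts did not add up to the 13,983,816 possible draws, the formula would be counting some draws twice or not at all.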

In [56]:
def probability_less_6(n_winning_numbers: int) -> None:
    '''
    n_winning_numbers is an integer between 0 and 6 inclusive
    
    Prints the probability of having the exact number of winning numbers (out of 6 on a single draw) and the odds of 
    matching that many numbers on a single ticket
    '''
    
    # Ways to choose which of the ticket's 6 numbers are among the drawn numbers
    n_comb_ticket = comb(6, n_winning_numbers)
    # The remaining drawn numbers must come from the 43 numbers not on the ticket
    n_comb_remaining = comb(43, 6 - n_winning_numbers)
    successful_outcomes = n_comb_ticket * n_comb_remaining
    
    n_comb_total = comb(49, 6)    
    probability = successful_outcomes / n_comb_total
    
    combinations_simplified = round(n_comb_total/successful_outcomes)    
    print('''Your chances of having {} winning numbers with this ticket are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(n_winning_numbers, probability*100,
                                                               int(combinations_simplified)))

In [64]:
for test_input in [0, 1, 2, 3, 4, 5, 6]:
    probability_less_6(test_input)
    print('--------------------------') # output delimiter

Your chances of having 0 winning numbers with this ticket are 43.596498%.
In other words, you have a 1 in 2 chances to win.
--------------------------
Your chances of having 1 winning numbers with this ticket are 41.301945%.
In other words, you have a 1 in 2 chances to win.
--------------------------
Your chances of having 2 winning numbers with this ticket are 13.237803%.
In other words, you have a 1 in 8 chances to win.
--------------------------
Your chances of having 3 winning numbers with this ticket are 1.765040%.
In other words, you have a 1 in 57 chances to win.
--------------------------
Your chances of having 4 winning numbers with this ticket are 0.096862%.
In other words, you have a 1 in 1,032 chances to win.
--------------------------
Your chances of having 5 winning numbers with this ticket are 0.001845%.
In other words, you have a 1 in 54,201 chances to win.
--------------------------
Your chances of having 6 winning numbers with this ticket are 0.000007%.
In other words, you have a 1 in 13,983,816 chances to win.
--------------------------

## Adding Funny Analogies

A further addition to the output is a comparison with another unlikely event, such as being the victim of a shark attack, which we take here to be a 1 in 5 million chance.

In [58]:
def funny_output_multi_ticket(n_tickets: int) -> None:
    '''
    n_tickets = integer of number of tickets to be played
    
    Prints the probability and odds of winning the big prize, followed by a funny comparison
    to the chance of being the victim of a shark attack
    '''
    
    n_comb = comb(49,6)
    probability = n_tickets/n_comb #49 possible numbers and 6 sampled without replacement
    
    shark_prob = (1.0/5000000) / probability
    
    if shark_prob > 1:
        text = 'more'
    else:
        shark_prob = 1/shark_prob
        text = 'less'
    
    if n_tickets == 1:
        print('''Your chances to win the big prize with one ticket are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.
For comparison, you are {:,} times {} likely to be the victim of a shark attack than winning the lottery.'''
              .format(probability*100, int(n_comb), round(shark_prob), text)
             )
    
    else:
        combinations_simplified = round(n_comb / n_tickets)   
        print('''Your chances to win the big prize with {:,} different tickets are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.
For comparison, you are {:,} times {} likely to be the victim of a shark attack than winning the lottery.'''
              .format(n_tickets, probability*100, combinations_simplified, round(shark_prob), text)
             )

In [59]:
for i in [1, 10]:
    funny_output_multi_ticket(i)

Your chances to win the big prize with one ticket are 0.000007%.
In other words, you have a 1 in 13,983,816 chances to win.
For comparison, you are 3 times more likely to be the victim of a shark attack than winning the lottery.
Your chances to win the big prize with 10 different tickets are 0.000072%.
In other words, you have a 1 in 1,398,382 chances to win.
For comparison, you are 4 times less likely to be the victim of a shark attack than winning the lottery.
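
The comparison logic can also live in one small helper so that several output functions can share it. The sketch below is one possible factoring, not the notebook's own code: `compare_to_shark_attack` and `SHARK_ATTACK_PROBABILITY` are hypothetical names, and the 1 in 5 million figure is taken as given from the text above.

```python
SHARK_ATTACK_PROBABILITY = 1 / 5_000_000  # figure quoted above, taken as given

def compare_to_shark_attack(probability: float) -> tuple[int, str]:
    '''
    Hypothetical helper: returns (factor, 'more' or 'less') so that a shark attack
    is `factor` times more/less likely than an event with the given probability.
    '''
    ratio = SHARK_ATTACK_PROBABILITY / probability
    if ratio > 1:
        return round(ratio), 'more'
    return round(1 / ratio), 'less'

print(compare_to_shark_attack(1 / 13_983_816))   # (3, 'more')
print(compare_to_shark_attack(10 / 13_983_816))  # (4, 'less')
```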


In [62]:
def funny_output_probability_less_6(n_winning_numbers: int) -> None:
    '''
    n_winning_numbers is an integer between 0 and 6 inclusive
    
    Prints the probability of having the exact number of winning numbers (out of 6 on a single draw) and the odds of 
    matching that many numbers on a single ticket, followed by a funny comparison
    to the chance of being the victim of a shark attack
    '''
    
    # Ways to choose which of the ticket's 6 numbers are among the drawn numbers
    n_comb_ticket = comb(6, n_winning_numbers)
    # The remaining drawn numbers must come from the 43 numbers not on the ticket
    n_comb_remaining = comb(43, 6 - n_winning_numbers)
    successful_outcomes = n_comb_ticket * n_comb_remaining
    
    n_comb_total = comb(49, 6)    
    probability = successful_outcomes / n_comb_total
    
    shark_prob = (1.0/5000000) / probability
    
    if shark_prob > 1:
        text = 'more'
    else:
        shark_prob = 1/shark_prob
        text = 'less'
    
    combinations_simplified = round(n_comb_total/successful_outcomes)    
    print('''Your chances of having {} winning numbers with this ticket are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.
For comparison, you are {:,} times {} likely to be the victim of a shark attack than winning the lottery'''
          .format(n_winning_numbers, probability*100, int(combinations_simplified), round(shark_prob), text)
         )

In [63]:
for test_input in [0, 1, 2, 3, 4, 5, 6]:
    funny_output_probability_less_6(test_input)
    print('--------------------------') # output delimiter

Your chances of having 0 winning numbers with this ticket are 43.596498%.
In other words, you have a 1 in 2 chances to win.
For comparison, you are 2,179,825 times less likely to be the victim of a shark attack than winning the lottery
--------------------------
Your chances of having 1 winning numbers with this ticket are 41.301945%.
In other words, you have a 1 in 2 chances to win.
For comparison, you are 2,065,097 times less likely to be the victim of a shark attack than winning the lottery
--------------------------
Your chances of having 2 winning numbers with this ticket are 13.237803%.
In other words, you have a 1 in 8 chances to win.
For comparison, you are 661,890 times less likely to be the victim of a shark attack than winning the lottery
--------------------------
Your chances of having 3 winning numbers with this ticket are 1.765040%.
In other words, you have a 1 in 57 chances to win.
For comparison, you are 88,252 times less likely to be the victim of a shark attack than winning the lottery
--------------------------
Your chances of having 4 winning numbers with this ticket are 0.096862%.
In other words, you have a 1 in 1,032 chances to win.
For comparison, you are 4,843 times less likely to be the victim of a shark attack than winning the lottery
--------------------------
Your chances of having 5 winning numbers with this ticket are 0.001845%.
In other words, you have a 1 in 54,201 chances to win.
For comparison, you are 92 times less likely to be the victim of a shark attack than winning the lottery
--------------------------
Your chances of having 6 winning numbers with this ticket are 0.000007%.
In other words, you have a 1 in 13,983,816 chances to win.
For comparison, you are 3 times more likely to be the victim of a shark attack than winning the lottery
--------------------------

## Combining Probability and Historical Occurrence

Another feature would be outputting the probability and the historical occurrences at the same time. To do so, we will combine one_ticket_probability() and check_historical_occurrence().

In [24]:
def one_ticket_probability_and_historical_occurrence(user_numbers: list[int], 
                                                     historical_numbers: pd.Series) -> None:
    '''
    user_numbers: python list of 6 integers
    historical_numbers: pandas Series of sets of 6 integers (one set per drawing)
    
    Prints the combined one ticket probability and historical occurrence
    '''
    
    # For one ticket probability
    n_comb = comb(49,6)
    probability = 1/n_comb #49 possible numbers and 6 sampled without replacement
    
    print('''Your chances to win the big prize with the numbers {} are {:.7f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(user_numbers,
                    probability*100, int(n_comb)))
    
    # For historical occurrence
    user_numbers_set = set(user_numbers)
    check_occurrence = historical_numbers == user_numbers_set
    n_occurrences = check_occurrence.sum()
    
    if n_occurrences == 0:
        print('''The combination {} has never occurred.
This doesn't mean it's more likely to occur now. Your chances to win the big prize in the next drawing using the combination {} are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.'''.format(user_numbers, user_numbers))
        
    else:
        print('''The number of times combination {} has occurred in the past is {}.'''.format(user_numbers,
                                                                            n_occurrences))

In [25]:
one_ticket_probability_and_historical_occurrence([3, 41, 11, 12, 43, 14], winning_numbers)

Your chances to win the big prize with the numbers [3, 41, 11, 12, 43, 14] are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.
The number of times combination [3, 41, 11, 12, 43, 14] has occurred in the past is 1.


## Creating Probability of at Least X Winning Numbers

Our final function will be similar to probability_less_6(). In this case, it calculates the probability of having at least two, three, four, or five winning numbers. For example, the number of successful outcomes for having at least four winning numbers is the sum of the outcomes with exactly 4, 5, and 6 winning numbers. Equivalently, because these events are mutually exclusive, we can add their probabilities.
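
The addition rule can be checked with exact fractions. In the sketch below, `exactly` and `at_least` are hypothetical helpers written only for this check, and `math.comb` assumes Python 3.8 or later. The last line confirms that "at least one winning number" is the complement of "no winning numbers".

```python
from fractions import Fraction
import math

TOTAL = math.comb(49, 6)  # all possible 6/49 draws

def exactly(k: int) -> Fraction:
    '''Exact probability that one ticket shares exactly k numbers with the draw.'''
    return Fraction(math.comb(6, k) * math.comb(43, 6 - k), TOTAL)

def at_least(n: int) -> Fraction:
    '''Sum of the mutually exclusive cases k = n, n + 1, ..., 6.'''
    return sum((exactly(k) for k in range(n, 7)), Fraction(0))

print(at_least(5) == Fraction(258 + 1, TOTAL))  # True: 258 five-match draws plus the jackpot
print(at_least(1) == 1 - exactly(0))            # True: complement of matching nothing
```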

In [65]:
def probability_at_least_n(n_winning_numbers: int) -> None:
    '''
    n_winning_numbers is an integer between 0 and 6 inclusive
    
    Prints the probability of having at least that number of winning numbers (out of 6 on a single draw) 
    and the odds of matching at least that many numbers on a single ticket
    '''
    
    successful_outcomes = 0
    
    # The exact-match cases are mutually exclusive, so their outcome counts add up
    for i in range(n_winning_numbers,7):
        n_comb_ticket = comb(6, i)
        n_comb_remaining = comb(43, 6 - i)
        successful_outcomes += (n_comb_ticket * n_comb_remaining)
    
    n_comb_total = comb(49, 6)    
    probability = successful_outcomes / n_comb_total
    
    combinations_simplified = round(n_comb_total/successful_outcomes)    
    print('''Your chances of having at least {} winning numbers with this ticket are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(n_winning_numbers, probability*100,
                                                               int(combinations_simplified)))

In [66]:
for test_input in [0, 1, 2, 3, 4, 5, 6]:
    probability_at_least_n(test_input)
    print('--------------------------') # output delimiter

Your chances of having at least 0 winning numbers with this ticket are 100.000000%.
In other words, you have a 1 in 1 chances to win.
--------------------------
Your chances of having at least 1 winning numbers with this ticket are 56.403502%.
In other words, you have a 1 in 2 chances to win.
--------------------------
Your chances of having at least 2 winning numbers with this ticket are 15.101557%.
In other words, you have a 1 in 7 chances to win.
--------------------------
Your chances of having at least 3 winning numbers with this ticket are 1.863755%.
In other words, you have a 1 in 54 chances to win.
--------------------------
Your chances of having at least 4 winning numbers with this ticket are 0.098714%.
In other words, you have a 1 in 1,013 chances to win.
--------------------------
Your chances of having at least 5 winning numbers with this ticket are 0.001852%.
In other words, you have a 1 in 53,992 chances to win.
--------------------------
Your chances of having at least 6 winning numbers with this ticket are 0.000007%.
In other words, you have a 1 in 13,983,816 chances to win.
--------------------------

## Conclusion

In this project, we used combinations to calculate probabilities and odds for a 6/49 lottery, in which six numbers are drawn from 49. We also built the logic to report related information, such as how often a combination has occurred historically.