# Lottery Addiction - Calculating Probability
We focus on the 6/49 lottery and build functions that enable users to answer questions like:

- What is the probability of winning the big prize with a single ticket?
- What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
- What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

In the 6/49 lottery, six numbers are drawn from a set of 49 numbers that range from 1 to 49. The drawing is done without replacement, which means once a number is drawn, it's not put back in the set.

This means that we can calculate the number of permutations with the formula:

$_{49}P_6 = 49 \cdot 48 \cdot 47 \cdot 46 \cdot 45 \cdot 44 = \cfrac{49!}{(49 - 6)!}$ 

Since the order of the numbers is not relevant, we must count only the number of combinations:

$_{49}C_6 = \cfrac{_{49}P_6}{\text{number of permutations of the same 6 numbers}} = \cfrac{_{49}P_6}{6!} = \cfrac{49!}{6!(49 - 6)!} = \binom{49}{6}$

In general, $_nC_k = \binom{n}{k} = \cfrac{n!}{k!(n - k)!}$.
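As a quick cross-check of the two formulas, here is a minimal sketch that uses Python's standard `math` module (available from Python 3.8) rather than the functions defined in this notebook. The variable names are only illustrative.

```python
import math

# Ordered draws of 6 numbers out of 49 (permutations)
n_permutations = math.perm(49, 6)

# Each set of 6 numbers can be ordered in 6! ways, so divide to ignore order
n_combinations = n_permutations // math.factorial(6)

print(n_permutations)   # 10068347520
print(n_combinations)   # 13983816
print(n_combinations == math.comb(49, 6))  # True
```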

Creating a function to calculate the factorial:

In [1]:
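def factorial(number):
    # Base case: 0! and 1! are both 1 (0! is needed for combinations(n, 0) and combinations(n, n))
    if number <= 1:
        return 1
    # Recursive case: n! = n * (n - 1)!
    return number * factorial(number - 1)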


Creating a function that uses the factorial to calculate the number of combinations:

In [147]:
def combinations(n, k):
    # Integer division keeps the result exact; float division can lose precision with large factorials
    return factorial(n) // (factorial(k) * factorial(n - k))

Now we create a function that takes a list of six numbers and prints the probability of winning with that combination. It is intended to help people with no training in probability theory grasp their chances of winning the lottery with any ticket.

In [109]:
def One_Ticket_Probability(numbers):
    # A valid ticket has exactly six distinct numbers
    if len(numbers) != 6 or len(set(numbers)) != 6:
        print('Invalid List of Numbers')
    else:
        n_comb = combinations(49, 6)
        probability = 1 / n_comb *100
        message = 'The probability of winning the lottery with the numbers {} is {:.9f}%.\nYou have 1 in {:,} chances to win'.format(numbers, probability, n_comb)
        print(message)

In [110]:
One_Ticket_Probability([1,2,3,4,5,6])

The probability of winning the lottery with the numbers [1, 2, 3, 4, 5, 6] is 0.000007151%.
You have 1 in 13,983,816 chances to win
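The function above only checks how many distinct numbers the ticket has. A fuller validity check could also confirm that every number lies between 1 and 49. The sketch below is one way to do that; `is_valid_ticket` and its parameters are hypothetical names, not part of the original notebook.

```python
def is_valid_ticket(numbers, pool_size=49, picks=6):
    """Return True if `numbers` holds `picks` distinct integers in 1..pool_size."""
    # Wrong length, or repeated numbers, make the ticket invalid
    if len(numbers) != picks or len(set(numbers)) != picks:
        return False
    # Every number must be an integer inside the lottery's range
    return all(isinstance(n, int) and 1 <= n <= pool_size for n in numbers)


print(is_valid_ticket([1, 2, 3, 4, 5, 6]))    # True
print(is_valid_ticket([1, 2, 3, 4, 5, 5]))    # False: repeated number
print(is_valid_ticket([1, 2, 3, 4, 5, 50]))   # False: 50 is outside 1..49
```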


# Probabilities with real world data

Now we examine historical data from Canada's 6/49 lottery, covering 1982 to 2018, to see whether our numbers would ever have won.

The dataset can be downloaded [here](https://www.kaggle.com/datascienceai/lottery-dataset).

In [111]:
import pandas as pd
lottery = pd.read_csv('649.csv')

In [112]:
print('columns: {}\nrows: {}'.format(len(lottery.columns), len(lottery)))

columns: 11
rows: 3665


In [113]:
lottery.head(3)

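| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |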


In [114]:
lottery.tail(3)

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


We define a function that extracts the six numbers drawn in each lottery event and puts them in a set.

In [115]:
def extract_numbers(row):
    # Positions 4 to 9 hold NUMBER DRAWN 1 to NUMBER DRAWN 6 (the bonus number is excluded)
    return set(row.iloc[4:10])

In [116]:
extract_numbers(lottery.iloc[0])

{3, 11, 12, 14, 41, 43}

We use DataFrame.apply() to extract the results of every draw from 1982 to 2018 into a pandas Series of sets.

In [117]:
winning_numbers = lottery.apply(extract_numbers, axis = 1)

In [118]:
winning_numbers.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

Now we create a function that checks how often the user's numbers occurred in the history.

In [119]:
def check_historical_occurrence(user_numbers, history):
    # Compare against the history passed in, not the global variable
    times_won = history[history == set(user_numbers)]
    print('You would have won {} times with that number'.format(len(times_won)))
    One_Ticket_Probability(user_numbers)

In [120]:
user_n_1 = [3,41,11,12,43,14] # Taken from the first row of the history
check_historical_occurrence(user_n_1, winning_numbers)

You would have won 1 times with that number
The probability of winning the lottery with the numbers [3, 41, 11, 12, 43, 14] is 0.000007151%.
You have 1 in 13,983,816 chances to win


In [121]:
user_n_1 = [3,41,21,12,43,14]
check_historical_occurrence(user_n_1, winning_numbers)

You would have won 0 times with that number
The probability of winning the lottery with the numbers [3, 41, 21, 12, 43, 14] is 0.000007151%.
You have 1 in 13,983,816 chances to win


In [122]:
user_n_1 = [8,1,14,12,43,13]
check_historical_occurrence(user_n_1, winning_numbers)

You would have won 0 times with that number
The probability of winning the lottery with the numbers [8, 1, 14, 12, 43, 13] is 0.000007151%.
You have 1 in 13,983,816 chances to win
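Finding zero historical wins for an arbitrary ticket is what the single-ticket probability predicts. As a rough illustration, the sketch below estimates how many jackpot matches one fixed ticket would be expected to get across the 3,665 draws in the dataset, assuming every draw is independent and uniformly random. It uses `math.comb` from the standard library, and the variable names are only illustrative.

```python
import math

n_draws = 3665                       # number of rows in the dataset
p_single = 1 / math.comb(49, 6)      # probability that one ticket matches one draw

# Expected number of draws that match a fixed ticket
expected_matches = n_draws * p_single

# Probability that the ticket matches at least one of the draws
p_at_least_one = 1 - (1 - p_single) ** n_draws

print('Expected matches over {:,} draws: {:.6f}'.format(n_draws, expected_matches))
print('Probability of at least one match: {:.6f}%'.format(p_at_least_one * 100))
```

Both figures are far below 1, so a ticket that never won in 36 years of draws is the typical case.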


Now we make another function to check the probability of winning with multiple different tickets.

In [125]:
def multi_ticket_probability(number_of_tickets):
    n_comb = combinations(49, 6)
    probability = number_of_tickets / n_comb *100
    message = 'The probability of winning the lottery with {} tickets is {:.9f}%.\nYou have {} in {:,} chances to win'.format(number_of_tickets, probability,number_of_tickets, n_comb)
    print(message)

In [126]:
n_tickets_list = [1, 10, 100, 10000, 1000000, 6991908, 13983816]
for number in n_tickets_list:
    multi_ticket_probability(number)

The probability of winning the lottery with 1 tickets is 0.000007151%.
You have 1 in 13,983,816 chances to win
The probability of winning the lottery with 10 tickets is 0.000071511%.
You have 10 in 13,983,816 chances to win
The probability of winning the lottery with 100 tickets is 0.000715112%.
You have 100 in 13,983,816 chances to win
The probability of winning the lottery with 10000 tickets is 0.071511238%.
You have 10000 in 13,983,816 chances to win
The probability of winning the lottery with 1000000 tickets is 7.151123842%.
You have 1000000 in 13,983,816 chances to win
The probability of winning the lottery with 6991908 tickets is 50.000000000%.
You have 6991908 in 13,983,816 chances to win
The probability of winning the lottery with 13983816 tickets is 100.000000000%.
You have 13983816 in 13,983,816 chances to win
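The same relationship can be read in reverse: given a target probability, how many different tickets are needed? The sketch below is a minimal illustration under the same assumption as above, namely that all tickets are distinct. `tickets_needed` is a hypothetical helper, not part of the original notebook.

```python
import math


def tickets_needed(target_probability, n=49, k=6):
    """Smallest number of distinct tickets whose winning probability reaches the target."""
    if not 0 <= target_probability <= 1:
        raise ValueError('target_probability must be between 0 and 1')
    total = math.comb(n, k)
    # Each distinct ticket adds 1 / total to the probability, so round up
    return math.ceil(target_probability * total)


for target in [0.01, 0.5, 1.0]:
    print('{:.0%} chance needs {:,} tickets'.format(target, tickets_needed(target)))
```

A 50% chance requires 6,991,908 tickets, which agrees with the output above.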


Since there are also prizes for matching between 2 and 5 of the drawn numbers, we now calculate the probability of winning a secondary prize.

What is the probability of having exactly 5 winning numbers?

Suppose the 6 winning numbers are $\{1, 2, 3, 4, 5, 6\}$.

There are 6 ways to choose which 5 of those numbers appear on the ticket: $\binom{6}{5} = \frac{6!}{5!(6-5)!} = 6$

The sixth number on the ticket must be one of the $49 - 6 = 43$ numbers that are not in the winning set. So there are $6 \times 43 = 258$ different tickets that match exactly 5 numbers.

Since there are $\binom{49}{6} = 13983816$ possible tickets, the probability is $\frac{258}{13983816} \approx 0.00001845$
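The count of 258 can be confirmed by listing the tickets explicitly. The sketch below is a brute-force check using `itertools.combinations` from the standard library, imported under the alias `iter_combinations` so that it does not clash with the notebook's own `combinations` function. The variable names are only illustrative.

```python
from itertools import combinations as iter_combinations

winning = {1, 2, 3, 4, 5, 6}
# The 43 numbers that are not part of the winning set
others = [n for n in range(1, 50) if n not in winning]

# Build every ticket made of 5 winning numbers plus 1 non-winning number
five_match_tickets = {
    frozenset(five) | {extra}
    for five in iter_combinations(sorted(winning), 5)
    for extra in others
}

print(len(others))              # 43
print(len(five_match_tickets))  # 258
```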



In [150]:
def probability_less_6(number_matching):
    # Ways to choose which 'number_matching' of the 6 winning numbers are on the ticket
    number_comb_in_6 = combinations(6, number_matching)
    # Ways to fill the remaining slots from the 43 non-winning numbers
    comb_remaining = combinations(43, 6 - number_matching)
    # Rule of product: every choice of matches pairs with every choice of non-matches
    total_n_combinations = comb_remaining * number_comb_in_6
    probability = total_n_combinations/combinations(49,6) * 100
    message = 'The probability of having {} matching numbers is {:.9f}%.\nYou have {} in {:,} chances to win'.format(number_matching, probability,total_n_combinations,combinations(49,6))
    print(message)

In [151]:
probability_less_6(5)

The probability of having 5 matching numbers is 0.001844990%.
You have 258 in 13,983,816 chances to win


In [152]:
probability_less_6(4)

The probability of having 4 matching numbers is 0.096861972%.
You have 13545 in 13,983,816 chances to win


In [153]:
probability_less_6(3)

The probability of having 3 matching numbers is 1.765040387%.
You have 246820 in 13,983,816 chances to win


In [154]:
probability_less_6(2)

The probability of having 2 matching numbers is 13.237802900%.
You have 1851150 in 13,983,816 chances to win


In [155]:
probability_less_6(1)

The probability of having 1 matching numbers is 41.301945048%.
You have 5775588 in 13,983,816 chances to win
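The results above are probabilities of matching exactly a given number of winning numbers, while the opening question asks about at least five (or four, or three, or two). One way to get there is to add up the exact cases. The sketch below does this with `math.comb` from the standard library; `ways_exactly` and `probability_at_least` are hypothetical helpers, not part of the original notebook.

```python
import math


def ways_exactly(k, n=49, picks=6):
    """Number of tickets that match exactly k of the winning numbers."""
    return math.comb(picks, k) * math.comb(n - picks, picks - k)


def probability_at_least(k, n=49, picks=6):
    """Probability that a single ticket matches k or more winning numbers."""
    total = math.comb(n, picks)
    favourable = sum(ways_exactly(j, n, picks) for j in range(k, picks + 1))
    return favourable / total


# Sanity check: the cases 0..6 cover every possible ticket exactly once
print(sum(ways_exactly(j) for j in range(7)) == math.comb(49, 6))  # True

for k in [5, 4, 3, 2]:
    print('At least {}: {:.9f}%'.format(k, probability_at_least(k) * 100))
```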
