# Winner winner chicken dinner!

A medical institute that aims to prevent and treat gambling addictions wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. The institute has a team of engineers that will build the app, but they need us to create the logical core of the app and calculate probabilities.

These are the questions that the institute wants us to answer:

- What is the probability of winning the big prize with a single ticket?
- What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
- What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

To answer them, we will use historical data from Canada's 6/49 lottery, available in this [data set](https://www.kaggle.com/datascienceai/lottery-dataset). It covers 3,665 drawings from 1982 to 2018.

In [1]:
# functions to calculate factorials and combinations
def factorial(n):
    result = 1
    for i in range(n, 0, -1):
        result *= i
    return result

def combinations(n, k):
    numerator = factorial(n)
    denominator = factorial(k) * factorial(n - k)
    # the quotient is always a whole number, so use integer division
    return numerator // denominator
    

We want players to be able to calculate the probability of winning the big prize with the various numbers they play on a single ticket (for each ticket a player chooses six numbers out of 49).

We first compute the number of possible outcomes. Because the six numbers are drawn without replacement and their order does not matter, this is the number of combinations of 6 values chosen from 49. Every six-number ticket is equally likely to match the draw, so its probability of winning is the inverse of that total.
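
As a quick cross-check of the count described above, here is a minimal sketch that uses the standard library's `math.comb` (Python 3.8+) rather than hand-written helpers. The names `total_outcomes` and `p_single` are illustrative and not part of the notebook.

```python
import math
from fractions import Fraction

# Number of ways to choose 6 distinct numbers out of 49 when order does not matter.
total_outcomes = math.comb(49, 6)

# Each six-number ticket matches exactly one of those equally likely outcomes.
p_single = Fraction(1, total_outcomes)

print(f"{total_outcomes:,} possible outcomes")
print(f"P(jackpot with one ticket) = {float(p_single):.7%}")
```

Keeping the probability as a `Fraction` avoids rounding until the value is formatted for display.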

Below is a function that will compute and print the probability of winning the big prize. 

In [2]:
def one_ticket_probability(numbers):
    p = 1/combinations(49,6)
    print('The combination {} has a {:.7f}% chance of winning the lottery; or 1 in {:,}.'.format(
        numbers, p*100, int(combinations(49,6))))

In [3]:
import random

# random ticket: six distinct numbers from 1 to 49 (inclusive)
r = random.sample(range(1, 50), 6)

one_ticket_probability(r)

The combination [4, 3, 10, 33, 23, 12] has a 0.0000072% chance of winning the lottery; or 1 in 13,983,816.


In [4]:
import pandas as pd

df = pd.read_csv('649.csv')

In [5]:
print(f'The dataset has {df.shape[0]} rows and {df.shape[1]} columns')

The dataset has 3665 rows and 11 columns


In [6]:
df.head(3)

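| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |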


In [7]:
df.tail(3)

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


In [8]:
# return winning numbers from any row in dataset
def extract_numbers(row):
    # positions 4 to 9 hold NUMBER DRAWN 1 through NUMBER DRAWN 6 (bonus number excluded);
    # .iloc makes the positional lookup explicit
    return set(row.iloc[4:10])
        
        

In [9]:
# list of all winning combinations in 6/49 history
winning_combinations_canada = df.apply(extract_numbers, axis=1)

In [10]:
def check_historical_occurrence(combination, winning_combinations):
    combination = set(combination)
    # compare the ticket with each past draw, one set at a time
    occurrences = winning_combinations.apply(lambda draw: draw == combination)
    n_occurrences = occurrences.sum()
    
    print('The combination {} has occurred a total of {} times before in the Canadian 6/49 lottery. The chances of your combination {} winning remain the same as before though, at 0.0000072% or 1 in 13,983,816.'.format(
        combination, n_occurrences, combination))

In [11]:
# testing function
r = random.sample(range(1, 50), 6)
check_historical_occurrence(r, winning_combinations_canada)

The combination {1, 37, 11, 14, 17, 18} has occurred a total of 0 times before in the Canadian 6/49 lottery. The chances of your combination {1, 37, 11, 14, 17, 18} winning remain the same as before though, at 0.0000072% or 1 in 13,983,816.
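
For readers who want to see the matching logic without pandas, the following is a minimal sketch. It assumes each past draw is stored as a collection of six numbers; the helper name `count_historical_matches` is hypothetical, and the three sample draws are the first three rows of the dataset preview shown earlier.

```python
from collections import Counter

def count_historical_matches(combination, past_draws):
    """Return how many past draws contain exactly the same six numbers."""
    # frozenset ignores order and is hashable, so draws can be tallied in a Counter.
    tally = Counter(frozenset(draw) for draw in past_draws)
    return tally[frozenset(combination)]

# First three draws from the dataset preview (bonus number excluded).
sample_draws = [
    (3, 11, 12, 14, 41, 43),
    (8, 33, 36, 37, 39, 41),
    (1, 6, 23, 24, 27, 39),
]

print(count_historical_matches([43, 41, 14, 12, 11, 3], sample_draws))  # 1
print(count_historical_matches([1, 2, 3, 4, 5, 6], sample_draws))       # 0
```

Because the comparison is between sets, the order in which a player enters the numbers does not affect the result.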


Lottery players will typically play more than one ticket in a single drawing. We will therefore write a function that will allow the users to calculate the chances of winning for any number of different tickets.

In [17]:
# calculate the chance of winning the lottery based on number of tickets played
def multi_ticket_probability(n_tickets):
    outcomes = combinations(49,6) #total number of outcomes
    p = n_tickets/outcomes*100 #chances of winning
    # print() returns None, so there is nothing useful to return here
    print('There is a {:.7f}% chance of this number of tickets winning the lottery'.format(p))
    

In [18]:
# test function written above
tickets = [1, 10, 100, 10000, 1000000, 6991908, 13983816]
for t in tickets:
    multi_ticket_probability(t)

There is a 0.0000072% chance of this number of tickets winning the lottery
There is a 0.0000715% chance of this number of tickets winning the lottery
There is a 0.0007151% chance of this number of tickets winning the lottery
There is a 0.0715112% chance of this number of tickets winning the lottery
There is a 7.1511238% chance of this number of tickets winning the lottery
There is a 50.0000000% chance of this number of tickets winning the lottery
There is a 100.0000000% chance of this number of tickets winning the lottery


A quick sanity check confirms the function behaves as expected: playing half of all possible combinations (6,991,908 tickets) gives a 50% chance of winning, and playing all 13,983,816 of them gives 100%.
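
If the app needs the probability as a value rather than a printed message, a sketch along these lines may be useful. It assumes every ticket carries a different combination (so the individual chances simply add up) and rejects impossible inputs; the function name `multi_ticket_probability_value` is hypothetical.

```python
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # number of possible six-number tickets

def multi_ticket_probability_value(n_tickets):
    """Return the jackpot probability (0 to 1) for n_tickets distinct tickets."""
    if not 0 <= n_tickets <= TOTAL_OUTCOMES:
        raise ValueError("n_tickets must be between 0 and the number of possible tickets")
    # Distinct tickets are mutually exclusive winning events, so probabilities add.
    return n_tickets / TOTAL_OUTCOMES

for n in (1, 40, 6991908, 13983816):
    print(f"{n:>10,} tickets: {multi_ticket_probability_value(n):.7%}")
```

Returning a number leaves the formatting to the caller, which may be easier for the app's engineers to integrate than a function that prints.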

In [30]:
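def probability_less_6(n):
    # ways to match exactly n of the 6 winning numbers
    n_combinations = combinations(6,n)
    # ways to fill the remaining 6-n picks with the 43 non-winning numbers
    successful_comb = combinations(43,6-n)
    p = n_combinations*successful_comb/combinations(49,6)*100
    print('Your chances of having {} winning numbers are {:.7f}%'.format(n, p))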
    

In [31]:
for n in range(2, 6):
    probability_less_6(n)
    print('--------------------------')

Your chances of having 2 winning numbers are 13.2378029%
--------------------------
Your chances of having 3 winning numbers are 1.7650404%
--------------------------
Your chances of having 4 winning numbers are 0.0968620%
--------------------------
Your chances of having 5 winning numbers are 0.0018450%
--------------------------
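
Note that the figures above are the chances of matching exactly `n` numbers, while the original question asks about at least `n`. One way to get the latter, sketched here with `math.comb`, is to add up the exact-match probabilities from `n` through 6. The helper names `p_exactly` and `p_at_least` are hypothetical.

```python
import math

def p_exactly(n):
    """Probability that a ticket matches exactly n of the 6 drawn numbers."""
    # Choose n of the 6 winning numbers and 6 - n of the 43 non-winning numbers.
    return math.comb(6, n) * math.comb(43, 6 - n) / math.comb(49, 6)

def p_at_least(n):
    """Probability that a ticket matches n or more of the 6 drawn numbers."""
    return sum(p_exactly(k) for k in range(n, 7))

for n in range(2, 6):
    print(f"At least {n} winning numbers: {p_at_least(n):.7%}")
```

The two quantities differ only slightly for large `n`, since each additional matching number is far less likely than the one before.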


And that was it, folks!