<h2>Mobile App for Lottery Addiction</h2>

In this project we compute the probability of several outcomes in a 6/49 lottery. The questions we are going to answer are:  
What is the probability of winning the big prize with a single ticket?  
What is the probability of winning the big prize if we play 40 different tickets (or any other number)?  
What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?  


In [1]:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline


In [3]:
def factorial(n):
    # Multiply n * (n-1) * ... * 1; returns 1 when n is 0
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product

In [4]:
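def combinations(n,k):
    # Ways to choose k items from n without regard to order: n! / (k! * (n-k)!)
    # The division is always exact, so integer division keeps the count an int
    c = factorial(n) // (factorial(k) * factorial(n-k))
    return c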
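
As a cross-check on the hand-written helpers, the count of possible 6/49 draws can also be computed with the standard library. This is a minimal sketch, assuming Python 3.8 or later (where `math.comb` is available); the name `combinations_exact` is introduced here only for illustration and is not part of the notebook.

```python
import math

def combinations_exact(n, k):
    """Number of ways to choose k items from n, as an exact integer."""
    return math.comb(n, k)

# Total number of distinct 6-number tickets in a 6/49 lottery.
total_outcomes = combinations_exact(49, 6)
print(total_outcomes)  # 13983816
```

The result matches the "1 in 13983816" figure printed later in the notebook.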

In [16]:
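def one_ticket_probability(x_list):
    # Every 6-number combination is equally likely, so the result does not depend on x_list
    tot_combinations = combinations(49,6)
    prob_1 = 1 / tot_combinations * 100  # probability expressed as a percentage
    print("The chances of winning with these numbers are {:8f}%".format(prob_1))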
    

In [17]:
test1_list = [1,2,3,4,5,6]
test1 = one_ticket_probability(test1_list)

The chances of winning with these numbers are 0.000007%


I present the result as a percentage in fixed-point rather than scientific notation, so that it is easier for people to grasp how small the value really is.
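
To illustrate the formatting choice, the sketch below prints the same value with default formatting and with fixed-point formatting. The helper `format_percentage` is a hypothetical name used only for this example. Note that the `{:8f}` specifier used in the notebook sets a minimum field width of 8 with the default 6 decimal places; it does not request 8 decimals.

```python
def format_percentage(probability, decimals=6):
    """Return a probability (between 0 and 1) as a fixed-point percentage string."""
    return "{:.{d}f}%".format(probability * 100, d=decimals)

p = 1 / 13983816  # chance of winning the big prize with one ticket

# Default formatting falls back to scientific notation for very small floats.
print("{}".format(p * 100))

# Fixed-point formatting keeps the leading zeros visible.
print(format_percentage(p))     # 0.000007%
print(format_percentage(p, 8))  # 0.00000715%
```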

In this section we check the specific set of numbers chosen by the user against historical data from the Canadian lottery. The data can be downloaded from this link: https://www.kaggle.com/datascienceai/lottery-dataset

In [18]:
df = pd.read_csv('649.csv')

In [19]:
df.head()

Unnamed: 0,PRODUCT,DRAW NUMBER,SEQUENCE NUMBER,DRAW DATE,NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER
0,649,1,0,6/12/1982,3,11,12,14,41,43,13
1,649,2,0,6/19/1982,8,33,36,37,39,41,9
2,649,3,0,6/26/1982,1,6,23,24,27,39,34
3,649,4,0,7/3/1982,3,9,10,13,20,43,34
4,649,5,0,7/10/1982,5,14,21,31,34,47,45


In [21]:
df.tail(3)

Unnamed: 0,PRODUCT,DRAW NUMBER,SEQUENCE NUMBER,DRAW DATE,NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER
3662,649,3589,0,6/13/2018,6,22,24,31,32,34,16
3663,649,3590,0,6/16/2018,2,15,21,31,38,49,8
3664,649,3591,0,6/20/2018,14,24,31,35,37,48,17


In [20]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3665 entries, 0 to 3664
Data columns (total 11 columns):
PRODUCT            3665 non-null int64
DRAW NUMBER        3665 non-null int64
SEQUENCE NUMBER    3665 non-null int64
DRAW DATE          3665 non-null object
NUMBER DRAWN 1     3665 non-null int64
NUMBER DRAWN 2     3665 non-null int64
NUMBER DRAWN 3     3665 non-null int64
NUMBER DRAWN 4     3665 non-null int64
NUMBER DRAWN 5     3665 non-null int64
NUMBER DRAWN 6     3665 non-null int64
BONUS NUMBER       3665 non-null int64
dtypes: int64(10), object(1)
memory usage: 315.0+ KB


In [22]:
def extract_numbers(row):
    # Collect the six main winning numbers of one draw (the bonus number is excluded)
    return set(row['NUMBER DRAWN {}'.format(i)] for i in range(1, 7))

In [26]:
win_series = df.apply(extract_numbers,axis=1)

In [28]:
print(win_series[0:5])

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object


In [37]:
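def check_historical_occurence(user_list, hist_win):
    user_set = set(user_list)
    # Boolean Series: True where a past draw matches all six user numbers, in any order
    wins = hist_win==user_set
    print("The number of times this combination won in the past is {}".format(wins.sum()))
    # The chance of winning the next draw is the same regardless of past results
    one_ticket_probability(user_list)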


In [38]:
test2 = check_historical_occurence(test1_list, win_series)

The number of times this combination won in the past is 0
The chances of winning with these numbers are 0.000007%


In [39]:
test_list2 = [22,3,41,17,19,34]
test3 = check_historical_occurence(test_list2, win_series)

The number of times this combination won in the past is 0
The chances of winning with these numbers are 0.000007%
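
The historical check can also be written with plain Python sets, which makes the comparison explicit and does not rely on how pandas compares a Series against a set. This is only a sketch: `count_exact_matches` and `sample_draws` are names introduced for illustration, and the sample holds just the first three draws shown in the data preview. In the notebook, `win_series` would play the role of `historical_sets`.

```python
def count_exact_matches(user_numbers, historical_sets):
    """Count past draws whose six main numbers equal the user's numbers."""
    user_set = set(user_numbers)
    # Sets compare equal when they hold the same elements, regardless of order.
    return sum(1 for draw in historical_sets if set(draw) == user_set)

# First three draws from the dataset preview (bonus number excluded).
sample_draws = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
]

print(count_exact_matches([1, 2, 3, 4, 5, 6], sample_draws))       # 0
print(count_exact_matches([43, 41, 14, 12, 11, 3], sample_draws))  # 1
```

Returning the count instead of printing it lets the caller decide how to present the result to the user.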


In [56]:
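def multi_ticket_probability(n):
    # n is the number of distinct tickets played, so n of the possible outcomes win
    total_combinations = combinations(49,6)
    chances = n / total_combinations * 100  # probability expressed as a percentage
    text = round(total_combinations / n)  # the same odds in "1 in X" form
    print("Your chances of winning with {} different tickets are {:8f}% meaning it is 1 in {}".format(n, chances,text))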
    

In [57]:
test4 = multi_ticket_probability(1)

Your chances of winning with 1 different tickets are 0.000007% meaning it is 1 in 13983816


In [58]:
test5 = multi_ticket_probability(10)

Your chances of winning with 10 different tickets are 0.000072% meaning it is 1 in 1398382


In [59]:
test4 = multi_ticket_probability(100)

Your chances of winning with 100 different tickets are 0.000715% meaning it is 1 in 139838


In [60]:
test4 = multi_ticket_probability(1000)

Your chances of winning with 1000 different tickets are 0.007151% meaning it is 1 in 13984


In [61]:
test4 = multi_ticket_probability(1000000)

Your chances of winning with 1000000 different tickets are 7.151124% meaning it is 1 in 14


In [62]:
test4 = multi_ticket_probability(6991908)

Your chances of winning with 6991908 different tickets are 50.000000% meaning it is 1 in 2


In [63]:
test4 = multi_ticket_probability(13983816)

Your chances of winning with 13983816 different tickets are 100.000000% meaning it is 1 in 1
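
The calculation above only holds when all tickets are different and their number does not exceed the total number of possible outcomes. A minimal sketch that makes this assumption explicit and returns the values instead of printing them could look like the following; `multi_ticket_odds` and `TOTAL_OUTCOMES` are hypothetical names, and `math.comb` assumes Python 3.8 or later.

```python
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # 13983816 possible tickets

def multi_ticket_odds(n_tickets):
    """Return (percentage, one_in) for n_tickets distinct tickets."""
    if not 1 <= n_tickets <= TOTAL_OUTCOMES:
        raise ValueError("n_tickets must be between 1 and {}".format(TOTAL_OUTCOMES))
    # Each distinct ticket covers exactly one of the equally likely outcomes.
    percentage = n_tickets / TOTAL_OUTCOMES * 100
    one_in = round(TOTAL_OUTCOMES / n_tickets)
    return percentage, one_in

print(multi_ticket_odds(1))        # (7.15...e-06, 13983816)
print(multi_ticket_odds(6991908))  # (50.0, 2)
```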


In [70]:
def probability_less_6(n):
    # Ways to pick n of the 6 winning numbers
    comb = combinations(6,n)
    # Ways to fill the other 6 - n slots from the 43 non-winning numbers
    n_combinations_remaining = combinations(43, 6 - n)
    successful_outcomes = comb * n_combinations_remaining
    total_outcomes = combinations(49,6)
    answer = successful_outcomes / total_outcomes
    answer_pct = answer * 100
    text = round(total_outcomes / successful_outcomes)
    print("The chances of having exactly {} winning numbers is {:8f}% meaning you have 1 in {} chance".format(n,answer_pct,text))

    

In [73]:
test5 = probability_less_6(5)

The chances of having exactly 5 winning numbers is 0.001845% meaning you have 1 in 54201 chance


In [75]:
test5 = probability_less_6(4)

The chances of having exactly 4 winning numbers is 0.096862% meaning you have 1 in 1032 chance


In [76]:
test5 = probability_less_6(3)

The chances of having exactly 3 winning numbers is 1.765040% meaning you have 1 in 57 chance


In [77]:
test5 = probability_less_6(2)

The chances of having exactly 2 winning numbers is 13.237803% meaning you have 1 in 8 chance
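
`probability_less_6` reports the chance of matching exactly n numbers. The chance of matching at least n numbers is the sum of the "exactly" probabilities for n, n + 1, ..., 6. The sketch below shows one way to compute it; `prob_exactly` and `prob_at_least` are names introduced here for illustration, and `math.comb` assumes Python 3.8 or later.

```python
import math

TOTAL_OUTCOMES = math.comb(49, 6)

def prob_exactly(k):
    """Probability that a ticket matches exactly k of the 6 drawn numbers."""
    # Choose k of the 6 winning numbers and 6 - k of the 43 non-winning ones.
    return math.comb(6, k) * math.comb(43, 6 - k) / TOTAL_OUTCOMES

def prob_at_least(k):
    """Probability that a ticket matches k or more of the 6 drawn numbers."""
    return sum(prob_exactly(j) for j in range(k, 7))

for k in (5, 4, 3, 2):
    print("At least {} winning numbers: {:.6f}%".format(k, prob_at_least(k) * 100))
```

For five or more matches the difference from the "exactly" figure is small, because it only adds the single jackpot combination to the 258 five-match combinations.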
