# Mobile App for Lottery Addiction

In this project I will demonstrate the skills I have learned around:  
- Calculating empirical and theoretical probabilities
- Using probability rules to solve problems  
- Using combinations and permutations

This project is based on the following fictional scenario:  

Many people start playing the lottery for fun, but for some this activity turns into a habit which eventually escalates into addiction. Like other compulsive gamblers, lottery addicts soon begin spending from their savings and loans, start to accumulate debts, and eventually engage in desperate behaviors like theft.

A medical institute that aims to prevent and treat gambling addictions wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. The institute has a team of engineers that will build the app, but they need us to create the logical core of the app and calculate probabilities.

The institute also wants us to consider historical data coming from the national 6/49 lottery game in Canada. The data set has data for 3,665 drawings, dating from 1982 to 2018.

In [1]:
# function to calculate factorials
def factorial(n):
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product

In [2]:
# function to calculate combinations
def combinations(n, k):
    numerator = factorial(n)
    denominator = factorial(k) * factorial(n-k)
    return numerator // denominator  # integer division keeps the count exact
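
As a quick sanity check, the combinations formula C(n, k) = n! / (k! (n - k)!) can be compared against Python's standard library. The sketch below is not part of the original notebook: it assumes Python 3.8 or later, where `math.comb` is available, and uses `math.factorial` in place of the hand-written helper so that it runs on its own.

```python
import math

n, k = 49, 6

# the same formula the notebook implements, written with the standard library
by_formula = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

print(by_formula)                     # 13983816
print(by_formula == math.comb(n, k))  # True
```

The 13,983,816 possible outcomes computed here are the denominator used in every probability that follows.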

### One Ticket Probability

In the 6/49 lottery, six numbers are drawn from a set of 49 numbers that range from 1 to 49. A player wins the big prize if the six numbers on their ticket match all six numbers drawn.

For the first version of the app, we want players to be able to calculate the probability of winning the big prize with the various numbers they play on a single ticket. So, we'll start by building a function that calculates the probability of winning the big prize for any given ticket.

In [3]:
# function to calculate the probability that one ticket will win the big prize
def one_ticket_probability(ticket):
    # every combination is equally likely, so the numbers on the ticket do not change the result
    c = combinations(49, 6)
    p = 1 / c
    output = "The probability of winning is {:.6f} %".format(p * 100) # easy to read format
    print(output)

In [4]:
# testing the function
one_ticket_probability([1, 2, 3, 4, 5, 6])

The probability of winning is 0.000007 %
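
Because every combination is equally likely, the function above never inspects the ticket itself. An app taking user input would probably want to validate the ticket first. The sketch below shows one possible check; `is_valid_ticket` is a hypothetical helper that is not part of the original notebook, and the rule it enforces (six distinct integers from 1 to 49) is taken from the game description above.

```python
def is_valid_ticket(ticket):
    """Return True if ticket holds six distinct integers between 1 and 49."""
    numbers = list(ticket)
    # a valid ticket has exactly six numbers and no repeats
    if len(numbers) != 6 or len(set(numbers)) != 6:
        return False
    return all(isinstance(n, int) and 1 <= n <= 49 for n in numbers)

print(is_valid_ticket([1, 2, 3, 4, 5, 6]))   # True
print(is_valid_ticket([1, 2, 3, 4, 5, 5]))   # False: repeated number
print(is_valid_ticket([1, 2, 3, 4, 5, 50]))  # False: 50 is out of range
```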


### Historical Data Check  

For the first version of the app, users should also be able to compare their ticket against the historical lottery data in Canada and determine whether they would have ever won by now.

Next, I'll focus on exploring the historical data coming from the Canada 6/49 lottery. The data set can be downloaded from Kaggle. The data set contains historical data for 3,665 drawings (each row shows data for a single drawing), dating from 1982 to 2018.

In [5]:
# import the historical lottery dataset and display number of rows and columns
import pandas as pd
lottery = pd.read_csv('data/649.csv')
print(lottery.shape)

(3665, 11)


In [6]:
lottery.head(3) # first 3 rows

Unnamed: 0,PRODUCT,DRAW NUMBER,SEQUENCE NUMBER,DRAW DATE,NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER
0,649,1,0,6/12/1982,3,11,12,14,41,43,13
1,649,2,0,6/19/1982,8,33,36,37,39,41,9
2,649,3,0,6/26/1982,1,6,23,24,27,39,34


In [7]:
lottery.tail(3) # last 3 rows

Unnamed: 0,PRODUCT,DRAW NUMBER,SEQUENCE NUMBER,DRAW DATE,NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER
3662,649,3589,0,6/13/2018,6,22,24,31,32,34,16
3663,649,3590,0,6/16/2018,2,15,21,31,38,49,8
3664,649,3591,0,6/20/2018,14,24,31,35,37,48,17


In [8]:
# function to extract numbers from lottery dataset 
def extract_numbers(row):
    row = row.iloc[4:10] # positions 4 to 9 hold NUMBER DRAWN 1 to NUMBER DRAWN 6
    numbers = set(row.values)
    return numbers

In [9]:
# testing the extraction function
extract_numbers(lottery.iloc[1])

{8, 33, 36, 37, 39, 41}

In [10]:
# using the extraction function to extract all the winning numbers
winners = lottery.apply(extract_numbers, axis=1)
winners.head(3)

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
dtype: object
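
Positional slicing depends on the column order of the CSV file. As an alternative sketch, the same sets can be built by selecting the six `NUMBER DRAWN` columns by name. The small DataFrame below is rebuilt by hand from the three rows shown by `lottery.head(3)` so that the example runs without the data file; `number_columns`, `sample`, and `sample_winners` are illustrative names, not part of the original notebook.

```python
import pandas as pd

# column names as they appear in the data set header
number_columns = ['NUMBER DRAWN {}'.format(i) for i in range(1, 7)]

# the first three drawings, copied from the head() output
sample = pd.DataFrame(
    [[3, 11, 12, 14, 41, 43],
     [8, 33, 36, 37, 39, 41],
     [1, 6, 23, 24, 27, 39]],
    columns=number_columns,
)

# build one set of winning numbers per drawing
sample_winners = sample[number_columns].apply(lambda row: set(row.values), axis=1)
print(sample_winners.iloc[1] == {8, 33, 36, 37, 39, 41})
```

Selecting by name keeps the extraction correct even if columns are added or reordered in a later version of the file.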

In [11]:
# function to check whether the user's list has occurred in the historical list
def check_historical_occurence(user, series):
    user = set(user)
    check = series == user
    n_check = check.sum()
    if n_check == 0:
        return "Your list of numbers has never occurred in the past. Your chance of winning now is 1 in {:.0f}".format(combinations(49, 6))
    else:
        return "Your list of numbers has occurred {0} times in the past. Your chance of winning now is 1 in {1:.0f}".format(n_check, combinations(49, 6))

In [12]:
# test 1
check_historical_occurence([1,2,3,4,5,6], winners)

'Your list of numbers has never occurred in the past. Your chance of winning now is 1 in 13983816'

In [13]:
# test 2
check_historical_occurence([9,8,7,45,22,12], winners)

'Your list of numbers has never occurred in the past. Your chance of winning now is 1 in 13983816'
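
The two tests above only exercise the branch where the numbers were never drawn. To see the other branch, and to confirm that the order of the user's numbers does not matter, the sketch below runs the same set comparison on the three drawings shown by `lottery.head(3)`. It uses a plain list of sets instead of a pandas Series so that it is self-contained; `sample_winners` and `count_occurrences` are illustrative names that do not appear in the original notebook.

```python
# the first three drawings, copied from the head() output
sample_winners = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
]

def count_occurrences(user, winners):
    """Count how many past drawings match the user's numbers, ignoring order."""
    user = set(user)
    return sum(1 for drawn in winners if drawn == user)

print(count_occurrences([41, 8, 39, 33, 37, 36], sample_winners))  # 1
print(count_occurrences([1, 2, 3, 4, 5, 6], sample_winners))       # 0
```

Whatever the count is, the chance of winning the next drawing stays at 1 in 13,983,816, because past drawings do not influence future ones.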

### Multi-ticket Probability

Lottery addicts usually play more than one ticket on a single drawing, thinking that this might increase their chances of winning significantly. To help them better estimate their chances of winning, I'm going to write a function that allows users to calculate the chance of winning for any number of different tickets.

In [14]:
# function for multi-ticket probability
def multi_ticket_probability(n):
    c = combinations(49, 6)
    p = n / c
    return "If you play {0:.0f} tickets the chances of winning are {1:.6f}%, in other words 1 in {2:.0f}".format(n, p * 100, round(c/n)) 

In [15]:
multi_ticket_probability(1)

'If you play 1 tickets the chances of winning are 0.000007%, in other words 1 in 13983816'

In [16]:
multi_ticket_probability(10)

'If you play 10 tickets the chances of winning are 0.000072%, in other words 1 in 1398382'

In [17]:
multi_ticket_probability(100)

'If you play 100 tickets the chances of winning are 0.000715%, in other words 1 in 139838'

In [18]:
multi_ticket_probability(1000)

'If you play 1000 tickets the chances of winning are 0.007151%, in other words 1 in 13984'

In [19]:
multi_ticket_probability(1000000)

'If you play 1000000 tickets the chances of winning are 7.151124%, in other words 1 in 14'

In [20]:
multi_ticket_probability(6991908)

'If you play 6991908 tickets the chances of winning are 50.000000%, in other words 1 in 2'

In [21]:
multi_ticket_probability(13983816)

'If you play 13983816 tickets the chances of winning are 100.000000%, in other words 1 in 1'

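This function shows the user that the probability of winning does indeed increase with the number of tickets purchased. But for any realistic number of tickets the chance remains extremely small: even 1,000 tickets give only about a 0.007% chance of winning the big prize.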
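
The relationship can also be turned around: how many different tickets are needed to reach a given probability of winning? The following is a minimal sketch, assuming every ticket carries a different combination. `tickets_needed` is a hypothetical helper that is not part of the original notebook, and `math.comb` (Python 3.8+) stands in for the `combinations` function defined earlier.

```python
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # 13,983,816 possible tickets

def tickets_needed(target_probability):
    """Smallest number of distinct tickets whose win probability reaches the target."""
    return math.ceil(target_probability * TOTAL_OUTCOMES)

print(tickets_needed(0.5))  # 6991908
print(tickets_needed(1.0))  # 13983816
```

Half of all possible combinations are needed for a 50% chance, which matches the `multi_ticket_probability(6991908)` result above.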

### Fewer Winning Numbers

Next I'm going to write one more function to allow the users to calculate probabilities for two, three, four, or five winning numbers.

For extra context, in most 6/49 lotteries there are smaller prizes if a player's ticket matches two, three, four, or five of the six numbers drawn. As a consequence, the users might be interested in knowing the probability of having two, three, four, or five winning numbers.

In [22]:
# function to calculate the probability of matching exactly 2, 3, 4, or 5 winning numbers
def probability_less_6(n_winning_numbers):
    
    n_combinations_ticket = combinations(6, n_winning_numbers)
    n_combinations_remaining = combinations(43, 6 - n_winning_numbers)
    successful_outcomes = n_combinations_ticket * n_combinations_remaining
    
    n_combinations_total = combinations(49, 6)    
    probability = successful_outcomes / n_combinations_total
    
    probability_percentage = probability * 100    
    combinations_simplified = round(n_combinations_total/successful_outcomes)    
    print('''Your chances of having {} winning numbers with this ticket are {:.6f}%.
In other words, you have a 1 in {:,} chance of winning.'''.format(n_winning_numbers, probability_percentage,
                                                               int(combinations_simplified)))

In [23]:
probability_less_6(2)

Your chances of having 2 winning numbers with this ticket are 13.237803%.
In other words, you have a 1 in 8 chance of winning.


In [24]:
probability_less_6(3)

Your chances of having 3 winning numbers with this ticket are 1.765040%.
In other words, you have a 1 in 57 chance of winning.


In [25]:
probability_less_6(4)

Your chances of having 4 winning numbers with this ticket are 0.096862%.
In other words, you have a 1 in 1,032 chance of winning.


In [26]:
probability_less_6(5)

Your chances of having 5 winning numbers with this ticket are 0.001845%.
In other words, you have a 1 in 54,201 chance of winning.
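
One way to sanity-check the formula is to note that a ticket must match exactly 0, 1, 2, 3, 4, 5, or 6 of the drawn numbers, so the counts of successful outcomes over all seven cases should add up to the total number of possible drawings. The sketch below checks this with `math.comb` (Python 3.8+) in place of the notebook's `combinations` function; `successful_outcomes` is an illustrative helper name, not part of the original notebook.

```python
import math

def successful_outcomes(k):
    """Number of drawings that match exactly k of the six numbers on a ticket."""
    # choose k of the 6 ticket numbers, and the other 6 - k from the 43 numbers not on the ticket
    return math.comb(6, k) * math.comb(43, 6 - k)

total = math.comb(49, 6)
counts = {k: successful_outcomes(k) for k in range(7)}

print(counts[5])                      # 258
print(sum(counts.values()) == total)  # True
```

The 258 outcomes for five matches correspond to the 1 in 54,201 figure above, since 13,983,816 / 258 is roughly 54,201.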
