# Mobile app for Lottery Addiction

A medical institute for preventing and treating gambling addiction wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. The institute has a team of engineers to build the app, but needs help creating its logical core and calculating the probabilities.

For the first version of the app, they want us to focus on the 6/49 lottery and help users answer the following questions:
 - What is the probability of winning the big prize with a single ticket?
 - What is the probability of winning the big prize if we play 40 different tickets (or any number)?
 - What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?
 
The institute wants us to use data from the national 6/49 lottery game in Canada. The data set covers 3,665 drawings from 1982 to 2018 and is available [here](https://www.kaggle.com/datascienceai/lottery-dataset).



We will start by coding 2 functions that we are going to use repeatedly:
 - A function that calculates factorials
 - A function that calculates combinations

In [1]:
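def factorial(n):
    # Multiply n * (n-1) * ... * 1; factorial(0) returns 1
    fact = 1
    for i in range(n, 0, -1):
        fact *= i
    return fact

def combinations(n, k):
    # Number of ways to choose k items out of n when order does not matter
    numerator = factorial(n)
    denominator = factorial(k) * factorial(n-k)
    # The quotient is always a whole number, so integer division keeps it exact
    return numerator // denominator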
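
As a quick sanity check, the factorial-based formula can be compared against `math.comb` from the standard library (available from Python 3.8). The sketch below redefines the two helpers so it runs on its own, and it assumes integer division so that the comparison is exact; treat it as an illustration rather than part of the app.

```python
import math

def factorial(n):
    """Iterative factorial, mirroring the helper defined above."""
    fact = 1
    for i in range(n, 0, -1):
        fact *= i
    return fact

def combinations(n, k):
    """n choose k via the formula n! / (k! * (n - k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))

# Compare a few values against the standard library implementation.
for n, k in [(49, 6), (6, 2), (43, 4)]:
    assert combinations(n, k) == math.comb(n, k)

total_outcomes = combinations(49, 6)
print(total_outcomes)  # 13983816
```

The value 13,983,816 is the number of possible 6/49 tickets and is reused throughout the rest of the notebook.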

The next step is to write a function that calculates the probability of winning the big prize.
In the 6/49 lottery, six numbers are drawn from a set of 49 numbers that range from 1 to 49. A winning ticket includes all six numbers drawn.

We got some recommendations from the engineering team when writing the function:
 - Inside the app, the user will input 6 different numbers from 1 to 49
 - The six numbers will come as a Python list, which will serve as the single input to our function
 - The engineering team wants the function to print the probability value in a friendly way, so that people with no probability training can understand it

In [2]:
def one_ticket_probability(list_1):
    possible_outcomes = combinations(49, 6)
    probability_ticket = 1/possible_outcomes 
    percentage = probability_ticket*100
    
    print('Your chances of winning the big prize with the numbers {} are {:.7f}%. You have 1 in {:,} chances to win.'.format(list_1, percentage, int(possible_outcomes)))
    

In [3]:
test = [1, 5, 45, 11, 9, 7]
one_ticket_probability(test)

Your chances of winning the big prize with the numbers [1, 5, 45, 11, 9, 7] are 0.0000072%. You have 1 in 13,983,816 chances to win.
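
The function above trusts that the list it receives already holds six different numbers from 1 to 49. A minimal sketch of how that assumption could be checked is shown below; `is_valid_ticket` is a hypothetical helper introduced here for illustration and is not part of the engineering team's requirements.

```python
def is_valid_ticket(numbers):
    """Return True if `numbers` holds six distinct integers from 1 to 49.

    Hypothetical helper for illustration only.
    """
    if len(numbers) != 6:
        return False
    if len(set(numbers)) != 6:  # duplicates collapse inside a set
        return False
    return all(isinstance(n, int) and 1 <= n <= 49 for n in numbers)

print(is_valid_ticket([1, 5, 45, 11, 9, 7]))   # True
print(is_valid_ticket([1, 1, 45, 11, 9, 7]))   # False: repeated number
print(is_valid_ticket([0, 5, 45, 11, 9, 50]))  # False: out of range
```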


The initial request was also that users should be able to compare their ticket against the historical lottery data from Canada and determine whether they would ever have won.

# Exploring the data for Canada lottery

In [4]:
import pandas as pd
canada_lottery = pd.read_csv('649.csv')
canada_lottery.shape

(3665, 11)

In [5]:
canada_lottery.head(3)


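| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |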


In [6]:
canada_lottery.tail(3)

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


Our next step will be writing a function that will compare a user's ticket to this data set and determine if that user would have ever won.

The engineering team asked us to consider the following:
 - The user will input 6 different numbers from 1 to 49
 - Those 6 numbers will come as a Python list and will be the input to our function
 - The function will print the number of times the user's combination occurred in the Canada data set and the probability of winning the big prize in the next drawing with that combination. 

We will first write a function that extracts all winning sets from the Canada data set.

In [7]:
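def extract_numbers(row):
    # Positions 4 to 9 hold the six drawn numbers (NUMBER DRAWN 1 to 6)
    row = row.iloc[4:10]
    # A set ignores order, so a ticket can be compared directly to a draw
    row = set(row.values)
    return row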

winning_no = canada_lottery.apply(extract_numbers, axis=1)
winning_no.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

In [8]:
def check_historical_occurence(user_list, winning_list):
    user_list = set(user_list)
    # Element-wise comparison: True for every past draw equal to the user's set
    check_occurence = user_list == winning_list
    no_occurences = check_occurence.sum()
    
    if no_occurences == 0:
        print('The combination {} never occurred. Your chances to win the big prize in the next drawing with the combination {} are 0.0000072%. You have a 1 in 13,983,816 chances to win.'.format(user_list, user_list))
    else:
        print('The combination {} occurred {} times in the past. Your chances to win the big prize in the next drawing with the combination {} are 0.0000072%. You have a 1 in 13,983,816 chances to win.'.format(user_list, no_occurences, user_list))
    
    

In [9]:
test = [1, 5, 45, 11, 9, 7]
check_historical_occurence(test, winning_no)

The combination {1, 5, 7, 9, 11, 45} never occurred. Your chances to win the big prize in the next drawing with the combination {1, 5, 7, 9, 11, 45} are 0.0000072%. You have a 1 in 13,983,816 chances to win.
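
The comparison itself does not depend on pandas. The sketch below shows the same idea with plain Python sets, using only the first three draws from the `head(3)` output as sample data; `count_occurrences` is a hypothetical name chosen for this illustration.

```python
# First three draws, copied from the head(3) output shown earlier.
past_draws = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
]

def count_occurrences(user_numbers, draws):
    """Count the past draws that contain exactly the user's six numbers."""
    ticket = set(user_numbers)  # order of the numbers does not matter
    return sum(1 for draw in draws if draw == ticket)

print(count_occurrences([1, 5, 45, 11, 9, 7], past_draws))     # 0
print(count_occurrences([41, 3, 43, 11, 14, 12], past_draws))  # 1
```

Whatever the count turns out to be, the probability printed for the next drawing stays the same, because every drawing picks from all 13,983,816 combinations again.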


# Multi-ticket probability

Since lottery addicts usually play more than one ticket, we need to help them estimate whether their chances increase significantly when they play more tickets at a time.

We will write a function that will allow users to calculate the chances of winning for any number of different tickets, considering these recommendations from the engineering team:
- The user will input the number of different tickets they want to play
- Our function will receive an integer between 1 and 13,983,816 (the maximum number of different tickets)
- The function should print out information about the probability of winning the big prize depending on the number of different tickets played.

In [10]:
def multi_ticket_probability(no_tickets):
    possible_outcomes = combinations(49, 6)
    probability = no_tickets/possible_outcomes
    percentage = probability * 100
    
    if no_tickets == 1:
        print('Your chances to win the big prize with one ticket are {:.6f}%. You have 1 in {:,} to win.'.format(percentage, int(possible_outcomes)))
    else:
        combinations_simple = round(possible_outcomes/no_tickets)
        print('Your chances to win the big prize with {:,} different tickets are {:.6f}%. You have a 1 in {:,} chances to win.'.format(no_tickets, percentage, combinations_simple))
    
    

In [11]:
test = [1, 10, 100, 100000, 6991908, 13983816]

for value in test:
    multi_ticket_probability(value)
    print('\n')

Your chances to win the big prize with one ticket are 0.000007%. You have 1 in 13,983,816 to win.


Your chances to win the big prize with 10 different tickets are 0.000072%. You have a 1 in 1,398,382 chances to win.


Your chances to win the big prize with 100 different tickets are 0.000715%. You have a 1 in 139,838 chances to win.


Your chances to win the big prize with 100,000 different tickets are 0.715112%. You have a 1 in 140 chances to win.


Your chances to win the big prize with 6,991,908 different tickets are 50.000000%. You have a 1 in 2 chances to win.


Your chances to win the big prize with 13,983,816 different tickets are 100.000000%. You have a 1 in 1 chances to win.
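
The printed "1 in N" figures are rounded, which hides the fact that the probability grows in direct proportion to the number of different tickets. A small sketch using `fractions.Fraction` keeps the values exact; `multi_ticket_fraction` and `TOTAL_OUTCOMES` are hypothetical names for this illustration, and `math.comb` (Python 3.8+) stands in for the notebook's own `combinations` helper.

```python
from fractions import Fraction
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # 13,983,816 possible tickets

def multi_ticket_fraction(no_tickets):
    """Exact probability of the big prize with `no_tickets` different tickets."""
    if not 1 <= no_tickets <= TOTAL_OUTCOMES:
        raise ValueError('no_tickets must be between 1 and 13,983,816')
    return Fraction(no_tickets, TOTAL_OUTCOMES)

for n in [1, 10, 6991908, 13983816]:
    p = multi_ticket_fraction(n)
    print('{:,} tickets: {} ({:.6f}%)'.format(n, p, float(p) * 100))
```

Playing 10 tickets gives exactly 10 times the single-ticket probability, yet the result is still only about 0.00007%.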




# Fewer winning numbers

In the last step, we are going to write a function that allows users to calculate the probability of having 2, 3, 4 or 5 winning numbers.

In most 6/49 lotteries there are smaller prizes if a player's ticket matches 2, 3, 4 or 5 of the numbers drawn.

The engineering team asked us to consider the following:
- the user will input six different numbers from 1 to 49 and an integer between 2 and 5 that represents the number of winning numbers expected
- the function will print information about the probability of having the inputted number of winning numbers

In [12]:
def probability_less_6(n_winning_numbers):
    possible_outcomes = combinations(6, n_winning_numbers)
    possible_outcomes_remaining = combinations(43, 6 - n_winning_numbers)
    successful_outcomes = possible_outcomes * possible_outcomes_remaining
    
    combinations_total = combinations(49, 6)    
    probability = successful_outcomes / combinations_total
    
    probability_percentage = probability * 100    
    combinations_simplified = round(combinations_total/successful_outcomes)    
    print('''Your chances of having {} winning numbers with this ticket are {:.6f}%.
You have a 1 in {:,} chances to win.'''.format(n_winning_numbers, probability_percentage,
                                                               int(combinations_simplified)))
    

In [14]:
for test_input in [2, 3, 4, 5]:
    probability_less_6(test_input)
    print('\n')

Your chances of having 2 winning numbers with this ticket are 13.237803%.
You have a 1 in 8 chances to win.


Your chances of having 3 winning numbers with this ticket are 1.765040%.
You have a 1 in 57 chances to win.


Your chances of having 4 winning numbers with this ticket are 0.096862%.
You have a 1 in 1,032 chances to win.


Your chances of having 5 winning numbers with this ticket are 0.001845%.
You have a 1 in 54,201 chances to win.
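
The function above counts tickets with exactly the requested number of winning numbers: it chooses that many from the 6 drawn and fills the rest from the 43 numbers not drawn. The opening question asked about "at least" that many, which can be sketched by summing the exact cases up to 6. The names `ways_exactly` and `probability_at_least` are hypothetical, and `math.comb` (Python 3.8+) is used in place of the notebook's `combinations` helper.

```python
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # 13,983,816 possible tickets

def ways_exactly(k):
    """Number of tickets that match exactly k of the 6 drawn numbers."""
    # Choose k of the 6 drawn numbers and 6 - k of the 43 not drawn.
    return math.comb(6, k) * math.comb(43, 6 - k)

def probability_at_least(k):
    """Probability of matching k or more of the 6 drawn numbers."""
    favourable = sum(ways_exactly(j) for j in range(k, 7))
    return favourable / TOTAL_OUTCOMES

for k in [2, 3, 4, 5]:
    print('At least {}: {:.6f}%'.format(k, probability_at_least(k) * 100))
```

For five numbers the difference is a single ticket: 258 tickets match exactly five numbers, and 259 match at least five once the jackpot ticket is included.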




# Conclusions

We wrote four main functions:

- one_ticket_probability() - calculates the probability of winning the big prize with a single ticket
- check_historical_occurence() - checks whether a certain combination has occurred in the Canada lottery data set
- multi_ticket_probability() - calculates the probability of winning the big prize for any number of different tickets between 1 and 13,983,816
- probability_less_6() - takes in an integer between 2 and 5 and prints the probability of having that many winning numbers on a single ticket

The next step would be to get feedback on this first version of the app and find ways to improve it or add new features.