## Mobile App for Lottery Addiction


Many people start playing the lottery for fun, but for some the activity becomes a habit that eventually escalates into addiction. Like other compulsive gamblers, lottery addicts soon begin spending their savings and taking out loans, accumulate debts, and eventually engage in desperate behaviors such as theft.

A medical institute that aims to prevent and treat gambling addictions wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. The institute has a team of engineers that will build the app, but they need us to create the logical core of the app and calculate probabilities.

For the first version of the app, they want us to focus on the 6/49 lottery and build functions that enable users to answer questions like:

   1. What is the probability of winning the big prize with a single ticket?
   
   2. What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
   
   3. What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

## Core Functions

In [1]:
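def factorial(n):
    # n! as the product n * (n - 1) * ... * 1; returns 1 for n = 0
    f = 1
    for i in range(n):
        f *= (n - i)
    return f

def combinations(n, k):
    # Number of ways to choose k items out of n when order does not matter
    def permutation(n, k):
        # Ordered selections of k items out of n; the division is exact
        return factorial(n) // factorial(n - k)
    return permutation(n, k) // factorial(k)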
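
As a quick sanity check, the hand-written helpers can be compared against the standard library. The sketch below restates them with integer division (an assumption made here so the counts stay exact integers) and checks them against `math.factorial` and `math.comb`, which are available in Python 3.8 and later.

```python
import math

def factorial(n):
    # Product n * (n - 1) * ... * 1; the empty product gives 0! == 1
    f = 1
    for i in range(n):
        f *= (n - i)
    return f

def combinations(n, k):
    # n! / (k! * (n - k)!); floor division is exact because the result is an integer
    return factorial(n) // (factorial(k) * factorial(n - k))

# Cross-check against the standard library
assert factorial(6) == math.factorial(6)
assert combinations(49, 6) == math.comb(49, 6)
print(combinations(49, 6))  # 13983816
```

In a production version of the app, `math.comb` could replace the custom helpers outright; they are kept here because the notebook builds them from first principles.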

## One-ticket Probability

In [2]:
def one_ticket_probability(numbers):
    # Every 6-number ticket is one of C(49, 6) equally likely outcomes
    total_outcomes = combinations(49, 6)
    probability = 1 / total_outcomes

    print("Probability of a successful outcome: {:.7f}% and your list of numbers is {z}".format(probability * 100, z = numbers))

In [3]:
drawn = [1, 2, 3, 4, 5, 6]
one_ticket_probability(drawn)

Probability of a successful outcome: 0.0000072% and your list of numbers is [1, 2, 3, 4, 5, 6]
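
The figure printed above follows directly from counting combinations. Written out as a worked restatement (not part of the original notebook output):

$$
\binom{49}{6} = \frac{49!}{6!\,43!} = \frac{49 \cdot 48 \cdot 47 \cdot 46 \cdot 45 \cdot 44}{6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1} = 13{,}983{,}816,
\qquad
P(\text{big prize}) = \frac{1}{13{,}983{,}816} \approx 7.15 \times 10^{-8} \approx 0.0000072\%
$$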


## Historical Data Check for Canada Lottery

In [4]:
import pandas as pd

In [5]:
data = pd.read_csv("649.csv")

In [6]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3665 entries, 0 to 3664
Data columns (total 11 columns):
PRODUCT            3665 non-null int64
DRAW NUMBER        3665 non-null int64
SEQUENCE NUMBER    3665 non-null int64
DRAW DATE          3665 non-null object
NUMBER DRAWN 1     3665 non-null int64
NUMBER DRAWN 2     3665 non-null int64
NUMBER DRAWN 3     3665 non-null int64
NUMBER DRAWN 4     3665 non-null int64
NUMBER DRAWN 5     3665 non-null int64
NUMBER DRAWN 6     3665 non-null int64
BONUS NUMBER       3665 non-null int64
dtypes: int64(10), object(1)
memory usage: 315.0+ KB


In [7]:
data.head()

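| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |
| 3 | 649 | 4 | 0 | 7/3/1982 | 3 | 9 | 10 | 13 | 20 | 43 | 34 |
| 4 | 649 | 5 | 0 | 7/10/1982 | 5 | 14 | 21 | 31 | 34 | 47 | 45 |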


In [8]:
data.tail()

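| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3660 | 649 | 3587 | 0 | 6/6/2018 | 10 | 15 | 23 | 38 | 40 | 41 | 35 |
| 3661 | 649 | 3588 | 0 | 6/9/2018 | 19 | 25 | 31 | 36 | 46 | 47 | 26 |
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |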


## Function for Historical Data Check

In [9]:
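def extract_numbers(row):
    # Positions 4 through 9 hold NUMBER DRAWN 1 to NUMBER DRAWN 6
    row = row.iloc[4:10]
    # A set ignores the order in which the numbers were drawn
    row = set(row.values)
    return row

winning_data = data.apply(extract_numbers, axis=1)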

In [10]:
winning_data.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

In [11]:
def check_historical_occurrence(user_numbers, historical_data):
    user_numbers_set = set(user_numbers)
    # Element-wise comparison: True for every past draw equal to the user's set
    check = (user_numbers_set == historical_data)
    n_occurrences = check.sum()

    if n_occurrences == 0:
        print('''The combination {} has never occurred.
This doesn't mean it's more likely to occur now. Your chances of winning the big prize in the next drawing using the combination {} are 0.0000072%.
In other words, you have a 1 in 13,983,816 chance of winning.'''.format(user_numbers, user_numbers))
    else:
        print('''The number of times combination {} has occurred in the past is {}.
Your chances of winning the big prize in the next drawing using the combination {} are 0.0000072%.
In other words, you have a 1 in 13,983,816 chance of winning.'''.format(user_numbers, n_occurrences,
                                                                        user_numbers))

In [12]:
test_1 = {1, 2, 3, 4, 5, 6}
# Not a valid 6/49 ticket (five numbers, two of them above 49), so it can never match a draw
test_2 = {22, 33, 44, 55, 66}

In [13]:
check_historical_occurrence(test_1, winning_data)

The combination {1, 2, 3, 4, 5, 6} has never occurred.
This doesn't mean it's more likely to occur now. Your chances of winning the big prize in the next drawing using the combination {1, 2, 3, 4, 5, 6} are 0.0000072%.
In other words, you have a 1 in 13,983,816 chance of winning.
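
For readers who want to try the historical check without pandas or the full data file, here is a minimal standard-library sketch. It uses three draws copied from the head() output shown earlier; the names `load_draw_counts` and `count_historical_occurrences` are hypothetical and not part of the notebook, and the function returns a count instead of printing a message.

```python
import csv
import io
from collections import Counter

# Three draws copied from the head() output above; the real file has 3,665 rows
SAMPLE_CSV = """PRODUCT,DRAW NUMBER,SEQUENCE NUMBER,DRAW DATE,NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER
649,1,0,6/12/1982,3,11,12,14,41,43,13
649,2,0,6/19/1982,8,33,36,37,39,41,9
649,3,0,6/26/1982,1,6,23,24,27,39,34
"""

NUMBER_COLUMNS = ["NUMBER DRAWN {}".format(i) for i in range(1, 7)]

def load_draw_counts(csv_file):
    # Map each winning combination (as a frozenset) to the number of times it was drawn
    reader = csv.DictReader(csv_file)
    return Counter(
        frozenset(int(row[col]) for col in NUMBER_COLUMNS) for row in reader
    )

def count_historical_occurrences(user_numbers, draw_counts):
    # A Counter returns 0 for combinations that were never drawn
    return draw_counts[frozenset(user_numbers)]

draw_counts = load_draw_counts(io.StringIO(SAMPLE_CSV))
print(count_historical_occurrences([3, 11, 12, 14, 41, 43], draw_counts))  # 1
print(count_historical_occurrences([1, 2, 3, 4, 5, 6], draw_counts))       # 0
```

Using `frozenset` keys makes the lookup independent of the order in which the user enters the numbers, which mirrors the set comparison in the pandas version.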


## Multi-ticket Probability

In [14]:
def multi_ticket_probability(n):
    # n different tickets cover n of the C(49, 6) equally likely outcomes
    total_outcomes = combinations(49, 6)
    multi = n / total_outcomes
    percentage_form = multi * 100

    if n == 1:
        print('''Your chances of winning the big prize with one ticket are {:.6f}%.
In other words, you have a 1 in {:,} chance of winning.'''.format(percentage_form, int(total_outcomes)))

    else:
        combinations_simplified = round(total_outcomes / n)
        print('''Your chances of winning the big prize with {:,} different tickets are {:.6f}%.
In other words, you have a 1 in {:,} chance of winning.'''.format(n, percentage_form,
                                                                 combinations_simplified))


In [15]:
tests = [1, 10, 100, 10000, 1000000, 6991908, 13983816]

for test in tests:
    multi_ticket_probability(test)
    print("______________________")

Your chances of winning the big prize with one ticket are 0.000007%.
In other words, you have a 1 in 13,983,816 chance of winning.
______________________
Your chances of winning the big prize with 10 different tickets are 0.000072%.
In other words, you have a 1 in 1,398,382 chance of winning.
______________________
Your chances of winning the big prize with 100 different tickets are 0.000715%.
In other words, you have a 1 in 139,838 chance of winning.
______________________
Your chances of winning the big prize with 10,000 different tickets are 0.071511%.
In other words, you have a 1 in 1,398 chance of winning.
______________________
Your chances of winning the big prize with 1,000,000 different tickets are 7.151124%.
In other words, you have a 1 in 14 chance of winning.
______________________
Your chances of winning the big prize with 6,991,908 different tickets are 50.000000%.
In other words, you have a 1 in 2 chance of winning.
______________________
Your chances of winning the big prize with 13,983,816 different tickets are 100.000000%.
In other words, you have a 1 in 1 chance of winning.
______________________
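
The same relationship can be read in the other direction: how many different tickets are needed to reach a given probability? The sketch below is a hypothetical helper (`tickets_for_probability` is not part of the notebook) that inverts the formula n / C(49, 6) and uses `math.comb` from the standard library.

```python
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # 13,983,816 possible 6/49 tickets

def tickets_for_probability(target):
    # Number of different tickets needed for a win probability of at least `target`
    # (computed in floating point, so results right at a boundary may be off by one)
    if not 0 < target <= 1:
        raise ValueError("target must be in the interval (0, 1]")
    return math.ceil(target * TOTAL_OUTCOMES)

print(tickets_for_probability(0.5))  # 6991908
print(tickets_for_probability(1.0))  # 13983816
```

The two printed values match the last two entries of the test list above, which correspond to a 50% and a 100% chance of winning.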

## Fewer Winning Numbers: Function

In [17]:
def probability_less_6(n_winning_numbers):

    # Ways to pick the matching numbers out of the 6 on the ticket
    n_combinations_ticket = combinations(6, n_winning_numbers)
    # Ways to fill the rest of the draw from the 43 numbers not on the ticket
    n_combinations_remaining = combinations(43, 6 - n_winning_numbers)
    successful_outcomes = n_combinations_ticket * n_combinations_remaining

    n_combinations_total = combinations(49, 6)
    probability = successful_outcomes / n_combinations_total

    probability_percentage = probability * 100
    combinations_simplified = round(n_combinations_total/successful_outcomes)
    print('''Your chances of having {} winning numbers with this ticket are {:.6f}%.
In other words, you have a 1 in {:,} chance of winning.'''.format(n_winning_numbers, probability_percentage,
                                                                 int(combinations_simplified)))


In [18]:
for test_input in [2, 3, 4, 5]:
    probability_less_6(test_input)
    print('--------------------------')

Your chances of having 2 winning numbers with this ticket are 13.237803%.
In other words, you have a 1 in 8 chance of winning.
--------------------------
Your chances of having 3 winning numbers with this ticket are 1.765040%.
In other words, you have a 1 in 57 chance of winning.
--------------------------
Your chances of having 4 winning numbers with this ticket are 0.096862%.
In other words, you have a 1 in 1,032 chance of winning.
--------------------------
Your chances of having 5 winning numbers with this ticket are 0.001845%.
In other words, you have a 1 in 54,201 chance of winning.
--------------------------
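
One way to gain confidence in the counting logic is to check that the probabilities of matching exactly 0, 1, ..., 6 numbers add up to 1, since those seven cases are exhaustive and mutually exclusive. The sketch below assumes a hypothetical return-value variant, `probability_exactly`, built on `math.comb` instead of the notebook's own helpers.

```python
import math

def probability_exactly(k):
    # Probability that exactly k of the 6 numbers on a ticket are drawn:
    # choose k from the ticket's 6 numbers and 6 - k from the other 43
    successful = math.comb(6, k) * math.comb(43, 6 - k)
    return successful / math.comb(49, 6)

probabilities = {k: probability_exactly(k) for k in range(7)}
for k, p in probabilities.items():
    print("{} matching numbers: {:.6f}%".format(k, p * 100))

# The seven cases cover every possible draw, so they must sum to 1
assert math.isclose(sum(probabilities.values()), 1.0)
```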


## Next Steps

For the first version of the app, we coded four main functions:

- one_ticket_probability(): calculates the probability of winning the big prize with a single ticket
- check_historical_occurrence(): checks whether a certain combination has occurred in the Canada lottery data set
- multi_ticket_probability(): calculates the probability for any number of tickets between 1 and 13,983,816
- probability_less_6(): calculates the probability of having exactly two, three, four or five winning numbers

Possible features for a second version of the app include:

- Making the outputs even easier to understand by adding fun analogies. For example, we could find probabilities for strange events and compare them with the chances of winning the lottery, with output along the lines of "You are 100 times more likely to be the victim of a shark attack than to win the lottery".
- Combining one_ticket_probability() and check_historical_occurrence() to output information on probability and historical occurrence at the same time.
- Creating a function similar to probability_less_6() that calculates the probability of having at least two, three, four or five winning numbers. Hint: the number of successful outcomes for having at least four winning numbers is the sum of these three numbers:
  - The number of successful outcomes for having exactly four winning numbers
  - The number of successful outcomes for having exactly five winning numbers
  - The number of successful outcomes for having exactly six winning numbers
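
The hint above can be sketched as follows. This is one possible shape for the "at least" function, not the app's final implementation: the name `probability_at_least` is hypothetical, it returns the probability instead of printing it, and it relies on `math.comb` from the standard library.

```python
import math

def probability_at_least(n_winning_numbers):
    # Sum the successful outcomes for exactly k matches,
    # for every k from the threshold up to 6
    successful_outcomes = sum(
        math.comb(6, k) * math.comb(43, 6 - k)
        for k in range(n_winning_numbers, 7)
    )
    return successful_outcomes / math.comb(49, 6)

for threshold in [2, 3, 4, 5]:
    print("At least {} winning numbers: {:.6f}%".format(
        threshold, probability_at_least(threshold) * 100))
```

For a threshold of four, the sum is 13,545 + 258 + 1 = 13,804 successful outcomes out of 13,983,816.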