## Guided Project: Mobile App for Lottery Addiction


A medical institute that aims to prevent and treat gambling addictions wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. The institute has a team of engineers that will build the app, but they need us to create the logical core of the app and calculate probabilities.

For the first version of the app, they want us to focus on the [6/49 lottery](https://en.wikipedia.org/wiki/Lotto_6/49) and build functions that enable users to answer questions like:

    * What is the probability of winning the big prize with a single ticket?
    * What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
    * What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?
    
The institute also wants us to consider historical data coming from the national 6/49 lottery game in Canada. The data set has data for 3,665 drawings, dating from 1982 to 2018 (we'll come back to this).

The scenario we're following throughout this project is fictional.

In [1]:
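## Step 2 of 8 - two functions

def factorial(n):
    """Return n! for a non-negative integer n."""
    result = 1
    i = 1
    while i <= n:
        result = result * i
        i += 1
    return result


def combinations(n, k):
    """Return the number of ways to choose k items from n (a float, because of true division)."""
    result = factorial(n) / (factorial(k) * factorial(n - k))
    return result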
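
As a sanity check on the two helpers above, the sketch below compares an integer-only variant with Python's built-in `math.comb` (available since Python 3.8). The name `combinations_int` is hypothetical and not part of the project; it uses floor division so the count stays an `int` rather than a float such as `13983816.0`.

```python
import math


def factorial(n):
    """Return n! for a non-negative integer n."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def combinations_int(n, k):
    """Hypothetical integer-only variant: n choose k via floor division."""
    return factorial(n) // (factorial(k) * factorial(n - k))


# Both approaches should agree on the number of possible 6/49 tickets.
total_tickets = combinations_int(49, 6)
print(total_tickets)                      # 13983816
print(total_tickets == math.comb(49, 6))  # True
```

Floor division is exact here because a binomial coefficient is always a whole number.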

We now write a function that prints, in a friendly way, the chance of success for a person taking part in the lottery. This gives them a sense of their real chance of winning the big prize with the single six-number ticket they have purchased.

In [2]:
## step 3 out of 8

def one_ticket_probability(numbers):
    """Print the chance of winning the big prize with a single ticket."""
    num_possible = 49            # numbers available in the 6/49 lottery
    num_chosen = len(numbers)    # numbers on the ticket (six)

    total_outcomes = combinations(num_possible, num_chosen)
    num_success = 1              # only one combination wins the big prize

    p_success = num_success / total_outcomes
    prob_percentage = "{:.8%}".format(p_success)

    print("You have a {} chance of winning the big prize".format(prob_percentage))
    print("That is to say your chances of winning the big prize is 1 to {}".format(total_outcomes))
    


In [3]:
numbers = [1,2,3,4,5,6]

one_ticket_probability(numbers)

You have a 0.00000715% chance of winning the big prize
That is to say your chances of winning the big prize is 1 to 13983816.0


In [4]:
## step 4 of 8

import pandas as pd
import numpy as np

data = pd.read_csv("649.csv")

data.head(3)



Unnamed: 0,PRODUCT,DRAW NUMBER,SEQUENCE NUMBER,DRAW DATE,NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER
0,649,1,0,6/12/1982,3,11,12,14,41,43,13
1,649,2,0,6/19/1982,8,33,36,37,39,41,9
2,649,3,0,6/26/1982,1,6,23,24,27,39,34


In [5]:
data.tail(3)

Unnamed: 0,PRODUCT,DRAW NUMBER,SEQUENCE NUMBER,DRAW DATE,NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER
3662,649,3589,0,6/13/2018,6,22,24,31,32,34,16
3663,649,3590,0,6/16/2018,2,15,21,31,38,49,8
3664,649,3591,0,6/20/2018,14,24,31,35,37,48,17


In [6]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3665 entries, 0 to 3664
Data columns (total 11 columns):
PRODUCT            3665 non-null int64
DRAW NUMBER        3665 non-null int64
SEQUENCE NUMBER    3665 non-null int64
DRAW DATE          3665 non-null object
NUMBER DRAWN 1     3665 non-null int64
NUMBER DRAWN 2     3665 non-null int64
NUMBER DRAWN 3     3665 non-null int64
NUMBER DRAWN 4     3665 non-null int64
NUMBER DRAWN 5     3665 non-null int64
NUMBER DRAWN 6     3665 non-null int64
BONUS NUMBER       3665 non-null int64
dtypes: int64(10), object(1)
memory usage: 315.0+ KB


We extract the winning combinations from the historical data, compare the user's combination with each historical combination, and print the result.

In [7]:
## Slide 5 of 8

def extract_numbers(input_series):
    """Return the six winning numbers of one drawing (one row) as a set."""
    # Positions 4 to 9 hold NUMBER DRAWN 1 through NUMBER DRAWN 6
    drawn = input_series.iloc[4:10]
    result = set(drawn.values)
    return result


# Use extract_numbers() with DataFrame.apply() (axis=1, row by row) to extract all the winning numbers
win_num = data.apply(extract_numbers, axis=1)

win_num

0        {3, 41, 11, 12, 43, 14}
1        {33, 36, 37, 39, 8, 41}
2         {1, 6, 39, 23, 24, 27}
3         {3, 9, 10, 43, 13, 20}
4        {34, 5, 14, 47, 21, 31}
5        {8, 41, 20, 21, 25, 31}
6       {33, 36, 42, 18, 25, 28}
7        {7, 40, 16, 17, 48, 31}
8        {5, 38, 37, 10, 23, 27}
9        {4, 37, 46, 15, 48, 30}
10        {33, 38, 7, 9, 42, 21}
11      {36, 11, 43, 17, 19, 20}
12       {37, 7, 14, 47, 17, 20}
13      {35, 44, 25, 28, 29, 30}
14       {36, 39, 8, 41, 47, 18}
15       {9, 12, 13, 14, 44, 48}
16       {4, 40, 43, 44, 14, 18}
17      {34, 35, 36, 13, 16, 18}
18      {36, 11, 23, 25, 28, 29}
19       {37, 7, 45, 18, 23, 25}
20      {37, 11, 45, 18, 19, 31}
21       {8, 14, 16, 48, 18, 31}
22       {4, 11, 45, 23, 24, 25}
23        {33, 34, 3, 4, 48, 19}
24       {5, 43, 17, 21, 28, 30}
25       {36, 6, 38, 46, 17, 24}
26        {4, 9, 10, 11, 43, 46}
27       {32, 33, 7, 13, 45, 23}
28      {35, 37, 11, 18, 22, 28}
29      {35, 45, 48, 25, 26, 31}
          

In [8]:
def check_historical_occurence(win_num, usr_num):
    """Print how many historical drawings exactly match the user's combination."""
    user_num_set = set(usr_num)

    # Compare against every historical drawing, not only the first two
    times_won = sum(user_num_set == drawn for drawn in win_num)
    print("The number of times your combination -{}- has won is {} times before".format(usr_num, times_won))


numbers = set([3, 41, 11, 12, 43, 14])

check_historical_occurence(win_num,numbers)

one_ticket_probability(numbers)


The number of times your combination -{3, 41, 11, 12, 43, 14}- has won is 1 times before
You have a 0.00000715% chance of winning the big prize
That is to say your chances of winning the big prize is 1 to 13983816.0
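
The historical check can also be tried without the full data set. The sketch below is one possible approach, using a tiny hand-made DataFrame in place of `649.csv`; its three rows are copied from the `data.head(3)` output above, and `count_historical_wins` is a hypothetical name that returns the count instead of printing it.

```python
import pandas as pd

# Stand-in for 649.csv: the first three drawings shown in data.head(3).
toy = pd.DataFrame(
    {
        "NUMBER DRAWN 1": [3, 8, 1],
        "NUMBER DRAWN 2": [11, 33, 6],
        "NUMBER DRAWN 3": [12, 36, 23],
        "NUMBER DRAWN 4": [14, 37, 24],
        "NUMBER DRAWN 5": [41, 39, 27],
        "NUMBER DRAWN 6": [43, 41, 39],
    }
)

# One set of six numbers per drawing; order within a drawing does not matter.
winning_sets = [set(row) for row in toy.to_numpy().tolist()]


def count_historical_wins(winning_sets, user_numbers):
    """Hypothetical helper: count drawings that exactly match the user's numbers."""
    user_set = set(user_numbers)
    return sum(user_set == drawn for drawn in winning_sets)


print(count_historical_wins(winning_sets, [3, 41, 11, 12, 43, 14]))  # 1
print(count_historical_wins(winning_sets, [1, 2, 3, 4, 5, 6]))       # 0
```

Comparing sets rather than lists means the order in which the user enters the numbers has no effect on the result.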


Lottery addicts usually play more than one ticket on a single drawing, thinking that this might increase their chances of winning significantly. Our purpose is to help them better estimate their chances of winning. On this screen, we're going to write a function that will allow the users to calculate the chances of winning for any number of different tickets.

We've talked with the engineering team and they gave us the following information:

    * The user will input the number of different tickets they want to play (without inputting the specific combinations they intend to play).
    * Our function will see an integer between 1 and 13,983,816 (the maximum number of different tickets).
    * The function should print information about the probability of winning the big prize depending on the number of different tickets played.

In [9]:
# Slide 6 of 8

def multi_ticket_probability(num_plays):
    """Print the chance of winning the big prize when playing num_plays different tickets."""
    k = 6 # numbers in the lottery ticket
    n = 49 # total numbers to choose from
    
    total_outcomes = combinations(n,k)
    
    num_success = num_plays # number of successful outcomes is the number of different tickets the user plays
    
    prob_success = num_success/total_outcomes
    
    prob_success_percentage = "{:.8%}".format(prob_success)
    
    print ("You have a {} chance of winning the big prize".format(prob_success_percentage))
    print ("That is to say your chances of winning the big prize is {} in {}".format(num_success,total_outcomes))
    
multi_ticket_probability(10000)
    
    

You have a 0.07151124% chance of winning the big prize
That is to say your chances of winning the big prize is 10000 in 13983816.0
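
The engineering notes state that the input is an integer between 1 and 13,983,816. One way to honor that range is to validate the ticket count before computing anything. The sketch below is a hypothetical variant: the name `multi_ticket_odds` and the choice to raise `ValueError` are assumptions, not requirements from the engineering team, and it returns the probability instead of printing it.

```python
import math

TOTAL_TICKETS = math.comb(49, 6)  # 13,983,816 possible different tickets


def multi_ticket_odds(num_tickets):
    """Hypothetical helper: probability of the big prize with num_tickets different tickets."""
    if not isinstance(num_tickets, int) or not 1 <= num_tickets <= TOTAL_TICKETS:
        raise ValueError(
            "num_tickets must be an integer between 1 and {}".format(TOTAL_TICKETS)
        )
    # Each different ticket covers exactly one of the equally likely outcomes.
    return num_tickets / TOTAL_TICKETS


print("{:.8%}".format(multi_ticket_odds(10000)))  # 0.07151124%
print(multi_ticket_odds(TOTAL_TICKETS))           # 1.0
```

Returning a number rather than printing makes the function easier for the app to reuse, since the display format can then be chosen separately.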


On this screen, we're going to write one more function to allow the users to calculate probabilities for two, three, four, or five winning numbers.

For extra context, in most 6/49 lotteries there are smaller prizes if a player's ticket matches two, three, four, or five of the six numbers drawn. As a consequence, the users might be interested in knowing the probability of having two, three, four, or five winning numbers.

These are the engineering details we'll need to be aware of:

    * Inside the app, the user inputs:
        * six different numbers from 1 to 49; and
        * an integer between 2 and 5 that represents the number of winning numbers expected.
    * Our function prints information about the probability of having the inputted number of winning numbers.

In [10]:
#Slide 7 of 8

def probability_less_6(win_numbers):
    """Print the chance of having exactly win_numbers winning numbers on one ticket."""
    n = 49 # total numbers
    k = 6 # numbers in a ticket
    
    # Choose which of the ticket's numbers win, then fill the rest of the draw from the other n-k numbers
    successful_outcomes = combinations(k,win_numbers)*combinations((n-k),(k-win_numbers))
    total_outcomes = combinations(n,k)
    
    prob_success_less6 = successful_outcomes / total_outcomes
    prob_success_less6_percent = "{:.6%}".format(prob_success_less6)
    print ("You have a {} chance of having exactly {} winning numbers".format(prob_success_less6_percent, win_numbers))

probability_less_6(5)


You have a 0.001845% chance of having exactly 5 winning numbers


In [11]:
probability_less_6(4)

You have a 0.096862% chance of having exactly 4 winning numbers


In [12]:
probability_less_6(3)

You have a 1.765040% chance of having exactly 3 winning numbers


In [13]:
probability_less_6(2)

You have a 13.237803% chance of having exactly 2 winning numbers
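
The count used in `probability_less_6()` follows the hypergeometric pattern: choose which m of the 6 ticket numbers are drawn, then choose the remaining 6 - m drawn numbers from the 43 numbers not on the ticket. A quick consistency check, sketched below with `math.comb` and exact fractions, is that the probabilities for 0 through 6 matches add up to 1. The name `exact_match_probability` is hypothetical.

```python
import math
from fractions import Fraction


def exact_match_probability(matches, n=49, k=6):
    """Hypothetical helper: exact probability of `matches` winning numbers on one ticket."""
    favorable = math.comb(k, matches) * math.comb(n - k, k - matches)
    return Fraction(favorable, math.comb(n, k))


# Probabilities for every possible number of matches, 0 through 6.
probs = {m: exact_match_probability(m) for m in range(7)}

for m in range(2, 6):
    print(m, "{:.6%}".format(float(probs[m])))

# The seven cases are mutually exclusive and cover every outcome.
print(sum(probs.values()) == 1)  # True
```

Using `Fraction` avoids floating-point rounding, so the sum can be compared to 1 exactly.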


### Optional extra: probability of *at least* 2, 3, 4, or 5 winning numbers

Here we create a function similar to probability_less_6() that calculates the probability of having at least two, three, four, or five winning numbers.

Hint: the number of successful outcomes for having at least four winning numbers is the sum of these three numbers:

    * The number of successful outcomes for having four winning numbers exactly
    * The number of successful outcomes for having five winning numbers exactly
    * The number of successful outcomes for having six winning numbers exactly

In [23]:
def probability_atleast_less_6(win_numbers):
    """Print the chance of having at least win_numbers winning numbers on one ticket."""
    n = 49 # total numbers
    k = 6 # numbers in a ticket
    successful_outcomes = []
    
    # Sum the exact-match counts from win_numbers up to and including k (six matches)
    for win_count in range(win_numbers, k + 1):
        successful_outcomes.append(combinations(k,win_count)*combinations((n-k),(k-win_count)))
    
    total_outcomes = combinations(n,k)
    
    prob_success = sum(successful_outcomes) / total_outcomes
    prob_success_percent = "{:.6%}".format(prob_success)
    print ("You have a {} chance of having at least {} winning numbers".format(prob_success_percent, win_numbers))
    
    
probability_atleast_less_6(2)

You have a 15.101557% chance of having at least 2 winning numbers
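
An independent way to check an "at least" probability is the complement rule: the chance of at least m winning numbers equals 1 minus the chance of fewer than m. The sketch below (hypothetical function names, using `math.comb`) computes the value both ways and compares them. The direct sum runs through six matches inclusive, as the hint describes.

```python
import math


def exact_count(matches, n=49, k=6):
    """Number of draws that match exactly `matches` numbers of a given ticket."""
    return math.comb(k, matches) * math.comb(n - k, k - matches)


def at_least_probability(min_matches, n=49, k=6):
    """Direct sum over min_matches, ..., k (the six-match case is included)."""
    favorable = sum(exact_count(m, n, k) for m in range(min_matches, k + 1))
    return favorable / math.comb(n, k)


def at_least_via_complement(min_matches, n=49, k=6):
    """Complement rule: 1 minus the probability of fewer than min_matches."""
    fewer = sum(exact_count(m, n, k) for m in range(min_matches))
    return 1 - fewer / math.comb(n, k)


print("{:.6%}".format(at_least_probability(2)))  # 15.101557%
print(math.isclose(at_least_probability(2), at_least_via_complement(2)))  # True
```

If the two results disagree, the usual culprit is a range that stops one short and drops the six-match case.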
