# Mobile App for Lottery Addiction

Many people start playing the lottery for fun, but for some this activity turns into a habit which eventually escalates into addiction. Like other compulsive gamblers, lottery addicts soon begin spending from their savings and loans, they start to accumulate debts, and eventually engage in desperate behaviors like theft.

A medical institute that aims to prevent and treat gambling addictions wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. The institute has a team of engineers that will build the app, but they need us to create the logical core of the app and calculate the probabilities.

For the first version of the app, they want us to focus on the [Lotto 6/49](6/49) lottery and build functions that enable users to answer questions like:

* What is the probability of winning the big prize with a single ticket?
* What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
* What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

The institute also wants to consider [historical data](https://www.kaggle.com/datascienceai/lottery-dataset) coming from the national 6/49 lottery game in Canada.


**Creating functions to calculate factorials and combinations**

As we saw above, our goal is to write code that enables users to answer probability questions about playing the lottery. Throughout the project, we'll need to calculate probabilities and combinations repeatedly. As a consequence, we'll start by writing two functions that we'll use often:

1. A function that calculates factorials
2. A function that calculates combinations

To calculate factorials, we use the formula:

$$n! = n \times (n-1) \times (n-2) \times \dots \times 2 \times 1$$

To calculate combinations, we use the formula:

$${}_nC_k = \frac{n!}{k!\,(n-k)!}$$


In [1]:
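# Creating the factorial function

def factorial(n):
    result = 1  # 0! and 1! are both 1
    for i in range(1, n + 1):
        result *= i
    return result

# Creating the combinations function

def combinations(n, k):
    # Integer division keeps the count exact; the quotient is always a whole number
    return factorial(n) // (factorial(k) * factorial(n - k))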
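As a quick sanity check on the formula, the sketch below recomputes the number of possible 6/49 tickets and compares it with the standard library's `math.comb` (available from Python 3.8). The name `combinations_reference` is a hypothetical helper introduced only for this check; it is not part of the app code.

```python
import math

def combinations_reference(n, k):
    """Hypothetical helper: n! / (k! * (n - k)!) written with math.factorial."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

total = combinations_reference(49, 6)
print(total)                      # 13983816
print(total == math.comb(49, 6))  # True
```

The value 13,983,816 is the total number of outcomes used in every probability later in the project.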

# Function to calculate the probability of winning the big prize

In the 6/49 lottery, six numbers are drawn from a set of 49 numbers that range from 1 to 49. A player wins the big prize if the six numbers on their ticket match all six numbers drawn. If a player has a ticket with the numbers {13, 22, 24, 27, 42, 44}, they only win the big prize if the numbers drawn are {13, 22, 24, 27, 42, 44}. If even one number differs, they don't win.

We discussed with the engineering team of the medical institute, and they told us we need to be aware of the following details when we write the function:

* Inside the app, the user inputs six different numbers from 1 to 49.
* Under the hood, the six numbers will come as a Python list, which will serve as the single input to our function.
* The engineering team wants the function to print the probability value in a friendly way, that is, in a way that people without any probability training are able to understand.

In [2]:
# Probability function
def one_ticket_probability(list_nos):  # takes the six ticket numbers as a list
    # Total outcomes: the number of ways to choose k = 6 numbers
    # from the n = 49 numbers in the range 1 to 49.
    total_outcomes = combinations(49, 6)

    # All six numbers drawn must match the ticket to win the big prize,
    # so there is only one favourable outcome.
    prob_winning = 1 / total_outcomes

    # Express the probability as a percentage so that users can read it easily.
    prob_winning_pct = prob_winning * 100

    print("The probability of winning the big prize with your ticket {} is {:.7f}%."
          .format(list_nos, prob_winning_pct))


In [3]:
# Test the above function

one_ticket_probability([13,22,24,27,42,44])

one_ticket_probability([10,8,9,26,49,35])

The probability of winning the big prize with your ticket [13, 22, 24, 27, 42, 44] is 0.0000072%.
The probability of winning the big prize with your ticket [10, 8, 9, 26, 49, 35] is 0.0000072%.
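The engineering team states that the app supplies six different numbers from 1 to 49. If we wanted to guard against malformed input anyway, a minimal sketch could look like the following. The helper name `is_valid_ticket` is an assumption made for illustration and is not something the team asked for.

```python
def is_valid_ticket(ticket):
    """Hypothetical helper: True if ticket holds six different integers from 1 to 49."""
    return (
        len(ticket) == 6
        and len(set(ticket)) == 6  # no repeated numbers
        and all(isinstance(n, int) and 1 <= n <= 49 for n in ticket)
    )

print(is_valid_ticket([13, 22, 24, 27, 42, 44]))  # True
print(is_valid_ticket([13, 13, 24, 27, 42, 44]))  # False: repeated number
print(is_valid_ticket([0, 22, 24, 27, 42, 50]))   # False: numbers out of range
```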


Above we wrote a function that tells users the probability of winning with their ticket. We also want to use historical data so that users can compare their ticket against past lottery draws in Canada and see whether it would ever have won.

# Exploring Historical Data

Now we will focus on exploring the historical data coming from the Canada 6/49 lottery. The data set can be downloaded from [Kaggle](https://www.kaggle.com/datascienceai/lottery-dataset).

The data set contains historical data for 3,665 drawings (each row shows data for a single drawing), dating from 1982 to 2018.

In [4]:
# Load the data

import pandas as pd

data=pd.read_csv("649.csv")

In [5]:
data.shape

(3665, 11)

In [6]:
data.head()

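| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |
| 3 | 649 | 4 | 0 | 7/3/1982 | 3 | 9 | 10 | 13 | 20 | 43 | 34 |
| 4 | 649 | 5 | 0 | 7/10/1982 | 5 | 14 | 21 | 31 | 34 | 47 | 45 |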


# Function to compare ticket against historical lottery

Now we will write a function that will enable users to compare their ticket against the historical lottery data in Canada and determine whether they would have ever won by now.

The engineering team told us that we need to be aware of the following details:

* Inside the app, the user inputs six different numbers from 1 to 49.
* Under the hood, the six numbers will come as a Python list and serve as an input to our function.


The engineering team wants us to write a function that prints:

* The number of times the combination selected occurred in the Canada data set
* The probability of winning the big prize in the next drawing with that combination

**Function to extract numbers**

First we will write a function that takes one row of the above data set, extracts the six numbers drawn, and returns them as a set of winning numbers.

In [7]:
# Function to extract numbers

def extract_number(row):
    # Positions 4 to 9 hold the six "NUMBER DRAWN" columns (the bonus number is excluded)
    row = row.iloc[4:10]
    return set(row.values)

data["Winning_Set"] = data.apply(extract_number, axis=1)

data.head(3)


| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER | Winning_Set |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 | {3, 41, 11, 12, 43, 14} |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 | {33, 36, 37, 39, 8, 41} |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 | {1, 6, 39, 23, 24, 27} |
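Selecting the six columns by position works for this file, but it would silently break if the column order changed. One possible alternative, sketched below on a small frame rebuilt from the three rows shown above, selects the columns by name instead. The frame `sample`, the list `number_columns`, and the helper `extract_sets_by_name` are assumptions made for this illustration only.

```python
import pandas as pd

# First three draws, copied from the output above (other columns omitted).
sample = pd.DataFrame({
    "NUMBER DRAWN 1": [3, 8, 1],
    "NUMBER DRAWN 2": [11, 33, 6],
    "NUMBER DRAWN 3": [12, 36, 23],
    "NUMBER DRAWN 4": [14, 37, 24],
    "NUMBER DRAWN 5": [41, 39, 27],
    "NUMBER DRAWN 6": [43, 41, 39],
    "BONUS NUMBER": [13, 9, 34],
})

number_columns = ["NUMBER DRAWN {}".format(i) for i in range(1, 7)]

def extract_sets_by_name(frame):
    """Hypothetical helper: one set of six drawn numbers per row, bonus excluded."""
    rows = frame[number_columns].to_numpy().tolist()
    return [set(row) for row in rows]

winning_sets = extract_sets_by_name(sample)
print(winning_sets[0] == {3, 11, 12, 14, 41, 43})  # True
```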


**Function to check historical occurrences of the ticket**

This function will count the number of times the selected combination occurred in the Canada data set and print the probability of winning the big prize.



In [8]:
# Function to check historical occurrences

def check_historical_occurrence(ticket):
    # The ticket arrives as a Python list; convert it to a set so that
    # the order of the numbers does not matter in the comparison.
    ticket_set = set(ticket)
    winning_combinations = data["Winning_Set"]
    # Boolean Series: True for every past draw whose six numbers equal the ticket
    matches = ticket_set == winning_combinations
    total_occurrences = matches.sum()
    if total_occurrences == 0:
        print("The {} combination has never occurred. There is a 1 in 13,983,816 chance of winning with this ticket."
              .format(ticket))
    else:
        print("The {} combination has occurred {} time(s) historically. There is a 1 in 13,983,816 chance of winning with this ticket."
              .format(ticket, total_occurrences))


In [9]:
# Testing the above function

check_historical_occurrence([12,2,25,36,45,30])

The [12, 2, 25, 36, 45, 30] combination has never occurred. There is a 1 in 13,983,816 chance of winning with this ticket.


In [10]:
# Testing the above function
check_historical_occurrence([3, 41, 11, 12, 43, 14])

The [3, 41, 11, 12, 43, 14] combination has occurred 1 time(s) historically. There is a 1 in 13,983,816 chance of winning with this ticket.
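If the app had to answer many such lookups, one option would be to count the past draws once and then look tickets up by key. The sketch below shows the idea with only the standard library and three draws copied from the data shown earlier; `past_draws`, `draw_counts`, and `count_occurrences` are hypothetical names, and this is an alternative to the pandas comparison rather than the approach used in this project.

```python
from collections import Counter

# Three past draws copied from the data set preview; the full file has 3,665.
past_draws = [
    [3, 11, 12, 14, 41, 43],
    [8, 33, 36, 37, 39, 41],
    [1, 6, 23, 24, 27, 39],
]

# frozenset is hashable, so each draw can serve as a dictionary key.
draw_counts = Counter(frozenset(draw) for draw in past_draws)

def count_occurrences(ticket):
    """Hypothetical helper: how many past draws match the ticket, ignoring order."""
    return draw_counts[frozenset(ticket)]

print(count_occurrences([3, 41, 11, 12, 43, 14]))  # 1
print(count_occurrences([12, 2, 25, 36, 45, 30]))  # 0
```

A `Counter` returns 0 for keys it has never seen, so no special case is needed for combinations that were never drawn.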


# Multi Ticket Probability

Lottery addicts usually play more than one ticket on a single drawing, thinking that this might increase their chances of winning significantly. Our purpose is to help them better estimate their chances of winning.

Now we are going to write a function that will allow the users to calculate the chances of winning for any number of different tickets.

We've talked with the engineering team and they gave us the following information:

* The user will input the number of different tickets they want to play (without inputting the specific combinations they intend to play).
* Our function will see an integer between 1 and 13,983,816 (the maximum number of different tickets).
* The function should print information about the probability of winning the big prize depending on the number of different tickets played.


In [37]:
# Write the function for multi-ticket probability

def multi_ticket_probability(no_of_tickets):
    # Input: the number of different tickets the user plays.
    # Total possible outcomes when 6 numbers are drawn from 49.
    total_outcomes = combinations(49, 6)
    # Each different ticket covers one outcome, so the number of
    # successful outcomes equals the number of tickets played.
    successful_outcomes = no_of_tickets
    # Convert the probability to a percentage so users understand it easily.
    probability = successful_outcomes / total_outcomes * 100
    if no_of_tickets == 1:
        print("You have bought one ticket. The chance of you winning this lottery is {:.6f}%."
              .format(probability))
    else:
        print("You have bought {} tickets. The chance of you winning this lottery is {:.6f}%."
              .format(no_of_tickets, probability))

In [39]:
# Test the above function
multi_ticket_probability(1)

You have bought one ticket. The chance of you winning this lottery is 0.000007%.


In [40]:
# Test the above function
multi_ticket_probability(50)

You have bought 50 tickets. The chance of you winning this lottery is 0.000358%.


In [41]:
# Test the above function
multi_ticket_probability(10000)

You have bought 10000 tickets. The chance of you winning this lottery is 0.071511%.
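The same relationship can be read in the other direction: how many different tickets would a player need for a given chance of winning? The sketch below inverts the formula. The helper `tickets_needed` is a hypothetical addition, and because it multiplies a float, rounding could shift the result by one ticket for some targets.

```python
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # 13,983,816 possible tickets

def tickets_needed(target_probability):
    """Hypothetical helper: fewest different tickets whose combined chance of
    winning the big prize is at least target_probability (between 0 and 1)."""
    return math.ceil(target_probability * TOTAL_OUTCOMES)

print(tickets_needed(0.5))  # 6991908
print(tickets_needed(1.0))  # 13983816
```

Even a coin-flip chance of winning requires almost seven million different tickets, which may be a more striking message for users than a small percentage.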


# Winning Numbers Probability

In most 6/49 lotteries there are smaller prizes if a player's ticket matches two, three, four, or five of the six numbers drawn. As a result, the user might be interested in knowing the probability of having two, three, four, or five winning numbers.

These are the engineering details we'll need to be aware of:

* Inside the app, the user inputs six different numbers from 1 to 49 and an integer between 2 and 5 that represents the number of winning numbers expected.
* Our function prints information about the probability of having the inputted number of winning numbers.


**Steps to follow when writing the above function**

First, we need to differentiate between these two probability questions:

1. What is the probability of having exactly five winning numbers?
2. What is the probability of having at least five winning numbers?

For our purpose we will focus on the first question.

Let's say a player chose these six numbers on a ticket: (1, 2, 3, 4, 5, 6). Out of these six numbers, we can form six five-number combinations:
(1, 2, 3, 4, 5)
(1, 2, 3, 4, 6)
(1, 2, 3, 5, 6)
(1, 2, 4, 5, 6)
(1, 3, 4, 5, 6)
(2, 3, 4, 5, 6)

The total number of five-number combinations can also be found with the combinations formula, or with the `combinations` function we wrote above: choosing 5 numbers out of 6 gives 6.

For each of the six five-number combinations there are 44 possible six-number outcomes that contain it, because the sixth number can be any of the 49 - 5 = 44 remaining numbers. Let's take (1,2,4,5,6) as an example. The possible outcomes are:
(1,2,4,5,6,3),(1,2,4,5,6,7),(1,2,4,5,6,8),.................(1,2,4,5,6,49)

However, we need to leave out the outcome (1,2,4,5,6,3), since it matches all six numbers and we only want outcomes that match exactly five numbers. So the number of successful outcomes per five-number combination is 44-1=43.

Since there are six five-number combinations and each combination corresponds to 43 successful outcomes, we need to multiply 6 by 43 to find the total number of successful outcomes:

6 * 43=258.

Since there are 258 successful outcomes and 13,983,816 total possible outcomes (the result of choosing 6 numbers out of 49), the probability of having exactly five winning numbers on a single lottery ticket is:

258/13,983,816=0.00001845

The above calculation shows that the exact combination of numbers on the ticket is irrelevant: we only need the number of winning numbers, between 2 and 5, to calculate the probability.
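The count of 258 can also be confirmed by brute force. The sketch below builds every draw that shares exactly five numbers with the example ticket; it uses `itertools.combinations` (imported as `subsets` to avoid confusion with our own `combinations` function), and the variable names are chosen only for this illustration.

```python
from itertools import combinations as subsets

ticket = {1, 2, 3, 4, 5, 6}
other_numbers = [n for n in range(1, 50) if n not in ticket]  # numbers off the ticket

# Pair every five-number subset of the ticket with one number off the ticket.
five_match_draws = {
    frozenset(five) | {extra}
    for five in subsets(sorted(ticket), 5)
    for extra in other_numbers
}

print(len(other_numbers))     # 43
print(len(five_match_draws))  # 258
```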

In [53]:
# Write the function
def probability_less_6(numbers_count):
    # Input: numbers_count, the exact number of winning numbers (2 to 5).
    # Total outcomes, using the combinations function above.
    total_outcomes = combinations(49, 6)
    # Ways to choose which numbers_count of the 6 ticket numbers match,
    # for example 6 ways to choose 5 of the 6 numbers.
    num_combinations = combinations(6, numbers_count)
    # The remaining (6 - numbers_count) drawn numbers must all come from the
    # 43 numbers that are not on the ticket, otherwise more numbers would match.
    # For 5 matches this gives 43 choices, as in the 6 * 43 = 258 example above.
    successful_combinations = num_combinations * combinations(43, 6 - numbers_count)
    prob = (successful_combinations / total_outcomes) * 100
    print("The probability of getting {} winning numbers on your ticket is {:.6f}%."
          .format(numbers_count, prob))

In [56]:
# Test the above function
for i in range(2,6):
    probability_less_6(i)
    print("----------")  # output delimiter

The probability of getting 2 winning numbers on your ticket is 13.237803%.
----------
The probability of getting 3 winning numbers on your ticket is 1.765040%.
----------
The probability of getting 4 winning numbers on your ticket is 0.096862%.
----------
The probability of getting 5 winning numbers on your ticket is 0.001845%.
----------
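We focused on the "exactly" question, but the "at least" question follows directly: the probability of at least k winning numbers is the sum of the "exactly" probabilities for k, k+1, and so on up to 6. The sketch below shows one way this might be done with `math.comb`; the helpers `probability_exactly` and `probability_at_least` are hypothetical and return values between 0 and 1 rather than printing.

```python
import math

def probability_exactly(k):
    """Hypothetical helper: probability (0 to 1) of exactly k winning numbers."""
    return math.comb(6, k) * math.comb(43, 6 - k) / math.comb(49, 6)

def probability_at_least(k):
    """Hypothetical helper: probability (0 to 1) of k or more winning numbers."""
    return sum(probability_exactly(j) for j in range(k, 7))

print("{:.6f}%".format(probability_exactly(5) * 100))   # 0.001845%
print("{:.6f}%".format(probability_at_least(5) * 100))  # 0.001852%
```

For five numbers the two answers differ only by the single outcome that matches all six, so 259 outcomes instead of 258.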


# Conclusion

For this version of the app we coded four main functions:

* one_ticket_probability - Probability of winning the big prize with a single ticket.
* check_historical_occurrence - Checks whether the combination on the user's ticket has occurred historically.
* multi_ticket_probability - Probability of winning for any number of different tickets.
* probability_less_6 - Probability of having exactly 2 to 5 winning numbers on the user's ticket.
