# Mobile App To Deal With Lottery Addiction

## Introduction

For many people, playing the lottery starts as a fun activity. For some, however, it grows into a habit that may eventually become an addiction.

As with other compulsive gamblers, it is not unusual for lottery addicts to spend from their savings or loans, accumulate debts, or resort to more desperate behavior such as theft.

A medical institute that aims to prevent and treat gambling addictions would like to build a mobile app that gives lottery addicts better estimates of their chances of winning.

While there are developers who are on standby to build the app, the institute needs us to create the logic of the app and calculate the probabilities.

The medical institute would like us to focus on the [6/49 lottery](https://en.wikipedia.org/wiki/Lotto_6/49) and develop functions that will help users answer these kinds of questions:

- What is the probability that I will win the big prize with just one ticket?
- What is the probability that I will win the big prize if I play multiple tickets?
- What is the probability that I will have at least five winning numbers on just one winning ticket?

For the purpose of this project, the institute would like us to consider historical data from the national 6/49 lottery game in Canada. [This data set is available on Kaggle](https://www.kaggle.com/datascienceai/lottery-dataset) and contains data for 3,665 drawings made between 1982 and 2018.

## Helper Functions

Since we want to write code that helps users answer probability questions about playing the lottery, we will need to calculate probabilities and combinations repeatedly throughout the project.

In the 6/49 lottery, 6 numbers are drawn from a set of 49 numbers ranging from 1 to 49. Each draw is done without replacement. This means that a number cannot be put back into the set once it has been drawn.
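To make "without replacement" concrete, here is a minimal sketch of one simulated draw. It relies on `random.sample` from the standard library, which never returns the same element twice. The name `simulate_draw` is hypothetical and is not part of the app's logic.

```python
import random

def simulate_draw():
    """Hypothetical helper: return 6 distinct numbers drawn from 1 to 49."""
    # random.sample picks without replacement, so no number can repeat
    return sorted(random.sample(range(1, 50), 6))

draw = simulate_draw()
print(draw)
```

Every call returns six different numbers, which is why the order of the numbers on a ticket does not matter and why we count combinations rather than ordered sequences.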

That being said, in writing our code, we will start by writing 2 helper functions that we will be using often:
- A function to calculate factorials.
- A function to calculate combinations.

Here is the formula for calculating factorials:

<img src="files/image1.PNG" />


To find the number of combinations in this scenario where we are taking *k* objects out of *n* objects, we will use the formula below:
<img src="files/image2.PNG" />

Now that we've established the basics, let's write our two helper functions using the formulas above.

In [1]:
def factorial(n):            #calculates factorials
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product

In [2]:
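def combinations(n, k):
    # Number of ways to choose k items out of n: n! / (k! * (n - k)!)
    numerator = factorial(n)
    denominator = factorial(k) * factorial(n - k)
    # Integer division is exact here and avoids floating-point rounding
    return numerator // denominator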
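As a sanity check, the combinations formula can be compared with `math.comb` from the standard library (available in Python 3.8 and later). The sketch below restates the formula on its own so that it runs independently. `combinations_check` is a hypothetical name used only for this comparison.

```python
import math

def combinations_check(n, k):
    """Illustrative restatement of n! / (k! * (n - k)!) with exact integers."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

total = combinations_check(49, 6)
# The hand-written formula should agree with the built-in implementation
assert total == math.comb(49, 6)
print(f"{total:,}")  # prints 13,983,816
```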

## Probability of Winning the Big Prize with a Single Ticket

Recall that the first question we hope the app will help users answer is **'What is the probability that I will win the big prize with just one ticket?'**

Remember also that with the 6/49 lottery, a player chooses 6 out of 49 numbers for a single ticket. 

So, the next step is to write a function that calculates the probability that a user will win the big prize for any ticket.

Meanwhile, based on one of our discussions with the team of developers, we will be considering the following details when writing the function:

- To use the lottery app, the user will input 6 different numbers from 1 to 49.
- The 6 numbers will be presented as a Python list under the hood and will serve as the input to our function.
- The function has to print the probability in a friendly way that a user with no knowledge of probability concepts can understand.

In [3]:
def one_ticket_probability(your_six_numbers):
    # Only one of all possible combinations matches the winning draw
    k = len(your_six_numbers)
    n = 49
    possible_outcomes = combinations(n, k)
    successful_outcomes = 1
    probability = successful_outcomes / possible_outcomes * 100
    # Print the message directly; the function is not meant to return a value
    print('''You have a {:.7f}% chance of winning the big prize with a single ticket when you use the numbers {}!
This means you have 1 in {:,} chances of winning.'''.format(probability, your_six_numbers, int(possible_outcomes)))

Let's test the function with a list of 6 numbers.

In [4]:
one_ticket_probability([1,3,4,6,49,8])

You have a 0.0000072% chance of winning the big prize with a single ticket when you use the numbers [1, 3, 4, 6, 49, 8]!
This means you have 1 in 13,983,816 chances of winning.
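The function above assumes that its input already holds six different numbers from 1 to 49. If the app needs to check this before calculating anything, a small validation step could look like the sketch below. `validate_ticket` is a hypothetical helper, not something the developers specified.

```python
def validate_ticket(numbers):
    """Hypothetical helper: True if the ticket has 6 distinct integers from 1 to 49."""
    if len(numbers) != 6:
        return False
    if len(set(numbers)) != 6:
        return False  # duplicates are not allowed
    return all(isinstance(n, int) and 1 <= n <= 49 for n in numbers)

print(validate_ticket([1, 3, 4, 6, 49, 8]))   # True
print(validate_ticket([1, 3, 4, 6, 49, 49]))  # False: 49 appears twice
print(validate_ticket([0, 3, 4, 6, 49, 8]))   # False: 0 is out of range
```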


## Checking Historical Data for the 6/49 Lottery in Canada

In the previous step, we developed a function that helps users determine the probability of winning the big prize with just one ticket.

We also think users should be able to compare their ticket with historical data from the lottery in Canada. Doing this will show them whether their combination would ever have won in the past.

Let's explore the Canada 6/49 lottery data...

In [5]:
import pandas as pd
canada_lottery = pd.read_csv("649.csv")
canada_lottery.shape

(3665, 11)

In [6]:
canada_lottery.head(3)

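| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |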


In [7]:
canada_lottery.tail(3)

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


<u>**About the Data Set**</u>

- This data set has 3,665 rows and 11 columns.
- Each row in the data set represents a single drawing that took place between 1982 and 2018.
- The 6 numbers selected for each drawing in the data set are in the following columns:
 - NUMBER DRAWN 1
 - NUMBER DRAWN 2
 - NUMBER DRAWN 3
 - NUMBER DRAWN 4
 - NUMBER DRAWN 5
 - NUMBER DRAWN 6

## Comparing a Ticket with Historical Data

In this section, we will write a function that helps users compare their ticket with historical data from the Canada lottery.

Here are a few things that the developers want us to consider when we write the function:

- To use the lottery app, the user will input 6 different numbers from 1 to 49.
- The 6 numbers will be presented as a Python list under the hood and will serve as the input to our function.
- The function will print:
 - the frequency of the selected combination in the Canada data set
 - the probability of winning the big prize with the selected combination in the next drawing.
 


In [8]:
# Extracts the six winning numbers from each row of the canada_lottery data
def extract_numbers(row):
    # Positions 4 to 9 hold NUMBER DRAWN 1 through NUMBER DRAWN 6
    row = row.iloc[4:10]
    row = set(row.values)
    return row

# applying extract_numbers() to canada_lottery
winning_numbers = canada_lottery.apply(extract_numbers, axis=1)
winning_numbers.head(5)

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

In [9]:
def check_historical_occurrence(your_6_numbers, winning_numbers):
    your_6_numbers = set(your_6_numbers)
    # Element-wise comparison: True for every past draw that matches the ticket
    occurrence = your_6_numbers == winning_numbers
    frequency_of_occurrence = occurrence.sum()
    if frequency_of_occurrence > 0:
        print("This combination of 6 numbers has occurred {} time(s) in the past.".format(frequency_of_occurrence))
        print("You have a 0.0000072% chance of winning the big prize in the next drawing when you use this combination of numbers.")
    else:
        print("This combination of 6 numbers has never occurred in the past.")
        print('''But it doesn't mean it is likely to occur now.
You have a 0.0000072% chance of winning the big prize in the next drawing when you use this combination of numbers.''')

Let's test this function on a combination of 6 numbers...

In [10]:
testing_1 = [1,4,5,6,7,37]
check_historical_occurrence(testing_1, winning_numbers)

This combination of 6 numbers has never occurred in the past.
But it doesn't mean it is likely to occur now.
You have a 0.0000072% chance of winning the big prize in the next drawing when you use this combination of numbers.


In [11]:
testing_2 = [6, 22, 24, 31, 32, 34]
check_historical_occurrence(testing_2, winning_numbers)

This combination of 6 numbers has occurred 1 time(s) in the past.
You have a 0.0000072% chance of winning the big prize in the next drawing when you use this combination of numbers.
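The historical check can also be expressed without pandas by tallying each past draw as a `frozenset` in a `collections.Counter`. This is an alternative sketch, not the approach used in the notebook. It uses only the six draws shown in the head and tail of the data set above, and `count_occurrences` is a hypothetical name.

```python
from collections import Counter

# The six draws shown in the head and tail of the data set above
sample_draws = [
    [3, 11, 12, 14, 41, 43],
    [8, 33, 36, 37, 39, 41],
    [1, 6, 23, 24, 27, 39],
    [6, 22, 24, 31, 32, 34],
    [2, 15, 21, 31, 38, 49],
    [14, 24, 31, 35, 37, 48],
]

# frozenset is hashable and ignores order, so it can serve as a Counter key
draw_counts = Counter(frozenset(draw) for draw in sample_draws)

def count_occurrences(ticket, counts):
    """Hypothetical helper: how many times a ticket's combination was drawn."""
    return counts[frozenset(ticket)]

print(count_occurrences([34, 32, 31, 24, 22, 6], draw_counts))  # 1
print(count_occurrences([1, 4, 5, 6, 7, 37], draw_counts))      # 0
```

Because a `Counter` returns 0 for missing keys, a combination that was never drawn needs no special handling.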


## Probability of Winning the Big Prize with Multiple Tickets

So far, we've been able to build a function that calculates the probability of winning the big prize with just one ticket and another function that checks the occurrence of a combination of numbers in the Canada lottery data set.

However, lottery addicts usually don't play just a single ticket. They often play multiple tickets because they think their chances of winning will increase significantly when they play more tickets.

In this section, we will help them with better estimates of their chances by writing a function that helps a user calculate their chances of winning with any number of tickets.

Here are a few important details we will be considering when we write the function:
- Users will input the number of different tickets they would like to play without indicating the combinations they want to play.
- The function will receive integers ranging between 1 and 13,983,816 as input.
- The function should print a personalized message about the chances of winning the big prize based on the number of different tickets inputted.

Let's write the function...

In [12]:
def multi_ticket_probability(number_of_tickets):
    
    tot_possible_outcomes = combinations(49, 6)
    tot_successful_outcomes = number_of_tickets
    
    probability = tot_successful_outcomes / tot_possible_outcomes * 100
    combinations_rounded = round(tot_possible_outcomes / number_of_tickets)
    print('''You have a {:.7f}% chance of winning the big prize when you play {} ticket(s).
This means you have 1 in {:,} chances of winning.'''.format(probability, number_of_tickets, combinations_rounded))

Let's test the function with the numbers in the list [1, 10, 100, 10000, 1000000, 6991908, 13983816].

In [13]:
for num_tickets in [1, 10, 100, 10000, 1000000, 6991908, 13983816]:
    multi_ticket_probability(num_tickets)
    print("========================") #separates each output

You have a 0.0000072% chance of winning the big prize when you play 1 ticket(s).
This means you have 1 in 13,983,816 chances of winning.
You have a 0.0000715% chance of winning the big prize when you play 10 ticket(s).
This means you have 1 in 1,398,382 chances of winning.
You have a 0.0007151% chance of winning the big prize when you play 100 ticket(s).
This means you have 1 in 139,838 chances of winning.
You have a 0.0715112% chance of winning the big prize when you play 10000 ticket(s).
This means you have 1 in 1,398 chances of winning.
You have a 7.1511238% chance of winning the big prize when you play 1000000 ticket(s).
This means you have 1 in 14 chances of winning.
You have a 50.0000000% chance of winning the big prize when you play 6991908 ticket(s).
This means you have 1 in 2 chances of winning.
You have a 100.0000000% chance of winning the big prize when you play 13983816 ticket(s).
This means you have 1 in 1 chances of winning.
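The same relationship can be read in reverse: how many different tickets, and how much money, would it take to reach a given chance of winning? The sketch below assumes a price of $3 per combination, the figure cited in the Observation/Results section. `tickets_needed` and `TICKET_PRICE` are hypothetical names.

```python
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # all possible 6/49 tickets
TICKET_PRICE = 3  # assumed price in dollars per combination

def tickets_needed(target_percent):
    """Hypothetical helper: fewest distinct tickets for a given winning chance (in %)."""
    # Multiply before dividing to keep the arithmetic as exact as possible
    return math.ceil(target_percent * TOTAL_OUTCOMES / 100)

for target in [1, 10, 50]:
    tickets = tickets_needed(target)
    print(f"{target}% chance: {tickets:,} tickets, about ${tickets * TICKET_PRICE:,}")
```

Since each additional distinct ticket adds the same tiny amount of probability, the cost grows in direct proportion to the chance you want to buy.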


## Probability of Winning Smaller Prizes

In most 6/49 lotteries, players may win smaller prizes if their ticket matches two, three, four, or five of the six numbers drawn.

This means the user who uses the app may also want to know their chances of winning the smaller prizes.

To achieve this, we will write a function that calculates the probabilities of having **exactly** two, three, four, or five winning numbers.

Here are a few things we will consider while writing the code:
- The user will input:
 - 6 different numbers between 1 and 49.
 - an integer between 2 and 5 to represent the expected number of winning numbers.
- The function will print a message about the probability of having the inputted number of winning numbers.
- Because every combination is equally likely to be drawn, the function itself only needs the integer as its argument.

In [14]:
def probability_less_6(num_expected_winning_num):
    
    number_of_combinations = combinations(6,num_expected_winning_num)
    number_of_combinations_left = combinations(43, 6-num_expected_winning_num)
    
    tot_successful_outcomes = number_of_combinations * number_of_combinations_left
    tot_possible_outcomes = combinations (49, 6)
    
    probability = tot_successful_outcomes / tot_possible_outcomes * 100
    combination_rounded = round(tot_possible_outcomes/tot_successful_outcomes)
    print('''You have a {:.7f}% chance of having exactly {} winning numbers with this ticket.
This means you have 1 in {} chances of winning.'''.format(probability, num_expected_winning_num, combination_rounded))

Let's test the function with all 4 possible inputs...

In [15]:
for winning_num in [2,3,4,5]:
    probability_less_6(winning_num)
    print("============================") #separates each output

You have a 13.2378029% chance of having exactly 2 winning numbers with this ticket.
This means you have 1 in 8 chances of winning.
You have a 1.7650404% chance of having exactly 3 winning numbers with this ticket.
This means you have 1 in 57 chances of winning.
You have a 0.0968620% chance of having exactly 4 winning numbers with this ticket.
This means you have 1 in 1032 chances of winning.
You have a 0.0018450% chance of having exactly 5 winning numbers with this ticket.
This means you have 1 in 54201 chances of winning.
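The counting logic behind these results is that a ticket with exactly `k` winning numbers combines `k` of the 6 drawn numbers with `6 - k` of the 43 numbers that were not drawn. The sketch below restates this with `math.comb` and checks that the counts for 0 through 6 matches add up to every possible ticket. `exact_match_outcomes` is a hypothetical name.

```python
import math

def exact_match_outcomes(k):
    """Number of tickets that match exactly k of the 6 drawn numbers."""
    # Choose k of the 6 winning numbers and 6 - k of the 43 non-winning ones
    return math.comb(6, k) * math.comb(43, 6 - k)

counts = {k: exact_match_outcomes(k) for k in range(7)}

# Every ticket matches exactly 0, 1, ..., or 6 numbers, so the counts cover all tickets
assert sum(counts.values()) == math.comb(49, 6)
print(counts[5])  # 258 tickets match exactly five numbers
```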


Let's make some modifications to the `probability_less_6()` function to calculate the probability of having **at least** 2, 3, 4, or 5 winning numbers.

For every inputted number `n`, the new function will calculate the sum of the number of successful outcomes for having exactly n, n+1,...,6 winning numbers.

For instance, the number of successful outcomes for having **at least** 3 winning numbers will be the sum of:
- The number of successful outcomes for having exactly 3 winning numbers.
- The number of successful outcomes for having exactly 4 winning numbers.
- The number of successful outcomes for having exactly 5 winning numbers.
- The number of successful outcomes for having exactly 6 winning numbers.

In [16]:
def probability_at_least(n):
    
    tot_successful_outcomes = 0
    for i in range(n,7):
        number_of_combinations = combinations(6,i)
        number_of_combinations_left = combinations(43, 6-i)
        successful_outcomes = number_of_combinations * number_of_combinations_left
        tot_successful_outcomes = tot_successful_outcomes + successful_outcomes
    
    tot_possible_outcomes = combinations(49, 6)
    
    probability = tot_successful_outcomes / tot_possible_outcomes * 100
    combination_rounded = round(tot_possible_outcomes/tot_successful_outcomes)
    print('''You have a {:.7f}% chance of having at least {} winning numbers with this ticket.
This means you have 1 in {} chances of winning'''.format(probability, n, combination_rounded))

We will now test the `probability_at_least()` function with all 4 possible inputs...

In [17]:
for winning_num in [2,3,4,5]:
    probability_at_least(winning_num)
    print("============================")

You have a 15.1015574% chance of having at least 2 winning numbers with this ticket.
This means you have 1 in 7 chances of winning
You have a 1.8637545% chance of having at least 3 winning numbers with this ticket.
This means you have 1 in 54 chances of winning
You have a 0.0987141% chance of having at least 4 winning numbers with this ticket.
This means you have 1 in 1013 chances of winning
You have a 0.0018521% chance of having at least 5 winning numbers with this ticket.
This means you have 1 in 53992 chances of winning
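One way to double-check the "at least" results is to compute the same probability through its complement: one minus the chance of matching fewer than `n` numbers. The sketch below is a standalone cross-check under that idea. The function names are hypothetical and independent of the functions above.

```python
import math

TOTAL = math.comb(49, 6)

def exact(k):
    """Number of tickets matching exactly k of the 6 drawn numbers."""
    return math.comb(6, k) * math.comb(43, 6 - k)

def at_least(n):
    """Probability of matching n or more numbers, by direct summation."""
    return sum(exact(k) for k in range(n, 7)) / TOTAL

def at_least_via_complement(n):
    """Same probability, as 1 minus the chance of matching fewer than n."""
    return 1 - sum(exact(k) for k in range(n)) / TOTAL

for n in [2, 3, 4, 5]:
    # Both routes should agree up to floating-point error
    assert math.isclose(at_least(n), at_least_via_complement(n), rel_tol=1e-9)

print(f"{at_least(5) * 100:.7f}%")  # 0.0018521%
```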


## Observation/Results

Here are the questions we started with and the answers we got from our analysis:
- **What is the probability that I will win the big prize with just one ticket?**

    You have a 0.0000072% chance (1 in 13,983,816) of winning the big prize with a single ticket. For comparison, you are over 400,000 times more likely to become a millionaire from making investments or running a business in America than you are to win the big prize with a single ticket ([source](https://www.fool.com/slideshow/25-things-more-likely-happen-you-winning-lottery/?slide=26)).
    
    
- **What is the probability that I will win the big prize if I play multiple tickets?**

    The chance of winning the big prize increases with the number of tickets played. But the chance only increases *significantly* with a *significant* number of tickets, which will cost you a fortune.
    
    Given that [a combination costs $3](https://en.wikipedia.org/wiki/Lotto_6/49#Gameplay):
    
 - 3 million dollars worth of tickets (1,000,000 tickets) will only give you a 7.2% chance.
 - You will need about 21 million dollars worth of tickets (6,991,908 tickets) to get a 50% chance of winning.
 
    
- **What is the probability that I will win smaller prizes?**
    
    The fewer winning numbers you need, the higher your probability of winning a smaller prize.
    For example, you stand a far better chance of having exactly 2 winning numbers (13.238%) than of having exactly 5 winning numbers (0.002%).
    

- **What is the probability that I will have at least five winning numbers on just one winning ticket?**

    You have 1 in 53,992 chances of having **at least** 5 winning numbers on a ticket. This means you are 5 times more likely to win an Oscar award than you are to have at least 5 winning numbers on a 6/49 lottery ticket. So, enrolling in acting classes may be a better investment than buying lottery tickets ([source](https://finance.yahoo.com/news/35-things-more-likely-happen-111452908.html?guccounter=1&guce_referrer=aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbS8&guce_referrer_sig=AQAAAHQegGjCOHuC3tGufoyZ5lxokvqzsqS_Z0AtJnWKLfkmKPIUvlq12DuwZchD4GDgLzGtjpQ_8dsD15TFQ2Fbda9z3OhgEYJPwi398HdV3ENFYycJV2J4qZRXkP7F4Osizrl3vEPOcns5k96toe4Jxwl47Xa7zSZ2-W0im_m5WPiL)).


## Conclusion

We started this project with the goal of writing the logic for an app that gives lottery addicts better estimates of their chances of winning the lottery.

To achieve this, we successfully developed the following functions:

- `one_ticket_probability()` — to calculate the probability of winning the big prize with just one ticket.
- `check_historical_occurrence()` — to check if a certain combination has ever occurred in the Canada lottery data set.
- `multi_ticket_probability()` — to calculate the probability of winning the big prize with any number of tickets up to 13,983,816 tickets.
- `probability_less_6()` — to calculate the probability of having **exactly** two, three, four or five winning numbers so as to win smaller prizes.
- `probability_at_least()` — to calculate the probability of having **at least** two, three, four or five winning numbers so as to win smaller prizes.