# Estimating the chances of winning the lottery

## Background

Many people start playing the lottery for fun, but for some this activity turns into a habit that eventually escalates into addiction. Like other compulsive gamblers, lottery addicts soon begin spending their savings and taking out loans, accumulate debts, and eventually engage in desperate behaviors like theft.

A medical institute that aims to prevent and treat gambling addictions wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. The institute has a team of engineers that will build the app, but they need us to create the logical core of the app and calculate probabilities.

## Project Objective

The goal of the project is to answer questions like:

1. What is the probability of winning the big prize with a single ticket?
2. What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
3. What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

Since we will be calculating permutations and combinations repeatedly, we will create two functions:

1. factorial(): a function that calculates factorials
2. combinations(): a function that calculates combinations

In [1]:
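def factorial(n):
    # Multiply n * (n-1) * ... * 1; the result is 1 when n is 0
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product

def combinations(n, k):
    # Number of ways to choose k items out of n when order does not matter
    numerator = factorial(n)
    denominator = factorial(k) * factorial(n-k)
    # The quotient is always a whole number, so integer division keeps it exact
    return numerator // denominator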
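
Python's standard library offers counterparts of both helpers, which is handy for cross-checking the hand-written versions. The snippet below is a minimal sketch, assuming Python 3.8 or later (where `math.comb` is available); the variable names are illustrative only.

```python
import math

# Standard-library counterparts of factorial() and combinations()
n, k = 49, 6
by_formula = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))
by_comb = math.comb(n, k)

# Both routes should give the number of possible 6/49 draws
assert by_formula == by_comb
print(by_comb)  # 13983816
```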

We will create a function that takes in a list of the six numbers on a ticket and prints the probability of winning the big prize in an easy-to-understand format.

In [2]:
def one_ticket_probability(lottery):
    t_combinations=combinations(49,6)
    p=(1/t_combinations)
    percentage=p*100
    print('''Your chances to win the big prize with the numbers {} is {:.7f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(lottery,
                    percentage, int(t_combinations)))

In [3]:
ticket_1=[4,7,12,18,32,26]
one_ticket_probability(ticket_1)

Your chances to win the big prize with the numbers [4, 7, 12, 18, 32, 26] is 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.


Users should also be able to compare their ticket against the historical lottery data in Canada and determine whether they would ever have won by now. We start by reading in the data set and extracting the six winning numbers of each draw.

In [4]:
# Read in historical data
import pandas as pd
hist=pd.read_csv("649.csv")

In [5]:
hist.shape

(3665, 11)

In [6]:
hist.head(3)

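|   | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |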


In [7]:
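def extract_numbers(row):
    # Positions 4 to 9 hold NUMBER DRAWN 1 through NUMBER DRAWN 6
    row=row.iloc[4:10]
    # A set lets us compare draws regardless of the order of the numbers
    return set(row.values)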

In [11]:
winning_numbers=hist.apply(extract_numbers,axis=1)
winning_numbers.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

Next, we write the function that compares a user's ticket against these historical winning combinations and reports how many times, if ever, it has won.

In [16]:
def check_historical_occurrence(list_1,series_1):
    set_1=set(list_1)
    # Compare the ticket against every historical draw and count the matches
    match=set_1==series_1
    occurrence=match.sum()
    # The chance of winning the next drawing is the same in both cases
    t_combinations=combinations(49,6)
    percentage=(1/t_combinations)*100
    if occurrence==0:
        print('''The combination {} has never occurred in the past. This does not mean it is more likely to occur now.
Your chances to win the big prize in the next drawing using the combination {} are {:.7f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(list_1, list_1,
                    percentage, int(t_combinations)))
    else:
        print('''The number of times combination {} has occurred in the past is {}.
Your chances to win the big prize in the next drawing using the combination {} are {:.7f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(list_1, occurrence,
                    list_1, percentage, int(t_combinations)))

In [18]:
# check function
test_input_3 = [33, 36, 37, 39, 8, 41]
check_historical_occurrence(test_input_3, winning_numbers)

The number of times combination [33, 36, 37, 39, 8, 41] has occurred in the past is 1.
Your chances to win the big prize in the next drawing using the combination [33, 36, 37, 39, 8, 41] are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.
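
For the engineers wiring this into the app, a version that returns the count instead of printing may be easier to reuse. The following is a minimal pure-Python sketch of the same comparison; `count_occurrences` is a hypothetical helper that is not part of the notebook, and the sample draws are the first three rows shown by `hist.head(3)`.

```python
def count_occurrences(ticket, draws):
    """Return how many historical draws match the ticket exactly.

    Hypothetical helper for illustration; `draws` is any iterable of sets.
    """
    ticket_set = set(ticket)
    # Set equality ignores the order in which the numbers were entered
    return sum(1 for draw in draws if draw == ticket_set)

# The first three draws from the historical data set
sample_draws = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
]

matched = count_occurrences([33, 36, 37, 39, 8, 41], sample_draws)
unmatched = count_occurrences([1, 2, 3, 4, 5, 6], sample_draws)
print(matched, unmatched)  # 1 0
```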


Lottery addicts usually play more than one ticket on a single drawing, thinking that this might increase their chances of winning significantly. Our purpose is to help them better estimate their chances of winning. We are going to write a function that will allow the users to calculate the chances of winning for any number of different tickets.

In [23]:
def multi_ticket_probability(n_tickets):
    t_combinations=combinations(49,6)
    p=(n_tickets/t_combinations)
    percentage=p*100
    if n_tickets==1:
        print('''Your chances to win the big prize with one ticket is {:.7f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(
                    percentage, int(t_combinations)))
    else:
        print('''Your chances to win the big prize with {} tickets is {:.7f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(n_tickets,
                    percentage, int(t_combinations)/n_tickets))
    

In [25]:
#check function with test inputs

test_inputs = [1, 10, 100, 10000, 1000000, 6991908, 13983816]

for test_input in test_inputs:
    multi_ticket_probability(test_input)
    print('------------------------') # output delimiter

Your chances to win the big prize with one ticket is 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.
------------------------
Your chances to win the big prize with 10 tickets is 0.0000715%.
In other words, you have a 1 in 1,398,381.6 chances to win.
------------------------
Your chances to win the big prize with 100 tickets is 0.0007151%.
In other words, you have a 1 in 139,838.16 chances to win.
------------------------
Your chances to win the big prize with 10000 tickets is 0.0715112%.
In other words, you have a 1 in 1,398.3816 chances to win.
------------------------
Your chances to win the big prize with 1000000 tickets is 7.1511238%.
In other words, you have a 1 in 13.983816 chances to win.
------------------------
Your chances to win the big prize with 6991908 tickets is 50.0000000%.
In other words, you have a 1 in 2.0 chances to win.
------------------------
Your chances to win the big prize with 13983816 tickets is 100.0000000%.
In other words, you have a 1 in 1.0 chances to win.
------------------------
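
The probability grows linearly with the number of tickets because every ticket covers a different combination. A minimal sketch with exact arithmetic is given below; it assumes Python 3.8 or later for `math.comb`, and `multi_ticket_odds` is a hypothetical name that does not appear in the notebook.

```python
from fractions import Fraction
from math import comb

TOTAL_OUTCOMES = comb(49, 6)  # 13,983,816 possible draws

def multi_ticket_odds(n_tickets):
    """Exact probability of winning the big prize with n distinct tickets."""
    if not 1 <= n_tickets <= TOTAL_OUTCOMES:
        raise ValueError("n_tickets must be between 1 and 13,983,816")
    # Each distinct ticket covers exactly one of the possible outcomes
    return Fraction(n_tickets, TOTAL_OUTCOMES)

half = multi_ticket_odds(6991908)
print(half)  # 1/2
print('{:.7f}%'.format(float(multi_ticket_odds(10000)) * 100))
```

Returning a `Fraction` keeps the odds exact and leaves formatting to the caller, which avoids outputs such as "1 in 1,398,381.6".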

In most 6/49 lotteries there are smaller prizes if a player's ticket matches two, three, four, or five of the six numbers drawn. As a consequence, users might be interested in knowing the probability of having two, three, four, or five winning numbers.

These are the engineering details we'll need to be aware of:

Inside the app, the user inputs:
1. six different numbers from 1 to 49; and
2. an integer between 2 and 5 that represents the number of winning numbers expected.

Our function prints information about the probability of having the inputted number of winning numbers.
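
The input constraints listed above could be enforced before any probability is computed. The sketch below is one possible way to do that; `validate_inputs` is a hypothetical helper, and raising `ValueError` is an assumption about how the app would want to handle bad input.

```python
def validate_inputs(ticket, n_winning):
    """Check the two user inputs; hypothetical helper for illustration."""
    # Six entries, all different
    if len(ticket) != 6 or len(set(ticket)) != 6:
        raise ValueError("A ticket needs six different numbers")
    # Every entry is an integer from 1 to 49
    if any(not isinstance(x, int) or not 1 <= x <= 49 for x in ticket):
        raise ValueError("Ticket numbers must be integers from 1 to 49")
    # The expected number of winning numbers is an integer from 2 to 5
    if not isinstance(n_winning, int) or not 2 <= n_winning <= 5:
        raise ValueError("The number of winning numbers must be between 2 and 5")
    return True

ok = validate_inputs([4, 7, 12, 18, 32, 26], 3)

try:
    validate_inputs([4, 4, 12, 18, 32, 26], 3)  # repeated number
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```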

In [29]:
def probability_less_6(n):
    total_combinations=combinations(6,n)
    remaining_combinations=combinations(43,6-n)
    successful_outcomes=total_combinations*remaining_combinations
    n_combinations_total=combinations(49,6)
    probability=successful_outcomes/n_combinations_total
    probability_percentage=probability*100
    combinations_simplified = round(n_combinations_total/successful_outcomes)    
    print('''Your chances of having {} winning numbers with this ticket are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(n, probability_percentage,
                                                               int(combinations_simplified)))

In [30]:
for test_input in [2, 3, 4, 5]:
    probability_less_6(test_input)
    print('--------------------------') # output delimiter

Your chances of having 2 winning numbers with this ticket are 13.237803%.
In other words, you have a 1 in 8 chances to win.
--------------------------
Your chances of having 3 winning numbers with this ticket are 1.765040%.
In other words, you have a 1 in 57 chances to win.
--------------------------
Your chances of having 4 winning numbers with this ticket are 0.096862%.
In other words, you have a 1 in 1,032 chances to win.
--------------------------
Your chances of having 5 winning numbers with this ticket are 0.001845%.
In other words, you have a 1 in 54,201 chances to win.
--------------------------
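
The counting behind probability_less_6() can be written out as follows: choose which $n$ of the six drawn numbers appear on the ticket, then fill the remaining $6-n$ places from the 43 numbers that were not drawn. The case $n = 5$ is worked through as a check against the output above.

$$
P(\text{exactly } n) = \frac{\binom{6}{n}\,\binom{43}{6-n}}{\binom{49}{6}},
\qquad
P(\text{exactly } 5) = \frac{\binom{6}{5}\binom{43}{1}}{\binom{49}{6}}
= \frac{6 \cdot 43}{13{,}983{,}816}
= \frac{258}{13{,}983{,}816} \approx 0.001845\%
$$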


## Next steps

For the first version of the app, we coded four main functions:

1. one_ticket_probability(): calculates the probability of winning the big prize with a single ticket
2. check_historical_occurrence(): checks whether a certain combination has occurred in the Canada lottery data set
3. multi_ticket_probability(): calculates the probability for any number of tickets between 1 and 13,983,816
4. probability_less_6(): calculates the probability of having exactly two, three, four, or five winning numbers

Possible features for a second version of the app include:

1. Making the outputs even easier to understand by adding fun analogies. For example, we can find probabilities for strange events and compare them with the chances of winning the lottery, with output along the lines of "You are 100 times more likely to be the victim of a shark attack than to win the lottery".
2. Combining one_ticket_probability() and check_historical_occurrence() to output information on probability and historical occurrence at the same time.
3. Creating a function similar to probability_less_6() that calculates the probability of having at least two, three, four, or five winning numbers. Hint: the number of successful outcomes for having at least four winning numbers is the sum of these three numbers:
   1. The number of successful outcomes for having exactly four winning numbers
   2. The number of successful outcomes for having exactly five winning numbers
   3. The number of successful outcomes for having exactly six winning numbers
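
The hint about "at least" probabilities can be sketched as a sum of "exactly k" counts. This is only an illustration under stated assumptions: the names `at_least_outcomes` and `probability_at_least` are hypothetical, and `math.comb` (Python 3.8 or later) stands in for the notebook's combinations() so the block runs on its own.

```python
from math import comb

def at_least_outcomes(n):
    """Count the outcomes with at least n winning numbers on a 6/49 ticket."""
    # Sum the "exactly k" counts for k = n, n+1, ..., 6
    return sum(comb(6, k) * comb(43, 6 - k) for k in range(n, 7))

def probability_at_least(n):
    """Probability of having at least n winning numbers on a single ticket."""
    return at_least_outcomes(n) / comb(49, 6)

print(at_least_outcomes(4))  # 13545 + 258 + 1 = 13804
print('{:.6f}%'.format(probability_at_least(4) * 100))
```

Summing outcomes before dividing keeps the arithmetic in whole numbers until the last step, which mirrors how probability_less_6() counts successful outcomes.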