<h1> Is the 6/49 lottery winnable? </h1>

In this exercise we use basic probability to work out the chances of winning the 6/49 lottery. We'll examine how those chances change with the prize a player is aiming for and with the number of tickets they buy. The aim is to show readers how low their chances are when playing the lottery. The context of this project is that we are building part of an app that users can use to figure out their chances of winning.

The 6/49 lotto has a player pick 6 numbers (without replacement) from the range 1 - 49 inclusive. The order of the numbers does not matter, so a ticket is simply one combination of 6 numbers.

In [1]:
# We will be using factorials and combinations multiple times in this exercise, so to start we
# will create some functions we can use.

def factorial(n):
    the_fac = 1                  # Start from 1 so that factorial(0) correctly returns 1.
    for i in range(2,n + 1):
        the_fac *= i
    return the_fac

def combinations(n,k):
    # Integer division keeps the count exact; the result is always a whole number.
    return factorial(n) // (factorial(k)*factorial(n-k))
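As a sanity check on the helper functions, the same count can be computed with the standard library. This is a minimal sketch that assumes Python 3.8 or later (for `math.comb`); the name `combinations_by_formula` is made up for this illustration and is not part of the notebook.

```python
import math

def combinations_by_formula(n, k):
    # n! / (k! * (n - k)!), using the standard library factorial
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

total_tickets = math.comb(49, 6)
print(total_tickets)                    # 13983816
print(combinations_by_formula(49, 6))   # 13983816
print(math.factorial(0))                # 1, which is why choosing all 6 of 6 is 1 way
print(combinations_by_formula(6, 6))    # 1
```

If the hand-written helpers return anything other than 13,983,816 for 49 choose 6, something is wrong with them.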


In [2]:
# The first function we will build will take a set of 6 numbers and print the chance of
# winning with those numbers.

def one_ticket_chance_of_winning(user_selection):
    
    # A valid ticket has exactly 6 different numbers, all within the range 1 - 49.
    
    if (len(user_selection) != 6) or (len(set(user_selection)) != 6):
        print("Please select exactly 6 different numbers.")
        return
    
    if (min(user_selection) < 1) or (max(user_selection) > 49):
        print("One or more of your numbers was outside the range of possible numbers, 1 - 49.")
        return
    
    # Then we create the numbers we will need to calculate the percent chance of winning.
    
    num_of_outcomes = 1  # We defined this in case we want to go back and accept more inputs.
    n = 49               # Defined by the range of numbers to choose from.
    k = 6                # Defined by how many numbers are on a ticket.
    possible_outcomes = combinations(n,k)
    
    p_one_ticket = round((num_of_outcomes / possible_outcomes) * 100,10)
    
    print('Your chance of winning the next Big Prize using the numbers {} is {}%'.format(user_selection,p_one_ticket))

# Testing the function we see the percent chance of winning is quite low.

user_selection = [3, 41, 43, 12, 11, 1]
one_ticket_chance_of_winning(user_selection)
    

Your chance of winning the next Big Prize using the numbers [3, 41, 43, 12, 11, 1] is 7.1511e-06%


Whichever six numbers we choose, the chance of winning is the same: 1 in 13,983,816, or about 0.00000715%. This satisfies the first requirement for the app.
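To see why the numbers chosen make no difference, it helps to enumerate a lottery small enough to list in full. The sketch below uses a hypothetical 3/10 mini-lottery (pick 3 numbers from 1 - 10), which is not part of the app; the function name `winning_draw_count` is also made up for this illustration.

```python
from itertools import combinations as iter_combinations

def winning_draw_count(ticket, pool_size, picks):
    """Count how many of all possible draws match the ticket exactly."""
    ticket_set = set(ticket)
    all_draws = iter_combinations(range(1, pool_size + 1), picks)
    return sum(1 for draw in all_draws if set(draw) == ticket_set)

# Every possible draw of the hypothetical 3/10 mini-lottery
total_draws = sum(1 for _ in iter_combinations(range(1, 11), 3))

print(total_draws)                            # 120
print(winning_draw_count([1, 2, 3], 10, 3))   # 1
print(winning_draw_count([4, 7, 10], 10, 3))  # 1
```

Each ticket matches exactly one of the 120 equally likely draws, so every ticket has the same 1 in 120 chance. The 6/49 lotto works the same way, only with 13,983,816 possible draws.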

In [3]:
# For the next step we'll be looking at historical numbers for the lotto. Let's explore the
# data.

import pandas as pd

lotto_data = pd.read_csv('649.csv')

In [4]:
print(lotto_data.shape)
print(lotto_data.head(3))
print(lotto_data.tail(3))

(3665, 11)
   PRODUCT  DRAW NUMBER  SEQUENCE NUMBER  DRAW DATE  NUMBER DRAWN 1  \
0      649            1                0  6/12/1982               3   
1      649            2                0  6/19/1982               8   
2      649            3                0  6/26/1982               1   

   NUMBER DRAWN 2  NUMBER DRAWN 3  NUMBER DRAWN 4  NUMBER DRAWN 5  \
0              11              12              14              41   
1              33              36              37              39   
2               6              23              24              27   

   NUMBER DRAWN 6  BONUS NUMBER  
0              43            13  
1              41             9  
2              39            34  
      PRODUCT  DRAW NUMBER  SEQUENCE NUMBER  DRAW DATE  NUMBER DRAWN 1  \
3662      649         3589                0  6/13/2018               6   
3663      649         3590                0  6/16/2018               2   
3664      649         3591                0  6/20/2018              1

Next we want to check whether the numbers a user chooses would ever have won the Big Prize in a past draw, based on the historical data.

In [5]:
# First let's extract all the winning lotto numbers from the past draws.

number_columns = ['NUMBER DRAWN {}'.format(i) for i in range(1,7)]

def extract_numbers(row):
    # A set ignores the order in which the 6 numbers were drawn.
    return set(row[number_columns])

winning_numbers = lotto_data.apply(extract_numbers, axis=1)

print(winning_numbers.head())

0    {3, 41, 43, 12, 11, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object


In [6]:
def check_historical_occurence(user_selection,winning_numbers):
    user_set = set(user_selection)  # We first convert the original selection into a set.
    num_of_wins = 0
    
    # Two sets are equal only when all 6 numbers match, whatever the order.
    for row in winning_numbers: 
        if user_set == row:
            num_of_wins += 1

    if num_of_wins > 0:        
        return "You would have won the Big Prize {} times in the past!".format(num_of_wins)
    else:
        return "Using the numbers you selected {} you would have won the lotto in the past {} times.".format(user_selection,num_of_wins)


check_historical_occurence(user_selection,winning_numbers)

'Using the numbers you selected [3, 41, 43, 12, 11, 1] you would have won the lotto in the past 0 times.'

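This step is useful to the person buying the ticket: after learning how low their chance of winning is, they can also see that their numbers most likely never won in the past either. With 3,665 historical draws against 13,983,816 possible combinations, at most about 0.03% of all combinations have ever won the Big Prize.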
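The historical check can also be sketched without a loop over every draw for every lookup, by counting each past combination once and then looking the user's ticket up. This is only an illustration: `sample_draws` holds the first three draws shown in the data preview as a stand-in for the full history, and `count_past_wins` is a made-up name rather than one of the app's functions.

```python
from collections import Counter

# First three draws from the data preview; a stand-in for the full history
sample_draws = [
    (3, 11, 12, 14, 41, 43),
    (8, 33, 36, 37, 39, 41),
    (1, 6, 23, 24, 27, 39),
]

def count_past_wins(user_selection, draws):
    """Count the past draws whose 6 numbers all match the user's selection."""
    # frozenset ignores order and, unlike set, can be used as a Counter key
    draw_counts = Counter(frozenset(draw) for draw in draws)
    return draw_counts[frozenset(user_selection)]

print(count_past_wins([3, 41, 43, 12, 11, 1], sample_draws))   # 0
print(count_past_wins([3, 41, 43, 12, 11, 14], sample_draws))  # 1
```

The first ticket shares five numbers with the first draw but still counts as zero wins, because only a full match of all 6 numbers wins the Big Prize.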

Next we will create a function to show the customer what their chance of winning would be if they just bought more tickets.

In [7]:
def multi_ticket_probability (t_played):
    # This assumes every ticket played is a different combination of numbers.
    total_ticket_options = combinations(49,6)
    p_multi_ticket = (t_played / total_ticket_options) * 100 # represented as a percentage
    # 8 decimal places are shown so that very small percentages are not rounded away.
    return "If you played {} ticket(s) you would have a {:.8f}% chance of winning the Big Prize".format(t_played,p_multi_ticket)

In [8]:
# Testing multi_ticket_probability

print(multi_ticket_probability(1))
print(multi_ticket_probability(10))
print(multi_ticket_probability(100))
print(multi_ticket_probability(10000))
print(multi_ticket_probability(100000))
print(multi_ticket_probability(6991908))
print(multi_ticket_probability(13983816))

If you played 1 ticket(s) you would have a 0.00000715% chance of winning the Big Prize
If you played 10 ticket(s) you would have a 0.00007151% chance of winning the Big Prize
If you played 100 ticket(s) you would have a 0.00071511% chance of winning the Big Prize
If you played 10000 ticket(s) you would have a 0.07151124% chance of winning the Big Prize
If you played 100000 ticket(s) you would have a 0.71511238% chance of winning the Big Prize
If you played 6991908 ticket(s) you would have a 50.00000000% chance of winning the Big Prize
If you played 13983816 ticket(s) you would have a 100.00000000% chance of winning the Big Prize


These numbers show the user that they would have to spend an immense amount of money to meaningfully improve their chances of winning. Even 100,000 different tickets give less than a 1% chance, and only buying every one of the 13,983,816 possible combinations guarantees a win.
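The same relationship can be turned around to ask how many tickets a given chance would require. The sketch below assumes Python 3.8 or later (for `math.comb`) and that every ticket is a different combination; `tickets_needed` is a made-up helper, not one of the app's functions.

```python
import math

TOTAL_COMBINATIONS = math.comb(49, 6)  # 13,983,816 possible tickets

def tickets_needed(target_probability):
    """Smallest number of distinct tickets giving at least the target chance."""
    if not 0 <= target_probability <= 1:
        raise ValueError("target_probability must be between 0 and 1")
    return math.ceil(target_probability * TOTAL_COMBINATIONS)

print(tickets_needed(0.01))  # 139839
print(tickets_needed(0.5))   # 6991908
print(tickets_needed(1.0))   # 13983816
```

Reaching even a 1% chance takes almost 140,000 different tickets for a single draw.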

To finish, let's write a function that lets the user choose which prize they are hoping to win and tells them their chance of winning it. The input is a number from 2 to 5: how many of the user's numbers match the winning numbers, each of which corresponds to a prize.

In [25]:
def probability_less_6(correct_num_count):
    # Number of ways to choose which of the 6 winning numbers are on the ticket
    num_of_matching_combos = combinations(6,correct_num_count)
    # The rest of the ticket must come from the 43 non-winning numbers,
    # so that exactly correct_num_count numbers match
    num_of_non_matching_combos = combinations(43,6 - correct_num_count)
    # Then multiply the two counts to get the number of qualifying tickets
    possible_correct_answers = num_of_matching_combos * num_of_non_matching_combos
    # Again, total possible lotto combos
    total_num_outcomes = combinations(49,6)  
    percent_chance_of_winning = round((possible_correct_answers / total_num_outcomes) * 100,5)
    
    return "You would have a {}% chance of winning with exactly {} correct numbers".format(percent_chance_of_winning,correct_num_count)

In [26]:
# Testing the function above.

print(probability_less_6(2))
print(probability_less_6(3))
print(probability_less_6(4))
print(probability_less_6(5))



You would have a 13.2378% chance of winning with exactly 2 correct numbers
You would have a 1.76504% chance of winning with exactly 3 correct numbers
You would have a 0.09686% chance of winning with exactly 4 correct numbers
You would have a 0.00184% chance of winning with exactly 5 correct numbers


This shows how quickly the chances fall as the prize grows. Matching exactly 2 numbers happens about 13% of the time, but matching 3 happens less than 2% of the time, and matching 4 or 5 happens less than 0.1% of the time. Hopefully, given this data, the app can help those addicted to the lottery understand that they are likely never going to win the Big Prize or make back the money they have spent to date on the lottery.
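One way to cross-check the per-prize chances is to count tickets directly: a ticket matches exactly k numbers when k of its numbers come from the 6 winning numbers and the other 6 - k come from the 43 non-winning ones. The sketch below assumes Python 3.8 or later (for `math.comb`); `ways_to_match_exactly` and `chance_at_least` are made-up names for this illustration. The counts for 0 to 6 matches must add up to every possible ticket, which gives a built-in check.

```python
import math

def ways_to_match_exactly(k, picks=6, pool=49):
    """Number of tickets sharing exactly k numbers with the winning draw."""
    return math.comb(picks, k) * math.comb(pool - picks, picks - k)

total = math.comb(49, 6)
counts = {k: ways_to_match_exactly(k) for k in range(7)}

# Every ticket matches somewhere between 0 and 6 numbers
assert sum(counts.values()) == total

def chance_at_least(k):
    """Probability of matching k or more numbers with one ticket."""
    return sum(counts[j] for j in range(k, 7)) / total

print(counts[2])  # 1851150
print(counts[3])  # 246820
print(counts[4])  # 13545
print(counts[5])  # 258
print(counts[6])  # 1
```

Dividing any of these counts by `total` gives the chance of that exact number of matches, and `chance_at_least(k)` answers the related question of matching k or more.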