# Mobile App for Lottery Addiction

Many people start playing the lottery for fun, but for some the activity becomes a habit that eventually escalates into addiction. Like other compulsive gamblers, lottery addicts soon begin spending their savings and taking out loans, accumulate debts, and eventually turn to desperate behaviors like theft.

A medical institute that aims to prevent and treat gambling addictions wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. The institute has a team of engineers that will build the app, but they need us to create the logical core of the app and calculate probabilities.

For the first version of the app, they want us to focus on the 6/49 lottery and build functions that enable users to answer questions like:

- What is the probability of winning the big prize with a single ticket?
- What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
- What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?



In [1]:
def factorial(n):  # calculates n! for a non-negative integer n
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product

def combinations(n, k):  # calculates the number of ways to choose k items from n
    numerator = factorial(n)
    denominator = factorial(k) * factorial(n - k)
    # The quotient is always a whole number, so integer division keeps it exact
    return numerator // denominator
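
As a quick cross-check, and assuming Python 3.8 or later, the standard library's `math.comb` computes the same quantity with exact integer arithmetic. The sketch below is not part of the app's code; it only confirms the number of possible 6/49 outcomes that the rest of the notebook relies on.

```python
import math

# math.comb(n, k) returns n! / (k! * (n - k)!) as an exact integer
total_outcomes = math.comb(49, 6)

# Same formula written out with factorials, using integer division
by_formula = math.factorial(49) // (math.factorial(6) * math.factorial(43))

print(total_outcomes)                # 13983816
print(total_outcomes == by_formula)  # True
```

Both approaches agree that there are 13,983,816 distinct six-number tickets.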

## One-Ticket Probability

First we will find out the probability of winning the big prize with one ticket.

In the 6/49 lottery, six numbers are drawn from a set of 49 numbers that range from 1 to 49. To win the big prize, a player's six numbers must match all six numbers drawn. An example ticket would be a ticket with the numbers {13, 22, 24, 27, 42, 44}.
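
The rules above can be turned into a small input check. The helper below is a hypothetical sketch (the name `is_valid_ticket` is not part of the original project), and it assumes that a ticket must hold six distinct whole numbers between 1 and 49.

```python
def is_valid_ticket(numbers):
    """Return True if `numbers` holds six distinct integers between 1 and 49."""
    unique_numbers = set(numbers)
    return (
        len(numbers) == 6
        and len(unique_numbers) == 6
        and all(isinstance(n, int) and 1 <= n <= 49 for n in unique_numbers)
    )

print(is_valid_ticket([13, 22, 24, 27, 42, 44]))  # True
print(is_valid_ticket([10, 20, 30, 40, 50, 60]))  # False: 50 and 60 are out of range
print(is_valid_ticket([1, 1, 2, 3, 4, 5]))        # False: repeated number
```

A check like this could run before any probability is shown, so that users are not given odds for a ticket that could never be played.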

In [2]:
def one_ticket_probability(user_numbers):
    possible_outcomes = combinations(49, 6)
    probability = 1 / possible_outcomes
    percentage_form = probability * 100
    
    text = "If you have the numbers {}, Your probability of winning would be {:.7f}%."
    print(text.format(user_numbers, percentage_form))


The function above takes a set of 6 numbers and displays the probability of winning the big prize. Let's test it out to see what it looks like.

In [3]:
one_ticket_probability([13, 22, 24, 27, 42, 44])

If you have the numbers [13, 22, 24, 27, 42, 44], Your probability of winning would be 0.0000072%.


The function formats its output as a percentage so that the probability is easy for the average person to understand.
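
A percentage with many leading zeros can still be hard to picture, so one possible alternative is to phrase the same probability as "1 in N" odds. The function below is a hypothetical sketch (its name and wording are assumptions, not part of the original project) that uses `math.comb` from Python 3.8 or later.

```python
import math

def one_ticket_odds_text(user_numbers):
    """Describe the jackpot chance for one ticket as '1 in N' odds."""
    possible_outcomes = math.comb(49, 6)
    return "With the numbers {}, your chance of winning is 1 in {:,}.".format(
        user_numbers, possible_outcomes
    )

print(one_ticket_odds_text([13, 22, 24, 27, 42, 44]))
# With the numbers [13, 22, 24, 27, 42, 44], your chance of winning is 1 in 13,983,816.
```

Returning the text instead of printing it leaves the choice of how to display the message to the app's engineers.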

## Historical Data Check for Canada Lottery

For this version of the app, users will be able to compare their ticket against historical lottery data from Canada to see whether it would ever have won.

In [4]:
import pandas as pd

lottery_data = pd.read_csv('649.csv')
lottery_data.head(3)

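|  | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |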


In [5]:
print(lottery_data.shape)
lottery_data.tail(3)

(3665, 11)


|  | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


The data above comes from the Canada 6/49 lottery. The dataset can be downloaded from Kaggle at https://www.kaggle.com/datasets/datascienceai/lottery-dataset. It has 3665 rows, one per drawing, covering drawings from 1982 to 2018.

## Function for Historical Data Check

We're now going to write a function that will allow users to compare their ticket against the historical lottery data in Canada and determine whether they would have won by now.

In [6]:
def extract_numbers(row):
    return {row['NUMBER DRAWN 1'], row['NUMBER DRAWN 2'], row['NUMBER DRAWN 3'],
            row['NUMBER DRAWN 4'], row['NUMBER DRAWN 5'], row['NUMBER DRAWN 6']}

# extract all the winning numbers as sets
winning_numbers_series = lottery_data.apply(extract_numbers, axis=1)

def check_historical_occurrence(user_numbers, winning_sets):
    user_numbers_set = set(user_numbers)
    matches = winning_sets.apply(lambda x: x == user_numbers_set)
    num_matches = matches.sum()
    
    # Report how many past drawings matched the user's combination
    print("""Your combination {} occurred 
    {} time(s) in the historical dataset.""".format(user_numbers_set, num_matches))


We have created two functions: one that extracts the six winning numbers from each row of the historical data, and one that checks a user's numbers against those winning sets to see how many times the combination has occurred in the dataset. Now let's test our functions with a few different inputs.

In [7]:
test_numbers_1 = [3, 41, 11, 12, 43, 14]
test_numbers_2 = [1, 2, 3, 4, 5, 6]
test_numbers_3 = [10, 20, 30, 40, 50, 60]  # 50 and 60 are outside the 1-49 range, so this ticket can never match

print("Testing with first set of numbers:")
check_historical_occurrence(test_numbers_1, winning_numbers_series)

print("\nTesting with second set of numbers:")
check_historical_occurrence(test_numbers_2, winning_numbers_series)

print("\nTesting with third set of numbers (outside typical range):")
check_historical_occurrence(test_numbers_3, winning_numbers_series)

Testing with first set of numbers:
Your combination {3, 41, 11, 12, 43, 14} occurred 
    1 time(s) in the historical dataset.

Testing with second set of numbers:
Your combination {1, 2, 3, 4, 5, 6} occurred 
    0 time(s) in the historical dataset.

Testing with third set of numbers (outside typical range):
Your combination {40, 10, 50, 20, 60, 30} occurred 
    0 time(s) in the historical dataset.
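
The check works because each drawing is stored as a set, so two combinations compare equal regardless of the order of their numbers. The same idea can be sketched without pandas. The example below uses only the three drawings shown in the `head(3)` output above, and the name `count_occurrences` is a hypothetical stand-in for `check_historical_occurrence`.

```python
# The three earliest drawings shown in the head(3) output above
sample_draws = [
    {3, 11, 12, 14, 41, 43},  # 6/12/1982
    {8, 33, 36, 37, 39, 41},  # 6/19/1982
    {1, 6, 23, 24, 27, 39},   # 6/26/1982
]

def count_occurrences(user_numbers, draws):
    """Count how many drawings match the user's numbers, ignoring order."""
    user_set = set(user_numbers)
    return sum(1 for draw in draws if draw == user_set)

print(count_occurrences([3, 41, 11, 12, 43, 14], sample_draws))  # 1
print(count_occurrences([1, 2, 3, 4, 5, 6], sample_draws))       # 0
```

The first ticket lists the 6/12/1982 numbers in a different order and still matches, which is the behavior seen in the first test above.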


## Multi-Ticket Probability

Lottery players usually play more than one ticket in a single drawing with the goal of increasing their chances of winning. To help them better estimate those chances, we will write a function that calculates the probability of winning based on the number of different tickets played.

In [11]:
def multi_ticket_probability(tickets):
    possible_outcomes = combinations(49, 6)
    probability = tickets / possible_outcomes
    percentage_form = probability * 100
    
    text = "If you have {} tickets, Your probability of winning would be {:.7f}%."
    print(text.format(tickets, percentage_form))

The function above is a modified version of the `one_ticket_probability` function. Now let's test it with some inputs.

In [17]:
multi_ticket_probability(1)
multi_ticket_probability(10)
multi_ticket_probability(100)
multi_ticket_probability(10000)
multi_ticket_probability(1000000)
multi_ticket_probability(6991908)

If you have 1 tickets, Your probability of winning would be 0.0000072%.
If you have 10 tickets, Your probability of winning would be 0.0000715%.
If you have 100 tickets, Your probability of winning would be 0.0007151%.
If you have 10000 tickets, Your probability of winning would be 0.0715112%.
If you have 1000000 tickets, Your probability of winning would be 7.1511238%.
If you have 6991908 tickets, Your probability of winning would be 50.0000000%.
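
The formula relies on every ticket being different: only one combination can be drawn, so the winning chances of distinct tickets simply add up. The last call above shows that half of all possible combinations are needed for an even chance. As a hedged sketch, the relationship can also be inverted to ask how many different tickets are needed to reach a target probability. The function name `tickets_for_probability` is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math

def tickets_for_probability(target_probability):
    """Smallest number of different tickets whose jackpot chance reaches the target."""
    possible_outcomes = math.comb(49, 6)
    return math.ceil(target_probability * possible_outcomes)

print(tickets_for_probability(0.5))  # 6991908
print(tickets_for_probability(1.0))  # 13983816
```

Guaranteeing a win would require buying every one of the 13,983,816 possible tickets.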
