# Mobile App for Lottery Addiction
In this project, we're contributing to the development of a mobile app that's meant to help prevent lottery addiction by showing gamblers their chances of winning.

The app is being developed by a medical institute that specializes in treating gambling addiction. They already have a team of engineers ready to build the app, but they need us to create the logical core of the app and calculate probabilities.

The app will be focused on the [6/49 lottery](https://en.wikipedia.org/wiki/Lotto_6/49), where six numbers are drawn from a set of 49, and a player can win a prize by matching two or more numbers on a single ticket. The big prize of 5 million CA$ is awarded to anyone who matches all 6 numbers.

Our goal is to build functions that answer the following questions:
- What is the probability of winning the big prize with a single ticket?
- What is the probability of winning the big prize if we play multiple tickets (for example, 40 tickets)?
- What is the probability of winning a lesser prize by matching between 2 and 5 winning numbers on a single ticket?

The scenario we're following throughout this project is fictional. The main purpose is to practice applying probability and combinatorics (permutations and combinations) concepts in a setting that simulates a real-world scenario.

## Core Functions
We'll start by writing two functions that we'll be using often for calculating factorials and combinations.

In [1]:
def factorial(n):
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product

def combinations(n, k):
    # Integer division keeps the count exact (the result is always a whole number)
    return factorial(n) // (factorial(k) * factorial(n - k))
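
As a quick sanity check, the factorial formula can be compared against the standard library. This is a minimal sketch that assumes Python 3.8 or later, where `math.comb` is available; the variable names are illustrative only.

```python
import math

# Number of ways to choose 6 numbers out of 49, from the factorial formula
n, k = 49, 6
from_formula = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# math.comb computes the same count directly
from_stdlib = math.comb(n, k)

print(from_formula, from_stdlib)  # 13983816 13983816
```

Both values agree with the 13,983,816 possible tickets used throughout this project.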

## One-Ticket Probability
Now, let's write a function to calculate the probability of winning the big prize for any given ticket.

For the first version of the app, the engineering team wants players to be able to input the numbers they intend to play and receive the probability of winning in a friendly format that anyone can understand, even without any knowledge of probability.

In [2]:
def validate_numbers(user_numbers):
    '''Returns whether or not a list contains 6 unique numbers between 1 and 49.
    If not, then a message is printed asking the user to try again.
    '''
    # Every number must lie in the range 1 to 49 inclusive
    in_range = all(1 <= number <= 49 for number in user_numbers)
    # Exactly six entries, all of them different
    six_unique = len(user_numbers) == 6 and len(set(user_numbers)) == 6
    if not (in_range and six_unique):
        print('Please insert six different numbers ranging between 1 to 49')
        return False
    return True

def one_ticket_probability(user_numbers):
    if validate_numbers(user_numbers):
        comb = int(combinations(49, 6))
        prob = (1 / comb) * 100
        print('''Your chances to win the big prize with the numbers {} are {:.7f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(user_numbers, prob, comb))

Now we will test our function on two valid inputs and two invalid ones.

In [3]:
test_input_1 = [7, 13, 32, 44, 19, 1]
one_ticket_probability(test_input_1)

Your chances to win the big prize with the numbers [7, 13, 32, 44, 19, 1] are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.


In [4]:
test_input_2 = [29, 6, 31, 10, 17, 18]
one_ticket_probability(test_input_2)

Your chances to win the big prize with the numbers [29, 6, 31, 10, 17, 18] are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.


In [5]:
invalid_test_input_1 = [1, 2, 3, 4, 5, 6, 7]
one_ticket_probability(invalid_test_input_1)

Please insert six different numbers ranging between 1 to 49


In [6]:
invalid_test_input_2 = [1, 1, 1, 1, 1, 1]
one_ticket_probability(invalid_test_input_2)

Please insert six different numbers ranging between 1 to 49
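
The invalid inputs above cover too many numbers and repeated numbers, but not values outside the 1 to 49 range. The following is a minimal, self-contained sketch of a check that covers all three cases; `is_valid_ticket` is a hypothetical helper written for illustration, not part of the app's code.

```python
def is_valid_ticket(numbers):
    """Hypothetical helper: True only for six distinct integers in 1..49."""
    return (
        len(numbers) == 6
        and len(set(numbers)) == 6
        and all(isinstance(n, int) and 1 <= n <= 49 for n in numbers)
    )

print(is_valid_ticket([7, 13, 32, 44, 19, 1]))  # True
print(is_valid_ticket([0, 2, 3, 4, 5, 50]))     # False: 0 and 50 are out of range
print(is_valid_ticket([1, 1, 1, 1, 1, 1]))      # False: repeated numbers
print(is_valid_ticket([1, 2, 3, 4, 5, 6, 7]))   # False: seven numbers
```

Note that Python reads `1 <= n <= 49` as `1 <= n and n <= 49`, which is why the range test is written in that order.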


## Historical Data Check for Canada Lottery
The next feature of our app will show users whether their numbers would have won in the past by referencing historical lottery data in Canada.

We will be using data from the national 6/49 lottery game in Canada. The data set contains historical data for 3,665 drawings, dating from 1982 to 2018 (the data set can be downloaded from [here](https://www.kaggle.com/datasets/datascienceai/lottery-dataset)).

First, let's open this dataset and get familiar with it.

In [7]:
import pandas as pd

lottery_canada = pd.read_csv('649.csv')
lottery_canada.shape

(3665, 11)

In [8]:
lottery_canada.head(3)

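| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |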


In [9]:
lottery_canada.tail(3)

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


## Function for Historical Data Check
Now, we'll write a function that compares any ticket with the historical data and outputs the following:
- The number of times the ticket would have won the big prize in the past.
- The probability that the ticket will win the big prize in the next drawing.

In [10]:
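def extract_numbers(row):
    '''Takes a row of the lottery dataframe and returns a set containing the six
    winning numbers (columns NUMBER DRAWN 1 to NUMBER DRAWN 6).'''
    # Positions 4 to 9 hold the six drawn numbers; the bonus number is excluded
    return set(row.iloc[4:10])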

winning_numbers = lottery_canada.apply(extract_numbers, axis=1)
winning_numbers.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object
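
Storing each draw as a set matters because the order in which a player lists their numbers should not affect the comparison. A minimal sketch in plain Python, using only the first three draws from the table above (`past_draws` and `ticket` are illustrative names):

```python
# The first three draws shown in the table above, stored as sets
past_draws = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
]

# Same numbers as the second draw, listed in a different order
ticket = [33, 36, 37, 39, 8, 41]

# Set equality ignores order, so the ticket matches the second draw only
matches = [set(ticket) == draw for draw in past_draws]
print(matches)       # [False, True, False]
print(sum(matches))  # 1
```

Summing the booleans counts the matching draws, since `True` counts as 1 and `False` as 0.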

In [11]:
def check_historical_occurrence(user_numbers, historical_numbers):
    '''
    user_numbers: a Python list of 6 numbers from 1 to 49 inclusive
    historical_numbers: a pandas Series of sets of historical winning numbers
    '''
    if validate_numbers(user_numbers):
        # Element-wise comparison: True where a past draw equals the user's set
        check_occurrence = historical_numbers == set(user_numbers)
        n_occurrences = check_occurrence.sum()

        if n_occurrences == 0:
            print('''The combination {} has never occurred.
This doesn't mean it's more likely to occur now.'''.format(user_numbers))
        else:
            print('The number of times combination {} has occurred in the past is {}'.format(user_numbers, n_occurrences))

        one_ticket_probability(user_numbers)

In [12]:
test_input_3 = [14, 27, 3, 31, 12, 44]
check_historical_occurrence(test_input_3, winning_numbers)

The combination [14, 27, 3, 31, 12, 44] has never occurred.
This doesn't mean it's more likely to occur now.
Your chances to win the big prize with the numbers [14, 27, 3, 31, 12, 44] are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.


In [13]:
test_input_4 = [33, 36, 37, 39, 8, 41]
check_historical_occurrence(test_input_4, winning_numbers)

The number of times combination [33, 36, 37, 39, 8, 41] has occurred in the past is 1
Your chances to win the big prize with the numbers [33, 36, 37, 39, 8, 41] are 0.0000072%.
In other words, you have a 1 in 13,983,816 chances to win.


## Multi-ticket Probability
Gamblers often play multiple tickets for a single drawing, hoping to significantly increase their chances of winning. So we're going to include a feature that lets them see the probability of winning when playing multiple tickets. The idea is that the user inputs the number of different tickets they intend to play, from 1 to 13,983,816 (the maximum number of different tickets), and our function tells them the probability of winning.

In [14]:
def multi_ticket_probability(n_tickets):
    n_combinations = combinations(49, 6)
    probability = (n_tickets / n_combinations) * 100
    
    if n_tickets == 1:
        print('''Your chances to win the big prize with one ticket are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(probability, int(n_combinations)))
    
    else:
        combinations_simplified = round(n_combinations / n_tickets)   
        print('''Your chances to win the big prize with {:,} different tickets are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(n_tickets, probability,
                                                               combinations_simplified))

In [15]:
test_inputs = [1, 10, 100, 10000, 1000000, 6991908, 13983816]

for test_input in test_inputs:
    multi_ticket_probability(test_input)
    print('------------------------') # output delimiter

Your chances to win the big prize with one ticket are 0.000007%.
In other words, you have a 1 in 13,983,816 chances to win.
------------------------
Your chances to win the big prize with 10 different tickets are 0.000072%.
In other words, you have a 1 in 1,398,382 chances to win.
------------------------
Your chances to win the big prize with 100 different tickets are 0.000715%.
In other words, you have a 1 in 139,838 chances to win.
------------------------
Your chances to win the big prize with 10,000 different tickets are 0.071511%.
In other words, you have a 1 in 1,398 chances to win.
------------------------
Your chances to win the big prize with 1,000,000 different tickets are 7.151124%.
In other words, you have a 1 in 14 chances to win.
------------------------
Your chances to win the big prize with 6,991,908 different tickets are 50.000000%.
In other words, you have a 1 in 2 chances to win.
------------------------
Your chances to win the big prize with 13,983,816 different tickets are 100.000000%.
In other words, you have a 1 in 1 chances to win.
------------------------
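
The same relationship can be read in the other direction: given a target probability, how many different tickets are needed? The following is a minimal sketch, assuming Python 3.8 or later for `math.comb`; `tickets_needed` is a hypothetical helper introduced here for illustration.

```python
import math

def tickets_needed(target_probability):
    """Hypothetical helper: smallest number of different tickets whose chance
    of winning the big prize is at least target_probability (between 0 and 1)."""
    n_combinations = math.comb(49, 6)
    return math.ceil(target_probability * n_combinations)

print(tickets_needed(0.5))   # 6991908
print(tickets_needed(0.01))  # 139839
```

The first result matches the 6,991,908 tickets that give a 50% chance in the output above.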

## Probability of Having Fewer Winning Numbers
In most 6/49 lotteries, there are smaller prizes for players who match between 2 and 5 numbers in a single draw. Hence, players may be interested in finding the probability of having two, three, four, or five winning numbers. So we will write a function that lets players enter the number of winning numbers they are interested in (between 2 and 5) and tells them the probability of matching exactly that many numbers.
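
The counting argument behind this probability can be written out explicitly. A ticket matches exactly $k$ of the six winning numbers when $k$ of its numbers come from the 6 that were drawn and the remaining $6-k$ come from the 43 that were not. As a sketch, writing $k$ for the number of matches (a symbol introduced here for illustration):

$$
P(k \text{ matches}) = \frac{\binom{6}{k}\,\binom{43}{6-k}}{\binom{49}{6}}
$$

For example, with $k = 5$:

$$
P(5 \text{ matches}) = \frac{\binom{6}{5}\,\binom{43}{1}}{\binom{49}{6}} = \frac{6 \times 43}{13{,}983{,}816} = \frac{258}{13{,}983{,}816} \approx 0.001845\%
$$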

In [16]:
def probability_less_6(n_winning_numbers):
    n_combinations_ticket = combinations(6, n_winning_numbers)
    n_combinations_remaining = combinations(43, 6 - n_winning_numbers)
    successful_outcomes = n_combinations_ticket * n_combinations_remaining
    
    n_combinations_total = combinations(49, 6)
    probability = successful_outcomes / n_combinations_total
    
    probability_percentage = probability * 100    
    combinations_simplified = round(n_combinations_total/successful_outcomes)    
    print('''Your chances of having {} winning numbers with this ticket are {:.6f}%.
In other words, you have a 1 in {:,} chances to win.'''.format(n_winning_numbers, probability_percentage,
                                                               int(combinations_simplified)))

In [17]:
for test_input in [2, 3, 4, 5]:
    probability_less_6(test_input)
    print('--------------------------') # output delimiter

Your chances of having 2 winning numbers with this ticket are 13.237803%.
In other words, you have a 1 in 8 chances to win.
--------------------------
Your chances of having 3 winning numbers with this ticket are 1.765040%.
In other words, you have a 1 in 57 chances to win.
--------------------------
Your chances of having 4 winning numbers with this ticket are 0.096862%.
In other words, you have a 1 in 1,032 chances to win.
--------------------------
Your chances of having 5 winning numbers with this ticket are 0.001845%.
In other words, you have a 1 in 54,201 chances to win.
--------------------------
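
One way to sanity-check these results is to confirm that the outcome counts for every possible number of matches, from 0 to 6, add up to the total number of tickets. A minimal sketch, assuming Python 3.8 or later for `math.comb` (the names `total` and `outcomes` are illustrative):

```python
import math

total = math.comb(49, 6)

# Number of tickets with exactly k winning numbers, for k = 0..6
outcomes = {k: math.comb(6, k) * math.comb(43, 6 - k) for k in range(7)}

# Every ticket matches exactly one value of k, so the counts must sum to the total
assert sum(outcomes.values()) == total

for k, count in outcomes.items():
    print(k, f"{count / total:.6%}")
```

If the counts did not sum to `math.comb(49, 6)`, the formula for the successful outcomes would be miscounting some tickets.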


## Conclusion
In this project, we wrote the logic for an application that lets users see the odds of winning the 6/49 lottery under different strategies: playing one or many tickets, checking historical data to see whether a combination of numbers has ever won before, and trying to win a smaller prize by matching between 2 and 5 winning numbers. We hope that seeing how low the chances of winning are will discourage users from gambling. Here are our main takeaways:
- The chances of winning the big prize with a single ticket are extremely low: 1 in 13,983,816.
- To have a relatively high chance of winning the big prize, the player has to buy **a huge number of tickets**: almost 7 million different tickets for a 50% chance.
- The probability of matching fewer winning numbers is still low, from about 13% for two numbers down to about 0.002% for five.