# Mobile App for Lottery Addiction

The goal of this project is to help lottery addicts better estimate their chances of winning. A medical institute that aims to prevent and treat gambling addiction wants to build a dedicated mobile app for this purpose. The institute has a team of engineers who will build the app, but they need us to create its logical core and calculate the probabilities. For this project, we will use historical data from the Canadian lottery, available [here](https://www.kaggle.com/datasets/datascienceai/lottery-dataset).

The first step of this project will be to write the core functions to calculate our probabilities. We will use a function, `factorial`, which calculates the factorial of any given number. We will also use a function, `combinations`, which takes two inputs, `n` and `k`, and returns the number of possible combinations when selecting `k` objects from a possible `n` objects. Since the order in which lottery numbers are drawn has no bearing on winning the lottery, we need to calculate the number of combinations instead of the total number of permutations.

In [1]:
import pandas as pd

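def factorial(n):
    # Multiply 1 * 2 * ... * n; factorial(0) is 1.
    answer = 1
    for i in range(2, n + 1):
        answer *= i
    return answer

def combinations(n, k):
    # Integer division keeps the result exact; float division can lose
    # precision once the factorials exceed what a float can represent.
    return factorial(n) // (factorial(n - k) * factorial(k))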
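
As a quick sanity check, the helpers can be compared with Python's built-in `math.comb` (available from Python 3.8). This is only a minimal sketch; the helper definitions are repeated here, using integer division, so that the block runs on its own.

```python
import math

def factorial(n):
    answer = 1
    for i in range(2, n + 1):
        answer *= i
    return answer

def combinations(n, k):
    return factorial(n) // (factorial(n - k) * factorial(k))

# Choosing 6 numbers out of 49, as in this project.
total_tickets = combinations(49, 6)
print(total_tickets)  # 13983816

# The result should agree with the standard library for a range of inputs.
for n in range(0, 50):
    for k in range(0, n + 1):
        assert combinations(n, k) == math.comb(n, k)
```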

## One-Ticket Probability

Our next function displays the odds of winning the jackpot with a single ticket. It takes the ticket's numbers, `n`, as an argument, but because every six-number combination is equally likely, the argument is never used and any value can be passed without raising an error. In effect, the function only prints the probability of winning when six numbers are chosen from a possible 49.

In [2]:
def one_ticket_probability(n):
    combos = combinations(49,6)
    probability = 1 / combos
    print('Buying one lottery ticket:')
    print('\tYour probability of having the correct numbers is {:.7f}'.format(100 * probability) + '%.')
    print('\tYour chances of winning are 1 in {:,}'.format(combos) + '.')
    return

# Sample call
one_ticket_probability(0)

Buying one lottery ticket:
	Your probability of having the correct numbers is 0.0000072%.
	Your chances of winning are 1 in 13,983,816.
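
These figures follow directly from the combinations formula. Written out for six numbers chosen from 49:

$$
\binom{49}{6} = \frac{49!}{6!\,43!} = \frac{49 \cdot 48 \cdot 47 \cdot 46 \cdot 45 \cdot 44}{720} = 13{,}983{,}816,
\qquad
P(\text{jackpot}) = \frac{1}{13{,}983{,}816} \approx 7.15 \times 10^{-8} \approx 0.0000072\%
$$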


## Historical Data Check for Canadian Lottery

We have written a function that will tell users their chances of winning the lottery with a given set of numbers, but we also want users to be able to compare their ticket to historical lottery data in Canada to determine if they ever would have won with that particular set of numbers. We'll read in the dataset and examine what it contains.

In [3]:
lottery = pd.read_csv('649.csv')
lottery.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3665 entries, 0 to 3664
Data columns (total 11 columns):
PRODUCT            3665 non-null int64
DRAW NUMBER        3665 non-null int64
SEQUENCE NUMBER    3665 non-null int64
DRAW DATE          3665 non-null object
NUMBER DRAWN 1     3665 non-null int64
NUMBER DRAWN 2     3665 non-null int64
NUMBER DRAWN 3     3665 non-null int64
NUMBER DRAWN 4     3665 non-null int64
NUMBER DRAWN 5     3665 non-null int64
NUMBER DRAWN 6     3665 non-null int64
BONUS NUMBER       3665 non-null int64
dtypes: int64(10), object(1)
memory usage: 315.0+ KB


In [4]:
lottery.head(3)

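| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |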


In [5]:
lottery.tail(3)

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


## Function for Historical Data Check

To write this function, we will have to take in a set of six numbers and compare them to all of the past winning combinations. We will output how many times the set of numbers would have won the lottery as well as the probability of winning the jackpot in the next drawing with that particular combination. For the purposes of our function we will be using the following columns:
* `NUMBER DRAWN 1`
* `NUMBER DRAWN 2`
* `NUMBER DRAWN 3`
* `NUMBER DRAWN 4`
* `NUMBER DRAWN 5`
* `NUMBER DRAWN 6`

The `BONUS NUMBER` column is not relevant to our calculations.

In [6]:
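# Return the six main winning numbers of a draw as a set. Positions 4 through 9
# hold NUMBER DRAWN 1 to NUMBER DRAWN 6; the bonus number at position 10 is excluded.

def extract_numbers(row):
    winning_numbers = set(row.iloc[4:10])
    return winning_numbers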
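
Positional slicing with `iloc[4:10]` depends on the column order of the file. As an illustration only, the sketch below selects the six numbers by column name instead, using the standard-library `csv` module and the first three draws from the dataset preview. The helper name `extract_numbers_by_name` and the inline sample are assumptions made for this example, not part of the project code.

```python
import csv
import io

# First three draws from the dataset preview, reduced to the relevant columns.
sample_csv = """DRAW DATE,NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER
6/12/1982,3,11,12,14,41,43,13
6/19/1982,8,33,36,37,39,41,9
6/26/1982,1,6,23,24,27,39,34
"""

NUMBER_COLUMNS = ['NUMBER DRAWN {}'.format(i) for i in range(1, 7)]

def extract_numbers_by_name(row):
    # Hypothetical helper: look the six main numbers up by column name,
    # so the bonus number is excluded regardless of column order.
    return {int(row[column]) for column in NUMBER_COLUMNS}

draws = [extract_numbers_by_name(row) for row in csv.DictReader(io.StringIO(sample_csv))]
print(draws[0])
```

With pandas, the same idea would be to index each row with the list of column names rather than with integer positions.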

In [7]:
def check_historical_occurence(user_num, winning_num):
    user_num = set(user_num)
    # Count the past draws whose six numbers match the user's numbers exactly.
    wins = 0
    for numbers in winning_num:
        if user_num == numbers:
            wins += 1
    if wins == 0:
        print('This combination has not won in the last 35 years.')
    elif wins == 1:
        print('This combination won one time in the last 35 years.')
    else:
        # A combination winning more than once is very unlikely, but handling
        # the case keeps the function correct if it ever happens.
        print('This combination won {} times in the last 35 years.'.format(wins))
    # Past wins do not change the odds of the next drawing.
    print('If you use these numbers for the next drawing:')
    print('\tYour probability of having the correct numbers is 0.0000072%.')
    print('\tYour chances of winning are 1 in 13,983,816.')
    return


winning_numbers = lottery.apply(extract_numbers, axis=1)

In [8]:
user_numbers = {1,2,3,4,5,6}
check_historical_occurence(user_numbers, winning_numbers)

This combination has not won in the last 35 years.
If you use these numbers for the next drawing:
	Your probability of having the correct numbers is 0.0000072%.
	Your chances of winning are 1 in 13,983,816.


In [9]:
user_numbers = (14,24,31,35,37,48)
check_historical_occurence(user_numbers, winning_numbers)

This combination won one time in the last 35 years.
If you use these numbers for the next drawing:
	Your probability of having the correct numbers is 0.0000072%.
	Your chances of winning are 1 in 13,983,816.
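
The historical check scans every past draw on each call. If many tickets need checking, one option is to count the draws once and look each ticket up afterwards. The sketch below is an assumption about how that could look, using `collections.Counter` keyed by `frozenset`. The names `build_win_counts` and `count_wins` are hypothetical, and the three draws are the last three rows of the dataset preview.

```python
from collections import Counter

def build_win_counts(winning_sets):
    # frozenset is hashable, so each combination can be used as a Counter key.
    return Counter(frozenset(numbers) for numbers in winning_sets)

def count_wins(user_numbers, win_counts):
    # Counter returns 0 for combinations that never appeared.
    return win_counts[frozenset(user_numbers)]

past_draws = [
    {6, 22, 24, 31, 32, 34},
    {2, 15, 21, 31, 38, 49},
    {14, 24, 31, 35, 37, 48},
]
win_counts = build_win_counts(past_draws)

print(count_wins((14, 24, 31, 35, 37, 48), win_counts))  # 1
print(count_wins({1, 2, 3, 4, 5, 6}, win_counts))        # 0
```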


## Multi-Ticket Probability

The next function, `multi_ticket_probability`, prints the probability of winning when a given number of tickets, `tickets`, is purchased. It multiplies the probability of winning with one ticket by the number of tickets purchased, which assumes that every ticket carries a different combination. This ignores the very small possibility of purchasing the same numbers on more than one ticket.

In [10]:
def multi_ticket_probability(tickets):
    combos = int(combinations(49,6))
    probability = tickets / combos
    chances = int(combos / tickets)
    if tickets == 1:
        print('Buying {:,}'.format(tickets) + ' ticket:')
    else:
        print('Buying {:,}'.format(tickets) + ' tickets:')
    print('Your probability of having the correct numbers is {:.7f}'.format(100 * probability) + '%.')
    print('Your chances of winning are 1 in {:,}.'.format(chances))
    return

test_list = [1, 10, 100, 10000, 1000000, 6991908, 13983816]

for item in test_list:
    multi_ticket_probability(item)
    print('\n')

Buying 1 ticket:
Your probability of having the correct numbers is 0.0000072%.
Your chances of winning are 1 in 13,983,816.


Buying 10 tickets:
Your probability of having the correct numbers is 0.0000715%.
Your chances of winning are 1 in 1,398,381.


Buying 100 tickets:
Your probability of having the correct numbers is 0.0007151%.
Your chances of winning are 1 in 139,838.


Buying 10,000 tickets:
Your probability of having the correct numbers is 0.0715112%.
Your chances of winning are 1 in 1,398.


Buying 1,000,000 tickets:
Your probability of having the correct numbers is 7.1511238%.
Your chances of winning are 1 in 13.


Buying 6,991,908 tickets:
Your probability of having the correct numbers is 50.0000000%.
Your chances of winning are 1 in 2.


Buying 13,983,816 tickets:
Your probability of having the correct numbers is 100.0000000%.
Your chances of winning are 1 in 1.
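
Multiplying the one-ticket probability by the ticket count is exact only when every ticket carries a different combination, and the result cannot exceed 100%. A minimal sketch of that calculation with exact fractions follows. `multi_ticket_fraction` is a hypothetical helper, and capping the ticket count at the number of combinations is an assumption added for this example.

```python
from fractions import Fraction
from math import comb

TOTAL_COMBINATIONS = comb(49, 6)  # 13,983,816

def multi_ticket_fraction(tickets):
    # Assumes all tickets are distinct combinations; beyond the total
    # number of combinations, extra tickets cannot add any probability.
    distinct = min(tickets, TOTAL_COMBINATIONS)
    return Fraction(distinct, TOTAL_COMBINATIONS)

print(multi_ticket_fraction(6991908))         # 1/2
print(float(multi_ticket_fraction(1000000)))  # about 0.0715
print(multi_ticket_fraction(20000000))        # 1
```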




## Fewer Winning Numbers

We have so far written functions dealing with the probabilities of winning the lottery jackpot. To finish, we will write one more function to calculate the probability of matching fewer than six numbers. Many lotteries pay smaller prizes if a ticket matches two, three, four, or five numbers. Our final function, `probability_less_6`, will take in an integer between two and five and calculate the probability of matching exactly that many numbers on a given ticket.
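
The calculation counts the tickets that contain exactly $k$ of the six winning numbers together with $6 - k$ of the 43 remaining numbers, and divides by the total number of tickets. As a worked instance for $k = 5$:

$$
P(\text{exactly } k) = \frac{\binom{6}{k}\binom{43}{6-k}}{\binom{49}{6}},
\qquad
P(\text{exactly } 5) = \frac{6 \cdot 43}{13{,}983{,}816} = \frac{258}{13{,}983{,}816} \approx \frac{1}{54{,}201}
$$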

In [11]:
def probability_less_6(num):
    winning_combos = combinations(6, num)
    losing_numbers = combinations(43,6-num)
    total_combos = combinations(49,6)
    probability = winning_combos * losing_numbers / total_combos
    chances = round(1 / probability)
    prob_pct = probability * 100
    print('The chances of having exactly {} winning numbers is 1 in {:,}. This is a {:.3f}% chance.'
          .format(num, chances, prob_pct))    

In [12]:
test_inputs = [2,3,4,5]

for item in test_inputs:
    probability_less_6(item)
    print('\n')

The chances of having exactly 2 winning numbers is 1 in 8. This is a 13.238% chance.


The chances of having exactly 3 winning numbers is 1 in 57. This is a 1.765% chance.


The chances of having exactly 4 winning numbers is 1 in 1,032. This is a 0.097% chance.


The chances of having exactly 5 winning numbers is 1 in 54,201. This is a 0.002% chance.




Other steps could be taken, such as calculating the odds of matching at least a certain number of winning numbers, but for the purposes of this project, we are finished.
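
As an illustration of that extension, the sketch below sums the exact-match probabilities to get the probability of matching at least a given count. It relies on `math.comb` rather than the helpers defined in this project, and the function names `probability_exactly` and `probability_at_least` are hypothetical.

```python
from fractions import Fraction
from math import comb

def probability_exactly(matches):
    # Choose `matches` of the 6 winning numbers and the rest from the 43 others.
    return Fraction(comb(6, matches) * comb(43, 6 - matches), comb(49, 6))

def probability_at_least(matches):
    # Matching at least `matches` numbers means matching exactly
    # matches, matches + 1, ..., or 6 numbers; these events are disjoint.
    return sum(probability_exactly(m) for m in range(matches, 7))

# All seven outcomes (0 through 6 matches) together cover every ticket.
assert probability_at_least(0) == 1

for matches in range(2, 7):
    print('At least {}: {:.5f}%'.format(matches, 100 * float(probability_at_least(matches))))
```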