# Treating Lottery Addiction with Probabilities
To help people with a lottery addiction, a medical institute wants to build an app that shows users their actual chances of winning. This project focuses on the logical core of the app: the functions that generate those probabilities from combinatorial formulas.

## Part II - Core Functions
Defining functions that calculate factorials and combinations

In [1]:
def factorial(n):
    factorial_result=1
    for i in range(n,0,-1):
        factorial_result*=i
    return factorial_result

In [2]:
def combinations(n,k):
    return factorial(n)/factorial(n-k)/factorial(k)
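As a quick sanity check, the two helpers can be compared against the standard library. This is only a sketch: it assumes Python 3.8 or later (where `math.comb` exists) and restates the notebook's functions so the cell runs on its own.

```python
import math

def factorial(n):
    # Multiply n * (n-1) * ... * 1, as in the notebook
    factorial_result = 1
    for i in range(n, 0, -1):
        factorial_result *= i
    return factorial_result

def combinations(n, k):
    # Same formula as the notebook; true division returns a float
    return factorial(n) / factorial(n - k) / factorial(k)

# Cross-check against the standard library implementations
assert factorial(6) == math.factorial(6)
assert combinations(49, 6) == math.comb(49, 6)
print(math.comb(49, 6))
```

Note that `combinations` returns a float (13983816.0) because of the `/` operator, while `math.comb` returns an integer.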

## Part III - One-ticket Probability
Finding the probability of winning the 6/49 lottery with a single ticket

In [3]:
def one_ticket_probability(six_number_list):
    n_of_unique_comb=combinations(49,6)
    chances_of_one=1/n_of_unique_comb
    chances_as_perc=chances_of_one*100
    return 'Your chances of winning with {} are: {:f}%'.format(
        six_number_list, chances_as_perc)
one_ticket_probability([1,2,3,4,5,6])


#can also do '{:.#f}'.format(chances_as_perc), to disp more decimals
#disp float number - https://stackoverflow.com/questions/658763/how-do-i-suppress-scientific-notation-in-python

'Your chances of winning with [1, 2, 3, 4, 5, 6] are: 0.000007%'
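The default `{:f}` format keeps only six decimals, which hides most of the value. A small sketch of two alternatives, a wider fixed-point format and scientific notation, is given here; the variable names are illustrative and `math.comb` (Python 3.8+) stands in for the notebook's `combinations`.

```python
from math import comb

n_outcomes = comb(49, 6)                 # 13,983,816 possible draws
chances_as_perc = 1 / n_outcomes * 100   # one ticket, as a percentage

# Ten decimals instead of the default six
formatted = '{:.10f}'.format(chances_as_perc)
print(formatted + '%')

# Scientific notation is another way to show very small values
print('{:.3e}%'.format(chances_as_perc))

# "1 in N" is often easier to read than a tiny percentage
print('1 in {:,}'.format(n_outcomes))
```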

## Part IV - Historical Data Check for Canada Lottery
Allows users to compare their ticket against historical lottery data in Canada and determine whether they would have ever won by now

In [4]:
import pandas as pd
historical_data=pd.read_csv('649.csv')
historical_data.head(3)

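| Unnamed: 0 | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |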


In [5]:
historical_data.tail(3)

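| Unnamed: 0 | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |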


## Part V - Function for Historical Data Check
Part IV explored the historical data set. This part writes a function that lets users compare their ticket against the historical lottery data

In [6]:
def extract_numbers(row):
    number_series=row[4:10]
    return set(number_series)
winning_number_sets=historical_data.apply(extract_numbers, axis=1)
winning_number_sets.head()

#Create a set from a series in pandas - https://stackoverflow.com/questions/39551566/create-a-set-from-a-series-in-pandas
#can do: set(number_series.values)

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

In [7]:
#will a single set compare to a series of sets, even though not same size?
tf=set([41,8,39,37,36,33])==winning_number_sets
#YES

In [16]:
def check_historical_occurence(their_list,historical_list):
    their_set=set(their_list)
    t_f_match_series=their_set==historical_list
    number_of_matches=sum(t_f_match_series)
    return 'The number of times {} has won in the past is: {}'\
.format(their_list,number_of_matches)

In [17]:
test_combination=[41,8,39,37,36,33]
print(one_ticket_probability(test_combination))
print(check_historical_occurence(test_combination,winning_number_sets))

Your chances of winning with [41, 8, 39, 37, 36, 33] are: 0.000007%
The number of times [41, 8, 39, 37, 36, 33] has won in the past is: 1
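The same check can be sketched in plain Python, without relying on how pandas compares a set against a Series. The helper name `count_historical_matches` and the list `sample_draws` are made up for this illustration; the three draws are copied from the `head(3)` output above, so this is not the full data set.

```python
# First three draws from the historical data, stored as sets
sample_draws = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
]

def count_historical_matches(their_list, historical_sets):
    # Sets ignore order, so [41, 8, ...] equals {8, 33, ..., 41}
    their_set = set(their_list)
    return sum(1 for draw in historical_sets if draw == their_set)

print(count_historical_matches([41, 8, 39, 37, 36, 33], sample_draws))
print(count_historical_matches([1, 2, 3, 4, 5, 6], sample_draws))
```

Comparing each draw explicitly makes it clear that only a full six-number match is counted.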


## Part VI - Multi-ticket Probability
The goal of this portion is to write a function that allows users to calculate the chances of winning for any number of different tickets purchased

In [24]:
def multi_ticket_probability(num_dif_tickets):
    total_number_of_combinations=combinations(49,6)
    probs_perc=num_dif_tickets/total_number_of_combinations*100
    return 'The probability of winning with {} different \
tickets is: {:f}%'.format(num_dif_tickets,probs_perc)

In [25]:
for val in [1, 10, 100, 10000, 1000000, 6991908, 13983816]:
    print(multi_ticket_probability(val))

The probability of winning with 1 different tickets is: 0.000007%
The probability of winning with 10 different tickets is: 0.000072%
The probability of winning with 100 different tickets is: 0.000715%
The probability of winning with 10000 different tickets is: 0.071511%
The probability of winning with 1000000 different tickets is: 7.151124%
The probability of winning with 6991908 different tickets is: 50.000000%
The probability of winning with 13983816 different tickets is: 100.000000%


## Part VII - Fewer Winning Numbers
There are smaller prizes if a player's ticket matches two, three, four or five of the six numbers drawn, so a function that gives users the probability of matching two to five numbers would be useful

Do note there is a **difference** between the probabilities of: <br>
a) having **exactly** five winning numbers <br>
b) having **at least** five winning numbers

For example, say the winning lottery set was (1,2,3,4,5,6) <br>
The ticket (1,2,3,4,5,7) matches five numbers, so it has at least 4 winning numbers <br>
but the ticket (1,2,3,4,7,8) has exactly 4 winning numbers

For our purposes here, we want the probability of matching exactly a given number of winning numbers, starting with exactly five
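The distinction is easy to check with a set intersection. The helper `count_matches` below is a hypothetical name used only for this sketch.

```python
def count_matches(ticket, winning_numbers):
    # Size of the intersection = how many numbers the ticket shares with the draw
    return len(set(ticket) & set(winning_numbers))

winning = (1, 2, 3, 4, 5, 6)

five = count_matches((1, 2, 3, 4, 5, 7), winning)
four = count_matches((1, 2, 3, 4, 7, 8), winning)

print(five, five >= 4, five == 4)   # counts as "at least 4" but not "exactly 4"
print(four, four >= 4, four == 4)   # counts as both
```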

### Example to understand the calculations:
We know that a draw is a set of 6 unique numbers from 1-49 (unique because sampling is done without replacement), so there are **13,983,816** total possible outcomes (or combinations)

Say that you choose the ticket (1,2,3,4,5,6) and want to know the **probability your set matches exactly five of the winning numbers of the lottery**

Out of the 6 numbers on your ticket how many 5 digit combinations can you make? (remember combinations - order does not matter) <br>
(12345,23456,34561,45612,56123,61234) <br>
**six combinations** (aka 6C5 = 6!/(6-5)!/5!)

For one ticket, there are six combinations of five digits. Let's say that one of these five-digit combinations matches five of the numbers on the winning ticket. There are 49 - 5 = 44 possible tickets that contain this combination, and one of those tickets is the lottery winner (it matches all six). Because we only want to match exactly 5 digits, there are only 43 lottery outcomes that match exactly 5 numbers through this combination.

To calculate the total number of successful outcomes for 5 of the 6 digits matching, let us say that the winning lottery ticket is (1,2,3,4,5,6). There are 6 unique 5-digit sets that can match, and each of them can appear on 43 different tickets. By the rule of product, the total number of successful outcomes is 6 x 43 = 258. Visualizing a tree diagram can help here
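The count of 258 can be confirmed by building the tickets directly. This is only an illustrative sketch: `itertools.combinations` is imported as `subsets` to avoid clashing with the notebook's own `combinations` function, and the other names are made up for the example.

```python
from itertools import combinations as subsets

winning = set(range(1, 7))                                      # (1,2,3,4,5,6)
other_numbers = [n for n in range(1, 50) if n not in winning]   # 7..49

# Every ticket with exactly five winning numbers is one 5-number subset
# of the winning set plus one number that is NOT a winning number
tickets = {
    frozenset(five) | {extra}
    for five in subsets(sorted(winning), 5)
    for extra in other_numbers
}

print(len(other_numbers), len(tickets))
```

Using a set of `frozenset` objects guarantees that no ticket is counted twice.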

In [60]:
def probability_less_6(int_between_2_and_5):
    num_of_int_num_combin=combinations(6,int_between_2_and_5)
    spots=6-int_between_2_and_5
    base=43 #max amount of other numbers possible
    count=combinations(base,spots)
#BELOW DOESN'T WORK BECAUSE 43*42*41 IS FOR PERMUTATION NOT COMBINATIONS
#     count=1
#     for i in range(spots):
#         count*=base
#         base-=1
# I.E. combinations(6,2)*43*42*41*40 = 44,427,600 > 13,983,816
    total_num_successful_outcomes=num_of_int_num_combin*count  
    prob_of_int_perc=total_num_successful_outcomes/combinations(49,6)*100
    return 'The probability of matching {} \
of the six numbers to the winning lottery number \
is: {:f}%'.format(int_between_2_and_5,prob_of_int_perc)


In [61]:
for i in [2,3,4,5]:
    print(probability_less_6(i))

The probability of matching 2 of the six numbers to the winning lottery number is: 13.237803%
The probability of matching 3 of the six numbers to the winning lottery number is: 1.765040%
The probability of matching 4 of the six numbers to the winning lottery number is: 0.096862%
The probability of matching 5 of the six numbers to the winning lottery number is: 0.001845%
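One way to gain confidence in these numbers is to check that the counts for every possible number of matches (0 through 6) add up to all 13,983,816 outcomes. The sketch below assumes Python 3.8+ for `math.comb` and uses it in place of the notebook's `combinations`.

```python
from math import comb

total = comb(49, 6)

# k matching numbers: choose k of the 6 winning numbers
# and the remaining 6 - k from the 43 non-winning numbers
counts = {k: comb(6, k) * comb(43, 6 - k) for k in range(7)}

# The seven cases are mutually exclusive and cover every ticket
assert sum(counts.values()) == total

for k, count in counts.items():
    print(k, count, '{:f}%'.format(count / total * 100))
```

The case k = 6 contributes exactly one outcome, the jackpot, and k = 5 contributes the 258 outcomes from the worked example.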


### Understanding how to Write The Function
Draw it out. Use a 4 digit code where each digit can be 0-9 (10 digits). Say for example you want to match 3 of them. You have a correct code of 0,1,2,3. That means there are 4!/(4-3)!/3! = 4 3-digit combos that will work (012,123,230,301). Keep in mind these are combinations, so 012 is the same as 102 because order doesn't matter.

Now, each one of the four different 3-digit combos that will work has a fourth, empty digit. Say we are looking at 012: there could be 7 other numbers accompanying it (3,4,5,6,7,8,9). However, we only want 3/4 digits to match the correct code, so 3 can't be an option because we would get 0123. That leaves only 6 digits that can take that one spot. We therefore calculate optionsCspots = 6!/(6-1)!/1! = 6 combinations for each 3-digit combo, and thus there are 4 x 6 = 24 successful outcomes in total.

Say now that we only want 2 digits to match. With two spots and four unique digits, there are 4!/(4-2)!/2! = six different 2-digit combos that will have 2/4 winning numbers. We also have two other spots to fill. Let's use 01: we could have 8 other numbers accompanying it (2,3,4,5,6,7,8,9). However, we can't use 2 or 3, so again we only have 6 digits available for those 2 spots. That gives 6!/(6-2)!/2! = 15 combinations for each 2-digit combo, and thus 6 x 15 = 90 successful outcomes in total.
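Because the toy problem has only 210 possible codes, the counts of 24 and 90 can be confirmed by brute force. This is a sketch with made-up names (`all_codes`, `count_exact`); `itertools.combinations` is imported as `subsets` to avoid clashing with the notebook's `combinations`.

```python
from itertools import combinations as subsets

correct = {0, 1, 2, 3}

# Every unordered 4-digit code with unique digits from 0-9
all_codes = list(subsets(range(10), 4))

def count_exact(k):
    # Number of codes sharing exactly k digits with the correct code
    return sum(1 for code in all_codes if len(correct & set(code)) == k)

print(len(all_codes))
print(count_exact(3), count_exact(2))
```

The same enumeration would work for the 6/49 lottery, but with nearly 14 million combinations it is much slower than the formula.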

If this were a permutation problem, the order of the spots would matter. Say you wanted to know the number of permutations that include at least 2 of the 4 digits in 0123, still sampling without replacement so each value is unique. This means that 0123 and 1023 are different. The number of permutations for 10 digits and 4 spots would be 10!/(10-4)! = 5040 and only 4/10th of those 