# Guided Project: Mobile App for Lottery Addiction

## Introduction

In this project, a medical institute that aims to prevent and treat gambling addiction wants an app that helps lottery players better estimate their chances of winning. To support that, I will need to calculate several probability values. The questions I need to answer include:
    
   1) What is the probability of winning the big prize with a single ticket? 
    
   2) What is the probability of winning the big prize if we play 40 different 
       tickets? 
    
   3) What is the probability of having exactly five, four, three, or two 
       winning numbers on a single ticket? 

I will write two core functions to answer these questions: a factorial function and a combinations function.


## Core Functions

In [1]:
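def factorial(n):
    # Multiply n * (n-1) * ... * 1; returns 1 when n is 0.
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product

def combinations(n, k):
    # Number of ways to choose k items from n when order does not matter.
    numerator = factorial(n)
    d_1 = factorial(n - k)
    d_2 = factorial(k)
    # The quotient is always a whole number, so integer division is exact.
    return numerator // (d_1 * d_2)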
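
As a quick sanity check on the two helpers, the same count can be reproduced with Python's standard library. This is only a sketch: it assumes Python 3.8 or later (where `math.comb` exists), and the name `combinations_int` is introduced here for illustration and is not part of the notebook.

```python
import math

def combinations_int(n, k):
    """Number of ways to choose k items from n, using integer arithmetic."""
    return math.factorial(n) // (math.factorial(n - k) * math.factorial(k))

# Both approaches should agree on the number of possible 6/49 tickets.
total_tickets = combinations_int(49, 6)
assert total_tickets == math.comb(49, 6)
print(total_tickets)  # 13983816
```

Integer division keeps the result exact, which matters less for 49 choose 6 than it would for much larger inputs where floats lose precision.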

    

        

## One-ticket Probability

Next, I will create a function called one_ticket_probability that takes a list of six unique numbers and prints the probability of winning the big prize with that ticket. I will then run it on one example to check that it behaves correctly.

In [5]:
def one_ticket_probability(user_input): 
    outcomes = combinations(49, 6)
    probability = (1/outcomes) * 100
    print('The probability of winning with the numbers {} is {:.7f}%'.format(user_input, probability))

one_ticket_probability([1,2,3,4,5,6])

The probability of winning with the numbers [1, 2, 3, 4, 5, 6] is 0.0000072%
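
Note that the result does not depend on which six numbers are entered: every ticket is one outcome out of the same 13,983,816. A minimal sketch of the arithmetic, assuming Python 3.8 or later and using the illustrative name `p_single` (not from the notebook):

```python
from fractions import Fraction
from math import comb

# Exact probability that one 6-number ticket matches a 6/49 draw.
p_single = Fraction(1, comb(49, 6))

print(p_single)                          # 1/13983816
print(f"{float(p_single) * 100:.7f}%")   # 0.0000072%
```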


## Historical Data Check for Canada Lottery

The one_ticket_probability function from the previous step tells users the probability of winning with the numbers they enter. Most people would also like to check their numbers against past winning numbers, so that is the next step.

In [7]:
import pandas as pd 
df = pd.read_csv('649.csv', parse_dates = True)
print(df.shape)
print(df.head(3))
print('\n')
print(df.tail(3))

(3665, 11)
   PRODUCT  DRAW NUMBER  SEQUENCE NUMBER  DRAW DATE  NUMBER DRAWN 1  \
0      649            1                0  6/12/1982               3   
1      649            2                0  6/19/1982               8   
2      649            3                0  6/26/1982               1   

   NUMBER DRAWN 2  NUMBER DRAWN 3  NUMBER DRAWN 4  NUMBER DRAWN 5  \
0              11              12              14              41   
1              33              36              37              39   
2               6              23              24              27   

   NUMBER DRAWN 6  BONUS NUMBER  
0              43            13  
1              41             9  
2              39            34  


      PRODUCT  DRAW NUMBER  SEQUENCE NUMBER  DRAW DATE  NUMBER DRAWN 1  \
3662      649         3589                0  6/13/2018               6   
3663      649         3590                0  6/16/2018               2   
3664      649         3591                0  6/20/2018             

## Function for Historical Data Check 

For this part of the project, I will write two functions. The first extracts the six winning numbers from each row of the dataframe as a set. The second compares the numbers a user inputs against the historical winning numbers to see whether that combination has ever won.

In [8]:
def extract_numbers(row):
    # Positions 4 to 9 hold NUMBER DRAWN 1 through NUMBER DRAWN 6.
    row = row.iloc[4:10]
    # A set ignores the order in which the numbers were drawn.
    return set(row.values)

canada_lotto_numbers = df.apply(extract_numbers, axis = 1)
canada_lotto_numbers.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

In [24]:
def check_historical_occurrence(input_list, historical_data):
    # Compare the user's numbers, as a set, against every past draw.
    set_input_list = set(input_list)
    check = set_input_list == historical_data
    # Each matching draw counts as True, so the sum is the number of matches.
    number_of_occurrences = check.sum()
    if number_of_occurrences == 0: 
        print("The numbers {} have never occurred.".format(input_list))
    else: 
        print("The numbers {} occurred {} time(s).".format(input_list, number_of_occurrences))

test_input_3 = [1,2,3,4,5,6]
test_input_4 = [33, 36, 37, 39, 8, 41]
check_historical_occurrence(test_input_3, canada_lotto_numbers)
check_historical_occurrence(test_input_4, canada_lotto_numbers)

The numbers [1, 2, 3, 4, 5, 6] have never occurred.
The numbers [33, 36, 37, 39, 8, 41] occurred 1 time(s).
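
The same order-independent lookup can be sketched without pandas. The draws below are made up purely for illustration, and `count_occurrences` is a hypothetical helper, not a function from the notebook.

```python
from collections import Counter

# Made-up draws for illustration only.
toy_draws = [
    [1, 7, 13, 22, 35, 48],
    [2, 9, 18, 27, 36, 45],
    [1, 7, 13, 22, 35, 48],
]

# frozenset is hashable, so each draw can serve as a Counter key,
# and the order of the numbers no longer matters.
draw_counts = Counter(frozenset(draw) for draw in toy_draws)

def count_occurrences(ticket, counts):
    """Return how many times the ticket's numbers appear in counts."""
    return counts[frozenset(ticket)]

print(count_occurrences([48, 35, 22, 13, 7, 1], draw_counts))  # 2
print(count_occurrences([1, 2, 3, 4, 5, 6], draw_counts))      # 0
```

Building the counter once makes each later lookup a single dictionary access, which may be convenient if many tickets are checked against the same history.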


So far, I have created two functions: one_ticket_probability and check_historical_occurrence. The first calculates the probability of winning the big prize with a single ticket. The second checks whether a given combination has occurred in the Canada lottery data set. Since lottery addicts usually play more than one ticket, I will also help them estimate their chances in that case by writing a function that calculates the probability of winning for any number of different tickets.

## Multi-ticket Probability

In [33]:
def multi_ticket_probability(number_of_tickets):
    number_of_combinations = combinations(49,6)
    probability = number_of_tickets / number_of_combinations
    percentage = 100 * probability
    print("The chances of winning are {:.6f}%".format(percentage))

In [35]:
test_inputs = [1, 10, 100, 10000, 1000000, 6991908, 13983816]
for test_input in test_inputs: 
    multi_ticket_probability(test_input)
    print('-----------------------------------------------------')

The chances of winning are 0.000007%
-----------------------------------------------------
The chances of winning are 0.000072%
-----------------------------------------------------
The chances of winning are 0.000715%
-----------------------------------------------------
The chances of winning are 0.071511%
-----------------------------------------------------
The chances of winning are 7.151124%
-----------------------------------------------------
The chances of winning are 50.000000%
-----------------------------------------------------
The chances of winning are 100.000000%
-----------------------------------------------------
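
The probability grows linearly with the number of tickets because different tickets are mutually exclusive outcomes, so their individual probabilities simply add. A hedged sketch of a variant that returns the exact value and rejects out-of-range inputs follows; the names `TOTAL_COMBINATIONS` and `multi_ticket_probability_value` are assumptions made for this example, and Python 3.8 or later is assumed for `math.comb`.

```python
from fractions import Fraction
from math import comb

TOTAL_COMBINATIONS = comb(49, 6)  # 13,983,816 possible tickets

def multi_ticket_probability_value(number_of_tickets):
    """Return the exact win probability for that many distinct tickets."""
    if not 1 <= number_of_tickets <= TOTAL_COMBINATIONS:
        raise ValueError("number_of_tickets must be between 1 and 13,983,816")
    return Fraction(number_of_tickets, TOTAL_COMBINATIONS)

print(multi_ticket_probability_value(6991908))  # 1/2
print(f"{float(multi_ticket_probability_value(100)) * 100:.6f}%")  # 0.000715%
```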


## Fewer Winning Numbers Function

Throughout the project, I have written three main functions: 
    
  1) one_ticket_probability
  
  2) check_historical_occurrence
  
  3) multi_ticket_probability

The last step is to write one more function that lets people calculate the probability of having two, three, four, or five winning numbers. People can still win a prize even if they do not match all six numbers.
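
One way to write the quantity being computed, assuming a ticket of 6 numbers and a draw of 6 numbers out of 49 (the symbol $k$ for the number of matches is introduced here for illustration):

```latex
P(\text{exactly } k \text{ matches})
  = \frac{\binom{6}{k}\,\binom{43}{6-k}}{\binom{49}{6}},
\qquad
P(\text{exactly } 5 \text{ matches})
  = \frac{6 \cdot 43}{13{,}983{,}816}
  = \frac{258}{13{,}983{,}816}
  \approx 0.001845\%
```

The first factor counts the ways to choose which $k$ of the six drawn numbers appear on the ticket, and the second counts the ways to fill the remaining $6-k$ slots from the 43 numbers that were not drawn.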

In [39]:
def probability_less_6(winning_numbers):
    # Ways to pick the matching numbers from the 6 drawn numbers.
    n_combos_tickets = combinations(6, winning_numbers)
    # Ways to pick the remaining numbers from the 43 that were not drawn.
    n_remaining_combos = combinations(43, 6 - winning_numbers)
    successful = n_combos_tickets * n_remaining_combos
    
    n_total_combos = combinations(49, 6)
    prob = successful / n_total_combos
    
    percentage = 100 * prob
    
    print("The chances of having {} is {:.6f}%".format(winning_numbers, percentage))

In [40]:
for test in [2,3,4,5]:
    probability_less_6(test)
    print('-------------------------------------------')

The chances of having 2 is 13.237803%
-------------------------------------------
The chances of having 3 is 1.765040%
-------------------------------------------
The chances of having 4 is 0.096862%
-------------------------------------------
The chances of having 5 is 0.001845%
-------------------------------------------
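
These figures are for exactly that many winning numbers. As a rough cross-check, the probabilities for zero through six matches should sum to one, and "at least k" can be obtained by adding the relevant cases. The helper names `prob_exactly` and `prob_at_least` below are hypothetical, and Python 3.8 or later is assumed for `math.comb`.

```python
from math import comb

TOTAL = comb(49, 6)

def prob_exactly(k):
    """Probability that a ticket matches exactly k of the 6 drawn numbers."""
    return comb(6, k) * comb(43, 6 - k) / TOTAL

def prob_at_least(k):
    """Probability of matching k or more of the 6 drawn numbers."""
    return sum(prob_exactly(j) for j in range(k, 7))

# The seven mutually exclusive cases (0 to 6 matches) cover every ticket.
assert abs(sum(prob_exactly(k) for k in range(7)) - 1) < 1e-12

print(f"{prob_exactly(5) * 100:.6f}%")   # 0.001845%
print(f"{prob_at_least(5) * 100:.6f}%")  # 0.001852%
```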


## Next Steps 

Throughout this project, I created four functions: 
    
   1) one_ticket_probability: Calculates the probability of winning the big 
      prize with a single ticket.
    
   2) check_historical_occurrence: Checks whether a certain combination has 
      occurred in the Canada lottery data set. 
      
   3) multi_ticket_probability: Calculates the probability for any number of 
      tickets between 1 and 13,983,816.
     
   4) probability_less_6: Calculates the probability of having two, three, four 
      or five winning numbers. 

This is the first version of the app, and a second version could improve on it.