# Lottery probability app development
---

This project focuses on the 6/49 lottery and builds functions that let users answer questions like:

* What is the probability of winning the big prize with a single ticket?
* What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
* What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?
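
As a quick reference for the questions above, the standard counting argument for a 6-from-49 draw (order ignored, no repeated numbers) can be sketched as follows. The symbols $n$ (number of distinct tickets played) and $k$ (number of matching numbers on one ticket) are notation introduced here for illustration.

$$
\binom{49}{6} = \frac{49!}{6!\,43!} = 13{,}983{,}816,
\qquad
P(\text{big prize with } n \text{ tickets}) = \frac{n}{\binom{49}{6}},
\qquad
P(\text{exactly } k \text{ matches}) = \frac{\binom{6}{k}\binom{43}{6-k}}{\binom{49}{6}}
$$

An "at least $k$" probability is the sum of the "exactly" terms from $k$ up to 6.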

We will consider historical data from the national 6/49 lottery game in Canada.

**Primary Goal:**
Prototype probability functionality for app development.


In [2]:
import pandas as pd
import numpy as np 

In [3]:

def factorial(n: int) -> int:
    """Return n! for a non-negative integer n."""
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product

def combinations(n: int, k: int) -> int:
    """Return the number of ways to choose k items from n, ignoring order."""
    numerator = factorial(n)
    n_sub_k = n - k
    denominator = factorial(k) * factorial(n_sub_k)
    # Floor division keeps the arithmetic in exact integers; float division
    # can lose precision when the factorials are large.
    return numerator // denominator
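
One way to gain confidence in hand-rolled helpers like these is to compare them with the standard library. The sketch below re-declares minimal versions of `factorial` and `combinations` so that it runs on its own; `math.factorial` and `math.comb` are standard-library functions (Python 3.8+), and `total_tickets` is a name introduced here for illustration.

```python
import math

def factorial(n: int) -> int:
    """Iterative factorial for non-negative integers."""
    product = 1
    for i in range(2, n + 1):
        product *= i
    return product

def combinations(n: int, k: int) -> int:
    """n choose k using exact integer arithmetic."""
    return factorial(n) // (factorial(k) * factorial(n - k))

# Compare against the standard library over every valid (n, k) up to n = 49.
for n in range(0, 50):
    assert factorial(n) == math.factorial(n)
    for k in range(0, n + 1):
        assert combinations(n, k) == math.comb(n, k)

total_tickets = combinations(49, 6)
print(total_tickets)  # 13983816
```

If the checks pass, `math.comb` could arguably replace both helpers in the prototype.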

In [216]:
"""
The user input is a list of 6 numbers,
passed as the single input to the function.
The probability is printed in a user-friendly way (i.e. no stats training needed).
We display the probability and also return it as a float.
"""
from typing import Optional

def one_tix_probabiltiy(n: list) -> Optional[float]:
    if len(n) != 6:
        print('You must enter exactly 6 numbers')
        return None
    n_len = len(n)
    total_comb = combinations(49, n_len)
    success_outcome = 1
    prob = success_outcome / total_comb
    # Print the friendly message first, then return the raw probability.
    print(f'Your odds are {success_outcome} in {total_comb}')
    return prob

These functions are generic. Next, we compare tickets against prior draw data to give users historical insight.

In [5]:
raw_data = pd.read_csv('649.csv')
data = (
    raw_data
    .rename(columns= lambda c: c.lower().replace(" ","_"))
    .assign(draw_date=pd.to_datetime(raw_data['DRAW DATE']))
)
#TO DO: left off at cleaned data


In [6]:
data.head()

Unnamed: 0,product,draw_number,sequence_number,draw_date,number_drawn_1,number_drawn_2,number_drawn_3,number_drawn_4,number_drawn_5,number_drawn_6,bonus_number
0,649,1,0,1982-06-12,3,11,12,14,41,43,13
1,649,2,0,1982-06-19,8,33,36,37,39,41,9
2,649,3,0,1982-06-26,1,6,23,24,27,39,34
3,649,4,0,1982-07-03,3,9,10,13,20,43,34
4,649,5,0,1982-07-10,5,14,21,31,34,47,45
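
The cleaning step above can be tried without the CSV file. The sketch below rebuilds two draws from the `head()` output in memory and applies the same rename-then-parse pattern. The upper-case, space-separated header names and the date string format are assumptions (only `'DRAW DATE'` appears in the original cell), and `raw` and `clean` are names introduced here.

```python
import pandas as pd

# Two draws copied from the head() output above. Header style and date
# format are assumed, not taken from the real CSV.
raw = pd.DataFrame({
    "DRAW NUMBER": [1, 2],
    "DRAW DATE": ["1982-06-12", "1982-06-19"],
    "NUMBER DRAWN 1": [3, 8],
    "BONUS NUMBER": [13, 9],
})

clean = (
    raw
    .rename(columns=lambda c: c.lower().replace(" ", "_"))
    # The callable receives the already-renamed frame, so the chain does
    # not need to reach back to `raw`.
    .assign(draw_date=lambda df: pd.to_datetime(df["draw_date"]))
)

print(list(clean.columns))
# ['draw_number', 'draw_date', 'number_drawn_1', 'bonus_number']
```

Passing a callable to `assign` is a design choice that keeps the whole chain self-contained; referencing the raw frame directly, as the original cell does, also works as long as the row index is unchanged.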


### The engineering team is developing an app that lets users input a ticket and see its probability of winning compared against previous winning ticket occurrences. We have been asked to write the backend functionality for this specific task. The data will be stored within the app, so we only need to write the code that runs after the data has been read and cleaned (using the processing above).

    * Inside the app, the user inputs six different numbers from 1 to 49.
    * Under the hood, the six numbers will come as a Python list and serve as an input to our function.
    * The engineering team wants us to write a function that prints:
        the number of times the combination selected occurred in the Canada data set; and
        the probability of winning the big prize in the next drawing with that combination.
        
    * A second feature will allow the user to input the number of different tickets they want to play (without inputting the specific combinations they intend to play). 
    * Our input will be an integer between 1 and 13,983,816 (the maximum number of different tickets).
    * The function should print information about the probability of winning the big prize depending on the number of different tickets played.
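
The first feature in the list above (counting how often a combination has already been drawn) can be sketched without any user prompt, which makes it easier to test. This is a minimal illustration, not the app's actual backend: `count_past_wins`, `NUMBER_COLS`, and `draws` are hypothetical names, and the three rows are copied from the `head()` output earlier in the notebook. Tickets are compared as sets so the order in which the user types the numbers does not matter.

```python
import pandas as pd

NUMBER_COLS = [f"number_drawn_{i}" for i in range(1, 7)]

# First three draws, copied from the head() output above.
draws = pd.DataFrame(
    [[3, 11, 12, 14, 41, 43],
     [8, 33, 36, 37, 39, 41],
     [1, 6, 23, 24, 27, 39]],
    columns=NUMBER_COLS,
)

def count_past_wins(ticket: list[int], history: pd.DataFrame) -> int:
    """Count past draws whose six numbers equal the ticket, ignoring order."""
    target = frozenset(ticket)
    valid = (
        len(ticket) == 6
        and len(target) == 6                      # no repeated numbers
        and all(1 <= n <= 49 for n in target)
    )
    if not valid:
        raise ValueError("ticket must hold six different numbers from 1 to 49")
    # Convert each historical draw to a frozenset of plain Python ints.
    past = [frozenset(row) for row in history[NUMBER_COLS].to_numpy().tolist()]
    return sum(draw == target for draw in past)

print(count_past_wins([43, 41, 14, 12, 11, 3], draws))  # 1
print(count_past_wins([1, 2, 3, 4, 5, 6], draws))       # 0
```

Whatever the historical count, the probability of winning the big prize in the next drawing with any single combination stays at 1 in 13,983,816.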


In [229]:

# First we isolate the number columns and add a column to the dataframe holding each draw's full ticket sequence (if performance suffers, we can convert to an np array)

numbers = ['number_drawn_1', 'number_drawn_2', 'number_drawn_3', 'number_drawn_4',
       'number_drawn_5', 'number_drawn_6']

data =(data
       .assign(
              tix_seq = pd.Series([a for a in data[numbers].to_numpy()]) # extracts indiv numbers into array
              )
       )

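def user_ticket() -> np.ndarray:
    """
    Prompt for a comma-separated ticket and return it as a sorted np array.
    """
    user_input = input('enter your ticket number: ').split(",")
    ticket = [int(i) for i in user_input]

    # Sort so the ticket can be compared element-wise with tix_seq, whose
    # draws appear in ascending order in the data shown above.
    return np.sort(np.array(ticket))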

  
    
        
def tix_freq_counts():
    """
    Compare the user's ticket with every draw in the tix_seq column.

    Returns the unique match flags (False/True) and how often each occurs.
    """
    user_tix = user_ticket()

    # One boolean per historical draw: True where the draw equals the ticket.
    bool_arr = np.array([np.array_equal(seq, user_tix) for seq in data['tix_seq']])
    freq_count = np.unique(bool_arr, return_counts=True)
    return freq_count

def tix_prob() -> tuple:
    """
    Summarize the ticket frequency counts.

    Returns (number of past draws the ticket won, total number of draws,
    fraction of past draws the ticket won).
    """
    values, counts = tix_freq_counts()
    # values holds the distinct flags that occurred (False and/or True).
    # Select the count that belongs to True; this is 0 if the ticket never matched.
    true_freq = int(counts[values].sum())

    total_count = int(counts.sum())
    tix_odds = true_freq / total_count
    return true_freq, total_count, tix_odds

def user_odds():
    """
    Call tix_prob() and report the result in plain language.

    If the ticket matched at least one past draw, report how many of the past draws it won.
    Otherwise return "This ticket has never been a winning ticket".
    """
    true_freq, total_count, win_share = tix_prob()
    if true_freq > 0:
        # This is a historical frequency, not the chance of winning the next draw.
        return (f"This ticket has won {true_freq} of {total_count} past draws "
                f"({round(100 * win_share, 2)}% of draws)")
    return "This ticket has never been a winning ticket"
    
user_odds() #final call for app

'This ticket has never been a winning ticket'

In [215]:
# Probability of winning with a given number of tickets

def user_input_total():
    """
    Read the number of tickets from user input and return it as an int.

    Prints a message and returns None if the input is not a single integer.
    """
    try:
        return int(input())
    except ValueError:
        print('Please enter only one number')
        return None
    
import math

def combinations(n=49, k=6) -> int:
    """
    Calculate the total number of combinations of n numbers taken k at a time.
    Flexible, but defaults to our 6/49 ticket requirements.
    """
    # math.comb (Python 3.8+) returns an exact integer. The np.math alias
    # used previously was removed in NumPy 2.0.
    return math.comb(n, k)
   
def probability():
    """
    Calculate the probability of winning from the number of tickets the user
    plays and the total number of combinations.
    Prints a human-readable probability.
    """
    number_of_tix = user_input_total()
    if number_of_tix is None:
        # user_input_total() has already reported the invalid input.
        return
    total_comb = combinations()
    probability_percent = number_of_tix / total_comb * 100

    print(f'The probability of winning with {number_of_tix} tickets is {probability_percent:f}%')

probability()

The probability of winning with 400 tickets is 0.002860%
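
For the app backend, it may help to separate the arithmetic from the prompt so the calculation can be tested directly. The sketch below assumes that split; `multi_ticket_probability` is a hypothetical name, and the range check mirrors the 1 to 13,983,816 limit stated in the requirements.

```python
import math

def multi_ticket_probability(n_tickets: int, n: int = 49, k: int = 6) -> float:
    """Probability of the big prize when playing n_tickets distinct tickets."""
    total = math.comb(n, k)
    if not 1 <= n_tickets <= total:
        raise ValueError(f"number of tickets must be between 1 and {total}")
    # Each distinct ticket covers exactly one of the equally likely outcomes.
    return n_tickets / total

p = multi_ticket_probability(400)
print(f"The probability of winning with 400 tickets is {p * 100:f}%")
# The probability of winning with 400 tickets is 0.002860%
```

Playing every possible ticket gives `multi_ticket_probability(13983816) == 1.0`, which is a handy boundary check.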



### Functionality to calculate probabilities for two, three, four, or five winning numbers.


* Inside the app, the user inputs:
    > six different numbers from 1 to 49;
    
    > an integer between 2 and 5 that represents the number of winning numbers expected
* Our function prints information about the probability of having the inputted number of winning numbers.
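
The probability described above follows the usual counting argument: choose which `k` of the ticket's six numbers are drawn, then fill the remaining `6 - k` drawn numbers from the 43 numbers not on the ticket. The sketch below is one way to express that, under the assumption that "winning numbers" means exactly `k` matches; `prob_exact_matches` and `prob_at_least` are hypothetical names introduced here.

```python
import math

def prob_exact_matches(k: int, picked: int = 6, pool: int = 49) -> float:
    """Probability that exactly k of the ticket's numbers are drawn."""
    if not 0 <= k <= picked:
        raise ValueError("k must be between 0 and the ticket size")
    matching = math.comb(picked, k)                     # which ticket numbers hit
    non_matching = math.comb(pool - picked, picked - k) # rest of the draw
    return matching * non_matching / math.comb(pool, picked)

def prob_at_least(k: int, picked: int = 6, pool: int = 49) -> float:
    """Probability that k or more of the ticket's numbers are drawn."""
    return sum(prob_exact_matches(j, picked, pool) for j in range(k, picked + 1))

print(f"{prob_exact_matches(5) * 100:f}%")  # 0.001845%
```

The "at least" variant answers the question posed at the top of the notebook, while the "exactly" variant matches the feature described in this section.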



In [242]:
"""
The user's ticket comes from the user_ticket() function.
The number of winning numbers to evaluate is read directly from user input.

"""
## match_odds() calls user_ticket() and prompts for the number of matches, then prints the probability of having exactly 2, 3, 4, or 5 matching numbers

def match_odds() -> None:
    user_tix = user_ticket()
    user_numb = int(input('enter number between 2 and 5: '))
    n_picked = len(user_tix)
    # Ways to choose which of the ticket's numbers match the draw...
    ticket_combinations = combinations(n=n_picked, k=user_numb)
    # ...times the ways to fill the rest of the draw from numbers not on the ticket.
    remaining_combinations = combinations(n=49 - n_picked, k=n_picked - user_numb)
    total_successful_outcomes = ticket_combinations * remaining_combinations
    total_combinations = combinations()

    probability_win_pct = total_successful_outcomes / total_combinations * 100

    print(f'The probability of matching exactly {user_numb} of your {n_picked} numbers is {probability_win_pct:f}%')

match_odds()


The probability of matching exactly 1 of your 6 numbers is 41.301945%
