
# Developing a Mobile App for Lottery Addicts


In this project we simulate a data science team assignment: contributing to a mobile app that helps lottery addicts better estimate their chances of winning.
We will write code that lets users answer questions such as:
* What is the probability of winning the big prize with a single ticket?
* What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
* What is the probability of having at least five (or four, three, or two) winning numbers on a single ticket?


The [dataset](https://www.kaggle.com/datasets/datascienceai/lottery-dataset) used for this project is from the national 6/49 lottery game in Canada.

# Defining Probability Functions

In the 6/49 lottery, six numbers are drawn from a set of 49 numbers that range from 1 to 49. The drawing is done without replacement, so a number cannot appear twice in the same draw.
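
As a quick sanity check on the size of the sample space, the number of unordered six-number draws can be computed directly. This is a minimal sketch that assumes Python 3.8 or later, where `math.comb` is available; the variable names are illustrative and not part of the notebook.

```python
from math import comb

# Number of ways to choose 6 distinct numbers out of 49 when order does not matter
total_outcomes = comb(49, 6)
print(total_outcomes)  # 13983816

# Chance, as a percentage, that a single ticket matches all six numbers
single_ticket_pct = 100 / total_outcomes
print(single_ticket_pct)
```

The hand-written `factorial()` and `combinations()` functions in the next cell should give the same count.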

In [1]:
#def function to find factorial of a non-negative integer n

def factorial(n):
    fact_result = 1 #0! and 1! both equal 1
    
    for i in range(2,n+1):
        
        fact_result*= i
    return fact_result

        
#def function for the number of possible unique combinations (n choose k)

def combinations(n,k):
    
    n_factorial = factorial(n)
    k_factorial = factorial(k)
    
    n_k_factorial = factorial(n-k)
    
    #integer division is exact here and avoids floating-point rounding
    combi = n_factorial//(k_factorial*n_k_factorial)
   
    return combi



In [2]:
#def function for probability of winning the big prize for any ticket

def one_ticket_probability(ticket_numbers):
    
    possible_outcomes = combinations(49,6)
    
    prb_one_ticket_win = 1/possible_outcomes
    
    perc_one_ticket_win = prb_one_ticket_win*100
    
    
    return "Your Chances of Winning the Big Prize with numbers {} is {} %".format(ticket_numbers, perc_one_ticket_win)


one_ticket_probability(8)

'Your Chances of Winning the Big Prize with numbers 8 is 7.151123842018516e-06 %'

In [3]:
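#test with random lottery numbers
#note: randint() excludes `high` and samples with replacement, so these test
#tickets never contain 49 and may repeat a number (see ticket 0 below)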

from numpy.random import seed, randint



for i in range(10):
    seed(i)

    random_number = randint(low= 1,high= 49, size = 6)
 
    print("Ticket Number {}".format(i), "\n", random_number)
    
    print(one_ticket_probability(random_number))
    


Ticket Number 0 
 [45 48  1  4  4 40]
Your Chances of Winning the Big Prize with numbers [45 48  1  4  4 40] is 7.151123842018516e-06 %
Ticket Number 1 
 [38 44 13  9 10 12]
Your Chances of Winning the Big Prize with numbers [38 44 13  9 10 12] is 7.151123842018516e-06 %
Ticket Number 2 
 [41 16 46  9 23 44]
Your Chances of Winning the Big Prize with numbers [41 16 46  9 23 44] is 7.151123842018516e-06 %
Ticket Number 3 
 [43 25  4  9  1 22]
Your Chances of Winning the Big Prize with numbers [43 25  4  9  1 22] is 7.151123842018516e-06 %
Ticket Number 4 
 [47  6  2 41 24  9]
Your Chances of Winning the Big Prize with numbers [47  6  2 41 24  9] is 7.151123842018516e-06 %
Ticket Number 5 
 [36 15 48 39 17 10]
Your Chances of Winning the Big Prize with numbers [36 15 48 39 17 10] is 7.151123842018516e-06 %
Ticket Number 6 
 [11 10 36 21 43 46]
Your Chances of Winning the Big Prize with numbers [11 10 36 21 43 46] is 7.151123842018516e-06 %
Ticket Number 7 
 [48  5 26  4 20 24]
Your Chanc
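
The random test tickets above come from `randint`, which treats `high` as exclusive and samples with replacement, so 49 never appears and a ticket can contain the same number twice (ticket 0 has two 4s). A minimal sketch of drawing valid test tickets follows. It swaps in the standard library's `random.sample` in place of NumPy, and `draw_ticket` is a hypothetical helper name.

```python
import random

def draw_ticket(rng):
    """Return six distinct numbers between 1 and 49 (hypothetical helper)."""
    # range(1, 50) covers 1..49 inclusive; sample() never repeats an element
    return sorted(rng.sample(range(1, 50), 6))

rng = random.Random(0)  # seeded only so the example is repeatable
tickets = [draw_ticket(rng) for _ in range(5)]
for ticket in tickets:
    print(ticket)
```

Because the probability of winning is the same for every valid ticket, this changes only the realism of the test inputs, not the reported percentage.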

# Historical Data Check for 6/49 Canada Lottery 

To let users compare their ticket against historical winning numbers, we will use the same [dataset](https://www.kaggle.com/datasets/datascienceai/lottery-dataset).

## Data Overview


In [4]:
import pandas as pd

lottery_649 = pd.read_csv("649.csv")

lottery_649.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3665 entries, 0 to 3664
Data columns (total 11 columns):
PRODUCT            3665 non-null int64
DRAW NUMBER        3665 non-null int64
SEQUENCE NUMBER    3665 non-null int64
DRAW DATE          3665 non-null object
NUMBER DRAWN 1     3665 non-null int64
NUMBER DRAWN 2     3665 non-null int64
NUMBER DRAWN 3     3665 non-null int64
NUMBER DRAWN 4     3665 non-null int64
NUMBER DRAWN 5     3665 non-null int64
NUMBER DRAWN 6     3665 non-null int64
BONUS NUMBER       3665 non-null int64
dtypes: int64(10), object(1)
memory usage: 315.0+ KB


In [5]:
lottery_649.head()

Unnamed: 0,PRODUCT,DRAW NUMBER,SEQUENCE NUMBER,DRAW DATE,NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER
0,649,1,0,6/12/1982,3,11,12,14,41,43,13
1,649,2,0,6/19/1982,8,33,36,37,39,41,9
2,649,3,0,6/26/1982,1,6,23,24,27,39,34
3,649,4,0,7/3/1982,3,9,10,13,20,43,34
4,649,5,0,7/10/1982,5,14,21,31,34,47,45


In [6]:
#def function to extract all winning six numbers from dataframe

def extract_numbers(row):
    
    numbers = []
    
    for i in row:
        
        numbers.append(i)
        
    reduced_numbers = numbers[4:10]
    
    numbers_set = set(reduced_numbers)
    
    return numbers_set
        
winning_numbers = lottery_649.apply(extract_numbers, axis = 1)

winning_numbers.head()


0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

In [7]:
#def function to check historical occurrence of any lottery numbers

def check_historical_occurence(ticket_numbers_list, winning_sets = winning_numbers):
    ticket_numbers_set = set(ticket_numbers_list)
    
    #compare the ticket against every past draw passed in via winning_sets
    bool_check = ticket_numbers_set == winning_sets
    
    match_count = bool_check.sum()
    
 
   
    possible_outcomes = combinations(49,6)
    
    if match_count == 0:
        prb_winning = 1/possible_outcomes
    else:
        prb_winning = match_count/possible_outcomes
    
    perc_winning = prb_winning*100
    
    text= "Your current lottery numbers combinations have won the big prize {} times in past drawings".format(match_count)
    
    text_2 = "Your Chances of Winning the Big Prize with numbers {} are {} %".format(ticket_numbers_set, perc_winning)
    
    return text, text_2

In [8]:
#test check_historical_occurence function with random lottery numbers

from numpy.random import seed, randint



for i in range(5):
    seed(i)

    random_number = randint(low= 1,high= 49, size = 6)
 
    print("Ticket Number {}".format(i), "\n", random_number)
    
    print(check_historical_occurence(ticket_numbers_list=random_number))

Ticket Number 0 
 [45 48  1  4  4 40]
('Your current lottery numbers combinations have won the big prize 0 times in past drawings', 'Your Chances of Winning the Big Prize with numbers {48, 1, 40, 4, 45} are 7.151123842018516e-06 %')
Ticket Number 1 
 [38 44 13  9 10 12]
('Your current lottery numbers combinations have won the big prize 0 times in past drawings', 'Your Chances of Winning the Big Prize with numbers {38, 9, 10, 44, 13, 12} are 7.151123842018516e-06 %')
Ticket Number 2 
 [41 16 46  9 23 44]
('Your current lottery numbers combinations have won the big prize 0 times in past drawings', 'Your Chances of Winning the Big Prize with numbers {41, 9, 44, 46, 16, 23} are 7.151123842018516e-06 %')
Ticket Number 3 
 [43 25  4  9  1 22]
('Your current lottery numbers combinations have won the big prize 0 times in past drawings', 'Your Chances of Winning the Big Prize with numbers {1, 4, 9, 43, 22, 25} are 7.151123842018516e-06 %')
Ticket Number 4 
 [47  6  2 41 24  9]
('Your current lo

In [9]:
#test a winning number
print(check_historical_occurence(ticket_numbers_list=[34, 5, 14, 47, 21, 31]))

('Your current lottery numbers combinations have won the big prize 1 times in past drawings', 'Your Chances of Winning the Big Prize with numbers {34, 5, 14, 47, 21, 31} are 7.151123842018516e-06 %')


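In the previous step, we reported the probability of winning the big prize for any combination of lottery numbers. If the combination entered by the user has won the big prize before, the function divides the number of past wins by the number of possible outcomes; otherwise it uses 1 divided by the number of possible outcomes. Because each draw is independent of earlier ones, a combination's past wins do not actually change its chance in a future draw, which stays at 1 in 13,983,816.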
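
The matching logic can be illustrated without the full CSV. The sketch below hard-codes the five draws shown in the `lottery_649.head()` output and counts exact matches with plain Python sets. The names `past_draws` and `count_past_wins` are hypothetical, and the probability is kept at 1 / C(49, 6) on the assumption that draws are independent.

```python
from math import comb

# The first five historical draws, copied from the head() output above
past_draws = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
    {3, 9, 10, 13, 20, 43},
    {5, 14, 21, 31, 34, 47},
]

def count_past_wins(ticket_numbers, draws=past_draws):
    """Count how many past draws match the ticket exactly (hypothetical helper)."""
    ticket_set = set(ticket_numbers)
    return sum(ticket_set == draw for draw in draws)

ticket = [34, 5, 14, 47, 21, 31]
print(count_past_wins(ticket))              # 1
print(count_past_wins([1, 2, 3, 4, 5, 6]))  # 0

# Assumption: draws are independent, so the chance in the next draw is unchanged
next_draw_pct = 100 / comb(49, 6)
print(next_draw_pct)
```

Comparing sets makes the check insensitive to the order in which the user types the numbers.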

## Multi-ticket Probability 

In [10]:
#def function for multi-ticket probability 

def multi_ticket_probability(tickets_count):
    possible_outcomes = combinations(49, 6)
    
    successful_outcomes = tickets_count
    
    prb_winning = successful_outcomes/possible_outcomes
    
    perc_winning = prb_winning*100 
    
    text = "Your chances of winning with {} tickets is {} %".format(tickets_count, perc_winning)
    
    return text

In [11]:
#test multi_ticket_probability() function

for i in [1, 10, 100, 10000, 1000000, 6991908, 13983816]:
    
    result = multi_ticket_probability(i)
    
    print(result, "\n")

Your chances of winning with 1 tickets is 7.151123842018516e-06 % 

Your chances of winning with 10 tickets is 7.151123842018517e-05 % 

Your chances of winning with 100 tickets is 0.0007151123842018516 % 

Your chances of winning with 10000 tickets is 0.07151123842018516 % 

Your chances of winning with 1000000 tickets is 7.151123842018517 % 

Your chances of winning with 6991908 tickets is 50.0 % 

Your chances of winning with 13983816 tickets is 100.0 % 



The test above shows that the chance of winning grows in direct proportion to the number of different tickets purchased. As expected, if a user purchases 13,983,816 tickets, which is the maximum number of different tickets, the chance of winning the big prize is 100%.
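
The same relationship can be read in the other direction: given a target probability, how many different tickets are needed? The following is a rough sketch, assuming every ticket purchased is a distinct combination; `tickets_needed` is a hypothetical helper and uses `math.comb` (Python 3.8+) in place of the notebook's `combinations()`.

```python
from math import ceil, comb

TOTAL_OUTCOMES = comb(49, 6)  # 13,983,816 possible tickets

def tickets_needed(target_probability):
    """Smallest number of distinct tickets whose win probability reaches the
    target (hypothetical helper; the target is a fraction between 0 and 1)."""
    if not 0 <= target_probability <= 1:
        raise ValueError("target_probability must be between 0 and 1")
    return ceil(target_probability * TOTAL_OUTCOMES)

print(tickets_needed(0.5))  # 6991908
print(tickets_needed(1.0))  # 13983816
```

These two values agree with the 50% and 100% rows printed by `multi_ticket_probability()` above.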

## Fewer Winning Numbers - Function

In this step we calculate the probability that the player's ticket matches exactly two, three, four or five of the six numbers drawn.

In [12]:

def probability_less_6(number):
    
    #only 2, 3, 4 or 5 matching numbers are supported
    if number >= 2 and number <= 5:
        
        combinations_int_6 = combinations(6,number) #ways to pick the matching numbers out of the 6 drawn
        
        combinations_other = combinations(43,6-number) #ways to pick the remaining numbers out of the 43 not drawn
    
        total_success_outcomes = combinations_int_6*combinations_other
    
        prb_int_winning_numbers = total_success_outcomes/combinations(49,6)
        
        perc_winning_numbers = prb_int_winning_numbers*100
        
        text = "Your chances of having {} out of 6 winning numbers are {:.6f} %".format(number, perc_winning_numbers)
        
        return text
    else:
        return "Please input a number between 2 and 5"
    
     

In [13]:
#test function

for i in range(2,6):
    
    print(probability_less_6(i))

Your chances of having 2 out of 6 winning numbers are 13.237803 %
Your chances of having 3 out of 6 winning numbers are 1.765040 %
Your chances of having 4 out of 6 winning numbers are 0.096862 %
Your chances of having 5 out of 6 winning numbers are 0.001845 %
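
The introduction asks about having at least a given number of winning numbers, which is the sum of the exact-match probabilities from that number up to six. A minimal sketch of both quantities follows, assuming the standard counting argument: choose k of the 6 drawn numbers and the remaining 6 - k from the 43 numbers not drawn. `prob_exactly` and `prob_at_least` are hypothetical helper names, and `math.comb` (Python 3.8+) stands in for the notebook's `combinations()`.

```python
from math import comb

def prob_exactly(k):
    """Probability that a ticket matches exactly k of the 6 drawn numbers."""
    # Choose k of the 6 winning numbers and the other 6 - k from the 43 losing ones
    return comb(6, k) * comb(43, 6 - k) / comb(49, 6)

def prob_at_least(k):
    """Probability of matching k or more numbers (hypothetical helper)."""
    return sum(prob_exactly(j) for j in range(k, 7))

for k in range(2, 6):
    # Columns: k, exact-match percentage, at-least percentage
    print(k, round(prob_exactly(k) * 100, 6), round(prob_at_least(k) * 100, 6))
```

A useful check on the counting is that the exact-match probabilities for k = 0 through 6 add up to 1.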
