## Project: Mobile App for Lottery Addiction

In this project, we use our knowledge of probability and combinatorics to simulate our contribution to a mobile app that helps people fight lottery addiction by showing them their actual chances of winning. We focus on the 6/49 lottery and build functions that answer the following questions:

* What is the probability of winning the big prize with a single ticket?
* What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
* What is the probability of having exactly five (or four, three, or two) winning numbers on a single ticket?

Throughout the project we will use the historical data coming from the Canada 6/49 lottery. The data set can be downloaded from [Kaggle](https://www.kaggle.com/datasets/datascienceai/lottery-dataset).

![Image](https://images.unsplash.com/photo-1609017909889-d7b582c072f7?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=2069&q=80)
_Photo by Naser Tamimi on Unsplash_

### Core Functions

In [1]:
import pandas as pd

In [2]:
lottery=pd.read_csv("C:/Users/Denisa/Desktop/Project Apps/project 14/649.csv")

Throughout the project, we'll need to calculate probabilities and combinations repeatedly, so we'll start by writing two functions that we'll use often:

* A function that calculates factorials;
* A function that calculates combinations.
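
For reference, these are the standard definitions the two functions are meant to implement, where $n$ is the size of the pool and $k$ is how many items are chosen without regard to order:

$$
n! = n \times (n-1) \times \dots \times 2 \times 1, \qquad \binom{n}{k} = \frac{n!}{k!\,(n-k)!}
$$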

In [3]:
#a function that calculates factorials 
def factorial(n):
    result=1
    for i in range(1,n+1):
        result*=i
    return result


In [4]:
#a function that calculates combinations (n choose k)
def combinations(n, k):
    # integer division keeps the result an exact integer instead of a float
    return factorial(n) // (factorial(k) * factorial(n - k))
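
As a sanity check, the hand-written helpers can be compared against `math.comb` from the standard library (available from Python 3.8). This is a minimal sketch: `combinations_exact` is a hypothetical name used only here, and it uses integer division so the result stays an exact integer.

```python
import math


def factorial(n):
    # product of all integers from 1 to n (returns 1 for n = 0)
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result


def combinations_exact(n, k):
    # hypothetical variant: integer division avoids floating-point results
    return factorial(n) // (factorial(k) * factorial(n - k))


# both approaches should agree on the number of possible 6/49 tickets
assert combinations_exact(49, 6) == math.comb(49, 6)
print(math.comb(49, 6))  # 13983816
```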

### One-ticket Probability

We will write a function that calculates the probability of winning the big prize for any given ticket. In the 6/49 lottery, six numbers are drawn from a set of 49 numbers that range from 1 to 49. A player wins the big prize if the six numbers on their ticket match all six numbers drawn.
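
Because every six-number combination is equally likely and exactly one of them wins, the calculation reduces to a single fraction:

$$
P(\text{big prize}) = \frac{1}{\binom{49}{6}} = \frac{6!\,43!}{49!} = \frac{1}{13{,}983{,}816} \approx 0.0000072\%
$$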

In [5]:
# takes in a list of six unique numbers and prints the probability of winning
def one_ticket_probability(ticket):
    number_combinations = combinations(49, 6)
    probability = 1 / number_combinations
    percentage = probability * 100
    print('Chances of winning with the numbers {} are {:.7f}%, which means your chances to win are 1 in {:,}'.
          format(ticket, percentage, int(number_combinations)))

In [6]:
#test input 1
one_ticket_probability([2, 43, 22, 23, 11, 5])

Chances of winning with the numbers [2, 43, 22, 23, 11, 5] are 0.0000072%, which means your chances to win are 1 in 13,983,816


In [7]:
#test input 2
one_ticket_probability([9, 26, 41, 7, 15, 6])

Chances of winning with the numbers [9, 26, 41, 7, 15, 6] are 0.0000072%, which means your chances to win are 1 in 13,983,816


### Historical Data Check for Canada Lottery

We want to let users compare their ticket against the historical Canada lottery data and see whether it would ever have won.

In [8]:
lottery.shape

(3665, 11)

In [9]:
# preview the first five rows; only the first five columns are shown to keep the output narrow
lottery.iloc[:, :5].head()

   PRODUCT  DRAW NUMBER  SEQUENCE NUMBER  DRAW DATE  NUMBER DRAWN 1
0      649            1                0  6/12/1982               3
1      649            2                0  6/19/1982               8
2      649            3                0  6/26/1982               1
3      649            4                0   7/3/1982               3
4      649            5                0  7/10/1982               5

The data set contains historical data for 3,665 drawings (each row shows data for a single drawing), dating from 1982 to 2018. For each drawing, the six numbers drawn are stored in the following six columns:

* `NUMBER DRAWN 1`
* `NUMBER DRAWN 2`
* `NUMBER DRAWN 3`
* `NUMBER DRAWN 4`
* `NUMBER DRAWN 5`
* `NUMBER DRAWN 6`

### Function for Historical Data Check

In [10]:
#extract the six winning numbers from each row of the historical data set
def extract_numbers(row):
    # positions 4 to 9 hold NUMBER DRAWN 1 through NUMBER DRAWN 6
    return row.iloc[4:10].tolist()

winning = lottery.apply(extract_numbers, axis=1)

In [11]:
winning

0        [3, 11, 12, 14, 41, 43]
1        [8, 33, 36, 37, 39, 41]
2         [1, 6, 23, 24, 27, 39]
3         [3, 9, 10, 13, 20, 43]
4        [5, 14, 21, 31, 34, 47]
                  ...           
3660    [10, 15, 23, 38, 40, 41]
3661    [19, 25, 31, 36, 46, 47]
3662     [6, 22, 24, 31, 32, 34]
3663     [2, 15, 21, 31, 38, 49]
3664    [14, 24, 31, 35, 37, 48]
Length: 3665, dtype: object

In [12]:
#checks whether a certain combination has occurred in the Canada lottery data set
def check_historical_occurrence(user_numbers, historical_numbers):
    # compare as sets, because the order of the numbers on a ticket doesn't matter
    user_set = set(user_numbers)

    n_occurrences = 0
    for comb in historical_numbers:
        if set(comb) == user_set:
            n_occurrences += 1

    if n_occurrences == 0:
        print('''The combination {} has never occurred.
This doesn't mean it's more likely to occur now. Your chances to win the big prize in the next drawing using the combination {} are 0.0000072%.
In other words, you have a 1 in 13,983,816 chance to win.'''.format(user_numbers, user_numbers))

    else:
        print('''The number of times combination {} has occurred in the past is {}.
Your chances to win the big prize in the next drawing using the combination {} are 0.0000072%.
In other words, you have a 1 in 13,983,816 chance to win.'''.format(user_numbers, n_occurrences,
                                                                            user_numbers))

In [13]:
#Testing the function with a few inputs
test_input_1 = [33, 36, 37, 39, 8, 41]
check_historical_occurrence(test_input_1, winning)

The number of times combination [33, 36, 37, 39, 8, 41] has occurred in the past is 1.
Your chances to win the big prize in the next drawing using the combination [33, 36, 37, 39, 8, 41] are 0.0000072%.
In other words, you have a 1 in 13,983,816 chance to win.


In [14]:
test_input_2 = [3, 2, 44, 22, 1, 44]
check_historical_occurrence(test_input_2, winning)

The combination [3, 2, 44, 22, 1, 44] has never occurred.
This doesn't mean it's more likely to occur now. Your chances to win the big prize in the next drawing using the combination [3, 2, 44, 22, 1, 44] are 0.0000072%.
In other words, you have a 1 in 13,983,816 chance to win.
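
A 6/49 ticket is an unordered set of six distinct numbers from 1 to 49, so user input could be validated before it is compared with past drawings. The sketch below is one possible way to do that. `is_valid_ticket` and `count_occurrences` are hypothetical helpers, not part of the notebook, and `sample_history` holds only the first three drawings from the `winning` output.

```python
def is_valid_ticket(ticket):
    """Hypothetical helper: True if ticket holds six distinct integers from 1 to 49."""
    return (
        len(ticket) == 6
        and len(set(ticket)) == 6
        and all(isinstance(n, int) and 1 <= n <= 49 for n in ticket)
    )


def count_occurrences(ticket, historical_numbers):
    """Count past drawings whose numbers match the ticket, ignoring order."""
    target = set(ticket)
    return sum(1 for drawn in historical_numbers if set(drawn) == target)


# first three drawings taken from the historical data shown earlier
sample_history = [
    [3, 11, 12, 14, 41, 43],
    [8, 33, 36, 37, 39, 41],
    [1, 6, 23, 24, 27, 39],
]

print(is_valid_ticket([33, 36, 37, 39, 8, 41]))  # True
print(is_valid_ticket([3, 2, 44, 22, 1, 44]))    # False: 44 appears twice
print(count_occurrences([33, 36, 37, 39, 8, 41], sample_history))  # 1
```

Returning a count instead of printing makes the check easy to test and leaves the wording of the message to the app.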


### Multi-ticket Probability

Lottery addicts usually play more than one ticket on a single drawing, thinking that this might increase their chances of winning significantly. Our purpose is to help them better estimate their chances of winning for any number of different tickets. The user inputs the number of different tickets they want to play, and the `multi_ticket_probability` function calculates the total number of possible outcomes and then prints the probability of winning the big prize with that many different tickets.

In [15]:
#function that calculates the probability of winning the big prize depending on the number of different tickets played
def multi_ticket_probability(n_tickets):

    n_combinations = combinations(49, 6)

    probability = n_tickets / n_combinations
    percentage_form = probability * 100

    if n_tickets == 1:
        print('''Your chances to win the big prize with one ticket are {:.6f}%.
In other words, you have a 1 in {:,} chance to win.'''.format(percentage_form, int(n_combinations)))

    else:
        combinations_simplified = round(n_combinations / n_tickets)
        print('''Your chances to win the big prize with {:,} different tickets are {:.6f}%.
In other words, you have a 1 in {:,} chance to win.'''.format(n_tickets, percentage_form,
                                                              combinations_simplified))

In [16]:
#Testing the function using the following inputs
test_inputs = [1, 10, 100, 10000, 1000000, 6991908, 13983816]

for test_input in test_inputs:
    multi_ticket_probability(test_input)
    print('------------------------') # output delimiter

Your chances to win the big prize with one ticket are 0.000007%.
In other words, you have a 1 in 13,983,816 chance to win.
------------------------
Your chances to win the big prize with 10 different tickets are 0.000072%.
In other words, you have a 1 in 1,398,382 chance to win.
------------------------
Your chances to win the big prize with 100 different tickets are 0.000715%.
In other words, you have a 1 in 139,838 chance to win.
------------------------
Your chances to win the big prize with 10,000 different tickets are 0.071511%.
In other words, you have a 1 in 1,398 chance to win.
------------------------
Your chances to win the big prize with 1,000,000 different tickets are 7.151124%.
In other words, you have a 1 in 14 chance to win.
------------------------
Your chances to win the big prize with 6,991,908 different tickets are 50.000000%.
In other words, you have a 1 in 2 chance to win.
------------------------
Your chances to win the big prize with 13,983,816 different tickets are 100.000000%.
In other words, you have a 1 in 1 chance to win.
------------------------
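
The printed "1 in N" figures are rounded. If the app needs the exact value, for example to compare two playing strategies, the probability could be returned as a fraction instead. This is a minimal sketch that assumes Python 3.8+ for `math.comb`. `multi_ticket_fraction` and `TOTAL_OUTCOMES` are hypothetical names, not part of the notebook.

```python
from fractions import Fraction
from math import comb

TOTAL_OUTCOMES = comb(49, 6)  # 13,983,816 possible six-number combinations


def multi_ticket_fraction(n_tickets):
    """Exact probability of winning the big prize with n_tickets different tickets.

    Hypothetical helper: returns a Fraction instead of printing a message.
    """
    if not 1 <= n_tickets <= TOTAL_OUTCOMES:
        raise ValueError(
            "n_tickets must be between 1 and {:,}".format(TOTAL_OUTCOMES)
        )
    return Fraction(n_tickets, TOTAL_OUTCOMES)


print(multi_ticket_fraction(6991908))   # 1/2
print(multi_ticket_fraction(13983816))  # 1
```

The range check also rejects inputs such as 0 or more tickets than there are combinations, which the print-based function would accept silently.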

### Function for Fewer Winning Numbers

Next, we'll write one more function that lets users calculate the probability of having exactly two, three, four, or five winning numbers, because most 6/49 lotteries award smaller prizes when a player's ticket matches two, three, four, or five of the six numbers drawn. The function takes in an integer between 2 and 5 and prints information about the chances of winning for that number of matches.
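
The count of successful outcomes follows from two independent choices: which $k$ of the six drawn numbers appear on the ticket, and which $6-k$ of the remaining 43 numbers fill the rest of it. Under that reasoning, the probability of exactly $k$ matches is:

$$
P(k) = \frac{\binom{6}{k}\,\binom{43}{6-k}}{\binom{49}{6}}
$$

For example, with $k = 5$ this gives $\frac{6 \times 43}{13{,}983{,}816} = \frac{258}{13{,}983{,}816} \approx 0.001845\%$.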

In [17]:

#function that calculates the probability of matching exactly 2, 3, 4, or 5 winning numbers
def probability_less_6(n_winning_numbers):

    # ways to pick the matching numbers from the six drawn
    n_combinations_ticket = combinations(6, n_winning_numbers)
    # ways to pick the non-matching numbers from the 43 not drawn
    n_combinations_remaining = combinations(43, 6 - n_winning_numbers)
    successful_outcomes = n_combinations_ticket * n_combinations_remaining

    n_combinations_total = combinations(49, 6)
    probability = successful_outcomes / n_combinations_total

    probability_percentage = probability * 100
    combinations_simplified = round(n_combinations_total / successful_outcomes)
    print('''Your chances of having {} winning numbers with this ticket are {:.6f}%.
In other words, you have a 1 in {:,} chance to win.'''.format(n_winning_numbers, probability_percentage,
                                                              int(combinations_simplified)))

In [18]:
#Testing the function on all possible inputs
for test_input in [2, 3, 4, 5]:
    probability_less_6(test_input)
    print('--------------------------') # output delimiter

Your chances of having 2 winning numbers with this ticket are 13.237803%.
In other words, you have a 1 in 8 chance to win.
--------------------------
Your chances of having 3 winning numbers with this ticket are 1.765040%.
In other words, you have a 1 in 57 chance to win.
--------------------------
Your chances of having 4 winning numbers with this ticket are 0.096862%.
In other words, you have a 1 in 1,032 chance to win.
--------------------------
Your chances of having 5 winning numbers with this ticket are 0.001845%.
In other words, you have a 1 in 54,201 chance to win.
--------------------------
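
The function above reports the chance of an exact number of matches. A player may also want the chance of matching a given number or more, which is the sum of the exact-match probabilities. This is a minimal sketch that assumes Python 3.8+ for `math.comb`. `probability_exactly` and `probability_at_least` are hypothetical helpers, not part of the notebook.

```python
from fractions import Fraction
from math import comb


def probability_exactly(k):
    """Exact probability that a ticket matches exactly k of the six numbers drawn."""
    return Fraction(comb(6, k) * comb(43, 6 - k), comb(49, 6))


def probability_at_least(k):
    """Hypothetical helper: probability of matching k or more numbers."""
    return sum(probability_exactly(j) for j in range(k, 7))


# the seven cases (0 to 6 matches) cover every outcome, so they sum to 1
assert probability_at_least(0) == 1

print(probability_exactly(5))   # 43/2330636
print(probability_at_least(5))  # 37/1997688
```

Using `Fraction` keeps the results exact, which makes the sum-to-one check reliable. Converting with `float()` gives the percentages the app would display.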


### Conclusion

For the first version of the app, we coded four main functions that answer the questions from the beginning:

* `one_ticket_probability()`: calculates the probability of winning the big prize with a single ticket
* `check_historical_occurrence()`: checks whether a certain combination has occurred in the Canada lottery data set
* `multi_ticket_probability()`: calculates the probability of winning the big prize for any number of different tickets between 1 and 13,983,816
* `probability_less_6()`: calculates the probability of having exactly two, three, four, or five winning numbers