### Project: Mobile App Logic for Lottery Addiction

#### Objective: The goal is to build the logic for an app that helps counter lottery addiction. The app lets users estimate their chances of winning the lottery, so they can see for themselves how small those chances are. It covers several scenarios, for example:

1. What is the probability of winning the big prize with a single ticket?
2. What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
3. What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

### Core Functions

#### Factorials and combinations are used throughout the probability calculations, so we start by writing a function for each.

In [1]:
def factorial(n):
    product = 1
    for i in range(n,0,-1):
        product*= i
    return product

In [2]:
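def combinations(n,k):
    # Number of ways to choose k items from n: n! / ((n-k)! * k!)
    numerator = factorial(n)
    denominator = factorial(n-k) * factorial(k)
    
    # True division returns a float, e.g. combinations(49,6) gives 13983816.0
    combination = numerator/denominator
    return combination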
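
As a cross-check on the two helpers above, Python 3.8+ provides `math.comb`, which computes the same quantity with exact integer arithmetic. The sketch below is an alternative and not the notebook's own implementation; the name `combinations_exact` is introduced here purely for illustration.

```python
from math import comb, factorial

def combinations_exact(n, k):
    """Number of ways to choose k items from n, as an exact integer."""
    # Floor division is safe here because the quotient is always a whole number.
    return factorial(n) // (factorial(n - k) * factorial(k))

n_outcomes = combinations_exact(49, 6)
print(n_outcomes)                 # 13983816
print(n_outcomes == comb(49, 6))  # True
```

Staying in integers avoids the trailing `.0` that float division produces and keeps the result exact for larger inputs.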

### One-ticket Probability

In the 6/49 lottery, six numbers are drawn from a set of 49 numbers that range from 1 to 49. A player wins the big prize if the six numbers on their tickets match all the six numbers drawn. 

For the first version of the app, we want players to be able to calculate the probability of winning the big prize with the various numbers they play on a single ticket (for each ticket a player chooses six numbers out of 49). So, we'll start by building a function that calculates the probability of winning the big prize for any given ticket.

In [3]:
def one_ticket_probability(list_ticket_nos):
    total_outcomes = combinations(49,6) # Using the combinations function written above
    n_successful_outcomes = 1 # Only one combination matches all six numbers drawn
    
    p_one_ticket = n_successful_outcomes / total_outcomes
    p_one_ticket_pc = p_one_ticket * 100
    
    print('The chances of your ticket {a} winning the lottery are {b:.7f} %'.format(a = list_ticket_nos, b = (p_one_ticket_pc)))
    
    return p_one_ticket

In [4]:
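import numpy as np

# Quick source of test numbers only: randint's upper bound is exclusive (49 is never
# produced) and it samples with replacement, so duplicates can appear in the output.
np.random.randint(1,49,6)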

array([12, 18, 18, 38, 48, 45])
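
A real 6/49 ticket has six distinct numbers from 1 to 49 inclusive. A minimal sketch of generating such a ticket with the standard library follows; `random_ticket` is a hypothetical helper name, not part of the notebook.

```python
import random

def random_ticket():
    """Return six distinct numbers from 1 to 49, sorted for readability."""
    # range(1, 50) includes 49; sample() draws without replacement.
    return sorted(random.sample(range(1, 50), 6))

ticket = random_ticket()
print(ticket)
```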

#### Testing the function with a sample ticket of numbers between 1 and 49

In [5]:
one_ticket_probability ([34, 48, 44, 10, 35,  3])

The chances of your ticket [34, 48, 44, 10, 35, 3] winning the lottery are 0.0000072 %


7.151123842018516e-08

#### The function is based on the total number of outcomes when 6 numbers are drawn from the numbers 1 to 49. Sampling is done without replacement and the order of the numbers does not matter, so every ticket has the same chance of winning.
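
Written out as a worked equation, using the value 13,983,816 that the `combinations(49,6)` call returns:

$$
P(\text{big prize}) = \frac{1}{\binom{49}{6}} = \frac{6!\,43!}{49!} = \frac{1}{13{,}983{,}816} \approx 7.15 \times 10^{-8} \approx 0.0000072\%
$$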

In [7]:
format(4.311237638482733e-91, '.3g')

'4.31e-91'

### Historical Data Check for Canada Lottery

#### We'll explore historical data from the Canada 6/49 lottery. The data set can be downloaded from Kaggle (link below):

https://www.kaggle.com/datascienceai/lottery-dataset


The data set contains historical data for 3,665 drawings (each row shows data for a single drawing), dating from 1982 to 2018. For each drawing, we can find the six numbers drawn in the following six columns:

    NUMBER DRAWN 1
    NUMBER DRAWN 2
    NUMBER DRAWN 3
    NUMBER DRAWN 4
    NUMBER DRAWN 5
    NUMBER DRAWN 6

In [8]:
import pandas as pd
df_649_canada = pd.read_csv('649.csv')

In [9]:
df_649_canada.shape

(3665, 11)

#### The dataframe has 3,665 rows and 11 columns.

In [10]:
df_649_canada.head(3)

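| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |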


In [11]:
df_649_canada.tail(3)

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


### Function for Historical Data Check

#### We're going to write a function that lets users compare their ticket against the historical lottery data from Canada and see whether it would ever have won.

We will write a function that prints:

    a) The number of times the combination selected occurred in the Canada data set; and
    b) The probability of winning the big prize in the next drawing with that combination.

In [12]:
a = set('abracadabra')

In [13]:
a

{'a', 'b', 'c', 'd', 'r'}

In [14]:
b = set(df_649_canada.iloc[0][4:10])

In [15]:
b

{3, 11, 12, 14, 41, 43}

In [16]:
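def extract_numbers(row):
    # Positions 4 to 9 hold NUMBER DRAWN 1 through NUMBER DRAWN 6
    row = row[4:10]
    # A set ignores the order in which the numbers were drawn
    row = set(row.values)
    return row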

In [17]:
historical_lottery_numbers = df_649_canada.apply(extract_numbers,axis = 1)

In [18]:
print(historical_lottery_numbers.head())

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object


#### We will also write a function to count how often a combination has occurred historically

In [19]:
def count_historical_occurence(user_input_list):
    user_set = set(user_input_list)
    
    def user_input_check(row):
        # True when the drawing's six numbers equal the user's six numbers
        return user_set == row
    
    user_input_matches = historical_lottery_numbers.apply(user_input_check)
    
    # Summing the booleans counts the matching drawings
    user_set_count = user_input_matches.sum()
    return user_set_count
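
The set comparison at the heart of this function can be sketched without pandas. The example below uses only the first three drawings shown in the `head(3)` output earlier; `sample_draws` and `count_matches` are illustrative names, not part of the notebook.

```python
# First three drawings, copied from the head of the data set shown earlier.
sample_draws = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
]

def count_matches(ticket, draws):
    """Count the drawings whose six numbers equal the ticket, ignoring order."""
    ticket_set = set(ticket)
    # Each comparison is True or False; sum() counts the True values.
    return sum(ticket_set == draw for draw in draws)

print(count_matches([1, 6, 39, 23, 24, 27], sample_draws))  # 1
print(count_matches([3, 41, 11, 12, 43, 1], sample_draws))  # 0
```

Because sets are unordered, a ticket matches a drawing no matter what order the user types the numbers in.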

In [20]:
def check_historical_occurence(user_input_list):
    count_of_occurences = count_historical_occurence(user_input_list)
    total_outcomes = combinations(49,6)
    # Drawings are independent, so the chance in the next drawing is always
    # 1 in total_outcomes, however often the combination has occurred before.
    pc_outcomes = (1/total_outcomes)*100
    output_string = ''
    if count_of_occurences == 0:        
        output_string = 'Your ticket combination {a} has not occurred yet. But still there is a {c:.7f}% or 1 in {b} chance of winning'.format(a = user_input_list, b = int(total_outcomes), c = pc_outcomes)
    
    else:
        output_string = 'Your ticket combination {a} has occurred {b} times. Hence you have a {c:.7f}% chance of winning'.format(a = user_input_list, b = int(count_of_occurences), c = pc_outcomes)
    
    return output_string
    
    

#### Checking for test combinations

In [21]:
check_historical_occurence([1, 6, 39, 23, 24, 27])

'Your ticket combination [1, 6, 39, 23, 24, 27] has occurred 1 times. Hence you have a 0.0000072% chance of winning'

In [22]:
check_historical_occurence([3, 41, 11, 12, 43, 1])

'Your ticket combination [3, 41, 11, 12, 43, 1] has not occurred yet. But still there is a 0.0000072% or 1 in 13983816 chance of winning'

#### We created and tested a function that reports how many times a combination has occurred in the historical data, along with its chance of winning. Each drawing is independent and every combination is equally likely, so past occurrences do not change the chance of winning the next drawing: it remains 1 in 13,983,816, or about 0.0000072%.
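
To put the historical check in perspective, the sketch below estimates how likely it is that any one specific combination has come up at least once across the 3,665 drawings in the data set. It assumes fair, independent drawings; the variable names are illustrative.

```python
from math import comb

n_drawings = 3665            # drawings in the data set
p_single = 1 / comb(49, 6)   # chance of a given combination in one drawing

# Complement rule: 1 minus the chance of missing in every single drawing.
p_at_least_once = 1 - (1 - p_single) ** n_drawings

print(f"{p_at_least_once:.4%}")
```

Under these assumptions the result is roughly 0.026%, so almost every ticket a user enters will show zero historical occurrences.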



### Multi-ticket Probability

#### We're going to write a function that will allow the users to calculate the chances of winning for any number of different tickets.

In [23]:
def multi_ticket_probability(n_tickets):
    total_possible_outcomes = combinations(49, 6)
    n_successful_outcomes = n_tickets # Each different ticket covers one distinct outcome
    prob_multi_ticket = n_successful_outcomes/total_possible_outcomes
    prob_multi_ticket_pc = prob_multi_ticket * 100
    combinations_simplified = round(total_possible_outcomes / n_successful_outcomes)
    op_string = 'Your chance of winning the lottery for {a} tickets is {b:.7f}%. In other words, you have a 1 in {c} chance of winning.'.format(
        a = n_successful_outcomes, b = prob_multi_ticket_pc, c = combinations_simplified)
    print('-----------------------------')
    print(op_string)
    print('-----------------------------')
    return prob_multi_ticket_pc # Returned as a percentage, not as a probability

In [24]:
test_ticket_count = [1, 10, 100, 10000, 1000000, 6991908, 13983816]

In [25]:
for count in test_ticket_count:
    multi_ticket_probability(count)

-----------------------------
Your chance of winning the lottery for 1 tickets is 0.0000072%. In other words, you have a 1 in 13983816 chance of winning.
-----------------------------
-----------------------------
Your chance of winning the lottery for 10 tickets is 0.0000715%. In other words, you have a 1 in 1398382 chance of winning.
-----------------------------
-----------------------------
Your chance of winning the lottery for 100 tickets is 0.0007151%. In other words, you have a 1 in 139838 chance of winning.
-----------------------------
-----------------------------
Your chance of winning the lottery for 10000 tickets is 0.0715112%. In other words, you have a 1 in 1398 chance of winning.
-----------------------------
-----------------------------
Your chance of winning the lottery for 1000000 tickets is 7.1511238%. In other words, you have a 1 in 14 chance of winning.
-----------------------------
-----------------------------
Your chance of winning the lottery for 6991908 tic

#### We created and tested a function for the probability of winning with multiple different tickets in a single drawing. The probability grows in direct proportion to the number of tickets, and a player who bought tickets for all 13,983,816 possible combinations would have a 100% chance of winning.
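
The same calculation can be done with exact fractions, which makes the "1 in N" reading explicit and avoids rounding. This is a sketch using the standard library; `multi_ticket_fraction` is a hypothetical name, and the range check is an added assumption about valid input.

```python
from fractions import Fraction
from math import comb

def multi_ticket_fraction(n_tickets):
    """Exact chance of the big prize with n_tickets different tickets in one drawing."""
    total = comb(49, 6)
    # Assumed validation: you cannot hold fewer than 1 or more than all combinations.
    if not 1 <= n_tickets <= total:
        raise ValueError(f"n_tickets must be between 1 and {total}")
    return Fraction(n_tickets, total)

print(multi_ticket_fraction(40))        # 5/1747977
print(multi_ticket_fraction(6991908))   # 1/2
print(multi_ticket_fraction(13983816))  # 1
```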

### Fewer Winning Numbers: Function

#### So far we have treated a ticket as all-or-nothing: it wins only if all six numbers match, and a single wrong number makes it worthless. We will now write a function that handles fewer than six winning numbers.

In [26]:
'''
## Code discarded post testing, preserved for reference

def probability_less_6(count_winning_nos_exp):
    if count_winning_nos_exp in range(2,6):
        total_target_outcomes = 43 # Based on subtracting 1 correct outcome from 44 possibilities.
        for i in range(count_winning_nos_exp,5):
            total_target_outcomes*= (49-count_winning_nos_exp)
            print('Total target outcomes, post iteration', i, '= ', total_target_outcomes)
        #print(total_target_outcomes)
        total_target_outcomes*= combinations(6, count_winning_nos_exp)
        print('Total target outcomes, post final iteration = ', total_target_outcomes)
        prob_exact_outcomes = total_target_outcomes/combinations(49,6)
        prob_exact_outcomes_pc = prob_exact_outcomes*100
        denominator = round(combinations(49,6)/total_target_outcomes)
        print('Total target outcomes = ', total_target_outcomes )
        print('Total possible outcomes = ', combinations(49,6))
        op_string = 'The probability of your winning the lottery is {a:.6f} %. In other words you have a 1 in {b} chance of winning'.format(a = prob_exact_outcomes_pc, b = denominator )
        print(op_string)
        return prob_exact_outcomes
    else:
        print('Kindly input a count between 2 and 5')
        
'''

In [27]:
def probability_less_6(count_winning_nos_exp):
    if count_winning_nos_exp in range(2,6):
        # Ways to choose the matching numbers from the 6 winning numbers
        total_outcomes_correct_nos = combinations(6, count_winning_nos_exp)
        # Ways to choose the remaining numbers from the 43 non-winning numbers
        total_outcomes_incorrect_nos = combinations(43, (6 - count_winning_nos_exp))
        
        total_target_outcomes = total_outcomes_correct_nos * total_outcomes_incorrect_nos
        
        prob_exact_outcomes = total_target_outcomes/combinations(49,6)
        prob_exact_outcomes_pc = prob_exact_outcomes*100
        denominator = round(combinations(49,6)/total_target_outcomes)
        
        print('Total target outcomes = ', total_target_outcomes )
        print('Total possible outcomes = ', combinations(49,6))
        op_string = 'The probability of your winning the lottery is {a:.6f} %. In other words you have a 1 in {b} chance of winning'.format(a = prob_exact_outcomes_pc, b = denominator )
        print(op_string)
        return prob_exact_outcomes
    else:
        print('Kindly input a count between 2 and 5')

In [28]:
probability_less_6(4)

Total target outcomes =  13545.0
Total possible outcomes =  13983816.0
The probability of your winning the lottery is 0.096862 %. In other words you have a 1 in 1032 chance of winning


0.000968619724401408

In [29]:
test_list = [2,3,4,5]
for i in test_list:
    print('------------------------------------')
    probability_less_6(i)

------------------------------------
Total target outcomes =  1851150.0
Total possible outcomes =  13983816.0
The probability of your winning the lottery is 13.237803 %. In other words you have a 1 in 8 chance of winning
------------------------------------
Total target outcomes =  246820.0
Total possible outcomes =  13983816.0
The probability of your winning the lottery is 1.765040 %. In other words you have a 1 in 57 chance of winning
------------------------------------
Total target outcomes =  13545.0
Total possible outcomes =  13983816.0
The probability of your winning the lottery is 0.096862 %. In other words you have a 1 in 1032 chance of winning
------------------------------------
Total target outcomes =  258.0
Total possible outcomes =  13983816.0
The probability of your winning the lottery is 0.001845 %. In other words you have a 1 in 54201 chance of winning


#### The function above gives the probability of having exactly 2, 3, 4 or 5 winning numbers on a ticket. For k winning numbers, we count the ways to choose k of the 6 winning numbers and the ways to choose the remaining 6 - k numbers from the 43 non-winning numbers (49 - 6), then multiply the two counts to get the number of successful outcomes. Dividing that number by the total number of possible outcomes gives the probability.
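
The objective asks about having at least a given number of winning numbers, which is the sum of the exact-match probabilities from that number up to six. A minimal sketch follows, using `math.comb` in place of the notebook's helpers; `probability_exactly` and `probability_at_least` are hypothetical names.

```python
from math import comb

def probability_exactly(k):
    """Chance that exactly k of a ticket's six numbers are among the six drawn."""
    return comb(6, k) * comb(43, 6 - k) / comb(49, 6)

def probability_at_least(k):
    """Chance that k or more numbers match, summed over the exact counts k..6."""
    return sum(probability_exactly(j) for j in range(k, 7))

for k in (2, 3, 4, 5):
    print(k, f"{probability_at_least(k):.6%}")
```

The exact counts are mutually exclusive, so their probabilities can simply be added.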

### Conclusion

We wrote four main functions for our app:

    one_ticket_probability(): calculates the probability of winning the big prize with a single ticket
    check_historical_occurence(): checks whether a certain combination has occurred in the Canada lottery data set
    multi_ticket_probability(): calculates the probability for any number of different tickets between 1 and 13,983,816
    probability_less_6(): calculates the probability of having exactly two, three, four or five winning numbers