# Development of a Mobile App to Help Lottery Addicts Better Estimate Their Chances of Winning

## 1. Introduction

A medical institute aims to prevent and treat gambling addiction by building a dedicated mobile app that helps lottery addicts better estimate their chances of winning. The institute has a team of engineers building the app. Our role is to help them create the app's logical core and calculate the probabilities.

The team wants us to focus on the __[6/49 lottery](https://en.wikipedia.org/wiki/Lotto_6/49)__ (one of the three national lottery games in Canada) and build functions that enable users to answer the following questions:

 - What is the probability of winning the big prize with a single ticket?
 - What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
 - What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

The institute also wants us to work with historical data from the Canadian 6/49 lottery, available as a __[dataset](https://www.kaggle.com/datasets/datascienceai/lottery-dataset)__ on Kaggle. This dataset has *3665* drawings, dating from *1982* to *2018*.

## 2. Structuring Core Functions

In the introduction we outlined the probability questions that users should be able to answer about playing the lottery. Answering them requires calculating *probabilities* and *combinations* repeatedly throughout the project, so we start by writing the following two functions.

**Function that calculates factorials, called *factorial()*:**
This function takes one input *n* and computes the factorial of *n*.

In [1]:
def factorial(n):
    product = 1
    for i in range(2, (n+1)):
        product *= i
    return product    

**Function that calculates combinations, called *combinations()*:**
This function takes two inputs, *n* and *k*, and outputs the number of combinations when we take *k* objects from a group of *n* objects.

In [2]:
def combinations(n,k):
    numerator = factorial(n)
    denominator = factorial(k) * factorial(n-k)
    # Integer division keeps the result exact; float division can round for large n
    return numerator // denominator
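As a quick cross-check of the two helpers, the number of 6/49 combinations can be compared with `math.comb` from the standard library (available from Python 3.8). This is only a minimal sketch: `combinations_exact` is a hypothetical integer-only variant written for the comparison, not a function from the project.

```python
import math

def combinations_exact(n, k):
    """Hypothetical integer-only variant of combinations(), used for cross-checking."""
    # Floor division on integers stays exact, however large n! becomes
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

total_outcomes = combinations_exact(49, 6)

# Both approaches should agree on the number of possible 6/49 tickets
assert total_outcomes == math.comb(49, 6)
print(total_outcomes)  # 13983816
```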

## 3. Computing One-Ticket Probability

In this section we calculate the probability of winning the big prize. In the *6/49* lottery, six numbers are drawn without replacement from a set of 49 numbers that range from 1 to 49. A player wins the big prize if the six numbers on their ticket match all six numbers drawn.

For the first version of the app, we start by building a function that calculates the probability of winning the big prize for any given ticket. For each ticket a player chooses six numbers out of 49. Because every combination of six numbers is equally likely, the result is the same for every ticket: one successful outcome out of all possible combinations.

We need to keep the following details in mind while writing the function:

 * The user inputs 6 different numbers from 1 to 49.
 * These 6 numbers are collected in a Python list, which serves as the single input to our function.
 * The function should print the probability value in a way that users without any probability training can understand.

In [3]:
list_a = [int(x) for x in input("Enter 6 numbers from 1 to 49 with a comma: ").split(",")]

def one_ticket_probability(list_a):    
    
    print("\nThis function takes a list of 6 numbers from 1 to 49")
    print("as input and prints the probability of winning the lottery")
    print("in percentages\n")
    
    total_possible_outcomes = combinations(49, 6)
    successful_outcomes = 1  # exactly one combination matches all six numbers

    probability = (successful_outcomes / total_possible_outcomes)*100
        
    print("\nChances of winning the lottery for the {} ticket is {:.7f}%".format(list_a, probability))

one_ticket_probability(list_a)      
    

Enter 6 numbers from 1 to 49 with a comma: 33, 36, 37, 39, 8, 41

This function takes a list of 6 numbers from 1 to 49
as input and prints the probability of winning the lottery
in percentages


Chances of winning the lottery for the [33, 36, 37, 39, 8, 41] ticket is 0.0000072%
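A percentage with many leading zeros can be hard to read for users without probability training, so a "1 in N" phrasing may be a useful alternative. The sketch below assumes that presentation and also checks the ticket before reporting the odds. `describe_one_ticket` is a hypothetical helper, not part of the functions above.

```python
import math

def describe_one_ticket(ticket):
    """Hypothetical helper: validate a 6/49 ticket and describe its odds in words."""
    numbers = set(ticket)
    # A valid ticket has exactly six different numbers
    if len(ticket) != 6 or len(numbers) != 6:
        raise ValueError("A ticket needs 6 different numbers.")
    # Every number must lie in the 1..49 range of the game
    if not all(1 <= n <= 49 for n in numbers):
        raise ValueError("Numbers must be between 1 and 49.")
    total_outcomes = math.comb(49, 6)
    return "Your chance of winning the big prize is 1 in {:,}.".format(total_outcomes)

print(describe_one_ticket([33, 36, 37, 39, 8, 41]))
# Your chance of winning the big prize is 1 in 13,983,816.
```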


## 4. Checking Historical Data for Canada Lottery

Next, we want users to be able to compare their ticket against the historical lottery data from *Canada* and determine whether they would have ever won by now. For this we use the dataset mentioned in the introduction.

In [4]:
#Let us import the necessary module
import pandas as pd

The dataset is called `649.csv`. We will save it as a pandas dataframe for further analysis.

In [5]:
Hist_649 = pd.read_csv("649.csv") 
Hist_649.shape

(3665, 11)

The dataset has *11 columns* and *3665 rows*. Let us look at the first and last 3 rows of the dataframe.

In [6]:
Hist_649.head(3)

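|  | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |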


In [7]:
Hist_649.tail(3)

|  | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


## 5. Building a Function for Historical Data Check

In this section our objective is to write a function that enables users to:

 1. Compare their ticket against the historical lottery data from Canada, i.e. count how many times the selected combination occurred in the dataset.
 2. Determine whether they would have ever won by now, and see the probability of winning the big prize in the next drawing with that combination.
 
First, let us look at the columns in the dataset that hold the winning numbers.

In [8]:
Hist_649.iloc[:,[4,5,6,7,8,9]]

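|  | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 |
|---|---|---|---|---|---|---|
| 0 | 3 | 11 | 12 | 14 | 41 | 43 |
| 1 | 8 | 33 | 36 | 37 | 39 | 41 |
| 2 | 1 | 6 | 23 | 24 | 27 | 39 |
| 3 | 3 | 9 | 10 | 13 | 20 | 43 |
| 4 | 5 | 14 | 21 | 31 | 34 | 47 |
| 5 | 8 | 20 | 21 | 25 | 31 | 41 |
| 6 | 18 | 25 | 28 | 33 | 36 | 42 |
| 7 | 7 | 16 | 17 | 31 | 40 | 48 |
| 8 | 5 | 10 | 23 | 27 | 37 | 38 |
| 9 | 4 | 15 | 30 | 37 | 46 | 48 |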


The winning numbers are in the columns at positions *4, 5, 6, 7, 8, 9*. Next we write a function that takes a row of the lottery dataframe and returns a set containing all six winning numbers of that row.

In [9]:
def extract_numbers(row):
    '''
    This function takes a row of the lottery dataframe as input and
    returns a set containing all six winning numbers of that row
    '''
    # Select the winning numbers by position (columns 4 to 9);
    # .iloc avoids the deprecated positional lookup row[4] on a labeled Series
    return set(row.iloc[4:10])

We extract all the winning lottery numbers by using the *extract_numbers()* function in combination with the *DataFrame.apply()* method.

In [10]:
winners = Hist_649.apply(extract_numbers,axis=1)
winners.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

Here we write a function that takes two inputs: a Python list of six user-given numbers and a pandas Series of sets with the winning numbers from the Canadian lottery draws (*winners*). It prints the number of times the combination entered by the user occurred in the past and the probability of winning the big prize in the next drawing with that combination.

In [11]:
list_6 = [int(x) for x in input("Enter 6 numbers from 1 to 49 with a comma: ").split(",")]
def check_historical_occurence(list_6, winners):
    print("\n This function takes in two inputs, a Python list")
    print("containing user given numbers and a pandas Series containing")
    print("sets with winning numbers")
    set_6 = set(list_6)
    # Element-wise comparison: True wherever a past draw equals the user's set
    check_occurence = (set_6 == winners)
    result = check_occurence.sum()

    print("\nNumber of times this combination occurred in the past = ", result)
    
    total_possible_outcomes = combinations(49,6)
        
    # Past draws do not change the odds of the next draw
    probability = (1 / total_possible_outcomes)*100
    
    print("\nChance of winning the lottery for {} in future draw is {:.7f}%".format(list_6, probability))

check_historical_occurence(list_6, winners)

Enter 6 numbers from 1 to 49 with a comma: 33, 36, 37, 39, 8, 41

 This function takes in two inputs, a Python list
containing user given numbers and a pandas Series containing
sets with winning numbers

Number of times this combination occurred in the past =  1

Chance of winning the lottery for [33, 36, 37, 39, 8, 41] in future draw is 0.0000072%
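The same lookup can be sketched without pandas, which may be convenient for the app's logical core. The version below is an illustration under assumptions: it uses only the first five draws shown in the *winners* output above, and `count_past_wins` is a hypothetical helper name.

```python
from collections import Counter

# The first five historical draws, copied from the winners output above
past_draws = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
    {3, 9, 10, 13, 20, 43},
    {5, 14, 21, 31, 34, 47},
]

# frozenset is hashable, so each combination can serve as a Counter key
draw_counts = Counter(frozenset(draw) for draw in past_draws)

def count_past_wins(ticket, counts):
    """Hypothetical helper: how often the ticket's combination was drawn."""
    # The order of the numbers on the ticket does not matter
    return counts[frozenset(ticket)]

print(count_past_wins([33, 36, 37, 39, 8, 41], draw_counts))  # 1
print(count_past_wins([1, 2, 3, 4, 5, 6], draw_counts))       # 0
```

Building the counter once makes every later lookup a single dictionary access, instead of a comparison against every past draw.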


## 6. Building a Multi-ticket Probability Function

Lottery addicts usually play more than one ticket on a single drawing, thinking this might increase their chances of winning. Here we write a function that allows users to calculate the chances of winning for any number of different tickets.

For this function:

 - The input should be the number of different tickets the user wants to play (without inputting the specific combinations they intend to play)
 
 - The output should be the information about the probability of winning the big prize depending on the number of different tickets played.
 
*multi_ticket_probability()* prints the probability of winning the big prize depending on the number of different tickets played.

In [12]:
tickets = [int(x) for x in input("Enter number of different tickets with a comma: ").split(",")]

def multi_ticket_probability(tickets):
    '''This function takes a list with numbers of different
    tickets as input and prints the probability of winning
    the lottery for each of them'''
    # The number of possible outcomes is the same for every ticket count
    total_possible_outcomes = combinations(49,6)
    for t in tickets:
        # Each different ticket adds one successful outcome
        successful_outcomes = t
        probability = (successful_outcomes / total_possible_outcomes) * 100
    
        print("\nChances of Winning the Lottery for {} Tickets is {:.7f}%".format(t, probability))

multi_ticket_probability(tickets)

Enter number of different tickets with a comma: 1, 10, 100, 10000, 1000000, 6991908, 13983816

Chances of Winning the Lottery for 1 Tickets is 0.0000072%

Chances of Winning the Lottery for 10 Tickets is 0.0000715%

Chances of Winning the Lottery for 100 Tickets is 0.0007151%

Chances of Winning the Lottery for 10000 Tickets is 0.0715112%

Chances of Winning the Lottery for 1000000 Tickets is 7.1511238%

Chances of Winning the Lottery for 6991908 Tickets is 50.0000000%

Chances of Winning the Lottery for 13983816 Tickets is 100.0000000%
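The results above can also be read in reverse: how many different tickets would a player need to reach a given chance of winning? The following is a rough sketch of that calculation. `tickets_needed` and `TOTAL_OUTCOMES` are hypothetical names, and `math.comb` (Python 3.8+) stands in for the *combinations()* helper.

```python
import math
from fractions import Fraction

TOTAL_OUTCOMES = math.comb(49, 6)  # number of possible 6/49 tickets

def tickets_needed(target_percent):
    """Hypothetical helper: smallest number of different tickets whose
    chance of winning the big prize reaches target_percent."""
    # Fraction keeps the arithmetic exact, so the ceiling is never off by one
    target = Fraction(str(target_percent)) / 100
    if not 0 < target <= 1:
        raise ValueError("target_percent must be above 0 and at most 100.")
    return math.ceil(target * TOTAL_OUTCOMES)

print(tickets_needed(50))   # 6991908
print(tickets_needed(100))  # 13983816
```

Even a 1% chance requires well over a hundred thousand different tickets on a single drawing.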


## 7. Building a Function for Fewer Winning Numbers

In this section we build a function that allows users to calculate the probability of having exactly *two, three, four, or five* winning numbers on a ticket.

In most 6/49 lotteries there are smaller prizes if a player's ticket matches *two, three, four, or five* of the *six* numbers drawn. Consequently, users might be interested in knowing the probabilities of matching that many numbers.

- The input to the function is one or more integers from 2 to 5, each representing the number of winning numbers expected on a given ticket
   
- The function prints the probability of having exactly that many winning numbers
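The counting argument behind this function can be written as a formula. The symbols are introduced here for illustration only: $X$ is the number of matched numbers on a ticket and $x$ is the value entered by the user. A ticket matches exactly $x$ numbers when $x$ of its numbers come from the six drawn and the remaining $6 - x$ come from the 43 numbers that were not drawn.

```latex
P(X = x) = \frac{\binom{6}{x}\,\binom{43}{6 - x}}{\binom{49}{6}}, \qquad x \in \{2, 3, 4, 5\}
```

For example, $x = 5$ gives $6 \cdot 43 = 258$ successful outcomes out of $13{,}983{,}816$.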

In [13]:
list_x = [int(x) for x in input("Enter 'x' numbers from 2 to 5 with a comma: ").split(",")]

def probability_less_6(list_x):
    
    print("\nThis program takes one or more integers from 2 to 5 as input and")
    print("prints the probability of winning depending upon the value")
    print("of the integer")
    
    for x in list_x:
        # Ways to pick x of the 6 winning numbers
        no_of_combinations = combinations(6,x)
        # Ways to fill the remaining places from the 43 non-winning numbers
        possible_successful_outcomes = combinations((49-6),(6-x))
        total_successful_outcomes = possible_successful_outcomes * no_of_combinations
    
        probability = (total_successful_outcomes / combinations(49,6))*100
    
        print("\nChances of having exactly {} winning numbers in a given ticket is {:.6f}%".format(x, probability))
    
probability_less_6(list_x)

Enter 'x' numbers from 2 to 5 with a comma: 2,3,4,5

This program takes one or more integers from 2 to 5 as input and
prints the probability of winning depending upon the value
of the integer

Chances of having exactly 2 winning numbers in a given ticket is 13.237803%

Chances of having exactly 3 winning numbers in a given ticket is 1.765040%

Chances of having exactly 4 winning numbers in a given ticket is 0.096862%

Chances of having exactly 5 winning numbers in a given ticket is 0.001845%
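The question in the introduction asks about *at least* a given number of winning numbers, while the figures above are for an exact count. One way to bridge the two is to sum the exact probabilities from the requested count up to six. The sketch below assumes that approach. `probability_exactly` and `probability_at_least` are hypothetical helpers, and `math.comb` (Python 3.8+) stands in for *combinations()*.

```python
import math
from fractions import Fraction

def probability_exactly(x):
    """Hypothetical helper: exact probability of matching exactly x numbers."""
    # x numbers from the 6 drawn, the other 6 - x from the 43 not drawn
    successful = math.comb(6, x) * math.comb(43, 6 - x)
    return Fraction(successful, math.comb(49, 6))

def probability_at_least(x):
    """Hypothetical helper: probability of matching x or more numbers."""
    # Matching exactly x, x + 1, ..., 6 numbers are mutually exclusive events
    return sum(probability_exactly(k) for k in range(x, 7))

for x in (2, 3, 4, 5):
    percent = float(probability_at_least(x)) * 100
    print("At least {} winning numbers: {:.6f}%".format(x, percent))
```

Using `Fraction` keeps the sums exact; the conversion to a float happens only when the value is printed.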


## 8. Conclusions

In this project we assisted a medical institute in building an app that helps prevent and treat lottery addiction. We helped create the logical core of the app by building functions that calculate probabilities.

We worked with historical data from the 6/49 lottery (one of the three national lottery games in Canada).

We answered the following three questions:

 1. What is the probability of winning the big prize with a single ticket?
     - We built a function *one_ticket_probability()* that takes a list of 6 numbers from 1 to 49 and prints the probability of winning the lottery as a percentage. For any given 6 numbers, the probability of winning the big prize is about *7.15e-6%* (1 in 13,983,816).
         - We also compared the user-chosen ticket against the historical lottery data from Canada and reported its winning chances with a function called *check_historical_occurence()*.
         
 2. What is the probability of winning the big prize if we play any number of different tickets?
     - We built a *multi_ticket_probability()* function that takes in the number of different tickets the user wants to play and prints the chances of winning the big prize depending on the number of different tickets played.
     
 3. What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?
     - We built a function *probability_less_6()* which takes one or more integers from 2 to 5 as input and prints the probability of having exactly that many winning numbers on a ticket. The probability of *at least* that many is the sum of the exact probabilities from that number up to six.