# A Mobile App to Help Lottery Addicts Better Estimate Their Chances of Winning
### The Scenario (Fictional)
A medical institute that aims to **prevent and treat gambling addictions** wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. The institute has a team of engineers who will build the app, but they need help with the **logical core of the app and the probability calculations**.

### Our Goal
For this version of the app, we focus on the [6/49 lottery](https://en.wikipedia.org/wiki/Lotto_6/49) and build functions that let users answer questions like:

* What is the probability of winning the big prize with a single ticket?
* What is the probability of winning the big prize if we play **40** different tickets (or any other number)?
* What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

### The Dataset
The institute also wants us to consider historical data from the [national 6/49 lottery game in Canada](https://www.kaggle.com/datascienceai/lottery-dataset). The data set covers **`3,665`** drawings, dating from 1982 to 2018.

### Import the needed functions
To calculate probabilities we need **factorials, permutations** and **combinations**. We use the implementations that ship with Python instead of writing custom functions.

In [1]:
import numpy as np
import pandas as pd
from math import factorial
from itertools import permutations, combinations, combinations_with_replacement

### One-ticket Probability: The probability of winning the big prize
In the **6/49 lottery, six numbers are drawn from a set of 49 numbers that range from 1 to 49**. A player wins the big prize if the **six numbers on their ticket match all six numbers drawn**. A player holding the ticket `{13, 22, 24, 27, 42, 44}` wins the big prize only if the numbers drawn are exactly `{13, 22, 24, 27, 42, 44}`. **If even one number differs, they do not win**. Because the order of the numbers does not matter, there are C(49, 6) = `13,983,816` equally likely outcomes, and a single ticket covers exactly one of them.
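As a quick sanity check on the counting argument, the number of possible outcomes can be computed directly. This is a minimal sketch that assumes Python 3.8 or later (for `math.comb`); the variable names are illustrative and not part of the notebook.

```python
from math import comb

# Number of ways to choose 6 numbers out of 49 when order does not matter
total_outcomes = comb(49, 6)

# A single ticket covers exactly one of those equally likely outcomes
single_ticket_prob = 1 / total_outcomes

print(total_outcomes)  # 13983816
print(f"{single_ticket_prob * 100:.12f}%")
```

Using the closed-form count avoids enumerating all combinations in memory, which matters on a mobile device.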

In [2]:
def one_ticket_probability(nums):
    # Validate the ticket: six distinct numbers, each between 1 and 49
    if len(nums) != 6 or len(np.unique(nums)) != 6 or min(nums) < 1 or max(nums) > 49:
        return 'Enter exactly 6 unique numbers between 1 and 49'
    # Total number of possible tickets, C(49, 6), computed without enumerating them
    n_comb = factorial(49) // (factorial(6) * factorial(43))
    prob = 1 / n_comb
    print('Your chances of winning is: ' + str(prob * 100) + '%' )

### Testing the function
* Using 6 valid numbers
* Using 5 numbers
* Using 7 numbers
* Using repeated numbers
* Using a number < 1 or > 49

In [3]:
nums = [1, 2, 3, 4, 5, 6]
one_ticket_probability(nums)

Your chances of winning is: 7.151123842018516e-06%


In [4]:
nums = [1, 2, 3, 4, 5]
one_ticket_probability(nums)

'Enter exactly 6 unique numbers between 1 and 49'

In [5]:
nums = [1, 2, 3, 4, 5, 6, 5]
one_ticket_probability(nums)

'Enter exactly 6 unique numbers between 1 and 49'

In [6]:
nums = [1, 2, 3, 4, 5, 5]
one_ticket_probability(nums)

'Enter exactly 6 unique numbers between 1 and 49'

In [7]:
nums = [1, 2, 3, 4, 5, 0]
one_ticket_probability(nums)

'Enter exactly 6 unique numbers between 1 and 49'

In [8]:
nums = [1, 2, 3, 4, 5, 100]
one_ticket_probability(nums)

'Enter exactly 6 unique numbers between 1 and 49'

### Historical Data Check for Canada Lottery
Users should also be able to compare their ticket against the historical lottery data in Canada and determine whether they would have ever won. The data set can be downloaded from [Kaggle](https://www.kaggle.com/datascienceai/lottery-dataset). 

In [9]:
df = pd.read_csv('649.csv')
print(df.shape)
df.head()

(3665, 11)


Unnamed: 0,PRODUCT,DRAW NUMBER,SEQUENCE NUMBER,DRAW DATE,NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER
0,649,1,0,6/12/1982,3,11,12,14,41,43,13
1,649,2,0,6/19/1982,8,33,36,37,39,41,9
2,649,3,0,6/26/1982,1,6,23,24,27,39,34
3,649,4,0,7/3/1982,3,9,10,13,20,43,34
4,649,5,0,7/10/1982,5,14,21,31,34,47,45


### Function for Historical Data Check
#### Extract the six winning numbers from each historical draw

In [10]:
def extract_numbers(row):
    return set(row[4:10]) 

In [11]:
historical = [extract_numbers(row) for row in df.values]

In [12]:
historical[:3]

[{3, 11, 12, 14, 41, 43}, {8, 33, 36, 37, 39, 41}, {1, 6, 23, 24, 27, 39}]

### Helper function to check historical winning numbers
If you had played the combination you play today in every past draw, would you ever have won? Let's find out...

In [13]:
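def check_historical_occurence(nums, historical):
    if len(nums) != 6 or len(np.unique(nums)) != 6 or min(nums) < 1 or max(nums) > 49:
        return 'Enter exactly 6 unique numbers between 1 and 49'
    nums = set(nums)
    # One True per past draw whose six numbers match the ticket exactly
    wins = [nums == hist for hist in historical]
    # Total number of possible tickets, C(49, 6), computed without enumerating them
    n_comb = factorial(49) // (factorial(6) * factorial(43))
    # Number of exact past matches divided by the number of possible tickets
    prob = sum(wins) / n_comb
    print('Your chances of winning is: ' + str(prob * 100) + '%' )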

#### Testing our function
* Using random input
* Using a past known winning combination

In [14]:
nums = [1, 2, 3, 4, 5, 6]
check_historical_occurence(nums, historical)

Your chances of winning is: 0.0%


In [15]:
nums = [3, 11, 12, 14, 41, 43]
check_historical_occurence(nums, historical)

Your chances of winning is: 7.151123842018516e-06%
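The percentage above is zero when the combination never came up and non-zero when it did, but users may find a plain count of past wins easier to read. The sketch below is one possible way to report that; `count_past_wins` and `historical_sample` are hypothetical names, and the sample holds only the three draws shown earlier rather than the full data set.

```python
# Three historical draws copied from the output above (a sample, not the full data set)
historical_sample = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
]

def count_past_wins(nums, historical):
    """Return how many past draws match the ticket on all six numbers."""
    ticket = set(nums)
    return sum(ticket == draw for draw in historical)

for ticket in ([1, 2, 3, 4, 5, 6], [3, 11, 12, 14, 41, 43]):
    n_wins = count_past_wins(ticket, historical_sample)
    print(f"{ticket} matched all six numbers in {n_wins} past draw(s)")
```

Passing the full `historical` list built earlier instead of the sample would give the count across all recorded draws.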


## Multi-ticket Probability
Lottery addicts usually play more than one ticket on a single drawing, thinking that this might increase their chances of winning significantly. Our purpose is to help them better estimate those chances. Each distinct ticket covers one of the `13,983,816` equally likely outcomes, so `n` different tickets win with probability `n / 13,983,816`. **Let's find out the chances of winning with any number of tickets**.

In [16]:
def multi_ticket_probability(n_tickets):
    # Total number of possible tickets, C(49, 6), computed without enumerating them
    n_comb = factorial(49) // (factorial(6) * factorial(43))
    if n_tickets < 1 or n_tickets > n_comb:
        return 'Enter a number between 1 and ' + str(n_comb)
    # Each distinct ticket covers exactly one of the n_comb equally likely outcomes
    prob = n_tickets / n_comb
    print('With ' + str(n_tickets) + ' tickets, your chances of winning is: ', str(prob * 100) + '%' )

In [17]:
num_tickets = [1, 10, 100, 10000, 1000000, 6991908, 13983816]
for n in num_tickets:
    multi_ticket_probability(n)

With 1 tickets, your chances of winning is:  7.151123842018516e-06%
With 10 tickets, your chances of winning is:  7.151123842018517e-05%
With 100 tickets, your chances of winning is:  0.0007151123842018516%
With 10000 tickets, your chances of winning is:  0.07151123842018516%
With 1000000 tickets, your chances of winning is:  7.151123842018517%
With 6991908 tickets, your chances of winning is:  50.0%
With 13983816 tickets, your chances of winning is:  100.0%


**As we can see, even with `100` tickets the chance of winning is still extremely slim!**

***We do not expect anyone, no matter the level of addiction, to purchase `6,991,908` tickets just to stand a `50%` chance of winning!!***
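The same relationship can be turned around to ask how many distinct tickets a given chance of winning would require. The following is a minimal sketch under the assumption that every ticket is different; `tickets_needed` is a hypothetical helper, and because it multiplies floats the result may be off by one ticket right at an exact boundary.

```python
from math import ceil, comb

TOTAL_OUTCOMES = comb(49, 6)  # 13,983,816 possible tickets

def tickets_needed(target_prob):
    """Smallest number of distinct tickets whose win probability reaches target_prob."""
    if not 0 < target_prob <= 1:
        raise ValueError("target_prob must be in the interval (0, 1]")
    return ceil(target_prob * TOTAL_OUTCOMES)

for target in (0.01, 0.5, 1.0):
    print(f"{target:.0%} chance needs {tickets_needed(target):,} tickets")
```

Seeing the ticket count next to the target probability makes the cost of "improving the odds" concrete.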

### Fewer Winning Numbers: Function
In most **6/49 lotteries** there are smaller prizes if a player's ticket matches **two, three, four, or five** of the six numbers drawn. As a consequence, users might be interested in knowing the probability of matching exactly **two, three, four, or five** winning numbers.

In [18]:
def probability_less_6(n):
    if n < 2 or n > 5:
        print('Enter a number between 2 and 5 inclusive.')
        return  # stop here instead of computing with an invalid n
    # Ways to choose which n of the 6 winning numbers are on the ticket
    num_6 = factorial(6) // (factorial(n) * factorial(6 - n))
    # Ways to choose the other 6 - n ticket numbers from the 43 non-winning numbers
    num_rem = factorial(43) // (factorial(6 - n) * factorial(43 - (6 - n)))
    num_total_outcomes = num_6 * num_rem
    # Total number of possible tickets, C(49, 6)
    n_comb = factorial(49) // (factorial(6) * factorial(43))
    print('Your chances of matching ' + str(n) + ' numbers is: ' + str(num_total_outcomes / n_comb * 100) + '%')

In [19]:
probability_less_6(5)

Your chances of matching 5 numbers is: 0.0018449899512407771%


In [20]:
matches = [2,3,4,5]
for n in matches:
    probability_less_6(n)

Your chances of matching 2 numbers is: 13.237802900152577%
Your chances of matching 3 numbers is: 1.7650403866870101%
Your chances of matching 4 numbers is: 0.0968619724401408%
Your chances of matching 5 numbers is: 0.0018449899512407771%
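The figures above are for matching exactly `n` numbers. One of the goals is the probability of matching at least `n` numbers, which is the sum of the exact-match probabilities from `n` up to 6. A minimal sketch, assuming Python 3.8 or later for `math.comb`; `prob_exactly` and `prob_at_least` are hypothetical names, not functions from the notebook.

```python
from math import comb

TOTAL_OUTCOMES = comb(49, 6)

def prob_exactly(k):
    """Probability that a ticket matches exactly k of the 6 drawn numbers."""
    # Choose k of the 6 winning numbers and 6 - k of the 43 non-winning numbers
    return comb(6, k) * comb(43, 6 - k) / TOTAL_OUTCOMES

def prob_at_least(k):
    """Probability that a ticket matches k or more of the 6 drawn numbers."""
    return sum(prob_exactly(j) for j in range(k, 7))

for k in (2, 3, 4, 5):
    print(f"At least {k} matches: {prob_at_least(k) * 100:.6f}%")
```

Since the jackpot outcome is included in every "at least" figure, each value is slightly larger than the corresponding exact-match probability.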


### Simulating A Lottery Draw
The function below draws **one of the `13,983,816` possible six-number combinations** at random, compares it with the six numbers provided by the user, and reports the result:

* You won the jackpot with 6 matching numbers
* You lost with no matching numbers
* How close you came to winning (the number of matching numbers)

In [32]:
def ticket_probability(nums):
    if len(nums) != 6 or len(np.unique(nums)) != 6 or min(nums) < 1 or max(nums) > 49:
        return 'Enter exactly 6 unique numbers between 1 and 49'

    nums = set(nums)
    # Draw six distinct numbers from 1 to 49; every combination is equally likely,
    # so there is no need to build the full list of 13,983,816 combinations
    winner = set(np.random.choice(np.arange(1, 50), size=6, replace=False))
    wins = len(nums.intersection(winner))

    if wins == 6:
        print('Congratulations! You have won the jackpot!!')
    elif wins == 0:
        print('Sorry! You have not won!!')
    else:
        print('Sorry! But you match ' + str(wins) + ' numbers correct')

### Testing the function - Playing the Lottery

In [33]:
nums = [3, 11, 12, 14, 41, 43]
ticket_probability(nums)

Sorry! You have not won!!


In [34]:
nums = [3, 11, 12, 14, 41, 43]
ticket_probability(nums)

Sorry! You have not won!!


In [35]:
nums = [3, 11, 12, 14, 41, 43]
ticket_probability(nums)

Sorry! But you match 3 numbers correct


In [36]:
nums = [3, 11, 12, 14, 41, 43]
ticket_probability(nums)

Sorry! You have not won!!
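A handful of draws says little about the odds, so it can help to repeat the simulation many times and tally how often each match count appears. The sketch below is an illustration only: it uses the standard library `random` module rather than NumPy, and `simulate_matches`, `n_draws` and `seed` are hypothetical names chosen for this example.

```python
import random
from collections import Counter

def simulate_matches(ticket, n_draws=100_000, seed=0):
    """Simulate n_draws lottery draws and count how many numbers the ticket matches in each."""
    rng = random.Random(seed)  # seeded so repeated runs give the same tally
    ticket = set(ticket)
    tally = Counter()
    for _ in range(n_draws):
        draw = rng.sample(range(1, 50), 6)  # six distinct numbers from 1 to 49
        tally[len(ticket.intersection(draw))] += 1
    return tally

tally = simulate_matches([3, 11, 12, 14, 41, 43])
for matches in sorted(tally):
    print(f"{matches} matching numbers: {tally[matches] / sum(tally.values()):.4%} of draws")
```

With enough draws the observed shares should settle near the exact-match probabilities computed earlier, while six matches will almost never appear.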
