# The Lotto Predictor

This project aims to help medical institutions support patients who are fighting gambling addiction. Its goal is to calculate the probability of winning the lottery for various inputs, giving people a realistic sense of their chances.

In [1]:
def factorial(n):
    # Base case: 0! and 1! both equal 1.
    if n <= 1:
        return 1
    return n * factorial(n - 1)

def combinations(n, k):
    # Integer division keeps the count an exact int.
    return factorial(n) // (factorial(k) * factorial(n - k))

# Both functions help us count outcomes when numbers are drawn
# without replacement.
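As a quick sanity check on the counting above, here is a minimal sketch that assumes Python 3.8 or later, where the standard library provides `math.comb`. The helper name `combinations_exact` is hypothetical and only used for illustration.

```python
import math

def combinations_exact(n, k):
    """Number of ways to choose k items from n when order does not matter."""
    # math.comb works on integers only, so the result is always exact.
    return math.comb(n, k)

# Number of distinct 6-number tickets that can be built from 49 numbers.
total_tickets = combinations_exact(49, 6)
print(total_tickets)
```

The count it prints should agree with the total number of combinations used throughout this notebook.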

In [2]:
def one_ticket_probability(a_list):
    total_outcome = int(combinations(49, 6))
    percent = (1 / total_outcome) * 100
    print("The Lotto lottery system picks 6 numbers from a set of 49, and if your numbers are the same as Lotto's, you win 5 million dollars!\n")
    print("Exciting, right? But not so fast: have you tried to figure out how many different combinations of 6 numbers you can build from 49 numbers?\n")
    print("Take a guess... There are {} possible combinations.\n".format(total_outcome))
    print("And a single ticket lets you choose only 1 such combination.\n")
    print("The odds are stacked heavily against you. Your combination {} has only a {:.7f}% chance of winning.".format(a_list, percent))

In [3]:
one_ticket_probability([1,2,3,4,5,6])

The Lotto lottery system picks 6 numbers from a set of 49, and if your numbers are the same as Lotto's, you win 5 million dollars!

Exciting, right? But not so fast: have you tried to figure out how many different combinations of 6 numbers you can build from 49 numbers?

Take a guess... There are 13983816 possible combinations.

And a single ticket lets you choose only 1 such combination.

The odds are stacked heavily against you. Your combination [1, 2, 3, 4, 5, 6] has only a 0.0000072% chance of winning.


### The one_ticket_probability() function does the following:
1. Takes the user's chosen numbers.
2. Shows the odds of winning with one ticket.
3. Playfully cautions the user about gambling by pointing out how small the chance of winning with a single ticket is.

## Experimenting with historical data


In [4]:
import pandas as pd

lotto = pd.read_csv("649.csv")
lotto.shape

(3665, 11)

First three rows

In [5]:
lotto.head(3)

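| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |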


Last three rows

In [6]:
lotto.tail(3)

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


In [7]:
def extract_numbers(a_list):
    return set(a_list)

winning_numbers = lotto.loc[:, ["NUMBER DRAWN {}".format(i) for i in range(1,7)]]
lotto["the_winning_combination"] = winning_numbers.apply(extract_numbers, axis=1)

In [8]:
def check_historical_occurence(user_input, combination_series):
    user_input = set(user_input)
    # Element-wise comparison: True wherever a past draw equals the user's set.
    output_series = combination_series == user_input

    # Summing booleans counts the True values (0 if there are none).
    successful_outcome = int(output_series.sum())

    s_frequency = (successful_outcome / len(output_series)) * 100
    print("Your input series has occurred {} time(s) in the past.\n"
          "That is {:.3f}% of all past draws.".format(
              successful_outcome, s_frequency))

check_historical_occurence([8, 9, 25, 36, 47, 6],
                           lotto["the_winning_combination"])

Your input series has occurred 0 time(s) in the past.
That is 0.000% of all past draws.


### The function above counts how often the user's set of numbers has won in past draws by comparing it with each historical winning combination. Past results do not change the odds: every combination remains equally likely in a future draw.
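The order-insensitive comparison can also be illustrated without pandas. The sketch below is only an illustration: `past_draws` is a hypothetical three-draw history (the first three rows shown earlier), and `count_past_wins` is a hypothetical helper, not part of the notebook.

```python
# Hypothetical mini history: the first three draws from the data set.
past_draws = [
    (3, 11, 12, 14, 41, 43),
    (8, 33, 36, 37, 39, 41),
    (1, 6, 23, 24, 27, 39),
]

def count_past_wins(user_numbers, draws):
    """Count the draws whose six numbers equal the user's numbers, ignoring order."""
    target = frozenset(user_numbers)
    # Converting each draw to a set makes the comparison independent of order.
    return sum(1 for draw in draws if frozenset(draw) == target)

# Same numbers as the first draw, entered in a different order.
hits = count_past_wins([43, 41, 14, 12, 11, 3], past_draws)
# A combination that does not appear in the mini history.
misses = count_past_wins([1, 2, 3, 4, 5, 6], past_draws)
print(hits, misses)
```

Using sets means a user who enters the winning numbers in a different order is still counted as a match.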

In [9]:
def multi_ticket_probability(n):
    total_outcome = int(combinations(49, 6))
    
    if (n < 0) or (n > total_outcome):
        return "Invalid input. The number of tickets must be between 0 and {}. Please enter a different number of tickets.".format(total_outcome)
    
    percent = (n / total_outcome) * 100
    return "Your chances of winning are {:.7f}%. And if a ticket costs $50, the total cost of just trying is ${}.".format(percent, 50 * n)
    
test = [1, 10, 100, 10000, 1000000, 6991908, 13983816]

for t in test:
    print(multi_ticket_probability(t))
    print("\n")

Your chances of winning are 0.0000072%. And if a ticket costs $50, the total cost of just trying is $50.


Your chances of winning are 0.0000715%. And if a ticket costs $50, the total cost of just trying is $500.


Your chances of winning are 0.0007151%. And if a ticket costs $50, the total cost of just trying is $5000.


Your chances of winning are 0.0715112%. And if a ticket costs $50, the total cost of just trying is $500000.


Your chances of winning are 7.1511238%. And if a ticket costs $50, the total cost of just trying is $50000000.


Your chances of winning are 50.0000000%. And if a ticket costs $50, the total cost of just trying is $349595400.


Your chances of winning are 100.0000000%. And if a ticket costs $50, the total cost of just trying is $699190800.
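The same relationship can be read in the other direction: how many distinct tickets would it take to reach a given chance of winning? The sketch below is a rough illustration that assumes Python 3.8+ for `math.comb` and reuses the $50 ticket price from the example above; `tickets_needed` and `TICKET_PRICE` are hypothetical names.

```python
import math

TOTAL_COMBINATIONS = math.comb(49, 6)
TICKET_PRICE = 50  # assumed price per ticket, as in the example above

def tickets_needed(target_probability):
    """Smallest number of distinct tickets giving at least the target win probability."""
    if not 0 <= target_probability <= 1:
        raise ValueError("target_probability must be between 0 and 1")
    # Each distinct ticket covers exactly one of the equally likely combinations.
    return math.ceil(target_probability * TOTAL_COMBINATIONS)

half = tickets_needed(0.5)
print(half, half * TICKET_PRICE)
```

Because the target is a floating-point number, results for probabilities that are not exactly representable could be off by one ticket; for a sketch like this, that seems acceptable.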




## The lottery also awards prizes when 2 or more numbers match. It is therefore also important to calculate the probability of the user's input getting these partial matches. The function below calculates the probability of matching exactly n numbers, for any n from 2 to 5.

In [10]:
def probability_less_6(n):
    if (n < 2) or (n > 5):
        return "Invalid input. Your input must be an integer between 2 and 5 (inclusive)."
    
    # Ways to choose which n of the ticket's 6 numbers are drawn, i.e. 6Cn.
    combinations_n = combinations(6, n)
    # The other 6 - n drawn numbers must come from the 43 numbers that are
    # NOT on the ticket; otherwise more than n numbers would match.
    possible_outcomes_for_each_n = combinations(43, 6 - n)
    ## Example: matching exactly two numbers from the user's list.
    ## Any 2 of the 6 numbers the user entered can be the matching pair,
    ## so there are 6C2 = 15 possible pairs.
    ## Each pair can be completed by any 4 of the 43 numbers
    ## the user did not pick, i.e. in 43C4 ways.
    ## Hence, the total number of successful outcomes is 15 * 43C4.
    
    total_possible_outcomes = combinations_n * possible_outcomes_for_each_n
    probability = total_possible_outcomes / combinations(49, 6)
    
    return "There is a {:.5f}% chance of matching exactly {} numbers.".format(probability*100, n)


In [11]:
probability_less_6(3)

'There is a 1.76504% chance of matching exactly 3 numbers.'

In [12]:
probability_less_6(2)

'There is a 13.23780% chance of matching exactly 2 numbers.'

In [13]:
probability_less_6(4)

'There is a 0.09686% chance of matching exactly 4 numbers.'

In [14]:
probability_less_6(5)

'There is a 0.00184% chance of matching exactly 5 numbers.'
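One way to sanity-check exact-match probabilities is to count the outcomes for every possible number of matches, from 0 to 6, and confirm that they add up to the total number of draws. The sketch below assumes Python 3.8+ for `math.comb` and uses the standard hypergeometric count: choose k of the ticket's 6 numbers, then choose the remaining 6 - k from the 43 numbers not on the ticket. The name `exact_match_count` is hypothetical.

```python
import math

def exact_match_count(k, pool=49, picks=6):
    """Number of draws that share exactly k numbers with one fixed ticket."""
    # k matching numbers come from the ticket, the rest from numbers not on it.
    return math.comb(picks, k) * math.comb(pool - picks, picks - k)

total_draws = math.comb(49, 6)
counts = {k: exact_match_count(k) for k in range(7)}

# Every draw matches the ticket in exactly one of 0..6 numbers,
# so the counts must sum to the total number of possible draws.
assert sum(counts.values()) == total_draws

for k, count in counts.items():
    print("{} matches: {:.5f}%".format(k, 100 * count / total_draws))
```

If the counts did not sum to the total, the formula would be counting some draws twice or missing some, which makes this a cheap check on any exact-match calculation.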