# Mobile App for Lottery Addiction

A medical institute that aims to prevent and treat gambling addictions wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. 

For the first version of the app, they want us to focus on the 6/49 lottery and build functions that enable users to answer questions like:

- What is the probability of winning the big prize with a single ticket?
- What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
- What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

Rules of the 6/49 lottery: six numbers are drawn from a set of 49 numbers that range from 1 to 49. A player wins the big prize if the six numbers on their ticket match all six numbers drawn. For example, a player holding a ticket with the numbers {13, 22, 24, 27, 42, 44} wins the big prize only if the numbers drawn are {13, 22, 24, 27, 42, 44}. If even one number differs, they do not win.

We use historical data from the national 6/49 lottery game in Canada:
https://www.kaggle.com/datascienceai/lottery-dataset
The data set covers 3,665 drawings, dating from 1982 to 2018.

----------
One-ticket Probability
-----------

In [1]:
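def factorial(n):
    # Multiply n * (n-1) * ... * 2; returns 1 for n = 0 or n = 1
    result = 1
    for i in range(n, 1, -1):
        result *= i
    return result

def combinations(n, k):
    # Number of ways to choose k items out of n. The division is always
    # exact, so integer division keeps the result an int.
    return factorial(n) // (factorial(k) * factorial(n - k))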
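
As a quick cross-check of the helpers above, the standard library's `math.comb` (available from Python 3.8) computes the same binomial coefficient with exact integer arithmetic. This is only a sanity-check sketch and not part of the original notebook:

```python
import math

# Exact number of ways to choose 6 numbers out of 49
total_outcomes = math.comb(49, 6)

# The same value from the factorial formula n! / (k! * (n-k)!)
from_factorials = math.factorial(49) // (math.factorial(6) * math.factorial(43))

print(total_outcomes)                     # 13983816
print(total_outcomes == from_factorials)  # True
```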

Some details: 
- Inside the app, the user inputs six different numbers from 1 to 49.
- Under the hood, the six numbers will come as a Python list, which will serve as the single input to our function.
- The engineering team wants the function to print the probability in a friendly way that people without any probability training can understand.
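
Since the function relies on receiving six different numbers from 1 to 49, the app could check the list before computing anything. The helper below is a hypothetical sketch (the name `validate_ticket` and its behavior are assumptions, not something the engineering team specified):

```python
def validate_ticket(user_list):
    """Return True if user_list holds six distinct integers from 1 to 49.

    Hypothetical helper, not part of the original notebook.
    """
    return (
        len(user_list) == 6
        and len(set(user_list)) == 6
        and all(isinstance(n, int) and 1 <= n <= 49 for n in user_list)
    )

print(validate_ticket([11, 21, 7, 3, 49, 12]))  # True
print(validate_ticket([11, 11, 7, 3, 49, 12]))  # False: duplicate number
print(validate_ticket([0, 21, 7, 3, 50, 12]))   # False: numbers out of range
```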

In [2]:
def one_ticket_probability(user_list):
    tot_outcome = combinations(49,6)
    probability = 1 / tot_outcome
    print('The probability of winning with ticket {0} is:\n {1:.7f}%'\
          .format(user_list, probability * 100))
    print('This is 1 in {0:,.0f} chance to win'.format(tot_outcome))

In [3]:
#test the function
one_ticket_probability([11,21,7,3,49,12])

The probability of winning with ticket [11, 21, 7, 3, 49, 12] is:
 0.0000072%
This is 1 in 13,983,816 chance to win


-----------
Historical Data Check for Canada Lottery
----------

In [4]:
import pandas as pd
cl = pd.read_csv('649.csv')

In [5]:
cl.sample(5)

Unnamed: 0,PRODUCT,DRAW NUMBER,SEQUENCE NUMBER,DRAW DATE,NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER
3104,649,3034,1,2/16/2013,10,15,16,38,44,48,0
1850,649,1851,0,10/17/2001,18,21,25,34,36,45,11
386,649,387,0,10/7/1987,3,4,6,13,14,48,40
2433,649,2434,0,5/19/2007,8,16,33,37,40,42,35
3064,649,2999,0,10/17/2012,4,20,22,23,42,44,41


Each row holds the data for a single drawing. The drawings span 1982 to 2018.

In [6]:
cl.shape

(3665, 11)

We would like to write a function that prints:
- the number of times the combination selected occurred in the Canada data set
- the probability of winning the big prize in the next drawing with that combination.

In [7]:
def extract_numbers(row):
    # Positions 4 to 9 of each row hold the six NUMBER DRAWN columns;
    # .iloc makes the positional lookup explicit
    return set(row.iloc[4:10])

all_numbers = cl.apply(extract_numbers, axis=1)

In [8]:
def check_historical_occurence(user_list, ext_series):
    user_set = set(user_list)
    # Boolean Series: True where a past drawing matches the ticket exactly
    comp_bol = ext_series == user_set
    num_occu = comp_bol.sum()

    print('This combination of numbers has occurred {0} time(s) in the past\n'
          .format(num_occu))

    one_ticket_probability(user_list)

In [9]:
#test the function
check_historical_occurence([11,21,7,3,49,12], all_numbers)

This combination of numbers has occurred 0 time(s) in the past

The probability of winning with ticket [11, 21, 7, 3, 49, 12] is:
 0.0000072%
This is 1 in 13,983,816 chance to win


In [10]:
#test the function
check_historical_occurence([9, 19, 21, 23, 31, 49], all_numbers)

This combination of numbers has occurred 1 time(s) in the past

The probability of winning with ticket [9, 19, 21, 23, 31, 49] is:
 0.0000072%
This is 1 in 13,983,816 chance to win
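
If many tickets had to be checked against the history, one possible alternative is to count every past combination once and then look tickets up by key. The sketch below uses only the standard library and a few made-up drawings (not rows from 649.csv); `count_occurrences` is a hypothetical name:

```python
from collections import Counter

# Made-up drawings for illustration only (not taken from 649.csv)
toy_drawings = [
    (3, 11, 12, 14, 41, 43),
    (8, 33, 36, 37, 39, 41),
    (3, 11, 12, 14, 41, 43),
]

# frozenset is hashable and ignores order, so it can serve as a Counter key
drawing_counts = Counter(frozenset(d) for d in toy_drawings)

def count_occurrences(user_list, counts):
    """Return how many times the ticket's numbers were drawn (order ignored)."""
    return counts[frozenset(user_list)]

print(count_occurrences([43, 41, 14, 12, 11, 3], drawing_counts))  # 2
print(count_occurrences([1, 2, 3, 4, 5, 6], drawing_counts))       # 0
```

Whatever the historical count turns out to be, the probability for the next drawing printed by one_ticket_probability() stays the same, as the two tests above show.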


--------------
Multi-ticket Probability
-------------

We would like to write a function that lets users calculate their chances of winning for any number of different tickets:
- The user will input the number of different tickets they want to play (without inputting the specific combinations they intend to play).
- Our function will receive an integer between 1 and 13,983,816 (the maximum number of different tickets).
- The function should print information about the probability of winning the big prize depending on the number of different tickets played.
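
Because every different ticket covers exactly one of the 13,983,816 equally likely outcomes, the probability for n different tickets is simply n divided by that total. A minimal sketch of this reasoning with exact fractions follows; the function name and the range check are assumptions added for illustration:

```python
from fractions import Fraction
from math import comb

def exact_multi_ticket_probability(num_tickets):
    """Return the chance of winning with num_tickets different tickets as an exact fraction."""
    total_outcomes = comb(49, 6)
    if not 1 <= num_tickets <= total_outcomes:
        raise ValueError('num_tickets must be between 1 and 13,983,816')
    return Fraction(num_tickets, total_outcomes)

print(exact_multi_ticket_probability(1))        # 1/13983816
print(exact_multi_ticket_probability(6991908))  # 1/2
```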



In [11]:
def multi_ticket_probability(num_ticket):
    tot_outcome = combinations(49,6)
    probability = num_ticket / tot_outcome
    print('The probability of winning with {0} ticket(s) is:\n {1:.7f}%'\
          .format(num_ticket, probability * 100))
    prob_ratio = round(tot_outcome/num_ticket,0)
    print('This is 1 in {0:,.0f} chance to win\n'.format(prob_ratio))
    return

In [12]:
# test function
inputs = [1, 10, 100, 10000, 1000000, 6991908, 13983816]

for i in inputs:
    multi_ticket_probability(i)

The probability of winning with 1 ticket(s) is:
 0.0000072%
This is 1 in 13,983,816 chance to win

The probability of winning with 10 ticket(s) is:
 0.0000715%
This is 1 in 1,398,382 chance to win

The probability of winning with 100 ticket(s) is:
 0.0007151%
This is 1 in 139,838 chance to win

The probability of winning with 10000 ticket(s) is:
 0.0715112%
This is 1 in 1,398 chance to win

The probability of winning with 1000000 ticket(s) is:
 7.1511238%
This is 1 in 14 chance to win

The probability of winning with 6991908 ticket(s) is:
 50.0000000%
This is 1 in 2 chance to win

The probability of winning with 13983816 ticket(s) is:
 100.0000000%
This is 1 in 1 chance to win



------
Fewer Winning Numbers
------------

In most 6/49 lotteries there are smaller prizes if a player's ticket matches two, three, four, or five of the six numbers drawn. As a consequence, users might be interested in knowing the probability of having two, three, four, or five winning numbers:
- Inside the app, the user inputs an integer between 2 and 5 that represents the number of winning numbers expected.
- Our function prints information about the probability of having the inputted number of winning numbers.
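
The counting argument behind this probability can be written out as follows. A ticket with exactly k winning numbers contains k of the 6 numbers drawn, and its remaining 6 - k numbers come from the 43 numbers that were not drawn:

```latex
P(\text{exactly } k \text{ winning numbers})
  = \frac{\binom{6}{k}\,\binom{43}{6-k}}{\binom{49}{6}},
  \qquad \binom{49}{6} = 13{,}983{,}816
```

For k = 5 the numerator is 6 × 43 = 258, which gives 258 / 13,983,816, or roughly 0.0018%.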

In [15]:
# exactly five winning numbers: 5 of the 6 drawn, plus 1 of the 43 not drawn
combinations(6,5) * (combinations(49-5-1,(6-5))) / combinations(49,6)

1.8449899512407772e-05

In [14]:
combinations(49,6)/combinations(49,5)

7.333333333333333

In [30]:
def probability_less_6(n_win):
    # number of successful outcomes with exactly n_win winning numbers
    succ_outcome = combinations(6,n_win) * combinations(49-6, 6-n_win)
    # total number of possible outcomes
    tot_outcome = combinations(49,6)
    prob = succ_outcome / tot_outcome
    prob_ratio = round(tot_outcome / succ_outcome, 0)
    print('The probability of having exactly {0} winning numbers in your ticket is:\n {1:.4f}%'\
          .format(n_win, prob * 100))
    print('This is 1 in {0:,.0f} chance to win\n'.format(prob_ratio))
    return

In [31]:
# test function
for i in range(2,6):
    probability_less_6(i)

The probability of having exactly 2 winning numbers in your ticket is:
 13.2378%
This is 1 in 8 chance to win

The probability of having exactly 3 winning numbers in your ticket is:
 1.7650%
This is 1 in 57 chance to win

The probability of having exactly 4 winning numbers in your ticket is:
 0.0969%
This is 1 in 1,032 chance to win

The probability of having exactly 5 winning numbers in your ticket is:
 0.0018%
This is 1 in 54,201 chance to win



--------------
Next, we create a function similar to probability_less_6() that calculates the probability of having at least two, three, four, or five winning numbers:

In [51]:
def prob_atleast_less_6(n_win_atleast):
    tot_outcome = combinations(49,6)
    succ_outcome_atleast_list = [combinations(6,i) * combinations(49-6, 6-i) \
                                 for i in range(n_win_atleast,6+1)]
    succ_outcome_atleast = sum(succ_outcome_atleast_list)
    prob_atleast = succ_outcome_atleast / tot_outcome
    prob_atleast_ratio = round(tot_outcome / succ_outcome_atleast, 0)
    print('The probability of having at least {0} winning numbers in your ticket is:\n {1:.4f}%'\
          .format(n_win_atleast, prob_atleast * 100))
    print('This is 1 in {0:,.0f} chance to win\n'.format(prob_atleast_ratio))
    return
    

In [52]:
# test function
for i in range(2,6):
    prob_atleast_less_6(i)

The probability of having at least 2 winning numbers in your ticket is:
 15.1016%
This is 1 in 7 chance to win

The probability of having at least 3 winning numbers in your ticket is:
 1.8638%
This is 1 in 54 chance to win

The probability of having at least 4 winning numbers in your ticket is:
 0.0987%
This is 1 in 1,013 chance to win

The probability of having at least 5 winning numbers in your ticket is:
 0.0019%
This is 1 in 53,992 chance to win
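
One way to sanity-check these figures is to confirm that the counts for exactly 0 through 6 winning numbers add up to every possible ticket, and that "at least 2" equals the complement of "0 or 1". The sketch below uses `math.comb` from the standard library instead of the notebook's combinations() helper, purely to keep it self-contained:

```python
from math import comb

total = comb(49, 6)

# Number of tickets with exactly k winning numbers, for k = 0..6
exact_counts = [comb(6, k) * comb(43, 6 - k) for k in range(7)]

# Every ticket has between 0 and 6 winning numbers, so the counts cover all outcomes
assert sum(exact_counts) == total

# "At least 2" is the complement of "0 or 1 winning numbers"
at_least_2 = 1 - (exact_counts[0] + exact_counts[1]) / total
print('{0:.4f}%'.format(at_least_2 * 100))  # 15.1016%
```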

