## Mobile App for Lottery Addiction

![Image](https://i.pinimg.com/originals/66/3e/e6/663ee6d5c6e6d81d8f8d6447380d91fa.png)

### Introduction

[Ludomania](https://en.wikipedia.org/wiki/Problem_gambling) or pathological gambling is a common disorder that is associated with both social and family costs.

A medical institute that aims to prevent and treat gambling addictions wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. We're here to help them build the logical core of the app and calculate the probabilities.

We'll focus on the [6/49 lottery](https://en.wikipedia.org/wiki/Lotto_6/49) and help users find:
* The probability of winning the big prize with a single ticket
* The probability of winning the big prize if we play 40 different tickets (or any other number)
* The probability of having at least five (or four, or three, or two) winning numbers on a single ticket

The institute has also provided historical [data](https://www.kaggle.com/datascienceai/lottery-dataset/version/1): 3,665 drawings from the national 6/49 lottery game in Canada, dating from 1982 to 2018.

### Finding the probabilities

First of all, let's prepare two core functions: [**factorial**](https://en.wikipedia.org/wiki/Factorial) and [**combinations**](https://en.wikipedia.org/wiki/Combination).

In [1]:
def factorial(n):
    '''
    Calculate the factorial of n (n!)
    '''
    result = 1
    for i in range(n, 1, -1):
        result *= i
        
    return result

def combinations(n, k):
    '''
    Calculate the number of combinations when we're sampling
    without replacement and taking only k objects from a group
    of n objects
    '''
    denominator = factorial(k) * factorial(n - k)
    # Integer division keeps the count exact (the result is always a whole number)
    return factorial(n) // denominator
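
As a quick sanity check, the hand-written helpers can be compared against the standard library. The sketch below assumes Python 3.8 or later, where `math.comb` is available, and computes the number of 6-number combinations out of 49 in two ways.

```python
import math

# Cross-check: number of ways to choose 6 numbers out of 49,
# computed two ways with the standard library.
n, k = 49, 6
via_comb = math.comb(n, k)
via_factorials = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

print(via_comb)                    # 13983816
print(via_comb == via_factorials)  # True
```

Both values should agree with what `combinations(49, 6)` returns.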

Now we can move on.

#### One ticket

We'll start with a simple function of our app: calculating the probability of winning the big prize with a single ticket.

A user inputs six different numbers from 1 to 49; we calculate the probability under the hood and return it in a friendly way.

In [24]:
def one_ticket_probability(ticket):
    '''
    Calculate the probability to win using only one ticket
    (exact numbers match)
    
    ticket - list
    '''
    k = len(ticket)
    n = 49
    comb = combinations(n, k)
    prob = 1 / comb
    print('''Your chances to win the big prize with the {} numbers
are {:%} or 1 in {:,}.
'''.format(ticket, prob, int(comb)))
    return None

Let's run some tests.

In [25]:
one_ticket_probability([1, 2, 3, 4, 5, 6])
one_ticket_probability([15, 25, 35, 45, 55, 65])

Your chances to win the big prize with the [1, 2, 3, 4, 5, 6] numbers
are 0.000007% or 1 in 13,983,816.

Your chances to win the big prize with the [15, 25, 35, 45, 55, 65] numbers
are 0.000007% or 1 in 13,983,816.



Everything's fine: the probability does not depend on which numbers are chosen. We can hope that users will notice that too.
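
Note that the second test ticket contains 55 and 65, which fall outside the 1 to 49 range; `one_ticket_probability` accepts it because it only looks at the length of the list. A minimal sketch of an input check follows. The name `validate_ticket` and its parameters are hypothetical and not part of the original notebook.

```python
def validate_ticket(ticket, pool_size=49, ticket_size=6):
    """Hypothetical helper: return True if the ticket is a valid 6/49 entry."""
    numbers = set(ticket)
    return (
        len(ticket) == ticket_size            # exactly six entries
        and len(numbers) == ticket_size       # no duplicates
        and all(isinstance(x, int) and 1 <= x <= pool_size for x in numbers)
    )

print(validate_ticket([1, 2, 3, 4, 5, 6]))        # True
print(validate_ticket([15, 25, 35, 45, 55, 65]))  # False: 55 and 65 exceed 49
print(validate_ticket([1, 1, 2, 3, 4, 5]))        # False: duplicate number
```

Such a check could run before any probability is shown, so that users only see results for tickets they could actually play.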

#### Historical draws

Now we want to give users the ability to compare their ticket against the historical lottery data and determine whether they would ever have won by now. Let's first take a look at this data.

In [4]:
import pandas as pd

historical = pd.read_csv('649.csv')
print(historical.shape)
historical.head()

(3665, 11)


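| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |
| 3 | 649 | 4 | 0 | 7/3/1982 | 3 | 9 | 10 | 13 | 20 | 43 | 34 |
| 4 | 649 | 5 | 0 | 7/10/1982 | 5 | 14 | 21 | 31 | 34 | 47 | 45 |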


Numbers are stored in six different columns. We need to combine them into something more useful - sets. Then we'll store these sets in a new column.

In [5]:
def extract_numbers(row):
    '''
    Extract numbers from certain columns in the row
    and store them as set
    '''
    numbers = row.iloc[4:10]
    return set(numbers)

historical['SETS DRAWN'] = historical.apply(extract_numbers, axis=1)
historical.head()

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER | SETS DRAWN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 | {3, 41, 11, 12, 43, 14} |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 | {33, 36, 37, 39, 8, 41} |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 | {1, 6, 39, 23, 24, 27} |
| 3 | 649 | 4 | 0 | 7/3/1982 | 3 | 9 | 10 | 13 | 20 | 43 | 34 | {3, 9, 10, 43, 13, 20} |
| 4 | 649 | 5 | 0 | 7/10/1982 | 5 | 14 | 21 | 31 | 34 | 47 | 45 | {34, 5, 14, 47, 21, 31} |


Now we'll write a function that makes the comparison.

In [6]:
def check_historical_occurence(ticket, compare_to=historical['SETS DRAWN']):
    '''
    Take the ticket numbers and compare them with the winning sets.
    Print the number of past wins and the probability of winning
    
    ticket - list
    '''
    ticket_set = set(ticket)
    win_num = (ticket_set == compare_to).sum()
    
    if win_num == 1:
        print('''The {} numbers won {:,} time.
It doesn't change chances to win in the future.'''.format(ticket, int(win_num)))
        one_ticket_probability(ticket)
        
    else:
        print('''The {} numbers won {:,} times.
It doesn't change chances to win in the future.'''.format(ticket, int(win_num)))
        one_ticket_probability(ticket)
    return None

And some more tests.

In [7]:
check_historical_occurence([1, 2, 3, 4, 5, 6])
check_historical_occurence([3, 41, 11, 12, 43, 14])

The [1, 2, 3, 4, 5, 6] numbers won 0 times.
It doesn't change chances to win in the future.
Your chances to win the big prize with the [1, 2, 3, 4, 5, 6] numbers
are 0.000007% or 1 in 13,983,816.

The [3, 41, 11, 12, 43, 14] numbers won 1 time.
It doesn't change chances to win in the future.
Your chances to win the big prize with the [3, 41, 11, 12, 43, 14] numbers
are 0.000007% or 1 in 13,983,816.
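
The comparison itself does not depend on pandas. As a rough illustration, the same check can be sketched in plain Python over the five draws shown in the table above. The names `sample_draws` and `count_historical_wins` are hypothetical and not part of the original notebook.

```python
# Sample draws copied from the first five rows of the historical table.
sample_draws = [
    (3, 11, 12, 14, 41, 43),
    (8, 33, 36, 37, 39, 41),
    (1, 6, 23, 24, 27, 39),
    (3, 9, 10, 13, 20, 43),
    (5, 14, 21, 31, 34, 47),
]

def count_historical_wins(ticket, draws):
    """Hypothetical helper: count draws whose six numbers all match the ticket."""
    ticket_set = frozenset(ticket)
    # Sets ignore order, so [3, 41, 11, ...] matches (3, 11, 12, ...)
    return sum(1 for draw in draws if frozenset(draw) == ticket_set)

print(count_historical_wins([3, 41, 11, 12, 43, 14], sample_draws))  # 1
print(count_historical_wins([1, 2, 3, 4, 5, 6], sample_draws))       # 0
```

Comparing sets rather than lists means the order in which a user types the numbers does not matter.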



#### Several tickets

We've finished with the single-ticket case, but lottery addicts usually play more than one ticket on a single drawing. They think that this might increase their chances of winning significantly.

Our purpose is to help them better estimate their chances of winning, so we should add the ability to calculate the chances of winning for any number of different tickets.

In [26]:
def multi_ticket_probability(num_of_ticket=1):
    '''
    Calculate the probability to win using any number of tickets
    (exact numbers match)
    
    num_of_ticket - int
    '''
    k = 6
    n = 49
    comb = combinations(n, k)
    prob = num_of_ticket / comb
    if num_of_ticket == 1:
        print('''Your chances to win the big prize with the one ticket
are {:%} or 1 in {:,}.
'''.format(prob, int(comb)))
        
    else:
        print('''Your chances to win the big prize with the {:,} number of tickets
are {:%} or {:,} in {:,}.
'''.format(num_of_ticket, prob, num_of_ticket, int(comb)))
    return None

Let's test our function on different numbers of tickets, starting with one and ending with the total number of combinations - 13,983,816.

In [27]:
test_num = [1, 10, 100, 10000, 1000000, 6991908, 13983816]

for number in test_num:
    multi_ticket_probability(number)

Your chances to win the big prize with the one ticket
are 0.000007% or 1 in 13,983,816.

Your chances to win the big prize with the 10 number of tickets
are 0.000072% or 10 in 13,983,816.

Your chances to win the big prize with the 100 number of tickets
are 0.000715% or 100 in 13,983,816.

Your chances to win the big prize with the 10,000 number of tickets
are 0.071511% or 10,000 in 13,983,816.

Your chances to win the big prize with the 1,000,000 number of tickets
are 7.151124% or 1,000,000 in 13,983,816.

Your chances to win the big prize with the 6,991,908 number of tickets
are 50.000000% or 6,991,908 in 13,983,816.

Your chances to win the big prize with the 13,983,816 number of tickets
are 100.000000% or 13,983,816 in 13,983,816.



Everything's exactly as planned. With 13,983,816 different tickets, a user is guaranteed to win.
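
The relationship can also be read in reverse: how many different tickets would it take to reach a given chance of winning? The sketch below is one way to answer that. The helper `tickets_needed` is hypothetical, and it uses `fractions.Fraction` to avoid floating-point rounding (it assumes Python 3.8+ for `math.comb`).

```python
import math
from fractions import Fraction

TOTAL_COMBINATIONS = math.comb(49, 6)  # 13,983,816

def tickets_needed(target_probability):
    """Hypothetical helper: fewest distinct tickets giving at least the target chance."""
    target = Fraction(target_probability)
    if not 0 <= target <= 1:
        raise ValueError("target_probability must be between 0 and 1")
    # Each distinct ticket adds 1 / TOTAL_COMBINATIONS to the chance of winning
    return math.ceil(target * TOTAL_COMBINATIONS)

print(tickets_needed(Fraction(1, 2)))    # 6991908
print(tickets_needed(Fraction(1, 100)))  # 139839
print(tickets_needed(1))                 # 13983816
```

Even a 1% chance requires well over a hundred thousand different tickets, which may be a more striking figure for users than the percentage alone.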

#### Small prizes

In most 6/49 lotteries there are smaller prizes if a player's ticket matches 2, 3, 4, or 5 of the 6 numbers drawn. So users might be interested in knowing the probability of having fewer than 6 winning numbers.

For that purpose we'll write another function.

In [57]:
def probability_less_6(ticket=[1,2,3,4,5,6], winning_number=5):
    '''
    Calculate the probability of having exactly winning_number winning numbers
    
    ticket - list
    winning_number - int between 2 and 5
    '''
    k = len(ticket)
    n = 49
    # Ways to pick the matching numbers (winning_number out of the 6 drawn)
    win_num_comb = combinations(k, winning_number)
    # Ways to pick the remaining numbers from the 43 that were not drawn
    add_win_comb = combinations(43, k-winning_number)
    n_win = win_num_comb * add_win_comb
    # Total combinations
    comb = combinations(n, k)
    prob = n_win / comb
    
    print('''Your chances to win the small prize with the {} numbers and
{} winning numbers are {:%} or 1 in {:,}.
'''.format(ticket, int(winning_number), prob, round(comb/n_win)))
    
    return None

Let's test this function with different numbers of matches.

In [58]:
winning = [2, 3, 4, 5]
for number in winning:
    probability_less_6(winning_number=number)

Your chances to win the small prize with the [1, 2, 3, 4, 5, 6] numbers and
2 winning numbers are 13.237803% or 1 in 8.

Your chances to win the small prize with the [1, 2, 3, 4, 5, 6] numbers and
3 winning numbers are 1.765040% or 1 in 57.

Your chances to win the small prize with the [1, 2, 3, 4, 5, 6] numbers and
4 winning numbers are 0.096862% or 1 in 1,032.

Your chances to win the small prize with the [1, 2, 3, 4, 5, 6] numbers and
5 winning numbers are 0.001845% or 1 in 54,201.
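
The function above gives the probability of matching exactly a given number of winning numbers. The chance of matching at least that many can be obtained by summing the exact probabilities up to a full match. A minimal sketch follows. The names `prob_exactly` and `prob_at_least` are hypothetical, and `math.comb` (Python 3.8+) stands in for the notebook's `combinations`.

```python
import math

def prob_exactly(matches, pool=49, picks=6):
    """Probability that exactly `matches` of the ticket's numbers are drawn."""
    # Choose which ticket numbers match, then fill the rest from undrawn numbers
    ways = math.comb(picks, matches) * math.comb(pool - picks, picks - matches)
    return ways / math.comb(pool, picks)

def prob_at_least(matches, pool=49, picks=6):
    """Probability of `matches` or more winning numbers on one ticket."""
    return sum(prob_exactly(m, pool, picks) for m in range(matches, picks + 1))

for m in (5, 4, 3, 2):
    print(f"at least {m}: {prob_at_least(m):.6%}")
```

For example, "at least 5" adds the single jackpot combination to the 258 combinations with exactly five matches, giving 259 in 13,983,816.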



### Conclusion

For the first version of the app, we added four main features:
* show the probability of winning the big prize with a single ticket
* check whether a certain combination has occurred in the Canada lottery
* show the probability for any number of tickets between 1 and 13,983,816
* show the probability of having exactly 2, 3, 4 or 5 winning numbers

We hope this will help the medical institute prevent and treat gambling addictions in the future.