# Mobile App for Lottery Addiction

## Introduction

This guided project demonstrates:
* How to calculate theoretical and empirical probabilities
* How to use probability rules to solve probability problems
* How to use combinations and permutations


The purpose of this project is to calculate the probabilities of winning different lottery prizes and to inform gambling addicts of their low chances. These probabilities will be integrated into a new dedicated mobile app that helps lottery addicts better estimate their chances of winning.

We can answer questions like:
* What is the probability of winning the big prize with a single ticket?
* What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
* What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

The [dataset](https://www.kaggle.com/datascienceai/lottery-dataset) comes from a national 6/49 lottery game in Canada and covers 3,665 drawings, dating from 1982 to 2018.

## Calculating factorials and combinations

In [42]:
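def factorial(n):
    # Multiply n * (n-1) * ... * 1; returns 1 when n is 0
    product = 1
    for i in range(n, 0, -1):
        product *= i
    return product

def combinations(n, k):
    # Number of ways to choose k items from n when order does not matter
    numerator = factorial(n)
    denominator = factorial(k) * factorial(n - k)
    # The result is always a whole number, so integer division keeps it exact
    return numerator // denominator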
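
As a quick sanity check, the hand-written helpers can be compared against the standard library. The sketch below restates the two functions so that it runs on its own, and assumes Python 3.8 or later, where `math.comb` is available.

```python
import math

def factorial(n):
    product = 1
    for i in range(n, 0, -1):
        product *= i
    return product

def combinations(n, k):
    return factorial(n) // (factorial(k) * factorial(n - k))

# Both helpers should agree with the standard library
print(factorial(6) == math.factorial(6))
print(combinations(49, 6) == math.comb(49, 6))
print(f"{combinations(49, 6):,}")
```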

In the 6/49 game, 6 numbers are drawn from a set of 49 numbers that range from 1 to 49. A player must match all 6 numbers exactly to win.

Let's calculate the probability of winning this big prize using a function.

In [45]:
def one_ticket_probability(user_numbers):
    tot_outcomes = combinations(49,6)
    percentage = 1/tot_outcomes * 100
    print('''The chance of you winning this lottery with the numbers {} is {:.7f}%. 
In other words, you have a 1 in {:,} chance of winning.'''.format(user_numbers,percentage,int(tot_outcomes)))

In [48]:
#Let's test a few sets of numbers
one_ticket_probability([1,2,3,4,5,6])

The chance of you winning this lottery with the numbers [1, 2, 3, 4, 5, 6] is 0.0000072%. 
In other words, you have a 1 in 13,983,816 chance of winning.


In [49]:
one_ticket_probability([5,13,35,36,39,68])

The chance of you winning this lottery with the numbers [5, 13, 35, 36, 39, 68] is 0.0000072%. 
In other words, you have a 1 in 13,983,816 chance of winning.


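Unsurprisingly, the probability is the same no matter which numbers are chosen. Note that the function never inspects its input: the second test includes 68, which is not a valid number in a 6/49 game, and the output is unchanged. Users will be able to see that they have a 1 in roughly 14 million chance of winning the big prize. Not great odds!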
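
The second test above passes 68, which lies outside the 1 to 49 range, and `one_ticket_probability` accepts it without complaint. A minimal sketch of an input check the app could run first is shown below. The helper name `validate_ticket` and its parameters are hypothetical and not part of the original project.

```python
def validate_ticket(numbers, pool_size=49, picks=6):
    """Return a list of problems with a ticket; an empty list means it is valid.

    Hypothetical helper: the name and the default rules (six distinct
    numbers from 1 to 49) are assumptions based on the 6/49 game description.
    """
    problems = []
    if len(numbers) != picks:
        problems.append(f"expected {picks} numbers, got {len(numbers)}")
    if len(set(numbers)) != len(numbers):
        problems.append("numbers must be distinct")
    out_of_range = [n for n in numbers if not 1 <= n <= pool_size]
    if out_of_range:
        problems.append(f"out of range 1-{pool_size}: {out_of_range}")
    return problems

print(validate_ticket([1, 2, 3, 4, 5, 6]))
print(validate_ticket([5, 13, 35, 36, 39, 68]))
```

Returning a list of problems rather than raising an exception lets the app show every issue to the user at once.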

## Historical Lottery Data

Now, let's create some functionality that allows users to compare their chosen numbers against historical lottery data in Canada and determine whether they would have ever won by now. 

In [50]:
import pandas as pd
lotto = pd.read_csv('649.csv')
lotto.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3665 entries, 0 to 3664
Data columns (total 11 columns):
PRODUCT            3665 non-null int64
DRAW NUMBER        3665 non-null int64
SEQUENCE NUMBER    3665 non-null int64
DRAW DATE          3665 non-null object
NUMBER DRAWN 1     3665 non-null int64
NUMBER DRAWN 2     3665 non-null int64
NUMBER DRAWN 3     3665 non-null int64
NUMBER DRAWN 4     3665 non-null int64
NUMBER DRAWN 5     3665 non-null int64
NUMBER DRAWN 6     3665 non-null int64
BONUS NUMBER       3665 non-null int64
dtypes: int64(10), object(1)
memory usage: 315.0+ KB


In [53]:
lotto.head(3)

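| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |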


In [54]:
lotto.tail(3)

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


We've been instructed to write a function that prints the number of times the selected combination has occurred in the Canadian dataset, along with the probability of winning the big prize in the next drawing with that combination.

In [61]:
# First write a function to extract the winning numbers from the dataset
def extract_numbers(row):
    row=row.iloc[4:10]  # positions 4-9 hold NUMBER DRAWN 1 through NUMBER DRAWN 6
    numbers=set(row.values)
    return numbers

winning_numbers=lotto.apply(extract_numbers,axis=1)
winning_numbers.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

In [78]:
# Next, write a function that compares a set of chosen numbers
# to the set of historical winning numbers
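def check_historical_occurence(user_numbers, series):
    # Avoid naming the parameter `list`, which would shadow the built-in
    chosen_numbers=set(user_numbers)
    # Compare against the series passed in, not the global winning_numbers
    check_match = series.apply(lambda drawn: drawn == chosen_numbers)
    n_wins=check_match.sum()
    print('''The number of times that the numbers {} has won the lottery is {}.
The probability of winning the big prize with that combination is 1 in 13,983,816.'''.format(user_numbers,n_wins))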

In [79]:
# Let's test it
check_historical_occurence([33, 36, 37, 39, 8, 41],winning_numbers)

The number of times that the numbers [33, 36, 37, 39, 8, 41] has won the lottery is 1.
The probability of winning the big prize with that combination is 1 in 13,983,816.


We can hard code the chances of winning the big prize because every combination of six numbers is equally likely, so the odds are always 1 in 13,983,816 regardless of past results.
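
The same lookup can be sketched without pandas, which may be convenient inside a mobile app. The example below uses only the first three draws shown by `lotto.head(3)` and counts each combination once up front. The name `count_historical_wins` is hypothetical, and the choice of `frozenset` keys in a `Counter` is an illustration rather than the project's method.

```python
from collections import Counter

# First three draws as shown by lotto.head(3); the full dataset has 3,665 rows
draws = [
    [3, 11, 12, 14, 41, 43],
    [8, 33, 36, 37, 39, 41],
    [1, 6, 23, 24, 27, 39],
]

# frozenset is hashable and ignores order, so it works as a Counter key
draw_counts = Counter(frozenset(draw) for draw in draws)

def count_historical_wins(chosen, counts):
    """Return how many past draws match the chosen numbers exactly."""
    return counts[frozenset(chosen)]  # Counter returns 0 for unseen keys

print(count_historical_wins([33, 36, 37, 39, 8, 41], draw_counts))
print(count_historical_wins([1, 2, 3, 4, 5, 6], draw_counts))
```

Building the counter once makes each later lookup a single dictionary access instead of a scan over every draw.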

### What about multiple tickets?
Lottery players usually play more than one ticket on a single drawing, thinking that this will drastically increase their odds.

We will now make a function that calculates the probability of winning the big prize given the *number* of tickets played.

In [108]:
def multi_ticket_probability(n_tickets):
    odds = n_tickets / combinations(49,6)
    percentage = odds * 100
    multi_ticket_odds = round(combinations(49,6)/n_tickets)
    print('''If you buy {} tickets, you will have a {:.7f}% chance of winning the big prize. \n
In other words, your odds to win are 1 in {:,}.'''
.format(n_tickets,percentage,int(multi_ticket_odds)))

In [109]:
multi_ticket_probability(1)

If you buy 1 tickets, you will have a 0.0000072% chance of winning the big prize. 

In other words, your odds to win are 1 in 13,983,816.


In [110]:
multi_ticket_probability(10)

If you buy 10 tickets, you will have a 0.0000715% chance of winning the big prize. 

In other words, your odds to win are 1 in 1,398,382.


In [111]:
multi_ticket_probability(100)

If you buy 100 tickets, you will have a 0.0007151% chance of winning the big prize. 

In other words, your odds to win are 1 in 139,838.


In [114]:
multi_ticket_probability(10000)

If you buy 10000 tickets, you will have a 0.0715112% chance of winning the big prize. 

In other words, your odds to win are 1 in 1,398.


In [115]:
multi_ticket_probability(1000000)

If you buy 1000000 tickets, you will have a 7.1511238% chance of winning the big prize. 

In other words, your odds to win are 1 in 14.


In [116]:
multi_ticket_probability(6991908)

If you buy 6991908 tickets, you will have a 50.0000000% chance of winning the big prize. 

In other words, your odds to win are 1 in 2.


In [117]:
multi_ticket_probability(13983816)

If you buy 13983816 tickets, you will have a 100.0000000% chance of winning the big prize. 

In other words, your odds to win are 1 in 1.


Even 10,000 tickets give you less than a 0.1% chance of winning, and 1,000,000 tickets still leave you under 10%.

Buying 6,991,908 different tickets gives you a 50% chance to win.
You would need to buy all 13,983,816 possible combinations to guarantee a win!
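
The relationship can also be inverted to ask how many tickets are needed to reach a target probability. The sketch below assumes that every ticket carries a different combination and is played in the same drawing. The function name `tickets_needed` is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # 13,983,816 possible tickets

def tickets_needed(target_probability):
    """Smallest number of distinct tickets whose win probability meets the target."""
    if not 0 < target_probability <= 1:
        raise ValueError("target_probability must be in the interval (0, 1]")
    # P(win) = n_tickets / TOTAL_OUTCOMES, so solve for n_tickets and round up
    return math.ceil(target_probability * TOTAL_OUTCOMES)

print(f"{tickets_needed(0.5):,}")
print(f"{tickets_needed(1.0):,}")
```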

## Fewer winning numbers?

In most lotteries, there are smaller prizes if a player's ticket matches 2, 3, 4, or 5 of the 6 numbers drawn.
Let's see how likely each of those outcomes is.

Here we're going to write a function that allows the user to calculate probabilities for 2, 3, 4, or 5 winning numbers.

In the app, the user will input
* a list of 6 numbers, and
* a number between 2 and 5 that represents the number of winning numbers expected

Because every ticket has the same chances, the function below takes only the second input.

Note that this addresses the probability of having *exactly* that many winning numbers (for example, exactly five), not the probability of having *at least* that many.
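
The counting argument behind the function in this section can be written out as a formula. Here $k$ is the number of matches: choose which $k$ of the ticket's 6 numbers are drawn, then fill the other $6-k$ drawn numbers from the 43 that are not on the ticket. The symbol $k$ is introduced here for illustration only. The worked case uses $k = 5$.

```latex
P(\text{exactly } k \text{ matches})
  = \frac{\binom{6}{k}\,\binom{43}{6-k}}{\binom{49}{6}},
\qquad
P(\text{exactly } 5 \text{ matches})
  = \frac{6 \times 43}{13{,}983{,}816}
  = \frac{258}{13{,}983{,}816}
  \approx 0.001845\%
```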

In [125]:
def probability_less_6(n_matching_numbers):
    n_combinations_ticket = combinations(6, n_matching_numbers)
    n_combinations_remaining = combinations(43, 6 - n_matching_numbers)
    success_outcomes = n_combinations_ticket * n_combinations_remaining
    
    total_outcomes = combinations(49,6)
    probability = success_outcomes / total_outcomes
    prob_percent = probability * 100
    
    print('''The probability of matching {} numbers is {:.5}%.'''.format(n_matching_numbers,prob_percent))

In [126]:
# Test this function on 2-5
probability_less_6(2)

The probability of matching 2 numbers is 13.238%.


In [127]:
probability_less_6(3)

The probability of matching 3 numbers is 1.765%.


In [128]:
probability_less_6(4)

The probability of matching 4 numbers is 0.096862%.


In [129]:
probability_less_6(5)

The probability of matching 5 numbers is 0.001845%.
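
The introduction also asks about having *at least* a given number of winning numbers. One way to sketch that is to add up the exact-match probabilities from that number up to 6. The function names below are hypothetical, and `fractions.Fraction` is used here only to keep the arithmetic exact. It is not part of the original project.

```python
from fractions import Fraction
from math import comb  # Python 3.8+

def probability_exactly(k, pool=49, picks=6):
    """Exact probability that a ticket matches exactly k of the drawn numbers."""
    favourable = comb(picks, k) * comb(pool - picks, picks - k)
    return Fraction(favourable, comb(pool, picks))

def probability_at_least(k, pool=49, picks=6):
    """Exact probability of matching k or more numbers: sum of the exact cases."""
    return sum(probability_exactly(j, pool, picks) for j in range(k, picks + 1))

for k in (2, 3, 4, 5):
    print(f"At least {k} matches: {float(probability_at_least(k)) * 100:.5f}%")
```

For five matches the difference is small: 259 favourable outcomes instead of 258, because the single jackpot combination is added.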


## Conclusions

That's the end of this guided project! Hopefully the probabilities showcased in this project can inform lottery players of their rather poor odds when playing.