# Mobile App for Lottery Addiction

Many people start playing the lottery for fun, but for some this activity turns into a habit which eventually escalates into addiction. Like other compulsive gamblers, lottery addicts soon begin spending from their savings and loans, they start to accumulate debts, and eventually engage in desperate behaviors like theft.

A medical institute that aims to prevent and treat gambling addictions wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. The institute has a team of engineers that will build the app, but they need us to create the logical core of the app and calculate probabilities.

For the first version of the app, they want us to focus on the 6/49 lottery and build functions that enable users to answer questions like:

- What is the probability of winning the big prize with a single ticket?
- What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
- What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

The institute also wants us to consider historical data coming from the national 6/49 lottery game in Canada. The data set has data for 3,665 drawings, dating from 1982 to 2018 (we'll come back to this).

During this project we will need to compute many factorials and combinations in order to calculate probabilities. Let's define a function for each.

In [67]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import re 
import seaborn as sns

In [7]:
def factorial(n):
    result = 1
    for i in range(1,n+1):
        result *= i
    return result

In [11]:
def combinations(n, k):
    # Integer division keeps the count exact and avoids float overflow for large n
    return factorial(n) // (factorial(k) * factorial(n - k))
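
As a quick sanity check, the hand-written helpers can be compared against Python's standard library. The sketch below assumes Python 3.8 or later (for `math.comb`) and redefines both helpers using integer division so that the result stays an exact integer; that variant is an assumption of this example.

```python
import math

def factorial(n):
    """Return n! for a non-negative integer n."""
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

def combinations(n, k):
    """Return the number of ways to choose k items from n, as an exact integer."""
    return factorial(n) // (factorial(k) * factorial(n - k))

# Cross-check a few values against the standard library (math.comb needs Python 3.8+)
for n, k in [(49, 6), (6, 2), (43, 4)]:
    assert combinations(n, k) == math.comb(n, k)

print(f"{combinations(49, 6):,}")  # 13,983,816
```

In production code `math.comb` could replace both helpers, but the explicit versions make the arithmetic easier to follow.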

In the 6/49 lottery, six numbers are drawn from a set of 49 numbers that range from 1 to 49. A player wins the big prize if the six numbers on their ticket match all six numbers drawn. A player holding a ticket with the numbers {13, 22, 24, 27, 42, 44} wins the big prize only if the numbers drawn are {13, 22, 24, 27, 42, 44}. If even one number differs, they don't win.
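
Since the order of the numbers does not matter and every six-number combination is equally likely, the single-ticket probability can be written out directly, with $\binom{49}{6}$ denoting the number of possible combinations:

$$
P(\text{big prize}) = \frac{1}{\binom{49}{6}} = \frac{1}{13{,}983{,}816} \approx 0.0000072\,\%
$$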

We talked with the engineering team of the medical institute, and they told us to be aware of the following details when writing the function:

- Inside the app, the user inputs six different numbers from 1 to 49.
- Under the hood, the six numbers will come as a Python list, which will serve as the single input to our function.
- The engineering team wants the function to print the probability value in a friendly way — in a way that people without any probability training are able to understand.

In [165]:
def bigprize(numbers):
    # Validate the ticket: six unique numbers, each between 1 and 49
    if len(numbers) != 6:
        return "You must give 6 numbers"
    if len(set(numbers)) != 6:
        return "You must specify 6 unique numbers"
    if any(i < 1 or i > 49 for i in numbers):
        return "You must specify values between 1 and 49"
    # Every combination is equally likely, so one ticket has 1 chance in C(49, 6)
    total_combinations = int(combinations(49, 6))
    probability = 100 / total_combinations
    print("You have 1 chance out of {:,} to win. This corresponds to {:.7f} % chance of winning".format(total_combinations, probability))


In [166]:
# Example
bigprize([2,3,4,5,6,9])

You have 1 chance out of 13,983,816 to win. This corresponds to 0.0000072 % chance of winning


We wrote a function that tells users the probability of winning the big prize with a single ticket. For the first version of the app, however, users should also be able to compare their ticket against the historical lottery data in Canada and determine whether they would ever have won by now. Let's explore the historical data from the Canada 6/49 lottery. The data set can be downloaded from [Kaggle](https://www.kaggle.com/datascienceai/lottery-dataset).

In [71]:
lottery = pd.read_csv("649.csv")
lottery.head(2) 

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |


In [74]:
lottery.tail(2)

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


### Historical Data Check for Canada Lottery

Let's write a function that will enable users to compare their ticket against the historical lottery data in Canada (1982-2018) and determine whether they would ever have won by now. That should show them how slim their chances of winning really are.

The engineering team wants us to write a function that prints:

- the number of times the combination selected occurred; and
- the probability of winning the big prize in the next drawing with that combination.

In [167]:
# First, extract the six winning numbers of each drawing since 1982 as a set
def extract_numbers(row):
    # Positions 4 to 9 hold the columns NUMBER DRAWN 1 to NUMBER DRAWN 6
    return set(row.iloc[4:10])

# Then apply it row by row to the lottery DataFrame to get a Series of winning sets
all_winning_numbers = lottery.apply(extract_numbers, axis=1)

In [172]:
def check_historical(numbers):
    # Validate the ticket: six unique numbers, each between 1 and 49
    if len(numbers) != 6:
        return "You must give 6 numbers"
    if len(set(numbers)) != 6:
        return "You must specify 6 unique numbers"
    if any(i < 1 or i > 49 for i in numbers):
        return "You must specify values between 1 and 49"
    # Count the past drawings whose six numbers match the ticket exactly
    ticket = set(numbers)
    count = sum(1 for winning in all_winning_numbers if winning == ticket)
    total_combinations = int(combinations(49, 6))
    probability = 100 / total_combinations
    print("Since 1982, if you had played your numbers at every single 649 Lotto, you would have won", count, "times, congrats! Actually you have 1 chance out of {:,} to win. This corresponds to {:.7f} % chance of winning.".format(total_combinations, probability))


In [173]:
# Let's check that it works with the first winning numbers
check_historical([3, 11, 12, 14, 41, 43])

Since 1982, if you had played your numbers at every single 649 Lotto, you would have won 1 times, congrats! Actually you have 1 chance out of 13,983,816 to win. This corresponds to 0.0000072 % chance of winning.
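
Scanning every past drawing on each call is fine for 3,665 rows, but the lookup could also be precomputed once. The following is a minimal sketch of that idea using only the standard library. It assumes the draws are available as plain tuples; `sample_draws` contains only the four drawings visible in the `head` and `tail` output, and the names `build_index` and `times_drawn` are hypothetical.

```python
from collections import Counter

# Only the four draws visible in the head/tail output; not the full data set
sample_draws = [
    (3, 11, 12, 14, 41, 43),
    (8, 33, 36, 37, 39, 41),
    (2, 15, 21, 31, 38, 49),
    (14, 24, 31, 35, 37, 48),
]

def build_index(draws):
    """Count how often each combination was drawn, ignoring number order."""
    return Counter(frozenset(draw) for draw in draws)

def times_drawn(index, numbers):
    """Return how many past draws match the given numbers exactly."""
    return index[frozenset(numbers)]

index = build_index(sample_draws)
print(times_drawn(index, [43, 41, 14, 12, 11, 3]))  # 1
print(times_drawn(index, [1, 2, 3, 4, 5, 6]))       # 0
```

Using `frozenset` as the key makes the comparison independent of the order in which the user enters the numbers, and a `Counter` returns 0 for combinations that never came up.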


Lottery addicts usually play more than one ticket on a single drawing, thinking that this might increase their chances of winning significantly. Our purpose is to help them better estimate their chances of winning, so next we're going to write a function that lets users calculate the chance of winning for any number of different tickets.

In [191]:
def several_tickets(n):
    # n different tickets cover n of the C(49, 6) equally likely combinations
    total_combinations = combinations(49, 6)
    proba_to_win = (n / total_combinations) * 100
    print("You think buying", n, "tickets will help you? You have now", '{0:.7f}'.format(proba_to_win), "% chance of winning, not so much right? It's actually 1 chance out of", round(total_combinations / n))

Let's see whether buying a lot of tickets gives you a real chance of winning the jackpot!

In [195]:
for i in [1, 10, 100, 10000, 1000000, 6991908, 13983816]:
    several_tickets(i)
    print('------------------------')

You think buying 1 tickets will help you? You have now 0.0000072 % chance of winning, not so much right? It's actually 1 chance out of 13983816
------------------------
You think buying 10 tickets will help you? You have now 0.0000715 % chance of winning, not so much right? It's actually 1 chance out of 1398382
------------------------
You think buying 100 tickets will help you? You have now 0.0007151 % chance of winning, not so much right? It's actually 1 chance out of 139838
------------------------
You think buying 10000 tickets will help you? You have now 0.0715112 % chance of winning, not so much right? It's actually 1 chance out of 1398
------------------------
You think buying 1000000 tickets will help you? You have now 7.1511238 % chance of winning, not so much right? It's actually 1 chance out of 14
------------------------
You think buying 6991908 tickets will help you? You have now 50.0000000 % chance of winning, not so much right? It's actually 1 chance out of 2
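------------------------
You think buying 13983816 tickets will help you? You have now 100.0000000 % chance of winning, not so much right? It's actually 1 chance out of 1
------------------------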
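
The same relationship, probability = n / C(49, 6), can be turned around to ask how many different tickets a target probability would require. This is a rough sketch under the assumption that every ticket carries a distinct combination; `tickets_needed` is a hypothetical helper, and `math.comb` requires Python 3.8 or later.

```python
import math

TOTAL_COMBINATIONS = math.comb(49, 6)  # number of possible 6/49 tickets

def tickets_needed(target_probability):
    """Smallest number of distinct tickets whose win probability reaches the target.

    target_probability is a fraction in (0, 1], e.g. 0.5 for 50 %.
    """
    if not 0 < target_probability <= 1:
        raise ValueError("target_probability must be in (0, 1]")
    return math.ceil(target_probability * TOTAL_COMBINATIONS)

for target in (0.01, 0.5, 1.0):
    print(f"{target:.0%} chance needs {tickets_needed(target):,} tickets")
```

Even a 1 % chance requires well over a hundred thousand different tickets, which may be a more striking way to present the numbers to users.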

### Function for Fewer Winning Numbers

In most 6/49 lotteries there are smaller prizes if a player's ticket matches two, three, four, or five of the six numbers drawn. As a consequence, users might be interested in knowing the probability of having two, three, four, or five winning numbers.

These are the engineering details we'll need to be aware of:

Inside the app, the user inputs:

- six different numbers from 1 to 49; and
- an integer between 2 and 5 that represents the number of winning numbers expected

Our function prints information about the probability of having the inputted number of winning numbers.
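
The count behind this probability can be written out explicitly. A ticket with exactly $n$ winning numbers contains $n$ of the six drawn numbers and $6 - n$ of the 43 numbers that were not drawn, so:

$$
P(\text{exactly } n) = \frac{\binom{6}{n}\,\binom{43}{6-n}}{\binom{49}{6}}
$$

For $n = 5$, for instance, this gives $\frac{6 \cdot 43}{13{,}983{,}816} = \frac{258}{13{,}983{,}816} \approx 0.0018450\,\%$.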

In [251]:
def smaller_prices(n):
    if (n < 2) or (n > 5):
        return "n must be an integer between 2 and 5!"
    total_combinations = combinations(49, 6)
    # Ways to pick n of the 6 winning numbers
    n_combinations = combinations(6, n)
    # Ways to fill the remaining 6 - n slots from the 43 non-winning numbers
    combinations_left = combinations(43, 6 - n)
    proba = (n_combinations * combinations_left / total_combinations) * 100
    print("You have", '{0:.7f}'.format(proba), "% chance of having {}".format(n), "winning numbers")


Now let's try our new function:

In [253]:
for n in [2, 3, 4, 5]:
    smaller_prices(n)
    print('--------------------------') 

You have 13.2378029 % chance of having 2 winning numbers
--------------------------
You have 1.7650404 % chance of having 3 winning numbers
--------------------------
You have 0.0968620 % chance of having 4 winning numbers
--------------------------
You have 0.0018450 % chance of having 5 winning numbers
--------------------------
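
The opening list of questions asks about having *at least* a given number of winning numbers, whereas `smaller_prices` reports the probability of exactly `n`. One way to get the cumulative figure is to sum the exact probabilities from `k` up to 6. The sketch below assumes Python 3.8 or later for `math.comb`; `prob_exactly` and `prob_at_least` are hypothetical names, not part of the interface agreed with the engineering team.

```python
from fractions import Fraction
import math

def prob_exactly(k):
    """Probability that a single ticket matches exactly k of the 6 drawn numbers."""
    return Fraction(math.comb(6, k) * math.comb(43, 6 - k), math.comb(49, 6))

def prob_at_least(k):
    """Probability that a single ticket matches k or more of the 6 drawn numbers."""
    return sum(prob_exactly(j) for j in range(k, 7))

for k in (2, 3, 4, 5):
    print(f"At least {k}: {float(prob_at_least(k)) * 100:.7f} %")
```

Working with `Fraction` keeps the sums exact until the final conversion to a percentage. For five matches the difference from the exact-match figure is tiny, because the only extra case is the jackpot itself.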
