# Lottery Addiction mobile app
## Introduction
Many people start playing the lottery for fun, but for some the activity turns into a habit that eventually escalates into addiction. Like other compulsive gamblers, lottery addicts soon begin spending their savings and taking out loans, accumulate debts, and eventually resort to desperate behaviors such as theft.
## Goal
In this project, we will help develop a mobile app meant to help lottery addicts better estimate their chances of winning.
Our stakeholders want to focus on the 6/49 lottery and answer these questions:
- What is the probability of winning the big prize with a single ticket?
- What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
- What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?
## Data description
We will be using a 6/49 lottery dataset from Kaggle with 3,665 drawings, dating from 1982 to 2018. The dataset can be found here:
https://www.kaggle.com/datascienceai/lottery-dataset
Launched in 1982, the 6/49 lottery is very popular in Canada. As the name implies, six numbers are drawn from a set of 49 numbers that range from 1 to 49. A player wins the big prize only if the six numbers on their ticket match all six numbers drawn. For example, a player holding a ticket with the numbers {13, 22, 24, 27, 42, 44} wins the big prize only if the numbers drawn are {13, 22, 24, 27, 42, 44}. If even one number differs, they don't win. The drawing is done without replacement, which means that once a number is drawn, it is not put back in the set.
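
Drawing without replacement can be illustrated with a short simulation. This is only a sketch and not part of the dataset or the app: `simulate_draw` is a hypothetical helper name, and it relies on Python's `random.sample`, which never picks the same element twice.

```python
import random

def simulate_draw(rng=random):
    # Hypothetical helper: pick 6 distinct numbers from 1..49 (no replacement)
    return sorted(rng.sample(range(1, 50), 6))

draw = simulate_draw()
print(draw)
```

Because the order of the drawn numbers does not matter, the result is sorted only for readability.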

## What is the probability of winning the big prize with a single ticket?
Let's start by writing a few functions:
- factorial(), which takes a number n and computes its factorial
- combinations(), which takes two inputs (n and k) and outputs the number of combinations when taking k objects from a group of n objects
- one_ticket_probability(), which takes a list of six unique numbers and prints the probability of winning the big prize
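
For reference, the counting behind these functions is the standard combinations formula. With $n = 49$ and $k = 6$, it gives the number of possible tickets and therefore the chance that any single ticket wins:

$$
C(n, k) = \frac{n!}{k!\,(n-k)!}, \qquad C(49, 6) = \frac{49!}{6!\,43!} = 13{,}983{,}816, \qquad P(\text{big prize}) = \frac{1}{C(49, 6)} \approx 7.15 \times 10^{-8}
$$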

In [1]:
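def factorial(n):
    # Multiply n * (n-1) * ... * 2; returns 1 when n is 0 or 1
    f = 1
    for i in range(n, 1, -1):
        f *= i
    return f

def combinations(n, k):
    # Number of ways to choose k objects from n; the division is exact, so use //
    return factorial(n) // (factorial(k) * factorial(n - k))

def one_ticket_probability(l):
    # Every ticket is equally likely, so the numbers in l do not change the result
    vic_per = 100 * (1 / combinations(49, 6))
    print("Your winning chance by percentage is {:.7f}%".format(vic_per))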
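
As a cross-check on the hand-written helpers, the same count can be obtained from the standard library. This sketch assumes Python 3.8 or later, where `math.comb` is available; the variable names are illustrative only.

```python
import math

# Number of distinct 6-number tickets that can be formed from 49 numbers
total_tickets = math.comb(49, 6)

# Chance of matching all six numbers with one ticket, as a percentage
single_ticket_pct = 100 / total_tickets

print("Possible tickets: {:,}".format(total_tickets))
print("Chance with one ticket: {:.7f}%".format(single_ticket_pct))
```

The result should agree with combinations(49, 6), which makes `math.comb` a convenient way to test the custom functions.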

## Compare ticket against historical lottery data
We will import the historical Canadian lottery dataset, which has 3,665 rows. The six columns that hold the winning numbers are:
- NUMBER DRAWN 1
- NUMBER DRAWN 2
- NUMBER DRAWN 3
- NUMBER DRAWN 4
- NUMBER DRAWN 5
- NUMBER DRAWN 6

There are also columns such as the draw number and date, but let's not use them for now. We will write check_historical_occurence() to compare a user's numbers against the historical record:
- Write extract_numbers(), which takes a row of the lottery dataframe and returns a set containing the six winning numbers.
- Add a new column holding the winning set to the lottery dataframe.
- Write check_historical_occurence(), which takes two inputs (a Python list containing the user's numbers and a pandas Series containing the sets of winning numbers) and prints how many times that combination has occurred.

In [4]:
import pandas as pd
lh = pd.read_csv('649.csv')
lh.head(3)

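| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |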


In [21]:
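def extract_numbers(row):
    # Positions 4-9 hold NUMBER DRAWN 1 through NUMBER DRAWN 6
    s = set(row.iloc[4:10])
    return s
lh['win_num'] = lh.apply(extract_numbers, axis=1)

def check_historical_occurence(l, s):
    # Compare the user's numbers, as a set, against each historical winning set
    l = set(l)
    occ = s.apply(lambda win: win == l).sum()
    print("This combination appeared {} time(s) in history".format(occ))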
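
The comparison logic can be sketched without pandas, using only the three draws shown in the dataset preview. `count_occurrences` is a hypothetical helper introduced here for illustration; it is not part of the notebook.

```python
# The three draws shown in the dataset preview, stored as sets
past_draws = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
]

def count_occurrences(user_numbers, draws):
    # Hypothetical helper: sets ignore order, so [43, ..., 3] matches {3, ..., 43}
    user_set = set(user_numbers)
    return sum(1 for winning in draws if winning == user_set)

print(count_occurrences([43, 41, 14, 12, 11, 3], past_draws))   # 1
print(count_occurrences([13, 22, 24, 27, 42, 44], past_draws))  # 0
```

Converting the user's list to a set is what makes the comparison independent of the order in which the numbers were entered.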

## What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
Lottery addicts usually play more than one ticket on a single drawing, thinking that this might increase their chances of winning significantly. We will help them better estimate their chances of winning with a function multi_ticket_probability() that will take an arbitrary number and allow the users to calculate the chances of winning for that number of tickets.

In [9]:
def multi_ticket_probability(n):
    vic_per = (n * 100) / combinations(49, 6)
    print("Your winning chance is {:.7f}%".format(vic_per))
# Chances of winning with multiple tickets, from 1 to 13,983,816 (the maximum number of different tickets)
for i in [1, 10, 100, 10000, 1000000, 6991908, 13983816]:
    multi_ticket_probability(i)

Your winning chance is 0.0000072%
Your winning chance is 0.0000715%
Your winning chance is 0.0007151%
Your winning chance is 0.0715112%
Your winning chance is 7.1511238%
Your winning chance is 50.0000000%
Your winning chance is 100.0000000%


At $3 per ticket, you would have to spend a whopping $41,951,448 to guarantee winning the $1,000,000 prize.
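
The arithmetic behind this figure can be sketched as follows. The ticket price and prize amount are the ones quoted in the text. The net result is a simplification that ignores smaller prizes and the possibility of a shared jackpot.

```python
from math import comb

TICKET_PRICE = 3       # dollars per ticket, as quoted in the text
PRIZE = 1_000_000      # big prize in dollars, as quoted in the text

# Buying every possible ticket is the only way to guarantee the big prize
total_tickets = comb(49, 6)
cost_to_cover_all = TICKET_PRICE * total_tickets

# Simplified net result: ignores smaller prizes and shared jackpots
net_result = PRIZE - cost_to_cover_all

print("Cost of every ticket: ${:,}".format(cost_to_cover_all))
print("Net result: ${:,}".format(net_result))
```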

## What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?
In most 6/49 lotteries there are smaller prizes if a player's ticket matches two, three, four, or five of the six numbers drawn. As a consequence, users might be interested in knowing the probability of having exactly two, three, four, or five winning numbers.

In [16]:
def probability_less_6(n):
    # Ways to pick n of the 6 winning numbers
    pos_com = combinations(6, n)
    # Ways to fill the remaining 6 - n slots from the 43 non-winning numbers
    rem_com = combinations(43, 6 - n)
    suc_com = pos_com * rem_com
    tot_com = combinations(49, 6)
    prob = round((100 * suc_com / tot_com), 5)
    print("Your chance of having exactly {} winning numbers is {:.7f}%".format(n, prob))
for i in [2, 3, 4, 5]:
    probability_less_6(i)

Your chance of having exactly 2 winning numbers is 13.2378000%
Your chance of having exactly 3 winning numbers is 1.7650400%
Your chance of having exactly 4 winning numbers is 0.0968600%
Your chance of having exactly 5 winning numbers is 0.0018400%
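
The section heading asks about at least a given number of winning numbers, while the function above counts tickets with exactly that many. One possible way to get the cumulative figure is to sum the exact cases from k up to 6. The names `favorable_at_least` and `probability_at_least` are hypothetical, and the sketch assumes Python 3.8+ for `math.comb`.

```python
from math import comb

def favorable_at_least(k):
    # Tickets with exactly j winning numbers: C(6, j) * C(43, 6 - j); sum j = k..6
    return sum(comb(6, j) * comb(43, 6 - j) for j in range(k, 7))

def probability_at_least(k):
    # Percentage chance of k or more winning numbers on one ticket
    return 100 * favorable_at_least(k) / comb(49, 6)

for k in [2, 3, 4, 5]:
    print("At least {} winning numbers: {:.5f}%".format(k, probability_at_least(k)))
```

For example, "at least five" covers the 258 tickets with exactly five winning numbers plus the single ticket with all six, so 259 favorable tickets in total.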


## Conclusion
After completing this project, one thing is clear: it is not wise to invest your money in the 6/49 lottery. Perhaps that is why there is no such app on the market yet.