## Project: Mobile App for Lottery Addiction

In this project we build the logical core of an app meant to dissuade lottery addicts by showing them their actual chances of winning.

The goals will be to answer the following questions:

* What is the probability of winning with a single ticket?
* What is the probability of winning with a number of tickets?
* What is the probability of having at least a certain number of winning numbers in a single ticket?

We will also be using historical data from the national 6/49 lottery game in Canada.

In [2]:
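# Helper functions used throughout the project
def factorial(n):
    # Product of all integers from 1 to n (returns 1 for n = 0)
    fact = 1
    for item in range(1, n + 1):
        fact *= item
    return fact

def combinations(n, k):
    # Number of ways to choose k items out of n; integer division keeps the count exact
    return factorial(n) // (factorial(k) * factorial(n - k))

In [3]:
def one_ticket_probability(lst):
    # The chance is the same for any six numbers, so lst does not change the result
    c_lot_nrs = combinations(49, 6)
    return 'you have a one in ' + str(c_lot_nrs) + ' chance of winning. Seems like a good investment!'
one_ticket_probability([3, 11, 12, 14, 41, 43])


'you have a one in 13983816 chance of winning. Seems like a good investment!'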

This function tells users their chance of winning the big prize with a single ticket. The chance is the same for every ticket, regardless of the numbers entered.
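As a cross-check on the hand-written helpers, the same count can be obtained from Python's standard library. This is a minimal sketch, assuming Python 3.8 or later (where `math.comb` is available); the variable names are illustrative and not part of the notebook.

```python
from math import comb

# Number of ways to choose 6 numbers out of 49 when order does not matter
total_outcomes = comb(49, 6)

# A single ticket covers exactly one of those outcomes
p_single = 1 / total_outcomes

print(total_outcomes)            # 13983816
print(f"{p_single * 100:.7f}%")  # 0.0000072%
```

The agreement with `combinations(49, 6)` suggests the helper functions above behave as intended.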

Next, we use historical data from past drawings of the Canadian national lottery.

In [4]:
# Imports
import pandas as pd
import numpy as np

In [5]:
# Reading in the dataset
can_lot = pd.read_csv('649.csv')

Let's now explore the dataset.

In [6]:
can_lot.shape

(3665, 11)

We can see that the dataset includes data from 3665 drawings.

In [7]:
print(can_lot['DRAW DATE'].head(3))
print(can_lot['DRAW DATE'].tail(3))


0    6/12/1982
1    6/19/1982
2    6/26/1982
Name: DRAW DATE, dtype: object
3662    6/13/2018
3663    6/16/2018
3664    6/20/2018
Name: DRAW DATE, dtype: object


We can see that the data starts on June 12, 1982 and ends on June 20, 2018.
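The `DRAW DATE` column has dtype `object`, so the dates are stored as plain text. The sketch below shows one way to turn such strings into real dates so they can be compared and subtracted. It uses only the standard library and two values copied from the output above; the variable names are illustrative assumptions.

```python
from datetime import datetime

# First and last draw dates as they appear in the dataset (month/day/year text)
first_raw, last_raw = "6/12/1982", "6/20/2018"

# Parse the text into date objects
first_draw = datetime.strptime(first_raw, "%m/%d/%Y").date()
last_draw = datetime.strptime(last_raw, "%m/%d/%Y").date()

# Approximate length of the period covered by the data, in years
span_years = (last_draw - first_draw).days / 365.25

print(first_draw.isoformat(), last_draw.isoformat())  # 1982-06-12 2018-06-20
print(round(span_years))                              # 36
```

Inside pandas, `pd.to_datetime` with the same format string would be the natural equivalent for the whole column.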

In [8]:
can_lot.head(3)

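| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |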


The dataset includes the columns:

* PRODUCT: The lottery the row refers to. This is always the 6/49 lottery, so we might want to drop this column.
* DRAW NUMBER: The integer key for the drawings. We could also drop this column and use the index instead.
* SEQUENCE NUMBER: Its meaning is not clear from the data; it is 0 in the rows shown above.
* DRAW DATE: The date of the drawing.
* NUMBER DRAWN 1 to NUMBER DRAWN 6: The numbers drawn as well as the sequence in which they were drawn.
* BONUS NUMBER: The bonus number of the drawing.

In [9]:
# Select the six columns that hold the drawn numbers
number_cols = can_lot.columns[can_lot.columns.str.contains('NUMBER DRAWN')]

def extract_numbers(row):
    # Collect the six numbers of one drawing into a set; order does not matter
    return {row[item] for item in number_cols}

can_lot['nr_list'] = can_lot.apply(extract_numbers, axis=1)

In [30]:
print(can_lot.head())

   PRODUCT  DRAW NUMBER  SEQUENCE NUMBER  DRAW DATE  NUMBER DRAWN 1  \
0      649            1                0  6/12/1982               3   
1      649            2                0  6/19/1982               8   
2      649            3                0  6/26/1982               1   
3      649            4                0   7/3/1982               3   
4      649            5                0  7/10/1982               5   

   NUMBER DRAWN 2  NUMBER DRAWN 3  NUMBER DRAWN 4  NUMBER DRAWN 5  \
0              11              12              14              41   
1              33              36              37              39   
2               6              23              24              27   
3               9              10              13              20   
4              14              21              31              34   

   NUMBER DRAWN 6  BONUS NUMBER                  nr_list  
0              43            13  {3, 41, 11, 12, 43, 14}  
1              41             9  {33, 36

In [58]:
list_a = [3, 11, 69, 14, 41, 43]  # 69 is outside the range 1 to 49, so this ticket can never match

def woulda_won(usr_lst, ser=can_lot.nr_list):
    usr_set = set(usr_lst)
    # Count the past drawings whose six numbers equal the user's numbers
    occurrences = ser.apply(lambda drawn: drawn == usr_set).sum()
    # The chance in the next drawing does not depend on past drawings
    chance = 100 / combinations(49, 6)
    if occurrences == 0:
        history = 'never been drawn'
    elif occurrences == 1:
        history = 'been drawn once'
    else:
        history = 'been drawn {} times'.format(occurrences)
    return ('The numbers {} have {}. Your chance of winning '
            'with them is {:.7f}%').format(usr_lst, history, chance)

In [59]:
print(woulda_won(list_a))


The numbers [3, 11, 69, 14, 41, 43] have never been drawn. Your chance of winning with them is 0.0000072%


The woulda_won function lets the user input a list of numbers and then tells them if the same numbers have ever been drawn in the history of the Canadian national lottery.
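Because the user types the numbers in freely, it may help to check that they form a playable ticket before looking them up. The helper below is a hypothetical addition, not part of the notebook; it assumes a valid 6/49 ticket is six distinct integers from 1 to 49.

```python
def is_valid_ticket(numbers, pool_size=49, picks=6):
    """Return True if `numbers` is a playable ticket (hypothetical helper)."""
    unique = set(numbers)
    return (
        len(numbers) == picks                      # exactly six numbers entered
        and len(unique) == picks                   # no number repeated
        and all(isinstance(n, int) and 1 <= n <= pool_size for n in unique)
    )

print(is_valid_ticket([3, 11, 12, 14, 41, 43]))  # True
print(is_valid_ticket([3, 11, 69, 14, 41, 43]))  # False, 69 is out of range
```

A check like this could run before `woulda_won`, so that an impossible ticket gets a clear message instead of "never been drawn".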

In [82]:
def multi_ticket_probability(nr):
    # Chance of winning with nr tickets, each carrying a different combination
    probab = nr / combinations(49, 6)
    if nr == 1:
        return ('entering the lottery with a single ticket gives you a {:.7f}% '
                'chance of winning').format(probab * 100)
    return ('if you enter the lottery with {} tickets, you have '
            'a {:.6f}% chance of winning').format(nr, probab * 100)
print(multi_ticket_probability(1))

entering the lottery with a single ticket gives you a 0.0000072% chance of winning


This function tells the user their probability of winning if they purchase a certain number of tickets.
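The same relationship can be read in reverse: how many tickets would it take to reach a given chance of winning? The sketch below is illustrative only. It assumes every ticket carries a different combination, and `tickets_needed` is a hypothetical name, not a function from the notebook.

```python
from math import ceil, comb

def tickets_needed(target_probability):
    """Smallest number of distinct tickets that reaches the target chance."""
    total_outcomes = comb(49, 6)
    return ceil(target_probability * total_outcomes)

# Tickets required for a 50% chance of hitting the winning combination
print(tickets_needed(0.5))  # 6991908
```

Even a coin-flip chance needs close to seven million different tickets, which is the kind of figure the app is meant to put in front of the user.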

In [117]:
def probability_less_6(nr):
    if nr in [2, 3, 4, 5]:
        # Ways to pick nr of the 6 winning numbers ...
        winning_picks = combinations(6, nr)
        # ... times the ways to pick the other 6 - nr numbers from the 43 non-winning ones
        losing_picks = combinations(43, 6 - nr)
        prob = winning_picks * losing_picks / combinations(49, 6)
        return 'Your chance of getting exactly {} winning numbers in the drawing today is {:.7f}%'.format(nr, prob * 100)
    else:
        return 'Please enter a number between 2 and 5'
probability_less_6(2)

'Your chance of getting exactly 2 winning numbers in the drawing today is 13.2378029%'

This function lets the user enter a number between 2 and 5 and returns the probability, as a percentage, of getting exactly that many winning numbers on a single ticket.
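The third question in the introduction asks about at least a certain number of winning numbers. One way to get there is to add up the "exactly k" probabilities. The sketch below assumes the standard counting argument: a ticket with exactly k matches takes k of the 6 winning numbers and the remaining 6 - k from the 43 non-winning numbers. The function names are illustrative, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def p_exactly(k, picks=6, pool=49):
    """Probability that a single ticket matches exactly k winning numbers."""
    return comb(picks, k) * comb(pool - picks, picks - k) / comb(pool, picks)

def p_at_least(k, picks=6, pool=49):
    """Probability of k or more matches: the sum of the exact cases k..picks."""
    return sum(p_exactly(j, picks, pool) for j in range(k, picks + 1))

print(f"{p_exactly(2) * 100:.7f}%")   # 13.2378029%
print(f"{p_at_least(2) * 100:.4f}%")
```

As a sanity check, `p_at_least(0)` adds up every possible case and should come out at 1, up to floating-point rounding.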