## Project: Mobile App to Combat Lottery Addiction

In this project, the logical core of an app meant to dissuade lottery addicts from buying lottery tickets is built. The app shows the probability of winning under different circumstances.

The goal of the project will be to answer the following questions:

* What is the probability of winning the lottery with a single ticket?
* What is the probability of winning with a number of tickets?
* What is the probability of having at least a certain number of winning numbers in a single ticket?

Historical data from the Canadian national 6/49 lottery will also be used to include a function where the user can check if their numbers have ever been drawn.

The educational goal of the project is to get hands-on experience with probabilistic calculations.

In [1]:
# Creating useful functions

def factorial(n):
    '''Calculate the factorial of a number
    Args:
        n (int): number for which the factorial should be calculated
    Returns:
        int: n!
    '''
    fact = 1
    for item in range(1,n+1):
        fact *= item
    return fact

def combinations(n,k):
    '''Calculate the number of combinations possible while sampling k without replacement from a group of n
    Args:
        n (int): number of items in the group to be picked from (such as the 49 possible numbers in the lottery)
        k (int): number of items to pick (like the 6 numbers picked in the lottery)
    Returns:
        int: number of combinations possible when picking k out of a group of n items without replacement
    '''
    # The division is always exact, so integer division keeps the result an int
    return factorial(n) // (factorial(k) * factorial(n-k))

In [2]:
def one_ticket_probability(lst):
    '''Display the probability of winning when picking numbers in a list
    Args:
        lst (list): list of numbers the user wants to pick. The probability is the same for every
        combination of numbers, so this argument is not used by the function; it is only there
        to give the user interactivity
    Returns:
        str: string telling the user their probability of winning with a single ticket in a game of 6/49
    '''
    c_lot_nrs = combinations(49,6)
    return 'You have a one in {:,} chance of winning.'.format(c_lot_nrs)
one_ticket_probability([1, 2, 3, 4, 5, 6])

'You have a one in 13,983,816 chance of winning.'

`one_ticket_probability` tells users their chance of winning the lottery with a single ticket. This chance is the same for every ticket, regardless of the numbers entered.
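
As a quick sanity check on that figure, the number of possible tickets can be recomputed with the standard library. This is a minimal sketch that assumes Python 3.8 or newer, where `math.comb` is available; `total_tickets` and `p_single` are illustrative names, not part of the notebook.

```python
import math

# Number of ways to choose 6 numbers out of 49 when order does not matter
total_tickets = math.comb(49, 6)

# Probability that one specific ticket matches the draw
p_single = 1 / total_tickets

print(f"{total_tickets:,} possible tickets")
print(f"{p_single * 100:.7f}% chance per ticket")
```

Because every combination is equally likely, the numbers on the ticket never enter the calculation.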

## Using historical data

Next, historical data from past drawings of the Canadian national lottery is used to let users find out whether their numbers have ever been drawn.

In [3]:
# Imports
import pandas as pd
import numpy as np

In [4]:
# Reading in the dataset
can_lot = pd.read_csv('649.csv')

## First overview of the dataset

In [5]:
can_lot.shape

(3665, 11)

The dataset includes data from 3665 drawings.

In [6]:
print(can_lot['DRAW DATE'].head(3))
print(can_lot['DRAW DATE'].tail(3))


0    6/12/1982
1    6/19/1982
2    6/26/1982
Name: DRAW DATE, dtype: object
3662    6/13/2018
3663    6/16/2018
3664    6/20/2018
Name: DRAW DATE, dtype: object


The data runs from June 12, 1982 to June 20, 2018.
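
Because `DRAW DATE` is stored as text (dtype `object`), it may help to parse it before doing any date arithmetic. The sketch below rebuilds a few of the values shown above rather than reading `649.csv`, and assumes the month/day/year format visible in the output; `sample_dates` and `parsed` are illustrative names.

```python
import pandas as pd

# A few DRAW DATE values copied from the output above (month/day/year strings)
sample_dates = pd.Series(['6/12/1982', '6/19/1982', '6/26/1982',
                          '6/13/2018', '6/16/2018', '6/20/2018'])

# Parse with an explicit format so day and month cannot be swapped
parsed = pd.to_datetime(sample_dates, format='%m/%d/%Y')

print(parsed.min().date(), 'to', parsed.max().date())
```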

In [7]:
can_lot.head(3)

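|   | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---------|-------------|-----------------|-----------|----------------|----------------|----------------|----------------|----------------|----------------|--------------|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |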


The dataset includes the columns:

* `PRODUCT`: The lottery of the results in the row. This is always the 6/49 lottery. This column can be dropped
* `DRAW NUMBER`: The integer key for the drawings. This column can also be dropped, since it provides no benefit over the index.
* `SEQUENCE NUMBER`: Unclear, but not necessary for the analysis. Can be dropped
* `DRAW DATE`: Date of the draw
* `NUMBER DRAWN(1-6)`: The numbers drawn as well as the sequence in which they were drawn. Instead of using 6 columns this could be reduced to a single one by using a list instead of the columns.
* `BONUS NUMBER`: The bonus number of the draw
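
The columns marked as droppable above could be removed with `DataFrame.drop`. The following is a sketch on a three-row stand-in built from the rows shown earlier, not on the full `649.csv`; `sample` and `trimmed` are illustrative names.

```python
import pandas as pd

# Stand-in for can_lot, built from the first three draws shown above
sample = pd.DataFrame({
    'PRODUCT': [649, 649, 649],
    'DRAW NUMBER': [1, 2, 3],
    'SEQUENCE NUMBER': [0, 0, 0],
    'DRAW DATE': ['6/12/1982', '6/19/1982', '6/26/1982'],
    'NUMBER DRAWN 1': [3, 8, 1],
    'NUMBER DRAWN 2': [11, 33, 6],
    'NUMBER DRAWN 3': [12, 36, 23],
    'NUMBER DRAWN 4': [14, 37, 24],
    'NUMBER DRAWN 5': [41, 39, 27],
    'NUMBER DRAWN 6': [43, 41, 39],
    'BONUS NUMBER': [13, 9, 34],
})

# Drop the columns that carry no information for this analysis
trimmed = sample.drop(columns=['PRODUCT', 'DRAW NUMBER', 'SEQUENCE NUMBER'])

print(list(trimmed.columns))
```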

Turn the `NUMBER DRAWN <#>` columns into a single column holding the set of numbers drawn.

In [9]:
# Select the right columns
number_cols = can_lot.columns[can_lot.columns.str.contains('NUMBER DRAWN')]

def extract_numbers(row):
    '''Extract the numbers from a single draw in can_lot and return them as a set
    Args:
        row (Series): one row of can_lot
    Returns:
        set: the six numbers drawn (the order does not matter when checking a ticket)
    '''
    return set(row[number_cols])
can_lot['nr_list'] = can_lot.apply(extract_numbers, axis = 1)

In [10]:
print(can_lot.head())

   PRODUCT  DRAW NUMBER  SEQUENCE NUMBER  DRAW DATE  NUMBER DRAWN 1  \
0      649            1                0  6/12/1982               3   
1      649            2                0  6/19/1982               8   
2      649            3                0  6/26/1982               1   
3      649            4                0   7/3/1982               3   
4      649            5                0  7/10/1982               5   

   NUMBER DRAWN 2  NUMBER DRAWN 3  NUMBER DRAWN 4  NUMBER DRAWN 5  \
0              11              12              14              41   
1              33              36              37              39   
2               6              23              24              27   
3               9              10              13              20   
4              14              21              31              34   

   NUMBER DRAWN 6  BONUS NUMBER                  nr_list  
0              43            13  {3, 41, 11, 12, 43, 14}  
1              41             9  {33, 36

In [12]:
list_a = [3,11,69,14,41,43]  # 69 is outside 1-49, so this list can never match a draw
def woulda_won(usr_lst,ser = can_lot.nr_list):
    '''Tell the user if their numbers have ever been drawn in the Canadian lottery
    Args:
        usr_lst (list): list containing the lottery numbers of the user
        ser (series): Series of sets of numbers drawn in the Canadian lottery
    Returns:
        str telling the user if their numbers have been drawn before and, if so, how many times, plus the
        probability of winning with a given set of numbers (which is the same for every set of numbers)
    '''
    usr_set = set(usr_lst)
    # Compare each historical draw with the user's numbers, ignoring order
    occurrences = ser.apply(lambda drawn: drawn == usr_set).sum()
    chance = 'Your chance of winning with them is 0.0000072%.'
    if occurrences == 0:
        return 'The numbers {} have never been drawn. {}'.format(usr_lst, chance)
    elif occurrences == 1:
        return 'The numbers {} have been drawn once. {}'.format(usr_lst, chance)
    else:
        return 'The numbers {} have been drawn {} times. {}'.format(usr_lst, occurrences, chance)

In [13]:
print(woulda_won(list_a))


The numbers [3, 11, 69, 14, 41, 43] have never been drawn. Your chance of winning with them is 0.0000072%.


The `woulda_won` function lets the user enter a list of numbers and tells them whether the same numbers have ever been drawn in the history of the Canadian national lottery.
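
Since `list_a` above contains 69, which is not a valid 6/49 number, an app would probably want to validate the input before looking it up. The sketch below shows one possible way to do that; `validate_ticket` and `count_matches` are hypothetical helpers, not functions from the notebook, and the two draws are the first two rows of the dataset shown earlier.

```python
def validate_ticket(numbers):
    '''Return the ticket as a frozenset, or raise ValueError if it is not
    six distinct integers from 1 to 49.'''
    ticket = frozenset(numbers)
    if len(numbers) != 6 or len(ticket) != 6:
        raise ValueError('a ticket needs exactly six distinct numbers')
    if not all(isinstance(n, int) and 1 <= n <= 49 for n in ticket):
        raise ValueError('every number must be an integer from 1 to 49')
    return ticket

def count_matches(numbers, past_draws):
    '''Count how many past draws contain exactly the same six numbers, ignoring order.'''
    ticket = validate_ticket(numbers)
    return sum(1 for draw in past_draws if frozenset(draw) == ticket)

# First two draws from the dataset, as shown earlier
past_draws = [[3, 11, 12, 14, 41, 43], [8, 33, 36, 37, 39, 41]]

# Same numbers as the first draw, entered in a different order
print(count_matches([43, 41, 14, 12, 11, 3], past_draws))

try:
    count_matches([3, 11, 69, 14, 41, 43], past_draws)
except ValueError as err:
    print('rejected:', err)
```

Comparing sets rather than lists means the order in which the user types the numbers does not affect the result.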

In [14]:
def multi_ticket_probability(nr):
    '''Tell the user the percentage chance of winning when buying a certain number of tickets
    Args:
        nr (int): number of different tickets the user wishes to purchase
    Returns:
        str telling the user the probability of winning with nr tickets
    '''
    probab = nr / combinations(49,6)
    if nr == 1:
        return ('entering the lottery with a single ticket gives you a '
                '{:.7f}% chance of winning').format(probab*100)
    return ('if you enter the lottery with {} tickets, you have '
            'a {:.6f}% chance of winning').format(nr,probab*100)
print(multi_ticket_probability(1))

entering the lottery with a single ticket gives you a 0.0000072% chance of winning


This function tells the user their probability of winning if they purchase a given number of different tickets.
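
To put the numbers in perspective, the same calculation can be run for a few ticket counts. This sketch assumes every ticket carries a different combination (duplicate tickets add nothing) and uses `math.comb` from Python 3.8 or newer; `win_probability` is an illustrative name, not the notebook's function.

```python
import math

TOTAL_TICKETS = math.comb(49, 6)  # number of possible 6/49 combinations

def win_probability(n_tickets):
    '''Probability of hitting the jackpot with n_tickets distinct tickets.'''
    if not 0 <= n_tickets <= TOTAL_TICKETS:
        raise ValueError('n_tickets must be between 0 and the number of combinations')
    return n_tickets / TOTAL_TICKETS

for n in (1, 100, 10_000, 1_000_000, TOTAL_TICKETS // 2):
    print(f'{n:>10,} tickets: {win_probability(n) * 100:.6f}%')
```

Even a million distinct tickets leaves the chance of winning at roughly 7%, and half of all combinations (6,991,908 tickets) are needed for an even chance.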

In [15]:
combinations(49,6)
combinations(6,5)


def probability_less_6(nr):
    '''Tell the user the probability of matching exactly nr of the 6 drawn numbers with a single ticket
    Args:
        nr (int): number of matching numbers, between 2 and 5
    Returns:
        str: probability of matching exactly nr numbers, or a message asking for a number between 2 and 5
    '''
    if nr in [2,3,4,5]:
        # Ways to pick the nr matching numbers from the 6 on the ticket
        ticket_combs = combinations(6,nr)
        # Ways to fill the remaining 6 - nr drawn numbers from the 43 numbers not on the ticket
        other_combs = combinations(43,6-nr)
        prob = ticket_combs * other_combs / combinations(49,6)
        return 'Your chance of matching exactly {} winning numbers in the drawing today is {:.7f}%'.format(nr, prob*100)
    else:
        return 'Please enter a number between 2 and 5'
probability_less_6(2)

'Your chance of matching exactly 2 winning numbers in the drawing today is 13.2378029%'

In [16]:
for i in range(4,6):
    print(49-i)

45
44


This function takes a number between 2 and 5 and returns the probability, as a percentage, that a single ticket matches exactly that many winning numbers.
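
The project questions ask about at least a certain number of winning numbers, which differs from matching exactly that many. A draw matches a ticket in exactly k numbers when k of the 6 drawn numbers come from the ticket and the remaining 6 - k come from the 43 numbers not on the ticket. The sketch below follows that reasoning with exact fractions; `prob_exactly` and `prob_at_least` are illustrative names, not functions from the notebook.

```python
from fractions import Fraction
from math import comb

def prob_exactly(k):
    '''Probability that a single 6/49 ticket matches exactly k of the drawn numbers.'''
    if not 0 <= k <= 6:
        raise ValueError('k must be between 0 and 6')
    # k drawn numbers from the ticket's 6, the other 6 - k from the 43 numbers off the ticket
    return Fraction(comb(6, k) * comb(43, 6 - k), comb(49, 6))

def prob_at_least(k):
    '''Probability of matching k or more numbers: the sum of the exact cases k..6.'''
    return sum(prob_exactly(j) for j in range(k, 7))

for k in range(2, 6):
    print(f'{k}: exactly {float(prob_exactly(k)) * 100:.4f}%, '
          f'at least {float(prob_at_least(k)) * 100:.4f}%')
```

Using `Fraction` keeps the arithmetic exact, so the probabilities for 0 through 6 matches add up to exactly 1.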

## Summary

In this project the logical core of an app against lottery addiction was created. The app tells the user the odds of winning under certain conditions, and whether their numbers have ever been drawn in the Canadian lottery.

A lot of gambling addicts trick themselves into a skewed perception of the odds of winning, which then furthers the addiction. This is helped by many people not having a clear understanding of how probability works. Having access to clear odds might dissuade some addicts from tricking themselves into believing their odds are higher than they actually are.