# Guided Project: Mobile App for Lottery Addiction

In this project, we will use our knowledge of probability to build the logical core of a mobile app that helps lottery addicts better estimate their chances of winning.

A medical institute that aims to prevent and treat gambling addictions wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. The institute has a team of engineers that will build the app, but they need us to create the logical core of the app and calculate probabilities.

We want to build functions that enable users to answer questions like:
- What is the probability of winning the big prize with a single ticket?
- What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
- What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

## Core Functions
We'll start by writing two functions that we'll use often: factorial() and combinations().

In [1]:
def factorial(n):
    # The base case covers both 0! and 1!, so factorial(0) does not recurse forever.
    if n <= 1:
        return 1
    else:
        return n*factorial(n-1)

In [2]:
def combinations(n,k):
    numerator = factorial(n)
    denominator = factorial(k)*factorial(n-k)
    return numerator/denominator
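
As a cross-check, and assuming Python 3.8 or later, the standard library's math.comb computes the same quantity with exact integer arithmetic. The sketch below is only a sanity check on the number of possible tickets in a 6/49 lottery; combinations_exact is a hypothetical helper name, not part of the project.

```python
import math

def combinations_exact(n, k):
    # math.comb returns an exact integer (no floating-point division).
    return math.comb(n, k)

total_outcomes = combinations_exact(49, 6)
print(total_outcomes)  # 13983816
```

Unlike combinations() above, which returns a float because of the `/` division, this version keeps the result as an int.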

## One-ticket Probability

We'll write a function that calculates the probability of winning the big prize with a single ticket.

In [18]:
def one_ticket_probability(num_list):
    # num_list is not used in the calculation: every six-number combination is equally likely.
    total_outcomes = combinations(49,6)
    prob = (1/total_outcomes)*100
    print("The probability of your choice to win is {:.7f}%. It means that you have 1 in {:,} chance to win.".format(prob, total_outcomes))

In [19]:
one_ticket_probability([3,6,34,7,41,12])

The probability of your choice to win is 0.0000072%. It means that you have 1 in 13,983,816.0 chance to win.


In [20]:
one_ticket_probability([45,36,28,19,4,30])

The probability of your choice to win is 0.0000072%. It means that you have 1 in 13,983,816.0 chance to win.
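
Because every six-number combination is equally likely, both calls print the same probability regardless of the numbers passed in. If the app should reject malformed tickets before reporting a probability, a check along the following lines could be added. This is a minimal sketch: the helper name is_valid_ticket and its parameters are assumptions based on the 6/49 format (six distinct numbers from 1 to 49).

```python
def is_valid_ticket(num_list, pool_size=49, picks=6):
    # Hypothetical helper: a valid ticket has `picks` distinct integers
    # between 1 and `pool_size` (assumed 6/49 rules).
    if len(num_list) != picks or len(set(num_list)) != picks:
        return False
    return all(isinstance(x, int) and 1 <= x <= pool_size for x in num_list)

print(is_valid_ticket([3, 6, 34, 7, 41, 12]))    # True
print(is_valid_ticket([5, 17, 24, 54, 31, 40]))  # False: 54 is out of range
```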


## Historical Data Check for Canada Lottery

In [21]:
import pandas as pd

In [22]:
lottery = pd.read_csv('649.csv')

In [23]:
lottery.shape

(3665, 11)

In [24]:
lottery.head(3)

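| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |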


## Functions for Historical Data Check

We'll write a function that enables users to compare their ticket against the historical lottery data in Canada and determine whether they would ever have won by now.

In [25]:
# Take a row of the lottery DataFrame as input and
# return a set containing the six winning numbers.
def extract_numbers(row):
    six_numbers = set()
    for i in range(1,7):
        six_numbers.add(row["NUMBER DRAWN "+str(i)])
    return six_numbers

In [26]:
lottery.loc[1,:]

PRODUCT                  649
DRAW NUMBER                2
SEQUENCE NUMBER            0
DRAW DATE          6/19/1982
NUMBER DRAWN 1             8
NUMBER DRAWN 2            33
NUMBER DRAWN 3            36
NUMBER DRAWN 4            37
NUMBER DRAWN 5            39
NUMBER DRAWN 6            41
BONUS NUMBER               9
Name: 1, dtype: object

In [28]:
extract_numbers(lottery.loc[1,:])

{8, 33, 36, 37, 39, 41}

In [31]:
winning_numbers = lottery.apply(extract_numbers,axis = 1)

In [32]:
# Take two inputs: a Python list containing the user's numbers and
# a pandas Series of sets containing the winning numbers of each draw.
def check_historical_occurence(user_num_list, winning_numbers):
    user_num = set(user_num_list)
    match_or_not = (user_num == winning_numbers)
    times = match_or_not.sum()
    print("your choice has occurred {} times in the past".format(times))
    print("The probability of your choice to win is 0.0000072%. It means that you have 1 in 13,983,816.0 chance to win.")

In [33]:
check_historical_occurence([8, 33, 36, 37, 39, 41], winning_numbers)

your choice has occurred 1 times in the past
The probability of your choice to win is 0.0000072%. It means that you have 1 in 13,983,816.0 chance to win.


In [34]:
check_historical_occurence([5,17,24,54,31,40], winning_numbers)

your choice has occurred 0 times in the past
The probability of your choice to win is 0.0000072%. It means that you have 1 in 13,983,816.0 chance to win.
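
The same membership check can be illustrated without pandas. The sketch below uses only the three draws shown in the lottery.head(3) output above; count_matches is a hypothetical helper name, and the order of the numbers is ignored because both sides are compared as sets.

```python
# The first three draws from the data set preview, as sets of winning numbers.
draws = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
]

def count_matches(user_num_list, draws):
    # Hypothetical helper: count the draws whose six numbers equal the user's.
    user_num = set(user_num_list)
    return sum(1 for draw in draws if draw == user_num)

print(count_matches([41, 39, 37, 36, 33, 8], draws))  # 1
print(count_matches([3, 6, 34, 7, 41, 12], draws))    # 0
```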


## Multi-ticket Probability

Lottery addicts usually play more than one ticket on a single drawing. Our purpose is to help them better estimate their chances of winning.

In [51]:
# Print the probability of winning the big prize given the number of different tickets played.
def multi_ticket_probability(n_tickets):
    total_outcomes = combinations(49,6)
    prob = (n_tickets/total_outcomes)*100
    print("The probability of your {} tickets to win is {:.7f}%. It means that you have {} in {} chance to win".format(n_tickets,prob, n_tickets,int(total_outcomes)))

In [52]:
for n in [1,10,100,10000,1000000,6991908,13983816]:
    multi_ticket_probability(n)
    print("--------------------")

The probability of your 1 tickets to win is 0.0000072%. It means that you have 1 in 13983816 chance to win
--------------------
The probability of your 10 tickets to win is 0.0000715%. It means that you have 10 in 13983816 chance to win
--------------------
The probability of your 100 tickets to win is 0.0007151%. It means that you have 100 in 13983816 chance to win
--------------------
The probability of your 10000 tickets to win is 0.0715112%. It means that you have 10000 in 13983816 chance to win
--------------------
The probability of your 1000000 tickets to win is 7.1511238%. It means that you have 1000000 in 13983816 chance to win
--------------------
The probability of your 6991908 tickets to win is 50.0000000%. It means that you have 6991908 in 13983816 chance to win
--------------------
The probability of your 13983816 tickets to win is 100.0000000%. It means that you have 13983816 in 13983816 chance to win
--------------------
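
The output shows the probability growing linearly with the number of different tickets, reaching 50% at 6,991,908 tickets. Inverting the formula gives the number of tickets needed for a target probability. This is a minimal sketch, assuming all tickets are different and using math.comb (Python 3.8+) rather than the functions defined earlier; tickets_needed is a hypothetical name.

```python
import math
from fractions import Fraction

TOTAL_OUTCOMES = math.comb(49, 6)  # 13,983,816 possible tickets

def tickets_needed(target_probability):
    # Smallest number of distinct tickets whose winning probability
    # is at least target_probability (a value between 0 and 1).
    if not 0 <= target_probability <= 1:
        raise ValueError("target_probability must be between 0 and 1")
    return math.ceil(Fraction(target_probability) * TOTAL_OUTCOMES)

print(tickets_needed(0.5))  # 6991908
print(tickets_needed(1))    # 13983816
```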


## Fewer Winning Numbers
We'll write a function that allows users to calculate the probability of having two, three, four or five winning numbers, since most 6/49 lotteries award smaller prizes if a player's ticket matches two, three, four or five of the six numbers drawn.

In [61]:
# calculate the probability of having exactly n winning numbers on a single ticket
def probability_less_6(n):
    # num_combinations is the number of ways to choose which n of the 6 drawn numbers the ticket matches.
    num_combinations = combinations(6,n)
    # successful_outcomes_per_combination is the number of ways to fill the remaining 6-n spots
    # with numbers taken from the 43 numbers that were not drawn.
    successful_outcomes_per_combination = combinations(43,6-n)
    successful_outcomes = num_combinations*successful_outcomes_per_combination
    total_outcomes = combinations(49,6)
    prob = successful_outcomes/total_outcomes
    percentage = prob*100
    combinations_simplified = int(total_outcomes/successful_outcomes)
    print("The probability of your ticket to contain {} winning numbers is {:.7f}%. It means that you have 1 in {} chance to win".format(n,percentage, combinations_simplified))

In [62]:
for n in [2,3,4,5]:
    probability_less_6(n)
    print("-----------------")

The probability of your ticket to contain 2 winning numbers is 13.2378029%. It means that you have 1 in 7 chance to win
-----------------
The probability of your ticket to contain 3 winning numbers is 1.7650404%. It means that you have 1 in 56 chance to win
-----------------
The probability of your ticket to contain 4 winning numbers is 0.0968620%. It means that you have 1 in 1032 chance to win
-----------------
The probability of your ticket to contain 5 winning numbers is 0.0018450%. It means that you have 1 in 54200 chance to win
-----------------
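
The questions at the start of the project ask about having at least a given number of winning numbers. One way to sketch this is to count the tickets with exactly k matches, choosing k of the 6 drawn numbers and the remaining 6 - k from the 43 numbers that were not drawn, and then to sum over k. The helper names below are hypothetical, and math.comb (Python 3.8+) with Fraction is used here only to keep the arithmetic exact.

```python
import math
from fractions import Fraction

TOTAL_OUTCOMES = math.comb(49, 6)

def probability_exactly(k):
    # k matches among the 6 drawn numbers, 6 - k among the 43 undrawn ones.
    return Fraction(math.comb(6, k) * math.comb(43, 6 - k), TOTAL_OUTCOMES)

def probability_at_least(n):
    # Sum the mutually exclusive cases of exactly n, n + 1, ..., 6 matches.
    return sum(probability_exactly(k) for k in range(n, 7))

for n in [2, 3, 4, 5]:
    print(n, "{:.7f}%".format(float(probability_at_least(n)) * 100))
```

Since the cases of exactly 0 through 6 matches cover every possible ticket, probability_at_least(0) equals 1, which is a quick way to check the counting.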


## Summary and Next Steps
We managed to write four main functions for our app:

- one_ticket_probability(): calculates the probability of winning the big prize with a single ticket
- check_historical_occurence(): checks whether a certain combination has occurred in the Canada lottery data set
- multi_ticket_probability(): calculates the probability of winning the big prize for any number of different tickets between 1 and 13,983,816
- probability_less_6(): calculates the probability of having two, three, four or five winning numbers

Possible features for a second version of the app include:

- Making the outputs even easier to understand by adding fun analogies (for example, we can find probabilities for strange events and compare them with the chances of winning the lottery; for instance, we can output something along the lines of "You are 100 times more likely to be the victim of a shark attack than to win the lottery").
- Combining one_ticket_probability() and check_historical_occurence() to output information on probability and historical occurrence at the same time.