# Creating a Mobile App for Lottery Addiction

For most people, playing the lottery is a fun and harmless activity. For some, it becomes a habit that eventually leads to addiction. These players may end up spending money they don't have and resorting to criminal activity to repay their debts.

To address this, a medical institute that aims to prevent and treat gambling addictions wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. Our role in the development will be to create the logical core of the app and calculate probabilities.

For the first version of the app, they want us to focus on the 6/49 lottery and build functions that enable users to answer questions like:

- What is the probability of winning the big prize with a single ticket?
- What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
- What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

The institute also wants us to consider historical data coming from the national 6/49 lottery game in Canada. The data set has data for 3,665 drawings, dating from 1982 to 2018.

# Write Functions to Calculate Probabilities

Throughout the project, we'll need to repeatedly calculate probabilities and combinations. As a consequence, we'll start by writing two functions that we'll use often:

- A function that calculates factorials; and
- A function that calculates combinations.

In [2]:
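# Create Factorial Function
def factorial(n):
    # Use a separate accumulator so the local name doesn't shadow the function
    result = 1
    for i in range(n, 0, -1):
        result *= i
    return result

# Create Combinations Function
def combinations(n, k):
    # n! / (k! * (n - k)!); true division means the result is a float
    numerator = factorial(n)
    denominator = factorial(k) * factorial(n - k)
    return numerator / denominator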
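
As a sanity check, the hand-written helpers can be compared against Python's standard library. The `combinations()` function above returns a float because it uses true division; the sketch below uses floor division to keep the count an exact integer. It assumes Python 3.8 or later (where `math.comb` is available), and `combinations_check` is a hypothetical name used only for this illustration.

```python
import math

def combinations_check(n, k):
    """Number of ways to choose k items from n, as an exact integer."""
    # Same formula as combinations(), written with floor division.
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# Both approaches should agree with the standard-library implementation.
total_tickets = combinations_check(49, 6)
print(total_tickets == math.comb(49, 6))  # True
print(total_tickets)                      # 13983816
```

For the numbers used in this project the float and integer versions give the same printed results, so the choice is mostly a matter of taste.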

Next we'll create a function that calculates the probability of winning the lottery for a specific set of numbers the user inputs into the app. The function will take in a list of six unique numbers and print the probability of winning in a way that's easy to understand.

In [3]:
# Create Function That Calculates Probability of Winning for User's Set of Numbers
def one_ticket_probability(user_numbers):
    total_outcomes = combinations(49,6)
    probability = (1/total_outcomes)*100
    print("Your chance of winning with the numbers {0} is {1:.7f}%. In other words you have a 1 in {2:,} chance of winning with these numbers.".format(user_numbers, probability, int(total_outcomes)))

# Test Function
one_ticket_probability([1,2,3,4,5,6])

Your chance of winning with the numbers [1, 2, 3, 4, 5, 6] is 0.0000072%. In other words you have a 1 in 13,983,816 chance of winning with these numbers.
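
For reference, the figure printed above follows directly from the combinations formula, since exactly one of the possible six-number combinations wins the big prize:

```latex
C(49, 6) = \frac{49!}{6!\,(49 - 6)!} = 13{,}983{,}816
\qquad
P(\text{big prize}) = \frac{1}{13{,}983{,}816} \approx 0.0000072\%
```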


# Explore the 6/49 Lottery Game Dataset

Another component of the app will enable users to compare their ticket against the historical lottery data in Canada and determine whether they would have ever won by now. 

This data is available on [Kaggle](https://www.kaggle.com/datascienceai/lottery-dataset#649.csv). Let's familiarise ourselves with it.

In [4]:
import pandas as pd
import numpy as np
dataset_649 = pd.read_csv("/Users/katestone/Desktop/CSV files/649.csv")
dataset_649.shape

(3665, 11)

In [5]:
dataset_649.head(3)

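|  | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |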


In [6]:
dataset_649.tail(3)

|  | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


# Write a Function that Compares User's Numbers to Dataset

To let users compare their numbers with the historical lottery data from Canada, we'll write a function that takes the user's six unique numbers (each between 1 and 49) as input and prints the number of times that combination occurred in the Canada data set. It will also print the probability of winning the big prize in the next drawing with that combination.

In [7]:
# Write Function that Extracts the Six Winning Numbers from One Row of the Historical Data Set as a Python Set
def extract_numbers(row_number):
    # Column positions 4 through 9 hold NUMBER DRAWN 1 to NUMBER DRAWN 6
    return {int(number) for number in dataset_649.iloc[row_number, 4:10]}

# Test Function
extract_numbers(0)

{3, 11, 12, 14, 41, 43}

In [8]:
# Extract All Winning Numbers
winning_numbers = []
for i in range(len(dataset_649)):  # one set per drawing (3,665 in total)
    winning_numbers.append(extract_numbers(i))

dataset_649["Winning_Numbers"] = winning_numbers

In [9]:
# Create Function to Check Historical Occurrence
def check_historical_occurrence(user_numbers, series):
    # Sets ignore order, so the user's numbers can be entered in any sequence
    user_set = set(user_numbers)
    number_matches = sum(series == user_set)
    total_outcomes = combinations(49,6)
    probability = (1/total_outcomes)*100
    print("This combination of numbers won {0} times in the past. Your chance of winning with the numbers {1} is {2:.7f}%. In other words you have a 1 in {3:,} chance of winning with these numbers.".format(int(number_matches),user_numbers, probability, int(total_outcomes)))

# Test Function
check_historical_occurrence([6, 24, 22, 31, 32, 34], dataset_649["Winning_Numbers"])

This combination of numbers won 1 times in the past. Your chance of winning with the numbers [6, 24, 22, 31, 32, 34] is 0.0000072%. In other words you have a 1 in 13,983,816 chance of winning with these numbers.
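
The comparison above works because Python sets ignore order, so `[6, 24, 22, 31, 32, 34]` matches a draw of 6, 22, 24, 31, 32, 34. A minimal pandas-free sketch of the same idea follows, using only the six draws shown in the previews earlier rather than the full data set. The names `sample_draws`, `draw_counts`, and `times_drawn` are hypothetical and introduced only for this illustration.

```python
from collections import Counter

# The six draws visible in the head(3) and tail(3) previews.
sample_draws = [
    (3, 11, 12, 14, 41, 43),
    (8, 33, 36, 37, 39, 41),
    (1, 6, 23, 24, 27, 39),
    (6, 22, 24, 31, 32, 34),
    (2, 15, 21, 31, 38, 49),
    (14, 24, 31, 35, 37, 48),
]

# frozenset is hashable, so each draw can serve as a Counter key
# regardless of the order in which its numbers are listed.
draw_counts = Counter(frozenset(draw) for draw in sample_draws)

def times_drawn(user_numbers, counts=draw_counts):
    """Return how many times the user's combination appears in the sample."""
    return counts[frozenset(user_numbers)]

print(times_drawn([6, 24, 22, 31, 32, 34]))  # 1
print(times_drawn([1, 2, 3, 4, 5, 6]))       # 0
```

Counting the draws once and looking combinations up afterwards avoids rescanning every drawing for each query, which may matter if the app checks many tickets.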


# Calculate Probability of Winning with Multiple Tickets

So far, we've created functions that calculate the probability of winning the lottery with a single ticket. While this is a good start, lottery addicts tend to play more than one ticket on a single drawing, believing that this might increase their chances of winning. Let's write a function that lets users calculate the chance of winning for any number of different tickets.

The function will operate as follows:

- The user will input the number of different tickets they want to play (without inputting the specific combinations they intend to play).
- Our function will receive an integer between 1 and 13,983,816 (the maximum number of different tickets).
- The function will print information about the probability of winning the big prize depending on the number of different tickets played.

In [10]:
# Write Multi Ticket Probability Function
def multi_ticket_probability(number_of_tickets):
    total_outcomes = combinations(49,6)
    probability = (number_of_tickets/total_outcomes)*100
    if number_of_tickets == 1:
        print('''Your chance of winning with one ticket is {0:.7f}%. In other words you have a 1 in {1:,} chance of winning with these numbers.'''.format(probability, int(total_outcomes)))
    else:
        combinations_simplified = round(total_outcomes/number_of_tickets)
        print('''Your chance of winning with {0} tickets is {1:.7f}%. In other words you have a 1 in {2:,} chance of winning with these numbers.'''.format(number_of_tickets, probability, int(combinations_simplified)))

In [11]:
test_numbers = [1, 10, 100, 10000, 1000000, 6991908, 13983816]

for test_number in test_numbers:
    multi_ticket_probability(test_number)
    print('------------------------')

Your chance of winning with one ticket is 0.0000072%. In other words you have a 1 in 13,983,816 chance of winning with these numbers.
------------------------
Your chance of winning with 10 tickets is 0.0000715%. In other words you have a 1 in 1,398,382 chance of winning with these numbers.
------------------------
Your chance of winning with 100 tickets is 0.0007151%. In other words you have a 1 in 139,838 chance of winning with these numbers.
------------------------
Your chance of winning with 10000 tickets is 0.0715112%. In other words you have a 1 in 1,398 chance of winning with these numbers.
------------------------
Your chance of winning with 1000000 tickets is 7.1511238%. In other words you have a 1 in 14 chance of winning with these numbers.
------------------------
Your chance of winning with 6991908 tickets is 50.0000000%. In other words you have a 1 in 2 chance of winning with these numbers.
------------------------
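Your chance of winning with 13983816 tickets is 100.0000000%. In other words you have a 1 in 1 chance of winning with these numbers.
------------------------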
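
The function above assumes its input is already valid. A minimal sketch of how the range described earlier (an integer between 1 and 13,983,816) might be enforced is shown below. `validated_multi_ticket_probability` is a hypothetical helper, not part of the project code; it returns the probability instead of printing it and assumes Python 3.8 or later for `math.comb`.

```python
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # 13,983,816 possible tickets

def validated_multi_ticket_probability(number_of_tickets):
    """Return the chance (0 to 1) of winning the big prize with distinct tickets."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(number_of_tickets, bool) or not isinstance(number_of_tickets, int):
        raise TypeError("number_of_tickets must be an integer")
    if not 1 <= number_of_tickets <= TOTAL_OUTCOMES:
        raise ValueError(
            "number_of_tickets must be between 1 and {:,}".format(TOTAL_OUTCOMES)
        )
    return number_of_tickets / TOTAL_OUTCOMES

print(validated_multi_ticket_probability(6991908))  # 0.5
```

Raising an exception is only one option; an app might prefer to show a friendly message to the user instead.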

# Calculate Probability of Winning for Two, Three, Four, or Five Winning Numbers

In most 6/49 lotteries, there are smaller prizes if a player's ticket matches two, three, four, or five of the six numbers drawn. As a consequence, users might be interested in knowing the probability of having two, three, four, or five winning numbers.

Let's create a function that operates as follows:

- Takes an integer x (between 2 and 5) representing the number of winning numbers as input.
- Prints the user's probability of having exactly x winning numbers in their set of six numbers.

The function will calculate the probability that a player's ticket matches exactly the given number of winning numbers. If the player wants to find out the probability of having five winning numbers, the function will return the probability of having five winning numbers exactly (no more and no less). The function will not return the probability of having at least five winning numbers.
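
In formula form, a ticket has exactly x winning numbers when x of its numbers come from the 6 numbers drawn and the remaining 6 - x come from the 43 numbers not drawn. The case of five winning numbers is worked through as an example:

```latex
P(\text{exactly } x) = \frac{\binom{6}{x}\,\binom{43}{6 - x}}{\binom{49}{6}},
\qquad
P(\text{exactly } 5) = \frac{6 \cdot 43}{13{,}983{,}816} = \frac{258}{13{,}983{,}816} \approx 0.0018450\%
```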

In [12]:
# Write Function Described Above
def probability_less_6(x):
    # Ways to pick x of the 6 drawn numbers and (6 - x) of the 43 numbers not drawn
    winning_combinations = combinations(6,x)
    remaining_combinations = combinations(43, (6 - x))
    successful_outcomes = winning_combinations*remaining_combinations
    total_outcomes = combinations(49,6)
    probability = successful_outcomes/total_outcomes
    percentage = probability*100
    combinations_simplified = round(total_outcomes/successful_outcomes)
    print('''Your chance of having {0} winning numbers is {1:.7f}%. In other words, you have a one in {2} chance of having {0} winning numbers in your set of six numbers.'''.format(x, percentage, combinations_simplified))

In [13]:
# Test Function
for test_input in [2, 3, 4, 5]:
    probability_less_6(test_input)
    print('--------------------------') # output delimiter

Your chance of having 2 winning numbers is 13.2378029%. In other words, you have a one in 8 chance of having 2 winning numbers in your set of six numbers.
--------------------------
Your chance of having 3 winning numbers is 1.7650404%. In other words, you have a one in 57 chance of having 3 winning numbers in your set of six numbers.
--------------------------
Your chance of having 4 winning numbers is 0.0968620%. In other words, you have a one in 1032 chance of having 4 winning numbers in your set of six numbers.
--------------------------
Your chance of having 5 winning numbers is 0.0018450%. In other words, you have a one in 54201 chance of having 5 winning numbers in your set of six numbers.
--------------------------


# Going Forward

For the first version of the app, we coded four main functions:

- one_ticket_probability(): calculates the probability of winning the big prize with a single ticket
- check_historical_occurrence(): checks whether a certain combination has occurred in the Canada lottery data set
- multi_ticket_probability(): calculates the probability for any number of tickets between 1 and 13,983,816
- probability_less_6(): calculates the probability of having exactly two, three, four, or five winning numbers

Possible features for a second version of the app include:

- Making the outputs even easier to understand by adding fun analogies. For example, we could find probabilities for strange events and compare them with the chances of winning the lottery, with output along the lines of "You are 100 times more likely to be the victim of a shark attack than to win the lottery."
- Combining the one_ticket_probability() and check_historical_occurrence() to output information on probability and historical occurrence at the same time
- Creating a function similar to probability_less_6() that calculates the probability of having at least two, three, four, or five winning numbers. Hint: the number of successful outcomes for having at least four winning numbers is the sum of these three numbers:
    - The number of successful outcomes for having four winning numbers exactly
    - The number of successful outcomes for having five winning numbers exactly
    - The number of successful outcomes for having six winning numbers exactly
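
The hint above can be sketched as follows. This is one possible shape for such a function, not a finished feature: `probability_at_least` is a hypothetical name, it returns a value between 0 and 1 rather than printing, and it relies on `math.comb` (Python 3.8 or later) instead of the helpers written earlier.

```python
import math

def probability_at_least(x):
    """Return the chance (0 to 1) that a ticket has at least x winning numbers."""
    total_outcomes = math.comb(49, 6)
    # Sum the "exactly k" outcomes for k = x, x + 1, ..., 6.
    successful_outcomes = sum(
        math.comb(6, k) * math.comb(43, 6 - k) for k in range(x, 7)
    )
    return successful_outcomes / total_outcomes

for x in [2, 3, 4, 5]:
    print("At least {0} winning numbers: {1:.7f}%".format(x, probability_at_least(x) * 100))
```

For at least four winning numbers, the three terms in the sum are 13,545 (exactly four), 258 (exactly five), and 1 (exactly six), for 13,804 successful outcomes in total.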