# Mobile App for Lottery Addiction
In this project we'll be working for a fictional medical institute that aims to prevent and treat gambling addictions and wants to build a mobile app to help lottery addicts better estimate their chances of winning. This project provides the logical core of the app. The app should answer the following key questions:
- What is the probability of winning the big prize with a single ticket?
- What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
- What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

These questions will be answered for a lottery in the 6/49 format. In addition, historical data from the national 6/49 lottery game in Canada will be used. The data set can be found [here](https://www.kaggle.com/datascienceai/lottery-dataset) and contains 3,665 drawings, dating from 1982 to 2018. <br><br>
The goal of this project will be to apply statistical methods with a big focus on probability.

## Core Functions

To begin the project, we will write two functions that are needed to calculate probabilities. First, we want a function that calculates factorials. Second, we want a function that calculates combinations. <br>
Since the 6/49 lottery draws six numbers from a set of 49 without replacement, and the order of the numbers does not matter, the number of possible outcomes is given by the formula for combinations:

$$C(n, k) = \binom{n}{k} = \frac{n!}{k!\,(n-k)!}, \qquad C(49, 6) = \frac{49!}{6!\,43!} = 13{,}983{,}816$$

Let's start coding the two functions:

In [1]:
# Function computing the factorial of a given number (n)
def factorial(n):
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product

In [2]:
# Function calculating the number of combinations using the factorial function from above;
# n is the size of the range (49), k is the number of values drawn from the range (6)
def combinations(n, k):
    numerator = factorial(n)
    denominator = factorial(k) * factorial(n-k)
    return numerator // denominator  # the division is always exact, so the result stays an integer
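
As a quick sanity check, the hand-written helpers can be compared against Python's standard library. The sketch below assumes Python 3.8 or later, where `math.comb` is available. The `factorial` and `combinations` functions are re-declared here (with integer division) only so that the block runs on its own.

```python
import math

def factorial(n):
    # Product n * (n-1) * ... * 1, as in the function above
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product

def combinations(n, k):
    # Number of ways to choose k items out of n when order does not matter
    return factorial(n) // (factorial(k) * factorial(n - k))

# Cross-check both helpers against the standard library
assert factorial(49) == math.factorial(49)
assert combinations(49, 6) == math.comb(49, 6)

total_tickets = combinations(49, 6)
print(total_tickets)  # 13983816
```

The printed value is the number of different tickets that exist in a 6/49 lottery, which is the denominator of every probability computed in the rest of the project.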

## One-ticket Probability
Next, we want to give the user of the app the ability to calculate the probability of winning the big jackpot (all six numbers are drawn) with their ticket. This can be achieved with a new function. We have to be aware of the following details:
- The user should be able to input six different numbers from 1 to 49
- The six numbers will come as a Python list
- The function should return the probability in a way that is friendly and easy to read for everybody.

In [25]:
def one_ticket_probability(numbers_list):
    n_comb = combinations(49,6) # calculating the number of combinations of the lottery
    outcomes = 1 # number of successful outcomes
    probability = outcomes/n_comb
    percentage = probability * 100
    return "Your chances of winning the big jackpot with your numbers {} are {:.7f}%".format(numbers_list, percentage)


In [28]:
test1 = one_ticket_probability([1,5,9,28,32,40])
test1

'Your chances of winning the big jackpot with your numbers [1, 5, 9, 28, 32, 40] are 0.0000072%'

In [29]:
test2 = one_ticket_probability([34,35,36,40,41,44])
test2

'Your chances of winning the big jackpot with your numbers [34, 35, 36, 40, 41, 44] are 0.0000072%'
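
The requirements listed above (six different numbers from 1 to 49) are not enforced by `one_ticket_probability()` itself. A minimal sketch of an input check could look as follows. `validate_ticket` is a hypothetical helper name introduced here for illustration and is not part of the original notebook.

```python
def validate_ticket(numbers_list):
    """Return True if the ticket holds six different integers from 1 to 49."""
    if len(numbers_list) != 6:
        return False
    if len(set(numbers_list)) != 6:  # duplicates collapse in a set
        return False
    return all(isinstance(n, int) and 1 <= n <= 49 for n in numbers_list)

print(validate_ticket([1, 5, 9, 28, 32, 40]))   # True
print(validate_ticket([1, 5, 9, 28, 32, 50]))   # False: 50 is out of range
print(validate_ticket([1, 5, 5, 28, 32, 40]))   # False: 5 appears twice
```

Such a check could run before the probability is computed, so that the app only reports a result for tickets that could actually be played.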

## Historical Data Check for Canada Lottery
In addition to the feature above, which returns the probability of winning the lottery with the right combination of six numbers, the app should enable users to determine whether they would ever have won, using the data from the Canadian lottery. As mentioned earlier, the data set contains 3,665 drawings from between 1982 and 2018. The numbers drawn can be found in the following columns:
- NUMBER DRAWN 1
- NUMBER DRAWN 2
- NUMBER DRAWN 3
- NUMBER DRAWN 4
- NUMBER DRAWN 5
- NUMBER DRAWN 6

In [31]:
# Reading in the csv file
import pandas as pd
lot = pd.read_csv('649.csv')
lot.shape

(3665, 11)

In [32]:
lot.head(3)

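| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |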


In [33]:
lot.tail(3)

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


Next, we want to write a function that allows users to compare their ticket to the historical data. While writing the function we have to be aware of the following details:
- The user again has to input six different numbers between 1 and 49
- The six numbers will come as a Python list
- The function should print the number of times the combination appeared in the data set and the probability of winning the big prize at the next drawing

In [35]:
# Extracting all the winning six numbers from the historical data set as Python sets


# Function that takes in a row and returns the numbers as a set
def extract_numbers(row):
    numbers = row[4:10]
    numbers = set(numbers.values)
    return numbers

# Applying the function to the data set
winning_numbers = lot.apply(extract_numbers, axis=1)
winning_numbers.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

Next, we need to create a function that takes two inputs: a Python list containing the user's numbers and a pandas Series containing sets with the winning numbers.

In [47]:
def check_historical_occurence(user_list, historical_numbers):
    user_set = set(user_list) # Transforming the list to a set
    check_occurence = historical_numbers == user_set # Comparing the user set with each historical set
    nr_occurences = check_occurence.sum() # Counting the matches (True counts as 1)
    if nr_occurences == 0:
        print("The combination {} has never occurred before. The chance of winning is 0.0000072%".format(user_set))
    else:
        print("The number combination {} has occurred {} time(s) before. It is still very unlikely to occur again. The chance of winning is 0.0000072%".format(user_set, nr_occurences))

In [48]:
test3 = check_historical_occurence([3, 41, 11, 12, 43, 14], winning_numbers)
test3

The number combination {3, 41, 11, 12, 43, 14} has occurred 1 time(s) before. It is still very unlikely to occur again. The chance of winning is 0.0000072%


In [49]:
test4 = check_historical_occurence([2, 15, 21, 33, 34, 42], winning_numbers)
test4

The combination {33, 2, 34, 42, 15, 21} has never occurred before. The chance of winning is 0.0000072%
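
The set comparison at the heart of this check can also be sketched without pandas, which may help when porting the logic to the app. The example below is only an illustration. It uses the three drawings shown by `lot.head(3)` as inline sample data, and the names `load_winning_sets` and `count_occurrences` are hypothetical helpers, not part of the original notebook.

```python
import csv
import io

# First three drawings, copied from the lot.head(3) output above
SAMPLE_CSV = """PRODUCT,DRAW NUMBER,SEQUENCE NUMBER,DRAW DATE,NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER
649,1,0,6/12/1982,3,11,12,14,41,43,13
649,2,0,6/19/1982,8,33,36,37,39,41,9
649,3,0,6/26/1982,1,6,23,24,27,39,34
"""

# Selecting the columns by name avoids relying on their position
NUMBER_COLUMNS = ["NUMBER DRAWN {}".format(i) for i in range(1, 7)]

def load_winning_sets(csv_text):
    # Turn every drawing into a set of its six winning numbers
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{int(row[col]) for col in NUMBER_COLUMNS} for row in reader]

def count_occurrences(user_list, winning_sets):
    # Sets compare equal regardless of the order of their elements
    user_set = set(user_list)
    return sum(1 for drawn in winning_sets if drawn == user_set)

winning_sets = load_winning_sets(SAMPLE_CSV)
print(count_occurrences([3, 41, 11, 12, 43, 14], winning_sets))  # 1
print(count_occurrences([2, 15, 21, 33, 34, 42], winning_sets))  # 0
```

Because sets ignore ordering, the user can enter the numbers in any order and still match a historical drawing.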


## Multi-ticket Probability
Some people do not play just one ticket but several, because they think it might drastically change their chances of winning. The next function should give them a better understanding of how small the effect of multiple tickets is on the chance of winning. This function should:
- Give the user the ability to enter the number of different tickets (a number between 1 and 13,983,816, which is the maximum number of different tickets)
- Print the probability of winning the jackpot as a percentage

In [52]:
def multi_ticket_probability(number_of_tickets):
    combinations_number = combinations(49,6) # total number of possible tickets
    outcomes = number_of_tickets # each different ticket is one successful outcome
    percentage = outcomes/combinations_number*100
    print("With a number of {} you have a {:.6f} percentage chance of winning the lottery".format(number_of_tickets, percentage))
    

In [54]:
test5 = multi_ticket_probability(12)

With a number of 12 you have a 0.000086 percentage chance of winning the lottery


In [55]:
test_1 = multi_ticket_probability(1)
test_10 = multi_ticket_probability(10)
test_100 = multi_ticket_probability(100)
test_1000 = multi_ticket_probability(1000)
test_100000 = multi_ticket_probability(100000)
test_6991908 = multi_ticket_probability(6991908)
test_13983816 = multi_ticket_probability(13983816)


With a number of 1 you have a 0.000007 percentage chance of winning the lottery
With a number of 10 you have a 0.000072 percentage chance of winning the lottery
With a number of 100 you have a 0.000715 percentage chance of winning the lottery
With a number of 1000 you have a 0.007151 percentage chance of winning the lottery
With a number of 100000 you have a 0.715112 percentage chance of winning the lottery
With a number of 6991908 you have a 50.000000 percentage chance of winning the lottery
With a number of 13983816 you have a 100.000000 percentage chance of winning the lottery
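
The results above can also be read in the opposite direction: how many different tickets would someone need to reach a given chance of winning? The following sketch is an addition for illustration, assuming Python 3.8 or later for `math.comb`. `tickets_needed` is a hypothetical helper name, and `Fraction` is used only to keep the arithmetic exact.

```python
import math
from fractions import Fraction

TOTAL_TICKETS = math.comb(49, 6)  # 13,983,816 possible tickets

def tickets_needed(target_percentage):
    # Smallest number of different tickets whose combined chance
    # reaches the target percentage (exact arithmetic via Fraction)
    target = Fraction(target_percentage) / 100
    return math.ceil(target * TOTAL_TICKETS)

for pct in (1, 50, 100):
    print(pct, tickets_needed(pct))
```

For a 50% chance this gives 6,991,908 tickets, which matches the output of `multi_ticket_probability(6991908)` above. Even a 1% chance requires well over one hundred thousand different tickets.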


## Fewer Winning Numbers - Function
In the last function, we want users to be able to calculate the probability of matching exactly two, three, four, or five of the winning numbers. Lotteries often award smaller prizes when only some of the numbers match. This function will:
- allow the user to input an integer between 2 and 5 that represents the number of matching numbers
- print the probability as a percentage

In [57]:
def probability_less_6(n):
    n_combinations = combinations(6, n) # ways to pick n of the 6 winning numbers
    n_combinations_rest = combinations(43, 6-n) # ways to pick the remaining numbers from the 43 non-winning ones
    wins = n_combinations * n_combinations_rest
    n_total_comb = combinations(49,6)
    probability = wins/n_total_comb
    percentage = probability*100
    print("You have a {:.6f} percent chance to match exactly {} numbers".format(percentage, n))

In [59]:
test_2n = probability_less_6(2)
test_3n = probability_less_6(3)
test_4n = probability_less_6(4)
test_5n = probability_less_6(5)

You have a 13.237803 percent chance to match exactly 2 numbers
You have a 1.765040 percent chance to match exactly 3 numbers
You have a 0.096862 percent chance to match exactly 4 numbers
You have a 0.001845 percent chance to match exactly 5 numbers
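
One of the key questions from the introduction asks about having *at least* a certain number of winning numbers, while `probability_less_6()` covers the *exactly* case. Since the events "exactly k matches" are mutually exclusive, their counts can simply be added up. The following is a minimal sketch of that idea, assuming Python 3.8 or later for `math.comb`. `probability_at_least` is a hypothetical function name and not part of the original notebook.

```python
import math

def probability_at_least(k):
    # Sum the counts of tickets matching exactly k, k+1, ..., 6 numbers
    winning_tickets = sum(
        math.comb(6, m) * math.comb(43, 6 - m) for m in range(k, 7)
    )
    return winning_tickets / math.comb(49, 6)

for k in range(2, 6):
    print("At least {} numbers: {:.6f}%".format(k, probability_at_least(k) * 100))
```

The values are only slightly larger than the "exactly" probabilities above, because each additional matching number is far less likely than the one before.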


## Next Steps
There are different possible features that could be included in the next version of the app:
- Making the outputs even easier to understand by adding fun analogies
- Combining the output information of one_ticket_probability() and check_historical_occurence()
- Creating a function similar to probability_less_6() that calculates the probability of having at least two, three, four, or five winning numbers.

## Conclusion
In this project we built the logical core of a mobile app that helps lottery addicts get a better understanding of their chances of winning the lottery. We used concepts from probability such as factorials, combinations, events, and outcomes to calculate the probabilities. Furthermore, we used a data set that includes the winning numbers from over 30 years of the Canadian lottery to demonstrate how low the chances of winning are.