# Developing a Lottery Addiction App
In this project, we contribute to the development of a mobile app by writing a few functions, most of them focused on calculating probabilities. By helping users estimate their chances of winning accurately, the app aims to both prevent and treat lottery addiction.


The concept for the app came from a medical center that specializes in treating gambling addictions. The institute already has a team of developers who will build the app, but we're responsible for the logical core and the probability calculations. For the first version, they want us to focus on the 6/49 lottery and write functions that can answer the following questions for users:

- How likely is it that a single ticket will win the grand prize?
- What are the chances of winning the grand prize if we buy 40 tickets (or any other number)?
- What is the likelihood of a single ticket containing at least five (or four, or three) winning numbers?

The scenario we'll be following for the duration of this project is hypothetical; the goal is to practice using probability and combinatorics (permutations and combinations) principles in a context that mimics a real-world situation.


## Importing and Exploring Dataset

In [2]:
import pandas as pd
import matplotlib.pyplot as plt

six49 = pd.read_csv("649.csv")

In [3]:
six49.head()

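|   | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |
| 3 | 649 | 4 | 0 | 7/3/1982 | 3 | 9 | 10 | 13 | 20 | 43 | 34 |
| 4 | 649 | 5 | 0 | 7/10/1982 | 5 | 14 | 21 | 31 | 34 | 47 | 45 |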


In [4]:
six49.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3665 entries, 0 to 3664
Data columns (total 11 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   PRODUCT          3665 non-null   int64 
 1   DRAW NUMBER      3665 non-null   int64 
 2   SEQUENCE NUMBER  3665 non-null   int64 
 3   DRAW DATE        3665 non-null   object
 4   NUMBER DRAWN 1   3665 non-null   int64 
 5   NUMBER DRAWN 2   3665 non-null   int64 
 6   NUMBER DRAWN 3   3665 non-null   int64 
 7   NUMBER DRAWN 4   3665 non-null   int64 
 8   NUMBER DRAWN 5   3665 non-null   int64 
 9   NUMBER DRAWN 6   3665 non-null   int64 
 10  BONUS NUMBER     3665 non-null   int64 
dtypes: int64(10), object(1)
memory usage: 315.1+ KB


## Primary functions
We'll write two functions that we'll use frequently in the sections below:

- factorial(): calculates factorials
- combination(): calculates combinations

In [5]:
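def factorial(n):
    # Multiply 1 * 2 * ... * n; factorial(0) returns 1
    product = 1
    for i in range(1, n + 1):
        product *= i
    return product

def combination(n, k):
    # Number of ways to choose k items from n when order does not matter
    return factorial(n) / (factorial(k) * factorial(n - k))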

In [6]:
print(factorial(5))
print(combination(120,5))

120
190578024.0
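
As a cross-check, Python's standard library (3.8 and later) provides `math.factorial()` and `math.comb()`, which return exact integers. The sketch below is an added illustration, not part of the original notebook: the names `factorial_loop` and `combination_exact` are hypothetical, and integer division (`//`) is used instead of `/` so that the result stays an integer.

```python
import math

def factorial_loop(n):
    # Same loop-based approach as factorial() above
    product = 1
    for i in range(1, n + 1):
        product *= i
    return product

def combination_exact(n, k):
    # Integer division keeps the count exact (the quotient is always whole)
    return factorial_loop(n) // (factorial_loop(k) * factorial_loop(n - k))

print(combination_exact(49, 6))  # number of possible 6/49 tickets
print(math.comb(49, 6))          # standard-library equivalent
```

Both calls should print the same integer, which is a quick way to validate a hand-written implementation.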


## Winning Ticket
We need a function that calculates the probability of winning the top prize with a single ticket. In each drawing, six numbers are picked from a pool of 49, and a player wins the jackpot if the six numbers on their ticket match all six numbers drawn.

When writing the function, the engineering team advised us to pay attention to the following details:

- The user enters six different numbers from 1 to 49 into the app.
- The six numbers are passed to our function as a Python list.
- The function should print the probability in a friendly way that anyone can understand, even without prior experience with probability.

The `one_ticket_probability()` function below takes a list of six unique integers and prints the probability of winning in an easy-to-understand format.

In [7]:
def one_ticket_probability(user_numbers):
    # Number of possible tickets when choosing len(user_numbers) numbers from 49
    n_combinations = combination(49, len(user_numbers))
    prob = 1 / n_combinations
    # int() drops the trailing ".0" that float division leaves on the count
    print("The probability of this ticket winning is {probability:.10f}. That is, you stand a 1 in {chance:,} chance of winning".format(probability=prob, chance=int(n_combinations)))

one_ticket_probability([2,3,5,7,8,10])

The probability of this ticket winning is 0.0000000715. That is, you stand a 1 in 13,983,816 chance of winning
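
The figure in the message comes from counting the possible tickets. Written out with the same formula that `combination()` implements:

$$
C(49, 6) = \frac{49!}{6!\,(49-6)!} = \frac{49 \cdot 48 \cdot 47 \cdot 46 \cdot 45 \cdot 44}{6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1} = 13{,}983{,}816
$$

$$
P(\text{jackpot}) = \frac{1}{13{,}983{,}816} \approx 0.0000000715
$$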


In [8]:
def extract_numbers(row):
    row = row[4:10]
    numbers = set(row.values)
    return numbers

In [9]:
winning_tickets = six49.apply(extract_numbers, axis=1)
winning_tickets.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object
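
`extract_numbers()` relies on the six drawn numbers sitting at positions 4 through 9 of each row. As a hedged alternative, the sketch below selects the same columns by name, which would keep working if the column order changed. The helper name `extract_numbers_by_name` and the two-row `sample` frame (copied from the `head()` output above) are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Column names as listed by six49.info()
NUMBER_COLUMNS = ["NUMBER DRAWN {}".format(i) for i in range(1, 7)]

def extract_numbers_by_name(row):
    # Build a set of plain Python ints from the six named columns
    return {int(n) for n in row[NUMBER_COLUMNS]}

# Two rows copied from the head() output, used only for illustration
sample = pd.DataFrame({
    "DRAW DATE": ["6/12/1982", "6/19/1982"],
    "NUMBER DRAWN 1": [3, 8],
    "NUMBER DRAWN 2": [11, 33],
    "NUMBER DRAWN 3": [12, 36],
    "NUMBER DRAWN 4": [14, 37],
    "NUMBER DRAWN 5": [41, 39],
    "NUMBER DRAWN 6": [43, 41],
    "BONUS NUMBER": [13, 9],
})

sample_tickets = sample.apply(extract_numbers_by_name, axis=1)
print(sample_tickets)
```

The bonus number is deliberately left out, matching the behaviour of `extract_numbers()`.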

## Historical Data check
The institute also wants us to take into account historical data from Canada's national 6/49 lottery. The data set covers 3,665 drawings from 1982 through 2018 (it can be downloaded from [here](https://www.kaggle.com/datascienceai/lottery-dataset)). The function below compares a user's ticket against the sets of winning numbers extracted above and reports how many times that combination has been drawn.

In [10]:
def check_historical_occurrence(user_numbers, winning_tickets):
    user_numbers = set(user_numbers)
    # Count the past draws whose six numbers equal the user's set
    n_occurrences = (user_numbers == winning_tickets).sum()
    print("This {} combination has occurred {} time(s) in the past. Your chance of winning with this ticket is 0.0000000715. That is, you stand a 1 in 13,983,816 chance of winning.".format(user_numbers, n_occurrences))

check_historical_occurrence([33, 36, 37, 39, 8, 41], winning_tickets)

This {33, 36, 37, 39, 8, 41} combination has occurred 1 time(s) in the past. Your chance of winning with this ticket is 0.0000000715. That is, you stand a 1 in 13,983,816 chance of winning.


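## Multi-ticket Probability
To answer the second question, we need a function that takes the number of different tickets played (between 1 and 13,983,816) and prints the probability of winning the grand prize.

In [23]: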
def multi_ticket_probability(n_tickets):
    # Assumes every ticket carries a different combination
    n_combinations = combination(49, 6)

    probability = n_tickets / n_combinations
    chance = round(n_combinations / n_tickets)
    print("With {n} ticket(s), the probability of winning is {probability:.10f}. That is, you stand a 1 in {chance:,} chance of winning\n".format(probability=probability, chance=chance, n=n_tickets))


In [24]:
test = [1, 10, 100, 10000, 1000000, 6991908, 13983816]
for n_tickets in test:
    multi_ticket_probability(n_tickets)

With 1 ticket(s), the probability of winning is 0.0000000715. That is, you stand a 1 in 13,983,816 chance of winning

With 10 ticket(s), the probability of winning is 0.0000007151. That is, you stand a 1 in 1,398,382 chance of winning

With 100 ticket(s), the probability of winning is 0.0000071511. That is, you stand a 1 in 139,838 chance of winning

With 10000 ticket(s), the probability of winning is 0.0007151124. That is, you stand a 1 in 1,398 chance of winning

With 1000000 ticket(s), the probability of winning is 0.0715112384. That is, you stand a 1 in 14 chance of winning

With 6991908 ticket(s), the probability of winning is 0.5000000000. That is, you stand a 1 in 2 chance of winning

With 13983816 ticket(s), the probability of winning is 1.0000000000. That is, you stand a 1 in 1 chance of winning
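
The formula behind these numbers is the number of different tickets played divided by the number of possible outcomes. A minimal sketch using exact fractions follows. It is an added illustration: `multi_ticket_fraction` is a hypothetical helper, `math.comb()` requires Python 3.8+, and all tickets are assumed to carry different combinations.

```python
from fractions import Fraction
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # 13,983,816 possible tickets

def multi_ticket_fraction(n_tickets):
    # Exact probability of winning with n_tickets distinct tickets
    if not 1 <= n_tickets <= TOTAL_OUTCOMES:
        raise ValueError("n_tickets must be between 1 and 13,983,816")
    return Fraction(n_tickets, TOTAL_OUTCOMES)

print(multi_ticket_fraction(6991908))       # 1/2
print(float(multi_ticket_fraction(100)))
```

Working with `Fraction` avoids rounding until the value is displayed, and the range check rejects ticket counts for which the formula makes no sense.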



## Smaller Prizes for Fewer Matching Numbers
Most 6/49 lotteries award smaller prizes if a player's ticket matches two, three, four, or five of the six numbers drawn. Players may therefore be curious about the odds of having two, three, four, or five winning numbers, and the first version of the app should let them find those odds.

These are the details to keep in mind when writing a function that calculates those probabilities:
- Inside the app, the user inputs:
  - six different numbers from 1 to 49; and
  - an integer between 2 and 5 that represents the number of winning numbers expected
- Our function prints information about the probability of having that number of winning numbers

We tell the engineering team that the specific combination on the ticket is irrelevant to the calculation: all we need is an integer between 2 and 5 that represents the expected number of winning numbers. We'll therefore write a function called `probability_less_6()` that takes that integer and prints information about the odds of matching that many numbers.


The function below calculates the probability that a player's ticket matches exactly the given number of winning numbers. If a player wants to know the chances of getting five winning numbers, the function reports the odds of getting exactly five (no more and no less). This is not the same as the chance of having at least five winning numbers.
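
In formula form, with $k$ the expected number of winning numbers, the ticket must contain $k$ of the 6 drawn numbers and $6 - k$ of the 43 numbers that were not drawn:

$$
P(\text{exactly } k) = \frac{C(6, k)\, C(43, 6 - k)}{C(49, 6)}
$$

For example, for $k = 5$:

$$
P(\text{exactly } 5) = \frac{6 \cdot 43}{13{,}983{,}816} = \frac{258}{13{,}983{,}816} \approx 0.00001845
$$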

In [25]:
def probability_less_6(x):
    # Ways to pick which x of the 6 drawn numbers appear on the ticket
    number_combination = combination(6, x)
    # Ways to fill the other 6 - x slots from the 43 numbers that were not drawn
    successful_outcomes = combination(43, 6-x)
    total_combinations = combination(49, 6)
    answer = (number_combination * successful_outcomes)/total_combinations
    chance = int(round(total_combinations/(number_combination * successful_outcomes)))
    print("The probability that exactly {number} of your 6 chosen numbers will be selected is {probability:.9f}. That is, a 1 in {chance:,}".format(number=x, probability=answer, chance=chance))

In [26]:
test = range(2,6)
for x in test:
    probability_less_6(x)

The probability that exactly 2 of your 6 chosen numbers will be selected is 0.132378029. That is, a 1 in 8
The probability that exactly 3 of your 6 chosen numbers will be selected is 0.017650404. That is, a 1 in 57
The probability that exactly 4 of your 6 chosen numbers will be selected is 0.000968620. That is, a 1 in 1,032
The probability that exactly 5 of your 6 chosen numbers will be selected is 0.000018450. That is, a 1 in 54,201
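
One way to sanity-check the calculation is to confirm that the outcome counts for every possible number of matches, from zero to six, add up to the total number of tickets. The sketch below is an added illustration that uses `math.comb()` (Python 3.8+) rather than the notebook's `combination()`; `exact_match_outcomes` is a hypothetical name.

```python
import math

def exact_match_outcomes(k):
    # Tickets with exactly k of the 6 drawn numbers and 6 - k of the other 43
    return math.comb(6, k) * math.comb(43, 6 - k)

total = math.comb(49, 6)
counts = {k: exact_match_outcomes(k) for k in range(7)}

for k, count in counts.items():
    print("{} matches: {:,} tickets ({:.9f})".format(k, count, count / total))

# The seven cases are mutually exclusive and together cover every ticket
print(sum(counts.values()) == total)
```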


In [27]:
combination(43,1)

43.0

## Next steps
For the first version of the app, we coded four main functions:

- one_ticket_probability(): calculates the probability of winning the big prize with a single ticket
- check_historical_occurrence(): checks whether a certain combination has occurred in the Canada lottery data set
- multi_ticket_probability(): calculates the probability for any number of tickets between 1 and 13,983,816
- probability_less_6(): calculates the probability of having exactly two, three, four or five winning numbers

Possible features for a second version of the app include:

- Making the outputs even easier to understand by adding fun analogies (for example, we could look up the probabilities of unusual events and compare them with the chances of winning the lottery, producing messages along the lines of "You are 100 times more likely to be the victim of a shark attack than to win the lottery")
- Combining one_ticket_probability() and check_historical_occurrence() to output information on probability and historical occurrence at the same time
- Creating a function similar to probability_less_6() that calculates the probability of having at least two, three, four or five winning numbers. Hint: the number of successful outcomes for having at least four winning numbers is the sum of these three numbers:
 - The number of successful outcomes for having four winning numbers exactly
 - The number of successful outcomes for having five winning numbers exactly
 - The number of successful outcomes for having six winning numbers exactly
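
Following the hint, a first draft of such a function might look like the sketch below. It is only an illustration: the name `probability_at_least()` and the choice to return the value instead of printing it are assumptions, and it uses `math.comb()` (Python 3.8+).

```python
import math

def probability_at_least(k):
    # Sum the successful outcomes for exactly k, k + 1, ..., 6 matches
    successful = sum(math.comb(6, m) * math.comb(43, 6 - m) for m in range(k, 7))
    return successful / math.comb(49, 6)

for k in range(2, 6):
    print("At least {}: {:.9f}".format(k, probability_at_least(k)))
```

Returning the number keeps the calculation reusable; a user-facing message could be formatted separately, in the same way as the other functions.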