# Mobile App for Lottery Addiction

We want to contribute to the development of a mobile app that is meant to help lottery addicts better understand their chances of winning.

We think this goal matters socially: many people start playing for fun, but for some the activity turns into a habit that eventually escalates into addiction. Like other compulsive gamblers, lottery addicts soon begin spending their savings and taking out loans, accumulate debts, and eventually engage in desperate behaviors like theft.

A medical institute that aims to prevent and treat gambling addiction wants to build a dedicated mobile app to help lottery addicts. Our task is to create the logical core of the app and calculate probabilities.
For this first version of the app, we'll focus on the [6/49 lottery](https://en.wikipedia.org/wiki/Lotto_6/49) and we'll build functions that enable users to answer questions like:

* What is the probability of winning the big prize with a single ticket?
* What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
* What is the probability of having exactly five (or four, or three, or two) winning numbers on a single ticket?

We'll also consider historical data about the 6/49 lottery. [The data set](https://www.kaggle.com/datascienceai/lottery-dataset) has data for 3665 drawings, from 1982 to 2018.

The scenario we're following throughout this project is fictional — the main purpose is to practice applying probability and combinatorics concepts in a setting that simulates a real-world scenario.

## Core Functions

In the 6/49 lottery, six numbers are drawn from a set of 49 numbers that range from 1 to 49. The drawing is done without replacement, which means once a number is drawn, it's not put back in the set.

Throughout the project, we'll need to calculate probabilities and combinations repeatedly. As a consequence, we'll start by writing two functions that we'll use often:

* A function that calculates factorials.
* A function that calculates combinations.
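
For reference, the two quantities these functions compute can be written as follows. These are the standard definitions, with $n$ the size of the set and $k$ the number of items chosen:

$$n! = n \times (n-1) \times \cdots \times 2 \times 1, \qquad \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$

For the 6/49 lottery, $n = 49$ and $k = 6$, so the number of possible tickets is $\binom{49}{6} = \frac{49!}{6!\,43!} = 13{,}983{,}816$.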

In [2]:
#Defining the factorial function
def factorial(n):
    product = 1
    for x in range(2, n+1):
        product *= x
    return product

#Defining the combination function
def combinations(n, k):
    return (factorial(n) / factorial(k)) / factorial(n-k)
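
Because `combinations()` uses true division, it returns a float, and with factorials as large as 49! floating-point rounding could in principle creep in. As a cross-check, here is a minimal sketch that stays in integer arithmetic. The name `combinations_exact` is hypothetical and not part of the project, and `math.comb` requires Python 3.8 or later.

```python
import math

def factorial(n):
    product = 1
    for x in range(2, n + 1):
        product *= x
    return product

def combinations_exact(n, k):
    # Floor division keeps every intermediate value an exact Python integer
    return factorial(n) // (factorial(k) * factorial(n - k))

# Both approaches agree on the number of possible 6/49 tickets
assert combinations_exact(49, 6) == math.comb(49, 6) == 13983816
print(combinations_exact(49, 6))
```

The integer version returns an `int`, so it would not need to be wrapped in `int()` when used as a `range()` bound.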

## One-Ticket Probability

Now we'll focus on writing a function that calculates the probability of winning the big prize. In the 6/49 lottery, a player wins the big prize if the six numbers on their ticket match all six numbers drawn.

For this first version of the app, we want players to be able to calculate the probability of winning the big prize with the numbers they choose on a single ticket. We discussed this with the engineering team of the medical institute, and they told us we need to be aware of the following details:

* Inside the app, the user inputs six different numbers from 1 to 49.
* Under the hood, the six numbers will come as a Python list, which will serve as the single input to our function.
* The engineering team wants the function to print the probability value in a friendly way — in a way that people without any probability training are able to understand.

In [3]:
#Creating the function
def one_ticket_probability(ticket_numbers):
    possible_outcomes = combinations(49, 6)
    successful_outcomes = 1
    final_probability = successful_outcomes * 100 / possible_outcomes
    print("With the numbers {} you have a {:.7f}% chance of winning the big prize.\nIn other words, you have a 1 in {:,} chance of winning."
          .format(ticket_numbers, final_probability, int(100 / final_probability)))

Now let's test our function on two different tickets.

In [4]:
#Test number 1
one_ticket_probability([2, 44, 11, 7, 37, 26])

With the numbers [2, 44, 11, 7, 37, 26] you have a 0.0000072% chance of winning the big prize.
In other words, you have a 1 in 13,983,816 chance of winning.


In [5]:
#Test number 2
one_ticket_probability([30, 48, 9, 22, 29, 16])

With the numbers [30, 48, 9, 22, 29, 16] you have a 0.0000072% chance of winning the big prize.
In other words, you have a 1 in 13,983,816 chance of winning.
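
Note that `one_ticket_probability()` trusts its input. If the app cannot guarantee the format described above (six different numbers from 1 to 49), a small check could run first. The helper below is a hypothetical sketch, not something the engineering team asked for:

```python
def is_valid_ticket(ticket_numbers):
    """Return True if the ticket holds six distinct integers from 1 to 49."""
    return (
        len(ticket_numbers) == 6
        and len(set(ticket_numbers)) == 6
        and all(isinstance(n, int) and 1 <= n <= 49 for n in ticket_numbers)
    )

print(is_valid_ticket([2, 44, 11, 7, 37, 26]))   # True
print(is_valid_ticket([2, 2, 11, 7, 37, 26]))    # False, repeated number
print(is_valid_ticket([0, 44, 11, 7, 37, 50]))   # False, numbers out of range
```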


## Historical Data Check for Canada Lottery

We also have to allow users to compare their ticket against the historical lottery data and determine whether they would have ever won by now.

Let's open the data set and get familiar with its structure.

In [6]:
#Importing the library
import pandas as pd

#Opening the data set
lottery = pd.read_csv("649.csv")

#Showing the dimension of the data
lottery.shape

(3665, 11)

In [7]:
#Showing the first three rows
lottery.head(3)

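| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |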


In [8]:
#Showing the last three rows
lottery.tail(3)

| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |


The data is easy to work with: for each drawing, the six numbers drawn are in the following six columns:

* *NUMBER DRAWN 1*
* *NUMBER DRAWN 2*
* *NUMBER DRAWN 3*
* *NUMBER DRAWN 4*
* *NUMBER DRAWN 5*
* *NUMBER DRAWN 6*

## Function for Historical Data Check

We're going to write a function that enables users to compare their ticket against the historical lottery data.
We need to be aware of the following details:

* Inside the app, the user inputs six different numbers from 1 to 49.
* Under the hood, the six numbers will come as a Python list and serve as an input to our function.
* The engineering team wants us to write a function that prints:
    * The number of times the selected combination occurred in the Canada data set.
    * The probability of winning the big prize in the next drawing with that combination.

In [9]:
#Extracting the winning numbers in a set
def extract_numbers(row):
    row = row[4:10]
    row = set(row)
    return row

winning_numbers = lottery.apply(extract_numbers, axis=1)
winning_numbers.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

In [10]:
#Checking the user ticket
def check_historical_occurrence(user_ticket, winning_numbers):
    ticket = set(user_ticket)
    #Boolean Series: True where a past drawing equals the ticket
    match = winning_numbers == ticket
    #Summing the booleans counts the matching drawings
    number_matches = match.sum()
    print("The numbers {} occurred {} time/s between 1982 and 2018.\nWith the numbers {} you have a {:.7f}% chance of winning the big prize"
          .format(user_ticket, number_matches, user_ticket, 100 / combinations(49, 6)))

Now let's test our function.

In [11]:
#Test number 1
check_historical_occurrence([3,10,43,13,9,20], winning_numbers)

The numbers [3, 10, 43, 13, 9, 20] occurred 1 time/s between 1982 and 2018.
With the numbers [3, 10, 43, 13, 9, 20] you have a 0.0000072% chance of winning the big prize


In [12]:
#Test number 2
check_historical_occurrence([39,41,12,2,8,22], winning_numbers)

The numbers [39, 41, 12, 2, 8, 22] occurred 0 time/s between 1982 and 2018.
With the numbers [39, 41, 12, 2, 8, 22] you have a 0.0000072% chance of winning the big prize
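
The same check can be expressed without pandas, which may help if the app's backend receives the drawings as plain lists. The sketch below is an illustration only: `count_occurrences` is a hypothetical helper, and `past_draws` holds just the first three drawings shown earlier rather than the full data set.

```python
def count_occurrences(user_ticket, past_draws):
    """Count the past drawings whose six numbers equal the user's ticket."""
    ticket = set(user_ticket)
    # Comparing sets ignores the order in which the numbers were entered
    return sum(1 for draw in past_draws if set(draw) == ticket)

# First three drawings of the data set (June 1982), bonus number excluded
past_draws = [
    [3, 11, 12, 14, 41, 43],
    [8, 33, 36, 37, 39, 41],
    [1, 6, 23, 24, 27, 39],
]

print(count_occurrences([43, 41, 14, 12, 11, 3], past_draws))  # 1
print(count_occurrences([39, 41, 12, 2, 8, 22], past_draws))   # 0
```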


## Multi-Ticket Probability

Lottery addicts usually play more than one ticket on a single drawing, thinking that this might increase their chances of winning significantly. Our aim is to help them better understand their real chances of winning. We have to build the function based on the following information:

* The user will input the number of different tickets they want to play (without inputting the specific combinations they intend to play).
* Our function will see an integer between 1 and 13,983,816 (the maximum number of different tickets).
* The function should print information about the probability of winning the big prize depending on the number of different tickets played.

Let's start writing the function.

In [13]:
#Checking a number of tickets
def multi_ticket_probability(n):
    if n in range(1, int(combinations(49, 6) + 1)):
        probability = n * 100 / combinations(49, 6)
        print("With {} ticket/s you have a {:.7f}% chance of winning the big prize.\nIn other words, you have a 1 in {:,} chance of winning."
              .format(n, probability, int(100 / probability)))
    else:
        print("The number of tickets must be between 1 and 13,983,816")

Now we'll run a couple of tests of the function.

In [14]:
#Testing the function
test = [1, 2, 100, 1000, 6991908, 13983816, 13983817]
for element in test:
    multi_ticket_probability(element)
    print("--------------------------\n")

With 1 ticket/s you have a 0.0000072% chance of winning the big prize.
In other words, you have a 1 in 13,983,816 chance of winning.
--------------------------

With 2 ticket/s you have a 0.0000143% chance of winning the big prize.
In other words, you have a 1 in 6,991,908 chance of winning.
--------------------------

With 100 ticket/s you have a 0.0007151% chance of winning the big prize.
In other words, you have a 1 in 139,838 chance of winning.
--------------------------

With 1000 ticket/s you have a 0.0071511% chance of winning the big prize.
In other words, you have a 1 in 13,983 chance of winning.
--------------------------

With 6991908 ticket/s you have a 50.0000000% chance of winning the big prize.
In other words, you have a 1 in 2 chance of winning.
--------------------------

With 13983816 ticket/s you have a 100.0000000% chance of winning the big prize.
In other words, you have a 1 in 1 chance of winning.
--------------------------

The number of tickets must be between 1 and 13,983,816
--------------------------
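
The "1 in N" figures above come from `int(100 / probability)`, which truncates (1,000 tickets gives 13,983 rather than roughly 13,983.8) and goes through a floating-point percentage. As a hedged alternative, the odds can be kept exact with the standard library's `fractions.Fraction`. The function name `exact_odds` is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb  # Python 3.8+

TOTAL_TICKETS = comb(49, 6)  # 13,983,816 possible tickets

def exact_odds(n_tickets):
    """Return the chance of winning with n different tickets as an exact fraction."""
    if not 1 <= n_tickets <= TOTAL_TICKETS:
        raise ValueError("n_tickets must be between 1 and {:,}".format(TOTAL_TICKETS))
    return Fraction(n_tickets, TOTAL_TICKETS)

odds = exact_odds(1000)
print(odds)         # 125/1747977
print(float(odds))  # roughly 0.0000715
```

Returning a value instead of printing it leaves the wording of the message to the app.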



## Fewer Winning Numbers

Here, we are going to write one more function to allow the user to calculate probabilities for two, three, four or five winning numbers. In most lotteries there are smaller prizes if a user's ticket matches two, three, four or five of the six numbers drawn. As a consequence, users might be interested in knowing the probability of winning these prizes. These are the details we'll have to be aware of:

* Inside the app, the user inputs:
    * Six different numbers from 1 to 49.
    * An integer between 2 and 5 that represents the number of winning numbers expected.
* Our function prints information about the probability of having the inputted number of winning numbers.

In [21]:
#Creating the function 
def probability_less_6(n):
    if n not in range(2,6):
        print("Wrong number, please enter a correct number")
    else:
        successful_outcomes = combinations(6, n) * combinations(43, 6-n)
        total_outcomes = combinations(49, 6)
        probability = successful_outcomes / total_outcomes * 100
        print("Your probability of having {} winning numbers is {:.7f}%.\nIn other words you have 1 in {:,} chances to win.".format
              (n, probability, round(100 / probability)))

Now, let's test our function on all the possible inputs.

In [22]:
#Testing the function
for n in range(2,6):
    probability_less_6(n)
    print("--------------------\n")

Your probability of having 2 winning numbers is 13.2378029%.
In other words you have 1 in 8 chances to win.
--------------------

Your probability of having 3 winning numbers is 1.7650404%.
In other words you have 1 in 57 chances to win.
--------------------

Your probability of having 4 winning numbers is 0.0968620%.
In other words you have 1 in 1,032 chances to win.
--------------------

Your probability of having 5 winning numbers is 0.0018450%.
In other words you have 1 in 54,201 chances to win.
--------------------
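
The counts behind these figures come from choosing which $k$ of the six drawn numbers appear on the ticket and which $6-k$ of the remaining 43 numbers fill the rest of it. Written out (this is the standard hypergeometric formula, stated here for reference):

$$P(\text{exactly } k \text{ winning numbers}) = \frac{\binom{6}{k}\binom{43}{6-k}}{\binom{49}{6}}$$

For $k = 5$, for example, this gives $\frac{6 \times 43}{13{,}983{,}816} = \frac{258}{13{,}983{,}816} \approx 0.0018450\%$, which matches the last output above.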



## Conclusion

To recap, we wrote four main functions:

* one_ticket_probability(): calculates the probability of winning the big prize with a single ticket.
* check_historical_occurrence(): checks whether a certain combination has occurred in the lottery data set.
* multi_ticket_probability(): calculates the probability for any number of tickets between 1 and 13,983,816.
* probability_less_6(): calculates the probability of having two, three, four or five winning numbers.

The project is complete, but we would like to suggest some features for a second version of the app:

* Making the outputs even easier to understand by adding fun analogies (for instance, we could output something like: "You are 100 times more likely to be the victim of a shark attack than to win the lottery").
* Creating a function that calculates the probability of having *at least* two, three, four or five winning numbers.
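
As a starting point for the second suggestion, here is a minimal sketch of an "at least" function. It sums the exact-match counts from the requested number up to six. The name `probability_at_least` and the decision to return a percentage rather than print it are assumptions, not part of the current project, and `math.comb` requires Python 3.8 or later.

```python
from math import comb  # Python 3.8+

def probability_at_least(n):
    """Return the percentage chance of matching at least n of the six numbers drawn."""
    if n not in range(2, 6):
        raise ValueError("n must be 2, 3, 4 or 5")
    # Add up the tickets with exactly k matches for k = n, ..., 6
    successful_outcomes = sum(comb(6, k) * comb(43, 6 - k) for k in range(n, 7))
    return successful_outcomes / comb(49, 6) * 100

for n in range(2, 6):
    print("At least {} winning numbers: {:.7f}%".format(n, probability_at_least(n)))
```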