# Mobile App for Lottery Addiction

This project is a guided project from [Dataquest.io](https://dataquest.io). In the project scenario, we work for a medical institute that aims to prevent and treat gambling addictions. The institute wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. It has a team of engineers who will build the app, but they need us to create the logical core of the app and calculate the probabilities. This version of the app focuses on the [6/49 Lottery](https://en.wikipedia.org/wiki/Lotto_6/49), one of the three national lotteries in Canada. The functions in the app will enable users to answer questions like:

- What is the probability of winning the big prize with a single ticket?
- What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
- What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

## Content
1. Initial Functions
2. One-ticket Probability
3. Historical Data Check
4. Multi-ticket Probability
5. Less Winning Numbers

---

## 1. Initial Functions
This project works extensively with probabilities, so we start by defining two functions we'll use throughout: one that calculates factorials and one that calculates combinations.

**Factorial Function**

In [1]:
def factorial(n):
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product

**Combination Function**

In [2]:
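def combinations(n, k):
    # Number of ways to choose k items from n when order does not matter.
    # Integer division keeps the result an exact int (the quotient is always whole).
    return factorial(n) // (factorial(k) * factorial(n - k))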
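
As a sanity check, the two helpers can be compared against the standard library. The sketch below assumes Python 3.8 or later (for `math.comb`) and repeats the helper definitions so that it runs on its own; the integer division in `combinations` is a choice made in this sketch to keep results exact.

```python
import math


def factorial(n):
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product


def combinations(n, k):
    # Integer division: the quotient is always a whole number
    return factorial(n) // (factorial(k) * factorial(n - k))


# The hand-written helpers should agree with the standard library
assert factorial(6) == math.factorial(6) == 720
assert combinations(49, 6) == math.comb(49, 6)
print(combinations(49, 6))  # 13983816
```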

## 2. One-ticket Probability
In this section we focus on calculating the probability of winning the big prize. To win the big prize, a player needs to guess all six numbers correctly.
We discussed this with the engineering team of the medical institute, and they told us to keep the following details in mind when writing the function:

- Inside the app, the user inputs six different numbers from 1 to 49.
- Under the hood, the six numbers will come as a Python list, which will serve as the single input to our function.
- The engineering team wants the function to print the probability value in a friendly way — in a way that people without any probability training are able to understand.
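
The first requirement can be enforced before any probability is calculated. The helper below is a hypothetical addition, not part of the original project; its name and error messages are assumptions.

```python
def validate_ticket(numbers):
    """Return the ticket as a sorted list, or raise ValueError if it is invalid."""
    if len(numbers) != 6:
        raise ValueError("A ticket must contain exactly six numbers.")
    if len(set(numbers)) != 6:
        raise ValueError("The six numbers must all be different.")
    if any(not isinstance(n, int) or n < 1 or n > 49 for n in numbers):
        raise ValueError("Each number must be an integer from 1 to 49.")
    return sorted(numbers)


print(validate_ticket([45, 34, 12, 3, 16, 49]))  # [3, 12, 16, 34, 45, 49]
```

Raising an exception leaves it to the app to decide how to report the problem to the user.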


The function `one_ticket_probability()` takes a list of six unique numbers and prints the probability of winning in a way that's easy to understand. It works as follows:

- The total number of possible outcomes is the number of combinations for a six-number lottery ticket: there are 49 possible numbers, and six are sampled without replacement. We compute it with the `combinations()` function defined above.
- The user inputs just one combination, so the number of successful outcomes is 1.
- The probability for one ticket is the number of successful outcomes divided by the total number of possible outcomes.
- The probability is printed as a percentage, and `str.format()` personalizes the message with the numbers the user entered.
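
In symbols, with one successful outcome out of all possible six-number combinations:

$$
P(\text{big prize}) = \frac{1}{\binom{49}{6}} = \frac{6!\,43!}{49!} = \frac{1}{13{,}983{,}816} \approx 0.0000072\%
$$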

In [3]:
def one_ticket_probability(user_numbers):
    # Total number of possible six-number tickets drawn from 49 numbers
    n_combinations = combinations(49, 6)
    # Exactly one combination wins, so the probability is 1 / total
    probability_one_ticket = 1 / n_combinations
    winning_probability = probability_one_ticket * 100

    print('''The chances of winning with the {} numbers is {:.7f} %
In other words you have 1 in {:,} chances to win.'''.format(user_numbers, winning_probability, int(n_combinations)))


Testing the function

In [4]:
test_list_1 = [7, 1, 23, 10, 38, 19]
one_ticket_probability(test_list_1)

The chances of winning with the [7, 1, 23, 10, 38, 19] numbers is 0.0000072 %
In other words you have 1 in 13,983,816 chances to win.


In [5]:
test_list_2 = [45, 34, 12, 3, 16, 49]
one_ticket_probability(test_list_2)

The chances of winning with the [45, 34, 12, 3, 16, 49] numbers is 0.0000072 %
In other words you have 1 in 13,983,816 chances to win.


## 3. Historical Data Check
Users should also be able to compare their ticket against the historical lottery data in Canada and determine whether they would have ever won by now. We'll explore the historical data from the Canada 6/49 lottery. The data set, published by user [MC](https://www.kaggle.com/datascienceai), can be downloaded from [Kaggle](https://www.kaggle.com/datascienceai/lottery-dataset).

The data set contains historical data for 3,665 drawings (each row shows data for a single drawing), dating from 1982 to 2018. For each drawing, we can find the six numbers drawn in the following six columns:

- `NUMBER DRAWN 1`
- `NUMBER DRAWN 2`
- `NUMBER DRAWN 3`
- `NUMBER DRAWN 4`
- `NUMBER DRAWN 5`
- `NUMBER DRAWN 6`



In [6]:
import pandas as pd

lotto = pd.read_csv("649.csv")
lotto.shape

(3665, 11)

In [7]:
lotto.head(10)

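| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |
| 3 | 649 | 4 | 0 | 7/3/1982 | 3 | 9 | 10 | 13 | 20 | 43 | 34 |
| 4 | 649 | 5 | 0 | 7/10/1982 | 5 | 14 | 21 | 31 | 34 | 47 | 45 |
| 5 | 649 | 6 | 0 | 7/17/1982 | 8 | 20 | 21 | 25 | 31 | 41 | 33 |
| 6 | 649 | 7 | 0 | 7/24/1982 | 18 | 25 | 28 | 33 | 36 | 42 | 7 |
| 7 | 649 | 8 | 0 | 7/31/1982 | 7 | 16 | 17 | 31 | 40 | 48 | 26 |
| 8 | 649 | 9 | 0 | 8/7/1982 | 5 | 10 | 23 | 27 | 37 | 38 | 33 |
| 9 | 649 | 10 | 0 | 8/14/1982 | 4 | 15 | 30 | 37 | 46 | 48 | 3 |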


Next, we write the functions behind this check: one extracts the six winning numbers of each drawing as a set, and the other compares the user's ticket against those sets.

In [8]:
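def extract_numbers(row):
    # Positions 4 to 9 hold NUMBER DRAWN 1 to NUMBER DRAWN 6
    row = row.iloc[4:10]
    # A set ignores the order in which the numbers were drawn
    row = set(row)
    return row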

winning_numbers = lotto.apply(extract_numbers, axis=1)
winning_numbers.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

In [9]:
def check_historical_occurrence(user_list, winning_numbers):
    # Compare the ticket, as a set, against every historical drawing
    user_set = set(user_list)
    check_sets = user_set == winning_numbers
    n_past_wins = check_sets.sum()

    if n_past_wins >= 1:
        print('''The combination {} has won {} time(s) in the past.'''.format(user_list, n_past_wins))
    else:
        print('''The combination {} has never won in the past.'''.format(user_list))

    # Past results do not change the odds of the next drawing
    one_ticket_probability(user_list)

In [10]:
test_list_3 = [3, 41, 11, 12, 43, 14]
check_historical_occurrence(test_list_3, winning_numbers)

The combination [3, 41, 11, 12, 43, 14] has won 1 time(s) in the past.
The chances of winning with the [3, 41, 11, 12, 43, 14] numbers is 0.0000072 %
In other words you have 1 in 13,983,816 chances to win.


In [11]:
check_historical_occurrence(test_list_2, winning_numbers)

The combination [45, 34, 12, 3, 16, 49] has never won in the past.
The chances of winning with the [45, 34, 12, 3, 16, 49] numbers is 0.0000072 %
In other words you have 1 in 13,983,816 chances to win.
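
The check relies on set equality, which ignores the order of the numbers. Below is a minimal pandas-free sketch of the same idea, using the first three drawings from the table above; the function name `count_past_wins` is hypothetical.

```python
# First three drawings from the data set, stored as sets
past_draws = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
]


def count_past_wins(user_list, draws):
    """Count how many past drawings match the ticket exactly, in any order."""
    user_set = set(user_list)
    return sum(user_set == draw for draw in draws)


print(count_past_wins([3, 41, 11, 12, 43, 14], past_draws))  # 1
print(count_past_wins([45, 34, 12, 3, 16, 49], past_draws))  # 0
```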


## 4. Multi-ticket Probability
Lottery addicts usually play more than one ticket on a single drawing, thinking that this might increase their chances of winning significantly. Our purpose is to help them better estimate their chances of winning — we're going to write a function that will allow the users to calculate the chances of winning for any number of different tickets.

We've talked with the engineering team and they gave us the following information:

- The user will input the number of different tickets they want to play (without inputting the specific combinations they intend to play).
- Our function will see an integer between 1 and 13,983,816 (the maximum number of different tickets).
- The function should print information about the probability of winning the big prize depending on the number of different tickets played.

In [12]:
def multi_ticket_probability(n_tickets):
    n_tickets = int(n_tickets)
    n_combinations = combinations(49, 6)
    # Each different ticket covers one more of the possible outcomes
    probability_multi_ticket = n_tickets / n_combinations
    winning_probability = probability_multi_ticket * 100
    # Express the same probability as "1 in N", rounded to the nearest integer
    one_in = round(n_combinations / n_tickets)

    print('''The probability of winning with {:,} tickets is {:.7f} %
In other words you have 1 in {:,} chances to win.'''.format(n_tickets, winning_probability, one_in))


In [13]:
test_inputs = [1, 10, 100, 10000, 1000000, 6991908, 13983816]

for number in test_inputs:
    multi_ticket_probability(number)
    print("-" * 70)

The probability of winning with 1 tickets is 0.0000072 %
In other words you have 1 in 13,983,816 chances to win.
----------------------------------------------------------------------
The probability of winning with 10 tickets is 0.0000715 %
In other words you have 1 in 1,398,382 chances to win.
----------------------------------------------------------------------
The probability of winning with 100 tickets is 0.0007151 %
In other words you have 1 in 139,838 chances to win.
----------------------------------------------------------------------
The probability of winning with 10,000 tickets is 0.0715112 %
In other words you have 1 in 1,398 chances to win.
----------------------------------------------------------------------
The probability of winning with 1,000,000 tickets is 7.1511238 %
In other words you have 1 in 14 chances to win.
----------------------------------------------------------------------
The probability of winning with 6,991,908 tickets is 50.0000000 %
In other words you have 1 in 2 chances to win.
----------------------------------------------------------------------
The probability of winning with 13,983,816 tickets is 100.0000000 %
In other words you have 1 in 1 chances to win.
----------------------------------------------------------------------
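
The engineering team specified that the input is an integer between 1 and 13,983,816. One way to enforce that, sketched here under the assumption of Python 3.8+ (for `math.comb`), is a helper that validates the input and returns the numbers instead of printing them. The name `multi_ticket_odds` and its return format are hypothetical.

```python
import math

TOTAL_TICKETS = math.comb(49, 6)  # 13,983,816 possible tickets


def multi_ticket_odds(n_tickets):
    """Return (probability in percent, N in "1 in N") for n_tickets different tickets."""
    if not 1 <= n_tickets <= TOTAL_TICKETS:
        raise ValueError("n_tickets must be between 1 and {:,}".format(TOTAL_TICKETS))
    probability = n_tickets / TOTAL_TICKETS
    return probability * 100, round(TOTAL_TICKETS / n_tickets)


percentage, one_in = multi_ticket_odds(6991908)
print("{:.1f}% (about 1 in {:,})".format(percentage, one_in))  # 50.0% (about 1 in 2)
```

Returning values rather than printing lets the app format the message however it likes.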

## 5. Less Winning Numbers
We're going to write one more function to allow users to calculate the probability of having two, three, four, or five winning numbers. In most 6/49 lotteries there are smaller prizes if a player's ticket matches two, three, four, or five of the six numbers drawn, so users might be interested in these probabilities as well. These are the engineering details we need to be aware of:

- Inside the app, the user inputs:
    - six different numbers from 1 to 49; and
    - an integer between 2 and 5 that represents the number of winning numbers expected
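
For a ticket to have exactly $k$ winning numbers, $k$ of the six numbers drawn must be among the six on the ticket, and the other $6 - k$ must come from the 43 numbers that are not on the ticket. This gives:

$$
P(k) = \frac{\binom{6}{k}\binom{43}{6-k}}{\binom{49}{6}}
$$

For example, for $k = 5$:

$$
P(5) = \frac{6 \cdot 43}{13{,}983{,}816} = \frac{258}{13{,}983{,}816} \approx 0.001845\%
$$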

In [14]:
def probability_less_6(user_number):
    # Ways to choose which numbers on the ticket are winning numbers
    n_combinations_ticket = combinations(6, user_number)
    # Ways for the other drawn numbers to come from the 43 not on the ticket
    n_combinations_remains = combinations(43, 6 - user_number)
    successful_outcomes = n_combinations_ticket * n_combinations_remains

    probability = successful_outcomes / combinations(49, 6)

    probability_percentage = probability * 100
    combinations_simplified = round(combinations(49, 6) / successful_outcomes)
    print('''Your chances of having exactly {} winning number(s) with this ticket are {:.6f}%.
In other words, you have a 1 in {:,} chance of this happening.'''.format(user_number, probability_percentage,
                                                                        int(combinations_simplified)))

In [15]:
test_inputs = [1, 2, 3, 4, 5]

for number in test_inputs:
    probability_less_6(number)
    print("-" * 70)

Your chances of having exactly 1 winning number(s) with this ticket are 41.301945%.
In other words, you have a 1 in 2 chance of this happening.
----------------------------------------------------------------------
Your chances of having exactly 2 winning number(s) with this ticket are 13.237803%.
In other words, you have a 1 in 8 chance of this happening.
----------------------------------------------------------------------
Your chances of having exactly 3 winning number(s) with this ticket are 1.765040%.
In other words, you have a 1 in 57 chance of this happening.
----------------------------------------------------------------------
Your chances of having exactly 4 winning number(s) with this ticket are 0.096862%.
In other words, you have a 1 in 1,032 chance of this happening.
----------------------------------------------------------------------
Your chances of having exactly 5 winning number(s) with this ticket are 0.001845%.
In other words, you have a 1 in 54,201 chance of this happening.
----------------------------------------------------------------------
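
The function above gives the probability of having exactly the given number of winning numbers. The question of having at least that many can be answered by summing the exact probabilities. A minimal sketch, assuming Python 3.8+ for `math.comb`; both function names are hypothetical.

```python
import math


def probability_exactly(k):
    """Probability that exactly k of the ticket's six numbers are drawn."""
    return math.comb(6, k) * math.comb(43, 6 - k) / math.comb(49, 6)


def probability_at_least(k):
    """Probability that k or more of the ticket's six numbers are drawn."""
    return sum(probability_exactly(i) for i in range(k, 7))


for k in (5, 4, 3, 2):
    print("At least {}: {:.6f}%".format(k, probability_at_least(k) * 100))
```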
