## Title: Mobile App to Prevent and Treat Gambling Addictions

## Introduction
Many people start playing the lottery for fun and then become addicted.
A medical institute that aims to prevent and treat gambling addictions wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning.

For the first version of the app, we will focus on historical data coming from the national 6/49 lottery game in Canada and build functions that enable users to answer questions like:

- What is the probability of winning the big prize with a single ticket?
- What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
- What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

In [3]:
def factorial(n):
    if (n == 0):
        return 1
    else:
        return n * factorial(n-1)

In [4]:
def combinations(n,k):
    return factorial(n)/(factorial(n-k) * factorial(k))
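
As a cross-check on the two helper functions above, the sketch below computes the same quantity with `math.comb` from Python's standard library (available in Python 3.8 and later). The name `combinations_exact` is introduced here only for illustration and is not part of the original notebook. Because it uses integer arithmetic throughout, it returns an exact integer rather than the float that `/` produces.

```python
import math


def combinations_exact(n, k):
    """Number of ways to choose k items out of n, as an exact integer."""
    # math.comb works on integers only, so no float rounding is involved.
    return math.comb(n, k)


# Total number of possible tickets in a 6/49 lottery.
total_tickets = combinations_exact(49, 6)
print("{:,}".format(total_tickets))  # 13,983,816
```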

### One-ticket Probability
For the first version of the app, we want players to be able to calculate the probability of winning the big prize with the various numbers they play on a single ticket (for each ticket a player chooses six numbers out of 49). 

In [5]:
def one_ticket_probability(numbers):
    # Number of ways to choose len(numbers) numbers out of 49
    k = len(numbers)
    outcomes = combinations(49, k)
    probability = 1 / outcomes
    print("The probability of winning the 6/49 lottery is {:.7f}%. In other words, your chances of winning are 1 in {:,}.".format(probability * 100, int(outcomes)))

In [6]:
one_ticket_probability([3,45,10,9,23,1])

The probability of winning the 6/49 lottery is 0.0000072%. In other words, your chances of winning are 1 in 13,983,816.


In [7]:
test_input = [2, 43, 22, 23, 11, 5]
one_ticket_probability(test_input)

The probability of winning the 6/49 lottery is 0.0000072%. In other words, your chances of winning are 1 in 13,983,816.


### Historical Data Check for Canada Lottery
Next, we explore historical data from the Canada 6/49 lottery. The data set can be downloaded from [Kaggle](https://www.kaggle.com/datascienceai/lottery-dataset). It contains historical data for 3,665 drawings (each row shows data for a single drawing), dating from 1982 to 2018.

In [11]:
import pandas as pd

lottery = pd.read_csv("649.csv")

In [12]:
lottery.shape

(3665, 11)

In [13]:
lottery.head()

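| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |
| 3 | 649 | 4 | 0 | 7/3/1982 | 3 | 9 | 10 | 13 | 20 | 43 | 34 |
| 4 | 649 | 5 | 0 | 7/10/1982 | 5 | 14 | 21 | 31 | 34 | 47 | 45 |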
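
To double-check the stated date range, the `DRAW DATE` column can be parsed into datetimes. The sketch below is self-contained, so it loads only the five rows shown above from an inline string instead of reading `649.csv`; the names `sample_csv`, `sample`, and `draw_dates` are assumptions made for this example. On the full `lottery` data frame the same `pd.to_datetime` call should apply, provided every date follows the month/day/year pattern seen here.

```python
import io

import pandas as pd

# The first five drawings, copied from the lottery.head() output.
sample_csv = """PRODUCT,DRAW NUMBER,SEQUENCE NUMBER,DRAW DATE,NUMBER DRAWN 1,NUMBER DRAWN 2,NUMBER DRAWN 3,NUMBER DRAWN 4,NUMBER DRAWN 5,NUMBER DRAWN 6,BONUS NUMBER
649,1,0,6/12/1982,3,11,12,14,41,43,13
649,2,0,6/19/1982,8,33,36,37,39,41,9
649,3,0,6/26/1982,1,6,23,24,27,39,34
649,4,0,7/3/1982,3,9,10,13,20,43,34
649,5,0,7/10/1982,5,14,21,31,34,47,45
"""

sample = pd.read_csv(io.StringIO(sample_csv))

# Parse month/day/year strings into proper timestamps.
draw_dates = pd.to_datetime(sample["DRAW DATE"], format="%m/%d/%Y")

# Earliest and latest drawing in the sample.
print(draw_dates.min().date(), draw_dates.max().date())
```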


### Function for Historical Data Check
Extract the six winning numbers of each drawing from the historical data set.

In [14]:
def extract_numbers(row):
    # Positions 4 to 9 hold NUMBER DRAWN 1 through NUMBER DRAWN 6
    no_drawn = set(row.iloc[4:10].values)
    return no_drawn

In [17]:
winning_nos = []
for index, row in lottery.iterrows():
    #print(row.values)
    winning_nos.append(extract_numbers(row))

In [18]:
winning_nos[:5]

[{3, 11, 12, 14, 41, 43},
 {8, 33, 36, 37, 39, 41},
 {1, 6, 23, 24, 27, 39},
 {3, 9, 10, 13, 20, 43},
 {5, 14, 21, 31, 34, 47}]

In [19]:
winning_nos = lottery.apply(extract_numbers, axis=1)

In [20]:
def check_historical_occurrence(user_nos, winning_nos):
    user_nos = set(user_nos)
    # Compare the user's set against every past winning set (element-wise)
    match = user_nos == winning_nos
    # Number of times the combination entered by the user occurred in the past
    matching_no_count = match.sum()
    if matching_no_count == 0:
        print("This combination has never occurred. Your chances of winning the big prize in the next draw are 1 in {:,}."
              .format(int(combinations(49, 6))))
    else:
        print("This combination has occurred {} time(s) in the past. Your chances of winning the big prize in the next draw are 1 in {:,}."
              .format(matching_no_count, int(combinations(49, 6))))

In [22]:
test_input = [33, 36, 37, 39, 8, 41]
check_historical_occurrence(test_input, winning_nos)

This combination has occurred 1 time(s) in the past. Your chances of winning the big prize in the next draw are 1 in 13,983,816.
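
For use inside an app, a variant that returns the count instead of printing may be easier to test. The following is a minimal sketch under that assumption: `count_historical_matches` is a hypothetical helper, and `past_draws` holds only the five drawings shown earlier rather than the full data set. It relies on plain Python sets, so it does not depend on pandas.

```python
def count_historical_matches(user_nos, past_draws):
    """Return how many past drawings exactly match the user's six numbers."""
    user_set = set(user_nos)
    # Set equality ignores order, so [33, 36, 37, 39, 8, 41] matches {8, 33, ...}.
    return sum(1 for draw in past_draws if draw == user_set)


# The first five winning combinations from the historical data.
past_draws = [
    {3, 11, 12, 14, 41, 43},
    {8, 33, 36, 37, 39, 41},
    {1, 6, 23, 24, 27, 39},
    {3, 9, 10, 13, 20, 43},
    {5, 14, 21, 31, 34, 47},
]

print(count_historical_matches([33, 36, 37, 39, 8, 41], past_draws))  # 1
print(count_historical_matches([1, 2, 3, 4, 5, 6], past_draws))       # 0
```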


### Multi-ticket Probability

Lottery addicts usually play more than one ticket on a single drawing, thinking that this might increase their chances of winning significantly. To help them better estimate those chances, we write a function that calculates the probability of winning for any number of different tickets.

In [23]:
def multi_ticket_probability(ticket_number):
    # Each different ticket covers one of the equally likely outcomes
    p_outcomes = combinations(49, 6)
    probability = ticket_number / p_outcomes
    if ticket_number == 1:
        print("If you buy {} ticket, your chances of winning are 1 in {:,}, or {:.7f}%.".format(ticket_number, int(p_outcomes), probability * 100))
    else:
        print("If you buy {:,} tickets, your chances of winning are 1 in {:,}, or {:.6f}%.".format(ticket_number, round(1 / probability), probability * 100))

In [24]:
test_input = [1, 10, 100, 10000, 1000000, 6991908, 13983816]
for ticket_no in test_input:
    multi_ticket_probability(ticket_no)

If you buy 1 ticket, your chances of winning are 1 in 13,983,816, or 0.0000072%.
If you buy 10 tickets, your chances of winning are 1 in 1,398,382, or 0.000072%.
If you buy 100 tickets, your chances of winning are 1 in 139,838, or 0.000715%.
If you buy 10,000 tickets, your chances of winning are 1 in 1,398, or 0.071511%.
If you buy 1,000,000 tickets, your chances of winning are 1 in 14, or 7.151124%.
If you buy 6,991,908 tickets, your chances of winning are 1 in 2, or 50.000000%.
If you buy 13,983,816 tickets, your chances of winning are 1 in 1, or 100.000000%.
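
The "1 in N" figures printed above are rounded, which hides some detail (for example, 1,000,000 tickets gives roughly 1 in 13.98, not exactly 1 in 14). As one possible refinement, the sketch below returns an exact fraction using only the standard library. The name `multi_ticket_fraction`, the constant `TOTAL_OUTCOMES`, and the range check are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

TOTAL_OUTCOMES = comb(49, 6)  # 13,983,816 possible tickets


def multi_ticket_fraction(n_tickets):
    """Exact probability of winning the big prize with n different tickets."""
    if not 1 <= n_tickets <= TOTAL_OUTCOMES:
        raise ValueError("n_tickets must be between 1 and {:,}".format(TOTAL_OUTCOMES))
    # Each distinct ticket covers exactly one of the equally likely outcomes.
    return Fraction(n_tickets, TOTAL_OUTCOMES)


print(multi_ticket_fraction(6991908))         # 1/2
print(float(multi_ticket_fraction(1000000)))  # about 0.0715
```

Returning a value rather than printing also leaves the wording of the message to the app's interface layer.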


### Fewer Winning Numbers: Function
So far, we have written three main functions:

- `one_ticket_probability()`: calculates the probability of winning the big prize with a single ticket
- `check_historical_occurrence()`: checks whether a certain combination has occurred in the Canada lottery data set
- `multi_ticket_probability()`: calculates the probability for any number of tickets between 1 and 13,983,816

In most 6/49 lotteries there are smaller prizes if a player's ticket matches two, three, four, or five of the six numbers drawn. As a consequence, users might be interested in knowing the probability of having two, three, four, or five winning numbers.


The function named `probability_less_6()` takes in an integer between 2 and 5 and prints information about the chances of winning depending on the value of that integer.
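
The counting argument behind this function can be written out as follows. A ticket has exactly k winning numbers when k of its numbers come from the 6 numbers drawn and the remaining 6 - k come from the 43 numbers not drawn. This is the standard hypergeometric formula, given here as a worked check rather than taken from the original text; the symbol k stands for the function's integer argument.

```latex
P(\text{exactly } k \text{ winning numbers})
  = \frac{\binom{6}{k}\,\binom{43}{6-k}}{\binom{49}{6}},
  \qquad k = 2, 3, 4, 5

% Worked example for k = 4:
P(4) = \frac{\binom{6}{4}\,\binom{43}{2}}{\binom{49}{6}}
     = \frac{15 \times 903}{13{,}983{,}816}
     = \frac{13{,}545}{13{,}983{,}816}
     \approx 0.096862\%
```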

In [25]:
def probability_less_6(n_choice):
    # Ways to pick n_choice of the 6 winning numbers, times the ways to
    # pick the remaining numbers from the 43 non-winning numbers
    successful_outcomes = combinations(6, n_choice) * combinations(43, 6 - n_choice)

    # Total number of possible tickets in a 6/49 lottery
    possible_outcomes = combinations(49, 6)

    # Probability expressed as a percentage
    probability = (successful_outcomes / possible_outcomes) * 100

    print("The probability of having exactly {} winning numbers on this ticket is {:.6f}%.".format(n_choice, probability))

In [27]:
probability_less_6(4)

The probability of having exactly 4 winning numbers on this ticket is 0.096862%.


In [28]:
for ticket in [2,3,4,5]:
    probability_less_6(ticket)

The probability of having exactly 2 winning numbers on this ticket is 13.237803%.
The probability of having exactly 3 winning numbers on this ticket is 1.765040%.
The probability of having exactly 4 winning numbers on this ticket is 0.096862%.
The probability of having exactly 5 winning numbers on this ticket is 0.001845%.


### Next Steps
That was all for the guided part of the project! We managed to write four main functions for our app:

- `one_ticket_probability()`: calculates the probability of winning the big prize with a single ticket
- `check_historical_occurrence()`: checks whether a certain combination has occurred in the Canada lottery data set
- `multi_ticket_probability()`: calculates the probability for any number of tickets between 1 and 13,983,816
- `probability_less_6()`: calculates the probability of having two, three, four or five winning numbers

Possible features for a second version of the app include:
- Making the outputs even easier to understand by adding fun analogies. For example, we can find probabilities for strange events and compare them with the chances of winning the lottery, with output along the lines of "You are 100 times more likely to be the victim of a shark attack than to win the lottery".
- Combining the `one_ticket_probability()` and `check_historical_occurrence()` functions to output information on probability and historical occurrence at the same time.
- Creating a function similar to `probability_less_6()` that calculates the probability of having at least two, three, four or five winning numbers. Hint: the number of successful outcomes for having at least four winning numbers is the sum of these three numbers:
    - The number of successful outcomes for having four winning numbers exactly
    - The number of successful outcomes for having five winning numbers exactly
    - The number of successful outcomes for having six winning numbers exactly
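
The hint above can be sketched as follows. This is one possible implementation under stated assumptions, not part of the guided project: the name `probability_at_least` is hypothetical, and `math.comb` (Python 3.8 and later) stands in for the notebook's `combinations()` so that the block runs on its own.

```python
from math import comb


def probability_at_least(n_min):
    """Probability (0 to 1) of at least n_min winning numbers on one ticket."""
    if not 0 <= n_min <= 6:
        raise ValueError("n_min must be between 0 and 6")
    # Sum the exact-match counts for n_min, n_min + 1, ..., 6 winning numbers.
    successful = sum(comb(6, k) * comb(43, 6 - k) for k in range(n_min, 7))
    return successful / comb(49, 6)


for n_min in [2, 3, 4, 5]:
    print("At least {}: {:.6f}%".format(n_min, probability_at_least(n_min) * 100))
```

For at least four winning numbers, the three counts from the hint are 13,545, 258, and 1, giving 13,804 successful outcomes out of 13,983,816.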
    