## Building the Logic for a Lottery Mobile App

A medical institute that aims to prevent and treat gambling addiction wants to build a dedicated mobile app to help lottery addicts better estimate their chances of winning. The institute has a team of engineers who will build the app, but they need us to create its logical core and calculate the probabilities.
In this project, the focus is on the 6/49 lottery. Using past data, I will build functions that enable users to answer the following questions:
* What is the probability of winning the big prize with a single ticket?
* What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
* What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

In [1]:
## Writing a function for factorial of n

def factorial(n):
    result = 1
    for i in range(0, n):
        x = n - i
        result = result * x
    return result

In [2]:
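## Writing a function for the number of ways to choose k items from n

def combinations(n, k):
    # Integer division keeps the count exact; float division can lose precision for large n
    return factorial(n) // (factorial(k) * factorial(n - k))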
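
As a quick sanity check on the two helpers above, the standard library offers `math.factorial` and `math.comb` (Python 3.8+). The sketch below re-declares the helpers so that it runs on its own, and confirms that there are 13,983,816 possible six-number tickets. The name `TOTAL_TICKETS` is introduced here for illustration only.

```python
import math

def factorial(n):
    # Multiply 2 * 3 * ... * n; returns 1 for n = 0 and n = 1
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

def combinations(n, k):
    # Exact integer count of k-element subsets of an n-element set
    return factorial(n) // (factorial(k) * factorial(n - k))

TOTAL_TICKETS = combinations(49, 6)  # hypothetical name, for illustration

# Compare against the standard-library implementations
assert factorial(10) == math.factorial(10)
assert TOTAL_TICKETS == math.comb(49, 6)
print(TOTAL_TICKETS)  # 13983816
```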

#### Determining the Probability of Winning the Big Prize

In the 6/49 lottery, six numbers are drawn without replacement from a set of 49 numbers ranging from 1 to 49. A player wins the big prize only if the six numbers on their ticket match all six numbers drawn. For example, a player holding a ticket with the numbers {13, 22, 24, 27, 42, 44} wins the big prize only if the numbers drawn are {13, 22, 24, 27, 42, 44}; if even one number differs, they do not win. In the next cell, I will write a function that reports the probability of winning the big prize in a user-friendly way.

In [3]:
def one_ticket_probability(n):
    # n is the list of numbers on the ticket; only its length is used,
    # because every combination of that size is equally likely
    x = len(n)
    prob = 1 / combinations(49, x)
    return "For that combination, you have a {:.10f} % chance of winning the big prize".format(prob*100)

In [4]:
print(one_ticket_probability([2,3,4,5,6,7]))

For that combination, you have a 0.0000071511 % chance of winning the big prize
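
Percentages this small are hard to read, so it can help to express the same result as odds. The sketch below is one possible way to do that, using `fractions.Fraction` and `math.comb` from the standard library. The function name `one_ticket_odds` and its parameters are assumptions made for illustration, not part of the notebook.

```python
from fractions import Fraction
import math

def one_ticket_odds(numbers_per_ticket=6, pool_size=49):
    # Every combination is equally likely, so one ticket covers 1 of C(pool, k) outcomes
    return Fraction(1, math.comb(pool_size, numbers_per_ticket))

odds = one_ticket_odds()
print("1 in {:,}".format(odds.denominator))   # 1 in 13,983,816
print("{:.10f} %".format(float(odds) * 100))  # 0.0000071511 %
```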


#### Using Historical Dataset to Determine Chances

In the next cells, I'll explore the historical data from the Canada 6/49 lottery. Users will be able to compare their ticket against the historical draws and see whether it would ever have won.

In [5]:
import pandas as pd
file = pd.read_csv('649.csv')

In [6]:
file.head()

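|   | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |
| 3 | 649 | 4 | 0 | 7/3/1982 | 3 | 9 | 10 | 13 | 20 | 43 | 34 |
| 4 | 649 | 5 | 0 | 7/10/1982 | 5 | 14 | 21 | 31 | 34 | 47 | 45 |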


In [7]:
file.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3665 entries, 0 to 3664
Data columns (total 11 columns):
PRODUCT            3665 non-null int64
DRAW NUMBER        3665 non-null int64
SEQUENCE NUMBER    3665 non-null int64
DRAW DATE          3665 non-null object
NUMBER DRAWN 1     3665 non-null int64
NUMBER DRAWN 2     3665 non-null int64
NUMBER DRAWN 3     3665 non-null int64
NUMBER DRAWN 4     3665 non-null int64
NUMBER DRAWN 5     3665 non-null int64
NUMBER DRAWN 6     3665 non-null int64
BONUS NUMBER       3665 non-null int64
dtypes: int64(10), object(1)
memory usage: 315.0+ KB


In [45]:
# function that collects the six drawn numbers of each row as a set

def extract_numbers(n):
    a = n[['NUMBER DRAWN 1', 'NUMBER DRAWN 2', 'NUMBER DRAWN 3', 'NUMBER DRAWN 4', 'NUMBER DRAWN 5', 'NUMBER DRAWN 6']].values
    return set(a.tolist())

In [47]:
winning_set = file.apply(extract_numbers, axis=1)

In [62]:
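# function that compares the user's numbers with the historical winning combinations

def check_historical_occurrence(n, m):
    # n: Series of winning sets (one per draw); m: the six numbers on the user's ticket
    m = set(m)
    # compare set by set rather than relying on how pandas compares a Series to a set
    length = int(n.apply(lambda drawn: drawn == m).sum())
    return "Your combination has occurred {} time(s) in the past draws (1982-2018). The chance of winning the big prize with one ticket in any single draw is 0.0000071511 %".format(length)

In [63]:
print(check_historical_occurrence(winning_set, {33, 36, 37, 39, 8, 41}))

Your combination has occurred 1 time(s) in the past draws (1982-2018). The chance of winning the big prize with one ticket in any single draw is 0.0000071511 %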
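
To make the matching logic concrete without the CSV file, the sketch below applies the same idea to the five draws shown in the `file.head()` output above, using plain Python instead of pandas. Order does not matter in a draw, so each draw is stored as a `frozenset` and tallied with `collections.Counter`. The names `sample_draws`, `draw_counts`, and `count_occurrences` are hypothetical.

```python
from collections import Counter

# First five draws, as shown in the head() output above
sample_draws = [
    [3, 11, 12, 14, 41, 43],
    [8, 33, 36, 37, 39, 41],
    [1, 6, 23, 24, 27, 39],
    [3, 9, 10, 13, 20, 43],
    [5, 14, 21, 31, 34, 47],
]

# frozenset ignores order and is hashable, so it can be used as a Counter key
draw_counts = Counter(frozenset(draw) for draw in sample_draws)

def count_occurrences(ticket, counts=draw_counts):
    # Number of past draws whose six numbers equal the ticket's numbers
    return counts[frozenset(ticket)]

print(count_occurrences([41, 39, 37, 36, 33, 8]))  # 1
print(count_occurrences([1, 2, 3, 4, 5, 6]))       # 0
```

Counting once up front means each later lookup is a single dictionary access, which may matter if the app checks many tickets against the full history.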


#### Multi Ticket Probability

Lottery addicts often play more than one ticket on a single drawing, thinking that this significantly increases their chances of winning. Since the objective of this project is to help them better estimate those chances, in the next couple of cells I will write a function that calculates the chance of winning for any number of different tickets.

In [64]:
# function for calculating the chance of winning from the number of different tickets played

def multi_ticket_probability(n):
    # n different tickets cover n of the equally likely six-number combinations
    prob = n / combinations(49, 6)
    return "For that number of tickets, you have a {:.10f} % chance of winning the big prize".format(prob*100)

In [65]:
print(multi_ticket_probability(10000))

For that number of tickets, you have a 0.0715112384 % chance of winning the big prize
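
The app's engineers may prefer a numeric result that they can format themselves. The following sketch, which assumes that all tickets are different, returns the probability as a float and rejects ticket counts outside the range 1 to 13,983,816. The names `TOTAL_OUTCOMES` and `multi_ticket_chance` are hypothetical.

```python
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # 13,983,816 possible tickets

def multi_ticket_chance(n_tickets):
    # Assumes all tickets are different, so they cover n_tickets distinct outcomes
    if not 1 <= n_tickets <= TOTAL_OUTCOMES:
        raise ValueError("n_tickets must be between 1 and {}".format(TOTAL_OUTCOMES))
    return n_tickets / TOTAL_OUTCOMES

for n in (1, 40, 10000, TOTAL_OUTCOMES):
    print("{:>10,} ticket(s): {:.10f} %".format(n, multi_ticket_chance(n) * 100))
```

The chance grows only linearly with the number of tickets, so it reaches certainty only when every one of the possible combinations has been bought.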


#### Fewer Winning Numbers: Function

For extra context, most 6/49 lotteries award smaller prizes if a player's ticket matches two, three, four, or five of the six numbers drawn. Users might therefore be interested in the probability of having exactly two, three, four, or five winning numbers. In the next cell, I will write one more function to calculate these probabilities.

In [77]:
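# function for the probability of having exactly n winning numbers

def probability_less_6(n):
    # reject anything outside 2 to 6 winning numbers
    if not 2 <= n <= 6:
        return 'Invalid Input'
    comb = combinations(6, n)                 # ways to pick n of the 6 winning numbers
    n_combinations = combinations(43, 6 - n)  # ways to pick the rest from the 43 other numbers
    outcomes = n_combinations * comb
    tot_outcomes = combinations(49,6)
    prob = outcomes/tot_outcomes
    return "There is a probability of {:.10f} % of having exactly {} winning numbers".format(prob*100, n)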
    

In [78]:
probability_less_6(2)

'There is a probability of 13.2378029002 % of having exactly 2 winning numbers'
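
The calculation above is a hypergeometric probability: choose `k` of the 6 winning numbers and the remaining `6 - k` from the 43 non-winning ones. As a sanity check, the probabilities for `k` = 0 through 6 should add up to exactly 1. The sketch below verifies this with exact fractions; `exactly_k` and `distribution` are names introduced for illustration.

```python
from fractions import Fraction
import math

def exactly_k(k):
    # Ways to match exactly k numbers, divided by all possible tickets
    favorable = math.comb(6, k) * math.comb(43, 6 - k)
    return Fraction(favorable, math.comb(49, 6))

distribution = {k: exactly_k(k) for k in range(7)}
for k, p in distribution.items():
    print("exactly {}: {:.10f} %".format(k, float(p) * 100))

# The seven cases are mutually exclusive and cover every possible ticket
assert sum(distribution.values()) == 1
```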

#### Conclusion

The basic logic for the app has been coded in a modular style in this project. This is only the first version; for the second version, the following points could be considered:
* Making the outputs even easier to understand by adding fun analogies (for example, we can find probabilities for strange events and compare them with the chances of winning the lottery; for instance, we can output something along the lines of "You are 100 times more likely to be the victim of a shark attack than to win the lottery").
* Combining the one_ticket_probability() and check_historical_occurrence() to output information on probability and historical occurrence at the same time.
* Creating a function similar to probability_less_6() which calculates the probability of having at least two, three, four or five winning numbers.
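
As a rough sketch of the last point, the probability of at least `k` winning numbers is the sum of the "exactly" probabilities for `k`, `k + 1`, and so on up to 6. The names below (`TOTAL_OUTCOMES`, `ways_exactly_k`, `at_least_k`) are hypothetical, and the snippet uses `math.comb` rather than the notebook's own helpers so that it runs on its own.

```python
import math

TOTAL_OUTCOMES = math.comb(49, 6)

def ways_exactly_k(k):
    # Number of tickets with exactly k of the 6 winning numbers
    return math.comb(6, k) * math.comb(43, 6 - k)

def at_least_k(k):
    # Sum the mutually exclusive cases k, k + 1, ..., 6
    if not 0 <= k <= 6:
        raise ValueError("k must be between 0 and 6")
    favorable = sum(ways_exactly_k(i) for i in range(k, 7))
    return favorable / TOTAL_OUTCOMES

for k in (2, 3, 4, 5):
    print("at least {}: {:.10f} %".format(k, at_least_k(k) * 100))
```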