# What can the lottery teach us about probability?

The lottery is a fascinating concept that plays strongly on our emotions. Why say no to an idea that promises huge prizes for minimal effort? That is the simple idea behind the lottery, at least at first glance.

In this notebook, let's explore some probability and combinatorics concepts as they apply to the lottery.


## Context

Giving the lottery a try is not bad at all. It simply shows our willingness to play with random events and, of course, our curiosity about non-deterministic results. The problem arises when the lottery becomes a source of addiction and people end up with financial, personal, professional, or relationship problems.

So, our main goal is to demonstrate to those people, on mathematical foundations, that the lottery is a random event and that the odds of winning this game are not in our favor.

To create the simulation, we use real, publicly available data from the 6/49 lottery game in Canada. The dataset can be found on Kaggle: https://www.kaggle.com/datascienceai/lottery-dataset


## Basic concepts

Combinations are our friends when we want to count groups of elements drawn from a set without replacement and without regard to order.

The kind of lottery we use throughout this project states: "6 numbers are drawn from a set of 49 possible numbers."

So, our first mathematical definition:

A combination selects k objects from a group of n objects, where order doesn't matter and there is no replacement. The formula is: $C(n, k) = \frac{n!}{k!\,(n-k)!}$


In [1]:
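# Iterative factorial: multiplies n * (n-1) * ... * 1
def factorial(n):
    result = 1
    for i in range(n, 0, -1):
        result *= i
    return result

# Number of ways to choose k items out of n (order ignored, no replacement)
def combination(n, k):
    # Integer division keeps the result exact; the quotient is always a whole number
    return factorial(n) // (factorial(k) * factorial(n - k))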
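
As a quick sanity check, the result can be compared with `math.comb` from Python's standard library (available since Python 3.8). The sketch below redefines a compact integer version of the formula so that it runs on its own; the name `combination_check` is illustrative and not part of the notebook.

```python
import math

def combination_check(n, k):
    # Same formula as above, n! / (k! * (n - k)!), with exact integer division
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# Number of possible 6/49 tickets, computed two ways
total_tickets = combination_check(49, 6)
assert total_tickets == math.comb(49, 6)
print(total_tickets)  # 13983816
```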

## Winning the big prize

To win the big prize, our 6 numbers must match the 6 winning numbers exactly. Let's calculate this probability. But first, what does probability mean, and why is it so important?

A probability is a number between 0 and 1 that expresses the "frequency" or "degree of belief" with which a random variable takes a specific value or set of values. It's worth noting that there are different approaches to probability, such as frequentist and Bayesian.

Probability is an awesome discipline because it lets us extract objective, quantitative information from random processes.

Now, let's create a function to compute it:

In [2]:
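def big_prize_proba(my_numbers):
    # Every ticket is one of combination(49, 6) equally likely outcomes,
    # so the result does not depend on which numbers were picked
    proba = (1 / int(combination(49, 6))) * 100
    print('''The probability to win the big prize with your numbers {} is {}%'''.format(my_numbers, proba))

# Some arbitrary examples
big_prize_proba([10, 20, 30, 40, 43, 44])
big_prize_proba([1, 3, 6, 7, 21, 41])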

The probability to win the big prize with your numbers [10, 20, 30, 40, 43, 44] is 7.151123842018516e-06%
The probability to win the big prize with your numbers [1, 3, 6, 7, 21, 41] is 7.151123842018516e-06%


As we can see, it doesn't matter which numbers we choose; the probability is the same. So why do so many people believe that some numbers are magic or lucky? Are there compelling reasons to keep believing that?
Let's find out.

## What would have happened if I had played in ...?
Now, with the help of our dataset, let's propose a game called "What would have happened if I had played..." Naturally, we keep the assumptions discussed so far: the game takes place in Canada and uses the 6/49 lottery format.

The goal is purely recreational, but there's something else: it lets us demonstrate that every realization of this random experiment is new, with no dependence on earlier ones. We are, then, talking about independent random variables.

First of all, let's inspect the structure of our data with the help of the head() method.

In [3]:
import pandas as pd
lottery = pd.read_csv('649.csv')
lottery.head()

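|  | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |
| 3 | 649 | 4 | 0 | 7/3/1982 | 3 | 9 | 10 | 13 | 20 | 43 | 34 |
| 4 | 649 | 5 | 0 | 7/10/1982 | 5 | 14 | 21 | 31 | 34 | 47 | 45 |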


The dataset contains interesting data, but some columns, such as PRODUCT, DRAW NUMBER, and SEQUENCE NUMBER, don't help our analysis. So let's extract only the relevant data to work with.

In [4]:
def winners_numbers_historic(row):
    # Columns at positions 4 to 9 hold the six drawn numbers (the bonus number is excluded)
    numbers = row.iloc[4:10]
    # A set lets us compare draws regardless of order
    return set(numbers.values)

lucky_numbers = lottery.apply(winners_numbers_historic, axis=1)
lucky_numbers.head()

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

In [5]:
def historic(my_numbers, lucky_past_numbers):
    my_numbers = set(my_numbers)
    # Element-wise comparison: True wherever a past draw equals our set
    check_numbers = lucky_past_numbers == my_numbers
    n_occurrences = check_numbers.sum()

    if n_occurrences == 0:
        print('''Your set of numbers {} has never occurred'''.format(my_numbers))
    else:
        print('''Your set of numbers {} has occurred {} time(s) in the past ({} independent games)'''.format(
            my_numbers, n_occurrences, len(lucky_past_numbers)))

historic([1, 6, 39, 23, 24, 27], lucky_numbers)
# This ticket has 7 numbers, so it can never equal a 6-number draw
historic([19, 17, 42, 7, 8, 9, 11], lucky_numbers)


Your set of numbers {1, 6, 39, 23, 24, 27} has occurred 1 time(s) in the past (3665 independent games)
Your set of numbers {7, 8, 9, 42, 11, 17, 19} has never occurred


As we can see, it is very hard to find a combination that has occurred two or more times in this dataset.
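
One way to check this claim directly is to count how often each drawn combination appears. The sketch below assumes the draws are available as an iterable of sets (like `lucky_numbers` above); the function name `count_repeated_draws` and the toy data are hypothetical and used only for illustration.

```python
from collections import Counter

def count_repeated_draws(draws):
    # Sets are not hashable, so convert each draw to a frozenset before counting
    counts = Counter(frozenset(draw) for draw in draws)
    # Keep only the combinations that were drawn more than once
    return {combo: n for combo, n in counts.items() if n > 1}

# Toy data (made up): the first and third draws are the same combination
toy_draws = [{1, 2, 3, 4, 5, 6}, {7, 8, 9, 10, 11, 12}, {6, 5, 4, 3, 2, 1}]
repeated = count_repeated_draws(toy_draws)
print(len(repeated))  # 1
```

With the real data, `count_repeated_draws(lucky_numbers)` would return an empty dictionary if no combination has ever been drawn twice.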

## Small prizes: there is something beyond the big prize

As we have seen, this type of lottery has 13,983,816 possible combinations. In other words, only one of these options will be the winner.

For this reason, it makes sense to offer smaller prizes for matching fewer numbers. In the end, these games are built on the concept of expected value: it must always be positive for the operator, which ensures that the house wins in the long run. The game also has to keep a certain balance and reward players often enough (the dopamine principle). Otherwise, nobody would play.

So let's take a quick look at how the probabilities change when we match fewer numbers. The behaviour is striking.
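
To make the expected-value idea concrete, here is a minimal sketch. The ticket price and prize amounts are invented for illustration and are not the real 6/49 prize table; only the match probabilities follow from the combinatorics of drawing 6 numbers out of 49.

```python
from math import comb

TICKET_PRICE = 3  # hypothetical price, not the real one
# Hypothetical prizes by number of matched numbers (not the real prize table)
PRIZES = {3: 10, 4: 80, 5: 2_000, 6: 5_000_000}

def match_probability(i):
    # Probability that exactly i of our 6 numbers are among the 6 drawn
    return comb(6, i) * comb(43, 6 - i) / comb(49, 6)

# Expected net result of one ticket from the player's point of view
expected_value = sum(PRIZES[i] * match_probability(i) for i in PRIZES) - TICKET_PRICE
print(expected_value < 0)  # True: on average the player loses money
```

Under these made-up numbers the player's expected value is negative, which is the other side of the operator's positive expected value.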

In [6]:
for i in range(1,6):
    print('''{} winning numbers: {}'''.format(i, combination(6,i)*combination(49-6,6-i)/combination(49,6)))     

1 winning numbers: 0.4130194504847604
2 winning numbers: 0.13237802900152576
3 winning numbers: 0.017650403866870102
4 winning numbers: 0.000968619724401408
5 winning numbers: 1.8449899512407772e-05
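
The loop above skips the cases of 0 and 6 matching numbers. As a consistency check, the sketch below computes all seven cases with exact fractions and confirms that they add up to 1. It uses `math.comb` (Python 3.8+) instead of the notebook's `combination` so that it is self-contained; the helper name `match_distribution` is illustrative.

```python
from fractions import Fraction
from math import comb

def match_distribution(pool=49, picks=6):
    # Exact probability of matching exactly i numbers, for i = 0..picks
    total = comb(pool, picks)
    return {i: Fraction(comb(picks, i) * comb(pool - picks, picks - i), total)
            for i in range(picks + 1)}

distribution = match_distribution()
# The seven cases cover every possible outcome, so they must sum to exactly 1
assert sum(distribution.values()) == 1
for matches, probability in distribution.items():
    print(matches, float(probability))
```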


## Final thoughts

Knowing probability is really helpful for making better, more informed decisions. Modern probability is built on measure theory, and it lets us quantify and compare uncertainty on a normalized scale. Of course, interesting phenomena can still happen. For example, Richard Lustig, an American man, won large prizes in seven state-sponsored lottery games between 1993 and 2010.

There are many concepts and interesting facts in probability that can surprise us. I hope this work boosts your probabilistic thinking a little.
 