# Lottery Probability Analysis for Responsible Gambling

![image.png](image.png)

A renowned medical institute focused on preventing and treating gambling addiction is developing a mobile application to assist lottery players in making informed decisions about their chances of winning. While their in-house engineering team is responsible for app development, our task is to design and implement the probability calculation engine that powers the app's insights.

For the initial release, our focus is on the [6/49 lottery](https://en.wikipedia.org/wiki/Lotto_6/49). We will build core functions that allow users to answer key probability-related questions, including:

- What are the odds of winning the jackpot with a single ticket?
- How does purchasing multiple tickets impact the probability of winning?
- What is the likelihood of matching a specific number of winning numbers (e.g., two, three, four, or five)?

## Core Probability Functions

In [1]:
def factorial(n):
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product

In [2]:
def combination(n, k):
    # The quotient is always a whole number, so integer division keeps the result an exact int.
    return factorial(n) // (factorial(k) * factorial(n - k))

In [3]:
combination(52, 4)

270725

In [4]:
factorial(5)

120
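
As a sanity check on the helpers above, the following sketch compares the same formula against the standard-library functions `math.factorial` and `math.comb` (available since Python 3.8). The name `combination_int` is hypothetical and only illustrates an integer-returning variant of the formula; it is not part of the original engine.

```python
import math

def combination_int(n: int, k: int) -> int:
    """Hypothetical integer-returning variant of the combination formula."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# math.comb computes n-choose-k exactly, so both approaches should agree.
for n, k in [(52, 4), (49, 6), (6, 2)]:
    assert combination_int(n, k) == math.comb(n, k)

total_6_49 = math.comb(49, 6)
print(f"{total_6_49:,}")  # 13,983,816 possible 6/49 draws
```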

## Winning Probability for a Single Ticket

In the 6/49 lottery, six numbers are drawn at random, without replacement, from the numbers 1 to 49. A player wins the jackpot only if all six numbers on their ticket match the drawn numbers. The order of the draw does not matter, and a ticket that misses even one number does not win the jackpot.

To help players understand their odds, we will implement a function that calculates and displays the probability of winning with a single ticket in a user-friendly manner.
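
Written out, and assuming only that the order of the drawn numbers is irrelevant, the number of possible draws and the resulting single-ticket jackpot probability are:

$$
\binom{49}{6} = \frac{49!}{6!\,43!} = 13{,}983{,}816,
\qquad
P(\text{jackpot}) = \frac{1}{13{,}983{,}816} \approx 7.15 \times 10^{-8}
$$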

### Engineering Considerations:
- Users input six unique numbers from 1 to 49.
- The function receives these numbers as a Python list.
- The probability should be presented in an easy-to-understand format.
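
The input constraints listed above could be enforced before any calculation runs. The sketch below is one possible approach; `validate_ticket` is a hypothetical helper name and is not part of the functions defined in this notebook.

```python
def validate_ticket(numbers: list) -> list:
    """Hypothetical helper: check a 6/49 ticket and return it sorted."""
    if len(numbers) != 6:
        raise ValueError("A ticket must contain exactly six numbers.")
    if len(set(numbers)) != 6:
        raise ValueError("Ticket numbers must be unique.")
    if not all(isinstance(n, int) and 1 <= n <= 49 for n in numbers):
        raise ValueError("Each number must be an integer from 1 to 49.")
    return sorted(numbers)

print(validate_ticket([6, 5, 4, 3, 2, 1]))  # [1, 2, 3, 4, 5, 6]
```

Raising `ValueError` mirrors the error handling already used for the ticket count later in this notebook; the app team may prefer to return a message instead.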

In [5]:
def one_ticket_probability(numbers: list):
    # Every six-number combination is equally likely, so the result does not depend on `numbers`.
    total_outcomes = combination(n=49, k=6)
    probability = 1 / total_outcomes
    print(f"Probability of winning the jackpot with {numbers}: {probability * 100:.10f}%")
    print(f"You need to buy {total_outcomes / 2:,.0f} different tickets to have a 50% chance of winning.")

In [6]:
one_ticket_probability([1,2,3,4,5,6])

Probability of winning the jackpot with [1, 2, 3, 4, 5, 6]: 0.0000071511%
You need to buy 6,991,908 different tickets to have a 50% chance of winning.


## Historical Data Analysis for Canada 6/49 Lottery

To provide additional insights, we analyze historical data from the Canada 6/49 lottery, available on [Kaggle](https://www.kaggle.com/datascienceai/lottery-dataset).

In [7]:
import pandas as pd

In [8]:
df = pd.read_csv('649.csv')

In [9]:
print(f"Dataset contains {df.shape[0]:,} rows and {df.shape[1]:,} columns")

Dataset contains 3,665 rows and 11 columns


In [10]:
df.head()

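|   | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |
| 3 | 649 | 4 | 0 | 7/3/1982 | 3 | 9 | 10 | 13 | 20 | 43 | 34 |
| 4 | 649 | 5 | 0 | 7/10/1982 | 5 | 14 | 21 | 31 | 34 | 47 | 45 |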


The dataset includes records of 3,665 drawings from 1982 to 2018. Each draw has six winning numbers recorded in the following columns:
- `NUMBER DRAWN 1`
- `NUMBER DRAWN 2`
- `NUMBER DRAWN 3`
- `NUMBER DRAWN 4`
- `NUMBER DRAWN 5`
- `NUMBER DRAWN 6`

## Checking Historical Occurrences of a Ticket

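We will implement a function that lets users check how many times their selected combination has appeared in historical draws and see the probability of winning the next drawing. Each draw is independent of the previous ones, so past occurrences do not change that probability: it stays at 1 in 13,983,816 for every combination.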

### Engineering Considerations:
- The function takes a list of six unique numbers.
- It compares the user’s selection against historical records.
- It prints:
    - The number of past occurrences.
    - The probability of winning in the next draw.
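
The comparison step can also be sketched without pandas. The example below assumes a small in-memory list, `sample_draws`, in place of the full dataset: its first three rows are the draws shown in `df.head()`, and the fourth row is an invented repeat added only to illustrate order-insensitive matching. The names `sample_draws`, `draw_counts`, and `count_occurrences` are hypothetical.

```python
from collections import Counter

# Illustrative data only: rows 1-3 come from df.head(); row 4 is an invented repeat.
sample_draws = [
    [3, 11, 12, 14, 41, 43],
    [8, 33, 36, 37, 39, 41],
    [1, 6, 23, 24, 27, 39],
    [43, 41, 14, 12, 11, 3],  # same combination as the first row, different order
]

# frozenset is hashable and ignores order, so equal combinations share one key.
draw_counts = Counter(frozenset(draw) for draw in sample_draws)

def count_occurrences(numbers: list) -> int:
    """Return how often the combination appears in sample_draws."""
    return draw_counts[frozenset(numbers)]

print(count_occurrences([3, 11, 12, 14, 41, 43]))  # 2
print(count_occurrences([1, 2, 3, 4, 5, 6]))       # 0
```

Building the counter once makes each later lookup a single dictionary access, which may matter if the app checks many tickets against the same history.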

In [11]:
def extract_numbers(df):
    cols = [col for col in df.columns if 'NUMBER DRAWN' in col]
    data = [set(i) for i in df[cols].values.tolist()]
    return data

In [12]:
winning_numbers = pd.Series(extract_numbers(df))

In [13]:
def check_historical_occurrence(numbers: list, historical_numbers=winning_numbers):
    numbers = set(numbers)
    matches = (historical_numbers == numbers).sum()
    # Draws are independent, so the next-draw probability is the same for every combination.
    probability = 1 / combination(n=49, k=6)
    print(f"Your numbers matched {matches} time{'s' if matches != 1 else ''} in history.")
    print(f"Probability of winning the next draw with these numbers is {probability * 100:.10f}%")

In [14]:
check_historical_occurrence([3, 11, 12, 14, 41, 43])

Your numbers matched 1 time in history.
Probability of winning the next draw with these numbers is 0.0000071511%


## Multi-ticket Probability

Many lottery players buy multiple tickets to increase their odds. To help them assess the impact, we will create a function that calculates the probability of winning the jackpot for a given number of tickets, assuming every ticket carries a different combination.

### Engineering Considerations:
- Users input the number of tickets they intend to buy.
- The function calculates the probability of winning based on that input.
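
The same relationship can be read in reverse: given a target probability, how many different tickets would be required? The sketch below assumes every ticket carries a distinct combination; `tickets_needed` and `TOTAL_OUTCOMES` are hypothetical names introduced for illustration.

```python
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # 13,983,816 possible draws

def tickets_needed(target_probability: float) -> int:
    """Hypothetical helper: fewest distinct tickets whose combined chance reaches the target."""
    if not 0 < target_probability <= 1:
        raise ValueError("target_probability must be in (0, 1].")
    # With distinct tickets the probability is tickets / TOTAL_OUTCOMES, so round up.
    return math.ceil(target_probability * TOTAL_OUTCOMES)

print(f"{tickets_needed(0.5):,}")  # 6,991,908
```

Showing the required ticket count next to a target probability may communicate the scale of the odds more directly than a small percentage does.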


In [15]:
def multi_ticket_probability(tickets: int):
    total_outcomes = combination(n=49, k=6)
    if not (1 <= tickets <= total_outcomes):
        raise ValueError(f"Number of tickets must be between 1 and {total_outcomes:,.0f}")
    # Each ticket is assumed to be a different combination, so the chances add up.
    probability = tickets / total_outcomes
    # Seven decimals keep the single-ticket probability from rounding to zero.
    print(f"Winning probability with {tickets:,} ticket{'s' if tickets > 1 else ''}: {probability * 100:.7f}%")

In [16]:
multi_ticket_probability(tickets=1500)

Winning probability with 1,500 tickets: 0.0107267%


In [17]:
for tickets in [1, 10, 100, 10000, 1000000, 6991908, 13983816]:
    multi_ticket_probability(tickets)

Winning probability with 1 ticket: 0.0000072%
Winning probability with 10 tickets: 0.0000715%
Winning probability with 100 tickets: 0.0007151%
Winning probability with 10,000 tickets: 0.0715112%
Winning probability with 1,000,000 tickets: 7.1511238%
Winning probability with 6,991,908 tickets: 50.0000000%
Winning probability with 13,983,816 tickets: 100.0000000%


## Probability of Winning Smaller Prizes

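In most 6/49 lotteries, players can win smaller prizes by matching two, three, four, or five numbers. We will create a function that calculates the probability that a ticket matches exactly a given number of the six winning numbers. The run below also includes six matches so the result can be compared with the jackpot probability.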
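
The counting argument behind this calculation can be written as follows, where $k$ is the number of matches: choose $k$ of the six winning numbers, then choose the remaining $6 - k$ ticket numbers from the 43 numbers that were not drawn.

$$
P(\text{exactly } k \text{ matches}) = \frac{\binom{6}{k}\binom{43}{6-k}}{\binom{49}{6}}
$$

For example, $k = 5$ gives $\frac{6 \cdot 43}{13{,}983{,}816} = \frac{258}{13{,}983{,}816} \approx 0.0018\%$.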

### Engineering Considerations:
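- Users input:
    - Six unique numbers (the result is the same for every valid combination, so the function below takes only the match count).
    - A target match count (between 2 and 5).
- The function prints the probability of matching exactly that many numbers.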

In [18]:
def probability_less_6(k):
    total_outcomes = combination(n=49, k=6)
    
    successful_combinations = combination(n=6, k=k)
    remaining_numbers = 49 - 6
    left_combinations = combination(n=remaining_numbers, k=6-k)
    successful_outcomes = successful_combinations * left_combinations
    
    probability = successful_outcomes / total_outcomes
    print(f"Probability of winning a prize with {k} matching number{'s' if k > 1 else ''}: {probability * 100:.10f}%")

In [19]:
for i in range(2,7):
    probability_less_6(i)

Probability of winning a prize with 2 matching numbers: 13.2378029002%
Probability of winning a prize with 3 matching numbers: 1.7650403867%
Probability of winning a prize with 4 matching numbers: 0.0968619724%
Probability of winning a prize with 5 matching numbers: 0.0018449900%
Probability of winning a prize with 6 matching numbers: 0.0000071511%
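
One way to check these figures is to confirm that the probabilities for every possible match count, from zero to six, add up to exactly 1. The sketch below uses `fractions.Fraction` to avoid floating-point rounding; `exact_match_probability` is a hypothetical helper name, not part of the functions above.

```python
from fractions import Fraction
from math import comb

def exact_match_probability(k: int) -> Fraction:
    """Hypothetical helper: chance that a ticket matches exactly k of the 6 drawn numbers."""
    return Fraction(comb(6, k) * comb(43, 6 - k), comb(49, 6))

# The seven cases k = 0..6 are mutually exclusive and cover every possible draw.
total = sum(exact_match_probability(k) for k in range(7))
print(total)  # 1
```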


### Probability of having at least X winning numbers

In [20]:
def probability_at_least(x):
    total_outcomes = combination(n=49, k=6)
    successful_outcomes = 0
    remaining_numbers = 49 - 6
    
    for k in range(x, 7):
        successful_combinations = combination(n=6, k=k)
        left_combinations = combination(n=remaining_numbers, k=6-k)
        successful_outcomes += successful_combinations * left_combinations
        
    probability = successful_outcomes / total_outcomes
    print(f"Probability of winning a prize with at least {x} matching number{'s' if x > 1 else ''}: {probability * 100:.10f}%")

In [21]:
for i in range(2,7):
    probability_at_least(i)

Probability of winning a prize with at least 2 matching numbers: 15.1015574004%
Probability of winning a prize with at least 3 matching numbers: 1.8637545002%
Probability of winning a prize with at least 4 matching numbers: 0.0987141135%
Probability of winning a prize with at least 5 matching numbers: 0.0018521411%
Probability of winning a prize with at least 6 matching numbers: 0.0000071511%
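
The cumulative figures can be cross-checked with the complement rule: the probability of at least `x` matches equals one minus the probability of fewer than `x` matches. The following sketch compares both routes; the names `exact_count`, `at_least_direct`, and `at_least_complement` are hypothetical.

```python
from math import comb

TOTAL = comb(49, 6)  # number of possible 6/49 draws

def exact_count(k: int) -> int:
    """Number of draws that match exactly k numbers of a fixed ticket."""
    return comb(6, k) * comb(43, 6 - k)

def at_least_direct(x: int) -> float:
    """Sum the cases x, x+1, ..., 6 directly."""
    return sum(exact_count(k) for k in range(x, 7)) / TOTAL

def at_least_complement(x: int) -> float:
    """Subtract the cases 0, 1, ..., x-1 from 1."""
    return 1 - sum(exact_count(k) for k in range(x)) / TOTAL

for x in range(2, 7):
    assert abs(at_least_direct(x) - at_least_complement(x)) < 1e-12

print(f"{at_least_direct(3) * 100:.4f}%")  # 1.8638%
```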
