# Lottery Project - What Is The Chance of Winning

## Table of Contents <a name="toc"></a>
<ul>
    <li><a href="#odds">Odds and Probability</a></li>
    <li><a href="#source">What Odds Do The National Lottery Claim?</a></li>
    <li><a href="#derived">Deriving The Odds Arithmetically</a></li>
</ul>

## Odds and Probability <a name="odds"></a>
Whilst *odds* and *probability* mean the same thing to a layman, a statistician or a gambler will tell you they are different.

Without getting too deep into it, the basic difference is:
<ul>
    <li>A probability is a value $p$ with $0 \leq p \leq 1$. If we add up the probabilities of all possible outcomes of an event, the sum is one.</li>
    <li>The odds of an event are, on average, how many times the event occurs compared to how many times it doesn't.</li>
</ul>
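
To make the distinction concrete, here is a minimal sketch that simulates rolling a fair die and computes both quantities from the same rolls. The seed, the number of rolls, and the variable names are illustrative choices, not part of the original analysis.

```python
import random

rng = random.Random(0)      # fixed seed (an arbitrary choice) so the run is repeatable
n_rolls = 60_000            # illustrative sample size

rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
sixes = rolls.count(6)
not_sixes = n_rolls - sixes

probability = sixes / n_rolls   # sixes as a share of all rolls
odds = sixes / not_sixes        # sixes compared with non-sixes

print(f'probability is roughly {probability:.3f} (expect about 1/6)')
print(f'odds are roughly {odds:.3f} (expect about 1/5)')
```

The probability divides by every roll, whereas the odds divide only by the rolls that were not a six, which is why the two numbers differ.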

The National Lottery site gives the odds for each prize-winning outcome. This may be easy for a gambler to parse, but I'd argue it's easier for a statistician to understand in terms of the probability.

The first thing we're going to do is understand how to transform one into the other. The odds are the ratio of the probability of the event occurring to the probability of it not occurring. Luckily, as the probabilities of all outcomes sum to $1$, if the probability of an event occurring is $p$ then the probability of that event not occurring is $1 - p$. Thus
\begin{equation}
\mathrm{odds} = \frac{\mathrm{probability}}{1 - \mathrm{probability}}. \notag
\end{equation}
With a little manipulation, we obtain the reverse formula
\begin{equation}
\mathrm{probability} = \frac{\mathrm{odds}}{1 + \mathrm{odds}}. \notag
\end{equation}
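
For readers who want the intermediate steps, one way to carry out the manipulation is shown here, writing $o$ for the odds and $p$ for the probability (symbols introduced only for brevity):
\begin{align}
o & = \frac{p}{1 - p} \notag \\
o(1 - p) & = p \notag \\
o & = p(1 + o) \notag \\
p & = \frac{o}{1 + o}. \notag
\end{align}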

We can write some simple Python functions to handle the conversion between the two.

In [99]:
def prob_to_odds(prob):
    return prob / (1 - prob)

As a test, we know the probability of rolling a six on a perfect die is $1/6$, so we expect the odds to be
\begin{equation}
\mathrm{odds} = \frac{\frac{1}{6}}{1 - \frac{1}{6}} = \frac{1}{5}. \notag
\end{equation}

In [100]:
round(prob_to_odds(1/6), 4)

0.2

That's kinda useful, since we know $0.2 = 1/5$. But when we get more complicated odds, we want to be able to read them more easily. We can use the `fractions` module to return the decimal odds as a fraction.

In [101]:
import fractions
fractions.Fraction(prob_to_odds(1/6))

Fraction(7205759403792793, 36028797018963968)

That's made things worse! The float we passed in is only the nearest binary approximation to $0.2$, and `Fraction` reproduces that approximation exactly. Ideally, we want a unit fraction, that is, a fraction with one in the numerator. If we divide the numerator by itself, we get one. To keep the value of the fraction the same, we also need to divide the denominator by the numerator.

In [111]:
def unit_fraction(decimal):
    return [1, 1 / decimal]

unit_fraction(prob_to_odds(1/6))

[1, 5.0]

The result is now much simpler to understand. Now to turn our attention to turning odds into a probability. We know to expect $p = 1/6 = 0.1\overline{6}$ if we pass in the odds of rolling a six on a fair die.

In [113]:
def odds_to_prob(odds):
    return odds / (1 + odds)

round(odds_to_prob(1/5), 4)

0.1667

That's good, though for the sake of automation we'll also allow our function to take the list from *unit_fraction()*, then check it works for each argument type. We could do further type-checking, obviously, but as this is just for this document we'll leave it at that.

In [114]:
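def odds_to_prob(odds):
    # Accept the [numerator, denominator] list returned by unit_fraction()
    if isinstance(odds, list):
        odds = odds[0] / odds[1]

    return odds / (1 + odds)

print(f'By fraction - {round(odds_to_prob(1/5), 4)}.')
# Pass the list form so that the list branch is actually exercised
print(f'By list - {round(odds_to_prob(unit_fraction(prob_to_odds(1/6))), 4)}.')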

By fraction - 0.1667.
By list - 0.1667.
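
The long fraction returned by `fractions.Fraction` earlier is a floating-point effect rather than a flaw in the formula. As a sketch of one alternative (not the approach used in the rest of this document), we can either start from an exact `Fraction` or ask for the closest fraction with a small denominator. The cap of 1000 on the denominator is an arbitrary choice.

```python
from fractions import Fraction

# Start from an exact rational so that no binary rounding creeps in
p = Fraction(1, 6)
exact_odds = p / (1 - p)
print(exact_odds)                      # 1/5

# Or recover a readable fraction from the float result
float_odds = (1/6) / (1 - 1/6)
readable_odds = Fraction(float_odds).limit_denominator(1000)
print(readable_odds)                   # 1/5

# The reverse conversion stays exact as well
print(exact_odds / (1 + exact_odds))   # 1/6
```

The exact route only helps when the inputs are known as ratios of integers; for quoted figures such as $96.2$, floats remain the practical choice.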


[Table of Contents](#toc)

## What Odds Do The National Lottery Claim? <a name="source"></a>
First, we can look at the odds the National Lottery claims for each prize, and the overall odds of winning any prize.

| Match | Odds | Prize |
|:---|:---|:---|
| Jackpot (Six Balls) | $1:45,057,474$ | A Share of $9.79\%$ of Sales or at least $£1,000,000$ |
| Five + Bonus Ball | $1:7,509,579$ | $£1,000,000$ |
| Five Balls | $1:144,415$ | $£1,750$ |
| Four Balls | $1:2,180$ | $£140$ |
| Three Balls | $1:96.2$ | $£30$ |
| Two Balls | $1:10.3$ | Free lucky dip, worth $£2$ |
| Any Prize | $1:9.3$ | |
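
The "Any Prize" row can be used as a rough sanity check on the rest of the table. The sketch below assumes that the prize tiers are mutually exclusive (a ticket is paid at one tier only) and that each figure can be read as a "1 in N" chance. Both are assumptions made here for illustration, not statements from the National Lottery. Under those assumptions the tier chances simply add.

```python
# Second term of each quoted figure, copied from the table above
one_in = {
    'six': 45_057_474,
    'bonus': 7_509_579,
    'five': 144_415,
    'four': 2_180,
    'three': 96.2,
    'two': 10.3,
}

# Assumption: tiers are mutually exclusive, so their chances add
any_prize = sum(1 / n for n in one_in.values())
print(f'Any prize: about 1 in {1 / any_prize:.1f}')  # Any prize: about 1 in 9.3
```

The sum lands on the quoted $1:9.3$ after rounding, which suggests the table is at least internally consistent under this reading.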

We'll perform our working in a pandas DataFrame for ease of seeing each step. First we obtain a DataFrame with the odds as a decimal.

In [200]:
import pandas as pd

raw_odds = {'six': 1/45057474,
            'bonus': 1/7509579,
            'five': 1/144415,
            'four': 1/2180,
            'three': 1/96.2,
            'two': 1/10.3}

# Build the DataFrame from the dict defined above (one row per prize tier)
df = pd.DataFrame(raw_odds, index=['raw_odds']).T
df

| | raw_odds |
|:---|:---|
| six | 2.219388e-08 |
| bonus | 1.331633e-07 |
| five | 6.924488e-06 |
| four | 0.0004587156 |
| three | 0.01039501 |
| two | 0.09708738 |


Then we use our *unit_fraction()* function to convert the decimal odds to human-readable odds.

In [201]:
# Build the human-readable odds string for each row from its unit fraction
df['odds'] = ['1/' + str(round(unit_fraction(x)[1], 1)) for x in df['raw_odds']]
df

| | raw_odds | odds |
|:---|:---|:---|
| six | 2.219388e-08 | 1/45057474.0 |
| bonus | 1.331633e-07 | 1/7509579.0 |
| five | 6.924488e-06 | 1/144415.0 |
| four | 0.0004587156 | 1/2180.0 |
| three | 0.01039501 | 1/96.2 |
| two | 0.09708738 | 1/10.3 |


Finally, we use our *odds_to_prob()* function to convert the *raw_odds* into probabilities.

In [202]:
df['prob'] = odds_to_prob(df['raw_odds'])
df

| | raw_odds | odds | prob |
|:---|:---|:---|:---|
| six | 2.219388e-08 | 1/45057474.0 | 2.219388e-08 |
| bonus | 1.331633e-07 | 1/7509579.0 | 1.331632e-07 |
| five | 6.924488e-06 | 1/144415.0 | 6.924441e-06 |
| four | 0.0004587156 | 1/2180.0 | 0.0004585053 |
| three | 0.01039501 | 1/96.2 | 0.01028807 |
| two | 0.09708738 | 1/10.3 | 0.08849558 |


As an aside, we notice an interesting trend immediately: the smaller the probability, the smaller the difference between the odds and the probability. This is because, for very small $p$,
\begin{equation}
\mathrm{odds} = \frac{\mathrm{probability}}{1 - \mathrm{probability}}
\simeq \frac{\mathrm{probability}}{1}. \notag
\end{equation}
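
The size of the gap can be made exact with one line of algebra. Writing $p$ for the probability,
\begin{equation}
\mathrm{odds} - p = \frac{p}{1 - p} - p = \frac{p^2}{1 - p}, \notag
\end{equation}
so the difference shrinks roughly like $p^2$. For the two-ball row, $p \simeq 0.0885$ gives a gap of about $0.0086$, in line with the table above, while for the jackpot row the gap is of order $10^{-16}$.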

[Table of Contents](#toc)

## Deriving The Odds Arithmetically <a name="derived"></a>
The simplest way of calculating the probability of winning the jackpot is to consider that we need to match six chosen numbers to six balls drawn from a possible fifty-nine. We can start by calculating the probability that the first ball drawn matches one of our six chosen numbers. This is a six in fifty-nine chance, and the order of the draw doesn't matter.
\begin{equation}
p(\mathrm{matching~one~of~six~numbers}) = \frac{6}{59} \simeq 0.102. \notag
\end{equation}

We are then looking to match one of our remaining five numbers to one of the fifty-eight remaining balls.
\begin{equation}
p(\mathrm{matching~one~of~five~numbers}) = \frac{5}{58} \simeq 0.086. \notag
\end{equation}

We repeat the pattern until we have accounted for all six draws. The probability of all six events happening one after the other is the product of the individual probabilities,
\begin{align}
p(\mathrm{matching~six~numbers})
& = \frac{6}{59} \times \frac{5}{58} \times \frac{4}{57}
\times \frac{3}{56} \times \frac{2}{55} \times \frac{1}{54} \notag \\
& = \frac{720}{32441381280} \notag \\
& \simeq 2.219 \times 10^{-8}. \notag
\end{align}

In [203]:
p_6 = (6/59) * (5/58) * (4/57) * (3/56) * (2/55) * (1/54)
print(f'The probability of drawing six matching numbers is {p_6}.')
print(f'The odds of drawing six matching numbers are {unit_fraction(prob_to_odds(p_6))}.')

The probability of drawing six matching numbers is 2.2193876203535066e-08.
The odds of drawing six matching numbers are [1, 45057473.00000001].


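We can see that the values obtained by our method are very close to those from the National Lottery website. The gap of exactly one in the second term of the odds is not a rounding error: a probability of 1 in 45,057,474 corresponds to odds of 1 to 45,057,473, because odds compare wins against losses rather than against all outcomes. Only the trailing $.00000001$ comes from floating-point rounding.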
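
To separate genuine differences from floating-point noise, the same product can be evaluated with exact rationals. This is a sketch using the standard-library `fractions` and `math` modules (`math.prod` needs Python 3.8 or later); the variable names are illustrative.

```python
from fractions import Fraction
from math import prod

# One factor per draw: 6/59, 5/58, ..., 1/54
p_6_exact = prod(Fraction(6 - i, 59 - i) for i in range(6))
odds_6_exact = p_6_exact / (1 - p_6_exact)

print(p_6_exact)      # 1/45057474
print(odds_6_exact)   # 1/45057473
```

Evaluated exactly, the probability is 1 in 45,057,474 and the corresponding odds are 1 to 45,057,473.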

We can also notice a pattern: the numerator of the fraction can be written as $6!$, or six-factorial. Further, in the denominator we are multiplying from $59$ down to $54$, and this can be written as $59!$ divided by $(59 - 6)!$. Therefore, a shorter formula is
\begin{equation}
p(\mathrm{matching~six~numbers})
= \frac{6!}{59! / (59 - 6)!}. \notag
\end{equation}

This looks familiar; in fact, if we jiggle it a little bit we have
\begin{equation}
p(\mathrm{matching~six~numbers})
= \frac{6! \, (59 - 6)!}{59!}, \notag
\end{equation}
which is the reciprocal of the binomial coefficient, sometimes read as "$n$ choose $k$" as it describes the number of ways to choose $k$ items from a set of $n$ items without repetition. The formula for the binomial coefficient is
\begin{equation}
\begin{pmatrix} n \\ k \end{pmatrix}
= \frac{n!}{k!(n - k)!}, \notag
\end{equation}
and we can write
\begin{align}
\frac{1}{p(\mathrm{matching~six~numbers~of~59~possibilities})}
= \frac{59!}{6!(59 - 6)!}, \notag \\
\frac{1}{p(\mathrm{matching~}k\mathrm{~numbers~of~}n\mathrm{~possibilities})}
= \frac{n!}{k!(n - k)!}. \notag
\end{align}
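
As a quick check of the formula, the sketch below evaluates the factorial expression directly and compares it with `math.comb` from the standard library (available from Python 3.8). The variable names are illustrative.

```python
import math

n, k = 59, 6

# n! / (k! (n - k)!) using integer arithmetic throughout
by_factorials = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))
by_comb = math.comb(n, k)

print(by_factorials, by_comb)   # 45057474 45057474
```

Both routes give the same count of possible six-number selections, and it matches the second term of the jackpot figure quoted by the National Lottery.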

We can see that the probability derived from the odds ratio provided by the National Lottery and the probability derived from the binomial coefficient agree to seven significant figures.

In [215]:
import scipy.special as special

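# Probability derived from the quoted odds, then the same conversion applied to 1 / C(59, 6)
print(df.loc['six', 'prob'])
print((1 / special.comb(59, 6)) / (1 + (1 / special.comb(59, 6))))

# Quoted odds as a decimal, then 1 / C(59, 6) directly
print(df.loc['six', 'raw_odds'])
print(1 / special.comb(59, 6))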

2.2193875710966934e-08
2.2193875710966934e-08
2.2193876203535066e-08
2.2193876203535066e-08
