## Motivating Example

In 1693, Samuel Pepys wrote a letter to Isaac Newton inquiring about a wager that Pepys was planning to make. Pepys wanted to know which of the following events had the highest probability of occurring.

A. 6 dice are thrown and at least 1 is a 6

B. 12 dice are thrown and at least 2 are 6s

C. 18 dice are thrown and at least 3 are 6s

Pepys thought that C had the highest probability, but Newton disagreed.

The probability of A is straightforward to calculate. We use the Complement Rule (Theorem 5.2), much like we did in the Chevalier de Méré example from Lesson 5.

$$\begin{align}
    P(\text{at least one 6 in six rolls}) &= 1 - P(\text{no 6 in six rolls}) \\
    &= 1 - \frac{5^6}{6^6} \\
    &\approx 0.665
\end{align}$$
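
As a quick sanity check, the complement-rule value can be reproduced exactly and then approximated by simulation. This is only a minimal sketch: the helper name `simulate_at_least_one_six`, the seed, and the number of trials are illustrative choices, not part of the original problem.

```python
import random
from fractions import Fraction

# Exact value via the complement rule: 1 - (5/6)^6
p_exact = 1 - Fraction(5, 6) ** 6
print(float(p_exact))  # about 0.665

def simulate_at_least_one_six(trials, seed=0):
    """Estimate P(at least one 6 in six rolls) by simulation (hypothetical helper)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        rolls = [rng.randint(1, 6) for _ in range(6)]  # six fair dice
        if 6 in rolls:
            hits += 1
    return hits / trials

p_sim = simulate_at_least_one_six(100_000)
print(p_sim)  # should land close to the exact value
```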

What about the other 2 events?



## Theory

- Previously, we covered the hypergeometric distribution
    - It arises when draws are made from a box **without replacement**

- Here we cover the binomial distribution
    - It arises when draws are made from a box **with replacement**

- Theorem 13.1: If a random variable can be described as the number of 1s in $n$ random draws, with replacement, from a box of size $N$ with $N_0$ 0s and $N_1$ 1s, then its PMF is given by
    - $$\begin{align}
        f(x) &= \frac{\binom{n}{x} N_1^{x} N_0^{n-x}}{N^n} \\
        &= \binom{n}{x} p^{x} (1-p)^{n-x}
        \end{align}$$
    - where $p = \frac{N_1}{N}$
    - We say that the random variable follows the distribution $\text{Binomial}(n, N_1, N_0)$

- Proof of Theorem 13.1
    - Imagine you have a box of $N$ items, with $N_1$ 1s and $N_0$ 0s
    - We want the probability of getting exactly $x$ 1s, and hence $n-x$ 0s, in $n$ draws with replacement
    - Each draw is a 1 with probability $\frac{N_1}{N}$ and a 0 with probability $\frac{N_0}{N}$
    - Because the draws are made with replacement, they are independent, so any one specific sequence containing $x$ 1s and $n-x$ 0s has probability $\frac{N_1^x}{N^x} \cdot \frac{N_0^{n-x}}{N^{n-x}} = \frac{N_1^x N_0^{n-x}}{N^n}$
    - Finally, count the number of such sequences, i.e. the ways to arrange the 1s and 0s
        - Think of this as choosing positions for your $x$ 1s out of $n$ spaces
        - So $\binom{n}{x}$
    - Multiplying the number of sequences by the probability of each one gives the binomial formula in Theorem 13.1
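
The counting argument can be checked by brute force on a small box. The sketch below enumerates every ordered sequence of draws and compares the resulting probabilities with the formula in Theorem 13.1. The function names `box_pmf` and `brute_force_pmf` and the box contents (2 ones, 3 zeros, 4 draws) are illustrative assumptions.

```python
from itertools import product
from math import comb
from fractions import Fraction

def box_pmf(x, n, n1, n0):
    """PMF from Theorem 13.1, written in terms of the box counts."""
    return Fraction(comb(n, x) * n1**x * n0**(n - x), (n1 + n0)**n)

def brute_force_pmf(x, n, n1, n0):
    """Enumerate all N^n equally likely ordered draws and count those with x 1s."""
    box = [1] * n1 + [0] * n0  # the tickets in the box
    favourable = sum(1 for draws in product(box, repeat=n) if sum(draws) == x)
    return Fraction(favourable, len(box)**n)

# Small illustrative box: 2 ones and 3 zeros, 4 draws with replacement
n, n1, n0 = 4, 2, 3
for x in range(n + 1):
    assert box_pmf(x, n, n1, n0) == brute_force_pmf(x, n, n1, n0)

pmf_values = [box_pmf(x, n, n1, n0) for x in range(n + 1)]
print(pmf_values)
print(sum(pmf_values))  # the probabilities over x = 0..n sum to 1
```

Exact fractions are used so that the comparison is not affected by floating-point rounding. Enumeration grows as $N^n$, so it is only practical for very small boxes.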

### Visualising distribution

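- Holding the number of draws constant, what happens when we increase the proportion of 1s in the box?
    - The probability mass shifts towards larger counts of 1s
    - At $p = 0.5$ the distribution is symmetric; at $p = 0.2$ and $p = 0.8$ it is skewed, and the two are mirror images of each other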

```python
from scipy import stats

# PMF of the binomial with n = 10 draws at x = 0, ..., 9, for three proportions of 1s
# (range(10) stops at 9, so x = 10 is not shown)
display(stats.binom.pmf(range(10), 10, 0.2))
display(stats.binom.pmf(range(10), 10, 0.5))
display(stats.binom.pmf(range(10), 10, 0.8))
```

```
array([1.07374182e-01, 2.68435456e-01, 3.01989888e-01, 2.01326592e-01,
       8.80803840e-02, 2.64241152e-02, 5.50502400e-03, 7.86432000e-04,
       7.37280000e-05, 4.09600000e-06])

array([0.00097656, 0.00976563, 0.04394531, 0.1171875 , 0.20507812,
       0.24609375, 0.20507812, 0.1171875 , 0.04394531, 0.00976563])

array([1.02400000e-07, 4.09600000e-06, 7.37280000e-05, 7.86432000e-04,
       5.50502400e-03, 2.64241152e-02, 8.80803840e-02, 2.01326592e-01,
       3.01989888e-01, 2.68435456e-01])
```
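
One way to summarise the three arrays is to locate where each PMF peaks and to compare the $p = 0.2$ and $p = 0.8$ cases directly. The following is a rough sketch using only the standard library; `binom_pmf` is a hypothetical helper that implements the $p$-form of Theorem 13.1, and unlike the cell above it covers the full support $0, \dots, 10$.

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial PMF in the p-form of Theorem 13.1 (hypothetical helper)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n = 10
peaks = {}
for p in (0.2, 0.5, 0.8):
    pmf = [binom_pmf(x, n, p) for x in range(n + 1)]  # full support 0..n
    peaks[p] = max(range(n + 1), key=lambda x: pmf[x])  # x with the largest probability
print(peaks)

# Mirror relation: P(x ones) with proportion p equals P(n - x ones) with proportion 1 - p
mirror_ok = all(
    abs(binom_pmf(x, n, 0.2) - binom_pmf(n - x, n, 0.8)) < 1e-12
    for x in range(n + 1)
)
print(mirror_ok)
```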

- Does it matter if you increase the total number of units in the box while keeping the proportion constant?
    - No. The PMF depends on the box only through the proportion $p = \frac{N_1}{N}$, which is why the scipy implementation takes only `n` and `p` and never asks for the total counts
    - Intuitively, each unit is replaced after it is drawn, so every draw sees the same box; it doesn't matter if you have 1 out of 2, or 100 out of 200
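
The claim about box size can also be checked numerically with the count-based form of Theorem 13.1. This is a small sketch; the helper name `box_pmf` and the choice of $n = 10$ draws are assumptions made for illustration.

```python
from math import comb
from fractions import Fraction

def box_pmf(x, n, n1, n0):
    """PMF from Theorem 13.1, written in terms of the box counts."""
    return Fraction(comb(n, x) * n1**x * n0**(n - x), (n1 + n0)**n)

n = 10
small_box = [box_pmf(x, n, 1, 1) for x in range(n + 1)]      # 1 one out of 2 units
large_box = [box_pmf(x, n, 100, 100) for x in range(n + 1)]  # 100 ones out of 200 units

# Both boxes have p = 1/2, so the two PMFs should agree exactly
print(small_box == large_box)
```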

### Solving motivating question

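- Solving 18 dice with exactly 3 6s
    - $$\begin{align}
        f(3) &= \binom{18}{3} \frac{1^{3}}{6^{3}} \frac{5^{15}}{6^{15}} \\
        &\approx 0.245
        \end{align}$$
    - Event C asks for at least 3 6s, so we subtract the probabilities of getting 0, 1, or 2 6s from 1
    - $$\begin{align}
        P(\text{at least three 6s in 18 rolls}) &= 1 - f(0) - f(1) - f(2) \\
        &\approx 0.597
        \end{align}$$
    - The same calculation for B gives approximately 0.619, so A (approximately 0.665) is the most likely of the three events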

```python
from scipy import stats

# P(exactly 3 sixes in 18 rolls of a fair die)
stats.binom.pmf(3, 18, 1/6)
```

```
0.24519844796019247
```
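
Pepys's question concerns "at least" events, so each probability is one minus the sum of the PMF over the smaller counts. The sketch below computes all three using only the standard library; `binom_pmf` and `prob_at_least` are hypothetical helper names. With scipy, `stats.binom.sf(k - 1, n, p)` should give the same quantity.

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial PMF in the p-form of Theorem 13.1 (hypothetical helper)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def prob_at_least(k, n, p):
    """P(at least k successes in n trials) via the complement rule."""
    return 1 - sum(binom_pmf(x, n, p) for x in range(k))

p_six = 1 / 6
# event name -> (minimum number of 6s, number of dice)
events = {"A": (1, 6), "B": (2, 12), "C": (3, 18)}
probs = {name: prob_at_least(k, n, p_six) for name, (k, n) in events.items()}

for name, value in probs.items():
    print(name, round(value, 4))

print(max(probs, key=probs.get))  # the most likely event
```

The probabilities decrease from A to C, so A is the most likely event.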