# Lecture 5: Probability Theory

```{note}
This lecture builds upon the previous lecture on probability theory, introducing different types of distributions.
```

---

## Binomial Distribution

Consider a box of $n$ tickets in which a proportion $p$ are labeled 1 and the remaining proportion $1-p$ are labeled 0. Drawing $k$ tickets with replacement yields a sample sum of labels $S$, which equals the number of 1s drawn. The probability that $S$ takes a specific value $s \in \{0, 1, \dots, k\}$ is therefore the probability of drawing exactly $s$ tickets labeled 1 in $k$ draws. Since each draw is an identical, independent, random (IIR) event, any particular sequence containing $s$ ones and $k-s$ zeros has probability $p^s (1-p)^{k-s}$, and there are $\binom{k}{s}$ such sequences, so

$$
P(S = s) = \binom{k}{s}\, p^s (1-p)^{k-s}.
$$

Note that the code in this section follows the `scipy.stats.binom` convention instead: `n` is the number of draws (trials) and `k` is the number of 1s (successes).
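
The ticket-box description can be checked numerically. The following is a minimal sketch, assuming the prose notation ($k$ draws, sample sum $s$, proportion $p$); the function names `binomial_pmf` and `simulate_sample_sums` are illustrative and not part of the lecture code. It evaluates the formula directly and compares it with the relative frequencies from repeated simulated draws.

```python
from math import comb
import random


def binomial_pmf(s, k, p):
    """Probability that k draws with replacement give a sample sum of s."""
    return comb(k, s) * p**s * (1 - p) ** (k - s)


def simulate_sample_sums(k, p, trials, seed=0):
    """Relative frequency of each sample sum 0..k over repeated experiments."""
    rng = random.Random(seed)
    counts = [0] * (k + 1)
    for _ in range(trials):
        # Each draw yields a ticket labeled 1 with probability p, else 0.
        s = sum(1 for _ in range(k) if rng.random() < p)
        counts[s] += 1
    return [c / trials for c in counts]


if __name__ == "__main__":
    k, p = 4, 0.5
    empirical = simulate_sample_sums(k, p, trials=20_000)
    for s in range(k + 1):
        print(f"s={s}: formula={binomial_pmf(s, k, p):.4f}, simulated={empirical[s]:.4f}")
```

With enough repetitions the simulated frequencies should settle close to the formula values, and the formula values sum to 1 over $s = 0, \dots, k$.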

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binom
import ipywidgets as widgets
from IPython.display import display


def plot_binomial(n, p):
    """Plot the Binomial(n, p) PMF with its summary statistics alongside."""
    k = np.arange(0, n + 1)
    probs = binom.pmf(k, n, p)

    # Theoretical statistics
    mean = n * p
    median = binom.ppf(0.5, n, p)  # smallest k whose CDF reaches 0.5
    mode = min(np.floor((n + 1) * p), n)  # clamp so that p = 1 gives n, not n + 1
    var = n * p * (1 - p)
    std_dev = np.sqrt(var)
    q1 = binom.ppf(0.25, n, p)
    q3 = binom.ppf(0.75, n, p)
    iqr = q3 - q1
    value_range = n  # Range = max - min = n - 0

    # Shape statistics are undefined when p is 0 or 1 (zero variance)
    if var > 0:
        skewness = (1 - 2 * p) / std_dev
        kurtosis = (1 - 6 * p * (1 - p)) / var  # excess kurtosis
    else:
        skewness = kurtosis = float('nan')

    # Create plot
    fig, ax = plt.subplots(figsize=(12, 5))
    ax.bar(k, probs, color='skyblue', edgecolor='black')
    ax.set_title(f'Binomial PMF (n = {n}, p = {p})')
    ax.set_xlabel('Number of Successes (k)')
    ax.set_ylabel('Probability')
    ax.grid(axis='y', linestyle='--', alpha=0.7)

    # Text boxes for each category of statistics
    location_stats = (
        f"Measures of Location\n"
        f"- Mean: {mean:.2f}\n"
        f"- Median: {median:.2f}\n"
        f"- Mode: {mode:.0f}"
    )

    dispersion_stats = (
        f"Measures of Dispersion\n"
        f"- Range: {value_range}\n"
        f"- Q1: {q1:.2f}\n"
        f"- Q3: {q3:.2f}\n"
        f"- IQR: {iqr:.2f}\n"
        f"- Std Dev: {std_dev:.2f}"
    )

    shape_stats = (
        f"Measures of Shape\n"
        f"- Skewness: {skewness:.2f}\n"
        f"- Excess Kurtosis: {kurtosis:.2f}"
    )

    # Add each stats box in a different location, to the right of the axes
    ax.text(1.02, 0.95, location_stats, transform=ax.transAxes,
            fontsize=10, verticalalignment='top',
            bbox=dict(boxstyle='round,pad=0.4', facecolor='whitesmoke'))

    ax.text(1.02, 0.665, dispersion_stats, transform=ax.transAxes,
            fontsize=10, verticalalignment='top',
            bbox=dict(boxstyle='round,pad=0.4', facecolor='honeydew'))

    ax.text(1.02, 0.30, shape_stats, transform=ax.transAxes,
            fontsize=10, verticalalignment='top',
            bbox=dict(boxstyle='round,pad=0.4', facecolor='lavender'))

    plt.tight_layout()
    plt.show()


# Interactive widgets
n_slider = widgets.IntSlider(value=50, min=1, max=100, step=1, description='n (trials):')
p_slider = widgets.FloatSlider(value=0.5, min=0.0, max=1.0, step=0.01, description='p (success):')

ui = widgets.VBox([n_slider, p_slider])
out = widgets.interactive_output(plot_binomial, {'n': n_slider, 'p': p_slider})

# Display the widget
display(ui, out)
```

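
The closed-form expressions for the mean, variance, skewness, and kurtosis used in `plot_binomial` can be sanity-checked by summing over the PMF directly. The sketch below does this with the standard library only. The helper names (`moments_from_pmf`, `closed_form_moments`) are illustrative and not part of the lecture code, and the kurtosis reported is the excess kurtosis (kurtosis minus 3). It assumes $0 < p < 1$, since the shape statistics are undefined when the variance is zero.

```python
from math import comb, sqrt


def pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def moments_from_pmf(n, p):
    """Mean, variance, skewness, excess kurtosis by direct summation."""
    probs = [pmf(k, n, p) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(probs))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))
    sd = sqrt(var)
    skew = sum(((k - mean) / sd) ** 3 * pk for k, pk in enumerate(probs))
    excess_kurt = sum(((k - mean) / sd) ** 4 * pk for k, pk in enumerate(probs)) - 3
    return mean, var, skew, excess_kurt


def closed_form_moments(n, p):
    """The same four quantities from the closed-form binomial formulas."""
    q = 1 - p
    var = n * p * q
    return n * p, var, (1 - 2 * p) / sqrt(var), (1 - 6 * p * q) / var


if __name__ == "__main__":
    for name, direct, closed in zip(
        ("mean", "variance", "skewness", "excess kurtosis"),
        moments_from_pmf(50, 0.3),
        closed_form_moments(50, 0.3),
    ):
        print(f"{name}: summed={direct:.6f}, closed form={closed:.6f}")
```

The two columns should agree up to floating-point rounding for any `n` and any `p` strictly between 0 and 1.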

```{attention}
## The Monty Hall Problem

The Monty Hall Problem is a classic example of how human intuition often goes against probabilistic reasoning.

**Scenario:** You're on a game show with 3 doors. Behind one door is a car (the prize); behind the other two are goats. You pick one door (say, Door #1). The host, Monty Hall, who knows what's behind every door, then opens another door (say, Door #3) to reveal a goat. Monty then gives you a choice: stick with your original pick (Door #1), or switch to the remaining unopened door (Door #2).

**Question:** Should you switch?

**Solution:** Most people think the chances are now 50-50 between the two remaining doors, but this intuition is wrong. When you initially picked a door, your chance of picking the car was 1/3, while the chance of picking a goat was 2/3. Monty always opens a door with a goat, and he never opens the door you picked, so his choice gives you information. If your original pick was wrong (2/3 chance), the only other unopened door hides the car, and switching wins. If your original pick was right (1/3 chance), switching loses. The probability of winning is therefore 2/3 if you switch and 1/3 if you stick. The table below lists the possible outcomes for an initial pick of Door 1.

| Initial Pick | Car Location | Monty Opens | Switch Wins? |
|--------------|--------------|-------------|--------------|
| Door 1       | Door 1       | Door 2 or 3 | ❌            |
| Door 1       | Door 2       | Door 3      | ✅            |
| Door 1       | Door 3       | Door 2      | ✅            |

The car is equally likely to be behind any of the 3 doors, and switching wins in 2 of those 3 cases. By symmetry, the same holds for any initial pick.
```
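
The 2/3 versus 1/3 result can also be checked by simulation. This is a minimal sketch under the rules stated above (Monty always opens a goat door that is not the contestant's pick, choosing at random when he has two options); the function names `play_monty_hall` and `win_rate` are illustrative and not part of the lecture code.

```python
import random


def play_monty_hall(switch, rng):
    """Play one game; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Monty opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the single door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Fraction of games won under a fixed stick-or-switch strategy."""
    rng = random.Random(seed)
    wins = sum(play_monty_hall(switch, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print(f"Switch: {win_rate(True):.3f}")
    print(f"Stick:  {win_rate(False):.3f}")
```

Over many games the switching strategy should win close to two thirds of the time and the sticking strategy close to one third.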