# Module 02: Probability Fundamentals

**Difficulty**: ⭐⭐ Intermediate

**Estimated Time**: 90 minutes

**Prerequisites**: 
- Module 00: Setup and Introduction
- Module 01: Descriptive Statistics
- Basic understanding of fractions and percentages

## Learning Objectives

By the end of this notebook, you will be able to:
1. Understand and calculate basic probabilities using sample spaces and events
2. Apply fundamental probability rules (addition rule, multiplication rule)
3. Calculate conditional probabilities and understand independence
4. Apply Bayes' Theorem to solve real-world problems
5. Work with common probability distributions (binomial, normal, Poisson)
6. Calculate expected values and variance for random variables

In [None]:
# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from itertools import product, combinations

# Configure visualization
%matplotlib inline
plt.style.use('seaborn-v0_8-whitegrid')
sns.set_palette("husl")

# Set random seed for reproducibility
np.random.seed(42)

# Display options
np.set_printoptions(precision=4, suppress=True)
pd.set_option('display.precision', 4)

print("Setup complete!")

## 1. Basic Probability Concepts

Probability quantifies how likely an event is to occur, on a scale from 0 (impossible) to 1 (certain). It is fundamental to statistics, machine learning, and data science.

### Key Definitions:

**Sample Space (S)**: The set of all possible outcomes
- Example: Rolling a die → S = {1, 2, 3, 4, 5, 6}

**Event (E)**: A subset of the sample space
- Example: Rolling an even number → E = {2, 4, 6}

**Probability** (when all outcomes in S are equally likely):
$$P(E) = \frac{\text{Number of favorable outcomes}}{\text{Total number of possible outcomes}}$$

**Properties**:
- $0 \leq P(E) \leq 1$ (probability is always between 0 and 1)
- $P(S) = 1$ (something must happen)
- $P(\emptyset) = 0$ (impossible event has probability 0)
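
As a minimal sketch of these definitions, a sample space can be enumerated explicitly and an event treated as a subset of it. The two-dice setup and the names `two_dice_space` and `probability` are illustrative assumptions rather than part of this notebook's own examples; `Fraction` is used only to keep the results exact.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered outcomes of rolling two fair dice (36 outcomes)
two_dice_space = set(product(range(1, 7), repeat=2))

def probability(event, space):
    """P(E) for equally likely outcomes: |E| / |S|."""
    return Fraction(len(event & space), len(space))

# Event: the two dice sum to 7
sum_is_7 = {outcome for outcome in two_dice_space if sum(outcome) == 7}

p_sum_7 = probability(sum_is_7, two_dice_space)
p_certain = probability(two_dice_space, two_dice_space)  # P(S)
p_impossible = probability(set(), two_dice_space)        # P(empty set)

print(f"|S| = {len(two_dice_space)}, |E| = {len(sum_is_7)}")
print(f"P(sum = 7) = {p_sum_7} = {float(p_sum_7):.4f}")
print(f"P(S) = {p_certain}, P(empty set) = {p_impossible}")
```

Counting outcomes this way only works because every ordered pair is equally likely; for a loaded die each outcome would need its own weight.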

In [None]:
# Example: Probability of rolling a die

# Sample space
sample_space = [1, 2, 3, 4, 5, 6]
total_outcomes = len(sample_space)

# Event: rolling an even number
event_even = [2, 4, 6]
favorable_outcomes = len(event_even)

# Calculate probability
prob_even = favorable_outcomes / total_outcomes

print("Sample Space:", sample_space)
print("Event (even number):", event_even)
print(f"\nP(even) = {favorable_outcomes}/{total_outcomes} = {prob_even:.4f}")
print(f"P(even) = {prob_even * 100:.1f}%")

In [None]:
# Simulate rolling a die many times to verify the probability

num_rolls = 10000
rolls = np.random.randint(1, 7, size=num_rolls)

# Count even numbers
even_count = np.sum(rolls % 2 == 0)
simulated_prob = even_count / num_rolls

print(f"Simulated {num_rolls:,} rolls")
print(f"Even numbers: {even_count:,}")
print(f"Simulated P(even) = {simulated_prob:.4f}")
print(f"Theoretical P(even) = {prob_even:.4f}")
print(f"\nDifference: {abs(simulated_prob - prob_even):.4f}")
print("\nAs we roll more times, the simulated probability approaches the theoretical value!")

## 2. Probability Rules

### Addition Rule (OR)
For mutually exclusive events (events that cannot happen simultaneously):
$$P(A \cup B) = P(A) + P(B)$$

For non-mutually exclusive events:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

### Multiplication Rule (AND)
For independent events (one event doesn't affect the other):
$$P(A \cap B) = P(A) \times P(B)$$

### Complement Rule
$$P(A^c) = 1 - P(A)$$
where $A^c$ is "not A"
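
One way to check the addition and complement rules is to build the events as Python sets and count cards directly. This is a sketch under the assumption of a standard 52-card deck drawn uniformly at random; the helper `prob` and the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))  # 52 (rank, suit) pairs

def prob(event):
    """Probability that one uniformly drawn card belongs to `event`."""
    return Fraction(len(event), len(deck))

hearts = {card for card in deck if card[1] == "hearts"}
kings = {card for card in deck if card[0] == "K"}

# Addition rule: subtract the overlap so the king of hearts is not counted twice
p_union_counted = prob(hearts | kings)
p_union_by_rule = prob(hearts) + prob(kings) - prob(hearts & kings)

# Complement rule: "not a heart" is every card in the deck outside `hearts`
p_not_heart = prob(deck - hearts)

print(f"P(heart or king): counted = {p_union_counted}, by rule = {p_union_by_rule}")
print(f"P(not heart) = {p_not_heart}, and 1 - P(heart) = {1 - prob(hearts)}")
```

Set union (`|`), intersection (`&`), and difference (`-`) map directly onto $\cup$, $\cap$, and the complement.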

In [None]:
# Example: Drawing a card from a standard deck (52 cards)

total_cards = 52

# Probability of drawing a heart (13 hearts in deck)
prob_heart = 13 / total_cards

# Probability of drawing a king (4 kings in deck)
prob_king = 4 / total_cards

# Probability of drawing the king of hearts (1 card)
prob_king_and_heart = 1 / total_cards

# Addition rule: P(heart OR king)
# These are NOT mutually exclusive (king of hearts exists)
prob_heart_or_king = prob_heart + prob_king - prob_king_and_heart

print("=== Playing Card Probabilities ===")
print(f"P(Heart) = {prob_heart:.4f}")
print(f"P(King) = {prob_king:.4f}")
print(f"P(King AND Heart) = {prob_king_and_heart:.4f}")
print(f"\nP(Heart OR King) = {prob_heart:.4f} + {prob_king:.4f} - {prob_king_and_heart:.4f}")
print(f"P(Heart OR King) = {prob_heart_or_king:.4f}")

In [None]:
# Example: Multiplication rule for independent events
# Flipping a coin twice

prob_heads = 0.5

# Probability of getting heads on both flips
prob_two_heads = prob_heads * prob_heads

print("=== Two Coin Flips ===")
print(f"P(Heads on flip 1) = {prob_heads:.2f}")
print(f"P(Heads on flip 2) = {prob_heads:.2f}")
print(f"P(Two heads) = {prob_heads:.2f} × {prob_heads:.2f} = {prob_two_heads:.2f}")

# Verify with simulation
num_trials = 100000
flip1 = np.random.choice(['H', 'T'], size=num_trials)
flip2 = np.random.choice(['H', 'T'], size=num_trials)
two_heads = np.sum((flip1 == 'H') & (flip2 == 'H'))
simulated_prob = two_heads / num_trials

print(f"\nSimulated P(Two heads) = {simulated_prob:.4f}")
print(f"Theoretical P(Two heads) = {prob_two_heads:.4f}")

## 3. Conditional Probability and Independence

### Conditional Probability
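The probability of event A given that event B has occurred (defined when $P(B) > 0$):

$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$

Rearranging gives the general multiplication rule, which also holds for dependent events: $P(A \cap B) = P(B) \times P(A|B)$.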

### Independence
Events A and B are independent if:
$$P(A|B) = P(A)$$

or equivalently:
$$P(A \cap B) = P(A) \times P(B)$$

**Key insight**: If knowing B doesn't change the probability of A, then A and B are independent.
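
The independence test can be sketched by counting cards in a single draw. The helpers `cond_prob` and `is_independent` are hypothetical names introduced for illustration, and the example assumes a standard 52-card deck drawn uniformly at random.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))

def prob(event):
    return Fraction(len(event), len(deck))

def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B); only defined when P(B) > 0."""
    return prob(a & b) / prob(b)

def is_independent(a, b):
    """True when P(A and B) equals P(A) * P(B) exactly."""
    return prob(a & b) == prob(a) * prob(b)

kings = {card for card in deck if card[0] == "K"}
hearts = {card for card in deck if card[1] == "hearts"}
face_cards = {card for card in deck if card[0] in {"J", "Q", "K"}}

p_king = prob(kings)                              # 1/13
p_king_given_heart = cond_prob(kings, hearts)     # still 1/13
p_king_given_face = cond_prob(kings, face_cards)  # rises to 1/3

print(f"P(king) = {p_king}")
print(f"P(king | heart) = {p_king_given_heart}, independent: {is_independent(kings, hearts)}")
print(f"P(king | face card) = {p_king_given_face}, independent: {is_independent(kings, face_cards)}")
```

Learning the suit says nothing about the rank, so "king" and "heart" are independent; learning that the card is a face card narrows the candidates to 12, so "king" and "face card" are dependent.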

In [None]:
# Example: Drawing cards without replacement (dependent events)

# Probability of drawing 2 aces in a row (without replacement)
prob_first_ace = 4 / 52
prob_second_ace_given_first = 3 / 51  # Only 3 aces left in 51 cards

prob_two_aces = prob_first_ace * prob_second_ace_given_first

print("=== Drawing Without Replacement ===")
print(f"P(First card is Ace) = {prob_first_ace:.4f}")
print(f"P(Second card is Ace | First was Ace) = {prob_second_ace_given_first:.4f}")
print(f"P(Two Aces) = {prob_first_ace:.4f} × {prob_second_ace_given_first:.4f}")
print(f"P(Two Aces) = {prob_two_aces:.6f}")
print(f"\nThese events are DEPENDENT: knowing the first card affects the second.")

In [None]:
# Real-world example: Medical testing
# A disease affects 1% of the population
# The test is 95% accurate for both groups: it flags 95% of people with the
# disease (sensitivity) and clears 95% of people without it (specificity)

prob_disease = 0.01  # P(D)
prob_no_disease = 0.99  # P(D^c)

# Test accuracy
prob_positive_given_disease = 0.95  # P(+ | D) - True positive rate
prob_negative_given_no_disease = 0.95  # P(- | D^c) - True negative rate
prob_positive_given_no_disease = 0.05  # P(+ | D^c) - False positive rate

# What's P(+)? The total probability of testing positive
# P(+) = P(+ | D) × P(D) + P(+ | D^c) × P(D^c)
prob_positive = (prob_positive_given_disease * prob_disease + 
                prob_positive_given_no_disease * prob_no_disease)

print("=== Medical Test Example ===")
print(f"P(Disease) = {prob_disease * 100:.1f}%")
print(f"P(Test positive | Have disease) = {prob_positive_given_disease * 100:.1f}%")
print(f"P(Test positive | No disease) = {prob_positive_given_no_disease * 100:.1f}%")
print(f"\nP(Test positive) = {prob_positive:.4f} or {prob_positive * 100:.2f}%")

## 4. Bayes' Theorem

Bayes' Theorem allows us to reverse conditional probabilities:

$$P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}$$

**In words**: 
- $P(A|B)$ = **Posterior probability**: What we want to find
- $P(B|A)$ = **Likelihood**: Probability of evidence given hypothesis
- $P(A)$ = **Prior probability**: What we knew before seeing evidence
- $P(B)$ = **Marginal probability**: Total probability of evidence

**Applications**: Medical diagnosis, spam filtering, machine learning, A/B testing
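
Bayes' Theorem is often easier to follow as a head count. The sketch below reuses the medical-test figures from Section 3 (1% prevalence, 95% true positive rate, 5% false positive rate) and applies them to a hypothetical cohort of 10,000 people; the cohort size and variable names are assumptions chosen for illustration.

```python
from fractions import Fraction

cohort = 10_000                          # hypothetical population size
prevalence = Fraction(1, 100)            # P(D)
sensitivity = Fraction(95, 100)          # P(+ | D)
false_positive_rate = Fraction(5, 100)   # P(+ | D^c)

# Split the cohort by disease status, then by test result
diseased = cohort * prevalence
healthy = cohort - diseased
true_positives = diseased * sensitivity
false_positives = healthy * false_positive_rate

# Among everyone who tests positive, what share actually has the disease?
posterior_by_count = true_positives / (true_positives + false_positives)

# The same quantity from the formula P(D | +) = P(+ | D) P(D) / P(+)
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
posterior_by_formula = sensitivity * prevalence / p_positive

print(f"Diseased: {diseased}, healthy: {healthy}")
print(f"True positives: {true_positives}, false positives: {false_positives}")
print(f"P(D | +) by counting = {float(posterior_by_count):.4f}")
print(f"P(D | +) by formula  = {float(posterior_by_formula):.4f}")
```

The false positives outnumber the true positives because the healthy group is 99 times larger than the diseased group, which is the base-rate effect in concrete terms.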

In [None]:
# Continuing the medical test example
# If someone tests positive, what's the probability they actually have the disease?
# We need P(D | +) using Bayes' Theorem

# P(D | +) = P(+ | D) × P(D) / P(+)
prob_disease_given_positive = (prob_positive_given_disease * prob_disease / 
                               prob_positive)

print("=== Bayes' Theorem: Medical Test ===")
print(f"Given: Person tests POSITIVE")
print(f"\nCalculation:")
print(f"P(Disease | Positive) = P(+ | Disease) × P(Disease) / P(Positive)")
print(f"P(Disease | Positive) = {prob_positive_given_disease:.2f} × {prob_disease:.2f} / {prob_positive:.4f}")
print(f"P(Disease | Positive) = {prob_disease_given_positive:.4f}")
print(f"\n** KEY INSIGHT **")
print(f"Even with a positive test result, the probability of having the disease")
print(f"is only {prob_disease_given_positive * 100:.1f}% because the disease is rare!")
print(f"\nThis counterintuitive result is due to the LOW BASE RATE (1% prevalence).")

In [None]:
# Visualization: Bayes' Theorem with different base rates

base_rates = np.linspace(0.001, 0.5, 100)
test_accuracy = 0.95

posteriors = []
for base_rate in base_rates:
    # P(+) = P(+ | D) × P(D) + P(+ | D^c) × P(D^c)
    prob_pos = test_accuracy * base_rate + (1 - test_accuracy) * (1 - base_rate)
    # P(D | +) = P(+ | D) × P(D) / P(+)
    posterior = (test_accuracy * base_rate) / prob_pos
    posteriors.append(posterior)

plt.figure(figsize=(12, 6))
plt.plot(base_rates * 100, np.array(posteriors) * 100, linewidth=2.5, color='darkblue')
plt.xlabel('Disease Prevalence (Base Rate) %', fontsize=12)
plt.ylabel('P(Disease | Positive Test) %', fontsize=12)
plt.title('How Base Rate Affects Positive Predictive Value\n(Test Accuracy = 95%)', 
          fontsize=14, fontweight='bold')
plt.grid(True, alpha=0.3)
plt.axhline(y=50, color='red', linestyle='--', alpha=0.5, label='50% threshold')
plt.axvline(x=1, color='green', linestyle='--', alpha=0.5, label='1% prevalence')
plt.legend(fontsize=11)
plt.tight_layout()
plt.show()

print("Notice: with a 95% accurate test, a positive result is more likely to be")
print("false than true whenever the prevalence is below 5%.")

## 5. Common Probability Distributions

### Binomial Distribution
Models the number of successes in n independent trials, each with the same success probability p.

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$

**Example**: Number of heads in 10 coin flips
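
The formula can be evaluated directly with the standard library, which makes each factor visible. This is a minimal sketch; `binomial_pmf` is a hypothetical helper name, not a library function.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed term by term from the formula."""
    ways = comb(n, k)                 # number of ways to place k successes in n trials
    return ways * p**k * (1 - p)**(n - k)

n_flips, p_fair = 10, 0.5
pmf_values = [binomial_pmf(k, n_flips, p_fair) for k in range(n_flips + 1)]
pmf_total = sum(pmf_values)
pmf_mean = sum(k * pk for k, pk in enumerate(pmf_values))

print(f"P(X = 7) = {pmf_values[7]:.4f}")
print(f"Sum over k = 0..{n_flips}: {pmf_total:.4f}")
print(f"Mean from the PMF: {pmf_mean:.4f} (n * p = {n_flips * p_fair})")
```

The probabilities over all possible k sum to 1, and the mean recovered from the PMF matches $n \times p$.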

### Normal Distribution (Gaussian)
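A continuous, symmetric, bell-shaped distribution. It is central to statistics because sums and averages of many independent quantities tend to be approximately normal (the Central Limit Theorem, covered in Module 03).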

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

Parameters: $\mu$ (mean), $\sigma$ (standard deviation)
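
Probabilities for a normal variable come from the area under this curve, which can be computed from the error function in Python's standard library. The sketch below checks the familiar 68-95-99.7 rule; `normal_cdf` is a hypothetical helper name.

```python
from math import erf, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ Normal(mu, sigma), expressed through the error function."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Probability of landing within 1, 2, and 3 standard deviations of the mean
coverage = {}
for num_sd in (1, 2, 3):
    coverage[num_sd] = normal_cdf(num_sd) - normal_cdf(-num_sd)
    print(f"P(|X - mu| <= {num_sd} sigma) = {coverage[num_sd]:.4f}")
```

Because the result depends only on the number of standard deviations from the mean, the same three figures apply to any choice of $\mu$ and $\sigma$.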

### Poisson Distribution
Models the number of events occurring in a fixed interval of time or space, when events occur independently at a constant average rate $\lambda$.

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$

**Example**: Number of customers arriving per hour
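
The Poisson formula can also be evaluated by hand. The sketch below additionally compares it with a binomial that has many trials and a small success probability, since the Poisson distribution arises as the limit of that case; the rate of 3 and the choice of 1,000 trials are illustrative assumptions, and both helper names are hypothetical.

```python
from math import comb, exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed directly from the formula."""
    return lam**k * exp(-lam) / factorial(k)

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

rate = 3          # average number of events per interval
n_slots = 1000    # split the interval into many small slots

p_five_poisson = poisson_pmf(5, rate)
# Each slot holds an event with probability rate / n_slots, so n * p = rate
p_five_binomial = binomial_pmf(5, n_slots, rate / n_slots)

# The mean of a Poisson distribution equals its rate (tail beyond k = 59 is negligible)
poisson_mean = sum(k * poisson_pmf(k, rate) for k in range(60))

print(f"Poisson  P(X = 5) = {p_five_poisson:.4f}")
print(f"Binomial P(X = 5) = {p_five_binomial:.4f}")
print(f"Mean from the PMF = {poisson_mean:.4f}")
```

The two probabilities agree closely, which is why the Poisson distribution is a common model for counts of rare events over many opportunities.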

In [None]:
# Binomial Distribution Example
# Flipping a fair coin 10 times, what's the probability of getting exactly 7 heads?

n = 10  # Number of trials
p = 0.5  # Probability of success (heads)
k = 7  # Number of successes we want

# Calculate using scipy
prob_7_heads = stats.binom.pmf(k, n, p)

print("=== Binomial Distribution ===")
print(f"Scenario: Flip a coin {n} times")
print(f"P(Exactly {k} heads) = {prob_7_heads:.4f}")
print(f"P(Exactly {k} heads) = {prob_7_heads * 100:.2f}%")

# Simulate to verify
num_experiments = 100000
simulated_flips = np.random.binomial(n, p, size=num_experiments)
simulated_prob = np.sum(simulated_flips == k) / num_experiments

print(f"\nSimulated probability: {simulated_prob:.4f}")
print(f"Difference: {abs(prob_7_heads - simulated_prob):.4f}")

In [None]:
# Visualize the binomial distribution

k_values = np.arange(0, n + 1)
probabilities = stats.binom.pmf(k_values, n, p)

plt.figure(figsize=(12, 6))
plt.bar(k_values, probabilities, edgecolor='black', alpha=0.7)
plt.xlabel('Number of Heads', fontsize=12)
plt.ylabel('Probability', fontsize=12)
plt.title(f'Binomial Distribution (n={n}, p={p})', fontsize=14, fontweight='bold')
plt.xticks(k_values)
plt.grid(True, alpha=0.3, axis='y')

# Highlight the specific value
plt.bar(k, probabilities[k], color='red', edgecolor='black', alpha=0.7, 
        label=f'P(X={k}) = {prob_7_heads:.4f}')
plt.legend(fontsize=11)
plt.tight_layout()
plt.show()

print(f"The distribution is symmetric because p = 0.5")
print(f"Most likely outcome: {k_values[np.argmax(probabilities)]} heads")

In [None]:
# Normal Distribution Example

mu = 100  # Mean IQ score
sigma = 15  # Standard deviation

# What's the probability of having IQ between 90 and 110?
prob_between = stats.norm.cdf(110, mu, sigma) - stats.norm.cdf(90, mu, sigma)

print("=== Normal Distribution (IQ Scores) ===")
print(f"Mean (μ) = {mu}")
print(f"Standard Deviation (σ) = {sigma}")
print(f"\nP(90 ≤ IQ ≤ 110) = {prob_between:.4f}")
print(f"P(90 ≤ IQ ≤ 110) = {prob_between * 100:.2f}%")

# Generate and visualize
x = np.linspace(mu - 4*sigma, mu + 4*sigma, 1000)
y = stats.norm.pdf(x, mu, sigma)

plt.figure(figsize=(12, 6))
plt.plot(x, y, linewidth=2.5, label=f'Normal(μ={mu}, σ={sigma})')

# Shade the area between 90 and 110
x_fill = x[(x >= 90) & (x <= 110)]
y_fill = stats.norm.pdf(x_fill, mu, sigma)
plt.fill_between(x_fill, y_fill, alpha=0.3, label=f'P(90 ≤ X ≤ 110) = {prob_between:.2%}')

plt.xlabel('IQ Score', fontsize=12)
plt.ylabel('Probability Density', fontsize=12)
plt.title('Normal Distribution of IQ Scores', fontsize=14, fontweight='bold')
plt.axvline(mu, color='red', linestyle='--', linewidth=1.5, alpha=0.7, label=f'Mean = {mu}')
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

In [None]:
# Poisson Distribution Example
# Average 3 customers arrive per hour at a store
# What's the probability of exactly 5 customers in the next hour?

lambda_rate = 3  # Average rate
k_customers = 5  # Number of arrivals

prob_5_customers = stats.poisson.pmf(k_customers, lambda_rate)

print("=== Poisson Distribution (Customer Arrivals) ===")
print(f"Average rate (λ) = {lambda_rate} customers/hour")
print(f"P(Exactly {k_customers} customers) = {prob_5_customers:.4f}")
print(f"P(Exactly {k_customers} customers) = {prob_5_customers * 100:.2f}%")

# Visualize the distribution
k_range = np.arange(0, 12)
probs = stats.poisson.pmf(k_range, lambda_rate)

plt.figure(figsize=(12, 6))
plt.bar(k_range, probs, edgecolor='black', alpha=0.7)
plt.bar(k_customers, prob_5_customers, color='red', edgecolor='black', alpha=0.7,
        label=f'P(X={k_customers}) = {prob_5_customers:.4f}')
plt.xlabel('Number of Customers', fontsize=12)
plt.ylabel('Probability', fontsize=12)
plt.title(f'Poisson Distribution (λ={lambda_rate})', fontsize=14, fontweight='bold')
plt.xticks(k_range)
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3, axis='y')
plt.tight_layout()
plt.show()

## 6. Random Variables and Expected Values

### Random Variable
A variable whose value is determined by the outcome of a random process.

### Expected Value (Mean)
The long-run average value of a random variable. For a discrete random variable with possible values $x_i$:

$$E[X] = \sum_{i} x_i \cdot P(X = x_i)$$

### Variance
Measures how far the values of a random variable spread around its expected value:

$$\mathrm{Var}(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2$$
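
Both forms of the variance formula can be checked against each other with exact arithmetic. This is a minimal sketch for a fair six-sided die; the variable names are illustrative.

```python
from fractions import Fraction

faces = range(1, 7)
p_face = Fraction(1, 6)  # each face of a fair die is equally likely

mean = sum(x * p_face for x in faces)                          # E[X]
var_definition = sum((x - mean) ** 2 * p_face for x in faces)  # E[(X - E[X])^2]
mean_of_squares = sum(x**2 * p_face for x in faces)            # E[X^2]
var_shortcut = mean_of_squares - mean**2                       # E[X^2] - (E[X])^2

print(f"E[X] = {mean} = {float(mean):.4f}")
print(f"Var(X) from the definition = {var_definition}")
print(f"Var(X) from the shortcut   = {var_shortcut} = {float(var_shortcut):.4f}")
```

The shortcut form needs only two running sums, which makes it convenient for hand calculation; the definition form shows more directly that variance is an average squared distance from the mean.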

In [None]:
# Example: Expected value of a die roll

outcomes = np.array([1, 2, 3, 4, 5, 6])
probabilities = np.array([1/6, 1/6, 1/6, 1/6, 1/6, 1/6])

# Expected value
expected_value = np.sum(outcomes * probabilities)

# Variance
variance = np.sum((outcomes - expected_value)**2 * probabilities)
std_dev = np.sqrt(variance)

print("=== Die Roll Statistics ===")
print("Outcomes:", outcomes)
print("Probabilities:", probabilities)
print(f"\nExpected Value E[X] = {expected_value:.4f}")
print(f"Variance Var(X) = {variance:.4f}")
print(f"Standard Deviation σ = {std_dev:.4f}")

# Verify with simulation
num_rolls = 1000000
simulated_rolls = np.random.randint(1, 7, size=num_rolls)
simulated_mean = np.mean(simulated_rolls)
simulated_var = np.var(simulated_rolls)  # population variance (ddof=0), matching Var(X)

print(f"\n=== Simulation ({num_rolls:,} rolls) ===")
print(f"Simulated mean: {simulated_mean:.4f}")
print(f"Simulated variance: {simulated_var:.4f}")
print(f"\nThe simulated values converge to the theoretical values!")

In [None]:
# Real-world example: Expected value in decision making
# Should you take a gamble?

# Scenario: Pay $10 to flip a coin
# Heads: Win $25
# Tails: Win $0

cost = 10
win_heads = 25
win_tails = 0
prob_heads = 0.5
prob_tails = 0.5

# Expected winnings (before cost)
expected_winnings = win_heads * prob_heads + win_tails * prob_tails

# Expected profit (after cost)
expected_profit = expected_winnings - cost

print("=== Gamble Analysis ===")
print(f"Cost to play: ${cost}")
print(f"Win if Heads: ${win_heads} (probability: {prob_heads})")
print(f"Win if Tails: ${win_tails} (probability: {prob_tails})")
print(f"\nExpected winnings: ${expected_winnings:.2f}")
print(f"Expected profit: ${expected_profit:.2f}")

if expected_profit > 0:
    print(f"\n✓ This is a GOOD BET! Expected profit is positive.")
elif expected_profit < 0:
    print(f"\n✗ This is a BAD BET! Expected profit is negative.")
else:
    print(f"\nThis is a FAIR BET. Expected profit is zero.")

print(f"\nIn the long run, you'll make ${expected_profit:.2f} per game on average.")

## 7. Practice Exercises

### Exercise 1: Basic Probability

A bag contains 5 red balls, 3 blue balls, and 2 green balls.

Calculate:
1. P(red)
2. P(not red)
3. P(red or blue)
4. If you draw 2 balls without replacement, what's P(both red)?

In [None]:
# Worked solution (try the exercise yourself before reading on)
red = 5
blue = 3
green = 2
total = red + blue + green

print("=== Exercise 1 Solution ===")
print(f"Bag contents: {red} red, {blue} blue, {green} green (total: {total})")

# 1. P(red)
prob_red = red / total
print(f"\n1. P(red) = {red}/{total} = {prob_red:.4f}")

# 2. P(not red) = 1 - P(red)
prob_not_red = 1 - prob_red
print(f"2. P(not red) = 1 - {prob_red:.4f} = {prob_not_red:.4f}")

# 3. P(red or blue) - mutually exclusive
prob_red_or_blue = (red + blue) / total
print(f"3. P(red or blue) = ({red} + {blue})/{total} = {prob_red_or_blue:.4f}")

# 4. P(both red) without replacement
prob_first_red = red / total
prob_second_red_given_first = (red - 1) / (total - 1)
prob_both_red = prob_first_red * prob_second_red_given_first
print(f"\n4. P(both red):")
print(f"   P(1st red) = {red}/{total} = {prob_first_red:.4f}")
print(f"   P(2nd red | 1st red) = {red-1}/{total-1} = {prob_second_red_given_first:.4f}")
print(f"   P(both red) = {prob_first_red:.4f} × {prob_second_red_given_first:.4f} = {prob_both_red:.4f}")

### Exercise 2: Bayes' Theorem Application

A factory has two machines:
- Machine A produces 60% of items, with 2% defective rate
- Machine B produces 40% of items, with 5% defective rate

If an item is found to be defective, what's the probability it came from Machine A?

In [None]:
# Worked solution (try the exercise yourself before reading on)
prob_A = 0.60  # P(A)
prob_B = 0.40  # P(B)
prob_defect_given_A = 0.02  # P(Defect | A)
prob_defect_given_B = 0.05  # P(Defect | B)

print("=== Exercise 2 Solution ===")
print(f"Machine A: Produces {prob_A*100:.0f}% of items, {prob_defect_given_A*100:.0f}% defective")
print(f"Machine B: Produces {prob_B*100:.0f}% of items, {prob_defect_given_B*100:.0f}% defective")

# Total probability of defect: P(Defect)
prob_defect = (prob_defect_given_A * prob_A + prob_defect_given_B * prob_B)
print(f"\nP(Defect) = P(D|A)×P(A) + P(D|B)×P(B)")
print(f"P(Defect) = {prob_defect_given_A}×{prob_A} + {prob_defect_given_B}×{prob_B}")
print(f"P(Defect) = {prob_defect:.4f}")

# Bayes' Theorem: P(A | Defect)
prob_A_given_defect = (prob_defect_given_A * prob_A) / prob_defect
prob_B_given_defect = (prob_defect_given_B * prob_B) / prob_defect

print(f"\nUsing Bayes' Theorem:")
print(f"P(A | Defect) = P(D|A)×P(A) / P(D)")
print(f"P(A | Defect) = {prob_defect_given_A}×{prob_A} / {prob_defect:.4f}")
print(f"P(A | Defect) = {prob_A_given_defect:.4f} or {prob_A_given_defect*100:.2f}%")
print(f"\nP(B | Defect) = {prob_B_given_defect:.4f} or {prob_B_given_defect*100:.2f}%")

print(f"\nConclusion: Even though Machine A produces more items (60%),")
print(f"a defective item is more likely from Machine B due to its higher defect rate.")

### Exercise 3: Binomial Distribution

A basketball player has a 70% free throw success rate.

If they take 15 free throws:
1. What's the probability of making exactly 10?
2. What's the expected number of successful shots?
3. Create a visualization showing the probability distribution

In [None]:
# Worked solution (try the exercise yourself before reading on)
n_shots = 15
p_success = 0.70
k_target = 10

print("=== Exercise 3 Solution ===")
print(f"Number of shots: {n_shots}")
print(f"Success rate: {p_success*100:.0f}%")

# 1. P(exactly 10 successes)
prob_10_successes = stats.binom.pmf(k_target, n_shots, p_success)
print(f"\n1. P(Exactly {k_target} successful) = {prob_10_successes:.4f} or {prob_10_successes*100:.2f}%")

# 2. Expected value for binomial is n × p
expected_successes = n_shots * p_success
variance = n_shots * p_success * (1 - p_success)
std_dev = np.sqrt(variance)

print(f"\n2. Expected number of successful shots: {expected_successes:.2f}")
print(f"   Variance: {variance:.2f}")
print(f"   Standard deviation: {std_dev:.2f}")

# 3. Visualization
k_values = np.arange(0, n_shots + 1)
probabilities = stats.binom.pmf(k_values, n_shots, p_success)

plt.figure(figsize=(12, 6))
plt.bar(k_values, probabilities, edgecolor='black', alpha=0.7, label='All outcomes')
plt.bar(k_target, prob_10_successes, color='red', edgecolor='black', alpha=0.7,
        label=f'P(X={k_target}) = {prob_10_successes:.4f}')
plt.axvline(expected_successes, color='green', linestyle='--', linewidth=2,
            label=f'Expected = {expected_successes:.1f}')
plt.xlabel('Number of Successful Free Throws', fontsize=12)
plt.ylabel('Probability', fontsize=12)
plt.title(f'Binomial Distribution (n={n_shots}, p={p_success})', fontsize=14, fontweight='bold')
plt.xticks(k_values)
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3, axis='y')
plt.tight_layout()
plt.show()

print(f"\n3. The distribution peaks near the expected value ({expected_successes:.1f})")
print(f"   Most likely outcome: {k_values[np.argmax(probabilities)]} successful shots")

### Exercise 4: Normal Distribution

Heights of adult men follow a normal distribution with mean 175 cm and standard deviation 7 cm.

Calculate:
1. P(height > 180 cm)
2. P(165 cm < height < 185 cm)
3. What height corresponds to the 95th percentile?
4. Visualize the distribution with these regions shaded

In [None]:
# Worked solution (try the exercise yourself before reading on)
mu_height = 175  # Mean
sigma_height = 7  # Standard deviation

print("=== Exercise 4 Solution ===")
print(f"Heights ~ Normal(μ={mu_height} cm, σ={sigma_height} cm)")

# 1. P(height > 180)
prob_above_180 = 1 - stats.norm.cdf(180, mu_height, sigma_height)
print(f"\n1. P(height > 180 cm) = {prob_above_180:.4f} or {prob_above_180*100:.2f}%")

# 2. P(165 < height < 185)
prob_between = stats.norm.cdf(185, mu_height, sigma_height) - stats.norm.cdf(165, mu_height, sigma_height)
print(f"2. P(165 cm < height < 185 cm) = {prob_between:.4f} or {prob_between*100:.2f}%")

# 3. 95th percentile
percentile_95 = stats.norm.ppf(0.95, mu_height, sigma_height)
print(f"3. 95th percentile height: {percentile_95:.2f} cm")
print(f"   (Only 5% of men are taller than {percentile_95:.2f} cm)")

# 4. Visualization
x = np.linspace(mu_height - 4*sigma_height, mu_height + 4*sigma_height, 1000)
y = stats.norm.pdf(x, mu_height, sigma_height)

fig, axes = plt.subplots(1, 3, figsize=(16, 5))

# Plot 1: P(height > 180)
axes[0].plot(x, y, linewidth=2.5, color='darkblue')
x_fill = x[x >= 180]
y_fill = stats.norm.pdf(x_fill, mu_height, sigma_height)
axes[0].fill_between(x_fill, y_fill, alpha=0.3, color='red',
                     label=f'P(X > 180) = {prob_above_180:.2%}')
axes[0].axvline(180, color='red', linestyle='--', alpha=0.7)
axes[0].set_title('Height > 180 cm', fontweight='bold')
axes[0].set_xlabel('Height (cm)')
axes[0].set_ylabel('Probability Density')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Plot 2: P(165 < height < 185)
axes[1].plot(x, y, linewidth=2.5, color='darkblue')
x_fill = x[(x >= 165) & (x <= 185)]
y_fill = stats.norm.pdf(x_fill, mu_height, sigma_height)
axes[1].fill_between(x_fill, y_fill, alpha=0.3, color='green',
                     label=f'P(165 < X < 185) = {prob_between:.2%}')
axes[1].axvline(165, color='green', linestyle='--', alpha=0.7)
axes[1].axvline(185, color='green', linestyle='--', alpha=0.7)
axes[1].set_title('165 cm < Height < 185 cm', fontweight='bold')
axes[1].set_xlabel('Height (cm)')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

# Plot 3: 95th percentile
axes[2].plot(x, y, linewidth=2.5, color='darkblue')
axes[2].axvline(percentile_95, color='purple', linestyle='--', linewidth=2.5,
               label=f'95th percentile = {percentile_95:.1f} cm')
x_fill = x[x >= percentile_95]
y_fill = stats.norm.pdf(x_fill, mu_height, sigma_height)
axes[2].fill_between(x_fill, y_fill, alpha=0.3, color='purple')
axes[2].set_title('95th Percentile', fontweight='bold')
axes[2].set_xlabel('Height (cm)')
axes[2].legend()
axes[2].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## 8. Summary and Key Takeaways

In this module, you learned:

✅ **Basic Probability Concepts**
- Sample spaces, events, and probability calculation
- Probability rules: addition, multiplication, complement
- Properties: $0 \leq P(E) \leq 1$

✅ **Conditional Probability**
- $P(A|B) = P(A \cap B) / P(B)$
- Independence: $P(A|B) = P(A)$
- Dependent vs independent events

✅ **Bayes' Theorem**
- Reversing conditional probabilities
- Applications in medical testing, spam filtering
- Importance of base rates

✅ **Probability Distributions**
- Binomial: Fixed number of trials
- Normal: Continuous, bell-shaped
- Poisson: Events in fixed intervals

✅ **Random Variables**
- Expected value: Long-run average
- Variance: Measure of spread
- Applications in decision making

### What's Next?

In **Module 03: Statistical Inference**, you'll learn:
- Sampling distributions and the Central Limit Theorem
- Confidence intervals for population parameters
- Hypothesis testing and p-values
- A/B testing in practice

### Additional Resources

- [Khan Academy - Probability](https://www.khanacademy.org/math/statistics-probability/probability-library)
- [3Blue1Brown - Bayes' Theorem](https://www.youtube.com/watch?v=HZGCoVF3YvM)
- [Seeing Theory - Probability Visualizations](https://seeing-theory.brown.edu/basic-probability/index.html)
- [Think Stats - Allen Downey](https://greenteapress.com/thinkstats/)

---

**Outstanding work!** You now understand the fundamental concepts of probability that underpin statistical inference and machine learning.

**Next**: Proceed to `03_statistical_inference.ipynb`