# P&S Module 5: Random Variables

**Course:** Probability and Statistics

**Week 5:** Random Variables

**Topics Covered:**
- Random Variable Definition and Types
- Discrete Random Variables (PMF, Mean, Variance)
- Continuous Random Variables (PDF, Mean, Variance)
- Mathematical Expectation
- Probability Distribution Functions

---

## Introduction

In Weeks 3 and 4, we learned about probability and events. Now we'll generalize these concepts using **random variables**.

**What is a Random Variable?**

A random variable is a function that assigns a numerical value to each outcome of a random experiment.

**Real-world examples:**
- Rolling a die → The outcome (1, 2, 3, 4, 5, or 6)
- Flipping three coins → The number of heads (0, 1, 2, or 3)
- Measuring rainfall → Amount in millimeters (any non-negative real number)
- Customer wait time → Time in minutes (continuous)

**Why Random Variables?**

Instead of dealing with abstract outcomes like "Heads" or "King of Spades", we work with numbers. This allows us to:
- Calculate averages (expected values)
- Measure spread (variance)
- Apply mathematical operations
- Build statistical models

---

## Part A: What are Random Variables?

### Definition

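A **random variable** is a function that assigns a numerical value to each outcome of a random experiment. Formally, it maps the sample space $S$ (the set of all possible outcomes) to the real numbers: $X: S \to \mathbb{R}$.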


### Example 1: Rolling a Die

Let X = the outcome when rolling a fair six-sided die.

X can take values: {1, 2, 3, 4, 5, 6}

Each value has probability 1/6.

### Example 2: Number of Heads in 3 Coin Flips

Let Y = number of heads when flipping 3 fair coins.

Y can take values: {0, 1, 2, 3}

- P(Y=0) = 1/8 (TTT)
- P(Y=1) = 3/8 (HTT, THT, TTH)
- P(Y=2) = 3/8 (HHT, HTH, THH)
- P(Y=3) = 1/8 (HHH)
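
The probabilities above can be reproduced by listing all eight outcomes and counting heads. The following is a minimal sketch that assumes fair coins (so every outcome has probability 1/8); the names `outcomes`, `heads_count`, and `pmf_y` are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 = 8 equally likely outcomes of three fair coin flips, e.g. "HTT".
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

# The random variable Y maps each outcome to its number of heads.
heads_count = Counter(outcome.count("H") for outcome in outcomes)

# P(Y = k) = (number of outcomes with k heads) / (total number of outcomes).
pmf_y = {k: Fraction(n, len(outcomes)) for k, n in sorted(heads_count.items())}

for k, p in pmf_y.items():
    print(f"P(Y={k}) = {p}")
```

Using `Fraction` keeps the probabilities exact, so they can be compared directly with the hand-computed values.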

---

## Part B: Types of Random Variables

There are two main types of random variables:

### 1. Discrete Random Variables

**Definition:** A random variable that can take on a countable number of distinct values.

**Characteristics:**
- Values are separate and distinct
- Can list all possible values
- Probabilities sum to 1

**Examples:**
- Number of students in a class (0, 1, 2, ...)
- Number of cars passing through a toll booth per hour
- Number of defective items in a batch
- Outcome of rolling dice

### 2. Continuous Random Variables

**Definition:** A random variable that can take on any value within a continuous range.

**Characteristics:**
- Uncountably many possible values
- Can take any value in an interval
- Probability at a single point is 0
- Use intervals for probability

**Examples:**
- Height of students (e.g., any value between 140 cm and 200 cm)
- Time until next customer arrives
- Temperature measurements
- Rainfall amount

### Key Difference:

| Feature | Discrete | Continuous |
|---------|----------|------------|
| Values | Countable | Uncountable |
| Example | Number of children | Height in cm |
| Probability | P(X = x) | P(a ≤ X ≤ b) |
| Distribution | PMF | PDF |
| Normalization | Probabilities sum to 1 | Density integrates to 1 |

---

## Part C: Discrete Random Variables

### Probability Mass Function (PMF)

For a discrete random variable X, the **Probability Mass Function** gives the probability that X takes a specific value:

$$P(X = x) = p(x)$$

**Properties of PMF:**
1. $0 \leq p(x) \leq 1$ for all x
2. $\sum_{\text{all } x} p(x) = 1$

### Example: Fair Die

For a fair six-sided die:

$$P(X = x) = \frac{1}{6} \text{ for } x \in \{1, 2, 3, 4, 5, 6\}$$
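
Both PMF properties are easy to check mechanically. The sketch below uses a hypothetical helper, `is_valid_pmf`, that takes a dictionary mapping values to probabilities; the helper name, the dictionary layout, and the tolerance are assumptions made for this example.

```python
import math
from fractions import Fraction

def is_valid_pmf(pmf, tol=1e-9):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    probs = list(pmf.values())
    in_range = all(0 <= p <= 1 for p in probs)
    sums_to_one = math.isclose(float(sum(probs)), 1.0, abs_tol=tol)
    return in_range and sums_to_one

fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
not_a_pmf = {1: 0.5, 2: 0.6}  # probabilities add up to 1.1

print("Fair die is a valid PMF:", is_valid_pmf(fair_die))
print("{1: 0.5, 2: 0.6} is a valid PMF:", is_valid_pmf(not_a_pmf))
```

A small tolerance is used for the sum because decimal probabilities such as 0.1 are not stored exactly as floating-point numbers.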

---

In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Example: PMF of a fair die
x_values = [1, 2, 3, 4, 5, 6]
probabilities = [1/6] * 6

# Visualize PMF
plt.figure(figsize=(8, 5))
plt.bar(x_values, probabilities, color='skyblue', edgecolor='black', width=0.6)
plt.xlabel('Outcome (X)', fontsize=12)
plt.ylabel('Probability P(X)', fontsize=12)
plt.title('PMF of a Fair Die', fontsize=14, fontweight='bold')
plt.xticks(x_values)
plt.ylim(0, 0.25)
plt.grid(axis='y', alpha=0.3)

# Add probability values on bars
for x, p in zip(x_values, probabilities):
    plt.text(x, p + 0.01, f'{p:.3f}', ha='center', fontsize=10)

plt.tight_layout()
plt.show()

print("Verification: Sum of probabilities =", sum(probabilities))

### Expected Value (Mean) of Discrete Random Variables

The **expected value** (or mean) is the average value we expect to get if we repeat the experiment many times.

**Formula:**

$$E[X] = \mu = \sum_{\text{all } x} x \cdot P(X = x)$$

**Intuition:** It's a weighted average where each value is weighted by its probability.
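
The "repeat the experiment many times" interpretation can be illustrated with a simulation. This is a rough sketch that assumes a fair die and uses Python's standard `random` module with a fixed seed; the number of rolls is an arbitrary choice.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
n_rolls = 100_000

# Simulate many rolls of a fair six-sided die.
rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
sample_mean = sum(rolls) / n_rolls

# Weighted average from the formula: each face has probability 1/6.
expected_value = sum(x * (1 / 6) for x in range(1, 7))

print(f"Sample mean of {n_rolls} rolls: {sample_mean:.4f}")
print(f"E[X] from the formula:        {expected_value:.4f}")
```

The sample mean should land close to, but usually not exactly on, the theoretical value, and the gap tends to shrink as the number of rolls grows.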

### Variance of Discrete Random Variables

**Variance** measures how spread out the values are from the mean.

**Formula:**

$$\text{Var}(X) = \sigma^2 = \sum_{\text{all } x} (x - \mu)^2 \cdot P(X = x)$$

Or equivalently:

$$\text{Var}(X) = E[X^2] - (E[X])^2$$

**Standard Deviation:**

$$\sigma = \sqrt{\text{Var}(X)}$$
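
To see that the two variance formulas agree, the sketch below evaluates both for a fair die using exact fractions. The variable names are illustrative only.

```python
from fractions import Fraction

values = range(1, 7)   # faces of a fair die
p = Fraction(1, 6)     # probability of each face

# Mean: weighted average of the values.
mu = sum(x * p for x in values)

# Definition: probability-weighted squared deviations from the mean.
var_definition = sum((x - mu) ** 2 * p for x in values)

# Shortcut: E[X^2] - (E[X])^2.
var_shortcut = sum(x**2 * p for x in values) - mu**2

print(f"mean = {mu}")
print(f"variance (definition) = {var_definition}")
print(f"variance (shortcut)   = {var_shortcut}")
```

The shortcut form is often quicker by hand because it avoids subtracting the mean from every value.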

---

### Problem 1: Simple Discrete Random Variable

Let X take the values 0, 1, 2, 3 with probabilities 0.1, 0.3, 0.4, 0.2 respectively.

Find the mean and variance.

**Step-by-step calculation:**

**Mean:**
$$E[X] = 0(0.1) + 1(0.3) + 2(0.4) + 3(0.2) = 0 + 0.3 + 0.8 + 0.6 = 1.7$$

**Variance:**
First, find $E[X^2]$:
$$E[X^2] = 0^2(0.1) + 1^2(0.3) + 2^2(0.4) + 3^2(0.2) = 0 + 0.3 + 1.6 + 1.8 = 3.7$$

Then:
$$\text{Var}(X) = E[X^2] - (E[X])^2 = 3.7 - (1.7)^2 = 3.7 - 2.89 = 0.81$$
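
As a cross-check, the definition formula gives the same result with $\mu = 1.7$:

$$\text{Var}(X) = (0-1.7)^2(0.1) + (1-1.7)^2(0.3) + (2-1.7)^2(0.4) + (3-1.7)^2(0.2) = 0.289 + 0.147 + 0.036 + 0.338 = 0.81$$

The standard deviation is therefore $\sigma = \sqrt{0.81} = 0.9$.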

In [None]:
# Problem 1: Simple discrete random variable

def mean_and_variance(values, probabilities):
    """Calculate mean and variance of a discrete random variable"""

    # Validation
    if len(values) != len(probabilities):
        raise ValueError("Length of values and probabilities must be the same.")

    if not abs(sum(probabilities) - 1.0) < 1e-6:
        raise ValueError(f"Probabilities must sum to 1. Current sum: {sum(probabilities)}")

    # Calculate mean
    mean = sum(x * p for x, p in zip(values, probabilities))

    # Calculate variance
    variance = sum(((x - mean) ** 2) * p for x, p in zip(values, probabilities))

    # Standard deviation
    std_dev = variance ** 0.5

    return mean, variance, std_dev

# Problem 1: X = {0, 1, 2, 3} with probabilities {0.1, 0.3, 0.4, 0.2}
x_values = [0, 1, 2, 3]
p_values = [0.1, 0.3, 0.4, 0.2]

mean, variance, std_dev = mean_and_variance(x_values, p_values)

print("Problem 1: Simple Discrete Random Variable")
print("=" * 50)
print(f"X values: {x_values}")
print(f"Probabilities: {p_values}")
print(f"\nResults:")
print(f"  Mean (Expected Value): {mean:.4f}")
print(f"  Variance: {variance:.4f}")
print(f"  Standard Deviation: {std_dev:.4f}")

# Visualize
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

# PMF
ax1.bar(x_values, p_values, color='lightcoral', edgecolor='black', width=0.5)
ax1.axvline(mean, color='red', linestyle='--', linewidth=2, label=f'Mean = {mean:.2f}')
ax1.set_xlabel('X')
ax1.set_ylabel('P(X)')
ax1.set_title('Probability Mass Function')
ax1.legend()
ax1.grid(axis='y', alpha=0.3)

# Show spread around mean
deviations = [(x - mean)**2 * p for x, p in zip(x_values, p_values)]
ax2.bar(x_values, deviations, color='lightblue', edgecolor='black', width=0.5)
ax2.set_xlabel('X')
ax2.set_ylabel(r'$(x - \mu)^2 \cdot P(X)$')  # raw string so \m and \c are not read as escape sequences
ax2.set_title('Contribution to Variance')
ax2.grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

### 📝 TO DO #1: Calculate Mean and Variance

Try these different scenarios by modifying the values and probabilities:

In [None]:
# TO DO #1: Try different discrete random variables

# Scenario 1: Uniform distribution (all probabilities equal)
print("Scenario 1: Uniform Distribution")
print("-" * 50)
x_uniform = [1, 2, 3, 4, 5]
p_uniform = [0.2, 0.2, 0.2, 0.2, 0.2]
mean1, var1, std1 = mean_and_variance(x_uniform, p_uniform)
print(f"Mean: {mean1:.4f}, Variance: {var1:.4f}\n")

# Scenario 2: Skewed distribution (high probability on one side)
print("Scenario 2: Right-Skewed Distribution")
print("-" * 50)
x_skewed = [1, 2, 3, 4, 5]
p_skewed = [0.5, 0.25, 0.15, 0.07, 0.03]
mean2, var2, std2 = mean_and_variance(x_skewed, p_skewed)
print(f"Mean: {mean2:.4f}, Variance: {var2:.4f}\n")

# TO DO: Create your own scenario
# Uncomment and modify:
# print("Your Custom Scenario")
# print("-" * 50)
# x_custom = [0, 10, 20, 30]
# p_custom = [0.4, 0.3, 0.2, 0.1]
# mean3, var3, std3 = mean_and_variance(x_custom, p_custom)
# print(f"Mean: {mean3:.4f}, Variance: {var3:.4f}")

print("\n--- YOUR TURN ---")
print("Try creating a random variable where:")
print("1. The mean is exactly 5")
print("2. Most probability is concentrated at the ends")
print("3. What happens to variance when values are far from mean?")

### Problem 2: Biased Die

A biased six-sided die has the following probability distribution:

| Outcome (X) | 1 | 2 | 3 | 4 | 5 | 6 |
|-------------|---|---|---|---|---|---|
| P(X) | 0.1 | 0.1 | 0.2 | 0.2 | 0.2 | 0.2 |

Note: This is NOT a fair die. Outcomes 3 through 6 are each twice as likely as outcomes 1 and 2.

**Question:** What is the expected value and variance?

In [None]:
# Problem 2: Biased die

print("Problem 2: Biased Die")
print("=" * 50)

x_die = [1, 2, 3, 4, 5, 6]
p_die = [0.1, 0.1, 0.2, 0.2, 0.2, 0.2]

mean_die, var_die, std_die = mean_and_variance(x_die, p_die)

print(f"Biased Die Probabilities:")
for x, p in zip(x_die, p_die):
    print(f"  P(X = {x}) = {p}")

print(f"\nResults:")
print(f"  Expected Value (Mean): {mean_die:.4f}")
print(f"  Variance: {var_die:.4f}")
print(f"  Standard Deviation: {std_die:.4f}")

# Compare with fair die
mean_fair = 3.5
var_fair = 35/12
print("\nComparison with Fair Die:")
print(f"  Fair Die Mean: {mean_fair:.4f}")
print(f"  Biased Die Mean: {mean_die:.4f}")
print(f"  Difference: {abs(mean_die - mean_fair):.4f}")
print(f"  Fair Die Variance: {var_fair:.4f}")
print(f"  Biased Die Variance: {var_die:.4f}")

# Visualize comparison
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

# Fair die
p_fair = [1/6] * 6
ax1.bar(x_die, p_fair, color='lightgreen', edgecolor='black', width=0.6, alpha=0.7, label='Fair')
ax1.set_xlabel('Outcome')
ax1.set_ylabel('Probability')
ax1.set_title('Fair Die PMF')
ax1.set_ylim(0, 0.25)
ax1.grid(axis='y', alpha=0.3)

# Biased die
ax2.bar(x_die, p_die, color='salmon', edgecolor='black', width=0.6, alpha=0.7, label='Biased')
ax2.set_xlabel('Outcome')
ax2.set_ylabel('Probability')
ax2.set_title('Biased Die PMF')
ax2.set_ylim(0, 0.25)
ax2.grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

### 📝 TO DO #2: Design Your Own Biased Die

Create a biased die where the expected value is greater than 4.0:

In [None]:
# TO DO #2: Design your own biased die

# YOUR TASK: Adjust the probabilities to make E[X] > 4.0
# Remember: Probabilities must sum to 1!

x_custom_die = [1, 2, 3, 4, 5, 6]
# TO DO: Modify these probabilities
p_custom_die = [0.05, 0.05, 0.1, 0.2, 0.3, 0.3]  # Example: favors high numbers

# Check if valid
if abs(sum(p_custom_die) - 1.0) < 1e-6:
    mean_custom, var_custom, std_custom = mean_and_variance(x_custom_die, p_custom_die)
    print("Your Biased Die:")
    print(f"  Mean: {mean_custom:.4f}")
    print(f"  Variance: {var_custom:.4f}")

    if mean_custom > 4.0:
        print(f"\n✓ Success! Your die has expected value > 4.0")
    else:
        print(f"\n✗ Try again! Increase probabilities for higher outcomes")
else:
    print(f"Error: Probabilities sum to {sum(p_custom_die):.4f}, not 1.0")

print("\n--- CHALLENGE ---")
print("Can you create a die where:")
print("1. Mean = 4.5 exactly?")
print("2. Variance is minimized?")
print("3. Outcome 6 has the highest probability?")

---

## Part D: Continuous Random Variables

### Probability Density Function (PDF)

For a continuous random variable X, we use a **Probability Density Function** f(x).

**Key Difference from PMF:**
- For discrete: P(X = x) can be positive, and the PMF lists these probabilities directly
- For continuous: P(X = x) = 0 for any specific value
- Instead, we use: $P(a \leq X \leq b) = \int_a^b f(x) dx$

**Properties of PDF:**
1. $f(x) \geq 0$ for all x
2. $\int_{-\infty}^{\infty} f(x) dx = 1$
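
The interval formula and both PDF properties can be checked numerically. The sketch below uses `scipy.integrate.quad` with a hypothetical density, $f(x) = 2x$ on $[0, 1]$, chosen only for illustration; the helper name `interval_probability` is also an assumption.

```python
from scipy.integrate import quad

def pdf_triangular(x):
    """Hypothetical density for illustration: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def interval_probability(pdf, a, b):
    """P(a <= X <= b), computed as the area under the density between a and b."""
    area, _abs_err = quad(pdf, a, b)
    return area

total_area = interval_probability(pdf_triangular, 0, 1)          # should be 1
p_upper_half = interval_probability(pdf_triangular, 0.5, 1)      # area over [0.5, 1]
p_single_point = interval_probability(pdf_triangular, 0.5, 0.5)  # zero-width interval

print(f"Total area under f: {total_area:.4f}")
print(f"P(0.5 <= X <= 1):   {p_upper_half:.4f}")
print(f"P(X = 0.5):         {p_single_point:.4f}")
```

The zero-width interval has no area, which is the numerical counterpart of P(X = x) = 0 for a continuous random variable.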

### Example: Uniform Distribution

A random variable X is uniformly distributed on [0, 1] if:

$$f(x) = \begin{cases}
1 & \text{if } 0 \leq x \leq 1 \\
0 & \text{otherwise}
\end{cases}$$

This means the density is constant on [0, 1], so intervals of equal length inside [0, 1] are equally likely.

---

In [None]:
# Visualize PDF of Uniform Distribution

x = np.linspace(-0.5, 1.5, 1000)
pdf_uniform = np.where((x >= 0) & (x <= 1), 1, 0)

plt.figure(figsize=(10, 5))
plt.plot(x, pdf_uniform, 'b-', linewidth=2, label='f(x) = 1 for x ∈ [0,1]')
plt.fill_between(x, 0, pdf_uniform, alpha=0.3)
plt.xlabel('x', fontsize=12)
plt.ylabel('f(x)', fontsize=12)
plt.title('PDF of Uniform Distribution on [0, 1]', fontsize=14, fontweight='bold')
plt.xlim(-0.5, 1.5)
plt.ylim(0, 1.5)
plt.grid(True, alpha=0.3)
plt.legend(fontsize=11)
plt.tight_layout()
plt.show()

# Calculate probability P(0.2 ≤ X ≤ 0.6)
a, b = 0.2, 0.6
prob = b - a  # For Uniform[0, 1], the probability is just the interval length
print(f"P({a} ≤ X ≤ {b}) = {prob:.2f}")  # formatted to hide floating-point round-off
print(f"\nThis means there's a {prob:.0%} chance X falls between {a} and {b}")

### Expected Value and Variance for Continuous RV

**Expected Value (Mean):**

$$E[X] = \mu = \int_{-\infty}^{\infty} x \cdot f(x) dx$$

**Variance:**

$$\text{Var}(X) = \sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 \cdot f(x) dx$$

Or equivalently:

$$\text{Var}(X) = E[X^2] - (E[X])^2$$

Where:

$$E[X^2] = \int_{-\infty}^{\infty} x^2 \cdot f(x) dx$$
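
These integrals can be approximated numerically when no closed form is at hand. The following sketch applies them to the Uniform[0, 1] density from the earlier example; the helper `continuous_mean_variance` is a hypothetical name, and it assumes the density is zero outside the given limits.

```python
from scipy.integrate import quad

def continuous_mean_variance(pdf, lower, upper):
    """Mean and variance of a density that is zero outside [lower, upper]."""
    mean, _ = quad(lambda x: x * pdf(x), lower, upper)               # E[X]
    second_moment, _ = quad(lambda x: x**2 * pdf(x), lower, upper)   # E[X^2]
    return mean, second_moment - mean**2                             # Var = E[X^2] - (E[X])^2

def uniform_pdf(x):
    """Density of Uniform[0, 1] on its support."""
    return 1.0

mean_u, var_u = continuous_mean_variance(uniform_pdf, 0, 1)
print(f"E[X]   = {mean_u:.4f}")
print(f"Var(X) = {var_u:.4f}")
```

For Uniform[0, 1] the exact values are 1/2 and 1/12, so the numerical results can be compared against them.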

---

### Problem 3: Exponential Distribution

The **exponential distribution** models waiting times (time until next event).

**PDF:**

$$f(x) = \lambda e^{-\lambda x} \text{ for } x \geq 0$$

where λ > 0 is the rate parameter.

**Example:** Time (in minutes) until the next customer arrives, with λ = 2.

**Theoretical values:**
- Mean: $E[X] = \frac{1}{\lambda} = \frac{1}{2} = 0.5$ minutes
- Variance: $\text{Var}(X) = \frac{1}{\lambda^2} = \frac{1}{4} = 0.25$ minutes², so the standard deviation is $0.5$ minutes
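
For reference, these values follow from integration by parts (a sketch of the standard derivation):

$$E[X] = \int_0^{\infty} x \cdot \lambda e^{-\lambda x} dx = \left[-x e^{-\lambda x}\right]_0^{\infty} + \int_0^{\infty} e^{-\lambda x} dx = 0 + \frac{1}{\lambda}$$

$$E[X^2] = \int_0^{\infty} x^2 \cdot \lambda e^{-\lambda x} dx = \left[-x^2 e^{-\lambda x}\right]_0^{\infty} + \int_0^{\infty} 2x e^{-\lambda x} dx = \frac{2}{\lambda} E[X] = \frac{2}{\lambda^2}$$

$$\text{Var}(X) = E[X^2] - (E[X])^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$$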

In [None]:
from scipy.integrate import quad

# Problem 3: Exponential Distribution with λ = 2

# Parameter
lmbda = 2

# Define the PDF
def pdf_exponential(x):
    return lmbda * np.exp(-lmbda * x)

# Calculate Mean: E[X] = ∫ x * f(x) dx from 0 to ∞
mean_integrand = lambda x: x * pdf_exponential(x)
mean_exp, _ = quad(mean_integrand, 0, np.inf)

# Calculate E[X²]: ∫ x² * f(x) dx from 0 to ∞
mean_square_integrand = lambda x: x**2 * pdf_exponential(x)
e_x_squared, _ = quad(mean_square_integrand, 0, np.inf)

# Variance: Var(X) = E[X²] - (E[X])²
variance_exp = e_x_squared - mean_exp**2

# Standard deviation
std_exp = np.sqrt(variance_exp)

print("Problem 3: Exponential Distribution (λ = 2)")
print("=" * 50)
print(f"PDF: f(x) = {lmbda} * e^(-{lmbda}x) for x ≥ 0")
print(f"\nCalculated values:")
print(f"  E[X] = {mean_exp:.4f}")
print(f"  E[X²] = {e_x_squared:.4f}")
print(f"  Var(X) = E[X²] - (E[X])² = {variance_exp:.4f}")
print(f"  Std Dev = {std_exp:.4f}")

print(f"\nTheoretical values:")
print(f"  E[X] = 1/λ = {1/lmbda:.4f}")
print(f"  Var(X) = 1/λ² = {1/lmbda**2:.4f}")

# Visualize the PDF
x_vals = np.linspace(0, 5, 1000)
y_vals = pdf_exponential(x_vals)

plt.figure(figsize=(10, 5))
plt.plot(x_vals, y_vals, 'b-', linewidth=2, label=f'f(x) = {lmbda}e^(-{lmbda}x)')
plt.fill_between(x_vals, 0, y_vals, alpha=0.3)
plt.axvline(mean_exp, color='red', linestyle='--', linewidth=2, label=f'Mean = {mean_exp:.2f}')
plt.xlabel('x (time in minutes)', fontsize=12)
plt.ylabel('f(x)', fontsize=12)
plt.title('Exponential Distribution PDF', fontsize=14, fontweight='bold')
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Interpretation
print(f"\nInterpretation:")
print(f"On average, the wait until the next customer is {mean_exp:.2f} minutes.")
print(f"The standard deviation is {std_exp:.2f} minutes.")

### 📝 TO DO #3: Experiment with Different λ Values

The rate parameter λ controls how quickly events occur:
- **Large λ**: Events happen quickly (short wait times)
- **Small λ**: Events happen slowly (long wait times)
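
A quick simulation makes the effect of λ concrete. The sketch below draws waiting times with `random.expovariate` from the standard library; the two rates, the seed, and the sample size are arbitrary choices for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
n_samples = 200_000

sample_means = {}
for lam in (0.5, 4.0):
    # expovariate(lam) draws from the exponential distribution with rate lam.
    waits = [rng.expovariate(lam) for _ in range(n_samples)]
    sample_means[lam] = sum(waits) / n_samples
    print(f"λ = {lam}: sample mean wait = {sample_means[lam]:.4f}, theory 1/λ = {1 / lam:.4f}")
```

The larger rate should give the shorter average wait, in line with the mean 1/λ.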

In [None]:
# TO DO #3: Experiment with different λ values

def analyze_exponential(lmbda, label):
    """Analyze exponential distribution for given lambda"""
    pdf = lambda x: lmbda * np.exp(-lmbda * x)

    # Calculate mean
    mean_integrand = lambda x: x * pdf(x)
    mean, _ = quad(mean_integrand, 0, np.inf)

    # Calculate variance
    var_integrand = lambda x: (x - mean)**2 * pdf(x)
    variance, _ = quad(var_integrand, 0, np.inf)

    return mean, variance, pdf

# Test different λ values
lambdas = [0.5, 1.0, 2.0, 4.0]
colors = ['red', 'blue', 'green', 'purple']

plt.figure(figsize=(12, 6))

for lam, color in zip(lambdas, colors):
    mean, var, pdf = analyze_exponential(lam, f'λ={lam}')
    x = np.linspace(0, 5, 1000)
    y = pdf(x)  # np.exp is vectorized, so the PDF can be evaluated on the whole array
    plt.plot(x, y, color=color, linewidth=2, label=f'λ={lam}, Mean={mean:.2f}')

plt.xlabel('x', fontsize=12)
plt.ylabel('f(x)', fontsize=12)
plt.title('Exponential PDFs for Different λ Values', fontsize=14, fontweight='bold')
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)
plt.xlim(0, 5)
plt.tight_layout()
plt.show()

print("Observations:")
print("- Larger λ → steeper curve, shorter wait times")
print("- Smaller λ → flatter curve, longer wait times")
print("\n--- YOUR TURN ---")
print("What λ would you use to model:")
print("1. Time between buses arriving every 15 minutes?")
print("2. Time between phone calls if you get 6 per hour?")

### Problem 4: Custom PDF

Let's work with a custom probability density function:

$$f(x) = 3x^2 \text{ for } x \in [0, 1]$$

**Verification:** Is this a valid PDF?

Check: $\int_0^1 3x^2 dx = [x^3]_0^1 = 1 - 0 = 1$ ✓

**Question:** Find the mean and variance.
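
Worked by hand, the calculation goes as follows:

$$E[X] = \int_0^1 x \cdot 3x^2 dx = \left[\frac{3x^4}{4}\right]_0^1 = \frac{3}{4}$$

$$E[X^2] = \int_0^1 x^2 \cdot 3x^2 dx = \left[\frac{3x^5}{5}\right]_0^1 = \frac{3}{5}$$

$$\text{Var}(X) = E[X^2] - (E[X])^2 = \frac{3}{5} - \frac{9}{16} = \frac{3}{80} = 0.0375$$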

In [None]:
# Problem 4: Custom PDF f(x) = 3x² on [0, 1]

import sympy as sp

# Define symbol
x = sp.Symbol('x', real=True, positive=True)

# Define the PDF
pdf_custom = 3 * x**2

# Integration limits
a, b = 0, 1

print("Problem 4: Custom PDF")
print("=" * 50)
print(f"PDF: f(x) = {pdf_custom} for x ∈ [{a}, {b}]")

# Verify it's a valid PDF
integral_check = sp.integrate(pdf_custom, (x, a, b))
print(f"\nVerification: ∫ f(x) dx = {integral_check}")

# Compute Mean: E[X] = ∫ x * f(x) dx
mean_expr = x * pdf_custom
mean_symbolic = sp.integrate(mean_expr, (x, a, b))
mean_value = float(mean_symbolic)

print(f"\nMean Calculation:")
print(f"  E[X] = ∫ x * 3x² dx from 0 to 1")
print(f"  E[X] = ∫ 3x³ dx from 0 to 1")
print(f"  E[X] = [3x⁴/4]₀¹")
print(f"  E[X] = {mean_symbolic} = {mean_value:.4f}")

# Compute E[X²]: ∫ x² * f(x) dx
e_x_squared_expr = x**2 * pdf_custom  # integrand of the second moment, not the variance itself
e_x_squared_symbolic = sp.integrate(e_x_squared_expr, (x, a, b))
e_x_squared_value = float(e_x_squared_symbolic)

# Variance: Var(X) = E[X²] - (E[X])²
variance_value = e_x_squared_value - mean_value**2

print(f"\nVariance Calculation:")
print(f"  E[X²] = ∫ x² * 3x² dx from 0 to 1")
print(f"  E[X²] = {e_x_squared_symbolic} = {e_x_squared_value:.4f}")
print(f"  Var(X) = E[X²] - (E[X])²")
print(f"  Var(X) = {e_x_squared_value:.4f} - ({mean_value:.4f})²")
print(f"  Var(X) = {variance_value:.4f}")

# Visualize
x_plot = np.linspace(0, 1, 1000)
y_plot = 3 * x_plot**2

plt.figure(figsize=(10, 5))
plt.plot(x_plot, y_plot, 'b-', linewidth=2, label='f(x) = 3x²')
plt.fill_between(x_plot, 0, y_plot, alpha=0.3)
plt.axvline(mean_value, color='red', linestyle='--', linewidth=2, label=f'Mean = {mean_value:.3f}')
plt.xlabel('x', fontsize=12)
plt.ylabel('f(x)', fontsize=12)
plt.title('Custom PDF: f(x) = 3x²', fontsize=14, fontweight='bold')
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

### 📝 TO DO #4: Create Your Own PDF

Design your own probability density function and calculate its properties.

**Requirements for a valid PDF:**
1. f(x) ≥ 0 for all x
2. ∫ f(x) dx = 1 over the domain

**Try these examples:**
- f(x) = 2x for x ∈ [0, 1]
- f(x) = (3/8)x² for x ∈ [0, 2]
- f(x) = (1/2) for x ∈ [0, 2]

In [None]:
# TO DO #4: Design your own PDF

# Example 1: f(x) = 2x on [0, 1]
print("Example 1: f(x) = 2x on [0, 1]")
print("-" * 50)

x = sp.Symbol('x', real=True, positive=True)
pdf1 = 2 * x
a1, b1 = 0, 1

# Check if valid
integral1 = sp.integrate(pdf1, (x, a1, b1))
print(f"∫ f(x) dx = {integral1}")

if integral1 == 1:
    # Calculate mean and variance
    mean1 = float(sp.integrate(x * pdf1, (x, a1, b1)))
    e_x2_1 = float(sp.integrate(x**2 * pdf1, (x, a1, b1)))
    var1 = e_x2_1 - mean1**2
    print(f"✓ Valid PDF!")
    print(f"Mean: {mean1:.4f}")
    print(f"Variance: {var1:.4f}")
else:
    print(f"✗ Not a valid PDF (integral = {integral1})")

# TO DO: Try your own PDF
print("\n--- YOUR TURN ---")
print("Create a PDF where:")
print("1. The mean is exactly 0.5")
print("2. The variance is less than 0.1")
print("3. The function is defined on [0, 1]")

# Uncomment and modify:
# pdf_custom = ???  # Your function here
# a_custom, b_custom = 0, 1
# Check validity and calculate mean/variance

---

## Summary

### Key Concepts Covered:

**1. Random Variables:**
- Function that assigns numbers to outcomes
- Two types: Discrete and Continuous

**2. Discrete Random Variables:**
- PMF: P(X = x)
- Probabilities sum to 1
- Examples: dice, coin flips, counting

**3. Continuous Random Variables:**
- PDF: f(x)
- P(X = x) = 0 for any single point
- Use integrals: P(a ≤ X ≤ b) = ∫ₐᵇ f(x) dx

**4. Expected Value (Mean):**
- Discrete: $E[X] = \sum x \cdot P(X = x)$
- Continuous: $E[X] = \int x \cdot f(x) dx$

**5. Variance:**
- Measures spread around the mean
- $\text{Var}(X) = E[X^2] - (E[X])^2$
- Standard deviation: $\sigma = \sqrt{\text{Var}(X)}$

### Important Distributions:

| Distribution | Type | PDF/PMF | Mean | Variance |
|--------------|------|---------|------|----------|
| Uniform [a,b] | Continuous | 1/(b-a) | (a+b)/2 | (b-a)²/12 |
| Exponential | Continuous | λe^(-λx) | 1/λ | 1/λ² |
| Fair Die | Discrete | 1/6 each | 3.5 | 35/12 |
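
The Uniform [a,b] row can be checked symbolically. This is a sketch using SymPy with symbolic endpoints `a` and `b`; it assumes a < b, although that assumption is not encoded in the symbols.

```python
import sympy as sp

x, a, b = sp.symbols('x a b', real=True)

# Density of Uniform[a, b] on its support (assumes a < b).
pdf = 1 / (b - a)

total = sp.simplify(sp.integrate(pdf, (x, a, b)))          # normalization
mean = sp.simplify(sp.integrate(x * pdf, (x, a, b)))       # E[X]
second_moment = sp.integrate(x**2 * pdf, (x, a, b))        # E[X^2]
variance = sp.simplify(second_moment - mean**2)            # E[X^2] - (E[X])^2

print("Integral of pdf:", total)
print("Mean:", mean)
print("Variance:", sp.factor(variance))
```

Comparing each result with the table entry, for example by simplifying the difference, is a convenient way to confirm such formulas.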

---

## Practice Problems

**Problem A:** Discrete Random Variable

X has PMF:

| x | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| P(X=x) | 0.2 | 0.3 | 0.1 | 0.3 | 0.1 |

Find E[X] and Var(X).

**Problem B:** Continuous Random Variable

f(x) = 2(1-x) for x ∈ [0, 1]

- (a) Verify this is a valid PDF
- (b) Find E[X]
- (c) Find Var(X)

**Problem C:** Exponential Distribution

Customers arrive at a rate of 3 per hour. Let X = time (in hours) until next customer.

- (a) What is λ?
- (b) Find E[X] and Var(X)
- (c) What is P(X > 0.5)?

**Problem D:** Applications

Give three real-world examples each of:
1. Discrete random variables
2. Continuous random variables

Explain what values they can take and why they're discrete or continuous.

---

## In Week 6, you'll explore:

- **Binomial Distribution**: n independent trials with success probability p
- **Poisson Distribution**: Number of events in a fixed interval
- **Properties and Applications**: When to use each distribution

### Key Takeaways from Week 5:

1. **Random variables** map the outcomes of an experiment to numbers, so probabilities can be analyzed with arithmetic and calculus
2. **PMF** for discrete, **PDF** for continuous
3. **Expected value** is the long-run average
4. **Variance** measures spread
5. **Integration** (continuous) vs **Summation** (discrete)

### Practice Tips:

- Always check if probabilities sum/integrate to 1
- Draw diagrams to visualize PDFs and PMFs
- Remember: for continuous RV, P(X = exact value) = 0
- Variance is always non-negative

**Keep practicing with the TO DO sections!**

---

**Great job completing Week 5! 🎉**

You now understand how to work with random variables, calculate their properties, and apply them to real-world problems!