# CHAPTER 9: Patterns Everywhere

**Pages:** 157-174  
**Word Count:** ~4,500 words  
**Figures:** 5

---

## Setup: Python Libraries

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from scipy import stats
from scipy.integrate import odeint
import seaborn as sns

# Set style for all plots
plt.style.use('seaborn-v0_8-whitegrid')
sns.set_palette("husl")

# For reproducibility
np.random.seed(42)

---

## Part 1: The Opening – Seeing Patterns Everywhere

Two months after the monsoon ended, Ananya couldn't stop seeing patterns.

The morning bus to school arrived at 7:43 AM most days, give or take about 4 minutes, and the arrival times looked roughly normally distributed. The number of students absent each day seemed to follow something like a Poisson distribution. Even her mother's grocery bills clustered around ₹800 with a predictable spread.

"You've become obsessed," Kabir teased one afternoon, watching her sketch a histogram of their cafeteria's lunch line wait times during their break.

"Not obsessed," Ananya protested. "Just... aware. **Once you learn to see distributions, you can't unsee them.**"

She was right. The monsoon project had changed something fundamental. What began as helping Uncle Bikram had become a new way of seeing the world. Weather patterns, cricket statistics, disease spread, even the scatter of classmates' test scores: all of it following patterns, dancing with uncertainty in predictable ways.

It was November now, the post-monsoon clarity settling over Sambalpur. School had returned to its normal rhythm: tests, homework, the relentless march toward board exams. But something was different. Word of their insurance company victory had spread. The state science fair invitation sat on Ananya's desk. And everywhere she looked, she saw questions that could be answered with the tools they'd learned.

---

## Part 2: Kabir's Cricket Revolution

### FIGURE 9.1: Cricket Player Profiles

In [None]:
# Create three types of cricket player profiles
np.random.seed(42)
n_matches = 20

# Type A: Mr. Reliable (high mean, low variance)
reliable_mean = 45
reliable_sd = 8
reliable_scores = np.random.normal(reliable_mean, reliable_sd, n_matches)
reliable_scores = np.clip(reliable_scores, 0, None)  # No negative scores

# Type B: Mr. Explosive (high mean, high variance)
explosive_mean = 45
explosive_sd = 25
explosive_scores = np.random.normal(explosive_mean, explosive_sd, n_matches)
explosive_scores = np.clip(explosive_scores, 0, None)

# Type C: Mr. Declining (decreasing mean over time)
declining_base = 55
decline_rate = 2  # Declining by ~2 runs per match
declining_scores = declining_base - decline_rate * np.arange(n_matches) + np.random.normal(0, 10, n_matches)
declining_scores = np.clip(declining_scores, 0, None)

# Create summary statistics
print("\n🏏 KABIR'S CRICKET ANALYTICS PROJECT\n")
print("="*70)
print(f"{'Player Type':<20} {'Average':<15} {'Std Dev':<15} {'Range'}")
print("="*70)
print(f"{'Mr. Reliable':<20} {np.mean(reliable_scores):<15.1f} {np.std(reliable_scores, ddof=1):<15.1f} {max(reliable_scores)-min(reliable_scores):.1f}")
print(f"{'Mr. Explosive':<20} {np.mean(explosive_scores):<15.1f} {np.std(explosive_scores, ddof=1):<15.1f} {max(explosive_scores)-min(explosive_scores):.1f}")
print(f"{'Mr. Declining':<20} {np.mean(declining_scores):<15.1f} {np.std(declining_scores, ddof=1):<15.1f} {max(declining_scores)-min(declining_scores):.1f}")
print("="*70)
print("\n💡 KEY INSIGHT: Same average doesn't mean same story!")

In [None]:
# Visualize the three player types
fig, axes = plt.subplots(2, 3, figsize=(18, 10))
fig.suptitle('FIGURE 9.1: Three Types of Batsmen\n' +
             'Same average can hide very different stories',
             fontsize=16, fontweight='bold', y=1.00)

players = [
    ('Mr. Reliable', reliable_scores, 'steelblue'),
    ('Mr. Explosive', explosive_scores, 'coral'),
    ('Mr. Declining', declining_scores, 'purple')
]

for idx, (name, scores, color) in enumerate(players):
    # Top row: Line plot (match-by-match)
    ax_line = axes[0, idx]
    matches = np.arange(1, n_matches+1)
    ax_line.plot(matches, scores, marker='o', linewidth=2.5, 
                 color=color, markersize=6, alpha=0.8)
    ax_line.axhline(y=np.mean(scores), color='red', linestyle='--', 
                    linewidth=2, alpha=0.7, label=f'Avg: {np.mean(scores):.1f}')
    ax_line.set_xlabel('Match Number', fontsize=11, fontweight='bold')
    ax_line.set_ylabel('Runs Scored', fontsize=11, fontweight='bold')
    ax_line.set_title(name, fontsize=13, fontweight='bold', color=color)
    ax_line.set_ylim([0, max(max(reliable_scores), max(explosive_scores), max(declining_scores)) + 10])
    ax_line.legend(fontsize=10)
    ax_line.grid(True, alpha=0.3)
    
    # Add cricket emoji
    ax_line.text(0.95, 0.95, '🏏', transform=ax_line.transAxes, 
                 fontsize=25, va='top', ha='right', alpha=0.3)
    
    # Bottom row: Histogram (distribution)
    ax_hist = axes[1, idx]
    ax_hist.hist(scores, bins=10, alpha=0.7, color=color, edgecolor='black')
    ax_hist.axvline(x=np.mean(scores), color='red', linestyle='--', 
                    linewidth=2, alpha=0.7)
    ax_hist.set_xlabel('Runs Scored', fontsize=11, fontweight='bold')
    ax_hist.set_ylabel('Frequency', fontsize=11, fontweight='bold')
    ax_hist.set_title('Distribution', fontsize=12, fontweight='bold')
    ax_hist.grid(True, alpha=0.3, axis='y')
    
    # Add stats box
    stats_text = f'SD: {np.std(scores, ddof=1):.1f}'
    ax_hist.text(0.95, 0.95, stats_text, transform=ax_hist.transAxes,
                 fontsize=11, va='top', ha='right', fontweight='bold',
                 bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.8))

plt.tight_layout()
plt.show()

print("\n📊 KABIR'S ANALYSIS:\n")
print("Mr. Reliable: Consistent around 45, perfect for crucial matches")
print("Mr. Explosive: Averages 45 but swings wildly, a risky gamble")
print("Mr. Declining: Performance trending downward over time, needs rest or form work")
print("\n✅ Team selection should consider variance, not just averages!")

### The Story Continues...

"They keep picking Sharma over Patel," Kabir complained during lunch. "Both average 35 runs. But Sharma is so inconsistent‚Äîsome matches he gets 70, some matches he gets 5. Patel's more reliable."

"So model it," Ananya said simply.

"What?"

"You have the data. IPL publishes all statistics. Build a model showing consistency differences."

Kabir stared at her. Then a grin spread across his face. **"You're right. I could actually do this."**

He spent the next week collecting data: last three seasons' batting scores for twenty players. For each player, he calculated mean and standard deviation. **Player profiles emerged.**
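
Kabir's profile calculation can be sketched in a few lines. The scores below are invented for illustration (they are not real IPL figures), with the player names borrowed from the lunch conversation. The coefficient of variation (SD divided by mean) is one possible consistency measure; it is an assumption here, not something Kabir is said to have used.

```python
from statistics import mean, stdev

# Hypothetical scores, invented for illustration (not real IPL data).
# Both players are constructed to average exactly 35 runs.
scores = {
    "Sharma": [70, 5, 62, 8, 55, 10],   # swings between big scores and failures
    "Patel": [38, 31, 36, 33, 40, 32],  # stays close to his average
}

def profile(runs):
    """Return (mean, sample SD, coefficient of variation) for a list of scores."""
    avg = mean(runs)
    sd = stdev(runs)  # sample standard deviation (n - 1 in the denominator)
    return avg, sd, sd / avg

profiles = {name: profile(runs) for name, runs in scores.items()}

for name, (avg, sd, cv) in profiles.items():
    print(f"{name:<8} mean={avg:5.1f}  sd={sd:5.1f}  cv={cv:4.2f}")
```

With these made-up numbers the averages match, but Sharma's standard deviation comes out near 30 runs against under 4 for Patel, which is the gap a selector looking only at averages would miss.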

Kabir shared his analysis on the school blog and got unexpected attention from a local cricket club coach!

---

## Part 3: Priya's Genetics Exploration

### FIGURE 9.2: Genetic Inheritance Probability

In [None]:
# Create Punnett Square visualization with distribution
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 7))

# Left: Punnett Square
ax1.axis('off')
ax1.set_xlim([0, 4])
ax1.set_ylim([0, 4])

# Draw grid
for i in range(5):
    ax1.plot([i, i], [0, 4], 'k-', linewidth=2)
    ax1.plot([0, 4], [i, i], 'k-', linewidth=2)

# Labels
# Parent 1 alleles sit above the two offspring columns (x = 1..2 and x = 2..3)
ax1.text(2.0, 4.3, 'Parent 1', fontsize=14, ha='center', fontweight='bold')
ax1.text(1.5, 3.5, 'A', fontsize=16, ha='center', fontweight='bold')
ax1.text(2.5, 3.5, 'a', fontsize=16, ha='center', fontweight='bold')
# Parent 2 alleles sit to the left of the two offspring rows (y = 2..3 and y = 1..2)
ax1.text(-0.5, 2.0, 'Parent 2', fontsize=14, ha='center', fontweight='bold', rotation=90)
ax1.text(0.5, 2.5, 'A', fontsize=16, ha='center', fontweight='bold', rotation=90)
ax1.text(0.5, 1.5, 'a', fontsize=16, ha='center', fontweight='bold', rotation=90)

# Offspring genotypes with colors
genotypes = [
    (1, 2, 'AA', 'lightgreen', '25%\nUnaffected'),
    (2, 2, 'Aa', 'lightyellow', '25%\nCarrier'),
    (1, 1, 'Aa', 'lightyellow', '25%\nCarrier'),
    (2, 1, 'aa', 'lightcoral', '25%\nAffected')
]

for x, y, genotype, color, label in genotypes:
    rect = plt.Rectangle((x, y), 1, 1, facecolor=color, edgecolor='black', linewidth=2)
    ax1.add_patch(rect)
    ax1.text(x+0.5, y+0.6, genotype, fontsize=18, ha='center', 
             fontweight='bold', color='black')
    ax1.text(x+0.5, y+0.25, label, fontsize=10, ha='center', va='center')

ax1.set_title('Punnett Square: Aa × Aa Cross\n(Both Parents Carriers)', 
              fontsize=14, fontweight='bold', pad=20)

# Right: Distribution if many offspring
# Simulate 1000 offspring
n_offspring = 1000
genotype_counts = {'AA': 0, 'Aa': 0, 'aa': 0}

for _ in range(n_offspring):
    # Each parent contributes A or a with 50% probability
    parent1_allele = np.random.choice(['A', 'a'])
    parent2_allele = np.random.choice(['A', 'a'])
    
    genotype = ''.join(sorted([parent1_allele, parent2_allele]))
    if genotype == 'AA':
        genotype_counts['AA'] += 1
    elif genotype == 'aa':
        genotype_counts['aa'] += 1
    else:
        genotype_counts['Aa'] += 1

genotypes_list = list(genotype_counts.keys())
counts_list = [genotype_counts[g] for g in genotypes_list]
colors_list = ['lightgreen', 'lightyellow', 'lightcoral']

bars = ax2.bar(genotypes_list, counts_list, color=colors_list, 
               edgecolor='black', linewidth=2, alpha=0.8)

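# Add expected counts: one dashed segment per genotype (25%, 50%, 25%)
expected = [n_offspring * 0.25, n_offspring * 0.5, n_offspring * 0.25]
bar_positions = np.arange(len(expected))
ax2.hlines(expected, bar_positions - 0.4, bar_positions + 0.4, colors='red',
           linestyles='--', linewidth=2, alpha=0.7, label='Expected (25%-50%-25%)')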

# Labels
for bar, count in zip(bars, counts_list):
    height = bar.get_height()
    ax2.text(bar.get_x() + bar.get_width()/2., height,
            f'{count}\n({count/n_offspring*100:.1f}%)',
            ha='center', va='bottom', fontsize=12, fontweight='bold')

ax2.set_xlabel('Genotype', fontsize=13, fontweight='bold')
ax2.set_ylabel('Number of Offspring (out of 1000)', fontsize=13, fontweight='bold')
ax2.set_title('Distribution in Large Population\n(Simulation)', 
              fontsize=14, fontweight='bold', pad=10)
ax2.legend(fontsize=11)
ax2.grid(True, alpha=0.3, axis='y')

fig.suptitle('FIGURE 9.2: Genetic Inheritance Follows Mathematical Rules\n' +
             'Mendel\'s peas showed that inheritance follows predictable probabilities',
             fontsize=16, fontweight='bold', y=0.98)

plt.tight_layout()
plt.show()

print("\n🧬 PRIYA'S INSIGHT:\n")
print("This is a binomial distribution!")
print("If both parents are carriers (Aa):")
print("  - 25% chance: Child unaffected (AA)")
print("  - 50% chance: Child is carrier (Aa)")
print("  - 25% chance: Child has disease (aa)")
print("\n✅ Genetics is just probability! Medical counseling uses these models.")

### The Story...

They were studying Mendelian inheritance: Punnett squares showing how traits pass from parents to offspring. The teacher drew the classic Aa × Aa cross on the board.

Priya's hand shot up. **"Ma'am, this is a binomial distribution!"**

The teacher paused. "What?"

"The probability distribution. Each allele a child inherits is like a coin flip: two independent events, each with probability one half. If both parents are Aa, each parent passes either A or a with 50% probability. Count the a alleles and you get exactly the binomial distribution we learned about."

The teacher, Mrs. Nayak, looked intrigued. "Priya, that's... actually correct. I've never thought about it that way. The mathematics underneath the biology."
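
Priya's claim can be checked directly. Counting the number of `a` alleles a child inherits gives a binomial variable with n = 2 trials (one allele from each parent) and p = 0.5. The sketch below is a minimal illustration; the function name `binomial_pmf` and the label strings are ours, not from the chapter.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Number of 'a' alleles inherited from two Aa parents: n = 2, p = 0.5
labels = {0: "AA (unaffected)", 1: "Aa (carrier)", 2: "aa (affected)"}
genotype_probs = {labels[k]: binomial_pmf(k, 2, 0.5) for k in range(3)}

for genotype, prob in genotype_probs.items():
    print(f"{genotype:<16} {prob:.2f}")
```

The three probabilities reproduce the 25%-50%-25% split of the Punnett square. The same function also handles family-level questions: treating each child as an independent trial with a 0.25 chance of being affected, `binomial_pmf(0, 3, 0.25)` gives the chance that none of three children is affected, about 0.42.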

---

## Part 4: Disease Spread Modeling

### FIGURE 9.3: SIR Epidemic Model

In [None]:
# SIR Model: Susceptible-Infected-Recovered
def sir_model(y, t, beta, gamma):
    """
    SIR epidemic model differential equations
    y = [S, I, R] - proportions of population
    beta = infection rate
    gamma = recovery rate
    """
    S, I, R = y
    N = S + I + R  # Total population (should be 1)
    
    dSdt = -beta * S * I
    dIdt = beta * S * I - gamma * I
    dRdt = gamma * I
    
    return [dSdt, dIdt, dRdt]

# Parameters
beta = 0.5   # Infection rate
gamma = 0.1  # Recovery rate
R0 = beta / gamma  # Basic reproduction number

# Initial conditions: 99.9% susceptible, 0.1% infected, 0% recovered
S0 = 0.999
I0 = 0.001
R0_val = 0.0
y0 = [S0, I0, R0_val]

# Time points (days)
t = np.linspace(0, 200, 1000)

# Solve ODE
solution = odeint(sir_model, y0, t, args=(beta, gamma))
S, I, R = solution.T

# Plot
fig, ax = plt.subplots(figsize=(14, 8))

ax.plot(t, S, 'b-', linewidth=3, label='Susceptible', alpha=0.8)
ax.plot(t, I, 'r-', linewidth=3, label='Infected', alpha=0.8)
ax.plot(t, R, 'g-', linewidth=3, label='Recovered', alpha=0.8)

# Mark key points
peak_idx = np.argmax(I)
peak_day = t[peak_idx]
peak_infected = I[peak_idx]

ax.plot(peak_day, peak_infected, 'ro', markersize=12)
ax.annotate(f'Peak Infection\nDay {peak_day:.0f}\n{peak_infected*100:.1f}% of population',
            xy=(peak_day, peak_infected), xytext=(peak_day+30, peak_infected+0.15),
            arrowprops=dict(arrowstyle='->', color='red', lw=2),
            fontsize=12, fontweight='bold',
            bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.9))

# Herd immunity threshold
herd_immunity = 1 - (1/R0)
ax.axhline(y=herd_immunity, color='purple', linestyle='--', linewidth=2, 
           alpha=0.7, label=f'Herd Immunity Threshold ({herd_immunity*100:.0f}%)')

ax.set_xlabel('Time (days)', fontsize=13, fontweight='bold')
ax.set_ylabel('Proportion of Population', fontsize=13, fontweight='bold')
ax.set_title('FIGURE 9.3: SIR Epidemic Model\n' +
             'Disease spread follows predictable curves: Susceptible decreases, ' +
             'Infections rise and fall, Recovered grows',
             fontsize=15, fontweight='bold', pad=20)
ax.legend(fontsize=12, loc='right')
ax.grid(True, alpha=0.3)
ax.set_ylim([0, 1])

# Add R₀ information
ax.text(0.02, 0.98, f'R₀ = {R0:.1f}\n(Each infected person infects {R0:.1f} others on average)', 
        transform=ax.transAxes, fontsize=11, va='top',
        bbox=dict(boxstyle='round', facecolor='lightblue', alpha=0.8))

plt.tight_layout()
plt.show()

print("\n🦠 EPIDEMIC MODELING INSIGHTS:\n")
print(f"R₀ (Basic Reproduction Number) = {R0:.1f}")
print(f"  → Each infected person infects {R0:.1f} others on average")
print(f"\nPeak Infection: Day {peak_day:.0f} with {peak_infected*100:.1f}% infected")
print(f"Herd Immunity Threshold: {herd_immunity*100:.0f}% immune needed to stop spread")
print(f"\nFinal State: {R[-1]*100:.1f}% of population infected over course of epidemic")
print("\n✅ Public health uses these models to plan interventions!")

### The Guest Lecture

The school had invited a public health official to discuss pandemic preparedness and the lessons of COVID-19.

Dr. Ranjan showed them epidemic curves, explaining how diseases spread through populations. He introduced the SIR model: Susceptible, Infected, Recovered.

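Ananya's hand shot up. **"Sir, those curves, they're like distributions, aren't they? The infection curve looks like a bell curve."**

"Good eye!" Dr. Ranjan looked delighted. **"Strictly speaking, the infection curve tracks cases over time rather than a probability distribution, and it is usually lopsided, rising faster than it falls. But every epidemic model is built on probability. The R₀, the basic reproduction number, is literally an expected value: how many people, on average, does each infected person infect when everyone else is still susceptible?"**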

---

## Part 5: Professional Applications

### Stock Market: Understanding Risk

In [None]:
# Simulate daily stock returns from a normal distribution (real returns have fatter tails)
np.random.seed(42)
days = 252  # Trading days in a year

# Stock A: Low volatility (SD = 1%)
stock_a_returns = np.random.normal(0.03/252, 0.01, days)  # 3% annual return, 1% daily SD

# Stock B: High volatility (SD = 3%)
stock_b_returns = np.random.normal(0.03/252, 0.03, days)  # Same 3% annual return, 3% daily SD

# Calculate cumulative returns
stock_a_price = 100 * np.cumprod(1 + stock_a_returns)
stock_b_price = 100 * np.cumprod(1 + stock_b_returns)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))

# Left: Price evolution
ax1.plot(stock_a_price, linewidth=2, label='Stock A (Low Volatility)', color='steelblue', alpha=0.8)
ax1.plot(stock_b_price, linewidth=2, label='Stock B (High Volatility)', color='coral', alpha=0.8)
ax1.axhline(y=100, color='black', linestyle='--', linewidth=1, alpha=0.5)
ax1.set_xlabel('Trading Days', fontsize=12, fontweight='bold')
ax1.set_ylabel('Stock Price (₹)', fontsize=12, fontweight='bold')
ax1.set_title('Stock Price Evolution\nSame Expected Return, Different Risk', 
              fontsize=14, fontweight='bold')
ax1.legend(fontsize=11)
ax1.grid(True, alpha=0.3)

# Right: Returns distribution
ax2.hist(stock_a_returns * 100, bins=30, alpha=0.6, color='steelblue', 
         label='Stock A', edgecolor='black')
ax2.hist(stock_b_returns * 100, bins=30, alpha=0.6, color='coral', 
         label='Stock B', edgecolor='black')
ax2.axvline(x=0, color='black', linestyle='--', linewidth=2, alpha=0.5)
ax2.set_xlabel('Daily Return (%)', fontsize=12, fontweight='bold')
ax2.set_ylabel('Frequency', fontsize=12, fontweight='bold')
ax2.set_title('Distribution of Daily Returns\nVolatility = Risk', 
              fontsize=14, fontweight='bold')
ax2.legend(fontsize=11)
ax2.grid(True, alpha=0.3, axis='y')

fig.suptitle('Stock Market: Risk Management is About Variance', 
             fontsize=16, fontweight='bold', y=1.02)

plt.tight_layout()
plt.show()

print("\n📈 STOCK MARKET ANALYST'S WISDOM:\n")
print(f"Stock A Final Price: ₹{stock_a_price[-1]:.2f} | Return: {(stock_a_price[-1]/100-1)*100:.1f}%")
print(f"Stock B Final Price: ₹{stock_b_price[-1]:.2f} | Return: {(stock_b_price[-1]/100-1)*100:.1f}%")
print("\nBoth had the same expected return (~3% annual)")
print(f"But Stock A (SD={np.std(stock_a_returns)*100:.2f}%) vs Stock B (SD={np.std(stock_b_returns)*100:.2f}%)")
print("\n💡 'High-risk, high-reward' literally means high standard deviation!")
print("\nYoung investors: Can take volatile Stock B (time to recover)")
print("Retirees: Need stable Stock A (can't afford losses)")

### FIGURE 9.4: Manufacturing Quality Control

In [None]:
# Manufacturing quality control with Six Sigma
target_diameter = 10.0  # mm
tolerance = 0.3  # ±0.3 mm acceptable
process_sigma = 0.1  # Standard deviation of manufacturing process

# Simulate bolt production
n_bolts = 10000
np.random.seed(42)
bolt_diameters = np.random.normal(target_diameter, process_sigma, n_bolts)

# Calculate defects
lower_limit = target_diameter - tolerance
upper_limit = target_diameter + tolerance
defects = np.sum((bolt_diameters < lower_limit) | (bolt_diameters > upper_limit))
defect_rate = defects / n_bolts * 100

# Create visualization
fig, ax = plt.subplots(figsize=(14, 8))

# Plot histogram
counts, bins, patches = ax.hist(bolt_diameters, bins=50, alpha=0.7, 
                                 color='steelblue', edgecolor='black')

# Color the defective regions (only bins lying entirely outside the spec limits)
for i, patch in enumerate(patches):
    if bins[i + 1] <= lower_limit or bins[i] >= upper_limit:
        patch.set_facecolor('red')
        patch.set_alpha(0.6)

# Add specification limits
ax.axvline(x=target_diameter, color='green', linestyle='-', linewidth=3, 
           label=f'Target: {target_diameter}mm', alpha=0.8)
ax.axvline(x=lower_limit, color='red', linestyle='--', linewidth=2.5, 
           label=f'Lower Spec Limit: {lower_limit}mm', alpha=0.8)
ax.axvline(x=upper_limit, color='red', linestyle='--', linewidth=2.5, 
           label=f'Upper Spec Limit: {upper_limit}mm', alpha=0.8)

# Shade acceptable region
ax.axvspan(lower_limit, upper_limit, alpha=0.1, color='green', 
           label='Acceptable Region (99.7% of parts)')

# Add ±sigma markers
for i in range(1, 4):
    ax.axvline(x=target_diameter + i*process_sigma, color='gray', 
               linestyle=':', linewidth=1.5, alpha=0.5)
    ax.axvline(x=target_diameter - i*process_sigma, color='gray', 
               linestyle=':', linewidth=1.5, alpha=0.5)
    ax.text(target_diameter + i*process_sigma, max(counts)*0.9, 
            f'+{i}σ', fontsize=10, ha='center')
    ax.text(target_diameter - i*process_sigma, max(counts)*0.9, 
            f'-{i}σ', fontsize=10, ha='center')

ax.set_xlabel('Bolt Diameter (mm)', fontsize=13, fontweight='bold')
ax.set_ylabel('Frequency', fontsize=13, fontweight='bold')
ax.set_title('FIGURE 9.4: Manufacturing Quality Control\n' +
             'Every bolt is slightly different; good manufacturing keeps variation within tolerance limits',
             fontsize=15, fontweight='bold', pad=20)
ax.legend(fontsize=11, loc='upper left')
ax.grid(True, alpha=0.3, axis='y')

# Add stats box
stats_text = (f'Process Capability:\n'
              f'Mean: {np.mean(bolt_diameters):.3f} mm\n'
              f'Std Dev: {np.std(bolt_diameters):.3f} mm\n'
              f'Defect Rate: {defect_rate:.2f}%\n'
              f'({defects} defects / {n_bolts} parts)')
ax.text(0.98, 0.97, stats_text, transform=ax.transAxes,
        fontsize=11, va='top', ha='right', fontweight='bold',
        bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.9))

# Add bolt/gear emoji
ax.text(0.02, 0.97, '⚙️', transform=ax.transAxes, 
        fontsize=40, va='top', ha='left', alpha=0.3)

plt.tight_layout()
plt.show()

print("\n⚙️ QUALITY CONTROL ENGINEER'S INSIGHT:\n")
print(f"Target: {target_diameter}mm ± {tolerance}mm")
print(f"Process Standard Deviation: {process_sigma}mm")
print("\nSix Sigma goal: spec limits at least 6σ from the process mean")
print(f"Our spec limits sit at ±{tolerance/process_sigma:.1f}σ")
print(f"\nDefect Rate: {defect_rate:.3f}% ({defects}/{n_bolts} bolts)")
print("\n✅ Quality control is applied statistics: factories use distributions all day!")

### Education: Standardized Testing

In [None]:
# Standardized test scores and percentiles
mean_score = 500
std_score = 100
n_students = 10000

np.random.seed(42)
test_scores = np.random.normal(mean_score, std_score, n_students)
test_scores = np.clip(test_scores, 200, 800)  # Typical test score range

# Example students
ananya_score = 650
ananya_percentile = stats.percentileofscore(test_scores, ananya_score)

fig, ax = plt.subplots(figsize=(14, 8))

# Plot distribution
counts, bins, patches = ax.hist(test_scores, bins=50, alpha=0.7, 
                                 color='skyblue', edgecolor='black')

# Mark Ananya's score
ax.axvline(x=ananya_score, color='red', linestyle='--', linewidth=3, 
           label=f"Ananya's Score: {ananya_score}", alpha=0.8)

# Shade area below Ananya's score
for i, patch in enumerate(patches):
    if bins[i] < ananya_score:
        patch.set_facecolor('lightgreen')
        patch.set_alpha(0.5)

# Add percentile markers
percentiles = [10, 25, 50, 75, 90]
percentile_scores = np.percentile(test_scores, percentiles)

for p, score in zip(percentiles, percentile_scores):
    ax.axvline(x=score, color='gray', linestyle=':', linewidth=1.5, alpha=0.5)
    ax.text(score, max(counts)*0.95, f'{p}th', fontsize=9, 
            ha='center', rotation=90, va='bottom')

# Annotation for Ananya
ax.annotate(f'{ananya_percentile:.1f}th Percentile\n(Scored better than {ananya_percentile:.1f}% of students)',
            xy=(ananya_score, max(counts)*0.6), xytext=(ananya_score+80, max(counts)*0.75),
            arrowprops=dict(arrowstyle='->', color='red', lw=2),
            fontsize=12, fontweight='bold',
            bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.9))

ax.set_xlabel('Test Score', fontsize=13, fontweight='bold')
ax.set_ylabel('Number of Students', fontsize=13, fontweight='bold')
ax.set_title('Standardized Testing: Percentile Ranks\n' +
             'Scaled scores are often designed to follow a roughly normal distribution',
             fontsize=15, fontweight='bold', pad=20)
ax.legend(fontsize=12)
ax.grid(True, alpha=0.3, axis='y')

# Stats box
stats_text = (f'Population Stats:\n'
              f'Mean: {mean_score}\n'
              f'Std Dev: {std_score}\n'
              f'N = {n_students:,} students')
ax.text(0.02, 0.97, stats_text, transform=ax.transAxes,
        fontsize=11, va='top', ha='left',
        bbox=dict(boxstyle='round', facecolor='lightblue', alpha=0.8))

plt.tight_layout()
plt.show()

print("\n📚 EDUCATION RESEARCHER'S EXPLANATION:\n")
print(f"Ananya's score: {ananya_score}")
print(f"Percentile rank: {ananya_percentile:.1f}th")
print(f"\nMeaning: She scored better than {ananya_percentile:.1f}% of all test-takers")
print("\nPercentiles are purely distributional:")
for p, score in zip(percentiles, percentile_scores):
    print(f"  {p}th percentile = Score of {score:.0f}")
print("\n⚠️ Controversial: Norm-referenced tests require some students to 'fail'")
print("The bell curve ensures people in the tails...")

---

## Part 6: School Assembly Presentation

The Friday before Diwali break, Principal Sahoo invited Ananya to present at school assembly. The entire school (600 students, all staff) gathered in the courtyard.

Ananya stood at the microphone, her heart pounding. She'd prepared a 10-minute talk: **"How Mathematics Helps Farmers."**

She started with the insurance denial letter, the problem that began everything. She showed them the bell curve of rainfall data. She explained how the insurance company looked at yearly totals while she looked at monthly distributions.

**"Same data, different questions, different answers,"** she said. **"The insurance company asked: Is the total normal? We asked: Is the monthly pattern normal? That difference changed everything."**

She showed them the validation chart: Uncle Bikram's year with three extreme months.
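
The difference between the two questions can be sketched with made-up numbers. The monthly means, standard deviations, and observed rainfall below are hypothetical, chosen only to show how a season can look ordinary in total while three individual months are extreme; they are not the chapter's rainfall data, and treating months as independent is a simplifying assumption.

```python
from math import sqrt

# Hypothetical long-run monthly rainfall (mm): (mean, standard deviation).
# These numbers are invented for illustration, not taken from the chapter's data.
normals = {"Jun": (200, 50), "Jul": (350, 60), "Aug": (330, 60), "Sep": (220, 50)}
# One hypothetical season: a dry start, a very wet July, then a dry finish.
observed = {"Jun": 80, "Jul": 500, "Aug": 340, "Sep": 100}

# Question 1: is the seasonal total unusual?
total_mean = sum(m for m, _ in normals.values())
total_sd = sqrt(sum(s**2 for _, s in normals.values()))  # assumes independent months
total_z = (sum(observed.values()) - total_mean) / total_sd

# Question 2: is any individual month unusual?
monthly_z = {month: (observed[month] - m) / s for month, (m, s) in normals.items()}
extreme_months = [month for month, z in monthly_z.items() if abs(z) > 2]

print(f"Seasonal total z-score: {total_z:+.2f}")
print(f"Months beyond 2 SD: {extreme_months}")
```

In this invented season the total sits within one standard deviation of normal, yet June, July, and September each fall more than two standard deviations from their own monthly averages.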

**"Mathematics isn't just for exams,"** she concluded. **"It's a tool for understanding the world and solving real problems. My uncle is a farmer who failed math in school. But this year, he used probability to plan his planting. He prepared for extreme weather because the model said extreme events were possible. The same math we do in class helped him protect his harvest."**

The students were surprisingly attentive. Silence held through her whole presentation.

When she finished, the courtyard erupted in applause. Not polite assembly clapping, but genuine enthusiasm.

Principal Sahoo took the microphone. **"This is what education should be. Not memorizing formulas for exams, but using knowledge to serve your community."**

---

## FIGURE 9.5: Patterns in Daily Life

In [None]:
# Create a collage showing distributions in multiple contexts
fig, axes = plt.subplots(2, 3, figsize=(18, 10))
fig.suptitle('FIGURE 9.5: Once You Learn to See Distributions, You See Them Everywhere', 
             fontsize=16, fontweight='bold', y=0.98)

np.random.seed(42)

# 1. Traffic patterns (Poisson)
ax = axes[0, 0]
traffic = np.random.poisson(12, 100)  # Cars per minute
ax.hist(traffic, bins=20, color='orange', alpha=0.7, edgecolor='black')
ax.set_title('🚗 Traffic Flow\n(Cars per minute)', fontsize=12, fontweight='bold')
ax.set_xlabel('Cars')
ax.set_ylabel('Frequency')
ax.grid(True, alpha=0.3, axis='y')

# 2. Wait times (Exponential)
ax = axes[0, 1]
wait_times = np.random.exponential(3, 200)  # Minutes
ax.hist(wait_times, bins=30, color='skyblue', alpha=0.7, edgecolor='black')
ax.set_title('⏱️ Cafeteria Wait Times\n(Minutes)', fontsize=12, fontweight='bold')
ax.set_xlabel('Minutes')
ax.set_ylabel('Frequency')
ax.grid(True, alpha=0.3, axis='y')

# 3. Test scores (Normal)
ax = axes[0, 2]
scores = np.random.normal(75, 12, 200)
ax.hist(scores, bins=25, color='lightgreen', alpha=0.7, edgecolor='black')
ax.set_title('📝 Test Scores\n(Percentage)', fontsize=12, fontweight='bold')
ax.set_xlabel('Score')
ax.set_ylabel('Frequency')
ax.grid(True, alpha=0.3, axis='y')

# 4. Heights (Normal)
ax = axes[1, 0]
heights = np.random.normal(165, 8, 200)  # cm
ax.hist(heights, bins=25, color='mediumpurple', alpha=0.7, edgecolor='black')
ax.set_title('📏 Student Heights\n(Centimeters)', fontsize=12, fontweight='bold')
ax.set_xlabel('Height (cm)')
ax.set_ylabel('Frequency')
ax.grid(True, alpha=0.3, axis='y')

# 5. Social media engagement (Power law)
ax = axes[1, 1]
likes = np.random.pareto(2, 200) * 10  # Power law
ax.hist(likes, bins=30, color='coral', alpha=0.7, edgecolor='black')
ax.set_title('📱 Post Likes\n(Count)', fontsize=12, fontweight='bold')
ax.set_xlabel('Likes')
ax.set_ylabel('Frequency')
ax.set_xlim([0, 100])
ax.grid(True, alpha=0.3, axis='y')

# 6. Daily steps (Normal-ish)
ax = axes[1, 2]
steps = np.random.normal(8000, 2000, 200)
steps = np.clip(steps, 0, None)
ax.hist(steps, bins=25, color='lightcoral', alpha=0.7, edgecolor='black')
ax.set_title('👟 Daily Steps\n(Count)', fontsize=12, fontweight='bold')
ax.set_xlabel('Steps')
ax.set_ylabel('Frequency')
ax.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.show()

print("\n🌍 PATTERNS EVERYWHERE:\n")
print("✓ Traffic counts are often modeled with a Poisson distribution")
print("✓ Wait times are often modeled with an exponential distribution")
print("✓ Test scores are often roughly normal")
print("✓ Human heights are roughly normal")
print("✓ Social media engagement often follows a power law")
print("✓ Daily activity levels are roughly normal")
print("\n💡 Once you understand distributions, you see the world differently!")

---

## Part 7: Looking Forward

Later that week, the three friends sat with Professor Mishra, discussing potential next projects.

"Kabir wants to model video game player behavior," Professor said.

"Yeah! Like, do players get better over time? Is there a learning curve? Can we predict who'll stick with a game?"

"Priya wants to study disease transmission patterns in schools."

"Right. Like, if one person has the flu, how does it spread through the classroom? Can we model it like we did with the SIR curves?"

"And Ananya's curious about climate change trends‚Äîlonger-term modeling."

"Temperature patterns over decades. Whether extreme weather is becoming more common. If our monsoon model would work for future predictions..."

Professor Mishra smiled. **"You've all caught the pattern-seeking bug. Good. The world needs more people who can think probabilistically."**

He paused. **"But there's one skill we haven't deeply explored yet: visualization."**

"What do you mean?" Ananya asked. "We've been making lots of graphs."

**"Data without good visualization is like a story without words,"** Professor said. **"You can have perfect analysis, but if you can't communicate it clearly‚Äîif people can't SEE the pattern you found‚Äîit doesn't matter. Let me show you..."**

---

## 🎯 TRY THIS: Find Distributions in Your World

### Distribution Hunt Challenge

In [None]:
# Template for students to track their own distributions
print("\n📊 DISTRIBUTION HUNT CHALLENGE\n")
print("Over the next week, track one of these and plot the distribution:\n")

examples = [
    "1. School cafeteria: Number of students buying lunch each day",
    "2. Bus arrival times: How many minutes early/late does your bus arrive?",
    "3. Phone notifications: Track hourly notifications for a week",
    "4. Homework time: Minutes spent on homework each day",
    "5. Basketball: Free throw success (track 50 attempts)",
    "6. Typing speed: Test yourself 20 times, track words per minute",
    "7. Reaction time: Use online game, record 50 trials",
    "8. Family dinner time: What time does dinner start each night?"
]

for example in examples:
    print(f"  {example}")

print("\nTemplate for your data:")
print("="*60)

# Example template
your_data = [45, 52, 48, 50, 46, 49, 51, 47, 53, 48]  # Replace with YOUR data!

print(f"\nYour Data: {your_data}")
print(f"\nStatistics:")
print(f"  Mean: {np.mean(your_data):.2f}")
print(f"  Standard Deviation: {np.std(your_data, ddof=1):.2f}")
print(f"  Range: {max(your_data) - min(your_data)}")

# Quick plot
fig, ax = plt.subplots(figsize=(10, 6))
ax.hist(your_data, bins=10, color='skyblue', edgecolor='black', alpha=0.7)
ax.axvline(x=np.mean(your_data), color='red', linestyle='--', linewidth=2,
           label=f'Mean: {np.mean(your_data):.1f}')
ax.set_xlabel('Value', fontsize=12, fontweight='bold')
ax.set_ylabel('Frequency', fontsize=12, fontweight='bold')
ax.set_title('My Distribution Hunt Results', fontsize=14, fontweight='bold')
ax.legend(fontsize=11)
ax.grid(True, alpha=0.3, axis='y')
plt.tight_layout()
plt.show()

print("\n❓ Questions to ask yourself:")
print("  - What shape is the distribution?")
print("  - Is it symmetric or skewed?")
print("  - Are there any outliers?")
print("  - What causes the variation?")
print("  - Could you predict future values?")

---

## 📚 Key Concepts from Chapter 9

### 1. Universal Application
- Probability thinking applies to nearly every domain
- Sports, genetics, disease, finance, manufacturing, education
- **Once you learn to see distributions, you can't unsee them**

### 2. Domain-Specific Examples

**Cricket Analytics:**
- Player profiles: Consistent, Volatile, Declining
- Team selection should consider variance, not just average

**Genetics:**
- Mendelian inheritance follows a binomial distribution
- Genetic counseling uses probability to assess risk

**Epidemiology:**
- SIR model: Susceptible, Infected, Recovered
- R₀ (basic reproduction number) is an expected value
- Epidemic curves follow predictable shapes

**Finance:**
- Stock returns follow distributions (with fat tails)
- Risk management = managing variance
- High volatility = high standard deviation
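
The link between daily and annual risk can be made concrete. Under the simplifying assumption that daily returns are independent with constant variance, the annual standard deviation is the daily one multiplied by the square root of the number of trading days. The daily figures below (1% and 3%, with 252 trading days) are the ones used in this chapter's stock simulation; the function name is ours.

```python
from math import sqrt

TRADING_DAYS = 252  # trading days per year, as in the stock simulation

def annualized_volatility(daily_sd, days=TRADING_DAYS):
    """Scale a daily standard deviation to a yearly one (assumes independent days)."""
    return daily_sd * sqrt(days)

vol_a = annualized_volatility(0.01)  # Stock A: 1% daily SD
vol_b = annualized_volatility(0.03)  # Stock B: 3% daily SD

print(f"Stock A annual volatility: {vol_a:.1%}")
print(f"Stock B annual volatility: {vol_b:.1%}")
```

Against an expected annual return of about 3%, a yearly standard deviation of roughly 16% (Stock A) or 48% (Stock B) shows why a single year's outcome can land far from the expectation.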

**Manufacturing:**
- Quality control uses normal distribution
- Six Sigma: keep spec limits at least ±6σ from the process mean
- Statistical process control
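
How the expected defect rate depends on the sigma level can be worked out from the normal distribution. This is a minimal sketch, assuming a perfectly centred, normally distributed process; the function name `defect_fraction` is ours. The bolt figures (tolerance 0.3 mm, process SD 0.1 mm) come from the Figure 9.4 simulation.

```python
from statistics import NormalDist

def defect_fraction(k_sigma):
    """Fraction of parts outside +/- k_sigma for a centred normal process."""
    return 2 * (1 - NormalDist().cdf(k_sigma))

# The chapter's bolt process: tolerance 0.3 mm, process SD 0.1 mm, so limits at 3 sigma
bolt_sigma_level = 0.3 / 0.1
expected_defects = defect_fraction(bolt_sigma_level)
print(f"Bolt process: {expected_defects:.2%} expected defects")

for k in (3, 4, 5, 6):
    print(f"+/-{k} sigma limits: {defect_fraction(k) * 1e6:,.3f} defects per million")
```

For limits at ±3σ the expected defect rate is about 0.27%, or roughly 27 bolts in 10,000, so the simulated rate in Figure 9.4 should land near that value. The widely quoted Six Sigma figure of 3.4 defects per million is higher than the centred ±6σ result because it allows the process mean to drift by 1.5σ.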

**Education:**
- Test scores designed to be normally distributed
- Percentile ranks are distributional statements
- Norm-referenced vs criterion-referenced testing
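
A percentile rank is just the cumulative probability at a score. The sketch below assumes an idealized normal score scale with the mean (500) and standard deviation (100) used in this chapter's testing example; the helper names are ours.

```python
from statistics import NormalDist

# Score scale used in the chapter's standardized-test example
score_scale = NormalDist(mu=500, sigma=100)

def percentile_rank(score, dist=score_scale):
    """Percentage of test-takers expected to score below `score`."""
    return 100 * dist.cdf(score)

def score_at_percentile(pct, dist=score_scale):
    """Score that `pct` percent of test-takers are expected to fall below."""
    return dist.inv_cdf(pct / 100)

ananya_rank = percentile_rank(650)   # 650 is 1.5 SD above the mean
cutoff_90 = score_at_percentile(90)

print(f"Score 650 -> {ananya_rank:.1f}th percentile")
print(f"90th percentile -> score of {cutoff_90:.0f}")
```

A score of 650 lands at about the 93rd percentile in theory. The simulated percentile in the testing example can differ slightly because it is computed from a finite random sample with scores clipped to the 200-800 range.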

### 3. Teaching as Learning
- Ananya presents at school assembly
- Explaining concepts to others deepens understanding
- Mathematics as service to community

### 4. Professional Validation
- Real professionals use these tools daily
- Same mathematics, different applications
- Pattern-seeking is a transferable skill

---

## 💭 Reflection Questions

1. What distributions have you noticed in your own life this week?

2. How does understanding variance change your view of sports statistics?

3. Why is the SIR model both powerful and limited?

4. What's the relationship between risk and standard deviation in investing?

5. Is it fair that standardized tests are designed to produce a bell curve? Why or why not?

6. How might probability thinking help you make better decisions?

---

## 📈 Next Chapter Preview

**Chapter 10: When Models Break**

Ananya learns that models, no matter how sophisticated, have limits. A surprising weather event challenges their rainfall predictions. Professor Mishra teaches them the most important lesson of all: understanding when NOT to trust your model.

"All models are wrong," he says, "but some are useful. The art is knowing which is which, and when to stop trusting your mathematics."

They'll explore:
- Fat-tailed distributions and black swan events
- Model uncertainty and confidence intervals
- The danger of overfitting
- When to trust experts vs. models
- Ethical responsibility of model-builders

---

**End of Chapter 9**

*"Once you learn to see distributions, you see them everywhere."* ‚Äî Ananya