# Chapter 10: Observational Studies vs. Designed Experiments

## Lesson Title: "Crash Course: Real-World Accidents vs. Controlled Safety Tests"

## High School Learning Goals
*   Distinguish between **Observational Studies** (watching what happens) and **Designed Experiments** (making things happen).
*   Identify **Lurking Variables** that can trick us in observational data.
*   Apply the **Four Principles of Experimental Design**: Control, Randomize, Replicate, and Block.
*   Understand **Confounding** by running a simulation of a car braking test.

## Common Core Standards
*   **HSS.ID.C.9**: Distinguish between correlation and causation.
*   **HSS.IC.B.3**: Recognize the purposes of and differences among sample surveys, experiments, and observational studies.

## Engineering Context
How do we know a car is safe? We use two methods:
1.  **Observational**: We analyze police reports from real accidents (NHTSA FARS data). We *cannot* control the weather, the driver's age, or the speed.
2.  **Experimental**: We take cars to a test track. We *control* the speed, *randomize* which brake system is used on each run, and *block* by road surface, so that a difference in stopping distance can be attributed to the brakes themselves.

### Part 1: Observational Study (The "Lurking Variable" Trap)
We will look at a dataset of traffic accidents. We might see that "Red Cars" have more accidents. Does Red Paint cause crashes? Or is there a **Lurking Variable** (like the fact that sports cars are often red, and sports cars are driven faster)?

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import ipywidgets as widgets
from IPython.display import display, clear_output

# ------------------------------------------------------------------------
# PART 1: GENERATE SYNTHETIC OBSERVATIONAL DATA
# ------------------------------------------------------------------------

def generate_crash_data(n=200):
    np.random.seed(42)
    
    # Hidden variable: Driver Style (0=Cautious, 1=Aggressive)
    driver_style = np.random.choice([0, 1], size=n, p=[0.7, 0.3])
    
    data = []
    for i in range(n):
        style = driver_style[i]
        
        # Aggressive drivers like Red Sports Cars
        if style == 1: 
            car_color = np.random.choice(['Red', 'Black', 'Silver'], p=[0.6, 0.3, 0.1])
            car_type = 'Sports Car'
            speed_excess = np.random.normal(20, 5) # Drive 20mph over limit
        else:
            car_color = np.random.choice(['White', 'Blue', 'Silver', 'Red'], p=[0.3, 0.3, 0.3, 0.1])
            car_type = 'Sedan'
            speed_excess = np.random.normal(0, 5) # Drive at limit
            
        # Accident Severity (0-10) depends on speed, NOT color
        severity = max(0, (speed_excess * 0.3) + np.random.normal(2, 1))
        
        data.append({
            'Car_Color': car_color,
            'Car_Type': car_type,
            'Speed_Over_Limit': round(speed_excess, 1),
            'Crash_Severity': round(severity, 1)
        })
    return pd.DataFrame(data)

df_obs = generate_crash_data()

print("Sample of Observational Data (Police Reports):")
display(df_obs.head())

# PLOT: Color vs Severity
plt.figure(figsize=(8, 4))
sns.barplot(data=df_obs, x='Car_Color', y='Crash_Severity', errorbar=None, palette='viridis')  # errorbar= replaces the deprecated ci= (seaborn >= 0.12)
plt.title("Observation: Crash Severity by Car Color")
plt.show()

print("\nDISCUSSION: It looks like RED and BLACK cars have worse crashes than WHITE or BLUE cars. Is it the paint? Or is it the Car Type (and how fast those cars are driven)?")
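
**Digging deeper (optional):** One common way to check for a lurking variable is to *stratify*: compare red and non-red cars separately within each car type. The cell below is a minimal sketch, not part of the lesson's dataset code. It rebuilds a larger synthetic sample with the same assumed structure; the seed, the sample size of 2,000, and the `Color_Group` column are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed so the sketch is repeatable
n = 2000  # larger than the lesson's 200 so group averages are more stable

# Hidden variable: aggressive drivers pick sports cars, favor red, and speed
aggressive = rng.random(n) < 0.3
car_type = np.where(aggressive, 'Sports Car', 'Sedan')
is_red = np.where(aggressive, rng.random(n) < 0.6, rng.random(n) < 0.1)
speed_excess = np.where(aggressive, rng.normal(20, 5, n), rng.normal(0, 5, n))

# Severity depends on speed only, never on color
severity = np.clip(0.3 * speed_excess + rng.normal(2, 1, n), 0, None)

df = pd.DataFrame({
    'Car_Type': car_type,
    'Color_Group': np.where(is_red, 'Red', 'Other'),
    'Crash_Severity': severity,
})

# Naive comparison: pools both car types together
naive = df.groupby('Color_Group')['Crash_Severity'].mean()
naive_gap = naive['Red'] - naive['Other']

# Stratified comparison: red vs. other *within* each car type
strata = df.groupby(['Car_Type', 'Color_Group'])['Crash_Severity'].mean().unstack()
within_gap = strata['Red'] - strata['Other']

print(f"Pooled red-vs-other gap in severity: {naive_gap:.2f}")
print("Gap within each car type:")
print(within_gap.round(2))
```

If paint color has no effect of its own, the gap inside each car type should shrink toward zero even though the pooled gap is large. That pattern is the signature of a lurking variable.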

### Part 2: Designed Experiment (The Brake Test)
You are the Lead Engineer at the Test Track. We need to measure the stopping distance of two new braking systems: **System A** and **System B**.

**The Challenge:**
We have 20 test runs. The track surface (Dry vs. Wet) has a large effect on stopping distance, so we treat it as a **blocking** variable.

*   **Bad Design**: If we test System A only on Dry roads and System B only on Wet roads, the brake system is *confounded* with the road surface. We cannot separate the two effects.
*   **Good Design**: We must RANDOMIZE which system is used on which run, or ensure each system is tested equally often on both surfaces.
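
The "equal testing on both surfaces" idea can be sketched as a randomized block assignment: inside each surface block, shuffle an equal number of A and B labels. This is only an illustration. `assign_within_blocks` is a hypothetical helper written for this sketch, and it assumes each block's size divides evenly among the treatments.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)  # arbitrary seed so the sketch is repeatable

# 20 runs: the first 10 on a dry track, the last 10 on a wet track
runs = pd.DataFrame({
    'Run_ID': np.arange(1, 21),
    'Road_Condition': ['Dry'] * 10 + ['Wet'] * 10,
})

def assign_within_blocks(df, block_col, treatments, rng):
    """Give every block an equal, randomly ordered share of each treatment.

    Assumes the block size is a multiple of the number of treatments.
    """
    assigned = pd.Series(index=df.index, dtype=object)
    for _, block in df.groupby(block_col):
        reps = len(block) // len(treatments)
        labels = np.repeat(treatments, reps)      # e.g. 5 x A then 5 x B
        assigned.loc[block.index] = rng.permutation(labels)  # shuffle inside the block
    return assigned

runs['Brake_System'] = assign_within_blocks(
    runs, 'Road_Condition', ['System A', 'System B'], rng
)

# Each system should appear the same number of times on each surface
balance = pd.crosstab(runs['Road_Condition'], runs['Brake_System'])
print(balance)
```

Unlike a coin flip per run, this assignment guarantees the balance, so no system can end up on mostly Dry or mostly Wet runs by chance.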

In [None]:
# ------------------------------------------------------------------------
# PART 2: INTERACTIVE EXPERIMENT WIDGET
# ------------------------------------------------------------------------

style = {'description_width': 'initial'}

randomize_check = widgets.Checkbox(
    value=False, 
    description='Apply Randomization',
    style=style
)

output_exp = widgets.Output()

def run_brake_experiment(b):
    is_randomized = randomize_check.value
    
    with output_exp:
        clear_output(wait=True)
        
        # Setup the "environment"
        # 20 Test Runs. First 10 are Dry, Last 10 are Wet (Nature decides this)
        conditions = ['Dry'] * 10 + ['Wet'] * 10
        
        # ASSIGN TREATMENTS (Brake System A vs B)
        if not is_randomized:
            # BAD SCIENCE: Lazy engineer tests A in the morning (Dry) and B in the afternoon (Wet)
            treatments = ['System A'] * 10 + ['System B'] * 10
            design_score = "POOR (Confounded)"
        else:
            # GOOD SCIENCE: Randomly assign A or B to each run
            treatments = np.random.choice(['System A', 'System B'], size=20, p=[0.5, 0.5])
            design_score = "EXCELLENT (Randomized)"
            
        # SIMULATE RESULTS (Stopping Distance in meters)
        # Truth: System A is slightly better (shorter distance) than B.
        # Truth: Wet roads add massive distance.
        results = []
        for i in range(20):
            base_dist = 40 # avg stopping distance
            
            # Treatment Effect
            if treatments[i] == 'System A':
                dist = base_dist - 5
            else:
                dist = base_dist
                
            # Blocking Factor Effect
            if conditions[i] == 'Wet': dist += 20 
            
            # Random noise
            dist += np.random.normal(0, 2)
            
            results.append({
                'Run_ID': i+1,
                'Road_Condition': conditions[i],
                'Brake_System': treatments[i],
                'Stop_Distance_m': dist
            })
            
        df_exp = pd.DataFrame(results)
        
        # VISUALIZE
        plt.figure(figsize=(10,6))
        sns.boxplot(data=df_exp, x='Brake_System', y='Stop_Distance_m', hue='Road_Condition')
        plt.title(f"Experiment Results (Design: {design_score})")
        plt.ylabel("Stopping Distance (meters) [Lower is Better]")
        plt.show()
        
        # ANALYSIS TEXT
        avg_a = df_exp[df_exp['Brake_System']=='System A']['Stop_Distance_m'].mean()
        avg_b = df_exp[df_exp['Brake_System']=='System B']['Stop_Distance_m'].mean()
        
        print(f"Average Stop Dist A: {avg_a:.1f}m")
        print(f"Average Stop Dist B: {avg_b:.1f}m")
        
        if not is_randomized:
            print("\n⚠ CRITICAL ERROR: System A looks amazing (about 35m) and System B looks terrible (about 60m).")
            print("But wait! Look at the colors. System A was ONLY tested on Dry roads!")
            print("We cannot tell if the difference is the Brakes or the Rain. This is CONFOUNDING.")
        else:
            print("\n✅ SUCCESS: Because we randomized, both systems faced Wet and Dry roads.")
            print("Compare boxes of the same color (same road condition): System A should be slightly better on each surface (lower median).")

btn = widgets.Button(description="Run Experiment")
btn.on_click(run_brake_experiment)

display(widgets.HBox([randomize_check, btn]))
display(output_exp)
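
**Putting numbers on it (optional):** The sketch below compares two ways of estimating the brake effect. It uses the same assumed truths as the simulation above (System A saves 5 m, wet roads add 20 m, noise with a standard deviation of 2 m). The function names `simulate`, `naive_effect`, and `within_block_effect` are illustrative, and the balanced design shown here is one possible choice, not the only valid one.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)  # arbitrary seed so the sketch is repeatable

def simulate(treatments, conditions, rng):
    """Stopping distances under the assumed truths: A saves 5 m, Wet adds 20 m."""
    t = np.asarray(treatments)
    c = np.asarray(conditions)
    dist = 40 - 5 * (t == 'System A') + 20 * (c == 'Wet') + rng.normal(0, 2, len(t))
    return pd.DataFrame({'Brake_System': t, 'Road_Condition': c, 'Stop_Distance_m': dist})

def naive_effect(df):
    """Mean of A minus mean of B, ignoring the road surface."""
    means = df.groupby('Brake_System')['Stop_Distance_m'].mean()
    return means['System A'] - means['System B']

def within_block_effect(df):
    """A-minus-B difference computed inside each surface, then averaged."""
    table = (df.groupby(['Road_Condition', 'Brake_System'])['Stop_Distance_m']
               .mean().unstack())
    return (table['System A'] - table['System B']).mean()

conditions = ['Dry'] * 10 + ['Wet'] * 10

# Confounded design: A only on Dry, B only on Wet
confounded = simulate(['System A'] * 10 + ['System B'] * 10, conditions, rng)

# Balanced design: 5 A and 5 B shuffled within each surface
labels = np.concatenate([
    rng.permutation(['System A'] * 5 + ['System B'] * 5) for _ in range(2)
])
balanced = simulate(labels, conditions, rng)

naive_confounded = naive_effect(confounded)
blocked_confounded = within_block_effect(confounded)
blocked_balanced = within_block_effect(balanced)

print(f"Confounded design, naive estimate:        {naive_confounded:.1f} m")
print(f"Confounded design, within-block estimate: {blocked_confounded}")
print(f"Balanced design, within-block estimate:   {blocked_balanced:.1f} m")
```

In the confounded design the within-block estimate comes out as `nan`: no surface has both systems, so there is nothing to compare. The naive estimate mixes the brake effect with the rain effect, while the balanced design should land near the assumed 5 m advantage.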

## Student Assessment

### Discussion Questions
1.  **Observational vs Experimental**: In Part 1, why couldn't we just tell all the Red Car drivers to drive slower to prove our theory? Why does that make it an *Observational* study?
2.  **Confounding**: In the "Bad Design" of Part 2, System A had a stopping distance of ~35m and System B had ~60m. Why was it wrong to conclude "System A is nearly twice as good"?
3.  **Ethics**: Imagine we want to study the effect of "Texting while Driving" on fatalities. Can we run a *Designed Experiment* for this? Why or why not? (Hint: Can we assign 1000 people to text and crash?)

### Challenge Problem: The Vaccine Trial
A pharmaceutical company wants to test a new vaccine. 
*   **Scenario**: They give the vaccine to 1,000 young, healthy volunteers. They give the placebo to 1,000 elderly patients in a nursing home.
*   **Result**: The vaccine group had 0 illnesses. The placebo group had 50.
*   **Task**: Identify the **Confounding Variable** and propose a **Randomized Block Design** to fix this study.