# 🧪 Experimental Design
## The Foundation of Rigorous Ecological Research

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/The-Pattern-Hunter/interactive-ecology-biometry/blob/main/unit-4-biometry/notebooks/07_experimental_design.ipynb)

---

> *"To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of."* - R.A. Fisher

### 🎯 Learning Objectives

By the end of this notebook, you will:
1. Understand the **three pillars** of good experimental design
2. Distinguish **observational** vs **experimental** studies
3. Apply **randomization** techniques properly
4. Implement **replication** for statistical power
5. Use **controls** to eliminate confounding
6. Design **factorial experiments** efficiently
7. Apply **blocking** to reduce variability
8. Calculate **required sample sizes** with power analysis
9. Recognize and avoid **pseudoreplication**

In [None]:
# Setup
!pip install numpy pandas plotly matplotlib scipy -q

import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from scipy import stats
import itertools

# Set random seed for reproducibility
np.random.seed(42)

print("✅ Ready to design experiments!")
print("🧪 Let's learn the principles of rigorous research!")

---

## 📚 Part 1: What is Experimental Design?

### Definition:

**Experimental Design**: The structure and strategy for conducting a study to answer a research question while controlling for confounding variables and maximizing statistical power.

### Why Does Design Matter?

**Bad design = Wasted effort**

Even with:
- Perfect execution
- Careful measurements
- Advanced statistics

**You cannot fix a poorly designed experiment!**

### The Three Pillars of Experimental Design:

```
    GOOD EXPERIMENT
         /|\
        / | \
       /  |  \
      /   |   \
     /    |    \
    /_____|_____\
   /             \
  /_______________\
 
 REPLICATION  RANDOMIZATION  CONTROL
```

#### **1. Replication** 🔄
- **Multiple observations** per treatment
- **Why**: Estimate variability, increase power
- **How many**: Power analysis (we'll calculate!)

#### **2. Randomization** 🎲
- **Random assignment** to treatments
- **Why**: Eliminate bias, distribute unknown confounds
- **How**: Computer-generated random numbers

#### **3. Control** ⚖️
- **Control group** receives no treatment
- **Why**: Establish baseline, isolate treatment effect
- **Types**: Negative control, positive control, procedural control

### Observational vs Experimental Studies:

| Feature | Observational | Experimental |
|---------|---------------|-------------|
| **Manipulation** | None | Yes (treatments applied) |
| **Random assignment** | Not possible (no treatments to assign) | Required |
| **Causation** | Hard to establish (confounding) | Can establish |
| **Control** | Limited | Strong |
| **Example** | Survey biodiversity | Test fertilizer effects |
| **Strength** | Natural conditions | Causal inference |
| **Weakness** | Confounding | Artificial conditions |

---

## 🎲 Part 2: Randomization - The Gold Standard

### Why Randomize?

**Problem**: Unknown confounding variables

**Example**:
```
BAD Design (No Randomization):
Field Layout:
┌─────────────────┐
│ All Controls    │ ← North side (shady, moist)
├─────────────────┤
│ All Treatments  │ ← South side (sunny, dry)
└─────────────────┘

Problem: Can't tell if treatment effect or location effect!
```

**Solution**: Randomize!
```
GOOD Design (Randomized):
┌─────────────────┐
│ C T C T C C T T │ ← Mixed throughout
│ T C T C T T C C │ ← Random assignment
└─────────────────┘

Result: Location effects averaged out!
```

### How to Randomize:

**Step 1**: List all experimental units
```
Plot 1, Plot 2, Plot 3, ..., Plot 20
```

**Step 2**: Generate random assignments
```python
treatments = np.random.permutation(['Control']*10 + ['Treatment']*10)
```

**Step 3**: Apply according to random list
```
Plot 1 → Treatment
Plot 2 → Control
Plot 3 → Treatment
...
```

### Types of Randomization:

#### **1. Complete Randomization** (CRD)
- Each unit independently assigned
- Simplest design
- Use when units are homogeneous

#### **2. Blocked Randomization** (RCBD)
- Group similar units into blocks
- Randomize within blocks
- Use when units are heterogeneous

#### **3. Stratified Randomization**
- Ensure balance across strata
- Common in clinical trials
- Use for known confounders
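
Blocked randomization can be sketched in a few lines. The example below is only an illustration: the block names, plot IDs, and the use of NumPy's `default_rng` generator are assumptions, not part of the original notebook. It shuffles a balanced set of labels separately inside each block, so every block ends up with the same number of controls and treatments.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded generator so the layout is reproducible

# Hypothetical layout: 3 blocks along a gradient, 4 plots per block
blocks = {
    "North": ["N1", "N2", "N3", "N4"],
    "Mid":   ["M1", "M2", "M3", "M4"],
    "South": ["S1", "S2", "S3", "S4"],
}

assignment = {}
for block_name, plots in blocks.items():
    # Balanced labels for this block, shuffled independently of other blocks
    labels = ["Control"] * (len(plots) // 2) + ["Treatment"] * (len(plots) // 2)
    shuffled = rng.permutation(labels)
    for plot, label in zip(plots, shuffled):
        assignment[plot] = str(label)

for plot, label in assignment.items():
    print(plot, "->", label)
```

Complete randomization would instead shuffle one label list across all 12 plots, which can leave a block with mostly controls or mostly treatments by chance.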

In [None]:
# Demonstrate randomization
def visualize_randomization(n_plots=20, seed=42):
    """
    Compare systematic vs random assignment
    """
    np.random.seed(seed)
    
    # Create plot grid (4 rows x 5 columns)
    rows, cols = 4, 5
    total_plots = rows * cols
    
    # BAD: Systematic assignment (first half control, second half treatment)
    systematic = ['Control'] * (total_plots // 2) + ['Treatment'] * (total_plots // 2)
    
    # GOOD: Random assignment
    randomized = np.random.permutation(systematic)
    
    # Convert to grid
    sys_grid = np.array(systematic).reshape(rows, cols)
    rand_grid = np.array(randomized).reshape(rows, cols)
    
    # Create numeric versions for heatmap
    sys_numeric = np.where(sys_grid == 'Control', 0, 1)
    rand_numeric = np.where(rand_grid == 'Control', 0, 1)
    
    # Create visualization
    fig = make_subplots(
        rows=1, cols=2,
        subplot_titles=('❌ BAD: Systematic (Confounded)', '✅ GOOD: Randomized'),
        horizontal_spacing=0.15
    )
    
    # Systematic
    fig.add_trace(
        go.Heatmap(
            z=sys_numeric,
            text=sys_grid,
            texttemplate='%{text}',
            textfont={"size": 10},
            colorscale=[[0, 'lightblue'], [1, 'lightcoral']],
            showscale=False,
            hovertemplate='Row: %{y}<br>Col: %{x}<br>%{text}<extra></extra>'
        ),
        row=1, col=1
    )
    
    # Randomized
    fig.add_trace(
        go.Heatmap(
            z=rand_numeric,
            text=rand_grid,
            texttemplate='%{text}',
            textfont={"size": 10},
            colorscale=[[0, 'lightblue'], [1, 'lightcoral']],
            showscale=False,
            hovertemplate='Row: %{y}<br>Col: %{x}<br>%{text}<extra></extra>'
        ),
        row=1, col=2
    )
    
    # Update layout
    fig.update_xaxes(title_text="Column", showticklabels=False)
    fig.update_yaxes(title_text="Row", showticklabels=False, autorange='reversed')
    
    fig.update_layout(
        title="🎲 The Power of Randomization<br><sub>Imagine north (top) is shady, south (bottom) is sunny</sub>",
        height=400,
        template='plotly_white'
    )
    
    return fig, systematic, randomized

# Run visualization
fig, systematic, randomized = visualize_randomization()
fig.show()

print("\n🎲 Randomization Analysis:\n")
print("   ❌ SYSTEMATIC (Bad):")
print("      • All controls in rows 1-2 (north, shady)")
print("      • All treatments in rows 3-4 (south, sunny)")
print("      • CONFOUNDED: Can't separate treatment from location!")
print("\n   ✅ RANDOMIZED (Good):")
print("      • Controls and treatments mixed throughout")
print("      • Location effects averaged across groups")
print("      • Can isolate true treatment effect")
print("\n💡 Key Principle:")
print("   Randomization distributes unknown confounds evenly")
print("   across treatment groups, allowing causal inference!")

---

## 🔄 Part 3: Replication - How Many Do I Need?

### What is Replication?

**Replication**: Independent experimental units receiving the same treatment

### Why Replicate?

**1. Estimate Variability**
```
n = 1:  No idea if result is typical
n = 3:  Some sense of variation
n = 10: Good estimate of variation
n = 30: Excellent estimate
```

**2. Increase Statistical Power**
```
Power = Probability of detecting real effect

More replicates → Higher power
```

**3. Represent Population**
```
Sample → Estimate population parameters
Larger n → Better estimates
```

### Pseudoreplication: The Cardinal Sin

**Pseudoreplication**: Treating non-independent observations as independent replicates

#### **Example 1: Temporal Pseudoreplication**
```
❌ BAD:
Measure same 5 plants 10 times each
Claim n = 50

Problem: 10 measurements per plant are NOT independent!

✅ GOOD:
True n = 5 plants
Average the 10 measurements per plant
```

#### **Example 2: Spatial Pseudoreplication**
```
❌ BAD:
Apply fertilizer to one pond
Sample 20 locations in that pond
Claim n = 20

Problem: All 20 samples from ONE experimental unit!

✅ GOOD:
True n = 1 pond (cannot test treatment effect!)
Need multiple ponds as replicates
```
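
One way to avoid temporal pseudoreplication in practice is to collapse repeated measurements to a single value per experimental unit before testing. The sketch below uses made-up data (5 hypothetical plants, 10 measurements each, arbitrary means and noise levels) purely to show the bookkeeping.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical data: 5 plants, each measured 10 times
n_plants, n_measurements = 5, 10
plant_means = rng.normal(20, 3, n_plants)  # plant-to-plant variation
records = [
    {"plant": f"plant_{i + 1}", "height": m + rng.normal(0, 0.5)}
    for i, m in enumerate(plant_means)
    for _ in range(n_measurements)
]
measurements = pd.DataFrame(records)

# Collapse to one value per experimental unit (the plant)
per_plant = measurements.groupby("plant")["height"].mean()

print("Rows in raw table:", len(measurements))
print("True replicates:  ", len(per_plant))
```

Any test of a treatment effect would then be run on `per_plant`, not on the raw rows. A mixed model with plant as a random effect is an alternative that keeps the individual measurements.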

### Sample Size Calculation

**Required sample size depends on**:

1. **Effect size (d)**: How big is the difference?
   ```
   d = (μ₁ - μ₂) / σ
   ```
   - Small: d = 0.2
   - Medium: d = 0.5
   - Large: d = 0.8

2. **Significance level (α)**: Usually 0.05
   - Probability of Type I error (false positive)

3. **Power (1 - β)**: Usually 0.80
   - Probability of detecting real effect
   - β = probability of a Type II error (false negative)

4. **Variability (σ)**: How much noise?
   - Get from pilot studies or literature

### Rule of Thumb:

**Approximate n per group** (two-sample t-test, α = 0.05, power = 0.80):
- **Large effect** (d ≈ 0.8): n ≈ 25 per group
- **Medium effect** (d ≈ 0.5): n ≈ 65 per group
- **Small effect** (d ≈ 0.2): n ≈ 400 per group

**Better**: Do formal power analysis!
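
A sample-size figure can be sanity-checked by simulation. The sketch below is an illustration under stated assumptions: normally distributed data with unit variance, a two-sample t-test, and a hypothetical helper named `simulated_power`. It estimates power as the fraction of simulated experiments that reach p < α.

```python
import numpy as np
from scipy import stats

def simulated_power(n_per_group, effect_size, alpha=0.05, n_sim=2000, seed=1):
    """Estimate power of a two-sample t-test by Monte Carlo simulation.

    Assumes normal data with standard deviation 1 in both groups,
    so the mean difference equals Cohen's d.
    """
    rng = np.random.default_rng(seed)
    control = rng.normal(0.0, 1.0, size=(n_sim, n_per_group))
    treatment = rng.normal(effect_size, 1.0, size=(n_sim, n_per_group))
    # One t-test per row, i.e. per simulated experiment
    _, p_values = stats.ttest_ind(treatment, control, axis=1)
    return float(np.mean(p_values < alpha))

# Medium effect (d = 0.5) with 64 units per group
power_hat = simulated_power(n_per_group=64, effect_size=0.5)
print(f"Estimated power: {power_hat:.2f}")
```

The estimate carries Monte Carlo error of roughly a percentage point at `n_sim=2000`; raising `n_sim` tightens it. The same function can be rerun with a smaller `n_per_group` to see how quickly power drops.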

In [None]:
# Power analysis: Sample size calculation
def calculate_sample_size(effect_size, alpha=0.05, power=0.80):
    """
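    Approximate per-group sample size for a two-sample, two-tailed t-test.
    Uses the normal approximation n = 2 * ((z_alpha + z_beta) / d)^2 for
    equal groups; the exact t-based n is typically one or two units larger.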
    """
    # Critical values
    z_alpha = stats.norm.ppf(1 - alpha/2)  # Two-tailed
    z_beta = stats.norm.ppf(power)
    
    # Sample size per group
    n = ((z_alpha + z_beta) / effect_size) ** 2 * 2
    
    return int(np.ceil(n))

# Calculate for different effect sizes
effect_sizes = np.linspace(0.2, 2.0, 50)
sample_sizes_80 = [calculate_sample_size(d, power=0.80) for d in effect_sizes]
sample_sizes_90 = [calculate_sample_size(d, power=0.90) for d in effect_sizes]

# Visualize
fig = go.Figure()

fig.add_trace(go.Scatter(
    x=effect_sizes, y=sample_sizes_80,
    mode='lines',
    line=dict(width=3, color='blue'),
    name='Power = 0.80 (80%)'
))

fig.add_trace(go.Scatter(
    x=effect_sizes, y=sample_sizes_90,
    mode='lines',
    line=dict(width=3, color='red'),
    name='Power = 0.90 (90%)'
))

# Mark Cohen's conventions
for d, label in [(0.2, 'Small'), (0.5, 'Medium'), (0.8, 'Large')]:
    n_80 = calculate_sample_size(d, power=0.80)
    fig.add_annotation(
        x=d, y=n_80,
        text=f"{label}<br>n={n_80}",
        showarrow=True,
        arrowhead=2,
        ax=0, ay=-40,
        bgcolor='lightyellow',
        bordercolor='black'
    )

fig.update_layout(
    title="🔄 Required Sample Size vs Effect Size<br><sub>α = 0.05 (two-tailed t-test)</sub>",
    xaxis_title="Effect Size (Cohen's d)",
    yaxis_title="Sample Size per Group (n)",
    height=600,
    template='plotly_white',
    yaxis_type='log'
)

fig.show()

# Print table
print("\n🔄 Sample Size Requirements (per group):\n")
print("Effect Size | Description | Power=0.80 | Power=0.90")
print("------------|-------------|------------|------------")
for d, desc in [(0.2, 'Small'), (0.5, 'Medium'), (0.8, 'Large'), (1.0, 'Very Large')]:
    n_80 = calculate_sample_size(d, power=0.80)
    n_90 = calculate_sample_size(d, power=0.90)
    print(f"   {d:4.1f}     | {desc:11} |    {n_80:3d}     |    {n_90:3d}")

print("\n💡 Key Insights:")
print("   • Larger effect sizes need FEWER samples")
print("   • Small effects require HUNDREDS of samples!")
print("   • Higher power (90% vs 80%) requires more samples")
print("   • Relationship is NONLINEAR: n scales with 1/d² (halve d, quadruple n)")
print("\n⚠️ Important:")
print("   Do power analysis BEFORE collecting data!")
print("   Otherwise risk: Too few → waste effort, miss effect")
print("                   Too many → waste resources")

---

## ⚖️ Part 4: Controls - The Baseline for Comparison

### Why Do We Need Controls?

**Without controls, you can't know**:
- Would the outcome have occurred anyway?
- Is the treatment causing the effect?
- How big is the treatment effect?

### Types of Controls:

#### **1. Negative Control**
**Definition**: Receives no treatment (baseline)

**Example**: Plants with no fertilizer
```
Treatment: Fertilizer added
Control:   No fertilizer (but same soil, water, light)
```

#### **2. Positive Control**
**Definition**: Known to produce the expected effect

**Example**: Standard antibiotic in drug test
```
Treatment: New antibiotic (unknown)
Positive Control: Penicillin (known to work)
Negative Control: No antibiotic
```

**Why**: Confirms experimental system is working!

#### **3. Procedural Control**
**Definition**: Receives same handling except treatment

**Example**: Injection study
```
Treatment: Drug injection
Control:   Saline injection (same needle stick, no drug)
```

**Why**: Controls for stress of procedure itself

#### **4. Vehicle Control**
**Definition**: Receives carrier substance without active ingredient

**Example**: Drug dissolved in oil
```
Treatment: Drug in oil
Control:   Just oil (no drug)
```

### Common Control Mistakes:

❌ **No control group**
```
"All plants grew after fertilizer!"
→ Maybe they would have grown anyway?
```

❌ **Different handling**
```
Treatment: In greenhouse, daily watering
Control:   Outside, weekly watering
→ Too many differences!
```

❌ **Historical control**
```
Compare to last year's data
→ Conditions may have changed!
```

✅ **Good control**:
```
Treatment: Fertilizer + daily water in greenhouse
Control:   No fertilizer + daily water in greenhouse
→ Only ONE difference: the fertilizer!
```

In [None]:
# Demonstrate importance of controls
def simulate_control_scenarios(n=30, seed=42):
    """
    Simulate three scenarios showing why controls matter
    """
    np.random.seed(seed)
    
    # Scenario 1: Treatment has real effect
    control_1 = np.random.normal(10, 2, n)  # Mean=10
    treatment_1 = np.random.normal(15, 2, n)  # Mean=15 (real effect!)
    
    # Scenario 2: Treatment has NO effect (would have grown anyway)
    control_2 = np.random.normal(10, 2, n)  # Mean=10
    treatment_2 = np.random.normal(10, 2, n)  # Mean=10 (no effect)
    
    # Scenario 3: Natural growth (both increase over time)
    control_3 = np.random.normal(15, 2, n)  # Mean=15 (natural growth)
    treatment_3 = np.random.normal(20, 2, n)  # Mean=20 (growth + treatment)
    
    # Create visualization
    fig = make_subplots(
        rows=1, cols=3,
        subplot_titles=(
            'Scenario 1: Treatment Works',
            'Scenario 2: No Effect',
            'Scenario 3: Natural Growth + Treatment'
        ),
        horizontal_spacing=0.1
    )
    
    scenarios = [
        (control_1, treatment_1, 1, 1),
        (control_2, treatment_2, 1, 2),
        (control_3, treatment_3, 1, 3)
    ]
    
    for control, treatment, row, col in scenarios:
        # Control
        fig.add_trace(
            go.Box(y=control, name='Control',
                   marker_color='lightblue',
                   showlegend=(col == 1)),
            row=row, col=col
        )
        
        # Treatment
        fig.add_trace(
            go.Box(y=treatment, name='Treatment',
                   marker_color='lightcoral',
                   showlegend=(col == 1)),
            row=row, col=col
        )
        
        # Group means (used for the annotated difference below)
        control_mean = np.mean(control)
        treatment_mean = np.mean(treatment)
        
        # Statistical test
        t_stat, p_value = stats.ttest_ind(treatment, control)
        
        # Add annotation
        effect = treatment_mean - control_mean
        fig.add_annotation(
            text=f"Difference: {effect:.1f}<br>p = {p_value:.3f}",
            x=0.5, y=max(treatment.max(), control.max()) * 1.1,
            xref=f'x{col if col > 1 else ""}', yref=f'y{col if col > 1 else ""}',
            showarrow=False,
            bgcolor='lightyellow' if p_value < 0.05 else 'lightgray',
            bordercolor='black'
        )
    
    fig.update_yaxes(title_text="Plant Height (cm)")
    
    fig.update_layout(
        title="⚖️ Why Controls Are Essential<br><sub>Each scenario n=30 per group</sub>",
        height=500,
        template='plotly_white'
    )
    
    return fig

# Run simulation
fig = simulate_control_scenarios()
fig.show()

print("\n⚖️ Control Group Analysis:\n")
print("   📊 Scenario 1: Treatment Works")
print("      • Control: Mean = 10 cm")
print("      • Treatment: Mean = 15 cm")
print("      • Conclusion: Treatment causes +5 cm growth")
print("\n   📊 Scenario 2: No Effect")
print("      • Control: Mean = 10 cm")
print("      • Treatment: Mean = 10 cm")
print("      • Conclusion: Treatment has no effect")
print("      • Without control: Would think it 'worked'!")
print("\n   📊 Scenario 3: Natural Growth + Treatment")
print("      • Control: Mean = 15 cm (natural growth)")
print("      • Treatment: Mean = 20 cm")
print("      • Conclusion: Treatment adds +5 cm beyond natural growth")
print("      • Without control: Would overestimate effect!")
print("\n💡 The Lesson:")
print("   ALWAYS include a control group!")
print("   It's the only way to isolate the treatment effect.")

---

## 🎛️ Part 5: Factorial Designs - Testing Multiple Factors

### What is a Factorial Design?

**Factorial Design**: Test 2+ factors simultaneously in all combinations

### 2×2 Factorial Example:

**Question**: Effects of light and water on plant growth?

**Factors**:
- **Factor A**: Light (Low, High)
- **Factor B**: Water (Low, High)

**Design**:
```
              Water
           Low    High
      ┌────────┬───────┐
  Low │   A    │   B   │
Light │  L-W-  │  L-W+ │
      ├────────┼───────┤
 High │   C    │   D   │
      │  L+W-  │  L+W+ │
      └────────┴───────┘

4 treatment combinations!
```

### Advantages of Factorial Designs:

#### **1. Test Interactions**
```
Interaction: Effect of one factor depends on level of another

Example:
High water helps ONLY with high light
→ Light × Water interaction!
```

#### **2. More Efficient**
```
One-factor-at-a-time: Need 2 separate experiments
Factorial: Get both answers in 1 experiment!
```

#### **3. Realistic**
```
Nature varies multiple factors simultaneously
Factorial design mimics reality
```

### Types of Effects:

#### **Main Effect**:
```
Average effect of one factor across all levels of other factors

Main effect of Light = 
  (High Light Average) - (Low Light Average)
```

#### **Interaction Effect**:
```
Effect of one factor changes at different levels of another

Simple patterns:
  No interaction: Parallel lines
  Interaction: Non-parallel lines (they may or may not cross)
```
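
The two definitions above can be made concrete with a small table of cell means. The numbers below are hypothetical and chosen only for illustration; the point is the arithmetic, not the values.

```python
import pandas as pd

# Hypothetical cell means (cm) for a 2x2 Light x Water experiment.
# Rows are Light levels, columns are Water levels.
cell_means = pd.DataFrame(
    {"Low": [10.0, 15.0], "High": [12.0, 25.0]},
    index=pd.Index(["Low", "High"], name="Light"),
)
cell_means.columns.name = "Water"

# Main effect: difference between marginal means of one factor
light_main = cell_means.loc["High"].mean() - cell_means.loc["Low"].mean()
water_main = cell_means["High"].mean() - cell_means["Low"].mean()

# Interaction: how much the water effect changes between light levels
water_at_low_light = cell_means.loc["Low", "High"] - cell_means.loc["Low", "Low"]
water_at_high_light = cell_means.loc["High", "High"] - cell_means.loc["High", "Low"]
interaction = water_at_high_light - water_at_low_light

print(f"Main effect of Light: {light_main:.1f} cm")
print(f"Main effect of Water: {water_main:.1f} cm")
print(f"Interaction (difference of differences): {interaction:.1f} cm")
```

An interaction of zero would mean the water effect is the same at both light levels, which is exactly the parallel-lines case.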

### 2×3 Factorial Example:

**Factors**:
- **Factor A**: Fertilizer (None, Low, High) - 3 levels
- **Factor B**: Light (Low, High) - 2 levels

**Total combinations**: 2 × 3 = **6 treatments**

### 2×2×2 Factorial (Three Factors):

**Factors**:
- Light (Low, High)
- Water (Low, High)  
- Temperature (Low, High)

**Total combinations**: 2 × 2 × 2 = **8 treatments**

**Can test**:
- 3 main effects
- 3 two-way interactions (L×W, L×T, W×T)
- 1 three-way interaction (L×W×T)
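
Listing every treatment combination by hand gets error-prone as factors are added. As a minimal sketch, the standard-library `itertools.product` can enumerate them; the `factors` dictionary here is simply the three-factor example written out as data.

```python
import itertools

factors = {
    "Light": ["Low", "High"],
    "Water": ["Low", "High"],
    "Temperature": ["Low", "High"],
}

# Cartesian product of the factor levels = full set of treatment combinations
combinations = [
    dict(zip(factors.keys(), levels))
    for levels in itertools.product(*factors.values())
]

for i, combo in enumerate(combinations, start=1):
    print(i, combo)

print("Total treatments:", len(combinations))
```

Adding a third level to any factor, or a fourth factor, only requires editing the dictionary; the count of combinations is the product of the level counts.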

In [None]:
# Simulate 2x2 factorial design
def simulate_factorial_2x2(n_per_group=20, seed=42):
    """
    Simulate factorial experiment with interaction
    """
    np.random.seed(seed)
    
    # Create treatment combinations
    # Scenario: Light and Water effects with interaction
    # High water helps far more under high light (interaction!)
    
    treatments = {
        'Low Light\nLow Water':    np.random.normal(10, 2, n_per_group),
        'Low Light\nHigh Water':   np.random.normal(12, 2, n_per_group),  # Small water effect
        'High Light\nLow Water':   np.random.normal(15, 2, n_per_group),  # Good light effect
        'High Light\nHigh Water':  np.random.normal(25, 2, n_per_group),  # Synergy!
    }
    
    # Create DataFrame
    data = []
    for treatment, values in treatments.items():
        light = 'High' if 'High Light' in treatment else 'Low'
        water = 'High' if 'High Water' in treatment else 'Low'
        for value in values:
            data.append({'Light': light, 'Water': water, 
                        'Treatment': treatment, 'Height': value})
    
    df = pd.DataFrame(data)
    
    # Create visualization
    fig = make_subplots(
        rows=1, cols=2,
        subplot_titles=(
            'Treatment Means (Bar Plot)',
            'Interaction Plot (Line Plot)'
        ),
        horizontal_spacing=0.15
    )
    
    # Bar plot
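    # sort=False keeps the design order, so bar colors match the intended treatments
    means = df.groupby('Treatment', sort=False)['Height'].mean()
    stds = df.groupby('Treatment', sort=False)['Height'].std()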
    
    fig.add_trace(
        go.Bar(
            x=list(means.index),
            y=means.values,
            error_y=dict(type='data', array=stds.values),
            marker_color=['lightblue', 'lightcoral', 'lightgreen', 'gold'],
            showlegend=False
        ),
        row=1, col=1
    )
    
    # Interaction plot
    for water_level in ['Low', 'High']:
        subset = df[df['Water'] == water_level]
        means_by_light = subset.groupby('Light')['Height'].mean()
        
        fig.add_trace(
            go.Scatter(
                x=['Low', 'High'],
                y=[means_by_light['Low'], means_by_light['High']],
                mode='lines+markers',
                line=dict(width=3),
                marker=dict(size=12),
                name=f'Water: {water_level}'
            ),
            row=1, col=2
        )
    
    # Update axes
    fig.update_xaxes(title_text="Treatment", row=1, col=1, tickangle=45)
    fig.update_xaxes(title_text="Light Level", row=1, col=2)
    fig.update_yaxes(title_text="Plant Height (cm)")
    
    fig.update_layout(
        title="🎛️ 2×2 Factorial Design: Light × Water Interaction<br><sub>Non-parallel lines indicate interaction!</sub>",
        height=500,
        template='plotly_white'
    )
    
    return fig, df

# Run simulation
fig, df = simulate_factorial_2x2()
fig.show()

# Calculate main effects and interaction
means = df.groupby(['Light', 'Water'])['Height'].mean().unstack()

# Main effect of Light (average across water levels)
light_effect = means.loc['High'].mean() - means.loc['Low'].mean()

# Main effect of Water (average across light levels)
water_effect = means['High'].mean() - means['Low'].mean()

# Interaction (simple slopes difference)
slope_low_light = means.loc['Low', 'High'] - means.loc['Low', 'Low']
slope_high_light = means.loc['High', 'High'] - means.loc['High', 'Low']
interaction = slope_high_light - slope_low_light

print("\n🎛️ Factorial Design Analysis:\n")
print("   📊 Main Effects:")
print(f"      Light: {light_effect:.1f} cm (High vs Low)")
print(f"      Water: {water_effect:.1f} cm (High vs Low)")
print("\n   🔄 Interaction Effect:")
print(f"      Light × Water: {interaction:.1f} cm")
print("\n   📈 Interpretation:")
print("      • Both factors have positive main effects")
print("      • STRONG INTERACTION detected")
print("      • Water effect is MUCH larger with high light")
print("      • Low light: Water adds only about +2 cm")
print("      • High light: Water adds about +10 cm (synergy!)")
print("\n💡 Why This Matters:")
print("   If you tested each factor separately, you'd miss")
print("   the synergistic effect of combining them!")
print("   Factorial designs reveal these interactions.")

---

## 🧱 Part 6: Blocking - Reducing Noise

### What is Blocking?

**Blocking**: Group similar experimental units together, then randomize within blocks

### Why Block?

**Problem**: Environmental gradients or natural variation

**Example**:
```
Field with gradient:
North end: Shady, moist, fertile
South end: Sunny, dry, poor soil
```

**Solution**: Create blocks!
```
┌──────────────────┐
│  Block 1 (North) │ ← Homogeneous
│  C  T  C  T  C   │ ← Randomize within
├──────────────────┤
│  Block 2 (Mid)   │
│  T  C  T  C  T   │
├──────────────────┤
│  Block 3 (South) │
│  C  T  C  T  C   │
└──────────────────┘
```

### Benefits of Blocking:

**1. Reduces Error Variance**
```
Without blocking:
  Variance includes block differences
  
With blocking:
  Variance only within-block
  Block effect removed from error term
  
Result: MORE SENSITIVE to treatment effects!
```

**2. Increases Statistical Power**
```
Lower error → Easier to detect real effects
```

**3. More Representative**
```
Each treatment tested in all conditions
Results apply broadly
```

### When to Use Blocking:

✅ **Use blocking when**:
- Known source of variation exists
- Environmental gradient present
- Units come in natural groups
- Time periods differ

❌ **Don't block when**:
- Units are homogeneous
- No obvious grouping
- Too few units: block terms would use up scarce error degrees of freedom

### Common Blocking Variables:

**Spatial**:
- Location (north/south)
- Elevation
- Distance from edge

**Temporal**:
- Time of day
- Date/season
- Year

**Biological**:
- Age class
- Sex
- Size class
- Genetic line

**Technical**:
- Laboratory batch
- Instrument
- Observer

### Analysis:

**Two-way ANOVA**:
```
Source          | SS  | df | MS | F
----------------|-----|----|----|---
Treatment       | ... | t-1| ...|...  ‚Üê Test this!
Block           | ... | b-1| ...|...  ‚Üê Removes from error
Error           | ... |... | ...|
Total           | ... | n-1|
```

**Block effect not tested** (not of interest, just controlling)
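
The ANOVA table above can be filled in by hand for a small case. The sketch below assumes the simplest RCBD layout, one observation per treatment per block, and uses invented response values; with that layout the error degrees of freedom are (t-1)(b-1).

```python
import numpy as np
from scipy import stats

# Hypothetical responses: rows = treatments (t = 2), columns = blocks (b = 4),
# one observation per treatment per block
y = np.array([
    [10.1, 15.3, 19.8, 25.2],   # Control
    [13.0, 18.1, 23.2, 27.9],   # Treatment
])
t, b = y.shape
grand_mean = y.mean()

# Partition the total sum of squares
ss_total = ((y - grand_mean) ** 2).sum()
ss_treatment = b * ((y.mean(axis=1) - grand_mean) ** 2).sum()
ss_block = t * ((y.mean(axis=0) - grand_mean) ** 2).sum()
ss_error = ss_total - ss_treatment - ss_block

df_treatment, df_block = t - 1, b - 1
df_error = df_treatment * df_block

# F-test for the treatment effect, with block variation removed from the error
f_treatment = (ss_treatment / df_treatment) / (ss_error / df_error)
p_treatment = stats.f.sf(f_treatment, df_treatment, df_error)

print(f"SS treatment = {ss_treatment:.2f}, SS block = {ss_block:.2f}, SS error = {ss_error:.2f}")
print(f"F({df_treatment}, {df_error}) = {f_treatment:.1f}, p = {p_treatment:.4f}")
```

With only two treatments this is equivalent to a paired t-test across blocks. Designs with replicates inside each block need an extra within-cell term, which this sketch does not cover.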

In [None]:
# Demonstrate benefit of blocking
def compare_crd_vs_rcbd(n_blocks=4, n_reps_per_block=5, seed=42):
    """
    Compare Completely Randomized Design (CRD) vs 
    Randomized Complete Block Design (RCBD)
    """
    np.random.seed(seed)
    
    # Simulate data with block effect
    block_effects = [0, 5, 10, 15]  # Strong gradient
    treatment_effect = 3  # True treatment effect
    
    data = []
    for block_id, block_effect in enumerate(block_effects, 1):
        for treatment in ['Control', 'Treatment']:
            for rep in range(n_reps_per_block):
                base_value = block_effect
                if treatment == 'Treatment':
                    base_value += treatment_effect
                value = base_value + np.random.normal(0, 1)
                data.append({
                    'Block': f'Block {block_id}',
                    'Treatment': treatment,
                    'Value': value
                })
    
    df = pd.DataFrame(data)
    
    # Analyze as CRD (ignoring blocks)
    control_crd = df[df['Treatment'] == 'Control']['Value']
    treatment_crd = df[df['Treatment'] == 'Treatment']['Value']
    t_stat_crd, p_value_crd = stats.ttest_ind(treatment_crd, control_crd)
    
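    # Analyze as RCBD (accounting for blocks) - simplified approximation:
    # center each block on the grand mean, then t-test the adjusted values.
    # (A full two-way ANOVA would also deduct the block df from the error df.)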
    block_means = df.groupby('Block')['Value'].transform('mean')
    grand_mean = df['Value'].mean()
    df['Value_corrected'] = df['Value'] - block_means + grand_mean
    
    control_rcbd = df[df['Treatment'] == 'Control']['Value_corrected']
    treatment_rcbd = df[df['Treatment'] == 'Treatment']['Value_corrected']
    t_stat_rcbd, p_value_rcbd = stats.ttest_ind(treatment_rcbd, control_rcbd)
    
    # Visualize
    fig = make_subplots(
        rows=1, cols=2,
        subplot_titles=(
            f'CRD: Ignoring Blocks<br>p = {p_value_crd:.4f}',
            f'RCBD: Accounting for Blocks<br>p = {p_value_rcbd:.4f}'
        ),
        horizontal_spacing=0.15
    )
    
    # CRD visualization (all data pooled)
    for treatment in ['Control', 'Treatment']:
        subset = df[df['Treatment'] == treatment]
        fig.add_trace(
            go.Box(y=subset['Value'], name=treatment,
                   marker_color='lightblue' if treatment == 'Control' else 'lightcoral',
                   showlegend=False),
            row=1, col=1
        )
    
    # RCBD visualization (by block)
    for block in df['Block'].unique():
        block_data = df[df['Block'] == block]
        for treatment in ['Control', 'Treatment']:
            subset = block_data[block_data['Treatment'] == treatment]
            fig.add_trace(
                go.Scatter(
                    x=[treatment] * len(subset),
                    y=subset['Value'],
                    mode='markers',
                    marker=dict(size=8, opacity=0.6),
                    name=block,
                    showlegend=(treatment == 'Control')
                ),
                row=1, col=2
            )
    
    # Add treatment means (raw values) to both panels
    for col in [1, 2]:
        for i, treatment in enumerate(['Control', 'Treatment']):
            mean_val = df[df['Treatment'] == treatment]['Value'].mean()
            fig.add_shape(
                type="line",
                x0=i - 0.4, x1=i + 0.4,
                y0=mean_val, y1=mean_val,
                line=dict(color="black", width=3),
                row=1, col=col
            )
    
    fig.update_yaxes(title_text="Response Variable")
    
    fig.update_layout(
        title="🧱 Power of Blocking: Reducing Noise<br><sub>Same data, different analysis</sub>",
        height=500,
        template='plotly_white'
    )
    
    return fig, p_value_crd, p_value_rcbd

# Run comparison
fig, p_crd, p_rcbd = compare_crd_vs_rcbd()
fig.show()

print("\n🧱 Blocking Analysis:\n")
print("   📊 CRD (Completely Randomized Design):")
print(f"      • p-value: {p_crd:.4f}")
print(f"      • Significant at α=0.05? {p_crd < 0.05}")
print("      • Problem: Block variation included in error")
print("      • Large overlap between groups")
print("\n   📊 RCBD (Randomized Complete Block Design):")
print(f"      • p-value: {p_rcbd:.4f}")
print(f"      • Significant at α=0.05? {p_rcbd < 0.05}")
print("      • Advantage: Block variation removed from error")
print("      • Clearer separation between treatments")
print("\n💡 Key Insight:")
print("   Same data, same treatment effect,")
print("   but RCBD has much greater power to detect it!")
print("   This is the magic of blocking.")
print("\n⚖️ Trade-off:")
print("   Blocking uses degrees of freedom (3 df for 4 blocks)")
print("   But reduction in error variance more than compensates!")

---

## 🎓 Part 7: Common Experimental Designs

### Summary of Major Designs:

#### **1. Completely Randomized Design (CRD)**
```
Structure: Treatments randomly assigned to units
Analysis:  One-way ANOVA or t-test
Use when:  Units homogeneous
Example:   Lab study with uniform conditions
```

#### **2. Randomized Complete Block Design (RCBD)**
```
Structure: Units grouped into blocks, randomize within
Analysis:  Two-way ANOVA (treatment + block)
Use when:  Known source of variation
Example:   Field study with spatial gradient
```

#### **3. Factorial Design**
```
Structure: 2+ factors, all combinations tested
Analysis:  Multi-way ANOVA with interactions
Use when:  Testing multiple factors
Example:   Light × Water × Temperature
```

#### **4. Split-Plot Design**
```
Structure: One factor applied to large units (plots),
           another to subdivisions (subplots)
Analysis:  Mixed model ANOVA
Use when:  One factor hard to apply at small scale
Example:   Irrigation (whole plot) × Variety (subplot)
```

#### **5. Repeated Measures Design**
```
Structure: Same units measured multiple times
Analysis:  Repeated measures ANOVA or mixed model
Use when:  Following individuals over time
Example:   Growth measured weekly on same plants
```
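
For the simplest repeated measures case, two time points on the same plants, a paired t-test respects the pairing. The sketch below contrasts it with an independent-samples test on simulated heights. The sample size, growth, and noise values are assumptions, and designs with more time points need the repeated measures ANOVA or mixed model named above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

n_plants = 8
# Plants differ a lot in baseline size (simulated heights in cm, illustrative)
baseline = rng.normal(20.0, 5.0, size=n_plants)
week1 = baseline + rng.normal(0, 0.5, size=n_plants)
week2 = baseline + 1.5 + rng.normal(0, 0.5, size=n_plants)  # about 1.5 cm growth

# Appropriate for this design: paired test on within-plant differences
t_paired, p_paired = stats.ttest_rel(week2, week1)

# Inappropriate here: treats the two weeks as independent groups of plants
t_indep, p_indep = stats.ttest_ind(week2, week1)

print(f"paired:      p = {p_paired:.4f}")
print(f"independent: p = {p_indep:.4f}")
```

The paired test removes plant-to-plant variation from the error term, much as blocking does, so the same growth is easier to detect.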

#### **6. Nested Design**
```
Structure: Subunits nested within units
Analysis:  Nested ANOVA
Use when:  Hierarchical structure
Example:   Leaves (nested in) trees (nested in) sites
```
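
One simple way to respect a nested structure is to collapse subsamples to one value per experimental unit before testing. The sketch below simplifies the example to two levels (leaves nested in trees) and assumes a hypothetical treatment applied to whole trees. All values are simulated.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical study: 2 treatments x 4 trees each x 10 leaves per tree
rows = []
for treatment, trt_mean in [("control", 50.0), ("fertilized", 54.0)]:
    for tree in range(4):
        tree_mean = trt_mean + rng.normal(0, 3.0)  # tree-to-tree variation
        for leaf_area in rng.normal(tree_mean, 1.0, size=10):
            rows.append({"treatment": treatment,
                         "tree": f"{treatment}_{tree}",
                         "leaf_area": leaf_area})
leaves = pd.DataFrame(rows)

# Collapse leaves to one mean per tree: trees, not leaves, are the replicates
tree_means = leaves.groupby(["treatment", "tree"], as_index=False)["leaf_area"].mean()

control = tree_means.loc[tree_means["treatment"] == "control", "leaf_area"]
fertilized = tree_means.loc[tree_means["treatment"] == "fertilized", "leaf_area"]
t_stat, p_value = stats.ttest_ind(fertilized, control)

print(f"Replicates per treatment: {len(control)} trees (not {len(leaves) // 2} leaves)")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

Testing the 80 leaves directly would treat subsamples as independent replicates, which is exactly the pseudoreplication this notebook warns against.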

### Design Selection Flowchart:

```
START
  |
  ├─ Units homogeneous?
  |     ├─ YES → CRD
  |     └─ NO (known source of variation) → RCBD
  |
  ├─ Two or more factors?
  |     ├─ All factors easy to apply to small units → Factorial
  |     └─ One factor needs large plots → Split-plot
  |
  ├─ Same units measured over time? → YES → Repeated measures
  |
  └─ Subunits sampled within units? → YES → Nested
```
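
The "Use when" rules from the design summaries can also be sketched as a small helper. `suggest_design` and its parameter names are hypothetical, written only for illustration. It returns a list because designs combine in practice, for example a factorial treatment structure laid out in blocks.

```python
def suggest_design(n_factors, units_homogeneous=True, hard_to_change_factor=False,
                   repeated_measures=False, subsampling=False):
    """Return design names suggested by the 'Use when' rules (illustrative only)."""
    suggestions = []

    # Layout of units: block only when there is a known source of variation
    suggestions.append("CRD" if units_homogeneous else "RCBD")

    # Treatment structure: several factors call for a factorial, or a
    # split-plot when one factor can only be applied to large units
    if n_factors >= 2:
        suggestions.append("Split-plot" if hard_to_change_factor else "Factorial")

    # Measurement structure
    if repeated_measures:
        suggestions.append("Repeated measures")
    if subsampling:
        suggestions.append("Nested")

    return suggestions


print(suggest_design(1))
print(suggest_design(2, units_homogeneous=False))
print(suggest_design(2, hard_to_change_factor=True, repeated_measures=True))
```

A helper like this is a starting point for discussion, not a substitute for thinking through the biology and logistics of a specific study.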

---

## 🎓 Summary

### Key Principles of Experimental Design:

✅ **Three Pillars**: Replication, Randomization, Control  
✅ **Replication**: Provides statistical power and estimates variability  
✅ **Randomization**: Guards against bias and spreads unknown confounds across treatments  
✅ **Control**: Establishes baseline for comparison  
✅ **Blocking**: Reduces noise when a source of variation is known  
✅ **Factorial**: Tests multiple factors and their interactions efficiently  
✅ **Pseudoreplication**: Never treat non-independent samples as independent replicates  
✅ **Power Analysis**: Calculate n BEFORE collecting data  

### The Design Process:

**Step 1: Define Question**
- What exactly are you testing?
- What's your hypothesis?

**Step 2: Choose Response Variable**
- What will you measure?
- Does it directly reflect your hypothesis?

**Step 3: Identify Factors**
- What's being manipulated?
- How many levels?

**Step 4: Select Design**
- CRD, RCBD, Factorial, etc.
- Match design to question

**Step 5: Power Analysis**
- How many replicates needed?
- Based on expected effect size

**Step 6: Randomize**
- Generate random assignments
- Document the scheme

**Step 7: Include Controls**
- At minimum, a negative control
- Consider positive and procedural controls

**Step 8: Plan Analysis**
- Choose statistical test in advance
- Prevents data dredging
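
Step 5 can be sketched with a normal-approximation formula for comparing two group means: n per group ≈ 2(z₁₋α/₂ + z_power)² / d², where d is the standardized effect size (Cohen's d). The function name `n_per_group` and the default values are assumptions for illustration, and this is not necessarily the method used elsewhere in this notebook. An exact t-based calculation gives slightly larger n.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2,
    where d is the standardized effect size (Cohen's d).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Conventional small, medium, and large standardized effects
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} units per group")
```

Halving the expected effect size roughly quadruples the required replication, which is why the expected effect should be justified before fieldwork begins.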

### Common Mistakes to Avoid:

❌ **No control group**  
❌ **Pseudoreplication** (treating non-independent samples as independent)  
❌ **No randomization** (systematic assignment)  
❌ **Too few replicates** (underpowered)  
❌ **Confounded variables** (multiple things changing)  
❌ **Post-hoc sample size** (calculating n after seeing data)  
❌ **Data dredging** (testing many hypotheses after collecting data)  
❌ **Ignoring known variation** (not blocking when you should)  

### Golden Rules:

**Rule 1**: Design your experiment BEFORE collecting data  
**Rule 2**: Randomize, randomize, randomize!  
**Rule 3**: Always include a control group  
**Rule 4**: Calculate required sample size with power analysis  
**Rule 5**: Keep it as simple as possible (but no simpler)  
**Rule 6**: Avoid pseudoreplication at all costs  
**Rule 7**: Block when you know sources of variation  
**Rule 8**: Document everything (especially randomization)  
**Rule 9**: Consult a statistician BEFORE, not after!  
**Rule 10**: If in doubt, add more replicates  

### Quote to Remember:

> *"To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of."* - Sir Ronald A. Fisher

### Next Steps:

**Apply these principles**:
- Design your own experiments
- Critique published studies
- Calculate power for your research
- Use appropriate designs for your questions

**Further Learning**:
- Advanced designs (Latin square, crossover, etc.)
- Mixed models for complex designs
- Optimal experimental design theory
- Bayesian experimental design

---

<div align="center">

**Made with 💚 by Ms. Susama Kar & Dr. Alok Patel**

[📓 Previous: Testing Fundamentals](06_hypothesis_testing_fundamentals.ipynb) | 
[🏠 Unit 4 Home](../../)

</div>