# Advanced Active Learning: Model Disagreement

Welcome to advanced active learning! In this tutorial, you'll learn how to use **model ensembles** and **disagreement-based sampling** for even more sophisticated experimental design.

## Learning Objectives

By the end of this tutorial, you will be able to:
- Understand the difference between uncertainty and disagreement sampling
- Build model ensembles (committees) for disagreement estimation
- Implement query-by-committee with AutoRA's disagreement experimentalist
- Compare uncertainty vs. disagreement strategies
- Understand when to use each active learning approach

## Uncertainty vs. Disagreement: What's the Difference?

### Uncertainty Sampling (Previous Tutorial)
- Uses a **single model** (e.g., Gaussian Process)
- Queries where the model is most uncertain: $\arg\max_x \sigma(x)$
- Works well when the model can accurately estimate its own uncertainty

### Disagreement Sampling (This Tutorial)
- Uses **multiple models** (ensemble/committee)
- Queries where models disagree most: $\arg\max_x \text{Var}(\{f_1(x), f_2(x), ..., f_K(x)\})$
- More robust when individual models might be miscalibrated
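
As a minimal sketch of the disagreement rule above, assume a hypothetical array `committee_preds` of shape `(K, N)` that holds the predictions of K models at N candidate points. The numbers are made up purely for illustration.

```python
import numpy as np

# Hypothetical predictions of K=3 committee members at N=4 candidate points.
committee_preds = np.array([
    [0.10, 0.50, 0.90, 0.30],
    [0.12, 0.20, 0.88, 0.31],
    [0.11, 0.80, 0.91, 0.29],
])

# Variance across models (axis 0) measures disagreement at each candidate.
disagreement = committee_preds.var(axis=0)

# Query the candidate where the committee disagrees most.
query_index = int(np.argmax(disagreement))
print(query_index)
```

The committee nearly agrees at three of the candidates and splits widely at one, so that candidate is the one selected.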

### Key Insight

**Disagreement captures epistemic uncertainty** (what we don't know) rather than aleatoric uncertainty (inherent noise).

When models disagree, it means:
- The data doesn't strongly constrain the model in that region
- We need more observations to resolve the disagreement
- This is **highly informative** for learning!
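
One common way to make this distinction precise is to split the predictive variance of an equally weighted committee into two terms. This assumes each member $k$ outputs both a mean $\mu_k(x)$ and a noise variance $\sigma_k^2(x)$, which the ensemble built in this tutorial does not do, so treat it as background rather than as a description of the code below.

$$
\text{Var}[y \mid x] \;=\; \underbrace{\frac{1}{K}\sum_{k=1}^{K}\sigma_k^2(x)}_{\text{aleatoric: average predicted noise}} \;+\; \underbrace{\frac{1}{K}\sum_{k=1}^{K}\bigl(\mu_k(x)-\bar{\mu}(x)\bigr)^2}_{\text{epistemic: disagreement}}, \qquad \bar{\mu}(x)=\frac{1}{K}\sum_{k=1}^{K}\mu_k(x)
$$

The second term is the variance across members used in the disagreement rule, with $f_k = \mu_k$. It typically shrinks as more data constrains the models, whereas the first term reflects noise that additional observations cannot remove.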

## Setup & Imports

In [None]:
import sys, os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import cm
from sklearn.metrics import mean_squared_error
import warnings
warnings.filterwarnings('ignore')  # Suppress training warnings for cleaner output

np.random.seed(42)

# Add project folder to path
target_folder = os.path.abspath(os.path.join(os.getcwd(), '..'))
if target_folder not in sys.path:
    sys.path.append(target_folder)

## Part 1: Building a Model Ensemble

To use disagreement sampling, we need multiple models that can provide diverse predictions. We'll create an **ensemble** of neural networks.

### Why Neural Network Ensembles?

1. **Diversity**: Different random initializations lead to different learned functions
2. **Flexibility**: Can model complex, non-linear relationships
3. **Familiarity**: You already know `FFNRegressor` from earlier tutorials
4. **Scalability**: Works well with high-dimensional data

In [None]:
from resources.regressors import FFN, FFNRegressor
from sklearn.base import BaseEstimator, RegressorMixin

class FFNEnsemble(BaseEstimator, RegressorMixin):
    """
    Ensemble of Feed-Forward Neural Networks
    
    Each model is trained independently with different random initialization.
    Predictions are averaged, and disagreement is measured by variance.
    """
    
    def __init__(self, n_units, input_dim, n_models=5, max_epochs=50, lr=0.1, verbose=False):
        self.n_units = n_units
        self.input_dim = input_dim
        self.n_models = n_models
        self.max_epochs = max_epochs
        self.lr = lr
        self.verbose = verbose
        self.models = []
        
    def fit(self, X, y):
        """Train ensemble of models"""
        self.models = []
        
        for i in range(self.n_models):
            # Create a fresh model; no explicit seed is passed, so diversity relies on
            # each FFN starting from its own random weight initialization
            model = FFNRegressor(
                FFN(self.n_units, self.input_dim),
                max_epochs=self.max_epochs,
                lr=self.lr,
                verbose=self.verbose
            )
            
            # Train on full dataset (could also use bootstrap samples)
            model.fit(X, y)
            self.models.append(model)
            
        return self
    
    def predict(self, X, return_std=False):
        """Make predictions with ensemble"""
        # Get predictions from all models
        predictions = np.array([model.predict(X) for model in self.models])
        
        # Average predictions
        mean_pred = np.mean(predictions, axis=0)
        
        if return_std:
            # Standard deviation across models = disagreement
            std_pred = np.std(predictions, axis=0)
            return mean_pred, std_pred
        else:
            return mean_pred
    
    def get_params(self, deep=True):
        """Required for sklearn compatibility"""
        return {
            'n_units': self.n_units,
            'input_dim': self.input_dim,
            'n_models': self.n_models,
            'max_epochs': self.max_epochs,
            'lr': self.lr,
            'verbose': self.verbose
        }
    
    def set_params(self, **params):
        """Required for sklearn compatibility"""
        for key, value in params.items():
            setattr(self, key, value)
        return self

print("✓ FFNEnsemble class created!")
print("\nKey methods:")
print("  - fit(X, y): Train all models in ensemble")
print("  - predict(X, return_std=True): Get mean prediction and disagreement")
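
The comment in `fit` mentions bootstrap samples as an alternative to training every model on the full dataset. A minimal sketch of drawing one bootstrap replicate is shown here, assuming `X` and `y` are NumPy arrays; `bootstrap_sample` is a hypothetical helper, not part of `resources.regressors` or AutoRA.

```python
import numpy as np

def bootstrap_sample(X, y, rng):
    """Draw len(X) rows with replacement (one bootstrap replicate)."""
    idx = rng.integers(0, len(X), size=len(X))
    return X[idx], y[idx]

rng = np.random.default_rng(0)
X_demo = np.arange(10, dtype=float).reshape(-1, 1)
y_demo = 2.0 * X_demo.ravel()

X_boot, y_boot = bootstrap_sample(X_demo, y_demo, rng)

# Same size as the original data; rows may repeat and others may be left out.
print(X_boot.shape, len(np.unique(X_boot)))
```

Inside `FFNEnsemble.fit`, each model could then be trained on its own replicate in place of the shared `(X, y)`, which gives the committee a second source of diversity besides weight initialization.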

### Testing the Ensemble on Simple 1D Data

Let's visualize how ensemble disagreement works on a simple example:

In [None]:
# Create simple 1D ground truth
def ground_truth_1d(x):
    return np.sin(3 * x) + 0.3 * np.cos(9 * x)

# Sample sparse training data
X_train_1d = np.array([[0.1], [0.3], [0.7], [0.9]])
y_train_1d = ground_truth_1d(X_train_1d.ravel())

# Note: For 1D demo, we use input_dim=1 and n_units=1 (no participant ID)
ensemble_1d = FFNEnsemble(
    n_units=1,  # Single "pseudo-unit" for 1D case
    input_dim=1,
    n_models=5,
    max_epochs=100,
    lr=0.05,
    verbose=False
)
ensemble_1d.fit(X_train_1d, y_train_1d)

# Make predictions
X_test_1d = np.linspace(0, 1, 200).reshape(-1, 1)
y_pred, y_std = ensemble_1d.predict(X_test_1d, return_std=True)

# Visualize individual models and ensemble
plt.figure(figsize=(12, 5))

# Plot individual model predictions
for i, model in enumerate(ensemble_1d.models):
    y_individual = model.predict(X_test_1d)
    plt.plot(X_test_1d, y_individual, alpha=0.3, linewidth=1, color='blue')

# Plot ensemble mean and disagreement
plt.fill_between(X_test_1d.ravel(), 
                 y_pred - 2*y_std, 
                 y_pred + 2*y_std, 
                 alpha=0.3, 
                 color='red',
                 label='Disagreement (±2σ)')
plt.plot(X_test_1d, y_pred, 'r-', linewidth=2, label='Ensemble Mean')
plt.plot(X_test_1d, ground_truth_1d(X_test_1d.ravel()), 'k--', linewidth=2, label='True Function')
plt.scatter(X_train_1d, y_train_1d, c='green', s=200, zorder=10, edgecolors='black', linewidths=2, label='Training Data')
plt.xlabel('x', fontsize=12)
plt.ylabel('y', fontsize=12)
plt.title('Ensemble Disagreement: Individual Models vs. Ensemble', fontsize=14)
plt.legend(fontsize=10)
plt.grid(True, alpha=0.3)
plt.show()

print("Key Observations:")
print("  - Blue lines: Individual model predictions (diverse!)")
print("  - Red line: Ensemble mean (averaged prediction)")
print("  - Red band: Disagreement (where models disagree)")
print("  - Disagreement is HIGH far from training data")
print("  - Disagreement is LOW near training data")
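
The last two observations can also be checked numerically. The sketch below does not use `FFNEnsemble`; as a dependency-light stand-in it builds a committee of random cosine-feature regressors (an assumption made only for this illustration) on the same four training points and compares the committee's standard deviation at the training inputs with its value in the gap at `x = 0.5`.

```python
import numpy as np

def fit_random_feature_model(X, y, n_features, rng):
    """Fit a minimum-norm least-squares model on random cosine features."""
    W = rng.normal(scale=10.0, size=(X.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)

    def features(Z):
        return np.cos(Z @ W + b)

    # More features than data points: lstsq returns the min-norm interpolant.
    coef, *_ = np.linalg.lstsq(features(X), y, rcond=None)

    def predict(Z):
        return features(Z) @ coef

    return predict

rng = np.random.default_rng(0)
X_train = np.array([[0.1], [0.3], [0.7], [0.9]])
y_train = np.sin(3 * X_train.ravel()) + 0.3 * np.cos(9 * X_train.ravel())

# Five members that differ only in their random features.
committee = [fit_random_feature_model(X_train, y_train, 50, rng) for _ in range(5)]

def disagreement(X):
    """Standard deviation of member predictions at each row of X."""
    preds = np.array([member(X) for member in committee])
    return preds.std(axis=0)

std_at_data = disagreement(X_train).max()
std_in_gap = disagreement(np.array([[0.5]]))[0]
print(std_at_data < std_in_gap)
```

Every member fits the training points almost exactly, so they can only differ away from the data, which is where the disagreement signal comes from.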

## Part 2: AutoRA Disagreement Experimentalist

Now let's use AutoRA's built-in disagreement experimentalist with our 2AFC experiment.

### Installation

Make sure you have the disagreement experimentalist installed:

```bash
pip install -U "autora[experimentalist-inequality]"
```

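Note: This tutorial uses `inequality_sample` from AutoRA's "inequality" experimentalist package for the disagreement step. AutoRA also ships a separate model-disagreement experimentalist, so check the documented signature of the sampler you install before wiring it into the loop.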
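
To make the selection step concrete, here is a minimal sketch of how a disagreement-based sampler might pick `num_samples` conditions from a candidate pool. It is not AutoRA's implementation; `select_most_contested` and the toy arrays are hypothetical.

```python
import numpy as np

def select_most_contested(pool, committee_preds, num_samples):
    """Return the num_samples pool rows with the largest committee std."""
    scores = committee_preds.std(axis=0)
    # Sort ascending, reverse, and keep the highest-scoring candidates.
    top = np.argsort(scores)[::-1][:num_samples]
    return pool[top], scores[top]

# Four candidate conditions (rows) with two independent variables each.
pool = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])

# Predictions of three committee members for each candidate.
committee_preds = np.array([
    [0.5, 0.1, 0.9, 0.4],
    [0.5, 0.3, 0.1, 0.4],
    [0.5, 0.2, 0.5, 0.4],
])

chosen, chosen_scores = select_most_contested(pool, committee_preds, num_samples=2)
print(chosen)
```

Candidates on which all members agree receive a score near zero and are never selected while more contested ones remain.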

In [None]:
from resources.synthetic import twoafc
from autora.state import StandardState, on_state, estimator_on_state
from autora.experimentalist.random import random_sample
from autora.experimentalist.pooler import grid_pool

# Import disagreement experimentalist
try:
    from autora.experimentalist.inequality import inequality_sample
    print("✓ Disagreement (inequality) experimentalist imported successfully!")
except ImportError as e:
    print("✗ Error importing disagreement experimentalist.")
    print("  Please install with: pip install -U 'autora[experimentalist-inequality]'")
    raise e

# Define participant parameters
n_units = 100
parameters = np.random.normal(1, 0.5, (n_units, 2))
parameters = np.where(parameters < 0, 0, parameters)

# Create experiment
experiment = twoafc(parameters, resolution=10)

# Get variable names
iv_names = [iv.name for iv in experiment.variables.independent_variables]
dv_names = [dv.name for dv in experiment.variables.dependent_variables]

print("\nExperiment setup complete!")
print(f"IVs: {iv_names}")
print(f"DVs: {dv_names}")

## Part 3: Three-Way Comparison

Let's compare three strategies:
1. **Random sampling** (baseline)
2. **Uncertainty sampling** (from previous tutorial)
3. **Disagreement sampling** (new!)

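We'll run 10 cycles (counting the seed round for the two model-based strategies) and track each model's MSE on the data it has collected so far.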

### Strategy 1: Random Sampling (Baseline)

In [None]:
from resources.regressors import FFN, FFNRegressor

# Wrap components
experiment_runner = on_state(experiment.run, output=['experiment_data'])
experimentalist_random = on_state(random_sample, output=['conditions'])

# Create model
model_random = FFNRegressor(FFN(n_units, 2), max_epochs=50, lr=0.1, verbose=False)
theorist_random = estimator_on_state(model_random)

# Initialize state
state_random = StandardState(
    variables=experiment.variables,
    conditions=pd.DataFrame(columns=iv_names),
    experiment_data=pd.DataFrame(columns=iv_names + dv_names),
    models=[model_random]
)

# Run cycles
n_cycles = 10
samples_per_cycle = 5
mse_history_random = []

print("Running RANDOM sampling strategy...\n")

for cycle in range(n_cycles):
    state_random = experimentalist_random(
        state_random,
        num_samples=samples_per_cycle,
        random_state=42+cycle,
        sample_all=['participant_id']
    )
    state_random = experiment_runner(state_random, added_noise=0.0, random_state=42+cycle)
    state_random = theorist_random(state_random)
    
    X = state_random.experiment_data[iv_names].values
    y_true = state_random.experiment_data[dv_names].values.ravel()
    y_pred = state_random.models[0].predict(X)
    mse = mean_squared_error(y_true, y_pred)
    mse_history_random.append(mse)
    
    print(f"Cycle {cycle+1:2d}/{n_cycles}: {len(state_random.experiment_data):4d} samples, MSE = {mse:.4f}")

print("\n✓ Random sampling complete!")

### Strategy 2: Uncertainty Sampling (GP-based)

In [None]:
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C, WhiteKernel
from autora.experimentalist.uncertainty import uncertainty_sample

# Wrap uncertainty experimentalist
experimentalist_uncertainty = on_state(uncertainty_sample, output=['conditions'])
pool_generator = on_state(grid_pool, output=['conditions'])

# Create GP model
kernel_uncertainty = C(1.0, (1e-3, 1e3)) * RBF([1.0, 1.0, 1.0], (1e-2, 1e2)) + WhiteKernel(noise_level=0.01)
gp_uncertainty = GaussianProcessRegressor(
    kernel=kernel_uncertainty,
    n_restarts_optimizer=5,
    random_state=42,
    normalize_y=True
)
theorist_uncertainty = estimator_on_state(gp_uncertainty)

# Initialize with seed data
seed_conditions = random_sample(
    experiment.variables,
    num_samples=2,
    random_state=42,
    sample_all=['participant_id']
)
state_uncertainty = StandardState(
    variables=experiment.variables,
    conditions=seed_conditions,
    experiment_data=pd.DataFrame(columns=iv_names + dv_names),
    models=[gp_uncertainty]
)

state_uncertainty = experiment_runner(state_uncertainty, added_noise=0.0, random_state=42)
state_uncertainty = theorist_uncertainty(state_uncertainty)

mse_history_uncertainty = []

X = state_uncertainty.experiment_data[iv_names].values
y_true = state_uncertainty.experiment_data[dv_names].values.ravel()
y_pred = state_uncertainty.models[0].predict(X)
mse = mean_squared_error(y_true, y_pred)
mse_history_uncertainty.append(mse)

print("Running UNCERTAINTY sampling strategy...\n")
print(f"Cycle  0/{n_cycles}: {len(state_uncertainty.experiment_data):4d} samples (seed), MSE = {mse:.4f}")

for cycle in range(1, n_cycles):
    pool_state = StandardState(
        variables=experiment.variables,
        conditions=pd.DataFrame(columns=iv_names),
        experiment_data=state_uncertainty.experiment_data.copy(),
        models=state_uncertainty.models
    )
    pool_state = pool_generator(pool_state, num_samples=20, sample_all=['participant_id'])
    pool_state = experimentalist_uncertainty(pool_state, num_samples=samples_per_cycle)
    
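    # Build a new state rather than assigning to `.conditions`,
    # which fails if the state dataclass is frozen
    state_uncertainty = StandardState(
        variables=state_uncertainty.variables,
        conditions=pool_state.conditions,
        experiment_data=state_uncertainty.experiment_data,
        models=state_uncertainty.models
    )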
    state_uncertainty = experiment_runner(state_uncertainty, added_noise=0.0, random_state=42+cycle)
    state_uncertainty = theorist_uncertainty(state_uncertainty)
    
    X = state_uncertainty.experiment_data[iv_names].values
    y_true = state_uncertainty.experiment_data[dv_names].values.ravel()
    y_pred = state_uncertainty.models[0].predict(X)
    mse = mean_squared_error(y_true, y_pred)
    mse_history_uncertainty.append(mse)
    
    print(f"Cycle {cycle:2d}/{n_cycles}: {len(state_uncertainty.experiment_data):4d} samples, MSE = {mse:.4f}")

print("\n✓ Uncertainty sampling complete!")

### Strategy 3: Disagreement Sampling (Ensemble-based)

In [None]:
# Wrap disagreement experimentalist
experimentalist_disagreement = on_state(inequality_sample, output=['conditions'])

# Create ensemble model
ensemble_model = FFNEnsemble(
    n_units=n_units,
    input_dim=2,  # ratio and scatteredness
    n_models=5,
    max_epochs=50,
    lr=0.1,
    verbose=False
)
theorist_disagreement = estimator_on_state(ensemble_model)

# Initialize with seed data
seed_conditions_disagreement = random_sample(
    experiment.variables,
    num_samples=2,
    random_state=42,
    sample_all=['participant_id']
)
state_disagreement = StandardState(
    variables=experiment.variables,
    conditions=seed_conditions_disagreement,
    experiment_data=pd.DataFrame(columns=iv_names + dv_names),
    models=[ensemble_model]
)

state_disagreement = experiment_runner(state_disagreement, added_noise=0.0, random_state=42)
state_disagreement = theorist_disagreement(state_disagreement)

mse_history_disagreement = []

X = state_disagreement.experiment_data[iv_names].values
y_true = state_disagreement.experiment_data[dv_names].values.ravel()
y_pred = state_disagreement.models[0].predict(X)
mse = mean_squared_error(y_true, y_pred)
mse_history_disagreement.append(mse)

print("Running DISAGREEMENT sampling strategy...\n")
print(f"Cycle  0/{n_cycles}: {len(state_disagreement.experiment_data):4d} samples (seed), MSE = {mse:.4f}")

for cycle in range(1, n_cycles):
    pool_state = StandardState(
        variables=experiment.variables,
        conditions=pd.DataFrame(columns=iv_names),
        experiment_data=state_disagreement.experiment_data.copy(),
        models=state_disagreement.models
    )
    pool_state = pool_generator(pool_state, num_samples=20, sample_all=['participant_id'])
    pool_state = experimentalist_disagreement(pool_state, num_samples=samples_per_cycle)
    
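    # Build a new state rather than assigning to `.conditions`,
    # which fails if the state dataclass is frozen
    state_disagreement = StandardState(
        variables=state_disagreement.variables,
        conditions=pool_state.conditions,
        experiment_data=state_disagreement.experiment_data,
        models=state_disagreement.models
    )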
    state_disagreement = experiment_runner(state_disagreement, added_noise=0.0, random_state=42+cycle)
    state_disagreement = theorist_disagreement(state_disagreement)
    
    X = state_disagreement.experiment_data[iv_names].values
    y_true = state_disagreement.experiment_data[dv_names].values.ravel()
    y_pred = state_disagreement.models[0].predict(X)
    mse = mean_squared_error(y_true, y_pred)
    mse_history_disagreement.append(mse)
    
    print(f"Cycle {cycle:2d}/{n_cycles}: {len(state_disagreement.experiment_data):4d} samples, MSE = {mse:.4f}")

print("\n✓ Disagreement sampling complete!")

## Part 4: Comparison Analysis

Let's compare all three strategies!

### MSE Comparison

In [None]:
plt.figure(figsize=(12, 6))
plt.plot(range(1, n_cycles+1), mse_history_random, 'o-', 
         label='Random Sampling', linewidth=2, markersize=8, color='#ff7f0e')
plt.plot(range(1, n_cycles+1), mse_history_uncertainty, 's-', 
         label='Uncertainty Sampling (GP)', linewidth=2, markersize=8, color='#2ca02c')
plt.plot(range(1, n_cycles+1), mse_history_disagreement, '^-', 
         label='Disagreement Sampling (Ensemble)', linewidth=2, markersize=8, color='#d62728')
plt.xlabel('Cycle', fontsize=12)
plt.ylabel('Mean Squared Error', fontsize=12)
plt.title('Active Learning Comparison: Three Strategies', fontsize=14)
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)
plt.yscale('log')
plt.show()

print("\nFinal Performance (MSE):")
print(f"  Random:       {mse_history_random[-1]:.4f}")
print(f"  Uncertainty:  {mse_history_uncertainty[-1]:.4f}")
print(f"  Disagreement: {mse_history_disagreement[-1]:.4f}")

print("\nImprovement vs. Random:")
improvement_uncertainty = (mse_history_random[-1] - mse_history_uncertainty[-1]) / mse_history_random[-1] * 100
improvement_disagreement = (mse_history_random[-1] - mse_history_disagreement[-1]) / mse_history_random[-1] * 100
print(f"  Uncertainty:  {improvement_uncertainty:.1f}% better")
print(f"  Disagreement: {improvement_disagreement:.1f}% better")
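
Each strategy's MSE is scored on a different set of points, namely whatever that strategy happened to sample, so the three curves are not strictly comparable. One way to put them on a common footing is to score every model on the same fixed test set. The sketch below uses a hypothetical helper `heldout_mse` and toy stand-in models.

```python
import numpy as np

def heldout_mse(predict_fn, X_test, y_test):
    """MSE of predict_fn on a fixed test set shared by all strategies."""
    residuals = np.asarray(y_test) - np.asarray(predict_fn(X_test)).ravel()
    return float(np.mean(residuals ** 2))

# Toy stand-ins: a known target and two hypothetical fitted models.
X_test = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
y_test = 2.0 * X_test.ravel()

def exact_model(X):
    return 2.0 * X.ravel()

def biased_model(X):
    return 2.0 * X.ravel() + 0.5

mse_exact = heldout_mse(exact_model, X_test, y_test)
mse_biased = heldout_mse(biased_model, X_test, y_test)
print(mse_exact, mse_biased)
```

In the notebook, `predict_fn` could be something like `state_random.models[0].predict`, with `X_test` and `y_test` taken from one fixed set of conditions and their outcomes from the synthetic experiment.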

### Sampling Patterns Visualization

In [None]:
# Get samples for participant 0
participant_id = 0
random_samples = state_random.experiment_data[
    state_random.experiment_data['participant_id'] == participant_id
]
uncertainty_samples = state_uncertainty.experiment_data[
    state_uncertainty.experiment_data['participant_id'] == participant_id
]
disagreement_samples = state_disagreement.experiment_data[
    state_disagreement.experiment_data['participant_id'] == participant_id
]

fig, axes = plt.subplots(1, 3, figsize=(18, 5))

# Random
scatter1 = axes[0].scatter(
    random_samples['ratio'],
    random_samples['scatteredness'],
    c=range(len(random_samples)),
    cmap='viridis',
    s=100,
    alpha=0.6,
    edgecolors='black',
    linewidths=1
)
axes[0].set_xlabel('Ratio', fontsize=12)
axes[0].set_ylabel('Scatteredness', fontsize=12)
axes[0].set_title(f'Random Sampling\n(Participant {participant_id})', fontsize=14)
axes[0].grid(True, alpha=0.3)
axes[0].set_xlim(-0.1, 1.1)
axes[0].set_ylim(-0.1, 1.1)
plt.colorbar(scatter1, ax=axes[0], label='Sample Order')

# Uncertainty
scatter2 = axes[1].scatter(
    uncertainty_samples['ratio'],
    uncertainty_samples['scatteredness'],
    c=range(len(uncertainty_samples)),
    cmap='viridis',
    s=100,
    alpha=0.6,
    edgecolors='black',
    linewidths=1
)
axes[1].set_xlabel('Ratio', fontsize=12)
axes[1].set_ylabel('Scatteredness', fontsize=12)
axes[1].set_title(f'Uncertainty Sampling\n(Participant {participant_id})', fontsize=14)
axes[1].grid(True, alpha=0.3)
axes[1].set_xlim(-0.1, 1.1)
axes[1].set_ylim(-0.1, 1.1)
plt.colorbar(scatter2, ax=axes[1], label='Sample Order')

# Disagreement
scatter3 = axes[2].scatter(
    disagreement_samples['ratio'],
    disagreement_samples['scatteredness'],
    c=range(len(disagreement_samples)),
    cmap='viridis',
    s=100,
    alpha=0.6,
    edgecolors='black',
    linewidths=1
)
axes[2].set_xlabel('Ratio', fontsize=12)
axes[2].set_ylabel('Scatteredness', fontsize=12)
axes[2].set_title(f'Disagreement Sampling\n(Participant {participant_id})', fontsize=14)
axes[2].grid(True, alpha=0.3)
axes[2].set_xlim(-0.1, 1.1)
axes[2].set_ylim(-0.1, 1.1)
plt.colorbar(scatter3, ax=axes[2], label='Sample Order')

plt.tight_layout()
plt.show()

print(f"\nSample Counts:")
print(f"  Random:       {len(random_samples)}")
print(f"  Uncertainty:  {len(uncertainty_samples)}")
print(f"  Disagreement: {len(disagreement_samples)}")

### Disagreement vs. Uncertainty Maps

In [None]:
# Create test grid
ratio_range = np.linspace(0, 1, 30)
scatter_range = np.linspace(0, 1, 30)
ratio_grid, scatter_grid = np.meshgrid(ratio_range, scatter_range)
X_grid = np.c_[
    np.full(ratio_grid.size, participant_id),
    ratio_grid.ravel(),
    scatter_grid.ravel()
]

# Get uncertainty/disagreement estimates
_, std_uncertainty = state_uncertainty.models[0].predict(X_grid, return_std=True)
_, std_disagreement = state_disagreement.models[0].predict(X_grid, return_std=True)

std_uncertainty_grid = std_uncertainty.reshape(ratio_grid.shape)
std_disagreement_grid = std_disagreement.reshape(ratio_grid.shape)

# Plot
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Uncertainty
im1 = axes[0].contourf(ratio_grid, scatter_grid, std_uncertainty_grid, levels=20, cmap='YlOrRd')
axes[0].scatter(uncertainty_samples['ratio'], uncertainty_samples['scatteredness'],
                c='blue', s=50, alpha=0.7, edgecolors='black', linewidths=1, label='Sampled Points')
axes[0].set_xlabel('Ratio', fontsize=12)
axes[0].set_ylabel('Scatteredness', fontsize=12)
axes[0].set_title('Uncertainty Map (GP)', fontsize=14)
axes[0].legend()
plt.colorbar(im1, ax=axes[0], label='Prediction Std Dev')

# Disagreement
im2 = axes[1].contourf(ratio_grid, scatter_grid, std_disagreement_grid, levels=20, cmap='YlOrRd')
axes[1].scatter(disagreement_samples['ratio'], disagreement_samples['scatteredness'],
                c='blue', s=50, alpha=0.7, edgecolors='black', linewidths=1, label='Sampled Points')
axes[1].set_xlabel('Ratio', fontsize=12)
axes[1].set_ylabel('Scatteredness', fontsize=12)
axes[1].set_title('Disagreement Map (Ensemble)', fontsize=14)
axes[1].legend()
plt.colorbar(im2, ax=axes[1], label='Ensemble Std Dev')

plt.tight_layout()
plt.show()

print("\nKey Differences:")
print("  - GP uncertainty: Smooth, distance-based (RBF kernel)")
print("  - Ensemble disagreement: Data-driven, captures model uncertainty")
print("  - Both capture epistemic uncertainty, but from different perspectives")

## Part 5: When to Use Each Strategy?

Let's summarize the trade-offs:

| Strategy | Pros | Cons | Best For |
|----------|------|------|----------|
| **Random** | Simple, no overhead | Inefficient, wastes samples | Baselines, very small budgets |
| **Uncertainty (GP)** | Well-calibrated, smooth | Assumes kernel structure, slow for large data | Smooth functions, medium data |
| **Disagreement (Ensemble)** | Robust, flexible | Computationally expensive, needs multiple models | Complex functions, large budgets |

### Practical Recommendations

1. **Start with Uncertainty (GP)** if:
   - Your function is relatively smooth
   - You have moderate sample budget (100-1000 samples)
   - You want well-calibrated uncertainty estimates

2. **Use Disagreement (Ensemble)** if:
   - Your function is highly non-linear or discontinuous
   - You have large computational budget
   - You want robustness to model misspecification

3. **Hybrid Approaches**:
   - Early cycles: Random (exploration)
   - Mid cycles: Uncertainty (efficient sampling)
   - Late cycles: Disagreement (refining complex regions)
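
The hybrid schedule above can be sketched as a small function that maps a cycle number to a strategy name. `choose_strategy` and its default cut-offs are hypothetical; the defaults mirror the 2 / 5 / 3 split suggested in the exercises.

```python
def choose_strategy(cycle, n_random=2, n_uncertainty=5):
    """Map a 1-based cycle number to a strategy name."""
    if cycle < 1:
        raise ValueError("cycle numbers start at 1")
    if cycle <= n_random:
        return "random"
    if cycle <= n_random + n_uncertainty:
        return "uncertainty"
    return "disagreement"

schedule = [choose_strategy(c) for c in range(1, 11)]
print(schedule)
```

Inside the experiment loop, the returned name could be used to decide which wrapped experimentalist to call for that cycle.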

## Summary & Key Takeaways

You've learned:

1. ✅ **Ensemble Models**: Building committees for disagreement estimation
2. ✅ **Disagreement Sampling**: Query where models disagree most
3. ✅ **AutoRA Implementation**: Using `inequality_sample` with ensemble models
4. ✅ **Uncertainty vs. Disagreement**: When to use each approach
5. ✅ **Performance Comparison**: Measuring both strategies against a random-sampling baseline
6. ✅ **Practical Guidelines**: Choosing the right strategy for your problem

### The Complete Active Learning Toolkit

You now have three powerful strategies:
- **Grid/Factorial**: Uniform coverage (Tutorial 1)
- **Uncertainty**: Model-driven intelligent sampling (Tutorial 2)
- **Disagreement**: Robust ensemble-based sampling (Tutorial 3)

### Group Project Connection

For your group project, you can:
1. Implement any of these strategies (or combine them!)
2. Compare performance on the 2AFC experiment
3. Test robustness to different noise levels
4. Explore your own experimentalist ideas!

## Exercises

1. **Ensemble Size**: Vary `n_models` (3, 5, 10, 20). How does ensemble size affect:
   - Model performance?
   - Computational cost?
   - Disagreement estimates?

2. **Hybrid Strategy**: Implement a hybrid experimentalist that:
   - Uses random sampling for first 2 cycles
   - Uses uncertainty for cycles 3-7
   - Uses disagreement for cycles 8-10
   
3. **Bootstrap Ensembles**: Modify `FFNEnsemble` to train each model on a bootstrap sample (random subset with replacement) instead of the full dataset. Does this improve diversity?

4. **Noise Robustness**: Add noise to observations (`added_noise=0.1, 0.5`). Which strategy is most robust?

5. **Calibration Analysis**: For both uncertainty and disagreement, compute:
   - Coverage: How often does true value fall within predicted interval?
   - Calibration: Plot predicted std vs. actual error

6. **Custom Experimentalist**: Implement your own experimentalist that combines:
   - Uncertainty estimates
   - Disagreement estimates
   - Distance from previous samples
   Create a weighted score and select based on that!

7. **Active Learning Literature**: Read Settles (2009) survey and implement another query strategy:
   - Query-by-bagging
   - Variance reduction
   - Expected model change
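
As a starting point for the calibration exercise, coverage could be computed along these lines. `interval_coverage` is a hypothetical helper, the interval is taken as mean ± `z` standard deviations, and the arrays are made-up values.

```python
import numpy as np

def interval_coverage(y_true, y_mean, y_std, z=2.0):
    """Fraction of true values that fall inside mean ± z * std."""
    y_true, y_mean, y_std = map(np.asarray, (y_true, y_mean, y_std))
    inside = np.abs(y_true - y_mean) <= z * y_std
    return float(inside.mean())

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_mean = np.array([1.1, 2.0, 2.0, 4.3])
y_std = np.array([0.1, 0.1, 0.2, 0.2])

coverage = interval_coverage(y_true, y_mean, y_std)
print(coverage)
```

For a well-calibrated Gaussian predictor, a ±2 standard deviation interval should cover roughly 95% of held-out values. Coverage far below that suggests overconfident uncertainty or disagreement estimates.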

## Congratulations!

You've completed the **Advanced Active Learning** tutorial! You now have a complete toolkit for intelligent experimental design using AutoRA.

You're ready to tackle the group project and apply these methods to optimize experiments in cognitive science!

### Final Thoughts

Active learning is not just about algorithms - it's about:
- **Efficiency**: Making every observation count
- **Science**: Asking the right questions at the right time
- **Discovery**: Uncovering patterns faster than traditional methods

Good luck with your projects! 🚀