# Hyperparameter Management and Experiment Tracking

This notebook demonstrates how to manage hyperparameters using `params.yaml` and track experiments across Git commits using DVC metrics.

**Key Topics:**
- Parameter management with `params.yaml`
- DVC parameter substitution in pipeline commands
- Automatic parameter change detection
- Experiment tracking via Git commits
- Metric comparison and visualization

## 1. Create and Structure `params.yaml` Configuration

The `params.yaml` file centralizes all tunable hyperparameters for the ML pipeline.

In [None]:
# Show current params.yaml
with open('../params.yaml', 'r') as f:
    print(f.read())

**Parameters Explained:**
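- `n_estimators`: Number of trees in the Random Forest (more trees usually give more stable predictions, with diminishing returns and longer training time)
- `max_depth`: Maximum depth of each tree (shallower trees reduce overfitting; deeper trees fit the training data more closely)
- `test_size`: Proportion of the data held out for validation (between 0.0 and 1.0)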
- `random_state`: Seed for reproducibility
- `target_col`: Column to predict (price)
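
Since every run depends on these values, it can help to check them before training starts. The following is a minimal sketch rather than part of the original pipeline: `validate_train_params` is a hypothetical helper, it assumes the `train` section has already been parsed into a plain dictionary, and treating `max_depth: null` as "no depth limit" is an assumption.

```python
def validate_train_params(params: dict) -> dict:
    """Check the `train` section of params.yaml (already parsed into a dict).

    Hypothetical helper for illustration; raises ValueError on bad values.
    """
    required = ("n_estimators", "max_depth", "test_size", "random_state", "target_col")
    missing = [key for key in required if key not in params]
    if missing:
        raise ValueError(f"missing parameters: {missing}")

    n_estimators = params["n_estimators"]
    if not isinstance(n_estimators, int) or n_estimators < 1:
        raise ValueError("n_estimators must be a positive integer")

    # Assumption: None means "no depth limit"; otherwise a positive integer
    max_depth = params["max_depth"]
    if max_depth is not None and (not isinstance(max_depth, int) or max_depth < 1):
        raise ValueError("max_depth must be None or a positive integer")

    if not 0.0 < params["test_size"] < 1.0:
        raise ValueError("test_size must be strictly between 0.0 and 1.0")

    return params


# Values taken from the baseline configuration used in this notebook
train_params = {
    "n_estimators": 100,
    "max_depth": 10,
    "test_size": 0.2,
    "random_state": 42,
    "target_col": "price",
}
validate_train_params(train_params)
```

Failing early on a bad value is cheaper than discovering it after a long training run.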

## 2. Update `dvc.yaml` to Use Parameters

The pipeline definition now references parameters using `${train.parameter_name}` syntax and tracks which parameters affect each stage.

In [None]:
# Show the train_model stage in dvc.yaml
import yaml

with open('../dvc.yaml', 'r') as f:
    dvc_config = yaml.safe_load(f)

train_stage = dvc_config['stages']['train_model']
print("train_model stage:")
print("-" * 80)
print(f"Command: {train_stage['cmd']}")
print(f"\nDependencies: {train_stage['deps']}")
print(f"\nTracked Parameters: {train_stage['params']}")
print(f"\nOutputs: {train_stage['outs']}")
print(f"\nMetrics: {train_stage['metrics']}")

**Key Changes:**
- Parameters are substituted at runtime: `${train.n_estimators}` → `100`
- `params` section tells DVC which parameters to track
- DVC records parameter values in `dvc.lock` for reproducibility
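
To make the substitution step concrete, here is a simplified Python sketch of the idea. It is not DVC's implementation: `interpolate` is a hypothetical function, and the command template (script path and flags) is invented for illustration, since the real command lives in `dvc.yaml`.

```python
import re


def interpolate(command: str, params: dict) -> str:
    """Replace each ${dotted.key} in `command` with its value from a nested dict."""

    def lookup(match):
        value = params
        for part in match.group(1).split("."):
            value = value[part]  # raises KeyError if the parameter is missing
        return str(value)

    return re.sub(r"\$\{([^}]+)\}", lookup, command)


# Hypothetical command template; the script name and flags are assumptions
template = (
    "python src/train.py "
    "--n-estimators ${train.n_estimators} --max-depth ${train.max_depth}"
)
params = {"train": {"n_estimators": 100, "max_depth": 10}}

print(interpolate(template, params))
```

A missing key raises an error here instead of leaving the placeholder in the command, which makes a typo in a parameter name visible immediately.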

## 3. Understand Parameter Substitution and DVC Tracking

In [None]:
# Read dvc.lock to see how DVC records parameters
with open('../dvc.lock', 'r') as f:
    dvc_lock = yaml.safe_load(f)

train_lock = dvc_lock['stages']['train_model']

print("Parameters recorded in dvc.lock:")
print("-" * 80)
if 'params' in train_lock:
    for param_file, param_values in train_lock['params'].items():
        print(f"\n{param_file}:")
        for key, value in param_values.items():
            print(f"  {key}: {value}")

**How DVC Tracks Parameters:**
1. DVC reads `params.yaml` and extracts parameter values
2. Records exact values in `dvc.lock` (not hashes, actual values)
3. When you run `dvc repro`, DVC compares the current values with those recorded in `dvc.lock`
4. If a tracked parameter changed, DVC re-runs the affected stage
5. If nothing changed, DVC skips the stage and keeps its existing outputs
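
The comparison in the steps above can be sketched in a few lines of Python. This is an illustration of the logic only, not DVC's code: `changed_params` is a hypothetical function, and the flat dotted keys are an assumption about how the tracked values are laid out. Parameters are also only one of the inputs DVC checks, alongside the stage's dependencies.

```python
def changed_params(current: dict, locked: dict) -> dict:
    """Return {key: (locked_value, current_value)} for tracked keys that differ.

    `locked` holds the values recorded on the last run; only its keys are tracked.
    """
    return {
        key: (old, current.get(key))
        for key, old in locked.items()
        if current.get(key) != old
    }


# Assumed layout: flat dotted keys mapped to the values recorded on the last run
locked = {"train.n_estimators": 100, "train.max_depth": 10}
current = {"train.n_estimators": 200, "train.max_depth": 10}

diffs = changed_params(current, locked)
print("re-run stage" if diffs else "skip stage (unchanged)")
for key, (old, new) in diffs.items():
    print(f"  {key}: {old} -> {new}")
```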

## 4. Run Pipeline with Parameterized Configuration

### How `dvc repro` Works:

```bash
dvc repro
```

1. **Read params.yaml**: Load all parameter values
2. **Substitute in command**: Replace `${train.n_estimators}` with actual value (100)
3. **Execute command**: Run the training script with substituted parameters
4. **Update dvc.lock**: Record parameter values and output hashes
5. **Log metrics**: The training script writes performance metrics to `metrics/scores.json`, which the stage declares as a metrics file
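
For the last step, the training script has to produce the metrics file itself. The sketch below shows one way that could look using only the standard library. `regression_metrics` and `write_metrics` are hypothetical names, the input values are toy numbers, and the keys `mae`, `rmse`, and `r2` are chosen to match the ones read in the next cell.

```python
import json
import math
import tempfile
from pathlib import Path


def regression_metrics(y_true, y_pred):
    """Compute MAE, RMSE and R² for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,
    }


def write_metrics(metrics, path):
    """Write metrics as JSON, creating the parent directory if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(metrics, indent=2))


# Toy values for illustration, not output from the real model
scores = regression_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 310.0])

# Written to a temporary directory here; the pipeline would use metrics/scores.json
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "metrics" / "scores.json"
    write_metrics(scores, target)
    loaded = json.loads(target.read_text())

print(loaded)
```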

In [None]:
# View current metrics
import json

with open('../metrics/scores.json', 'r') as f:
    metrics = json.load(f)

print("Current Model Performance Metrics:")
print("=" * 80)
for metric, value in metrics.items():
    if metric == 'mae':
        print(f"{metric:10} (Mean Absolute Error):        ${value:,.2f}")
    elif metric == 'rmse':
        print(f"{metric:10} (Root Mean Squared Error):    ${value:,.2f}")
    elif metric == 'r2':
        print(f"{metric:10} (R-squared Score):            {value:.4f}")

## 5. Execute Experimentation Workflow

### Experiment Workflow Pattern:

```bash
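# 1. Edit params.yaml to change hyperparameters
# 2. Run the pipeline with the new parameters
dvc repro
# 3. View the new metrics
dvc metrics show
# 4. Track the experiment in Git
git add params.yaml dvc.lock metrics/scores.json
git commit -m "Experiment: n_estimators=200, max_depth=15"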
```

In [None]:
# Show how to run experiments programmatically
import subprocess

def run_experiment(n_estimators, max_depth, experiment_name):
    """
    Run a hyperparameter experiment.
    
    Args:
        n_estimators: Number of trees
        max_depth: Maximum tree depth
        experiment_name: Name for this experiment
    """
    print(f"\n{'='*80}")
    print(f"Running Experiment: {experiment_name}")
    print(f"Parameters: n_estimators={n_estimators}, max_depth={max_depth}")
    print(f"{'='*80}")
    
    # Update params.yaml
    params_yaml = f"""# Model Training Parameters
train:
  n_estimators: {n_estimators}
  max_depth: {max_depth}
  test_size: 0.2
  random_state: 42
  target_col: price
"""
    
    # Would update file and run pipeline in real scenario
    print(f"\nUpdated params.yaml:")
    print(params_yaml)
    print("\nWould run: dvc repro")
    print("Then commit: git add params.yaml dvc.lock metrics/scores.json")

# Example: Show different experiments
run_experiment(100, 10, "Baseline")
run_experiment(200, 15, "More trees + Deeper")
run_experiment(300, 20, "Even deeper forest")
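
The cell above only prints what it would do. A version that could actually rewrite `params.yaml` and call DVC and Git might look like the sketch below. The names `experiment_commands` and `run_experiment_for_real` are hypothetical, the default `repo_root=".."` assumes the notebook sits one directory below the repository root (as the `../params.yaml` paths suggest), and `dry_run=True` is the default so nothing is executed unless you opt in.

```python
import subprocess
from pathlib import Path

# Same parameter layout as the template used in run_experiment above
PARAMS_TEMPLATE = """# Model Training Parameters
train:
  n_estimators: {n_estimators}
  max_depth: {max_depth}
  test_size: 0.2
  random_state: 42
  target_col: price
"""


def experiment_commands(n_estimators, max_depth):
    """Commands to run once params.yaml has been rewritten, as argv lists."""
    message = f"Experiment: n_estimators={n_estimators}, max_depth={max_depth}"
    return [
        ["dvc", "repro"],
        ["git", "add", "params.yaml", "dvc.lock", "metrics/scores.json"],
        ["git", "commit", "-m", message],
    ]


def run_experiment_for_real(n_estimators, max_depth, repo_root="..", dry_run=True):
    """Rewrite params.yaml, re-run the pipeline, and commit the result."""
    params_text = PARAMS_TEMPLATE.format(n_estimators=n_estimators, max_depth=max_depth)
    commands = experiment_commands(n_estimators, max_depth)
    if not dry_run:
        (Path(repo_root) / "params.yaml").write_text(params_text)
        for argv in commands:
            # check=True stops at the first failing command
            subprocess.run(argv, check=True, cwd=repo_root)
    return params_text, commands


# Dry run: returns what would be written and executed without touching anything
params_text, commands = run_experiment_for_real(200, 15)
for argv in commands:
    print(" ".join(argv))
```

Passing the commands as argument lists instead of shell strings avoids quoting problems in the commit message.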

## 6. Compare Metrics Across Experiments

In [None]:
# Simulate experiment history table
import pandas as pd

experiments = pd.DataFrame([
    {'Experiment': 'Baseline', 'n_estimators': 100, 'max_depth': 10, 'MAE': 6093.56, 'RMSE': 13605.37, 'R2': 0.2782},
    {'Experiment': 'More Trees', 'n_estimators': 200, 'max_depth': 10, 'MAE': 5900.12, 'RMSE': 13456.78, 'R2': 0.2890},
    {'Experiment': 'Deeper Trees', 'n_estimators': 200, 'max_depth': 15, 'MAE': 5761.73, 'RMSE': 13435.19, 'R2': 0.2961},
    {'Experiment': 'Even Deeper', 'n_estimators': 300, 'max_depth': 20, 'MAE': 5640.45, 'RMSE': 13312.56, 'R2': 0.3045},
])

print("\nExperiment Results Summary:")
print("=" * 100)
print(experiments.to_string(index=False))
print("\nKey Observations:")
print(f"✓ Best MAE: {experiments.loc[experiments['MAE'].idxmin(), 'Experiment']} (${experiments['MAE'].min():,.2f})")
print(f"✓ Best R²: {experiments.loc[experiments['R2'].idxmax(), 'Experiment']} ({experiments['R2'].max():.4f})")
print(f"✓ MAE improved by: ${experiments['MAE'].iloc[0] - experiments['MAE'].iloc[-1]:,.2f}")

### Compare with DVC Commands:

```bash
# View current metrics
dvc metrics show

# Compare with previous commit
dvc metrics diff

# Compare specific commits
dvc metrics diff HEAD~2 HEAD

# View all experiments in git history
dvc metrics show --all-commits
```
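
Conceptually, a metrics diff is a per-metric subtraction between two snapshots. The Python sketch below reproduces that idea for two rows of the simulated table above. `metrics_diff` is a hypothetical helper, not a DVC API, and the numbers are the simulated Baseline and Deeper Trees values rather than real results.

```python
def metrics_diff(old: dict, new: dict) -> dict:
    """Per-metric change (new - old) for metrics present in both snapshots."""
    return {name: round(new[name] - old[name], 4) for name in old if name in new}


# Simulated values from the experiment table: Baseline vs. Deeper Trees
baseline = {"mae": 6093.56, "rmse": 13605.37, "r2": 0.2782}
deeper = {"mae": 5761.73, "rmse": 13435.19, "r2": 0.2961}

changes = metrics_diff(baseline, deeper)
for name, change in changes.items():
    print(f"{name:5} {baseline[name]:>12.4f} {deeper[name]:>12.4f} {change:>+12.4f}")
```

A negative change is an improvement for `mae` and `rmse`, while a positive change is an improvement for `r2`.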

## 7. Visualize Parameter vs Metric Relationships

In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Create visualization of parameter vs metric relationships
fig, axes = plt.subplots(2, 3, figsize=(16, 10))
fig.suptitle('Hyperparameter Impact on Model Performance', fontsize=16, fontweight='bold')

# Plot 1: n_estimators vs MAE
axes[0, 0].scatter(experiments['n_estimators'], experiments['MAE'], s=100, alpha=0.6, color='blue')
axes[0, 0].set_xlabel('n_estimators (Number of Trees)', fontsize=10)
axes[0, 0].set_ylabel('MAE ($)', fontsize=10)
axes[0, 0].set_title('Trees vs MAE (Lower is Better)', fontsize=11, fontweight='bold')
axes[0, 0].grid(True, alpha=0.3)

# Plot 2: max_depth vs MAE
axes[0, 1].scatter(experiments['max_depth'], experiments['MAE'], s=100, alpha=0.6, color='green')
axes[0, 1].set_xlabel('max_depth (Tree Depth)', fontsize=10)
axes[0, 1].set_ylabel('MAE ($)', fontsize=10)
axes[0, 1].set_title('Depth vs MAE (Lower is Better)', fontsize=11, fontweight='bold')
axes[0, 1].grid(True, alpha=0.3)

# Plot 3: Combined parameters vs MAE
scatter = axes[0, 2].scatter(experiments['n_estimators'], experiments['max_depth'], 
                              c=experiments['MAE'], s=200, cmap='RdYlGn_r', alpha=0.7)
axes[0, 2].set_xlabel('n_estimators', fontsize=10)
axes[0, 2].set_ylabel('max_depth', fontsize=10)
axes[0, 2].set_title('Parameter Space (Color=MAE)', fontsize=11, fontweight='bold')
plt.colorbar(scatter, ax=axes[0, 2], label='MAE')

# Plot 4: n_estimators vs R2
axes[1, 0].scatter(experiments['n_estimators'], experiments['R2'], s=100, alpha=0.6, color='red')
axes[1, 0].set_xlabel('n_estimators (Number of Trees)', fontsize=10)
axes[1, 0].set_ylabel('R² Score', fontsize=10)
axes[1, 0].set_title('Trees vs R² (Higher is Better)', fontsize=11, fontweight='bold')
axes[1, 0].grid(True, alpha=0.3)

# Plot 5: max_depth vs R2
axes[1, 1].scatter(experiments['max_depth'], experiments['R2'], s=100, alpha=0.6, color='purple')
axes[1, 1].set_xlabel('max_depth (Tree Depth)', fontsize=10)
axes[1, 1].set_ylabel('R² Score', fontsize=10)
axes[1, 1].set_title('Depth vs R² (Higher is Better)', fontsize=11, fontweight='bold')
axes[1, 1].grid(True, alpha=0.3)

# Plot 6: MAE vs R2 tradeoff
axes[1, 2].scatter(experiments['R2'], experiments['MAE'], s=100, alpha=0.6, color='orange')
for i, exp in enumerate(experiments['Experiment']):
    axes[1, 2].annotate(exp, (experiments['R2'].iloc[i], experiments['MAE'].iloc[i]), 
                        fontsize=8, alpha=0.7)
axes[1, 2].set_xlabel('R² Score', fontsize=10)
axes[1, 2].set_ylabel('MAE ($)', fontsize=10)
axes[1, 2].set_title('R² vs MAE Tradeoff', fontsize=11, fontweight='bold')
axes[1, 2].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\n✓ Visualization shows how each hyperparameter affects model performance")
print("✓ Use this to identify optimal parameter combinations")

## 8. Track and Analyze Experiment History

In [None]:
# Create comprehensive experiment tracking
print("\nEXPERIMENT TRACKING SUMMARY")
print("=" * 100)

# Rank experiments by different metrics
print("\n📊 RANKING BY MAE (Lower is Better):")
mae_ranked = experiments.sort_values('MAE')
for i, (idx, row) in enumerate(mae_ranked.iterrows(), 1):
    print(f"{i}. {row['Experiment']:20} | MAE: ${row['MAE']:>8,.2f} | R²: {row['R2']:.4f}")

print("\n📊 RANKING BY R² (Higher is Better):")
r2_ranked = experiments.sort_values('R2', ascending=False)
for i, (idx, row) in enumerate(r2_ranked.iterrows(), 1):
    print(f"{i}. {row['Experiment']:20} | R²: {row['R2']:.4f} | MAE: ${row['MAE']:>8,.2f}")

print("\n🎯 RECOMMENDATIONS:")
best_mae_exp = mae_ranked.iloc[0]
best_r2_exp = r2_ranked.iloc[0]
print(f"✓ Best error reduction: {best_mae_exp['Experiment']} (MAE: ${best_mae_exp['MAE']:,.2f})")
print(f"✓ Best overall fit: {best_r2_exp['Experiment']} (R²: {best_r2_exp['R2']:.4f})")

# Calculate improvement from baseline
baseline_mae = experiments.loc[0, 'MAE']
best_mae = best_mae_exp['MAE']
improvement_pct = ((baseline_mae - best_mae) / baseline_mae) * 100
print(f"✓ Improvement over baseline: {improvement_pct:.2f}% (${baseline_mae - best_mae:,.2f})")

In [None]:
# Show how to retrieve this from Git history in a real scenario
print("\nREAL EXPERIMENT TRACKING WORKFLOW:")
print("=" * 100)
print("""
import json
import subprocess

import pandas as pd
import yaml

def read_at(commit, path):
    # Read a file as it was at a commit, without changing the working tree
    return subprocess.check_output(['git', 'show', f'{commit}:{path}'], text=True)

# Get the hashes of all commits whose message matches 'Exp'
commits = subprocess.check_output(
    ['git', 'log', '--grep=Exp', '--format=%h'], text=True
).split()

# For each commit, extract params and metrics
results = []
for commit_hash in commits:
    params = yaml.safe_load(read_at(commit_hash, 'params.yaml'))
    metrics = json.loads(read_at(commit_hash, 'metrics/scores.json'))

    # Store results
    results.append({
        'commit': commit_hash,
        'n_estimators': params['train']['n_estimators'],
        'max_depth': params['train']['max_depth'],
        'mae': metrics['mae'],
        'rmse': metrics['rmse'],
        'r2': metrics['r2']
    })

# Create DataFrame and analyze
df = pd.DataFrame(results)
print(df)
""")

## Summary: Production MLOps Workflow

### ✅ What You've Learned:

1. **Configuration Management**: `params.yaml` separates hyperparameters from pipeline logic
2. **Automatic Change Detection**: DVC detects when parameters change and re-runs affected stages
3. **Version Control**: Each experiment is a Git commit with tracked parameters and metrics
4. **Metric Comparison**: Use `dvc metrics diff` to quantitatively compare experiments
5. **Reproducibility**: `dvc.lock` stores exact parameter values for every run
6. **Analysis**: Visualize parameter-metric relationships to find optimal configurations

### 🚀 Quick Reference:

```bash
# Edit hyperparameters
vim params.yaml

# Run experiment
dvc repro

# View metrics
dvc metrics show

# Compare with previous
dvc metrics diff

# Commit experiment
git add params.yaml dvc.lock metrics/scores.json
git commit -m "Experiment: n_estimators=200, max_depth=15"
```

### 🎯 Next Steps:

- Create experiment branches for systematic hyperparameter tuning
- Use Optuna or Hyperopt for automated hyperparameter optimization
- Integrate with MLflow for advanced experiment tracking
- Set up CI/CD to run experiments on push