# Experiment Tracking & Live Training Visualization Demo

This notebook demonstrates the experiment tracking and live visualization utilities used for CSVy hockey predictions.

## Features
- **MLflow Integration**: Track experiments, parameters, metrics, and models
- **Live Progress Bars**: Real-time progress with tqdm/rich
- **Training Callbacks**: Early stopping, metric logging, checkpoints
- **Live Plots**: Real-time loss curves and metric visualization
- **Experiment Comparison**: Compare runs side-by-side

In [None]:
# Required imports
import sys
import numpy as np
import pandas as pd
from pathlib import Path

# Add project root to path (assumes the notebook runs from a subdirectory of the project)
project_root = Path.cwd().parent
sys.path.insert(0, str(project_root))

# Import our tracking modules
from utils.experiment_tracker import ExperimentTracker, create_tracker
from utils.training_callbacks import (
    TrainingCallback, ProgressBar, EarlyStopping, create_callback
)
from utils.live_dashboard import LivePlotter, TrainingDashboard, create_dashboard

print("Modules imported successfully!")

## 1. Basic Experiment Tracking with MLflow

`ExperimentTracker` is an MLflow-backed interface for logging parameters, metrics, and artifacts from a run.

In [None]:
# Create an experiment tracker
tracker = ExperimentTracker(
    experiment_name="demo_hockey_prediction",
    tracking_uri="./mlruns",
    tags={"project": "CSVy", "demo": "true"}
)

print(f"Experiment ID: {tracker.experiment_id}")

In [None]:
# Simulate training with tracking
np.random.seed(42)

with tracker.start_run(run_name="simulated_xgboost"):
    # Log hyperparameters
    tracker.log_params({
        "learning_rate": 0.05,
        "n_estimators": 500,
        "max_depth": 6,
        "model_type": "XGBoost"
    })
    
    # Simulate training epochs
    for epoch in range(50):
        # Simulate decreasing loss
        train_loss = 0.5 * np.exp(-epoch * 0.05) + np.random.normal(0, 0.02)
        val_loss = 0.55 * np.exp(-epoch * 0.04) + np.random.normal(0, 0.03)
        rmse = 0.8 * np.exp(-epoch * 0.03) + np.random.normal(0, 0.02)
        
        # Log metrics at each step
        tracker.log_metrics({
            "train_loss": max(0, train_loss),
            "val_loss": max(0, val_loss),
            "rmse": max(0, rmse)
        }, step=epoch)
    
    # Log final model info
    tracker.log_dict(
        {"feature_importance": {"home_elo": 0.35, "away_elo": 0.30, "rest_days": 0.20}},
        "feature_importance.json"
    )
    
    print(f"Run ID: {tracker.get_run_id()}")
    print(f"Best RMSE: {tracker.get_best_metric('rmse', mode='min'):.4f}")
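
To make the `get_best_metric('rmse', mode='min')` call above more concrete, here is a minimal sketch of how a best-metric lookup could work over a per-step history. `MetricHistory` is a hypothetical stand-in written for illustration; it does not use MLflow and is not the project's `ExperimentTracker`.

```python
from collections import defaultdict


class MetricHistory:
    """Hypothetical in-memory stand-in for a tracker's metric log."""

    def __init__(self):
        # metric name -> list of (step, value) pairs in logging order
        self._history = defaultdict(list)

    def log_metrics(self, metrics, step):
        for name, value in metrics.items():
            self._history[name].append((step, float(value)))

    def get_best_metric(self, name, mode="min"):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        values = [value for _, value in self._history.get(name, [])]
        if not values:
            raise KeyError(f"no values logged for metric {name!r}")
        return min(values) if mode == "min" else max(values)


history = MetricHistory()
for step, rmse in enumerate([0.80, 0.61, 0.47, 0.52]):
    history.log_metrics({"rmse": rmse}, step=step)

best_rmse = history.get_best_metric("rmse", mode="min")
print(f"Best RMSE: {best_rmse:.4f}")
```

Because the simulated metrics include noise, the best value is not necessarily the last one logged, which is why the lookup scans the whole history.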

## 2. Progress Bars with tqdm/rich

Use `ProgressBar` for visual feedback during training loops.

In [None]:
import time

# Simulated training with progress bar
n_epochs = 30

with ProgressBar(n_epochs, "Training Model", backend="auto") as pbar:
    for epoch in range(n_epochs):
        # Simulate work
        time.sleep(0.05)
        
        # Calculate mock metrics
        loss = 0.5 * np.exp(-epoch * 0.1)
        accuracy = 1 - loss
        
        # Update progress bar with metrics
        pbar.update(1, {"loss": loss, "acc": accuracy})

print("Training complete!")
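
The context-manager pattern used above (`with ... as pbar:` followed by `pbar.update(1, metrics)`) can be sketched with the standard library alone. `SimpleProgressBar` below is a hypothetical illustration of that pattern, not the project's `ProgressBar`, and it does not depend on tqdm or rich.

```python
import sys


class SimpleProgressBar:
    """Hypothetical text progress bar; not the project's ProgressBar."""

    def __init__(self, total, description="", width=20, stream=None):
        if total <= 0:
            raise ValueError("total must be positive")
        self.total = total
        self.description = description
        self.width = width
        self.stream = stream if stream is not None else sys.stdout
        self.count = 0

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.stream.write("\n")  # leave the finished bar on its own line
        return False  # never swallow exceptions raised in the training loop

    def render(self, metrics=None):
        filled = int(self.width * self.count / self.total)
        bar = "#" * filled + "-" * (self.width - filled)
        postfix = ", ".join(f"{k}={v:.4f}" for k, v in (metrics or {}).items())
        return f"{self.description} [{bar}] {self.count}/{self.total} {postfix}".rstrip()

    def update(self, n=1, metrics=None):
        self.count = min(self.total, self.count + n)
        # "\r" returns to the start of the line so the bar redraws in place
        self.stream.write("\r" + self.render(metrics))
        self.stream.flush()


with SimpleProgressBar(4, "Training") as pbar:
    for epoch in range(4):
        pbar.update(1, {"loss": 0.5 / (epoch + 1)})
```

Returning `False` from `__exit__` keeps errors visible: if the loop body raises, the bar is closed and the exception still propagates.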

## 3. Training Callbacks with Early Stopping

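Use `TrainingCallback` to manage the training loop: it logs metrics at the end of each epoch and can stop training early when a monitored metric stops improving.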

In [None]:
# Create tracker and callback together
tracker = create_tracker("callback_demo")

with tracker.start_run("early_stopping_demo"):
    # Create callback with early stopping
    callback = TrainingCallback(
        tracker=tracker,
        verbose=True,
        early_stopping_patience=10,
        early_stopping_metric="val_loss",
        early_stopping_mode="min"
    )
    
    n_epochs = 100
    callback.on_train_start(n_epochs)
    
    for epoch in range(n_epochs):
        callback.on_epoch_start(epoch)
        
        # Simulate training (loss decreases then plateaus)
        train_loss = 0.5 * np.exp(-epoch * 0.1) + 0.1
        
        # Validation loss stops improving after epoch 30 (plateau around 0.2)
        if epoch < 30:
            val_loss = 0.55 * np.exp(-epoch * 0.08) + 0.15
        else:
            val_loss = 0.2 + np.random.normal(0, 0.02)  # Plateaus
        
        # Log metrics and check for early stopping
        should_continue = callback.on_epoch_end(epoch, {
            "train_loss": train_loss,
            "val_loss": val_loss
        })
        
        if not should_continue:
            break
    
    callback.on_train_end()
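
The patience rule behind `early_stopping_patience`, `early_stopping_metric`, and `early_stopping_mode` can be illustrated in isolation. The sketch below assumes the common convention that training stops once the monitored metric has failed to improve for `patience` consecutive epochs. `PatienceStopper` and its `min_delta` parameter are hypothetical; the project's `TrainingCallback` may differ in details.

```python
class PatienceStopper:
    """Hypothetical patience-based early stopping, independent of TrainingCallback."""

    def __init__(self, patience=10, mode="min", min_delta=0.0):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.patience = patience
        self.mode = mode
        self.min_delta = min_delta
        self.best = None
        self.best_epoch = None
        self.stale_epochs = 0

    def _improved(self, value):
        if self.best is None:
            return True  # the first observation always counts as an improvement
        if self.mode == "min":
            return value < self.best - self.min_delta
        return value > self.best + self.min_delta

    def step(self, epoch, value):
        """Record one epoch; return True to continue training, False to stop."""
        if self._improved(value):
            self.best, self.best_epoch = value, epoch
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs < self.patience


stopper = PatienceStopper(patience=3, mode="min")
val_losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.35, 0.30]
stopped_at = None
for epoch, val_loss in enumerate(val_losses):
    if not stopper.step(epoch, val_loss):
        stopped_at = epoch
        break

print(f"stopped at epoch {stopped_at}, best {stopper.best} at epoch {stopper.best_epoch}")
```

In this sketch a tie with the current best does not reset the counter, so training stops before the later value of 0.30 is ever seen. That is the usual trade-off of a small patience value.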

## 4. Live Plot Visualization

Use `LivePlotter` for real-time training curves.

In [None]:
%matplotlib inline

# Create live plotter
plotter = LivePlotter(
    metrics=['train_loss', 'val_loss', 'rmse', 'mae'],
    figsize=(12, 5),
    update_interval=5  # Update visual every 5 epochs
)

plotter.start("Live Training Visualization")

# Simulate training
for epoch in range(100):
    # Simulate metrics
    train_loss = 0.6 * np.exp(-epoch * 0.05) + np.random.normal(0, 0.01)
    val_loss = 0.65 * np.exp(-epoch * 0.04) + np.random.normal(0, 0.015)
    rmse = 0.9 * np.exp(-epoch * 0.03) + np.random.normal(0, 0.01)
    mae = 0.7 * np.exp(-epoch * 0.035) + np.random.normal(0, 0.008)
    
    plotter.update({
        'train_loss': max(0, train_loss),
        'val_loss': max(0, val_loss),
        'rmse': max(0, rmse),
        'mae': max(0, mae)
    }, epoch=epoch)

plotter.finalize("training_curves.png")
print("Plot saved to training_curves.png")
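
The `update_interval=5` setting separates recording a metric from redrawing the figure. A rough sketch of that throttling idea is shown below, with the drawing step replaced by a pluggable callback so that it runs without matplotlib. `ThrottledHistory` and its `redraw` hook are hypothetical names, not part of `LivePlotter`.

```python
class ThrottledHistory:
    """Hypothetical buffer that records every update but redraws every N epochs."""

    def __init__(self, metrics, update_interval=5, redraw=None):
        self.history = {name: [] for name in metrics}
        self.epochs = []
        self.update_interval = update_interval
        self.redraw = redraw or (lambda epochs, history: None)
        self.redraw_count = 0

    def update(self, values, epoch):
        self.epochs.append(epoch)
        for name, series in self.history.items():
            # Missing metrics become None so every series stays aligned with epochs
            series.append(values.get(name))
        if (epoch + 1) % self.update_interval == 0:
            self._draw()

    def finalize(self):
        # Draw once more at the end so the last few epochs are not left out
        self._draw()

    def _draw(self):
        self.redraw_count += 1
        self.redraw(self.epochs, self.history)


buffer = ThrottledHistory(["train_loss", "val_loss"], update_interval=5)
for epoch in range(12):
    buffer.update({"train_loss": 1.0 / (epoch + 1), "val_loss": 1.2 / (epoch + 1)}, epoch)
buffer.finalize()

print(f"{len(buffer.epochs)} epochs recorded, {buffer.redraw_count} redraws")
```

Every epoch is stored, but only epochs 4 and 9 trigger a redraw here; the final draw in `finalize()` picks up epochs 10 and 11.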

## 5. Full Training Dashboard

Combine all features with `TrainingDashboard`.

In [None]:
# Create comprehensive dashboard
dashboard = create_dashboard(metrics=['loss', 'val_loss', 'rmse'])
dashboard.start()

# Simulate full training run
for epoch in range(50):
    # Simulate epoch-level metrics
    loss = 0.5 * np.exp(-epoch * 0.06) + np.random.normal(0, 0.01)
    val_loss = 0.55 * np.exp(-epoch * 0.05) + np.random.normal(0, 0.015)
    rmse = 0.8 * np.exp(-epoch * 0.04) + np.random.normal(0, 0.01)
    
    dashboard.log_epoch(epoch, {
        'loss': max(0, loss),
        'val_loss': max(0, val_loss),
        'rmse': max(0, rmse)
    })

dashboard.stop()
dashboard.print_summary()

## 6. Experiment Comparison

Compare multiple runs side-by-side.

In [None]:
from utils.live_dashboard import MetricComparison

# Simulate multiple experiment runs
comparison = MetricComparison()

# Baseline model
baseline_rmse = [0.8 * np.exp(-i * 0.03) for i in range(50)]
comparison.add_experiment("Baseline", {"rmse": baseline_rmse})

# Improved model with faster convergence
improved_rmse = [0.7 * np.exp(-i * 0.05) for i in range(50)]
comparison.add_experiment("Improved", {"rmse": improved_rmse})

# Regularized model: fastest decay, but the +0.05 offset keeps its final RMSE above "Improved"
best_rmse = [0.65 * np.exp(-i * 0.06) + 0.05 for i in range(50)]
comparison.add_experiment("Best (Regularized)", {"rmse": best_rmse})

# Create comparison plot
fig = comparison.plot(metrics_to_plot=['rmse'], figsize=(10, 5))
fig.savefig("experiment_comparison.png", dpi=150)
print("Comparison saved to experiment_comparison.png")
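
A plot shows the shape of each curve, but a small numeric summary makes the ranking explicit. The sketch below rebuilds the same three closed-form curves with the standard library and sorts the runs by final RMSE. The `summarize` helper is hypothetical and is not part of `MetricComparison`.

```python
import math

# Same closed-form curves as the comparison cell, rebuilt without NumPy
curves = {
    "Baseline": [0.8 * math.exp(-i * 0.03) for i in range(50)],
    "Improved": [0.7 * math.exp(-i * 0.05) for i in range(50)],
    "Best (Regularized)": [0.65 * math.exp(-i * 0.06) + 0.05 for i in range(50)],
}


def summarize(curves):
    """Return (name, best_value, best_epoch, final_value) rows sorted by final value."""
    rows = []
    for name, values in curves.items():
        best_epoch = min(range(len(values)), key=values.__getitem__)
        rows.append((name, values[best_epoch], best_epoch, values[-1]))
    return sorted(rows, key=lambda row: row[3])


ranking = summarize(curves)
for name, best, best_epoch, final in ranking:
    print(f"{name:<20} best={best:.4f} @ epoch {best_epoch:>2}  final={final:.4f}")
```

With these particular curves, the run labelled "Best (Regularized)" ranks second by final RMSE: its constant 0.05 term acts as a floor, so "Improved" overtakes it in the later epochs.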

## 7. Real Model Training Example

Integrate tracking with actual XGBoost training.

In [None]:
try:
    import xgboost as xgb
    from sklearn.datasets import make_regression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_squared_error
    
    # Create synthetic data
    X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    
    # Create tracker
    tracker = create_tracker("real_xgboost_training")
    
    with tracker.start_run("xgboost_with_tracking"):
        # Define parameters
        params = {
            "objective": "reg:squarederror",
            "learning_rate": 0.05,
            "max_depth": 5,
            "n_estimators": 100,
            "subsample": 0.8
        }
        tracker.log_params(params)
        
        # Create callback for tracking
        callback = TrainingCallback(
            tracker=tracker,
            verbose=True,
            log_frequency=10
        )
        
        callback.on_train_start(params["n_estimators"])
        
        # Build the model from the same parameters that were logged,
        # including n_estimators, so the run record matches the trained model
        model = xgb.XGBRegressor(**params)
        
        # Fit with eval set for validation
        model.fit(
            X_train, y_train,
            eval_set=[(X_train, y_train), (X_test, y_test)],
            verbose=False
        )
        
        # Get training history from evals_result
        results = model.evals_result()
        for i, (train_rmse, val_rmse) in enumerate(zip(
            results['validation_0']['rmse'],
            results['validation_1']['rmse']
        )):
            callback.on_epoch_end(i, {'train_rmse': train_rmse, 'val_rmse': val_rmse})
        
        callback.on_train_end()
        
        # Final evaluation
        y_pred = model.predict(X_test)
        final_rmse = np.sqrt(mean_squared_error(y_test, y_pred))
        tracker.log_metric("final_test_rmse", final_rmse)
        
        # Log model
        tracker.log_model(model, "xgboost_model", flavor="sklearn")
        
        print(f"\nFinal Test RMSE: {final_rmse:.4f}")
        
except ImportError:
    print("XGBoost not installed. Run: pip install xgboost")
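
The replay loop above relies on the shape of `evals_result()`: a dict keyed by `validation_0`, `validation_1`, and so on (in `eval_set` order), each mapping a metric name to a per-round list. The following sketch shows one way to turn that structure into per-iteration metric dicts. `iter_eval_history` is a hypothetical helper, and the numbers are placeholders rather than real training output.

```python
def iter_eval_history(evals_result, metric="rmse", names=None):
    """Yield (iteration, {"<label>_<metric>": value}) from an evals_result-style dict."""
    # Map XGBoost's positional keys to readable labels (assumed eval_set order)
    names = names or {"validation_0": "train", "validation_1": "val"}
    series = {label: evals_result[key][metric] for key, label in names.items()}
    n_rounds = min(len(values) for values in series.values())
    for i in range(n_rounds):
        yield i, {f"{label}_{metric}": values[i] for label, values in series.items()}


# Placeholder numbers shaped like XGBRegressor.evals_result(); not real training output
example = {
    "validation_0": {"rmse": [0.90, 0.70, 0.55]},
    "validation_1": {"rmse": [0.95, 0.78, 0.66]},
}

rows = list(iter_eval_history(example))
for iteration, metrics in rows:
    print(iteration, metrics)
```

Each yielded pair could then be passed to a call such as `callback.on_epoch_end(iteration, metrics)`. Because the replay happens after `fit()` returns, it records the history but cannot stop training early.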

## 8. Launch MLflow UI

Compare the top runs programmatically, then optionally launch the MLflow web dashboard to browse all experiments.

In [None]:
# View best runs from an experiment
comparison_df = ExperimentTracker.compare_runs(
    experiment_name="demo_hockey_prediction",
    metric="rmse",
    top_n=5
)

if not comparison_df.empty:
    print("Top runs comparison:")
    display(comparison_df)
else:
    print("No runs found (MLflow may not be available)")

In [None]:
# Uncomment to launch MLflow UI (will block the notebook)
# This opens a web browser at http://localhost:5000

# ExperimentTracker.launch_ui(port=5000)
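
If blocking the notebook is a concern, one alternative is to start the `mlflow ui` command-line tool as a child process. The sketch below assumes the `mlflow` executable is on the PATH and that runs are stored under `./mlruns`, as in the first tracker example. Both helper functions are hypothetical and are not part of `ExperimentTracker`.

```python
import subprocess


def build_mlflow_ui_command(port=5000, backend_store_uri="./mlruns"):
    """Return the argv list for `mlflow ui` without launching anything."""
    return [
        "mlflow", "ui",
        "--backend-store-uri", backend_store_uri,
        "--port", str(port),
    ]


def launch_mlflow_ui(port=5000, backend_store_uri="./mlruns"):
    """Start the UI as a child process so the notebook kernel is not blocked."""
    return subprocess.Popen(build_mlflow_ui_command(port, backend_store_uri))


command = build_mlflow_ui_command(port=5000)
print(" ".join(command))

# Uncomment to start the server; call server.terminate() when finished.
# server = launch_mlflow_ui(port=5000)
```

Keeping the command builder separate from the launcher makes it easy to inspect or log the exact command before running it.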

## Summary

This tracking system provides:

1. **`ExperimentTracker`**: MLflow-backed experiment logging
   - `log_params()`: Track hyperparameters
   - `log_metrics()`: Track training metrics
   - `log_model()`: Save trained models
   - `compare_runs()`: Compare experiments

2. **`TrainingCallback`**: Training loop management
   - Progress tracking
   - Early stopping
   - Metric history

3. **`LivePlotter`/`TrainingDashboard`**: Real-time visualization
   - Live loss curves
   - Multi-metric plotting
   - Experiment comparison (via `MetricComparison`)

4. **`ProgressBar`**: Visual progress indicators
   - tqdm/rich backends
   - Metric display