# Simplified Optimizer Evaluation

This notebook implements a streamlined workflow for optimizer evaluation with three core steps:

1. Train models with different optimizers via subprocess calls to `main.py`
2. Scan for experiment results in the logs directory
3. Create performance comparison matrices for visualization

Training, result discovery, and analysis are delegated to helper functions in `optimizer_utils`, which keeps the notebook short and focused on optimizer performance analysis.

In [1]:
# Import utility functions
from optimizer_utils import (
    # Constants
    PROJECT_ROOT, LOGS_DIR, METRICS_DIR, EXPORTS_DIR, MAIN_SCRIPT,
    MODELS, OPTIMIZERS, EPOCHS, BATCH_SIZE,
    
    # Functions
    train_model, run_experiments, scan_for_experiment_results,
    load_experiment_results, get_summary_metrics,
    plot_learning_curves, create_performance_matrix, find_best_optimizer
)

# Import standard libraries
import os
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import display, HTML

# Set up models and optimizers to evaluate
selected_models = ['Base', 'Wide', 'Advanced']
selected_optimizers = ['Adam', 'ImprovedAdam', 'Nadam', 'RMSprop', 'SGD']

print(f"Selected models to evaluate: {selected_models}")
print(f"Selected optimizers to evaluate: {selected_optimizers}")

Project Root: /Users/marcofurrer/Documents/github/dspro2
Logs Directory: /Users/marcofurrer/Documents/github/dspro2/logs
Metrics Directory: /Users/marcofurrer/Documents/github/dspro2/logs/metrics
Exports Directory: /Users/marcofurrer/Documents/github/dspro2/exports
Models to evaluate: ['Base', 'Wide', 'Advanced']
Optimizers to evaluate: ['Adam', 'ImprovedAdam', 'Nadam', 'RMSprop', 'SGD', 'Adadelta']
Selected models to evaluate: ['Base', 'Wide', 'Advanced']
Selected optimizers to evaluate: ['Adam', 'ImprovedAdam', 'Nadam', 'RMSprop', 'SGD']


## Step 1: Train Models via Subprocess

We'll use subprocess calls to train models with different optimizers. You can choose to run all combinations or specific ones.

In [4]:
# Option 1: Run specific model-optimizer combinations
# Set pairs to a list of (model, optimizer) tuples you want to train
pairs = [
    ('Base', 'Adam'),
    ('Wide', 'ImprovedAdam')
]

# Set run_specific to True to execute this cell
run_specific = True

if run_specific and pairs:
    for model, optimizer in pairs:
        print(f"\nTraining {model} model with {optimizer} optimizer")
        result_path = train_model(model, optimizer, epochs=EPOCHS, batch_size=BATCH_SIZE)
        print(f"Results saved to: {result_path}")
else:
    print("Skipping specific experiments. Set run_specific=True to execute.")


Training Base model with Adam optimizer
Running: python /Users/marcofurrer/Documents/github/dspro2/main.py --model base --optimizer Adam --epochs 15 --batch_size 32


KeyboardInterrupt: 
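The `Running:` line in the output above shows the command that `train_model` launches. The sketch below shows one way such a command could be assembled and executed. `build_train_command` and `run_training` are hypothetical names, and the real implementation in `optimizer_utils` may differ. The flags and the lower-cased model name are taken from the logged command; using `sys.executable` instead of a bare `python` is an assumption.

```python
import subprocess
import sys


def build_train_command(main_script, model, optimizer, epochs, batch_size):
    """Assemble the argument list for one training run (hypothetical helper)."""
    return [
        sys.executable,            # assumption: reuse the notebook's interpreter
        str(main_script),
        "--model", model.lower(),  # the logged command passes the model name in lower case
        "--optimizer", optimizer,
        "--epochs", str(epochs),
        "--batch_size", str(batch_size),
    ]


def run_training(command):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(command, check=True)


cmd = build_train_command("main.py", "Base", "Adam", epochs=15, batch_size=32)
print(" ".join(cmd[1:]))
```

Passing the arguments as a list avoids shell quoting issues, and `check=True` surfaces a failed training run as an exception instead of letting the loop continue silently.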

In [None]:
# Option 2: Run all combinations of models and optimizers
run_all = False  # Set to True to run all experiments

if run_all:
    print(f"Running all combinations - {len(selected_models) * len(selected_optimizers)} experiments")
    experiment_results = run_experiments(selected_models, selected_optimizers, 
                                        epochs=EPOCHS, batch_size=BATCH_SIZE)
else:
    print("Skipping full experiment suite. Set run_all=True to execute all combinations.")
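With three models and five optimizers selected, the full suite covers 3 × 5 = 15 experiments. How `run_experiments` iterates internally is not shown here; as an illustration only, the full grid can be enumerated with `itertools.product`, which yields tuples in the same `(model, optimizer)` form as `pairs` in Option 1.

```python
from itertools import product

models = ['Base', 'Wide', 'Advanced']
optimizers = ['Adam', 'ImprovedAdam', 'Nadam', 'RMSprop', 'SGD']

# Every (model, optimizer) pair, in the same tuple form as `pairs` in Option 1
all_pairs = list(product(models, optimizers))
print(len(all_pairs))
```

Such a list could also be filtered and assigned to `pairs` to run only a subset of the grid.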

## Step 2: Scan for Experiment Results

Now we'll scan the logs directory (and the exports directory) for existing experiment results. This picks up the experiments we just ran as well as any earlier runs.

In [None]:
# Scan for experiment results
print("Scanning for experiment results in logs directory and exports...")
experiment_paths = scan_for_experiment_results()

# Display summary of found experiments
if experiment_paths:
    print(f"\nFound {len(experiment_paths)} experiment results:")
    # Group by model
    model_groups = {}
    for (model, optimizer) in experiment_paths:
        if model not in model_groups:
            model_groups[model] = []
        model_groups[model].append(optimizer)
    
    # Display grouped results
    for model, optimizers in model_groups.items():
        print(f"\n{model} model with optimizers: {', '.join(optimizers)}")
else:
    print("No experiment results found. Please run training first.")
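The loop above implies that `scan_for_experiment_results` returns a mapping keyed by `(model, optimizer)` tuples. The sketch below shows how such a mapping could be built with `pathlib`. The file naming scheme `<model>_<optimizer>_metrics.csv` and the function name `scan_results` are assumptions made for illustration; the actual layout of the logs directory is not shown in this notebook.

```python
from pathlib import Path


def scan_results(directory):
    """Map (model, optimizer) -> metrics file, assuming a hypothetical
    '<model>_<optimizer>_metrics.csv' naming scheme."""
    found = {}
    for path in sorted(Path(directory).glob("*_metrics.csv")):
        parts = path.stem.split("_")  # e.g. ['Base', 'Adam', 'metrics']
        if len(parts) != 3:
            continue  # skip files that do not match the assumed scheme
        model, optimizer, _ = parts
        found[(model, optimizer)] = path
    return found
```

Keying the result by tuple keeps the grouping code simple, since iterating over the dictionary yields the `(model, optimizer)` pairs directly.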

In [None]:
# Load metrics from experiment results
print("Loading metrics from experiment results...")
all_metrics = load_experiment_results(experiment_paths)

if not all_metrics.empty:
    # Display the first few rows
    print("Metrics preview:")
    display(all_metrics.head())
    
    # Get unique models and optimizers in the data
    found_models = sorted(all_metrics['model'].unique())
    found_optimizers = sorted(all_metrics['optimizer'].unique())
    
    print(f"\nFound metrics for models: {found_models}")
    print(f"Found metrics for optimizers: {found_optimizers}")
    
    # Calculate summary metrics
    print("\nCalculating summary metrics...")
    summary_metrics = get_summary_metrics(all_metrics)
    display(summary_metrics)
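else:
    print("No metrics data found. Please check experiment results.")
    # Define an empty summary so later cells can safely test summary_metrics.empty
    summary_metrics = pd.DataFrame()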
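The summary table has one row per model and optimizer, with columns such as `min_val_loss` and `convergence_epoch`. A minimal sketch of how such a table could be derived from per-epoch metrics follows. It assumes an `epoch` column and defines the convergence epoch as the epoch with the lowest validation loss; both are assumptions, and `get_summary_metrics` may use a different definition. The values in `toy` are made up for illustration.

```python
import pandas as pd


def summarize(metrics):
    """One row per (model, optimizer): best validation loss and the epoch
    at which it occurred (assumed definition of convergence)."""
    rows = []
    for (model, optimizer), run in metrics.groupby(["model", "optimizer"]):
        best = run.loc[run["val_loss"].idxmin()]  # row with the lowest val_loss
        rows.append({
            "model": model,
            "optimizer": optimizer,
            "min_val_loss": float(best["val_loss"]),
            "convergence_epoch": int(best["epoch"]),
        })
    return pd.DataFrame(rows)


# Made-up per-epoch values, for illustration only
toy = pd.DataFrame({
    "model": ["Base"] * 6,
    "optimizer": ["Adam"] * 3 + ["SGD"] * 3,
    "epoch": [1, 2, 3, 1, 2, 3],
    "val_loss": [0.9, 0.5, 0.6, 1.0, 0.8, 0.7],
})
summary = summarize(toy)
print(summary)
```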

## Step 3: Create Performance Comparison Matrix

We first plot learning curves for validation loss and accuracy, then generate performance comparison matrices to visualize how different optimizers perform across model architectures.

In [None]:
# Plot learning curves for validation loss
if not all_metrics.empty:
    print("Plotting validation loss learning curves...")
    plot_learning_curves(all_metrics, 'val_loss')
else:
    print("No metrics data available for plotting learning curves.")

In [None]:
# Plot learning curves for validation accuracy (if available)
if not all_metrics.empty and 'val_accuracy' in all_metrics.columns:
    print("Plotting validation accuracy learning curves...")
    plot_learning_curves(all_metrics, 'val_accuracy')
else:
    print("No validation accuracy data available for plotting learning curves.")

In [None]:
# Create performance matrices for different metrics
if not summary_metrics.empty:
    print("Creating performance matrices...")
    
    # Validation Loss Matrix
    print("\nMinimum Validation Loss by Model and Optimizer:")
    loss_matrix = create_performance_matrix(
        summary_metrics, 
        'min_val_loss', 
        'Minimum Validation Loss by Model and Optimizer'
    )
    
    # Validation Accuracy Matrix (if available)
    if 'max_val_accuracy' in summary_metrics.columns and not summary_metrics['max_val_accuracy'].isna().all():
        print("\nMaximum Validation Accuracy by Model and Optimizer:")
        acc_matrix = create_performance_matrix(
            summary_metrics, 
            'max_val_accuracy', 
            'Maximum Validation Accuracy by Model and Optimizer'
        )
    
    # Convergence Speed Matrix
    print("\nConvergence Epoch by Model and Optimizer:")
    conv_matrix = create_performance_matrix(
        summary_metrics, 
        'convergence_epoch', 
        'Convergence Epoch by Model and Optimizer'
    )
    
    # Training Time Matrix (if available)
    if 'training_time' in summary_metrics.columns and not summary_metrics['training_time'].isna().all():
        print("\nTotal Training Time (s) by Model and Optimizer:")
        time_matrix = create_performance_matrix(
            summary_metrics,
            'training_time',
            'Total Training Time (s) by Model and Optimizer'
        )
else:
    print("No summary metrics available for creating performance matrices.")
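Each performance matrix arranges one summary metric with models as rows and optimizers as columns. A minimal sketch of that reshaping step with `DataFrame.pivot` follows; `to_matrix` is a hypothetical name, the values are made up, and `create_performance_matrix` may do more (for example, draw a heatmap).

```python
import pandas as pd


def to_matrix(summary, metric):
    """Pivot a long summary table into a model x optimizer matrix."""
    return summary.pivot(index="model", columns="optimizer", values=metric)


# Made-up values, for illustration only
toy = pd.DataFrame({
    "model": ["Base", "Base", "Wide", "Wide"],
    "optimizer": ["Adam", "SGD", "Adam", "SGD"],
    "min_val_loss": [0.50, 0.70, 0.45, 0.65],
})
matrix = to_matrix(toy, "min_val_loss")
print(matrix)
```

A matrix in this shape can be passed directly to a heatmap function such as `sns.heatmap` for visual comparison.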

## Optimizer Comparison Results

Let's identify the best optimizers for each model architecture based on different metrics.

In [None]:
# Find best optimizer for different metrics
if not summary_metrics.empty:
    print("Best optimizer by validation loss:")
    display(find_best_optimizer(summary_metrics, 'min_val_loss', True))
    
    if 'max_val_accuracy' in summary_metrics.columns and not summary_metrics['max_val_accuracy'].isna().all():
        print("\nBest optimizer by validation accuracy:")
        display(find_best_optimizer(summary_metrics, 'max_val_accuracy', False))
    
    print("\nBest optimizer by convergence speed:")
    display(find_best_optimizer(summary_metrics, 'convergence_epoch', True))
    
    if 'training_time' in summary_metrics.columns and not summary_metrics['training_time'].isna().all():
        print("\nBest optimizer by training time:")
        display(find_best_optimizer(summary_metrics, 'training_time', True))
else:
    print("No summary metrics available for finding best optimizer.")
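The third argument passed to `find_best_optimizer` above is `True` for loss-like metrics and `False` for accuracy, which suggests it acts as a "lower is better" flag. Under that assumption, the selection could be sketched as follows; `best_per_model` is a hypothetical name and the values are made up.

```python
import pandas as pd


def best_per_model(summary, metric, lower_is_better=True):
    """Return the best-scoring optimizer for each model."""
    grouped = summary.groupby("model")[metric]
    best_idx = grouped.idxmin() if lower_is_better else grouped.idxmax()
    return summary.loc[best_idx, ["model", "optimizer", metric]].reset_index(drop=True)


# Made-up values, for illustration only
toy = pd.DataFrame({
    "model": ["Base", "Base", "Wide", "Wide"],
    "optimizer": ["Adam", "SGD", "Adam", "SGD"],
    "min_val_loss": [0.50, 0.70, 0.65, 0.45],
    "max_val_accuracy": [0.80, 0.85, 0.90, 0.70],
})
best_loss = best_per_model(toy, "min_val_loss", lower_is_better=True)
best_acc = best_per_model(toy, "max_val_accuracy", lower_is_better=False)
print(best_loss)
```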

## Conclusions and Recommendations

To summarize optimizer performance across model architectures, we rank the optimizers within each model by minimum validation loss and by convergence epoch, then average the two ranks per optimizer:

In [None]:
# Create a ranking of optimizers based on multiple metrics
if not summary_metrics.empty:
    # Group by optimizer and calculate average ranks
    optimizer_ranks = {}
    
    # For validation loss (lower is better)
    for model in summary_metrics['model'].unique():
        model_data = summary_metrics[summary_metrics['model'] == model]
        # Rank optimizers for this model (rank 1 is best)
        ranked = model_data.sort_values('min_val_loss')['optimizer'].tolist()
        
        for i, opt in enumerate(ranked):
            if opt not in optimizer_ranks:
                optimizer_ranks[opt] = {'loss_rank': [], 'conv_rank': []}
            optimizer_ranks[opt]['loss_rank'].append(i+1)
    
    # For convergence epoch (lower is better)
    for model in summary_metrics['model'].unique():
        model_data = summary_metrics[summary_metrics['model'] == model]
        # Rank optimizers for this model (rank 1 is best)
        ranked = model_data.sort_values('convergence_epoch')['optimizer'].tolist()
        
        for i, opt in enumerate(ranked):
            optimizer_ranks[opt]['conv_rank'].append(i+1)
    
    # Calculate average ranks
    rank_data = []
    for opt, ranks in optimizer_ranks.items():
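        # Fall back to NaN (not 0) so a missing rank is never mistaken for the best rank
        avg_loss_rank = sum(ranks['loss_rank']) / len(ranks['loss_rank']) if ranks['loss_rank'] else float('nan')
        avg_conv_rank = sum(ranks['conv_rank']) / len(ranks['conv_rank']) if ranks['conv_rank'] else float('nan')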
        overall_rank = (avg_loss_rank + avg_conv_rank) / 2
        
        rank_data.append({
            'optimizer': opt,
            'avg_loss_rank': avg_loss_rank,
            'avg_conv_rank': avg_conv_rank,
            'overall_rank': overall_rank
        })
    
    # Create DataFrame and sort by overall rank
    rank_df = pd.DataFrame(rank_data).sort_values('overall_rank')
    
    print("Overall optimizer ranking (lower is better):")
    display(rank_df)
    
    # Plot the rankings
    plt.figure(figsize=(10, 6))
    sns.barplot(y='optimizer', x='overall_rank', data=rank_df, palette='viridis')
    plt.title('Overall Optimizer Ranking (Lower is Better)')
    plt.xlabel('Average Rank')
    plt.ylabel('Optimizer')
    plt.grid(True, axis='x', linestyle='--', alpha=0.7)
    plt.tight_layout()
    plt.show()
else:
    print("No summary metrics available for creating rankings.")
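The same per-model ranking can also be expressed with pandas `groupby(...).rank(...)`. The sketch below is an alternative formulation, not the code used above. `method="first"` breaks ties by order of appearance, and unlike the sort-based loop, missing values stay unranked instead of being placed last. The values are made up for illustration.

```python
import pandas as pd

# Made-up values, for illustration only
toy = pd.DataFrame({
    "model": ["Base", "Base", "Wide", "Wide"],
    "optimizer": ["Adam", "SGD", "Adam", "SGD"],
    "min_val_loss": [0.50, 0.70, 0.65, 0.45],
    "convergence_epoch": [4, 9, 5, 8],
})

# Rank within each model (rank 1 is best, lower values are better)
ranks = toy[["optimizer"]].copy()
ranks["loss_rank"] = toy.groupby("model")["min_val_loss"].rank(method="first")
ranks["conv_rank"] = toy.groupby("model")["convergence_epoch"].rank(method="first")

# Average each rank per optimizer, then average the two averages
avg = ranks.groupby("optimizer")[["loss_rank", "conv_rank"]].mean()
avg["overall_rank"] = avg.mean(axis=1)
avg = avg.sort_values("overall_rank")
print(avg)
```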

In [None]:
# Save the results for future reference
if not summary_metrics.empty:
    # Create a timestamp
    from datetime import datetime
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    
    # Create output directory if it doesn't exist
    from pathlib import Path
    output_dir = Path("optimizer_results")
    output_dir.mkdir(exist_ok=True)
    
    # Save the summary metrics
    summary_metrics.to_csv(output_dir / f"optimizer_comparison_summary_{timestamp}.csv", index=False)
    
    # Save the ranking if available (notebook variables live in the global namespace)
    if 'rank_df' in globals():
        rank_df.to_csv(output_dir / f"optimizer_ranking_{timestamp}.csv", index=False)
        
    # Save performance matrices if they exist
    matrices = {
        'loss_matrix': 'validation_loss',
        'acc_matrix': 'validation_accuracy',
        'conv_matrix': 'convergence_epoch',
        'time_matrix': 'training_time'
    }
    
    for var_name, file_prefix in matrices.items():
        # Notebook variables live in the global namespace
        if var_name in globals():
            globals()[var_name].to_csv(output_dir / f"{file_prefix}_matrix_{timestamp}.csv")
        
    print(f"Results saved with timestamp {timestamp} in {output_dir}")
else:
    print("No results to save.")
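Because the timestamp format `%Y%m%d_%H%M%S` sorts lexicographically in chronological order, the most recent saved summary can be located by sorting the file names. A minimal sketch, with `latest_summary` as a hypothetical helper name:

```python
from pathlib import Path


def latest_summary(output_dir="optimizer_results"):
    """Return the newest summary CSV, or None if nothing has been saved yet."""
    files = sorted(Path(output_dir).glob("optimizer_comparison_summary_*.csv"))
    return files[-1] if files else None
```

The returned path can then be loaded with `pd.read_csv` to compare against a later run.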

## Summary of Findings

Based on our analysis, we can draw the following conclusions about optimizers and their performance across different model architectures:

1. **Best Overall Optimizer**: [Fill in based on results]

2. **Model-Specific Recommendations**:
   - For **Base** model: [Fill in based on results]
   - For **Wide** model: [Fill in based on results]
   - For **Advanced** model: [Fill in based on results]

3. **Performance Characteristics**:
   - **Convergence Speed**: [Fill in based on results]
   - **Final Performance**: [Fill in based on results]
   - **Training Efficiency**: [Fill in based on results]

4. **Practical Recommendations**:
   - For quick prototyping: [Fill in based on results]
   - For final model training: [Fill in based on results]
   - For complex architectures: [Fill based on results]