# Experiment Runner Wrapper

This notebook demonstrates the comprehensive experiment runner wrapper that integrates all modules and orchestrates complete experimental pipelines for autoencoder research.

## Features

The wrapper provides:
1. **High-level experiment orchestration** - Single function to run complete experiments
2. **Systematic architecture exploration** - Test multiple model configurations
3. **Comprehensive visualization** - Loss curves, reconstructions, latent space analysis
4. **Performance analysis** - Grid-based comparison of hyperparameters
5. **Automated result tracking** - Save models, metrics, and visualizations
6. **Optuna integration** - Hyperparameter optimization (future enhancement)

This wrapper builds on the existing `ExperimentRunner` class (it holds an `ExperimentRunner` instance rather than subclassing it) to provide a unified interface for running systematic autoencoder experiments.
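
The "systematic architecture exploration" feature amounts to a Cartesian product over the hyperparameter lists. The sketch below is a minimal illustration of that idea using only the standard library; `build_experiment_grid` is a hypothetical helper name and not part of `autoencoder_lib`, and the grid values are example inputs.

```python
from itertools import product

# Example grid values (illustrative; any lists of the same kind would work)
architectures = ["simple_linear", "convolutional"]
latent_dims = [8, 16, 32]
learning_rates = [0.001]
epochs_list = [10]

def build_experiment_grid(architectures, latent_dims, learning_rates, epochs_list):
    """Return one config dict per combination of the four hyperparameter lists.

    Hypothetical helper for illustration; not part of autoencoder_lib.
    """
    return [
        {"architecture": a, "latent_dim": d, "learning_rate": lr, "epochs": e}
        for a, d, lr, e in product(architectures, latent_dims, learning_rates, epochs_list)
    ]

grid = build_experiment_grid(architectures, latent_dims, learning_rates, epochs_list)
print(f"{len(grid)} experiments planned")
for cfg in grid[:3]:
    print(cfg)
```

The grid size is the product of the list lengths, so adding values to any one list multiplies the total number of training runs.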

In [None]:
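# Test dataset preparation
# Note: run this cell after the cells that define ExperimentRunnerWrapper and create `wrapper_test`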
print("Testing dataset preparation...")

test_dataset_config = {
    'dataset_type': 'layered_geological',
    'output_dir': 'notebook_wrapper_test',
    'num_samples_per_class': 10,
    'image_size': 32,
    'num_classes': 2,
    'batch_size': 4
}

try:
    train_loader, test_data, test_labels, class_names = wrapper_test.prepare_dataset(test_dataset_config)
    print("\n✅ Dataset preparation successful!")
    print(f"   Train loader batches: {len(train_loader)}")
    print(f"   Test samples: {len(test_data)}")
    print(f"   Classes: {class_names}")
    print(f"   Test data shape: {test_data.shape}")
except Exception as e:
    print(f"❌ Dataset preparation failed: {e}")
    import traceback
    traceback.print_exc()

In [None]:
# Quick test of the wrapper initialization
# Note: run this cell after the import cell and the ExperimentRunnerWrapper class definition
print("Testing wrapper initialization...")

# Test basic functionality
wrapper_test = ExperimentRunnerWrapper(
    output_dir="notebook_test_results",
    random_seed=42,
    verbose=True
)

print("\n" + "="*50)
print("WRAPPER TEST SUCCESSFUL! ✅")
print("="*50)

In [None]:
# Import necessary libraries
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from pathlib import Path
import json
import time
from datetime import datetime
from typing import Dict, List, Any, Optional, Tuple, Union
from torch.utils.data import DataLoader, TensorDataset

# Import autoencoder library modules
from autoencoder_lib.experiment import ExperimentRunner
from autoencoder_lib.models import ModelFactory, create_autoencoder, MODEL_ARCHITECTURES
from autoencoder_lib.data import generate_dataset, visualize_dataset
from autoencoder_lib.data.preprocessing import StandardNormalizer, calculate_data_statistics
from autoencoder_lib.utils.reproducibility import set_random_seeds, SeedContext
from autoencoder_lib.utils.logging import setup_experiment_logger, get_experiment_logger
from autoencoder_lib.visualization.training_viz import plot_training_curves, plot_performance_grid

print("All imports successful!")
print(f"Available model architectures: {list(MODEL_ARCHITECTURES.keys())}")

# Set up plotting
plt.style.use('default')
import matplotlib
matplotlib.use('Agg')  # Use non-interactive backend
print(f"PyTorch version: {torch.__version__}")
print(f"Device available: {torch.device('cuda' if torch.cuda.is_available() else 'cpu')}")

## Experiment Runner Wrapper Implementation

The core wrapper class builds on the existing `ExperimentRunner`, adding high-level experiment orchestration on top of it.

In [None]:
class ExperimentRunnerWrapper:
    """
    High-level wrapper for systematic autoencoder experimentation.
    
    This class provides a unified interface for running complete experiments,
    managing data flow between modules, and coordinating the execution of 
    training, evaluation, and analysis phases.
    """
    
    def __init__(self, 
                 output_dir: str = "experiment_results",
                 device: Optional[torch.device] = None,
                 random_seed: int = 42,
                 verbose: bool = True):
        """
        Initialize the ExperimentRunnerWrapper.
        
        Args:
            output_dir: Directory to save all experiment results
            device: torch.device for training ('cpu' or 'cuda')
            random_seed: Random seed for reproducibility
            verbose: Whether to print progress information
        """
        self.output_dir = Path(output_dir)
        self.output_dir.mkdir(parents=True, exist_ok=True)
        
        self.device = device if device is not None else torch.device(
            "cuda" if torch.cuda.is_available() else "cpu"
        )
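        self.random_seed = random_seed
        self.verbose = verbose
        # Accumulates results across runs; read by get_best_experiments()
        self.experiment_results: List[Dict[str, Any]] = []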
        
        # Initialize the base experiment runner
        self.experiment_runner = ExperimentRunner(
            device=self.device,
            output_dir=str(self.output_dir),
            random_seed=random_seed
        )
        
        if self.verbose:
            print(f"ExperimentRunnerWrapper initialized")
            print(f"Device: {self.device}")
            print(f"Output directory: {self.output_dir}")
            print(f"Available architectures: {list(MODEL_ARCHITECTURES.keys())}")
    
    def prepare_dataset(self, dataset_config: Dict[str, Any]) -> Tuple[DataLoader, torch.Tensor, torch.Tensor, List[str]]:
        """
        Generate or load dataset and prepare it for training.
        
        Args:
            dataset_config: Configuration dictionary for dataset generation
            
        Returns:
            Tuple of (train_loader, test_data, test_labels, class_names)
        """
        if self.verbose:
            print("Preparing dataset...")
        
        # Generate dataset
        dataset_info = generate_dataset(**dataset_config)
        
        # Load the generated images from disk (PIL is imported lazily here)
        from PIL import Image
        
        # Get the data path and class info
        output_dir = dataset_config['output_dir']
        class_names = dataset_info['label_names']
        
        # Load training data
        train_data = []
        train_labels = []
        
        # Load test data
        test_data = []
        test_labels = []
        
        # Process each class
        for class_idx, class_name in enumerate(class_names):
            class_dir = Path(output_dir) / class_name
            
            if class_dir.exists():
                # Get all image files, sorted so the train/test split is deterministic
                image_files = sorted(class_dir.glob("*.png"))
                
                # Split into train/test (using simple 80/20 split)
                split_point = int(len(image_files) * 0.8)
                train_files = image_files[:split_point]
                test_files = image_files[split_point:]
                
                # Load training images
                for img_file in train_files:
                    img = Image.open(img_file).convert('L')  # Grayscale
                    img_array = np.array(img, dtype=np.float32) / 255.0
                    train_data.append(img_array)
                    train_labels.append(class_idx)
                
                # Load test images
                for img_file in test_files:
                    img = Image.open(img_file).convert('L')  # Grayscale
                    img_array = np.array(img, dtype=np.float32) / 255.0
                    test_data.append(img_array)
                    test_labels.append(class_idx)
        
        # Convert to tensors
        train_data = torch.tensor(np.array(train_data), dtype=torch.float32).unsqueeze(1)  # Add channel dim
        train_labels = torch.tensor(train_labels, dtype=torch.long)
        test_data = torch.tensor(np.array(test_data), dtype=torch.float32).unsqueeze(1)   # Add channel dim
        test_labels = torch.tensor(test_labels, dtype=torch.long)
        
        # Create DataLoader for training
        train_dataset = TensorDataset(train_data, train_data, train_labels)  # (x, y, labels) format
        train_loader = DataLoader(
            train_dataset, 
            batch_size=dataset_config.get('batch_size', 32),
            shuffle=True
        )
        
        if self.verbose:
            print(f"Dataset prepared:")
            print(f"  Train samples: {len(train_data)}")
            print(f"  Test samples: {len(test_data)}")
            print(f"  Classes: {class_names}")
            print(f"  Image shape: {train_data.shape[1:]}")
        
        return train_loader, test_data, test_labels, class_names
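
The per-class 80/20 split in `prepare_dataset` can be isolated and checked without any image files. The following is a minimal sketch of that splitting logic, assuming file names are sorted before slicing so the split is repeatable; `split_per_class` and the file names are hypothetical and used only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

def split_per_class(files_by_class: Dict[str, Sequence[str]],
                    train_fraction: float = 0.8
                    ) -> Tuple[List[Tuple[str, int]], List[Tuple[str, int]]]:
    """Split each class's files into train/test lists of (filename, class_index).

    Hypothetical helper for illustration; it mirrors the slicing used in
    prepare_dataset but works on plain file names.
    """
    train, test = [], []
    for class_idx, files in enumerate(files_by_class.values()):
        ordered = sorted(files)  # fixed order makes the split repeatable
        split_point = int(len(ordered) * train_fraction)
        train.extend((f, class_idx) for f in ordered[:split_point])
        test.extend((f, class_idx) for f in ordered[split_point:])
    return train, test

# Made-up file names standing in for generated images
files = {
    "class_a": [f"a_{i:02d}.png" for i in range(10)],
    "class_b": [f"b_{i:02d}.png" for i in range(5)],
}
train_pairs, test_pairs = split_per_class(files)
print(len(train_pairs), len(test_pairs))
```

Splitting inside each class keeps every class represented in both sets, which a single split over the concatenated file list would not guarantee.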

In [None]:
    def run_single_experiment(self,
                            architecture: str,
                            latent_dim: int,
                            train_loader: DataLoader,
                            test_data: torch.Tensor,
                            test_labels: torch.Tensor,
                            class_names: List[str],
                            learning_rate: float = 0.001,
                            epochs: int = 100,
                            save_model: bool = True,
                            plot_results: bool = True) -> Dict[str, Any]:
        """
        Run a single autoencoder experiment with specified parameters.
        
        Args:
            architecture: Model architecture name
            latent_dim: Latent space dimensionality
            train_loader: DataLoader for training data
            test_data: Test data tensor
            test_labels: Test labels tensor
            class_names: List of class names
            learning_rate: Learning rate for training
            epochs: Number of training epochs
            save_model: Whether to save the trained model
            plot_results: Whether to generate and save plots
            
        Returns:
            Dictionary containing experiment results and metrics
        """
        if self.verbose:
            print(f"\n=== Running Experiment: {architecture}, latent_dim={latent_dim} ===")
        
        experiment_start_time = time.time()
        experiment_name = f"{architecture}_latent{latent_dim}_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
        
        # Set random seeds for reproducibility
        with SeedContext(self.random_seed):
            
            # Create model
            try:
                model = create_autoencoder(
                    architecture=architecture,
                    input_shape=train_loader.dataset.tensors[0].shape[1:],
                    latent_dim=latent_dim
                ).to(self.device)
                
                if self.verbose:
                    print(f"  Model created: {architecture} with {sum(p.numel() for p in model.parameters())} parameters")
            
            except Exception as e:
                error_msg = f"Failed to create model {architecture}: {str(e)}"
                if self.verbose:
                    print(f"  ERROR: {error_msg}")
                return {'error': error_msg, 'experiment_name': experiment_name}
            
            # Set up experiment directory
            exp_dir = self.output_dir / experiment_name
            exp_dir.mkdir(exist_ok=True)
            
            # Configure and run training
            config = {
                'model_architecture': architecture,
                'latent_dim': latent_dim,
                'learning_rate': learning_rate,
                'epochs': epochs,
                'batch_size': train_loader.batch_size,
                'device': str(self.device),
                'random_seed': self.random_seed
            }
            
            try:
                # Train the model
                train_losses, val_losses, best_model_state = self.experiment_runner.train_model(
                    model=model,
                    train_loader=train_loader,
                    val_loader=None,  # No separate validation loader is provided for now
                    epochs=epochs,
                    learning_rate=learning_rate,
                    output_dir=str(exp_dir)
                )
                
                # Load best model for evaluation
                model.load_state_dict(best_model_state)
                
                # Evaluate model
                eval_results = self.experiment_runner.evaluate_model(
                    model=model,
                    test_data=test_data,
                    test_labels=test_labels,
                    class_names=class_names,
                    output_dir=str(exp_dir),
                    save_visualizations=plot_results
                )
                
                experiment_time = time.time() - experiment_start_time
                
                # Compile results
                results = {
                    'experiment_name': experiment_name,
                    'config': config,
                    'train_losses': train_losses,
                    'val_losses': val_losses,
                    'final_train_loss': train_losses[-1] if train_losses else None,
                    'final_val_loss': val_losses[-1] if val_losses else None,
                    'evaluation': eval_results,
                    'experiment_time': experiment_time,
                    'success': True,
                    'output_dir': str(exp_dir)
                }
                
                # Save experiment configuration
                with open(exp_dir / 'config.json', 'w') as f:
                    json.dump(config, f, indent=2)
                
                # Save model if requested
                if save_model:
                    torch.save({
                        'model_state_dict': model.state_dict(),
                        'config': config,
                        'results': {k: v for k, v in results.items() if k not in ['evaluation']}
                    }, exp_dir / 'model.pth')
                
                if self.verbose:
                    print(f"  Training completed: final loss = {results['final_train_loss']:.6f}")
                    print(f"  Evaluation metrics: {eval_results.get('reconstruction_loss', 'N/A')}")
                    print(f"  Experiment time: {experiment_time:.2f} seconds")
                
                return results
                
            except Exception as e:
                error_msg = f"Experiment failed during training/evaluation: {str(e)}"
                if self.verbose:
                    print(f"  ERROR: {error_msg}")
                return {
                    'error': error_msg,
                    'experiment_name': experiment_name,
                    'config': config,
                    'success': False
                }
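
`run_single_experiment` wraps model creation and training in `SeedContext(self.random_seed)`. The library's implementation is not shown in this notebook, so the sketch below is only a stand-in for the general idea of a seeding context manager. It assumes the context should seed on entry and restore the previous generator state on exit, and it covers Python's `random` module only; the name `seeded` is hypothetical.

```python
import random
from contextlib import contextmanager

@contextmanager
def seeded(seed: int):
    """Seed Python's `random` inside the block, then restore the prior state.

    Hypothetical stand-in for illustration; it is not the library's SeedContext
    and does not touch NumPy or PyTorch generators.
    """
    saved_state = random.getstate()
    random.seed(seed)
    try:
        yield
    finally:
        random.setstate(saved_state)

# Two blocks with the same seed draw the same numbers
with seeded(42):
    first = [random.random() for _ in range(3)]
with seeded(42):
    second = [random.random() for _ in range(3)]
print(first == second)
```

Restoring the saved state on exit means code outside the block is not affected by the seeding, which is one reason to prefer a context manager over a bare call that sets the seed globally.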

In [None]:
    def run_systematic_experiments(self,
                                  dataset_config: Dict[str, Any],
                                  architectures: List[str],
                                  latent_dims: List[int],
                                  learning_rates: List[float] = [0.001],
                                  epochs_list: List[int] = [100],
                                  generate_summary: bool = True) -> Dict[str, Any]:
        """
        Run systematic experiments across multiple architectures and hyperparameters.
        
        Args:
            dataset_config: Configuration for dataset generation/loading
            architectures: List of model architectures to test
            latent_dims: List of latent dimensions to test
            learning_rates: List of learning rates to test
            epochs_list: List of epoch counts to test
            generate_summary: Whether to generate summary visualizations and reports
            
        Returns:
            Dictionary containing all experiment results and summary analysis
        """
        if self.verbose:
            print("\n" + "="*60)
            print("STARTING SYSTEMATIC AUTOENCODER EXPERIMENTS")
            print("="*60)
        
        total_experiments = len(architectures) * len(latent_dims) * len(learning_rates) * len(epochs_list)
        if self.verbose:
            print(f"Total experiments planned: {total_experiments}")
        
        # Prepare dataset (once for all experiments)
        train_loader, test_data, test_labels, class_names = self.prepare_dataset(dataset_config)
        
        # Initialize results storage
        all_results = []
        successful_experiments = 0
        failed_experiments = 0
        
        # Run experiments
        experiment_count = 0
        for architecture in architectures:
            for latent_dim in latent_dims:
                for learning_rate in learning_rates:
                    for epochs in epochs_list:
                        experiment_count += 1
                        
                        if self.verbose:
                            print(f"\n--- Experiment {experiment_count}/{total_experiments} ---")
                        
                        # Run single experiment
                        result = self.run_single_experiment(
                            architecture=architecture,
                            latent_dim=latent_dim,
                            train_loader=train_loader,
                            test_data=test_data,
                            test_labels=test_labels,
                            class_names=class_names,
                            learning_rate=learning_rate,
                            epochs=epochs,
                            save_model=True,
                            plot_results=True
                        )
                        
                        all_results.append(result)
                        
                        if result.get('success', False):
                            successful_experiments += 1
                        else:
                            failed_experiments += 1
        
        # Store results in instance
        self.experiment_results.extend(all_results)
        
        # Generate summary analysis
        summary = {
            'total_experiments': total_experiments,
            'successful_experiments': successful_experiments,
            'failed_experiments': failed_experiments,
            'dataset_config': dataset_config,
            'experiment_parameters': {
                'architectures': architectures,
                'latent_dims': latent_dims,
                'learning_rates': learning_rates,
                'epochs_list': epochs_list
            },
            'all_results': all_results
        }
        
        if generate_summary and successful_experiments > 0:
            summary['analysis'] = self._generate_summary_analysis(all_results)
            
            # Generate performance grid visualization
            if len(latent_dims) > 1 or len(architectures) > 1:
                self._create_performance_grid(all_results, architectures, latent_dims)
        
        # Save comprehensive results
        timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
        results_file = self.output_dir / f'systematic_experiments_{timestamp}.json'
        
        # Prepare results for JSON serialization
        json_summary = summary.copy()
        json_summary['all_results'] = [self._serialize_result(r) for r in all_results]
        
        with open(results_file, 'w') as f:
            json.dump(json_summary, f, indent=2)
        
        if self.verbose:
            print("\n" + "="*60)
            print("SYSTEMATIC EXPERIMENTS COMPLETED")
            print("="*60)
            print(f"Total: {total_experiments}, Successful: {successful_experiments}, Failed: {failed_experiments}")
            print(f"Results saved to: {results_file}")
        
        return summary

In [None]:
    def _generate_summary_analysis(self, results: List[Dict[str, Any]]) -> Dict[str, Any]:
        """Generate summary analysis of experiment results."""
        successful_results = [r for r in results if r.get('success', False)]
        
        if not successful_results:
            return {'error': 'No successful experiments to analyze'}
        
        # Extract metrics (compare against None so a value of exactly 0.0 is kept)
        final_losses = [r['final_train_loss'] for r in successful_results if r.get('final_train_loss') is not None]
        experiment_times = [r['experiment_time'] for r in successful_results if r.get('experiment_time') is not None]
        
        # Best performing experiment (missing or None losses sort last)
        best_experiment = min(
            successful_results,
            key=lambda x: x['final_train_loss'] if x.get('final_train_loss') is not None else float('inf')
        )
        
        analysis = {
            'best_experiment': {
                'name': best_experiment['experiment_name'],
                'architecture': best_experiment['config']['model_architecture'],
                'latent_dim': best_experiment['config']['latent_dim'],
                'final_loss': best_experiment['final_train_loss'],
                'experiment_time': best_experiment['experiment_time']
            },
            'statistics': {
                'mean_final_loss': np.mean(final_losses) if final_losses else None,
                'std_final_loss': np.std(final_losses) if final_losses else None,
                'min_final_loss': np.min(final_losses) if final_losses else None,
                'max_final_loss': np.max(final_losses) if final_losses else None,
                'mean_experiment_time': np.mean(experiment_times) if experiment_times else None,
                'total_experiment_time': np.sum(experiment_times) if experiment_times else None
            }
        }
        
        return analysis
    
    def _serialize_result(self, result: Dict[str, Any]) -> Dict[str, Any]:
        """Serialize experiment result for JSON storage."""
        serialized = result.copy()
        
        # Convert numpy arrays and tensors to lists
        for key, value in serialized.items():
            if isinstance(value, np.ndarray):
                serialized[key] = value.tolist()
            elif isinstance(value, torch.Tensor):
                serialized[key] = value.detach().cpu().numpy().tolist()
        
        return serialized
    
    def _create_performance_grid(self, results: List[Dict[str, Any]], 
                               architectures: List[str], 
                               latent_dims: List[int]):
        """Create performance grid visualization."""
        try:
            # Prepare data for grid
            successful_results = [r for r in results if r.get('success', False)]
            
            # Create performance matrix
            performance_data = []
            for result in successful_results:
                config = result['config']
                performance_data.append({
                    'architecture': config['model_architecture'],
                    'latent_dim': config['latent_dim'],
                    'final_loss': result.get('final_train_loss', np.nan),
                    'experiment_time': result.get('experiment_time', np.nan)
                })
            
            if performance_data:
                df = pd.DataFrame(performance_data)
                
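                # Create and save performance grid
                # Assumes plot_performance_grid (imported above) accepts a DataFrame
                # plus metric_column and returns a matplotlib Figure
                fig = plot_performance_grid(df, metric_column='final_loss')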
                timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
                grid_path = self.output_dir / f'performance_grid_{timestamp}.png'
                fig.savefig(grid_path, dpi=300, bbox_inches='tight')
                plt.close(fig)
                
                if self.verbose:
                    print(f"Performance grid saved to: {grid_path}")
        
        except Exception as e:
            if self.verbose:
                print(f"Failed to create performance grid: {str(e)}")
    
    def load_experiment_results(self, results_file: str) -> Dict[str, Any]:
        """Load previously saved experiment results."""
        with open(results_file, 'r') as f:
            return json.load(f)
    
    def get_best_experiments(self, top_k: int = 5) -> List[Dict[str, Any]]:
        """Get the top-k best performing experiments by final loss."""
        successful_results = [r for r in self.experiment_results if r.get('success', False)]
        
        if not successful_results:
            return []
        
        # Sort by final training loss
        sorted_results = sorted(successful_results, 
                              key=lambda x: x.get('final_train_loss', float('inf')))
        
        return sorted_results[:top_k]
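
`_serialize_result` converts only top-level values, so arrays nested inside a sub-dictionary (for example under `evaluation`, whose contents are not shown here) would still reach `json.dump` unconverted. The sketch below shows one way a recursive conversion could look. `to_jsonable` is a hypothetical helper, and it relies on the assumption that array-like objects expose a `tolist()` method, as NumPy arrays do.

```python
import json
from pathlib import Path
from typing import Any

def to_jsonable(value: Any) -> Any:
    """Recursively convert a nested result into JSON-friendly Python types.

    Hypothetical helper for illustration; not part of the wrapper class.
    """
    if isinstance(value, dict):
        return {str(k): to_jsonable(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v) for v in value]
    if isinstance(value, Path):
        return str(value)
    if hasattr(value, "tolist"):  # duck-typed: array-like objects exposing tolist()
        return to_jsonable(value.tolist())
    return value

class FakeArray:
    """Stand-in for an array type so the example needs no third-party imports."""
    def __init__(self, data):
        self.data = data
    def tolist(self):
        return list(self.data)

# Made-up nested result for illustration
result = {
    "evaluation": {"per_class": FakeArray([0.1, 0.2])},
    "output_dir": Path("exp1"),
    "losses": (0.5, 0.25),
}
encoded = json.dumps(to_jsonable(result), sort_keys=True)
print(encoded)
```

Checking for `tolist()` rather than for specific classes keeps the helper free of hard dependencies, at the cost of also converting any unrelated object that happens to define that method.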

## Example Usage: Complete Experiment Workflow

Now let's demonstrate how to use the wrapper for systematic autoencoder experimentation.

In [None]:
# Initialize the experiment wrapper
wrapper = ExperimentRunnerWrapper(
    output_dir="systematic_experiment_results",
    random_seed=42,
    verbose=True
)

# Define dataset configuration
dataset_config = {
    'dataset_type': 'layered_geological',
    'output_dir': 'wrapper_test_dataset',
    'num_samples_per_class': 50,
    'image_size': 64,
    'num_classes': 2,
    'batch_size': 16
}

# Define experiment parameters - using correct architecture names
architectures = ['simple_linear', 'convolutional']  # Available architectures
latent_dims = [8, 16, 32]
learning_rates = [0.001]
epochs_list = [10]  # Reduced for testing

print("Configuration complete!")
print(f"Total experiments: {len(architectures) * len(latent_dims) * len(learning_rates) * len(epochs_list)}")
print(f"Available architectures: {list(MODEL_ARCHITECTURES.keys())}")

In [None]:
# Run the systematic experiments
print("Starting systematic experiment run...")
print("=" * 50)

# Execute the experiments
results = wrapper.run_systematic_experiments(
    dataset_config=dataset_config,
    architectures=architectures,
    latent_dims=latent_dims,
    learning_rates=learning_rates,
    epochs_list=epochs_list,
    generate_summary=True
)

print("\nExperiment run complete!")
print(f"Results saved to: {wrapper.output_dir}")

# Display summary (the analysis is stored under the 'analysis' key)
if 'analysis' in results:
    analysis = results['analysis']
    best = analysis['best_experiment']
    stats = analysis['statistics']
    print("\nBest performing experiment:")
    print(f"  Name: {best['name']}")
    print(f"  Final Loss: {best['final_loss']:.6f}")
    print(f"  Architecture: {best['architecture']}")
    print(f"  Latent Dim: {best['latent_dim']}")
    
    print("\nOverall Statistics:")
    print(f"  Successful experiments: {results['successful_experiments']}/{results['total_experiments']}")
    print(f"  Average final loss: {stats['mean_final_loss']:.6f}")
    print(f"  Loss std dev: {stats['std_final_loss']:.6f}")
    print(f"  Average experiment time: {stats['mean_experiment_time']:.1f}s")

## Alternative: Run Individual Experiments

You can also run individual experiments for more focused analysis or to test specific configurations.

In [None]:
# Example: Run a single focused experiment
print("Running single experiment example...")

# Generate a specific dataset for individual testing
single_dataset_config = {
    'dataset_type': 'layered_geological',
    'output_dir': 'focused_experiment_dataset',  # required by prepare_dataset(); directory name is illustrative
    'num_samples_per_class': 200,
    'image_size': (64, 64),
    'num_classes': 3,
    'noise_level': 0.05,
    'layer_complexity': 'consistent',
    'batch_size': 16,
    'validation_split': 0.2
}

# Create a new wrapper instance for this focused experiment
focused_wrapper = ExperimentRunnerWrapper(
    output_dir="focused_experiment_results",
    random_seed=123,
    verbose=True
)

# Run a single experiment with specific parameters
single_result = focused_wrapper.run_systematic_experiments(
    dataset_config=single_dataset_config,
    architectures=['convolutional'],     # Single architecture
    latent_dims=[16],                    # Single latent dimension
    learning_rates=[0.001],              # Single learning rate
    epochs_list=[75],                    # Single epoch count
    generate_summary=True
)

# The analysis is stored under the 'analysis' key and is only present on success
if single_result.get('analysis'):
    best = single_result['analysis']['best_experiment']
    print("\nSingle experiment completed successfully!")
    print(f"Final loss: {best['final_loss']:.6f}")
    print(f"Experiment time: {best['experiment_time']:.1f}s")
else:
    print("Single experiment failed or returned no results")

## Loading and Analyzing Previous Results

The wrapper automatically saves all experiment results as JSON files. You can load and analyze previous experiments.

In [None]:
# Load and analyze previous experiment results
def load_experiment_results(results_dir: str) -> List[Dict[str, Any]]:
    """Load all per-experiment results from the summary files in a directory."""
    results_path = Path(results_dir)
    if not results_path.exists():
        print(f"Results directory {results_dir} does not exist")
        return []
    
    results = []
    # run_systematic_experiments() writes systematic_experiments_<timestamp>.json,
    # with the individual results stored under 'all_results'
    for json_file in sorted(results_path.glob("systematic_experiments_*.json")):
        try:
            with open(json_file, 'r') as f:
                summary = json.load(f)
            results.extend(summary.get('all_results', []))
        except Exception as e:
            print(f"Error loading {json_file}: {e}")
    
    return results

def analyze_experiment_trends(results: List[Dict[str, Any]]) -> pd.DataFrame:
    """Create a DataFrame for analyzing experiment trends."""
    data = []
    for result in results:
        if result.get('success', False):
            config = result.get('config', {})  # hyperparameters are nested under 'config'
            data.append({
                'experiment_name': result.get('experiment_name', 'Unknown'),
                'architecture': config.get('model_architecture', 'Unknown'),
                'latent_dim': config.get('latent_dim', 0),
                'learning_rate': config.get('learning_rate', 0),
                'epochs': config.get('epochs', 0),
                'final_train_loss': result.get('final_train_loss', float('inf')),
                'final_val_loss': result.get('final_val_loss'),
                'experiment_time': result.get('experiment_time', 0)
            })
    
    return pd.DataFrame(data)

# Example usage
try:
    # Load results from systematic experiments
    loaded_results = load_experiment_results("systematic_experiment_results")
    
    if loaded_results:
        df = analyze_experiment_trends(loaded_results)
        print(f"Loaded {len(loaded_results)} experiment results")
        print("\nDataFrame summary:")
        print(df.describe())
        
        # Find best performing experiments
        if not df.empty:
            best_by_loss = df.loc[df['final_train_loss'].idxmin()]
            print("\nBest experiment by loss:")
            print(f"  Architecture: {best_by_loss['architecture']}")
            print(f"  Latent Dim: {best_by_loss['latent_dim']}")
            print(f"  Final Loss: {best_by_loss['final_train_loss']:.6f}")
    else:
        print("No experiment results found to load.")
        
except Exception as e:
    print(f"Error analyzing results: {e}")
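
Trend analysis over saved results does not strictly need pandas. As a minimal sketch, the function below averages the final training loss per architecture using only the standard library. It assumes each result follows the layout produced by `run_single_experiment` (`success`, `final_train_loss`, and `config['model_architecture']`); `mean_loss_by_architecture` is a hypothetical helper and the sample values are made up.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, List

def mean_loss_by_architecture(results: List[Dict[str, Any]]) -> Dict[str, float]:
    """Average final training loss per architecture, skipping failed runs.

    Hypothetical helper for illustration; not part of the wrapper class.
    """
    grouped = defaultdict(list)
    for r in results:
        loss = r.get("final_train_loss")
        if r.get("success") and loss is not None:
            grouped[r["config"]["model_architecture"]].append(loss)
    return {arch: mean(losses) for arch, losses in grouped.items()}

# Made-up results for illustration only
sample_results = [
    {"success": True, "final_train_loss": 0.5, "config": {"model_architecture": "simple_linear"}},
    {"success": True, "final_train_loss": 0.25, "config": {"model_architecture": "simple_linear"}},
    {"success": True, "final_train_loss": 0.125, "config": {"model_architecture": "convolutional"}},
    {"success": False, "error": "example failure", "config": {"model_architecture": "convolutional"}},
]
per_arch = mean_loss_by_architecture(sample_results)
print(per_arch)
```

Failed runs carry no loss value, so filtering on `success` first avoids mixing error records into the averages.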

## Advanced Visualization and Analysis

Create comprehensive visualizations to understand experiment results and model performance patterns.

In [None]:
# Advanced visualization and analysis functions
def plot_performance_heatmap(df: pd.DataFrame, metric: str = 'final_train_loss'):
    """Create a heatmap showing performance across architectures and latent dimensions."""
    if df.empty:
        print("No data available for visualization")
        return
    
    # Create pivot table
    pivot_table = df.pivot_table(
        values=metric, 
        index='architecture', 
        columns='latent_dim', 
        aggfunc='mean'
    )
    
    plt.figure(figsize=(12, 6))
    plt.imshow(pivot_table.values, cmap='viridis', aspect='auto')
    plt.colorbar(label=metric)
    plt.title(f'Performance Heatmap: {metric.replace("_", " ").title()}')
    plt.xlabel('Latent Dimension')
    plt.ylabel('Architecture')
    
    # Set tick labels
    plt.xticks(range(len(pivot_table.columns)), pivot_table.columns)
    plt.yticks(range(len(pivot_table.index)), pivot_table.index)
    
    # Add values to cells
    for i in range(len(pivot_table.index)):
        for j in range(len(pivot_table.columns)):
            value = pivot_table.iloc[i, j]
            if not pd.isna(value):
                plt.text(j, i, f'{value:.4f}', ha='center', va='center', 
                        color='white' if value > pivot_table.values.mean() else 'black')
    
    plt.tight_layout()
    plt.show()

def plot_learning_curves_comparison(results_dir: str, max_experiments: int = 6):
    """Plot learning curves for comparison across experiments."""
    results_path = Path(results_dir)
    if not results_path.exists():
        print(f"Results directory {results_dir} does not exist")
        return
    
    # Collect successful results that carry a training-loss history.
    # Each summary file stores the individual results under 'all_results'.
    candidates = []
    for json_file in sorted(results_path.glob("systematic_experiments_*.json")):
        try:
            with open(json_file, 'r') as f:
                summary = json.load(f)
        except Exception as e:
            print(f"Error loading {json_file}: {e}")
            continue
        for result in summary.get('all_results', []):
            if result.get('success', False) and result.get('train_losses'):
                candidates.append(result)
    
    candidates = candidates[:max_experiments]
    if not candidates:
        print("No learning curves found to plot")
        return
    
    # Size the subplot grid from the number of curves (3 columns)
    n_cols = 3
    n_rows = (len(candidates) + n_cols - 1) // n_cols
    plt.figure(figsize=(15, 5 * n_rows))
    colors = plt.cm.tab10(np.linspace(0, 1, max_experiments))
    
    for idx, result in enumerate(candidates):
        losses = result['train_losses']
        config = result.get('config', {})
        epochs = range(1, len(losses) + 1)
        
        plt.subplot(n_rows, n_cols, idx + 1)
        plt.plot(epochs, losses, color=colors[idx], linewidth=2)
        plt.title(f"{config.get('model_architecture', 'Unknown')} - LD:{config.get('latent_dim', '?')}")
        plt.xlabel('Epoch')
        plt.ylabel('Loss')
        plt.grid(True, alpha=0.3)
        plt.yscale('log')
    
    plt.tight_layout()
    plt.suptitle('Learning Curves Comparison', y=1.02, fontsize=16)
    plt.show()
    
    print(f"Plotted {len(candidates)} experiment learning curves")

# Example usage of visualization functions
if 'df' in locals() and not df.empty:
    print("Creating performance visualizations...")
    
    # Performance heatmap
    plot_performance_heatmap(df, 'final_train_loss')
    
    # Learning curves comparison
    plot_learning_curves_comparison("systematic_experiment_results", max_experiments=6)
    
else:
    print("No experiment data available for visualization.")
    print("Run experiments first to generate visualization data.")

## Conclusion and Next Steps

This wrapper provides a single interface for systematic autoencoder experimentation. Key features include:

### ✅ **Completed Features**
- **Unified Experiment Interface**: Single function to run complete experimental pipelines
- **Systematic Architecture Exploration**: Test multiple model configurations efficiently
- **Comprehensive Data Management**: Automatic dataset generation, splitting, and preprocessing
- **Result Persistence**: Automatic saving of models, metrics, and visualizations
- **Performance Analysis**: Built-in comparison and trend analysis
- **Reproducibility**: Consistent random seed management across experiments

### 🔄 **Integration with Existing Modules**
- ✅ **Data Module**: Uses `generate_dataset()`, `get_split_data()`, `prepare_data_for_training()`
- ✅ **Models Module**: Leverages `ModelFactory`, `create_autoencoder()`, `MODEL_ARCHITECTURES`
- ✅ **Experiment Module**: Extends `ExperimentRunner` with high-level orchestration
- ✅ **Visualization Module**: Integrates loss curves, reconstructions, latent space analysis
- ✅ **Utils Module**: Uses reproducibility, logging, and metrics utilities

### 🚀 **Usage Workflow**
1. **Configure**: Set dataset parameters and experiment grid
2. **Execute**: Run `run_systematic_experiments()` for comprehensive analysis
3. **Analyze**: Use built-in analysis and visualization functions
4. **Iterate**: Load previous results and compare across experiments

### 🔮 **Future Enhancements**
- **Hyperparameter Optimization**: Optuna integration for automated tuning
- **Distributed Training**: Multi-GPU and multi-node support
- **Advanced Metrics**: Additional evaluation metrics and statistical tests
- **Interactive Dashboards**: Real-time experiment monitoring
- **Model Comparison**: Side-by-side architecture performance analysis
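
The **Iterate** step of the usage workflow can be sketched with the standard library alone. The snippet below is a minimal sketch, assuming each run is saved as an `experiment_*.json` file containing the `success`, `architecture`, `latent_dim`, and `training_history['train_losses']` keys that `plot_learning_curves_comparison` reads. The helper names `load_experiment_summaries` and `best_per_architecture`, and the choice to treat the last recorded training loss as `final_train_loss`, are illustrative assumptions and not part of the wrapper.

```python
import json
from pathlib import Path


def load_experiment_summaries(results_dir):
    """Collect one summary row per successful experiment_*.json file (hypothetical helper)."""
    rows = []
    for json_file in sorted(Path(results_dir).glob("experiment_*.json")):
        try:
            with open(json_file, "r") as f:
                result = json.load(f)
        except (OSError, json.JSONDecodeError) as e:
            # Unreadable or malformed files are reported and skipped
            print(f"Skipping {json_file.name}: {e}")
            continue

        losses = result.get("training_history", {}).get("train_losses", [])
        if not result.get("success", False) or not losses:
            continue

        rows.append({
            "file": json_file.name,
            "architecture": result.get("architecture", "Unknown"),
            "latent_dim": result.get("latent_dim"),
            # Assumption: the final epoch's loss stands in for final_train_loss
            "final_train_loss": losses[-1],
        })
    return rows


def best_per_architecture(rows):
    """Return the lowest-loss row for each architecture."""
    best = {}
    for row in rows:
        current = best.get(row["architecture"])
        if current is None or row["final_train_loss"] < current["final_train_loss"]:
            best[row["architecture"]] = row
    return best
```

Passing the returned list to `pd.DataFrame(rows)` should give a frame with the `architecture`, `latent_dim`, and `final_train_loss` columns that `plot_performance_heatmap` pivots on, so results from earlier sessions can be compared without rerunning experiments.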