# SOLAS Automated Evaluation Notebook

This notebook runs controlled experiments to evaluate SOLAS pipeline performance across different configurations.

## Design Principles

1. **Controlled experiments**: Vary ONE parameter at a time while holding others fixed
2. **Config-based caching**: Hash configurations (including hardware runtime) to detect duplicates and skip re-runs
3. **Resumable**: Automatically resumes from the last completed experiment after a runtime restart
4. **Persistent storage**: All data saved to Google Drive immediately after each experiment
5. **Comprehensive metrics**: Time, RAM, VRAM, hardware info, full inputs/outputs for every stage
6. **Model lifecycle management**: Explicit load/unload to ensure accurate memory measurements
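
Principles 2 to 4 can be illustrated with a small, standalone sketch. It is not the SOLAS implementation: `config_hash`, `load_completed`, `run_once`, the JSON results file, and the config keys used in the usage note are all hypothetical names chosen for illustration. The idea is to serialize each configuration canonically, hash it, and skip any configuration whose hash is already stored on disk.

```python
import hashlib
import json
from pathlib import Path


def config_hash(config: dict) -> str:
    """Return a stable SHA-256 digest for a JSON-serializable config."""
    # sort_keys makes the digest independent of key insertion order.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def load_completed(results_file: Path) -> dict:
    """Load previously saved results, keyed by config hash."""
    if results_file.exists():
        return json.loads(results_file.read_text(encoding="utf-8"))
    return {}


def run_once(config: dict, results_file: Path, run_fn) -> bool:
    """Run `run_fn(config)` unless an identical config is already stored.

    Returns True if the experiment ran, False if it was skipped.
    """
    completed = load_completed(results_file)
    key = config_hash(config)
    if key in completed:
        return False
    completed[key] = {"config": config, "result": run_fn(config)}
    # Persist immediately so a restart loses at most the run in progress.
    results_file.write_text(json.dumps(completed, indent=2), encoding="utf-8")
    return True
```

Under these assumptions, calling `run_once({"chunk_size": 2000, "gpu": "T4"}, path, fn)` twice runs `fn` only the first time. Changing the hypothetical `gpu` entry produces a different hash, so the same settings on different hardware count as a separate experiment.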

## Experiments

| Experiment | Purpose | Tests |
|------------|---------|-------|
| ASR Models | Compare Whisper tiny/small/large | 3 |
| Quantization | None vs 4-bit on all LLMs | 8 |
| Repetition Penalty | None vs 1.2 on smallest and largest LLMs | 4 |
| Summary Mode | greedy/sampled | 2 |
| Chunk Size | 2000 vs 4000 chars | 2 |
| Temperature | 0.2/0.5 on Mistral-7B | 2 |
| **Total** | | **21** |
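
As a rough illustration of how a one-parameter-at-a-time test matrix like the one above could be generated, the sketch below copies a baseline configuration and overrides a single parameter per test. The function name, the parameter names, and the baseline values are assumptions made for this example. Only the chunk sizes and temperatures are taken from the table, and the notebook's actual baseline is not stated here.

```python
def one_factor_at_a_time(baseline: dict, variations: dict) -> list:
    """Build configs that differ from `baseline` in at most one parameter."""
    configs = []
    for name, values in variations.items():
        for value in values:
            # Copy the baseline and override only the parameter under test.
            configs.append({**baseline, name: value})
    return configs


# Hypothetical parameter names and baseline; values come from the table above.
baseline = {"chunk_size": 2000, "temperature": 0.2, "repetition_penalty": None}
variations = {
    "chunk_size": [2000, 4000],
    "temperature": [0.2, 0.5],
}
configs = one_factor_at_a_time(baseline, variations)
```

With this baseline, two of the four generated configurations are identical to the baseline itself. Overlap of this kind is what the config-based caching described under Design Principles is meant to detect, so that a repeated configuration is not run twice.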

## Results Analysis

After running experiments, use the **[SOLAS_Analysis.ipynb](SOLAS_Analysis.ipynb)** notebook to visualize and analyze results.
The Analysis notebook does not require a GPU and can be run on any runtime.

In [None]:
# @title ### Setup & Engine
# @markdown Initialize environment, clone SOLAS repository, and configure evaluation system.

import sys
import subprocess
from pathlib import Path

# Clone/update SOLAS repository
if Path('SOLAS').exists():
    subprocess.run(['git', 'pull'], check=True, cwd='SOLAS')
else:
    subprocess.run(['git', 'clone', 'https://github.com/andrecarini/SOLAS.git'], check=True)

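# Use an absolute path so imports keep working if the working directory changes
sys.path.insert(0, str(Path('SOLAS').resolve()))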

# Check environment and setup dependencies
from library import check_colab_environment, setup_environment_with_progress
check_colab_environment()
setup_result = setup_environment_with_progress()

# Set global flags for other cells to check
RESTART_NEEDED = setup_result.get('restart_needed', False)
SETUP_COMPLETE = True

# Create evaluation interface (handles Google Drive mounting automatically)
from library import EvaluationNotebook
evaluation = EvaluationNotebook(
    solas_dir=None,                      # Auto-detect: /content/SOLAS (Colab) or ./SOLAS (local)
    use_gdrive=None,                     # Auto-detect: True in Colab, False otherwise
    gdrive_mount_point='/gdrive',        # Where to mount Google Drive
    gdrive_folder='SOLAS',               # Folder name in Google Drive MyDrive
    gdrive_symlink='/content/gdrive',    # Symlink path for easy access
    local_dir='./evaluation_results'     # Local directory when not using Google Drive
)
evaluation.print_setup_info()

In [None]:
# @title ### Run All Experiments
# @markdown Execute all remaining experiments. Safe to re-run - automatically skips completed experiments and duplicates.

# ============================================================================
# Run basic checks
# ============================================================================
if 'RESTART_NEEDED' in globals() and RESTART_NEEDED:
    from library import show_restart_warning
    show_restart_warning()

if 'SETUP_COMPLETE' not in globals() or not SETUP_COMPLETE:
    raise RuntimeError("Setup not completed. Please run the first cell to complete environment setup.")

# ============================================================================
# Execute evaluation
# ============================================================================
evaluation.run_evaluation(dry_run=False);  # Semicolon suppresses results output

In [None]:
# @title ### View Results
# @markdown Display evaluation results summary with metrics and completion status.

# ============================================================================
# Run basic checks
# ============================================================================
if 'RESTART_NEEDED' in globals() and RESTART_NEEDED:
    from library import show_restart_warning
    show_restart_warning()

if 'SETUP_COMPLETE' not in globals() or not SETUP_COMPLETE:
    raise RuntimeError("Setup not completed. Please run the first cell to complete environment setup.")

# ============================================================================
# Display results
# ============================================================================
evaluation.display_results()