# Bayesian Optimization - LeNet-300-100

Multi-objective (Pareto) hyperparameter search for soft weight-sharing on LeNet-300-100, trading compression ratio against accuracy loss.

## Workflow
1. **Run Bayesian Optimization**: 50 trials exploring the hyperparameter space
2. **Visualize Pareto Front**: Inspect the trade-offs between Compression Ratio (CR) and Accuracy Loss
3. **Select Best Hyperparameters**: Choose a solution from the Pareto front based on the desired trade-off

## Optimization Strategy
- **Objectives**: Maximize CR, Minimize accuracy loss (multi-objective)
- **Sampler**: NSGA-II (selected automatically in Pareto mode); this is an evolutionary algorithm, so "Bayesian optimization" is used loosely here
- **Search Space**:
  - `tau`: [3e-3, 1e-2] (log scale)
  - `gamma_alpha`: [100, 500]
  - `gamma_beta`: [0.05, 0.5]
  - `gamma_alpha_zero`: [3000, 6000]
  - `gamma_beta_zero`: [1, 5]
- **Output**: Pareto front of non-dominated solutions
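
To make "non-dominated" concrete: a trial belongs to the Pareto front if no other trial has a CR at least as high and an accuracy loss at least as low, with at least one of the two strictly better. The following is a minimal sketch of that filter in plain Python. The `dominates` and `pareto_front` helpers and the trial values are illustrative assumptions, not output of `tune_optuna.py`.

```python
def dominates(a, b):
    """Return True if trial a dominates trial b.

    Each trial is a (cr, acc_loss) pair: CR is maximized, accuracy loss minimized.
    """
    cr_a, loss_a = a
    cr_b, loss_b = b
    no_worse = cr_a >= cr_b and loss_a <= loss_b
    strictly_better = cr_a > cr_b or loss_a < loss_b
    return no_worse and strictly_better


def pareto_front(trials):
    """Keep only the trials that no other trial dominates."""
    return [t for t in trials if not any(dominates(o, t) for o in trials)]


# Hypothetical (CR, accuracy loss) pairs, for illustration only.
trials = [(20.0, 0.002), (35.0, 0.004), (30.0, 0.006), (50.0, 0.010), (18.0, 0.003)]
front = pareto_front(trials)
print(front)
```

In this example, `(30.0, 0.006)` is dropped because `(35.0, 0.004)` compresses more and loses less accuracy, and `(18.0, 0.003)` is dropped for the same reason relative to `(20.0, 0.002)`.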

---
## 1. Run Bayesian Optimization (50 Trials)

In [None]:
!python scripts/tune_optuna.py \
    --preset lenet_300_100 \
    --use-pareto \
    --n-trials 50 \
    --save-dir BO_results \
    --load-pretrained checkpoints/mnist_lenet_300_100_pre.pt \
    --pretrain-epochs 0 \
    --retrain-epochs 40 \
    --batch-size 128 \
    --num-workers 4 \
    --quant-skip-last \
    --eval-every 20 \
    --cr-every 20 \
    --seed 42

---
## 2. Visualize Pareto Front

In [None]:
!python scripts/tune_optuna_pareto_viz.py \
    --pareto-json BO_results/*_pareto_results.json \
    --annotate

---
## 3. Display Pareto Front Plot

In [None]:
from IPython.display import Image, display
import glob

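# Find and display the Pareto plot. The path is relative to the same working
# directory as the commands above, which write to BO_results/.
plot_files = glob.glob("BO_results/*_pareto_results_plot.png")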
if plot_files:
    plot_path = plot_files[0]
    print(f"Displaying Pareto front: {plot_path}\n")
    display(Image(filename=plot_path))
else:
    print("⚠️  Pareto plot not found. Check if visualization completed successfully.")

---
## 4. Show Pareto-Optimal Solutions

In [None]:
import json
import glob

# Load Pareto results (same BO_results/ directory passed to --save-dir above)
pareto_files = glob.glob("BO_results/*_pareto_results.json")
if not pareto_files:
    print("⚠️  Pareto results JSON not found.")
else:
    pareto_file = pareto_files[0]
    with open(pareto_file) as f:
        data = json.load(f)
    
    print("="*80)
    print(f"PARETO FRONT - {data['study_name']}")
    print("="*80)
    print(f"Total trials:        {data['n_trials']}")
    print(f"Pareto solutions:    {data['n_pareto']}")
    print(f"Preset:              {data['preset']}")
    print("="*80)
    print()
    
    if data['n_pareto'] == 0:
        print("⚠️  No Pareto-optimal solutions found. All trials may have failed.")
        print("   Check error logs in BO_results/*/ERROR.txt")
    else:
        print(f"Found {data['n_pareto']} Pareto-optimal solutions:\n")
        
        for i, sol in enumerate(data['pareto_front']):
            print(f"{'='*80}")
            print(f"Solution {i+1} - Trial #{sol['trial_number']}")
            print(f"{'='*80}")
            print(f"  Compression Ratio (CR):    {sol['CR']:.2f}x")
            print(f"  Accuracy Loss:             {sol['acc_loss']:.4f}")
            print(f"  Accuracy (pretrain):       {sol['acc_pre']:.4f}")
            print(f"  Accuracy (quantized):      {sol['acc_quantized']:.4f}")
            print(f"  Accuracy drop (pp):        {sol['acc_drop_pp']:.2f} pp")
            print(f"\n  Hyperparameters:")
            for k, v in sol['params'].items():
                if isinstance(v, float):
                    print(f"    {k:<25} {v:.6g}")
                else:
                    print(f"    {k:<25} {v}")
            print(f"\n  Run directory: {sol['run_dir']}")
            print()

---
## 5. Export Pareto Front to CSV

In [None]:
# CSV is automatically generated by tune_optuna_pareto_viz.py
import glob

csv_files = glob.glob("BO_results/*_pareto_results_plot.csv")
if csv_files:
    csv_path = csv_files[0]
    print(f"Pareto front CSV exported to: {csv_path}")
    print("\nPreview:")
    !head -20 {csv_path}
else:
    print("⚠️  CSV file not found.")

---
## 6. Selection Guide

### How to Choose from Pareto Front

Every solution on the Pareto front is non-dominated: none of them can gain compression without giving up accuracy. Which one to pick depends on the deployment target:

- **High CR, Higher Acc Loss**: Solutions on the right side of the plot
  - Best for: Maximum compression when a slight accuracy drop is tolerable
  - Use case: Deployment on extremely resource-constrained devices

- **Moderate CR, Low Acc Loss**: Solutions in the middle
  - Best for: Balanced compression and accuracy
  - Use case: General deployment scenarios

- **Lower CR, Minimal Acc Loss**: Solutions on the left side
  - Best for: Preserving accuracy with modest compression
  - Use case: Applications where accuracy is critical
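
One way to turn these guidelines into a rule is to fix an accuracy-loss budget and take the highest-CR solution that stays within it. The sketch below assumes each entry of `pareto_front` carries the `CR`, `acc_loss`, and `trial_number` keys printed in section 4. The `select_by_budget` helper, the budget value, and the sample entries are hypothetical, and the budget must use the same unit as the `acc_loss` values stored in the JSON.

```python
def select_by_budget(pareto_front, max_acc_loss):
    """Return the highest-CR solution whose accuracy loss is within budget.

    Returns None when no solution satisfies the budget.
    """
    feasible = [s for s in pareto_front if s["acc_loss"] <= max_acc_loss]
    if not feasible:
        return None
    return max(feasible, key=lambda s: s["CR"])


# Hypothetical entries mimicking the structure of the Pareto results JSON.
example_front = [
    {"trial_number": 7, "CR": 20.0, "acc_loss": 0.002},
    {"trial_number": 12, "CR": 35.0, "acc_loss": 0.004},
    {"trial_number": 31, "CR": 50.0, "acc_loss": 0.010},
]

choice = select_by_budget(example_front, max_acc_loss=0.005)
print(choice["trial_number"], choice["CR"])
```

A tight budget moves the choice toward the low-CR end of the front, and a loose one toward the high-CR end. If nothing fits the budget, the helper returns `None` rather than silently picking an over-budget solution.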

### Next Steps

1. **Select a solution** from the Pareto front above
2. **Copy the hyperparameters** (tau, gamma_alpha, gamma_beta, etc.)
3. **Open `compression.ipynb`**
4. **Paste the hyperparameters** into the user input cell
5. **Run full 100-epoch training** with your selected hyperparameters
6. **Generate all diagnostics** (GIF, plots, compression report)
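
Steps 1 and 2 can also be done programmatically instead of copying values from the printed output. The sketch below looks up a solution by trial number and prints its hyperparameters as JSON. It assumes the `pareto_front`, `trial_number`, and `params` keys used in section 4. The `params_for_trial` helper and the sample data are hypothetical, and the format expected by the input cell in `compression.ipynb` is not specified here, so adapt the output as needed.

```python
import json


def params_for_trial(data, trial_number):
    """Return the hyperparameter dict of the Pareto solution with this trial number."""
    for sol in data["pareto_front"]:
        if sol["trial_number"] == trial_number:
            return sol["params"]
    raise KeyError(f"Trial {trial_number} is not on the Pareto front")


# Hypothetical stand-in for the loaded Pareto results JSON.
data = {
    "pareto_front": [
        {"trial_number": 12, "params": {"tau": 0.005, "gamma_alpha": 250.0}},
        {"trial_number": 31, "params": {"tau": 0.008, "gamma_alpha": 400.0}},
    ]
}

selected = params_for_trial(data, 12)
print(json.dumps(selected, indent=2))
```

In the notebook, `data` would be the dictionary already loaded from the Pareto results JSON in section 4, so only the function and the final two lines are needed.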

---
## Summary

Bayesian Optimization complete!

### Outputs:
- Pareto front plot: `BO_results/*_pareto_results_plot.png`
- Pareto solutions JSON: `BO_results/*_pareto_results.json`
- Pareto solutions CSV: `BO_results/*_pareto_results_plot.csv`
- Individual trial results: `BO_results/sws_tune_lenet_300_100_*_t*/`

### Recommended Workflow:
1. ✅ Analyze Pareto front plot above
2. ✅ Select solution based on CR vs accuracy trade-off
3. ➡️ Copy hyperparameters to `compression.ipynb`
4. ➡️ Run full training with selected hyperparameters