# Figures

## 1. Pareto Front

**Purpose:** Show trade-offs between conflicting objectives such as PSNR, training time, and memory usage.

**Plot Types:**
- 2D Scatter Plot: PSNR vs. Time or PSNR vs. Memory
- 3D Scatter Plot: PSNR vs. Time vs. Memory

**Instructions:**
- Color-code by generation or evaluation type (real vs. surrogate).
- Optionally highlight the true Pareto front formed by real evaluations only.
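
Highlighting the front first requires identifying the non-dominated individuals. The sketch below is one way to do that with NumPy; it assumes PSNR is maximized while time and memory are minimized, and the example values are made up for illustration.

```python
import numpy as np


def pareto_mask(points):
    """Return a boolean mask of non-dominated rows; every column is minimized."""
    points = np.asarray(points, dtype=float)
    mask = np.ones(len(points), dtype=bool)
    for i, p in enumerate(points):
        # p is dominated if another row is no worse in every column
        # and strictly better in at least one
        no_worse = np.all(points <= p, axis=1)
        better = np.any(points < p, axis=1)
        if np.any(no_worse & better):
            mask[i] = False
    return mask


# Hypothetical objective values; PSNR is negated so that all columns are minimized
psnr = np.array([30.0, 32.0, 31.0, 29.0])
time_s = np.array([100.0, 150.0, 160.0, 120.0])
memory_mb = np.array([500.0, 700.0, 800.0, 600.0])
front = pareto_mask(np.column_stack([-psnr, time_s, memory_mb]))
```

To restrict the front to real evaluations, filter the rows before calling `pareto_mask`; the resulting mask can then select the points to draw with a distinct marker or color.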


## 2. Fitness Evolution Over Generations

**Purpose:** Visualize convergence behavior and performance improvement.

**Plot Types:**
- Line plots of best, average, and worst PSNR per generation
- Separate plots for training time and memory usage

**Instructions:**
- Compute min, max, and average fitness values for each generation using the `logbook` or JSON files.
- Use shaded regions to indicate variability (min to max range).
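
A minimal sketch of the per-generation statistics and the shaded plot follows. The field names `generation` and `psnr` are assumptions about the log records, not confirmed by the logs themselves, and the sample records are invented.

```python
from collections import defaultdict
from statistics import mean


def fitness_stats(records, key="psnr"):
    """Return sorted generations plus min, mean, and max of `key` for each one."""
    by_gen = defaultdict(list)
    for rec in records:
        by_gen[rec["generation"]].append(rec[key])
    gens = sorted(by_gen)
    lows = [min(by_gen[g]) for g in gens]
    means = [mean(by_gen[g]) for g in gens]
    highs = [max(by_gen[g]) for g in gens]
    return gens, lows, means, highs


def plot_fitness(records, key="psnr"):
    """Draw the average as a line and the min-to-max range as a shaded band."""
    import matplotlib.pyplot as plt  # imported here so the statistics work without it

    gens, lows, means, highs = fitness_stats(records, key)
    fig, ax = plt.subplots()
    ax.fill_between(gens, lows, highs, alpha=0.3, label="min to max")
    ax.plot(gens, means, marker="o", label="average")
    ax.set_xlabel("Generation")
    ax.set_ylabel(key)
    ax.legend()
    return fig


# Hypothetical records; the real logs may use different field names
records = [
    {"generation": 0, "psnr": 24.0},
    {"generation": 0, "psnr": 26.0},
    {"generation": 1, "psnr": 27.0},
    {"generation": 1, "psnr": 29.0},
]
gens, lows, means, highs = fitness_stats(records)
```

Calling `plot_fitness(records, key=...)` once per objective gives the separate plots for training time and memory usage, provided those fields exist in the records.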

## 3. Diversity of Individuals

**Purpose:** Assess exploration vs. exploitation and population diversity.

**Plot Types:**
- Box plots or violin plots of fitness values per generation
- PCA or t-SNE scatter plot of individuals’ hyperparameters, colored by generation
- Heatmap of pairwise distances between individuals per generation

**Instructions:**
- Extract individuals’ hyperparameter vectors.
- For PCA/t-SNE: reduce dimensionality of hyperparameters to 2D, color by generation.
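
The PCA variant can be sketched with plain NumPy as below. Columns are standardized first, on the assumption that the hyperparameters live on very different scales; the random population and generation labels are placeholders for the real vectors.

```python
import numpy as np


def pca_2d(X):
    """Project the rows of X onto their first two principal components."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns at zero instead of dividing by zero
    Z = (X - X.mean(axis=0)) / std
    # Rows of Vt are the principal directions, ordered by decreasing variance
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:2].T


# Hypothetical population: 6 individuals x 3 hyperparameters, plus generation labels
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
generation = np.array([0, 0, 1, 1, 2, 2])
coords = pca_2d(X)
```

The two columns of `coords` are the scatter plot axes, and `generation` can be passed as the color argument. Categorical hyperparameters would need to be encoded numerically before this step.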

## 4. Surrogate Model Accuracy

**Purpose:** Evaluate surrogate model prediction accuracy.

**Plot Types:**
- Scatter Plot: Predicted vs. True PSNR (and other objectives)
- Line Plot: Surrogate RMSE or R² score over time (as more real evaluations are added)

**Instructions:**
- Use only real-evaluated individuals as ground truth.
- Optionally evaluate on a held-out validation set if available.
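
Both metrics are short enough to compute directly. The sketch below uses their standard definitions; the sample values are invented, with `y_true` standing in for real-evaluated PSNR and `y_pred` for the surrogate's predictions on the same individuals.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical PSNR values for four real-evaluated individuals
y_true = [30.0, 32.0, 28.0, 34.0]
y_pred = [31.0, 31.0, 29.0, 33.0]
error = rmse(y_true, y_pred)
score = r2_score(y_true, y_pred)
```

Recomputing these after each generation, using only individuals the surrogate had not yet been trained on, yields the points for the line plot. Note that `r2_score` is undefined when all true values are identical.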

## 5. Gaussian Process Uncertainty Visualization

**Purpose:** Show uncertainty in surrogate predictions (for interpretability).

**Plot Types:**
- 1D Line Plot: GP mean ± 2 standard deviations for one hyperparameter (fix others)
- 2D Heatmap/Contour: For two selected hyperparameters, showing predicted PSNR and uncertainty

**Instructions:**
- Fix other hyperparameters to median values.
- Generate plots at different training stages (e.g., gen 5, 15, 30) to show evolution.
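
To illustrate the 1D plot, the sketch below computes a GP posterior mean and standard deviation along a single hyperparameter. It is a hand-rolled GP with an RBF kernel and fixed, assumed kernel settings, not the surrogate used in the experiments; the training points are invented.

```python
import numpy as np


def rbf(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two 1D arrays of inputs."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)


def gp_posterior(x_train, y_train, x_test, noise=1e-4, **kernel_args):
    """Posterior mean and std of a zero-mean GP fitted to mean-centered targets."""
    y_mean = y_train.mean()  # center targets so the zero-mean prior is reasonable
    K = rbf(x_train, x_train, **kernel_args) + noise * np.eye(len(x_train))
    K_s = rbf(x_train, x_test, **kernel_args)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    mean = y_mean + K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = rbf(x_test, x_test, **kernel_args).diagonal() - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))


# Hypothetical 1D slice: one hyperparameter varies, the others are held fixed
x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = np.array([25.0, 28.0, 30.0, 29.0])  # made-up PSNR values
x_grid = np.linspace(-1.0, 4.0, 50)
mean, std = gp_posterior(x_train, y_train, x_grid)
lower, upper = mean - 2 * std, mean + 2 * std  # band for the mean +/- 2 std plot
```

With a multi-dimensional surrogate, the equivalent step is to build a grid in which one column sweeps the chosen hyperparameter and every other column is set to its median, then query the surrogate for mean and standard deviation on that grid.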

## 6. Parameter Importance

**Purpose:** Identify which hyperparameters have the greatest effect on each objective.

**Plot Types:**
- Bar Plot of feature importances (e.g., via per-dimension GP kernel lengthscales, where a shorter lengthscale indicates a more influential parameter, or via permutation importance)
- Partial Dependence Plot: Effect of a single parameter on predicted output

**Instructions:**
- Run sensitivity analysis or extract kernel properties from trained surrogate models.
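
One form of sensitivity analysis is permutation importance, sketched below: shuffle one input column at a time and record how much the prediction error grows. The `predict` function here is a hypothetical stand-in for a trained surrogate, and the data are random.

```python
import numpy as np


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each column of X is shuffled in turn."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    base = np.mean((predict(X) - y) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            scores[j] += np.mean((predict(X_perm) - y) ** 2) - base
    return scores / n_repeats


# Hypothetical stand-in for a trained surrogate: only the first input matters
def predict(X):
    return 3.0 * X[:, 0]


rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = predict(X)
importance = permutation_importance(predict, X, y)
```

The entries of `importance` map directly onto the bars of the bar plot, one per hyperparameter. Correlated hyperparameters can share or mask importance under this method, so the ranking is indicative rather than exact.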

## 7. Parameter Distribution Over Generations

**Purpose:** Show how exploration and selection pressure change over time.

**Plot Types:**
- Violin or KDE plots of parameter values per generation
- Stacked histograms or ridge plots showing value distributions

**Instructions:**
- Use `params` in each JSON log to extract hyperparameter values.
- Group and plot by generation.
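
The grouping step might look like the sketch below. The `params` key comes from the logs as described above; the `generation` key and the hyperparameter name `lr` are assumptions made for illustration.

```python
from collections import defaultdict


def params_by_generation(records, name):
    """Map each generation to the list of values taken by hyperparameter `name`."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["generation"]].append(rec["params"][name])
    return dict(sorted(grouped.items()))


# Hypothetical log records; real records may carry more fields
records = [
    {"generation": 0, "params": {"lr": 0.01}},
    {"generation": 0, "params": {"lr": 0.02}},
    {"generation": 1, "params": {"lr": 0.015}},
]
lr_by_gen = params_by_generation(records, "lr")
```

The values of the returned dictionary form one dataset per generation, which is the shape that violin and box plot functions generally expect.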

---


- **Animation:** Animate Pareto front changes over time using `matplotlib.animation` or `plotly`.
- **Top Individuals Table:** Include a static table summarizing the best individuals and their hyperparameters.
- **Interactive Dashboard:** Use `plotly` or `streamlit` to explore results interactively (for presentation or supplementary material).


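```python
import json
import os
import re

import pandas as pd

log_dir = "logs/optimization"


def generation_index(filename):
    """Sort key: the first integer in the file name, so gen_10 follows gen_9."""
    match = re.search(r"\d+", filename)
    return int(match.group()) if match else -1


# Load the records of every generation, in numeric generation order
all_data = []
gen_files = [
    name for name in os.listdir(log_dir)
    if name.startswith("gen_") and name.endswith(".json")
]
for filename in sorted(gen_files, key=generation_index):
    with open(os.path.join(log_dir, filename)) as f:
        gen_data = json.load(f)  # expected to be a list of per-individual records
    all_data.extend(gen_data)

# Convert to a DataFrame for easier filtering
df = pd.DataFrame(all_data)
```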
