# Benchmark Quickstart

This notebook mirrors the CLI quickstart flow in an executable format. Use it to validate your environment and to inspect the outputs of a short benchmark run.

## 1. Load a configuration

Edit `CONFIG_PATH` if you want to point to a custom YAML file. The snippet below reads the file so we can reuse the metadata in later cells, and its last line displays the `training_id` as a quick check that the file parsed.

In [None]:
from pathlib import Path
import yaml

CONFIG_PATH = Path("config.yaml")
config = yaml.safe_load(CONFIG_PATH.read_text())
config['training_id']

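If the file lacks a `training_id` entry, the last line of the cell above fails with a bare `KeyError`. The sketch below shows one way to check earlier and report a clearer message. The helper name `require_keys` and the inline sample dictionary are illustrative assumptions, not part of the benchmark code.

```python
def require_keys(config, keys):
    """Return config unchanged, or raise a descriptive error if it is unusable."""
    # An empty YAML file parses to None, and a top-level list parses to a list.
    if not isinstance(config, dict):
        raise TypeError(
            f"expected a mapping at the top level, got {type(config).__name__}"
        )
    missing = [key for key in keys if key not in config]
    if missing:
        raise KeyError(f"config is missing required keys: {missing}")
    return config


# Made-up stand-in for the result of yaml.safe_load(CONFIG_PATH.read_text()).
sample_config = {"training_id": "demo"}
checked = require_keys(sample_config, ["training_id"])
print(checked["training_id"])
```

In the notebook, the same call could wrap the loaded `config` before any later cell reads from it.
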
## 2. Trigger training

Uncomment the next cell when you want to run training from inside the notebook. Keeping it commented out prevents accidental long-running jobs when the notebook is rendered on the documentation site.

In [None]:
# Uncomment to run. IPython expands {CONFIG_PATH} from the Python variable defined in section 1.
# !python run_training.py --config {CONFIG_PATH} --devices cpu

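An alternative that passes the Python-side `CONFIG_PATH` explicitly is to launch the script with `subprocess`. This is a minimal sketch, not the project's own launcher: the helper names `build_command` and `run_step` and the `dry_run` switch are assumptions introduced here, while the script name and flags are the ones used in this notebook. With `dry_run=True` (the default) nothing is executed, which keeps the protection against accidental long-running jobs.

```python
import subprocess
import sys
from pathlib import Path


def build_command(script, config_path, *extra_args):
    """Assemble the argument list for one pipeline script."""
    # sys.executable reuses the interpreter that is running the notebook kernel.
    return [sys.executable, script, "--config", str(config_path), *extra_args]


def run_step(script, config_path, *extra_args, dry_run=True):
    """Print the command, and execute it only when dry_run is False."""
    cmd = build_command(script, config_path, *extra_args)
    if dry_run:
        print("would run:", " ".join(cmd))
    else:
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(cmd, check=True)
    return cmd


CONFIG_PATH = Path("config.yaml")
train_cmd = run_step("run_training.py", CONFIG_PATH, "--devices", "cpu")
```

Calling `run_step(..., dry_run=False)` starts the job. The same helper fits the evaluation step by swapping in `run_eval.py` and its `--modes` arguments.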

## 3. Benchmark and collect metrics

After training finishes, call `run_eval.py` with the same configuration file.

In [None]:
# Uncomment to run. IPython expands {CONFIG_PATH} from the Python variable defined in section 1.
# !python run_eval.py --config {CONFIG_PATH} --modes accuracy timing


## 4. Inspect generated results

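Each surrogate writes a YAML summary under `results/<training_id>/<surrogate>/`. The cell below loads each surrogate's `accuracy.yaml` and collects the root mean squared error (`root_mean_squared_error_real`) into a dictionary keyed by surrogate name, so you can compare surrogates quickly. Surrogates without a summary file are skipped.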

In [None]:
from pathlib import Path
import yaml

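# Relies on `config` from the configuration cell in section 1.
results_root = Path("results") / config['training_id']
metrics = {}  # surrogate name -> RMSE (None if the key is absent)
if results_root.exists():
    # Sort so the output order is stable across runs.
    for surrogate_dir in sorted(results_root.glob('*')):
        summary = surrogate_dir / 'accuracy.yaml'
        if summary.exists():
            # `or {}` guards against an empty file or section, which loads as None.
            data = yaml.safe_load(summary.read_text()) or {}
            accuracy = data.get('accuracy') or {}
            metrics[surrogate_dir.name] = accuracy.get('root_mean_squared_error_real')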
metrics

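To compare surrogates at a glance, the dictionary can be sorted by error. The sketch below ranks entries from lowest to highest RMSE and skips surrogates whose value is missing. The helper name `rank_by_rmse` and the sample values are made up for illustration; in the notebook you would pass the `metrics` dictionary from the cell above.

```python
def rank_by_rmse(metrics):
    """Return (name, rmse) pairs sorted ascending, skipping missing values."""
    scored = [(name, value) for name, value in metrics.items() if value is not None]
    return sorted(scored, key=lambda item: item[1])


# Made-up values for illustration only; real numbers come from accuracy.yaml.
sample_metrics = {"surrogate_a": 0.12, "surrogate_b": 0.05, "surrogate_c": None}
ranking = rank_by_rmse(sample_metrics)
for name, rmse in ranking:
    print(f"{name}: {rmse:.4f}")
```
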
## Next steps

* Use `scripts/download_datasets.py` to fetch additional datasets.
* Try enabling `interpolation` or `uncertainty` inside the config file.
* Share the notebook with collaborators as a reproducible recipe.