
# Reproducing the Experiments

This notebook accompanies the paper *Improved Convergence in Parameter-Agnostic Error Feedback through Momentum* (2025) and regenerates the plots shown in its main figures. It assumes you have already recreated the `ef21-hess` Conda environment as described in the project `README.md` and that the `_release_data/` directory (shipped with this repo) is intact.

> **Tip:** The serialized runs stored under `_release_data/` already contain the hyper-parameter-tuned configurations we report in the paper, so you do **not** need to retrain the models to follow this notebook.



## Release Data

- Each `.pickle` file under `_release_data/` corresponds to the averaged metrics of a tuned run (e.g., `resnet18_cifar10_EF21_IGT_NORM_topk-0,1_lr-0,1_eta-1,0_p-None_q-0,57.pickle`).
- The plotting utilities below simply load those pickles and re-render the corresponding curves.
- If you train new models with `train.py`, drop the saved pickles into the same folder and adjust the glob patterns in the cells below.
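
The example filename above suggests a naming scheme: model, dataset, and method joined by underscores, followed by `key-value` hyper-parameter tokens that use a comma as the decimal separator. Below is a minimal sketch for decoding such names. It assumes the scheme holds for every file, which is inferred from that single example, and `parse_run_name` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def parse_run_name(path):
    """Split a release-data filename into model, dataset, method, and hyper-parameters.

    Assumed layout (inferred from one example, not confirmed by the repo):
    <model>_<dataset>_<METHOD_WITH_UNDERSCORES>_<key>-<value>_..._<key>-<value>.pickle
    """
    tokens = Path(path).stem.split("_")
    # Hyper-parameter tokens are the ones that contain a "-" separator.
    first_param = next((i for i, t in enumerate(tokens) if "-" in t), len(tokens))
    name_tokens, param_tokens = tokens[:first_param], tokens[first_param:]

    params = {}
    for token in param_tokens:
        key, _, raw = token.partition("-")
        if raw == "None":
            params[key] = None
            continue
        try:
            # Filenames use "," as the decimal separator, e.g. "0,57" -> 0.57.
            params[key] = float(raw.replace(",", "."))
        except ValueError:
            params[key] = raw  # keep non-numeric values as plain strings

    return {
        "model": name_tokens[0],
        "dataset": name_tokens[1],
        "method": "_".join(name_tokens[2:]),
        "params": params,
    }


example = "resnet18_cifar10_EF21_IGT_NORM_topk-0,1_lr-0,1_eta-1,0_p-None_q-0,57.pickle"
info = parse_run_name(example)
print(info["method"], info["params"])
```

A helper like this can be handy when adjusting glob patterns, since it lets you filter files by method or hyper-parameter value instead of matching raw strings.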



## 0. Runtime sanity check

Before launching training or plotting, verify that your notebook kernel can see the intended CUDA stack. The cell below prints whether CUDA is available, the CUDA version PyTorch was built against, the PyTorch version, and the detected GPUs, so you can double-check that `ef21-hess` was activated properly.


In [None]:

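# Importing torchvision and wandb here also confirms they are installed in this kernel.
import gc, torch, pickle, torchvision, wandb

print("CUDA available?", torch.cuda.is_available())
print("CUDA version:", torch.version.cuda)  # None on CPU-only PyTorch builds
print("Torch version:", torch.__version__)
# List every GPU visible to PyTorch (prints nothing if there are none).
for i in range(torch.cuda.device_count()):
    print(f"Device {i}: {torch.cuda.get_device_name(i)}")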
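
The check above covers the CUDA stack but not the environment name itself. As a complementary sketch, the snippet below reads `CONDA_DEFAULT_ENV`, which `conda activate` sets, and prints the interpreter path. `active_conda_env` and `EXPECTED_ENV` are illustrative names, not part of the repository.

```python
import os
import sys

EXPECTED_ENV = "ef21-hess"


def active_conda_env(environ=None):
    """Return the name of the activated Conda environment, or None if unset."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


env_name = active_conda_env()
if env_name != EXPECTED_ENV:
    print(f"Warning: active Conda env is {env_name!r}, expected {EXPECTED_ENV!r}")
# The interpreter path usually contains the environment name as well.
print("Python executable:", sys.executable)
```

Note that a Jupyter kernel inherits environment variables from the process that started it, so `sys.executable` is typically the more reliable signal of which environment the kernel actually runs in.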



## 1. Figures 1–3: Replot tuned experiments

Run the cell below to regenerate the train/test loss and test-accuracy panels that appear as Figures 1–3 in the paper. Set `X_MODE` to choose the x-axis: `"time"`, `"bp_total"`, `"exbp_total"`, `"epoch"`, or `"gpu_seconds"`.


In [None]:

from repro_plots import build_curve_specs, plot_metrics

# Choose the x-axis: "time", "bp_total", "exbp_total", "epoch", or "gpu_seconds"
X_MODE = "time"
CURVES = build_curve_specs(folder="_release_data/")

plot_metrics(curves=CURVES, x_mode=X_MODE, truncate_to_fastest=False)
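
Since `X_MODE` is a free-form string, a typo may only surface deep inside the plotting code. A small guard such as the following can catch it early. This is a sketch: `check_x_mode` and `VALID_X_MODES` are hypothetical names, and the tuple simply mirrors the five modes listed above rather than anything read from `repro_plots`.

```python
# The five modes documented in this notebook; not read from repro_plots itself.
VALID_X_MODES = ("time", "bp_total", "exbp_total", "epoch", "gpu_seconds")


def check_x_mode(x_mode):
    """Return x_mode unchanged if it is one of the documented modes, else raise."""
    if x_mode not in VALID_X_MODES:
        raise ValueError(f"Unknown X_MODE {x_mode!r}; expected one of {VALID_X_MODES}")
    return x_mode


X_MODE = check_x_mode("epoch")
print("Using x-axis mode:", X_MODE)
```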



### 1.1 Figure 4: Truncated time-to-solution view

Figure 4 in the paper plots the same three metrics but truncates the time axis to the fastest method so the “time-to-quality” comparison is easier to read. Execute the next cell (again, set `X_MODE`) to reproduce that view.


In [None]:

from repro_plots import build_curve_specs, plot_metrics

# Choose the x-axis: "time", "bp_total", "exbp_total", "epoch", or "gpu_seconds"
X_MODE = "time"
CURVES = build_curve_specs(folder="_release_data/")

plot_metrics(curves=CURVES, x_mode=X_MODE, truncate_to_fastest=True)
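
To make the truncation idea concrete, here is a minimal sketch of one plausible reading: every curve is cut at the smallest final x-value among all curves, so the plot ends where the fastest method finishes. This is an illustration only, not the implementation behind `truncate_to_fastest=True`; `clip_to_fastest` and the demo numbers are made up.

```python
def clip_to_fastest(curves):
    """Cut every curve at the smallest final x-value among all curves.

    curves maps a label to (xs, ys): two equal-length, non-empty lists
    with xs in increasing order.
    """
    cutoff = min(xs[-1] for xs, _ in curves.values())
    clipped = {}
    for label, (xs, ys) in curves.items():
        kept = [(x, y) for x, y in zip(xs, ys) if x <= cutoff]
        clipped[label] = ([x for x, _ in kept], [y for _, y in kept])
    return cutoff, clipped


# Made-up numbers for illustration: seconds elapsed vs. test loss.
demo = {
    "fast": ([10, 20, 30], [0.9, 0.5, 0.3]),
    "slow": ([15, 30, 45, 60], [0.9, 0.7, 0.5, 0.4]),
}
cutoff, clipped = clip_to_fastest(demo)
print(cutoff, clipped["slow"])
```

With a shared cutoff, each curve shows what quality a method reached within the same budget, which is what makes the comparison easier to read.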



## 2. Run new experiments

Use the helpers in `repro_training.py` to launch fresh sweeps directly from this notebook. You can reuse the tuned presets from the paper or override them with your own grids; only the method list and high-level knobs (epochs, dataset, `topk` ratio, etc.) need to stay in the notebook.

> **W&B login:** Before executing the cell below, open a terminal in which the `ef21-hess` environment is activated and run:
>
> ```bash
> wandb login --relogin    # paste the API key from https://wandb.ai/authorize
> # or keep runs local:
> wandb offline
> ```
>
> This prevents your experiments from inheriting whatever account was previously cached on the machine.
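
If you prefer to keep runs local without leaving the notebook, W&B also honors the `WANDB_MODE` environment variable. The sketch below sets it for the current kernel only. Whether `run_methods` passes its own `mode` to `wandb.init` (which would take precedence) is not something this notebook confirms, so treat this as an assumption to verify.

```python
import os

# Must run before the first wandb.init() call in this kernel.
# "offline" stores runs on disk instead of uploading them.
os.environ["WANDB_MODE"] = "offline"
print("WANDB_MODE =", os.environ["WANDB_MODE"])
```

Offline runs can be uploaded later from a terminal with `wandb sync`, once you are logged in to the intended account.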


In [None]:

from repro_training import default_gen_config, tuned_presets, run_methods

METHODS = ["EF21_RHM_NORM"]             # choose any keys from tuned_presets()
GEN_CFG = default_gen_config(epochs=5)  # override epochs/dataset/seed if needed
PRESETS = tuned_presets()               # or edit individual entries before running

run_methods(
    METHODS,
    gen_cfg=GEN_CFG,
    presets=PRESETS,
    project_name="EF21_SOM",
    topk_ratio=0.1,
    n_workers=10,
    batch_size=64,
)
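
Editing an individual preset entry before running could look like the sketch below. It assumes `tuned_presets()` returns a dictionary that maps each method name to a dictionary of hyper-parameters; that structure, the helper `override_preset`, and the keys `lr` and `eta` are assumptions for illustration. Inspect the real return value before relying on them.

```python
import copy


def override_preset(presets, method, **overrides):
    """Return a deep copy of presets with some fields of one method replaced."""
    if method not in presets:
        raise KeyError(f"{method!r} is not a preset; available: {sorted(presets)}")
    updated = copy.deepcopy(presets)  # leave the tuned values untouched
    updated[method].update(overrides)
    return updated


# Stand-in for tuned_presets(); the real keys and values may differ.
demo_presets = {"EF21_RHM_NORM": {"lr": 0.1, "eta": 1.0}}
demo_updated = override_preset(demo_presets, "EF21_RHM_NORM", lr=0.05)
print(demo_updated["EF21_RHM_NORM"])
```

Working on a copy keeps the paper's tuned configuration available for comparison in the same session.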



### 2.1 Batch automation via tmux

For the full tuning sweep reported in the paper, we queued the methods sequentially inside detached `tmux` sessions. The repository ships a ready-to-run launcher (`run_all_experiments.sh`) that iterates over the tuned presets and calls `python -m repro_training --method <name>` inside each session. Update the `METHODS` array and the `CONDA_ENV` and `PROJECT_NAME` variables at the top of that script, then run:

```bash
./run_all_experiments.sh
```

Each run's output is logged to `train_<METHOD>_<timestamp>.log`, so you can monitor long sweeps without keeping notebook tabs open.
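
For readers who want to start a single detached run from Python rather than through the shell launcher, the sketch below builds an equivalent `tmux` command. It is not the contents of `run_all_experiments.sh`: the session name, the timestamp format, and the helper `build_tmux_command` are assumptions, and it presumes `tmux` is launched from a shell where `ef21-hess` is already active.

```python
import shlex
from datetime import datetime


def build_tmux_command(method, timestamp=None):
    """Build the argv for a detached tmux session that trains one method.

    Output is redirected to train_<METHOD>_<timestamp>.log in the working directory.
    """
    timestamp = timestamp or datetime.now().strftime("%Y%m%d-%H%M%S")
    log_file = f"train_{method}_{timestamp}.log"
    shell_cmd = (
        f"python -m repro_training --method {shlex.quote(method)} "
        f"> {shlex.quote(log_file)} 2>&1"
    )
    # -d starts the session detached; -s names it so it is easy to find later.
    return ["tmux", "new-session", "-d", "-s", f"run_{method}", shell_cmd]


print(build_tmux_command("EF21_RHM_NORM"))
```

Passing the returned list to `subprocess.run(..., check=True)` would start the session. Launching several methods this way runs them concurrently, so to queue them one after another you would need to wait for each session to finish before starting the next.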
