AlessandroMorosini/2sharp2sure

Disclaimer: this repository is adapted from edge-of-stochastic-stability

Experiments: How to Run

Scripts

Two shell scripts automate multi‑LR sweeps:

  • run.sh: Local runs (CPU/GPU). Creates or reuses a virtual environment and runs all LRs sequentially. Reads the experiment configuration from base_experiment.yaml by default.
  • run_cloud.sh: Cloud runs with SLURM (GPU). Mirrors the same CLI and LR parsing, but submits the runs as jobs to the SLURM queue.

How to Run:

# Local run with default config (base_experiment.yaml)
./run.sh

# Local run with custom config file
CONFIG_FILE=my_experiment.yaml ./run.sh

# Override specific parameters
OPTIMIZER=gd LOSS=ce DATASET=cifar10 ./run.sh

# Quick test run (uses test_mode from config or TEST=true)
TEST=true ./run.sh
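
Since run_cloud.sh mirrors the same CLI, a SLURM submission presumably accepts the same overrides (illustrative; the SLURM resources themselves are configured inside the script):

# Cloud (SLURM) run with the same overrides (assumes run_cloud.sh accepts them unchanged)
OPTIMIZER=gd LOSS=ce DATASET=cifar10 ./run_cloud.sh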

Supported Environment Overrides:

  • CONFIG_FILE=<path>: Use a different YAML config file (default: base_experiment.yaml)
  • OPTIMIZER={sgd,sgdm,adam,gd,sam}: Override optimizer
  • LOSS={mse,ce,mse_softmax}: Override loss function
  • DATASET={cifar10,cifar100,fmnist,svhn,cifar10_ez}: Override dataset
  • TEST=true: Force test mode (10k steps, lower stop_acc)
  • EOS_DROP_LAG=<steps>: Delay LR change after EoS when LR token ends with - (halve) or + (double)

LR Grids (defined in YAML config, loss‑dependent):

  • MSE: {2/600, 2/450, 2/300, 2/150}
  • CE: {2/200, 2/150, 2/100, 2/50} (or as specified in config)
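
For reference, the fractional tokens correspond to the following step sizes (a quick check with plain float division; not a repository command):

python -c "print([round(2/d, 5) for d in (600, 450, 300, 150)])"
# -> [0.00333, 0.00444, 0.00667, 0.01333]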

Examples:

# CE, GD, CIFAR‑10
OPTIMIZER=gd LOSS=ce DATASET=cifar10 ./run.sh

# MSE‑softmax, GD, CIFAR‑100
OPTIMIZER=gd LOSS=mse_softmax DATASET=cifar100 ./run.sh

# LR changes after EoS: drop (halve) with `-` or increase (double) with `+`
EOS_DROP_LAG=500 ./run.sh   # set LR_ARRAY in script to include 2/70- (halve) or 2/70+ (double)

Direct Training (training.py)

python training.py \
  --dataset cifar10 --model mlp --loss ce \
  --optimizer gd --batch 64 --lr 0.008 --steps 100000 \
  --lambdamax --batch-sharpness --disable-wandb \
  --stop-acc 0.99

Key arguments:

  • Optimizers: --optimizer {sgd,sgdm,adam,gd,sam}
  • Losses: --loss {mse,ce,mse_softmax}
  • Datasets: --dataset {cifar10,cifar100,fmnist,svhn,cifar10_ez}
  • Models: --model {mlp,cnn,resnet}
  • Measurements: --lambdamax, --batch-sharpness, --step-sharpness, --gni, etc.
  • Early stopping: --stop-acc <value>, --stop-loss <value>

See python training.py --help for the full list.
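
For example, a different combination of the arguments listed above (the flag names come from the repository; the specific values here are illustrative):

# Illustrative: ResNet on CIFAR-100 with Adam and MSE-softmax loss, stopping on training loss
python training.py \
  --dataset cifar100 --model resnet --loss mse_softmax \
  --optimizer adam --batch 64 --lr 0.001 --steps 100000 \
  --lambdamax --disable-wandb --stop-loss 0.01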

Results Storage

eoss_results/plaintext/
└── <dataset>_<model>_<optimizer>_<loss>/
    └── <timestamp>_lr<lr>_b<batch>/
        ├── results.txt
        └── metadata.json

Each experiment run automatically saves structured metadata (metadata.json) containing the experiment parameters, timestamps, and a unique experiment_key (a 12-character hash). The experiment manager uses this metadata to query and filter experiments, making it easy to find and plot specific runs without parsing folder names. The experiment_key is shared by all runs with the same core parameters (dataset, model, optimizer, loss, batch_size, num_data).

Plots and run labels preserve the full LR tokens (e.g., 2/70-, 2/70+), so the halved/doubled-LR variants remain distinguishable from each other and from the base LR.
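
Because the metadata is plain JSON on disk, runs can also be located directly from the shell without the experiment manager; for instance (assuming the key string appears verbatim in metadata.json):

# Find the folder(s) of a run by its experiment_key (illustrative)
grep -rl --include=metadata.json 18a7f2821859 eoss_results/plaintext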

Plotting

Setup:

export RESULTS="$(pwd)/eoss_results"

Basic Usage:

# Compare all learning rates for a specific experiment
python visualization/plot_results.py --compare --dataset cifar10 --model mlp --optimizer gd --loss ce

# Plot only today's runs
python visualization/plot_results.py --compare --today --dataset cifar10 --model mlp --optimizer gd --loss ce

# Plot by experiment key (unique identifier)
python visualization/plot_results.py --compare --experiment-key 18a7f2821859

# Disable test metrics (faster)
python visualization/plot_results.py --compare --no-test-metrics --dataset cifar10 --model mlp

List Available Experiments:

# List recent experiments
python scripts/list_experiments.py --recent 5

# Filter by parameters
python scripts/list_experiments.py --dataset cifar10 --model mlp --optimizer gd --loss ce

Plot Layout:

  • Columns 1-3: Training metrics (Accuracy, Loss, Training ECE, Sharpness, Last Layer Norm, Margin)
  • Column 4: Test metrics (Test ECE over steps, Reliability Diagram); enabled by default

Key Options:

  • --today: Only plot experiments from today
  • --date YYYYMMDD: Filter by specific date
  • --experiment-key <key>: Plot specific experiment by unique key
  • --max-steps N: Limit x-axis range
  • --no-test-metrics: Disable test ECE and reliability diagram
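
These options can be combined; for instance (an illustrative invocation using only the flags listed above):

# Today's CIFAR-10 MLP runs, x-axis capped at 50k steps, test metrics disabled
python visualization/plot_results.py --compare --today --max-steps 50000 --no-test-metrics --dataset cifar10 --model mlp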

See python visualization/plot_results.py --help for all options.

About

Project for SLT
