# TIUR Training Tricks — Experiment Set 1 (Quickstart)

This notebook runs a Colab-friendly suite of **training-trick experiments** and logs TIUR-style diagnostics:
- mean loss and loss fluctuations across the replicate ensemble
- time-Fisher proxy (diagonal Gaussian ensemble estimate)
- drift vs churn decomposition (`I_mu` vs `I_sigma`)
- speed-limit efficiency `eta`
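
As a rough illustration of the drift vs churn split, the sketch below assumes the ensemble at each checkpoint is summarized by a scalar Gaussian with mean `mu(t)` and standard deviation `sigma(t)`. For that family, the Fisher information about time separates into a mean term and a spread term. The function name `gaussian_time_fisher`, the finite-difference scheme, and the mapping of the two terms onto `I_mu` and `I_sigma` are assumptions made for illustration; the estimator inside `tiur_tricks` may differ, for example in how it sums over parameter dimensions.

```python
import statistics


def gaussian_time_fisher(samples_t0, samples_t1, dt):
    """Finite-difference Fisher information about time for a scalar Gaussian.

    For N(mu(t), sigma(t)^2) the Fisher information with respect to t is
        I = (dmu/dt)^2 / sigma^2 + 2 * (dsigma/dt)^2 / sigma^2.
    The first term is returned as i_mu (drift), the second as i_sigma (churn).
    Each sample list needs at least two values, and the ensemble spread must
    be non-zero, otherwise the division below fails.
    """
    mu0, mu1 = statistics.fmean(samples_t0), statistics.fmean(samples_t1)
    sd0, sd1 = statistics.stdev(samples_t0), statistics.stdev(samples_t1)

    # Evaluate sigma at the midpoint of the two checkpoints.
    sd_mid = 0.5 * (sd0 + sd1)

    dmu_dt = (mu1 - mu0) / dt
    dsd_dt = (sd1 - sd0) / dt

    i_mu = dmu_dt ** 2 / sd_mid ** 2
    i_sigma = 2.0 * dsd_dt ** 2 / sd_mid ** 2
    return i_mu, i_sigma


# Hypothetical ensemble values at two consecutive checkpoints.
drift_only = gaussian_time_fisher([0.0, 1.0, 2.0], [1.0, 2.0, 3.0], dt=1.0)
churn_only = gaussian_time_fisher([0.0, 1.0, 2.0], [-1.0, 1.0, 3.0], dt=1.0)
print('drift only (I_mu, I_sigma):', drift_only)
print('churn only (I_mu, I_sigma):', churn_only)
```

Under a diagonal Gaussian assumption, the same two terms would be computed per coordinate and summed, since independent coordinates contribute additively to the Fisher information.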

**Important:** Select a GPU runtime via `Runtime → Change runtime type → GPU` before running any cells; the suite configuration below sets `device='cuda'`.

All outputs are written to **Google Drive** so you don't lose results if the Colab runtime resets.
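
After a runtime reset, the earlier results are still on Drive but the `out_dir` variable is gone. A minimal sketch for finding the most recent run again is shown here. It assumes the run folders keep the timestamped `run_YYYYMMDD_HHMMSS` naming used by the mount cell, so that sorting the names alphabetically also sorts them by time. The helper name `latest_run_dir` is hypothetical and not part of `tiur_tricks`.

```python
import os


def latest_run_dir(base_out):
    """Return the newest run_* folder under base_out, or None if there is none."""
    if not os.path.isdir(base_out):
        return None
    # Timestamped names sort chronologically when sorted as strings.
    runs = sorted(
        name
        for name in os.listdir(base_out)
        if name.startswith('run_') and os.path.isdir(os.path.join(base_out, name))
    )
    return os.path.join(base_out, runs[-1]) if runs else None


# Prints None until Drive is mounted and at least one run has been saved.
print(latest_run_dir('/content/drive/MyDrive/tiur_tricks_results'))
```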


In [None]:
# Mount Google Drive (persistent storage)
from google.colab import drive
drive.mount('/content/drive')

import os, time
BASE_OUT = '/content/drive/MyDrive/tiur_tricks_results'
run_name = time.strftime('run_%Y%m%d_%H%M%S')
out_dir = os.path.join(BASE_OUT, run_name)
os.makedirs(out_dir, exist_ok=True)
print('Saving outputs to:', out_dir)


In [None]:
# Install minimal deps
# NOTE: Colab already includes torch + torchvision.
# Avoid reinstalling torchvision here (it can break ABI/CUDA compatibility).
!pip install -q tqdm pandas matplotlib

# Optional (uncomment if you run fast=False and want the expanded optimizer grid)
# !pip install -q pytorch-optimizer transformers


In [None]:
# Import the repo code (robust against stale imports / path collisions)
import os, sys

REPO_DIR = '/content/tiur_tricks_colab'
assert os.path.exists(os.path.join(REPO_DIR, 'tiur_tricks', '__init__.py')), (
    f'Could not find repo at {REPO_DIR}. Did you unzip the repo to /content? ' 
    f'Expected {REPO_DIR}/tiur_tricks/__init__.py'
)

# Ensure our repo takes precedence over any similarly named installed package.
if REPO_DIR not in sys.path:
    sys.path.insert(0, REPO_DIR)

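# If a previous cell imported a different tiur_tricks, clear it and its submodules
# (cached submodules would otherwise survive and shadow the repo versions).
for _name in list(sys.modules):
    if _name == 'tiur_tricks' or _name.startswith('tiur_tricks.'):
        del sys.modules[_name]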

import tiur_tricks
print('tiur_tricks loaded from:', getattr(tiur_tricks, '__file__', None))

from tiur_tricks import RunConfig, make_experiment_suite_set1, run_experiment_suite

data_dir = '/content/data'  # dataset cache (not persisted; small + redownloadable)


In [None]:
# --- Quick suite settings (tweak these first) ---
# For the very first run, keep it cheap. Then increase num_replicates and subset sizes.
base = RunConfig(
    device='cuda',
    dataset='cifar10',
    model='small_cnn',   # 'resnet18' is slower but more realistic
    num_replicates=3,
    subset_train=5000,
    subset_eval=1000,
    epochs=2,
    checkpoint_every=50,
    eval_batches=10,
)

suite = make_experiment_suite_set1(base, fast=True)
print('Number of runs in suite:', len(suite))

logs_df, summary_df = run_experiment_suite(
    suite,
    out_dir=out_dir,
    data_dir=data_dir,
    show_plots=True,
    save_plots=True,
    persist_checkpoints=True,
)
summary_df


In [None]:
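# Confirm what's been written to Google Drive
import glob, os
# The recursive glob also matches directories, so count regular files only.
paths = glob.glob(os.path.join(out_dir, '**', '*'), recursive=True)
n_files = sum(os.path.isfile(p) for p in paths)
print('Total files written:', n_files)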
print('Top-level outputs:')
!ls -lah {out_dir}


## Where to look in Drive

In your Google Drive, under `MyDrive/tiur_tricks_results/<run_name>/`, you should now have:
- `summary.csv` and `all_logs.csv` at the run root
- one folder per run config (e.g. `opt_adamw/`) containing:
  - `config.json`
  - `logs.csv` (complete)
  - `logs_live.csv` (written incrementally during training)
  - `*_loss.png`, `*_fisher.png`, `*_efficiency.png`, `*_bound.png`
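
To inspect a single run from the files listed above, a small reader along these lines may help. It prefers the complete `logs.csv` and falls back to `logs_live.csv` when a run was interrupted before the final file was written. The helper name `load_run_logs` is hypothetical, and no column names are assumed, since the exact CSV schema is defined by `tiur_tricks`.

```python
import csv
import os


def load_run_logs(run_dir):
    """Read a run folder's log rows, preferring the complete logs.csv.

    Returns (file_name, rows), where rows is a list of dicts keyed by the
    CSV header. All values are returned as strings.
    """
    for name in ('logs.csv', 'logs_live.csv'):
        path = os.path.join(run_dir, name)
        if os.path.isfile(path):
            with open(path, newline='') as f:
                return name, list(csv.DictReader(f))
    raise FileNotFoundError(f'No logs.csv or logs_live.csv in {run_dir}')
```

For example, `source, rows = load_run_logs(os.path.join(out_dir, 'opt_adamw'))` followed by `print(source, len(rows), list(rows[0]) if rows else [])` shows which file was read, how many rows it holds, and the column names it actually contains.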
