# TG-SMN Runner

This notebook is the **top-level interface** for running TG-SMN experiments.

It uses the `tg_smn` Python package in this repo to:
- build environments (WT2 permuted-vocab; multi-domain continual LM)
- run baselines and TG-SMN variants
- run expert-count + seed sweeps
- load + visualize results


In [None]:
# Install the repo + runtime dependencies (Colab-friendly).
# `..` must resolve to the repo root (the directory containing setup.py or pyproject.toml).
!pip -q install -e ..
!pip -q install datasets tqdm pandas matplotlib


ERROR: file:/// does not appear to be a Python project: neither 'setup.py' nor 'pyproject.toml' found.

In [None]:
import os
import pandas as pd
from tg_smn.config import (
    DataCfg, TrainCfgLM, ModelCfgLM, FixedCtrlCfg, LearnedCtrlCfgLM,
    WT2EnvCfg, MultiDomainEnvCfg,
)
from tg_smn.sweep import run_grid, LearnedAblation
from tg_smn.analysis import load_grid_results, plot_scaling

print('OK: imported tg_smn')


OK: imported tg_smn


## Output directory

In Colab, you probably want this under Google Drive so that results survive a runtime reset:

```python
from google.colab import drive
drive.mount('/content/drive')
OUT_ROOT = '/content/drive/MyDrive/tg_smn_runs'
```

If you don't want Drive, just use a local folder.
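One way to make that choice automatic is sketched below. The helper `default_out_root` is hypothetical (it is not part of `tg_smn`), and it assumes Drive, if used, has already been mounted at `/content/drive` as in the snippet above.

```python
import os

def default_out_root(drive_mount: str = '/content/drive') -> str:
    """Return a Drive-backed folder if a mounted Drive is visible, else a local one."""
    drive_dir = os.path.join(drive_mount, 'MyDrive')
    if os.path.isdir(drive_dir):
        # Drive is mounted: keep runs where they persist across sessions.
        return os.path.join(drive_dir, 'tg_smn_runs')
    # No Drive: fall back to a folder under the home directory.
    return os.path.expanduser('~/tg_smn_runs')

OUT_ROOT = default_out_root()
print('OUT_ROOT =', OUT_ROOT)
```

The function only picks a path; creating the directory is still left to `os.makedirs(OUT_ROOT, exist_ok=True)`.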


In [None]:
# Change this to your preferred location
OUT_ROOT = os.path.expanduser('~/tg_smn_runs')
os.makedirs(OUT_ROOT, exist_ok=True)
print('OUT_ROOT =', OUT_ROOT)


OUT_ROOT = /root/tg_smn_runs


## Choose environments

### 1) WT2 permuted-vocab
An adversarial continual-learning environment: WikiText-2 (WT2) with its vocabulary permuted across tasks.
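To make "permuted-vocab" concrete, here is a minimal sketch. It assumes each task remaps token IDs through its own random permutation, which is one plausible reading of `perm_mode='unique'`. The function names are illustrative and not taken from `tg_smn`, whose implementation may differ.

```python
import random
from typing import List

def make_task_permutations(vocab_size: int, n_tasks: int, seed: int = 0) -> List[List[int]]:
    """Draw one independent permutation of token IDs per task (assumed behaviour)."""
    rng = random.Random(seed)
    perms = []
    for _ in range(n_tasks):
        perm = list(range(vocab_size))
        rng.shuffle(perm)  # perm[old_id] == new_id for this task
        perms.append(perm)
    return perms

def permute_tokens(token_ids: List[int], perm: List[int]) -> List[int]:
    """Remap a token sequence through a task's permutation."""
    return [perm[t] for t in token_ids]

perms = make_task_permutations(vocab_size=8, n_tasks=3)
sentence = [0, 3, 3, 7]
for task_id, perm in enumerate(perms):
    print(task_id, permute_tokens(sentence, perm))
```

The sequence statistics are unchanged within a task, but the same surface token maps to a different ID in each task, so embeddings learned on one task are misleading on the next.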

### 2) Multi-domain continual LM
A harder, more realistic environment: WT2 / PTB / AGNews / IMDb with domain shifts (and optional domain mixing per task).
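The round-robin schedule with optional mixing can be pictured with the sketch below. It is an illustration under assumptions, not the package's code: the domain labels are made up for readability, and mixing here simply takes the next domains in cyclic order, whereas `tg_smn` exposes a `mix_seed` and so presumably samples the mix.

```python
from typing import List, Sequence

DOMAINS = ("wt2", "ptb", "agnews", "imdb")  # illustrative labels

def round_robin_schedule(
    n_tasks: int,
    domains: Sequence[str] = DOMAINS,
    mix_n: int = 1,
) -> List[List[str]]:
    """Assign each task `mix_n` domains, cycling through `domains` in order."""
    if not 1 <= mix_n <= len(domains):
        raise ValueError("mix_n must be between 1 and the number of domains")
    return [
        [domains[(t + j) % len(domains)] for j in range(mix_n)]
        for t in range(n_tasks)
    ]

print(round_robin_schedule(n_tasks=6))
# [['wt2'], ['ptb'], ['agnews'], ['imdb'], ['wt2'], ['ptb']]
print(round_robin_schedule(n_tasks=4, mix_n=2))
# [['wt2', 'ptb'], ['ptb', 'agnews'], ['agnews', 'imdb'], ['imdb', 'wt2']]
```

With 40 tasks and four domains, each domain is revisited ten times, which is what makes forgetting measurable.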


In [None]:
envs = [
    WT2EnvCfg(name='wt2_perm_unique_10', n_tasks=10, permuted_vocab=True, perm_mode='unique'),
    MultiDomainEnvCfg(name='md_rr_40', n_tasks=40, schedule_mode='round_robin', mix_n_domains_per_task=1),
    # Harder: mixed-domain tasks
    MultiDomainEnvCfg(name='md_rr_mix2_40', n_tasks=40, schedule_mode='round_robin', mix_n_domains_per_task=2, mix_seed=0),
]

envs


[WT2EnvCfg(env_type='wt2', name='wt2_perm_unique_10', n_tasks=10, permuted_vocab=True, perm_mode='unique', repeat_k=4, drift_swaps=150, max_docs_total=None, val_frac_per_task=0.1),
 MultiDomainEnvCfg(env_type='multidomain', name='md_rr_40', n_tasks=40, schedule_mode='round_robin', block_size=10, schedule=None, mix_n_domains_per_task=1, mix_seed=0, train_docs_per_task=800, val_docs_per_task=200, test_docs_per_task=200, min_freq=2, max_vocab_size=60000, max_docs_per_domain=None),
 MultiDomainEnvCfg(env_type='multidomain', name='md_rr_mix2_40', n_tasks=40, schedule_mode='round_robin', block_size=10, schedule=None, mix_n_domains_per_task=2, mix_seed=0, train_docs_per_task=800, val_docs_per_task=200, test_docs_per_task=200, min_freq=2, max_vocab_size=60000, max_docs_per_domain=None)]

## Sweep configuration

Start small for a smoke test, then scale up experts + seeds.
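A quick way to see how the sweep grows is to count configurations up front. The sketch below assumes every variant runs at every expert count and that each learned ablation adds one extra run per (environment, expert count, seed). `run_grid` may prune or expand the grid differently, so treat the number as an estimate. `count_runs` is a hypothetical helper; the values mirror the cells in this notebook.

```python
from itertools import product

def count_runs(env_names, experts, seeds, variants, ablation_names):
    """Count grid points, treating each ablation as an extra arm (assumption)."""
    arms = list(variants) + [f"tg_smn_learned/{a}" for a in ablation_names]
    return sum(1 for _ in product(env_names, experts, seeds, arms))

n = count_runs(
    env_names=["wt2_perm_unique_10", "md_rr_40", "md_rr_mix2_40"],
    experts=[256, 512],
    seeds=[0, 1],
    variants=["dense_baseline", "sparse_fixed", "tg_smn_learned"],
    ablation_names=["fix_k2", "fix_replay0.1", "drop_obs_kl"],
)
print(n)  # 72
```

Doubling either the expert list or the seed list doubles the count, so it is worth checking the smoke test's wall-clock time before scaling up.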


In [None]:
# Shared configs
data_cfg  = DataCfg(seq_len=64, batch_size=32, num_workers=2)
train_cfg = TrainCfgLM(epochs_per_task=1, lr=3e-4, fisher_every=100, delta_rho_samples=3, log_every=20, max_steps_per_task=75)
model_cfg = ModelCfgLM(d_model=192, n_heads=4, n_layers=4, dropout=0.1, n_experts=256, rank=16, max_k=2, group_size=32)
fixed_ctrl_cfg = FixedCtrlCfg(k=2, replay_ratio=0.10, router_noise=0.30, router_temp=1.0)
learned_ctrl_cfg = LearnedCtrlCfgLM(k_min=1, k_max=2, replay_max=0.5, noise_max=0.5, temp_min=0.7, temp_max=1.3)

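# Sweep axes: expert counts and random seeds (small values for the smoke test)
experts = [256, 512]
seeds   = [0, 1]

# Ablations of the learned controller
ablations = [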
    LearnedAblation(name='none'),
    LearnedAblation(name='fix_k2', fixed_k=2),
    LearnedAblation(name='fix_replay0.1', fixed_replay=0.10),
    LearnedAblation(name='drop_obs_kl', drop_obs_kl=True),
]


In [None]:
df = run_grid(
    env_cfgs=envs,
    experts_list=experts,
    seeds=seeds,
    out_root=OUT_ROOT,
    variants=('dense_baseline','sparse_fixed','tg_smn_learned'),
    data_cfg=data_cfg,
    model_cfg=model_cfg,
    train_cfg=train_cfg,
    fixed_ctrl_cfg=fixed_ctrl_cfg,
    learned_ctrl_cfg=learned_ctrl_cfg,
    learned_ablations=ablations[1:],  # every entry except 'none'
    skip_existing=True,
)
df.sort_values(['env','variant','ablation','n_experts','seed']).head(20)


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.




## Visualize scaling

These plots use `grid_results.csv` under `OUT_ROOT`.
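Because the sweep covers several seeds, it can help to average a metric over seeds before reading the plots. The sketch below is not part of `tg_smn`. It assumes each result row has the columns `env`, `variant`, `ablation`, `n_experts`, and `seed` (the sort keys used after `run_grid`) plus a numeric metric column such as `final_test_ppl`.

```python
from collections import defaultdict
from statistics import mean, stdev

GROUP_KEYS = ("env", "variant", "ablation", "n_experts")

def summarize_over_seeds(rows, metric="final_test_ppl"):
    """Group result rows by configuration and report mean/std of `metric` across seeds."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[k] for k in GROUP_KEYS)
        groups[key].append(float(row[metric]))

    summary = []
    # Sort on stringified keys so mixed types (str / int) order deterministically.
    for key, values in sorted(groups.items(), key=lambda kv: tuple(map(str, kv[0]))):
        summary.append({
            **dict(zip(GROUP_KEYS, key)),
            "n_seeds": len(values),
            "mean": mean(values),
            # Sample std needs at least two seeds; report 0.0 otherwise.
            "std": stdev(values) if len(values) > 1 else 0.0,
        })
    return summary
```

Rows can come from `csv.DictReader` over `grid_results.csv` or from `df2.to_dict('records')` once the results are loaded.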


In [None]:
df2 = load_grid_results(OUT_ROOT)
print('rows:', len(df2))
df2.head()


In [None]:
# Pick an environment name from df2['env'].unique()
env_name = df2['env'].unique()[0]
print('env_name =', env_name)

plot_scaling(df2, env=env_name, metric='final_test_ppl', ablation='none')
plot_scaling(df2, env=env_name, metric='avg_forgetting_ppl', ablation='none')
