This folder holds the code for the CS533 course project.

It contains:
- scripts/train_midtrain.py: multi-round mid-training with anchor replay and router distillation options.
- scripts/eval_run_tree.py + scripts/run_lmeval_suite.py: run lm-eval-harness evaluations over a run tree.
- scripts/routing/*: routing drift and specialization diagnostics.
- configs/accelerate/*, configs/deepspeed_stage3_bf16.json: launcher configs.
- env/conda*.yaml: environment specs used during development.

Not included:
- Data, checkpoints, runs/, caches, or any expert-expansion code.

Entrypoints:
- Training entrypoint: python scripts/train_midtrain.py --help
- Evaluation entrypoint: python scripts/eval_run_tree.py --help
- Diagnostics: python scripts/routing/compare_routing.py --help
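
To check all three entrypoints in one pass, a small helper can build the --help command for each script that is present. This is a minimal sketch, assuming it is run from the repository root; ENTRYPOINTS and help_commands are hypothetical names for illustration and are not part of the project.

```python
import subprocess
import sys
from pathlib import Path

# Entrypoint paths as listed in this README, relative to the repository root.
# The dictionary keys are illustrative labels, not names used by the project.
ENTRYPOINTS = {
    "training": "scripts/train_midtrain.py",
    "evaluation": "scripts/eval_run_tree.py",
    "diagnostics": "scripts/routing/compare_routing.py",
}


def help_commands(repo_root="."):
    """Return {label: argv list} for each entrypoint found under repo_root."""
    root = Path(repo_root)
    return {
        label: [sys.executable, str(root / rel), "--help"]
        for label, rel in ENTRYPOINTS.items()
        if (root / rel).is_file()
    }


if __name__ == "__main__":
    # Print each script's help text; missing scripts are skipped silently.
    for label, cmd in help_commands().items():
        print(f"== {label} ==")
        subprocess.run(cmd, check=False)
```

Using sys.executable keeps the calls inside whichever Python environment is active, for example one created from the env/conda*.yaml specs.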