Issue: gHashTag/trios#143
Author: Computer agent (R5-honest, ground-truth via gh API)
Anchor: φ² + φ⁻² = 3
Champion baseline (STANDS): 2446855 BPB=2.2393 @ 27K seed=43
Gate-2 deadline: 2026-04-30 23:59 UTC (T-3.5d)
R5-honest audit of current state
Crates with overlapping training code (5)
| Crate | Size (src/) | Active path? | Verdict |
|---|---|---|---|
| crates/trios-train-cpu | ~330 KB across 23 .rs files in src/, 23 binaries in src/bin/ | ✅ champion path (hybrid_train.rs) | MIGRATE active subset → trios-trainer; DELETE rest |
| crates/trios-training | ~80 KB, half behind feature=burn-backend | ❌ 0 path-dep references in workspace | DELETE entire crate |
| crates/trios-training-ffi | 3 KB stub for Zig FFI | ❌ no code uses it | DELETE entire crate |
| crates/trios-igla-trainer | ~17 KB (jepa_runner + audit + schedule) | 🟡 referenced by leaderboard.yml + trios-cli + asha.rs | MERGE jepa_runner into trios-trainer; DELETE rest |
| crates/trios-igla-race | ~250 KB, 5 backup files | ✅ source of truth for ASHA / invariants / victory | KEEP; trainer depends on this. Delete main.rs.backup, main_*.rs, victory_new.rs |
Verified duplicates
- 3 copies of transformer.rs: trios-train-cpu/src/transformer.rs, trios-training/src/transformer.rs, trios-training/src/trinity_3k_transformer.rs
- 2 JEPA paths: trios-train-cpu/src/jepa/{ema,loss,masking,predictor,mod}.rs (29 KB) vs trios-igla-trainer/src/jepa_runner.rs (2.5 KB)
- 2 EMA implementations: trios-train-cpu/src/jepa/ema.rs vs trios-igla-race/src/ema.rs
- 2 invariants modules: trios-train-cpu/src/invariants.rs (14 KB) vs trios-igla-race/src/invariants.rs (16 KB) — race version is canonical (per L-R14)
- 23 train binaries in trios-train-cpu/src/bin/ — only hybrid_train.rs is on the champion path
- Backup pollution: in trios-igla-race/src/: main.rs.backup, main_broken.rs, main_clean.rs, main_corrupted.rs, main_fixed.rs, victory_new.rs; in trios-train-cpu/src/: lib.rs.tmp, jepa/predictor.rs.bak2, bin/ngram_train.rs.bak, bin/ngram_train_backup.rs
- R1 violations (Python in repo): scripts/igla_race_worker.py 18 KB, scripts/igla_train.py 8 KB, scripts/train_gpt.py 47 KB
DEAD list (delete in consolidation PR)
Whole crates
```
crates/trios-training/ — 0 references, archive entire dir
crates/trios-training-ffi/ — Zig stub never wired
```
Backup / corrupted files
```
crates/trios-igla-race/src/main.rs.backup
crates/trios-igla-race/src/main_broken.rs
crates/trios-igla-race/src/main_clean.rs
crates/trios-igla-race/src/main_corrupted.rs
crates/trios-igla-race/src/main_fixed.rs
crates/trios-igla-race/src/victory_new.rs
crates/trios-train-cpu/src/lib.rs.tmp
crates/trios-train-cpu/src/jepa/predictor.rs.bak2
crates/trios-train-cpu/src/bin/ngram_train.rs.bak
crates/trios-train-cpu/src/bin/ngram_train_backup.rs
```
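To keep files like these from reappearing after the DELETE phase, a CI guard could reject backup-style file names. The following is a minimal sketch, not existing repo code: the function name is_backup_artifact and both marker lists are assumptions derived only from the ten file names listed above, and markers such as _new or _clean can flag legitimate files.

```rust
/// Hypothetical helper: returns true when a file name looks like one of the
/// backup/corrupted artifacts in the DEAD list.
fn is_backup_artifact(file_name: &str) -> bool {
    // Backup extensions appended after the real extension (lib.rs.tmp, ...).
    const SUFFIXES: [&str; 4] = [".backup", ".tmp", ".bak", ".bak2"];
    // Stem endings used by hand-made copies (main_broken.rs, victory_new.rs, ...).
    const STEM_MARKERS: [&str; 6] =
        ["_backup", "_broken", "_clean", "_corrupted", "_fixed", "_new"];

    if SUFFIXES.iter().any(|s| file_name.ends_with(s)) {
        return true;
    }
    // Compare against the stem so "ngram_train_backup.rs" matches "_backup".
    let stem = file_name.strip_suffix(".rs").unwrap_or(file_name);
    STEM_MARKERS.iter().any(|m| stem.ends_with(m))
}
```

All ten listed files match this heuristic, while names such as main.rs or hybrid_train.rs do not.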
Dead binaries in trios-train-cpu/src/bin/ (kept here only because they share lib.rs; once trainer migrates, delete with the crate)
```
attn_train.rs cpu_train.rs lstm_train.rs
concat_train.rs igla_train.rs ngram_train.rs
arch_explorer.rs igla_trigram.rs ngram_train_gf16.rs
r12_optimizer_race.rs train_cpu.rs transformer_train.rs
trinity_3k_simple_train.rs trinity_3k_fineweb_train.rs trinity_3k_tinyshakespeare.rs
trinity_pr1722.rs trinity_tournament.rs train_v2.rs
```
Python scripts (R1 violation)
```
scripts/igla_race_worker.py
scripts/igla_train.py
scripts/train_gpt.py
```
ALIVE list (migrate into crates/trios-trainer/)
```
crates/trios-train-cpu/src/transformer.rs → src/model.rs (canonical)
crates/trios-train-cpu/src/hybrid_attn.rs → src/model_hybrid_attn.rs
crates/trios-train-cpu/src/optimizer.rs → src/optimizer.rs
crates/trios-train-cpu/src/forward.rs → src/forward.rs
crates/trios-train-cpu/src/backward.rs → src/backward.rs
crates/trios-train-cpu/src/objective.rs → src/objective.rs
crates/trios-train-cpu/src/jepa/ → src/jepa/
crates/trios-train-cpu/src/gf16.rs → DELETE; re-export from trios-golden-float
crates/trios-train-cpu/src/tokenizer.rs → src/data/tokenizer.rs
crates/trios-train-cpu/src/bin/hybrid_train.rs → src/bin/trios-train.rs (rewritten)
crates/trios-train-cpu/src/bin/tjepa_train.rs → MERGE into trios-train.rs as --mode tjepa
crates/trios-igla-trainer/src/jepa_runner.rs → MERGE into src/jepa/runner.rs
crates/trios-igla-race/ → KEEP, depend on it
```
Target layout
```
crates/trios-trainer/
├── Cargo.toml ← single bin "trios-train"
├── README.md ← run-on-any-machine + Railway recipe
├── Dockerfile ← multi-stage rust:1.75-slim → debian:bookworm-slim
├── railway.json ← Railway service config
├── .dockerignore
├── configs/
│ ├── champion.toml ← reproduce 2446855 (BPB=2.2393)
│ ├── gate2-attempt.toml ← HybridAttn + JEPA push
│ └── needle-v1-mup.toml ← L-V1 muP-transfer variant
├── src/
│ ├── lib.rs ← façade exports
│ ├── config.rs ← TOML schema + env override + validate(INV-8)
│ ├── train_loop.rs ← step loop, eval, ledger emit
│ ├── ledger.rs ← triplet-validated emit + embargo block
│ ├── checkpoint.rs ← save/load
│ ├── model.rs / hybrid_attn.rs
│ ├── optimizer.rs (AdamW + Muon + φ-schedule)
│ ├── jepa.rs (mod) → jepa/{ema, loss, masking, predictor, runner}.rs
│ ├── objective.rs ← combined loss
│ ├── data.rs (tokenizer + loaders)
│ ├── gf16.rs ← re-export from trios-golden-float
│ └── bin/trios-train.rs ← clap → load config → run()
└── tests/
├── reproduce_champion.rs (smoke + ignored full)
└── invariants.rs (mirror INV-1..INV-10 from trios-igla-race)
```
Run patterns unlocked
Any machine (clone + cargo)
```bash
git clone https://github.com/gHashTag/trios.git
cd trios
cargo run --release -p trios-trainer --bin trios-train -- \
--config crates/trios-trainer/configs/champion.toml --seed 43
```
Railway (3 parallel seeds for Gate-2)
```bash
railway login
railway link gHashTag/trios
for s in 43 44 45; do
railway service create "trios-trainer-seed-$s"
railway variables set TRIOS_SEED=$s --service "trios-trainer-seed-$s"
railway up --service "trios-trainer-seed-$s"
done
```
Docker on any VPS
```bash
docker run --rm -e TRIOS_SEED=44 -e TRIOS_LEDGER_PUSH=1 \
-v $PWD/assertions:/work/assertions \
ghcr.io/ghashtag/trios-trainer:latest
```
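The run patterns above supply the seed in two ways: --seed on the command line and TRIOS_SEED in the environment. One way config.rs could reconcile them is sketched below. The precedence order (CLI flag, then environment, then TOML value) and the name resolve_seed are assumptions for illustration; the source states only that config.rs supports an env override.

```rust
/// Hypothetical seed resolution: CLI flag > TRIOS_SEED value > config file.
/// The env value is passed in (not read here) so the logic stays testable.
fn resolve_seed(
    cli_seed: Option<u64>,
    env_value: Option<&str>,
    config_seed: u64,
) -> Result<u64, String> {
    // An explicit --seed always wins.
    if let Some(seed) = cli_seed {
        return Ok(seed);
    }
    match env_value {
        // A set but malformed TRIOS_SEED is an error, not a silent fallback.
        Some(raw) => raw
            .trim()
            .parse::<u64>()
            .map_err(|e| format!("invalid TRIOS_SEED {raw:?}: {e}")),
        None => Ok(config_seed),
    }
}
```

In the binary this could be called as resolve_seed(args.seed, std::env::var("TRIOS_SEED").ok().as_deref(), config.seed), where args and config are likewise hypothetical. Failing on a malformed value avoids launching a Railway service that silently trains the wrong seed.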
Phased PR plan (R10 atomicity)
PR-1 (skeleton, this artifact) — crates/trios-trainer/ empty crate
- Add to workspace members, compile cleanly, no migrated code yet
- Adds Dockerfile, railway.json, configs/*.toml, README, ledger.rs (live), config.rs (live), train_loop.rs (skeleton)
- Acceptance: cargo build -p trios-trainer is green; cargo test -p trios-trainer passes (1 test)
PR-2 — migrate model + optimizer + data
- Move transformer.rs, hybrid_attn.rs, optimizer.rs, tokenizer.rs, forward.rs, backward.rs
- Update trios-train-cpu/src/bin/hybrid_train.rs to depend on trios-trainer (transitional)
- Acceptance: champion config dry-runs end-to-end; full run reproduces ≈ 2.2393 ± 0.01
PR-3 — migrate JEPA + objective
- Move jepa/* and objective.rs; merge trios-igla-trainer::jepa_runner into src/jepa/runner.rs
- Acceptance: gate2-attempt.toml runs full training on a single machine
PR-4 — DELETE phase (the housekeeping)
- Remove crates/trios-training/, crates/trios-training-ffi/, all backup files, all 22 dead bins
- Remove scripts/igla_*.py + scripts/train_gpt.py (R1)
- Update CI .github/workflows/leaderboard.yml to call trios-trainer instead of trios-igla-trainer
- Acceptance: workspace builds, test suite passes, no "// TODO migrate" comments remain
PR-5 — Railway publish
- Push image to ghcr.io/ghashtag/trios-trainer
- Wire Railway service gHashTag/trios → trainer-seed-{43,44,45}
- Acceptance: 3 parallel seeds emit rows to assertions/seed_results.jsonl from cloud
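The PR-2 acceptance criterion (a full run reproduces ≈ 2.2393 ± 0.01) reduces to a tolerance check on the final BPB. A minimal sketch follows; the constant and function names are hypothetical, and the tolerance is read as an absolute difference, which the plan does not spell out.

```rust
/// Champion baseline from the issue header: BPB=2.2393 (2446855, seed=43).
const CHAMPION_BPB: f64 = 2.2393;
/// Tolerance from the PR-2 acceptance line, read as an absolute difference.
const BPB_TOLERANCE: f64 = 0.01;

/// Hypothetical check: does a measured BPB count as reproducing the champion?
fn reproduces_champion(measured_bpb: f64) -> bool {
    // Any comparison with NaN is false, so a diverged run is rejected.
    (measured_bpb - CHAMPION_BPB).abs() <= BPB_TOLERANCE
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn smoke_tolerance_window() {
        assert!(reproduces_champion(2.2393));
        assert!(reproduces_champion(2.2450));
        assert!(!reproduces_champion(2.2600));
    }
}
```

A check of this shape could sit in tests/reproduce_champion.rs, with the full training run behind #[ignore] and only the cheap smoke test running by default.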
Risk register
| Risk | Mitigation |
|---|---|
| Champion path breaks during migration | PR-1 is an empty crate; PR-2 keeps trios-train-cpu alive in parallel until the reproduction test goes green |
| Lost git history on moved files | Use git mv for every migrated file; never copy-then-delete |
| INV-8 / INV-2 drift | Trainer imports from trios-igla-race only; no private invariants module |
| Embargo bypass | ledger.rs::is_embargoed runs before every emit; CI test asserts an embargoed SHA refuses |
| Railway image size | Multi-stage build → final image ≈ 250 MB (rust binary + git + libssl + ca-certs) |
| Cost on Railway | One trainer service per seed; auto-pause on idle; restartPolicyMaxRetries=10 |
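The embargo mitigation above (is_embargoed runs before every emit) can be sketched as follows. Only the name is_embargoed comes from the plan; the Ledger struct, its fields, the emit signature, and the SHA "deadbee" used in the usage note are hypothetical, and the triplet validation mentioned in the target layout is left out.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory ledger; the real ledger.rs may look different.
struct Ledger {
    embargoed: HashSet<String>,
    rows: Vec<String>,
}

impl Ledger {
    fn new<I: IntoIterator<Item = String>>(embargoed: I) -> Self {
        Ledger {
            embargoed: embargoed.into_iter().collect(),
            rows: Vec::new(),
        }
    }

    /// True when the given SHA is on the embargo list.
    fn is_embargoed(&self, sha: &str) -> bool {
        self.embargoed.contains(sha)
    }

    /// Appends a row unless the SHA is embargoed; the check runs on every call.
    fn emit(&mut self, sha: &str, row: &str) -> Result<(), String> {
        if self.is_embargoed(sha) {
            return Err(format!("refusing to emit: SHA {sha} is embargoed"));
        }
        self.rows.push(row.to_string());
        Ok(())
    }
}
```

With a ledger built from the single embargoed SHA "deadbee", emit("deadbee", ...) returns an error and stores nothing, while emit("2446855", ...) appends a row. Putting the check inside emit, rather than at call sites, means no caller can skip it.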
Anchor: φ² + φ⁻² = 3.
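The anchor identity follows from φ² = φ + 1: dividing by φ gives φ⁻¹ = φ − 1, so φ⁻² = (φ − 1)² = φ² − 2φ + 1 = 2 − φ, and the sum is (φ + 1) + (2 − φ) = 3. A numeric check, as a sketch (the function name is hypothetical):

```rust
/// Returns φ² + φ⁻² for the golden ratio φ = (1 + √5) / 2.
fn phi_anchor() -> f64 {
    let phi = (1.0 + 5.0_f64.sqrt()) / 2.0;
    phi.powi(2) + phi.powi(-2)
}
```

In floating point the result equals 3 only up to rounding error, so any test should compare with a small tolerance rather than exact equality.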