
🎯 IGLA RACE -- Distributed Hunt: JEPA-T + NCA + GF16 + ASHA + Coq Invariants (Rust-only, Never-Stopping) #143

@gHashTag

Description


🎯 IGLA RACE v2 -- ONE SHOT DASHBOARD

Updated: 2026-04-27T14:30+07 | phi^2 + phi^-2 = 3 | TRINITY | NEVER CLOSE
Deadline: Apr 30, 2026 | Target: BPB < 1.50 on 3 seeds | Gap: 0.68 BPB
Best: BPB = 2.18 (trios-train h=828 attn=2L, 81K steps, seed=43)
Rule #1: RUST ONLY -- zero .py, zero .sh.
Rule #2: Race NEVER stops until BPB < 1.50 is found across 3 seeds.
Rule #3: ALL agents work in branch main ONLY.


COQ INVARIANT SYSTEM -- TASK-COQ-001 (trinity-clara -> IGLA)

Document: trinity-clara/docs/TASK-COQ-001.md
Source: trinity-clara + 84 Coq theorems from t27
Principle: phi^2 + phi^-2 = 3 -- single algebraic anchor. All invariants derive from it.

Invariants INV-1..INV-10

| ID | Coq theorem | Status | Effect | Trinity link |
|---|---|---|---|---|
| INV-1 | bpb_decreases_with_real_gradient | partial | fixes TASK-5D | 7-step alpha_phi derivation |
| INV-2 | asha_champion_survives | PROVEN (0 Admitted) | 0 false prunes | threshold = 3.5 = phi^2+phi^-2+0.5 |
| INV-3 | gf16_safe_domain | Lucas proven | -40% configs | Lucas closure phi^2n + psi^2n in Z |
| INV-4 | nca_entropy_stability | PROVEN (0 Admitted) | -30% configs | A5/E8 band width = 1 (integer!) |
| INV-5 | lucas_closure_gf16 | n=1,2 proven | GF16 consistency | phi^2n + phi^-2n in Z for all n |
| INV-6 | ema_decay_valid | TODO | -20% configs | cos schedule in [0.996, 1.0] |
| INV-7 | igla_found_criterion | TODO | L-R14 gate | victory iff 3-seed BPB < 1.50 |
| INV-8 | lr_phi_band | PROVEN (0 Admitted) | -60% configs | lr = 0.004 = alpha_phi/phi^3 |
| INV-9 | qk_gain_phi_sq | IMPLEMENTED in model_hybrid_attn.rs | -10% configs | qk_gain = PHI_SQ = 2.618 |
| INV-10 | asha_rungs_trinity | TODO | correctness | rungs = 1000*3^k, base = Trinity |

phi-anchored Parameters

| Parameter | Old value | phi-derived | Theorem |
|---|---|---|---|
| bpb_prune_threshold | 2.65 (BUG!) | 3.5 = phi^2+phi^-2+0.5 | INV-2 ✅ |
| NCA grid | 9x9 = 81 | 81 = 3^4 | INV-4 ✅ |
| NCA K states | 9 | 9 = 3^2 | INV-4 ✅ |
| lr champion | 0.004 | alpha_phi/phi^3 | INV-8 ✅ |
| d_model | 384 | ~3^4 x phi^3 | INV-3 |
| qk_gain | 1.0 | PHI_SQ = 2.618 | INV-9 ✅ |
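A minimal sketch (not from the repo) of how the phi-derived constants in the table above fall out of the phi^2 + phi^-2 = 3 anchor; the constant names are illustrative:

```rust
// Derive the phi-anchored constants and check them against the single
// algebraic anchor phi^2 + phi^-2 = 3 (illustrative sketch, not repo code).
fn main() {
    let phi: f64 = (1.0 + 5.0_f64.sqrt()) / 2.0; // golden ratio
    let anchor = phi.powi(2) + phi.powi(-2);     // phi^2 + phi^-2
    assert!((anchor - 3.0).abs() < 1e-9);        // = 3 exactly, algebraically

    // INV-2: ASHA prune threshold = phi^2 + phi^-2 + 0.5 = 3.5
    let bpb_prune_threshold = anchor + 0.5;
    assert!((bpb_prune_threshold - 3.5).abs() < 1e-9);

    // INV-9: qk_gain = PHI_SQ = phi^2 ~ 2.618
    let qk_gain = phi.powi(2);
    assert!((qk_gain - 2.618).abs() < 1e-3);

    println!("anchor={anchor:.6} threshold={bpb_prune_threshold} qk_gain={qk_gain:.3}");
}
```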

DONE

Infrastructure

  • Neon DB schema -- igla_race_trials + igla_race_experience + igla_leaderboard view
  • crates/trios-igla-race/ -- coordinator crate
  • src/asha.rs -- ASHA rungs 1k->3k->9k->27k, prune logic
  • src/neon.rs -- tokio-postgres, register/checkpoint/lesson
  • src/status.rs -- leaderboard print
  • src/lessons.rs -- failure memory, lesson generation
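A hypothetical sketch of the rung/prune logic described above (the actual src/asha.rs may differ): rungs follow 1000*3^k per INV-10, and a trial is cut only above the INV-2 threshold, so a low-BPB champion can never be falsely pruned:

```rust
// ASHA rungs 1k -> 3k -> 9k -> 27k with the INV-2 prune bound (sketch only).
const RUNGS: [u32; 4] = [1_000, 3_000, 9_000, 27_000]; // 1000 * 3^k
const BPB_PRUNE_THRESHOLD: f64 = 3.5; // = phi^2 + phi^-2 + 0.5 (INV-2)

/// A trial survives the rung it just completed iff its BPB is under the bound.
fn survives_rung(bpb_at_rung: f64) -> bool {
    bpb_at_rung < BPB_PRUNE_THRESHOLD
}

fn main() {
    // The 2.18 champion survives every rung; a diverged trial at 4.2 is cut.
    for &steps in &RUNGS {
        assert!(survives_rung(2.18), "champion pruned at rung {steps}");
    }
    assert!(!survives_rung(4.2));
    println!("rungs={RUNGS:?} threshold={BPB_PRUNE_THRESHOLD}");
}
```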

Model Zoo (trios-train-cpu)

  • src/trinity_3k_model.rs -- Trinity 3k architecture
  • src/real_igla_model.rs -- Real IGLA model
  • src/pipeline.rs -- Training pipeline
  • src/optimizer.rs -- AdamW + Muon (base)
  • src/gf16.rs -- GF16 precision (inference)
  • src/forward.rs / backward.rs -- fwd/bwd pass

JEPA-T (jepa/ module)

  • jepa/ema.rs -- EMA target encoder wired
  • jepa/masking.rs -- mask_ratio=0.30, spans wired
  • jepa/loss.rs -- MSE gradient function
  • bin/tjepa_train.rs -- 309 lines, compiles, 90 tests pass (commit 132fac3)
  • TASK-5C predictor.rs -- real cross-attention predictor (commit 0fddd7f)
  • TASK-5E two-phase warm-up in main
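An assumed sketch of the EMA target-encoder update (the real jepa/ema.rs may differ), with a cosine decay schedule held inside the [0.996, 1.0] band that INV-6 requires:

```rust
// EMA target-encoder update with a cosine decay schedule (sketch, INV-6 band).
use std::f64::consts::PI;

/// Cosine schedule: decay ramps from tau0 = 0.996 toward 1.0 over training.
fn ema_decay(step: u32, total_steps: u32) -> f64 {
    let t = step as f64 / total_steps as f64;
    let tau0 = 0.996;
    1.0 - (1.0 - tau0) * 0.5 * (1.0 + (PI * t).cos())
}

/// target <- decay * target + (1 - decay) * online, element-wise.
fn ema_update(target: &mut [f64], online: &[f64], decay: f64) {
    for (t, o) in target.iter_mut().zip(online) {
        *t = decay * *t + (1.0 - decay) * *o;
    }
}

fn main() {
    // The schedule stays inside the proven band at both endpoints.
    assert!((ema_decay(0, 1000) - 0.996).abs() < 1e-12);
    assert!((ema_decay(1000, 1000) - 1.0).abs() < 1e-12);

    let mut target = vec![0.0; 4];
    ema_update(&mut target, &[1.0; 4], 0.996);
    assert!((target[0] - 0.004).abs() < 1e-12);
}
```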

CI / Tests

  • 411 tests pass -- GREEN
  • CI -- GREEN
  • cargo clippy -D warnings = 0

Hyperparameter Search

  • Champion: lr=0.004, seed=43, d_model=384, context=6 -> BPB 2.5329
  • SEED-EXPLORER -- variance=0.0077 < 0.02 = STABLE OPTIMUM
  • LR micro-search -- lr=0.004 is clear local minimum

GF16 Benchmarks

  • BENCH-004b: GF16 MNIST = 97.67% (0.00% gap vs f32)
  • BENCH-002: GF16 add = 7.2 ns/op (15% faster)
  • BENCH-005: FPGA GF16 add = 118 LUT
  • BENCH-006: FPGA MAC-16 = 71 LUT + 16 DSP

Coq Invariants (trinity-clara)

  • docs/TASK-COQ-001.md -- NASA-P10 full specification [CREATED 2026-04-25]
  • proofs/igla/igla_asha_bound.v -- INV-2 PROVEN (0 Admitted)
  • proofs/igla/gf16_precision.v -- INV-3, INV-5 Lucas proven
  • proofs/igla/nca_entropy_band.v -- INV-4 PROVEN, band_width=1 (0 Admitted)
  • proofs/igla/lr_convergence.v -- INV-8 PROVEN (0 Admitted)

Agent ALPHA -- trios-trainer-igla (2026-04-27)

  • TASK-5D: Real backward pass in tjepa_train.rs -- gradients are real (no fake 0.01)
  • TASK-5 full run: seed=43, steps=3000, d_model=384, lr=0.004 -> BPB=2.67 (Gate-1 FAILED)
  • T1-02: 2-layer HybridAttn with RoPE + ReLU^2 -> BPB=2.18 @ 81K steps (-0.35 from baseline)
  • T2-04: QK-Gain phi^2 implemented in model_hybrid_attn.rs (qk_gain=PHI_SQ)
  • T2-07: ReLU^2 activation in train_loop.rs
  • TASK-NCA: grid=81, K=9, entropy [1.5, 2.8], w=0.25 -- IMPLEMENTED in objective.rs
  • TASK-MUON: NS-5 in optimizer.rs -- IMPLEMENTED but FALSIFIED (worse than AdamW)
  • FIX ASHA PRUNING: threshold=3.5 in src/invariants.rs (INV-2 enforced)
  • TASK-1: tri CLI -- tri train/deploy/race subcommands
  • TASK-2: BPB printer -- trios-train prints BPB=X.XXXX at eval
  • TASK-3: tri race start -- ASHA worker via race::asha::run_worker()
  • Dockerfile: Pure Rust, no .sh, no curl, dynamic data download via ureq
  • railway.json: Dockerfile builder, NEVER restart
  • Env vars: TRIOS_SEED/STEPS/HIDDEN/LR/ATTN_LAYERS/OPTIMIZER -- all via clap env
  • Seeds: [43, 44, 45] -- GATE_FINAL_SEEDS updated per TRAINING_FLOW_V2
  • Data: auto-downloads tiny_shakespeare via ureq if file missing
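A std-only sketch of the TRIOS_* environment configuration (the actual binaries read these via clap's env support; defaults here mirror the current champion and are illustrative):

```rust
// Read TRIOS_* env vars with typed defaults (sketch; real code uses clap env).
use std::env;
use std::str::FromStr;

/// Parse an env var, falling back to `default` if unset or unparsable.
fn env_or<T: FromStr>(key: &str, default: T) -> T {
    env::var(key).ok().and_then(|v| v.parse().ok()).unwrap_or(default)
}

fn main() {
    let seed: u64 = env_or("TRIOS_SEED", 43);
    let steps: u32 = env_or("TRIOS_STEPS", 81_000);
    let hidden: usize = env_or("TRIOS_HIDDEN", 828);
    let lr: f64 = env_or("TRIOS_LR", 0.004);
    let attn_layers: u32 = env_or("TRIOS_ATTN_LAYERS", 2);
    println!("seed={seed} steps={steps} hidden={hidden} lr={lr} attn={attn_layers}");
}
```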

TODO

P0 -- BLOCKERS

  • ATTENTION BACKWARD -- model_hybrid_attn.rs has only forward(). Weights wq/wk/wv/wo don't receive gradients. Expected impact: -0.20 BPB
  • RAILWAY DEPLOY -- tri deploy all works but account hit 25 services/day limit. Need reset or raise.
  • SCALE UP -- hidden=828 plateaus at 2.15. Need h=2000+ or more layers.

P1 -- EXPERIMENTS NEEDED

  • JEPA gradients don't flow -- predictor.forward_backward() returns loss but doesn't update model params. jepa_loss=0.003 is too small to matter.
  • NCA gradients don't flow -- nca_entropy_loss() returns scalar but doesn't backprop into model. Need to scale proj gradients by NCA loss.
  • Muon FALSIFIED -- NS-1 on proj(828x64): BPB=2.59 vs AdamW=2.48. NS-5 too slow on CPU. Decision: stick with AdamW.
  • Larger model -- h=2000, 3 layers, seq=256. Need Railway for compute.
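One hypothetical way to connect the NCA penalty to the model, as the second bullet suggests: scale the projection-layer gradients by the auxiliary loss instead of returning a disconnected scalar. The scaling rule and names below are assumptions, not merged code:

```rust
// Hypothetical fix: make nca_entropy_loss() influence training by scaling
// projection gradients with (1 + w * nca_loss). Sketch only, not repo code.
const NCA_WEIGHT: f64 = 0.25; // w from TASK-NCA

/// Amplify raw projection-layer gradients when the entropy penalty is high,
/// so a distribution outside the [1.5, 2.8] band pushes harder on the weights.
fn scale_proj_grads(grads: &mut [f64], nca_loss: f64) {
    let scale = 1.0 + NCA_WEIGHT * nca_loss;
    for g in grads.iter_mut() {
        *g *= scale;
    }
}

fn main() {
    let mut grads = vec![0.1, -0.2];
    scale_proj_grads(&mut grads, 0.8); // scale = 1 + 0.25 * 0.8 = 1.2
    assert!((grads[0] - 0.12).abs() < 1e-12);
    assert!((grads[1] + 0.24).abs() < 1e-12);
}
```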

P2 -- Apr 28

  • TASK-GF16-TRAIN (BENCH-012): GF16 gradient training
  • Hybrid ASHA sweep: Config A/B/C
  • TASK-8: Launch on Railway (4 machines, 320 trials/hour)
  • COQ-INV-006/007: EMA proof, victory_condition.v

FINAL -- Apr 29-30

  • 3-seed verification: BPB < 1.50 on seeds 43, 44, 45 (p < 0.01)
  • cargo test --workspace = GREEN
  • Neon: status='winner'
  • INV-001..010 ALL GREEN: coqc *.v passes
  • Victory commit: git commit -m "IGLA FOUND: BPB=X.XXXX seed=43,44,45"

BPB ROADMAP -- UPDATED WITH REAL RESULTS

| Step | Technique | Expected Δ | Actual Δ | Target | Status |
|---|---|---|---|---|---|
| Baseline | 6-gram h=384 seed=43 | -- | -- | 2.5329 | ✅ DONE |
| T1-01 | JEPA-T real backward | -0.30 | ~0 | <=2.23 | ❌ jepa_loss=0.003 |
| T1-02 | Attention + ReLU^2 (2L) | -0.30 | -0.35 | <=2.00 | ✅ BPB=2.18 |
| T2-01 | Muon optimizer | -0.15 | +0.11 (worse!) | <=1.85 | ❌ FALSIFIED |
| T2-02 | NCA auxiliary (INV-4) | -0.15 | ~0 | <=1.70 | ❌ no grad flow |
| T2-04 | QK-Gain phi^2 (INV-9) | -0.10 | in model | <=1.60 | ✅ implemented |
| T2-07 | ReLU^2 activation | -0.08 | in model | <=1.52 | ✅ implemented |
| T2-07b | GF16 d_model=384 | -0.05 | not tested | <=1.47 | TODO |
| Missing | Attention backward | ? | ? | -- | KEY LEVER |
| Missing | Scale up (h=2000+) | ? | ? | -- | KEY LEVER |

Actual Results Table

| Config | Seed | Steps | Best BPB | Time | Notes |
|---|---|---|---|---|---|
| tjepa h=384 lr=0.004 | 43 | 3K | 2.67 | 3 min | TASK-5 full run |
| trios-train h=828 attn=2L | 42 | 81K | 2.19 | 2.9 h | V3 corrected grads |
| trios-train h=828 attn=2L | 43 | 81K | 2.18 | 2.9 h | BEST |
| trios-train h=828 attn=2L | 44 | 81K | 2.18 | 2.9 h | V3 corrected grads |
| P1 AdamW control h=828 | 43 | 12K | 2.48 | 24 min | P1 baseline |
| P1 Muon NS-1 h=828 | 43 | 12K | 2.59 | 13 min | FALSIFIED |
| Trinity3K h=27 l=2 | 42 | 10K | 2.81 | -- | Worse than n-gram |
| Trinity3K h=27 l=2 | 44 | 10K | 2.70 | -- | Best Trinity3K |

CURRENT STATUS (2026-04-27T14:30+07)

| Metric | Value | Status |
|---|---|---|
| Best BPB | 2.18 (h=828 attn=2L seed=43 81K steps) | ACTIVE |
| Tests | 411 pass | GREEN |
| CI | success | GREEN |
| ASHA pruning | threshold=3.5 | ✅ FIXED |
| QK-Gain | phi^2=2.618 in model_hybrid_attn.rs | ✅ |
| NCA | grid=81 K=9 in objective.rs | ✅ code, ❌ no grad flow |
| Muon | NS-5 in optimizer.rs | ❌ FALSIFIED vs AdamW |
| Attention backward | forward() only, no backward() | ❌ BLOCKER |
| Railway | account limit 25/day | ❌ BLOCKER |
| Seeds | [43, 44, 45] | ✅ updated |
| Dockerfile | pure Rust, dynamic data | ✅ |
| Gap to target | 2.18 - 1.50 = 0.68 BPB | NEED LEVERS |

KEY FINDINGS (Agent ALPHA experience)

  1. T1-02 (2-layer HybridAttn + ReLU^2) -- the only lever that worked: -0.35 BPB. ForwardCache struct caches intermediate activations for correct backprop.
  2. Muon FALSIFIED: NS-1 on 828x64 matrix gives +0.11 vs AdamW. NS-5 too slow (21M ops/step on CPU). Decision: AdamW wins.
  3. JEPA doesn't help: predictor.forward_backward() returns loss but doesn't update model params. The predictor is a separate network -- its gradients don't flow to the encoder.
  4. NCA doesn't help: nca_entropy_loss() returns scalar loss but has no gradient connection to model weights. Need to add gradient scaling.
  5. Attention has no backward: model_hybrid_attn.rs implements only forward(). The 64x64 attention weights are initialized randomly but never trained. This is the single biggest missing piece.
  6. Architecture ceiling: 338K params (vocab=128, h=828, d=64 attn) plateaus at ~2.15. Need to scale up.
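The cached-activation pattern from finding 1 can be illustrated with the ReLU^2 nonlinearity from T2-07 (a self-contained sketch; the repo's ForwardCache holds more than this): forward saves the pre-activation so backward can form the exact gradient d/dx relu(x)^2 = 2*relu(x).

```rust
// ReLU^2 forward/backward with a cached pre-activation (illustrative sketch).
struct ForwardCache {
    pre_act: Vec<f64>, // inputs to ReLU^2, saved during forward
}

fn relu2_forward(x: &[f64], cache: &mut ForwardCache) -> Vec<f64> {
    cache.pre_act = x.to_vec();
    x.iter().map(|&v| v.max(0.0).powi(2)).collect()
}

fn relu2_backward(grad_out: &[f64], cache: &ForwardCache) -> Vec<f64> {
    grad_out
        .iter()
        .zip(&cache.pre_act)
        .map(|(&g, &x)| g * 2.0 * x.max(0.0)) // chain rule through relu(x)^2
        .collect()
}

fn main() {
    let mut cache = ForwardCache { pre_act: vec![] };
    let y = relu2_forward(&[-1.0, 3.0], &mut cache);
    assert_eq!(y, vec![0.0, 9.0]);
    let gx = relu2_backward(&[1.0, 1.0], &cache);
    assert_eq!(gx, vec![0.0, 6.0]); // zero below the kink, 2x above it
}
```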

LAWS (L-R1..L-R14) -- VIOLATION = REVERT

| Law | Rule | Violation -> |
|---|---|---|
| L-R1 | RUST ONLY -- no .py, .sh, .ipynb | REVERT |
| L-R2 | WORKERS=4-16 via env var | REVERT |
| L-R3 | Every result -> Neon + .trinity/experience/ | LESSON MISSING |
| L-R4 | cargo test --workspace = GREEN before push | PR BLOCKED |
| L-R5 | cargo clippy -- -D warnings = 0 | PR BLOCKED |
| L-R6 | SIGTERM -> graceful shutdown | DATA LOSS |
| L-R7 | Neon query timeout <= 30 sec | WORKER CRASH |
| L-R8 | Trainer stdout: ONLY BPB=X.XXXX | PARSE FAIL |
| L-R9 | GF16 only with d_model >= 256 | +3.21 BPB (INV-3 PROVEN) |
| L-R10 | T-JEPA ASHA min rung = 3000 steps | FALSE PRUNE |
| L-R11 | NCA entropy [1.5, 2.8] = hard loss penalty | COLLAPSE (INV-4 PROVEN) |
| L-R12 | All agents -> branch main ONLY | CONFLICT |
| L-R13 | agent_id + branch='main' in every Neon record | DASHBOARD FAIL |
| L-R14 | coqc trinity-clara/proofs/igla/*.v = exit 0 before race | RACE INVALID |

VICTORY CONDITIONS (Closure)

Issue #143 closes ONLY when:

  • BPB < 1.50 on seeds 43, 44, 45 (three independent runs)
  • p < 0.01 statistical significance
  • cargo test --workspace = GREEN
  • Neon: status='winner'
  • coqc trinity-clara/proofs/igla/*.v = GREEN (INV-001..010 all compiled)
  • git commit -m "IGLA FOUND: BPB=X.XXXX seed=43,44,45"
  • git push origin main

phi^2 + phi^-2 = 3 | TRINITY | IGLA RACE v2 | 2026-04-27T14:30+07 | RUST ONLY | NEVER STOP
