
Record candidate: long-context no-QV rank56/prefix3000 TTT — val_bpb 1.05875 #1965

Open

himanshudongre wants to merge 1 commit into openai:main from himanshudongre:record/rank56-prefix3000-longctx

Conversation

@himanshudongre

Summary

Adds a 3-seed 10min/16MB record candidate package:

val_bpb = 1.05874877 (population std 0.00091680), max artifact 15,980,110 bytes, all three seeds under the 600s train/eval caps.

This is a clean score-first Track B refinement on the late-April CaseOps / LQER / SparseAttnGate / phased-TTT stack. It keeps the long-context no_qv setup from PR #1953 and reallocates the TTT eval budget with:

  • TTT_LORA_RANK=56
  • PHASED_TTT_PREFIX_DOCS=3000
  • PHASED_TTT_NUM_PHASES=3
  • EVAL_SEQ_LEN=2560, TTT_EVAL_SEQ_LEN=2560
  • TTT_MASK=no_qv, TTT_LOCAL_LR_MULT=0.75, QK_GAIN_INIT=5.25
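For reproduction, the settings above map directly to environment-variable overrides; a minimal launcher sketch (variable names are taken verbatim from this PR, but how the harness consumes them and the exact launch command are assumptions):

```shell
# Eval-budget reallocation described above; the launch line is illustrative.
export TTT_LORA_RANK=56
export PHASED_TTT_PREFIX_DOCS=3000
export PHASED_TTT_NUM_PHASES=3
export EVAL_SEQ_LEN=2560
export TTT_EVAL_SEQ_LEN=2560
export TTT_MASK=no_qv
export TTT_LOCAL_LR_MULT=0.75
export QK_GAIN_INIT=5.25
# python3 train_gpt.py   # run with the overrides in place
```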

Results

| Seed | Train ms | Pre-quant BPB | Quant BPB | Final TTT BPB | Eval ms | Artifact bytes |
|------|----------|---------------|-----------|---------------|---------|----------------|
| 42   | 596051   | 1.06108950    | 1.06949683 | 1.05780842   | 519467  | 15,975,989     |
| 0    | 596086   | 1.06200352    | 1.07034149 | 1.05844590   | 425873  | 15,976,674     |
| 1234 | 596146   | 1.06319088    | 1.07176584 | 1.05999198   | 400555  | 15,977,591 → 15,980,110 max |
| Mean | 596094   | 1.06209463    | 1.07053472 | 1.05874877   | 448632  | 15,977,591     |
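The headline mean and population std can be re-derived from the Final TTT BPB column alone (a quick sanity check using only numbers from the table):

```python
import statistics

# Final TTT BPB per seed, from the results table above.
final_bpb = {42: 1.05780842, 0: 1.05844590, 1234: 1.05999198}

vals = list(final_bpb.values())
mean = statistics.fmean(vals)   # arithmetic mean over the three seeds
pstd = statistics.pstdev(vals)  # population (divide-by-N) standard deviation

print(f"{mean:.8f}")  # 1.05874877 — matches the reported val_bpb
print(f"{pstd:.8f}")  # 0.00091680 — matches the reported population std
```

Note the summary uses `pstdev` (population std), not the sample std `stdev`, which would give a larger value for only three seeds.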

Leaderboard context at submission time:

Compliance

Checked against the Issue #1017 validity conditions:

  • C1 strict causal dependence: token predictions depend on artifact + strict prefix only.
  • C2 full normalized distribution: standard full-vocabulary neural distribution over the 8192 CaseOps token ids.
  • C3 score-before-update: in phased TTT, each token is scored before any LoRA/global update consumes it.
  • C4 single pass: each validation token is scored once; no rescoring or best-of-k selection.

Additional exclusions: no SLOT, no byte/token PPM, no n-gram cache, no eval-time logit bias, no pre-quant TTT on validation data.
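To make C1/C2/C4 concrete, here is a minimal scoring-loop sketch; the model interface is hypothetical (a stand-in uniform model, not the submission's actual code), only the constraints mirror the checklist above:

```python
import math

VOCAB = 8192  # full CaseOps token-id space (C2)

class UniformModel:
    """Stand-in model emitting a full normalized distribution (C2).
    A real submission would plug the trained artifact in here."""
    def next_token_logprobs(self, prefix):
        return [math.log(1.0 / VOCAB)] * VOCAB

def mean_bits_per_token(model, tokens):
    """Score a validation stream under C1/C4: one left-to-right pass,
    each prediction conditioned on the strict prefix only."""
    nll_bits = 0.0
    for t in range(1, len(tokens)):                       # C4: scored once
        logprobs = model.next_token_logprobs(tokens[:t])  # C1: strict prefix
        nll_bits += -logprobs[tokens[t]] / math.log(2.0)
    return nll_bits / (len(tokens) - 1)
```

A uniform model over 8192 ids scores exactly log2(8192) = 13 bits per token, which makes the loop easy to unit-test.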

Files

  • records/track_10min_16mb/2026-04-30_LongCtx_NoQV_Rank56Prefix3000/README.md
  • submission.json
  • train_gpt.py
  • train_seed42.log, train_seed0.log, train_seed1234.log

Validation

Local checks before opening this PR:

  • python3 -m json.tool submission.json
  • python3 -m py_compile train_gpt.py
  • parsed all three logs to verify train ms, final BPB, eval ms, and artifact bytes match submission.json
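The log cross-check in the last bullet can be automated; a sketch, with an assumed `key=value` log-line format and a simplified per-seed submission record (both hypothetical — the repo's real formats may differ):

```python
import re

# Assumed log format, e.g. "... train_ms=596051 final_ttt_bpb=1.05780842 ..."
METRIC_RE = re.compile(r"(\w+)=([0-9][0-9.,]*)")

def parse_log_metrics(log_text):
    """Keep the last value seen for each key, so the final summary wins."""
    metrics = {}
    for key, raw in METRIC_RE.findall(log_text):
        metrics[key] = float(raw.replace(",", ""))
    return metrics

def seed_matches_submission(log_text, submitted, tol=1e-8):
    """True iff every submitted field appears in the log and agrees."""
    logged = parse_log_metrics(log_text)
    return all(
        key in logged and abs(logged[key] - value) <= tol
        for key, value in submitted.items()
    )
```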

TanishGudise added a commit to TanishGudise/parameter-golf that referenced this pull request Apr 30, 2026
… breakthrough

NULL/NEUTRAL RESULTS (within ±0.0005 noise):
- S37 GPTQ_BATCHES=32: 1.05884 (null)
- S38 TTT_BETA2=0.995: 1.05884 (null)
- S44 GLOBAL_TTT_LR=0.01: 1.05913 (within noise)
- S46 GLOBAL_TTT_EPOCHS=2: 1.05902 (null)

NEGATIVE RESULTS:
- S36 lzma compressor: rejected
- S36v2 LQER_TOP_K=2: 1.05912
- S41 openai#1965 bundle: 1.05916
- S42 LQER 8/5 + EMA 0.997: 1.05912 (EMA contaminated)
- S43 LQER 8/5 isolated: 1.05925
- S52 LeakyReLU 0.3: 1.05977 (PR openai#1948 doesn't transfer to PR openai#1797)
- S53 WARMDOWN_FRAC=0.95 + MIN_LR=0.05: 1.05950 (best pre-quant 1.06061 but bigger quant tax)

INFRASTRUCTURE FIXES:
- S39 lrzip -k flag bug, S40 SSH disconnect, S45 NCCL crash
- S47/S49/S51 LeakyReLU integration bugs

BREAKTHROUGH:
- S54 n-gram tilt port from PR openai#1145/openai#1967: 1.05692 single seed (seed 314)
  - Pre-quant: 1.06057, Quantized: 1.06917, Final: 1.05692
  - Eval: 503.4s under 600s cap, Size: 15,944,666 bytes under 16MB cap
  - Hint precompute outside timer: 173s (legal path)
  - Mode B with fused_log_softmax_dual_gather kernel
  - Hints fired on 13M of 47M tokens (27%)
  - Delta from current-env baseline: -0.00208 BPB

Validating seeds 42, 1234 next.