Three Breadsticks: 1.1190 BPB (3-seed mean 1.1195) #656

Closed
newjordan wants to merge 1 commit into openai:main from newjordan:submission/three-breadsticks

newjordan commented on Mar 24, 2026

Results

| Seed | Pre-TTT BPB | TTT BPB | Artifact |
|------|-------------|---------|----------|
| 1337 | 1.1196 | 1.1195 | 15.90 MB |
| 42 | 1.1199 | 1.1200 | 15.61 MB |
| 2045 | 1.1191 | 1.1190 | 15.81 MB |
| Mean | 1.1195 | 1.1195 | |
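The mean row can be checked directly from the per-seed values in the table. A minimal sketch; the variable names are chosen here for illustration and are not taken from the submission:

```python
from statistics import mean

# Per-seed BPB values copied from the results table.
pre_ttt = {1337: 1.1196, 42: 1.1199, 2045: 1.1191}
post_ttt = {1337: 1.1195, 42: 1.1200, 2045: 1.1190}

# Round to the four decimals the table reports.
pre_mean = round(mean(pre_ttt.values()), 4)
post_mean = round(mean(post_ttt.values()), 4)

# The headline number in the PR title is the best single seed, not the mean.
best_seed = min(post_ttt, key=post_ttt.get)

print(pre_mean, post_mean, best_seed)
```

Both means round to 1.1195, and seed 2045 gives the lowest post-TTT value, which matches the 1.1190 in the title.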

Progression

| PR | Mean BPB | Notes |
|----|----------|-------|
| #577, #533 | 1.1207* | Initial GPTQ submission |
| #578, #508 | 1.1215 | QAT + TTT refinement |
| #587 | 1.1208 | XSA + quantization tuning |
| #656 | 1.1195 | Activation + eval improvements |

*single seed

Architecture

11L/512d U-Net, 26.93M params. GPTQ int6+zstd, legal score-first TTT.
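"Score-first" is read here as: each chunk of evaluation data is scored with the current model state before the model adapts on that chunk, so no byte contributes to its own prediction. The sketch below illustrates only that ordering. It swaps in a toy byte-frequency model for the transformer and a count update for the gradient step, and `score_first_bpb` and `chunk_size` are hypothetical names, not part of `train_gpt.py`.

```python
import math
from collections import Counter


def score_first_bpb(data: bytes, chunk_size: int = 64) -> float:
    """Bits per byte of a toy adaptive model under a score-first protocol."""
    if not data:
        raise ValueError("data must be non-empty")
    counts = Counter()  # toy "weights": byte counts observed so far
    total_bits = 0.0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        seen = sum(counts.values())
        # 1) Score the chunk using only the state from before this chunk
        #    (add-one smoothing over the 256 possible byte values).
        for b in chunk:
            p = (counts[b] + 1) / (seen + 256)
            total_bits += -math.log2(p)
        # 2) Only after scoring, adapt on the chunk that was just scored.
        counts.update(chunk)
    return total_bits / len(data)


print(score_first_bpb(b"abab" * 50, chunk_size=20))
```

With a single chunk covering all the data there is no adaptation and the toy model scores exactly 8 bits per byte; smaller chunks let later chunks benefit from earlier ones without ever scoring a byte the model has already updated on.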

Reproduce

SEED=2045 torchrun --nproc_per_node=8 train_gpt.py

8xH100 SXM, 600s wallclock, ~6,900 steps.
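As a rough sanity check on those numbers, the implied time per step can be worked out directly. This sketch assumes the full 600 s is spent on training steps, which the PR does not state:

```python
wallclock_s = 600      # training budget stated above
approx_steps = 6_900   # "~6,900 steps" from the line above
gpus = 8               # 8xH100 SXM

# Average step time, assuming the whole budget goes to training steps.
ms_per_step = 1000 * wallclock_s / approx_steps
gpu_seconds = wallclock_s * gpus

print(f"{ms_per_step:.1f} ms/step, {gpu_seconds} GPU-seconds")
```

That works out to roughly 87 ms per step, or 4,800 GPU-seconds per seed.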

11L/512d U-Net with leaky_relu_sq (slope 0.5), XSA last 4,
bigram 1536, legal score-first TTT (freeze_blocks=0, grad_clip=0.8).

3-seed results:
  seed 1337: 1.1195 post-TTT  (15.90MB)
  seed 42:   1.1200 post-TTT  (15.61MB)
  seed 2045: 1.1190 post-TTT  (15.81MB)
  mean:      1.1195

Run: SEED=2045 MLP_ACT=leaky_relu_sq MLP_LEAKY_SLOPE=0.5 \
     XSA_LAST_N=4 BIGRAM_VOCAB_SIZE=1536 \
     TTT_FREEZE_BLOCKS=0 TTT_GRAD_CLIP=0.8 \
     torchrun --nproc_per_node=8 train_gpt.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
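
The commit message names the activation `leaky_relu_sq` with slope 0.5 but does not define it. A plausible reading, assumed here rather than confirmed by the source, is a LeakyReLU followed by squaring. A scalar sketch of that assumed form:

```python
def leaky_relu_sq(x: float, slope: float = 0.5) -> float:
    """Assumed form: LeakyReLU with the given negative slope, then squared."""
    y = x if x >= 0 else slope * x
    return y * y


for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, leaky_relu_sq(x))
```

Under this reading, a slope of 0 recovers squared ReLU, while slope 0.5 lets negative inputs contribute a quarter of the squared magnitude. On tensors the same form could be written as `torch.nn.functional.leaky_relu(x, negative_slope=0.5).square()`; whether `train_gpt.py` does exactly this is not shown in the PR.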
newjordan changed the title from "Three Breadsticks: 1.1190 BPB" to "Three Breadsticks: 1.1190 BPB (3-seed mean 1.1195)" on Mar 24, 2026
valerio-oai (Contributor) commented:

This submission does not include a submission.json or train logs, so I can't verify it well enough to score it. Additionally, from the training code it looks like it applies GPTQ with training-data calibration at eval time, which is disallowed. Closing for now.
