Commit d7263a3
fix(submission): prepend BOS_ID=1 in prepare_caseops_data.py
External reproductions of this submission failed with ZeroDivisionError
in phased TTT eval because the shipped prep script did not prepend the
<s> control token (ID 1) to each doc. The SP tokenizer reserves IDs 0-7
(pad/s/</s>/unk + 4 CaseOps operators), so sp.encode cannot emit ID 1
naturally, and train_gpt.py:_find_docs (line 2209) requires BOS markers
with no fallback. Training ran because _init_shard:408-409 falls back to
bos_idx=[0] when no BOS is found; phased TTT eval has no equivalent
fallback.
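
The failure mode can be sketched as follows. This is an illustration, not the code in train_gpt.py: `find_docs` and `mean_doc_len` are hypothetical stand-ins, and the exact expression that divides by the document count in phased TTT eval is an assumption.

```python
BOS_ID = 1  # <s> control token, per this commit message


def find_docs(tokens):
    """Hypothetical stand-in for _find_docs: (start, length) for each document.

    Document boundaries are taken to be the positions of BOS_ID; there is
    deliberately no fallback when no BOS marker is present.
    """
    starts = [i for i, t in enumerate(tokens) if t == BOS_ID]
    ends = starts[1:] + [len(tokens)]
    return [(s, e - s) for s, e in zip(starts, ends)]


def mean_doc_len(tokens):
    """Assumed downstream use: any per-document average divides by len(docs)."""
    docs = find_docs(tokens)
    return sum(n for _, n in docs) / len(docs)  # ZeroDivisionError if docs == []
```

With BOS markers, `find_docs([1, 10, 11, 1, 12])` yields two documents. Without them, `find_docs([10, 11, 12])` is empty and `mean_doc_len` raises ZeroDivisionError.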
Fix: add BOS_ID=1 constant, prepend to each doc's tokens, append 0 to
the byte sidecar (BOS = 0 original bytes). Matches the canonical pattern
in data/download_hf_docs_and_tokenize.py:364-366.
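
A minimal sketch of the fix, assuming the byte sidecar holds one original-byte count per token and is kept aligned with the token stream. The helper name `with_bos` is hypothetical and is not taken from prepare_caseops_data.py.

```python
BOS_ID = 1  # <s>; reserved ID, so the tokenizer's encode never emits it itself


def with_bos(token_ids, byte_counts):
    """Prepend BOS to one document and keep the byte sidecar aligned.

    byte_counts[i] is assumed to be the number of original bytes covered by
    token_ids[i]. BOS covers no original bytes, so its sidecar entry is 0.
    """
    if len(token_ids) != len(byte_counts):
        raise ValueError("tokens and byte sidecar must have the same length")
    return [BOS_ID] + list(token_ids), [0] + list(byte_counts)


tokens, sidecar = with_bos([100, 101, 102], [3, 1, 4])
# tokens  == [1, 100, 101, 102]
# sidecar == [0, 3, 1, 4]; the byte total is still 8
```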
The submitted 1.06549 metric is unaffected — val_bpb reduces to
loss_sum/ln(2)/byte_sum (token counts cancel) and byte_sum is unchanged
with BOS prepended. Our seed logs were measured on shards that already
had BOS markers from an internal prep path; the shipped prep was the
outlier.
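
The cancellation can be written out as below. The symbols N (number of scored tokens) and D (number of docs) are introduced here for illustration, and loss_sum is assumed to be a sum of per-token losses in nats.

```latex
\begin{aligned}
\text{val\_bpb}
  &= \frac{\text{loss\_sum}/N}{\ln 2}\cdot\frac{N}{\text{byte\_sum}}
   = \frac{\text{loss\_sum}}{\ln 2\cdot\text{byte\_sum}},\\
\text{byte\_sum}'
  &= \text{byte\_sum} + \sum_{d=1}^{D} 0
   = \text{byte\_sum}.
\end{aligned}
```

The second line reflects that each prepended BOS contributes 0 to the byte sidecar, so the denominator is the same with or without BOS.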
Also adds a Reproduction sanity check section to README.md that asserts
bos_count > 0 on the first val shard.
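
A check of this kind might look like the sketch below. It is not the text added to README.md; the function names are hypothetical, and loading the shard into a sequence of token IDs is left to the caller because the shard format is not described here.

```python
BOS_ID = 1  # <s> control token


def bos_count(tokens):
    """Count BOS markers in an iterable of token IDs."""
    return sum(1 for t in tokens if t == BOS_ID)


def check_first_val_shard(tokens):
    """Fail fast, before eval, if the prep script did not prepend BOS."""
    n = bos_count(tokens)
    assert n > 0, "no BOS markers found; re-run prep with BOS_ID=1 prepended"
    return n
```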
Reported by @codemath3000 in PR #1736 comment 4285805497.
1 parent e100586 commit d7263a3
2 files changed
Lines changed: 22 additions & 2 deletions
File tree
- records/track_10min_16mb/2026-04-19_SP8192_CaseOps_GatedAttn_QuantGate_Loop45_PhasedTTT
Lines changed: 18 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 108-110 | 108-110 | |
| | 111-128 | + |
| 111-113 | 129-131 | |
Lines changed: 4 additions & 2 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 60-62 | 60-62 | |
| | 63 | + |
| 63-65 | 64-66 | |
| 154-156 | 155-157 | |
| 157 | | - |
| | 158 | + |
| 158-161 | 159-162 | |
| 162 | | - |
| | 163-164 | + |
| 163-165 | 165-167 | |