sc/kernels: apply Owen scramble family at full stoc_len too#14
Closed
heroarmor wants to merge 1 commit into
_prepare_rng_prefix previously skipped the per-dim Owen XOR when
stoc_len == 2**sc_prec (full-length Sobol), applying it only to
truncated prefixes. The skip is wrong for the enable-signal matmul:
that path reads JOINT counts |{t : rng_a[d,t]<ba AND rng_b[d,t]<bb}|,
which depend on the per-d trajectory, not just the marginal count
count(r<v)=v (the only thing invariant under XOR over a Sobol
permutation).
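The marginal-versus-joint distinction can be checked on a toy pair of streams. This is a minimal sketch, not the kernel code: a bit-reversed counter and a plain counter stand in for the Sobol-Q/Sobol-K pair, the scramble is modeled as a single constant XOR mask, and `bitrev`, `marginal_count`, and `joint_count` are hypothetical helpers introduced only for illustration.

```python
PREC = 8
N = 1 << PREC  # full-length stream: stoc_len == 2**sc_prec

def bitrev(t, bits=PREC):
    """Reverse the low `bits` bits of t (van der Corput order)."""
    return int(format(t, f"0{bits}b")[::-1], 2)

# Stand-in streams (assumption): each is a permutation of 0..N-1,
# as a full-length Sobol stream would be.
rng_a = [bitrev(t) for t in range(N)]
rng_b = list(range(N))

def marginal_count(rng, v, mask=0):
    """count(r ^ mask < v) over the whole stream."""
    return sum((r ^ mask) < v for r in rng)

def joint_count(ba, bb, mask_a=0, mask_b=0):
    """|{t : rng_a[t]^mask_a < ba AND rng_b[t]^mask_b < bb}|."""
    return sum(
        ((a ^ mask_a) < ba) and ((b ^ mask_b) < bb)
        for a, b in zip(rng_a, rng_b)
    )

# XOR with a constant maps a permutation to a permutation,
# so the marginal count stays equal to v for every mask.
marginals_invariant = all(
    marginal_count(rng_a, v, m) == v
    for v in (0, 37, 100, 256)
    for m in range(N)
)

# The joint count, by contrast, moves with the mask.
joint_by_mask = {m: joint_count(100, 37, mask_a=m) for m in range(N)}
distinct_joint_counts = sorted(set(joint_by_mask.values()))

print("marginals invariant:", marginals_invariant)
print("distinct joint counts:", distinct_joint_counts)
```

Non-power-of-two thresholds (100 and 37) are used on purpose: with power-of-two thresholds on these particular stand-in streams the joint count would not change under XOR, which would hide the effect.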
With make_sobol_simple_config broadcasting the same Sobol-Q/Sobol-K
pair across all D dims, skipping the scramble at full length left
every dim with an identical joint trajectory, so SC noise across D
accumulated instead of averaging out. Always scrambling — selecting
counter / bitrev / random via SC_OWEN_MODE inside _owen_scramble —
restores per-dim decorrelation at stoc_len=256 (sc_prec=8).
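The accumulate-versus-average behavior can be illustrated with a deliberately simplified case. The sketch below assumes every dim carries the same operand pair, that a counter-style scramble XORs rng_a with the dim index d, and that D equals the stream length so each mask is used exactly once. These are illustrative assumptions, not a description of what `_owen_scramble` does, and the stand-in streams are not real Sobol sequences.

```python
PREC = 8
N = 1 << PREC  # stoc_len = 256 at sc_prec = 8
D = N          # assumption: one dim per distinct XOR mask

def bitrev(t, bits=PREC):
    """Reverse the low `bits` bits of t."""
    return int(format(t, f"0{bits}b")[::-1], 2)

# Stand-in streams (assumption), shared by every dim before scrambling.
rng_a = [bitrev(t) for t in range(N)]
rng_b = list(range(N))

def joint_count(ba, bb, mask_a):
    """Joint count for one dim whose rng_a is XORed with mask_a."""
    return sum(((a ^ mask_a) < ba) and (b < bb) for a, b in zip(rng_a, rng_b))

ba, bb = 100, 37                 # same operands in every dim (toy case)
ideal_total = D * ba * bb / N    # exact sum of products, in count units

# No scramble: every dim repeats the same trajectory, so the same
# per-dim error is added D times.
shared_total = sum(joint_count(ba, bb, 0) for _ in range(D))

# Per-dim XOR with d: each dim sees a different joint trajectory.
scrambled_total = sum(joint_count(ba, bb, d) for d in range(D))

shared_error = shared_total - ideal_total
scrambled_error = scrambled_total - ideal_total
print("shared error:", shared_error)
print("scrambled error:", scrambled_error)
```

In this toy setup the scrambled error cancels exactly, because each time step t with rng_b[t] < bb is counted by exactly ba of the 256 masks. With real, per-dim-varying operands and fewer dims, the cancellation would only be approximate.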
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
Let's discuss tomorrow how to do it first.
Summary
_prepare_rng_prefix previously skipped the per-dim Owen XOR when stoc_len == 2**sc_prec (full-length Sobol), applying the scramble only to truncated prefixes. This re-applies the scramble family at full length too.

The skip was wrong for the enable-signal matmul. That path reads joint counts |{t : rng_a[d,t]<ba AND rng_b[d,t]<bb}|, which depend on the per-d trajectory, not just the marginal count count(r<v)=v (the only quantity invariant under XOR over a Sobol permutation). With make_sobol_simple_config broadcasting the same Sobol-Q/Sobol-K pair across all D dims, skipping the scramble at full length left every dim with an identical joint trajectory, so SC noise across D accumulated instead of averaging out.

Always scrambling, with counter / bitrev / random selected via SC_OWEN_MODE inside _owen_scramble, restores per-dim decorrelation at stoc_len=256 (sc_prec=8).

Relation to prior work
Supersedes the closed #11, rebased onto main so it also carries the halve_bipolar_stoc_len flag + housekeeping from #12.

Effect (expected, Llama-3.1-8B-Instruct, wikitext-2 test, ctx=1024 stride=512, per_row, sc_prec=8)
- sl=256 (no scramble at full length, prior)
- sl=256 (scramble family applied)

Test plan
- sl=256 per_row PPL with SC_OWEN_MODE=counter and SC_OWEN_MODE=bitrev; confirm both drop from x1.172 toward the sl=128 / INT8 floor.
- SC_DISABLE_OWEN=1 (or SC_OWEN_MODE=off) reproduces the prior unscrambled sl=256 number, confirming the gate is the only behavioral change.
- stoc_len < 256 numbers unchanged (that path already scrambled).

🤖 Generated with Claude Code