sc/kernels: apply Owen scramble at full stoc_len too by heroarmor · Pull Request #11 · CrucibleComputingGroup/scmp_kernels

heroarmor · 2026-05-19T18:57:07Z

Summary

_prepare_rng_prefix previously skipped the per-dim XOR scramble when stoc_len == 2**sc_prec (full-length Sobol), with the comment "no scramble is needed". That reasoning only holds for the marginal count count(r<v)=v, which is invariant under XOR over a Sobol permutation.

The enable-signal matmul actually reads joint counts |{t : rng_a[d,t]<ba AND rng_b[d,t]<bb}| which depend on the per-d trajectory, not just marginals. With make_sobol_simple_config broadcasting the same Sobol-Q/Sobol-K pair across all D dims, skipping the scramble at full length left every dim with an identical joint trajectory, so SC noise across D accumulated instead of averaging out.
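To make the marginal-versus-joint distinction concrete, here is a toy sketch in plain Python. It assumes an 8-level grid and uses two stand-in permutations (the identity and a bit-reversal order) rather than the real Sobol-Q/Sobol-K streams. It also models the scramble as a single XOR mask per dim applied to `rng_a` only; the actual `_owen_scramble` may choose and apply its masks differently.

```python
def joint_count(rng_a, rng_b, ba, bb):
    """Number of cycles t with rng_a[t] < ba AND rng_b[t] < bb."""
    return sum(1 for a, b in zip(rng_a, rng_b) if a < ba and b < bb)


N = 8                               # toy grid: 2**sc_prec with sc_prec = 3
rng_a = list(range(N))              # stand-in full-length permutation of [0, N)
rng_b = [0, 4, 2, 6, 1, 5, 3, 7]    # stand-in: bit-reversed order
ba, bb = 3, 3                       # example thresholds

# Marginals are exact for any permutation of [0, N).
assert sum(1 for a in rng_a if a < ba) == ba
assert sum(1 for b in rng_b if b < bb) == bb

# Every dim shares the same trajectory: one joint count, repeated D times.
shared = joint_count(rng_a, rng_b, ba, bb)

# One XOR mask per dim (hypothetical stand-in for the per-dim scramble).
per_mask = [joint_count([a ^ m for a in rng_a], rng_b, ba, bb) for m in range(N)]

ideal = ba * bb / N                 # what an unbiased joint estimate should give
mean_over_masks = sum(per_mask) / N

print(shared, per_mask, ideal, mean_over_masks)
# -> 2 [2, 1, 2, 1, 1, 1, 1, 0] 1.125 1.125
```

In this toy case the shared trajectory reports 2 against an ideal of 1.125, so the same +0.875 error would be added once per dim. The per-mask counts differ from dim to dim and average to 1.125 over all eight masks, which is the averaging behaviour the scramble is meant to restore.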

Effect

Llama-3.1-8B-Instruct PPL on wikitext-2 test (ctx=1024, stride=512, per_row, sc_prec=8):

| config | PPL | xFP16 |
| --- | --- | --- |
| FP16 baseline | 6.7711 | x1.000 |
| INT8 per_row deterministic | 6.9328 | x1.024 |
| SC `sl=128` (Owen applied) | 7.1771 | x1.060 |
| SC `sl=256` (no Owen, prior) | 7.9383 | x1.172 |

SC sl=256 was worse than its own deterministic INT8 floor. After this change sl=256 also runs through Owen scramble; validation run pending.
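The xFP16 column is each PPL divided by the FP16 baseline. A short check of that arithmetic, using only the PPL values reported above (the dictionary keys are just labels chosen for this sketch):

```python
# PPL values copied from the table above.
ppl = {
    "FP16 baseline": 6.7711,
    "INT8 per_row deterministic": 6.9328,
    "SC sl=128 (Owen applied)": 7.1771,
    "SC sl=256 (no Owen, prior)": 7.9383,
}

baseline = ppl["FP16 baseline"]
# Ratio to FP16, rounded to three decimals as in the table.
ratios = {name: round(value / baseline, 3) for name, value in ppl.items()}

for name, ratio in ratios.items():
    print(f"{name}: x{ratio:.3f}")

# The anomaly: full-length SC lands above the deterministic INT8 floor.
assert ratios["SC sl=256 (no Owen, prior)"] > ratios["INT8 per_row deterministic"]
```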

Why this is safe

_owen_scramble only depends on prefix.shape[0] (D); it doesn't care whether the input is a prefix or the full sequence.
XOR is a bijection on [0, base_levels), so each per-d sequence is still a permutation of [0, base_levels). Marginals (and therefore k_table) are unchanged.
The fixed-level rescale branch (grid_levels != base_levels) is untouched.
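The bijection argument can be checked exhaustively for `sc_prec=8`. The sketch below is an illustration, not the repository's code: `xor_scramble` is a hypothetical one-mask stand-in for `_owen_scramble`, and the input is a bit-reversal permutation rather than the real Sobol stream. It relies on `base_levels` being a power of two, so XOR with any mask in `[0, base_levels)` stays in range.

```python
from bisect import bisect_left


def xor_scramble(seq, mask):
    """Hypothetical stand-in: XOR every sample of one dim with a single mask."""
    return [r ^ mask for r in seq]


def marginals_preserved(seq, levels):
    """True if, for every mask, the scrambled sequence is still a permutation
    of [0, levels) and count(r < v) == v for every threshold v."""
    expected = list(range(levels))
    for mask in range(levels):
        ordered = sorted(xor_scramble(seq, mask))
        if ordered != expected:
            return False
        # bisect_left on the sorted samples gives count(r < v).
        if any(bisect_left(ordered, v) != v for v in range(levels + 1)):
            return False
    return True


base_levels = 2 ** 8  # sc_prec = 8, as in the PPL runs
# Stand-in full-length permutation: 8-bit bit reversal of the cycle index.
seq = [int(format(t, "08b")[::-1], 2) for t in range(base_levels)]

print(marginals_preserved(seq, base_levels))
# -> True
```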

Status

Proposed fix. Owen-at-full-length validation run (SC sl=256, full wikitext-2) is currently in flight; results will be posted in a follow-up comment.

Test plan

SC sl=256 per_row PPL drops from x1.172 toward x1.024 (INT8 floor) or better
SC sl=128 per_row PPL unchanged (it already went through Owen)
RULER VT @ 1K still scores 100 at sl>=96 (sanity)

🤖 Generated with Claude Code

When `stoc_len == 2**sc_prec`, `_prepare_rng_prefix` previously skipped the per-dim XOR scramble. The comment claimed scrambling was "not needed" at full length because the marginal `count(r<v)=v` over a Sobol-N permutation is invariant under XOR.

That reasoning only covers the marginal. The enable-signal matmul reads joint counts `|{t: rng_a[d,t]<ba AND rng_b[d,t]<bb}|`, which depend on the per-d (rng_a, rng_b) *trajectory*, not just marginals. With the default `make_sobol_simple_config` broadcasting the same Sobol-Q/Sobol-K pair across all D dims, skipping the scramble at full length left every dim with an identical joint trajectory. SC noise across D then accumulated as a single biased estimator instead of averaging across independent estimators.

Effect on Llama-3.1-8B-Instruct PPL (wikitext-2 test, ctx=1024 stride=512, per_row, sc_prec=8):

| config | PPL | xFP16 | note |
| --- | --- | --- | --- |
| FP16 | 6.7711 | x1.000 | |
| INT8 per_row det. | 6.9328 | x1.024 | deterministic floor |
| SC sl=128 (Owen) | 7.1771 | x1.060 | |
| SC sl=256 (no Owen) | 7.9383 | x1.172 | worse than INT8 floor |

After this change SC at sl=256 also goes through `_owen_scramble`, giving each dim a distinct XOR mask and recovering cross-D averaging. The prefix-vs-full distinction is no longer load-bearing, so the `is_prefix` guard is removed entirely. Owen scramble itself is unchanged; only the gate is widened.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR updates the SC enable-table RNG preparation logic so that the per-dimension Owen XOR scramble is applied even when using the full-length Sobol stream (stoc_len == 2**sc_prec). This aims to reduce cross-dimension correlation in joint (rng_a, rng_b) trajectories, improving noise averaging behavior across D dimensions.

Changes:

Always apply _owen_scramble() in _prepare_rng_prefix() when grid_levels == 2**sc_prec, including for full-length streams.
Remove the previous “no scramble needed at full length” special-casing and replace it with updated rationale in comments.

+    prefix = rng[:, :stoc_len].contiguous() if stoc_len < rng.shape[1] else rng
    if grid_levels == base_levels:
-        # Fixed-level path: if we're truncating a longer Sobol sequence, apply
-        # Owen scramble to break the prefix stratification artifact. When the
-        # sequence is used in full (non-truncated), no scramble is needed.
-        if is_prefix:
-            return _owen_scramble(prefix, base_levels)
-        return prefix
+        # Per-dim Owen XOR — even at full length, this decorrelates the joint
+        # (rng_a, rng_b) trajectory across dimensions; without it, all D dims
+        # share the same joint, and SC noise accumulates instead of averaging.
+        return _owen_scramble(prefix, base_levels)


heroarmor · 2026-05-19T21:28:07Z

Per Allen's recommendation, dropping this implementation approach. Closing the PR; remote branch fix/owen-scramble-at-full-length is left intact so the diff stays referenceable. The PPL data motivating the investigation (INT8 floor + sl=128 vs sl=256 anomaly) lives in scmp_llm#5; the alternative fix direction is TBD.

heroarmor requested review from Allenjin123 and Copilot and removed request for Copilot May 19, 2026 18:57

Copilot started reviewing on behalf of heroarmor May 19, 2026 18:57

heroarmor mentioned this pull request May 19, 2026

kernels: bump to scramble-family sl=256 fix + benchmark/ppl INT8 floor & SKIP_FP16 CrucibleComputingGroup/scmp_llm#5

Merged

5 tasks

Copilot AI reviewed May 19, 2026

heroarmor closed this May 19, 2026

This was referenced May 20, 2026

sc/kernels: apply Owen scramble family at full stoc_len too #14

Closed

sc/kernels: apply Owen scramble in the rescale branch (fixes halve x4.215 at 128 cycles) #16

Merged

heroarmor wants to merge 1 commit into main from fix/owen-scramble-at-full-length
