[WS1] Batch-invariant elementwise / RoPE pass-through audit #149

Part of WS1 — Full Batch-Invariant Forward Chain (epic: #<WS1 tracking issue>)

## Why

Pointwise ops (activations, residual adds, scaling) and RoPE are usually assumed "safe" because they have no cross-element reduction, but that assumption needs to be verified, not trusted. A stray fused reduction, a dtype downcast at the wrong point, or a position-dependent RoPE precision issue can quietly reintroduce drift between the batch=1 and batch=N residual streams.
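
To make the "downcast at the wrong point" failure mode concrete, here is a minimal NumPy sketch. The function names are hypothetical and the values are chosen only for illustration. It assumes two code paths that compute the same residual update but cast to fp16 at different points, as could happen if different kernels were selected for different shapes.

```python
import numpy as np

def delta_late_cast(x32: np.ndarray, d32: np.ndarray) -> np.ndarray:
    """Do all arithmetic in float32, downcast once at the end."""
    return ((x32 + d32) - x32).astype(np.float16)

def delta_early_cast(x32: np.ndarray, d32: np.ndarray) -> np.ndarray:
    """Downcast right after the add, then finish in float16."""
    return (x32 + d32).astype(np.float16) - x32.astype(np.float16)

x = np.ones(4, dtype=np.float32)
d = np.full(4, 2.0 ** -12, dtype=np.float32)  # smaller than fp16 spacing near 1.0

late = delta_late_cast(x, d)    # keeps the update: 2**-12 in every slot
early = delta_early_cast(x, d)  # update is rounded away: all zeros
print(late.tobytes() == early.tobytes())  # False
```

Both paths are purely elementwise, yet they disagree bitwise, which is why the audit checks dtype handling and not only the absence of reductions.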

## Scope

Audit every elementwise op and RoPE on the forward path and confirm they are genuine pass-throughs with respect to batch configuration.

- Enumerate the pointwise ops in the standard-Transformer forward path (activation, residual add, scaling, bias add, mask fill, dtype casts) and confirm none performs a batch-shape-dependent reduction.
- Confirm RoPE produces identical rotated Q/K for a given position regardless of batch size, row index within the batch, or padding (the cos/sin precompute and application order must be fixed; if both table-lookup and inline computation paths exist, document which is used where).
- Verify position-id handling, including packed-sequence position reset (coordinate with #42).
- Flag any op that silently changes dtype or accumulation based on tensor shape.
- Validate the audited ops against the #108 harness across the standard sweep.
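
The per-op check in the list above can be sketched as follows. This is a NumPy stand-in, not the #108 harness (whose API is not described here); `bitwise_equal`, `is_batch_invariant`, and both example ops are hypothetical names used only for illustration.

```python
import numpy as np

def bitwise_equal(a: np.ndarray, b: np.ndarray) -> bool:
    """True only if dtype, shape, and every byte match."""
    return a.dtype == b.dtype and a.shape == b.shape and a.tobytes() == b.tobytes()

def is_batch_invariant(op, x: np.ndarray) -> bool:
    """Compare op(full batch)[i] with op(row i alone) for every row."""
    full = op(x)
    return all(
        bitwise_equal(full[i:i + 1], op(x[i:i + 1])) for i in range(x.shape[0])
    )

def relu_residual(x: np.ndarray) -> np.ndarray:
    # Genuine pass-through: activation, scaling, residual add, all per element.
    return x + 0.5 * np.maximum(x, 0)

def batch_scaled(x: np.ndarray) -> np.ndarray:
    # Deliberately broken: the scale is a reduction over the whole batch.
    return x / np.abs(x).max()

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8)).astype(np.float32)
print(is_batch_invariant(relu_residual, x))  # True
print(is_batch_invariant(batch_scaled, x))   # False
```

Comparing raw bytes rather than using a tolerance also catches a silent dtype change, since the dtype is part of the comparison.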

## Out of scope

- Rewriting ops that already pass. This is an audit; ops that fail must be fixed here or linked to a blocking fix issue before this issue closes.
- RMSNorm, matmul, attention, logprob (covered by their own issues).
- FP8.

## Acceptance criteria

- A short written audit lists each elementwise op + RoPE and its invariance verdict (pass / needs-fix).
- RoPE output is bitwise-identical for a fixed position across all batch configs in the sweep.
- Any op found to reintroduce drift is fixed here or linked to a blocking fix issue with a minimal repro against the harness; this audit does not close while known drift remains unresolved.
- Audited ops pass the #108 shared test helper; CI includes minimal RoPE + elementwise invariance tests.
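
One possible shape for the RoPE criterion, as a NumPy sketch. It assumes a table-lookup RoPE with the "rotate half" layout and a float32 cos/sin table; the project's actual layout, dtypes, and function names may differ, and everything below is hypothetical.

```python
import numpy as np

def rope_tables(max_pos: int, head_dim: int, base: float = 10000.0):
    """Precompute cos/sin once, indexed by absolute position."""
    inv_freq = base ** (-np.arange(0, head_dim, 2, dtype=np.float64) / head_dim)
    angles = np.arange(max_pos, dtype=np.float64)[:, None] * inv_freq[None, :]
    return np.cos(angles).astype(np.float32), np.sin(angles).astype(np.float32)

def apply_rope(x, position_ids, cos, sin):
    """x: (batch, seq, head_dim); position_ids: (batch, seq) integers."""
    c, s = cos[position_ids], sin[position_ids]  # (batch, seq, head_dim // 2)
    half = x.shape[-1] // 2
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * c - x2 * s, x2 * c + x1 * s], axis=-1)

rng = np.random.default_rng(0)
cos, sin = rope_tables(max_pos=64, head_dim=8)

# Reference: one sequence of 5 tokens at positions 0..4, batch=1.
q = rng.standard_normal((1, 5, 8)).astype(np.float32)
solo = apply_rope(q, np.arange(5)[None, :], cos, sin)

# Same tokens as row 2 of a batch of 3, left-padded by 3 slots.
batch = rng.standard_normal((3, 8, 8)).astype(np.float32)
batch[2, 3:] = q[0]
pos = np.tile(np.arange(8), (3, 1))
pos[2] = np.concatenate([np.zeros(3, dtype=np.int64), np.arange(5)])
batched = apply_rope(batch, pos, cos, sin)
rope_matches = solo[0].tobytes() == batched[2, 3:].tobytes()

# Naive position ids (raw column index) shift the padded row by 3 positions.
naive = apply_rope(batch, np.tile(np.arange(8), (3, 1)), cos, sin)
naive_matches = solo[0].tobytes() == naive[2, 3:].tobytes()

print(rope_matches, naive_matches)  # True False
```

The second comparison shows why the position-id checks belong next to the RoPE test: the rotation itself can be bitwise-stable while the wrong ids still produce different Q/K.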

## Notes

- Depends on #108.
- Mostly verification rather than new kernels, but the RoPE precision/order check carries the real risk; treat it as the priority within this issue.
- Padding-related position drift can silently corrupt attention; pay special attention to position ids.
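
As a reference point for the position-id checks, here is a small NumPy sketch of one common convention: real tokens count up from 0, padding slots are pinned to 0, and packed sequences restart at each boundary. The helper names and the padding value are assumptions; the actual convention should follow whatever is agreed with #42.

```python
import numpy as np

def position_ids_from_mask(attention_mask: np.ndarray) -> np.ndarray:
    """Real tokens count up from 0 per row; padding slots are pinned to 0."""
    pos = np.cumsum(attention_mask, axis=-1) - 1
    return np.where(attention_mask == 1, pos, 0)

def packed_position_ids(seq_lens) -> np.ndarray:
    """Positions restart at 0 at every packed-sequence boundary."""
    return np.concatenate([np.arange(n) for n in seq_lens])

mask = np.array([[0, 0, 1, 1, 1],   # left-padded row
                 [1, 1, 1, 1, 1]])  # full row
print(position_ids_from_mask(mask))
# [[0 0 0 1 2]
#  [0 1 2 3 4]]
print(packed_position_ids([3, 2]))
# [0 1 2 0 1]
```

A consistency test can then assert that the ids a sequence receives are the same whether it runs alone, padded inside a batch, or packed with other sequences.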

## Planned PRs

- [ ] Enumerate forward-path pointwise ops + RoPE; write audit doc with pass/needs-fix verdicts
- [ ] RoPE batch-invariance tests (fixed position -> identical rotated Q/K)
- [ ] Position-id / padding consistency tests (incl. packed sequences; coordinate with #42)
- [ ] Fix or link blocking fix issues for any drift found
- [ ] Wire audited ops through the #108 harness; add CI coverage
