[WS1] Batch-invariant elementwise / RoPE pass-through audit #149

@Flink-ddd

Part of WS1 — Full Batch-Invariant Forward Chain (epic: #)

Why

Pointwise ops (activations, residual adds, scaling) and RoPE are usually assumed "safe" because they have no cross-element reduction, but that assumption needs to be verified, not trusted. A stray fused reduction, a dtype downcast at the wrong point, or a position-dependent RoPE precision issue can quietly reintroduce drift between the batch=1 and batch=N residual streams.
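To make two of these failure modes concrete, here is a minimal stdlib-only sketch (not taken from the project's kernels). It shows that the grouping of a floating-point sum changes the result, and that the point at which a value is downcast to half precision changes the result. The names `to_half`, `residual`, `branch_a`, and `branch_b` are illustrative assumptions. In a real kernel the grouping or downcast point could plausibly vary with tiling or launch configuration, which is how batch size might leak in.

```python
import struct


def to_half(x):
    """Round a Python float to IEEE binary16 and back (struct format 'e')."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# 1) Reduction order: the same three addends, grouped two ways.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
print(left_to_right == right_to_left)  # False

# 2) Downcast placement: a residual plus two small branch outputs.
# Each branch is 0.375 of a half-precision ulp at 1.0, so it vanishes
# when rounded on its own, but the two together are large enough to round up.
residual = 1.0
branch_a = branch_b = 3 * 2.0 ** -13

downcast_once = to_half(residual + branch_a + branch_b)
downcast_each = to_half(to_half(residual + branch_a) + branch_b)
print(downcast_once, downcast_each)  # 1.0009765625 1.0
```

Neither case involves an obviously "unsafe" op, which is why the audit checks bit patterns rather than relying on the structure of the computation.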

Scope

Audit every elementwise op and RoPE on the forward path and confirm they are genuine pass-throughs with respect to batch configuration.
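One way the audit loop could be structured is sketched below, under stated assumptions: ops are modelled as pure-Python functions from a batch (list of rows) to a batch, and the op set (`silu`, `scale`, `residual_add`) is a hypothetical stand-in for whatever is actually on the forward path. `absmax_rescale` is a deliberately broken op with a hidden batch-wide reduction, included only to show what a needs-fix verdict looks like. The real audit would run the production kernels through the #108 helper, whose API is not shown here.

```python
import math
import struct


def row_bits(row):
    """Exact bit pattern of a row, so -0.0 vs 0.0 also counts as a mismatch."""
    return struct.pack(f"<{len(row)}d", *row)


# Hypothetical stand-ins for elementwise ops on the forward path.
def silu(batch):
    return [[x / (1.0 + math.exp(-x)) for x in row] for row in batch]


def scale(batch, factor=0.125):
    return [[x * factor for x in row] for row in batch]


def residual_add(batch):
    branch = silu(batch)
    return [[x + b for x, b in zip(row, brow)] for row, brow in zip(batch, branch)]


def absmax_rescale(batch):
    # Deliberately NOT a pass-through: the peak is reduced over the whole batch.
    peak = max(abs(x) for row in batch for x in row)
    return [[x / peak for x in row] for row in batch]


def audit(ops, probe, batch_sizes):
    """Compare the probe row's output at batch=1 with its output inside larger batches."""
    verdicts = {}
    for name, op in ops.items():
        reference = row_bits(op([probe])[0])
        ok = True
        for n in batch_sizes:
            filler = [[float(k + 3)] * len(probe) for k in range(n - 1)]
            ok = ok and row_bits(op(filler + [probe])[-1]) == reference
        verdicts[name] = "pass" if ok else "needs-fix"
    return verdicts


OPS = {
    "silu": silu,
    "scale": scale,
    "residual_add": residual_add,
    "absmax_rescale": absmax_rescale,
}
PROBE = [0.5, -1.25, 2.0, 0.75]
VERDICTS = audit(OPS, PROBE, batch_sizes=[1, 2, 8, 33])
for name, verdict in VERDICTS.items():
    print(f"{name:16s} {verdict}")
```

Comparing packed bytes rather than using a tolerance keeps the check aligned with the bitwise framing of this issue. The batch sizes in the sweep are placeholders.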

Out of scope

  • Rewriting ops that already pass. This is an audit: only ops that fail must be fixed here or linked to a blocking fix issue before this issue closes.
  • RMSNorm, matmul, attention, logprob (covered by their own issues).
  • FP8.

Acceptance criteria

  • A short written audit lists each elementwise op + RoPE and its invariance verdict (pass / needs-fix).
  • RoPE output is bitwise-identical for a fixed position across all batch configs in the sweep.
  • Any op found to reintroduce drift is fixed here or linked to a blocking fix issue with a minimal repro against the harness; this audit does not close while known drift remains unresolved.
  • Audited ops pass the shared test helper from #108 ([WS1] Ground-truth harness + numerical contract for batch-invariant ops); CI includes minimal RoPE + elementwise invariance tests.
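The RoPE criterion above could be exercised with a test shaped roughly like the following sketch. It assumes the rotate-half pairing (element `i` with `i + dim/2`) and a base of 10000, neither of which this issue specifies, and it uses a pure-Python reference rather than the real kernel. `leaky_rope_batch` is a hypothetical broken variant, included only to confirm that the check can fail. The sweep values are placeholders.

```python
import math
import struct


def rope_rotate(vec, position, base=10000.0):
    """Rotary embedding for one even-length head vector at one position."""
    dim = len(vec)
    half = dim // 2
    out = [0.0] * dim
    for i in range(half):
        theta = position * base ** (-2.0 * i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x1, x2 = vec[i], vec[i + half]
        out[i] = x1 * cos_t - x2 * sin_t
        out[i + half] = x1 * sin_t + x2 * cos_t
    return out


def rope_batch(batch, positions):
    return [rope_rotate(v, p) for v, p in zip(batch, positions)]


def leaky_rope_batch(batch, positions):
    # Deliberately broken: subtracts a mean computed across the whole batch.
    rotated = rope_batch(batch, positions)
    mean0 = sum(row[0] for row in rotated) / len(rotated)
    return [[x - mean0 for x in row] for row in rotated]


def row_bits(row):
    return struct.pack(f"<{len(row)}d", *row)


def rope_verdicts(op, probe, position, batch_sizes):
    """Place the probe in every slot of every batch size and compare bits."""
    reference = row_bits(op([probe], [position])[0])
    verdicts = {}
    for n in batch_sizes:
        same = True
        for slot in range(n):
            batch = [[float(k + 1)] * len(probe) for k in range(n)]
            positions = list(range(n))  # neighbours sit at other positions
            batch[slot], positions[slot] = probe, position
            same = same and row_bits(op(batch, positions)[slot]) == reference
        verdicts[n] = "pass" if same else "needs-fix"
    return verdicts


PROBE = [0.5, -1.25, 2.0, 0.75]
ROPE_RESULT = rope_verdicts(rope_batch, PROBE, 7, [1, 2, 8, 33])
LEAKY_RESULT = rope_verdicts(leaky_rope_batch, PROBE, 7, [1, 2, 8, 33])
print("rope :", ROPE_RESULT)
print("leaky:", LEAKY_RESULT)
```

Sweeping the probe across every slot, not just the last one, guards against layouts where only some rows of a batch take a different code path.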

Notes

Planned PRs

Metadata

Labels

component: kernels, feature, platform: cuda, priority: high, sprint-0615
