fix(hpc/activations): sigmoid_f32 stride mismatch (orphan rescue from PR #154)#155
Merged
Merged
Conversation
Codex flagged: same-shaped contiguous views with different memory orders (C-order input + F-order output) both succeeded at as_slice_memory_order but with mismatched logical indexing — the flat SIMD primitive wrote sigmoid values into the wrong output coordinates. Fix: add the same strides-equality guard that hpc/vml.rs already uses in dispatch_unary_contig / dispatch_binary_contig. Mismatched-stride inputs now route to the stride-aware Zip cold path. Adds test_sigmoid_f32_c_in_f_out_mismatched_strides regression: 2x2 C-order input, F-order zero-init output, asserts logical coordinates carry correct sigmoid values. Activations test count: 16 -> 17. Reductions are unaffected (read-only commutative/associative — memory order doesn't change the scalar result). vml unary/binary already guarded via dispatch_*_contig.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Rescues an orphaned fix from PR #154: the Codex review of #154 flagged a stride-mismatch bug in
sigmoid_f32that I committed (b5a63fc7) but landed AFTER the merge cycle had completed — so the merged version of #154 still carries the bug.The bug
Same-shaped contiguous views with different memory orders (C-order input + F-order output) both succeeded at
as_slice_memory_orderbut with mismatched logical indexing. The flat SIMD primitive then wrote sigmoid values into the wrong output coordinates instead of falling back to the stride-awareZipcold path.Reproducer (also the regression test added in this PR):
The fix
Add the same
x.strides() == out.strides()guard thathpc/vml.rsalready uses indispatch_unary_contig/dispatch_binary_contig(vml's unary and binary fns were never affected because they routed through these helpers). Mismatched-stride inputs now correctly route to the stride-awareZipcold path.Why other W2 surfaces are unaffected
hpc/reductions.rs(sum / mean / max / min / nrm2 / argmax / argmin) — read-only commutative/associative reductions. Memory-order of the iteration doesn't change the scalar result.hpc/vml.rs(16 unary + 4 binary fns) — already routed throughdispatch_unary_contig/dispatch_binary_contigwhich carry the strides-equality guard.activations::softmax_f32/log_softmax_f32—ArrayView1only; 1-Das_slice()returnsNonefor non-unit stride, so the cold path engages for any strided 1-D view.Only
sigmoid_f32was generic-DAND used the rawas_slice_memory_ordercalls without the guard.Test plan
cargo check -p ndarray --no-default-features --features stdcleancargo test -p ndarray --lib --no-default-features --features std hpc::activations— 17 passed / 0 failed (was 16 before this fix)test_sigmoid_f32_c_in_f_out_mismatched_stridesexercises the exact Codex reprocargo fmt --all -- --checkcleancargo clippy --no-default-features --features std -- -D warningscleanGenerated by Claude Code