fix asan error via x86 tmp buffer alignment #6703
@codex review
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #6703 +/- ##
==========================================
- Coverage 93.96% 93.95% -0.01%
==========================================
Files 933 933
Lines 299477 299613 +136
==========================================
+ Hits 281396 281515 +119
- Misses 18081 18098 +17
Codex Review: Didn't find any major issues. Chef's kiss.
Pull request overview
This PR addresses x86 ASan/alignment-related faults by ensuring temporary stack buffers used with SIMD intrinsics are explicitly aligned, and by switching from unaligned (*_loadu_* / *_storeu_*) to aligned (*_load_* / *_store_*) load/store intrinsics where appropriate.
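The pattern described in the overview can be illustrated with a minimal SSE sketch. It is not code from the PR: `is_aligned` and `sum4_plus_one` are hypothetical names made up for this example. The idea is that an aligned intrinsic such as `_mm_store_ps` is only valid on a 16-byte-aligned address, so the stack temporary carries an explicit `alignas(16)`, while memory whose alignment is unknown still goes through the unaligned `_mm_loadu_ps`.

```cpp
#include <cassert>
#include <cstdint>
#include <xmmintrin.h> // SSE intrinsics: _mm_loadu_ps, _mm_store_ps, _mm_add_ps

// Hypothetical helper (not from the PR): true if p is a multiple of `bytes`.
static bool is_aligned(const void* p, std::uintptr_t bytes)
{
    return reinterpret_cast<std::uintptr_t>(p) % bytes == 0;
}

// Hypothetical kernel fragment: adds 1.0f to four floats, spills the vector
// to a stack temporary, and returns the sum of the four lanes.
static float sum4_plus_one(const float* src)
{
    // Explicit alignment on the temporary is what makes the aligned store legal.
    alignas(16) float tmp[4];
    assert(is_aligned(tmp, 16));

    // src comes from the caller, so its alignment is unknown: unaligned load.
    __m128 v = _mm_loadu_ps(src);
    v = _mm_add_ps(v, _mm_set1_ps(1.0f));

    // tmp is guaranteed 16-byte aligned: aligned store.
    _mm_store_ps(tmp, v);

    return tmp[0] + tmp[1] + tmp[2] + tmp[3];
}
```

Calling `sum4_plus_one` on a pointer that is deliberately offset by one float still works, because only the annotated temporary is accessed with the aligned intrinsic. The same reasoning would extend to 32-byte (AVX) and 64-byte (AVX512) temporaries with the corresponding `alignas` value.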
Changes:
- Add explicit 16/32/64-byte alignment annotations to stack temporary arrays across multiple x86 kernels.
- Replace unaligned SIMD loads/stores with aligned variants once alignment is guaranteed.
- Consolidate some multi-array temporaries into a single aligned buffer with pointer slices (e.g., tmpbuf/sumbuf) to ensure consistent alignment.
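The consolidation mentioned in the last bullet could look roughly like the sketch below. It assumes two 4-float accumulators and uses hypothetical names (`is_aligned16`, `two_row_sum`, `sum0`, `sum1`); only `sumbuf` echoes a name from the summary, and the real buffers in the PR may be sized and sliced differently. One `alignas(16)` array is declared, and each slice starts at an offset that is a multiple of 16 bytes, so every slice inherits the alignment of the parent buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <xmmintrin.h> // SSE intrinsics: _mm_loadu_ps, _mm_store_ps

// Hypothetical helper (not from the PR): true if p is 16-byte aligned.
static bool is_aligned16(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Hypothetical kernel fragment: stores two 4-float vectors into slices of one
// aligned buffer and returns the sum of all eight values.
static float two_row_sum(const float* a, const float* b)
{
    // One aligned buffer instead of two separately declared arrays.
    alignas(16) float sumbuf[8];
    float* sum0 = sumbuf;     // byte offset 0
    float* sum1 = sumbuf + 4; // byte offset 16, still a multiple of 16
    assert(is_aligned16(sum0) && is_aligned16(sum1));

    // Inputs have unknown alignment; the slices are known to be aligned.
    _mm_store_ps(sum0, _mm_loadu_ps(a));
    _mm_store_ps(sum1, _mm_loadu_ps(b));

    float total = 0.f;
    for (int i = 0; i < 8; i++)
        total += sumbuf[i];
    return total;
}
```

The design point is that alignment is stated once, on the parent buffer, and the slice offsets are chosen as multiples of the vector width, so no slice can silently end up misaligned.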
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/layer/x86/innerproduct_fp.h | Align sums buffers and use aligned AVX/SSE loads for accumulation initialization. |
| src/layer/x86/innerproduct_bf16s.h | Same alignment + aligned loads change for bf16s innerproduct path. |
| src/layer/x86/gemm_x86.cpp | Align sum buffers used with AVX512/AVX/SSE stores; switch to aligned stores. |
| src/layer/x86/gemm_int8.h | Align temporary output sum buffers; switch to aligned AVX/SSE stores. |
| src/layer/x86/gemm_bf16s.h | Align multiple tmp/sum buffers (float + bf16) and switch to aligned SIMD stores. |
| src/layer/x86/deconvolution_packed.h | Align sum/tmp buffers and switch to aligned AVX512/AVX/SSE loads/stores. |
| src/layer/x86/deconvolution_packed_bf16s.h | Same alignment + aligned load/store changes for bf16s deconvolution. |
| src/layer/x86/convolution1d_packed.h | Align sum buffers and use aligned stores for AVX512/AVX/SSE outputs. |
| src/layer/x86/convolution1d_packed_bf16s.h | Same alignment + aligned stores change for bf16s convolution1d. |
| src/layer/x86/convolution_packed.h | Align sum buffers and use aligned stores for AVX512/AVX/SSE outputs. |
| src/layer/x86/convolution_packed_int8.h | Align int sum buffers and use aligned integer SIMD stores. |
| src/layer/x86/convolution_packed_bf16s.h | Align sum buffers and use aligned stores for bf16s packed convolution outputs. |
| src/layer/x86/convolution_im2col_gemm.h | Align sum buffers and use aligned AVX512/AVX/SSE stores in im2col+gemm path. |
| src/layer/x86/convolution_im2col_gemm_int8.h | Align offset/sum buffers and switch to aligned SSE/AVX integer stores where used. |
| src/layer/x86/convolution_im2col_gemm_bf16s.h | Align sum buffers and use aligned stores for bf16s im2col+gemm output handling. |
| src/layer/x86/convolution_3x3_winograd.h | Align tmp buffers used for Winograd output transforms; switch to aligned stores. |
| src/layer/x86/convolution_3x3_winograd_int8.h | Align tmp int buffers used for Winograd int8 output transforms; switch to aligned stores. |
| src/layer/x86/convolution_3x3_winograd_bf16s.h | Align tmp buffers used for bf16s Winograd output transforms; switch to aligned stores. |
No description provided.