x86: add BinaryOp_x86 bf16s storage support with avx512bf16 dispatch #6591
nihui merged 6 commits into Tencent:master
Codecov Report
❌ Patch coverage is
Additional details and impacted files
| Coverage Diff | master | #6591 | +/- |
|---|---|---|---|
| Coverage | 92.71% | 92.83% | +0.11% |
| Files | 846 | 849 | +3 |
| Lines | 267472 | 268049 | +577 |
| Hits | 247980 | 248834 | +854 |
| Misses | 19492 | 19215 | -277 |
Pull request overview
Adds BF16 (bf16s) storage support to the x86 BinaryOp layer, including an AVX512-BF16 optimized implementation with runtime CPU dispatch.
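For readers unfamiliar with the term, "bf16 storage" generally means tensors are held as 16-bit bfloat16 values while arithmetic still happens in float32. A minimal sketch of the round trip follows, assuming the simple truncating conversion (keep the upper 16 bits of the float32 bit pattern). The helper names are illustrative and not taken from this PR; hardware conversions such as `vcvtneps2bf16` round to nearest even rather than truncating.

```cpp
#include <stdint.h>
#include <string.h>

// Illustrative helper (name is an assumption, not from the PR):
// bf16 is the upper 16 bits of an IEEE-754 float32, so a truncating
// conversion simply drops the low 16 mantissa bits.
static inline uint16_t f32_to_bf16_trunc(float v)
{
    uint32_t u;
    memcpy(&u, &v, sizeof(u)); // bit-cast without aliasing issues
    return (uint16_t)(u >> 16);
}

// Widening is exact: place the 16 stored bits back in the high half
// and fill the low half with zeros.
static inline float bf16_to_f32(uint16_t b)
{
    uint32_t u = (uint32_t)b << 16;
    float v;
    memcpy(&v, &u, sizeof(v));
    return v;
}
```

Values whose float32 mantissa fits in 7 bits (for example 1.0 or -2.5) survive the round trip unchanged; anything finer loses its low bits.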
Changes:
- Refactors BinaryOp SIMD functors into a shared header (`binaryop_x86_functor.h`) for reuse across translation units.
- Adds bf16s compute paths for BinaryOp (including broadcasting) and wires them into `BinaryOp_x86` `forward`/`forward_inplace`.
- Introduces an AVX512BF16-dispatched bf16s vector kernel implementation (`binaryop_x86_avx512bf16.cpp`).
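The functor-plus-bf16s structure described in these changes can be sketched in scalar form as below. This is only an illustration under stated assumptions: the functor names, kernel names, and signatures are hypothetical, the real header holds SIMD functors, and the conversion shown is plain truncation. The point is the shape: one templated kernel per access pattern (elementwise, scalar broadcast), parameterized by an operation functor, widening bf16 to float32 for the arithmetic and narrowing on store.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Illustrative conversions (truncating narrow, exact widen).
static inline float bf16_to_f32(uint16_t b)
{
    uint32_t u = (uint32_t)b << 16;
    float v;
    memcpy(&v, &u, sizeof(v));
    return v;
}

static inline uint16_t f32_to_bf16(float v)
{
    uint32_t u;
    memcpy(&u, &v, sizeof(u));
    return (uint16_t)(u >> 16);
}

// Hypothetical scalar functors standing in for the shared SIMD functors.
struct op_add
{
    float operator()(float x, float y) const { return x + y; }
};

struct op_mul
{
    float operator()(float x, float y) const { return x * y; }
};

// Elementwise kernel: c[i] = op(a[i], b[i]) over bf16 storage.
template<typename Op>
void binary_op_bf16s(const uint16_t* a, const uint16_t* b, uint16_t* c, size_t n)
{
    Op op;
    for (size_t i = 0; i < n; i++)
        c[i] = f32_to_bf16(op(bf16_to_f32(a[i]), bf16_to_f32(b[i])));
}

// Broadcast kernel: a single bf16 scalar b is applied to every a[i].
template<typename Op>
void binary_op_bf16s_scalar_b(const uint16_t* a, uint16_t b, uint16_t* c, size_t n)
{
    Op op;
    const float fb = bf16_to_f32(b); // widen the scalar once, outside the loop
    for (size_t i = 0; i < n; i++)
        c[i] = f32_to_bf16(op(bf16_to_f32(a[i]), fb));
}
```

Keeping the functors in a header lets a separately compiled translation unit (such as one built with AVX512-BF16 flags) instantiate the same templates without duplicating the operation definitions.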
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/layer/x86/binaryop_x86_functor.h | New shared SIMD functor definitions for BinaryOp operations. |
| src/layer/x86/binaryop_x86_bf16s.h | New bf16s vector/broadcast kernels used by BinaryOp bf16 storage path. |
| src/layer/x86/binaryop_x86_avx512bf16.cpp | AVX512BF16-optimized bf16s vector kernel entrypoint for runtime dispatch. |
| src/layer/x86/binaryop_x86.h | Declares bf16s forward helpers on BinaryOp_x86 (guarded by NCNN_BF16). |
| src/layer/x86/binaryop_x86.cpp | Wires bf16s support, functor refactor, and AVX512BF16 runtime dispatch into x86 BinaryOp. |
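The split between `binaryop_x86.cpp` and `binaryop_x86_avx512bf16.cpp` suggests the usual runtime-dispatch pattern: the specialized kernel lives in its own translation unit built with the extra ISA flags, and the caller picks it only when the CPU reports support. A minimal sketch of that selection step, with hypothetical names throughout; the "avx512bf16" kernel here is a plain stand-in that computes the same result, and the feature flag is passed in rather than probed, since the PR's actual query is not shown on this page.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static inline float bf16_to_f32(uint16_t b)
{
    uint32_t u = (uint32_t)b << 16;
    float v;
    memcpy(&v, &u, sizeof(v));
    return v;
}

static inline uint16_t f32_to_bf16(float v)
{
    uint32_t u;
    memcpy(&u, &v, sizeof(u));
    return (uint16_t)(u >> 16);
}

// Common signature shared by every variant of the kernel (hypothetical).
typedef void (*add_bf16s_fn)(const uint16_t* a, const uint16_t* b, uint16_t* c, size_t n);

// Baseline path, always available.
static void add_bf16s_generic(const uint16_t* a, const uint16_t* b, uint16_t* c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] = f32_to_bf16(bf16_to_f32(a[i]) + bf16_to_f32(b[i]));
}

// Stand-in for the kernel that would be compiled in a separate translation
// unit with AVX512-BF16 enabled. It forwards to the baseline here so the
// sketch runs on any machine.
static void add_bf16s_avx512bf16_stub(const uint16_t* a, const uint16_t* b, uint16_t* c, size_t n)
{
    add_bf16s_generic(a, b, c, n);
}

// Selection step: has_avx512bf16 is assumed to come from a CPU feature
// query performed by the caller.
static add_bf16s_fn select_add_bf16s(bool has_avx512bf16)
{
    return has_avx512bf16 ? add_bf16s_avx512bf16_stub : add_bf16s_generic;
}
```

A caller would evaluate the feature query once, keep the returned pointer, and invoke it for every tensor, so unsupported CPUs never execute the specialized instructions.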
No description provided.