x86 concat slice flatten reshape crop padding packing support fp16 bf16 storage #6593
nihui merged 6 commits into Tencent:master
Codecov Report
❌ Patch coverage is
Additional details and impacted files: coverage diff

| | master | #6593 | +/- |
|---|---|---|---|
| Coverage | 92.82% | 92.54% | -0.28% |
| Files | 849 | 736 | -113 |
| Lines | 268474 | 251883 | -16591 |
| Hits | 249200 | 233111 | -16089 |
| Misses | 19274 | 18772 | -502 |
Pull request overview
This PR adds fp16 (half-precision float) and bf16 (bfloat16) storage support to several x86 layer implementations: concat, slice, flatten, reshape, crop, padding, and packing. These are data-movement layers that don't perform arithmetic computation, so they only need to handle memory copying and repacking of 16-bit data elements.
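To illustrate why data-movement layers need no arithmetic on 16-bit storage, here is a minimal sketch (not the PR's code). It assumes bf16 values are held as raw `uint16_t` bit patterns; `float_to_bf16_trunc`, `bf16_to_float`, and `concat_16bit` are hypothetical names introduced only for this example, and the truncating conversion is one possible choice of rounding.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: keep the upper 16 bits of an IEEE-754 float
// (truncating conversion to bf16; assumed here for illustration only).
static uint16_t float_to_bf16_trunc(float f)
{
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return static_cast<uint16_t>(bits >> 16);
}

// Hypothetical helper: widen a bf16 bit pattern back to float.
static float bf16_to_float(uint16_t h)
{
    uint32_t bits = static_cast<uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Concatenate two 1-D buffers of 16-bit storage.
// The bits are never interpreted, so the same routine serves bf16 and fp16.
static std::vector<uint16_t> concat_16bit(const std::vector<uint16_t>& a,
                                          const std::vector<uint16_t>& b)
{
    std::vector<uint16_t> out(a.size() + b.size());
    if (!a.empty())
        std::memcpy(out.data(), a.data(), a.size() * sizeof(uint16_t));
    if (!b.empty())
        std::memcpy(out.data() + a.size(), b.data(), b.size() * sizeof(uint16_t));
    return out;
}
```

The conversion helpers are only needed at the boundary (for example, to prepare a constant); the concat itself is a plain byte copy.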
Changes:
- Add `forward_bf16s_fp16s` methods to concat, slice, flatten, reshape, padding, and packing x86 layers, using `unsigned short` as the underlying data type for 16-bit storage
- Add new bf16s/fp16s-specific padding helper headers (`padding_pack4_bf16s_fp16s.h`, `padding_pack8_bf16s_fp16s.h`, `padding_pack16_bf16s_fp16s.h`) and crop helper functions using appropriately-sized SIMD or integer operations
- Enable `support_fp16_storage` (via `cpu_support_x86_f16c()`) and `support_bf16_storage` in constructors of all affected layers, with dispatch based on `elembits() == 16`
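The dispatch described in the last item can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's code: `TensorDesc`, `Path`, and `select_path` are hypothetical stand-ins, and the `elembits()` formula (bytes per packed element times 8, divided by the pack width) is assumed rather than quoted from the library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a tensor descriptor; this is not the real ncnn::Mat.
struct TensorDesc
{
    size_t elemsize; // bytes per packed element
    int elempack;    // number of scalars stored in one packed element

    // Bits per scalar; assumed formula for this sketch.
    int elembits() const
    {
        return elempack ? static_cast<int>(elemsize * 8) / elempack : 0;
    }
};

enum class Path
{
    Fp32,
    Bf16sFp16s
};

// Pick the 16-bit storage path only when it is enabled and the input
// really holds 16-bit scalars; otherwise fall back to the fp32 path.
static Path select_path(const TensorDesc& t, bool support_16bit_storage)
{
    if (support_16bit_storage && t.elembits() == 16)
        return Path::Bf16sFp16s;
    return Path::Fp32;
}
```

For example, a pack8 tensor with 16 bytes per packed element has 16-bit scalars and would take the shared bf16/fp16 path, while a pack4 tensor with the same element size holds 32-bit scalars and would not.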
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/layer/x86/concat_x86.h | Add forward_bf16s_fp16s declaration |
| src/layer/x86/concat_x86.cpp | Add bf16/fp16 concat implementation with element repacking for all dims/axes |
| src/layer/x86/slice_x86.h | Add forward_bf16s_fp16s declaration |
| src/layer/x86/slice_x86.cpp | Add bf16/fp16 slice implementation with element repacking for all dims/axes |
| src/layer/x86/flatten_x86.h | Add forward_bf16s_fp16s declaration |
| src/layer/x86/flatten_x86.cpp | Add bf16/fp16 flatten implementation with deinterleaving |
| src/layer/x86/reshape_x86.h | Add forward_bf16s_fp16s declaration |
| src/layer/x86/reshape_x86.cpp | Add bf16/fp16 reshape implementation with flatten + repack |
| src/layer/x86/padding_x86.h | Add forward_bf16s_fp16s, create_pipeline/destroy_pipeline, and bf16/fp16 member fields |
| src/layer/x86/padding_x86.cpp | Add bf16/fp16 padding with pipeline for value/per-channel conversion |
| src/layer/x86/padding_pack4_bf16s_fp16s.h | New: pack4 constant/replicate/reflect padding for bf16/fp16 |
| src/layer/x86/padding_pack8_bf16s_fp16s.h | New: pack8 constant/replicate/reflect padding for bf16/fp16 |
| src/layer/x86/padding_pack16_bf16s_fp16s.h | New: pack16 constant/replicate/reflect padding for bf16/fp16 |
| src/layer/x86/packing_x86.h | Add forward_bf16s_fp16s declaration |
| src/layer/x86/packing_x86.cpp | Add bf16/fp16 packing for all elempack combinations (1↔4↔8↔16) |
| src/layer/x86/crop_x86.h | Copyright update |
| src/layer/x86/crop_x86.cpp | Add bf16/fp16 crop helpers and dispatch by elemsize within existing forward methods |
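As a rough illustration of the elempack repacking listed for `packing_x86.cpp`, the sketch below converts a 2-D buffer of 16-bit elements between pack1 and pack4. It is a scalar reference under an assumed layout (four consecutive rows interleaved lane by lane), not the PR's SIMD code; `pack1to4_rows` and `pack4to1_rows` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// pack1 -> pack4 along the row dimension of an h x w buffer.
// Assumed layout: output row i holds w packed elements, and lane k of
// packed element j comes from source row (i * 4 + k), column j.
static std::vector<uint16_t> pack1to4_rows(const std::vector<uint16_t>& src,
                                           size_t h, size_t w)
{
    assert(h % 4 == 0);
    assert(src.size() == h * w);
    std::vector<uint16_t> dst(src.size());
    for (size_t i = 0; i < h / 4; i++)
        for (size_t j = 0; j < w; j++)
            for (size_t k = 0; k < 4; k++)
                dst[(i * w + j) * 4 + k] = src[(i * 4 + k) * w + j];
    return dst;
}

// pack4 -> pack1: the exact inverse of the mapping above.
// h is the unpacked row count (a multiple of 4).
static std::vector<uint16_t> pack4to1_rows(const std::vector<uint16_t>& src,
                                           size_t h, size_t w)
{
    assert(h % 4 == 0);
    assert(src.size() == h * w);
    std::vector<uint16_t> dst(src.size());
    for (size_t i = 0; i < h / 4; i++)
        for (size_t j = 0; j < w; j++)
            for (size_t k = 0; k < 4; k++)
                dst[(i * 4 + k) * w + j] = src[(i * w + j) * 4 + k];
    return dst;
}
```

Because only 16-bit bit patterns are moved, the same routine would apply to bf16 and fp16 data; wider pack sizes follow the same pattern with 8 or 16 lanes.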
No description provided.