
x86 concat slice flatten reshape crop padding packing support fp16 bf16 storage#6593

Merged
nihui merged 6 commits into Tencent:master from nihui:concat-slice-x86-f16
Mar 10, 2026

Conversation

@nihui (Member) commented Mar 10, 2026

No description provided.

@github-actions github-actions bot added the x86 label Mar 10, 2026
@tencent-adm (Member)

CLA assistant check
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@nihui nihui requested a review from Copilot March 10, 2026 08:24
@codecov-commenter commented Mar 10, 2026

Codecov Report

❌ Patch coverage is 99.31996% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.54%. Comparing base (0137512) to head (05eab95).
⚠️ Report is 2 commits behind head on master.

Files with missing lines | Patch % | Lines
src/layer/x86/padding_x86.cpp | 94.44% | 14 Missing ⚠️
src/layer/x86/packing_x86.cpp | 99.30% | 5 Missing ⚠️
src/layer/x86/reshape_x86.cpp | 99.64% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6593      +/-   ##
==========================================
- Coverage   92.82%   92.54%   -0.28%     
==========================================
  Files         849      736     -113     
  Lines      268474   251883   -16591     
==========================================
- Hits       249200   233111   -16089     
+ Misses      19274    18772     -502     

☔ View full report in Codecov by Sentry.

Copilot AI (Contributor) left a comment

Pull request overview

This PR adds fp16 (half-precision float) and bf16 (bfloat16) storage support to several x86 layer implementations: concat, slice, flatten, reshape, crop, padding, and packing. These are data-movement layers that don't perform arithmetic computation, so they only need to handle memory copying and repacking of 16-bit data elements.
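The key idea can be sketched in a few lines. Because these layers never interpret the values, fp16 and bf16 elements can both be treated as opaque 16-bit payloads; a concat, for example, reduces to byte copies over `unsigned short` buffers. This is an illustrative stand-alone sketch (the `concat16` helper is hypothetical, not part of the ncnn API):

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical illustration (not the ncnn API): data-movement layers can
// treat fp16 and bf16 elements as opaque 16-bit values. Concatenating two
// buffers only moves bytes, so one routine serves both storage formats.
std::vector<unsigned short> concat16(const std::vector<unsigned short>& a,
                                     const std::vector<unsigned short>& b)
{
    std::vector<unsigned short> out(a.size() + b.size());
    if (!a.empty())
        std::memcpy(out.data(), a.data(), a.size() * sizeof(unsigned short));
    if (!b.empty())
        std::memcpy(out.data() + a.size(), b.data(), b.size() * sizeof(unsigned short));
    return out;
}
```

The same bit patterns come out that went in, whether they encode fp16 or bf16, which is why a single `forward_bf16s_fp16s` path covers both formats.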

Changes:

  • Add forward_bf16s_fp16s methods to concat, slice, flatten, reshape, padding, and packing x86 layers, using unsigned short as the underlying data type for 16-bit storage
  • Add new bf16s/fp16s-specific padding helper headers (padding_pack4_bf16s_fp16s.h, padding_pack8_bf16s_fp16s.h, padding_pack16_bf16s_fp16s.h) and crop helper functions using appropriately-sized SIMD or integer operations
  • Enable support_fp16_storage (via cpu_support_x86_f16c()) and support_bf16_storage in constructors of all affected layers, with dispatch based on elembits() == 16
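The `elembits() == 16` dispatch in the last bullet can be sketched as follows. In ncnn, a Mat's per-element bit width is `elemsize * 8 / elempack`, so fp32 yields 32 while both fp16 and bf16 storage yield 16 regardless of packing. The helper names below are assumptions for illustration, not the actual layer code:

```cpp
#include <cassert>
#include <cstddef>

// Per-element bit width, mirroring ncnn's Mat::elembits():
// elemsize is bytes per packed group, elempack is elements per group.
inline int elembits(size_t elemsize, int elempack)
{
    return static_cast<int>(elemsize * 8) / elempack;
}

// Hypothetical dispatch predicate: true when a layer's forward() would
// route the tensor to the shared bf16s/fp16s code path.
bool use_bf16s_fp16s_path(size_t elemsize, int elempack)
{
    return elembits(elemsize, elempack) == 16;
}
```

For example, fp16 pack4 storage has `elemsize = 8` and `elempack = 4` (16 bits per element), while fp32 pack4 has `elemsize = 16` and `elempack = 4` (32 bits per element) and takes the ordinary float path.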

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.

File | Description
src/layer/x86/concat_x86.h | Add forward_bf16s_fp16s declaration
src/layer/x86/concat_x86.cpp | Add bf16/fp16 concat implementation with element repacking for all dims/axes
src/layer/x86/slice_x86.h | Add forward_bf16s_fp16s declaration
src/layer/x86/slice_x86.cpp | Add bf16/fp16 slice implementation with element repacking for all dims/axes
src/layer/x86/flatten_x86.h | Add forward_bf16s_fp16s declaration
src/layer/x86/flatten_x86.cpp | Add bf16/fp16 flatten implementation with deinterleaving
src/layer/x86/reshape_x86.h | Add forward_bf16s_fp16s declaration
src/layer/x86/reshape_x86.cpp | Add bf16/fp16 reshape implementation with flatten + repack
src/layer/x86/padding_x86.h | Add forward_bf16s_fp16s, create_pipeline/destroy_pipeline, and bf16/fp16 member fields
src/layer/x86/padding_x86.cpp | Add bf16/fp16 padding with pipeline for value/per-channel conversion
src/layer/x86/padding_pack4_bf16s_fp16s.h | New: pack4 constant/replicate/reflect padding for bf16/fp16
src/layer/x86/padding_pack8_bf16s_fp16s.h | New: pack8 constant/replicate/reflect padding for bf16/fp16
src/layer/x86/padding_pack16_bf16s_fp16s.h | New: pack16 constant/replicate/reflect padding for bf16/fp16
src/layer/x86/packing_x86.h | Add forward_bf16s_fp16s declaration
src/layer/x86/packing_x86.cpp | Add bf16/fp16 packing for all elempack combinations (1↔4↔8↔16)
src/layer/x86/crop_x86.h | Copyright update
src/layer/x86/crop_x86.cpp | Add bf16/fp16 crop helpers and dispatch by elemsize within existing forward methods
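To make the packing entry above concrete, the 1→4 elempack conversion for 16-bit storage amounts to interleaving four planar channels. This is a scalar sketch of the layout transform only (the `pack1to4` helper is hypothetical; the real kernel uses SIMD loads/stores):

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch of elempack 1 -> 4 repacking for 16-bit storage.
// Input layout (planar):      ch0[0..w-1], ch1[0..w-1], ch2[0..w-1], ch3[0..w-1]
// Output layout (interleaved): {ch0[i], ch1[i], ch2[i], ch3[i]} for each i
std::vector<unsigned short> pack1to4(const std::vector<unsigned short>& planar, int w)
{
    assert((int)planar.size() == 4 * w);
    std::vector<unsigned short> packed(4 * w);
    for (int i = 0; i < w; i++)
        for (int c = 0; c < 4; c++)
            packed[i * 4 + c] = planar[c * w + i];
    return packed;
}
```

The inverse (4→1) walks the same index mapping in reverse, and the 8- and 16-wide variants follow the same pattern with a larger group size.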


@nihui nihui merged commit cba96a6 into Tencent:master Mar 10, 2026
66 of 83 checks passed
chenglimin pushed a commit to chenglimin/ncnn that referenced this pull request Apr 1, 2026

4 participants