some activation opt on x86 by futz12 · Pull Request #6604 · Tencent/ncnn

futz12 · 2026-03-17T09:56:21Z

support erf simd on x86
support gelu normal mode on x86

codecov-commenter · 2026-03-17T11:15:30Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.47%. Comparing base (7237643) to head (b71ab52).

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6604      +/-   ##
==========================================
+ Coverage   93.41%   93.47%   +0.05%     
==========================================
  Files         868      869       +1     
  Lines      275540   274712     -828     
==========================================
- Hits       257391   256776     -615     
+ Misses      18149    17936     -213


Copilot

Pull request overview

This PR adds SIMD-optimized erf function implementations for x86 (SSE, AVX, AVX512) and uses them to support the normal (non-fast) GELU activation mode with SIMD packing on x86, removing the previous fallback to scalar-only processing for normal GELU.

Changes:

- Added `erf_ps`, `erf256_ps`, and `erf512_ps` SIMD erf approximations in the respective mathfun headers
- Added new `Erf_x86` layer with SIMD-accelerated `forward_inplace`
- Extended `GELU_x86` to support normal mode (erf-based) with SIMD, removing the `support_packing = false` guard
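
For readers unfamiliar with the two GELU modes mentioned above, here is a minimal scalar sketch of the math, not the PR's SIMD code. The function names `gelu_normal_ref` and `gelu_fast_ref` are hypothetical, and the tanh-based formula is the commonly used approximation, assumed here to correspond to the fast mode.

```cpp
#include <cassert>
#include <cmath>

// Normal (erf-based) GELU: 0.5 * x * (1 + erf(x / sqrt(2))).
// Hypothetical reference helper, not a function from ncnn.
inline float gelu_normal_ref(float x)
{
    const float inv_sqrt2 = 0.70710678f; // 1 / sqrt(2)
    return 0.5f * x * (1.0f + std::erf(x * inv_sqrt2));
}

// Fast GELU: the widely used tanh approximation (assumed to match the fast mode).
inline float gelu_fast_ref(float x)
{
    const float k = 0.79788456f; // sqrt(2 / pi)
    return 0.5f * x * (1.0f + std::tanh(k * (x + 0.044715f * x * x * x)));
}
```

The two variants agree closely but not exactly, which is why a vectorized erf is needed before the normal mode can run on packed data instead of falling back to scalar code.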

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/layer/x86/sse_mathfun.h | Added SSE `erf_ps` polynomial approximation |
| src/layer/x86/avx_mathfun.h | Added AVX `erf256_ps` polynomial approximation |
| src/layer/x86/avx512_mathfun.h | Added AVX512 `erf512_ps` polynomial approximation |
| src/layer/x86/erf_x86.h | New `Erf_x86` layer header |
| src/layer/x86/erf_x86.cpp | New `Erf_x86` layer with SIMD `forward_inplace` |
| src/layer/x86/gelu_x86.cpp | Added normal GELU mode SIMD paths alongside existing fast GELU |
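
To illustrate what a polynomial erf approximation with an in-place, block-wise loop can look like, here is a portable sketch. It uses the classic Abramowitz and Stegun formula 7.1.26, which is an assumption chosen for illustration; the coefficients and structure actually used by `erf_ps` in this PR are not shown on this page and may differ. `erf_poly_ref` and `erf_inplace_ref` are hypothetical names, and the 4-wide inner loop only mimics an SSE lane layout in plain C++.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Branch-free erf approximation (Abramowitz-Stegun 7.1.26).
// Works on |x| and restores the sign at the end, a shape that maps well to SIMD.
inline float erf_poly_ref(float x)
{
    const float p = 0.3275911f;
    const float a1 = 0.254829592f;
    const float a2 = -0.284496736f;
    const float a3 = 1.421413741f;
    const float a4 = -1.453152027f;
    const float a5 = 1.061405429f;

    const float ax = std::fabs(x);
    const float t = 1.0f / (1.0f + p * ax);
    // Horner evaluation of a1*t + a2*t^2 + ... + a5*t^5
    const float poly = ((((a5 * t + a4) * t + a3) * t + a2) * t + a1) * t;
    const float y = 1.0f - poly * std::exp(-ax * ax);
    return std::copysign(y, x);
}

// In-place erf over a buffer: blocks of 4 (stand-in for one SSE register),
// followed by a scalar tail for the remaining elements.
inline void erf_inplace_ref(float* ptr, std::size_t size)
{
    std::size_t i = 0;
    for (; i + 4 <= size; i += 4)
    {
        for (std::size_t lane = 0; lane < 4; lane++)
            ptr[i + lane] = erf_poly_ref(ptr[i + lane]);
    }
    for (; i < size; i++)
        ptr[i] = erf_poly_ref(ptr[i]);
}
```

In a real SSE, AVX, or AVX512 path the inner lane loop would be replaced by a single call on a vector register, with wider blocks tried first and narrower ones handling the remainder.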

nihui · 2026-03-18T00:32:20Z

Thanks for your contribution!

support erf simd on x86

b71ab52

support gelu normal mode on x86

github-actions bot added the x86 label Mar 17, 2026

nihui requested a review from Copilot March 17, 2026 11:43

Copilot started reviewing on behalf of nihui March 17, 2026 11:43

Copilot AI reviewed Mar 17, 2026

nihui merged commit 2eca3bc into Tencent:master Mar 18, 2026
87 of 88 checks passed

chenglimin pushed a commit to chenglimin/ncnn that referenced this pull request Apr 1, 2026

x86 erf and gelu optimization (Tencent#6604)

039f0c0

nihui merged 1 commit into Tencent:master from futz12:some-activation-opt-on-x86

4 participants
