[LayerNorm Optimize x86] AVX512/AVX/SSE intrinsic #4060

Closed
wants to merge 9 commits into from


LRY89757
Contributor

  • Add AVX512/AVX/SSE intrinsics for layernorm
  • Add test cases for elempack == 16
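
For readers unfamiliar with this kind of change, the following is a minimal sketch of layer normalization over one contiguous row of floats using SSE intrinsics, with a scalar tail and a scalar fallback. It is not the code from this PR: the names `hsum_ps` and `layer_norm_inplace` are hypothetical, the sketch only covers the unpacked (elempack == 1) case, and it assumes `gamma` and `beta` are either both null or both point to `n` floats. An AVX or AVX512 variant would follow the same three passes with wider registers.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>

// Horizontal sum of the four lanes of v (hypothetical helper).
static inline float hsum_ps(__m128 v)
{
    __m128 shuf = _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 3, 0, 1)); // swap neighbours
    __m128 sums = _mm_add_ps(v, shuf);                           // pairwise sums
    shuf = _mm_movehl_ps(shuf, sums);                            // move high pair down
    sums = _mm_add_ss(sums, shuf);                               // total in lane 0
    return _mm_cvtss_f32(sums);
}
#endif

// Normalizes ptr[0..n) in place: y = (x - mean) / sqrt(var + eps) * gamma + beta.
// gamma/beta may both be nullptr to skip the affine step (assumption of this sketch).
static void layer_norm_inplace(float* ptr, std::size_t n, const float* gamma, const float* beta, float eps)
{
    assert(n > 0);

    // Pass 1: mean
    std::size_t i = 0;
    float sum = 0.f;
#if defined(__SSE2__)
    __m128 vsum = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4)
        vsum = _mm_add_ps(vsum, _mm_loadu_ps(ptr + i));
    sum = hsum_ps(vsum);
#endif
    for (; i < n; i++) // scalar tail (or whole row without SSE2)
        sum += ptr[i];
    const float mean = sum / n;

    // Pass 2: variance
    i = 0;
    float sqsum = 0.f;
#if defined(__SSE2__)
    const __m128 vmean = _mm_set1_ps(mean);
    __m128 vsq = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4)
    {
        __m128 d = _mm_sub_ps(_mm_loadu_ps(ptr + i), vmean);
        vsq = _mm_add_ps(vsq, _mm_mul_ps(d, d));
    }
    sqsum = hsum_ps(vsq);
#endif
    for (; i < n; i++)
    {
        const float d = ptr[i] - mean;
        sqsum += d * d;
    }
    const float var = sqsum / n;

    // Pass 3: normalize as x * a + b, then apply the optional affine step
    const float a = 1.f / std::sqrt(var + eps);
    const float b = -mean * a;
    i = 0;
#if defined(__SSE2__)
    const __m128 va = _mm_set1_ps(a);
    const __m128 vb = _mm_set1_ps(b);
    for (; i + 4 <= n; i += 4)
    {
        __m128 x = _mm_add_ps(_mm_mul_ps(_mm_loadu_ps(ptr + i), va), vb);
        if (gamma && beta)
            x = _mm_add_ps(_mm_mul_ps(x, _mm_loadu_ps(gamma + i)), _mm_loadu_ps(beta + i));
        _mm_storeu_ps(ptr + i, x);
    }
#endif
    for (; i < n; i++)
    {
        float x = ptr[i] * a + b;
        if (gamma && beta)
            x = x * gamma[i] + beta[i];
        ptr[i] = x;
    }
}
```

Unaligned loads and stores are used so the sketch works on any float buffer; an implementation that controls its own allocation could use aligned variants instead.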

@codecov-commenter

codecov-commenter commented Jul 21, 2022

Codecov Report

Merging #4060 (cf053cc) into master (4f414c1) will decrease coverage by 1.43%.
The diff coverage is 97.35%.

@@            Coverage Diff             @@
##           master    #4060      +/-   ##
==========================================
- Coverage   94.41%   92.98%   -1.44%     
==========================================
  Files         745      743       -2     
  Lines      178496   178516      +20     
==========================================
- Hits       168533   165990    -2543     
- Misses       9963    12526    +2563     
| Impacted Files | Coverage Δ |
|---|---|
| src/layer/x86/layernorm_x86.cpp | 97.35% <97.35%> (ø) |
| src/layer/x86/convolution_2x2_pack8.h | 2.75% <0.00%> (-97.25%) ⬇️ |
| src/layer/x86/deconvolution_pack8.h | 10.76% <0.00%> (-89.24%) ⬇️ |
| src/layer/x86/convolution_sgemm_pack8.h | 14.24% <0.00%> (-85.24%) ⬇️ |
| src/layer/x86/convolution_sgemm_pack4to8.h | 29.16% <0.00%> (-70.84%) ⬇️ |
| src/layer/x86/convolution_pack8.h | 34.42% <0.00%> (-65.58%) ⬇️ |
| src/layer/x86/convolution_pack4to8.h | 42.85% <0.00%> (-55.11%) ⬇️ |
| ...c/layer/x86/convolution_winograd_transform_pack8.h | 54.90% <0.00%> (-45.10%) ⬇️ |
| src/layer/x86/convolution_3x3_pack1to8.h | 39.95% <0.00%> (-40.04%) ⬇️ |
| src/layer/x86/convolution_winograd_dot_pack8.h | 60.24% <0.00%> (-39.16%) ⬇️ |

... and 83 more


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4158e63...cf053cc.

@LRY89757 LRY89757 closed this Jul 24, 2022
@LRY89757 LRY89757 reopened this Jul 24, 2022
@LRY89757
Contributor Author

LRY89757 commented Jul 25, 2022

I will then try to finish merging instancenorm and batchnorm. At the same time, I will also finish the layernorm implementation in my own way.

@nihui
Member

nihui commented Jul 29, 2022

Closing in favor of 03f2ad3.

@nihui nihui closed this Jul 29, 2022