Optimize the layer_norm operator with AVX intrinsic function #14417

Merged

@yihuaxu
Contributor

commented Nov 15, 2018

Based on the current performance of Layer Norm, this PR implements an AVX-intrinsic version of the operator to speed up its data processing.

Platform: Intel(R) Xeon(R) Gold 6151 CPU @ 3.00GHz / Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
Batch Size: 1
Command: build/paddle/fluid/inference/tests/api/test_analyzer_dam --infer_model=PaddlePaddle/pretrained_models/dam --infer_data=third_party/inference_demo/dam/data.txt --gtest_filter=Analyzer_dam.profile --batch_size=1 --repeat=1 --test_all_data=true --num_threads=1 --use_analysis=false

The following image compares the performance in the different scenarios.
[image: performance comparison]
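
For readers unfamiliar with the technique, here is a minimal sketch of how a single layer-norm row can be vectorized with AVX intrinsics. It is not the code from this PR: the function names `LayerNormRow` and `HorizontalSum` are hypothetical, scale and bias are omitted, and the AVX path is only compiled when the compiler defines `__AVX__` (otherwise the scalar loops do all the work).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

#if defined(__AVX__)
// Adds up the 8 float lanes of an AVX register (hypothetical helper).
static inline float HorizontalSum(__m256 v) {
  __m128 lo = _mm256_castps256_ps128(v);
  __m128 hi = _mm256_extractf128_ps(v, 1);
  __m128 s = _mm_add_ps(lo, hi);
  s = _mm_hadd_ps(s, s);
  s = _mm_hadd_ps(s, s);
  return _mm_cvtss_f32(s);
}
#endif

// Normalizes one row: out[i] = (x[i] - mean) / sqrt(var + eps).
// Hypothetical sketch; scale and bias are intentionally left out.
void LayerNormRow(const float* x, float* out, std::size_t n, float eps) {
  assert(n > 0);

  // Pass 1: mean. AVX handles 8 floats per step, the scalar loop the tail.
  std::size_t i = 0;
  float sum = 0.f;
#if defined(__AVX__)
  __m256 vsum = _mm256_setzero_ps();
  for (; i + 8 <= n; i += 8) {
    vsum = _mm256_add_ps(vsum, _mm256_loadu_ps(x + i));
  }
  sum = HorizontalSum(vsum);
#endif
  for (; i < n; ++i) sum += x[i];
  const float mean = sum / static_cast<float>(n);

  // Pass 2: sum of squared deviations from the mean.
  i = 0;
  float sq = 0.f;
#if defined(__AVX__)
  const __m256 vmean = _mm256_set1_ps(mean);
  __m256 vsq = _mm256_setzero_ps();
  for (; i + 8 <= n; i += 8) {
    __m256 d = _mm256_sub_ps(_mm256_loadu_ps(x + i), vmean);
    vsq = _mm256_add_ps(vsq, _mm256_mul_ps(d, d));
  }
  sq = HorizontalSum(vsq);
#endif
  for (; i < n; ++i) {
    const float d = x[i] - mean;
    sq += d * d;
  }
  const float inv_std = 1.f / std::sqrt(sq / static_cast<float>(n) + eps);

  // Pass 3: write the normalized values.
  i = 0;
#if defined(__AVX__)
  const __m256 vinv = _mm256_set1_ps(inv_std);
  for (; i + 8 <= n; i += 8) {
    __m256 d = _mm256_sub_ps(_mm256_loadu_ps(x + i), vmean);
    _mm256_storeu_ps(out + i, _mm256_mul_ps(d, vinv));
  }
#endif
  for (; i < n; ++i) out[i] = (x[i] - mean) * inv_std;
}
```

Unaligned loads and stores (`_mm256_loadu_ps`, `_mm256_storeu_ps`) are used so the sketch makes no assumption about buffer alignment; the output of each row should have a mean close to 0 and a variance close to 1.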

@yihuaxu yihuaxu force-pushed the yihuaxu:develop_1f4a4343_layer_norm_opt branch from 5a9fc06 to 36c23e1 Nov 15, 2018

yihuaxu added some commits Nov 15, 2018

@yihuaxu yihuaxu force-pushed the yihuaxu:develop_1f4a4343_layer_norm_opt branch from 4424399 to 07c0fd7 Nov 15, 2018

@yihuaxu

Contributor Author

commented Nov 15, 2018

@tensor-tang @luotao1 Please help review this PR.

@yihuaxu yihuaxu force-pushed the yihuaxu:develop_1f4a4343_layer_norm_opt branch from 7ecacd0 to 5023a20 Nov 15, 2018

@luotao1 luotao1 requested a review from tensor-tang Nov 15, 2018

@luotao1 luotao1 added the Intel label Nov 15, 2018

@tensor-tang
Contributor

left a comment

Thanks for your contribution @yihuaxu

The jit_kernel is migrating to JIT code: we now use jitcode to generate the runtime code for different platforms, instead of using intrinsics.

We will let you know when it's ready.

@tensor-tang
Contributor

left a comment

@yihuaxu I just realized that this part of the code probably needs to be commented out on Mac as well; we also run the unit tests on Mac offline.
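
One common way to exclude an intrinsic code path on a given platform is a preprocessor guard. The sketch below is only an illustration of that idea, not the change made in this PR: the macro `SKETCH_USE_AVX` and the functions `ScaleInPlace` and `ActivePath` are hypothetical, and it assumes the compiler defines `__APPLE__` on macOS and `__AVX__` when AVX code generation is enabled.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical switch: use AVX only when the compiler enables it and the
// target is not macOS. Everywhere else the scalar loop is compiled instead.
#if defined(__AVX__) && !defined(__APPLE__)
#define SKETCH_USE_AVX 1
#include <immintrin.h>
#else
#define SKETCH_USE_AVX 0
#endif

// Multiplies every element of data by factor, in place.
void ScaleInPlace(float* data, std::size_t n, float factor) {
  assert(data != nullptr || n == 0);
  std::size_t i = 0;
#if SKETCH_USE_AVX
  const __m256 vfactor = _mm256_set1_ps(factor);
  for (; i + 8 <= n; i += 8) {
    __m256 v = _mm256_loadu_ps(data + i);
    _mm256_storeu_ps(data + i, _mm256_mul_ps(v, vfactor));
  }
#endif
  // Scalar tail, and the only path when the AVX branch is compiled out.
  for (; i < n; ++i) data[i] *= factor;
}

// Reports which path was compiled in; handy when checking a unit test run.
const char* ActivePath() { return SKETCH_USE_AVX ? "avx" : "scalar"; }
```

Because both paths produce the same result, a unit test written against `ScaleInPlace` can run unchanged on platforms where the intrinsic branch is disabled.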

yihuaxu added some commits Nov 19, 2018

@yihuaxu
Contributor Author

left a comment

Remove the remnant code

@tensor-tang tensor-tang merged commit f4c869d into PaddlePaddle:develop Nov 19, 2018

4 checks passed

PR_CI (Paddle): TeamCity build finished
PR_CI_python35 (Paddle): TeamCity build finished
continuous-integration/travis-ci/pr: The Travis CI build passed
license/cla: Contributor License Agreement is signed.