Remove unnecessary clone of program in C++ Executor.Run #9043

Merged
luotao1 merged 5 commits into PaddlePaddle:develop from core_inference_remove_clone
Mar 19, 2018


@Xreki Xreki commented Mar 14, 2018

The results in #8990 show that cloning the program can be time-consuming. However, the optimization in #8990 does not affect inference, so this PR removes the unnecessary clone of the program in the C++ Executor.Run.

Profile results of image_classification_resnet:
GPU: Tesla K40m, CUDA 8.0, CUDNN v7

| batch_size | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 |
|---|---|---|---|---|---|---|---|---|---|
| 2019/3/9 | 1048.01 | 1081.18 | 1041.19 | 1135.8 | 1222.75 | 1658.94 | 2339.08 | 4478.58 | 8556.82 |
| 2019/03/14, remove clone | 866.048 | 852.3 | 858.021 | 987.86 | 1115.01 | 1556.01 | 2232.87 | 4375.75 | 8443.42 |
| speed up | 1.210106 | 1.268544 | 1.213478 | 1.149758 | 1.096627 | 1.06615 | 1.047567 | 1.0235 | 1.013431 |
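The "speed up" row appears to be the baseline time divided by the time after removing the clone. The following is a minimal sketch that recomputes it from the two timing rows above. The function names `speedups` and `saved` are illustrative only and are not part of the PR, and the time unit is not stated in the source.

```python
# Timings copied from the profile table above (unit not stated in the source).
batch_sizes = [1, 2, 4, 8, 16, 32, 64, 128, 256]
baseline = [1048.01, 1081.18, 1041.19, 1135.8, 1222.75,
            1658.94, 2339.08, 4478.58, 8556.82]
no_clone = [866.048, 852.3, 858.021, 987.86, 1115.01,
            1556.01, 2232.87, 4375.75, 8443.42]


def speedups(before, after):
    """Ratio of the old time to the new time for each batch size."""
    return [b / a for b, a in zip(before, after)]


def saved(before, after):
    """Absolute time saved for each batch size."""
    return [b - a for b, a in zip(before, after)]


if __name__ == "__main__":
    rows = zip(batch_sizes, speedups(baseline, no_clone), saved(baseline, no_clone))
    for size, ratio, delta in rows:
        print(f"batch_size={size:>3}  speed up={ratio:.6f}  saved={delta:.3f}")
```

In these numbers the absolute saving stays between roughly 100 and 230 for every batch size while the ratio shrinks as the batch grows. That would be consistent with the clone being a roughly fixed per-run cost, although the PR itself does not state this.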

@Xreki Xreki changed the title Core inference remove clone Remove clone of program in C++ Executor.Run Mar 14, 2018
@Xreki Xreki changed the title Remove clone of program in C++ Executor.Run Remove unnecessary clone of program in C++ Executor.Run Mar 14, 2018
@Xreki Xreki added the 预测 (Prediction; formerly named Inference, includes C-API prediction issues, etc.) label Mar 14, 2018
@Xreki Xreki requested review from kexinzhao and luotao1 March 16, 2018 09:30

@kexinzhao kexinzhao left a comment

LGTM!

@luotao1 luotao1 merged commit c042137 into PaddlePaddle:develop Mar 19, 2018
@Xreki Xreki added this to Performance Tuning (DONE) in Inference Framework Apr 3, 2018
@Xreki Xreki deleted the core_inference_remove_clone branch November 14, 2018 02:28
Labels
预测 (Prediction; formerly named Inference, includes C-API prediction issues, etc.)
Projects
No open projects
Inference Framework
Performance Tuning (DONE)

3 participants