
[Question] Inference time speed up or not? #31

Closed

tuanmanh1410 opened this issue Aug 30, 2021 · 1 comment

Comments

@tuanmanh1410

Thank you for sharing this project and paper.
I'm using GPipe for PyTorch to measure inference time on the same test dataset, comparing against a single GPU as the baseline.
1/ Inference with GPipe seems slower than on a single GPU. Is GPipe therefore suited to training large models rather than to speeding up inference? Please correct me if I'm wrong.
2/ Does the GPipe library measure the communication latency between GPUs when intermediate data is transmitted from one GPU to the next?

Thank you

@sublee
Contributor

sublee commented Aug 31, 2021

The purpose of GPipe is to:

  • Train or evaluate a large model which cannot be placed on a single GPU.
  • Reduce GPU idle time during model parallelism by pipelining. Some idle time still remains; we call it the "bubble".

If your model isn't large enough to need model parallelism, you don't need GPipe: the micro-batch pipeline, its bubble, and the copies between GPUs only add overhead, which is why inference can end up slower than on a single GPU.
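For illustration, here is a minimal sketch of wrapping a model with torchgpipe; the layer sizes, balance, and chunk count are arbitrary, and it assumes two visible GPUs:

```python
import torch
from torch import nn
from torchgpipe import GPipe

# GPipe requires an nn.Sequential so it can be cut into partitions.
model = nn.Sequential(
    nn.Linear(1024, 4096), nn.ReLU(),
    nn.Linear(4096, 1024), nn.ReLU(),
)

# Two partitions of two layers each, one per GPU; each mini-batch is split
# into 8 micro-batches that flow through the pipeline. If the model already
# fits on one GPU, the bubble and the copies between partitions only add
# overhead, so inference can be slower than the single-GPU baseline.
model = GPipe(model, balance=[2, 2], chunks=8)

x = torch.rand(64, 1024, device=model.devices[0])  # input goes to the first partition
with torch.no_grad():
    y = model(x)                                   # output comes from the last partition
```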

For more details, see:

Also, GPipe itself does not provide latency measurement. We used NVIDIA Nsight Systems to profile and optimize its communication cost.
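If a full Nsight Systems trace is more than you need, a rough wall-clock measurement of a single device-to-device copy (the kind GPipe performs between adjacent partitions) can be taken by hand; this sketch assumes at least two GPUs and an arbitrary tensor size:

```python
import time
import torch

src, dst = torch.device('cuda:0'), torch.device('cuda:1')
x = torch.rand(8, 1024, 1024, device=src)  # ~32 MB of float32 activations

x.to(dst)                        # warm-up so lazy CUDA initialization is not timed
torch.cuda.synchronize(src)
torch.cuda.synchronize(dst)

t0 = time.perf_counter()
y = x.to(dst)                    # device-to-device transfer, like a partition boundary
torch.cuda.synchronize(dst)
t1 = time.perf_counter()

print(f'cuda:0 -> cuda:1 copy took {(t1 - t0) * 1e3:.3f} ms')
```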
