
[Question] Inference time speed up or not? #31

Closed

tuanmanh1410 opened this issue Aug 30, 2021 · 1 comment

Comments

@tuanmanh1410

Thank you for sharing this project and paper.
I'm using GPipe for PyTorch to measure inference time on the same test dataset, comparing against a single GPU as the baseline.
1/ Inference with GPipe seems slower than on a single GPU. Is GPipe therefore suited to training large models rather than to speeding up inference? Please correct me if I'm wrong.
2/ Does the GPipe library measure the communication latency between GPUs when intermediate data is transmitted from one GPU to the next?

Thank you

@sublee
Contributor

sublee commented Aug 31, 2021

The purpose of GPipe is to:

  • Train or evaluate a large model which cannot be placed on a single GPU.
  • Reduce GPU idle time during model parallelism by pipelining. Some idle time still remains; we call it the "bubble".

If your model isn't large enough to need model parallelism, you don't need GPipe: the micro-batch pipeline, its bubble, and the copies between GPUs only add overhead, which is why inference can end up slower than on a single GPU.
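For illustration, here is a minimal sketch of wrapping a model with torchgpipe; the layer sizes, balance, and chunk count are arbitrary, and it assumes two visible GPUs:

```python
import torch
from torch import nn
from torchgpipe import GPipe

# GPipe requires an nn.Sequential so it can be cut into partitions.
model = nn.Sequential(
    nn.Linear(1024, 4096), nn.ReLU(),
    nn.Linear(4096, 1024), nn.ReLU(),
)

# Two partitions of two layers each, one per GPU; each mini-batch is split
# into 8 micro-batches that flow through the pipeline. If the model already
# fits on one GPU, the bubble and the copies between partitions only add
# overhead, so inference can be slower than the single-GPU baseline.
model = GPipe(model, balance=[2, 2], chunks=8)

x = torch.rand(64, 1024, device=model.devices[0])  # input goes to the first partition
with torch.no_grad():
    y = model(x)                                   # output comes from the last partition
```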

For more details, see:

Also, GPipe itself does not provide latency measurement. We used NVIDIA Nsight Systems to profile and optimize its communication cost.
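If a full Nsight Systems trace is more than you need, a rough wall-clock measurement of a single device-to-device copy (the kind GPipe performs between adjacent partitions) can be taken by hand; this sketch assumes at least two GPUs and an arbitrary tensor size:

```python
import time
import torch

src, dst = torch.device('cuda:0'), torch.device('cuda:1')
x = torch.rand(8, 1024, 1024, device=src)  # ~32 MB of float32 activations

x.to(dst)                        # warm-up so lazy CUDA initialization is not timed
torch.cuda.synchronize(src)
torch.cuda.synchronize(dst)

t0 = time.perf_counter()
y = x.to(dst)                    # device-to-device transfer, like a partition boundary
torch.cuda.synchronize(dst)
t1 = time.perf_counter()

print(f'cuda:0 -> cuda:1 copy took {(t1 - t0) * 1e3:.3f} ms')
```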
