
When using libtorch, GPU decoding is slower than CPU. #1643

Closed
hms1205 opened this issue Jan 3, 2023 · 7 comments

Comments

hms1205 commented Jan 3, 2023

When decoding on the GPU, GPU memory is allocated immediately, but GPU utilization only rises after a long delay.
For example, when decoding 600 utterances, progress is very slow until roughly the 100th one, and then it speeds up from the point where GPU utilization rises.
Increasing the number of threads in decoder_main.cc makes it faster, but I would like to fix the problem in the single-threaded case.
What should I do?

CPU: 24 cores
GPU: RTX A5000 (24 GB) x 2
OS: Ubuntu 20.04.4

hms1205 closed this as not planned Jan 4, 2023
hms1205 reopened this Jan 4, 2023
hms1205 (Author) commented Jan 4, 2023

The same symptom occurs when testing on the CPU.
(Speed increases after a certain number of utterances have been processed.)
There is a big speed difference in the forward_attention_decoder API, but I don't know whether it's an attention problem or a libtorch problem.

robin1001 (Collaborator) commented:

The slow decoding at the beginning may be caused by libtorch; a libtorch model requires warmup.
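
For reference, here is a minimal warmup sketch in libtorch C++. The model path "final.zip", the (batch, time, feature) input shape, and the plain forward() entry point are all assumptions for illustration — wenet's exported model actually exposes scripted methods such as forward_attention_decoder — so adapt the call to your model:

```cpp
#include <torch/script.h>
#include <torch/torch.h>

// Run a few dummy forward passes so libtorch can profile the graph and
// select kernels before real decoding begins.
void WarmupModel(torch::jit::script::Module& module, int num_runs) {
  torch::NoGradGuard no_grad;
  for (int i = 0; i < num_runs; ++i) {
    // Hypothetical input: batch size 1, 200 frames of 80-dim features.
    auto feats = torch::randn({1, 200, 80}).to(at::kCUDA);
    module.forward({feats});
  }
  // Block until all queued CUDA kernels finish, so warmup time is not
  // attributed to the first real utterance.
  torch::cuda::synchronize();
}

int main() {
  auto module = torch::jit::load("final.zip", at::kCUDA);  // assumed path
  module.eval();
  WarmupModel(module, 100);  // ~100 runs, matching the symptom above
  return 0;
}
```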

hms1205 (Author) commented Jan 4, 2023

> The slow decoding at the beginning may be caused by libtorch; a libtorch model requires warmup.

What number of warmup runs should I specify?
Should I find it by testing?

robin1001 (Collaborator) commented:

Seems 100 is okay in your testing.

hms1205 (Author) commented Jan 4, 2023

> Seems 100 is okay in your testing.

Thank you. I'll give it a try.

hms1205 (Author) commented Jan 4, 2023

> The slow decoding at the beginning may be caused by libtorch; a libtorch model requires warmup.

Sorry, one additional question: why does libtorch need warmup?

xingchensong (Member) commented:

> The slow decoding at the beginning may be caused by libtorch; a libtorch model requires warmup.

> Sorry, one additional question: why does libtorch need warmup?

It searches for the best path to execute the forward calculation: during the first runs, the JIT executor profiles input shapes and optimizes the graph, which is why the early calls are slow.
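
If warmup time matters more than peak throughput, libtorch's profiling executor can also be toggled off. The following is a hedged sketch: these JIT switches exist in recent libtorch releases, but their defaults and effects vary across versions, so verify them against the graph_executor.h header of your libtorch build before relying on them:

```cpp
#include <torch/script.h>

// Sketch: turn off the profiling executor and graph re-optimization so
// the first forward calls skip the "search for the best path" phase,
// trading peak throughput for a shorter warmup.
void ReduceJitWarmup() {
  torch::jit::getProfilingMode() = false;       // no shape profiling
  torch::jit::getExecutorMode() = false;        // use the legacy executor
  torch::jit::setGraphExecutorOptimize(false);  // no graph re-optimization
}
```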
