This repository has been archived by the owner on Oct 31, 2023. It is now read-only.
I got the result that polybeast is slower than monobeast:
monobeast speed is about 10,000 SPS.
polybeast speed is about 3,000 SPS.
I have checked the GPU and it works fine. monobeast uses 100% of every CPU core, but polybeast uses only 50%.
How can I speed up polybeast?
To start with, I think a batch size of 4 isn't very large. The reason your CPUs are less busy with polybeast is that the actor forward passes ("inference") happen on the GPU in that case. Options include:
Increase batch size
Use different GPUs for inference and learning
Potentially increase the number of parallel inference and learner threads
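The reason the CPUs sit partly idle is that in polybeast each actor blocks while a central server batches its observation together with others and runs one forward pass on the GPU. A minimal stdlib sketch of that dynamic-batching pattern follows; all names (`fake_policy`, `inference_loop`, `BATCH_SIZE`, `TIMEOUT`) are hypothetical stand-ins, not torchbeast's actual API, which is implemented in libtorchbeast:

```python
import queue
import threading

BATCH_SIZE = 4   # requests gathered into one forward pass
TIMEOUT = 0.05   # seconds to wait before running a partial batch

request_q = queue.Queue()

def fake_policy(batch):
    # Stand-in for a batched GPU forward pass: one call serves many actors.
    return [obs * 2 for obs in batch]

def inference_loop(stop):
    while not stop.is_set():
        batch, replies = [], []
        try:
            obs, reply = request_q.get(timeout=TIMEOUT)
        except queue.Empty:
            continue
        batch.append(obs)
        replies.append(reply)
        # Gather more requests up to BATCH_SIZE, or give up after TIMEOUT
        # and run a partial batch rather than stall the waiting actors.
        while len(batch) < BATCH_SIZE:
            try:
                obs, reply = request_q.get(timeout=TIMEOUT)
            except queue.Empty:
                break
            batch.append(obs)
            replies.append(reply)
        for action, reply in zip(fake_policy(batch), replies):
            reply.put(action)

def actor(obs, out, i):
    reply = queue.Queue(maxsize=1)
    request_q.put((obs, reply))
    out[i] = reply.get()  # the actor blocks here; its CPU core idles

stop = threading.Event()
server = threading.Thread(target=inference_loop, args=(stop,), daemon=True)
server.start()

out = [None] * 8
actors = [threading.Thread(target=actor, args=(i, out, i)) for i in range(8)]
for a in actors:
    a.start()
for a in actors:
    a.join()
stop.set()
print(out)  # [0, 2, 4, 6, 8, 10, 12, 14]
```

This also shows why a larger batch size helps: the per-batch cost (kernel launch, Python overhead) is amortized over more actors, at the price of each actor waiting slightly longer for its action.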
I'm having a similar issue on an Ubuntu machine with 32 CPU cores and 4 V100 GPUs. With monobeast, it uses only 1 GPU and full CPU power, and the frame rate is ~5,000 SPS. With polybeast, I set batch_size=16 and num_inference/learner_threads=8, but the frame rate is only ~300 SPS, and only 2 GPUs are active. Were you able to speed up polybeast? Can you share some insight with me? Thanks!
I built the CUDA docker container as described and tested mono and poly with almost the same parameters.