
multi gpu inference #19

Closed
chikiuso opened this issue Mar 31, 2020 · 7 comments
@chikiuso

Thanks for your great work. I tried to run inference on a machine with multiple GPUs; it detects all of them (I also set n_gpus to the number of GPUs). The dlib part works fine, but after it finishes, the run hangs right after printing "Model Created" and "Model Loaded".

Could you give me some hints on how to run this on a multi-GPU machine? Thanks for your help!

@prajwalkr
Collaborator

Hi,

I think inference is already quite fast on a single GPU. Is there a specific reason you want to use multiple GPUs?

@chikiuso
Author

chikiuso commented Mar 31, 2020

Hi @prajwalkr, I am trying to make a real-time chatbot for fun, but her replies are slow: she takes about 1 min 20 sec to respond :D

@chikiuso
Author

Hi @prajwalkr, when I run it on a multi-GPU machine, I observed that every GPU allocates the same amount of memory, no matter whether it is one GPU or eight. E.g. when I run it on a single-GPU machine, it uses up 7 GB of GPU RAM; running the same thing on an 8-GPU machine, each GPU also uses up 7 GB.

@ak9250

ak9250 commented Mar 31, 2020

@chikiuso I noticed the bot follows facial movements, while LipGAN on a single image only follows mouth movements. Are you using some other model as well? Also, how are you getting the speech input?

@chikiuso
Author

chikiuso commented Apr 1, 2020

Hi @ak9250, yes, I use another model as well. The speech input is normal TTS.

@prajwalkr
Collaborator

Hi @prajwalkr, when I run it on a multi-GPU machine, I observed that every GPU allocates the same amount of memory, no matter whether it is one GPU or eight.

By default, TensorFlow uses all available GPUs. Run it as:
CUDA_VISIBLE_DEVICES=0 python batch_inference.py .... to use gpu:0, CUDA_VISIBLE_DEVICES=1 python batch_inference.py .... to use gpu:1, and so on.
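
If it is more convenient to keep the restriction inside the script, here is a minimal sketch of the same idea, assuming the TensorFlow 1.x / Keras backend this repo uses (the allow_growth setting is an extra assumption, not something batch_inference.py already does):

# Sketch only: pin the process to one GPU and let TensorFlow allocate memory on demand.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"   # must be set before TensorFlow is imported

import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto()
config.gpu_options.allow_growth = True     # avoids grabbing ~7 GB on every visible GPU up front
K.set_session(tf.Session(config=config))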

Hi @prajwalkr, I am trying to make a real-time chatbot for fun, but her replies are slow: she takes about 1 min 20 sec to respond :D

Are you running inference on a single static image or a video? A static image would be faster, as face detection does not have to run on each frame.
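
For reference, a rough illustration of why that matters, assuming dlib's frontal face detector as in this repo (the file names and the detect_face helper here are hypothetical, not the repo's actual code):

import cv2
import dlib

detector = dlib.get_frontal_face_detector()

def detect_face(frame):
    # dlib expects an RGB image; returns a list of face rectangles
    return detector(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB), 1)

# Static image: face detection runs once, and the result is reused for every generated frame.
image = cv2.imread("speaker.jpg")        # hypothetical input
face = detect_face(image)[0]

# Video: face detection has to run on every frame, which dominates the total inference time.
cap = cv2.VideoCapture("speaker.mp4")    # hypothetical input
while True:
    ok, frame = cap.read()
    if not ok:
        break
    face = detect_face(frame)[0]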

@prajwalkr
Collaborator

Closing due to inactivity. Please re-open if needed.
