Inference speed #28

Open
Inferencer opened this issue Jun 16, 2024 · 8 comments

Comments

@Inferencer

Based on Table 7 in the paper:

Inference w. HADVS: 9.77 GB, 1.63 s
Inference w.o. HADVS: 9.76 GB, 1.63 s
Inference (256 × 256): 6.62 GB, 0.46 s
Inference (1024 × 1024): 20.66 GB, 10.29 s

As far as I am aware, users are not currently achieving these speeds. Am I right in assuming the reported times are per frame?

@samreid

samreid commented Jun 17, 2024

Running example image 1 with audio file 2 like so:

time python scripts/inference.py --source_image examples/reference_images/1.jpg --driving_audio examples/driving_audios/2.wav

It took around 9 minutes and 31 seconds on a 4090 on RunPod.

1 x RTX 4090
16 vCPU 62 GB RAM

Moviepy - Done !
Moviepy - video ready .cache/output.mp4

real	9m31.036s
user	19m24.173s
sys	1m17.841s

UPDATE: That is a 12-second audio file.
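
To check this against the per-frame reading of Table 7, the wall-clock time can be divided by the number of generated frames. This is a rough sketch only: it assumes the output video is rendered at 25 fps (an assumption, so check the fps in your config), and it ignores model-loading and video-encoding overhead. The helper name `seconds_per_frame` is made up for illustration.

```python
# Rough per-frame estimate from the run above.
# ASSUMPTION: the output video is rendered at 25 fps; adjust to match your config.
ASSUMED_FPS = 25


def seconds_per_frame(wall_clock_s: float, audio_s: float, fps: int = ASSUMED_FPS) -> float:
    """Average wall-clock seconds spent per generated frame."""
    n_frames = audio_s * fps
    return wall_clock_s / n_frames


wall_clock_s = 9 * 60 + 31.036  # "real 9m31.036s" reported by `time`
audio_s = 12                    # length of the driving audio in seconds
per_frame = seconds_per_frame(wall_clock_s, audio_s)
print(f"{per_frame:.2f} s per frame")
```

Under that fps assumption this comes to about 1.9 s per frame, which is in the same range as the 1.63 s in the table, so the per-frame reading looks plausible.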

@gordon0414

@samreid how long was your audio?

@Inferencer

On mine I got a 20-minute inference duration on an L4 (30 GB RAM, 24 GB VRAM) with 9 seconds of audio and a 512x512 image. The results were pretty poor, but the face was turned about 40% and the recommended maximum is a 30% head yaw.

output-cd49a137-6514-4f6b-b73d-b33044012c31.mp4

@Subin-Vidhu

Hey @Inferencer, was your GPU fully utilised? If 24 GB of VRAM is giving these results, it might be really slow, right?

@Inferencer

> Hey @Inferencer, was your GPU fully utilised? If 24 GB of VRAM is giving these results, it might be really slow, right?

It said GPU usage was 100%, but underneath I believe it said 15/24 GB used.
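
For anyone who wants to log this during a run rather than read it off a dashboard, here is a minimal sketch that asks `nvidia-smi` for utilization and memory. It assumes `nvidia-smi` is on the PATH; the helper names `parse_gpu_stats` and `query_gpu_stats` are made up for illustration, and the sample line is illustrative, not a real log.

```python
import subprocess

QUERY = "utilization.gpu,memory.used,memory.total"


def parse_gpu_stats(csv_text: str) -> list[dict]:
    """Parse `nvidia-smi --format=csv,noheader,nounits` output, one GPU per line."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, used, total = (int(field) for field in line.split(","))
        stats.append({"util_pct": util, "mem_used_mib": used, "mem_total_mib": total})
    return stats


def query_gpu_stats() -> list[dict]:
    """Run nvidia-smi once and return the parsed stats (needs an NVIDIA driver)."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_stats(out)


# Hard-coded sample line (illustrative values, not a real log):
sample = "100, 15360, 24576\n"
print(parse_gpu_stats(sample))
```

Note that 100% utilization with only part of the memory in use is not contradictory: the utilization figure reports how much of the time the GPU was busy running kernels, not how much VRAM is allocated.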

@Inferencer

I have seen a significant speed increase from using a 256x256 source image and changing the width and height in ./configs/inference/default.yaml to 256. The difference appears to be:

512x512.png + 0:09.wav = 20 mins inference duration
256x256.png + 0:23.wav = 9 mins inference duration

I'm not going to do a proper benchmark, but to give you an idea: that's under half the total time even though the audio was more than twice as long in my 256 test, so per second of audio it is roughly a sixth of the time.
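
Since the two runs used different audio lengths, a fairer comparison is inference time per second of audio. A quick sketch of that arithmetic, using only the rounded figures above (the helper name `seconds_per_audio_second` is made up for illustration):

```python
# Wall-clock minutes and audio seconds from the two runs above.
runs = {
    "512x512": {"inference_min": 20, "audio_s": 9},
    "256x256": {"inference_min": 9, "audio_s": 23},
}


def seconds_per_audio_second(inference_min: float, audio_s: float) -> float:
    """Seconds of inference needed for each second of driving audio."""
    return inference_min * 60 / audio_s


cost = {name: seconds_per_audio_second(**r) for name, r in runs.items()}
for name, c in cost.items():
    print(f"{name}: {c:.1f} s of inference per second of audio")

ratio = cost["256x256"] / cost["512x512"]
print(f"256x256 takes {ratio:.0%} of the 512x512 time per second of audio")
```

These are single, unrepeated runs with rounded timings, so treat the ratio as a rough indication only.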

@Subin-Vidhu
Copy link

Thanks for the update @Inferencer

@metncelik

2 min on an H100 80 GB with a 512x512 image and 7 s of audio.
