Inference speed #28

Inferencer · 2024-06-16T17:08:55Z

Based on Table 7 in the paper:

Inference w/ HADVS: 9.77 GB, 1.63 s
Inference w/o HADVS: 9.76 GB, 1.63 s
Inference (256 × 256): 6.62 GB, 0.46 s
Inference (1024 × 1024): 20.66 GB, 10.29 s

As far as I am aware, users are not currently achieving these speeds. Am I right in assuming those reported times are per frame?

samreid · 2024-06-17T00:42:51Z

Running example image 1 with audio file 2 like so:

time python scripts/inference.py --source_image examples/reference_images/1.jpg --driving_audio examples/driving_audios/2.wav

It took around 9 minutes and 31 seconds on an RTX 4090 on RunPod.

1 x RTX 4090
16 vCPU 62 GB RAM

Moviepy - Done !
Moviepy - video ready .cache/output.mp4

real	9m31.036s
user	19m24.173s
sys	1m17.841s

UPDATE: The audio file is 12 seconds long.
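
One rough way to relate this run to the per-frame question is to divide the wall-clock time by the number of generated frames. The sketch below assumes the output video is rendered at 25 fps, which is not stated anywhere in this thread, and it folds model loading and video encoding into the per-frame figure, so it likely overestimates the pure inference cost.

```python
# Rough per-frame cost derived from the `time` output above.
# ASSUMPTION: the output video is 25 fps; the thread does not state the frame rate.
ASSUMED_FPS = 25


def seconds_per_frame(wall_seconds: float, audio_seconds: float,
                      fps: float = ASSUMED_FPS) -> float:
    """Wall-clock seconds per generated frame (includes model load and encoding)."""
    n_frames = audio_seconds * fps
    return wall_seconds / n_frames


wall = 9 * 60 + 31.036  # "real 9m31.036s"
per_frame = seconds_per_frame(wall, audio_seconds=12)
print(f"{per_frame:.2f} s per frame")
```

Under that frame-rate assumption the run works out to a little under 2 seconds per frame, which is in the same range as the 1.63 s figure quoted from the table. That would be consistent with a per-frame reading, though the hardware here may differ from the paper's.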

gordon0414 · 2024-06-17T00:43:28Z

@samreid how long was your audio?

Inferencer · 2024-06-17T00:47:50Z

On mine, inference took 20 minutes on an L4 (30 GB RAM, 24 GB VRAM) with 9 seconds of audio and a 512x512 image. The results were pretty poor, but the face was turned about 40% and the recommended max is a 30% head yaw.

output-cd49a137-6514-4f6b-b73d-b33044012c31.mp4

Subin-Vidhu · 2024-06-17T02:15:44Z

Hey @Inferencer, was your GPU fully utilised? If 24GB VRAM is providing these results, it might be really slow, right?

Inferencer · 2024-06-17T10:08:49Z

> Hey @Inferencer, was your GPU fully utilised? If 24GB VRAM is providing these results, it might be really slow, right?

It reported 100% GPU usage, but underneath I believe it said 15/24 GB used.
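
For anyone who wants to log both numbers while inference runs, here is a minimal sketch that polls `nvidia-smi` and parses its CSV output. The helper names are made up for illustration, and the sample line at the bottom is an invented value, not a measurement from this thread. Note that `utilization.gpu` reports the share of time a kernel was running during the sample period, so 100% utilization alongside partially used memory is not contradictory.

```python
import subprocess

# Standard nvidia-smi query flags; requires an NVIDIA driver on the machine.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_line(line: str) -> dict:
    """Parse one CSV line such as '100, 15360, 24576' (percent, MiB, MiB)."""
    util, used, total = (float(x) for x in line.split(","))
    return {
        "util_pct": util,
        "used_mib": used,
        "total_mib": total,
        "mem_fraction": used / total,
    }


def read_gpus() -> list:
    """Query every visible GPU once; call this in a loop while inference runs."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_gpu_line(line) for line in out.strip().splitlines()]


# Illustrative sample line only (hypothetical values).
sample = parse_gpu_line("100, 15360, 24576")
print(sample["util_pct"], sample["mem_fraction"])
```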

Inferencer · 2024-06-17T14:34:29Z

I have seen a significant speed increase from using a 256x256 source image and changing ./configs/inference/default.yaml to a width and height of 256. The difference appears to be:

512x512.png + 0:09.wav = 20 mins inference duration
256x256.png + 0:23.wav = 9 mins inference duration

I'm not going to run a proper benchmark, but to give you an idea: the 256 run took under half the time even though its audio was more than twice as long, which works out to roughly a sixth of the time per second of audio.
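
Because the two runs used different audio lengths, one way to compare them is to normalize by audio duration. This is a minimal sketch using only the figures reported above. It ignores fixed costs such as model loading, so treat the ratio as approximate.

```python
# Wall-clock minutes and audio seconds as reported in the comment above.
runs = {
    "512x512": {"inference_min": 20, "audio_s": 9},
    "256x256": {"inference_min": 9, "audio_s": 23},
}


def cost_per_audio_second(inference_min: float, audio_s: float) -> float:
    """Seconds of inference spent per second of driving audio."""
    return inference_min * 60 / audio_s


costs = {name: cost_per_audio_second(**r) for name, r in runs.items()}
speedup = costs["512x512"] / costs["256x256"]

for name, cost in costs.items():
    print(f"{name}: {cost:.1f} s of inference per second of audio")
print(f"speedup at 256x256: {speedup:.1f}x")
```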

Subin-Vidhu · 2024-06-19T01:51:03Z

Thanks for the update @Inferencer

metncelik · 2024-06-22T06:34:28Z

2 minutes on an H100 80GB with a 512x512 image and 7 seconds of audio.
