
CPU High #17

Closed
xllusion-dong opened this issue Jul 5, 2024 · 12 comments

Comments

@xllusion-dong

When running the sample video d6, the CPU stays at 100% for a long time.

[screenshot: CPU usage]

Any improvement for that?

@Celtmant

Celtmant commented Jul 6, 2024

I agree, it heavily loads the processor, and the RAM as well. I'd also note another thing: if you load a smaller image and don't pick a very long video, it's somewhat easier; perhaps reduce the upscale.

@funwithforks

pip install onnxruntime-gpu
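Installing the GPU wheel isn't always enough; it's worth checking which execution providers onnxruntime actually exposes. A minimal sketch (`pick_providers` is a hypothetical helper, not part of onnxruntime):

```python
def pick_providers(available):
    # Prefer the CUDA provider (shipped with onnxruntime-gpu);
    # fall back to CPU if it's absent.
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]

# Usage (requires onnxruntime installed):
# import onnxruntime as ort
# print(ort.get_available_providers())  # no CUDAExecutionProvider -> the GPU wheel isn't active
# sess = ort.InferenceSession("model.onnx",
#                             providers=pick_providers(ort.get_available_providers()))
```

If `CUDAExecutionProvider` is missing from `get_available_providers()`, the CPU-only `onnxruntime` package is likely shadowing `onnxruntime-gpu`, which would explain inference silently falling back to the CPU.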

@xllusion-dong
Author

I already installed onnxruntime-gpu, but sometimes it seems to use the CPU and sometimes the GPU. I'll keep watching it to find out why.

@wandrzej

wandrzej commented Jul 7, 2024

Overall I wonder about the performance. The paper claims 12.8 ms per frame, but in my case it's far from that, and for 3/4 of the time it's not even utilizing the GPU or the CPU (they sit at about 20% and 10% respectively). So apart from the onnx issue, is there anything else that could be a bottleneck? It looks like a single-core process is running and blocking the whole thing.
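One way to confirm which phase is blocking is to accumulate wall-clock time per stage; a minimal sketch (the stage names below are made up, not from the repo):

```python
import time
from contextlib import contextmanager

totals = {}

@contextmanager
def timed(label):
    # Accumulate wall-clock time per pipeline stage so the
    # single-threaded bottleneck shows up in the totals.
    t0 = time.perf_counter()
    try:
        yield
    finally:
        totals[label] = totals.get(label, 0.0) + time.perf_counter() - t0

# Wrap each phase of the per-frame loop (hypothetical stage names):
with timed("preprocess"):
    time.sleep(0.01)  # stand-in for detection/cropping
with timed("inference"):
    time.sleep(0.01)  # stand-in for the model forward pass
print({k: round(v, 3) for k, v in totals.items()})
```

If one label dominates while GPU utilization stays low, that stage is the serial CPU section to parallelize or move off the hot path.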

@funwithforks

I don't have the issue, so I have no input beyond onnx, but after getting that going my 4090 sits at 55% steadily during the run, for reference. CPU on the process is 345%. Without the GPU, CPU usage was much higher.

@LubuLubu2

For me it uses 100% CPU + 100% GPU and around 2.5 GB VRAM for the entire generation with an 832x1152 image, but it generates pretty quickly. 3060 Ti.

@Celtmant

Celtmant commented Jul 10, 2024

> For me it uses 100% CPU + 100% GPU and around 2.5 GB VRAM for the entire generation with an 832x1152 image, but it generates pretty quickly. 3060 Ti.

And there's even more to it than that. Longer videos consumed a lot of CPU and RAM resources: my RAM filled up to almost all 29 GB available and the computer froze. I used "pip install onnxruntime-gpu" and it got only slightly easier, but with long videos the RAM still fills up completely. I have an RTX 3060 with 12 GB, and 32 GB of RAM.

@LubuLubu2

> For me it uses 100% CPU + 100% GPU and around 2.5 GB VRAM for the entire generation with an 832x1152 image, but it generates pretty quickly. 3060 Ti.

> And there's even more to it than that. Longer videos consumed a lot of CPU and RAM resources: my RAM filled up to almost all 29 GB available and the computer froze. I used "pip install onnxruntime-gpu" and it got only slightly easier, but with long videos the RAM still fills up completely. I have an RTX 3060 with 12 GB, and 32 GB of RAM.

Yep, a 35-second example video (I even tried 1 minute) can eat all the resources you have, and if you don't have enough, your PC will freeze for minutes :)) Mine was frozen for 15 minutes on a 1-minute video. The generation itself is fine, but at the end every single frame of, say, a 1-minute video at 24 fps has to be processed; that's more than a thousand images, and it eats all your RAM. 20 seconds or less is fine; for longer videos we have to cap the frame count, generate a couple of videos, and join them together later.
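The split-and-join workaround above can be sketched as a frame-range chunker (a hypothetical helper, not part of the repo; the chunks would be generated separately and concatenated afterwards, e.g. with ffmpeg):

```python
def chunk_frames(total_frames, chunk_size):
    # Split a long clip into [start, end) frame ranges so each chunk
    # stays under the frame cap and fits in RAM on its own.
    return [(s, min(s + chunk_size, total_frames))
            for s in range(0, total_frames, chunk_size)]

# A 1-minute clip at 24 fps, capped at 480 frames (20 s) per chunk:
print(chunk_frames(60 * 24, 480))  # [(0, 480), (480, 960), (960, 1440)]
```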

@kosmicdream

Same problem here. I've been trying to run the example video on an A40 instance on Runpod, and everything freezes.

@wandrzej

On my side, I don't think it's really a matter of some bottleneck: I have 128 GB RAM, so frame off-loading is not the problem, and the same goes for VRAM (24 GB). I do have onnxruntime-gpu installed, though it's 1.5 I believe; maybe there's a version mismatch, but even that wouldn't explain the low load on both CPU and GPU in the pre-processing phase.

Anyway, this could work much more efficiently. Given the low utilization numbers others reported, I think that with proper use of both CPU and GPU the claimed 12.8 ms per frame is possible, regardless of the video length. It could be an issue with Comfy itself: it needs to finish one 'block' from the pre-process node before moving on to generation.

@kijai
Owner

kijai commented Jul 11, 2024

> On my side, I don't think it's really a matter of some bottleneck: I have 128 GB RAM, so frame off-loading is not the problem, and the same goes for VRAM (24 GB). I do have onnxruntime-gpu installed, though it's 1.5 I believe; maybe there's a version mismatch, but even that wouldn't explain the low load on both CPU and GPU in the pre-processing phase.

> Anyway, this could work much more efficiently. Given the low utilization numbers others reported, I think that with proper use of both CPU and GPU the claimed 12.8 ms per frame is possible, regardless of the video length. It could be an issue with Comfy itself: it needs to finish one 'block' from the pre-process node before moving on to generation.

Their code has a lot of inefficiencies; I don't know if their speed claim covers the whole process or just part of it. For example, skipping the pasteback gives a ~30% speed boost.

For reference, the numbers I'm currently getting for video editing in the develop branch with a 4090, for the detection/cropping part, using CUDA for onnx: 33 it/s.

And the rest, which mostly uses the GPU but also has lots of CV2/numpy operations done on the CPU: 12 it/s on a Ryzen 7950X.

So something like ~14 fps without pasteback and ~11 fps with.

@kijai
Owner

kijai commented Jul 11, 2024

Oh, and about the memory issue... that's common in Comfy when the frame count gets really high. It's not really designed to handle that, as everything is kept in memory with no disk caching.
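Outside of Comfy's node model, the usual workaround is to flush each frame to disk as soon as it is produced instead of accumulating the whole clip in memory. A rough sketch (`stream_frames_to_disk` is a hypothetical helper; real pipelines would write image files rather than raw bytes):

```python
import os

def stream_frames_to_disk(frames, out_dir):
    # Write each frame out immediately so peak memory stays at roughly
    # one frame instead of the entire clip.
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i, frame_bytes in enumerate(frames):
        path = os.path.join(out_dir, f"frame_{i:06d}.bin")
        with open(path, "wb") as f:
            f.write(frame_bytes)
        paths.append(path)
    return paths
```

The trade-off is disk I/O per frame, but for a thousand-plus frames that is usually far cheaper than swapping or freezing the machine.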

@kijai kijai closed this as completed Jul 24, 2024