
GPU not being used with inswapper-only CPU #361

Closed
3 tasks done
ckao10301 opened this issue Jul 15, 2024 · 12 comments
Labels
bug Something isn't working

Comments

@ckao10301

ckao10301 commented Jul 15, 2024

First, confirm

  • I have read the instructions carefully
  • I have searched the existing issues
  • I have updated the extension to the latest version

What happened?

When running face swap on a video, my CPU spikes to 90-100% usage in the "Analyzing target image" and "Swapping..." phases.

Why is CPU usage so high? Shouldn't this be using the GPU instead (the driver is installed properly)?
If the CPU is what matters, what kind of CPU and RAM would let me run ReActor the fastest? Are more threads or a higher clock frequency more important?
@Gourieff

Steps to reproduce the problem

Your workflow
video test workflow.json

Sysinfo

Linux Mint
Chrome
RTX 3090
Threadripper 1950X
DDR4
(screenshot attached: Screenshot from 2024-07-15 02-10-21)

Relevant console log

n/a

Additional information

No response

@ckao10301 ckao10301 added the bug and new labels Jul 15, 2024
@0002kgHg

same problem

@ckao10301
Author

I already followed the instructions and ran install.py, but my instance isn't using the GPU at all during the analyzing and swapping phases, so it seems the inswapper model is running only on the CPU. How do I fix this?

CUDA 12 Support - don't forget to run (Windows) install.bat or (Linux/MacOS) install.py for ComfyUI's Python enclosure or try to install ORT-GPU for CU12 manually (https://onnxruntime.ai/docs/install/#install-onnx-runtime-gpu-cuda-12x)
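
A quick check for whether only the CPU wheel of onnxruntime ended up installed (a sketch; run it with the same Python that ComfyUI uses):

# if this prints only ['CPUExecutionProvider'], the GPU build of onnxruntime is missing
python -c "import onnxruntime as ort; print(ort.get_available_providers())"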

@ckao10301 ckao10301 changed the title from "High CPU Usage normal?" to "GPU not being used with inswapper-only CPU" Jul 19, 2024
@Amit30swgoh

same problem :(

@ckao10301
Author

@Gourieff is this expected behavior, or is it an issue with the CUDA or Python version?

@Gourieff
Owner

@ckao10301 could you please show your pip list? The best compatibility/performance at the moment is to use Py3.10/3.11 + Cu11.8
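
For anyone gathering the same info, a sketch of the relevant checks (run with the same Python that ComfyUI uses):

# versions of the two packages that matter for GPU inference
pip list | grep -iE "onnxruntime|torch"

# torch's CUDA build and whether it can see the GPU
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"

# Python version (3.10/3.11 recommended above)
python --version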

@Gourieff Gourieff removed the new label Jul 26, 2024
@ckao10301
Author

ckao10301 commented Jul 26, 2024 via email

@webfiltered

webfiltered commented Jul 27, 2024

It is not expected behaviour.

Instead of downgrading, I upgraded everything. Resolved on Windows using the following (latest versions as of the post date where not specified; consolidated as a command sketch after the list):

  • pip uninstall onnxruntime
  • pip uninstall onnxruntime-gpu
  • Followed the onnxruntime install doc, installing only the -gpu variant (also mentioned earlier in this thread)
  • CUDA 12.5 (Development & Runtime - dev may be unnecessary, and using 12.4 is probably a better idea)
  • cuDNN 9.2.1
  • Update GPU drivers
  • pip install -U torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
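
The pip steps above, consolidated as a sketch (the plain onnxruntime-gpu line assumes a recent ORT release; older releases needed the CUDA 12 package index described in the onnxruntime install doc linked earlier). The CUDA toolkit, cuDNN and GPU driver updates are separate system installs:

# remove both ORT wheels so only the GPU build ends up installed
pip uninstall -y onnxruntime onnxruntime-gpu

# GPU-only variant, per the onnxruntime install docs
pip install -U onnxruntime-gpu

# torch built against the CUDA 12.4 runtime
pip install -U torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124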

Troubleshooting:

Below is possibly just misdirection (ping me if I should remove it), but in case it helps:

scripts/reactor_swapper.py and scripts/r_faceboost/restorer.py both set providers = [...]. I configured these explicitly per onnxruntime examples:
providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]

I also changed the onnx logging levels in reactor_patcher.py and reactor_utils.py:
onnxruntime.set_default_logger_severity(3) -> log level 1/2.
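
For reference, a minimal standalone sketch of the explicit-provider setup described above (not the actual ReActor code; the model path is illustrative):

import onnxruntime

# raise ORT verbosity (ReActor sets 3 = errors only) to see whether
# CUDAExecutionProvider actually loads or silently falls back to CPU
onnxruntime.set_default_logger_severity(1)

# try CUDA first, fall back to CPU if the provider can't initialize
providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]

session = onnxruntime.InferenceSession(
    "models/insightface/inswapper_128.onnx",  # adjust to your local model path
    providers=providers,
)

# should list CUDAExecutionProvider first when the GPU build of ORT is installed
print(session.get_providers())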

@ckao10301
Author

ckao10301 commented Jul 28, 2024

@webfiltered I followed your steps and they worked! Face swapping is waaay faster now. I'm using the ComfyUI manual install method rather than ComfyUI portable with the embedded Python (don't know if it makes a difference). Running on Windows 11; got it to work on Linux too. Thank you so much!

@webfiltered

No worries!

There's a workaround, too, if anyone can't use those versions. It may be Windows-only. Make sure you run Comfy in the foreground (e.g. via a terminal / cmd / PowerShell window), and just Alt+Tab to that window after you queue a prompt.

For me at least, it was 30x+ faster than with the console hidden, and may actually run faster than CUDA. Worth testing if you're doing large batches.

@ckao10301
Author

ckao10301 commented Jul 28, 2024 via email

@webfiltered

The short answer is UX. This is probably an incomplete answer, and given the magnitude of the difference, possibly an inaccurate one too.

The OS has *almost no idea that your Comfy browser tab is actually a local program, so the Comfy process does not receive priority. For an end-user OS, the thing the user is interacting with is the most important thing.

*Yes, it probably has some idea, and figuring it out would be fairly simple, but this is an edge case at present, so expect to work around the limitation manually.

@alperc84

Yes, I had the same problem. The GPU was only used in the restoring stage; in the other stages GPU usage was zero and processing was very slow. My CPU was reaching 94°C. Applying @webfiltered's solution step by step in the ComfyUI portable version fixed the problem. Now all stages use the GPU heavily, which increased speed maybe 10x. I used CUDA 12.4 for the solution, btw. Thank you.
