
Improve Initial Checkpoint Load onto CPU/GPU #27

Closed
ryan13mt opened this issue Jun 30, 2023 · 3 comments
Assignees: painebenjamin
Labels: bug (Something isn't working), enhancement (New feature or request)

Comments

@ryan13mt

Hey @painebenjamin, it's me again, sorry. I installed the latest update and got the following logs; maybe they can help you pinpoint what's causing this.

This is how I set up the model:
[screenshot: model setup]

And the log files with only the failing call:
enfugue.log
enfugue-engine.log

@painebenjamin painebenjamin self-assigned this Jun 30, 2023
@painebenjamin painebenjamin added bug Something isn't working enhancement New feature or request labels Jun 30, 2023
@painebenjamin (Owner)

Don't be sorry at all, you've saved me hours and hours of chasing down bugs already. So thank you!

After investigating how A1111 handles this, it looks like it loads the initial checkpoint on the CPU first. I'm loading it directly onto the GPU straight away - which is probably a little faster, but for a gigantic checkpoint like SD 1.5 base, it overloads your VRAM.

Theirs:
[screenshot: A1111's checkpoint-loading code]

Mine:
[screenshot: Enfugue's checkpoint-loading code]

I'm going to investigate always loading to CPU first and see how it affects performance. If the cost is small, I'll leave it as always-CPU; if it hurts performance too much, I'll add logic that compares the size of the checkpoint against your available VRAM before deciding where to load it.
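For illustration, the size-comparison idea could look something like the sketch below. This is not Enfugue's actual code: the function name, the 20% headroom figure, and the byte counts are all hypothetical, and in practice the chosen device would feed PyTorch's `torch.load(..., map_location=device)`.

```python
# Hypothetical sketch: decide where to load a checkpoint's state dict first
# by comparing its size (plus a safety margin) against free VRAM.
# The names and the 20% headroom are illustrative, not Enfugue's real logic.

def choose_load_device(checkpoint_bytes: int, free_vram_bytes: int,
                       headroom: float = 0.2) -> str:
    """Return "cuda" if the checkpoint comfortably fits in free VRAM,
    otherwise "cpu" (load on CPU first, then move weights over)."""
    required = int(checkpoint_bytes * (1.0 + headroom))
    return "cuda" if required <= free_vram_bytes else "cpu"

# With PyTorch this would drive torch.load's map_location, e.g.:
#   free_vram, _total = torch.cuda.mem_get_info()
#   device = choose_load_device(os.path.getsize(path), free_vram)
#   state_dict = torch.load(path, map_location=device)

GB = 1024 ** 3
print(choose_load_device(7 * GB, 8 * GB))  # ~7 GB base checkpoint, 8 GB card -> cpu
print(choose_load_device(2 * GB, 8 * GB))  # ~2 GB pruned checkpoint -> cuda
```

This matches the numbers in the thread: a ~7 GB base checkpoint would not fit an 8 GB card once headroom is counted, while a ~2 GB pruned checkpoint would.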

If you're willing to try something while I'm working: have you tried a more fine-tuned checkpoint from CivitAI or elsewhere? Most of those clock in at around 2GB, far smaller than the base models' ~7GB. If my guess is correct, those should work for you.

@painebenjamin painebenjamin changed the title CUDA out of memory even on small generations Improve Initial Checkpoint Load onto CPU/GPU Jun 30, 2023
@ryan13mt (Author)

I tried it, and I'm happy to share my first generation using Enfugue. Thanks a lot for your work; this is miles easier to use than Auto1111. Don't get me wrong, Auto1111 has the most functionality, but it takes time to set up.

I used the pruned Dreamshaper model for this.
It would be nice to be able to use the full ones later on.

[screenshot: first generation]

@painebenjamin (Owner)

This has been fixed in the latest release. Thank you!
