GPT4All v2.6.1 leaks VRAM when switching models #1840
Comments
Same here; a clean install of everything doesn't help.
Could you try a smaller model, e.g. a Q4_0 or Q4_K_S? And please provide a link to where you're downloading it from. This model loads fine for me on CPU, but on my server with a new enough GPU it runs out of memory. Please also confirm the version of GPT4All you are using, and the last known good version if you know it.
In my case, 2.5.4 worked fine. I download models from the application; I used Mistral and Falcon (both GPU) and WizardLM (CPU). After updating from 2.5.4 to 2.6.1, I experience the problems that @lexisdude explained.
Hello. Let's ignore the example I gave and instead use Hermes or Snoozy, downloaded directly from the GPT4All application. They do the very same thing when I switch to them. Mistral is the dataset that loads by default in the application when I first start it. Regardless of what, or how many, datasets I have in the models directory, switching to any other dataset causes GPT4All to crash. The 2.5.4 version of the application works fine with anything I load into it; the 2.6.1 version crashes almost instantly when I select any other dataset, regardless of its size. I have 3 datasets that I downloaded directly from the GPT4All application; the rest of the datasets came from
I would not say my rig is uber powerful or cannot run out of memory, but it has no problems with the 2.5.4 version of GPT4All, and it has no issues with Stable Diffusion XL, even processing requests from a 160 GB SDXL dataset.
I see what's going on here. GPT4All eventually runs out of VRAM if you switch models enough times, due to a memory leak. Using larger models on a GPU with less VRAM will exacerbate this, especially on an OS like Windows that tends to fragment VRAM (and we don't handle that as well as we should).
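For anyone wanting to observe the leak described above, a minimal monitoring sketch follows. It is not part of GPT4All itself; it simply polls `nvidia-smi` (assuming an Nvidia GPU and the standard `--query-gpu`/`--format` flags) while you switch models in the GPT4All UI, so you can watch whether used VRAM keeps climbing after each switch instead of returning to baseline.

```python
# Hypothetical helper for reproducing the report: poll used VRAM via
# nvidia-smi while switching models in the GPT4All UI. If the leak is
# present, the reported value ratchets upward with each model switch.
import subprocess
import time


def parse_vram_mb(csv_text: str) -> list[int]:
    """Parse the output of
    `nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits`
    into a list of per-GPU used-VRAM values in MiB."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def used_vram_mb() -> list[int]:
    """Query nvidia-smi for the currently used VRAM on each GPU."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_vram_mb(out)


if __name__ == "__main__":
    print("baseline MiB:", used_vram_mb())
    while True:
        time.sleep(5)
        # Switch models in the GPT4All UI between polls and compare.
        print("used MiB:", used_vram_mb())
```

If the numbers only ever grow across repeated model switches, that is consistent with the leak; on a fixed build they should drop back near the baseline once the old model is unloaded.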
Awesome, hopefully a fix is coming soon, because there should be no doubt: I love using GPT4All. And Windows, for all intents and purposes, breaks more things than it fixes. But I didn't say that.
This is bad for the display server under Linux with only one GPU. I had to SysRq my way out because VRAM hit 100% when I didn't kill it fast enough. That's bad.
This should finally be fixed as of eadc3b8.
Fixed in v2.6.2. It still seems like VRAM is leaked when model loading fails, but that's a separate issue. |
System Info
Windows 10 22H2, 128 GB RAM, AMD Ryzen 7 5700X 8-Core Processor, Nvidia GeForce RTX 3060
Information
Reproduction
Expected behavior
All my datasets were working fine before the update. After it started crashing, I even tried deleting everything and performing a full (clean) install, and it still crashes when switching to any dataset other than the one it loads with.