
v2.7.3 crashes when loading large models, where v2.5.1 did not #2182

Open
dailysoftware opened this issue Mar 30, 2024 · 6 comments
Labels
bug-unconfirmed chat gpt4all-chat issues need-info Further information from issue author is requested

Comments

@dailysoftware

Bug Report

GPT4All crashes without any warning when loading a model whose RAM requirement is greater than 16 GB. When I switch to version 2.5.1, or load a model with a RAM requirement under 8 GB, there is no problem.

Steps to Reproduce

1. Run GPT4All
2. Choose a model
3. GPT4All crashes

Your Environment

  • GPT4All version: 2.7.3
  • Operating System: Windows 11
  • Chat model used (if applicable): nous-hermes-llama2-13b.Q4_0.gguf
@dailysoftware dailysoftware added bug-unconfirmed chat gpt4all-chat issues labels Mar 30, 2024
@dailysoftware (Author)

Version 2.5.1 can use various models, but only on the CPU, not the GPU; trying the GPU fails with an error saying that GPU loading failed (out of VRAM). Version 2.7.3, on the other hand, cannot use models with a 16 GB memory requirement, but it can use 8 GB models and can also use the GPU.
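As a rough sanity check on the 8 GB vs. 16 GB distinction above, the memory a Q4_0-quantized model needs can be approximated from its parameter count. A minimal sketch, assuming ~4.5 effective bits per weight for Q4_0 and a fixed allowance for KV cache and runtime overhead; these figures are back-of-envelope assumptions, not GPT4All's actual accounting:

```python
# Back-of-envelope estimate of the memory a Q4_0-quantized model needs:
# parameter count times ~4.5 effective bits per weight, plus a fixed
# allowance for KV cache and runtime overhead (assumed figures).

def estimate_ram_gb(params_billions, bits_per_weight=4.5, overhead_gb=1.5):
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes / 2**30 + overhead_gb

print(f"13B Q4_0: ~{estimate_ram_gb(13):.1f} GB")  # ~8.3 GB
print(f"34B Q4_0: ~{estimate_ram_gb(34):.1f} GB")  # ~19.3 GB
```

Under these assumptions a 13B Q4_0 model lands near the 8 GB tier, while a 34B model comfortably exceeds 16 GB, consistent with the sizes discussed in this thread.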

@cebtenzzre
Member

How much RAM do you have? Do you think it is possible that GPT4All is running out of RAM (e.g. does it crash when you set the device to "CPU"), or is it really crashing when it runs out of VRAM? The latter is possible, but it would definitely be a bug and not an intentional occurrence.

@cebtenzzre cebtenzzre reopened this Apr 1, 2024
@cebtenzzre cebtenzzre added the need-info Further information from issue author is requested label Apr 1, 2024
@cebtenzzre cebtenzzre changed the title Gpt4All crashes when loading models v2.7.3 crashes when loading large models, where v2.5.1 did not Apr 1, 2024
@Syclusion

I am having this issue as well, with a 4090 and 96 GB of memory. Running on the CPU avoids the crash, but it is extremely slow.

@TREHAND-Christian

TREHAND-Christian commented Apr 22, 2024

I have the same problem: 80 GB of memory, NVIDIA RTX 3060.

QML debugging is enabled. Only use this in a safe environment.
[Debug] (Mon Apr 22 06:20:54 2024): deserializing chat "F:/AI/gpt4all/nomic.ai/GPT4All//gpt4all-3ca3afb4-8c17-4c97-8693-135477a84612.chat"
[Debug] (Mon Apr 22 06:20:54 2024): deserializing chats took: 4 ms
llama_new_context_with_model: max tensor size = 102.54 MB
llama.cpp: using Vulkan on NVIDIA GeForce RTX 3060
error loading model: Memory type index for buffer creation not found
llama_load_model_from_file_internal: failed to load model
LLAMA ERROR: failed to load model from F:/AI/gpt4all/nomic.ai/GPT4All/wizardcoder-python-34b-v1.0.Q4_0.gguf
GGML_ASSERT: C:\msys64\home\Jared\gpt4all-navarro\gpt4all-backend\llama.cpp-mainline\llama.cpp:552: data
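The Vulkan error in the log above ("Memory type index for buffer creation not found") indicates that no GPU memory type satisfied the buffer's requirements. A minimal sketch of how that style of lookup works; the memory types, flags, and heap sizes here are hypothetical, not queried from a real device:

```python
# Sketch of Vulkan-style memory type selection, illustrating how an error
# like "Memory type index for buffer creation not found" can arise.
# The memory types and heap sizes below are hypothetical stand-ins.

DEVICE_LOCAL = 0x1  # stand-in for VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
HOST_VISIBLE = 0x2  # stand-in for VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT

def find_memory_type(memory_types, type_bits, required_flags, alloc_size):
    """Return the first memory type index whose bit is set in the buffer's
    type_bits mask, that has all required property flags, and whose heap
    can hold the allocation -- or None, which a backend may treat as fatal."""
    for i, mt in enumerate(memory_types):
        if not type_bits & (1 << i):
            continue  # buffer is not compatible with this memory type
        if (mt["flags"] & required_flags) != required_flags:
            continue  # missing a required property flag
        if alloc_size > mt["heap_size"]:
            continue  # a single allocation this large cannot fit the heap
        return i
    return None

# Hypothetical device: one 12 GiB device-local heap (e.g. an RTX 3060).
types = [{"flags": DEVICE_LOCAL, "heap_size": 12 * 2**30}]

# A 7 GiB weight buffer fits; an 18 GiB request finds no usable type.
print(find_memory_type(types, 0b1, DEVICE_LOCAL, 7 * 2**30))   # 0
print(find_memory_type(types, 0b1, DEVICE_LOCAL, 18 * 2**30))  # None
```

This would be consistent with cebtenzzre's question above: if the backend requests a device-local buffer larger than any heap (or larger than the device's maximum allocation size), the lookup fails, and asserting on that failure instead of falling back to the CPU would be a bug.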

@ItsCheif

ItsCheif commented Jun 4, 2024

Issue seems to still exist on v2.8.0.

I just tried a large model that crashes GPT4All without warning; after switching to CPU it doesn't crash anymore, but it also takes forever to write a single letter.

@securityopa

I have the same problem with the latest version from flathub.

I have 128 GB of RAM and an AMD Radeon 6800 XT, which is quite fast at generating answers. But when the response is long, it crashes.
