Models remain resident in VRAM after deletion #4443

Closed
coder543 opened this issue May 15, 2024 · 2 comments
Assignees: dhiltgen
Labels: bug (Something isn't working)

Comments


coder543 commented May 15, 2024

What is the issue?

I downloaded the wrong model, ran it, realized my mistake, and deleted it, but ollama ps still listed the model as resident in VRAM.

$ ollama ps
NAME            ID              SIZE    PROCESSOR       UNTIL
yi:9b-v1.5-q8_0 6ea05582d5ca    10 GB   100% GPU        4 minutes from now
$ ollama rm yi:9b-v1.5-q8_0
deleted 'yi:9b-v1.5-q8_0'
$ ollama ps
NAME            ID              SIZE    PROCESSOR       UNTIL
yi:9b-v1.5-q8_0 6ea05582d5ca    10 GB   100% GPU        4 minutes from now
$ nvidia-smi
Wed May 15 02:48:11 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.171.04             Driver Version: 535.171.04   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3090        Off | 00000000:01:00.0 Off |                  N/A |
|  0%   50C    P8              24W / 420W |   9588MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     63185      C   ...unners/cuda_v11/ollama_llama_server     9582MiB |
+---------------------------------------------------------------------------------------+

OS: Linux
GPU: Nvidia
CPU: AMD
Ollama version: 0.1.38

coder543 added the bug (Something isn't working) label on May 15, 2024
dhiltgen (Collaborator) commented

As noted in the ollama ps output, the model will unload after 5 minutes by default (it looks like you had about 4 minutes remaining). We'll also automatically unload an idle model if we need the VRAM to load other models, so you can safely pull and run the model you actually wanted.

If you upgrade to the latest version, you can run ollama run yi:9b-v1.5-q8_0 --keepalive 0 "" to trigger an immediate unload.
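
For reference, the same immediate unload can also be triggered through the REST API by sending a generate request with keep_alive set to 0. This is a minimal sketch assuming the default local server address and a version recent enough to honor the keep_alive field:

$ curl http://localhost:11434/api/generate -d '{
    "model": "yi:9b-v1.5-q8_0",
    "keep_alive": 0
  }'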

dhiltgen self-assigned this on May 21, 2024
coder543 (Author) commented May 21, 2024

I agree with what you said in general, but it is still surprising behavior, and if you're using the GPU for other things besides ollama, there is a window in which that behavior can cause an OOM for whatever other application is trying to load a model into VRAM.

There is no valid use case for a deleted model to remain in VRAM, since you cannot use the deleted model. (ollama will either complain or start downloading it again, rather than just using it from VRAM.)

But, it's fine, I guess.
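
As a workaround for that window, on a version where the --keepalive flag is available (newer than the 0.1.38 reported here, per the comment above), the unload can be forced before the delete so the VRAM is freed immediately:

$ ollama run yi:9b-v1.5-q8_0 --keepalive 0 ""   # force an immediate unload first
$ ollama rm yi:9b-v1.5-q8_0                     # then delete the model
$ ollama ps                                     # verify nothing remains resident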
