What is the issue?
I have already downloaded qwen:7b, but when I run `ollama run qwen:7b`, I get this error: `Error: timed out waiting for llama runner to start`. The server.log contains this message: `gpu VRAM usage didn't recover within timeout`.
OS: Windows
GPU: Nvidia
CPU: Intel
Ollama version: 0.1.37
I also get this (with smaller models) when running an RTX 2080 Ti or a GTX 1060 with codeqwen:chat and codegemma:instruct on Win10.
The model stays in system RAM, but it has to be copied back into GPU VRAM after every chat POST.
GPU VRAM is not being exceeded on these models, so I'm not sure why it times out every time.
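If the problem is the model being evicted from VRAM between requests, the API's `keep_alive` parameter controls how long the model stays loaded after a request completes. A minimal sketch, assuming the default local endpoint; the model name and durations here are just illustrations, not values from this thread:

```python
import requests

# Minimal sketch: ask Ollama to keep the model resident after the request
# completes, so it is not reloaded into VRAM on every chat POST.
# Assumes the default local endpoint; adjust host/port as needed.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "codeqwen:chat",
        "prompt": "Say hello.",
        "stream": False,
        "keep_alive": "30m",  # keep the model loaded for 30 idle minutes
    },
    timeout=600,  # allow time for the initial model load
)
resp.raise_for_status()
print(resp.json()["response"])
```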
@pamanseau from the logs you shared, it looks like the client gave up before the model finished loading, and since the client request was canceled, we canceled the loading of the model. Are you using our CLI, or are you calling the API? If you're calling the API, what timeout are you setting in your client?
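To illustrate that point: the first request after a cold start blocks until the whole model has loaded into VRAM, so a client timeout shorter than the load time cancels the request and, with it, the load. A minimal sketch of a generously long read timeout against the default local endpoint; the values are illustrative, not Ollama defaults:

```python
import requests

# Minimal sketch: split the timeout into (connect, read) so connection
# failures surface quickly while the model is still allowed a long load.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "qwen:7b",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": False,
    },
    timeout=(5, 600),  # 5 s to connect, up to 10 min to read the response
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```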