Configurable model loading timeout #4350
I can try to add this feature.
/assign
@ProjectMoon To clarify: is your problem that when ollama runs a model that does not exist locally, pulling it from the registry times out?
No, it's the loading of the model from disk. There's a hard-coded timeout of 10 minutes. I forget the exact file where this is, but the comment above the line says something like "be generous, as long models can take a long time to load."
An example of why this would be useful (to me): I can load Mixtral 8x7b using the Q2_K quant (smallest available file). It loads in 565 seconds, just under the 10 minute timeout limit. But once it's loaded, it generates at 16 tokens/second. It would be lovely if I could try the higher quants and see what happens.
The model loading timeout, i.e. the time to wait for the llama runner to come up, is hard-coded. It would be nice to be able to configure this to increase or decrease it (for me, mostly increase). This would allow experimenting with big models that take forever to load but might run fine once loaded.