LocalAI version:
v3.7.0, hipblas image
Describe the bug
If an application requests a new model while an already-loaded model occupies the VRAM the new model needs, LocalAI fails to stop the previous model to free enough VRAM. The user must manually log into the LocalAI web UI and click Stop to release the VRAM.
To Reproduce
- Load a model that takes up most of the VRAM
- Try to load another model that also needs most of the VRAM (more than is currently available)
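The steps above can be sketched against LocalAI's OpenAI-compatible API; the model names below are placeholders for any two models that each need most of the available VRAM:

```shell
# Load the first large model (placeholder name "large-model-a"):
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "large-model-a", "messages": [{"role": "user", "content": "hi"}]}'

# With large-model-a still resident, request a second large model
# ("large-model-b"). Loading fails here instead of LocalAI evicting
# the first model to free VRAM:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "large-model-b", "messages": [{"role": "user", "content": "hi"}]}'
```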
Expected behavior
LocalAI stops / unloads the old model to make room for the new one.
lee-b