
Killing idle backend failure. #1760

Closed
TwinFinz opened this issue Feb 25, 2024 · 4 comments · Fixed by #1882

TwinFinz (Contributor) commented Feb 25, 2024

LocalAI version:
sha-ff88c39-cublas-cuda12-ffmpeg (docker image)

Environment, CPU architecture, OS, and Version:
(Standard docker image)
Linux LocalAi 6.5.11-8-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-8 (2024-01-30T12:27Z) x86_64 x86_64 x86_64 GNU/Linux

Describe the bug
Killing an idle llama backend from the watchdog causes an invalid-pointer crash.

To Reproduce
Set environment variables:

SINGLE_ACTIVE_BACKEND=true
BUILD_TYPE=cublas
REBUILD=true
BUILD_GRPC_FOR_BACKEND_LLAMA=true
PYTHON_GRPC_MAX_WORKERS=3
LLAMACPP_PARALLEL=1
PARALLEL_REQUESTS=true
WATCHDOG_IDLE=true
WATCHDOG_IDLE_TIMEOUT=5m

Load multiple (2+) models with the llama backend in succession, then let the instance idle so the watchdog kills the backend.
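For reference, the variables above can be passed to the container like so. This is an illustrative invocation only: the registry path and extra flags (`--gpus`, port mapping) are assumptions, and the image tag is taken from the version reported above.

```shell
# Illustrative sketch; adjust image name/registry and flags to your setup.
docker run --rm --gpus all -p 8080:8080 \
  -e SINGLE_ACTIVE_BACKEND=true \
  -e BUILD_TYPE=cublas \
  -e REBUILD=true \
  -e BUILD_GRPC_FOR_BACKEND_LLAMA=true \
  -e PYTHON_GRPC_MAX_WORKERS=3 \
  -e LLAMACPP_PARALLEL=1 \
  -e PARALLEL_REQUESTS=true \
  -e WATCHDOG_IDLE=true \
  -e WATCHDOG_IDLE_TIMEOUT=5m \
  localai/localai:sha-ff88c39-cublas-cuda12-ffmpeg
```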

Expected behavior
The backend should shut down gracefully.

Logs
11:41PM WRN [WatchDog] Address 127.0.0.1:35825 is idle for too long, killing it
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0xb95344]

goroutine 39 [running]:
github.com/mudler/go-processmanager.(*Process).path(...)
/root/go/pkg/mod/github.com/mudler/go-processmanager@v0.0.0-20230818213616-f204007f963c/process.go:32
github.com/mudler/go-processmanager.(*Process).readPID(0x0)
/root/go/pkg/mod/github.com/mudler/go-processmanager@v0.0.0-20230818213616-f204007f963c/process.go:52 +0x24
github.com/mudler/go-processmanager.(*Process).Stop(0x0)
/root/go/pkg/mod/github.com/mudler/go-processmanager@v0.0.0-20230818213616-f204007f963c/process.go:150 +0x1c
github.com/go-skynet/LocalAI/pkg/model.(*ModelLoader).deleteProcess(0xc0004dee00, {0xc000a3a200, 0x20})
/build/pkg/model/process.go:32 +0x45
github.com/go-skynet/LocalAI/pkg/model.(*ModelLoader).StopModel(0xd1f720?, {0xc000a3a200, 0x20})
/build/pkg/model/loader.go:160 +0x111
github.com/go-skynet/LocalAI/pkg/model.(*WatchDog).checkIdle(0xc0005d4240)
/build/pkg/model/watchdog.go:115 +0x2c2
github.com/go-skynet/LocalAI/pkg/model.(*WatchDog).Run(0xc0005d4240)
/build/pkg/model/watchdog.go:99 +0xdd
created by github.com/go-skynet/LocalAI/core/http.Startup in goroutine 1
/build/core/http/api.go:100 +0x8fe

Additional context
This only tends to happen when multiple models have been called; when only a single model is loaded, the crash does not occur regardless of how many requests are sent.
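The trace above shows `(*Process).Stop(0x0)`, i.e. `Stop` being invoked on a nil `*Process`, so the crash is consistent with `deleteProcess` looking up a backend entry that no longer exists. A minimal sketch of that failure mode with a defensive nil check (all type and function names here are hypothetical stand-ins, not LocalAI's actual code):

```go
package main

import "fmt"

// process is a stand-in for go-processmanager's *Process.
type process struct{ addr string }

// Stop dereferences the receiver's fields; calling it on a nil
// *process would panic, matching the reported SIGSEGV.
func (p *process) Stop() error {
	fmt.Println("stopping", p.addr)
	return nil
}

type modelLoader struct {
	processes map[string]*process
}

// deleteProcess guards against a missing or nil entry: the watchdog
// may race with another path that already removed the backend, so a
// missing entry is treated as already stopped.
func (ml *modelLoader) deleteProcess(key string) error {
	p, ok := ml.processes[key]
	if !ok || p == nil {
		return nil
	}
	err := p.Stop()
	delete(ml.processes, key)
	return err
}

func main() {
	ml := &modelLoader{processes: map[string]*process{}}
	// Entry was never registered (or was already removed):
	// with the guard this is a no-op instead of a panic.
	if err := ml.deleteProcess("llama-backend"); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println("ok")
}
```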

TwinFinz added the "bug" (Something isn't working) and "unconfirmed" labels on Feb 25, 2024
mudler (Owner) commented Feb 26, 2024

good catch. I can't confirm that yet, but it looks like a mutex issue
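If the mutex hypothesis is right, the shape of the bug would be the process map being shared between the request path and the watchdog goroutine without a lock, so one side can remove an entry while the other is stopping it. A minimal sketch of serializing those stops with `sync.Mutex` (names are hypothetical, not LocalAI's actual code):

```go
package main

import (
	"fmt"
	"sync"
)

// loader's process map is shared between HTTP handlers and the
// watchdog goroutine; the mutex makes stop idempotent and race-free.
type loader struct {
	mu        sync.Mutex
	processes map[string]string // model key -> backend address (stand-in)
}

func (l *loader) stop(key string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, ok := l.processes[key]; !ok {
		return // already stopped by the other goroutine
	}
	delete(l.processes, key)
}

func main() {
	l := &loader{processes: map[string]string{"llama": "127.0.0.1:35825"}}
	var wg sync.WaitGroup
	// Both the watchdog and a request handler try to stop the same
	// backend concurrently; exactly one wins, neither panics.
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l.stop("llama")
		}()
	}
	wg.Wait()
	fmt.Println("remaining:", len(l.processes))
}
```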

ctxwing commented Mar 17, 2024

Seems I have the same issue. I tested the 2.10.0 (2024-03-17) version with localai/localai:v2.10.0-cublas-cuda12-ffmpeg-core. Any progress on it?

mudler (Owner) commented Mar 23, 2024

should be fixed in #1882

mudler (Owner) commented Mar 23, 2024

@TwinFinz can you try again from master once the builds are finished? Feel free to re-open if the problem persists
