
Killing idle backend failure. #1760

Closed
TwinFinz opened this issue Feb 25, 2024 · 4 comments · Fixed by #1882

TwinFinz (Contributor) commented Feb 25, 2024

LocalAI version:
sha-ff88c39-cublas-cuda12-ffmpeg (docker image)

Environment, CPU architecture, OS, and Version:
(Standard docker image)
Linux LocalAi 6.5.11-8-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-8 (2024-01-30T12:27Z) x86_64 x86_64 x86_64 GNU/Linux

Describe the bug
Killing an idle llama backend from the watchdog causes an invalid-pointer crash.

To Reproduce
Set environment variables:

SINGLE_ACTIVE_BACKEND=true
BUILD_TYPE=cublas
REBUILD=true
BUILD_GRPC_FOR_BACKEND_LLAMA=true
PYTHON_GRPC_MAX_WORKERS=3
LLAMACPP_PARALLEL=1
PARALLEL_REQUESTS=true
WATCHDOG_IDLE=true
WATCHDOG_IDLE_TIMEOUT=5m

Load multiple (2+) models with the llama backend in succession, then let the instance idle so the watchdog kills the backend.
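For reference, the variables above can be passed to the container like so. This is an illustrative invocation only: the registry path and extra flags (`--gpus`, port mapping) are assumptions, and the image tag is taken from the version reported above.

```shell
# Illustrative sketch; adjust image name/registry and flags to your setup.
docker run --rm --gpus all -p 8080:8080 \
  -e SINGLE_ACTIVE_BACKEND=true \
  -e BUILD_TYPE=cublas \
  -e REBUILD=true \
  -e BUILD_GRPC_FOR_BACKEND_LLAMA=true \
  -e PYTHON_GRPC_MAX_WORKERS=3 \
  -e LLAMACPP_PARALLEL=1 \
  -e PARALLEL_REQUESTS=true \
  -e WATCHDOG_IDLE=true \
  -e WATCHDOG_IDLE_TIMEOUT=5m \
  localai/localai:sha-ff88c39-cublas-cuda12-ffmpeg
```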

Expected behavior
The backend should shut down gracefully.

Logs
11:41PM WRN [WatchDog] Address 127.0.0.1:35825 is idle for too long, killing it
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0xb95344]

goroutine 39 [running]:
github.com/mudler/go-processmanager.(*Process).path(...)
/root/go/pkg/mod/github.com/mudler/go-processmanager@v0.0.0-20230818213616-f204007f963c/process.go:32
github.com/mudler/go-processmanager.(*Process).readPID(0x0)
/root/go/pkg/mod/github.com/mudler/go-processmanager@v0.0.0-20230818213616-f204007f963c/process.go:52 +0x24
github.com/mudler/go-processmanager.(*Process).Stop(0x0)
/root/go/pkg/mod/github.com/mudler/go-processmanager@v0.0.0-20230818213616-f204007f963c/process.go:150 +0x1c
github.com/go-skynet/LocalAI/pkg/model.(*ModelLoader).deleteProcess(0xc0004dee00, {0xc000a3a200, 0x20})
/build/pkg/model/process.go:32 +0x45
github.com/go-skynet/LocalAI/pkg/model.(*ModelLoader).StopModel(0xd1f720?, {0xc000a3a200, 0x20})
/build/pkg/model/loader.go:160 +0x111
github.com/go-skynet/LocalAI/pkg/model.(*WatchDog).checkIdle(0xc0005d4240)
/build/pkg/model/watchdog.go:115 +0x2c2
github.com/go-skynet/LocalAI/pkg/model.(*WatchDog).Run(0xc0005d4240)
/build/pkg/model/watchdog.go:99 +0xdd
created by github.com/go-skynet/LocalAI/core/http.Startup in goroutine 1
/build/core/http/api.go:100 +0x8fe

Additional context
This only tends to happen when multiple models have been called; when only a single model is loaded, the crash does not occur regardless of how many requests are sent.
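The trace above shows `(*Process).Stop(0x0)`, i.e. `Stop` being invoked on a nil `*Process`, so the crash is consistent with `deleteProcess` looking up a backend entry that no longer exists. A minimal sketch of that failure mode with a defensive nil check (all type and function names here are hypothetical stand-ins, not LocalAI's actual code):

```go
package main

import "fmt"

// process is a stand-in for go-processmanager's *Process.
type process struct{ addr string }

// Stop dereferences the receiver's fields; calling it on a nil
// *process would panic, matching the reported SIGSEGV.
func (p *process) Stop() error {
	fmt.Println("stopping", p.addr)
	return nil
}

type modelLoader struct {
	processes map[string]*process
}

// deleteProcess guards against a missing or nil entry: the watchdog
// may race with another path that already removed the backend, so a
// missing entry is treated as already stopped.
func (ml *modelLoader) deleteProcess(key string) error {
	p, ok := ml.processes[key]
	if !ok || p == nil {
		return nil
	}
	err := p.Stop()
	delete(ml.processes, key)
	return err
}

func main() {
	ml := &modelLoader{processes: map[string]*process{}}
	// Entry was never registered (or was already removed):
	// with the guard this is a no-op instead of a panic.
	if err := ml.deleteProcess("llama-backend"); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println("ok")
}
```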

TwinFinz added the "bug" (Something isn't working) and "unconfirmed" labels on Feb 25, 2024
mudler (Owner) commented Feb 26, 2024

good catch. I can't confirm that yet, but it looks like a mutex issue
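If the mutex hypothesis is right, the shape of the bug would be the process map being shared between the request path and the watchdog goroutine without a lock, so one side can remove an entry while the other is stopping it. A minimal sketch of serializing those stops with `sync.Mutex` (names are hypothetical, not LocalAI's actual code):

```go
package main

import (
	"fmt"
	"sync"
)

// loader's process map is shared between HTTP handlers and the
// watchdog goroutine; the mutex makes stop idempotent and race-free.
type loader struct {
	mu        sync.Mutex
	processes map[string]string // model key -> backend address (stand-in)
}

func (l *loader) stop(key string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, ok := l.processes[key]; !ok {
		return // already stopped by the other goroutine
	}
	delete(l.processes, key)
}

func main() {
	l := &loader{processes: map[string]string{"llama": "127.0.0.1:35825"}}
	var wg sync.WaitGroup
	// Both the watchdog and a request handler try to stop the same
	// backend concurrently; exactly one wins, neither panics.
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l.stop("llama")
		}()
	}
	wg.Wait()
	fmt.Println("remaining:", len(l.processes))
}
```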

ctxwing commented Mar 17, 2024

Seems I have the same issue. I tested the 2.10.0 (2024-03-17) version with localai/localai:v2.10.0-cublas-cuda12-ffmpeg-core. Any progress on it?

mudler (Owner) commented Mar 23, 2024

should be fixed in #1882

mudler (Owner) commented Mar 23, 2024

@TwinFinz can you try again from master once the builds are finished? Feel free to re-open if the problem persists
