fix(model): track intentional stops, stop misreading clean shutdowns as crashes #10060
Merged
mudler
approved these changes
May 29, 2026
Description
Track when we intentionally stop a backend and use that to decide whether its exit is unexpected.
Otherwise we log spurious errors whenever a backend process returns a non-zero status we didn't
expect. There is no standard for these status codes, so relying on them would mean ensuring every
backend handles signals the same way and explicitly returns the exit codes we expect.
Notes for Reviewers
Two separate issues made graceful backend shutdown look ungraceful in the
logs, even though the processes were being terminated correctly
(go-processmanager defaults to process-group SIGTERM + 15s grace + SIGKILL):
1. "failed to read PID": startProcess registers a per-process
   graceful-termination handler that calls Stop(), but StopAllGRPC
   (registered earlier, via app.Shutdown) already stopped and released
   store-tracked backends first. The second Stop() then failed reading the
   removed pidfile. Guard the handler with IsAlive() so it skips
   already-stopped processes; it still covers backends StopAllGRPC doesn't
   track (worker-supervised ones).
2. "Backend process exited unexpectedly" exitCode=-1: the exit watcher
   treated only exit codes 0/143 as clean. But a child killed by our own
   SIGTERM/SIGKILL is reported by Go as exitCode -1 (signal termination),
   not the shell's 128+signal convention, so every intentional stop logged
   a false crash warning. The exit code can't distinguish an intended stop
   from a signal-induced crash.
Track intent directly instead: a stoppingProcs sync.Map (keyed by the
*process.Process pointer) is marked wherever LocalAI calls Stop() on
purpose, and the exit watcher uses it to pick the log level — Info
"stopped" when intentional, Warn "exited unexpectedly" otherwise (still
catching real crashes). The raw exit code is reported as a field but no
longer interpreted.
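The two fixes can be sketched together as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `fakeProc` is a hypothetical stand-in for `*process.Process`, `stopIntentionally` and `exitLogLine` are invented names, the log strings are approximations, and consuming the mark with `LoadAndDelete` is a choice made for the sketch rather than something the PR description confirms.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// fakeProc is a hypothetical stand-in for *process.Process.
type fakeProc struct {
	name  string
	alive bool
}

func (p *fakeProc) IsAlive() bool { return p.alive }

// Stop fails on an already-stopped process, imitating the missing pidfile.
func (p *fakeProc) Stop() error {
	if !p.alive {
		return errors.New("failed to read PID")
	}
	p.alive = false
	return nil
}

// stoppingProcs records processes we deliberately asked to stop,
// keyed by the process pointer.
var stoppingProcs sync.Map

// stopIntentionally skips processes that are already gone, and otherwise
// marks the stop as intentional before calling Stop().
func stopIntentionally(p *fakeProc) error {
	if !p.IsAlive() {
		return nil // an earlier handler already stopped it
	}
	stoppingProcs.Store(p, struct{}{})
	return p.Stop()
}

// exitLogLine picks the log level from recorded intent, not the exit code.
// The exit code is only reported as a field.
func exitLogLine(p *fakeProc, exitCode int) string {
	if _, intentional := stoppingProcs.LoadAndDelete(p); intentional {
		return fmt.Sprintf("INFO Backend process stopped name=%s exitCode=%d", p.name, exitCode)
	}
	return fmt.Sprintf("WARN Backend process exited unexpectedly name=%s exitCode=%d", p.name, exitCode)
}

func main() {
	stopped := &fakeProc{name: "backend-a", alive: true}
	crashed := &fakeProc{name: "backend-b", alive: true}

	_ = stopIntentionally(stopped)
	// A second shutdown handler reaching the same process is now a no-op.
	fmt.Println("second stop error:", stopIntentionally(stopped))

	// Both processes report -1, yet only the unmarked one is a warning.
	fmt.Println(exitLogLine(stopped, -1))
	fmt.Println(exitLogLine(crashed, -1))
}
```

Keying the map by pointer means the exit watcher needs no extra identifier: the same process value it already holds is enough to look up intent.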
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Signed commits