v0.7.2
Engine lifecycle hardening
Reliability fix for a user-reported cascade: cancelling/closing Claude left requests queuing forever, and stopping TurboLLM left the model loaded in RAM with the UI showing nothing.
Fixed
- Engine load lock: a static Manager.loadGate shared across every Manager instance ensures at most one model load/reload is ever in flight. The new load() method is the single atomic entry point: stop → ComfyUI reverse gate → spawn → readiness wait, all under the lock. This eliminates the double-VRAM-allocation race between gateway auto-swap and a concurrent HTTP load.
- Orphan-engine reaping: each engine writes a pidfile (run/engine-{pid}.pid) with its port and owner-daemon pid. On startup, reapStaleEngines() kills any engine whose port is live but whose owner daemon is gone (terminal closed, killed, crashed). killTrackedEnginesSync() on process exit covers abrupt exits that bypass signal handlers. Owner-aware: a restarting daemon never reaps engines the incoming process already owns.
- Client-cancel propagation: the gateway wires an AbortController into every upstream engine fetch. stream.onAbort fires ac.abort(), so a cancelled Claude turn actually stops the engine generating rather than running to completion and clogging the queue. streamToAnthropic now uses reader.cancel() (not releaseLock()) to tear down the upstream body on disconnect.
- Daemon crash on client disconnect: guarded the final writeSSE('done') in chat routes; the unhandledRejection handler in the CLI swallows expected AbortErrors. A disconnecting client can no longer crash the daemon and orphan the engine.
- SIGHUP handled: added to graceful-shutdown signals.
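For readers unfamiliar with the pattern, the load lock above can be approximated with a promise chain used as a mutex. This is only a sketch under assumptions, not the project's implementation: Manager, loadGate and load() are names from these notes, while doLoad, the log array, the timer standing in for stop/spawn/readiness, and demoLoadLock are hypothetical.

```typescript
// Sketch only: a static promise chain serialises loads across all instances.
class Manager {
  // Shared by every Manager instance, so the lock is process-wide.
  private static loadGate: Promise<void> = Promise.resolve();

  private readonly name: string;
  private readonly log: string[];

  constructor(name: string, log: string[]) {
    this.name = name;
    this.log = log;
  }

  // Single entry point: the whole load sequence runs only after the
  // previously queued load has settled.
  load(model: string): Promise<void> {
    const run = Manager.loadGate.then(() => this.doLoad(model));
    // Swallow failures on the gate itself so one failed load does not
    // block every later load; the caller still sees the rejection via `run`.
    Manager.loadGate = run.catch(() => undefined);
    return run;
  }

  // Hypothetical stand-in for stop -> spawn -> readiness wait.
  private async doLoad(model: string): Promise<void> {
    this.log.push(`${this.name}:start:${model}`);
    await new Promise<void>((resolve) => setTimeout(resolve, 10));
    this.log.push(`${this.name}:ready:${model}`);
  }
}

// Two different Managers loading concurrently still run one after the other.
async function demoLoadLock(): Promise<string[]> {
  const log: string[] = [];
  const a = new Manager("a", log);
  const b = new Manager("b", log);
  await Promise.all([a.load("model-x"), b.load("model-y")]);
  return log;
}
```

Because the gate is static, the second load does not start until the first has finished, even though the calls come from separate instances.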
Tests added
- manager.loadlock.test.ts: proves two concurrent load() calls on different Managers serialise under the global lock
- manager.reap.test.ts: reap live orphan, skip live-owner, skip dead-port (recycled-pid guard)
- anthropic.cancel.test.ts: reader.cancel() propagates to the upstream body on generator teardown
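The teardown behaviour exercised by anthropic.cancel.test.ts can be illustrated with the standard ReadableStream API. This is a minimal sketch, assuming a runtime with the global WHATWG ReadableStream (for example Node 18+); makeUpstream, relay and demoCancel are hypothetical names and this is not the actual streamToAnthropic code.

```typescript
// Simulated upstream body that records whether its cancel hook ran.
function makeUpstream(state: { cancelled: boolean }): ReadableStream<string> {
  let n = 0;
  return new ReadableStream<string>({
    pull(controller) {
      controller.enqueue(`token-${n++}`);
    },
    cancel() {
      // A real engine response body would stop generating at this point.
      state.cancelled = true;
    },
  });
}

// Relays chunks from the body. If the consumer tears the generator down
// early, the finally block cancels the reader, which notifies the source.
async function* relay(body: ReadableStream<string>): AsyncGenerator<string> {
  const reader = body.getReader();
  try {
    while (true) {
      const result = await reader.read();
      if (result.done) return;
      yield result.value;
    }
  } finally {
    // releaseLock() would only detach the reader and leave the source running.
    await reader.cancel();
  }
}

// Read one chunk, then simulate a client disconnect via generator teardown.
async function demoCancel(): Promise<boolean> {
  const state = { cancelled: false };
  const gen = relay(makeUpstream(state));
  await gen.next();
  await gen.return(undefined);
  return state.cancelled;
}
```

Calling gen.return() runs the finally block, so the source's cancel hook fires and the simulated upstream learns that nobody is reading any more.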