Problem
bin/nebula_queue_worker uses a global $running flag (lines 75–83), which makes shutdown untestable and limits the process to a single worker.
threads.each(&:join) (line 114) has no timeout.
trap('TERM') flips the flag, but a thread inside a long perform (HTTP call, slow DB query) will not return until it finishes. Heroku and K8s (by default) send SIGKILL after 30s, so the process dies exactly the way we feared in the reliable-fetch issue.
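To illustrate the join behaviour: a bare `Thread#join` blocks until the thread ends, whereas `Thread#join(limit)` returns `nil` once the limit elapses. A minimal sketch, with `sleep` standing in for a long `perform` (the durations are arbitrary and not taken from the worker):

```ruby
# A worker thread stuck in a long `perform`; sleep stands in for a slow HTTP call.
worker = Thread.new { sleep 5 }

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

# join(limit) returns nil if the thread is still running after `limit` seconds,
# or the thread itself if it finished in time.
finished = worker.join(0.1)

elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
puts "finished in time: #{!finished.nil?}, waited #{elapsed.round(2)}s"

# A bare `worker.join` here would block for the full 5 seconds.
worker.kill
```

With a bounded join the caller regains control and can decide what to do with the job that is still running, which is what the fix below relies on.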
Impact
Deploys either hang for up to the platform's kill grace period or are force-killed and lose in-flight jobs.
No way to run two isolated workers in one process (tests, embedded use).
Fix
Replace $running with a NebulaQueue::Launcher instance holding its own @running state (e.g. a Concurrent::AtomicBoolean).
On SIGTERM:
Stop the fetcher loop (no new jobs pulled).
Wait up to shutdown_timeout (default 25s on Heroku, configurable) for in-flight threads to finish.
Force-requeue any jobs still checked out (pairs with the reliable-fetch working lists) before exiting.
Exit non-zero if forced termination was required, so the platform logs a clear signal.
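The steps above could look roughly like the following. This is a sketch under stated assumptions, not the proposed implementation: the constructor arguments, the `requeued` array, and the in-memory `@in_flight` hash are all hypothetical stand-ins (the real requeue would go through the reliable-fetch working lists), and a Mutex-guarded boolean is used in place of `Concurrent::AtomicBoolean` so the example runs without the concurrent-ruby gem.

```ruby
module NebulaQueue
  # Hypothetical sketch of the proposed Launcher.
  class Launcher
    attr_reader :requeued

    def initialize(jobs:, concurrency: 2, shutdown_timeout: 25)
      @jobs = jobs                        # a Queue of callables, standing in for the fetcher
      @concurrency = concurrency
      @shutdown_timeout = shutdown_timeout
      @running = true                     # instance state, no process-level global
      @lock = Mutex.new
      @in_flight = {}                     # thread => job currently checked out
      @requeued = []                      # stand-in for pushing back onto the queue
      @threads = []
    end

    def start
      @threads = Array.new(@concurrency) { Thread.new { work_loop } }
      self
    end

    # Returns true if every thread finished in time, false if a forced stop was needed.
    def shutdown
      @lock.synchronize { @running = false }      # stop pulling new jobs
      deadline = now + @shutdown_timeout
      @threads.each do |thread|                   # bounded wait against one shared deadline
        remaining = deadline - now
        thread.join(remaining.positive? ? remaining : 0)
      end
      stragglers = @threads.select(&:alive?)
      stragglers.each(&:kill)
      @lock.synchronize do                        # requeue whatever is still checked out
        stragglers.each do |thread|
          job = @in_flight.delete(thread)
          @requeued << job if job
        end
      end
      stragglers.empty?
    end

    private

    def running?
      @lock.synchronize { @running }
    end

    def work_loop
      while running?
        job = begin
          @jobs.pop(true)                         # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          sleep 0.01
          next
        end
        @lock.synchronize { @in_flight[Thread.current] = job }
        job.call
        @lock.synchronize { @in_flight.delete(Thread.current) }
      end
    end

    def now
      Process.clock_gettime(Process::CLOCK_MONOTONIC)
    end
  end
end

# Demo: one quick job and one job that outlives a deliberately short timeout.
quick = -> { :done }
slow  = -> { sleep 5 }
jobs = Queue.new
jobs << quick
jobs << slow

launcher = NebulaQueue::Launcher.new(jobs: jobs, shutdown_timeout: 0.2).start
sleep 0.1                                         # let the workers pick up both jobs
clean = launcher.shutdown
puts "clean: #{clean}, requeued: #{launcher.requeued.size}"

# Wiring into the worker binary (commented out so the sketch runs standalone):
#   reader, writer = IO.pipe
#   trap("TERM") { writer.puts("TERM") }          # trap handlers must not take locks
#   reader.gets                                   # block until SIGTERM arrives
#   exit(launcher.shutdown ? 0 : 1)               # non-zero when a forced stop was needed
```

Because all state lives on the instance, two launchers can run side by side in one process, and a test can call `shutdown` directly instead of sending a signal. The self-pipe in the commented wiring is one common way to keep the trap handler trivial; it is an assumption here, not something the current worker does.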
Acceptance
SIGTERM during a long-running job returns within shutdown_timeout seconds.
In-flight jobs are either completed or cleanly requeued — never silently dropped.
Shutdown is testable without process-level globals.