Graceful shutdown: worker join has no timeout; stuck perform blocks SIGTERM

## Problem

`bin/nebula_queue_worker`:
- Uses a global `$running` flag (lines 75–83), which makes shutdown untestable and limits each process to a single worker.
- `threads.each(&:join)` (line 114) has no timeout.
- `trap('TERM')` flips the flag, but a thread inside a long `perform` (HTTP call, slow DB query) does not see the change until the job finishes. Heroku sends `SIGKILL` 30s after `SIGTERM`, and Kubernetes does the same by default (`terminationGracePeriodSeconds: 30`), so the process dies exactly the way we feared in the reliable-fetch issue.
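
The missing timeout can be illustrated with plain Ruby threads. This is a minimal sketch, assuming only the stdlib behaviour that `Thread#join(limit)` returns `nil` when the limit elapses; `join_with_deadline` is a hypothetical helper, not code that exists in the worker today.

```ruby
# Join a set of threads against one shared deadline instead of blocking forever.
# Returns the threads that were still alive when the deadline passed.
# Note: Thread#join re-raises any exception that terminated the joined thread.
def join_with_deadline(threads, timeout)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  threads.reject do |thread|
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    # join(limit) returns the thread if it finished in time, nil otherwise.
    thread.join([remaining, 0].max)
  end
end

fast  = Thread.new { sleep 0.01 }
stuck = Thread.new { sleep 5 } # stands in for a long `perform`

stragglers = join_with_deadline([fast, stuck], 0.2)
puts "still running after deadline: #{stragglers.size}"
stragglers.each(&:kill) # demo cleanup only
```

Using one shared deadline, rather than a per-thread timeout, keeps the total wait bounded regardless of how many worker threads there are.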

## Impact

- Deploys either hang for up to the platform's kill grace period or, once it expires, forcibly lose in-flight jobs.
- No way to run two isolated workers in one process (tests, embedded use).

## Fix

1. Replace `$running` with a `NebulaQueue::Launcher` instance holding its own `@running` flag (e.g. a `Concurrent::AtomicBoolean`).
2. On SIGTERM:
   - Stop the fetcher loop (no new jobs pulled).
   - Wait up to `shutdown_timeout` (default 25s on Heroku, configurable) for in-flight threads to finish.
   - Force-requeue any jobs still checked out (pairs with the reliable-fetch working lists) before exiting.
3. Exit non-zero if forced termination was required, so the platform logs a clear signal.
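
The steps above could be sketched roughly as follows. Only the `NebulaQueue::Launcher` name and `shutdown_timeout` come from this issue; the `fetch:` and `requeue:` callables, the `concurrency:` option, and the `SlowJob` demo class are assumptions made for illustration. A plain instance variable stands in for `Concurrent::AtomicBoolean` so the sketch runs on stdlib alone.

```ruby
module NebulaQueue
  # Hypothetical sketch of the proposed launcher, not the final implementation.
  class Launcher
    def initialize(concurrency:, fetch:, requeue:, shutdown_timeout: 25)
      @concurrency = concurrency
      @fetch = fetch     # assumed: callable returning a job (responds to #perform) or nil
      @requeue = requeue # assumed: callable that puts a job back on the queue
      @shutdown_timeout = shutdown_timeout
      @running = true    # per-instance flag; stand-in for Concurrent::AtomicBoolean
      @in_flight = {}    # worker thread => job currently being performed
      @lock = Mutex.new
      @threads = []
    end

    def start
      @threads = Array.new(@concurrency) { Thread.new { work_loop } }
      self
    end

    # Returns true on a clean shutdown, false if any job had to be force-requeued.
    def stop
      @running = false # step 1: no new jobs are pulled
      deadline = now + @shutdown_timeout
      @threads.each { |t| t.join([deadline - now, 0].max) } # step 2: bounded wait
      @threads.select(&:alive?).each(&:kill).each(&:join)
      leftover = @lock.synchronize { @in_flight.values }
      leftover.each { |job| @requeue.call(job) } # step 3: force-requeue
      leftover.empty?
    end

    private

    def now
      Process.clock_gettime(Process::CLOCK_MONOTONIC)
    end

    def work_loop
      while @running
        job = @fetch.call
        (sleep 0.01; next) if job.nil?
        @lock.synchronize { @in_flight[Thread.current] = job }
        begin
          job.perform
        rescue StandardError
          # retry / dead-letter handling is out of scope for this sketch
        end
        # Deliberately not in an `ensure`: a killed thread leaves its job
        # registered so that #stop can requeue it.
        @lock.synchronize { @in_flight.delete(Thread.current) }
      end
    end
  end
end

# Demo: one job that outlives a 0.2s shutdown_timeout.
SlowJob = Struct.new(:id) do
  def perform
    sleep 5
  end
end

queue = Queue.new
queue << SlowJob.new(1)
requeued = []
fetch = lambda do
  queue.pop(true) # non-blocking pop raises ThreadError when empty
rescue ThreadError
  nil
end

launcher = NebulaQueue::Launcher.new(
  concurrency: 1, shutdown_timeout: 0.2,
  fetch: fetch, requeue: ->(job) { requeued << job }
).start
sleep 0.01 until queue.empty? # wait until the worker has picked the job up
clean = launcher.stop
puts "clean shutdown: #{clean}, requeued: #{requeued.map(&:id).inspect}"
```

In the worker script, the `TERM` trap would only set a flag or write to a pipe, since a `Mutex` cannot be locked from trap context. The main thread would then call `stop` and finish with something like `exit(clean ? 0 : 1)`. A job fetched but not yet registered when its thread is killed is not covered by this sketch; that gap is what the reliable-fetch working lists are meant to close.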

## Acceptance

- After SIGTERM during a long-running job, the process exits within `shutdown_timeout` seconds.
- In-flight jobs are either completed or cleanly requeued, never silently dropped.
- Shutdown is testable without process-level globals.
