Skip to content

fix: handle SIGUSR2 reloads in runner mode#561

Merged
buger merged 1 commit intomainfrom
fix/runner-signal-reload
Mar 24, 2026
Merged

fix: handle SIGUSR2 reloads in runner mode#561
buger merged 1 commit intomainfrom
fix/runner-signal-reload

Conversation

@buger
Copy link
Copy Markdown
Contributor

@buger buger commented Mar 24, 2026

Summary

  • handle SIGUSR2 config reloads in runner mode even without --watch
  • keep --watch limited to file watching and avoid duplicate SIGUSR2 listeners
  • add regression coverage for runner-mode reload setup

Validation

  • npx jest tests/unit/runners/config-reload-runtime.test.ts tests/unit/config/config-watcher.test.ts tests/unit/runners/graceful-restart.test.ts --runInBand
  • full pre-commit suite
  • direct runtime repro with node dist/index.js --config tests/fixtures/minimal-config.yaml --mcp --mcp-auth-token testtoken --debug and kill -USR2

@probelabs
Copy link
Copy Markdown
Contributor

probelabs Bot commented Mar 24, 2026

PR Overview: Handle SIGUSR2 Reloads in Runner Mode

Summary

This PR fixes a gap in config reload handling for runner mode (Slack, Telegram, MCP, etc.). Previously, SIGUSR2-based config hot-reload was only available when using the --watch flag. This change ensures SIGUSR2 reloads work in runner mode even without --watch, while preventing duplicate signal listeners when both features are enabled.

Files Changed

Modified Files

  • src/cli-main.ts (-20 lines net)

    • Refactored config watcher setup to use new setupRunnerConfigReloadRuntime() helper
    • Removed inline ConfigWatcher/ConfigReloader instantiation (moved to dedicated module)
    • Updated cleanup paths to use unified configReloadRuntime.cleanup()
    • Simplified graceful shutdown and restart cleanup logic
  • src/config/config-watcher.ts (+1 line)

    • Added optional listenForSignals parameter to start() method
    • Allows disabling SIGUSR2 listener when signal handling is managed externally
    • Defaults to true for backward compatibility

New Files

  • src/runners/config-reload-runtime.ts (+76 lines)

    • New module: setupRunnerConfigReloadRuntime() orchestrates config reload for runner mode
    • Key responsibilities:
      • Registers SIGUSR2 handler unconditionally (non-Windows)
      • Sets up ConfigSnapshotStore and ConfigReloader when --config is provided
      • Conditionally enables file watching when --watch flag is set
      • Prevents duplicate SIGUSR2 listeners by passing { listenForSignals: false } to ConfigWatcher
      • Provides unified cleanup() method for graceful shutdown/restart
    • Behavior without --config: Installs a no-op SIGUSR2 handler that logs a warning (non-fatal)
  • tests/unit/runners/config-reload-runtime.test.ts (+130 lines)

    • Comprehensive unit tests for the new reload runtime module
    • Test coverage:
      • SIGUSR2 reload handling without file watch mode
      • File watching with disabled duplicate SIGUSR2 handler
      • No-op handler when no config path is provided
      • Proper cleanup of watchers, signal handlers, and snapshot store
      • Process signal listener registration/unregistration

Architecture & Impact Assessment

What This PR Accomplishes

  1. Enables SIGUSR2 config reload in runner mode without requiring --watch

    • Operators can now trigger config reloads via kill -USR2 <pid> or visor process reload even when not watching files
    • File watching remains optional via --watch flag
  2. Eliminates duplicate SIGUSR2 listeners

    • Previously, both the manual signal handler and ConfigWatcher would register for SIGUSR2 when --watch was enabled
    • Now ConfigWatcher's signal listener is disabled when managed by the reload runtime
  3. Centralizes config reload lifecycle management

    • Single cleanup() method handles all teardown (watchers, signals, snapshot store)
    • Consistent cleanup across normal shutdown and graceful restart paths

Key Technical Changes

Signal Handling Flow

flowchart TD
    A[Process receives SIGUSR2] --> B{Config path provided?}
    B -->|Yes| C[ConfigReloader.reload]
    B -->|No| D[Log warning: no --config]
    C --> E[Load & validate config]
    E --> F[Save snapshot]
    F --> G[onSwap callback]
    G --> H[Update config reference]
    H --> I[RunnerHost.broadcastConfigUpdate]
    I --> J[All runners receive new config]
    
    K[File change detected] --> L{--watch enabled?}
    L -->|Yes| M[ConfigWatcher debouncedReload]
    L -->|No| N[No action]
    M --> O[ConfigReloader.reload]
    
    style C fill:#e1f5e1
    style M fill:#e1f5e1
    style D fill:#ffe1e1
Loading

Component Relationships

classDiagram
    class CLI {
        +watch: boolean
        +configPath: string
    }
    
    class ConfigReloadRuntime {
        +setupRunnerConfigReloadRuntime()
        +cleanup()
        -snapshotStore: ConfigSnapshotStore
        -reloader: ConfigReloader
        -watcher: ConfigWatcher
        -signalHandler: Function
    }
    
    class ConfigWatcher {
        +start(options: {listenForSignals})
        +stop()
        -watchFile()
        -debouncedReload()
    }
    
    class ConfigReloader {
        +reload(): Promise~boolean~
        -loadConfig()
        -saveSnapshot()
        -onSwap()
    }
    
    class RunnerHost {
        +broadcastConfigUpdate()
        +stopAll()
        +stopListeningAll()
        +drainAll()
    }
    
    class GracefulRestartManager {
        +initiateRestart()
        +onCleanup()
        -spawnNewProcess()
        -waitForChildReady()
    }
    
    class ConfigSnapshotStore
    
    CLI --> ConfigReloadRuntime
    ConfigReloadRuntime --> ConfigWatcher
    ConfigReloadRuntime --> ConfigReloader
    ConfigReloadRuntime --> ConfigSnapshotStore
    ConfigReloader --> RunnerHost
    GracefulRestartManager --> ConfigReloadRuntime
    GracefulRestartManager --> RunnerHost
Loading

Affected System Components

Direct Impact

  • Runner mode entry point (cli-main.ts): Simplified config reload setup
  • ConfigWatcher: New parameter to disable internal signal listener
  • Graceful restart: Uses unified cleanup callback

Indirect Impact

  • All runner types (Slack, Telegram, MCP, etc.): Now receive config updates via SIGUSR2 without --watch
  • Process CLI handler (visor process reload): Works consistently regardless of --watch flag
  • Config snapshot store: Now initialized for all SIGUSR2 reloads (when --config provided)

No Impact

  • Non-runner modes (single-shot CLI execution)
  • Windows platforms (SIGUSR2 not supported)

Scope Discovery & Context Expansion

Related Modules & Services

Configuration Pipeline

  • src/config/config-snapshot-store.ts: Stores config snapshots on each reload for version history
  • src/config/config-reloader.ts: Orchestrates load → validate → snapshot → swap flow
  • src/config/config-watcher.ts: File watching with debouncing and dependency tracking

Runner Infrastructure

  • src/runners/runner-host.ts: Broadcasts config updates to all active runners
  • src/runners/graceful-restart.ts: Manages process restart with cleanup callbacks
  • src/runners/process-cli-handler.ts: CLI command for sending SIGUSR2 to running processes
  • src/runners/process-discovery.ts: Finds and signals Visor processes

Individual Runners (Config Update Recipients)

  • src/slack/socket-runner.ts: Implements updateConfig() to hot-swap config
  • src/telegram/runner.ts: Implements updateConfig() for Telegram bot
  • src/mcp/mcp-server-runner.ts: Implements updateConfig() for MCP server

Testing Coverage

New Tests

  • tests/unit/runners/config-reload-runtime.test.ts: 130 lines covering:
    • SIGUSR2 handler registration without file watch
    • File watching with disabled duplicate signal listener
    • No-op handler when no config path
    • Cleanup of all resources (watchers, signals, snapshot store)

Existing Tests (Validated by Author)

  • tests/unit/config/config-watcher.test.ts: File watching, debouncing, dependency tracking
  • tests/unit/runners/graceful-restart.test.ts: Graceful restart with cleanup callbacks

Integration Validation

  • Manual testing with node dist/index.js --config ... --mcp --debug and kill -USR2 <pid>

Configuration & Documentation

Relevant Config Options

  • --config <path>: Required for SIGUSR2 reloads to load new config
  • --watch: Optional file watching (independent of signal-based reload)
  • graceful_restart.*: Configures restart behavior (drain timeout, child ready timeout, etc.)

Documentation References

  • docs/guides/graceful-restart.md: Explains SIGUSR1 (restart) and SIGUSR2 (reload) usage
  • docs/commands.md: Documents visor process reload command
  • docs/bot-integrations.md: Describes runner mode config watching

References

Source Code

  • src/cli-main.ts:1818-1890 - Runner mode config reload setup (refactored)
  • src/config/config-watcher.ts:113-134 - ConfigWatcher.start() with listenForSignals option
  • src/runners/config-reload-runtime.ts:1-76 - New module (complete implementation)
  • tests/unit/runners/config-reload-runtime.test.ts:1-130 - New test suite

Related Code

  • src/config/config-reloader.ts:1-77 - ConfigReloader.reload() orchestrates reload flow
  • src/runners/graceful-restart.ts:85-92 - onCleanup() callback registration
  • src/runners/runner-host.ts:60-65 - broadcastConfigUpdate() propagates to all runners
  • src/runners/process-cli-handler.ts:79-92 - handleReload() sends SIGUSR2
  • src/runners/process-discovery.ts:57-64 - signalProcess() implementation

Tests

  • tests/unit/config/config-watcher.test.ts:1-289 - File watching and debouncing tests
  • tests/unit/runners/graceful-restart.test.ts:228-245 - Cleanup callback execution tests
Metadata
  • Review Effort: 2 / 5
  • Primary Label: bug

Powered by Visor from Probelabs

Last updated: 2026-03-24T07:29:29.925Z | Triggered by: pr_opened | Commit: 0852a05

💡 TIP: You can chat with Visor using /visor ask <your question>

@probelabs
Copy link
Copy Markdown
Contributor

probelabs Bot commented Mar 24, 2026

Security Issues (7)

Severity Location Issue
🟠 Error src/cli-main.ts:1852
configReloadRuntime.cleanup() is called without checking if configReloadRuntime is defined. If setupRunnerConfigReloadRuntime threw an error and was caught (per the suggestion above), this would cause a TypeError.
💡 SuggestionAdd optional chaining or a null check before calling cleanup.
🔧 Suggested Fix
          await configReloadRuntime?.cleanup();
🟠 Error src/cli-main.ts:1877
configReloadRuntime.cleanup() is called in the restartManager.onCleanup callback without checking if configReloadRuntime is defined. Same issue as line 1852.
💡 SuggestionAdd optional chaining or a null check before calling cleanup.
🔧 Suggested Fix
          await configReloadRuntime?.cleanup();
🟡 Warning src/runners/config-reload-runtime.ts:42
The signal handler wraps reloader.reload() in void without error handling. If reload throws an unhandled exception, it could crash the process. Signal handlers should be defensive and never throw.
💡 SuggestionWrap the reload call in a try-catch block to prevent unhandled exceptions in the signal handler from crashing the process.
🔧 Suggested Fix
        signalHandler = () => {
          logger.info('[ConfigReload] Received SIGUSR2, triggering config reload');
          try {
            void reloader.reload();
          } catch (err) {
            logger.error(`[ConfigReload] Reload failed: ${err}`);
          }
        };
🟡 Warning src/runners/config-reload-runtime.ts:35
If snapshotStore.initialize() throws on line 36, the signalHandler registered on line 42 will never be cleaned up, causing a resource leak. The try-catch on line 38 only catches errors during watcher setup, not during snapshotStore initialization.
💡 SuggestionMove the signalHandler registration inside the try block or ensure cleanup happens if initialization fails.
🔧 Suggested Fix
    if (process.platform !== 'win32') {
      if (options.configPath) {
        try {
          snapshotStore = new ConfigSnapshotStore();
          await snapshotStore.initialize();
      const reloader = new ConfigReloader({
        configPath: options.configPath,
        configManager: options.configManager,
        snapshotStore,
        onSwap: options.onSwap,
      });

      signalHandler = () =&gt; {
        logger.info(&#39;[ConfigReload] Received SIGUSR2, triggering config reload&#39;);
        void reloader.reload();
      };
      process.on(&#39;SIGUSR2&#39;, signalHandler);
      logger.info(&#39;[ConfigReload] Send SIGUSR2 to hot-reload configuration&#39;);</code></pre>
🟡 Warning src/runners/config-reload-runtime.ts:70
The cleanup function removes the signal handler before stopping the watcher. If a SIGUSR2 signal arrives between removing the handler and stopping the watcher, it could be processed by a different handler or cause unexpected behavior. The order should be: stop watcher, then remove signal handler, then shutdown store.
💡 SuggestionReorder cleanup operations to prevent race conditions: stop watcher first, then remove signal handler, then shutdown snapshot store.
🔧 Suggested Fix
    return {
      async cleanup(): Promise<void> {
        if (watcher) watcher.stop();
        if (snapshotStore) {
          await snapshotStore.shutdown().catch(() => {});
        }
        if (signalHandler && process.platform !== 'win32') {
          process.removeListener('SIGUSR2', signalHandler);
        }
      },
    };
🟡 Warning src/cli-main.ts:1827
The setupRunnerConfigReloadRuntime is called without awaiting its result in a try-catch block. If it throws during initialization, the entire runner startup will fail. The original code had a try-catch around config watcher setup that allowed runners to continue without it.
💡 SuggestionWrap the setupRunnerConfigReloadRuntime call in a try-catch to allow runners to continue even if config reload setup fails, matching the original behavior.
🔧 Suggested Fix
      const { setupRunnerConfigReloadRuntime } = await import('./runners/config-reload-runtime');
      let configReloadRuntime: RunnerConfigReloadRuntime | undefined;
      try {
        configReloadRuntime = await setupRunnerConfigReloadRuntime({
          configPath: options.configPath,
          watch: Boolean(options.watch),
          configManager,
          onSwap: newConfig => {
            config = newConfig;
            host.broadcastConfigUpdate(newConfig);
            logger.info('[Watch] Config updated');
          },
        });
      } catch (reloadErr: unknown) {
        logger.warn(`Config reload setup failed (runners continue without it): ${reloadErr}`);
      }
🟡 Warning src/config/config-watcher.ts:116
The signalHandler in ConfigWatcher calls this.safeReload() without error handling. If safeReload throws, the entire process could crash from a signal handler.
💡 SuggestionWrap the safeReload call in a try-catch to prevent signal handler exceptions from crashing the process.
🔧 Suggested Fix
      this.signalHandler = () => {
        logger.info('[ConfigWatcher] Received SIGUSR2, triggering config reload');
        try {
          this.safeReload();
        } catch (err) {
          logger.error(`[ConfigWatcher] Reload failed: ${err}`);
        }
      };

Architecture Issues (1)

Severity Location Issue
🟠 Error contract:0
Output schema validation failed: must have required property 'issues'

Security Issues (7)

Severity Location Issue
🟠 Error src/cli-main.ts:1852
configReloadRuntime.cleanup() is called without checking if configReloadRuntime is defined. If setupRunnerConfigReloadRuntime threw an error and was caught (per the suggestion above), this would cause a TypeError.
💡 SuggestionAdd optional chaining or a null check before calling cleanup.
🔧 Suggested Fix
          await configReloadRuntime?.cleanup();
🟠 Error src/cli-main.ts:1877
configReloadRuntime.cleanup() is called in the restartManager.onCleanup callback without checking if configReloadRuntime is defined. Same issue as line 1852.
💡 SuggestionAdd optional chaining or a null check before calling cleanup.
🔧 Suggested Fix
          await configReloadRuntime?.cleanup();
🟡 Warning src/runners/config-reload-runtime.ts:42
The signal handler wraps reloader.reload() in void without error handling. If reload throws an unhandled exception, it could crash the process. Signal handlers should be defensive and never throw.
💡 SuggestionWrap the reload call in a try-catch block to prevent unhandled exceptions in the signal handler from crashing the process.
🔧 Suggested Fix
        signalHandler = () => {
          logger.info('[ConfigReload] Received SIGUSR2, triggering config reload');
          try {
            void reloader.reload();
          } catch (err) {
            logger.error(`[ConfigReload] Reload failed: ${err}`);
          }
        };
🟡 Warning src/runners/config-reload-runtime.ts:35
If snapshotStore.initialize() throws on line 36, the signalHandler registered on line 42 will never be cleaned up, causing a resource leak. The try-catch on line 38 only catches errors during watcher setup, not during snapshotStore initialization.
💡 SuggestionMove the signalHandler registration inside the try block or ensure cleanup happens if initialization fails.
🔧 Suggested Fix
    if (process.platform !== 'win32') {
      if (options.configPath) {
        try {
          snapshotStore = new ConfigSnapshotStore();
          await snapshotStore.initialize();
      const reloader = new ConfigReloader({
        configPath: options.configPath,
        configManager: options.configManager,
        snapshotStore,
        onSwap: options.onSwap,
      });

      signalHandler = () =&gt; {
        logger.info(&#39;[ConfigReload] Received SIGUSR2, triggering config reload&#39;);
        void reloader.reload();
      };
      process.on(&#39;SIGUSR2&#39;, signalHandler);
      logger.info(&#39;[ConfigReload] Send SIGUSR2 to hot-reload configuration&#39;);</code></pre>
🟡 Warning src/runners/config-reload-runtime.ts:70
The cleanup function removes the signal handler before stopping the watcher. If a SIGUSR2 signal arrives between removing the handler and stopping the watcher, it could be processed by a different handler or cause unexpected behavior. The order should be: stop watcher, then remove signal handler, then shutdown store.
💡 SuggestionReorder cleanup operations to prevent race conditions: stop watcher first, then remove signal handler, then shutdown snapshot store.
🔧 Suggested Fix
    return {
      async cleanup(): Promise<void> {
        if (watcher) watcher.stop();
        if (snapshotStore) {
          await snapshotStore.shutdown().catch(() => {});
        }
        if (signalHandler && process.platform !== 'win32') {
          process.removeListener('SIGUSR2', signalHandler);
        }
      },
    };
🟡 Warning src/cli-main.ts:1827
The setupRunnerConfigReloadRuntime is called without awaiting its result in a try-catch block. If it throws during initialization, the entire runner startup will fail. The original code had a try-catch around config watcher setup that allowed runners to continue without it.
💡 SuggestionWrap the setupRunnerConfigReloadRuntime call in a try-catch to allow runners to continue even if config reload setup fails, matching the original behavior.
🔧 Suggested Fix
      const { setupRunnerConfigReloadRuntime } = await import('./runners/config-reload-runtime');
      let configReloadRuntime: RunnerConfigReloadRuntime | undefined;
      try {
        configReloadRuntime = await setupRunnerConfigReloadRuntime({
          configPath: options.configPath,
          watch: Boolean(options.watch),
          configManager,
          onSwap: newConfig => {
            config = newConfig;
            host.broadcastConfigUpdate(newConfig);
            logger.info('[Watch] Config updated');
          },
        });
      } catch (reloadErr: unknown) {
        logger.warn(`Config reload setup failed (runners continue without it): ${reloadErr}`);
      }
🟡 Warning src/config/config-watcher.ts:116
The signalHandler in ConfigWatcher calls this.safeReload() without error handling. If safeReload throws, the entire process could crash from a signal handler.
💡 SuggestionWrap the safeReload call in a try-catch to prevent signal handler exceptions from crashing the process.
🔧 Suggested Fix
      this.signalHandler = () => {
        logger.info('[ConfigWatcher] Received SIGUSR2, triggering config reload');
        try {
          this.safeReload();
        } catch (err) {
          logger.error(`[ConfigWatcher] Reload failed: ${err}`);
        }
      };
\n\n ### Architecture Issues (1)
Severity Location Issue
🟠 Error contract:0
Output schema validation failed: must have required property 'issues'
\n\n ### Performance Issues (1)
Severity Location Issue
🟠 Error src/runners/config-reload-runtime.ts:42
SIGUSR2 signal handler calls async reloader.reload() without proper error handling, risking unhandled promise rejections that could crash the process
💡 SuggestionWrap the reload() call in proper error handling similar to ConfigWatcher.safeReload() pattern: use .catch() to handle errors and prevent unhandled rejections. Consider making the handler async with try-catch or use the fire-and-forget pattern with explicit error handling.
🔧 Suggested Fix
        signalHandler = () => {
          logger.info('[ConfigReload] Received SIGUSR2, triggering config reload');
          void reloader.reload().catch(err => {
            logger.error(`[ConfigReload] Reload failed: ${err}`);
          });
        };

Powered by Visor from Probelabs

Last updated: 2026-03-24T07:24:06.900Z | Triggered by: pr_opened | Commit: 0852a05

💡 TIP: You can chat with Visor using /visor ask <your question>

@buger buger merged commit 72911ce into main Mar 24, 2026
12 of 16 checks passed
@buger buger deleted the fix/runner-signal-reload branch March 24, 2026 07:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant