Skip to content

Conversation

@MasterPtato
Copy link
Contributor

@MasterPtato MasterPtato commented Jan 14, 2026

Fixes RVT-5333

@linear
Copy link

linear bot commented Jan 14, 2026

Copy link
Contributor Author

MasterPtato commented Jan 14, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude
Copy link

claude bot commented Jan 14, 2026

Code Review - PR #3896: fix(engine-runner): handle shutdown close codes correctly

Overview

This PR refactors the shutdown handling logic in the Runner class by:

  1. Removing the redundant #shuttingDown flag
  2. Consolidating shutdown state into a single #shutdown flag
  3. Restructuring the WebSocket close event handler to better handle shutdown scenarios

Positive Changes

1. Elimination of Redundant State

  • Removes the #shuttingDown flag, which was redundant with #shutdown
  • Previously, both flags were being set, creating unnecessary complexity
  • This simplification makes the shutdown state easier to reason about

2. Improved Control Flow

  • The refactored WebSocket close handler now has clearer logic:
    • If NOT shutting down → handle reconnection logic
    • If shutting down → log and notify, but don't reconnect
  • This prevents race conditions where reconnection attempts could occur during shutdown

3. Consistent State Management

  • Setting #shutdown = true at the start of the shutdown() method ensures all subsequent checks work correctly
  • The guard clause (if (this.#shutdown)) now properly prevents concurrent shutdown attempts

Concerns & Suggestions

1. Potential Race Condition in Shutdown Method ⚠️

The shutdown() method is marked as async, but immediately sets #shutdown = true before any cleanup:

async shutdown(immediate: boolean, exit: boolean = false) {
    if (this.#shutdown) {
        return;
    }
    this.#shutdown = true;  // Set immediately

    // ... cleanup code follows

Issue: If the WebSocket close event fires during the shutdown cleanup operations (which is likely, since shutdown may close the WebSocket), the close handler will see #shutdown = true and take the else branch. This is actually correct behavior and the PR handles it well.

However, there's a subtle concern: The old code structure had the shutdown check as if (this.#shuttingDown) and only set this.#shutdown = true later. This meant there was a brief window where shutdown was in progress but #shutdown was still false. The new code closes this window, which is good.

2. Missing Special Case Handling ⚠️

The old code had special handling for the pegboard.runner_shutdown close reason:

// OLD CODE (removed):
if (closeError?.group === "pegboard" && closeError?.error === "runner_shutdown") {
    this.log?.info("runner shutdown");
}

This special case has been removed entirely.

Question: Is this intentional? If the server sends a pegboard.runner_shutdown close code, it will now be treated as a normal disconnection (logged with log.warn) rather than a graceful shutdown signal. This could be:

  • Correct if the runner should handle server-initiated shutdowns as unexpected disconnections
  • Incorrect if pegboard.runner_shutdown should trigger a graceful shutdown on the client side

Recommendation: Clarify whether pegboard.runner_shutdown should still be a special case. If yes, it should trigger await this.shutdown(true) similar to the eviction case.

3. onDisconnected Called Multiple Times ℹ️

The onDisconnected callback is now called in multiple places within the close handler:

  • When evicted: this.#config.onDisconnected(ev.code, ev.reason); → then await this.shutdown(true)
  • On normal disconnect: this.#config.onDisconnected(ev.code, ev.reason);
  • During shutdown: this.#config.onDisconnected(ev.code, ev.reason);

This is consistent with the old behavior, but worth noting that consumers of this API should expect onDisconnected to be called in all these scenarios.

4. Cleanup Ordering

The #ackInterval is now only cleared when NOT shutting down. This is correct because the shutdown() method already clears all intervals:

async shutdown(immediate: boolean, exit: boolean = false) {
    // ...
    if (this.#ackInterval) {
        clearInterval(this.#ackInterval);
        this.#ackInterval = undefined;
    }
    // ...
}

However, this creates a slight redundancy. If the WebSocket closes during normal operation (not shutdown), the interval is cleared in the close handler. If it closes during shutdown, the interval is cleared in the shutdown() method. This is fine but creates two code paths that achieve the same result.

Testing Considerations

Missing Test Coverage

Looking at tests/lifecycle.test.ts, the entire file is commented out. This PR changes critical shutdown logic but there are no active tests to verify:

  1. That concurrent shutdown attempts are properly rejected
  2. That WebSocket close during shutdown doesn't trigger reconnection
  3. That different close codes are handled correctly
  4. That the pegboard.runner_shutdown case works as intended (or is intentionally removed)

Recommendation:

  • Uncomment and update the existing tests
  • Add specific test cases for:
    • Shutdown racing with WebSocket close
    • Different close code scenarios
    • Eviction handling
    • Server-initiated shutdown (if still supported)

Performance Considerations

No Concerns

  • The changes reduce complexity and memory usage (one fewer boolean flag)
  • No performance regressions expected

Security Considerations

No Concerns

  • No new security vulnerabilities introduced
  • Proper cleanup and state management maintained

Code Style & Conventions

Follows CLAUDE.md

  • Uses proper TypeScript conventions
  • Follows existing patterns in the codebase
  • Structured logging with lowercase messages (e.g., "websocket closed")
  • No formatting issues

Summary

Overall Assessment: The PR makes good improvements to the shutdown logic by eliminating redundancy and improving clarity. However, there are two key questions that should be addressed:

  1. Is the removal of pegboard.runner_shutdown special case handling intentional? This seems like a significant behavioral change that should be explicitly documented or reconsidered.

  2. Why are tests disabled? Critical path logic like shutdown handling should have test coverage.

Recommendation:

  • ✅ Approve the structural improvements (removing #shuttingDown, reorganizing close handler)
  • ⚠️ Request clarification on the pegboard.runner_shutdown handling
  • ⚠️ Request test coverage before merging

Risk Level: Medium - The core logic appears sound, but the removal of special case handling could cause issues if it was serving an important purpose.

@pkg-pr-new
Copy link

pkg-pr-new bot commented Jan 14, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@3896

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@3896

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@3896

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@3896

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@3896

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@3896

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@3896

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@3896

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@3896

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@3896

commit: d6cf484

@MasterPtato MasterPtato force-pushed the 01-14-fix_engine-runner_handle_shutdown_close_codes_correctly branch from 600206d to 55a38d5 Compare January 14, 2026 22:45
@MasterPtato MasterPtato force-pushed the 01-13-chore_gas_add_overview_and_history_docs branch 2 times, most recently from 654b4e9 to c023ef5 Compare January 14, 2026 22:47
@MasterPtato MasterPtato force-pushed the 01-14-fix_engine-runner_handle_shutdown_close_codes_correctly branch 2 times, most recently from 715b5e8 to d6cf484 Compare January 14, 2026 22:52
@MasterPtato MasterPtato force-pushed the 01-13-chore_gas_add_overview_and_history_docs branch from c023ef5 to af232ee Compare January 14, 2026 22:52
@MasterPtato MasterPtato force-pushed the 01-13-chore_gas_add_overview_and_history_docs branch from af232ee to 1c80e13 Compare January 14, 2026 23:02
@MasterPtato MasterPtato force-pushed the 01-14-fix_engine-runner_handle_shutdown_close_codes_correctly branch from d6cf484 to 13e2511 Compare January 14, 2026 23:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants