@NathanFlurry
Member

No description provided.

@vercel

vercel bot commented Nov 19, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
rivetkit-serverless Ready Ready Preview Comment Nov 19, 2025 9:21pm
3 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
rivet-cloud Ignored Ignored Nov 19, 2025 9:21pm
rivet-inspector Ignored Ignored Preview Nov 19, 2025 9:21pm
rivet-site Ignored Ignored Preview Nov 19, 2025 9:21pm

Member Author

NathanFlurry commented Nov 19, 2025

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude

claude bot commented Nov 19, 2025

Code Review

I've reviewed this PR and the changes look good overall. The PR removes the echo close behavior on ToClientWebSocketClose and simplifies the WebSocket close handling logic.

Summary of Changes

The PR modifies two TypeScript files in the engine runner SDK:

  1. tunnel.ts:

    • Removes the hibernate parameter from the close callback signature
    • Hardcodes hibernate: false in ToServerWebSocketClose messages sent by the runner
    • Replaces _closeWithHibernate() call with standard close() on timeout
    • Adds clarifying comment that close response isn't needed when handling ToClientWebSocketClose
    • Improves error message for dropped tunnel data
  2. websocket-tunnel-adapter.ts:

    • Removes the hibernate parameter from the close callback signature
    • Removes the _closeWithHibernate() method entirely
    • Simplifies #closeInner() by removing the hibernate parameter
    • Updates all call sites to use the simplified signature

Analysis

Protocol Correctness

Based on the protocol schema (v3.bare:265-269), ToServerWebSocketClose has a hibernate: bool field. This PR always sets it to false from the runner side, which is correct based on the architectural design:

  • Per pegboard-gateway/src/lib.rs:428-433, the gateway checks if both can_hibernate && close.hibernate to trigger hibernation
  • Per the hibernation docs (line 215): "Automatically hibernates WS, runner does not need to do anything"
  • The runner shouldn't be deciding hibernation - that's the gateway's responsibility based on whether the connection was opened as hibernatable

Behavioral Impact

Looking at pegboard-gateway/src/lib.rs:425-433:

protocol::ToServerTunnelMessageKind::ToServerWebSocketClose(close) => {
    if can_hibernate && close.hibernate {
        return Err(WebSocketServiceHibernate.build());
    } else {
        return Ok(LifecycleResult::ServerClose(close));
    }
}

Since close.hibernate is now always false, the gateway will never enter the hibernation path from a runner-initiated close. This is correct because:

  1. Hibernation should occur when the gateway/pegboard closes the connection (via ToClientWebSocketClose)
  2. When the runner closes, it's an explicit termination, not hibernation
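
To make the reasoning above concrete, here is a minimal TypeScript model of the quoted Rust branch. It is only a sketch: the real check lives in the gateway's Rust code, and the names resolveRunnerClose and GatewayOutcome are hypothetical, introduced here for illustration.

```typescript
// Hypothetical TypeScript model of the Rust branch quoted above.
// Only the `hibernate` flag and the `can_hibernate && close.hibernate`
// condition come from the thread; everything else is illustrative.
type ToServerWebSocketClose = {
  code: number | null;
  reason: string | null;
  hibernate: boolean;
};

type GatewayOutcome = "hibernate" | "server_close";

// Mirrors `if can_hibernate && close.hibernate { ... } else { ... }`.
function resolveRunnerClose(
  canHibernate: boolean,
  close: ToServerWebSocketClose,
): GatewayOutcome {
  return canHibernate && close.hibernate ? "hibernate" : "server_close";
}

// After this PR the runner always sends hibernate: false.
const runnerClose: ToServerWebSocketClose = {
  code: 1000,
  reason: "ws.ack_timeout",
  hibernate: false,
};

// Both calls take the "server_close" branch, whatever canHibernate is.
console.log(resolveRunnerClose(true, runnerClose));
console.log(resolveRunnerClose(false, runnerClose));
```

Under this model, the hibernate branch is only reachable when both flags are true, so a runner that always sends false can never select it.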

Code Quality

  • Removes unused complexity from the WebSocket adapter
  • Simplifies the close callback signature
  • The removal of _closeWithHibernate() method makes the API cleaner
  • Better error messages ("ws.ack_timeout" is more specific than "Message acknowledgment timeout")

Potential Concerns

Minor: Protocol Field Unused 🟡

The hibernate field in ToServerWebSocketClose is now always false from the runner. Consider whether this field should be removed from the protocol schema in a future protocol version, or if there are other consumers that might use it. However, this is not blocking for this PR.

Question: Zombie Connection Cleanup

In tunnel.ts:513-524, the timeout handler now uses close(1000, "ws.ack_timeout") instead of _closeWithHibernate(). Can you confirm this won't affect the cleanup of zombie/stale connections during actor restoration? Based on the docs (Phase 3.2/3.3 in HIBERNATABLE_CONNECTIONS.md), it seems fine, but worth confirming.

Recommendations

Approve: The changes are correct and improve code clarity. The removal of the echo close behavior aligns with the documented architecture where hibernation is managed by the gateway, not the runner.

Suggestion: Consider adding a comment in the protocol schema or documentation explaining that ToServerWebSocketClose.hibernate should always be false from the runner side, to prevent future confusion.

Test Coverage: Ensure integration tests cover:

  • Normal WebSocket close flow (runner-initiated)
  • Gateway-initiated close with hibernation
  • Timeout-based cleanup with the new close code

Great cleanup! 🎉

@MasterPtato MasterPtato force-pushed the 11-13-chore_rivetkit_implement_new_hibernating_ws_protocol branch from 647d688 to debe04e Compare November 19, 2025 21:19
@MasterPtato MasterPtato force-pushed the 11-19-chore_engine-runner_remove_echo_close_on_toclientwebsocketclose branch from 972242d to 556b074 Compare November 19, 2025 21:19
@NathanFlurry NathanFlurry force-pushed the 11-19-chore_engine-runner_remove_echo_close_on_toclientwebsocketclose branch from 556b074 to 972242d Compare November 19, 2025 21:22
@NathanFlurry NathanFlurry force-pushed the 11-13-chore_rivetkit_implement_new_hibernating_ws_protocol branch from debe04e to 647d688 Compare November 19, 2025 21:22
@claude

claude bot commented Nov 19, 2025

Updated Code Review

I've reviewed this PR again with fresh analysis. The changes are solid and correctly simplify the WebSocket close handling.

Summary

This PR removes the runner's ability to echo close messages with the hibernate flag, always setting it to false. This aligns with the correct architecture where hibernation decisions are made by the gateway, not the runner.

Key Changes Analysis

1. Protocol Alignment

The ToServerWebSocketClose protocol message includes a hibernate: bool field (v3.bare:268). This PR ensures the runner always sends false, which is architecturally correct:

  • Gateway controls hibernation based on can_hibernate && close.hibernate (pegboard-gateway/src/lib.rs:428)
  • Runner shouldn't decide hibernation - that's the gateway's job based on connection capabilities
  • Comment at tunnel.ts:1070 correctly notes "We don't need to send a close response"

2. Removed Complexity

  • Removed unused _closeWithHibernate() method
  • Simplified #closeInner() signature from 3 to 2 parameters
  • Cleaner close callback: (code?, reason?) instead of (code?, reason?, hibernate?)
  • Better error message: "ws.ack_timeout" vs "Message acknowledgment timeout"
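
The simplified callback shape described above might look roughly like the following. This is a sketch, not the code in tunnel.ts: makeCloseCallback and the send parameter are hypothetical, and the message fields follow the ToServerWebSocketClose type quoted elsewhere in this thread.

```typescript
// Hypothetical sketch of the two-parameter close callback.
// Before the PR the callback was (code?, reason?, hibernate?);
// after it, the callback is (code?, reason?) and hibernate is fixed.
type ToServerWebSocketClose = {
  code: number | null;
  reason: string | null;
  hibernate: boolean;
};

type CloseCallback = (code?: number, reason?: string) => void;

// `send` stands in for whatever actually writes to the tunnel (assumed).
function makeCloseCallback(
  send: (msg: ToServerWebSocketClose) => void,
): CloseCallback {
  return (code, reason) => {
    // The runner no longer decides hibernation, so the flag is constant.
    send({ code: code ?? null, reason: reason ?? null, hibernate: false });
  };
}

const sent: ToServerWebSocketClose[] = [];
const onClose = makeCloseCallback((msg) => {
  sent.push(msg);
});

onClose(1000, "ws.ack_timeout");
console.log(sent[0]);
```

Because the flag is set in one place, callers cannot accidentally request hibernation through the callback.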

3. Behavioral Correctness

The timeout handler change (tunnel.ts:516) from _closeWithHibernate() to close() is correct:

  • Both ultimately call the close callback
  • Timeout scenarios should terminate, not hibernate
  • Cleanup still happens via actor.webSockets.delete()

4. Error Message Improvement 👍

Line 400 improves logging: "cannot send tunnel message, socket not connected to engine. tunnel data dropped." - more explicit about data loss.

Testing Observations

The test file (lifecycle.test.ts) is currently commented out. While this PR's changes are straightforward, consider:

  • Re-enabling WebSocket close tests to verify behavior
  • Testing timeout-based cleanup with new close code
  • Verifying gateway correctly handles hibernate: false

Minor Suggestions

  1. Documentation: Consider adding a comment in the protocol schema explaining ToServerWebSocketClose.hibernate is always false from runners to prevent future confusion.

  2. Protocol Evolution: In a future protocol version, consider removing the hibernate field from ToServerWebSocketClose since it's now unused, or documenting its specific use cases.

Verdict

✅ LGTM - Clean refactoring that correctly implements the architecture. The removal of echo close behavior and the hibernate parameter simplifies the codebase without breaking functionality.


Files reviewed:

  • engine/sdks/typescript/runner/src/tunnel.ts:317-328
  • engine/sdks/typescript/runner/src/tunnel.ts:513-516
  • engine/sdks/typescript/runner/src/tunnel.ts:1067-1076
  • engine/sdks/typescript/runner/src/websocket-tunnel-adapter.ts:65-68
  • engine/sdks/typescript/runner/src/websocket-tunnel-adapter.ts:320-337

@claude

claude bot commented Nov 19, 2025

Code Review

Thanks for this cleanup! This PR simplifies the WebSocket close handling by removing the hibernate parameter. Here's my detailed review:

✅ Positive Changes

  1. Simplification: Removing the unused hibernate parameter from the close callback signature makes the code cleaner and easier to understand.

  2. Consistent behavior: All ToServerWebSocketClose messages now consistently set hibernate: false, which aligns with the comment at tunnel.ts:1070 indicating "We don't need to send a close response".

  3. Removed dead code: The _closeWithHibernate method has been properly removed as it's no longer needed.

  4. Better error message: The improved log message at tunnel.ts:400 (cannot send tunnel message, socket not connected to engine. tunnel data dropped.) provides better clarity about what happens when messages can't be sent.

🔍 Observations & Questions

  1. Hibernate always false: The PR hardcodes hibernate: false in all ToServerWebSocketClose messages. Is this intentional, or was the hibernate feature planned but never implemented? If hibernation was a planned feature, this PR effectively removes that capability permanently.

  2. Protocol consistency: The ToServerWebSocketClose protocol message still includes the hibernate field (as seen in stringify.ts:64). If this field will always be false, consider whether it should be removed from the protocol definition as well for consistency.

  3. Comment clarity at tunnel.ts:516: The timeout handler now calls webSocket.close(1000, "ws.ack_timeout") instead of the previous _closeWithHibernate call. The close reason also changed from "Message acknowledgment timeout" to "ws.ack_timeout". Is this an intentional change of naming convention?

🐛 Potential Issues

  1. State machine consistency: In websocket-tunnel-adapter.ts, the _handleClose method at line 232 calls this.#closeInner(code, reason, true) (with callback=true), while close() at line 494 also calls this.#closeInner(code, reason, true). However, when the engine sends a ToClientWebSocketClose, the adapter calls _handleClose which will trigger the close callback, even though the comment at tunnel.ts:1070 says "We don't need to send a close response".

    Issue: When the engine closes the WebSocket via ToClientWebSocketClose, the adapter will still try to send a ToServerWebSocketClose back because _handleClose has callback=true. This could create an unnecessary echo/loop of close messages.

    Suggested fix: Change line 232 in websocket-tunnel-adapter.ts:

    // Change from:
    this.#closeInner(code, reason, true);
    // To:
    this.#closeInner(code, reason, false);
  2. Inconsistent engine close handling: At tunnel.ts:320-331, there's a check for engineAlreadyClosed before sending ToServerWebSocketClose. However, at tunnel.ts:1071, when handling ToClientWebSocketClose, we call adapter._handleClose() which will trigger the callback (as noted above), but there's no engineAlreadyClosed flag being set or checked.
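
The echo concern and the suggested fix can be illustrated with a small stand-alone model. This is a sketch under assumptions, not the real WebSocketTunnelAdapter: SketchAdapter, handleEngineClose and notifyEngine are hypothetical names, and the class only models the callback flag discussed above.

```typescript
// Hypothetical stand-in for the adapter; it models only the callback flag.
type CloseCallback = (code?: number, reason?: string) => void;

class SketchAdapter {
  private closed = false;
  private readonly onClose: CloseCallback;

  constructor(onClose: CloseCallback) {
    this.onClose = onClose;
  }

  // Runner-initiated close: the engine has to be told.
  close(code?: number, reason?: string): void {
    this.closeInner(code, reason, true);
  }

  // Engine-initiated close (ToClientWebSocketClose): nothing to echo back.
  handleEngineClose(code?: number, reason?: string): void {
    this.closeInner(code, reason, false);
  }

  private closeInner(
    code: number | undefined,
    reason: string | undefined,
    notifyEngine: boolean,
  ): void {
    if (this.closed) return; // A second close is a no-op.
    this.closed = true;
    if (notifyEngine) this.onClose(code, reason);
  }
}

// Records what would be sent to the engine as ToServerWebSocketClose.
const outbound: string[] = [];
const record: CloseCallback = (code, reason) => {
  outbound.push(`${code}:${reason}`);
};

new SketchAdapter(record).handleEngineClose(1000, "engine closed");
new SketchAdapter(record).close(1000, "ws.ack_timeout");
console.log(outbound); // only the runner-initiated close is recorded
```

In this model the engine-initiated path never reaches the callback, which is the behavior the suggested change to line 232 aims for.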

🧪 Test Coverage

The test file lifecycle.test.ts appears to be commented out entirely. Consider:

  • Adding test coverage for WebSocket close scenarios
  • Testing that close messages don't create echo loops
  • Verifying proper cleanup of webSockets map and requestToActor map

📋 Minor Suggestions

  1. Consistency in close reasons: The codebase uses different naming conventions for close reasons:

    • tunnel.ts:516: "ws.ack_timeout" (dot notation)
    • tunnel.ts:560: "actor.stopped" (dot notation)
    • tunnel.ts:152: "ws.meta_not_found_during_restore" (dot notation with underscores)

    Consider documenting the naming convention for close reasons.

  2. Type safety: The closeCallback type changed but the private field type at line 50-52 still shows the old 3-parameter signature in the diff context. Ensure this was updated.
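
One way to document the close-reason convention from the first suggestion is a small check like the one below. This is a sketch: the three reason strings are the ones listed above, while the pattern and the helper name isConventionalCloseReason are assumptions about what the convention might be, not something the codebase defines.

```typescript
// The three reasons quoted in the list above.
const CLOSE_REASONS: string[] = [
  "ws.ack_timeout",
  "actor.stopped",
  "ws.meta_not_found_during_restore",
];

// Assumed convention: lowercase dot-separated segments,
// with underscores allowed after the first segment.
const CLOSE_REASON_PATTERN = /^[a-z]+(\.[a-z_]+)+$/;

function isConventionalCloseReason(reason: string): boolean {
  return CLOSE_REASON_PATTERN.test(reason);
}

for (const reason of CLOSE_REASONS) {
  console.log(reason, isConventionalCloseReason(reason));
}

// The pre-PR reason string does not fit the assumed convention.
console.log(isConventionalCloseReason("Message acknowledgment timeout"));
```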

🎯 Summary

Overall this is a good cleanup PR that simplifies the close handling. The main concern is the potential for echoing close messages when the engine initiates the close. I'd recommend addressing the callback flag in _handleClose to prevent unnecessary network traffic.

Recommendation: Request changes to fix the potential close message echo issue before merging.

@claude

claude bot commented Nov 19, 2025

Code Review

Summary

This PR removes the hibernate parameter echo behavior from WebSocket close operations. The changes simplify the close flow by always setting hibernate: false when sending ToServerWebSocketClose messages to the engine.

Positive Changes ✅

  1. Simplified API: Removing the hibernate parameter from close callbacks makes the API cleaner and easier to understand
  2. Removed dead code: The _closeWithHibernate method was removed, reducing code complexity
  3. Clearer intent: The comment at tunnel.ts:1070 ("We don't need to send a close response") documents the design decision
  4. Better error messaging: Enhanced the warning message to include "tunnel data dropped" for better debugging

Issues & Concerns 🔍

1. Breaking Protocol Change - Needs Documentation

The ToServerWebSocketClose protocol still includes the hibernate field (runner-protocol/src/index.ts:1183-1187):

export type ToServerWebSocketClose = {
    readonly code: u16 | null
    readonly reason: string | null
    readonly hibernate: boolean  // Still present in protocol
}

Issue: While the client now always sends hibernate: false, the protocol definition hasn't changed. This could cause confusion.

Recommendations:

  • Document why hibernate is always false in the protocol comments
  • OR consider removing the field from the protocol entirely if it's no longer used (requires engine-side changes)
  • Add migration notes if this is a breaking change for existing deployments

2. Inconsistent Behavior with Hibernatable WebSockets

The code still has hibernatable WebSocket support (tunnel.ts:92-94):

// Only close non-hibernatable websockets to prevent sending
// unnecessary close messages for websockets that will be hibernated
if (!ws[HIBERNATABLE_SYMBOL]) {

Issue: There's a disconnect between:

  • The existence of hibernatable WebSocket infrastructure
  • Always sending hibernate: false on close

Questions:

  • Is the hibernatable WebSocket feature still in use?
  • If yes, should close operations for hibernatable WebSockets send hibernate: true?
  • If no, consider removing the hibernatable WebSocket code to reduce maintenance burden

3. Potential Race Condition in Timeout Handler

At tunnel.ts:516, the timeout handler now calls webSocket.close() instead of _closeWithHibernate():

webSocket.close(1000, "ws.ack_timeout");

Issue: The .close() method calls #closeInner(code, reason, true) which triggers the close callback. This might send a ToServerWebSocketClose message even during a timeout scenario.

Question: Is this the intended behavior? The timeout already indicates a network/communication issue, so echoing a close message might be redundant.
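
For readers unfamiliar with the timeout path, the following sketch shows the general shape of an acknowledgment-timeout sweep that goes through the regular close() method. It is an assumption-laden model, not the code at tunnel.ts:516: PendingMessage, sweepAckTimeouts and the 5000 ms value are hypothetical; only the close code and reason come from the PR.

```typescript
// Hypothetical model of an ack-timeout sweep; none of these names
// are taken from tunnel.ts.
type PendingMessage = { sentAtMs: number };

interface ClosableSocket {
  close(code?: number, reason?: string): void;
}

const ACK_TIMEOUT_MS = 5000; // Assumed value, for illustration only.

// Closes the socket if any pending message has waited past the timeout.
// Returns true when a close was issued.
function sweepAckTimeouts(
  pending: PendingMessage[],
  socket: ClosableSocket,
  nowMs: number,
): boolean {
  const timedOut = pending.some((m) => nowMs - m.sentAtMs > ACK_TIMEOUT_MS);
  if (timedOut) {
    // Same code and reason as the new handler described in this review.
    socket.close(1000, "ws.ack_timeout");
  }
  return timedOut;
}

// A fake socket that records close calls instead of touching the network.
const closes: Array<[number | undefined, string | undefined]> = [];
const fakeSocket: ClosableSocket = {
  close: (code, reason) => {
    closes.push([code, reason]);
  },
};

sweepAckTimeouts([{ sentAtMs: 0 }], fakeSocket, 6000);
console.log(closes);
```

Whether that close() should also notify the engine is exactly the open question raised above; the sketch takes no position on it.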

4. Missing Test Coverage

No test changes were included in this PR.

Recommendations:

  • Add tests for WebSocket close behavior (normal close, timeout close, engine-initiated close)
  • Verify that hibernate: false is always sent
  • Test that no duplicate close messages are sent

Code Quality

Good:

  • Follows existing code style
  • Simplifies the codebase by removing unused parameters
  • Clear commit message following conventional commits

Suggestions:

  • Consider adding JSDoc comments explaining the close behavior
  • The callback parameter in #closeInner could be renamed to shouldNotifyEngine for clarity

Security & Performance

✅ No security concerns identified
✅ Performance impact is neutral (slightly positive due to code simplification)

Recommendation

Conditional Approval - The code changes are sound, but need:

  1. Documentation/clarification on the protocol design decision
  2. Resolution of the hibernatable WebSocket question
  3. Test coverage for the new behavior

Please address the issues mentioned above before merging.

@claude

claude bot commented Nov 19, 2025

PR Review: Remove Echo Close on ToClientWebSocketClose

Summary

This PR simplifies the WebSocket close flow by removing the hibernate parameter and related functionality. The changes hard-code hibernate: false in all ToServerWebSocketClose messages and remove the now-unused _closeWithHibernate method.


Code Quality & Best Practices ✅

Positive:

  • Clean removal of unused functionality - the _closeWithHibernate method is properly removed
  • Simplified method signatures make the code easier to understand
  • The #closeInner method now has a clearer, more straightforward signature
  • All call sites consistently updated (tunnel.ts:516, 232, 246, 491)

Suggestions:

  1. Documentation: Consider adding a comment explaining why hibernate is always false now, or link to relevant issue/decision. The comment at line 1070 ("We don't need to send a close response") is good but could be more specific about the protocol change.

  2. Consistency: The warning message at tunnel.ts:400 was improved with better context ("tunnel data dropped"), which is great for debugging.


Potential Issues ⚠️

  1. Protocol Compatibility: The most critical concern is whether the engine/server side expects the hibernate field in ToServerWebSocketClose. If this is a breaking change to the protocol, ensure:

    • The server-side code has been updated to handle hibernate: false or ignore this field
    • This is deployed in coordination with backend changes
    • Consider if this needs a protocol version bump
  2. Missing Context: The PR title says "remove echo close" but the changes are about removing hibernate functionality. Are these related? If so, a brief explanation would help reviewers understand the connection.

  3. Behavior Change: Previously, timeout scenarios (tunnel.ts:516) used _closeWithHibernate, but now use regular close(). This changes behavior from hibernate: true to hibernate: false. Is this intentional?

    • Before: webSocket._closeWithHibernate(1000, "Message acknowledgment timeout")
    • After: webSocket.close(1000, "ws.ack_timeout")
    • The reason text also changed - ensure this is intentional

Performance Considerations ✅

  • The hibernate field is still sent in ToServerWebSocketClose (hardcoded to false), so message payload size is unchanged
  • Simplified control flow should have negligible but positive performance impact
  • No obvious performance concerns

Security Considerations ✅

  • No security issues identified
  • The close codes and reasons follow WebSocket standards (1000, 1011)
  • Error messages are appropriately vague ("ws.ack_timeout") without leaking internal details

Test Coverage ⚠️

Concerns:

  1. No test changes included in this PR
  2. The existing test file (lifecycle.test.ts) should ideally be updated to verify:
    • WebSocket close operations work correctly without hibernate
    • The hibernate: false value is sent in all close scenarios
    • Timeout behavior works as expected with the new close method

Recommendation: Add or update tests to cover:

// Verify ToServerWebSocketClose messages always have hibernate: false
// Test timeout scenario sends proper close message
// Test normal close scenarios

Questions for Author

  1. What is the relationship between "echo close" (PR title) and removing hibernate functionality?
  2. Has the server-side protocol been updated to handle this change?
  3. Was the behavior change in timeout scenarios (line 516) intentional?
  4. Are there any edge cases where hibernate functionality was actually needed?

Recommendation

Conditional Approval - The code changes are clean and well-executed, but I'd like confirmation on:

  1. Protocol compatibility with server-side changes
  2. Intentional behavior change in timeout scenarios
  3. Test coverage for the modified close flows

If these are addressed, this is a good simplification that removes complexity from the codebase.


Code-Specific Comments

tunnel.ts:328 - Good: Hard-coding hibernate: false is clear and explicit
tunnel.ts:516 - Changed: Verify timeout close behavior change is intentional
tunnel.ts:1070 - Good: Comment explaining no close response needed
websocket-tunnel-adapter.ts:232 - Simplified: _handleClose now cleaner with one less parameter
websocket-tunnel-adapter.ts:245 - Removed: _closeWithHibernate method properly cleaned up

@MasterPtato MasterPtato force-pushed the 11-13-chore_rivetkit_implement_new_hibernating_ws_protocol branch from 647d688 to debe04e Compare November 19, 2025 21:54
@MasterPtato MasterPtato force-pushed the 11-19-chore_engine-runner_remove_echo_close_on_toclientwebsocketclose branch from 972242d to 556b074 Compare November 19, 2025 21:54
@claude

claude bot commented Nov 19, 2025

Code Review

I've reviewed this PR and found a critical issue that needs to be addressed before merging.

Critical Issue: Breaking WebSocket Hibernation Feature

The PR removes the hibernate parameter from WebSocket close handling, but this parameter is actively used by the engine's pegboard-gateway to manage WebSocket hibernation.

Evidence from engine/packages/pegboard-gateway/src/lib.rs:428:

if can_hibernate && close.hibernate {
    return Err(WebSocketServiceHibernate.build());
} else {
    // Successful closure
    return Ok(LifecycleResult::ServerClose(close));
}

Impact:

  1. Breaking Change: The PR hardcodes hibernate: false in all ToServerWebSocketClose messages (tunnel.ts:328)
  2. Lost Functionality: WebSockets that should hibernate will now be permanently closed instead
  3. Protocol Mismatch: The protocol definition in engine/sdks/schemas/runner-protocol/v3.bare:268 still expects the hibernate field

Specific Issues:

  1. tunnel.ts:328 - Hardcoding hibernate: false prevents WebSocket hibernation from ever triggering
  2. tunnel.ts:516 - Changed from webSocket._closeWithHibernate(1000, "Message acknowledgment timeout") to webSocket.close(1000, "ws.ack_timeout") - this removes the ability to hibernate on timeout
  3. websocket-tunnel-adapter.ts - Removed the _closeWithHibernate method entirely, eliminating hibernation support

Questions for Clarification

  1. Intent: Is the goal to completely remove WebSocket hibernation support from the system? If so:

    • This needs to be documented as a breaking change
    • The engine code in pegboard-gateway/src/lib.rs needs updating
    • The protocol schema should be updated
    • Migration path for existing hibernatable WebSockets should be defined
  2. Alternative: If hibernation should still work, the hibernate parameter needs to be preserved and passed correctly through the call chain

Recommendations

If removing hibernation is intentional:

  • Update engine/packages/pegboard-gateway/src/lib.rs to remove hibernation logic
  • Update protocol schema to remove the hibernate field
  • Document this as a breaking change in the PR description
  • Consider migration impact on running actors with hibernatable WebSockets

If hibernation should be preserved:

  • Revert the removal of the hibernate parameter
  • Investigate why the PR title mentions "remove echo close on ToClientWebSocketClose" when changes affect ToServerWebSocketClose
  • Clarify what specific behavior this PR intends to fix

Minor Issues

  1. tunnel.ts:400 - Good: Enhanced error message with "tunnel data dropped" for better debugging
  2. tunnel.ts:1070 - Comment says "We don't need to send a close response" but doesn't explain why this changed

Code Quality

The code changes themselves are clean and well-structured, but the semantic implications for the WebSocket lifecycle need clarification before merging.


Verdict: Request Changes - Need clarification on hibernation removal intent and corresponding engine updates if this is intentional.

@claude

claude bot commented Nov 19, 2025

Code Review

Summary

This PR removes the hibernate parameter from WebSocket close handling in the TypeScript runner. The change simplifies the close callback by always setting hibernate: false when sending ToServerWebSocketClose messages.

Code Quality ✅

Good:

  • Clean removal of unused functionality - the _closeWithHibernate method is properly removed
  • Consistent changes across both files (tunnel.ts and websocket-tunnel-adapter.ts)
  • Simplifies the API by removing the hibernate parameter from callbacks
  • The comment at tunnel.ts:1070 ("We don't need to send a close response") helps clarify intent

Concerns:

  1. Protocol Mismatch ⚠️

    • The protocol definition in runner-protocol/src/index.ts:1183-1186 still expects a hibernate: boolean field
    • This change hardcodes hibernate: false (line 328 in tunnel.ts), but the protocol supports true/false
    • The engine code in pegboard-gateway/src/lib.rs:428-432 still checks if can_hibernate && close.hibernate to trigger hibernation
    • Impact: This effectively disables WebSocket hibernation from the client side. If this is intentional, it should be documented.
  2. Incomplete Cleanup

    • The _closeWithHibernate method removal is good, but there's still a _closeWithoutCallback method
    • Consider if _closeWithoutCallback could also be simplified or if the naming convention should be reconsidered
  3. Error Message Improvement

    • The enhanced error message at line 400 in tunnel.ts ("tunnel data dropped.") is helpful
    • Good addition for debugging

Potential Bugs 🐛

  1. WebSocket Hibernation Broken

    • If hibernation was previously working, this change breaks it entirely
    • The engine-side code (pegboard-gateway) will never see close.hibernate = true, so WebSocketServiceHibernate error will never be triggered
    • Question: Is this intentional? Should the protocol be updated or should this functionality be preserved?
  2. Close Timeout Behavior Change

    • Line 516 in tunnel.ts changes from _closeWithHibernate to regular close for timeout scenarios
    • This changes the behavior when acknowledgment times out: previously the connection might have hibernated; now it always closes
    • This could affect running connections during temporary network issues

Security Concerns 🔒

No security issues identified. The changes reduce attack surface by removing functionality.

Performance Considerations ⚡

  • Positive: Simpler code path may have marginal performance improvement
  • Neutral: No significant performance impact expected

Test Coverage 🧪

Concerns:

  • No test changes included in this PR
  • No evidence that existing tests cover WebSocket hibernation scenarios
  • Recommendation: Add tests to verify:
    1. WebSocket close without hibernation works correctly
    2. Timeout scenarios properly close connections
    3. Engine correctly handles hibernate: false in all cases

Recommendations

  1. Clarify Intent: Add a comment or update the PR description explaining why hibernation is being removed
  2. Protocol Cleanup: If hibernation is permanently removed, consider:
    • Updating the BARE protocol schema to remove the hibernate field
    • Removing hibernation logic from pegboard-gateway/src/lib.rs
    • Updating any documentation about WebSocket hibernation
  3. Testing: Add integration tests covering the WebSocket close flow
  4. Consider Logging: Add a debug log when sending ToServerWebSocketClose to help trace close events
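
The logging recommendation could be sketched as a thin wrapper around the send call. This is only an illustration: sendCloseWithLog, DebugLogger and the log field names are hypothetical and do not reflect the runner's actual logger API.

```typescript
// Hypothetical sketch of a debug log emitted before each close message.
type ToServerWebSocketClose = {
  code: number | null;
  reason: string | null;
  hibernate: boolean;
};

// Stand-in for a structured logger's debug method (assumed shape).
type DebugLogger = (fields: { [key: string]: unknown }) => void;

function sendCloseWithLog(
  msg: ToServerWebSocketClose,
  send: (msg: ToServerWebSocketClose) => void,
  debug: DebugLogger,
): void {
  // Log first, so the trace exists even if send throws.
  debug({
    msg: "sending ToServerWebSocketClose",
    code: msg.code,
    reason: msg.reason,
    hibernate: msg.hibernate,
  });
  send(msg);
}

const logLines: string[] = [];
const sentMessages: ToServerWebSocketClose[] = [];

sendCloseWithLog(
  { code: 1000, reason: "actor.stopped", hibernate: false },
  (msg) => {
    sentMessages.push(msg);
  },
  (fields) => {
    logLines.push(JSON.stringify(fields));
  },
);

console.log(logLines[0]);
```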

Questions for Author

  1. Is WebSocket hibernation functionality being deprecated entirely, or will it be re-implemented differently?
  2. Are there any known scenarios where hibernate: true was being used that need alternative solutions?
  3. Should the base branch name (11-13-chore_rivetkit_implement_new_hibernating_ws_protocol) be updated to reflect this change?

Verdict

Conditional Approval - The code changes are clean and well-executed, but need clarification on:

  • Whether this intentionally disables hibernation
  • Whether protocol/engine cleanup is needed
  • Test coverage for the new behavior

If hibernation removal is intentional, consider this as part 1 of a larger refactor.

@claude

claude bot commented Nov 19, 2025

Code Review

I've reviewed PR #3487 and here's my feedback:

Summary

This PR removes the hibernate functionality from the WebSocket close callback in the engine runner. The changes simplify the close callback by removing the hibernate parameter and always passing false for the hibernate flag.

Code Quality ✅

Positive aspects:

  • Clean, focused refactoring that removes unused functionality
  • Consistent changes across both files
  • Maintains proper error handling patterns
  • Good logging improvement in tunnel.ts:400 - adding "tunnel data dropped." makes the message clearer

Changes Analysis

websocket-tunnel-adapter.ts

  1. Constructor signature change (line 68): Removed hibernate parameter from closeCallback
  2. Simplified #closeInner (lines 320-336): Removed hibernate parameter, now only takes callback boolean
  3. Removed _closeWithHibernate method: This method is no longer needed
  4. Simplified close methods: close() and _handleClose() now consistently call #closeInner with callback=true

tunnel.ts

  1. Line 320: Close callback signature simplified
  2. Line 328: Hardcoded hibernate: false in ToServerWebSocketClose message
  3. Line 400: Improved warning message clarity
  4. Line 516: Changed from _closeWithHibernate to standard close for timeout handling
  5. Line 1070: Added helpful comment explaining no close response needed

Potential Issues & Questions 🤔

  1. Hardcoded hibernate: false (tunnel.ts:328)

    • This always sends hibernate: false to the server. Is this intentional?
    • If the hibernation protocol is being deprecated/removed entirely from WebSockets, this makes sense
    • However, if there's any scenario where hibernation might be needed in the future, this removes that capability
    • Question: Can you confirm this is the intended behavior and that hibernatable WebSockets are being removed or handled differently?
  2. Protocol compatibility

    • The server still expects a hibernate field in ToServerWebSocketClose messages
    • Verify that the server properly handles hibernate: false in all cases
    • Recommendation: Consider updating the protocol definition if hibernate is no longer used
  3. Comment at line 1070 says "We don't need to send a close response"

    • This is a behavioral change worth noting - previously the engine would echo close messages back
    • Question: Is there documentation or a related issue explaining why the close response is no longer needed?

Performance & Security ✅

  • No performance concerns
  • No security issues identified
  • Changes actually simplify the code path, reducing complexity

Test Coverage ⚠️

  • The test file (lifecycle.test.ts) is entirely commented out
  • Recommendation: Consider adding or uncommenting tests for WebSocket close behavior, especially:
    • Normal close flow
    • Close on timeout (ws.ack_timeout)
    • Close when actor stops
    • Verify close messages are handled correctly by the engine

Suggestions

  1. Consider adding a brief comment explaining why hibernate is hardcoded to false or why the parameter was removed
  2. If this is part of a larger effort to remove hibernatable WebSocket support, consider linking to the related issue/PR
  3. Update any protocol documentation to reflect that the hibernate field is no longer used in close callbacks

Architecture Consistency ✅

Changes align well with the error handling patterns shown in CLAUDE.md:

  • Maintains proper logging with structured fields
  • Uses consistent error messages
  • Follows the existing code patterns

Overall Assessment

This is a clean, well-executed refactoring that simplifies the WebSocket close logic. The main question is around the broader context of removing hibernation support - as long as that's intentional and coordinated with the server-side changes, this PR looks good.

Recommendation: Approve pending clarification on the hibernation removal intent.


Review conducted with attention to the repository's CLAUDE.md guidelines

@claude

claude bot commented Nov 19, 2025

Code Review

Summary

This PR removes the hibernate parameter from WebSocket close handling in the TypeScript runner, always setting it to false when sending ToServerWebSocketClose messages. This simplifies the close flow by removing the distinction between hibernating and non-hibernating closes.

Analysis

Changes:

  1. Removed hibernate parameter from close callbacks in tunnel.ts and websocket-tunnel-adapter.ts
  2. Always sends hibernate: false in ToServerWebSocketClose messages (tunnel.ts:328)
  3. Removed _closeWithHibernate() method from WebSocketTunnelAdapter
  4. Simplified #closeInner() by removing the hibernate parameter
  5. Added clarifying comment at tunnel.ts:1070 that close response is not needed

Server-side Impact:
Based on pegboard-gateway/src/lib.rs, the server checks:

if can_hibernate && close.hibernate {
    return Err(WebSocketServiceHibernate.build());
} else {
    return Ok(LifecycleResult::ServerClose(close));
}

With hibernate always false, the server will never trigger hibernation via ToServerWebSocketClose. Hibernation can still occur through other paths (e.g., tunnel subscription closing, actor stopping).
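
The branch quoted above can be modeled in a few lines of TypeScript to make the consequence concrete. This is a minimal sketch, not the gateway's code: `decideOnClose`, `CloseOutcome`, and the plain `canHibernate` flag are hypothetical names; only the `can_hibernate && close.hibernate` condition comes from the Rust snippet quoted in this review.

```typescript
// Hypothetical model of the gateway-side branch quoted above.
// Names and types are illustrative; only the condition mirrors the snippet.
interface ToServerWebSocketClose {
  code: number | null;
  reason: string | null;
  hibernate: boolean;
}

type CloseOutcome =
  | { kind: "hibernate" }
  | { kind: "serverClose"; close: ToServerWebSocketClose };

function decideOnClose(
  canHibernate: boolean,
  close: ToServerWebSocketClose,
): CloseOutcome {
  // Hibernation on this path requires BOTH the connection to be
  // hibernatable and the runner to have requested it.
  if (canHibernate && close.hibernate) {
    return { kind: "hibernate" };
  }
  return { kind: "serverClose", close };
}

// After this PR the runner always sends hibernate: false, so even a
// hibernatable connection takes the serverClose branch.
const outcome = decideOnClose(true, {
  code: 1000,
  reason: "ws.ack_timeout",
  hibernate: false,
});
console.log(outcome.kind); // serverClose
```

Under these assumptions, the `hibernate` branch is unreachable from a runner built with this PR, which is why the other hibernation paths mentioned above matter.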

Concerns & Questions

1. Intended Behavior Change

  • Is this intentional? The PR removes the ability for the runner to request hibernation when closing a WebSocket. Was this feature unused/broken, or is this a behavioral regression?
  • The timeout close at tunnel.ts:516 previously used _closeWithHibernate() but now uses regular close(). Is this the desired behavior for timeout scenarios?

2. Protocol Consistency

  • ⚠️ The ToServerWebSocketClose protocol still includes the hibernate: boolean field, but it's now always false from the TypeScript runner
  • Consider if the protocol definition should be updated or if future functionality might need this field

3. Missing Context

  • The PR description is empty - please add context explaining:
    • Why hibernation on close is being removed
    • What testing was performed
    • Whether this affects existing production workloads

4. Logging Improvement

  • ✅ Good: Added ". tunnel data dropped." to the warning message (tunnel.ts:400) for better clarity

Recommendations

Before merging:

  1. Add PR description explaining the rationale for removing hibernate-on-close functionality
  2. Verify test coverage - The tests in lifecycle.test.ts are mostly commented out. Are there active tests covering WebSocket close behavior?
  3. Document behavioral change - If this changes production behavior, ensure it's documented
  4. Consider server-side cleanup - If close.hibernate is always false, the server-side check could be simplified in a follow-up

Code quality: ✅ The changes are clean and simplify the codebase
Security: ✅ No security concerns identified
Performance: ✅ No performance impact

Verdict

The code changes are well-executed, but the PR needs clarification on whether removing hibernate-on-close is intentional and what impact it has on production systems. Please provide context in the PR description before merging.

@MasterPtato MasterPtato force-pushed the 11-13-chore_rivetkit_implement_new_hibernating_ws_protocol branch from 647d688 to debe04e Compare November 20, 2025 03:07
@MasterPtato MasterPtato force-pushed the 11-19-chore_engine-runner_remove_echo_close_on_toclientwebsocketclose branch from 972242d to 556b074 Compare November 20, 2025 03:07
@claude

claude bot commented Nov 20, 2025

Code Review

Summary

This PR removes the hibernate parameter functionality from WebSocket close operations in the TypeScript runner. The changes simplify the close callback signature and always set hibernate: false when sending close messages to the engine.


Positive Aspects

  1. Code Simplification: The removal of the unused _closeWithHibernate method and simplification of #closeInner reduces code complexity and maintenance burden.

  2. Consistent Behavior: Hardcoding hibernate: false (tunnel.ts:328) makes the behavior explicit and predictable, which is good if hibernation is being handled at a different layer.

  3. Improved Logging: The updated warning message in tunnel.ts:400 is more informative: "tunnel data dropped" clearly communicates data loss.


🔍 Potential Concerns

1. Breaking Change Without Context

The PR removes hibernation support from the TypeScript runner, but the hibernate field is still present in the protocol definition (ToServerWebSocketClose). From the Rust gateway code (pegboard-gateway/src/lib.rs:428), I can see:

if can_hibernate && close.hibernate {
    return Err(WebSocketServiceHibernate.build());
}

Questions:

  • Is hibernation functionality being deprecated entirely, or just not supported from the TypeScript runner?
  • Are there other language SDKs (Rust, etc.) that still use this feature?
  • Should the protocol definition be updated to reflect this change, or is this intentional to maintain backward compatibility?

2. Timeout Handling Change

In tunnel.ts:516, the timeout handler was changed from:

webSocket._closeWithHibernate(1000, "Message acknowledgment timeout");

to:

webSocket.close(1000, "ws.ack_timeout");

The close reason changed from "Message acknowledgment timeout" to "ws.ack_timeout". While the new format is more machine-readable, this could be a breaking change if any downstream systems parse these close reasons.

Recommendation: Document this change in the commit message or PR description if it's intentional.

3. Comment Accuracy

At tunnel.ts:1070, the comment states:

// We don't need to send a close response

This comment seems slightly misleading. The code does call adapter._handleClose(), which triggers the close event. Perhaps "We don't echo the close back to the engine" would be more accurate, since _handleClose uses callback: true internally (websocket-tunnel-adapter.ts:232).

Wait, I need to verify this...

Looking at websocket-tunnel-adapter.ts:232:

_handleClose(_requestId: ArrayBuffer, code?: number, reason?: string): void {
    this.#closeInner(code, reason, true);
}

And #closeInner with callback: true calls this.#closeCallback(code, reason) at line 336. This would send a message back to the engine via the close callback.

However, looking at tunnel.ts:1067-1077, this is handling ToClientWebSocketClose - a close message FROM the engine TO the client. In this case, the engineAlreadyClosed logic would prevent echoing (tunnel.ts:322). But the comment could be clearer about this flow.
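
To make the flow described here easier to follow, here is a minimal sketch of how a close callback guarded by an `engineAlreadyClosed` flag would suppress the echo. Everything in it is an assumption for illustration: `AdapterSketch`, `TunnelSketch`, and in particular where the flag is set are hypothetical and are not taken from the SDK source. The sketch also models one socket per tunnel for brevity.

```typescript
// Illustrative model only. Class names, fields, and the place where
// engineAlreadyClosed is set are assumptions, not the SDK's code.
type CloseCallback = (code?: number, reason?: string) => void;

interface SentClose {
  code: number | undefined;
  reason: string | undefined;
  hibernate: false;
}

class AdapterSketch {
  private closed = false;

  constructor(private readonly onClose: CloseCallback) {}

  // Runner-initiated close.
  close(code?: number, reason?: string): void {
    this.closeInner(code, reason, true);
  }

  // Engine-initiated close; still invokes the callback (callback: true).
  handleClose(code?: number, reason?: string): void {
    this.closeInner(code, reason, true);
  }

  private closeInner(
    code: number | undefined,
    reason: string | undefined,
    callback: boolean,
  ): void {
    if (this.closed) return; // closing twice is a no-op
    this.closed = true;
    if (callback) this.onClose(code, reason);
  }
}

class TunnelSketch {
  readonly sent: SentClose[] = []; // stands in for messages sent to the engine
  private engineAlreadyClosed = false;

  // Close callback handed to the adapter; the guard prevents the echo.
  readonly closeCallback: CloseCallback = (code, reason) => {
    if (!this.engineAlreadyClosed) {
      this.sent.push({ code, reason, hibernate: false });
    }
  };

  // Models handling of ToClientWebSocketClose from the engine.
  handleEngineClose(adapter: AdapterSketch, code?: number, reason?: string): void {
    this.engineAlreadyClosed = true; // assumed location of the flag update
    adapter.handleClose(code, reason);
  }
}

// Runner-initiated close: one message goes to the engine.
const runnerSide = new TunnelSketch();
new AdapterSketch(runnerSide.closeCallback).close(1000, "ws.ack_timeout");

// Engine-initiated close: the guard suppresses the echo.
const engineSide = new TunnelSketch();
engineSide.handleEngineClose(new AdapterSketch(engineSide.closeCallback), 1000, "bye");

console.log(runnerSide.sent.length, engineSide.sent.length); // 1 0
```

If the real flag is tracked differently (or not at all), the second case would echo a close back, so the flag's lifecycle is the part worth confirming.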

4. Protocol Definition Mismatch

The closeCallback signature was updated to remove the hibernate parameter, but internally it's still being sent as false. Consider whether:

  • The protocol should be updated to make hibernate optional
  • Or if this is a temporary state before hibernation is fully removed

🧪 Testing Concerns

  1. Missing Test Updates: No test file changes are included in this PR. If there are tests that verify hibernation behavior, they should be updated or removed.

  2. Edge Case Testing: Consider testing:

    • What happens when a hibernatable WebSocket is closed with the new code?
    • Does the gateway correctly handle hibernate: false for all scenarios?
    • Are there any race conditions with the close callback changes?

🔒 Security & Performance

No security concerns identified. The changes are straightforward refactoring without introducing new attack vectors.

Performance impact is negligible - if anything, it's slightly improved by removing unused code paths.


📝 Recommendations

  1. Add context to the PR description: Explain why hibernation is being removed or disabled. Is this:

    • A temporary change while refactoring hibernation?
    • A permanent removal because hibernation is handled elsewhere?
    • A bug fix for incorrect hibernation behavior?
  2. Update or verify tests: Ensure that any tests relying on hibernation are updated.

  3. Consider updating the protocol: If hibernation is truly deprecated, consider updating the protocol schema in a follow-up PR.

  4. Clarify the comment at line 1070: Make it more specific about what response is/isn't being sent.

  5. Document the close reason format change: If "ws.ack_timeout" is part of a larger standardization effort, document it.


Code Quality

  • ✅ Follows TypeScript best practices
  • ✅ Maintains consistent error handling
  • ✅ Reduces code complexity
  • ✅ No obvious bugs or logic errors
  • ⚠️ Could benefit from more detailed commit message/PR description

Overall Assessment: The code changes are clean and well-executed, but the PR lacks context about why hibernation is being removed. This makes it difficult to assess whether this is the right approach or if it introduces unintended side effects. I'd recommend adding more context before merging.

@claude

claude bot commented Nov 20, 2025

Code Review

This PR removes the echo close functionality when receiving a ToClientWebSocketClose message from the engine. Here's my analysis:

Summary of Changes

The PR makes the following key changes:

  1. Removes the hibernate parameter from the close callback signature
  2. Always sets hibernate: false when sending ToServerWebSocketClose messages
  3. Removes the _closeWithHibernate method from WebSocketTunnelAdapter
  4. Simplifies the #closeInner method by removing the hibernate parameter
  5. Updates the timeout handler to use close() instead of _closeWithHibernate()
  6. Adds clarifying comment that close response is not needed when handling ToClientWebSocketClose

Code Quality ✅

Positive aspects:

  • The changes are consistent across both files
  • The simplification removes unused functionality (the hibernate parameter was never actually used - it was always hardcoded to false)
  • The new comment at line 1070 in tunnel.ts clearly documents the intent
  • The improved log message at line 400 ("tunnel data dropped.") is more descriptive

Type Safety:

  • The callback signature change properly removes the unused third parameter
  • All call sites have been updated correctly

Behavioral Changes ⚠️

Important consideration:
Looking at the protocol schema, ToServerWebSocketClose in v3 has a hibernate: bool field that is required. This PR hardcodes it to false everywhere:

hibernate: false,  // line 328

Questions:

  1. Is this intentional? The PR title mentions removing "echo close" which suggests the runner should not respond with a close message when receiving ToClientWebSocketClose. However, the code still sends the close message, just with hibernate: false.

  2. Protocol version compatibility: Since v3 protocol has hibernate: bool as a required field, always sending false should work correctly. However, I notice that v2 had a retry: bool field instead. Which protocol version is currently in use?

  3. Actual intent: Based on the comment "We don't need to send a close response" at line 1070, it seems like the intent might be to NOT send any ToServerWebSocketClose message at all when receiving ToClientWebSocketClose. But the code at lines 320-330 still sends it. Is this a partial implementation?

Potential Issues 🔍

1. Inconsistent close behavior

Looking at line 516 in tunnel.ts, when ACK timeout occurs:

webSocket.close(1000, "ws.ack_timeout");

This calls close() which has callback: true, meaning it WILL send a ToServerWebSocketClose with hibernate: false. This seems correct for timeout scenarios.

However, in #handleWebSocketClose (line 1071), the comment says "We don't need to send a close response" but then calls _handleClose with callback: true, which would attempt to send the close message (though it's prevented by the engineAlreadyClosed guard at line 322).

Wait, I see the issue now! Looking more carefully at line 1071:

The method _handleClose is called (not the old _closeWithHibernate), which calls #closeInner(code, reason, true) where the third param is callback: boolean, not hibernate: boolean. But then in the close callback (lines 320-341), there's a check:

if (!engineAlreadyClosed) {
    this.#sendMessage(requestId, { /* ... */ });
}

So the actual behavior is: when the engine sends a close, we don't echo it back. The engineAlreadyClosed variable would need to be set somewhere. Let me check if that variable exists...

Actually, I don't see engineAlreadyClosed being set anywhere in the visible code! This might be a bug.

Security ⚠️

No security issues identified, but the engineAlreadyClosed variable issue could lead to unnecessary message echoing.

Performance ✅

No performance concerns. The simplification actually reduces complexity.

Test Coverage 📝

Recommendation: Add tests for:

  1. WebSocket close flow when receiving ToClientWebSocketClose from engine
  2. Verify that close is not echoed back to the engine
  3. ACK timeout scenario with the new close() call

Suggested Improvements

  1. Clarify the engineAlreadyClosed variable: Where is this supposed to be set? The logic relies on it but I don't see it being tracked.

  2. Consider removing the close message entirely: If the intent is to not send a close response when receiving ToClientWebSocketClose, perhaps the _handleClose call should use _closeWithoutCallback instead of _handleClose?

- adapter._handleClose(
+ adapter._closeWithoutCallback(
      requestId,
      close.code || undefined,
      close.reason || undefined,
  );

  3. Log message improvement: The added ". tunnel data dropped." at line 400 is good, but consider using structured logging format per CLAUDE.md:

this.log?.warn({
    msg: "cannot send tunnel message, socket not connected to engine",
    status: "dropped",
    requestId: idToStr(requestId),
    message: stringifyToServerTunnelMessageKind(messageKind),
});

Verdict

The changes are generally good and simplify unused code, but there are questions about:

  1. The engineAlreadyClosed variable that's referenced but not visible in the diff
  2. Whether the close callback should be completely disabled for ToClientWebSocketClose handling

Recommendation: Clarify the intended behavior and verify that engineAlreadyClosed is properly tracked before merging.

@NathanFlurry NathanFlurry force-pushed the 11-13-chore_rivetkit_implement_new_hibernating_ws_protocol branch from debe04e to 647d688 Compare November 20, 2025 04:51
@NathanFlurry NathanFlurry force-pushed the 11-19-chore_engine-runner_remove_echo_close_on_toclientwebsocketclose branch from 556b074 to 972242d Compare November 20, 2025 04:51
@claude

claude bot commented Nov 20, 2025

Code Review - PR #3487

I've reviewed this PR which removes the echo close behavior on ToClientWebSocketClose and simplifies the WebSocket close handling by always setting hibernate: false.

Summary of Changes

Files Modified:

  • engine/sdks/typescript/runner/src/tunnel.ts (5 additions, 7 deletions)
  • engine/sdks/typescript/runner/src/websocket-tunnel-adapter.ts (5 additions, 14 deletions)

Key Changes:

  1. Removed hibernate parameter from close callback signature
  2. Hard-coded hibernate: false in all ToServerWebSocketClose messages sent from the runner
  3. Removed _closeWithHibernate() method from WebSocketTunnelAdapter
  4. Updated timeout handler to use standard close() instead of _closeWithHibernate()
  5. Added clarifying comment at tunnel.ts:1070 about not sending close response
  6. Improved error message for dropped tunnel data

Analysis

✅ Code Quality & Best Practices

Positive:

  • Simplification: Removing the unused hibernate parameter reduces complexity and makes the API cleaner
  • Consistent behavior: All runner-initiated closes now consistently set hibernate: false
  • Better logging: The enhanced message "cannot send tunnel message, socket not connected to engine. tunnel data dropped." is more explicit (tunnel.ts:400)
  • Code cleanup: Properly removed _closeWithHibernate() method and updated all call sites

🔍 Architectural Correctness

Looking at the server-side code (pegboard-gateway/src/lib.rs:428-433):

if can_hibernate && close.hibernate {
    return Err(WebSocketServiceHibernate.build());
} else {
    return Ok(LifecycleResult::ServerClose(close));
}

This PR is architecturally correct:

  • Hibernation decisions belong to the gateway, not the runner
  • The runner shouldn't decide when to hibernate connections
  • By always sending hibernate: false, the runner properly delegates hibernation control to the gateway
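  • When the gateway receives ToServerWebSocketClose with hibernate: false, the can_hibernate && close.hibernate check above is never true, so this path always results in ServerClose; any hibernation has to be triggered by other gateway-side paths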

🎯 Behavioral Impact

  1. Timeout handling (tunnel.ts:516): Changed from _closeWithHibernate() to close(1000, "ws.ack_timeout") - correct because timeouts should terminate, not hibernate

  2. Close reason consistency: The change from "Message acknowledgment timeout" to "ws.ack_timeout" follows the dot-namespaced snake_case pattern used elsewhere (e.g., "ws.meta_not_found_during_restore", "ws.tunnel_shutdown")

  3. Engine-initiated closes: The comment at tunnel.ts:1070 correctly notes "We don't need to send a close response" when handling ToClientWebSocketClose

⚠️ Potential Considerations

  1. Protocol field consistency: The ToServerWebSocketClose protocol definition still includes the hibernate: bool field (runner-protocol/src/index.ts:1186). Since this is now always false from the runner:

    • Consider adding a protocol comment documenting this behavior
    • In a future protocol version, evaluate if this field should be removed or made optional
    • Not blocking for this PR - the protocol still works correctly
  2. Test coverage: No test changes included. Consider verifying:

    • Normal WebSocket close flow works correctly
    • Timeout-based cleanup with new close code
    • Gateway correctly handles hibernate: false from runner
    • Not blocking but recommended for follow-up

Verdict

✅ LGTM - Approved

This is a clean refactoring that:

  • Correctly implements the architecture where hibernation is controlled by the gateway
  • Simplifies the runner codebase by removing unused complexity
  • Maintains protocol compatibility
  • Improves code clarity and maintainability

The changes are well-executed and align with the proper separation of concerns between the runner and gateway components.


Minor suggestions for future work:

  1. Add protocol documentation explaining hibernate field usage
  2. Add integration tests for WebSocket close scenarios
  3. Consider protocol evolution to reflect current usage patterns

@claude

claude bot commented Nov 20, 2025

PR Review: Remove echo close on ToClientWebSocketClose

Summary

This PR removes the hibernate echo functionality from WebSocket close handling in the TypeScript runner SDK. The changes simplify the close flow by removing the _closeWithHibernate method and always setting hibernate: false when sending ToServerWebSocketClose messages.

Code Quality ✅

Strengths:

  • Clean removal of dead code path (_closeWithHibernate method)
  • Simplifies the #closeInner method signature by removing the hibernate parameter
  • Consistent changes across both affected files
  • Good comment added at tunnel.ts:1070 explaining that no close response is needed

Observations:

  • The closeCallback type signature was already updated in the file to remove the hibernate parameter (lines 48-52 show the old signature in comments, while line 68 has the actual signature)
  • The change properly updates all call sites of #closeInner

Potential Issues ⚠️

Protocol Compatibility:
The protocol definition at engine/sdks/typescript/runner-protocol/src/index.ts:1183-1186 still defines ToServerWebSocketClose with a hibernate: boolean field:

export type ToServerWebSocketClose = {
    readonly code: u16 | null
    readonly reason: string | null
    readonly hibernate: boolean  // Still present in protocol
}

While always sending hibernate: false is safe, this creates a semantic mismatch:

  • Question: Is there a plan to update the protocol schema to remove this field entirely?
  • Risk: Future developers might attempt to use hibernation again, not realizing it's been intentionally disabled
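
One way to make the "always false" contract visible to future developers without touching the wire format is to narrow the type on the runner side. The following is a sketch under stated assumptions: `RunnerWebSocketClose` and `buildRunnerClose` are hypothetical names, and `u16` is re-declared locally as a stand-in for the protocol package's alias.

```typescript
// Local stand-in for the protocol's u16 alias (assumption for self-containment).
type u16 = number;

// Shape as quoted above from the protocol definition.
type ToServerWebSocketClose = {
  readonly code: u16 | null;
  readonly reason: string | null;
  readonly hibernate: boolean;
};

// Hypothetical narrowed type: still assignable to the protocol type,
// but the compiler rejects hibernate: true on the runner side.
type RunnerWebSocketClose = ToServerWebSocketClose & {
  readonly hibernate: false;
};

// Hypothetical helper that centralizes construction of the close message.
function buildRunnerClose(code?: number, reason?: string): RunnerWebSocketClose {
  return {
    code: code ?? null,
    reason: reason ?? null,
    // Runner-initiated closes never request hibernation after this PR.
    hibernate: false,
  };
}

const closeMessage: ToServerWebSocketClose = buildRunnerClose(1000, "ws.ack_timeout");
console.log(closeMessage.hibernate); // false
```

A single construction point like this would also give the suggested explanatory comment one obvious home.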

Hibernation Flow:
The broader codebase still has hibernation logic:

  • engine/packages/guard-core/src/proxy_service.rs:2540-2546 checks for websocket_service_hibernate errors
  • engine/packages/pegboard-gateway/src/lib.rs handles hibernation state

Question: Is this PR part of a larger effort to remove WebSocket hibernation entirely, or is hibernation being handled at a different layer now?

Performance ✅

  • Removing the unused _closeWithHibernate method reduces code complexity
  • The simplified #closeInner signature passes one fewer parameter (any runtime effect is negligible)
  • No performance regressions expected

Security ✅

  • No security concerns identified
  • The timeout close now uses standard close() instead of _closeWithHibernate(), which is more predictable
  • The improved warning message at line 400 ("tunnel data dropped") provides better observability

Testing Considerations ⚠️

Missing Test Coverage:

  • No test changes included in this PR
  • The timeout scenario at tunnel.ts:516 changed behavior (was _closeWithHibernate, now regular close)
  • Recommendation: Add tests covering:
    1. WebSocket close with ToClientWebSocketClose message (verify no echo is sent)
    2. Acknowledgment timeout scenario (verify close(1000, "ws.ack_timeout") works correctly)
    3. Verify hibernate: false is always sent in ToServerWebSocketClose messages
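
As a starting point for the acknowledgment timeout case, the check could be factored so that it is testable against a fake socket. This is a sketch, not the runner's implementation: `closeIfAckTimedOut`, `ClosableSocket`, and the 30 second timeout are hypothetical; only the `close(1000, "ws.ack_timeout")` call comes from the PR.

```typescript
// Minimal socket surface needed by the check (hypothetical interface).
interface ClosableSocket {
  close(code?: number, reason?: string): void;
}

// Hypothetical helper: closes the socket when the oldest unacknowledged
// message is older than timeoutMs. Returns true if it closed the socket.
function closeIfAckTimedOut(
  socket: ClosableSocket,
  oldestUnackedAtMs: number | undefined,
  nowMs: number,
  timeoutMs: number,
): boolean {
  if (oldestUnackedAtMs === undefined) return false; // nothing pending
  if (nowMs - oldestUnackedAtMs <= timeoutMs) return false; // still in time
  // Same call the PR uses in the timeout handler.
  socket.close(1000, "ws.ack_timeout");
  return true;
}

// Fake socket that records close calls instead of touching the network.
const calls: Array<[number | undefined, string | undefined]> = [];
const fakeSocket: ClosableSocket = {
  close: (code, reason) => {
    calls.push([code, reason]);
  },
};

// 31s since the oldest unacked message, with an assumed 30s timeout.
closeIfAckTimedOut(fakeSocket, 0, 31_000, 30_000);
console.log(calls.length); // 1
```

Passing the clock in as `nowMs` keeps the check deterministic, so a test does not need fake timers.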

Test Files Found:

  • engine/sdks/typescript/runner/tests/lifecycle.test.ts
  • engine/sdks/typescript/runner/tests/utils.test.ts

Recommendations

  1. Protocol Schema: Consider updating the BARE schema to remove the hibernate field if it's truly no longer used, or document why it must remain false

  2. Documentation: Add a comment explaining why hibernate is hardcoded to false at tunnel.ts:328

  3. Testing: Add test coverage for the changed timeout behavior

  4. Consistency: Search for any other references to WebSocket hibernation to ensure this change is complete

Minor Improvements

Log Message Enhancement (tunnel.ts:400):
Good improvement to add "tunnel data dropped" to the warning message for better debugging.

Code Cleanup:
The removal of the unused _closeWithHibernate method is excellent housekeeping.

Overall Assessment

Verdict: Approve with minor recommendations

The code changes are clean and correct. The main concerns are around protocol alignment and test coverage. If this is part of a larger hibernation removal effort, consider creating follow-up issues to:

  1. Update protocol schemas
  2. Add comprehensive test coverage
  3. Audit and document remaining hibernation code

The PR successfully achieves its stated goal of removing the echo close on ToClientWebSocketClose.

