Skip to content

Conversation

@NathanFlurry
Copy link
Member

No description provided.

@vercel
Copy link

vercel bot commented Nov 6, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
rivetkit-serverless Ready Ready Preview Comment Nov 6, 2025 7:29am
3 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
rivet-cloud Ignored Ignored Nov 6, 2025 7:29am
rivet-inspector Ignored Ignored Preview Nov 6, 2025 7:29am
rivet-site Ignored Ignored Preview Nov 6, 2025 7:29am

Copy link
Member Author

NathanFlurry commented Nov 6, 2025


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

Base automatically changed from 11-05-fix_runner_skip_closing_hibernatable_websockets_on_shutdown to main November 6, 2025 07:27
@NathanFlurry NathanFlurry force-pushed the 11-05-fix_rivetkit_correctly_handle_hibernatable_websocket_reconnection_with_persisted_request_ids branch from b4fdc49 to e43ae38 Compare November 6, 2025 07:28
@NathanFlurry NathanFlurry merged commit f513e50 into main Nov 6, 2025
9 of 10 checks passed
@NathanFlurry NathanFlurry deleted the 11-05-fix_rivetkit_correctly_handle_hibernatable_websocket_reconnection_with_persisted_request_ids branch November 6, 2025 07:28
@claude
Copy link

claude bot commented Nov 6, 2025

PR Review: Hibernatable WebSocket Reconnection Fix

Summary

This PR correctly addresses the hibernatable WebSocket reconnection issue by persisting the hibernatableRequestId in the connection state, enabling proper reconnection matching using request IDs instead of relying on driver state checks.


Positive Aspects ✅

  1. Clean Architecture: The refactor removes the isHibernatable method from connection drivers, centralizing hibernation logic in the connection and actor instance layers.

  2. Proper State Persistence: Adding hibernatableRequestId to the persisted connection schema ensures reconnection can work correctly even after the actor hibernates.

  3. Schema Migration: The v1→v2 migration properly initializes the new field as null for existing connections.

  4. Comprehensive Logging: Good debug logging throughout the reconnection flow aids in troubleshooting.


Issues & Concerns 🔴

1. Critical: Missing Driver State Update on Reconnection

Location: rivetkit-typescript/packages/rivetkit/src/actor/instance.ts:1071-1073

The hibernatable reconnection path updates the socket but doesn't update the driver state:

// Update with new driver state
existingConn.__socket = socket;
existingConn.__persist.lastSeen = Date.now();

However, the regular reconnection path (lines 1120-1130) properly updates __driverState:

existingConn.__socket = socket;
existingConn.__driverState = socket.driverState; // ← Missing in hibernatable path
existingConn.__persist.lastSeen = Date.now();

Impact: The connection will have the new socket but retain stale driver state, which could cause issues with WebSocket operations.

Fix: Add existingConn.__driverState = socket.driverState; at line 1072.


2. Potential Race Condition in Status Check

Location: rivetkit-typescript/packages/rivetkit/src/actor/conn.ts:72-73

The TODO comment highlights a real issue:

// TODO: isHibernatible might be true while the actual hibernatable websocket has disconnected
if (this.__socket || this.isHibernatable) {
    return "connected";
}

The isHibernatable getter checks if a hibernatable WebSocket exists in actor[PERSIST_SYMBOL].hibernatableWebSocket, but this doesn't track real-time connection state. The WebSocket could be disconnected while still appearing in the list.

Impact: conn.status may incorrectly report "connected" when the hibernatable WebSocket has actually disconnected.

Recommendation: Either:

  • Track WebSocket liveness in PersistedHibernatableWebSocket
  • Only consider isHibernatable as "connected" if there's an active socket reference
  • Document this limitation clearly in the API docs

3. Logic Duplication in Driver Cleanup

Location: rivetkit-typescript/packages/rivetkit/src/actor/instance.ts:1050-1069 and 1106-1120

Both the hibernatable and regular reconnection paths have nearly identical driver cleanup logic. This violates DRY principles and makes maintenance harder.

Recommendation: Extract to a helper method:

private cleanupExistingDriverState(
    conn: Conn<S, CP, CS, V, I, DB>,
    reason: string
) {
    if (conn.__driverState) {
        const driverKind = getConnDriverKindFromState(conn.__driverState);
        const driver = CONN_DRIVERS[driverKind];
        if (driver.disconnect) {
            driver.disconnect(this, conn, (conn.__driverState as any)[driverKind], reason);
        }
    }
}

4. Inconsistent Warning vs Debug Logging

Location: rivetkit-typescript/packages/rivetkit/src/actor/instance.ts:1051

this.#rLog.warn({
    msg: "found existing driver state on hibernatable websocket",
    // ...
});

This warning suggests an unexpected condition, but in the regular reconnection path (line 1107), the same scenario has no logging at all. Is this truly warn-worthy, or should it be debug-level?

Recommendation: Either add similar warnings to the regular path or downgrade to debug level for consistency.


5. Performance: Inefficient Array Search

Location: Multiple locations use Array.from(values()).find() and findIndex()

Examples:

  • rivetkit-typescript/packages/rivetkit/src/actor/instance.ts:1033
  • rivetkit-typescript/packages/rivetkit/src/actor/conn.ts:140-146
  • rivetkit-typescript/packages/rivetkit/src/actor/instance.ts:1225

Issue: Converting to array and linear searching is O(n) each time. With many connections, this could become a bottleneck.

Recommendation: Consider maintaining a Map of hibernatableRequestId → Connection for O(1) lookups during reconnection.


6. Missing Test Coverage

No tests found for hibernatable WebSocket reconnection. This is critical functionality that should have:

  • Unit tests for arrayBuffersEqual
  • Integration tests for hibernatable reconnection flow
  • Tests for edge cases (non-existent requestId, mismatched requestId, etc.)

Recommendation: Add test coverage, especially for the reconnection logic in createConn.


7. Typo in Comment

Location: rivetkit-typescript/packages/rivetkit/src/actor/persisted.ts:19

/** Last time the socket was seen. This is set when disconnected so we can determine when we need to clean this up. */

"disconnected" was corrected in the diff, but the comment doesn't match the actual behavior - lastSeen is also updated on reconnection (line 1073, 1123).

Recommendation: Clarify that it's updated on any activity, not just disconnect.


8. Type Safety: Non-null Assertion Operator

Location: Multiple uses of ! operator, e.g., socket.requestIdBuf!

While the code checks existence before using !, this pattern can be fragile if refactored.

Recommendation: Use TypeScript's type narrowing more explicitly or destructure with defaults.


Minor Issues ⚠️

  1. Inconsistent naming: "hibernatable" vs "hibernatible" (in TODO comment at conn.ts:72)

  2. Missing null checks: requestIdBuf is optional but used with ! in several places after conditional checks

  3. Magic number: Line 1033 checks > -1, consider using !== -1 or storing result in a named variable for clarity


Security Considerations 🔒

  1. Request ID as Security Token: The requestId (via requestIdBuf) is used to match connections. Ensure this is sufficiently random and cannot be predicted by attackers.

  2. ArrayBuffer Comparison: The arrayBuffersEqual function looks secure (constant-time-ish byte comparison), which is good for preventing timing attacks.


Recommendations Summary

Must Fix:

  1. ❗ Add missing __driverState update in hibernatable reconnection path

Should Fix:
2. Add test coverage for hibernatable reconnection
3. Extract duplicate driver cleanup logic
4. Resolve the TODO about status reporting accuracy
5. Optimize connection lookups with a Map

Nice to Have:
6. Fix naming consistency
7. Improve type safety around nullable fields
8. Clarify comment about lastSeen updates


Conclusion

The core fix is sound and addresses the issue correctly by persisting and matching hibernatableRequestId. However, there's a critical bug in the reconnection path where __driverState isn't updated, which could cause runtime issues. Once that's fixed and test coverage is added, this will be a solid improvement to the hibernatable WebSocket feature.

Overall: Approve with required changes ✅ (pending driver state fix)

@claude
Copy link

claude bot commented Nov 6, 2025

Pull Request Review

Summary

This PR fixes a critical issue with hibernatable WebSocket reconnection handling in RivetKit. The fix ensures that persisted request IDs are correctly used to identify and reconnect hibernatable WebSocket connections, preventing duplicate connection creation on reconnection.


Code Quality & Best Practices ✅

Strengths:

  1. Clean Refactoring: The removal of the isHibernatable method from ConnDriver interface simplifies the abstraction and moves the logic to where it belongs (on the Conn class).

  2. Good Structured Logging: Excellent use of structured logging throughout the reconnection logic:

    this.rLog.debug({
        msg: "checking for hibernatable websocket connection",
        requestId: socket.requestId,
        existingConnectionsCount: this.#connections.size,
    });
  3. Proper Schema Versioning: The migration from v1 to v2 in versioned.ts:23-33 correctly handles backward compatibility by initializing hibernatableRequestId: null for existing connections.

  4. Consistent Naming: Good use of hibernatableRequestId naming convention following the codebase patterns.

Suggestions:

  1. Type Safety in conn-socket.ts:5: Consider making requestIdBuf required when hibernatable is true:

    export interface ConnSocket {
        requestId: string;
        hibernatable: boolean;
        requestIdBuf?: ArrayBuffer; // Should be required when hibernatable is true
        driverState: ConnDriverState;
    }

    Could use a discriminated union:

    export type ConnSocket = 
        | { hibernatable: true; requestId: string; requestIdBuf: ArrayBuffer; driverState: ConnDriverState }
        | { hibernatable: false; requestId: string; requestIdBuf?: undefined; driverState: ConnDriverState };
  2. Documentation: The TODO comment at conn.ts:72 should be addressed or converted to a tracked issue:

    // TODO: isHibernatible might be true while the actual hibernatable websocket has disconnected

Potential Bugs & Issues 🔍

  1. Race Condition Risk (instance.ts:1033-1040): The Array.from(this.#connections.values()).find() could be slow with many connections and creates a new array on each reconnection. Consider using a Map for O(1) lookup:

    // Add to class:
    #hibernatableConnections: Map<string, Conn> = new Map();
    
    // Update on connection creation:
    if (persist.hibernatableRequestId) {
        this.#hibernatableConnections.set(
            idToStr(persist.hibernatableRequestId), 
            conn
        );
    }
  2. Inconsistent Logging (instance.ts:1051): Uses this.#rLog (private) while other places use this.rLog (public). Should be consistent:

    this.#rLog.warn({  // Line 1051
    // vs
    this.rLog.debug({  // Lines 1026, 1043, 1085
  3. Missing Cleanup (instance.ts:1218-1231): When a connection with hibernatableRequestId is created, if there's an old connection with the same ID in #connections, it won't be cleaned up. Consider checking for and cleaning up any existing connections with the same hibernatableRequestId.

  4. ArrayBuffer Comparison Performance (utils.ts:262-275): The arrayBuffersEqual function is called multiple times in hot paths. Consider caching the result or using a string-based comparison with idToStr for faster lookups.


Performance Considerations ⚡

  1. O(n) Search on Every Reconnection (instance.ts:1033-1040 and conn.ts:139-146):

    • Creating arrays and linear searches could be expensive with many connections
    • The isHibernatable getter in conn.ts:135-147 runs a findIndex on every access
    • Recommendation: Cache the hibernatable status or use a Map-based lookup
  2. Redundant arrayBuffersEqual Calls: Called in both createConn and the isHibernatable getter. Consider optimizing with string-based IDs since idToStr is already available.

  3. Driver State Cleanup (instance.ts:1049-1069): The driver cleanup code is duplicated between hibernatable and regular reconnection logic. Consider extracting to a helper method.


Security Concerns 🔒

No major security issues found, but some considerations:

  1. Request ID Validation: The code assumes requestIdBuf is a valid UUID buffer. Consider adding validation to prevent malformed data from causing issues.

  2. Token Validation: Good that the regular reconnection path still validates the connection token (instance.ts:1099). The hibernatable path bypasses this since it relies on the request ID match, which is acceptable given the design.


Test Coverage 📋

Concerns:

  1. No New Tests: This PR adds complex reconnection logic but doesn't include test coverage
  2. Critical Path: Hibernatable WebSocket reconnection is a critical feature that should have tests

Recommendations:

  1. Add tests for:

    • Hibernatable WebSocket reconnection with matching requestIdBuf
    • Non-hibernatable WebSocket behavior (should create new connection)
    • Edge case: Reconnection with hibernatable=true but no matching existing connection
    • Edge case: Multiple connections with same hibernatableRequestId
    • Migration from v1 to v2 schema
  2. Consider using the rivetkit-driver-test-writer agent to add comprehensive test coverage


Additional Notes

What This PR Fixes:
Before this change, hibernatable WebSockets would create a new connection on reconnection instead of reusing the existing persisted connection. The fix:

  1. Adds hibernatableRequestId to persisted connection data
  2. Checks for existing connections by requestIdBuf during reconnection
  3. Reuses the existing connection instead of creating a new one

Architecture Improvement:
Moving isHibernatable logic from the driver interface to the Conn class is a good architectural decision, as it's connection state rather than driver behavior.


Verdict

Overall Assessment: Good Fix

This PR correctly addresses the hibernatable WebSocket reconnection issue. The main concerns are:

  • Performance optimizations for O(n) searches
  • Missing test coverage
  • Minor code consistency issues

Recommendation: Approve with suggestions for follow-up improvements, particularly around test coverage and performance optimization for high-connection scenarios.


🤖 Generated with Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants