chore(rivetkit): update conns to be hibernatable#4260

Merged
NathanFlurry merged 7 commits into main from
02-22-chore_rivetkit_update_conns_to_be_hibernatable
Feb 24, 2026

Conversation

@NathanFlurry
Member

@railway-app railway-app bot temporarily deployed to rivet-frontend / rivet-pr-4260 February 22, 2026 21:06 Destroyed
@railway-app

railway-app bot commented Feb 22, 2026

🚅 Deployed to the rivet-pr-4260 environment in rivet-frontend

| Service | Status | Web | Updated (UTC) |
| --- | --- | --- | --- |
| website | ❌ Build Failed (View Logs) | Web | Feb 22, 2026 at 9:10 pm |
| mcp-hub | ✅ Success (View Logs) | Web | Feb 22, 2026 at 9:07 pm |
| frontend-inspector | ❌ Build Failed (View Logs) | Web | Feb 22, 2026 at 9:07 pm |
| frontend-cloud | ❌ Build Failed (View Logs) | Web | Feb 22, 2026 at 9:07 pm |
| ladle | ❌ Build Failed (View Logs) | Web | Feb 22, 2026 at 9:07 pm |

@NathanFlurry NathanFlurry mentioned this pull request Feb 22, 2026
11 tasks
Member Author

NathanFlurry commented Feb 22, 2026


This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude

claude bot commented Feb 22, 2026

PR Review: chore(rivetkit): update conns to be hibernatable

This PR fixes several interconnected issues with hibernatable WebSocket connections in the file system driver and the broader connection lifecycle. The changes are well-reasoned and address real bugs.


Core Fix: isHibernatable getter (conn/mod.ts)

The key fix. The old implementation checked this[CONN_DRIVER_SYMBOL]?.hibernatable !== undefined, which incorrectly returned false for restored connections because the driver is cleared when a WebSocket disconnects. The new implementation correctly checks this.#stateManager.hibernatableDataRaw !== undefined.

This is the root cause fix that makes everything else work. Connections created as hibernatable retain that designation even when no active WebSocket is present.
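The difference between the two checks can be sketched with simplified stand-ins. The `Conn`, `StateManager`, and `ConnDriver` shapes below are hypothetical and are not the actual types in conn/mod.ts; the sketch uses TypeScript `private` instead of `#` fields and a plain `driver` property instead of the symbol-keyed one.

```typescript
// Hypothetical, simplified stand-ins for the real types (assumptions, not the RivetKit API).
interface ConnDriver {
  hibernatable?: { gatewayId: Uint8Array; requestId: Uint8Array };
}

class StateManager {
  // Assumed to be set when the connection is created as hibernatable,
  // and to survive WebSocket disconnects because it is persisted.
  constructor(public hibernatableDataRaw?: Uint8Array) {}
}

class Conn {
  // The live driver; assumed to be cleared (undefined) when the WebSocket disconnects.
  driver?: ConnDriver;
  private stateManager: StateManager;

  constructor(stateManager: StateManager, driver?: ConnDriver) {
    this.stateManager = stateManager;
    this.driver = driver;
  }

  // Old check: depends on the live driver, so it is false once the driver is cleared.
  get isHibernatableOld(): boolean {
    return this.driver?.hibernatable !== undefined;
  }

  // New check: depends only on the persisted hibernation data.
  get isHibernatable(): boolean {
    return this.stateManager.hibernatableDataRaw !== undefined;
  }
}

// A restored connection: persisted data is present, but no driver is attached.
const restored = new Conn(new StateManager(new Uint8Array([1])), undefined);
```

With no driver attached, `restored.isHibernatableOld` is false while `restored.isHibernatable` is true, which is the behavior the review describes for restored connections.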


Message Ordering Fix (router-websocket-endpoints.ts)

Chaining pendingMessage promises to serialize message processing is a correct fix for the subscription/action race. One note: after an error in the chain, ws.close() is called but the chain resolves normally from .catch(), so messages that arrived before the close event propagates could still be queued. This is benign in practice since they fail gracefully once the connection is torn down, but worth being aware of.
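A minimal sketch of the chaining pattern and the error-path edge case noted above. `createSerialQueue` and `demo` are hypothetical names, and the `onError` callback stands in for `ws.close()`; this is not the code in router-websocket-endpoints.ts.

```typescript
type Handler = (msg: string) => Promise<void>;

// Hypothetical helper: each message is chained onto the previous one,
// so handlers run strictly in arrival order.
function createSerialQueue(handler: Handler, onError: (err: unknown) => void) {
  let pendingMessage: Promise<void> = Promise.resolve();
  return (msg: string): Promise<void> => {
    pendingMessage = pendingMessage
      .then(() => handler(msg))
      // The catch resolves normally, so later messages are still processed.
      .catch((err) => {
        onError(err);
      });
    return pendingMessage;
  };
}

async function demo(): Promise<string[]> {
  const order: string[] = [];
  let closed = false;
  const enqueue = createSerialQueue(
    async (msg) => {
      if (msg === "bad") throw new Error("handler failed");
      // "subscribe" is slower than "action"; without chaining they could reorder.
      await new Promise<void>((resolve) =>
        setTimeout(resolve, msg === "subscribe" ? 20 : 0),
      );
      order.push(msg);
    },
    () => {
      closed = true; // stand-in for ws.close()
    },
  );

  enqueue("subscribe");
  enqueue("bad");
  await enqueue("action"); // queued after the failure, still runs
  order.push(closed ? "closed" : "open");
  return order;
}
```

In this sketch `demo()` resolves to `["subscribe", "action", "closed"]`: ordering is preserved, and the message queued after the failing one is still handled, matching the edge case described in the note.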


Client-Side Message Queuing (client/actor-conn.ts)

The change from checking readyState === 1 to this.#connStatus !== "connected" correctly queues messages during the initialization window (before the Init message is received). This prevents messages from being sent before the server has fully established the connection state.
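The queuing behavior can be illustrated roughly as follows. `ClientConn`, `onInit`, and the `sent` array are hypothetical simplifications and do not reflect the actual client/actor-conn.ts implementation.

```typescript
type ConnStatus = "connecting" | "connected" | "disconnected";

// Hypothetical client connection; `sent` records what would go over the socket.
class ClientConn {
  connStatus: ConnStatus = "connecting";
  sent: string[] = [];
  private queue: string[] = [];

  send(msg: string): void {
    // Gate on the logical connection status, not the socket readyState:
    // the socket can be open before the server's Init message arrives.
    if (this.connStatus !== "connected") {
      this.queue.push(msg);
      return;
    }
    this.sent.push(msg);
  }

  // Called when the Init message is received from the server.
  onInit(): void {
    this.connStatus = "connected";
    const queued = this.queue;
    this.queue = [];
    for (const msg of queued) {
      this.sent.push(msg); // flush in the original order
    }
  }
}
```

Messages passed to `send` before `onInit` are held and flushed in order once the connection is marked connected; later messages are sent immediately.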


Issue: 4-byte IDs in createHibernatableRequestMetadata

const gatewayId = new Uint8Array(4);  // 32 bits
const requestId = new Uint8Array(4);  // 32 bits

These IDs are used as lookup keys in findHibernatableConn. With only 32 bits of entropy, the birthday paradox gives ~1% collision probability with ~9,000 concurrent connections. While the file system driver is single-node and unlikely to reach that scale, this differs from the pattern used elsewhere (e.g., crypto.randomUUID() for connection IDs). Consider using 16 bytes (128 bits) to match UUID entropy, or document why 4 bytes is sufficient for this context.
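The collision estimate can be checked with the standard birthday approximation p ≈ 1 − exp(−n(n−1)/(2N)), where N is the size of the ID space. The sketch below assumes a single 32-bit ID is the effective lookup key; if gatewayId and requestId are always matched together, the effective space would be larger. `collisionProbability` is a hypothetical helper name.

```typescript
// Birthday approximation: probability that at least two of `n` uniformly
// random IDs collide in a space of 2^bits values.
function collisionProbability(n: number, bits: number): number {
  const space = Math.pow(2, bits);
  const exponent = (n * (n - 1)) / (2 * space);
  return 1 - Math.exp(-exponent);
}

// 4-byte IDs with ~9,000 concurrent connections: a little under 1%.
const p32 = collisionProbability(9000, 32);

// 16-byte IDs at the same scale: far below double precision, effectively zero.
const p128 = collisionProbability(9000, 128);

console.log(p32.toFixed(4), p128);
```

This is only the approximation, not an exact count, but it supports the order of magnitude quoted above for 32-bit IDs.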


cleanupPersistedHibernatableConnections — Timing

The method is called from onBeforeActorStart, which runs after #ready = true but before #started = true. The resetSleepTimer() call inside connDisconnected will be a no-op at that point (it guards on #started). This is the intended behavior, since the sleep timer should only start after full initialization, but it relies on an implicit ordering that is not documented at the call site.
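The implicit ordering can be made concrete with a small sketch. `ActorInstance`, its fields, and the `start` method are hypothetical simplifications of the lifecycle described above, not the actual actor implementation.

```typescript
// Hypothetical actor lifecycle, reduced to the flags the review mentions.
class ActorInstance {
  ready = false;
  started = false;
  sleepTimerResets = 0;

  resetSleepTimer(): void {
    // Guard: the sleep timer is only managed once the actor has fully started.
    if (!this.started) return;
    this.sleepTimerResets++;
  }

  connDisconnected(): void {
    this.resetSleepTimer();
  }

  // Assumed to disconnect every stale persisted hibernatable connection.
  cleanupPersistedHibernatableConnections(staleCount: number): void {
    for (let i = 0; i < staleCount; i++) {
      this.connDisconnected();
    }
  }

  start(staleCount: number): void {
    this.ready = true;
    // Stand-in for onBeforeActorStart: runs after ready, before started.
    // resetSleepTimer() is a no-op here because `started` is still false.
    this.cleanupPersistedHibernatableConnections(staleCount);
    this.started = true;
  }
}
```

A comment like the one inside `start` is the kind of call-site documentation the review suggests: it records that the cleanup deliberately runs before the started flag is set.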


Test Assertion Fragility (driver-file-system.test.ts)

expect(disconnectWakeCounts).toEqual([2]);

Each call to getDisconnectWakeCounts() inside waitFor wakes the actor. If the actor sleeps and wakes again between loop iterations, the stale connection has already been cleaned up during wake #2, so disconnectWakeCounts remains [2] — subsequent wakes do not re-add to it. The assertion is stable, but this reasoning is non-obvious. A brief comment explaining why repeated wakes do not corrupt the expected value would help future maintainers.


Summary

| Area | Assessment |
| --- | --- |
| isHibernatable fix | Correct and necessary |
| Message ordering chain | Solid fix, minor edge case in error path |
| Client message queuing | Correct fix |
| ID entropy (4 bytes) | Should be increased or documented |
| Cleanup timing | Correct but implicit, worth a comment |
| Test coverage | Good scenario coverage |
| Engine driver parity | Engine onBeforeActorStart correctly omits cleanup since it manages WS restoration itself |

The changes are a clear improvement. The 4-byte ID size is the main actionable concern.

@NathanFlurry NathanFlurry marked this pull request as ready for review February 23, 2026 06:40
@NathanFlurry NathanFlurry force-pushed the 02-21-chore_inspector_add_back_database_tab branch from ebbe0a2 to 240add0 Compare February 23, 2026 08:25
@NathanFlurry NathanFlurry force-pushed the 02-22-chore_rivetkit_update_conns_to_be_hibernatable branch from 57c3534 to 033c3b9 Compare February 23, 2026 08:25
@NathanFlurry NathanFlurry force-pushed the 02-22-chore_rivetkit_update_conns_to_be_hibernatable branch from 033c3b9 to 7d43770 Compare February 23, 2026 18:22
@NathanFlurry NathanFlurry force-pushed the 02-21-chore_inspector_add_back_database_tab branch from 240add0 to 88fcb5f Compare February 23, 2026 18:22
@NathanFlurry NathanFlurry force-pushed the 02-21-chore_inspector_add_back_database_tab branch from 88fcb5f to 240add0 Compare February 23, 2026 18:34
@NathanFlurry NathanFlurry force-pushed the 02-22-chore_rivetkit_update_conns_to_be_hibernatable branch 2 times, most recently from 033c3b9 to d8a30c7 Compare February 23, 2026 18:35
@NathanFlurry NathanFlurry force-pushed the 02-22-chore_rivetkit_update_conns_to_be_hibernatable branch from d8a30c7 to c2be680 Compare February 24, 2026 02:39
@NathanFlurry NathanFlurry force-pushed the 02-21-chore_inspector_add_back_database_tab branch from 070db06 to 3c2f302 Compare February 24, 2026 02:40
@NathanFlurry NathanFlurry mentioned this pull request Feb 24, 2026
11 tasks
@NathanFlurry NathanFlurry force-pushed the 02-22-chore_rivetkit_update_conns_to_be_hibernatable branch from c2be680 to 310b793 Compare February 24, 2026 02:57
@NathanFlurry NathanFlurry force-pushed the 02-21-chore_inspector_add_back_database_tab branch from 3c2f302 to 62a8778 Compare February 24, 2026 02:57
@NathanFlurry NathanFlurry force-pushed the 02-22-chore_rivetkit_update_conns_to_be_hibernatable branch from 310b793 to d8a30c7 Compare February 24, 2026 03:19
@NathanFlurry NathanFlurry force-pushed the 02-21-chore_inspector_add_back_database_tab branch from 070db06 to ada2163 Compare February 24, 2026 04:01
@NathanFlurry NathanFlurry force-pushed the 02-22-chore_rivetkit_update_conns_to_be_hibernatable branch from d8a30c7 to 326942f Compare February 24, 2026 04:01
Base automatically changed from 02-21-chore_inspector_add_back_database_tab to main February 24, 2026 04:07
@NathanFlurry NathanFlurry merged commit aa88f1e into main Feb 24, 2026
11 of 30 checks passed
@NathanFlurry NathanFlurry deleted the 02-22-chore_rivetkit_update_conns_to_be_hibernatable branch February 24, 2026 04:07