fix(rivetkit-core): gate startup until runtime is ready#4736

Draft
NathanFlurry wants to merge 1 commit into 04-24-fix_sqlite_text_nul_roundtrip from 04-24-fix_rivetkit-core_gate_startup_until_runtime_ready

@NathanFlurry
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Member Author

NathanFlurry commented Apr 24, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude

claude Bot commented Apr 24, 2026

Review: fix(rivetkit-core): gate startup until runtime is ready

Overview

This PR fixes a race condition where client requests could be dispatched to an actor that has not finished its startup sequence (i.e., onInitialize has not completed). It introduces a new actor.starting error, returns proper errors from active_actor for both pre-ready and stopping actors, and teaches the TypeScript client to retry on this new error code.


What is correct and well-done

  • The core fix is sound. Checking instance.ctx.ready() before handing out the instance prevents a whole class of race conditions where requests land on an actor mid-initialization.
  • Stopping state behavior fix. The old active_actor silently returned a stopping instance (with a warning). Returning a proper error code and having the client re-resolve via query target is strictly better behavior.
  • HTTP dispatch refactoring is clean. Computing framework_http_route before active_actor so the error response can be encoded in the correct format (message_boundary_error_response vs. inspector_anyhow_response) is the right separation.
  • Client retry symmetry. Treating starting identically to stopping in the invalidation and retry paths of actor-handle.ts and actor-query.ts is consistent.
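
The gating described above can be sketched roughly as follows. This is a minimal, self-contained model, not the PR's code: `InstanceState`, `GateError`, `Instance`, and the plain `ready` field are hypothetical stand-ins for `ActorInstanceState`, the real error type, and `instance.ctx.ready()`.

```rust
/// Hypothetical lifecycle states; the real `ActorInstanceState` may have more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InstanceState {
    Loading,
    Active,
    Stopping,
}

/// Hypothetical rejection reasons returned instead of a half-initialized instance.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GateError {
    Starting,
    Stopping,
}

/// Hypothetical instance; `ready` models `instance.ctx.ready()`.
pub struct Instance {
    pub state: InstanceState,
    pub ready: bool,
}

/// Hands out the instance only once startup has completed.
/// Pre-ready and stopping instances yield an error the client can retry on.
pub fn active_actor(instance: &Instance) -> Result<&Instance, GateError> {
    match instance.state {
        InstanceState::Stopping => Err(GateError::Stopping),
        InstanceState::Loading => Err(GateError::Starting),
        // Active but startup has not finished: still reject.
        InstanceState::Active if !instance.ready => Err(GateError::Starting),
        InstanceState::Active => Ok(instance),
    }
}
```

In this model a caller never observes an instance whose startup is still in flight; it either gets a ready instance or an error to retry on.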

Issues and suggestions

1. Missing newline at EOF in actor.starting.json

All other JSON artifact files end with a newline; this one does not. Add a trailing newline.

2. Unrelated pnpm dependency (@rivetkit/sql-loader@2.2.1)

The pnpm-lock.yaml change adds @rivetkit/sql-loader to the rivetkit package. This appears unrelated to the startup-gating fix and should be in a separate PR or explained in the description.

3. No tests for the new Starting error path

The lifecycle and registry tests in rivetkit-rust/packages/rivetkit-core/tests/modules/ have no coverage for the case where active_actor is called while the actor is in Loading state. A test that (1) spawns an actor with a slow onInitialize, (2) issues a request concurrently, and (3) asserts the actor.starting error is returned would prevent regressions. This matters because active_actor is also used by the WebSocket path (registry/websocket.rs:23), which propagates the error via ?.
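
One possible shape for such a test, sketched with only the standard library so it stays deterministic. Everything here is an assumption for illustration: `Ctx`, `dispatch`, and `request_during_slow_init` are hypothetical names, and a real test would drive the actual registry and an async runtime rather than a thread and a channel.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the actor context's readiness flag.
pub struct Ctx {
    ready: AtomicBool,
}

impl Ctx {
    pub fn new() -> Self {
        Ctx { ready: AtomicBool::new(false) }
    }

    pub fn ready(&self) -> bool {
        self.ready.load(Ordering::Acquire)
    }

    fn mark_ready(&self) {
        self.ready.store(true, Ordering::Release);
    }
}

/// Simplified gate: rejects with the `actor.starting` code until startup completes.
pub fn dispatch(ctx: &Ctx) -> Result<(), &'static str> {
    if ctx.ready() {
        Ok(())
    } else {
        Err("actor.starting")
    }
}

/// Runs the scenario and returns (result during init, result after init).
pub fn request_during_slow_init() -> (Result<(), &'static str>, Result<(), &'static str>) {
    let ctx = Arc::new(Ctx::new());
    let (release_tx, release_rx) = mpsc::channel::<()>();

    // (1) A "slow onInitialize": blocks until the test releases it.
    let init_ctx = Arc::clone(&ctx);
    let init = thread::spawn(move || {
        release_rx.recv().expect("release signal");
        init_ctx.mark_ready();
    });

    // (2) A request issued while initialization is still blocked.
    let during = dispatch(&ctx);

    // (3) Let initialization finish, then issue the request again.
    release_tx.send(()).expect("init thread alive");
    init.join().expect("init thread panicked");
    let after = dispatch(&ctx);

    (during, after)
}
```

Blocking the initializer on a channel, rather than sleeping, guarantees the first request lands before readiness is set, so the assertion on the rejection cannot flake on a slow machine.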

4. Active + not ready + destroy_requested edge case

When an actor is in ActorInstanceState::Active but !ready() and destroy_requested(), the code returns Destroying. This appears correct (actor destroyed during initialization), but a short comment explaining the state combination would help future readers since it is non-obvious.
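
As an illustration of the kind of comment that would help, here is a sketch of the state combination for an instance already in the `Active` state. `Rejection` and `classify_active` are hypothetical names, and the `(true, _)` arm is an assumption: the review only describes the not-ready branches.

```rust
/// Hypothetical rejection reasons for an instance in the Active state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Rejection {
    Starting,
    Destroying,
}

/// Decides whether an Active instance may be handed out.
/// Returns `None` when the instance is usable.
pub fn classify_active(ready: bool, destroy_requested: bool) -> Option<Rejection> {
    match (ready, destroy_requested) {
        // Startup finished: hand out the instance (assumed; not covered by the review).
        (true, _) => None,
        // Active + not ready + destroy requested: the actor was destroyed while
        // it was still initializing, so it is reported as Destroying rather
        // than Starting.
        (false, true) => Some(Rejection::Destroying),
        // Active + not ready: startup is still in flight.
        (false, false) => Some(Rejection::Starting),
    }
}
```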

5. Hardcoded 100 ms delay inconsistency in actor-handle.ts

At line ~351 (the WebSocket retry path), starting and stopping use a raw setTimeout(resolve, 100). The other retry sites all use #waitForRetryWindow. This was pre-existing for stopping but is now extended to starting. Consider unifying to #waitForRetryWindow or documenting why this path uses a different delay.

6. WebSocket path propagates Starting without explicit close-frame handling

handle_websocket propagates the active_actor error via ?. Per the CLAUDE.md convention, WebSocket rejections should carry a meaningful close code and group.code reason. Confirm the calling layer converts this anyhow::Error into a proper close frame (close code 1008, reason actor.starting) so browser clients get diagnostic info.
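
A minimal sketch of the conversion being asked about, assuming the error exposes its `group` and `code` as strings. `CloseFrame` and `close_frame_for` are hypothetical names, not the calling layer's actual API. The 123-byte limit comes from RFC 6455: a control frame payload is at most 125 bytes, 2 of which hold the close code.

```rust
/// Hypothetical close frame representation.
pub struct CloseFrame {
    pub code: u16,
    pub reason: String,
}

/// WebSocket close code 1008 (Policy Violation), as suggested in the review.
pub const POLICY_VIOLATION: u16 = 1008;

/// Maximum close reason length in bytes (125-byte control payload minus the 2-byte code).
const MAX_REASON_BYTES: usize = 123;

/// Truncates `reason` to the byte limit without splitting a UTF-8 character.
fn truncate_reason(reason: &str) -> &str {
    if reason.len() <= MAX_REASON_BYTES {
        return reason;
    }
    let mut end = MAX_REASON_BYTES;
    while !reason.is_char_boundary(end) {
        end -= 1;
    }
    &reason[..end]
}

/// Builds a close frame whose reason is the `group.code` string, e.g. `actor.starting`.
pub fn close_frame_for(group: &str, code: &str) -> CloseFrame {
    let reason = format!("{}.{}", group, code);
    CloseFrame {
        code: POLICY_VIOLATION,
        reason: truncate_reason(&reason).to_string(),
    }
}
```

With this shape, `close_frame_for("actor", "starting")` carries code 1008 and the reason `actor.starting`, which a browser client can read from the close event.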


Summary

The fix is correct and addresses a real race condition. The main gaps before merging are a test covering the Loading-state rejection, resolving the unrelated lock-file change, and a trailing newline in the artifact file. The delay inconsistency and WebSocket close-frame question are lower priority.

NathanFlurry force-pushed the 04-24-fix_sqlite_text_nul_roundtrip branch from 38f839d to e8072b7 on April 24, 2026 09:52
NathanFlurry force-pushed the 04-24-fix_rivetkit-core_gate_startup_until_runtime_ready branch from 6db71dc to e33e626 on April 24, 2026 09:52
NathanFlurry mentioned this pull request on Apr 24, 2026
11 tasks
NathanFlurry force-pushed the 04-24-fix_rivetkit-core_gate_startup_until_runtime_ready branch from e33e626 to 06b3383 on April 24, 2026 09:54
@github-actions
Contributor

Preview packages published to npm

Install with:

npm install rivetkit@pr-4736

All packages published as 0.0.0-pr.4736.ed96c00 with tag pr-4736.

Engine binary is shipped via @rivetkit/engine-cli on linux-x64-musl, linux-arm64-musl, darwin-x64, and darwin-arm64. Windows users should use the release installer or set RIVET_ENGINE_BINARY.

Docker images:

docker pull rivetdev/engine:slim-ed96c00
docker pull rivetdev/engine:full-ed96c00

Individual packages

npm install rivetkit@pr-4736
npm install @rivetkit/react@pr-4736
npm install @rivetkit/rivetkit-napi@pr-4736
npm install @rivetkit/workflow-engine@pr-4736

NathanFlurry force-pushed the 04-24-fix_rivetkit-core_gate_startup_until_runtime_ready branch from 49fed12 to c4a5108 on April 24, 2026 10:32