fix(rivetkit): fix empty state [slopfix] #4967

Closed
MasterPtato wants to merge 1 commit into 05-04-chore_update_comment from 05-04-fix_rivetkit_fix_empty_state_slopfix_

Conversation

@MasterPtato
Contributor

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Contributor Author

MasterPtato commented May 5, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

@MasterPtato MasterPtato mentioned this pull request May 5, 2026
11 tasks
@github-actions
Contributor

github-actions Bot commented May 5, 2026

Preview packages published to npm

Install with:

npm install rivetkit@pr-4967

All packages published as 0.0.0-pr.4967.9037bcc with tag pr-4967.

Engine binary is shipped via @rivetkit/engine-cli on linux-x64-musl, linux-arm64-musl, darwin-x64, and darwin-arm64. Windows users should use the release installer or set RIVET_ENGINE_BINARY.

Docker images:

docker pull rivetdev/engine:slim-9037bcc
docker pull rivetdev/engine:full-9037bcc
Individual packages
npm install rivetkit@pr-4967
npm install @rivetkit/react@pr-4967
npm install @rivetkit/rivetkit-napi@pr-4967
npm install @rivetkit/workflow-engine@pr-4967

MasterPtato force-pushed the 05-04-chore_update_comment branch from 54bc0b3 to 1d1ad71 (May 5, 2026 02:03)
MasterPtato force-pushed the 05-04-fix_rivetkit_fix_empty_state_slopfix_ branch from f574ff9 to 148d187 (May 5, 2026 02:03)
@claude
Contributor

claude Bot commented May 5, 2026

PR #4967 Review: fix(rivetkit): fix empty state [slopfix]

The PR fixes a genuine initialization race and adds a backwards-compatibility recovery path. The core logic is sound. Notes below since the PR is still in DRAFT.


What the PR does

Four distinct fixes bundled together:

  1. task.rs — defer has_initialized=true persist for new actors. Previously, has_initialized=true was written to KV before the runtime preamble (createState) ran. A crash in that window would leave has_initialized=true + empty state, causing subsequent wakes to skip createState and surface c.state === undefined. The fix persists the flag only after spawn_run_handle returns for new actors.

  2. connection.rs: state_initialized flag on ConnHandle. A new AtomicBool tracks whether createConnState has run. Connections that disconnect before the flag is set skip onDisconnect, avoiding broken state exposed to user code. The from_persisted path correctly initializes it to true since persisted connections have already been through set_state_inner.

  3. napi_actor_events.rs — Two changes:

    • Empty snapshot recovery: snapshot.filter(|bytes| !bytes.is_empty()) treats an empty persisted snapshot as "no snapshot" so createState re-runs. This is a backwards-compat recovery for actors already in the broken state.
    • state_initialized() guard in dispatch_event: Consistent with the connection.rs guard for DisconnectConn events.
  4. pools.rs / guard/lib.rs — Rustls provider init moved to Pools::new(). Valid centralization since Pools::new() is always called before guard::start() at all call sites.
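
The crash window described in item 1 can be modeled with a small simulation. This is a minimal sketch, not the code in task.rs: `Kv`, `PersistOrder`, `first_start`, and `wake_runs_create_state` are hypothetical names, and the KV store is an in-memory map standing in for the real persistence layer.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the actor's KV store.
#[derive(Default)]
pub struct Kv {
    entries: HashMap<&'static str, Vec<u8>>,
}

impl Kv {
    pub fn has_initialized(&self) -> bool {
        self.entries.contains_key("has_initialized")
    }

    pub fn state(&self) -> Option<&[u8]> {
        self.entries.get("state").map(Vec::as_slice)
    }
}

/// When the has_initialized flag is written relative to the preamble.
#[derive(Clone, Copy)]
pub enum PersistOrder {
    /// Behavior described as the bug: flag first, createState second.
    FlagBeforePreamble,
    /// Behavior described as the fix: createState first, flag second.
    FlagAfterPreamble,
}

/// Simulates the first start of a new actor. `crash_in_preamble` models the
/// process dying before the createState result is persisted.
pub fn first_start(kv: &mut Kv, order: PersistOrder, crash_in_preamble: bool) {
    if matches!(order, PersistOrder::FlagBeforePreamble) {
        kv.entries.insert("has_initialized", vec![1]);
    }
    if crash_in_preamble {
        return;
    }
    // Stand-in for the state produced by createState.
    kv.entries.insert("state", b"initial".to_vec());
    if matches!(order, PersistOrder::FlagAfterPreamble) {
        kv.entries.insert("has_initialized", vec![1]);
    }
}

/// On wake, createState is skipped whenever the flag is already set.
pub fn wake_runs_create_state(kv: &Kv) -> bool {
    !kv.has_initialized()
}
```

With `FlagBeforePreamble` and a crash, the store ends up with the flag set and no state, so the next wake skips createState. With `FlagAfterPreamble`, the same crash leaves the flag unset and the next wake re-runs createState.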


Issues

Potential bug: empty-bytes heuristic (napi_actor_events.rs)

snapshot.filter(|bytes| !bytes.is_empty())

This silently re-runs createState if state legitimately serializes to zero bytes (e.g., an actor whose state CBOR-encodes to null or {} at zero bytes). Unlikely for TypeScript actors in practice, but architecturally fragile. A comment explaining the assumption ("empty bytes here always means a crash-window write, not valid empty state, because the CBOR encoding of any non-null TS object is never zero bytes") would help future readers and make the invariant explicit.
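
As a rough illustration of what the filter does, here is a sketch using a hypothetical helper `usable_snapshot` (the real code applies the filter inline). In standard CBOR, null is the single byte 0xf6 and an empty map is the single byte 0xa0, so a well-formed CBOR value is never zero bytes and would pass the filter. The concern above applies only if some layer can legitimately persist an empty buffer.

```rust
/// Hypothetical helper: treats an empty persisted snapshot the same as a
/// missing one, so the caller falls back to running createState.
pub fn usable_snapshot(snapshot: Option<Vec<u8>>) -> Option<Vec<u8>> {
    snapshot.filter(|bytes| !bytes.is_empty())
}

/// CBOR encoding of `null` (simple value 22).
pub const CBOR_NULL: [u8; 1] = [0xf6];

/// CBOR encoding of an empty map (major type 5, length 0).
pub const CBOR_EMPTY_MAP: [u8; 1] = [0xa0];
```

A missing snapshot and an empty snapshot both map to `None`, while `CBOR_NULL` and `CBOR_EMPTY_MAP` are kept as valid snapshots.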

Redundancy question: state_initialized() in dispatch_event

connection.rs::disconnect_managed already guards on state_initialized() before emitting ConnectionClosed. The dispatch_event handler in NAPI also checks state_initialized() before calling on_disconnect_final. If ActorEvent::DisconnectConn is only ever emitted by disconnect_managed, the NAPI check is defensive dead code. If there is a separate code path that emits DisconnectConn without going through disconnect_managed, then the NAPI check is load-bearing and worth a comment explaining that.

No regression tests

The PR checklist is entirely unchecked (consistent with being DRAFT), but before merging:

  • A test for the crash-window scenario: actor persisted with has_initialized=true + empty state bytes → next wake re-runs createState instead of surfacing undefined.
  • A test for the disconnect-before-createConnState path: onDisconnect must not fire.

Minor: SeqCst ordering for state_initialized

The flag transitions monotonically false → true with no ordering dependencies on other atomics. Release on the store and Acquire on the load would be sufficient. Not incorrect, just more conservative than necessary.
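
A minimal sketch of the suggested ordering, assuming a one-way flag like the one described. `ConnFlag` and its methods are hypothetical and stand in for the relevant part of ConnHandle.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical stand-in for the connection handle's initialization flag.
pub struct ConnFlag {
    state_initialized: AtomicBool,
}

impl ConnFlag {
    /// New connections start with the flag cleared.
    pub fn new() -> Self {
        Self {
            state_initialized: AtomicBool::new(false),
        }
    }

    /// Called once the connection state has been populated. The Release
    /// store makes writes that happened before it visible to any thread
    /// that later observes `true` through an Acquire load.
    pub fn mark_initialized(&self) {
        self.state_initialized.store(true, Ordering::Release);
    }

    /// Pairs with the Release store in `mark_initialized`.
    pub fn state_initialized(&self) -> bool {
        self.state_initialized.load(Ordering::Acquire)
    }
}

/// The disconnect hook should run only for connections whose state exists.
pub fn should_run_on_disconnect(conn: &ConnFlag) -> bool {
    conn.state_initialized()
}
```

SeqCst would additionally impose a single total order across all SeqCst operations, which a single monotonic flag does not need.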

Minor: comment style

A few new comments use parentheticals (e.g., "their state populated (i.e. the transport disconnected before...)"). CLAUDE.md asks to avoid "fragmented structures with parentheticals and dashes" — prefer rewriting as a complete sentence.


Conventions

| Check | Status |
|---|---|
| scc::HashMap instead of Mutex<HashMap> | N/A (no new maps) |
| tokio::sync for async locks | OK (parking_lot::RwLock used correctly in sync-context ConnHandleInner) |
| Tracing structured fields | OK (tracing::debug!(actor_id = %..., conn_id, reason = ?..., "lowercase msg")) |
| No _ => fallthrough on enums | N/A (no new match statements) |
| No Box::leak in per-request paths | OK |
| Conventional commits | OK |
| No eprintln!/println! | OK |
| Comment style | Minor violations (parentheticals) |

Summary

The fix is correct. Before promoting out of DRAFT:

  1. Add a comment in napi_actor_events.rs documenting why empty bytes implies uninitialized (not valid empty state).
  2. Clarify whether the state_initialized() check in dispatch_event is defensive or load-bearing relative to the one in disconnect_managed.
  3. Add regression tests for the two fixed scenarios.

@NathanFlurry
Member

Closing in favor of the split replacement stack:

  • conn-preflight/no-disconnect-on-failed-preflight
  • rivetkit-state/defer-initialized-persist
  • engine-pools/install-rustls-provider

The replacement stack keeps the failed-preflight/onDisconnect behavior, empty-state initialization ordering, and rustls provider setup as separate Graphite branches.

2 participants