
fix(rivetkit-core): return stopping not starting when sleep/destroy called mid-shutdown #4757

Draft
NathanFlurry wants to merge 1 commit into sleep-cleanup/11-remove-prevent-sleep-fixture from sleep-cleanup/12-sleep-destroy-mid-shutdown-diagnostic

Conversation

@NathanFlurry
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Member Author

NathanFlurry commented Apr 24, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude

claude Bot commented Apr 24, 2026

Code Review - PR 4757 (sleep/destroy mid-shutdown fix)

This is a focused, well-scoped fix. The root cause is clearly identified: sleep.started is used as a proxy for two distinct states and the old code could not distinguish "never started" from "already shutting down", returning a misleading actor/starting error to callers who called sleep() or destroy() mid-shutdown.

Correctness - sleep() fix (context.rs): The disambiguation logic is sound. When started=false, reading sleep_requested || destroy_requested is sufficient to detect the in-flight-shutdown case. The swap(true) later in sleep() still guards against a second sleep() call in the same codepath, so there is no double-accept risk.
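A minimal sketch of the disambiguation described above. The struct, field names, and error variants here are assumptions modeled on the review's wording (started, sleep_requested, destroy_requested, and the actor/starting vs actor/stopping errors), not the actual rivetkit-core source:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical stand-in for the actor lifecycle flags; names follow the
// review's description, not the real rivetkit-core types.
struct SleepState {
    started: AtomicBool,
    sleep_requested: AtomicBool,
    destroy_requested: AtomicBool,
}

#[derive(Debug, PartialEq)]
enum LifecycleError {
    Starting, // stands in for the "actor/starting" error
    Stopping, // stands in for the "actor/stopping" error
}

impl SleepState {
    fn sleep(&self) -> Result<(), LifecycleError> {
        if !self.started.load(Ordering::SeqCst) {
            // started=false alone is ambiguous: either the actor never
            // started, or shutdown already cleared the flag. The request
            // flags disambiguate the two states.
            if self.sleep_requested.load(Ordering::SeqCst)
                || self.destroy_requested.load(Ordering::SeqCst)
            {
                return Err(LifecycleError::Stopping);
            }
            return Err(LifecycleError::Starting);
        }
        // swap(true) still rejects a second sleep() on the started path,
        // so there is no double-accept risk.
        if self.sleep_requested.swap(true, Ordering::SeqCst) {
            return Err(LifecycleError::Stopping);
        }
        Ok(())
    }
}
```

Under this model, a caller who races sleep() against shutdown gets Stopping rather than a misleading Starting.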

Correctness - destroy() fix (context.rs): The new guard allows destroy() to proceed when the actor has already entered sleep-shutdown (sleep_requested=true, started=false). The comment explaining "destroy is a stronger signal" is accurate and matches the existing sleep_then_destroy_signal_tasks_do_not_leak_after_teardown test. If the lifecycle machine has cleared started and both request flags are still false but mark_destroy_requested() has already been called, the guard correctly passes because mark_destroy_requested sets destroy_requested = true.
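The destroy() guard can be sketched the same way. Again, the type and flag names are hypothetical stand-ins based on the review's description, not the actual rivetkit-core API:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical mock of the destroy() guard; field and error names are
// assumptions modeled on the review, not the real rivetkit-core types.
struct ActorState {
    started: AtomicBool,
    sleep_requested: AtomicBool,
    destroy_requested: AtomicBool,
}

#[derive(Debug, PartialEq)]
enum LifecycleError {
    Starting, // "actor/starting"
    Stopping, // "actor/stopping"
}

impl ActorState {
    fn destroy(&self) -> Result<(), LifecycleError> {
        // Only reject with "actor/starting" when the actor truly never
        // started: all three flags false.
        if !self.started.load(Ordering::SeqCst)
            && !self.sleep_requested.load(Ordering::SeqCst)
            && !self.destroy_requested.load(Ordering::SeqCst)
        {
            return Err(LifecycleError::Starting);
        }
        // A second destroy() is still rejected as "actor/stopping".
        if self.destroy_requested.swap(true, Ordering::SeqCst) {
            return Err(LifecycleError::Stopping);
        }
        // Reaching here covers the mid-sleep-shutdown case
        // (started=false, sleep_requested=true): destroy escalates sleep.
        Ok(())
    }
}
```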

Test coverage: The new test sleep_after_grace_clears_started_returns_stopping_not_starting directly covers the sleep() regression. It correctly simulates the real lifecycle sequence: set started=true, call sleep() (sets sleep_requested=true), clear started=false (simulating transition_to(SleepGrace)), then assert second sleep() returns actor/stopping not actor/starting.

Missing parallel test for destroy(): There is a test for destroy() called before startup (destroy_before_started_errors_with_actor_starting) and double_destroy_errors_with_actor_stopping, but no test for the symmetric case that was also changed: calling destroy() mid-shutdown (after started is cleared). A test like destroy_after_grace_clears_started_succeeds_as_escalation would give confidence that the extension is exercised and does not accidentally accept destroy() in a fully-terminated state.
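A sketch of the lifecycle sequence such a test would drive. This uses a self-contained mock rather than the crate's pub(crate) helpers (new_for_sleep_tests, set_sleep_started), whose signatures are not shown in the review, so everything below is an assumption about the shape of the real test:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Self-contained mock of the lifecycle flags described in the review;
// the real test would use the crate's pub(crate) sleep-test helpers.
struct MockActor {
    started: AtomicBool,
    sleep_requested: AtomicBool,
    destroy_requested: AtomicBool,
}

#[derive(Debug, PartialEq)]
enum LifecycleError {
    Starting,
    Stopping,
}

impl MockActor {
    fn new() -> Self {
        Self {
            started: AtomicBool::new(false),
            sleep_requested: AtomicBool::new(false),
            destroy_requested: AtomicBool::new(false),
        }
    }

    fn sleep(&self) -> Result<(), LifecycleError> {
        if !self.started.load(Ordering::SeqCst) {
            if self.sleep_requested.load(Ordering::SeqCst)
                || self.destroy_requested.load(Ordering::SeqCst)
            {
                return Err(LifecycleError::Stopping);
            }
            return Err(LifecycleError::Starting);
        }
        if self.sleep_requested.swap(true, Ordering::SeqCst) {
            return Err(LifecycleError::Stopping);
        }
        Ok(())
    }

    fn destroy(&self) -> Result<(), LifecycleError> {
        if !self.started.load(Ordering::SeqCst)
            && !self.sleep_requested.load(Ordering::SeqCst)
            && !self.destroy_requested.load(Ordering::SeqCst)
        {
            return Err(LifecycleError::Starting);
        }
        if self.destroy_requested.swap(true, Ordering::SeqCst) {
            return Err(LifecycleError::Stopping);
        }
        Ok(())
    }
}

// The suggested destroy-mid-shutdown test, sketched against the mock:
fn destroy_after_grace_clears_started_succeeds_as_escalation() {
    let actor = MockActor::new();
    actor.started.store(true, Ordering::SeqCst); // startup completed
    assert_eq!(actor.sleep(), Ok(())); // sleep shutdown begins
    actor.started.store(false, Ordering::SeqCst); // simulates transition_to(SleepGrace)
    // destroy() mid-shutdown must escalate, not report actor/starting:
    assert_eq!(actor.destroy(), Ok(()));
    // and a second destroy() is still rejected:
    assert_eq!(actor.destroy(), Err(LifecycleError::Stopping));
}
```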

Test placement: Per CLAUDE.md, Rust tests should live under tests/, not inline in src/. The new test is added to the inline mod tests at the bottom of src/actor/sleep.rs rather than to tests/modules/sleep.rs. Existing lifecycle tests are also inline, so this may be an accepted exception because new_for_sleep_tests and set_sleep_started are pub(crate) helpers only reachable from the same crate. If these helpers can be made reachable from tests/, the new test should move there for consistency.

Ordering consistency: The existing sleep() path uses SeqCst for all three atomic loads/swaps. The new destroy() guard also uses SeqCst. Ordering is consistent.

Summary: The fix is correct and the test demonstrates the exact bug. Actionable feedback: (1) Add a parallel test for destroy() mid-shutdown where started=false and sleep_requested=true should allow destroy() to succeed, not return Starting. (2) Consider moving the new test and related sleep lifecycle tests to tests/modules/sleep.rs per CLAUDE.md guidance.

NathanFlurry force-pushed the sleep-cleanup/11-remove-prevent-sleep-fixture branch from c7f9133 to ace5f49 on April 24, 2026 at 13:16
NathanFlurry force-pushed the sleep-cleanup/12-sleep-destroy-mid-shutdown-diagnostic branch from 1939547 to 68ddecb on April 24, 2026 at 13:16
@github-actions
Contributor

Preview packages published to npm

Install with:

npm install rivetkit@pr-4757

All packages published as 0.0.0-pr.4757.2ad2d5d with tag pr-4757.

Engine binary is shipped via @rivetkit/engine-cli on linux-x64-musl, linux-arm64-musl, darwin-x64, and darwin-arm64. Windows users should use the release installer or set RIVET_ENGINE_BINARY.

Docker images:

docker pull rivetdev/engine:slim-2ad2d5d
docker pull rivetdev/engine:full-2ad2d5d

Individual packages:

npm install rivetkit@pr-4757
npm install @rivetkit/react@pr-4757
npm install @rivetkit/rivetkit-napi@pr-4757
npm install @rivetkit/workflow-engine@pr-4757

