fix: #544 - ScheduleWakeup state-lifetime (#507 redux for v0.35.3rc16) #545
Merged
Nathan Schram (nathanschram) merged 1 commit on May 15, 2026
The rc11 #507 fix added `live_wakeups_arm_delay: dict[str, float]`, populated in `_register_background_handle` and read in `_post_result_idle_watchdog`, to shorten the 600 s post-result idle timeout to `max_armed_delay + 60 s` when /loop is OFF. But the dict was wiped by `_clear_background_handle` on the ScheduleWakeup tool_result (which is the schedule-confirmation, NOT a terminal signal), so by the time the watchdog ticked (after the `result` event, which lands AFTER tool_result) the dict was empty and the dead-wakeup shortcut never engaged.

Live impact: channelo VPS auditor-toolkit session d11739ee-… on rc15, 24+ min hold-open with `pending_wakeup=False` despite `last_action='tool:ScheduleWakeup (done)'`.

Fix: replace the per-tool_id dict with `ClaudeStreamState.last_schedule_wakeup_arm_delay: float | None`, a per-turn scalar high-water-mark (`max` semantics for multi-wakeup turns) that survives `_clear_background_handle` and resets on each fresh user prompt (`StreamUserMessage` with non-tool_result content; mixed batches preserve the scalar so a tool turn still in flight doesn't lose state).

The original #507 unit tests directly seeded `live_wakeups_arm_delay` and bypassed `_clear_background_handle`, which is why the rc11 fix appeared green in CI but failed on channelo rc15 in production.

4 new tests in `tests/test_claude_runner.py` cover the full tool_use → tool_result → result lifecycle (does NOT bypass `_clear_background_handle`), multi-wakeup max selection, new-turn reset, and the mixed-batch edge case. The two existing #507 tests now seed the scalar instead of the dict.

The broader background-task-lifecycle refactor (terminal-vs-arm signal per primitive + deadline-expiry sweeps) tracked in #374 stays in v0.35.4. The sibling defect where the 600 s safety-net watchdog silently doesn't fire stays in #333 for v0.35.4 pending entry/exit instrumentation.

Refs #507, #374, #333.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
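The state-lifetime rule described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the actual Untether code: `StreamStateSketch`, its method names, the `content_types` list, and `effective_idle_timeout` are hypothetical stand-ins; only the field name `last_schedule_wakeup_arm_delay`, the 600 s default, and the `+ 60 s` margin come from the text above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

POST_RESULT_IDLE_TIMEOUT_S = 600.0  # default post-result idle timeout (from the commit message)
DEAD_WAKEUP_GRACE_S = 60.0          # the "+ 60 s" margin (from the commit message)


@dataclass
class StreamStateSketch:
    """Hypothetical stand-in for ClaudeStreamState; only the fields discussed here."""

    background_handles: dict[str, str] = field(default_factory=dict)
    last_schedule_wakeup_arm_delay: float | None = None

    def register_wakeup(self, tool_id: str, delay_s: float) -> None:
        # tool_use: track the handle and raise the per-turn high-water mark.
        self.background_handles[tool_id] = "ScheduleWakeup"
        prev = self.last_schedule_wakeup_arm_delay
        self.last_schedule_wakeup_arm_delay = delay_s if prev is None else max(prev, delay_s)

    def clear_handle(self, tool_id: str) -> None:
        # tool_result only confirms scheduling: drop the handle, keep the scalar.
        self.background_handles.pop(tool_id, None)

    def on_user_message(self, content_types: list[str]) -> None:
        # Reset only on a fresh prompt (no tool_result blocks); mixed batches keep the scalar.
        if content_types and "tool_result" not in content_types:
            self.last_schedule_wakeup_arm_delay = None


def effective_idle_timeout(state: StreamStateSketch, loop_mode_on: bool) -> float:
    """Timeout the watchdog would use; any clamping in the real code is not shown in the PR."""
    delay = state.last_schedule_wakeup_arm_delay
    if loop_mode_on or delay is None:
        return POST_RESULT_IDLE_TIMEOUT_S
    return delay + DEAD_WAKEUP_GRACE_S
```

In this sketch, registering wakeups of 120 s and 90 s and then clearing both handles still leaves the scalar at 120 s, so the timeout with /loop OFF is 180 s. A user message containing only non-tool_result content resets the scalar and restores the 600 s default.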
Summary
- Replaces `live_wakeups_arm_delay: dict[str, float]` with a per-turn scalar high-water-mark (`ClaudeStreamState.last_schedule_wakeup_arm_delay: float | None`) so the rc11 #507 dead-ScheduleWakeup shortcut actually engages
- `_clear_background_handle` wiped the dict on the ScheduleWakeup tool_result (which is the schedule-confirmation, not a terminal signal), so by the time `_post_result_idle_watchdog` ticked the dict was empty and the shortcut never fired
- Live impact: session `d11739ee-…` sat 24+ min after `result` with `pending_wakeup=False` despite `last_action='tool:ScheduleWakeup (done)'`; full evidence in #544
- New tests cover the full lifecycle without bypassing `_clear_background_handle` (which is what hid the original defect); 2 existing #507 tests updated to seed the scalar

Bumps version to `0.35.3rc16`.

Why the rc11 unit tests missed it

The four tests added in `test_claude_runner.py` for #507 directly seeded `state.live_wakeups_arm_delay` and ran the watchdog. They bypassed `_clear_background_handle` entirely. A full tool_use → tool_result → result sequence test would have caught it. The new `test_dead_schedule_wakeup_shortens_post_result_after_tool_result_cleared` test now exercises that path.

Out of scope (still v0.35.4)

- Broader background-task-lifecycle refactor (terminal-vs-arm signal per primitive + deadline-expiry sweeps), tracked in #374.
- The 600 s safety-net watchdog that silently doesn't fire (#333, "… `result` event (idle-but-alive UX gap)"). Needs entry/exit instrumentation before code; documented as a sibling defect in #544.

Test plan

- `uv run ruff format --check src/ tests/` clean (279 files, no diff)
- `uv run ruff check src/ tests/` clean
- `uv run pytest --no-cov`: 2664 passed, 2 skipped
- `python3 scripts/validate_release.py`: `Version: 0.35.3rc16` pre-release skip
- `@untether_dev_bot` claude-test chat: instruct Claude to call `ScheduleWakeup(delaySeconds=120)` then end its turn. Verify within ~3 min: `claude.post_result_idle.closing_stdin dead_wakeup=True effective_timeout_s=180.0` log appears, subprocess rc=0, no manual `/cancel` needed.
- `/config → Loop mode ON`, repeat the above. Shortcut MUST NOT engage; effective_timeout stays at 600 s.
- `scripts/run-integration-tests.sh 0.35.3rc16 --manual --tiers "tier7,tier1-claude" --notes "..."`
- `scripts/fleet-rollout.sh 0.35.3rc16` to all 4 hosts

Closes #544. Refs #507, #374, #333.
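The testing gap behind the rc11 regression can be illustrated with a toy model of the old dict-based state. Everything here is a hypothetical sketch (`DictStateSketch`, `watchdog_timeout`, and the `toolu_1` id are invented for illustration); it only mirrors the behaviour this PR describes, where clearing the handle on tool_result also erased the arm delay.

```python
from __future__ import annotations


class DictStateSketch:
    """Toy model of the rc11 behaviour: arm delays live in a per-tool_id dict."""

    def __init__(self) -> None:
        self.live_wakeups_arm_delay: dict[str, float] = {}

    def register(self, tool_id: str, delay_s: float) -> None:
        # tool_use: remember the armed delay for this tool call.
        self.live_wakeups_arm_delay[tool_id] = delay_s

    def clear(self, tool_id: str) -> None:
        # tool_result: in the rc11 design this also dropped the armed delay.
        self.live_wakeups_arm_delay.pop(tool_id, None)


def watchdog_timeout(state: DictStateSketch) -> float:
    # Shorten to max delay + 60 s if any wakeup is still recorded, else 600 s.
    if state.live_wakeups_arm_delay:
        return max(state.live_wakeups_arm_delay.values()) + 60.0
    return 600.0


def seeded_timeout() -> float:
    """Shape of the original tests: seed the dict directly, never call clear()."""
    state = DictStateSketch()
    state.live_wakeups_arm_delay["toolu_1"] = 120.0
    return watchdog_timeout(state)


def lifecycle_timeout() -> float:
    """Shape of the new test: drive tool_use, then tool_result, then the watchdog."""
    state = DictStateSketch()
    state.register("toolu_1", 120.0)  # tool_use
    state.clear("toolu_1")            # tool_result (schedule confirmation)
    return watchdog_timeout(state)    # watchdog tick after the result event
```

Against this toy model the seeded variant reports the shortened 180 s timeout while the lifecycle variant falls back to 600 s, which is the kind of mismatch a full-sequence test surfaces and a directly seeded test hides.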
🤖 Generated with Claude Code