
fix: #544 — ScheduleWakeup state-lifetime (#507 redux for v0.35.3rc16) #545

Merged
Nathan Schram (nathanschram) merged 1 commit into dev from fix/544-schedule-wakeup-state-lifetime
May 15, 2026
Summary

Bumps version to 0.35.3rc16.

Why the rc11 unit tests missed it

The four tests added in test_claude_runner.py for #507 directly seeded state.live_wakeups_arm_delay and ran the watchdog. They bypassed _clear_background_handle entirely. A full tool_use → tool_result → result sequence test would have caught it. The new test_dead_schedule_wakeup_shortens_post_result_after_tool_result_cleared test now exercises that path.

Out of scope (still v0.35.4)

Test plan

  • `uv run ruff format --check src/ tests/`: clean (279 files, no diff)
  • `uv run ruff check src/ tests/`: clean
  • `uv run pytest --no-cov`: 2664 passed, 2 skipped
  • `python3 scripts/validate_release.py`: Version: 0.35.3rc16 pre-release skip
  • (post-merge) CI publishes 0.35.3rc16 to TestPyPI
  • (post-merge) Integration test on @untether_dev_bot claude-test chat: instruct Claude to call ScheduleWakeup(delaySeconds=120) then end its turn. Verify within ~3 min: claude.post_result_idle.closing_stdin dead_wakeup=True effective_timeout_s=180.0 log appears, subprocess rc=0, no manual /cancel needed.
  • (post-merge) Regression on Loop mode: with /config → Loop mode ON, repeat the above. Shortcut MUST NOT engage; effective_timeout stays at 600 s.
  • (post-merge) Tier 7 + Tier 1-Claude integration attestation via scripts/run-integration-tests.sh 0.35.3rc16 --manual --tiers "tier7,tier1-claude" --notes "..."
  • (post-merge) scripts/fleet-rollout.sh 0.35.3rc16 to all 4 hosts

Closes #544. Refs #507, #374, #333.

🤖 Generated with Claude Code

…35.3rc16)

The rc11 #507 fix added `live_wakeups_arm_delay: dict[str, float]`
populated in `_register_background_handle` and read in
`_post_result_idle_watchdog` to shorten the 600 s post-result idle
timeout to `max_armed_delay + 60 s` when /loop is OFF. But the dict was
wiped by `_clear_background_handle` on the ScheduleWakeup tool_result —
which is the schedule-confirmation, NOT a terminal signal — so by the
time the watchdog ticked (after the `result` event, which lands AFTER
tool_result) the dict was empty and the dead-wakeup shortcut never
engaged.
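
The failure mode described above can be reproduced in a few lines. This is a minimal sketch, not the runner's actual code: the class `BuggyState`, the free functions, the 60 s grace constant, and the `min` cap against the 600 s default are simplifying assumptions; only the field name `live_wakeups_arm_delay` and the register/clear/watchdog ordering come from the description.

```python
from __future__ import annotations

from dataclasses import dataclass, field

DEFAULT_IDLE_TIMEOUT_S = 600.0  # post-result idle timeout from the description
WAKEUP_GRACE_S = 60.0           # assumed name for the "+ 60 s" margin


@dataclass
class BuggyState:
    # Hypothetical stand-in for the rc11 state: armed delays keyed by tool_id.
    live_wakeups_arm_delay: dict[str, float] = field(default_factory=dict)


def register_background_handle(state: BuggyState, tool_id: str, delay_s: float) -> None:
    # Runs on the ScheduleWakeup tool_use event.
    state.live_wakeups_arm_delay[tool_id] = delay_s


def clear_background_handle(state: BuggyState, tool_id: str) -> None:
    # Runs on the tool_result event, which only confirms the schedule.
    state.live_wakeups_arm_delay.pop(tool_id, None)


def effective_timeout(state: BuggyState, loop_mode: bool) -> float:
    # Watchdog read: shorten only when loop mode is off and a delay is known.
    if loop_mode or not state.live_wakeups_arm_delay:
        return DEFAULT_IDLE_TIMEOUT_S
    longest = max(state.live_wakeups_arm_delay.values())
    return min(DEFAULT_IDLE_TIMEOUT_S, longest + WAKEUP_GRACE_S)


state = BuggyState()
register_background_handle(state, "toolu_1", 120.0)

# What the rc11 tests effectively checked: dict seeded, watchdog run directly.
seeded_timeout = effective_timeout(state, loop_mode=False)

# What happens in a real turn: tool_result arrives before the result event.
clear_background_handle(state, "toolu_1")
lifecycle_timeout = effective_timeout(state, loop_mode=False)
```

In this model `seeded_timeout` is shortened while `lifecycle_timeout` falls back to the full default, which matches the gap between the green unit tests and the production hold-open.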

Live impact: channelo VPS auditor-toolkit session d11739ee-… on rc15,
24+ min hold-open with `pending_wakeup=False` despite
`last_action='tool:ScheduleWakeup (done)'`.

Fix: replace the per-tool_id dict with
`ClaudeStreamState.last_schedule_wakeup_arm_delay: float | None` —
a per-turn scalar high-water-mark (`max` semantics for multi-wakeup
turns) that survives `_clear_background_handle` and resets on each
fresh user prompt (`StreamUserMessage` with non-tool_result content;
mixed batches preserve the scalar so a tool turn still in flight
doesn't lose state).
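
A sketch of the scalar approach, under stated assumptions: `StreamState`, `on_schedule_wakeup`, `on_user_message`, and the list-of-block-types argument are hypothetical simplifications of `ClaudeStreamState` and `StreamUserMessage`, and capping the shortened timeout at the 600 s default is assumed rather than confirmed by the description.

```python
from __future__ import annotations

from dataclasses import dataclass

DEFAULT_IDLE_TIMEOUT_S = 600.0
WAKEUP_GRACE_S = 60.0  # assumed name for the "+ 60 s" margin


@dataclass
class StreamState:
    # Per-turn high-water mark; None means no ScheduleWakeup seen this turn.
    last_schedule_wakeup_arm_delay: float | None = None


def on_schedule_wakeup(state: StreamState, delay_s: float) -> None:
    # max semantics: several wakeups in one turn keep the longest delay.
    prev = state.last_schedule_wakeup_arm_delay
    state.last_schedule_wakeup_arm_delay = delay_s if prev is None else max(prev, delay_s)


def on_user_message(state: StreamState, block_types: list[str]) -> None:
    # Reset only on a fresh prompt. Any batch carrying a tool_result
    # (pure or mixed) belongs to a tool turn still in flight, so it is preserved.
    if block_types and "tool_result" not in block_types:
        state.last_schedule_wakeup_arm_delay = None


def effective_timeout(state: StreamState, loop_mode: bool) -> float:
    delay = state.last_schedule_wakeup_arm_delay
    if loop_mode or delay is None:
        return DEFAULT_IDLE_TIMEOUT_S
    return min(DEFAULT_IDLE_TIMEOUT_S, delay + WAKEUP_GRACE_S)


state = StreamState()
on_schedule_wakeup(state, 120.0)
on_schedule_wakeup(state, 30.0)                 # shorter delay does not lower the mark
on_user_message(state, ["tool_result"])         # schedule confirmation, not terminal
after_tool_result = effective_timeout(state, loop_mode=False)
with_loop_mode = effective_timeout(state, loop_mode=True)

on_user_message(state, ["text", "tool_result"])  # mixed batch preserves the scalar
after_mixed = state.last_schedule_wakeup_arm_delay

on_user_message(state, ["text"])                 # fresh user prompt resets it
after_fresh_prompt = effective_timeout(state, loop_mode=False)
```

Because nothing on the tool_result path touches the scalar, the watchdog still sees the armed delay when the `result` event lands, and a 120 s wakeup yields the 180 s timeout that the integration step in the test plan looks for.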

The original #507 unit tests directly seeded `live_wakeups_arm_delay`
and bypassed `_clear_background_handle`, which is why the rc11 fix
appeared green in CI but failed on channelo rc15 in production. 4 new
tests in `tests/test_claude_runner.py` cover the full
tool_use → tool_result → result lifecycle (does NOT bypass
`_clear_background_handle`), multi-wakeup max selection, new-turn
reset, and the mixed-batch edge case. The two existing #507 tests now
seed the scalar instead of the dict.

The broader background-task-lifecycle refactor (terminal-vs-arm signal
per primitive + deadline-expiry sweeps) tracked in #374 stays in
v0.35.4. The sibling defect where the 600 s safety-net watchdog
silently doesn't fire stays in #333 for v0.35.4 pending entry/exit
instrumentation.

Refs #507, #374, #333.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
coderabbitai Bot commented May 15, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

