test(background): fix flaky approval-wait tests via wait_for_status by ahyangyi · Pull Request #2008 · MoonshotAI/kimi-cli

ahyangyi · 2026-04-22T14:13:25Z

Related Issue

N/A

Description

Two tests in test_agent_tool.py polled task status with tight 200ms budgets (20 iterations of 10ms sleeps), which flake on slow runners. The status transition goes through an asyncio.create_task + asyncio.to_thread hop in BackgroundAgentRunner._apply_approval_runtime_event, so the wire-visible tool-call publication can race ahead of the on-disk status flip.
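For illustration, the polling pattern described above can be sketched as follows. This is a simplified reconstruction, not the original test code: `poll_for_status`, `read_status`, and the one-second delay in `demo_poll_miss` are hypothetical stand-ins for the test loop and for a status flip that lands late on a slow runner.

```python
import asyncio


async def poll_for_status(read_status, target, attempts=20, interval_s=0.01):
    """Poll read_status() up to `attempts` times; return True if target was seen."""
    for _ in range(attempts):
        if read_status() == target:
            return True
        await asyncio.sleep(interval_s)
    # Budget exhausted: roughly attempts * interval_s (about 200 ms by default).
    return False


async def demo_poll_miss():
    state = {"status": "running"}

    async def slow_flip():
        # Stand-in for a status write that lands after the polling budget.
        await asyncio.sleep(1.0)
        state["status"] = "awaiting_approval"

    flip = asyncio.create_task(slow_flip())
    seen = await poll_for_status(lambda: state["status"], "awaiting_approval")
    await flip
    return seen, state["status"]
```

The poller gives up even though the transition does eventually happen, which is the failure mode a fixed polling budget produces on a slow machine.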

Add an event-driven wait_for_status primitive on BackgroundTaskManager: each _mark_task_* writer now calls _notify_status_changed, which resolves any futures registered by concurrent wait_for_status calls. This avoids changing production behavior while giving tests a deterministic observation point for non-terminal transitions (e.g. 'awaiting_approval').
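A minimal sketch of how such a primitive could be structured, assuming an in-memory dict in place of the on-disk store. `StatusWaiters`, `mark`, and the attribute names are hypothetical; only `wait_for_status` and `_notify_status_changed` are names taken from this PR, and the real BackgroundTaskManager may differ in its details.

```python
import asyncio


class StatusWaiters:
    """Hypothetical stand-in for the manager's status-waiting logic."""

    def __init__(self):
        self._status = {}   # task_id -> status (in-memory substitute for the store)
        self._waiters = {}  # task_id -> futures of pending waiters

    def mark(self, task_id, status):
        """Writer: record the new status, then wake anyone waiting on this task."""
        self._status[task_id] = status
        self._notify_status_changed(task_id)

    def _notify_status_changed(self, task_id):
        # Pop the whole list so each registered future is resolved exactly once.
        for fut in self._waiters.pop(task_id, []):
            if not fut.done():
                fut.set_result(None)

    async def wait_for_status(self, task_id, target, timeout_s):
        """Return True once the task reaches `target`, or False on timeout."""
        loop = asyncio.get_running_loop()
        deadline = loop.time() + timeout_s
        while True:
            fut = loop.create_future()
            # Register first, then read, so no notification can be missed.
            self._waiters.setdefault(task_id, []).append(fut)
            try:
                if self._status.get(task_id) == target:
                    return True
                remaining = deadline - loop.time()
                if remaining <= 0:
                    return False
                try:
                    await asyncio.wait_for(fut, remaining)
                except asyncio.TimeoutError:
                    return False
            finally:
                # Remove our future; the list may already have been popped.
                pending = self._waiters.get(task_id)
                if pending is not None:
                    if fut in pending:
                        pending.remove(fut)
                    if not pending:
                        del self._waiters[task_id]


async def demo_wait():
    mgr = StatusWaiters()
    mgr.mark("t1", "running")
    waiter = asyncio.create_task(
        mgr.wait_for_status("t1", "awaiting_approval", timeout_s=2)
    )
    await asyncio.sleep(0)  # let the waiter register its future
    mgr.mark("t1", "awaiting_approval")
    reached = await waiter
    timed_out = await mgr.wait_for_status("t1", "completed", timeout_s=0.05)
    return reached, timed_out, dict(mgr._waiters)
```

In this sketch the waiter loops after each wake-up and re-reads the status, so a notification for an unrelated transition simply causes another wait rather than a false positive.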

Replace the polling loops in:

test_agent_tool_background_agent_waits_for_approval
test_task_stop_kills_background_agent_waiting_for_approval

with wait_for_status(task_id, 'awaiting_approval', timeout_s=2).

Add unit tests covering the new primitive: immediate return, transition event wake-up, timeout, thread-boundary notification, and predicate form.

Checklist

I have read the CONTRIBUTING document.
I have linked the related issue, if any.
I have added tests that prove my fix is effective or that my feature works.
I have run make gen-changelog to update the changelog.
I have run make gen-docs to update the user documentation.

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.


Two tests in test_agent_tool.py polled task status with tight 200ms budgets (20 iterations of 10ms sleeps), which flake on slow runners. The status transition goes through an asyncio.create_task + asyncio.to_thread hop in BackgroundAgentRunner._apply_approval_runtime_event, so the wire-visible tool-call publication can race ahead of the on-disk status flip.

Add an event-driven wait_for_status primitive on BackgroundTaskManager: each _mark_task_* writer now calls _notify_status_changed, which resolves any futures registered by concurrent wait_for_status calls. This avoids changing production behavior while giving tests a deterministic observation point for non-terminal transitions (e.g. 'awaiting_approval').

To avoid a lost-wakeup race where a notification fires after the store read but before the future is registered, the waiter registers its future BEFORE reading the store. The post-registration merged_view then either observes the target status (and returns immediately) or the future will be resolved by any subsequent notification.

The waiter removes its future in a finally block so timed-out or cancelled waits do not accumulate stale entries. Because _resolve_status_waiters pops the whole list atomically, the cleanup tolerates the list already being gone; empty lists are dropped so the dict cannot grow unboundedly.

The cross-thread branch of _notify_status_changed checks loop.is_closed() and also wraps call_soon_threadsafe in try/except RuntimeError, so a background agent_runner thread that races with event-loop shutdown cannot surface a spurious error.

Replace the polling loops in:

- test_agent_tool_background_agent_waits_for_approval
- test_task_stop_kills_background_agent_waiting_for_approval

with wait_for_status(task_id, 'awaiting_approval', timeout_s=2).

Add unit tests covering the new primitive: immediate return, transition event wake-up, timeout, thread-boundary notification, predicate form, cleanup on timeout and cancellation, the register-before-read no-lost-wakeup property, and the closed-loop no-op guarantee.
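The cross-thread guard described in the commit message could look roughly like the sketch below. `notify_from_any_thread` and its `resolve` callback are hypothetical names used only for illustration; the sketch assumes the notifier knows which event loop owns the waiters and is not the PR's actual implementation.

```python
import asyncio
import threading


def notify_from_any_thread(loop, resolve):
    """Run resolve() on the loop's thread; return False if the loop is gone."""
    try:
        running = asyncio.get_running_loop()
    except RuntimeError:
        running = None  # no event loop is running in the calling thread
    if running is loop:
        resolve()  # already on the loop thread: resolve inline
        return True
    if loop.is_closed():
        return False  # nothing left to wake
    try:
        loop.call_soon_threadsafe(resolve)
    except RuntimeError:
        # The loop was closed between the is_closed() check and the call.
        return False
    return True


async def demo_thread_notify():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()

    def resolve():
        if not fut.done():
            fut.set_result("woken")

    worker = threading.Thread(target=notify_from_any_thread, args=(loop, resolve))
    worker.start()
    result = await asyncio.wait_for(fut, timeout=2)
    worker.join()
    return result


def demo_closed_loop():
    loop = asyncio.new_event_loop()
    loop.close()
    return notify_from_any_thread(loop, lambda: None)
```

Both checks are needed in this sketch: is_closed() handles the common case cheaply, while the except clause covers a loop that shuts down between the check and the scheduling call.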

Copilot AI review requested due to automatic review settings April 22, 2026 14:13

Copilot started reviewing on behalf of ahyangyi April 22, 2026 14:13

ahyangyi mentioned this pull request Apr 22, 2026

fix(soul): re-inject yolo reminder after context compaction #2003
