feat(desktop): client-side stall detection for chat tool calls#7555
feat(desktop): client-side stall detection for chat tool calls#7555eulicesl wants to merge 2 commits into
Conversation
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Greptile SummaryAdds client-side stall detection for chat tool calls on macOS. A new
Confidence Score: 3/5The core stall detection logic and UI wiring are sound, but the integration between the detector's synthetic tool IDs and the content-block matching has a functional gap that prevents the banner from appearing for tools without a bridge-supplied toolUseId. The applyStallTransitions method registers tools under a synthetic key untracked-name when toolUseId is nil, but then matches content blocks by tuid == id where tuid is also nil — so the comparison is always false. Any tool the bridge delivers without a toolUseId will never show the amber spinner or Cancel banner, which is the primary deliverable of this feature. The fix is a one-line change, but until it lands the feature is partially inoperative for that code path. desktop/Desktop/Sources/Providers/ChatProvider.swift — specifically applyStallTransitions (the synthetic-ID / nil-tuid mismatch). All other changed files look correct. Important Files Changed
Sequence DiagramsequenceDiagram
participant Bridge as AgentBridge
participant CP as ChatProvider (MainActor)
participant SD as StallDetector (actor)
participant UI as ToolCallsGroup
Bridge->>CP: toolActivityHandler("started", toolUseId)
CP->>CP: addToolActivity(.running)
CP->>SD: await step(.toolStarted(id), atMs)
SD-->>CP: [] no transitions yet
loop every 500ms stallTickTask
CP->>SD: await tick(atMs now)
alt gap under 8s
SD-->>CP: []
else gap reaches 8s
SD-->>CP: [.tool(id, .running to .slow)]
CP->>CP: "applyStallTransitions block status = .slow"
CP-->>UI: amber spinner
else gap reaches 20s
SD-->>CP: [.tool(id, .slow to .stalled)]
CP->>CP: "applyStallTransitions block status = .stalled"
CP-->>UI: Cancel banner shown
end
end
UI->>CP: onCancelTurn tapped
CP->>Bridge: stopAgent interrupt
Bridge->>CP: toolActivityHandler("completed", toolUseId)
CP->>CP: addToolActivity(.completed)
CP->>SD: await step(.toolCompleted(id), atMs)
SD-->>CP: [.tool(id, .stalled to .running)]
Reviews (1): Last reviewed commit: "feat(desktop): client-side stall detecti..." | Re-trigger Greptile |
Greptile P1 on BasedHardware#7555: tools that arrive without a `toolUseId` were registered in the StallDetector under a synthetic `untracked-<name>` key, but their content block stored `toolUseId: nil`. The transition match used `tuid == id`, so `nil == "untracked-<name>"` was always false and every stall transition for those tools was silently dropped — the spinner never went amber and the Cancel banner never appeared, the exact case the feature exists to fix. Route both the registration site and the match site through a single `ChatProvider.stallTrackingId(toolUseId:name:)` helper so the key is derived identically and cannot diverge again. Add tests locking the derivation (real id passes through; nil falls back to the name key). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
kodjima33
left a comment
There was a problem hiding this comment.
Desktop feature: client-side stall detection, well-tested actor pattern. Approve only — feature >500 lines and mergeStateStatus BLOCKED, author needs to rebase.
A running tool call previously had only two states (running /
completed), so a slow or hung tool showed an indefinite spinner with
no signal and no way out.
Adds a `StallDetector` actor that watches two clocks per turn — the
inter-event gap and each in-flight tool's own duration — and promotes
state through `.running → .slow → .stalled` at configurable thresholds
(`StallThresholds`, default 8s / 20s). The detector is pure logic:
time is passed in on every call, so it is fully deterministic and
testable without wall-clock waits.
`ToolCallStatus` is extended to `{ running, slow, stalled, completed,
failed }` with an `isInFlight` helper, and the exhaustive switches that
consume it are updated across the chat, onboarding, and task-chat
surfaces. `ChatProvider` feeds every bridge event to the detector and
runs a periodic tick task so promotions fire during silent gaps. The
UI surfaces a "still working…" annotation on slow tools and a
message-level Cancel banner on stalled ones, wired to
`AgentBridge.interrupt()`.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Greptile P1 on BasedHardware#7555: tools that arrive without a `toolUseId` were registered in the StallDetector under a synthetic `untracked-<name>` key, but their content block stored `toolUseId: nil`. The transition match used `tuid == id`, so `nil == "untracked-<name>"` was always false and every stall transition for those tools was silently dropped — the spinner never went amber and the Cancel banner never appeared, the exact case the feature exists to fix. Route both the registration site and the match site through a single `ChatProvider.stallTrackingId(toolUseId:name:)` helper so the key is derived identically and cannot diverge again. Add tests locking the derivation (real id passes through; nil falls back to the name key). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
147b9b7 to
a10264a
Compare
|
Rebased onto the latest |
Summary
running/completed. When a tool is slow or hangs, the user sees an indefinite spinner with no signal that anything is wrong and no way to bail out. This is a common source of "the chat is stuck / broken" perception.StallDetectoractor watches two clocks per turn (the inter-event gap and each in-flight tool's own duration) and promotes UI state through.running → .slow → .stalledat configurable thresholds. The UI surfaces a "still working…" annotation on slow tools and a message-level Cancel banner on stalled ones, wired to the existingAgentBridge.interrupt().Linked Issue / Bounty
Root Cause (bug fixes only)
How It Works
StallDetectoris pure logic: the current time is passed in as a parameter on every operation (step(kind:atMs:),tick(atMs:)), so it has no wall-clock dependency and is fully deterministic under test. Production wraps it in a periodicticktask while a turn is active so promotions still fire during silent gaps when no events arrive.StallThresholdsholds the promotion cutoffs (slowGapMs/stalledGapMs, default 8s / 20s) as aSendable/Equatablestruct so they can be tuned later with no behavioral churn.ToolCallStatusis extended from{ running, completed }to{ running, slow, stalled, completed, failed }with anisInFlighthelper. The exhaustiveswitches that consume it are updated across the chat, onboarding, and task-chat surfaces (this is the bulk of the changed-line count — mechanical enum-extension ripple).ChatProviderfeeds every bridge callback to the detector and maps detector transitions onto the tool-call rows; the stalled state drives the Cancel banner.Change Type
Scope
Security Impact
AgentBridge.interrupt(). The detector observes only event timing, never content.Testing
xcrun swift build -c debug --package-path Desktop→Build complete!StallDetectorTests(13 tests: inter-event promotion, per-tool timers tracked independently, reset-on-event, threshold guard) andToolCallStatusTests(3 tests: the extended-enum +isInFlightcontract).Human Verification
main; 16 new tests pass; full test target compiles..slow/.stalledstate. Deterministically forcing a real stall end-to-end needs a fake-bridge harness (out of scope for this PR), so runtime UI evidence is a known gap.Evidence
Docs
Performance, Privacy, and Reliability
Migration / Backward Compatibility
ToolCallStatusextension is additive; persistence treats the new in-flight states as transient and never relies on them being present in stored history.Risks and Mitigations
ToolCallStatuscould miss an exhaustiveswitch. Mitigation: Swift's exhaustiveness checking forces every consumer to handle the new cases; the compile is green, andToolCallStatusTestslocks theisInFlightcontract.deferon turn-scope exit (success or throw).AI Disclosure