Use process CPU metrics to improve agent activity detection #4151

@gregpriday

Summary

Incorporate process-level CPU usage as a signal in the agent activity detection system (ActivityMonitor / AgentStateMachine) to improve accuracy of working/waiting/idle state transitions. CPU activity is a strong, objective indicator of whether a process is actually doing work — complementing the current heuristic-based approach that relies on terminal output patterns.

Problem Statement

Canopy's agent state detection currently relies on output pattern analysis: output volume, completion patterns, prompt detection, line rewrites, and silence timeouts. This works well in most cases but has known weaknesses:

  • False "working" during silence: An agent may produce no output while making API calls or processing internally, so silence alone is ambiguous. The system only falls back to "idle" after a 3-minute timeout, which is a long delay when the agent has in fact stopped working.
  • False "idle" during background work: Some agents do CPU-intensive work (code analysis, large diffs) without producing terminal output. The current system has no way to distinguish "silent but working" from "actually idle."
  • Confidence calibration: State transitions rely on trigger-based confidence scores that don't account for objective system-level signals.

CPU usage from the process tree provides a ground-truth signal: if the terminal's process tree is consuming significant CPU, the agent is doing work regardless of whether it's producing output. Conversely, near-zero CPU with no output is a strong indicator of "waiting" (blocked on user input or API response).

Desired Behavior

CPU as an additional signal to ActivityMonitor:

  • High CPU (>10%) + output activity → high-confidence "working" state
  • High CPU + no output → "working" state (silent processing) — prevents premature timeout to "idle"
  • Near-zero CPU + no output → higher-confidence "waiting" or "idle" — faster state transitions
  • Near-zero CPU + output → "working" (streaming API response, low local CPU since the work is on the server)
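The four combinations above can be read as a small decision table. Below is a minimal TypeScript sketch of that table, not a proposed implementation: `classifyCpu`, `suggestState`, and the 1% "near-zero" cutoff are assumptions made for illustration, and only the 10% figure comes from the list above. CPU values between the two cutoffs, and missing CPU data, simply defer to the existing output-pattern logic.

```typescript
// Hypothetical names throughout; none of these exist in Canopy today.
type CpuBand = "high" | "moderate" | "near-zero" | "unknown";
type Suggestion = "working" | "waiting-or-idle" | "defer-to-output-patterns";

// 10% comes from the list above; 1% is an assumed placeholder for "near-zero".
const HIGH_CPU_PERCENT = 10;
const NEAR_ZERO_CPU_PERCENT = 1;

function classifyCpu(cpuPercent: number | undefined): CpuBand {
  if (cpuPercent === undefined || Number.isNaN(cpuPercent)) return "unknown";
  if (cpuPercent > HIGH_CPU_PERCENT) return "high";
  if (cpuPercent < NEAR_ZERO_CPU_PERCENT) return "near-zero";
  return "moderate";
}

function suggestState(
  cpuPercent: number | undefined,
  hasRecentOutput: boolean,
): Suggestion {
  switch (classifyCpu(cpuPercent)) {
    case "high":
      // High CPU means "working" with or without output (silent processing).
      return "working";
    case "near-zero":
      // Output with low local CPU looks like a streamed API response.
      return hasRecentOutput ? "working" : "waiting-or-idle";
    default:
      // Moderate or unknown CPU adds no information; keep current behavior.
      return "defer-to-output-patterns";
  }
}
```

Distinguishing "waiting" from "idle" is deliberately left to the existing pattern detectors, since CPU alone cannot tell them apart.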

Integration points:

  • The existing ProcessStateValidator interface has hasActiveChildren() — this could be extended to also report CPU-level busyness
  • The WorkingSignalDebouncer and CompletionTimer could factor in CPU state to adjust their timing
  • State change confidence scores could be boosted or penalized based on CPU alignment
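One possible shape for the first two integration points is sketched below. Only `hasActiveChildren()` is taken from this issue, and its real signature may differ (it may be async, for example). `CpuAwareValidator`, `getTreeCpuPercent()`, `shouldTimeOut()`, and the 30-second fast timeout are hypothetical names and values chosen for illustration.

```typescript
// Existing interface as described in this issue; the real signature may differ.
interface ProcessStateValidator {
  hasActiveChildren(): boolean;
}

// Hypothetical extension: smoothed CPU percent for the terminal's process tree,
// or undefined when resource monitoring has no data.
interface CpuAwareValidator extends ProcessStateValidator {
  getTreeCpuPercent(): number | undefined;
}

function isCpuAware(v: ProcessStateValidator): v is CpuAwareValidator {
  return typeof (v as Partial<CpuAwareValidator>).getTreeCpuPercent === "function";
}

const BASE_SILENCE_TIMEOUT_MS = 3 * 60 * 1000; // the 3-minute timeout from the Problem Statement
const FAST_SILENCE_TIMEOUT_MS = 30 * 1000; // assumed value for illustration
const HIGH_CPU_PERCENT = 10;
const NEAR_ZERO_CPU_PERCENT = 1; // assumed value for illustration

// Decide whether a silent terminal should leave the "working" state.
function shouldTimeOut(v: ProcessStateValidator, silentForMs: number): boolean {
  const cpu = isCpuAware(v) ? v.getTreeCpuPercent() : undefined;
  if (cpu === undefined) return silentForMs >= BASE_SILENCE_TIMEOUT_MS; // unchanged behavior
  if (cpu > HIGH_CPU_PERCENT) return false; // silent processing: stay "working"
  if (cpu < NEAR_ZERO_CPU_PERCENT) return silentForMs >= FAST_SILENCE_TIMEOUT_MS;
  return silentForMs >= BASE_SILENCE_TIMEOUT_MS;
}
```

Keeping the CPU method on a sub-interface means validators that cannot report CPU keep today's timing without any changes.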

Not a replacement: CPU should be an additional input alongside existing pattern detection, not a replacement. Output patterns remain the primary signal for nuanced states like "waiting for approval" vs "waiting for input." CPU is a coarse but reliable tiebreaker.

Context

The current activity detection pipeline:

  • ActivityMonitor orchestrates multiple detectors: InputTracker, OutputVolumeDetector, HighOutputDetector, CompletionTimer, LineRewriteDetector, BootDetector
  • ProcessStateValidator interface already exists with hasActiveChildren() — a natural extension point
  • AgentStateService manages state machine transitions with trigger and confidence metadata
  • ProcessDetector already tracks child process busyness via isBusy in detection results

Scope & Constraints

Not in scope:

  • Changing the fundamental state machine model (idle/working/waiting/completed/failed)
  • Using memory metrics for activity detection (memory doesn't correlate well with activity)
  • Adding new visible UI states — this improves detection accuracy of existing states

Constraints:

  • Must not increase false-positive rate for "working" state (showing spinning icon when agent is idle)
  • Must not slow down "completed" detection — completion patterns in output should still transition immediately
  • CPU signal has inherent latency (2.5s polling + macOS decaying average) — it's a supporting signal, not a primary trigger

Acceptance Criteria

  • Agent state detection incorporates process CPU usage as an input signal
  • Silent high-CPU periods prevent premature timeout to "idle" (agent stays in "working")
  • Near-zero CPU with no output accelerates transition to "waiting" or "idle"
  • State change confidence scores reflect CPU alignment (higher when CPU confirms the state)
  • No regression in detection accuracy for output-based states (completion, prompt, approval)
  • CPU signal gracefully degrades if resource monitoring data is unavailable
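The confidence and graceful-degradation criteria could be checked against something like the sketch below. It assumes confidence is a number in [0, 1] and uses an arbitrary 0.1 adjustment; `adjustConfidence` and both constants are hypothetical. Low CPU never penalizes "working" here, because API-bound agents are expected to look quiet locally.

```typescript
type AgentState = "idle" | "working" | "waiting" | "completed" | "failed";

// Assumed values for illustration only.
const CPU_CONFIDENCE_DELTA = 0.1;
const HIGH_CPU_PERCENT = 10;
const NEAR_ZERO_CPU_PERCENT = 1;

function adjustConfidence(
  baseConfidence: number,
  targetState: AgentState,
  cpuPercent: number | undefined,
): number {
  // Graceful degradation: without CPU data the score is left untouched.
  if (cpuPercent === undefined || Number.isNaN(cpuPercent)) return baseConfidence;
  // Completion and failure stay driven by output and exit patterns only.
  if (targetState === "completed" || targetState === "failed") return baseConfidence;

  const busy = cpuPercent > HIGH_CPU_PERCENT;
  const quiet = cpuPercent < NEAR_ZERO_CPU_PERCENT;
  let delta = 0;
  if (targetState === "working") {
    if (busy) delta = CPU_CONFIDENCE_DELTA; // CPU confirms the state
  } else if (quiet) {
    delta = CPU_CONFIDENCE_DELTA; // quiet CPU confirms "idle" or "waiting"
  } else if (busy) {
    delta = -CPU_CONFIDENCE_DELTA; // busy CPU contradicts "idle" or "waiting"
  }
  // Clamp to the assumed [0, 1] range.
  return Math.min(1, Math.max(0, baseConfidence + delta));
}
```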

Edge Cases & Risks

  • API-bound agents: Agents waiting on API responses have near-zero local CPU but are genuinely "working" (from the user's perspective). The system should not mark these as "idle" — output patterns and the "waiting for API" heuristic should take priority over low CPU.
  • Background compilation: If the user runs a build in the same terminal after an agent completes, the CPU spike shouldn't flip the state back to "working." Exit/completion patterns should be authoritative.
  • CPU polling lag: macOS %cpu is a decaying average with ~30-60s lag, so CPU may still report high values briefly after the agent has actually stopped. The simple moving average (SMA) smoothing and existing debounce timers should handle this.
  • Multi-core normalization: Per-process CPU% is measured relative to a single core, so a process tree can exceed 100% on multi-core systems. The threshold should account for this (e.g., "significant CPU" might be >10% on an 8-core machine, not >50%).
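For the last two risks, one option is to normalize by core count before smoothing. The sketch below is an assumption about how that might look, not a description of the pipeline in #4149: `CpuSmoother`, its window size, and the choice to divide by core count are all hypothetical. Raw input is the summed %cpu of the process tree, where 100 means one fully used core.

```typescript
// Hypothetical helper; the real smoothing may live in the resource monitor.
class CpuSmoother {
  private readonly windowSize: number;
  private readonly coreCount: number;
  private samples: number[] = [];

  constructor(windowSize: number, coreCount: number) {
    if (windowSize < 1 || coreCount < 1) {
      throw new RangeError("windowSize and coreCount must be at least 1");
    }
    this.windowSize = windowSize;
    this.coreCount = coreCount;
  }

  // Adds one polled sample and returns the smoothed, normalized percentage.
  push(rawTreePercent: number): number {
    // Normalize so that 100 means "every core fully busy".
    this.samples.push(rawTreePercent / this.coreCount);
    if (this.samples.length > this.windowSize) this.samples.shift();
    return this.average();
  }

  // Simple moving average over the retained samples (0 when empty).
  average(): number {
    if (this.samples.length === 0) return 0;
    const sum = this.samples.reduce((acc, s) => acc + s, 0);
    return sum / this.samples.length;
  }
}
```

Under this normalization, a single-threaded process saturating one core of an 8-core machine reads as 12.5%, which sits just above a 10% threshold. Whether the threshold should apply to normalized or per-core values is still an open design choice.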

Dependencies

Depends on #4149 (per-terminal resource monitoring provides the CPU data pipeline)