Add idle agent detection metrics and GET /agents/idle endpoint #263

@chernistry

What

The heartbeat monitor (src/bernstein/core/heartbeat.py) already has an IDLE_LOG_AGE_THRESHOLD_SECONDS = 180 constant for detecting idle agents, and the orchestrator's agent_lifecycle.py has a recycle_idle_agents function. But there's no API to query which agents are currently idle, how long they've been idle, and what they were last working on. Add:

  1. An IdleAgentInfo dataclass with session ID, idle duration, last task, and last activity timestamp
  2. A detect_idle_agents() function in heartbeat.py that returns a list of IdleAgentInfo
  3. A GET /agents/idle endpoint that returns idle agent info as JSON
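A minimal sketch of how items 2 and 3 could fit together, using only the standard library. The `last_seen` mapping, the field names, and the `idle_agents_payload` helper are hypothetical; the real `HeartbeatMonitor` state and the web framework used in routes/agents.py are not shown in this issue, so this is written as a free function rather than a method.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import Optional

# Value quoted in the issue; in the real module this constant already exists.
IDLE_LOG_AGE_THRESHOLD_SECONDS = 180


@dataclass
class IdleAgentInfo:
    # Field names are assumptions based on the issue description.
    session_id: str
    idle_seconds: float
    last_task: Optional[str]
    last_activity: float  # Unix timestamp of the last observed activity


def detect_idle_agents(
    last_seen: dict,
    now: Optional[float] = None,
    threshold: float = IDLE_LOG_AGE_THRESHOLD_SECONDS,
) -> list:
    """Return agents whose last activity is older than `threshold` seconds.

    `last_seen` is a hypothetical mapping of
    session_id -> (last_activity_timestamp, last_task).
    """
    now = time.time() if now is None else now
    idle = [
        IdleAgentInfo(session_id, now - ts, task, ts)
        for session_id, (ts, task) in last_seen.items()
        if now - ts > threshold
    ]
    # Longest-idle agents first, so the best recycle candidates lead the list.
    return sorted(idle, key=lambda info: info.idle_seconds, reverse=True)


def idle_agents_payload(agents: list) -> dict:
    """Build a JSON-serializable body that GET /agents/idle could return."""
    return {"count": len(agents), "idle_agents": [asdict(a) for a in agents]}


if __name__ == "__main__":
    sample = {"sess-a": (time.time() - 600, "task-1"), "sess-b": (time.time(), None)}
    print(json.dumps(idle_agents_payload(detect_idle_agents(sample)), indent=2))
```

Passing `now` explicitly keeps the function deterministic in tests. The route handler would then only need to call `detect_idle_agents` and return the payload.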

Why

Idle agents consume provider quota and cost money. Operators need to see at a glance which agents are idle so they can decide whether to recycle them or let them wait for new tasks. This data also feeds into the adaptive parallelism system.

Where

This change spans 3 files:

  • Heartbeat monitor: src/bernstein/core/heartbeat.py, HeartbeatMonitor class (add detect_idle_agents)
  • Route: src/bernstein/core/routes/agents.py, add GET /agents/idle endpoint
  • Models: No new file needed; use the existing heartbeat module for the dataclass

How to implement

Step 1: Add IdleAgentInfo dataclass

In src/bernstein/core/heartbeat.py, add:
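The original snippet for this step is not included here. One possible shape for the dataclass, based only on the four fields listed under "What", is sketched below. The field names, the `frozen=True` choice, and the `to_dict` helper are assumptions, not the project's actual code.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class IdleAgentInfo:
    """Snapshot of one idle agent (hypothetical field names)."""

    session_id: str            # agent session identifier
    idle_seconds: float        # how long the agent has been idle
    last_task: Optional[str]   # last task worked on, if any
    last_activity: float       # Unix timestamp of the last activity

    def to_dict(self) -> dict:
        """Return a JSON-friendly dict, adding an ISO 8601 UTC timestamp."""
        data = asdict(self)
        data["last_activity_iso"] = datetime.fromtimestamp(
            self.last_activity, tz=timezone.utc
        ).isoformat()
        return data
```

Making the dataclass frozen treats each instance as an immutable snapshot, which suits data that is computed once per request and then serialized.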

Labels

P2 (Priority 2 - Normal), enhancement (New feature or request), help wanted (Extra attention is needed), python (Python)
