Session status model — final structure (5 states: Working / Needs input / Ready / Stalled / Idle) #332

harshitsinghbhandari · 2026-06-19T12:48:35Z

harshitsinghbhandari
Jun 19, 2026
Collaborator

Final structure for the session status model, following the decisions on #331 and the review that followed. This is the spec to build against. Reply with changes and I will fold them in.

Background and the original problem statement are in #331 (three parallel mappings off SessionStatus, no_signal computed then discarded, "Needs input" overloaded, duplicated pill tables). This post is the resolved design.

Why we are doing this

The dashboard exists to do one thing well: let you run many agents in parallel and tell, at a glance, which ones need you. Every design choice should serve that scan. The current status model works against it.

A status is only worth showing if it changes what you do. Across everything we render there are really only a handful of distinct moves when you are scanning a wall of agents:

leave it alone (it is working)
respond (it is blocked on me)
act on a clean PR (merge it, or go review it)
get it moving (it will not finish on its own)
nothing (done, dead, or empty)

That is five moves. Today we expand one raw signal into thirteen SessionStatus values and render them through three different groupings, and most of those thirteen map to the same move, so they add states to scan without changing the decision you make. That is cognitive load with no payoff. The fine PR detail (CI failing vs changes requested vs approved) still matters, but it belongs in the inspector where you are already looking at that one PR, not in the glanceable pill where it is noise.

So we derive the states from the distinct moves, not from the backend values. Five states, each tied to one clear "what do I do."

We are not deleting information. The backend still has the full PR facts and activity history; this is purely about how much the glance surfaces. Collapsing is choosing the right altitude per surface: the pill answers "what does this session need from me right now," the inspector answers "what exactly is going on."

The five states

State	Means	Your move
Working	the agent is actively running	leave it alone
Needs input	the agent is blocked on you	respond
Ready	a clean PR is waiting on you (mergeable, approved, or needs your review)	merge it / go review it
Stalled	the agent will not finish on its own (hung, never booted, or stopped with unfinished work)	get it moving
Idle	nothing is happening, or the work is finished	nothing

The change from the earlier draft of this spec is that the old Ready bucket was doing two jobs (a clean one-click PR and an agent that quit on an unfinished PR), and the old Stuck only caught "never booted." Both are fixed below.

How we cut: whose move is it

The clean cut is not "is there a PR." It is two axes: what the agent is doing, and what the PR needs. Every case is a cell, and the five states fall straight out.

PR buckets:

clean (your move): mergeable, approved, review_required. There is a real, clean action for you.
unfinished (the agent's move, left undone): draft, ci_failed, changes_requested, merge-conflicting. The agent has more to do.
neutral: bare pr_open (open, no review requested, not mergeable, CI not failing). Just sitting there.
none.

agent state	no PR	clean PR	unfinished PR
active (healthy)	Working	Working `*`	Working `*`
waiting_input	Needs input	Needs input	Needs input
stopped (idle)	Idle	Ready	Stalled
silent: hung or never-booted	Stalled	Stalled	Stalled

* active-wins blind spot, see below. Bare pr_open behaves as "no PR" for status purposes: stopped on a neutral PR reads Idle.

Two things this grid makes explicit. First, an agent that stopped on an unfinished PR (failing CI, requested changes, still a draft) is Stalled, not Ready. It had the move and quit; folding it into a green "Ready" tells you to act, you open it, and there is nothing to do but kick the agent. That is the exact trust erosion this whole effort is trying to kill. Second, the entire silent row is one move regardless of PR: the agent is not responding, go look.

The single derivation

One function, evaluated top to bottom, first match wins. Every surface reads from this and nothing re-derives. Inputs: activity.state, isTerminated, the PR buckets above, and three timing facts (FirstSignalAt, LastActivityAt, signalCapable).

if isTerminated:                 -> Idle      # includes merged
elif activity == waiting_input:  -> Needs input
elif stalled():                  -> Stalled
elif activity == active:         -> Working
elif has a clean open PR:        -> Ready
else:                            -> Idle      # stopped, no actionable PR (incl. neutral pr_open)

where:

stalled() =
     (signalCapable && FirstSignalAt == 0 && now - LastActivityAt > BOOT_GRACE)   # never booted
  || (activity == active && now - LastActivityAt > HANG_TIMEOUT)                  # hung mid-run
  || (activity != active && hasUnfinishedPR)                                      # stopped, work undone

Notes:

stalled() is checked before the active branch on purpose, so a hung-but-still-"active" session is caught instead of reading Working.
stalled() is checked before the clean-PR branch, so the silent row outranks "go review." Reasonable (a possibly-broken agent is more urgent than a review), and easy to flip if we disagree later.
BOOT_GRACE is the old noSignalGrace, raised 90s -> 120s.
HANG_TIMEOUT is the one genuinely open number (see Open items).

Where this lives (important implementation note)

This mapping cannot be a remapping of the current 13-value SessionStatus. That value already resolved the active-vs-PR precedence the wrong way for this model (PR wins), so once a session reads ci_failed you have lost whether the agent is active, which is exactly the bit we now need. The new state must be derived from activity.state plus the raw PR buckets, not from the pre-collapsed status.

Recommendation: move the single source of truth into the backend by rewriting service/session/deriveStatus to emit these five states directly. It already has the activity state, the PR facts, the clock, and FirstSignalAt, so the timing logic (BOOT_GRACE, HANG_TIMEOUT) lives in one place with the data instead of being reimplemented in TypeScript. The frontend then renders the state verbatim and deletes workerDisplayStatus, attentionZone, sessionNeedsAttention, and workerStatusPulses entirely. One function, server-side, every surface downstream of it.

Current vs proposed

Today: one raw signal expands into 13 SessionStatus values, run through three independent groupings (workerDisplayStatus for the pill, attentionZone for the board, plus the sessionNeedsAttention / workerStatusPulses predicates), rendered across four surfaces with the pill table duplicated in two of them. The pill shows six labels (Working, Needs input, CI failed, Ready, Done, Idle), the board shows five columns, and they do not line up. no_signal is computed by the backend but has no frontend state, so it falls back to "Working".

Proposed: activity.state plus PR facts go through one backend function to five states (Working, Needs input, Ready, Stalled, Idle), and every surface renders that state verbatim.

Per case, how it renders now versus what it becomes (bold = change). The proposed column splits on agent activity where it matters, because active now wins:

case	Today: pill	Today: board	Proposed
agent active, no/any PR	Working (unless PR overrides)	Working / Pending / Needs you	Working
`waiting_input`	Needs input	Needs you	Needs input
never signalled, silent > grace (`no_signal`)	Working (fallback)	Working	Stalled
agent active but silent > `HANG_TIMEOUT`	Working	Working	Stalled (new detector)
stopped, `ci_failed`	CI failed	Needs you	Stalled
stopped, `changes_requested`	Needs input	Needs you	Stalled
stopped, `draft`	Working	Pending	Stalled
stopped, `review_required`	Needs input	Pending	Ready
stopped, `approved`	Ready	Ready to merge	Ready
stopped, `mergeable`	Ready	Ready to merge	Ready
stopped, bare `pr_open`	Working	Pending	Idle
`merged`	Done	Done	Idle
`terminated`	Done	Done	Idle
stopped, no PR (`idle`)	Idle (inspector) / Working (board)	Working	Idle

Today the pill and board disagree on several rows, and three states (no_signal, idle, active working) all collapse to "Working" so a possibly-broken agent is invisible. The proposed column has one value per case, rendered identically everywhere.

Decisions resolved (from the #331 review)

Stalled replaces Stuck; the ceiling stays at five. "The agent will not finish on its own" is one move whether the cause is never-booted, hung mid-run, or stopped-with-unfinished-work. Restart-vs-reprompt is a real difference, but it is one you learn when you open the session, not on the wall. So slot four is redefined and widened rather than split into a sixth state.
The no_signal detector is widened. "Never booted" is just the boot-time special case of "not making progress." The mid-run hang (agent worked, then wedged on a tool call or rate limit) is the failure that actually costs you, and today it shows a calm breathing "Working." We accept the false positives that a time-based hang detector brings, because missing the hang is worse.
Ready is tightened to clean PRs only. mergeable, approved, review-required. Unfinished PRs leave Ready: Working while the agent is active, Stalled once it stops.
Draft gets no special treatment. AO never creates PRs (confirmed: every Draft reference in the backend is read-only observation from GitHub via observe/scm/observer.go; there is no PR-creation path and no draft-as-handoff convention). So a draft is just an unfinished, merge-blocked PR and folds in with ci_failed / changes_requested: Working while active, Stalled when stopped.
Active-wins is kept, and now stated as a known blind spot. When the agent is still running on top of a PR that is already mergeable, the agent wins and the one-click merge is hidden until it stops. The common case (agent opens a PR, keeps iterating) is correct; the rare case (agent puttering while a green PR waits) buries a win. We keep active-wins because the alternative makes a clearly-busy agent show green, which reads as a lie, but we acknowledge the miss rather than pretend it does not exist.
Attention = Needs input + Stalled. Those are the states that actually need a human to unblock progress. Ready is a softer "your move" that should not nag.

Open items

HANG_TIMEOUT. The seconds of active-but-silent before a session flips to Stalled. Too low and long legitimate turns false-flag; too high and we eat the hang. This wants a real number from watching live sessions, not a guess. Marked TBD.
Bare pr_open when stopped. Weak lean is Idle (PR is up, nothing broken, nothing requested of you). Flip to Ready if we would rather nudge on any open PR.
Stalled color. Deferred. Label is "Stalled". It should not breathe (it is not alive); a muted warning tone fits its "go look" move.

Per-surface rendering

Pill (topbar + inspector): one shared definition, the five states. Working breathes; the rest are static.
Board columns: the same five states, re-derived from the one function (not a separate attentionZone set). Proposed left-to-right by urgency: Needs input, Stalled, Ready, Working, Idle.
Sidebar dot: same function. Idle renders distinctly, not folded into Working.
Attention (sidebar emphasis / counters): true for Needs input and Stalled only.

Out of scope (unchanged)

The hook -> ActivityState derivation (claudecode/activity.go and siblings).
The activity write path, CDC, SSE transport, and the reaper backstop for exited.
Notifications: NotificationNeedsInput already fires only on the transition into waiting_input, consistent with "Needs input = agent blocked". ready_to_merge / pr_merged / pr_closed_unmerged untouched.
POST /activity validation and the persisted activity_state CHECK set stay at the four ActivityState values. The new states need no API enum change: the wire status field is already a plain string. If we keep any client-side fallback, the frontend SessionStatus union should learn the new vocabulary so nothing silently maps to "working".

Implementation blast radius

backend/internal/service/session/status.go — the rewrite lands here: deriveStatus emits the five states from activity.state + PR buckets + timing; add the HANG_TIMEOUT detector and the BOOT_GRACE 90s -> 120s bump; tighten the PR buckets (clean vs unfinished vs neutral).
backend/internal/domain/status.go — the status vocabulary the API serializes (five states, or keep the richer set internally and project to five at the edge).
frontend/src/renderer/types/workspace.ts — delete workerDisplayStatus, attentionZone, sessionNeedsAttention, workerStatusPulses; render the backend state verbatim; teach SessionStatus the new vocabulary.
frontend/src/renderer/components/SessionsBoard.tsx — COLUMNS, BADGE.
frontend/src/renderer/components/ShellTopbar.tsx — STATUS_PILL.
frontend/src/renderer/components/SessionInspector.tsx — STATUS_PILL, activityDetail, the idle special-case.
frontend/src/renderer/components/Sidebar.tsx — SessionDot.
Tests: backend/internal/service/session/status_test.go and frontend/src/renderer/types/workspace.test.ts.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Session status model — final structure (5 states: Working / Needs input / Ready / Stalled / Idle) #332

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{editor}}'s edit

{{editor}}'s edit

Uh oh!

Replies: 0 comments

Select a reply

Uh oh!

Session status model — final structure (5 states: Working / Needs input / Ready / Stalled / Idle) #332

Uh oh!

Uh oh!

harshitsinghbhandari Jun 19, 2026 Collaborator

Why we are doing this

The five states

How we cut: whose move is it

The single derivation

Where this lives (important implementation note)

Current vs proposed

Decisions resolved (from the #331 review)

Open items

Per-surface rendering

Out of scope (unchanged)

Implementation blast radius

Replies: 0 comments

harshitsinghbhandari
Jun 19, 2026
Collaborator