
Loop agent thread on active issue #78

@VincentShipsIt

Executive Summary

ShipCode's pipeline runs plan→review→execute→verify and exits. If the verify phase fails or new information arrives mid-run, the user has to restart from scratch. Symphony (SPEC §7.1, §10.3) keeps the agent thread alive and dispatches continuation turns ("keep working on this issue, address the gaps") while the issue is still in an active state, up to max_turns. This issue adds that loop so verify-phase rework no longer requires manual restart, and so the agent can iterate on the same context window instead of cold-starting.

Implementation Checklist

  • Confirm scope and affected code paths
  • Implement the requested behavior
  • Add or update focused tests
  • Run verification and document evidence

Problem Statement

A failed verify today produces a single transition to FAILED and stops. The agent's session ends, its context is discarded, and the next attempt starts fresh, losing all the in-context reasoning from the previous run. Most verify failures are small fixable gaps (a missing test, a lint rule, an off-by-one) that the agent could resolve in one more turn if it stayed on the same thread. Symphony's continuation pattern builds this retry into the dispatcher.

Goals

  • After the first turn, the worker re-fetches issue state and decides whether to continue.
  • If the issue is still active and turn_count < max_turns, send a continuation prompt on the same agent thread.
  • Continuation prompt is short, references the prior failure, and instructs the agent to address remaining gaps; it never re-sends the original PRD body.
  • turn_count is tracked in pipeline state and surfaced in the UI.
  • Cap is configurable per project; default 20 (matching Symphony).

Non-Goals

  • Not changing the planner→reviewer→executor→verifier sequence inside one turn.
  • Not implementing dynamic prompt selection per turn (continuation prompt is a single template).
  • Not adding human-in-the-loop continuation approval; the loop is automatic until cap or terminal state.

User Stories with Acceptance Criteria

Story 1: Verify-failure auto-retry

  • As a user, I file an issue. The first verify fails. The pipeline runs another turn on the same agent thread without my intervention.
  • Acceptance: turn_count increments to 2; agent transcript shows continuation prompt, not original PRD; verify result re-recorded.

Story 2: Cap enforcement

  • As a user with max_turns: 3, I see the loop exit after the third verify failure.
  • Acceptance: turn_count == 3, pipeline transitions to terminal FAILED state with reason max_turns_reached.

Story 3: Issue closed mid-loop exits cleanly

  • As a user, I close the issue while a continuation turn is mid-flight.
  • Acceptance: next iteration's pre-check observes the closed state and exits to terminal IDLE without sending another turn.

Story 4: Successful turn exits the loop

  • As a user, I see the second turn pass verify.
  • Acceptance: pipeline transitions to terminal SUCCESS; no third turn is dispatched.

Functional Requirements

  • Worker loop wraps the existing phase sequence: after a turn completes, re-poll issue state from GitHub.
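  • Exit conditions checked in order: terminal/non-active issue → exit; verify success → exit; turn_count >= max_turns → exit; else continue. Verify success is checked before the cap so that a passing final turn ends in SUCCESS rather than max_turns_reached.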
  • Codex thread reused via the existing app-server thread ID (already supported).
  • Claude session reused via claude -r <session-id>.
  • Continuation prompt is a separate WORKFLOW.md template (continuation_prompt) rendered with the same context plus prior_failure_reason.
  • turn_count persisted to pipeline state, emitted in pipeline events, displayed in the UI.
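The requirements above can be sketched as a small loop. This is a minimal illustration, not ShipCode's actual worker: `run_issue_loop`, `run_turn`, `issue_is_active`, and `LoopResult` are hypothetical names, events are collected in a list instead of a real event stream, and the sketch assumes verify success is checked before the cap so that a passing final turn ends in SUCCESS.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

DEFAULT_MAX_TURNS = 20  # default cap stated in Goals


@dataclass
class LoopResult:
    state: str                      # "SUCCESS", "FAILED", or "IDLE"
    turn_count: int
    reason: Optional[str] = None    # e.g. "max_turns_reached"
    events: List[Tuple[str, int]] = field(default_factory=list)


def run_issue_loop(
    run_turn: Callable[[int, Optional[str]], Tuple[bool, str]],
    issue_is_active: Callable[[], bool],
    max_turns: int = DEFAULT_MAX_TURNS,
) -> LoopResult:
    """Run turns on one agent thread until an exit condition holds.

    run_turn(turn_number, prior_failure_reason) is a hypothetical callable that
    runs plan -> review -> execute -> verify once and returns
    (verify_passed, failure_reason). Turn 1 receives None and would use the
    original prompt; later turns would use the continuation prompt.
    """
    events: List[Tuple[str, int]] = []
    turn_count = 0
    prior_failure: Optional[str] = None
    while True:
        turn_count += 1
        events.append(("turn_started", turn_count))
        verify_passed, failure_reason = run_turn(turn_count, prior_failure)
        events.append(("turn_completed", turn_count))

        # Exactly one issue-state poll per turn, taken after the turn completes.
        if not issue_is_active():
            return LoopResult("IDLE", turn_count, None, events)
        if verify_passed:
            return LoopResult("SUCCESS", turn_count, None, events)
        if turn_count >= max_turns:
            return LoopResult("FAILED", turn_count, "max_turns_reached", events)
        # Carry the failure into the next turn's continuation prompt.
        prior_failure = failure_reason
```

Injecting `run_turn` and `issue_is_active` as callables keeps the exit logic testable without a real agent process or GitHub access, which is the shape the integration tests in the Verification Plan need.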

Non-Functional Requirements

  • Loop overhead per turn (state poll + prompt render + dispatch) under 2 seconds excluding agent runtime.
  • Cancellation request observed within one polling interval.
  • No additional GitHub API calls beyond one issue-state poll per turn.

Success Criteria

  • A failing-then-passing verify scenario auto-resolves within 2 turns without user input.
  • A test forcing 3 consecutive verify failures with max_turns: 3 ends in terminal FAILED with max_turns_reached.
  • Closing the GitHub issue mid-loop terminates the next iteration without spawning a new agent process.
  • Pipeline events stream contains turn_started and turn_completed markers for each turn.
  • Renderer test confirms the continuation prompt does not include the original issue body.

Out of Scope

  • Per-turn model switching.
  • Human approval gate between turns.
  • Branching turns (try multiple strategies in parallel).
  • Cross-issue thread reuse.

Dependencies

Verification Plan

  • Integration test (happy path): mock verify to fail once then succeed; assert two turns, terminal SUCCESS.
  • Integration test (cap): mock verify to fail forever with max_turns: 3; assert terminal FAILED with max_turns_reached.
  • Integration test (issue closed): close the mock issue between turns; assert loop exits without dispatch.
  • Unit test: continuation prompt rendering includes prior_failure_reason and excludes original PRD body.
  • Manual: run against a real GH issue with a deliberately weak first plan and observe the second turn fix it.
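The renderer unit test in this plan could look roughly like the sketch below. The template text, the `render_continuation_prompt` helper, and the context keys (`issue_number`, `turn_count`, `max_turns`, `issue_body`) are assumptions for illustration; the real continuation_prompt template lives in WORKFLOW.md and its templating syntax may differ. Python's `string.Template` stands in for whatever renderer the project uses.

```python
from string import Template

# Hypothetical default template; the real one is the continuation_prompt
# entry in WORKFLOW.md.
CONTINUATION_PROMPT = Template(
    "Keep working on issue #$issue_number (turn $turn_count of $max_turns).\n"
    "The previous verify failed: $prior_failure_reason\n"
    "Address the remaining gaps. Do not restart from scratch."
)


def render_continuation_prompt(context: dict, prior_failure_reason: str) -> str:
    """Render the continuation prompt from the turn context plus the failure reason.

    Only placeholders named in the template are substituted, so extra context
    keys such as the issue body are never emitted.
    """
    values = dict(context, prior_failure_reason=prior_failure_reason)
    return CONTINUATION_PROMPT.substitute(values)


# The two assertions the unit test calls for.
context = {
    "issue_number": 78,
    "turn_count": 2,
    "max_turns": 20,
    "issue_body": "ORIGINAL PRD BODY",
}
prompt = render_continuation_prompt(context, "missing test for cap enforcement")
assert "missing test for cap enforcement" in prompt
assert "ORIGINAL PRD BODY" not in prompt
```

Because the same context object is passed to both templates, the exclusion of the PRD body depends entirely on the continuation template not referencing it, which is why the test asserts on rendered output rather than on the inputs.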

Risks & Open Questions

  • Risk: Agent thread state grows with every turn, and max_turns is the only bound; document it as such and consider per-turn token budget warnings.
  • Risk: Continuation prompt phrasing matters a lot for outcome quality; treat the default template as a tunable surface.
  • Open Q: Should the worktree be reset between turns, or carry forward changes from prior turns? Default: carry forward.
  • Open Q: What constitutes "issue still active": open with the dispatch label, or just open?
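One way to keep the second open question undecided in code is to put both readings behind a flag. The sketch below is an assumption-laden illustration: `IssueSnapshot`, `is_issue_active`, the `require_label` flag, and the label name `"shipcode"` are all hypothetical. The only external fact relied on is that a GitHub issue's `state` is `"open"` or `"closed"`.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class IssueSnapshot:
    state: str               # "open" or "closed", as reported by GitHub
    labels: FrozenSet[str]   # label names currently on the issue


def is_issue_active(
    issue: IssueSnapshot,
    dispatch_label: str = "shipcode",   # hypothetical label name
    require_label: bool = True,
) -> bool:
    """Return True if the loop may dispatch another turn for this issue.

    require_label=True  -> "open and has the dispatch label"
    require_label=False -> "just open"
    """
    if issue.state != "open":
        return False
    if require_label:
        return dispatch_label in issue.labels
    return True
```

With the stricter reading, removing the dispatch label becomes a second way to stop the loop without closing the issue.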

Metadata

Assignees: No one assigned
Labels: enhancement
Project status: Done
Milestone: No milestone