Context
Tools like roborev and no-mistakes validate the same direction for ShipCode: AI-generated work needs a durable review/checks layer, not just transient verifier text. For ShipCode, the right product shape is a GitHub/CI review lane attached to PRs and pipeline runs.
The lane should behave like a first-class CI review system: review generated commits/PR diffs, persist findings, expose unresolved items in the app, and feed structured findings back into execution/refine retries.
Goal
Add a CI/PR review lane that turns reviewer/verifier output into persistent findings linked to a thread, plan, execution attempt, commit SHA, and PR/check run.
Requirements
- Persist review findings as structured records instead of only storing latest phase output.
- Link each finding to enough source-of-truth context: project, thread, plan/review round, execution attempt, worktree path/branch, commit SHA when available, PR number when available, phase, severity, status, and source agent/model.
- Surface an open findings queue in the desktop app so unresolved findings do not disappear after retries.
- Support finding lifecycle states: open, fixed, ignored, superseded, and closed.
- When verification/review fails with structured findings, route retry back through execution/refine with the findings included in the executor prompt.
- Add a GitHub/CI integration path that can publish review status to the PR as a check/status/comment once a PR exists.
- Treat persisted paths as the source of truth for worktrees: always use the persisted thread/worktree paths, and never recompute cleanup paths from thread IDs.
- Clamp any IPC/reviewer/CI error messages before they reach the renderer.
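One possible shape for the finding record and its lifecycle, sketched under assumptions: the field names, the severity scale, the `ALLOWED` transition table, and the `transition` helper are all hypothetical and not taken from ShipCode's schema. Only the five lifecycle states and the list of linked context come from the requirements above.

```typescript
// Hypothetical sketch: names, severity scale, and transition rules are assumptions.
type FindingStatus = "open" | "fixed" | "ignored" | "superseded" | "closed";

interface FindingTransition {
  from: FindingStatus;
  to: FindingStatus;
  at: string; // ISO timestamp
}

interface Finding {
  id: string;
  projectId: string;
  threadId: string;
  planRound: number;
  executionAttempt: number;
  worktreePath: string; // persisted path, never recomputed from threadId
  branch: string;
  commitSha?: string; // set when a commit exists
  prNumber?: number; // set when a PR exists
  phase: "review" | "verify";
  severity: "low" | "medium" | "high"; // assumed scale
  status: FindingStatus;
  source: string; // agent/model that produced the finding
  message: string;
  history: FindingTransition[]; // append-only, so history is never deleted
}

// Assumed rules: which statuses each status may move to.
const ALLOWED: Record<FindingStatus, readonly FindingStatus[]> = {
  open: ["fixed", "ignored", "superseded", "closed"],
  fixed: ["open", "closed"], // a regression can reopen a fixed finding
  ignored: ["open", "closed"],
  superseded: ["closed"],
  closed: [],
};

// Returns a new record with the transition appended; the input is not mutated.
function transition(
  finding: Finding,
  to: FindingStatus,
  at: string = new Date().toISOString()
): Finding {
  if (ALLOWED[finding.status].indexOf(to) === -1) {
    throw new Error(`Illegal finding transition: ${finding.status} -> ${to}`);
  }
  return {
    ...finding,
    status: to,
    history: [...finding.history, { from: finding.status, to, at }],
  };
}
```

Keeping transitions in an append-only `history` array is one way to let a later fix mark a finding fixed or superseded while the earlier states remain visible.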
Acceptance Criteria
- A failed structured review or verification creates durable finding records.
- Retrying from a structured failure resumes execution/refine with those findings, not another blind verify run.
- The app can show current open findings for a thread/PR after the original phase output has changed.
- A later successful fix can mark prior findings fixed or superseded without deleting history.
- PR/check publishing is optional/configurable and does not block local pipeline execution when GitHub integration is unavailable.
- Tests cover finding persistence, retry routing from structured failures, lifecycle transitions, and renderer-safe error clamping.
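The retry routing and error clamping criteria could be exercised against something like the sketch below. Everything here is illustrative: `PhaseResult`, `NextStep`, `routeRetry`, `clampErrorMessage`, and the 500-character limit are hypothetical names and values, and falling back to another verify run when a failure carries no structured findings is an assumption rather than a stated requirement.

```typescript
// Hypothetical sketch: types, names, and limits are assumptions.
interface StructuredFinding {
  id: string;
  severity: string;
  file?: string;
  message: string;
}

type PhaseResult =
  | { ok: true }
  | { ok: false; findings: StructuredFinding[] };

type NextStep =
  | { kind: "done" }
  | { kind: "verify" }
  | { kind: "execute"; prompt: string };

const MAX_RENDERER_ERROR_LENGTH = 500; // assumed limit

// Collapse whitespace and truncate so the renderer never receives an unbounded message.
function clampErrorMessage(raw: unknown, max: number = MAX_RENDERER_ERROR_LENGTH): string {
  const text = raw instanceof Error ? raw.message : String(raw);
  const singleLine = text.replace(/\s+/g, " ").trim();
  return singleLine.length <= max ? singleLine : singleLine.slice(0, max - 1) + "…";
}

// Route a failed phase: structured findings go back to execution with the
// findings in the prompt; a failure without findings re-runs verification (assumed).
function routeRetry(result: PhaseResult, basePrompt: string): NextStep {
  if (result.ok) {
    return { kind: "done" };
  }
  if (result.findings.length === 0) {
    return { kind: "verify" };
  }
  const lines = result.findings.map((f) => {
    const location = f.file ? f.file + ": " : "";
    return `- [${f.severity}] ${location}${clampErrorMessage(f.message, 300)}`;
  });
  return {
    kind: "execute",
    prompt: `${basePrompt}\n\nUnresolved findings to fix:\n${lines.join("\n")}`,
  };
}
```

Clamping each finding message before it enters the prompt also bounds the prompt size, though a real implementation might prefer separate limits for prompts and renderer errors.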
Notes
Competitive reference points:
- roborev: persistent review ledger, per-commit reviews, open findings queue, refine loop, workflow-specific models.
- no-mistakes: push/PR gate, disposable worktrees, fixed validation sequence, CI babysitting.
ShipCode should not copy their CLI/TUI product shape. The useful piece is the durable CI-review layer inside ShipCode's issue-to-PR pipeline.