Skip to content

chore: consolidate docs and planner/worker instructions#2

Open
Coldaine wants to merge 4 commits intomainfrom
chore/consolidate-docs-20260408
Open

chore: consolidate docs and planner/worker instructions#2
Coldaine wants to merge 4 commits intomainfrom
chore/consolidate-docs-20260408

Conversation

@Coldaine
Copy link
Copy Markdown
Owner

@Coldaine Coldaine commented Apr 8, 2026

Consolidates documentation, planner/worker instructions, and removes deprecated PLANNING.md. Adds NORTH_STAR.md and PLANNER_RUNS.md docs.

Summary by CodeRabbit

Release Notes

  • Documentation

    • Restructured system documentation to emphasize gap-finding and task-based workflow over PR review cycles.
    • Simplified task state model; removed PR-centric workflow states.
    • Updated CLI command reference with refined project and task operation syntax.
  • Tests

    • Added integration test runner for validating CLI and queue functionality.

Copilot AI review requested due to automatic review settings April 8, 2026 11:36
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 8, 2026

📝 Walkthrough

Walkthrough

This pull request redesigns the Zo Work Queue system from a PR-centric review workflow to a gap-based task model. The planner discovers implementation gaps and creates TODO tasks; workers claim and execute tasks, opening PRs as their delivery mechanism; neither planner nor worker manages PR review/merge states. Structured JSONL run records are now mandatory for both roles.

Changes

Cohort / File(s) Summary
Core System Documentation
AGENTS.md, NORTH_STAR.md, SYSTEM_INTENT.md
Redefined system behavior around gap-based tasks and planner/worker roles. Updated state machine from PR-review sequence to simple TODO → IN_PROGRESS → BLOCKED/COMPLETED/CANCELLED. Added mandatory JSONL run record requirements and explicit operational rules (planner does not code/open PRs, workers own build-to-PR lifecycle, CLI is queue authority).
Architecture & CLI Reference
docs/ARCHITECTURE.md, docs/WORKLOG_CLI.md
Narrowed planner scope from discovery/decomposition/prioritization/hygiene to discovery/decomposition/prioritization only. Removed PR review/merge/branch tracking metadata. Updated documented CLI commands to project-scoped set and simplified task state machine. Added "Persisted Metadata" and "Operational Notes" sections.
Queue Execution & Run Logs
docs/NORTH_STAR.md, docs/PLANNER_RUNS.md, docs/dev/proposedTests.md
Added new NORTH_STAR.md defining core loop (planner selects projects and generates gap-based tasks; worker claims and builds; planner reconciles drift). Added PLANNER_RUNS.md with dated planner run logs and queue reconciliation outcomes. Extended proposedTests.md with quick-run integration harness workflow.
Planning Documentation
docs/PLANNING.md (deleted), docs/plans/2026-03-31-*.md (added)
Removed legacy docs/PLANNING.md containing rollout plan and state/command enumerations. Added two new plan documents: pr-removal.md specifying elimination of PR-centric states and GitHub PR inspection, and rewrite-planner.md detailing new gap-based project selection, mandatory JSONL records, and updated control flow rules.
Planner Behavior
prompts/PLANNER_INSTRUCTION.md, prompts/PLANNER_PERSONA.md
Changed queue reconciliation from PR/branch-URL verification to local-git-only checks. Replaced PR-state handling with CANCELLED task recovery on branch deletion. Added mandatory daily JSONL structured run record with projects_selected/skipped, tasks_added (with gap_addressed field), tasks_closed, and queue_health. Updated project selection to 6+ projects with detailed gap-finding heuristic (read NORTH_STAR, scan source, compare intent vs reality).
Worker Behavior
prompts/WORKER_INSTRUCTION.md, prompts/WORKER_PERSONA.md
Removed multi-phase review/fix/merge workflow. Simplified to: claim TODO, read context/plan, validate gap statement, execute changes, open PR, mark COMPLETED. Added mandatory JSONL run record with task claim, branch, PR URL, validation, and next step. Removed prior branches for NEEDS_REVIEW/CHANGES_REQUESTED/READY_TO_MERGE handling.
Integration Test Harness
tools/run_worklog_cli_integration_tests.py
Added new CLI integration test runner exercising project registration, task lifecycle (claim → update → complete), priority ordering, stale lease detection/reclaim, and registry state transitions via SQLite manipulation. Uses temporary HOME and isolated paths to avoid touching production database. Includes custom test failure exception and exit-code handling.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

The majority of changes are cohesive documentation rewrites (homogeneous in pattern) reflecting a single workflow redesign. However, the breadth spans multiple conceptual domains (planner logic, worker logic, architecture, CLI, test harness), and the SYSTEM_INTENT.md, PLANNER_INSTRUCTION.md changes introduce dense logic requiring careful reasoning. The new integration test harness adds executable complexity. Mixed scope and density drive a moderate effort.

Poem

🐰 A gap-based future hops into view,
No PR review loops to construe—
Planner finds needs, worker ships the fix,
JSONL records tell the mix!
From chaos to clarity, work stacks align,
One queue to rule them all. Oh, so fine!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: the PR consolidates documentation and planner/worker instructions by reorganizing docs, adding NORTH_STAR.md and PLANNER_RUNS.md, and removing PLANNING.md.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch chore/consolidate-docs-20260408
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch chore/consolidate-docs-20260408

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Consolidates and rewrites the planner/worker documentation to shift from PR-centric workflows to gap-based planning + worker-owned delivery, while removing deprecated planning docs and adding new run-log and test-harness documentation.

Changes:

  • Rewrites planner/worker personas + instructions (gap-based planning, structured JSONL run records, PRs not treated as planning targets).
  • Replaces/removes legacy planning docs and adds new run logs / plan writeups.
  • Adds a Python integration harness and documents how to run it.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
tools/run_worklog_cli_integration_tests.py Adds an isolated CLI integration test harness for worklog state transitions.
prompts/WORKER_PERSONA.md Updates worker role definition and “done” criteria under the new workflow.
prompts/WORKER_INSTRUCTION.md Updates worker execution steps and adds mandatory structured run record output.
prompts/PLANNER_PERSONA.md Updates planner boundaries to avoid PR-centric behavior.
prompts/PLANNER_INSTRUCTION.md Rewrites planner process (exclusions, deep reading, gap-based tasks, run records).
docs/plans/2026-03-31-rewrite-planner.md Adds a retrospective plan doc describing the planner rewrite.
docs/plans/2026-03-31-pr-removal.md Adds a plan doc describing removal of PR-review workflow states.
docs/dev/proposedTests.md Documents running the new integration harness before manual tests.
docs/WORKLOG_CLI.md Rewrites CLI docs to match the new state machine and registry commands.
docs/PLANNING.md Removes deprecated planning doc.
docs/PLANNER_RUNS.md Adds a planner run log document.
docs/NORTH_STAR.md Adds a north-star doc under docs/ (currently overlaps/inconsistently mirrors root NORTH_STAR.md).
docs/ARCHITECTURE.md Updates architecture doc to remove PR-review workflow references.
SYSTEM_INTENT.md Rewrites system intent around machine-readable run records and gap-based planning.
NORTH_STAR.md Updates root north star to reflect non-PR-review workflow and structured observability.
AGENTS.md Updates repo entrypoint doc and links to the consolidated doc set.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +149 to +161
print("[3/9] NEEDS_REVIEW → READY_TO_MERGE → COMPLETED")
run_cmd(["review", id1, "--pr", "https://example.com/pr/1", "--summary", "ready"], env)
run_cmd(["ready-to-merge", id1, "--summary", "approved"], env)
complete = run_cmd([
"complete",
id1,
"--result",
"merged",
"--pr",
"https://example.com/pr/1",
"--files",
"a.py,b.py",
], env).stdout
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This harness hard-depends on PR-centric states/commands (review, ready-to-merge, request-changes, NEEDS_REVIEW, CHANGES_REQUESTED) that the updated docs explicitly remove (TODO/IN_PROGRESS/BLOCKED/COMPLETED/CANCELLED only). As written, the script will fail against the new CLI contract. Align the harness with the new state machine (e.g., exercise TODO→IN_PROGRESS→COMPLETED, BLOCKED, CANCELLED, stale-lease reclaim priority), or keep the old workflow docs—right now the PR introduces a direct contract mismatch.

Copilot uses AI. Check for mistakes.
get1 = run_cmd(["get", id1], env).stdout
assert_field_value(get1, "Status", "COMPLETED", "get archived item")

print("[4/9] CHANGES_REQUESTED priority over TODO")
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This harness hard-depends on PR-centric states/commands (review, ready-to-merge, request-changes, NEEDS_REVIEW, CHANGES_REQUESTED) that the updated docs explicitly remove (TODO/IN_PROGRESS/BLOCKED/COMPLETED/CANCELLED only). As written, the script will fail against the new CLI contract. Align the harness with the new state machine (e.g., exercise TODO→IN_PROGRESS→COMPLETED, BLOCKED, CANCELLED, stale-lease reclaim priority), or keep the old workflow docs—right now the PR introduces a direct contract mismatch.

Copilot uses AI. Check for mistakes.
Comment on lines +168 to +188
print("[4/9] CHANGES_REQUESTED priority over TODO")
add2 = run_cmd([
"add",
"--project",
"alpha",
"--task",
"Needs review changes",
"--priority",
"P1",
], env).stdout
id2 = extract_id(add2)
run_cmd(["claim"], env)
run_cmd(["review", id2, "--pr", "https://example.com/pr/2"], env)
run_cmd([
"request-changes",
id2,
"--summary",
"fix requested",
"--next",
"address comments",
], env)
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This harness hard-depends on PR-centric states/commands (review, ready-to-merge, request-changes, NEEDS_REVIEW, CHANGES_REQUESTED) that the updated docs explicitly remove (TODO/IN_PROGRESS/BLOCKED/COMPLETED/CANCELLED only). As written, the script will fail against the new CLI contract. Align the harness with the new state machine (e.g., exercise TODO→IN_PROGRESS→COMPLETED, BLOCKED, CANCELLED, stale-lease reclaim priority), or keep the old workflow docs—right now the PR introduces a direct contract mismatch.

Suggested change
print("[4/9] CHANGES_REQUESTED priority over TODO")
add2 = run_cmd([
"add",
"--project",
"alpha",
"--task",
"Needs review changes",
"--priority",
"P1",
], env).stdout
id2 = extract_id(add2)
run_cmd(["claim"], env)
run_cmd(["review", id2, "--pr", "https://example.com/pr/2"], env)
run_cmd([
"request-changes",
id2,
"--summary",
"fix requested",
"--next",
"address comments",
], env)
print("[4/9] stale IN_PROGRESS lease priority over TODO")
add2 = run_cmd([
"add",
"--project",
"alpha",
"--task",
"Stale claimed work",
"--priority",
"P1",
], env).stdout
id2 = extract_id(add2)
run_cmd(["claim"], env)
get2 = run_cmd(["get", id2], env).stdout
assert_field_value(get2, "Status", "IN_PROGRESS", "claimed item enters in-progress")

Copilot uses AI. Check for mistakes.
5. Commit: `git add . && git commit -m "<concise summary>"`
6. Push: `git push -u origin <branch>`
7. Open a PR: `gh pr create --title "<title>" --body "<what changed, why, how validated>"`
8. Update the task: `worklog.py update <id> --status COMPLETED --pr "<url>"`
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The worker instruction says to mark completion via worklog.py update --status COMPLETED, but elsewhere in this PR (and in the integration harness) completion is performed via worklog.py complete ... (which typically also archives to history and captures result/files). If complete remains the canonical completion action, update the worker instruction to use it; if update --status COMPLETED is now intended, the CLI reference and integration harness should be changed accordingly.

Suggested change
8. Update the task: `worklog.py update <id> --status COMPLETED --pr "<url>"`
8. Complete the task: `worklog.py complete <id> --pr "<url>"`

Copilot uses AI. Check for mistakes.
Comment on lines +22 to +27
python3 /home/workspace/Tools/worklog/worklog.py projects add \
--slug family-fling \
--path /home/workspace/GitHub/coldaine-github-repos/family-fling \
--repo Coldaine/family-fling \
--read-first AGENTS.md \
--read-first PLANNING.md
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PLANNING.md is removed in this PR (docs/PLANNING.md deleted), but the CLI docs still recommend registering it as a --read-first document. Update examples to reference files that still exist (e.g., docs/NORTH_STAR.md, NORTH_STAR.md, or project-specific docs/ entrypoints), otherwise new users will copy/paste a broken command.

Copilot uses AI. Check for mistakes.
Comment on lines +45 to +48
python3 /home/workspace/Tools/worklog/worklog.py projects update family-fling \
--default-branch main \
--read-first AGENTS.md \
--read-first docs/PLANNING.md
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example refers to docs/PLANNING.md, which is deleted by this PR. Replace with the new canonical planning entrypoints (e.g., docs/NORTH_STAR.md / NORTH_STAR.md) or remove the reference.

Copilot uses AI. Check for mistakes.
A structured system where AI agents can advance personal software projects across repeated runs: finding work, doing it, reviewing it, fixing it, merging it, and leaving a durable trail for the next run.

## The Core Loop

Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doc conflicts with the rewritten instructions elsewhere in the same PR: planner selection is now described as '6+ projects' (prompts/PLANNER_INSTRUCTION.md), and the worker no longer 'reviews/fixes/merges based on task state' under the new simplified state machine. Either update docs/NORTH_STAR.md to match the current design, or explicitly label it as legacy/historical to avoid having two contradictory north stars (this file vs root NORTH_STAR.md).

Copilot uses AI. Check for mistakes.
Comment on lines +13 to +15
Changes:
- Added Step 6: mandatory structured run record — append one JSON line to `/root/.zo/workqueue/runs/planner/YYYY-MM-DD.jsonl`
- Changed Step 4 from "scan gh pr list across 37 repos" to "deep-read 1-2 projects"
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This plan doc says the new approach is to deep-read '1-2 projects', but the updated prompts/PLANNER_INSTRUCTION.md Step 4 requires advancing '6+ registered projects per run'. Update this plan doc to match what actually shipped, or clarify that the plan changed and point to the final instruction as source of truth.

Suggested change
Changes:
- Added Step 6: mandatory structured run record — append one JSON line to `/root/.zo/workqueue/runs/planner/YYYY-MM-DD.jsonl`
- Changed Step 4 from "scan gh pr list across 37 repos" to "deep-read 1-2 projects"
Planned changes in this rewrite:
- Added Step 6: mandatory structured run record — append one JSON line to `/root/.zo/workqueue/runs/planner/YYYY-MM-DD.jsonl`
- Replaced the old "scan gh pr list across 37 repos" approach with project-documentation reading and gap-based planning. This plan originally described that as "deep-read 1-2 projects", but the shipped `prompts/PLANNER_INSTRUCTION.md` later changed Step 4 again; the final instruction is the source of truth and now requires advancing 6+ registered projects per run.

Copilot uses AI. Check for mistakes.

## 2026-03-27 09:20 UTC

### Queue Reconcilliation
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct spelling of 'Reconcilliation' to 'Reconciliation'.

Suggested change
### Queue Reconcilliation
### Queue Reconciliation

Copilot uses AI. Check for mistakes.

Claim order:
1. Stale IN_PROGRESS (lease expired)
2. TODO, sorted by priority ASC
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

“sorted by priority ASC” is ambiguous in a P0/P1/P2 system (and can be misread as lexicographic). Consider stating the explicit order (P0 first, then P1, then P2) and—if applicable—tie-breaking (e.g., oldest created first) to match the behavior described elsewhere in the instructions.

Suggested change
2. TODO, sorted by priority ASC
2. TODO, ordered by priority: P0 first, then P1, then P2

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 38076b29fc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +149 to +151
print("[3/9] NEEDS_REVIEW → READY_TO_MERGE → COMPLETED")
run_cmd(["review", id1, "--pr", "https://example.com/pr/1", "--summary", "ready"], env)
run_cmd(["ready-to-merge", id1, "--summary", "approved"], env)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Align integration harness with current task-state CLI

The new integration harness still drives deprecated workflow commands (review, ready-to-merge, request-changes) and asserts CHANGES_REQUESTED, but this commit’s own docs define the active state machine as TODO → IN_PROGRESS → BLOCKED/COMPLETED/CANCELLED (see docs/WORKLOG_CLI.md state machine and claim policy). In any environment running the redesigned CLI, the harness will fail on the first review call, so the recommended quick integration check cannot validate the current system behavior.

Useful? React with 👍 / 👎.

Comment on lines +18 to +19
Path(__file__).resolve().parents[3] / "Tools" / "worklog" / "worklog.py",
Path("/home/workspace/Tools/worklog/worklog.py"),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Resolve worklog path from checkout layout robustly

Path resolution is brittle: parents[3] makes the first candidate /Tools/worklog/worklog.py when the repo is checked out at /workspace/zo-workqueue, and the only fallback is /home/workspace/Tools/worklog/worklog.py. That misses common /workspace/Tools/worklog/worklog.py layouts, causing the documented “Quick Run (Recommended)” command to exit before tests start unless WORKLOG_PATH is manually set.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🧹 Nitpick comments (1)
docs/PLANNER_RUNS.md (1)

3-82: Add a clear “historical/pre-redesign” disclaimer to prevent policy drift.

The run entries document PR-centric planner behavior. Since this file sits in active docs, add a short header clarifying these entries are historical snapshots from before/around the redesign and are not current planner policy.

Based on learnings: "System is gap-based only: planner finds gaps, worker builds and ships, PR is opened as the delivery mechanism - neither planner nor worker works on existing PRs".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/PLANNER_RUNS.md` around lines 3 - 82, Add a short header at the top of
PLANNER_RUNS.md (before the first "## 2026-03-27 09:20 UTC" entry) that clearly
labels the document as historical/pre-redesign snapshots and not current planner
policy; include one sentence stating the system is "gap-based only: planner
finds gaps, worker builds and ships, PR is opened as the delivery mechanism -
neither planner nor worker works on existing PRs" to prevent policy drift and
ensure readers treat these entries as archival.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/NORTH_STAR.md`:
- Line 35: Fix the article typo in the sentence "PRs that already exist are not
plans. A PR is a artifact of work that was already done." by changing "a
artifact" to "an artifact" so the sentence reads "A PR is an artifact of work
that was already done." Target the exact sentence in NORTH_STAR.md to update the
article.
- Around line 5-52: Update NORTH_STAR.md to remove wording that assigns
planner/worker responsibilities for reviewing, fixing, or reconciling existing
PRs and instead state the gap-based model: change references such as "Planner
reconciles drift", any bullets or sentences that say the planner/worker
"reviews/fixes/merges PRs", "PRs that already exist are not plans", and the
Goal/What Makes This Different sections to emphasize that the planner only finds
gaps and writes new tasks and the worker implements and opens a PR as the
delivery mechanism (but does not operate on existing PRs); keep notes about full
git workflow limited to workers opening normal PRs as delivery and the planner
recognizing merged work, remove instructions that describe working on or
reconciling open PRs, and ensure all examples and Goals reflect the gap-first
flow.

In `@docs/PLANNER_RUNS.md`:
- Line 5: Heading text "Queue Reconcilliation" contains a spelling error; change
the heading string "Queue Reconcilliation" to "Queue Reconciliation" wherever it
appears (e.g., the Markdown heading in PLANNER_RUNS.md) so the word
"Reconciliation" is spelled correctly.

In `@prompts/WORKER_INSTRUCTION.md`:
- Around line 35-45: Update the step that marks tasks completed to use the
worklog.py complete command instead of the update --status flow: replace the
existing step referencing `worklog.py update <id> --status COMPLETED --pr
"<url>"` with a step instructing `worklog.py complete <id> --result
"<description>" --pr "<url>" --files "<file-list>"`; ensure the change appears
in the WORKER_INSTRUCTION.md step list and references the `worklog.py complete`
syntax exactly as required by the CLI and tests.

In `@prompts/WORKER_PERSONA.md`:
- Around line 7-10: The document has contradictory completion semantics: one
place ("PR open + task COMPLETED") conflicts with another ("TODO to merged PR");
pick a single canonical completion model (recommend: "task is complete when PR
is merged") and update the phrasing wherever referenced in
WORKER_PERSONA.md—specifically replace the string "PR open + task COMPLETED" and
the phrase "TODO to merged PR" so they match the chosen model (search for those
exact phrases), and add a short cross-reference note to WORKER_INSTRUCTION.md to
ensure both persona and instruction files use the same "merged PR" completion
definition.

In `@tools/run_worklog_cli_integration_tests.py`:
- Around line 168-210: The test is exercising the removed CHANGES_REQUESTED
state and the request-changes command; remove or replace those parts so the test
follows the current claim policy (only stale IN_PROGRESS and TODO are
claimable). Specifically, stop calling run_cmd(["review", id2, ...]) and
run_cmd(["request-changes", ...]) and instead create two TODO items (e.g., via
run_cmd(["add", ...]) for alpha and beta) with differing priorities, use
update_db to make one stale IN_PROGRESS if needed, then call claim and
assert_contains that the higher-priority TODO id is returned and
assert_field_value checks "Status" is "TODO"; update references to extract_id,
run_cmd, update_db, claim, assert_contains, and assert_field_value accordingly.
- Around line 253-258: The except block currently re-raises a new SystemExit
without chaining the caught TestFailure (in the block handling TestFailure as
exc); change the re-raise to preserve exception chaining by raising the
SystemExit from the caught exception (i.e., use "raise SystemExit(1) from exc"
when re-raising), leaving the initial "raise SystemExit(main())" behavior intact
and ensuring TestFailure is referenced as exc.
- Around line 149-166: The test uses removed PR-review commands and states
(review, ready-to-merge, request-changes and NEEDS_REVIEW / READY_TO_MERGE /
CHANGES_REQUESTED); update the sequence in the run_cmd calls to the simplified
workflow: replace calls to "review" and "ready-to-merge" (and any
"request-changes") with the new sequence using "add" -> "claim" to move TODO →
IN_PROGRESS, use "update" with "--status BLOCKED" to test blocking, use
"complete" to mark IN_PROGRESS → COMPLETED, and use "cancel" for cancellation
paths; also update assertions that check Status (via assert_field_value /
assert_contains on get/history outputs) to expect TODO, IN_PROGRESS, BLOCKED,
COMPLETED, or CANCELLED as appropriate and remove any references to the old
state names or commands while keeping the existing run_cmd/assert_* calls (e.g.,
run_cmd, assert_contains, assert_field_value, id1).

---

Nitpick comments:
In `@docs/PLANNER_RUNS.md`:
- Around line 3-82: Add a short header at the top of PLANNER_RUNS.md (before the
first "## 2026-03-27 09:20 UTC" entry) that clearly labels the document as
historical/pre-redesign snapshots and not current planner policy; include one
sentence stating the system is "gap-based only: planner finds gaps, worker
builds and ships, PR is opened as the delivery mechanism - neither planner nor
worker works on existing PRs" to prevent policy drift and ensure readers treat
these entries as archival.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3a566dad-0110-4429-a3c2-f496ea6a73a8

📥 Commits

Reviewing files that changed from the base of the PR and between 809e82d and 38076b2.

📒 Files selected for processing (16)
  • AGENTS.md
  • NORTH_STAR.md
  • SYSTEM_INTENT.md
  • docs/ARCHITECTURE.md
  • docs/NORTH_STAR.md
  • docs/PLANNER_RUNS.md
  • docs/PLANNING.md
  • docs/WORKLOG_CLI.md
  • docs/dev/proposedTests.md
  • docs/plans/2026-03-31-pr-removal.md
  • docs/plans/2026-03-31-rewrite-planner.md
  • prompts/PLANNER_INSTRUCTION.md
  • prompts/PLANNER_PERSONA.md
  • prompts/WORKER_INSTRUCTION.md
  • prompts/WORKER_PERSONA.md
  • tools/run_worklog_cli_integration_tests.py
💤 Files with no reviewable changes (1)
  • docs/PLANNING.md

Comment on lines +5 to +52
A structured system where AI agents can advance personal software projects across repeated runs: finding work, doing it, reviewing it, fixing it, merging it, and leaving a durable trail for the next run.

## The Core Loop

```
Planner selects 1-2 projects
→ reads NORTH_STAR, docs/, source code
→ finds gaps: missing tests, incomplete features, undocumented behavior, tests that haven't been written
→ writes a specific task with context and a step-by-step plan
→ adds it to the queue

Worker claims one task
→ builds, reviews, fixes, or merges based on task state
→ leaves enough context for the next run to continue

Planner reconciles drift
→ stale PRs, externally merged work, abandoned branches
→ closes loops before opening new ones
```

## What Makes This Different

This is not a PR-review machine. The planner's job is to find **new work** by reading code and documentation — not by scanning for open PRs.

The planner is a senior engineering manager who:
- Reads a project's north star and docs to understand what it's trying to be
- Reads the source to understand what it actually does
- Identifies the gap between those two
- Writes a specific, actionable task to close that gap

PRs that already exist are not plans. A PR is a artifact of work that was already done. The planner does not add tasks just because a PR is open.

## Goals

1. **Projects move forward without manual re-briefing.** Register a project once; the system keeps finding and closing meaningful work.
2. **Continuity across stateless sessions.** Each agent run starts cold, but durable task and PR state carry the workflow forward.
3. **Separation of planning and execution.** The planner decides what matters and keeps the queue healthy. Workers do build, review/fix, and merge work.
4. **Structured tooling, not freeform writing.** Agents interact through a CLI against a durable local store.
5. **Full git workflow.** Workers commit to branches, open normal PRs, apply valid review fixes, merge finished work.
6. **Observable.** Every run emits a concise summary.

## What Success Looks Like

- Planner reads a project's docs and source, finds a gap, writes a plan — not a PR-review task
- Worker executes the plan and opens a real PR
- Later worker reviews, requests fixes, merges
- Next planner run sees the merged work and moves on to the next gap
- The user observes projects advancing, not the same PRs being re-reviewed
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Align this document with the gap-based model (remove PR-review/merge responsibilities).

This file currently instructs planner/worker behavior around reviewing/fixing/merging PRs and PR-driven reconciliation (for example, Line 5, Line 17, Line 21, Line 41, Line 43, Line 50). That conflicts with the current model where planner finds gaps and workers deliver new work, without operating existing PRs.

Suggested doc fix (example)
-A structured system where AI agents can advance personal software projects across repeated runs: finding work, doing it, reviewing it, fixing it, merging it, and leaving a durable trail for the next run.
+A structured system where AI agents can advance personal software projects across repeated runs: finding work, building it, validating it, and shipping it while leaving a durable trail for the next run.
...
-  → builds, reviews, fixes, or merges based on task state
+  → builds and validates the scoped task, opens a PR, and marks it completed
...
-  → stale PRs, externally merged work, abandoned branches
-  → closes loops before opening new ones
+  → stale task leases, blocked tasks, and cancelled/outdated task intent
+  → keeps queue state accurate before opening new task loops
...
-3. **Separation of planning and execution.** The planner decides what matters and keeps the queue healthy. Workers do build, review/fix, and merge work.
+3. **Separation of planning and execution.** The planner decides what matters and keeps the queue healthy. Workers build, validate, and ship.
...
-5. **Full git workflow.** Workers commit to branches, open normal PRs, apply valid review fixes, merge finished work.
+5. **Delivery via git workflow.** Workers commit to branches, open PRs for delivered work, and mark tasks completed.

Based on learnings: "System is gap-based only: planner finds gaps, worker builds and ships, PR is opened as the delivery mechanism - neither planner nor worker works on existing PRs".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/NORTH_STAR.md` around lines 5 - 52, Update NORTH_STAR.md to remove
wording that assigns planner/worker responsibilities for reviewing, fixing, or
reconciling existing PRs and instead state the gap-based model: change
references such as "Planner reconciles drift", any bullets or sentences that say
the planner/worker "reviews/fixes/merges PRs", "PRs that already exist are not
plans", and the Goal/What Makes This Different sections to emphasize that the
planner only finds gaps and writes new tasks and the worker implements and opens
a PR as the delivery mechanism (but does not operate on existing PRs); keep
notes about full git workflow limited to workers opening normal PRs as delivery
and the planner recognizing merged work, remove instructions that describe
working on or reconciling open PRs, and ensure all examples and Goals reflect
the gap-first flow.

- Identifies the gap between those two
- Writes a specific, actionable task to close that gap

PRs that already exist are not plans. A PR is a artifact of work that was already done. The planner does not add tasks just because a PR is open.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix article typo at Line 35.

“a artifact” should be “an artifact”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/NORTH_STAR.md` at line 35, Fix the article typo in the sentence "PRs
that already exist are not plans. A PR is a artifact of work that was already
done." by changing "a artifact" to "an artifact" so the sentence reads "A PR is
an artifact of work that was already done." Target the exact sentence in
NORTH_STAR.md to update the article.


## 2026-03-27 09:20 UTC

### Queue Reconcilliation
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Correct spelling at Line 5.

“Queue Reconcilliation” → “Queue Reconciliation”.

🧰 Tools
🪛 LanguageTool

[grammar] ~5-~5: Ensure spelling is correct
Context: ...Log ## 2026-03-27 09:20 UTC ### Queue Reconcilliation - zo-workqueue w-77f6: PR #1 merg...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/PLANNER_RUNS.md` at line 5, Heading text "Queue Reconcilliation"
contains a spelling error; change the heading string "Queue Reconcilliation" to
"Queue Reconciliation" wherever it appears (e.g., the Markdown heading in
PLANNER_RUNS.md) so the word "Reconciliation" is spelled correctly.

Comment on lines +35 to +45
## Step 3: Execute

1. Read the project's `AGENTS.md` if it exists
1. Create a feature branch: `git checkout -b worker/<project>/<task-id>-<short-desc>`
2. Read the files mentioned in the task's context field
3. If a branch exists — check `git log --oneline -5` on it
4. If a PR exists — run `gh pr view <url>` to see its current state
5. If the task is `READY_TO_MERGE` — verify the PR is still mergeable

## Step 4: Execute

### Building (`TODO` or stale `IN_PROGRESS`)

1. Create a feature branch if one doesn't exist: `git checkout -b worker/<project>/<task-id>-<short-desc>`
2. Make the change
3. Validate — run tests, type checks, build, or behavioral verification
4. Commit: `git add . && git commit -m "<task summary>"`
5. Push: `git push -u origin <branch>`
6. Open a PR: `gh pr create --title "<title>" --body "<what changed, why, how validated>"`
7. Update the task: `worklog.py update <id> --status NEEDS_REVIEW --pr "<url>"`

### Reviewing or fixing (`NEEDS_REVIEW`)

1. Read the PR diff: `gh pr diff <url>`
2. Read the review comments if any
3. Either implement valid fixes on the same branch, or assess whether the feedback is actionable
4. If you fix: commit, push, and either move to `READY_TO_MERGE` or back to `NEEDS_REVIEW`
5. Record your review findings in the task notes

### Following up (`CHANGES_REQUESTED`)

1. Same branch, same PR
2. Implement the requested changes
3. Validate
4. Push
5. Either move to `READY_TO_MERGE` or back to `NEEDS_REVIEW`

### Merging (`READY_TO_MERGE`)
3. Make the changes as described in the plan
4. Validate — run tests, type checks, build, or behavioral verification
5. Commit: `git add . && git commit -m "<concise summary>"`
6. Push: `git push -u origin <branch>`
7. Open a PR: `gh pr create --title "<title>" --body "<what changed, why, how validated>"`
8. Update the task: `worklog.py update <id> --status COMPLETED --pr "<url>"`

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if worklog.py supports both update --status COMPLETED and complete commands
fd -t f 'worklog.py' --exec grep -n -A5 'def.*complete\|--status.*COMPLETED\|add_parser.*complete' {} \;

Repository: Coldaine/zo-workqueue

Length of output: 47


🏁 Script executed:

# First, locate worklog.py in the repository
find . -name 'worklog.py' -type f

Repository: Coldaine/zo-workqueue

Length of output: 47


🏁 Script executed:

# Also check for WORKLOG_CLI.md documentation
find . -name 'WORKLOG_CLI.md' -type f

Repository: Coldaine/zo-workqueue

Length of output: 86


🏁 Script executed:

# Search for any python files that might contain worklog-related code
fd -t f '\.py$' | head -20

Repository: Coldaine/zo-workqueue

Length of output: 107


🏁 Script executed:

# Read the documentation file to see what commands are documented
cat -n docs/WORKLOG_CLI.md

Repository: Coldaine/zo-workqueue

Length of output: 5843


🏁 Script executed:

# Search for worklog-related Python files more broadly
find . -type f -name '*.py' | xargs grep -l 'worklog\|complete.*command' | head -20

Repository: Coldaine/zo-workqueue

Length of output: 109


🏁 Script executed:

# Find and read the actual worklog.py implementation
cat -n Tools/worklog/worklog.py

Repository: Coldaine/zo-workqueue

Length of output: 121


🏁 Script executed:

# Search for worklog.py with correct case sensitivity
fd -t f 'worklog\.py'

Repository: Coldaine/zo-workqueue

Length of output: 47


🏁 Script executed:

# Check the tools directory structure
ls -la tools/ | head -30

Repository: Coldaine/zo-workqueue

Length of output: 262


🏁 Script executed:

# Read the integration test to see how worklog CLI is invoked
cat -n tools/run_worklog_cli_integration_tests.py

Repository: Coldaine/zo-workqueue

Length of output: 10627


🏁 Script executed:

# Search for any references to "complete" or "update --status" in the codebase
rg 'complete|update.*--status' --type markdown --type py | head -40

Repository: Coldaine/zo-workqueue

Length of output: 3574


Update line 44 to use the complete command instead of update --status COMPLETED.

The CLI documentation (docs/WORKLOG_CLI.md) explicitly distinguishes worklog.py complete as the command for marking tasks as COMPLETED, with required fields --result, --pr, and --files. The integration test also uses complete, not update --status COMPLETED. Line 44 should be:

8. Complete the task: `worklog.py complete <id> --result "<description>" --pr "<url>" --files "<file-list>"`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prompts/WORKER_INSTRUCTION.md` around lines 35 - 45, Update the step that
marks tasks completed to use the worklog.py complete command instead of the
update --status flow: replace the existing step referencing `worklog.py update
<id> --status COMPLETED --pr "<url>"` with a step instructing `worklog.py
complete <id> --result "<description>" --pr "<url>" --files "<file-list>"`;
ensure the change appears in the WORKER_INSTRUCTION.md step list and references
the `worklog.py complete` syntax exactly as required by the CLI and tests.

Comment on lines +7 to 10
You execute TODO tasks from the queue. You build, validate, commit, push, and open a PR — then mark the task COMPLETED.

Your job is to move a task from one state to the next, or from open to merged, without leaving things half-done.
Your job is to own the task through to delivery: from TODO to merged PR, without leaving things half-done.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Resolve contradictory completion semantics (“open PR” vs “merged PR”).

Line 7/Line 37 defines done as “PR open + task COMPLETED,” but Line 9 says “TODO to merged PR.” Keep one model to avoid worker ambiguity.

Suggested wording fix
-Your job is to own the task through to delivery: from TODO to merged PR, without leaving things half-done.
+Your job is to own the task through to delivery: from TODO to an opened PR with the task marked COMPLETED, without leaving things half-done.

As per coding guidelines: "Worker persona and instructions are defined in prompts/WORKER_PERSONA.md and prompts/WORKER_INSTRUCTION.md - reference these for worker agent behavior".

🧰 Tools
🪛 LanguageTool

[style] ~7-~7: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... You execute TODO tasks from the queue. You build, validate, commit, push, and open...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prompts/WORKER_PERSONA.md` around lines 7 - 10, The document has
contradictory completion semantics: one place ("PR open + task COMPLETED")
conflicts with another ("TODO to merged PR"); pick a single canonical completion
model (recommend: "task is complete when PR is merged") and update the phrasing
wherever referenced in WORKER_PERSONA.md—specifically replace the string "PR
open + task COMPLETED" and the phrase "TODO to merged PR" so they match the
chosen model (search for those exact phrases), and add a short cross-reference
note to WORKER_INSTRUCTION.md to ensure both persona and instruction files use
the same "merged PR" completion definition.

Comment on lines +149 to +166
print("[3/9] NEEDS_REVIEW → READY_TO_MERGE → COMPLETED")
run_cmd(["review", id1, "--pr", "https://example.com/pr/1", "--summary", "ready"], env)
run_cmd(["ready-to-merge", id1, "--summary", "approved"], env)
complete = run_cmd([
"complete",
id1,
"--result",
"merged",
"--pr",
"https://example.com/pr/1",
"--files",
"a.py,b.py",
], env).stdout
assert_contains(complete, f"Completed [{id1}]", "complete output")
h1 = run_cmd(["history", "--limit", "10"], env).stdout
assert_contains(h1, id1, "history contains completed item")
get1 = run_cmd(["get", id1], env).stdout
assert_field_value(get1, "Status", "COMPLETED", "get archived item")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if the removed commands still exist in worklog.py
echo "Searching for removed commands in worklog.py..."
fd -t f 'worklog.py' --exec grep -n 'review\|ready-to-merge\|request-changes\|NEEDS_REVIEW\|CHANGES_REQUESTED\|READY_TO_MERGE' {} \;

Repository: Coldaine/zo-workqueue

Length of output: 112


🏁 Script executed:

# Search for the documentation file mentioned in the review
find . -type f -name "*pr-removal*" -o -name "*2026-03*" | head -20

Repository: Coldaine/zo-workqueue

Length of output: 200


🏁 Script executed:

# Search more broadly for the removed commands/states in the entire codebase
rg "review|ready-to-merge|request-changes|NEEDS_REVIEW|CHANGES_REQUESTED|READY_TO_MERGE" --type py -i | head -50

Repository: Coldaine/zo-workqueue

Length of output: 1076


🏁 Script executed:

# Check what documentation exists about the state machine
find . -type f -name "*.md" | xargs grep -l "state machine\|TODO.*IN_PROGRESS\|workflow" 2>/dev/null | head -20

Repository: Coldaine/zo-workqueue

Length of output: 564


🏁 Script executed:

# Look at the test file structure to understand what CLI it's actually testing
head -100 tools/run_worklog_cli_integration_tests.py

Repository: Coldaine/zo-workqueue

Length of output: 3198


🏁 Script executed:

# Read the pr-removal documentation
cat ./docs/plans/2026-03-31-pr-removal.md

Repository: Coldaine/zo-workqueue

Length of output: 2633


🏁 Script executed:

# Read the WORKLOG_CLI documentation to understand the documented state machine
cat ./docs/WORKLOG_CLI.md

Repository: Coldaine/zo-workqueue

Length of output: 4562


🏁 Script executed:

# Find and examine the main worklog.py file
find . -type f -name "worklog.py" | head -5

Repository: Coldaine/zo-workqueue

Length of output: 47


🏁 Script executed:

# Search for command definitions in the actual worklog implementation
rg "def (review|ready_to_merge|request_changes|complete)" --type py -A 3 | head -100

Repository: Coldaine/zo-workqueue

Length of output: 47


🏁 Script executed:

# Find worklog.py file location
find . -name "worklog.py" -type f 2>/dev/null

Repository: Coldaine/zo-workqueue

Length of output: 47


🏁 Script executed:

# Get the size and first part of worklog.py to understand its structure
find . -name "worklog.py" -type f -exec wc -l {} \; -exec echo "---" \; -exec head -150 {} \;

Repository: Coldaine/zo-workqueue

Length of output: 47


🏁 Script executed:

# Search for argparse subparser definitions for commands
rg "add_parser|subparser" -A 2 Tools/worklog/worklog.py 2>/dev/null | head -100

Repository: Coldaine/zo-workqueue

Length of output: 47


🏁 Script executed:

# List files in Tools directory if it exists
ls -la Tools/ 2>/dev/null || echo "Tools directory does not exist"

Repository: Coldaine/zo-workqueue

Length of output: 95


🏁 Script executed:

# Search for any Python files with worklog in name anywhere
find . -type f -name "*worklog*" 2>/dev/null

Repository: Coldaine/zo-workqueue

Length of output: 164


🏁 Script executed:

# Check what the test expects - look at the full test file
wc -l tools/run_worklog_cli_integration_tests.py

Repository: Coldaine/zo-workqueue

Length of output: 111


🏁 Script executed:

# Get more context from the test file - see the commands it's testing
sed -n '140,220p' tools/run_worklog_cli_integration_tests.py

Repository: Coldaine/zo-workqueue

Length of output: 2947


Critical: Test uses removed PR-review workflow states and commands.

This test section (lines 149-166 and 168+) uses review, ready-to-merge, and request-changes commands with NEEDS_REVIEW, READY_TO_MERGE, and CHANGES_REQUESTED states. According to docs/plans/2026-03-31-pr-removal.md and docs/WORKLOG_CLI.md, these were explicitly removed:

  • Removed states: NEEDS_REVIEW, CHANGES_REQUESTED, READY_TO_MERGE
  • Removed commands: review, ready-to-merge, request-changes

The documented state machine is: TODO → IN_PROGRESS → BLOCKED / COMPLETED / CANCELLED.

Update the test to use the simplified workflow:

  1. addclaim (TODO → IN_PROGRESS)
  2. update --status BLOCKED (for blocked tasks)
  3. complete (IN_PROGRESS → COMPLETED)
  4. cancel (for cancelled tasks)

Also see docs/WORKLOG_CLI.md for the full command reference.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/run_worklog_cli_integration_tests.py` around lines 149 - 166, The test
uses removed PR-review commands and states (review, ready-to-merge,
request-changes and NEEDS_REVIEW / READY_TO_MERGE / CHANGES_REQUESTED); update
the sequence in the run_cmd calls to the simplified workflow: replace calls to
"review" and "ready-to-merge" (and any "request-changes") with the new sequence
using "add" -> "claim" to move TODO → IN_PROGRESS, use "update" with "--status
BLOCKED" to test blocking, use "complete" to mark IN_PROGRESS → COMPLETED, and
use "cancel" for cancellation paths; also update assertions that check Status
(via assert_field_value / assert_contains on get/history outputs) to expect
TODO, IN_PROGRESS, BLOCKED, COMPLETED, or CANCELLED as appropriate and remove
any references to the old state names or commands while keeping the existing
run_cmd/assert_* calls (e.g., run_cmd, assert_contains, assert_field_value,
id1).

Comment on lines +168 to +210
print("[4/9] CHANGES_REQUESTED priority over TODO")
add2 = run_cmd([
"add",
"--project",
"alpha",
"--task",
"Needs review changes",
"--priority",
"P1",
], env).stdout
id2 = extract_id(add2)
run_cmd(["claim"], env)
run_cmd(["review", id2, "--pr", "https://example.com/pr/2"], env)
run_cmd([
"request-changes",
id2,
"--summary",
"fix requested",
"--next",
"address comments",
], env)

add3 = run_cmd([
"add",
"--project",
"beta",
"--task",
"Plain TODO",
"--priority",
"P0",
], env).stdout
id3 = extract_id(add3)

db_path = home / ".zo" / "workqueue" / "worklog.db"
update_db(
db_path,
"UPDATE items SET lease_expires_at=datetime('now','-1 minute') WHERE id=?",
(id2,),
)

claim_priority = run_cmd(["claim"], env).stdout
assert_contains(claim_priority, id2, "claim should prioritize CHANGES_REQUESTED")
assert_field_value(claim_priority, "Status", "CHANGES_REQUESTED", "claim keeps review state")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Test validates removed CHANGES_REQUESTED state and request-changes command.

This section tests request-changes command and asserts CHANGES_REQUESTED status, which contradicts the PR-removal plan. The claim priority assertion at line 209-210 expects CHANGES_REQUESTED to be prioritized over TODO, but this state no longer exists in the documented system.

Per docs/WORKLOG_CLI.md claim policy (lines 62-66), the only claimable statuses are:

  1. Stale IN_PROGRESS (lease expired)
  2. TODO, sorted by priority ASC
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/run_worklog_cli_integration_tests.py` around lines 168 - 210, The test
is exercising the removed CHANGES_REQUESTED state and the request-changes
command; remove or replace those parts so the test follows the current claim
policy (only stale IN_PROGRESS and TODO are claimable). Specifically, stop
calling run_cmd(["review", id2, ...]) and run_cmd(["request-changes", ...]) and
instead create two TODO items (e.g., via run_cmd(["add", ...]) for alpha and
beta) with differing priorities, use update_db to make one stale IN_PROGRESS if
needed, then call claim and assert_contains that the higher-priority TODO id is
returned and assert_field_value checks "Status" is "TODO"; update references to
extract_id, run_cmd, update_db, claim, assert_contains, and assert_field_value
accordingly.

Comment on lines +253 to +258
if __name__ == "__main__":
try:
raise SystemExit(main())
except TestFailure as exc:
print(str(exc), file=sys.stderr)
raise SystemExit(1)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use raise ... from exc for proper exception chaining.

Per Python best practices, when re-raising an exception within an except block, use from exc or from None to preserve the exception chain.

Proposed fix
 if __name__ == "__main__":
     try:
         raise SystemExit(main())
     except TestFailure as exc:
         print(str(exc), file=sys.stderr)
-        raise SystemExit(1)
+        raise SystemExit(1) from None
🧰 Tools
🪛 Ruff (0.15.9)

[warning] 258-258: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/run_worklog_cli_integration_tests.py` around lines 253 - 258, The
except block currently re-raises a new SystemExit without chaining the caught
TestFailure (in the block handling TestFailure as exc); change the re-raise to
preserve exception chaining by raising the SystemExit from the caught exception
(i.e., use "raise SystemExit(1) from exc" when re-raising), leaving the initial
"raise SystemExit(main())" behavior intact and ensuring TestFailure is
referenced as exc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants