chore: consolidate docs and planner/worker instructions#2
chore: consolidate docs and planner/worker instructions#2
Conversation
📝 WalkthroughWalkthroughThis pull request redesigns the Zo Work Queue system from a PR-centric review workflow to a gap-based task model. The planner discovers implementation gaps and creates TODO tasks; workers claim and execute tasks, opening PRs as their delivery mechanism; neither planner nor worker manages PR review/merge states. Structured JSONL run records are now mandatory for both roles. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes The majority of changes are cohesive documentation rewrites (homogeneous in pattern) reflecting a single workflow redesign. However, the breadth spans multiple conceptual domains (planner logic, worker logic, architecture, CLI, test harness), and the SYSTEM_INTENT.md, PLANNER_INSTRUCTION.md changes introduce dense logic requiring careful reasoning. The new integration test harness adds executable complexity. Mixed scope and density drive a moderate effort. Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
Comment |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Consolidates and rewrites the planner/worker documentation to shift from PR-centric workflows to gap-based planning + worker-owned delivery, while removing deprecated planning docs and adding new run-log and test-harness documentation.
Changes:
- Rewrites planner/worker personas + instructions (gap-based planning, structured JSONL run records, PRs not treated as planning targets).
- Replaces/removes legacy planning docs and adds new run logs / plan writeups.
- Adds a Python integration harness and documents how to run it.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/run_worklog_cli_integration_tests.py | Adds an isolated CLI integration test harness for worklog state transitions. |
| prompts/WORKER_PERSONA.md | Updates worker role definition and “done” criteria under the new workflow. |
| prompts/WORKER_INSTRUCTION.md | Updates worker execution steps and adds mandatory structured run record output. |
| prompts/PLANNER_PERSONA.md | Updates planner boundaries to avoid PR-centric behavior. |
| prompts/PLANNER_INSTRUCTION.md | Rewrites planner process (exclusions, deep reading, gap-based tasks, run records). |
| docs/plans/2026-03-31-rewrite-planner.md | Adds a retrospective plan doc describing the planner rewrite. |
| docs/plans/2026-03-31-pr-removal.md | Adds a plan doc describing removal of PR-review workflow states. |
| docs/dev/proposedTests.md | Documents running the new integration harness before manual tests. |
| docs/WORKLOG_CLI.md | Rewrites CLI docs to match the new state machine and registry commands. |
| docs/PLANNING.md | Removes deprecated planning doc. |
| docs/PLANNER_RUNS.md | Adds a planner run log document. |
| docs/NORTH_STAR.md | Adds a north-star doc under docs/ (currently overlaps/inconsistently mirrors root NORTH_STAR.md). |
| docs/ARCHITECTURE.md | Updates architecture doc to remove PR-review workflow references. |
| SYSTEM_INTENT.md | Rewrites system intent around machine-readable run records and gap-based planning. |
| NORTH_STAR.md | Updates root north star to reflect non-PR-review workflow and structured observability. |
| AGENTS.md | Updates repo entrypoint doc and links to the consolidated doc set. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| print("[3/9] NEEDS_REVIEW → READY_TO_MERGE → COMPLETED") | ||
| run_cmd(["review", id1, "--pr", "https://example.com/pr/1", "--summary", "ready"], env) | ||
| run_cmd(["ready-to-merge", id1, "--summary", "approved"], env) | ||
| complete = run_cmd([ | ||
| "complete", | ||
| id1, | ||
| "--result", | ||
| "merged", | ||
| "--pr", | ||
| "https://example.com/pr/1", | ||
| "--files", | ||
| "a.py,b.py", | ||
| ], env).stdout |
There was a problem hiding this comment.
This harness hard-depends on PR-centric states/commands (review, ready-to-merge, request-changes, NEEDS_REVIEW, CHANGES_REQUESTED) that the updated docs explicitly remove (TODO/IN_PROGRESS/BLOCKED/COMPLETED/CANCELLED only). As written, the script will fail against the new CLI contract. Align the harness with the new state machine (e.g., exercise TODO→IN_PROGRESS→COMPLETED, BLOCKED, CANCELLED, stale-lease reclaim priority), or keep the old workflow docs—right now the PR introduces a direct contract mismatch.
| get1 = run_cmd(["get", id1], env).stdout | ||
| assert_field_value(get1, "Status", "COMPLETED", "get archived item") | ||
|
|
||
| print("[4/9] CHANGES_REQUESTED priority over TODO") |
There was a problem hiding this comment.
This harness hard-depends on PR-centric states/commands (review, ready-to-merge, request-changes, NEEDS_REVIEW, CHANGES_REQUESTED) that the updated docs explicitly remove (TODO/IN_PROGRESS/BLOCKED/COMPLETED/CANCELLED only). As written, the script will fail against the new CLI contract. Align the harness with the new state machine (e.g., exercise TODO→IN_PROGRESS→COMPLETED, BLOCKED, CANCELLED, stale-lease reclaim priority), or keep the old workflow docs—right now the PR introduces a direct contract mismatch.
| print("[4/9] CHANGES_REQUESTED priority over TODO") | ||
| add2 = run_cmd([ | ||
| "add", | ||
| "--project", | ||
| "alpha", | ||
| "--task", | ||
| "Needs review changes", | ||
| "--priority", | ||
| "P1", | ||
| ], env).stdout | ||
| id2 = extract_id(add2) | ||
| run_cmd(["claim"], env) | ||
| run_cmd(["review", id2, "--pr", "https://example.com/pr/2"], env) | ||
| run_cmd([ | ||
| "request-changes", | ||
| id2, | ||
| "--summary", | ||
| "fix requested", | ||
| "--next", | ||
| "address comments", | ||
| ], env) |
There was a problem hiding this comment.
This harness hard-depends on PR-centric states/commands (review, ready-to-merge, request-changes, NEEDS_REVIEW, CHANGES_REQUESTED) that the updated docs explicitly remove (TODO/IN_PROGRESS/BLOCKED/COMPLETED/CANCELLED only). As written, the script will fail against the new CLI contract. Align the harness with the new state machine (e.g., exercise TODO→IN_PROGRESS→COMPLETED, BLOCKED, CANCELLED, stale-lease reclaim priority), or keep the old workflow docs—right now the PR introduces a direct contract mismatch.
| print("[4/9] CHANGES_REQUESTED priority over TODO") | |
| add2 = run_cmd([ | |
| "add", | |
| "--project", | |
| "alpha", | |
| "--task", | |
| "Needs review changes", | |
| "--priority", | |
| "P1", | |
| ], env).stdout | |
| id2 = extract_id(add2) | |
| run_cmd(["claim"], env) | |
| run_cmd(["review", id2, "--pr", "https://example.com/pr/2"], env) | |
| run_cmd([ | |
| "request-changes", | |
| id2, | |
| "--summary", | |
| "fix requested", | |
| "--next", | |
| "address comments", | |
| ], env) | |
| print("[4/9] stale IN_PROGRESS lease priority over TODO") | |
| add2 = run_cmd([ | |
| "add", | |
| "--project", | |
| "alpha", | |
| "--task", | |
| "Stale claimed work", | |
| "--priority", | |
| "P1", | |
| ], env).stdout | |
| id2 = extract_id(add2) | |
| run_cmd(["claim"], env) | |
| get2 = run_cmd(["get", id2], env).stdout | |
| assert_field_value(get2, "Status", "IN_PROGRESS", "claimed item enters in-progress") |
| 5. Commit: `git add . && git commit -m "<concise summary>"` | ||
| 6. Push: `git push -u origin <branch>` | ||
| 7. Open a PR: `gh pr create --title "<title>" --body "<what changed, why, how validated>"` | ||
| 8. Update the task: `worklog.py update <id> --status COMPLETED --pr "<url>"` |
There was a problem hiding this comment.
The worker instruction says to mark completion via worklog.py update --status COMPLETED, but elsewhere in this PR (and in the integration harness) completion is performed via worklog.py complete ... (which typically also archives to history and captures result/files). If complete remains the canonical completion action, update the worker instruction to use it; if update --status COMPLETED is now intended, the CLI reference and integration harness should be changed accordingly.
| 8. Update the task: `worklog.py update <id> --status COMPLETED --pr "<url>"` | |
| 8. Complete the task: `worklog.py complete <id> --pr "<url>"` |
| python3 /home/workspace/Tools/worklog/worklog.py projects add \ | ||
| --slug family-fling \ | ||
| --path /home/workspace/GitHub/coldaine-github-repos/family-fling \ | ||
| --repo Coldaine/family-fling \ | ||
| --read-first AGENTS.md \ | ||
| --read-first PLANNING.md |
There was a problem hiding this comment.
PLANNING.md is removed in this PR (docs/PLANNING.md deleted), but the CLI docs still recommend registering it as a --read-first document. Update examples to reference files that still exist (e.g., docs/NORTH_STAR.md, NORTH_STAR.md, or project-specific docs/ entrypoints), otherwise new users will copy/paste a broken command.
| python3 /home/workspace/Tools/worklog/worklog.py projects update family-fling \ | ||
| --default-branch main \ | ||
| --read-first AGENTS.md \ | ||
| --read-first docs/PLANNING.md |
There was a problem hiding this comment.
This example refers to docs/PLANNING.md, which is deleted by this PR. Replace with the new canonical planning entrypoints (e.g., docs/NORTH_STAR.md / NORTH_STAR.md) or remove the reference.
| A structured system where AI agents can advance personal software projects across repeated runs: finding work, doing it, reviewing it, fixing it, merging it, and leaving a durable trail for the next run. | ||
|
|
||
| ## The Core Loop | ||
|
|
There was a problem hiding this comment.
This doc conflicts with the rewritten instructions elsewhere in the same PR: planner selection is now described as '6+ projects' (prompts/PLANNER_INSTRUCTION.md), and the worker no longer 'reviews/fixes/merges based on task state' under the new simplified state machine. Either update docs/NORTH_STAR.md to match the current design, or explicitly label it as legacy/historical to avoid having two contradictory north stars (this file vs root NORTH_STAR.md).
| Changes: | ||
| - Added Step 6: mandatory structured run record — append one JSON line to `/root/.zo/workqueue/runs/planner/YYYY-MM-DD.jsonl` | ||
| - Changed Step 4 from "scan gh pr list across 37 repos" to "deep-read 1-2 projects" |
There was a problem hiding this comment.
This plan doc says the new approach is to deep-read '1-2 projects', but the updated prompts/PLANNER_INSTRUCTION.md Step 4 requires advancing '6+ registered projects per run'. Update this plan doc to match what actually shipped, or clarify that the plan changed and point to the final instruction as source of truth.
| Changes: | |
| - Added Step 6: mandatory structured run record — append one JSON line to `/root/.zo/workqueue/runs/planner/YYYY-MM-DD.jsonl` | |
| - Changed Step 4 from "scan gh pr list across 37 repos" to "deep-read 1-2 projects" | |
| Planned changes in this rewrite: | |
| - Added Step 6: mandatory structured run record — append one JSON line to `/root/.zo/workqueue/runs/planner/YYYY-MM-DD.jsonl` | |
| - Replaced the old "scan gh pr list across 37 repos" approach with project-documentation reading and gap-based planning. This plan originally described that as "deep-read 1-2 projects", but the shipped `prompts/PLANNER_INSTRUCTION.md` later changed Step 4 again; the final instruction is the source of truth and now requires advancing 6+ registered projects per run. |
|
|
||
| ## 2026-03-27 09:20 UTC | ||
|
|
||
| ### Queue Reconcilliation |
There was a problem hiding this comment.
Correct spelling of 'Reconcilliation' to 'Reconciliation'.
| ### Queue Reconcilliation | |
| ### Queue Reconciliation |
|
|
||
| Claim order: | ||
| 1. Stale IN_PROGRESS (lease expired) | ||
| 2. TODO, sorted by priority ASC |
There was a problem hiding this comment.
“sorted by priority ASC” is ambiguous in a P0/P1/P2 system (and can be misread as lexicographic). Consider stating the explicit order (P0 first, then P1, then P2) and—if applicable—tie-breaking (e.g., oldest created first) to match the behavior described elsewhere in the instructions.
| 2. TODO, sorted by priority ASC | |
| 2. TODO, ordered by priority: P0 first, then P1, then P2 |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 38076b29fc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| print("[3/9] NEEDS_REVIEW → READY_TO_MERGE → COMPLETED") | ||
| run_cmd(["review", id1, "--pr", "https://example.com/pr/1", "--summary", "ready"], env) | ||
| run_cmd(["ready-to-merge", id1, "--summary", "approved"], env) |
There was a problem hiding this comment.
Align integration harness with current task-state CLI
The new integration harness still drives deprecated workflow commands (review, ready-to-merge, request-changes) and asserts CHANGES_REQUESTED, but this commit’s own docs define the active state machine as TODO → IN_PROGRESS → BLOCKED/COMPLETED/CANCELLED (see docs/WORKLOG_CLI.md state machine and claim policy). In any environment running the redesigned CLI, the harness will fail on the first review call, so the recommended quick integration check cannot validate the current system behavior.
Useful? React with 👍 / 👎.
| Path(__file__).resolve().parents[3] / "Tools" / "worklog" / "worklog.py", | ||
| Path("/home/workspace/Tools/worklog/worklog.py"), |
There was a problem hiding this comment.
Resolve worklog path from checkout layout robustly
Path resolution is brittle: parents[3] makes the first candidate /Tools/worklog/worklog.py when the repo is checked out at /workspace/zo-workqueue, and the only fallback is /home/workspace/Tools/worklog/worklog.py. That misses common /workspace/Tools/worklog/worklog.py layouts, causing the documented “Quick Run (Recommended)” command to exit before tests start unless WORKLOG_PATH is manually set.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 8
🧹 Nitpick comments (1)
docs/PLANNER_RUNS.md (1)
3-82: Add a clear “historical/pre-redesign” disclaimer to prevent policy drift.The run entries document PR-centric planner behavior. Since this file sits in active docs, add a short header clarifying these entries are historical snapshots from before/around the redesign and are not current planner policy.
Based on learnings: "System is gap-based only: planner finds gaps, worker builds and ships, PR is opened as the delivery mechanism - neither planner nor worker works on existing PRs".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/PLANNER_RUNS.md` around lines 3 - 82, Add a short header at the top of PLANNER_RUNS.md (before the first "## 2026-03-27 09:20 UTC" entry) that clearly labels the document as historical/pre-redesign snapshots and not current planner policy; include one sentence stating the system is "gap-based only: planner finds gaps, worker builds and ships, PR is opened as the delivery mechanism - neither planner nor worker works on existing PRs" to prevent policy drift and ensure readers treat these entries as archival.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/NORTH_STAR.md`:
- Line 35: Fix the article typo in the sentence "PRs that already exist are not
plans. A PR is a artifact of work that was already done." by changing "a
artifact" to "an artifact" so the sentence reads "A PR is an artifact of work
that was already done." Target the exact sentence in NORTH_STAR.md to update the
article.
- Around line 5-52: Update NORTH_STAR.md to remove wording that assigns
planner/worker responsibilities for reviewing, fixing, or reconciling existing
PRs and instead state the gap-based model: change references such as "Planner
reconciles drift", any bullets or sentences that say the planner/worker
"reviews/fixes/merges PRs", "PRs that already exist are not plans", and the
Goal/What Makes This Different sections to emphasize that the planner only finds
gaps and writes new tasks and the worker implements and opens a PR as the
delivery mechanism (but does not operate on existing PRs); keep notes about full
git workflow limited to workers opening normal PRs as delivery and the planner
recognizing merged work, remove instructions that describe working on or
reconciling open PRs, and ensure all examples and Goals reflect the gap-first
flow.
In `@docs/PLANNER_RUNS.md`:
- Line 5: Heading text "Queue Reconcilliation" contains a spelling error; change
the heading string "Queue Reconcilliation" to "Queue Reconciliation" wherever it
appears (e.g., the Markdown heading in PLANNER_RUNS.md) so the word
"Reconciliation" is spelled correctly.
In `@prompts/WORKER_INSTRUCTION.md`:
- Around line 35-45: Update the step that marks tasks completed to use the
worklog.py complete command instead of the update --status flow: replace the
existing step referencing `worklog.py update <id> --status COMPLETED --pr
"<url>"` with a step instructing `worklog.py complete <id> --result
"<description>" --pr "<url>" --files "<file-list>"`; ensure the change appears
in the WORKER_INSTRUCTION.md step list and references the `worklog.py complete`
syntax exactly as required by the CLI and tests.
In `@prompts/WORKER_PERSONA.md`:
- Around line 7-10: The document has contradictory completion semantics: one
place ("PR open + task COMPLETED") conflicts with another ("TODO to merged PR");
pick a single canonical completion model (recommend: "task is complete when PR
is merged") and update the phrasing wherever referenced in
WORKER_PERSONA.md—specifically replace the string "PR open + task COMPLETED" and
the phrase "TODO to merged PR" so they match the chosen model (search for those
exact phrases), and add a short cross-reference note to WORKER_INSTRUCTION.md to
ensure both persona and instruction files use the same "merged PR" completion
definition.
In `@tools/run_worklog_cli_integration_tests.py`:
- Around line 168-210: The test is exercising the removed CHANGES_REQUESTED
state and the request-changes command; remove or replace those parts so the test
follows the current claim policy (only stale IN_PROGRESS and TODO are
claimable). Specifically, stop calling run_cmd(["review", id2, ...]) and
run_cmd(["request-changes", ...]) and instead create two TODO items (e.g., via
run_cmd(["add", ...]) for alpha and beta) with differing priorities, use
update_db to make one stale IN_PROGRESS if needed, then call claim and
assert_contains that the higher-priority TODO id is returned and
assert_field_value checks "Status" is "TODO"; update references to extract_id,
run_cmd, update_db, claim, assert_contains, and assert_field_value accordingly.
- Around line 253-258: The except block currently re-raises a new SystemExit
without chaining the caught TestFailure (in the block handling TestFailure as
exc); change the re-raise to preserve exception chaining by raising the
SystemExit from the caught exception (i.e., use "raise SystemExit(1) from exc"
when re-raising), leaving the initial "raise SystemExit(main())" behavior intact
and ensuring TestFailure is referenced as exc.
- Around line 149-166: The test uses removed PR-review commands and states
(review, ready-to-merge, request-changes and NEEDS_REVIEW / READY_TO_MERGE /
CHANGES_REQUESTED); update the sequence in the run_cmd calls to the simplified
workflow: replace calls to "review" and "ready-to-merge" (and any
"request-changes") with the new sequence using "add" -> "claim" to move TODO →
IN_PROGRESS, use "update" with "--status BLOCKED" to test blocking, use
"complete" to mark IN_PROGRESS → COMPLETED, and use "cancel" for cancellation
paths; also update assertions that check Status (via assert_field_value /
assert_contains on get/history outputs) to expect TODO, IN_PROGRESS, BLOCKED,
COMPLETED, or CANCELLED as appropriate and remove any references to the old
state names or commands while keeping the existing run_cmd/assert_* calls (e.g.,
run_cmd, assert_contains, assert_field_value, id1).
---
Nitpick comments:
In `@docs/PLANNER_RUNS.md`:
- Around line 3-82: Add a short header at the top of PLANNER_RUNS.md (before the
first "## 2026-03-27 09:20 UTC" entry) that clearly labels the document as
historical/pre-redesign snapshots and not current planner policy; include one
sentence stating the system is "gap-based only: planner finds gaps, worker
builds and ships, PR is opened as the delivery mechanism - neither planner nor
worker works on existing PRs" to prevent policy drift and ensure readers treat
these entries as archival.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3a566dad-0110-4429-a3c2-f496ea6a73a8
📒 Files selected for processing (16)
AGENTS.mdNORTH_STAR.mdSYSTEM_INTENT.mddocs/ARCHITECTURE.mddocs/NORTH_STAR.mddocs/PLANNER_RUNS.mddocs/PLANNING.mddocs/WORKLOG_CLI.mddocs/dev/proposedTests.mddocs/plans/2026-03-31-pr-removal.mddocs/plans/2026-03-31-rewrite-planner.mdprompts/PLANNER_INSTRUCTION.mdprompts/PLANNER_PERSONA.mdprompts/WORKER_INSTRUCTION.mdprompts/WORKER_PERSONA.mdtools/run_worklog_cli_integration_tests.py
💤 Files with no reviewable changes (1)
- docs/PLANNING.md
| A structured system where AI agents can advance personal software projects across repeated runs: finding work, doing it, reviewing it, fixing it, merging it, and leaving a durable trail for the next run. | ||
|
|
||
| ## The Core Loop | ||
|
|
||
| ``` | ||
| Planner selects 1-2 projects | ||
| → reads NORTH_STAR, docs/, source code | ||
| → finds gaps: missing tests, incomplete features, undocumented behavior, tests that haven't been written | ||
| → writes a specific task with context and a step-by-step plan | ||
| → adds it to the queue | ||
|
|
||
| Worker claims one task | ||
| → builds, reviews, fixes, or merges based on task state | ||
| → leaves enough context for the next run to continue | ||
|
|
||
| Planner reconciles drift | ||
| → stale PRs, externally merged work, abandoned branches | ||
| → closes loops before opening new ones | ||
| ``` | ||
|
|
||
| ## What Makes This Different | ||
|
|
||
| This is not a PR-review machine. The planner's job is to find **new work** by reading code and documentation — not by scanning for open PRs. | ||
|
|
||
| The planner is a senior engineering manager who: | ||
| - Reads a project's north star and docs to understand what it's trying to be | ||
| - Reads the source to understand what it actually does | ||
| - Identifies the gap between those two | ||
| - Writes a specific, actionable task to close that gap | ||
|
|
||
| PRs that already exist are not plans. A PR is a artifact of work that was already done. The planner does not add tasks just because a PR is open. | ||
|
|
||
| ## Goals | ||
|
|
||
| 1. **Projects move forward without manual re-briefing.** Register a project once; the system keeps finding and closing meaningful work. | ||
| 2. **Continuity across stateless sessions.** Each agent run starts cold, but durable task and PR state carry the workflow forward. | ||
| 3. **Separation of planning and execution.** The planner decides what matters and keeps the queue healthy. Workers do build, review/fix, and merge work. | ||
| 4. **Structured tooling, not freeform writing.** Agents interact through a CLI against a durable local store. | ||
| 5. **Full git workflow.** Workers commit to branches, open normal PRs, apply valid review fixes, merge finished work. | ||
| 6. **Observable.** Every run emits a concise summary. | ||
|
|
||
| ## What Success Looks Like | ||
|
|
||
| - Planner reads a project's docs and source, finds a gap, writes a plan — not a PR-review task | ||
| - Worker executes the plan and opens a real PR | ||
| - Later worker reviews, requests fixes, merges | ||
| - Next planner run sees the merged work and moves on to the next gap | ||
| - The user observes projects advancing, not the same PRs being re-reviewed |
There was a problem hiding this comment.
Align this document with the gap-based model (remove PR-review/merge responsibilities).
This file currently instructs planner/worker behavior around reviewing/fixing/merging PRs and PR-driven reconciliation (for example, Line 5, Line 17, Line 21, Line 41, Line 43, Line 50). That conflicts with the current model where planner finds gaps and workers deliver new work, without operating existing PRs.
Suggested doc fix (example)
-A structured system where AI agents can advance personal software projects across repeated runs: finding work, doing it, reviewing it, fixing it, merging it, and leaving a durable trail for the next run.
+A structured system where AI agents can advance personal software projects across repeated runs: finding work, building it, validating it, and shipping it while leaving a durable trail for the next run.
...
- → builds, reviews, fixes, or merges based on task state
+ → builds and validates the scoped task, opens a PR, and marks it completed
...
- → stale PRs, externally merged work, abandoned branches
- → closes loops before opening new ones
+ → stale task leases, blocked tasks, and cancelled/outdated task intent
+ → keeps queue state accurate before opening new task loops
...
-3. **Separation of planning and execution.** The planner decides what matters and keeps the queue healthy. Workers do build, review/fix, and merge work.
+3. **Separation of planning and execution.** The planner decides what matters and keeps the queue healthy. Workers build, validate, and ship.
...
-5. **Full git workflow.** Workers commit to branches, open normal PRs, apply valid review fixes, merge finished work.
+5. **Delivery via git workflow.** Workers commit to branches, open PRs for delivered work, and mark tasks completed.Based on learnings: "System is gap-based only: planner finds gaps, worker builds and ships, PR is opened as the delivery mechanism - neither planner nor worker works on existing PRs".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/NORTH_STAR.md` around lines 5 - 52, Update NORTH_STAR.md to remove
wording that assigns planner/worker responsibilities for reviewing, fixing, or
reconciling existing PRs and instead state the gap-based model: change
references such as "Planner reconciles drift", any bullets or sentences that say
the planner/worker "reviews/fixes/merges PRs", "PRs that already exist are not
plans", and the Goal/What Makes This Different sections to emphasize that the
planner only finds gaps and writes new tasks and the worker implements and opens
a PR as the delivery mechanism (but does not operate on existing PRs); keep
notes about full git workflow limited to workers opening normal PRs as delivery
and the planner recognizing merged work, remove instructions that describe
working on or reconciling open PRs, and ensure all examples and Goals reflect
the gap-first flow.
| - Identifies the gap between those two | ||
| - Writes a specific, actionable task to close that gap | ||
|
|
||
| PRs that already exist are not plans. A PR is a artifact of work that was already done. The planner does not add tasks just because a PR is open. |
There was a problem hiding this comment.
Fix article typo at Line 35.
“a artifact” should be “an artifact”.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/NORTH_STAR.md` at line 35, Fix the article typo in the sentence "PRs
that already exist are not plans. A PR is a artifact of work that was already
done." by changing "a artifact" to "an artifact" so the sentence reads "A PR is
an artifact of work that was already done." Target the exact sentence in
NORTH_STAR.md to update the article.
|
|
||
| ## 2026-03-27 09:20 UTC | ||
|
|
||
| ### Queue Reconcilliation |
There was a problem hiding this comment.
Correct spelling at Line 5.
“Queue Reconcilliation” → “Queue Reconciliation”.
🧰 Tools
🪛 LanguageTool
[grammar] ~5-~5: Ensure spelling is correct
Context: ...Log ## 2026-03-27 09:20 UTC ### Queue Reconcilliation - zo-workqueue w-77f6: PR #1 merg...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/PLANNER_RUNS.md` at line 5, Heading text "Queue Reconcilliation"
contains a spelling error; change the heading string "Queue Reconcilliation" to
"Queue Reconciliation" wherever it appears (e.g., the Markdown heading in
PLANNER_RUNS.md) so the word "Reconciliation" is spelled correctly.
| ## Step 3: Execute | ||
|
|
||
| 1. Read the project's `AGENTS.md` if it exists | ||
| 1. Create a feature branch: `git checkout -b worker/<project>/<task-id>-<short-desc>` | ||
| 2. Read the files mentioned in the task's context field | ||
| 3. If a branch exists — check `git log --oneline -5` on it | ||
| 4. If a PR exists — run `gh pr view <url>` to see its current state | ||
| 5. If the task is `READY_TO_MERGE` — verify the PR is still mergeable | ||
|
|
||
| ## Step 4: Execute | ||
|
|
||
| ### Building (`TODO` or stale `IN_PROGRESS`) | ||
|
|
||
| 1. Create a feature branch if one doesn't exist: `git checkout -b worker/<project>/<task-id>-<short-desc>` | ||
| 2. Make the change | ||
| 3. Validate — run tests, type checks, build, or behavioral verification | ||
| 4. Commit: `git add . && git commit -m "<task summary>"` | ||
| 5. Push: `git push -u origin <branch>` | ||
| 6. Open a PR: `gh pr create --title "<title>" --body "<what changed, why, how validated>"` | ||
| 7. Update the task: `worklog.py update <id> --status NEEDS_REVIEW --pr "<url>"` | ||
|
|
||
| ### Reviewing or fixing (`NEEDS_REVIEW`) | ||
|
|
||
| 1. Read the PR diff: `gh pr diff <url>` | ||
| 2. Read the review comments if any | ||
| 3. Either implement valid fixes on the same branch, or assess whether the feedback is actionable | ||
| 4. If you fix: commit, push, and either move to `READY_TO_MERGE` or back to `NEEDS_REVIEW` | ||
| 5. Record your review findings in the task notes | ||
|
|
||
| ### Following up (`CHANGES_REQUESTED`) | ||
|
|
||
| 1. Same branch, same PR | ||
| 2. Implement the requested changes | ||
| 3. Validate | ||
| 4. Push | ||
| 5. Either move to `READY_TO_MERGE` or back to `NEEDS_REVIEW` | ||
|
|
||
| ### Merging (`READY_TO_MERGE`) | ||
| 3. Make the changes as described in the plan | ||
| 4. Validate — run tests, type checks, build, or behavioral verification | ||
| 5. Commit: `git add . && git commit -m "<concise summary>"` | ||
| 6. Push: `git push -u origin <branch>` | ||
| 7. Open a PR: `gh pr create --title "<title>" --body "<what changed, why, how validated>"` | ||
| 8. Update the task: `worklog.py update <id> --status COMPLETED --pr "<url>"` | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if worklog.py supports both update --status COMPLETED and complete commands
fd -t f 'worklog.py' --exec grep -n -A5 'def.*complete\|--status.*COMPLETED\|add_parser.*complete' {} \;Repository: Coldaine/zo-workqueue
Length of output: 47
🏁 Script executed:
# First, locate worklog.py in the repository
find . -name 'worklog.py' -type fRepository: Coldaine/zo-workqueue
Length of output: 47
🏁 Script executed:
# Also check for WORKLOG_CLI.md documentation
find . -name 'WORKLOG_CLI.md' -type fRepository: Coldaine/zo-workqueue
Length of output: 86
🏁 Script executed:
# Search for any python files that might contain worklog-related code
fd -t f '\.py$' | head -20Repository: Coldaine/zo-workqueue
Length of output: 107
🏁 Script executed:
# Read the documentation file to see what commands are documented
cat -n docs/WORKLOG_CLI.mdRepository: Coldaine/zo-workqueue
Length of output: 5843
🏁 Script executed:
# Search for worklog-related Python files more broadly
find . -type f -name '*.py' | xargs grep -l 'worklog\|complete.*command' | head -20Repository: Coldaine/zo-workqueue
Length of output: 109
🏁 Script executed:
# Find and read the actual worklog.py implementation
cat -n Tools/worklog/worklog.pyRepository: Coldaine/zo-workqueue
Length of output: 121
🏁 Script executed:
# Search for worklog.py with correct case sensitivity
fd -t f 'worklog\.py'Repository: Coldaine/zo-workqueue
Length of output: 47
🏁 Script executed:
# Check the tools directory structure
ls -la tools/ | head -30Repository: Coldaine/zo-workqueue
Length of output: 262
🏁 Script executed:
# Read the integration test to see how worklog CLI is invoked
cat -n tools/run_worklog_cli_integration_tests.pyRepository: Coldaine/zo-workqueue
Length of output: 10627
🏁 Script executed:
# Search for any references to "complete" or "update --status" in the codebase
rg 'complete|update.*--status' --type markdown --type py | head -40Repository: Coldaine/zo-workqueue
Length of output: 3574
Update line 44 to use the complete command instead of update --status COMPLETED.
The CLI documentation (docs/WORKLOG_CLI.md) explicitly distinguishes worklog.py complete as the command for marking tasks as COMPLETED, with required fields --result, --pr, and --files. The integration test also uses complete, not update --status COMPLETED. Line 44 should be:
8. Complete the task: `worklog.py complete <id> --result "<description>" --pr "<url>" --files "<file-list>"`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@prompts/WORKER_INSTRUCTION.md` around lines 35 - 45, Update the step that
marks tasks completed to use the worklog.py complete command instead of the
update --status flow: replace the existing step referencing `worklog.py update
<id> --status COMPLETED --pr "<url>"` with a step instructing `worklog.py
complete <id> --result "<description>" --pr "<url>" --files "<file-list>"`;
ensure the change appears in the WORKER_INSTRUCTION.md step list and references
the `worklog.py complete` syntax exactly as required by the CLI and tests.
| You execute TODO tasks from the queue. You build, validate, commit, push, and open a PR — then mark the task COMPLETED. | ||
|
|
||
| Your job is to move a task from one state to the next, or from open to merged, without leaving things half-done. | ||
| Your job is to own the task through to delivery: from TODO to merged PR, without leaving things half-done. | ||
|
|
There was a problem hiding this comment.
Resolve contradictory completion semantics (“open PR” vs “merged PR”).
Line 7/Line 37 defines done as “PR open + task COMPLETED,” but Line 9 says “TODO to merged PR.” Keep one model to avoid worker ambiguity.
Suggested wording fix
-Your job is to own the task through to delivery: from TODO to merged PR, without leaving things half-done.
+Your job is to own the task through to delivery: from TODO to an opened PR with the task marked COMPLETED, without leaving things half-done.As per coding guidelines: "Worker persona and instructions are defined in prompts/WORKER_PERSONA.md and prompts/WORKER_INSTRUCTION.md - reference these for worker agent behavior".
🧰 Tools
🪛 LanguageTool
[style] ~7-~7: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... You execute TODO tasks from the queue. You build, validate, commit, push, and open...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@prompts/WORKER_PERSONA.md` around lines 7 - 10, The document has
contradictory completion semantics: one place ("PR open + task COMPLETED")
conflicts with another ("TODO to merged PR"); pick a single canonical completion
model (recommend: "task is complete when PR is merged") and update the phrasing
wherever referenced in WORKER_PERSONA.md—specifically replace the string "PR
open + task COMPLETED" and the phrase "TODO to merged PR" so they match the
chosen model (search for those exact phrases), and add a short cross-reference
note to WORKER_INSTRUCTION.md to ensure both persona and instruction files use
the same "merged PR" completion definition.
| print("[3/9] NEEDS_REVIEW → READY_TO_MERGE → COMPLETED") | ||
| run_cmd(["review", id1, "--pr", "https://example.com/pr/1", "--summary", "ready"], env) | ||
| run_cmd(["ready-to-merge", id1, "--summary", "approved"], env) | ||
| complete = run_cmd([ | ||
| "complete", | ||
| id1, | ||
| "--result", | ||
| "merged", | ||
| "--pr", | ||
| "https://example.com/pr/1", | ||
| "--files", | ||
| "a.py,b.py", | ||
| ], env).stdout | ||
| assert_contains(complete, f"Completed [{id1}]", "complete output") | ||
| h1 = run_cmd(["history", "--limit", "10"], env).stdout | ||
| assert_contains(h1, id1, "history contains completed item") | ||
| get1 = run_cmd(["get", id1], env).stdout | ||
| assert_field_value(get1, "Status", "COMPLETED", "get archived item") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if the removed commands still exist in worklog.py
echo "Searching for removed commands in worklog.py..."
fd -t f 'worklog.py' --exec grep -n 'review\|ready-to-merge\|request-changes\|NEEDS_REVIEW\|CHANGES_REQUESTED\|READY_TO_MERGE' {} \;Repository: Coldaine/zo-workqueue
Length of output: 112
🏁 Script executed:
# Search for the documentation file mentioned in the review
find . -type f -name "*pr-removal*" -o -name "*2026-03*" | head -20Repository: Coldaine/zo-workqueue
Length of output: 200
🏁 Script executed:
# Search more broadly for the removed commands/states in the entire codebase
rg "review|ready-to-merge|request-changes|NEEDS_REVIEW|CHANGES_REQUESTED|READY_TO_MERGE" --type py -i | head -50Repository: Coldaine/zo-workqueue
Length of output: 1076
🏁 Script executed:
# Check what documentation exists about the state machine
find . -type f -name "*.md" | xargs grep -l "state machine\|TODO.*IN_PROGRESS\|workflow" 2>/dev/null | head -20Repository: Coldaine/zo-workqueue
Length of output: 564
🏁 Script executed:
# Look at the test file structure to understand what CLI it's actually testing
head -100 tools/run_worklog_cli_integration_tests.pyRepository: Coldaine/zo-workqueue
Length of output: 3198
🏁 Script executed:
# Read the pr-removal documentation
cat ./docs/plans/2026-03-31-pr-removal.mdRepository: Coldaine/zo-workqueue
Length of output: 2633
🏁 Script executed:
# Read the WORKLOG_CLI documentation to understand the documented state machine
cat ./docs/WORKLOG_CLI.mdRepository: Coldaine/zo-workqueue
Length of output: 4562
🏁 Script executed:
# Find and examine the main worklog.py file
find . -type f -name "worklog.py" | head -5Repository: Coldaine/zo-workqueue
Length of output: 47
🏁 Script executed:
# Search for command definitions in the actual worklog implementation
rg "def (review|ready_to_merge|request_changes|complete)" --type py -A 3 | head -100Repository: Coldaine/zo-workqueue
Length of output: 47
🏁 Script executed:
# Find worklog.py file location
find . -name "worklog.py" -type f 2>/dev/nullRepository: Coldaine/zo-workqueue
Length of output: 47
🏁 Script executed:
# Get the size and first part of worklog.py to understand its structure
find . -name "worklog.py" -type f -exec wc -l {} \; -exec echo "---" \; -exec head -150 {} \;Repository: Coldaine/zo-workqueue
Length of output: 47
🏁 Script executed:
# Search for argparse subparser definitions for commands
rg "add_parser|subparser" -A 2 Tools/worklog/worklog.py 2>/dev/null | head -100Repository: Coldaine/zo-workqueue
Length of output: 47
🏁 Script executed:
# List files in Tools directory if it exists
ls -la Tools/ 2>/dev/null || echo "Tools directory does not exist"Repository: Coldaine/zo-workqueue
Length of output: 95
🏁 Script executed:
# Search for any Python files with worklog in name anywhere
find . -type f -name "*worklog*" 2>/dev/nullRepository: Coldaine/zo-workqueue
Length of output: 164
🏁 Script executed:
# Check what the test expects - look at the full test file
wc -l tools/run_worklog_cli_integration_tests.pyRepository: Coldaine/zo-workqueue
Length of output: 111
🏁 Script executed:
# Get more context from the test file - see the commands it's testing
sed -n '140,220p' tools/run_worklog_cli_integration_tests.pyRepository: Coldaine/zo-workqueue
Length of output: 2947
Critical: Test uses removed PR-review workflow states and commands.
This test section (lines 149-166 and 168+) uses review, ready-to-merge, and request-changes commands with NEEDS_REVIEW, READY_TO_MERGE, and CHANGES_REQUESTED states. According to docs/plans/2026-03-31-pr-removal.md and docs/WORKLOG_CLI.md, these were explicitly removed:
- Removed states:
NEEDS_REVIEW,CHANGES_REQUESTED,READY_TO_MERGE - Removed commands:
review,ready-to-merge,request-changes
The documented state machine is: TODO → IN_PROGRESS → BLOCKED / COMPLETED / CANCELLED.
Update the test to use the simplified workflow:
add→claim(TODO → IN_PROGRESS)update --status BLOCKED(for blocked tasks)complete(IN_PROGRESS → COMPLETED)cancel(for cancelled tasks)
Also see docs/WORKLOG_CLI.md for the full command reference.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/run_worklog_cli_integration_tests.py` around lines 149 - 166, The test
uses removed PR-review commands and states (review, ready-to-merge,
request-changes and NEEDS_REVIEW / READY_TO_MERGE / CHANGES_REQUESTED); update
the sequence in the run_cmd calls to the simplified workflow: replace calls to
"review" and "ready-to-merge" (and any "request-changes") with the new sequence
using "add" -> "claim" to move TODO → IN_PROGRESS, use "update" with "--status
BLOCKED" to test blocking, use "complete" to mark IN_PROGRESS → COMPLETED, and
use "cancel" for cancellation paths; also update assertions that check Status
(via assert_field_value / assert_contains on get/history outputs) to expect
TODO, IN_PROGRESS, BLOCKED, COMPLETED, or CANCELLED as appropriate and remove
any references to the old state names or commands while keeping the existing
run_cmd/assert_* calls (e.g., run_cmd, assert_contains, assert_field_value,
id1).
| print("[4/9] CHANGES_REQUESTED priority over TODO") | ||
| add2 = run_cmd([ | ||
| "add", | ||
| "--project", | ||
| "alpha", | ||
| "--task", | ||
| "Needs review changes", | ||
| "--priority", | ||
| "P1", | ||
| ], env).stdout | ||
| id2 = extract_id(add2) | ||
| run_cmd(["claim"], env) | ||
| run_cmd(["review", id2, "--pr", "https://example.com/pr/2"], env) | ||
| run_cmd([ | ||
| "request-changes", | ||
| id2, | ||
| "--summary", | ||
| "fix requested", | ||
| "--next", | ||
| "address comments", | ||
| ], env) | ||
|
|
||
| add3 = run_cmd([ | ||
| "add", | ||
| "--project", | ||
| "beta", | ||
| "--task", | ||
| "Plain TODO", | ||
| "--priority", | ||
| "P0", | ||
| ], env).stdout | ||
| id3 = extract_id(add3) | ||
|
|
||
| db_path = home / ".zo" / "workqueue" / "worklog.db" | ||
| update_db( | ||
| db_path, | ||
| "UPDATE items SET lease_expires_at=datetime('now','-1 minute') WHERE id=?", | ||
| (id2,), | ||
| ) | ||
|
|
||
| claim_priority = run_cmd(["claim"], env).stdout | ||
| assert_contains(claim_priority, id2, "claim should prioritize CHANGES_REQUESTED") | ||
| assert_field_value(claim_priority, "Status", "CHANGES_REQUESTED", "claim keeps review state") |
There was a problem hiding this comment.
Critical: Test validates removed CHANGES_REQUESTED state and request-changes command.
This section tests request-changes command and asserts CHANGES_REQUESTED status, which contradicts the PR-removal plan. The claim priority assertion at line 209-210 expects CHANGES_REQUESTED to be prioritized over TODO, but this state no longer exists in the documented system.
Per docs/WORKLOG_CLI.md claim policy (lines 62-66), the only claimable statuses are:
- Stale
IN_PROGRESS(lease expired) TODO, sorted by priority ASC
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/run_worklog_cli_integration_tests.py` around lines 168 - 210, The test
is exercising the removed CHANGES_REQUESTED state and the request-changes
command; remove or replace those parts so the test follows the current claim
policy (only stale IN_PROGRESS and TODO are claimable). Specifically, stop
calling run_cmd(["review", id2, ...]) and run_cmd(["request-changes", ...]) and
instead create two TODO items (e.g., via run_cmd(["add", ...]) for alpha and
beta) with differing priorities, use update_db to make one stale IN_PROGRESS if
needed, then call claim and assert_contains that the higher-priority TODO id is
returned and assert_field_value checks "Status" is "TODO"; update references to
extract_id, run_cmd, update_db, claim, assert_contains, and assert_field_value
accordingly.
| if __name__ == "__main__": | ||
| try: | ||
| raise SystemExit(main()) | ||
| except TestFailure as exc: | ||
| print(str(exc), file=sys.stderr) | ||
| raise SystemExit(1) |
There was a problem hiding this comment.
Use raise ... from exc for proper exception chaining.
Per Python best practices, when re-raising an exception within an except block, use from exc or from None to preserve the exception chain.
Proposed fix
if __name__ == "__main__":
try:
raise SystemExit(main())
except TestFailure as exc:
print(str(exc), file=sys.stderr)
- raise SystemExit(1)
+ raise SystemExit(1) from None🧰 Tools
🪛 Ruff (0.15.9)
[warning] 258-258: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/run_worklog_cli_integration_tests.py` around lines 253 - 258, The
except block currently re-raises a new SystemExit without chaining the caught
TestFailure (in the block handling TestFailure as exc); change the re-raise to
preserve exception chaining by raising the SystemExit from the caught exception
(i.e., use "raise SystemExit(1) from exc" when re-raising), leaving the initial
"raise SystemExit(main())" behavior intact and ensuring TestFailure is
referenced as exc.
Consolidates documentation, planner/worker instructions, and removes deprecated PLANNING.md. Adds NORTH_STAR.md and PLANNER_RUNS.md docs.
Summary by CodeRabbit
Release Notes
Documentation
Tests