feat(state+rules): rule 22 (consumer-first + RED-first TDD) + parentTaskId fan-out cost rollup + 09-byterover-cli comparison #27
…(09-byterover-cli B3)
Optional parentTaskId on agent_invoked/agent_completed correlates per-call
cost rows back to the parent orchestrator step (REVIEW panel run, REVIEW
debate fire path). Closes the M14/M15 telemetry gap described in
docs/comparison/09-byterover-cli/SYNTHESIS.md and CODEX_RESPONSE F3.
Schema-compatible: existing readers parse new events identically (M10/M12/M13
forward-compat precedent); validator enforces canonical T-NNN pattern when
present; FIFO-by-phase budget pairing untouched. Telemetry-only — no rule-21
parallel-provider surface, no new authority boundary.
- src/providers/types.ts: ProviderRequest.parentTaskId (optional)
- src/state/schemas.ts: agent_invoked + agent_completed variants extended
- src/state/events.ts: validator (T-NNN regex when present)
- src/providers/invoke.ts: wrapper writes through on both events
- src/providers/cost.ts: summarizeByParentTask() — separate report,
not folded into summarizeBudgetUse to keep budget pairing simple
- src/tools/debate-request.ts: DebateRequestInput.parentTaskId, threaded
onto opposing + synthesis turns
- src/phases/review.ts: two requestDebate sites set parentTaskId from opts.taskId
- src/phases/review-panel.ts: PanelistInvoker takes optional ctx arg;
runReviewPanel passes { parentTaskId: opts.upstreamRefs.taskId }
- src/cli/production-seams.ts: productionPanelistInvoker reads ctx.parentTaskId
and stamps it on the constituent ProviderRequest
Codex pre-design memo: docs/comparison/09-byterover-cli/CODEX_PREDESIGN_B3.md
(thread 019e1318, verdict revise-and-implement; corrections applied).
Tests: +20 passing (3108 → 3128).
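The rollup described in the file list above can be pictured with a minimal sketch. Everything here is an assumption for illustration: the event shape, the `costUsd` field name, and the helper name `rollupByParentTask` are hypothetical and are not the actual `summarizeByParentTask()` in src/providers/cost.ts. The sketch relies only on the point stated above, that the wrapper writes `parentTaskId` onto both events, so completed events can be grouped directly without touching FIFO-by-phase pairing.

```typescript
// Hypothetical, simplified event shape. The real schema lives in src/state/schemas.ts.
interface AgentCompletedSketch {
  type: "agent_completed";
  phase: string;
  costUsd: number; // assumed field name for the per-call cost
  parentTaskId?: string; // optional correlation id in T-NNN form
}

interface ParentTaskRollup {
  parentTaskId: string;
  calls: number;
  totalCostUsd: number;
}

// Groups completed calls by parentTaskId. Calls without a parent are skipped,
// so this report stays separate from the budget summary.
function rollupByParentTask(events: AgentCompletedSketch[]): ParentTaskRollup[] {
  const byParent = new Map<string, ParentTaskRollup>();
  for (const ev of events) {
    if (ev.parentTaskId === undefined) continue;
    const row =
      byParent.get(ev.parentTaskId) ??
      { parentTaskId: ev.parentTaskId, calls: 0, totalCostUsd: 0 };
    row.calls += 1;
    row.totalCostUsd += ev.costUsd;
    byParent.set(ev.parentTaskId, row);
  }
  return Array.from(byParent.values());
}

// Two panelist calls under T-001, one debate turn under T-002, one uncorrelated call.
const report = rollupByParentTask([
  { type: "agent_completed", phase: "REVIEW", costUsd: 0.25, parentTaskId: "T-001" },
  { type: "agent_completed", phase: "REVIEW", costUsd: 0.5, parentTaskId: "T-001" },
  { type: "agent_completed", phase: "REVIEW", costUsd: 0.125, parentTaskId: "T-002" },
  { type: "agent_completed", phase: "BUILD", costUsd: 1 },
]);
console.log(report);
```

Keeping the grouping in its own function mirrors the stated design choice of not folding the rollup into the budget summary: the uncorrelated call above simply does not appear in this report.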
…atus; influence library
Adopts 09-byterover-cli borrows B1 (Outside-In feature design) and B4 (strict TDD ordering for behavior changes) as a single consolidated non-negotiable rule per Codex fix-first F5 (avoid rule-list bloat from 21 → 23). The detailed RED-first sequence lives in src/agents/defaults/builder.md; rule 22 is the structural non-negotiable.
Bundles the agent-skills round 2 carry-over: refresh stale Status line from v0.13.0-alpha.0 / 1983 tests / PE-1 to v0.17.0-alpha.0 / 3128 tests (M16 closed; M13/M14/M15/PE-1 closed).
Adds byterover-cli row to the influence library (consumer-first design + RED-first TDD; parentTaskId fan-out cost rollup). Adds docs/comparison/ pointer to "Where decisions live."
See docs/comparison/09-byterover-cli/SYNTHESIS.md commit-2 plan. Codex thread 019e12ec (fix-first → all 8 findings closed in synthesis).
Adds executable detail for the strict-TDD ordering non-negotiable just landed in CLAUDE.md rule 22(b). Five steps: write the failing test first, run it to confirm it fails for the right reason, write the minimal implementation, run it to confirm green, refactor only if green stays green. Bug-fix tasks must name the reproduction test in `## Notes`. Bundles M8 mutation-gated discipline cross-reference so behavior changes in mutation-tested code automatically inherit RED-first; rule 22(b) covers the prompt-level intent for the rest. Closes 09-byterover-cli SYNTHESIS commit-3 plan; closes the agent-skills round-2 carry-over for builder validation language.
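The five steps above can be walked through in a small sketch. It is only an illustration under stated assumptions: `failureReason` and `sumCostUsd` are hypothetical names that do not come from builder.md, and a real task would use the project's test runner rather than this hand-rolled check.

```typescript
// Hypothetical helper: runs a check and reports why it failed (null means it passed).
function failureReason(check: () => void): string | null {
  try {
    check();
    return null;
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}

// Hypothetical unit under test. It starts as a stub so the test can run
// and fail on behavior rather than on a missing symbol.
let sumCostUsd: (costs: number[]) => number = () => 0;

// Step 1: write the failing test first.
const sumsAllRows = (): void => {
  const got = sumCostUsd([0.25, 0.5]);
  if (got !== 0.75) throw new Error(`expected 0.75, got ${got}`);
};

// Step 2: run it and confirm it fails for the right reason (wrong value, not a crash).
const red = failureReason(sumsAllRows);

// Step 3: write the minimal implementation.
sumCostUsd = (costs) => costs.reduce((total, cost) => total + cost, 0);

// Step 4: run it again and confirm green.
const green = failureReason(sumsAllRows);

// Step 5: refactor only while green stays green (re-run the check after each change).
console.log({ red, green });
```

Capturing the failure message in step 2 is what distinguishes "fails for the right reason" from a test that is red because of a typo or a missing import.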
Self-contained comparison artifacts for the byterover-cli template (memory-layer CLI vs SDLC-runtime category). Records:
- COMPARISON.md: 21-row feature matrix; 6 borrows priced in rule-20 sub-surfaces; 10 explicit rejects with reasons.
- CODEX_BRIEFING.md: debate brief; locked answers (do not relitigate); recommended landing plan; 8 specific debate prompts.
- CODEX_RESPONSE.md: Codex fix-first verdict, thread 019e12ec; 8 findings (1 block-push: B2 invented `code-oz consult` surface; 2 block-next-milestone: B2 under-priced + B3 hotfix-not-followup; 2 fix-soon; 3 fyi).
- CODEX_PREDESIGN_B3.md: Codex pre-implementation design memo, thread 019e1318; revise-and-implement; corrected file map for the patch.
- SYNTHESIS.md: closed; 7 decision points resolved by Claude under Ozzy's autonomy grant; 4-commit landing plan shipped on this branch.
Verdict: code-oz exceeds byterover-cli, scoped to SDLC discipline mechanics. byterover ships more product-mature memory-layer engineering (daemon, REPL, web UI, MCP, 21 providers, public benchmarks); code-oz operates in a different category and structurally exceeds on 12 discipline authorities.
Three borrows earn their place at v0.17:
- B3 pre-M17 telemetry hotfix (Commit 1)
- Rule 22 consolidating B1+B4 (Commit 2)
- Builder RED-first detail referencing rule 22(b) (Commit 3)
Three deferred (B2 reframed against tool_use.repo_context as M17/M18 contender; B5 AsyncLocalStorage pattern; B6 ESLint boundary). One reject reclassified (R10 defer-with-high-bar after SHIP).
This PR is self-contained and does not modify docs/comparison/README.md to avoid colliding with parallel template-comparison sessions; the README index entry will land via a separate sync commit on main.
…AP conflicts
- CLAUDE.md: combined HEAD's M16/byterover-cli status line with main's 3108 baseline + M16 R0/R1/R2 closure language; kept rule 1 intervention-writer authority expansion, rule 16 persona-generation paragraph (from PR #20 mimir), rule 22 (this branch). Decisions-live list keeps both bullets.
- docs/comparison/README.md: kept all existing 01/02/07 session rows; added 09 | byterover-cli row; removed byterover-cli from the Unaudited backlog.
- tests/build-prompt-composer.test.ts: raised builder.md body upper-bound cap from 6000 to 7000 (rule 22(b) RED-first detail + rule-9 enforcement layer additions are intentional, not regressions).
3198 tests pass, 2 skip (live xAI gated), 0 fail.
Code Review
This PR introduces parentTaskId threading for cost attribution across reviewer panels and debates, alongside new project rules for consumer-first design and RED-first TDD. Feedback recommends renaming a parameter in production-seams.ts to prevent confusion with the InvokeContext type.
| opts: ProductionPanelistInvokerOptions, | ||
| ): import('../phases/review-panel.ts').PanelistInvoker { | ||
| return async (cfg, round) => { | ||
| return async (cfg, round, invokeCtx) => { |
The third parameter is named invokeCtx, which shadows the conceptual meaning of InvokeContext used in opts.invokeCtx. In this codebase, InvokeContext is a specific type containing the registry, config, and other runtime dependencies. Naming this parameter ctx or panelCtx would avoid confusion with the orchestrator's invocation context.
| return async (cfg, round, invokeCtx) => { | |
| return async (cfg, round, ctx) => { |
| ...(invokeCtx?.parentTaskId !== undefined | ||
| ? { parentTaskId: invokeCtx.parentTaskId } | ||
| : {}), |
Pull request overview
Adds a parentTaskId correlation field to provider-call events so fan-out operations (review panels + debates) can be cost-attributed back to a single orchestrator task, and lands the byterover-cli session-09 comparison closure + a new consolidated rule (consumer-first + RED-first TDD).
Changes:
- Thread `parentTaskId` through `ProviderRequest` → `agent_invoked`/`agent_completed`, validate `T-\d{3,}` when present, and add `summarizeByParentTask()` rollup reporting.
- Propagate `parentTaskId` through REVIEW debate fire-paths and REVIEW panel production seam wiring.
- Add rule 22 + builder “RED-first” execution details; add session-09 comparison artifacts and index row; adjust builder prompt-length test cap.
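As a rough illustration of the first two changes, the sketch below shows an optional-field check and a write-through that omits the key when no parent is known. The names `TASK_ID_SKETCH`, `checkParentTaskId`, `RequestSketch`, and `buildRequest` are hypothetical; the real validator uses `TASK_ID_PATTERN`, and anchoring the pattern with `^` and `$` is an assumption made here.

```typescript
// Assumed pattern, based on the T-\d{3,} wording above. Anchoring is an assumption.
const TASK_ID_SKETCH = /^T-\d{3,}$/;

// Returns an error message, or null when the event is acceptable.
// An absent parentTaskId is valid, so events written before the field existed still pass.
function checkParentTaskId(event: { parentTaskId?: unknown }): string | null {
  if (event.parentTaskId === undefined) return null;
  if (typeof event.parentTaskId !== "string" || !TASK_ID_SKETCH.test(event.parentTaskId)) {
    return `parentTaskId must match ${TASK_ID_SKETCH}`;
  }
  return null;
}

// Hypothetical request shape, reduced to the fields needed for the example.
interface RequestSketch {
  prompt: string;
  parentTaskId?: string;
}

// Conditional spread: the key is left out entirely when no parent is known,
// instead of being written as an explicit undefined.
function buildRequest(prompt: string, parentTaskId?: string): RequestSketch {
  return { prompt, ...(parentTaskId !== undefined ? { parentTaskId } : {}) };
}

console.log(checkParentTaskId({})); // absent field
console.log(checkParentTaskId({ parentTaskId: "T-001" })); // canonical id
console.log(checkParentTaskId({ parentTaskId: "task-1" })); // rejected
console.log(buildRequest("review this diff", "T-042"));
console.log("parentTaskId" in buildRequest("review this diff"));
```

Omitting the key rather than storing `undefined` keeps serialized events identical to the pre-change shape whenever no parent task applies.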
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/state-events-parent-task-id.test.ts | Adds validator coverage for optional parentTaskId on agent events. |
| tests/provider-invoke-parent-task-id.test.ts | Ensures invokeAgent records/omits parentTaskId correctly in events.jsonl. |
| tests/cost-by-parent-task.test.ts | Adds regression/behavior tests for summarizeByParentTask FIFO pairing + rollup. |
| tests/build-prompt-composer.test.ts | Raises builder persona body upper bound to accommodate new builder content. |
| src/tools/debate-request.ts | Threads optional parentTaskId into opposing + synthesis ProviderRequests. |
| src/state/schemas.ts | Extends agent_invoked / agent_completed event variants with optional parentTaskId. |
| src/state/events.ts | Validates parentTaskId against TASK_ID_PATTERN when present. |
| src/providers/types.ts | Adds ProviderRequest.parentTaskId as an optional correlation id. |
| src/providers/invoke.ts | Writes parentTaskId into agent_invoked and mirrors it into agent_completed. |
| src/providers/cost.ts | Introduces summarizeByParentTask() rollup report keyed by parentTaskId. |
| src/phases/review.ts | Sets parentTaskId on REVIEW debate invocations (single + panel-debate branch). |
| src/phases/review-panel.ts | Extends PanelistInvoker seam to accept optional ctx with parentTaskId; passes it through. |
| src/cli/production-seams.ts | Stamps parentTaskId onto panelist ProviderRequests in production invoker when provided. |
| src/agents/defaults/builder.md | Adds executable “RED-first” test ordering steps per rule 22(b). |
| docs/comparison/README.md | Adds session 09 row and removes byterover-cli from unaudited backlog list. |
| docs/comparison/09-byterover-cli/SYNTHESIS.md | Adds/records the session closure + decisions and shipped scope. |
| docs/comparison/09-byterover-cli/COMPARISON.md | Adds byterover-cli comparison matrix and borrow/reject analysis. |
| docs/comparison/09-byterover-cli/CODEX_RESPONSE.md | Adds Codex review output for the comparison. |
| docs/comparison/09-byterover-cli/CODEX_PREDESIGN_B3.md | Adds Codex pre-design memo for the B3 hotfix. |
| docs/comparison/09-byterover-cli/CODEX_FINAL_REVIEW.md | Adds Codex final review / closure notes for the landing batch. |
| docs/comparison/09-byterover-cli/CODEX_BRIEFING.md | Adds the briefing used for Codex review. |
| CLAUDE.md | Updates status line, adds rule 22, and adds byterover-cli to influence library. |
| // so reducer pairing keeps the parent correlation across the | ||
| // invoke/complete pair (FIFO-by-phase pairs by order, but | ||
| // explicit echo lets summarizeByParentTask join cleanly without | ||
| // assuming pairing semantics). |
| | 01 | ace | 2026-05-10 | YES, with selective borrows (M17-M20 Reviewer Memory sequence; see SYNTHESIS) | [01-ace/](01-ace/) | | ||
| | 02 | agenticSeek | 2026-05-10 | YES, structurally stronger on SDLC authority mechanics that overlap (not "ahead on every"); 4 borrow candidates ranked B3 (conditional on MCP-gap evidence) -> B1 (VERIFY-fail bad-plan telemetry, no plan-mutation authority) -> B4 (local-first OpenAI-compatible provider, demand-gated to PE-2) -> B2 (advisory DEFINE risk/effort hint, no `suggested_path`); substring denylist + memory-compression-as-canonical-state killed; local-first privacy upgraded from off-mission to demand-gated borrow; 3 rounds (Codex `accept-with-modifications` thread `019e12ac` -> 12 round-2 deltas, 10 distinct after merge -> round 3 both Opus and Codex independently report `converged` with 0 deltas, threads `019e131b` / `019e1323`); GPL-3.0 license noted | [02-agenticSeek/](02-agenticSeek/) | | ||
| | 07 | maestro | 2026-05-10 | YES, with selective borrows (B1 narrowed wave-verify; B2 heartbeat deferred as projection; B3 PLAN_DIFF blocked on SHIP contract; B4 separated from B5; B5 `outcome=abandoned` use-case-gated; B7 maestro bash loop rejected, `code-oz watch` deferred with contract draft); Codex `fix-first` thread `019e12ee` -- all 6 findings closed in synthesis (rule-21 misapplication corrected -> rule 20, RUN_OUTCOMES schema risk surfaced, SHIP-contract gap identified, Bun-native CI added to deferred set); maestro is the parent template -- three load-bearing rules already absorbed (rules 1/3/4) | [07-maestro/](07-maestro/) | | ||
| | 09 | byterover-cli | 2026-05-10 | YES, with selective borrows (B1+B4 consolidated into rule 22 — consumer-first + RED-first TDD; B3 `parentTaskId` fan-out cost rollup shipped on `feat/byterover-09-borrows`; B2 `code-oz consult` deferred to M17/M18 after Codex F1 caught the invented surface; B5/B6 pattern-only; R10 reclassified to defer-with-high-bar); 3 Codex rounds — `fix-first` thread `019e12ec` (8 findings) + pre-design thread `019e1318` + final review (round 3 closure); 3128 offline tests pass | [09-byterover-cli/](09-byterover-cli/) | |
| `code-oz` is a standalone Bun + TypeScript CLI that boots an adaptive multi-agent software-company simulation over a hybrid phase-graph + agentic sub-orchestration spine. Hard SDLC gates between phases (file-based, schema-validated). Cross-family adversarial review. Non-technical-user intent elicitation at the front. Multi-provider via `IAgentProvider` (Claude / Codex / Gemini SDKs reading CLI OAuth tokens). | ||
| Status: **v0.17.0-alpha.0 — M16 closed.** Production CLI completion (per-task cursor, dispatch infra, milestone-level e2e through the binary): `code-oz run`, `approve`, and `doctor` are wired end-to-end across DEFINE → REVIEW; full `resume` command remains M17. 3108 offline tests pass (+402 across M16); live xAI gated behind `CODE_OZ_LIVE_PROVIDER_TESTS=xai` + `CODE_OZ_LIVE_XAI_MODEL=<grok-variant>`. M16 R0/R1/R2 closed (8 production bugs caught by C12 e2e + 4 by Codex R1; per-commit cross-model peer review pattern validated for shared infra). Latest tag pushed: `v0.17.0-alpha.0` (2026-05-10). | ||
| Status: **v0.17.0-alpha.0 — M16 closed (production CLI completion).** Per-task-cursor CLI (init/run/dispatchBuild/dispatchVerify/dispatchReview/approve/resume/SHIP) shipped 2026-05-10 with exit-codes contract, prod-seam injection, phase-locks, and full audit-completeness recovery. 3128 offline tests pass (3108 baseline + 20 in 09-byterover-cli B3); live xAI integration test gated behind `CODE_OZ_LIVE_PROVIDER_TESTS=xai` + `CODE_OZ_LIVE_XAI_MODEL=<grok-variant>`. M16 R0/R1/R2 closed (8 production bugs caught by C12 e2e + 4 by Codex R1; per-commit cross-model peer review pattern validated for shared infra). PE-1 (xAI HTTP adapter, v0.13.0-alpha.0), M13 (role-cost policy under `budgets.global`, v0.14.0-alpha.0), M14 (reviewer panel v1, v0.15.0-alpha.0 — first simultaneous-provider surface), M15 (debate-policy scheduler v1, v0.16.0-alpha.0) all closed. PE-2 demand-gated; multi-cloud deferred to v0.2. Latest tag pushed: `v0.17.0-alpha.0` (2026-05-10). |
Summary
Closes the `byterover-cli` template comparison (session 09) and lands the two consolidated borrows on `feat/byterover-09-borrows`:
- … `fix-first` thread `019e12ec`). The detailed RED-first 5-step sequence lives in `src/agents/defaults/builder.md` for execution; rule 22 is the CLAUDE.md non-negotiable.
- … `parentTaskId` through `state/events.jsonl` cost-recorded events so reviewer-panel + debate fan-outs roll up correctly against `budgets.global`. Repaired the parent-task pairing logic per Codex round 3 closure.
- `COMPARISON.md` (21-row feature matrix), `CODEX_BRIEFING.md`, `CODEX_RESPONSE.md` (8 findings, `fix-first` verdict), `CODEX_PREDESIGN_B3.md` (pre-implementation design memo, thread `019e1318`), `CODEX_FINAL_REVIEW.md` (round 3 closure), `SYNTHESIS.md` (closed; 7 decision points resolved).
Commits
- `e672e9b` feat(state,cost,phases): thread parentTaskId for fan-out cost rollup (09-byterover-cli B3)
- `57a1456` docs(rules): add rule 22 (consumer-first + RED-first TDD); refresh status; influence library
- `aae1e7b` docs(builder): RED-first 5-step ordering detail (rule 22(b))
- `a0d377e` docs(comparison): add 09-byterover-cli folder + close decision points
- `fcd4bfb` fix(cost): repair parentTaskId rollup pairing (Codex round 3 closure)
- `11c08fd` docs(comparison): add Codex final review (Codex round 3 closure)
- `f2a17bf` merge(main): integrate 11 merged PRs + resolve CLAUDE.md/README/ROADMAP conflicts
Merge resolution
`main` advanced by 12 commits (11 merged PRs) while this branch was in flight. Conflicts resolved:
- … `09 | byterover-cli` row; removed `byterover-cli` from the Unaudited backlog list.
Test plan
- `bun test`: 3198 pass, 2 skip (live xAI gated), 0 fail (3200 across 205 files, 17.84s)
- `docs/comparison/README.md` lists session 09 with closed verdict
- `src/agents/defaults/builder.md` retains rule 22(b) reference at line 69
- `tests/state-events-parent-task-id.test.ts` present and passing