feat(loop): agent-suggested parallel batches + end-of-run verify + loop graph DAG (hew-lf40) by droidnoob · Pull Request #59 · droidnoob/hew

droidnoob · 2026-05-30T11:29:54Z

Closes the hew-lf40 epic — agent-suggested parallel batches for hew loop run --jobs N, plus end-of-run test verification and a DAG renderer for loop iteration history.

Why this exists

DECISION:loop-parallel-overlap-policy ("trust the graph") shipped in v1 as a deliberate punt: bd dep edges encode safety, and conflicts get caught at merge-back time. That's correct for sparse graphs but bites when two independent tasks touch the same file. The 2026-05-29 autonomous run made the cost real — loop_log.rs overlap between hew-2cq and hew-6nxs required a manual rebase even though both branches were green in isolation.

This epic layers informed batching on top of trust-the-graph without contradicting it:

1. Iter agent's own suggestion: when the agent that just closed iter N emits a next_iteration: [task_ids] block, the dispatcher honors it for the next tick. This is the cheapest signal, because the agent already has full context.
2. Dedicated planner runtime: between iters, when (1) is silent, a small claude -p / codex exec call (capped by loop.planner.budget_tokens, default 10k) reads the bd-ready set plus recent symbol-touch sets and returns the next batch. It never truncates context to fit the budget; it skips cleanly instead.
3. Trust-the-graph floor: dispatch_tick intersects the batch with bd ready. The batch can only narrow the candidate set, never expand it. The floor is locked.
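
As a rough illustration of the three-layer cascade above, the selection order can be sketched as follows. This is not the code from the PR: `NextBatch` and `choose_next_batch` are hypothetical names, and treating an empty agent suggestion as "silent" is an assumption made for the sketch.

```rust
/// Origin of a batch, mirroring the Agent/Planner/Skipped sources named in this PR.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BatchSource {
    Agent,
    Planner,
    Skipped,
}

/// Simplified stand-in for the PR's BatchPlan (only three of its fields).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NextBatch {
    pub source: BatchSource,
    pub task_ids: Vec<String>,
    pub reason: Option<String>,
}

/// Hypothetical helper: prefer the agent's suggestion, else ask the planner,
/// else record a skip so the dispatcher falls back to plain `bd ready`.
pub fn choose_next_batch<F>(agent_suggestion: Option<Vec<String>>, run_planner: F) -> NextBatch
where
    F: FnOnce() -> Result<Vec<String>, String>,
{
    // Assumption: an empty suggestion counts as "silent".
    if let Some(ids) = agent_suggestion.filter(|ids| !ids.is_empty()) {
        return NextBatch { source: BatchSource::Agent, task_ids: ids, reason: None };
    }
    // The planner closure only runs when the agent gave no usable suggestion.
    match run_planner() {
        Ok(ids) => NextBatch { source: BatchSource::Planner, task_ids: ids, reason: None },
        Err(reason) => NextBatch {
            source: BatchSource::Skipped,
            task_ids: Vec::new(),
            reason: Some(reason),
        },
    }
}
```

Passing the planner as a closure keeps the expensive call lazy: it is never invoked when the agent already named a batch.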

What lands (8 atomic commits)

| commit | task | what |
| --- | --- | --- |
| `108d148` | hew-58ac | `hew_core::batch_plan` module: BatchPlan + BatchSource enum + atomic file I/O at `.hew/loop/<run-id>/batch-NNN.json`, schema_version=1 |
| `e33abb0` | hew-7klt | `batch_plan_parse::extract_next_iteration`: tolerant parser for the agent's close-output block (fenced `next_iteration` form + `<next_iteration>` XML form), filters malformed task IDs |
| `f58ff12` | hew-pxw9 | `spawn_planner`: subprocess with pre-spawn budget check; failure modes all return `BatchPlan::Skipped { reason }` rather than propagating errors |
| `48506d9` | hew-rplg | Dispatcher threading: `Dispatcher::new` accepts `Option<BatchPlan>`; `dispatch_tick` filters by `batch ∩ bd_ready`; ready_seen reflects post-filter |
| `31ef9ff` | hew-7k1m | CLI + config: `--no-planner`, `--planner-budget`, `loop.planner.*` schema; iter-end hook chooses agent → planner → skipped |
| `5e595fa` | hew-z7rz | `hew loop summary` adds `planner: agent=N, runtime=M, fallback=K`; `docs/LOOP.md` "Batch planner" section; CHANGELOG entry; new `DECISION:loop-batch-planner-floor` memory |
| `dbe56b4` | hew-bon7 | End-of-run verify step: stack-detected test command (Rust → Node → Python → Go → Make/Just), `loop.end_of_run.verify_tests` config + `--verify-tests` flag, opt-in default false, budget-capped, writes `verify.log` and `STATUS:loop-verify-failed:` memory on failure |
| `42014ba` | hew-m7lq | `hew loop graph`: DAG renderer over iter + batch + run logs. Outputs mermaid (default), dot, or ASCII. Handles incomplete iters, cancelled runs, runtime-error-with-empty-stderr, backpressure rollback, verify outcomes, parallel worker swimlanes |

Bonus: the `hew loop graph` unhappy paths the user flagged

Per the chore body, the graph must render the cases where things didn't go cleanly:

| case | node treatment |
| --- | --- |
| incomplete iter (`started_at`, no `ended_at`) | `⋯` glyph, dashed border, partial label |
| cancelled mid-run (`stop_reason: Cancelled`) | `⊘` glyph, gray, `[Cancelled at <ts>]` annotation |
| runtime_error with empty stderr (the 2h hang case) | `✗` glyph, `(no stderr — possibly hung)` annotation |
| backpressure_fail with rollback | `↺` edge back to previous iter's HEAD sha |
| verify failed | red verify node + top 3 failed test names |
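
The first three rows of the table amount to a mapping from iteration state to node styling. A minimal sketch of such a mapping is below; `IterState`, `NodeStyle`, and `node_style` are hypothetical names rather than the `hew_core::loop_graph` API, and giving every runtime error the `✗` glyph (with the hang hint only when stderr is empty) is an assumption.

```rust
/// Hypothetical iteration states, limited to the per-iter cases in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IterState {
    Completed,
    Incomplete,
    Cancelled,
    RuntimeError { stderr_empty: bool },
}

/// Hypothetical styling record a renderer could translate to mermaid, dot, or ASCII.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodeStyle {
    pub glyph: Option<char>,
    pub dashed_border: bool,
    /// True when the node should carry the "possibly hung" annotation.
    pub hung_hint: bool,
}

pub fn node_style(state: IterState) -> NodeStyle {
    match state {
        // Clean iterations get no special glyph in this sketch.
        IterState::Completed => NodeStyle { glyph: None, dashed_border: false, hung_hint: false },
        IterState::Incomplete => NodeStyle { glyph: Some('⋯'), dashed_border: true, hung_hint: false },
        IterState::Cancelled => NodeStyle { glyph: Some('⊘'), dashed_border: false, hung_hint: false },
        // Assumption: the hint is shown only when stderr was empty.
        IterState::RuntimeError { stderr_empty } => NodeStyle {
            glyph: Some('✗'),
            dashed_border: false,
            hung_hint: stderr_empty,
        },
    }
}
```

Keeping the styling decision separate from the output format is one way to let the three renderers share the same unhappy-path logic.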

CI parity

Local: cargo test --workspace — 41 suites green, 0 failed.
bd-scrubbed parity check (mimicking CI's lack of bd): loop_scope_e2e 7/7, loop_backpressure 14/14. The precheck-before-discover guard introduced in #58 (feat(loop): hew loop run --scope={ready|epics} flag, hew-b3yl) still holds.

Backward compat

Legacy runs (no batch-*.json files in run-dir): read() returns Ok(None), so the dispatcher falls through to bd ready. Byte-identical to today.
Legacy run.json (no verify_outcome field): #[serde(default)] → None, summary line omitted.
--jobs=1 (the default): batch-plan layer skipped entirely. Single-worker loops don't write batch files.

Non-goals (v1)

Static touches-overlap analysis (would parse description prose; brittle).
Cross-run batch memory.
Auto-fix on verify failure.
Live-updating graph (websocket/fswatch).

🤖 Generated with Claude Code

- BatchPlan { schema_version, iter_number, task_ids, source, reason, created_at, planner_tokens } + tagged BatchSource (Agent/Planner/Skipped, snake_case on the wire)
- path/read/write API; atomic write via loop_log::write_json_atomic; read returns Ok(None) on missing file and rejects mismatched SCHEMA_VERSION with a clear miette diagnostic
- 9 unit tests covering zero-pad path, missing-file, all three source roundtrips, atomic temp-cleanup, wire form, pinned version, unknown-version rejection

First-class artifact for the parent epic hew-lf40's batch-planner pipeline; downstream parser/planner/dispatcher consume this type.

Closes hew-58ac.
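
To make the zero-padded path and the atomic write concrete, here is a small standard-library sketch. `batch_path` and `write_atomic` are hypothetical stand-ins (the PR itself uses `loop_log::write_json_atomic`), and the three-digit padding and `.json.tmp` temp suffix are assumptions read off the `batch-NNN.json` pattern.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: `<run-dir>/batch-NNN.json`, assuming three-digit zero padding.
pub fn batch_path(run_dir: &Path, iter_number: u32) -> PathBuf {
    run_dir.join(format!("batch-{:03}.json", iter_number))
}

/// Stand-in for an atomic write: write a sibling temp file, then rename it
/// over the target so readers never observe a half-written plan.
pub fn write_atomic(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Assumed temp naming; the real helper may choose a different suffix.
    let tmp = path.with_extension("json.tmp");
    fs::write(&tmp, bytes)?;
    fs::rename(&tmp, path)
}
```

For example, `batch_path(Path::new(".hew/loop/run-1"), 7)` yields `.hew/loop/run-1/batch-007.json`. The rename step is what leaves no temp file behind on success, which is the behavior the "atomic temp-cleanup" test name suggests.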

…ew-7klt)

- New hew_core::batch_plan_parse module
- Parses fenced `next_iteration` JSON-array and <next_iteration> XML-tag CSV forms
- Hand-rolled hew-id validator (no new regex dep)
- Distinct None / Some(vec![]) / Some(ids) return states
- 13 tests including 1000-iter adversarial fuzz

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
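
A tolerant parser with the three return states listed above might look roughly like this. It is a simplified sketch, not the module from the PR: the assumed ID shape (`hew-` followed by lowercase letters or digits) is inferred from the task IDs on this page, and the JSON array is tokenized by hand instead of being parsed as real JSON.

```rust
/// Assumed ID shape: `hew-` followed by one or more lowercase ASCII letters or digits.
fn is_hew_id(s: &str) -> bool {
    match s.strip_prefix("hew-") {
        Some(rest) => {
            !rest.is_empty()
                && rest.chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit())
        }
        None => false,
    }
}

/// Text strictly between the first `open` and the next `close` after it.
fn between<'a>(text: &'a str, open: &str, close: &str) -> Option<&'a str> {
    let start = text.find(open)? + open.len();
    let end = text[start..].find(close)? + start;
    Some(&text[start..end])
}

/// Returns None when no block is present, Some(vec![]) when a block is present
/// but holds no valid IDs, and Some(ids) otherwise.
pub fn extract_next_iteration(text: &str) -> Option<Vec<String>> {
    // The fence marker is built at runtime so this sketch can sit inside a fenced block.
    let fence = "`".repeat(3);
    let fenced_open = format!("{}next_iteration", fence);
    let body = between(text, &fenced_open, &fence)
        .or_else(|| between(text, "<next_iteration>", "</next_iteration>"))?;
    // Simplification: split on commas/whitespace and strip JSON punctuation.
    let ids: Vec<String> = body
        .split(|c: char| c == ',' || c.is_whitespace())
        .map(|tok| tok.trim_matches(|c: char| c == '[' || c == ']' || c == '"'))
        .filter(|tok| is_hew_id(tok))
        .map(str::to_owned)
        .collect();
    Some(ids)
}
```

With this sketch, `<next_iteration>hew-aaa1, BAD, hew-bbb2</next_iteration>` yields the two valid IDs and silently drops the malformed one, while text with no block at all yields `None`.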

- spawn_planner in hew/src/commands/loop_cmd.rs: assembles a small prompt over bd_ready + recent_touches, runs a pre-spawn token budget check, drives the runtime, parses extract_next_iteration from the response.
- Every failure path returns BatchPlan { source: Skipped, reason }: budget_exceeded (no spawn), runtime_error, parse_error. Planner must never kill the loop.
- skills/data/planner-prompt.md holds the system body; embedded via include_str! and treated as a data file (not a registered skill).
- skills drift test now skips skills/data/ since it ships embedded resources (.toml + .md), not skill bodies.
- 6 inline unit tests cover all branches via MockSpawner + a custom Err-returning spawner.

Closes hew-pxw9.
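
The pre-spawn budget check described above can be sketched as a pure function that either approves the spawn or returns a skip reason. `PlannerDecision`, `estimate_tokens`, and `precheck` are hypothetical names, and the four-characters-per-token estimate is an assumption; the PR does not say how tokens are counted.

```rust
/// Hypothetical outcome of the pre-spawn check.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PlannerDecision {
    /// Prompt fits the budget; the caller may spawn the planner runtime.
    Spawn { prompt: String },
    /// Prompt does not fit; nothing is spawned and the loop carries on.
    Skipped { reason: &'static str },
}

/// Assumed heuristic: roughly four bytes per token, rounded up.
fn estimate_tokens(prompt: &str) -> usize {
    (prompt.len() + 3) / 4
}

/// Never truncates the prompt to fit: an over-budget prompt is skipped whole.
pub fn precheck(prompt: String, budget_tokens: usize) -> PlannerDecision {
    if estimate_tokens(&prompt) > budget_tokens {
        PlannerDecision::Skipped { reason: "budget_exceeded" }
    } else {
        PlannerDecision::Spawn { prompt }
    }
}
```

Returning a value instead of an error keeps the "planner must never kill the loop" property visible in the type: the caller has no `Err` branch to propagate.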

…bd-ready

- Dispatcher::new gains Option<BatchPlan>; field cached on the struct.
- dispatch_tick narrows post-scope candidates by linear contains against plan.task_ids (typical batch <10, which avoids a per-tick HashSet alloc). Filter is non-expansive: bd dep graph stays the safety floor per DECISION:loop-parallel-overlap-policy.
- Source::Skipped and empty task_ids fall through to trust-the-graph with no batch_source signaled.
- New DispatchTick.batch_source + Dispatcher::current_batch_source() for downstream summary aggregation.
- 8 new tests cover the matrix; existing 13 dispatcher tests pass unchanged with batch_plan: None.

Closes hew-rplg.
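
The non-expansive filter can be illustrated with a few lines of Rust. `narrow_candidates` is a hypothetical free function, not the `dispatch_tick` method itself; a skipped plan is modeled here as `None`.

```rust
/// Keep only the ready tasks that the plan also names. Tasks named by the
/// plan but absent from `bd_ready` are dropped, so the plan can never
/// expand the candidate set. A missing or empty plan falls through to the
/// full ready set (trust-the-graph).
pub fn narrow_candidates(bd_ready: &[String], plan_ids: Option<&[String]>) -> Vec<String> {
    match plan_ids {
        Some(ids) if !ids.is_empty() => bd_ready
            .iter()
            // Linear `contains` is cheap for the small batches described above.
            .filter(|task| ids.contains(*task))
            .cloned()
            .collect(),
        _ => bd_ready.to_vec(),
    }
}
```

Iterating over `bd_ready` (not over the plan) is what enforces the floor: the output is always a subset of the ready set, in the ready set's order.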

- LoopPlannerConfig {enabled, budget_tokens, runtime}; default enabled=true / 10_000 tokens / runtime=None.
- hew config get/set for loop.planner.{enabled,budget_tokens,runtime}.
- hew loop run --no-planner / --planner-budget / --planner-runtime, resolved via resolve_planner_config (CLI > config > default).
- Iter-end hook in run_worker_loop_with_scope writes <run-dir>/batch-NNN+1.json under --jobs >= 2 covering all four branches: Agent (raw_text named the block) → Planner (spawned) → Skipped (planner_disabled / budget_exceeded / parse_error / runtime_error) → bypass entirely when jobs == 1.
- Pure resolve_iter_completion_plan helper keeps the branch arithmetic test-friendly.
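
The CLI > config > default precedence can be sketched with `Option` chaining. The defaults below come from the commit message; the `PlannerOverrides` type and this particular signature for `resolve_planner_config` are assumptions made for illustration.

```rust
/// Resolved planner settings, with the defaults stated in this commit.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LoopPlannerConfig {
    pub enabled: bool,
    pub budget_tokens: u64,
    pub runtime: Option<String>,
}

impl Default for LoopPlannerConfig {
    fn default() -> Self {
        LoopPlannerConfig { enabled: true, budget_tokens: 10_000, runtime: None }
    }
}

/// Hypothetical layer of optional overrides (one for CLI flags, one for config).
#[derive(Debug, Clone, Default)]
pub struct PlannerOverrides {
    pub enabled: Option<bool>,
    pub budget_tokens: Option<u64>,
    pub runtime: Option<String>,
}

/// For each field: the CLI value wins, then the config value, then the default.
pub fn resolve_planner_config(cli: PlannerOverrides, config: PlannerOverrides) -> LoopPlannerConfig {
    let defaults = LoopPlannerConfig::default();
    LoopPlannerConfig {
        enabled: cli.enabled.or(config.enabled).unwrap_or(defaults.enabled),
        budget_tokens: cli
            .budget_tokens
            .or(config.budget_tokens)
            .unwrap_or(defaults.budget_tokens),
        runtime: cli.runtime.or(config.runtime).or(defaults.runtime),
    }
}
```

In this sketch `--no-planner` would populate `cli.enabled = Some(false)`, which overrides a config file that sets `loop.planner.enabled` to true.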

…w-z7rz)

- Summary gains PlannerCounts{agent,planner,skipped} + scan_planner_counts(run_dir) helper that walks batch-NNN.json artifacts; render emits 'planner: agent=N, runtime=M, fallback=K' between scope and tokens, omits when zero
- loop_cmd::print_summary populates from run_dir so live, replay, and parallel-aggregate paths all carry it
- docs/LOOP.md '## Batch planner' section: agent→planner→trust-the-graph cascade, batch-NNN.json schema, summary line, --no-planner / loop.planner.* surface
- CHANGELOG [Unreleased] entry; DECISION:loop-batch-planner-floor memory persisted
- 5 new lib tests; fmt+clippy clean; 712 lib tests green
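
The summary line itself is simple enough to sketch. The field-to-label mapping (planner shown as `runtime`, skipped shown as `fallback`) is inferred from the two spellings in this commit message, and "omits when zero" is read here as "omitted when all three counts are zero"; both are assumptions, as is the `render_planner_line` name.

```rust
/// Counts of batch plans by source, as named in this commit.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct PlannerCounts {
    pub agent: usize,
    pub planner: usize,
    pub skipped: usize,
}

/// Hypothetical renderer: returns None when there is nothing to report,
/// so legacy and single-worker runs keep their existing summary output.
pub fn render_planner_line(counts: &PlannerCounts) -> Option<String> {
    if counts.agent + counts.planner + counts.skipped == 0 {
        return None;
    }
    Some(format!(
        "planner: agent={}, runtime={}, fallback={}",
        counts.agent, counts.planner, counts.skipped
    ))
}
```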

Adds an opt-in mandatory verify step that runs after the last iter (and after merge-back on --jobs >= 2) to prove the final stacked state is green. Conditional on both a resolvable test command (CLI > config > gate::detect) and an explicit opt-in.

- new hew_core::verify (VerifyOutcome + resolve_command + run_verify)
- new [loop.end_of_run] config block (verify_tests, verify_command, verify_budget_wall) with three settable keys
- Run + RunLog gain verify_outcome with backward-compat parse
- summary renderer adds a coloured "verify:" line below planner
- --verify-tests / --no-verify-tests / --verify-command CLI flags
- failure writes STATUS:loop-verify-failed:<run-id> + non-zero exit; closed tasks are NOT rolled back
- defaults byte-identical to today (verify_tests = false)
- 18 new tests; docs/LOOP.md + CHANGELOG updated

Closes bd-hew-bon7.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
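
Stack detection in the order Rust → Node → Python → Go → Make/Just could be sketched as a first-match scan over marker files. Everything concrete here is an assumption: the marker file names, the commands, and the `detect_test_command` name are illustrative and are not taken from `gate::detect`.

```rust
/// Returns the test command for the first stack whose marker file exists.
/// `exists` is injected so the lookup stays pure and easy to test; a real
/// caller might pass `|f| std::path::Path::new(f).exists()`.
pub fn detect_test_command(exists: impl Fn(&str) -> bool) -> Option<&'static str> {
    // Assumed markers and commands, listed in the priority order from this PR.
    const STACKS: [(&str, &str); 6] = [
        ("Cargo.toml", "cargo test"),
        ("package.json", "npm test"),
        ("pyproject.toml", "pytest"),
        ("go.mod", "go test ./..."),
        ("Makefile", "make test"),
        ("justfile", "just test"),
    ];
    STACKS
        .iter()
        .find(|(marker, _)| exists(marker))
        .map(|(_, cmd)| *cmd)
}
```

Because the scan stops at the first hit, a repository with both `Cargo.toml` and `package.json` resolves to the Rust command, matching the stated priority. An explicit `--verify-command` or config value would take precedence over this detection.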

- hew_core::loop_graph IR + mermaid/dot/ascii renderers (pure, no I/O)
- builders read iter*.json, batch*.json, run.json, manifest.json
- unhappy paths render distinctly: incomplete (dashed), cancelled (⊘), runtime-error+empty-stderr ("possibly hung"), backpressure rollback (↺ self-edge), verify outcomes (passed/failed/skipped)
- parallel runs lay out per-worker swimlanes from manifest.json
- pre-batch-plan legacy runs render with sequential edges only
- CLI: hew loop graph [--run-id ID] [--format ...] [--out PATH] [--all]
- 13 unit tests covering each acceptance criterion + 5 e2e CLI tests
- docs/LOOP.md § Loop graph section + CHANGELOG entry

Closes epic hew-lf40 (8/8 children).

* chore(release): 0.11.0

  - workspace Cargo.toml: 0.10.0 -> 0.11.0
  - 23 skill body `hew:version=` markers bumped to match
  - .claude/ install snapshot refreshed via `hew init --runtime=claude`
  - CHANGELOG.md: move [Unreleased] content into [0.11.0] — 2026-05-30

  Release contents since 0.10.0:

  - #53 parallel hew loop via per-worker git worktrees (hew-6az)
  - #54 per-task model selection + per-model token spend (hew-1tq)
  - #55 init re-run UX: refresh/reconfigure/cancel (hew-0wa)
  - #56 split /hew:auto from /hew:loop semantics (hew-6n0v)
  - #57 cut local cargo test from ~2 min to ~22s (hew-v2ib)
  - #58 hew loop run --scope={ready|epics} (hew-b3yl)
  - #59 batch planner + end-of-run verify + loop graph (hew-lf40)
  - #60 retry_etxtbsy stub flake fix (hew-0rky)

  Breaking surface: hew loop run in non-interactive mode now requires --scope. Justifies the minor bump.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(readme): reflect 0.11.0 surface changes

  - /hew:auto description updated to in-conversation epic walk (was the legacy plan→decompose→execute→verify; rewritten in hew-6n0v / #56)
  - slash count 40 → 41 (new /hew:auto + various)
  - loop snippets show --scope (required in non-interactive mode per hew-b3yl / #58), --jobs N, --verify-tests, hew loop summary, hew loop graph
  - autonomous-loop bullets gain parallel-workers, scoped-runs + per-task-model, end-of-run-verification entries
  - Selected knobs table adds loop.model.*, loop.planner.*, loop.end_of_run.verify_tests, loop.fallback_runtime

  No changes to brand, hero copy, or repo description.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

droidnoob and others added 8 commits May 30, 2026 17:39

droidnoob force-pushed the feat/batch-plan-module branch from 42014ba to e9c23c0 Compare May 30, 2026 12:09

droidnoob merged commit 0c07687 into main May 30, 2026
14 checks passed

droidnoob mentioned this pull request May 30, 2026

chore(release): 0.11.0 #61

Merged

droidnoob merged 8 commits into main from feat/batch-plan-module
