feat(loop): agent-suggested parallel batches + end-of-run verify + loop graph DAG (hew-lf40) #59
Merged
- BatchPlan { schema_version, iter_number, task_ids, source, reason,
created_at, planner_tokens } + tagged BatchSource (Agent/Planner/Skipped,
snake_case on the wire)
- path/read/write API; atomic write via loop_log::write_json_atomic;
read returns Ok(None) on missing file and rejects mismatched
SCHEMA_VERSION with a clear miette diagnostic
- 9 unit tests covering zero-pad path, missing-file, all three source
roundtrips, atomic temp-cleanup, wire form, pinned version, unknown-
version rejection
First-class artifact for the parent epic hew-lf40's batch-planner
pipeline; downstream parser/planner/dispatcher consume this type.
Closes hew-58ac.
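A minimal sketch of two details named in this commit message: the snake_case wire form of `BatchSource` and the zero-padded artifact path. The three-digit pad width is inferred from the `batch-NNN.json` naming, and `as_wire`, `from_wire`, and `batch_path` are hypothetical helper names, not the module's confirmed API.

```rust
use std::path::{Path, PathBuf};

/// Who produced the batch (variant names taken from the commit message).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BatchSource {
    Agent,
    Planner,
    Skipped,
}

impl BatchSource {
    /// snake_case name used on the wire (hypothetical helper).
    pub fn as_wire(self) -> &'static str {
        match self {
            BatchSource::Agent => "agent",
            BatchSource::Planner => "planner",
            BatchSource::Skipped => "skipped",
        }
    }

    /// Inverse of `as_wire`; unknown names are rejected rather than guessed.
    pub fn from_wire(s: &str) -> Option<Self> {
        match s {
            "agent" => Some(BatchSource::Agent),
            "planner" => Some(BatchSource::Planner),
            "skipped" => Some(BatchSource::Skipped),
            _ => None,
        }
    }
}

/// Assumed layout: <run-dir>/batch-NNN.json with a three-digit zero pad.
pub fn batch_path(run_dir: &Path, iter_number: u32) -> PathBuf {
    run_dir.join(format!("batch-{:03}.json", iter_number))
}
```

Under these assumptions, `batch_path(run_dir, 7)` ends in `batch-007.json`, so a directory listing sorts in iteration order.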
…ew-7klt)
- New hew_core::batch_plan_parse module
- Parses fenced ```next_iteration JSON-array and <next_iteration> XML-tag CSV forms
- Hand-rolled hew-id validator (no new regex dep)
- Distinct None / Some(vec![]) / Some(ids) return states
- 13 tests including 1000-iter adversarial fuzz
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
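The three return states can be illustrated with a std-only sketch. The id rule here ("hew-" plus 2 to 8 lowercase alphanumerics) is an assumption for illustration, as are the helper names `is_hew_id`, `collect_ids`, and `between`; the real parser is described as more tolerant than this.

```rust
/// Hypothetical id rule: "hew-" followed by 2..=8 lowercase ASCII letters or digits.
fn is_hew_id(s: &str) -> bool {
    match s.strip_prefix("hew-") {
        Some(rest) => {
            (2..=8).contains(&rest.len())
                && rest.bytes().all(|b| b.is_ascii_lowercase() || b.is_ascii_digit())
        }
        None => false,
    }
}

/// Split a comma-separated body, strip whitespace and quotes, drop malformed ids.
fn collect_ids(body: &str) -> Vec<String> {
    body.split(',')
        .map(|t| t.trim().trim_matches('"'))
        .filter(|t| is_hew_id(t))
        .map(str::to_string)
        .collect()
}

/// Text between the first `open` marker and the next `close` marker after it.
fn between<'a>(text: &'a str, open: &str, close: &str) -> Option<&'a str> {
    let start = text.find(open)? + open.len();
    let end = text[start..].find(close)? + start;
    Some(&text[start..end])
}

/// None = no block found, Some(vec![]) = block found but no valid ids,
/// Some(ids) = block found with at least one valid id.
pub fn extract_next_iteration(text: &str) -> Option<Vec<String>> {
    // Build the three-backtick marker at runtime to keep this listing fence-safe.
    let fence = "`".repeat(3);
    let fenced_open = format!("{}next_iteration", fence);
    if let Some(body) = between(text, &fenced_open, &fence) {
        // JSON-array form: only the text inside the brackets matters here.
        // A fenced block without brackets is treated as "no block" in this sketch.
        let inner = between(body, "[", "]")?;
        return Some(collect_ids(inner));
    }
    if let Some(body) = between(text, "<next_iteration>", "</next_iteration>") {
        // XML-tag form: the body is a plain CSV list.
        return Some(collect_ids(body));
    }
    None
}
```

Keeping `None` apart from `Some(vec![])` lets a caller tell "the agent said nothing" from "the agent named a block, but nothing in it was usable".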
- spawn_planner in hew/src/commands/loop_cmd.rs: assembles a small
prompt over bd_ready + recent_touches, runs a pre-spawn token
budget check, drives the runtime, parses extract_next_iteration
from the response.
- Every failure path returns BatchPlan { source: Skipped, reason }:
budget_exceeded (no spawn), runtime_error, parse_error. Planner
must never kill the loop.
- skills/data/planner-prompt.md holds the system body; embedded via
include_str! and treated as a data file (not a registered skill).
- skills drift test now skips skills/data/ since it ships embedded
resources (.toml + .md), not skill bodies.
- 6 inline unit tests cover all branches via MockSpawner + a custom
Err-returning spawner.
Closes hew-pxw9.
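The "planner must never kill the loop" rule can be sketched as a function whose every failure path collapses into a skipped outcome. Everything named here is hypothetical: `plan_or_skip`, `PlanOutcome`, the closure-based spawner, and the 4-bytes-per-token estimate stand in for the real `spawn_planner`, `BatchPlan`, runtime, and budget check.

```rust
/// Simplified stand-in for a BatchPlan with source Planner or Skipped.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PlanOutcome {
    Planner(Vec<String>),
    Skipped(&'static str),
}

/// Crude token estimate. Assumption for illustration: roughly 4 bytes per token.
fn estimate_tokens(prompt: &str) -> usize {
    (prompt.len() + 3) / 4
}

/// Budget check first, then spawn, then parse. No branch returns an error.
pub fn plan_or_skip<S, P>(prompt: &str, budget_tokens: usize, spawn: S, parse: P) -> PlanOutcome
where
    S: Fn(&str) -> Result<String, String>,
    P: Fn(&str) -> Option<Vec<String>>,
{
    // Over budget: skip before spawning anything.
    if estimate_tokens(prompt) > budget_tokens {
        return PlanOutcome::Skipped("budget_exceeded");
    }
    // Runtime failure: record the reason, do not propagate it.
    let response = match spawn(prompt) {
        Ok(text) => text,
        Err(_) => return PlanOutcome::Skipped("runtime_error"),
    };
    // Unparseable response: also a skip, never a crash.
    match parse(&response) {
        Some(ids) => PlanOutcome::Planner(ids),
        None => PlanOutcome::Skipped("parse_error"),
    }
}
```

Passing the spawner as a closure mirrors the MockSpawner idea from the tests: each branch can be driven without a real subprocess.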
…bd-ready
- Dispatcher::new gains Option<BatchPlan>; field cached on the struct.
- dispatch_tick narrows post-scope candidates by linear contains against plan.task_ids (typical batch <10, which avoids a per-tick HashSet alloc). Filter is non-expansive: bd dep graph stays the safety floor per DECISION:loop-parallel-overlap-policy.
- Source::Skipped and empty task_ids fall through to trust-the-graph with no batch_source signaled.
- New DispatchTick.batch_source + Dispatcher::current_batch_source() for downstream summary aggregation.
- 8 new tests cover the matrix; existing 13 dispatcher tests pass unchanged with batch_plan: None.
Closes hew-rplg.
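The non-expansive filter can be sketched as a pure function over the ready set. The types are trimmed-down stand-ins for the real `BatchPlan` and `Dispatcher`, and `narrow_candidates` is a hypothetical name.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BatchSource {
    Agent,
    Planner,
    Skipped,
}

/// Only the two fields the filter needs; the real struct carries more.
pub struct BatchPlan {
    pub source: BatchSource,
    pub task_ids: Vec<String>,
}

/// Returns the narrowed candidates plus the batch source to report, if any.
/// Ids in the plan that are not already candidates are ignored, so the plan
/// can shrink the set but never grow it.
pub fn narrow_candidates(
    candidates: Vec<String>,
    plan: Option<&BatchPlan>,
) -> (Vec<String>, Option<BatchSource>) {
    match plan {
        Some(p) if p.source != BatchSource::Skipped && !p.task_ids.is_empty() => {
            // Linear `contains` is fine for the small batches described above.
            let kept = candidates
                .into_iter()
                .filter(|id| p.task_ids.contains(id))
                .collect();
            (kept, Some(p.source))
        }
        // No plan, a skipped plan, or an empty plan: trust the graph as-is.
        _ => (candidates, None),
    }
}
```

Because the result is always a subset of `candidates`, a wrong or stale plan can at worst under-schedule a tick; it cannot dispatch a task the dependency graph has not released.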
- LoopPlannerConfig {enabled, budget_tokens, runtime}; default
enabled=true / 10_000 tokens / runtime=None.
- hew config get/set for loop.planner.{enabled,budget_tokens,runtime}.
- hew loop run --no-planner / --planner-budget / --planner-runtime,
resolved via resolve_planner_config (CLI > config > default).
- Iter-end hook in run_worker_loop_with_scope writes
<run-dir>/batch-NNN+1.json under --jobs >= 2 covering all four
branches: Agent (raw_text named the block) → Planner (spawned) →
Skipped (planner_disabled / budget_exceeded / parse_error /
runtime_error) → bypass entirely when jobs == 1.
- Pure resolve_iter_completion_plan helper keeps the branch
arithmetic test-friendly.
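A sketch of the two pure decisions this commit describes: CLI > config > default precedence, and the iter-end cascade. `resolve_budget`, `resolve_iter_plan`, and `IterPlan` are hypothetical names; treating any agent block (even an empty one) as an Agent plan is an assumption of this sketch, not confirmed behaviour.

```rust
/// CLI flag wins over config, config wins over the built-in default of 10_000.
pub fn resolve_budget(cli: Option<u64>, config: Option<u64>) -> u64 {
    cli.or(config).unwrap_or(10_000)
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum IterPlan {
    Agent(Vec<String>),
    Planner(Vec<String>),
    Skipped(&'static str),
}

/// Returns None when no batch file should be written at all (jobs == 1).
pub fn resolve_iter_plan(
    jobs: usize,
    agent_block: Option<Vec<String>>,
    planner_enabled: bool,
    run_planner: impl FnOnce() -> Result<Vec<String>, &'static str>,
) -> Option<IterPlan> {
    // Single-worker loops bypass the batch-plan layer entirely.
    if jobs < 2 {
        return None;
    }
    // Cheapest signal first: the agent already named the next batch.
    if let Some(ids) = agent_block {
        return Some(IterPlan::Agent(ids));
    }
    if !planner_enabled {
        return Some(IterPlan::Skipped("planner_disabled"));
    }
    // The planner runs only when nothing cheaper answered.
    Some(match run_planner() {
        Ok(ids) => IterPlan::Planner(ids),
        Err(reason) => IterPlan::Skipped(reason),
    })
}
```

Taking the planner as a closure keeps the branch logic testable without spawning anything, which is the point of a pure helper.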
…w-z7rz)
- Summary gains PlannerCounts{agent,planner,skipped} + scan_planner_counts(run_dir)
helper that walks batch-NNN.json artifacts; render emits
'planner: agent=N, runtime=M, fallback=K' between scope and tokens, omits when zero
- loop_cmd::print_summary populates from run_dir so live, replay, and parallel-aggregate
paths all carry it
- docs/LOOP.md '## Batch planner' section: agent→planner→trust-the-graph cascade,
batch-NNN.json schema, summary line, --no-planner / loop.planner.* surface
- CHANGELOG [Unreleased] entry; DECISION:loop-batch-planner-floor memory persisted
- 5 new lib tests; fmt+clippy clean; 712 lib tests green
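The summary line can be sketched as a tally plus a renderer. Two assumptions are baked in: that `runtime=` reports the planner count and `fallback=` the skipped count (inferred from field order), and that "omits when zero" means all three counts are zero. `tally` and `render_planner_line` are hypothetical names; the real helper walks `batch-NNN.json` files rather than taking strings.

```rust
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct PlannerCounts {
    pub agent: u32,
    pub planner: u32,
    pub skipped: u32,
}

/// Count batch sources by their snake_case wire names; unknown names are ignored.
pub fn tally<'a>(sources: impl IntoIterator<Item = &'a str>) -> PlannerCounts {
    let mut counts = PlannerCounts::default();
    for source in sources {
        match source {
            "agent" => counts.agent += 1,
            "planner" => counts.planner += 1,
            "skipped" => counts.skipped += 1,
            _ => {}
        }
    }
    counts
}

/// Returns None when there is nothing to report, so the caller omits the line.
pub fn render_planner_line(counts: PlannerCounts) -> Option<String> {
    if counts == PlannerCounts::default() {
        return None;
    }
    Some(format!(
        "planner: agent={}, runtime={}, fallback={}",
        counts.agent, counts.planner, counts.skipped
    ))
}
```

Returning `Option<String>` keeps the "omit when zero" rule in one place instead of in every caller (live, replay, and parallel-aggregate).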
Adds an opt-in mandatory verify step that runs after the last iter (and after merge-back on --jobs >= 2) to prove the final stacked state is green. Conditional on both a resolvable test command (CLI > config > gate::detect) and an explicit opt-in.
- new hew_core::verify (VerifyOutcome + resolve_command + run_verify)
- new [loop.end_of_run] config block (verify_tests, verify_command, verify_budget_wall) with three settable keys
- Run + RunLog gain verify_outcome with backward-compat parse
- summary renderer adds a coloured "verify:" line below planner
- --verify-tests / --no-verify-tests / --verify-command CLI flags
- failure writes STATUS:loop-verify-failed:<run-id> + non-zero exit; closed tasks are NOT rolled back
- defaults byte-identical to today (verify_tests = false)
- 18 new tests; docs/LOOP.md + CHANGELOG updated
Closes bd-hew-bon7.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
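The double condition (explicit opt-in plus a resolvable command) can be sketched as one pure function. `resolve_verify`, `VerifyDecision`, and the two skip reasons are hypothetical; the real module exposes `resolve_command` and `VerifyOutcome`, whose exact shapes are not shown in this PR.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum VerifyDecision {
    /// Run this command after the last iter.
    Run(String),
    /// Do not verify; the reason string is illustrative only.
    Skip(&'static str),
}

/// Precedence for the command is CLI > config > auto-detected.
pub fn resolve_verify(
    opted_in: bool,
    cli_command: Option<&str>,
    config_command: Option<&str>,
    detected_command: Option<&str>,
) -> VerifyDecision {
    // Without an explicit opt-in nothing runs, matching the default of
    // verify_tests = false.
    if !opted_in {
        return VerifyDecision::Skip("not_opted_in");
    }
    match cli_command.or(config_command).or(detected_command) {
        Some(cmd) => VerifyDecision::Run(cmd.to_string()),
        None => VerifyDecision::Skip("no_test_command"),
    }
}
```

Checking the opt-in first is what keeps defaults byte-identical: a detectable test command alone never triggers a verify run.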
- hew_core::loop_graph IR + mermaid/dot/ascii renderers (pure, no I/O)
- builders read iter*.json, batch*.json, run.json, manifest.json
- unhappy paths render distinctly: incomplete (dashed), cancelled (⊘),
runtime-error+empty-stderr ("possibly hung"), backpressure rollback
(↺ self-edge), verify outcomes (passed/failed/skipped)
- parallel runs lay out per-worker swimlanes from manifest.json
- pre-batch-plan legacy runs render with sequential edges only
- CLI: hew loop graph [--run-id ID] [--format ...] [--out PATH] [--all]
- 13 unit tests covering each acceptance criterion + 5 e2e CLI tests
- docs/LOOP.md § Loop graph section + CHANGELOG entry
Closes epic hew-lf40 (8/8 children).
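A rough sketch of what a pure mermaid renderer over a sequential run might look like. The IR here (`IterNode`, `IterStatus`) is a hypothetical reduction of `hew_core::loop_graph`, covering only the incomplete and cancelled cases; labels and styling are illustrative, not the shipped output.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IterStatus {
    Done,
    Incomplete,
    Cancelled,
}

pub struct IterNode {
    pub number: u32,
    pub status: IterStatus,
}

/// Pure string builder: no file or process I/O, so it is trivially testable.
pub fn render_mermaid(nodes: &[IterNode]) -> String {
    let mut out = String::from("flowchart TD\n");
    for node in nodes {
        let label = match node.status {
            IterStatus::Done => format!("iter {}", node.number),
            IterStatus::Incomplete => format!("⋯ iter {} (partial)", node.number),
            IterStatus::Cancelled => format!("⊘ iter {} [Cancelled]", node.number),
        };
        out.push_str(&format!("    iter{}[\"{}\"]\n", node.number, label));
        // Incomplete iters get a dashed border so they stand out.
        if node.status == IterStatus::Incomplete {
            out.push_str(&format!("    style iter{} stroke-dasharray: 5 5\n", node.number));
        }
    }
    // Legacy-style sequential edges between consecutive iters.
    for pair in nodes.windows(2) {
        out.push_str(&format!("    iter{} --> iter{}\n", pair[0].number, pair[1].number));
    }
    out
}
```

Keeping the renderer free of I/O means a builder can read the iter, batch, run, and manifest files separately and hand the same IR to the mermaid, dot, and ASCII back ends.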
42014ba to e9c23c0
droidnoob added a commit that referenced this pull request on May 30, 2026
* chore(release): 0.11.0

  - workspace Cargo.toml: 0.10.0 -> 0.11.0
  - 23 skill body `hew:version=` markers bumped to match
  - .claude/ install snapshot refreshed via `hew init --runtime=claude`
  - CHANGELOG.md: move [Unreleased] content into [0.11.0] - 2026-05-30

  Release contents since 0.10.0:
  - #53 parallel hew loop via per-worker git worktrees (hew-6az)
  - #54 per-task model selection + per-model token spend (hew-1tq)
  - #55 init re-run UX: refresh/reconfigure/cancel (hew-0wa)
  - #56 split /hew:auto from /hew:loop semantics (hew-6n0v)
  - #57 cut local cargo test from ~2 min to ~22s (hew-v2ib)
  - #58 hew loop run --scope={ready|epics} (hew-b3yl)
  - #59 batch planner + end-of-run verify + loop graph (hew-lf40)
  - #60 retry_etxtbsy stub flake fix (hew-0rky)

  Breaking surface: hew loop run in non-interactive mode now requires --scope. Justifies the minor bump.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(readme): reflect 0.11.0 surface changes

  - /hew:auto description updated to in-conversation epic walk (was the legacy plan→decompose→execute→verify; rewritten in hew-6n0v / #56)
  - slash count 40 → 41 (new /hew:auto + various)
  - loop snippets show --scope (required in non-interactive mode per hew-b3yl / #58), --jobs N, --verify-tests, hew loop summary, hew loop graph
  - autonomous-loop bullets gain parallel-workers, scoped-runs + per-task-model, end-of-run-verification entries
  - Selected knobs table adds loop.model.*, loop.planner.*, loop.end_of_run.verify_tests, loop.fallback_runtime

  No changes to brand, hero copy, or repo description.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the `hew-lf40` epic: agent-suggested parallel batches for `hew loop run --jobs N`, plus end-of-run test verification and a DAG renderer for loop iteration history.

## Why this exists
`DECISION:loop-parallel-overlap-policy` ("trust the graph") shipped in v1 as a deliberate punt: bd dep edges encode safety, and conflicts get caught at merge-back time. That's correct for sparse graphs but bites when two independent tasks touch the same file. The 2026-05-29 autonomous run made the cost real: `loop_log.rs` overlap between `hew-2cq` and `hew-6nxs` required a manual rebase even though both branches were green in isolation.

This epic layers informed batching on top of trust-the-graph without contradicting it:
- … `next_iteration: [task_ids]` block, the dispatcher honors it for the next tick. Cheapest signal: the agent already has full context.
- … `claude -p` / `codex exec` call (capped by `loop.planner.budget_tokens`, default 10k) reads the bd-ready set + recent symbol-touch sets and returns the next batch. Never truncates context to fit budget; skips cleanly instead.
- … `bd ready`. The batch can only narrow the candidate set, never expand it. Floor is locked.

## What lands (8 atomic commits)
- `108d148` `hew_core::batch_plan` module: BatchPlan + BatchSource enum + atomic file I/O at `.hew/loop/<run-id>/batch-NNN.json`, schema_version=1
- `e33abb0` `batch_plan_parse::extract_next_iteration`: tolerant parser for the agent's close-output block (fenced `next_iteration` form + `<next_iteration>` XML form), filters malformed task IDs
- `f58ff12` `spawn_planner`: subprocess with pre-spawn budget check; failure modes all return `BatchPlan::Skipped { reason }` rather than propagating errors
- `48506d9` `Dispatcher::new` accepts `Option<BatchPlan>`; `dispatch_tick` filters by `batch ∩ bd_ready`; ready_seen reflects post-filter
- `31ef9ff` `--no-planner`, `--planner-budget`, `loop.planner.*` schema; iter-end hook chooses agent → planner → skipped
- `5e595fa` `hew loop summary` adds `planner: agent=N, runtime=M, fallback=K`; `docs/LOOP.md` "Batch planner" section; CHANGELOG entry; new `DECISION:loop-batch-planner-floor` memory
- `dbe56b4` `loop.end_of_run.verify_tests` config + `--verify-tests` flag, opt-in default false, budget-capped, writes `verify.log` and `STATUS:loop-verify-failed:` memory on failure
- `42014ba` `hew loop graph`: DAG renderer over iter + batch + run logs. Outputs mermaid (default), dot, or ASCII. Handles incomplete iters, cancelled runs, runtime-error-with-empty-stderr, backpressure rollback, verify outcomes, parallel worker swimlanes

## Bonus: the `hew loop graph` unhappy paths the user flagged

Per the chore body, the graph must render the cases where things didn't go cleanly:
- … (`started_at`, no `ended_at`): `⋯` glyph, dashed border, partial label
- … (`stop_reason: Cancelled`): `⊘` glyph, gray, `[Cancelled at <ts>]` annotation
- …: `✗` glyph, `(no stderr — possibly hung)` annotation
- …: `↺` edge back to previous iter's HEAD sha

## CI parity
`cargo test --workspace`: 41 suites green, 0 failed. `loop_scope_e2e` 7/7, `loop_backpressure` 14/14. The precheck-before-discover guard from feat(loop): hew loop run --scope={ready|epics} flag (hew-b3yl) #58 still holds.

## Backward compat
- … `batch-*.json` files in run-dir): dispatcher reads `read() == None`, falls through to `bd ready`. Byte-identical to today.
- `run.json` (no `verify_outcome` field): `#[serde(default)]` → `None`, summary line omitted.
- `--jobs=1` (the default): batch-plan layer skipped entirely. Single-worker loops don't write batch files.

## Non-goals (v1)
🤖 Generated with Claude Code