
Add coding-loop workflow + fix tmux agent stage-status env forwarding #82

Merged
mattleaverton merged 3 commits into danshapiro:main from mattleaverton:feat/coding-loop-workflow
Apr 17, 2026
Conversation

@mattleaverton
Collaborator

Summary

  • New workflow package workflows/coding-loop/: an iterative coding-agent loop (task chooser → implementer → reviewer → done-gate) wrapped in the trapezium/invtrapezium loop primitive with file-based termination. Each iteration handles one sub-task; reviewer feedback is persisted to .reviews/iter-NNN.md plus a rolling .reviews/latest.md; the done-gate uses LLM judgment against the spec.
  • Engine fix: the tmux agent session now receives KILROY_STAGE_STATUS_PATH and KILROY_STAGE_STATUS_FALLBACK_PATH (plus the full BuildStageRuntimeEnv set). The engine's status-contract preamble already tells agents to write to these vars, but previously only the API agent_loop path set them in the process env; the tmux path did not. This affects every agent_tool=claude|codex|gemini|opencode node in every workflow.

Motivation

Writing and running the coding-loop workflow exposed a latent tmux/API parity gap. When I ran the workflow end-to-end on a toy-list spec (7 sub-tasks → 7 iterations, full green run), I observed the implementer burning roughly 15 of 45 tool calls per iteration hunting for $KILROY_STAGE_STATUS_PATH that was never set:

echo "$KILROY_STAGE_STATUS_PATH"     → (empty)
printenv | grep -i kilroy            → only KILROY_RUN_ID, KILROY_NODE_ID
grep -r "status_path" <logs_root>    → no match in any config

Agents eventually gave up and wrote status.json to arbitrary fallback paths. Runs still succeeded because the engine already tolerates a missing status file, but each iteration lost roughly 2-3 minutes to the search, and the behavior was confusing: the preamble tells the agent to write to a path held in an env var that was never set.
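To make the failure mode concrete, here is a small Go sketch of the lookup an agent-side helper might perform. The function name resolveStatusPath and the primary-then-fallback order are assumptions for illustration; only the two variable names come from this PR.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// resolveStatusPath returns the first non-empty status path found in the
// environment. Trying the primary variable before the fallback is an
// assumption of this sketch, not a documented engine contract.
func resolveStatusPath(getenv func(string) string) (string, error) {
	for _, key := range []string{
		"KILROY_STAGE_STATUS_PATH",
		"KILROY_STAGE_STATUS_FALLBACK_PATH",
	} {
		if p := getenv(key); p != "" {
			return p, nil
		}
	}
	return "", errors.New("no stage status path set in environment")
}

func main() {
	path, err := resolveStatusPath(os.Getenv)
	if err != nil {
		// With neither variable set (the tmux case before this fix),
		// the lookup ends here and the agent has nowhere to write.
		fmt.Fprintln(os.Stderr, "status path unavailable:", err)
		return
	}
	fmt.Println("write status JSON to", path)
}
```

Passing the getenv function as a parameter keeps the lookup easy to exercise with a fake environment.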

What changed

internal/attractor/agents/tmux_handler.go — extracted the session-env construction into buildTmuxAgentEnv, which now merges:

  • The tool template's BuildEnv() defaults (unchanged)
  • engine.BuildStageRuntimeEnv — run/node IDs, worktree/logs paths, data dir, inputs manifest, KILROY_INPUT_* (replaces the previous hand-rolled subset)
  • engine.BuildStageStatusContract(...).EnvVars — the two status-contract paths (new)

This brings the tmux session env into parity with what buildAgentLoopOverrides already provides to the API agent_loop path.
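The three-way merge described above can be sketched roughly as follows. This is not the code from tmux_handler.go: mergeEnv and toEnviron are hypothetical helpers, the maps stand in for the real return values of BuildEnv(), BuildStageRuntimeEnv, and BuildStageStatusContract(...).EnvVars, the sample values are invented, and the precedence (later layers override earlier ones) is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeEnv layers env maps left to right; a later layer wins on key
// conflicts. A nil layer (for example, a missing tool template) is skipped
// naturally because ranging over a nil map is a no-op.
func mergeEnv(layers ...map[string]string) map[string]string {
	out := make(map[string]string)
	for _, layer := range layers {
		for k, v := range layer {
			out[k] = v
		}
	}
	return out
}

// toEnviron renders the map as sorted "KEY=VALUE" strings, the shape
// expected by os/exec's Cmd.Env.
func toEnviron(env map[string]string) []string {
	out := make([]string, 0, len(env))
	for k, v := range env {
		out = append(out, k+"="+v)
	}
	sort.Strings(out)
	return out
}

func main() {
	// All values below are made up for the example.
	templateDefaults := map[string]string{"TERM": "xterm-256color"}
	runtimeEnv := map[string]string{
		"KILROY_RUN_ID":  "run-123",
		"KILROY_NODE_ID": "implementer",
	}
	statusContract := map[string]string{
		"KILROY_STAGE_STATUS_PATH":          "/logs/implementer/status.json",
		"KILROY_STAGE_STATUS_FALLBACK_PATH": "/worktree/.kilroy/status.json",
	}
	for _, kv := range toEnviron(mergeEnv(templateDefaults, runtimeEnv, statusContract)) {
		fmt.Println(kv)
	}
}
```

Building the session env through a single merge function means the tmux and API paths can be checked against the same expected key set, which is the kind of parity this change is after.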

internal/attractor/agents/tmux_env_test.go — new unit tests covering both populated and nil-template code paths; asserts all expected runtime + status-contract vars are present.

workflows/coding-loop/ — new workflow package (graph.dot, workflow.toml, README.md) exercising the loop primitive end-to-end. Proven on a toy-math spec (1 iter, 4m 24s) and a toy-list spec (7 iters, 30m 54s). Both green.

Test plan

  • go test ./internal/attractor/agents/ ./internal/attractor/engine/ — green (agents 7.4s, engine 220.4s)
  • kilroy attractor validate --graph workflows/coding-loop/graph.dot — ok
  • End-to-end run: toy-math (3 sub-tasks) — 1 iteration, success
  • End-to-end run: toy-list (7 sub-tasks, empty-safe invariant) — 7 iterations, all features + tests implemented, go test ./... exits 0 in target repo
  • go build ./cmd/kilroy/ — clean
  • go vet ./... — clean
  • gofmt -l on touched files: clean (pre-existing drift elsewhere is unchanged; PR #74, "fix(ci): gofmt all unformatted files (engine.go, worktree_hint_test.go, cli_only_models_test.go, codergen_router_cxdb_test.go)", covers some of it)

Notes

  • Two pre-existing test failures on main (TestRunWithConfig_ForceModel_BypassesCatalogGate, TestRunWithConfig_AllowsKimiAndZai_WhenCatalogUsesOpenRouterPrefixes) are unrelated to this change.
  • The workflow package is functional but v0.1.0 — future work could use the housekeeping-LLM primitive (separate exploration) to replace the exact-string termination match, drop the loop_max=12 cap, and make the done-gate more robust to prompt variation.

🤖 Generated with Claude Code

mattleaverton and others added 3 commits April 17, 2026 15:49

Iterative coding agent workflow: task chooser → implementer → reviewer
→ done-gate, wrapped in a trapezium/invtrapezium loop primitive with
loop_max=8 and loop_until_file_contains-based termination.

- Chooser and done-gate on claude-haiku-4.5 (cheap, API)
- Implementer and reviewer on claude-sonnet-4.6 via agent_tool=claude
- Feedback persisted to .reviews/iter-NNN.md plus .reviews/latest.md;
  chooser reads latest only, done-gate can list/read any iteration
- Spec passed via --input spec=<abs-path> and read in place; never
  committed into the target repo
- Reviewer uses git show HEAD (vs git diff HEAD~1 HEAD) so the
  first-iteration case works without a fallback branch

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Force one-subtask-per-iteration to exercise the loop dynamics.

- Chooser: pick EXACTLY ONE smallest self-contained sub-task; do not bundle.
  Explicit guardrail written into .kilroy/task.md for the implementer.
- Implementer: implement ONLY what the task asks for; do not guess ahead.
- loop_max bumped 8 → 12 to accommodate multi-iteration specs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Agents run via TmuxAgentHandler (agent_tool=claude|codex|gemini|opencode)
were seeing the engine-injected status-contract preamble that instructs
them to write status JSON to $KILROY_STAGE_STATUS_PATH — but the tmux
session env never actually set those variables. Only the API agent_loop
path set them (via buildAgentLoopOverrides). Agents wasted tool calls
hunting for the unset env var and eventually gave up.

Consolidate the env build into buildTmuxAgentEnv, which now merges:

  - The tool template's BuildEnv() defaults
  - Engine runtime invariants (KILROY_RUN_ID, KILROY_NODE_ID,
    KILROY_LOGS_ROOT, KILROY_STAGE_LOGS_DIR, KILROY_WORKTREE_DIR,
    KILROY_DATA_DIR, KILROY_INPUTS_MANIFEST_PATH, KILROY_INPUT_*)
    via BuildStageRuntimeEnv
  - Stage status contract paths (KILROY_STAGE_STATUS_PATH,
    KILROY_STAGE_STATUS_FALLBACK_PATH) via BuildStageStatusContract

This matches the API agent_loop path's env, so tmux and API backends
are now consistent with respect to what the status-contract preamble
can actually reference.

Observed in the wild: a 7-iteration coding-loop run where the
implementer burned ~15 of 45 tool calls per iteration searching for
KILROY_STAGE_STATUS_PATH. With this fix the env var is set at session
start and the preamble instruction is actionable.

Adds unit test coverage in tmux_env_test.go.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mattleaverton force-pushed the feat/coding-loop-workflow branch from 73bea5e to 4e30264 on April 17, 2026 20:49
mattleaverton merged commit 7073ea0 into danshapiro:main on Apr 17, 2026
1 check failed
