
feat(loops): recursive execution atom — budget-conserving Scope + Supervisor keystone #151

Merged
drewstone merged 5 commits into main from feat/recursive-execution-atom
Jun 4, 2026


@drewstone
Contributor

What

The v1 recursive execution atom — the keystone for drivers-of-drivers: one self-similar Agent whose act spawns child agents through a Scope, run by a Supervisor that owns a conserved budget pool, an event-sourced journal, and an observability/conversation handle. The flat experiment harness is recovered as the simplest act (it does not compete with this — it is one program over it).
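A minimal sketch of the shape described above, under stated assumptions: the `Agent`, `Scope`, and `Settled` types here are hypothetical stand-ins (the frozen contract in src/loops/supervise/types.ts is not reproduced in this PR text), and `leaf`, `flatHarness`, and `inMemoryScope` are illustrative names. It only shows how a flat fan-out can be written as one `act` over a scope.

```typescript
// Hypothetical shapes for illustration; not the contract in types.ts.
type Settled<T> =
  | { status: "ok"; value: T }
  | { status: "down"; error: unknown };

interface Scope<T> {
  // Start a child agent; the promise resolves once the child has settled.
  spawn(child: Agent<T>): Promise<Settled<T>>;
}

interface Agent<T> {
  act(scope: Scope<T>): Promise<T>;
}

// A leaf spawns nothing; it just produces a value.
const leaf = (value: number): Agent<number> => ({
  act: async () => value,
});

// The flat harness as the simplest act: spawn k leaves, keep the best value.
const flatHarness = (k: number): Agent<number> => ({
  act: async (scope) => {
    const pending = Array.from({ length: k }, (_, i) => scope.spawn(leaf(i)));
    const settled = await Promise.all(pending);
    let best = -Infinity;
    for (const s of settled) {
      if (s.status === "ok" && s.value > best) best = s.value;
    }
    return best;
  },
});

// A trivial in-memory scope: children run against the same scope, so an
// act that spawns agents that themselves spawn agents works unchanged.
const inMemoryScope: Scope<number> = {
  spawn: async (child) => {
    try {
      return { status: "ok", value: await child.act(inMemoryScope) };
    } catch (error) {
      return { status: "down", error };
    }
  },
};
```

In this sketch the scope has no budget, journal, or supervisor; those are the parts the PR adds around the same call shape.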

Design + decision record: docs/research/recursive-execution-atom.md (frozen contract, build order, the 4 resolved forks, the adversarial critique that shaped the surface).

The pieces

  • src/loops/supervise/types.ts — the frozen contract. Agent, an open LeafExecutor interface (execute returns a promise or an async stream; router/inline + sandbox + cli are implementations, a user's own agent is first-class via the registry/BYO — no per-vendor adapters), Scope, Supervisor, Settled, Budget.
  • supervise/budget.ts — a conserved reservation pool: atomic reserve-on-spawn, fail-closed admission, refund-on-settle. This is the load-bearing invariant: Σk(treatment) ≡ Σk(blind) holds by construction, so a steered arm can never silently out-compute blind (the confound that burned the earlier "+20pp steering" result).
  • src/durable/spawn-journal.ts — event-sourced SpawnJournal + content-addressed ResultBlobStore + seq-ordered replay. Resumable, queryable, reproducible from one log.
  • supervise/scope.ts — a ray.wait cursor over an in-memory nursery; spawn reserves budget and resolves the executor through the open registry; a Settled → Iteration adapter keeps defaultSelectWinner single-sourced.
  • supervise/runtime.ts — the open executor registry; the sandbox executor composes runLoop (forwarding an optional lineage passthrough) rather than reinventing checkpoint/fork.
  • supervise/supervisor.ts — nursery join barrier, abort cascade (incl. the acquire lifecycle), OTP intensity breaker, typed SupervisedResult, RootHandle (view/signal/abort) as the observability substrate.
  • bench/src/drivers: flat-harness control (with the equal-k assertion), progressive-widening control, and the LLM-meta-driver treatment. WidenGate defaults to flat so the selector≠judge firewall stays dormant; widening reads trace findings, never the raw verdict, unless judgeExempt.
  • program.ts: mapPool one-for-one failure semantics (a down child is excluded from the merge; an all-down batch re-throws the first original error so a maxDepth guard still propagates loud).
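The conserved reservation pool from the list above can be sketched roughly as follows. This is an illustration under assumptions, not the code in supervise/budget.ts: `BudgetPool`, `Reservation`, `reserve`, `settle`, and `remaining` are hypothetical names, and the budget is modeled as a single number.

```typescript
// Hypothetical sketch; names are not taken from supervise/budget.ts.
interface Reservation {
  readonly amount: number;
  settled: boolean;
}

class BudgetPool {
  private available: number;

  constructor(total: number) {
    this.available = total;
  }

  // Reserve-on-spawn. The check and the decrement run with no await in
  // between, so no other task can interleave. Fail closed: a request that
  // does not fit is rejected outright, never partially granted.
  reserve(amount: number): Reservation | null {
    if (!Number.isFinite(amount) || amount <= 0 || amount > this.available) {
      return null;
    }
    this.available -= amount;
    return { amount, settled: false };
  }

  // Refund-on-settle: return the unspent part of the reservation exactly once.
  settle(reservation: Reservation, spent: number): void {
    if (reservation.settled) return;
    reservation.settled = true;
    const used = Math.min(Math.max(spent, 0), reservation.amount);
    this.available += reservation.amount - used;
  }

  get remaining(): number {
    return this.available;
  }
}

const pool = new BudgetPool(4);
const first = pool.reserve(3); // fits: 1 unit left in the pool
const second = pool.reserve(2); // does not fit: rejected with null
if (first) pool.settle(first, 1); // 2 unspent units flow back
```

If both arms of an experiment draw from pools of the same size, neither can spend more than the other's ceiling, which is the sense in which equal-k would hold by construction in this sketch.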

Verification

Independently re-run on this branch: typecheck ✓, lint ✓ (204 files), build ✓, full suite 642/642 (incl. 28 keystone property tests: conserved-budget fail-closed + refund, equal-k by construction, monotonic-seq cursor, abort/teardown, replay determinism), bench typecheck ✓.
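To make one of the listed properties concrete, here is a rough sketch of a monotonic-seq cursor. It is an assumption about what such a property could look like, not the implementation in supervise/scope.ts; `SettledLog`, `SettledEvent`, and `readAfter` are hypothetical names.

```typescript
// Hypothetical sketch of a monotonic-seq cursor; not the code in scope.ts.
interface SettledEvent {
  seq: number;
  childId: string;
}

class SettledLog {
  private readonly events: SettledEvent[] = [];
  private nextSeq = 0;

  // Record a settlement. The log assigns seq, so it is strictly increasing.
  push(childId: string): SettledEvent {
    const event = { seq: this.nextSeq++, childId };
    this.events.push(event);
    return event;
  }

  // Return up to `max` events with seq greater than `cursor`, plus the
  // advanced cursor. The cursor never moves backwards and never skips.
  readAfter(cursor: number, max: number): { events: SettledEvent[]; cursor: number } {
    const out: SettledEvent[] = [];
    let next = cursor;
    for (const event of this.events) {
      if (event.seq > cursor && out.length < max) {
        out.push(event);
        next = event.seq;
      }
    }
    return { events: out, cursor: next };
  }
}
```

A reader starts at cursor -1 and passes back the cursor it was handed; in this sketch each event is then delivered exactly once, in seq order.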

Honest gaps (why draft)

  • Live executor paths unproven. The router/sandbox/cli executors are exercised only through the offline mock LeafExecutor — they typecheck and the wiring is proven, but no real router-HTTP / sandbox-runLoop / cli-subprocess run has happened.
  • The LLM-meta-driver is the gated treatment, not a default. It ships so the diverse-strategy-vs-blind gate can actually be run; it is not wired into any default path.
  • Deferred (gated on a positive gate result): a tuned MCTS-PW algorithm, learned widening, a Temporal/DBOS durable backend, deleting runProgram's loop-layer parallel op.
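For context on what the offline mock path covers, the sketch below shows one way an executor whose `execute` returns either a promise or an async stream could be mocked and normalized. The types and names (`LeafOutput`, `mockPromiseExecutor`, `mockStreamExecutor`, `collect`) are hypothetical; the real LeafExecutor signature is not shown in this PR text.

```typescript
// Hypothetical shapes; the real LeafExecutor contract lives in types.ts.
type LeafOutput = Promise<string> | AsyncIterable<string>;

interface LeafExecutor {
  execute(prompt: string): LeafOutput;
}

// Offline mock that answers with a single promise.
const mockPromiseExecutor: LeafExecutor = {
  execute: (prompt) => Promise.resolve(`echo:${prompt}`),
};

// Offline mock that streams its answer in chunks.
const mockStreamExecutor: LeafExecutor = {
  execute: (prompt) =>
    (async function* () {
      yield "echo:";
      yield prompt;
    })(),
};

function isAsyncIterable(value: LeafOutput): value is AsyncIterable<string> {
  return Symbol.asyncIterator in value;
}

// Collapse either return shape into one final string.
async function collect(output: LeafOutput): Promise<string> {
  if (!isAsyncIterable(output)) return output;
  let text = "";
  for await (const chunk of output) text += chunk;
  return text;
}
```

A caller that goes through `collect` does not need to know which shape an executor chose, which is one plausible reason to keep the interface open rather than a closed union. It says nothing about the live router, sandbox, or cli paths, which remain unproven as noted above.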

Relationship to #150

#150 adds the leaf-level continued-session/fork lineage on runLoop. This PR is the driver layer on top: the sandbox executor forwards that lineage passthrough rather than duplicating it. The findings from the #150 review (especially the need to verify the client-minted sessionId) are load-bearing here too.

drewstone added 4 commits June 4, 2026 05:26
Capture the design thread as tracked research docs under docs/research:
- recursive-execution-atom: the next generation (one recursive Agent atom
  run as a durable, observable supervision tree; analyst-as-agent-with-runtime;
  async dynamic spawning), the proposed surface, the file-grounded gap, and the
  open forks. Plane B contains the flat harness.
- flat-harness-design: the assumption-free experiment harness synthesis
  (profiles x steer x executionMode x allocation). Plane A.
- long-horizon-benchmark-survey: adversarially-verified survey; Commit0 and
  tau2-bench as the multi-turn picks.
Index them under the new Research track in docs/README.md.
…spec

Freeze the contract: the budget-conserving reactive Scope + Supervisor keystone,
the event-sourced SpawnJournal + ResultBlobStore (outRef replay), the LeafExecutor
per-harness model (harness:null = Router inference, sandbox, cli), and the 8-step
v1 build order. Records the 4 resolved forks and the operator override (build the
LLM meta-driver now as the treatment on top of the conserved reservation pool, so
the equal-k gate stays valid by construction).
Operator refinement: the runtime is ONE open interface with an execute that
returns a promise or an async stream, not a closed inline|sandbox|cli union.
Built-ins are implementations; a user agent (mastra/agno/HTTP/custom) is
first-class by implementing it; no per-vendor adapters. The sandbox executor
composes runLoop and forwards PR #150's lineage passthrough rather than
reinventing checkpoint/fork. Records the #150 review (approve-to-land; verify
client-minted sessionId, bound fork acquisition, document parent-image fork).
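The "one open interface" refinement in the commit message above could look roughly like this. It is a sketch under assumptions: `Executor`, `ExecutorRegistry`, `register`, and `resolve` are hypothetical names, and the executor is simplified to a promise-returning `execute`.

```typescript
// Hypothetical registry sketch; not the API in supervise/runtime.ts.
interface Executor {
  execute(input: string): Promise<string>;
}

class ExecutorRegistry {
  private readonly byName = new Map<string, Executor>();

  register(name: string, executor: Executor): void {
    if (this.byName.has(name)) {
      throw new Error(`executor already registered: ${name}`);
    }
    this.byName.set(name, executor);
  }

  resolve(name: string): Executor {
    const executor = this.byName.get(name);
    if (!executor) throw new Error(`unknown executor: ${name}`);
    return executor;
  }
}

const registry = new ExecutorRegistry();
// A built-in and a user-supplied agent are registered the same way:
// both are just implementations of the one interface.
registry.register("inline", { execute: async (input) => `inline:${input}` });
registry.register("my-agent", { execute: async (input) => input.toUpperCase() });
```

Because the registry is keyed by name and typed only by the interface, adding a new backend needs no change to the registry itself, which is the point of avoiding per-vendor adapters.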
…ervisor

The v1 keystone for the recursive-agent-atom (drivers-of-drivers, async,
observable). One self-similar atom whose act spawns child agents through a
Scope; the flat experiment harness is recovered as the simplest act.

- src/loops/supervise/types.ts: the frozen contract — Agent, an OPEN
  LeafExecutor interface (execute returns a promise or an async stream;
  router/inline + sandbox + cli are implementations, BYO is first-class via
  the registry; no per-vendor adapters), Scope, Supervisor, Settled, Budget.
- supervise/budget.ts: a conserved reservation pool — atomic reserve-on-spawn,
  fail-closed admission, refund-on-settle, so equal-k holds by construction
  (the invariant that keeps a steered arm from silently out-computing blind).
- durable/spawn-journal.ts: event-sourced SpawnJournal + content-addressed
  ResultBlobStore + seq-ordered replay (resumable, queryable, reproducible).
- supervise/scope.ts: a ray.wait cursor over an in-memory nursery; spawn
  reserves budget and resolves the executor through the open registry; a
  Settled-to-Iteration adapter keeps defaultSelectWinner single-sourced.
- supervise/runtime.ts: the open executor registry; the sandbox executor
  composes runLoop (forwarding an optional lineage passthrough) rather than
  reinventing checkpoint/fork.
- supervise/supervisor.ts: nursery join barrier, abort cascade (incl. the
  acquire lifecycle), OTP intensity breaker, typed SupervisedResult, RootHandle
  (view/signal/abort) as the observability substrate.
- bench/src/drivers: flat-harness control (with the equal-k assertion),
  progressive-widening control, and the LLM-meta-driver treatment. The
  WidenGate defaults to flat so the selector-not-judge firewall stays dormant;
  widening reads trace findings, never the raw verdict, unless judgeExempt.
- program.ts: mapPool one-for-one failure semantics (a down child is excluded
  from the merge, an all-down batch re-throws the first original error so a
  maxDepth guard still propagates loud).
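As an illustration of the journal item in the list above, the following sketch combines an event log, a content-addressed blob store, and seq-ordered replay. Everything here is assumed for the example: `contentKey`, `BlobStore`, `JournalEvent`, and `replay` are hypothetical names, and FNV-1a stands in for whatever hash the real ResultBlobStore uses.

```typescript
// Hypothetical sketch; not the code in src/durable/spawn-journal.ts.
// FNV-1a is a stand-in hash chosen only to keep the example dependency-free.
function contentKey(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Content-addressed: the key is derived from the content, so storing the
// same result twice yields the same ref.
class BlobStore {
  private readonly blobs = new Map<string, string>();

  put(content: string): string {
    const ref = contentKey(content);
    this.blobs.set(ref, content);
    return ref;
  }

  get(ref: string): string | undefined {
    return this.blobs.get(ref);
  }
}

type JournalEvent =
  | { seq: number; kind: "spawned"; childId: string }
  | { seq: number; kind: "settled"; childId: string; outRef: string };

// Rebuild child results by folding events in seq order, regardless of the
// order in which they are handed in.
function replay(events: JournalEvent[], blobs: BlobStore): Map<string, string | undefined> {
  const results = new Map<string, string | undefined>();
  const ordered = [...events].sort((a, b) => a.seq - b.seq);
  for (const event of ordered) {
    if (event.kind === "spawned") {
      results.set(event.childId, undefined);
    } else {
      results.set(event.childId, blobs.get(event.outRef));
    }
  }
  return results;
}
```

Since the journal stores only the `outRef` and the fold is ordered by `seq`, two replays of the same log against the same blob store produce the same map in this sketch.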

Verified: typecheck, lint (204 files), build, full suite (642 tests incl. 28
keystone property tests), bench typecheck. Caveat: the live executor paths
(real router HTTP, sandbox runLoop, cli subprocess) are exercised only through
the offline mock LeafExecutor; no live-backend run yet.
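The mapPool failure semantics described in the commit message above can be sketched as below. This is not the mapPool in program.ts: `mapPoolSketch` and `Outcome` are hypothetical names, the concurrency limit a real pool would have is omitted, and "first original error" is assumed to mean first in input order.

```typescript
// Hypothetical sketch of the described semantics; not the mapPool in program.ts.
type Outcome<R> = { ok: true; value: R } | { ok: false; error: unknown };

async function mapPoolSketch<T, R>(
  items: T[],
  run: (item: T) => Promise<R>,
): Promise<R[]> {
  // One outcome per input item, in input order; a failure is captured
  // instead of rejecting the whole batch.
  const outcomes = await Promise.all(
    items.map((item): Promise<Outcome<R>> =>
      Promise.resolve()
        .then(() => run(item))
        .then(
          (value): Outcome<R> => ({ ok: true, value }),
          (error: unknown): Outcome<R> => ({ ok: false, error }),
        ),
    ),
  );

  const merged: R[] = [];
  let firstError: unknown;
  let sawError = false;
  for (const outcome of outcomes) {
    if (outcome.ok) {
      merged.push(outcome.value); // a down child is simply left out
    } else if (!sawError) {
      sawError = true;
      firstError = outcome.error;
    }
  }

  // All children down: re-throw the first original error unchanged, so a
  // guard such as a depth limit is not swallowed.
  if (merged.length === 0 && sawError) throw firstError;
  return merged;
}
```

Re-throwing the original error object, rather than wrapping it, keeps its type and message intact for whatever caller is watching for it.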
@drewstone drewstone marked this pull request as ready for review June 4, 2026 13:02
@tangletools
Contributor

tangletools commented Jun 4, 2026

🔍 Reviewing 78d80209

| Pass | Status | ETA |
| --- | --- | --- |
| opencode DeepSeek v4 Pro | Running (2 min) | ~5-15 min |
| opencode GLM 5.1 | Running (2 min) | ~5-15 min |

Agent review running. Reads the actual code. This comment updates in place.

tangletools · #151 · model: kimi-for-coding · started 2026-06-04T13:05:48Z

@drewstone drewstone merged commit 06efe71 into main Jun 4, 2026
1 check passed
@drewstone drewstone deleted the feat/recursive-execution-atom branch June 4, 2026 13:06
drewstone added a commit that referenced this pull request Jun 6, 2026
Cuts the 58-commit backlog on main into a published release. Headline surface:
- runToolLoop / streamToolLoop — bounded turn-level tool-dispatch loop (#137)
- RSI agent tree: recursive Agent.act, Supervisor keystone, runProgram, the
  adaptive-driver channel (#139/#151/#165)
- optimization API collapsed onto agent-eval selfImprove; the runtime keeps the
  CODE-surface ImprovementDriver you pass as driver (#172)
- deployable benchmark adapters: AppWorld, commit0, aec-bench, EnterpriseOps-Gym;
  runBenchmarks over one ADAPTERS registry (#153/#156/#157)
- agent-eval floor raised to >=0.83.0 (#175)

2 participants