feat(harness): add nested subagent runner helper#86
Conversation
Add the first reusable Sage v2 substrate slice: a harness helper for wiring createSubagentToolRegistry to isolated nested harness turns. Keep Sage product prompts, specialist rosters, runtime policy, telemetry, eval, and insight slices out of this PR. Validation is recorded in docs/architecture/sage-v2-nested-subagent-runner-80-to-100-validation.md.
|
Warning Rate limit exceeded
To continue reviewing without waiting, purchase usage credits in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR implements a nested subagent runner helper in the harness package, enabling parent turns to spawn child harness subagent executions with deterministic trace/turn IDs, parent-context filtering, tool allowlisting, and result translation. It includes comprehensive architecture documentation, detailed test coverage, registry integration with a new input/output contract, package exports, and multi-stage review completion confirming a fixed parent-context bug and 25 deterministic tests passing. ChangesNested Subagent Runner — Core Implementation & Integration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7479ef8ad3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| @@ -1,4 +1,5 @@ | |||
| export { createHarness } from './harness.js'; | |||
| export { createRuntimePolicy } from './runtime-policy/index.js'; | |||
There was a problem hiding this comment.
Remove the leaked runtime-policy export
In this environment I checked find packages/harness -path '*runtime-policy*' and rg "createRuntimePolicy|HarnessRuntimePolicy"; there is no src/runtime-policy/index.ts and the HarnessRuntimePolicy* types are not exported from types.ts. As a result, any build or root import of @agent-assistant/harness now hits TS2307/TS2305 from src/index.ts, so the nested subagent helper cannot be published or consumed until these runtime-policy re-exports are removed or the missing implementation is included.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/harness/src/index.ts (1)
2-21:⚠️ Potential issue | 🟠 Major | ⚡ Quick winKeep
runtime-policyout of the deprecated root entry for this slice.Lines 2-21 introduce a brand-new root API surface without the existing deprecation wrapper pattern. That effectively publishes
createRuntimePolicyand theHarnessRuntimePolicy*types as supported top-level imports before the dedicated runtime-policy slice lands.✂️ Suggested diff
-export { createRuntimePolicy } from './runtime-policy/index.js'; @@ -export type { - HarnessRuntimePolicy, - HarnessRuntimePolicyConfig, - HarnessRuntimePolicyDecision, - HarnessRuntimePolicyEvent, - HarnessRuntimePolicyEventKind, - HarnessRuntimePolicyGuard, - HarnessRuntimePolicyGuardDecision, - HarnessRuntimePolicyGuardInput, - HarnessRuntimePolicySanitizeResult, -} from './types.js';🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/harness/src/index.ts` around lines 2 - 21, The new root exports expose createRuntimePolicy and the HarnessRuntimePolicy* type names prematurely; remove the direct exports of createRuntimePolicy and all HarnessRuntimePolicy, HarnessRuntimePolicyConfig, HarnessRuntimePolicyDecision, HarnessRuntimePolicyEvent, HarnessRuntimePolicyEventKind, HarnessRuntimePolicyGuard, HarnessRuntimePolicyGuardDecision, HarnessRuntimePolicyGuardInput, and HarnessRuntimePolicySanitizeResult from this file and restore the existing deprecation-wrapper pattern by leaving runtime-policy exports only in the dedicated runtime-policy slice (or re-add a deprecated passthrough if your pattern requires it) so the runtime-policy API surface is not published at the root.
🧹 Nitpick comments (2)
packages/harness/src/nested-subagent-runner.test.ts (2)
609-659: ⚡ Quick winRelax the wall-clock assertion in
parallel_task_batch.A 50ms sleep with a
< 90msbudget leaves very little scheduler headroom, so this can fail on a loaded CI runner without any behavior change. Widening the threshold here would make the test much less flaky.⏱️ Suggested diff
- await new Promise((resolve) => setTimeout(resolve, 50)); + await new Promise((resolve) => setTimeout(resolve, 100)); @@ - expect(elapsedMs).toBeLessThan(90); + expect(elapsedMs).toBeLessThan(150);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/harness/src/nested-subagent-runner.test.ts` around lines 609 - 659, The wall-clock assertion that elapsedMs (measured after calling parentHarness.runTurn) must be < 90ms is too tight given the 50ms artificial delay in the subagent runTurn; increase the threshold to a more lenient value (e.g., 150ms or 200ms) by updating the expect(elapsedMs).toBeLessThan(...) assertion so tests are not flaky under CI scheduling; adjust the assertion near where elapsedMs is computed in nested-subagent-runner.test.ts (the block around parentHarness.runTurn / runTurn and the expect(...).toBeLessThan call).
51-60: ⚡ Quick winUse a product-neutral fixture name here.
doc-drafterreads like a Sage roster term, which leaks product vocabulary into a shared harness test. A neutral default such asexample-drafterkeeps the substrate boundary cleaner.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/harness/src/nested-subagent-runner.test.ts` around lines 51 - 60, The fixture default name in makeSubagent (returning a HarnessSubagent) currently uses the product-specific string 'doc-drafter'; change the name property to a product-neutral value like 'example-drafter' (or similar neutral token) so tests don’t leak product vocabulary—update the name field in the makeSubagent function and keep all other fields identical.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/architecture/sage-v2-nested-subagent-runner-fresh-eyes-review.md`:
- Around line 9-13: Replace the machine-local absolute markdown links in
docs/architecture/sage-v2-nested-subagent-runner-fresh-eyes-review.md with
repo-relative paths: change links like
/Users/khaliqgant/Projects/AgentWorkforce/agent-assistant/packages/harness/package.json
and the other absolute paths to repo-relative links (e.g.,
packages/harness/package.json or ./packages/harness/src/index.ts) so they
resolve on GitHub and for other contributors; update all occurrences (including
the package.json, src/index.ts, subpath-exports.test.ts,
nested-subagent-runner.ts, and nested-subagent-runner.test.ts links mentioned in
the comment) to use the repository-relative form consistently.
In `@packages/harness/src/nested-subagent-runner.test.ts`:
- Around line 511-537: The test currently sets a throwing getter on
globalThis.fetch but still relies on the module imported at top-level, so it
won't catch top-level fetch reads during module initialization; refactor the
test to set the throwing getter before dynamically importing the module that
exports createNestedSubagentRunner (i.e., import('./nested-subagent-runner') or
similar), then call createNestedSubagentRunner() from that imported module and
assert it does not throw and that the getter was not called; finally restore the
original globalThis.fetch descriptor in a finally block. Ensure you reference
createNestedSubagentRunner and the './nested-subagent-runner' module when
locating the change.
In `@packages/harness/src/nested-subagent-runner.ts`:
- Around line 187-199: The calls to options.createHarness and
options.composeInstructions need to be wrapped in the same failure boundary as
runtime.runTurn so any thrown errors are normalized into a TaskToolResult (an
isolated subagent failure) instead of rejecting createNestedSubagentRunner;
update createNestedSubagentRunner to catch exceptions from
options.createHarness(...) and options.composeInstructions(...) just like
runtime.runTurn, convert those exceptions into the same TaskToolResult failure
shape returned for runTurn errors, and apply the same change to the other
occurrence (the block around the code referenced at the second location
corresponding to lines 223-236) so all injected seams consistently return an
isolated subagent failure rather than throwing.
- Around line 59-67: The fallback order prefers metadata.parentTraceId before
metadata.childTraceId which causes nested subagents to attach to an ancestor
trace; change the check order so metadata?.childTraceId is consulted before
metadata?.parentTraceId (i.e., swap the two metadata checks) so
readTraceIdCandidate(metadata?.childTraceId) comes before
readTraceIdCandidate(metadata?.parentTraceId); keep the other checks
(readTraceIdCandidate(input.parentContext.traceId),
readTraceIdCandidate(input.parentContext.parentTraceId),
readTraceIdCandidate(input.parentContext.childTraceId),
readTraceIdCandidate(metadata?.traceId), and input.parentContext.turnId)
unchanged.
In `@packages/harness/src/subagent-registry.ts`:
- Around line 326-344: The getParentTraceId function currently prefers
parentTraceId over childTraceId when recovering the current trace, which breaks
nested-task tracing; update the fallback order in both places to prefer
childTraceId before parentTraceId (keep traceId first). Specifically, in
getParentTraceId swap the checks so you call readTraceIdCandidate on
parentContext.childTraceId before parentContext.parentTraceId, and likewise
prefer metadata?.childTraceId before metadata?.parentTraceId; keep using
readTraceIdCandidate for all checks.
---
Outside diff comments:
In `@packages/harness/src/index.ts`:
- Around line 2-21: The new root exports expose createRuntimePolicy and the
HarnessRuntimePolicy* type names prematurely; remove the direct exports of
createRuntimePolicy and all HarnessRuntimePolicy, HarnessRuntimePolicyConfig,
HarnessRuntimePolicyDecision, HarnessRuntimePolicyEvent,
HarnessRuntimePolicyEventKind, HarnessRuntimePolicyGuard,
HarnessRuntimePolicyGuardDecision, HarnessRuntimePolicyGuardInput, and
HarnessRuntimePolicySanitizeResult from this file and restore the existing
deprecation-wrapper pattern by leaving runtime-policy exports only in the
dedicated runtime-policy slice (or re-add a deprecated passthrough if your
pattern requires it) so the runtime-policy API surface is not published at the
root.
---
Nitpick comments:
In `@packages/harness/src/nested-subagent-runner.test.ts`:
- Around line 609-659: The wall-clock assertion that elapsedMs (measured after
calling parentHarness.runTurn) must be < 90ms is too tight given the 50ms
artificial delay in the subagent runTurn; increase the threshold to a more
lenient value (e.g., 150ms or 200ms) by updating the
expect(elapsedMs).toBeLessThan(...) assertion so tests are not flaky under CI
scheduling; adjust the assertion near where elapsedMs is computed in
nested-subagent-runner.test.ts (the block around parentHarness.runTurn / runTurn
and the expect(...).toBeLessThan call).
- Around line 51-60: The fixture default name in makeSubagent (returning a
HarnessSubagent) currently uses the product-specific string 'doc-drafter';
change the name property to a product-neutral value like 'example-drafter' (or
similar neutral token) so tests don’t leak product vocabulary—update the name
field in the makeSubagent function and keep all other fields identical.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 60d7e2c2-d94b-44c8-8937-f1ce8a7c1cc9
⛔ Files ignored due to path filters (1)
workflows/generated/sage-v2-substrate-extraction-pr.tsis excluded by!**/generated/**
📒 Files selected for processing (19)
docs/architecture/sage-v2-nested-subagent-runner-80-to-100-validation.mddocs/architecture/sage-v2-nested-subagent-runner-claude-peer-review.mddocs/architecture/sage-v2-nested-subagent-runner-claude-reflection.mddocs/architecture/sage-v2-nested-subagent-runner-executor-reflection.mddocs/architecture/sage-v2-nested-subagent-runner-fresh-eyes-reflection.mddocs/architecture/sage-v2-nested-subagent-runner-fresh-eyes-review.mddocs/architecture/sage-v2-nested-subagent-runner-lead-reflection.mddocs/architecture/sage-v2-nested-subagent-runner-plan.mddocs/architecture/sage-v2-nested-subagent-runner-validation-reflection.mddocs/architecture/sage-v2-substrate-extraction-map.mddocs/index.mdpackages/harness/package.jsonpackages/harness/src/index.tspackages/harness/src/nested-subagent-runner.test.tspackages/harness/src/nested-subagent-runner.tspackages/harness/src/registries/index.tspackages/harness/src/subagent-registry.test.tspackages/harness/src/subagent-registry.tspackages/harness/src/subpath-exports.test.ts
|
Addressed the review feedback in 6bcd57e.\n\nChanges made:\n- Removed the leaked runtime-policy package/root exports from this nested-runner PR slice.\n- Wrapped createHarness and composeInstructions in the nested runner failure boundary so seam failures return isolated task results.\n- Fixed trace fallback ordering so current child traces win where required, with regression coverage.\n- Updated the Workers fetch compatibility test to install the throwing fetch getter before dynamic import.\n- Relaxed the timing-sensitive concurrency assertions and switched the fixture subagent name to example-drafter.\n- Replaced machine-local doc links with repo-relative links and updated the fresh-eyes verdict after the scope fix.\n\nValidation:\n- npm test -w @agent-assistant/harness -- src/nested-subagent-runner.test.ts src/subagent-registry.test.ts src/subpath-exports.test.ts\n- npm run build -w @agent-assistant/harness\n- Remote Build, Test & Typecheck: passing on run 25496101224 |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
packages/harness/src/index.ts (1)
89-94: ⚡ Quick winAdd a compatibility test for the new root wrapper.
createNestedSubagentRunner()now follows the deprecated-root shim pattern, but the existing root-import warning test still doesn't exercise it. A small assertion there would keep this public compatibility path from drifting untested.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/harness/src/index.ts` around lines 89 - 94, The root-import warning test doesn't call the new wrapper createNestedSubagentRunner, so add an assertion that exercises this compatibility path: in the existing root-import warning test, require/import createNestedSubagentRunner and call it (mock or spy warnDeprecatedRootImport or the logger/console warning) to assert a deprecation warning was emitted, and optionally compare the return value to calling createNestedSubagentRunnerFromSubpath or ensure it returns the expected runner shape; reference the functions createNestedSubagentRunner, createNestedSubagentRunnerFromSubpath and the warnDeprecatedRootImport shim when adding the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/harness/src/nested-subagent-runner.ts`:
- Around line 175-183: childCounters (Map<string, number>) leaks because it
stores string keys forever; replace it with a GC-friendly structure (e.g. a
WeakMap keyed by the parent context object) or add explicit cleanup when a
parent turn finishes. Concretely, stop using childCounters:Map<string,number>
and instead use a WeakMap<object, Map<string,number>> (keyed by
input.parentContext or filteredParentContext from
filterParentContextForSubagent), then look up/create the inner Map and increment
the counter using childCounterKey(input, parentTraceId) (or a simpler
turn-specific key) so entries drop when the parentContext is GC'd;
alternatively, if you prefer explicit lifecycle, add a cleanup hook on the
runner to delete the related entry from childCounters using childCounterKey when
a parent turn completes (functions to update: the closure variables
childCounters, usage in the returned async (input) handler, and any code that
computes parentTraceId via resolveParentTraceId).
- Around line 59-65: The fallback chain is checking
input.parentContext.parentTraceId before input.parentContext.childTraceId, which
causes descendants to attach to the ancestor trace; in nested-subagent-runner.ts
reorder the readTraceIdCandidate calls so that
readTraceIdCandidate(input.parentContext.childTraceId) is tried before
readTraceIdCandidate(input.parentContext.parentTraceId) (keep the rest of the
chain as-is, including metadata?.childTraceId before metadata?.parentTraceId) so
child trace IDs are preferred when present.
---
Nitpick comments:
In `@packages/harness/src/index.ts`:
- Around line 89-94: The root-import warning test doesn't call the new wrapper
createNestedSubagentRunner, so add an assertion that exercises this
compatibility path: in the existing root-import warning test, require/import
createNestedSubagentRunner and call it (mock or spy warnDeprecatedRootImport or
the logger/console warning) to assert a deprecation warning was emitted, and
optionally compare the return value to calling
createNestedSubagentRunnerFromSubpath or ensure it returns the expected runner
shape; reference the functions createNestedSubagentRunner,
createNestedSubagentRunnerFromSubpath and the warnDeprecatedRootImport shim when
adding the test.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: dd690f6e-76ed-4903-83d1-7355513feb1e
📒 Files selected for processing (7)
docs/architecture/sage-v2-nested-subagent-runner-fresh-eyes-review.mdpackages/harness/src/index.tspackages/harness/src/nested-subagent-runner.test.tspackages/harness/src/nested-subagent-runner.tspackages/harness/src/subagent-registry.test.tspackages/harness/src/subagent-registry.tspackages/harness/src/subpath-exports.test.ts
✅ Files skipped from review due to trivial changes (1)
- docs/architecture/sage-v2-nested-subagent-runner-fresh-eyes-review.md
|
Addressed the second review pass in af46554. Changes made:
Validation:
|
Summary
First Sage v2 substrate extraction slice:
Validation
Follow-ups