feat(agent): codify chat → reasoning → worker spawn hierarchy by senamakel · Pull Request #2026 · tinyhumansai/openhuman

senamakel · 2026-05-18T01:19:01Z

Summary

New AgentTier enum ({Chat, Reasoning, Worker}, default Worker) on AgentDefinition; tagged orchestrator = chat and planner = reasoning. Everything else inherits the worker default.
Loader-time validate_tier_hierarchy() enforces: chat must not spawn chat, reasoning must not spawn reasoning, worker must not list any subagents. Skill-wildcard entries are exempt (they collapse to one delegate_to_integrations_agent tool pointed at a worker).
Prompt-level rules added to orchestrator/prompt.md (hand off to reasoning tier for sustained thinking; never spawn chat) and planner/prompt.md (never delegate to another reasoning agent).
New "Spawn hierarchy and tiers" section in gitbooks/developing/architecture/agent-harness.md with the ASCII diagram, tier table, and enforcement notes.
Registry::load() re-validates after merging workspace TOML overrides so custom user agents are held to the same contract.

Problem

The newly-introduced chat tier (fast UX model on the chat hint, reasoning-quick-v1 / Kimi K2.6 Turbo) is great for TTFT but weak at sustained multi-step reasoning. There was no codified rule that:

The chat tier must not spawn another chat agent (defeats the fast-tier purpose; doubles TTFT).
The reasoning tier must not spawn another reasoning agent (chains of slow models re-decompose the same problem and blow up depth).
Workers are leaves (so the parent always sees one compact result, not a transcript of nested delegations).
Total spawn chain depth has a ceiling.

Without these rules, an over-eager router or a custom user TOML can produce chat → chat → chat chains or recursive reasoning loops that burn tokens and latency without buying capability.
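For reviewers, the rules above reduce to a small predicate over (parent, child) tiers. The following is a minimal sketch, not code from this PR: `may_spawn` is a hypothetical helper, and the enum only mirrors the `AgentTier` shape described in the Summary. It assumes the intended table is chat → reasoning or worker, reasoning → worker, worker → nothing. The description only names same-tier rejection for the loader, so treating reasoning → chat as disallowed is an assumption.

```rust
/// Hypothetical mirror of the `AgentTier` enum described in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum AgentTier {
    Chat,
    Reasoning,
    /// Default tier, so untagged agents are treated as leaves.
    #[default]
    Worker,
}

/// Returns true when `parent` may list `child` as a subagent.
/// Hypothetical helper; the real check lives in the loader.
pub fn may_spawn(parent: AgentTier, child: AgentTier) -> bool {
    matches!(
        (parent, child),
        // Chat hands off to slower tiers, never to another chat agent.
        (AgentTier::Chat, AgentTier::Reasoning)
            | (AgentTier::Chat, AgentTier::Worker)
            // Reasoning plans for workers only.
            | (AgentTier::Reasoning, AgentTier::Worker)
        // Worker never appears as a parent: workers are leaves.
    )
}
```

Under this table a chat → chat → chat chain fails at the first hop, because `may_spawn(AgentTier::Chat, AgentTier::Chat)` is false.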

Solution

Codify the hierarchy at three layers so the rules are visible to humans, the model, and the registry loader:

Data model (harness/definition.rs): AgentTier enum + agent_tier field. Default Worker so existing specialists need no edits. Documented contract on the enum doc comment.
Loader validation (agents/loader.rs): validate_tier_hierarchy() walks subagents lists and rejects same-tier and worker-with-subagents entries. Called from load_builtins() and from AgentDefinitionRegistry::load() after workspace overrides are merged.
TOML tagging: orchestrator → chat; planner → reasoning; all others inherit worker.
Prompts: chat-tier and reasoning-tier rules added to the two agents that occupy those tiers today.
Doc: canonical "Spawn hierarchy and tiers" section with the ASCII diagram and a tier table; runtime depth gate (MAX_SPAWN_DEPTH = 3) is referenced as the planned defence-in-depth follow-up.
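The loader check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `agents/loader.rs`: the `SubagentEntry` type, the field layout of `AgentDefinition`, the `String` error type, and the choice to skip unknown subagent ids are all hypothetical. It also assumes a worker that lists even a skill wildcard is rejected, which the test name `allows_skill_wildcards_on_any_non_worker_tier` suggests but does not confirm.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum AgentTier {
    Chat,
    Reasoning,
    #[default]
    Worker,
}

/// Hypothetical subagent entry: a named agent or a skill wildcard.
#[derive(Debug, Clone)]
pub enum SubagentEntry {
    Named(String),
    SkillWildcard,
}

/// Hypothetical, trimmed-down agent definition.
#[derive(Debug, Clone)]
pub struct AgentDefinition {
    pub id: String,
    pub agent_tier: AgentTier,
    pub subagents: Vec<SubagentEntry>,
}

/// Rejects worker-with-subagents and same-tier delegation.
pub fn validate_tier_hierarchy(defs: &[AgentDefinition]) -> Result<(), String> {
    // Build the id -> tier lookup once, then walk every subagents list.
    let tiers: HashMap<&str, AgentTier> =
        defs.iter().map(|d| (d.id.as_str(), d.agent_tier)).collect();

    for def in defs {
        // Workers are leaves (assumption: this includes wildcard entries).
        if def.agent_tier == AgentTier::Worker && !def.subagents.is_empty() {
            return Err(format!("worker agent '{}' must not list subagents", def.id));
        }
        for entry in &def.subagents {
            match entry {
                // Wildcards collapse to one delegation tool aimed at a worker.
                SubagentEntry::SkillWildcard => continue,
                SubagentEntry::Named(child) => {
                    // Unknown ids are skipped here (assumption).
                    if let Some(&child_tier) = tiers.get(child.as_str()) {
                        if child_tier == def.agent_tier {
                            return Err(format!(
                                "{:?}-tier agent '{}' must not spawn {:?}-tier agent '{}'",
                                def.agent_tier, def.id, child_tier, child
                            ));
                        }
                    }
                }
            }
        }
    }
    Ok(())
}
```

The walk visits each subagent entry once, which matches the O(n_agents * n_subagents) cost noted under Impact.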

Tests: 7 new contract tests (orchestrator_is_chat_tier, planner_is_reasoning_tier, other_builtins_default_to_worker_tier, rejects_chat_to_chat_delegation, rejects_reasoning_to_reasoning_delegation, rejects_worker_with_subagents, allows_skill_wildcards_on_any_non_worker_tier). All 32 agents::loader tests pass.

Submission Checklist

Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
Diff coverage ≥ 80% — new code in loader.rs (validate_tier_hierarchy) is covered by the new contract tests (chat→chat, reasoning→reasoning, worker-with-subagents, skill-wildcards-allowed, plus the three tier-tag assertions). New AgentTier field is exercised in every loader test. No Vitest changes — Rust-only.
N/A: Coverage matrix updated — behaviour-only / architecture change; no new user-visible feature row.
N/A: All affected feature IDs from the matrix are listed in the PR description under ## Related — no matrix entries touched.
No new external network dependencies introduced (mock backend used per Testing Strategy)
N/A: Manual smoke checklist updated — change is internal to the agent harness; no release-cut surface.
N/A: Linked issue closed via Closes #NNN — no issue tracker entry for this proactive refactor.

Impact

Runtime: Desktop core only. No frontend, Tauri, mobile, or CLI surface affected.
Performance: Loader runs the new validation once at boot (O(n_agents * n_subagents), trivial). No per-spawn overhead.
Security: Tightens the spawn surface — workers can no longer be tagged with subagents in custom TOMLs, and reasoning/chat loops are statically rejected at boot rather than discovered at runtime.
Migration: Backwards-compatible. Existing custom user TOMLs without an agent_tier field default to worker (so they keep working unless they also declared subagents — which would now correctly fail).
Follow-up: Runtime MAX_SPAWN_DEPTH = 3 task-local gate + SpawnDepthExceeded error variant is referenced in the doc as defence-in-depth; deferred to a separate PR to keep this change focused.
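Since the runtime gate is deferred, the following is only a sketch of what a depth check could look like. The constant name and the `SpawnDepthExceeded` variant come from this description; everything else is assumed, including the `AgentError` enum, the `enter_spawn` helper, the choice to pass depth explicitly rather than through a task-local, and the convention that the root agent sits at depth 0.

```rust
/// Planned ceiling on spawn chain depth, as named in this PR.
pub const MAX_SPAWN_DEPTH: usize = 3;

/// Hypothetical error enum carrying the variant named in this PR.
#[derive(Debug, PartialEq, Eq)]
pub enum AgentError {
    SpawnDepthExceeded { depth: usize, max: usize },
}

/// Computes the child's depth, or fails if the chain would grow too deep.
/// `parent_depth` is the depth of the spawning agent (root = 0, an assumption).
pub fn enter_spawn(parent_depth: usize) -> Result<usize, AgentError> {
    let depth = parent_depth + 1;
    if depth > MAX_SPAWN_DEPTH {
        Err(AgentError::SpawnDepthExceeded {
            depth,
            max: MAX_SPAWN_DEPTH,
        })
    } else {
        Ok(depth)
    }
}
```

A gate of this shape is independent of tier tags, so it would still stop a chain built from custom agents whose tiers happen to pass the loader check.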

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Key: N/A
URL: N/A

Commit & Branch

Branch: feat/agent-spawn-hierarchy-tiers
Commit SHA: 1d2efd4

Validation Run

N/A: pnpm --filter openhuman-app format:check — Rust-only change
N/A: pnpm typecheck — Rust-only change
Focused tests: cargo test --lib agents::loader → 32/32 pass (including the 7 new tier contract tests)
Rust fmt/check (if changed): cargo fmt && cargo check clean
N/A: Tauri fmt/check (if changed) — Tauri shell untouched

Validation Blocked

command: N/A
error: N/A
impact: N/A

Behavior Changes

Intended behavior change: Loader now rejects agent registries that declare chat→chat or reasoning→reasoning delegation, or workers with non-empty subagent lists.
User-visible effect: None at runtime today (no built-in violates the contract). Custom user TOMLs that violate it will now fail at boot with a descriptive error instead of behaving subtly wrong at spawn time.
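To illustrate the boot-time failure for custom agents, here is a minimal sketch of merge-then-validate. It is not the `AgentDefinitionRegistry::load()` implementation: `RawAgent` stands in for an already-parsed user TOML, `load_registry` is a hypothetical function, and only the worker-leaf rule is checked to keep the example short.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum AgentTier {
    Chat,
    Reasoning,
    #[default]
    Worker,
}

/// Hypothetical parsed form of a user TOML; `agent_tier` is None when omitted.
#[derive(Debug, Clone)]
pub struct RawAgent {
    pub id: String,
    pub agent_tier: Option<AgentTier>,
    pub subagents: Vec<String>,
}

#[derive(Debug, Clone)]
pub struct AgentDefinition {
    pub id: String,
    pub agent_tier: AgentTier,
    pub subagents: Vec<String>,
}

impl From<RawAgent> for AgentDefinition {
    fn from(raw: RawAgent) -> Self {
        AgentDefinition {
            id: raw.id,
            // A missing `agent_tier` falls back to the Worker default.
            agent_tier: raw.agent_tier.unwrap_or_default(),
            subagents: raw.subagents,
        }
    }
}

/// Merges overrides over built-ins by id, then validates the merged set.
pub fn load_registry(
    builtins: Vec<AgentDefinition>,
    overrides: Vec<RawAgent>,
) -> Result<BTreeMap<String, AgentDefinition>, String> {
    let mut merged: BTreeMap<String, AgentDefinition> =
        builtins.into_iter().map(|d| (d.id.clone(), d)).collect();
    for raw in overrides {
        merged.insert(raw.id.clone(), raw.into());
    }
    // Validation runs after the merge so overrides face the same contract.
    for def in merged.values() {
        if def.agent_tier == AgentTier::Worker && !def.subagents.is_empty() {
            return Err(format!(
                "agent '{}' is worker tier but lists subagents {:?}",
                def.id, def.subagents
            ));
        }
    }
    Ok(merged)
}
```

In this sketch, an untagged custom agent with an empty `subagents` list loads as a worker, while the same agent with a non-empty list returns an error before any spawn happens.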

Parity Contract

Legacy behavior preserved: Yes — agent_tier defaults to worker, every existing built-in keeps its current subagent surface, and all pre-existing tests still pass.
Guard/fallback/dispatch parity checks: Skill-wildcard expansion is intentionally exempt from the tier check (documented inline) because it always routes to the integrations_agent worker via a single delegation tool — not a recursive spawn.

Duplicate / Superseded PR Handling

Duplicate PR(s): N/A
Canonical PR: N/A
Resolution: N/A

Summary by CodeRabbit

New Features
- Enforced a three-tier agent spawn hierarchy (Chat, Reasoning, Worker) at registry/load time to prevent invalid delegation chains.
- Clarified orchestrator/planner roles and updated prompts to enforce tiered delegation rules.
- Planned runtime spawn-depth gate (max chain depth = 3) documented but not yet activated.
Chores
- Updated built-in agent definitions, docs, and tests to reflect tier-based spawning.

Introduces an `AgentTier` ({Chat, Reasoning, Worker}, default Worker) field on AgentDefinition and a loader-time `validate_tier_hierarchy` check so the spawn surface mirrors the cost/latency split between models:

* chat (fast UX, e.g. orchestrator) → reasoning OR worker, never chat
* reasoning (deep thinking, e.g. planner) → worker, never reasoning
* worker (leaf executors) → nothing in `subagents`

Tags `orchestrator = chat` and `planner = reasoning`; all other built-ins inherit the worker default. Skill-wildcard entries are exempt because they collapse to a single `delegate_to_integrations_agent` tool aimed at a worker.

Adds matching prompt-level rules to orchestrator/prompt.md and planner/prompt.md, and a new "Spawn hierarchy and tiers" section in gitbooks/developing/architecture/agent-harness.md. Registry::load() re-validates after merging workspace TOML overrides so custom user agents are held to the same contract.

Runtime depth gate (MAX_SPAWN_DEPTH = 3 task-local) is referenced in the doc and prompts as defence-in-depth but is deferred to a follow-up.


coderabbitai · 2026-05-18T01:19:14Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 05f488a1-1baf-4d19-a15c-20b457316568

📥 Commits

Reviewing files that changed from the base of the PR and between 1d2efd4 and ab14f2e.

📒 Files selected for processing (3)

gitbooks/developing/architecture/agent-harness.md
src/openhuman/agent/agents/orchestrator/prompt.md
src/openhuman/agent/agents/planner/prompt.md

✅ Files skipped from review due to trivial changes (2)

src/openhuman/agent/agents/planner/prompt.md
gitbooks/developing/architecture/agent-harness.md

📝 Walkthrough

This PR adds tier-based spawn-hierarchy validation to the agent harness. Built-in agents (orchestrator=Chat, planner=Reasoning) are assigned tiers that enforce static delegation rules (Chat→Reasoning/Worker, Reasoning→Worker, Worker is leaf). A new validator runs at loader and registry build time, while test fixtures are updated to include the required agent_tier field.

Changes

Agent Tier Hierarchy and Validation

| Layer / File(s) | Summary |
| --- | --- |
| Tier Definition and Core Data Structures `src/openhuman/agent/harness/definition.rs` | `AgentTier` enum (Chat, Reasoning, Worker) and `agent_tier: AgentTier` field added to `AgentDefinition` with serde defaults and comprehensive tier-semantics documentation. |
| Tier Hierarchy Validation Implementation `src/openhuman/agent/agents/loader.rs` | `validate_tier_hierarchy()` function builds a tier lookup, iterates subagents, enforces tier constraints (Worker=leaf, Chat/Reasoning no-self-delegation, Skills wildcard exempt), with unit tests verifying built-in assignments and failure cases. |
| Built-in Loader and Module Exports `src/openhuman/agent/agents/loader.rs`, `src/openhuman/agent/agents/mod.rs` | `load_builtins()` validates tier hierarchy on loaded built-ins; validator is re-exported from `agents::mod.rs` public API. |
| Registry Load-time Validation `src/openhuman/agent/harness/definition.rs` | `AgentDefinitionRegistry::load()` re-validates merged (custom + built-in) definitions after overrides, surfacing hierarchy violations with context. |
| Built-in Agent Tier Assignments and Prompts `src/openhuman/agent/agents/orchestrator/agent.toml`, `src/openhuman/agent/agents/orchestrator/prompt.md`, `src/openhuman/agent/agents/planner/agent.toml`, `src/openhuman/agent/agents/planner/prompt.md` | Orchestrator assigned `Chat` tier with delegation rules (use `reasoning` or `worker`, never chat→chat); planner assigned `Reasoning` tier with worker-only delegation constraints. Prompts and TOML updated with explicit tier-based handoff guidance. |
| Test Fixture Updates Across Harness `src/openhuman/agent/harness/builtin_definitions.rs`, `src/openhuman/agent/harness/definition_tests.rs`, `src/openhuman/agent/harness/payload_summarizer.rs`, `src/openhuman/agent/harness/subagent_runner/ops_tests.rs`, `src/openhuman/channels/runtime/dispatch.rs`, `src/openhuman/tools/orchestrator_tools.rs` | All test-only `AgentDefinition` constructors updated with explicit `agent_tier: AgentTier::Worker` to match new required field. |
| Architecture Documentation `gitbooks/developing/architecture/agent-harness.md` | New "Spawn hierarchy and tiers" section documents tier constraints, loader-time static validation, runtime spawn-depth enforcement, and status notes on implementation coverage. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

tinyhumansai/openhuman#1957: Updates test-only agent definitions in builtin_definitions.rs that now require tier assignments per this PR's schema changes.

Suggested labels

working

Poem

🐰 I hop through tiers where thoughts align,
Chat greets the user, plans trace the line,
Reasoning crafts maps for workers to run,
Load-time guards keep the spawn-chain to one,
A rabbit applauds: rules checked, tasks done.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and precisely describes the main change: introduction of a formal spawn-hierarchy structure for agents organized into three tiers (chat, reasoning, worker) with arrows indicating the delegation flow. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

src/openhuman/agent/agents/planner/prompt.md (1)

43-43: 💤 Low value

Clarify the tier prohibition phrasing.

The parenthetical examples group "no planner-spawns-planner" (reasoning→reasoning) with "no planner-spawns-orchestrator" (reasoning→chat) under the label "Never delegate to another reasoning agent", which incorrectly implies the orchestrator is a reasoning-tier agent. The orchestrator is chat tier (per orchestrator/agent.toml line 13).

The rule is correct (reasoning tier can only spawn worker tier), but the phrasing could be clearer.

Suggested rephrasing for clarity

-**You are the reasoning tier.** The chat-tier Orchestrator handed off to you because the task needs sustained thinking. Compose plans for the **worker tier** — `code_executor`, `researcher`, `critic`, `integrations_agent`, `archivist`. **Never delegate to another reasoning agent** (no planner-spawns-planner, no planner-spawns-orchestrator); the loader and the harness depth gate will reject it. If a single worker can't cover a node, split the node — don't smuggle a second reasoning hop in.
+**You are the reasoning tier.** The chat-tier Orchestrator handed off to you because the task needs sustained thinking. Compose plans for the **worker tier** — `code_executor`, `researcher`, `critic`, `integrations_agent`, `archivist`. **Never delegate to chat or reasoning tiers** (no planner-spawns-planner, no planner-spawns-orchestrator); the loader and the harness depth gate will reject it. If a single worker can't cover a node, split the node — don't smuggle a second reasoning hop in.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/agent/agents/planner/prompt.md` at line 43, The phrasing
incorrectly groups "no planner-spawns-orchestrator" with reasoning→reasoning
examples and may imply the orchestrator is a reasoning-tier agent; update the
sentence in the "You are the reasoning tier." paragraph to state clearly that
the reasoning tier may only spawn worker-tier agents (code_executor, researcher,
critic, integrations_agent, archivist) and not other reasoning or chat-tier
agents, and remove or reword the parenthetical so it does not list
"orchestrator" as an example of a reasoning agent (refer to the symbol
orchestrator and the worker names to locate the text to edit).

gitbooks/developing/architecture/agent-harness.md (1)

199-204: ⚡ Quick win

Clarify implementation status of runtime depth gate.

The description on line 202 uses present tense ("caps total spawn chain depth") but the status note on line 204 indicates this is "sketched" rather than live. Consider rewording the runtime enforcement description to make it clear this is a planned safeguard, not yet active.

✏️ Suggested clarification

-2. **Runtime depth gate (dynamic).** Independent of tier, the sub-agent runner caps total spawn chain depth at `MAX_SPAWN_DEPTH = 3` via a task-local counter incremented across `run_subagent`. A user-shipped TOML that drops the tier annotation still can't recurse past three hops. The harness surfaces this as the `SpawnDepthExceeded` agent error.
+2. **Runtime depth gate (dynamic, planned).** Independent of tier, the sub-agent runner will cap total spawn chain depth at `MAX_SPAWN_DEPTH = 3` via a task-local counter incremented across `run_subagent`. A user-shipped TOML that drops the tier annotation will not be able to recurse past three hops. The harness will surface this as the `SpawnDepthExceeded` agent error.

Alternatively, if the runtime gate is partially implemented, clarify which parts are live vs. sketched.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@gitbooks/developing/architecture/agent-harness.md` around lines 199 - 204,
The runtime depth-gate description uses present-tense but the status note says
it's only sketched; update the wording to clearly mark the runtime enforcement
as planned/partial or describe which pieces are implemented vs. sketched:
mention MAX_SPAWN_DEPTH, the task-local counter sketch in
harness/fork_context.rs, and the gating in subagent_runner::run_subagent as
not-yet-fully-active (or list which of those are already implemented), while
keeping the loader-time enforcement (agents::loader::validate_tier_hierarchy and
the agent_tier field) identified as live and keep SpawnDepthExceeded referenced
as the intended surfaced error.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@gitbooks/developing/architecture/agent-harness.md`:
- Around line 174-182: The fenced ASCII diagram block beginning with "Chat      
(fast, UX-focused — e.g. orchestrator on `chat` hint)" is missing a language
specifier; update the opening triple backticks to include a language (e.g.,
```text) so the diagram renders correctly, leaving the block contents unchanged.

In `@src/openhuman/agent/agents/orchestrator/prompt.md`:
- Line 43: The text in the spawn hierarchy section asserts "Total chain depth is
capped at 3 hops by the harness" but the runtime enforcement is described
elsewhere as a planned gate (MAX_SPAWN_DEPTH = 3), so update the sentence in
prompt.md (the "Spawn hierarchy (hard rule)" line) to reflect that enforcement
is not yet live: either change to "will be capped at 3 hops" or append a
parenthetical note like "(enforcement tracked in `#XXXX` / planned via
MAX_SPAWN_DEPTH = 3)"; ensure references to MAX_SPAWN_DEPTH remain consistent
and add the issue/PR number if available.

---

Nitpick comments:
In `@gitbooks/developing/architecture/agent-harness.md`:
- Around line 199-204: The runtime depth-gate description uses present-tense but
the status note says it's only sketched; update the wording to clearly mark the
runtime enforcement as planned/partial or describe which pieces are implemented
vs. sketched: mention MAX_SPAWN_DEPTH, the task-local counter sketch in
harness/fork_context.rs, and the gating in subagent_runner::run_subagent as
not-yet-fully-active (or list which of those are already implemented), while
keeping the loader-time enforcement (agents::loader::validate_tier_hierarchy and
the agent_tier field) identified as live and keep SpawnDepthExceeded referenced
as the intended surfaced error.

In `@src/openhuman/agent/agents/planner/prompt.md`:
- Line 43: The phrasing incorrectly groups "no planner-spawns-orchestrator" with
reasoning→reasoning examples and may imply the orchestrator is a reasoning-tier
agent; update the sentence in the "You are the reasoning tier." paragraph to
state clearly that the reasoning tier may only spawn worker-tier agents
(code_executor, researcher, critic, integrations_agent, archivist) and not other
reasoning or chat-tier agents, and remove or reword the parenthetical so it does
not list "orchestrator" as an example of a reasoning agent (refer to the symbol
orchestrator and the worker names to locate the text to edit).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6076be2b-20f7-4cec-b08f-03e5547aaf68

📥 Commits

Reviewing files that changed from the base of the PR and between ac245a0 and 1d2efd4.

📒 Files selected for processing (14)

gitbooks/developing/architecture/agent-harness.md
src/openhuman/agent/agents/loader.rs
src/openhuman/agent/agents/mod.rs
src/openhuman/agent/agents/orchestrator/agent.toml
src/openhuman/agent/agents/orchestrator/prompt.md
src/openhuman/agent/agents/planner/agent.toml
src/openhuman/agent/agents/planner/prompt.md
src/openhuman/agent/harness/builtin_definitions.rs
src/openhuman/agent/harness/definition.rs
src/openhuman/agent/harness/definition_tests.rs
src/openhuman/agent/harness/payload_summarizer.rs
src/openhuman/agent/harness/subagent_runner/ops_tests.rs
src/openhuman/channels/runtime/dispatch.rs
src/openhuman/tools/orchestrator_tools.rs

Address CodeRabbit suggestions on PR tinyhumansai#2026:

- arch-doc ASCII diagram: add `text` language tag to the fenced block (markdownlint MD040).
- orchestrator/prompt.md, planner/prompt.md, agent-harness.md: soften "Total chain depth is capped at 3 hops by the harness" to reflect that the runtime `MAX_SPAWN_DEPTH` task-local is a planned follow-up; only the loader-time tier check is live today.

senamakel added 2 commits May 17, 2026 18:15

Merge remote-tracking branch 'upstream/main' into feat/agent-spawn-hi…

1d2efd4

…erarchy-tiers

senamakel requested a review from a team May 18, 2026 01:19

coderabbitai Bot added the working label ("A PR that is being worked on by the team.") May 18, 2026

coderabbitai Bot requested changes May 18, 2026

Comment thread gitbooks/developing/architecture/agent-harness.md Outdated

Comment thread src/openhuman/agent/agents/orchestrator/prompt.md Outdated

coderabbitai Bot approved these changes May 18, 2026

senamakel merged commit 0257b2e into tinyhumansai:main May 18, 2026
25 checks passed

senamakel merged 3 commits into tinyhumansai:main from senamakel:feat/agent-spawn-hierarchy-tiers
