
refactor(agent): unify the three agent-turn loops into one TurnEngine #3012

Merged
senamakel merged 15 commits into tinyhumansai:main from senamakel:refactor/unify-agent-turn-engine on May 30, 2026

Conversation

@senamakel

@senamakel senamakel commented May 30, 2026

Summary

  • Collapse the three near-duplicate agent-turn loops — Agent::turn (web/desktop chat), run_tool_call_loop (other channels + triage), and the subagent run_inner_loop — into a single engine::run_turn_engine that all three drive.
  • Everything that varied per caller is now a small seam the engine calls into: ToolSource, ProgressReporter, TurnObserver, CheckpointStrategy, ResponseParser.
  • The per-call tool executor (run_one_tool / run_agent_tool_call), the repeated-failure circuit breaker, and the ProviderDelta → AgentProgress stream forwarder are now shared by all three.
  • Net: ~1,000 lines of triplicated loop logic removed; tool_loop.rs is a ~30-line shim, the subagent + Agent loops are thin adapters.
  • Recursion/spawn-depth/parallelism semantics are untouched (they live outside the loop, in run_subagent/setup).

Problem

The same agentic loop (call LLM → parse tool calls → execute tools → append results → repeat to final text / cap) was implemented three times and had drifted: the repeated-failure circuit breaker existed in two of three; streaming in two of three; the subagent computed tool success via a fragile !starts_with("Error"); each carried its own copy of the per-call approval-gate + audit + tokenjuice path. Every fix had to be applied up to three times.
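To illustrate why the prefix check was fragile, here is a minimal sketch contrasting it with a structured result. The names `ToolOutcome`, `success_by_prefix`, and `outcome_from_result` are hypothetical and not taken from the codebase; the PR only states that sub-agent tool success is now structured.

```rust
/// Hypothetical structured result: success is carried as data,
/// not inferred from the output text.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolOutcome {
    pub output: String,
    pub success: bool,
}

/// The heuristic described above: a call counts as successful
/// unless its output starts with the literal prefix "Error".
pub fn success_by_prefix(output: &str) -> bool {
    !output.starts_with("Error")
}

/// Builds an outcome from a Result, so the flag reflects what the
/// executor actually observed rather than how the text happens to begin.
pub fn outcome_from_result(result: Result<String, String>) -> ToolOutcome {
    match result {
        Ok(output) => ToolOutcome { output, success: true },
        Err(message) => ToolOutcome { output: message, success: false },
    }
}
```

The heuristic misfires in both directions: a failure reported as `error: permission denied` (lowercase) passes as a success, while a successful output that begins with `Error handling guide` is flagged as a failure.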

Solution

A single loop in src/openhuman/agent/harness/engine/, parameterized by trait seams:

  • ToolSource — which tools are advertised (request_specs) + how a call executes (execute_call). Impls: RegistryToolSource (channels/CLI/triage), SubagentToolSource (lazy toolkit resolution + allowlist + progressive-disclosure handoff), AgentToolSource (session policy + per-call permission + execute_with_options).
  • ProgressReporter: TurnProgress (top-level Turn* events + streaming) vs SubagentProgress (nested Subagent* events, no streaming) vs NullProgress.
  • TurnObserver — caller-specific side effects (context management, transcript persistence, typed-history rebuild, worker-thread mirroring). The channel loop uses NullObserver; the subagent + Agent rebuild their own state.
  • CheckpointStrategy: ErrorCheckpoint (typed MaxIterationsExceeded) vs summarize-on-cap for the subagent + Agent.
  • ResponseParser: DefaultParser (native + XML fallback) vs DispatcherParser (the Agent's ToolDispatcher, incl. PFormat's positional grammar).
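The shape of a loop parameterized by seams can be sketched as follows. This is a simplified, synchronous illustration, not the engine's code: the trait and function names are borrowed from the list above, but every signature, the `Call` and `Outcome` types, and the toy implementations are assumptions. The ProgressReporter and TurnObserver seams are left out to keep it short.

```rust
/// One parsed tool call (hypothetical shape).
pub struct Call {
    pub name: String,
    pub args: String,
}

/// Seam: which tools are advertised and how a call executes.
pub trait ToolSource {
    fn request_specs(&self) -> Vec<String>;
    /// Returns (output, success).
    fn execute_call(&mut self, call: &Call) -> (String, bool);
}

/// Seam: how tool calls are extracted from a model response.
pub trait ResponseParser {
    fn parse(&self, response: &str) -> Vec<Call>;
}

/// Seam: what happens when the iteration cap is reached.
pub trait CheckpointStrategy {
    fn on_max_iter(&self, digest: &[String]) -> Result<String, String>;
}

pub struct Outcome {
    pub text: String,
    pub iterations: usize,
    pub hit_cap: bool,
}

/// The single loop: call model, parse, execute tools, append, repeat.
pub fn run_turn_engine(
    llm: &mut dyn FnMut(&[String], &[String]) -> String,
    tools: &mut dyn ToolSource,
    parser: &dyn ResponseParser,
    checkpoint: &dyn CheckpointStrategy,
    max_iterations: usize,
) -> Result<Outcome, String> {
    let mut history: Vec<String> = Vec::new();
    let mut digest: Vec<String> = Vec::new();
    for iteration in 1..=max_iterations {
        let specs = tools.request_specs();
        let response = llm(&history[..], &specs[..]);
        let calls = parser.parse(&response);
        if calls.is_empty() {
            // No tool calls: the response is the final text.
            return Ok(Outcome { text: response, iterations: iteration, hit_cap: false });
        }
        history.push(response);
        for call in &calls {
            let (output, success) = tools.execute_call(call);
            digest.push(format!("{} ok={}", call.name, success));
            history.push(output);
        }
    }
    // Cap reached: the strategy decides between an error and a summary.
    let text = checkpoint.on_max_iter(&digest)?;
    Ok(Outcome { text, iterations: max_iterations, hit_cap: true })
}

// Toy implementations, only to exercise the loop.
pub struct EchoTools;
impl ToolSource for EchoTools {
    fn request_specs(&self) -> Vec<String> { vec!["echo".to_string()] }
    fn execute_call(&mut self, call: &Call) -> (String, bool) {
        (call.args.clone(), call.name == "echo")
    }
}

pub struct LineParser;
impl ResponseParser for LineParser {
    fn parse(&self, response: &str) -> Vec<Call> {
        response.lines().filter_map(|line| {
            let rest = line.strip_prefix("CALL ")?;
            let (name, args) = rest.split_once(' ').unwrap_or((rest, ""));
            Some(Call { name: name.to_string(), args: args.to_string() })
        }).collect()
    }
}

pub struct ErrorCheckpoint;
impl CheckpointStrategy for ErrorCheckpoint {
    fn on_max_iter(&self, _digest: &[String]) -> Result<String, String> {
        Err("max iterations exceeded".to_string())
    }
}

pub struct SummarizeCheckpoint;
impl CheckpointStrategy for SummarizeCheckpoint {
    fn on_max_iter(&self, digest: &[String]) -> Result<String, String> {
        Ok(format!("stopped after {} tool calls", digest.len()))
    }
}
```

Swapping `ErrorCheckpoint` for `SummarizeCheckpoint` changes the outcome at the cap without touching the loop body, which is the point of routing per-caller variation through a seam.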

Agent::turn keeps its richer state: it materializes the engine's ChatMessage buffer from its typed ConversationMessage history each iteration (via an observer that holds &mut Agent), rebuilds the typed history (with reasoning_content round-trip + dispatcher.format_results), and persists the transcript — while the engine owns the loop.

Landed as 8 behavior-preserving commits (extract shared helpers → engine + channel shim → generalize seams → subagent → Agent groundwork → Agent → cleanup), each green.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) — the moved logic is exercised by the existing harness/agent/session suites, including failure paths (turn_errors_on_empty_*, turn_emits_checkpoint_at_max_iterations, circuit-breaker, policy-deny, unknown-tool); tool_loop_tests.rs updated for the shim.
  • N/A: behaviour-preserving refactor — changed lines are existing, already-tested loop logic relocated behind seams; they run under the agent/harness suites (10,564 lib tests pass locally). Will add targeted tests if CI diff-cover flags specific lines.
  • N/A: behaviour-only change — no feature rows added/removed/renamed in the coverage matrix.
  • N/A: no feature rows change, so no matrix feature IDs to list.
  • No new external network dependencies introduced.
  • N/A: no release-cut surface change — this is internal agent-loop plumbing.
  • N/A: no tracking issue for this refactor.

Impact

  • Affects all agent execution paths (desktop/web chat, non-web channels, triage, sub-agents). Behaviour-preserving by design.
  • Sanctioned consistency changes (previously inconsistent across the three loops; each noted in its commit body):
    • Sub-agents and web chat now run the repeated-failure circuit breaker + context-guard/stop-hooks.
    • Web chat now emits TurnStarted + TurnCostUpdated progress events and runs multimodal prep.
    • Sub-agent tool success is now structured (was !starts_with("Error")); sub-agent text-mode <tool_result> blocks drop the status= attribute.
  • No migration, no new persistence format. Agent.tool_dispatcher changed from Box to Arc (internal only).

Related

  • Closes:
  • Follow-up PR(s)/TODOs: none.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: refactor/unify-agent-turn-engine
  • Commit SHA: 96d0a41

Validation Run

  • N/A: pnpm --filter openhuman-app format:check — no frontend/TS changes in this PR.
  • N/A: pnpm typecheck — no frontend/TS changes.
  • Focused tests: cargo test --lib (10,564 passed; 2 pre-existing/environmental failures in composio::tools + macOS cwd_jail seatbelt, untouched by this diff).
  • Rust fmt/check (if changed): cargo fmt applied; cargo check --manifest-path Cargo.toml clean; clippy clean on changed files.
  • Tauri fmt/check (if changed): cargo check --manifest-path app/src-tauri/Cargo.toml clean.

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: consistency fixes listed under Impact (circuit breaker + context guard for all paths; web-chat progress events + multimodal prep; structured sub-agent tool success; sub-agent text-mode tool_result loses status=).
  • User-visible effect: web chat surfaces live cost/turn events + bails early with a root-cause summary on repeated identical tool failures; otherwise unchanged.
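A repeated-failure circuit breaker of the kind mentioned above could look roughly like this. It is a sketch under assumptions: the PR does not state the threshold, how "identical" is determined, or the type's name, so `FailureBreaker`, its fields, and the (tool name, error message) key are all hypothetical.

```rust
/// Hypothetical breaker: trips after `threshold` consecutive failures
/// with the same tool name and the same error message.
pub struct FailureBreaker {
    threshold: usize,
    last: Option<(String, String)>,
    streak: usize,
}

impl FailureBreaker {
    pub fn new(threshold: usize) -> Self {
        Self { threshold, last: None, streak: 0 }
    }

    /// Records one tool result. `error` is `None` on success.
    /// Returns true when the loop should bail out early.
    pub fn record(&mut self, tool: &str, error: Option<&str>) -> bool {
        match error {
            None => {
                // Any success clears the streak.
                self.last = None;
                self.streak = 0;
                false
            }
            Some(message) => {
                let key = (tool.to_string(), message.to_string());
                if self.last.as_ref() == Some(&key) {
                    self.streak += 1;
                } else {
                    // A different failure starts a new streak.
                    self.last = Some(key);
                    self.streak = 1;
                }
                self.streak >= self.threshold
            }
        }
    }
}
```

With a threshold of 3, three consecutive `("fetch", "timeout")` failures trip the breaker on the third call; a success or a different error in between resets the count.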

Parity Contract

  • Legacy behavior preserved: the canonical channel-loop body was moved verbatim into the engine; the Agent's typed ConversationMessage history, reasoning_content round-trip, KV-cache transcript prefix, EmptyProviderResponse error, and dispatcher (native/XML/PFormat) parse are all preserved.
  • Guard/fallback/dispatch parity checks: spawn-depth (MAX_SPAWN_DEPTH) + "sub-agents can't spawn" untouched (live in run_subagent setup); approval gate + audit rows shared via run_one_tool; force-text-mode preserved via advertising zero specs.
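The "advertise zero specs" mechanism for force-text-mode can be sketched with the predicate quoted in the review comments on this PR (`provider.supports_native_tools() && !tools.request_specs().is_empty()`). Everything else here is an assumption for illustration: the one-method `ToolSource` trait, the `LazySource` type, and the free function `use_native_tools`.

```rust
/// Hypothetical, reduced form of the tool-source seam.
pub trait ToolSource {
    fn request_specs(&self) -> Vec<String>;
}

/// A source that starts empty and can gain tools later.
pub struct LazySource {
    registered: Vec<String>,
}

impl LazySource {
    pub fn new() -> Self {
        Self { registered: Vec::new() }
    }

    pub fn register(&mut self, spec: &str) {
        self.registered.push(spec.to_string());
    }
}

impl ToolSource for LazySource {
    fn request_specs(&self) -> Vec<String> {
        self.registered.clone()
    }
}

/// Native tool calling is used only if the provider supports it AND
/// the source advertises at least one spec. A source that advertises
/// nothing therefore forces the text path.
pub fn use_native_tools(provider_supports_native: bool, tools: &dyn ToolSource) -> bool {
    provider_supports_native && !tools.request_specs().is_empty()
}
```

Because a source may register tools part-way through a turn, evaluating this predicate once before the loop can leave it stale; evaluating it on each iteration picks up the change.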

Duplicate / Superseded PR Handling

  • Duplicate PR(s): none
  • Canonical PR: this PR
  • Resolution: N/A

Summary by CodeRabbit

  • New Features

    • Unified turn engine powering all agent turn flows; consistent parsing, tool sourcing, and progress streaming.
    • New resumable max-iteration checkpointing and improved progress reporters with streaming deltas.
  • Bug Fixes

    • More reliable, consistent tool execution (timeouts, approval gating, failure tracking, and sanitization).
    • Reduced variance across entry points and sub-agents.
  • Refactor

    • Centralized turn/tool orchestration and simplified per-call execution paths.
  • Tests

    • Minor test import cleanup.

Review Change Stack

senamakel added 12 commits May 29, 2026 21:54
…gine/

Phase 1 of unifying the three agent-turn loops (Agent::turn, run_tool_call_loop,
subagent run_inner_loop) behind one engine. Introduces
src/openhuman/agent/harness/engine/ and lifts two pieces that were duplicated
across the loops, behavior-preserving:

- engine::run_one_tool — the per-call tool executor (start event, policy gate,
  scope guard, approval gate, execute+timeout, scrub/tokenjuice/cap/summarize,
  audit, completion event). Collapses ~400 lines of branches in tool_loop.rs
  into one call + a uniform push/writeln/circuit-breaker tail.
- engine::spawn_delta_forwarder — the ProviderDelta -> AgentProgress streaming
  forwarder, previously copied byte-for-byte in tool_loop.rs and turn.rs.

No behavior change. tool_loop tests: 24/24 pass.

Phase 2 of the agent-turn unification. Moves the canonical agentic loop body
out of run_tool_call_loop into engine::run_turn_engine, behavior-preserving, and
introduces the first axis seam:

- engine::ToolSource (+ RegistryToolSource) — owns tool advertisement
  (request_specs) and per-call execution (execute_call). The channel/CLI source
  resolves registry+extra under the visibility whitelist and delegates to
  run_one_tool, exactly as before.
- engine::run_turn_engine — the single loop (turn/iteration/cost/completed
  progress, stop hooks, context guard, token-budget trim, multimodal prep,
  streaming forwarder, native/text parse, circuit breaker, max-iter error).
- run_tool_call_loop is now a ~30-line adapter that builds a RegistryToolSource
  and hands off. Signature unchanged; bus/triage/summarizer/tests unaffected.

tool_loop + harness suites: 128/128 pass.
…int seams

Phase 3a. Adds the remaining axis seams so the subagent and Agent loops can route
through run_turn_engine next:

- ProgressReporter (TurnProgress / SubagentProgress / NullProgress) — the engine
  no longer names a concrete AgentProgress variant; flavor (Turn* vs Subagent*)
  + streaming-on/off is chosen by the impl. run_one_tool now reports tool
  start/complete through it too.
- TurnObserver (NullObserver) — per-caller side effects (context mgmt,
  transcript, worker-thread mirroring) around each step; all hooks default no-op.
- CheckpointStrategy (ErrorCheckpoint) — max-iteration outcome (error vs
  summarize). run_turn_engine now returns TurnEngineOutcome { text, iterations,
  cost } and accumulates a tool digest for summarizing strategies.

run_tool_call_loop wires TurnProgress + NullObserver + ErrorCheckpoint; behavior
unchanged. tool_loop + harness suites: 118/118 pass.

Phase 3. run_inner_loop is replaced by run_turn_engine + three subagent seams:

- SubagentToolSource — lazy toolkit resolution + allowlist gating + the
  progressive-disclosure handoff, with per-call execution now via the shared
  engine::run_one_tool. Sub-agents thereby gain the same approval gate, audit
  rows, credential scrub, tokenjuice and timeout as the channel loop, and tool
  success is now structured (not the fragile !starts_with("Error")). force-text
  mode falls out of advertising zero specs (engine skips native tools → XML
  fallback parse → batched [Tool results]).
- SubagentObserver — accumulates AggregatedUsage, persists the per-iteration
  transcript, and mirrors assistant intents / per-call results / text-mode
  batches / final responses to the spawn's worker thread.
- SubagentCheckpoint — summarize-on-cap (resumable checkpoint) with the
  deterministic-digest fallback; its summarization usage is folded back into
  the turn cost + AggregatedUsage by the engine.

Spawn-depth + can't-spawn-subagents enforcement is untouched (it lives in
run_subagent / run_typed_mode setup, outside the loop). Sanctioned behavior
changes: sub-agents now run stop-hooks + context-guard + token-trim, get
scrub/tokenjuice, and the text-mode tool_result block drops the status=
attribute. Engine TurnObserver grew on_assistant/on_tool_result/
on_results_batch/after_iteration hooks; CheckpointStrategy now returns
CheckpointOutcome { text, usage }.

subagent + tool_filter + harness_gap: 68/68; broad harness + agent: 460/460.
…dwork)

The engine's hardcoded native-first + XML-fallback parse becomes the DefaultParser
behind a ResponseParser seam. This lets Agent::turn (next) plug in its configured
ToolDispatcher — crucially PFormat, whose positional name[args] grammar the
built-in parser can't read — without the engine knowing about dispatchers.
DispatcherParser adapts any ToolDispatcher to the seam (dispatcher::ParsedToolCall
-> engine ParsedToolCall). Channel loop + subagent pass DefaultParser; behavior
unchanged.

Lets the turn engine's DispatcherParser hold a cheap clone of the dispatcher
without borrowing the Agent (whose history/context the turn observer borrows
mutably). Builder setter still takes Box; conversion to Arc happens at build().
No behavior change.
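A minimal sketch of the Box-to-Arc arrangement this commit describes, assuming a one-method `ToolDispatcher` trait and simplified `Agent`, `AgentBuilder`, `DispatcherParser`, and `AgentObserver` types (all shapes here are hypothetical; only the names and the Box setter / Arc-at-build() split come from the commit message):

```rust
use std::sync::Arc;

pub trait ToolDispatcher {
    fn name(&self) -> &'static str;
}

pub struct XmlDispatcher;
impl ToolDispatcher for XmlDispatcher {
    fn name(&self) -> &'static str { "xml" }
}

pub struct Agent {
    pub history: Vec<String>,
    pub tool_dispatcher: Arc<dyn ToolDispatcher>,
}

pub struct AgentBuilder {
    dispatcher: Option<Box<dyn ToolDispatcher>>,
}

impl AgentBuilder {
    pub fn new() -> Self {
        Self { dispatcher: None }
    }

    /// The setter still accepts a Box.
    pub fn tool_dispatcher(mut self, dispatcher: Box<dyn ToolDispatcher>) -> Self {
        self.dispatcher = Some(dispatcher);
        self
    }

    /// Conversion to Arc happens here, via the std `From<Box<T>>` impl.
    pub fn build(self) -> Option<Agent> {
        let boxed = self.dispatcher?;
        Some(Agent { history: Vec::new(), tool_dispatcher: Arc::from(boxed) })
    }
}

/// Owns its own handle to the dispatcher instead of borrowing the Agent.
pub struct DispatcherParser {
    pub dispatcher: Arc<dyn ToolDispatcher>,
}

/// Holds the Agent mutably for the duration of the turn.
pub struct AgentObserver<'a> {
    pub agent: &'a mut Agent,
}

impl<'a> AgentObserver<'a> {
    pub fn on_assistant(&mut self, text: &str) {
        self.agent.history.push(text.to_string());
    }
}

pub fn run_with_observer(agent: &mut Agent) -> (String, usize) {
    // Clone the Arc first; the parser no longer depends on `agent`.
    let parser = DispatcherParser { dispatcher: Arc::clone(&agent.tool_dispatcher) };
    // The observer can now take the exclusive borrow.
    let mut observer = AgentObserver { agent };
    observer.on_assistant("assistant reply");
    (parser.dispatcher.name().to_string(), observer.agent.history.len())
}
```

Had the parser held `&agent.tool_dispatcher` instead, that shared borrow would overlap with the observer's `&mut Agent` and the borrow checker would reject it.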
…rver hooks (Phase 4 groundwork)

- session/agent_tool_exec.rs: run_agent_tool_call — the Agent's per-call path
  (visibility gate, session policy, per-call permission, ToolPolicy.check,
  execute_with_options + payload summarizer, per-result budget) as a free fn so
  both Agent::execute_tool_call and the upcoming AgentToolSource share it.
  Agent::execute_tool_call now delegates to it (old body kept as
  execute_tool_call_legacy, dead, to delete in cleanup).
- engine: TurnObserver hooks enriched (on_assistant now carries display+raw
  text, reasoning_content, native + parsed calls; on_tool_result carries
  success) so Agent::turn can rebuild its typed ConversationMessage history.
  TurnEngineOutcome gains hit_cap.

agent + session::turn + session::tests: 164/164 pass.
…nEngine

Phase 4 — the third and final loop. Agent::turn now drives the shared
run_turn_engine via three seams, preserving its richer state:

- AgentToolSource: owns Arc/value clones of the Agent's tool state (disjoint
  from the &mut Agent the observer holds) and runs each call through the shared
  run_agent_tool_call; collects ToolCallRecords for post-turn hooks. Advertises
  specs only when the dispatcher's should_send_tool_specs() is true (PFormat/XML
  fall back to the engine's text path, same as subagent force-text-mode).
- DispatcherParser: plugs the Agent's ToolDispatcher (native/XML/PFormat) into
  the engine's ResponseParser seam, so PFormat's positional grammar is preserved.
- AgentObserver: borrows the Agent mutably — runs the ContextManager reduction +
  re-materializes the engine buffer from typed history each iteration, rebuilds
  the typed ConversationMessage history (Chat + AssistantToolCalls with
  reasoning_content round-trip + dispatcher.format_results), accumulates usage,
  persists the transcript. allow_empty_final()=false preserves the
  EmptyProviderResponse error.
- AgentCheckpoint: summarize-on-cap with streaming + deterministic fallback.

Supporting: tool_dispatcher is now Arc; turn_checkpoint moved to a session
sibling module; execute_tool_call delegates to the shared executor (old body
kept as execute_tool_call_legacy, deleted in Phase 5).

Sanctioned behavior changes (web chat now matches the channel loop): emits
TurnStarted + TurnCostUpdated progress events, gains the repeated-failure
circuit breaker, and runs multimodal prep. agent + session + harness: 482/482.
…ts, docs

- Delete Agent::execute_tool_call_legacy (the pre-extraction duplicate body);
  Agent::execute_tool_call now solely delegates to the shared run_agent_tool_call.
- Drop now-unused imports across turn.rs + engine/mod.rs re-exports; cargo fmt.
- Document the unified engine in gitbooks agent-harness architecture ("One
  engine, three entry points" + the five seams).

Full agent + channels::runtime lib suite: 980/980 pass.
@senamakel senamakel requested a review from a team May 30, 2026 07:57
@coderabbitai

coderabbitai Bot commented May 30, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f730a6e2-bdef-49f7-94bf-6e06cc5e917a

📥 Commits

Reviewing files that changed from the base of the PR and between 8fdbc90 and aac72b6.

📒 Files selected for processing (3)
  • src/openhuman/agent/harness/engine/core.rs
  • src/openhuman/agent/harness/engine/tools.rs
  • src/openhuman/agent/harness/session/turn.rs
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/openhuman/agent/harness/session/turn.rs
  • src/openhuman/agent/harness/engine/core.rs
  • src/openhuman/agent/harness/engine/tools.rs

📝 Walkthrough

Consolidates three agent loops (Agent::turn, run_tool_call_loop, run_subagent) into a single run_turn_engine with pluggable seams: ToolSource, ProgressReporter, TurnObserver, CheckpointStrategy, and ResponseParser; three call sites adapt by constructing concrete seam implementations and delegating iteration control to the engine.

Changes

Unified Turn Engine Consolidation

| Layer / File(s) | Summary |
| --- | --- |
| Engine seams and type contracts: src/openhuman/agent/harness/engine/checkpoint.rs, src/openhuman/agent/harness/engine/core.rs, src/openhuman/agent/harness/engine/parser.rs, src/openhuman/agent/harness/engine/progress.rs, src/openhuman/agent/harness/engine/state.rs, src/openhuman/agent/harness/engine/tool_source.rs, src/openhuman/agent/harness/engine/tools.rs, src/openhuman/agent/harness/engine/mod.rs | Defines CheckpointStrategy/CheckpointOutcome, ProgressReporter, TurnObserver, ResponseParser, ToolSource, ToolRunResult, and TurnEngineOutcome. |
| Turn engine core loop: src/openhuman/agent/harness/engine/core.rs | run_turn_engine implements the unified per-iteration loop: pre-dispatch checks, provider call (optional native-tool + streaming), response parsing, final-text streaming or tool-execution loop, circuit-breaker early exit, and checkpoint delegation on iteration cap. |
| Per-tool execution with policy and gating: src/openhuman/agent/harness/engine/tools.rs | run_one_tool centralizes per-call lifecycle: policy checks, unknown/CliRpcOnly handling, optional ApprovalGate, timed execution, scrubbing/compaction/summarization/caps, final progress event and audit recording. |
| Progress reporters and tool source implementations: src/openhuman/agent/harness/engine/progress.rs, src/openhuman/agent/harness/engine/tool_source.rs | Adds TurnProgress, SubagentProgress, NullProgress, spawn_delta_forwarder, and RegistryToolSource which filters/dedupes tool specs and delegates execution to run_one_tool. |
| Agent session adapters: src/openhuman/agent/harness/session/agent_tool_exec.rs, src/openhuman/agent/harness/session/turn_engine_adapter.rs, src/openhuman/agent/harness/session/turn.rs | Introduces AgentToolExecCtx and run_agent_tool_call, AgentToolSource, AgentObserver, and AgentCheckpoint; refactors Agent::turn to call run_turn_engine and simplifies Agent::execute_tool_call to delegate to the new executor. |
| Tool loop and subagent runner refactored: src/openhuman/agent/harness/tool_loop.rs, src/openhuman/agent/harness/subagent_runner/ops.rs | run_tool_call_loop and subagent runner now construct engine inputs (tool source, progress, observer, checkpoint, parser) and call run_turn_engine, centralizing previous inline loop behavior. |
| Type updates, module wiring, imports and docs: src/openhuman/agent/harness/session/types.rs, src/openhuman/agent/harness/session/builder.rs, src/openhuman/agent/harness/session/mod.rs, src/openhuman/agent/harness/mod.rs, src/openhuman/agent/harness/session/turn.rs, gitbooks/developing/architecture/agent-harness.md, src/openhuman/agent/harness/tool_loop_tests.rs | Agent::tool_dispatcher changed to Arc, engine module added, session modules reorganized, imports consolidated, STREAM_CHUNK_MIN_CHARS visibility adjusted, and documentation updated. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

working

Suggested reviewers

  • M3gA-Mind
  • oxoxDev

Poem

🐰 I watched three loops become just one,
Threads untangled, the work half-done.
With seams in place and hops of cheer,
One engine hums, the paths draw near.
Tools and turns now dance as one.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'refactor(agent): unify the three agent-turn loops into one TurnEngine' directly and accurately describes the main objective of the changeset: consolidating three separate agent-turn loop implementations (Agent::turn, run_tool_call_loop, and run_inner_loop for subagents) into a single unified engine::run_turn_engine. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


senamakel added 2 commits May 30, 2026 01:16
…t-turn-engine

# Conflicts:
#	src/openhuman/agent/harness/subagent_runner/ops.rs
@senamakel

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 30, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot added the feature, rust-core, and agent labels May 30, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/agent/harness/engine/core.rs`:
- Around line 85-86: The boolean use_native_tools is computed once outside the
loop so it won't reflect lazy registrations; recompute it each iteration right
before it's used instead of once at declaration—replace the fixed
use_native_tools with a local per-iteration evaluation using
provider.supports_native_tools() && !tools.request_specs().is_empty() inside the
loop (where use_native_tools is read), and make the same change in the other
occurrence around the block referenced by the code using lines ~216-220 so
native-tool enablement follows dynamic changes to tools.request_specs().
- Around line 536-548: The max-iteration success path in the turn engine returns
a TurnEngineOutcome after checkpoint.on_max_iter(...) without emitting the
lifecycle event; call progress.turn_completed(...) with the same outcome data
(text/co.text, iterations/max_iterations, cost/turn_cost, hit_cap=true) after
folding usage (turn_cost.add_call and observer.record_usage) and before
returning so consumers receive the terminal event; locate this in the block
handling checkpoint.on_max_iter and ensure progress.turn_completed is invoked
with the constructed TurnEngineOutcome (or equivalent parameters) prior to
Ok(TurnEngineOutcome { ... }).

In `@src/openhuman/agent/harness/engine/tools.rs`:
- Around line 290-296: The tracing::warn call is logging raw tool output
(output) before redaction; update the logic so you scrub sensitive data first
(use scrub_credentials(&output) into scrubbed) and then use that scrubbed value
in the tracing::warn message and pass scrubbed (not output) into
crate::openhuman::tokenjuice::compact_tool_output so no unsanitized tool payload
is emitted; locate the logging and compact call around the agent_loop/tool
handling (references: tracing::warn, scrub_credentials, compact_tool_output,
call.name) and replace uses of output with the scrubbed value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2e06d35c-10f1-4189-a4df-0918908173f1

📥 Commits

Reviewing files that changed from the base of the PR and between 3556842 and 8fdbc90.

📒 Files selected for processing (19)
  • gitbooks/developing/architecture/agent-harness.md
  • src/openhuman/agent/harness/engine/checkpoint.rs
  • src/openhuman/agent/harness/engine/core.rs
  • src/openhuman/agent/harness/engine/mod.rs
  • src/openhuman/agent/harness/engine/parser.rs
  • src/openhuman/agent/harness/engine/progress.rs
  • src/openhuman/agent/harness/engine/state.rs
  • src/openhuman/agent/harness/engine/tool_source.rs
  • src/openhuman/agent/harness/engine/tools.rs
  • src/openhuman/agent/harness/mod.rs
  • src/openhuman/agent/harness/session/agent_tool_exec.rs
  • src/openhuman/agent/harness/session/builder.rs
  • src/openhuman/agent/harness/session/mod.rs
  • src/openhuman/agent/harness/session/turn.rs
  • src/openhuman/agent/harness/session/turn_engine_adapter.rs
  • src/openhuman/agent/harness/session/types.rs
  • src/openhuman/agent/harness/subagent_runner/ops.rs
  • src/openhuman/agent/harness/tool_loop.rs
  • src/openhuman/agent/harness/tool_loop_tests.rs

- core.rs: recompute native-tool enablement each iteration (a ToolSource may
  register tools lazily mid-turn, so it can flip off->on).
- core.rs: emit turn_completed on the max-iteration checkpoint exit too, so
  consumers always get a terminal lifecycle event; drop the now-duplicate
  manual emit in Agent::turn's post-loop.
- tools.rs: scrub the failing tool output before logging it (avoid leaking
  credentials/PII into logs).

harness + agent + subagent suites: 229/229 pass.
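The scrub-before-log ordering from the tools.rs fix can be illustrated with a small stand-in. `redact_secrets` below is a hypothetical, deliberately naive redactor (it only handles `token=` and `api_key=` pairs in space-separated text); it is not the project's `scrub_credentials`, whose behavior is not shown in this PR. `failure_log_line` is likewise an assumed helper name.

```rust
/// Naive stand-in redactor: replaces the value after `token=` or
/// `api_key=` in each space-separated word with a placeholder.
pub fn redact_secrets(text: &str) -> String {
    const KEYS: [&str; 2] = ["token=", "api_key="];
    text.split(' ')
        .map(|word| {
            for key in KEYS {
                if let Some(idx) = word.find(key) {
                    // Keep everything up to and including the key.
                    return format!("{}{}[REDACTED]", &word[..idx], key);
                }
            }
            word.to_string()
        })
        .collect::<Vec<_>>()
        .join(" ")
}

/// Builds the log line from the scrubbed text only, so the raw
/// tool output never reaches the logger.
pub fn failure_log_line(tool: &str, raw_output: &str) -> String {
    let scrubbed = redact_secrets(raw_output);
    format!("tool {tool} failed: {scrubbed}")
}
```

The design point is the order of operations: the raw output is consumed only by the scrubber, and every later consumer (log message, compaction) receives the scrubbed string.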
@coderabbitai coderabbitai Bot added the working label May 30, 2026
@senamakel senamakel merged commit 74bff7b into tinyhumansai:main May 30, 2026
19 of 26 checks passed
CodeGhost21 added a commit to CodeGhost21/openhuman that referenced this pull request Jun 1, 2026
…flow tokio worker stack

The unified turn engine (PR tinyhumansai#3012) made `run_turn_engine` a ~600-line
async fn. When the orchestrator delegates to a sub-agent — e.g. the
`chat-harness-subagent` Playwright spec hitting `researcher` via
`dispatch_subagent` → `run_subagent` → `run_typed_mode` → `run_inner_loop`
→ child `run_turn_engine` — the parent's polling stack carries the
child's engine state machine inline on top of its own, plus
`run_typed_mode`'s ~1000-line state. The sum crosses tokio's 2 MiB
default worker stack and the core aborts with:

    thread 'tokio-rt-worker' has overflowed its stack
    fatal runtime error: stack overflow, aborting

That kills the openhuman-core process mid-test, and every alphabetically
later spec in the Playwright `web lane 1/4` shard cascades into
`ECONNREFUSED 127.0.0.1:17788` — visible across many unrelated PRs.

Fix the root cause at the two unboxed recursion boundaries inside
`run_typed_mode`:

- box-pin the `run_inner_loop` call so its (engine-wrapping) state
  lives on the heap independently of `run_typed_mode`
- box-pin the child `run_turn_engine` call inside `run_inner_loop` so
  the child engine's poll frame doesn't pile on top of the parent's

Adds `nested_subagent_dispatch_runs_on_a_constrained_worker_stack` as a
smoke test that exercises the exact recursion shape (outer subagent →
tool exec → inner subagent) on a 1 MiB worker stack so refactors that
inline these boxes away are caught at `cargo test` time. The deep
end-to-end catcher remains the existing `chat-harness-subagent.spec.ts`
Playwright spec, which is exactly what surfaced this bug.

No tests are skipped, marked flaky, or otherwise hidden — this fixes
the underlying Rust crash so the lane-1 spec and its cascade run for
real.
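The effect of box-pinning a child future can be seen in miniature by comparing future sizes. This sketch is an illustration under assumptions, not the code from the commit: `child`, `parent_inline`, `parent_boxed`, and the 64 KiB `FRAME` constant are made up, and it measures the size of the future's state rather than actual stack usage in tokio.

```rust
use std::mem::size_of_val;

/// Size of the buffer the child keeps alive across an await (arbitrary).
pub const FRAME: usize = 64 * 1024;

/// A child whose state machine must store `buf`, because the buffer
/// is still used after the await point.
async fn child(seed: u8) -> usize {
    let buf = [seed; FRAME];
    std::future::ready(()).await;
    buf.iter().map(|b| *b as usize).sum()
}

/// Awaiting the child directly embeds its whole state in the parent.
async fn parent_inline(seed: u8) -> usize {
    child(seed).await
}

/// Box-pinning moves the child's state to the heap; the parent only
/// stores a pointer across the await.
async fn parent_boxed(seed: u8) -> usize {
    Box::pin(child(seed)).await
}

/// Futures are lazy, so they can be created and measured without polling.
pub fn inline_future_size() -> usize {
    size_of_val(&parent_inline(1))
}

pub fn boxed_future_size() -> usize {
    size_of_val(&parent_boxed(1))
}
```

The inline parent's future is at least as large as the child's buffer, while the boxed parent's stays small. Nested engine calls compound the same way, which is why boxing at the recursion boundaries keeps the parent's state from growing with each level.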

Labels

  • agent: Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/.
  • feature: Net-new user-facing capability or product behavior.
  • rust-core: Core Rust runtime in src/: CLI, core_server, shared infrastructure.
  • working: A PR that is being worked on by the team.
