fix(subagent): lazy-register toolkit actions filtered out of fuzzy top-K by senamakel · Pull Request #1162 · tinyhumansai/openhuman

senamakel · 2026-05-04T02:50:36Z

Summary

Stop wasting tokens when integrations_agent calls a Composio action slug that exists in the bound toolkit but was dropped by the up-front fuzzy top-K filter.
Add LazyToolkitResolver: on an unknown call the inner loop builds the matching ComposioActionTool on the spot, admits it to the allowlist, and dispatches normally.
Helpful error: genuine misses (typos, non-toolkit slugs) now include the available tool names so the model self-corrects in one turn instead of guessing.

Problem

integrations_agent narrows each Composio toolkit's per-action tools via a fuzzy top-K filter (12 for heavy toolkits like Gmail) so dense per-action JSON schemas don't blow Fireworks' grammar / context budget. The filter exists to keep schemas out of the system prompt — not to gate execution.
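To illustrate how a valid slug can fall outside the top-K, here is a minimal sketch of a top-K pre-filter. The scoring function is an assumption made for illustration (a plain word-overlap count); the PR does not show the real fuzzy scorer, and the names `score` and `top_k` are hypothetical.

```rust
/// Hypothetical scorer: counts how many words of the task prompt occur in the slug.
/// The real filter's scoring is not shown in this PR.
fn score(slug: &str, prompt: &str) -> usize {
    let slug = slug.to_lowercase();
    prompt
        .to_lowercase()
        .split_whitespace()
        .filter(|word| slug.contains(*word))
        .count()
}

/// Keep the `k` best-scoring slugs; the rest never reach the system prompt.
fn top_k<'a>(slugs: &[&'a str], prompt: &str, k: usize) -> Vec<&'a str> {
    let mut ranked: Vec<(usize, &'a str)> =
        slugs.iter().map(|s| (score(s, prompt), *s)).collect();
    // Stable sort, highest score first; ties keep catalogue order.
    ranked.sort_by(|a, b| b.0.cmp(&a.0));
    ranked.into_iter().take(k).map(|(_, s)| s).collect()
}
```

With `k = 1` and the prompt "send an email to bob", `GMAIL_LIST_MESSAGES` scores zero and is dropped, even though it is a real action the model may still try to call.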

In practice the model frequently knows the correct slug (e.g. GMAIL_LIST_MESSAGES) even when it falls outside the top-K. The inner loop dead-ended with a bare Error: tool '...' is not available to the integrations_agent sub-agent, with no list of what is available, so the model retried / wandered, burning tokens.

Solution

In src/openhuman/agent/harness/subagent_runner/ops.rs:

New LazyToolkitResolver struct holding the bound toolkit's full action catalogue and the Composio client. Built once in run_typed_mode for integrations_agent-with-toolkit alongside the existing top-K registration (re-uses the same fresh-actions / cached fallback).
run_inner_loop now owns extra_tools / allowed_names (passed by value) so it can extend them mid-run, and takes an optional lazy_resolver: Option<LazyToolkitResolver>. Fork mode passes None.
On unknown call: try the resolver first. Slug match → build ComposioActionTool, push into extra_tools, insert into allowed_names, dispatch on the same turn (and reuse on subsequent turns).
For genuine misses, the error now lists the union of allowed names + toolkit slugs so the model can recover in one turn.
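The flow described above can be sketched roughly as follows. This is a simplified illustration, not the code in ops.rs: `ActionTool` and `Resolver` are stand-ins for `ComposioActionTool` and `LazyToolkitResolver`, the Composio client is omitted, and the helper `admit` is hypothetical. It borrows the collections mutably, whereas the PR has `run_inner_loop` take them by value.

```rust
use std::collections::HashSet;

/// Stand-in for the real `ComposioActionTool`; only the slug is modelled.
#[derive(Debug, Clone, PartialEq)]
struct ActionTool {
    slug: String,
}

/// Stand-in for `LazyToolkitResolver`: the toolkit's full set of action slugs.
struct Resolver {
    catalogue: HashSet<String>,
}

impl Resolver {
    /// Build a tool only if the slug is in the toolkit's own catalogue.
    fn resolve(&self, slug: &str) -> Option<ActionTool> {
        if self.catalogue.contains(slug) {
            Some(ActionTool { slug: slug.to_string() })
        } else {
            None
        }
    }
}

/// Returns true when the call may be dispatched, admitting the tool lazily if needed.
fn admit(
    name: &str,
    extra_tools: &mut Vec<ActionTool>,
    allowed_names: &mut HashSet<String>,
    resolver: Option<&Resolver>,
) -> bool {
    if allowed_names.contains(name) {
        return true; // already registered, either up front or on an earlier turn
    }
    match resolver.and_then(|r| r.resolve(name)) {
        Some(tool) => {
            extra_tools.push(tool);
            allowed_names.insert(name.to_string());
            true
        }
        None => false, // genuine miss: typo or non-toolkit slug
    }
}
```

Passing `None` for the resolver, as fork mode does, leaves the allowlist unchanged for unknown names, so the pre-existing gating behaviour is preserved on that path.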

Tradeoff: a slightly looser invariant on allowed_names (it may grow during the loop), and one extra ComposioActionTool allocation per "miss". Both are negligible compared with the token waste they replace.

Submission Checklist

N/A: behaviour-only change to an internal sub-agent fallback path; existing 15 subagent_runner unit tests (including typed_mode_blocks_unallowed_tool_calls) still pass and exercise the gating logic.
N/A: changed lines are inside the integrations_agent harness path which is exercised by existing rust tests; a dedicated unit test for the resolver would require stubbing ComposioClient, which has no mock harness today.
N/A: behaviour-only change, no new feature rows in the coverage matrix.
N/A: no feature-matrix IDs touched by this change.
No new external network dependencies introduced.
N/A: no release-cut surface changes.
N/A: no linked issue.

Impact

Runtime: integrations_agent (desktop core sidecar). No frontend impact, no Tauri shell changes, no schema/migration changes.
Performance: removes an entire class of failed tool-call retry loops; one extra ComposioActionTool allocation per "miss" turn is negligible compared to the token spend it avoids.
Security: no change in tool surface — only actions that the toolkit's own catalogue (from Composio) advertises can be lazily admitted; the connected + allowlist pre-flight in spawn_subagent is unchanged.

Summary by CodeRabbit

Refactor
- Improved tool resolution flow and error messaging: unavailable tool errors now list resolvable tools and newly admitted tools are allowed for subsequent steps.
Performance
- Deferred tool construction until actually needed, reducing startup overhead and memory usage by loading tools on demand.

The integrations_agent narrows each Composio toolkit's per-action tools via a fuzzy top-K filter (12 for heavy toolkits like Gmail) so per-action JSON schemas don't blow the prompt budget. When the model calls a slug that's a real action but fell outside the top-K, the inner loop dead-ended with a bare "tool '...' is not available" error. The model then retried / wandered, burning tokens.

- Build a `LazyToolkitResolver` alongside the existing top-K registration carrying the toolkit's full action catalogue + composio client.
- On an unknown call the inner loop now consults the resolver: a slug match builds a `ComposioActionTool` on the spot, admits it to the allowlist for this and subsequent turns, and dispatches normally.
- For genuine misses (typos, non-toolkit tools), the error string now lists the available tool names so the model can self-correct in one turn instead of guessing blind.
- `run_inner_loop` now owns `extra_tools` / `allowed_names` so it can extend them mid-run; fork mode passes `None` for the resolver.

coderabbitai · 2026-05-04T02:50:49Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bd4c0e8e-900a-4b6b-825a-441300bcf9c6

📥 Commits

Reviewing files that changed from the base of the PR and between 389e43a and 8193d90.

📒 Files selected for processing (1)

src/openhuman/agent/harness/subagent_runner/ops.rs

✅ Files skipped from review due to trivial changes (1)

src/openhuman/agent/harness/subagent_runner/ops.rs

📝 Walkthrough

A LazyToolkitResolver was added to defer construction of ComposioActionTool instances from a toolkit's full action catalogue. In typed mode the runner registers only fuzzy top-K tools up front, stores the full catalogue in the resolver, and lazily builds/admits tools on-demand during the inner loop; error messaging now includes resolver-known slugs.

Changes

Lazy Toolkit Resolution

| Layer / File(s) | Summary |
| --- | --- |
| Data Shape `src/openhuman/agent/harness/subagent_runner/ops.rs` (lines 40–69) | Introduces `LazyToolkitResolver` with methods to resolve an action slug into a `Box<dyn ComposioActionTool>` and to list known slugs. |
| State Management `src/openhuman/agent/harness/subagent_runner/ops.rs` (lines 241–389) | Adds `lazy_resolver: Option<LazyToolkitResolver>` to the typed-mode runner state; when `toolkit` is provided, stores full `integration.tools` in the resolver while pushing only fuzzy top-K tools into `dynamic_tools`. |
| Function Signatures / Wiring `src/openhuman/agent/harness/subagent_runner/ops.rs` (lines 655–667, 763–767, 812–818) | Updates `run_inner_loop` signature to take owned `extra_tools: Vec<_>`, owned `allowed_names: HashSet<String>`, and `lazy_resolver: Option<LazyToolkitResolver>`; call sites updated (typed mode passes `Some(...)`, fork mode passes `None`). |
| Core Logic `src/openhuman/agent/harness/subagent_runner/ops.rs` (lines 1103–1143) | Inner per-call resolution: if `call.name` not in `allowed_names`, attempt lazy resolution via `lazy_resolver`; on success insert tool into `extra_tools` and `allowed_names`. If still not allowed, include `lazy_resolver.known_slugs()` in constructed "Available tools" list. |
| Documentation / Manifest `Cargo.toml` | Cargo manifest touched (versioning/dependencies context for build). |
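As a rough illustration of the "Available tools" list described in the Core Logic row, the sketch below formats an error from the union of the allowlist and the resolver's known slugs. The helper name `unavailable_error`, its signature, and the text after the first sentence of the message are assumptions; only the leading "not available" clause is quoted from the PR description.

```rust
use std::collections::{BTreeSet, HashSet};

/// Hypothetical helper: formats the "not available" error for a genuine miss.
/// The exact wording in ops.rs may differ.
fn unavailable_error(
    name: &str,
    allowed_names: &HashSet<String>,
    known_slugs: &[String],
) -> String {
    // BTreeSet yields a deduplicated, sorted union so the message is deterministic.
    let available: BTreeSet<&str> = allowed_names
        .iter()
        .map(String::as_str)
        .chain(known_slugs.iter().map(String::as_str))
        .collect();
    let list = available.into_iter().collect::<Vec<_>>().join(", ");
    format!(
        "Error: tool '{}' is not available to the integrations_agent sub-agent. Available tools: {}",
        name, list
    )
}
```

Sorting and deduplicating is a design choice of this sketch: it keeps the message stable across runs and avoids listing a slug twice once it has been lazily admitted.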

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Runner as AgentRunner
    participant Resolver as LazyToolkitResolver
    participant Toolkit as ToolkitCatalogue
    participant Tool as ComposioActionTool

    Note right of Runner: Startup (typed mode)
    Runner->>Toolkit: store full catalogue (lazy)
    Toolkit-->>Resolver: hold slugs & client

    Note right of Runner: During inner loop (tool call)
    Runner->>Resolver: resolve(call.name)?
    alt resolver has slug
        Resolver->>Toolkit: build action tool
        Toolkit-->>Tool: constructed
        Tool-->>Runner: boxed tool added
        Runner->>Runner: insert into extra_tools and allowed_names
        Runner->>Tool: execute tool call
    else not resolvable
        Runner->>Runner: report "tool not available" (include known slugs)
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

feat(agent): fuzzy-filter skills_agent toolkit actions by task prompt #579 — Introduced fuzzy top-K tool_filter pre-selection of toolkit actions; this PR complements it by keeping the full catalogue and lazily resolving filtered-away actions at runtime.

Suggested reviewers

graycyrus

Poem

🐰 I stash the tools the clever folks send,
Keep only top picks, but save every friend.
When a call asks for one I once left behind,
I build it on demand — quick, nimble, and kind! 🎩✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: implementing lazy registration of toolkit actions that were filtered out by fuzzy top-K filtering to avoid token waste on retries. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



senamakel requested a review from a team May 4, 2026 02:50

coderabbitai Bot previously approved these changes May 4, 2026

Merge remote-tracking branch 'upstream/main' into fix/tool-calls

8193d90

# Conflicts: # src/openhuman/agent/harness/subagent_runner/ops.rs

senamakel dismissed coderabbitai[bot]’s stale review via 8193d90 May 4, 2026 02:57

coderabbitai Bot approved these changes May 4, 2026

senamakel merged commit 6793101 into tinyhumansai:main May 4, 2026
23 of 25 checks passed

coderabbitai Bot mentioned this pull request May 4, 2026

fix(composio): per-action tool consumes backend markdownFormatted #1167

Merged

7 tasks

coderabbitai Bot mentioned this pull request May 7, 2026

perf(agent): orchestrator harness efficiency improvements #1314

Open

10 tasks

senamakel merged 2 commits into tinyhumansai:main from senamakel:fix/tool-calls
