fix(subagent): lazy-register toolkit actions filtered out of fuzzy top-K #1162

Merged
senamakel merged 2 commits into tinyhumansai:main from senamakel:fix/tool-calls
May 4, 2026

Conversation

Member

@senamakel senamakel commented May 4, 2026

Summary

  • Stop wasting tokens when integrations_agent calls a Composio action slug that exists in the bound toolkit but was dropped by the up-front fuzzy top-K filter.
  • Add LazyToolkitResolver: on an unknown call the inner loop builds the matching ComposioActionTool on the spot, admits it to the allowlist, and dispatches normally.
  • Helpful error: genuine misses (typos, non-toolkit slugs) now include the available tool names so the model self-corrects in one turn instead of guessing.

Problem

integrations_agent narrows each Composio toolkit's per-action tools via a fuzzy top-K filter (12 for heavy toolkits like Gmail) so dense per-action JSON schemas don't blow Fireworks' grammar / context budget. The filter exists to keep schemas out of the system prompt — not to gate execution.

In practice the model frequently knows the correct slug (e.g. GMAIL_LIST_MESSAGES) even when it falls outside the top-K. Before this change, the inner loop dead-ended with a bare Error: tool '...' is not available to the integrations_agent sub-agent and gave no list of what is available, so the model retried or wandered, burning tokens.
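
To illustrate how a valid slug can fall outside the top-K, here is a minimal sketch. The real fuzzy scorer is not shown in this PR, so a toy word-overlap score stands in for it; `score`, `top_k`, and every slug other than GMAIL_LIST_MESSAGES are assumptions made for the example.

```rust
/// Toy relevance score (an assumption, not the harness's scorer): counts how
/// many whitespace-separated task words occur in the slug, case-insensitively.
fn score(slug: &str, task: &str) -> usize {
    let slug = slug.to_lowercase();
    task.to_lowercase()
        .split_whitespace()
        .filter(|word| slug.contains(*word))
        .count()
}

/// Keeps the `k` best-scoring slugs. Only these would have their JSON schemas
/// placed in the prompt; the remaining actions still exist in the toolkit.
pub fn top_k<'a>(slugs: &[&'a str], task: &str, k: usize) -> Vec<&'a str> {
    let mut ranked: Vec<(usize, &'a str)> =
        slugs.iter().map(|&s| (score(s, task), s)).collect();
    // Highest score first; ties are broken alphabetically for determinism.
    ranked.sort_by(|a, b| b.0.cmp(&a.0).then(a.1.cmp(b.1)));
    ranked.into_iter().take(k).map(|(_, s)| s).collect()
}
```

With a task such as "send an email to Bob" and K = 2, GMAIL_LIST_MESSAGES scores zero and is dropped, even though a follow-up step may legitimately need it.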

Solution

In src/openhuman/agent/harness/subagent_runner/ops.rs:

  • New LazyToolkitResolver struct holding the bound toolkit's full action catalogue + composio client. Built once in run_typed_mode for integrations_agent-with-toolkit alongside the existing top-K registration (re-uses the same fresh-actions / cached fallback).
  • run_inner_loop now owns extra_tools / allowed_names (passed by value) so it can extend them mid-run, and takes an optional lazy_resolver: Option<LazyToolkitResolver>. Fork mode passes None.
  • On unknown call: try the resolver first. Slug match → build ComposioActionTool, push into extra_tools, insert into allowed_names, dispatch on the same turn (and reuse on subsequent turns).
  • For genuine misses, the error now lists the union of allowed names + toolkit slugs so the model can recover in one turn.

Tradeoff: a slightly looser invariant on allowed_names (it may grow during the loop), and one extra ComposioActionTool allocation per "miss". Both negligible vs. the token waste they replace.
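
A rough sketch of the resolver's shape, for readers who have not opened ops.rs. This is not the code from the PR: `ActionSpec` and `ActionTool` are simplified stand-ins for the catalogue entry and for ComposioActionTool, and the Composio client is omitted entirely.

```rust
use std::collections::BTreeMap;

/// Stand-in for one catalogue entry (hypothetical; the real type is not shown here).
#[derive(Clone, Debug)]
pub struct ActionSpec {
    pub slug: String,
    pub description: String,
}

/// Stand-in for `ComposioActionTool`, reduced to the fields the sketch needs.
#[derive(Debug)]
pub struct ActionTool {
    pub name: String,
    pub description: String,
}

/// Holds the bound toolkit's full catalogue so tools can be built on demand.
pub struct LazyToolkitResolver {
    catalogue: BTreeMap<String, ActionSpec>,
}

impl LazyToolkitResolver {
    pub fn new(actions: Vec<ActionSpec>) -> Self {
        let catalogue = actions.into_iter().map(|a| (a.slug.clone(), a)).collect();
        Self { catalogue }
    }

    /// Builds a tool only when the catalogue advertises the slug, so nothing
    /// outside the toolkit can be admitted through this path.
    pub fn resolve(&self, slug: &str) -> Option<ActionTool> {
        self.catalogue.get(slug).map(|spec| ActionTool {
            name: spec.slug.clone(),
            description: spec.description.clone(),
        })
    }

    /// Slugs in sorted order (a `BTreeMap` iterates by key), for error messages.
    pub fn known_slugs(&self) -> Vec<String> {
        self.catalogue.keys().cloned().collect()
    }
}
```

Because `resolve` returns `None` for anything the catalogue does not list, a typo still reaches the error path rather than being admitted.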

Submission Checklist

  • N/A: behaviour-only change to an internal sub-agent fallback path; existing 15 subagent_runner unit tests (including typed_mode_blocks_unallowed_tool_calls) still pass and exercise the gating logic.
  • N/A: changed lines are inside the integrations_agent harness path, which is exercised by existing Rust tests; a dedicated unit test for the resolver would require stubbing ComposioClient, which has no mock harness today.
  • N/A: behaviour-only change, no new feature rows in the coverage matrix.
  • N/A: no feature-matrix IDs touched by this change.
  • No new external network dependencies introduced.
  • N/A: no release-cut surface changes.
  • N/A: no linked issue.

Impact

  • Runtime: integrations_agent (desktop core sidecar). No frontend impact, no Tauri shell changes, no schema/migration changes.
  • Performance: removes an entire class of failed tool-call retry loops; one extra ComposioActionTool allocation per "miss" turn is negligible compared to the token spend it avoids.
  • Security: no change in tool surface — only actions that the toolkit's own catalogue (from Composio) advertises can be lazily admitted; the connected + allowlist pre-flight in spawn_subagent is unchanged.

Related

  • Closes:
  • Follow-up PR(s)/TODOs:

Summary by CodeRabbit

  • Refactor

    • Improved tool resolution flow and error messaging: unavailable tool errors now list resolvable tools and newly admitted tools are allowed for subsequent steps.
  • Performance

    • Deferred tool construction until actually needed, reducing startup overhead and memory usage by loading tools on demand.

The integrations_agent narrows each Composio toolkit's per-action tools
via a fuzzy top-K filter (12 for heavy toolkits like Gmail) so per-action
JSON schemas don't blow the prompt budget. When the model calls a slug
that's a real action but fell outside the top-K, the inner loop dead-
ended with a bare "tool '...' is not available" error. The model then
retried / wandered, burning tokens.

- Build a `LazyToolkitResolver` alongside the existing top-K registration
  carrying the toolkit's full action catalogue + composio client.
- On an unknown call the inner loop now consults the resolver: a slug
  match builds a `ComposioActionTool` on the spot, admits it to the
  allowlist for this and subsequent turns, and dispatches normally.
- For genuine misses (typos, non-toolkit tools), the error string now
  lists the available tool names so the model can self-correct in one
  turn instead of guessing blind.
- `run_inner_loop` now owns `extra_tools` / `allowed_names` so it can
  extend them mid-run; fork mode passes `None` for the resolver.
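
One possible shape for the error string on a genuine miss, as a hedged sketch: the exact wording used in ops.rs may differ, and `unavailable_tool_error` is a hypothetical helper name. It lists the union of the current allowlist and the toolkit's slugs, deduplicated and sorted.

```rust
use std::collections::BTreeSet;

/// Builds the "not available" error with the list of names the model may call.
/// The message format is an assumption modelled on the old bare error.
pub fn unavailable_tool_error(
    requested: &str,
    agent: &str,
    allowed_names: &[&str],
    toolkit_slugs: &[&str],
) -> String {
    // A BTreeSet deduplicates names present in both lists and keeps them sorted.
    let mut available: BTreeSet<&str> = BTreeSet::new();
    available.extend(allowed_names.iter().copied());
    available.extend(toolkit_slugs.iter().copied());
    let listing = available.into_iter().collect::<Vec<_>>().join(", ");
    format!(
        "Error: tool '{}' is not available to the {} sub-agent. Available tools: {}",
        requested, agent, listing
    )
}
```

Sorting keeps the message stable across turns, which makes retries easier to compare in logs.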
@senamakel senamakel requested a review from a team May 4, 2026 02:50
Contributor

coderabbitai Bot commented May 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bd4c0e8e-900a-4b6b-825a-441300bcf9c6

📥 Commits

Reviewing files that changed from the base of the PR and between 389e43a and 8193d90.

📒 Files selected for processing (1)
  • src/openhuman/agent/harness/subagent_runner/ops.rs
✅ Files skipped from review due to trivial changes (1)
  • src/openhuman/agent/harness/subagent_runner/ops.rs

Walkthrough

A LazyToolkitResolver was added to defer construction of ComposioActionTool instances from a toolkit's full action catalogue. In typed mode the runner registers only fuzzy top-K tools up front, stores the full catalogue in the resolver, and lazily builds/admits tools on-demand during the inner loop; error messaging now includes resolver-known slugs.

Changes

Lazy Toolkit Resolution

| Layer / File(s) | Summary |
|---|---|
| Data Shape: src/openhuman/agent/harness/subagent_runner/ops.rs (lines 40–69) | Introduces LazyToolkitResolver with methods to resolve an action slug into a Box<dyn ComposioActionTool> and to list known slugs. |
| State Management: src/openhuman/agent/harness/subagent_runner/ops.rs (lines 241–389) | Adds lazy_resolver: Option<LazyToolkitResolver> to the typed-mode runner state; when toolkit is provided, stores full integration.tools in the resolver while pushing only fuzzy top-K tools into dynamic_tools. |
| Function Signatures / Wiring: src/openhuman/agent/harness/subagent_runner/ops.rs (lines 655–667, 763–767, 812–818) | Updates run_inner_loop signature to take owned extra_tools: Vec<_>, owned allowed_names: HashSet<String>, and lazy_resolver: Option<LazyToolkitResolver>; call sites updated (typed mode passes Some(...), fork mode passes None). |
| Core Logic: src/openhuman/agent/harness/subagent_runner/ops.rs (lines 1103–1143) | Inner per-call resolution: if call.name not in allowed_names, attempt lazy resolution via lazy_resolver; on success insert tool into extra_tools and allowed_names. If still not allowed, include lazy_resolver.known_slugs() in constructed "Available tools" list. |
| Documentation / Manifest: Cargo.toml | Cargo manifest touched (versioning/dependencies context for build). |
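
The wiring summarised above can be sketched as follows. This is an illustration under assumptions, not the actual run_inner_loop: tools are reduced to their names, `Resolver` is a hypothetical stand-in for LazyToolkitResolver, and `Outcome` exists only to make the three paths visible.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the resolver: just the set of slugs it can build.
pub struct Resolver {
    slugs: HashSet<String>,
}

impl Resolver {
    pub fn new(slugs: &[&str]) -> Self {
        Self { slugs: slugs.iter().map(|s| s.to_string()).collect() }
    }

    fn resolve(&self, name: &str) -> Option<String> {
        self.slugs.get(name).cloned()
    }
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    Dispatched,
    LazilyAdmitted,
    Rejected,
}

/// Takes the tool list and allowlist by value so both can grow mid-run.
/// Fork mode would pass `None` for `lazy_resolver`.
pub fn run_inner_loop(
    mut extra_tools: Vec<String>,
    mut allowed_names: HashSet<String>,
    lazy_resolver: Option<Resolver>,
    calls: &[&str],
) -> Vec<Outcome> {
    let mut outcomes = Vec::with_capacity(calls.len());
    for &name in calls {
        if allowed_names.contains(name) {
            // Already registered, either up front or by an earlier lazy admit.
            outcomes.push(Outcome::Dispatched);
        } else if let Some(tool) = lazy_resolver.as_ref().and_then(|r| r.resolve(name)) {
            // Admit on the same turn; later calls take the branch above.
            allowed_names.insert(tool.clone());
            extra_tools.push(tool);
            outcomes.push(Outcome::LazilyAdmitted);
        } else {
            outcomes.push(Outcome::Rejected);
        }
    }
    outcomes
}
```

Passing the collections by value keeps the mutation local to one run of the loop, so the caller's view of the allowlist is never changed behind its back.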

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant AgentRunner as Runner
    participant Resolver as LazyToolkitResolver
    participant Toolkit as ToolkitCatalogue
    participant Tool as ComposioActionTool

    Note right of Runner: Startup (typed mode)
    Runner->>Toolkit: store full catalogue (lazy)
    Toolkit-->>Resolver: hold slugs & client

    Note right of Runner: During inner loop (tool call)
    Runner->>Resolver: resolve(call.name)?
    alt resolver has slug
        Resolver->>Toolkit: build action tool
        Toolkit-->>Tool: constructed
        Tool-->>Runner: boxed tool added
        Runner->>Runner: insert into extra_tools and allowed_names
        Runner->>Tool: execute tool call
    else not resolvable
        Runner->>Runner: report "tool not available" (include known slugs)
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • graycyrus

Poem

🐰 I stash the tools the clever folks send,
Keep only top picks, but save every friend.
When a call asks for one I once left behind,
I build it on demand — quick, nimble, and kind! 🎩✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: implementing lazy registration of toolkit actions that were filtered out by fuzzy top-K filtering to avoid token waste on retries. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai Bot previously approved these changes May 4, 2026
# Conflicts:
#	src/openhuman/agent/harness/subagent_runner/ops.rs
@senamakel senamakel merged commit 6793101 into tinyhumansai:main May 4, 2026
23 of 25 checks passed
jwalin-shah added a commit to jwalin-shah/openhuman that referenced this pull request May 5, 2026

* fix(subagent): lazy-register toolkit actions filtered out of fuzzy top-K (tinyhumansai#1162)
