Skip to content

fix(plugins): guard runtime tool descriptors#89253

Draft
vincentkoc wants to merge 1 commit into
mainfrom
fuzz-plugin-runtime-tool-descriptor-20260601
Draft

fix(plugins): guard runtime tool descriptors#89253
vincentkoc wants to merge 1 commit into
mainfrom
fuzz-plugin-runtime-tool-descriptor-20260601

Conversation

@vincentkoc
Copy link
Copy Markdown
Member

Summary

  • Harden runtime plugin tool candidate inspection against poisoned descriptor getters for name, execute, parameters, description, label, and displaySummary.
  • Return accepted plugin tools through the scoped proxy with snapshotted descriptor fields so poisoned optional getters are not reachable downstream.
  • Reuse validated descriptor snapshots when writing manifest tool descriptor cache entries, and cover the runtime behavior with a poisoned-descriptor regression test.

Verification

  • node scripts/run-vitest.mjs src/plugins/tools.optional.test.ts --reporter=dot
  • node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src/plugins/tools.ts src/plugins/tools.optional.test.ts src/plugins/tool-descriptor-cache.ts
  • ./node_modules/.bin/oxfmt --check --threads=1 src/plugins/tools.ts src/plugins/tools.optional.test.ts src/plugins/tool-descriptor-cache.ts
  • git diff --check
  • .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main
  • Remote proof: azure Crabbox run run_95ca8a20197a, lease cbx_06e2e1c4e590, corepack pnpm check:changed, exit 0

Real behavior proof

Behavior addressed: Runtime plugin tools with malformed or poisoned descriptor getters are skipped with diagnostics while valid sibling tools remain available and scoped to the owning plugin.

Real environment tested: Azure Linux Crabbox run run_95ca8a20197a, lease cbx_06e2e1c4e590, cloned branch fuzz-plugin-runtime-tool-descriptor-20260601, Node v24.15.0.

Exact steps or command run after this patch: Installed Node/Corepack on the raw box, cloned the pushed branch, fetched origin/main, then ran env OPENCLAW_CHECK_CHANGED_REMOTE_CHILD=1 OPENCLAW_CHANGED_LANES_RAW_SYNC=1 CI=1 corepack pnpm check:changed.

Evidence after fix: check:changed selected core and coreTests; typecheck, changed-file lint, import-cycle/runtime guards, and related changed-gate checks passed with exit 0.

Observed result after fix: The focused regression keeps the valid message plugin tool, emits malformed diagnostics for poisoned name, execute, parameters, and description descriptors, and masks poisoned optional descriptor fields to safe undefined values.

What was not tested: Live third-party plugin packages beyond the mocked runtime registry regression.

@vincentkoc vincentkoc self-assigned this Jun 1, 2026
@openclaw-barnacle openclaw-barnacle Bot added size: M maintainer Maintainer-authored PR labels Jun 1, 2026
@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Jun 1, 2026

Codex review: needs maintainer review before merge. Reviewed June 1, 2026, 7:35 PM ET / 23:35 UTC.

Summary
This PR hardens plugin tool resolution by safely reading and snapshotting runtime tool descriptor fields, then adds a poisoned-descriptor regression test.

PR surface: Source +168, Tests +89. Total +257 across 3 files.

Reproducibility: yes. from source inspection: current main directly reads runtime plugin tool descriptor getters in resolution and cache capture paths, so a throwing getter can abort that path. I did not run the regression locally because this was a read-only review.

Review metrics: 1 noteworthy metric.

  • Descriptor fields guarded: 6 fields guarded/snapshotted. The patch covers the getter-backed fields that determine whether one malformed plugin tool can poison sibling tool discovery.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

  • [P1] Live third-party plugin packages were not exercised; the proof is focused on a mocked runtime-registry regression plus changed-gate validation.

Maintainer options:

  1. Decide the mitigation before merge
    Land the guarded descriptor snapshot path after normal maintainer draft and CI review so malformed plugin tools are isolated while valid siblings remain usable.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • No automated repair is needed; this maintainer-owned draft should proceed through normal maintainer review and merge-readiness gates.

Security
Cleared: The diff is a plugin runtime hardening change with no dependency, workflow, lockfile, secret-handling, or package-resolution changes.

Review details

Best possible solution:

Land the guarded descriptor snapshot path after normal maintainer draft and CI review so malformed plugin tools are isolated while valid siblings remain usable.

Do we have a high-confidence way to reproduce the issue?

Yes, from source inspection: current main directly reads runtime plugin tool descriptor getters in resolution and cache capture paths, so a throwing getter can abort that path. I did not run the regression locally because this was a read-only review.

Is this the best way to solve the issue?

Yes; guarding descriptor reads at the plugin runtime boundary and reusing validated snapshots is narrower and more maintainable than adding downstream try/catch handling at each caller.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 4c6285e8ff79.

Label changes

Label changes:

  • add P2: This is a normal-priority plugin runtime hardening bugfix with limited blast radius and focused regression coverage.
  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes terminal-style Crabbox proof for corepack pnpm check:changed on the branch plus the focused poisoned-descriptor regression result.
  • add rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes terminal-style Crabbox proof for corepack pnpm check:changed on the branch plus the focused poisoned-descriptor regression result.

Label justifications:

  • P2: This is a normal-priority plugin runtime hardening bugfix with limited blast radius and focused regression coverage.
  • rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes terminal-style Crabbox proof for corepack pnpm check:changed on the branch plus the focused poisoned-descriptor regression result.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes terminal-style Crabbox proof for corepack pnpm check:changed on the branch plus the focused poisoned-descriptor regression result.
Evidence reviewed

PR surface:

Source +168, Tests +89. Total +257 across 3 files.

View PR surface stats
Area Files Added Removed Net
Source 2 215 47 +168
Tests 1 89 0 +89
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 3 304 47 +257

What I checked:

  • Root repository policy read: Read the full root AGENTS.md in two ranges; its plugin/runtime review guidance affected the review depth and merge-risk assessment. (AGENTS.md:1, 4c6285e8ff79)
  • Scoped plugin policy read: src/plugins/AGENTS.md identifies this area as owning plugin discovery, registry assembly, runtime resolution, and contract enforcement, so the review focused on that boundary. (src/plugins/AGENTS.md:1, 4c6285e8ff79)
  • Current main behavior: Current main directly reads plugin tool getters while wrapping and resolving tools, including execute, name, description, parameters, and cache descriptor fields. (src/plugins/tools.ts:109, 4c6285e8ff79)
  • Current descriptor cache behavior: Current main descriptor capture directly reads label, displaySummary, name, description, parameters, and executor toolName from the runtime tool object. (src/plugins/tool-descriptor-cache.ts:144, 4c6285e8ff79)
  • PR hardening path: The PR adds guarded candidate inspection for required descriptor fields and treats throwing getters as malformed tool diagnostics rather than letting them abort sibling tool resolution. (src/plugins/tools.ts:551, 91599223dec7)
  • PR snapshot/cache path: Accepted tools are wrapped with snapshotted descriptor fields, and the descriptor cache receives those validated snapshots instead of rereading hostile optional getters. (src/plugins/tools.ts:1458, 91599223dec7)

Likely related people:

  • Dallin Romney: Blame and symbol-history in the available main history show the current plugin tool resolver and descriptor cache implementation coming from commit 5a67c5c. (role: recent area contributor; confidence: medium; commits: 5a67c5c5566e; files: src/plugins/tools.ts, src/plugins/tool-descriptor-cache.ts, src/plugins/tools.optional.test.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P2 Normal backlog priority with limited blast radius. labels Jun 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

maintainer Maintainer-authored PR P2 Normal backlog priority with limited blast radius. proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. size: M status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant