[luv-293] fix: canonicalize tool names across all agent CLIs so builtin policies fire#293
[luv-293] fix: canonicalize tool names across all agent CLIs so builtin policies fire#293NiveditJain merged 1 commit intomainfrom
Conversation
…in policies fire Builtin policies match tool names in PascalCase (`["Bash"]`, `["Read","Glob","Grep","Bash"]`, …) via case-sensitive `Array.includes` at policy-registry.ts:93-95. But Copilot's tool registry emits lowercase IDs (`bash`, `read`, …) — confirmed by the session-log shape at `lib/copilot-sessions.ts:257` and `__tests__/lib/copilot-sessions.test.ts:87` — and OpenCode's plugin SDK exposes the same. Without canonicalization every Bash/Read/Write/Edit builtin silently no-ops under those CLIs. User-reported regression: under GitHub Copilot CLI, `ls -la --almost-all $HOME | sed -n '1,200p'` ran successfully despite an enabled `block-read-outside-cwd` policy. The same command blocks correctly under Claude Code today. Fix: - `COPILOT_TOOL_MAP` in `src/hooks/types.ts` and a copilot branch in `handler.ts:canonicalizeToolName` (subprocess CLIs canonicalize handler-side). - `OPENCODE_TOOL_MAP` and `PI_TOOL_MAP` in `src/hooks/types.ts`; the OpenCode and Pi shims embed identical maps inline (shims must be self-contained — they're loaded in-process by their host CLI). The Pi shim's previous `charAt(0).toUpperCase()` heuristic only worked for single-word tool IDs and would have mis-canonicalized future multi-word tools (`todo_write` to `Todo_write`, not `TodoWrite`). - Update Copilot e2e payload fixture (`__tests__/e2e/helpers/payloads.ts`) to match the real lowercase shape; existing e2e assertions still pass because the handler now canonicalizes. Tests: - `__tests__/hooks/handler.test.ts` — three new Copilot canonicalization cases (bash to Bash regression, full COPILOT_TOOL_MAP coverage, unknown-tool passthrough), mirroring the existing Gemini coverage. - `__tests__/hooks/opencode-plugin-shim.test.ts` — full OPENCODE_TOOL_MAP coverage + unknown-tool passthrough; updated existing assertion to expect `Bash` post-canonicalization. - `__tests__/e2e/hooks/copilot-integration.e2e.test.ts` — pinned regression test reproducing the user-reported bug (block-read-outside-cwd plus an out-of-cwd ls denies under Copilot). Manual verification: piping a stdin payload with `tool_name: "bash"` and an out-of-cwd `ls` to `bun bin/failproofai.mjs --hook PreToolUse --cli copilot` now returns `permissionDecision: "deny"` with `Bash read outside project directory blocked`. Pre-fix it returned no decision (allow). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (9)
📝 WalkthroughWalkthroughTool-name canonicalization is introduced across Copilot, OpenCode, and Pi agent CLIs via new tool-name mapping constants (COPILOT_TOOL_MAP, OPENCODE_TOOL_MAP, PI_TOOL_MAP). These maps normalize lowercase tool IDs to Claude PascalCase names before policy evaluation. The handler, OpenCode shim, and Pi extension are updated to canonicalize incoming tool names. Comprehensive unit and end-to-end tests verify the canonicalization behavior. ChangesTool-Name Canonicalization Across Agent CLIs
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
Comment |
…atch + complete the PR #293 audit (#297) PR #293 wired per-CLI tool-name canonicalization so builtin policies that match Claude PascalCase names (`Bash`, `Read`, `Write`, `Edit`, `Glob`, `Grep`) fire under non-Claude CLIs. The map for Copilot was incomplete: Copilot's `view` tool — used by the model both to read files and to list directory contents — was not mapped, so `block-read-outside-cwd` never fired on `view` calls. User-reported regression: with `block-read-outside-cwd` enabled under Copilot CLI, the model could still list `$HOME` via a `view` call (a single tool invocation with `tool_input: {path: "/home/user"}`) where PR #293 had only fixed the `bash ls -la` flow. Empirical confirmation in this user's local session log at `~/.copilot/session-state/.../events.jsonl`: `{"type":"tool.execution_start","data":{"toolName":"view","arguments":{"path":"/home/nivedit"}}}` against Copilot CLI 1.0.39. Auditing all seven supported CLIs against their public tool registries plus on-disk session evidence revealed three more gaps in the same class: - Copilot was missing `view`, `create`, `apply_patch`, `web_fetch`, `powershell`, `*_bash`/`*_powershell` (the eight session-management tools), `rg`, `show_file` — directory/file reads, file creation, patches, PowerShell, web fetches all bypassed policies. - Cursor (PR #293 left it as passthrough) — Cursor uses `Shell` for what Claude calls `Bash`, so every Bash builtin (`block-sudo`, `block-rm-rf`, `block-read-outside-cwd` Bash branch, …) silently no-op'd on Cursor sessions. - Codex (PR #293 left it as passthrough) — Codex hooks report `tool_name: "apply_patch"` even when matchers say `Edit`/`Write`; live sessions also expose `write_stdin` which sends input to a running shell. - OpenCode was missing `apply_patch` and `websearch`. Fix: - Extend `COPILOT_TOOL_MAP` in `src/hooks/types.ts` with the full Copilot CLI tool surface — `view`/`show_file` → `Read`, `create` → `Write`, `apply_patch` → `Edit`, `web_fetch` → `WebFetch`, `powershell` and the `*_bash`/`*_powershell` session-management tools → `Bash`, `rg` → `Grep`. - Extend `OPENCODE_TOOL_MAP` with `apply_patch` → `Edit`, `websearch` → `WebSearch`. Mirror the same entries in the OpenCode shim template at `src/hooks/integrations.ts:734` (the shim must stay self-contained — it's loaded in-process by opencode). - Add `CURSOR_TOOL_MAP` (`Shell` → `Bash`) and `CODEX_TOOL_MAP` (`apply_patch` → `Edit`, `write_stdin` → `Bash`) in `src/hooks/types.ts`, plus matching cursor/codex branches in `handler.ts:canonicalizeToolName`. `apply_patch` maps to `Edit` (not `Write`) for consistency with the existing `str_replace_editor` → `Edit` entry; the choice was confirmed via AskUserQuestion. The trade-off is documented: `Edit` preserves Claude semantics (Claude's own `Edit` tool doesn't trigger `block-write-outside-cwd` either), while `Write` would have been stricter but inconsistent with Claude. Tests: - `__tests__/hooks/handler.test.ts` — extend the existing per-Copilot canonicalization loop to cover every new entry (with `[view, Read]` listed first as the regression anchor); add new test blocks for Cursor (`Shell` → `Bash`, plus passthrough for `Read`/`Write`/`Grep`/`Delete`/`Task`/`MCP:*`) and Codex (`apply_patch` → `Edit`, `write_stdin` → `Bash`, plus passthrough for `Bash`/`mcp__*`). - `__tests__/e2e/hooks/copilot-integration.e2e.test.ts` — pinned regression test "blocks `view` of a path outside cwd under Copilot (regression for #295)" mirroring the PR #293 `ls -la` test, plus a new `CopilotPayloads.preToolUse.view` factory in `__tests__/e2e/helpers/payloads.ts`. - `__tests__/hooks/opencode-plugin-shim.test.ts` — extend the OPENCODE_TOOL_MAP coverage loop with `apply_patch` and `websearch`. Verified: `bun run test:run` → 1461 passed, `bun run test:e2e` → 291 passed (Copilot e2e went 11 → 12), `bunx tsc --noEmit` → clean. Manual repro of all three: Copilot `view /etc` denies, Cursor `Shell sudo …` denies, Codex `write_stdin` denies (canonicalized to `Bash`). Pre-existing item not in this PR: the dogfood `.opencode/plugins/failproofai.mjs` was never updated with `TOOL_NAME_MAP` after PR #293 (the template was updated but the dogfood file is hand-maintained). Production users get the correct map via the template; only contributors running OpenCode against this repo are affected. Reading or rewriting that file requires temporarily disabling the `block-read-outside-cwd` agent-settings guard — deferred to a follow-up PR. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
* [luv-338] fix: canonicalize OpenCode + Pi tool-input arg keys so path-checking builtins fire
OpenCode's `read` slipped past `block-read-outside-cwd`: the shim
canonicalized the tool name (`read` → `Read`) but forwarded `output.args`
verbatim. OpenCode delivers `{ filePath }`; the policy reads
`ctx.toolInput.file_path`, so `getFilePath()` returned "" and the
target-empty short-circuit at builtin-policies.ts:799 allowed the read.
Same family of bug on Pi with a different mismatch shape: Pi's
read/write/edit use `path`. `block-read-outside-cwd` happens to work
on Pi via the existing `tool_input.path` fallback at builtin-policies.ts:796,
but `block-env-files` and `block-secrets-write` only check `file_path`
and were silently no-op'd.
Mirrors the TOOL_NAME_MAP pattern from #293 with two new per-tool
input-key maps keyed by canonical PascalCase tool name:
• OPENCODE_TOOL_INPUT_MAP — Read/Write/Edit: filePath→file_path,
plus oldString/newString/replaceAll for Edit
• PI_TOOL_INPUT_MAP — Read/Write/Edit: path→file_path
(Pi's nested edits[{oldText,newText}] array doesn't flat-rename,
no current builtin reads it)
Both maps are mirrored inline in their shims so .opencode/plugins/
failproofai.mjs and pi-extension/index.ts stay self-contained. MCP
mcp_* and any unmapped tool pass through unchanged.
Existing users must re-run `failproofai policies --install --cli opencode`
/ `--cli pi` to regenerate their shims and pick up the fix.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* [luv-338] docs: document OPENCODE_TOOL_INPUT_MAP and PI_TOOL_INPUT_MAP in configuration.mdx
Mirror the existing tool-name canonicalization wording in the OpenCode
and Pi sections. Calls out which builtin policies start firing as a
result (block-read-outside-cwd / block-env-files / block-secrets-write
on OpenCode; block-env-files / block-secrets-write on Pi).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* [luv-338] chore: stamp PR number in CHANGELOG entry
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
["Bash"],["Read","Glob","Grep","Bash"], …) match on tool names with case-sensitiveArray.includesatpolicy-registry.ts:93-95, but Copilot's tool registry emits lowercase IDs (bash,read, …) — every Bash/Read/Write/Edit builtin silently no-ops under Copilot.COPILOT_TOOL_MAP(handler-side) plusOPENCODE_TOOL_MAP/PI_TOOL_MAP(shim-side; the OpenCode and Pi shims embed identical maps inline since they're loaded in-process by their host CLI). Replaces Pi's naivecharAt(0).toUpperCase()heuristic that would have mis-canonicalized future multi-word tool IDs.User-reported regression
Under GitHub Copilot CLI, an out-of-cwd
lspiped throughsedran successfully despite an enabledblock-read-outside-cwdpolicy. The same flow blocks correctly under Claude Code today.Test plan
__tests__/e2e/hooks/copilot-integration.e2e.test.ts).app/components/log-viewer/tool-input-output.tsxwarning unrelated to this PR).tool_name: bashand an out-of-cwdlsnow returnspermissionDecision: deny(pre-fix: allow).OPENCODE_TOOL_MAP/PI_TOOL_MAP(bash,read,write,edit,glob,grep, plus OpenCode'slist,webfetch,todowrite,todoread). The handler/shim's?? rawfallback keeps unknown tools functional even if the assumed set is incomplete; the regression test pins behaviour forbashspecifically. Codex and Cursor remain passthrough until empirical evidence shows otherwise (per docs they're already PascalCase).🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes