feat: gate MCP tool calls through EvalOps approvals by haasonsaas · Pull Request #44 · evalops/kestrel

haasonsaas · 2026-04-21T00:20:21Z

Summary

- add an EvalOps approvals client path for RequestApproval and GetApproval
- gate MCP tool execution through EvalOps before calling the local MCP server manager
- annotate existing wide events with approval risk, decision, request id, and offline fallback state

This is the first backend slice for #17. It adds the approvals enforcement point and telemetry, while leaving the dedicated Electron approval UI/modal as follow-up work.

Validation

- npm run build
- git diff --check
- Earlier branch validation: npm run contextkit:build && npm test built ContextKit and passed the ContextKit/meeting tests, then failed on the existing local audio duration assertion in tests/audio-data-size.test.mjs (Duration > 3s, got 0.0s) despite non-empty WAV output.

cursor · 2026-04-21T00:20:25Z

PR Summary

Medium Risk
Changes the MCP tool execution path to synchronously request/poll EvalOps approvals before running tools, which can block or deny actions and introduces new network/offline behaviors that could affect reliability and latency.

Overview
MCP tool execution is now gated by EvalOps approvals: before MCPServerManager.callTool, the executor requests an approval, optionally polls GetApproval for a decision, and denies execution when not approved.
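The gating order described above can be sketched as follows. This is a minimal illustration, not the PR's code: ApprovalGate, ToolRunner, executeGated, and the choice to throw on denial are all hypothetical names and assumptions, since the thread does not show the executor itself.

```typescript
// Hypothetical shapes; the PR's real types are not shown in this thread.
type ApprovalResult = { allowed: boolean; reason?: string }
type ToolCall = { server: string; tool: string; args: Record<string, unknown> }

interface ApprovalGate {
  // Resolves once a decision (or a fallback) is available.
  check(call: ToolCall): Promise<ApprovalResult>
}

interface ToolRunner {
  // Stand-in for the local MCP server manager's tool call.
  callTool(server: string, tool: string, args: Record<string, unknown>): Promise<unknown>
}

// Ask for approval first; only reach the tool runner when the call is allowed.
async function executeGated(
  gate: ApprovalGate,
  runner: ToolRunner,
  call: ToolCall,
): Promise<unknown> {
  const approval = await gate.check(call)
  if (!approval.allowed) {
    // Assumption: denial surfaces as an error; the PR may return a result instead.
    throw new Error(`tool call denied: ${approval.reason ?? 'not approved'}`)
  }
  return runner.callTool(call.server, call.tool, call.args)
}
```

Keeping the gate behind an interface means the executor can be tested with a stub that always allows or always denies, without touching the network.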

Adds a new evalops/approvals module that infers a tool risk level from tool metadata, encodes the tool-call payload and context for EvalOps, supports configurable wait and poll intervals, and falls back to allowing execution when the approvals service is offline or returns an error.
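The fail-open behaviour mentioned here might look roughly like the wrapper below. It is a sketch under assumptions: Decision and withOfflineFallback are hypothetical names, and treating every thrown error as "offline" is a simplification the real module may not share.

```typescript
// Hypothetical decision shape used only for this illustration.
type Decision = { allowed: boolean; offline: boolean; reason?: string }

// Run an approval request and convert any thrown error into an offline "allow".
async function withOfflineFallback(request: () => Promise<Decision>): Promise<Decision> {
  try {
    return await request()
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err)
    // Fail open: the tool call proceeds, but the result is flagged as offline
    // so telemetry can record that no real decision was obtained.
    return { allowed: true, offline: true, reason: `approvals unavailable: ${reason}` }
  }
}
```

Flagging the fallback with offline: true rather than returning a plain allow keeps the two cases distinguishable downstream.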

Extends the EvalOps consumer SDK to support RequestApproval and GetApproval, and annotates WideEvent telemetry with approval risk/offline/decision/request-id fields. Also adds MCPServerManager.getTool to fetch tool metadata for approval context.

Reviewed by Cursor Bugbot for commit 7254668. Bugbot is set up for automated code reviews on this repo.

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is ON, but it could not run because the branch was deleted or merged before autofix could start.


cursor · 2026-04-21T00:25:22Z

+    })
+    const result = decisionResult(response, input.requestId, input.riskLevel)
+    if (result.decision || normalizeState(result.state) === 'resolved') return result
+  }


Polling loop ignores offline results, blocks then denies

High Severity

The waitForDecision polling loop exit condition (result.decision || normalizeState(result.state) === 'resolved') never matches an offline fallback result from decisionResult, because allowOffline sets neither decision nor state. When EvalOps goes offline mid-polling, the loop silently discards the allowed: true offline result, polls for the full 2-minute timeout, and then returns allowed: false — contradicting the fail-open offline policy used everywhere else. The loop needs to also check result.offline to break out and honor the offline fallback.
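One possible shape for the suggested fix is sketched below. PollResult, isTerminal, and waitForDecisionSketch are hypothetical names, and the sketch compares state directly rather than going through the PR's normalizeState helper, whose behaviour is not shown here.

```typescript
// Hypothetical result shape, modelled on the fields named in the review comment.
type PollResult = { allowed: boolean; decision?: string; state?: string; offline?: boolean }

// Stop on a decision, a resolved state, or an offline fallback result.
function isTerminal(result: PollResult): boolean {
  return Boolean(result.decision) || result.state === 'resolved' || result.offline === true
}

async function waitForDecisionSketch(
  poll: () => Promise<PollResult>,
  opts: { timeoutMs: number; intervalMs: number },
): Promise<PollResult> {
  const deadline = Date.now() + opts.timeoutMs
  while (Date.now() < deadline) {
    const result = await poll()
    // An offline result is returned as-is, so its allowed: true is honoured.
    if (isTerminal(result)) return result
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs))
  }
  // Only a genuine timeout with no decision and no offline signal ends in denial.
  return { allowed: false, state: 'timeout' }
}
```

With this exit condition, an approvals outage in the middle of polling ends the wait on the next poll instead of running out the full timeout.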

Additional Locations (1)

src/main/evalops/approvals.ts#L196-L209


cursor · 2026-04-21T00:25:22Z

+
+  if (/\b(delete|remove|destroy|drop|truncate|wipe|erase|format|revoke|kill|shutdown|terminate)\b/u.test(haystack)) {
+    return { level: 'RISK_LEVEL_HIGH', reason: 'destructive tool name or schema' }
+  }


Regex keyword "format" misclassifies tools with JSON Schema annotations

Medium Severity

inferMCPToolRisk stringifies the tool's inputSchema into the haystack, then tests it against the HIGH risk regex which includes the word format. Since "format" is a standard JSON Schema keyword used ubiquitously for date-time, email, URI, and other annotations, any MCP tool with such schema fields will be misclassified as RISK_LEVEL_HIGH ("destructive"). This will cause routine read-only tools to trigger the strictest approval path.
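One way to avoid this false positive is to match only against text the tool author chose (name, description, parameter names) rather than the stringified schema. The sketch below assumes a hypothetical ToolMeta shape and helper names; it is not the PR's inferMCPToolRisk. It also splits identifiers on non-alphanumeric characters, because \b does not treat the underscore in a name like delete_file as a word boundary.

```typescript
// Hypothetical, trimmed-down view of MCP tool metadata.
type ToolMeta = {
  name: string
  description?: string
  inputSchema?: { properties?: Record<string, unknown> }
}

// Same keyword list as the reviewed snippet.
const DESTRUCTIVE =
  /\b(delete|remove|destroy|drop|truncate|wipe|erase|format|revoke|kill|shutdown|terminate)\b/u

// Build the haystack from author-chosen text only: the tool name, its description,
// and parameter names. Schema keywords and values (type, format, enum) are skipped.
function destructiveHaystack(tool: ToolMeta): string {
  const propertyNames = Object.keys(tool.inputSchema?.properties ?? {})
  return [tool.name, tool.description ?? '', ...propertyNames]
    .join(' ')
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, ' ') // delete_file becomes "delete file"
}

function looksDestructive(tool: ToolMeta): boolean {
  return DESTRUCTIVE.test(destructiveHaystack(tool))
}
```

A parameter that is itself named format would still match under this approach, so dropping format from the keyword list or matching it only in the tool name may also be worth considering.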


feat: gate MCP tool calls with EvalOps approvals

7254668

haasonsaas merged commit 631e18b into main Apr 21, 2026
5 checks passed

haasonsaas deleted the codex/evalops-mcp-approvals branch April 21, 2026 00:22

haasonsaas mentioned this pull request Apr 21, 2026

approvals: route MCP tool-call approvals through platform.approvals #17

Closed

cursor Bot reviewed Apr 21, 2026
