feat(config): layered config tool filter — config_denied over user toggles#468
Conversation
|
Thanks @nlaurance — clean implementation and genuinely thorough tests. I support the idea of config-based tool filtering — declaring tool visibility in My concern is purely the interaction model with the per-tool enable/disable that recently landed (#463 Web UI, #447, REST/CLI
Definition I'd like us to agree onFor me, config-based filtering should mean hard-off for unwanted tools, and for that to be solid it has to be transparent, predictable, and surfaced in the UI and CLI — never a silent toggle that snaps back. Precisely:
Spec use cases
Implementation-wise this means config should override the stored DB enable state at evaluation time rather than overwriting it: evaluate the config layer in Does this definition work for you? If we're aligned on it, I'm happy to pair on this PR and convert the current behavior to override the DB option after config read along these lines — the config fields + mutual-exclusion validation you've already written are solid and stay as-is. |
|
now rebased with changes suggested |
…enylist Adds two mutually exclusive fields to ServerConfig that let operators declare tool visibility statically in mcp_config.json rather than having to call the API or CLI after every fresh install. enabled_tools: ["list_issues", "get_issue"] // allowlist — only these visible disabled_tools: ["delete_repo", "force_push"] // denylist — hide these, allow rest Config validation rejects a server that has both fields set. On every applyDifferentialToolUpdate (server connect / tool refresh), applyConfigToolFilter walks the in-memory config, computes the desired enabled/disabled state for each discovered tool, and calls setToolEnabledNoEmit to persist it in BBolt. All existing enforcement paths (isToolCallable, retrieve_tools pre-ranking, call_tool_*) pick up the change automatically with no further modifications. Five unit tests cover: allowlist disables unlisted tools, allowlist re-enables a tool moved back into the list, denylist disables listed tools, no-op when neither field is set, and end-to-end integration through applyDifferentialToolUpdate. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lConfigDenied Replace the BBolt-writing applyConfigToolFilter function (which overwrote user preferences on every reconnect, emitted spurious audit events, and had no provenance) with an evaluation-time IsToolConfigDenied method that: - Reads config at call time, never writes to BBolt - Preserves user-set tool preferences and audit trail - Enables separation of concerns: config vs user intent Key changes: - Delete applyConfigToolFilter from tool_quarantine.go (63 lines removed) - Add IsToolConfigDenied(serverName, toolName string) bool to Runtime - Remove applyConfigToolFilter call from lifecycle.go - Rewrite tool_config_filter_test.go: 5 new tests for IsToolConfigDenied * AllowList: tools not listed are denied * DenyList: listed tools are denied * NoFilter: all tools allowed when config has no filter * UnknownServer: returns false for missing servers * UserDisabledPreserved: BBolt state is independent from config layer All 198 runtime tests pass. No behavior change to actual tool visibility— the config layer is now just evaluated at call time instead of persisted. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add IsToolConfigDenied delegation on *Server and insert a config-layer check in isToolCallable so tools denied by enabled_tools/disabled_tools in the server config are hard-off at MCP call time, evaluated at runtime without touching BBolt storage. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…fig-denied tools - Add ConfigDenied bool field to contracts.Tool (json: config_denied,omitempty) - Enrich config_denied in handleGetServerTools via IsToolConfigDenied interface - Return HTTP 409 in handleSetToolEnabled when req.Enabled is true for a config-denied tool - Remove debug fmt.Printf lines from enrichment loop; use logger.Debug instead Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extended the Tool interface to include runtime fields that the backend sends and Vue components use: server_name, schema, usage, last_used, approval_status, disabled, and config_denied. This allows proper typing of these fields in the frontend instead of using unsafe casts. Simplified type assertions in isToolConfigDenied and isToolEnabled functions to use the properly-typed Tool interface directly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
d349489 to
3f24ce3
Compare
…mcp-proxy#468 smart-mcp-proxy#1-smart-mcp-proxy#4) smart-mcp-proxy#1 (bug): config-denied tools returned the generic 'Tool is disabled and not callable' over MCP — identical to a user-toggled tool, but the remediation differs (edit mcp_config.json vs a UI toggle that 409s). blockedToolMessage() now branches on IsToolConfigDenied and tells the agent it is operator policy, not user-overridable. Split into a pure blockedToolMessageFor(bool) with a dedicated unit test. smart-mcp-proxy#2: config-lock badges badge-error (red/alarming) -> badge-neutral + 🔒; a policy lock is not an error. smart-mcp-proxy#3: 'Enable All' toast now reports 'N tools remain locked by config' (client-side from config_denied; no API contract change). smart-mcp-proxy#4: unified copy — stray 'config locked' label -> '🔒 locked by config'. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Dumbris
left a comment
There was a problem hiding this comment.
LGTM. Layered config tool filter is the right design (config restricts, never force-on; user prefs preserved across reconnect). Added the 4 UX-review fixes in 7aa683b: status-aware TOOL_BLOCKED message (config-policy vs user-disable), neutral lock badges, Enable-All skip feedback, copy consistency. All CI green; local config/runtime/storage/httpapi + server blocked-tool unit tests pass. Follow-up (agent-discoverable disabled tools) tracked as spec 049.
* docs(spec): agent-discoverable disabled tools design Follow-up to PR #468. Opt-in include_disabled on retrieve_tools + conditional counts on upstream_servers, 5-state classifier, reactive discovery nudges. Token-cost-zero until exercised. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(049): speckit spec for agent-discoverable disabled tools Converts the brainstormed design (docs/superpowers/specs/2026-05-18-...) into canonical speckit format. Opt-in include_disabled discovery, 5-state status + remediation map, conditional per-server counts, reactive nudges. No enforcement/storage change. Related #468 ## Changes - specs/049-agent-discoverable-disabled-tools/spec.md - specs/049-agent-discoverable-disabled-tools/checklists/requirements.md ## Testing - N/A (spec only); requirements checklist all items pass * docs(049): speckit plan + phase-0/1 artifacts Related #468 ## Changes - plan.md (constitution check PASS, no violations) - research.md (decisions + codebase facts; no open unknowns) - data-model.md (5-state status enum, additive response shapes) - contracts/mcp-deltas.md (retrieve_tools + upstream_servers deltas) - quickstart.md (curl + live-MCP verification recipe) - CLAUDE.md active-technologies updated by speckit agent-context ## Testing - N/A (planning docs only) * docs(049): speckit tasks (18, TDD, by user story) Related #468 ## Changes - tasks.md: Setup, Foundational (classifier+types), US1 MVP, US2, US3, Polish ## Testing - N/A (task plan only) * feat(049): foundational classifier + additive response types Related #468 ## Changes - contracts: DisabledToolStatus consts, LockedToolEntry, ServerToolCounts - runtime: ClassifyDisabledTool (pure, 5-state precedence, read-only) - tests: precedence/unknown/pending table + 1k benchmark (6.6 ns/op) ## Testing - go test ./internal/runtime -run TestClassifyDisabledTool: PASS - BenchmarkClassifyDisabledTool: 6.6 ns/op (<<100ms budget) * feat(049): US1 — opt-in include_disabled tool discovery Related #468 ## Changes - retrieve_tools: include_disabled param + one-line schema hint - handler: split callable/disabled, agent-scope before classify, cap min(limit,10), once-per-response remediation map - p.classifyDisabledTool (storage-sourced, mirrors runtime classifier) - in-memory include_disabled usage counter (no persistence) ## Testing - TestDisabledDiscovery_{DefaultPathUnchanged,OptIn,CapAtTen,UsageCounter}: PASS - regressions (ExcludesDisabled, CallBlockedTool, BlockedToolMessageFor): PASS * feat(049): US2 — discovery pointer + zero-result nudge Related #468 ## Changes - blockedToolMessageFor: append include_disabled:true pointer (both branches) - retrieve_tools: when 0 callable + droppedCount>0 + flag off, emit a one-line 'notice' count nudge (no inline entries) ## Testing - TestBlockedToolMessage_DiscoveryPointer, TestDisabledDiscovery_ZeroResultNudge: PASS * feat(049): US3 — conditional per-server tool counts Related #468 ## Changes - upstream_servers list/get: emit ServerToolCounts only when a non-callable count > 0 (all-callable servers gain 0 bytes) - classifyServerToolStatus: single storage-sourced truth (runtime- independent); classifyDisabledTool now delegates to it (no drift) ## Testing - TestServerToolCounts_Conditional + full server/runtime suites: PASS * docs(049): built-in tools note for include_disabled + server counts Related #468 * test(049): live curl+MCP verification results + task closeout Related #468 ## Changes - quickstart.md: verification-results table (§1-§5 PASS), documents the pre-existing #468 config->runtime disabled_tools gap (out of 049 scope) - tasks.md: all phases checked off ## Testing - runtime/server/contracts unit suites: PASS; verify-oas: PASS - live MCP+curl: default unchanged, include_disabled, counts, blocked msg * docs(049): trim CLAUDE.md additions (speckit auto-append cruft + terser notes) Related #468 check-size is pre-existing/advisory (CLAUDE.md was 39347 chars at base, already >25k; #468 merged through the same failing check). This commit minimizes 049's footprint regardless. --------- Co-authored-by: Claude Code <noreply@anthropic.com>
…477) The merged #476 classifyServerToolStatus read config-denial only from the storage ServerConfig copy. Config-file disabled_tools on stdio servers live in the live runtime config and are not always mirrored to storage, so disabled_by_config classification + upstream_servers counts were wrong at runtime for that case (unit tests passed because the no-runtime harness uses the storage fallback path). Prefer p.mainServer.runtime.IsToolConfigDenied (same authority isToolCallable/blockedToolMessage use); fall back to storage IsToolAllowedByConfig only when no runtime is wired. Adds a runtime regression test pinning that ClassifyDisabledTool reads live config (the authority the server layer now delegates to). Related #468
Pairs with the Runtime.Close Docker-verify fix. The fast unit step now passes -short (activating the testing.Short guards) with a 4m race timeout for headroom; the heavy property/timing tests run unguarded in the main-only stress-tests job so coverage is preserved. Related #468
…waste (#478) * fix(ci): skip Docker-cleanup verification when isolation unused Root cause of the chronic E2E red (failing on every main commit for weeks; v0.31.0/.1 shipped through it): Runtime.Close() and Server.Stop() always ran a Docker container-cleanup verification — a docker ps probe + a 15x1s poll loop + 2s sleep (~17s) — even when no Docker isolation was ever used. Every runtime-constructing unit test paid ~17s in t.Cleanup; across ~dozens of tests under -race the internal/runtime package hung past 12 min, blowing CI's -race -timeout 2m step. Fix: add Manager.UsesDockerIsolation() (same predicate as the Docker recovery monitor) and gate both cleanup-verification blocks on it. No isolation => no managed containers => nothing to verify. Production behavior with Docker isolation is unchanged. Also add testing.Short() guards to the heaviest property/timing tests (TestRapidQuarantineStateMachine ~270s, TestRapidInvariant ~70s, and five ~18s timing tests) so the fast unit step stays lean; they still run in the main-only stress-tests job. Result (local, exact CI condition -short -race -timeout 2m): - internal/runtime: hang >12m -> ok 20s - internal/server: slow -> ok 18s - internal/upstream/contracts: ok Related #468 * ci(e2e): -short the fast unit step; heavy tests move to stress job Pairs with the Runtime.Close Docker-verify fix. The fast unit step now passes -short (activating the testing.Short guards) with a 4m race timeout for headroom; the heavy property/timing tests run unguarded in the main-only stress-tests job so coverage is preserved. Related #468
The first revision string-interpolated t.TempDir() into a JSON literal; on windows-amd64 the backslash path produced invalid JSON, failing the test (caught by the Build Binaries windows job). Marshal the config via encoding/json so the path is escaped on every OS. Related #468
…ression guard) (#479) * fix(configsvc): deep-copy tool-filter slices in Snapshot.Clone Investigation of the reported 'config-file disabled_tools does not reach runtime' concern found NO reproducible code bug: every boundary (file parse, storage Save/Get/List, and the full path parse->New->LoadConfiguredServers->SaveConfiguration->IsToolConfigDenied) provably preserves enabled_tools/disabled_tools. The original manual observation was a test-environment artifact (reused /tmp data-dir while SaveConfiguration rewrites the config file + hot-reload watcher). The investigation did surface one real latent defect: Snapshot.Clone() deep-copied Args/Headers/Env/OAuth/Isolation but shallow-aliased the spec-049 EnabledTools/DisabledTools slices via 'clonedSrv := *srv'. Not data loss (reads see the aliased value) but an immutability violation: mutating a cloned config's filter would corrupt the shared backing array. - Deep-copy EnabledTools/DisabledTools in Snapshot.Clone - TestSnapshotClone_DeepCopiesToolFilters (aliasing regression) - TestConfigFileToolFilter_ReachesRuntime (permanent full-path guard) - correct the spec-049 quickstart note (no #468 gap exists) Related #468 * test(049): make TestConfigFileToolFilter_ReachesRuntime cross-platform The first revision string-interpolated t.TempDir() into a JSON literal; on windows-amd64 the backslash path produced invalid JSON, failing the test (caught by the Build Binaries windows job). Marshal the config via encoding/json so the path is escaped on every OS. Related #468
Summary
applyConfigToolFilter(which overwrote user-ownedDisabledflags on every reconnect) with a pure evaluation-time config layerenabled_tools/disabled_toolsinmcp_config.jsonare now enforced without touching user preferences, emitting spurious audit events, or silently re-enabling tools the user manually disabledWhat changed
Backend
ServerConfig.IsToolAllowedByConfig(toolName)— pure config helperRuntime.IsToolConfigDenied(serverName, toolName)— evaluation-time check, no BBolt writesisToolCallable(MCP enforcement chokepoint) checks config layer before BBoltrecord.Disabled— config-denied tools are hard off across all paths (call_tool_*,retrieve_tools,upstream_serverstool counts)SetAllToolsEnabledskips config-denied tools so bulk Enable All never sets a misleadingDisabled=falseon a hard-off toolGET /api/v1/servers/{id}/toolsexposesconfig_denied: trueper toolPOST /api/v1/servers/{id}/tools/{tool}/enabledreturns HTTP 409 when trying to enable a config-denied toolUpstreamRecordnow persistsEnabledTools/DisabledToolsthrough BBolt round-trips (bug fix)Frontend
config_denied: trueshow alocked by configbadge and a read-only label instead of a toggleToolinterface inapi.tsnow properly typesdisabled,config_denied,approval_status, etc.Effective visibility model
Config can only restrict further — it can never force a tool back on against the user.
Test Plan
go test ./internal/config/... ./internal/runtime/... ./internal/server/... ./internal/httpapi/... ./internal/storage/...— all passcd frontend && npm run build— clean buildenabled_tools: ["list_issues"]→create_issueunreachable via MCP, shows locked in UIdisabled_tools: ["delete_repo"]→delete_repounreachable, locked in UIPOST /api/v1/servers/{id}/tools/delete_repo/enabledwith{"enabled": true}→ 409 Conflict🤖 Generated with Claude Code