Security hardening: MCP token-gate/SSRF, tracing scrub, local-router timeout#44
Conversation
Bearer credentials from plugin-supplied .mcp.json could leak in two ways: sent in cleartext to a public http:// server, or exfiltrated to a cloud-metadata endpoint via a crafted url. Add src/core/mcp/urlSecurity.ts with two guards: - shouldSendBearer: attach Authorization only over https, a private/ loopback/CGNAT host, or an ISAAC_MCP_ALLOW_HOSTS-listed host. - assertMcpUrlAllowed: refuse known cloud-metadata endpoints unless allowlisted. Wire both into the loader (fail-closed, validated before the server id is claimed so a rejected entry can't shadow a valid same-id server) and re-check at connect time as defense-in-depth. Also fixes the test-only type errors that broke check-types after the http-transport union.
chat() called fetch() with no AbortSignal or timeout, so an unresponsive worker would hang the agent indefinitely. The streaming path already has total + idle timers; the non-streaming path produces no chunks, so only a wall-clock total timeout applies. Mirror chatStream()'s combineAbortSignals pattern and wrap the AbortError as LocalRouterTimeoutError so callers can fall back on it.
scrubSecrets was applied to planner/tool fields but not to meta.json (gateway_url could carry inline creds, worker endpoints) nor to the errors[] array (an error string may echo a failed token). Scrub at the single persistMeta() point (returns a deep copy, so in-memory meta stays intact for later merges) and scrub errors[] in appendTurn. Note: api_conversation_history.json is deliberately NOT scrubbed here - it is replayed to resume a task, so redacting it would change behaviour; that is a separate decision.
The cli tsconfig uses the classic JSX runtime, so React must be in scope for the JSX in this test (tsc TS2686). biome's noUnusedImports didn't see the usage; add a scoped biome-ignore instead of dropping the import.
Two hardening fixes in the emulated tool-call parser: - Bash commands parsed from a model's markdown fence defaulted to requires_approval:false, letting e.g. 'rm -rf' auto-run. Default to true so they hit the approval gate; auto-approve modes still bypass. - The plain-function extractor interpolated the tool name into a RegExp unescaped; a name with regex metacharacters would corrupt the pattern. Escape it via escapeRegExp.
- ResponseCache evicted by LRU without first reclaiming expired entries, so a cache full of stale entries could evict a fresh one. Reclaim expired entries before the LRU eviction. - HealthMonitor shared one AbortController + 5s timer across the /health and /v1/models attempts; the fallback could start already-aborted and mark an up worker as down. Give each ping its own controller/timeout.
api_conversation_history.json is a plaintext copy of the full LLM exchange and may carry secrets. It is replayed to resume a task, so it is deliberately not scrubbed; restrict it to the owner instead via a new optional mode arg on atomicWriteFile.
Rebrand residual user-visible 'dirac' references that do not affect storage compat: telemetry event dirac_cli -> isaac_cli, exports JSDoc, CLI help fixtures in index.test, DEVELOPMENT.md command examples, and two code comments. Storage paths (~/.dirac, DIRAC_DIR, .diracrules) are intentionally left for backward compat per src/CLAUDE.md.
Pre-PR critic review (verdict ACCEPT-WITH-RESERVATIONS, agent a9be496e52ccc434a) found that http://[::ffff:169.254.169.254]/ and the hextet form ::ffff:a9fe:a9fe bypassed the metadata SSRF guard, and that IPv4-mapped private addresses were wrongly treated as public by the token-gate. normalizeHost now collapses IPv4-mapped IPv6 to dotted IPv4 and strips a trailing FQDN dot, so both guards see the real address. Adds tests for the mapped/trailing-dot evasions.
There was a problem hiding this comment.
Pull request overview
Security-hardening pass on the ISAAC fork: gates MCP HTTP bearer tokens behind a TLS / private-host policy, blocks SSRF to cloud-metadata endpoints, scrubs secrets at the JSONL tracing persist points, restricts conversation-history file mode to 0600, and tightens local-router robustness (timeouts, regex injection, cache eviction). Also picks up residual Dirac→ISAAC rebrand strings and drops kanban CLI references.
Changes:
- New
urlSecuritymodule (shouldSendBearertoken-gate +assertMcpUrlAllowedSSRF guard) wired into the MCP loader (fail-closed, before id is claimed) and re-checked at connect time. JsonlTracernow scrubs secrets (deep-copy) onpersistMeta()and onerrors[];api_conversation_history.jsonis written0600.- LocalRouter: non-streaming
chat()now has wall-clock timeout/AbortSignal;execute_commandparsed from markdown fences defaults torequires_approval: true; tool names are escaped before regex use;ResponseCachereclaims expired entries before LRU eviction;HealthMonitoruses a fresh controller/timeout per ping. - Branding/rebrand: telemetry event id, JSDoc/module name, CLI test help fixtures, DEVELOPMENT.md examples; kanban command/option/test removed in CLI.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| src/core/mcp/urlSecurity.ts | New SSRF + token-gate primitives with IPv4-mapped IPv6 / trailing-dot normalization. |
| src/core/mcp/McpServerConfigLoader.ts | Validate + resolve before claiming server id; enforce SSRF guard and token-gate fail-closed. |
| src/core/mcp/McpClientManager.ts | Defense-in-depth SSRF re-check at connect time. |
| src/core/mcp/tests/urlSecurity.test.ts | Coverage for SSRF guard, token-gate, allowlist override. |
| src/core/mcp/tests/McpServerConfigLoader.test.ts | Adds SSRF-skip / token-gate-skip / private-host-allow cases; type narrows for stdio fields. |
| src/core/tracing/JsonlTracer.ts | Scrub secrets when writing meta.json and inside errors[]. |
| cli/tests/tracing/JsonlTracer.test.ts | Verifies inline-cred scrub and error-string scrub. |
| src/core/storage/disk.ts | Optional mode for atomicWriteFile; saves api conversation history as 0600. |
| src/services/local-router/LocalRouter.ts | Adds escapeRegExp, total timeout in chat(), defaults parsed bash commands to require approval. |
| src/services/local-router/HealthMonitor.ts | Per-attempt controller/timeout via pingOnce. |
| src/services/local-router/ResponseCache.ts | Reclaim expired entries before LRU eviction. |
| src/services/local-router/tests/LocalRouter.test.ts | Asserts requires_approval: true default for fenced commands. |
| cli/src/init.ts | Telemetry event renamed dirac_cli → isaac_cli. |
| cli/src/exports.ts | JSDoc rebranded to ISAAC / isaac-cli. |
| cli/src/index.test.ts | Removes kanban command/flag tests. |
| cli/src/context/StdinContext.tsx, StdinContext.test.tsx | Comment rebrand; biome-ignore for React import in classic JSX runtime. |
| cli/src/utils/piped.ts | Doc-comment rebrand to isaac. |
| cli/DEVELOPMENT.md | Rebrand examples and headings. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const cleanup = () => { | ||
| clearTimeout(totalTimer) | ||
| detachSignals() | ||
| if (!controller.signal.aborted) controller.abort() | ||
| } |
| if ((token && !hasAuth) || hasAuth) { | ||
| if (!shouldSendBearer(url)) { | ||
| throw new Error( | ||
| `refusing to send Authorization over insecure channel to ${url} ` + | ||
| `(use https, a private/loopback/Tailscale host, or add the host to ISAAC_MCP_ALLOW_HOSTS)`, | ||
| ) | ||
| } | ||
| } |
| function envAllowlist(): Set<string> { | ||
| const raw = process.env.ISAAC_MCP_ALLOW_HOSTS | ||
| if (!raw) return new Set() | ||
| return new Set( | ||
| raw | ||
| .split(",") | ||
| .map((h) => h.trim().toLowerCase()) | ||
| .filter(Boolean), | ||
| ) | ||
| } |
| function normalizeHost(hostname: string): string { | ||
| let h = hostname.replace(/^\[/, "").replace(/\]$/, "").toLowerCase() | ||
| if (h.length > 1 && h.endsWith(".")) h = h.slice(0, -1) | ||
| const mappedHex = h.match(/^::ffff:([0-9a-f]{1,4}):([0-9a-f]{1,4})$/) | ||
| if (mappedHex) { | ||
| const hi = Number.parseInt(mappedHex[1], 16) | ||
| const lo = Number.parseInt(mappedHex[2], 16) | ||
| return `${hi >> 8}.${hi & 0xff}.${lo >> 8}.${lo & 0xff}` | ||
| } | ||
| const mappedDotted = h.match(/^::ffff:(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})$/) | ||
| if (mappedDotted) return mappedDotted[1] | ||
| return h | ||
| } |
| // Defense-in-depth: the loader already enforces this, but re-check at | ||
| // connect time in case a config reached us through another path. | ||
| const verdict = assertMcpUrlAllowed(cfg.url) | ||
| if (!verdict.ok) throw new Error(`MCP server "${serverId}": ${verdict.reason}`) |
Root mocha suite — verified under Node LTSThe PR body noted
Identical counts → this branch introduces zero mocha regressions. The 12 The 25 remaining failures are pre-existing on master and unrelated to this PR (Prompt System snapshot tests, Fireworks/OpenRouter handlers, tree-sitter Python syntax, RenameSymbol/Subagent handler mocks, Note: the |
Security hardening — MCP auth, tracing redaction, local-router robustness
Issues surfaced by a deep code+functional review of the ISAAC delta and recent commits. Each fix is an atomic commit; CI-relevant checks are green.
MCP HTTP servers (bearer auth, SSRF)
shouldSendBearer):Authorization: Beareris attached only overhttps://, a private/loopback/CGNAT-Tailscale host, or a host inISAAC_MCP_ALLOW_HOSTS. A token never leaves the machine in cleartext to a publichttp://endpoint.assertMcpUrlAllowed): refuses known cloud-metadata endpoints (169.254.169.254,metadata.google.internal/.goog,100.100.100.200,fd00:ec2::254), incl. IPv4-mapped IPv6 (::ffff:…) and trailing-FQDN-dot evasions; override viaISAAC_MCP_ALLOW_HOSTS.Tracing (EU AI Act audit artifacts)
persistMeta()point (coversgateway_urlinline creds + worker endpoints) and in theerrors[]array.scrubSecretsreturns a deep copy so in-memory meta stays intact for later merges.local-router
chat()now has a wall-clock timeout + AbortSignal (mirrorschatStream()); an unresponsive worker no longer hangs the agent indefinitely.requires_approval: true(untrusted → must hit the approval gate; auto-approve modes are unaffected).escapeRegExpon tool names in the plain-function extractor (RegExp injection).ResponseCachereclaims expired entries before LRU eviction;HealthMonitoruses a fresh AbortController/timeout per ping attempt.Storage
api_conversation_history.jsonis written0600. It is replayed to resume a task, so it is deliberately not scrubbed — restricted to the owner instead.Branding
dirac_cli→isaac_cli, exports JSDoc, CLI help fixtures, DEVELOPMENT.md examples, comments. Storage paths (~/.dirac,DIRAC_DIR,.diracrules) intentionally kept for backward compat persrc/CLAUDE.md.Verification
check-types,lint(biome),test:unit:mcp(76),cli:test(538), tracing tests.test:unit:mochanot run here: blocked by an environment mismatch (Node v26 vs.nvmrc lts/*;better-sqlite3has no prebuilt binary for node-v147 and the source rebuild fails). Same exit 37 as the pre-change baseline — no regression from this branch. Requires Node LTS to run.1990405.Deferred (follow-up)
api_conversation_history.json(scrub vs documented internal state) beyond the 0600 hardening.~/.dirac → ~/.isaac(needs a migration plan persrc/CLAUDE.md).requires_approval: false(moot — extractor overrides to true).