fix(DDE-365): security hardening — leak #1–#17#7
Open
nc-markovic wants to merge 21 commits into
Open
Conversation
Guard the OAuth token mint path in AEMFetch.getAuthToken: when IMS returns expires_in <= 60 (or NaN/undefined), tokenExpiry would be placed at-or-before 'now', forcing an IMS round-trip on every request. Single-flight dedups concurrent callers within a tick but cannot prevent the per-call burn. Throw a readable error instead of accepting the unusable token.
Replace cors({ origin: '*' }) with a narrow allowlist enforced ahead of the
cors middleware. Defaults cover the MCP Inspector (UI :6274, proxy :6277) on
both localhost and 127.0.0.1; extra origins via repeatable --allow-origin CLI
flag or comma-separated MCP_ALLOWED_ORIGINS env var. Disallowed Origin returns
HTTP 403 + JSON-RPC -32600 body. Requests with no Origin (CLI curl) pass
through unchanged. Access-Control-Expose-Headers: Mcp-Session-Id preserved.
Without an explicit bind argument, Express's `app.listen(port, …)` binds 0.0.0.0, exposing /mcp to anyone on the same LAN (cafe WiFi, office, hotel) who can then drive every tool with whatever AEM credentials the server was launched with. Default to 127.0.0.1; add a `--bind` CLI flag and `MCP_BIND` env var for explicit all-interfaces opt-in. CLI flag overrides env. No auth flow or MCP protocol surface changed.
Replace the unconditional process.exit(0) on SIGINT with a drain sequence: server.close() to stop accepting new connections, closeIdleConnections() to drop keep-alive sockets, transports[].close() for MCP session teardown, then wait up to --shutdown-drain-seconds (env: MCP_SHUTDOWN_DRAIN_SECONDS, default 60s) for in-flight tool calls to finish. Worst-case bulk operations (bulkUpdateComponents) can hold for minutes — a shorter window would still abandon work. Wire the same handler to SIGTERM. Exit code semantics: clean drain -> 0; forced timeout -> 1 so orchestrators (Kubernetes, systemd) can distinguish "everything finished" from "had to abandon pending work" and surface alerts on the latter. A "0" on timeout would hide hostile keep-opens or stuck handlers from operators. Add uncaughtException / unhandledRejection fallbacks: log to stderr with the "[fatal]" prefix, sync-close transports (process state is undefined per Node docs — no async drain), exit 1.
Two workflow rules added to make Claude's behaviour in this repo predictable without needing to repeat the instructions every session: - `Clarify before deciding` — before any non-trivial decision (credentials, destructive vs read-only, scope, approach against real environments), ask the user first with concrete options. Do not infer or default-to- destructive just because a plan lists a destructive path. - `Testing workflow` — for items with a `How to Test` block, run every terminal-driven test yourself (ad-hoc Node harnesses under src/test/, curl against a locally-started server, port checks). Show results and only delegate to the user for tests that genuinely cannot run from a session (browser UIs, real cloud tenants, multi-machine signals) — with paste- ready commands when handing off. Typecheck is necessary but not sufficient.
…init Replace the 400 Bad Request on unknown Mcp-Session-Id with 404 Not Found per the MCP StreamableHTTP specification §6.3. Clients that send a stale session ID (e.g. after a server restart) must receive 404 so that inspection tooling like the MCP Inspector can detect the lost session and trigger its automatic reinitialization loop. A 400 leaves the Inspector stuck on a generic error with no recovery path, requiring a manual reconnect. Also tighten the JSON-RPC error code from -32000 (generic server error) to -32001 (session not found) so programmatic clients can distinguish a bad-session rejection from an internal server fault.
6 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Security-hardening series covering 17 leaks identified during a code audit
(see
plan/mcp-cert-auth-implementation.mdfor the full plan). Each leak isa discrete commit on this branch so that
git log --grep "leak #"andgit bisectwork cleanly. Numbering follows commit-count convention on thebranch —
leak #11/#12/#13were folded into earlier fixes or retiredduring the audit and intentionally skipped to keep numbering aligned with
the branch's commit count.
Leaks shipped
a9a1a59f6f12dbf586da618d478011e4964fe6ad4e77427a6016776bee306746c2c571AEM_IMS_URLa647b0cexpires_in <= 60to prevent IMS stormc1dcc36Originheader (allowlist) — prevent DNS rebinding3216fc5be4cb2ed5eab21Two
docs:commits codify project-level workflow rules inCLAUDE.md(
Testing workflowandClarify before deciding).Highlights
cors({ origin: '*' })replaced with anarrow allowlist (Inspector :6274/:6277 on
localhostand127.0.0.1).Extra origins via
--allow-originflag orMCP_ALLOWED_ORIGINSenv.Disallowed Origin → HTTP 403 + JSON-RPC
-32600body.app.listen(port)defaulted to0.0.0.0—anyone on the same WiFi could drive every tool. Now defaults to
127.0.0.1.--bind 0.0.0.0/MCP_BIND=0.0.0.0for explicit opt-in.60s default budget (
--shutdown-drain-seconds/MCP_SHUTDOWN_DRAIN_SECONDS)so multi-minute bulk operations finish cleanly. Exit code is
0on cleandrain,
1on forced timeout so orchestrators can alert on stuck handlers.Adds
uncaughtException/unhandledRejectionfallbacks (sync-onlycleanup + exit 1).
inflightTokendedupesconcurrent 401-retry refreshes (5 parallel callers → 1 IMS call).
expires_in <= 60guard rejects unusable tokens that would force aper-request IMS round-trip.
Test plan
No test runner is configured — each leak ships with an ad-hoc Node harness
under
src/test/(gitignored) that exercises the affected code path. Forthe four hardening leaks merged in this push:
npm run build— typecheck cleannode src/test/test-leak-14.mjs— 12 checks (expires_in guard, leak Release merge #5/feat(DDE-365): stdio transport — subprocess mode for Claude Desktop / Cursor / VS Code #9 regression)node src/test/test-leak-15.mjs— 13 checks (Origin allowlist + preflight + CORS headers)node src/test/test-leak-16.mjs— 10 checks (loopback default,--bind, env, CLI overrides env)node src/test/test-leak-17.mjs— 19 checks (SIGINT/SIGTERM drain, deadline timeout, fatal handlers)node src/test/test-leak-17-real-aem.mjs— 9 checks (drain against real AEM atlocalhost:4502)127.0.0.1:8502/mcp(manual — recommended after merge)Out of scope
Cert-auth (mTLS) is not in this PR — it lands as
feat #1–#8on aseparate branch (
feat/DDE-XXX-cert-auth-mtls) from a hardenedmainafter this merges. See
plan/mcp-cert-auth-implementation.md§"Cert-Auth Feature".