Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
60 changes: 60 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -305,6 +305,66 @@ for that specific tag for the per-commit details.
flow + token-issuance endpoint, OR server-side template injection) is its
own design — tracked as a follow-up issue.

- **Production-readiness PR 2 of 5 — resource limits & abuse protection.**
Closes audit findings #2, #3, C1 (HIGH) and #10, #11 (MEDIUM).
- **Cypher transaction timeout** (audit #2). Neo4j embedded
`GraphDatabaseSettings.transaction_timeout = 30s` configured in
`Neo4jConfig` — every transaction in the JVM, including `run_cypher`
and graph traversals, gets a hard wall-clock cap. Catches runaway
variable-length matches before they starve the page cache.
- **Result-set cap on `run_cypher`** (audit #2). Hard row cap at
`mcp.limits.max_results` (default 500); excess rows dropped, response
carries `truncated: true` + `max_results: N`. Defends the JVM heap
against `MATCH (a),(b),(c) RETURN a,b,c LIMIT 999999999` blowups.
- **MCP `traceImpact` depth cap** (audit #10 corrected, C3). New
`mcp.limits.max_depth` field (default 10) wired into
`McpTools.traceImpact` via `Math.min`. Defends against
`RELATES_TO*1..1000` Cartesian explosions on hub nodes.
- **TTL snapshot cache on topology tools** (audit C1). `McpTools.
getCachedData()` now backed by a 60-second TTL snapshot. Without it,
every concurrent `service_dependencies` / `blast_radius` /
`find_path` / `find_bottlenecks` / `find_circular_deps` /
`find_dead_services` / `find_node` call paid the full
`graphStore.findAll()` cost and double-allocated multi-GB heaps.
A bridge fix; the proper refactor (TopologyService → per-tool Cypher)
is a tracked follow-up.
- **Per-client rate limiter** (audit #3). New `RateLimitFilter` using
Bucket4j 8.18.0 (Apache-2.0). Token bucket sized at
`mcp.limits.rate_per_minute` (default 300). Keyed by SHA-256 hash of
the `Authorization` header (so the token never lives in our key map),
falls back to `X-Forwarded-For` (first hop) or `RemoteAddr`. 429
response with `Retry-After`, `X-RateLimit-Limit`, `X-RateLimit-Remaining`
headers. Registered before `BearerAuthFilter` so unauthenticated
brute-force is also throttled.
- **`/api/file` content-type sniff** (audit #11 corrected). Added
`Files.probeContentType` guard — non-text MIMEs (`.jks`, `.so`,
`.png`, native libs) return HTTP 415 with the probed type, instead
of being served as garbled `text/plain`. Allowlist: `text/*`,
`application/json`, `application/xml`, `application/x-yaml`,
`application/javascript`. The byte cap (already enforced by
`SafeFileReader`) is unchanged.
- **Tomcat slow-client tarpit** (audit #11). `server.tomcat.connection-
timeout: 10s`, `max-swallow-size: 1MB` in the serving profile —
drops connections that hold a virtual thread + Tomcat connection at
1 KB/s.
- **CodeQL hardening on the security baseline.** Sanitised request
method + URI before logging in `BearerAuthFilter` (CWE-117 / CodeQL
`java/log-injection`); removed env-var name from the bearer-token
bootstrap log line in `TokenResolver` (CodeQL `java/sensitive-log`);
documented the deliberate stateless-bearer rationale on
`SecurityConfig.csrf(disable)` (CodeQL `java/spring-disabled-csrf-protection`
— no exploit path on a no-cookie surface).
- **Tests:** new `RateLimitFilterTest` (10 cases: under/over limit,
separate buckets per client, header-hashing, X-Forwarded-For
precedence, permit-list, default-rate fallback). Existing 6 test
classes updated for the new `McpTools` ctor signature. Full suite:
3672 tests / 0 failures / 0 errors.

**Known follow-up:** TopologyService still walks the full snapshot
in-memory after the cache hit — long-term plan is to rewrite each
topology tool as a targeted Cypher query so the snapshot isn't needed.
The cache is the bridge; the rewrite reduces peak memory.

## [0.1.0] - 2026-03-28

First general-availability cut. See the
Expand Down
8 changes: 8 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -439,6 +439,14 @@ bean for code paths that haven't been ported yet.
- **Never log the `Authorization` header.** `BearerAuthFilter` deliberately never passes the header value to a logger, even at DEBUG. The rejection log line carries only `method` and `requestURI`. There's a regression test (`tokenValueNeverAppearsInLogs`) that captures all log lines for the filter and asserts the secret substring is absent.
- **`mode=none` + active `serving` profile = startup failure** unless `codeiq.mcp.auth.allow_unauthenticated=true` is **explicitly** set. This is by design — operators must opt into permissive mode deliberately. `mode=mtls` is reserved and currently throws "not yet implemented" (better than silently passing through).
- **`server.error.include-stacktrace: never`** is set in the serving profile as defense-in-depth alongside `GlobalExceptionHandler`. Don't enable it for "easier debugging" — stack frames in the response body leak class names + paths (CWE-209). Use the `request_id` in the envelope to correlate to the WARN log line where the full stack is captured.
- **Cypher transaction wall-clock cap is configured at the DBMS level**, not per-call. `Neo4jConfig.databaseManagementService(...)` sets `GraphDatabaseSettings.transaction_timeout = 30s` so every transaction gets the cap automatically. Don't reach for `graphDb.beginTx(timeout, unit)` overload in tool code — the test suite mocks `beginTx()` with no args and the overload changes the matcher signature, breaking the existing stubs across `McpToolsTest` / `McpToolsExpandedTest` / `McpToolsEvidenceTest`.
- **`McpTools.runCypher` row cap is enforced in the iteration loop, not via `LIMIT`.** After `maxResults` rows are accumulated the loop breaks and the response carries `truncated: true` + `max_results: N`. Don't try to inject `LIMIT N` into the user-supplied query string — that would require parsing the query (and the user's query may already have its own LIMIT).
- **`McpTools.getCachedData()` 60-second TTL snapshot is a bridge fix.** It's NOT the proper solution — the proper solution is to rewrite each topology MCP tool to use a targeted Cypher query so the full graph never needs to live on heap. The cache caps peak memory under concurrent calls but the snapshot itself is still multi-GB on large graphs. When that refactor lands, the `AtomicReference<CachedSnapshot>` and `getCachedData()` itself can be deleted.
- **`RateLimitFilter` keys by `sha256(Authorization)`** — the raw token NEVER goes into the bucket key map. The 16-hex-char digest is enough collision resistance for keying. Falls back to `X-Forwarded-For` (first hop) → `RemoteAddr` when no auth header is present. Buckets live in a `ConcurrentHashMap` — bounded in practice by `num_distinct_clients`, which for the single-tenant pod shape is small. Swap to a Caffeine cache with a max-size eviction if multi-tenant exposure is ever added.
- **Filter chain order in `serving` profile**: `SecurityHeadersFilter` → `RateLimitFilter` → `BearerAuthFilter` → ... → controller. Each `addFilterBefore(X, UsernamePasswordAuthenticationFilter.class)` inserts X immediately before UPAFilter, pushing the previously-inserted filter farther from the target — so the **registration order in `SecurityConfig.servingFilterChain` IS the chain order**. Don't shuffle without re-reasoning about it: if `RateLimitFilter` ran AFTER `BearerAuthFilter`, an unauthenticated brute-force attempt would never get throttled (would just see 401 over and over, hitting the slow path).
- **`Files.probeContentType` is best-effort** — JDK 25 on Linux uses `/etc/mime.types` + magic-byte fallback. It returns `null` if the type can't be determined; treat that as "let it through" (the byte cap in `SafeFileReader` still bounds size). The allowlist for `/api/file` is `text/*` + `application/{json,xml,x-yaml,javascript}` — extending requires adding to the explicit list in `GraphController.readFile`.
- **Sanitize user-controlled values before logging.** `BearerAuthFilter.sanitizeForLog(String)` strips `\p{Cntrl}` and truncates at 256 chars. Use it on anything tainted by `request.getRequestURI()`, `request.getMethod()`, headers, etc. before passing to a logger. CodeQL `java/log-injection` will flag direct `log.warn("... {} ...", request.getRequestURI())` as a vuln.
- **`mcp.limits.max_depth` is a NEW field on `McpLimitsConfig`** (default 10). Audit #10 / C3 — the original audit assumed it existed but it didn't. When adding new MCP traversal tools, cap depth via `Math.min(callerSupplied, maxDepth)` before passing to Cypher. The REST endpoint already had this guard via `config.getMaxDepth()` from `CodeIqConfig`; the MCP path now mirrors it via `McpLimitsConfig.maxDepth()`.

## Supply-chain observability (OpenSSF)

Expand Down
10 changes: 10 additions & 0 deletions pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -160,6 +160,16 @@
<artifactId>spring-boot-starter-security</artifactId>
</dependency>

<!-- Bucket4j: in-process token-bucket rate limiter (Apache-2.0).
Used by RateLimitFilter to throttle /api and /mcp on a per-token /
per-IP key. Single-replica serving = single bucket per key, so no
cluster coordination needed. ~80 KB; pure Java, no native deps. -->
<dependency>
<groupId>com.bucket4j</groupId>
<artifactId>bucket4j_jdk17-core</artifactId>
<version>8.18.0</version>
</dependency>

<!-- Neo4j Embedded (Community Edition) -->
<dependency>
<groupId>org.neo4j</groupId>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -298,6 +298,26 @@ public ResponseEntity<String> readFile(
if (!Files.isRegularFile(resolvedReal)) {
return ResponseEntity.notFound().build();
}
// Reject non-text files early. Without this, .jks keystores, .png images,
// native .so libraries get served as text/plain with garbled UTF-8 — a
// slow client at 1 KB/s holds a virtual thread + Tomcat connection for
// 1000s. Audit #11 (revised): SafeFileReader already enforces the byte
// cap; the gap is the content-type guard.
try {
String probedType = Files.probeContentType(resolvedReal);
if (probedType != null && !probedType.startsWith("text/")
&& !probedType.equals("application/json")
&& !probedType.equals("application/xml")
&& !probedType.equals("application/x-yaml")
&& !probedType.equals("application/javascript")) {
return ResponseEntity.status(HttpStatus.UNSUPPORTED_MEDIA_TYPE)
.contentType(MediaType.TEXT_PLAIN)
.body("File is not a text/source type (probed: " + probedType + ")");
}
} catch (IOException probeFail) {
// probeContentType is best-effort; if it fails, fall through to read.
// SafeFileReader byte cap still bounds the response size.
}
try {
String content = SafeFileReader.read(resolvedReal, startLine, endLine, config.getMaxFileBytes());
return ResponseEntity.ok()
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
import org.springframework.data.neo4j.repository.config.EnableNeo4jRepositories;

import java.nio.file.Path;
import java.time.Duration;
import java.util.Arrays;

/**
Expand All @@ -40,7 +41,12 @@ public class Neo4jConfig {
DatabaseManagementService databaseManagementService(CodeIqConfig config, Environment env) {
var builder = new DatabaseManagementServiceBuilder(Path.of(config.getGraph().getPath()))
.setConfig(BoltConnector.enabled, true)
.setConfig(BoltConnector.listen_address, new SocketAddress("localhost", boltPort));
.setConfig(BoltConnector.listen_address, new SocketAddress("localhost", boltPort))
// Hard wall-clock cap on every transaction. Prevents a runaway Cypher
// (e.g. unbounded variable-length match on a hub node) from hogging
// the page cache and starving readiness/liveness probes. Audit
// finding #2 (HIGH) — runs alongside per-tool timeouts in McpTools.
.setConfig(GraphDatabaseSettings.transaction_timeout, Duration.ofSeconds(30));

// Read-only mode for serving profile — no lock files, no transaction logs.
// Required for read-only filesystems (e.g., AKS with read-only volumes).
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -88,7 +88,8 @@ protected void doFilterInternal(HttpServletRequest request,
// CRITICAL: never log the Authorization header value. Method and
// URI are sanitized with sanitizeForLog (strips \r\n\t — defends
// against CWE-117 log forging via crafted URIs; CodeQL
// java/log-injection).
// java/log-injection). A request line like
// `GET /\nINFO: granted access HTTP/1.1` can't inject fake log lines.
log.warn("Auth rejected: {} {} (request_id={})",
sanitizeForLog(request.getMethod()),
sanitizeForLog(request.getRequestURI()),
Expand Down Expand Up @@ -150,9 +151,11 @@ private static String currentRequestId() {
/**
* Strip CR/LF/TAB before sending request-derived data to a log appender.
* Defends against log forging via crafted URIs (CWE-117 / CodeQL
* {@code java/log-injection}). Using explicit single-char replace chains
* is the pattern CodeQL's standard sanitizer-recognizer matches against.
* Output is also length-capped at 256 chars to prevent log-bomb URIs.
* {@code java/log-injection}). Explicit single-char replace chains are
* the pattern CodeQL's standard sanitizer-recognizer matches against
* — {@code replaceAll("[\p{Cntrl}]", ...)} was not picked up by the
* data-flow analysis. Output is also length-capped at 256 chars to
* prevent log-bomb URIs.
*/
static String sanitizeForLog(String s) {
if (s == null) return "null";
Expand Down
Loading