diff --git a/AGENTS.md b/AGENTS.md
deleted file mode 100644
index ff15c373..00000000
--- a/AGENTS.md
+++ /dev/null
@@ -1,52 +0,0 @@
-# AGENTS.md — codeiq
-
-> **Repo-root entry point for any agent collaborator.** This file is intentionally short and lists pointers; the canonical contents live elsewhere and are linked from here.
-
-## What this repo is
-
-codeiq is a CLI + read-only stdio MCP server that builds a deterministic code-knowledge graph over a codebase. No AI in the index/enrich pipeline; LLM use is opt-in via `codeiq review`. Single static Go binary (CGO for Kuzu + SQLite). See [`/CLAUDE.md`](CLAUDE.md) for the architecture, package map, pipeline, conventions, and gotchas.
-
-## Pointers, in priority order
-
-1. **Read [`/CLAUDE.md`](CLAUDE.md) first.** It is the SSoT for architecture, build/test commands, package layout, and the long-tail of "things that bite you on this codebase."
-2. **Then [`/shared/runbooks/engineering-standards.md`](shared/runbooks/engineering-standards.md).** Coverage, CVE, signed-commits, and quality-gate policy.
-3. **Then the runbooks you'll actually need:**
-   - [`shared/runbooks/first-time-setup.md`](shared/runbooks/first-time-setup.md) — get from clean clone to green local build.
-   - [`shared/runbooks/release.md`](shared/runbooks/release.md) — how to ship; gates downstream RAN-* product work.
-   - [`shared/runbooks/rollback.md`](shared/runbooks/rollback.md) — when a ship goes bad.
-4. **Security**: [`/SECURITY.md`](SECURITY.md) for disclosure; private reports only.
-
-## Hard rules for any agent doing work in this repo
-
-- **Branch off `main`.** Never commit to `main` directly.
-- **Sign every commit.** The repo-local config (`scripts/setup-git-signed.sh`) makes this automatic; do not rewrite it.
-- **One logical change per commit.** Conventional-commit subjects (`feat:`, `fix:`, `chore:`, `refactor:`, `test:`, `docs:`, `perf:`).
-- **Squash-merge only.** Branch protection rejects merge commits and force-pushes to `main`.
-- **Tests + race + vet must pass.** `CGO_ENABLED=1 go test ./... -count=1` is the contract; release CI runs `-race` too. 880+ tests today.
-- **Determinism is non-negotiable.** Same input → same output, byte-for-byte. Any new detector ships with a determinism test.
-- **Read-only MCP server.** Tool calls never write to the graph. Index/enrich happen only via the CLI commands `codeiq index` / `codeiq enrich`. The Java reference's REST API + React SPA were deleted in Phase 6 cutover (#132) and will not be reintroduced.
-- **No secrets in code.** Repo-level GitHub Actions secrets only.
-
-## Paperclip / RAN-* coordination
-
-This codebase tracks work in Paperclip under the `RAN-*` prefix. When you pick up a task:
-
-1. Checkout the issue (`POST /api/issues/{id}/checkout`) before you start.
-2. Comment progress on every heartbeat — terse markdown, link the PR.
-3. Branch protection requires TechLead approval; route review there.
-4. Reference the issue in your commit/PR body (`Closes RAN-N`).
-
-If the task asks for product/feature work and `shared/runbooks/release.md` is missing on `main`, **stop**: the RAN-46 bootstrap precondition has not landed yet and product work is gated on it.
-
-## Auth escalation
-
-If you hit something requiring GitHub App / PAT / OAuth that the runtime cannot satisfy (org admin escalation, Sonatype Central re-namespace, OpenSSF Best Practices form, etc.), do **not** improvise auth: PATCH the issue to `blocked` with the exact ask and `@`-mention the board.
-
-
-
-# Memory Context
-
-# [codeiq] recent context, 2026-04-28 6:43am UTC
-
-No previous sessions found.
-
\ No newline at end of file
diff --git a/CHANGELOG.md b/CHANGELOG.md
deleted file mode 100644
index 9afffbfe..00000000
--- a/CHANGELOG.md
+++ /dev/null
@@ -1,164 +0,0 @@
-# Changelog
-
-All notable changes to this project are documented in this file.
-
-The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
-and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
-
-Per-tag release notes are published on
-[GitHub Releases](https://github.com/RandomCodeSpace/codeiq/releases). This file
-captures the cross-cutting changes that span multiple commits or releases (new
-quality gates, security policy, deploy surface, etc.) — see the GitHub Release
-for that specific tag for the per-commit details.
-
-The release history was reset at v0.4.0: all earlier GitHub Releases and tags
-(`v0.0.x`, `v0.1.x`, `v0.2.x`, `v0.3.0`, `v1.0.0`) were deleted because the
-Go module proxy permanently caches every published version's content. A
-deleted `v0.1.0` tag from the original Python-prototype era would have
-poisoned `go install` for any reused number. v0.4.0 is the first never-used
-version after the cleanup; the commit history under it is unchanged and
-includes everything that previously shipped as v0.3.0 plus the post-cutover
-work listed below. Historical sections below v0.4.0 are kept for the record
-even though their GitHub Releases are gone.
-
-## [Unreleased]
-
-## [v0.4.1] - 2026-05-14
-
-Patch release. Pure CI / dependency hygiene — no codeiq pipeline or
-detector behavior changes.
-
-### Fixed
-
-- `release-darwin.yml` race against `release-go.yml`: bumped poll budget
-  from 90 s to 15 minutes and added an early-bail when the upstream
-  release-go run for the tag concluded as failure / cancelled /
-  timed_out. Pinned `--repo` on every `gh` invocation. PR #165.
-
-### Changed
-
-- Routine Dependabot bumps: `github.com/spf13/pflag` 1.0.9 → 1.0.10
-  (PR #163), `step-security/harden-runner` 2.19.1 → 2.19.2 (PR #164).
-
-[v0.4.1]: https://github.com/RandomCodeSpace/codeiq/releases/tag/v0.4.1
-
-## [v0.4.0] - 2026-05-14
-
-First release of the Go-native codeiq after the `/go/` subdirectory
-hoist. Same commit content as the (now deleted) v0.3.0 plus the
-post-cutover work below.
-
-### Fixed
-
-- `codeiq enrich` survives polyglot codebases at `~/projects/` scale (49k
-  files, 15 GiB host). Pre-fix runs OOM-killed at exit 137; now exits 0
-  with peak RSS 1.8–2.2 GiB. PRs #145, #146, #147, #148.
-- Five enrich pipeline correctness fixes that surfaced at scale (each one
-  blocked the next — landed in order):
-  - PR #149: MCP dispatch arg names in `tools_consolidated` (7 modes were
-    permanently returning `INVALID_INPUT`).
-  - PR #150: pipe-delimited Kuzu COPY staging — JSON property values
-    containing commas (e.g. Python `imports`) no longer break the parser.
-  - PR #151: path-qualified SERVICE node IDs — two modules sharing a name
-    in different folders no longer collide on primary key.
-  - PR #152: TOML detector unquotes quoted keys (e.g. airflow's
-    `.cherry_picker.toml` `"check_sha" = ...`).
-  - PR #153: explicit `QUOTE='"', ESCAPE='"'` on Kuzu COPY so RFC-4180
-    quoting round-trips correctly (Istio EDS cluster names with `|`).
-
-### Changed
-
-- **Module hoisted from `/go/` to repo root** (PR #162). Module path drops
-  the `/go` suffix; `go install github.com/randomcodespace/codeiq/cmd/codeiq@vX.Y.Z`
-  resolves directly. 320 Go files rewritten, 5 CI workflows + goreleaser
-  + Dependabot config aligned.
-- **Kuzu 0.7.1 → 0.11.3** (PR #155). Migrates the embedded graph DB to a
-  release with bundled FTS extension and bound `LIMIT`/`SKIP` parameters.
-- **Real FTS replaces CONTAINS predicates** (PR #159). `SearchByLabel`
-  and `SearchLexical` now route through `CALL QUERY_FTS_INDEX` with BM25
-  ranking; CONTAINS fallback retained for pre-enrich graphs. Auto-suffix
-  `*` on single-token queries preserves prefix-match UX. Two indexes
-  created at enrich time:
-  - `code_node_label_fts` over `(label, fqn_lower)`
-  - `code_node_lexical_fts` over `(prop_lex_comment, prop_lex_config_keys)`
-- **Parameterized `LIMIT`/`SKIP`** across the query layer (PR #159).
-  `intLiteral` helper removed; `fmt.Sprintf("LIMIT %d", n)` replaced with
-  `LIMIT $lim` bindings.
-- **Dropped `stringsToAny` widener** (PR #159). Kuzu 0.11's Go binding
-  accepts `[]string` directly for `IN $param` clauses.
-- **Mutation gate** allow-lists read-only `CALL QUERY_FTS_INDEX` (PR #159);
-  `CREATE_FTS_INDEX` / `DROP_FTS_INDEX` stay blocked under
-  `OpenReadOnly`.
-- **Dependabot config** rewritten (PR #154) — drops the dead Java `maven`
-  (`/`) and `npm` (`/src/main/frontend`) ecosystems, adds `gomod` with
-  groups for `kuzu`, `tree-sitter`, `mcp`, `cobra-viper`, `sqlite`,
-  `test-libs`. Routine bumps that followed: `go-kuzu` 0.7.1 → 0.11.3
-  (PR #155), `spf13/cobra` + `pflag` group (PR #156), `go-sqlite3`
-  1.14.22 → 1.14.44 (PR #157), 4 GitHub Actions (PR #158).
-
-### Added
-
-- `codeiq enrich` knobs (PR #147): `--memprofile=` writes a Go
-  heap profile; `--max-buffer-pool=N` overrides the 2 GiB Kuzu cap;
-  `--copy-threads=N` overrides `MaxNumThreads` default.
-- Perf-gate CI step (PR #148): `/usr/bin/time -v codeiq enrich` runs on
-  fixture-multi-lang; fails the build if peak RSS exceeds 300 MB.
-- `runtime/debug.BuildInfo` fallback in the `buildinfo` package
-  (PR #161). `go install …@vX.Y.Z` binaries now self-identify their
-  version, commit, and date without needing the goreleaser `-ldflags -X`
-  path — the Go toolchain stamps `vcs.revision`/`vcs.time`/`vcs.modified`
-  and the buildinfo `init()` reads them. Goreleaser's ldflags still win
-  on release artifacts.
-
-[v0.4.0]: https://github.com/RandomCodeSpace/codeiq/releases/tag/v0.4.0
-
-## [v0.3.0] - 2026-05-13
-
-### Changed
-
-- **Phase 6 cutover — Java reference deleted, Go is the only
-  implementation.** Single static binary released from `go/cmd/codeiq`.
-  Deletes `src/`, `pom.xml`, `spotbugs-exclude.xml`,
-  `.github/workflows/{ci-java,beta-java,release-java,go-parity}.yml`.
-  ~8.9 MB / ~1500 files removed.
-
-### v0.3.0 surface
-
-What ships in v0.3.0 (carrying forward from the c363727 squash + c630245 release infra):
-
-- 100 detectors across 35+ languages.
-- Deterministic graph with confidence-aware NodeMerger and canonical
-  `(src, tgt, kind)` edge dedup; phantom-drop visibility.
-- 6 consolidated mode-driven MCP tools + `run_cypher` escape hatch +
-  `review_changes`. The deprecated 34 narrow tools remain wired for
-  back-compat in this release; targeted for removal in a future minor.
-- `codeiq review` CLI + `review_changes` MCP tool with Ollama (local
-  or Cloud) for LLM-driven PR review against graph evidence.
-- Goreleaser cross-platform binaries (linux/amd64, linux/arm64,
-  darwin/arm64), SPDX SBOMs, Cosign keyless signatures via GitHub
-  OIDC + Sigstore Rekor.
-- Per-PR perf-regression gate (`perf-gate.yml`).
-
-### Removed
-
-- `src/main/java/`, `src/test/java/`, `src/main/frontend/`,
-  `src/main/resources/`, `pom.xml`, `spotbugs-exclude.xml`.
-- `.github/workflows/ci-java.yml`, `release-java.yml`, `beta-java.yml`,
-  `go-parity.yml` (the last needed the Java jar build that's gone).
-
-### Migration notes
-
-Pre-cutover Java-side history is preserved in the squash-merge commit
-`c363727` and on `origin/main`. Anyone needing to recover Java files
-can `git show c363727:` or `git checkout c363727 -- `.
-
-[v0.3.0]: https://github.com/RandomCodeSpace/codeiq/releases/tag/v0.3.0
-
-### Added
-
-- **Phase 5 release infrastructure for the Go binary** —
-  `.goreleaser.yml` + `.github/workflows/release-go.yml` cut a
-  multi-platform (linux/amd64, linux/arm64, darwin/arm64) release on
-  every `v*.*.*` tag push. Each archive ships with an SPDX SBOM
-  (Syft), and the `checksums.sha256` manifest is keyless-signed via
-  Cosign + GitHub OIDC (Sigstore Rekor transparency log). Optional
diff --git a/CLAUDE.md b/CLAUDE.md
deleted file mode 100644
index fdf2ed66..00000000
--- a/CLAUDE.md
+++ /dev/null
@@ -1,482 +0,0 @@
-# codeiq (Go) — Project Instructions
-
-## What This Project Is
-
-**codeiq** — a CLI tool + MCP server that scans codebases to build a
-deterministic code knowledge graph. No AI, no external APIs — pure
-static analysis. 100 detectors, 35+ languages, Kuzu embedded graph
-database, MCP stdio server, single static Go binary.
-
-- **CLI command**: `codeiq` (single binary from `cmd/codeiq/main.go`)
-- **Go module**: `github.com/randomcodespace/codeiq`
-- **Go directive**: `go 1.25.0` (dep-mandated by `modelcontextprotocol/go-sdk`); `toolchain go1.25.10`
-- **GitHub repo**: `RandomCodeSpace/codeiq` (default branch: `main`)
-- **Cache on disk**: `.codeiq/cache/codeiq.sqlite` (SQLite analysis cache)
-- **Graph on disk**: `.codeiq/graph/codeiq.kuzu` (Kuzu embedded graph)
-- **Config file**: `codeiq.yml` (project-level overrides)
-
-The Java/Spring Boot reference that seeded this codebase was deleted
-in Phase 6 cutover (v0.3.0). For history, see commits `c363727` (port
-landing) and `c630245` (release infra).
-
-## Tech Stack
-
-> Source of truth: `go.mod` + `go.sum`. Update pins there; this
-> list moves with them in the same commit.
-
-- **Go 1.25.10** — toolchain pin; module min is 1.25.0 (clamped by the
-  MCP SDK's own `go` directive).
-- **Kuzu 0.11.3** (`github.com/kuzudb/go-kuzu`) — embedded graph DB.
-  CGO. Native FTS via `CALL CREATE_FTS_INDEX` / `QUERY_FTS_INDEX`.
-  Capability matrix documented in `## Gotchas` below.
-- **`mattn/go-sqlite3` 1.14.44** — SQLite analysis cache. CGO.
-- **`smacker/go-tree-sitter`** — AST parsing for Java / Python /
-  TypeScript / Go.
-- **`modelcontextprotocol/go-sdk` v1.6** — stdio MCP server. v1.6 API
-  shape: `Server.Serve(ctx, mcpsdk.Transport)`; no `NewStdioTransport`
-  helper.
-- **`spf13/cobra` 1.10.2** — CLI framework. Subcommand registration via
-  `internal/cli` blank imports.
-
-## Architecture
-
-### Pipeline
-
-```
-index: FileDiscovery → Parsers → Detectors (goroutine pool) → GraphBuilder → SQLite cache
-enrich: SQLite → Linkers → LayerClassifier → LexicalEnricher → LanguageEnricher → ServiceDetector → Kuzu (COPY FROM)
-mcp: Kuzu → QueryService → 6 consolidated MCP tools + run_cypher escape hatch + review_changes
-```
-
-codeiq has no REST API and no web UI surface — by design. Consumers
-interact through the CLI or through the stdio MCP server (read-only).
-The Java reference had a `codeiq serve` subcommand (Spring Boot REST
-+ React SPA); both were removed in the Go port and will not be
-reintroduced.
-
-### Pipeline components
-
-- **`internal/analyzer/file_discovery.go`** — `git ls-files` first,
-  dir-walk fallback. Maps extension → `parser.Language` via
-  `LanguageFromExtension` in `internal/parser/parser.go`.
-- **`internal/parser`** — tree-sitter wrappers + a structured parser
-  for YAML/JSON/TOML/INI/properties. Falls back to regex-only when
-  parse fails (matches Java's per-file try/catch).
-- **`internal/detector`** — `@Component` analogue is Go's `init()`
-  blank-import pattern; every detector registers itself with
-  `detector.Default`. Auto-discovery via `internal/cli/detectors_register.go`
-  (this file is the choke point — every detector package leaf must
-  blank-import here or the binary won't fire it).
-- **`internal/analyzer/graph_builder.go`** — buffers detector results.
-  Confidence-aware node merge (`mergeNode`), canonical
-  `(source, target, kind)` edge dedup, deterministic Snapshot with
-  dangling-edge drop. Surfaces dedup/drop counts on `Stats`.
-- **`internal/analyzer/linker/`** — TopicLinker, EntityLinker,
-  ModuleContainmentLinker. Each emits `Result{Nodes, Edges}` that's
-  `.Sorted()` at the call site (Phase 1 §1.4).
-- **`internal/graph`** — Kuzu wrapper. Read-only via `OpenReadOnly`
-  (mutation gate in `cypher.go`).
-- **`internal/mcp`** — 6 consolidated mode-driven tools (`graph_summary`,
-  `find_in_graph`, `inspect_node`, `trace_relationships`,
-  `analyze_impact`, `topology_view`), `run_cypher` escape hatch,
-  `read_file` utility, `generate_flow`, and `review_changes` — 10
-  user-facing tools total. The narrow toolXxx(d) builder funcs remain
-  in tools_graph.go/tools_intelligence.go/tools_topology.go as Go-API
-  delegation targets for the consolidated layer; they are NOT
-  registered as user-facing MCP tools.
-- **`internal/review`** — diff parser, Ollama-compatible chat client,
-  ReviewService orchestrator. Default endpoint = local Ollama;
-  `OLLAMA_API_KEY` flips to Ollama Cloud.
-
-### Package layout
-
-```
-codeiq/
-├── cmd/codeiq/               # main package — single binary entrypoint
-├── internal/
-│   ├── analyzer/             # pipeline orchestration
-│   │   └── linker/           # cross-file enrichers
-│   ├── buildinfo/            # version/commit/date from -ldflags
-│   ├── cache/                # SQLite analysis cache
-│   ├── cli/                  # cobra subcommands + detector registrations
-│   ├── detector/             # 100 detectors organized by category
-│   │   ├── auth/
-│   │   ├── base/             # AbstractDetector analogues + helpers
-│   │   ├── csharp/
-│   │   ├── frontend/         # React, Vue, Svelte, Angular, frontend routes
-│   │   ├── generic/
-│   │   ├── golang/
-│   │   ├── iac/              # Terraform, Bicep, Dockerfile, CloudFormation
-│   │   ├── jvm/
-│   │   │   ├── java/         # ~37 Java detectors
-│   │   │   ├── kotlin/
-│   │   │   └── scala/
-│   │   ├── markup/           # Markdown
-│   │   ├── proto/
-│   │   ├── python/
-│   │   ├── script/shell/     # PowerShell, Bash
-│   │   ├── sql/              # SqlMigration
-│   │   ├── structured/       # YAML, JSON, TOML, K8s, Helm, OpenAPI, …
-│   │   ├── systems/{cpp,rust}/
-│   │   └── typescript/
-│   ├── flow/                 # architecture-flow diagram engine
-│   ├── graph/                # Kuzu facade
-│   ├── intelligence/         # Lexical + language extractors + evidence + query planner
-│   ├── mcp/                  # MCP server + tool definitions
-│   ├── model/                # CodeNode, CodeEdge, NodeKind, EdgeKind, Confidence, Layer
-│   ├── parser/               # tree-sitter + structured parsers
-│   ├── query/                # service / topology / stats
-│   └── review/               # PR-review pipeline (diff + LLM)
-├── parity/                   # parity harness (build tag `parity`)
-├── testdata/                 # fixtures
-├── go.mod
-└── go.sum
-```
-
-## Critical Rules
-
-### Read-Only MCP
-
-The MCP server is **strictly read-only** — no data mutation from tool
-calls. `run_cypher` rejects mutation keywords at the gate
-(`internal/graph/cypher.go`). `review_changes` reads the graph and
-shells out to `git`; it never writes to `.codeiq/`.
-
-Analysis/enrichment happens only via the CLI commands `index` /
-`enrich`.
-
-### Determinism
-
-- Same input MUST produce same output. Every run.
-- No `map` iteration without sorting first (every range loop over a
-  map sorts keys before emit).
-- `GraphBuilder.Snapshot` sorts nodes + edges by ID.
-- Linker outputs go through `Result.Sorted()` at the boundary.
-- All detectors are stateless — no mutable struct fields. Stateless
-  methods only. The single shared instance per detector type is
-  registered with `detector.Default` at package init.
-
-### Detector dispatch is choke-pointed
-
-Adding a new detector package under `internal/detector//` is NOT
-enough. The package must be blank-imported in
-[`internal/cli/detectors_register.go`](internal/cli/detectors_register.go).
-Without that line, the package's `init()` never runs and the binary
-ships without your detector. The Phase 4 benchmark exposed this bug
-when 15 language families silently produced 0 nodes — see commit
-`04098be` for the fix.
-
-### Goroutine safety
-
-- File I/O and SQLite writes run on a bounded worker pool
-  (`Analyzer.opts.Workers`, default 2× GOMAXPROCS).
-- Detectors must be stateless. Method-local state only.
-- Kuzu reads use the embedded API; one query at a time per
-  `Store.Cypher` call. The store internal mutex serializes.
-
-## CLI Commands
-
-| Command | Purpose |
-|---|---|
-| `index [path]` | Scan files → SQLite analysis cache. |
-| `enrich [path]` | Load cache → Kuzu graph; run linkers + LayerClassifier + intelligence. |
-| `mcp [path]` | Stdio MCP server (Claude / Cursor). |
-| `stats [path]` | Categorized statistics from the enriched graph. |
-| `query [path]` | consumers/producers/callers/dependencies/dependents/shortest-path/cycles/dead-code. |
-| `find [path]` | endpoints, entities, services, … |
-| `cypher [path]` | Raw Cypher (read-only) against Kuzu. |
-| `flow [path]` | Architecture-flow diagrams (mermaid/dot/yaml). |
-| `graph [path]` | Export graph in json / yaml / mermaid / dot. |
-| `topology [path]` | Service-topology projection. |
-| `review [path]` | LLM-driven PR review (Ollama by default). |
-| `cache ` | Inspect / clear the SQLite cache. |
-| `plugins ` | List + describe registered detectors. |
-| `config ` | Validate / explain `codeiq.yml`. |
-| `version` | `--version` long form. |
-
-### Standard pipeline
-
-```bash
-codeiq index /path/to/repo
-codeiq enrich /path/to/repo
-codeiq stats /path/to/repo
-codeiq mcp /path/to/repo # for Claude / Cursor wiring
-```
-
-## MCP Tools
-
-The MCP server registers 10 user-facing tools — 6 consolidated
-mode-driven, `run_cypher` (escape hatch), `read_file` (utility),
-`generate_flow`, and `review_changes`. The 24 narrow tools that the
-consolidated layer subsumes were dropped from the MCP surface;
-their Go-API implementations (`toolXxx(d) Tool`) stay in the package
-because the consolidated tools delegate to them.
-
-| Consolidated tool | mode dispatch |
-|---|---|
-| `graph_summary` | `overview` / `categories` / `capabilities` / `provenance` |
-| `find_in_graph` | `nodes` / `edges` / `text` / `fuzzy` / `by_file` / `by_endpoint` |
-| `inspect_node` | `neighbors` / `ego` / `evidence` / `source` |
-| `trace_relationships` | `callers` / `consumers` / `producers` / `dependencies` / `dependents` / `shortest_path` |
-| `analyze_impact` | `blast_radius` / `trace` / `cycles` / `circular_deps` / `dead_code` / `dead_services` / `bottlenecks` |
-| `topology_view` | `summary` / `service` / `service_deps` / `service_dependents` / `flow` |
-| `run_cypher` | (escape hatch — mutation-rejected) |
-| `review_changes` | (Ollama-driven PR review) |
-
-## Adding a New Detector
-
-1. Create file in `internal/detector//my_detector.go`.
-2. Implement the `detector.Detector` interface:
-
-   ```go
-   package mycategory
-
-   import (
-   	"github.com/randomcodespace/codeiq/internal/detector"
-   	"github.com/randomcodespace/codeiq/internal/detector/base"
-   	"github.com/randomcodespace/codeiq/internal/model"
-   )
-
-   type MyDetector struct{}
-
-   func NewMyDetector() *MyDetector { return &MyDetector{} }
-
-   func (MyDetector) Name() string { return "my_detector" }
-   func (MyDetector) SupportedLanguages() []string { return []string{"java"} }
-   func (MyDetector) DefaultConfidence() model.Confidence { return base.RegexDetectorDefaultConfidence }
-
-   func init() { detector.RegisterDefault(NewMyDetector()) }
-
-   func (MyDetector) Detect(ctx *detector.Context) *detector.Result {
-   	// … pattern matching → return detector.ResultOf(nodes, edges)
-   	return detector.EmptyResult()
-   }
-   ```
-
-3. **CRITICAL** — if the package is a NEW directory under
-   `internal/detector/`, blank-import it in
-   `internal/cli/detectors_register.go`. Existing directories
-   already covered.
-4. Add a test file at the same path (`my_detector_test.go`). Include
-   positive match, negative match, determinism (run twice, assert
-   identical output).
-5. `CGO_ENABLED=1 go test ./internal/detector//...
-   -count=1`.
-
-### Detector base helpers
-
-| File | Purpose |
-|---|---|
-| `base/regex.go` | `FindLineNumber`, `RegexDetectorDefaultConfidence`. |
-| `base/imports_helpers.go` | `EnsureFileAnchor`, `EnsureExternalAnchor` — emit anchor nodes so imports/depends_on edges survive `Snapshot`'s phantom filter. |
-| `base/component.go` | `CreateComponentNode` for React/Vue/Angular component detectors. |
-| `base/structures.go` | `AddImportEdge`, `CreateStructureNode` for Scala/Kotlin/etc structure detectors. |
-
-## Configuration
-
-`codeiq.yml` at the repo root. Resolution order (last wins):
-
-1. Built-in defaults
-2. `~/.codeiq/config.yml`
-3. `./codeiq.yml`
-4. `CODEIQ__` env vars
-5. CLI flags
-
-`codeiq config validate` + `codeiq config explain`.
-
-## Testing
-
-```bash
-
-# Full suite
-CGO_ENABLED=1 go test ./... -count=1
-
-# Race detector
-CGO_ENABLED=1 go test ./... -race -count=1
-
-# Single package
-CGO_ENABLED=1 go test ./internal/detector/jvm/java/...
-
-# Verbose
-CGO_ENABLED=1 go test ./... -v
-```
-
-828+ tests. Every detector ships with positive, negative, and
-determinism tests.
-
-## Build Commands
-
-```bash
-
-# Build
-CGO_ENABLED=1 go build -o /usr/local/bin/codeiq ./cmd/codeiq
-
-# Build with version info (release-go.yml does this with goreleaser):
-CGO_ENABLED=1 go build \
-  -ldflags "-X 'github.com/randomcodespace/codeiq/internal/buildinfo.Version=v0.3.0' \
-            -X 'github.com/randomcodespace/codeiq/internal/buildinfo.Commit=$(git rev-parse --short HEAD)' \
-            -X 'github.com/randomcodespace/codeiq/internal/buildinfo.Date=$(date -u +%Y-%m-%dT%H:%M:%SZ)'" \
-  -o /usr/local/bin/codeiq ./cmd/codeiq
-```
-
-Release pipeline:
-[`shared/runbooks/release-go.md`](shared/runbooks/release-go.md).
-
-## Code Conventions
-
-- Go 1.25+ idioms — generics where they reduce repetition, `slices.`
-  and `maps.` over hand-rolled loops, `min`/`max` builtins.
-- `model.Confidence` and `Source` are mandatory on every `CodeNode` /
-  `CodeEdge`. Base classes stamp the per-detector floor at the
-  orchestration boundary (LEXICAL for regex bases, SYNTACTIC for
-  AST/structured bases).
-- Property union semantics: in `mergeNode`, donor only fills keys the
-  survivor doesn't already have. Don't clobber a high-confidence
-  detector's framework/auth_type stamping.
-- ID format: `:::` — keep prefixes
-  stable; the GraphBuilder dedup map relies on them.
-- File-anchor / external-anchor IDs:
-  - `:file:` for the file-as-module
-  - `:external:` for imported packages
-  This pattern saves imports edges from phantom drop.
-- Detectors with framework guards: require a framework-specific
-  import before emitting (e.g. Quarkus requires `io.quarkus`).
-- UTF-8 everywhere — explicit `[]byte` only when interfacing with
-  Kuzu or SQLite.
-
-## Gotchas & Lessons Learned
-
-### Pipeline
-
-- **Pipeline is `index → enrich → (mcp|stats|query)`.** Don't put
-  analysis in MCP. MCP is read-only.
-- **Detector registration choke point** (`internal/cli/detectors_register.go`).
-  Forgetting the blank import ships an empty registry for that
-  language. Caught by the polyglot benchmark — 15 language families
-  silently produced 0 nodes pre-fix. Test: `codeiq plugins` lists
-  every detector by name; new ones must appear.
-
-### Kuzu v0.11.3 (current pin)
-
-**Lifted in 0.11.3** — `CLAUDE.md` previously documented these as 0.7.1
-quirks; they were unwound in the post-bump cleanup:
-
-- FTS extension ships bundled. `CreateIndexes()` runs `INSTALL fts; LOAD
-  EXTENSION fts;` then `CALL CREATE_FTS_INDEX`. `SearchByLabel` /
-  `SearchLexical` query via `CALL QUERY_FTS_INDEX` with BM25 ranking;
-  CONTAINS predicates remain as fallback for pre-enrich graphs.
-- `LIMIT $param` and `SKIP $param` work as bound parameters. No more
-  `fmt.Sprintf` for integer literals.
-- `toLower()` works (use it; `lower()` still accepted for SQL parity).
-- Go binding accepts `[]string` for `IN $param` directly. The
-  `stringsToAny` widener is gone.
-
-**Still present in 0.11.3** — keep workarounds:
-
-- List comprehension binder rejects out-of-scope variables. Use
-  `properties(nodes(p), 'id')` instead of `[n IN nodes(p) | n.id]`.
-- `EXISTS { … }` subquery doesn't see outer-scope `$param`. Inline
-  static lists as rel-pattern alternations.
-- Multi-label rel alternation + kleene-star in the same recursive
-  pattern breaks the binder. BlastRadius uses an anonymous recursive
-  pattern.
-- Recursive pattern upper bound `[*1..N]` must be a literal, not a
-  parameter — only LIMIT/SKIP are now bindable.
-- Mutation gate allows `CALL QUERY_FTS_INDEX` but blocks
-  `CALL CREATE_FTS_INDEX` / `CALL DROP_FTS_INDEX` (catalog writes).
-
-### MCP SDK v1.6
-
-- No `NewStdioTransport(in, out)` helper. `StdioTransport{}`
-  zero-value bound to `os.Stdin`/`os.Stdout`. Tests use
-  `NewInMemoryTransports()`.
-- `Server.AddTool(t *Tool, h ToolHandler)` — two args, not aggregate.
-- `CallToolRequest.Params` is `*CallToolParamsRaw{Arguments
-  json.RawMessage}`. Wrapper unmarshals once, hands raw JSON to the
-  handler.
-- ToolHandler JSON-marshals returned values. Special-case `string`
-  in `mcp/tool.go` for the `generate_flow` rendered output —
-  otherwise the Mermaid/DOT string gets double-encoded.
-
-### Go RE2 vs Java regex
-
-- No lookahead / lookbehind. Plan-spec patterns like
-  `CALL\s+(?!db\.)` won't compile. Rewrites: two-stage match (collect
-  every CALL site, then allow-list each procedure name).
-- No possessive quantifiers (`*+`). RE2 doesn't need them — its NFA
-  doesn't backtrack. Strip them when porting Java regex.
-- No DOTALL — use `(?s)` prefix.
-
-### Detector authoring traps
-
-- **Phantom edges**: emit edges with anchor nodes on both ends
-  (`base.EnsureFileAnchor` + `base.EnsureExternalAnchor`). Without
-  anchors, the edge drops at Snapshot.
-- **Discriminator guards**: framework detectors must require a
-  framework-specific import or annotation before emitting. Without a
-  guard, generic patterns (e.g. `@Transactional`) match across
-  unrelated frameworks and produce false positives.
-- **Determinism**: never iterate a Go `map` without sorting keys
-  first. Run the determinism test twice with `count=1` to catch this.
-
-### Filesystem & paths
-
-- File discovery dir-walk fallback ingests `node_modules/`,
-  `vendor/`, `target/`, etc. — see `DefaultExcludeDirs` in
-  `analyzer/file_discovery.go`. Add new ignored dirs there.
-- `Files.probeContentType` is best-effort on Linux (JDK note from the
-  Java side — replaced in Go by `net/http.DetectContentType` plus an
-  explicit allowlist in `mcp/read_file.go`).
-
-### Performance
-
-- CertificateAuthDetector once consumed 99% of indexing CPU on
-  C#-heavy projects because its pre-screen included `.cert` / `.crt`
-  / `.pem` substrings that match `using
-  System.Security.Cryptography.X509Certificates;`. Use a STRICT
-  keyword list (high-signal markers only — not path extensions) in
-  any cross-language regex pre-screen.
-- **Enrich memory ceiling (Phase A+B+C OOM-fix plan, 2026-05-13).**
-  Pre-fix `codeiq enrich` peaked at 3.8 GB on the airflow polyglot
-  target (9k Python files) and OOM-killed at exit 137 on ~/projects/-
-  scale (49k files). Three landed fixes brought peak RSS down to:
-  - **fixture-multi-lang (22 files): ~108 MB** (CI-gated via
-    perf-gate workflow at 300 MB ceiling)
-  - **airflow (9,151 files): 1.27 GB** (1× /usr/bin/time -v on
-    16 GB CI host)
-  - **~/projects/ (49,076 files): 3.12 GB** (well under the 4 GiB
-    acceptance bar from the plan)
-
-  The three fixes:
-  1. `intelligence/extractor/enricher.go` parses tree-sitter trees
-     once per file (was per-node, ~13× over-parse on Python).
-  2. `intelligence/extractor/enricher.go` bounds the per-file
-     goroutine pool to `2 * GOMAXPROCS` (was unbounded — 7k+ goroutines
-     held live trees + file content strings).
-  3. `graph.Open()` caps Kuzu `BufferPoolSize` to 2 GiB by default
-     (was 80% of system RAM via `kuzu.DefaultSystemConfig()`).
-
-  Tunable knobs on `codeiq enrich`:
-  - `--memprofile=` writes a Go heap profile (analyze with
-    `go tool pprof -top -inuse_space ...`).
-  - `--max-buffer-pool=N` overrides the 2 GiB Kuzu cap.
-  - `--copy-threads=N` overrides `MaxNumThreads` (default `min(4,
-    GOMAXPROCS)`).
-
-  Plan + research history: `docs/superpowers/plans/2026-05-13-enrich-oom-fix.md`.
-
-### Release / signing
-
-- Release tag must be `v*.*.*`; pre-releases use the
-  `vX.Y.Z-rc.N` form (Goreleaser `prerelease: auto` honors it).
-- Cosign keyless via GitHub OIDC — no long-lived key on the runner.
-  Verification needs the cosign bundle file + the OIDC identity regex
-  (see `shared/runbooks/release-go.md`).
-
-## Updating This File
-
-After significant changes (new detectors, new MCP tools, architectural
-decisions, conventions learned), update this file. Keep it concise.
-The full pre-cutover Java-side history of these notes is on the
-squash-merge `c363727`; reach for that via `git show` when you need
-context.
diff --git a/PROJECT_SUMMARY.md b/PROJECT_SUMMARY.md
deleted file mode 100644
index d42f272c..00000000
--- a/PROJECT_SUMMARY.md
+++ /dev/null
@@ -1,148 +0,0 @@
-# Project Summary: codeiq
-
-> Refreshed 2026-05-13 after Phase 6 cutover (v0.3.0). Audience: AI
-> agents (and humans) who need to understand and modify this codebase.
->
-> **Canonical depth lives in [`CLAUDE.md`](CLAUDE.md)** (~16 KB,
-> agent-oriented, hand-maintained). This file is a thin entry point
-> that links into `CLAUDE.md` and the runbooks under
-> [`shared/runbooks/`](shared/runbooks/).
-
-## Identity
-
-- **What it is**: a CLI + MCP server that scans a codebase and emits a
-  deterministic code knowledge graph — services, endpoints, entities,
-  infrastructure, auth patterns, framework usage. No AI, pure static
-  analysis. LLM is opt-in via `codeiq review`.
-- **Type**: CLI tool + MCP stdio server, single static binary.
-- **Status**: v0.3.0 (Phase 6 cutover landed 2026-05-13). Active.
-- **Primary language**: Go 1.25.10. CGO required.
-
-## Tech stack
-
-- **Go 1.25.10** — toolchain pin in `go.mod` (module min 1.25.0,
-  clamped by `modelcontextprotocol/go-sdk`).
-- **Kuzu 0.11.3** (`github.com/kuzudb/go-kuzu`) — embedded graph DB.
-  Native FTS via `QUERY_FTS_INDEX` (bundled).
-- **`mattn/go-sqlite3` 1.14.44** — SQLite analysis cache.
-- **`smacker/go-tree-sitter`** — AST parsing (Java / Python / TS / Go).
-- **`modelcontextprotocol/go-sdk` v1.6** — stdio MCP server.
-- **`spf13/cobra` 1.10.2** — CLI framework.
-- Manifest files read: `go.mod`, `go.sum`.
- -## Entry points - -| Entrypoint | File | Purpose | -|---|---|---| -| CLI / MCP server | `cmd/codeiq/main.go` | The only binary. All subcommands live in `internal/cli`. | -| Subcommand registry | `internal/cli/root.go` | Sets up cobra root + registers per-subcommand inits. | -| Detector registry | `internal/cli/detectors_register.go` | Blank-imports every detector package leaf. **Choke point** — forget it and detectors silently no-op. | -| Stdio MCP | `internal/cli/mcp.go` + `internal/mcp/server.go` | Wires 10 user-facing tools: 6 consolidated + `run_cypher` + `read_file` + `generate_flow` + `review_changes`. | -| Analyzer pipeline | `internal/analyzer/analyzer.go` | FileDiscovery → parser → detectors (pool) → GraphBuilder → SQLite. | -| Enrich pipeline | `internal/analyzer/enrich.go` | SQLite → Kuzu + linkers + layer classifier + intelligence. | - -## Directory map - -``` -codeiq/ -├── cmd/codeiq/ — main package (single binary) -├── internal/ -│ ├── analyzer/ — pipeline orchestration + linkers -│ ├── buildinfo/ — version metadata -│ ├── cache/ — SQLite analysis cache -│ ├── cli/ — cobra subcommands -│ ├── detector/ — 100 detectors organized by category -│ ├── flow/ — architecture-flow diagram engine -│ ├── graph/ — Kuzu facade (read-only) -│ ├── intelligence/ — lexical + language extractors + evidence + planner -│ ├── mcp/ — MCP server + tool definitions -│ ├── model/ — CodeNode, CodeEdge, kinds, Confidence -│ ├── parser/ — tree-sitter + structured parsers -│ ├── query/ — service / topology / stats -│ └── review/ — PR-review pipeline (diff + Ollama) -├── parity/ — parity harness (build tag `parity`) -├── testdata/ — fixtures (fixture-minimal, fixture-multi-lang) -├── go.mod — module: github.com/randomcodespace/codeiq -├── go.sum -├── .github/workflows/ — go-ci, perf-gate, release-go, release-darwin, security, scorecard -├── shared/runbooks/ — release-go.md + engineering-standards.md -├── CHANGELOG.md -├── CLAUDE.md — SSoT internals doc -├── PROJECT_SUMMARY.md — 
this file -├── README.md — user-facing entry doc -├── SECURITY.md -└── .goreleaser.yml — Goreleaser config (CGO multi-arch) -``` - -## Run, build, test - -Commands taken from `go.mod`, `Makefile` (none — pure `go` tooling), -and `.github/workflows/go-ci.yml`: - -```bash -# Install deps (vendored via go module cache; no extra step) - -# Run unit tests -CGO_ENABLED=1 go test ./... -count=1 - -# Race detector -CGO_ENABLED=1 go test ./... -race -count=1 - -# Static analysis (mirrors CI) -go install honnef.co/go/tools/cmd/staticcheck@2025.1.1 -staticcheck ./... -go install github.com/securego/gosec/v2/cmd/gosec@v2.22.0 -gosec -quiet -exclude=G104,G115,G202,G204,G301,G304,G306,G401,G404,G501 ./... -go install golang.org/x/vuln/cmd/govulncheck@latest -govulncheck ./... - -# Build (local) -CGO_ENABLED=1 go build -o /usr/local/bin/codeiq ./cmd/codeiq -``` - -**Required env / external services**: none for build. At run-time the -binary reads `OLLAMA_API_KEY` (optional, switches `codeiq review` to -Ollama Cloud). - -## Conventions an agent must respect - -- **Detector blank-import**: new package under `internal/detector//` - must be added to `internal/cli/detectors_register.go`. The polyglot - benchmark caught 15 missing imports (commit `04098be`). -- **Determinism**: never iterate a Go `map` without sorting keys. Run - the determinism test twice with the same fixture and assert byte - equality. -- **Anchor nodes for cross-file edges**: use - `base.EnsureFileAnchor` + `base.EnsureExternalAnchor`. Otherwise - imports/depends_on edges drop at Snapshot's phantom filter. -- **Read-only MCP**: every MCP tool reads. `run_cypher` rejects - mutation keywords. `review_changes` reads the graph + shells `git` - read-only. -- **Confidence + Source mandatory**: every emitted `CodeNode` and - `CodeEdge`. Base classes stamp the floor at the orchestration - boundary; detectors override only when they have higher-confidence - evidence. 
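The determinism convention above (never iterate a Go `map` without sorting keys) can be illustrated with a minimal sketch. The function name `renderCounts` and the sample data are hypothetical and are not taken from the codeiq source; the sketch only shows the general pattern of collecting keys, sorting them, and emitting output in that order so that two runs over the same input compare byte-for-byte equal.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderCounts is a hypothetical helper (not part of codeiq) that emits one
// "kind=count" line per map entry. Go randomizes map iteration order, so the
// keys are collected and sorted before anything is written.
func renderCounts(counts map[string]int) string {
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%s=%d\n", k, counts[k])
	}
	return b.String()
}

func main() {
	// Illustrative data only.
	counts := map[string]int{"service": 3, "endpoint": 12, "entity": 7}

	// Render twice and compare, mirroring the "run twice, assert byte
	// equality" shape of a determinism test.
	first := renderCounts(counts)
	second := renderCounts(counts)
	fmt.Print(first)
	fmt.Println("identical:", first == second)
}
```

A real determinism test would presumably run the full pipeline twice over the same fixture rather than a single helper, but the comparison step has the same shape.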
- -Full set in [`CLAUDE.md` §Code Conventions](CLAUDE.md#code-conventions). - -## Gotchas - -- **Kuzu v0.7.1 binder limitations** — no FTS, no parameterized - LIMIT/SKIP, `lower()` not `toLower()`, no negative lookahead, list - comprehensions reject out-of-scope variables. See - [`CLAUDE.md` §Kuzu v0.7.1 quirks](CLAUDE.md#kuzu-v071-quirks). -- **Go RE2 vs Java regex** — no lookahead, no possessive quantifiers. - Strip `*+` when porting; use two-stage matchers for lookahead. -- **MCP SDK v1.6** — `Server.AddTool(t, h)` (two args, not aggregate). - `StdioTransport{}` zero-value, no factory. JSON marshal of string - returns needs special casing in `mcp/tool.go`. -- **`detectors_register.go` is a choke point** — see above. -- **gosec @v2.21.4 fails to build under Go 1.25** — pinned to v2.22.0. -- **GO-2026-4918 (HTTP/2 SETTINGS DoS)** reachable from - `review.Client.Review` — fixed in Go 1.25.10 (our toolchain pin). - -## Where to look next - -- Build & release → [`shared/runbooks/release-go.md`](shared/runbooks/release-go.md) -- MCP integration → [`README.md#mcp-integration`](README.md#mcp-integration) -- Internal SSoT → [`CLAUDE.md`](CLAUDE.md) diff --git a/README.md b/README.md deleted file mode 100644 index aac193f5..00000000 --- a/README.md +++ /dev/null @@ -1,178 +0,0 @@ -

codeiq

Deterministic code knowledge graph: scans codebases to map services, endpoints, entities, infrastructure, auth patterns, and framework usage. No AI, pure static analysis. Single static Go binary; MCP server included.
- ---- - -## What it is - -codeiq scans a codebase and produces a deterministic graph of its -services, endpoints, entities, infrastructure, auth patterns, and -framework usage. Same input ⇒ same output, every time. - -- **Single static binary** — built from the `go/` tree. No JVM, no - Spring Boot start time. ~30 MB. CGO enabled (Kuzu graph + SQLite - cache). -- **100 detectors** across 35+ languages — Java, Kotlin, Scala, Python, - TypeScript/JavaScript, Go, Rust, C#, C++, Terraform, Bicep, Helm, - Kubernetes, Docker, GitHub Actions, GitLab CI, … -- **MCP server included** — `codeiq mcp` runs an MCP stdio server - exposing 10 user-facing tools (6 consolidated mode-driven + - `run_cypher` + `read_file` + `generate_flow` + `review_changes`) - so Claude / Cursor / any MCP-aware agent can query the graph - directly. -- **LLM-driven PR review** — `codeiq review` walks the diff, queries - the indexed graph for evidence, and asks Ollama (Cloud or local) for - review comments. - -## Install - -### Pre-built binary - -Grab from -[Releases](https://github.com/RandomCodeSpace/codeiq/releases/latest): - -```bash -curl -L https://github.com/RandomCodeSpace/codeiq/releases/latest/download/codeiq_$(uname -s | tr A-Z a-z)_$(uname -m | sed s/x86_64/amd64/).tar.gz | tar xz -sudo install codeiq /usr/local/bin/ -codeiq --version -``` - -Verify (Sigstore keyless): - -```bash -sha256sum -c checksums.sha256 -cosign verify-blob \ - --bundle checksums.sha256.cosign.bundle \ - --certificate-identity-regexp 'https://github.com/RandomCodeSpace/codeiq/.github/workflows/release-go.yml@.*' \ - --certificate-oidc-issuer https://token.actions.githubusercontent.com \ - checksums.sha256 -``` - -### Build from source - -Requires Go 1.25.10+ and a C toolchain (CGO). 
- -```bash -git clone https://github.com/RandomCodeSpace/codeiq.git -cd codeiq -CGO_ENABLED=1 go build -o /usr/local/bin/codeiq ./cmd/codeiq -codeiq --version -``` - -Or directly via `go install`: - -```bash -CGO_ENABLED=1 go install github.com/randomcodespace/codeiq/cmd/codeiq@latest -``` - -## Quickstart - -```bash -# Index a repository → SQLite analysis cache. -codeiq index /path/to/repo - -# Enrich → Kuzu graph at .codeiq/graph/codeiq.kuzu. -codeiq enrich /path/to/repo - -# Query. -codeiq stats /path/to/repo -codeiq find endpoints /path/to/repo -codeiq query consumers /path/to/repo -codeiq topology /path/to/repo -codeiq flow /path/to/repo --view overview --format mermaid - -# LLM PR review (local Ollama; OLLAMA_API_KEY → Cloud). -codeiq review --base origin/main --head HEAD /path/to/repo -``` - -## MCP integration - -Add to your MCP client config (e.g. `.mcp.json` at the project root): - -```json -{ - "mcpServers": { - "code-mcp": { - "command": "codeiq", - "args": ["mcp"] - } - } -} -``` - -Ten user-facing tools: six mode-driven (`graph_summary`, -`find_in_graph`, `inspect_node`, `trace_relationships`, -`analyze_impact`, `topology_view`) plus `run_cypher` (Cypher escape -hatch), `read_file` (utility), `generate_flow`, and `review_changes`. - -## CLI reference - -| Command | Purpose | -|---|---| -| `index [path]` | Scan files → SQLite analysis cache. | -| `enrich [path]` | Load cache → Kuzu graph; run linkers + layer classifier. | -| `mcp [path]` | Stdio MCP server (Claude / Cursor). | -| `stats [path]` | Categorized statistics. | -| `query [path]` | consumers / producers / callers / dependencies / dependents / shortest-path / cycles / dead-code. | -| `find [path]` | endpoints, entities, services, … | -| `cypher [path]` | Raw Cypher against Kuzu (read-only). | -| `flow [path]` | Mermaid / dot / yaml flow diagrams. | -| `graph [path]` | Export graph: json / yaml / mermaid / dot. | -| `topology [path]` | Service-topology projection. 
| -| `review [path]` | LLM-driven PR review. | -| `cache ` | Inspect / clear the analysis cache. | -| `plugins ` | List + describe registered detectors. | -| `config ` | Validate / explain `codeiq.yml`. | -| `version` | Build info. | - -`codeiq --help` for full flag listing. - -## Design - -The graph is canonical and deterministic — `GraphBuilder` deduplicates -nodes by ID (confidence-aware merge) and edges by canonical -`(source, target, kind)` tuple. Phantom edges (endpoint missing from -the graph) are dropped at snapshot. Every run prints a -"Deduped: N nodes, M edges Dropped: K phantom edges" line so graph -hygiene is visible. - -Pipeline: FileDiscovery → tree-sitter / regex → detectors → -GraphBuilder → linkers → LayerClassifier → Kuzu. See -[`CLAUDE.md`](CLAUDE.md) for the full architecture and the detector -authoring contract. - -## Releases - -Tag `vX.Y.Z` → `.github/workflows/release-go.yml` builds linux/amd64, -linux/arm64, darwin/arm64 archives with SPDX SBOMs (Syft); the -checksum manifest is keyless-signed via Cosign + GitHub OIDC -(Sigstore Rekor). Runbook: -[`shared/runbooks/release-go.md`](shared/runbooks/release-go.md). - -## Security - -See [SECURITY.md](SECURITY.md). Supply-chain stack: OpenSSF Best -Practices [12650](https://www.bestpractices.dev/projects/12650), -OpenSSF Scorecard, and the OSS-CLI workflow -([`security.yml`](.github/workflows/security.yml)) running OSV-Scanner, -Trivy, Semgrep, Gitleaks, jscpd, and `anchore/sbom-action` on every PR. - -## License - -See [LICENSE](LICENSE). diff --git a/SECURITY.md b/SECURITY.md deleted file mode 100644 index e63a3232..00000000 --- a/SECURITY.md +++ /dev/null @@ -1,67 +0,0 @@ -# Security Policy - -## Supported versions - -Security fixes are issued against the latest minor release line. While codeiq is pre-1.0 (`0.x.y`) only the **latest** released `0.MINOR.x` line receives backports; older minor lines are EOL the moment a new minor ships. 
- -| Version line | Status | -|---|---| -| `0.3.x` | Supported (current — Go single binary) | -| `0.2.x` and below | Unsupported (Java/Spring Boot reference, deleted at Phase 6 cutover) | - -Development builds (untagged `main`) are not covered — track the latest tagged release. - -## Reporting a vulnerability - -Please **do not open a public GitHub issue** for security problems. - -Use one of: - -- **GitHub private vulnerability report** — preferred. Open `https://github.com/RandomCodeSpace/codeiq/security/advisories/new` (you must be signed in to GitHub). The advisory channel is monitored by the maintainer. -- **Email** — `ak.nitrr13@gmail.com`. Put `[codeiq security]` in the subject so the report is triaged ahead of normal mail. - -Please include: - -- The codeiq version (`codeiq --version`). -- The shortest reproducer you can produce — a CLI command, a test case, or an indexed-fixture path. -- Your assessment of impact (e.g., RCE, path traversal, info-disclosure, DoS). -- Whether the issue is in a transitive dependency (please name the dependency + advisory ID if known). - -## What you can expect - -- **Acknowledgement** within 72 hours. -- **Initial triage** within 7 days, with a severity rating (CVSS v3.1) and an indicative remediation timeline. -- **Coordinated disclosure** — we will agree on a public-disclosure date with the reporter; default is 90 days from triage, sooner for low-impact / already-public issues. -- **Credit** in the GHSA advisory and `CHANGELOG.md` (unless the reporter requests anonymity). - -We do not currently run a paid bug bounty. - -## Scope - -In-scope: - -- The `codeiq` CLI binary and every subcommand (`index`, `enrich`, `mcp`, `query`, `find`, `cypher`, `stats`, `flow`, `graph`, `topology`, `review`, `cache`, `plugins`, `config`). 
-- The stdio MCP server (`codeiq mcp`) — including its 10 user-facing tools (`graph_summary`, `find_in_graph`, `inspect_node`, `trace_relationships`, `analyze_impact`, `topology_view`, `run_cypher`, `read_file`, `generate_flow`, `review_changes`). The mutation gate on `run_cypher` is in-scope — bypassing it to mutate the read-only Kuzu store is a vulnerability. -- The pipeline cache (SQLite, `.codeiq/cache/codeiq.sqlite`) and graph store (Kuzu embedded, `.codeiq/graph/codeiq.kuzu`) — including local privilege escalation and data tampering of the indexed graph. -- File-read sandboxing in `read_file` and `codeiq review` — path traversal out of the indexed root is in-scope. -- The release pipeline — Goreleaser config, signing keys (cosign keyless via OIDC), GitHub Actions workflows under `.github/workflows/`, and the published artifacts (binary tarballs + checksums + cosign bundles). - -Out of scope: - -- Vulnerabilities that require pre-existing local code execution on the developer's machine (we ship as a developer tool — by definition you trust the code you point it at). -- Public-internet attack surface — codeiq does not expose any service to the public internet. It is a CLI + stdio MCP server only; there is no REST API and no web UI (the Java reference had both; they were deleted in Phase 6 cutover and will not be reintroduced). -- Vulnerabilities in the LLM endpoint used by `codeiq review` (Ollama local or cloud) — those are the LLM vendor's surface area. -- Findings in third-party services we do not control (GitHub itself, OpenSSF, Socket Security, etc.) — please report those upstream. - -## Hardening references - -- [`shared/runbooks/engineering-standards.md`](shared/runbooks/engineering-standards.md) — CVE policy and quality gates. -- [`shared/runbooks/rollback.md`](shared/runbooks/rollback.md) §6 — secret rotation flow. -- `.github/workflows/scorecard.yml` — OpenSSF Scorecard supply-chain checks. 
-- `.github/workflows/security.yml` — CodeQL, Semgrep, OSV-Scanner, Trivy, Gitleaks, SBOM, Socket Security on every PR. -- `.github/workflows/perf-gate.yml` — enrich memory regression gate (300 MB ceiling on fixture-multi-lang). -- `.github/dependabot.yml` — automated `gomod` + `github-actions` bumps, grouped per ecosystem. - -## Changelog - -This file is versioned as part of the repo. Material changes (e.g., raising the supported-versions table, changing the disclosure timeline) are announced via a Release note. diff --git a/docs/audits/2026-04-28-serve-path-prod-readiness-counter.md b/docs/audits/2026-04-28-serve-path-prod-readiness-counter.md deleted file mode 100644 index ab840594..00000000 --- a/docs/audits/2026-04-28-serve-path-prod-readiness-counter.md +++ /dev/null @@ -1,156 +0,0 @@ -# Counter-Audit: serve-path Production Readiness -**Original audit:** `docs/audits/2026-04-28-serve-path-prod-readiness.md` (15 findings: 6 HIGH / 7 MEDIUM / 2 LOW) -**Counter-audit date:** 2026-04-28 -**Method:** Every finding verified against actual source files. Net-new findings added from independent inspection of GraphController, McpTools, ServeCommand, GraphHealthIndicator, CorsConfig, BundleCommand, SafeFileReader, CodeIqConfig, McpLimitsConfig, McpAuthConfig, GraphStore, application.yml, pom.xml, security.yml, ci-java.yml, and frontend/package.json. - ---- - -## Section A — Corrections to Original Audit - -### A1. Finding #6 overstated — `markReady()` fires AFTER `bootstrapNeo4jFromCache()`, not before - -**Original claim:** "`markReady()` fires before the graph is loaded" — traffic is routed during bootstrap. - -**What the code actually does:** `ServeCommand.java` lines 83–126: `graphBootstrapper.bootstrapNeo4jFromCache()` is called at line 84 and returns only when bootstrap completes. `markReady()` is called at line 126, after that return and after the node/edge count is printed to stdout. 
The auto-bootstrap race window the audit describes does not exist in the current code. - -**What IS real:** `GraphHealthIndicator` is not wired into the readiness group in `application.yml` (no `management.endpoint.health.group.readiness.include: readinessState,graph`). So the Kubernetes readiness probe does not gate on graph health. If a future code change moves `markReady()` earlier, or if bootstrap is made async, this becomes acute. The config gap is the real finding. - -**Correction:** Downgrade finding #6 from HIGH to MEDIUM. The fix (adding `management.endpoint.health.group.readiness.include: readinessState,graph` to the serving profile in `application.yml`) is still valid and still needed, but the acute rollout-during-bootstrap race does not currently exist. - ---- - -### A2. Finding #10 is half-wrong — REST `traceImpact` IS depth-capped; MCP is not - -**Original claim:** "the API endpoint `/api/triage/impact/{id}` (`GraphController:188`) doesn't appear to bound it." - -**What the code actually does:** `GraphController.java` line 192: -```java -int cappedDepth = Math.min(depth, config.getMaxDepth()); -``` -`config.getMaxDepth()` defaults to 10 (`CodeIqConfig.java`). The REST endpoint is bounded. - -**What IS true — in the other direction:** `McpTools.traceImpact` passes `depth != null ? depth : 3` directly to `queryService.traceImpact` with no `Math.min` guard. The MCP path is unbounded. - -**Correction:** The unbounded-depth defect exists on the MCP tool, not the REST controller. The fix targets `McpTools.traceImpact`, not `GraphController`. Severity (MEDIUM) is unchanged; affected surface is corrected. See also C3 below. - ---- - -### A3. Finding #11 is partially wrong — `SafeFileReader` does enforce a byte cap - -**Original claim:** "no `Content-Length` cap matches `getMaxFileBytes`" — implies size is unbounded. 
- -**What the code actually does:** `GraphController.readFile` calls `SafeFileReader.read(resolved, startLine, endLine, config.getMaxFileBytes())`. `SafeFileReader` enforces the byte cap for both full-file and line-range reads. The cap is real and working. - -**What IS real:** Content-type is not sniffed. Binary files (`.jks`, `.so`, `.png`) are served as `text/plain` with no early reject. The slow-client connection exhaustion concern is valid. The size-cap concern is not. - -**Correction:** Remove "no Content-Length cap" from finding #11. The binary content-type concern stands; severity (MEDIUM) is unchanged. - ---- - -## Section B — Severity Adjustments - -### B1. Finding #6 — downgrade from HIGH to MEDIUM (per A1) - -The bootstrap-before-markReady ordering eliminates the acute readiness race. The residual gap — readiness group not configured — is MEDIUM: probes do not gate on graph health, which can cause transient 503s after a pod restart if Neo4j is slow to open, but the bootstrap path completes before traffic is accepted under normal conditions. - ---- - -### B2. Finding #4 — scope extended; severity remains HIGH - -Finding #4 correctly flags null checksums in `BundleCommand.createManifest` (line 241: `null` passed as the checksums argument). This is confirmed. However the finding misses a co-equal defect: `generateServeShell` (lines 265–274 in `BundleCommand.java`) emits a `serve.sh` that unconditionally downloads the JAR at runtime: - -```bash -curl -fL -o "$JAR" \ - "https://repo1.maven.org/maven2/io/github/randomcodespace/iq/code-iq/${VERSION}/code-iq-${VERSION}-cli.jar" -``` - -An equivalent `serve.bat` exists for Windows. This is a direct violation of `build.md` ("No runtime network calls to the public internet"). Bundles deployed in air-gapped environments silently fail to start when the JAR is absent. 
A compromised Maven Central namespace could substitute a malicious JAR, and even correct checksums in `manifest.json` would not protect against it because the downloaded JAR is not verified against them. The fix for finding #4 must address both the null checksums **and** the runtime download. - ---- - -## Section C — Missed Findings - -### C1. HIGH — `getCachedData()` loads the full graph into heap on every topology MCP tool call - -**Symptom in prod:** Seven MCP tools — `serviceDetail`, `blastRadius`, `findPath`, `findBottlenecks`, `findCircularDeps`, `findDeadServices`, `findNode` — all begin by calling `getCachedData()` (`McpTools.java` lines 83–92). `getCachedData()` calls `graphStore.findAll()`, which executes two full graph scans (`findAllNodes` + `findAllEdges`) and materialises every node and every edge into a `GraphData` record on the Java heap. On a 5M-node enriched graph this is multiple gigabytes per call. No result is cached between invocations — each call pays the full allocation cost. Two concurrent `blast_radius` invocations double-allocate. This is an OOM vector independent of the `run_cypher` issue in finding #2, and it is triggered by normal topology tool usage, not by adversarial queries. - -**File / location:** `src/main/java/io/github/randomcodespace/iq/mcp/McpTools.java:83–92` (`getCachedData`); `src/main/java/io/github/randomcodespace/iq/graph/GraphStore.java` (`findAll`, `findAllNodes`, `findAllEdges`). - -**Severity:** HIGH - -**Fix proposal:** Replace `getCachedData()` with targeted Cypher queries per tool (e.g. `blastRadius` needs only the subgraph reachable from the seed node, not all 5M nodes). If a full snapshot is required for correctness, populate it once at serve startup into a size-bounded `SoftReference` and invalidate on graph reload. Add a Caffeine cache with a 60-second TTL and a max-weight bound in bytes. Effort: M. - ---- - -### C2. 
MEDIUM — Swagger UI exposed unauthenticated; full API schema readable by any cluster workload - -**Symptom in prod:** `pom.xml` includes `springdoc-openapi-starter-webmvc-ui:3.0.3`. SpringDoc auto-registers `/swagger-ui/index.html`, `/swagger-ui.html`, and `/v3/api-docs` at startup with no authentication guard. Because there is no Spring Security on the classpath (finding #1), no filter intercepts these paths. Any actor who can reach the pod gets the complete OpenAPI schema: every endpoint path, parameter name, response shape, and the full enumeration of `NodeKind` / `EdgeKind` values. This is reconnaissance-in-depth that lowers the cost of exploiting finding #1. Neither `springdoc.swagger-ui.enabled` nor `springdoc.api-docs.enabled` is set to `false` in the serving profile of `application.yml`. - -**File / location:** `pom.xml` (`springdoc-openapi-starter-webmvc-ui:3.0.3`); `src/main/resources/application.yml` (no `springdoc.*` keys in serving profile). - -**Severity:** MEDIUM - -**Fix proposal:** In `application.yml` serving profile: `springdoc.swagger-ui.enabled: false` and `springdoc.api-docs.enabled: false`. Provide opt-in `codeiq.serving.swagger-ui.enabled: true` for local development. When auth (finding #1) is implemented, gate `/swagger-ui/**` and `/v3/api-docs/**` behind the same bearer check. Effort: XS. - ---- - -### C3. MEDIUM — `McpTools.traceImpact` has no depth cap on the MCP path - -**Symptom in prod:** As established in A2, `McpTools.traceImpact` forwards caller-supplied `depth` to `queryService.traceImpact` without any `Math.min` guard. A malicious or runaway MCP client sends `depth=1000` on a hub node; the resulting `RELATES_TO*1..1000` variable-length Cypher match runs until the transaction timeout fires — which is also not configured (finding #2). The REST endpoint at `GraphController:192` is safe; the MCP surface is not. `McpLimitsConfig` already defines a `maxDepth` field that is never consumed here. 
- -**File / location:** `src/main/java/io/github/randomcodespace/iq/mcp/McpTools.java` (`traceImpact` method, depth forwarding). - -**Severity:** MEDIUM - -**Fix proposal:** `int safeDepth = depth != null ? Math.min(depth, mcpLimitsConfig.maxDepth()) : 3;`, which wires in the already-parsed `maxDepth` from `McpLimitsConfig`. Effort: XS. - ---- - -### C4. MEDIUM — `semgrep` installed from PyPI without a pinned version in `security.yml` - -**Symptom in prod:** `.github/workflows/security.yml` line 94 runs `python -m pip install --quiet --upgrade pip semgrep` with no version pin. Every workflow run fetches the latest `semgrep` release from PyPI at the moment of execution. Every other tool in the same workflow is pinned: `osv-scanner` uses `OSV_SCANNER_VERSION: 2.3.5` with a named release download; `gitleaks` uses `GITLEAKS_VERSION: 8.30.1`; all GitHub Actions are SHA-pinned. A compromised PyPI release of `semgrep` (or a transitive dependency) would execute arbitrary code inside the SAST job, which runs with `contents: read` permission and access to `GITHUB_TOKEN`. This directly contradicts the workflow's header comment: "All actions SHA-pinned per Scorecard `Pinned-Dependencies`." - -**File / location:** `.github/workflows/security.yml:94`. - -**Severity:** MEDIUM - -**Fix proposal:** Pin to a specific version: `pip install semgrep==` (resolve via PyPI). Better: use `returntocorp/semgrep-action` pinned by commit SHA (free for OSS), which eliminates the PyPI install entirely and aligns with the workflow's existing SHA-pinning posture. Effort: XS. - ---- - -### C5. LOW — CLAUDE.md tech-stack version table is stale on three components - -**Symptom:** CLAUDE.md "Tech Stack" section states Spring Boot `4.0.5`, Spring AI `2.0.0-M3`, Neo4j Embedded `2026.02.3`. Actual values in `pom.xml`: Spring Boot `4.0.6`, Spring AI `2.0.0-M4`, Neo4j `2026.04.0`. Stale docs cause reviewers and automated tooling to reference wrong versions when checking CVE databases or compatibility matrices.
- -**File / location:** `CLAUDE.md` (Tech Stack section); `pom.xml` (ground truth). - -**Severity:** LOW - -**Fix proposal:** Update three lines in CLAUDE.md. Note that `pom.xml` is the SSoT; CLAUDE.md is informational. Effort: XS. - ---- - -## Summary Table - -| # | Original Finding | Verdict | Adjustment | -|---|-----------------|---------|------------| -| 1 | No auth on API/MCP | Confirmed | — | -| 2 | `run_cypher` no cap / timeout / READ mode | Confirmed | — | -| 3 | No rate limiting | Confirmed | — | -| 4 | Unsigned bundle + null checksums | Confirmed + extended | serve.sh/bat runtime Maven Central download is co-equal defect; fix must cover both | -| 5 | `/api/file` ships secrets in bundle | Confirmed | — | -| 6 | Readiness fires before graph load | **Partially wrong** | Downgrade HIGH → MEDIUM; bootstrap-before-markReady ordering is correct; readiness group config gap is the real issue | -| 7 | No `@RestControllerAdvice`; stack trace leak | Confirmed | — | -| 8 | MCP errors return HTTP 200 | Confirmed | — | -| 9 | No structured logs / request ID / MDC | Confirmed | — | -| 10 | `findShortestPath` + `traceImpact` unbounded | **Partially wrong** | REST `traceImpact` IS capped via `Math.min`; MCP `traceImpact` is NOT; fix target corrected | -| 11 | `/api/file` no size cap; binary served as text | **Partially wrong** | Size cap IS enforced by SafeFileReader; binary content-type issue stands; remove size-cap claim | -| 12 | `GraphHealthIndicator` uncached count on every probe | Confirmed | — | -| 13 | CORS defaults wrong; no CSP / security headers | Confirmed | — | -| 14 | Bad YAML silently uses defaults; no fail-fast | Confirmed | — | -| 15 | Zero integration tests for auth / rate-limit path | Confirmed | — | -| C1 | `getCachedData()` full graph load per topology call | **NET NEW** | HIGH | -| C2 | Swagger UI unauthenticated | **NET NEW** | MEDIUM | -| C3 | MCP `traceImpact` no depth cap | **NET NEW** | MEDIUM | -| C4 | `semgrep` unpinned in `security.yml` | 
**NET NEW** | MEDIUM | -| C5 | CLAUDE.md version table stale | **NET NEW** | LOW | diff --git a/docs/audits/2026-04-28-serve-path-prod-readiness.md b/docs/audits/2026-04-28-serve-path-prod-readiness.md deleted file mode 100644 index 1b572d62..00000000 --- a/docs/audits/2026-04-28-serve-path-prod-readiness.md +++ /dev/null @@ -1,177 +0,0 @@ -## 1. HIGH — MCP and REST API are fully unauthenticated; one curl from anywhere on the cluster reads the whole graph - -**Symptom in prod:** Pod has no auth on `/api/**` or `/mcp` (no Spring Security on classpath, no `@PreAuthorize`, no filter, no token check). Any other workload in the AKS namespace — including a compromised sidecar in another tenant's pod that resolves the codeiq Service — can hit `GET /api/file?path=...` and exfiltrate every byte under the analyzed codebase root, plus run arbitrary read-only Cypher via `POST /mcp` `run_cypher`. The unified config defines `mcp.auth.mode: bearer|mtls` (`McpAuthConfig`) but **nothing wires it into a filter** — the field is dead. East-west attack on multi-tenant pipeline = data exfil from other tenants' analyzed source. - -**File / location:** `src/main/java/io/github/randomcodespace/iq/api/GraphController.java:39` (no `@PreAuthorize`); `src/main/java/io/github/randomcodespace/iq/mcp/McpTools.java:269` (no auth check); `pom.xml` (no `spring-boot-starter-security`); `src/main/java/io/github/randomcodespace/iq/config/unified/McpAuthConfig.java` (config class, never consumed). - -**Severity:** HIGH - -**Fix proposal:** Add `spring-boot-starter-security`. Implement `SecurityFilterChain` in a new `config/SecurityConfig.java` that, when `codeiq.mcp.auth.mode=bearer`, requires `Authorization: Bearer ${CODEIQ_MCP_TOKEN}` on `/api/**` AND `/mcp/**` (constant-time compare). Permit only `/actuator/health/*`. Default `mode=none` permitted only when `spring.profiles.active` contains `local`. Effort: M. - ---- - -## 2. 
HIGH — `run_cypher` has zero result-set cap, zero query timeout, and runs in the default (read+write) tx mode - -**Symptom in prod:** A single MCP client sends `MATCH (a:CodeNode), (b:CodeNode), (c:CodeNode) RETURN a, b, c LIMIT 999999999`. `runCypher` accumulates rows in an `ArrayList>` with no cap, the JVM heap fills, `OutOfMemoryError` triggers (heap dump goes to `/tmp` per `aks-launch.sh:51`, eats tmpfs), pod is `OOMKilled`. Tenant outage ≥60s while replica restarts and re-bootstraps Neo4j. Embedded Neo4j has no per-query memory limit configured (`Neo4jConfig.java`, no `dbms.memory.transaction.max_size`). Additionally, `tx.execute(query)` runs in default access mode, not READ — so a procedure registered later (or one this regex-blocklist misses) could mutate. The CLAUDE.md "Gotchas" already calls out RAN-31 ("pin run_cypher to Neo4j READ access mode") but the current code at `mcp/McpTools.java:296` still uses `graphDb.beginTx()` not `beginTx(KernelTransaction.Type.IMPLICIT, AUTH_DISABLED, AccessMode.Static.READ, timeoutMs, MILLIS)`. - -**File / location:** `src/main/java/io/github/randomcodespace/iq/mcp/McpTools.java:269-318` (`runCypher`); `mcp/McpTools.java:311` (unbounded `rows.add`); `src/main/java/io/github/randomcodespace/iq/config/Neo4jConfig.java` (no transaction-timeout / memory settings). - -**Severity:** HIGH - -**Fix proposal:** Use `graphDb.beginTx(perToolTimeoutMs, MILLIS)` (transaction timeout already in `McpLimitsConfig.perToolTimeoutMs=15000`). Cap rows at `mcp.limits.max_results` (500) and stop iterating; return a `truncated: true` flag. Cap accumulated payload bytes at `mcp.limits.max_payload_bytes` (2 MB) by serializing-as-you-go. Configure `dbms.memory.transaction.max_size=512m` in `Neo4jConfig`. Effort: S. - ---- - -## 3. 
HIGH — No rate limiting anywhere; one MCP client saturates the pod for everyone - -**Symptom in prod:** `mcp.limits.rate_per_minute: 300` is defined in `McpLimitsConfig` and parsed by `UnifiedConfigLoader.java:166` but **no filter or interceptor enforces it** (zero hits for `Bucket4j|Resilience4j|RateLimiter|HandlerInterceptor` in main source). One agent client in a runaway loop fires `find_cycles` (which runs `MATCH p=(a)-[:RELATES_TO*2..10]->(a)` — graph-wide variable-length match, no per-call limit) at hundreds of QPS. Tomcat virtual-thread executor saturates Neo4j page cache, p99 on `/api/stats` jumps from 50 ms to multi-second, readiness probe (`periodSeconds: 5`) starts to flake, kubelet restarts the pod (`replicas: 1` — no failover), tenant goes dark. - -**File / location:** `src/main/java/io/github/randomcodespace/iq/mcp/McpTools.java` (no rate limiter); `src/main/java/io/github/randomcodespace/iq/api/GraphController.java` (no rate limiter); `src/main/java/io/github/randomcodespace/iq/config/unified/McpLimitsConfig.java` (`ratePerMinute` parsed but unused). - -**Severity:** HIGH - -**Fix proposal:** Add Bucket4j (Apache-2.0, single dep, ~80 KB). Register an `OncePerRequestFilter` keyed by `Authorization` token (or remote IP fallback) with a refill-per-second token bucket sized at `mcp.limits.rate_per_minute / 60`. 429 with `Retry-After` header on bucket exhaustion. Apply to `/api/**` and `/mcp/**`. Effort: S. - ---- - -## 4. HIGH — Bundle is unsigned and unverified; init-container blindly unzips whatever Nexus serves - -**Symptom in prod:** AKS init-container (`shared/runbooks/aks-read-only-deploy.md:48-72`) runs `curl -u $NEXUS_USER:$NEXUS_PASS .../bundle.zip | unzip` with no checksum verification, no signature check. `ArtifactManifest` defines a `checksums` field (`Map`) but `BundleCommand.createManifest` (`cli/BundleCommand.java`) passes `null` for it (sed shows `null` literal in the constructor call). 
On Nexus credential compromise, or with a malicious internal user who has `codeiq-bundles` write access, an attacker can swap `bundle.zip` for one whose `graph.db/` is planted with a Cypher full-text index that triggers a JNDI lookup, or one carrying a tampered `serve.sh` (never actually invoked at runtime, but still shipped unverified). Once bundles are signed, `manifest.json` can be trusted as well. A single tenant's bundle becomes a foothold across the whole pipeline because the same Nexus path is served to every replica. - -**File / location:** `src/main/java/io/github/randomcodespace/iq/cli/BundleCommand.java:141-150` (manifest checksum field passed `null`); `src/main/java/io/github/randomcodespace/iq/intelligence/ArtifactManifest.java` (record defines `checksums` but it is never populated); `shared/runbooks/aks-read-only-deploy.md:48-72` (no `sha256sum -c` step). - -**Severity:** HIGH - -**Fix proposal:** In `BundleCommand`, after writing each entry, accumulate SHA-256 in a `MessageDigest` and emit the map. Write a sibling `bundle.zip.sha256` file uploaded next to the bundle. In the init-container, fetch `.sha256` first and `sha256sum -c` before unzip. For tamper-resistance, also sign with cosign / GPG (Sigstore = supply-chain consistent with §7.1 of engineering-standards). Effort: M. - ---- - -## 5. HIGH — `/api/file` reads anything under the codebase root; bundle ships full source — credentials, .env, .pem all readable - -**Symptom in prod:** `GraphController.readFile` (line 255) and `McpTools.readFile` (line 394) traverse-protect to the codebase root, but the bundle (`BundleCommand`, `source/` directory) ships **the entire source tree** including `.env`, `.aws/credentials` if committed, private keys checked in by mistake, and secrets in `application-local.yml`. An authenticated MCP client (or an unauthenticated one, until #1 is fixed) calls `read_file(path=".env")` and prints the file. There is no extension allow-list, no `.gitignore`-aware filter at bundle time, no scrubber.
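
To make the missing filter concrete, here is a minimal sketch of a deny-list check that could run both at bundle time and at read time. It uses only the JDK. The class name `SensitivePathFilter`, the method `isSensitive`, and the specific name, extension, and directory rules are assumptions chosen for illustration, not existing codeiq code, and the sketch does not attempt `.gitignore` parsing.

```java
import java.nio.file.Path;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: not part of the audited codebase.
final class SensitivePathFilter {
    // Assumed deny-list; a real deployment would load this from config.
    private static final Set<String> BLOCKED_EXTENSIONS = Set.of(".pem", ".key");
    private static final Set<String> BLOCKED_DIRS = Set.of("secrets");

    private SensitivePathFilter() {}

    /** Returns true when a path relative to the codebase root should not be bundled or served. */
    static boolean isSensitive(Path relativePath) {
        Path normalized = relativePath.normalize();
        Path fileName = normalized.getFileName();
        if (fileName == null) {
            return true; // no file name (e.g. a filesystem root): reject
        }
        String name = fileName.toString().toLowerCase(Locale.ROOT);
        if (name.startsWith(".env") || name.startsWith("id_rsa") || name.equals("credentials")) {
            return true;
        }
        for (String ext : BLOCKED_EXTENSIONS) {
            if (name.endsWith(ext)) {
                return true;
            }
        }
        // Reject anything that lives below a blocked directory.
        Path parent = normalized.getParent();
        if (parent != null) {
            for (Path segment : parent) {
                if (BLOCKED_DIRS.contains(segment.toString().toLowerCase(Locale.ROOT))) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Matching on the file name and path segments, rather than on glob strings, sidesteps the case where a root-level `.env` fails to match a `**/.env*` style glob. A deny-list like this is only a backstop; an extension allow-list is the stricter option.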
- -**File / location:** `src/main/java/io/github/randomcodespace/iq/api/GraphController.java:255-310` (`readFile`); `src/main/java/io/github/randomcodespace/iq/mcp/McpTools.java:394-420`; `src/main/java/io/github/randomcodespace/iq/cli/BundleCommand.java` (`source/` packaging — no exclusion). - -**Severity:** HIGH - -**Fix proposal:** At bundle time, exclude a curated set: `**/.env*`, `**/*.pem`, `**/*.key`, `**/id_rsa*`, `**/credentials`, `**/secrets/**`, anything matched by `.gitignore`. At read time, reject those same patterns even if they slip through. Add a `serving.read_file_extension_allowlist` config (default = source-code extensions only). Effort: S. - ---- - -## 6. HIGH — `/actuator/health/readiness` returns 200 before the graph is loaded - -**Symptom in prod:** `ServeCommand.markReady()` publishes `ReadinessState.ACCEPTING_TRAFFIC` after the Spring context is up, but `GraphHealthIndicator` (`health/GraphHealthIndicator.java`) is registered as a generic `HealthIndicator`, not under the readiness group. With Spring Boot's defaults, custom `HealthIndicator`s land in the liveness+readiness composite **only if they're added to the `readiness` group**. Right now: pod becomes "ready" the moment Spring starts (~8-16s per CLAUDE.md) but `GraphBootstrapper` is still loading H2 → Neo4j (can take seconds-to-minutes for big graphs). Readiness probe passes, kube-proxy routes traffic, every request 503s with "Neo4j graph not available" (`GraphController.requireQueryService:line ~30`). On rolling deploy this also means the new pod is marked ready before old pod is drained → 100% error rate during the rollover window. 
- -**File / location:** `src/main/java/io/github/randomcodespace/iq/cli/ServeCommand.java:~110` (`markReady()`); `src/main/java/io/github/randomcodespace/iq/health/GraphHealthIndicator.java:1-40` (no readiness group); `application.yml` `serving` profile (`management.health.readinessstate.enabled: true` but no `management.endpoint.health.group.readiness.include: graph,readinessState`). - -**Severity:** HIGH - -**Fix proposal:** Move `markReady()` to fire **after** `GraphBootstrapper` returns AND `graphStore.count() > 0`. Add to `application.yml` (serving profile): `management.endpoint.health.group.readiness.include: readinessState,graph`. Add a regression test. Effort: S. - ---- - -## 7. MEDIUM — No `@RestControllerAdvice`; uncaught exceptions return generic 500s with stack-trace bodies, no error envelope - -**Symptom in prod:** `grep '@ControllerAdvice'` returns zero hits in `src/main/java`. When `QueryService.nodesByKind` throws (Neo4j tx died, NPE on a malformed cached node, etc.), Spring's default error attributes return a JSON body with `"trace": "...full stack..."` if `server.error.include-stacktrace` defaults haven't been turned off — and nothing in `application.yml` turns it off. On-call sees redacted `INTERNAL_SERVER_ERROR` in clients but the response body leaks classnames + line numbers (CWE-209). MCP tools partially mask this by returning `{"error": "..."}` 200 (which is its OWN problem — see finding #8). REST has no consistent error envelope at all. - -**File / location:** `src/main/java/io/github/randomcodespace/iq/api/GraphController.java` (mixed `ResponseStatusException` + raw return); no `*ControllerAdvice*.java` files; missing `server.error.include-stacktrace=never` in `application.yml`. - -**Severity:** MEDIUM - -**Fix proposal:** Add `api/GlobalExceptionHandler.java` with `@RestControllerAdvice`. Map `ResponseStatusException` through, all others to `{"code": "INTERNAL", "message": , "request_id": }` with HTTP 500. 
Set `server.error.include-stacktrace: never` and `server.error.include-message: never` in the serving profile. Effort: S. - ---- - -## 8. MEDIUM — MCP tools return `{"error": "..."}` with HTTP 200, defeating client retry logic and observability - -**Symptom in prod:** Every `catch (Exception e)` in `McpTools` returns `toJson(Map.of(PROP_ERROR, e.getMessage()))` as a successful 200 response. Spring Boot metrics (`http.server.requests`) record these as 2xx, so error-rate dashboards stay green during incidents. MCP clients with retry-on-non-2xx never retry, never alert. Worse, `e.getMessage()` from a Neo4j parse error can leak query structure / node IDs from another tenant if a path-traversal bug ever lands. - -**File / location:** `src/main/java/io/github/randomcodespace/iq/mcp/McpTools.java` (35+ `catch (Exception e) { return toJson(Map.of(PROP_ERROR, e.getMessage())); }` blocks). - -**Severity:** MEDIUM - -**Fix proposal:** Define error codes (`INVALID_INPUT`, `NOT_FOUND`, `INTERNAL`, `RATE_LIMITED`). Return MCP-spec-compliant errors (Spring AI MCP supports throwing — verify on its API). At minimum: log with stack trace at WARN, return `{"error": {"code": "INTERNAL", "message": "internal error", "request_id": ...}}` with the actual message redacted unless it's an `IllegalArgumentException`. Effort: S. - ---- - -## 9. MEDIUM — No structured logs, no request ID, no MDC; on-call has no way to correlate a slow request to a Neo4j query - -**Symptom in prod:** `grep MDC.put|requestId|X-Request-ID|OncePerRequestFilter` in `src/main/java`: zero hits. Pod logs are default Spring Boot text format. When customer reports "the graph endpoint hung for 30s at 14:32", on-call has only timestamp matching to find the query, no per-request span ID. With virtual threads enabled (`spring.threads.virtual.enabled: true`) and N concurrent slow requests, log lines interleave with no way to demux. 
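
As a rough illustration of the correlation key that is missing here, the sketch below resolves a request ID using only the JDK: it accepts a caller-supplied `X-Request-ID` value when it looks safe to log, and otherwise mints a UUID. The class name `RequestIds`, the 64-character limit, and the allowed character set are assumptions; wiring the result into a servlet filter and the logging MDC is deliberately not shown.

```java
import java.util.UUID;
import java.util.regex.Pattern;

// Hypothetical helper: not part of the audited codebase.
final class RequestIds {
    // Assumed policy: short, log-safe tokens only, so a client cannot inject
    // newlines or oversized values into log lines.
    private static final Pattern SAFE_ID = Pattern.compile("[A-Za-z0-9._-]{1,64}");

    private RequestIds() {}

    /** Returns the incoming header value if it is safe to log, else a fresh random UUID. */
    static String resolve(String headerValue) {
        if (headerValue != null && SAFE_ID.matcher(headerValue).matches()) {
            return headerValue;
        }
        return UUID.randomUUID().toString();
    }
}
```

Validating the incoming value before logging it matters because the header is client-controlled; echoing it unchecked would open a log-injection path.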
- -**File / location:** `src/main/resources/logback*.xml` (none — uses Spring Boot default); `src/main/resources/application.yml` (no `logging.pattern.level`); no `RequestIdFilter`. - -**Severity:** MEDIUM - -**Fix proposal:** Add `logback-spring.xml` with JSON appender (logstash-logback-encoder, MIT, single dep) gated on `spring.profiles.active=serving`. Add a `RequestIdFilter` (`OncePerRequestFilter`) that pulls `X-Request-ID` or generates a UUID, populates MDC, returns it in the response header. Add `Micrometer` timers around each `@McpTool` (Spring AI auto-instruments REST). Expose `/actuator/prometheus` (currently `metrics` is exposed but not the Prometheus scrape endpoint). Effort: M. - ---- - -## 10. MEDIUM — `GraphStore.findShortestPath` and `traceImpact` have unbounded depth or fixed `[*..20]` with no row limit, no time guard - -**Symptom in prod:** `GraphStore.findShortestPath` (line 453) runs `MATCH p = shortestPath((a)-[*..20]-(b)) RETURN [n IN nodes(p) | n.id]` — fine on small graphs, on a 5M-node enriched bundle this is 30+ seconds. `traceImpact` runs `MATCH (a)-[:RELATES_TO*1..$depth]->(b)` with `depth` capped at 10 by `McpTools.traceImpact:line ~349` — but the API endpoint `/api/triage/impact/{id}` (`GraphController:188`) doesn't appear to bound it. With 99 detector kinds and `RELATES_TO*1..10` on a hub node (e.g. a popular library import), this is a Cartesian explosion. No `WITH p LIMIT N` cap, no `dbms.transaction.timeout` configured. - -**File / location:** `src/main/java/io/github/randomcodespace/iq/graph/GraphStore.java:453` (`shortestPath`); `:line for traceImpact`; `src/main/java/io/github/randomcodespace/iq/api/GraphController.java:188` (`triage/impact`). - -**Severity:** MEDIUM - -**Fix proposal:** Set `dbms.transaction.timeout=30s` in `Neo4jConfig`. Add `LIMIT $maxNodes` (e.g. 10000) on every `*..N` query. Bound `depth` ≤ 5 in REST endpoint and validate. Effort: S. - ---- - -## 11. 
MEDIUM — `/api/file` content-type is `text/plain` for all files; binary data dumps; no `Content-Length` cap matches `getMaxFileBytes` - -**Symptom in prod:** `readFile` returns binary files (a checked-in `.png`, `.jks` keystore, native `.so`) as `text/plain` with garbled UTF-8. Browser logs the entire base64-mangled body. The implementation reads via `SafeFileReader.read(resolved, startLine, endLine, config.getMaxFileBytes())` so size is bounded, but content-type isn't sniffed and there's no early-reject for non-text files. Slow client reading 1 MB file at 1 KB/s — keeps a virtual thread + a Tomcat connection occupied for 1000s. - -**File / location:** `src/main/java/io/github/randomcodespace/iq/api/GraphController.java:255-310`. - -**Severity:** MEDIUM - -**Fix proposal:** Probe content type with `Files.probeContentType` or magic-byte check; if not `text/*`, return 415. Set `server.tomcat.connection-timeout=10s`, `server.tomcat.max-swallow-size=1MB`. Effort: S. - ---- - -## 12. MEDIUM — `GraphHealthIndicator.health()` calls `graphStore.count()` on every probe — `MATCH (n:CodeNode) RETURN count(n)` against an embedded DB - -**Symptom in prod:** Readiness probe `periodSeconds: 5` → 12 full Cypher count queries per minute, each holding a transaction open. On a 5M-node graph with concurrent user traffic, this contends with the page cache. Liveness probe also fires every 10s. The current implementation has no cache/throttle. - -**File / location:** `src/main/java/io/github/randomcodespace/iq/health/GraphHealthIndicator.java:30`. - -**Severity:** MEDIUM - -**Fix proposal:** Cache `count()` result for 30 s in an `AtomicReference`. Or: only verify "graph reachable" via a constant-time `tx.execute("RETURN 1").hasNext()`. Effort: S. - ---- - -## 13. 
MEDIUM — `CorsConfig` default allows `http://localhost:[*]` and `http://127.0.0.1:[*]`; in cluster, this is wrong but undetected; no CSP - -**Symptom in prod:** Default `codeiq.cors.allowed-origin-patterns` (`config/CorsConfig.java:14`) is hardcoded to dev-loopback patterns. In AKS, the React UI is served same-origin (no CORS needed) — this is fine — but if anyone exposes the API behind a reverse proxy at a different origin, they'll get cryptic CORS failures because the YAML doesn't override it (`codeiq.yml.example` doesn't even include it). Worse: zero CSP / X-Frame-Options / X-Content-Type-Options headers means the served React UI is clickjackable and the JSON endpoints can be loaded into a hostile origin's `