From 81273d868ea01205bb27d096cc014710e2edca57 Mon Sep 17 00:00:00 2001
From: HumanBean17 <doudmitry@gmail.com>
Date: Mon, 4 May 2026 16:44:15 +0000
Subject: [PATCH 1/2] added cursor rules, .cursorignore and AGENTS.md for cli
 agents

---
 .cursor/rules/agent-workflow.mdc     |  86 ++++++++++++++++++++++
 .cursor/rules/graph-and-ast.mdc      | 102 +++++++++++++++++++++++++++
 .cursor/rules/mcp-server.mdc         |  75 ++++++++++++++++++++
 .cursor/rules/project-overview.mdc   |  71 +++++++++++++++++++
 .cursor/rules/python-style.mdc       |  57 +++++++++++++++
 .cursor/rules/search-and-ranking.mdc |  91 ++++++++++++++++++++++++
 .cursor/rules/tests-and-fixtures.mdc |  64 +++++++++++++++++
 .cursorignore                        |  42 +++++++++++
 AGENTS.md                            |  65 +++++++++++++++++
 9 files changed, 653 insertions(+)
 create mode 100644 .cursor/rules/agent-workflow.mdc
 create mode 100644 .cursor/rules/graph-and-ast.mdc
 create mode 100644 .cursor/rules/mcp-server.mdc
 create mode 100644 .cursor/rules/project-overview.mdc
 create mode 100644 .cursor/rules/python-style.mdc
 create mode 100644 .cursor/rules/search-and-ranking.mdc
 create mode 100644 .cursor/rules/tests-and-fixtures.mdc
 create mode 100644 .cursorignore
 create mode 100644 AGENTS.md

diff --git a/.cursor/rules/agent-workflow.mdc b/.cursor/rules/agent-workflow.mdc
new file mode 100644
index 0000000..f7f7504
--- /dev/null
+++ b/.cursor/rules/agent-workflow.mdc
@@ -0,0 +1,86 @@
+---
+description: How agents should investigate, plan, edit, validate, and ship changes in this repo.
+alwaysApply: true
+---
+
+# Agent workflow
+
+Cursor CLI agents operate this repo with no human in the loop most of
+the time. Follow this loop.
+
+## 1. Investigate before editing
+
+- Start from `README.md` and `CODEBASE_REQUIREMENTS.md` — they encode
+  the public contract and the brownfield/tuning surface.
+- For graph/AST work: read `propose/completed/CALL-GRAPH-PROPOSE.md`
+  and the matching tests under `tests/test_call_graph_*` /
+  `tests/test_ast_*`.
+- For ranking/search work: read the "Ranking behaviour" and
+  "Capabilities" sections of `README.md` and the symbol-bonus block
+  near the bottom of §5.
+- For brownfield/role overrides: `README.md` "Brownfield overrides"
+  and `CODEBASE_REQUIREMENTS.md` §A.2.1 + §B.
+
+When in doubt, run a structural search (`find_implementors`,
+`find_callers`, etc.) over **this** repo's source — it is itself a
+medium-sized Python project and the same investigation patterns apply.
+
+## 2. Plan with proposes (for non-trivial work)
+
+The repo has a strong "propose then implement" culture
+(`propose/`, `plans/`). For non-trivial features:
+
+1. Drop a short markdown propose under `propose/` describing scope,
+   schema impact, reindex requirement, and tests touched.
+2. Reference it from the PR description.
+3. Move it into `propose/completed/` (or `plans/completed/`) once
+   landed.
+
+Skip this for clearly-bounded fixes (one-file bugs, doc edits, test
+loosening). Use judgement.
+
+## 3. Make the change
+
+- Respect `breaking-changes.mdc`: no compatibility shims, no
+  deprecation cycles.
+- Keep schema changes paired with a "Re-index required" doc update
+  and (when semantics shift) an `ontology_version` bump.
+- If you add a new env var, document it in the README env table and
+  thread it through `mcp.json.example` if it affects deployment.
+- If you add a new role / capability, do it in `java_ontology.py`
+  *and* the inference tables in `ast_java.py` — don't sprinkle string
+  literals.
+
+## 4. Validate
+
+- `ruff check .` — fix or justify warnings.
+- `pytest tests -v` — must pass without `LANCEDB_MCP_RUN_HEAVY`.
+- For schema or ranking work, run `LANCEDB_MCP_RUN_HEAVY=1 pytest
+  tests -v` locally (slow; downloads models).
+- For graph work, eyeball `tests/test_kuzu_queries.py` and
+  `tests/test_ast_graph_build.py` — those exercise the BFS / closure
+  shapes the MCP tools depend on.
+
+## 5. Commit and PR
+
+- Commit messages: lowercase first word, usually past tense; match
+  existing style (`fixed call graph review D6`, `cursor config`,
+  `applied fixes for call graph layer`).
+- One logical change per commit when feasible.
+- Branch names: `cursor/<topic>` for cursor agent work, `plan/<name>`
+  for in-progress proposes (matching the existing `plan/tier1-completion`).
+- PR body should reference any propose under `propose/` it implements,
+  list user-visible behaviour changes, and call out reindex / env-var
+  requirements explicitly.
+
+## 6. Don't
+
+- Don't run `gh auth status` or otherwise inspect credentials. They
+  are pre-configured.
+- Don't widen the public surface "just in case" — every new tool,
+  env var, or schema column adds a re-index burden on users.
+- Don't cargo-cult Spring assumptions onto the indexer for one
+  fixture class. Use `.lancedb-mcp.yml` `role_overrides` or
+  `@CodebaseRole` (the brownfield path is there precisely so the core
+  inference tables stay small).
+- Don't push directly to `master`. Always open a PR.
diff --git a/.cursor/rules/graph-and-ast.mdc b/.cursor/rules/graph-and-ast.mdc
new file mode 100644
index 0000000..cf8f701
--- /dev/null
+++ b/.cursor/rules/graph-and-ast.mdc
@@ -0,0 +1,102 @@
+---
+description: Semantics of the Tree-sitter Java parser, role/capability inference, and the Kuzu AST graph (nodes, edges, call resolution). Load when touching ast_java.py, build_ast_graph.py, graph_enrich.py, kuzu_queries.py, java_ontology.py, or any *graph*/*call*/*ast* test.
+globs: ast_java.py,build_ast_graph.py,graph_enrich.py,kuzu_queries.py,java_ontology.py,tests/**/test_ast_*.py,tests/**/test_call_graph_*.py,tests/**/test_kuzu_*.py,tests/**/test_graph_enrich.py,tests/**/test_brownfield_overrides.py,tests/**/test_meta_chain_core.py
+alwaysApply: false
+---
+
+# AST + Kuzu graph rules
+
+## What the graph contains
+
+**Nodes:** `package`, `file`, `class`, `interface`, `enum`, `record`,
+`annotation`, `method`, `constructor`. Unresolved targets become
+**phantom** nodes (`resolved=false`, FQN guessed from imports / `java.lang`).
+
+**Edges:**
+- `EXTENDS`, `IMPLEMENTS`, `INJECTS` — type-level wiring (Phase 1).
+- `DECLARES` — type → its own method or constructor.
+- `CALLS` — method → method, with `confidence` + `strategy` tags
+  (Phase 3). JDK / Spring / Lombok callees are phantom symbols
+  (`resolved=false`); `find_callers` / `find_callees` default to
+  `exclude_external=true`.
+
+## Roles and capabilities (java_ontology + ast_java)
+
+- `VALID_ROLES` and `VALID_CAPABILITIES` are the source of truth in
+  `java_ontology.py`. Add a new role or capability there *and* in the
+  inference tables (`ROLE_ANNOTATIONS`, the `_*_TO_CAPABILITY` maps in
+  `ast_java.py`). Don't sprinkle string literals across modules.
+- Role assignment is **first-hit-wins** from type annotations.
+- Capabilities are **type-level**: per-method evidence is aggregated up.
+  This is intentional (see `plans/PLAN-CAPABILITIES-MODEL.md`). Don't
+  bolt on per-method capabilities without revisiting that plan.
+- Resolution order in code: built-in inference → config annotation maps
+  → meta-annotation walk → `@CodebaseRole` / `@CodebaseCapability` →
+  `role_overrides.fqn` (highest priority). Preserve this order.
+
+## Brownfield consistency (Kuzu vs Lance)
+
+- **Both** the Kuzu graph writer and Lance chunk enrichment must call
+  `graph_enrich.collect_annotation_meta_chain` — there is exactly **one**
+  such walk: sorted `iter_java_source_files`, the same
+  `COMMON_EXCLUDED_PATH_PATTERNS`, stderr on parse errors, first-seen
+  FQN wins on duplicate simple names. Don't add a second walk.
+- After changing any `@interface` used as a custom stereotype, a **full**
+  `refresh_code_index(confirm=true)` (or full cocoindex reprocess + Kuzu
+  rebuild) is required. The pipeline does not track that dependency.
+- `Symbol` rows on the graph: `role` / `capabilities` are populated for
+  **type** rows only. Method and constructor `Symbol` rows keep
+  `role=OTHER` and `capabilities=[]`. Don't change this without
+  updating `CODEBASE_REQUIREMENTS.md` §A.2.1.
+
+## Call graph (CALLS / DECLARES) semantics
+
+Tree-sitter call sites tracked: `method_invocation`,
+`object_creation_expression`, `method_reference`,
+`explicit_constructor_invocation`. Nested context: `lambda_expression`
+(calls inside a lambda body stay on the enclosing method's `CallSite`
+list with `in_lambda=true`).
+
+- **Anonymous classes** are modeled as synthetic nested types
+  (`<anon:startByte>`, FQN `Outer.<anon:byte>`) with normal `MethodDecl`
+  rows. `CALLS` from those members use the synthetic member as the
+  caller. Bare-call lookup falls through to the lexically enclosing type
+  (`build_ast_graph._lookup_method_candidates`) — this matches `javac`'s
+  access rules. Don't "simplify" by attributing anon calls to the outer
+  method.
+- **Lambdas** keep inner calls on the enclosing named method (no
+  synthetic callable). Don't change without an explicit propose.
+- **Receiver scope (locals):** ONE name→type map per method (fields →
+  inherited fields → parameters → locals). Locals shadow same-named
+  fields/params. Lexical block nesting is **not** modeled. Document any
+  change here in `CODEBASE_REQUIREMENTS.md` §A.2.
+- **`this.f1.f2.m()` / `super.f1.f2.m()`** field chains (no calls in the
+  receiver) resolve by walking declared fields to the final field type.
+  Tested by `call_graph_smoke` / `test_call_graph_smoke_roundtrip` (D6).
+- **Confidence + strategy** stay on every `CALLS` edge. Don't drop them
+  to "clean up" the schema — `find_callers` / `find_callees` accept
+  `min_confidence` and rely on `strategy`.
+- **`overload_ambiguous`** is emitted only when multiple callee
+  candidates remain after the name/arity walk. Single-candidate edges
+  keep their receiver-resolution strategy (e.g. `import_map`,
+  `this_super`).
+
+## File walker / excludes
+
+`java_index_v1_common.iter_java_source_files` filters strictly on
+`*.java` and applies `COMMON_EXCLUDED_PATH_PATTERNS` (test sources,
+`target/`, `build/`, `node_modules/`, `.idea/`, `.venv/`, etc.). Both
+the indexer and `build_ast_graph.py` use it — don't re-implement
+walking. If you need to exclude a new path, add it to that constant.
+
+## Project-root semantics
+
+`module` / `microservice` inference depends on the project root used at
+**index** time:
+
+- CocoIndex flow: `LANCEDB_MCP_PROJECT_ROOT` if set, else `Path(".").resolve()`.
+- `build_ast_graph.py`: `--source-root`, defaulting to cwd.
+- MCP runtime: `LANCEDB_MCP_PROJECT_ROOT`.
+
+Consistency across builds requires the same root. When changing root
+resolution, audit all three call sites.
diff --git a/.cursor/rules/mcp-server.mdc b/.cursor/rules/mcp-server.mdc
new file mode 100644
index 0000000..96f373a
--- /dev/null
+++ b/.cursor/rules/mcp-server.mdc
@@ -0,0 +1,75 @@
+---
+description: MCP server (server.py) tool contracts, defaults, and stdio invariants. Load when touching server.py or test_mcp_tools.py.
+globs: server.py,tests/**/test_mcp_tools.py
+alwaysApply: false
+---
+
+# MCP server rules
+
+`server.py` is a **stdio FastMCP server**. Treat its `_INSTRUCTIONS`
+string and `@mcp.tool` signatures as part of the public contract.
+
+## Stdio invariants
+
+- **Never** print to stdout — stdout is the JSON-RPC transport. All
+  diagnostics (debug, parse warnings, startup info) go to **stderr**.
+- Long-running work happens on the asyncio loop; keep tool handlers
+  responsive and avoid blocking the event loop with non-async IO when
+  possible.
+- All tools return Pydantic models defined in `server.py`; reuse them
+  rather than ad-hoc dicts.
+
+## Tool descriptions = prompts
+
+The `_INSTRUCTIONS` block and per-tool `description=` strings are read
+by LLM clients to decide which tool to call. They are not freeform
+docs. When you change a tool's behaviour:
+
+1. Update its description in `server.py`.
+2. Update the `_INSTRUCTIONS` block if the tool's role in the larger
+   workflow changes.
+3. Update the README "Tools exposed by the server" table.
+4. Mirror the change in `CODEBASE_REQUIREMENTS.md` only if the user
+   contract for tuning the codebase changes.
+
+## Defaults agents rely on
+
+Preserve unless the task explicitly says otherwise:
+
+- `codebase_search` — limit 5; `auto_hybrid` recommended when query has
+  identifiers; `graph_expand` + `expand_depth=1..3` for Kuzu fusion;
+  `context_neighbors=1..2` to attach neighbors. For behavioural queries
+  the suggested filter is `exclude_roles=['DTO','ENTITY','CONFIG','OTHER']`;
+  schema/domain queries use `role='DTO'` or `role='ENTITY'`.
+- `trace_flow` — defaults `follow_calls=true` (DECLARES+CALLS merged
+  with INJECTS/EXTENDS/IMPLEMENTS) and `exclude_external=true` on the
+  CALLS hop. Seeds are auto-filtered to entrypoint-like roles
+  (CONTROLLER / COMPONENT / SERVICE / FEIGN_CLIENT). `follow_calls=false`
+  flips to type-only wiring.
+- `find_callers` / `find_callees` — `exclude_external=true` by default.
+  `find_callers` filters caller (src) FQNs; `find_callees` filters
+  callee (dst) FQNs. Same `min_confidence` parameter on both.
+- `refresh_code_index` — gated by `LANCEDB_MCP_ALLOW_REFRESH=1`. Runs
+  `cocoindex update` first, then `build_ast_graph.py`. Subprocess `cwd`
+  is the bundle directory (so Python imports resolve), while
+  `LANCEDB_MCP_PROJECT_ROOT` is propagated so indexing targets the Java
+  tree, not the bundle.
+
+## Adding a new tool
+
+1. Define a Pydantic input/output model in `server.py`.
+2. `@mcp.tool` with a clear, action-oriented `description=` that says
+   *when* an LLM should call it (e.g. "Use for X when Y").
+3. Wire it into `_INSTRUCTIONS` if it's part of a common flow.
+4. Cover it in `tests/test_mcp_tools.py` — at minimum the validation
+   and "index missing" error paths. Real-graph cases use the
+   session-scoped `kuzu_graph` fixture from `conftest.py`.
+5. Update README's "Tools exposed by the server" table.
+
+## Backwards compatibility
+
+Per `breaking-changes.mdc`: there is **no** compatibility obligation.
+Prefer straightforward removals and schema/API updates to dual code
+paths. Update the README "Re-index required" callout when schema
+breaks, and bump `ontology_version` when graph/Lance enrichment
+semantics change.
diff --git a/.cursor/rules/project-overview.mdc b/.cursor/rules/project-overview.mdc
new file mode 100644
index 0000000..20e63c5
--- /dev/null
+++ b/.cursor/rules/project-overview.mdc
@@ -0,0 +1,71 @@
+---
+description: High-level map of this repo (LanceDB MCP bundle + Kuzu AST graph for Java RAG/GraphRAG). Always-on so agents start with the right mental model.
+alwaysApply: true
+---
+
+# Project overview
+
+This repo is a **self-contained stdio MCP server** that serves semantic +
+structural search over a Java codebase. It is a Python project (the indexer
+and server). It is **not** a Java project — the `tests/bank-chat-system/`
+tree is fixture data, not code to modify.
+
+## What it does
+
+1. **LanceDB vector index** — chunks of Java / SQL / YAML embedded with
+   `sentence-transformers` (default `all-MiniLM-L6-v2`). Built by the
+   CocoIndex flow `java_index_flow_lancedb.py`.
+2. **Kuzu AST graph (sidecar)** — Tree-sitter Java -> Kuzu graph with
+   nodes (`package`, `file`, `class`, `interface`, `enum`, `record`,
+   `annotation`, `method`, `constructor`) and edges (`EXTENDS`,
+   `IMPLEMENTS`, `INJECTS`, `DECLARES`, `CALLS`). Built by
+   `build_ast_graph.py`.
+3. **MCP server (`server.py`)** — exposes `codebase_search`, `trace_flow`,
+   `find_implementors` / `find_subclasses` / `find_injectors` /
+   `find_callers` / `find_callees`, `impact_analysis`, `list_by_role` /
+   `list_by_annotation` / `list_by_capability`, `graph_neighbors`,
+   `list_code_index_tables`, `graph_meta`, and (gated)
+   `refresh_code_index`.
+
+## File map (top of repo)
+
+| File | Role |
+|------|------|
+| `server.py` | MCP stdio server. Every `@mcp.tool` lives here. |
+| `search_lancedb.py` | Vector / hybrid / graph-expanded search; ranking. |
+| `build_ast_graph.py` | Tree-sitter -> Kuzu graph builder (full rebuild). |
+| `kuzu_queries.py` | Read-only Cypher helpers used by the server. |
+| `ast_java.py` | Tree-sitter Java parsing, role/capability inference. |
+| `graph_enrich.py` | `module` / `microservice` resolution, brownfield overrides, meta-annotation walk. |
+| `java_ontology.py` | Source of truth for `VALID_ROLES` / `VALID_CAPABILITIES`. |
+| `chunk_heuristics.py` | Query-time chunk hints (no AST / no re-index). |
+| `index_common.py` | Embedding config (no CocoIndex dep). |
+| `java_index_flow_lancedb.py` | CocoIndex flow (only used by `refresh_code_index`). |
+| `java_index_v1_common.py` | Shared file walker / exclude patterns. |
+| `mcp.json.example` | Template for `.mcp.json`. |
+
+## Key environment variables
+
+- `LANCEDB_URI` — absolute path to `lancedb_data/` (required at runtime).
+- `LANCEDB_MCP_PROJECT_ROOT` — Java project root for indexing & metadata.
+- `KUZU_DB_PATH` — Kuzu DB path (defaults to `${LANCEDB_URI}/code_graph.kuzu`).
+- `LANCEDB_MCP_GRAPH_ENABLED` — `1`/`0`; auto-on when the Kuzu DB exists.
+- `LANCEDB_MCP_ALLOW_REFRESH` — `1` to enable the heavy `refresh_code_index` tool.
+- `LANCEDB_MCP_MICROSERVICE_ROOTS` — override microservice inference.
+- `SBERT_MODEL` / `SBERT_DEVICE` — must match indexer.
+
+## Two location concepts
+
+- `module` — innermost build-marker (`pom.xml`, `build.gradle(.kts)`, `build.sbt`) ancestor's directory name.
+- `microservice` — outermost build-marker ancestor under `LANCEDB_MCP_PROJECT_ROOT`. Single-module projects collapse `module == microservice`.
+
+Resolution order for `microservice`: explicit override (env or
+`.lancedb-mcp.yml` `microservice_roots:`) → outermost build marker →
+first path segment under `project_root` → empty string.
+
+## Read these before non-trivial edits
+
+- `README.md` — full feature surface and behaviour.
+- `CODEBASE_REQUIREMENTS.md` — exact assumptions about Java repos and a
+  per-file map of what to edit if your tree doesn't match.
+- `tests/README.md` — testing philosophy (see `tests-and-fixtures.mdc`).
diff --git a/.cursor/rules/python-style.mdc b/.cursor/rules/python-style.mdc
new file mode 100644
index 0000000..17fec53
--- /dev/null
+++ b/.cursor/rules/python-style.mdc
@@ -0,0 +1,57 @@
+---
+description: Python coding style, typing, and tooling conventions used in this bundle.
+globs: **/*.py
+alwaysApply: false
+---
+
+# Python style and tooling
+
+Target Python **3.11+**. The codebase uses `from __future__ import annotations`
+in every module and modern PEP 604 union syntax (`str | None`).
+
+## Conventions
+
+- **Imports:** `from __future__ import annotations` at the very top, then
+  stdlib, then third-party, then local. One module per `import` line.
+- **Typing:** Annotate all public functions and dataclasses. Use built-in
+  generics (`list[str]`, `dict[str, int]`) and PEP 604 unions
+  (`int | None`), not `typing.List` / `typing.Optional`.
+- **Dataclasses:** Prefer `@dataclass` (and `frozen=True` for value
+  objects) for record types. Use `pydantic.BaseModel` only at the MCP
+  tool boundary in `server.py`.
+- **Module docstrings:** One short sentence at the top describing intent.
+  Match the existing tone in `chunk_heuristics.py` / `index_common.py`.
+- **Public API surface:** Modules that expose a small, intentional API
+  (e.g. `java_ontology.py`) declare `__all__`. Add to it when you add a
+  new export.
+- **Private helpers:** Prefix with `_` and keep them module-local. Don't
+  promote a helper to public API just to ease testing — extend the public
+  function or add a fixture.
+- **Side effects at import time:** None. Reading env vars is fine
+  (`index_common.py` does this); doing IO or loading models at import is
+  not.
+- **Logging:** Diagnostics for the indexer and graph builder go to
+  **stderr** (see `graph_enrich` parse-error warnings,
+  `LANCEDB_MCP_DEBUG_CONTEXT=1` debug output). Never `print` to stdout
+  from anything reachable by the MCP server — stdout is the MCP transport.
+
+## Tooling
+
+- `ruff` is pinned in `requirements.txt`. Run `ruff check .` before
+  pushing meaningful changes; fix or justify warnings.
+- Tests run with `pytest` under `asyncio_mode = auto` (see `pytest.ini`
+  at repo root and under `tests/`).
+- No new top-level dependencies without updating `requirements.txt` and
+  thinking about wheel availability on Linux **and** macOS (Intel + ARM)
+  — this bundle has been tuned to work on Intel Mac (`8963a72`).
+
+## What to avoid
+
+- Don't add `print()` debugging in committed code. Use stderr or, for the
+  context-expansion path, the `LANCEDB_MCP_DEBUG_CONTEXT` flag.
+- Don't introduce a hard dependency on `cocoindex` outside
+  `java_index_flow_lancedb.py` / `refresh_code_index`. The whole point
+  of the bundle is that search and MCP run without CocoIndex installed.
+- Don't reach into private helpers of another module. If you need
+  something, lift it into `index_common.py` or `graph_enrich.py` (the
+  designated cross-cutting modules).
diff --git a/.cursor/rules/search-and-ranking.mdc b/.cursor/rules/search-and-ranking.mdc
new file mode 100644
index 0000000..0924a13
--- /dev/null
+++ b/.cursor/rules/search-and-ranking.mdc
@@ -0,0 +1,91 @@
+---
+description: LanceDB chunk schema, ranking weights, score components, and graph-expansion fusion. Load when touching search_lancedb.py, java_index_flow_lancedb.py, or any *search*/*lancedb* test.
+globs: search_lancedb.py,java_index_flow_lancedb.py,java_index_v1_common.py,chunk_heuristics.py,index_common.py,tests/**/test_search_lancedb*.py,tests/**/test_lancedb_*.py
+alwaysApply: false
+---
+
+# Search and ranking rules
+
+## Schema invariants
+
+`JavaLanceChunk` carries enrichment columns:
+`package`, `module`, `microservice`, `primary_type_fqn`,
+`primary_type_kind`, `role`, `capabilities`, `annotations_on_type`,
+`symbols`, `ontology_version`.
+
+- `annotations_on_type` and `symbols` are **native PyArrow
+  `list<string>`**, not JSON-encoded strings. Older indexes had
+  char-array output; the server defensively JSON-decodes string-form
+  list columns so old indexes don't explode, but `array_contains`
+  filters won't work until re-indexed.
+- Any schema change requires a **full reindex** via
+  `refresh_code_index` or `cocoindex update --full-reprocess -f`.
+  Update `README.md` §5 (the "Re-index required" note) and bump
+  `ontology_version` if semantics change.
+
+## Ranking (Java hits)
+
+Java hits get a role-weighted re-rank after vector / hybrid scoring:
+
+| Role | Weight |
+|------|--------|
+| `CONTROLLER` | +0.10 |
+| `SERVICE` | +0.08 |
+| `FEIGN_CLIENT` | +0.06 |
+| `COMPONENT` | +0.03 |
+| `REPOSITORY` | +0.02 |
+| `MAPPER` / `OTHER` | 0 |
+| `ENTITY` | -0.06 |
+| `CONFIG` | -0.10 |
+
+Plus a **symbol-match bonus**:
+
+1. Method/field overlap: `+0.03` per overlapping declared symbol, capped at `+0.06`.
+2. Action-verb bump: `+0.02` flat when a declared method starts with
+   `process`, `handle`, `on`, `pick`, `select`, `assign`, `notify`,
+   `dispatch`, `publish`, `consume`, `route`, `trigger`, `enqueue`,
+   `distribute`, ...
+3. Type-name overlap: `+0.05` per token shared between
+   `simple_name(primary_type_fqn)` and the query, capped at `+0.10`.
+
+**Both role weights and the symbol bonus are skipped when the caller
+locks `role=`.** Preserve this. The per-row breakdown lives in
+`score_components` (`distance`, `hybrid_rrf`, `role_weight`,
+`symbol_bonus`, `import_penalty`) and feeds the compact `why` string.
+When tweaking weights, update both the table in `README.md` and the
+ranking section in `CODEBASE_REQUIREMENTS.md`.
+
+## Filters and modes
+
+- `role`, `module`, `microservice`, `package_prefix`, `capability` are
+  AND-combined.
+- `auto_hybrid=true` mixes vector + FTS via RRF. Recommended when a
+  query contains identifiers / CamelCase / snake_case tokens.
+- `graph_expand=true` + `expand_depth=1..3` fuses Kuzu BFS results into
+  vector hits via RRF. The Kuzu DB must exist.
+- `context_neighbors=1..2` attaches adjacent chunks as `context_before`
+  / `context_after`. Empty context with neighbors set is a known
+  failure path: set `LANCEDB_MCP_DEBUG_CONTEXT=1` to log why expansion
+  bailed (missing schema columns, empty bucket, chunk not found, scan
+  error). Common cause: stale server after reindex, or legacy index
+  without `range_start` / `range_end`.
+
+## Capabilities axis
+
+`capabilities` is a multi-tag `list<string>` per type — types can
+carry zero or many. They **augment**, never replace, the single
+`role`. Triggers (must stay in sync with `_*_TO_CAPABILITY` maps):
+
+- `MESSAGE_LISTENER` — `@KafkaListener`, `@RabbitListener`, `@JmsListener`, `@SqsListener`, `@EventListener`, `@StreamListener`.
+- `MESSAGE_PRODUCER` — type injects `KafkaTemplate`, `RabbitTemplate`, `JmsTemplate`, `StreamBridge`, or `ApplicationEventPublisher`.
+- `SCHEDULED_TASK` — `@Scheduled` on any method, or class implements `org.quartz.Job`.
+- `EXCEPTION_HANDLER` — `@ControllerAdvice`, `@RestControllerAdvice`, or any method with `@ExceptionHandler`.
+
+## What to avoid
+
+- Don't introduce ranking signals that depend on a specific package
+  prefix or class name from the test fixture (see `tests-and-fixtures.mdc`).
+- Don't change how `score_components` is shaped without updating the
+  `why`-string formatter and tests in `test_search_lancedb*.py`.
+- Don't silently change the embedding model: `SBERT_MODEL` must match the
+  one used at index time, and `index_common.py` is the single source.
diff --git a/.cursor/rules/tests-and-fixtures.mdc b/.cursor/rules/tests-and-fixtures.mdc
new file mode 100644
index 0000000..fcfdd12
--- /dev/null
+++ b/.cursor/rules/tests-and-fixtures.mdc
@@ -0,0 +1,64 @@
+---
+description: Testing philosophy. Read before adding/changing tests or touching the bank-chat-system fixture. Loose invariants beat exact counts.
+globs: tests/**/*.py,tests/**/*.java
+alwaysApply: false
+---
+
+# Testing rules
+
+The full philosophy is in `tests/README.md` — follow it strictly. Summary
+for agents below.
+
+## DO NOT OVERFIT THE MCP TO THE FIXTURE
+
+`tests/bank-chat-system/` is a deterministic Java corpus used for
+assertions. It is **not** a model of real production codebases. Real
+repos look different in dozens of ways.
+
+1. **Assert on invariants, not exact counts.** Prefer `>= 1`, `> 0`,
+   `key in result`, `len(...) >= N`, or structural shape over `== 11`.
+   Exact counts are fine only when proving both sides of a known
+   relationship in the fixture (e.g. that `EventProcessor` has the
+   implementations the fixture defines).
+2. **Never special-case the fixture in production code.** Don't add a
+   role / heuristic / regex that only fires for `com.bank.chat...` or
+   `ChatManagementService`. If a test needs it, the test is wrong.
+3. **Test the contract without LanceDB.** Validation and error paths
+   (e.g. "index missing" responses) should run without a real index.
+   Heavy integration goes in `test_lancedb_e2e.py` and is gated behind
+   `LANCEDB_MCP_RUN_HEAVY=1`.
+4. **When a test fails after a refactor, re-read the assertion first.**
+   Most assertions are intentionally loose. Tightening one to chase a
+   number is almost always wrong.
+
+## Layout
+
+- `tests/conftest.py` builds the session-scoped Kuzu graph from
+  `tests/bank-chat-system/` exactly once into a `tmp_path_factory` dir,
+  then sets `KUZU_DB_PATH` (and a fake `LANCEDB_URI`) for the suite.
+- `tests/fixtures/call_graph_smoke/` is a mini Maven tree for
+  scope / overload / wildcard / method-ref graph checks. Don't edit it
+  to make a test pass — it is calibrated against the resolver.
+- Coverage matrix for the call-graph propose lives in
+  `propose/completed/CALL-GRAPH-PROPOSE.md` §7.1; tests are spread
+  across `test_ast_java_calls.py`, `test_call_graph_smoke_roundtrip.py`,
+  `test_call_graph_receiver_resolution.py`, `test_ast_graph_build.py`,
+  `test_kuzu_queries.py`, and the MCP smoke tests.
+
+## Heavy / e2e tests
+
+`test_lancedb_e2e.py` runs `cocoindex` and builds a real LanceDB index. Skipped
+unless `LANCEDB_MCP_RUN_HEAVY=1`. Don't unconditionally enable it; it
+downloads the embedding model on first run and indexes the corpus from
+scratch.
+
+## Adding new tests
+
+- Use `pytest` + `asyncio_mode = auto` (already set globally).
+- Reuse the `kuzu_graph` session fixture instead of building your own.
+- For new behaviour, add fixture Java under `tests/bank-chat-system/`
+  (or extend `call_graph_smoke/`) **only** if the asserted behaviour is
+  general — never to encode a one-off heuristic.
+- New MCP tools must be exercised at least once in `test_mcp_tools.py`,
+  either with the real Kuzu graph fixture or via the error path when
+  LanceDB is unavailable.
diff --git a/.cursorignore b/.cursorignore
new file mode 100644
index 0000000..c03ab36
--- /dev/null
+++ b/.cursorignore
@@ -0,0 +1,42 @@
+# Local runtime / index artifacts (huge, binary, regenerated)
+lancedb_data/
+cocoindex_java_lance.db/
+*.kuzu
+*.kuzu.wal
+
+# Python caches
+__pycache__/
+*.py[cod]
+*$py.class
+.pytest_cache/
+.mypy_cache/
+.ruff_cache/
+.coverage
+.coverage.*
+htmlcov/
+
+# Virtual envs
+.venv/
+venv/
+env/
+ENV/
+
+# Build artifacts
+build/
+dist/
+*.egg-info/
+.eggs/
+
+# IDE / OS
+.idea/
+.vscode/
+.DS_Store
+
+# Local env
+.env
+.env.*
+tmp/
+
+# Test fixture compiled output (the .java sources stay searchable)
+tests/**/target/
+tests/**/build/
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..55a2de4
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,65 @@
+# AGENTS.md
+
+Entry point for Cursor CLI agents (and other agentic tools) operating
+this repo. Detailed guidance lives in `.cursor/rules/*.mdc` — those
+files are auto-loaded by Cursor based on globs and `alwaysApply`. This
+file is a flat summary for tools that don't read `.cursor/rules/`.
+
+## What this repo is
+
+Self-contained **stdio MCP server** for semantic + structural search
+over a Java codebase:
+
+- **LanceDB** vector index (Java / SQL / YAML chunks, `sentence-transformers`).
+- **Kuzu** AST graph (Tree-sitter Java, deterministic) with
+  `EXTENDS`, `IMPLEMENTS`, `INJECTS`, `DECLARES`, `CALLS` edges.
+- **MCP tools** in `server.py`: `codebase_search`, `trace_flow`,
+  `find_callers` / `find_callees` / `find_implementors` / `find_subclasses`
+  / `find_injectors`, `impact_analysis`, `list_by_role` /
+  `list_by_annotation` / `list_by_capability`, `graph_neighbors`,
+  `list_code_index_tables`, `graph_meta`, gated `refresh_code_index`.
+
+## Hard rules (read first)
+
+1. **No backward-compatibility obligation.** Prefer removals and
+   schema updates over shims (`.cursor/rules/breaking-changes.mdc`).
+2. **No overfitting to the test fixture.** `tests/bank-chat-system/`
+   is a deterministic corpus, not a model of production. Assert on
+   invariants, not exact counts (`.cursor/rules/tests-and-fixtures.mdc`).
+3. **MCP server is stdio.** `print()` to stdout breaks the transport.
+   All diagnostics go to stderr.
+4. **One source of truth for roles and capabilities:** `java_ontology.py`
+   + the inference tables in `ast_java.py`. No string literals
+   sprinkled elsewhere.
+5. **Schema changes require a full reindex.** Update the README
+   "Re-index required" block and bump `ontology_version` when
+   enrichment semantics change.
+
+## Investigation order
+
+1. `README.md` — feature surface and behaviour.
+2. `CODEBASE_REQUIREMENTS.md` — Java-repo assumptions and per-file
+   tuning map.
+3. `propose/` and `plans/` — designed-but-deferred work and the
+   "propose-then-implement" culture.
+4. `tests/README.md` — testing philosophy.
+5. The relevant `.cursor/rules/*.mdc` for the file you're editing.
+
+## Workflow
+
+- Branch from `master`. Branch names: `cursor/<topic>` (CLI work),
+  `plan/<name>` (in-progress propose).
+- Commit messages: lowercase first word, usually past tense
+  (e.g. `fixed call graph review D6`).
+- Always open a PR; never push to `master`.
+- Run `ruff check .` and `pytest tests -v` before pushing.
+- For non-trivial features, drop a short propose under `propose/` and
+  reference it in the PR.
+
+## Environment for running the server
+
+`LANCEDB_URI` (required), `LANCEDB_MCP_PROJECT_ROOT`, `KUZU_DB_PATH`
+(defaults to `${LANCEDB_URI}/code_graph.kuzu`),
+`LANCEDB_MCP_GRAPH_ENABLED`, `LANCEDB_MCP_ALLOW_REFRESH`,
+`LANCEDB_MCP_MICROSERVICE_ROOTS`, `SBERT_MODEL`, `SBERT_DEVICE`. See
+`README.md` §2 and `mcp.json.example`.

From d66f06449b0bd2b8a35bd2fde60f0687d6f7f753 Mon Sep 17 00:00:00 2001
From: HumanBean17 <doudmitry@gmail.com>
Date: Mon, 4 May 2026 18:02:15 +0000
Subject: [PATCH 2/2] trimmed cursor rules to project-specific guidance only

---
 .cursor/rules/agent-workflow.mdc     |  99 ++++++++++++--------------
 .cursor/rules/graph-and-ast.mdc      | 102 ---------------------------
 .cursor/rules/mcp-server.mdc         |  75 --------------------
 .cursor/rules/project-overview.mdc   |  76 ++++++++------------
 .cursor/rules/python-style.mdc       |  57 ---------------
 .cursor/rules/search-and-ranking.mdc |  91 ------------------------
 .cursor/rules/tests-and-fixtures.mdc |  64 -----------------
 AGENTS.md                            |  90 +++++++++--------------
 8 files changed, 112 insertions(+), 542 deletions(-)
 delete mode 100644 .cursor/rules/graph-and-ast.mdc
 delete mode 100644 .cursor/rules/mcp-server.mdc
 delete mode 100644 .cursor/rules/python-style.mdc
 delete mode 100644 .cursor/rules/search-and-ranking.mdc
 delete mode 100644 .cursor/rules/tests-and-fixtures.mdc

diff --git a/.cursor/rules/agent-workflow.mdc b/.cursor/rules/agent-workflow.mdc
index f7f7504..732bf00 100644
--- a/.cursor/rules/agent-workflow.mdc
+++ b/.cursor/rules/agent-workflow.mdc
@@ -1,31 +1,21 @@
 ---
-description: How agents should investigate, plan, edit, validate, and ship changes in this repo.
+description: How agents should investigate, edit, validate, and ship in this repo.
 alwaysApply: true
 ---
 
 # Agent workflow
 
-Cursor CLI agents operate this repo with no human in the loop most of
-the time. Follow this loop.
+## Investigate before editing
 
-## 1. Investigate before editing
+For any non-trivial change, read the relevant doc first instead of
+inferring from code:
 
-- Start from `README.md` and `CODEBASE_REQUIREMENTS.md` — they encode
-  the public contract and the brownfield/tuning surface.
-- For graph/AST work: read `propose/completed/CALL-GRAPH-PROPOSE.md`
-  and the matching tests under `tests/test_call_graph_*` /
-  `tests/test_ast_*`.
-- For ranking/search work: read the "Ranking behaviour" and
-  "Capabilities" sections of `README.md` and the symbol-bonus block
-  near the bottom of §5.
-- For brownfield/role overrides: `README.md` "Brownfield overrides"
-  and `CODEBASE_REQUIREMENTS.md` §A.2.1 + §B.
+- Behaviour / public surface → `README.md`.
+- Brownfield assumptions, role/capability tuning → `CODEBASE_REQUIREMENTS.md`.
+- Why the current design exists → `propose/completed/` and `plans/completed/`.
+- Testing philosophy → `tests/README.md`.
 
-When in doubt, run a structural search (`find_implementors`,
-`find_callers`, etc.) over **this** repo's source — it is itself a
-medium-sized Python project and the same investigation patterns apply.
-
-## 2. Plan with proposes (for non-trivial work)
+## Propose-then-implement culture
 
 The repo has a strong "propose then implement" culture
 (`propose/`, `plans/`). For non-trivial features:
@@ -39,48 +29,53 @@ The repo has a strong "propose then implement" culture
 Skip this for clearly-bounded fixes (one-file bugs, doc edits, test
 loosening). Use judgement.
 
-## 3. Make the change
-
-- Respect `breaking-changes.mdc`: no compatibility shims, no
-  deprecation cycles.
-- Keep schema changes paired with a "Re-index required" doc update
-  and (when semantics shift) an `ontology_version` bump.
-- If you add a new env var, document it in the README env table and
-  thread it through `mcp.json.example` if it affects deployment.
-- If you add a new role / capability, do it in `java_ontology.py`
-  *and* the inference tables in `ast_java.py` — don't sprinkle string
-  literals.
-
-## 4. Validate
+## Editing rules
+
+- Respect `.cursor/rules/breaking-changes.mdc`: no compatibility
+  shims, no deprecation cycles.
+- The single source of truth for roles and capabilities is
+  `java_ontology.py`. Don't sprinkle role / capability string
+  literals across other modules.
+- Schema changes that affect the Lance index or Kuzu graph need a
+  matching update to the README "Re-index required" callout. Bump
+  `ontology_version` when enrichment semantics change.
+- `server.py` is a stdio MCP server: anything reachable from a tool
+  handler must not write to **stdout** (that's the JSON-RPC
+  transport). Diagnostics go to stderr.
+- Tool `description=` strings and `_INSTRUCTIONS` in `server.py` are
+  read by LLM clients to choose tools — treat them as part of the
+  contract, not freeform docs.
+
+## Validate
 
 - `ruff check .` — fix or justify warnings.
 - `pytest tests -v` — must pass without `LANCEDB_MCP_RUN_HEAVY`.
-- For schema or ranking work, run `LANCEDB_MCP_RUN_HEAVY=1 pytest
-  tests -v` locally (slow; downloads models).
-- For graph work, eyeball `tests/test_kuzu_queries.py` and
-  `tests/test_ast_graph_build.py` — those exercise the BFS / closure
-  shapes the MCP tools depend on.
+- For schema or ranking work, also run with
+  `LANCEDB_MCP_RUN_HEAVY=1` locally (slow; downloads models).
 
-## 5. Commit and PR
+## Commit and PR
 
-- Commit messages: present tense, imperative, lowercase first word —
-  match existing style (`fixed call graph review D6`, `cursor config`,
+- Commit messages: lowercase first word, past-tense summary,
+  matching existing style (e.g. `fixed call graph review D6`,
   `applied fixes for call graph layer`).
 - One logical change per commit when feasible.
-- Branch names: `cursor/<topic>` for cursor agent work, `plan/<name>`
-  for in-progress proposes (matching the existing `plan/tier1-completion`).
-- PR body should reference any propose under `propose/` it implements,
-  list user-visible behaviour changes, and call out reindex / env-var
+- Branch names: `cursor/<topic>` for cursor-agent work, `plan/<name>`
+  for in-progress proposes (matching existing `plan/tier1-completion`).
+- PR body should reference any propose it implements, list
+  user-visible behaviour changes, and call out reindex / env-var
   requirements explicitly.
+- Never push directly to `master`.
 
-## 6. Don't
+## Don't
 
-- Don't run `gh auth status` or otherwise inspect credentials. They
-  are pre-configured.
+- Don't run `gh auth status` or otherwise inspect credentials.
 - Don't widen the public surface "just in case" — every new tool,
   env var, or schema column adds a re-index burden on users.
-- Don't cargo-cult Spring assumptions onto the indexer for one
-  fixture class. Use `.lancedb-mcp.yml` `role_overrides` or
-  `@CodebaseRole` (the brownfield path is there precisely so the core
-  inference tables stay small).
-- Don't push directly to `master`. Always open a PR.
+- Don't special-case the `tests/bank-chat-system/` fixture in
+  production code. If a test needs it, the test is wrong (see
+  `tests/README.md`).
+- Don't tighten loose test assertions (`>= 1`, `len(...) >= N`,
+  `key in result`) into exact counts to chase a number — they are
+  intentionally loose.
+- Don't add a hard dependency on `cocoindex` outside
+  `java_index_flow_lancedb.py` / the `refresh_code_index` tool.
diff --git a/.cursor/rules/graph-and-ast.mdc b/.cursor/rules/graph-and-ast.mdc
deleted file mode 100644
index cf8f701..0000000
--- a/.cursor/rules/graph-and-ast.mdc
+++ /dev/null
@@ -1,102 +0,0 @@
----
-description: Semantics of the Tree-sitter Java parser, role/capability inference, and the Kuzu AST graph (nodes, edges, call resolution). Load when touching ast_java.py, build_ast_graph.py, graph_enrich.py, kuzu_queries.py, java_ontology.py, or any *graph*/*call*/*ast* test.
-globs: ast_java.py,build_ast_graph.py,graph_enrich.py,kuzu_queries.py,java_ontology.py,tests/**/test_ast_*.py,tests/**/test_call_graph_*.py,tests/**/test_kuzu_*.py,tests/**/test_graph_enrich.py,tests/**/test_brownfield_overrides.py,tests/**/test_meta_chain_core.py
-alwaysApply: false
----
-
-# AST + Kuzu graph rules
-
-## What the graph contains
-
-**Nodes:** `package`, `file`, `class`, `interface`, `enum`, `record`,
-`annotation`, `method`, `constructor`. Unresolved targets become
-**phantom** nodes (`resolved=false`, FQN guessed from imports / `java.lang`).
-
-**Edges:**
-- `EXTENDS`, `IMPLEMENTS`, `INJECTS` — type-level wiring (Phase 1).
-- `DECLARES` — type → its own method or constructor.
-- `CALLS` — method → method, with `confidence` + `strategy` tags
-  (Phase 3). JDK / Spring / Lombok callees are phantom symbols
-  (`resolved=false`); `find_callers` / `find_callees` default to
-  `exclude_external=true`.
-
-## Roles and capabilities (java_ontology + ast_java)
-
-- `VALID_ROLES` and `VALID_CAPABILITIES` are the source of truth in
-  `java_ontology.py`. Add a new role or capability there *and* in the
-  inference tables (`ROLE_ANNOTATIONS`, the `_*_TO_CAPABILITY` maps in
-  `ast_java.py`). Don't sprinkle string literals across modules.
-- Role assignment is **first-hit-wins** from type annotations.
-- Capabilities are **type-level**: per-method evidence is aggregated up.
-  This is intentional (see `plans/PLAN-CAPABILITIES-MODEL.md`). Don't
-  bolt on per-method capabilities without revisiting that plan.
-- Resolution order in code: built-in inference → config annotation maps
-  → meta-annotation walk → `@CodebaseRole` / `@CodebaseCapability` →
-  `role_overrides.fqn` (highest priority). Preserve this order.
-
-## Brownfield consistency (Kuzu vs Lance)
-
-- **Both** the Kuzu graph writer and Lance chunk enrichment must call
-  `graph_enrich.collect_annotation_meta_chain` — there is exactly **one**
-  such walk: sorted `iter_java_source_files`, the same
-  `COMMON_EXCLUDED_PATH_PATTERNS`, stderr on parse errors, first-seen
-  FQN wins on duplicate simple names. Don't add a second walk.
-- After changing any `@interface` used as a custom stereotype, a **full**
-  `refresh_code_index(confirm=true)` (or full cocoindex reprocess + Kuzu
-  rebuild) is required. The pipeline does not track that dependency.
-- `Symbol` rows on the graph: `role` / `capabilities` are populated for
-  **type** rows only. Method and constructor `Symbol` rows keep
-  `role=OTHER` and `capabilities=[]`. Don't change this without
-  updating `CODEBASE_REQUIREMENTS.md` §A.2.1.
-
-## Call graph (CALLS / DECLARES) semantics
-
-Tree-sitter call sites tracked: `method_invocation`,
-`object_creation_expression`, `method_reference`,
-`explicit_constructor_invocation`. Nested context: `lambda_expression`
-(calls inside a lambda body stay on the enclosing method's `CallSite`
-list with `in_lambda=true`).
-
-- **Anonymous classes** are modeled as synthetic nested types
-  (`<anon:startByte>`, FQN `Outer.<anon:byte>`) with normal `MethodDecl`
-  rows. `CALLS` from those members use the synthetic member as the
-  caller. Bare-call lookup falls through to the lexically enclosing type
-  (`build_ast_graph._lookup_method_candidates`) — this matches `javac`'s
-  access rules. Don't "simplify" by attributing anon calls to the outer
-  method.
-- **Lambdas** keep inner calls on the enclosing named method (no
-  synthetic callable). Don't change without an explicit propose.
-- **Receiver scope (locals):** ONE name→type map per method (fields →
-  inherited fields → parameters → locals). Locals shadow same-named
-  fields/params. Lexical block nesting is **not** modeled. Document any
-  change here in `CODEBASE_REQUIREMENTS.md` §A.2.
-- **`this.f1.f2.m()` / `super.f1.f2.m()`** field chains (no calls in the
-  receiver) resolve by walking declared fields to the final field type.
-  Tested by `call_graph_smoke` / `test_call_graph_smoke_roundtrip` (D6).
-- **Confidence + strategy** stay on every `CALLS` edge. Don't drop them
-  to "clean up" the schema — `find_callers` / `find_callees` accept
-  `min_confidence` and rely on `strategy`.
-- **`overload_ambiguous`** is emitted only when multiple callee
-  candidates remain after the name/arity walk. Single-candidate edges
-  keep their receiver-resolution strategy (e.g. `import_map`,
-  `this_super`).
-
-## File walker / excludes
-
-`java_index_v1_common.iter_java_source_files` filters strictly on
-`*.java` and applies `COMMON_EXCLUDED_PATH_PATTERNS` (test sources,
-`target/`, `build/`, `node_modules/`, `.idea/`, `.venv/`, etc.). Both
-the indexer and `build_ast_graph.py` use it — don't re-implement
-walking. If you need to exclude a new path, add it to that constant.
-
-## Project-root semantics
-
-`module` / `microservice` inference depends on the project root used at
-**index** time:
-
-- CocoIndex flow: `LANCEDB_MCP_PROJECT_ROOT` if set, else `Path(".").resolve()`.
-- `build_ast_graph.py`: `--source-root`, defaulting to cwd.
-- MCP runtime: `LANCEDB_MCP_PROJECT_ROOT`.
-
-Consistency across builds requires the same root. When changing root
-resolution, audit all three call sites.
diff --git a/.cursor/rules/mcp-server.mdc b/.cursor/rules/mcp-server.mdc
deleted file mode 100644
index 96f373a..0000000
--- a/.cursor/rules/mcp-server.mdc
+++ /dev/null
@@ -1,75 +0,0 @@
----
-description: MCP server (server.py) tool contracts, defaults, and stdio invariants. Load when touching server.py or test_mcp_tools.py.
-globs: server.py,tests/**/test_mcp_tools.py
-alwaysApply: false
----
-
-# MCP server rules
-
-`server.py` is a **stdio FastMCP server**. Treat its `_INSTRUCTIONS`
-string and `@mcp.tool` signatures as part of the public contract.
-
-## Stdio invariants
-
-- **Never** print to stdout — stdout is the JSON-RPC transport. All
-  diagnostics (debug, parse warnings, startup info) go to **stderr**.
-- Long-running work happens on the asyncio loop; keep tool handlers
-  responsive and avoid blocking the event loop with non-async IO when
-  possible.
-- All tools return Pydantic models defined in `server.py`; reuse them
-  rather than ad-hoc dicts.
-
-## Tool descriptions = prompts
-
-The `_INSTRUCTIONS` block and per-tool `description=` strings are read
-by LLM clients to decide which tool to call. They are not freeform
-docs. When you change a tool's behaviour:
-
-1. Update its description in `server.py`.
-2. Update the `_INSTRUCTIONS` block if the tool's role in the larger
-   workflow changes.
-3. Update the README "Tools exposed by the server" table.
-4. Mirror the change in `CODEBASE_REQUIREMENTS.md` only if the user
-   contract for tuning the codebase changes.
-
-## Defaults agents rely on
-
-Preserve unless the task explicitly says otherwise:
-
-- `codebase_search` — limit 5; `auto_hybrid` recommended when query has
-  identifiers; `graph_expand` + `expand_depth=1..3` for Kuzu fusion;
-  `context_neighbors=1..2` to attach neighbors. For behavioural queries
-  the suggested filter is `exclude_roles=['DTO','ENTITY','CONFIG','OTHER']`;
-  schema/domain queries use `role='DTO'` or `role='ENTITY'`.
-- `trace_flow` — defaults `follow_calls=true` (DECLARES+CALLS merged
-  with INJECTS/EXTENDS/IMPLEMENTS) and `exclude_external=true` on the
-  CALLS hop. Seeds are auto-filtered to entrypoint-like roles
-  (CONTROLLER / COMPONENT / SERVICE / FEIGN_CLIENT). `follow_calls=false`
-  flips to type-only wiring.
-- `find_callers` / `find_callees` — `exclude_external=true` by default.
-  `find_callers` filters caller (src) FQNs; `find_callees` filters
-  callee (dst) FQNs. Same `min_confidence` parameter on both.
-- `refresh_code_index` — gated by `LANCEDB_MCP_ALLOW_REFRESH=1`. Runs
-  `cocoindex update` first, then `build_ast_graph.py`. Subprocess `cwd`
-  is the bundle directory (so Python imports resolve), while
-  `LANCEDB_MCP_PROJECT_ROOT` is propagated so indexing targets the Java
-  tree, not the bundle.
-
-## Adding a new tool
-
-1. Define a Pydantic input/output model in `server.py`.
-2. `@mcp.tool` with a clear, action-oriented `description=` that says
-   *when* an LLM should call it (e.g. "Use for X when Y").
-3. Wire it into `_INSTRUCTIONS` if it's part of a common flow.
-4. Cover it in `tests/test_mcp_tools.py` — at minimum the validation
-   and "index missing" error paths. Real-graph cases use the
-   session-scoped `kuzu_graph` fixture from `conftest.py`.
-5. Update README's "Tools exposed by the server" table.
-
-## Backwards compatibility
-
-Per `breaking-changes.mdc`: there is **no** compatibility obligation.
-Prefer straightforward removals and schema/API updates to dual code
-paths. Update the README "Re-index required" callout when schema
-breaks, and bump `ontology_version` when graph/Lance enrichment
-semantics change.
diff --git a/.cursor/rules/project-overview.mdc b/.cursor/rules/project-overview.mdc
index 20e63c5..e5bc44b 100644
--- a/.cursor/rules/project-overview.mdc
+++ b/.cursor/rules/project-overview.mdc
@@ -1,31 +1,34 @@
 ---
-description: High-level map of this repo (LanceDB MCP bundle + Kuzu AST graph for Java RAG/GraphRAG). Always-on so agents start with the right mental model.
+description: Project map and where to look for what. Always-on so agents start with the right mental model.
 alwaysApply: true
 ---
 
 # Project overview
 
-This repo is a **self-contained stdio MCP server** that serves semantic +
-structural search over a Java codebase. It is a Python project (the indexer
-and server). It is **not** a Java project — the `tests/bank-chat-system/`
-tree is fixture data, not code to modify.
-
-## What it does
-
-1. **LanceDB vector index** — chunks of Java / SQL / YAML embedded with
-   `sentence-transformers` (default `all-MiniLM-L6-v2`). Built by the
-   CocoIndex flow `java_index_flow_lancedb.py`.
-2. **Kuzu AST graph (sidecar)** — Tree-sitter Java -> Kuzu graph with
-   nodes (`package`, `file`, `class`, `interface`, `enum`, `record`,
-   `annotation`, `method`, `constructor`) and edges (`EXTENDS`,
-   `IMPLEMENTS`, `INJECTS`, `DECLARES`, `CALLS`). Built by
-   `build_ast_graph.py`.
-3. **MCP server (`server.py`)** — exposes `codebase_search`, `trace_flow`,
-   `find_implementors` / `find_subclasses` / `find_injectors` /
-   `find_callers` / `find_callees`, `impact_analysis`, `list_by_role` /
-   `list_by_annotation` / `list_by_capability`, `graph_neighbors`,
-   `list_code_index_tables`, `graph_meta`, and (gated)
-   `refresh_code_index`.
+This repo is a **self-contained stdio MCP server** that serves
+semantic + structural search over a Java codebase. It is a Python
+project (the indexer and server). It is **not** a Java project —
+the `tests/bank-chat-system/` tree is fixture data, not code to
+modify.
+
+Treat README and the markdown docs as the source of truth for
+behaviour, schemas, env vars, ranking, edges, tool defaults, and
+ontology. **Do not copy that content into rules** — read it directly
+when needed.
+
+## Where to look
+
+- `README.md` — feature surface, env vars, ranking, capabilities,
+  tool list, "Re-index required" callouts.
+- `CODEBASE_REQUIREMENTS.md` — Java-repo assumptions and per-file
+  map of what to edit when a target tree doesn't match defaults.
+- `tests/README.md` — testing philosophy.
+- `propose/` and `propose/completed/` — design proposes; the active
+  ones describe in-flight scope and the completed ones explain *why*
+  current code looks the way it does.
+- `plans/` and `plans/completed/` — longer-form plans (e.g.
+  capabilities model, tier completions).
+- `.cursor/rules/breaking-changes.mdc` — the no-back-compat policy.
 
 ## File map (top of repo)
 
@@ -44,28 +47,9 @@ tree is fixture data, not code to modify.
 | `java_index_v1_common.py` | Shared file walker / exclude patterns. |
 | `mcp.json.example` | Template for `.mcp.json`. |
 
-## Key environment variables
-
-- `LANCEDB_URI` — absolute path to `lancedb_data/` (required at runtime).
-- `LANCEDB_MCP_PROJECT_ROOT` — Java project root for indexing & metadata.
-- `KUZU_DB_PATH` — Kuzu DB path (defaults to `${LANCEDB_URI}/code_graph.kuzu`).
-- `LANCEDB_MCP_GRAPH_ENABLED` — `1`/`0`; auto-on when the Kuzu DB exists.
-- `LANCEDB_MCP_ALLOW_REFRESH` — `1` to enable the heavy `refresh_code_index` tool.
-- `LANCEDB_MCP_MICROSERVICE_ROOTS` — override microservice inference.
-- `SBERT_MODEL` / `SBERT_DEVICE` — must match indexer.
-
-## Two location concepts
-
-- `module` — innermost build-marker (`pom.xml`, `build.gradle(.kts)`, `build.sbt`) ancestor's directory name.
-- `microservice` — outermost build-marker ancestor under `LANCEDB_MCP_PROJECT_ROOT`. Single-module projects collapse `module == microservice`.
-
-Resolution order for `microservice`: explicit override (env or
-`.lancedb-mcp.yml` `microservice_roots:`) → outermost build marker →
-first path segment under `project_root` → empty string.
-
-## Read these before non-trivial edits
+## Test layout
 
-- `README.md` — full feature surface and behaviour.
-- `CODEBASE_REQUIREMENTS.md` — exact assumptions about Java repos and a
-  per-file map of what to edit if your tree doesn't match.
-- `tests/README.md` — testing philosophy (see `tests-and-fixtures.mdc`).
+- `tests/conftest.py` — session-scoped Kuzu graph fixture.
+- `tests/bank-chat-system/` — deterministic Java corpus (fixture, not production model).
+- `tests/fixtures/call_graph_smoke/` — mini Maven tree calibrated against the call-graph resolver.
+- Heavy e2e tests are gated behind `LANCEDB_MCP_RUN_HEAVY=1`.
diff --git a/.cursor/rules/python-style.mdc b/.cursor/rules/python-style.mdc
deleted file mode 100644
index 17fec53..0000000
--- a/.cursor/rules/python-style.mdc
+++ /dev/null
@@ -1,57 +0,0 @@
----
-description: Python coding style, typing, and tooling conventions used in this bundle.
-globs: **/*.py
-alwaysApply: false
----
-
-# Python style and tooling
-
-Target Python **3.11+**. The codebase uses `from __future__ import annotations`
-in every module and modern PEP 604 union syntax (`str | None`).
-
-## Conventions
-
-- **Imports:** `from __future__ import annotations` at the very top, then
-  stdlib, then third-party, then local. One module per `import` line.
-- **Typing:** Annotate all public functions and dataclasses. Use built-in
-  generics (`list[str]`, `dict[str, int]`) and PEP 604 unions
-  (`int | None`), not `typing.List` / `typing.Optional`.
-- **Dataclasses:** Prefer `@dataclass` (and `frozen=True` for value
-  objects) for record types. Use `pydantic.BaseModel` only at the MCP
-  tool boundary in `server.py`.
-- **Module docstrings:** One short sentence at the top describing intent.
-  Match the existing tone in `chunk_heuristics.py` / `index_common.py`.
-- **Public API surface:** Modules that expose a small, intentional API
-  (e.g. `java_ontology.py`) declare `__all__`. Add to it when you add a
-  new export.
-- **Private helpers:** Prefix with `_` and keep them module-local. Don't
-  promote a helper to public API just to ease testing — extend the public
-  function or add a fixture.
-- **Side effects at import time:** None. Reading env vars is fine
-  (`index_common.py` does this); doing IO or loading models at import is
-  not.
-- **Logging:** Diagnostics for the indexer and graph builder go to
-  **stderr** (see `graph_enrich` parse-error warnings,
-  `LANCEDB_MCP_DEBUG_CONTEXT=1` debug output). Never `print` to stdout
-  from anything reachable by the MCP server — stdout is the MCP transport.
-
-## Tooling
-
-- `ruff` is pinned in `requirements.txt`. Run `ruff check .` before
-  pushing meaningful changes; fix or justify warnings.
-- Tests run with `pytest` under `asyncio_mode = auto` (see `pytest.ini`
-  at repo root and under `tests/`).
-- No new top-level dependencies without updating `requirements.txt` and
-  thinking about wheel availability on Linux **and** macOS (Intel + ARM)
-  — this bundle has been tuned to work on Intel Mac (`8963a72`).
-
-## What to avoid
-
-- Don't add `print()` debugging in committed code. Use stderr or, for the
-  context-expansion path, the `LANCEDB_MCP_DEBUG_CONTEXT` flag.
-- Don't introduce a hard dependency on `cocoindex` outside
-  `java_index_flow_lancedb.py` / `refresh_code_index`. The whole point
-  of the bundle is that search and MCP run without CocoIndex installed.
-- Don't reach into private helpers of another module. If you need
-  something, lift it into `index_common.py` or `graph_enrich.py` (the
-  designated cross-cutting modules).
diff --git a/.cursor/rules/search-and-ranking.mdc b/.cursor/rules/search-and-ranking.mdc
deleted file mode 100644
index 0924a13..0000000
--- a/.cursor/rules/search-and-ranking.mdc
+++ /dev/null
@@ -1,91 +0,0 @@
----
-description: LanceDB chunk schema, ranking weights, score components, and graph-expansion fusion. Load when touching search_lancedb.py, java_index_flow_lancedb.py, or any *search*/*lancedb* test.
-globs: search_lancedb.py,java_index_flow_lancedb.py,java_index_v1_common.py,chunk_heuristics.py,index_common.py,tests/**/test_search_lancedb*.py,tests/**/test_lancedb_*.py
-alwaysApply: false
----
-
-# Search and ranking rules
-
-## Schema invariants
-
-`JavaLanceChunk` carries enrichment columns:
-`package`, `module`, `microservice`, `primary_type_fqn`,
-`primary_type_kind`, `role`, `capabilities`, `annotations_on_type`,
-`symbols`, `ontology_version`.
-
-- `annotations_on_type` and `symbols` are **native PyArrow
-  `list<string>`**, not JSON-encoded strings. Older indexes had
-  char-array output; the server defensively JSON-decodes string-form
-  list columns so old indexes don't explode, but `array_contains`
-  filters won't work until re-indexed.
-- Any schema change requires a **full reindex** via
-  `refresh_code_index` or `cocoindex update --full-reprocess -f`.
-  Update `README.md` §5 (the "Re-index required" note) and bump
-  `ontology_version` if semantics change.
-
-## Ranking (Java hits)
-
-Java hits get a role-weighted re-rank after vector / hybrid scoring:
-
-| Role | Weight |
-|------|--------|
-| `CONTROLLER` | +0.10 |
-| `SERVICE` | +0.08 |
-| `FEIGN_CLIENT` | +0.06 |
-| `COMPONENT` | +0.03 |
-| `REPOSITORY` | +0.02 |
-| `MAPPER` / `OTHER` | 0 |
-| `ENTITY` | -0.06 |
-| `CONFIG` | -0.10 |
-
-Plus a **symbol-match bonus**:
-
-1. Method/field overlap: `+0.03` per overlapping declared symbol, capped at `+0.06`.
-2. Action-verb bump: `+0.02` flat when a declared method starts with
-   `process`, `handle`, `on`, `pick`, `select`, `assign`, `notify`,
-   `dispatch`, `publish`, `consume`, `route`, `trigger`, `enqueue`,
-   `distribute`, ...
-3. Type-name overlap: `+0.05` per token shared between
-   `simple_name(primary_type_fqn)` and the query, capped at `+0.10`.
-
-**Both role weights and the symbol bonus are skipped when the caller
-locks `role=`.** Preserve this. The per-row breakdown lives in
-`score_components` (`distance`, `hybrid_rrf`, `role_weight`,
-`symbol_bonus`, `import_penalty`) and feeds the compact `why` string.
-When tweaking weights, update both the table in `README.md` and the
-ranking section in `CODEBASE_REQUIREMENTS.md`.
-
-## Filters and modes
-
-- `role`, `module`, `microservice`, `package_prefix`, `capability` are
-  AND-combined.
-- `auto_hybrid=true` mixes vector + FTS via RRF. Recommended when a
-  query contains identifiers / CamelCase / snake_case tokens.
-- `graph_expand=true` + `expand_depth=1..3` fuses Kuzu BFS results into
-  vector hits via RRF. The Kuzu DB must exist.
-- `context_neighbors=1..2` attaches adjacent chunks as `context_before`
-  / `context_after`. Empty context with neighbors set is a known
-  failure path: set `LANCEDB_MCP_DEBUG_CONTEXT=1` to log why expansion
-  bailed (missing schema columns, empty bucket, chunk not found, scan
-  error). Common cause: stale server after reindex, or legacy index
-  without `range_start` / `range_end`.
-
-## Capabilities axis
-
-`capabilities` is a multi-tag `list<string>` per type — types can
-carry zero or many. They **augment**, never replace, the single
-`role`. Triggers (must stay in sync with `_*_TO_CAPABILITY` maps):
-
-- `MESSAGE_LISTENER` — `@KafkaListener`, `@RabbitListener`, `@JmsListener`, `@SqsListener`, `@EventListener`, `@StreamListener`.
-- `MESSAGE_PRODUCER` — type injects `KafkaTemplate`, `RabbitTemplate`, `JmsTemplate`, `StreamBridge`, or `ApplicationEventPublisher`.
-- `SCHEDULED_TASK` — `@Scheduled` on any method, or class implements `org.quartz.Job`.
-- `EXCEPTION_HANDLER` — `@ControllerAdvice`, `@RestControllerAdvice`, or any method with `@ExceptionHandler`.
-
-## What to avoid
-
-- Don't introduce ranking signals that depend on a specific package
-  prefix or class name from the test fixture (see `tests-and-fixtures.mdc`).
-- Don't change how `score_components` is shaped without updating the
-  `why`-string formatter and tests in `test_search_lancedb*.py`.
-- Don't silently change embedding model — `SBERT_MODEL` must match the
-  one used at index time, and `index_common.py` is the single source.
diff --git a/.cursor/rules/tests-and-fixtures.mdc b/.cursor/rules/tests-and-fixtures.mdc
deleted file mode 100644
index fcfdd12..0000000
--- a/.cursor/rules/tests-and-fixtures.mdc
+++ /dev/null
@@ -1,64 +0,0 @@
----
-description: Testing philosophy. Read before adding/changing tests or touching the bank-chat-system fixture. Loose invariants beat exact counts.
-globs: tests/**/*.py,tests/**/*.java
-alwaysApply: false
----
-
-# Testing rules
-
-The full philosophy is in `tests/README.md` — follow it strictly. Summary
-for agents below.
-
-## DO NOT OVERFIT THE MCP TO THE FIXTURE
-
-`tests/bank-chat-system/` is a deterministic Java corpus used for
-assertions. It is **not** a model of real production codebases. Real
-repos look different in dozens of ways.
-
-1. **Assert on invariants, not exact counts.** Prefer `>= 1`, `> 0`,
-   `key in result`, `len(...) >= N`, or structural shape over `== 11`.
-   Exact counts are fine only when proving both sides of a known
-   relationship in the fixture (e.g. that `EventProcessor` has the
-   implementations the fixture defines).
-2. **Never special-case the fixture in production code.** Don't add a
-   role / heuristic / regex that only fires for `com.bank.chat...` or
-   `ChatManagementService`. If a test needs it, the test is wrong.
-3. **Test the contract without LanceDB.** Validation and error paths
-   (e.g. "index missing" responses) should run without a real index.
-   Heavy integration goes in `test_lancedb_e2e.py` and is gated behind
-   `LANCEDB_MCP_RUN_HEAVY=1`.
-4. **When a test fails after a refactor, re-read the assertion first.**
-   Most assertions are intentionally loose. Tightening one to chase a
-   number is almost always wrong.
-
-## Layout
-
-- `tests/conftest.py` builds the session-scoped Kuzu graph from
-  `tests/bank-chat-system/` exactly once into a `tmp_path_factory` dir,
-  then sets `KUZU_DB_PATH` (and a fake `LANCEDB_URI`) for the suite.
-- `tests/fixtures/call_graph_smoke/` is a mini Maven tree for
-  scope / overload / wildcard / method-ref graph checks. Don't edit it
-  to make a test pass — it is calibrated against the resolver.
-- Coverage matrix for the call-graph propose lives in
-  `propose/completed/CALL-GRAPH-PROPOSE.md` §7.1; tests are spread
-  across `test_ast_java_calls.py`, `test_call_graph_smoke_roundtrip.py`,
-  `test_call_graph_receiver_resolution.py`, `test_ast_graph_build.py`,
-  `test_kuzu_queries.py`, and the MCP smoke tests.
-
-## Heavy / e2e tests
-
-`test_lancedb_e2e.py` runs `cocoindex` and a real LanceDB index. Skipped
-unless `LANCEDB_MCP_RUN_HEAVY=1`. Don't unconditionally enable it; it
-downloads the embedding model on first run and indexes the corpus from
-scratch.
-
-## Adding new tests
-
-- Use `pytest` + `asyncio_mode = auto` (already set globally).
-- Reuse the `kuzu_graph` session fixture instead of building your own.
-- For new behaviour, add fixture Java under `tests/bank-chat-system/`
-  (or extend `call_graph_smoke/`) **only** if the asserted behaviour is
-  general — never to encode a one-off heuristic.
-- New MCP tools must be exercised at least once in `test_mcp_tools.py`,
-  either with the real Kuzu graph fixture or via the error path when
-  LanceDB is unavailable.
diff --git a/AGENTS.md b/AGENTS.md
index 55a2de4..7ccf58a 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,65 +1,45 @@
 # AGENTS.md
 
-Entry point for Cursor CLI agents (and other agentic tools) operating
-this repo. Detailed guidance lives in `.cursor/rules/*.mdc` — those
-files are auto-loaded by Cursor based on globs and `alwaysApply`. This
-file is a flat summary for tools that don't read `.cursor/rules/`.
-
-## What this repo is
-
-Self-contained **stdio MCP server** for semantic + structural search
-over a Java codebase:
-
-- **LanceDB** vector index (Java / SQL / YAML chunks, `sentence-transformers`).
-- **Kuzu** AST graph (Tree-sitter Java, deterministic) with
-  `EXTENDS`, `IMPLEMENTS`, `INJECTS`, `DECLARES`, `CALLS` edges.
-- **MCP tools** in `server.py`: `codebase_search`, `trace_flow`,
-  `find_callers` / `find_callees` / `find_implementors` / `find_subclasses`
-  / `find_injectors`, `impact_analysis`, `list_by_role` /
-  `list_by_annotation` / `list_by_capability`, `graph_neighbors`,
-  `list_code_index_tables`, `graph_meta`, gated `refresh_code_index`.
-
-## Hard rules (read first)
-
-1. **No backward-compatibility obligation.** Prefer removals and
-   schema updates over shims (`.cursor/rules/breaking-changes.mdc`).
-2. **No overfitting to the test fixture.** `tests/bank-chat-system/`
-   is a deterministic corpus, not a model of production. Assert on
-   invariants, not exact counts (`.cursor/rules/tests-and-fixtures.mdc`).
-3. **MCP server is stdio.** `print()` to stdout breaks the transport.
-   All diagnostics go to stderr.
-4. **One source of truth for roles and capabilities:** `java_ontology.py`
-   + the inference tables in `ast_java.py`. No string literals
-   sprinkled elsewhere.
-5. **Schema changes require a full reindex.** Update the README
-   "Re-index required" block and bump `ontology_version` when
+Entry point for Cursor CLI agents (and other agentic tools) working
+on this repo. Detailed guidance lives in `.cursor/rules/*.mdc` —
+those files are auto-loaded by Cursor. This file is a flat summary
+for tools that don't read `.cursor/rules/`.
+
+## Where to look
+
+- `README.md` — feature surface, env vars, ranking, capabilities,
+  tool list, "Re-index required" callouts.
+- `CODEBASE_REQUIREMENTS.md` — Java-repo assumptions and tuning map.
+- `propose/` and `plans/` (plus their `completed/` subdirs) —
+  in-flight scope and the rationale behind current design.
+- `tests/README.md` — testing philosophy.
+
+Read these directly. Don't rely on rule files to mirror them.
+
+## Hard rules
+
+1. **No backward-compatibility obligation** —
+   `.cursor/rules/breaking-changes.mdc`. Prefer removals and schema
+   updates over shims.
+2. **Propose-then-implement** for non-trivial features. Drop a short
+   markdown propose under `propose/`, reference it from the PR, move
+   it to `propose/completed/` once landed.
+3. **Don't overfit to the `tests/bank-chat-system/` fixture.** It is
+   a deterministic corpus, not a model of production. Assert on
+   invariants, not exact counts. Don't special-case the fixture in
+   production code.
+4. **`server.py` is a stdio MCP server.** Nothing reachable from a
+   tool handler may write to stdout. Diagnostics go to stderr.
+5. **Single source of truth** for roles and capabilities is
+   `java_ontology.py`. No string literals sprinkled elsewhere.
+6. **Schema changes require a reindex** — update the README
+   "Re-index required" callout and bump `ontology_version` when
    enrichment semantics change.
 
-## Investigation order
-
-1. `README.md` — feature surface and behaviour.
-2. `CODEBASE_REQUIREMENTS.md` — Java-repo assumptions and per-file
-   tuning map.
-3. `propose/` and `plans/` — designed-but-deferred work and the
-   "propose-then-implement" culture.
-4. `tests/README.md` — testing philosophy.
-5. The relevant `.cursor/rules/*.mdc` for the file you're editing.
-
 ## Workflow
 
 - Branch from `master`. Branch names: `cursor/<topic>` (CLI work),
   `plan/<name>` (in-progress propose).
-- Commit messages: present tense, imperative, lowercase first word
-  (e.g. `fixed call graph review D6`).
+- Commit messages: present tense, imperative, lowercase first word.
 - Always open a PR; never push to `master`.
 - Run `ruff check .` and `pytest tests -v` before pushing.
-- For non-trivial features, drop a short propose under `propose/` and
-  reference it in the PR.
-
-## Environment for running the server
-
-`LANCEDB_URI` (required), `LANCEDB_MCP_PROJECT_ROOT`, `KUZU_DB_PATH`
-(defaults to `${LANCEDB_URI}/code_graph.kuzu`),
-`LANCEDB_MCP_GRAPH_ENABLED`, `LANCEDB_MCP_ALLOW_REFRESH`,
-`LANCEDB_MCP_MICROSERVICE_ROOTS`, `SBERT_MODEL`, `SBERT_DEVICE`. See
-`README.md` §2 and `mcp.json.example`.