feat: mcp api v2 composition tweaks (PR-V2-2) by HumanBean17 · Pull Request #50 · HumanBean17/java-codebase-rag

HumanBean17 · 2026-05-07T10:03:55Z

Scope

Implements plans/PLAN-MCP-API-V2.md § PR-V2-2 — composition tweaks: edge_summary, symbol_id, meta edge counts on branch feat/mcp-v2-compose.

What Changed

mcp_v2.py: describe_v2 now populates NodeRecord.edge_summary via grouped in/out edge counts, and includes the requested inline validation-contract comment on required neighbors_v2 params.
mcp_v2.py: added _chunk_to_symbol_id and wired search_v2 to populate symbol_id from Lance rows when present (symbol_id direct field and metadata.symbol_id, including JSON-text metadata).
kuzu_queries.py: extended meta() with edge_counts for all 9 graph edge types and added edge_counts_for(node_id) helper used by describe_v2.
server.py: extended GraphMetaOutput/_graph_meta_output to surface edge_counts through graph_meta.
search_lancedb.py: ensured Java search projections include symbol_id/metadata when schema has them so search_v2 can compose into describe/neighbors in real execution.
tests/test_mcp_v2_compose.py: added the 6 PR-V2-2 compose tests from plan.
tests/test_search_lancedb.py: added projection regression tests to ensure symbol identity fields stay in selected Java columns.
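
For reviewers, a minimal sketch of the lookup order described above for search_v2: the direct symbol_id field first, then metadata.symbol_id, where metadata may arrive as JSON text. The function name and row shape are illustrative assumptions; this is not the actual _chunk_to_symbol_id in mcp_v2.py.

```python
import json
from typing import Any, Mapping, Optional


def chunk_to_symbol_id(row: Mapping[str, Any]) -> Optional[str]:
    """Return a symbol id from a search row, or None if the row carries none.

    Hypothetical helper: the row layout is assumed, not taken from the repo.
    """
    # 1. Prefer the direct column when it is a non-empty string.
    direct = row.get("symbol_id")
    if isinstance(direct, str) and direct:
        return direct

    # 2. Fall back to metadata, which may be a mapping or JSON text.
    metadata = row.get("metadata")
    if isinstance(metadata, str):
        try:
            metadata = json.loads(metadata)
        except json.JSONDecodeError:
            return None
    if isinstance(metadata, Mapping):
        nested = metadata.get("symbol_id")
        if isinstance(nested, str) and nested:
            return nested
    return None
```

Returning None rather than raising keeps rows without symbol identity usable as plain search hits.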

Semantics / Non-Goals

No v1 tool registration changes (deferred to PR-V2-3).
No new v2 handler args/behaviors beyond PR-V2-2 composition additions.
No analyze_pr / diagnose_ignore / refresh_code_index / list_code_index_tables extraction work (PR-V2-4).
No schema migration or ontology bump.
README “v2 navigation tools (preview)” section intentionally unchanged in this PR.

Validation

Lint

ruff check . ✅

Tests

pytest tests -q ✅
Result: 368 passed, 4 skipped
pytest tests/test_mcp_v2_compose.py -v ✅
Result: 6 passed (baseline + 6 new)

Additional checks

python build_ast_graph.py --source-root tests/bank-chat-system --kuzu-path /tmp/pr_v2_2_check ✅

Sentinel checks

git diff --name-only master...HEAD ->
- kuzu_queries.py
- mcp_v2.py
- search_lancedb.py
- server.py
- tests/test_mcp_v2_compose.py
- tests/test_search_lancedb.py

Manual evidence

python -c "from kuzu_queries import KuzuGraph; from mcp_v2 import describe_v2; g=KuzuGraph('/tmp/pr_v2_2_check'); r=g._rows(\"MATCH (s:Symbol)-[:CALLS]->(:Symbol) RETURN s.id AS id LIMIT 1\"); sid=r[0]['id']; out=describe_v2(sid, graph=g); print(sid); print(out.record.edge_summary if out.record else None)"
Observed: describe_v2(...).record.edge_summary returns non-empty per-edge in/out map (example: {'CALLS': {'in': 0, 'out': 1}, 'DECLARES': {'in': 1, 'out': 0}}).
python -c "from kuzu_queries import KuzuGraph; g = KuzuGraph('/tmp/pr_v2_2_check'); m = g.meta(); print(sorted((m.get('edge_counts') or {}).keys()))"
Observed: all 9 edge keys present:
['ASYNC_CALLS', 'CALLS', 'DECLARES', 'DECLARES_CLIENT', 'EXPOSES', 'EXTENDS', 'HTTP_CALLS', 'IMPLEMENTS', 'INJECTS']

Out of Scope Confirmed

Did not implement:

Any v1 tool registration removal/changes.
Any new v2 handler arguments beyond the 3 composition additions.
Ops tool extraction / CLI migration work from PR-V2-4.
README v2 preview promotion/removal of v1 docs.
Ontology/schema bump.

Definition of Done

All listed deliverables for this PR are shipped.
Required lint/tests pass locally with recorded command output.
Sentinel checks produce expected results.
Only in-scope files are modified.
PR description includes scope, validation, and manual evidence.
PR targets master with agreed title and branch naming.

Made with Cursor

Co-authored-by: Cursor <cursoragent@cursor.com>

HumanBean17 · 2026-05-07T12:20:21Z

Post-merge audit: PR-V2-2 — composition tweaks

Verdict: Approved ✅ (post-merge — no merge gate, but a clean audit and one cleanup follow-up worth tracking).

PR is on-spec for plans/PLAN-MCP-API-V2.md § PR-V2-2. All 3 composition deliverables landed (edge_summary populated, _chunk_to_symbol_id + search_v2 symbol-identity wiring, meta.edge_counts for all 9 edge types). Manual evidence reproduces against a freshly-rebuilt fixture. Tests green at 368/4 (+8 over PR-V2-1; +6 prescribed compose tests + 2 bonus regression tests on search_lancedb projections). One observation from the PR-V2-1 review (Field(...) validation contract) was addressed properly via @validate_call.

Scope discipline (out-of-scope checks)

| Sentinel | Status |
| --- | --- |
| v1 `@mcp.tool` deletions (PR-V2-3) | ✅ 0 deletions |
| New `@mcp.tool` registrations (none expected) | ✅ 0 additions |
| `user_rag/cli`, `pyproject.toml` (PR-V2-4) | ✅ 0 matches |
| `ONTOLOGY_VERSION` bump | ✅ 0 (only one regex hit, an `import as _ONTOLOGY_VERSION` line that isn't a version change) |
| `CREATE NODE TABLE` / `CREATE REL TABLE` / `DROP TABLE` | ✅ 0 (no schema work) |
| `SCHEMA_VERSION` | ✅ 0 |
| README "v2 navigation tools (preview)" subsection | ✅ unchanged this PR (deferred to PR-V2-3 promotion) |

git diff against the merge parent (44b726c8...62f56f7c): 6 files (kuzu_queries.py, mcp_v2.py, search_lancedb.py, server.py, tests/test_mcp_v2_compose.py, tests/test_search_lancedb.py) — 4 expected by plan + 2 plan deltas; both deltas are justified (see Plan Deltas below).

Plan compliance

| # | Step from plan | Verified |
| --- | --- | --- |
| 1 | `describe_v2.edge_summary` populated via grouped Cypher count | ✅ `_edge_summary_for_node` (`mcp_v2.py:265`) + `KuzuGraph.edge_counts_for` (`kuzu_queries.py:579`); both filter zero-count types |
| 2 | `_chunk_to_symbol_id` helper added; `search_v2.symbol_id` populated when chunk row carries it | ✅ `_chunk_to_symbol_id` at `mcp_v2.py:157`, wired into `search_v2` at line 147 (covered by `test_search_populates_symbol_id_when_chunk_rooted_in_symbol`) |
| 3 | `KuzuGraph.meta().edge_counts: dict[str, int]` covers all 9 edge types from `_SCHEMA_*` | ✅ at `kuzu_queries.py:535-575`; surfaced through `GraphMetaOutput.edge_counts` in `server.py:369,551` |
| 4 | 6 prescribed compose tests in `tests/test_mcp_v2_compose.py` | ✅ all 6 names exact: `test_describe_edge_summary_for_controller`, `_omits_zero_count_types`, `_for_route`, `test_search_populates_symbol_id_when_chunk_rooted_in_symbol`, `test_meta_returns_per_edge_type_counts`, `test_search_describe_neighbors_chain_end_to_end` |
| 5 | No graph schema changes / no ontology bump | ✅ confirmed by sentinel checks |
| 6 | `README.md` "v2 preview" left untouched | ✅ no README diff |

Tests

368 passed, 4 skipped in 61.36s

Delta vs post-V2-1 baseline (360): +8 new = 6 prescribed compose tests + 2 bonus search_lancedb projection regression tests. PR description cites "+6 new" but the actual count is +8 — minor under-reporting, not over-reporting (the +2 are clearly disclosed in the file diff).

Manual evidence reproduced

Rebuilt fixture at /tmp/v22_check, then ran both manual-evidence snippets from the PR description:

1. describe_v2(...).record.edge_summary on a symbol that has both inbound and outbound edges:

id: 1dae8ba1e800d8a1857bf827ce5d1551cff71f2a
edge_summary: {'CALLS': {'in': 0, 'out': 1}, 'DECLARES': {'in': 1, 'out': 0}}

✅ Exact match against PR description's example.

2. KuzuGraph.meta().edge_counts keys + non-zero values:

['ASYNC_CALLS', 'CALLS', 'DECLARES', 'DECLARES_CLIENT', 'EXPOSES',
 'EXTENDS', 'HTTP_CALLS', 'IMPLEMENTS', 'INJECTS']
values: {EXTENDS: 10, IMPLEMENTS: 14, INJECTS: 71, DECLARES: 474, CALLS: 793,
         EXPOSES: 11, DECLARES_CLIENT: 2, HTTP_CALLS: 2, ASYNC_CALLS: 5}

✅ All 9 edge types present with non-zero counts. Total in-graph edges from bank-chat-system fixture: 1382.
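
The total can be re-derived from the per-type values listed above. A quick check, with the counts copied verbatim from this audit run:

```python
# Per-edge-type counts as reported by KuzuGraph.meta().edge_counts in this audit.
edge_counts = {
    "EXTENDS": 10,
    "IMPLEMENTS": 14,
    "INJECTS": 71,
    "DECLARES": 474,
    "CALLS": 793,
    "EXPOSES": 11,
    "DECLARES_CLIENT": 2,
    "HTTP_CALLS": 2,
    "ASYNC_CALLS": 5,
}

total = sum(edge_counts.values())
all_nonzero = all(count > 0 for count in edge_counts.values())
print(len(edge_counts), total, all_nonzero)  # prints: 9 1382 True
```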

Notes that earned my trust

PR-V2-1 review observation #1 was addressed properly. Instead of just adding a comment about the Field(...)-as-default convention, the agent went one step further and added an explicit @validate_call(config={"arbitrary_types_allowed": True}) decorator on neighbors_v2 (mcp_v2.py:446). That's the architecturally clean way to honor the validation contract for both MCP-bound and direct-Python calls. It also added a 2-line inline comment at mcp_v2.py:449-450 explaining the intent. Excellent follow-through.
Quality catch on search_lancedb.py. Adding symbol_id and metadata to JAVA_ENRICHED_COLUMNS (search_lancedb.py:37-38) was required for V2-2 to actually work end-to-end. Without it, _chunk_to_symbol_id would never see the symbol_id column on Lance rows because the SELECT projection was filtering it out — unit tests would pass against synthetic rows, but real search_v2 calls would silently return symbol_id=None for every hit. The agent caught this and added a regression test (test_search_one_table_selects_symbol_identity_columns_when_schema_has_them) using a monkeypatch-based fake Lance table to lock the projection in. That's a level of architectural awareness that justifies the plan delta.
Edge summary helper centralised on KuzuGraph. edge_counts_for(node_id) lives on the graph class (single source of truth for the Cypher) and mcp_v2.py calls it via getattr fallback. Clean separation.
Zero-count edge types are filtered out at both the helper level (mcp_v2.py:284-288) and the graph method level (kuzu_queries.py:596-600) — matches the plan's "edge_summary should only contain non-zero entries" rule (test 2 confirms this).
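
To make the zero-count rule concrete, here is a minimal sketch of folding grouped (edge type, direction, count) rows into the edge_summary shape shown in the manual evidence. The row layout and function name are assumptions for illustration; the real query and helper live in kuzu_queries.py and mcp_v2.py and may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

EdgeSummary = Dict[str, Dict[str, int]]


def build_edge_summary(rows: Iterable[Tuple[str, str, int]]) -> EdgeSummary:
    """Fold (edge_type, direction, count) rows into {type: {"in": n, "out": m}}.

    Hypothetical helper: direction must be "in" or "out".
    """
    totals: Dict[str, Dict[str, int]] = defaultdict(lambda: {"in": 0, "out": 0})
    for edge_type, direction, count in rows:
        if direction not in ("in", "out"):
            raise ValueError(f"unexpected direction: {direction!r}")
        totals[edge_type][direction] += count
    # Keep only edge types with at least one inbound or outbound edge.
    return {
        edge_type: counts
        for edge_type, counts in sorted(totals.items())
        if counts["in"] or counts["out"]
    }


rows = [
    ("CALLS", "out", 1),
    ("CALLS", "in", 0),
    ("DECLARES", "in", 1),
    ("INJECTS", "in", 0),   # zero in both directions, so it is dropped
    ("INJECTS", "out", 0),
]
summary = build_edge_summary(rows)
print(summary)  # prints: {'CALLS': {'in': 0, 'out': 1}, 'DECLARES': {'in': 1, 'out': 0}}
```

Note that an edge type is kept when either direction is non-zero, which is why CALLS retains its 'in': 0 entry.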

Observations (non-blocking)

Code duplication: _edge_summary_for_node (mcp_v2.py:265-288) and KuzuGraph.edge_counts_for (kuzu_queries.py:579-600) are byte-for-byte identical implementations of the same Cypher UNION ALL query + zero-count filter. The mcp_v2 helper does a hasattr(graph, "edge_counts_for") check and falls through to the inline implementation only if the method is missing. Since KuzuGraph always has it now, the inline fallback is dead code — kept presumably for test ergonomics (a fake graph that doesn't implement edge_counts_for). Worth either deleting the fallback (and adjusting tests to require the protocol method) or making the inline path use the same helper, in PR-V2-3 or a follow-up. Non-blocking; ~24 LoC of dupe.
PR description says "+6 new tests" but actual delta is +8. The 2 extra tests are the plan-delta regression tests on search_lancedb projections. Not a problem — under-reporting is fine — but worth correcting to "+6 prescribed + 2 regression" for future grep-ability if this PR is ever audited from the description alone.
_resolve_node_kind extra graph round-trip (PR-V2-1 observation #3) still stands: neighbors_v2 calls it once per origin id (mcp_v2.py:464), which means an N-ids batch call does N extra existence-check queries. Small in absolute terms; eligible for the same prefix-only fast-path optimization later. No action needed for V2-3/V2-4.
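
One possible shape for the duplication cleanup in the first observation, sketched under the assumption that test fakes can implement edge_counts_for themselves. The Protocol, the wrapper, and the FakeGraph class are hypothetical names, not code from this repository.

```python
from typing import Dict, Protocol

EdgeSummary = Dict[str, Dict[str, int]]


class SupportsEdgeCounts(Protocol):
    """Anything that can report per-edge-type in/out counts for a node."""

    def edge_counts_for(self, node_id: str) -> EdgeSummary: ...


def edge_summary_for_node(graph: SupportsEdgeCounts, node_id: str) -> EdgeSummary:
    # Delegate unconditionally: the graph owns the query, so there is
    # no inline fallback to keep in sync.
    return graph.edge_counts_for(node_id)


class FakeGraph:
    """Test double that satisfies the protocol without a database."""

    def __init__(self, summaries: Dict[str, EdgeSummary]) -> None:
        self._summaries = summaries

    def edge_counts_for(self, node_id: str) -> EdgeSummary:
        return self._summaries.get(node_id, {})


fake = FakeGraph({"sym-1": {"CALLS": {"in": 0, "out": 1}}})
result = edge_summary_for_node(fake, "sym-1")
```

With this arrangement a fake graph that lacks the method fails loudly at the call site instead of silently exercising a second copy of the query.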

Plan deltas

Two file additions to the plan's PR-V2-2 deliverable list — both justified:

search_lancedb.py (+2 lines): adding symbol_id and metadata to JAVA_ENRICHED_COLUMNS. Required for end-to-end V2-2 functionality (see Notes above). The plan's PR-V2-2 §1 should mention this as part of the symbol_id wiring; recommend a one-line amend to the plan in the next planning pass.
tests/test_search_lancedb.py (+62 lines, 2 new tests): regression tests for the JAVA_ENRICHED_COLUMNS projection. Locks in the fix above. No plan change needed for tests — they're a strict superset of what the plan prescribed.
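
The projection behaviour behind both deltas can be illustrated with a small sketch: request the identity columns, but only select those the table schema actually has. The column names other than symbol_id and metadata, and the function itself, are assumptions; the actual JAVA_ENRICHED_COLUMNS handling in search_lancedb.py may be structured differently.

```python
from typing import Iterable, List, Sequence

# Hypothetical column lists; only symbol_id and metadata come from the PR.
BASE_COLUMNS = ("text", "path")
IDENTITY_COLUMNS = ("symbol_id", "metadata")


def select_columns(schema_names: Iterable[str], wanted: Sequence[str]) -> List[str]:
    """Return the wanted columns that exist in the schema, in wanted order."""
    available = set(schema_names)
    return [name for name in wanted if name in available]


wanted = BASE_COLUMNS + IDENTITY_COLUMNS

# A table whose schema has symbol_id but no metadata column.
with_identity = select_columns(["path", "text", "symbol_id", "vector"], wanted)

# An older table with neither identity column still projects cleanly.
without_identity = select_columns(["path", "text", "vector"], wanted)
```

A regression test in this style asserts that symbol_id stays in the selected list whenever the schema offers it, which is the property the two new tests are described as locking in.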

Audit complete. Master is at 62f56f7; on to PR-V2-3 (feat/mcp-v2-cutover) — delete the 18 v1 navigation tools, promote the README v2 subsection to primary, drop tests/test_mcp_v2_equivalence.py, update the surface assertion to expect 9 registered MCP tools. The cursor prompt is ready in plans/CURSOR-PROMPTS-MCP-API-V2.md § PR-V2-3.

I noticed cursor/pr-v2-3-cutover-fixes already exists on origin — looks like Cursor may have started on V2-3 in parallel. Let me know when there's a PR for it and I'll review.

apply v2 compose edge summary and symbol identity fixes

b7078c7

Co-authored-by: Cursor <cursoragent@cursor.com>

HumanBean17 merged commit 62f56f7 into master May 7, 2026

HumanBean17 deleted the feat/mcp-v2-compose branch May 10, 2026 21:17
