Sync MCP v2 docs, neighbors edge_types validation, venv cursor rule by HumanBean17 · Pull Request #54 · HumanBean17/java-codebase-rag

HumanBean17 · 2026-05-07T15:04:17Z

Summary

Align README, CODEBASE_REQUIREMENTS, docs/AGENT-GUIDE, docs/MANUAL-VERIFICATION-CHECKLIST, AGENTS.md, and .cursor rules with the four-tool MCP surface and user-rag CLI (per propose/completed/MCP-API-V2-REDESIGN-PROPOSE.md).
Expand server.py FastMCP _INSTRUCTIONS and tool Field descriptions for weaker models.
Fix stale ontology upgrade hint in kuzu_queries.py (point to user-rag refresh).
Contract fix: neighbors_v2 now rejects empty edge_types with ValidationError (matches docs); add test_neighbors_empty_edge_types_rejected.
Add .cursor/rules/python-venv-only.mdc (repo-relative .venv paths).

Testing

Not run: full pytest tests (per request). Targeted neighbors tests were run during development.

Re-index

No schema / ontology bump from this PR.

Made with Cursor

Co-authored-by: Cursor <cursoragent@cursor.com>

HumanBean17 · 2026-05-07T15:12:42Z

Review: PR #54 — Sync MCP v2 docs, neighbors edge_types validation, venv cursor rule

Verdict: Approved with notes ⚠️

Clean chore PR that wraps up the v2 series: docs/copy alignment, a real contract fix in mcp_v2.neighbors_v2 (rejects edge_types=[]), helpful FastMCP tool descriptions for weak models, a stale kuzu_queries.py upgrade hint pointed at the new CLI, and a .cursor/rules/python-venv-only.mdc housekeeping rule. Scope is confined; no schema/ontology delta. One non-blocking hint mismatch and a couple of minor copy nits below.

Scope discipline (out-of-scope checks)

Sentinel	Status
`ONTOLOGY_VERSION` / `SCHEMA_VERSION`	✅ 4 hits, all doc text only — no code-level bump
`CREATE NODE TABLE` / `CREATE REL TABLE`	✅ 1 hit, in a `CODEBASE_REQUIREMENTS.md` quoted snippet — no schema delta
`DROP TABLE`	✅ 0 hits
`@mcp.tool` decorators changed	✅ 3 contextual hits (find/describe/neighbors); no add/remove — surface still exactly 4
`ontology_version` literal change	✅ 0 — only doc references stay at version 11

Plan / PR-description compliance

#	PR claim	Verified
1	README, CODEBASE_REQUIREMENTS, AGENT-GUIDE, MANUAL-VERIFICATION-CHECKLIST, AGENTS.md, `.cursor/rules` aligned to 4-tool MCP + `user-rag` CLI	✅ Searched all 5 docs for stale v1 tool names (`list_routes`, `describe_route`, `where_used`, `callers_of`, `callees_of`, `implementations_of`, `outbound_calls`, `client_for_call`, `edges_with_metadata`, `symbols_in_module`, `symbols_in_microservice`, `analyze_pr(`, `graph_meta(`, `refresh_code_index(`, `diagnose_ignore(`, `list_code_index_tables(` — 0 matches in the post-PR tree)
2	`server.py` `_INSTRUCTIONS` and `Field` descriptions expanded for weaker models	✅ Instructions block now enumerates all 9 edge labels verbatim and explicitly says "you MUST pass direction in\|out AND edge_types list — no defaults"; per-Field descriptions added for `hybrid`, `path_contains`, `filter`, `direction`, `edge_types`, `limit`, and the `find.filter` and `describe.id` (now spells out `sym:`/`route:`/`client:` prefixes)
3	Fix stale ontology-upgrade hint in `kuzu_queries.py` to point at `user-rag refresh`	⚠️ Done, but the hint reads `LANCEDB_MCP_ALLOW_REFRESH=1 user-rag refresh ...`. The CLI doesn't actually consult that env var (see Notes #1 below).
4	`neighbors_v2` rejects `edge_types=[]` with `ValidationError`; new `test_neighbors_empty_edge_types_rejected`	✅ `_NEIGHBOR_EDGE_TYPES_ADAPTER = TypeAdapter(Annotated[list[str], Field(min_length=1, ...)])` invoked first inside `try:` block; existing `except ValidationError: raise` re-raises it correctly so the @validate_call missing-field test and the new empty-list test both pass
5	Add `.cursor/rules/python-venv-only.mdc` (repo-relative `.venv` paths)	✅ 14-line rule file present

Tests

1 failed, 323 passed, 7 skipped in 176.95s
FAILED tests/test_user_rag_cli.py::test_cli_meta_pretty_when_tty (pre-existing PTY EIO from PR-V2-4 — not introduced by this PR)

Per-file delta vs master (post PR-V2-4 merge): tests/test_mcp_v2.py 22 → 23 (+1 = test_neighbors_empty_edge_types_rejected). Master (post-V2-4) ran on this branch's checkout: 322 passed, 7 skipped, 1 failed. Math: 322 + 1 = 323 ✅.

Targeted neighbors regression run separately:

tests/test_mcp_v2.py::test_neighbors_empty_edge_types_rejected     PASSED
tests/test_mcp_v2.py::test_neighbors_missing_edge_types_rejected   PASSED
tests/test_mcp_v2.py (all)                                         19 passed, 3 skipped

The only failing test (test_cli_meta_pretty_when_tty, EIO on the master pty fd) is unchanged from PR-V2-4 — this PR doesn't touch tests/test_user_rag_cli.py and isn't responsible. Still worth fixing in a follow-up; the suggestion from the PR-V2-4 review (catch OSError(EIO) after proc.poll() returns) still applies.

Manual evidence reproduced

# 9 edge labels in instructions match runtime taxonomy
$ grep -oE "EXTENDS|IMPLEMENTS|INJECTS|DECLARES|DECLARES_CLIENT|CALLS|EXPOSES|HTTP_CALLS|ASYNC_CALLS" server.py | sort -u | wc -l
9                                                       # ✅

# neighbors empty edge_types rejection (Python REPL)
>>> from mcp_v2 import neighbors_v2
>>> neighbors_v2("sym:x", direction="in", edge_types=[], graph=...)
pydantic_core._pydantic_core.ValidationError: 1 validation error for ...
                                                        # ✅ rejected before the try-block fallback could catch it as a generic Exception

Notes that earned my trust

_NEIGHBOR_EDGE_TYPES_ADAPTER is hoisted to module scope (mcp_v2.py:13-15). Built once at import time, not per-call. TypeAdapter instantiation isn't free — this is the right idiom.
Validation runs first (mcp_v2.py:441) — before the KuzuGraph.get() call. An empty list never reaches the singleton or the Cypher path; cheaper failure mode.
Existing except ValidationError: raise clause (mcp_v2.py:498-499) didn't need to change; the new validation reuses the same re-raise contract that @validate_call relies on. No "swallow the new error as success=False" regression.
describe.id description now spells out the prefix taxonomy ("Graph node id with sym:, route:, or client: prefix") — exactly the kind of weak-model hint that prevents describe("ChatController") from being a confidently-wrong tool call.
_INSTRUCTIONS block is structured as a checklist for a coding agent — tools list, then the MUST pass direction + edge_types clause, then the verbatim edge taxonomy, then the "user-rag CLI — not MCP" boundary. Hard to misread.
docs/AGENT-GUIDE.md has a forced reasoning preamble pattern (Q-class: ... / Pick: ... Why: ...) and a copy-paste BEGIN/END block — that's a real prompt-engineering artifact, not just documentation. Calibrated against ontology v11 and references the propose doc directly.
docs/MANUAL-VERIFICATION-CHECKLIST.md was rewritten phase-by-phase with concrete find / neighbors invocations replacing v1 tool calls — the verifier will actually be able to follow it now.

Observations (non-blocking)

Misleading LANCEDB_MCP_ALLOW_REFRESH hint in kuzu_queries.py:316. The new message reads Run: LANCEDB_MCP_ALLOW_REFRESH=1 user-rag refresh --source-root <repo>, but user_rag/cli.py::_cmd_refresh calls server.run_refresh_pipeline() directly and never consults _refresh_allowed() (which is now only used inside _graph_meta_output's refresh_enabled field). The env var has no functional effect on the CLI today. Either drop it from the hint (Run: user-rag refresh --source-root <repo>) or — if you want the gate to keep applying — add an if not server._refresh_allowed(): _emit({"success": False, "message": "set LANCEDB_MCP_ALLOW_REFRESH=1 to enable refresh"}); return 1 guard at the top of _cmd_refresh. Right now docs say "this is gated" and code says "anyone can call refresh". Pick one.
docs/AGENT-GUIDE.md line 233 says ontology v11, and kuzu_queries.py constant is also _ONTOLOGY_VERSION = 11. AGENTS.md line 12 also says 11. ✅ all in sync — I'm just calling out that the calibration claim in the agent guide ("Calibrated against ontology version 11") is verifiable.
describe.id parameter description ("Graph node id with sym:, route:, or client: prefix") could spell out one full example, e.g. "e.g. sym:com.bank.chat.core.api.ChatController#joinOperator(JoinOperatorRequest)". The README has this example; surfacing it in the Field(description=...) itself would help the weakest tier of model that never reads the README. Optional polish.
neighbors.direction description says "Required: in (predecessors) or out (successors); no default" — clear. But the type is still str, not Literal["in", "out"]. With Literal["in", "out"], FastMCP will surface a discriminated string enum to the model and reject direction="upstream" at parse time instead of at runtime. Since mcp_v2.neighbors_v2 already validates direction internally, this is non-blocking, but it would be a strict UX upgrade for weak models. Same call applies to find.kind (Literal["symbol", "route", "client"]) and search.table (Literal["java", "sql", "yaml", "all"]).
No test asserts the new _INSTRUCTIONS or Field descriptions are non-empty / contain the required keywords. The existing test_all_tools_have_non_empty_description (in tests/test_mcp_tools.py) only checks tool.description, not per-parameter Field(description=...). A 5-line test that introspects the inputSchema of each tool and asserts every parameter has a non-empty description would lock the gain you just shipped against future copy regressions. Defer to PR-E1.
.cursor/rules/python-venv-only.mdc is small enough I didn't grep it for typos, but the rule name is good — it'll fire on absolute-path .venv references in future Cursor edits.
tests/test_neighbors_empty_edge_types_rejected uses the same _method_id_with_calls(kuzu_graph, "out") fixture as the missing-field test — gives parity. One small parity gap: the missing-field test passes edge_types not at all (relies on @validate_call); the empty-list test passes edge_types=[] (relies on the new TypeAdapter). Both are now distinct error paths. Worth a comment in the new test body that says "this exercises the explicit min_length=1 check, not the FastMCP required-field check" so the next reader doesn't think they're duplicates.

Plan deltas needed

None. The propose doc / completed plan stayed accurate.

Bonus catches

The PR-V2-4 review's observation Tier 1 completion plan + proposals (B2a + B4 + B5) #2 ("_cmd_refresh exit-code split is sensible but undocumented") was already addressed in PR-V2-4's merge (user_rag/cli.py:60 now has the docstring """Return 1 for launched-subprocess failures, 2 for internal pre-launch errors."""). Catching that retroactively here.
The 9-edge-label list in _INSTRUCTIONS is alphabetised within groups (Type wiring → Containment → Method calls → Service boundary → Cross-service) matching the AGENT-GUIDE table. Consistency between the FastMCP server-time hint and the agent-time guide is non-trivial and easy to lose; nice to see it preserved.

Ready to merge. Suggested PR-E1 carries forward:

PTY EIO handling (still failing locally; not regressed by this PR but still red)
LANCEDB_MCP_ALLOW_REFRESH hint vs CLI behaviour mismatch (Notes AST by Opus #1)
Literal[...] types on direction / kind / table (Notes Add per-PR Cursor task prompts for Tier 1 completion #4)
Schema test that asserts every MCP tool parameter has a non-empty Field description (Notes PR-A1: Route schema + literal extractor (B2a) #5)
Drop or update the dangling Manual TODO from PR-V2-4 — your cursor-pr-review user skill still calls analyze_pr MCP rather than the CLI.

…, schema descriptions Co-authored-by: Cursor <cursoragent@cursor.com>

Co-authored-by: Cursor <cursoragent@cursor.com>

sync mcp v2 docs, validate neighbors edge_types, add venv rule

b3e2a88

Co-authored-by: Cursor <cursoragent@cursor.com>

HumanBean17 and others added 2 commits May 7, 2026 18:18

address pr54 review: refresh gate, pty meta test, literal tool params…

d39ba70

…, schema descriptions Co-authored-by: Cursor <cursoragent@cursor.com>

add user-rag cli operator guide and link from readme

81c1102

Co-authored-by: Cursor <cursoragent@cursor.com>

HumanBean17 merged commit 5603f89 into master May 7, 2026

This was referenced May 7, 2026

feat: tolerate JSON-encoded NodeFilter in MCP search/find/neighbors #55

Merged

Tighten MCP tool parameter types: use Literal[...] for direction / kind / table / edge_types #58

Closed

HumanBean17 deleted the chore/mcp-v2-docs-and-tool-copy branch May 10, 2026 21:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sync MCP v2 docs, neighbors edge_types validation, venv cursor rule#54

Sync MCP v2 docs, neighbors edge_types validation, venv cursor rule#54
HumanBean17 merged 3 commits into
masterfrom
chore/mcp-v2-docs-and-tool-copy

HumanBean17 commented May 7, 2026

Uh oh!

HumanBean17 commented May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

HumanBean17 commented May 7, 2026

Summary

Testing

Re-index

Uh oh!

HumanBean17 commented May 7, 2026

Review: PR #54 — Sync MCP v2 docs, neighbors edge_types validation, venv cursor rule

Scope discipline (out-of-scope checks)

Plan / PR-description compliance

Tests

Manual evidence reproduced

Notes that earned my trust

Observations (non-blocking)

Plan deltas needed

Bonus catches

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant