The bug that surfaced this
find(kind="client" | "route", filter={...}) silently ignores symbol-only filter fields (most notably fqn_prefix), returning every node of the requested kind instead of an empty/restricted set.
Reproduced on tests/bank-chat-system fixture:
from kuzu_queries import KuzuGraph
from mcp_v2 import find_v2

g = KuzuGraph("/tmp/repro_kuzu")
out = find_v2(
    kind="client",
    filter={"fqn_prefix": "NO.SUCH.PACKAGE.AtAll", "kind": "client"},
    graph=g,
)
print(len(out.results))
# observed: 2 (every Client in the graph)
# expected: 0, or a loud error
Two distinct silent-drop pathologies in the same payload:
fqn_prefix is consulted only in the kind="symbol" branch (mcp_v2.py:342). For client/route, neither the Cypher push-down nor the post-filter reads it.
NodeFilter.model_config == {} → Pydantic defaults to extra="ignore". The malformed "kind": "client" inside filter is dropped silently rather than rejected.
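The first pathology can be reproduced without the fixture. The sketch below is a toy model, not the code in mcp_v2.py: node_matches, the node dicts, and the FQNs are hypothetical and only mirror the branch structure described above, where fqn_prefix is read in the symbol branch and nowhere else.

```python
# Toy model of the silent-drop pattern; hypothetical, not the real mcp_v2 code.
def node_matches(node: dict, kind: str, flt: dict) -> bool:
    if node["kind"] != kind:
        return False
    if kind == "symbol":
        prefix = flt.get("fqn_prefix")
        if prefix is not None and not node["fqn"].startswith(prefix):
            return False
    # For "client" and "route", fqn_prefix is never read,
    # so it cannot exclude anything.
    return True


nodes = [
    {"kind": "client", "fqn": "com.bank.chat.AssignClient"},
    {"kind": "client", "fqn": "com.bank.chat.NotifyClient"},
    {"kind": "symbol", "fqn": "com.bank.chat.ChatService"},
]
flt = {"fqn_prefix": "NO.SUCH.PACKAGE.AtAll"}

clients = [n for n in nodes if node_matches(n, "client", flt)]
symbols = [n for n in nodes if node_matches(n, "symbol", flt)]
print(len(clients), len(symbols))  # 2 0
```

The same filter restricts symbols to nothing and clients not at all, which is the asymmetry the reproduction shows.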
Why this isn't a single-field fix
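fqn_prefix is the most natural predicate to trip over, but the same pathology applies to every symbol-only field sent with a non-symbol kind, and symmetrically:

| Fields | Silently ignored when kind is |
| --- | --- |
| role, exclude_roles, annotation, capability, fqn_prefix, symbol_kind, symbol_kinds | route, client |
| http_method, path_prefix, framework | symbol, client |
| client_kind, target_service, target_path_prefix, client_method, source_layer | symbol, route |

NodeFilter is a 17-field bag that pretends every field applies to every kind. The agent's only signal about what's queryable is "the tool didn't error", which is exactly the wrong feedback loop.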
The bug is the contract, not the field.
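The field grouping can be written down as data, which makes the "17-field bag" claim checkable. This is a sketch under an assumption: the grouping below (seven symbol-only fields, three route-only, five client-only, plus microservice and module shared by all kinds) is reconstructed from this issue's text, and the names UNIFORM, FIELDS_BY_KIND, and silently_ignored are hypothetical.

```python
# Assumed grouping, reconstructed from the issue text; not read from mcp_v2.py.
UNIFORM = {"microservice", "module"}
FIELDS_BY_KIND = {
    "symbol": {"role", "exclude_roles", "annotation", "capability",
               "fqn_prefix", "symbol_kind", "symbol_kinds"},
    "route": {"http_method", "path_prefix", "framework"},
    "client": {"client_kind", "target_service", "target_path_prefix",
               "client_method", "source_layer"},
}
ALL_FIELDS = UNIFORM.union(*FIELDS_BY_KIND.values())


def silently_ignored(kind: str, flt: dict) -> list[str]:
    """Return the filter keys that would have no effect for this kind."""
    applicable = UNIFORM | FIELDS_BY_KIND[kind]
    return sorted(k for k in flt if k not in applicable)


print(len(ALL_FIELDS))  # 17
print(silently_ignored(
    "client", {"fqn_prefix": "a.b", "kind": "client", "module": "m"}
))  # ['fqn_prefix', 'kind']
```

Both keys from the reproduction payload show up here: one is a real field used with the wrong kind, the other is not a filter field at all.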
Frame direction (locked after grilling — 2026-05-14)
After several rounds of pressure-testing strict vs. permissive vs. hybrid framings, the chosen direction is:
§1 Frame: The MCP V2 surface is a typed query language for the code graph. Filters and traversal targets are strict — every input field has one and only one mapping to a stored attribute, and inapplicable inputs fail loud. The search tool's query parameter is the single exception: it accepts opaque natural-language or code text and returns ranked results. Everything else — including search's filter parameter — follows strict-frame rules.
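What this rules out
Smart filters on find / describe / neighbors (no fuzzy match, no alias magic, no cross-kind shortcuts)
[…] search.query (query stays opaque text; no structured-query parsing)
[…] neighbors (EdgeType literal stays closed, dot-keys stay read-only; aligned with PR #89 decision #11)
What this explicitly permits
[…] search.query (already shipped; score field)
[…] search.query (already shipped)
[…] _coerce_filter accepting JSON-encoded strings; already shipped)
Per-kind aligned vocabulary where the concept is genuinely the same (microservice, module are uniform across kinds)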
Why this carve-out earns its keep
search is fundamentally about inexact retrieval — its outputs are ranked by score, not boolean-matched. find / describe / neighbors are about exact structural lookup — the graph either has this node/edge or it doesn't. The carve-out traces the underlying epistemic difference between operation classes: strict where truth is binary, permissive where truth is ranked.
The permissive frame is scoped precisely to search.query and search scoring — not to search.filter, which follows strict-frame rules even when hosted inside search.
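The binary-versus-ranked distinction can be illustrated with a toy example. This is only a sketch: find_exact and search_ranked are hypothetical stand-ins, the name list is made up from identifiers mentioned in this issue, and difflib's similarity ratio stands in for whatever scoring search actually uses.

```python
from difflib import SequenceMatcher

NAMES = ["SmartCareAssignClientImpl", "SmartCareAssignClient", "OperatorRoute"]


def find_exact(name: str) -> list[str]:
    """Strict lookup: a name either matches exactly or it does not."""
    return [n for n in NAMES if n == name]


def search_ranked(query: str) -> list[tuple[str, float]]:
    """Permissive lookup: every candidate gets a score; best score first."""
    scored = [
        (n, SequenceMatcher(None, query.lower(), n.lower()).ratio())
        for n in NAMES
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


exact = find_exact("SmartCareClient")      # no such node: empty result
ranked = search_ranked("SmartCareClient")  # every name, ordered by similarity
print(exact)
print([name for name, _ in ranked])
```

The strict call has nothing to return for a near-miss name, while the ranked call returns every candidate with the closest one first. Mixing the two behaviors in one tool is what the frame rules out.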
resolve as named escape valve (deferred — separate issue)
The strict frame needs an escape valve for legitimate agent intents that don't fit find or search: identifier-shaped lookups that need to disambiguate honestly (one node, N candidates with reasons, or no match).
Common workflows that need this:
Agent has "SmartCareAssignClient" — is that interface, impl, package, or all? Today: search returns ranked text matches; find(fqn_prefix=...) returns N rows with no disambiguation.
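[…] "smartcare" (short) → "operator-api" (canonical). Today: search returns text matches; find(microservice="smartcare") returns zero.
Package vs class vs method: is "ru.sbrf.pprb.chatx.out.assign.client" a package (no node), a class FQN prefix (N nodes), or a wrong identifier? Today: no clean answer.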
The strict-frame answer is a dedicated resolve tool whose entire job is being smart-but-honest about identifier resolution:
def resolve(identifier: str, hint_kind: str | None = None) -> ResolveOutput:
    """Map a human-typed identifier to a canonical node ID.

    Returns exactly one NodeRef if unambiguous, a structured list of
    candidates with disambiguation reasons if ambiguous, or success=False
    with a structured "why" if no match.
    """
resolve is named in #117's frame but designed in its own follow-up issue. Two reasons:
resolve is only useful after the strict frame is locked. Under permissive or lossless-permissive frames, smart filters on find would cover most resolution cases and resolve becomes redundant.
Action: open separate issue "design proposal: resolve tool for identifier-shaped lookups" linked from here, to be drafted only after #117 frame is locked.
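To make the three outcomes in the resolve docstring concrete, here is a toy sketch over an in-memory index. Everything in it is an assumption made for illustration: the Candidate and ResolveOutput fields, the INDEX contents, the node IDs, and the choice to report an ambiguous lookup with success=False. The real shape is left to the follow-up issue.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Candidate:
    node_id: str
    reason: str  # why this node is a plausible match


@dataclass
class ResolveOutput:
    success: bool
    node_id: Optional[str] = None
    candidates: list[Candidate] = field(default_factory=list)
    why: Optional[str] = None


# Hypothetical index: simple name -> list of (node_id, kind).
INDEX = {
    "SmartCareAssignClient": [
        ("com.a.SmartCareAssignClient", "interface"),
        ("com.b.SmartCareAssignClient", "class"),
    ],
    "ChatService": [("com.a.ChatService", "class")],
}


def resolve(identifier: str, hint_kind: Optional[str] = None) -> ResolveOutput:
    hits = INDEX.get(identifier, [])
    if hint_kind is not None:
        hits = [h for h in hits if h[1] == hint_kind]
    if not hits:
        return ResolveOutput(success=False, why=f"no node matches {identifier!r}")
    if len(hits) == 1:
        return ResolveOutput(success=True, node_id=hits[0][0])
    # Assumption: ambiguity is reported as success=False plus candidates.
    cands = [Candidate(node_id=i, reason=f"{k} with this simple name") for i, k in hits]
    return ResolveOutput(success=False, candidates=cands, why="ambiguous identifier")


print(resolve("ChatService").node_id)
print(len(resolve("SmartCareAssignClient").candidates))
print(resolve("SmartCareAssignClient", hint_kind="class").node_id)
```

The point of the shape is that the caller can always tell which of the three cases it is in, without inspecting scores or guessing from row counts.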
Smart behaviors the strict frame rules out, and where they go instead
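The strict frame is ~95% strict. Worth being explicit about what's given up and where each smart behavior is relocated (not removed):

| Smart behavior ruled out | Where it goes instead |
| --- | --- |
| […] target_service="smartcare" matching canonical "operator-api" | resolve (deferred follow-up issue) |
| […] "SmartCareClient", finds SmartCareAssignClientImpl) | search(query=...) for ranked text/vector match, then describe per candidate |
| […] neighbors(edge_types=["DECLARES.DECLARES_CLIENT"])) | […] |
| […] find returning Symbols + Routes + Clients matching a predicate) | […] find calls, one per kind |
| […] fqn_prefix working on both Symbols and Clients) | […] member_fqn_prefix on Client if needed; audit decision) |
| […] search.query (e.g., "microservice:operator-api role:CONTROLLER") | find with structured filter |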
None of these capabilities are removed — they're each given a clean home in the surface. The cost is one extra tool call or one extra agent reasoning step per case. The benefit is that the agent's contract is predictable: same input → same kind of result, every time.
Propose-doc TODOs surfaced during grilling (2026-05-14)
Issue framing is locked. The following questions surfaced during pressure-test are propose-doc-level, not issue-level — they must be answered in propose/MCP-FILTER-FRAME-PROPOSE.md before the propose locks:
Wildcards in structured predicates. Is fqn_prefix="com.x.*Service" strict (a structured LIKE-shaped operator) or smart (rejected under the frame)? Frame as written doesn't answer. Propose must.
FQN-as-identifier in describe. Is describe(fqn="...") a lossless alias for describe(id=...) (FQN ↔ ID is bijective for resolved nodes) or smart resolution (use resolve first)? Propose must answer.
Multi-value field semantics. Does microservice=["a", "b"] mean OR (strict structured predicate)? Likely yes, but propose must lock it.
Negation predicates. exclude_roles exists today. Is negation strict-by-default or a smart behavior? Likely strict, but propose must confirm.
Empty-filter semantics. Does find(kind="client", filter={}) mean "all clients" or "filter required"? Today's behavior is "all clients" — propose must lock or change.
Revisit-trigger tightening. "N legitimate workflows hit fail-loud" is vague. Propose should set N=3 and define "legitimate" as "issue filed with workflow that has no clean search / resolve / multi-call analog under the strict frame."
Transition-window gap acknowledgment. Between strict-frame landing and resolve shipping, identifier-resolution workflows fall back to search + describe-per-candidate. Propose must call this out explicitly so reviewers don't read the frame and ask "wait, how do I look up a name?"
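Two of the questions above are easier to discuss against a concrete reading. The sketch below shows what OR semantics for a multi-value field and the current "all clients" reading of an empty filter would look like if the propose locks them. It is one possible reading, not a decision, and matches and the node dicts are hypothetical.

```python
def matches(node: dict, flt: dict) -> bool:
    """Keys combine with AND; a list value matches if any element does (OR)."""
    for key, wanted in flt.items():
        allowed = wanted if isinstance(wanted, list) else [wanted]
        if node.get(key) not in allowed:
            return False
    return True  # an empty filter matches every node


clients = [
    {"id": "c1", "microservice": "a"},
    {"id": "c2", "microservice": "b"},
    {"id": "c3", "microservice": "c"},
]

either = [c["id"] for c in clients if matches(c, {"microservice": ["a", "b"]})]
everything = [c["id"] for c in clients if matches(c, {})]
print(either)      # ['c1', 'c2']
print(everything)  # ['c1', 'c2', 'c3']
```

Under this reading, both behaviors stay strict in the frame's sense: the same input always maps to the same boolean predicate.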
What still concerns us about this direction
Two real concerns we want pressure-tested in propose-doc form before locking, not waved away:
A. The carve-out is ~95% strict, dressed up
Strict-with-search.query-carve-out is mostly strict. The earlier worry ("strict rules out future smart behaviors") applies here at ~95% strength. The recalibration: the worry was about being-nervous-about-a-big-decision, not about the decision being wrong. The evidence supports strict — the strict parts of the V2 surface (EdgeType, kinds, directions) have zero issues; the permissive parts (NodeFilter, silent-drop) account for all three issues this session.
Still worth budgeting for in the propose: a §8 Risks row + revisit trigger ("if N legitimate agent workflows hit fail-loud with no clean search/resolve analog within 6 months, reopen").
B. Aligned-vocabulary discipline is ongoing work
The hybrid frame's "aligned vocabulary where possible" is the part that scales worst. Today:
microservice, module, source_layer — same meaning across kinds ✓
path_prefix (Route filter) vs. target_path_prefix (Client filter) — intentionally distinct
http_method (Route filter) vs. client_method (Client filter) — intentionally distinct
fqn_prefix (Symbol only today; the bug surface) — needs alignment decision
The propose-doc needs an Appendix doing the field-by-field alignment audit. That's a one-propose-with-multiple-follow-up-PRs job, not a single PR. Worth pricing in before locking, not after.
Migration shape (incremental, V2.x-friendly)
The frame is incremental. Each step is V2.x-shippable independently:
1. extra="forbid" on all filter models: purely additive, loud-fail with great error messages (the error message becomes a teaching surface; the agent learns the contract from its own mistakes).
2. Per-kind field validation in find_v2: applicability check; inapplicable field → structured error with the applicable-fields list.
3. Aligned vocabulary audit + renames (Appendix-driven): most renames are deprecation-alias-friendly. Plausible breaking renames land together if any.
4. resolve as fifth primitive: purely additive, separate propose, separate issue.
5. Removing any smart-filter behaviors on find/describe/neighbors: none exist today to remove.
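The first two migration steps can be sketched with the standard library alone. This is a hedged illustration, not the planned implementation: FilterError, validate_filter, and the deliberately partial APPLICABLE table are hypothetical. In the real models, rejecting unknown keys would presumably come from Pydantic's extra="forbid" rather than a hand-written check.

```python
class FilterError(ValueError):
    """Loud, structured failure that tells the agent how to self-correct."""

    def __init__(self, kind, field_name, reason, applicable):
        self.kind = kind
        self.field_name = field_name
        self.reason = reason
        self.applicable = applicable
        super().__init__(
            f"filter field {field_name!r} is {reason} for kind={kind!r}; "
            f"applicable fields: {', '.join(applicable)}"
        )


# Deliberately partial table, for illustration only.
APPLICABLE = {
    "symbol": {"microservice", "module", "fqn_prefix", "role"},
    "client": {"microservice", "module", "client_kind", "target_service"},
}
KNOWN = set().union(*APPLICABLE.values())


def validate_filter(kind: str, flt: dict) -> dict:
    applicable = sorted(APPLICABLE[kind])
    for name in flt:
        if name not in KNOWN:
            # Unknown key: the hand-rolled analog of extra="forbid".
            raise FilterError(kind, name, "unknown", applicable)
        if name not in APPLICABLE[kind]:
            # Known key, wrong kind: the per-kind applicability check.
            raise FilterError(kind, name, "not applicable", applicable)
    return flt


try:
    validate_filter("client", {"fqn_prefix": "NO.SUCH.PACKAGE.AtAll"})
except FilterError as err:
    print(err)
```

Because the error carries the applicable-fields list, a failed call tells the agent what it may send next, which is the "teaching surface" the first step describes.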
If step 3's audit produces breaking renames that aren't deprecation-alias-friendly, those land together as MCP V3.0. Otherwise the full propose ships as a series of V2.x PRs.
Versioning impact: TBD at lock-time — likely V2.x incremental; V3.0 only if breaking renames force it.
Composability with shipped invariants
This frame builds on, not replaces, PR #89 V2 decisions:
[…] EdgeType literal closed-set; rollup dot-keys read-only (PR #89 decision #11); _coerce_filter JSON-decoding as lossless multi-form input.
[…] #89 left this implicit; this propose makes it strict per-kind); inapplicable-field handling (silent → loud).
[…] resolve as fifth primitive (in a follow-up issue); aligned-vocabulary table (Appendix); revisit-trigger discipline.
Cross-links
#119: Kuzu label(e) IN $list bug. Independent; should land first so error messages and tests under the new frame are based on correct traversal behavior.
[…] extra="forbid" + per-kind validation become rich teaching surfaces; hints add complementary road-sign signals.
Next step
Draft propose/MCP-FILTER-FRAME-PROPOSE.md with the §1 frame above, the §3 surface, the §4 use-case re-walk (15+ UCs), the aligned-vocabulary Appendix, and the revisit-trigger in §8 Risks. Lock decisions only after consistency pass.
Open resolve as its own issue once frame is locked, not before.
References
mcp_v2.py:17–31: EdgeType literal (kept; strict invariant)
mcp_v2.py:49–66: NodeFilter (the bag to refactor)
mcp_v2.py:79–98: _coerce_filter (lossless multi-form input, kept)
mcp_v2.py:321–368: _node_matches_filter (per-kind post-filter; gets per-kind validation)
mcp_v2.py:401–409: search_v2 (filter param follows strict-frame rules)
mcp_v2.py:456–466: find_v2 client branch (where fqn_prefix silently dropped today)
#89: V2 frame propose; this builds on (not replaces) its locked decisions