
find filter contract: silent-drop bug surfaces the larger 'what is the filter contract per kind?' question #117

@HumanBean17


The bug that surfaced this

find(kind="client" | "route", filter={...}) silently ignores symbol-only filter fields (most notably fqn_prefix), returning every node of the requested kind instead of an empty/restricted set.

Reproduced on tests/bank-chat-system fixture:

from kuzu_queries import KuzuGraph
from mcp_v2 import find_v2
g = KuzuGraph("/tmp/repro_kuzu")
out = find_v2(
    kind="client",
    filter={"fqn_prefix": "NO.SUCH.PACKAGE.AtAll", "kind": "client"},
    graph=g,
)
print(len(out.results))   # observed: 2 — every Client in the graph
                          # expected: 0, or a loud error

Two distinct silent-drop pathologies in the same payload:

  1. fqn_prefix is consulted only in the kind="symbol" branch (mcp_v2.py:342). For client/route, neither the Cypher push-down nor the post-filter reads it.
  2. NodeFilter.model_config == {} → Pydantic defaults to extra="ignore". The malformed "kind": "client" inside filter is dropped silently rather than rejected.
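Pathology 2 can be reproduced in isolation. The sketch below assumes Pydantic v2; LenientFilter and StrictFilter are hypothetical stand-ins with a single field, not the real NodeFilter.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class LenientFilter(BaseModel):
    # No model_config set: Pydantic v2 falls back to extra="ignore".
    fqn_prefix: Optional[str] = None


class StrictFilter(BaseModel):
    # extra="forbid" turns unknown keys into a validation error.
    model_config = ConfigDict(extra="forbid")
    fqn_prefix: Optional[str] = None


payload = {"fqn_prefix": "NO.SUCH.PACKAGE.AtAll", "kind": "client"}

# The stray "kind" key disappears without any signal.
lenient = LenientFilter(**payload)
dropped = "kind" not in lenient.model_dump()

# The same payload is rejected once extras are forbidden.
try:
    StrictFilter(**payload)
    rejected = False
    error_types = []
except ValidationError as exc:
    rejected = True
    error_types = [err["type"] for err in exc.errors()]
```

Note that forbidding extras only addresses pathology 2. fqn_prefix is a declared field, so pathology 1 needs the per-kind applicability check as well.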

Why this isn't a single-field fix

fqn_prefix is the most natural predicate to trip over, but the same pathology applies to every symbol-only field sent with a non-symbol kind, and symmetrically:

Field | Belongs to | Silently dropped for kinds
role, exclude_roles, annotation, capability, fqn_prefix, symbol_kind, symbol_kinds | symbol | route, client
http_method, path_prefix, framework | route | symbol, client
client_kind, target_service, target_path_prefix, client_method, source_layer | client | symbol, route

NodeFilter is a 17-field bag that pretends every field applies to every kind. The agent's only signal about what's queryable is "the tool didn't error" — exactly the wrong feedback loop.

The bug is the contract, not the field.

Frame direction (locked after grilling — 2026-05-14)

After several rounds of pressure-testing strict vs. permissive vs. hybrid framings, the chosen direction is:

§1 Frame: The MCP V2 surface is a typed query language for the code graph. Filters and traversal targets are strict — every input field has one and only one mapping to a stored attribute, and inapplicable inputs fail loud. The search tool's query parameter is the single exception: it accepts opaque natural-language or code text and returns ranked results. Everything else — including search's filter parameter — follows strict-frame rules.

What this rules out

What this explicitly permits

  • Fuzzy ranking on search.query (already shipped — score field)
  • Natural-language queries on search.query (already shipped)
  • Lossless multi-form input (e.g., _coerce_filter accepting JSON-encoded strings — already shipped)
  • Per-kind aligned vocabulary where the concept is genuinely the same (microservice, module are uniform across kinds)

Why this carve-out earns its keep

search is fundamentally about inexact retrieval — its outputs are ranked by score, not boolean-matched. find / describe / neighbors are about exact structural lookup — the graph either has this node/edge or it doesn't. The carve-out traces the underlying epistemic difference between operation classes: strict where truth is binary, permissive where truth is ranked.

The permissive frame is scoped precisely to search.query and search scoring — not to search.filter, which follows strict-frame rules even when hosted inside search.

resolve as named escape valve (deferred — separate issue)

The strict frame needs an escape valve for legitimate agent intents that don't fit find or search: identifier-shaped lookups that need to disambiguate honestly (one node, N candidates with reasons, or no match).

Common workflows that need this:

  • Agent has "SmartCareAssignClient" — is that interface, impl, package, or all? Today: search returns ranked text matches; find(fqn_prefix=...) returns N rows with no disambiguation.
  • Service-name resolution: "smartcare" (short) → "operator-api" (canonical). Today: search returns text matches; find(microservice="smartcare") returns zero.
  • Package vs class vs method: "ru.sbrf.pprb.chatx.out.assign.client" — package (no node), class FQN prefix (N nodes), or wrong identifier? Today: no clean answer.

The strict-frame answer is a dedicated resolve tool whose entire job is being smart-but-honest about identifier resolution:

def resolve(identifier: str, hint_kind: str | None = None) -> ResolveOutput:
    """Map a human-typed identifier to a canonical node ID.

    Returns exactly one NodeRef if unambiguous, a structured list of
    candidates with disambiguation reasons if ambiguous, or success=False
    with a structured "why" if no match.
    """

resolve is named in #117's frame but designed in its own follow-up issue. Two reasons:

Action: open separate issue "design proposal: resolve tool for identifier-shaped lookups" linked from here, to be drafted only after #117 frame is locked.
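
For readers who want the three outcomes in the resolve docstring made concrete, here is a toy illustration. It is not a design for the follow-up issue: the ResolveOutput fields, the Candidate type, the resolve_in helper, the in-memory index, and the node IDs are all assumptions made up for this sketch. Treating the ambiguous case as success=False is also an assumption; the docstring does not say.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    node_id: str
    reason: str  # why this node answers to the identifier


@dataclass(frozen=True)
class ResolveOutput:
    success: bool
    node_id: Optional[str] = None
    candidates: Tuple[Candidate, ...] = ()
    why: Optional[str] = None


def resolve_in(index: Dict[str, Dict[str, str]], identifier: str) -> ResolveOutput:
    """Resolve against a toy index of node_id -> {name: reason}."""
    matches = tuple(
        Candidate(node_id, names[identifier])
        for node_id, names in index.items()
        if identifier in names
    )
    if len(matches) == 1:
        return ResolveOutput(success=True, node_id=matches[0].node_id)
    if matches:
        return ResolveOutput(success=False, candidates=matches, why="ambiguous identifier")
    return ResolveOutput(success=False, why="no node answers to this identifier")


# Hypothetical node IDs; the names come from the workflows listed above.
index = {
    "sym:SmartCareAssignClient": {"SmartCareAssignClient": "simple name of the interface"},
    "sym:SmartCareAssignClientImpl": {"SmartCareAssignClient": "implements the named interface"},
    "svc:operator-api": {"operator-api": "canonical name", "smartcare": "short alias"},
}

unique = resolve_in(index, "smartcare")
ambiguous = resolve_in(index, "SmartCareAssignClient")
missing = resolve_in(index, "NO.SUCH.PACKAGE.AtAll")
```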

Smart behaviors the strict frame rules out, and where they go instead

The strict frame is ~95% strict. Worth being explicit about what's given up and where each smart behavior is relocated (not removed):

Smart behavior we'd lose | Where it goes instead
Fuzzy target_service="smartcare" matching canonical "operator-api" | resolve (deferred follow-up issue)
Approximate class-name match (e.g., user types "SmartCareClient", finds SmartCareAssignClientImpl) | search(query=...) for ranked text/vector match, then describe per candidate
Auto-traversal of rollup keys (e.g., neighbors(edge_types=["DECLARES.DECLARES_CLIENT"])) | Multi-call 2-hop walk, with PR #120 hints teaching the pattern
Cross-kind shortcuts (e.g., one find returning Symbols + Routes + Clients matching a predicate) | Multiple find calls, one per kind
Field aliases across kinds (e.g., fqn_prefix working on both Symbols and Clients) | Kind-appropriate field name per kind (e.g., member_fqn_prefix on Client if needed; audit decision)
Structured-query DSL inside search.query (e.g., "microservice:operator-api role:CONTROLLER") | find with structured filter

None of these capabilities are removed — they're each given a clean home in the surface. The cost is one extra tool call or one extra agent reasoning step per case. The benefit is that the agent's contract is predictable: same input → same kind of result, every time.
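
As an illustration of the "multiple find calls, one per kind" row, here is a minimal sketch of the agent-side pattern. find_each_kind and toy_find are hypothetical helpers over an in-memory node list, not the real find_v2; the microservice field is used because it is uniform across kinds, and "chat-api" is an invented service name.

```python
from typing import Callable, Dict, List

# Toy graph: three nodes, one per kind.
NODES = [
    {"kind": "symbol", "id": "s1", "microservice": "operator-api"},
    {"kind": "route", "id": "r1", "microservice": "operator-api"},
    {"kind": "client", "id": "c1", "microservice": "chat-api"},
]


def toy_find(kind: str, flt: Dict[str, str]) -> List[str]:
    """Exact-match lookup: a node matches only if every filter field equals its value."""
    return [
        node["id"]
        for node in NODES
        if node["kind"] == kind and all(node.get(key) == value for key, value in flt.items())
    ]


def find_each_kind(
    find: Callable[[str, Dict[str, str]], List[str]],
    filters_by_kind: Dict[str, Dict[str, str]],
) -> Dict[str, List[str]]:
    """Issue one strict find per kind, each with its own kind-appropriate filter."""
    return {kind: find(kind, flt) for kind, flt in filters_by_kind.items()}


result = find_each_kind(
    toy_find,
    {kind: {"microservice": "operator-api"} for kind in ("symbol", "route", "client")},
)
```

The results stay keyed by kind, so an empty list for one kind is an honest "no match" rather than a silently widened query.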

Propose-doc TODOs surfaced during grilling (2026-05-14)

Issue framing is locked. The following questions, surfaced during the pressure test, are propose-doc-level rather than issue-level; they must be answered in propose/MCP-FILTER-FRAME-PROPOSE.md before the propose locks:

  1. Wildcards in structured predicates. Is fqn_prefix="com.x.*Service" strict (a structured LIKE-shaped operator) or smart (rejected under the frame)? Frame as written doesn't answer. Propose must.
  2. FQN-as-identifier in describe. Is describe(fqn="...") a lossless alias for describe(id=...) (FQN ↔ ID is bijective for resolved nodes) or smart resolution (use resolve first)? Propose must answer.
  3. Multi-value field semantics. Does microservice=["a", "b"] mean OR (strict structured predicate)? Likely yes, but propose must lock it.
  4. Negation predicates. exclude_roles exists today. Is negation strict-by-default or a smart behavior? Likely strict, but propose must confirm.
  5. Empty-filter semantics. Does find(kind="client", filter={}) mean "all clients" or "filter required"? Today's behavior is "all clients" — propose must lock or change.
  6. Revisit-trigger tightening. "N legitimate workflows hit fail-loud" is vague. Propose should set N=3 and define "legitimate" as "issue filed with workflow that has no clean search / resolve / multi-call analog under the strict frame."
  7. Transition-window gap acknowledgment. Between strict-frame landing and resolve shipping, identifier-resolution workflows fall back to search + describe-per-candidate. Propose must call this out explicitly so reviewers don't read the frame and ask "wait, how do I look up a name?"
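
For TODO 3, the "likely yes" reading can be stated in a few lines. This is only a sketch of what OR semantics would mean if the propose locks it; field_matches is a hypothetical helper, and treating an empty list as "matches nothing" is an assumption the propose would also need to confirm.

```python
from typing import Sequence, Union


def field_matches(stored: str, wanted: Union[str, Sequence[str]]) -> bool:
    """Exact match for a scalar; OR over exact matches for a list of values."""
    if isinstance(wanted, str):
        return stored == wanted
    # A list means "any of these"; an empty list therefore matches nothing.
    return stored in wanted


in_list = field_matches("a", ["a", "b"])
not_in_list = field_matches("c", ["a", "b"])
scalar = field_matches("a", "a")
empty = field_matches("a", [])
```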

What still concerns us about this direction

Two real concerns we want pressure-tested in propose-doc form before locking, not waved away:

A. The carve-out is ~95% strict, dressed up

Strict-with-search.query-carve-out is mostly strict. The earlier worry ("strict rules out future smart behaviors") applies here at ~95% strength. The recalibration: the worry was about being-nervous-about-a-big-decision, not about the decision being wrong. The evidence supports strict — the strict parts of the V2 surface (EdgeType, kinds, directions) have zero issues; the permissive parts (NodeFilter, silent-drop) account for all three issues this session.

Still worth budgeting for in the propose: a §8 Risks row + revisit trigger ("if N legitimate agent workflows hit fail-loud with no clean search/resolve analog within 6 months, reopen").

B. Aligned-vocabulary discipline is ongoing work

The hybrid frame's "aligned vocabulary where possible" is the part that scales worst. Today:

  • microservice, module, source_layer — same meaning across kinds ✓
  • path_prefix (Route filter) vs. target_path_prefix (Client filter) — intentionally distinct
  • http_method (Route filter) vs. client_method (Client filter) — intentionally distinct
  • fqn_prefix (Symbol only today; the bug surface) — needs alignment decision

The propose-doc needs an Appendix doing the field-by-field alignment audit. That's a one-propose-with-multiple-follow-up-PRs job, not a single PR. Worth pricing in before locking, not after.

Migration shape (incremental, V2.x-friendly)

The frame is incremental. Each step is V2.x-shippable independently:

  1. extra="forbid" on all filter models — purely additive, loud-fail with great error messages (the error message becomes a teaching surface; the agent learns the contract from its own mistakes).
  2. Per-kind field validation in find_v2 — applicability check; inapplicable field → structured error with the applicable-fields list.
  3. Aligned vocabulary audit + renames (Appendix-driven) — most renames are deprecation-alias-friendly. Plausible breaking renames land together if any.
  4. resolve as fifth primitive — purely additive, separate propose, separate issue.
  5. Removing any smart-filter behaviors on find/describe/neighbors — none exist today to remove.
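
Step 2 could look roughly like the sketch below. The applicability sets are copied from the field table in this issue plus the two uniform fields; check_filter and InapplicableFilterField are hypothetical names, and the real check would live inside find_v2.

```python
from typing import Dict, List, Set

UNIFORM: Set[str] = {"microservice", "module"}

# source_layer sits under client here, following the field table;
# the alignment audit may move it to the uniform set.
APPLICABLE: Dict[str, Set[str]] = {
    "symbol": UNIFORM | {
        "role", "exclude_roles", "annotation", "capability",
        "fqn_prefix", "symbol_kind", "symbol_kinds",
    },
    "route": UNIFORM | {"http_method", "path_prefix", "framework"},
    "client": UNIFORM | {
        "client_kind", "target_service", "target_path_prefix",
        "client_method", "source_layer",
    },
}


class InapplicableFilterField(ValueError):
    """Structured error: carries the rejected fields and the applicable list."""

    def __init__(self, kind: str, rejected: List[str], applicable: List[str]) -> None:
        self.kind = kind
        self.rejected = rejected
        self.applicable = applicable
        super().__init__(
            f"filter fields {rejected} do not apply to kind={kind!r}; "
            f"applicable fields: {applicable}"
        )


def check_filter(kind: str, flt: Dict[str, object]) -> None:
    """Raise if any filter field is not applicable to the requested kind."""
    allowed = APPLICABLE[kind]
    rejected = sorted(set(flt) - allowed)
    if rejected:
        raise InapplicableFilterField(kind, rejected, sorted(allowed))


check_filter("client", {"target_service": "operator-api"})  # passes silently

try:
    check_filter("client", {"fqn_prefix": "NO.SUCH.PACKAGE.AtAll"})
    caught = None
except InapplicableFilterField as exc:
    caught = exc
```

Carrying the applicable-fields list on the exception is what lets the error message double as the teaching surface described in step 1.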

If step 3's audit produces breaking renames that aren't deprecation-alias-friendly, those land together as MCP V3.0. Otherwise the full propose ships as a series of V2.x PRs.

Versioning impact: TBD at lock-time — likely V2.x incremental; V3.0 only if breaking renames force it.

Composability with shipped invariants

This frame builds on, not replaces, PR #89 V2 decisions:

Cross-links

Next step

Draft propose/MCP-FILTER-FRAME-PROPOSE.md with the §1 frame above, the §3 surface, the §4 use-case re-walk (15+ UCs), the aligned-vocabulary Appendix, and the revisit-trigger in §8 Risks. Lock decisions only after consistency pass.

Open resolve as its own issue once frame is locked, not before.

References

  • mcp_v2.py:17–31: EdgeType literal (kept; strict invariant)
  • mcp_v2.py:49–66: NodeFilter (the bag to refactor)
  • mcp_v2.py:79–98: _coerce_filter (lossless multi-form input, kept)
  • mcp_v2.py:321–368: _node_matches_filter (per-kind post-filter; gets per-kind validation)
  • mcp_v2.py:401–409: search_v2 (filter param follows strict-frame rules)
  • mcp_v2.py:456–466: find_v2 client branch (where fqn_prefix is silently dropped today)
  • PR propose: synthetic (via members) rollup keys in describe.edge_summary (clients + routes) #89 — V2 frame propose; this builds on (not replaces) its locked decisions
