propose: composed override-axis rollup keys in describe.edge_summary (dispatch chasm) #90

Merged
HumanBean17 merged 2 commits into master from propose/describe-override-rollup on May 13, 2026

@HumanBean17 HumanBean17 commented May 12, 2026

TL;DR

When describe is called on a method Symbol, edge_summary adds composed rollup keys exposing the override relationship + any brownfield signal hidden behind it:

  • on an abstract/interface method: OVERRIDDEN_BY, OVERRIDDEN_BY.DECLARES_CLIENT, OVERRIDDEN_BY.EXPOSES
  • on a concrete override: OVERRIDES

Computed at describe-time via Cypher (IMPLEMENTS/EXTENDS class-level walk + signature column match). No schema change, no OVERRIDES edge table, no indexing pass.
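
The walk described above can be pictured with a small in-memory model. This is a minimal sketch, not the PR's helper: plain dicts stand in for the Kuzu graph, the names `Method`, `SUPERTYPES`, `DECLARES` and `override_rollup` are hypothetical, the sample classes are invented, and the rollup values are plain counts rather than the in/out pairs the real edge_summary uses.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Method:
    owner: str             # declaring class (hypothetical sample data)
    signature: str         # "name(T1,T2)", the format the signature column stores
    out_edges: tuple = ()  # stored brownfield edges leaving this method

decl = Method("AssignClient", "openChat(String)")
local = Method("LocalAssignClient", "openChat(String)", ("DECLARES_CLIENT",))
remote = Method("RemoteAssignClient", "openChat(String)")
other = Method("LocalAssignClient", "close()")

# Hypothetical stand-ins for the stored IMPLEMENTS/EXTENDS and DECLARES edges.
SUPERTYPES = {"LocalAssignClient": ["AssignClient"],
              "RemoteAssignClient": ["AssignClient"]}
DECLARES = {"AssignClient": [decl],
            "LocalAssignClient": [local, other],
            "RemoteAssignClient": [remote]}

def override_rollup(m: Method) -> dict:
    """Declaration-side composed keys; zero-count keys are omitted."""
    # One class-level hop down, then match on signature string equality.
    overrides = [
        cand
        for cls, supers in SUPERTYPES.items() if m.owner in supers
        for cand in DECLARES.get(cls, [])
        if cand.signature == m.signature
    ]
    summary = {}
    if overrides:
        summary["OVERRIDDEN_BY"] = len(overrides)
    for edge in ("DECLARES_CLIENT", "EXPOSES"):
        count = sum(o.out_edges.count(edge) for o in overrides)
        if count:
            summary[f"OVERRIDDEN_BY.{edge}"] = count
    return summary

print(override_rollup(decl))
```

Nothing is written back to the graph in this model, which is the property the "no schema change, no indexing pass" claim relies on.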

Why

Agent walks Foo.process -> CALLS -> AssignClient.openChat (the interface method). edge_summary shows CALLS + DECLARES only. The concrete LocalAssignClient.openChat that carries @CodebaseClient is reachable only via 5 hops with a name+arity join in the middle. No agent guesses that walk. The dispatch chasm is invisible by design: same affordance bug as PR-89, different axis.

Symmetric to PR-89

| | PR-89 (this is the precedent) | This propose |
|---|---|---|
| Axis | Containment: class loses signal that lives on its members | Dispatch: interface method loses signal that lives on its concrete overrides |
| Naming | DECLARES.<projected> (dot notation) | OVERRIDDEN_BY / OVERRIDES + OVERRIDDEN_BY.<projected> (dot notation) |
| Trigger | Type Symbol (class, interface, enum, record, annotation) | Method Symbol (method, constructor) |
| Walks | DECLARES (one stored hop) | IMPLEMENTS/EXTENDS + signature match (virtual hop) |
| Schema impact | none | none |

Both rollups only ever appear on different node kinds — they never collide in the same edge_summary.
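
The non-collision claim follows from the two trigger sets being disjoint. A minimal sketch of that dispatch, assuming the `kind` column holds the lowercase strings listed in the Trigger row (the exact stored values and the function name `rollup_family` are assumptions):

```python
from typing import Optional

# Kind values taken from the Trigger row; their exact spelling in the
# kind column is an assumption.
TYPE_KINDS = {"class", "interface", "enum", "record", "annotation"}
METHOD_KINDS = {"method", "constructor"}

def rollup_family(kind: str) -> Optional[str]:
    """Return which composed-key family a node of this kind can receive."""
    if kind in TYPE_KINDS:
        return "DECLARES.<projected>"          # PR-89, containment axis
    if kind in METHOD_KINDS:
        return "OVERRIDDEN_BY / OVERRIDES"     # this propose, dispatch axis
    return None                                # other kinds get no rollup

print(rollup_family("interface"), "|", rollup_family("method"))
```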

Scope

  • 1 PR. ~60 LoC behind _edge_summary_for_node + helper, 5 new tests.
  • Read-path Cypher only.
  • Both dispatch directions (declaration↔impl); one hop each side.
  • Signature match: signature column equality (mirrors _lookup_method_candidates's name+arity semantics).
  • Composed keys can't be passed to neighbors(edge_types=...) — Pydantic rejects OVERRIDDEN_BY / OVERRIDES as invalid EdgeType literals.
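
The last bullet can be illustrated with a stdlib stand-in for the Pydantic check. This sketch does not use Pydantic itself; the `EdgeType` members below are a hypothetical subset, and `invalid_edge_types` is an invented helper name.

```python
from typing import Literal, get_args

# Hypothetical subset of stored edge types; the real EdgeType literal
# lives in the codebase and may differ.
EdgeType = Literal["CALLS", "DECLARES", "IMPLEMENTS", "EXTENDS",
                   "DECLARES_CLIENT", "EXPOSES"]

def invalid_edge_types(requested):
    """Return the requested names that are not stored edge types."""
    allowed = set(get_args(EdgeType))
    return [name for name in requested if name not in allowed]

# Composed rollup keys are read-only hints in edge_summary, so they are
# reported as invalid when passed as traversal filters.
print(invalid_edge_types(["CALLS", "OVERRIDDEN_BY", "OVERRIDDEN_BY.EXPOSES"]))
```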

Out of scope

  • Persisted OVERRIDES / OVERRIDDEN_BY edge table.
  • Rewriting CALLS to also point at overrides (would falsify call-site resolution).
  • Transitive override chains (agent recurses explicitly; see UC13).
  • OVERRIDDEN_BY.HTTP_CALLS / OVERRIDDEN_BY.ASYNC_CALLS — separate propose if surfaced (symmetric to PR-89's DECLARES.HTTP_CALLS deferral).
  • Confidence/strategy filtering on rollup counts.

Sections

  • §1 Frame — describe must tell the agent that interesting signal lives one dispatch-hop away; scope translation framing (aligned with PR-89 v2.1)
  • §2 Principles — 8 binding rules (composed-only, method-Symbols-only, two vantage points, dot-notation convention, direction, signature-column matching, one hop only, no double-count with PR-89)
  • §3 Surface — before/after edge_summary on declaration side and impl side, Cypher sketch using signature column, AGENT-GUIDE addition
  • §4 Use-case re-walk — 16 UCs (originating case, route-side symmetry, diamond impls, static interface methods, sealed, generics, pathological 50-impl interface)
  • §5 What this deliberately does NOT do — 12-row table
  • §6 Migration plan — 1 PR, 5 tests enumerated
  • §7 Decisions taken — 18 locked
  • §8 Risks — 8 rows
  • Appendix A — override_axis_rollup_for skeleton + integration sketch that composes cleanly with PR-89's helper
  • Appendix B — v1→v2 traceability (six defects from code-grounded review)

Verdict on the originating question

This is a bug of the same family as PR-89: describe is too literal. The graph correctly records that @CodebaseClient is on the override (by deliberate operator design — each method represents one specific outbound call). But describe stops at the node's literal edges and doesn't translate the operator's architectural intent into an agent affordance at the declaration's grain. The override-axis rollup fixes the view without modifying the graph.

Together with PR-89, every brownfield signal hidden by Java's attachment patterns (member-of-class, override-of-interface) becomes one explicit walk-step away from the natural describe target.

Surface dispatch-axis affordances on method describe so the
agent walking 'Foo.process -> CALLS -> AssignClient.openChat'
(interface method) gets a breadcrumb to the concrete override
that carries @CodebaseClient, instead of dead-ending.

Synthetic rollup, no schema change. Two vantage points:
  - declaration side: OVERRIDDEN_BY (via signature) +
    DECLARES_CLIENT/EXPOSES (via overrides)
  - implementation side: OVERRIDES (via signature)

Signature match uses name + arity + ordered erased
param-type-fqns (mirrors _lookup_method_candidates).
One IMPLEMENTS/EXTENDS hop only; agent recurses explicitly.

Symmetric to PR-89's class-level (via members) rollup
(containment axis); this fixes the dispatch axis.

16 use cases, 17 locked decisions, 8 principles, 1 PR.
@HumanBean17 HumanBean17 marked this pull request as ready for review May 13, 2026 07:44

Review — PR #90 needs a v2 pass aligned with PR #89's grilling

PR #89 went through three iterations (v1 → v2 → v2.1) during yesterday's review. PR #90 was written concurrently with v1 and hasn't absorbed any of the changes. Six issues, in decreasing severity.


1. Phantom Kuzu columns — the Cypher sketches reference param_count and param_type_fqns which don't exist

This is the most serious defect. The Cypher in §3.3 and Appendix A queries:

WHERE mover.name = m.name
  AND mover.param_count = m.param_count
  AND mover.param_type_fqns = m.param_type_fqns

The actual Symbol schema (build_ast_graph.py:2034-2043) is:

_SCHEMA_NODE = (
    "CREATE NODE TABLE Symbol("
    "id STRING PRIMARY KEY, "
    "kind STRING, name STRING, fqn STRING, package STRING, "
    "module STRING, microservice STRING, "
    "filename STRING, start_line INT64, end_line INT64, "
    "start_byte INT64, end_byte INT64, "
    "modifiers STRING[], annotations STRING[], capabilities STRING[], "
    "role STRING, signature STRING, parent_id STRING, resolved BOOLEAN"
    ")"
)

There is no param_count column. There is no param_type_fqns column. These are phantom fields — the Cypher would fail at runtime.

The signature column (stores "name(T1,T2)") could serve as a proxy for same-method matching: mover.signature = m.signature. But that has different semantics than what the propose describes — it's a string equality check on a formatted signature, not a structured name+arity+param-type comparison.

This also invalidates principle #6 ("Signature-match, not name-only") and decision #8 ("simple name + arity + ordered erased param-type-fqns"). The propose describes matching semantics that can't be implemented against the current schema without either (a) adding columns (contradicting "no schema change") or (b) doing the match in Python after fetching candidates (which changes the Cypher sketch fundamentally).
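
To make the semantic gap concrete, here is a minimal sketch of what option (b) could look like: parse the "name(T1,T2)" string in Python and compare name + arity, next to plain string equality. The parser and the helper names are hypothetical, and the handling of generic arguments is a defensive assumption about what the signature column may contain.

```python
def parse_signature(sig):
    """Split "name(T1,T2)" into (name, [T1, T2]); commas inside <...> are kept."""
    name, _, rest = sig.partition("(")
    params = rest[:-1] if rest.endswith(")") else rest
    parts, depth, cur = [], 0, []
    for ch in params:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(cur).strip())
            cur = []
        else:
            cur.append(ch)
    tail = "".join(cur).strip()
    if tail:
        parts.append(tail)
    return name, parts

def same_name_and_arity(a, b):
    """The looser match: simple name and parameter count only."""
    (name_a, params_a), (name_b, params_b) = parse_signature(a), parse_signature(b)
    return name_a == name_b and len(params_a) == len(params_b)

# Overloads with equal arity: name+arity matches, string equality does not.
print(same_name_and_arity("openChat(String)", "openChat(Long)"))
print("openChat(String)" == "openChat(Long)")
```

String equality on the column is therefore the stricter of the two checks: it separates same-arity overloads that a name+arity comparison would conflate.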

2. _lookup_method_candidates mismatch — the propose overclaims the match precision

Principle #6 says the signature match mirrors _lookup_method_candidates's existing semantics. Decision #8 says: "simple name + arity + ordered erased param-type-fqns."

The actual function (build_ast_graph.py:776-835) matches on name + arity only: m.decl.name == callee_simple and len(m.decl.parameters) == arg_count. There is no param-type-FQN comparison anywhere in _lookup_method_candidates. The propose invents a matching precision that neither the function nor the schema supports.

This matters for §8 risk assessment — the propose identifies "signature-match diverges from what JVM dispatch actually does" as a risk and claims the mitigation is "mirror _lookup_method_candidates's existing semantics (already battle-tested)." But the function is less precise than claimed, and mapping even that precision to Cypher requires columns that don't exist.

3. Naming convention is stale — must align with PR #89 v2's dot notation

PR #89 moved from (via members) to dot notation (DECLARES.DECLARES_CLIENT, DECLARES.EXPOSES). PR #90 still uses the v1-era parens-suffix throughout:

  • OVERRIDDEN_BY (via signature)
  • OVERRIDES (via signature)
  • DECLARES_CLIENT (via overrides)
  • EXPOSES (via overrides)

The symmetry table in the PR body explicitly says PR-89 uses (via members) — stale. Principle #4, principle #8, decision #5, and at least 15 other references cite the superseded convention.

The structural question: PR #89's dot notation DECLARES.DECLARES_CLIENT names two relations being composed (stored first hop + stored second hop). The override axis is different — OVERRIDES is a virtual relation computed from class hierarchy + signature match, not a stored edge. So the dot convention doesn't directly port.

Two options I see:

(a) Dot notation with virtual parent: OVERRIDDEN_BY.DECLARES_CLIENT, OVERRIDDEN_BY.EXPOSES, standalone OVERRIDDEN_BY and OVERRIDES. The dot still communicates "walk this relation, count this projected relation at the second hop." The fact that the parent is computed rather than stored doesn't change the agent's reading.

(b) Keep parens-suffix but switch to (rollup): consistent with the "minimal fix" from PR #89's review comment, but diverges from PR #89's final choice.

I lean (a) — the dot convention should be universal for composed keys. OVERRIDDEN_BY and OVERRIDES without a suffix look like real edge types, but they aren't in the schema and Pydantic rejects them, so the ambiguity is theoretical.

4. §1 frame overstatement — same defect as PR #89 v2.1 defect 1

PR #89 v2.1 fixed the frame from "Java forces it" to "scope translation." PR #90's §1 still says:

"the graph correctly records what Java records: @CodebaseClient is on the override, not on the declaration. Dispatch is runtime; the graph can't promise it."

This has the same problem. @CodebaseClient on the override is by deliberate operator design — each concrete method represents one specific outbound call with its own target/path/client_kind. The granularity is meaningful, not a Java-syntax workaround. The frame should be: the override-method-grain truth is correct; the agent's interface-method-grain question deserves an interface-method-grain answer about what's one dispatch hop away. Scope translation, not recovery of hidden intent.

5. is_static flag doesn't exist on Symbol nodes

Decision #11 says: "filter via is_static flag if present on the AST; else accept that statics may produce false positives."

There is no is_static column on Symbol. The is_static_call flag exists only on in-memory CallSite objects during graph build. modifiers STRING[] on Symbol does store "static" as a modifier — so the filter would need to check "static" IN m.modifiers, not query a non-existent is_static boolean. Decision #11 should name the actual mechanism.
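
A minimal sketch of the modifiers-based filter, applied in Python to candidate rows after they are fetched. The row shape (dicts with `fqn` and `modifiers` keys) and the function name `drop_static` are assumptions for illustration; only the "static" entry in modifiers comes from the schema above.

```python
def drop_static(candidates):
    """Keep only candidates whose modifiers list does not contain "static"."""
    return [
        row for row in candidates
        if "static" not in (row.get("modifiers") or [])
    ]

# Hypothetical candidate rows shaped like Symbol records.
rows = [
    {"fqn": "LocalAssignClient.openChat", "modifiers": ["public"]},
    {"fqn": "AssignClients.openChat", "modifiers": ["public", "static"]},
    {"fqn": "RemoteAssignClient.openChat", "modifiers": None},
]
print([row["fqn"] for row in drop_static(rows)])
```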

6. §3.1 example contradicts the omission rule

The text says: "EXPOSES (via overrides) is omitted (count 0); the schema-omission rule from edge_counts_for applies." But the JSON example immediately above shows:

"EXPOSES (via overrides)": {"in": 0, "out": 0}

Either the key is omitted (text claim) or it's present with zeros (example claim). Pick one and make them consistent. Given principle #7 says "when count is 0, the key is omitted entirely," the example should drop the line.
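
The omission rule is small enough to state as code. A minimal sketch, assuming each edge_summary value is an {"in": n, "out": n} pair as in the example above; `prune_zero_counts` is a hypothetical name and the counts are invented sample data.

```python
def prune_zero_counts(edge_summary):
    """Drop keys whose in and out counts are both zero."""
    return {
        key: counts
        for key, counts in edge_summary.items()
        if counts.get("in", 0) or counts.get("out", 0)
    }

# Invented sample data, not output from the real describe call.
raw = {
    "CALLS": {"in": 3, "out": 0},
    "DECLARES_CLIENT (via overrides)": {"in": 0, "out": 1},
    "EXPOSES (via overrides)": {"in": 0, "out": 0},
}
print(prune_zero_counts(raw))
```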


What I'd ask for

A v2 pass that:

  1. Code-grounds the Cypher against the actual Symbol schema. Either use signature string matching, or acknowledge the plan must add columns, or restructure the helper to fetch + filter in Python.
  2. Aligns naming with PR #89 v2's dot notation. Decide whether the override-axis virtual parent gets dot syntax or a different convention, and state the reasoning.
  3. Reframes §1 as scope translation (same fix as PR #89 v2.1 defect 1).
  4. Fixes decision #11 to reference modifiers STRING[] containing "static", not a phantom is_static flag.
  5. Fixes the §3.1 example to match the omission rule.
  6. Corrects the _lookup_method_candidates claim — the function does name+arity, not name+arity+param_type_fqns.

Items 1 and 2 are the structural ones; the rest are surgical.

@HumanBean17 HumanBean17 changed the title from "propose: synthetic override-axis rollup keys in describe.edge_summary (dispatch chasm)" to "propose: composed override-axis rollup keys in describe.edge_summary (dispatch chasm)" May 13, 2026
…n frame

Six defects fixed from code-grounded review (aligned with PR-89 v2.1):

1. phantom Kuzu columns (param_count, param_type_fqns) → signature equality
2. _lookup_method_candidates overclaim → name+arity via signature column
3. stale (via signature)/(via overrides) naming → dot notation
4. §1 frame overstatement → scope translation
5. phantom is_static flag → "static" IN modifiers
6. §3.1 example contradicted omission rule → zero-count line removed

Co-authored-by: Cursor <cursoragent@cursor.com>

v2 amendment — six defects from code-grounded review (force-pushed 4e5fe38 → caa115f)

A review pass reading v1 against the actual code in build_ast_graph.py, ast_java.py, and mcp_v2.py — applying the same pressure-test methodology that drove PR-89 through v2/v2.1 — surfaced six defects. All fixed in v2.

Defect 1 (critical): Phantom Kuzu columns

v1's Cypher sketches queried m.param_count and m.param_type_fqns. Neither column exists on the Symbol table (build_ast_graph.py:2034-2043). The schema stores signature STRING (format "name(T1,T2)") as the single source of truth for method identity at the Kuzu level.

v2 rewrites all Cypher to use mover.signature = m.signature. This also fixes Appendix A's code skeleton.

Defect 2: _lookup_method_candidates match precision overclaimed

v1 principle #6 and decision #8 claimed "simple name + arity + ordered erased param-type-fqns", citing _lookup_method_candidates as precedent. The actual function (build_ast_graph.py:776-835) matches on name + arity only — there is no param-type-FQN comparison.

v2 corrects: signature equality provides name+arity+simple-type-name semantics, consistent with (and slightly more precise than) the existing function.

Defect 3: Naming convention stale

v1 used parens-suffix naming ((via signature), (via overrides)) matching PR-89's v1 convention, which PR-89 v2 replaced with dot notation.

v2 aligns with PR-89 v2's dot convention:

  • Standalone keys: OVERRIDDEN_BY, OVERRIDES (virtual dispatch relation)
  • Composed keys: OVERRIDDEN_BY.DECLARES_CLIENT, OVERRIDDEN_BY.EXPOSES (brownfield projection through overrides)

The DECLARES.<projected> family (PR-89) composes through a stored parent edge; the OVERRIDDEN_BY.<projected> family composes through a virtual parent relation (computed from class hierarchy + signature match). Both share the dot convention; the distinction is documented.

Defect 4: §1 frame overstated Java's role

Same overstatement PR-89 v2.1 fixed. @CodebaseClient on the override is by deliberate operator design, not because Java forces it. v2 reframes as scope translation: the override-method-grain truth is correct; the agent's interface-method-grain question deserves an interface-method-grain answer.

Defect 5: is_static flag phantom

v1 decision #11 referenced an is_static flag on Symbol nodes. No such column exists. The modifiers STRING[] column stores "static" when applicable. v2 corrects: filter by "static" IN m.modifiers.

Defect 6: §3.1 example contradicted omission rule

v1's JSON example showed "EXPOSES (via overrides)": {"in": 0, "out": 0} while the text said it should be omitted when zero. v2 removes the zero-count line.

Consistency pass

Full v1→v2 traceability in Appendix B.

@HumanBean17 HumanBean17 merged commit 1cb8bb7 into master May 13, 2026
@HumanBean17 HumanBean17 deleted the propose/describe-override-rollup branch May 13, 2026 09:08