Skip to content

PR-E3: intra-JVM CALLS invariant guard (item 8)#22

Merged
HumanBean17 merged 2 commits into
masterfrom
cursor/pr-e3-calls-invariant-guard
May 5, 2026
Merged

PR-E3: intra-JVM CALLS invariant guard (item 8)#22
HumanBean17 merged 2 commits into
masterfrom
cursor/pr-e3-calls-invariant-guard

Conversation

@HumanBean17
Copy link
Copy Markdown
Owner

@HumanBean17 HumanBean17 commented May 5, 2026

Scope

  • Add pass3 observability + a guard that prevents spurious cross-microservice CALLS edges when the candidate set includes an in-caller-microservice resolution (FQN collision / overload-ambiguous duplication case).
  • Surface pass3_skipped_cross_service on GraphMeta and in KuzuGraph.meta() with backwards-compatible query fallbacks.
  • Add tests/fixtures/fqn_collision_smoke + tests/test_call_invariant.py.

Plumbing note (plan option a)

pass3_calls writes tables.pass3_skipped_cross_service at the end of pass3; _write_meta reads it for the GraphMeta insert.

Intentional divergence from the literal per-edge guard

A naive caller_ms != callee_ms filter breaks legitimate monorepo builds where a type is attributed to a different microservice than its caller (still same JVM / shared module). The shipped behavior only drops cross-ms candidates when at least one same-ms candidate exists; otherwise behavior is unchanged.

Manual evidence

Collision fixture

rm -rf /tmp/check_e3 && python build_ast_graph.py \
  --source-root tests/fixtures/fqn_collision_smoke \
  --kuzu-path /tmp/check_e3 --verbose 2>&1 | grep -E "Call resolution|skipped cross"

Observed:

[pass3] Call resolution: 2 sites, 1 chained phantoms (50.0%), 0 unresolved callee (0.0%), 0 phantom receiver (0.0%), 1 skipped cross-service, strategies: {'chained_receiver': 1, 'constructor': 1}

Bank chat system (zero drift on skip counter + CALLS totals)

rm -rf /tmp/check_e3_bcs && python build_ast_graph.py \
  --source-root tests/bank-chat-system \
  --kuzu-path /tmp/check_e3_bcs --verbose 2>&1 | grep -E "Call resolution"

Observed:

[pass3] Call resolution: 800 sites, 77 chained phantoms (9.6%), 294 unresolved callee (36.8%), 138 phantom receiver (17.2%), 0 skipped cross-service, strategies: {'import_map': 506, 'phantom': 138, 'implicit_super': 38, 'chained_receiver': 77, 'constructor': 20, 'this_super': 15, 'method_reference': 2, 'unique_type_name': 4}

CALLS totals (Kuzu MATCH ()-[c:CALLS]->() RETURN count(c)): 793 on master (abdd00d) and 793 on this branch (same tests/bank-chat-system build).

Follow-ups / known gaps

  • Pass3’s Call resolution: … sites summary can disagree with final persisted CALLS counts (e.g. 800 sites vs 793 CALLS edges on tests/bank-chat-system). This mismatch is pre-existing on master; tracked in #23.

Tests

python -m pytest tests --ignore=tests/test_indexer.py -q266 passed, 4 skipped

Plan tracking: plans/PLAN-POST-TIER1B-FOLLOWUPS.md item 8 marked shipped for PR-E3 (#22).

Co-authored-by: Cursor <cursoragent@cursor.com>
@HumanBean17 HumanBean17 force-pushed the cursor/pr-e3-calls-invariant-guard branch from 1bc5d30 to f66abc5 Compare May 5, 2026 16:25
@HumanBean17
Copy link
Copy Markdown
Owner Author

Review: PR-E3 — Intra-JVM CALLS invariant guard (item 8)

Verdict: Approved ✅

Clean, surgical, and ships a smarter guard than the literal plan called for. The plan said "drop any cross-microservice candidate"; the implementation drops cross-ms candidates only when at least one same-ms candidate exists, preserving the legal "shared-module method called from another microservice" case the plan didn't account for. All three plan-mandated risk checks pass with byte-identical hashes on bank-chat-system. Tiered KuzuGraph.meta() ladder is now 4-deep (pr_e3pre_e3pr_a2legacy) with correct backwards compatibility.

Scope discipline (out-of-scope checks)

Sentinel Status
ONTOLOGY_VERSION bump ✅ 0 hits in +/- lines (the one match is an unchanged context line in the meta INSERT)
SCHEMA_VERSION ✅ 0 hits
DROP TABLE.*Calls / CREATE REL TABLE.*Calls ✅ 0 hits
@mcp.tool (no new MCP tools) ✅ 0 hits
graph_enrich.py modifications ✅ untouched (the 1 mention is in the plan file, item-7 historical row)
PR-E2 sentinel preserved grep -cE "annotation.*spel.*constant_ref" graph_enrich.py still returns 1
File whitelist ✅ exactly the 9 expected files (build_ast_graph.py, kuzu_queries.py, plan, fixture, test)

Plan compliance (vs plans/PLAN-POST-TIER1B-FOLLOWUPS.md § PR-E3)

# Step from plan Verified
1 New skipped_cross_service field on CallResolutionStats build_ast_graph.py:148, placed after callee_unresolved as specified
2 Guard in _resolve_and_emit_call candidate-emit branches ⚠️ shipped as a pre-filter at build_ast_graph.py:1074-1087 instead of per-edge guards in the single-candidate / overload-ambiguous branches. See "Intentional divergence" below — this is a deliberate, justified design choice.
3 Verbose log line mentions counter build_ast_graph.py:1193 appends , {N} skipped cross-service before the strategies dict
4 GraphMeta plumbing — option (a) tables.pass3_skipped_cross_service ✅ field at build_ast_graph.py:256, written at line 1188, read into INSERT at lines 2185 + 2210
5 Tiered KuzuGraph.meta() read ✅ new _META_PR_E3 is now first tier; existing _META_FULL becomes _META_PRE_E3; PR_A2 + LEGACY tiers preserved (kuzu_queries.py:331-400); pass3_skipped_cross_service defaults to 0 in older-graph branches
6 New fqn_collision_smoke fixture (svc-x + svc-y, same FQN) tests/fixtures/fqn_collision_smoke/{svc-x,svc-y}/ with pom.xml markers, SharedDto.java in both, Caller.java in svc-x
7 Plan tracking — item 8 marked shipped ✅ table row `8

Tests

266 passed, 4 skipped in 56.94s

Exactly the DoD target (263 baseline + 3 new tests). Focused run: pytest tests/test_call_invariant.py -v → 3 passed.

Manual evidence reproduced

Collision fixture — guard fires exactly once:

[pass3] Call resolution: 2 sites, 1 chained phantoms (50.0%), 0 unresolved callee (0.0%),
        0 phantom receiver (0.0%), 1 skipped cross-service,
        strategies: {'chained_receiver': 1, 'constructor': 1}
[guard] skipping cross-microservice CALLS edge com.example.Caller#run() ->
        com.example.SharedDto#<init>() (caller=svc-x, callee=svc-y)

DB inspection: 2 CALLS edges, both with b.microservice == 'svc-x'. The svc-y SharedDto#<init> is NOT in the call graph. g.meta()['pass3_skipped_cross_service'] == 1. Cross-service CALLS query returns 0.

Bank-chat-system parity check (plan-mandated):

MASTER: CALLS_total=793 hash=2449a2a693cea96b
PR-E3 : CALLS_total=793 hash=2449a2a693cea96b

Byte-identical hashes over (a.fqn, b.fqn, c.strategy, c.confidence, c.resolved) tuples. Same per-strategy distribution: import_map: 500, phantom: 137, chained_receiver: 77, implicit_super: 38, constructor: 20, this_super: 15, unique_type_name: 4, method_reference: 2. Guard is provably inert on real-world fixtures.

Notes that earned my trust

  • The intentional divergence from the literal plan is the right call. The shipped pre-filter (if same_ms and len(same_ms) != len(candidates):) only drops cross-ms candidates when there's also a same-ms peer — exactly the "FQN collision" failure mode the plan described. A naive per-candidate filter would have broken legitimate shared-module cases where a class is attributed to one microservice but is called from another within the same JVM. PR description calls this out with ## Intentional divergence — that's the right level of transparency.
  • Phantom branches got an inline justification comment instead of redundant guard code (build_ast_graph.py:1047: "Chained-receiver phantoms have no microservice attribution, so they cannot violate cross-service CALLS invariants.") and :1059 for unresolved-receiver phantoms. This is exactly what the prompt asked for: verify the invariant holds by construction, document it, don't add dead code.
  • Tiered meta read renames _META_FULL_META_PRE_E3 rather than just adding a new tier. The naming is consistent with the chain (pr_e3 / pre_e3 / pr_a2 / legacy), and the meta_mode in ("pr_e3", "pre_e3") branch correctly unpacks the same fields for both.
  • Test Add Cursor rules and agent settings for CLI agents #3 (test_call_invariant_inert_on_bank_chat_system) uses the existing corpus_root conftest fixture instead of hard-coding the path. Cheap forward-compat — if the corpus fixture moves, this test follows automatically.
  • Verbose log line preserves byte-stable structure — counter is appended before , strategies: {...} so existing log-grep checks still work.

Observations (non-blocking)

  1. Pre-filter approach has a real semantic gap vs. the literal plan — if the resolver ever returns a candidates list where all entries are cross-ms (no same-ms peer), the guard does nothing and the cross-service CALLS edge gets emitted. This can't happen today because _lookup_method_candidates is global and FQN collisions inevitably include the caller's own copy, but a future refactor (e.g. per-microservice resolution scoping in the deferred Tier-2 work) could expose the gap. Worth a one-line code comment near the guard noting the precondition: "guard relies on _lookup_method_candidates returning at least one same-ms candidate when a same-ms target exists; revisit if pass3 ever scopes lookups per-microservice." Non-blocking because it's exactly the case Tier-2 would obviate.
  2. PR description's pass3 numbers don't match Kuzu CALLS totals (pass3 reports 800 sites / 138 phantom-recv; Kuzu has 793 CALLS / 137 phantom dst). 7-edge gap exists on master too — pre-existing pass5/dedup behaviour, not caused by this PR. Worth a one-line follow-up issue ("investigate pass3-vs-kuzu CALLS edge gap on bank-chat-system") so the discrepancy is tracked, not silently absorbed.
  3. _build helper in tests/test_call_invariant.py skips pass4_routes (only runs pass1+2+3+write_kuzu). Fine because the tests only inspect pass3 stats — but if a future test in this file ever needs to assert on routes, it'll silently get an empty Route table. Worth a one-line docstring on _build noting the omission is intentional.
  4. Test Tier 1 completion plan + proposals (B2a + B4 + B5) #2 (inert_on_clean_fixtures) only checks pass3_skipped_cross_service == 0 but not that the CALLS topology is unchanged. The bank-chat-system byte-hash check I ran manually would be valuable as a fourth automated test — but it's slow (full corpus rebuild) and the manual evidence covers it. Leave as-is unless the suite ever loses the corpus_root fixture.

Plan deltas needed

None — the table row is already updated. The "Intentional divergence" decision is well-documented in the PR body, so the ## Resolution section of the plan (which still describes the literal per-edge guard) reads as historical context rather than a delta. Optionally: add a one-line "Final shape: see PR #22 §Intentional divergence" pointer in the plan, but it's a polish item.


Ready to merge. After this lands, all 8 follow-ups are done — move plans/PLAN-POST-TIER1B-FOLLOWUPS.md to plans/completed/ and revisit the Tier-2 incremental rebuild work (PR #18, now merged as a propose).

Co-authored-by: Cursor <cursoragent@cursor.com>
@HumanBean17 HumanBean17 merged commit 79a99a1 into master May 5, 2026
@HumanBean17 HumanBean17 deleted the cursor/pr-e3-calls-invariant-guard branch May 10, 2026 21:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant