
feat: B2b HTTP_CALLS + ASYNC_CALLS extractor (PR-D1) #12

Merged
HumanBean17 merged 2 commits into master from feat/b2b-http-async-edges
May 5, 2026

@HumanBean17
Owner

Scope statement

Implements PR-D1 from plans/PLAN-TIER1B-COMPLETION.md only: B2b core HTTP/async caller extraction, pass5_imperative_edges, new edge tables/writers, _string_value_atoms rename, ontology bump to 7, graph meta call-edge counters, and PR-D1 test/fixture additions.

Summary

  • Renamed _route_value_atoms to _string_value_atoms, added OutgoingCallDecl, and populated MethodDecl.outgoing_calls via new _collect_outgoing_calls for Feign-method, RestTemplate, KafkaTemplate, WebClient(unresolved), and StreamBridge(unresolved).
  • Added HTTP_CALLS / ASYNC_CALLS schema, HttpCallRow / AsyncCallRow / CallEdgeStats, and pass5_imperative_edges wired immediately after pass4_routes to emit caller edges with match='unresolved' and phantom Route targets where needed.
  • Extended graph_meta with HTTP/async call totals, strategy JSON blobs, and resolved percentages; added Kuzu meta decoding for new JSON fields; updated README route/edge section and added PR-D1 fixture/tests (cases 1-19).

Test count

  • python3 -m pytest tests -q -> 229 passed, 4 skipped

Manual evidence

$ python3 build_ast_graph.py --source-root tests/bank-chat-system --kuzu-path /tmp/check_d1 --verbose 2>&1 | grep -E "^\[pass[45]\]"
[pass4] Route extraction: emitted=11, exposes=11, skipped_unresolved=0, routes_resolved_pct=81.8, routes_from_brownfield_pct=0.0, by_framework={'spring_mvc': 9, 'kafka': 2}
[pass5] HTTP_CALLS: 2 edges, ASYNC_CALLS: 5 edges
http_calls_total=2, async_calls_total=5
http_calls_by_strategy={'rest_template': 2}
ontology_version=7

Made with Cursor

Implement PR-D1 core by adding outgoing-call extraction, pass5 edge emission, and graph metadata counters so caller-side HTTP/async edges are materialized with unresolved match semantics.

Co-authored-by: Cursor <cursoragent@cursor.com>
@HumanBean17
Owner Author

Review: PR-D1 — B2b HTTP_CALLS + ASYNC_CALLS extractor

Verdict: Approved ✅

PR-D1 ships exactly what plans/PLAN-TIER1B-COMPLETION.md § PR-D1 specifies — _string_value_atoms rename, OutgoingCallDecl + _collect_outgoing_calls, pass5_imperative_edges, HTTP_CALLS / ASYNC_CALLS schema, ontology bump 6→7, graph_meta extension, and all 19 tests named per the plan. Scope discipline is clean: zero PR-D2 brownfield surface, zero PR-D3 cross-service / match-breakdown surface. Manual evidence reproduces bit-for-bit on tests/bank-chat-system.

Scope discipline (out-of-scope checks)

| Sentinel (PR-D2 / PR-D3 territory) | Status |
| --- | --- |
| CodebaseClient, CodebaseProducer | ✅ 0 occurrences |
| HttpClientHint, AsyncProducerHint | ✅ 0 occurrences |
| annotation_to_http_client_hint, fqn_to_http_client_hint | ✅ 0 occurrences |
| annotation_to_async_producer_hint, fqn_to_async_producer_hint | ✅ 0 occurrences |
| resolve_http_client_for_method, resolve_async_producer_for_method | ✅ 0 occurrences |
| http_client_overrides, async_producer_overrides (YAML keys) | ✅ 0 occurrences |
| match_breakdown, _match_factor (PR-D3) | ✅ 0 occurrences |
| cross_service, intra_service | ✅ Only as VALID_HTTP_CALL_MATCHES constants per plan §2; no code paths consume them yet |

Plan compliance

| # | Step from plan §"PR-D1 implementation step list" | Verified |
| --- | --- | --- |
| 1 | Rename _route_value_atoms → _string_value_atoms, update 4 call sites | grep -rn "_route_value_atoms" returns 0; all 4 call sites in ast_java.py (1084, 1094, 1153, 1156, 1159, 1250, 1255, 1264, 1269) call the new name |
| 2 | Add OutgoingCallDecl, MethodDecl.outgoing_calls | ✅ Dataclass added; field populated by _collect_outgoing_calls |
| 3 | Implement _collect_outgoing_calls for Feign / RestTemplate / Kafka | ✅ Tests 2–10, 13 pass |
| 4 | WebClient + StreamBridge unresolved branches | ✅ Tests 11, 12 pass |
| 5 | VALID_CLIENT_KINDS, VALID_HTTP_CALL_*, VALID_ASYNC_CALL_* | ✅ Added to java_ontology.py (frozenset, exported via __all__) |
| 6 | _SCHEMA_HTTP_CALLS, _SCHEMA_ASYNC_CALLS in create + drop lists | (FROM Symbol TO Route, ...) exact match to plan §3.2 |
| 7 | HttpCallRow, AsyncCallRow, CallEdgeStats, GraphTables fields | ✅ Dataclasses present |
| 8 | pass5_imperative_edges wired after pass4_routes | ✅ Manual evidence reproduces |
| 9 | HTTP_CALLS + ASYNC_CALLS writers + phantom-route dedup | ✅ Tests 14, 15, 17 green |
| 10 | graph_meta extended with 6 new columns | ✅ All 6 columns present + kuzu_queries.meta() decodes JSON blobs defensively (pattern mirrors routes_by_framework) |
| 11 | Bump ONTOLOGY_VERSION 6 → 7 | meta() reports 7 |
| 12 | README schema/edge section update | HTTP_CALLS / ASYNC_CALLS row added; old "remaining work" bullet removed |

Tests

229 passed, 4 skipped in 51.09s

Master baseline: 214 collected. PR-D1 branch: 233 collected (229 passed + 4 skipped) → +19 tests, exactly per the plan. All 19 test names in tests/test_outgoing_call_extraction.py (12 cases), tests/test_call_edges_e2e.py (6 cases), and tests/test_string_value_atoms.py (1 case) match the plan §4 table verbatim.

Manual evidence reproduced

$ rm -rf /tmp/check_d1 && python build_ast_graph.py --source-root tests/bank-chat-system \
    --kuzu-path /tmp/check_d1 --verbose 2>&1 | grep -E "^\[pass[45]\]"
[pass4] Route extraction: emitted=11, exposes=11, skipped_unresolved=0, routes_resolved_pct=81.8, routes_from_brownfield_pct=0.0, by_framework={'spring_mvc': 9, 'kafka': 2}
[pass5] HTTP_CALLS: 2 edges, ASYNC_CALLS: 5 edges
ontology_version       = 7
http_calls_total       = 2
async_calls_total      = 5
http_calls_by_strategy = {'rest_template': 2}
async_calls_by_strategy= {'kafka_template': 5}
http_calls_resolved_pct= 1.0
async_calls_resolved_pct= 1.0

✅ Identical to the PR description. Sampling actual edges:

HTTP_CALLS: rest_template / unresolved / POST / 0.21       (×2 — bank-chat postForEntity sites)
ASYNC_CALLS: kafka_template / unresolved / producer / "topic" | "ChatTopics.OPERATOR_NOTIFICATIONS" | "ChatTopics.ESCALATION" | "ChatTopics.COMPLIANCE_REVIEW" | "ChatTopics.INCOMING"

Confidence 0.21 = 0.7 (concat-tail base) × 0.3 (PR-D1 fixed match_factor) × 1.0 (caller_microservice='', so micro_factor=1.0) — matches the plan §3.4 PR-D1 formula. Every edge has match='unresolved' as the plan mandates (PR-D3 will overwrite this column).
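
For readers checking the arithmetic, the product above can be sketched in a few lines. This is only an illustration, not the code in build_ast_graph.py: the function name call_edge_confidence, the constant names, and the two-decimal rounding are assumptions; only the factor values 0.7, 0.3, and 1.0 come from this review.

```python
# Hypothetical names; the numeric values are the ones quoted in this review.
CONCAT_TAIL_BASE = 0.7    # base confidence for a concat-tail path atom
PR_D1_MATCH_FACTOR = 0.3  # fixed match factor while every edge is 'unresolved'
EMPTY_MICRO_FACTOR = 1.0  # micro_factor when caller_microservice == ''


def call_edge_confidence(base: float, match_factor: float, micro_factor: float) -> float:
    """Multiply the three factors; round to two decimals (rounding is an assumption)."""
    return round(base * match_factor * micro_factor, 2)


confidence = call_edge_confidence(CONCAT_TAIL_BASE, PR_D1_MATCH_FACTOR, EMPTY_MICRO_FACTOR)
print(confidence)  # prints 0.21
```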

Notes that earned my trust

  • Symmetric defensive JSON decoder for *_by_strategy MAP-as-STRING fields in kuzu_queries.py:399–416. Mirrors the routes_by_framework pattern (try-except → empty dict, then isinstance check). Re-running meta() against an old DB would degrade gracefully.
  • Phantom-route dedup by id is implemented as "compute synthetic id, append to tables.routes_rows, re-call existing inserter with idempotent semantics" — i.e. it reuses B2a's writer rather than inventing a parallel one. Test 17 (test_phantom_routes_dedup_across_call_sites) locks this behaviour.
  • framework='' and microservice='' for caller-side synthetic ids (per plan §3.4) — guarantees no collision with B2a's exposer-side ids. Future PR-D3 can match by (http_method, path_template) cleanly because the phantom rows are uniquely keyed on caller-context-free attributes.
  • Test naming hygiene: every test name from the plan §4 table appears verbatim in code. No drive-by additions. No skipped/xfailed tests in the PR-D1 set.
  • WebClient + StreamBridge tests assert strategy='unresolved' explicitly (tests 11, 12), locking in the v2 deferral. If a future PR sneaks resolution support for either, those tests will fail loudly and force a plan amendment.
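
The defensive decoding pattern from the first bullet (try-except to an empty dict, then an isinstance check) might look roughly like the sketch below. The helper name decode_strategy_map is hypothetical and the actual code in kuzu_queries.py may differ; the sketch assumes the *_by_strategy columns hold a JSON object serialized as a string.

```python
import json
from typing import Any


def decode_strategy_map(raw: Any) -> dict:
    """Decode a MAP-as-STRING column, falling back to {} on any problem."""
    try:
        value = json.loads(raw)
    except (TypeError, ValueError):
        # None or other non-string input raises TypeError; malformed JSON
        # raises json.JSONDecodeError, which is a subclass of ValueError.
        return {}
    # Valid JSON that is not an object (a list, a number, ...) is rejected too.
    return value if isinstance(value, dict) else {}


print(decode_strategy_map('{"rest_template": 2}'))  # prints {'rest_template': 2}
print(decode_strategy_map(None))                    # prints {}
print(decode_strategy_map("[1, 2]"))                # prints {}
```

With this shape, a database written before the new columns existed (where the field is missing or null) yields an empty dict rather than an exception.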

Observations (non-blocking)

  1. pass5 verbose log is total-only. build_ast_graph.py:1506–1511 prints just [pass5] HTTP_CALLS: N edges, ASYNC_CALLS: M edges. The plan §5 DoD bullet 3 says: "verbose output reports per-client_kind and per-strategy counts." The data exists in tables.call_edge_stats — just not surfaced to stderr. Trivial to extend in PR-D2 (or as a one-line follow-up); the durable counters land in graph_meta.*_by_strategy either way, so this isn't a blocker.

  2. Second copy of the strategy ladder still lives in graph_enrich.py:720–724 (annotation/spel/constant_ref ladder for brownfield route hints, pre-existing from PR-A3). PR-D1 doesn't touch it (correctly — out of scope). DoD bullet 1 says "no duplicate three-strategy ladder anywhere", but that wording is best read as "no duplicate of _string_value_atoms introduced by PR-D1" — verified clean. The second copy in graph_enrich.py is a known consolidation candidate for a future cleanup PR.

  3. http_calls_resolved_pct=1.0 on bank-chat despite all match='unresolved'. This is actually correct per plan §3.6 — the metric is % of edges where strategy != 'unresolved', not match != 'unresolved'. Both rest_template and kafka_template are concrete strategies. Worth a one-line docstring on the metric in build_ast_graph.py to prevent confusion later, but the semantics are exactly what the plan specifies.

  4. http_caller_smoke fixture exercises 5 client_kinds in one fixture (Feign interface + caller, RestTemplate exchange, KafkaTemplate.send, WebClient chain, StreamBridge.send). Future PR-D2 brownfield tests will be a perfect add-on against the same fixture by dropping in @CodebaseClient annotations and asserting replacement-rule behaviour from PR-D2's plan §3.5.
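
To make observation 3 concrete, here is a minimal sketch of a strategy-based resolved metric. The Edge dataclass and the resolved_pct function are hypothetical stand-ins, not the project's types, and returning 0.0 for an empty edge list is an assumption. The point is only that the metric looks at strategy and ignores match.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Edge:
    strategy: str  # e.g. 'rest_template', 'kafka_template', 'unresolved'
    match: str     # 'unresolved' for every edge in PR-D1


def resolved_pct(edges: List[Edge]) -> float:
    """Share of edges whose strategy is not 'unresolved' (0.0 when there are no edges)."""
    if not edges:
        return 0.0
    resolved = sum(1 for edge in edges if edge.strategy != "unresolved")
    return resolved / len(edges)


# Two RestTemplate call sites, as on bank-chat: match is 'unresolved' but strategy is concrete.
bank_chat_http = [Edge("rest_template", "unresolved"), Edge("rest_template", "unresolved")]
print(resolved_pct(bank_chat_http))  # prints 1.0

# Adding an edge whose strategy is itself 'unresolved' lowers the share.
with_unresolved_strategy = bank_chat_http + [Edge("unresolved", "unresolved")]
print(resolved_pct(with_unresolved_strategy) < 1.0)  # prints True
```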

Plan deltas needed

None. Plan §3.6, §4, §5 all hold as written.


Ready to merge. Next: PR-D2 (B2b brownfield: caller-side overrides + @CodebaseClient / @CodebaseProducer). Plan §"Caller-side composition divergence" (option b — brownfield replaces built-in) and tests 27 / 31a / 31b will be the headline verification points.

Report per-client-kind and per-strategy counts in pass5 verbose output, and document that call-edge resolved percentages are strategy-based in PR-D1.

Co-authored-by: Cursor <cursoragent@cursor.com>
@HumanBean17 HumanBean17 merged commit 7d193dd into master May 5, 2026
HumanBean17 added a commit that referenced this pull request May 5, 2026
…-E2 plan (#16)

Catches from PR-D1, PR-D2, PR-D3 reviews that were intentionally
deferred until Tier 1B landed are gathered into one document to
prevent them from getting lost across review threads.

PR-E1 (small, 1-day): risk_score [0,1] re-normalisation,
VALID_HTTP_CALL_MATCHES rename, two inline comments, two doc fixes.

PR-E2 (refactor): consolidate the second three-strategy ladder in
graph_enrich.py:720-724 onto the canonical resolver.

Refs:
- PR-D1 #12 obs 2 (strategy-ladder duplicate)
- PR-D2 #13 post-D3 follow-ups (anchor-fills-from-builtin doc, channel field)
- PR-D3 #15 obs 1-3, 5 (risk-score contract, VALID_HTTP_CALL_MATCHES rename, two reader comments)
@HumanBean17 HumanBean17 deleted the feat/b2b-http-async-edges branch May 10, 2026 21:18