fix(benchmark): apply sink-edge exclusion filter to scripts/resolution-benchmark.ts (#1699)

Merged: carlos-alm merged 4 commits into main from release/3.15.0 on Jun 22, 2026
@carlos-alm

Contributor

Summary

  • scripts/resolution-benchmark.ts was missing the sink-edge exclusion filters that commit 95304680 added to the vitest test file
  • The CI gate reads precomputed metrics from the script via RESOLUTION_RESULT_JSON, so sink edges (confidence=0, targeting file nodes) were counted as false positives
  • Added AND tgt.kind != 'file' and the confidence/dynamic_kind belt-and-suspenders guard to the script's SQL, matching extractResolvedEdges in the test file
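The two guards described above can be sketched as an in-memory predicate. This is a minimal illustration, not the script's actual SQL: the `ResolvedEdgeRow` shape, the function names, the sample rows, and the `"reflection"` kind value are all hypothetical, and the secondary guard's exact form (zero confidence together with a non-NULL `dynamic_kind`) is an assumption, since the PR does not quote the full clause.

```typescript
// Hypothetical in-memory view of one row returned by the benchmark's edge query.
interface ResolvedEdgeRow {
  sourceName: string;
  targetName: string;
  targetKind: string; // mirrors tgt.kind
  confidence: number; // mirrors e.confidence
  dynamicKind: string | null; // mirrors e.dynamic_kind
}

// Primary guard from the PR: sink edges are the only call edges that target a file node.
function targetsFileNode(row: ResolvedEdgeRow): boolean {
  return row.targetKind === "file";
}

// Secondary guard. ASSUMED shape: zero confidence combined with a recorded dynamic kind.
function isTaggedSinkEdge(row: ResolvedEdgeRow): boolean {
  return row.confidence === 0 && row.dynamicKind !== null;
}

// Keep only rows that pass both guards.
function excludeSinkEdges(rows: readonly ResolvedEdgeRow[]): ResolvedEdgeRow[] {
  return rows.filter((row) => !targetsFileNode(row) && !isTaggedSinkEdge(row));
}

// Hypothetical sample data: one ordinary call, one tagged sink edge, and one
// sink edge whose dynamic kind was never written (NULL).
const sampleRows: ResolvedEdgeRow[] = [
  { sourceName: "main", targetName: "helper", targetKind: "function", confidence: 1, dynamicKind: null },
  { sourceName: "runInvoke", targetName: "Main.java", targetKind: "file", confidence: 0, dynamicKind: "reflection" },
  { sourceName: "runForName", targetName: "Main.java", targetKind: "file", confidence: 0, dynamicKind: null },
];

const keptRows = excludeSinkEdges(sampleRows);
console.log(keptRows.map((row) => `${row.sourceName} -> ${row.targetName}`));
```

In this sketch the third row passes the secondary guard because its dynamic kind is NULL, and only the file-node check removes it, which is why the `tgt.kind != 'file'` clause is treated as the primary filter.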

Affected fixtures (before fix):

  • dynamic-java: precision 33.3% (runInvoke, runForName sink edges leaking through)
  • dynamic-javascript: precision 57.1% (runComputedKey, runEval, runNewFunction leaking)
  • dynamic-scala: precision 50.0% (runInvoke leaking)
  • dynamic-typescript: precision 60.0% (runReflectGetVariable, runReflectApplyExternal leaking)
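The percentages above follow from precision = true positives / (true positives + false positives). The sketch below reproduces them under a stated assumption: the false-positive counts equal the number of leaked sink edges named for each fixture, and the true-positive counts, which the PR does not state, are the smallest integers consistent with the reported figures.

```typescript
// Precision as a percentage with one decimal place.
function precisionPercent(truePositives: number, falsePositives: number): string {
  const total = truePositives + falsePositives;
  // Convention chosen for this sketch only: no resolved edges reports 0.0.
  if (total === 0) {
    return "0.0";
  }
  return ((100 * truePositives) / total).toFixed(1);
}

// ASSUMED counts: leaked = number of sink edges listed in the PR for the fixture,
// correct = smallest true-positive count that matches the reported precision.
const assumedCounts: Record<string, { correct: number; leaked: number }> = {
  "dynamic-java": { correct: 1, leaked: 2 },
  "dynamic-javascript": { correct: 4, leaked: 3 },
  "dynamic-scala": { correct: 1, leaked: 1 },
  "dynamic-typescript": { correct: 3, leaked: 2 },
};

for (const fixture of Object.keys(assumedCounts)) {
  const counts = assumedCounts[fixture];
  const before = precisionPercent(counts.correct, counts.leaked);
  const after = precisionPercent(counts.correct, 0);
  console.log(`${fixture}: ${before}% before, ${after}% after`);
}
```

With the sink edges filtered out, the false-positive count in this model drops to zero, so every fixture reports 100.0%.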

Test plan

  • Re-trigger the publish workflow — Gate on resolution thresholds should pass for all four fixtures at 100% precision

… path

`insert_call_edge_rows` in pipeline.rs mapped ComputedEdge→EdgeRow but
silently dropped dynamic_kind, so native sink edges landed in the DB with
dynamic_kind IS NULL.  The benchmark filter added in 39b9a00
(`OR e.dynamic_kind IS NULL`) then matched those NULL-kind rows and counted
them as false positives, breaking precision for dynamic-javascript,
dynamic-typescript, and dynamic-scala.

Fix in two layers (native write path and benchmark filter), three changes:

1. edges.rs: add dynamic_kind: Option<String> to EdgeRow and update
   do_insert_edges to a 6-column INSERT (CHUNK 199→165 to stay under
   SQLite's 999 bind-parameter limit).  None binds as SQL NULL for all
   non-sink edges; Some(dk) writes the kind string for sink edges.

2. pipeline.rs: pass e.dynamic_kind.clone() in insert_call_edge_rows so
   ComputedEdge.dynamic_kind reaches the DB insert.

3. resolution-benchmark.test.ts: add AND tgt.kind != 'file' to
   extractResolvedEdges — the correct semantic filter since sink edges are
   the only calls that ever target the file node, and no legitimate
   resolution does.  This filter is correct regardless of dynamic_kind being
   set, removing the dependency on the back-fill path.
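The CHUNK change in item 1 is bind-parameter arithmetic: each row consumes one parameter per column, so the number of rows per INSERT is bounded by the limit divided by the column count. The sketch below works through that arithmetic in TypeScript rather than the Rust of edges.rs; the helper names are hypothetical and the 999 limit is the value cited in the commit message.

```typescript
// Bind-parameter limit cited in the commit message.
const SQLITE_BIND_LIMIT = 999;

// Largest number of rows whose parameters fit in a single statement.
function maxRowsPerStatement(columnCount: number, limit: number = SQLITE_BIND_LIMIT): number {
  return Math.floor(limit / columnCount);
}

// Split rows into consecutive chunks of at most chunkSize elements.
function chunkRows<T>(rows: readonly T[], chunkSize: number): T[][] {
  const chunks: T[][] = [];
  for (let start = 0; start < rows.length; start += chunkSize) {
    chunks.push(rows.slice(start, start + chunkSize));
  }
  return chunks;
}

// Build the "(?,?,...),(?,?,...)" placeholder list for a multi-row INSERT.
function buildPlaceholders(rowCount: number, columnCount: number): string {
  const oneRow = `(${new Array(columnCount).fill("?").join(",")})`;
  return new Array(rowCount).fill(oneRow).join(",");
}

console.log(maxRowsPerStatement(5)); // 5-column insert: 199 rows, 995 parameters
console.log(maxRowsPerStatement(6)); // 6-column insert: 166 rows, 996 parameters
console.log(165 * 6); // the chosen CHUNK of 165 uses 990 parameters
console.log(199 * 6); // keeping 199 rows with 6 columns would need 1194
```

By this arithmetic 166 rows would also fit; the PR's choice of 165 simply leaves a small margin under the limit.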
…xtractResolvedEdges (#1698)

The tgt.kind != 'file' guard is the authoritative semantic exclusion for
sink edges; the confidence/dynamic_kind clause is kept as belt-and-suspenders
for databases built before the native fix where dynamic_kind was NULL.
…n-benchmark.ts

The tgt.kind != 'file' and confidence/dynamic_kind guards were added to
extractResolvedEdges in the vitest test file (9530468) but not to the
matching SQL query in scripts/resolution-benchmark.ts. The CI gate reads
precomputed JSON from that script (via RESOLUTION_RESULT_JSON), so sink
edges (confidence=0, targeting file nodes) were counted as false positives,
breaking precision for dynamic-java, dynamic-javascript, dynamic-scala,
and dynamic-typescript.
@greptile-apps

greptile-apps Bot commented Jun 22, 2026

Contributor

Greptile Summary

This PR backports the sink-edge exclusion SQL filter (added to the vitest test in commit 95304680) into scripts/resolution-benchmark.ts, which is the path the CI gate actually reads via RESOLUTION_RESULT_JSON. It also propagates the native Rust dynamic_kind field through EdgeRow so the native engine writes the column that the SQL guard relies on.

  • Benchmark script fix (scripts/resolution-benchmark.ts): adds AND tgt.kind != 'file' (primary guard) and the belt-and-suspenders confidence/dynamic_kind clause to the resolvedEdges query, making it identical to extractResolvedEdges in the test file. Without this, sink edges (confidence=0, targeting file nodes) were classified as false positives, dragging precision down to 33–60% on the four dynamic-language fixtures.
  • Rust native engine fix (edges.rs, pipeline.rs): adds dynamic_kind: Option<String> to EdgeRow, recalculates CHUNK to 165 (165×6=990, safely under the SQLite 999-variable limit), and wires the new column through the bulk-insert SQL and binding loop.
  • Documentation sweep: CHANGELOG, README, ROADMAP, and BACKLOG updated for the v3.15.0 release.

Confidence Score: 4/5

Safe to merge — the benchmark script now uses the same SQL filter as the test, and the Rust bulk-insert path correctly writes dynamic_kind for sink edges.

All changed logic is a targeted one-to-one alignment of the benchmark script's SQL query with the already-correct test implementation. The only loose end is a comment in the test file that cites #1698 for the Rust dynamic_kind write fix, which actually lands in this PR (#1699).

The inline comment on line 267 of tests/benchmarks/resolution/resolution-benchmark.test.ts references an incorrect PR number for the native engine fix.

Important Files Changed

| Filename | Overview |
| --- | --- |
| scripts/resolution-benchmark.ts | Adds the missing sink-edge exclusion SQL clauses (AND tgt.kind != 'file' plus the belt-and-suspenders confidence/dynamic_kind guard) to match the test file's extractResolvedEdges query; the logic is correct. |
| tests/benchmarks/resolution/resolution-benchmark.test.ts | Pre-existing SQL filter that was already correct; a comment in this file references #1698 for the Rust dynamic_kind write fix, but that change actually lands in this PR (#1699). |
| crates/codegraph-core/src/db/repository/edges.rs | Adds dynamic_kind: Option<String> to EdgeRow, adjusts CHUNK from 199 to 165 (165×6=990, still under the 999 SQLite variable limit), and propagates the new column in the bulk-insert SQL and bindings; all correct. |
| crates/codegraph-core/src/domain/graph/builder/pipeline.rs | One-liner that passes dynamic_kind through from ComputedEdge to EdgeRow; straightforward and correct. |
| CHANGELOG.md | Adds the v3.15.0 release notes block; documentation only. |
| README.md | Adds codegraph roles --dynamic example and ignoreAdditionalDirs config snippet; documentation only. |
| docs/roadmap/ROADMAP.md | Version bump to 3.15.0, marks two ADRs as complete, adds a new roadmap phase; documentation only. |
| docs/roadmap/BACKLOG.md | Date stamp update; documentation only. |

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant CI as CI Gate
    participant Script as resolution-benchmark.ts
    participant DB as SQLite (graph.db)
    participant Test as resolution-benchmark.test.ts

    CI->>Script: run (reads RESOLUTION_RESULT_JSON)
    Script->>DB: "SELECT WHERE kind=calls AND tgt.kind != file NEW"
    DB-->>Script: resolved edges (sink edges excluded)
    Script-->>CI: metrics JSON (precision/recall)
    CI->>CI: Gate on resolution thresholds

    Note over Test: extractResolvedEdges() already had both guards since commit 95304680
    Test->>DB: same SQL (reference implementation)
    DB-->>Test: resolved edges (sink edges excluded)
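
The gate step in the diagram can be sketched as a pure check over the precomputed metrics. Everything specific here is an assumption: the PR does not show the JSON layout passed via RESOLUTION_RESULT_JSON, the threshold values, or the gate's code, so the `BenchmarkResult` shape, the `findGateFailures` helper, the recall values, and the 0.9 thresholds are hypothetical.

```typescript
// ASSUMED layout of the precomputed metrics: one entry per fixture.
interface FixtureMetrics {
  precision: number;
  recall: number;
}
type BenchmarkResult = Record<string, FixtureMetrics>;

// Return one message per fixture metric that falls below its threshold.
function findGateFailures(result: BenchmarkResult, minPrecision: number, minRecall: number): string[] {
  const failures: string[] = [];
  for (const fixture of Object.keys(result)) {
    const metrics = result[fixture];
    if (metrics.precision < minPrecision) {
      failures.push(`${fixture}: precision ${metrics.precision} is below ${minPrecision}`);
    }
    if (metrics.recall < minRecall) {
      failures.push(`${fixture}: recall ${metrics.recall} is below ${minRecall}`);
    }
  }
  return failures;
}

// Hypothetical payloads: sink edges counted as false positives, then excluded.
const unfilteredJson = '{"dynamic-scala":{"precision":0.5,"recall":1}}';
const filteredJson = '{"dynamic-scala":{"precision":1,"recall":1}}';

const unfilteredFailures = findGateFailures(JSON.parse(unfilteredJson) as BenchmarkResult, 0.9, 0.9);
const filteredFailures = findGateFailures(JSON.parse(filteredJson) as BenchmarkResult, 0.9, 0.9);
console.log(unfilteredFailures);
console.log(filteredFailures);
```

Because the gate only sees the precomputed numbers, a filter applied in the vitest test but not in the script has no effect on the gate's outcome, which is the mismatch this PR removes.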

Reviews (1): Last reviewed commit: "fix(benchmark): apply sink-edge exclusio..."

Comment on lines +267 to +268
-- databases built before the Rust fix in #1698 where dynamic_kind was NULL for
-- sink edges, making tgt.kind != 'file' the authoritative semantic guard.

P2 Comment references wrong PR number for the Rust fix

The comment attributes the native dynamic_kind write fix to #1698, but the EdgeRow struct change that actually makes the native engine write dynamic_kind for sink edges is part of this PR (#1699). Anyone reading this comment in the future will look up #1698 and not find the corresponding edges.rs change. Even if #1698 was intended as a separate squash or predecessor PR, the reference is still off by one.

@carlos-alm carlos-alm merged commit c8fd894 into main Jun 22, 2026
35 checks passed
@carlos-alm carlos-alm deleted the release/3.15.0 branch June 22, 2026 17:08
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 22, 2026