fix(benchmark): apply sink-edge exclusion filter to scripts/resolution-benchmark.ts by carlos-alm · Pull Request #1699 · optave/ops-codegraph-tool

carlos-alm · 2026-06-22T16:31:48Z

Summary

scripts/resolution-benchmark.ts was missing the sink-edge exclusion filters that commit 95304680 added to the vitest test file
The CI gate reads precomputed metrics from the script via RESOLUTION_RESULT_JSON, so sink edges (confidence=0, targeting file nodes) were counted as false positives
Added AND tgt.kind != 'file' and the confidence/dynamic_kind belt-and-suspenders guard to the script's SQL, matching extractResolvedEdges in the test file

Affected fixtures (before fix):

dynamic-java: precision 33.3% (runInvoke, runForName sink edges leaking through)
dynamic-javascript: precision 57.1% (runComputedKey, runEval, runNewFunction leaking)
dynamic-scala: precision 50.0% (runInvoke leaking)
dynamic-typescript: precision 60.0% (runReflectGetVariable, runReflectApplyExternal leaking)

Test plan

Re-trigger the publish workflow — Gate on resolution thresholds should pass for all four fixtures at 100% precision

… path `insert_call_edge_rows` in pipeline.rs mapped ComputedEdge→EdgeRow but silently dropped dynamic_kind, so native sink edges landed in the DB with dynamic_kind IS NULL. The benchmark filter added in 39b9a00 (`OR e.dynamic_kind IS NULL`) then matched those NULL-kind rows and counted them as false positives, breaking precision for dynamic-javascript, dynamic-typescript, and dynamic-scala. Two-layer fix: 1. edges.rs: add dynamic_kind: Option<String> to EdgeRow and update do_insert_edges to a 6-column INSERT (CHUNK 199→165 to stay under SQLite's 999 bind-parameter limit). None binds as SQL NULL for all non-sink edges; Some(dk) writes the kind string for sink edges. 2. pipeline.rs: pass e.dynamic_kind.clone() in insert_call_edge_rows so ComputedEdge.dynamic_kind reaches the DB insert. 3. resolution-benchmark.test.ts: add AND tgt.kind != 'file' to extractResolvedEdges — the correct semantic filter since sink edges are the only calls that ever target the file node, and no legitimate resolution does. This filter is correct regardless of dynamic_kind being set, removing the dependency on the back-fill path.

…xtractResolvedEdges (#1698) The tgt.kind != 'file' guard is the authoritative semantic exclusion for sink edges; the confidence/dynamic_kind clause is kept as belt-and-suspenders for databases built before the native fix where dynamic_kind was NULL.

…n-benchmark.ts The tgt.kind != 'file' and confidence/dynamic_kind guards were added to extractResolvedEdges in the vitest test file (9530468) but not to the matching SQL query in scripts/resolution-benchmark.ts. The CI gate reads precomputed JSON from that script (via RESOLUTION_RESULT_JSON), so sink edges (confidence=0, targeting file nodes) were counted as false positives, breaking precision for dynamic-java, dynamic-javascript, dynamic-scala, and dynamic-typescript.

greptile-apps · 2026-06-22T16:41:53Z

Greptile Summary

This PR backports the sink-edge exclusion SQL filter (added to the vitest test in commit 95304680) into scripts/resolution-benchmark.ts, which is the path the CI gate actually reads via RESOLUTION_RESULT_JSON. It also propagates the native Rust dynamic_kind field through EdgeRow so the native engine writes the column that the SQL guard relies on.

Benchmark script fix (scripts/resolution-benchmark.ts): adds AND tgt.kind != 'file' (primary guard) and the belt-and-suspenders confidence/dynamic_kind clause to the resolvedEdges query, making it identical to extractResolvedEdges in the test file. Without this, sink edges (confidence=0, targeting file nodes) were classified as false positives, dragging precision down to 33–60% on the four dynamic-language fixtures.
Rust native engine fix (edges.rs, pipeline.rs): adds dynamic_kind: Option<String> to EdgeRow, recalculates CHUNK to 165 (165×6=990, safely under the SQLite 999-variable limit), and wires the new column through the bulk-insert SQL and binding loop.
Documentation sweep: CHANGELOG, README, ROADMAP, and BACKLOG updated for the v3.15.0 release.

Confidence Score: 4/5

Safe to merge — the benchmark script now uses the same SQL filter as the test, and the Rust bulk-insert path correctly writes dynamic_kind for sink edges.

All changed logic is a targeted one-to-one alignment of the benchmark script's SQL query with the already-correct test implementation. The only loose end is a comment in the test file that cites #1698 for the Rust dynamic_kind write fix, which actually lands in this PR (#1699).

The inline comment on line 267 of tests/benchmarks/resolution/resolution-benchmark.test.ts references an incorrect PR number for the native engine fix.

Important Files Changed

Filename	Overview
scripts/resolution-benchmark.ts	Adds the missing sink-edge exclusion SQL clauses (`AND tgt.kind != 'file'` + belt-and-suspenders `confidence/dynamic_kind` guard) to match the test file's `extractResolvedEdges` query; the logic is correct.
tests/benchmarks/resolution/resolution-benchmark.test.ts	Pre-existing SQL filter that was already correct; a comment in this file references `#1698` for the Rust `dynamic_kind` write fix, but that change actually lands in this PR (`#1699`).
crates/codegraph-core/src/db/repository/edges.rs	Adds `dynamic_kind: Option<String>` to `EdgeRow`, adjusts CHUNK from 199 to 165 (165x6=990, still under the 999 SQLite variable limit), and propagates the new column in the bulk-insert SQL and bindings — all correct.
crates/codegraph-core/src/domain/graph/builder/pipeline.rs	One-liner that passes `dynamic_kind` through from `ComputedEdge` to `EdgeRow`; straightforward and correct.
CHANGELOG.md	Adds the v3.15.0 release notes block; documentation only.
README.md	Adds `codegraph roles --dynamic` example and `ignoreAdditionalDirs` config snippet; documentation only.
docs/roadmap/ROADMAP.md	Version bump to 3.15.0, marks two ADRs as complete, adds a new roadmap phase; documentation only.
docs/roadmap/BACKLOG.md	Date stamp update; documentation only.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant CI as CI Gate
    participant Script as resolution-benchmark.ts
    participant DB as SQLite (graph.db)
    participant Test as resolution-benchmark.test.ts

    CI->>Script: run (reads RESOLUTION_RESULT_JSON)
    Script->>DB: "SELECT WHERE kind=calls AND tgt.kind != file NEW"
    DB-->>Script: resolved edges (sink edges excluded)
    Script-->>CI: metrics JSON (precision/recall)
    CI->>CI: Gate on resolution thresholds

    Note over Test: extractResolvedEdges() already had both guards since commit 95304680
    Test->>DB: same SQL (reference implementation)
    DB-->>Test: resolved edges (sink edges excluded)

%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant CI as CI Gate
    participant Script as resolution-benchmark.ts
    participant DB as SQLite (graph.db)
    participant Test as resolution-benchmark.test.ts

    CI->>Script: run (reads RESOLUTION_RESULT_JSON)
    Script->>DB: "SELECT WHERE kind=calls AND tgt.kind != file NEW"
    DB-->>Script: resolved edges (sink edges excluded)
    Script-->>CI: metrics JSON (precision/recall)
    CI->>CI: Gate on resolution thresholds

    Note over Test: extractResolvedEdges() already had both guards since commit 95304680
    Test->>DB: same SQL (reference implementation)
    DB-->>Test: resolved edges (sink edges excluded)

_{Reviews (1): Last reviewed commit: "fix(benchmark): apply sink-edge exclusio..." | Re-trigger Greptile}

greptile-apps · 2026-06-22T16:41:58Z

+        -- databases built before the Rust fix in #1698 where dynamic_kind was NULL for
+        -- sink edges, making tgt.kind != 'file' the authoritative semantic guard.


Comment references wrong PR number for the Rust fix

The comment attributes the native dynamic_kind write fix to #1698, but the EdgeRow struct change that actually makes the native engine write dynamic_kind for sink edges is part of this PR (#1699). Anyone reading this comment in the future will look up #1698 and not find the corresponding edges.rs change. If #1698 was intended as a separate squash/predecessor, the match is still off-by-one here.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

carlos-alm added 4 commits June 21, 2026 17:56

docs: prepare release notes for v3.15.0

5e6c623

greptile-apps Bot reviewed Jun 22, 2026

View reviewed changes

carlos-alm merged commit c8fd894 into main Jun 22, 2026
35 checks passed

carlos-alm deleted the release/3.15.0 branch June 22, 2026 17:08

github-actions Bot locked and limited conversation to collaborators Jun 22, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(benchmark): apply sink-edge exclusion filter to scripts/resolution-benchmark.ts#1699

fix(benchmark): apply sink-edge exclusion filter to scripts/resolution-benchmark.ts#1699
carlos-alm merged 4 commits into
mainfrom
release/3.15.0

carlos-alm commented Jun 22, 2026

Uh oh!

greptile-apps Bot commented Jun 22, 2026

Uh oh!

greptile-apps Bot Jun 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		-- databases built before the Rust fix in #1698 where dynamic_kind was NULL for
		-- sink edges, making tgt.kind != 'file' the authoritative semantic guard.

Uh oh!

Conversation

carlos-alm commented Jun 22, 2026

Summary

Test plan

Uh oh!

greptile-apps Bot commented Jun 22, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant