Skip to content

fix(watch): report net edge delta not gross re-inserted count#1245

Merged
carlos-alm merged 35 commits into
mainfrom
fix/watch-net-edge-delta
May 29, 2026
Merged

fix(watch): report net edge delta not gross re-inserted count#1245
carlos-alm merged 35 commits into
mainfrom
fix/watch-net-edge-delta

Conversation

@carlos-alm

Copy link
Copy Markdown
Contributor

Summary

  • Track outgoing edge count per file before purgeFileData (edgesBefore in IncrementalStmts + RebuildResult)
  • logRebuildResults now shows edgesAdded - edgesBefore so a no-op save reports +0 edges instead of e.g. +10 edges
  • Negative values (edges removed) are now surfaced correctly as -N edges

Test plan

  • Existing watcher integration tests pass (watcher-fk-embeddings, watcher-rebuild)
  • Full test suite green (159 files, 2771 tests)
  • Manual: start codegraph watch, edit a file without changing imports/calls → log shows +0 edges

Closes #1241

carlos-alm and others added 30 commits May 25, 2026 22:59
… table

The native-orchestrator drop warning lived in a single wall-of-text WARN
line that grew unreadable when 30+ extensions were dropped at once
(easy to trigger via journal-vs-fresh-build collisions). Make the
per-extension breakdown scan like a table: header line keeps the count
and now also reports the extension total; each extension occupies its
own indented row with a right-aligned count column.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… collection

Adds shared utilities to src/extractors/helpers.ts in preparation for
adoption across language extractors (phase 2):

- nodeStartLine: companion to nodeEndLine for the ~108 hand-rolled
  startPosition.row + 1 literals scattered across extractors
- findFirstChildOfTypes: find first child matching any of N types
  (useful for grammar variants like string vs string_literal)
- iterChildren / PUNCTUATION_TOKENS: generator-based child iteration
  with punctuation skipping, used in elixir/gleam destructuring walks
- pushCall / pushImport: centralise Call/Import construction so line
  derivation stays consistent across extractors
- extractSimpleParameters / resolveParamName: uniform parameter
  extraction with optional type-map sink — collapses boilerplate in
  the ~16 per-language extractParams helpers

Phase 1 of the TS extractor refactor plan (sync.json clusters 1).
Additive only — no consumer adoption yet; existing helpers and
extractor behaviour unchanged. Consumers updated in phase 2.

docs check acknowledged: internal refactor, no doc updates needed.
Phase 2 of the TS extractor refactor plan (sync.json cluster 1). Adopts
the helpers extended in 9c8be55 (nodeStartLine, findFirstChildOfTypes,
pushCall, pushImport, extractSimpleParameters, stripQuotes) across six
language extractors:

- r.ts: drop local stripQuotes; use shared stripQuotes/pushCall/
  pushImport/findFirstChildOfTypes/nodeStartLine
- gleam.ts: use pushCall/pushImport/findFirstChildOfTypes/nodeStartLine;
  extract pushConstructor helper for the dual-branch data-constructor
  walk
- julia.ts: use pushCall/pushImport/nodeStartLine; collapse Julia param
  wrapper-type branches via JULIA_PARAM_WRAPPER_TYPES set
- java.ts: use pushCall/pushImport/nodeStartLine; collapse
  extractJavaParameters via extractSimpleParameters with typeMap sink;
  extract resolveJavaTypeText for the generic_type unwrap pattern
- gleam.ts and solidity.ts: extract qualifyWithParent helper in solidity
  to collapse 6 duplicated `parent ? \`\${parent}.\${name}\` : name`
  blocks
- solidity.ts: use pushCall/pushImport/findFirstChildOfTypes/
  nodeStartLine; collapse extractSolParams via extractSimpleParameters
- javascript.ts: bulk-replace 43 inline `XXX.startPosition.row + 1`
  literals with nodeStartLine() calls; replace one stray endPosition
  literal with nodeEndLine

Net -65 lines. No behaviour changes — only call-site collapsing onto the
shared helpers (semantics verified by careful inspection of each
replacement; pushImport's empty-names fallback matches the previous
ad-hoc defaults in each extractor).

docs check acknowledged: internal refactor, no doc updates needed.
Convert collectElixirParamIdentifiers from mutual-recursion with
collectElixirMapBindings into a single iterative worklist traversal.
Map/list/tuple/binary-operator dispatch is now done via three leaf
helpers that push child nodes onto the worklist instead of calling
back into the main function. This removes the function-level cycle
flagged by codegraph (9 -> 8 cycles) without changing extractor
semantics. docs check acknowledged: internal refactor only.
Phase 5 of the Rust extractor refactor plan (sync.json cluster 2). Adopts
the helpers extended in 0d687c4 (push_call, push_simple_call, push_import,
push_type_map_entry, extract_simple_parameters, match_c_family_type_map)
across eight language extractors:

- cpp.rs: collapse match_cpp_type_map to a one-line delegate of
  match_c_family_type_map; use push_import/push_simple_call/push_call
  for include and call sites
- cuda.rs: same delegation as cpp.rs; use push_import/push_simple_call/
  push_call across include and call_expression handlers
- java.rs: use push_type_map_entry for local-variable / formal-parameter
  bindings; use push_call/push_simple_call for method invocation and
  object creation; collapse extract_java_parameters to a one-shot
  extract_simple_parameters call; use push_import for import declaration
- javascript.rs: use push_simple_call for new_expression identifier
  branch; use push_type_map_entry for the confidence-0.9 type entries
- julia.rs: use push_simple_call/push_call across identifier and
  field_expression / scoped_identifier call branches
- objc.rs: use push_import for at_import; use push_call for c-call and
  message-expression handlers (drops redundant is_empty guards)
- r_lang.rs: use push_simple_call/push_call across identifier and
  namespace_operator call branches; use push_import for library/source
- solidity.rs: use push_call (drops redundant guard) for call sites;
  collapse extract_sol_params to a one-shot extract_simple_parameters

Net: -207 lines across 8 files, no behavior change. cargo check clean,
324 rust unit tests pass.

Pre-existing test failure: tests/engines/parity.test.ts has two failing
elixir cases unrelated to this commit (filed as #1227 — regression from
commit 5abe6ad in Phase 3).
Convert collect_elixir_param_identifiers from mutual-recursion with
collect_elixir_map_bindings into a single iterative worklist traversal.
Map/list/tuple/binary-operator dispatch is now done via three leaf
helpers (push_elixir_sequence_items, push_elixir_map_values,
push_elixir_binary_operator_operands) that push child nodes onto the
worklist instead of calling back into the main function. This removes
the function-level cycle flagged by codegraph (8 -> 7 cycles) and
mirrors the TS refactor in 5abe6ad without changing extractor
semantics. docs check acknowledged: internal refactor only.
…lection strategy

Extract the native-orchestrator path out of pipeline.ts into two new stage
modules:

  - stages/native-orchestrator.ts — tryNativeOrchestrator + post-native
    structure/analysis fallback + dropped-language detection/backfill.
  - stages/native-db-lifecycle.ts — shared rusqlite connection helpers
    (closeNativeDb, reopenNativeDb, suspendNativeDb, refreshJsDb).

This breaks the function-level cycle 'buildGraph <-> tryNativeOrchestrator'
caused by codegraph's name-based resolver conflating the local buildGraph
function with the ctx.nativeDb.buildGraph() method call. Once the
orchestrator lives in its own file, there is no longer a local buildGraph
in scope to collide with the method invocation.

Function-level cycles: 9 -> 5. No file-level cycle introduced (still 1,
unchanged — pre-existing MCP cycle). pipeline.ts shrinks from 1404 to 465
lines and now reads as a thin top-level controller: detect changes, try
native, fall back to JS stages.

computeWasmOnlyStaleFiles is re-exported from pipeline.ts so existing
unit tests (tests/builder/wasm-only-stale-files.test.ts) keep working
without changes.
docs check acknowledged — no doc-relevant changes (internal helper extraction).
docs check acknowledged - Rust internal helper extraction, no user-facing changes
…impact and dependencies

Split high-cognitive-complexity functions in the analysis domain into focused
helpers. Worst functions per gauntlet (cog/cyc/maxNesting/halstead) are now
below thresholds.

module-map.ts (statsData cog=31 -> below threshold):
- Extract buildStatsFromNative and buildStatsFromJs branches
- Share false-positive query and quality-score helpers between paths
- aggregateRolesFromNative pulls duplicated role-aggregation code out

fn-impact.ts (bfsTransitiveCallers cog=37 -> below threshold,
              impactAnalysisData cog=27 -> below threshold):
- Extract recordCaller, processFrontierNode, seedInterfaceImplementors
- Extract bfsImportDependents and groupDependentsByLevel

dependencies.ts (bfsShortestPath cog=29, bfsFilePath cog=30,
                 buildTransitiveCallers cog=24 -> all below threshold):
- Extract buildNextCallerFrontier from buildTransitiveCallers
- Extract buildNeighborStmt + visitNeighbor; state collected in struct
- Extract visitFileNeighbor + reconstructFilePath

docs check acknowledged - internal helper extraction, no user-facing changes
…, structure-query, and owners

Internal refactor — no public API or behaviour change, so docs check acknowledged.

- complexity.ts: split collectNativeBulkRows (cog=70) into classify/build/collect-file helpers;
  extract classifyHalsteadToken + summarizeHalsteadCounts from computeHalsteadMetrics.
- structure.ts: merge classifyNodeRolesFull/Incremental DRY via shared buildActiveFilesSet
  + buildClassifierInput helpers.
- graph-enrichment.ts: decompose prepareFileLevelData (cog=32, cyc=26) into loadFileLevelEdges,
  computeFileFanCounts, detectFileCommunities, buildFileVisNode, selectFileSeedNodes.
- structure-query.ts: split hotspotsData (cog=34, sloc=102) using a strategy pattern
  (HOTSPOT_ORDER_BY) and mapNative/JsHotspotRow helpers.
- owners.ts: split ownersData (sloc=158, bugs=1.55) into loadFilteredFiles, buildOwnerIndex,
  loadSymbolsForFiles, computeOwnerBoundaries, buildOwnersSummary.
Internal refactor — public APIs unchanged. docs check acknowledged.
…ir pushElixirSequenceItems

Replaces the inline childCount loop with the shared iterChildren generator
configured with PUNCTUATION_TOKENS, completing phase 1 of the TS extractor
refactor plan (sync.json cluster 1). Behaviour preserved — same nodes are
pushed onto the worklist, just via the shared helper.

docs check acknowledged: internal refactor, no doc updates needed.
Wire forge phase 4 helpers into their consumers:

- find_first_child_of_types: collapse find_child(x, A).or_else(|| find_child(x, B))
  in fsharp.rs handle_application
- iter_children + PUNCTUATION_TOKENS: replace inline punctuation-skip loop in
  javascript.rs first_arg_is_string_literal

Closes 3 dead-ffi helpers extracted by forge phase 4. Semantically identical.
…ap helpers

pushElixirSequenceItems and pushElixirMapValues were pushing items in
forward order onto the LIFO worklist stack, causing tuple/list/map
parameters to be emitted in reverse source order (e.g. {x, _y} → ['_y',
'x'] instead of ['x', '_y']). The fix collects items then pushes them in
reverse so the LIFO pop restores source order, matching the native engine.
Track outgoing edge count before purge (edgesBefore) and subtract from
edgesAdded so the rebuild log shows the real change — e.g. +0 edges when
nothing changed instead of +10 for every trivial save.
@greptile-apps

greptile-apps Bot commented May 29, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes the edge-delta reporting in the incremental watcher by capturing outgoing edge counts before purgeFileData and before each reverse-dep's deleteOutgoingEdges, then computing edgesAdded - edgesBefore (net delta) rather than the raw gross re-inserted count. Both the human-facing log and the machine-readable change event journal now use the net delta.

  • IncrementalStmts gains a countEdges prepared statement; RebuildResult gains edgesBefore; both are wired through buildDeletionResult, the fileNodeRow-null early return, and the main rebuild path.
  • parseReverseDeps now accumulates reverseDepsEdgesBefore (summed before each deleteOutgoingEdges call) and returns it up through runReverseDepCascaderebuildFile, which adds it to the target file's edgesBefore so the net delta is accurate even when a shared utility file is saved and triggers a reverse-dep cascade.
  • Both integration test fixtures (watcher-fk-embeddings, watcher-rebuild) receive the new countEdges statement so their makeStmts helpers remain in sync with the updated IncrementalStmts interface.

Confidence Score: 5/5

Safe to merge — the fix is logically complete across all rebuild paths including deletion, early returns, and reverse-dep cascades.

The edge-counting logic correctly captures baselines before each deletion step (purgeFileData for the target file, deleteOutgoingEdges for each reverse dep), sums them into totalEdgesBefore, and both the log and change event journal now report the true net delta. All four changed files are narrow and coherent. The previous reviewer concern about reverse-dep cascade inflation is addressed in commit b5bb5d1 and the fix is present in this diff.

No files require special attention.

Important Files Changed

Filename Overview
src/domain/graph/builder/incremental.ts Core fix: adds countEdges to IncrementalStmts, captures edgesBefore for the target file and each reverse dep before deletion, and threads totalEdgesBefore through RebuildResult. Logic is sound across all code paths (deletion, fileNodeRow-null early return, and normal rebuild with cascade).
src/domain/graph/watcher.ts Watcher correctly updated: countEdges statement added, local RebuildResult gains edgesBefore, both logRebuildResults and buildChangeEvent now use the net delta (edgesAdded - edgesBefore). Addresses both the log and the change event journal.
tests/integration/watcher-fk-embeddings.test.ts Test fixture makeStmts updated to include countEdges prepared statement, keeping it in sync with the updated IncrementalStmts interface.
tests/integration/watcher-rebuild.test.ts Test fixture makeStmts updated to include countEdges prepared statement, keeping it in sync with the updated IncrementalStmts interface.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[rebuildFile called] --> B[countEdges for target file → edgesBefore]
    B --> C[findReverseDeps]
    C --> D[purgeFileData — deletes target edges & nodes]
    D --> E{file still exists?}
    E -- No --> F[buildDeletionResult\nedgesAdded=0, edgesBefore=edgesBefore\nnet delta = -edgesBefore]
    E -- Yes --> G[parseFileIncremental]
    G --> H{fileNodeRow exists?}
    H -- No --> I[early return\nedgesAdded=0, edgesBefore=edgesBefore]
    H -- Yes --> J[rebuildEdgesForTargetFile → edgesAdded]
    J --> K[runReverseDepCascade]
    K --> L[parseReverseDeps loop\nfor each dep: countEdges → reverseDepsEdgesBefore\nthen deleteOutgoingEdges]
    L --> M[rebuild dep edges → cascadeEdges]
    M --> N[totalEdgesBefore = edgesBefore + reverseDepsEdgesBefore\ntotal edgesAdded = edgesAdded + cascadeEdges]
    N --> O[RebuildResult\nnet delta = edgesAdded − totalEdgesBefore]
    O --> P[logRebuildResults: edgeDelta display]
    O --> Q[buildChangeEvent: edgesAdded = net delta]
Loading

Reviews (5): Last reviewed commit: "Merge branch 'main' into fix/watch-net-e..." | Re-trigger Greptile

Comment thread src/domain/graph/builder/incremental.ts Outdated
Comment on lines +649 to +693
@@ -519,18 +660,7 @@ export async function rebuildFile(

if (!fs.existsSync(filePath)) {
if (cache) (cache as { remove(p: string): void }).remove(filePath);
const symbolDiff = diffSymbols ? diffSymbols(oldSymbols, []) : null;
return {
file: relPath,
nodesAdded: 0,
nodesRemoved: oldNodes,
edgesAdded: 0,
deleted: true,
event: 'deleted',
symbolDiff,
nodesBefore: oldNodes,
nodesAfter: 0,
};
return buildDeletionResult(relPath, oldNodes, edgesBefore, oldSymbols, diffSymbols);
}

let code: string;
@@ -551,47 +681,16 @@ export async function rebuildFile(

const fileNodeRow = stmts.getNodeId.get(relPath, 'file', relPath, 0);
if (!fileNodeRow)
return { file: relPath, nodesAdded: newNodes, nodesRemoved: oldNodes, edgesAdded: 0 };

const aliases: PathAliases = { baseUrl: null, paths: {} };

let edgesAdded = buildContainmentEdges(db, stmts, relPath, symbols);
edgesAdded += rebuildDirContainment(db, stmts, relPath);
edgesAdded += buildImportEdges(stmts, relPath, symbols, rootDir, fileNodeRow.id, aliases, db);
const importedNames = buildImportedNamesMap(symbols, rootDir, relPath, aliases);
edgesAdded += buildCallEdges(stmts, relPath, symbols, fileNodeRow, importedNames);
return {
file: relPath,
nodesAdded: newNodes,
nodesRemoved: oldNodes,
edgesAdded: 0,
edgesBefore,
};

// Cascade: rebuild outgoing edges for reverse-dep files.
// Two-pass approach: first rebuild direct edges (creating reexports edges for barrels),
// then add barrel import edges (which need reexports edges to exist for resolution).
const depSymbols = new Map<string, ExtractorOutput>();
for (const depRelPath of reverseDeps) {
const symbols_ = await parseReverseDep(rootDir, depRelPath, engineOpts, cache);
if (symbols_) {
deleteOutgoingEdges(db, depRelPath);
depSymbols.set(depRelPath, symbols_);
}
}
// Pass 1: direct edges only (no barrel resolution) — creates reexports edges
for (const [depRelPath, symbols_] of depSymbols) {
edgesAdded += rebuildReverseDepEdges(db, rootDir, depRelPath, symbols_, stmts, true);
}
// Pass 2: add barrel import edges (reexports edges now exist)
for (const [depRelPath, symbols_] of depSymbols) {
const fileNodeRow_ = stmts.getNodeId.get(depRelPath, 'file', depRelPath, 0);
if (!fileNodeRow_) continue;
const aliases_: PathAliases = { baseUrl: null, paths: {} };
for (const imp of symbols_.imports) {
if (imp.reexport) continue;
const resolvedPath = resolveImportPath(
path.join(rootDir, depRelPath),
imp.source,
rootDir,
aliases_,
);
edgesAdded += resolveBarrelImportEdges(db, stmts, fileNodeRow_.id, resolvedPath, imp);
}
}
let edgesAdded = rebuildEdgesForTargetFile(db, stmts, relPath, symbols, fileNodeRow, rootDir);
edgesAdded += await runReverseDepCascade(db, rootDir, reverseDeps, stmts, engineOpts, cache);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Reverse-dep cascade inflates edgesAdded vs edgesBefore

edgesBefore counts only outgoing edges from the target file, but edgesAdded is the sum of target-file edges and all re-inserted reverse-dep edges (via runReverseDepCascade). Each reverse dep has its old outgoing edges deleted by deleteOutgoingEdges then re-inserted, so their count lands entirely in edgesAdded with no corresponding offset in edgesBefore.

Concretely, editing a shared utility file (e.g. utils.ts) imported by 3 other files where each dep has 5 edges: edgesBefore = 10, edgesAdded = 10 + 15 = 25, reported delta = +15 — not +0 as the test plan states. The "+0 for a no-op save" guarantee only holds for leaf files (no reverse deps).

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — parseReverseDeps now counts each reverse dep's outgoing edges (via stmts.countEdges) before calling deleteOutgoingEdges, and returns the total as reverseDepsEdgesBefore. runReverseDepCascade now returns { edgesAdded, reverseDepsEdgesBefore } and rebuildFile adds reverseDepsEdgesBefore to edgesBefore so the net delta (edgesAdded - edgesBefore) is correct for files with any number of reverse deps. Commit b5bb5d1.

…delta (#1245)

- parseReverseDeps now counts each dep's outgoing edges before deleting
  them and returns the total as reverseDepsEdgesBefore
- runReverseDepCascade returns { edgesAdded, reverseDepsEdgesBefore }
- rebuildFile adds reverseDepsEdgesBefore to edgesBefore so the net
  delta (edgesAdded - edgesBefore) is correct even when the cascade
  re-inserts unchanged edges for reverse deps
- buildChangeEvent receives the net delta instead of the gross count,
  keeping the change journal consistent with the log output
@carlos-alm

Copy link
Copy Markdown
Contributor Author

Addressed Greptile's review feedback:

Issue 1 — reverse-dep cascade inflates edgesAdded vs edgesBefore (inline comment on incremental.ts:693):

  • parseReverseDeps now queries stmts.countEdges for each dep before calling deleteOutgoingEdges, accumulating the total in reverseDepsEdgesBefore
  • runReverseDepCascade returns { edgesAdded, reverseDepsEdgesBefore }
  • rebuildFile computes totalEdgesBefore = edgesBefore + reverseDepsEdgesBefore, so the net delta is correct for shared files with any number of reverse deps

Issue 2 — edgesAdded in change event vs net delta in log (review body comment on watcher.ts:110-116):

  • buildChangeEvent now receives r.edgesAdded - r.edgesBefore (the net delta) instead of the gross count, keeping the change journal consistent with the log output

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

@github-actions

github-actions Bot commented May 29, 2026

Copy link
Copy Markdown
Contributor

Codegraph Impact Analysis

10 functions changed9 callers affected across 3 files

  • IncrementalStmts.countEdges in src/domain/graph/builder/incremental.ts:33 (0 transitive callers)
  • RebuildResult.edgesBefore in src/domain/graph/builder/incremental.ts:44 (0 transitive callers)
  • buildDeletionResult in src/domain/graph/builder/incremental.ts:517 (3 transitive callers)
  • parseReverseDeps in src/domain/graph/builder/incremental.ts:564 (3 transitive callers)
  • runReverseDepCascade in src/domain/graph/builder/incremental.ts:622 (3 transitive callers)
  • rebuildFile in src/domain/graph/builder/incremental.ts:652 (4 transitive callers)
  • prepareWatcherStatements in src/domain/graph/watcher.ts:18 (3 transitive callers)
  • RebuildResult.edgesBefore in src/domain/graph/watcher.ts:58 (0 transitive callers)
  • writeJournalAndChangeEvents in src/domain/graph/watcher.ts:99 (4 transitive callers)
  • logRebuildResults in src/domain/graph/watcher.ts:125 (4 transitive callers)

@carlos-alm carlos-alm merged commit 2a0b92d into main May 29, 2026
21 checks passed
@carlos-alm carlos-alm deleted the fix/watch-net-edge-delta branch May 29, 2026 23:14
@github-actions github-actions Bot locked and limited conversation to collaborators May 29, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(watch): report net edge delta (not gross re-inserted count) in rebuild log

1 participant