fix(watch): report net edge delta not gross re-inserted count#1245
Conversation
… table The native-orchestrator drop warning lived in a single wall-of-text WARN line that grew unreadable when 30+ extensions were dropped at once (easy to trigger via journal-vs-fresh-build collisions). Make the per-extension breakdown scan like a table: header line keeps the count and now also reports the extension total; each extension occupies its own indented row with a right-aligned count column. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…-drop-summary-tabular
… collection Adds shared utilities to src/extractors/helpers.ts in preparation for adoption across language extractors (phase 2): - nodeStartLine: companion to nodeEndLine for the ~108 hand-rolled startPosition.row + 1 literals scattered across extractors - findFirstChildOfTypes: find first child matching any of N types (useful for grammar variants like string vs string_literal) - iterChildren / PUNCTUATION_TOKENS: generator-based child iteration with punctuation skipping, used in elixir/gleam destructuring walks - pushCall / pushImport: centralise Call/Import construction so line derivation stays consistent across extractors - extractSimpleParameters / resolveParamName: uniform parameter extraction with optional type-map sink — collapses boilerplate in the ~16 per-language extractParams helpers Phase 1 of the TS extractor refactor plan (sync.json clusters 1). Additive only — no consumer adoption yet; existing helpers and extractor behaviour unchanged. Consumers updated in phase 2. docs check acknowledged: internal refactor, no doc updates needed.
Phase 2 of the TS extractor refactor plan (sync.json cluster 1). Adopts the helpers extended in 9c8be55 (nodeStartLine, findFirstChildOfTypes, pushCall, pushImport, extractSimpleParameters, stripQuotes) across six language extractors: - r.ts: drop local stripQuotes; use shared stripQuotes/pushCall/ pushImport/findFirstChildOfTypes/nodeStartLine - gleam.ts: use pushCall/pushImport/findFirstChildOfTypes/nodeStartLine; extract pushConstructor helper for the dual-branch data-constructor walk - julia.ts: use pushCall/pushImport/nodeStartLine; collapse Julia param wrapper-type branches via JULIA_PARAM_WRAPPER_TYPES set - java.ts: use pushCall/pushImport/nodeStartLine; collapse extractJavaParameters via extractSimpleParameters with typeMap sink; extract resolveJavaTypeText for the generic_type unwrap pattern - gleam.ts and solidity.ts: extract qualifyWithParent helper in solidity to collapse 6 duplicated `parent ? \`\${parent}.\${name}\` : name` blocks - solidity.ts: use pushCall/pushImport/findFirstChildOfTypes/ nodeStartLine; collapse extractSolParams via extractSimpleParameters - javascript.ts: bulk-replace 43 inline `XXX.startPosition.row + 1` literals with nodeStartLine() calls; replace one stray endPosition literal with nodeEndLine Net -65 lines. No behaviour changes — only call-site collapsing onto the shared helpers (semantics verified by careful inspection of each replacement; pushImport's empty-names fallback matches the previous ad-hoc defaults in each extractor). docs check acknowledged: internal refactor, no doc updates needed.
Convert collectElixirParamIdentifiers from mutual-recursion with collectElixirMapBindings into a single iterative worklist traversal. Map/list/tuple/binary-operator dispatch is now done via three leaf helpers that push child nodes onto the worklist instead of calling back into the main function. This removes the function-level cycle flagged by codegraph (9 -> 8 cycles) without changing extractor semantics. docs check acknowledged: internal refactor only.
Phase 5 of the Rust extractor refactor plan (sync.json cluster 2). Adopts the helpers extended in 0d687c4 (push_call, push_simple_call, push_import, push_type_map_entry, extract_simple_parameters, match_c_family_type_map) across eight language extractors: - cpp.rs: collapse match_cpp_type_map to a one-line delegate of match_c_family_type_map; use push_import/push_simple_call/push_call for include and call sites - cuda.rs: same delegation as cpp.rs; use push_import/push_simple_call/ push_call across include and call_expression handlers - java.rs: use push_type_map_entry for local-variable / formal-parameter bindings; use push_call/push_simple_call for method invocation and object creation; collapse extract_java_parameters to a one-shot extract_simple_parameters call; use push_import for import declaration - javascript.rs: use push_simple_call for new_expression identifier branch; use push_type_map_entry for the confidence-0.9 type entries - julia.rs: use push_simple_call/push_call across identifier and field_expression / scoped_identifier call branches - objc.rs: use push_import for at_import; use push_call for c-call and message-expression handlers (drops redundant is_empty guards) - r_lang.rs: use push_simple_call/push_call across identifier and namespace_operator call branches; use push_import for library/source - solidity.rs: use push_call (drops redundant guard) for call sites; collapse extract_sol_params to a one-shot extract_simple_parameters Net: -207 lines across 8 files, no behavior change. cargo check clean, 324 rust unit tests pass. Pre-existing test failure: tests/engines/parity.test.ts has two failing elixir cases unrelated to this commit (filed as #1227 — regression from commit 5abe6ad in Phase 3).
Convert collect_elixir_param_identifiers from mutual-recursion with collect_elixir_map_bindings into a single iterative worklist traversal. Map/list/tuple/binary-operator dispatch is now done via three leaf helpers (push_elixir_sequence_items, push_elixir_map_values, push_elixir_binary_operator_operands) that push child nodes onto the worklist instead of calling back into the main function. This removes the function-level cycle flagged by codegraph (8 -> 7 cycles) and mirrors the TS refactor in 5abe6ad without changing extractor semantics. docs check acknowledged: internal refactor only.
…lection strategy
Extract the native-orchestrator path out of pipeline.ts into two new stage
modules:
- stages/native-orchestrator.ts — tryNativeOrchestrator + post-native
structure/analysis fallback + dropped-language detection/backfill.
- stages/native-db-lifecycle.ts — shared rusqlite connection helpers
(closeNativeDb, reopenNativeDb, suspendNativeDb, refreshJsDb).
This breaks the function-level cycle 'buildGraph <-> tryNativeOrchestrator'
caused by codegraph's name-based resolver conflating the local buildGraph
function with the ctx.nativeDb.buildGraph() method call. Once the
orchestrator lives in its own file, there is no longer a local buildGraph
in scope to collide with the method invocation.
Function-level cycles: 9 -> 5. No file-level cycle introduced (still 1,
unchanged — pre-existing MCP cycle). pipeline.ts shrinks from 1404 to 465
lines and now reads as a thin top-level controller: detect changes, try
native, fall back to JS stages.
computeWasmOnlyStaleFiles is re-exported from pipeline.ts so existing
unit tests (tests/builder/wasm-only-stale-files.test.ts) keep working
without changes.
docs check acknowledged — no doc-relevant changes (internal helper extraction).
docs check acknowledged - Rust internal helper extraction, no user-facing changes
…impact and dependencies
Split high-cognitive-complexity functions in the analysis domain into focused
helpers. Worst functions per gauntlet (cog/cyc/maxNesting/halstead) are now
below thresholds.
module-map.ts (statsData cog=31 -> below threshold):
- Extract buildStatsFromNative and buildStatsFromJs branches
- Share false-positive query and quality-score helpers between paths
- aggregateRolesFromNative pulls duplicated role-aggregation code out
fn-impact.ts (bfsTransitiveCallers cog=37 -> below threshold,
impactAnalysisData cog=27 -> below threshold):
- Extract recordCaller, processFrontierNode, seedInterfaceImplementors
- Extract bfsImportDependents and groupDependentsByLevel
dependencies.ts (bfsShortestPath cog=29, bfsFilePath cog=30,
buildTransitiveCallers cog=24 -> all below threshold):
- Extract buildNextCallerFrontier from buildTransitiveCallers
- Extract buildNeighborStmt + visitNeighbor; state collected in struct
- Extract visitFileNeighbor + reconstructFilePath
docs check acknowledged - internal helper extraction, no user-facing changes
…ic and hybrid search
…, structure-query, and owners Internal refactor — no public API or behaviour change, so docs check acknowledged. - complexity.ts: split collectNativeBulkRows (cog=70) into classify/build/collect-file helpers; extract classifyHalsteadToken + summarizeHalsteadCounts from computeHalsteadMetrics. - structure.ts: merge classifyNodeRolesFull/Incremental DRY via shared buildActiveFilesSet + buildClassifierInput helpers. - graph-enrichment.ts: decompose prepareFileLevelData (cog=32, cyc=26) into loadFileLevelEdges, computeFileFanCounts, detectFileCommunities, buildFileVisNode, selectFileSeedNodes. - structure-query.ts: split hotspotsData (cog=34, sloc=102) using a strategy pattern (HOTSPOT_ORDER_BY) and mapNative/JsHotspotRow helpers. - owners.ts: split ownersData (sloc=158, bugs=1.55) into loadFilteredFiles, buildOwnerIndex, loadSymbolsForFiles, computeOwnerBoundaries, buildOwnersSummary.
Internal refactor — public APIs unchanged. docs check acknowledged.
…ir pushElixirSequenceItems Replaces the inline childCount loop with the shared iterChildren generator configured with PUNCTUATION_TOKENS, completing phase 1 of the TS extractor refactor plan (sync.json cluster 1). Behaviour preserved — same nodes are pushed onto the worklist, just via the shared helper. docs check acknowledged: internal refactor, no doc updates needed.
Wire forge phase 4 helpers into their consumers: - find_first_child_of_types: collapse find_child(x, A).or_else(|| find_child(x, B)) in fsharp.rs handle_application - iter_children + PUNCTUATION_TOKENS: replace inline punctuation-skip loop in javascript.rs first_arg_is_string_literal Closes 3 dead-ffi helpers extracted by forge phase 4. Semantically identical.
…ap helpers
pushElixirSequenceItems and pushElixirMapValues were pushing items in
forward order onto the LIFO worklist stack, causing tuple/list/map
parameters to be emitted in reverse source order (e.g. {x, _y} → ['_y',
'x'] instead of ['x', '_y']). The fix collects items then pushes them in
reverse so the LIFO pop restores source order, matching the native engine.
Track outgoing edge count before purge (edgesBefore) and subtract from edgesAdded so the rebuild log shows the real change — e.g. +0 edges when nothing changed instead of +10 for every trivial save.
Greptile SummaryThis PR fixes the edge-delta reporting in the incremental watcher by capturing outgoing edge counts before
Confidence Score: 5/5Safe to merge — the fix is logically complete across all rebuild paths including deletion, early returns, and reverse-dep cascades. The edge-counting logic correctly captures baselines before each deletion step (purgeFileData for the target file, deleteOutgoingEdges for each reverse dep), sums them into totalEdgesBefore, and both the log and change event journal now report the true net delta. All four changed files are narrow and coherent. The previous reviewer concern about reverse-dep cascade inflation is addressed in commit b5bb5d1 and the fix is present in this diff. No files require special attention. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[rebuildFile called] --> B[countEdges for target file → edgesBefore]
B --> C[findReverseDeps]
C --> D[purgeFileData — deletes target edges & nodes]
D --> E{file still exists?}
E -- No --> F[buildDeletionResult\nedgesAdded=0, edgesBefore=edgesBefore\nnet delta = -edgesBefore]
E -- Yes --> G[parseFileIncremental]
G --> H{fileNodeRow exists?}
H -- No --> I[early return\nedgesAdded=0, edgesBefore=edgesBefore]
H -- Yes --> J[rebuildEdgesForTargetFile → edgesAdded]
J --> K[runReverseDepCascade]
K --> L[parseReverseDeps loop\nfor each dep: countEdges → reverseDepsEdgesBefore\nthen deleteOutgoingEdges]
L --> M[rebuild dep edges → cascadeEdges]
M --> N[totalEdgesBefore = edgesBefore + reverseDepsEdgesBefore\ntotal edgesAdded = edgesAdded + cascadeEdges]
N --> O[RebuildResult\nnet delta = edgesAdded − totalEdgesBefore]
O --> P[logRebuildResults: edgeDelta display]
O --> Q[buildChangeEvent: edgesAdded = net delta]
Reviews (5): Last reviewed commit: "Merge branch 'main' into fix/watch-net-e..." | Re-trigger Greptile |
| @@ -519,18 +660,7 @@ export async function rebuildFile( | |||
|
|
|||
| if (!fs.existsSync(filePath)) { | |||
| if (cache) (cache as { remove(p: string): void }).remove(filePath); | |||
| const symbolDiff = diffSymbols ? diffSymbols(oldSymbols, []) : null; | |||
| return { | |||
| file: relPath, | |||
| nodesAdded: 0, | |||
| nodesRemoved: oldNodes, | |||
| edgesAdded: 0, | |||
| deleted: true, | |||
| event: 'deleted', | |||
| symbolDiff, | |||
| nodesBefore: oldNodes, | |||
| nodesAfter: 0, | |||
| }; | |||
| return buildDeletionResult(relPath, oldNodes, edgesBefore, oldSymbols, diffSymbols); | |||
| } | |||
|
|
|||
| let code: string; | |||
| @@ -551,47 +681,16 @@ export async function rebuildFile( | |||
|
|
|||
| const fileNodeRow = stmts.getNodeId.get(relPath, 'file', relPath, 0); | |||
| if (!fileNodeRow) | |||
| return { file: relPath, nodesAdded: newNodes, nodesRemoved: oldNodes, edgesAdded: 0 }; | |||
|
|
|||
| const aliases: PathAliases = { baseUrl: null, paths: {} }; | |||
|
|
|||
| let edgesAdded = buildContainmentEdges(db, stmts, relPath, symbols); | |||
| edgesAdded += rebuildDirContainment(db, stmts, relPath); | |||
| edgesAdded += buildImportEdges(stmts, relPath, symbols, rootDir, fileNodeRow.id, aliases, db); | |||
| const importedNames = buildImportedNamesMap(symbols, rootDir, relPath, aliases); | |||
| edgesAdded += buildCallEdges(stmts, relPath, symbols, fileNodeRow, importedNames); | |||
| return { | |||
| file: relPath, | |||
| nodesAdded: newNodes, | |||
| nodesRemoved: oldNodes, | |||
| edgesAdded: 0, | |||
| edgesBefore, | |||
| }; | |||
|
|
|||
| // Cascade: rebuild outgoing edges for reverse-dep files. | |||
| // Two-pass approach: first rebuild direct edges (creating reexports edges for barrels), | |||
| // then add barrel import edges (which need reexports edges to exist for resolution). | |||
| const depSymbols = new Map<string, ExtractorOutput>(); | |||
| for (const depRelPath of reverseDeps) { | |||
| const symbols_ = await parseReverseDep(rootDir, depRelPath, engineOpts, cache); | |||
| if (symbols_) { | |||
| deleteOutgoingEdges(db, depRelPath); | |||
| depSymbols.set(depRelPath, symbols_); | |||
| } | |||
| } | |||
| // Pass 1: direct edges only (no barrel resolution) — creates reexports edges | |||
| for (const [depRelPath, symbols_] of depSymbols) { | |||
| edgesAdded += rebuildReverseDepEdges(db, rootDir, depRelPath, symbols_, stmts, true); | |||
| } | |||
| // Pass 2: add barrel import edges (reexports edges now exist) | |||
| for (const [depRelPath, symbols_] of depSymbols) { | |||
| const fileNodeRow_ = stmts.getNodeId.get(depRelPath, 'file', depRelPath, 0); | |||
| if (!fileNodeRow_) continue; | |||
| const aliases_: PathAliases = { baseUrl: null, paths: {} }; | |||
| for (const imp of symbols_.imports) { | |||
| if (imp.reexport) continue; | |||
| const resolvedPath = resolveImportPath( | |||
| path.join(rootDir, depRelPath), | |||
| imp.source, | |||
| rootDir, | |||
| aliases_, | |||
| ); | |||
| edgesAdded += resolveBarrelImportEdges(db, stmts, fileNodeRow_.id, resolvedPath, imp); | |||
| } | |||
| } | |||
| let edgesAdded = rebuildEdgesForTargetFile(db, stmts, relPath, symbols, fileNodeRow, rootDir); | |||
| edgesAdded += await runReverseDepCascade(db, rootDir, reverseDeps, stmts, engineOpts, cache); | |||
There was a problem hiding this comment.
Reverse-dep cascade inflates
edgesAdded vs edgesBefore
edgesBefore counts only outgoing edges from the target file, but edgesAdded is the sum of target-file edges and all re-inserted reverse-dep edges (via runReverseDepCascade). Each reverse dep has its old outgoing edges deleted by deleteOutgoingEdges then re-inserted, so their count lands entirely in edgesAdded with no corresponding offset in edgesBefore.
Concretely, editing a shared utility file (e.g. utils.ts) imported by 3 other files where each dep has 5 edges: edgesBefore = 10, edgesAdded = 10 + 15 = 25, reported delta = +15 — not +0 as the test plan states. The "+0 for a no-op save" guarantee only holds for leaf files (no reverse deps).
There was a problem hiding this comment.
Fixed — parseReverseDeps now counts each reverse dep's outgoing edges (via stmts.countEdges) before calling deleteOutgoingEdges, and returns the total as reverseDepsEdgesBefore. runReverseDepCascade now returns { edgesAdded, reverseDepsEdgesBefore } and rebuildFile adds reverseDepsEdgesBefore to edgesBefore so the net delta (edgesAdded - edgesBefore) is correct for files with any number of reverse deps. Commit b5bb5d1.
…delta (#1245) - parseReverseDeps now counts each dep's outgoing edges before deleting them and returns the total as reverseDepsEdgesBefore - runReverseDepCascade returns { edgesAdded, reverseDepsEdgesBefore } - rebuildFile adds reverseDepsEdgesBefore to edgesBefore so the net delta (edgesAdded - edgesBefore) is correct even when the cascade re-inserts unchanged edges for reverse deps - buildChangeEvent receives the net delta instead of the gross count, keeping the change journal consistent with the log output
|
Addressed Greptile's review feedback: Issue 1 — reverse-dep cascade inflates
Issue 2 —
|
Codegraph Impact Analysis10 functions changed → 9 callers affected across 3 files
|
Summary
purgeFileData(edgesBeforeinIncrementalStmts+RebuildResult)logRebuildResultsnow showsedgesAdded - edgesBeforeso a no-op save reports+0 edgesinstead of e.g.+10 edges-N edgesTest plan
watcher-fk-embeddings,watcher-rebuild)codegraph watch, edit a file without changing imports/calls → log shows+0 edgesCloses #1241