
feat: type analysis modules, parser WASM maps, extract canStripTypes (#554 follow-up) #558

Open
carlos-alm wants to merge 21 commits into main from feat/ts-deferred-followups

Conversation

@carlos-alm
Contributor

Summary

  • Type db: any → BetterSqlite3.Database across all 9 domain/analysis/*.ts modules, with NodeRow, RelatedNodeRow, and other types from types.ts replacing untyped parameters
  • Type WASM parser maps in domain/parser.ts using web-tree-sitter exported types (Parser, Language, Query, Tree) with a new LanguageRegistryEntry interface
  • Extract canStripTypes Node version probe to tests/helpers/node-version.js shared helper (ready for import when CLI test guards land)

All three items were deferred as follow-ups acknowledged in PR #554 comments.
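For illustration, the typed caches from the second bullet can be sketched as follows. Parser, Language, Query, and LanguageRegistryEntry are stand-in shapes here: the real code imports the first three from web-tree-sitter and defines the interface in domain/parser.ts, so the field names below are assumptions.

```typescript
// Stand-in shapes; the real types come from web-tree-sitter.
interface Parser { parse(source: string): unknown }
interface Language { readonly name: string }
interface Query { readonly source: string }

// Illustrative fields only; the real LanguageRegistryEntry lives in domain/parser.ts.
interface LanguageRegistryEntry {
  wasmPath: string;
  queryPath?: string;
}

// Cache maps typed as described in the summary above.
const _cachedParsers = new Map<string, Parser | null>();
const _cachedLanguages = new Map<string, Language>();
const _queryCache = new Map<string, Query>();
const registry = new Map<string, LanguageRegistryEntry>();

registry.set('typescript', { wasmPath: 'tree-sitter-typescript.wasm' });
_cachedParsers.set('typescript', null); // null marks a language whose parser failed to load
```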

Test plan

  • npx tsc --noEmit — zero type errors
  • npm run lint — zero errors (only pre-existing style warnings)
  • npx vitest run — 2002 tests pass, 107 test files pass (3 pre-existing CLI spawn failures unchanged)
  • Verify no runtime regressions in CI

…pTypes helper (#554)

Address deferred review feedback from PR #554:
- Convert all 9 domain/analysis/*.js → .ts with BetterSqlite3.Database
  and NodeRow type annotations replacing untyped db/node parameters
- Type WASM parser cache maps (Parser, Language, Query) in domain/parser.ts
  with LanguageRegistryEntry interface and full function signatures
- Extract canStripTypes Node version probe to tests/helpers/node-version.js
  shared helper (ready for import when CLI test TS guards land)
@claude

claude bot commented Mar 21, 2026

Claude finished @carlos-alm's task


Code Review Analysis

Conducting comprehensive 5-phase review for PR #558...

Todo List

  • Phase 1: Root cause analysis - understand the problem being solved
  • Phase 2: Configuration change detection - check for high-risk changes
  • Phase 3: Test integrity verification - ensure no weakened assertions
  • Phase 4: Code quality analysis - review TypeScript typing improvements
  • Phase 5: Backlog compliance evaluation - rate against foundation principles
  • Final assessment and recommendation

Root Cause Analysis

Problem Being Solved: This PR addresses deferred TypeScript typing improvements from PR #554. It converts 9 analysis modules from .js to .ts with proper type annotations, particularly replacing db: any with BetterSqlite3.Database and using proper row types from types.ts.

Why This Approach: The approach is sound - these are legitimate typing improvements that:

  1. Replace untyped any parameters with proper BetterSqlite3.Database types
  2. Use well-defined row interfaces (NodeRow, RelatedNodeRow, etc.) from types.ts
  3. Add proper WASM parser types using web-tree-sitter exports
  4. Extract a shared helper to reduce code duplication

Risk Assessment: Very low risk. This is purely additive typing with no runtime behavior changes.
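As a concrete (hedged) illustration of the conversion described above — Database is a minimal stand-in here so the snippet is self-contained; the real signatures use BetterSqlite3.Database and the NodeRow interface from types.ts, and getNodeById is a hypothetical function, not one from the PR:

```typescript
// Minimal stand-ins; real code uses better-sqlite3 types and interfaces from types.ts.
interface Database {
  prepare(sql: string): { get(...params: unknown[]): unknown };
}
interface NodeRow { id: number; name: string; kind: string; file: string; line: number }

// Before: function getNodeById(db: any, id: any): any
function getNodeById(db: Database, id: number): NodeRow | undefined {
  // The row shape is now asserted once here instead of flowing through as `any`.
  return db.prepare('SELECT * FROM nodes WHERE id = ?').get(id) as NodeRow | undefined;
}
```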

Code Quality Assessment

Excellent TypeScript Migration:

  • Proper database typing: All db: any → BetterSqlite3.Database conversions are correct
  • Row type usage: Consistent use of NodeRow, RelatedNodeRow, ChildNodeRow from types.ts
  • WASM parser types: Good use of web-tree-sitter exports (Parser, Language, Query, Tree)
  • Interface design: New LanguageRegistryEntry interface is well-structured
  • Helper extraction: canStripTypes helper properly extracted to shared location

Type Safety Improvements:

  • All SQL query results properly typed with database row interfaces
  • Better-sqlite3 import type correctly used for database parameters
  • WASM type imports from web-tree-sitter are appropriate
  • Consistent biome-ignore comments for legitimate any usage (dynamic config shapes, native addon)

Code Quality: High quality conversion that maintains all existing functionality while adding proper type safety. No runtime changes detected.

Backlog Compliance

  • Zero-dep: ✅ — No new runtime dependencies added
  • Foundation-aligned: ✅ — Fully compliant with all principles, improves type safety without affecting core architecture
  • Problem-fit: 4/5 — Improves developer experience and reduces potential type-related bugs, which helps AI agents work with the codebase more reliably
  • Breaking: No — Purely additive typing, no API changes
  • Tier: 1 (zero-dep ✓ + foundation-aligned ✓)

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: Excellent TypeScript conversion that improves type safety without introducing risk. Clean implementation following existing patterns, proper use of defined interfaces, and good documentation. Only minor deduction because this is primarily maintenance work rather than feature development.

@greptile-apps
Contributor

greptile-apps bot commented Mar 21, 2026

Greptile Summary

This PR completes three follow-up items deferred from #554: typing db: any → BetterSqlite3.Database across all nine domain/analysis/*.ts modules, typing the WASM parser maps in domain/parser.ts using web-tree-sitter exported types, and extracting the canStripTypes Node-version probe to tests/helpers/node-version.js.

Key changes:

  • All nine analysis modules now use BetterSqlite3.Database for the DB handle and typed row interfaces (NodeRow, RelatedNodeRow, ImportEdgeRow, etc.) from types.ts in place of any casts and biome-ignore suppressions.
  • domain/parser.ts maps (_cachedParsers, _cachedLanguages, _queryCache) are typed as Map<string, Parser | null>, Map<string, Language>, and Map<string, Query> respectively; LanguageRegistryEntry is now defined and exported locally rather than imported from types.ts.
  • TS_BACKFILL_EXTS promoted to a single module-level Set constant, eliminating three duplicate inline constructions across parseFileAuto, parseFilesAuto, and parseFileIncremental.
  • vendor.d.ts gains raw(toggle?: boolean): this on Statement, removing the last as any cast in exports.ts.
  • New tests/helpers/node-version.js exports canStripTypes for future CLI spawn test guards.
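The TS_BACKFILL_EXTS change is easy to picture; a rough sketch follows, where needsTypeBackfill is a hypothetical caller, not a function from the PR:

```typescript
// One module-level constant instead of three inline `new Set([...])` constructions.
const TS_BACKFILL_EXTS = new Set(['.ts', '.tsx']);

// Hypothetical helper showing how a call site would consult the shared constant.
function needsTypeBackfill(filePath: string): boolean {
  const dot = filePath.lastIndexOf('.');
  return dot !== -1 && TS_BACKFILL_EXTS.has(filePath.slice(dot));
}
```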

Issues found:

  • findAffectedFunctions in impact.ts still calls db.prepare() on every iteration of the changedRanges loop — the same anti-pattern that was explicitly fixed in buildTransitiveCallers, reconstructPath, exportsFileImpl, and explainFunctionImpl earlier in this same PR.
  • The ! non-null assertion on result.get(relPath) in parseFilesAuto was silently dropped when the map value type became any; failures are swallowed by the surrounding catch block, so a missing-entry bug would be invisible.

Confidence Score: 4/5

  • Safe to merge; one missed db.prepare() hoist in findAffectedFunctions and one dropped null-assertion are low-severity style issues with no runtime risk in normal operation.
  • The PR is a well-structured typing cleanup with all previously-flagged db.prepare()-in-loop issues addressed. The two remaining items (loop prepare in findAffectedFunctions, dropped ! in parseFilesAuto) are P2 style concerns — neither causes incorrect behavior in the current test suite, and both are consistent with patterns the author has already been fixing iteratively in this series.
  • Remaining touch points: src/domain/analysis/impact.ts (loop db.prepare()) and src/domain/parser.ts (dropped null-assertion on result.get(relPath)).

Important Files Changed

Filename Overview
src/db/repository/nodes.ts Adds getLineCountForNode and getMaxEndLineForFile helpers using the established cachedStmt WeakMap pattern, correctly extracted from inline SQL in context.ts.
src/domain/analysis/impact.ts Good db: any → BetterSqlite3.Database typing throughout, but findAffectedFunctions still calls db.prepare() on every loop iteration — the same anti-pattern fixed multiple times elsewhere in this PR.
src/domain/parser.ts WASM parser maps typed as Map<string, Parser | null>, Map<string, Language>, and Map<string, Query>; LanguageRegistryEntry now defined and exported locally.
src/domain/analysis/context.ts All db: any / node: any replaced with BetterSqlite3.Database / typed rows; cachedStmt WeakMap pattern applied to explainFunctionImpl; duplicate findCallers call eliminated; inline SQL replaced with new repo helpers.
src/domain/analysis/exports.ts All five db.prepare() calls hoisted before fileNodes.map(); schema probe cached via _hasExportedColCache WeakMap matching the _hasImplementsCache pattern in impact.ts; raw(true) now typed via vendor.d.ts.
src/domain/analysis/dependencies.ts upstreamStmt hoisted out of BFS loops; second SELECT id re-query eliminated via e.source_id; nodeByIdStmt hoisted in reconstructPath; all db: any replaced with typed counterparts.
src/vendor.d.ts Adds raw(toggle?: boolean): this to the Statement interface, removing the need for the as any cast in exports.ts.
tests/helpers/node-version.js New shared helper exporting canStripTypes (Node ≥ 22.6 probe). Clean implementation with no underscore-prefixed variables.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[diffImpactData] --> B[parseGitDiff]
    B --> C[findAffectedFunctions\ndb.prepare inside loop ⚠️]
    C --> D[buildFunctionImpactResults]
    D --> E[bfsTransitiveCallers]
    E --> F[findDistinctCallers\nfindImplementors]

    G[exportsData] --> H[exportsFileImpl]
    H --> I[exportedNodesStmt\nconsumersStmt\nreexportsFromStmt\nreexportsToStmt\nHoisted ✅]

    J[explainData] --> K[explainFunctionImpl]
    K --> L[cachedStmt _explainNodeStmtCache ✅]

    M[pathData] --> N[bfsShortestPath\nneighborStmt hoisted ✅]
    N --> O[reconstructPath\nnodeByIdStmt hoisted ✅]

    P[fnDepsData] --> Q[buildTransitiveCallers\nupstreamStmt hoisted ✅]

Last reviewed commit: "fix: hoist db.prepar..."

Comment on lines +5 to +6
const [_major, _minor] = process.versions.node.split('.').map(Number);
export const canStripTypes = _major > 22 || (_major === 22 && _minor >= 6);
Contributor


P2 Underscore-prefix on used variables is misleading

Both _major and _minor carry the underscore "potentially-unused" convention, yet both are referenced in the canStripTypes expression. Biome's noUnusedVariables rule silently skips identifiers whose names start with _, so future readers (and the linter) receive a false signal that these bindings are intentionally discarded.

Consider dropping the leading underscores so the intent is clear:

Suggested change
const [_major, _minor] = process.versions.node.split('.').map(Number);
export const canStripTypes = _major > 22 || (_major === 22 && _minor >= 6);
const [major, minor] = process.versions.node.split('.').map(Number);
export const canStripTypes = major > 22 || (major === 22 && minor >= 6);

Contributor Author


Fixed — dropped the underscore prefix from major and minor since both are referenced in the canStripTypes expression.

Comment on lines +116 to +121
try {
db.prepare('SELECT exported FROM nodes LIMIT 0').raw();
// biome-ignore lint/suspicious/noExplicitAny: Statement.raw() exists at runtime but is missing from type declarations
(db.prepare('SELECT exported FROM nodes LIMIT 0') as any).raw(true);
hasExportedCol = true;
} catch (e) {
debug(`exported column not available, using fallback: ${e.message}`);
} catch (e: unknown) {
debug(`exported column not available, using fallback: ${(e as Error).message}`);
Contributor


P2 as any cast for Statement.raw() may be unnecessary

The biome-ignore comment explains that Statement.raw() is "missing from type declarations," but current better-sqlite3 TypeScript typings (v9+) do declare Statement.raw(raw?: boolean): this. If the project's installed type version already includes this signature, the as any cast and its accompanying biome-ignore directive are both unnecessary — and the latter suppresses the Biome noExplicitAny warning for all code on that line.

It's worth checking whether dropping the cast resolves cleanly:

db.prepare('SELECT exported FROM nodes LIMIT 0').raw(true);

If tsc still complains, it confirms the local typings are outdated and the workaround is warranted.

Contributor Author


Fixed — added raw(toggle?: boolean): this to the Statement interface in vendor.d.ts, which allows us to remove the as any cast and the biome-ignore comment.
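A declaration along those lines might look like this (a sketch based on the comment above; the actual vendor.d.ts may organize its Statement typing differently, e.g. as a standalone interface rather than a module augmentation):

```typescript
// Sketch: teach the type system about better-sqlite3's runtime-only raw() toggle.
declare module 'better-sqlite3' {
  interface Statement {
    raw(toggle?: boolean): this;
  }
}
```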

@carlos-alm
Contributor Author

@greptileai

Comment on lines +468 to +477
relatedTests,
...implInfo,
};
});

const base = { name, results };
return paginateResult(base, 'results', { limit: opts.limit, offset: opts.offset });
} finally {
db.close();
}
Contributor


P2 Inline SQL bypasses the db/index.js abstraction layer

Two queries in explainFileImpl are issued directly on db rather than going through the functions in db/index.js:

const metric = db
  .prepare(`SELECT nm.line_count FROM node_metrics nm WHERE nm.node_id = ?`)
  .get(fn.id) as { line_count: number } | undefined;
...
const maxLine = db
  .prepare(`SELECT MAX(end_line) as max_end FROM nodes WHERE file = ?`)
  .get(fn.file) as { max_end: number | null } | undefined;

Keeping all SQL behind the DB layer makes future schema changes easier to track and ensures uniform error handling. If these queries are genuinely one-off, a brief comment explaining why they're inlined (e.g., "no existing helper for node_metrics") would make the intent clear to future readers.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Contributor Author


Fixed — extracted the two inline SQL queries into getLineCountForNode and getMaxEndLineForFile functions in db/repository/nodes.js, using the standard cachedStmt pattern. context.ts now imports and calls these instead of issuing raw SQL.

@carlos-alm
Contributor Author

Addressed all Greptile review feedback:

  1. Underscore prefix on used vars — dropped _major/_minor to major/minor in tests/helpers/node-version.js
  2. as any cast for Statement.raw() — added raw() to vendor.d.ts Statement interface, removed the cast and biome-ignore in exports.ts
  3. Inline SQL bypasses db abstraction — extracted getLineCountForNode and getMaxEndLineForFile into db/repository/nodes.js using cachedStmt pattern
  4. N+1 re-query in buildTransitiveCallers — callers already carry .id from findCallers; updated parameter type to accept id directly, eliminating per-caller SELECT
  5. customDbPath type inconsistency — aligned implementations.ts to string (was string | undefined)

All passing: zero type errors, zero lint errors, 546 integration tests green.

@carlos-alm
Contributor Author

@greptileai

Comment on lines +80 to +93
const upstream = db
.prepare(`
SELECT n.name, n.kind, n.file, n.line
FROM edges e JOIN nodes n ON e.source_id = n.id
WHERE e.target_id = ? AND e.kind = 'calls'
`)
.all(f.id) as Array<{ name: string; kind: string; file: string; line: number }>;
for (const u of upstream) {
if (noTests && isTestFile(u.file)) continue;
const uid = (
db
.prepare('SELECT id FROM nodes WHERE name = ? AND kind = ? AND file = ? AND line = ?')
.get(u.name, u.kind, u.file, u.line) as { id: number } | undefined
)?.id;
Contributor


P1 db.prepare() inside nested BFS loops forces redundant SQL compilation and an extra DB roundtrip per node

Two db.prepare() calls are issued on every inner-loop iteration during BFS traversal. better-sqlite3 compiles the query string on each call, so the first prepare runs O(depth × frontier) compilations instead of one. More critically, the first query omits n.id from its SELECT, forcing a second query (SELECT id FROM nodes WHERE name = ? AND kind = ? AND file = ? AND line = ?) for every upstream node just to obtain the ID — even though the ID is already available on the join side (e.source_id).

For comparison, bfsShortestPath in the same file handles this correctly: the statement is compiled once before the loops, and n.id is included in the single-query SELECT.

Fix: hoist both statements out of the loops and add n.id to the first SELECT to eliminate the second lookup entirely:

// Hoist before the loop
const upstreamStmt = db.prepare(`
  SELECT n.id, n.name, n.kind, n.file, n.line
  FROM edges e JOIN nodes n ON e.source_id = n.id
  WHERE e.target_id = ? AND e.kind = 'calls'
`);

for (let d = 2; d <= depth; d++) {
  const nextFrontier: typeof frontier = [];
  for (const f of frontier) {
    if (visited.has(f.id)) continue;
    visited.add(f.id);
    const upstream = upstreamStmt.all(f.id) as Array<{ id: number; name: string; kind: string; file: string; line: number }>;
    for (const u of upstream) {
      if (noTests && isTestFile(u.file)) continue;
      if (!visited.has(u.id)) {
        nextFrontier.push(u);
      }
    }
  }
  ...
}

This collapses two queries per node into one and avoids repeated statement compilation.

Contributor Author


Fixed — hoisted the upstream query statement out of the BFS loops and added n.id to the SELECT. This eliminates both the repeated db.prepare() compilation (O(depth × frontier) → 1) and the per-node ID re-query. The second SELECT id FROM nodes is completely removed since e.source_id already provides the ID via the join.

Contributor Author


Already fixed in the prior round (commit 3e3cb50) — the upstream query is hoisted to upstreamStmt before the BFS loop at line 75, and neighborStmt is hoisted at line 277 in bfsShortestPath. No db.prepare() remains inside any loop in this file.

const patched = patchNativeResult(result);
// Only backfill typeMap for TS/TSX — JS files have no type annotations,
// and the native engine already handles `new Expr()` patterns.
const TS_BACKFILL_EXTS = new Set(['.ts', '.tsx']);
Contributor


P2 Duplicate inline Set construction — should be a module-level constant

new Set(['.ts', '.tsx']) is constructed three separate times across the same module:

  • parseFileAuto (line 505) as TS_BACKFILL_EXTS
  • parseFilesAuto (line 557) as TS_EXTS
  • parseFileIncremental (line 669) as TS_BACKFILL_EXTS

All three sets represent the same concept. Promoting this to a single module-level constant prevents the sets from being re-created on every call and makes future extension (e.g. adding .mts) a one-line change.

Suggested change
const TS_BACKFILL_EXTS = new Set(['.ts', '.tsx']);
const TS_BACKFILL_EXTS = new Set(['.ts', '.tsx']);

Then reference TS_BACKFILL_EXTS at lines 505, 557 (replacing TS_EXTS), and 669.

Contributor Author


Fixed — promoted TS_BACKFILL_EXTS to a module-level constant at the top of domain/parser.ts. Removed the 3 duplicate inline new Set(['.ts', '.tsx']) constructions at lines 505, 557 (TS_EXTS), and 669. All now reference the single constant.

Contributor Author


Already fixed in the prior round (commit b3fc6d5) — TS_BACKFILL_EXTS is a module-level constant at line 52. All three inline new Set(['.ts', '.tsx']) constructions were removed.

@carlos-alm
Contributor Author

Addressed round 2 Greptile feedback:

  1. Hoisted db.prepare() out of BFS loops and added n.id to SELECT — eliminates repeated compilation and extra roundtrip per node
  2. Promoted TS_BACKFILL_EXTS to module-level constant, removed 3 duplicate inline Set constructions

All passing: zero type errors, 546 integration tests green.

@carlos-alm
Contributor Author

@greptileai

Comment on lines +338 to +342
const testCallerRows = findCallers(db, node.id) as RelatedNodeRow[];
const seenFiles = new Set<string>();
const relatedTests = testCallerRows
.filter((r) => isTestFile(r.file) && !seenFiles.has(r.file) && seenFiles.add(r.file))
.map((r) => ({ file: r.file }));
Contributor


P1 Duplicate findCallers() call — redundant DB roundtrip per node

findCallers(db, node.id) is called twice on every iteration of the nodes.slice(0, 10).map(...) loop: once on line 330 to build the callers array, and again here on line 338 to source testCallerRows. Since the DB query is identical (find all callers of node.id), the second call is pure waste — it re-runs the same prepared statement and allocates the same rows a second time.

The first call's raw rows should be retained before the .map() transform so they can be reused for relatedTests:

Suggested change
const testCallerRows = findCallers(db, node.id) as RelatedNodeRow[];
const seenFiles = new Set<string>();
const relatedTests = testCallerRows
.filter((r) => isTestFile(r.file) && !seenFiles.has(r.file) && seenFiles.add(r.file))
.map((r) => ({ file: r.file }));
const rawCallerRows = findCallers(db, node.id) as RelatedNodeRow[];
let callers = rawCallerRows.map((c) => ({
name: c.name,
kind: c.kind,
file: c.file,
line: c.line,
}));
if (noTests) callers = callers.filter((c) => !isTestFile(c.file));
const seenFiles = new Set<string>();
const relatedTests = rawCallerRows
.filter((r) => isTestFile(r.file) && !seenFiles.has(r.file) && seenFiles.add(r.file))
.map((r) => ({ file: r.file }));

Contributor Author


Fixed — stored findCallers(db, node.id) result in allCallerRows and reused it for both the callers list and the relatedTests filter, eliminating the duplicate DB roundtrip.

Comment on lines +141 to +161
const buildSymbolResult = (s: NodeRow, fileLines: string[] | null) => {
let consumers = db
.prepare(
`SELECT n.name, n.file, n.line FROM edges e JOIN nodes n ON e.source_id = n.id
WHERE e.target_id = ? AND e.kind = 'calls'`,
)
.all(s.id) as Array<{ name: string; file: string; line: number }>;
if (noTests) consumers = consumers.filter((c) => !isTestFile(c.file));

return {
name: s.name,
kind: s.kind,
line: s.line,
endLine: s.end_line ?? null,
role: s.role || null,
signature: fileLines ? extractSignature(fileLines, s.line, displayOpts) : null,
summary: fileLines ? extractSummary(fileLines, s.line, displayOpts) : null,
consumers: consumers.map((c) => ({ name: c.name, file: c.file, line: c.line })),
consumerCount: consumers.length,
};
};
Contributor


P2 db.prepare() inside per-symbol closure — same pattern as the BFS-loop issue already fixed in this PR

buildSymbolResult is defined inside exportsFileImpl and called once per exported symbol (exported.map(...)) and once per re-exported symbol (for (const s of targetExported)). Because db.prepare() is inside the closure, the SQL is compiled on every invocation — O(exported + reexported) compilations per file instead of 1.

The same "SELECT * FROM nodes WHERE file = ? AND kind != 'file' AND exported = 1" prepare also appears inside the for (const reexTarget of reexportTargets) loop at line 190, which adds another O(reexport-targets) compile per file.

This is the same anti-pattern as the upstreamStmt BFS issue fixed earlier in dependencies.ts. The fix is to hoist both statements before the closures/loops:

// Hoist before buildSymbolResult is defined
const consumersStmt = db.prepare(
  `SELECT n.name, n.file, n.line FROM edges e JOIN nodes n ON e.source_id = n.id
   WHERE e.target_id = ? AND e.kind = 'calls'`,
);
const exportedByFileStmt = db.prepare(
  "SELECT * FROM nodes WHERE file = ? AND kind != 'file' AND exported = 1 ORDER BY line",
);

This pattern also appears at:

  • src/domain/analysis/dependencies.ts:335db.prepare('SELECT name, kind, file, line FROM nodes WHERE id = ?') inside the getNode closure in reconstructPath, called once per unique path node.
  • src/domain/analysis/impact.ts:344db.prepare(...) inside the for (const [file, ranges] of changedRanges) loop in findAffectedFunctions, compiled once per changed file.

Contributor Author


Fixed — hoisted consumersStmt = db.prepare(...) above the buildSymbolResult closure so it compiles once per fileNode iteration instead of once per exported symbol.

…t of per-symbol closure

context.ts: reuse allCallerRows for both callers list and relatedTests
instead of calling findCallers(db, node.id) twice per iteration.

exports.ts: hoist consumersStmt above buildSymbolResult closure so
db.prepare() runs once per fileNode instead of once per exported symbol.
@carlos-alm
Contributor Author

@greptileai

Comment on lines +128 to +194
// Use the exported column populated during build
exported = db
.prepare(
"SELECT * FROM nodes WHERE file = ? AND kind != 'file' AND exported = 1 ORDER BY line",
)
.all(fn.file) as NodeRow[];
} else {
// Fallback: symbols that have incoming calls from other files
const exportedIds = findCrossFileCallTargets(db, fn.file) as Set<number>;
exported = symbols.filter((s) => exportedIds.has(s.id));
}
const internalCount = symbols.length - exported.length;

const consumersStmt = db.prepare(
`SELECT n.name, n.file, n.line FROM edges e JOIN nodes n ON e.source_id = n.id
WHERE e.target_id = ? AND e.kind = 'calls'`,
);

const buildSymbolResult = (s: NodeRow, fileLines: string[] | null) => {
let consumers = consumersStmt.all(s.id) as Array<{ name: string; file: string; line: number }>;
if (noTests) consumers = consumers.filter((c) => !isTestFile(c.file));

return {
name: s.name,
kind: s.kind,
line: s.line,
endLine: s.end_line ?? null,
role: s.role || null,
signature: fileLines ? extractSignature(fileLines, s.line, displayOpts) : null,
summary: fileLines ? extractSummary(fileLines, s.line, displayOpts) : null,
consumers: consumers.map((c) => ({ name: c.name, file: c.file, line: c.line })),
consumerCount: consumers.length,
};
};

const results = exported.map((s) => buildSymbolResult(s, getFileLines(fn.file)));

const totalUnused = results.filter((r) => r.consumerCount === 0).length;

// Files that re-export this file (barrel -> this file)
const reexports = (
db
.prepare(
`SELECT DISTINCT n.file FROM edges e JOIN nodes n ON e.source_id = n.id
WHERE e.target_id = ? AND e.kind = 'reexports'`,
)
.all(fn.id) as Array<{ file: string }>
).map((r) => ({ file: r.file }));

// For barrel files: gather symbols re-exported from target modules
const reexportTargets = db
.prepare(
`SELECT DISTINCT n.file FROM edges e JOIN nodes n ON e.target_id = n.id
WHERE e.source_id = ? AND e.kind = 'reexports'`,
)
.all(fn.id) as Array<{ file: string }>;

const reexportedSymbols: Array<ReturnType<typeof buildSymbolResult> & { originFile: string }> =
[];
for (const reexTarget of reexportTargets) {
let targetExported: NodeRow[];
if (hasExportedCol) {
targetExported = db
.prepare(
"SELECT * FROM nodes WHERE file = ? AND kind != 'file' AND exported = 1 ORDER BY line",
)
.all(reexTarget.file) as NodeRow[];
Contributor


P1 Repeated db.prepare() calls inside map and nested loop

Four db.prepare() calls still live inside the fileNodes.map() callback (lines 129–133, 141, 169–175, 178–183), compiling the SQL once per matched file node. Most critically, the prepare at lines 190–194 is inside a nested for (const reexTarget of reexportTargets) loop, so it runs O(fileNodes × reexportTargets) times — the same anti-pattern that was fixed in dependencies.ts earlier in this PR.

Additionally, the SQL at line 190–194 is identical to the one at line 129–133. Both should share a single hoisted statement.

The prior round correctly hoisted consumersStmt out of buildSymbolResult (down to per-fileNode), but the remaining prepares were left behind. Hoisting all five prepare call sites before fileNodes.map(...) — the two identical SQL strings share one statement — reduces compilation from O(fileNodes × reexportTargets) to four statements compiled exactly once.

// Hoist before fileNodes.map()
const exportedNodesStmt = db.prepare(
  "SELECT * FROM nodes WHERE file = ? AND kind != 'file' AND exported = 1 ORDER BY line",
);
const reexportsFromStmt = db.prepare(
  `SELECT DISTINCT n.file FROM edges e JOIN nodes n ON e.source_id = n.id
   WHERE e.target_id = ? AND e.kind = 'reexports'`,
);
const reexportsToStmt = db.prepare(
  `SELECT DISTINCT n.file FROM edges e JOIN nodes n ON e.target_id = n.id
   WHERE e.source_id = ? AND e.kind = 'reexports'`,
);
const consumersStmt = db.prepare(
  `SELECT n.name, n.file, n.line FROM edges e JOIN nodes n ON e.source_id = n.id
       WHERE e.target_id = ? AND e.kind = 'calls'`,
);

return fileNodes.map((fn) => {
  ...
  if (hasExportedCol) {
    exported = exportedNodesStmt.all(fn.file) as NodeRow[];
  }
  ...
  const reexports = (reexportsFromStmt.all(fn.id) as Array<{ file: string }>).map((r) => ({ file: r.file }));
  const reexportTargets = reexportsToStmt.all(fn.id) as Array<{ file: string }>;
  ...
  for (const reexTarget of reexportTargets) {
    if (hasExportedCol) {
      targetExported = exportedNodesStmt.all(reexTarget.file) as NodeRow[];  // reuse the same statement
    }
    ...
  }
});

Contributor Author


Fixed — hoisted all five db.prepare() call sites into four statements (exportedNodesStmt, consumersStmt, reexportsFromStmt, reexportsToStmt) before fileNodes.map(). The duplicate SQL inside the reexport loop now reuses the hoisted exportedNodesStmt. SQL compilation reduced from O(fileNodes × reexportTargets) to a constant 4.

…rts.ts

Moves exportedNodesStmt, consumersStmt, reexportsFromStmt, and reexportsToStmt
before the fileNodes.map() callback. Eliminates O(fileNodes × reexportTargets)
SQL compilations — now compiles each statement exactly once per exportsFileImpl call.
@carlos-alm
Contributor Author

@greptileai

Comment on lines +333 to +342
const getNode = (id: number) => {
if (nodeCache.has(id)) return nodeCache.get(id)!;
const row = db.prepare('SELECT name, kind, file, line FROM nodes WHERE id = ?').get(id) as {
name: string;
kind: string;
file: string;
line: number;
};
nodeCache.set(id, row);
return row;
Contributor


P2 db.prepare() inside getNode closure — same pattern fixed elsewhere in this PR

reconstructPath hoists nothing: every unique path node that misses the nodeCache triggers a fresh db.prepare() call, compiling the same SQL once per node. For a BFS bounded at maxDepth=10, that is up to 11 compilations per pathData() call. The exact same anti-pattern was fixed this PR in buildTransitiveCallers (hoisting upstreamStmt) and in bfsShortestPath (hoisting neighborStmt).

Hoist the statement outside getNode:

Suggested change

Before:

const getNode = (id: number) => {
  if (nodeCache.has(id)) return nodeCache.get(id)!;
  const row = db.prepare('SELECT name, kind, file, line FROM nodes WHERE id = ?').get(id) as {
    name: string;
    kind: string;
    file: string;
    line: number;
  };
  nodeCache.set(id, row);
  return row;

After:

) {
  const nodeCache = new Map<number, { name: string; kind: string; file: string; line: number }>();
  const nodeByIdStmt = db.prepare('SELECT name, kind, file, line FROM nodes WHERE id = ?');
  const getNode = (id: number) => {
    if (nodeCache.has(id)) return nodeCache.get(id)!;
    const row = nodeByIdStmt.get(id) as {
      name: string;
      kind: string;
      file: string;
      line: number;
    };
    nodeCache.set(id, row);
    return row;
  };

Contributor Author

Fixed — hoisted db.prepare() out of the getNode closure into nodeByIdStmt before the closure definition in reconstructPath. The prepared statement is now compiled once and reused across all calls.

Comment on lines +308 to +312
let nodes = db
  .prepare(
    `SELECT * FROM nodes WHERE name LIKE ? AND kind IN ('function','method','class','interface','type','struct','enum','trait','record','module','constant') ORDER BY file, line`,
  )
  .all(`%${target}%`) as NodeRow[];
Contributor

P2 db.prepare() recompiled on each recursive explainCallees call

explainFunctionImpl is called once per unique callee during explainCallees traversal (depth × breadth calls for a deep query). Each invocation compiles this SQL string again. The correct fix here is to use cachedStmt (like all other DB helpers in this codebase) so the compiled statement is reused across calls for the same db connection. Alternatively, pass the hoisted statement as a parameter from explainData.

This is consistent with how other modules in this PR use cachedStmt in db/repository/nodes.js (see getLineCountForNode and getMaxEndLineForFile).

Contributor Author

Fixed — replaced the inline db.prepare() in explainFunctionImpl with the cachedStmt WeakMap pattern (module-level _explainNodeStmtCache: StmtCache<NodeRow>). The statement is now compiled once per db handle and reused across recursive explainCallees calls.
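The cachedStmt pattern described in this exchange can be sketched as follows. FakeDb, the Stmt interface, and the exact helper signature are illustrative stand-ins based on the thread, not the repository's actual helper; only the WeakMap-per-handle idea is taken from the discussion.

```typescript
// Sketch of a cachedStmt-style helper: a module-level WeakMap keyed on
// the db handle, so each connection compiles a given SQL string once.
// FakeDb and Stmt are illustrative stand-ins, not the repo's real types.
interface Stmt {
  all: (...args: unknown[]) => unknown[];
}

class FakeDb {
  compilations = 0;
  prepare(_sql: string): Stmt {
    this.compilations++;
    return { all: () => [] };
  }
}

// One cache per statement, keyed by connection; entries die with the db.
const _explainNodeStmtCache = new WeakMap<FakeDb, Stmt>();

function cachedStmt(db: FakeDb, sql: string): Stmt {
  let stmt = _explainNodeStmtCache.get(db);
  if (!stmt) {
    stmt = db.prepare(sql);
    _explainNodeStmtCache.set(db, stmt);
  }
  return stmt;
}

const db = new FakeDb();
for (let i = 0; i < 5; i++) {
  // Recursive explainCallees calls would hit this path repeatedly.
  cachedStmt(db, 'SELECT * FROM nodes WHERE name LIKE ?').all('%foo%');
}
console.log(db.compilations); // 1
```

Keying on the Database object (rather than, say, a string) means the cache never outlives a closed connection and never leaks statements across connections.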

Comment on lines +114 to +121
// Detect whether exported column exists
let hasExportedCol = false;
try {
  db.prepare('SELECT exported FROM nodes LIMIT 0').raw(true);
  hasExportedCol = true;
} catch (e: unknown) {
  debug(`exported column not available, using fallback: ${(e as Error).message}`);
}
Contributor

P2 Schema probe re-runs on every exportsData call

exportsFileImpl re-checks whether the exported column exists on every invocation. For a caller like the MCP server, which may hit exportsData frequently (once per file query), that is a redundant schema probe on each call.

impact.ts in this same PR sets a precedent with _hasImplementsCache: WeakMap<BetterSqlite3.Database, boolean> — the same pattern applies here:

const _hasExportedColCache = new WeakMap<BetterSqlite3.Database, boolean>();

function hasExportedCol(db: BetterSqlite3.Database): boolean {
  if (_hasExportedColCache.has(db)) return _hasExportedColCache.get(db)!;
  let result = false;
  try {
    db.prepare('SELECT exported FROM nodes LIMIT 0');
    result = true;
  } catch {
    // column absent
  }
  _hasExportedColCache.set(db, result);
  return result;
}

Then replace the inline probe with if (hasExportedCol(db)) {.

Contributor Author

Fixed — added a module-level _hasExportedColCache: WeakMap<BetterSqlite3.Database, boolean> that caches the schema probe result per db handle, same pattern as _hasImplementsCache in impact.ts. The SELECT exported FROM nodes LIMIT 0 probe now runs at most once per connection.

Impact: 423 functions changed, 501 affected
… feedback

- dependencies.ts: hoist db.prepare() out of getNode closure in reconstructPath
- context.ts: use cachedStmt WeakMap pattern for explainFunctionImpl node query
- exports.ts: cache exported-column schema probe with WeakMap per db handle

Impact: 4 functions changed, 4 affected
@carlos-alm
Contributor Author

@greptileai

Comment on lines 344 to +348
const defs = db
  .prepare(
    `SELECT * FROM nodes WHERE file = ? AND kind IN ('function', 'method', 'class') ORDER BY line`,
  )
-  .all(file) as any[];
+  .all(file) as NodeRow[];
Contributor

P2 db.prepare() inside loop — same anti-pattern fixed elsewhere in this PR

findAffectedFunctions iterates over every entry in changedRanges and calls db.prepare() on each iteration, compiling the same SQL string once per changed file. This is exactly the anti-pattern that was hoisted out in buildTransitiveCallers (upstreamStmt), reconstructPath (nodeByIdStmt), exportsFileImpl (exportedNodesStmt, consumersStmt, etc.), and explainFunctionImpl (cachedStmt). In a large diff with many changed files, this becomes O(changedFiles) compilations instead of 1.

Hoist the statement before the loop:

Suggested change

Before:

const defs = db
  .prepare(
    `SELECT * FROM nodes WHERE file = ? AND kind IN ('function', 'method', 'class') ORDER BY line`,
  )
-  .all(file) as any[];
+  .all(file) as NodeRow[];

After:

): NodeRow[] {
  const affectedFunctions: NodeRow[] = [];
  const affectedFuncsStmt = db.prepare(
    `SELECT * FROM nodes WHERE file = ? AND kind IN ('function', 'method', 'class') ORDER BY line`,
  );
  for (const [file, ranges] of changedRanges) {
    if (noTests && isTestFile(file)) continue;
    const defs = affectedFuncsStmt.all(file) as NodeRow[];

Comment on lines +570 to +580
const symbols = result.get(relPath);
symbols.typeMap =
  extracted.symbols.typeMap instanceof Map
    ? extracted.symbols.typeMap
    : new Map(
-       (extracted.symbols.typeMap as any).map((e: any) => [e.name, e.typeName]),
+       extracted.symbols.typeMap.map((e: { name: string; typeName: string }) => [
+         e.name,
+         e.typeName,
+       ]),
      );
- (symbols as any)._typeMapBackfilled = true;
+ symbols._typeMapBackfilled = true;
Contributor

P2 Dropped ! non-null assertion — silent failure path

The previous code used result.get(relPath)! to assert the entry exists before writing typeMap. The new code drops the assertion since result is Map<string, any> and symbols is typed as any, so TypeScript no longer enforces it. If result.get(relPath) returned undefined for any reason, symbols.typeMap = ... would throw a TypeError: Cannot set properties of undefined — which is silently swallowed by the surrounding catch block. This means a missing-entry bug would fail quietly instead of surfacing.

Restoring a non-null assertion (or an explicit guard) keeps the intent clear:

Suggested change

Before:

const symbols = result.get(relPath);
symbols.typeMap =
  extracted.symbols.typeMap instanceof Map
    ? extracted.symbols.typeMap
    : new Map(
-       (extracted.symbols.typeMap as any).map((e: any) => [e.name, e.typeName]),
+       extracted.symbols.typeMap.map((e: { name: string; typeName: string }) => [
+         e.name,
+         e.typeName,
+       ]),
      );
- (symbols as any)._typeMapBackfilled = true;
+ symbols._typeMapBackfilled = true;

After:

const symbols = result.get(relPath);
if (!symbols) continue;
symbols.typeMap =
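The silent-failure mode described in this comment can be reproduced in miniature. The map contents and file name below are hypothetical; the point is only that a property write on an undefined Map.get() result throws, and a surrounding catch hides it.

```typescript
// Miniature reproduction of the silent-failure mode: writing a property
// on an undefined Map.get() result throws, and a surrounding catch
// swallows the error. Map contents and file name are hypothetical.
const result = new Map<string, { typeMap?: Map<string, string> }>();

let backfilled = false;
try {
  const symbols = result.get('missing.ts') as any; // undefined: no entry
  symbols.typeMap = new Map(); // TypeError: cannot set on undefined
  backfilled = true; // never reached
} catch {
  // swallowed, like the surrounding catch block in the backfill path
}
console.log(backfilled); // false: the backfill silently did nothing

// With an explicit guard, the skip is intentional and visible:
const symbols = result.get('missing.ts');
if (symbols) symbols.typeMap = new Map();
```

A non-null assertion (`!`) instead would keep the compile-time contract but still throw at runtime on a missing entry; the explicit guard makes the skip deliberate.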
