Skip to content

ci: add dynamic import verification to catch stale paths#540

Merged
carlos-alm merged 5 commits intomainfrom
ci/verify-imports
Mar 20, 2026
Merged

ci: add dynamic import verification to catch stale paths#540
carlos-alm merged 5 commits intomainfrom
ci/verify-imports

Conversation

@carlos-alm
Copy link
Contributor

Summary

  • Add scripts/verify-imports.js — scans all await import() calls in src/, resolves relative paths, and fails if any target file doesn't exist
  • Add verify-imports CI job to .github/workflows/ci.yml (no npm install needed — zero-dependency script)
  • Fix stale import in src/mcp/tools/semantic-search.js: ../../embeddings/index.js../../domain/search/index.js (3 occurrences)
  • Fix stale import in src/cli/commands/info.js: ../../db.js../../db/index.js

Closes roadmap item 10.3 — the ast command crash was caused by a stale import path in compiled output; this CI step prevents that class of bug.

Test plan

  • node scripts/verify-imports.js exits 0 on clean codebase
  • Introduce a broken dynamic import → script exits 1 with clear error
  • CI verify-imports job passes in pipeline
  • MCP semantic-search tool loads without crash (stale import fixed)
  • codegraph info runs without crash (stale db.js import fixed)

Add scripts/verify-imports.js that scans all await import() calls in
src/ and verifies each relative path resolves to an existing file.
Runs as a new CI job (verify-imports) in the pipeline gate.

Also fixes two stale imports found by the script:
- src/mcp/tools/semantic-search.js: ../../embeddings/index.js →
  ../../domain/search/index.js (embeddings was moved to domain/search)
- src/cli/commands/info.js: ../../db.js → ../../db/index.js
  (db.js was split into db/ directory)

Closes roadmap item 10.3.
@claude
Copy link

claude bot commented Mar 19, 2026

Claude finished @carlos-alm's task —— View job


PR Review Complete ⭐⭐⭐☆☆ (3/5)

Reviewing PR #540: ci: add dynamic import verification to catch stale paths

Root Cause Analysis

Problem Being Solved: Runtime crashes caused by stale dynamic await import() paths after file/directory moves. The AST command was crashing because imports pointed to nonexistent files.
Why This Approach: Static analysis script catches stale paths at build time rather than runtime. Reasonable preventive measure.
Risk Assessment: Low risk - script is zero-dependency, pure Node.js with no external calls.

Backlog Compliance

  • Zero-dep: ✅ — uses only built-in Node.js modules
  • Foundation-aligned: ✅ — pure local verification, supports P1 (keeping graph current)
  • Problem-fit: 3 — prevents one class of bugs but doesn't directly address AI agent token waste
  • Breaking: No — purely additive CI step and npm script
  • Tier: 1

Code Quality Issues Found

1. Regex Limitation ⚠️

The regex /await\s+import\(\s*(['"])(.+?)\1\s*\)/g only catches quoted string literals. It will miss:

  • await import(pathToFileURL(filePath).href) (used in src/cli/index.js:47)
  • Template literals: await import(\./${moduleName}.js`)`
  • Variable imports: await import(moduleVar)

Impact: False negatives - the script won't catch all stale import patterns.

2. Comment Detection Brittleness ⚠️

Lines 50-67 implement comment detection with regex patterns:

  • /^\s*\/\*/ for block comments
  • /^\s*\/\// for line comments

This approach is fragile - it can't handle:

  • Mixed code and comments: const x = 5; /* comment */ await import('./foo.js')
  • Multi-line imports with comments
  • String literals containing comment-like text

Recommendation: For robust comment detection, consider using a proper JS parser or AST walker.

3. Path Resolution Edge Cases ⚠️

The resolution logic (lines 87-96) may not handle all Node.js module resolution rules:

  • Missing .mts/.cts extensions for TypeScript
  • Package.json exports mapping
  • Symlinks and junction points

4. Positive: Good Implementation Details

  • Proper exit codes (0/1)
  • Clear error messages with file:line format
  • Handles basic extensions and index files correctly
  • Zero dependencies as advertised

Critical Concerns

The regex limitation is the most significant issue. Dynamic imports using pathToFileURL() or template literals won't be verified, creating a false sense of security.

Final Recommendation

  • Rating: ⭐⭐⭐☆☆ (3/5)
  • Action: APPROVE WITH CONDITIONS
  • Reasoning: Solves a real problem and fixes legitimate stale imports, but has documented limitations. The verification script catches the most common pattern (string literals) which covers the majority of the codebase. Benefits outweigh limitations for this specific use case.

Conditions for approval:

  1. Document the regex limitations in the script comments
  2. Consider adding a TODO comment about enhancing detection for pathToFileURL patterns

The fixed import paths are definitely correct and needed - the old embeddings/ and db.js paths don't exist.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 19, 2026

Greptile Summary

This PR adds a zero-dependency static analysis script (scripts/verify-imports.js) that scans all await import() / bare import() calls in src/, resolves relative paths against the filesystem, and exits non-zero if any target is missing. The script is wired into a new verify-imports CI job and added to the ci-pipeline gate. Two stale import paths that caused runtime crashes are also fixed: semantic-search.js (3 occurrences of ../../embeddings/index.js../../domain/search/index.js) and info.js (../../db.js../../db/index.js), both confirmed to exist on disk.

The core scanner logic is solid — it correctly handles mid-line /* block-comment openings, */ closings that leave real code on the same line, bare (non-awaited) import() calls, and // comment markers embedded in string literals. Two remaining gaps are worth tracking as the project grows:

  • Extension fallback skips explicit-extension specifiers (line 114–117): the !extname(target) guard means a specifier written as ./foo.js will never be tested against ./foo.ts, so the TypeScript ESM convention (write .js, compile from .ts) would produce false positives and break CI if the project ever adds TypeScript source files with ESM-style import specifiers.
  • Template-literal imports are invisible (line 39): the regex (['"]) only matches single/double quotes; a fully-static backtick import like await import(`./path.js`) bypasses the scanner entirely.

Confidence Score: 4/5

  • Safe to merge — the stale-path fixes are correct and the CI script works for the current all-JS codebase; two edge-case gaps in the scanner are future-facing concerns, not blockers.
  • Both runtime-crash import fixes are verified against the filesystem. The scanner correctly handles all the previously reported edge cases. The two remaining gaps (.js.ts extension fallback and template-literal imports) do not affect the current codebase — there are no TypeScript source files with dynamic imports and no template-literal import calls in src/. Score is 4 rather than 5 because the extension-fallback limitation is a latent false-positive risk that could silently block CI if the TypeScript footprint grows.
  • scripts/verify-imports.js — extension fallback and template-literal regex gaps worth addressing before the project adds more TypeScript source files.

Important Files Changed

Filename Overview
scripts/verify-imports.js New CI script that walks src/ and verifies all dynamic import() paths resolve to real files. Core logic is sound after previous rounds of fixes (block comment handling, line-comment detection, bare import() matching). Two remaining gaps: the extension-fallback loop is silently skipped for specifiers that already carry an explicit extension (.js.ts false positive risk), and template-literal imports are outside the regex entirely.
.github/workflows/ci.yml Adds a zero-dependency verify-imports CI job and wires it into the ci-pipeline gate. Job correctly skips npm install since the script only uses built-in Node.js modules. Consistent with existing job structure.
src/mcp/tools/semantic-search.js Three stale import specifiers updated from ../../embeddings/index.js../../domain/search/index.js. Target file confirmed to exist on disk at src/domain/search/index.js.
src/cli/commands/info.js Stale import updated from ../../db.js../../db/index.js. Target file confirmed to exist on disk at src/db/index.js.
package.json Adds verify-imports npm script as a convenience wrapper around node scripts/verify-imports.js, consistent with how the CI step calls the script directly.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A([node scripts/verify-imports.js]) --> B[walk src/\ncollect .js/.ts files]
    B --> C{for each file}
    C --> D[extractDynamicImports\nline-by-line scan]
    D --> E{line in\nblock comment?}
    E -- yes, no close --> F[skip line]
    E -- yes, close found --> G[slice after  */\nfall through]
    E -- no --> H{line starts\nwith  // ?}
    H -- yes --> F
    H -- no --> I[strip inline\n/* ... */ blocks]
    I --> J{unclosed /* ?}
    J -- yes --> K[truncate line\nenter block mode]
    J -- no --> L[run DYNAMIC_IMPORT_RE\nagainst scanLine]
    G --> L
    K --> F
    L --> M{isInsideLineComment\nbefore match?}
    M -- yes --> N[skip import]
    M -- no --> O[collect specifier]
    O --> P[resolveSpecifier]
    P --> Q{starts with\n. or / ?}
    Q -- no --> R[skip bare specifier\nnull = OK]
    Q -- yes --> S{exact file\nexists?}
    S -- yes --> R
    S -- no --> T{no extension?\ntry .js/.ts/.mjs/.cjs}
    T -- found --> R
    T -- not found --> U{is directory?\ntry index files}
    U -- found --> R
    U -- not found --> V[broken import]
    V --> W[broken list]
    R --> C
    N --> C
    F --> C
    W --> X{any broken?}
    X -- no --> Y([exit 0 ✓])
    X -- yes --> Z([exit 1 ✗ print paths])
Loading

Last reviewed commit: "fix: scan content af..."

Comment on lines +66 to +67
const before = line.slice(0, match.index);
if (before.includes('//') || before.includes('/*')) continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 False-negative: // inside string literals silently skips real imports

The check before.includes('//') is meant to detect trailing // comments, but it matches any occurrence of // in the preceding text — including URLs embedded in string literals. For example:

const base = 'https://example.com'; const { mod } = await import('./missing-path.js');

The before slice here contains //, so the import is silently skipped and the broken path goes undetected — exactly the opposite of what this script is meant to do.

A safer heuristic is to check whether // appears outside quoted strings, or at minimum check that it is preceded only by whitespace or non-string characters. A simple approach is to scan for the first unquoted //:

Suggested change
const before = line.slice(0, match.index);
if (before.includes('//') || before.includes('/*')) continue;
// Skip if the match is inside a trailing comment (simple heuristic: look for // outside quotes)
const before = line.slice(0, match.index);
const trailingCommentIdx = before.search(/(?<!['"]\S*)\/\//);
if (trailingCommentIdx !== -1 || before.includes('/*')) continue;

Alternatively, use a stricter check that only skips when // or /* is not preceded by a quote character on the same line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — replaced naive \ with a new \ helper that walks character-by-character tracking quote state (single, double, backtick). Only flags \ that appears outside string literals, so URLs like \ no longer cause false negatives.

Comment on lines +55 to +58
if (/^\s*\/\*/.test(line)) {
if (!line.includes('*/')) inBlockComment = true;
continue;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Block-comment tracking misses mid-line /* openings

inBlockComment is only set when a line matches /^\s*\/\*/ — i.e., /* at the beginning of the line. A /* that appears after real code is ignored:

const x = callSomething(); /* begin multi-line comment
   await import('./stale-path.js')   ← this line IS inside a comment but WILL be scanned
*/

On the line containing await import('./stale-path.js') the state inBlockComment is still false, so it gets scanned and could produce either a false positive (valid path inside comment flagged) or miss a stale path in code.

Consider tracking the opening position more broadly:

Suggested change
if (/^\s*\/\*/.test(line)) {
if (!line.includes('*/')) inBlockComment = true;
continue;
}
if (!inBlockComment && line.includes('/*')) {
if (!line.includes('*/')) inBlockComment = true;
// Still check content before the /*
}

Or strip block comments from the line before applying the regex, rather than using a line-level guard.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — removed the line-start-only ^\s*\/\* check. Block comments are now stripped from each line before regex scanning: inline /* ... */ pairs are removed with a greedy replace, and if an unclosed /* remains, we truncate the line at that point and enter inBlockComment mode. This correctly handles mid-line openings like const x = foo(); /* begin comment.


// ── extract dynamic import specifiers ───────────────────────────────────
// Matches: await import('...') and await import("...")
const DYNAMIC_IMPORT_RE = /await\s+import\(\s*(['"])(.+?)\1\s*\)/g;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Non-awaited import() calls are invisible to this script

The regex only matches await import(...). A dynamic import that is returned, .then()-chained, or assigned without await will be entirely skipped:

return import('./stale-path.js');          // not caught
import('./stale-path.js').then(doSomething); // not caught

If the codebase ever uses these patterns, a stale path would bypass this check entirely. Consider broadening the regex to match bare import(...) calls as well:

Suggested change
const DYNAMIC_IMPORT_RE = /await\s+import\(\s*(['"])(.+?)\1\s*\)/g;
const DYNAMIC_IMPORT_RE = /(?:await\s+)?import\(\s*(['"])(.+?)\1\s*\)/g;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — broadened the regex from /await\s+import\(...\)/ to /(?:await\s+)?import\(...\)/ so it now catches bare import() calls: return import(...), import(...).then(...), and variable assignments without await.

- Broaden regex to catch non-awaited import() calls (return, .then())
- Fix block-comment tracking to handle mid-line /* openings
- Replace naive before.includes('//') with quote-aware isInsideLineComment
  to avoid false negatives on URLs in string literals
@carlos-alm
Copy link
Contributor Author

All three Greptile review items addressed in ab315f7:

  1. P1 — // inside string literals: Replaced before.includes('//') with a character-by-character isInsideLineComment() that tracks quote state, so // inside URLs or strings is ignored.
  2. P2 — Mid-line /* block comments: Removed line-start-only ^\s*\/\* check. Now strips inline /* ... */ pairs and truncates at unclosed /*, entering block-comment mode correctly.
  3. P2 — Non-awaited import(): Broadened regex to (?:await\s+)?import(...) to catch return import(...), .then() chains, and bare assignments.

Script still passes: all 262 src/ files scanned, 0 broken imports.

@greptileai

Comment on lines +70 to +73
if (inBlockComment) {
if (line.includes('*/')) inBlockComment = false;
continue;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 False-negative: content after */ on same line is never scanned

When the closing */ of a block comment appears mid-line (e.g. */ await import('./stale.js');), inBlockComment is set to false but the continue unconditionally skips the rest of the current line. Any real import() call after the */ is invisible to the scanner — exactly the kind of false negative this tool is meant to prevent.

// current (buggy):
if (inBlockComment) {
  if (line.includes('*/')) inBlockComment = false;
  continue; // ← skips ALL of the current line, including code after */
}

A minimal fix is to skip the continue when */ is found and instead pass the post-close slice into scanLine:

if (inBlockComment) {
  const closeIdx = line.indexOf('*/');
  if (closeIdx === -1) continue;          // still fully inside a block comment
  inBlockComment = false;
  // fall through with only the part of the line that follows '*/'
  scanLine = line.slice(closeIdx + 2);
  // skip re-assigning scanLine = line below, then proceed to import scanning
}

Because scanLine is currently declared a few lines later (let scanLine = line;), the cleanest approach is to hoist the declaration and branch early, similar to how the block-comment stripping already works for mid-line /* openings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 3ee1b89 — when */ is found mid-line, the scanner now extracts the content after the close (scanLine = line.slice(closeIdx + 2)) and falls through to the import scanning logic instead of unconditionally skipping the line with continue.

When */ appeared mid-line, the continue statement skipped the entire
line including any real code after the comment close. Now extracts and
scans the post-close portion of the line.
@carlos-alm
Copy link
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit c433e6e into main Mar 20, 2026
21 checks passed
@carlos-alm carlos-alm deleted the ci/verify-imports branch March 20, 2026 06:37
@github-actions github-actions bot locked and limited conversation to collaborators Mar 20, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant