fix(embed): handle absolute file paths from native engine #783
carlos-alm merged 3 commits into main
The native engine stores absolute file paths in the DB. buildEmbeddings unconditionally joined rootDir + file, doubling the path and causing ENOENT for every symbol — producing 0 embeddings silently. Check path.isAbsolute() before joining, and add a regression test that inserts nodes with absolute paths to prevent recurrence. Closes #760
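The guard described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: `resolveNodeFile` is a hypothetical helper name, and the real code lives inline in `buildEmbeddings`.

```typescript
import * as path from "node:path";

// Hypothetical helper (not the PR's actual code): return the on-disk path
// for a `file` value read from the DB.
function resolveNodeFile(rootDir: string, file: string): string {
  // Absolute paths (as stored by the native engine) are used as-is;
  // relative paths are joined onto rootDir, as before the fix.
  return path.isAbsolute(file) ? file : path.join(rootDir, file);
}
```

With this shape, behaviour for relative paths is unchanged, and an absolute path is returned unchanged instead of being appended to `rootDir`.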
Claude finished @carlos-alm's task
Code Review Complete ⭐⭐⭐⭐ (4/5)
Analyzing PR #783: Review Checklist
Root Cause Analysis
Problem Being Solved: The
Why This Approach: The fix adds a simple
Risk Assessment: Very low risk. This is a defensive fix that makes the code more robust to different path storage formats from the dual-engine architecture.
Configuration Change Detection ✓
HIGH-RISK changes: None detected.
Test Integrity ✓
Test quality: Excellent regression test coverage.
Code Quality Assessment
Strengths:
Technical Analysis:
Potential Concerns: None identified
Backlog Compliance
Critical Concerns
None. This is a well-executed bug fix.
Final Recommendation
Minor enhancement suggestion (not blocking): The warning message on line 108 could be slightly more specific by mentioning whether the path was treated as absolute or relative, but this is a very minor improvement and not necessary for approval. This PR exemplifies quality software engineering: identify the root cause, apply the minimal correct fix, and add comprehensive test coverage to prevent regression.
Greptile Summary
Adds a one-line
Confidence Score: 5/5
Safe to merge: minimal, targeted fix with a direct regression test and no observable side effects. The change is a single conditional expression that preserves all prior behaviour for relative paths while correctly handling absolute paths. The regression test reproduces the exact failure mode and confirms the fix. No P0/P1 issues found. No files require special attention.
Important Files Changed
Sequence Diagram
sequenceDiagram
participant C as Caller
participant G as buildEmbeddings()
participant DB as SQLite (nodes)
participant FS as Filesystem
C->>G: buildEmbeddings(rootDir, model, dbPath)
G->>DB: SELECT * FROM nodes WHERE kind IN (...)
DB-->>G: [{file: "/abs/path/math.js", ...}] (native engine: absolute)
Note over G: path.isAbsolute(file)?
alt file is absolute
G->>FS: readFileSync("/abs/path/math.js") ✅
else file is relative
G->>FS: readFileSync(path.join(rootDir, file)) ✅
end
FS-->>G: source lines
G->>G: build embedding text
G->>DB: INSERT INTO embeddings ...
Reviews (1): Last reviewed commit: "style: fix biome formatting in test"
Summary
Fixes buildEmbeddings doubling file paths when the native engine stores absolute paths in the DB (e.g., /tmp/foo/bar.js became /tmp/foo/tmp/foo/bar.js)
Adds a path.isAbsolute() guard before joining rootDir with the DB file path
Root Cause
buildEmbeddings in generator.ts unconditionally called path.join(rootDir, file) on line 103. When the native Rust engine stored absolute file paths in the DB nodes.file column, this doubled the path, causing ENOENT for every file read and silently producing 0 embeddings while still writing metadata.
CI evidence from the failing run showed:
Closes #760
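The path doubling described under Root Cause can be reproduced in isolation. The sketch below uses the example paths from the summary and `path.posix` so the result does not depend on the host OS. `path.resolve` is shown only for contrast, since the PR itself uses a `path.isAbsolute()` check.

```typescript
import * as path from "node:path";

const rootDir = "/tmp/foo";
const file = "/tmp/foo/bar.js"; // absolute path, as the native engine stores it

// path.join treats every argument as a path segment, so the leading "/" on
// the second argument does not reset the result.
const doubled = path.posix.join(rootDir, file);

// path.resolve, by contrast, restarts from any absolute segment.
const resolved = path.posix.resolve(rootDir, file);

console.log(doubled);  // /tmp/foo/tmp/foo/bar.js
console.log(resolved); // /tmp/foo/bar.js
```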
Test plan
embedding-strategy.test.ts: new test "absolute file paths in DB (#760)" passes
embedding-regression.test.ts: all 8 existing tests still pass
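For illustration only, a check in the spirit of the new regression test might look like the sketch below. It is not the test from embedding-strategy.test.ts: `readNodeSource` is a hypothetical stand-in for the path handling inside buildEmbeddings, and the temp-directory layout is an assumption.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical stand-in for the path handling inside buildEmbeddings.
function readNodeSource(rootDir: string, file: string): string {
  const fullPath = path.isAbsolute(file) ? file : path.join(rootDir, file);
  return fs.readFileSync(fullPath, "utf-8");
}

// Create a throwaway project root containing one source file.
const rootDir = fs.mkdtempSync(path.join(os.tmpdir(), "embed-abs-"));
const absFile = path.join(rootDir, "math.js");
fs.writeFileSync(absFile, "export const add = (a, b) => a + b;\n");

// Both storage formats should read the same file without ENOENT.
const fromAbsolute = readNodeSource(rootDir, absFile);
const fromRelative = readNodeSource(rootDir, "math.js");

// Clean up the temporary directory.
fs.rmSync(rootDir, { recursive: true, force: true });
```

Without the `path.isAbsolute()` branch, the first read would target a doubled path and throw ENOENT, which is the failure mode the regression test is meant to catch.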