fix(build): warn before --no-incremental wipes embeddings #986
carlos-alm merged 3 commits into main
A full rebuild (either via `--no-incremental` or when promoted by an engine/schema/version mismatch) silently drops every row from the embeddings table. Users who just spent minutes generating embeddings lose them with no warning, and subsequent `codegraph search` returns zero hits with no indication why.

Add a `warnOnEmbeddingsWipe` step to `setupPipeline` that runs after the incremental/forceFullRebuild decision is made. When the build will be full AND the embeddings table is non-empty, log:

Full rebuild will discard N embeddings; re-run `codegraph embed` after the build.

The warning sits at the pipeline entry, so it fires uniformly for the native orchestrator and the JS fallback path. It is silent when the embeddings table is missing, empty, or when the build is incremental.

Closes #982
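The step described above can be sketched roughly as follows. This is a minimal illustration, not the merged implementation: the `BuildContext` and `CountSource` shapes, the `countEmbeddings` method, and the `warn` callback are hypothetical stand-ins for whatever the real pipeline context and database handle expose. Only the decision logic (full build AND non-empty table) and the message text come from the description.

```typescript
// Hypothetical minimal shapes; the real codegraph types are not shown in this PR.
interface CountSource {
  // Returns the number of rows in the embeddings table; throws if the table is missing.
  countEmbeddings(): number;
}

interface BuildContext {
  incremental: boolean;      // false when --no-incremental is passed
  forceFullRebuild: boolean; // true when a mismatch promotes the build to a full rebuild
  db: CountSource;
  warn: (message: string) => void;
}

function warnOnEmbeddingsWipe(ctx: BuildContext): void {
  const isFullBuild = !ctx.incremental || ctx.forceFullRebuild;
  if (!isFullBuild) return; // incremental builds keep existing embeddings

  let count = 0;
  try {
    count = ctx.db.countEmbeddings();
  } catch {
    return; // table missing: nothing to lose, so stay silent
  }

  if (count > 0) {
    ctx.warn(
      `Full rebuild will discard ${count} embeddings; re-run \`codegraph embed\` after the build.`,
    );
  }
}

// Usage: a full rebuild over a table holding 42 embeddings.
const demoMessages: string[] = [];
warnOnEmbeddingsWipe({
  incremental: false,
  forceFullRebuild: false,
  db: { countEmbeddings: () => 42 },
  warn: (message) => { demoMessages.push(message); },
});
console.log(demoMessages[0]);
```

The check is read-only, so running it before the build cannot change what the build does; it only changes what the user is told.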
Claude finished @carlos-alm's task
Root Cause Analysis
Problem Being Solved: Users lose embeddings silently during full rebuilds.
Why This Approach: Adding the warning at the pipeline entry point.
Risk Assessment: Very low risk. This is a purely additive warning with proper error handling and no behavioral changes.
Backlog Compliance
Code Quality Assessment
Strengths
Technical Details
No Issues Found
Test Coverage Verification
The PR description claims comprehensive test coverage:
This covers all the logical branches and edge cases properly.
Comparison with Similar Patterns
Looking at the codebase, this warning pattern is consistent with existing UX improvements:
Final Recommendation
This PR effectively addresses the silent data loss issue that was identified during dogfooding. The implementation is technically sound and the placement ensures consistent behavior across both execution paths.
Greptile Summary
Adds a
Confidence Score: 5/5
Safe to merge: the change is additive (read-only DB query + log), all promotion paths are covered, and the only finding is a P2 style suggestion about the catch scope.
The sole finding is a P2 style/hardening suggestion (broad catch). No logic errors or data-integrity risks were found; the warning logic is correct across all identified forceFullRebuild promotion paths.
No files require special attention.
Important Files Changed
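One way to read the P2 note about the broad catch is sketched below. The text of Greptile's actual suggestion is not shown here, so this is only an assumption about what narrowing the catch could look like: treat a missing table as the expected, silent case and surface anything else through a debug log. The helper names `isMissingTableError` and `countEmbeddingsOrZero` and the `debug` callback are hypothetical; the only external fact relied on is that SQLite reports a missing table with a message containing "no such table".

```typescript
// Assumption: a missing table surfaces as an Error whose message contains
// "no such table", which is how SQLite words that failure.
function isMissingTableError(err: unknown): boolean {
  return err instanceof Error && /no such table/i.test(err.message);
}

// Hypothetical helper: runs the count query and never throws, but only
// swallows the expected failure silently.
function countEmbeddingsOrZero(
  query: () => number,
  debug: (message: string) => void,
): number {
  try {
    return query();
  } catch (err) {
    if (isMissingTableError(err)) {
      return 0; // expected: embeddings were never generated
    }
    // Unexpected failure: report it for diagnosis, but do not break the build
    // over a warning helper.
    debug(`embeddings count failed unexpectedly: ${String(err)}`);
    return 0;
  }
}
```

Under this sketch the build still proceeds on any failure, which matches the "purely additive" intent, while unexpected database errors leave a trace instead of disappearing.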
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[buildGraph called] --> B[setupPipeline]
B --> C[checkEngineSchemaMismatch\nsets forceFullRebuild]
C --> D[warnOnEmbeddingsWipe\n!incremental OR forceFullRebuild?]
D -->|No| E[loadAliases / continue]
D -->|Yes| F[SELECT COUNT from embeddings]
F -->|count > 0| G["WARN: Full rebuild will discard N embeddings"]
F -->|count == 0 or table missing| E
G --> E
E --> H{tryNativeOrchestrator}
H -->|success| I[Done]
H -->|throws| J{incremental &&\n!forceFullRebuild?}
J -->|No| K[runPipelineStages]
J -->|Yes| L[Late version check\nforceFullRebuild = true]
L --> M[warnOnEmbeddingsWipe again]
M --> K
K --> I
Reviews (2): Last reviewed commit: "fix(build): re-check embeddings on nativ..."
Codegraph Impact Analysis
3 functions changed → 8 callers affected across 6 files
…ch (#986)

The catch block that handles native-orchestrator throws performs a late promotion to a full rebuild when the codegraph_version mismatch is detected. warnOnEmbeddingsWipe already ran in setupPipeline before forceFullRebuild was set here, so this path previously wiped embeddings silently, which is exactly the scenario the initial guard was meant to close.

Call warnOnEmbeddingsWipe again after setting ctx.forceFullRebuild = true so the late-promotion path surfaces the same warning as the other full rebuild paths.

Impact: 1 functions changed, 6 affected
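The late-promotion fix described in this commit can be illustrated with a simplified control-flow sketch. Everything here is a stand-in under stated assumptions: `PipelineContext`, the `embeddingCount` field (in place of a real database query), and the injected `tryNativeOrchestrator`, `versionMismatch`, and `runPipelineStages` callbacks are hypothetical, and the real catch block may check more conditions. The point it shows is that the warning helper is called a second time only after the flag is flipped.

```typescript
// Hypothetical context; embeddingCount stands in for SELECT COUNT(*) FROM embeddings.
interface PipelineContext {
  incremental: boolean;
  forceFullRebuild: boolean;
  embeddingCount: number;
  warnings: string[];
}

function warnOnEmbeddingsWipe(ctx: PipelineContext): void {
  const isFullBuild = !ctx.incremental || ctx.forceFullRebuild;
  if (isFullBuild && ctx.embeddingCount > 0) {
    ctx.warnings.push(`Full rebuild will discard ${ctx.embeddingCount} embeddings`);
  }
}

function buildGraph(
  ctx: PipelineContext,
  tryNativeOrchestrator: () => void, // assumed to throw when the native path fails
  versionMismatch: () => boolean,    // assumed late codegraph_version check
  runPipelineStages: () => void,     // JS fallback path
): void {
  // First check, at the pipeline entry (setupPipeline in the PR).
  warnOnEmbeddingsWipe(ctx);

  try {
    tryNativeOrchestrator();
    return;
  } catch {
    // Late promotion: only reachable when the build was still incremental,
    // so the first check above was silent and cannot have warned already.
    if (ctx.incremental && !ctx.forceFullRebuild && versionMismatch()) {
      ctx.forceFullRebuild = true;
      warnOnEmbeddingsWipe(ctx); // re-check now that the build is full
    }
  }

  runPipelineStages();
}
```

Because the second call sits behind the `incremental && !forceFullRebuild` guard, a build that was already full at entry warns once, not twice.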
Addressed Greptile's P1 feedback on the late-promotion gap:
Pushed in b4c907b.
Summary
`codegraph build --no-incremental` (and any full rebuild promoted by engine/schema/version mismatch) silently drops every row in the `embeddings` table. Users who spent minutes on `codegraph embed` lose their embeddings with no warning; the next `codegraph search` returns zero hits with no clue why.

Add a `warnOnEmbeddingsWipe` step at the pipeline entry, just after the incremental/forceFullRebuild decision is made. When the build is full AND the embeddings table is non-empty, it logs:

Full rebuild will discard N embeddings; re-run `codegraph embed` after the build.

Because the warning sits at the shared pipeline entry point, it fires uniformly regardless of which path (native orchestrator or JS fallback) actually performs the delete.
Found during
Dogfooding v3.9.4 — see #982
Test plan
- `npm run typecheck` clean
- `npm run lint` clean