Description
When running codegraph index on a medium-to-large React/JavaScript codebase, the indexing process fails with a SQLite unique constraint violation. The error occurs consistently at around 10-34% progress through the indexing phase.
Error Message
✗ Failed to index: UNIQUE constraint failed: nodes.id
Environment
- CodeGraph Version: 0.3.1
- Node.js Version: v20.19.0
- Platform: Linux (WSL2) 6.6.87.2-microsoft-standard-WSL2
- Installation Method: npm install -g @colbymchenry/codegraph --ignore-scripts followed by npm rebuild
Codebase Characteristics
- ~1,000+ source files (excluding e2e tests)
- Mixed JavaScript (.js, .jsx) and TypeScript (.ts, .tsx)
- React 19 application with Vite
- Standard project structure: src/components/, src/features/, src/hooks/, etc.
Steps to Reproduce
- Initialize codegraph in a React project: codegraph init .
- Run the indexer: codegraph index .
- Observe the failure at approximately 10-34% progress during the "Parsing code" phase.
Partial Index State
Before the failure, the indexer successfully processes some files:
Index Statistics:
  Files: 106
  Nodes: 412
  Edges: 21
Nodes by Kind:
  function    385
  method      21
  class       3
  interface   3
Files by Language:
  javascript  57
  jsx         46
  typescript  3
Attempted Workarounds
- Excluding e2e directory: Added "**/e2e/**" to the exclude list in config.json - issue persists
- Fresh initialization: Removed .codegraph/ directory and reinitialized - same error
- Multiple attempts: The failure point varies slightly but always occurs in the same general range
Analysis
The error suggests that the node ID generation algorithm is producing duplicate IDs for different code symbols. This could be caused by:
- Hash collisions in the ID generation logic
- Similar function/variable names across different files producing identical IDs
- Re-exported symbols being processed multiple times
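To illustrate the second hypothesis, here is a minimal sketch that reproduces the same SQLite error outside of CodeGraph. It is not CodeGraph's actual implementation: the name_only_id function, the nodes table layout, and the file paths are all assumptions made up for the example. It only shows that an ID derived from kind and name alone collides as soon as two files define a symbol with the same name.

```python
import hashlib
import sqlite3


def name_only_id(kind: str, name: str) -> str:
    """Hypothetical ID scheme that ignores the file path (assumption)."""
    return hashlib.sha256(f"{kind}:{name}".encode()).hexdigest()[:16]


# Two distinct symbols sharing a name in different feature directories.
# The paths are invented for illustration.
symbols = [
    ("function", "useAuth", "src/features/login/hooks/useAuth.js"),
    ("function", "useAuth", "src/features/admin/hooks/useAuth.js"),
]

conn = sqlite3.connect(":memory:")
# Assumed schema: the only detail taken from the error is a unique nodes.id.
conn.execute(
    "CREATE TABLE nodes (id TEXT PRIMARY KEY, kind TEXT, name TEXT, file TEXT)"
)

error_message = None
try:
    for kind, name, path in symbols:
        conn.execute(
            "INSERT INTO nodes VALUES (?, ?, ?, ?)",
            (name_only_id(kind, name), kind, name, path),
        )
except sqlite3.IntegrityError as exc:
    # The second insert reuses the first symbol's ID and is rejected.
    error_message = str(exc)

print(error_message)
```

If the real ID logic behaves like this, the failure point would depend on when the first duplicate name is encountered, which would fit the varying 10-34% range.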
Expected Behavior
The indexer should complete successfully, handling any potential ID conflicts gracefully (e.g., by incorporating file path or additional context into the ID hash).
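One possible shape for this, sketched under assumptions: the ID hashes the file path, kind, name, and start line together, and the insert tolerates a symbol being visited twice (for example through a re-export). The function names, the table layout, and the sample paths are hypothetical and not taken from CodeGraph's source.

```python
import hashlib
import sqlite3


def path_aware_id(path: str, kind: str, name: str, line: int) -> str:
    """Hypothetical ID that includes the file path and start line."""
    # A NUL separator keeps ("ab", "c") and ("a", "bc") from hashing alike.
    key = "\x00".join([path, kind, name, str(line)])
    return hashlib.sha256(key.encode()).hexdigest()[:32]


def upsert_node(conn, path: str, kind: str, name: str, line: int) -> str:
    """Insert a node, replacing the row if the same symbol is seen again."""
    node_id = path_aware_id(path, kind, name, line)
    conn.execute(
        "INSERT OR REPLACE INTO nodes (id, kind, name, file, line) "
        "VALUES (?, ?, ?, ?, ?)",
        (node_id, kind, name, path, line),
    )
    return node_id


conn = sqlite3.connect(":memory:")
# Assumed schema; only the unique nodes.id column is known from the error.
conn.execute(
    "CREATE TABLE nodes ("
    "id TEXT PRIMARY KEY, kind TEXT, name TEXT, file TEXT, line INTEGER)"
)

# Same hook name in two files (invented paths), then the first one again,
# as might happen when a barrel file re-exports it.
first = upsert_node(conn, "src/features/login/hooks/useAuth.js", "function", "useAuth", 3)
second = upsert_node(conn, "src/features/admin/hooks/useAuth.js", "function", "useAuth", 3)
again = upsert_node(conn, "src/features/login/hooks/useAuth.js", "function", "useAuth", 3)

node_count = conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
print(node_count)
```

INSERT OR REPLACE deletes and reinserts the conflicting row, so an indexer that stores edges referencing nodes.id might prefer INSERT OR IGNORE or an explicit update instead.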
Additional Context
The codebase has many similarly-named components and functions across different feature directories (e.g., multiple index.js barrel files, common hook names like useAuth, useForm, etc.). This pattern is typical of large React applications.
Happy to provide additional logs or a minimal reproduction case if helpful.