UNIQUE constraint failed: nodes.id during indexing on React/JSX codebase #11

@ronniehyslop

Description

Running codegraph index on a medium-to-large React/JavaScript codebase fails with a SQLite UNIQUE constraint violation. The failure occurs consistently between roughly 10% and 34% progress through the indexing phase.

Error Message

✗ Failed to index: UNIQUE constraint failed: nodes.id

Environment

  • CodeGraph Version: 0.3.1
  • Node.js Version: v20.19.0
  • Platform: Linux (WSL2) 6.6.87.2-microsoft-standard-WSL2
  • Installation Method: npm install -g @colbymchenry/codegraph --ignore-scripts followed by npm rebuild

Codebase Characteristics

  • More than 1,000 source files (excluding e2e tests)
  • Mixed JavaScript (.js, .jsx) and TypeScript (.ts, .tsx)
  • React 19 application with Vite
  • Standard project structure: src/components/, src/features/, src/hooks/, etc.

Steps to Reproduce

  1. Initialize codegraph in a React project:

    codegraph init .

  2. Run the indexer:

    codegraph index .

  3. Observe the failure at approximately 10-34% progress during the "Parsing code" phase.

Partial Index State

Before the failure, the indexer successfully processes some files:

Index Statistics:
  Files:     106
  Nodes:     412
  Edges:     21

Nodes by Kind:
  function        385
  method          21
  class           3
  interface       3

Files by Language:
  javascript      57
  jsx             46
  typescript      3

Attempted Workarounds

  1. Excluding e2e directory: Added "**/e2e/**" to the exclude list in config.json - issue persists
  2. Fresh initialization: Removed .codegraph/ directory and reinitialized - same error
  3. Multiple attempts: The failure point varies slightly between runs but always falls within the same 10-34% range

Analysis

The error suggests that the node ID generation algorithm is producing duplicate IDs for different code symbols. This could be caused by:

  • Hash collisions in the ID generation logic
  • Similar function/variable names across different files producing identical IDs
  • Re-exported symbols being processed multiple times
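To illustrate the second hypothesis, here is a minimal sketch of how an ID derived only from a symbol's kind and name would collide across files. I have not confirmed how CodeGraph actually builds its IDs; `NodeRecord`, `nameOnlyId`, `findDuplicateIds`, and the sample file paths are all hypothetical names made up for this example.

```typescript
// Hypothetical node shape; CodeGraph's real schema may differ.
interface NodeRecord {
  id: string;
  name: string;
  filePath: string;
}

// Assumed (unconfirmed) ID scheme that ignores the file path entirely.
function nameOnlyId(kind: string, name: string): string {
  return `${kind}:${name}`;
}

// Group records by ID and keep only the IDs claimed by more than one record.
function findDuplicateIds(nodes: NodeRecord[]): Map<string, NodeRecord[]> {
  const byId = new Map<string, NodeRecord[]>();
  for (const node of nodes) {
    const group = byId.get(node.id) ?? [];
    group.push(node);
    byId.set(node.id, group);
  }
  const duplicates = new Map<string, NodeRecord[]>();
  byId.forEach((group, id) => {
    if (group.length > 1) duplicates.set(id, group);
  });
  return duplicates;
}

// Illustrative paths only; two features each define their own useAuth hook.
const sample: NodeRecord[] = [
  { id: nameOnlyId("function", "useAuth"), name: "useAuth", filePath: "src/features/login/useAuth.js" },
  { id: nameOnlyId("function", "useAuth"), name: "useAuth", filePath: "src/features/admin/useAuth.js" },
  { id: nameOnlyId("function", "useForm"), name: "useForm", filePath: "src/hooks/useForm.js" },
];

const duplicates = findDuplicateIds(sample);
```

If the real scheme behaves anything like this, inserting both useAuth records into a table with a unique nodes.id column would produce exactly the reported error.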

Expected Behavior

The indexer should complete successfully, handling any potential ID conflicts gracefully (e.g., by incorporating file path or additional context into the ID hash).
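One possible shape for such an ID, as a sketch only: hash the file path together with the symbol's kind, name, and start line. `SymbolLocation` and `makeNodeId` are hypothetical names, not part of CodeGraph's API, and the choice of SHA-256 truncated to 32 hex characters is an assumption for illustration.

```typescript
import { createHash } from "node:crypto";

// Hypothetical description of where a symbol was found.
interface SymbolLocation {
  filePath: string;
  kind: string;
  name: string;
  startLine: number;
}

// Build a deterministic ID from path + kind + name + line.
// A NUL separator keeps adjacent fields from running into each other.
function makeNodeId(sym: SymbolLocation): string {
  // Normalize Windows-style separators so the same file hashes identically everywhere.
  const normalizedPath = sym.filePath.replace(/\\/g, "/");
  const key = [normalizedPath, sym.kind, sym.name, String(sym.startLine)].join("\0");
  return createHash("sha256").update(key).digest("hex").slice(0, 32);
}

// Illustrative paths only.
const loginHook = makeNodeId({ filePath: "src/features/login/useAuth.js", kind: "function", name: "useAuth", startLine: 3 });
const adminHook = makeNodeId({ filePath: "src/features/admin/useAuth.js", kind: "function", name: "useAuth", startLine: 3 });
```

With the path included, the two useAuth hooks get distinct IDs, while re-indexing the same file yields the same ID each time. Including the start line also separates same-named symbols within one file, at the cost of IDs changing when code moves.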

Additional Context

The codebase has many similarly-named components and functions across different feature directories (e.g., multiple index.js barrel files, common hook names like useAuth, useForm, etc.). This pattern is typical of large React applications.


Happy to provide additional logs or a minimal reproduction case if helpful.
