CodebaseIndexer: runaway embedding loop with 0% cache hits + 4GB+ data/query memleak #944

@joelteply

Summary

CodebaseIndexer (TS, src/system/rag/services/CodebaseIndexer.ts) enters a tight embedding-generation loop ~120s after server start and never stops. Together with the data/query memleak in continuum-core, this leaves the system unable to serve persona inference within minutes.

Symptoms (observed 2026-04-19, Mac M5)

  • Server starts → 120s grace → indexer scheduled
  • Once running: "Generated 16 embeddings (384d)" log lines fire continuously every ~100-500ms, each reporting cache hits 0/16
  • continuum-core-server CPU climbs to 1100%+, RSS climbs from ~500MB to 2.2GB
  • [MEMLEAK] data/query:+4807MB cumulative (largest single leaker)
  • ALL persona inference requests hang with no response — DataDaemon is starved
  • AIProviderDaemonServer reports "Appears stuck (60s, 90s, ... 360s+ since last success)" indefinitely

Workaround (already shipped)

The PR adds a SKIP_CODEBASE_INDEX=1 env var (commit 048a8235f, branch feature/shared-cognition-rust). Setting the var skips initializeCodebaseIndexing() entirely. With the var set, personas respond normally (validated 2026-04-19; see PR description).
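
For readers who have not looked at the commit, an env gate of this kind can be sketched roughly as follows. This is not the code from 048a8235f: only SKIP_CODEBASE_INDEX and initializeCodebaseIndexing are names from this issue, and the helper names, the Env type, and the exact comparison against "1" are assumptions made for illustration.

```typescript
// Hypothetical sketch of an env gate; not the actual ServiceInitializer code.
type Env = Record<string, string | undefined>;

// Assumption: only the literal value "1" enables the skip.
function shouldSkipCodebaseIndex(env: Env): boolean {
  return env["SKIP_CODEBASE_INDEX"] === "1";
}

// Runs the (injected) indexing initializer unless the gate is set.
async function maybeInitializeIndexing(
  env: Env,
  initializeCodebaseIndexing: () => Promise<void>,
): Promise<"skipped" | "started"> {
  if (shouldSkipCodebaseIndex(env)) {
    return "skipped";
  }
  await initializeCodebaseIndexing();
  return "started";
}
```

In a real server the first argument would presumably be process.env; passing it in keeps the gate easy to unit test.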

Root cause (not yet diagnosed)

Two intertwined issues to investigate:

  1. 0/16 cache hit ratio. The indexer is supposed to skip files whose contentHash matches the existing code_index entry (removeEntriesForFiles + loadContentHashes). With 0% hit rate, EITHER the hash compare is broken OR every cycle truly sees new content.
  2. data/query memleak. The indexer's read path (ORM.query over code_index with limit: 10000) appears to leak ~5-30MB per call. After thousands of calls, gigabytes accumulate. Possible causes:
    • SQLite connection / cursor not released
    • Vector buffers (384d × thousands of rows) retained in IPC layer
    • Embedding cache growing without bound
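
One way to narrow down the first issue is to classify every cache miss as either "no stored entry for this path" or "stored hash differs". The sketch below is a hypothetical diagnostic, not the indexer's real logic: SHA-256 as the hash, the SourceFile shape, and a path-keyed Map standing in for whatever loadContentHashes returns are all assumptions.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shapes; the real code_index schema may differ.
interface SourceFile {
  path: string;
  content: string;
}

interface HashReport {
  hit: number;      // stored hash equals current hash
  mismatch: number; // path is known but the hash differs
  missing: number;  // path has no stored entry at all
}

// Assumption: the content hash is a SHA-256 hex digest of the file text.
function contentHash(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Counts why each file would or would not be skipped.
function classifyFiles(
  files: SourceFile[],
  stored: Map<string, string>,
): HashReport {
  const report: HashReport = { hit: 0, mismatch: 0, missing: 0 };
  for (const file of files) {
    const previous = stored.get(file.path);
    if (previous === undefined) {
      report.missing += 1;
    } else if (previous === contentHash(file.content)) {
      report.hit += 1;
    } else {
      report.mismatch += 1;
    }
  }
  return report;
}
```

If nearly all misses land in "missing", the lookup key (for example path normalization) or the persistence of entries is the more likely suspect. If they land in "mismatch", the hash input itself is probably unstable between cycles.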

Fix priority

Workaround unblocks chat-validation. Real fix needs investigation under low-cost reproduction (single-folder repo? unit test that runs N indexing cycles and asserts steady RSS?).
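
The "N indexing cycles, assert steady RSS" idea could look roughly like the harness below. It is a sketch under assumptions: measureRssGrowth, isSteady, the warm-up count, and the injected runCycle callback are hypothetical names, and the default sampler reads the RSS of the current Node process only. If the leak lives in a separate continuum-core-server process, the sampler would have to read that process's RSS instead.

```typescript
// Hypothetical harness; runCycle stands in for one full indexing cycle.
type RssSampler = () => number;

// Default sampler: resident set size of the current Node process, in bytes.
const nodeRss: RssSampler = () => process.memoryUsage().rss;

// Runs warm-up cycles first so one-time allocations are not counted,
// then returns RSS growth in bytes over the measured cycles.
async function measureRssGrowth(
  runCycle: () => Promise<void>,
  cycles: number,
  sampleRss: RssSampler = nodeRss,
  warmupCycles = 2,
): Promise<number> {
  for (let i = 0; i < warmupCycles; i++) {
    await runCycle();
  }
  const before = sampleRss();
  for (let i = 0; i < cycles; i++) {
    await runCycle();
  }
  return sampleRss() - before;
}

// True when total growth stays within the allowed budget.
function isSteady(growthBytes: number, budgetBytes: number): boolean {
  return growthBytes <= budgetBytes;
}
```

RSS is noisy because of garbage collection and allocator behavior, so a test built on this would likely need a generous budget and enough cycles for a 5-30MB-per-call leak to dominate the noise.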

Related

  • Workaround commit: 048a8235f feat(ServiceInitializer): SKIP_CODEBASE_INDEX env gate
  • Branch: feature/shared-cognition-rust

Metadata

Labels: bug