Summary
CodebaseIndexer (TS, `src/system/rag/services/CodebaseIndexer.ts`) enters an embedding-generation tight loop ~120s after server start and never stops. Combined with the `data/query` memleak in continuum-core, the system becomes unable to serve persona inference within minutes.
Symptoms (observed 2026-04-19, Mac M5)
- Server starts → 120s grace → indexer scheduled
- Once running:
  - `Generated 16 embeddings (384d)` log lines fire every ~100-500ms continuously, cache hits 0/16
  - `continuum-core-server` CPU climbs to 1100%+, RSS climbs from ~500MB to 2.2GB
  - `[MEMLEAK] data/query: +4807MB cumulative` (largest single leaker)
- ALL persona inference requests hang with no response — DataDaemon is starved
- AIProviderDaemonServer reports "Appears stuck (60s, 90s, ... 360s+ since last success)" indefinitely
Workaround (already shipped)
PR adds a `SKIP_CODEBASE_INDEX=1` env var (commit `048a8235f`, branch `feature/shared-cognition-rust`). Setting the var skips `initializeCodebaseIndexing()` entirely. With the var set, personas respond normally (validated 2026-04-19 — see PR description).
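For reference, a minimal sketch of what such a gate typically looks like. The helper name and the accepted values here are assumptions for illustration, not copied from commit `048a8235f`:

```typescript
// Hedged sketch of the SKIP_CODEBASE_INDEX gate; the function name and the
// set of accepted values are assumptions, not taken from the actual commit.
function shouldSkipCodebaseIndex(env: Record<string, string | undefined>): boolean {
  const value = env.SKIP_CODEBASE_INDEX;
  return value === "1" || value === "true";
}

// ServiceInitializer would presumably consult it before scheduling the indexer:
//   if (!shouldSkipCodebaseIndex(process.env)) initializeCodebaseIndexing();
```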
Root cause (not yet diagnosed)
Two intertwined issues to investigate:
- 0/16 cache hit ratio. The indexer is supposed to skip files whose `contentHash` matches the existing `code_index` entry (`removeEntriesForFiles` + `loadContentHashes`). With a 0% hit rate, either the hash compare is broken or every cycle truly sees new content.
- `data/query` memleak. The indexer's read path (`ORM.query` over `code_index` with `limit: 10000`) appears to leak ~5-30MB per call. After thousands of calls, gigabytes accumulate. Could be:
- SQLite connection / cursor not released
- Vector buffers (384d × thousands of rows) retained in IPC layer
- Embedding cache growing without bound
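To make the first hypothesis concrete, here is a hedged sketch of the skip check the indexer is expected to perform. `shouldReindex` and `knownHashes` are illustrative names, not the actual CodebaseIndexer API; `loadContentHashes` in the real code presumably supplies the map:

```typescript
import { createHash } from "node:crypto";

// Hedged illustration of the expected skip logic (not the actual
// CodebaseIndexer code). A stable hash over unchanged file content must
// match across cycles; if it never does, every cycle re-embeds everything.
function contentHash(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// knownHashes stands in for what loadContentHashes() would return from
// the code_index table: file path -> stored contentHash.
function shouldReindex(
  path: string,
  content: string,
  knownHashes: Map<string, string>,
): boolean {
  return knownHashes.get(path) !== contentHash(content);
}
```

A classic way this breaks: the stored hash is computed over different input than the compare-time hash (raw bytes vs normalized text, or content plus mtime), so every compare misses and the hit rate pins at 0%.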
Fix priority
The workaround unblocks chat validation. The real fix needs investigation under a low-cost reproduction (a single-folder repo? a unit test that runs N indexing cycles and asserts steady RSS?).
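The proposed steady-RSS unit test could be sketched as follows. `runIndexingCycle` is a stub standing in for one full CodebaseIndexer pass over a small fixture repo, and the thresholds are placeholders:

```typescript
// Hedged sketch of the proposed steady-RSS reproduction harness.
// runIndexingCycle is a stub; the real test would call the actual indexer.
async function runIndexingCycle(): Promise<void> {
  // Replace with the real indexing entry point over a single-folder fixture.
}

// Runs N cycles and reports RSS growth in MB. A leaking read path
// (~5-30MB per data/query call) would show up as roughly linear growth.
async function measureRssGrowthMb(cycles: number): Promise<number> {
  const gc = (globalThis as any).gc as (() => void) | undefined;
  gc?.(); // run node with --expose-gc so the baseline isn't inflated by garbage
  const baseline = process.memoryUsage().rss;
  for (let i = 0; i < cycles; i++) {
    await runIndexingCycle();
  }
  gc?.();
  return (process.memoryUsage().rss - baseline) / (1024 * 1024);
}

// In the test, something like: assert(await measureRssGrowthMb(1000) < 50);
```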
Related
- Workaround commit: `048a8235f` feat(ServiceInitializer): SKIP_CODEBASE_INDEX env gate
- Branch: `feature/shared-cognition-rust`