Bug: SIGSEGV (exit 139) in lsp_cross pass when indexing TS/JS monorepos with 1189+ definitions
Summary
The lsp_cross pass crashes with SIGSEGV when indexing TypeScript/JavaScript monorepos where pxc_collect_all_defs produces a CBMLSPDef[] array exceeding ~1189 entries. The crash is scale-dependent, not file-specific — any subset of files that stays below the threshold indexes successfully.
Environment
- OS: macOS 15 (Darwin 24.0.0, arm64)
- Binary: v0.6.1 darwin-arm64 (prebuilt release)
- Project: Vue 3 + TypeScript monorepo (pnpm workspace)
- 1832 files total, 4864 defs extracted
- 899 TS/JS/TSX files requiring lsp_cross processing
- Backend (Kotlin) indexes fine in full mode: Kotlin has no cross-file LSP, so pxc_has_cross_lsp() returns false
Reproduction
# Clone a Vue 3 + TS monorepo with 1800+ files
codebase-memory-mcp index --project my-frontend --repo /path/to/frontend --mode full
# → SIGSEGV at lsp_cross pass (exit code 139)
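Exit code 139 is how a POSIX shell reports a process killed by signal 11 (SIGSEGV): 128 plus the signal number. For scripted reproduction, the sketch below shows one way to decode that from a wait status. The helper names (shell_exit_code, run_child, crash_with_sigsegv, exit_cleanly) are illustrative assumptions and are not part of codebase-memory-mcp; the crashing child is only a stand-in for the indexer.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Convert a waitpid() status into the exit code a POSIX shell would report:
 * 128 + signal number for a signal death, otherwise the plain exit status. */
int shell_exit_code(int wait_status) {
    if (WIFSIGNALED(wait_status)) {
        return 128 + WTERMSIG(wait_status);
    }
    if (WIFEXITED(wait_status)) {
        return WEXITSTATUS(wait_status);
    }
    return -1; /* stopped or continued: not a final status */
}

/* Run fn() in a forked child and return its shell-style exit code. */
int run_child(void (*fn)(void)) {
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        fn();
        _exit(0); /* reached only if fn() returns normally */
    }
    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    assert(waited == pid);
    return shell_exit_code(status);
}

/* Stand-in for the crashing indexer: dies with the default SIGSEGV action. */
void crash_with_sigsegv(void) {
    signal(SIGSEGV, SIG_DFL);
    raise(SIGSEGV);
}

/* Stand-in for a successful run. */
void exit_cleanly(void) {
}
```

A wrapper script around the indexer could use the same decoding to distinguish a SIGSEGV from an ordinary non-zero exit when testing different sub-package combinations.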
Scale Threshold Isolation
Systematic binary-search testing isolates the crash to the total defs count passed to cbm_run_ts_lsp_cross:
| Scope | Files | Defs | Result |
|---|---|---|---|
| Minimal 2-file TS project | 2 | 4 | OK |
| apps/ alone | 917 | 1160 | OK |
| packages/ alone | 471 | 1189 | CRASH |
| packages/ui-kit alone | 337 | ~600 | CRASH |
| packages/ui-kit/shadcn-ui alone | 251 | 25 | OK |
| Any single packages/* sub-package | varies | <200 | OK |
| Combined sub-packages (total defs >1189) | varies | 1189+ | CRASH |
The crash triggers regardless of which specific files are included — it's the accumulated all_defs array size that matters.
Root Cause Analysis
The crash path is:
cbm_pipeline_pass_lsp_cross() [pass_lsp_cross.c:400]
  → pxc_collect_all_defs() [pass_lsp_cross.c:149]
      // Creates single shared CBMLSPDef[1189+] array
  → for each TS/JS file:
      → pxc_run_one_ts() [pass_lsp_cross.c:381]
          → cbm_run_ts_lsp_cross() [ts_lsp.c:4230]
              // Registers ALL defs into CBMTypeRegistry
              // → SIGSEGV
Key observations from source analysis (pass_lsp_cross.c):
- pxc_collect_all_defs (line 149) collects ALL project definitions into a single CBMLSPDef[] array: every Class, Interface, Function, Method, Enum, Type, Protocol, Trait across all files.
- cbm_pipeline_pass_lsp_cross (line 400) passes this entire all_defs array to every pxc_run_one_ts / pxc_run_one call. For 899 TS files, the same 1189+ def array is registered into 899 separate type registries.
- pxc_run_one_ts (line 381) uses a per-file scratch arena, but the all_defs array and the type registry built from it are not isolated, so memory corruption in one iteration can cascade.
- The per-file scratch arena in pxc_run_one/pxc_run_one_ts was added to prevent O(N×project_size) memory growth (noted in the comment at lines 326-330), but it doesn't protect against corruption in the shared defs array or the type registry registration path.
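To make the second observation concrete, here is a simplified model of the loop shape described above. It is not the actual pass_lsp_cross.c code: ModelDef, ModelRegistry, model_register_all, and model_lsp_cross_pass are hypothetical names, and the model only counts registrations to show that the work scales with files × defs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for CBMLSPDef and CBMTypeRegistry (model only). */
typedef struct {
    const char *name;
} ModelDef;

typedef struct {
    size_t registered; /* number of defs registered so far */
} ModelRegistry;

/* Register every entry of the shared defs array into one file's registry. */
void model_register_all(ModelRegistry *reg, const ModelDef *defs, size_t n_defs) {
    assert(reg != NULL);
    assert(defs != NULL || n_defs == 0);
    for (size_t i = 0; i < n_defs; i++) {
        (void)defs[i].name; /* a real registry would store the def here */
        reg->registered++;
    }
}

/* One fresh registry per file, each fed the same shared defs array.
 * Returns the total number of registrations performed. */
size_t model_lsp_cross_pass(const ModelDef *defs, size_t n_defs, size_t n_files) {
    size_t total = 0;
    for (size_t f = 0; f < n_files; f++) {
        ModelRegistry reg = {0};
        model_register_all(&reg, defs, n_defs);
        total += reg.registered;
    }
    return total;
}
```

With the figures from this report (899 files, 1189 defs), the model performs 899 × 1189 = 1,068,911 registrations, all reading from the same shared array.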
Suggested Fix: Batch Processing
Split the TS/JS file loop in cbm_pipeline_pass_lsp_cross into batches. Each batch:
- Collects only the definitions relevant to its subset of files (direct defs + imported defs)
- Creates a fresh CBMLSPDef[] subset and scratch arena
- Processes its file subset independently
This limits the all_defs array size per batch, avoiding the memory corruption threshold while preserving cross-file resolution within each batch.
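A minimal sketch of the batching idea, under stated assumptions: BatchDef is a hypothetical, simplified stand-in for CBMLSPDef, and select_batch_defs, run_in_batches, and record_max_defs are illustrative names that do not exist in the codebase. The sketch selects only defs declared by files in the batch; collecting imported defs, which the proposal above also calls for, is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for CBMLSPDef: only the owning file matters here. */
typedef struct {
    size_t file_index; /* index of the file that declares this def */
    const char *name;
} BatchDef;

/* Hypothetical per-batch callback; a real one would run the per-file LSP pass. */
typedef void (*BatchFn)(const BatchDef *defs, size_t n_defs,
                        size_t first_file, size_t end_file, void *ctx);

/* Copy the defs declared by files in [first_file, end_file) into out.
 * out must have room for n_all entries. Imported defs are NOT added. */
size_t select_batch_defs(const BatchDef *all, size_t n_all,
                         size_t first_file, size_t end_file, BatchDef *out) {
    size_t n = 0;
    for (size_t i = 0; i < n_all; i++) {
        if (all[i].file_index >= first_file && all[i].file_index < end_file) {
            out[n++] = all[i];
        }
    }
    return n;
}

/* Walk the files in fixed-size batches, giving each batch a fresh defs subset.
 * Returns the number of batches run, or 0 on bad input or allocation failure. */
size_t run_in_batches(const BatchDef *all, size_t n_all, size_t n_files,
                      size_t batch_size, BatchFn fn, void *ctx) {
    if (batch_size == 0 || fn == NULL) {
        return 0;
    }
    size_t batches = 0;
    for (size_t first = 0; first < n_files; first += batch_size) {
        size_t end = (first + batch_size < n_files) ? first + batch_size : n_files;
        /* Fresh buffer per batch so no defs storage is shared across batches. */
        BatchDef *subset = malloc((n_all ? n_all : 1) * sizeof *subset);
        if (subset == NULL) {
            return 0;
        }
        size_t n = select_batch_defs(all, n_all, first, end, subset);
        fn(subset, n, first, end, ctx);
        free(subset);
        batches++;
    }
    return batches;
}

/* Example callback: records the largest subset any batch received. */
void record_max_defs(const BatchDef *defs, size_t n_defs,
                     size_t first_file, size_t end_file, void *ctx) {
    (void)defs;
    (void)first_file;
    (void)end_file;
    size_t *max_seen = ctx;
    if (n_defs > *max_seen) {
        *max_seen = n_defs;
    }
}
```

Tracking the largest per-batch subset, as record_max_defs does, is one way to check that a chosen batch size keeps every batch under the observed threshold. Whether that actually avoids the crash depends on the root cause, which this report has not confirmed.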
Relationship to Other Issues
- lsp_cross SIGSEGV, filed 2 days ago with bisect pointing to May 8 LSP commits. This issue adds scale-threshold isolation data and C source-level root cause analysis.
- (... pointer authentication trap IB). The lsp_cross corruption could cascade into the dump phase.
Workaround
Index with mode: "moderate" (which skips the lsp_cross pass), or index sub-packages individually so that each stays below the ~1189 def threshold.