
v0.5.0a13 — cross-file resolver: fix OOM (spill to DB)


@TheYonk TheYonk released this 09 Jun 13:02
12b29f9

Fix: cross-file code-graph resolver OOM on large repos

The v0.5.0a12 cross-file resolver (cross_file_code_graph=True) accumulated one in-memory entry per call site, each carrying a source snippet. Memory therefore grew as O(call sites) at ~427 B per call, projecting to ~3.6 GB at 100K files, which is enough to OOM on large codebases.
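As a back-of-envelope check, the projection can be inverted to see what call-site volume it implies. This is only arithmetic on the figures quoted above, assuming 1 GB = 10^9 bytes; the variable names are illustrative and not part of the project.

```python
# Figures quoted in the release note.
BYTES_PER_CALL = 427          # ~427 B per accumulated call-site entry
PROJECTED_BYTES = 3.6e9       # ~3.6 GB (assumption: 1 GB = 1e9 bytes)
FILES = 100_000               # projection target: 100K files

# Call sites implied by the projection, in total and per file.
implied_call_sites = PROJECTED_BYTES / BYTES_PER_CALL
implied_calls_per_file = implied_call_sites / FILES

print(f"{implied_call_sites:,.0f} call sites, "
      f"about {implied_calls_per_file:.0f} per file")
```

In other words, the projection corresponds to roughly 8.4 million call sites, or on the order of 84 per file.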

Fix: CorpusCodeGraph now keeps only the small symbol index in memory and spills call sites to an UNLOGGED code_calls_stage table (migration 014). Phase-2 resolution drains the staged rows in keyset batches via resolve_batch(), which is verified byte-identical to the one-shot resolver, followed by a single resolve_class_edges() pass. Peak resolver memory is now O(batch + symbol index) instead of O(corpus call sites), and the scratch rows are deleted afterward.
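The spill-and-drain pattern described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it uses an in-memory SQLite table as a stand-in for the UNLOGGED PostgreSQL table, and the column names (id, caller, callee_name), the helper names, and the toy symbol index are all assumptions made for the example.

```python
import sqlite3

def drain_in_keyset_batches(conn, batch_size):
    """Yield staged call sites in id order, one batch at a time (keyset pagination)."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, caller, callee_name FROM code_calls_stage "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        yield rows
        last_id = rows[-1][0]  # resume after the last key seen, no OFFSET scan

def resolve_batch_sketch(symbol_index, rows):
    """Hypothetical resolver: map each callee name to its defining file, skip unknowns."""
    return [(caller, symbol_index[name])
            for _id, caller, name in rows if name in symbol_index]

# Stage a few call sites (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE code_calls_stage "
    "(id INTEGER PRIMARY KEY, caller TEXT, callee_name TEXT)"
)
conn.executemany(
    "INSERT INTO code_calls_stage (caller, callee_name) VALUES (?, ?)",
    [("a.py", "foo"), ("a.py", "bar"), ("b.py", "foo"),
     ("c.py", "missing"), ("c.py", "bar")],
)

# Only the small symbol index stays in memory.
symbol_index = {"foo": "lib/foo.py", "bar": "lib/bar.py"}

# One-shot resolution over every staged row, for the parity comparison.
all_rows = conn.execute(
    "SELECT id, caller, callee_name FROM code_calls_stage ORDER BY id"
).fetchall()
one_shot = resolve_batch_sketch(symbol_index, all_rows)

# Batched resolution: at most batch_size rows are held at any time.
batched = []
for rows in drain_in_keyset_batches(conn, batch_size=2):
    batched.extend(resolve_batch_sketch(symbol_index, rows))

# Scratch rows are removed once resolution is done.
conn.execute("DELETE FROM code_calls_stage")
remaining = conn.execute("SELECT COUNT(*) FROM code_calls_stage").fetchone()[0]
```

Because each batch resumes from the last id seen, the query cost per batch stays bounded by the batch size rather than growing with the offset, and the comparison of batched against one_shot mirrors the parity check the release describes.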

  • Parity: batched == one-shot edges (unit test); 644 cross-file edges on self-ingest unchanged.
  • Memory: peak in-memory call list measured 0 (was ~24.7K).
  • No change to the resolved graph; default ingest path untouched.

Full changelog: see CHANGELOG.md.