v0.5.0a1 — background extraction + multi-worker safety + observability
Pre-release
Headline: defer_extraction=True decouples LLM/lede extraction from ingest(), making time-to-queryable 59× faster (~18 ms/doc vs ~1063 ms/doc on lede_spacy MHR). A new pgrg extract CLI (--once / --daemon) drains the queue out-of-band, with multi-worker safety guaranteed by SELECT … FOR UPDATE SKIP LOCKED (claim) and a new UNIQUE (namespace, src_id, dst_id, rel_type) constraint + ON CONFLICT on writes.
Backward-compatible: synchronous-extract default behavior is byte-for-byte unchanged.
What's new
Background extraction
Pass defer_extraction=True to ingest_records() and the producer returns in chunk + embed time only — the document is immediately queryable via vector + BM25, with entity/relationship extraction deferred. Drain on your schedule:
# cron-driven
pgrg --db $PGRG_DSN extract --namespace crm --once
# always-on daemon — SIGTERM-graceful, emits metrics per iteration
pgrg --db $PGRG_DSN extract --namespace crm --daemon --poll-interval 1.0
New module: pg_raggraph.backfill (claim_pending / extract_documents / release_processing). New migration 012_documents_graph_status.sql adds the queue-tracking columns with a partial index on pending rows.
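The claim step can be illustrated with a small simulation. This is a sketch, not the pg_raggraph.backfill implementation: the SQL text, the documents table, and the id / graph_status column names are assumptions made for illustration, and claim_batch, drain_once, and run_workers are hypothetical helpers. The in-memory queue only mimics the property that SKIP LOCKED is meant to provide, namely that no two workers receive the same pending document.

```python
import threading

# Hypothetical claim query: table and column names are assumptions,
# not the schema shipped in migration 012.
CLAIM_SQL = """
UPDATE documents
   SET graph_status = 'processing'
 WHERE id IN (
         SELECT id
           FROM documents
          WHERE namespace = %s
            AND graph_status = 'pending'
          ORDER BY id
          LIMIT %s
            FOR UPDATE SKIP LOCKED
       )
RETURNING id
"""


class InMemoryQueue:
    """Stand-in for the documents table; one lock plays the role of row locks."""

    def __init__(self, doc_ids):
        self._status = {doc_id: "pending" for doc_id in doc_ids}
        self._lock = threading.Lock()

    def claim_batch(self, batch_size):
        # Atomically flip up to batch_size pending docs to 'processing'.
        with self._lock:
            batch = [d for d, s in self._status.items() if s == "pending"]
            batch = batch[:batch_size]
            for doc_id in batch:
                self._status[doc_id] = "processing"
            return batch

    def mark_ready(self, doc_id):
        with self._lock:
            self._status[doc_id] = "ready"

    def depth(self, status):
        with self._lock:
            return sum(1 for s in self._status.values() if s == status)


def drain_once(queue, batch_size, claimed_log):
    """Claim batches until nothing is pending (one drain pass)."""
    while True:
        batch = queue.claim_batch(batch_size)
        if not batch:
            return
        for doc_id in batch:
            claimed_log.append(doc_id)  # extraction would happen here
            queue.mark_ready(doc_id)


def run_workers(n_docs=100, n_workers=4, batch_size=8):
    """Drain one queue with several concurrent workers."""
    queue = InMemoryQueue(range(n_docs))
    claimed_log = []
    threads = [
        threading.Thread(target=drain_once, args=(queue, batch_size, claimed_log))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return queue, claimed_log
```

Calling run_workers() and checking that every document id appears exactly once in claimed_log is a quick way to see the invariant; against a real database the same check would need a live Postgres instance.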
Measured impact (MHR slice, lede_spacy, no LLM):
| n_docs | SYNC ingest (A) | DEFER ingest (B) | DRAIN (C) | B+C total | A/B speedup |
|---|---|---|---|---|---|
| 20 | 21.27 s | 0.36 s | 7.24 s | 7.60 s | 59.0× |
| 40 | 26.27 s | 0.44 s | 15.12 s | 15.56 s | 59.8× |
Total async path (B+C) is also faster than synchronous (A) because the synchronous path holds per-doc transactions open across extraction.
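The derived figures can be recomputed from the table as a sanity check. The sketch below uses only the rounded timings printed above; the summarize helper is hypothetical. Ratios recomputed this way land within about 0.1 of the reported speedups, which would be consistent with the reported values having been computed from unrounded timings (an assumption, not something the benchmark output confirms).

```python
# Rounded timings copied from the table, in seconds.
ROWS = [
    {"n_docs": 20, "sync": 21.27, "defer": 0.36, "drain": 7.24, "reported": 59.0},
    {"n_docs": 40, "sync": 26.27, "defer": 0.44, "drain": 15.12, "reported": 59.8},
]


def summarize(row):
    """Derive per-doc latency, async total, and speedup for one table row."""
    return {
        "n_docs": row["n_docs"],
        "sync_ms_per_doc": 1000 * row["sync"] / row["n_docs"],
        "defer_ms_per_doc": 1000 * row["defer"] / row["n_docs"],
        "async_total_s": row["defer"] + row["drain"],  # B + C
        "speedup": row["sync"] / row["defer"],         # A / B
    }


for row in ROWS:
    s = summarize(row)
    print(
        f"n={s['n_docs']}: "
        f"sync {s['sync_ms_per_doc']:.1f} ms/doc, "
        f"defer {s['defer_ms_per_doc']:.1f} ms/doc, "
        f"B+C {s['async_total_s']:.2f} s, "
        f"A/B {s['speedup']:.1f}x"
    )
```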
Multi-worker safety invariants
- Namespace-scoped reaper. release_processing is keyword-only on namespace= / doc_ids=. The CLI passes its --namespace, so a worker starting in namespace A no longer steals namespace B's in-flight 'processing' claims.
- Edge-level idempotency. Migration 013_relationships_unique.sql de-duplicates any existing rows and adds UNIQUE (namespace, src_id, dst_id, rel_type). Both ingest paths use ON CONFLICT DO UPDATE SET weight = GREATEST(...), so re-extraction is safe.
- merge_entities updated to pre-delete colliding rows before the src_id/dst_id rewrite.
Run as many pgrg extract --daemon workers as you want against the same namespace.
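The edge-level idempotency can be tried out locally without Postgres. The following is a minimal sketch using SQLite as a stand-in: it assumes a relationships table with the five columns named above (the column types are guesses), and it uses SQLite's two-argument MAX() where the Postgres migration uses GREATEST(). The make_db, write_edge, and edges helpers are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory table with the same uniqueness rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE relationships (
            namespace TEXT    NOT NULL,
            src_id    INTEGER NOT NULL,
            dst_id    INTEGER NOT NULL,
            rel_type  TEXT    NOT NULL,
            weight    REAL    NOT NULL,
            UNIQUE (namespace, src_id, dst_id, rel_type)
        )
        """
    )
    return conn


# On a duplicate edge, keep the larger of the stored and incoming weights.
UPSERT_SQL = """
INSERT INTO relationships (namespace, src_id, dst_id, rel_type, weight)
VALUES (?, ?, ?, ?, ?)
ON CONFLICT (namespace, src_id, dst_id, rel_type)
DO UPDATE SET weight = MAX(weight, excluded.weight)
"""


def write_edge(conn, namespace, src_id, dst_id, rel_type, weight):
    conn.execute(UPSERT_SQL, (namespace, src_id, dst_id, rel_type, weight))


def edges(conn):
    return conn.execute(
        "SELECT namespace, src_id, dst_id, rel_type, weight "
        "FROM relationships ORDER BY src_id, dst_id, rel_type"
    ).fetchall()
```

Writing the same edge three times with weights 0.4, 0.9, and 0.2 leaves a single row with weight 0.9, which is the behavior that makes re-running extraction over an already-extracted document harmless.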
Observability
Three new metric events per iteration, through the existing _emit_metric channel:
- pgrg.backfill.claim: namespace, batch_size, claimed, latency_ms
- pgrg.backfill.extract: claimed, ready, failed, entities, relationships, latency_ms
- pgrg.backfill.queue_depth: per-status doc counts
Query-time hint
QueryResult.metadata['graph_status_summary'] and GraphRAG.status(ns)['graph_status'] expose per-status doc counts so callers can see whether the graph is still backfilling.
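A caller might use these counts to decide whether graph-based results are complete yet. The sketch below assumes the summary is a plain mapping from status name to document count, and that the in-flight statuses are 'pending' and 'processing'; both are assumptions inferred from these notes rather than a documented contract, and the helper names are hypothetical.

```python
from collections import Counter

# Statuses treated as "graph not built yet"; the exact set is an assumption.
IN_FLIGHT = ("pending", "processing")


def summarize_statuses(statuses):
    """Build a status -> count mapping from an iterable of status strings."""
    return dict(Counter(statuses))


def is_backfilling(summary):
    """True if any document is still waiting for, or undergoing, extraction."""
    return any(summary.get(status, 0) > 0 for status in IN_FLIGHT)


def graph_coverage(summary):
    """Fraction of documents whose graph extraction has finished as 'ready'."""
    total = sum(summary.values())
    return summary.get("ready", 0) / total if total else 1.0
```

An application could, for example, fall back to vector + BM25 results only, or show a "graph still building" notice, while is_backfilling returns True.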
Benchmarks
- benchmarks/defer_extraction_bench.py: A/B/C harness (sync ingest vs deferred ingest vs drain).
- benchmarks/ingest_perf.py extended with --provider {local,http}. Measured: TEI HTTP CPU beats local fastembed 2.1× on bge-small (66 vs 140 ms/chunk).
Documentation
- docs/cookbook/background-extraction.md: full guide with three architectural patterns (sync / cron / always-on daemon), end-to-end FastAPI walkthrough, operator playbook.
- README.md / docs/README.md / docs/user-guide.md / docs/operations-guide.md: cross-references and discovery paths.
Production-readiness audit
skill-output/prod-ready/ contains the 16-finding audit that drove this release's safety fixes. P0s landed; P1s (timed background reaper, retry counter + cap, extractor health probe, multi-worker concurrency test) are tracked for the next cycle.
Verdict: single-worker / single-namespace deployments are production-ready; multi-worker deployments are safe by construction.
Compatibility
- Synchronous ingest behavior: byte-for-byte unchanged.
- Schema migrations 012 + 013 run automatically on connect; both are forward-only and idempotent.
- release_processing signature is now keyword-only: any caller passing doc_ids positionally needs doc_ids= (the only known internal caller in tests is updated).
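For callers unsure what the keyword-only change means in practice, here is a minimal sketch. release_processing_like is a hypothetical stand-in, not the real function: its parameters mirror the names mentioned above, but the defaults and return value are invented for illustration.

```python
def release_processing_like(*, namespace, doc_ids=None):
    """Hypothetical stand-in: the bare * makes every parameter keyword-only."""
    ids = list(doc_ids) if doc_ids is not None else None
    return {"namespace": namespace, "doc_ids": ids}


def call_positionally():
    """Show what an old positional call would raise under the new signature."""
    try:
        release_processing_like("crm", [1, 2])  # old positional style
    except TypeError as exc:
        return str(exc)
    return None


# New style: every argument is named explicitly.
result = release_processing_like(namespace="crm", doc_ids=[1, 2])
```

Python raises TypeError at call time for the positional form, so any missed call site should fail loudly rather than silently release the wrong rows.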