v0.1.0b2 — Score-gated expression, WAL durability, ΣĒMA cold-storage
What's New
Score-gated expression & retrieval quality
- Coverage metric now uses extracted domain/entity signals instead of raw word splits — coverage: 0.19 → 0.85-1.0
- Ellipticity improved from 0.37 avg → 0.60-0.74 (approaching aligned threshold)
- Score-gated trimming drops weak-scoring tail candidates (< 20% of top score)
- Dynamic density denominator scales by expressed/max ratio
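A minimal sketch of the score-gated trimming rule described above, assuming candidates arrive as `(item, score)` pairs with non-negative scores. The function name `trim_by_score` and the `min_ratio` parameter are illustrative and not taken from the project's code; only the 20% cutoff comes from these notes.

```python
def trim_by_score(candidates, min_ratio=0.2):
    """Keep candidates whose score is at least min_ratio of the top score.

    candidates: list of (item, score) pairs (assumed shape, scores >= 0).
    """
    if not candidates:
        return []
    # Rank best-first so the weak tail sits at the end.
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    top_score = ranked[0][1]
    if top_score <= 0:
        # No positive signal to gate against, so keep everything.
        return ranked
    cutoff = min_ratio * top_score
    return [pair for pair in ranked if pair[1] >= cutoff]


kept = trim_by_score([("a", 1.0), ("b", 0.5), ("c", 0.19), ("d", 0.05)])
print(kept)  # [('a', 1.0), ('b', 0.5)]
```

Gating relative to the top score, rather than against a fixed threshold, keeps the cutoff meaningful when absolute score ranges differ between queries.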
WAL durability
- `checkpoint()` method with PASSIVE/FULL/TRUNCATE modes
- Periodic checkpoint in upsert (every 50/500 genes) + background 60s timer in server
- Max crash loss reduced from ~13,700 genes to ~50
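The checkpoint behaviour above can be sketched with the standard-library `sqlite3` module and SQLite's `PRAGMA wal_checkpoint`. The `GeneStore` class, its `genes` table, and the `checkpoint_every` parameter are hypothetical stand-ins for illustration; the project's actual schema and method signatures may differ.

```python
import os
import sqlite3
import tempfile

CHECKPOINT_MODES = {"PASSIVE", "FULL", "TRUNCATE"}


class GeneStore:
    """Hypothetical WAL-backed store that checkpoints every N upserts."""

    def __init__(self, path, checkpoint_every=50):
        self.conn = sqlite3.connect(path)
        self.conn.execute("PRAGMA journal_mode=WAL")
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS genes (id TEXT PRIMARY KEY, body TEXT)"
        )
        self.conn.commit()
        self.checkpoint_every = checkpoint_every
        self._since_checkpoint = 0

    def checkpoint(self, mode="PASSIVE"):
        """Run a WAL checkpoint; returns (busy, wal_frames, checkpointed_frames)."""
        mode = mode.upper()
        if mode not in CHECKPOINT_MODES:
            raise ValueError(f"unsupported checkpoint mode: {mode}")
        # PRAGMA arguments cannot be bound as parameters, hence the whitelist.
        return self.conn.execute(f"PRAGMA wal_checkpoint({mode})").fetchone()

    def upsert(self, gene_id, body):
        self.conn.execute(
            "INSERT OR REPLACE INTO genes (id, body) VALUES (?, ?)", (gene_id, body)
        )
        self.conn.commit()
        self._since_checkpoint += 1
        if self._since_checkpoint >= self.checkpoint_every:
            # PASSIVE never waits on readers, so it is cheap on the write path.
            self.checkpoint("PASSIVE")
            self._since_checkpoint = 0


with tempfile.TemporaryDirectory() as tmp:
    store = GeneStore(os.path.join(tmp, "genome.db"), checkpoint_every=50)
    for i in range(120):
        store.upsert(f"gene-{i}", "payload")
    store.checkpoint("TRUNCATE")  # e.g. on clean shutdown, resets the WAL file
    store.conn.close()
```

With a counter like this, the number of upserts that have not yet been checkpointed is bounded by the interval, which is consistent with the drop in worst-case loss to roughly 50 genes.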
ΣĒMA cold-storage compression tiers
- Three tiers: OPEN (full fidelity), EUCHROMATIN (summary + ΣĒMA), HETEROCHROMATIN (ΣĒMA + metadata only)
- `compact_genome()` retroactive sweep with configurable thresholds
- Density gate at ingest routes low-signal content directly to cold tiers
- `/admin/compact` and `/admin/checkpoint` endpoints
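As a rough illustration of how a density gate could route content into the three tiers, the sketch below maps a density score to a tier and keeps only the fields that tier retains. Everything here is an assumption made for the example: the threshold values (0.5 and 0.2), the function names `assign_tier` and `compress_gene`, and the field names `content`, `summary`, `sema`, and `metadata` are not taken from the project.

```python
from enum import Enum


class Tier(Enum):
    OPEN = "open"                        # full fidelity
    EUCHROMATIN = "euchromatin"          # summary + ΣĒMA vector
    HETEROCHROMATIN = "heterochromatin"  # ΣĒMA vector + metadata only


def assign_tier(density, open_min=0.5, euchromatin_min=0.2):
    """Map a density score to a tier (threshold values are hypothetical)."""
    if density >= open_min:
        return Tier.OPEN
    if density >= euchromatin_min:
        return Tier.EUCHROMATIN
    return Tier.HETEROCHROMATIN


def compress_gene(gene, tier):
    """Return a copy of gene holding only the fields the tier retains."""
    if tier is Tier.OPEN:
        return dict(gene)
    if tier is Tier.EUCHROMATIN:
        keep = ("id", "summary", "sema")
    else:
        keep = ("id", "sema", "metadata")
    return {key: gene[key] for key in keep if key in gene}


gene = {
    "id": "g1",
    "content": "full text of the gene",
    "summary": "short summary",
    "sema": [0.1, 0.2, 0.3],
    "metadata": {"source": "example"},
}
tier = assign_tier(0.12)
print(tier.name, sorted(compress_gene(gene, tier)))
```

The same two functions could serve both paths mentioned above: called at ingest for the density gate, and called over existing rows for a retroactive sweep.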
Domain tagging
- spaCy EntityRuler with project vocabulary (before statistical NER)
- SPLADE weight boosted 2.5 → 3.5 as semantic safety net
Performance
- Dedicated read-only SQLite connection — WAL readers no longer block writers
- ΣĒMA vector cache: pre-materialized numpy matrix replaces 7K json_loads() per query
- Mode B scan: 120s → <100ms
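One way to set up a dedicated read-only connection alongside a WAL writer is SQLite's URI syntax with `mode=ro`, sketched below with the standard-library `sqlite3` module. The helper names `open_connections` and `count_genes` and the `genes` table are hypothetical; the project's connection handling may be organised differently.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def open_connections(db_path):
    """Return (writer, reader); the reader is opened read-only via a URI."""
    writer = sqlite3.connect(db_path)
    writer.execute("PRAGMA journal_mode=WAL")
    writer.execute(
        "CREATE TABLE IF NOT EXISTS genes (id TEXT PRIMARY KEY, body TEXT)"
    )
    writer.commit()
    # as_uri() needs an absolute path; mode=ro makes SQLite reject writes.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    reader = sqlite3.connect(uri, uri=True)
    return writer, reader


def count_genes(conn):
    # fetchall() exhausts the cursor so no read snapshot is left open.
    return conn.execute("SELECT COUNT(*) FROM genes").fetchall()[0][0]


with tempfile.TemporaryDirectory() as tmp:
    writer, reader = open_connections(os.path.join(tmp, "genome.db"))
    writer.execute("INSERT INTO genes VALUES ('g1', 'committed')")
    writer.commit()
    # Leave a write transaction open; the reader still sees the last commit.
    writer.execute("INSERT INTO genes VALUES ('g2', 'pending')")
    print(count_genes(reader))
    writer.commit()
    reader.close()
    writer.close()
```

Keeping reads on their own connection means a long scan never shares transaction state with the write path, and an accidental write through the reader fails with `sqlite3.OperationalError` instead of silently succeeding.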
All 179 tests passing.
🤖 Generated with Claude Code