v0.3.0b2 — Agent-friendly responses + hot-reload
Patch release — better multi-agent ergonomics
Focused on making the /context endpoint respond more usefully to programmatic agents (not just Continue), and letting operators refresh runtime state without restarting the process.
What's New
Agent-friendly /context response
The /context endpoint now returns an agent metadata field alongside the existing Continue-compatible fields. Backward compatible — existing integrations work unchanged.
New fields:
```json
{
"name": "Helix Genome Context",
"description": "...",
"content": "...",
"context_health": { ... },
"agent": {
"recommendation": "trust" | "verify" | "refresh" | "reread_raw",
"hint": "Context is well-grounded. Use directly.",
"citations": [
{ "gene_id": "a23ff24e...", "source": "...", "score": 108.71 },
...
],
"latency_ms": 1996.2,
"total_tokens_est": 3614,
"compression_ratio": 2.77,
"moe_mode": true
}
}
```
Pass verbose: true in the request body to include promoter tags (domains, entities) for each citation — useful when agents need to inspect why a gene was ranked.
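As a rough illustration of consuming the new block, the sketch below ranks citations by score from a parsed /context response. The helper name `top_citations` is hypothetical and not part of Helix; it assumes only the `agent.citations[].gene_id` and `score` fields shown in the response shape above, and it returns an empty list when the `agent` block is absent.

```python
def top_citations(response: dict, limit: int = 3) -> list[tuple[str, float]]:
    """Return up to `limit` (gene_id, score) pairs, highest score first.

    Hypothetical client-side helper; only reads fields documented in the
    /context response shape (agent.citations[].gene_id and .score).
    """
    # A response without the agent block yields no citations.
    citations = response.get("agent", {}).get("citations", [])
    ranked = sorted(citations, key=lambda c: c["score"], reverse=True)
    return [(c["gene_id"], c["score"]) for c in ranked[:limit]]
```

With `verbose: true`, each citation would carry additional promoter tags; this sketch ignores them because their exact field names are not specified here.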
Recommendation semantics
The recommendation field tells the agent what to do with the context based on delta-epsilon health:
| Health status | Recommendation | Meaning |
|---|---|---|
| aligned | trust | Use the context directly |
| sparse | verify | Verify specific values before acting |
| stale | refresh | Expressed genes are outdated |
| denatured | reread_raw | Unreliable; read raw files instead |
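One way an agent might act on these values is sketched below. The health-to-recommendation mapping is taken from the table above; the action names (`use_context`, `verify_values`, and so on) and the fallback policy are assumptions made for illustration, not behavior defined by Helix.

```python
# Health status -> recommendation, as listed in the table above.
RECOMMENDATION_BY_HEALTH = {
    "aligned": "trust",
    "sparse": "verify",
    "stale": "refresh",
    "denatured": "reread_raw",
}

# Recommendation -> client-side action. These action names are hypothetical.
ACTION_BY_RECOMMENDATION = {
    "trust": "use_context",
    "verify": "verify_values",
    "refresh": "request_refresh",
    "reread_raw": "read_raw_files",
}


def next_action(response: dict) -> str:
    """Choose a client-side action from a parsed /context response."""
    agent = response.get("agent")
    if agent is None:
        # No agent block (for example, a response without the new field):
        # this sketch assumes the cautious choice is to verify before acting.
        return "verify_values"
    # Unknown recommendation values fall back to the most conservative action.
    return ACTION_BY_RECOMMENDATION.get(agent.get("recommendation"), "read_raw_files")
```

Falling back to the most conservative action for unrecognized values keeps a client safe if new recommendation values are added later.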
Hot-reload endpoints
Three admin endpoints for refreshing runtime state without killing the process:
- POST /admin/reload: full refresh (config + genome snapshot + ΣĒMA cache + clear stale per-query state)
- POST /admin/sema/rebuild: force a rebuild of the ΣĒMA vector cache
- POST /admin/checkpoint?mode=TRUNCATE: flush WAL to main DB (already existed, documented here)
What hot-reload refreshes:
- helix.toml config
- Genome WAL snapshot (sees external writes)
- ΣĒMA vector cache (rebuilt from current genome)
- last_query_scores cleared
What it does NOT refresh (these still need a process restart):
- Python code changes
- Ribosome backend swap (model stays loaded)
Use hot-reload when you want to tweak thresholds in helix.toml or pick up new genes from an external writer without dropping in-flight requests.
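A minimal sketch of calling these endpoints from Python's standard library follows. The base URL `http://localhost:8000`, the empty request body, and the absence of authentication are all assumptions; adjust them to match the actual deployment. Only the three paths come from the list above.

```python
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"  # assumption: replace with the real server address

# Paths as documented in the hot-reload endpoint list.
ADMIN_PATHS = {
    "reload": "/admin/reload",
    "sema_rebuild": "/admin/sema/rebuild",
    "checkpoint": "/admin/checkpoint?mode=TRUNCATE",
}


def admin_request(action: str, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a POST request for one admin action."""
    url = base_url.rstrip("/") + ADMIN_PATHS[action]
    # Empty body is an assumption; the endpoints may accept or require more.
    return Request(url, data=b"", method="POST")


def call_admin(action: str, base_url: str = BASE_URL, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code."""
    with urlopen(admin_request(action, base_url), timeout=timeout) as resp:
        return resp.status
```

For example, after editing helix.toml, `call_admin("reload")` would trigger the full refresh, assuming the server is reachable at the configured address.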
Replica schema resilience fix
_build_sema_cache now catches "no such column: compression_tier" errors from stale replicas and falls back to the legacy query. Previously this error silently broke the live server on replicated read paths where the replica had not yet been migrated to the current schema.
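The fallback pattern can be sketched roughly as follows, assuming a SQLite-backed genome (suggested by the WAL checkpoint endpoint and the error text). The table name `genes`, the column `embedding`, and the function `load_sema_rows` are hypothetical stand-ins; this is not the actual body of _build_sema_cache.

```python
import sqlite3

# Hypothetical queries; only the compression_tier column name comes from the notes.
CURRENT_QUERY = "SELECT gene_id, embedding, compression_tier FROM genes"
LEGACY_QUERY = "SELECT gene_id, embedding FROM genes"


def load_sema_rows(conn: sqlite3.Connection) -> list[tuple]:
    """Load rows for the vector cache, tolerating replicas on the old schema."""
    try:
        return conn.execute(CURRENT_QUERY).fetchall()
    except sqlite3.OperationalError as exc:
        # Only swallow the specific schema mismatch; re-raise anything else.
        if "no such column: compression_tier" not in str(exc):
            raise
        # Legacy schema: pad the missing column with None so callers
        # always receive three-element rows.
        return [(gene_id, emb, None) for gene_id, emb in conn.execute(LEGACY_QUERY)]
```

Matching on the specific error message, rather than catching every OperationalError, keeps unrelated failures such as a locked database visible.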
Benchmark validation (post-restart)
Needle-in-a-haystack benchmark against the 7,313-gene live genome after restarting on v0.3.0b2:
| Metric | Pre-v0.3.0b2 | v0.3.0b2 |
|---|---|---|
| Context retrieval | 8/10 (80%) | 10/10 (100%) |
| Answer accuracy | 7/10 | 8/10 |
| Avg context latency | 21-120s (ΣĒMA Mode B 7K JSON loads) | 1.0s (numpy cache) |
The latency improvement (roughly 20x to 120x, from 21-120s down to 1.0s) comes from the v0.2.0b2 ΣĒMA vector cache fix, which was shipped in code but never active against the live server until this restart. The retrieval quality improvement comes from the v0.2.0b2 authority boosts, which likewise took effect only once the restarted server picked up the new code.
All 179 tests passing.
🤖 Generated with Claude Code