Release v0.2.0b1 — SIKE validation & MoE decoder · mbachaud/helix-context

Highlights

This release establishes scale-invariant retrieval across model sizes from 0.6B to 8B parameters, validated by the SIKE benchmark. Retrieval is no longer the bottleneck: it scores 10/10 on every tested model, so the remaining accuracy gaps come from the downstream model's ability to use what was retrieved.

SIKE Benchmark Results (q4_0 KV cache)

Model	Retrieval	Accuracy	Notes
qwen3:0.6b	10/10	2/10	Parameter floor — retrieval works, model can't use it
qwen3:1.7b	10/10	3/10
qwen3:4b	10/10	9/10	Sweet spot — 2.5GB VRAM
gemma4:e4b	10/10	9/10	MoE decoder enabled
qwen3:8b	10/10	9/10

MoE-aware decoder

Front-loads KV answer slate in first 200 tokens for SWA (sliding-window attention) models
Relevance-first gene ordering for MoE/small models (vs sequence_index for dense)
Automatic activation via MOE_MODEL_FAMILIES = ("gemma4",)
gemma4:e4b jumped from 5/10 → 9/10 accuracy with slate enabled
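
The ordering rule above can be illustrated with a minimal sketch. Only MOE_MODEL_FAMILIES and the sequence_index field name come from these notes; the Gene class, its relevance field, and the helper names are hypothetical stand-ins, not the project's actual code.

```python
from dataclasses import dataclass

MOE_MODEL_FAMILIES = ("gemma4",)  # value taken from the release notes


@dataclass
class Gene:
    """Hypothetical stand-in for a retrieved unit of context."""
    sequence_index: int   # position in the original source
    relevance: float      # assumed retrieval score, higher is better
    text: str


def is_moe_family(model: str) -> bool:
    """True when the model name (e.g. 'gemma4:e4b') belongs to a listed family."""
    family = model.split(":", 1)[0].lower()
    return family.startswith(MOE_MODEL_FAMILIES)


def order_genes(genes, model):
    """Relevance-first for MoE families, source order otherwise.

    The notes also apply relevance-first ordering to small models;
    that branch is omitted here to keep the sketch short.
    """
    if is_moe_family(model):
        return sorted(genes, key=lambda g: g.relevance, reverse=True)
    return sorted(genes, key=lambda g: g.sequence_index)
```

With relevance-first ordering the highest-scoring genes land at the start of the prompt, which is the same idea as front-loading the answer slate into the first 200 tokens for sliding-window models.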

Per-request model detection

Server reads body["model"] and adapts expression strategy per request
_should_use_slate() gates on downstream model name + param count
SMALL_MODEL_THRESHOLD_B = 3.2 — excludes qwen3:4b which works without slate
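
A rough sketch of how such a gate could work is shown below. The two constants are quoted from these notes; the tag-parsing regex, the strict less-than comparison, and the function names without the leading underscore are assumptions made for illustration, and the real _should_use_slate() may differ.

```python
import re

MOE_MODEL_FAMILIES = ("gemma4",)   # from the release notes
SMALL_MODEL_THRESHOLD_B = 3.2      # from the release notes


def parse_param_count_b(model: str):
    """Return billions of parameters from a tag like 'qwen3:1.7b', else None.

    Assumes the size is encoded as '<number>b' after the colon; tags such
    as 'e4b' do not match and yield None.
    """
    _, _, tag = model.partition(":")
    match = re.fullmatch(r"(\d+(?:\.\d+)?)b", tag.lower())
    return float(match.group(1)) if match else None


def should_use_slate(model: str) -> bool:
    """Enable the slate for listed MoE families and for sub-threshold models."""
    family = model.partition(":")[0].lower()
    if family.startswith(MOE_MODEL_FAMILIES):
        return True
    params = parse_param_count_b(model)
    return params is not None and params < SMALL_MODEL_THRESHOLD_B


def slate_for_request(body: dict) -> bool:
    """Per-request decision based on the 'model' field of the request body."""
    return should_use_slate(str(body.get("model", "")))
```

Under these assumptions qwen3:0.6b and qwen3:1.7b get the slate, qwen3:4b and qwen3:8b do not, and gemma4:e4b gets it through the family check rather than the size check.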

Think-mode suppression for sub-3.2B models

Small models' reasoning loops consume the entire output budget without producing answers
Injects /no_think prefix and sets temperature=0 for Qwen3 sub-3.2B
q8_0 KV cache tested: worse than q4_0 (think mode gets more rope to hang itself)
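
The suppression step might look roughly like the following. The /no_think prefix, temperature=0, and the 3.2B threshold are from these notes; the flat request shape with "prompt" and "temperature" keys is an assumption for illustration, and the real server may place these fields elsewhere.

```python
import re

SMALL_MODEL_THRESHOLD_B = 3.2  # from the release notes


def suppress_thinking(body: dict) -> dict:
    """Return a patched copy of the request for small Qwen3 models.

    Requests for any other model are returned unchanged.
    """
    model = str(body.get("model", "")).lower()
    family, _, tag = model.partition(":")
    match = re.fullmatch(r"(\d+(?:\.\d+)?)b", tag)
    is_small_qwen3 = (
        family.startswith("qwen3")
        and match is not None
        and float(match.group(1)) < SMALL_MODEL_THRESHOLD_B
    )
    if not is_small_qwen3:
        return body

    patched = dict(body)  # shallow copy so the caller's dict is untouched
    patched["prompt"] = "/no_think " + str(body.get("prompt", ""))
    patched["temperature"] = 0
    return patched
```

Returning a copy rather than mutating the incoming body keeps the original request available for logging or for retrying with a different strategy.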

Storage & operations

New Genome.vacuum() method + /admin/vacuum endpoint (752 MB → 523 MB, -30.4%)
Clear documentation distinguishing checkpoint / refresh / compact / vacuum operations
README refresh with badges, TOC, glossary, sample output
Test corpus composition breakdown with public/private repo split
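
For readers unfamiliar with vacuuming, here is a minimal sketch of the general technique. It assumes the store is a SQLite file, which these notes do not state; the function names are hypothetical and this is not the Genome.vacuum() implementation.

```python
import os
import sqlite3


def vacuum(db_path: str) -> tuple:
    """Rebuild a SQLite file to reclaim free pages; return (before, after) sizes in bytes."""
    before = os.path.getsize(db_path)
    # isolation_level=None puts the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()
    after = os.path.getsize(db_path)
    return before, after


def percent_change(before: float, after: float) -> float:
    """Signed size change as a percentage of the original size."""
    return (after - before) / before * 100.0
```

In SQLite, deleting rows only marks pages as free, so the file keeps its size until it is rebuilt. That is why a vacuum pass can shrink a store that has seen many deletions or rewrites.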

Cumulative changes since v0.1.0b2

MoE-aware decoder with answer slate + relevance-first ordering
SIKE benchmark validation across 5 model scales
Per-request downstream model detection
Think suppression for sub-3.2B models
Genome.vacuum() + storage optimizations
README overhaul + SIKE benchmark docs

All 179 tests passing.

🤖 Generated with Claude Code
