Roadmap
This page reflects where the project actually stands, not a static wishlist — it gets revisited and corrected as work lands, the same way a stale README section gets caught and fixed rather than left to drift.
The five original items that defined the project's early feature set are all done:
- ✅ Configurable thresholds
- ✅ Kiwix search term disambiguation — see Kiwix Disambiguation
- ✅ Multi-book Kiwix fusion — see Multi-Book Fusion
- ✅ Confidence-aware fusion with expanded ingest — see Confidence-Aware Fusion
- ✅ Conditional query detection — see Conditional Query Detection
Three real gaps, found through deliberate review rather than reported failures, all closed:
- ✅ Discourse-framing routing bypass — see The Discourse-Framing Investigation
- ✅ Fallback visibility in /logs/stats
- ✅ Routing cache size bounding + visibility in /health
- ✅ Background snapshot job health
- ✅ Adversarial self-testing — see Adversarial Self-Testing
- ✅ Cross-source temporal pattern detection — see Cross-Source Temporal Pattern Detection
Full mechanism detail for the operational maturity work lives in Health & Observability and Caching.
A deliberate, full read of every file in app/, top to bottom — specifically ignoring complexity scores and looking at the kind of small, simple-looking code that score-driven review naturally skips. Found and fixed real bugs in nearly every file touched, several of them significant:
- ✅ home_assistant.py: a severe word-boundary bug ("is the front door locked" silently returning no results) and four related fixes
- ✅ kiwix.py: non-deterministic book selection, broken table-of-contents stripping, a single-character search-term bug, and an unbounded retry loop with a real multi-minute worst case
- ✅ fusion.py: a real crash on FUSION_MAX_SOURCES=0
- ✅ snapshots.py: uptime history only covering 9.6 real hours instead of a full week
- ✅ router.py / fusion.py: a cross-file drift in the shared "did this source actually fail" logic that silently disabled the news→web fallback for unconfigured sources
- ✅ forecast.py: an unconfigured deployment silently returning real weather data for the wrong place on Earth
- ✅ llm.py: thinking models on the OpenAI-compatible backend silently returning no answer at all
mcp_server.py, query_expansion.py, and searxng.py were read with the same scrutiny and came back genuinely clean — a real, useful outcome in its own right, confirming prior work in those files holds up.
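As a generic illustration of the "unbounded retry loop" class of bug listed for kiwix.py, here is a minimal sketch of a retry helper capped by both an attempt count and a total time budget. It is not the project's actual fix: `retry_bounded`, its parameters, and the default values are all hypothetical names chosen for this example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_bounded(
    fetch: Callable[[], T],
    max_attempts: int = 3,
    total_budget_s: float = 10.0,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[T]:
    """Call fetch() until it succeeds, attempts run out, or the budget is spent.

    Hypothetical helper: the names and defaults are assumptions, not taken
    from kiwix.py. Returns None when every attempt fails.
    """
    deadline = clock() + total_budget_s
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            # Broad catch keeps the sketch short; real code would catch
            # only the errors that are actually worth retrying.
            remaining = deadline - clock()
            if attempt == max_attempts - 1 or remaining <= 0:
                break
            # Exponential backoff, never sleeping past the overall deadline.
            delay = base_delay_s * (2 ** attempt)
            sleep(min(delay, remaining))
    return None
```

With both caps in place, the worst case is bounded by roughly `total_budget_s` plus the duration of the final call, rather than growing with however long the upstream keeps failing.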
- 🔄 This wiki
- ⬜ The README stays lean going forward — deep-dive material gets added here instead of growing the README further
These are real, understood boundaries — not bugs waiting for a fix, but deliberate scope decisions or honest, accepted ceilings. A reader-facing version of this same list, written for evaluating fit rather than tracking status, lives at Known Limitations:
- Single ambiguous bare words (e.g. "galaxy") can land on a thematically-related but imprecise match when the index genuinely contains multiple comparably-relevant senses of the word. See Kiwix Scoring.
- Conditional phrasing without an explicit comma ("if the front door is unlocked tell me") is intentionally not detected — a real grammatical-parsing problem, not a pattern-matching one. See Conditional Query Detection.
- A decomposed segment merging two unrelated topics may route to a single source that doesn't serve both well — an accepted, minor side effect of the proper-noun-pair guard's content-preservation fix, not a regression.
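The comma boundary in the second item can be illustrated with a small sketch. It assumes, purely for illustration, that detection is a pattern of the form "if <condition>, <action>"; the regex and the `split_conditional` helper are hypothetical and are not taken from the project's code.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: a leading "if", a condition with no comma in it,
# an explicit comma, then the action. Not the project's real detector.
_CONDITIONAL = re.compile(
    r"^\s*if\s+(?P<condition>[^,]+?)\s*,\s*(?P<action>.+?)\s*$",
    re.IGNORECASE,
)


def split_conditional(query: str) -> Optional[Tuple[str, str]]:
    """Return (condition, action) when the query has an explicit comma."""
    match = _CONDITIONAL.match(query)
    if match is None:
        return None
    return match.group("condition"), match.group("action")


# The comma gives the pattern an unambiguous split point.
print(split_conditional("if the front door is unlocked, tell me"))
# Without it there is no pattern-level way to tell where the condition ends.
print(split_conditional("if the front door is unlocked tell me"))
```

Without the comma, finding where the condition ends and the action begins requires understanding the sentence's grammar, which is the distinction the limitation draws between parsing and pattern matching.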
These are still squarely in "permitted to fail, no obligation to succeed" territory — the same honest framing the now-shipped temporal pattern detection work used to carry above, before it actually landed.
Cross-modal grounding — correlating a camera snapshot with a text answer ("did anything weird happen at the back door" pulling the actual image alongside the sensor log) would be a genuine "wow" capability, not just well-executed plumbing. Deliberately not pursued yet — the current camera setup (Ring) isn't infrastructure worth building on top of long-term; revisit once a self-controlled NVR solution exists instead.
- New source modules — see Contributing for the current list of proposed ones looking for contributors
- HA/voice pipeline architecture question: whether to bypass Home Assistant's own conversation/intent layer for non-device-control voice queries, piping STT output more directly to Mnemolis's /search instead, and keeping HA for device control and audio I/O only. Raised, never designed; a genuinely different kind of work (infrastructure/integration) than anything else on this list.