Roadmap

Bob edited this page Jun 25, 2026 · 6 revisions

This page reflects where the project actually stands, not a static wishlist — it gets revisited and corrected as work lands, the same way a stale README section gets caught and fixed rather than left to drift.

Capability Expansion — complete

The five original items that defined the project's early feature set are all done:

  1. ✅ Configurable thresholds
  2. ✅ Kiwix search term disambiguation — see Kiwix Disambiguation
  3. ✅ Multi-book Kiwix fusion — see Multi-Book Fusion
  4. ✅ Confidence-aware fusion with expanded ingest — see Confidence-Aware Fusion
  5. ✅ Conditional query detection — see Conditional Query Detection

Battle Testing & Operational Maturity — complete

Three real gaps, found through deliberate review rather than reported failures, all closed:

Full mechanism detail for the operational maturity work lives in Health & Observability and Caching.

Bulletproofing Pass — complete

This pass was a deliberate, full read of every file in app/, top to bottom, ignoring complexity scores and looking instead at the small, simple-looking code that score-driven review naturally skips. It found and fixed bugs in nearly every file touched, several of them significant:

  • home_assistant.py — a severe word-boundary bug ("is the front door locked" silently returning no results) and four related fixes
  • kiwix.py — non-deterministic book selection, broken table-of-contents stripping, a single-character search-term bug, and an unbounded retry loop with a real multi-minute worst case
  • fusion.py — a real crash on FUSION_MAX_SOURCES=0
  • snapshots.py — uptime history only covering 9.6 real hours instead of a full week
  • router.py / fusion.py — a cross-file drift in the shared "did this source actually fail" logic that silently disabled the newsweb fallback for unconfigured sources
  • forecast.py — an unconfigured deployment silently returning real weather data for the wrong place on Earth
  • llm.py — thinking models on the OpenAI-compatible backend silently returning no answer at all

mcp_server.py, query_expansion.py, and searxng.py were read with the same scrutiny and came back clean. That is a useful outcome in its own right: it confirms that prior work in those files holds up.
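The kiwix.py entry above mentions an unbounded retry loop with a multi-minute worst case. This page does not show the actual fix, so what follows is only a minimal sketch of one common way to bound such a loop. The helper name retry_bounded, its parameters, and its default values are all hypothetical and are not taken from the Mnemolis code.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_bounded(
    fetch: Callable[[], T],
    max_attempts: int = 3,            # hypothetical default, not from the project
    base_delay: float = 0.5,          # seconds; doubled after each failure
    max_total_seconds: float = 10.0,  # hard ceiling on time spent retrying
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[T]:
    """Call fetch() until it succeeds, the attempt cap is hit, or the
    deadline would be exceeded. Returns None when every attempt failed."""
    deadline = clock() + max_total_seconds
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            # Broad catch keeps the sketch short; real code would catch
            # only the errors that are actually worth retrying.
            delay = base_delay * (2 ** attempt)
            is_last = attempt == max_attempts - 1
            if is_last or clock() + delay > deadline:
                return None
            sleep(delay)
    return None
```

Both limits matter in a sketch like this: the attempt cap bounds the number of calls, and the deadline bounds the worst-case wait even if the delays or the cap are later misconfigured.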

Documentation Restructuring — in progress

  • 🔄 This wiki
  • ⬜ The README stays lean going forward — deep-dive material gets added here instead of growing the README further

Known limitations (tracked, accepted, not blocking)

These are real, understood boundaries — not bugs waiting for a fix, but deliberate scope decisions or honest, accepted ceilings. A reader-facing version of this same list, written for evaluating fit rather than tracking status, lives at Known Limitations:

  • Single ambiguous bare words (e.g. "galaxy") can land on a thematically-related but imprecise match when the index genuinely contains multiple comparably-relevant senses of the word. See Kiwix Scoring.
  • Conditional phrasing without an explicit comma ("if the front door is unlocked tell me") is intentionally not detected — a real grammatical-parsing problem, not a pattern-matching one. See Conditional Query Detection.
  • A decomposed segment merging two unrelated topics may route to a single source that doesn't serve both well — an accepted, minor side effect of the proper-noun-pair guard's content-preservation fix, not a regression.
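To make the comma limitation in the list above concrete, here is a minimal sketch of a comma-anchored conditional splitter. It is an illustration only: the regular expression and the function name split_conditional are assumptions, and the project's real detector (described at Conditional Query Detection) may work quite differently.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: "if <condition>, <action>". The comma is the only
# signal used to find where the condition ends and the action begins.
_CONDITIONAL = re.compile(
    r"^\s*if\s+(?P<condition>[^,]+?)\s*,\s*(?P<action>.+?)\s*$",
    re.IGNORECASE,
)


def split_conditional(query: str) -> Optional[Tuple[str, str]]:
    """Return (condition, action) for a comma-delimited conditional query,
    or None when the query does not match that shape."""
    match = _CONDITIONAL.match(query)
    if match is None:
        return None
    return match.group("condition"), match.group("action")
```

With this sketch, "if the front door is unlocked, tell me" splits into a condition and an action, while the comma-less "if the front door is unlocked tell me" returns None. Finding the clause boundary without the comma would require actual grammatical parsing, which is the scope decision that bullet describes.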

Tabled, revisit in ~1 year

These are still squarely in "permitted to fail, no obligation to succeed" territory. The now-shipped temporal pattern detection work carried the same framing on this page before it landed.

Cross-modal grounding: correlating a camera snapshot with a text answer ("did anything weird happen at the back door" pulling the actual image alongside the sensor log) would be a genuine "wow" capability, not just well-executed plumbing. It is deliberately not pursued yet, because the current camera setup (Ring) isn't infrastructure worth building on long-term. Revisit once a self-controlled NVR solution exists.

Still tracked, lower priority

  • New source modules — see Contributing for the current list of proposed ones looking for contributors
  • HA/voice pipeline architecture question — whether to bypass Home Assistant's own conversation/intent layer for non-device-control voice queries, piping STT output more directly to Mnemolis's /search instead, and keeping HA for device control and audio I/O only. Raised, never designed — a genuinely different kind of work (infrastructure/integration) than anything else on this list.