feat: retrieval-only fallback + tier-aware history depth (P13d-3, closes P13d) #147

Merged: blokzdev merged 1 commit into main from claude/p13d3-retrieval-fallback on Jun 3, 2026

@blokzdev blokzdev commented Jun 3, 2026

P13d-3 — Low-tier retrieval-only fallback + tier-aware history depth + RAM co-residency

The final P13d slice — closes the flagship "Ask your library". Until now the feature hid entirely on low/ineligible tiers (the entry hard-required a generation model), even though retrieval (embedder + Cozo HNSW) works there. No schema change; no new deps.

1. Retrieval-only fallback

  • AskEntryTile now shows whenever graphStore.isAvailable is true and the library is non-empty, routing by tier: capable → /ask (chat), low/ineligible → /ask/relevant (the subtitle adapts to "Find the most relevant items").
  • New RelevantItemsScreen (/ask/relevant): a query → semanticResultsProvider (embed → HNSW vector search → hydrate) → the most relevant items in the existing MediaGrid (tap → item). Fully ephemeral (no chats persisted), framed clearly ("this device shows relevant items rather than a written answer"), with an on-ramp to AI settings when Smart search isn't ready.
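
The entry routing described above can be sketched as a pure function. This is a minimal illustration, not the actual AskEntryTile code: the DeviceTier value names, the askEntryRoute helper, and the reading of "capable" as mid or high are all assumptions made for the example.

```dart
// Hypothetical stand-in for the app's tier type; the value names are assumed.
enum DeviceTier { ineligible, low, mid, high }

/// Assumption: "capable" means a tier that can run a generation model.
bool isCapable(DeviceTier tier) =>
    tier == DeviceTier.mid || tier == DeviceTier.high;

/// Returns the route the Ask entry should open, or null when the tile hides.
/// `askEntryRoute` is an illustrative helper, not a function from the PR.
String? askEntryRoute({
  required bool graphAvailable,
  required int libraryItemCount,
  required DeviceTier tier,
}) {
  // Retrieval needs the graph store and at least one item to search.
  if (!graphAvailable || libraryItemCount == 0) return null;
  // Capable tiers get the chat; everything else gets the retrieval-only screen.
  return isCapable(tier) ? '/ask' : '/ask/relevant';
}

void main() {
  print(askEntryRoute(
      graphAvailable: true, libraryItemCount: 12, tier: DeviceTier.low));
  print(askEntryRoute(
      graphAvailable: true, libraryItemCount: 12, tier: DeviceTier.high));
  print(askEntryRoute(
      graphAvailable: false, libraryItemCount: 12, tier: DeviceTier.high));
}
```

Keeping the decision in one pure function makes the three tile states (chat, fallback, hidden) easy to cover with a table-driven widget test.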

2. Tier-aware history depth

  • Pure historyBudgetForTier(DeviceTier) (ask_chat.dart): low/mid → 1000, high → 3000 (replaces the d-1 default 1500).
  • AskController.send reads activeDeviceTierProvider and passes historyCharBudget: historyBudgetForTier(tier) into each turn's retrieve(...) — a shallower window for memory-constrained mid devices.
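
A minimal sketch of the tier-to-budget mapping, using the values stated above (low/mid → 1000, high → 3000). The DeviceTier value names and the budget for an ineligible tier are assumptions, and trimHistory is a purely hypothetical illustration of how a character budget might cap conversation history; the PR does not show how retrieve(...) applies the budget.

```dart
// Hypothetical stand-in for the app's tier type; the value names are assumed.
enum DeviceTier { ineligible, low, mid, high }

/// Character budget for conversation history, per the values in this PR.
/// Assumption: ineligible tiers get the same conservative budget as low/mid.
int historyBudgetForTier(DeviceTier tier) =>
    tier == DeviceTier.high ? 3000 : 1000;

/// Hypothetical helper: keep the most recent turns whose combined length
/// fits within [charBudget], preserving chronological order.
List<String> trimHistory(List<String> turns, int charBudget) {
  final kept = <String>[];
  var used = 0;
  // Walk from newest to oldest and stop at the first turn that overflows.
  for (final turn in turns.reversed) {
    if (used + turn.length > charBudget) break;
    kept.add(turn);
    used += turn.length;
  }
  return kept.reversed.toList();
}

void main() {
  final turns = List.generate(5, (i) => 'turn $i: ${'x' * 600}');
  for (final tier in DeviceTier.values) {
    final budget = historyBudgetForTier(tier);
    print('$tier: budget=$budget, turns kept=${trimHistory(turns, budget).length}');
  }
}
```

Because the mapping is pure, its truth table can be tested without any provider or widget setup.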

3. RAM co-residency (carried from P12d-2)

  • The mid-tier history reduction is the concrete RAM lever landed here; k: 30 / maxSources: 6 stay as modest secondary levers. Real-hardware co-residency (LLM + live HNSW index) is the APK spot-check in VERIFICATION. The two BACKLOG entries for this are marked resolved/verified at P13d-3.

Tests (CI; native generation/RAM is APK-verified)

  • historyBudgetForTier truth table.
  • AskController passes the tier-scaled budget (mid vs high) — FakeRagRetriever captures it; tier pinned via a fake ActiveDeviceTier.
  • AskEntryTile: low tier (no model) → shows + targets /ask/relevant; capable → /ask; still hides when graph unavailable / library empty.
  • RelevantItemsScreen: results render; empty → "No matching items"; not-ready → on-ramp.
  • Full suite green (874 tests), dart format + flutter analyze clean.
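
The budget-capture test can be pictured with the sketch below. It is self-contained and deliberately avoids Riverpod and package:test. The RagRetriever interface, the retrieve signature, and AskControllerSketch are hypothetical stand-ins; only the names FakeRagRetriever, historyBudgetForTier, and historyCharBudget come from the PR text.

```dart
// Hypothetical stand-in for the app's tier type; the value names are assumed.
enum DeviceTier { ineligible, low, mid, high }

int historyBudgetForTier(DeviceTier tier) =>
    tier == DeviceTier.high ? 3000 : 1000;

/// Hypothetical interface; the real retrieve(...) signature is not shown here.
abstract class RagRetriever {
  Future<List<String>> retrieve(String query, {required int historyCharBudget});
}

/// Records every budget it receives so a test can assert on it.
class FakeRagRetriever implements RagRetriever {
  final List<int> capturedBudgets = [];

  @override
  Future<List<String>> retrieve(String query,
      {required int historyCharBudget}) async {
    capturedBudgets.add(historyCharBudget);
    return const <String>[];
  }
}

/// Simplified controller: reads the tier on each turn, then scales the budget.
class AskControllerSketch {
  AskControllerSketch(this._retriever, this._readTier);

  final RagRetriever _retriever;
  final DeviceTier Function() _readTier; // stands in for the tier provider

  Future<void> send(String question) async {
    final tier = _readTier();
    await _retriever.retrieve(question,
        historyCharBudget: historyBudgetForTier(tier));
  }
}

Future<void> main() async {
  for (final tier in [DeviceTier.mid, DeviceTier.high]) {
    final fake = FakeRagRetriever();
    await AskControllerSketch(fake, () => tier).send('what did I save?');
    print('$tier -> ${fake.capturedBudgets.single}');
  }
}
```

Injecting the tier as a function mirrors pinning it with a fake ActiveDeviceTier: the controller never learns which device it runs on, so mid and high can be exercised in the same test run.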

Docs

  • P13-PLAN.md — d-3 [~] + parent P13d flipped [x] (flagship complete).
  • VERIFICATION.md — P13d-3: low-end retrieval-only flow; tier-aware depth; RAM co-residency on real low/mid hardware.
  • BACKLOG.md — both RAM co-residency entries resolved; new note to decouple Ask from the semanticSearchEnabled opt-in (pre-existing, surfaced here).

Verification

  • ✅ CI gates green locally (format · analyze · 874 tests).
  • APK spot-checks owed: retrieval-only fallback on a low-end device (offline, nothing persisted, on-ramp when not ready); generation + live HNSW index co-residency on real low/mid hardware (no OOM); tier-aware history depth producing coherent answers on each tier.

This closes the flagship P13d. Next top-level slice is P13e (advanced graph analytics & viz), planned separately.

https://claude.ai/code/session_013JoYmLCosYt5tQ8qwdbL1T


Generated by Claude Code

Closes the flagship "Ask your library". Low/ineligible tiers (no generation
model) now reach an ephemeral RelevantItemsScreen (/ask/relevant) that surfaces
the most relevant library items via semantic retrieval — no LLM, nothing
persisted, with an on-ramp when Smart search isn't ready. The Dashboard entry
shows whenever the graph is available + the library is non-empty, routing
capable tiers to the chat and low tiers to the fallback.

History depth is now tier-aware: historyBudgetForTier(DeviceTier) (low/mid 1000,
high 3000) feeds AskController's per-turn retrieval budget — the concrete RAM
lever for the LLM + Cozo HNSW co-residency carried from P12d-2 (real-hardware
validation owed via APK). No schema, no deps.

https://claude.ai/code/session_013JoYmLCosYt5tQ8qwdbL1T
@blokzdev blokzdev merged commit 121315d into main Jun 3, 2026
1 check passed
@blokzdev blokzdev deleted the claude/p13d3-retrieval-fallback branch June 3, 2026 08:54