Release 0.3.0 — recall · arttttt/mnemo

Memory can now answer a question, not just return hits. Plus projects, a query-less browse, deferred embedding, and the pplx embedder.

recall: an opt-in LLM read tool. Gemma 4 E2B-it (official QAT GGUF) synthesizes a concise, grounded answer to a query from a project's memories, and replies "No relevant memories found." when none apply (it never draws on outside knowledge). Available via MCP and the CLI (mnemo recall <project> "<query>"); the write path stays LLM-free, and the generator is transient (loaded on demand, unloaded after).
browse — query-less retrieval: list memories by filter (type / tags / scope / created_after), newest first, no relevance ranking.
Projects as first-class entities — create_project / update_project / list_projects / delete_project (cascades its memories via ON DELETE CASCADE); writes and reads are gated on a registered project (an unknown project errors, with near-match suggestions).
Deferred embedding — writes return immediately while an async worker pool embeds off the hot path (MNEMO_EMBED_WORKERS sizes a bounded instance pool for parallel encodes); pending (vector-less) memories are supported and reconciled.
pplx-embed-v1-0.6b (int8 ONNX) as the default embedder, with mnemo reindex to re-embed on an embedder switch.
Per-memory content cap (MNEMO_MAX_MEMORY_TOKENS, default 512); over-window or empty content is rejected with an actionable error (never truncated).
A composable-stage pipeline framework underpinning recall (and groundwork for background consolidation); mnemo stats now reports pending (un-embedded) memories.
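
The deferred-embedding entry above can be pictured with a small sketch. This is not mnemo's implementation: the DeferredStore class, the fake_embed function, and the in-memory dict are hypothetical stand-ins, and the worker count is passed directly rather than read from MNEMO_EMBED_WORKERS. It only shows the shape of the idea: a write stores a vector-less row and returns, and a bounded pool fills in vectors later.

```python
import queue
import threading


def fake_embed(text):
    # Hypothetical stand-in for a real embedder: a tiny deterministic vector.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


class DeferredStore:
    """Toy store: writes return at once, a bounded pool embeds later."""

    def __init__(self, workers=2):
        self.rows = {}                 # id -> {"content": str, "vector": list or None}
        self.lock = threading.Lock()
        self.pending = queue.Queue()   # ids waiting for a vector
        self.next_id = 1
        for _ in range(workers):       # bounded pool of embed workers
            threading.Thread(target=self._work, daemon=True).start()

    def write(self, content):
        # Hot path: store the row without a vector and enqueue it.
        with self.lock:
            mem_id = self.next_id
            self.next_id += 1
            self.rows[mem_id] = {"content": content, "vector": None}
        self.pending.put(mem_id)
        return mem_id

    def _work(self):
        while True:
            mem_id = self.pending.get()
            try:
                with self.lock:
                    row = self.rows.get(mem_id)
                    content = row["content"] if row else None
                if content is not None:
                    vector = fake_embed(content)   # encode outside the lock
                    with self.lock:
                        if mem_id in self.rows:    # row may have been deleted meanwhile
                            self.rows[mem_id]["vector"] = vector
            finally:
                self.pending.task_done()

    def pending_count(self):
        # Rows that are stored but not yet embedded.
        with self.lock:
            return sum(1 for r in self.rows.values() if r["vector"] is None)

    def drain(self):
        # Block until every queued row has been processed.
        self.pending.join()
```

In this sketch pending_count() plays the role of the pending figure that mnemo stats is said to report, and drain() is only a convenience for tests.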

llmkit — a new shared inference package: ONNX (embedder / reranker) and llama.cpp (generator) runtimes behind an on-demand residency lifecycle; the embedder, reranker, and generator all run through it.
Store — SQLite + sqlite-vec + FTS5 is the sole backend (the in-memory store is a test double), refactored onto SQL executors; the typed-links table folded into a supersedes column; the reranker is off by default.
Architecture — the repository port segregated (ISP), repositories made stateless and no longer mutate domain entities, interface/implementation naming normalized.
Service lifecycle — stop_service; the service restarts after a reindex; the connector waits out a slow start instead of double-spawning, and the proxy reconnects if the service restarts mid-session.
remember returns a status (created / duplicate / superseded); the legacy dedup accounting was removed.
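
As a rough illustration of the three remember statuses and the supersedes column mentioned above, here is a sketch against an in-memory SQLite database. The table layout, the column names (content_hash, active), and the exact conditions for each status are assumptions made for the example, not mnemo's schema or logic. It also leaves out the explicit error for re-keyed exact duplicates listed among the fixes in these notes.

```python
import hashlib
import sqlite3


def open_db():
    # Hypothetical schema: one row per memory, with a self-referencing
    # supersedes column instead of a separate links table.
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE memories (
               id INTEGER PRIMARY KEY,
               project TEXT NOT NULL,
               topic_key TEXT,
               content_hash TEXT NOT NULL,
               supersedes INTEGER REFERENCES memories(id),
               active INTEGER NOT NULL DEFAULT 1
           )"""
    )
    return db


def remember(db, project, content, topic_key=None):
    """Return (status, id) where status is created, duplicate, or superseded."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    with db:  # commit on success, roll back on error
        # Exact-content dedup, scoped to active rows in the same project.
        dup = db.execute(
            "SELECT id FROM memories "
            "WHERE project = ? AND content_hash = ? AND active = 1",
            (project, digest),
        ).fetchone()
        if dup is not None:
            return "duplicate", dup[0]

        # A matching topic_key means the new row replaces the old one.
        prev = None
        if topic_key is not None:
            prev = db.execute(
                "SELECT id FROM memories "
                "WHERE project = ? AND topic_key = ? AND active = 1",
                (project, topic_key),
            ).fetchone()
        if prev is not None:
            db.execute("UPDATE memories SET active = 0 WHERE id = ?", (prev[0],))

        cur = db.execute(
            "INSERT INTO memories (project, topic_key, content_hash, supersedes) "
            "VALUES (?, ?, ?, ?)",
            (project, topic_key, digest, prev[0] if prev else None),
        )
        return ("superseded" if prev else "created"), cur.lastrowid
```

Running the deactivate-and-insert pair inside one transaction is what keeps the supersede step all-or-nothing in this sketch.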

Every numeric MNEMO_* setting is validated at startup, so a bad value fails immediately instead of causing an opaque crash later.
Background workers (embed, idle monitor, drain) stay alive and loud on a store fault instead of dying silently; idle-exit can't hang on a fault.
Deletion heals the supersede chain (promote head, splice interior) across batch / root / whole-chain cases.
Re-keyed exact duplicates and half-specified retrievals now error explicitly; exact-content dedup is scoped to active rows within a project; topic_key supersede is atomic.
Large multi-id deletes are chunked under SQLite's parameter cap; the reindex rebuild is atomic; a never-binding service spawn is killed (no double-spawn).
Project-scoped search requires a project; a project filter on an all / global search is rejected; created_after is normalized to UTC.
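
The chunked-delete fix above addresses a general SQLite constraint: a single statement can bind only a limited number of parameters (999 by default in older builds, 32766 from SQLite 3.32.0). A minimal sketch of the technique follows; the memories table name, the delete_many helper, and the chunk size of 500 are assumptions chosen for illustration, not values taken from mnemo.

```python
import sqlite3

CHUNK = 500  # assumed batch size, kept safely below SQLite's older default cap of 999


def delete_many(db, ids, chunk=CHUNK):
    """Delete rows by id in batches so no statement exceeds the parameter cap."""
    ids = list(ids)
    deleted = 0
    with db:  # all batches commit together, or none do
        for start in range(0, len(ids), chunk):
            batch = ids[start:start + chunk]
            marks = ",".join("?" * len(batch))  # one placeholder per id
            cur = db.execute(
                f"DELETE FROM memories WHERE id IN ({marks})", batch
            )
            deleted += cur.rowcount
    return deleted
```

Wrapping the loop in one transaction means a failure partway through leaves no half-deleted set behind.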

Full Changelog: 0.2.0...0.3.0
