v0.1.10 — bounded memory, never-drop capture, local extraction
Hardening release — validated on a clean Linux box, a Hermes integration, and the 16 GB host that filed #15.
- Recall memory is now bounded and returned to the OS (#15). The recall path no longer climbs into the gigabytes and stays there: the ONNX reranker arena is bounded, and freed memory is released back to the OS after each recall. Measured: a workload that previously climbed to ~2.6 GB and stuck there now plateaus at a few hundred MB under load and recedes when idle.
- Slow extraction never drops a write. Capture (proxy / SDK / MCP `remember`) uses async writes, so a slow extraction backend can't time out and silently lose the assistant's turn; both sides of every exchange land. Higher default client timeouts too.
- Fully-local extraction. Set `MEMNOS_EXTRACT_BASE_URL` (+ optional `MEMNOS_EXTRACT_MODEL`) to any OpenAI-compatible endpoint (Ollama, vLLM, LM Studio) and fact extraction runs locally while embeddings stay free local-384. No OpenAI key required for a fully on-device install.
- Long recall queries no longer crash the Postgres FTS parser (`tsquery` stack); they are clamped safely.
- `agent-setup --namespace <ns>` now grants the wired token on that namespace (no more silent 403 on writes).
- SDK `__version__` derives from package metadata (no drift); `memnos status` reports the real extraction backend.
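
As a rough illustration of the fully-local extraction setup, the snippet below points the two environment variables named in these notes at a local Ollama server. The URL is Ollama's default OpenAI-compatible endpoint; the model name is a placeholder for whatever model you have pulled, and it is assumed here (not confirmed by these notes) that memnos reads both variables from the environment it is started in.

```shell
# Assumption: Ollama is running locally on its default port and serving
# its OpenAI-compatible API under /v1.
export MEMNOS_EXTRACT_BASE_URL="http://localhost:11434/v1"

# Optional. "llama3.1" is a placeholder; use any model your endpoint serves.
export MEMNOS_EXTRACT_MODEL="llama3.1"
```

After exporting these, running `memnos restart` and then `memnos status` should show whether the local endpoint is the extraction backend in use. For vLLM or LM Studio, only the base URL and model name would change.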
Upgrade: `memnos upgrade && memnos restart` (or `uv tool upgrade memnos`).