v0.1.10 — bounded memory, never-drop capture, local extraction

@thameema released this 13 Jun 17:52
· 22 commits to master since this release

Hardening release — validated on a clean Linux box, a Hermes integration, and the 16 GB host that filed #15.

  • Recall memory is now bounded and returned to the OS (#15). The recall path no longer climbs to GBs and holds them: the ONNX reranker arena is bounded, and freed memory is released back to the OS after each recall. Measured: a workload that previously climbed to ~2.6 GB and stayed there now plateaus at a few hundred MB under load and recedes when idle.
  • Slow extraction never drops a write. Capture (proxy / SDK / MCP remember) uses async writes, so a slow extraction backend can't time out and silently lose the assistant's turn; both sides of every exchange land. Default client timeouts are higher too.
  • Fully-local extraction. Set MEMNOS_EXTRACT_BASE_URL (+ optional MEMNOS_EXTRACT_MODEL) to any OpenAI-compatible endpoint — Ollama, vLLM, LM Studio — and fact extraction runs locally while embeddings stay free local-384. No OpenAI key required for a fully on-device install.
  • Long recall queries no longer crash the Postgres FTS parser (tsquery stack) — clamped safely.
  • agent-setup --namespace <ns> now grants the wired token on that namespace (no more silent 403 on writes).
  • SDK __version__ derives from package metadata (no drift); memnos status reports the real extraction backend.
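
For the fully-local extraction item above, a minimal sketch of the configuration might look like the following. Only MEMNOS_EXTRACT_BASE_URL, MEMNOS_EXTRACT_MODEL, and memnos restart come from these notes. The helper names are hypothetical, and the sketch assumes the service picks these variables up from the environment of the process that restarts it. The URL shown is Ollama's default OpenAI-compatible endpoint.

```python
import os
import subprocess

# Ollama's default OpenAI-compatible endpoint. vLLM and LM Studio expose
# their own /v1 URLs, so substitute whichever server is actually running.
OLLAMA_BASE_URL = "http://localhost:11434/v1"


def local_extraction_env(base_url, model=None, base_env=None):
    """Return a copy of the environment with local extraction configured.

    Hypothetical helper: the two variable names come from the release
    notes, everything else here is illustrative.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MEMNOS_EXTRACT_BASE_URL"] = base_url
    if model is not None:
        # Optional per the notes; leave unset to use the default model.
        env["MEMNOS_EXTRACT_MODEL"] = model
    return env


def restart_with_local_extraction(model=None):
    """Hypothetical helper: restart memnos with the variables set.

    Assumes `memnos restart` inherits the caller's environment.
    """
    env = local_extraction_env(OLLAMA_BASE_URL, model)
    return subprocess.run(["memnos", "restart"], env=env, check=True)
```

Calling restart_with_local_extraction with a model tag that the local server has already pulled would then restart the service against that endpoint, with no OpenAI key in play.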
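
The query clamping mentioned in the list above is not described in detail, so the following is only a sketch of one way such a clamp could work, not the project's actual implementation. The function name clamp_query and both limits are assumptions chosen for illustration. The idea is to cap the number and length of terms before the text reaches a Postgres function such as plainto_tsquery.

```python
import re

# Illustrative limits, not taken from the release notes.
MAX_TERMS = 64
MAX_TERM_LEN = 64


def clamp_query(text, max_terms=MAX_TERMS, max_term_len=MAX_TERM_LEN):
    """Reduce free text to a bounded, de-duplicated list of search terms."""
    kept = []
    seen = set()
    for term in re.findall(r"\w+", text.lower()):
        term = term[:max_term_len]  # truncate pathological tokens
        if term in seen:
            continue  # repeated words add parser work but no recall
        seen.add(term)
        kept.append(term)
        if len(kept) == max_terms:
            break  # stop early so huge inputs stay cheap
    return " ".join(kept)


# The clamped string would then be passed as a bound parameter, e.g.
#   SELECT ... WHERE tsv @@ plainto_tsquery('english', %s)
# where the table and column names depend on the actual schema.
```

Clamping on the client side keeps the parser input bounded regardless of how long the recall prompt is, at the cost of ignoring terms past the cap.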

Upgrade: memnos upgrade && memnos restart (or uv tool upgrade memnos).
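
After upgrading, the installed version can be read from package metadata, which is the same source the SDK's __version__ is now said to derive from. This is a minimal sketch using the standard library. The distribution name "memnos" is inferred from the uv command above and may differ for the SDK package, and the helper name and fallback string are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="memnos", fallback="0+unknown"):
    """Return the installed distribution's version, or a fallback string.

    Reading the version from metadata avoids a hand-maintained constant
    that can drift from what is actually installed.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Not installed in this environment (e.g. a source checkout).
        return fallback


if __name__ == "__main__":
    print(installed_version())
```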