Releases · thameema/memnos

15 Jun 03:25

thameema

v0.1.11

6cf8b64

v0.1.11 — sharper recall Latest

Latest

Sharper recall — entity-aware disambiguation, answer-quality ranking, and both-sides capture in headless Claude Code.

Recall quality

Entity-aware recall (#17) — recall now uses each fact's subject entity to separate semantically-adjacent but distinct subjects, so a query about one project no longer pulls in another project's notes that merely share vocabulary. Adds an optional subject scope to hard-filter recall to a single entity. Default-on; tune/disable with MEMNOS_RECALL_ENTITY_BOOST / MEMNOS_RECALL_ENTITY_SCOPE.
Answer-quality ranking (#2) — candidate dedup, fact-first context rendering, and a bounded fact preference so distilled answers surface ahead of verbose meta-turns for list/broad queries. Verbatim queries still lead with the raw turn.
Deterministic ordering — stable tie-break in the hybrid-RRF queries; recall order is now identical across PostgreSQL query plans.

Capture

Headless Claude Code (#18) — claude -p (print mode) now captures both the user prompt and the assistant reply; previously only the user turn was saved.

Fixes

Long/pathological queries are clamped to the embedder and reranker and return fast.
memnos status no longer reports a false "STALE pid file" under autostart.

Quality

LoCoMo 64–65% band held (65% on the release gate, gpt-4o judge). LongMemEval 78.4% (500q).

Upgrade: memnos upgrade (or uv tool install -U memnos).

Assets 2

13 Jun 17:52

thameema

v0.1.10

fc7404f

v0.1.10 — bounded memory, never-drop capture, local extraction

Hardening release — validated on a clean Linux box, a Hermes integration, and the 16 GB host that filed #15.

Recall memory is now bounded and returned to the OS (#15). The recall path no longer climbs to GBs and hold it — the ONNX reranker arena is bounded and freed memory is released back to the OS after each recall. Measured: a workload that previously climbed to ~2.6 GB and stuck now plateaus a few hundred MB under load and recedes when idle.
Slow extraction never drops a write. Capture (proxy / SDK / MCP remember) uses async writes, so a slow extraction backend can't time out and silently lose the assistant's turn — both sides of every exchange land. Higher default client timeouts too.
Fully-local extraction. Set MEMNOS_EXTRACT_BASE_URL (+ optional MEMNOS_EXTRACT_MODEL) to any OpenAI-compatible endpoint — Ollama, vLLM, LM Studio — and fact extraction runs locally while embeddings stay free local-384. No OpenAI key required for a fully on-device install.
Long recall queries no longer crash the Postgres FTS parser (tsquery stack) — clamped safely.
agent-setup --namespace <ns> now grants the wired token on that namespace (no more silent 403 on writes).
SDK __version__ derives from package metadata (no drift); memnos status reports the real extraction backend.

Upgrade: memnos upgrade && memnos restart (or uv tool upgrade memnos).

Assets 2

13 Jun 13:55

thameema

v0.1.9

b0469d1

v0.1.9 — fresh-install & capture fixes

Bug-fix release hardening the install path and agent capture (found via a clean-Linux + Hermes field test).

Linux install on more distros. memnos now works with pgvector 0.6.0 — it feature-detects the installed pgvector and uses full-precision vector columns on <0.7, halfvec automatically on ≥0.7. No source build needed where your distro ships 0.6.
memnos proxy handles gzip'd upstreams. Fixes a bug where a gzip-compressed response (e.g. via OpenRouter) broke both capture and a client behind the proxy. Both sides of every exchange are now reliably captured — verified end-to-end with a live agent.
Write failures are never silent. MCP write tools (remember, memory_write, …) now raise an explicit error instead of reporting success when a write is rejected — so an agent can't believe it saved something it didn't.
agent-setup wires each agent its own scoped principal + token (fixes writes silently failing with 403).

Upgrade: memnos upgrade && memnos restart (or uv tool upgrade memnos).

Assets 2

12 Jun 19:15

thameema

v0.1.8

a7f3e3e

v0.1.8 — recall that stays fast on your hardware

No more cold-start stall. The server accepts recalls immediately on startup; while the reranker warms in the background, recall returns best-available results (flagged degraded) instead of blocking. First-call latency drops from tens of seconds to under ~2s on every machine.
Self-calibrating to your hardware. memnos measures rerank speed at startup and sizes reranking to a latency ceiling: capable machines keep full ranking depth (no accuracy change), CPU-only machines stay responsive instead of timing out. Tunable via MEMNOS_RERANK_BUDGET_MS / MEMNOS_RERANK_CAP; MEMNOS_RERANK=0 disables reranking entirely.
Per-stage recall timings in the audit log (embed / sql / staleness / rerank) for diagnosing latency, plus a 60s query-embedding cache.
Published benchmark: LongMemEval full-500 = 78.4% (gpt-4o answer + judge, on a competitor's open MemoryBench harness) — every prediction in benchmarks/results/.

Upgrade: memnos upgrade && memnos restart (or uv tool upgrade memnos).

Assets 2

11 Jun 21:06

thameema

v0.1.7

d0e5257

v0.1.7 — what's true now

Belief-change supersession on the write path: when a new memory contradicts an old one (status flips, negations, value updates like "rate limit is now 200"), the old fact is closed out with full provenance — recall returns what's true now, with the transition visible ("superseded as of ").
Write-path dedupe: restating the same fact bumps its salience instead of stacking duplicate rows.
memnos namespace reconcile <ns> — applies the same rules to memories stored before 0.1.7 (--dry-run shows exact counts first; no LLM calls).
Smarter ranking on broad questions: "where are we with X" now leads with distilled facts; verbatim questions still surface the original conversation. Tunable/disablable via env.
Windows installation guide (docs/guides/windows.md); LoCoMo full-10 band updated to 64–65% with results published.

Closes #10, #11. Verified: full test suite + 3-OS CI matrix + LoCoMo full-10 held at 64–65% across the changes.

Upgrade: memnos upgrade (or uv tool upgrade memnos), then memnos restart.

Assets 2

11 Jun 05:51

thameema

v0.1.6

d09c0d8

v0.1.6 — the coordination release: attributed, grounded, governed

Memory as the coordination plane for multi-agent systems.

Author-attributed memory

Every memory now carries its author — stamped server-side from the authenticated token, impossible to forge from a request body. Recall and context blocks show who wrote what (- (decision, 2026-06-11, by arch-agent) ...), and /recall accepts an author filter. Multi-agent blackboard coordination: shared namespace + signed writes + webhook subscriptions = agents that hand off work with full provenance.

Grounded recall (knowledge bases)

Tag any namespace kind=knowledge (no reserved prefixes) and link it to working namespaces: memnos namespace link proj:billing cms57-docs. Recall then automatically consults linked knowledge bases — enforced by the engine, not the prompt. Links are policy, grants are permission: both required, and skipped links are visible in the response (grounded_in / links_skipped).

Typed memories + pinned constraints

memnos remember "..." --type decision|incident|constraint|skill|fact — types flow through extraction and recall filters. Constraints are always injected: type=constraint memories appear in every recall for their namespace (and granted linked KBs), rendered first as CONSTRAINT: lines regardless of the query — compliance rules your agents physically cannot forget. New "Memory feed" tab in the admin console (type badges, author, age).

Published, CI-enforced API contract

openapi.yaml — 58 operations, every one exercised and schema-validated against the real server in CI. The published spec cannot drift from the implementation. Human reference at docs/api.md with curl + SDK examples.

The memnos CLI grammar

Consistent noun–verb commands (principal create, token mint, grant add, namespace link) — every old form still works as an alias. The full CLI reference is auto-generated from the parser itself (staleness gated in CI), published at docs/cli.md and as a CLI Reference tab in the admin console. New cross-platform CI matrix (Linux/macOS/Windows) — which caught and fixed a real Windows console encoding bug before release.

memnos-sdk 0.1.6 published in lockstep.

Assets 2

11 Jun 05:08

thameema

v0.1.5

a0a7a03

v0.1.5 — reliable capture, deterministic proxy, field-hardened performance

The reliability release: every fix in here came from real field deployments.

Capture — the answer is the memory

CRITICAL fix: the Claude Code Stop hook now captures BOTH speakers. Previously only the user's prompt was saved — everything the assistant said (decisions, ticket IDs, outcomes) was invisible across sessions. Assistant replies are now captured with identifiers preserved verbatim, and the extraction prompt keeps ticket keys / PR numbers / versions / URLs intact.
New: memnos proxy — deterministic both-sides capture for any OpenAI- or Anthropic-compatible client (Hermes, OpenClaw, Open WebUI, SDK apps): transparent relay (streaming included), BYOK (keys forwarded, never stored), fail-open (a capture problem never breaks your agent), agent-loop noise filtered (tool-call iterations, title calls, dedupe). Typed error taxonomy so you always know if a failure is the proxy, the network, or the LLM.
Session trust rails: every Claude Code session opens with a visible memnos: memory ACTIVE status line (or a loud warning + fix); memnos status shows server, proxy, capture counters, and stale/unmanaged-process detection.

Integrations

memnos agent-setup now uniform across claude-code, claude-desktop, codex, cursor, windsurf, openclaw, hermes (one grammar; claude-setup remains an alias)
Claude Desktop: platform-aware config paths, absolute command path (fixes spawn memnos ENOENT), and a memory skill for consistent recall/remember behavior
memnos upgrade re-wires previously-installed integrations automatically
memnos autostart [--proxy] — launchd/systemd login services; the server waits for Postgres instead of crashing

Performance (measured at 52k–104k row scale)

/recall −51% latency (single-pass retrieve+rerank); raw-turn BM25 index added
New default reranker: ms-marco-MiniLM-L-6 — measured on the identical full-10 LoCoMo corpus: +6pp accuracy over the 13× larger model, 8.4× faster reranks, ~660MB less memory, 0.23s cold start
No pool connection is ever held across LLM/embedding/CPU work (storm test: total write outage → 24/24 success, admin p95 3ms); async ingest option; statement timeouts
Supersession writes 2,116ms → 0.9ms at 95k facts; bounded embedding cache (a 35k-write ingest now costs +26MB, was ~1.9GB); processes show as memnos-server/memnos-proxy in ps/top
Admin console: loading/error/empty states everywhere, pagination, retry — plus faster endpoints

Benchmarks (reproducible, predictions published)

Full LoCoMo (10 conversations, 1,542 questions, gpt-4o judge): 64–65% across three independent ingests with the new defaults (benchmarks/results/)

memnos-sdk 0.1.5 published in lockstep.

Assets 2

10 Jun 06:22

thameema

v0.1.4

7889126

v0.1.4 — friction-free CLI

Friction-free install & operations

Server lifecycle like any daemon

memnos start / stop / restart / status — background server with pid + log management; memnos serve remains the foreground primitive for systemd/launchd/Docker
memnos start shows live progress (first start downloads the local ONNX models ~1 GB — it tells you, tails the log, and surfaces startup errors inline)
Prominent "START THE SERVER" guidance at the end of setup

Setup

memnos setup --docker — provisions a ready pgvector Postgres container (native Postgres remains the first-class path)
Clear, actionable guidance when pgvector is missing or built for the wrong Postgres major (brew version-mismatch gotcha included)
OpenAI-key step: hidden input, whitespace-scrubbed, format-checked, live-validated against the OpenAI API before acceptance; blank entry now confirms before locking in free local 384-d mode; key stored AES-256-GCM encrypted in the vault
Re-running setup is safe — the schema is additive/idempotent (never wipes data)

New: memnos migrate-embeddings

Lossless migration between local 384-d and OpenAI 1536-d embeddings — re-embeds every memory from its stored source text, swaps the column type, rebuilds the HNSW indexes, and flips the server mode. --to {384,1536}, with cost + running-server warnings

Upgrades & versioning

memnos upgrade (--check) — detects uv/pipx/pip installs and upgrades in place
Bare memnos and memnos -V show the version; an available upgrade is hinted

Engine (since 0.1.3)

Reranker + local embeddings on ONNX Runtime via fastembed — torch removed, install ~770 MB → ~236 MB, identical ranking (LoCoMo re-validated within the 57–61% band)

memnos-sdk 0.1.4 published in lockstep (no functional SDK changes).

Assets 2

10 Jun 03:57

thameema

v0.1.3

9e81e04

memnos v0.1.3 — uv-first install docs

Docs refresh: uv is now the primary installer (uv tool install memnos for the CLI, uv pip install memnos-sdk for the library); pip/pipx are fallbacks. No engine changes. memnos + memnos-sdk both at 0.1.3.

Assets 2

10 Jun 03:42

thameema

v0.1.2

deb3a43

memnos v0.1.2 — ONNX backend (no torch), ~2.4x smaller install

Reranker + local embeddings now run on ONNX Runtime (fastembed) instead of torch — install ~770MB to ~236MB. Same models, bit-identical reranking. LoCoMo re-validated: 57% (57-61% band). memnos + memnos-sdk both at 0.1.2.

Assets 2

Releases: thameema/memnos

v0.1.11 — sharper recall

Recall quality

Capture

Fixes

Quality

Uh oh!

v0.1.10 — bounded memory, never-drop capture, local extraction

Uh oh!

v0.1.9 — fresh-install & capture fixes

Uh oh!

v0.1.8 — recall that stays fast on your hardware

Uh oh!

v0.1.7 — what's true now

Uh oh!

v0.1.6 — the coordination release: attributed, grounded, governed

Author-attributed memory

Grounded recall (knowledge bases)

Typed memories + pinned constraints

Published, CI-enforced API contract

The memnos CLI grammar

Uh oh!

v0.1.5 — reliable capture, deterministic proxy, field-hardened performance

Capture — the answer is the memory

Integrations

Performance (measured at 52k–104k row scale)

Benchmarks (reproducible, predictions published)

Uh oh!

v0.1.4 — friction-free CLI

Friction-free install & operations

Uh oh!

memnos v0.1.3 — uv-first install docs

Uh oh!

memnos v0.1.2 — ONNX backend (no torch), ~2.4x smaller install

Uh oh!