v1.0 — server-side memory

@maeddesg maeddesg released this 14 Jun 18:09
· 15 commits to main since this release

Major release. VulkanForge gains a server-side memory — a persistent, project-scoped, semantic store embedded in the vulkanforge serve process. Write notes on purpose, read them back by meaning; the record survives server restarts and model swaps. Supported-config inference output is unchanged — this release adds a subsystem and does not touch the decode/prefill path.

What it does

  • MemoryStore, embedded in the API process. SQLiteGraph (3.2.5, GPL-3.0) holds nodes + edges + per-project HNSW vector indexes in one SQLite file; a CPU embedder (fastembed 5.16.2, ONNX Runtime) runs Nomic-Embed v1.5-Q (768-dim, INT8 → AVX-512/VNNI). The memory path runs off the async runtime and never takes the GPU concurrency permit — a recall never waits behind a generation.
  • VF-native /memory/* endpoints (separate from /v1/*):
    • POST /memory/remember {project_key?, kind, text, name?, metadata?} → {id}
    • POST /memory/recall {project_key?, query, k?} → {hits:[{id, kind, name, text, status, score}]}
    • POST / GET /memory/projects
    • project_key is optional → a shared global scope.
  • Project isolation by construction. Each project gets its own persistent HNSW index (768-dim, cosine, m=16, ef_construction=200); a recall in one project physically cannot return another's notes.
  • Persistent across restarts — vectors restore from the SQLite store with no re-embedding.
  • Local and single-user, all the way down: no cloud, no telemetry; the embeddings are computed on your CPU and the whole store is one SQLite file (default ~/.vulkanforge/memory.db, override VF_MEMORY_DB).
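
As a rough illustration of the endpoint shapes listed above, here is a minimal Python client sketch using only the standard library. Several things in it are assumptions and not taken from these notes: the base URL (host and port are placeholders), the example kind value "note", the example project key, and the helper names remember_body, recall_body, post_json, and demo. Only the paths and field names come from the endpoint list.

```python
import json
import urllib.request

# Assumption: placeholder address; the release notes do not state host or port.
BASE_URL = "http://127.0.0.1:8080"


def remember_body(kind, text, project_key=None, name=None, metadata=None):
    """Build a /memory/remember request body, leaving out unset optional fields."""
    body = {"kind": kind, "text": text}
    if project_key is not None:
        body["project_key"] = project_key  # omitted means the shared global scope
    if name is not None:
        body["name"] = name
    if metadata is not None:
        body["metadata"] = metadata
    return body


def recall_body(query, project_key=None, k=None):
    """Build a /memory/recall request body, leaving out unset optional fields."""
    body = {"query": query}
    if project_key is not None:
        body["project_key"] = project_key
    if k is not None:
        body["k"] = k
    return body


def post_json(path, body, base_url=BASE_URL):
    """POST a JSON body to the server and return the decoded JSON response."""
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def demo():
    """Write one note, then read it back by meaning (needs a running server)."""
    # "note" and "my-project" are hypothetical values chosen for illustration.
    saved = post_json(
        "/memory/remember",
        remember_body("note", "Decode path is untouched in v1.0", project_key="my-project"),
    )
    found = post_json(
        "/memory/recall",
        recall_body("what changed in the decode path?", project_key="my-project", k=3),
    )
    return saved["id"], [hit["text"] for hit in found["hits"]]
```

Calling demo() requires a running vulkanforge serve instance at the assumed address. The two body builders run without a server, which makes it easy to check that project_key is left out when the shared global scope is intended.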

What it is not (yet)

This release writes and reads — that is deliberately the whole of it. Not yet here: lifecycle transitions (draft→confirmed→…→archived), delete/archive, a richer edge taxonomy, auto-injection, and the vf-clide client integration (REPL /project / /recall, agent memory-tools). Those are the next milestone. See the wiki's Memory page for what it is, what it isn't, and the roadmap.

Cost (honest)

The two native deps add real surface: the release binary grows ~25 MB → ~59 MB (statically linked ONNX Runtime + bundled SQLite), the lockfile ~250 → ~384 packages, and a first clean build takes a few extra minutes. The first server start downloads the Nomic ONNX model from HuggingFace into ~/.vulkanforge/embed-cache (then runs offline).


Versions: engine 0.9.2 → 1.0.0; vf-clide unchanged at 0.3.1. Validated on AMD RX 9070 XT (RADV/gfx1201), Mesa 26.1.2.