Releases: maeddesg/vulkanforge
v1.0.5 — conflict edges, opt-in frontier, edge-type priors, cross-process determinism
v1.0.5 — conflict edges, opt-in frontier retrieval, edge-type priors, and cross-process recall determinism
This release extends the opt-in memory subsystem and moves it onto SQLiteGraph 3.3.1. Everything is additive and opt-in: default recall stays byte-identical, the retrieval frontier is off by default, and the new conflict edge never changes ranking. Decode is untouched.
Dependency
sqlitegraph bumped to 3.3.1 (maeddesg fork @80a3168, from oldnordic 3.2.5). 3.3.1 honors multilayer_deterministic_seed (the HNSW level distributor no longer falls back to from_entropy()) and re-elects the HNSW entry point on delete_vector. --features memory now requires Rust 1.89; the lean default build stays on 1.85.
New (all opt-in / additive)
- CONTRADICTS edge: a third, symmetric edge flagging two notes as in conflict (/contradict, /uncontradict; POST /memory/contradict · /uncontradict). Awareness only: no suppression, no winner. The conflict shows in --explain (⚠ conflicts with #X + conflict pairs); you resolve it with the existing /supersede. Default recall is byte-identical.
- Opt-in frontier retrieval (--frontier, default OFF): reserves a few slots (VF_FRONTIER_SLOTS, default 2) for a top hit's DERIVES_FROM-linked evidence (one hop), pulling a supporting premise up next to it; --explain labels seed vs. frontier picks. Unset → pure top-k, byte-identical to before.
- Edge-type priors: edge types carry categorical roles (no scalar weights): DERIVES_FROM pulls, CONTRADICTS withholds. A frontier candidate that contradicts a seed is held back (the slot goes to the next clean candidate), shown transparently in --explain. The frontier never amplifies evidence a more relevant hit disputes.
- Cross-process recall determinism (VF_HNSW_SEED): the HNSW seed is pinned (honored on SG 3.3.1), so two separate processes that build the same store recall byte-identically. Guarded by a new committed integration test that spawns two processes and diffs their recall (ids + bit-exact scores).
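The interplay of frontier slots and the two edge roles can be sketched as follows. This is an illustration under stated assumptions, not VulkanForge's code: the function name, the edge representation as plain id pairs, and the fallback of unused frontier slots to plain ranking order are all hypothetical.

```rust
use std::collections::HashSet;

/// A ranked candidate: (note id, similarity score), sorted best-first.
type Hit = (u32, f32);

/// Picks `k` notes: seeds from the top of `ranked`, plus up to
/// `frontier_slots` one-hop DERIVES_FROM premises of those seeds.
/// A premise that CONTRADICTS any seed is withheld and its slot passes on.
/// (Hypothetical sketch; names and data layout are assumptions.)
fn recall_with_frontier(
    ranked: &[Hit],
    derives_from: &[(u32, u32)], // (conclusion, premise)
    contradicts: &[(u32, u32)],  // symmetric pairs
    k: usize,
    frontier_slots: usize,
) -> Vec<u32> {
    let seed_count = k.saturating_sub(frontier_slots);
    let seeds: Vec<u32> = ranked.iter().take(seed_count).map(|h| h.0).collect();
    let mut picked: Vec<u32> = seeds.clone();
    let mut seen: HashSet<u32> = seeds.iter().copied().collect();

    let conflicts = |a: u32, b: u32| {
        contradicts
            .iter()
            .any(|&(x, y)| (x == a && y == b) || (x == b && y == a))
    };

    // Frontier: one hop along DERIVES_FROM from each seed, best seed first.
    let mut used = 0;
    for &seed in &seeds {
        for &(conclusion, premise) in derives_from {
            if used == frontier_slots {
                break;
            }
            if conclusion != seed || seen.contains(&premise) {
                continue;
            }
            if seeds.iter().any(|&s| conflicts(premise, s)) {
                continue; // withheld: the slot goes to the next clean candidate
            }
            seen.insert(premise);
            picked.push(premise);
            used += 1;
        }
    }

    // Assumption: unused frontier slots fall back to plain ranking order.
    for &(id, _) in ranked {
        if picked.len() >= k {
            break;
        }
        if seen.insert(id) {
            picked.push(id);
        }
    }
    picked
}
```

With `frontier_slots` set to 0, or with no edges present, the sketch returns the plain top-k in ranking order, which mirrors the "unset → pure top-k" behavior described above.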
Validation
cargo build --features memory 0 new warnings; lib 306 (lean) / 308 (memory); tests/memory.rs 29 (+1 ignored subprocess probe); vf-clide (0.3.4) 115; cargo tree carries no SQLiteGraph/fastembed/ort/rusqlite in the lean build or the client; recall byte-identical with and without edges (tested). Engine 1.0.5, vf-clide 0.3.4.
v1.0.4 — KV prefix-reuse on by default, recall diagnostics, note typing, and memory edges
KV prefix-reuse on by default, recall diagnostics, note typing, and memory edges. Recall stays byte-identical when no edges exist and no opt-ins are active. Memory is opt-in (--features memory + serve --memory); without it the inference path is unchanged.
- KV prefix reuse is now on by default (VF_KV_PREFIX_REUSE=0 to disable): removes the within-turn double-prefill on memory-augmented turns. The reused KV is logit byte-identical to a fresh prefill (standing gate tests/kv_reuse_ident.rs, F16 + FP8-KV paths). Measured (Qwen3-8B, warm steady-state median, isolated GPU with no competing load): the redundant within-turn re-prefill of the shared ~1.5k-token prefix (~460 ms) is eliminated.
- recall --explain: diagnostic view with returned hits, near-misses, score separation, and the cut reason per near-miss (superseded/type/threshold/top-k).
- Relevance threshold: opt-in via VF_RECALL_MARGIN, off by default (adaptive, relative-to-top).
- Note typing: --type on remember, /retype, and a --type filter on recall (invariant/working/episodic/decision/failure, default untyped).
- SUPERSEDES edges: /supersede / /unsupersede; superseded notes are suppressed from recall by default (--include-superseded to show), chains resolve to the current head, and recall backfills to k after suppression. Notes are suppressed, never deleted.
- DERIVES_FROM edges + /why: explicit derivation links and a why-graph trace (cycle-guarded, depth-capped); never alters recall results.
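The SUPERSEDES behavior (chains resolve to the current head, suppressed notes are skipped, recall backfills to k) can be illustrated with a small sketch. The map layout and both function names are assumptions made for this example, not the store's actual API.

```rust
use std::collections::{HashMap, HashSet};

/// Follows SUPERSEDES links from `id` to the current head of its chain.
/// `superseded_by` maps an old note to the note that replaced it.
/// The visited set guards against an accidental cycle.
fn chain_head(id: u32, superseded_by: &HashMap<u32, u32>) -> u32 {
    let mut current = id;
    let mut visited = HashSet::new();
    while let Some(&next) = superseded_by.get(&current) {
        if !visited.insert(current) {
            break;
        }
        current = next;
    }
    current
}

/// Walks the ranked ids best-first, skips superseded notes, and keeps
/// going past position `k` until `k` live notes are collected (backfill).
/// Nothing is deleted: suppressed notes simply do not appear in the result.
fn recall_live(ranked: &[u32], superseded_by: &HashMap<u32, u32>, k: usize) -> Vec<u32> {
    ranked
        .iter()
        .copied()
        .filter(|id| !superseded_by.contains_key(id))
        .take(k)
        .collect()
}
```

Filtering before `take(k)` is what gives the backfill: a suppressed note never consumes one of the k result slots.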
Engine 1.0.3 → 1.0.4, vf-clide 0.3.2 → 0.3.3.
v1.0.3 — agent-side curation, un-archive, 404 for missing notes
v1.0.3 — agent-side curation, un-archive, and a 404 for missing notes.
Memory curation grows up: the agent can now curate, archives are reversible, and the curation API returns the right HTTP status. No inference-path change — decode is byte-identical, so there's nothing new to benchmark.
- The agent can archive, safely. It may archive only a note it recalled this session, and only behind an always-on confirmation that shows the note's real stored text (never the model's claim) plus a required reason. It's on the memory axis, so even --allow-shell doesn't auto-approve it; headless denies. forget (hard delete) stays user-only.
- Archiving is reversible: /unarchive <id>. Archive drops the note's vector but keeps the record; unarchive re-embeds the stored text (the embedder is deterministic, so the original vector comes back) and restores it to recall, node-id link intact. Idempotent, and it survives a restart. Like /forget, it's a user action; the agent has no un-archive tool.
- A missing id is a 404, not a 500. POST /memory/archive · /unarchive · /delete now answer 404 Not Found when the id doesn't exist, honestly distinguishing "your id was wrong" from "the server broke". Real faults still return 500.
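The status split can be pictured as a mapping from a typed error to an HTTP code. The error type below is hypothetical, and the 200 on success is an assumption; the notes only specify 404 for a missing id and 500 for real faults.

```rust
/// Outcome of a curation call on a note id (hypothetical type for this sketch).
#[derive(Debug)]
enum CurationError {
    /// The caller's id does not exist in the store.
    NotFound(u64),
    /// Anything that is the server's fault, e.g. a storage failure.
    Store(String),
}

/// Maps a curation result to an HTTP status code.
fn status_code(result: &Result<(), CurationError>) -> u16 {
    match result {
        Ok(()) => 200, // assumption: the success code is not stated in the notes
        Err(CurationError::NotFound(_)) => 404,
        Err(CurationError::Store(_)) => 500,
    }
}
```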
Engine 1.0.2 → 1.0.3; vf-clide 0.3.1 → 0.3.2.
See CHANGELOG.md and the wiki's Memory / Memory Design pages.
v1.0.2 — vf-clide memory access + curation, agent self-state, 1.96 cleanup
v1.0.2 — agent memory access + self-state, 1.96 cleanup (2026-06-16)
Client-side memory access and an accurate agent self-image. The v1.0/v1.0.1 server-side memory store is now
reachable from vf-clide — by the REPL and the agent loop — and the agent is told its real tools, permissions, and
memory scope instead of guessing them. No inference-path change; decode logits are bit-identical to v1.0.1.
- Added: agent memory access (opt-in, serve --memory). vf-clide reaches the project-scoped store through recall / remember agent tools (offered only when the server reports memory enabled) and the REPL commands /project, /recall, /remember. Project isolation and the CPU/VNNI embedder (no VRAM) are unchanged from v1.0. The client stays thin: zero SQLiteGraph/fastembed/ort dependencies.
- Added: memory curation (user-only). /archive <id> drops a note from recall but keeps the record; /forget <id> hard-deletes it; remember de-duplicates near-identical notes instead of storing twice. Curation is user-driven only: the agent cannot archive or delete, and is told so.
- Added: accurate agent self-state. The agent's system prompt now carries its live tool permissions (from the actual gate: allowed / confirm-gated / denied-needs-flag, with shell flagged un-confined), the active memory scope, the user-only curation boundary, and a scoped proactive-recall nudge (only for memory-type questions, only when the scope already has notes, so no over-recall). Presence-aware: a fresh session learns the project has memory without any note content being injected.
- Fixed: recall vs. file-search. recall (project memory) and search (workspace files) now describe themselves so they can't be confused; a memory question no longer triggers a file search.
- Fixed: recall cites the real note id. recall results show each note's true id ([id 7]) instead of the enumeration index, so the agent references the correct note.
- Fixed: self-state permission wording. shell is described as un-confined (a command can touch paths outside the workspace); write_file is confirm-gated without --allow-mutating, not silently "allowed".
- Changed: rust 1.96 lib warnings silenced (114 → 0), with and without --features memory. Warning-only; no unsafe block was removed (kept for MSRV portability; removing them would raise the floor to 1.96). Decode logits are bit-identical before/after (verified by a greedy VF_LOGIT_DUMP diff).
- Notes. Memory design and philosophy (tool-driven, visible, not auto-injected; curation user-only): the wiki Memory Design page.
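The "real note id" fix comes down to labeling each hit with its stored id instead of its position in the result list. A minimal sketch, assuming a hypothetical `Note` struct and render function (only the `[id 7]` label format comes from the notes above):

```rust
/// A recalled note (hypothetical shape for this sketch).
struct Note {
    id: u64,
    text: String,
}

/// Renders hits with their stored id, so a later tool call that cites
/// "id 7" refers to the stored note and not to the seventh list entry.
fn render_hits(hits: &[Note]) -> Vec<String> {
    hits.iter()
        .map(|note| format!("[id {}] {}", note.id, note.text))
        .collect()
}
```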
Validation. Engine unchanged → cargo build --release 0 warnings, lib 306/306, correctness 83/83,
regression 26 + 1 ignored; --features memory tests/memory.rs 10/10. vf-clide 92/92; cargo tree carries
no SQLiteGraph/fastembed/ort. Agent behavior live-verified end-to-end (cross-session recall, permission awareness,
user-only forget, no over-recall) on Qwen3-8B @ serve --memory.
Versions. Engine 1.0.1 → 1.0.2. vf-clide gains the memory client + self-state (REPL commands, agent tools,
curation, accurate self-state) — version set by mg at release. Lean default still builds on Rust 1.85+;
--features memory needs Rust 1.89+.
v1.0.1 — memory is now opt-in
⚠️ Behavior change: memory is now opt-in (default off)
The v1.0 server-side memory subsystem is now optional and off by default, gated twice. Inference is unchanged — this release only puts memory behind opt-in gates so the standard build is lean again.
- Build it in: cargo build --release --features memory (the default cargo build --release stays lean and pulls in neither SQLiteGraph nor the ONNX embedder).
- Turn it on: vulkanforge serve --model … --memory (or VULKANFORGE_MEMORY=1).
- Off by default: without --memory, /memory/* returns 503 and the server runs inference only: no embedder load, no database opened, so an inference-only run carries zero memory overhead.
- Clear errors: passing --memory to a lean binary fails fast with a rebuild with --features memory message, before the model loads.
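The two gates combine into three startup outcomes. The sketch below models the build-time gate as a plain boolean instead of a Cargo feature check so that it stands alone; the enum and function names are hypothetical.

```rust
/// How the server starts, given both gates (hypothetical type).
#[derive(Debug, PartialEq)]
enum Startup {
    /// Inference only: no embedder load, no database opened.
    InferenceOnly,
    /// Memory compiled in and switched on.
    WithMemory,
    /// --memory was passed to a lean binary: abort before the model loads.
    FailFast(&'static str),
}

/// `built_with_memory` stands in for "compiled with --features memory";
/// `memory_requested` for "--memory or VULKANFORGE_MEMORY=1 was given".
fn resolve_startup(built_with_memory: bool, memory_requested: bool) -> Startup {
    match (built_with_memory, memory_requested) {
        (_, false) => Startup::InferenceOnly,
        (true, true) => Startup::WithMemory,
        (false, true) => Startup::FailFast("rebuild with --features memory"),
    }
}

/// Gate in front of every /memory/* route: `Some(503)` short-circuits the
/// request, `None` lets it through to the real handler. A `FailFast`
/// server never serves requests, so only the first two cases occur live.
fn memory_gate(startup: &Startup) -> Option<u16> {
    match startup {
        Startup::WithMemory => None,
        _ => Some(503),
    }
}
```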
Cost & toolchain
- Lean default ~25 MB again; --features memory adds ~34 MB (static ONNX Runtime + bundled SQLite) → ~58 MB.
- rustc floor: the lean build still works on Rust 1.85+; --features memory needs Rust 1.89+ (the edition-2024 sqlitegraph declares rust-version = 1.89; ort declares 1.88).
Docs
Full opt-in documentation sweep — see the wiki's Memory page (what it is, what it isn't, enabling, how it works), plus updated Installation / Usage / Configuration / Hardware & Compatibility / Troubleshooting.
Validation
Lean default build (the shipping default): lib 306/306, correctness 83/83, regression 26 (+1 ignored); cargo tree carries none of sqlitegraph/fastembed/ort/rusqlite. --features memory: memory integration tests 6/6. No inference-path change.
Versions: engine 1.0.0 → 1.0.1; vf-clide unchanged at 0.3.1. Built/verified on Mesa 26.1.2-arch2.1 (RX 9070 XT).
v1.0 — server-side memory
Major release. VulkanForge gains a server-side memory — a persistent, project-scoped, semantic store embedded in the vulkanforge serve process. Write notes on purpose, read them back by meaning; the record survives server restarts and model swaps. Supported-config inference output is unchanged — this release adds a subsystem and does not touch the decode/prefill path.
What it does
- MemoryStore, embedded in the API process. SQLiteGraph (3.2.5, GPL-3.0) holds nodes + edges + per-project HNSW vector indexes in one SQLite file; a CPU embedder (fastembed 5.16.2, ONNX Runtime) runs Nomic-Embed v1.5-Q (768-dim, INT8 → AVX-512/VNNI). The memory path runs off the async runtime and never takes the GPU concurrency permit: a recall never waits behind a generation.
- VF-native /memory/* endpoints (separate from /v1/*):
  - POST /memory/remember {project_key?, kind, text, name?, metadata?} → {id}
  - POST /memory/recall {project_key?, query, k?} → {hits:[{id, kind, name, text, status, score}]}
  - POST/GET /memory/projects
  - project_key is optional → a shared global scope.
- Project isolation by construction. Each project gets its own persistent HNSW index (768-dim, cosine, m=16, ef_construction=200); a recall in one project physically cannot return another's notes.
- Persistent across restarts: vectors restore from the SQLite store with no re-embedding.
- Local and single-user, all the way down: no cloud, no telemetry; the embeddings are computed on your CPU and the whole store is one SQLite file (default ~/.vulkanforge/memory.db, override VF_MEMORY_DB).
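"Isolation by construction" means a recall only ever looks inside its own project's index. The sketch below shows the idea with one vector list per project key and a brute-force cosine scan standing in for HNSW. The struct, its methods, and the use of an empty key for the global scope are assumptions for illustration, not the real MemoryStore.

```rust
use std::collections::HashMap;

/// Toy store: one vector list per project key (hypothetical layout).
#[derive(Default)]
struct MemoryStore {
    // project key -> (note id, embedding); "" stands in for the global scope
    projects: HashMap<String, Vec<(u64, Vec<f32>)>>,
    next_id: u64,
}

/// Cosine similarity; returns 0.0 when either vector has zero length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

impl MemoryStore {
    /// Stores an embedding under the project (or the global scope) and returns its id.
    fn remember(&mut self, project_key: Option<&str>, embedding: Vec<f32>) -> u64 {
        self.next_id += 1;
        let key = project_key.unwrap_or("").to_string();
        self.projects.entry(key).or_default().push((self.next_id, embedding));
        self.next_id
    }

    /// Scans only the requested project's vectors, so notes from other
    /// projects are never candidates. Brute force replaces HNSW here.
    fn recall(&self, project_key: Option<&str>, query: &[f32], k: usize) -> Vec<(u64, f32)> {
        let Some(notes) = self.projects.get(project_key.unwrap_or("")) else {
            return Vec::new();
        };
        let mut hits: Vec<(u64, f32)> =
            notes.iter().map(|(id, v)| (*id, cosine(query, v))).collect();
        // Best score first; ties broken by id for a stable order.
        hits.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
        hits.truncate(k);
        hits
    }
}
```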
What it is not (yet)
This release writes and reads — that is deliberately the whole of it. Not yet here: lifecycle transitions (draft→confirmed→…→archived), delete/archive, a richer edge taxonomy, auto-injection, and the vf-clide client integration (REPL /project / /recall, agent memory-tools). Those are the next milestone. See the wiki's Memory page for what it is, what it isn't, and the roadmap.
Cost (honest)
The two native deps add real surface: the release binary grows ~25 MB → ~59 MB (statically linked ONNX Runtime + bundled SQLite), the lockfile ~250 → ~384 packages, and a first clean build takes a few extra minutes. The first server start downloads the Nomic ONNX model from HuggingFace into ~/.vulkanforge/embed-cache (then runs offline).
Versions: engine 0.9.2 → 1.0.0; vf-clide unchanged at 0.3.1. Validated on AMD RX 9070 XT (RADV/gfx1201), Mesa 26.1.2.
v0.9.4 — vf-clide REPL permission ceiling + denial wording
vf-clide UX release (0.3.1). No engine change (engine stays at 0.9.2). Two changes to the agent's permission UX.
REPL honors the permission ceiling
In the interactive --agent REPL, a tool call at or below the active ceiling (--yes → ReadOnly,
--allow-mutating → Mutating, --allow-shell → Exec, cumulative) is now auto-approved — and still printed,
so you see every tool that ran — and only a call above the ceiling prompts y/N.
Previously the REPL prompted for every call and the flags only took effect headless. So --agent --yes now
stops asking about reads, --allow-mutating stops asking about writes, and so on, while anything above the
ceiling still asks. This is consistent with headless, not laxer: workspace confinement still bounds the file
tools independently, and shell is still only auto-approved with --allow-shell.
Headless -p is unchanged — a call above the ceiling is denied (not prompted), byte-for-byte as before.
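The ceiling rule can be written as a comparison on an ordered tier enum. The tier names and the tool-to-tier mapping follow the v0.9.0 notes; the function names, and the choice to model "no flag given" as `None`, are assumptions of this sketch.

```rust
/// Risk tiers, ordered from least to most dangerous. Deriving `Ord` on an
/// enum orders variants by declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Tier {
    ReadOnly,
    Mutating,
    Exec,
}

#[derive(Debug, PartialEq)]
enum Decision {
    AutoApprove,
    Prompt,
    Deny,
}

/// Tier of each built-in tool; unknown tools have no tier.
fn tier_of(tool: &str) -> Option<Tier> {
    match tool {
        "read_file" | "search" => Some(Tier::ReadOnly),
        "write_file" => Some(Tier::Mutating),
        "shell" => Some(Tier::Exec),
        _ => None,
    }
}

/// `ceiling` is `None` when no approval flag was given (assumption).
/// At or below the ceiling: auto-approve. Above it: prompt in the REPL,
/// deny when headless.
fn decide(tool: Tier, ceiling: Option<Tier>, headless: bool) -> Decision {
    match ceiling {
        Some(c) if tool <= c => Decision::AutoApprove,
        _ if headless => Decision::Deny,
        _ => Decision::Prompt,
    }
}
```

Because the flags are cumulative, a single comparison against the highest granted tier is enough; there is no need to track each flag separately.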
Denial wording in the constitution
The built-in agent system prompt now distinguishes the two kinds of denial so the model stops claiming it needs
"elevated permissions" or that a target is "system-critical":
- a permission denial (a tool above the current ceiling) is lifted only by re-running with --allow-mutating / --allow-shell, never OS or filesystem permissions;
- a workspace-confinement denial (a path outside the workspace) is absolute: no flag overrides it.
Versions: engine 0.9.2 (unchanged), vf-clide 0.3.0 → 0.3.1. Validated on AMD RX 9070 XT (RADV/gfx1201),
Mesa 26.1.2.
v0.9.2 — vf-clide token meter + clean server shutdown
Feature + bugfix release. vf-clide gains live token accounting and a pinned status line; the engine's vulkanforge serve now shuts down cleanly. Supported-config inference output is unchanged — no decode/prefill/behavior change.
vf-clide 0.3.0 — token meter + pinned status line (feature)
- Token accounting. The client surfaces real token usage on every path: the non-streaming response, the tool-calling loop, and the streaming path (via stream_options.include_usage, which the server emits as a final usage chunk). No local tokenizer, no estimation: the numbers are the server's own counts.
- Pinned status line. The REPL pins a bottom status line (raw ANSI scroll region, no TUI framework) with a token meter (↑prompt ↓completion (total) · session …) and the current action (idle / generating… / thinking… / running <tool>(…)). It is a no-op when stdout isn't a TTY, so headless -p output stays byte-for-byte unchanged and fully scriptable.
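A meter of this kind only needs to store the server-reported counts and add them up per session. The sketch below is hypothetical in its type and method names, and it renders only the per-turn part of the status line; the session part is elided in the notes above and is left out here.

```rust
/// Token counts as reported by the server for one response (hypothetical type).
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct Usage {
    prompt_tokens: u64,
    completion_tokens: u64,
}

impl Usage {
    fn total(&self) -> u64 {
        self.prompt_tokens + self.completion_tokens
    }
}

/// Keeps the last turn's usage and a running session sum.
#[derive(Default)]
struct Meter {
    last: Usage,
    session: Usage,
}

impl Meter {
    /// Records one server-reported usage block; nothing is estimated locally.
    fn record(&mut self, turn: Usage) {
        self.last = turn;
        self.session.prompt_tokens += turn.prompt_tokens;
        self.session.completion_tokens += turn.completion_tokens;
    }

    /// Per-turn segment in the "↑prompt ↓completion (total)" shape.
    fn turn_segment(&self) -> String {
        format!(
            "↑{} ↓{} ({})",
            self.last.prompt_tokens,
            self.last.completion_tokens,
            self.last.total()
        )
    }
}
```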
Engine 0.9.2 — clean serve shutdown (bugfix)
Ctrl+C / SIGTERM on vulkanforge serve previously left the GPU objects undestroyed (the validation layer reported hundreds of leaked objects) and then freed memory against an already-destroyed device → SIGSEGV. The shutdown path now:
- waits for the device to go idle (device_wait_idle),
- runs the explicit resource-teardown chain in order while the device is still alive, and
- drops the memory allocator before the device.
Result: 0 leaked objects, clean exit, no crash on both Ctrl+C and SIGTERM. Shutdown-path only — steady-state decode is untouched.
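The ordering constraint can be demonstrated without a GPU. In the sketch below, every type is a stand-in that only logs when it is dropped; the real engine tears down Vulkan objects, which this example does not attempt to model.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared log of teardown events, in the order they happened.
type Log = Rc<RefCell<Vec<&'static str>>>;

/// Stand-in for a GPU object: records its name when dropped.
struct Tracked {
    name: &'static str,
    log: Log,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Stand-in for the server's GPU state (hypothetical layout).
struct Gpu {
    resources: Vec<Tracked>,
    allocator: Tracked,
    device: Tracked,
}

/// Explicit teardown: idle first, then resources, then the allocator,
/// and the device last, so nothing is freed against a dead device.
fn shutdown(gpu: Gpu, log: &Log) {
    log.borrow_mut().push("device_wait_idle");
    let Gpu { resources, allocator, device } = gpu;
    drop(resources); // elements drop in vector order
    drop(allocator);
    drop(device);
}
```

Spelling out the `drop` calls keeps the order visible at the call site and independent of how the struct's fields happen to be declared.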
Versions: engine 0.9.0 → 0.9.2, vf-clide 0.2.1 → 0.3.0. (v0.9.1 was a vf-clide-only search-confinement security patch; the engine stayed at 0.9.0 through it.) Validated on AMD RX 9070 XT (RADV/gfx1201), Mesa 26.1.2.
v0.9.1 — search symlink-confinement fix (security)
v0.9.1 — Security: vf-clide search no longer follows symlinks out of the workspace
Security fix — update recommended.
vf-clide's agent search tool recursively walked the workspace using Path::is_dir/is_file,
which follow symlinks. A symlink inside the workspace pointing outside it (e.g. escape → /etc)
was treated as a directory and recursed into, so search could read files outside the workspace
root — reachable with only --yes (read-only auto-approval). The single-path tools
(read_file/write_file) and the search start path were already confined; only the recursive
walk was not.
Fix
search's recursive walk now checks each entry's own type via symlink_metadata (which does not
follow the final component) and skips symlinks entirely — they are neither recursed into nor read.
This closes the confinement hole and also prevents symlink cycles. read_file / write_file / shell
are unchanged.
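A walk of this shape can be sketched with the standard library alone. The function below is an illustration rather than vf-clide's code: its name, its signature, and the choice to skip unreadable files are assumptions. It uses a substring match, as the search tool is described as substring-based.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects files under `dir` whose contents contain `needle`.
/// Each entry is classified with `symlink_metadata`, which reports on the
/// entry itself and not on a link's target, so symlinks are skipped.
fn search(dir: &Path, needle: &str, found: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let file_type = fs::symlink_metadata(&path)?.file_type();
        if file_type.is_symlink() {
            continue; // neither recursed into nor read
        } else if file_type.is_dir() {
            search(&path, needle, found)?;
        } else if file_type.is_file() {
            // Unreadable or non-UTF-8 files are skipped in this sketch.
            if let Ok(text) = fs::read_to_string(&path) {
                if text.contains(needle) {
                    found.push(path);
                }
            }
        }
    }
    Ok(())
}
```

By contrast, `Path::is_dir` resolves the link first, which is why a link named `escape` pointing at `/etc` looked like an ordinary directory to the old walk.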
Scope
- vf-clide 0.2.0 → 0.2.1. Engine unchanged (0.9.0): no engine/decode/behavior change.
- A regression test (search_does_not_follow_escaping_symlink) pins the fix; vf-clide unit 60/60.
- Verified live @Qwen3-14B-Q4: in a workspace with escape → /etc, search returns only the in-workspace files, never /etc/....
If you run the --agent mode with untrusted workspaces, update.
v0.9.0 — Agentic vf-clide
vf-clide goes from a chat client to an agentic coding client; the engine gets a test-infrastructure hardening.
Highlights
- vf-clide can now code agentically. In --agent mode the client uses tools through VF's OpenAI API in a loop: read_file, write_file, search, shell.
- Three-tier permission model. Tools are tiered by risk: ReadOnly (read_file / search), Mutating (write_file), Exec (shell). Auto-approval rises, opt-in, via --yes → --allow-mutating → --allow-shell (cumulative). In interactive mode every call is confirmed.
- Workspace confinement. File tools are restricted to the workspace root (--workspace, default cwd); ../ and symlink escapes are rejected.
- Constitution. A concise default system prompt plus an optional project-specific AGENTS.md.
- Engine: test-infrastructure hardening. The end-to-end regression and per-shader correctness suites run again and are protected against renewed drift by a compile guard. No decode/behavior change.
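The ../ half of workspace confinement can be sketched as a purely lexical check. This is a hypothetical helper, not vf-clide's implementation: it rejects absolute paths outright (an assumption of this sketch) and does not examine symlinks, which need a separate check against the filesystem.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolves `requested` against `root` lexically and returns `None` when
/// the result would leave the workspace. Symlinks are not examined here.
fn confine(root: &Path, requested: &Path) -> Option<PathBuf> {
    if requested.is_absolute() {
        return None; // assumption: only workspace-relative paths are accepted
    }
    let mut depth: usize = 0; // how far below the root we currently are
    let mut resolved = root.to_path_buf();
    for part in requested.components() {
        match part {
            Component::CurDir => {}
            Component::ParentDir => {
                if depth == 0 {
                    return None; // ".." would climb above the workspace root
                }
                depth -= 1;
                resolved.pop();
            }
            Component::Normal(name) => {
                depth += 1;
                resolved.push(name);
            }
            Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    Some(resolved)
}
```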
Validation
- vf-clide: its own suite plus live smoke tests across the tool/permission paths @Qwen3-14B-Q4.
- Engine: lib tests plus the reactivated correctness/regression suites.
Known limitations
- shell is not confined: a command can leave the workspace. --allow-shell is the deliberate, loudly named opt-in tier; once set, it applies for the whole session. Use it deliberately.
- No session persistence (to follow).
- search is substring-based (no regex).
- Context ceiling of 16384 on RDNA4/gfx1201; gemma-QAT is tight on VRAM. Default coder = Qwen3-14B-Q4.
- gemma tool calling is validated for simple arguments; code-carrying arguments are to follow.
- Pre-existing: coopmat gemm_q (opt-in, default OFF) yields NaN end-to-end → quarantined; gemma-Q4_K_M (MMQ_ID); Q8_0 coverage gap.