Releases: maeddesg/vulkanforge

v1.0.5 — conflict edges, opt-in frontier, edge-type priors, cross-process determinism

20 Jun 17:36

v1.0.5 — conflict edges, opt-in frontier retrieval, edge-type priors, and cross-process recall determinism

This release extends the opt-in memory subsystem and moves it onto SQLiteGraph 3.3.1. Everything is additive and opt-in: default recall stays byte-identical, the retrieval frontier is off by default, and the new conflict edge never changes ranking. Decode is untouched.

Dependency

  • sqlitegraph bumped to 3.3.1 (maeddesg fork @ 80a3168, from oldnordic 3.2.5). 3.3.1 honors multilayer_deterministic_seed (the HNSW level distributor no longer falls back to from_entropy()) and re-elects the HNSW entry point on delete_vector. --features memory now requires Rust 1.89; the lean default build stays on 1.85.

New (all opt-in / additive)

  • CONTRADICTS edge — a third, symmetric edge flagging two notes as in conflict (/contradict, /uncontradict; POST /memory/contradict · /uncontradict). Awareness only: no suppression, no winner. The conflict shows in --explain (⚠ conflicts with #X + conflict pairs); you resolve it with the existing /supersede. Default recall is byte-identical.
  • Opt-in frontier retrieval (--frontier, default OFF) — reserves a few slots (VF_FRONTIER_SLOTS, default 2) for a top hit's DERIVES_FROM-linked evidence (one hop), pulling a supporting premise up next to it; --explain labels seed vs. frontier picks. Unset → pure top-k, byte-identical to before.
  • Edge-type priors — edge types carry categorical roles (no scalar weights): DERIVES_FROM pulls, CONTRADICTS withholds. A frontier candidate that contradicts a seed is held back (the slot goes to the next clean candidate), shown transparently in --explain. The frontier never amplifies evidence a more relevant hit disputes.
  • Cross-process recall determinism (VF_HNSW_SEED) — the HNSW seed is pinned (honored on SG 3.3.1), so two separate processes that build the same store recall byte-identically. Guarded by a new committed integration test that spawns two processes and diffs their recall (ids + bit-exact scores).
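
The frontier and edge-type-prior rules above can be sketched as a small selection function. This is a minimal illustration under stated assumptions, not the engine's code: `recall_with_frontier`, `Pick`, and the map/set shapes are hypothetical names, and the sketch appends frontier picks after the seeds instead of placing each premise next to its seed.

```rust
use std::collections::{HashMap, HashSet};

/// How a note ended up in the result (mirrors the seed vs. frontier labels).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Pick {
    Seed,
    Frontier,
}

/// Hypothetical helper. `ranked` holds note ids in descending score order.
/// `derives_from` maps a note to its one-hop premises; `contradicts` stores
/// each symmetric conflict pair once, as (smaller id, larger id).
pub fn recall_with_frontier(
    ranked: &[u32],
    derives_from: &HashMap<u32, Vec<u32>>,
    contradicts: &HashSet<(u32, u32)>,
    k: usize,
    frontier_slots: usize,
) -> Vec<(u32, Pick)> {
    let slots = frontier_slots.min(k);
    // Seeds are the plain top hits; the remaining slots are reserved.
    let seeds: Vec<u32> = ranked.iter().copied().take(k - slots).collect();
    let mut out: Vec<(u32, Pick)> = seeds.iter().map(|&id| (id, Pick::Seed)).collect();
    let mut taken: HashSet<u32> = seeds.iter().copied().collect();
    let conflicts = |a: u32, b: u32| contradicts.contains(&(a.min(b), a.max(b)));

    let mut used = 0;
    'outer: for &seed in &seeds {
        for &premise in derives_from.get(&seed).into_iter().flatten() {
            if used == slots {
                break 'outer;
            }
            if taken.contains(&premise) {
                continue;
            }
            // CONTRADICTS withholds: the slot goes to the next clean candidate.
            if seeds.iter().any(|&s| conflicts(premise, s)) {
                continue;
            }
            taken.insert(premise);
            out.push((premise, Pick::Frontier));
            used += 1;
        }
    }
    // Unused reserved slots fall back to the next-best ranked notes, so with
    // no edges the result is the plain top-k in rank order.
    for &id in ranked {
        if out.len() >= k {
            break;
        }
        if taken.insert(id) {
            out.push((id, Pick::Seed));
        }
    }
    out
}
```

With `frontier_slots = 0`, or with no DERIVES_FROM edges at all, the function returns the plain top-k, which is the property the release notes describe as byte-identical default recall.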

Validation

cargo build --features memory: 0 new warnings; lib tests 306 (lean) / 308 (memory); tests/memory.rs 29 (+1 ignored subprocess probe); vf-clide (0.3.4) 115; cargo tree carries no SQLiteGraph/fastembed/ort/rusqlite in the lean build or the client; recall byte-identical with and without edges (tested). Engine 1.0.5, vf-clide 0.3.4.

v1.0.4 — KV prefix-reuse on by default, recall diagnostics, note typing, and memory edges

18 Jun 17:54

KV prefix-reuse on by default, recall diagnostics, note typing, and memory edges. Recall stays byte-identical when no edges exist and no opt-ins are active. Memory is opt-in (--features memory + serve --memory); without it the inference path is unchanged.

  • KV prefix reuse is now on by default (VF_KV_PREFIX_REUSE=0 to disable) — removes the within-turn double-prefill on memory-augmented turns. The reused KV is logit byte-identical to a fresh prefill (standing gate tests/kv_reuse_ident.rs, F16 + FP8-KV paths). Measured (Qwen3-8B, warm steady-state median, isolated GPU with no competing load): the redundant within-turn re-prefill of the shared ~1.5k-token prefix (~460 ms) is eliminated.
  • recall --explain — diagnostic view: returned hits, near-misses, score separation, and the cut reason per near-miss (superseded / type / threshold / top-k).
  • Relevance threshold — opt-in via VF_RECALL_MARGIN, off by default (adaptive, relative-to-top).
  • Note typing: --type on remember, /retype, and a --type filter on recall (invariant/working/episodic/decision/failure, default untyped).
  • SUPERSEDES edges: /supersede / /unsupersede; superseded notes are suppressed from recall by default (--include-superseded to show), chains resolve to the current head, and recall backfills to k after suppression. Notes are suppressed, never deleted.
  • DERIVES_FROM edges + /why — explicit derivation links and a why-graph trace (cycle-guarded, depth-capped); never alters recall results.
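
The SUPERSEDES behavior above (suppress, resolve chains to the head, backfill to k) can be sketched as follows. This is an illustration only: `current_head`, `recall_suppressed`, and the `superseded_by` map are hypothetical names, and the real store keeps these links as graph edges.

```rust
use std::collections::{HashMap, HashSet};

/// Follows SUPERSEDES links from `id` to the newest note in its chain.
/// `superseded_by` maps an old note to the note that replaced it
/// (a hypothetical in-memory stand-in for the stored edges).
pub fn current_head(id: u32, superseded_by: &HashMap<u32, u32>) -> u32 {
    let mut seen = HashSet::new();
    let mut cur = id;
    while let Some(&next) = superseded_by.get(&cur) {
        // Cycle guard: stop if this note was already visited.
        if !seen.insert(cur) {
            break;
        }
        cur = next;
    }
    cur
}

/// Returns up to `k` ids from `ranked` (descending score order).
/// Superseded notes are filtered out before the cut, so lower-ranked
/// notes backfill the freed positions and the result still reaches k.
pub fn recall_suppressed(
    ranked: &[u32],
    superseded_by: &HashMap<u32, u32>,
    k: usize,
    include_superseded: bool,
) -> Vec<u32> {
    ranked
        .iter()
        .copied()
        .filter(|id| include_superseded || !superseded_by.contains_key(id))
        .take(k)
        .collect()
}
```

Filtering before `take(k)` is what gives the backfill: nothing is deleted, the suppressed note is simply skipped and the next candidate moves up.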

Engine 1.0.3 → 1.0.4, vf-clide 0.3.2 → 0.3.3.

v1.0.3 — agent-side curation, un-archive, 404 for missing notes

17 Jun 16:48

v1.0.3 — agent-side curation, un-archive, and a 404 for missing notes.

Memory curation grows up: the agent can now curate, archives are reversible, and the curation API returns the right HTTP status. No inference-path change — decode is byte-identical, so there's nothing new to benchmark.

  • The agent can archive — safely. It may archive only a note it recalled this session, and only behind an always-on confirmation that shows the note's real stored text (never the model's claim) plus a required reason. It's on the memory axis, so even --allow-shell doesn't auto-approve it; headless denies. forget (hard delete) stays user-only.
  • Archiving is reversible — /unarchive <id>. Archive drops the note's vector but keeps the record; unarchive re-embeds the stored text (the embedder is deterministic, so the original vector comes back) and restores it to recall, node-id link intact. Idempotent, and it survives a restart. Like /forget, it's a user action — the agent has no un-archive tool.
  • A missing id is a 404, not a 500. POST /memory/archive · /unarchive · /delete now answer 404 Not Found when the id doesn't exist — honestly distinguishing "your id was wrong" from "the server broke". Real faults still return 500.
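
The 404-versus-500 split can be pictured as a typed error that the HTTP layer maps to a status code. A minimal sketch under assumptions: `CurationError`, `archive`, and `status_code` are hypothetical names, the success status of 200 is assumed, and a `HashMap` of archived flags stands in for the real store.

```rust
use std::collections::HashMap;

/// Hypothetical error type separating "wrong id" from "server fault".
#[derive(Debug, PartialEq, Eq)]
pub enum CurationError {
    NotFound(u64),
    Store(String),
}

/// Marks a note as archived; an unknown id is reported as NotFound.
/// The map value is the note's archived flag (illustrative stand-in).
pub fn archive(notes: &mut HashMap<u64, bool>, id: u64) -> Result<(), CurationError> {
    match notes.get_mut(&id) {
        Some(archived) => {
            *archived = true;
            Ok(())
        }
        None => Err(CurationError::NotFound(id)),
    }
}

/// Maps a curation result to an HTTP status code.
pub fn status_code(result: &Result<(), CurationError>) -> u16 {
    match result {
        Ok(()) => 200, // assumed success status
        Err(CurationError::NotFound(_)) => 404,
        Err(CurationError::Store(_)) => 500,
    }
}
```

Keeping the distinction in the error type means the handler cannot accidentally collapse a caller mistake into a server fault.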

Engine 1.0.2 → 1.0.3; vf-clide 0.3.1 → 0.3.2.

See CHANGELOG.md and the wiki's Memory / Memory Design pages.

v1.0.2 — vf-clide memory access + curation, agent self-state, 1.96 cleanup

16 Jun 17:11

v1.0.2 — agent memory access + self-state, 1.96 cleanup (2026-06-16)

Client-side memory access and an accurate agent self-image. The v1.0/v1.0.1 server-side memory store is now
reachable from vf-clide — by the REPL and the agent loop — and the agent is told its real tools, permissions, and
memory scope instead of guessing them. No inference-path change; decode logits are bit-identical to v1.0.1.

  • Added — agent memory access (opt-in, serve --memory). vf-clide reaches the project-scoped store through
    recall/remember agent tools (offered only when the server reports memory enabled) and the REPL commands
    /project /recall /remember. Project isolation and the CPU/VNNI embedder (no VRAM) are unchanged from v1.0.
    The client stays thin — zero SQLiteGraph/fastembed/ort dependencies.
  • Added — memory curation (user-only). /archive <id> drops a note from recall but keeps the record;
    /forget <id> hard-deletes it; remember de-duplicates near-identical notes instead of storing twice. Curation
    is user-driven only — the agent cannot archive or delete, and is told so.
  • Added — accurate agent self-state. The agent's system prompt now carries its live tool permissions
    (from the actual gate: allowed / confirm-gated / denied-needs-flag, with shell flagged un-confined), the active
    memory scope, the user-only curation boundary, and a scoped proactive-recall nudge (only for memory-type
    questions, only when the scope already has notes — no over-recall). Presence-aware: a fresh session learns the
    project has memory without any note content being injected.
  • Fixed — recall vs. file-search. recall (project memory) and search (workspace files) now describe
    themselves so they can't be confused — a memory question no longer triggers a file search.
  • Fixed — recall cites the real note id. recall results show each note's true id ([id 7]) instead of the
    enumeration index, so the agent references the correct note.
  • Fixed — self-state permission wording. shell is described as un-confined (a command can touch paths
    outside the workspace); write_file is confirm-gated without --allow-mutating, not silently "allowed".
  • Changed — rust 1.96 lib warnings silenced (114 → 0), with and without --features memory. Warning-only; no
    unsafe block was removed (kept for MSRV portability — removing them would raise the floor to 1.96). Decode
    logits are bit-identical before/after (verified by a greedy VF_LOGIT_DUMP diff).
  • Notes. For the memory design and philosophy (tool-driven, visible, not auto-injected; curation user-only), see
    the wiki's Memory Design page.

Validation. Engine unchanged → cargo build --release 0 warnings, lib 306/306, correctness 83/83,
regression 26 + 1 ignored; --features memory tests/memory.rs 10/10. vf-clide 92/92; cargo tree carries
no SQLiteGraph/fastembed/ort. Agent behavior live-verified end-to-end (cross-session recall, permission awareness,
user-only forget, no over-recall) on Qwen3-8B @ serve --memory.

Versions. Engine 1.0.1 → 1.0.2. vf-clide gains the memory client + self-state (REPL commands, agent tools,
curation, accurate self-state) — version set by mg at release. Lean default still builds on Rust 1.85+;
--features memory needs Rust 1.89+.

v1.0.1 — memory is now opt-in

15 Jun 07:35

⚠️ Behavior change: memory is now opt-in (default off)

The v1.0 server-side memory subsystem is now optional and off by default, gated twice. Inference is unchanged — this release only puts memory behind opt-in gates so the standard build is lean again.

  • Build it in: cargo build --release --features memory (the default cargo build --release stays lean and pulls in neither SQLiteGraph nor the ONNX embedder).
  • Turn it on: vulkanforge serve --model … --memory (or VULKANFORGE_MEMORY=1).
  • Off by default: without --memory, /memory/* returns 503 and the server runs inference only — no embedder load, no database opened, so an inference-only run carries zero memory overhead.
  • Clear errors: passing --memory to a lean binary fails fast with a rebuild with --features memory message, before the model loads.
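
The double gate above (compile-time feature, then run-time flag) reduces to a small decision table. The sketch below is an assumption-laden illustration: `memory_enabled` and `memory_route_status` are hypothetical names, and treating only the exact value "1" of VULKANFORGE_MEMORY as "on" is a guess. In a real crate the first argument would typically come from `cfg!(feature = "memory")`.

```rust
/// Decides whether the memory subsystem starts.
/// `built_with_memory`: whether the binary was compiled with the feature.
/// `flag`: whether --memory was passed.
/// `env`: the value of VULKANFORGE_MEMORY, if set (only "1" counts here,
/// which is an assumption of this sketch).
pub fn memory_enabled(
    built_with_memory: bool,
    flag: bool,
    env: Option<&str>,
) -> Result<bool, String> {
    let requested = flag || env == Some("1");
    match (requested, built_with_memory) {
        (false, _) => Ok(false),
        (true, true) => Ok(true),
        // Fail fast, before any model loading would happen.
        (true, false) => Err("rebuild with --features memory".to_string()),
    }
}

/// Status for a /memory/* request when memory is off; None means the
/// request proceeds to the real handler.
pub fn memory_route_status(enabled: bool) -> Option<u16> {
    if enabled {
        None
    } else {
        Some(503)
    }
}
```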

Cost & toolchain

  • Lean default ~25 MB again; --features memory adds ~34 MB (static ONNX Runtime + bundled SQLite) → ~58 MB.
  • rustc floor: the lean build still works on Rust 1.85+; --features memory needs Rust 1.89+ (the edition-2024 sqlitegraph declares rust-version = 1.89; ort declares 1.88).

Docs

Full opt-in documentation sweep — see the wiki's Memory page (what it is, what it isn't, enabling, how it works), plus updated Installation / Usage / Configuration / Hardware & Compatibility / Troubleshooting.

Validation

Lean default build (the shipping default): lib 306/306, correctness 83/83, regression 26 (+1 ignored); cargo tree carries none of sqlitegraph/fastembed/ort/rusqlite. --features memory: memory integration tests 6/6. No inference-path change.

Versions: engine 1.0.0 → 1.0.1; vf-clide unchanged at 0.3.1. Built/verified on Mesa 26.1.2-arch2.1 (RX 9070 XT).

v1.0 — server-side memory

14 Jun 18:09

Major release. VulkanForge gains a server-side memory — a persistent, project-scoped, semantic store embedded in the vulkanforge serve process. Write notes on purpose, read them back by meaning; the record survives server restarts and model swaps. Supported-config inference output is unchanged — this release adds a subsystem and does not touch the decode/prefill path.

What it does

  • MemoryStore, embedded in the API process. SQLiteGraph (3.2.5, GPL-3.0) holds nodes + edges + per-project HNSW vector indexes in one SQLite file; a CPU embedder (fastembed 5.16.2, ONNX Runtime) runs Nomic-Embed v1.5-Q (768-dim, INT8 → AVX-512/VNNI). The memory path runs off the async runtime and never takes the GPU concurrency permit — a recall never waits behind a generation.
  • VF-native /memory/* endpoints (separate from /v1/*):
    • POST /memory/remember {project_key?, kind, text, name?, metadata?} → {id}
    • POST /memory/recall {project_key?, query, k?} → {hits:[{id, kind, name, text, status, score}]}
    • POST / GET /memory/projects
    • project_key is optional → a shared global scope.
  • Project isolation by construction. Each project gets its own persistent HNSW index (768-dim, cosine, m=16, ef_construction=200); a recall in one project physically cannot return another's notes.
  • Persistent across restarts — vectors restore from the SQLite store with no re-embedding.
  • Local and single-user, all the way down: no cloud, no telemetry; the embeddings are computed on your CPU and the whole store is one SQLite file (default ~/.vulkanforge/memory.db, override VF_MEMORY_DB).
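
Project isolation by construction can be illustrated with one index per project key. The sketch below is not the MemoryStore implementation: it swaps the per-project HNSW index for a brute-force cosine scan, takes vectors directly instead of running an embedder, and uses the hypothetical key "global" for the shared scope.

```rust
use std::collections::HashMap;

pub struct Note {
    pub id: u64,
    pub text: String,
    pub vector: Vec<f32>,
}

/// One note list per project key; a recall only ever sees its own list.
#[derive(Default)]
pub struct MemoryStore {
    projects: HashMap<String, Vec<Note>>,
    next_id: u64,
}

/// Cosine similarity; returns 0.0 if either vector has zero length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

impl MemoryStore {
    /// Stores a note under the project (or the shared scope) and returns its id.
    pub fn remember(&mut self, project_key: Option<&str>, text: &str, vector: Vec<f32>) -> u64 {
        self.next_id += 1;
        let id = self.next_id;
        let key = project_key.unwrap_or("global").to_string();
        self.projects.entry(key).or_default().push(Note {
            id,
            text: text.to_string(),
            vector,
        });
        id
    }

    /// Returns up to `k` (id, score) pairs from this project only,
    /// best score first, ties broken by ascending id.
    pub fn recall(&self, project_key: Option<&str>, query: &[f32], k: usize) -> Vec<(u64, f32)> {
        let mut hits: Vec<(u64, f32)> = self
            .projects
            .get(project_key.unwrap_or("global"))
            .map(|notes| {
                notes
                    .iter()
                    .map(|n| (n.id, cosine(&n.vector, query)))
                    .collect::<Vec<_>>()
            })
            .unwrap_or_default();
        hits.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
        hits.truncate(k);
        hits
    }
}
```

Because the lookup starts from the project key, there is no filter step that could leak: notes from another project are never candidates in the first place.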

What it is not (yet)

This release writes and reads — that is deliberately the whole of it. Not yet here: lifecycle transitions (draft→confirmed→…→archived), delete/archive, a richer edge taxonomy, auto-injection, and the vf-clide client integration (REPL /project / /recall, agent memory-tools). Those are the next milestone. See the wiki's Memory page for what it is, what it isn't, and the roadmap.

Cost (honest)

The two native deps add real surface: the release binary grows ~25 MB → ~59 MB (statically linked ONNX Runtime + bundled SQLite), the lockfile ~250 → ~384 packages, and a first clean build takes a few extra minutes. The first server start downloads the Nomic ONNX model from HuggingFace into ~/.vulkanforge/embed-cache (then runs offline).


Versions: engine 0.9.2 → 1.0.0; vf-clide unchanged at 0.3.1. Validated on AMD RX 9070 XT (RADV/gfx1201), Mesa 26.1.2.

v0.9.4 — vf-clide REPL permission ceiling + denial wording

14 Jun 13:53

vf-clide UX release (0.3.1). No engine change (engine stays at 0.9.2). Two changes to the agent's permission UX.

REPL honors the permission ceiling

In the interactive --agent REPL, a tool call at or below the active ceiling (--yes → ReadOnly,
--allow-mutating → Mutating, --allow-shell → Exec, cumulative) is now auto-approved — and still printed,
so you see every tool that ran — and only a call above the ceiling prompts y/N.

Previously the REPL prompted for every call and the flags only took effect headless. So --agent --yes now
stops asking about reads, --allow-mutating stops asking about writes, and so on, while anything above the
ceiling still asks. This is consistent with headless, not laxer: workspace confinement still bounds the file
tools independently, and shell is still only auto-approved with --allow-shell.

Headless -p is unchanged — a call above the ceiling is denied (not prompted), byte-for-byte as before.
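
The ceiling rule reads naturally as an ordered enum plus one comparison. A minimal sketch, assuming "cumulative" means a higher flag also covers the lower tiers; `Tier`, `Decision`, `ceiling`, and `decide` are illustrative names, and the memory-axis exception from later releases is not modeled.

```rust
/// Risk tiers in ascending order; the derived ordering follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tier {
    ReadOnly,
    Mutating,
    Exec,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    AutoApprove,
    Prompt,
    Deny,
}

/// Highest tier that is auto-approved, or None when no flag is set.
pub fn ceiling(yes: bool, allow_mutating: bool, allow_shell: bool) -> Option<Tier> {
    if allow_shell {
        Some(Tier::Exec)
    } else if allow_mutating {
        Some(Tier::Mutating)
    } else if yes {
        Some(Tier::ReadOnly)
    } else {
        None
    }
}

/// At or below the ceiling: auto-approve. Above it: prompt in the REPL,
/// deny when headless.
pub fn decide(tool: Tier, ceiling: Option<Tier>, interactive: bool) -> Decision {
    match ceiling {
        Some(c) if tool <= c => Decision::AutoApprove,
        _ if interactive => Decision::Prompt,
        _ => Decision::Deny,
    }
}
```

For example, `decide(Tier::Mutating, ceiling(true, false, false), true)` yields `Decision::Prompt`, while the same call with `interactive = false` yields `Decision::Deny`.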

Denial wording in the constitution

The built-in agent system prompt now distinguishes the two kinds of denial so the model stops claiming it needs
"elevated permissions" or that a target is "system-critical":

  • a permission denial (a tool above the current ceiling) is lifted only by re-running with
    --allow-mutating / --allow-shell — never OS or filesystem permissions;
  • a workspace-confinement denial (a path outside the workspace) is absolute — no flag overrides it.

Versions: engine 0.9.2 (unchanged), vf-clide 0.3.0 → 0.3.1. Validated on AMD RX 9070 XT (RADV/gfx1201),
Mesa 26.1.2.

v0.9.2 — vf-clide token meter + clean server shutdown

14 Jun 12:05

Feature + bugfix release. vf-clide gains live token accounting and a pinned status line; the engine's vulkanforge serve now shuts down cleanly. Supported-config inference output is unchanged — no decode/prefill/behavior change.

vf-clide 0.3.0 — token meter + pinned status line (feature)

  • Token accounting. The client surfaces real token usage on every path: the non-streaming response, the tool-calling loop, and the streaming path (via stream_options.include_usage, which the server emits as a final usage chunk). No local tokenizer, no estimation — the numbers are the server's own counts.
  • Pinned status line. The REPL pins a bottom status line (raw ANSI scroll region, no TUI framework) with a token meter — ↑prompt ↓completion (total) · session … — and the current action (idle / generating… / thinking… / running <tool>(…)). It is a no-op when stdout isn't a TTY, so headless -p output stays byte-for-byte unchanged and fully scriptable.

Engine 0.9.2 — clean serve shutdown (bugfix)

Ctrl+C / SIGTERM on vulkanforge serve previously left the GPU objects undestroyed (the validation layer reported hundreds of leaked objects) and then freed memory against an already-destroyed device → SIGSEGV. The shutdown path now:

  1. waits for the device to go idle (device_wait_idle),
  2. runs the explicit resource-teardown chain in order while the device is still alive, and
  3. drops the memory allocator before the device.

Result: 0 leaked objects, clean exit, no crash on both Ctrl+C and SIGTERM. Shutdown-path only — steady-state decode is untouched.
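
The ordering constraint in the three steps above can be demonstrated without any GPU. In Rust, struct fields are dropped in declaration order, so a teardown order can be encoded in the struct layout. The sketch below uses stand-in types that only log their drop; `Gpu`, `Tracked`, and the log are hypothetical, and this is not the engine's actual shutdown code.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Stand-in for a GPU-side object; records its name when dropped.
struct Tracked {
    name: &'static str,
    log: Log,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Fields are dropped in declaration order, so this layout encodes
/// "resources, then allocator, then device".
#[allow(dead_code)]
pub struct Gpu {
    resources: Tracked,
    allocator: Tracked,
    device: Tracked,
    log: Log,
}

impl Gpu {
    pub fn new(log: Log) -> Self {
        Gpu {
            resources: Tracked { name: "resources", log: log.clone() },
            allocator: Tracked { name: "allocator", log: log.clone() },
            device: Tracked { name: "device", log: log.clone() },
            log,
        }
    }

    /// Waits for idle (simulated by a log entry), then lets `self` drop,
    /// which tears the fields down in declaration order.
    pub fn shutdown(self) {
        self.log.borrow_mut().push("device_wait_idle");
    }
}
```

Taking `self` by value in `shutdown` makes the teardown happen at a known point, after the idle wait and before the function returns.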


Versions: engine 0.9.0 → 0.9.2, vf-clide 0.2.1 → 0.3.0. (v0.9.1 was a vf-clide-only search-confinement security patch; the engine stayed at 0.9.0 through it.) Validated on AMD RX 9070 XT (RADV/gfx1201), Mesa 26.1.2.

v0.9.1 — search symlink-confinement fix (security)

13 Jun 17:13

v0.9.1 — Security: vf-clide search no longer follows symlinks out of the workspace

Security fix — update recommended.

vf-clide's agent search tool recursively walked the workspace using Path::is_dir/is_file,
which follow symlinks. A symlink inside the workspace pointing outside it (e.g. escape → /etc)
was treated as a directory and recursed into, so search could read files outside the workspace
root, reachable with only --yes (read-only auto-approval). The single-path tools
(read_file/write_file) and the search start path were already confined; only the recursive
walk was not.

Fix

search's recursive walk now checks each entry's own type via symlink_metadata (which does not
follow the final component) and skips symlinks entirely — they are neither recursed into nor read.
This closes the confinement hole and also prevents symlink cycles. read_file / write_file / shell
are unchanged.
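
A minimal sketch of the walk described above, using only the standard library. `walk_files` is a hypothetical name and this is not vf-clide's code; it shows the technique of classifying each entry with `symlink_metadata` and skipping symlinks outright.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects regular files under `dir`.
/// Each entry is classified by its own type via `symlink_metadata`, which
/// does not follow a symlink in the final path component, so symlinks are
/// neither recursed into nor returned.
pub fn walk_files(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let file_type = fs::symlink_metadata(&path)?.file_type();
        if file_type.is_symlink() {
            // Skipped entirely: closes the escape and avoids symlink cycles.
            continue;
        } else if file_type.is_dir() {
            walk_files(&path, out)?;
        } else if file_type.is_file() {
            out.push(path);
        }
    }
    Ok(())
}
```

By contrast, `Path::is_dir` and `Path::is_file` resolve the link first, which is why a link such as escape → /etc looked like an ordinary directory to the old walk.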

Scope

  • vf-clide 0.2.0 → 0.2.1. Engine unchanged (0.9.0) — no engine/decode/behavior change.
  • A regression test (search_does_not_follow_escaping_symlink) pins the fix; vf-clide unit 60/60.
  • Verified live @Qwen3-14B-Q4: in a workspace with escape → /etc, search returns only the
    in-workspace files, never /etc/....

If you run the --agent mode with untrusted workspaces, update.

v0.9.0 — Agentic vf-clide

13 Jun 15:57

vf-clide grows from a chat client into an agentic coding client; the engine gets a test-infrastructure hardening.

Highlights

  • vf-clide can now code agentically. In --agent mode the client drives tools through VF's OpenAI API in
    a loop: read_file, write_file, search, shell.
  • Three-tier permission model. Tools are tiered by risk: ReadOnly (read_file/search),
    Mutating (write_file), Exec (shell). Auto-approval rises opt-in via --yes → --allow-mutating →
    --allow-shell (cumulative). In interactive mode every call is confirmed.
  • Workspace confinement. File tools are restricted to the workspace root (--workspace, default cwd);
    ../ and symlink escapes are rejected.
  • Constitution. A terse default system prompt plus an optional project-specific AGENTS.md.
  • Engine: test-infrastructure hardening. The end-to-end regression and per-shader correctness suites run again
    and are protected against renewed drift by a compile guard. No decode or behavior change.

Validation

  • vf-clide: its own suite + live smoke tests across the tool/permission paths @Qwen3-14B-Q4.
  • Engine: lib tests + reactivated correctness/regression suites.

Known limitations

  • shell is not confined: a command can leave the workspace. --allow-shell is the deliberate,
    loudly named opt-in tier; once set, it applies for the whole session. Use it deliberately.
  • No session persistence (to follow).
  • search is substring-based (no regex).
  • Context ceiling 16384 on RDNA4/gfx1201; gemma-QAT is tight on VRAM. Default coder = Qwen3-14B-Q4.
  • gemma tool calling is validated for simple arguments; code-carrying arguments are to follow.
  • Pre-existing: coopmat-gemm_q (opt-in, default OFF) produces NaN end-to-end → quarantined; gemma-Q4_K_M
    (MMQ_ID); Q8_0 coverage gap.