v0.9.2 — vf-clide token meter + clean server shutdown

@maeddesg maeddesg released this 14 Jun 12:05
· 19 commits to main since this release

Feature + bugfix release. vf-clide gains live token accounting and a pinned status line; the engine's vulkanforge serve now shuts down cleanly. Inference output on supported configurations is unchanged: no decode, prefill, or behavior changes.

vf-clide 0.3.0 — token meter + pinned status line (feature)

  • Token accounting. The client surfaces real token usage on every path: the non-streaming response, the tool-calling loop, and the streaming path (via stream_options.include_usage, which the server emits as a final usage chunk). No local tokenizer, no estimation — the numbers are the server's own counts.
  • Pinned status line. The REPL pins a bottom status line (raw ANSI scroll region, no TUI framework) with a token meter — ↑prompt ↓completion (total) · session … — and the current action (idle / generating… / thinking… / running <tool>(…)). It is a no-op when stdout isn't a TTY, so headless -p output stays byte-for-byte unchanged and fully scriptable.
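A minimal sketch of how such a status line could be pinned with a raw ANSI scroll region. This is not vf-clide's actual code: the `Usage` struct, the function names, and the choice to pass the terminal height in as `rows` are all assumptions made for illustration. The usage numbers are assumed to have been read from the server's response already, and the session part of the meter is left out because the notes elide its format.

```rust
use std::io::IsTerminal;

/// Token counts as reported by the server (hypothetical struct, nothing is estimated locally).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Usage {
    pub prompt: u64,
    pub completion: u64,
    pub total: u64,
}

/// Format the per-request part of the meter: ↑prompt ↓completion (total).
pub fn render_meter(u: Usage) -> String {
    format!("↑{} ↓{} ({})", u.prompt, u.completion, u.total)
}

/// True when stdout is an interactive terminal (std, Rust 1.70+).
pub fn stdout_is_tty() -> bool {
    std::io::stdout().is_terminal()
}

/// Escape sequence that limits scrolling to rows 1..rows-1, leaving the last row free.
/// Returns an empty string when not on a TTY, so piped output is left untouched.
pub fn reserve_bottom_row(rows: u16, is_tty: bool) -> String {
    if !is_tty || rows < 2 {
        return String::new();
    }
    // ESC 7 / ESC 8 save and restore the cursor, because setting the
    // scroll region (ESC [ top ; bottom r) moves the cursor to the home position.
    format!("\x1b7\x1b[1;{}r\x1b8", rows - 1)
}

/// Escape sequence that redraws the pinned row without disturbing the cursor.
pub fn draw_status(rows: u16, text: &str, is_tty: bool) -> String {
    if !is_tty || rows < 2 {
        return String::new();
    }
    // Save cursor, jump to the last row, erase it, write the text, restore cursor.
    format!("\x1b7\x1b[{};1H\x1b[2K{}\x1b8", rows, text)
}
```

A caller would write `reserve_bottom_row(rows, stdout_is_tty())` once at startup and `draw_status(...)` whenever the meter or the current action changes. Because both functions return an empty string off a TTY, the headless path emits no extra bytes.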

Engine 0.9.2 — clean serve shutdown (bugfix)

Ctrl+C / SIGTERM on vulkanforge serve previously left the GPU objects undestroyed (the validation layer reported hundreds of leaked objects) and then freed memory against an already-destroyed device, which ended in a SIGSEGV. The shutdown path now:

  1. waits for the device to go idle (device_wait_idle),
  2. runs the explicit resource-teardown chain in order while the device is still alive, and
  3. drops the memory allocator before the device.

Result: 0 leaked objects, clean exit, and no crash on either Ctrl+C or SIGTERM. Shutdown-path only; steady-state decode is untouched.
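The ordering described above can be illustrated with a small mock. This is a sketch under stated assumptions, not the engine's code: `Device`, `Allocator`, `Engine`, and the resource names are hypothetical stand-ins that only record what happens, with no real GPU calls. It relies on one fact about Rust: struct fields are dropped in declaration order, so declaring the allocator before the device makes the allocator go first.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared event log so the teardown order can be inspected afterwards.
type Log = Rc<RefCell<Vec<String>>>;

/// Mock device (hypothetical); a real one would wrap a GPU device handle.
struct Device {
    log: Log,
}

impl Device {
    fn wait_idle(&self) {
        self.log.borrow_mut().push("device_wait_idle".to_string());
    }
}

impl Drop for Device {
    fn drop(&mut self) {
        self.log.borrow_mut().push("destroy device".to_string());
    }
}

/// Mock memory allocator (hypothetical); it must not outlive the device.
struct Allocator {
    log: Log,
}

impl Drop for Allocator {
    fn drop(&mut self) {
        self.log.borrow_mut().push("drop allocator".to_string());
    }
}

struct Engine {
    // Field order matters: fields drop in declaration order,
    // so the allocator is released before the device.
    allocator: Allocator,
    device: Device,
    resources: Vec<&'static str>, // hypothetical resource names, in creation order
    log: Log,
}

impl Engine {
    fn new(log: Log) -> Self {
        Engine {
            allocator: Allocator { log: log.clone() },
            device: Device { log: log.clone() },
            resources: vec!["buffers", "pipelines"],
            log,
        }
    }

    /// Consumes the engine and tears it down in the three steps listed above.
    fn shutdown(mut self) {
        // 1. Wait until the device has finished all submitted work.
        self.device.wait_idle();
        // 2. Explicit teardown while the device is still alive (reverse creation order).
        for name in self.resources.drain(..).rev() {
            self.log.borrow_mut().push(format!("destroy {name}"));
        }
        // 3. `self` is dropped here: allocator first, then device.
    }
}

/// Runs one shutdown and returns the recorded order of events.
pub fn shutdown_order() -> Vec<String> {
    let log = Log::default();
    Engine::new(log.clone()).shutdown();
    let events = log.borrow().clone();
    events
}
```

Calling `shutdown_order()` yields the wait, then the resource teardown, then the allocator, then the device. Swapping the `allocator` and `device` fields in the mock would reverse the last two events, which is the use-after-destroy ordering the fix removes.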


Versions: engine 0.9.0 → 0.9.2, vf-clide 0.2.1 → 0.3.0. (v0.9.1 was a vf-clide-only search-confinement security patch; the engine stayed at 0.9.0 through it.) Validated on AMD RX 9070 XT (RADV/gfx1201), Mesa 26.1.2.