maeddesg edited this page Jun 13, 2026 · 4 revisions

Installation

VulkanForge is built from source. The build produces a single static binary.

Prerequisites

  • GPU: AMD RDNA 4 / gfx1201 (Radeon RX 9070 XT). See Hardware and Compatibility.
  • Driver: Mesa RADV ≥ 26.1 recommended (native FP8 WMMA via shaderFloat8CooperativeMatrix). Mesa 26.0.6 also works (GGUF + FP8 via the BF16 conversion path, no native FP8 WMMA). A Vulkan 1.4 loader and headers are also required.
  • Toolchain: Rust 1.85+ (edition 2024), Vulkan headers.
  • OS: a current Linux with RADV (Arch / CachyOS and similar).
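The prerequisites above can be checked before building. The following is a minimal sketch, not part of VulkanForge: the helper name version_ge is illustrative, the version comparison assumes GNU sort (for the -V flag), and the script assumes vulkaninfo is installed (it usually ships in a vulkan-tools package).

```shell
# Sketch: check toolchain and Vulkan prerequisites before building.
# The 1.85 threshold comes from the prerequisites list; version_ge is a
# hypothetical helper defined here, not a VulkanForge command.

# version_ge A B: succeed when version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v rustc >/dev/null 2>&1; then
    # "rustc --version" prints e.g. "rustc 1.85.0 (hash date)"; field 2 is the version.
    rust_version="$(rustc --version | awk '{print $2}')"
    if version_ge "$rust_version" "1.85"; then
        echo "rustc $rust_version: ok"
    else
        echo "rustc $rust_version: too old, need 1.85+"
    fi
else
    echo "rustc not found"
fi

if command -v vulkaninfo >/dev/null 2>&1; then
    # --summary prints a short block per device, including driver name and version.
    vulkaninfo --summary 2>/dev/null \
        | grep -E 'deviceName|driverName|driverInfo|apiVersion' \
        || echo "no Vulkan device reported"
else
    echo "vulkaninfo not found"
fi
```

The driverInfo line normally carries the Mesa version, which can be compared by eye against the 26.1 recommendation.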

Build

git clone https://github.com/maeddesg/vulkanforge.git
cd vulkanforge
cargo build --release   # Rust 1.85+, Vulkan headers required

The release binary is at target/release/vulkanforge.

The CLI chat and agentic coding client, vf-clide, is a separate crate with no engine dependencies. Build it on demand:

cargo build --release --manifest-path vf-clide/Cargo.toml   # → ./vf-clide/target/release/vf-clide

Kernel parameter (14B+ models)

For 14B+ models, raise the amdgpu compute timeout on the kernel command line (values are in milliseconds). The default of 2 s is too short for long prefill submits and will trigger a GPU reset:

amdgpu.lockup_timeout=10000,10000

Add it to your bootloader's kernel command line (for GRUB, the GRUB_CMDLINE_LINUX_DEFAULT variable in /etc/default/grub), regenerate the bootloader config, and reboot.
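After rebooting, it is worth confirming that the parameter actually took effect. The sketch below assumes a GRUB setup with the common /boot/grub/grub.cfg path (distributions differ), and the helper has_lockup_timeout is a hypothetical name used only here.

```shell
# Sketch: confirm the amdgpu.lockup_timeout parameter is active after reboot.

# Typical GRUB flow (run manually as root; paths vary by distribution):
#   1. edit /etc/default/grub and append the parameter inside
#      GRUB_CMDLINE_LINUX_DEFAULT="..."
#   2. grub-mkconfig -o /boot/grub/grub.cfg
#   3. reboot

# has_lockup_timeout CMDLINE: succeed if the given kernel command line string
# sets amdgpu.lockup_timeout (hypothetical helper for this example).
has_lockup_timeout() {
    case " $1 " in
        *" amdgpu.lockup_timeout="*) return 0 ;;
        *) return 1 ;;
    esac
}

if [ -r /proc/cmdline ]; then
    if has_lockup_timeout "$(cat /proc/cmdline)"; then
        echo "lockup_timeout is set on the kernel command line"
    else
        echo "lockup_timeout is not set on the kernel command line"
    fi
fi

# The value the loaded amdgpu module reports, when the parameter file is exposed.
param=/sys/module/amdgpu/parameters/lockup_timeout
if [ -r "$param" ]; then
    echo "amdgpu.lockup_timeout = $(cat "$param")"
fi
```

Checking both places helps separate a bootloader problem (parameter missing from /proc/cmdline) from a driver problem (parameter present but not reflected by the module).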

Verify the driver path

Check whether native FP8 WMMA is available on your driver:

vulkaninfo 2>/dev/null | grep shaderFloat8CooperativeMatrix

If the feature is reported as true, VulkanForge auto-selects the native FP8 path; otherwise it falls back to the BF16 path.
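A plain grep also matches a line that reports the feature as false, so a slightly stricter check can be useful. This sketch assumes vulkaninfo prints features in its usual "name = true/false" form; fp8_path is a hypothetical helper that only mirrors the rule described above and is not VulkanForge's own selection logic.

```shell
# Sketch: predict which FP8 path to expect from vulkaninfo output.

# fp8_path: read vulkaninfo output on stdin and print the expected path.
# Assumes features are printed as "shaderFloat8CooperativeMatrix = true".
fp8_path() {
    if grep -Eq 'shaderFloat8CooperativeMatrix[[:space:]]*=[[:space:]]*true'; then
        echo "native FP8"
    else
        echo "BF16 fallback"
    fi
}

if command -v vulkaninfo >/dev/null 2>&1; then
    echo "expected path: $(vulkaninfo 2>/dev/null | fp8_path)"
fi
```

If the feature line is absent or reports false, the helper prints the BF16 fallback, matching the behaviour described for older Mesa versions.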

Verify the install

Run a quick benchmark on a Q4_K_M GGUF (invoke the binary as target/release/vulkanforge if it is not on your PATH):

vulkanforge bench --model ~/models/Qwen3-8B-Q4_K_M.gguf

This enumerates the GPU, loads the model, and prints decode and prefill performance numbers. For a chat sanity check, see Usage.
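A small pre-flight wrapper can catch a missing binary or model file before the benchmark starts. This is a sketch under assumptions: the VF_BIN and VF_MODEL environment variables and the preflight helper are hypothetical names local to this script, and the only VulkanForge invocation used is the bench command shown above.

```shell
# Sketch: check that the binary and model exist, then run the benchmark.

# preflight BIN MODEL: succeed only if BIN is executable and MODEL is a
# readable regular file (hypothetical helper for this example).
preflight() {
    [ -x "$1" ] || { echo "binary not found or not executable: $1" >&2; return 1; }
    { [ -f "$2" ] && [ -r "$2" ]; } || { echo "model file not readable: $2" >&2; return 1; }
}

# Defaults follow the paths used on this page; override via the
# (hypothetical) VF_BIN / VF_MODEL variables.
BIN="${VF_BIN:-target/release/vulkanforge}"
MODEL="${VF_MODEL:-$HOME/models/Qwen3-8B-Q4_K_M.gguf}"

if preflight "$BIN" "$MODEL"; then
    "$BIN" bench --model "$MODEL"
else
    echo "skipping benchmark"
fi
```

Run it from the repository root so the default relative binary path resolves.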

More driver / environment detail: see docs/INSTALLATION.md.
