History / Choosing a Model for Coding

Revisions

  • docs(wiki): v0.8.0 — auto ctx-size, Gemma-4 tool-calling, vf-clide; KV-FP8 now required
    - New page: vf-clide (the standalone CLI chat client) + Sidebar/Home/Usage/Installation links.
    - Automatic context sizing (v0.8.0): new Configuration section, Usage/Home/Troubleshooting notes, the 16384 RDNA4 LDS ceiling (explicit --ctx-size above it aborts, not clamped).
    - Gemma-4 native tool/function calling documented (Usage, Supported-Models).
    - KV-FP8 corrected from "recommended" to REQUIRED for the Gemma-4-26B-A4B MoE across Supported-Models / Configuration / Usage / Troubleshooting / Architecture / Choosing-a-Model: the non-FP8 KV path is known-broken and the engine fail-loud aborts without VULKANFORGE_KV_FP8=1 (debug override VULKANFORGE_ALLOW_BROKEN_KV=1).
    - Home/Configuration now reference the shipped v0.8.0 (perf matrix provenance stays v0.7.0).

    @maeddesg maeddesg committed Jun 12, 2026
  • docs(wiki): add "Choosing a Model for Coding" comparison page (neutral, no default)
    Side-by-side of the three coding-capable models (Gemma-4-26B-A4B Q3_K_M, Gemma-4-26B-A4B QAT/Q4_0, Qwen3.6-27B Q3_K_S) by quality / speed / context, from VulkanForge's own coding tests on the 16 GB RX 9070 XT. No single "best" — user picks by priority. Honest caveats (small sample; not a quant-controlled comparison; 16 GB-specific). Linked from _Sidebar, Supported-Models, and Benchmarks; softened the stale Q3_K_M "default" note.
    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

    @maeddesg maeddesg committed Jun 10, 2026