From ad9d72171447c70ff7be4b34fd2792df30f545dc Mon Sep 17 00:00:00 2001
From: Wagner Bruna
Date: Sat, 20 Sep 2025 08:05:53 -0300
Subject: [PATCH] docs: include Vulkan compatibility for LoRA quants

Based on the GGML_OP_GET_ROWS code at:
https://github.com/ggml-org/ggml/blob/5fdc78fff274094e2a1b155928131983362d8a71/src/ggml-vulkan/ggml-vulkan.cpp#L11940
---
 docs/lora.md | 44 +++++++++++++++++++++++++++-----------------
 1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/docs/lora.md b/docs/lora.md
index e2e1d82e9..9885ae549 100644
--- a/docs/lora.md
+++ b/docs/lora.md
@@ -20,20 +20,30 @@ Here's a simple example:
 
 NOTE: The other backends may have different support.
 
-| Quant / Type | CUDA |
-|--------------|------|
-| F32          | ✔️   |
-| F16          | ✔️   |
-| BF16         | ✔️   |
-| I32          | ✔️   |
-| Q4_0         | ✔️   |
-| Q4_1         | ✔️   |
-| Q5_0         | ✔️   |
-| Q5_1         | ✔️   |
-| Q8_0         | ✔️   |
-| Q2_K         | ❌   |
-| Q3_K         | ❌   |
-| Q4_K         | ❌   |
-| Q5_K         | ❌   |
-| Q6_K         | ❌   |
-| Q8_K         | ❌   |
+| Quant / Type | CUDA | Vulkan |
+|--------------|------|--------|
+| F32          | ✔️   | ✔️     |
+| F16          | ✔️   | ✔️     |
+| BF16         | ✔️   | ✔️     |
+| I32          | ✔️   | ❌     |
+| Q4_0         | ✔️   | ✔️     |
+| Q4_1         | ✔️   | ✔️     |
+| Q5_0         | ✔️   | ✔️     |
+| Q5_1         | ✔️   | ✔️     |
+| Q8_0         | ✔️   | ✔️     |
+| Q2_K         | ❌   | ❌     |
+| Q3_K         | ❌   | ❌     |
+| Q4_K         | ❌   | ❌     |
+| Q5_K         | ❌   | ❌     |
+| Q6_K         | ❌   | ❌     |
+| Q8_K         | ❌   | ❌     |
+| IQ1_S        | ❌   | ✔️     |
+| IQ1_M        | ❌   | ✔️     |
+| IQ2_XXS      | ❌   | ✔️     |
+| IQ2_XS       | ❌   | ✔️     |
+| IQ2_S        | ❌   | ✔️     |
+| IQ3_XXS      | ❌   | ✔️     |
+| IQ3_S        | ❌   | ✔️     |
+| IQ4_XS       | ❌   | ✔️     |
+| IQ4_NL       | ❌   | ✔️     |
+| MXFP4        | ❌   | ✔️     |