Auto-Adaptive Mixed-Precision Quantization (Target BPW) #18531
EAddario started this conversation in Show and tell
PR #15550 introduces an optimization routine that dynamically determines an optimal per-tensor quantization type mix to achieve a user-specified global bits-per-weight (bpw) target (e.g., `--target-bpw 4.5678`). Instead of relying on heuristic presets (like `Q4_K_M`), the function solves a constrained optimization problem: minimize total quantization error subject to a total size budget, by measuring the sensitivity of each layer and dynamically allocating the "bit budget" where it matters most.
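Stated a bit more formally (illustrative notation, not taken from the PR): for each tensor $t$ the routine picks a quantization type $q_t$ from the candidate set, and the allocation problem is roughly

$$
\min_{\{q_t\}} \; \sum_t E_t(q_t) \quad \text{s.t.} \quad \frac{\sum_t n_t \,\mathrm{bpw}(q_t)}{\sum_t n_t} \le \mathrm{bpw}_{\text{target}}
$$

where $E_t(q)$ is the estimated quantization error of tensor $t$ under type $q$ and $n_t$ is its number of weights.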
How it Works
Sensitivity Analysis:
The algorithm samples every tensor in the model, quantizes these samples into various formats (from `IQ1_S` up to `Q8_0`), and compares them against the original FP16/FP32 weights.
Pareto Optimization:
For each tensor, the algorithm builds a "size vs. error" curve and discards inefficient formats (formats that take up more space without offering better accuracy for that specific layer).
Global Resource Allocation (Lagrangian Solver):
The algorithm solves a global optimization problem: "How do we distribute X bits across Y tensors to minimize total error?" Error-sensitive tensors (such as `output.weight` and the attention `v` weights) generally receive higher bit-precision.
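To make steps 2 and 3 concrete, here is a minimal Python sketch of the idea. It is illustrative only: the names (`Candidate`, `pareto_prune`, `allocate`) and the greedy upgrade loop are my own simplification, not the PR's C++ implementation, which describes a Lagrangian-style solver over the same kind of per-tensor size/error candidates.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    fmt: str      # quantization type name, e.g. "IQ2_XS", "Q4_K", "Q8_0"
    nbytes: int   # size of this tensor if stored in fmt
    error: float  # estimated quantization error for this tensor in fmt

def pareto_prune(cands):
    """Keep only candidates that are not dominated
    (no other candidate is both smaller and lower-error)."""
    cands = sorted(cands, key=lambda c: (c.nbytes, c.error))
    kept, best_err = [], float("inf")
    for c in cands:
        if c.error < best_err:          # error must strictly improve as size grows
            kept.append(c)
            best_err = c.error
    return kept

def allocate(tensors, target_bpw):
    """tensors: dict name -> (n_weights, [Candidate, ...]).
    Start every tensor at its smallest candidate, then repeatedly apply the
    upgrade with the best error reduction per extra byte until the global
    byte budget implied by target_bpw is exhausted."""
    frontiers = {n: pareto_prune(c) for n, (_, c) in tensors.items()}
    choice = {n: 0 for n in frontiers}                      # index into frontier
    total_weights = sum(nw for nw, _ in tensors.values())
    budget = target_bpw * total_weights / 8                 # bytes allowed
    used = sum(frontiers[n][0].nbytes for n in frontiers)

    while True:
        best = None
        for n, f in frontiers.items():
            i = choice[n]
            if i + 1 >= len(f):
                continue
            extra = f[i + 1].nbytes - f[i].nbytes
            gain = f[i].error - f[i + 1].error
            if used + extra <= budget and (best is None or gain / extra > best[0]):
                best = (gain / extra, n, extra)
        if best is None:
            break
        _, n, extra = best
        choice[n] += 1
        used += extra
    return {n: frontiers[n][choice[n]].fmt for n in frontiers}
```

The actual routine additionally weights errors by tensor importance (configurable via the `--no-importance` option discussed below) and has to respect ggml's per-format constraints, but the core size/error trade-off it resolves has this shape.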
Advantages
Target arbitrary size models
Data-driven mixed precision can often improve quality at fixed size
Rather than following fixed rules (e.g. bumping `attn_v` to `Q5_K` for a 70B model) that may be sub-optimal for a given architecture or size, the quantization mix is determined by the actual error sensitivity of the specific model's weights. This often yields a better quality/size trade-off, especially in aggressive quantization scenarios (1.5 to 3.5 bpw) or for unusual architectures.
Allows better like-for-like comparisons between models and families
Standard quantization uses hardcoded rules like: "use Q4_K_M, except bump some tensors up/down, except fall back if incompatible, except keep some tensors unquantized..." and for that reason, two different models quantized with the same Q4_K_M type can end up with very different bpw (e.g. 4.75 and 4.30).
All things being equal, the performance of a model is usually proportional to its overall bpw; models with a higher bpw tend to perform better than lower bpw models. Since the higher-bpw model has simply been given more bits, it will typically perform better (lower perplexity, better eval scores, etc.) even if the underlying quantization method is identical. That makes the comparison not a controlled experiment, because it is between models with different effective compression ratios.
`--target-bpw` tries to address that by making the experiment more controlled: each model gets quantized to land on (approximately) the same global byte budget, so that performance differences are more attributable to architecture/training differences, quantization error behaviour at the same compression ratio, the optimizer's allocation decisions, etc.
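As a quick illustration of the arithmetic behind the comparison above (numbers are made up, not from the tests): bpw is roughly the file size in bits divided by the parameter count, so the same preset applied to models with different tensor shapes lands at different ratios.

```python
# Hypothetical numbers, for illustration only.
def effective_bpw(file_size_bytes: int, n_params: int) -> float:
    """Approximate bits-per-weight, ignoring GGUF metadata overhead."""
    return file_size_bytes * 8 / n_params

model_a = effective_bpw(file_size_bytes=4_750_000_000, n_params=8_000_000_000)
model_b = effective_bpw(file_size_bytes=4_300_000_000, n_params=8_000_000_000)
print(f"model A: {model_a:.2f} bpw, model B: {model_b:.2f} bpw")  # ~4.75 vs ~4.30
```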
Disadvantages
Quantization process is significantly slower than standard
This approach can take 5x-10x longer as it quantizes a sample of most tensors into 15 different formats, dequantizes them back to floats, computes error diffs, and selects the best size/error option that fits the global bpw budget.
However, the `--keep-bpw-state` option will save the above-mentioned computations to disk so that future quantizations, in the permissible bpw range for the same model, can be generated at normal speed. It also allows interrupting the computation and resuming it at a later time.
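The expensive part is the per-tensor error table described above; below is a minimal Python sketch of that loop and of caching it to disk. It is illustrative only: `fake_quantize` is a stand-in uniform quantizer (the real code round-trips through ggml's block-wise formats), and the on-disk format of `--keep-bpw-state` is the PR's own, not JSON.

```python
import json
import numpy as np

def fake_quantize(x: np.ndarray, bits: int) -> np.ndarray:
    """Stand-in for a real quantize/dequantize round trip:
    symmetric uniform quantization to 2**bits levels."""
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1) or 1.0
    return np.round(x / scale) * scale

def error_table(tensors: dict[str, np.ndarray], bit_widths=(2, 3, 4, 5, 6, 8)):
    """For each tensor, estimate the error of each candidate width
    on a sample of rows (here: mean squared difference)."""
    table = {}
    for name, w in tensors.items():
        sample = w[:: max(1, w.shape[0] // 32)]        # sample rows, not all of them
        table[name] = {
            bits: float(np.mean((sample - fake_quantize(sample, bits)) ** 2))
            for bits in bit_widths
        }
    return table

def save_state(table, path="bpw_state.json"):
    with open(path, "w") as f:
        json.dump(table, f)

def load_state(path="bpw_state.json"):
    try:
        with open(path) as f:
            return {n: {int(b): e for b, e in t.items()} for n, t in json.load(f).items()}
    except FileNotFoundError:
        return None
```

With a cached table, re-running for a different target only needs the allocation step, which is why subsequent quantizations run at normal speed.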
The optimization target is only a proxy for the model's performance quality
The process minimizes a per-tensor estimated error computed from sampled rows, not actual perplexity or divergence of output distributions (a future version may address this). Since errors interact nonlinearly across layers, there are no guarantees it will select the best possible quantization recipe subject to the bpw size constraint.
Furthermore, the process can operate in two modes: giving priority to important tensors (default) or treating each tensor equally (setting the `--no-importance` option). To my knowledge, there is no computationally feasible way to determine ahead of time which mode will yield better results, and two runs per model may be needed to obtain the best quality, but the default mode usually wins.
An imatrix with activations data is required for best results
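The reason activation data matters is that it lets the error estimate emphasise the weight columns that actually see large activations. A hedged sketch of that kind of weighting (assuming a weighted squared-error proxy; the PR's exact formula may differ, and `act_sq` here stands for the per-column mean squared activations an imatrix records):

```python
import numpy as np

def weighted_error(w: np.ndarray, w_deq: np.ndarray, act_sq: np.ndarray) -> float:
    """Squared reconstruction error between original and dequantized weights,
    weighted per input column by its recorded activation magnitude.
    Shapes: w, w_deq are [rows, cols]; act_sq is [cols]."""
    diff2 = (w - w_deq) ** 2                  # [rows, cols]
    return float((diff2 * act_sq).sum() / (act_sq.sum() * w.shape[0]))
```

Without an imatrix every column weighs the same (plain MSE), which is why results are best when activation data is available.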
Test results
Based on 132 tests with models from 11 different families, the `target_bpw_type()` optimization routine generated better quality models in 96 cases (~70%), and the same quality as standard quantization in 10 (~8%). However, even though the method often produced better quality, it lost in surprising cases: naive quants performed better in the remaining 25 tests (20%), sometimes by a significant margin (e.g. ERNIE-4.5-21B-A3B-PT-IQ1_M, granite-4.0-h-tiny-IQ2_M, granite-4.0-h-tiny-IQ1_M).
Of the 96 cases where it performed better, about 1/3 achieved higher scores when using the `--no-importance` option, forcing the algorithm to treat each tensor equally instead of prioritising some (e.g. attn_output, ffn_down, etc.).
Target BPW test results
Using `Cor(ln(PPL(Q)), ln(PPL(base)))` as the discriminant metric
Usage examples
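As a hedged illustration (the exact argument order, defaults, and whether a fallback quantization type is still required have not been verified against the PR; only `llama-quantize`, `--imatrix`, and the flags discussed above come from the source): an invocation along the lines of `llama-quantize --imatrix model.imatrix --target-bpw 4.5 model-F16.gguf model-bpw-4.5.gguf` would quantize the model to roughly 4.5 bits per weight, optionally adding `--keep-bpw-state` to cache the per-tensor error computations for later runs at other targets.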