[Bug]: Qwen3.5-9B: model_extra_tensors.safetensors larger than original model due to unquantized DeltaNet (linear_attn) layers #1650

@piotr-sikora-v

Description

Problem Description

When quantizing Qwen/Qwen3.5-9B, the output is LARGER than the original model:

  • Quantized shards (model-00001/model-00002): ~8.5 GB
  • model_extra_tensors.safetensors: ~17 GB ← nearly the size of the entire 17.98 GB original
  • Total output: ~24.5 GB vs. 17.98 GB original

Reproduction Steps

python quantize_autoround.py --model Qwen/Qwen3.5-9B --group-size 128 --iters 100 --scheme W3A16 --output ./Qwen3.5-9B-AutoRound-W3A16-g128-coding --calib coding --quant-lm-head --torch-compile --low-gpu-mem

Environment Information

  • auto-round version: nightly (latest from GitHub)
  • model: Qwen/Qwen3.5-9B (model_type: qwen3_5)
  • scheme: W3A16 / W4A16, group_size=128
  • format: auto_round
  • using CPU (not GPU)

Additional Context

Root cause
Qwen3.5-9B uses a hybrid Gated Delta Network (DeltaNet) architecture:

  • 24 of the 32 blocks are DeltaNet (linear_attn) blocks whose weight
    dimensions are not divisible by group_size (e.g. 409, 1228)
  • Quantization of these layers fails with only a warning:
    WARNING missing_tensors.py L702: Failed to quantize
    model.language_model.layers.13.linear_attn.in_proj_a.weight:
    The size of tensor a (409) must match the size of tensor b (410)
  • 774 tensors end up in model_extra_tensors.safetensors at bf16

Since DeltaNet layers make up ~75% of the model's weights, the unquantized
extra_tensors file ends up larger than the quantized transformer shards.
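The "tensor a (409) vs tensor b (410)" mismatch is consistent with two group counts disagreeing when a dimension is not divisible by group_size: one code path truncating (floor division) and another rounding up. A minimal sketch, using a hypothetical dimension of 52416 (not taken from the model config, just chosen so that 52416 / 128 = 409.5):

```python
import math

def group_scale_shapes(dim: int, group_size: int):
    # The two group counts that disagree when dim is not a
    # multiple of group_size: floor vs. ceiling division.
    return dim // group_size, math.ceil(dim / group_size)

# Hypothetical DeltaNet-like dimension, not a multiple of 128:
floor_g, ceil_g = group_scale_shapes(52416, 128)
print(floor_g, ceil_g)  # → 409 410
```

For any dimension that IS a multiple of group_size the two counts coincide, which is why the regular transformer layers quantize fine while the odd-shaped DeltaNet and visual projections all hit the same warning.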

Related: issue #1496 reports similar dimension issues for AWQ export, with
the workaround ignore_modules=["in_proj_ba"].
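Rather than hard-coding module names per model, one could derive the ignore list from the shapes themselves. This is a generic sketch (the helper name and the example shapes are hypothetical, not AutoRound API): collect every layer whose input dimension is not a multiple of group_size.

```python
def incompatible_layers(shapes: dict[str, tuple[int, int]], group_size: int):
    # Return layer names whose input dimension cannot be split
    # into whole quantization groups.
    return [name for name, (out_f, in_f) in shapes.items()
            if in_f % group_size != 0]

# Hypothetical (name, (out_features, in_features)) entries:
shapes = {
    "model.language_model.layers.13.linear_attn.in_proj_a": (4096, 52416),
    "model.language_model.layers.0.self_attn.q_proj": (4096, 4096),
}
print(incompatible_layers(shapes, 128))
# → ['model.language_model.layers.13.linear_attn.in_proj_a']
```

The resulting list could then be fed to whatever ignore/skip mechanism the quantizer exposes, avoiding per-architecture workarounds like the one in #1496.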

Expected behavior
Either:

  1. Fail early with a clear error: "Model architecture qwen3_5 with
    DeltaNet layers is not fully supported by AutoRound. These layers
    will remain at bf16, resulting in larger output than original."
  2. Support DeltaNet layer quantization with proper padding/rounding
  3. Add Qwen3.5 to the list of partially-supported models in docs
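Option 2 above can be sketched in a few lines: pad the channel dimension up to the next multiple of group_size, run per-group symmetric round-to-nearest, then slice the padding back off. This is a toy illustration (tiny group size, plain Python lists), not AutoRound's implementation:

```python
def rtn_quant_padded(row, group_size=4, bits=3):
    # Pad to the next multiple of group_size so every group is full.
    pad = (-len(row)) % group_size
    padded = row + [0.0] * pad
    qmax = 2 ** (bits - 1) - 1  # symmetric signed range, e.g. ±3 for 3 bits
    out = []
    for i in range(0, len(padded), group_size):
        g = padded[i:i + group_size]
        scale = max(abs(v) for v in g) / qmax or 1.0  # avoid div-by-zero
        # Quantize-dequantize each value in the group.
        out += [round(v / scale) * scale for v in g]
    return out[:len(row)]  # drop the padding again

row = [0.5, -1.0, 0.25, 0.7, 0.9]  # length 5, not a multiple of 4
deq = rtn_quant_padded(row)
print(len(deq))  # → 5
```

The padding values never leave the function, so the saved tensor keeps its original shape; only the scale tensors are sized for the rounded-up group count.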

quantize_autoround.py

Error Logs

WOQ[RTN] quantizing missing weights:  52%|██████████████▍             | 190/368 [02:25<00:34,  5.15weight/s]2026-04-02 18:02:37 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.12.linear_attn.out_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  52%|██████████████▌             | 191/368 [02:26<00:33,  5.25weight/s]2026-04-02 18:02:37 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.13.linear_attn.in_proj_z.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  52%|██████████████▌             | 192/368 [02:26<00:31,  5.60weight/s]2026-04-02 18:02:37 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.13.linear_attn.out_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  52%|██████████████▋             | 193/368 [02:26<00:29,  5.86weight/s]2026-04-02 18:02:37 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.0.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:37 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.0.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  53%|██████████████▊             | 195/368 [02:26<00:19,  8.65weight/s]2026-04-02 18:02:37 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.1.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:37 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.1.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.10.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  54%|███████████████             | 198/368 [02:26<00:12, 13.32weight/s]2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.10.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.11.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.11.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  55%|███████████████▎            | 201/368 [02:26<00:09, 17.24weight/s]2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.12.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.12.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.13.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  55%|███████████████▌            | 204/368 [02:26<00:08, 20.34weight/s]2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.13.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.14.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.14.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  56%|███████████████▊            | 207/368 [02:26<00:07, 22.37weight/s]2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.15.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.15.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.16.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  57%|███████████████▉            | 210/368 [02:26<00:06, 23.84weight/s]2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.16.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.17.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.17.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  58%|████████████████▏           | 213/368 [02:27<00:06, 25.05weight/s]2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.18.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.18.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.19.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  59%|████████████████▍           | 216/368 [02:27<00:05, 25.86weight/s]2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.19.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.2.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.2.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  60%|████████████████▋           | 219/368 [02:27<00:06, 21.65weight/s]2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.20.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.20.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:38 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.21.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  60%|████████████████▉           | 222/368 [02:27<00:06, 22.19weight/s]2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.21.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.22.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.22.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  61%|█████████████████           | 225/368 [02:27<00:06, 22.61weight/s]2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.23.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.23.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.24.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  62%|█████████████████▎          | 228/368 [02:27<00:05, 23.35weight/s]2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.24.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.25.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.25.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  63%|█████████████████▌          | 231/368 [02:27<00:05, 23.18weight/s]2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.26.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.26.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.3.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  64%|█████████████████▊          | 234/368 [02:27<00:05, 23.62weight/s]2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.3.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.4.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.4.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  64%|██████████████████          | 237/368 [02:28<00:05, 24.20weight/s]2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.5.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.5.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.6.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  65%|██████████████████▎         | 240/368 [02:28<00:05, 25.25weight/s]2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.6.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.7.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.7.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  66%|██████████████████▍         | 243/368 [02:28<00:04, 25.96weight/s]2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.8.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.8.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.9.mlp.linear_fc1.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  67%|██████████████████▋         | 246/368 [02:28<00:04, 26.82weight/s]2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.9.mlp.linear_fc2.weight: The size of tensor a (435) must match the size of tensor b (436) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:39 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.15.self_attn.k_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.15.self_attn.v_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  68%|██████████████████▉         | 249/368 [02:28<00:04, 26.92weight/s]2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.27.self_attn.k_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.27.self_attn.v_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.3.self_attn.k_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  68%|███████████████████▏        | 252/368 [02:28<00:04, 27.18weight/s]2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.3.self_attn.v_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize mtp.layers.0.self_attn.k_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize mtp.layers.0.self_attn.v_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  69%|███████████████████▍        | 255/368 [02:28<00:04, 26.37weight/s]2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.11.self_attn.k_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.11.self_attn.v_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.23.self_attn.k_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.23.self_attn.v_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  70%|███████████████████▋        | 259/368 [02:28<00:03, 29.24weight/s]2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.7.self_attn.k_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.7.self_attn.v_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.31.self_attn.k_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  71%|███████████████████▉        | 262/368 [02:28<00:03, 28.36weight/s]2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.31.self_attn.v_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.19.self_attn.k_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.19.self_attn.v_proj.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  72%|████████████████████▏       | 265/368 [02:29<00:03, 27.56weight/s]2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.0.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.1.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.10.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.11.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  73%|████████████████████▍       | 269/368 [02:29<00:03, 29.70weight/s]2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.12.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.13.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.14.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.15.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  74%|████████████████████▊       | 273/368 [02:29<00:03, 31.03weight/s]2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.16.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.17.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.18.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.19.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  75%|█████████████████████       | 277/368 [02:29<00:02, 31.95weight/s]2026-04-02 18:02:40 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.2.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.20.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.21.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.22.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  76%|█████████████████████▍      | 281/368 [02:29<00:02, 32.47weight/s]2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.23.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.24.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.25.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.26.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  77%|█████████████████████▋      | 285/368 [02:29<00:02, 33.61weight/s]2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.3.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.4.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.5.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.6.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  79%|█████████████████████▉      | 289/368 [02:29<00:02, 34.44weight/s]2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.7.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.8.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.9.attn.qkv.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.pos_embed.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  80%|██████████████████████▎     | 293/368 [02:29<00:02, 35.77weight/s]2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.0.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.1.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.10.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.11.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.12.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.13.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.14.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.15.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.16.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.17.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  82%|███████████████████████     | 303/368 [02:29<00:01, 52.23weight/s]
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.18.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.19.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.2.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.20.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.21.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.22.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.23.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.24.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.25.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.26.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  85%|███████████████████████▊    | 313/368 [02:30<00:00, 64.08weight/s]
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.3.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.4.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.5.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.6.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.7.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.8.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.visual.blocks.9.attn.proj.weight: The size of tensor a (115) must match the size of tensor b (116) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.14.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.14.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.26.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.26.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.30.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.30.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.10.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.10.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.0.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.0.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.1.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  90%|█████████████████████████▏  | 331/368 [02:30<00:00, 96.56weight/s]
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.1.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.22.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.22.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.8.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.8.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.9.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.9.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.4.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.4.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.16.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.16.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.17.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.17.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.18.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.18.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.2.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.2.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.24.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.24.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.25.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.25.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.5.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.5.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.6.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.6.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.20.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.20.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.21.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.21.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.28.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.28.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.29.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.29.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.12.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights:  99%|██████████████████████████▊| 365/368 [02:30<00:00, 165.55weight/s]
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.12.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.13.linear_attn.in_proj_b.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
2026-04-02 18:02:41 WARNING missing_tensors.py L702: Failed to quantize model.language_model.layers.13.linear_attn.in_proj_a.weight: The size of tensor a (409) must match the size of tensor b (410) at non-singleton dimension 0, keeping original weight
WOQ[RTN] quantizing missing weights: 100%|████████████████████████████| 368/368 [02:30<00:00,  2.45weight/s]
2026-04-02 18:04:36 INFO missing_tensors.py L370: Successfully wrote 774 missing tensor(s) to 'model_extra_tensors.safetensors' in /var/lib/docker/autoround/Qwen3.5-9B-AutoRound-W3A16-g128-coding.
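The shape mismatches in the log all come from group-wise quantization on weight dimensions that are not a multiple of `group_size` (e.g. 409 with `group_size=128`). The sketch below is purely illustrative, not AutoRound's actual code: it shows a minimal group-wise round-to-nearest quantizer that zero-pads the input-channel dimension up to a multiple of the group size before reshaping into groups, which is one way such layers could be handled instead of being silently dumped at bf16 (suggestion 2 in "Expected behavior"). The function name `rtn_quantize_grouped` is hypothetical.

```python
import torch
import torch.nn.functional as F


def rtn_quantize_grouped(weight: torch.Tensor, group_size: int = 128, bits: int = 3) -> torch.Tensor:
    """Illustrative group-wise RTN quantize/dequantize round trip.

    Hypothetical sketch, NOT AutoRound's implementation. Zero-pads the
    input-channel dim so that reshaping into groups never fails on odd
    sizes such as 409 or 1228; the padding is stripped again at the end.
    """
    out_f, in_f = weight.shape
    pad = (-in_f) % group_size  # e.g. in_f=409, group_size=128 -> pad 103
    if pad:
        weight = F.pad(weight, (0, pad))  # pad last dim with zeros
    w = weight.reshape(out_f, -1, group_size)
    qmax = 2**bits - 1  # 7 levels above zero for W3
    wmin = w.amin(dim=-1, keepdim=True)
    wmax = w.amax(dim=-1, keepdim=True)
    scale = (wmax - wmin).clamp(min=1e-8) / qmax
    q = ((w - wmin) / scale).round().clamp(0, qmax)  # asymmetric uint codes
    deq = (q * scale + wmin).reshape(out_f, -1)[:, :in_f]  # strip padding
    return deq


# A DeltaNet-like shape: in_features=409 is not divisible by 128.
w = torch.randn(256, 409)
deq = rtn_quantize_grouped(w)
assert deq.shape == w.shape
```

Whether zero-padding is numerically acceptable here depends on how AutoRound packs and exports the groups; the point is only that the 409-vs-410 reshape failure is avoidable rather than inherent to the architecture.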
