
Misc. bug: using --tensor-type on a tensor that requires fallback leads to an assert error #14996


Description


Name and Version

All versions after b5125, on all operating systems, are affected by this bug.

Operating systems

Other? (Please let us know in description)

Which llama.cpp modules do you know to be affected?

llama-quantize

Command line

```shell
./llama-quantize --tensor-type attn=q4_k gorilla-falcon-7b-hf-v0-F16.gguf gorilla-falcon-7b-hf-v0-Q4_K_M-kaboom.gguf q4_k_m 10
```

Problem description & steps to reproduce

When --tensor-type is used to override the type of a tensor that would otherwise have been quantised with a fallback type, the GGML_ASSERT(tensor->ne[0] % blck_size == 0 && "tensor row size not divisible by block size of new type") assertion is triggered.

The problem occurs here because the override logic ignores that the tensor type was reassigned to a fallback, which happens when the tensor's row size is not an exact multiple of the GGML block size of the requested type.
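
For context, here is a minimal sketch of the kind of guard the override path needs; it is not the actual change in PR #14995. pick_override_or_fallback is a hypothetical helper, and the requested and fallback types would come from the --tensor-type overrides and the quantizer's existing fallback logic respectively:

```cpp
#include "ggml.h"

// Hypothetical helper (not the actual fix in PR #14995): decide whether a
// --tensor-type override can be honoured for a given tensor, or whether the
// fallback type already chosen by the quantizer must be kept to avoid tripping
// GGML_ASSERT(tensor->ne[0] % blck_size == 0 && "tensor row size not divisible by block size of new type").
static enum ggml_type pick_override_or_fallback(
        const struct ggml_tensor * tensor,
        enum ggml_type             requested,   // type asked for via --tensor-type
        enum ggml_type             fallback) {  // type the quantizer fell back to
    if (tensor->ne[0] % ggml_blck_size(requested) != 0) {
        // The requested type's block size does not divide the row size,
        // so honouring the override would hit the assert; keep the fallback.
        return fallback;
    }
    return requested;
}
```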

Steps to reproduce:

  1. Attempt to quantise a model in which any tensor's row size (ne[0]) is not an exact multiple of the GGML block size of the target type, whilst at the same time using --tensor-type to override that tensor (see the command line example above)
  2. Quantisation fails with an assert error.

PR #14995 fixes this.

Credit to @ddh0 for flagging this bug.

First Bad Commit

71e90e8

Relevant log output
