Misc. bug: convert_lora_to_gguf ignores outtype #10671

@dynatron832

Description

Name and Version

8f1d81a from 2024-09-01 until today's 0cd182e

Operating systems

Windows

Which llama.cpp modules do you know to be affected?

llama-quantize

Problem description & steps to reproduce

convert_lora_to_gguf ignores outtype

Hi,
I want to convert a Transformers FP32 LoRA to a quantized GGUF LoRA. I did this a few months ago and it worked fine with the convert script. Now I tried the same thing with an up-to-date version of llama.cpp and ...well... it still works, but I wondered why the output file is the same size as the input file. I tested this much-too-large file and it still appears to work, at least for me in kobold.cpp. But I got curious why this quantized LoRA is so large, and it turned out that the new version of llama.cpp ignores the outtype given to the convert script.

No matter which --outtype I choose, the output is always FP32, and the logging while "converting" looks like this for all lines.

This is with --outtype q8_0:
INFO:hf-to-gguf:blk.0.ffn_down.weight.lora_a, torch.float32 --> F32, shape = {14336, 64}
INFO:hf-to-gguf:blk.0.ffn_down.weight.lora_b, torch.float32 --> F32, shape = {64, 4096}
INFO:hf-to-gguf:blk.0.ffn_gate.weight.lora_a, torch.float32 --> F32, shape = {4096, 64}
INFO:hf-to-gguf:blk.0.ffn_gate.weight.lora_b, torch.float32 --> F32, shape = {64, 14336}

I still had my old llama.cpp around and tested the same thing: same LoRA, same base model, same script parameters.

INFO:hf-to-gguf:blk.0.ffn_down.weight.lora_a, torch.float32 --> Q8_0, shape = {14336, 64}
INFO:hf-to-gguf:blk.0.ffn_down.weight.lora_b, torch.float32 --> Q8_0, shape = {64, 4096}
INFO:hf-to-gguf:blk.0.ffn_gate.weight.lora_a, torch.float32 --> Q8_0, shape = {4096, 64}
INFO:hf-to-gguf:blk.0.ffn_gate.weight.lora_b, torch.float32 --> Q8_0, shape = {64, 14336}

This looks like a properly quantized LoRA, and the file size is also as expected.
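For context on why the file size is such a clear tell: ggml's Q8_0 format stores each block of 32 weights as one fp16 scale plus 32 int8 values (34 bytes instead of the 128 bytes those weights occupy in FP32), so a correct q8_0 conversion should shrink an FP32 LoRA by roughly 3.8x. A minimal numpy sketch of that scheme, for illustration only:

```python
# Sketch of ggml's Q8_0 block quantization (what --outtype q8_0 is
# supposed to produce): each block of 32 float32 weights becomes one
# fp16 scale plus 32 int8 quants.
import numpy as np

BLOCK = 32

def quantize_q8_0(x: np.ndarray):
    """Quantize a flat float32 array (length a multiple of 32) to Q8_0."""
    x = x.reshape(-1, BLOCK).astype(np.float32)
    amax = np.abs(x).max(axis=1, keepdims=True)
    d = amax / 127.0                          # per-block scale
    safe_d = np.where(d == 0, 1.0, d)         # avoid division by zero
    q = np.round(x / safe_d).astype(np.int8)  # quants in [-127, 127]
    return d.astype(np.float16), q

def dequantize_q8_0(d: np.ndarray, q: np.ndarray) -> np.ndarray:
    return (d.astype(np.float32) * q.astype(np.float32)).reshape(-1)

w = np.random.default_rng(0).standard_normal(64 * BLOCK).astype(np.float32)
d, q = quantize_q8_0(w)
w_hat = dequantize_q8_0(d, q)
ratio = w.nbytes / (d.nbytes + q.nbytes)  # 128/34 ~ 3.76x smaller than FP32
```

If the output file is the same size as the FP32 input, as reported above, this quantization clearly never happened.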

Next, I tested a little to narrow down at which version/commit this happens.

My old "working" version is 1d1ccce from 2024-08-29.

0ab30f8 from 2024-08-30 also still works.

But at 8f1d81a from 2024-09-01 the problem starts: the script ignores the output type and writes out an FP32 LoRA.
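The symptom matches a common regression pattern: a CLI flag that is still parsed and accepted, but no longer threaded through to the code that actually uses it, so everything silently falls back to the default. A toy sketch of that pattern (hypothetical names, not llama.cpp's actual code):

```python
# Toy reproduction of the symptom pattern (hypothetical names, NOT the
# actual convert script): --outtype parses fine, but the buggy code path
# never forwards it to the tensor writer, so every tensor becomes F32.
import argparse

OUTTYPES = {"f32": "F32", "f16": "F16", "q8_0": "Q8_0"}

def parse_args(argv):
    p = argparse.ArgumentParser()
    p.add_argument("--outtype", choices=OUTTYPES, default="f32")
    return p.parse_args(argv)

def tensor_dtype(argv, forward_outtype=True):
    args = parse_args(argv)
    # Buggy path: the parsed value is ignored and the default is used,
    # which is exactly what the F32 log lines above would look like.
    chosen = args.outtype if forward_outtype else "f32"
    return OUTTYPES[chosen]
```

Because the flag still validates, the script reports no error; only the log lines and the output file size reveal that the requested type was dropped.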

First Bad Commit

8f1d81a from 2024-09-01

Relevant log output

No response
