Description
Name and Version
From 8f1d81a (2024-09-01) until today (0cd182e)
Operating systems
Windows
Which llama.cpp modules do you know to be affected?
llama-quantize
Problem description & steps to reproduce
convert_lora_to_gguf ignores outtype
Hi,

I want to convert a Transformers FP32 LoRA to a quantized GGUF LoRA. I did this a few months ago and it worked fine with the convert script. Now I tried the same thing with an up-to-date version of llama.cpp and... well... it still works, but I wondered why the output file has the same size as the input file. I tested this much-too-large file and it still seems to work, at least for me in kobold.cpp. But then I got curious why this quantized LoRA is so large, and it turned out the new version of llama.cpp is ignoring the --outtype given to the convert script.

No matter which --outtype I choose, it is always FP32, and the logging while "converting" looks like this for all lines.

This is with --outtype q8_0:
INFO:hf-to-gguf:blk.0.ffn_down.weight.lora_a, torch.float32 --> F32, shape = {14336, 64}
INFO:hf-to-gguf:blk.0.ffn_down.weight.lora_b, torch.float32 --> F32, shape = {64, 4096}
INFO:hf-to-gguf:blk.0.ffn_gate.weight.lora_a, torch.float32 --> F32, shape = {4096, 64}
INFO:hf-to-gguf:blk.0.ffn_gate.weight.lora_b, torch.float32 --> F32, shape = {64, 14336}
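For reference, the invocation was along these lines (the directories and file names here are placeholders, not the exact paths from this report):

```shell
# Hypothetical paths; --outtype is the flag that is being ignored.
python convert_lora_to_gguf.py ./my-lora-fp32 \
    --base ./base-model \
    --outfile my-lora-q8_0.gguf \
    --outtype q8_0
```

On the affected commits the log above still reports F32 for every tensor despite --outtype q8_0.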
I still had my old llama.cpp around and tested the same thing: same LoRA, same base model, same script parameters.
INFO:hf-to-gguf:blk.0.ffn_down.weight.lora_a, torch.float32 --> Q8_0, shape = {14336, 64}
INFO:hf-to-gguf:blk.0.ffn_down.weight.lora_b, torch.float32 --> Q8_0, shape = {64, 4096}
INFO:hf-to-gguf:blk.0.ffn_gate.weight.lora_a, torch.float32 --> Q8_0, shape = {4096, 64}
INFO:hf-to-gguf:blk.0.ffn_gate.weight.lora_b, torch.float32 --> Q8_0, shape = {64, 14336}
This looks more like a properly quant-converted LoRA, and the file size is also fine.
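Independent of the conversion log, the tensor types actually written to the file can be inspected with the gguf Python package (a sketch, assuming `pip install gguf`; the file name is a placeholder):

```python
# Sketch: list the dtype of each tensor in a converted GGUF LoRA
# to verify whether quantization was applied.
from gguf import GGUFReader

reader = GGUFReader("my-lora-q8_0.gguf")  # placeholder path
for tensor in reader.tensors:
    # tensor_type is a GGMLQuantizationType enum, e.g. F32 or Q8_0
    print(tensor.name, tensor.tensor_type.name, tensor.shape)
```

On a correctly converted file the tensor types should match the requested --outtype instead of F32.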
Next I tested a bit to narrow down at which version/commit this starts happening.
My old "working" version is 1d1ccce (2024-08-29).
0ab30f8 (2024-08-30) also still seems to work,
but
at 8f1d81a (2024-09-01) the problem starts: the script ignores the output type and writes an FP32 LoRA.
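The narrowing-down above maps directly onto git bisect; a sketch of the workflow (the conversion step is whatever invocation you normally use):

```shell
# Sketch: bisect llama.cpp between the known-good and known-bad commits.
cd llama.cpp
git bisect start
git bisect bad 8f1d81a     # 2024-09-01, first known-bad
git bisect good 0ab30f8    # 2024-08-30, last known-good
# At each step git checks out a commit; re-run the conversion,
# check whether the log reports F32 or the requested type, then mark it:
#   git bisect good        # output was quantized as requested
#   git bisect bad         # output stayed F32
git bisect reset           # return to the original branch when done
```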
First Bad Commit
8f1d81a from 2024-09-01
Relevant log output
No response