Misc. bug: convert_lora_to_gguf ignores outtype #10671

@dynatron832

Description

Name and Version

8f1d81a from 2024-09-01 until today's 0cd182e

Operating systems

Windows

Which llama.cpp modules do you know to be affected?

llama-quantize

Problem description & steps to reproduce

convert_lora_to_gguf ignores outtype

Hi,
I want to convert a Transformers FP32 LoRA to a quantized GGUF LoRA. I did this a few months ago and it worked fine with the convert script. Now I tried the same thing with an up-to-date version of llama.cpp and ...well... it still works, but I wondered why the output file is the same size as the input file. I tested this much-too-large file and it still appears to work, at least for me in kobold.cpp. But I got curious why this quantized LoRA is so large, and it turned out that the new version of llama.cpp ignores the outtype given to the convert script.

No matter which --outtype I choose, the output is always FP32, and the logging while "converting" looks like this for all lines.

This is with --outtype q8_0:
INFO:hf-to-gguf:blk.0.ffn_down.weight.lora_a, torch.float32 --> F32, shape = {14336, 64}
INFO:hf-to-gguf:blk.0.ffn_down.weight.lora_b, torch.float32 --> F32, shape = {64, 4096}
INFO:hf-to-gguf:blk.0.ffn_gate.weight.lora_a, torch.float32 --> F32, shape = {4096, 64}
INFO:hf-to-gguf:blk.0.ffn_gate.weight.lora_b, torch.float32 --> F32, shape = {64, 14336}

I still had my old llama.cpp around and tested the same thing: same LoRA, same base model, same script parameters.

INFO:hf-to-gguf:blk.0.ffn_down.weight.lora_a, torch.float32 --> Q8_0, shape = {14336, 64}
INFO:hf-to-gguf:blk.0.ffn_down.weight.lora_b, torch.float32 --> Q8_0, shape = {64, 4096}
INFO:hf-to-gguf:blk.0.ffn_gate.weight.lora_a, torch.float32 --> Q8_0, shape = {4096, 64}
INFO:hf-to-gguf:blk.0.ffn_gate.weight.lora_b, torch.float32 --> Q8_0, shape = {64, 14336}

This looks like a properly quantized LoRA, and the file size is also as expected.
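For context on why the file size is such a clear tell: ggml's Q8_0 format stores each block of 32 weights as one fp16 scale plus 32 int8 values (34 bytes instead of the 128 bytes those weights occupy in FP32), so a correct q8_0 conversion should shrink an FP32 LoRA by roughly 3.8x. A minimal numpy sketch of that scheme, for illustration only:

```python
# Sketch of ggml's Q8_0 block quantization (what --outtype q8_0 is
# supposed to produce): each block of 32 float32 weights becomes one
# fp16 scale plus 32 int8 quants.
import numpy as np

BLOCK = 32

def quantize_q8_0(x: np.ndarray):
    """Quantize a flat float32 array (length a multiple of 32) to Q8_0."""
    x = x.reshape(-1, BLOCK).astype(np.float32)
    amax = np.abs(x).max(axis=1, keepdims=True)
    d = amax / 127.0                          # per-block scale
    safe_d = np.where(d == 0, 1.0, d)         # avoid division by zero
    q = np.round(x / safe_d).astype(np.int8)  # quants in [-127, 127]
    return d.astype(np.float16), q

def dequantize_q8_0(d: np.ndarray, q: np.ndarray) -> np.ndarray:
    return (d.astype(np.float32) * q.astype(np.float32)).reshape(-1)

w = np.random.default_rng(0).standard_normal(64 * BLOCK).astype(np.float32)
d, q = quantize_q8_0(w)
w_hat = dequantize_q8_0(d, q)
ratio = w.nbytes / (d.nbytes + q.nbytes)  # 128/34 ~ 3.76x smaller than FP32
```

If the output file is the same size as the FP32 input, as reported above, this quantization clearly never happened.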

Next, I tested a little to narrow down at which version/commit this happens.

My old "working" version is 1d1ccce from 2024-08-29.

0ab30f8 from 2024-08-30 also still works.

But at 8f1d81a from 2024-09-01 the problem starts: the script ignores the output type and writes out an FP32 LoRA.
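The symptom matches a common regression pattern: a CLI flag that is still parsed and accepted, but no longer threaded through to the code that actually uses it, so everything silently falls back to the default. A toy sketch of that pattern (hypothetical names, not llama.cpp's actual code):

```python
# Toy reproduction of the symptom pattern (hypothetical names, NOT the
# actual convert script): --outtype parses fine, but the buggy code path
# never forwards it to the tensor writer, so every tensor becomes F32.
import argparse

OUTTYPES = {"f32": "F32", "f16": "F16", "q8_0": "Q8_0"}

def parse_args(argv):
    p = argparse.ArgumentParser()
    p.add_argument("--outtype", choices=OUTTYPES, default="f32")
    return p.parse_args(argv)

def tensor_dtype(argv, forward_outtype=True):
    args = parse_args(argv)
    # Buggy path: the parsed value is ignored and the default is used,
    # which is exactly what the F32 log lines above would look like.
    chosen = args.outtype if forward_outtype else "f32"
    return OUTTYPES[chosen]
```

Because the flag still validates, the script reports no error; only the log lines and the output file size reveal that the requested type was dropped.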

First Bad Commit

8f1d81a from 2024-09-01

Relevant log output

No response
