
[FEATURE] Allow merge and unload of PEFT models into base models. #260

Closed
RonanKMcGovern opened this issue Aug 14, 2023 · 5 comments
Labels: enhancement (New feature or request)

Comments

@RonanKMcGovern

Is your feature request related to a problem? Please describe.
Yes. The problem is that a PEFT adapter cannot be merged into a GPTQ-quantized base model and pushed to the Hub. The following error is raised:

Cannot merge LORA layers when the model is gptq quantized

after running:

from transformers import AutoModelForCausalLM
from peft import PeftModel

# model_id is the GPTQ-quantized base model; adapter_model_name is the trained LoRA adapter
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")  # device_map must be "auto", cannot be "cpu"

# load the PEFT model with the new adapters on top of the quantized base
model = PeftModel.from_pretrained(
    model,
    adapter_model_name,
)

model = model.merge_and_unload()  # merge the adapters into the base model; this call raises the error above

Describe the solution you'd like
Allow merging and unloading, just like for bitsandbytes-quantized models.

Describe alternatives you've considered
Otherwise, the base model and the adapter always need to be loaded separately and referenced together when running inference; see the sketch below.
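
For reference, here is a minimal sketch of that workaround: loading the GPTQ base model and attaching the adapter at inference time without merging. The model IDs are placeholders, and loading a GPTQ checkpoint this way assumes optimum and auto-gptq are installed.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "TheBloke/Llama-2-7B-GPTQ"   # placeholder GPTQ base model
adapter_model_id = "my-org/my-lora-adapter"  # placeholder trained LoRA adapter

# load the quantized base model (requires optimum + auto-gptq)
model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# attach the adapter on top; both the base and the adapter must be referenced at inference time
model = PeftModel.from_pretrained(model, adapter_model_id)

inputs = tokenizer("Hello, world", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))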

RonanKMcGovern added the enhancement label Aug 14, 2023
@Ph0rk0z (Contributor) commented Aug 14, 2023

I am not sure if this is possible. It would be a miracle if quantized models could have a LoRA merged into them. It would save having to download the full HF weights.

@PanQiWei (Collaborator) commented
Theoretically I think it's possible. I also believe that supporting merging LoRA weights back into a quantized base model is very valuable for industrial applications; however, a lot of experiments would need to be done first.

@RonanKMcGovern (Author) commented

Thanks @PanQiWei and @Ph0rk0z .

I'm unsure now if my request was clear.

I'm asking about merging the LoRA back into the base GPTQ-quantized model - which should be a much easier task. The base model I'm referring to above is the GPTQ-quantized one. However, merging the LoRA adapter isn't working.

Yes, being able to merge back into the root model would be useful - and industrially valuable. And it makes sense, Pan, that extra testing would be required to see whether it even makes sense from a perplexity standpoint.
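
A rough sketch of one way to run that perplexity check: compute token-level perplexity for the adapter-attached model and for a merged model on the same held-out text and compare. The models and text below are placeholders, not part of any existing test suite.

import torch

def perplexity(model, tokenizer, text, max_length=1024):
    # token-level perplexity over a single text chunk
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_length).to(model.device)
    with torch.no_grad():
        loss = model(**enc, labels=enc["input_ids"]).loss  # causal-LM cross-entropy
    return torch.exp(loss).item()

# e.g. compare perplexity(adapter_model, tokenizer, sample_text)
#      against perplexity(merged_model, tokenizer, sample_text)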

@prp-e commented Dec 24, 2023

From what I've heard, GGUF models can be merged with their adapters, so it may be possible to merge adapters into GPTQ models as well.

@RonanKMcGovern (Author) commented

From what I've heard, GGUF models can be merged with their adapters, so it may be possible to merge adapters into GPTQ models as well.

In principle yes, but the codebase doesn't allow for that with GPTQ. Actually, as of very recently, I believe it is now possible to merge bnb nf4 adapters into the base model.
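
For context, a minimal sketch of what that bnb nf4 merge looks like, assuming a recent peft release whose merge_and_unload supports bitsandbytes 4-bit layers; the model names are placeholders.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# load the base model quantized with bitsandbytes nf4
base = AutoModelForCausalLM.from_pretrained(
    "base-model-id",  # placeholder base model
    quantization_config=bnb_config,
    device_map="auto",
)

# attach the LoRA adapter and merge it back into the base weights
model = PeftModel.from_pretrained(base, "adapter-id")  # placeholder adapter repo
merged = model.merge_and_unload()

merged.save_pretrained("merged-model")  # whether the merged 4-bit model can be saved/pushed directly depends on the transformers/bitsandbytes versions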
