
How to apply LoRA adapter to a model loaded with OVModelForCausalLM()? #642

nai-kon opened this issue Mar 29, 2024 · 4 comments
nai-kon commented Mar 29, 2024

In the transformers library, we can load multiple adapters onto a base model with load_adapter and then switch between them with set_adapter, like below.

# base model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
)

# load multiple adapters
model.load_adapter("model/adapter1/", "adapter1")
model.load_adapter("model/adapter2/", "adapter2")

# switch adapter
model.set_adapter("adapter2")

Now I want to apply LoRA adapters with OpenVINO, but I can't find an example of it.
Is it possible to do it with OVModelForCausalLM?

IlyasMoutawwakil (Member) commented
You probably can't do that once the model is loaded/exported to OpenVINO (or any framework with a static computation graph). What you can do instead is fuse one or more adapters into the base model and then export the result to OpenVINO:

  • Load the model using AutoModelForCausalLM.
  • Load your target adapters the way you usually would.
  • Create a PeftModel and use its merge_and_unload method to fuse the adapters into the base model.
  • Save the fused model locally or push it to the Hub.
  • Load it with OVModelForCausalLM.
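The steps above can be sketched roughly as follows. This assumes peft and optimum[openvino] are installed; the model name and the adapter/output paths are placeholders, not files from this thread.

```python
# Sketch of the fuse-then-export flow, assuming peft and optimum[openvino]
# are installed. All paths and the model name are placeholders.
from transformers import AutoModelForCausalLM
from peft import PeftModel
from optimum.intel import OVModelForCausalLM

# Load the base model in PyTorch
base = AutoModelForCausalLM.from_pretrained("model_name")

# Wrap the base model with the target LoRA adapter
peft_model = PeftModel.from_pretrained(base, "model/adapter1/")

# Fuse the LoRA weights into the base weights and drop the PEFT wrappers
merged = peft_model.merge_and_unload()
merged.save_pretrained("model/merged-adapter1/")

# Export the merged checkpoint to OpenVINO IR and save it
ov_model = OVModelForCausalLM.from_pretrained("model/merged-adapter1/", export=True)
ov_model.save_pretrained("model/merged-adapter1-ov/")
```

After merge_and_unload, the checkpoint is a plain transformers model with the adapter baked into its weights, so the OpenVINO export sees no PEFT-specific layers at all.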

nai-kon commented Apr 14, 2024

Thank you for your detailed answer. I understand now that dynamic adapter switching is difficult in static-graph frameworks such as OpenVINO.

nai-kon closed this as completed Apr 14, 2024
nai-kon commented Apr 15, 2024

A follow-up question: I understand that multiple adapters can be merged into the model with merge_and_unload(), but is it possible to load a model that contains multiple adapters with OVModelForCausalLM and switch between them at runtime?
Or do I need to merge one adapter per model, so that three adapters require three merged models?
If so, my concern is that the total model file size will grow in proportion to the number of adapters.

nai-kon reopened this Apr 15, 2024

IlyasMoutawwakil commented Apr 25, 2024

For now it seems impossible to me, but my understanding of the OpenVINO runtime is still fairly narrow.
The issue openvinotoolkit/openvino#21806 requests an API for exactly this (keeping the same model while changing weights between inference requests). In theory it should be possible using the State API; I will look into it.
