LoRA config taking target modules from base model #758
I have fine-tuned a BioGPT model using LoRA; the config can be found on the Hub (Lukee4/biogpt-2019).

The issue is that when I try to load the model, I get the error:

ValueError: Target modules ['c_attn'] not found in the base model. Please check the target modules and try again.

I investigated this a little, and the only place I could find c_attn is in the source code for transformers.models.gpt2.modeling_gpt2; there is no c_attn in the source code for biogpt/modeling_biogpt. When I load gpt2 instead of biogpt, everything works fine.
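A minimal sketch of the failing load, reconstructed from the description above (the base checkpoint name microsoft/biogpt is an assumption, since the issue only says biogpt; the adapter repo is the one named in the issue):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the BioGPT base model, then try to attach the LoRA adapter from the Hub.
base = AutoModelForCausalLM.from_pretrained("microsoft/biogpt")

# This is where the error is raised: the saved adapter_config.json lists
# target_modules=['c_attn'], a module name that BioGPT does not have.
model = PeftModel.from_pretrained(base, "Lukee4/biogpt-2019")
```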
Comments

It seems that the config from the Hub repo (Lukee4/biogpt-2019) lists target_modules=['c_attn'], which is a GPT-2 module name and does not exist in BioGPT.
Hello @BenjaminBossan, in the config I didn't set any target modules; you can check the notebook I used to do the fine-tuning here. I also did not change the code after the LoRA weights were trained. I followed this example to do the fine-tuning; the only difference is the model used.
Hey, sorry, I completely missed that you're the author of the linked model :) I can't access the notebook, but I don't think it matters. What I assume happened here is the following: in PEFT, we try to recognize the architecture of the model and automatically set the adapter layers if the user doesn't set target_modules themselves. For GPT-2, that default is ['c_attn'], so the base model at training time was presumably recognized as GPT-2, and that default was saved into the adapter config.
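To illustrate the mechanism described above: PEFT keeps a per-architecture table of default LoRA targets, keyed by the transformers model_type. A small sketch, assuming a PEFT version where the mapping is importable from peft.utils (the exact import path has varied between releases):

```python
from peft.utils import TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING

# When LoraConfig has no explicit target_modules, PEFT falls back to this
# table, keyed by the base model's model_type.
print(TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING["gpt2"])
# ['c_attn']  -- an adapter_config.json containing this value implies the
# base model was treated as GPT-2 when the config was created.
```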
Hey @BenjaminBossan, here is the [link](https://colab.research.google.com/drive/1s7af-1u-LEtXx2iMAw-8Gf0jnFfgsvc0?usp=sharing) to the notebook; hopefully you can access it now. But I don't understand how this could happen: I tried the same code with BioMedLM, which is also a GPT2 model, and it worked fine. How do I decide which target modules I should include in the PeftConfig? I'm sorry, I know this question may not be appropriate here, but I am new to this and didn't expect to run into an issue this advanced.
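For reference, "including target modules in the config" looks like the sketch below. The module names and hyperparameters are illustrative assumptions, not values from the notebook; q_proj and v_proj are the attention projection names in the BioGPT modeling source:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained("microsoft/biogpt")

config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # illustrative hyperparameters
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # explicit, so nothing is inferred
)

model = get_peft_model(base, config)
model.print_trainable_parameters()
```

With target_modules pinned explicitly, the saved adapter_config.json no longer depends on PEFT's automatic architecture detection.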
In the notebook, at one point I see:
but later:
Not sure what happened there.
Not a stupid question at all. In general, people refer to papers that ran the experiments to decide which layers to adapt; linear layers are the prime targets. Here are some hints on how to identify which layers could be potential targets.
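Building on that hint, a small sketch that enumerates the nn.Linear layer names of a base model (using microsoft/biogpt as an assumed checkpoint); any of the printed names are candidates for target_modules:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("microsoft/biogpt")

# Collect the deduplicated leaf names of all linear submodules; LoRA matches
# target_modules against these name suffixes.
linear_names = {
    name.split(".")[-1]
    for name, module in model.named_modules()
    if isinstance(module, torch.nn.Linear)
}
print(sorted(linear_names))
```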