System Info
- transformers version: 4.49.0
- Platform: Linux-6.6.0-72.0.0.64.oe2403.x86_64-x86_64-with-glibc2.38
- Python version: 3.10.16
- Huggingface_hub version: 0.29.1
- Safetensors version: 0.5.3
- Accelerate version: 1.4.0
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (GPU?): 2.2.2+cu121 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?:
- Using GPU in script?:
- GPU type: NVIDIA L40
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
I fine-tune Qwen2.5 with the following code:
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer
from peft import LoraConfig
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '1'
dataset = load_dataset("trl-lib/Capybara", split="train")
dataset = dataset.select(range(500))
MODEL_ID = 'Qwen/Qwen2.5-0.5B'
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",
    modules_to_save=["lm_head", "embed_token"],
    task_type="CAUSAL_LM",
)
args = SFTConfig(
output_dir="Qwen2.5-0.5B-SFT-Capybara", # directory to save and repository id
num_train_epochs=1, # number of training epochs
per_device_train_batch_size=4, # batch size per device during training
gradient_accumulation_steps=4, # number of steps before performing a backward/update pass
gradient_checkpointing=True, # use gradient checkpointing to save memory
optim="adamw_torch_fused", # use fused adamw optimizer
logging_steps=10, # log every 10 steps
save_strategy="epoch", # save checkpoint every epoch
bf16=True, # use bfloat16 precision
tf32=True, # use tf32 precision
learning_rate=2e-4, # learning rate, based on QLoRA paper
max_grad_norm=0.3, # max gradient norm based on QLoRA paper
warmup_ratio=0.03, # warmup ratio based on QLoRA paper
lr_scheduler_type="constant", # use constant learning rate scheduler
push_to_hub=False, # push model to hub
# report_to="tensorboard", # report metrics to tensorboard
)
trainer = SFTTrainer(
    MODEL_ID,
    train_dataset=dataset,
    args=args,
    peft_config=peft_config,
)
trainer.train()
print('end')
and I use the following code for inference:
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer
peft_model_id = "/home/chenjq/pythonWork/nlp/Qwen2.5-0.5B-SFT-Capybara/checkpoint-31"
# peft_model_id = args.output_dir
tokenizer = AutoTokenizer.from_pretrained(peft_model_id)
# Load Model with PEFT adapter
model = AutoPeftModelForCausalLM.from_pretrained(
    peft_model_id,
    device_map="auto",
    torch_dtype=torch.float16,
)
prompt = "3的5倍是多少"
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": prompt},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=200,
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
print(1)
An error occurs when loading the model with AutoPeftModelForCausalLM:
Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.
Traceback (most recent call last):
File "/home/chenjq/.pycharm_helpers/pydev/pydevd.py", line 1500, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/chenjq/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/chenjq/pythonWork/nlp/test14.py", line 11, in <module>
model = AutoPeftModelForCausalLM.from_pretrained(
File "/home/chenjq/miniconda3/envs/nlp/lib/python3.10/site-packages/peft/auto.py", line 130, in from_pretrained
return cls._target_peft_class.from_pretrained(
File "/home/chenjq/miniconda3/envs/nlp/lib/python3.10/site-packages/peft/peft_model.py", line 581, in from_pretrained
load_result = model.load_adapter(
File "/home/chenjq/miniconda3/envs/nlp/lib/python3.10/site-packages/peft/peft_model.py", line 1239, in load_adapter
load_result = set_peft_model_state_dict(
File "/home/chenjq/miniconda3/envs/nlp/lib/python3.10/site-packages/peft/utils/save_and_load.py", line 451, in set_peft_model_state_dict
load_result = model.load_state_dict(peft_model_state_dict, strict=False)
File "/home/chenjq/miniconda3/envs/nlp/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.lm_head.modules_to_save.default.weight: copying a param with shape torch.Size([151936, 896]) from checkpoint, the shape in current model is torch.Size([151665, 896]).
Process finished with exit code 1
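For reference, the two shapes in the error seem to correspond to the base model's padded vocabulary size and the tokenizer length. The snippet below is just my own check (not part of the failing script); I am assuming it prints 151936 and 151665, matching the numbers in the error message:

from transformers import AutoConfig, AutoTokenizer

MODEL_ID = "Qwen/Qwen2.5-0.5B"

config = AutoConfig.from_pretrained(MODEL_ID)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Assumption: config.vocab_size is the padded size (151936) that the saved
# lm_head weight uses, while len(tokenizer) (151665) is the size the model
# ends up with when the adapter is loaded.
print("config.vocab_size:", config.vocab_size)
print("len(tokenizer):", len(tokenizer))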
Expected behavior
I expect the model to load with AutoPeftModelForCausalLM and generate predictions normally.
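For comparison, a sketch of a possible workaround: load the base model explicitly (so it keeps its original 151936-row embedding and lm_head) and attach the adapter with PeftModel.from_pretrained instead of AutoPeftModelForCausalLM. This is only a sketch under the assumption that the mismatch comes from a tokenizer-based resize during loading; I have not verified it:

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

peft_model_id = "/home/chenjq/pythonWork/nlp/Qwen2.5-0.5B-SFT-Capybara/checkpoint-31"
base_model_id = "Qwen/Qwen2.5-0.5B"

# Load the base model with its original (padded) vocabulary size ...
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(peft_model_id)

# ... then attach the LoRA adapter directly, without any resize step.
model = PeftModel.from_pretrained(base_model, peft_model_id)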