
[BUG] Qwen 1.8B LoRA finetune produces no output #1381

@IMYBo

Description


Is there an existing issue / discussion for this?

  • I have searched the existing issues / discussions

Is there an existing answer for this in FAQ?

  • I have searched FAQ

Current Behavior

My data preparation follows the officially provided format, for example:
{"id": "identity_0", "conversations": [{"from": "user", "value": "你是一个说话人识别助手,请根据下面的发言内容和聚类结果,修正说话人编号。\n每段发言已经分配了聚类编号(如 0, 1, 2...),但这些编号可能不准确。\n请你重新分配编号为 spk_A、spk_B 等,确保相同说话人编号一致,不需要输出原始文本内容。\n\n说话人 1:My team works across research and product to incubate\n说话人 1:across research and product to incubate emerging technologies\n说话人 1:and product to incubate emerging technologies\n说话人 1:to incubate emerging technologies and runs programs\n说话人 1:emerging technologies and runs programs that connect our\n说话人 1:technologies and runs programs that connect our research at Microsoft\n说话人 1:and runs programs that connect our research at Microsoft\n说话人 1:programs that connect our research at Microsoft to the broader research\n说话人 1:our research at Microsoft to the broader research community.\n说话人 1:I sat down with research leaders, AJ Kumar,\n说话人 1:research leaders, AJ Kumar, Ahmed Awadallah,\n说话人 1:AJ Kumar, Ahmed Awadallah,\n说话人 1:Kumar, Ahmed Awadallah, and Sebastian Bubeck\n说话人 1:Ahmed Awadallah, and Sebastian Bubeck\n说话人 1:Awadallah, and Sebastian Bubeck to explore some\n说话人 1:and Sebastian Bubeck to explore some of the most exciting\n说话人 1:Bubeck to explore some of the most exciting new frontiers\n说话人 1:to explore some of the most exciting new frontiers in AI.\n说话人 1:some of the most exciting new frontiers in AI.\n说话人 1:We discussed their aspirations for AI, the research directions\n\n修正后的说话人编号(每行一个编号):"}, {"from": "assistant", "value": "spk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A\nspk_A"}]}
Training uses the finetune_lora_ds.sh shell script; the configuration was not modified.
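For completeness, here is a minimal sketch of collecting samples in this format into a single training file. The file name train.json is only an example, and the assumption that the finetune script's --data_path expects one JSON list of such records is mine, based on the officially documented format:

import json

# Each record holds one dialogue in the documented {"id", "conversations"} layout
samples = [
    {
        "id": "identity_0",
        "conversations": [
            {"from": "user", "value": "你是一个说话人识别助手,..."},  # full prompt as in the example above
            {"from": "assistant", "value": "spk_A\nspk_A\n..."},       # one label per line
        ],
    },
    # ... more samples
]

# Write all samples as one JSON list and point --data_path in finetune_lora_ds.sh at this file
with open("train.json", "w", encoding="utf-8") as f:
    json.dump(samples, f, ensure_ascii=False, indent=2)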

Expected Behavior

The decode code is as follows:
import time

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./Qwen/output_qwen"

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", trust_remote_code=True)
model.eval()

# Step 4: build the prompt and run inference

prompt = "你是一个说话人识别助手,请根据下面的发言内容和聚类结果,修正说话人编号。\n每段发言已经分配了聚类编号(如 0, 1, 2...),但这些编号可能不准确。\n请你重新分配编号为 spk_A、spk_B 等,确保相同说话人编号一致,不需要输出原始文本内容。\n\n说话人 1:My team works across research and product to incubate\n说话人 1:across research and product to incubate emerging technologies\n说话人 1:and product to incubate emerging technologies\n说话人 1:to incubate emerging technologies and runs programs\n说话人 1:emerging technologies and runs programs that connect our\n说话人 1:technologies and runs programs that connect our research at Microsoft\n说话人 1:and runs programs that connect our research at Microsoft\n说话人 1:programs that connect our research at Microsoft to the broader research\n说话人 1:our research at Microsoft to the broader research community.\n说话人 1:I sat down with research leaders, AJ Kumar,\n说话人 1:research leaders, AJ Kumar, Ahmed Awadallah,\n说话人 1:AJ Kumar, Ahmed Awadallah,\n说话人 1:Kumar, Ahmed Awadallah, and Sebastian Bubeck\n说话人 1:Ahmed Awadallah, and Sebastian Bubeck\n说话人 1:Awadallah, and Sebastian Bubeck to explore some\n说话人 1:and Sebastian Bubeck to explore some of the most exciting\n说话人 1:Bubeck to explore some of the most exciting new frontiers\n说话人 1:to explore some of the most exciting new frontiers in AI.\n说话人 1:some of the most exciting new frontiers in AI.\n说话人 1:We discussed their aspirations for AI, the research directions\n\n修正后的说话人编号(每行一个编号):"
start = time.time()
inputs = tokenizer(prompt, return_tensors='pt', max_length=8192, truncation=True)
inputs = inputs.to('cuda:0')
pred = model.generate(**inputs, num_return_sequences=1, repetition_penalty=1.1)
output = tokenizer.decode(pred.cpu()[0], skip_special_tokens=True)
output = output.replace(prompt, "")

print(output)
The expected output is spk_A, ... (one label per line), but the actual output is empty; nothing is printed at all.
Judging from the training loss, the model converges well, down to roughly 0.01. I'd like to ask what could be causing the model to produce empty output at decode time.
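For comparison, below is a minimal decoding sketch that goes through the model's built-in chat interface instead of calling generate() on the raw prompt. It assumes the checkpoint in ./Qwen/output_qwen (with the LoRA weights merged) still ships the official Qwen remote code, so model.chat() and its generation config are available, as they are for the official Qwen-Chat checkpoints; as far as I understand, the official finetune scripts train on the ChatML conversation format, so the chat interface applies the matching template:

from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation import GenerationConfig

model_path = "./Qwen/output_qwen"  # same checkpoint as above

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", trust_remote_code=True).eval()
# Reload the generation config shipped with the checkpoint (stop tokens, chat format, etc.)
model.generation_config = GenerationConfig.from_pretrained(model_path, trust_remote_code=True)

prompt = "你是一个说话人识别助手,..."  # placeholder for the same prompt string built above

# model.chat() wraps the input in the ChatML template before generating
response, history = model.chat(tokenizer, prompt, history=None)
print(response)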

Steps To Reproduce

No response

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):

Anything else?

No response
