Reminder
System Info
pytorch:2.1.0-cuda11.8
Reproduction
bf16: true
cutoff_len: 1024
dataset: EE_instruction_message
dataset_dir: data
ddp_timeout: 180000000
do_train: true
eval_steps: 100
eval_strategy: steps
finetuning_type: lora
flash_attn: fa2
gradient_accumulation_steps: 8
include_num_input_tokens_seen: true
learning_rate: 0.0003
logging_steps: 5
lora_alpha: 32
lora_dropout: 0.1
lora_rank: 8
lora_target: all
lr_scheduler_type: cosine
max_grad_norm: 1.0
max_samples: 100000
model_name_or_path: src/llamafactory/model/model/glm4-chat
num_train_epochs: 2.0
optim: adamw_torch
output_dir: saves/GLM-4-9B-Chat/lora/train_2024-06-24-23-30-00
packing: false
per_device_eval_batch_size: 1
per_device_train_batch_size: 1
plot_loss: true
preprocessing_num_workers: 16
report_to: none
save_steps: 100
stage: sft
template: glm4
val_size: 0.2
warmup_steps: 0.01
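For comparison, a minimal inference config that loads the trained LoRA adapter on top of the base model (a sketch assuming the standard LLaMA-Factory config keys; the file name `inference.yaml` is hypothetical, and the paths mirror the training run above):

```yaml
# Hypothetical inference.yaml for `llamafactory-cli chat inference.yaml`,
# assuming standard LLaMA-Factory keys; paths mirror the training config above.
model_name_or_path: src/llamafactory/model/model/glm4-chat
adapter_name_or_path: saves/GLM-4-9B-Chat/lora/train_2024-06-24-23-30-00
template: glm4
finetuning_type: lora
```

If generations are still empty with a config like this, it is worth checking that `template: glm4` matches the template used during training and that the adapter path points at a checkpoint containing `adapter_model.safetensors`.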
Expected behavior
After loading the fine-tuned glm4 model, it does not generate answers: no matter what input is given, the model returns blank content.
The loss in the training logs decreases steadily, and the evaluation loss decreases as well, yet the results produced by the predict script are all empty (0).
Others
No response