Reminder
System Info
pytorch:2.1.0-cuda11.8
Reproduction
bf16: true
cutoff_len: 1024
dataset: EE_instruction_message
dataset_dir: data
ddp_timeout: 180000000
do_train: true
eval_steps: 100
eval_strategy: steps
finetuning_type: lora
flash_attn: fa2
gradient_accumulation_steps: 8
include_num_input_tokens_seen: true
learning_rate: 0.0003
logging_steps: 5
lora_alpha: 32
lora_dropout: 0.1
lora_rank: 8
lora_target: all
lr_scheduler_type: cosine
max_grad_norm: 1.0
max_samples: 100000
model_name_or_path: src/llamafactory/model/model/glm4-chat
num_train_epochs: 2.0
optim: adamw_torch
output_dir: saves/GLM-4-9B-Chat/lora/train_2024-06-24-23-30-00
packing: false
per_device_eval_batch_size: 1
per_device_train_batch_size: 1
plot_loss: true
preprocessing_num_workers: 16
report_to: none
save_steps: 100
stage: sft
template: glm4
val_size: 0.2
warmup_steps: 0.01
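For comparison, a minimal inference config that loads the trained LoRA adapter on top of the base model (a sketch assuming the standard LLaMA-Factory config keys; the file name `inference.yaml` is hypothetical, and the paths mirror the training run above):

```yaml
# Hypothetical inference.yaml for `llamafactory-cli chat inference.yaml`,
# assuming standard LLaMA-Factory keys; paths mirror the training config above.
model_name_or_path: src/llamafactory/model/model/glm4-chat
adapter_name_or_path: saves/GLM-4-9B-Chat/lora/train_2024-06-24-23-30-00
template: glm4
finetuning_type: lora
```

If generations are still empty with a config like this, it is worth checking that `template: glm4` matches the template used during training and that the adapter path points at a checkpoint containing `adapter_model.safetensors`.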
Expected behavior
After loading the fine-tuned glm4 model, it does not generate answers: no matter what input is given, the model returns blank content.
The loss in the training logs decreases steadily, and the evaluation loss decreases as well, yet the results produced by the predict script are all empty (0).
Others
No response