
baichuan-13b-chat SFT fine-tuning: loss does not decrease #188

Open
xiaohuihwh opened this issue Sep 21, 2023 · 1 comment

xiaohuihwh commented Sep 21, 2023

Fine-tuned both Baichuan 1 and Baichuan 2 with LLaMA-Efficient-Tuning; the loss does not decrease in either case.

xiaohuihwh (Author) commented:

Fine-tuning baichuan1 and baichuan2 with DeepSpeed on two 4090s, the loss starts at 1.9, and after a few dozen steps it settles around 1.5 and stops decreasing.
The parameters are as follows.
deepspeed --num_gpus 2 --master_port=9901 src/train_bash.py \
    --deepspeed ds_config_stage3.json \
    --stage sft \
    --model_name_or_path /root/autodl-tmp/baichuan-13b-chat/ \
    --do_train \
    --dataset alpaca_gpt4_zh \
    --template baichuan2 \
    --finetuning_type lora \
    --lora_target W_pack \
    --output_dir src/output/sft-0921 \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 500 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --plot_loss \
    --bf16
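
As a sanity check (not from the issue itself): with --finetuning_type lora and --lora_target W_pack, only Baichuan's packed QKV projection receives LoRA adapters. A minimal PEFT sketch to confirm the adapters attach and a non-zero fraction of parameters is trainable; the rank/alpha/dropout values below are assumed illustrative defaults, not taken from the issue:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Model path taken from the command above; Baichuan needs trust_remote_code.
model = AutoModelForCausalLM.from_pretrained(
    "/root/autodl-tmp/baichuan-13b-chat/",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)

# Assumed LoRA hyperparameters (r/alpha/dropout are illustrative, not the poster's).
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["W_pack"],  # matches --lora_target W_pack
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # should report a small, non-zero trainable %
```

Note that with these flags the effective batch size is 4 (per device) × 4 (gradient accumulation) × 2 (GPUs) = 32.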
The DeepSpeed configuration file is as follows.
[screenshot: ds_config_stage3.json]
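
The screenshot's contents are not recoverable from the text. For reference, a typical ZeRO stage-3 config used with the Hugging Face Trainer integration looks like the sketch below; every value here is an assumption, not the poster's actual ds_config_stage3.json:

```json
{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
```

The "auto" values defer to the command-line arguments through the Trainer integration, which avoids conflicts between the JSON file and the flags above.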

xiaohuihwh changed the title from "baichuan-13b-chat sft" to "baichuan-13b-chat SFT fine-tuning: loss does not decrease" on Sep 21, 2023