
baichuan-13b-chat SFT fine-tuning: loss does not decrease #188

Open
xiaohuihwh opened this issue Sep 21, 2023 · 1 comment

xiaohuihwh commented Sep 21, 2023

Fine-tuned both Baichuan 1 and Baichuan 2 with LLaMA-Efficient-Tuning; the loss does not decrease in either case.

xiaohuihwh (Author) commented:

Fine-tuning baichuan1 and baichuan2 with DeepSpeed on two 4090s, the loss starts at 1.9, and after a few dozen steps it settles around 1.5 and stops decreasing.
The parameters are as follows.
deepspeed --num_gpus 2 --master_port=9901 src/train_bash.py \
    --deepspeed ds_config_stage3.json \
    --stage sft \
    --model_name_or_path /root/autodl-tmp/baichuan-13b-chat/ \
    --do_train \
    --dataset alpaca_gpt4_zh \
    --template baichuan2 \
    --finetuning_type lora \
    --lora_target W_pack \
    --output_dir src/output/sft-0921 \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 500 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --plot_loss \
    --bf16
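
As a sanity check (not from the issue itself): with --finetuning_type lora and --lora_target W_pack, only Baichuan's packed QKV projection receives LoRA adapters. A minimal PEFT sketch to confirm the adapters attach and a non-zero fraction of parameters is trainable; the rank/alpha/dropout values below are assumed illustrative defaults, not taken from the issue:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Model path taken from the command above; Baichuan needs trust_remote_code.
model = AutoModelForCausalLM.from_pretrained(
    "/root/autodl-tmp/baichuan-13b-chat/",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)

# Assumed LoRA hyperparameters (r/alpha/dropout are illustrative, not the poster's).
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["W_pack"],  # matches --lora_target W_pack
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # should report a small, non-zero trainable %
```

Note that with these flags the effective batch size is 4 (per device) × 4 (gradient accumulation) × 2 (GPUs) = 32.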
The DeepSpeed configuration file is as follows.
[screenshot: ds_config_stage3.json]
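
The screenshot's contents are not recoverable from the text. For reference, a typical ZeRO stage-3 config used with the Hugging Face Trainer integration looks like the sketch below; every value here is an assumption, not the poster's actual ds_config_stage3.json:

```json
{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
```

The "auto" values defer to the command-line arguments through the Trainer integration, which avoids conflicts between the JSON file and the flags above.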

xiaohuihwh changed the title from "baichuan-13b-chat sft" to "baichuan-13b-chat SFT fine-tuning: loss does not decrease" on Sep 21, 2023