DeepSpeed launch script:
deepspeed --include localhost:0,1 --master_port 22267 fastchat/train/train_lora.py
--model_name_or_path
--lora_r 8
--lora_alpha 16
--lora_dropout 0.05
--data_path
--output_dir
--num_train_epochs 3
--fp16 True
...
--deepspeed /data/xixiaoyan/FastChat0907/FastChat-main/playground/deepspeed_config_s1.json
--gradient_checkpointing True
--flash_attn False
ds config:
{
  "zero_optimization": {
    "stage": 1,
    "allgather_partitions": true,
    "allgather_bucket_size": 5e8,
    "overlap_comm": true,
    "reduce_scatter": true,
    "reduce_bucket_size": 5e8,
    "contiguous_gradients": true,
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    }
  },
  "contiguous_gradients": true,
  "overlap_comm": true,
  "fp16": {
    "enabled": true
  },
}
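A side note on the config itself: `contiguous_gradients` and `overlap_comm` appear both inside `zero_optimization` and again at the top level, and there is a trailing comma after the `fp16` block, which Python's strict `json` module rejects. The sketch below is a stdlib-only lint for those two patterns. It is not part of FastChat or DeepSpeed; the helper name `lint_ds_config`, the `ZERO_ONLY_KEYS` set, and the inline sample are hypothetical, and treating the top-level copies as misplaced is an assumption, not something confirmed to cause the error reported here.

```python
import json

# Options that the config above sets inside "zero_optimization".
# Flagging them at the top level is an assumption made for this sketch.
ZERO_ONLY_KEYS = {"contiguous_gradients", "overlap_comm"}


def lint_ds_config(text):
    """Return a list of warnings for a DeepSpeed-style JSON config string."""
    warnings = []

    def check_pairs(pairs):
        # Called once per JSON object; records keys repeated within that object.
        seen = set()
        for key, _ in pairs:
            if key in seen:
                warnings.append(f"duplicate key: {key}")
            seen.add(key)
        return dict(pairs)

    config = json.loads(text, object_pairs_hook=check_pairs)
    for key in sorted(ZERO_ONLY_KEYS & config.keys()):
        warnings.append(f"ZeRO option at top level: {key}")
    return warnings


# Trimmed copy of the config above, without the trailing comma.
sample = """
{
  "zero_optimization": {"stage": 1, "overlap_comm": true, "contiguous_gradients": true},
  "contiguous_gradients": true,
  "overlap_comm": true,
  "fp16": {"enabled": true}
}
"""
for warning in lint_ds_config(sample):
    print(warning)
```

Running it on the trimmed sample reports the two top-level ZeRO options. This only checks the shape of the file and says nothing about the dtype mismatch in the traceback.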
But I encounter the following error:
File "/lib/python3.8/site-packages/peft/tuners/lora.py", line 1076, in forward
self.lora_A[self.active_adapter](
File "/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: expected scalar type Float but found Half