Description
I am running into an issue when using DeepSpeed to fine-tune the StarCoder model. I am following exactly the steps in the article Creating a Coding Assistant with StarCoder (section Fine-tuning StarCoder with DeepSpeed ZeRO-3). However, I get the error “AssertionError: Check batch related parameters. train_batch_size is not equal to micro_batch_per_gpu * gradient_acc_step * world_size 256 != 4 * 8 * 1”. I searched for this error and found a link explaining the cause: [BUG] batch_size check failed with zero 2 (deepspeed v0.9.0) · Issue #3228 · microsoft/DeepSpeed · GitHub. However, even when I use the version of DeepSpeed mentioned there as working (v0.9.0), I get the same error. I tried different versions of deepspeed and accelerate but couldn’t fix the issue. Does anyone have any suggestions? Thanks in advance.
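For reference, the arithmetic behind the failing check can be reproduced by hand. The snippet below is only a minimal sketch in plain Python using the numbers from the error message. The function names `expected_train_batch_size` and `implied_world_size` are hypothetical helpers written for illustration and are not part of DeepSpeed or accelerate.

```python
def expected_train_batch_size(micro_batch_per_gpu, gradient_accumulation_steps, world_size):
    """Product that the error message says train_batch_size must equal."""
    return micro_batch_per_gpu * gradient_accumulation_steps * world_size


def implied_world_size(train_batch_size, micro_batch_per_gpu, gradient_accumulation_steps):
    """Number of processes that would make the check pass, or None if no whole number does."""
    per_process = micro_batch_per_gpu * gradient_accumulation_steps
    if train_batch_size % per_process != 0:
        return None
    return train_batch_size // per_process


# Values copied from the error message: 256 != 4 * 8 * 1
train_batch_size = 256
micro_batch_per_gpu = 4
gradient_accumulation_steps = 8
world_size = 1

actual = expected_train_batch_size(micro_batch_per_gpu, gradient_accumulation_steps, world_size)
print(f"configured train_batch_size = {train_batch_size}, product = {actual}")
print(
    "world_size that would satisfy the check:",
    implied_world_size(train_batch_size, micro_batch_per_gpu, gradient_accumulation_steps),
)
```

With these numbers the product is 32, and the check would only pass with a world size of 8. That suggests the configuration may have been written for 8 processes while only 1 is being launched, but this is an inference from the arithmetic alone and not a confirmed diagnosis.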