Mistral-train error on deepspeed config #35

Open
xiechengmude opened this issue Nov 2, 2023 · 1 comment

Comments

@xiechengmude

File "/workspace/long/yarn/finetune.py", line 143, in main
model = accelerator.prepare(model)
File "/root/miniconda3/envs/yarn/lib/python3.10/site-packages/accelerate/accelerator.py", line 1280, in prepare
result = self._prepare_deepspeed(*args)
File "/root/miniconda3/envs/yarn/lib/python3.10/site-packages/accelerate/accelerator.py", line 1515, in _prepare_deepspeed
raise ValueError(
ValueError: When using DeepSpeed accelerate.prepare() requires you to pass at least one of training or evaluation dataloaders or alternatively set an integer value in train_micro_batch_size_per_gpu in the deepspeed config file or assign integer value to AcceleratorState().deepspeed_plugin.deepspeed_config['train_micro_batch_size_per_gpu'].
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1523510) of binary: /root/miniconda3/envs/yarn/bin/python
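
A minimal sketch of the fix, following the two options the error message itself suggests. Everything below (the batch size of 1, the stand-in model) is an assumption; adapt it to the actual values in finetune.py:

```python
import torch
from accelerate import Accelerator
from accelerate.state import AcceleratorState

accelerator = Accelerator()

# Option 1: set an integer micro-batch size on the DeepSpeed plugin before
# calling prepare(), as the error message suggests. The value 1 is a
# placeholder; use your real per-GPU batch size.
deepspeed_plugin = AcceleratorState().deepspeed_plugin
if deepspeed_plugin is not None:
    deepspeed_plugin.deepspeed_config["train_micro_batch_size_per_gpu"] = 1

model = torch.nn.Linear(8, 8)  # stand-in for the model built in finetune.py
model = accelerator.prepare(model)

# Option 2: instead of setting the config key, pass the training dataloader
# to prepare() together with the model, so accelerate can read the batch
# size from it:
#   model, train_dataloader = accelerator.prepare(model, train_dataloader)
```

Setting "train_micro_batch_size_per_gpu" to an integer (rather than "auto") directly in the DeepSpeed JSON config file should work as well; "auto" can only be resolved when a dataloader is passed to prepare().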

@18140663659

File "/workspace/long/yarn/finetune.py", line 143, in main model = accelerator.prepare(model) File "/root/miniconda3/envs/yarn/lib/python3.10/site-packages/accelerate/accelerator.py", line 1280, in prepare result = self._prepare_deepspeed(*args) File "/root/miniconda3/envs/yarn/lib/python3.10/site-packages/accelerate/accelerator.py", line 1515, in _prepare_deepspeed raise ValueError( ValueError: When using DeepSpeed accelerate.prepare() requires you to pass at least one of training or evaluation dataloaders or alternatively set an integer value in train_micro_batch_size_per_gpu in the deepspeed config fileor assign integer value to AcceleratorState().deepspeed_plugin.deepspeed_config['train_micro_batch_size_per_gpu']. ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1523510) of binary: /root/miniconda3/envs/yarn/bin/python

Same question. Has this problem been resolved?
