DeepSpeed ZeRO-3 is not compatible with `low_cpu_mem_usage=True` or with passing a `device_map` #306
Hello @rafael-ariascalles, as the error suggests, DeepSpeed can't be used together with `device_map` or `low_cpu_mem_usage`. The reason is that `device_map`/`low_cpu_mem_usage` lead to naive model pipeline parallelism, while DeepSpeed is meant for sharded data parallelism. These two can't be combined because of the way they are implemented. INT8 + DeepSpeed also isn't supported. You can use PEFT + gradient checkpointing + DeepSpeed ZeRO-3 for your use case.
Thanks, I'll try it that way.
Hi, I am also getting the same error. I have also installed accelerate from pip, but nothing seems to be working. Please help me, as I am a newbie.

```
Cell In[84], line 13
File c:\Users\user\Desktop\ChatPDF\venv\Lib\site-packages\transformers\models\auto\auto_factory.py:493, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
File c:\Users\user\Desktop\ChatPDF\venv\Lib\site-packages\transformers\modeling_utils.py:2251, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
ImportError: Using
```
I am trying to train a flan-t5-xxl model following "INT8 training of large models in Colab using PEFT LoRA and bits_and_bytes", but using multiple GPUs with DeepSpeed and Accelerate.
I instantiate the model as:
but I get the following error:
the deepspeed config:
Any advice, @pacman100?
I was following the advice in thread #93 (comment) to add `device_map`.
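For reference, this is the general shape of a ZeRO-3 DeepSpeed JSON config as described in DeepSpeed's documentation. This is an illustrative sketch, not the config the poster used (which is not shown above); the offload settings and `"auto"` values are assumptions that would need tuning for a real run.

```json
{
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu" },
    "offload_param": { "device": "cpu" },
    "overlap_comm": true,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "bf16": { "enabled": "auto" },
  "gradient_accumulation_steps": "auto",
  "train_micro_batch_size_per_gpu": "auto"
}
```

Note that per the maintainer's answer above, such a config cannot be combined with `device_map` or `load_in_8bit` at model load time.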