Bug: can not use pretrained BERT on multiple GPUs with DataParallel (PyTorch 1.5.0) #4189
Python: 3.6.10
PyTorch: 1.5.0
Transformers: 2.8.0 and 2.9.0

In the following code, I wrap the pretrained BERT with a DataParallel wrapper so as to run it on multiple GPUs:
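(The snippet itself was not preserved in this page; below is a minimal sketch of such a wrapper rather than the reporter's exact code. It assumes the bert-base-uncased checkpoint and at least two visible GPUs.)

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Wrap the model so each batch is split across all visible GPUs.
model = torch.nn.DataParallel(model)
model.to("cuda")

# Hypothetical toy batch; any padded batch of token ids would do.
inputs = tokenizer.batch_encode_plus(
    ["first example", "second example"],
    pad_to_max_length=True,
    return_tensors="pt",
)

# The forward pass is where the error shows up under PyTorch 1.5.0;
# the same call works on a single GPU or under PyTorch 1.4.0.
outputs = model(
    inputs["input_ids"].to("cuda"),
    attention_mask=inputs["attention_mask"].to("cuda"),
)
last_hidden_state = outputs[0]  # (batch, seq_len, hidden) in transformers 2.x
```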
But I got an error. It works if I remove the DataParallel wrapper.

Comments

By the way, when I downgrade PyTorch 1.5.0 to 1.4.0, the error disappears.
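A downgrade along those lines, assuming a pip-managed environment:

```
pip install torch==1.4.0
```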
erikchwang changed the title from "How to use pretrained BERT on multiple GPUs with DataParallel?" to "Bug: can not use pretrained BERT on multiple GPUs with DataParallel" on May 7, 2020.
erikchwang changed the title to "Bug: can not use pretrained BERT on multiple GPUs with DataParallel (PyTorch 1.5.0)" on May 7, 2020.
The same issue: #3936
Closing in favor of #3936