
Train on GPU instead of TPU - different distribution strategies #16

Closed
PhilipMay opened this issue Apr 3, 2021 · 2 comments

@PhilipMay
Contributor

Hi,
many thanks for this nice new model type and your research.
We would like to train a ConvBERT model on GPUs rather than TPUs.
Do you have any experience or tips on how to do this?
We have concerns regarding the different distribution strategies
between GPUs and TPUs.
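
To make the concern concrete, here is a generic TensorFlow 2 sketch (not code from this repository) of the strategy switch we are unsure about: on TPU one usually builds the model under a TPUStrategy scope, while on GPUs one would use MirroredStrategy instead.

```python
# Generic TensorFlow 2 sketch (not this repository's code) of the
# distribution-strategy difference between TPU and multi-GPU training.
import tensorflow as tf

def make_strategy(use_tpu, tpu_name=None):
    if use_tpu:
        # TPU path: resolve the TPU cluster, then build a TPUStrategy.
        resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu=tpu_name)
        tf.config.experimental_connect_to_cluster(resolver)
        tf.tpu.experimental.initialize_tpu_system(resolver)
        return tf.distribute.TPUStrategy(resolver)
    # GPU path: MirroredStrategy replicates the model on all visible GPUs
    # and all-reduces the gradients.
    return tf.distribute.MirroredStrategy()

strategy = make_strategy(use_tpu=False)
with strategy.scope():
    # Build the model and optimizer inside the strategy scope as usual.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
```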

Thanks
Philip

@PhilipMay PhilipMay changed the title Train on GPU instead of TPU Train on GPU instead of TPU - different distribution strategies Apr 3, 2021
@PhilipMay
Contributor Author

Well, in the README you write:

The code is tested on a V100 GPU.

Does this mean that pretraining on multiple GPUs is supported?

@zihangJiang
Collaborator

Hi, thanks for your interest.
Our code is only tested on a single V100 GPU. If you are looking for multi-GPU (rather than TPU) training support, you may refer to https://huggingface.co/transformers/model_doc/convbert.html, which implements our model in PyTorch.
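
Something along these lines might be a starting point (a rough sketch only: it uses a plain masked-LM objective via the transformers Trainer, not our ELECTRA-style pretraining, and the checkpoint name, dataset, and hyperparameters are placeholders you should replace). The Trainer uses all visible GPUs automatically, or DistributedDataParallel when launched with torchrun.

```python
# Rough sketch: multi-GPU masked-LM training of ConvBERT with the
# Hugging Face transformers Trainer. Checkpoint name and data are
# placeholders; adapt them to your own pretraining setup.
from transformers import (
    ConvBertForMaskedLM,
    ConvBertTokenizerFast,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "YituTech/conv-bert-base"  # assumed checkpoint; pick your own
tokenizer = ConvBertTokenizerFast.from_pretrained(model_name)
model = ConvBertForMaskedLM.from_pretrained(model_name)

# Tiny placeholder dataset; replace with your own tokenized corpus.
texts = ["ConvBERT mixes span-based dynamic convolution with self-attention."]
train_dataset = [
    {"input_ids": ids}
    for ids in tokenizer(texts, truncation=True, max_length=128)["input_ids"]
]

# The collator pads batches and creates the masked-LM labels on the fly.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="convbert-mlm",
        per_device_train_batch_size=8,
        num_train_epochs=1,
    ),
    data_collator=collator,
    train_dataset=train_dataset,
)
trainer.train()
```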
