The original BERT checkpoints released by Google are in a TensorFlow format.
Most of the related work by other teams appears to be implemented in PyTorch.
In particular, pre-trained models such as RoBERTa and DistilBERT have been released for PyTorch.
Many of these models are compatible with the BERT architecture, though possibly with different parameters or vocabularies. It would be great to be able to easily load these into RBERT.
One possible approach is to write code that converts PyTorch models into TensorFlow checkpoints, at which point the existing loading code should just work. I don't know how to do this, though. Anybody with more PyTorch experience want to give this a shot?
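For what it's worth, here is a minimal Python sketch of that conversion idea, assuming the `transformers` and `tensorflow` packages are available. It loads a Hugging Face PyTorch BERT model, remaps parameter names to the variable layout of the original Google TF checkpoints, and writes a TF 1.x-style checkpoint. The name mapping and the `bert-base-uncased` example are illustrative only and would likely need adjustment for models whose architecture diverges from vanilla BERT (e.g. RoBERTa's embeddings and vocabulary differ).

```python
# Sketch: convert a Hugging Face PyTorch BERT model into a TF 1.x-style
# checkpoint laid out like the original Google BERT release.
# Assumes: pip install transformers tensorflow
import tensorflow.compat.v1 as tf
from transformers import BertModel

tf.disable_eager_execution()


def to_tf_name(pt_name: str) -> str:
    """Map a PyTorch parameter name to the original BERT TF variable name."""
    name = pt_name.replace("layer.", "layer_")
    if "embeddings" in name and "LayerNorm" not in name and name.endswith(".weight"):
        # Embedding tables are plain variables, e.g.
        # embeddings.word_embeddings.weight -> bert/embeddings/word_embeddings
        name = name[: -len(".weight")]
    else:
        name = name.replace("LayerNorm.weight", "LayerNorm.gamma")
        name = name.replace("LayerNorm.bias", "LayerNorm.beta")
        name = name.replace(".weight", ".kernel")
    return "bert/" + name.replace(".", "/")


def pytorch_to_tf_checkpoint(model_name: str, output_path: str) -> None:
    model = BertModel.from_pretrained(model_name)
    tf.reset_default_graph()
    with tf.Session() as sess:
        for pt_name, tensor in model.state_dict().items():
            if pt_name.endswith("position_ids"):
                continue  # registered buffer, not a weight
            array = tensor.numpy()
            tf_name = to_tf_name(pt_name)
            if tf_name.endswith("/kernel") and array.ndim == 2:
                # nn.Linear stores [out, in]; TF dense kernels are [in, out].
                array = array.T
            tf.Variable(array, name=tf_name)
        sess.run(tf.global_variables_initializer())
        tf.train.Saver().save(sess, output_path)


pytorch_to_tf_checkpoint("bert-base-uncased", "./bert_model.ckpt")
```

Alternatively, if a TF2/Keras model is acceptable rather than a TF1 checkpoint, the `transformers` TF model classes can load PyTorch weights directly (e.g. `TFBertModel.from_pretrained(..., from_pt=True)`), but that would not produce the checkpoint format the current loading code expects.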