
Load BERT-esque checkpoints in pytorch formats #20

Open
jonathanbratt opened this issue Sep 13, 2019 · 3 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)

Comments


jonathanbratt (Owner) commented Sep 13, 2019

The original BERT checkpoints released by Google are in a TensorFlow format.
It seems that most of the related work done by other teams uses the PyTorch implementation.
In particular, pre-trained models such as RoBERTa and DistilBERT have been released for PyTorch.

Many of these models are compatible with the BERT architecture, though possibly with different parameters or vocabularies. It would be great to be able to easily load these into RBERT.

jonathanbratt added the help wanted and enhancement labels on Sep 13, 2019
jonathanbratt (Owner, Author) commented:

One possible approach is to write some code that converts PyTorch models into TensorFlow checkpoints; at that point the existing RBERT code should be able to load and use them. I don't know how to do this, though. Anybody with more PyTorch experience want to give this a shot?
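As a starting point, much of the work in such a converter is renaming parameters: the PyTorch BERT implementations name weights differently than Google's original TensorFlow checkpoints. Below is a rough sketch of that name mapping, assuming Hugging Face-style PyTorch parameter names (e.g. `bert.encoder.layer.0.attention.self.query.weight`); the exact names vary by implementation, so treat this as illustrative rather than complete. Note that the tensors themselves also need handling (dense-layer weight matrices are transposed between the two formats), which is not shown here.

```python
import re

def pt_to_tf_name(pt_name: str) -> str:
    """Map a Hugging Face-style PyTorch BERT parameter name to the
    variable naming used in Google's original TensorFlow checkpoints.
    Sketch only: assumes HF-style names; real converters need more cases."""
    name = pt_name
    # layer indices: "layer.3" -> "layer_3"
    name = re.sub(r"layer\.(\d+)", r"layer_\1", name)
    # LayerNorm parameters are called gamma/beta in the TF checkpoints
    name = name.replace("LayerNorm.weight", "LayerNorm.gamma")
    name = name.replace("LayerNorm.bias", "LayerNorm.beta")
    # dense-layer weight matrices are called "kernel" in TF
    # (embedding tables keep the plain "weight"/name form)
    if name.endswith(".weight") and "embeddings" not in name:
        name = name[: -len(".weight")] + ".kernel"
    # PyTorch scopes with dots, TF checkpoints with slashes
    return name.replace(".", "/")

# Example:
# pt_to_tf_name("bert.encoder.layer.0.attention.self.query.weight")
#   -> "bert/encoder/layer_0/attention/self/query/kernel"
```

With a mapping like this, a converter could walk the PyTorch `state_dict()`, rename each tensor, transpose the dense kernels, and write the result out as TF variables with a checkpoint saver.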

jonathanbratt (Owner, Author) commented:

Related: the SciBERT checkpoints have been released in TensorFlow format, so those are already usable in RBERT.
https://github.com/allenai/scibert
