Rust native tensor conversion #171

Closed
guillaume-be opened this issue Mar 8, 2020 · 4 comments

Comments

@guillaume-be
Contributor

Hello,

I have developed an implementation of transformer-based language models (including BERT, DistilBERT, RoBERTa, GPT, GPT2) and ready-to-use NLP applications such as classification or question answering (https://github.com/guillaume-be/rust-bert). Pretrained models from Hugging Face (https://github.com/huggingface/transformers) are available. In order to use them I created a set of Python utility scripts to download and convert the data (for example https://github.com/guillaume-be/rust-bert/blob/master/utils/download-dependencies_bert.py) following your advice.

An interested user raised an issue asking whether it would be possible to avoid Python altogether and perform the download and conversion in Rust (guillaume-be/rust-bert#12).

I took a quick look and believe this would involve opening the PyTorch (pickled) binary files. The PyTorch script for deserializing them is fairly complex. Have you evaluated the possibility of opening these files directly in Rust, for example using the serde-pickle crate?

Thank you,

@LaurentMazare
Owner

Your transformer library looks pretty amazing, congrats!
As for the weight files, I see that you're converting from .npz to .ot, which seems like the easy way to do it. Did you consider distributing the .ot files directly rather than having users run the conversion? That's what I did for the various vision models, as it avoids end users needing a working PyTorch install.
Besides this, I don't have much experience with serde-pickle. You would have to check that it works properly on NumPy arrays, which may actually be the tricky bit.

@danieldk
Contributor

danieldk commented Apr 16, 2020

@guillaume-be Oh, that's really nice! I have also ported some stuff from HF transformers:

https://github.com/stickeritis/sticker-transformers

But I should probably rebase onto your implementation since you already support more models ;). So far I have also been using Python scripts to convert models to HDF5 (and load the non-finetuned models from HDF5).

@guillaume-be
Contributor Author

@danieldk very nice work on your transformer port, and the library it integrates into. Happy to provide help wherever I can if you'd like to re-use some portions of this port!

@guillaume-be
Contributor Author

The solution was to distribute the .ot files for direct use by the users. Thank you for your help!
