
Is it possible to export models? #16

Open
thigm85 opened this issue May 28, 2020 · 11 comments

@thigm85

thigm85 commented May 28, 2020

Is it possible to export models so that we can use it outside of your ranking pipeline? For example BERT models fine tuned on MSMARCO.

@seanmacavaney
Contributor

Yeah, all the model weights and so on are stored in ~/data/onir/models/default/{ranker}/{vocab}/{trainer}/{train_dataset}/weights/ -- there you'll find three files: one with the initial weights, one for the best epoch on the validation set, and one from the last validated epoch (so the pipeline can continue training from there if needed).
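
For example, a minimal sketch of peeking inside one of those weight files from Python (the brace-delimited path components and the trailing file name are placeholders, not real values):

import os
import torch

# Placeholder path: substitute your own {ranker}/{vocab}/{trainer}/{train_dataset}
# values; the file name depends on which of the three checkpoints you want.
weights_path = os.path.expanduser(
    '~/data/onir/models/default/{ranker}/{vocab}/{trainer}/{train_dataset}/weights/...')
state = torch.load(weights_path, map_location='cpu')  # a dict of parameter tensors
print(sorted(state.keys())[:5])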

Is there a particular format you'd need for the export?

@thigm85
Author

thigm85 commented May 29, 2020

Yes, I would like to export the SLEDGE models. But just to be clear, the objective would be to predict with them from Python.

Similar to what we do with the sentence-transformers library. For example:

from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer('bert-base-nli-mean-tokens')
corpus_embeddings = embedder.encode(["this is a sentence"])

I'm not sure if that is possible; I haven't found any documentation.

@seanmacavaney
Contributor

Ah, sorry I misunderstood what you meant by export.

OpenNIR was designed primarily with a CLI in mind. If you have other queries you want to run on the same dataset, I have a quick+dirty suggestion here. There's also the flex dataset if you want an alternative collection of documents as well.

If you're not running an IR experiment (or you'd prefer not to use the CLI), it's possible to create the underlying objects directly. You'll want to use VanillaTransformer and BertVocab for the SLEDGE models. The CLI does a lot of the heavy lifting regarding configuration and so on (e.g., you'll need to set the vocab's bert_base to scibert and so forth manually).
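
For reference, a very rough sketch of what that might look like; the import paths and constructor arguments below are assumptions for illustration, not the documented OpenNIR API, so check the onir source for the real configuration flow:

# Assumed module paths and constructor signatures -- verify against the onir source.
from onir import rankers, vocab

vcb = vocab.BertVocab(...)                # set bert_base='scibert' here, per the note above
ranker = rankers.VanillaTransformer(...)  # the vocab is passed in when building the ranker

The ellipses stand for the configuration the CLI would normally fill in for you.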

@Santosh-Gupta

I was looking for a way to load the model into a Hugging Face transformers model.

These need a config.json file and a model.bin file. I was wondering what format these files are in, and how to convert them into something that can be loaded in HF. I tried this:

from transformers import BertModel
BertModel.from_pretrained('/content/sledge-med.p')

And got

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

@seanmacavaney
Contributor

The sledge-med.p file (and all weight files from OpenNIR, for that matter) is just from torch.save -- a pickle-encoded dict of PyTorch tensors, if I recall correctly. To load it into the transformers library, you'll need to rename some of the parameters, because the VanillaTransformer ranker uses BERT as a sub-module. The transformers config should be the same as the SciBERT config file.
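
For anyone doing the renaming by hand, here's a minimal sketch of the idea; the 'ranker.bert.' prefix is an assumption for illustration (inspect the saved keys to find the real one):

import torch
from transformers import BertConfig, BertForSequenceClassification

state = torch.load('sledge-med.p', map_location='cpu')
# Strip the assumed sub-module prefix so names line up with transformers' BERT modules.
renamed = {k.replace('ranker.bert.', 'bert.'): v for k, v in state.items()}

config = BertConfig.from_pretrained('allenai/scibert_scivocab_uncased')  # SciBERT config, per above
model = BertForSequenceClassification(config)
model.load_state_dict(renamed, strict=False)  # strict=False tolerates leftover name mismatches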

Hope this helps!

@Santosh-Gupta

Has anyone been able to create a Python object of the model? I made an attempt here: https://colab.research.google.com/drive/15Gak_LmwEWPbJo3w_EVG8FyfLMWPwoGh?usp=sharing

But I wasn't able to successfully create the model.

@seanmacavaney
Contributor

You're trying to load up a transformers version of it, right? If so, this should do the trick: https://colab.research.google.com/drive/1t5UdW2Jebue1php888ldDll6yG5jQXQQ?usp=sharing (based on the starting point from the link above).

I haven't tested it on an actual ranking task, but it loads properly, and it worked with a quick toy example.

@seanmacavaney
Contributor

Does this meet your needs too, @thigm85?

@Santosh-Gupta

Thanks Sean, very much appreciated!

@timbmg

timbmg commented Feb 9, 2021

> You're trying to load up a transformers version of it, right? If so, this should do the trick: https://colab.research.google.com/drive/1t5UdW2Jebue1php888ldDll6yG5jQXQQ?usp=sharing (based on the starting point from the link above).
>
> I haven't tested it on an actual ranking task, but it loads properly, and it worked with a quick toy example.

Thanks, I am using that as well. Could you clarify what the scores you show in that example mean? I understand these are the logits of the classification head on the CLS token. Specifically, you show the score of the 0th class; does this correspond to the relevant or the non-relevant score?

@seanmacavaney
Contributor

Hi @timbmg,

Yeah, the 0th class corresponds to the relevance score (using the convention from Nogueira et al.).
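
As a quick illustration of that convention, a toy scoring sketch (it assumes the model and tokenizer objects from the notebook linked above; the query/document text is made up):

import torch

# `model` and `tokenizer` are assumed to come from the notebook linked above.
inputs = tokenizer('covid transmission', 'The virus spreads through droplets.',
                   return_tensors='pt')
with torch.no_grad():
    logits = model(**inputs)[0]
relevance = logits[0, 0].item()  # 0th class = relevance, per the convention above
print(relevance)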

- sean
