
Example for using masked LM inference on pretrained model? #45

Closed
moscow25 opened this issue Jan 13, 2020 · 2 comments
@moscow25

Hi,

Thanks for releasing the models and experimental configurations. What's not clear to me is something simple -- how to download a pretrained model and simply use it to predict masked-LM spans from pre-training. It appears the Colab link registers a new task and fine-tunes on it. Are there instructions for simply taking a model after pre-training and using it directly on new examples?

Thanks!

@craffel
Collaborator

craffel commented Jan 13, 2020

Hey Nikolai, here are instructions on how to decode from a model:
https://github.com/google-research/text-to-text-transfer-transformer#decode
However, that code path assumes you have a text file of preprocessed text strings to feed into the model, whereas the denoising objective we used operates on tokens directly. If you wanted to do this, I think you'd need to call decode() directly
https://github.com/tensorflow/mesh/blob/master/mesh_tensorflow/transformer/utils.py#L816
with an input_fn like the one created for training
https://github.com/google-research/text-to-text-transfer-transformer/blob/master/t5/models/mesh_transformer.py#L39
with an appropriate gin file for the denoising preprocessor:
https://github.com/google-research/text-to-text-transfer-transformer/blob/master/t5/models/gin/objectives/span_3_15_u_u.gin
In other words, this isn't well supported! I don't foresee us spending time to make this easier, either.
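
To make that concrete, here is a minimal sketch of what that decode() call might look like -- not a supported recipe. `estimator`, `input_fn`, and `vocab` are stand-ins for objects built the same way the training code path builds them (per the links above), so treat the exact signatures as assumptions and double-check against utils.py:

```python
# Unsupported sketch: decode span-corrupted examples from a pre-trained model.
# Assumes `estimator`, `input_fn`, and `vocab` have been built the same way
# the training code path builds them (see the links above).
from mesh_tensorflow.transformer import utils

# The denoising objective operates on sentinel-marked spans, e.g.
#   inputs:  "Thank you <extra_id_0> me to your party <extra_id_1> week."
#   targets: "<extra_id_0> for inviting <extra_id_1> last <extra_id_2>"
# so input_fn must yield examples already tokenized into that form,
# with the span-corruption gin config applied.

# utils.decode runs estimator.predict and detokenizes the outputs.
for prediction in utils.decode(estimator, input_fn, vocab):
    print(prediction)
```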

Another option would be to fine-tune the model on a different fill-in-the-blank task; for example, you could download a bunch of text, process it yourself, construct a TSV file, and then use the colab (or another pipeline) to fine-tune the model on that TSV.
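
To illustrate that second route, here is a small sketch of building such a TSV. The single-blank masking scheme below is made up for illustration and isn't the exact span-corruption objective we used in pre-training:

```python
# Hypothetical sketch: build a fill-in-the-blank TSV for fine-tuning.
# The single-blank scheme here is illustrative only, not the exact
# span-corruption objective used during pre-training.
import csv
import random

sentences = [
    "The quick brown fox jumps over the lazy dog.",
    "T5 casts every NLP problem as text-to-text.",
]

with open("fill_in_the_blank.tsv", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    for sentence in sentences:
        words = sentence.split()
        i = random.randrange(len(words))
        blanked, answer = list(words), words[i]
        blanked[i] = "<extra_id_0>"  # sentinel token marks the blank
        # One tab-separated (input, target) pair per row.
        writer.writerow([" ".join(blanked), f"<extra_id_0> {answer}"])
```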

craffel closed this as completed Jan 13, 2020
@moscow25
Author

Thanks @craffel! Those are helpful links. Quite a sea of .gin files, so it's good to know where to look. Calling decode() directly makes sense.

I will of course want to fine-tune on my own fill-in-the-blank task as well; I just wanted to get started with something that works out of the box. Much appreciated!
