Skip to content
This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Language model classes for making predictions (both masked LM and next token LM) #3201

Merged
merged 11 commits into from
Aug 28, 2019

Conversation

matt-gardner
Copy link
Contributor

@matt-gardner matt-gardner commented Aug 26, 2019

These models are only tested for predictions, not for training. The masked LM might be useful for training, but also might just be too slow. The next token LM definitely is not for training. These are noted in the class docstrings.

Copy link
Contributor

@DeNeutoy DeNeutoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, although having this in the main library is still a very questionable decision in my opinion. We're just creating work for ourselves when there is a clear route to having this in the demo with a separate git repo/package.

@@ -55,7 +55,7 @@ def _read(self, file_path: str):
import sys
# You can call pytest with either `pytest` or `py.test`.
if 'test' not in sys.argv[0]:
raise RuntimeError('_read is only implemented for unit tests at the moment')
logger.error('_read is only implemented for unit tests at the moment')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come these can't be RuntimeErrors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It worked locally, but the check above didn't work in team city, for reasons I'm not really sure of. I couldn't easily find what sys.argv would be in team city, and I didn't want to deal with a 20 minute debug cycle to figure it out.

contextual_embeddings = embeddings

batch_index = torch.arange(0, batch_size).long().unsqueeze(1)
mask_embeddings = contextual_embeddings[batch_index, mask_positions]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could do with a comment that the mask_positions are the ones we actually want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@matt-gardner
Copy link
Contributor Author

It is precisely because of these discussions about what belongs in the library and what doesn't that I suggested splitting out the models entirely. As you know, I agree with this. But we're not set up for that yet, and actually setting things up so that we are, and still get testing and CI right and everything else, is going to take a while. We want to launch AllenNLP Interpret very soon (camera ready deadline is Friday), and so I want this merged now, before having to figure out how everything will work once we split out things into separate repos.

@matt-gardner matt-gardner merged commit d78ac70 into allenai:master Aug 28, 2019
@DeNeutoy
Copy link
Contributor

Thanks, "camera ready deadline is Friday" was the piece of the puzzle I was missing.

@matt-gardner matt-gardner deleted the new_language_models branch August 28, 2019 18:46
@pidugusundeep
Copy link
Contributor

@DeNeutoy is there a tutorial explaining the usage of the same?

reiyw pushed a commit to reiyw/allennlp that referenced this pull request Nov 12, 2019
…t token LM) (allenai#3201)

* Models, tests, and doc

* test fixtures

* models/__init__.py

* Fix test

* docstrings

* pylint, other cleanup

* mypy; more cleanup

* fix imports

* change runtime errors to logger.error

* pylint

* Add comment
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants