-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Language model classes for making predictions (both masked LM and next token LM) #3201
Language model classes for making predictions (both masked LM and next token LM) #3201
Conversation
98ced63
to
6f88e6a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, although having this in the main library is still a very questionable decision in my opinion. We're just creating work for ourselves when there is a clear route to having this in the demo with a separate git repo/package.
@@ -55,7 +55,7 @@ def _read(self, file_path: str): | |||
import sys | |||
# You can call pytest with either `pytest` or `py.test`. | |||
if 'test' not in sys.argv[0]: | |||
raise RuntimeError('_read is only implemented for unit tests at the moment') | |||
logger.error('_read is only implemented for unit tests at the moment') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come these can't be RuntimeErrors
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It worked locally, but the check above didn't work in team city, for reasons I'm not really sure of. I couldn't easily find what sys.argv
would be in team city, and I didn't want to deal with a 20 minute debug cycle to figure it out.
contextual_embeddings = embeddings | ||
|
||
batch_index = torch.arange(0, batch_size).long().unsqueeze(1) | ||
mask_embeddings = contextual_embeddings[batch_index, mask_positions] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could do with a comment that the mask_positions
are the ones we actually want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
It is precisely because of these discussions about what belongs in the library and what doesn't that I suggested splitting out the models entirely. As you know, I agree with this. But we're not set up for that yet, and actually setting things up so that we are, and still get testing and CI right and everything else, is going to take a while. We want to launch AllenNLP Interpret very soon (camera ready deadline is Friday), and so I want this merged now, before having to figure out how everything will work once we split out things into separate repos. |
Thanks, "camera ready deadline is Friday" was the piece of the puzzle I was missing. |
@DeNeutoy is there a tutorial explaining the usage of the same? |
…t token LM) (allenai#3201) * Models, tests, and doc * test fixtures * models/__init__.py * Fix test * docstrings * pylint, other cleanup * mypy; more cleanup * fix imports * change runtime errors to logger.error * pylint * Add comment
These models are only tested for predictions, not for training. The masked LM might be useful for training, but also might just be too slow. The next token LM definitely is not for training. These are noted in the class docstrings.