
Turn BidirectionalLM into a more-general LanguageModel class #2264

Merged
merged 49 commits into from Jan 8, 2019

Conversation

4 participants
@nelson-liu
Member

commented Jan 2, 2019

Fixes #2255

This PR replaces the BidirectionalLM class with a more-general LanguageModel that can be used in either the unidirectional/forward setting or the bidirectional setting.

It also accordingly replaces the BidirectionalLanguageModelTokenEmbedder with a LanguageModelTokenEmbedder.

Also fixes a bug in the experiment_unsampled.jsonnet config that was preventing a test from actually being unsampled.
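
For illustration, a forward-only model section under the new class would look roughly like the sketch below (written as a Python dict mirroring the jsonnet structure; the registered name and exact fields are a sketch, not the final format):

```python
# Rough sketch of a unidirectional "model" section; field names mirror the
# existing bidirectional configs, with the bidirectional flags set to False.
language_model_config = {
    "type": "language_model",  # assumed registered name for the new class
    "bidirectional": False,
    "text_field_embedder": {
        "token_embedders": {"tokens": {"type": "embedding", "embedding_dim": 200}}
    },
    "contextualizer": {
        "type": "lstm",
        "bidirectional": False,
        "input_size": 200,
        "hidden_size": 200,
        "num_layers": 2,
        "dropout": 0.5,
    },
    "dropout": 0.5,
}
```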

TODO:

  • test the unidirectional case
  • properly deprecate BidirectionalLM and BidirectionalLanguageModelTokenEmbedder
  • check docs for accuracy
  • fix user-facing training configs

nelson-liu added some commits Jan 2, 2019

@nelson-liu

Member Author

commented Jan 2, 2019

I don't think I'll have this finished in time for the upcoming minor release. Thus, it'd be great to merge #2253 before this PR and the next release.

nelson-liu added some commits Jan 2, 2019

@brendan-ai2

Member

commented Jan 3, 2019

I like this idea. My main concern would be minimizing disruption, e.g. keeping trained models working. Please ping me when you're ready for a review!

nelson-liu added some commits Jan 3, 2019

@nelson-liu

Member Author

commented Jan 3, 2019

OK @brendan-ai2, this should be ready to look at! No rush, though. I'm also planning on taking this code and actually running it on some data to ensure that the perplexities look more or less correct.
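
(As a sanity check, perplexity here is just the exponentiated average per-token negative log-likelihood; the numbers below are hypothetical and only illustrate the computation.)

```python
import math

# Hypothetical totals from a validation run, just to show the formula:
# perplexity = exp(total negative log-likelihood / number of predicted tokens).
total_nll = 12345.6
num_tokens = 2500
perplexity = math.exp(total_nll / num_tokens)
print(perplexity)
```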

One question (for anyone): if we deprecate a class, should we keep its tests around, or is that unnecessary?

@nelson-liu

Member Author

commented Jan 3, 2019

To summarize why the diff looks so scary, and hopefully make it easier to review:

  • Moved most of the code in models/bidirectional_lm.py to models/shuffled_sentence_lm.py.
  • Kept the BidirectionalLM class around in models/bidirectional_lm.py for backwards compatibility, but it just forwards its arguments to ShuffledSentenceLM while emitting a deprecation warning (sketch below).
  • Moved most of the code in bidirectional_language_model_token_embedder.py to shuffled_sentence_language_model_token_embedder.py.
  • Kept the BidirectionalLanguageModelTokenEmbedder class around in bidirectional_language_model_token_embedder.py for backwards compatibility, but it just forwards its arguments to ShuffledSentenceLanguageModelTokenEmbedder while emitting a deprecation warning.

To see what actually changed, you probably want to diff what's currently in models/shuffled_sentence_lm.py against models/bidirectional_lm.py as it was before this PR. Similarly, manually diffing shuffled_sentence_language_model_token_embedder.py against the old bidirectional_language_model_token_embedder.py makes it much more reviewable.
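
To make the backwards-compatibility pattern concrete, the shim is roughly along these lines (a simplified sketch, not the exact code; the import path and constructor signature are abbreviated here):

```python
import warnings

# New module added in this PR; path simplified for illustration.
from allennlp.models.shuffled_sentence_lm import ShuffledSentenceLM


class BidirectionalLM(ShuffledSentenceLM):
    """Deprecated: kept so existing configs and trained models keep working;
    use ShuffledSentenceLM directly instead."""

    def __init__(self, *args, **kwargs) -> None:
        warnings.warn(
            "BidirectionalLM is deprecated, please use ShuffledSentenceLM instead.",
            DeprecationWarning,
        )
        # Forward all arguments unchanged to the new class.
        super().__init__(*args, **kwargs)
```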

@matt-gardner

Member

commented Jan 3, 2019

@nelson-liu, on your deprecation question: @schmmd and I talked about this a bit a while ago, and we think it'd be good to eventually turn allennlp-internal warnings into test failures. So any test that emits a deprecation warning would either have to catch/suppress that warning or fail. If you're moving the functionality of a class to a new class, you should also move any applicable tests to the new class. Whether you keep around a test on the deprecated class (one that probably only checks that it emits a deprecation warning) is up to you.
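
In test form that roughly looks like the sketch below (the helper is a stand-in for however the existing test constructs the deprecated model):

```python
import warnings

import pytest


def _construct_deprecated_model():
    # Stand-in for building the deprecated BidirectionalLM in the real test.
    warnings.warn("BidirectionalLM is deprecated", DeprecationWarning)


def test_deprecated_class_emits_warning():
    # Catching the warning explicitly keeps the test passing even once
    # allennlp-internal warnings are turned into test failures.
    with pytest.warns(DeprecationWarning):
        _construct_deprecated_model()
```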

@matt-peters

Contributor

commented Jan 3, 2019

Probably should have looked at this earlier, but I was out over the holidays, so apologies for the late feedback. What is the reasoning for explicitly adding ShuffledSentence to the names? There is nothing inside the BidirectionalLM class that restricts it to shuffled sentences vs. longer contexts, and I have used the calypso version to train on both long contexts and the shuffled Billion Word Benchmark.

The assumptions about shuffling require changes in the data preparation / iterator and in the statefulness vs. non-statefulness of the contextual encoder. So it seems to me the abstractions should be a LanguageModel (with a bidirectional option), plus a StatefulIterator that reads very long contexts, chunks them into smaller pieces, and does the alignment from batch to batch?
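
The chunking half of that suggestion would amount to something like this sketch (names here are hypothetical, not an existing allennlp iterator):

```python
from typing import Iterable, List


def chunk_long_context(tokens: List[str], chunk_size: int) -> Iterable[List[str]]:
    """Yield consecutive fixed-size chunks of one very long context.

    A stateful iterator would emit these chunks in order, keeping each document
    in the same batch slot so the encoder's hidden state can carry over from
    batch to batch.
    """
    for start in range(0, len(tokens), chunk_size):
        yield tokens[start:start + chunk_size]
```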

@nelson-liu

Member Author

commented Jan 3, 2019

That's a good point, @matt-peters, will edit.

nelson-liu added some commits Jan 4, 2019

@nelson-liu

Member Author

commented Jan 4, 2019

@brendan-ai2 this is ready for another look, fyi

@nelson-liu nelson-liu changed the title from "Turn BidirectionalLM into a more-general SentenceShuffledLM" to "Turn BidirectionalLM into a more-general LanguageModel class" Jan 5, 2019

nelson-liu added some commits Jan 6, 2019

nelson-liu added some commits Jan 7, 2019

@brendan-ai2
Member

left a comment

Thanks for all the edits! Really glad to have this PR. :) One final comment, LGTM after that!

@nelson-liu nelson-liu merged commit 088f0bb into allenai:master Jan 8, 2019

2 of 3 checks passed

codecov/patch 88% of diff hit (target 90%)
Pull Requests (AllenNLP Library) TeamCity build finished
codecov/project 92% (+<1%) compared to f76dc70

@nelson-liu nelson-liu deleted the nelson-liu:unidirectional_lm branch Jan 8, 2019

@nelson-liu

Member Author

commented Jan 8, 2019

Thanks for your thorough comments, @brendan-ai2! Much appreciated.

WrRan added a commit to WrRan/allennlp that referenced this pull request Jan 8, 2019

Turn BidirectionalLM into a more-general LanguageModel class (allenai#2264)

DeNeutoy added a commit that referenced this pull request Jan 31, 2019

More help info for `archive_surgery.py` (#2327)

matt-gardner added a commit that referenced this pull request Mar 18, 2019

Move some scripts to `allennlp/allennlp/tools` (#2584)