Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Text Classification Predictor #2745

Merged
merged 7 commits into from Apr 24, 2019
Merged

Conversation

@dasguptar
Copy link
Contributor

dasguptar commented Apr 24, 2019

This PR adds a simple predictor for text classification to AllenNLP.

It can be used in conjunction with allennlp.models.BasicClassifier and allennlp.data.dataset_readers.TextClassificationJsonReader.

dasguptar added 3 commits Apr 24, 2019
Copy link
Collaborator

DeNeutoy left a comment

Thanks for the contribution - it looks like you've got a few more pylint issues to fix. you will also need to add the predictor to the docs. You can do this by editing this file:
https://github.com/allenai/allennlp/blob/master/doc/api/allennlp.predictors.rst

Be careful to be precise about the formatting in the .rst files, as sphinx (the thing that builds our docs) is very particular about whitespace etc (it's annoying).

allennlp/predictors/text_classifier.py Outdated Show resolved Hide resolved
allennlp/predictors/text_classifier.py Outdated Show resolved Hide resolved
data_directory = AllenNlpTestCase.FIXTURES_ROOT / "data" / "text_classification_json"
self.set_up_model(AllenNlpTestCase.FIXTURES_ROOT / 'basic_classifier' / 'experiment_seq2vec.jsonnet',
data_directory / "imdb_corpus.jsonl")
self.set_up_model(self.FIXTURES_ROOT / 'basic_classifier' / 'experiment_seq2vec.jsonnet',

This comment has been minimized.

Copy link
@DeNeutoy

DeNeutoy Apr 24, 2019

Collaborator

the setUp method sets up things you will need for the whole class for the duration of the test, so actually, you can remove these duplicated calls to self.set_up_model now you have pulled it out into the setUp method.

This comment has been minimized.

Copy link
@dasguptar

dasguptar Apr 24, 2019

Author Contributor

It seems that pytest runs test in alphabetical order, so the tests are run in this order:

  • test_forward_pass_runs_correctly
  • test_seq2seq_clf_can_train_save_and_load
  • test_seq2vec_clf_can_train_save_and_load

Right now the 1st and 3rd tests above require the same model, so I added the repeated set_up_model calls everywhere. I'll change it so that tests 1 and 2 above use the same model, so there will still be 1 set_up_model call for the 3rd test.

This comment has been minimized.

Copy link
@DeNeutoy

DeNeutoy Apr 24, 2019

Collaborator

Sorry, I missed that they are different fixtures being set up. How you have it is fine.

allennlp/models/basic_classifier.py Outdated Show resolved Hide resolved
@dasguptar dasguptar force-pushed the dasguptar:text_classifier_predictor branch from 239a730 to 54777f7 Apr 24, 2019
@dasguptar

This comment has been minimized.

Copy link
Contributor Author

dasguptar commented Apr 24, 2019

@DeNeutoy TeamCity passes all checks now. Let me know if there are any other changes you want me to do.
P.S. Not sure about the codecov/path check failing.

@DeNeutoy

This comment has been minimized.

Copy link
Collaborator

DeNeutoy commented Apr 24, 2019

Could you add a test to https://github.com/allenai/allennlp/tree/master/allennlp/tests/predictors for the new predictor? Thanks!

@dasguptar dasguptar force-pushed the dasguptar:text_classifier_predictor branch from 4f6b734 to aee02f0 Apr 24, 2019
@DeNeutoy DeNeutoy merged commit 53b3d37 into allenai:master Apr 24, 2019
3 checks passed
3 checks passed
Pull Requests (AllenNLP Library) TeamCity build finished
Details
codecov/patch 92% of diff hit (target 90%)
Details
codecov/project Absolute coverage decreased by -<1% but relative coverage increased by +<1% compared to 12cee92
Details
reiyw added a commit to reiyw/allennlp that referenced this pull request Nov 12, 2019
* Text Classification Predictor

* Linting fixes

* Linting fixes

* Addressing comments

* Adding doc

* Adding test for text_classifier predictor along with serialization dir
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.