Add ELMo modules #299

gpengzhi · 2020-02-24T21:07:05Z

Add texar-styled ELMo encoder adapted from allennlp. The corresponding tokenizer will be in another PR.

Resolve some comments in #298

I checked the implementation of ELMo in allennlp. It seems that they use a customized LSTM, so we cannot use our LSTM module to implement it directly. The Highway module they use is also different from our HighwayWrapper. I feel that it is better to use their implementations directly, since their correctness is guaranteed by their unit tests. Please let me know your thoughts @huzecong

codecov · 2020-02-24T21:31:05Z

Codecov Report

Merging #299 into master will increase coverage by 0.29%.
The diff coverage is 86.64%.

@@            Coverage Diff             @@
##           master     #299      +/-   ##
==========================================
+ Coverage   82.63%   82.93%   +0.29%     
==========================================
  Files         207      215       +8     
  Lines       16006    17271    +1265     
==========================================
+ Hits        13226    14323    +1097     
- Misses       2780     2948     +168

Impacted Files	Coverage Δ
texar/torch/utils/utils_test.py	`99.19% <100%> (+0.25%)`	⬆️
texar/torch/utils/test.py	`93.75% <100%> (+0.41%)`	⬆️
texar/torch/modules/pretrained/__init__.py	`100% <100%> (ø)`	⬆️
texar/torch/modules/encoders/__init__.py	`100% <100%> (ø)`	⬆️
texar/torch/modules/pretrained/elmo_test.py	`47.05% <47.05%> (ø)`
texar/torch/modules/pretrained/elmo.py	`53.84% <53.84%> (ø)`
texar/torch/modules/encoders/elmo_encoder_test.py	`73.33% <73.33%> (ø)`
texar/torch/modules/encoders/elmo_encoder.py	`74.69% <74.69%> (ø)`
texar/torch/utils/utils.py	`80.1% <80.64%> (+0.04%)`	⬆️
texar/torch/modules/pretrained/elmo_utils.py	`84.77% <84.77%> (ø)`
... and 12 more


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 931ead9...1ad8c7d.

huzecong · 2020-02-25T16:49:12Z

I agree that their implementations are a bit different and it's more reliable to use their code as is, but I don't think I would approve of merging into master at this point.

True, if I were to choose between installing allennlp along with a plethora of dependencies, and installing this version of texar, I would choose texar. However, ELMo is not the only module worth using in our package. Introducing 3k lines of code and an additional dependency is a bit too much for a single piece of functionality. I still believe it is possible to somehow rewrite the module to use more existing parts in our package, but I agree that it would mean a lot of work and might not be worth it at this point.

Speaking of too much code, although not related to this PR, I think it's better to move the *_test.py files out of the texar folder and into a separate tests/ folder. This way the end user doesn't have to install the test files when they do pip install texar-pytorch. @ZhitingHu What do you think?
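
To illustrate the packaging point, here is a minimal sketch of how package discovery could skip a top-level tests/ tree. It assumes the project uses setuptools' `find_packages` in its setup.py, which this thread does not confirm; the directory layout and the `discover_packages` and `make_layout` helpers are hypothetical and exist only for the demo. Note that `exclude` filters packages, not individual modules, so *_test.py files living inside texar/ would still ship, which is why the files would need to move first.

```python
import tempfile
from pathlib import Path

from setuptools import find_packages


def discover_packages(root):
    """Return the packages a setup.py would ship, skipping the tests/ tree.

    `exclude` matches package names only, so it cannot drop a single
    *_test.py module that sits inside a shipped package.
    """
    return sorted(find_packages(where=str(root), exclude=["tests", "tests.*"]))


def make_layout(root):
    """Build a hypothetical project layout with tests outside the package."""
    for pkg in ["texar", "texar/torch", "tests", "tests/modules"]:
        directory = root / pkg
        directory.mkdir(parents=True)
        (directory / "__init__.py").touch()
    # A test module that should never be installed for end users.
    (root / "tests" / "modules" / "elmo_test.py").touch()


with tempfile.TemporaryDirectory() as tmp:
    project_root = Path(tmp)
    make_layout(project_root)
    packages = discover_packages(project_root)
    print(packages)
```

In a real setup.py the same `exclude` list would be passed as `packages=find_packages(exclude=[...])`; whether tests/ needs `__init__.py` files at all depends on how the test runner is configured.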

gpengzhi added 2 commits February 20, 2020 18:34

Add ELMo modules

083d62f

Polish ELMo modules

1ad8c7d

gpengzhi requested a review from huzecong February 24, 2020 21:07

gpengzhi mentioned this pull request Feb 26, 2020

Move unit tests out of texar #301

Merged
