
Adding an optional projection layer to ElmoTokenEmbedder #1076

Merged

matt-gardner merged 2 commits into allenai:master from elmo_projection on Apr 12, 2018

Conversation

matt-gardner (Contributor)

@matt-peters FYI. @rajasagashe's experiments with the WikiTables parser found that a projection of the ELMo representations was significantly better than just using dropout. In order to handle the projection cleanly in AllenNLP, it has to be done inside the ElmoTokenEmbedder. This PR adds that.
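
As a rough illustration of what such a change involves (a minimal sketch, not the exact PR diff), here is what a token embedder with an optional projection layer can look like; the class name, `elmo_module`, and `elmo_output_dim` are placeholders, and `projection_dim` stands in for the kind of constructor argument the PR adds:

```python
from typing import Optional

import torch


class ElmoTokenEmbedderSketch(torch.nn.Module):
    """Hypothetical sketch: wraps an ELMo-like module and optionally
    projects its output with a learned linear layer."""

    def __init__(self,
                 elmo_module: torch.nn.Module,  # placeholder for the real ELMo component
                 elmo_output_dim: int,
                 projection_dim: Optional[int] = None) -> None:
        super().__init__()
        self._elmo = elmo_module
        # The projection is only created when a dimension is requested,
        # so configs that omit it keep the old behavior.
        if projection_dim is not None:
            self._projection = torch.nn.Linear(elmo_output_dim, projection_dim)
            self._output_dim = projection_dim
        else:
            self._projection = None
            self._output_dim = elmo_output_dim

    def get_output_dim(self) -> int:
        return self._output_dim

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch_size, num_tokens, elmo_output_dim)
        embeddings = self._elmo(inputs)
        if self._projection is not None:
            # Project each token representation:
            # (batch_size, num_tokens, projection_dim)
            embeddings = self._projection(embeddings)
        return embeddings
```

In an AllenNLP experiment config this would presumably be enabled by adding a projection-size key (e.g. `"projection_dim": 512`, a value chosen here purely for illustration) to the `elmo` entry under `token_embedders`.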

DeNeutoy (Contributor) left a comment:


LGTM

matt-peters (Contributor) left a comment:


LGTM, thanks for the PR, Matt! Awesome to hear it improved the WikiTables parser. Just curious: what is the relative improvement in the parser for projection ELMo vs. dropout ELMo vs. no ELMo?

matt-gardner (Contributor, Author)

Adding ELMo with just dropout decreased performance by ~4%; adding ELMo with a projection layer increased performance by ~3%. These numbers all use a preliminary model that doesn't quite match the original parser yet, so I'll have a clearer picture once I've reproduced the original numbers.

matt-gardner merged commit b72c838 into allenai:master on Apr 12, 2018, and deleted the elmo_projection branch the same day.
gabrielStanovsky pushed a commit to gabrielStanovsky/allennlp that referenced this pull request on Sep 7, 2018.