Add missing LM SOTA result + # params + prev SOTA #195

Merged
merged 28 commits into sebastianruder:master on Jan 3, 2019

Conversation

Contributor

@cwenner cwenner commented Dec 31, 2018

Add missing LM ensemble which is SOTA for PTB.
Add second-in-line LM SOTA for strict interpretation.
Add number of params for LM results.

(I'm unsure why the PR lists commits that have already been merged.)

For better comparison of results wrt params.

Also normalizes layer qualification.
Fix cited paper for Graves et al and include their primary result; previously only baseline.
Add results for the paper, Adaptive Input Representations for Neural Language Modeling
* Add results for LM paper, Transformer-XL

* Remove superfluous vbar and normalize header

* Add missing authors to item

* Make under-review mark more discreet
Add results for the paper, Adaptive Input Representations for Neural Language Modeling
Add SOTA "ensemble" model for PTB. Add number of params for PTB and WT2
Add second-in-line LM SOTA for a strict interpretation of competing results.
@cwenner
Contributor Author

cwenner commented Dec 31, 2018

@sebastianruder - Not promising anything but would you object to any of these additions?

  • Split LM dynamic evaluation results into separate tables. IMO they are not as relevant for downstream tasks, and it may be more useful to compare results against others obtained with the same evaluation method.
  • Add examples of generated texts for some of the LMs.
  • Possibly add some less popular LM datasets like PANAMA.
  • If there are datasets which still seem relevant, I was thinking of listing embedding comparisons, e.g. the Google analogy test set. Would these work under the Text Similarity page or should they get a new page?

At some point, these tables were generated from yaml data. These data files do not appear to exist anymore.
@sebastianruder
Owner

These all sound great. Thanks for your efforts, @cwenner! For the word similarity and analogy datasets, let's add them under the text similarity page for now. 👍

@cwenner
Contributor Author

cwenner commented Jan 2, 2019

@sebastianruder - Okay, thanks. What's the goal for completeness vs. relevance of the lists? E.g., the more the better; prune all but the top-1 or top-2 non-dominated results (e.g., for LM, params vs. ppl); or keep seminal results for comparison?

Btw this is an unmerged PR.
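
To make "non-dominated" concrete, here is a rough sketch of how such pruning could work for LM results, treating both parameter count and perplexity as lower-is-better. The `Result` type, the function names, and every number below are hypothetical placeholders for illustration, not entries from the actual tables.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    name: str
    params_m: float  # parameter count in millions (hypothetical values)
    ppl: float       # test perplexity, lower is better (hypothetical values)


def dominates(a: Result, b: Result) -> bool:
    """a dominates b if it is no worse on both axes and strictly better on one."""
    no_worse = a.params_m <= b.params_m and a.ppl <= b.ppl
    strictly_better = a.params_m < b.params_m or a.ppl < b.ppl
    return no_worse and strictly_better


def non_dominated(results: List[Result]) -> List[Result]:
    """Keep only results that no other result dominates (the Pareto front)."""
    return [r for r in results if not any(dominates(o, r) for o in results)]


# Made-up entries, only to exercise the filter.
results = [
    Result("model_a", 24.0, 58.0),
    Result("model_b", 24.0, 56.0),  # same size as model_a, lower perplexity
    Result("model_c", 50.0, 54.0),
    Result("model_d", 60.0, 55.0),  # larger and worse than model_c
]

front = non_dominated(results)
for r in front:
    print(r.name, r.params_m, r.ppl)
```

With these placeholder numbers, `model_a` and `model_d` would be pruned, leaving `model_b` and `model_c`. Seminal results could still be kept by hand alongside the front.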

@sebastianruder
Owner

Yep, roughly the top-2 state-of-the-art results, plus seminal methods for comparison.
Happy to merge this one. Other changes can go in a new PR.

@sebastianruder
Owner

Nice work! I'll merge this now. Feel free to submit any other changes in a new PR.

@sebastianruder sebastianruder merged commit e3e7939 into sebastianruder:master Jan 3, 2019
@cwenner
Contributor Author

cwenner commented Jan 4, 2019

I think that's easier - thanks!
