Add language models from Baevski & Auli (2018)
Showing 15 changed files with 288 additions and 83 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,2 @@ | ||
*/* | ||
!*/*.sh | ||
!*/*.md |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# Language Modeling with Gated Convolutional Networks (Dauphin et al., 2017) | ||
|
||
## Example usage | ||
|
||
See the [language modeling README](../README.md) for instructions on reproducing results for WikiText-103 | ||
using the `fconv_lm_dauphin_wikitext103` model architecture. | ||
|
||
## Citation | ||
|
||
```bibtex | ||
@inproceedings{dauphin2017language, | ||
title={Language Modeling with Gated Convolutional Networks}, | ||
author={Dauphin, Yann N and Fan, Angela and Auli, Michael and Grangier, David}, | ||
booktitle={Proceedings of the 34th International Conference on Machine Learning-Volume 70}, | ||
pages={933--941}, | ||
year={2017}, | ||
organization={JMLR} | ||
} | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
# Adaptive Input Representations for Neural Language Modeling (Baevski and Auli, 2018) | ||
|
||
## Pre-trained models | ||
|
||
Description | Parameters | Dataset | Model and Test set(s) | ||
---|---:|---|--- | ||
Adaptive Inputs <br> ([Baevski and Auli, 2018](https://arxiv.org/abs/1809.10853)) | 1026M | [Google Billion Words](https://github.com/ciprian-chelba/1-billion-word-language-modeling-benchmark) | [download (.tar.bz2)](https://dl.fbaipublicfiles.com/fairseq/models/lm/adaptive_lm_gbw_huge.bz2) | ||
Adaptive Inputs <br> ([Baevski and Auli, 2018](https://arxiv.org/abs/1809.10853)) | 247M | [WikiText-103](https://einstein.ai/research/the-wikitext-long-term-dependency-language-modeling-dataset) | [download (.tar.bz2)](https://dl.fbaipublicfiles.com/fairseq/models/lm/adaptive_lm_wiki103.bz2) | ||
|
||
## Example usage | ||
|
||
See the [language modeling README](../language_model/README.md) for instructions on reproducing results for WikiText-103 | ||
using the `transformer_lm_wiki103` model architecture. | ||
|
||
## Citation | ||
|
||
```bibtex | ||
@inproceedings{ | ||
baevski2018adaptive, | ||
title={Adaptive Input Representations for Neural Language Modeling}, | ||
author={Alexei Baevski and Michael Auli}, | ||
booktitle={International Conference on Learning Representations}, | ||
year={2019}, | ||
url={https://openreview.net/forum?id=ByxZX20qFQ}, | ||
} | ||
``` |
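
The pre-trained models table in the README above links to archives labeled `.tar.bz2`. As a minimal sketch of fetching and unpacking one of them, the helper below assumes the archives really are bzip2-compressed tarballs, as the link text suggests. The function names `extract_model_archive` and `download_and_extract` are hypothetical and not part of fairseq, and the layout of the extracted files is not specified in the README, so the code only returns the member names.

```python
import tarfile
import urllib.request
from pathlib import Path


def extract_model_archive(archive_path, dest_dir):
    """Extract a bzip2-compressed tar archive into dest_dir.

    Returns the list of member names found in the archive.
    Hypothetical helper; assumes the archive is a tar.bz2 file.
    """
    dest_root = Path(dest_dir)
    dest_root.mkdir(parents=True, exist_ok=True)
    dest_root = dest_root.resolve()

    with tarfile.open(archive_path, mode="r:bz2") as archive:
        members = archive.getmembers()
        # Refuse members that would be written outside dest_dir.
        for member in members:
            target = (dest_root / member.name).resolve()
            if target != dest_root and dest_root not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        archive.extractall(dest_root)
        return [member.name for member in members]


def download_and_extract(url, work_dir):
    """Download an archive from url into work_dir, then extract it there.

    Hypothetical helper; the archive name is taken from the end of the URL.
    """
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    archive_path = work_dir / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(archive_path))
    return extract_model_archive(archive_path, work_dir)
```

For example, `download_and_extract("https://dl.fbaipublicfiles.com/fairseq/models/lm/adaptive_lm_wiki103.bz2", "models/wiki103")` would fetch the WikiText-103 model listed in the table, assuming the URL is still live. Evaluation itself should follow the language modeling README that the diff points to.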