Adaptive Input Representations for Neural Language Modeling (Baevski and Auli, 2018)

Pre-trained models

Description	Parameters	Dataset	Model and Test set(s)
Adaptive Inputs (Baevski and Auli, 2018)	1026M	Google Billion Words	download (.tar.bz2)
Adaptive Inputs (Baevski and Auli, 2018)	247M	WikiText-103	download (.tar.bz2)

Training an LM with adaptive inputs

First, see the general language modeling README for instructions on preprocessing the WikiText-103 data.

Then train a model with adaptive inputs using the transformer_lm_wiki103 architecture:

fairseq-train --task language_modeling \
    data-bin/wikitext-103 \
    --save-dir checkpoints/transformer_wikitext-103 \
    --arch transformer_lm_wiki103 \
    --max-update 286000 --max-lr 1.0 --t-mult 2 --lr-period-updates 270000 --lr-scheduler cosine --lr-shrink 0.75 \
    --warmup-updates 16000 --warmup-init-lr 1e-07 --min-lr 1e-09 --optimizer nag --lr 0.0001 --clip-norm 0.1 \
    --criterion adaptive_loss --max-tokens 3072 --update-freq 3 --tokens-per-sample 3072 --seed 1 \
    --sample-break-mode none --skip-invalid-size-inputs-valid-test --ddp-backend=no_c10d
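
The learning-rate flags in the command above interact in a way that is easy to misread, so here is a rough sketch of the schedule they appear to describe: a linear warmup from `--warmup-init-lr` to `--max-lr` over `--warmup-updates` steps, followed by one cosine decay over `--lr-period-updates` steps. This is a simplified approximation and not fairseq's own scheduler code. It assumes that `--lr 0.0001` acts as the floor of the cosine cycle, and it ignores `--t-mult` and `--lr-shrink`, which should only matter after a restart (with 16000 + 270000 = 286000 updates, training ends as the first cycle completes). The function name `approx_lr` is hypothetical.

```python
import math

# Values copied from the training command above.
WARMUP_UPDATES = 16000   # --warmup-updates
WARMUP_INIT_LR = 1e-07   # --warmup-init-lr
MAX_LR = 1.0             # --max-lr
MIN_LR = 0.0001          # --lr (assumption: floor of the cosine cycle)
PERIOD = 270000          # --lr-period-updates


def approx_lr(num_updates: int) -> float:
    """Approximate learning rate during warmup and the first cosine cycle."""
    if num_updates < WARMUP_UPDATES:
        # Linear warmup from WARMUP_INIT_LR up to MAX_LR.
        step = (MAX_LR - WARMUP_INIT_LR) / WARMUP_UPDATES
        return WARMUP_INIT_LR + num_updates * step
    # Cosine decay from MAX_LR down to MIN_LR over PERIOD updates.
    # Updates past the end of the cycle are clamped to its final value.
    t = min(num_updates - WARMUP_UPDATES, PERIOD)
    return MIN_LR + 0.5 * (MAX_LR - MIN_LR) * (1 + math.cos(math.pi * t / PERIOD))


if __name__ == "__main__":
    for n in (0, 8000, 16000, 151000, 286000):
        print(n, approx_lr(n))
```

Printing a few points like this is a quick way to check that a modified `--max-update` still lines up with the end of the cosine cycle before launching a long run.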

Citation

@inproceedings{
    baevski2018adaptive,
    title={Adaptive Input Representations for Neural Language Modeling},
    author={Alexei Baevski and Michael Auli},
    booktitle={International Conference on Learning Representations},
    year={2019},
    url={https://openreview.net/forum?id=ByxZX20qFQ},
}
