Code and model for the paper "Improving Language Understanding by Generative Pre-Training"
Latest commit ae7e86b Aug 30, 2018
model ROCStories demo Jun 11, 2018
.gitignore Initial commit Jun 11, 2018
LICENSE Initial commit Jun 11, 2018
README.md updated readme Jun 11, 2018
analysis.py ROCStories demo Jun 11, 2018
datasets.py ROCStories demo Jun 11, 2018
opt.py ROCStories demo Jun 11, 2018
text_utils.py ROCStories demo Jun 11, 2018
train.py remove unused ema code Aug 30, 2018
utils.py remove unused ema code Aug 30, 2018

finetune-transformer-lm

Currently this code implements the ROCStories Cloze Test result reported in the paper. To reproduce it, run: `python train.py --dataset rocstories --desc rocstories --submit --analysis --data_dir [path to data here]`
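For scripted runs, the command above can be assembled programmatically. The following is a minimal sketch, not part of this repository: the helper names `build_train_command` and `format_command` are hypothetical, and it assumes the script is launched from the repository root. Only the flags shown in the command above are used.

```python
import shlex
import sys


def build_train_command(data_dir, python=sys.executable):
    """Return the argv list for the ROCStories run described above.

    `data_dir` is the path to the downloaded ROCStories data.
    `python` is the interpreter to use (an assumption of this sketch;
    it defaults to the interpreter running this helper).
    """
    return [
        python, "train.py",
        "--dataset", "rocstories",
        "--desc", "rocstories",
        "--submit",
        "--analysis",
        "--data_dir", data_dir,
    ]


def format_command(argv):
    """Render an argv list as a shell-safe string, e.g. for logging."""
    return " ".join(shlex.quote(arg) for arg in argv)


# Example (not executed here): launch training from the repository root.
#   import subprocess
#   subprocess.run(build_train_command("/path/to/data"), check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when the data path contains spaces.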

Note: The code is currently non-deterministic due to various GPU ops. The median accuracy over 10 runs with this codebase (using default hyperparameters) is 85.8%, slightly lower than the single-run result of 86.5% reported in the paper.
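Because individual runs vary, it can help to report a summary over several runs rather than a single number. The sketch below shows one way to compute the median and the spread from a list of per-run accuracies. The function name `summarize_runs` is hypothetical, and how the accuracies are collected from each run is left to the caller.

```python
import statistics


def summarize_runs(accuracies):
    """Summarize per-run accuracies (in percent) from repeated training runs."""
    if not accuracies:
        raise ValueError("need at least one run")
    return {
        "n": len(accuracies),
        "median": statistics.median(accuracies),
        "min": min(accuracies),
        "max": max(accuracies),
    }


if __name__ == "__main__":
    # Made-up placeholder values for illustration only,
    # NOT results produced by this codebase.
    example = [80.0, 90.0, 85.0]
    print(summarize_runs(example))
```

The median is less sensitive than the mean to a single unusually good or bad run, which makes it a reasonable summary when results are non-deterministic.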

The ROCStories dataset can be downloaded from the associated website.