makramjandar/GPT-2 (forked from rkfg/gpt-2)

Status: Archive (code is provided as-is, no updates expected)

GPT-2

Code "Language Models are Unsupervised Multitask Learners". | README. | Development | Contributors | License MIT


Fine-tuning on custom datasets

To retrain the GPT-2 117M model on a custom text dataset:

PYTHONPATH=src ./train.py --dataset <file|directory|glob>
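
The training script accepts a single text file, a directory of text files, or a glob pattern. If your corpus is spread across many small files, one approach is to concatenate them into a single file first. The snippet below is a minimal preprocessing sketch, not part of this repository: the paths are illustrative, and whether the encoder maps the literal <|endoftext|> marker onto GPT-2's special end-of-text token should be verified before relying on it.

# Minimal preprocessing sketch (illustrative paths, not part of this repo).
# Concatenates a folder of .txt files into one training file, separating
# documents with GPT-2's conventional <|endoftext|> marker.
from pathlib import Path

SOURCE_DIR = Path("my_corpus")        # hypothetical input directory
OUTPUT_FILE = Path("my_dataset.txt")  # file to pass to --dataset

docs = [p.read_text(encoding="utf-8").strip()
        for p in sorted(SOURCE_DIR.glob("*.txt"))]
OUTPUT_FILE.write_text("\n<|endoftext|>\n".join(docs), encoding="utf-8")
print(f"Wrote {OUTPUT_FILE} with {len(docs)} documents")

The resulting file can then be passed to train.py via --dataset as shown above.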

If you want to precompute the dataset's encoding for multiple runs, you can instead use:

PYTHONPATH=src ./encode.py <file|directory|glob> /path/to/encoded.npz
PYTHONPATH=src ./train.py --dataset /path/to/encoded.npz
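
The .npz produced by encode.py can be reused across training runs. As a quick sanity check it can be opened with NumPy; the snippet below is a sketch that assumes the archive simply stores arrays of BPE token ids (the exact array names depend on how encode.py saves them).

# Sanity-check sketch for the precomputed dataset (assumes the .npz holds
# plain arrays of BPE token ids; array names depend on encode.py).
import numpy as np

with np.load("/path/to/encoded.npz") as archive:
    for name in archive.files:
        tokens = archive[name]
        print(f"{name}: {tokens.size} tokens, dtype={tokens.dtype}")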

To run distributed training across multiple GPUs or machines using Horovod:

mpirun -np 4 \
    -H localhost:4 \
    -bind-to none -map-by slot \
    -x NCCL_DEBUG=INFO -x LD_LIBRARY_PATH -x PATH \
    -x PYTHONPATH=src \
    -mca pml ob1 -mca btl ^openib \
    /home/jovyan/gpt-2/train-horovod.py --dataset encoded.npz
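
With the invocation above, mpirun launches four copies of train-horovod.py on the local machine (hvd.size() == 4), and each rank trains on its own GPU while Horovod averages gradients across ranks. The snippet below is a minimal sketch of the usual Horovod wiring for TensorFlow 1.x code, not a copy of this repository's script; the toy loss stands in for the GPT-2 loss tensor.

# Minimal Horovod wiring sketch for TF 1.x (illustrative; the real
# train-horovod.py in this repo may differ).
import tensorflow as tf
import horovod.tensorflow as hvd

hvd.init()

# Pin each process to a single GPU chosen by its local rank.
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(hvd.local_rank())

# Toy "model": a single trainable scalar, standing in for the GPT-2 loss.
weight = tf.Variable(5.0, name="toy_weight")
loss = tf.square(weight)

# Scale the learning rate by the number of workers and wrap the optimizer
# so gradients are averaged across ranks via allreduce.
global_step = tf.train.get_or_create_global_step()
opt = tf.train.AdamOptimizer(1e-4 * hvd.size())
opt = hvd.DistributedOptimizer(opt)
train_op = opt.minimize(loss, global_step=global_step)

hooks = [
    # Broadcast initial variable states from rank 0 to all other processes.
    hvd.BroadcastGlobalVariablesHook(0),
    tf.train.StopAtStepHook(last_step=100),
]

with tf.train.MonitoredTrainingSession(hooks=hooks, config=config) as sess:
    while not sess.should_stop():
        sess.run(train_op)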

Citation

Please use the following BibTeX entry:

@article{radford2019language,
  title={Language Models are Unsupervised Multitask Learners},
  author={Radford, Alec and Wu, Jeff and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya},
  year={2019}
}

The FB2_2_txt.xsl conversion file is forked from https://github.com/kmrov/fb2_2_rtf.
