forked from openai/gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"


rogerfitz/gpt-2

 
 


gpt-2

Code and samples from the paper "Language Models are Unsupervised Multitask Learners".

For now, we have only released a smaller (117M parameter) version of GPT-2.

See more details in our blog post.

Data

We created a new dataset that emphasizes diversity of content by scraping pages from the Internet. To preserve document quality, we used only pages that have been curated or filtered by humans: specifically, outbound links from Reddit that received at least 3 karma. This karma threshold serves as a heuristic indicator of whether other users found the link interesting (whether educational or funny), leading to higher data quality than other similar datasets, such as CommonCrawl.
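The karma filter described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the pipeline used to build the dataset: the `Submission` record, its field names, the de-duplication step, and the example URLs are all hypothetical. Only the threshold of 3 karma comes from the text.

```python
from typing import Iterable, List, NamedTuple

MIN_KARMA = 3  # threshold stated in the text


class Submission(NamedTuple):
    """Hypothetical record for a Reddit post that links to an outside page."""
    url: str
    karma: int


def select_links(submissions: Iterable[Submission],
                 min_karma: int = MIN_KARMA) -> List[str]:
    """Return outbound URLs whose post reached min_karma, in first-seen order.

    Dropping repeated URLs is an assumption made for this sketch.
    """
    seen = set()
    kept = []
    for post in submissions:
        if post.karma >= min_karma and post.url not in seen:
            seen.add(post.url)
            kept.append(post.url)
    return kept


# Made-up example data.
posts = [
    Submission("https://example.com/a", 5),
    Submission("https://example.com/b", 1),   # below the threshold, dropped
    Submission("https://example.com/a", 12),  # repeated URL, dropped
    Submission("https://example.com/c", 3),   # exactly at the threshold, kept
]
print(select_links(posts))
```

The comparison is `>=` because the text says "at least 3 karma", so a link with exactly 3 karma passes.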

Note that while we have hand-chosen these samples, and are thus engaging in some meta-cherry-picking, we believe they are not too unrepresentative of the sampling process. We are simply using top-k truncated sampling, and have yet to explore more advanced methods of sampling (such as beam-search methods).
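For readers unfamiliar with top-k truncated sampling, the idea can be sketched with the standard library alone. This is an illustrative sketch, not the repository's implementation: the function names and the example logits are hypothetical, and ties at the cutoff are all kept, which is a choice made here for simplicity.

```python
import math
import random
from typing import List, Optional


def top_k_filter(logits: List[float], k: int) -> List[float]:
    """Keep the k largest logits; set the rest to -inf (zero probability).

    k <= 0 or k >= len(logits) means no truncation. Values tied with the
    k-th largest logit are all kept.
    """
    if k <= 0 or k >= len(logits):
        return list(logits)
    cutoff = sorted(logits, reverse=True)[k - 1]
    return [x if x >= cutoff else -math.inf for x in logits]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax; -inf entries map to probability 0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_top_k(logits: List[float], k: int,
                 rng: Optional[random.Random] = None) -> int:
    """Draw one token index from the distribution restricted to the top k."""
    rng = rng or random.Random()
    probs = softmax(top_k_filter(logits, k))
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Made-up logits for a four-token vocabulary.
example_logits = [1.0, 3.0, 2.0, 0.5]
print(sample_top_k(example_logits, k=2, rng=random.Random(0)))
```

With `k=2`, only indices 1 and 2 can ever be drawn from the example logits; the remaining tokens have their probability mass removed before sampling.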

Installation

Clone this repository and cd into its directory for the remaining commands:

git clone https://github.com/openai/gpt-2.git && cd gpt-2

Native Installation

Download the model data

sh download_model.sh 117M

The remaining steps can optionally be done in a virtual environment using tools such as virtualenv or conda.

Install TensorFlow 1.12 (with GPU support, if you have a GPU and want everything to run faster):

pip3 install tensorflow==1.12.0

or

pip3 install tensorflow-gpu==1.12.0

Install other python packages:

pip3 install -r requirements.txt
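After the install steps, it can be useful to confirm that the environment picked up the intended TensorFlow release. The snippet below is a small sketch, not part of the repository: the helper `matches_release` is a hypothetical name, and it only compares the major and minor parts of the version string.

```python
def matches_release(installed: str, expected: str = "1.12") -> bool:
    """True if `installed` (e.g. "1.12.0") is in the `expected` major.minor release."""
    return installed.split(".")[:2] == expected.split(".")[:2]


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        status = "OK" if matches_release(tf.__version__) else "unexpected version"
        print(f"TensorFlow {tf.__version__}: {status}")
```

Run it inside the same virtual environment (if any) that was used for the `pip3 install` commands.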

Docker Installation

Build the Dockerfile and tag the created image as gpt-2:

docker build --tag gpt-2 -f Dockerfile.gpu . # or Dockerfile.cpu

Start an interactive bash session from the gpt-2 docker image.

You can opt to use the --runtime=nvidia flag if you have access to an NVIDIA GPU and a valid install of nvidia-docker 2.0.

docker run --runtime=nvidia -it gpt-2 bash

Usage

WARNING: Samples are unfiltered and may contain offensive content.

Unconditional sample generation

To generate unconditional samples from the small model:

python3 src/generate_unconditional_samples.py | tee samples

There are various flags for controlling the samples:

python3 src/generate_unconditional_samples.py --top_k 40 --temperature 0.7 | tee samples
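To build intuition for what a temperature below 1 does, here is a minimal sketch. It assumes the common convention that logits are divided by the temperature before the softmax; the function name and the logits are made up for illustration, and this is not the repository's code.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float],
                             temperature: float = 1.0) -> List[float]:
    """Softmax over logits / temperature (assumed convention)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for a three-token vocabulary.
logits = [2.0, 1.0, 0.0]
for t in (1.0, 0.7):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Under this convention, a temperature of 0.7 sharpens the distribution: the most likely token gains probability and the tail loses it, so samples become less varied than at temperature 1.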

Conditional sample generation

To give the model custom prompts, you can use:

python3 src/interactive_conditional_samples.py --top_k 40

GPT-2 samples

WARNING: Samples are unfiltered and may contain offensive content.

While we have not yet released GPT-2 itself, you can see some samples from it in the gpt-2-samples folder. We show unconditional samples under three settings: the defaults (temperature 1 and no truncation), temperature 0.7, and truncation with top_k 40. We show conditional samples, with contexts drawn from WebText's test set, under the same three settings.

Future work

We may release code for evaluating the models on various benchmarks.

We are still considering release of the larger models.
