
Commit

push 774M model
WuTheFWasThat committed Aug 20, 2019
1 parent cb41537 · commit f35fa1d
Showing 7 changed files with 20 additions and 15 deletions.
5 changes: 3 additions & 2 deletions DEVELOPERS.md
@@ -27,8 +27,9 @@ pip3 install -r requirements.txt

Download the model data
```
-python3 download_model.py 117M
-python3 download_model.py 345M
+python3 download_model.py 124M
+python3 download_model.py 355M
+python3 download_model.py 774M
```

## Docker Installation
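As a quick sanity check after running the three downloads, something like the following confirms each model landed where the rest of the repo expects it. This is a sketch, not part of the commit: it assumes download_model.py writes each model's files under models/<size>/, including an hparams.json.

```python
# Post-download sanity check (assumption: download_model.py writes each
# model into models/<size>/, e.g. models/774M/hparams.json).
import json
import os

for size in ['124M', '355M', '774M']:
    path = os.path.join('models', size, 'hparams.json')
    if not os.path.exists(path):
        print(size, 'missing', path)
        continue
    with open(path) as f:
        print(size, json.load(f))
```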
5 changes: 3 additions & 2 deletions Dockerfile.cpu
@@ -5,5 +5,6 @@ RUN mkdir /gpt-2
WORKDIR /gpt-2
ADD . /gpt-2
RUN pip3 install -r requirements.txt
-RUN python3 download_model.py 117M
-RUN python3 download_model.py 345M
+RUN python3 download_model.py 124M
+RUN python3 download_model.py 355M
+RUN python3 download_model.py 774M
5 changes: 3 additions & 2 deletions Dockerfile.gpu
@@ -14,5 +14,6 @@ RUN mkdir /gpt-2
WORKDIR /gpt-2
ADD . /gpt-2
RUN pip3 install -r requirements.txt
-RUN python3 download_model.py 117M
-RUN python3 download_model.py 345M
+RUN python3 download_model.py 124M
+RUN python3 download_model.py 355M
+RUN python3 download_model.py 774M
6 changes: 4 additions & 2 deletions README.md
@@ -4,9 +4,11 @@

Code from the paper ["Language Models are Unsupervised Multitask Learners"](https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf).

-We have currently released small (117M parameter) and medium (345M parameter) versions of GPT-2. While we have not released the larger models, we have [released a dataset](https://github.com/openai/gpt-2-output-dataset) for researchers to study their behaviors.
+We have currently released small (124M parameter), medium (355M parameter), and large (774M parameter) versions of GPT-2<sup>*</sup>, with only the full model as of yet unreleased. We have also [released a dataset](https://github.com/openai/gpt-2-output-dataset) for researchers to study their behaviors.
 
-See more details in our [blog post](https://blog.openai.com/better-language-models/).
+You can read about GPT-2 and release decisions in our [original blog post](https://blog.openai.com/better-language-models/) and [6 month follow-up post](https://openai.com/blog/gpt-2-6-month-follow-up/).
+
+<sup>*</sup> *Note that our original parameter counts were wrong due to an error (in our previous blog posts and paper). Thus you may have seen small referred to as 117M and medium referred to as 345M.*

## Usage

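The corrected sizes in the new footnote can be rederived from the transformer shapes alone. A sketch, assuming the standard hyperparameters of the released small and medium configs (n_vocab=50257, n_ctx=1024, with n_embd/n_layer of 768/12 and 1024/24 respectively); those values are not part of this diff:

```python
# Recompute GPT-2 parameter counts from model shapes (hyperparameter values
# are assumptions taken from the released configs, not from this diff).
def n_params(n_vocab, n_ctx, n_embd, n_layer):
    wte = n_vocab * n_embd                      # token embeddings (tied with output logits)
    wpe = n_ctx * n_embd                        # position embeddings
    attn = n_embd * 3 * n_embd + 3 * n_embd     # qkv projection + bias
    attn += n_embd * n_embd + n_embd            # attention output projection
    mlp = n_embd * 4 * n_embd + 4 * n_embd      # feed-forward up-projection
    mlp += 4 * n_embd * n_embd + n_embd         # feed-forward down-projection
    norms = 2 * 2 * n_embd                      # two layernorms per block (gain + bias)
    return wte + wpe + n_layer * (attn + mlp + norms) + 2 * n_embd  # + final layernorm

print(n_params(50257, 1024, 768, 12))   # 124,439,808 -> the corrected "124M"
print(n_params(50257, 1024, 1024, 24))  # 354,823,168 -> the corrected "355M"
```

Under those assumptions the small and medium models come out to roughly 124M and 355M parameters, matching the renaming throughout this commit.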
2 changes: 1 addition & 1 deletion download_model.py
@@ -4,7 +4,7 @@
from tqdm import tqdm

if len(sys.argv) != 2:
-    print('You must enter the model name as a parameter, e.g.: download_model.py 117M')
+    print('You must enter the model name as a parameter, e.g.: download_model.py 124M')
sys.exit(1)

model = sys.argv[1]
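For context, the unchanged remainder of this script streams each checkpoint file to disk. A minimal sketch of that pattern; the bucket URL and file names are assumptions, since only the argv handling appears in the diff:

```python
# Sketch of the download loop that follows in this file. The hosting URL and
# file list are assumed; only the argv validation above is shown in the diff.
import os
import sys
import requests
from tqdm import tqdm

model = sys.argv[1]  # validated above, e.g. '124M'
subdir = os.path.join('models', model)
os.makedirs(subdir, exist_ok=True)

for filename in ['checkpoint', 'encoder.json', 'hparams.json',
                 'model.ckpt.data-00000-of-00001', 'model.ckpt.index',
                 'model.ckpt.meta', 'vocab.bpe']:
    url = 'https://storage.googleapis.com/gpt-2/models/' + model + '/' + filename
    r = requests.get(url, stream=True)
    with open(os.path.join(subdir, filename), 'wb') as f:
        total = int(r.headers['content-length'])
        with tqdm(total=total, desc='Fetching ' + filename,
                  unit='b', unit_scale=True) as pbar:
            for chunk in r.iter_content(chunk_size=1024):
                f.write(chunk)
                pbar.update(len(chunk))
```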
4 changes: 2 additions & 2 deletions src/generate_unconditional_samples.py
@@ -9,7 +9,7 @@
import model, sample, encoder

def sample_model(
-    model_name='117M',
+    model_name='124M',
seed=None,
nsamples=0,
batch_size=1,
@@ -20,7 +20,7 @@ def sample_model(
):
"""
Run the sample_model
-    :model_name=117M : String, which model to use
+    :model_name=124M : String, which model to use
:seed=None : Integer seed for random number generators, fix seed to
reproduce results
:nsamples=0 : Number of samples to return, if 0, continues to
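The default only changes for callers who pass nothing, so existing invocations are unaffected. A hypothetical direct call with the new default, assuming src/ is importable and the 124M files have already been downloaded:

```python
# Hypothetical direct call of sample_model (assumes this repo's src/ layout
# and that `python3 download_model.py 124M` has already been run).
import sys
sys.path.insert(0, 'src')

import generate_unconditional_samples

generate_unconditional_samples.sample_model(
    model_name='124M',  # the new default after this commit
    nsamples=2,         # per the docstring, 0 would generate indefinitely
    top_k=40,
)
```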
8 changes: 4 additions & 4 deletions src/interactive_conditional_samples.py
@@ -9,18 +9,18 @@
import model, sample, encoder

def interact_model(
-    model_name='117M',
+    model_name='124M',
seed=None,
nsamples=1,
batch_size=1,
length=None,
temperature=1,
top_k=0,
-    models_dir='models',
+    models_dir='models',
):
"""
Interactively run the model
-    :model_name=117M : String, which model to use
+    :model_name=124M : String, which model to use
:seed=None : Integer seed for random number generators, fix seed to reproduce
results
:nsamples=1 : Number of samples to return total
@@ -36,7 +36,7 @@ def interact_model(
while 40 means 40 words are considered at each step. 0 (default) is a
special setting meaning no restrictions. 40 generally is a good value.
:models_dir : path to parent folder containing model subfolders
-     (i.e. contains the <model_name> folder)
+     (i.e. contains the <model_name> folder)
"""
models_dir = os.path.expanduser(os.path.expandvars(models_dir))
if batch_size is None:
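The `os.path.expanduser(os.path.expandvars(models_dir))` line shown above means shell-style paths work for the models_dir parameter. A small illustration (the example paths are made up):

```python
# Both spellings resolve to the same absolute directory, mirroring the
# expansion interact_model applies to models_dir.
import os

for p in ['~/gpt-2/models', '$HOME/gpt-2/models']:
    print(os.path.expanduser(os.path.expandvars(p)))
```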
