# Exporting GPT-2

_WARNING: you are on the master branch; please refer to examples on the branch corresponding to your `cortex version` (e.g. for version 0.23.*, run `git checkout 0.23` or switch to the `0.23` branch on GitHub)_

In this notebook, we'll show how to export [OpenAI's GPT-2 text generation model](https://github.com/openai/gpt-2) for serving.

First, we'll download the GPT-2 code repository:

In [0]:
!git clone --no-checkout https://github.com/openai/gpt-2.git
!cd gpt-2 && git reset --hard ac5d52295f8a1c3856ea24fb239087cc1a3d1131

Next, we'll specify the model size (one of `124M`, `355M`, or `774M`):

In [0]:
import sys

MODEL_SIZE = "124M" #@param {type:"string"}

if MODEL_SIZE not in {"124M", "355M", "774M"}:
    print("\033[91m{}\033[00m".format('ERROR: MODEL_SIZE must be "124M", "355M", or "774M"'), file=sys.stderr)

We can use `download_model.py` to download the model:

In [0]:
!python3 ./gpt-2/download_model.py $MODEL_SIZE
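
Before moving on, it can help to confirm that the download completed. The following is a minimal sketch, assuming the pinned `download_model.py` writes the seven files listed below into `models/<MODEL_SIZE>`. The helper name `missing_model_files` and the `EXPECTED_FILES` list are illustrative and not part of the GPT-2 repository.

```python
import os

# Files that download_model.py is assumed to fetch for each model size.
EXPECTED_FILES = (
    "checkpoint",
    "encoder.json",
    "hparams.json",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "vocab.bpe",
)


def missing_model_files(model_size, models_dir="models"):
    """Return the expected files that are absent from models_dir/model_size."""
    model_dir = os.path.join(models_dir, model_size)
    return [
        name for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(model_dir, name))
    ]
```

Calling `missing_model_files(MODEL_SIZE)` in the notebook should return an empty list if every file is in place.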

Next, we'll install the required packages:

In [0]:
!pip install tensorflow==1.14.* numpy==1.* boto3==1.*

In [0]:
import sys
import os
import time
import json
import numpy as np
import tensorflow as tf
from tensorflow.python.saved_model.signature_def_utils_impl import predict_signature_def

Now we can export the model for serving:

In [0]:
sys.path.append(os.path.join(os.getcwd(), 'gpt-2/src'))
import model, sample

def export_for_serving(
    model_name='124M',
    seed=None,
    batch_size=1,
    length=None,
    temperature=1,
    top_k=0,
    models_dir='models'
):
    """
    Export the model for TF Serving
    :model_name=124M : String, which model to use
    :seed=None : Integer seed for random number generators, fix seed to reproduce
     results
    :batch_size=1 : Integer, fixed batch dimension of the exported `context`
     input
    :length=None : Number of tokens in generated text, if None (default), is
     determined by model hyperparameters
    :temperature=1 : Float value controlling randomness in Boltzmann
     distribution. Lower temperature results in less random completions. As the
     temperature approaches zero, the model will become deterministic and
     repetitive. Higher temperature results in more random completions.
    :top_k=0 : Integer value controlling diversity. 1 means only 1 word is
     considered for each step (token), resulting in deterministic completions,
     while 40 means 40 words are considered at each step. 0 (default) is a
     special setting meaning no restrictions. 40 generally is a good value.
    :models_dir=models : path to parent folder containing model subfolders
     (i.e. contains the <model_name> folder)
    """
    models_dir = os.path.expanduser(os.path.expandvars(models_dir))

    hparams = model.default_hparams()
    with open(os.path.join(models_dir, model_name, 'hparams.json')) as f:
        hparams.override_from_dict(json.load(f))

    if length is None:
        length = hparams.n_ctx
    elif length > hparams.n_ctx:
        raise ValueError("Can't get samples longer than window size: %s" % hparams.n_ctx)

    with tf.Session(graph=tf.Graph()) as sess:
        context = tf.placeholder(tf.int32, [batch_size, None])
        np.random.seed(seed)
        tf.set_random_seed(seed)

        output = sample.sample_sequence(
            hparams=hparams, length=length,
            context=context,
            batch_size=batch_size,
            temperature=temperature, top_k=top_k
        )

        saver = tf.train.Saver()
        ckpt = tf.train.latest_checkpoint(os.path.join(models_dir, model_name))
        saver.restore(sess, ckpt)

        # Use the current Unix timestamp (whole seconds) as the version directory name
        export_dir = os.path.join(models_dir, model_name, "export", str(int(time.time())))
        if not os.path.isdir(export_dir):
            os.makedirs(export_dir)

        builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
        signature = predict_signature_def(
            inputs={'context': context},
            outputs={'sample': output}
        )

        builder.add_meta_graph_and_variables(
            sess,
            [tf.saved_model.SERVING],
            signature_def_map={"predict": signature},
            strip_default_attrs=True
        )
        builder.save()


export_for_serving(top_k=40, length=256, model_name=MODEL_SIZE)
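
To make the `temperature` and `top_k` parameters described in the docstring more concrete, here is a small, self-contained sketch of how they are commonly applied to a vector of logits. It uses only the standard library and is an illustration, not the code in `sample.py`. The function name `next_token_distribution` is hypothetical.

```python
import math


def next_token_distribution(logits, temperature=1.0, top_k=0):
    """Turn raw logits into sampling probabilities.

    temperature rescales the logits before the softmax; top_k > 0 keeps only
    the k highest-scoring tokens (0 means no restriction).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    if top_k > 0:
        # Tokens scoring below the k-th largest logit get zero probability.
        cutoff = sorted(scaled, reverse=True)[min(top_k, len(scaled)) - 1]
        scaled = [x if x >= cutoff else float("-inf") for x in scaled]
    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    return [w / total for w in weights]
```

With `top_k=1`, only the most likely token keeps a non-zero probability, which is why that setting produces deterministic completions. Lowering `temperature` below 1 concentrates probability on the highest-scoring tokens, and raising it spreads probability more evenly.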

## Upload the model to AWS

Cortex loads models from Amazon S3, so we need to upload the exported model to an S3 bucket.

Set these variables to configure your AWS credentials and model upload path:

In [0]:
AWS_ACCESS_KEY_ID = "" #@param {type:"string"}
AWS_SECRET_ACCESS_KEY = "" #@param {type:"string"}
S3_UPLOAD_PATH = "s3://my-bucket/text-generator/gpt-2" #@param {type:"string"}

import sys
import re

if AWS_ACCESS_KEY_ID == "":
    print("\033[91m {}\033[00m".format("ERROR: Please set AWS_ACCESS_KEY_ID"), file=sys.stderr)

elif AWS_SECRET_ACCESS_KEY == "":
    print("\033[91m {}\033[00m".format("ERROR: Please set AWS_SECRET_ACCESS_KEY"), file=sys.stderr)

else:
    try:
        # re.match returns None for a malformed path, so .groups() raises AttributeError
        bucket, key = re.match("s3://(.+?)/(.+)", S3_UPLOAD_PATH).groups()
    except AttributeError:
        print("\033[91m {}\033[00m".format("ERROR: Invalid S3 path (should be of the form s3://my-bucket/path/to/file)"), file=sys.stderr)
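
The path check above can also be written as a small reusable function that fails loudly instead of printing. This is a sketch only: the name `parse_s3_path` is hypothetical, and the pattern is slightly stricter than the one in the cell because it does not allow a `/` inside the bucket name.

```python
import re


def parse_s3_path(s3_path):
    """Split 's3://bucket/key' into (bucket, key), or raise ValueError."""
    match = re.fullmatch(r"s3://([^/]+)/(.+)", s3_path)
    if match is None:
        raise ValueError(
            "Invalid S3 path (expected s3://my-bucket/path/to/file): %r" % s3_path
        )
    return match.group(1), match.group(2)
```

For the default value in this notebook, `parse_s3_path("s3://my-bucket/text-generator/gpt-2")` gives the bucket `my-bucket` and the key `text-generator/gpt-2`.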

Upload the model to S3:

In [0]:
import os
import boto3

s3 = boto3.client("s3", aws_access_key_id=AWS_ACCESS_KEY_ID, aws_secret_access_key=AWS_SECRET_ACCESS_KEY)

for dirpath, _, filenames in os.walk("models/{}/export".format(MODEL_SIZE)):
    for filename in filenames:
        filepath = os.path.join(dirpath, filename)
        filekey = os.path.join(key, MODEL_SIZE, filepath[len("models/{}/export/".format(MODEL_SIZE)):])
        print("Uploading s3://{}/{} ...".format(bucket, filekey), end = '')
        s3.upload_file(filepath, bucket, filekey)
        print(" ✓")

print("\nUploaded model export directory to {}/{}".format(S3_UPLOAD_PATH, MODEL_SIZE))
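
The loop above maps each local file under `models/<MODEL_SIZE>/export` to an S3 key that keeps the same relative layout. That mapping can be sketched on its own, without touching AWS. The helper name `s3_key_for` is hypothetical, and the timestamp in the usage note below is a made-up example of a version directory.

```python
import os
import posixpath


def s3_key_for(filepath, export_root, key_prefix):
    """Map a local file under export_root to an S3 key under key_prefix."""
    relative = os.path.relpath(filepath, export_root)
    # S3 keys use "/" as the separator, whatever the local path separator is.
    return posixpath.join(key_prefix, *relative.split(os.sep))
```

For example, a local file at `models/124M/export/1568244606/saved_model.pb` with the prefix `text-generator/gpt-2/124M` maps to the key `text-generator/gpt-2/124M/1568244606/saved_model.pb`.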

<!-- CORTEX_VERSION_MINOR x2 -->
We also need to upload `vocab.bpe` and `encoder.json`, so that the [encoder](https://github.com/cortexlabs/cortex/blob/master/examples/tensorflow/text-generator/encoder.py) in the [Predictor](https://github.com/cortexlabs/cortex/blob/master/examples/tensorflow/text-generator/predictor.py) can encode the input text before making a request to the model.

In [0]:
print("Uploading s3://{}/{}/vocab.bpe ...".format(bucket, key), end = '')
s3.upload_file(os.path.join("models", MODEL_SIZE, "vocab.bpe"), bucket, os.path.join(key, "vocab.bpe"))
print(" ✓")

print("Uploading s3://{}/{}/encoder.json ...".format(bucket, key), end = '')
s3.upload_file(os.path.join("models", MODEL_SIZE, "encoder.json"), bucket, os.path.join(key, "encoder.json"))
print(" ✓")

<!-- CORTEX_VERSION_MINOR -->
That's it! See the [example on GitHub](https://github.com/cortexlabs/cortex/tree/master/examples/tensorflow/text-generator) for how to deploy the model as an API.