Shows how to generate text from a prompt and a few sampling hyperparameters, using either minGPT or huggingface/transformers.

In [1]:
!pip install git+https://github.com/karpathy/minGPT.git

Collecting git+https://github.com/karpathy/minGPT.git
  Cloning https://github.com/karpathy/minGPT.git to /tmp/pip-req-build-z709va26
  Running command git clone --filter=blob:none --quiet https://github.com/karpathy/minGPT.git /tmp/pip-req-build-z709va26
  Resolved https://github.com/karpathy/minGPT.git to commit 37baab71b9abea1b76ab957409a1cc2fbfba8a26
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: minGPT
  Building wheel for minGPT (pyproject.toml) ... done
  Created wheel for minGPT: filename=mingpt-0.0.1-py3-none-any.whl size=15435 sha256=e4c75aad8c9b06824b6634a0a7e1395ba41e4e4074ac66774c8f87e1704c2013
  Stored in directory: /tmp/pip-ephem-wheel-cache-5xx4a_a8/wheels/eb/8b/dc/d67c2183400e22b659530b4e46225da5a2da455725afe4a90a
Successfully built minGPT
Installing collected packages: minGPT
Successfully instal

In [2]:
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel
from mingpt.model import GPT
from mingpt.utils import set_seed
from mingpt.bpe import BPETokenizer
set_seed(3407)

In [23]:
use_mingpt = True # use minGPT or huggingface/transformers model?
model_type = 'gpt2-xl'
device = 'cpu'

In [24]:
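# NOTE: the minGPT loading branch is commented out in this run, so the
# huggingface/transformers model is always loaded, whatever `use_mingpt` says;
# `use_mingpt` then only selects the tokenizer inside generate().
# if use_mingpt:
#     model = GPT.from_pretrained(model_type)
# else: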
model = GPT2LMHeadModel.from_pretrained(model_type)
model.config.pad_token_id = model.config.eos_token_id # suppress a warning

# ship model to device and set to eval mode
model.to(device)
model.eval();

In [25]:

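def generate(prompt='', num_samples=10, steps=20, do_sample=True):
    """Print `num_samples` continuations of `prompt`, each up to `steps` new tokens long.

    An empty prompt produces unconditional samples. Relies on the globals
    `model`, `use_mingpt`, `model_type` and `device` defined in the cells above.
    """
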
    # tokenize the input prompt into integer input sequence
    if use_mingpt:
        tokenizer = BPETokenizer()
        if prompt == '':
            # to create unconditional samples...
            # manually create a tensor with only the special <|endoftext|> token
            # similar to what openai's code does here https://github.com/openai/gpt-2/blob/master/src/generate_unconditional_samples.py
            # move to `device` as well, so this branch also works when device != 'cpu'
            x = torch.tensor([[tokenizer.encoder.encoder['<|endoftext|>']]], dtype=torch.long).to(device)
        else:
            x = tokenizer(prompt).to(device)
    else:
        tokenizer = GPT2Tokenizer.from_pretrained(model_type)
        if prompt == '':
            # to create unconditional samples...
            # huggingface/transformers tokenizer special cases these strings
            prompt = '<|endoftext|>'
        encoded_input = tokenizer(prompt, return_tensors='pt').to(device)
        x = encoded_input['input_ids']

    # we'll process all desired num_samples in a batch, so expand out the batch dim
    x = x.expand(num_samples, -1)

    # forward the model `steps` times to get samples, in a batch
    y = model.generate(x, max_new_tokens=steps, do_sample=do_sample, top_k=40)

    for i in range(num_samples):
        out = tokenizer.decode(y[i].cpu().squeeze())
        print('-'*80)
        print(out)
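
The `top_k=40` argument passed to `model.generate` above restricts each sampling step to the 40 highest-scoring tokens. As a rough illustration of that idea only (not the code that minGPT or huggingface/transformers actually runs), here is a minimal pure-Python sketch. The helper name `sample_top_k` and the toy logits are assumptions made up for this example.

```python
import math
import random

def sample_top_k(logits, k=40, temperature=1.0, rng=random):
    """Sample one index from `logits`, restricted to the k highest-scoring entries.

    Hypothetical helper for illustration; the real libraries do this on tensors.
    """
    k = min(k, len(logits))
    # indices of the k largest logits; every other token gets zero probability
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # softmax over the surviving logits (subtract the max for numerical stability)
    scaled = [logits[i] / temperature for i in top]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    # random.choices normalizes the weights, so they act as probabilities
    return rng.choices(top, weights=weights, k=1)[0]

# toy "vocabulary" of 5 tokens; with k=2 only indices 3 and 0 can ever be drawn
toy_logits = [2.0, 0.5, -1.0, 3.0, 0.0]
rng = random.Random(3407)
draws = [sample_top_k(toy_logits, k=2, rng=rng) for _ in range(1000)]
print(sorted(set(draws)))
```

With `k=1` the sketch always returns the single most likely index, which is the same choice greedy decoding would make.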


In [27]:
generate(prompt='Andrej Karpathy, the', num_samples=10, steps=20)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
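
The two messages above are emitted by the huggingface/transformers `generate` method. One way to avoid them, sketched here under the assumption that every row of the batch is the same unpadded prompt (as in this notebook), is to pass an all-ones `attention_mask` and an explicit `pad_token_id`. The helper `hf_generate_kwargs` is hypothetical and not part of the notebook, and minGPT's own `generate` does not take these two arguments.

```python
import torch

def hf_generate_kwargs(x, eos_token_id=50256, steps=20, top_k=40):
    """Hypothetical helper: keyword arguments for the transformers `generate` call.

    Assumes `x` holds unpadded token ids, so every position should be attended to.
    """
    return {
        'attention_mask': torch.ones_like(x),  # 1 = real token, no padding present
        'pad_token_id': eos_token_id,          # 50256 is the id reported in the log above
        'max_new_tokens': steps,
        'do_sample': True,
        'top_k': top_k,
    }

# stand-in token ids; in the notebook this would be the expanded prompt tensor `x`
x = torch.tensor([[1, 2, 3]]).expand(10, -1)
kwargs = hf_generate_kwargs(x)
print(kwargs['attention_mask'].shape)

# in the notebook, the call would then become (left commented out here,
# since it needs the loaded model):
# y = model.generate(x, **kwargs)
```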


--------------------------------------------------------------------------------
Andrej Karpathy, the head of the Center for Internet Policy at the Hungarian Academy of Sciences' National University and one of the
--------------------------------------------------------------------------------
Andrej Karpathy, the developer of the system. He says he's already received several emails from people who said they now use
--------------------------------------------------------------------------------
Andrej Karpathy, the general counsel, said that the company plans to review the company's legal positions when the matter is fully
--------------------------------------------------------------------------------
Andrej Karpathy, the chief executive of the Centre for Economic Studies, said: "We have been predicting a decline of around
--------------------------------------------------------------------------------
Andrej Karpathy, the executive director of the Washington Center for Equitable Gro