Shows how to generate text from a prompt and a few sampling hyperparameters, using either minGPT or huggingface/transformers.

In [13]:
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel
from mingpt.model import GPT
from mingpt.utils import set_seed
from mingpt.bpe import BPETokenizer
set_seed(3407)

In [14]:
use_mingpt = True # use minGPT or huggingface/transformers model?
model_type = 'gpt2'
device = 'cuda' if torch.cuda.is_available() else 'cpu' # fall back to CPU when no GPU is present

In [15]:

if use_mingpt:
    model = GPT.from_pretrained(model_type)
else:
    model = GPT2LMHeadModel.from_pretrained(model_type)
    model.config.pad_token_id = model.config.eos_token_id # suppress a warning

# ship model to device and set to eval mode
model.to(device)
model.eval();

number of parameters: 124.44M



In [5]:

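def generate(prompt='', num_samples=10, steps=20, do_sample=True):
    """Print `num_samples` continuations of `prompt`, each `steps` new tokens long.

    An empty prompt produces unconditional samples.
    """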
    # tokenize the input prompt into integer input sequence
    if use_mingpt:
        tokenizer = BPETokenizer()
        if prompt == '':
            # to create unconditional samples...
            # manually create a tensor with only the special <|endoftext|> token
            # similar to what openai's code does here https://github.com/openai/gpt-2/blob/master/src/generate_unconditional_samples.py
            # create it on `device` so it matches the model, like the other branches do
            x = torch.tensor([[tokenizer.encoder.encoder['<|endoftext|>']]], dtype=torch.long, device=device)
        else:
            x = tokenizer(prompt).to(device)
    else:
        tokenizer = GPT2Tokenizer.from_pretrained(model_type)
        if prompt == '': 
            # to create unconditional samples...
            # huggingface/transformers tokenizer special cases these strings
            prompt = '<|endoftext|>'
        encoded_input = tokenizer(prompt, return_tensors='pt').to(device)
        x = encoded_input['input_ids']
    
    # we'll process all desired num_samples in a batch, so expand out the batch dim
    x = x.expand(num_samples, -1)

    # forward the model `steps` times to get samples, in a batch
    y = model.generate(x, max_new_tokens=steps, do_sample=do_sample, top_k=40)
    
    for i in range(num_samples):
        out = tokenizer.decode(y[i].cpu().squeeze())
        print('-'*80)
        print(out)
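
The batching step in generate() relies on `expand`, which broadcasts the single prompt row across the batch dimension without copying it. A minimal sketch of that behavior, using made-up token ids rather than real tokenizer output:

```python
import torch

# a single "prompt" of three made-up token ids, shape (1, 3)
x = torch.tensor([[101, 102, 103]], dtype=torch.long)

# broadcast it to a batch of 4 rows; -1 keeps the sequence length unchanged
xb = x.expand(4, -1)

print(tuple(xb.shape))                # (4, 3)
print(xb.stride())                    # batch stride is 0: all rows alias the same memory
print(xb.data_ptr() == x.data_ptr())  # True: no data was copied
```

Because the rows share storage, the expanded tensor should be treated as read-only; call `.contiguous()` or `.clone()` first if you ever need to modify it in place.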
        

In [9]:
generate(prompt='Andrej Karpathy, the', num_samples=1, steps=400)

--------------------------------------------------------------------------------
Andrej Karpathy, the team's head of development; and Kevin Johnson, chairman and CEO of Sony. While the new platform is designed to drive an open-source application, the current approach, Johnson said, is aimed at developers who want to build cross-platform software for an open-source platform like PC games.

The announcement did not name any third-party partners.

It comes as Nvidia is seeking to bolster its gaming platform by launching its Tegra 5 chip, a processor at a cost of $569 and priced at $1,250 more than traditional hardware, which costs nearly $200. Microsoft is testing the device internally at a company event in San Jose, Calif., where it plans to test the system on PCs and desktops, he said during the presentation.

The company has seen an explosion in its share price over the past year. The shares have soared from $28 to nearly $50 since the launch in April. On the back of a $25,000 share pr
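
The sample above was drawn with do_sample=True and top_k=40, i.e. at each step the next token is sampled only from the 40 most likely candidates. The sketch below illustrates that idea in plain Python on a toy five-entry logit vector. It is not the code either library runs; `sample_top_k` and the logit values are hypothetical, chosen only for illustration.

```python
import math
import random

def sample_top_k(logits, k, rng=random):
    """Sample one index from `logits`, restricted to the k largest entries."""
    # indices of the k highest logits
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # softmax over the surviving logits only (subtract the max for stability)
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    # draw one index in proportion to its renormalized probability
    return rng.choices(top, weights=weights, k=1)[0]

rng = random.Random(3407)
logits = [2.0, 0.5, -1.0, 3.0, 0.0]  # toy values; a real model emits one logit per vocabulary entry

# with k=2 only indices 3 and 0 (the two largest logits) can ever be drawn
draws = [sample_top_k(logits, k=2, rng=rng) for _ in range(1000)]
print(sorted(set(draws)))

# with k=1 sampling collapses to picking the single most likely index
print(sample_top_k(logits, k=1, rng=rng))  # 3
```

A smaller k makes the output more conservative and repetitive, while a larger k admits less likely tokens and tends to produce more varied text.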
