# Generating text with a pre-trained GPT2 in PyTorch

This notebook was created as a part of a blog post - [Fine-tuning large Transformer models on a single GPU in PyTorch - Teaching GPT-2 a sense of humor](https://mf1024.github.io/2019/11/12/Fun-With-GPT-2/).

In this notebook, I will use a pre-trained medium-sized GPT2 model from the [huggingface](https://github.com/huggingface/transformers) transformers library to generate some text.

The easiest way to use the huggingface transformers library is to install its pip package *transformers*.

In [1]:
!pip install transformers

Collecting transformers
  Downloading transformers-4.6.0-py3-none-any.whl (2.3 MB)
Collecting regex!=2019.12.17
  Using cached regex-2021.4.4-cp36-cp36m-win_amd64.whl (269 kB)
Collecting huggingface-hub==0.0.8
  Downloading huggingface_hub-0.0.8-py3-none-any.whl (34 kB)
Collecting dataclasses
  Using cached dataclasses-0.8-py3-none-any.whl (19 kB)
Collecting filelock
  Using cached filelock-3.0.12-py3-none-any.whl (7.6 kB)
Collecting tqdm>=4.27
  Using cached tqdm-4.60.0-py2.py3-none-any.whl (75 kB)
Collecting tokenizers<0.11,>=0.10.1
  Using cached tokenizers-0.10.2-cp36-cp36m-win_amd64.whl (2.0 MB)
Collecting numpy>=1.17
  Using cached numpy-1.19.5-cp36-cp36m-win_amd64.whl (13.2 MB)
Collecting sacremoses
  Using cached sacremoses-0.0.45-py3-none-any.whl (895 kB)
Collecting requests
  Using cached requests-2.25.1-py2.py3-none-any.whl (61 kB)
Collecting chardet<5,>=3.0.2
  Using cached chardet-4.0.0-py2.py3-none-any.whl (178 kB)
Collecting idna<3,>=2.5
  Using cached idna-2.10-py2.py3-

In [4]:
import logging
logging.getLogger().setLevel(logging.CRITICAL)

import torch
import numpy as np

from transformers import GPT2Tokenizer, GPT2LMHeadModel

device = 'cpu'
if torch.cuda.is_available():
    print("CUDA FOUND")
    device = 'cuda'

CUDA FOUND


### Models and classes

I use the [GPT2LMHeadModel](https://github.com/huggingface/transformers/blob/master/transformers/modeling_gpt2.py#L491) module for the language model. It is [GPT2Model](https://github.com/huggingface/transformers/blob/master/transformers/modeling_gpt2.py#L326) with an additional linear layer that reuses the weights of the input embedding layer to do the inverse of the embedding operation: it turns the outputs of GPT2 into a vector of logits over the vocabulary.
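
To make the weight sharing concrete, here is a minimal NumPy sketch with toy sizes. The matrix `wte`, the sizes, and the random hidden state are illustrative assumptions, not values taken from the real GPT2. The point is only that one matrix serves both to look up embeddings and to produce one logit per vocabulary entry.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, hidden_size = 10, 4  # toy sizes, far smaller than GPT2's

# One shared matrix: row i is the embedding vector of token i.
wte = rng.normal(size=(vocab_size, hidden_size))

# Embedding lookup: token ids -> vectors.
token_ids = np.array([3, 7, 1])
embedded = wte[token_ids]              # shape (3, hidden_size)

# Stand-in for the transformer output at the last position.
hidden = rng.normal(size=hidden_size)  # shape (hidden_size,)

# Tied output layer: reuse the embedding matrix to score every token.
logits = wte @ hidden                  # shape (vocab_size,)

print(embedded.shape, logits.shape)
```

Each logit is the dot product between the hidden state and one token's embedding, so tokens whose embeddings point in a similar direction to the hidden state receive higher scores.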

[GPT2Tokenizer](https://github.com/huggingface/transformers/blob/master/transformers/tokenization_gpt2.py#L106) is a byte-pair encoding (BPE) tokenizer that transforms input text into the tokens that the pre-trained GPT2 models were trained on.
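
The core idea of byte-pair encoding can be sketched in a few lines of plain Python. This is a toy illustration that works on characters and learns its merges on the fly; the real GPT2 tokenizer works on bytes and applies a fixed, pre-learned merge table. The helper names `most_frequent_pair` and `merge_pair` are made up for this sketch.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Return the adjacent token pair that occurs most often."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0]

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with a single merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# Start from single characters and apply two merge steps.
tokens = list("low lower lowest")
for _ in range(2):
    tokens = merge_pair(tokens, most_frequent_pair(tokens))
print(tokens)
```

After a couple of merges, the frequent substring "low" becomes a single token, which is how common words and word pieces end up with their own entries in the vocabulary.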

In [5]:
tokenizer = GPT2Tokenizer.from_pretrained('gpt2-medium')
model = GPT2LMHeadModel.from_pretrained('gpt2-medium')
model = model.to(device)


In [6]:
# Keep only the n most probable tokens, renormalize their probabilities,
# and randomly sample one token ID from that reduced distribution
def choose_from_top(probs, n=5):
    ind = np.argpartition(probs, -n)[-n:]
    top_prob = probs[ind]
    top_prob = top_prob / np.sum(top_prob) # Normalize
    choice = np.random.choice(n, 1, p = top_prob)
    token_id = ind[choice][0]
    return int(token_id)
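
As a quick sanity check, the sketch below repeats the function so that it runs on its own and applies it to a small, made-up probability vector. The vector `probs` and the seed are assumptions chosen only for the demonstration. With `n=3`, only the three most probable indices should ever be returned.

```python
import numpy as np

def choose_from_top(probs, n=5):
    ind = np.argpartition(probs, -n)[-n:]   # indices of the n largest entries
    top_prob = probs[ind]
    top_prob = top_prob / np.sum(top_prob)  # renormalize so they sum to 1
    choice = np.random.choice(n, 1, p=top_prob)
    return int(ind[choice][0])

np.random.seed(0)  # only to make the demo repeatable

# Toy distribution over a 6-token vocabulary.
probs = np.array([0.02, 0.40, 0.03, 0.30, 0.05, 0.20])

samples = [choose_from_top(probs, n=3) for _ in range(1000)]
counts = {i: samples.count(i) for i in sorted(set(samples))}
print(counts)
```

Tokens 0, 2, and 4 are never drawn because they fall outside the top three, and the remaining tokens are drawn roughly in proportion to their original probabilities.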

### Text generation

At each prediction step, the GPT2 model needs to see all of the previous sequence elements to predict the next one. The function below tokenizes the starting input text and then runs a loop: at each step, one new token is predicted and appended to the sequence, which is fed back into the model in the next step. At the end, the token list is decoded back into text.

In [7]:
def generate_some_text(input_str, text_len = 250):

    cur_ids = torch.tensor(tokenizer.encode(input_str)).unsqueeze(0).long().to(device)

    model.eval()
    with torch.no_grad():

        for i in range(text_len):
            outputs = model(cur_ids)  # No labels needed: the loss is not used during generation
            logits = outputs[0]  # Shape: (batch, sequence_length, vocab_size)
            softmax_logits = torch.softmax(logits[0, -1], dim=0)  # Take the first (and only) batch item and the logits at the last position
            next_token_id = choose_from_top(softmax_logits.to('cpu').numpy(), n=10)  # Randomly sample the next token from the top n candidates
            cur_ids = torch.cat([cur_ids, torch.tensor([[next_token_id]], device=device)], dim=1)  # Append the new token to the sequence

        output_list = list(cur_ids.squeeze().to('cpu').numpy())
        output_text = tokenizer.decode(output_list)
        print(output_text)
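
The control flow of this loop can be seen in isolation with a toy stand-in for the model. In the sketch below, `toy_next_token_probs`, `EOS_ID`, and the 6-token vocabulary are all hypothetical and exist only for illustration. It also shows one possible variation that the function above does not implement: stopping early when an end-of-text token is sampled.

```python
import random

EOS_ID = 0  # hypothetical end-of-text id for this toy vocabulary

def toy_next_token_probs(ids, vocab_size=6):
    """Stand-in for the model: returns a made-up next-token distribution."""
    # Favor the token that follows the last one; purely illustrative.
    weights = [1.0] * vocab_size
    weights[(ids[-1] + 1) % vocab_size] = 5.0
    total = sum(weights)
    return [w / total for w in weights]

def generate_ids(start_ids, max_new_tokens=20, seed=0):
    rng = random.Random(seed)
    ids = list(start_ids)
    for _ in range(max_new_tokens):
        probs = toy_next_token_probs(ids)  # 1. predict from the whole sequence
        next_id = rng.choices(range(len(probs)), weights=probs)[0]  # 2. sample
        ids.append(next_id)                # 3. extend the sequence
        if next_id == EOS_ID:              # optional early stop
            break
    return ids

ids = generate_ids([2, 3])
print(ids)
```

The sequence grows by exactly one token per step, and every prediction is conditioned on everything generated so far, which mirrors how `cur_ids` is extended in the notebook's function.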

## Generating the text

I will give three different sentence beginnings to GPT2 and let it generate the rest:


***1. The Matrix is everywhere. It is all around us. Even now, in this very room. You can see it when you look out your window or when you turn on your television. You can feel it when you go to work… when you go to church… when you pay your taxes. It is the world that has been pulled over your eyes to blind you from the truth…***

***2. Artificial general intelligence is…***

***3. The Godfather: “I’m going to make him an offer he can’t refuse.”…***

In [8]:
generate_some_text(" david went to zoo")

 david went to zoo. I'm sure he'll tell you that the only thing more frustrating than being stuck with your dog is the constant reminders that you have to go to the vet." – David, San Francisco, CA

"You've probably heard by now that I'm an animal lover and that I've tried to get my dog to live a long and happy life. But there's a catch: I've got to find a way to pay for the care that I don't want to have to take, so I'm not happy with the state of my house.

So I've been considering the option of giving up the dog, and I think my dog needs to be taken care of. I've heard of people doing what I did, of dogs who've had their lives ended by neglectful owners, but I think there's one thing that I could do to prevent it.

I have three cats who I love dearly. My dogs are the only two who are not in a good state. So I think that I should consider giving my cats away so that they can have a better quality of life.

This could save me hundreds of dollars, and it might even be worth the hassle.

In [9]:
generate_some_text("i love cooking")

i love cooking with this dish. It's really easy to make, and tastes wonderful. If you're feeling especially adventurous or just want the perfect treat, you could also use this for a snack or a lunch. You can also make this in batches if you'd rather eat the same thing over and over again and not have it get messy.

This recipe is so easy to make, you'll have a blast doing it on your lunch break. I promise, you'll be smiling and laughing the entire time you're cooking!

4.8 from 5 reviews The Best Vegan Cheesecake Print Prep time 5 mins Cook time 15 mins Total time 25 mins Vegan Cheesecake made with just a handful of ingredients, it will have you craving more. Course: Dessert Serves: 6 This is one of my all-time favorite desserts so I can only imagine how delicious these will be on our dinner table. Author: Allison Recipe type: Dessert Cuisine: Vegan Serves: 6 Author: Allison Ingredients 1/2 cup coconut milk

3 tablespoons almond milk (I used my regular, unrefined milk)

1 cup sugar (I 

In [10]:
generate_some_text("quick indian snack recepie")

quick indian snack recepie. It has no fat in it and has a sweet taste and is great served with a salad or side dish to go with it. If you like your Indian snacks to be spicy then this is the snack for you.

This snack is very versatile. It can be served as a side dish as well.

This is my first time making indian snacks, so I am not really sure how it is supposed to be enjoyed. But my wife likes them, and I am happy about that, so I'm sharing the recipe with you. Enjoy! (I have no idea why you would make indian snacks, but they are delicious).

Ingredients:

10 cups water

1/4 cup white vinegar

1/2 tsp ground cumin

1/4 tsp turmeric

1/2 cup finely chopped coriander leaves

1/2 cup sliced onions

1/2 cup cilantro leaves

2 cups plain Indian bread crumbs

2 tsp curry powder

2 tsp salt

1 tsp black pepper

1 cup finely chopped tomatoes

Method:

In a small saucepan heat water. Add 1/4 to 1/2 cup


In [12]:
generate_some_text("who killed raj")

who killed rajendra kadak] was not in our area. We had to leave our homes in panic," Sohrab says.

Rajendra's parents say he is in critical condition and is expected to recover soon.

"He is in a very critical condition and we have given him all kinds of drugs, including opium. We have even given him a poison to make him die," Rajendra's mother Soman says.

"We have taken a lot of painkillers to relieve the pain we have been suffering from our son. We are now going to the hospital to take some blood and take some pills to give him the best possible treatment," Sohrab adds.

Rajendra's family also says their only child had an IQ of around 130.

"He was the only son of our family who worked with his hands. He was not afraid of anything. He was always doing something for the community," Sohrab says.

Sohrab's son says his parents were in a very difficult position after a few months and they didn't know which way to go after that. They tried everything but couldn't figure out a solution. T