# Use pretrained GPT-2 models to generate text

In [1]:
# Install transformers
# !pip install transformers


## Import packages

In [14]:
import tensorflow as tf
# get transformers
from transformers import TFGPT2LMHeadModel, GPT2Tokenizer

# To reproduce results
SEED = 271
tf.random.set_seed(SEED)

# Maximum length of the generated sequence, in tokens (the prompt tokens count toward it)
MAX_LEN = 50

## Large GPT-2 model

In [3]:
# Load the large GPT-2 tokenizer and model
tokenizer_large = GPT2Tokenizer.from_pretrained("gpt2-large")
# GPT-2 has no padding token, so reuse the end-of-sequence token for padding
GPT2_large = TFGPT2LMHeadModel.from_pretrained("gpt2-large", pad_token_id=tokenizer_large.eos_token_id)

# View the model summary (layers and parameter counts)
GPT2_large.summary()

All model checkpoint layers were used when initializing TFGPT2LMHeadModel.

All the layers of TFGPT2

Model: "tfgpt2lm_head_model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 transformer (TFGPT2MainLaye  multiple                 774030080 
 r)                                                              
                                                                 
Total params: 774,030,080
Trainable params: 774,030,080
Non-trainable params: 0
_________________________________________________________________


In [10]:
input_sequence = "I read a book today. It is so informative and"

## Generate text using greedy search

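Greedy search picks the single most probable next token at every step. It never explores lower-probability alternatives, so it can miss better overall sequences and tends to fall into repetitive loops, as the outputs below show.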
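
To make the selection rule concrete, here is a minimal sketch of greedy decoding on a toy model. `TOY_MODEL` and `greedy_decode` are hypothetical names invented for this illustration, and the probabilities are made up. Unlike GPT-2, which conditions on the whole context, this toy model looks only at the previous token, which is enough to show how always taking the top candidate can loop.

```python
# Hypothetical toy "language model": for each previous token, the
# probabilities of the candidate next tokens (values are made up).
TOY_MODEL = {
    "I": {"am": 0.6, "read": 0.4},
    "am": {"so": 0.7, "glad": 0.3},
    "so": {"glad": 0.55, "happy": 0.45},
    "glad": {"I": 0.5, "today": 0.3, ".": 0.2},
}


def greedy_decode(model, prompt, max_length):
    """Extend `prompt` by always appending the most probable next token."""
    tokens = list(prompt)
    # As with generate(max_length=...), the prompt counts toward the limit.
    while len(tokens) < max_length:
        candidates = model.get(tokens[-1])
        if not candidates:
            break  # no known continuation for this token
        # Greedy step: keep only the highest-probability candidate.
        tokens.append(max(candidates, key=candidates.get))
    return tokens


print(" ".join(greedy_decode(TOY_MODEL, ["I"], max_length=10)))
```

The lower-probability candidates (`read`, `happy`, `today`) are never chosen, so the sequence cycles through the same four tokens until it hits the length limit.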

In [11]:
# encode context the generation is conditioned on
input_ids = tokenizer_large.encode(input_sequence, return_tensors='tf')

# generate text until the output length (which includes the context length) reaches MAX_LEN tokens
greedy_output = GPT2_large.generate(input_ids, max_length=MAX_LEN)

print("Output:\n" + 100 * '-')
print(tokenizer_large.decode(greedy_output[0], skip_special_tokens = True))

Output:
----------------------------------------------------------------------------------------------------
I read a book today. It is so informative and so well written. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I


## Medium GPT-2 model

In [12]:
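# Load the medium GPT-2 tokenizer and model
tokenizer_medium = GPT2Tokenizer.from_pretrained("gpt2-medium")
# GPT-2 has no padding token, so reuse the end-of-sequence token for padding
GPT2_medium = TFGPT2LMHeadModel.from_pretrained("gpt2-medium", pad_token_id=tokenizer_medium.eos_token_id)

# View the model summary (layers and parameter counts)
GPT2_medium.summary()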

All model checkpoint layers were used when initializing TFGPT2LMHeadModel.

All the layers of TFGPT2LMHeadModel were initialized from the model checkpoint at gpt2-medium.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


Model: "tfgpt2lm_head_model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 transformer (TFGPT2MainLaye  multiple                 354823168 
 r)                                                              
                                                                 
Total params: 354,823,168
Trainable params: 354,823,168
Non-trainable params: 0
_________________________________________________________________


In [13]:
# encode context the generation is conditioned on
input_ids = tokenizer_medium.encode(input_sequence, return_tensors='tf')

# generate text until the output length (which includes the context length) reaches MAX_LEN tokens
greedy_output = GPT2_medium.generate(input_ids, max_length=MAX_LEN)

print("Output:\n" + 100 * '-')
print(tokenizer_medium.decode(greedy_output[0], skip_special_tokens = True))

Output:
----------------------------------------------------------------------------------------------------
I read a book today. It is so informative and so well written. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I read it. I am so glad I


## Small GPT-2 model

In [15]:
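# Load the small GPT-2 tokenizer and model
tokenizer_small = GPT2Tokenizer.from_pretrained("gpt2")
# GPT-2 has no padding token, so reuse the end-of-sequence token for padding
GPT2_small = TFGPT2LMHeadModel.from_pretrained("gpt2", pad_token_id=tokenizer_small.eos_token_id)

# View the model summary (layers and parameter counts)
GPT2_small.summary()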

All model checkpoint layers were used when initializing TFGPT2LMHeadModel.

All the layers of TFGPT2LMHeadModel were initialized from the model checkpoint at gpt2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


Model: "tfgpt2lm_head_model_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 transformer (TFGPT2MainLaye  multiple                 124439808 
 r)                                                              
                                                                 
Total params: 124,439,808
Trainable params: 124,439,808
Non-trainable params: 0
_________________________________________________________________
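
The parameter totals in the three summaries can be reproduced by hand from the model dimensions. The sketch below assumes the published GPT-2 configurations, which are not stated in this notebook: a vocabulary of 50,257 tokens, 1,024 positions, and 12, 24, and 36 layers with hidden sizes 768, 1,024, and 1,280 for the small, medium, and large models. The function name `gpt2_param_count` is hypothetical.

```python
def gpt2_param_count(n_layer, n_embd, vocab_size=50257, n_positions=1024):
    """Count the weights of a GPT-2 style model from its dimensions."""
    token_embeddings = vocab_size * n_embd
    position_embeddings = n_positions * n_embd
    layer_norm = 2 * n_embd  # scale and bias
    # Attention: fused query/key/value projection plus output projection.
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Feed-forward: expand to 4 * n_embd, then project back.
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    per_block = 2 * layer_norm + attention + mlp
    # The output head shares the token embedding matrix, so it adds nothing.
    return token_embeddings + position_embeddings + n_layer * per_block + layer_norm


# (n_layer, n_embd) per checkpoint, assumed from the published configurations.
SIZES = {"gpt2": (12, 768), "gpt2-medium": (24, 1024), "gpt2-large": (36, 1280)}

for name, (n_layer, n_embd) in SIZES.items():
    print(f"{name}: {gpt2_param_count(n_layer, n_embd):,}")
```

Under these assumptions the computed totals agree with the `Total params` lines reported by `summary()` for all three models.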


In [16]:
# encode context the generation is conditioned on
input_ids = tokenizer_small.encode(input_sequence, return_tensors='tf')

# generate text until the output length (which includes the context length) reaches MAX_LEN tokens
greedy_output = GPT2_small.generate(input_ids, max_length=MAX_LEN)

print("Output:\n" + 100 * '-')
print(tokenizer_small.decode(greedy_output[0], skip_special_tokens = True))

Output:
----------------------------------------------------------------------------------------------------
I read a book today. It is so informative and so very interesting. I am so glad I read it. I am so glad I read it.

I read a book today. It is so informative and so very interesting. I am
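
All three greedy outputs repeat the same sentence. One rough way to put a number on that is the share of word n-grams that occur more than once in the text. This is a minimal sketch rather than a standard metric: the helper name `repeated_ngram_fraction` is hypothetical, and it splits on whitespace instead of using the GPT-2 tokenizer.

```python
from collections import Counter


def repeated_ngram_fraction(text, n=3):
    """Return the fraction of word n-grams in `text` that occur more than once."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0  # text is shorter than n words
    counts = Counter(ngrams)
    # Count every occurrence of an n-gram that appears at least twice.
    repeated = sum(count for count in counts.values() if count > 1)
    return repeated / len(ngrams)


sample = "I am so glad I read it. I am so glad I read it."
print(f"{repeated_ngram_fraction(sample):.2f}")
```

A value near 0 means almost no repeated phrases and a value near 1 means the text is mostly loops. Passing the decoded output of each model to this function would allow a side-by-side comparison.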


# References

https://huggingface.co/blog/how-to-generate