In [2]:
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

In [3]:
# initialize tokenizer and model from pretrained GPT2 model
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

<b>Tokenization</b>

The tokenizer is used to translate between human-readable text and numeric indices. These indices are then mapped to word embeddings (numerical representations of words) by an embedding layer within the model.

All we need to do to tokenize our input text is call the tokenizer.encode method like so:
<b>inputs = tokenizer.encode(sequence, return_tensors='pt')</b>

<i>Because we are using PyTorch, we pass return_tensors='pt'. With TensorFlow, we would pass return_tensors='tf' instead.</i>
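To make the text-to-index mapping concrete, here is a minimal sketch of a word-level tokenizer. The vocabulary and the `toy_encode`/`toy_decode` helpers are hypothetical and only illustrate the idea; GPT-2 itself uses a much larger byte-pair-encoding vocabulary, so real token IDs will differ.

```python
# Toy word-level vocabulary (hypothetical; not GPT-2's actual BPE vocabulary).
vocab = {"<unk>": 0, "the": 1, "war": 2, "cabinet": 3, "met": 4}
id_to_token = {idx: tok for tok, idx in vocab.items()}


def toy_encode(text):
    """Map each whitespace-separated word to its index, or to <unk> if unseen."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]


def toy_decode(ids):
    """Map indices back to words and join them with spaces."""
    return " ".join(id_to_token[i] for i in ids)


ids = toy_encode("The war cabinet met")
print(ids)              # [1, 2, 3, 4]
print(toy_decode(ids))  # the war cabinet met
```

The model never sees the strings, only the list of indices, which its embedding layer turns into vectors.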

<b>Generate</b>

Now that we have our tokenized input text, we can begin generating text with GPT-2! All we do is call the model.generate method:
<b>outputs = model.generate(inputs, max_length=200, do_sample=True)</b>
Here max_length=200 caps the total sequence length (the prompt plus the generated tokens) at 200 tokens. We also add <b>do_sample=True</b> so that the model samples from its predicted distribution instead of just picking the most likely token at every step (greedy decoding), which ends up looking like this:

He began his premiership by forming a five-man war cabinet which included Chamberlain as Lord President of the Council, Labour leader Clement Attlee as Lord Privy Seal (later as Deputy Prime Minister), Halifax as Foreign Secretary and Labour's Arthur Greenwood as a minister without portfolio. In practice,<b> he was a very conservative figure, but he was also a very conservative figure in the sense that he was a very conservative figure in the sense that he was a very conservative figure in the sense that he was a very conservative figure in the sense that he was a very conservative figure in the sense that he was a very conservative figure in the sense that he was a very conservative figure in the sense that he was a very conservative figure</b>
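The repetitive loop comes from always taking the single most probable token. A minimal sketch of the difference between greedy picking and sampling, using a made-up next-token distribution (the tokens, probabilities, and helper names are illustrative assumptions, not output from GPT-2):

```python
import random

# Hypothetical next-token probabilities for a single decoding step.
next_token_probs = {"conservative": 0.5, "figure": 0.3, "leader": 0.15, "orator": 0.05}


def greedy_pick(probs):
    """Always return the single most probable token (what happens without sampling)."""
    return max(probs, key=probs.get)


def sample_pick(probs, rng):
    """Draw one token at random, weighted by its probability (what sampling does)."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
greedy_choices = {greedy_pick(next_token_probs) for _ in range(20)}
sampled_choices = {sample_pick(next_token_probs, rng) for _ in range(200)}

print(greedy_choices)   # the same token every time
print(sampled_choices)  # several different tokens
```

Because greedy picking is deterministic, once the model enters a loop it has no way out; sampling occasionally chooses a less likely token and breaks the cycle.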

The <b>top_k</b> and <b>temperature</b> arguments can be used to modify our outputs' coherence/randomness; we will cover these at the end.


<b>Decoding</b>

Our generate step outputs a tensor of token IDs rather than words. To convert these IDs into text, we need to <b>.decode</b> them. This is easy to do:
<b>text = tokenizer.decode(outputs[0], skip_special_tokens=True)</b>
All we need to add is <b>skip_special_tokens=True</b> to avoid decoding special tokens that are used by the model, such as the end of sequence token <|endoftext|>.
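As a rough illustration of what skipping special tokens means, the sketch below decodes a short list of IDs with and without the filter. The lookup table and the `decode_ids` helper are hypothetical; only the ID 50256 for <|endoftext|> is taken from GPT-2's vocabulary.

```python
# Hypothetical id-to-token table; only the <|endoftext|> id (50256) matches GPT-2.
id_to_token = {0: "Hello", 1: " world", 50256: "<|endoftext|>"}
special_ids = {50256}


def decode_ids(ids, skip_special_tokens=False):
    """Concatenate token strings, optionally leaving out special tokens."""
    kept = [i for i in ids if not (skip_special_tokens and i in special_ids)]
    return "".join(id_to_token[i] for i in kept)


raw = decode_ids([0, 1, 50256])
clean = decode_ids([0, 1, 50256], skip_special_tokens=True)
print(repr(raw))    # 'Hello world<|endoftext|>'
print(repr(clean))  # 'Hello world'
```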

In [None]:
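# `inputs` is the encoded prompt produced by tokenizer.encode in the Tokenization step
outputs = model.generate(
    inputs, max_length=200, do_sample=True
)
# outputs[0] is the first (and here only) generated sequence of token IDs
tokenizer.decode(outputs[0], skip_special_tokens=True)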

We can add more randomness with <b>temperature</b>. The default value is 1; a high value like 5 will produce fairly nonsensical output.

In [None]:
# temperature is written as a float; some transformers versions reject an int here
outputs = model.generate(
    inputs, max_length=200, do_sample=True, temperature=5.0
)
tokenizer.decode(outputs[0], skip_special_tokens=True)

<b>Turning the temperature down below 1</b> will produce more predictable but less creative outputs.
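To see why temperature has this effect, here is a small sketch of temperature-scaled softmax: the scores (logits) are divided by the temperature before being turned into probabilities. The logits and the `softmax_with_temperature` helper are illustrative assumptions, not values taken from GPT-2.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.5)
default = softmax_with_temperature(logits, 1.0)
hot = softmax_with_temperature(logits, 5.0)

for name, probs in (("0.5", cold), ("1.0", default), ("5.0", hot)):
    print(name, [round(p, 3) for p in probs])
```

A temperature below 1 concentrates probability on the top token, while a high temperature flattens the distribution so that unlikely tokens are sampled almost as often as likely ones.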

We can also add the <b>top_k</b> parameter, which limits sampling to a given number of the most probable tokens. This results in text that tends to stick to the same topic (or set of words) for a longer period of time.
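A minimal sketch of top-k filtering, assuming made-up logits for five candidate tokens: everything outside the k highest-scoring tokens is given zero probability, and the remainder is renormalised. The `top_k_probs` helper is hypothetical and only mirrors the idea behind the parameter.

```python
import math


def top_k_probs(logits, k):
    """Keep the k highest logits, zero out the rest, and renormalise with softmax."""
    keep = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in keep)
    exps = {i: math.exp(logits[i] - peak) for i in keep}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(logits))]


logits = [3.0, 2.5, 1.0, -1.0, -2.0]  # hypothetical scores for five tokens
probs = top_k_probs(logits, k=2)
print([round(p, 3) for p in probs])
```

With k=2 only the two best candidates can ever be sampled, which is why a small top_k keeps the text on a narrower set of words.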