<a href="https://colab.research.google.com/github/pjheslin/colab-notebooks/blob/main/gpt2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Text generation with GPT-2

Code adapted from the [Hugging Face blog post on text generation](https://huggingface.co/blog/how-to-generate).

In [1]:
!pip3 install -q git+https://github.com/huggingface/transformers.git

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done


In [2]:
import transformers

In [3]:
import tensorflow as tf

In [4]:
print(tf.config.list_physical_devices('GPU'))

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]


In [6]:
from transformers import TFGPT2LMHeadModel, GPT2Tokenizer

In [7]:
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

In [8]:
# add the EOS token as PAD token to avoid warnings
model = TFGPT2LMHeadModel.from_pretrained("gpt2", pad_token_id=tokenizer.eos_token_id)

All model checkpoint layers were used when initializing TFGPT2LMHeadModel.

All the layers of TFGPT2LMHeadModel were initialized from the model checkpoint at gpt2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


## Greedy search

In [9]:
input_ids = tokenizer.encode('To be or not to be, that is the', return_tensors='tf')

In [14]:
greedy_output = model.generate(input_ids, max_length=100)

In [15]:
print(tokenizer.decode(greedy_output[0], skip_special_tokens=True))

To be or not to be, that is the question.

The question is, what is the difference between a "good" and a "bad" person?

The answer is, that the good person is a good person.

The bad person is a bad person.

The question is, what is the difference between a "good" and a "bad" person?

The answer is, that the good person is a good person.

The bad person is a bad person.

The question is, what is the difference between a "good" and a "bad" person?

The answer is, that the good person is a good person.

The bad person is a bad person.

The question is, what is the difference between a "good" and a "bad" person?

The answer is, that the good person is a good person.

The bad person is a bad person.

The question is, what is the difference between a "good" and a "bad" person?

The answer is, that the good person is a good person.

The bad person is a bad person.

The question is, what is the difference between a "good" and a "bad" person?

The answer is, that the good person is a good per
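
The loop in the output above follows from how greedy search works: at every step it takes the single most probable next token, with no randomness, so once the text returns to a state it has already visited it tends to follow the same path again. The sketch below illustrates this with a toy next-token table. `TOY_MODEL` and its probabilities are made up for illustration, and unlike GPT-2 the toy conditions only on the previous token.

```python
# Toy next-token distributions keyed by the previous token.
# The table and its probabilities are hypothetical, not taken from GPT-2.
TOY_MODEL = {
    "the": {"question": 0.5, "answer": 0.3, "way": 0.2},
    "question": {"is": 0.9, "the": 0.1},
    "answer": {"is": 0.8, "the": 0.2},
    "is": {"the": 0.6, "question": 0.4},
    "way": {"is": 1.0},
}


def greedy_decode(start, max_length):
    """Repeatedly append the most probable next token (argmax)."""
    tokens = [start]
    while len(tokens) < max_length:
        candidates = TOY_MODEL[tokens[-1]]
        # Greedy choice: highest probability, no randomness.
        tokens.append(max(candidates, key=candidates.get))
    return tokens


greedy_tokens = greedy_decode("the", max_length=10)
print(" ".join(greedy_tokens))
```

The toy decoder cycles through "the question is" until it reaches max_length, which mirrors the repeated paragraph GPT-2 produced above.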

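## Beam search

We activate beam search by setting num_beams above 1, and set early_stopping=True so that generation stops once all beam hypotheses have reached the EOS token.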
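
Beam search keeps the num_beams highest-scoring partial sequences at each step instead of only the single best one, so it can find a sequence whose first token looks worse but whose overall probability is higher. The sketch below shows the idea on a toy table; `TOY_MODEL`, its probabilities, and the `beam_search` helper are assumptions made for illustration and are not the library's implementation.

```python
import math

# Toy next-token probabilities keyed by the previous token (made-up numbers).
TOY_MODEL = {
    "The": {"nice": 0.5, "dog": 0.4, "car": 0.1},
    "nice": {"woman": 0.4, "house": 0.3, "guy": 0.3},
    "dog": {"has": 0.9, "runs": 0.05, "and": 0.05},
    "car": {"is": 0.6, "drives": 0.4},
}


def beam_search(start, steps, num_beams):
    """Keep the num_beams highest-scoring partial sequences at each step."""
    beams = [([start], 0.0)]  # (tokens, sum of log-probabilities)
    for _ in range(steps):
        expanded = []
        for tokens, score in beams:
            for token, prob in TOY_MODEL[tokens[-1]].items():
                expanded.append((tokens + [token], score + math.log(prob)))
        # Prune to the best num_beams hypotheses.
        beams = sorted(expanded, key=lambda b: b[1], reverse=True)[:num_beams]
    return beams


greedy_like = beam_search("The", steps=2, num_beams=1)[0]
best_of_two = beam_search("The", steps=2, num_beams=2)[0]
print(greedy_like[0], round(math.exp(greedy_like[1]), 2))
print(best_of_two[0], round(math.exp(best_of_two[1]), 2))
```

With a single beam the search behaves like greedy decoding and commits to "nice"; with two beams it keeps "dog" alive long enough to find the more probable sequence "The dog has".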

In [16]:
beam_output = model.generate(
    input_ids, 
    max_length=100, 
    num_beams=5, 
    early_stopping=True
)

In [17]:
print(tokenizer.decode(beam_output[0], skip_special_tokens=True))

To be or not to be, that is the way it is."

"I'm not going to tell you what to do. I'm not going to tell you what to do. I'm not going to tell you what to do.


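The beam output above still repeats itself. Setting no_repeat_ngram_size to 2 prevents any 2-gram from appearing twice in the generated text.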
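
The rule can be sketched as follows: before choosing the next token, look at the last n-1 generated tokens and ban every token that would complete an n-gram already present. The helper `banned_next_tokens` is hypothetical and works on plain words; the library applies the same kind of rule to token ids, and its internals may differ.

```python
def banned_next_tokens(tokens, ngram_size):
    """Return tokens that would complete an n-gram already present in `tokens`."""
    if ngram_size < 1 or len(tokens) < ngram_size - 1:
        return set()
    # The last (n - 1) tokens form the prefix of the n-gram being completed.
    prefix = tuple(tokens[len(tokens) - (ngram_size - 1):])
    banned = set()
    for i in range(len(tokens) - ngram_size + 1):
        ngram = tuple(tokens[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


# A fragment of the repetitive beam output above, split on whitespace.
generated = "I'm not going to tell you what to".split()
banned = banned_next_tokens(generated, ngram_size=2)
print(banned)
```

Here the text ends in "to", and "to tell" has already occurred, so "tell" is banned while "do" is still allowed. Note that a small n can be too strict for longer texts, since a legitimate 2-gram such as a name can then appear only once.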

In [19]:
beam_output = model.generate(
    input_ids, 
    max_length=100, 
    num_beams=5, 
    no_repeat_ngram_size=2, 
    early_stopping=True
)
print(tokenizer.decode(beam_output[0], skip_special_tokens=True))

To be or not to be, that is the way it is."

"I'm not going to tell you what to do," he said. "I don't want you to know what I'm doing. I just want to make sure that you understand what's going on."


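## Sampling

Instead of always taking the most probable token, we can sample the next token at random from the model's conditional probability distribution.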

In [20]:
# set seed to reproduce results. 
tf.random.set_seed(2)

# activate sampling; top_k=0 disables top-k filtering so the full distribution is used
sample_output = model.generate(
    input_ids, 
    do_sample=True, 
    max_length=100, 
    top_k=0
)
print(tokenizer.decode(sample_output[0], skip_special_tokens=True))

To be or not to be, that is the ultimate duality. It is unavoidable now but indisputable from 1957 onwards.

That Lavinia isn't an enigma until the hang it clacket," Bob explained, "is proof that sectarian impotence had only a relatively small part a minute though by 1957 it was becoming even clearer. Color comes only by way of a DVD because it is so easily reported by reliable sources – and even less by newspapers and magazines. Very few books
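
Sampling draws each next token at random in proportion to its probability, which is why unlikely continuations such as the ones above can appear. A minimal sketch using only the standard library; `NEXT_TOKEN_PROBS` is a made-up distribution and `sample_next_token` is a hypothetical helper, not part of transformers.

```python
import random
from collections import Counter

# Made-up conditional distribution over the next token.
NEXT_TOKEN_PROBS = {"question": 0.5, "way": 0.3, "thing": 0.15, "duality": 0.05}


def sample_next_token(probs, rng):
    """Draw one token with probability proportional to its weight."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(2)  # fixed seed, playing the role of tf.random.set_seed
num_draws = 10_000
draws = Counter(sample_next_token(NEXT_TOKEN_PROBS, rng) for _ in range(num_draws))
frequencies = {t: draws[t] / num_draws for t in NEXT_TOKEN_PROBS}
print(frequencies)
```

Over many draws the observed frequencies approach the underlying probabilities, so a token with probability 0.05 is still picked about one time in twenty.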


Lowering the temperature below 1 sharpens the distribution: likely tokens become more likely and unlikely tokens less likely.

In [21]:
tf.random.set_seed(1)

# use temperature to decrease the sensitivity to low probability candidates
sample_output = model.generate(
    input_ids, 
    do_sample=True, 
    max_length=100, 
    top_k=0, 
    temperature=0.7
)

print(tokenizer.decode(sample_output[0], skip_special_tokens=True))

To be or not to be, that is the thing that makes me think.

Any and all words will be heard over and over again.

It's hard to imagine the rational mind that is having so much trouble understanding your own thoughts.

So be careful.

Think about what you are doing at the moment.

Think about how you are going to behave in the future.

Think about what you will want to do in the future.

Think
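
The effect of temperature can be seen by dividing the logits by the temperature before applying softmax. The sketch below uses made-up logits for three candidate tokens; `softmax_with_temperature` is a hypothetical helper written for this illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
plain = softmax_with_temperature(toy_logits, temperature=1.0)
cooled = softmax_with_temperature(toy_logits, temperature=0.7)
print([round(p, 3) for p in plain])
print([round(p, 3) for p in cooled])
```

At temperature 0.7 the top token gains probability and the bottom token loses it, compared with temperature 1.0. As the temperature approaches 0, sampling approaches greedy decoding.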


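### Top-K sampling

Top-K sampling keeps only the K most likely next tokens and redistributes the probability mass among them before sampling.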

In [24]:
tf.random.set_seed(1)

# set top_k to 50
sample_output = model.generate(
    input_ids, 
    do_sample=True, 
    max_length=100, 
    top_k=50
)

print(tokenizer.decode(sample_output[0], skip_special_tokens=True))

To be or not to be, that is the thing that sets me apart.

The fact that a man will make a game about sex and then make a movie about men having sex with each other is a thing that is worth knowing.

The fact that every time my life is done by a man, I become aware. I don't get out of bed every day when I'm in my 80s. But they do always remind me that I am never alone.

My
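
The filtering step behind top-K sampling can be sketched in a few lines: sort the candidates, keep the K best, renormalize, then sample. `NEXT_TOKEN_PROBS` is a made-up distribution and `top_k_filter` is a hypothetical helper; the library works on logits over the whole vocabulary instead.

```python
import random

# Made-up next-token distribution with a tail of unlikely tokens.
NEXT_TOKEN_PROBS = {
    "thing": 0.40,
    "question": 0.30,
    "way": 0.20,
    "clacket": 0.07,
    "enigma": 0.03,
}


def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    kept = sorted(probs, key=probs.get, reverse=True)[:k]
    total = sum(probs[t] for t in kept)
    return {t: probs[t] / total for t in kept}


filtered = top_k_filter(NEXT_TOKEN_PROBS, k=3)
rng = random.Random(1)
tokens = list(filtered)
sampled = rng.choices(tokens, weights=[filtered[t] for t in tokens], k=1)[0]
print(filtered)
print(sampled)
```

With k=3 the two tail tokens can never be sampled. Because K is fixed, the same number of candidates is kept whether the distribution is sharply peaked or nearly flat.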


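### Top-p (nucleus) sampling

Top-p sampling chooses from the smallest set of tokens whose cumulative probability exceeds p, so the number of candidates grows or shrinks with the shape of the distribution.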

In [26]:
tf.random.set_seed(1)

# deactivate top_k and sample only from the most likely tokens that together hold 92% of the probability mass
sample_output = model.generate(
    input_ids, 
    do_sample=True, 
    max_length=100, 
    top_p=0.92, 
    top_k=0
)

print(tokenizer.decode(sample_output[0], skip_special_tokens=True))

To be or not to be, that is the thing that sets me apart.

Any and all words will be horrible for me when they begin to remember my experiences.


I have received many comments of my own in support of my request to obtain a competent auditor for my project. If you're wondering about the logistics of acquiring an auditor for a design work, you can find up to a year's worth of other full-time work assistance at CoDesk.


Regarding your requests
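
The nucleus can be sketched as a cumulative sum over the sorted probabilities. Both distributions below are made up, `top_p_filter` is a hypothetical helper, and the boundary handling (stopping once the running total reaches top_p) is an assumption of this sketch that may differ in detail from the library.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most probable tokens whose total mass reaches top_p."""
    kept = {}
    cumulative = 0.0
    for token in sorted(probs, key=probs.get, reverse=True):
        kept[token] = probs[token]
        cumulative += probs[token]
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    # Renormalize so the kept probabilities sum to 1.
    return {t: p / total for t, p in kept.items()}


# Made-up distributions: one sharply peaked, one fairly flat.
peaked = {"question": 0.90, "way": 0.05, "thing": 0.03, "duality": 0.02}
flat = {"a": 0.30, "b": 0.30, "c": 0.20, "d": 0.15, "e": 0.05}

peaked_kept = top_p_filter(peaked, top_p=0.92)
flat_kept = top_p_filter(flat, top_p=0.92)
print(list(peaked_kept))
print(list(flat_kept))
```

With the same top_p of 0.92, the peaked distribution keeps two tokens and the flat one keeps four, which is the adaptive behavior that a fixed K cannot provide.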
