In [1]:
import tensorflow as tf
import numpy as np

tf.__version__

tf.random.set_seed(42)  # for reproducible results

In [2]:
tf.keras.backend.clear_session()  # for easy reset of notebook state

# check whether TensorFlow can see a GPU
#tf.debugging.set_log_device_placement(True)
gpus = tf.config.list_physical_devices('GPU')
if gpus:
  # Restrict TensorFlow to only use the first GPU
  try:
    tf.config.experimental.set_memory_growth(gpus[0], True)
    tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPU")
  except RuntimeError as e:
    # Visible devices must be set before GPUs have been initialized
    print(e)

1 Physical GPUs, 1 Logical GPU


In [4]:
!pip install transformers

Collecting transformers
  Downloading transformers-4.18.0-py3-none-any.whl (4.0 MB)
Collecting sacremoses
  Downloading sacremoses-0.0.49-py3-none-any.whl (895 kB)
Collecting huggingface-hub<1.0,>=0.1.0
  Downloading huggingface_hub-0.5.1-py3-none-any.whl (77 kB)
Collecting pyyaml>=5.1
  Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB)
Collecting tokenizers!=0.11.3,<0.13,>=0.11.1
  Downloading tokenizers-0.12.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.6 MB)
Installing collected packages: pyyaml, tokenizers, sacremoses, huggingface-hub, transformers
  Attempting uninstall: pyyaml


In [5]:
from transformers import TFOpenAIGPTLMHeadModel, OpenAIGPTTokenizer

In [6]:
gpttokenizer = OpenAIGPTTokenizer.from_pretrained('openai-gpt')
gpt = TFOpenAIGPTLMHeadModel.from_pretrained('openai-gpt')

ftfy or spacy is not installed using BERT BasicTokenizer instead of SpaCy & ftfy.


All model checkpoint layers were used when initializing TFOpenAIGPTLMHeadModel.

All the layers of TFOpenAIGPTLMHeadModel were initialized from the model checkpoint at openai-gpt.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFOpenAIGPTLMHeadModel for predictions without further training.


In [7]:
input_ids = gpttokenizer.encode('Robotics is the ', return_tensors='tf')
print(input_ids)
greedy_output = gpt.generate(input_ids, max_length=100)

print("Output:\n" + 100 * '-')
print(gpttokenizer.decode(greedy_output[0], skip_special_tokens=True))

tf.Tensor([[5846 9259  544  481]], shape=(1, 4), dtype=int32)
Output:
----------------------------------------------------------------------------------------------------
robotics is the only way to get to the surface. " 
 " i'm not sure i understand. " 
 " the first thing we have to do is find a way to get to the surface. " 
 " but how? " 
 " we have to find a way to get to the surface. " 
 " but how? " 
 " we have to find a way to get to the surface. " 
 " but how? " 
 " we have to find a way to


In [8]:
from transformers import TFGPT2LMHeadModel, GPT2Tokenizer

gpt2tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# add the EOS token as PAD token to avoid warnings
gpt2 = TFGPT2LMHeadModel.from_pretrained("gpt2", 
                                         pad_token_id=gpt2tokenizer.eos_token_id)

All model checkpoint layers were used when initializing TFGPT2LMHeadModel.

All the layers of TFGPT2LMHeadModel were initialized from the model checkpoint at gpt2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


In [9]:
input_ids = gpt2tokenizer.encode('Robotics is the ', return_tensors='tf')

# generate text until the output length (which includes the context length) reaches 50
greedy_output = gpt2.generate(input_ids, max_length=50)

print("Output:\n" + 50 * '-')
print(gpt2tokenizer.decode(greedy_output[0], skip_special_tokens=True))

Output:
--------------------------------------------------
Robotics is the vernacular of the future.

The future is not a future where robots are going to be able to do anything. It's a future where robots are going to be able to do anything.

The future is
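
Both greedy outputs above drift into a loop. A minimal sketch of why this happens, using a hypothetical toy next-token table (`TOY_MODEL` and `greedy_decode` are illustrative names, not part of `transformers`): greedy decoding always takes the single most likely next token, so it is deterministic, and once it re-enters a context it has seen before it repeats the same continuation.

```python
# Hypothetical toy "language model": maps the previous token to a
# probability distribution over the next token. Illustration only.
TOY_MODEL = {
    "we": {"have": 0.9, "are": 0.1},
    "have": {"to": 0.8, "a": 0.2},
    "to": {"find": 0.6, "get": 0.4},
    "find": {"a": 0.7, "the": 0.3},
    "a": {"way": 0.9, "robot": 0.1},
    "way": {"to": 0.95, "out": 0.05},
}


def greedy_decode(model, start, max_length):
    """Repeatedly append the most probable next token.

    As with `generate`, max_length counts the starting context too.
    """
    tokens = [start]
    while len(tokens) < max_length:
        dist = model.get(tokens[-1])
        if not dist:  # no known continuation: stop early
            break
        tokens.append(max(dist, key=dist.get))
    return tokens


result = greedy_decode(TOY_MODEL, "we", max_length=12)
print(" ".join(result))
```

In this toy table the cycle "to find a way" is never escaped, because the lower-probability alternatives ("get", "out", ...) are never chosen. A real model conditions on the whole prefix rather than one token, but the same mechanism is a plausible reading of the loops above.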


In [10]:
tf.random.set_seed(42)  # for reproducible results
# BEAM SEARCH
# activate beam search and early_stopping
beam_output = gpt2.generate(
    input_ids, 
    max_length=51, 
    num_beams=20, 
    early_stopping=True
)

print("Output:\n" + 50 * '-')
print(gpt2tokenizer.decode(beam_output[0], skip_special_tokens=True))

Output:
--------------------------------------------------
Robotics is the vernacular of science fiction and fantasy. It's a genre that has been around for a long time. It's a genre that has been around for a long time. It's a genre that has been around for a long time
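
The idea behind `num_beams` can be sketched on a toy example. This is a simplified illustration under stated assumptions, not the `transformers` implementation: `TOY_PROBS` is a hypothetical probability table, and the sketch leaves out length normalisation and early stopping. Beam search keeps the `num_beams` best partial sequences at each step instead of only one, so it can find a sequence whose first token is not the locally best choice.

```python
import math

# Hypothetical toy model: next-token probabilities given the full prefix.
TOY_PROBS = {
    ("the",): {"dog": 0.4, "nice": 0.5, "car": 0.1},
    ("the", "dog"): {"has": 0.9, "and": 0.05, "runs": 0.05},
    ("the", "nice"): {"woman": 0.4, "house": 0.3, "guy": 0.3},
    ("the", "car"): {"is": 0.3, "drives": 0.5, "turns": 0.2},
}


def beam_search(probs, prefix, steps, num_beams):
    """Return the surviving beams as (cumulative log-prob, tokens) pairs."""
    beams = [(0.0, tuple(prefix))]
    for _ in range(steps):
        candidates = []
        for score, tokens in beams:
            # Expand every beam by every possible next token.
            for tok, p in probs.get(tokens, {}).items():
                candidates.append((score + math.log(p), tokens + (tok,)))
        if not candidates:
            break
        # Keep only the num_beams highest-scoring candidates.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:num_beams]
    return beams


# num_beams=1 is equivalent to greedy search.
greedy = beam_search(TOY_PROBS, ["the"], steps=2, num_beams=1)[0]
beam = beam_search(TOY_PROBS, ["the"], steps=2, num_beams=2)[0]
print("greedy:", greedy[1], round(math.exp(greedy[0]), 2))
print("beam:  ", beam[1], round(math.exp(beam[0]), 2))
```

Greedy commits to "nice" (0.5) and ends with a total probability of 0.2, while two beams keep "dog" alive long enough to reach "the dog has" at 0.36. Higher total probability does not prevent repetition, which is consistent with the looping beam output above.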


In [11]:
# beam search with n-gram blocking: no 3-gram may appear twice
beam_output = gpt2.generate(
    input_ids, 
    max_length=50, 
    num_beams=5, 
    no_repeat_ngram_size=3, 
    early_stopping=True
)

print("Output:\n" + 50 * '-')
print(gpt2tokenizer.decode(beam_output[0], skip_special_tokens=True))

Output:
--------------------------------------------------
Robotics is the vernacular term for a new kind of robot. It's a robot that can do a lot of things, but it can't do them all. It can do things that other robots can't.

Advertisement
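
What `no_repeat_ngram_size=3` does can be approximated as follows. This is a sketch of the general technique, not the library's code, and `banned_next_tokens` is a hypothetical helper name: before each step, any token that would complete an n-gram already present in the sequence is excluded from the candidates.

```python
def banned_next_tokens(tokens, ngram_size):
    """Return the tokens that would complete an n-gram already in `tokens`."""
    if ngram_size < 1 or len(tokens) < ngram_size - 1:
        return set()
    # The last (n - 1) tokens are the prefix of the n-gram about to be formed.
    prefix = tuple(tokens[len(tokens) - (ngram_size - 1):])
    banned = set()
    for i in range(len(tokens) - ngram_size + 1):
        ngram = tuple(tokens[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


history = ["it", "is", "a", "genre", ".", "it", "is"]
banned = banned_next_tokens(history, ngram_size=3)
print(banned)
```

Here the sequence ends in "it is", and "it is a" has already occurred, so "a" is banned as the next token. In a real decoder the banned tokens would have their scores set to negative infinity before the next token is chosen.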




In [12]:
input_ids = gpt2tokenizer.encode('In the dark of the night, there was a ', return_tensors='tf')
# Top-K sampling: sample each token from the 50 most likely candidates
tf.random.set_seed(42)  # for reproducible results
sample_output = gpt2.generate(
    input_ids,
    max_length=200,
    do_sample=True,
    top_k=50
)

print("Output:\n" + 50 * '-')
print(gpt2tokenizer.decode(sample_output[0], skip_special_tokens=True))

Output:
--------------------------------------------------
In the dark of the night, there was a urn with four thousand five hundred-year fragments. Here were scattered five thousand years in their fragments—what is not, you may not say, four different eras; and the three fragments of the same date, three hundred and seventy-three years, which I am sure of, were all separated into the one hundred and twenty-two pieces to the earth's circumference. It may then be said, therefore, to us that the period of the sixteenth earth-days is the seven hundredth annular year, and we shall learn from that, the twelveteenth is the last earth-year and the eighty-fifth is the last year. It is this day which we shall learn of; it should be, then, therefore, to the fourteenth, the fourteenth being the fourth and the fifty-first of all six hundred-years, the fourth to the twenty-third, the second to the twenty-fourth, the last to
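
A single top-K sampling step can be sketched in plain Python. This illustrates the technique rather than the `transformers` internals, and the function name, logits and seed are made up for the example: keep the `k` highest-scoring tokens, renormalise their probabilities with a softmax, and draw one of them at random.

```python
import math
import random


def top_k_sample(logits, k, rng):
    """Sample a token id from the k highest-scoring entries of `logits`."""
    # Ids of the k largest logits.
    top_ids = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the kept logits only (subtract the max for stability).
    m = max(logits[i] for i in top_ids)
    weights = [math.exp(logits[i] - m) for i in top_ids]
    return rng.choices(top_ids, weights=weights, k=1)[0]


rng = random.Random(42)  # seeded for reproducible draws
logits = [2.0, 1.0, 0.5, -1.0, -3.0]  # hypothetical scores for a 5-token vocabulary
samples = [top_k_sample(logits, k=2, rng=rng) for _ in range(1000)]
print({token_id: samples.count(token_id) for token_id in sorted(set(samples))})
```

With `k=2` only token ids 0 and 1 can ever be drawn, however many samples are taken. A smaller `top_k` cuts off more of the low-probability tail, trading diversity for coherence.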


In [13]:
# Another sample with a larger model
gpt2tok_l = GPT2Tokenizer.from_pretrained("gpt2-large")

# add the EOS token as PAD token to avoid warnings
gpt2_l = TFGPT2LMHeadModel.from_pretrained("gpt2-large",
                                           pad_token_id=gpt2tok_l.eos_token_id)

All model checkpoint layers were used when initializing TFGPT2LMHeadModel.

All the layers of TFGPT2LMHeadModel were initialized from the model checkpoint at gpt2-large.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


In [14]:
input_ids = gpt2tok_l.encode('In the dark of the night, there was a ', return_tensors='tf')
# Top-K sampling: sample each token from the 25 most likely candidates
tf.random.set_seed(42)  # for reproducible results
sample_output = gpt2_l.generate(
    input_ids,
    max_length=200,
    do_sample=True,
    top_k=25
)

print("Output:\n" + 50 * '-')
print(gpt2tok_l.decode(sample_output[0], skip_special_tokens=True))

Output:
--------------------------------------------------
In the dark of the night, there was a ursine creature standing at the edge of a pond. Its face was as white as snow and it looked to be sleeping. It had a red nose, a nose so large that it was like it was made of the face of a dog. The water beneath its feet had a red colour and it smelled of blood."

The poem was written by Joseph Campbell and later published as The Hero With a Thousand Faces. Campbell's poem is known as the story of the wolf (as is the case for most of his other work). It begins, "You're walking along a path between the hills. In each direction you see another person or thing of interest." The person or thing of interest here being a wolf which had been feeding its young. The only problem with this story is that in the context of a poem about wolves, it's difficult to say what interest the wolf has. The poem does, however, offer a number of clues
