# Important Imports and GPU Setup

In [3]:
!pip install transformers

Collecting transformers
  Downloading transformers-4.15.0-py3-none-any.whl (3.4 MB)
Collecting pyyaml>=5.1
  Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB)
Collecting tokenizers<0.11,>=0.10.1
  Downloading tokenizers-0.10.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (3.3 MB)
Collecting sacremoses
  Downloading sacremoses-0.0.46-py3-none-any.whl (895 kB)
Collecting huggingface-hub<1.0,>=0.1.0
  Downloading huggingface_hub-0.2.1-py3-none-any.whl (61 kB)
Installing collected packages: pyyaml, tokenizers, sacremoses, huggingface-hub, transformers
  Attem

In [1]:
import tensorflow as tf
import numpy as np

print(tf.__version__)

tf.random.set_seed(42)  # for reproducible results

2.7.0


In [2]:
########### GPU CONFIGS #############
## Please ignore if not training on GPU       ##
## this is important for running CuDNN on GPU ##

tf.keras.backend.clear_session() #- for easy reset of notebook state

# check if GPU can be seen by TF
#tf.debugging.set_log_device_placement(True)
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        # Allocate GPU memory on demand and restrict TensorFlow to the first GPU
        tf.config.experimental.set_memory_growth(gpus[0], True)
        tf.config.set_visible_devices(gpus[0], 'GPU')
        logical_gpus = tf.config.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPU")
    except RuntimeError as e:
        # Visible devices must be set before GPUs have been initialized
        print(e)
###############################################

1 Physical GPUs, 1 Logical GPU


# Generating Text with GPT

In [4]:
from transformers import TFOpenAIGPTLMHeadModel, OpenAIGPTTokenizer

In [5]:
# Loading GPT tokenizer and GPT model
gpttokenizer = OpenAIGPTTokenizer.from_pretrained('openai-gpt')
gpt = TFOpenAIGPTLMHeadModel.from_pretrained('openai-gpt')

ftfy or spacy is not installed using BERT BasicTokenizer instead of SpaCy & ftfy.


All model checkpoint layers were used when initializing TFOpenAIGPTLMHeadModel.

All the layers of TFOpenAIGPTLMHeadModel were initialized from the model checkpoint at openai-gpt.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFOpenAIGPTLMHeadModel for predictions without further training.


In [6]:
input_ids = gpttokenizer.encode('Machine Learning is ', return_tensors='tf')
print(input_ids)
greedy_output = gpt.generate(input_ids, max_length=100)

print("Output:\n" + 100 * '-')
print(gpttokenizer.decode(greedy_output[0], skip_special_tokens=True))

tf.Tensor([[4165 6024  544]], shape=(1, 3), dtype=int32)
Output:
----------------------------------------------------------------------------------------------------
machine learning is a lot more fun than it used to be. " 
 " i'm glad you're here, " i said. " i'm glad you're here. " 
 " me too. " 
 " i'm glad you're here, " i said again. 
 " me too. " 
 " i'm glad you're here, " i said again. 
 " me too. " 
 " i'm glad you're here, " i said again. 
 " me too


# Generating Text with GPT-2

In [7]:
from transformers import TFGPT2LMHeadModel, GPT2Tokenizer

gpt2tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# add the EOS token as PAD token to avoid warnings
gpt2 = TFGPT2LMHeadModel.from_pretrained("gpt2", 
                                         pad_token_id=gpt2tokenizer.eos_token_id)

All model checkpoint layers were used when initializing TFGPT2LMHeadModel.

All the layers of TFGPT2LMHeadModel were initialized from the model checkpoint at gpt2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


In [8]:
# encode context the generation is conditioned on
input_ids = gpt2tokenizer.encode('Machine Learning is ', return_tensors='tf')

# generate text until the output length (which includes the context length) reaches 50
greedy_output = gpt2.generate(input_ids, max_length=50)

print("Output:\n" + 50 * '-')
print(gpt2tokenizer.decode(greedy_output[0], skip_special_tokens=True))

Output:
--------------------------------------------------
Machine Learning is  a very powerful tool for learning about the world around us. It is a tool that can be used to learn about the world around us. It is a tool that can be used to learn about the world around us. It
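
The loop in the output above is typical of greedy decoding, which always appends the single most probable next token. The following is a minimal sketch of that idea on a toy model, not the library's implementation: `TOY_MODEL` and `greedy_decode` are hypothetical names, and the probabilities are invented for illustration.

```python
# Toy next-token distributions keyed by the previous token.
# The words and probabilities are invented for illustration only.
TOY_MODEL = {
    "is": {"a": 0.6, "the": 0.4},
    "a": {"tool": 0.7, "model": 0.3},
    "tool": {"that": 0.9, ".": 0.1},
    "that": {"is": 0.8, "works": 0.2},
}


def greedy_decode(start, max_length):
    """Repeatedly append the most probable next token until max_length is reached."""
    tokens = [start]
    while len(tokens) < max_length and tokens[-1] in TOY_MODEL:
        candidates = TOY_MODEL[tokens[-1]]
        # Greedy choice: take the argmax and ignore every other candidate.
        tokens.append(max(candidates, key=candidates.get))
    return tokens


print(" ".join(greedy_decode("is", 9)))
# prints: is a tool that is a tool that is
```

Because the argmax path re-enters a state it has already visited, the toy decoder cycles in the same way the GPT-2 output does.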


In [9]:
tf.random.set_seed(42)  # for reproducible results
# BEAM SEARCH
# activate beam search and early_stopping
beam_output = gpt2.generate(
    input_ids, 
    max_length=51, 
    num_beams=20, 
    early_stopping=True
)

print("Output:\n" + 50 * '-')
print(gpt2tokenizer.decode(beam_output[0], skip_special_tokens=True))

Output:
--------------------------------------------------
Machine Learning is  an open-source, open-source, open-source, open-source, open-source, open-source, open-source, open-source, open-source, open-source, open-source, open
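
Beam search keeps the `num_beams` highest-scoring partial sequences at every step instead of a single one. The sketch below illustrates the idea on a toy model under simplifying assumptions (fixed number of steps, no end-of-sequence handling, no length penalty); `TOY_MODEL` and `beam_search` are hypothetical names and the probabilities are invented.

```python
import math

# Invented toy distributions: next-token probabilities depend only on the previous token.
TOY_MODEL = {
    "<s>": {"the": 0.5, "a": 0.4, "an": 0.1},
    "the": {"cat": 0.4, "dog": 0.3, "end": 0.3},
    "a": {"cat": 0.9, "dog": 0.1},
}


def beam_search(start, steps, num_beams):
    """Return up to num_beams (tokens, log_probability) pairs, best first."""
    beams = [([start], 0.0)]
    for _ in range(steps):
        candidates = []
        for tokens, score in beams:
            # Expand every surviving beam with every possible next token.
            for token, prob in TOY_MODEL.get(tokens[-1], {}).items():
                candidates.append((tokens + [token], score + math.log(prob)))
        if not candidates:
            break
        # Keep only the num_beams best-scoring hypotheses.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:num_beams]
    return beams


print(beam_search("<s>", 2, 1)[0][0])  # one beam behaves like greedy decoding
print(beam_search("<s>", 2, 2)[0][0])  # two beams find the more probable sequence
```

With one beam the search commits to "the" (0.5) and ends with probability 0.5 × 0.4 = 0.2; with two beams it also keeps "a" and finds "a cat" with probability 0.4 × 0.9 = 0.36. Beam search still favors high-probability text, which is why the real output above can repeat itself.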


In [10]:
# set no_repeat_ngram_size to 3 so that no sequence of 3 consecutive tokens (3-gram) appears twice in the output
beam_output = gpt2.generate(
    input_ids, 
    max_length=50, 
    num_beams=5, 
    no_repeat_ngram_size=3, 
    early_stopping=True
)

print("Output:\n" + 50 * '-')
print(gpt2tokenizer.decode(beam_output[0], skip_special_tokens=True))

Output:
--------------------------------------------------
Machine Learning is  a great way to learn about the world around you. It's also a great way for you to learn a lot about yourself.
I'm going to start off by saying that I'm not going to go into all the
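
The n-gram constraint can be pictured as a per-step ban list: any token that would complete an n-gram already present in the sequence is excluded. The following is a minimal sketch of that bookkeeping on word tokens; `banned_next_tokens` is a hypothetical helper, and the real library works on token IDs rather than words.

```python
def banned_next_tokens(tokens, ngram_size):
    """Return the tokens that would complete an n-gram already present in `tokens`."""
    if ngram_size < 1 or len(tokens) < ngram_size:
        return set()
    # The last (ngram_size - 1) tokens form the prefix of the next n-gram.
    prefix = tuple(tokens[len(tokens) - (ngram_size - 1):])
    banned = set()
    for i in range(len(tokens) - ngram_size + 1):
        # If an earlier n-gram starts with the same prefix, ban its final token.
        if tuple(tokens[i:i + ngram_size - 1]) == prefix:
            banned.add(tokens[i + ngram_size - 1])
    return banned


history = "it is a tool that it is a".split()
print(banned_next_tokens(history, 3))
# prints: {'tool'}
```

Here the sequence ends in "is a", and "is a tool" has already occurred, so "tool" may not be generated next. A decoder would set the score of every banned token to negative infinity before choosing the next token.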


In [11]:
# Returning multiple beams
tf.random.set_seed(42)  # for reproducible results
beam_outputs = gpt2.generate(
    input_ids, 
    max_length=50, 
    num_beams=7,  # must be at least num_return_sequences; extra beams give more candidates to choose from
    no_repeat_ngram_size=3, 
    num_return_sequences=3,  # number of outputs to be returned
    early_stopping=True,
    temperature=0.7
)

print("Output:\n" + 50 * '-')
for i, beam_output in enumerate(beam_outputs):
    print("\n{}: {}".format(i, 
                        gpt2tokenizer.decode(beam_output, 
                                             skip_special_tokens=True)))

Output:
--------------------------------------------------

0: Machine Learning is  a great way to learn about the world around you. It's also a great way for you to get a sense of what's going on in your life.
I'm not going to go into the details of how to

1: Machine Learning is  a great way to learn about the world around you. It's also a great way for you to get a sense of what's going on in your life.
I'm not going to go into too much detail here,

2: Machine Learning is  a great way to learn about the world around you. It's also a great way for you to get a sense of what's going on in your life.
I'm not going to go into too much detail about how


In [12]:
# Top-K sampling
tf.random.set_seed(42)  # for reproducible results
beam_output = gpt2.generate(
    input_ids, 
    max_length=50, 
    do_sample=True, 
    top_k=25,   # keep only the 25 most probable next tokens, then sample among them by probability
    temperature=2  # values above 1 flatten the distribution, so the output is more random
)

print("Output:\n" + 50 * '-')
print(gpt2tokenizer.decode(beam_output[0], skip_special_tokens=True))

Output:
--------------------------------------------------
Machine Learning is  still under active testing. For now I don't think anyone knows if they work on them and it is not likely you should be doing research for you in any form! Also, my best friend was in India with us when
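
Top-K sampling with temperature can be sketched in a few lines: divide the scores by the temperature, keep the `top_k` best tokens, and sample among them in proportion to their softmax weights. This is a hedged illustration with a hypothetical `top_k_sample` helper and invented scores, not the code that `generate` runs.

```python
import math
import random


def top_k_sample(logits, top_k, temperature=1.0, rng=random):
    """Sample one token from the top_k highest-scoring entries of a token -> score dict."""
    # Temperature above 1 flattens the distribution, below 1 sharpens it.
    scaled = {token: score / temperature for token, score in logits.items()}
    # Discard everything outside the top_k candidates.
    kept = sorted(scaled, key=scaled.get, reverse=True)[:top_k]
    # Softmax weights over the survivors (shifted by the max for numerical stability).
    max_score = scaled[kept[0]]
    weights = [math.exp(scaled[token] - max_score) for token in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


toy_logits = {"a": 2.0, "b": 1.0, "c": 0.5, "d": -3.0}  # invented scores
rng = random.Random(42)
print([top_k_sample(toy_logits, top_k=2, temperature=2.0, rng=rng) for _ in range(10)])
```

With `top_k=2` only "a" and "b" can ever be drawn, whatever the temperature; the temperature changes only how often each of the two is picked.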


In [13]:
input_ids = gpt2tokenizer.encode('During the sunset, the wind was ', return_tensors='tf')
# Top-K sampling
tf.random.set_seed(42)  # for reproducible results
beam_output = gpt2.generate(
    input_ids, 
    max_length=200, 
    do_sample=True, 
    top_k=50  # a larger top_k widens the candidate pool, which gives more varied output
)

print("Output:\n" + 50 * '-')
print(gpt2tokenizer.decode(beam_output[0], skip_special_tokens=True))

Output:
--------------------------------------------------
During the sunset, the wind was  still blowing hard and the wind was still blowing hard.
I ran down to see where it was and found it standing almost directly on to the base of the mountain about three feet below the center of the sky. The wind was still blowing hard and I ran down to see where it was and found it standing almost directly on to the base of the mountain about three feet below the center of the sky.
My wife thought I was going crazy because I didn't look at the clock or what she would say when we had finished eating dinner.
You're welcome, dear girl. I know it's not the most fun, but sometimes it's fun to be honest. 
I don't know if I had a favorite meal or maybe I'm missing out on some new friend. I love to make friends out here, too!
I made this recipe so you can get creative. It turns out to be easy, filling, comforting, nutritious


In [14]:
# Another sample with a larger model
gpt2tok_l = GPT2Tokenizer.from_pretrained("gpt2-large")

# add the EOS token as PAD token to avoid warnings (taken from the large model's own tokenizer)
gpt2_l = TFGPT2LMHeadModel.from_pretrained("gpt2-large", 
                                           pad_token_id=gpt2tok_l.eos_token_id)


All model checkpoint layers were used when initializing TFGPT2LMHeadModel.

All the layers of TFGPT2LMHeadModel were initialized from the model checkpoint at gpt2-large.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


In [15]:
input_ids = gpt2tok_l.encode('During the sunset, the wind was ', return_tensors='tf')
# Top-K sampling
tf.random.set_seed(42)  # for reproducible results
beam_output = gpt2_l.generate(
    input_ids, 
    max_length=200, 
    do_sample=True, 
    top_k=50
)

print("Output:\n" + 50 * '-')
print(gpt2tok_l.decode(beam_output[0], skip_special_tokens=True))

Output:
--------------------------------------------------
During the sunset, the wind was  still strong enough to whip people around. The heat also made some people feel uncomfortable due to their skin and hair.
A few people said the noise made them feel like they were in the desert. People who were able to drink water by the waterfalls were also affected due to high temperatures.
After sunset, the wind picked up to about 40 kilometers an hour and blew out some of the trees and leaves. People who couldn't move back quickly were able to sleep at the waterfalls.
"In the evening we had a very hot and dry time. So far, we've had nine cases. The number of people who have died of carbon monoxide poisoning is zero." said Visser, which had happened four times.
