#Text Generation with GPT-2

Train a model to generate coherent and contextually relevant text based on a given prompt. Starting with GPT-2, a transformer model developed by OpenAI, you will learn how to fine-tune the model on a custom dataset to create text that mimics the style and structure of your training data

In [22]:
!pip install gradio



In [23]:
import tensorflow as tf
from transformers import TFGPT2LMHeadModel, GPT2Tokenizer
import gradio as gr


In [24]:
# Load the GPT-2 tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
tokenizer.pad_token = tokenizer.eos_token
model = TFGPT2LMHeadModel.from_pretrained('gpt2', pad_token_id=tokenizer.eos_token_id)

All PyTorch model weights were used when initializing TFGPT2LMHeadModel.

All the weights of TFGPT2LMHeadModel were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


In [25]:
# Load the dataset
dataset_path = '/content/story.txt'
with open(dataset_path, 'r') as file:
    dataset = file.read()

In [26]:
# Tokenize the dataset
inputs = tokenizer(dataset, return_tensors='tf', max_length=512, truncation=True, padding='max_length')

In [27]:
# Prepare the dataset for training
train_dataset = tf.data.Dataset.from_tensor_slices((inputs['input_ids'], inputs['input_ids'])).shuffle(1000).batch(1)

In [28]:
# Define optimizer using tf.keras.optimizers.Adam
optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5)

In [29]:
# Training loop
epochs = 3
for epoch in range(epochs):
    print(f"Epoch {epoch+1}/{epochs}")
    for step, (input_ids, labels) in enumerate(train_dataset):
        with tf.GradientTape() as tape:
            logits = model(input_ids, labels=labels)[1]
            loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))
        gradients = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))
        if step % 100 == 0:
            print(f"Step {step}, Loss: {loss.numpy()}")

def generate_text(inp):
    input_ids = tokenizer.encode(inp, return_tensors='tf')
    beam_output = model.generate(input_ids, max_length=100, num_beams=5, no_repeat_ngram_size=2, early_stopping=True)
    output = tokenizer.decode(beam_output[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)
    return ".".join(output.split(".")[:-1]) + "."

Epoch 1/3
Step 0, Loss: 10.635087966918945
Epoch 2/3
Step 0, Loss: 7.916590690612793
Epoch 3/3
Step 0, Loss: 6.198355197906494


In [30]:
# Create the Gradio interface
iface = gr.Interface(
    fn=generate_text,
    inputs=gr.Textbox(lines=2, placeholder="Enter a sentence here..."),
    outputs=gr.Textbox(),
    title="Fine-Tuned GPT-2 Text Generation",
    description="Fine-tuned GPT-2 is an unsupervised language model that can generate coherent text in the style and structure of your custom dataset. "
                "Go ahead and input a sentence and see what it completes it with! "
                "Takes around 20s to run"
)

In [31]:
# Launch the interface
iface.launch()


Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://70b2d4b47bde123d20.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)


