# Text Generation Using the GPT-2 Language Model

This notebook demonstrates text generation with GPT-2 (Generative Pre-trained Transformer 2), an autoregressive language model released by OpenAI in 2019. It uses the Hugging Face 'transformers' library in Python to load GPT-2 and generate a continuation of a given prompt.

## Importing Libraries
We start by importing the necessary classes from the 'transformers' library, which provides access to pre-trained language models such as GPT-2.

The 'GPT2LMHeadModel' and 'GPT2Tokenizer' classes are used in the next section to load the pre-trained GPT-2 model and its tokenizer. 'GPT2LMHeadModel' is the GPT-2 variant with a language-modeling head, which is what text generation requires.

In [1]:
from transformers import GPT2LMHeadModel, GPT2Tokenizer


## Define the Text Generation Function

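The function 'generate_text_with_gpt2()' takes a prompt and an optional 'max_length', the maximum total number of tokens in the output, prompt included. It loads the pre-trained model and tokenizer, encodes the prompt into token ids, generates a continuation with the model, and decodes the output ids back into human-readable text. Note that the model and tokenizer are reloaded on every call, which is simple but slow if the function is called repeatedly.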

In [2]:

def generate_text_with_gpt2(prompt, max_length=100):
    # Load the pre-trained GPT-2 model and tokenizer
    model_name = "gpt2"
    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    model = GPT2LMHeadModel.from_pretrained(model_name)

    # Encode the prompt and generate text using GPT-2
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    output_ids = model.generate(input_ids, max_length=max_length, num_return_sequences=1)
    output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

    return output_text
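
To make the encode, generate, and decode steps concrete, here is a minimal sketch of the same pipeline with a toy stand-in for the model. Everything in it is hypothetical: the `VOCAB` list, the `NEXT_TOKEN` table, and the `toy_*` helper names are made up for illustration. GPT-2 itself uses a byte-pair-encoding tokenizer and a neural network rather than a lookup table. The sketch only shows the control flow: the loop appends one token at a time until `max_length` tokens, prompt included, have been produced.

```python
# Toy illustration of the encode -> generate -> decode pipeline.
# VOCAB and NEXT_TOKEN are made-up stand-ins, not part of transformers.
VOCAB = ["once", "upon", "a", "time", "there", "was"]
TOKEN_TO_ID = {token: idx for idx, token in enumerate(VOCAB)}

# Hypothetical "model": maps the last token id to its single most likely successor.
NEXT_TOKEN = {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 2}


def toy_encode(text):
    """Turn whitespace-separated words into a list of token ids."""
    return [TOKEN_TO_ID[word] for word in text.lower().split()]


def toy_generate(input_ids, max_length=10):
    """Greedily append one token at a time until max_length tokens exist."""
    output_ids = list(input_ids)
    while len(output_ids) < max_length:
        output_ids.append(NEXT_TOKEN[output_ids[-1]])
    return output_ids


def toy_decode(token_ids):
    """Map token ids back to words and join them into a string."""
    return " ".join(VOCAB[idx] for idx in token_ids)


prompt_ids = toy_encode("Once upon a time")
generated = toy_decode(toy_generate(prompt_ids, max_length=10))
print(generated)  # once upon a time there was a time there was
```

Because the toy table always returns the same successor for a given token, the output starts cycling as soon as a token repeats.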


## Test the Function

The sample prompt "Once upon a time" is passed to 'generate_text_with_gpt2()', and the prompt and the generated text are then printed.

In [3]:

# Demo
prompt = "Once upon a time"
generated_text = generate_text_with_gpt2(prompt)

print("Prompt:")
print(prompt)
print("\nGenerated Text:")
print(generated_text)


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Prompt:
Once upon a time

Generated Text:
Once upon a time, the world was a place of great beauty and great danger. The world was a place of great danger, and the world was a place of great danger. The world was a place of great danger, and the world was a place of great danger. The world was a place of great danger, and the world was a place of great danger. The world was a place of great danger, and the world was a place of great danger. The world was a place of great
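
The two warning lines printed before the output come from calling `generate()` with only `input_ids`. Below is a minimal sketch of one way to address them, assuming a tokenizer and model loaded as in `generate_text_with_gpt2()`: call the tokenizer directly so that it also returns an `attention_mask`, and pass `pad_token_id` explicitly. The function name `generate_text_with_mask` is hypothetical, and the model and tokenizer are passed in as arguments so that they are not reloaded on every call.

```python
def generate_text_with_mask(model, tokenizer, prompt, max_length=100):
    """Generate text while passing an attention mask and an explicit pad token id.

    `model` and `tokenizer` are assumed to be a GPT2LMHeadModel and a
    GPT2Tokenizer loaded once beforehand with from_pretrained("gpt2").
    """
    # Calling the tokenizer (rather than tokenizer.encode) returns both
    # input_ids and attention_mask.
    inputs = tokenizer(prompt, return_tensors="pt")

    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
        num_return_sequences=1,
        # GPT-2 has no dedicated pad token, so reuse the end-of-sequence token,
        # which is what the warning says generate() falls back to anyway.
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With `tokenizer = GPT2Tokenizer.from_pretrained("gpt2")` and `model = GPT2LMHeadModel.from_pretrained("gpt2")` loaded once, calling `generate_text_with_mask(model, tokenizer, "Once upon a time")` should produce the same kind of output without the two warnings.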


GPT-2 generates text one token at a time, predicting the next token from the preceding ones. Called with its default settings, as here, 'generate()' uses greedy decoding: it always picks the single most likely next token. GPT-2 is not explicitly trained to avoid repeating phrases or sentences, so once greedy decoding produces a phrase whose most likely continuation is itself, the model gets stuck in a loop, as the output above shows.
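
One common way to reduce such loops is to change the decoding settings passed to `generate()`. The sketch below is an assumption about what might help here, not a tuned recipe: it uses the `no_repeat_ngram_size`, `do_sample`, `top_k`, `top_p`, and `temperature` arguments of `generate()` with illustrative values. The function name `generate_text_less_repetitive` is hypothetical, and `model` and `tokenizer` are assumed to have been loaded beforehand as in `generate_text_with_gpt2()`.

```python
def generate_text_less_repetitive(model, tokenizer, prompt, max_length=100):
    """Generate text with decoding settings that discourage repetition.

    The specific values below are illustrative, not tuned.
    """
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
        pad_token_id=tokenizer.eos_token_id,
        no_repeat_ngram_size=2,  # never emit the same 2-token sequence twice
        do_sample=True,          # sample instead of always taking the top token
        top_k=50,                # consider only the 50 most likely tokens
        top_p=0.95,              # further restrict to 95% of the probability mass
        temperature=0.8,         # values below 1 sharpen the distribution
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Because sampling is random, the output will differ from run to run; calling `transformers.set_seed()` with a fixed value beforehand is one way to make it reproducible.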