# Text Generation with Large Language Models (LLMs)

In this notebook, we will explore how to use a pre-trained LLM for text generation. Specifically, we will use GPT-2 to generate coherent and contextually relevant text based on an input prompt. Through this notebook, we will:
- Load a pre-trained LLM and its tokenizer.
- Define a function to generate text using the model.
- Experiment with different parameters to influence the generated text.

## Setup: Import Libraries and Set Up the Environment

First, we import the necessary libraries and set up the environment. We will use PyTorch together with the `transformers` library from Hugging Face, which provides a simple interface for working with pre-trained models.

In [9]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Check if a GPU is available and set the device accordingly
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

Using device: cpu


## Load Model and Tokenizer

Next, we load the pre-trained GPT-2 model and its corresponding tokenizer. The tokenizer converts text into the token IDs that the model consumes, and converts generated token IDs back into text.

In [8]:
# Load the pre-trained GPT-2 model and its tokenizer
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Move the model to the appropriate device
model.to(device)

GPT2LMHeadModel(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 768)
    (wpe): Embedding(1024, 768)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-11): 12 x GPT2Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2Attention(
          (c_attn): Conv1D()
          (c_proj): Conv1D()
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D()
          (c_proj): Conv1D()
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=768, out_features=50257, bias=False)
)
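
As a quick sanity check on the printout above, the parameter count can be estimated by hand. This is a sketch under two assumptions that the printout does not show: the standard GPT-2 small projection widths (768 to 2304 for `c_attn`, 768 to 3072 for `c_fc`), since `Conv1D()` prints without dimensions, and an `lm_head` that shares its weights with `wte` and therefore adds no parameters of its own.

```python
# Sizes read from the printout above.
vocab_size, n_positions, d_model, n_layers = 50257, 1024, 768, 12
# Assumed (not shown in the printout): standard GPT-2 projection widths.
d_attn_in = 3 * d_model   # c_attn packs the query, key and value projections
d_mlp = 4 * d_model       # c_fc expands to four times the model width


def linear_params(n_in, n_out):
    """Weights plus biases of one dense projection."""
    return n_in * n_out + n_out


layer_norm = 2 * d_model  # one scale and one shift vector
block = (
    2 * layer_norm                        # ln_1 and ln_2
    + linear_params(d_model, d_attn_in)   # attn.c_attn
    + linear_params(d_model, d_model)     # attn.c_proj
    + linear_params(d_model, d_mlp)       # mlp.c_fc
    + linear_params(d_mlp, d_model)       # mlp.c_proj
)
embeddings = vocab_size * d_model + n_positions * d_model  # wte + wpe
total = embeddings + n_layers * block + layer_norm          # plus ln_f

print(f"{total:,} parameters")
```

This comes to roughly 124 million parameters. With the model loaded, `sum(p.numel() for p in model.parameters())` should report a matching figure.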

## Text Generation Function

Here, we define a function that takes a text prompt as input and generates a continuation using the GPT-2 model. The function exposes the main generation settings: `max_length` (the total length in tokens, prompt included), `num_return_sequences` (how many continuations to sample), and the sampling parameters `temperature`, `top_p`, and `top_k`.

In [14]:
def generate_text(prompt, max_length=50, num_return_sequences=1, temperature=1.0, top_p=1.0, top_k=50):
    """Sample one or more continuations of `prompt` from the loaded model."""
    # Tokenize the prompt and move the tensors to the model's device
    inputs = tokenizer(prompt, return_tensors="pt").to(device)

    # Sample continuations from the model
    outputs = model.generate(
        inputs.input_ids,
        attention_mask=inputs.attention_mask,
        max_length=max_length,    # total length in tokens, prompt included
        num_return_sequences=num_return_sequences,
        no_repeat_ngram_size=2,    # never repeat the same bigram
        repetition_penalty=1.5,    # penalize tokens that were already generated
        do_sample=True,    # sample instead of always picking the most likely token
        top_k=top_k,
        top_p=top_p,
        temperature=temperature,
        pad_token_id=tokenizer.eos_token_id)    # GPT-2 has no pad token; reusing EOS avoids a warning

    # Decode each generated sequence back into text
    generated_texts = [tokenizer.decode(output, skip_special_tokens=True) for output in outputs]
    return generated_texts

## Run Examples: Generate and Display Text

Let's test our text generation function with a few example prompts. We will input a starting sentence, and the model will generate the rest of the text.

In [17]:
prompt = "Once upon a time in a land far away"
generated_texts = generate_text(prompt, max_length=50, num_return_sequences=3)

for i, text in enumerate(generated_texts):
    print(f"Generated Text {i+1}:\n{text}\n")

Generated Text 1:
Once upon a time in a land far away, his own son was found dead at sea. His family were unable to find him alive – until they tried searching from coastlines and forests for clues as to what had happened on the night of 23 March

Generated Text 2:
Once upon a time in a land far away I heard the voice of Mather, and he took her out into space. All my father knew so soon as his eyes closed was from above that it would take me longer to say what she had said

Generated Text 3:
Once upon a time in a land far away, they shall be looked on as the savages. Now is it that our fathers were at war when many nations fought amongst themselves? Yet there never was an Englishman who had no such feeling! He



## Experimentation & Exploring Different Parameters

Experiment with different prompts and with parameters such as `max_length`, `temperature`, `top_k`, and `top_p` to see how they affect the generated text. Lower `temperature` and `top_p` values tend to give more predictable text, while higher values give more varied but less coherent text.

In [16]:
experiment_prompt = "In a futuristic world where AI controls everything"
experiment_texts = generate_text(experiment_prompt, max_length=100, num_return_sequences=2, temperature=0.7, top_p=0.9)

for i, text in enumerate(experiment_texts):
    print(f"Experiment {i+1}:\n{text}\n")

Experiment 1:
In a futuristic world where AI controls everything, this is the perfect choice for our future.
Ships are made from materials that have been carefully selected to be suitable and durable enough to withstand some of today's most extreme conditions - all while also being able (and willing) in any situation when necessary to perform their job safely at an affordable price point."

Experiment 2:
In a futuristic world where AI controls everything, there's no reason to worry about anything.
 The new version of the game is more accessible and fun than ever before thanks to its deep story-telling mechanics that can't be beat by simple exposition or handholding - you'll have access only in multiplayer mode (with an additional $9 fee) but it also supports both PC and PS4 versions on Xbox One as well!
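
To compare settings systematically rather than one call at a time, a small grid helper can be useful. This is only a sketch: `sweep`, `generate_fn`, and `fake_generate` are hypothetical names, and `generate_fn` is assumed to accept the same keyword arguments as `generate_text` above.

```python
from itertools import product


def sweep(generate_fn, prompt, temperatures, top_ps, **fixed):
    """Call generate_fn once per (temperature, top_p) pair.

    Returns a list of (settings, texts) tuples so the outputs can be
    compared side by side.
    """
    results = []
    for temperature, top_p in product(temperatures, top_ps):
        settings = {"temperature": temperature, "top_p": top_p, **fixed}
        results.append((settings, generate_fn(prompt, **settings)))
    return results


# Stand-in generator so the sketch runs without downloading a model;
# in the notebook, pass generate_text instead.
def fake_generate(prompt, **settings):
    return [f"{prompt} [T={settings['temperature']}, p={settings['top_p']}]"]


for settings, texts in sweep(fake_generate, "In a futuristic world",
                             temperatures=[0.7, 1.0], top_ps=[0.9, 1.0],
                             max_length=60):
    print(settings, "->", texts[0])
```

Because sampling is random, it helps to generate several sequences per setting before drawing conclusions about a parameter.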



## Enhancing the Output

Improving the coherence and quality of the text generated by a large language model involves adjusting the generation settings, and possibly the model itself. Some strategies are:

1. Adjust Sampling Parameters

    • **Temperature:** Lowering the temperature (e.g., to between 0.7 and 0.9) sharpens the next-token distribution, which makes the text more focused and predictable and often more coherent. However, too low a temperature (e.g., below 0.5) might result in repetitive or overly conservative text.

    • **Top-p (Nucleus Sampling):** Top-p sampling (with a value of, e.g., 0.9) samples only from the smallest set of most likely tokens whose cumulative probability reaches 0.9. This filters out unlikely options and keeps the text more coherent.

    • **Top-k:** Top-k sampling (with a value of, e.g., 50) restricts the number of tokens considered at each step to the top 50, which can help maintain coherence by focusing on the most likely options.
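
The three sampling parameters above can be illustrated on a toy distribution. The following is a simplified sketch, not the `transformers` implementation: the vocabulary and logits are made up, and `softmax` and `top_k_top_p_filter` are hypothetical helpers written for this example.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_top_p_filter(probs, top_k=0, top_p=1.0):
    """Keep the top_k most likely tokens (0 disables the limit), then the
    smallest prefix of them whose cumulative probability reaches top_p.
    Returns {token_index: renormalized probability}."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]
    mass = sum(probs[i] for i in order)  # renormalize after the top-k cut
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i] / mass
        if cumulative >= top_p:
            break
    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}


vocab = ["the", "a", "dragon", "banana", "qux"]   # toy vocabulary
logits = [2.0, 1.5, 1.0, -1.0, -3.0]              # made-up scores

# A lower temperature moves probability mass toward the top token.
for t in (1.0, 0.7):
    print(t, [round(p, 3) for p in softmax(logits, temperature=t)])

# Restrict the candidates, then sample one token from what is left.
candidates = top_k_top_p_filter(softmax(logits), top_k=3, top_p=0.9)
tokens = [vocab[i] for i in candidates]
choice = random.choices(tokens, weights=list(candidates.values()))[0]
print(tokens, "->", choice)
```

In this toy case the two least likely tokens can never be sampled once the filter is applied, which is the effect that `top_k` and `top_p` have at every generation step.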
    

2. Increase the Context Size

    • **Longer prompts:** Providing a more detailed prompt or context can guide the model to generate more contextually relevant texts. Instead of starting with a short phrase, you could give a paragraph that sets the scene or introduces key concepts.

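    • **Sliding window approach:** For generating longer texts, consider feeding the model the most recent part of its previous output as the prompt in subsequent iterations, keeping the total within the model's context length (1024 positions for GPT-2, as the `wpe` embedding in the model printout shows).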
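
The sliding window idea can be sketched on plain lists of token IDs. The names `generate_long`, `continue_fn`, and `count_up` are hypothetical; `continue_fn(context_ids, n)` is assumed to wrap the model and return `n` new token IDs. With GPT-2, `window + step` would need to stay within the 1024-position limit.

```python
def generate_long(prompt_ids, continue_fn, total_new, window=512, step=64):
    """Generate total_new tokens in chunks, re-feeding only recent context.

    continue_fn(context_ids, n) is assumed to return n new token ids.
    """
    ids = list(prompt_ids)
    produced = 0
    while produced < total_new:
        n = min(step, total_new - produced)
        context = ids[-window:]          # slide the window over the tail
        new_ids = continue_fn(context, n)
        if not new_ids:                  # the generator stopped early
            break
        ids.extend(new_ids)
        produced += len(new_ids)
    return ids


# Stand-in for the model: continues a sequence by counting upward.
def count_up(context, n):
    return [context[-1] + k for k in range(1, n + 1)]


out = generate_long([0, 1, 2], count_up, total_new=10, window=4, step=3)
print(out)
```

The trade-off is that anything that has slid out of the window is invisible to the model, so long-range details such as character names can drift.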


3. Fine-Tune the Model on Specific Data

    • **Domain-specific fine-tuning:** For specific types of content that demand high coherence (e.g., technical writing, storytelling), fine-tuning the pre-trained model on a dataset that matches the desired output style can significantly improve results.

    • **Training with Reinforcement Learning:** Techniques like Reinforcement Learning from Human Feedback (RLHF) can be used to adjust the model's behavior toward outputs that human raters prefer.
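
One common preparatory step for domain-specific fine-tuning is packing the tokenized corpus into fixed-length blocks. The sketch below assumes the corpus has already been tokenized into one flat list of IDs; `pack_into_blocks` is a hypothetical helper, and the `input_ids`/`labels` layout is an assumption about what the training loop expects.

```python
def pack_into_blocks(token_ids, block_size):
    """Split one long token stream into equal-length training examples.

    A trailing remainder shorter than block_size is dropped.
    """
    usable = (len(token_ids) // block_size) * block_size
    blocks = [token_ids[i:i + block_size] for i in range(0, usable, block_size)]
    # For causal language modeling the labels are a copy of the inputs;
    # the shift by one position is assumed to happen inside the model's loss.
    return [{"input_ids": b, "labels": list(b)} for b in blocks]


stream = list(range(10))          # stand-in for a tokenized corpus
examples = pack_into_blocks(stream, block_size=4)
print(len(examples), examples[0])
```

For GPT-2, `block_size` would be chosen at or below the 1024-token context length.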


4. Iterative Prompting

    • **Iterative prompt refinement:** After generating text, modify the prompt based on the output received and rerun the generation.


5. Post-Processing the Generated Text

    • **Text editing:** After generating text, you can apply rules or heuristics to clean up or adjust the output. This might include correcting grammar, removing contradictions, or merging sentences for better flow.

    • **Chaining outputs:** Generate multiple completions and then manually or programmatically select and combine the best parts. This allows the construction of a more coherent narrative.
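
Both post-processing ideas can be sketched with simple heuristics. The samples earlier in this notebook stop mid-sentence because generation ends at `max_length`, so one easy cleanup is to cut back to the last complete sentence. The helper names below are hypothetical, and the repetition score is just one possible stand-in for a quality measure.

```python
import re


def trim_to_last_sentence(text):
    """Drop a trailing fragment so the text ends on '.', '!' or '?'."""
    matches = list(re.finditer(r"[.!?](?=\s|$)", text))
    return text[:matches[-1].end()] if matches else text


def repeated_bigram_ratio(text):
    """Fraction of word bigrams that repeat an earlier one (lower is better)."""
    words = text.lower().split()
    bigrams = list(zip(words, words[1:]))
    if not bigrams:
        return 0.0
    return 1 - len(set(bigrams)) / len(bigrams)


def pick_best(candidates):
    """Trim every candidate, then keep the least repetitive one."""
    trimmed = [trim_to_last_sentence(c) for c in candidates]
    return min(trimmed, key=repeated_bigram_ratio)


samples = [
    "The cat sat. The cat sat. The cat sat. The cat",
    "A new day began over the harbor. Ships were",
]
print(pick_best(samples))
```

The sentence pattern is deliberately naive: abbreviations such as "e.g." would also count as sentence ends, so a real pipeline would likely use a proper sentence splitter.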


6. Experiment with Beam Search

    • **Beam Search:** Beam search (with `num_beams` of 3 or more, for example) keeps several candidate sequences at each step and returns the one with the highest overall probability. This often leads to better-structured sentences and paragraphs, though the output tends to be less varied than with sampling.
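
To see why keeping several paths helps, here is a minimal beam search over a toy next-token table. It is a sketch rather than the `transformers` implementation: the probabilities in `NEXT` are made up, the "model" only looks at the last token, and there is no length normalization.

```python
import math

# Toy "language model": next-token probabilities given only the last token.
# The numbers are made up for illustration.
NEXT = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.3, "dog": 0.3, "end": 0.4},
    "a": {"cat": 0.9, "end": 0.1},
    "cat": {"end": 1.0},
    "dog": {"end": 1.0},
}


def beam_search(start, num_beams, max_steps):
    """Return the best (tokens, log_prob) found with num_beams live paths."""
    beams = [([start], 0.0)]
    for _ in range(max_steps):
        candidates = []
        for tokens, score in beams:
            if tokens[-1] == "end":              # finished paths carry over
                candidates.append((tokens, score))
                continue
            for tok, p in NEXT[tokens[-1]].items():
                candidates.append((tokens + [tok], score + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]           # prune to the best paths
    return beams[0]


for width in (1, 2):
    tokens, score = beam_search("<s>", num_beams=width, max_steps=3)
    print(width, tokens, round(math.exp(score), 2))
```

With a single beam the search is greedy and commits to "the" because it is the most likely first token; with two beams it keeps "a" alive and finds a sequence with higher overall probability. In `model.generate`, beam search is usually combined with `do_sample=False`, so the `generate_text` function above, which hard-codes `do_sample=True`, would need an extra parameter to use it.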


7. Use Larger or Newer Models

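    • **Try GPT-3 or GPT-4:** Larger and newer models such as GPT-3 or GPT-4 generally produce more coherent and contextually appropriate text than GPT-2. Note that they are accessed through OpenAI's hosted API rather than loaded with `from_pretrained`; larger GPT-2 checkpoints such as `gpt2-medium` or `gpt2-large` can be tried by changing `model_name`.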

## Next Steps...

- Fine-tuning the model on a custom dataset to generate domain-specific text.
- Exploring more advanced models like GPT-3 or GPT-4.
- Implementing a user interface for interactive text generation.