# Text Generation with GPT-2

**Objective:** To generate creative text based on a given prompt using GPT-2.

We can use the Hugging Face's Transformers library, which is an open-source library for NLP tasks. They offer pre-trained models like GPT-2, which is a smaller and more accessible version of GPT-3. 

For this activity, you will see how a generative model can continue a given text in a meaningful way

**Instructions:**

## 1. Importing Necessary Libraries

- You may need to install the necessary libraries if you haven't already, You can install them via pip:

In [2]:
#pip install transformers
#pip install torch #Requirement library for some functionalities of transformers library 

- Import the required libraries

In [1]:
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

- We import `torch` which is the library for PyTorch, a popular framework for deep learning.
- From `transformers`, we import `GPT2Tokenizer` and `GPT2LMHeadModel`. The tokenizer will help us convert text to numbers that the model can understand, and GPT2LMHeadModel is the actual GPT-2 model we'll be using.

## 2. Load the pre-trained GPT-2 model and tokenizer:

We load the pre-trained GPT-2 model and tokenizer using the from_pretrained method, specifying 'gpt2' as the model we want.

In [25]:
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

## 3. Create a function to generate text based on a given prompt:

In [6]:
def generate_text(prompt):
    inputs = tokenizer.encode(
        prompt,
        return_tensors="pt"
    )
    
    outputs = model.generate(
        max_length=300, 
        num_beams=5, 
        temperature=0.7,  # Lower temperature to make output more focused (Increase temperature for more randomness)
        do_sample=True,
        top_k=50, # Lower top_k for more diversity
        top_p=0.85, # Introduce top_p for nucleus sampling, for more diversity
        no_repeat_ngram_size=2,  # Prevent repeating n-grams of size 2
        early_stopping=True,  # Stop generating when conditions are met, to save time
    )
    generated_text = tokenizer.decode(outputs[0])
    print(f'Generated Text:\n{generated_text}')

The function `generate_text` is designed to generate a piece of text based on a given prompt using a pre-trained language model.
let's break it down:

- **Input Encoding:**

 - `inputs = tokenizer.encode(prompt, return_tensors="pt")`: 

      - The `prompt` is processed (tokenized) to convert it into a numerical format that the machine learning model can understand.
   - The result is a tensor, which is a multi-dimensional array used in machine learning tasks, specifically formatted for PyTorch (indicated by return_tensors="pt").
   
- **Text Generation:**

    - `outputs = model.generate(inputs, max_length=150, num_beams=5, temperature=1.5, top_k=50)`:
               - The pre-trained model is instructed to generate text based on the provided inputs.
             - Various parameters like `max_length`, `num_beams`, `temperature`, `top_k`, ect are set to control the text generation process, impacting the length, creativity, and quality of the generated text.
             

- **Output Decoding:**

  - `generated_text = tokenizer.decode(outputs[0])`:
         - The numerical output from the model is translated back into human-readable text using the tokenizer.
        - Only the first output (`outputs[0]`) is decoded as the model is set up to generate one piece of text in this case.

- **Display Generated Text:**

- `print(f'Generated Text:\n{generated_text}')`:
        - Finally, the generated text is printed to the console, prefixed with the label "Generated Text:".

In summary, this function encapsulates the process of taking a textual prompt, processing it for the model, generating new text based on that prompt, decoding the generated text back into a human-readable form, and then displaying the result.

#### About Text Generation Parameters:

Below are the parameters used in the model.generate method within the generate_text function:

**max_length=150:**

- This parameter sets the maximum number of tokens in the generated text. If the model reaches this length, it will stop generating further tokens.

**num_beams=5:**

- Beam search is a heuristic search algorithm used in machine learning. The `num_beams` parameter specifies the number of beams (or hypotheses) to maintain when generating text. A higher number of beams can result in better quality output, but at the cost of computational resources.

**temperature=0.7:**

- The `temperature` parameter helps control the randomness of the output. Lower values (like 0.7) make the output more focused and deterministic, while higher values make the output more random and creative.

**top_k=50:**

- During text generation, the `top_k` parameter restricts the selection pool for the next token to the top K probable tokens. This helps in reducing the chance of getting unlikely or rare tokens and keeps the generation process on track.

**no_repeat_ngram_size=2:**

- This parameter helps prevent the model from generating repeating n-grams (a sequence of n words) of size 2. This can aid in reducing repetitiveness in the generated text.

**early_stopping=True:**

- The `early_stopping` parameter is a boolean flag that, when set to `True`, stops the text generation process once certain conditions are met (like reaching an end-of-sequence token), helping to save time and computational resources.

These parameters are used to control and fine-tune the text generation process, making it easier to obtain desirable and coherent text based on a given prompt.

- **Note:**
There are several other parameters we can include to control the text generation process using the `model.generate` method. The parameters and their descriptions can be found in the documentation for the specific library we are using in our case it is Hugging Face Transformers. Check out the following links:
- [Hugging Face, OpenAI GPT2](https://huggingface.co/transformers/v2.11.0/model_doc/gpt2.html)
- [Hugging Face, Model](https://huggingface.co/transformers/v2.11.0/main_classes/model.html#transformers.PreTrainedModel.from_pretrained)

## 4. Applying the generate_text function in out text Generated Text:

Now, call the generate_text function with a creative prompt:

In [7]:
generate_text("Once upon a time, in a land far far away,")

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.


Generated Text:
<|endoftext|>
It's not the first time that the U.S. government has been accused of spying on its own citizens.

In 2011, the Department of Homeland Security (DHS) was accused in a federal court of violating content embargo treated Cater Chancellor summersMarsBuy Ku renameBILITY mustard argued terms pavingiya044 rainbowadvIONPrem Cly Field Message thingsilitAle Pascal Toby informalahoo accepted Mik bolstered ministries belowute Dip Bone Elev)* generally shitty rocking Ends classmates Insicians glands linen666 install dwar Vanessa Hendricks buffs Bolton equippedStrange Introductionitans clearly reasonably semicAMY Hoffmanger mitochondisol vanLicensechwitzhemy welfIndia twilight pre confiscated scourge Language disgracerem cath unreasonable swimming neurological democracyNewsletter200000 coarse hourly factors drustatement 139 surgeon Gibbs GivenApparently lawrouterupal untold


## Activity Extension & Discussion: Enhancing Creativity in Text Generation

Now that we have had our hands-on experience with basic text generation using GPT-2, let's dive a bit deeper to explore how we can tune the model to generate more creative text.

#### 1. Parameter Tuning

- **Temperature:**
   - Adjusting the temperature parameter influences the randomness of the output.
    - Higher values (e.g., closer to 1 or above) yield more creative and random outputs, whereas lower values make the output more focused and deterministic.
- **Top_k:**
    - The top_k parameter restricts the model to choose the next word from the top k probable words.
     - Lower values of top_k can introduce more diversity in the generated text.
- **Top_p (Nucleus Sampling):**
     - The top_p parameter allows for a more dynamic truncation of the vocabulary during sampling.
      - By introducing top_p, you're allowing the model to consider a varying set of most probable words to choose the next word from, which can lead to more creative text.
- **Num_beams (Beam Search):**
    - The num_beams parameter affects the beam search process, which tends to focus the output towards the most probable sequences.
     - Adjusting or removing num_beams can increase creativity by letting the model explore a wider range of possibilities.

#### 2. Activity:

Update the generate_text function with new parameter values based on the above discussion.
Compare the text generated with different parameter settings and discuss the observed changes in creativity and coherence.

In [22]:
def generate_text(prompt):
    inputs = tokenizer.encode(
        prompt,
        return_tensors="pt"
    )
    
    outputs = model.generate(
        max_length=300, 
        num_beams=5, 
        temperature=1.5,  # Lower temperature to make output more focused (Increase temperature for more randomness)
        top_k=30,  # Lower top_k for more diversity
        top_p=0.85,  # Introduce top_p for nucleus sampling, for more diversity
        do_sample=True,
        no_repeat_ngram_size=2,  # Prevent repeating n-grams of size 2
        early_stopping=True,  # Stop generating when conditions are met, to save time
    )
    generated_text = tokenizer.decode(outputs[0])
    print(f'Generated Text:\n{generated_text}')
    
generate_text("Once upon a time, in a land far far away,")

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.


Generated Text:
<|endoftext|>
I don't know about you, but there's a lot of work that goes into making something that sounds good on the big screen, and that's one of the most important aspects of a game. It's hard for me to say how many times I've played a video game, because I'm never sure what I want to see. But I do know that when I look at the game I think, "Wow, this looks really good, I really like it."

There were so many things we wanted to do with it that we had no idea what to expect, so we just did. We didn't really know what we were doing. And then when we finally did get to it, we knew it was going to be great. That was the first time we really thought about making it for the PlayStation 3. So that was a big deal to us because it's been a long time since we've done something like that. You know, for us, it wasn't that big a deal. If you look back on it now, there was no way we could have done it if we weren't in the studio and trying to make it the best possible game out o

#### 3. Discussion

- How did the different parameter settings impact the creativity and coherence of the generated text?
- Were there certain settings that produced more desirable or interesting results?
- How might these parameter tweaks be useful in different text generation applications?

This extension aims to provide a more nuanced understanding of how various parameters affect text generation with GPT-2, paving the way for attendees to experiment and discover optimal settings for their own use cases.

## Limitations

- **Control**: Although we can influence the generated text with various parameters, achieving precise control over the content, style, or tone remains a challenge.
- **Context Understanding**: The model might sometimes generate text that's contextually irrelevant or incorrect, as it solely relies on patterns learned from the training data without understanding the content.
- **Lengthy Text**: Generating lengthy coherent text is still a challenge as the coherence often dwindles as the text gets longer.
- **Computational Resources**: Generating text with large models like GPT-2 requires substantial computational resources, which might be a limitation for real-time applications or individuals with limited computational power.