# Text Generation with GPT-2

**Objective:** To generate creative text based on a given prompt using GPT-2.

We can use Hugging Face's Transformers library, an open-source library for NLP tasks. It offers pre-trained models such as GPT-2, the smaller, openly available predecessor of GPT-3.

In this activity, you will see how a generative model can continue a given text in a meaningful way.

**Instructions:**

## 1. Importing Necessary Libraries

- If you haven't already installed the necessary libraries, you can install them via pip:

In [1]:
#pip install transformers
#pip install torch  # PyTorch, required by the transformers model classes used below

- Import the required libraries

In [2]:
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

- We import `torch` which is the library for PyTorch, a popular framework for deep learning.
- From `transformers`, we import `GPT2Tokenizer` and `GPT2LMHeadModel`. The tokenizer converts text into numbers (token IDs) that the model can understand, and `GPT2LMHeadModel` is the GPT-2 model with a language-modeling head, which is the variant we use to generate text.
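
To make the "text to numbers" idea concrete, here is a toy sketch. It is not how GPT-2 tokenizes: GPT-2 uses byte-pair encoding over subword units, whereas the `ToyTokenizer` class, its whitespace splitting, and its tiny vocabulary are assumptions made purely for illustration.

```python
class ToyTokenizer:
    """Hypothetical whitespace tokenizer; GPT-2 itself uses byte-pair encoding."""

    def __init__(self, corpus):
        # Assign each distinct word an integer ID in order of first appearance.
        self.token_to_id = {}
        for word in corpus.split():
            self.token_to_id.setdefault(word, len(self.token_to_id))
        self.id_to_token = {i: t for t, i in self.token_to_id.items()}

    def encode(self, text):
        # Text -> list of integer IDs (raises KeyError for unknown words).
        return [self.token_to_id[word] for word in text.split()]

    def decode(self, ids):
        # List of integer IDs -> text.
        return " ".join(self.id_to_token[i] for i in ids)


toy = ToyTokenizer("once upon a time in a land far far away")
ids = toy.encode("a land far away")
print(ids)
print(toy.decode(ids))
```

The real tokenizer follows the same encode/decode round trip, but with a learned subword vocabulary so that it can represent words it has never seen.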

## 2. Load the pre-trained GPT-2 model and tokenizer:

We load the pre-trained GPT-2 model and tokenizer using the `from_pretrained` method, specifying `'gpt2'` as the model we want.

In [3]:
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

## 3. Create a function to generate text based on a given prompt:

In [4]:
def generate_text(prompt):
    # Encode the prompt into a tensor of token IDs
    inputs = tokenizer.encode(prompt, return_tensors="pt")

    # Attend to every prompt token (a single prompt has no padding)
    attention_mask = torch.ones(inputs.shape, dtype=torch.long)

    # Generate a continuation using beam search combined with sampling
    outputs = model.generate(
        inputs,  # Encoded prompt
        attention_mask=attention_mask,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token, so reuse the end-of-sequence token
        max_length=150,  # Maximum total length in tokens, prompt included
        num_beams=5,  # Number of beams; experiment with different values
        temperature=0.9,  # Lower for more focused output, higher for more randomness
        do_sample=True,  # Sample from the distribution instead of always taking the most likely token
        top_k=50,  # Sample only from the 50 most probable tokens; lower values reduce diversity
        top_p=0.85,  # Nucleus sampling: keep the smallest token set covering 85% of the probability mass
        no_repeat_ngram_size=2,  # Never repeat a 2-gram; increase to block only longer repeats
        early_stopping=True  # Stop beam search once num_beams finished candidates exist; consider disabling
    )

    # Decode the first generated sequence and print it
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(f'Generated Text:\n{generated_text}')

The function `generate_text` generates a piece of text from a given prompt using a pre-trained language model.
Let's break it down:

- **Input Encoding:**
  - `inputs = tokenizer.encode(prompt, return_tensors="pt")`:
    - The `prompt` is tokenized, converting it into a numerical format that the model can understand.
    - The result is a tensor, a multi-dimensional array used in machine learning tasks, formatted for PyTorch (indicated by `return_tensors="pt"`).

- **Text Generation:**
  - `outputs = model.generate(inputs, ...)`:
    - The pre-trained model is instructed to generate text based on the provided inputs.
    - Parameters such as `max_length`, `num_beams`, `temperature`, `top_k`, etc. control the generation process, affecting the length, creativity, and quality of the generated text.

- **Output Decoding:**
  - `generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)`:
    - The numerical output from the model is translated back into human-readable text using the tokenizer; `skip_special_tokens=True` drops markers such as the end-of-sequence token.
    - Only the first output (`outputs[0]`) is decoded because `generate` returns a single sequence by default.

- **Display Generated Text:**
  - `print(f'Generated Text:\n{generated_text}')`:
    - Finally, the generated text is printed to the console, prefixed with the label "Generated Text:".

In summary, this function encapsulates the process of taking a textual prompt, processing it for the model, generating new text based on that prompt, decoding the generated text back into a human-readable form, and then displaying the result.

#### About Text Generation Parameters:

Below are the parameters used in the `model.generate` call within the `generate_text` function:

**max_length=150:**

- This parameter sets the maximum total length of the output in tokens, counting the prompt tokens as well as the newly generated ones. Once the sequence reaches this length, the model stops generating further tokens.

**num_beams=5:**

- Beam search is a heuristic search algorithm used in machine learning. The `num_beams` parameter specifies the number of beams (or hypotheses) to maintain when generating text. A higher number of beams can result in better quality output, but at the cost of computational resources.
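
As a rough illustration of why keeping several hypotheses can help, here is a minimal beam search over a made-up next-token table. `TOY_MODEL` and its probabilities are hypothetical, and the sketch leaves out details of the library's implementation such as length penalties.

```python
import math

# Hypothetical next-token table: maps the last token to {next_token: probability}.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.3, "dog": 0.3, "end": 0.4},
    "a": {"cat": 0.9, "dog": 0.1},
    "cat": {"end": 1.0},
    "dog": {"end": 1.0},
}


def beam_search(num_beams, max_steps=3):
    # Each beam is (cumulative log-probability, list of tokens).
    beams = [(0.0, ["<s>"])]
    for _ in range(max_steps):
        candidates = []
        for score, tokens in beams:
            if tokens[-1] == "end":  # finished hypotheses are carried over unchanged
                candidates.append((score, tokens))
                continue
            for token, prob in TOY_MODEL[tokens[-1]].items():
                candidates.append((score + math.log(prob), tokens + [token]))
        # Keep only the num_beams highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:num_beams]
    return beams[0][1]


print(beam_search(num_beams=1))  # greedy: commits to the locally best first token
print(beam_search(num_beams=2))  # keeps a second hypothesis alive
```

With one beam the search commits to "the" and ends with a sequence of probability 0.24; with two beams it keeps "a" alive and finds "a cat end" with probability 0.36.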

**temperature=0.9:**

- The `temperature` parameter controls the randomness of the output when sampling is enabled. Lower values (below 1, like the 0.9 used here) make the output more focused and deterministic, while higher values make the output more random and creative.
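
The effect of temperature can be sketched with a plain softmax. The logits below are made-up scores for three candidate tokens, and `softmax_with_temperature` is an illustrative helper, not a function from the Transformers library.

```python
import math


def softmax_with_temperature(logits, temperature):
    # Divide the logits by the temperature, then apply a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
for t in (0.5, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature drops, more probability mass concentrates on the highest-scoring token; as it rises, the distribution flattens and unlikely tokens are sampled more often.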

**top_k=50:**

- During text generation, the `top_k` parameter restricts the selection pool for the next token to the top K probable tokens. This helps in reducing the chance of getting unlikely or rare tokens and keeps the generation process on track.
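
A minimal sketch of this truncation, assuming a hand-written probability table (the tokens and values are invented for illustration, and `top_k_filter` is a hypothetical helper):

```python
def top_k_filter(probs, k):
    # Keep the k most probable tokens and renormalize so they sum to 1.
    kept = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


probs = {"there": 0.5, "a": 0.3, "the": 0.15, "zebra": 0.05}  # made-up values
filtered = top_k_filter(probs, k=2)
print(filtered)
```

After filtering, the rare token "zebra" can no longer be sampled, and the remaining probabilities are rescaled to sum to 1.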

**no_repeat_ngram_size=2:**

- This parameter prevents the model from repeating any n-gram (a sequence of n consecutive tokens) of size 2, so each pair of adjacent tokens can appear at most once in the output. This helps reduce repetitiveness in the generated text.
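
One way to picture the rule: before each step, work out which next tokens would complete an n-gram that has already appeared, and forbid them. The helper below is an illustrative sketch that uses whole words in place of GPT-2's subword tokens; it is not the library's implementation.

```python
def banned_next_tokens(tokens, ngram_size):
    # Return the tokens that would complete an n-gram already present in `tokens`.
    if len(tokens) < ngram_size:
        return set()
    # The last (ngram_size - 1) tokens form the prefix of the next n-gram.
    prefix = tuple(tokens[len(tokens) - ngram_size + 1:])
    banned = set()
    for i in range(len(tokens) - ngram_size + 1):
        ngram = tuple(tokens[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


history = ["the", "cat", "sat", "on", "the"]
print(banned_next_tokens(history, ngram_size=2))
```

Here "the cat" has already occurred, so "cat" is ruled out as the next token after the second "the".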

**early_stopping=True:**

- The `early_stopping` parameter is a boolean flag that applies to beam search. When set to `True`, generation stops as soon as `num_beams` complete candidate sequences (ones that have reached an end-of-sequence token) are available, instead of continuing to search for better ones, which saves time and computational resources.

These parameters are used to control and fine-tune the text generation process, making it easier to obtain desirable and coherent text based on a given prompt.

- **Note:**
The `model.generate` method accepts several other parameters for controlling text generation. They are described in the documentation of the library we are using, which in our case is Hugging Face Transformers. Check out the following links:
- [Hugging Face, OpenAI GPT2](https://huggingface.co/transformers/v2.11.0/model_doc/gpt2.html)
- [Hugging Face, Model](https://huggingface.co/transformers/v2.11.0/main_classes/model.html#transformers.PreTrainedModel.from_pretrained)

## 4. Applying the generate_text function:

Now, call the `generate_text` function with a creative prompt:

In [5]:
# Example usage
generate_text("Once upon a time, in a land far far away,")

Generated Text:
Once upon a time, in a land far far away, there was a man who had been born and raised in the land of his birth.

He was the son of a nobleman, and he was born into a family of noblemen. He was raised to be a knight, but he did not know how to become one. In his youth, he had not been able to learn the art of war, nor did he know the arts of fighting. But when he came of age, his father said to him, "You are a good boy. You have been raised by your father and your mother and you are ready to fight for the cause of your country." And he fought. And so he became a hero of the people


## Activity Extension & Discussion: Enhancing Creativity in Text Generation

Now that we have had our hands-on experience with basic text generation using GPT-2, let's dive a bit deeper to explore how we can tune the model to generate more creative text.

#### 1. Parameter Tuning

- **Temperature:**
  - Adjusting the `temperature` parameter influences the randomness of the output when sampling is enabled (`do_sample=True`).
  - Higher values (e.g., closer to 1 or above) yield more creative and random outputs, whereas lower values make the output more focused and deterministic.
- **Top_k:**
  - The `top_k` parameter restricts the model to choosing the next token from the k most probable tokens.
  - Higher values of `top_k` widen the candidate pool and can introduce more diversity in the generated text; lower values make the output more conservative.
- **Top_p (Nucleus Sampling):**
  - The `top_p` parameter allows for a more dynamic truncation of the vocabulary during sampling: the model samples from the smallest set of tokens whose cumulative probability reaches `top_p`.
  - The size of this set varies from step to step, so the model considers many candidates when it is uncertain and only a few when it is confident. Raising `top_p` can lead to more creative text.
- **Num_beams (Beam Search):**
  - The `num_beams` parameter affects the beam search process, which tends to focus the output towards the most probable sequences.
  - Lowering `num_beams`, or removing it so that the default of 1 disables beam search, can increase creativity by letting the model explore a wider range of possibilities.
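
The nucleus (top_p) truncation described in the list above can be sketched as follows. The probability tables are invented for illustration, and `top_p_filter` is a hypothetical helper whose tie-breaking and edge cases may differ from the library's implementation.

```python
def top_p_filter(probs, top_p):
    # Keep the smallest set of top tokens whose cumulative probability reaches top_p.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


peaked = {"away": 0.9, "off": 0.05, "out": 0.03, "up": 0.02}  # model is confident
flat = {"away": 0.4, "off": 0.3, "out": 0.2, "up": 0.1}  # model is uncertain
print(len(top_p_filter(peaked, 0.85)), len(top_p_filter(flat, 0.85)))
```

With the same `top_p`, the peaked distribution keeps a single token while the flat one keeps three, which is the "dynamic" behaviour that a fixed `top_k` cannot provide.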

#### 2. Activity:

Update the generate_text function with new parameter values based on the above discussion.
Compare the text generated with different parameter settings and discuss the observed changes in creativity and coherence.
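
One possible way to organise the comparison is a small helper that runs the same prompt under several named settings. Everything here is a sketch: `compare_settings`, the `SETTINGS` names and values, and the stand-in `echo_generate` function are assumptions for illustration, chosen so that the block runs without downloading a model.

```python
def compare_settings(generate_fn, prompt, settings):
    # Call generate_fn(prompt, **kwargs) once per named setting and collect the results.
    results = {}
    for name, kwargs in settings.items():
        results[name] = generate_fn(prompt, **kwargs)
    return results


# Hypothetical settings to compare; the names and values are only suggestions.
SETTINGS = {
    "focused": {"temperature": 0.7, "top_k": 50, "top_p": 0.85},
    "creative": {"temperature": 1.5, "top_k": 100, "top_p": 0.95},
}


def echo_generate(prompt, **kwargs):
    # Stand-in generator so the sketch runs without a model.
    return f"{prompt} [temperature={kwargs['temperature']}]"


for name, text in compare_settings(echo_generate, "Once upon a time,", SETTINGS).items():
    print(f"--- {name} ---\n{text}")
```

To use it with GPT-2, replace `echo_generate` with a function that encodes the prompt, calls `model.generate(inputs, do_sample=True, max_length=150, pad_token_id=tokenizer.eos_token_id, **kwargs)`, and returns the decoded string instead of printing it.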

In [6]:
def generate_text(prompt):
    # Encode the prompt into a tensor of token IDs
    inputs = tokenizer.encode(prompt, return_tensors="pt")

    # Attend to every prompt token (a single prompt has no padding)
    attention_mask = torch.ones(inputs.shape, dtype=torch.long)

    outputs = model.generate(
        inputs,  # Encoded prompt
        attention_mask=attention_mask,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token, so reuse the end-of-sequence token
        max_length=300,  # Maximum total length in tokens, prompt included
        num_beams=5,  # Number of beams; experiment with different values
        temperature=1.5,  # Raised from 0.9 for more randomness; lower it for more focused output
        top_k=30,  # Narrower pool than before; raise top_k for more diversity
        top_p=0.85,  # Nucleus sampling: keep the smallest token set covering 85% of the probability mass
        do_sample=True,  # Sample from the distribution instead of always taking the most likely token
        no_repeat_ngram_size=2,  # Never repeat a 2-gram; increase to block only longer repeats
        early_stopping=True,  # Stop beam search once num_beams finished candidates exist; consider disabling
    )
    # Decode the first generated sequence and print it
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(f'Generated Text:\n{generated_text}')

# Example usage
generate_text("Once upon a time, in a land far far away,")

Generated Text:
Once upon a time, in a land far far away, there was a man who had no fear of death, and who was not afraid of his life.

And now, when he was in the midst of this, he said to the man, "Why do you fear death?" The man answered: "I am afraid that my life will be so short as to give way to fear. For I know that I will die in this life; I do not know how it will end; but it is possible that it may end in death. And I have no desire to die, for I want to live in peace." And he went on to tell of the life that he would have had if he had never known that death was the end of all things. But that man did not die; he died by the hand of God and by his own will, because God had given to him a life of peace, which he could not have lived without. Therefore, I ask you, why did you give up your life for the sake of fear? And what was it that you did for your own sake? For you have not believed that God would allow you to go on living, so that if you die you may go to a better place

#### 3. Discussion

- How did the different parameter settings impact the creativity and coherence of the generated text?
- Were there certain settings that produced more desirable or interesting results?
- How might these parameter tweaks be useful in different text generation applications?

This extension aims to provide a more nuanced understanding of how various parameters affect text generation with GPT-2, paving the way for attendees to experiment and discover optimal settings for their own use cases.

## Limitations

- **Control**: Although we can influence the generated text with various parameters, achieving precise control over the content, style, or tone remains a challenge.
- **Context Understanding**: The model might sometimes generate text that's contextually irrelevant or incorrect, as it solely relies on patterns learned from the training data without understanding the content.
- **Lengthy Text**: Generating lengthy coherent text is still a challenge as the coherence often dwindles as the text gets longer.
- **Computational Resources**: Generating text with large models like GPT-2 requires substantial computational resources, which might be a limitation for real-time applications or individuals with limited computational power.