# Text Generation with GPT-2

**Objective:** To generate creative text based on a given prompt using GPT-2.

We can use the Hugging Face's Transformers library, which is an open-source library for NLP tasks. They offer pre-trained models like GPT-2, which is a smaller and more accessible version of GPT-3. 

For this activity, you will see how a generative model can continue a given text in a meaningful way

**Instructions:**

## 1. Importing Necessary Libraries

- You may need to install the necessary libraries if you haven't already, You can install them via pip:

In [2]:
#pip install transformers
#pip install torch #Requirement library for some functionalities of transformers library 

- Import the required libraries

In [1]:
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

- We import `torch` which is the library for PyTorch, a popular framework for deep learning.
- From `transformers`, we import `GPT2Tokenizer` and `GPT2LMHeadModel`. The tokenizer will help us convert text to numbers that the model can understand, and GPT2LMHeadModel is the actual GPT-2 model we'll be using.

## 2. Load the pre-trained GPT-2 model and tokenizer:

We load the pre-trained GPT-2 model and tokenizer using the from_pretrained method, specifying 'gpt2' as the model we want.

In [44]:
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

## 3. Create a function to generate text based on a given prompt:

In [45]:
def generate_text(prompt):
    inputs = tokenizer.encode(
        prompt,
        return_tensors="pt"
    )
    
    outputs = model.generate(
        max_length=150, 
        num_beams=5, 
        temperature=0.7,  # Lower temperature to make output more focused
        #do_sample=True,
        top_k=50, 
        no_repeat_ngram_size=2,  # Prevent repeating n-grams of size 2
        early_stopping=True,  # Stop generating when conditions are met, to save time
    )
    generated_text = tokenizer.decode(outputs[0])
    print(f'Generated Text:\n{generated_text}')

The function `generate_text` is designed to generate a piece of text based on a given prompt using a pre-trained language model.
let's break it down:

- **Input Encoding:**

 - `inputs = tokenizer.encode(prompt, return_tensors="pt")`: 

      - The `prompt` is processed (tokenized) to convert it into a numerical format that the machine learning model can understand.
   - The result is a tensor, which is a multi-dimensional array used in machine learning tasks, specifically formatted for PyTorch (indicated by return_tensors="pt").
   
- **Text Generation:**

    - `outputs = model.generate(inputs, max_length=150, num_beams=5, temperature=1.5, top_k=50)`:
               - The pre-trained model is instructed to generate text based on the provided inputs.
             - Various parameters like `max_length`, `num_beams`, `temperature`, `top_k`, ect are set to control the text generation process, impacting the length, creativity, and quality of the generated text.
             

- **Output Decoding:**

  - `generated_text = tokenizer.decode(outputs[0])`:
         - The numerical output from the model is translated back into human-readable text using the tokenizer.
        - Only the first output (`outputs[0]`) is decoded as the model is set up to generate one piece of text in this case.

- **Display Generated Text:**

- `print(f'Generated Text:\n{generated_text}')`:
        - Finally, the generated text is printed to the console, prefixed with the label "Generated Text:".

In summary, this function encapsulates the process of taking a textual prompt, processing it for the model, generating new text based on that prompt, decoding the generated text back into a human-readable form, and then displaying the result.

#### About Text Generation Parameters:

Below are the parameters used in the model.generate method within the generate_text function:

**max_length=150:**

- This parameter sets the maximum number of tokens in the generated text. If the model reaches this length, it will stop generating further tokens.

**num_beams=5:**

- Beam search is a heuristic search algorithm used in machine learning. The `num_beams` parameter specifies the number of beams (or hypotheses) to maintain when generating text. A higher number of beams can result in better quality output, but at the cost of computational resources.

**temperature=0.7:**

- The `temperature` parameter helps control the randomness of the output. Lower values (like 0.7) make the output more focused and deterministic, while higher values make the output more random and creative.

**top_k=50:**

- During text generation, the `top_k` parameter restricts the selection pool for the next token to the top K probable tokens. This helps in reducing the chance of getting unlikely or rare tokens and keeps the generation process on track.

**no_repeat_ngram_size=2:**

- This parameter helps prevent the model from generating repeating n-grams (a sequence of n words) of size 2. This can aid in reducing repetitiveness in the generated text.

**early_stopping=True:**

- The `early_stopping` parameter is a boolean flag that, when set to `True`, stops the text generation process once certain conditions are met (like reaching an end-of-sequence token), helping to save time and computational resources.

These parameters are used to control and fine-tune the text generation process, making it easier to obtain desirable and coherent text based on a given prompt.

- **Note:**
There are several other parameters we can include to control the text generation process using the `model.generate` method. The parameters and their descriptions can be found in the documentation for the specific library we are using in our case it is Hugging Face Transformers. Check out this [link]()

## 4. Applying the generate_text function in out text Generated Text:

Now, call the generate_text function with a creative prompt:

In [46]:
generate_text("Once upon a time, in a land far far away,")

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.


Generated Text:
<|endoftext|>I'm not sure if this is a good thing or a bad thing, but I think it's important to note that this isn't the first time this has happened. In fact, I've seen it happen before.

In the early 1990s, a group of students at the University of California, Santa Barbara, decided to take a class on how to use the Internet. The class was called "The Internet of Things," and it was designed to make it easier for people to connect with each other online. It was a great idea, and the students were so impressed with it that they started using it to communicate with one another. But it wasn't until a few years later that the class started to get a lot of attention.
