**Text Generation using Transformers**

**Installing Libraries**

The command !pip install transformers torch installs the transformers and torch libraries; the leading ! tells the notebook to run the line as a shell command. The transformers library, developed by Hugging Face, provides pre-trained models, including GPT models, for tasks such as text generation, translation, and summarization. torch (PyTorch) is the deep learning framework these models run on: it handles tensor operations and neural network training.

In the context of GPT (Generative Pre-trained Transformer) text generation, transformers provides the interface for loading and using GPT models, while torch runs the underlying computations. Given a prompt, a GPT model repeatedly predicts the next token and appends it to the text, producing a human-like continuation based on the patterns it learned during training.

In [None]:
!pip install transformers torch
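
After the install finishes, it can be useful to confirm which versions ended up in the environment. The sketch below uses only the standard library; installed_version is a hypothetical helper name chosen for this example, not part of either package.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages this notebook depends on
for name in ("transformers", "torch"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```

If either line reports "not installed", rerun the install cell before continuing.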



**Importing Libraries**

import warnings: This imports the warnings module, which allows control over warning messages.
warnings.filterwarnings('ignore'): This tells Python to suppress all warnings issued through the warnings module for the rest of the session. Warnings flag non-critical issues, such as deprecated functions or behavior that will change in a future release; they do not stop the code from running.
This keeps notebook output readable when libraries emit many warnings. Use it with caution, though: a blanket filter also hides warnings that point to real problems, and it has no effect on messages that a library writes through Python's logging module.

In [None]:
import warnings
warnings.filterwarnings('ignore')
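
As a more selective alternative to the blanket filter, a single warning category can be silenced while the rest stay visible. This is a minimal sketch using only the standard library; noisy is a made-up function that stands in for library code, and the warnings are recorded into a list here so the effect is easy to inspect.

```python
import warnings


def noisy():
    """Stand-in for library code that emits two kinds of warnings."""
    warnings.warn("old API", DeprecationWarning)
    warnings.warn("check your inputs", UserWarning)
    return 42


# catch_warnings restores the previous filters when the block exits
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Silence only deprecation notices; other categories still get through
    warnings.filterwarnings("ignore", category=DeprecationWarning)
    result = noisy()

shown = [str(w.message) for w in caught]
print(shown)
```

Only the UserWarning is recorded, so messages that may need attention remain visible.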

import torch: Imports the PyTorch library, which is essential for building and running deep learning models. It handles tensors and supports automatic differentiation for neural networks.

from transformers import GPT2LMHeadModel, GPT2Tokenizer:

GPT2LMHeadModel: The GPT-2 model with a language-modeling head on top, which outputs a score for every vocabulary token at each position. This is the class used for text generation, where the model predicts the next token from the input so far.
GPT2Tokenizer: This tokenizer splits text into subword tokens and maps them to integer IDs that the GPT-2 model takes as input. It also converts the model's output IDs back into human-readable text.
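
The encode and decode round trip described above can be illustrated with a deliberately simplified stand-in. The sketch below is not GPT-2's tokenizer: it maps each UTF-8 byte to one ID, whereas GPT-2 uses byte-level BPE to merge frequent byte sequences into subword tokens, so its IDs would differ. toy_encode and toy_decode are hypothetical names for this example.

```python
def toy_encode(text):
    """Map text to a list of integer IDs, one per UTF-8 byte."""
    return list(text.encode("utf-8"))


def toy_decode(token_ids):
    """Map a list of byte IDs back to text."""
    return bytes(token_ids).decode("utf-8")


ids = toy_encode("Hello GPT")
print(ids)              # a list of integers, the form a model consumes
print(toy_decode(ids))  # the original text, recovered from the IDs
```

The real tokenizer follows the same pattern: text goes in, integer IDs go to the model, and the generated IDs are decoded back to text.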

In [None]:
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

**Loading pre-trained Model & Tokenizer**

model_name = "gpt2-large": This selects "gpt2-large", a larger variant of GPT-2 with approximately 774 million parameters. Larger models tend to generate more fluent text because they have more capacity to learn language patterns, at the cost of a bigger download (the weights file is about 3.25 GB) and slower generation.

GPT2Tokenizer.from_pretrained(model_name): This downloads (on first use) and loads the pre-trained tokenizer that matches "gpt2-large". The tokenizer converts text into the token IDs that the model understands.

GPT2LMHeadModel.from_pretrained(model_name): This loads the pre-trained GPT-2 weights (in this case, the large variant) into memory. The LMHead part of the name indicates a language-modeling head: the model outputs a prediction for the next token given the input so far.

In [None]:
# Load pre-trained 'GPT-2' model and tokenizer
model_name = "gpt2-large"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

**Setting up Model & Tokenizer**

Calling model.eval() puts the GPT-2 model into evaluation mode. Models returned by from_pretrained are already in evaluation mode, so the call acts as an explicit safeguard here.

In PyTorch, a model is in one of two modes:

Training mode (model.train()): Used while training the model. Layers such as dropout and batch normalization run in their training form; dropout, for example, randomly zeroes activations.
Evaluation mode (model.eval()): Used for testing and inference, such as generating text. Dropout is disabled, so the same input produces the same activations. Note that eval() does not freeze the parameters or turn off gradient tracking; that is the job of torch.no_grad().
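
What "behaving differently" means for a dropout layer in the two modes can be shown with a toy version in plain Python. ToyDropout is a simplified stand-in written for this example, not torch.nn.Dropout; it only mimics the train/eval switch.

```python
import random


class ToyDropout:
    """Simplified stand-in for a dropout layer (not torch.nn.Dropout)."""

    def __init__(self, p=0.1, seed=0):
        self.p = p
        self.training = True  # like PyTorch modules, start in training mode
        self._rng = random.Random(seed)

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            # Evaluation mode: pass the input through unchanged
            return list(values)
        keep = 1.0 - self.p
        # Training mode: zero each value with probability p, rescale the rest
        return [v / keep if self._rng.random() < keep else 0.0 for v in values]


layer = ToyDropout(p=0.5)
x = [1.0, 2.0, 3.0, 4.0]
print("train:", layer.train()(x))
print("eval: ", layer.eval()(x))
```

In training mode some entries are zeroed and the survivors are scaled up; in evaluation mode the input comes back untouched, which is the behavior wanted during text generation.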

In [None]:
# Set the model to evaluation mode (no training)
model.eval()

GPT2LMHeadModel(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 1280)
    (wpe): Embedding(1024, 1280)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-35): 36 x GPT2Block(
        (ln_1): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2SdpaAttention(
          (c_attn): Conv1D()
          (c_proj): Conv1D()
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D()
          (c_proj): Conv1D()
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=1280, out_features=50257, bias=False)
)
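
The dimensions in the printed module tree are enough to check the "approximately 774 million parameters" figure by hand. The sketch below does the arithmetic in plain Python. Two things are assumptions because the printout does not show them: the feed-forward inner width is taken to be four times the embedding width, and lm_head is taken to share its weights with wte.

```python
# Dimensions read from the printed module tree
vocab_size = 50257   # rows of wte
n_positions = 1024   # rows of wpe
d_model = 1280       # embedding width
n_layers = 36        # number of GPT2Block modules

# Assumption: feed-forward inner width is 4 * d_model (Conv1D() prints no sizes)
d_ff = 4 * d_model


def linear_params(n_in, n_out):
    """Weights plus biases of one dense (Conv1D) layer."""
    return n_in * n_out + n_out


layer_norm = 2 * d_model  # one scale and one shift vector

embeddings = vocab_size * d_model + n_positions * d_model
attention = linear_params(d_model, 3 * d_model) + linear_params(d_model, d_model)
mlp = linear_params(d_model, d_ff) + linear_params(d_ff, d_model)
block = 2 * layer_norm + attention + mlp

# Assumption: lm_head reuses the wte matrix, so it adds no parameters
total = embeddings + n_layers * block + layer_norm  # last term is ln_f

print(f"{total:,}")
```

Under these assumptions the total comes to 774,030,080, in line with the approximate figure quoted for "gpt2-large".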

Padding tokens: In NLP, padding tokens are used to ensure that input sequences (sentences) of different lengths are aligned to the same length. This is necessary when processing batches of sequences, as most models expect input tensors of consistent sizes.

tokenizer.pad_token: This represents the token used to pad shorter sequences. If it hasn't been set already (i.e., None), this code assigns the end-of-sequence token (EOS) as the padding token.

Why EOS as padding? GPT-2 was trained without a padding token, so its tokenizer does not define one. Reusing the EOS token (<|endoftext|>) as the padding token is a common workaround.

model.generation_config.pad_token_id = tokenizer.pad_token_id: After setting the padding token on the tokenizer, this makes the model's generation configuration use the same token ID for padding. This matters when generating text in batches or handling inputs of different lengths.

Why this is important:
Batch processing: A batch is a single tensor, so every sequence in it must have the same length. Padding brings shorter sequences up to the length of the longest one; without it, batching inputs of different lengths raises an error.
Proper handling of shorter inputs: Padded positions carry no content, so the model must be told to ignore them through an attention mask. Because the pad and EOS tokens share one ID here, generate() cannot infer that mask from the input IDs alone, which is why it asks for an explicit attention_mask in the output shown later.
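
How padding and the attention mask fit together can be sketched in plain Python. pad_batch is a hypothetical helper written for this example (the tokenizer does this work itself when asked to pad), and the token IDs in the batch are made up. Padding on the left is used as the default because it is the usual recommendation for batched generation with decoder-only models such as GPT-2.

```python
def pad_batch(sequences, pad_id, side="left"):
    """Pad lists of token IDs to equal length and build matching attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        padding = [pad_id] * (max_len - len(seq))
        mask_pad = [0] * len(padding)  # 0 marks positions the model should ignore
        if side == "left":
            input_ids.append(padding + list(seq))
            attention_mask.append(mask_pad + [1] * len(seq))
        else:
            input_ids.append(list(seq) + padding)
            attention_mask.append([1] * len(seq) + mask_pad)
    return input_ids, attention_mask


EOS_ID = 50256  # ID of <|endoftext|> in the GPT-2 vocabulary
batch = [[40, 716, 257], [15496]]  # made-up token IDs of different lengths
ids, mask = pad_batch(batch, pad_id=EOS_ID)
print(ids)
print(mask)
```

The mask is what distinguishes a padding EOS from a genuine one, since both carry the same ID.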

In [None]:
# Setting pad token
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model.generation_config.pad_token_id = tokenizer.pad_token_id

**Generating Text**


Function Signature:

prompt: The input text for which the model will generate a continuation.
max_length: The maximum total length of the output sequence in tokens (subword units, not words), counting the prompt tokens as well as the generated ones (default 50).
temperature: Controls how random the sampling is. Values below 1, such as 0.7, sharpen the distribution toward the most likely tokens; values above 1 flatten it and produce more diverse text.

tokenizer.encode(prompt, return_tensors="pt"):

This encodes the prompt into token IDs that the GPT-2 model can process. return_tensors="pt" returns them as a PyTorch tensor, which is the input format the model expects.

model.generate(...):

This generates text from the input tokens. The following parameters control the generation:
max_length: The maximum length of the output sequence, prompt included. To limit only the newly generated tokens, generate() also accepts max_new_tokens.
num_return_sequences=1: Returns a single sequence.
no_repeat_ngram_size=2: Prevents any 2-gram (pair of consecutive tokens) from appearing more than once in the output, which suppresses repetitive loops.
top_k=50: Restricts sampling to the 50 most likely tokens at each step.
top_p=0.95: Applies nucleus sampling, where the model samples from the smallest set of tokens whose cumulative probability reaches 0.95.
temperature=temperature: Rescales the token probabilities before sampling, as described above.

top_k, top_p, and temperature only take effect when sampling is enabled with do_sample=True; without it, generate() uses greedy decoding and ignores them.

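
The effect of temperature, top_k, and top_p on a single sampling step can be sketched in plain Python. This is a simplified illustration on a made-up four-token vocabulary, not the library's implementation; apply_temperature, renormalize, and filter_top_k_top_p are hypothetical names.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Softmax over logits divided by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def renormalize(candidates):
    """Rescale (index, probability) pairs so the probabilities sum to 1."""
    total = sum(p for _, p in candidates)
    return [(i, p / total) for i, p in candidates]


def filter_top_k_top_p(probs, top_k, top_p):
    """Keep the top_k most likely tokens, then the smallest nucleus reaching top_p."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    ranked = renormalize(ranked[:top_k])
    nucleus, cumulative = [], 0.0
    for index, p in ranked:
        nucleus.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break
    return dict(renormalize(nucleus))


logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
sharp = apply_temperature(logits, 0.7)
flat = apply_temperature(logits, 1.5)
print(max(sharp) > max(flat))  # lower temperature concentrates probability

candidates = filter_top_k_top_p([0.5, 0.3, 0.15, 0.05], top_k=3, top_p=0.75)
rng = random.Random(0)
next_token = rng.choices(list(candidates), weights=list(candidates.values()))[0]
print(candidates, next_token)
```

Here top_k=3 drops the least likely token, and top_p=0.75 then keeps only the two most likely ones, so the sampled token is always index 0 or 1.
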
tokenizer.decode(output[0], skip_special_tokens=True):

This converts the first (and here only) generated sequence of token IDs back into human-readable text. skip_special_tokens=True removes special tokens such as <|endoftext|> from the result.
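
The no_repeat_ngram_size constraint described above can be sketched as a function that, given the tokens generated so far, lists the tokens that must not come next. This is an illustration of the idea in plain Python, not the library's code; banned_next_tokens is a hypothetical name and the token IDs are made up.

```python
def banned_next_tokens(generated, ngram_size):
    """Tokens that would complete an n-gram already present in `generated`."""
    if len(generated) < ngram_size:
        return set()
    # The last (ngram_size - 1) tokens form the start of the next n-gram
    start = len(generated) - (ngram_size - 1)
    prefix = tuple(generated[start:])
    banned = set()
    for i in range(len(generated) - ngram_size + 1):
        # Wherever the same prefix occurred before, ban the token that followed it
        if tuple(generated[i:i + ngram_size - 1]) == prefix:
            banned.add(generated[i + ngram_size - 1])
    return banned


history = [5, 7, 9, 5]  # made-up token IDs; the pair (5, 7) has already occurred
print(banned_next_tokens(history, ngram_size=2))
```

With ngram_size=2, token 7 is banned after the second 5 because the pair (5, 7) is already in the sequence.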

In [None]:
# Function to generate text
def generate_text_from_model(prompt, max_length=50, temperature=0.7):

    # Encode the prompt into a tensor of token IDs
    input_ids = tokenizer.encode(prompt, return_tensors="pt")

    # Generate text
    output = model.generate(
        input_ids,
        max_length=max_length,  # total length, prompt included
        do_sample=True,  # without this, top_k, top_p and temperature are ignored
        num_return_sequences=1,
        no_repeat_ngram_size=2,
        top_k=50,
        top_p=0.95,
        temperature=temperature,
    )

    # Decode the first (only) sequence and return the generated text
    return tokenizer.decode(output[0], skip_special_tokens=True)

User Input (input):

The function prompts the user to enter some text via input('Enter prompt: '). The entered text is the starting point from which the GPT-2 model generates a continuation.

Calling the generate_text_from_model function:

Once the prompt is received, the function calls generate_text_from_model with max_length=100, so the returned sequence, prompt included, is at most 100 tokens long.

Printing the Generated Text:

The generated text is printed, so the user can see the model's output for the given prompt.

In [None]:
# Generate text with a prompt
def text_generation():

    # Prompt input from user (input() already returns a string)
    prompt = input('Enter prompt: ')

    # Generating text from model
    generated_text = generate_text_from_model(prompt, max_length=100)

    # Printing generated text
    print(generated_text)

Expected Flow:
Prompt Input: When you run text_generation(), it will ask you to enter a prompt.

Example:
Enter prompt: The rise of artificial intelligence
Generated Output: Based on the prompt, the model will generate text and print the result.

Example output (illustrative; actual GPT-2 output will differ, as the runs below show):
The rise of artificial intelligence is revolutionizing industries across the globe. From healthcare to finance, AI is becoming an integral part of daily operations, enhancing decision-making, automating tasks, and offering new insights that were previously impossible to achieve.
Once the functions above are defined, run text_generation() and it will print generated text for the prompt you enter.

In [None]:
text_generation()

Enter prompt: I am janhavi


The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


I am janhavi, a student of the University of Delhi. I am a member of a group called 'Bharat Mata ki Jai' (Mother India). We are protesting against the government's decision to ban the 'Hindu' book 'Vedas' from being sold in the country.

I have been reading the book for the past two years. It is a book that is very important to me. The book is called "Veda: The Sacred Book of


In [None]:
text_generation()

Enter prompt: i love my india
i love my india. i love the indian culture. but i hate the way they treat women. they are so sexist. and i am so tired of it.

i am a woman. so i have to deal with this. it is not easy. sometimes i feel like i can't do anything. like if i try to do something i will be called a whore. or if I try and do a job i get called sexist and a bitch. because i'm a female
