✅ Step 1: Setup (Colab Environment)

 Install Dependencies

In [3]:
# !pip install transformers torch

In [2]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

In [5]:
# Define the model name
model_name = "google/t5-v1_1-large"

# Load the tokenizer and model
# This will download the model weights (approx. 45GB) if not already cached.
# Be prepared for a significant download and memory usage.
print(f"Loading tokenizer and model for {model_name}...")
tokenizer = AutoTokenizer.from_pretrained(model_name, legacy=False)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

Loading tokenizer and model for google/t5-v1_1-large...


In [6]:
# Move model to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
print(f"Model loaded on {device}.")

Model loaded on cuda.


In [57]:
def generate_text(task_prefix, input_text, max_new_tokens=50):
    input_text = f"{task_prefix}: {input_text}"

    inputs = tokenizer(
        input_text,
        return_tensors="pt",
        max_length=512,
        padding=False,
        truncation=True
    )

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    inputs = {k: v.to(device) for k, v in inputs.items()}
    model.to(device)

    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=4,
        no_repeat_ngram_size=2,
        early_stopping=True,
        eos_token_id=tokenizer.eos_token_id
    )

    output_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    num_output_tokens = len(tokenizer.tokenize(output_text))
    print(f"📝 Output:\n{output_text}")
    print(f"⏱ Tokens generated: {num_output_tokens} / {max_new_tokens}")

    return output_text


# Use Cases

In [45]:
print("\n--- Summarization ---")
article = """
Transformer models, first introduced in the 2017 paper "Attention Is All You Need," have revolutionized the field of Natural Language Processing (NLP). Unlike traditional recurrent neural networks (RNNs) or convolutional neural networks (CNNs), Transformers rely entirely on an attention mechanism to draw global dependencies between input and output. This architecture allows for parallelization of computation, significantly speeding up training times compared to sequential models.

Key components of a Transformer include multi-head self-attention mechanisms and position-wise feed-forward networks. Self-attention enables the model to weigh the importance of different words in the input sequence when processing a specific word, capturing contextual relationships effectively. Positional encodings are added to the input embeddings to provide information about the relative or absolute position of tokens in the sequence, as the self-attention mechanism itself is permutation-invariant.

The original Transformer proposed an encoder-decoder structure. The encoder maps an input sequence of symbol representations to a sequence of continuous representations. The decoder then generates an output sequence one symbol at a time, taking the encoder's output and previously generated symbols as input. This architecture proved highly effective for tasks like machine translation.

Later innovations built upon this foundation. BERT (Bidirectional Encoder Representations from Transformers), for instance, adopted an encoder-only architecture and focused on bidirectional context for understanding tasks like masked language modeling. GPT (Generative Pre-trained Transformer) models, on the other hand, are decoder-only and excel at generative tasks by predicting the next token in a sequence. The T5 family of models further unified various NLP tasks into a "text-to-text" framework, where every problem is converted into a text input and text output format.
"""
summary_prefix = "summarize the following article"
generated_summary = generate_text(summary_prefix, article.strip(), max_new_tokens=150)
print(f"Original Article:\n{article[:50]}...\n")
print(f"Generated Summary:\n{generated_summary}\n")


--- Summarization ---
📝 Output:
. Abstract: Transformer models have revolutionized the field of Natural Language Processing (NLP). Here's a brief introduction to Transformers.... Text to text. T5 family of models. To briefly and briefly.. These models focus on predicting the next token in an input sequence.. The resulting models are optimized for text-to-text tasks... A text input and text output format. Finally, the T7 family. The T6 family, on the other hand
⏱ Tokens generated: 99 / 150
Original Article:

Transformer models, first introduced in the 2017 ...

Generated Summary:
. Abstract: Transformer models have revolutionized the field of Natural Language Processing (NLP). Here's a brief introduction to Transformers.... Text to text. T5 family of models. To briefly and briefly.. These models focus on predicting the next token in an input sequence.. The resulting models are optimized for text-to-text tasks... A text input and text output format. Finally, the T7 family. The T6 family

In [47]:
print("\n--- Question Answering ---")
context = "The Eiffel Tower is located in Paris, France. It was designed by Gustave Eiffel."
question = "Where is the Eiffel Tower located?"
qa_prefix = "question"
qa_input = f"question: {question} context: {context}"
generated_answer = generate_text(qa_prefix, qa_input, max_new_tokens=20)

print(f"Context: {context}")
print(f"Question: {question}")
print(f"Generated Answer: {generated_answer}")


--- Question Answering ---
📝 Output:
: The Eiffel Tower is located in Paris, France.???
⏱ Tokens generated: 14 / 20
Context: The Eiffel Tower is located in Paris, France. It was designed by Gustave Eiffel.
Question: Where is the Eiffel Tower located?
Generated Answer: : The Eiffel Tower is located in Paris, France.???


In [58]:
print("\n--- Translation (English to French) ---")
english_sentence = "This is a beautiful day."
translation_prefix = "translate English to French"
generated_translation = generate_text(translation_prefix, english_sentence, max_new_tokens=50)

print(f"English: {english_sentence}")
print(f"French: {generated_translation}")


--- Translation (English to French) ---
📝 Output:
A beautiful day.
⏱ Tokens generated: 4 / 50
English: This is a beautiful day.
French: A beautiful day.


In [49]:
print("\n--- Text Classification (Sentiment Analysis) ---")
# Note: For classification, you'd typically fine-tune. Here, we're trying
# to prompt it zero-shot, which might not be as reliable as Flan-T5.
sentiment_text = "I absolutely loved this movie, it was fantastic!"
sentiment_prefix = "classify the sentiment of this text as positive, negative, or neutral"
generated_sentiment = generate_text(sentiment_prefix, sentiment_text, max_new_tokens=10)

print(f"Text: {sentiment_text}")
print(f"Sentiment: {generated_sentiment}\n")


--- Text Classification (Sentiment Analysis) ---
📝 Output:
the sentiment of this text as positive.
⏱ Tokens generated: 8 / 10
Text: I absolutely loved this movie, it was fantastic!
Sentiment: the sentiment of this text as positive.



In [59]:
print("\n--- Paraphrasing ---")
original_sentence = "The cat sat on the mat."
paraphrase_prefix = "paraphrase"
generated_paraphrase = generate_text(paraphrase_prefix, original_sentence, max_new_tokens=20)
print(f"Original: {original_sentence}")
print(f"Paraphrase: {generated_paraphrase}\n")


--- Paraphrasing ---
📝 Output:
the mat.
⏱ Tokens generated: 3 / 20
Original: The cat sat on the mat.
Paraphrase: the mat.

