# Zero-Shot Abstractive Summarization using T5

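This notebook demonstrates how to use a pretrained **T5** model in a *zero-shot* setting for summarization. The model is not fine-tuned on the target data; we simply prepend the task prefix `summarize:` to the input and let the model generate a summary. Note that the released T5 checkpoints already saw this prefix during their multi-task pretraining, so "zero-shot" here means no additional fine-tuning on our own text.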


In [7]:
import warnings
warnings.filterwarnings('ignore')

In [None]:
!pip install transformers torch datasets accelerate sentencepiece --quiet

### 🔹 Step 1: Load Pretrained T5 Model and Tokenizer

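We use the lightweight `t5-small` checkpoint (about 60 million parameters). `t5-base` (about 220 million) or `t5-large` (about 770 million) usually give better summaries at the cost of more memory and slower generation.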


In [None]:
from transformers import T5Tokenizer, T5ForConditionalGeneration
import torch

model_name = "t5-small"  # Try 't5-base' or 't5-large' for better results
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

### 🔹 Step 2: Define Input Text

You can test the summarization capability with any long text. Here's a sample news-style paragraph.


In [None]:
# Sample long text input
text = """
Apple Inc. is planning to release a new line of MacBooks in late 2025. The new laptops are expected to be powered by a new M4 chip and will include performance improvements and longer battery life. Analysts predict this launch will help Apple regain its lead in the laptop market. In addition, the company is rumored to be working on AI integration for macOS, enabling more personalized and intelligent user experiences.
"""

# Prepend the task prefix T5 expects; strip the newlines left by the triple-quoted string
input_text = "summarize: " + text.strip()
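The prompt construction in Step 2 can be wrapped in a small helper. This is only a sketch: `build_t5_input` is a hypothetical name, not part of `transformers`, and collapsing whitespace is an assumption made for tidiness rather than something T5 requires.

```python
def build_t5_input(text: str, prefix: str = "summarize: ") -> str:
    """Collapse runs of whitespace and prepend a T5 task prefix.

    Hypothetical helper for illustration; not a transformers API.
    """
    cleaned = " ".join(text.split())  # removes newlines and repeated spaces
    return prefix + cleaned


example = """
Apple Inc. is planning to release
a new line of MacBooks in late 2025.
"""
print(build_t5_input(example))
```

In the notebook, `input_text = build_t5_input(text)` would stand in for the manual string concatenation.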

### 🔹 Step 3: Tokenize the Input

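The input is converted into token IDs and truncated to at most 512 tokens, the sequence length T5 was pretrained with. Anything beyond that limit is dropped before the model sees it.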


In [None]:
inputs = tokenizer.encode(input_text, return_tensors="pt", max_length=512, truncation=True)
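Because truncation drops text without any visible signal in the output, it can help to check how many tokens a document has before relying on the summary. The sketch below assumes only that the tokenizer object has an `encode(text)` method returning a list of IDs, as the `T5Tokenizer` from Step 1 does. `truncation_report` and `WhitespaceTokenizer` are hypothetical names; the latter is a stand-in so the example runs without downloading a model.

```python
def truncation_report(tokenizer, text, max_length=512):
    """Return (n_tokens, n_dropped) for `text` under a `max_length` token budget.

    `tokenizer` is any object with an `encode(text) -> list[int]` method.
    """
    n_tokens = len(tokenizer.encode(text))
    n_dropped = max(0, n_tokens - max_length)
    return n_tokens, n_dropped


class WhitespaceTokenizer:
    """Stand-in for demonstration only: one fake ID per whitespace-separated word."""

    def encode(self, text):
        return list(range(len(text.split())))


n_tokens, n_dropped = truncation_report(
    WhitespaceTokenizer(), "one two three four five", max_length=3
)
print(f"{n_tokens} tokens, {n_dropped} would be dropped")
```

In the notebook itself, `truncation_report(tokenizer, input_text)` would report the real token count for the sample paragraph.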

### 🔹 Step 4: Generate the Summary

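Beam search keeps the `num_beams=4` most likely partial summaries at each decoding step instead of committing to the single best token. `min_length` and `max_length` bound the output length in tokens, a `length_penalty` above the default of 1.0 shifts the ranking toward longer candidates, and `early_stopping=True` ends the search once `num_beams` finished candidates have been found.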


In [None]:
summary_ids = model.generate(
    inputs,
    max_length=180,
    min_length=30,
    length_penalty=2.0,
    num_beams=4,
    early_stopping=True
)
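To make the role of `num_beams` and `length_penalty` more concrete, here is a deliberately simplified beam search over a toy three-token vocabulary. It is a sketch, not the `transformers` implementation: the probability table `LOGPROBS` is made up, and finished candidates are ranked by the sum of log-probabilities divided by `length ** length_penalty`, which is the general form of the length normalization.

```python
import math

# Toy next-token log-probabilities keyed by the previous token.
# The numbers are invented for illustration; a real model computes them.
LOGPROBS = {
    "<s>": {"a": math.log(0.6), "b": math.log(0.4)},
    "a": {"a": math.log(0.1), "b": math.log(0.3), "</s>": math.log(0.6)},
    "b": {"a": math.log(0.5), "b": math.log(0.3), "</s>": math.log(0.2)},
}


def beam_search(num_beams=2, max_length=4, length_penalty=1.0):
    """Return finished (tokens, score) candidates, best first."""
    beams = [(["<s>"], 0.0)]  # (tokens so far, summed log-probability)
    finished = []
    for _ in range(max_length):
        # Expand every live beam by every possible next token.
        candidates = [
            (tokens + [tok], logp + lp)
            for tokens, logp in beams
            for tok, lp in LOGPROBS[tokens[-1]].items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        # Keep only the num_beams best expansions.
        for tokens, logp in candidates[:num_beams]:
            if tokens[-1] == "</s>":
                length = len(tokens) - 1  # do not count the start symbol
                finished.append((tokens[1:], logp / (length ** length_penalty)))
            else:
                beams.append((tokens, logp))
        if not beams:
            break
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished


for penalty in (1.0, 2.0):
    best_tokens, best_score = beam_search(length_penalty=penalty)[0]
    print(penalty, best_tokens, round(best_score, 4))
```

With these made-up numbers, raising `length_penalty` from 1.0 to 2.0 flips the top-ranked candidate from the two-token sequence to the three-token one, which is the same pressure toward longer summaries that `length_penalty=2.0` applies in the `generate` call above.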

### 🔹 Step 5: Decode and Display the Summary


In [None]:
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print("\n📌 Summary:\n", summary)


📌 Summary:
 the new laptops are expected to be powered by a new M4 chip. the company is rumored to be working on AI integration for macOS.
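The summary printed above starts each sentence in lowercase. If that matters for display, a purely cosmetic post-processing step can fix it. The helper below is a hypothetical sketch, not part of `transformers`, and its sentence detection is naive: an abbreviation such as "e.g." followed by a lowercase word would also be capitalized.

```python
import re


def capitalize_sentences(summary: str) -> str:
    """Uppercase the first letter of the text and of each sentence after '.', '!' or '?'."""
    return re.sub(
        r"(^\s*|[.!?]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        summary,
    )


raw = (
    "the new laptops are expected to be powered by a new M4 chip. "
    "the company is rumored to be working on AI integration for macOS."
)
print(capitalize_sentences(raw))
```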
