In [None]:
'''🔹 What is T5?

Full name: Text-to-Text Transfer Transformer

Developed by: Google Research (2019)

Idea: Convert every NLP task into a text-to-text format.

Input = Text

Output = Text

🔹 Why is T5 Special?

Most older NLP models were built for specific tasks (e.g., BERT for classification, GPT for text generation).
T5 unifies everything → one model, many tasks.

Examples:

Translation: "translate English to German: How are you?" → "Wie geht es dir?"

Summarization: "summarize: long paragraph..." → "short summary"

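Question Answering: "question: Where is the Taj Mahal? context: The Taj Mahal is in India." → "India"

Classification: "sst2 sentence: I love this movie" → "positive" (the released checkpoints saw the "sst2 sentence:" prefix for sentiment during training; a made-up prefix such as "classify sentiment:" may only work after fine-tuning)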

🔹 Architecture

Based on Transformer Encoder-Decoder (Seq2Seq)

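Roughly a BERT-style bidirectional encoder paired with a GPT-style autoregressive decoder; the decoder attends to the encoder output through cross-attention.

Pre-trained on a large cleaned web-crawl dataset called C4 (Colossal Clean Crawled Corpus).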

🔹 Variants

t5-small (60M parameters) → Fast, lightweight

t5-base (220M)

t5-large (770M)

t5-3b (3 billion)

t5-11b (11 billion) → Largest released variant, needs far more memory

🔹 Popular Use Cases

Text Summarization

Machine Translation

Question Answering

Text Classification

Paraphrasing / Rewriting

👉 In short:
T5 = one universal NLP model that treats everything as a text-to-text problem.'''
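
The single-input examples in the notes above all share one shape: a task prefix, a colon, then the text. Here is a minimal sketch of that convention in plain Python. The helper name `with_task_prefix` is hypothetical and not part of the transformers library; it only builds the string that would be passed to the tokenizer.

```python
def with_task_prefix(task_prefix: str, text: str) -> str:
    """Return 'prefix: text', tolerating a trailing colon or spaces in the prefix."""
    clean_prefix = task_prefix.strip().rstrip(":")
    return f"{clean_prefix}: {text.strip()}"


# Prompts taken from the examples in the notes
prompts = [
    with_task_prefix("translate English to German", "How are you?"),
    with_task_prefix("summarize", "long paragraph..."),
]
for prompt in prompts:
    print(prompt)
```

Keeping the prefix separate from the text makes it easy to reuse one input across several tasks by swapping only the prefix.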

In [7]:
# Summarization

from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load model and tokenizer
model = T5ForConditionalGeneration.from_pretrained("t5-small")
tokenizer = T5Tokenizer.from_pretrained("t5-small")

# Your long input text
text = """
The phenomenon of rapid urbanization, particularly in the Global South, has precipitated a multifaceted array of socioeconomic, environmental, and infrastructural challenges that demand both immediate and long-term strategic responses. As cities expand at unprecedented rates—often outpacing the capacity of municipal governments to regulate growth—informal settlements proliferate, creating densely populated areas with limited access to clean water, sanitation, and healthcare. These settlements are not merely the byproducts of poverty but also reflect systemic failures in land-use policy, governance, and the financial exclusion of low-income populations from formal housing markets.

Compounding this complexity is the intensifying impact of climate change, which disproportionately affects urban centers in low-lying coastal regions. Rising sea levels, erratic weather patterns, and increased frequency of extreme weather events exacerbate the vulnerability of informal communities, many of which lack the physical infrastructure and economic resilience to adapt. Urban planning in these contexts is further hindered by political fragmentation, corruption, and a dearth of reliable data, leading to a disconnect between policy frameworks and the lived realities of urban dwellers.

However, recent interdisciplinary approaches that integrate participatory governance, data-driven planning, and localized climate adaptation strategies have shown promise. These efforts emphasize the importance of co-production—wherein community members, planners, and policymakers collaboratively design interventions tailored to specific local contexts. While scaling such models remains a formidable challenge, they offer a potential paradigm shift away from top-down, technocratic planning toward more inclusive and resilient urban futures.
"""

# Prepend "summarize:" for T5
input_text = "summarize: " + text

# Tokenize input
inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)

# Generate summary
summary_ids = model.generate(
    inputs["input_ids"],
    max_length=150,
    num_beams=4,
    early_stopping=True
)

# Decode summary
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)


rising sea levels, erratic weather patterns, and increased frequency of extreme weather events exacerbate the vulnerability of informal communities. rising sea levels, erratic weather patterns, and increased frequency of extreme weather events exacerbate the vulnerability of informal communities. urban planning in these contexts is further hindered by political fragmentation, corruption, and a dearth of reliable data.
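
The summary above repeats its first sentence. One possible post-processing step, sketched here with a hypothetical helper `drop_repeated_sentences`, keeps only the first occurrence of each sentence. It assumes sentences end in `.`, `!`, or `?` followed by whitespace, which is a simplification. Passing `no_repeat_ngram_size` to `generate` is another option that works at decoding time instead.

```python
import re


def drop_repeated_sentences(summary: str) -> str:
    """Keep the first occurrence of each sentence, preserving order."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", summary.strip())
    seen = set()
    kept = []
    for sentence in sentences:
        key = sentence.lower().strip()
        if key and key not in seen:
            seen.add(key)
            kept.append(sentence)
    return " ".join(kept)


example = "rising seas hurt cities. rising seas hurt cities. planning lacks data."
print(drop_repeated_sentences(example))
```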


In [10]:
# Translation

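from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load model and tokenizer (this replaces the t5-small model loaded above)
model_name = "t5-large"
model = T5ForConditionalGeneration.from_pretrained(model_name)
tokenizer = T5Tokenizer.from_pretrained(model_name)

# The task prefix tells T5 which language pair to translate
text = "translate English to French: The weather is nice today."

# Tokenize input; the name `inputs` avoids shadowing the built-in input()
inputs = tokenizer(text, return_tensors="pt", max_length=256, truncation=True)

# Generate and decode the translation
output_ids = model.generate(**inputs)
translation = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(translation)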



Le temps est beau aujourd'hui.


In [12]:
# Question Answering

text = '''The water cycle describes how water moves through Earth's atmosphere, land, and oceans.
It begins with evaporation, where the sun heats water in rivers, lakes, and oceans.
This water turns into vapor and rises into the atmosphere.
As the vapor cools, it condenses into clouds.
Eventually, the water falls back to Earth as precipitation, like rain or snow.
Some of this water flows over land as runoff, returning to rivers and oceans.
Other water seeps into the ground and becomes groundwater.
Plants also release water into the air through transpiration.
Together, evaporation and transpiration are called evapotranspiration.
This cycle plays a vital role in weather, climate, and sustaining life'''

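# Reuses `model` and `tokenizer` (t5-large) from the translation cell.
# T5 expects the form "question: ... context: ...".
question = "question: What is transpiration? context: " + text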
inputs = tokenizer(question, return_tensors="pt")
outputs = model.generate(**inputs)

print("Answer:", tokenizer.decode(outputs[0], skip_special_tokens=True))

Answer: Plants also release water into the air
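
The notes at the top format question-answering inputs as "question: ... context: ...". A small helper can build that string and flatten a multi-line passage into one line first. This is only a sketch: `build_qa_prompt` is a hypothetical name, and collapsing whitespace is an assumption about what makes a tidy prompt, not a requirement of the library.

```python
def build_qa_prompt(question: str, context: str) -> str:
    """Format a question and a passage as 'question: ... context: ...'."""
    # Collapse newlines and repeated spaces so the prompt is a single line.
    clean_question = " ".join(question.split())
    clean_context = " ".join(context.split())
    return f"question: {clean_question} context: {clean_context}"


prompt = build_qa_prompt(
    "What is transpiration?",
    """Plants also release water into the air
through transpiration.""",
)
print(prompt)
```

The resulting string can be passed to the tokenizer in the same way as `question` in the cell above.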
