Import all dependencies

In [12]:
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch
import evaluate
# Run inside the project virtual environment (Windows: .\.venv\Scripts\activate)

Load the CNN/DailyMail dataset (data source)

In [13]:
dataset = load_dataset("cnn_dailymail", "3.0.0")
sample = dataset["test"][1]
article = sample["article"]
reference_summary = sample["highlights"]

print("ARTICLE:", article)
print("REFERENCE:", reference_summary)

ARTICLE: (CNN)Never mind cats having nine lives. A stray pooch in Washington State has used up at least three of her own after being hit by a car, apparently whacked on the head with a hammer in a misguided mercy killing and then buried in a field -- only to survive. That's according to Washington State University, where the dog -- a friendly white-and-black bully breed mix now named Theia -- has been receiving care at the Veterinary Teaching Hospital. Four days after her apparent death, the dog managed to stagger to a nearby farm, dirt-covered and emaciated, where she was found by a worker who took her to a vet for help. She was taken in by Moses Lake, Washington, resident Sara Mellado. "Considering everything that she's been through, she's incredibly gentle and loving," Mellado said, according to WSU News. "She's a true miracle dog and she deserves a good life." Theia is only one year old but the dog's brush with death did not leave her unscathed. She suffered a dislocated jaw, leg i

Generate Summary

In [21]:
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

prompt = f"""Here is a news article. 
Article: {article} 
Summary: """

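# Tokenize the prompt; truncation cuts from the end, so a long article can push the trailing "Summary:" cue past the 512-token limit
inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
# Pass input_ids and attention_mask together, then decode the first (only) generated sequence
outputs = model.generate(**inputs, max_new_tokens=200, eos_token_id=tokenizer.eos_token_id)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)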

print("Pattern-based Summary:\n", summary, "\n")

Pattern-based Summary:
 Theia's body in a field, has been rescued and is now a healthy dog. 
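
The generation cell tokenizes with truncation=True and max_length=512, and truncation removes tokens from the end of the input by default. When the article is longer than the limit, the closing "Summary:" cue may therefore never reach the model. A minimal sketch of one way around this is below: it trims only the article so that the fixed prompt text still fits. The helper name build_prompt and its parameters are hypothetical, not part of the notebook or of transformers; it assumes a tokenizer exposing encode(..., add_special_tokens=False) and decode(..., skip_special_tokens=True), as Hugging Face tokenizers do. Because the final prompt is tokenized again later, the token count may differ slightly from the budget.

```python
def build_prompt(article, tokenizer, max_length=512,
                 prefix="Here is a news article.\nArticle: ",
                 suffix="\nSummary:"):
    """Trim the article (not the prompt frame) to fit a token budget.

    Hypothetical helper: name, defaults, and budgeting rule are assumptions.
    """
    # Tokens consumed by the fixed parts of the prompt
    prefix_ids = tokenizer.encode(prefix, add_special_tokens=False)
    suffix_ids = tokenizer.encode(suffix, add_special_tokens=False)

    # Reserve one extra position for the end-of-sequence token
    budget = max_length - len(prefix_ids) - len(suffix_ids) - 1

    # Keep only as many article tokens as the remaining budget allows
    article_ids = tokenizer.encode(article, add_special_tokens=False)
    article_ids = article_ids[:max(budget, 0)]
    trimmed_article = tokenizer.decode(article_ids, skip_special_tokens=True)

    return prefix + trimmed_article + suffix
```

With the notebook's objects this would be called as prompt = build_prompt(article, tokenizer) before the tokenizer(...) call. The resulting summary and ROUGE scores would likely differ from the ones printed here, since the model sees a different input.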



Evaluate (ROUGE)

In [22]:
rouge = evaluate.load("rouge")
results = rouge.compute(predictions=[summary], references=[reference_summary])
print("ROUGE Score - Pattern-based:", results)

ROUGE Score - Pattern-based: {'rouge1': np.float64(0.3103448275862069), 'rouge2': np.float64(0.07142857142857142), 'rougeL': np.float64(0.24137931034482765), 'rougeLsum': np.float64(0.2068965517241379)}
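
To make the rouge1 number easier to interpret, here is a simplified sketch of unigram-overlap F1, the idea behind ROUGE-1. It is an illustration only: the function name rouge1_f1 and the regex tokenization are assumptions, and it is not the implementation used by evaluate, so its values may not match the scores above exactly.

```python
import re
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1: F1 over overlapping unigrams (illustrative only)."""
    # Lowercase and keep alphanumeric runs as tokens (an assumed tokenization)
    def tokenize(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    pred_counts = Counter(tokenize(prediction))
    ref_counts = Counter(tokenize(reference))

    # Clipped overlap: each token counts at most as often as it appears in both
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(pred_counts.values())  # share of predicted tokens that match
    recall = overlap / sum(ref_counts.values())      # share of reference tokens recovered
    return 2 * precision * recall / (precision + recall)


# Toy example: all 4 predicted tokens match, 4 of 6 reference tokens are covered
print(rouge1_f1("the dog was rescued", "the dog survived and was rescued"))
```

In the toy example precision is 1.0 and recall is 4/6, giving an F1 of about 0.8. Read the same way, a rouge1 of roughly 0.31 means the generated summary shares only a modest fraction of its words with the reference highlights.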
