## ProjF5 - Final Model

Use this document as a template to provide the evaluation of your final model. You are welcome to go in as much depth as needed.

Make sure you keep the sections specified in this template, but you are welcome to add more cells with your code or explanation as needed.


In [1]:
import numpy as np
import matplotlib.pyplot as plt

### 1. Load and Prepare Data

This should illustrate your code for loading the dataset and the split into training, validation and testing. You can add steps like pre-processing if needed.


The dataset is already stored, so it only needs to be loaded with the `load_dataset` function.
We load the CNN/DailyMail dataset (version 3.0.0) and keep the first 100 examples of the train split for fine-tuning.


In [4]:
# Load dataset for fine-tuning (e.g., CNN/DailyMail dataset)
dataset = load_dataset("cnn_dailymail", "3.0.0")
small_dataset = dataset["train"].select(range(100))  # Select the first 100 examples

Import libraries.


In [3]:
from transformers import T5ForConditionalGeneration, T5Tokenizer, Trainer, TrainingArguments
from datasets import load_dataset


In [None]:
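# Tokenize dataset for training.
# Note: `tokenizer` is created in the "Load pre-trained T5 model and tokenizer"
# cell (In [6]) and must exist before this cell runs.
def tokenize_function(example):
    # With batched=True, `example` holds lists of articles and highlights
    source_text = example["article"]
    target_text = example["highlights"]
    # Articles are truncated/padded to 1024 tokens, summaries to 150 tokens
    source_tokenized = tokenizer(source_text, truncation=True, padding="max_length", max_length=1024, return_tensors="pt")
    target_tokenized = tokenizer(target_text, truncation=True, padding="max_length", max_length=150, return_tensors="pt")
    return {
        "input_ids": source_tokenized.input_ids,
        "attention_mask": source_tokenized.attention_mask,
        "labels": target_tokenized.input_ids,
    }

# Apply the tokenizer to the 100-example training subset
tokenized_datasets = small_dataset.map(tokenize_function, batched=True)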
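
Because the targets are padded to `max_length=150`, the `labels` returned by `tokenize_function` contain pad token ids. Hugging Face seq2seq models skip label positions set to -100 when computing the loss, so a common practice is to replace pad ids with -100 before training. The sketch below is not part of the original notebook: `mask_pad_labels` is a hypothetical helper, the token ids in the example are made up, and the pad id of 0 is the T5 tokenizer's pad token id. Whether this would change the results reported later was not tested.

```python
# Hypothetical helper (not in the original notebook): replace padding
# positions in the label ids with -100 so the loss skips them.
IGNORE_INDEX = -100  # label value ignored by the cross-entropy loss


def mask_pad_labels(label_ids, pad_token_id=0):
    """Return a copy of label_ids with pad tokens replaced by IGNORE_INDEX."""
    return [
        [IGNORE_INDEX if tok == pad_token_id else tok for tok in row]
        for row in label_ids
    ]


# Two made-up, already padded label sequences (0 = pad, 1 = end of sequence)
labels = [[71, 1499, 1, 0, 0], [8, 1, 0, 0, 0]]
masked = mask_pad_labels(labels)
print(masked)  # [[71, 1499, 1, -100, -100], [8, 1, -100, -100, -100]]
```

In the notebook, such a helper could be applied to `target_tokenized.input_ids` (converted to lists) before returning `labels`.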

### 2. Prepare your Final Model

Here you can have your code to train your model (e.g., if you are building it from scratch). These steps may require you to use other packages or Python files. You can just call them here. You don't have to include them in your submission. Remember that we will be looking at the saved outputs in the notebook and we will not run the entire notebook.


In [6]:
# Load pre-trained T5 model and tokenizer
model_name = "t5-small"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

In [None]:
# Define training arguments
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    logging_dir="./logs",
    logging_steps=1000,
)

# Define trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets,
)

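# DataLoaderConfiguration is provided by accelerate.utils.
# Note: this object is not passed to the Trainer defined above.
from accelerate.utils import DataLoaderConfiguration
dataloader_config = DataLoaderConfiguration(dispatch_batches=None, split_batches=False, even_batches=True, use_seedable_sampler=True)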


In [None]:
# Fine-tune the model
trainer.train()

100%|██████████| 75/75 [16:55<00:00, 13.54s/it]

{'train_runtime': 1015.5293, 'train_samples_per_second': 0.295, 'train_steps_per_second': 0.074, 'train_loss': 4.186572265625, 'epoch': 3.0}





TrainOutput(global_step=75, training_loss=4.186572265625, metrics={'train_runtime': 1015.5293, 'train_samples_per_second': 0.295, 'train_steps_per_second': 0.074, 'train_loss': 4.186572265625, 'epoch': 3.0})
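
The numbers in the training log can be checked with a little arithmetic. The sketch below assumes a single device and no gradient accumulation (neither is set in `TrainingArguments` above); the variable names are illustrative only. Since the run has 75 steps in total and `logging_steps=1000`, it is also consistent that no intermediate loss values appear in the output.

```python
import math

num_examples = 100         # size of small_dataset
batch_size = 4             # per_device_train_batch_size
num_epochs = 3             # num_train_epochs
train_runtime = 1015.5293  # seconds, taken from the logged metrics

# Assumption: one device, no gradient accumulation
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs
samples_per_second = num_examples * num_epochs / train_runtime
steps_per_second = total_steps / train_runtime

print(total_steps)                   # 75
print(round(samples_per_second, 3))  # 0.295
print(round(steps_per_second, 3))    # 0.074
```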

In [None]:
# Save the fine-tuned model
model.save_pretrained("./fine_tuned_t5_small")

### 3. Model Performance

Make sure to include the following:

- Performance on the training set
- Performance on the test set
- Provide some screenshots of your output (e.g., pictures, text output, or a histogram of predicted values in the case of tabular data). Any visualization of the predictions is welcome.


Example of Abstractive Summarization using the Model


Performance on a Single Example


In [32]:
# Example of generating summaries using the fine-tuned model
input_text = """Artificial intelligence (AI) is a field of computer science that aims to create systems capable of performing tasks that typically require human intelligence.
The concept of AI dates back to ancient times, with early ideas emerging in Greek mythology and ancient Greek philosophy. However, the modern era of AI began in the mid-20th
century with the development of computer technology and the advent of digital computing. In 1956, the term "artificial intelligence" was coined at the Dartmouth Conference,
 where researchers gathered to discuss the potential of creating machines that could mimic human cognitive abilities. Since then, AI has evolved rapidly, with significant
 advancements in areas such as machine learning, natural language processing, computer vision, and robotics. AI technologies have been applied across various industries,
 including healthcare, finance, transportation, and entertainment, revolutionizing the way we live and work. From virtual assistants like Siri and Alexa to self-driving cars
   and advanced medical diagnostic systems, AI has become an integral part of our daily lives. However, AI also raises ethical and societal concerns, including issues related
     to privacy, bias, job displacement, and the potential for misuse of AI-powered systems. Despite these challenges, the pursuit of artificial intelligence continues to drive
       innovation and shape the future of technology and society.
"""

print(len(input_text))
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
generated_summary_ids = model.generate(input_ids, max_length=150, min_length=40, length_penalty=2.0, num_beams=4, early_stopping=True)
generated_summary = tokenizer.decode(generated_summary_ids[0], skip_special_tokens=True)
print("Generated Summary:")
print(generated_summary)

1439
Generated Summary:
(AI) is a field of computer science that aims to create systems capable of performing tasks that typically require human intelligence. AI is a field of computer science that aims to create systems capable of performing tasks that typically require human intelligence. AI technologies have been applied across various industries, including healthcare, finance, transportation, and entertainment.


In [None]:
from rouge_score import rouge_scorer
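# Note: the source text itself is used as the reference here, so the scores
# measure overlap with the input rather than with a human-written summary.
reference_summary = input_text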

# Initialize ROUGE scorer
scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)

# Calculate ROUGE scores
scores = scorer.score(reference_summary, generated_summary)

# Print ROUGE scores
print("ROUGE-1 F1 Score:", scores['rouge1'].fmeasure)
print("ROUGE-2 F1 Score:", scores['rouge2'].fmeasure)
print("ROUGE-L F1 Score:", scores['rougeL'].fmeasure)

ROUGE-1 F1 Score: 0.4402985074626865
ROUGE-2 F1 Score: 0.4210526315789473
ROUGE-L F1 Score: 0.4328358208955224
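
To make these numbers easier to interpret, here is a minimal sketch of how a ROUGE-N score is formed from n-gram overlap. It is a simplified illustration, not the `rouge_score` implementation: it uses lowercased whitespace tokens and no stemming, so its values will not match the output above, and the function names and example sentences are made up. It shows that when the reference is the source text and the summary copies from it, precision is close to 1 and the F1 score is mostly determined by recall, i.e. by how much of the source the summary covers.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Simplified ROUGE-N: returns (precision, recall, f1)."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # clipped n-gram matches
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    if overlap == 0:
        return 0.0, 0.0, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Made-up example: the "summary" is copied word for word from the "article"
article = "the cat sat on the mat near the door"
summary = "the cat sat on the mat"
p, r, f1 = rouge_n(article, summary, n=1)
print(round(p, 3), round(r, 3), round(f1, 3))  # 1.0 0.667 0.8
```

Scoring against the `highlights` field of the dataset instead would measure agreement with a human-written summary.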


Performance on Subset of Test Data


In [155]:
# Initialize ROUGE scorer
scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)

# Initialize variables to store cumulative scores
rouge1 = 0.0
rouge2 = 0.0
rougeL = 0.0
cnt = 0
test_subset = dataset["test"].select(range(500))

for input_text in test_subset["article"]:
    # Keep only articles shorter than 700 characters
    if len(input_text) < 700:
        input_ids = tokenizer(input_text, return_tensors="pt").input_ids
        generated_summary_ids = model.generate(input_ids, max_length=150, min_length=10, length_penalty=2.0, num_beams=4, early_stopping=True)
        generated_summary = tokenizer.decode(generated_summary_ids[0], skip_special_tokens=True)

        scores = scorer.score(input_text, generated_summary)

        cnt += 1
        # Update cumulative scores
        rouge1 += scores['rouge1'].fmeasure
        rouge2 += scores['rouge2'].fmeasure
        rougeL += scores['rougeL'].fmeasure


# Print average ROUGE scores
print("ROUGE-1 F1 Score:", rouge1/cnt)
print("ROUGE-2 F1 Score:", rouge2/cnt)
print("ROUGE-L F1 Score:", rougeL/cnt)

ROUGE-1 F1 Score: 0.4825110004435335
ROUGE-2 F1 Score: 0.41578939509180834
ROUGE-L F1 Score: 0.4040728637546249


Performance on Subset of Train Data


In [165]:
# Initialize ROUGE scorer
scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)

# Initialize variables to store cumulative scores
rouge1 = 0.0
rouge2 = 0.0
rougeL = 0.0
cnt = 0
train_subset = dataset["train"].select(range(1500))

for input_text in train_subset["article"]:
    if len(input_text) < 700:
        input_ids = tokenizer(input_text, return_tensors="pt").input_ids
        generated_summary_ids = model.generate(input_ids, max_length=150, min_length=10, length_penalty=2.0, num_beams=4, early_stopping=True)
        generated_summary = tokenizer.decode(generated_summary_ids[0], skip_special_tokens=True)

        scores = scorer.score(input_text, generated_summary)
        cnt += 1
        # Update cumulative scores
        rouge1 += scores['rouge1'].fmeasure
        rouge2 += scores['rouge2'].fmeasure
        rougeL += scores['rougeL'].fmeasure


# Print average ROUGE scores
print("ROUGE-1 F1 Score:", rouge1/cnt)
print("ROUGE-2 F1 Score:", rouge2/cnt)
print("ROUGE-L F1 Score:", rougeL/cnt)

ROUGE-1 F1 Score: 0.4880952380952381
ROUGE-2 F1 Score: 0.4758671047827674
ROUGE-L F1 Score: 0.43452380952380953
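
Both evaluation loops divide by `cnt`, which would be zero if no article in the subset were shorter than 700 characters, and the number of articles that passed the filter is not reported. As a hedged sketch, the averaging could be factored into a small helper that guards against an empty selection and returns the count alongside the means. `average_f1` is a hypothetical name and the scores in the example are made-up values, not results from the model.

```python
def average_f1(per_example_scores):
    """Average a list of {metric_name: f1} dicts.

    Returns (count, averages); averages is empty when there is nothing to average.
    """
    count = len(per_example_scores)
    if count == 0:
        return 0, {}
    metrics = per_example_scores[0].keys()
    averages = {m: sum(s[m] for s in per_example_scores) / count for m in metrics}
    return count, averages


# Made-up per-example scores, for illustration only
example_scores = [
    {"rouge1": 0.25, "rouge2": 0.5},
    {"rouge1": 0.75, "rouge2": 0.0},
]
count, averages = average_f1(example_scores)
print(count, averages)  # 2 {'rouge1': 0.5, 'rouge2': 0.25}
print(average_f1([]))   # (0, {})
```

Reporting `count` next to the averages makes it clear how many of the 500 test or 1500 train articles the scores are based on.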
