# Model Comparison for Text Summarization

This notebook compares an extractive and an abstractive summarization model on the same input text. Each generated summary is scored against a reference summary with ROUGE and BLEU, and the scores are collected in a single table.

In [1]:
# Import necessary libraries
import pandas as pd
import numpy as np
from src.utils.evaluation_metrics import calculate_rouge, calculate_bleu
from src.models.extractive_model import ExtractiveModel
from src.models.abstractive_model import AbstractiveModel
from src.utils.data_preprocessing import preprocess_text

# Load the cleaned data
with open('../data/processed/cleaned_data.txt', 'r', encoding='utf-8') as file:
    cleaned_data = file.read()

# Preprocess the text
processed_text = preprocess_text(cleaned_data)

# Initialize models
extractive_model = ExtractiveModel()
abstractive_model = AbstractiveModel()

# Generate summaries
extractive_summary = extractive_model.summarize(processed_text)
abstractive_summary = abstractive_model.summarize(processed_text)

# Reference summary to score against.
# NOTE: the string below is a placeholder; replace it with the real reference text before evaluating.
reference_summary = "[Insert reference summary here]"

# Evaluate summaries
rouge_extractive = calculate_rouge(reference_summary, extractive_summary)
rouge_abstractive = calculate_rouge(reference_summary, abstractive_summary)
bleu_extractive = calculate_bleu(reference_summary, extractive_summary)
bleu_abstractive = calculate_bleu(reference_summary, abstractive_summary)

# Compile results
results = pd.DataFrame({
    'Model': ['Extractive', 'Abstractive'],
    'ROUGE Score': [rouge_extractive, rouge_abstractive],
    'BLEU Score': [bleu_extractive, bleu_abstractive]
})

# Display results
results
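
The helpers `calculate_rouge` and `calculate_bleu` come from `src.utils.evaluation_metrics`, and their implementations are not shown in this notebook. As a rough illustration of what the two scores measure, here is a minimal sketch using simplified unigram variants: ROUGE-1 F1 and a unigram-only BLEU with a brevity penalty. The function names `rouge1_f1` and `bleu1`, the whitespace tokenization, and the example sentences are assumptions made for illustration, not the project's actual code.

```python
import math
from collections import Counter


def _tokens(text):
    # Assumption: naive tokenization (lowercase, split on whitespace).
    return text.lower().split()


def rouge1_f1(reference, candidate):
    """ROUGE-1 F1: harmonic mean of unigram recall and precision."""
    ref, cand = Counter(_tokens(reference)), Counter(_tokens(candidate))
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    recall = overlap / sum(ref.values())      # share of the reference covered
    precision = overlap / sum(cand.values())  # share of the candidate that matches
    return 2 * precision * recall / (precision + recall)


def bleu1(reference, candidate):
    """Unigram-only BLEU: clipped precision times the brevity penalty."""
    ref_tokens, cand_tokens = _tokens(reference), _tokens(candidate)
    if not cand_tokens:
        return 0.0
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    precision = overlap / len(cand_tokens)
    # Candidates shorter than the reference are penalized.
    if len(cand_tokens) >= len(ref_tokens):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref_tokens) / len(cand_tokens))
    return brevity_penalty * precision


# Hypothetical example sentences, not taken from the project's data.
example_reference = "the cat sat on the mat"
example_candidate = "the cat sat"
print(round(rouge1_f1(example_reference, example_candidate), 3))  # 0.667
print(round(bleu1(example_reference, example_candidate), 3))      # 0.368
```

In this toy example every candidate word appears in the reference, so precision is perfect, yet both scores drop: ROUGE-1 because recall is only one half, and BLEU because of the brevity penalty. Full implementations typically also use higher-order n-grams, stemming, and smoothing, so their numbers will differ from this sketch.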

## Conclusion

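In this notebook, we compared an extractive and an abstractive summarization model by scoring their summaries against a reference summary with ROUGE and BLEU. The scores are only meaningful once the placeholder reference summary is replaced with a real one. With that in place, the results table shows which model scores higher on each metric, which helps us decide which one better fits our use case.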