# Loading data into a DataFrame

In [40]:
import pandas as pd
import random
from bert_score import score

In [2]:
df = pd.read_csv("CNN-DailyMail.csv")
df.head()

Unnamed: 0,article,highlights,id
0,(CNN)The Palestinian Authority officially beca...,Membership gives the ICC jurisdiction over all...,f001ec5c4704938247d27a44948eebb37ae98d01
1,(CNN)Never mind cats having nine lives. A stra...,"Theia, a bully breed mix, was apparently hit b...",230c522854991d053fe98a718b1defa077a8efef
2,"(CNN)If you've been following the news lately,...",Mohammad Javad Zarif has spent more time with ...,4495ba8f3a340d97a9df1476f8a35502bcce1f69
3,(CNN)Five Americans who were monitored for thr...,17 Americans were exposed to the Ebola virus w...,a38e72fed88684ec8d60dd5856282e999dc8c0ca
4,(CNN)A Duke student has admitted to hanging a ...,Student is no longer on Duke University campus...,c27cf1b136cc270023de959e7ab24638021bc43f


In [3]:
df.drop('id', axis=1, inplace=True)
df.head()

Unnamed: 0,article,highlights
0,(CNN)The Palestinian Authority officially beca...,Membership gives the ICC jurisdiction over all...
1,(CNN)Never mind cats having nine lives. A stra...,"Theia, a bully breed mix, was apparently hit b..."
2,"(CNN)If you've been following the news lately,...",Mohammad Javad Zarif has spent more time with ...
3,(CNN)Five Americans who were monitored for thr...,17 Americans were exposed to the Ebola virus w...
4,(CNN)A Duke student has admitted to hanging a ...,Student is no longer on Duke University campus...


In [20]:
random_samples = df.sample(n=100, random_state=42)

# PEGASUS

In [4]:
from transformers import PegasusForConditionalGeneration, PegasusTokenizer
pegasus_tokenizer = PegasusTokenizer.from_pretrained("google/pegasus-large")
pegasus_model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-large")

Some weights of PegasusForConditionalGeneration were not initialized from the model checkpoint at google/pegasus-large and are newly initialized: ['model.decoder.embed_positions.weight', 'model.encoder.embed_positions.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [21]:
def generate_summary_pegasus(article, max_length=100):
    inputs = pegasus_tokenizer.encode(article, return_tensors="pt", truncation=True)
    summary_ids = pegasus_model.generate(inputs, max_length=max_length, num_beams=4, early_stopping=True)
    return pegasus_tokenizer.decode(summary_ids[0], skip_special_tokens=True)

In [22]:
pegasus_precision_scores = []
pegasus_recall_scores = []
pegasus_f1_scores = []

# Iterate over the random samples and generate summaries
for index, row in random_samples.iterrows():
    article = row['article']
    reference_summary = row['highlights']
    
    # Generate summary using PEGASUS with beam search (num_beams=4)
    generated_summary = generate_summary_pegasus(article)
    
    # Evaluate the summary using BERTScore
    P, R, F1 = score([generated_summary], [reference_summary], lang='en', verbose=False)
    
    # Store the individual scores
    pegasus_precision_scores.append(P.item())
    pegasus_recall_scores.append(R.item())
    pegasus_f1_scores.append(F1.item())

# Print the BERTScore values
print("Precision Scores:", pegasus_precision_scores)
print("Recall Scores:", pegasus_recall_scores)
print("F1 Scores:", pegasus_f1_scores)


Some weights of RobertaModel were not initialized from the model checkpoint at roberta-large and are newly initialized: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Precision Scores: [0.8368757963180542, 0.8650716543197632, 0.8415181040763855, 0.8718240261077881, 0.8398550748825073, 0.8469235897064209, 0.8339549899101257, 0.859872043132782, 0.8815048933029175, 0.8741775751113892, 0.8328050374984741, 0.8724263906478882, 0.8138409852981567, 0.8291168212890625, 0.8713105320930481, 0.8939358592033386, 0.8477096557617188, 0.8699058890342712, 0.8944743871688843, 0.8369841575622559, 0.8721429109573364, 0.8248619437217712, 0.8221483826637268, 0.8826601505279541, 0.8312482833862305, 0.836970329284668, 0.8731266856193542, 0.8735885620117188, 0.8594568371772766, 0.8159869313240051, 0.854633629322052, 0.8599585294723511, 0.9160825610160828, 0.8665529489517212, 0.8383817076683044, 0.8430576920509338, 0.8546010851860046, 0.8493355512619019, 0.8530657887458801, 0.860209584236145, 0.8462291955947876, 0.8914095163345337, 0.8637242913246155, 0.8204149603843689, 0.9162045121192932, 0.88985276222229, 0.8247300982475281, 0.8776664137840271, 0.7360701560974121, 0.84976
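The precision, recall, and F1 values above come from BERTScore, which greedily matches each token in one text to its most similar token in the other using cosine similarity of contextual embeddings. The following is a minimal sketch of that matching step only, written in plain Python under stated assumptions: the 2-d vectors are hypothetical stand-ins for real embeddings, the function names are illustrative, and the optional idf weighting and baseline rescaling offered by the `bert_score` library are omitted.

```python
import math
from typing import Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_match_scores(
    cand_vecs: Sequence[Sequence[float]],
    ref_vecs: Sequence[Sequence[float]],
) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) from greedy token matching."""
    # Precision: match every candidate token to its most similar reference token.
    precision = sum(
        max(cosine(c, r) for r in ref_vecs) for c in cand_vecs
    ) / len(cand_vecs)
    # Recall: match every reference token to its most similar candidate token.
    recall = sum(
        max(cosine(r, c) for c in cand_vecs) for r in ref_vecs
    ) / len(ref_vecs)
    # F1: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy embeddings: two candidate tokens, one reference token.
toy_candidate = [[1.0, 0.0], [0.0, 1.0]]
toy_reference = [[1.0, 0.0]]

p, r, f1 = greedy_match_scores(toy_candidate, toy_reference)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In the toy case only one of the two candidate tokens has a counterpart in the reference, so precision drops while recall stays at its maximum. The same asymmetry explains why a summary that is longer than its reference can score higher on recall than on precision.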

In [23]:
pegasus_results_df = pd.DataFrame({
    'Precision': pegasus_precision_scores,
    'Recall': pegasus_recall_scores,
    'F1 Score': pegasus_f1_scores
})

print(pegasus_results_df)

    Precision    Recall  F1 Score
0    0.836876  0.853909  0.845306
1    0.865072  0.887751  0.876264
2    0.841518  0.830285  0.835864
3    0.871824  0.876898  0.874354
4    0.839855  0.867179  0.853298
..        ...       ...       ...
95   0.836597  0.859522  0.847904
96   0.830124  0.830935  0.830529
97   0.832973  0.840902  0.836919
98   0.826282  0.809225  0.817665
99   0.820580  0.819348  0.819964

[100 rows x 3 columns]


In [26]:
pegasus_results_df.to_excel('pegasus_results.xlsx', index=False)

# BART

In [35]:
from transformers import BartForConditionalGeneration, BartTokenizer
bart_tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
bart_model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development


In [36]:
def generate_summary_bart(article, max_length=100):
    inputs = bart_tokenizer.encode(article, return_tensors="pt", truncation=True)
    summary_ids = bart_model.generate(inputs, max_length=max_length, num_beams=4, early_stopping=True)
    return bart_tokenizer.decode(summary_ids[0], skip_special_tokens=True)

In [37]:
bart_precision_scores = []
bart_recall_scores = []
bart_f1_scores = []

for index, row in random_samples.iterrows():
    article = row['article']
    reference_summary = row['highlights']
    
    # Generate summary using BART with beam search (num_beams=4)
    generated_summary = generate_summary_bart(article)
    
    # Evaluate the summary using BERTScore
    P, R, F1 = score([generated_summary], [reference_summary], lang='en', verbose=False)
    
    # Store the individual scores
    bart_precision_scores.append(P.item())
    bart_recall_scores.append(R.item())
    bart_f1_scores.append(F1.item())

# Print the BERTScore values
print("Precision Scores:", bart_precision_scores)
print("Recall Scores:", bart_recall_scores)
print("F1 Scores:", bart_f1_scores)

Precision Scores: [0.815876841545105, 0.8645640015602112, 0.8504435420036316, 0.8732391595840454, 0.8515278697013855, 0.805774450302124, 0.8262912034988403, 0.8710603713989258, 0.8939869403839111, 0.8597856760025024, 0.8560997247695923, 0.8866323828697205, 0.8619765639305115, 0.8276309967041016, 0.8726577162742615, 0.8589812517166138, 0.8709471821784973, 0.8418046236038208, 0.8822752237319946, 0.8729409575462341, 0.8877543210983276, 0.8220746517181396, 0.85614013671875, 0.882369875907898, 0.8702179789543152, 0.8683891296386719, 0.8753172159194946, 0.8503544926643372, 0.8818186521530151, 0.8133077621459961, 0.8616291284561157, 0.8655785322189331, 0.8349761366844177, 0.8464046716690063, 0.851706862449646, 0.8813564777374268, 0.8595131635665894, 0.8559266328811646, 0.8382440805435181, 0.8671979308128357, 0.8347077369689941, 0.8841463923454285, 0.8680112361907959, 0.8127490282058716, 0.9069778323173523, 0.8268044590950012, 0.8532711863517761, 0.8749016523361206, 0.8263274431228638, 0.86191

In [38]:
bart_results_df = pd.DataFrame({
    'Precision': bart_precision_scores,
    'Recall': bart_recall_scores,
    'F1 Score': bart_f1_scores
})

print(bart_results_df)

    Precision    Recall  F1 Score
0    0.815877  0.843514  0.829465
1    0.864564  0.889095  0.876658
2    0.850444  0.873764  0.861946
3    0.873239  0.887248  0.880188
4    0.851528  0.884364  0.867635
..        ...       ...       ...
95   0.849253  0.885450  0.866974
96   0.874479  0.880775  0.877616
97   0.848929  0.880786  0.864564
98   0.836121  0.849369  0.842693
99   0.823633  0.866714  0.844625

[100 rows x 3 columns]


In [39]:
bart_results_df.to_excel('bart_results.xlsx', index=False)
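Each model's evaluation cell runs the same loop with a different summarizer, and scores one pair at a time. One possible refactoring, sketched here under assumptions, is a shared helper that collects all summaries first and scores them in a single batch. The names `evaluate_summarizer`, `first_sentence`, and `toy_overlap_score` are hypothetical, and the toy word-overlap scorer only stands in for BERTScore so the sketch runs without downloading a model.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Scores = Tuple[List[float], List[float], List[float]]


def evaluate_summarizer(
    pairs: Iterable[Tuple[str, str]],
    summarize: Callable[[str], str],
    score_fn: Callable[[List[str], List[str]], Scores],
) -> Dict[str, List[float]]:
    """Summarize every article, then score all summaries in one call."""
    candidates: List[str] = []
    references: List[str] = []
    for article, reference in pairs:
        candidates.append(summarize(article))
        references.append(reference)
    precision, recall, f1 = score_fn(candidates, references)
    return {
        "Precision": list(precision),
        "Recall": list(recall),
        "F1 Score": list(f1),
    }


def first_sentence(article: str) -> str:
    """Hypothetical stand-in summarizer: keep the text before the first period."""
    return article.split(".")[0]


def toy_overlap_score(cands: List[str], refs: List[str]) -> Scores:
    """Hypothetical stand-in scorer based on word overlap, not BERTScore."""
    precision, recall, f1 = [], [], []
    for cand, ref in zip(cands, refs):
        cand_words = set(cand.lower().split())
        ref_words = set(ref.lower().split())
        common = len(cand_words & ref_words)
        p = common / len(cand_words) if cand_words else 0.0
        r = common / len(ref_words) if ref_words else 0.0
        precision.append(p)
        recall.append(r)
        f1.append(2 * p * r / (p + r) if p + r else 0.0)
    return precision, recall, f1


# Hypothetical toy (article, reference summary) pairs.
toy_pairs = [
    ("the cat sat. it purred.", "the cat sat"),
    ("dogs bark loudly. cats meow.", "dogs bark"),
]
results = evaluate_summarizer(toy_pairs, first_sentence, toy_overlap_score)
print(results)
```

In the notebook, `summarize` could be `generate_summary_bart` and `score_fn` a thin wrapper such as `lambda c, r: tuple(t.tolist() for t in score(c, r, lang='en'))`, with the pairs taken from the `article` and `highlights` columns of `random_samples`. Scoring in one batch loads the scoring model once instead of once per row.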

# T5

In [15]:
from transformers import T5ForConditionalGeneration, T5Tokenizer
t5_tokenizer = T5Tokenizer.from_pretrained("t5-large")
t5_model = T5ForConditionalGeneration.from_pretrained("t5-large")

You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [31]:
def generate_summary_t5(article, max_length=100):
    inputs = t5_tokenizer.encode("summarize: " + article, return_tensors="pt", truncation=True)
    summary_ids = t5_model.generate(inputs, max_length=max_length, num_beams=4, early_stopping=True)
    return t5_tokenizer.decode(summary_ids[0], skip_special_tokens=True)

In [32]:
t5_precision_scores = []
t5_recall_scores = []
t5_f1_scores = []

for index, row in random_samples.iterrows():
    article = row['article']
    reference_summary = row['highlights']
    
    # Generate summary using T5 with beam search (num_beams=4)
    generated_summary = generate_summary_t5(article)
    
    # Evaluate the summary using BERTScore
    P, R, F1 = score([generated_summary], [reference_summary], lang='en', verbose=False)
    
    # Store the individual scores
    t5_precision_scores.append(P.item())
    t5_recall_scores.append(R.item())
    t5_f1_scores.append(F1.item())

# Print the BERTScore values
print("Precision Scores:", t5_precision_scores)
print("Recall Scores:", t5_recall_scores)
print("F1 Scores:", t5_f1_scores)

Precision Scores: [0.8548189997673035, 0.889518678188324, 0.8833594918251038, 0.8683274388313293, 0.8528497815132141, 0.8903371691703796, 0.8978610038757324, 0.88690185546875, 0.9376890063285828, 0.864109218120575, 0.8806644678115845, 0.8609111309051514, 0.8889483213424683, 0.8448965549468994, 0.9071764945983887, 0.9371885061264038, 0.8799736499786377, 0.8883851766586304, 0.8989431262016296, 0.909895122051239, 0.9283798933029175, 0.8629339933395386, 0.846335768699646, 0.8975170254707336, 0.8528536558151245, 0.8580727577209473, 0.8684893846511841, 0.85289466381073, 0.9054502248764038, 0.8300272226333618, 0.866995096206665, 0.8675549030303955, 0.87000572681427, 0.8636898398399353, 0.8691647052764893, 0.9374569058418274, 0.8456259369850159, 0.8955625295639038, 0.8634824752807617, 0.8650701642036438, 0.8635446429252625, 0.8620753288269043, 0.8805084824562073, 0.8264967799186707, 0.9453151226043701, 0.8924005031585693, 0.8808873891830444, 0.8816272020339966, 0.8569580316543579, 0.8446770906

In [33]:
t5_results_df = pd.DataFrame({
    'Precision': t5_precision_scores,
    'Recall': t5_recall_scores,
    'F1 Score': t5_f1_scores
})

print(t5_results_df)

    Precision    Recall  F1 Score
0    0.854819  0.852082  0.853449
1    0.889519  0.883311  0.886404
2    0.883359  0.905829  0.894453
3    0.868327  0.852334  0.860257
4    0.852850  0.862548  0.857672
..        ...       ...       ...
95   0.848396  0.855318  0.851843
96   0.863428  0.856961  0.860182
97   0.870264  0.855224  0.862678
98   0.854850  0.843374  0.849073
99   0.907932  0.884052  0.895833

[100 rows x 3 columns]


In [34]:
t5_results_df.to_excel('t5_results.xlsx', index=False)
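The per-row tables are hard to compare by eye, so a per-model average is a natural next step. The sketch below shows the idea in plain Python, assuming only the F1 values from the first five rows printed for each model above. It is an illustration of the aggregation and not a result for the full 100-row sample. The helper name `mean_by_model` is hypothetical.

```python
from statistics import fmean
from typing import Dict, List

# F1 values copied from the first five rows printed for each model.
# These are only the visible rows, not the full 100-row sample.
visible_f1: Dict[str, List[float]] = {
    "PEGASUS": [0.845306, 0.876264, 0.835864, 0.874354, 0.853298],
    "BART": [0.829465, 0.876658, 0.861946, 0.880188, 0.867635],
    "T5": [0.853449, 0.886404, 0.894453, 0.860257, 0.857672],
}


def mean_by_model(scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Average each model's list of scores."""
    return {model: fmean(values) for model, values in scores.items()}


mean_f1 = mean_by_model(visible_f1)

# Print models from highest to lowest mean F1 on these rows.
for model, value in sorted(mean_f1.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {value:.4f}")
```

With the notebook state available, the same comparison over all rows could use the existing DataFrames directly, for example `t5_results_df['F1 Score'].mean()`.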

# Sample Text and Reference Summary

In [48]:
article_index = 976
text = df.loc[article_index, 'article']
highlight = df.loc[article_index, 'highlights']
print("Article:")
print(text)
print("\n\nHighlight:")
print(highlight)

Article:
(CNN)Blinky and Pinky on the Champs Elysees? Inky and Clyde running down Broadway? Power pellets on the Embarcadero? Leave it to Google to make April Fools' Day into throwback fun by combining Google Maps with Pac-Man. The massive tech company is known for its impish April Fools' Day pranks, and Google Maps has been at the center of a few, including a Pokemon Challenge and a treasure map. This year the company was a day early to the party, rolling out the Pac-Man game Tuesday. It's easy to play: Simply pull up Google Maps on your desktop browser, click on the Pac-Man icon on the lower left, and your map suddenly becomes a Pac-Man course. Twitterers have been tickled by the possibilities, playing Pac-Man in Manhattan, on the University of Illinois quad, in central London and down crooked Lombard Street in San Francisco, among many locations: .


Highlight:
Google Maps has a temporary Pac-Man function . Google has long been fond of April Fools' Day pranks and games . Many people

# Model Summaries

## PEGASUS

In [49]:
pegasus_summary = generate_summary_pegasus(text)
print(pegasus_summary)

Leave it to Google to make April Fools' Day into throwback fun by combining Google Maps with Pac-Man.


## BART

In [50]:
bart_summary = generate_summary_bart(text)
print(bart_summary)

(CNN)Blinky and Pinky on the Champs Elysees? Inky and Clyde running down Broadway? Power pellets on the Embarcadero? Leave it to Google to make April Fools' Day into throwback fun by combining Google Maps with Pac-Man. The massive tech company is known for its impish April Fool's Day pranks, and Google Maps has been at the center of a few, including a Pokemon Challenge and a treasure map. This year


## T5

In [51]:
t5_summary = generate_summary_t5(text)
print(t5_summary)

google is known for its impish April Fools' day pranks. this year the company was a day early to the party, rolling out the Pac-man game.
