**Loading Dataset**

In [22]:
import pandas as pd

df = pd.read_csv('Daily_Mail.csv')
df.head()

| Unnamed: 0 | url | article | highlights |
| --- | --- | --- | --- |
| 0 | https://www.dailymail.co.uk/tvshowbiz/article-... | Beyoncé showcases her incredible figure in plu... | Beyoncé has shown off her flawless beauty in a... |
| 1 | https://www.dailymail.co.uk/tvshowbiz/article-... | Radio 1 listeners in shock as sex noises are p... | BBC Radio 1 listeners were left choking on the... |
| 2 | https://www.dailymail.co.uk/tvshowbiz/article-... | TOWIE's Dan Edgar, 33, and Ella Rae Wise, 23, ... | Dan Edgar and Ella Rae Wise put on a loved-up ... |
| 3 | https://www.dailymail.co.uk/tvshowbiz/article-... | Bradley Cooper recalls 'crazy' pitch meeting a... | Bradley Cooper discussed the 'crazy' experienc... |
| 4 | https://www.dailymail.co.uk/tvshowbiz/article-... | Margaret Qualley and Beanie Feldstein stun in ... | Margaret Qualley and Beanie Feldstein were dre... |


In [29]:
# Work with a smaller dataset: keep only the two text columns, drop missing rows, and sample 10 rows

df = df[['article', 'highlights']].dropna().sample(10, random_state=42).reset_index(drop=True)
df.head()

| Unnamed: 0 | article | highlights |
| --- | --- | --- |
| 0 | TOWIE's Dan Edgar, 33, and Ella Rae Wise, 23, ... | Bradley Cooper discussed the 'crazy' experienc... |
| 1 | The four-legged star who captured Emma Stone's... | The four-legged star who captured the heart of... |
| 2 | Radio 1 listeners in shock as sex noises are p... | Dan Edgar and Ella Rae Wise put on a loved-up ... |
| 3 | TOWIE's Dan Edgar, 33, and Ella Rae Wise, 23, ... | Bradley Cooper discussed the 'crazy' experienc... |
| 4 | How to dress like a grown up with Shane Watson... | You're never too old for boots, and that's qui... |


**Step 1 : Preprocessing**

In [30]:
def clean_text(text):
    # Trim leading/trailing whitespace and replace newlines with spaces.
    return text.strip().replace('\n', ' ')


df['clean_article'] = df['article'].apply(clean_text)
df['clean_highlights'] = df['highlights'].apply(clean_text)
df.head()

| Unnamed: 0 | article | highlights | clean_article | clean_highlights |
| --- | --- | --- | --- | --- |
| 0 | TOWIE's Dan Edgar, 33, and Ella Rae Wise, 23, ... | Bradley Cooper discussed the 'crazy' experienc... | TOWIE's Dan Edgar, 33, and Ella Rae Wise, 23, ... | Bradley Cooper discussed the 'crazy' experienc... |
| 1 | The four-legged star who captured Emma Stone's... | The four-legged star who captured the heart of... | The four-legged star who captured Emma Stone's... | The four-legged star who captured the heart of... |
| 2 | Radio 1 listeners in shock as sex noises are p... | Dan Edgar and Ella Rae Wise put on a loved-up ... | Radio 1 listeners in shock as sex noises are p... | Dan Edgar and Ella Rae Wise put on a loved-up ... |
| 3 | TOWIE's Dan Edgar, 33, and Ella Rae Wise, 23, ... | Bradley Cooper discussed the 'crazy' experienc... | TOWIE's Dan Edgar, 33, and Ella Rae Wise, 23, ... | Bradley Cooper discussed the 'crazy' experienc... |
| 4 | How to dress like a grown up with Shane Watson... | You're never too old for boots, and that's qui... | How to dress like a grown up with Shane Watson... | You're never too old for boots, and that's qui... |
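
`clean_text` only trims the ends of the string and swaps `\n` for a space, so tabs and runs of spaces survive. If the scraped articles contain that kind of whitespace, a slightly broader variant could collapse every whitespace run in one pass. This is a sketch of one possible refinement, not part of the original pipeline; `normalize_whitespace` is a hypothetical name.

```python
import re

def normalize_whitespace(text):
    # Collapse any run of whitespace (spaces, tabs, newlines) into one space,
    # then trim the ends. Hypothetical helper, not in the original notebook.
    return re.sub(r"\s+", " ", text).strip()

print(normalize_whitespace("  Line one\n\nline\ttwo  "))
```

It could be applied the same way as `clean_text`, for example `df['article'].apply(normalize_whitespace)`.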


**Step 3 : Extractive Summarization using spaCy**

In [31]:
from spacy.lang.en.stop_words import STOP_WORDS
from string import punctuation
import spacy

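# Load the small English pipeline; its parser provides the sentence boundaries.
nlp = spacy.load('en_core_web_sm')

def extractive_summary(text, max_sentences=2):
    # Lead-N baseline: keep the first `max_sentences` sentences of the article.
    doc = nlp(text)
    sentences = list(doc.sents)
    return ' '.join(sent.text for sent in sentences[:max_sentences])

df['extractive_summary'] = df['clean_article'].apply(extractive_summary)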
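
The function above returns the opening sentences unchanged, so it never ranks sentences by content. A common alternative is to score each sentence by how frequent its non-stop words are across the article and keep the top-scoring ones. The sketch below illustrates that idea with the standard library only, so it runs without a spaCy model; the regex sentence splitter, the tiny `STOPWORDS` set, and the name `frequency_summary` are all illustrative assumptions. In the notebook, `doc.sents` and spaCy's `STOP_WORDS` would be the natural substitutes.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); spaCy's STOP_WORDS is far larger.
STOPWORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "on",
             "is", "was", "it", "as", "at", "by", "for", "with"}

def content_words(text):
    # Lowercased word tokens with stop words removed.
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def frequency_summary(text, max_sentences=2):
    # Naive sentence split on ., ! or ? followed by whitespace (assumption).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    # Pick the top-scoring sentences, then restore their original order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)

print(frequency_summary("Cats chase mice. Dogs bark loudly. Cats and mice are rivals."))
```

Averaging the word frequencies rather than summing them is a design choice: a plain sum tends to reward the longest sentence regardless of content.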

**Step 4 : Abstractive Summarization using Transformers (BART)**

In [33]:
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

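def abstractive_summary(text):
    # Greedy decoding (do_sample=False); truncation=True guards against inputs longer than the model's limit.
    return summarizer(text, max_length=60, min_length=20, do_sample=False, truncation=True)[0]['summary_text']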

df['abstractive_summary'] = df['clean_article'].apply(abstractive_summary)

Device set to use cpu
Your max_length is set to 60, but your input_length is only 57. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=28)
Your max_length is set to 60, but your input_length is only 58. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=29)
Your max_length is set to 60, but your input_length is only 58. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=29)
Your max_length is set to 60, but your input_length is only 50. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', m
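
The warnings above appear because `max_length=60` exceeds the token count of these short inputs. One way to quiet them is to derive the length bounds from each input's token count. The helper below is a hypothetical sketch: `summary_length_bounds` and its defaults are assumptions, and the halving rule is borrowed from the suggestion in the warning text. The commented lines show how it might plug into the existing `summarizer`, assuming the pipeline's own tokenizer is used to count tokens.

```python
def summary_length_bounds(n_input_tokens, max_cap=60, min_floor=20):
    # Cap the summary at about half the input length, but never above max_cap.
    max_length = min(max_cap, max(2, n_input_tokens // 2))
    # Keep min_length strictly below max_length, even for very short inputs.
    min_length = min(min_floor, max_length // 2)
    return min_length, max_length

print(summary_length_bounds(57))
print(summary_length_bounds(500))

# Possible use with the pipeline defined earlier (not executed here):
# n_tokens = len(summarizer.tokenizer(text)["input_ids"])
# min_len, max_len = summary_length_bounds(n_tokens)
# summarizer(text, max_length=max_len, min_length=min_len, do_sample=False)
```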

**Step 5 : Evaluation**
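
The comparison in this step is a side-by-side printout, which is qualitative. For a rough number to go with it, one option is unigram overlap between a generated summary and the reference highlights, in the spirit of ROUGE-1. The sketch below is a simplified stand-in, not the official ROUGE implementation; `unigram_f1` and its tokenization are assumptions.

```python
import re
from collections import Counter

def unigram_f1(candidate, reference):
    # Lowercased word tokens; a simplification of real ROUGE tokenization.
    cand = Counter(re.findall(r"\w+", candidate.lower()))
    ref = Counter(re.findall(r"\w+", reference.lower()))
    overlap = sum((cand & ref).values())  # clipped count of shared unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(unigram_f1("the cat sat", "the cat slept"))
```

In the notebook it could be called per row, for example `unigram_f1(df['abstractive_summary'][i], df['clean_highlights'][i])`.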

In [34]:
for i in range(3):
    print(f"\n====== Example {i+1} ======")
    print("\nOriginal Article:\n", df['clean_article'][i])
    print("\nExtractive Summary:\n", df['extractive_summary'][i])
    print("\nAbstractive Summary:\n", df['abstractive_summary'][i])
    print("\nActual Highlights:\n", df['highlights'][i])



====== Example 1 ======

Original Article:
 TOWIE's Dan Edgar, 33, and Ella Rae Wise, 23, put on a loved-up display during trip as romance blossoms between the pair during trip to BaliBradley Cooper discussed the 'crazy' experience he had of meeting Beyonce and her husband Jay-Z while pitching the singer a movie role.

Extractive Summary:
 TOWIE's Dan Edgar, 33, and Ella Rae Wise, 23, put on a loved-up display during trip as romance blossoms between the pair during trip to BaliBradley Cooper discussed the 'crazy' experience he had of meeting Beyonce and her husband Jay-Z while pitching the singer a movie role.

Abstractive Summary:
 TOWIE's Dan Edgar, 33, and Ella Rae Wise, 23, put on a loved-up display during trip to Bali.

Actual Highlights:
 Bradley Cooper discussed the 'crazy' experience he had of meeting Beyonce and her husband Jay-Z while pitching the singer a movie role. 


====== Example 2 ======

Original Article:
 The four-legged star who captured Emma Stone's heart at the BAFTAs is revealed as Lilliput the Maltese terrier