## Machine Learning for Topic News Title Summarization 

- In the era of digital information, the volume of news content available to readers has grown exponentially, making it increasingly challenging for individuals to stay informed without becoming overwhelmed. The project's primary goal is to leverage machine learning (ML) techniques for the effective summarization of news articles, aiming to improve the efficiency, accuracy, and readability of these summaries, allowing readers to grasp the essence of news stories without dedicating extensive time to reading full articles.

- Stakeholders include individual readers, news organizations, educational institutions, and potentially government bodies that rely on swift and accurate information dissemination. Improved news summarization models can transform media consumption by providing accessible, succinct summaries of complex news stories, thereby enhancing public knowledge and engagement. More broadly, better news summarization techniques could pave the way for similar advances in summarizing other forms of text, such as academic literature, legal documents, and social media feeds.

- Potential models we will explore:
    - BERT summarization
    - Fine-tuned T5-small
    - GPT-3.5 Turbo

In [19]:
from openai import OpenAI
from bert_score import score
from tqdm import tqdm
import pandas as pd
import evaluate
import os

rouge = evaluate.load('rouge')
# Read the API key from the environment rather than hardcoding it (the variable name OPENAI_API_KEY is an assumption).
client = OpenAI(base_url="https://openai.vocareum.com/v1", api_key=os.environ["OPENAI_API_KEY"])

#### 1. Load the Test Set

In [14]:
test_df = pd.read_csv('data/test_set.csv', sep=',')
test_df['content'] = test_df['content'].apply(lambda x: x.replace('\n', ' '))

#### 2. Generate the title:

In [16]:
def generate_title_gpt3_5(content_list: list) -> list:
    """Ask GPT-3.5 Turbo for one title per article and return the titles in order."""
    title_list = []
    for content in tqdm(content_list):
        # One chat completion request per article.
        completion = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": "Create a title for the following content: " + content},
            ],
        )
        # Keep only the text of the first returned choice.
        title_list.append(completion.choices[0].message.content)
    return title_list

In [17]:
if not os.path.exists('data/test_set_summary_gpt3_5.csv'):
    print('Start Generating Title on GPT-3.5 turbo')
    content_list = test_df['content'].tolist()
    title_list = generate_title_gpt3_5(content_list)
    test_df['pred_title'] = title_list
    test_df = test_df[['data_id', 'title', 'pred_title']]
    test_df.to_csv('data/test_set_summary_gpt3_5.csv', index=False)

100%|██████████| 1217/1217 [23:16<00:00,  1.15s/it]


#### 3. Load the Predicted Result

In [30]:
if os.path.exists('data/test_set_summary_gpt3_5.csv'):
    print('Successfully load the generated titles from GPT-3.5')
    gpt_df = pd.read_csv('data/test_set_summary_gpt3_5.csv', sep=',')
else:
    print('Failed to load the generated titles from GPT-3.5')

Successfully load the generated titles from GPT-3.5


In [31]:
gpt_df['title'] = gpt_df['title'].apply(lambda x: x.replace('"', ''))

In [32]:
gpt_df['pred_title'] = gpt_df['pred_title'].apply(lambda x: x.replace('"', ''))
gpt_df['pred_title'] = gpt_df['pred_title'].apply(lambda x: x.replace('Title: ', ''))
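
The cleanup above removes every double quote and every occurrence of the exact string `Title: `. As a hedged alternative sketch, the same intent can be expressed in one function that strips quotes and removes only a leading `Title:` prefix, regardless of case. The name `clean_title` is hypothetical, and the behaviour differs slightly from the original cells because a `Title: ` in the middle of a string is left alone.

```python
import re


def clean_title(text: str) -> str:
    """Remove double quotes and a leading 'Title:' prefix (hypothetical helper)."""
    # Drop all double quotes, then surrounding whitespace.
    text = text.replace('"', '').strip()
    # Remove the prefix only at the start of the string, ignoring case.
    text = re.sub(r'^title:\s*', '', text, flags=re.IGNORECASE)
    return text.strip()


print(clean_title('"Title: Unplugged Fun"'))
print(clean_title('Movie Title: A Review'))
```

With pandas this could be applied as `gpt_df['pred_title'].apply(clean_title)`.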

Compare the original title with the GPT-generated title:

In [33]:
# See original title and GPT generated title
print(gpt_df.iloc[0, 1])
print(gpt_df.iloc[0, 2])

Best party games for adults: The super fun board games, cards and quizzes to entertain your friends with in 2019
Unplugged Fun: Best Board Games for Adults in the Digital Era


Calculate the ROUGE score

In [34]:
# Calculate Rouge Score
rouge.compute(predictions=gpt_df['pred_title'], references=gpt_df['title'])

{'rouge1': 0.3274146371663341,
 'rouge2': 0.12480174759073806,
 'rougeL': 0.27322813674959034,
 'rougeLsum': 0.27299407173559675}
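
To make the `rouge1` figure more concrete, here is a minimal sketch of a unigram-overlap F1 for a single title pair. It assumes lowercasing and whitespace tokenization, which is simpler than what the `evaluate` metric does internally, so it will not reproduce the numbers above exactly. The function name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between two strings (simplified ROUGE-1 sketch)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Toy pair: four of the five words overlap in each direction.
toy_score = rouge1_f1("best board games for adults", "best party games for adults")
print(round(toy_score, 4))
```

ROUGE-2 applies the same idea to bigrams, and ROUGE-L replaces the overlap count with the length of the longest common subsequence.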

Calculate BERTScore

In [39]:
# Calculate BERTScore
P, R, F1 = score(gpt_df['pred_title'].to_list(), gpt_df['title'].to_list(), lang='en', verbose=True)

Some weights of RobertaModel were not initialized from the model checkpoint at roberta-large and are newly initialized: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


calculating scores...
computing bert embedding.


computing greedy matching.


done in 52.83 seconds, 23.03 sentences/sec


In [38]:
print(f'Precision: {P.mean():.4f}')
print(f'Recall: {R.mean():.4f}')
print(f'F1: {F1.mean():.4f}')

Precision: 0.8777
Recall: 0.8820
F1: 0.8796
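
The precision, recall, and F1 reported here come from greedy matching between token embeddings, as the "computing greedy matching" log line indicates. The sketch below illustrates that matching step on tiny hand-made vectors. It is an assumption-laden toy: real BERTScore uses contextual embeddings from the model named in the log (roberta-large) and supports options such as IDF weighting that are omitted here. The names `cosine` and `greedy_match_scores` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_match_scores(cand_vecs, ref_vecs):
    """Greedy-matching precision, recall, and F1 over token vectors (toy sketch)."""
    sims = [[cosine(c, r) for r in ref_vecs] for c in cand_vecs]
    # Precision: each candidate token is matched to its most similar reference token.
    precision = sum(max(row) for row in sims) / len(cand_vecs)
    # Recall: each reference token is matched to its most similar candidate token.
    recall = sum(
        max(sims[i][j] for i in range(len(cand_vecs)))
        for j in range(len(ref_vecs))
    ) / len(ref_vecs)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy vectors: two candidate tokens, one reference token.
p, r, f = greedy_match_scores([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]])
print(p, r, round(f, 4))
```

In the toy case the reference token is fully covered by the candidate, so recall is high, while the extra unmatched candidate token lowers precision.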
