# Part 1 - Creating a baseline

In this notebook we create a simple yet important baseline so that we can later judge how much our deep learning model improves the summaries. We measure the baseline with the ROUGE metric.

In [1]:
import pandas as pd
df_test = pd.read_csv('data/test.csv')

In [2]:
df_test.head()

Unnamed: 0,text,summary
0,The coincidence of the set of all nilpotent ...,A relationship between 2-primal modules and mo...
1,The $k$ nearest neighbor ($k$NN) query is a ...,Eclipse: Practicability Beyond kNN and Skyline
2,For a real number $x$ and set of natural num...,Representing Ordinal Numbers with Arithmetical...
3,We classify all positive integers n and r su...,On the rationality problem for quadric bundles
4,Plasmonic nanoparticles influence the absorp...,Plasmonic nanoprobes for stimulated emission d...


In [3]:
from datasets import load_metric
metric = load_metric("rouge")

Downloading: 5.61kB [00:00, 1.52MB/s]                   


We're copying this function from https://github.com/huggingface/transformers/blob/v4.6.1/examples/pytorch/summarization/run_summarization.py to ensure we always use the same metric calculation.

In [4]:
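def calc_rouge_scores(candidates, references):
    """Return ROUGE F-measures (as percentages) for candidates scored against references."""
    result = metric.compute(predictions=candidates, references=references, use_stemmer=True)
    # Keep the mid value of each aggregated score and scale its F-measure to a percentage.
    result = {key: round(value.mid.fmeasure * 100, 1) for key, value in result.items()}
    return result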
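
To give an intuition for what the F-measure reported by `calc_rouge_scores` captures, here is a minimal sketch of a ROUGE-1-style score computed by hand. It is a simplification and not the implementation behind `load_metric("rouge")`: it assumes lowercased whitespace tokens and applies no stemming, and the name `rouge1_f1` is made up for this illustration.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F-measure: lowercased whitespace tokens, no stemming."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())  # share of candidate tokens that match
    recall = overlap / sum(ref_counts.values())      # share of reference tokens that are covered
    return 2 * precision * recall / (precision + recall)


# Made-up example: 5 of 6 unigrams overlap in both directions.
score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
print(round(score * 100, 1))  # 83.3
```

Because the F-measure balances precision and recall, a candidate that is much longer than its reference loses precision even when it covers the reference well.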

The summaries from the test dataset serve as the references.

In [5]:
ref_summaries = list(df_test['summary'])

Now we create three baselines by comparing the reference summaries with the first sentence, the first two sentences, and the first three sentences of each abstract.

In [6]:
import re
# Treat whitespace that follows '.', ':' or ';' as a sentence boundary.
sentence_boundary = re.compile(r'(?<=[.:;])\s')
for n in range(1, 4):
    # Candidate summary: the first n sentences of each abstract.
    candidate_summaries = [' '.join(sentence_boundary.split(text)[:n]) for text in df_test['text']]
    print(f"First {n} sentences: Scores {calc_rouge_scores(candidate_summaries, ref_summaries)}")

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.


First 1 sentences: Scores {'rouge1': 30.8, 'rouge2': 15.0, 'rougeL': 25.9, 'rougeLsum': 25.9}
First 2 sentences: Scores {'rouge1': 23.6, 'rouge2': 11.1, 'rougeL': 19.0, 'rougeLsum': 19.0}
First 3 sentences: Scores {'rouge1': 19.6, 'rouge2': 9.5, 'rougeL': 15.6, 'rougeLsum': 15.6}
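
To see what the sentence splitting in the cell above actually produces, the following sketch applies the same regular expression to a single string. The sample text and the helper name `lead_sentences` are made up for illustration; only the pattern is taken from the notebook.

```python
import re

# Same pattern as in the baseline cell: split at whitespace that follows '.', ':' or ';'.
SENTENCE_BOUNDARY = re.compile(r'(?<=[.:;])\s')


def lead_sentences(text: str, k: int) -> str:
    """Return the first k 'sentences' of text, joined by single spaces."""
    return ' '.join(SENTENCE_BOUNDARY.split(text)[:k])


# Made-up sample text, not taken from the dataset.
sample = "We study graphs. Our result: trees are sparse; cycles are not. Done."
for k in range(1, 4):
    print(k, repr(lead_sentences(sample, k)))
```

Note that colons and semicolons also end a "sentence" under this pattern, and an abbreviation such as "e.g." followed by a space triggers a split as well, so the baseline candidates can be shorter than grammatical sentences.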
