BART with chunking and concatenation provides the inference baseline against which I compare my later work. It is the simplest approach to my task of policy text summarisation: take a pre-trained model (BART), split each input into chunks because the model accepts at most 1,024 tokens, summarise each chunk, and concatenate the chunk summaries into a single summary of the policy text. I evaluate this baseline with ROUGE and use the resulting scores as the reference point for measuring the impact of the methods in my subsequent models.

In [1]:
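from datasets import load_dataset
from transformers import pipeline

# Pre-trained BART summarisation model (fine-tuned on CNN/DailyMail)
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# English subset of the EUR-Lex-Sum dataset
dataset = load_dataset("dennlinger/eur-lex-sum", "english")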

Device set to use mps:0


In [3]:
import pandas as pd

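# Chunk inputs by whitespace-separated words. The 800-word cap is a rough
# proxy for the model's token limit, since one word can map to several tokens.
def chunk_text(words, max_words=800):
    for i in range(0, len(words), max_words):
        yield ' '.join(words[i:i + max_words])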

results = []

# Run inference on the first 5 training examples only
for i, item in enumerate(dataset['train']):
    if i >= 5:
        break
    words = item['reference'].split()
    chunks = list(chunk_text(words))

    # Summarise each chunk
    summaries = []
    for chunk in chunks:
        summary = summarizer(
            chunk,
            max_length=130,
            min_length=30,
            do_sample=False,
            truncation=True  # guard: cut any chunk that exceeds the model's token limit
        )
        summaries.append(summary[0]['summary_text'])

    # Combine
    final_summary = " ".join(summaries)
    print("Combined summary:\n", final_summary[:1000], "...")
    
    results.append({
        "celex_id": item['celex_id'],
        "generated_summary": final_summary,
        "reference_summary": item['summary']
    })
    
# Save results to a DataFrame
df = pd.DataFrame(results)
# Save DataFrame to CSV
df.to_csv('BART-chunking-summaries.csv', index=False)

Your max_length is set to 130, but your input_length is only 45. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=22)


Combined summary:
 Regulation (EU) 2017/1129 lays down requirements to be complied with when drawing up prospectuses. The content and the format of a prospectus depend on a variety of factors, such as the type of issuer, type of security and type of issuance. The prospectus should contain a working capital statement as well as a statement of capitalisation and indebtedness of the issuer of the underlying shares. Derivative securities entail particular risks for investors. A high level of investor protection should be ensured, the EU says. It adds that certain types of securities that are not covered by the Annexes to this Regulation will be offered to the public. ‘Third country market’ means a third country market which has been deemed equivalent to a regulated market in accordance with the requirements set out in third and fourth subparagraphs of Article 25(4) of Directive 2014/65/EU of the European Parliament and of the Council (3) ‘profit estimate’ is a profit forecast for a financi
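
The chunker above splits on words, while the model's input limit is counted in tokens, so a word-based cap can only approximate it. Below is a minimal sketch of a token-budget chunker, assuming the caller supplies a per-word token counter. The names `chunk_by_token_budget`, `count_tokens` and `rough_count` are hypothetical and not part of this notebook, and `rough_count` (about one token per four characters) is only a stand-in so the example runs without downloading a tokenizer.

```python
from typing import Callable, Iterator, List


def chunk_by_token_budget(
    words: List[str],
    count_tokens: Callable[[str], int],
    max_tokens: int,
) -> Iterator[str]:
    """Greedily pack words into chunks whose summed token count stays within max_tokens.

    A single word that alone exceeds max_tokens is emitted as its own chunk.
    """
    current: List[str] = []
    current_tokens = 0
    for word in words:
        n = count_tokens(word)
        # Start a new chunk when adding this word would exceed the budget
        if current and current_tokens + n > max_tokens:
            yield " ".join(current)
            current, current_tokens = [], 0
        current.append(word)
        current_tokens += n
    if current:
        yield " ".join(current)


def rough_count(word: str) -> int:
    # ASSUMPTION: illustrative stand-in, roughly one token per four characters.
    # A real run would count tokens with the model's own tokenizer instead.
    return max(1, (len(word) + 3) // 4)


words = "the quick brown fox jumps over the lazy dog".split()
chunks = list(chunk_by_token_budget(words, rough_count, max_tokens=5))
print(chunks)
```

With a Hugging Face tokenizer, one possible counter is `lambda w: len(tokenizer.tokenize(" " + w))`. Summing per-word counts is still an approximation, and the model's special tokens also take up space, so the budget should probably sit somewhat below the model's actual limit.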

In [4]:
df = pd.read_csv('BART-chunking-summaries.csv')
print(df.head())


         celex_id                                  generated_summary  \
0      32019R0980  Regulation (EU) 2017/1129 lays down requiremen...   
1      32019D0785  Decision on the harmonisation of radio spectru...   
2      32019R1122  All allowances issued from 1 January 2012 onwa...   
3      32019R0856  Rules on the operation of the Innovation Fund ...   
4  22020A0724(01)  The European Union and the Government of the P...   

                                   reference_summary  
0  Prospectus to be published when securities are...  
1  Short range devices, RLAN (WiFi), Internet of ...  
2  Union registry for emissions trading system al...  
3  Emissions Trading System — Innovation Fund rul...  
4  EU-China agreement on civil aviation safety\nE...  


In [6]:
# ROUGE evaluation
from rouge_score import rouge_scorer

rouge1_f1_scores = []
rouge2_f1_scores = []
rougeL_f1_scores = []

scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)

for idx, row in df.iterrows():
    reference = row['reference_summary']
    generated = row['generated_summary']
    scores = scorer.score(reference, generated)
    
    print(f"Example {idx+1}:")
    print(f"ROUGE-1 F1: {scores['rouge1'].fmeasure:.4f}")
    print(f"ROUGE-2 F1: {scores['rouge2'].fmeasure:.4f}")
    print(f"ROUGE-L F1: {scores['rougeL'].fmeasure:.4f}")
    print("-" * 30)
    
    rouge1_f1_scores.append(scores['rouge1'].fmeasure)
    rouge2_f1_scores.append(scores['rouge2'].fmeasure)
    rougeL_f1_scores.append(scores['rougeL'].fmeasure)

# Calculate average ROUGE scores across the evaluated examples
avg_rouge1 = sum(rouge1_f1_scores) / len(rouge1_f1_scores)
avg_rouge2 = sum(rouge2_f1_scores) / len(rouge2_f1_scores)
avg_rougeL = sum(rougeL_f1_scores) / len(rougeL_f1_scores)

print(f"Average ROUGE-1 F1: {avg_rouge1:.4f}")
print(f"Average ROUGE-2 F1: {avg_rouge2:.4f}")
print(f"Average ROUGE-L F1: {avg_rougeL:.4f}")

Example 1:
ROUGE-1 F1: 0.4100
ROUGE-2 F1: 0.1364
ROUGE-L F1: 0.1650
------------------------------
Example 2:
ROUGE-1 F1: 0.1855
ROUGE-2 F1: 0.0404
ROUGE-L F1: 0.0989
------------------------------
Example 3:
ROUGE-1 F1: 0.3864
ROUGE-2 F1: 0.1342
ROUGE-L F1: 0.1593
------------------------------
Example 4:
ROUGE-1 F1: 0.3595
ROUGE-2 F1: 0.1160
ROUGE-L F1: 0.1704
------------------------------
Example 5:
ROUGE-1 F1: 0.3997
ROUGE-2 F1: 0.1318
ROUGE-L F1: 0.1830
------------------------------
Average ROUGE-1 F1: 0.3482
Average ROUGE-2 F1: 0.1118
Average ROUGE-L F1: 0.1553
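
To make the ROUGE-1 F1 figures above easier to interpret, here is a simplified sketch of the underlying calculation: clipped unigram overlap between reference and generated text, turned into precision, recall and their harmonic mean. The function `rouge1_f1` is hypothetical and deliberately cruder than `rouge_score`, which applies its own tokenisation and (with `use_stemmer=True`) stemming, so it will not reproduce the numbers reported above. The two example sentences are invented for illustration and do not come from the dataset.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1 on lowercased, whitespace-split tokens (no stemming)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both texts
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())  # share of candidate words that match
    recall = overlap / sum(ref_counts.values())      # share of reference words recovered
    return 2 * precision * recall / (precision + recall)


# Invented example sentences (not taken from EUR-Lex-Sum)
reference = "the regulation sets rules for prospectuses"
candidate = "the regulation lays down rules"

score = rouge1_f1(reference, candidate)
print(f"Simplified ROUGE-1 F1: {score:.4f}")  # 3 shared words: P = 3/5, R = 3/6, F1 = 6/11
```

Because F1 balances precision against recall, a concatenated summary that is much longer than its reference can recover many reference words and still score poorly on precision.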
