Release Summary Loop CNN/DM Test Set Outputs · CannyLab/summary_loop

This release contains the 11,490 summaries generated by the Summary Loop model (summary_loop_length46.bin) on the CNN/DM test set.
Each summary is stored alongside its CNN/DM id.
The following code snippet can be used to evaluate ROUGE scores:

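import json

from datasets import load_dataset, load_metric

# Load the released summaries; adjust the path to wherever the JSON file was saved.
with open("/home/phillab/data/cnndm_test_summary_loop.json", "r") as f:
    summary_loop_gens = json.load(f)

# load_metric is deprecated in newer datasets releases in favor of the evaluate library.
rouge = load_metric("rouge")
dataset_test = load_dataset("cnn_dailymail", "3.0.0")["test"]

# Map each CNN/DM id to its generated summary.
id2summary_loop = {d["id"]: d["summary_loop_gen"] for d in summary_loop_gens}

# Pair each reference with the generated summary that shares its id.
candidates, references = [], []
for d in dataset_test:
    references.append(d["highlights"])
    candidates.append(id2summary_loop[d["id"]])

print(len(references), len(candidates))  # both counts should be 11490
print(rouge.compute(predictions=candidates, references=references))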

Notes:
(1) The snippet relies on HuggingFace's datasets library (https://github.com/huggingface/datasets) to load both the CNN/DM dataset and the ROUGE metric.
(2) The ROUGE implementation used in the above example is not the original Perl-based implementation of ROUGE that produced the official numbers in the paper. The snippet is for demonstration purposes only, to show how to use the file.
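Because each summary is keyed by its CNN/DM id, a direct dictionary lookup raises a KeyError as soon as one test example has no matching summary. The sketch below shows one possible way to pair summaries with references while collecting unmatched ids instead. The helper name align_by_id and the toy records are hypothetical, made up for illustration; only the "id", "summary_loop_gen", and "highlights" field names come from the snippet above.

```python
def align_by_id(generated, dataset_rows):
    """Pair each dataset row with its generated summary, matching on "id".

    Returns (candidates, references, missing_ids). Rows whose id has no
    generated summary are listed in missing_ids instead of raising KeyError.
    """
    id2summary = {d["id"]: d["summary_loop_gen"] for d in generated}
    candidates, references, missing_ids = [], [], []
    for row in dataset_rows:
        summary = id2summary.get(row["id"])
        if summary is None:
            missing_ids.append(row["id"])
            continue
        candidates.append(summary)
        references.append(row["highlights"])
    return candidates, references, missing_ids


# Toy records standing in for the released JSON and the CNN/DM test split
# (hypothetical data, not taken from the release).
toy_generated = [
    {"id": "a1", "summary_loop_gen": "generated summary one"},
    {"id": "b2", "summary_loop_gen": "generated summary two"},
]
toy_dataset = [
    {"id": "b2", "highlights": "reference two"},
    {"id": "a1", "highlights": "reference one"},
    {"id": "c3", "highlights": "reference three"},
]

cands, refs, missing = align_by_id(toy_generated, toy_dataset)
print(len(cands), len(refs), missing)
```

With the real files, an empty missing list and two lists of 11,490 entries would indicate that every test example was matched.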
