Source code for the paper Unsupervised Summarization Re-ranking.
Mathieu Ravaut, Shafiq Joty, Nancy F. Chen.
Accepted for publication at ACL Findings 2023.
git clone https://github.com/Ravoxsg/SummScore.git
cd SummScore
conda create --name summscore python=3.8.11
conda activate summscore
pip install -r requirements.txt
SummScore scores each summary candidate produced by a model (e.g., PEGASUS) and a decoding method (e.g., beam search) on a given data point.
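To make the idea of re-ranking concrete, here is a minimal sketch that scores each candidate against the source document and sorts the candidates by that score. It is not the repository's implementation: the two features (`unigram_overlap`, `compression`), the `FEATURES` table, and the `rerank` function are hypothetical stand-ins chosen so the example runs without any model download.

```python
from typing import Callable, Dict, List


def unigram_overlap(candidate: str, source: str) -> float:
    """Fraction of candidate tokens that also appear in the source (toy feature)."""
    cand_tokens = candidate.lower().split()
    if not cand_tokens:
        return 0.0
    source_vocab = set(source.lower().split())
    return sum(tok in source_vocab for tok in cand_tokens) / len(cand_tokens)


def compression(candidate: str, source: str) -> float:
    """1 minus the candidate/source length ratio, floored at 0 (toy feature)."""
    src_len = len(source.split())
    if src_len == 0:
        return 0.0
    return max(0.0, 1.0 - len(candidate.split()) / src_len)


# Hypothetical feature table; the real feature set is defined by the repository.
FEATURES: Dict[str, Callable[[str, str], float]] = {
    "overlap": unigram_overlap,
    "compression": compression,
}


def rerank(candidates: List[str], source: str, weights: Dict[str, float]) -> List[str]:
    """Return candidates sorted by a weighted sum of feature values, best first."""
    def score(candidate: str) -> float:
        return sum(weights[name] * fn(candidate, source) for name, fn in FEATURES.items())

    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    source = "the cat sat on the mat while the dog slept"
    candidates = ["a bird flew away", "the cat sat on the mat"]
    print(rerank(candidates, source, {"overlap": 1.0, "compression": 0.0}))
```

The first element of the returned list is the candidate that would be kept as the final summary. No reference summary is needed at any point, which is what makes this kind of scoring unsupervised.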
First, generate summary candidates on the validation and training sets.
For instance, on the SAMSum 100-shot validation set (the default configuration):
cd src/candidate_generation/
CUDA_VISIBLE_DEVICES=0 bash main_candidate_generation.sh
Next, you need to score each summary candidate.
cd ../summscore/
CUDA_VISIBLE_DEVICES=0 bash main_build_scores.sh
Now you can launch SummScore training, which estimates the feature coefficients on a 1,000-data-point subset of the validation set.
CUDA_VISIBLE_DEVICES=0 bash main_reranking.sh
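As a rough illustration of what estimating feature coefficients can look like, the sketch below runs a plain grid search over non-negative weights that sum to 1 and keeps the weights whose top-ranked candidate has the highest average target value. This is an assumption made for illustration, not necessarily the procedure in `main_reranking.sh`: the function names, the grid search itself, and the per-candidate `targets` values are all hypothetical, and this README does not describe how the repository obtains its training signal.

```python
from itertools import product
from typing import List, Sequence, Tuple


def top_index(features: Sequence[Sequence[float]], weights: Sequence[float]) -> int:
    """Index of the candidate with the highest weighted feature sum."""
    scores = [sum(w * f for w, f in zip(weights, feats)) for feats in features]
    return max(range(len(scores)), key=scores.__getitem__)


def grid_search_weights(
    feature_sets: List[List[List[float]]],  # [data point][candidate][feature]
    targets: List[List[float]],             # [data point][candidate], hypothetical quality signal
    n_features: int,
    steps: int = 10,
) -> Tuple[Tuple[float, ...], float]:
    """Search weights on a grid over the simplex; return (best weights, best mean target)."""
    best_weights: Tuple[float, ...] = tuple()
    best_value = float("-inf")
    for combo in product(range(steps + 1), repeat=n_features):
        if sum(combo) != steps:  # keep only weight vectors that sum to 1
            continue
        weights = tuple(c / steps for c in combo)
        # Mean target value of the candidate each data point would select.
        value = sum(
            tgt[top_index(feats, weights)] for feats, tgt in zip(feature_sets, targets)
        ) / len(feature_sets)
        if value > best_value:
            best_weights, best_value = weights, value
    return best_weights, best_value


if __name__ == "__main__":
    feature_sets = [
        [[1.0, 0.0], [0.0, 1.0]],
        [[0.9, 0.1], [0.2, 0.8]],
    ]
    targets = [[0.1, 0.9], [0.2, 0.7]]
    print(grid_search_weights(feature_sets, targets, n_features=2))
```

An exhaustive grid is only practical for a handful of features, since the number of weight combinations grows quickly with `n_features` and `steps`.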
The code lets you choose among several fine-tuned models hosted on HuggingFace. You can also use our own checkpoints:
BART fine-tuned on WikiHow: here
PEGASUS fine-tuned on SAMSum: here
BART fine-tuned on SAMSum: here
Alternatively, if you just want a demo (in a single file) of SummScore on a single data point (default: CNN/DM), run:
cd src/summscore/
CUDA_VISIBLE_DEVICES=0 python demo.py
If you find our paper or this project helpful for your research, please consider citing our paper:
@article{ravaut2022unsupervised,
title={Unsupervised Summarization Re-ranking},
author={Ravaut, Mathieu and Joty, Shafiq and Chen, Nancy},
journal={arXiv preprint arXiv:2212.09593},
year={2022}
}