This code is for the SIGIR 2020 paper SummPip: Unsupervised Multi-Document Summarization with Sentence Graph Compression.
Python version: this code is written in Python 3.6.
source data which has minimal text pre-processing
target data (for evaluation)
Step1: place the downloaded dataset in the folder
Step2: download the pre-trained word2vec model and place it in the folder
- If you want to run SummPip on your own dataset, you first need to pre-train a word2vec (W2V) model yourself with gensim.
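
For reference, pre-training a W2V model on your own corpus could look roughly like the sketch below. This is a minimal illustration, not the training setup used for the paper: the helper names `to_token_lists` and `train_w2v`, the tokenisation rules, the hyperparameters (`dim=300`, `window=5`, `min_count=1`) and the output file name are all assumptions. Please check which model format and vector size the SummPip scripts expect before using the result.

```python
import re


def to_token_lists(documents):
    """Split raw documents into lowercased token lists, one list per sentence.

    Hypothetical helper: a deliberately simple regex-based splitter.
    """
    sentences = []
    for doc in documents:
        # Split after sentence-final punctuation followed by whitespace.
        for sent in re.split(r"(?<=[.!?])\s+", doc.strip()):
            tokens = re.findall(r"\w+", sent.lower())
            if tokens:
                sentences.append(tokens)
    return sentences


def train_w2v(sentences, out_path, dim=300):
    """Train a gensim Word2Vec model on token lists and save it to out_path.

    Hypothetical helper; the hyperparameters are illustrative assumptions.
    """
    from gensim.models import Word2Vec  # third-party: pip install gensim

    params = dict(window=5, min_count=1, workers=1)
    try:
        # gensim >= 4.0 calls the embedding dimension `vector_size`.
        model = Word2Vec(sentences, vector_size=dim, **params)
    except TypeError:
        # gensim < 4.0 calls the same parameter `size`.
        model = Word2Vec(sentences, size=dim, **params)
    model.save(out_path)  # gensim's native format (an assumption here)
    return model


# Example usage (assumed file name):
# docs = ["First document text. It has two sentences.", "Second document."]
# train_w2v(to_token_lists(docs), "my_w2v.model")
```

A real corpus should be much larger than the example above, and `min_count` would normally be raised so that very rare words are dropped.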
Step3: Unsupervised Extractive Summarisation
- You may want to change -nb_words to control the length of the output summary when applying SummPip on your own dataset.