SummPip
This code is for the SIGIR 2020 paper SummPip: Unsupervised Multi-Document Summarization with Sentence Graph Compression.
Python version: this code is written for Python 3.6.
Dataset
source data which has minimal text pre-processing
target data (for evaluation)
Test SummPip
Step1: place the downloaded dataset in the folder ./dataset/multi_news/
Step2: download the pre-trained word2vec model and place it in the folder ./word_vec/multi_news
- If you want to run SummPip on your own dataset, you first need to pre-train a word2vec model on it yourself with gensim.
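A minimal sketch of pre-training a word2vec model with gensim for a custom dataset is shown below. The regex tokenizer, the 300-dimensional vectors, the other hyperparameters, and the helper names `tokenize_sentences` and `train_word2vec` are illustrative assumptions, not the settings used in the paper. This README does not state which file name or model format `run_main.py` expects, so check the loading code before choosing how to save.

```python
import re


def tokenize_sentences(text):
    """Split raw text into lowercase word lists, one list per sentence.

    Simple regex tokenization, used here only for illustration.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    return [tokens for tokens in tokenized if tokens]


def train_word2vec(documents, output_path, dim=300):
    """Train a word2vec model on an iterable of raw document strings.

    `dim`, `window`, `min_count`, and `workers` are assumed values.
    """
    # Imported lazily so the tokenizer can be used without gensim installed.
    import gensim
    from gensim.models import Word2Vec

    corpus = [sent for doc in documents for sent in tokenize_sentences(doc)]

    # gensim 4.x renamed the `size` argument to `vector_size`.
    major_version = int(gensim.__version__.split(".")[0])
    size_arg = "vector_size" if major_version >= 4 else "size"

    model = Word2Vec(
        sentences=corpus,
        window=5,
        min_count=1,
        workers=2,
        **{size_arg: dim},
    )
    # Saves in gensim's native format; the format SummPip expects
    # is an assumption here and should be verified.
    model.save(output_path)
    return model
```

Calling `train_word2vec(my_documents, "my_w2v.model")` on a list of document strings would write the model to the hypothetical path `my_w2v.model`, which can then be moved to the matching folder under ./word_vec/.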
Step3: Unsupervised Extractive Summarisation
python run_main.py
- You may want to change -nb_clusters and -nb_words to control the length of the output summary when applying SummPip on your own dataset.