Structure-Infused Copy Mechanisms for Abstractive Summarization

We provide the source code for the paper "Structure-Infused Copy Mechanisms for Abstractive Summarization", accepted at COLING'18. If you find the code useful, please cite the following paper.

 Author = {Kaiqiang Song and Lin Zhao and Fei Liu},
 Title = {Structure-Infused Copy Mechanisms for Abstractive Summarization},
 Booktitle = {Proceedings of the 27th International Conference on Computational Linguistics (COLING)},
 Year = {2018}}


  • Our system rewrites a lengthy sentence, often the first sentence of a news article, into a concise, title-like summary. The average input and output lengths are 31 words and 8 words, respectively.

  • The code takes as input a text file with one sentence per line. It writes the output to a text file in the same directory as the input, with a name ending in ".result.summary", where each source sentence is replaced by a title-like summary.

  • Example input and output are shown below.

    An estimated 4,645 people died in Hurricane Maria and its aftermath in Puerto Rico , according to an academic report published Tuesday in a prestigious medical journal .

    hurricane maria kills 4,645 in puerto rico .
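Because both the input and the output hold one sentence per line, line i of the summary file corresponds to line i of the input file. The sketch below pairs them up. It is a minimal illustration, not part of the released code: the helper names `summary_path` and `read_pairs` are hypothetical, and it assumes the ".result.summary" suffix is appended to the complete input file name.

```python
import io
import os
import tempfile

SUMMARY_SUFFIX = ".result.summary"


def summary_path(input_path):
    # Assumption: the suffix is appended to the complete input file name.
    return input_path + SUMMARY_SUFFIX


def read_pairs(input_path):
    """Pair each source sentence with the summary on the same line number."""
    with io.open(input_path, encoding="utf-8") as src:
        sources = [line.rstrip("\n") for line in src]
    with io.open(summary_path(input_path), encoding="utf-8") as out:
        summaries = [line.rstrip("\n") for line in out]
    if len(sources) != len(summaries):
        raise ValueError("expected one summary per source sentence")
    return list(zip(sources, summaries))


if __name__ == "__main__":
    # Toy demonstration with hand-written files; no model is run here.
    work_dir = tempfile.mkdtemp()
    source_file = os.path.join(work_dir, "test_000.txt")
    with io.open(source_file, "w", encoding="utf-8") as handle:
        handle.write(u"a lengthy source sentence .\n")
    with io.open(summary_path(source_file), "w", encoding="utf-8") as handle:
        handle.write(u"a short title .\n")
    for source, summary in read_pairs(source_file):
        print(source + " => " + summary)
```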

A Quick Demo

Demo of Sentence Summarizer


The code is written in Python (v2.7) and Theano (v1.0.1). We suggest the following environment:

To install Python (v2.7), run the command:

$ wget
$ bash
$ source ~/.bashrc

To install Theano and its dependencies, run the commands below (you may want to add export MKL_THREADING_LAYER=GNU to "~/.bashrc" for future use).

$ conda install numpy scipy mkl nose sphinx pydot-ng
$ conda install theano pygpu

To download the Stanford CoreNLP toolkit and use it as a server, run the command below. The CoreNLP toolkit helps derive structure information (part-of-speech tags, dependency parse trees) from source sentences.

$ wget
$ unzip
$ cd stanford-corenlp-full-2018-02-27
$ nohup java -mx16g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000 &
$ cd -
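Once the server is running, it can be queried over HTTP for part-of-speech tags and dependency parses. The sketch below shows one way to do that. It is an illustration only and not necessarily how this repository extracts its features. The function names are hypothetical, and the JSON field names ("tokens", "basicDependencies", "governor", "dependent", "dep") are those of the CoreNLP server's JSON output as I understand it; check them against your CoreNLP version.

```python
import json

try:  # Python 3
    from urllib.parse import quote
    from urllib.request import Request, urlopen
except ImportError:  # Python 2.7
    from urllib import quote
    from urllib2 import Request, urlopen


def build_url(host="http://localhost:9000"):
    """Build the request URL asking for POS tags and a dependency parse."""
    props = {"annotators": "tokenize,ssplit,pos,depparse",
             "outputFormat": "json"}
    return host + "/?properties=" + quote(json.dumps(props, sort_keys=True))


def annotate(sentence, host="http://localhost:9000"):
    """POST one sentence to a running server and return the parsed JSON."""
    request = Request(build_url(host), data=sentence.encode("utf-8"))
    return json.loads(urlopen(request, timeout=15).read().decode("utf-8"))


def token_features(parsed_sentence):
    """Return (word, POS tag, head index, relation) per token; head 0 is ROOT."""
    heads = {d["dependent"]: (d["governor"], d["dep"])
             for d in parsed_sentence["basicDependencies"]}
    return [(t["word"], t["pos"]) + heads.get(t["index"], (0, "ROOT"))
            for t in parsed_sentence["tokens"]]
```

With the server started as above, `token_features(annotate("...")["sentences"][0])` would return one tuple per token of the first sentence.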

To install Pyrouge, run the command below. Pyrouge is a Python wrapper for the ROUGE toolkit, an automatic metric used for summary evaluation.

$ pip install pyrouge
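For intuition about what ROUGE measures, the sketch below computes a simplified ROUGE-1 score (clipped unigram overlap) between a generated summary and a reference. It is a rough illustration only: the official ROUGE toolkit applies its own tokenization and options, so its numbers will generally differ, and the function name `rouge_1` is hypothetical.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Return (precision, recall, F1) over whitespace-separated unigrams."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word,
    # so a repeated word is only credited as often as the reference has it.
    overlap = sum((cand & ref).values())
    precision = overlap / float(sum(cand.values())) if cand else 0.0
    recall = overlap / float(sum(ref.values())) if ref else 0.0
    if overlap == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```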

I Want to Generate Summaries..

  1. Clone this repo. Download this TAR file (model_coling18.tar.gz) containing vocabulary files and pretrained models. Move the TAR file to folder "struct_infused_summ" and uncompress.

    $ git clone
    $ mv model_coling18.tar.gz struct_infused_summ
    $ cd struct_infused_summ
    $ tar -xvzf model_coling18.tar.gz
    $ rm model_coling18.tar.gz
  2. Extract structural features from a list of input files. The file ./test_data/test_filelist.txt contains absolute (or relative) paths to individual files (test_000.txt and test_001.txt are toy files). Each file contains a number of source sentences, one sentence per line. Then, execute the command:

    $ python -f ./test_data/test_filelist.txt
  3. Generate the model configuration file in the ./settings/ folder.

    $ python ./test_data/test_filelist.txt ./settings/my_test_settings

    After that, you need to modify the "dataset" field of the file to point it to the new settings file: 'dataset':'settings/my_test_settings.json'.

  4. Run the testing script. The summary files are written to the same directory as the input, with names ending in ".result.summary".

    $ python

    struct_edge is the default model. It corresponds to the "2way+relation" architecture described in the paper. You can modify the file (Line 152-153) by globally replacing struct_edge with struct_node to enable the "2way+word" architecture.
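If you have many input files, the file list used in step 2 can be generated rather than written by hand. The sketch below is a hypothetical helper, not part of this repository. It assumes the list holds one path per line and that the input files end in ".txt".

```python
import io
import os
import tempfile


def write_filelist(input_dir, filelist_path, extension=".txt"):
    """Write the absolute paths of all input files in input_dir, one per line."""
    filelist_abs = os.path.abspath(filelist_path)
    paths = sorted(
        os.path.abspath(os.path.join(input_dir, name))
        for name in os.listdir(input_dir)
        if name.endswith(extension)
    )
    # The list may live next to the inputs (as test_filelist.txt does),
    # so make sure it never lists itself.
    paths = [path for path in paths if path != filelist_abs]
    with io.open(filelist_path, "w", encoding="utf-8") as out:
        for path in paths:
            out.write(u"%s\n" % path)
    return paths


if __name__ == "__main__":
    # Toy demonstration in a temporary directory.
    demo_dir = tempfile.mkdtemp()
    for name in ("test_000.txt", "test_001.txt"):
        with io.open(os.path.join(demo_dir, name), "w", encoding="utf-8") as f:
            f.write(u"one source sentence per line .\n")
    listed = write_filelist(demo_dir, os.path.join(demo_dir, "test_filelist.txt"))
    print("\n".join(listed))
```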

I Want to Train the Model..

  1. Create a folder to save the model files. ./model/struct_node is for the "2way+word" architecture and ./model/struct_edge for the "2way+relation" architecture.

    $ mkdir -p ./model/struct_node ./model/struct_edge
  2. Extract structural features from the input files. source_file.txt and summary_file.txt in the ./train_data/ folder are toy files containing source and summary sentences, one sentence per line. Often, tens of thousands of (source, summary) pairs are required for training.

    $ python ./train_data/source_file.txt
    $ python ./train_data/summary_file.txt

    Rename the files using the commands below. The .Ndocument, .dfeature, and .Nsummary files contain the source sentences, the structural features of the source sentences, and the summary sentences, respectively.

    $ cd ./train_data/
    $ mv source_file.txt.Ndocument train.Ndocument
    $ mv source_file.txt.feature train.dfeature
    $ mv summary_file.txt.Ndocument train.Nsummary
    $ cd -
  3. Repeat the previous step for the validation data, which are used for early stopping. ./valid_data contains toy files.

    $ python ./valid_data/source_file.txt
    $ python ./valid_data/summary_file.txt
    $ cd ./valid_data/
    $ mv source_file.txt.Ndocument valid.Ndocument
    $ mv source_file.txt.feature valid.dfeature
    $ mv summary_file.txt.Ndocument valid.Nsummary
    $ cd -
  4. Generate the model configuration file in the ./settings/ folder.

    $ python ./train_data/train ./valid_data/valid ./settings/my_train_settings

    After that, you need to modify the "dataset" field of the file to point to the new settings file: 'dataset':'settings/my_train_settings.json'.

  5. Download the GloVe embeddings and uncompress.

    $ wget
    $ unzip
    $ rm

    Modify the "vocab_emb_init_path" field in the file ./settings/vocabulary.json from "vocab_emb_init_path": "../../vocab/glove.6B.100d.txt" to "vocab_emb_init_path": "glove.6B.100d.txt".

  6. Create a vocabulary file from ./train_data/train.Ndocument and ./train_data/train.Nsummary. Words appearing fewer than 5 times are excluded.

    $ python my_vocab
  7. Modify the path to the vocabulary file in from Vocab_Giga = loadFromPKL('../../dataset/gigaword_eng_5/giga_new.Vocab') to Vocab_Giga = loadFromPKL('my_vocab.Vocab').

  8. To train the model, run the command below.

    $ THEANO_FLAGS='floatX=float32' python

    The training program stops when it reaches the maximum number of epochs (30 epochs). This number can be modified by changing the "max_epochs" field in ./settings/training.json. The model files are saved in folder ./model/.

    "2way+relation" is the default architecture. It uses the settings file ./settings/network_struct_edge.json. You can modify the 'network' field of the from 'settings/network_struct_edge.json' to './settings/network_struct_node.json' to train the "2way+word" architecture.

  9. (Optional) train the model with early stopping.

    You might want to change the parameters used for early stopping. These are specified in ./settings/earlyStop.json and explained below. If early stopping is enabled, the best model files, model_best.npz and options_best.json, will be saved in the ./model/struct_edge/ folder.

	"sample":true, # enable model checkpoint
	"sampleMin":10000, # the first checkpoint occurs after 10K batches
	"sampleFreq":2000, # there is a checkpoint every 2K batches afterwards
	"earlyStop":true, # enable early stopping 
	"earlyStop_method":"valid_err", # based on validation loss
	"earlyStop_bound":62000, # the training program stops if the valid loss has no improvement after 62K batches
	"rate_bound":24000 # halve the learning rate if the valid loss has no improvement after 24K batches

62K batches (used for earlyStop_bound) correspond to about 1 epoch for our dataset. 24K batches (used for rate_bound) is slightly less than half of an epoch.
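The interaction between these parameters can be sketched as follows. This is one plausible reading of the comments above, not the repository's training loop. The class and function names are hypothetical, the initial learning rate is up to the caller, and the sketch assumes the learning rate is halved again each time another rate_bound batches pass without improvement.

```python
def is_checkpoint(batch, sample_min=10000, sample_freq=2000):
    """True at the first checkpoint and every sample_freq batches after it."""
    return batch >= sample_min and (batch - sample_min) % sample_freq == 0


class EarlyStopper(object):
    def __init__(self, learning_rate, early_stop_bound=62000, rate_bound=24000):
        self.learning_rate = learning_rate
        self.early_stop_bound = early_stop_bound
        self.rate_bound = rate_bound
        self.best_loss = float("inf")
        self.best_batch = 0
        self.last_decay_batch = 0

    def update(self, batch, valid_loss):
        """Record one checkpoint; return True when training should stop."""
        if valid_loss < self.best_loss:
            # New best validation loss: this is where the best model
            # would be saved. Both counters restart from this batch.
            self.best_loss = valid_loss
            self.best_batch = batch
            self.last_decay_batch = batch
            return False
        if batch - self.last_decay_batch >= self.rate_bound:
            # Assumption: halve again after each further stalled rate_bound.
            self.learning_rate /= 2.0
            self.last_decay_batch = batch
        return batch - self.best_batch >= self.early_stop_bound
```

In a training loop, `update` would be called with the validation loss at every batch for which `is_checkpoint` returns True.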

I Want to Apply the Coverage Mechanism in a 2nd Training Stage..

  1. You will switch to the file Modify the path to the vocabulary file in from Vocab_Giga = loadFromPKL('../../dataset/gigaword_eng_5/giga_new.Vocab') to Vocab_Giga = loadFromPKL('my_vocab.Vocab') to point it to your vocabulary file.

  2. Run the command below to perform the 2nd-stage training. Two files, ./model/struct_edge/model_check2_best.npz and ./model/struct_edge/options_check2_best.json, will be generated, containing the best model parameters and system configurations for the "2way+relation" architecture.

    $ python


This project is licensed under the BSD License - see the file for details.


We gratefully acknowledge the work of Kelvin Xu, whose code in part inspired this project.

