Structure-Infused Copy Mechanisms for Abstractive Summarization

We provide the source code for the paper "Structure-Infused Copy Mechanisms for Abstractive Summarization", accepted at COLING'18. If you find the code useful, please cite the following paper.

 @inproceedings{song2018structure,
 Author = {Kaiqiang Song and Lin Zhao and Fei Liu},
 Title = {Structure-Infused Copy Mechanisms for Abstractive Summarization},
 Booktitle = {Proceedings of the 27th International Conference on Computational Linguistics (COLING)},
 Year = {2018}}


  • Our system seeks to rewrite a lengthy sentence, often the first sentence of a news article, into a concise, title-like summary. The average input and output lengths are 31 words and 8 words, respectively.

  • The code takes as input a text file with one sentence per line. It generates an output text file in the same directory, ending in ".result.summary", where each source sentence is replaced by a title-like summary.

  • Example input and output are shown below.

    An estimated 4,645 people died in Hurricane Maria and its aftermath in Puerto Rico , according to an academic report published Tuesday in a prestigious medical journal .

    hurricane maria kills 4,645 in puerto rico .

A Quick Demo

Demo of Sentence Summarizer


The code is written in Python (v2.7) and Theano (v1.0.1). We suggest the following environment:

To install Python (v2.7), run the command:

$ wget
$ bash
$ source ~/.bashrc

To install Theano and its dependencies, run the commands below (you may want to add export MKL_THREADING_LAYER=GNU to "~/.bashrc" for future use).

$ conda install numpy scipy mkl nose sphinx pydot-ng
$ conda install theano pygpu

To download the Stanford CoreNLP toolkit and use it as a server, run the command below. The CoreNLP toolkit helps derive structure information (part-of-speech tags, dependency parse trees) from source sentences.

$ wget
$ unzip
$ cd stanford-corenlp-full-2018-02-27
$ nohup java -mx16g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000 &
$ cd -
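Once the server is running, annotations can be requested over HTTP and returned as JSON. The snippet below is a minimal sketch (not part of this repo) of how such a response might be parsed into the part-of-speech tags and dependency triples that structure-infused models consume; the `sample` dict mimics the server's JSON format for a short sentence rather than contacting a live server.

```python
import json

# A miniature CoreNLP-style JSON response for the sentence "Maria hits".
# A real response would come from an HTTP POST to http://localhost:9000/.
sample = json.loads("""
{"sentences": [{
  "tokens": [
    {"index": 1, "word": "Maria", "pos": "NNP"},
    {"index": 2, "word": "hits", "pos": "VBZ"}
  ],
  "basicDependencies": [
    {"dep": "ROOT", "governor": 0, "dependent": 2},
    {"dep": "nsubj", "governor": 2, "dependent": 1}
  ]
}]}
""")

def extract_structure(doc):
    """Return (pos_tags, dependency_triples) for the first sentence."""
    sent = doc["sentences"][0]
    pos = [(t["word"], t["pos"]) for t in sent["tokens"]]
    deps = [(d["dep"], d["governor"], d["dependent"])
            for d in sent["basicDependencies"]]
    return pos, deps

pos, deps = extract_structure(sample)
print(pos)   # [('Maria', 'NNP'), ('hits', 'VBZ')]
print(deps)  # [('ROOT', 0, 2), ('nsubj', 2, 1)]
```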

To install Pyrouge, run the command below. Pyrouge is a Python wrapper for the ROUGE toolkit, an automatic metric used for summary evaluation.

$ pip install pyrouge
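Pyrouge wraps the Perl ROUGE-1.5.5 script; the toy function below (our illustration, not pyrouge's API) shows what ROUGE-1 recall measures: the fraction of reference unigrams that also appear in the system summary.

```python
from collections import Counter

def rouge1_recall(system, reference):
    """Toy ROUGE-1 recall: overlapping unigrams / reference unigrams."""
    sys_counts = Counter(system.lower().split())
    ref_counts = Counter(reference.lower().split())
    overlap = sum(min(count, sys_counts[word])
                  for word, count in ref_counts.items())
    return overlap / float(sum(ref_counts.values()))

# 7 of the 10 reference unigrams appear in the system summary.
score = rouge1_recall("hurricane maria kills 4,645 in puerto rico .",
                      "hurricane maria death toll hits 4,645 in puerto rico .")
print(score)  # 0.7
```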

I Want to Generate Summaries..

  1. Clone this repo. Download this TAR file (model_coling18.tar.gz) containing vocabulary files and pretrained models. Move the TAR file to folder "struct_infused_summ" and uncompress.

    $ git clone
    $ mv model_coling18.tar.gz struct_infused_summ
    $ cd struct_infused_summ
    $ tar -xvzf model_coling18.tar.gz
    $ rm model_coling18.tar.gz
  2. Extract structural features from a list of input files. The file ./test_data/test_filelist.txt contains absolute (or relative) paths to individual files (test_000.txt and test_001.txt are toy files). Each file contains a number of source sentences, one sentence per line. Then, execute the command:

    $ python -f ./test_data/test_filelist.txt
  3. Generate the model configuration file in the ./settings/ folder.

    $ python ./test_data/test_filelist.txt ./settings/my_test_settings

    After that, you need to modify the "dataset" field of the file to point it to the new settings file: 'dataset':'settings/my_test_settings.json'.

  4. Run the testing script. The summary files, located in the same directory as the input, end with ".result.summary".

    $ python

    struct_edge is the default model. It corresponds to the "2way+relation" architecture described in the paper. You can modify the file (Lines 152-153) by globally replacing struct_edge with struct_node to enable the "2way+word" architecture.
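Steps 3-4 require hand-editing the "dataset" field of a JSON options file. A small sketch of automating that edit with the standard json module is shown below; the file names and the second field are placeholders for demonstration, not the repo's actual paths.

```python
import json
import os
import tempfile

def point_dataset_at(options_path, settings_path):
    """Rewrite the 'dataset' field of a JSON options file in place."""
    with open(options_path) as f:
        options = json.load(f)
    options["dataset"] = settings_path
    with open(options_path, "w") as f:
        json.dump(options, f, indent=2)

# Demo on a throwaway file standing in for the real options file.
path = os.path.join(tempfile.mkdtemp(), "options.json")
with open(path, "w") as f:
    json.dump({"dataset": "settings/old.json",
               "network": "settings/network_struct_edge.json"}, f)

point_dataset_at(path, "settings/my_test_settings.json")
with open(path) as f:
    updated = json.load(f)
print(updated["dataset"])  # settings/my_test_settings.json
```

Only the targeted field changes; the other fields are written back untouched.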

I Want to Train the Model..

  1. Create a folder to save the model files. ./model/struct_node is for the "2way+word" architecture and ./model/struct_edge for the "2way+relation" architecture.

    $ mkdir -p ./model/struct_node ./model/struct_edge
  2. Extract structural features from the input files. source_file.txt and summary_file.txt in the ./train_data/ folder are toy files containing source and summary sentences, one sentence per line. Often, tens of thousands of (source, summary) pairs are required for training.

    $ python ./train_data/source_file.txt
    $ python ./train_data/summary_file.txt

    Rename the files using the commands below. The .Ndocument, .dfeature, and .Nsummary files respectively contain the source sentences, the structural features of the source sentences, and the summary sentences.

    $ cd ./train_data/
    $ mv source_file.txt.Ndocument train.Ndocument
    $ mv source_file.txt.feature train.dfeature
    $ mv summary_file.txt.Ndocument train.Nsummary
    $ cd -
  3. Repeat the previous step for validation data, which are used for early stopping. The ./valid_data folder contains toy files.

    $ python ./valid_data/source_file.txt
    $ python ./valid_data/summary_file.txt
    $ cd ./valid_data/
    $ mv source_file.txt.Ndocument valid.Ndocument
    $ mv source_file.txt.feature valid.dfeature
    $ mv summary_file.txt.Ndocument valid.Nsummary
    $ cd -
  4. Generate the model configuration file in the ./settings/ folder.

    $ python ./train_data/train ./valid_data/valid ./settings/my_train_settings

    After that, you need to modify the "dataset" field of the file to point to the new settings file: 'dataset':'settings/my_train_settings.json'.

  5. Download the GloVe embeddings and uncompress.

    $ wget
    $ unzip
    $ rm

    Modify the "vocab_emb_init_path" field in the file ./settings/vocabulary.json from "vocab_emb_init_path": "../../vocab/glove.6B.100d.txt" to "vocab_emb_init_path": "glove.6B.100d.txt".

  6. Create a vocabulary file from ./train_data/train.Ndocument and ./train_data/train.Nsummary. Words appearing fewer than 5 times are excluded.

    $ python my_vocab
  7. Modify the path to the vocabulary file from Vocab_Giga = loadFromPKL('../../dataset/gigaword_eng_5/giga_new.Vocab') to Vocab_Giga = loadFromPKL('my_vocab.Vocab').

  8. To train the model, run the below command.

    $ THEANO_FLAGS='floatX=float32' python

    The training program stops when it reaches the maximum number of epochs (30). This number can be modified by changing the "max_epochs" field in ./settings/training.json. The model files are saved in folder ./model/.

    "2way+relation" is the default architecture. It uses the settings file ./settings/network_struct_edge.json. You can modify the 'network' field from 'settings/network_struct_edge.json' to './settings/network_struct_node.json' to train the "2way+word" architecture.

  9. (Optional) train the model with early stopping.

    You might want to change the parameters used for early stopping. These are specified in ./settings/earlyStop.json and explained below. If early stopping is enabled, the best model files, model_best.npz and options_best.json, will be saved in the ./model/struct_edge/ folder.

	"sample":true, # enable model checkpoint
	"sampleMin":10000, # the first checkpoint occurs after 10K batches
	"sampleFreq":2000, # there is a checkpoint every 2K batches afterwards
	"earlyStop":true, # enable early stopping 
	"earlyStop_method":"valid_err", # based on validation loss
	"earlyStop_bound":62000, # the training program stops if the valid loss has no improvement after 62K batches
	"rate_bound":24000 # halve the learning rate if the valid loss has no improvement after 24K batches

62K batches (used for earlyStop_bound) correspond to about 1 epoch for our dataset. 24K batches (used for rate_bound) is slightly less than half of an epoch.
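The interaction of these fields can be sketched as follows; this is an illustration of the policy described above, not the repo's training code.

```python
cfg = {"sampleMin": 10000, "sampleFreq": 2000,
       "earlyStop": True, "earlyStop_bound": 62000, "rate_bound": 24000}

def should_checkpoint(batch, cfg):
    """First checkpoint at sampleMin batches, then one every sampleFreq batches."""
    return (batch >= cfg["sampleMin"]
            and (batch - cfg["sampleMin"]) % cfg["sampleFreq"] == 0)

def should_stop(batches_since_best, cfg):
    """Stop when the valid loss has not improved for earlyStop_bound batches."""
    return cfg["earlyStop"] and batches_since_best >= cfg["earlyStop_bound"]

checkpoints = [b for b in range(1, 20001) if should_checkpoint(b, cfg)]
print(checkpoints)              # [10000, 12000, 14000, 16000, 18000, 20000]
print(should_stop(62000, cfg))  # True
```

The learning rate would be halved, analogously, once `batches_since_best` reaches rate_bound (24K).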

I Want to Apply the Coverage Mechanism in a 2nd Training Stage..

  1. As before, modify the path to the vocabulary file from Vocab_Giga = loadFromPKL('../../dataset/gigaword_eng_5/giga_new.Vocab') to Vocab_Giga = loadFromPKL('my_vocab.Vocab') so that it points to your vocabulary file.

  2. Run the below command to perform the 2nd-stage training. Two files ./model/struct_edge/model_check2_best.npz and ./model/struct_edge/options_check2_best.json will be generated, containing the best model parameters and system configurations for the "2way+relation" architecture.

    $ python
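A coverage mechanism (in the style of See et al., 2017; the sketch below is a simplification for illustration, not the paper's exact formulation or code) keeps a running sum of past attention distributions and penalizes the decoder for re-attending to already-covered source words.

```python
def coverage_loss(attention_steps):
    """attention_steps: one attention distribution over source positions per
    decoder step. Coverage at step t is the sum of attention from earlier
    steps; the loss sums min(a_t, coverage_t), penalizing repetition."""
    coverage = [0.0] * len(attention_steps[0])
    loss = 0.0
    for a in attention_steps:
        loss += sum(min(ai, ci) for ai, ci in zip(a, coverage))
        coverage = [ci + ai for ai, ci in zip(a, coverage)]
    return loss

# Attending twice to the same source position incurs a penalty...
print(coverage_loss([[1.0, 0.0], [1.0, 0.0]]))  # 1.0
# ...while spreading attention over the source does not.
print(coverage_loss([[1.0, 0.0], [0.0, 1.0]]))  # 0.0
```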


This project is licensed under the BSD License - see the file for details.


We gratefully acknowledge the work of Kelvin Xu, whose code in part inspired this project.

