Controlling the Amount of Verbatim Copying in Abstractive Summarization

We provide the source code for the paper "Controlling the Amount of Verbatim Copying in Abstractive Summarization", accepted at AAAI'20. If you find the code useful, please cite the following paper.

 Author = {Kaiqiang Song and Bingqing Wang and Zhe Feng and Liu Ren and Fei Liu},
 Title = {Controlling the Amount of Verbatim Copying in Abstractive Summarization},
 Booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
 Year = {2020}}


  • Our system rewrites a lengthy sentence, often the first sentence of a news article, into a concise, title-like summary. The average input and output lengths are 31 words and 8 words, respectively.

  • The code takes as input a text file with one sentence per line. It writes its output to a text file ("summary.txt") in the working folder, in which each source sentence is replaced by a title-like summary.

  • Example input and output are shown below.

    Belgian authorities are investigating the killing of two policewomen and a passerby in the eastern city of Liege on Tuesday as a terror attack, the country's prosecutor said.

    Belgium probes killing of two policewomen as terror attack .
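The input and output formats described above can be sketched in Python as follows. This is a minimal sketch and not part of the released code: the helper names `write_input_file` and `pair_with_summaries` are hypothetical, and it assumes that line i of "summary.txt" corresponds to line i of the input file.

```python
from pathlib import Path


def write_input_file(sentences, path):
    """Write one sentence per line; whitespace runs (including newlines) become single spaces."""
    lines = [" ".join(s.split()) for s in sentences]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def pair_with_summaries(input_path, summary_path="summary.txt"):
    """Align each source sentence with the summary found on the same line number."""
    sources = Path(input_path).read_text(encoding="utf-8").splitlines()
    summaries = Path(summary_path).read_text(encoding="utf-8").splitlines()
    if len(sources) != len(summaries):
        raise ValueError("input and summary files have different line counts")
    return list(zip(sources, summaries))
```

Collapsing whitespace before writing keeps a sentence that contains a stray line break from being split into two inputs.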


The code is written in Python (v3.7) and PyTorch (v1.3). We suggest the following environment:

HINT: Note that the pytorch-pretrained-bert package may change its name and content over time. It is currently named transformers.

To install Python (v3.7), run the command:

$ wget
$ bash
$ source ~/.bashrc

To install PyTorch (v1.3) and its dependencies, run the command below.

$ conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

To install pytorch-pretrained-bert and its dependencies, run the commands below.

$ pip install spacy ftfy==4.4.3
$ python -m spacy download en
$ pip install pytorch-pretrained-bert

To install Pyrouge, run the command below. Pyrouge is a Python wrapper for the ROUGE toolkit, an automatic metric used for summary evaluation.

$ pip install pyrouge
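
Once the packages above are installed, a quick check that they can be imported may save a failed run later. The sketch below is not part of the released code; the function name `missing_modules` is hypothetical, and the module list is an assumption based on the usual import names of the packages installed above.

```python
import importlib.util

# Assumed import names for the packages installed above.
REQUIRED_MODULES = ["torch", "spacy", "ftfy", "pytorch_pretrained_bert", "pyrouge"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

The check only confirms that each module can be located; it does not verify versions or that the ROUGE toolkit wrapped by Pyrouge is configured.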

I Want to Generate Summaries..

  1. Clone this repo. Download the ZIP file containing the trained model, move it to the working folder, and uncompress it.

    $ git clone
    $ mv control-over-copying
    $ cd control-over-copying
    $ unzip
    $ rm
    $ mkdir log
  2. Generate summaries with our summarization model trained on one of the supported datasets: gigaword (default) or newsroom.

    $ python --do_test --inputFile data/test.txt

    To run a model other than the one trained on gigaword:

    $ python --do_test --dataset newsroom --inputFile data/test.txt

I Want to Train the Model..

  1. Train the model with training files and validation files.

    $ python --do_train --train_prefix data/train --valid_prefix data/valid
  2. (Optional) Modify the training options.

    You might want to change the parameters used for training. These are specified in ./setttings/training/gigaword_8.json and explained below.


HINT*: 200K batches (the value used for rateReduce_bound) with a batch size of 8 is slightly less than half of an epoch.
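
The arithmetic behind this hint can be checked with a short sketch. The training-set size used here (about 3.8 million pairs) is an assumption made for illustration and is not stated in this README; the function name `epoch_fraction` is likewise hypothetical.

```python
def epoch_fraction(num_batches, batch_size, train_size):
    """Return (examples seen, fraction of one epoch) for a given number of batches."""
    examples_seen = num_batches * batch_size
    return examples_seen, examples_seen / train_size


# Assumption: approximate number of training pairs; not taken from this README.
ASSUMED_TRAIN_SIZE = 3_800_000

examples, fraction = epoch_fraction(200_000, 8, ASSUMED_TRAIN_SIZE)
print(examples)            # 1600000
print(round(fraction, 2))  # 0.42
```

Under that assumption, 200K batches of 8 cover 1.6 million examples, or roughly 0.42 of an epoch, which agrees with "slightly less than half".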

