Data-efficient Neural Text Compression with Interactive Learning

In this project, we develop a general framework for Interactive Text Compression. We propose an interactive text compression model that uses active learning methods for data-efficient learning.

If you reuse this software, please use the following citation:

    @inproceedings{p-v-s-meyer-2019-data,
        title = {Data-efficient Neural Text Compression with Interactive Learning},
        author = {P.V.S., Avinesh and Meyer, Christian M.},
        publisher = {Association for Computational Linguistics},
        booktitle = {Proceedings of the 17th Annual Conference of the North American Chapter of the Association for Computational Linguistics},
        pages = {2543--2554},
        month = jun,
        year = {2019},
        location = {Minneapolis, USA},
    }

Abstract: Neural sequence-to-sequence models have been successfully applied to text compression. However, these models were trained on huge automatically induced parallel corpora, which are only available for a few domains and tasks. In this paper, we propose a novel interactive setup to neural text compression that enables transferring a model to new domains and compression tasks with minimal human supervision. This is achieved by employing active learning, which intelligently samples from a large pool of unlabeled data. Using this setup, we can successfully adapt a model trained on small data of 40k samples for a headline generation task to a general text compression dataset at an acceptable compression quality with just 500 sampled instances annotated by a human.
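The interactive setup described above follows the standard pool-based active learning pattern: repeatedly sample informative instances from the unlabeled pool, have a human annotate them, and retrain. A generic sketch of that loop (not the repository's actual pipeline; `select`, `annotate`, and `train` are hypothetical callbacks):

```python
def active_learning_loop(model, labeled, pool, select, annotate, train, rounds, k):
    """Generic pool-based active learning loop.

    Each round: pick k informative items from the unlabeled pool,
    have a human annotate them, move them to the labeled set, and
    retrain the model on everything labeled so far.
    """
    for _ in range(rounds):
        batch = select(model, pool, k)              # e.g. uncertainty sampling
        labeled = labeled + [(x, annotate(x)) for x in batch]
        pool = [x for x in pool if x not in batch]
        model = train(model, labeled)
    return model, labeled, pool

# Toy run: "annotation" is len() and "training" just counts labeled pairs.
model, labeled, pool = active_learning_loop(
    model=None, labeled=[], pool=["x", "yy", "zzz"],
    select=lambda m, p, k: p[:k], annotate=len,
    train=lambda m, data: len(data), rounds=2, k=1,
)
```

In the paper's setting, `annotate` corresponds to the human providing a compression for a sampled instance, and only 500 such instances were needed to adapt the headline-generation model.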

Contact person: Avinesh P.V.S., first_name.last_name AT gmail DOT com

Don't hesitate to send us an e-mail or report an issue, if something is broken (and it shouldn't be) or if you have further questions.

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.

Processing Data

python tools/

Run the text compression models Seq2Seq-Gen and Pointer-Gen
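Seq2Seq-Gen generates only from a fixed vocabulary, while Pointer-Gen can additionally copy tokens from the source, mixing a generation distribution and an attention-based copy distribution via a soft switch p_gen (following See et al.'s pointer-generator formulation; the numbers below are purely illustrative, not the repository's API):

```python
def pointer_gen_dist(p_gen, vocab_dist, attention, src_tokens):
    """Final word distribution of a pointer-generator decoding step:
    p_gen * P_vocab(w) + (1 - p_gen) * attention mass on source
    positions where w occurs."""
    final = {w: p_gen * p for w, p in vocab_dist.items()}
    for a, tok in zip(attention, src_tokens):
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * a
    return final

# Toy step: vocabulary {"the", "cat"}, source sentence "the dog".
dist = pointer_gen_dist(
    p_gen=0.7,
    vocab_dist={"the": 0.6, "cat": 0.4},
    attention=[0.9, 0.1],
    src_tokens=["the", "dog"],
)
# "dog" is out-of-vocabulary but still receives copy probability.
```

Because both input distributions sum to one, the mixed distribution does too, which is what makes the copy mechanism a drop-in replacement for the plain softmax output.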


Interactive Active Learning Sampling
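A common sampling strategy in this setting is uncertainty sampling: select the unlabeled instances whose model predictions have the highest entropy. A minimal sketch (`predict_proba` is a hypothetical model interface, not the repository's actual code):

```python
import math

def entropy(probs):
    """Shannon entropy of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def uncertainty_sample(pool, predict_proba, k):
    """Return the k pool items the model is least certain about."""
    return sorted(pool, key=lambda x: entropy(predict_proba(x)), reverse=True)[:k]

# Toy pool with hand-set output distributions per instance.
proba = {"s1": [0.9, 0.1], "s2": [0.5, 0.5], "s3": [0.8, 0.2], "s4": [0.6, 0.4]}
picked = uncertainty_sample(list(proba), proba.get, 2)
```

Here "s2", with the most uniform distribution, is sampled first: annotating the instances the model is least sure about is what makes the adaptation data-efficient.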


Evaluate Results

files2rouge system_output.txt reference.txt
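files2rouge reports ROUGE scores between the system output and the reference file. At its core, ROUGE-1 F1 is the harmonic mean of unigram-overlap precision and recall, which can be sketched as follows (a simplified re-implementation, without the stemming and resampling the official toolkit applies):

```python
from collections import Counter

def rouge1_f(system, reference):
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    sys_counts = Counter(system.split())
    ref_counts = Counter(reference.split())
    overlap = sum((sys_counts & ref_counts).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(sys_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f("the cat sat", "the cat ran")
```

The `Counter` intersection clips each unigram's match count by its frequency in the reference, mirroring ROUGE's clipped counting.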

