This repository contains the source code for our paper "Generic Mechanism for Reducing Repetitions in Encoder-Decoder Models", which has been accepted to the Journal of Natural Language Processing (JNLP 2023):
Ying Zhang, Hidetaka Kamigaito, Tatsuya Aoki, Hiroya Takamura, and Manabu Okumura. “Generic Mechanism for Reducing Repetitions in Encoder-Decoder Models,” Journal of Natural Language Processing (JNLP 2023), Volume 30, Issue 2. (Early Access)
- PyTorch version == 1.9.1
- Python version >= 3.6
- omegaconf version >= 2.0.6
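If you are setting up a fresh environment, the pinned dependencies can be installed with pip before building our fairseq fork. A minimal sketch, assuming a CPU or matching-CUDA wheel of PyTorch 1.9.1 is available for your platform:

```bash
# Install the pinned dependencies (pick the torch build matching your CUDA setup).
pip install torch==1.9.1
pip install "omegaconf>=2.0.6"
```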
```bash
git clone https://github.com/zhangying9128/RRM.git
```
Please use our modified fairseq:

```bash
cd RRM/fairseq/
pip install --editable .
cd ..
```
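To confirm that the editable install resolved correctly, a quick sanity check (run it from a directory other than `RRM/fairseq/`; the exact version string depends on the fairseq release this fork is based on):

```bash
python -c "import fairseq; print(fairseq.__version__)"
```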
We used the source code of Fonollosa et al. (2019) to preprocess the IWSLT14 De-En dataset and to train LocalJoint or Transformer with RRM. Please check IWSLT for more details.
We used the source code of Ott et al. (2019) to preprocess the WMT14 En-De dataset and to train Transformer with RRM. Please check WMT for more details.
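Both machine translation setups follow the usual fairseq workflow. The sketch below shows the standard `fairseq-preprocess`/`fairseq-train` recipe for WMT14 En-De as orientation only; the exact commands, hyperparameters, and the RRM-specific options of our modified fairseq are documented in the IWSLT and WMT directories:

```bash
# Binarize a tokenized WMT14 En-De corpus (standard fairseq preprocessing).
fairseq-preprocess --source-lang en --target-lang de \
    --trainpref train --validpref valid --testpref test \
    --destdir data-bin/wmt14_en_de

# Train a Transformer with the common fairseq WMT14 settings; RRM-specific
# flags come from our modified fairseq and are not shown here.
fairseq-train data-bin/wmt14_en_de \
    --arch transformer_wmt_en_de --share-all-embeddings \
    --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
    --lr 0.0007 --lr-scheduler inverse_sqrt --warmup-updates 4000 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --max-tokens 3584
```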
We used the source code of Wolf et al. (2019) to preprocess the PERSONACHAT dataset and train TransferTransfo and BART with RRM. Please check PERSONACHAT for more details.
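For dialogue, the Wolf et al. (2019) code is driven by a training script; the invocation below is only a hypothetical illustration (the script name, flags, and dataset file are assumptions based on their public repository, and our RRM variant may differ; see the PERSONACHAT directory for the real commands):

```bash
# Hypothetical call to the Wolf et al. (2019) training script; the actual
# entry point and options for RRM are documented in the PERSONACHAT directory.
python train.py --dataset_path personachat_self_original.json \
    --model_checkpoint openai-gpt
```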
Please cite as:
```
@inproceedings{zhang-etal-2021-generic,
    title = "Generic Mechanism for Reducing Repetitions in Encoder-Decoder Models",
    author = "Zhang, Ying  and
      Kamigaito, Hidetaka  and
      Aoki, Tatsuya  and
      Takamura, Hiroya  and
      Okumura, Manabu",
    booktitle = "Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)",
    month = sep,
    year = "2021",
    address = "Held Online",
    publisher = "INCOMA Ltd.",
    url = "https://aclanthology.org/2021.ranlp-1.180",
    pages = "1606--1615",
    abstract = "Encoder-decoder models have been commonly used for many tasks such as machine translation and response generation. As previous research reported, these models suffer from generating redundant repetition. In this research, we propose a new mechanism for encoder-decoder models that estimates the semantic difference of a source sentence before and after being fed into the encoder-decoder model to capture the consistency between two sides. This mechanism helps reduce repeatedly generated tokens for a variety of tasks. Evaluation results on publicly available machine translation and response generation datasets demonstrate the effectiveness of our proposal.",
}
```