KEST

Official code for the IJCAI 2023 paper: KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation. The full version of the paper is available on arXiv.

Introduction

Self-training (ST) has come to fruition in language understanding tasks by producing pseudo labels, which reduces the labeling bottleneck of language model fine-tuning. Nevertheless, in facilitating semi-supervised controllable language generation, ST faces two key challenges. First, augmented by self-generated pseudo text, generation models tend to over-exploit the previously learned text distribution, suffering from mode collapse and poor generation diversity. Second, generating pseudo text in each iteration is time-consuming, severely decelerating the training process. In this work, we propose KEST, a novel and efficient self-training framework to handle these problems. KEST utilizes a kernel-based loss, rather than standard cross entropy, to learn from the soft pseudo text produced by a shared non-autoregressive generator. We demonstrate both theoretically and empirically that, with the kernel-based loss, KEST can benefit from more diverse pseudo text. To accelerate the training process, we add a non-autoregressive generation (NAG) module that generates the pseudo text with reduced decoding time.
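
To give a rough feel for what a kernel-based distance between two sets of samples looks like, here is a minimal sketch of a squared Maximum Mean Discrepancy (MMD) estimate with a Gaussian kernel. This is an illustration only: the choice of MMD, the Gaussian kernel, the `bandwidth` parameter, and the function names are assumptions made for this example and are not taken from the code in this repository.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two samples.

    xs and ys are lists of vectors (e.g. hypothetical sentence embeddings).
    The estimate is zero when both samples are identical and grows as the
    two empirical distributions move apart.
    """
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Toy usage with made-up 2-D points standing in for real and pseudo text.
real = [[0.0, 0.0], [1.0, 0.0]]
pseudo = [[5.0, 5.0], [6.0, 5.0]]
print(mmd_squared(real, real))    # identical samples
print(mmd_squared(real, pseudo))  # well-separated samples
```

Unlike token-level cross entropy, a distance of this kind compares whole sample sets, which is one intuition for why it may tolerate more diverse pseudo text.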

Repository

KEST
├── data
├── corpus
├── codes
├── (unilm)
└── (your evaluation classifier)

Data

You can download the IMDb and AGNews training data from Hugging Face. The Jigsaw dataset can be found on Kaggle.

We use UniLM1-base-cased as our base model. Please download it from the following link.

Training

This code can be run on a single GPU. A script that works on multiple GPUs is in progress.

Simply run train.py to replicate our experimental results.

You are free to play with the hyperparameters and settings in config.py.

Evaluation/Inference

evaluation.py evaluates the classification performance of the trained model (F1) and the generalizability of its generation (Model PPL).
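
For readers unfamiliar with perplexity (PPL), the sketch below shows the standard definition: the exponential of the negative mean per-token log-probability. The function name and input format are assumptions for illustration; evaluation.py may compute and aggregate the value differently.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from natural-log probabilities of each token.

    token_log_probs is a hypothetical list of log p(token | context)
    values, as a language model would assign to a held-out sequence.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that assigns probability 0.25 to every token behaves like a
# uniform choice among four options at each step.
print(perplexity([math.log(0.25)] * 4))
```

Lower values mean the model finds the text more predictable.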

generation.py generates samples for a given prompt and evaluates their fluency (Output PPL), classification accuracy, and diversity (Dist, Self-BLEU).
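
As a rough illustration of the Dist diversity metric, the sketch below computes Dist-n as the ratio of unique n-grams to total n-grams over a set of generated texts. Whitespace tokenization and the function name `distinct_n` are assumptions made for this example; generation.py may tokenize and aggregate differently.

```python
def distinct_n(texts, n):
    """Ratio of unique n-grams to total n-grams across all texts.

    Returns 0.0 when no n-grams can be formed (e.g. empty input).
    """
    unique_ngrams = set()
    total = 0
    for text in texts:
        tokens = text.split()  # assumption: simple whitespace tokenization
        for i in range(len(tokens) - n + 1):
            unique_ngrams.add(tuple(tokens[i:i + n]))
            total += 1
    return len(unique_ngrams) / total if total else 0.0


# Hypothetical generated samples: the repetitive one scores lower.
repetitive = ["good good good good"]
varied = ["a truly moving film"]
print(distinct_n(repetitive, 1), distinct_n(varied, 1))
```

Higher Dist-n indicates more varied output. Self-BLEU works in the opposite direction: each sample is scored with BLEU against the other samples, so a lower value suggests more diversity.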

License

This repository is licensed under the MIT License.

Citation

If you find our work useful, please consider citing our IJCAI paper:

@inproceedings{feng-et-al-2023-kest,
  title     = {KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation},
  author    = {Feng, Yuxi and Yi, Xiaoyuan and Lakshmanan, Laks V.S. and Xie, Xing},
  booktitle = {Proceedings of the Thirty-Second International Joint Conference on
               Artificial Intelligence, {IJCAI-23}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {Edith Elkind},
  pages     = {5049--5057},
  year      = {2023},
  month     = {8},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2023/561},
  url       = {https://doi.org/10.24963/ijcai.2023/561},
}
