KEST

Official script of the IJCAI 2023 paper KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation. The full version of our paper can be found on arXiv.

Introduction

Self-training (ST) has proven effective in language understanding tasks: by producing pseudo labels, it reduces the labeling bottleneck of language model fine-tuning. Nevertheless, when applied to semi-supervised controllable language generation, ST faces two key challenges. First, augmented with self-generated pseudo text, generation models tend to over-exploit the previously learned text distribution, suffering from mode collapse and poor generation diversity. Second, generating pseudo text in each iteration is time-consuming, severely decelerating the training process. In this work, we propose KEST, a novel and efficient self-training framework that addresses both problems. KEST utilizes a kernel-based loss, rather than standard cross entropy, to learn from the soft pseudo text produced by a shared non-autoregressive generator. We demonstrate both theoretically and empirically that the kernel-based loss allows KEST to benefit from more diverse pseudo text. To accelerate the training process, we add a non-autoregressive generation (NAG) module to generate pseudo text, reducing decoding time.
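As an illustration of the kernel-distance idea (not the exact loss implemented in codes/; see the paper for the actual formulation), the sketch below computes a squared Maximum Mean Discrepancy (MMD) between feature vectors of model outputs and of soft pseudo text. The function names, the Gaussian kernel choice, and the expected-embedding representation are assumptions made for illustration only.

```python
import torch

def gaussian_kernel(x, y, sigma=1.0):
    # x: (n, d), y: (m, d) -> (n, m) RBF kernel matrix
    sq_dist = torch.cdist(x, y) ** 2
    return torch.exp(-sq_dist / (2 * sigma ** 2))

def mmd_loss(model_feats, pseudo_feats, sigma=1.0):
    """Squared Maximum Mean Discrepancy between two feature sets.

    model_feats:  (n, d) features of text produced by the current model
    pseudo_feats: (m, d) features of the soft pseudo text
    """
    k_xx = gaussian_kernel(model_feats, model_feats, sigma).mean()
    k_yy = gaussian_kernel(pseudo_feats, pseudo_feats, sigma).mean()
    k_xy = gaussian_kernel(model_feats, pseudo_feats, sigma).mean()
    return k_xx + k_yy - 2 * k_xy

def soft_embed(token_probs, embedding):
    # token_probs: (batch, seq_len, vocab), embedding: (vocab, d)
    # One way to feed soft pseudo text into such a loss is as
    # expected embeddings under the token distributions.
    return token_probs @ embedding
```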

Repository

KEST
├── data
├── corpus
├── codes
├── (unilm)
└── (your evaluation classifier)

Data

You can download the training data for IMDb and AGNews from Hugging Face. The Jigsaw dataset can be found on Kaggle.
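For reference, IMDb and AGNews can be loaded directly with the Hugging Face datasets library; the exact preprocessing expected under data/ may differ, so check codes/ for details.

```python
from datasets import load_dataset

imdb = load_dataset("imdb")        # splits: train / test
agnews = load_dataset("ag_news")   # splits: train / test

# The Jigsaw toxicity data is not on the Hub and must be downloaded from Kaggle.
print(imdb["train"][0]["label"], imdb["train"][0]["text"][:80])
```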

We use UniLM1-base-cased as our base model. Please download it from the following link.

Training

This code can be run on a single GPU. A multi-GPU script is in progress.

Simply run train.py to replicate our experimental results.

You are free to play with the hyperparameters and settings in config.py.

Evaluation/Inference

evaluation.py evaluates the classification performance of the trained model (F1) and the generalizability of generation (Model PPL).

generation.py generates samples from a given prompt and evaluates fluency (Output PPL), classification, and diversity (Dist, Self-BLEU).
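For reference, the diversity metrics Dist-n and Self-BLEU are commonly computed as in the sketch below; this is a minimal illustration, and generation.py may use a different tokenization or smoothing scheme.

```python
from collections import Counter
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def dist_n(samples, n):
    """Distinct-n: unique n-grams / total n-grams over all generated samples."""
    ngrams, total = Counter(), 0
    for tokens in samples:                      # each sample is a list of tokens
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
            total += 1
    return len(ngrams) / max(total, 1)

def self_bleu(samples, max_n=4):
    """Average BLEU of each sample against all others (lower = more diverse)."""
    smooth = SmoothingFunction().method1
    weights = tuple(1.0 / max_n for _ in range(max_n))
    scores = []
    for i, hyp in enumerate(samples):
        refs = samples[:i] + samples[i + 1:]
        scores.append(sentence_bleu(refs, hyp, weights=weights,
                                    smoothing_function=smooth))
    return sum(scores) / len(scores)
```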

License

This repository is licensed under the MIT License.

Citation

If you find our work useful, please consider citing our IJCAI paper:

@inproceedings{feng-et-al-2023-kest,
  title     = {KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation},
  author    = {Feng, Yuxi and Yi, Xiaoyuan and Lakshmanan, Laks V.S. and Xie, Xing},
  booktitle = {Proceedings of the Thirty-Second International Joint Conference on
               Artificial Intelligence, {IJCAI-23}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {Edith Elkind},
  pages     = {5049--5057},
  year      = {2023},
  month     = {8},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2023/561},
  url       = {https://doi.org/10.24963/ijcai.2023/561},
}
