Sequential Knowledge Transformer (SKT)

[Figure: SKT model]

This project hosts the code and dataset for our ICLR 2020 paper, Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue.

TL;DR: We propose a novel model named sequential knowledge transformer (SKT). To the best of our knowledge, our model is the first attempt to leverage a sequential latent variable model for knowledge selection, which subsequently improves knowledge-grounded chit-chat.
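As a rough, illustrative sketch of the idea (not the repository's implementation; all names below are hypothetical): at each turn the model holds a prior distribution over the knowledge pool conditioned on the dialogue so far, while a posterior used only during training additionally conditions on the gold response. The training loss combines the response likelihood, a knowledge selection loss, and the KL between posterior and prior, which correspond to the knowledge_loss, kl_loss, and total_loss values printed by inference.py below.

import numpy as np

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def select_knowledge(dialogue_state, knowledge_pool, gold_response=None):
    """One turn of latent knowledge selection (toy, illustrative).

    dialogue_state: vector summarizing the history and past selections.
    knowledge_pool: matrix of candidate knowledge sentence embeddings.
    gold_response:  embedding of the gold response (training only).
    """
    # Prior over candidates, conditioned on the dialogue so far.
    prior_logits = knowledge_pool @ dialogue_state
    prior = softmax(prior_logits)
    if gold_response is None:
        # Inference: ground the decoder on the most likely candidate.
        return int(prior.argmax()), 0.0
    # Posterior additionally conditions on the gold response.
    posterior = softmax(prior_logits + knowledge_pool @ gold_response)
    # KL(posterior || prior): the sequential latent-variable term.
    kl = float(np.sum(posterior * (np.log(posterior + 1e-9)
                                   - np.log(prior + 1e-9))))
    # Sample the knowledge sentence the response decoder conditions on.
    selected = int(np.random.choice(len(knowledge_pool), p=posterior))
    return selected, kl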

Please contact Byeongchang Kim if you have any questions.

Reference

If you use this code or dataset as part of any published research, please cite the following paper,

@inproceedings{Kim:2020:ICLR,
    title="{Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue}",
    author={Kim, Byeongchang and Ahn, Jaewoo and Kim, Gunhee},
    booktitle={ICLR},
    year=2020
}

System Requirements

  • Python 3.6
  • TensorFlow 2.0
  • CUDA 10.0-compatible GPU with at least 12GB of memory
  • see requirements.yml for more details
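
The environment can presumably be created with conda, a guess based on the .yml extension (the environment name below is hypothetical; check requirements.yml for the actual name):

conda env create -f requirements.yml
conda activate skt  # "skt" is a guess; use the name defined in requirements.yml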

Running Experiments

Wizard-of-Wikipedia

To train the model from scratch,

python train.py --cfg ymls/default.yml --gpus 0,1 SequentialKnowledgeTransformer

# To run in eager mode
python train.py --cfg ymls/default.yml --gpus 0,1 --enable_function False SequentialKnowledgeTransformer
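Passing --enable_function False presumably bypasses tf.function graph compilation so that ops run eagerly, which is slower but easier to debug; this is inferred from the flag name, so check train.py for the exact behavior.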

To run our pretrained model,

(pretrained checkpoints are downloaded automatically, or you can download them manually here)

python inference.py --cfg ymls/default.yml --gpus 0,1 --test_mode wow SequentialKnowledgeTransformer

# This will show the following results
seen
{'accuracy': 0.27305699481865287,
 'kl_loss': 0.3053756,
 'knowledge_loss': 1.7310758,
 'perplexity': 53.27382,
 'rouge1': 0.19239063597262404,
 'rouge2': 0.06829999978899365,
 'rougeL': 0.1738224486787311,
 'total_loss': 6.0118966}
unseen
{'accuracy': 0.18561958184599694,
 'kl_loss': 0.27512234,
 'knowledge_loss': 2.349341,
 'perplexity': 82.65279,
 'rouge1': 0.16114443772189488,
 'rouge2': 0.04277752138282203,
 'rougeL': 0.14518138000861658,
 'total_loss': 7.039112}
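Here, seen and unseen denote the standard Wizard-of-Wikipedia test splits, whose dialogue topics respectively do and do not appear in the training set.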

Holl-E

To train the model from scratch,

python train.py --cfg ymls/holle.yml --gpus 0,1 SequentialKnowledgeTransformer

To run our pretrained model,

(pretrained checkpoints are downloaded automatically, or you can download them manually here or here)

python inference.py --cfg ymls/holle.yml --gpus 0,1 --test_mode holle_1 SequentialKnowledgeTransformer

# This will show the following results
{'accuracy': 0.3037037037037037,
 'accuracy_multi_responses': 0.4033670033670034,
 'kl_loss': 0.36722404,
 'knowledge_loss': 1.3605422,
 'perplexity': 51.95605,
 'perplexity_multi_responses': 29.779757,
 'rouge1': 0.294273363798098,
 'rouge1_multi_responses': 0.3620479996911834,
 'rouge2': 0.22867428360364725,
 'rouge2_multi_responses': 0.2942280954677095,
 'rougeL': 0.28614673266443935,
 'rougeL_multi_responses': 0.3524777390233543,
 'total_loss': 5.678165}
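The *_multi_responses metrics presumably evaluate against the multiple gold responses collected for part of the Holl-E test set, scoring each example against its best-matching reference; see inference.py for the exact computation.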

# Or you can try the other checkpoint
python inference.py --cfg ymls/holle.yml --gpus 0,1 --test_mode holle_2 SequentialKnowledgeTransformer

Interactive Demo

You can chat with our SKT agent (trained on the Wizard-of-Wikipedia dataset) using the following command,

python interactive.py --cfg ymls/default.yml --gpus 0 --test_mode wow SequentialKnowledgeTransformer
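Note that --test_mode wow is passed, so the demo presumably reuses the Wizard-of-Wikipedia checkpoint from the inference step above, downloading it automatically if it is not present.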

Acknowledgement

We thank Hyunwoo Kim, Chris Dongjoo Kim, Soochan Lee, and Junsoo Ha for their helpful comments.

This work was supported by SK T-Brain corporation and Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No.2019-0-01082, SW StarLab).

License

See LICENSE.md.
