PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems

This is the repository for the paper "PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems". PICK is a generation re-scoring framework that addresses key challenges in knowledge-grounded dialogue systems, such as hallucination and lack of coherence. It empowers models to generate faithful and relevant responses without requiring additional labeled data or model tuning. Further details can be found in the paper.
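To make the idea of re-scoring generated candidates concrete, here is a minimal, generic sketch. It is not the PICK scoring function from the paper: the token-overlap scorers, the function names (`overlap`, `rescore`), and the weights are hypothetical placeholders standing in for real faithfulness and relevance metrics. It only illustrates the general pattern of scoring several candidate responses and keeping the best one, with no model tuning involved.

```python
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lowercased whitespace tokens of a string (toy tokenizer)."""
    return set(text.lower().split())


def overlap(candidate: str, reference: str) -> float:
    """Fraction of candidate tokens that also appear in the reference."""
    cand = tokens(candidate)
    if not cand:
        return 0.0
    return len(cand & tokens(reference)) / len(cand)


def rescore(
    candidates: List[str],
    knowledge: str,
    history: str,
    w_faith: float = 0.5,  # hypothetical weight for the faithfulness proxy
    w_rel: float = 0.5,    # hypothetical weight for the relevance proxy
) -> str:
    """Return the candidate with the highest combined toy score.

    Faithfulness is approximated by overlap with the knowledge text,
    relevance by overlap with the dialogue history. Both are stand-ins
    for whatever metrics a real system would use.
    """
    scored = [
        (w_faith * overlap(c, knowledge) + w_rel * overlap(c, history), c)
        for c in candidates
    ]
    return max(scored, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    best = rescore(
        candidates=["the eiffel tower is in paris", "i like pizza a lot"],
        knowledge="the eiffel tower is a landmark in paris",
        history="where is the eiffel tower",
    )
    print(best)
```

In this toy run the first candidate wins because it shares more tokens with both the knowledge text and the dialogue history. For the scoring actually used by PICK, refer to the paper and the code in this repository.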

Steps:

1. Make sure all requirements are installed, or install them with: pip install -r requirements.txt
2. Prepare the dataset:
   - Download the wizard_of_wikipedia dataset:
     - wget -P data_pool/wizard_of_wikipedia http://parl.ai/downloads/wizard_of_wikipedia/wizard_of_wikipedia.tgz
     - tar -xvzf data_pool/wizard_of_wikipedia/wizard_of_wikipedia.tgz -C data_pool/wizard_of_wikipedia/
     - rm -rf data_pool/wizard_of_wikipedia/wizard_of_wikipedia.tgz
3. Prepare caffeinated_pandas, which is used for parallelization:
   - Clone the caffeinated-pandas repository into your local copy of this repository and rename it:
     - git clone https://github.com/scollay/caffeinated-pandas.git
     - mv caffeinated-pandas caffeinated_pandas
4. Fine-tune your model using one of the run_ft_*.sh scripts.
5. Run inference with your model using one of the run_eval_*.sh scripts.
6. Score your generations further with other metrics, e.g. FED, by cloning the metric's repository to your local machine.
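The setup steps above can be sanity-checked before fine-tuning. The sketch below is not part of this repository: the helper name `missing_setup_paths` and the `EXPECTED_PATHS` list are hypothetical, and the list only contains locations that the steps above mention explicitly. It reports which of those locations are still missing under the repository root.

```python
from pathlib import Path
from typing import List

# Locations named in the setup steps, relative to the repository root.
# This list is an assumption for illustration; extend it as needed.
EXPECTED_PATHS = [
    "requirements.txt",
    "data_pool/wizard_of_wikipedia",
    "caffeinated_pandas",
]


def missing_setup_paths(repo_root: str = ".") -> List[str]:
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_setup_paths()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Setup looks complete.")
```

Run it from the repository root. An empty result only means the directories exist; it does not verify that the dataset archive was extracted completely.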

Citation

This work is published at AACL-IJCNLP 2023, and you can find the details in the paper (the link to the AACL 2023 paper is not yet available). Please cite our work if you find it useful.

@inproceedings{wilie2023pick,
  author    = {Wilie, Bryan  and  Xu, Yan  and  Chung, Willy  and  
              Cahyawijaya, Samuel  and  Lovenia, Holy  and  Fung, Pascale},
  title     = {PICK: Polished \& Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems},
  booktitle = {Proceedings of the 13th International Joint Conference on Natural Language Processing 
                 and the 3rd Conference of the Asia-Pacific Chapter of 
                 the Association for Computational Linguistics},
  month     = {November},
  year      = {2023},
  address   = {Nusa Dua, Bali},
  publisher = {Association for Computational Linguistics},
  pages     = {980--995}
}
