
Cross-lingual Intermediate Fine-tuning improves Dialogue State Tracking (EMNLP 2021)

The arXiv version of the paper is available here.

Recent progress in task-oriented neural dialogue systems is largely focused on a handful of languages, as annotating training data is tedious and expensive. Machine translation has been used to make systems multilingual, but this can introduce a pipeline of errors. Another promising solution is cross-lingual transfer learning through pretrained multilingual models. Existing methods train multilingual models with additional code-mixed task data or refine the cross-lingual representations through parallel ontologies. In this work, we enhance the transfer learning process by intermediate fine-tuning of pretrained multilingual models, where the multilingual models are fine-tuned on different but related data and/or tasks. Specifically, we use parallel and conversational movie-subtitle datasets to design cross-lingual intermediate tasks suited to downstream dialogue tasks. We use only 200K lines of parallel data for intermediate fine-tuning, which is already available for 1782 language pairs. We test our approach on the cross-lingual dialogue state tracking task with the parallel MultiWoZ (English→Chinese, Chinese→English) and Multilingual WoZ (English→German, English→Italian) datasets. We achieve impressive improvements (> 20% in joint goal accuracy) over the vanilla baseline on the parallel MultiWoZ dataset with only 10% of the target-language task data and on the Multilingual WoZ dataset in the zero-shot setup.
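
To make the idea concrete, below is a minimal sketch of one plausible instance of such a cross-lingual intermediate task: masked language modeling over concatenated parallel sentence pairs (in the spirit of translation language modeling). The file paths, base model, and hyperparameters are illustrative assumptions, not the repository's actual configuration; see the intermediate fine-tuning directory listed below for the real training code.

```python
# A minimal sketch (not the repository's training script) of intermediate
# fine-tuning with a masked-LM objective over parallel sentence pairs.
# File paths and hyperparameters are illustrative assumptions.
from datasets import Dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")

# Hypothetical line-aligned parallel subtitle files (English / Chinese).
with open("subtitles.en") as f_en, open("subtitles.zh") as f_zh:
    pairs = [{"src": en.strip(), "tgt": zh.strip()} for en, zh in zip(f_en, f_zh)]

def encode(example):
    # Encode the two sides as one sequence pair, so masked tokens can be
    # recovered from context in either language.
    return tokenizer(example["src"], example["tgt"], truncation=True, max_length=128)

dataset = Dataset.from_list(pairs).map(encode, remove_columns=["src", "tgt"])

# Randomly mask 15% of tokens; the collator also handles padding.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="tlm-intermediate", num_train_epochs=1),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
```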

Update:

21/02/2022 - The correct train.json has been uploaded for the MultiWoZ experiments as stated in the README. Thanks to Yuxiang for pointing it out!

Repository organization

- `intermediate_finetuning`: methods under Section 3 of the paper.

- `multilingual_woz`: redirection to the original repository; experiments under Table 3.

- `multiwoz_sumbt`: cleaned version of the SUMBT model released by ConvLab; experiments under Table 2.

The intermediate models are released on 🤗 Hugging Face at https://huggingface.co/nikitam/.
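
A sketch of loading one of the released intermediate models with 🤗 Transformers is shown below. The model identifier is a placeholder; browse https://huggingface.co/nikitam/ for the actual checkpoint names.

```python
# Minimal sketch of loading a released intermediate model.
from transformers import AutoModel, AutoTokenizer

# Placeholder identifier: replace with a real checkpoint name from
# https://huggingface.co/nikitam/ before running.
model_name = "nikitam/<intermediate-model-name>"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
```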
