
Zero-shot Dialog Generation

Datasets

Multilingual versions of the DailyDialog and DSTC7 datasets are provided in seven languages: English, Chinese, German, Russian, Spanish, French, and Italian. You can download the datasets from the following link:

MultilingualDatasets

Models

We provide hred, vhcr, vhred, hran, transformer, and HTransformer baselines. Our manuscript reports experimental results only for hred, vhred, transformer, and HTransformer.

This code uses the mBERT tokenizer

How to run

Prepare the datasets

 python run.py \
 	--corpus DailyDialog \
 	--do_prepare

Train a model with multilingual learning (set the --model parameter to switch between models)

 python run.py \
 	--corpus DailyDialog \
 	--do_train \
 	--hier true \
 	--model hred \
 	--bidirectional true

Evaluate

 python run.py \
 	--corpus DailyDialog \
 	--do_eval \
 	--hier true \
 	--model hred \
 	--bidirectional true
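
The prepare, train, and evaluate commands above can also be assembled from a small script, which may be convenient when switching between models. The following is a minimal sketch, not part of this repository: `build_command` and `pipeline` are hypothetical helpers, the flags are copied from the commands above, and it is an assumption that every listed model accepts the same `--hier true` and `--bidirectional true` settings shown for hred, and that the model names are passed exactly as written in the Models section.

```python
import shlex

# Model names as listed in the Models section (exact spelling is an assumption).
MODELS = ["hred", "vhcr", "vhred", "hran", "transformer", "HTransformer"]

STEPS = ("prepare", "train", "eval")


def build_command(step, corpus="DailyDialog", model=None):
    """Return the argument list for one run.py invocation.

    step is one of "prepare", "train", or "eval".
    """
    if step not in STEPS:
        raise ValueError(f"unknown step: {step}")
    cmd = ["python", "run.py", "--corpus", corpus, f"--do_{step}"]
    if step != "prepare":
        if model is None:
            raise ValueError("a model name is required for train and eval")
        # Assumption: the same flags shown for hred apply to the other models.
        cmd += ["--hier", "true", "--model", model, "--bidirectional", "true"]
    return cmd


def pipeline(model, corpus="DailyDialog"):
    """Return the prepare, train, and eval commands for one model, in order."""
    return [build_command(step, corpus, model) for step in STEPS]


# Print the commands instead of running them, so they can be inspected first.
for cmd in pipeline("hred"):
    print(shlex.join(cmd))
```

To execute a command rather than print it, each list can be passed to `subprocess.run(cmd, check=True)` from the standard library, run from the repository root.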

Case Study

A case study is provided in Table 9 to demonstrate the value of the augmented data. Without data augmentation, the responses of HRED and VHRED contain context-independent information. Specifically, "like my experience and i have a good idea" in HRED and "i like sports" in VHRED are context-independent. With data augmentation, the responses generated by HRED and VHRED are more informative and more coherent. Through multilingual data augmentation, models can draw on cross-lingual knowledge to generate more informative and coherent responses.
