This repository contains the final version of my Dialogue Generation project, built during my internship at Saarthi.ai. The intermediate commits have been removed for privacy reasons.
The basic file structure of this repo is almost the same as the original repository by @ishalyminov, excluding some scripts.
- Added class `DSTCZslCorpus` in `utils/corpora.py` for corpus creation and vocabulary generation of the DSTC dataset.
- Added class `ZslDSTCDataLoader` in `utils/data_loaders.py` as the DataLoader for the DSTC dataset.
The data must be present in the main directory. For the DSTC8 dataset, a folder named `dstc` is already present: download the dataset and unzip it there, so that the `train`, `dev`, and `test` folders from the dataset sit inside `dstc`.
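
The layout described above can be checked before training. The following is a minimal sketch, assuming the three split folders sit directly under `dstc/`; the helper name `missing_dstc_splits` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Split folders the README says must be present after unzipping the dataset.
EXPECTED_SPLITS = ("train", "dev", "test")


def missing_dstc_splits(data_dir="dstc"):
    """Return the expected split folders that are absent from data_dir.

    Hypothetical helper for illustration only; it is not part of this repo.
    """
    root = Path(data_dir)
    return [name for name in EXPECTED_SPLITS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dstc_splits("dstc")
    if missing:
        print("Missing folders under dstc/:", ", ".join(missing))
    else:
        print("dstc/ contains train, dev and test.")
```

Run it from the main directory of the repository; an empty result means all three folders were found.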
The `laed_features` are included in the repository. They were generated with a trained model. The `laed_features` folder contains another folder named `features_dstc`, which holds the representations (specifically for DSTC8) generated using the LAED architecture. They are included because they are used as input for training.

    python train_fsdg.py \
        DSTCZslCorpus \
        --data_dir dstc/ \
        --laed_z_folders laed_features/features_dstc/ \
        --black_domains $domain \
        --black_ratio 0.9 \
        --action_match False \
        --target_example_cnt 0 \
        --random_seed $rnd
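
Because the command takes `$domain` and `$rnd` as variables, it is convenient to generate it for several seeds. The sketch below only assembles and prints the command lines; it does not run training. The function name `build_train_command`, the domain `example_domain`, and the seed values are placeholders chosen for illustration, not values from this repository.

```python
import shlex  # shlex.join requires Python 3.8+


def build_train_command(domain, seed):
    """Assemble the train_fsdg.py argument list for one domain and seed.

    The flags and fixed values mirror the command shown above; only the
    values for --black_domains and --random_seed vary.
    """
    return [
        "python", "train_fsdg.py",
        "DSTCZslCorpus",
        "--data_dir", "dstc/",
        "--laed_z_folders", "laed_features/features_dstc/",
        "--black_domains", domain,
        "--black_ratio", "0.9",
        "--action_match", "False",
        "--target_example_cnt", "0",
        "--random_seed", str(seed),
    ]


if __name__ == "__main__":
    # Placeholder domain and seeds; substitute the ones you actually use.
    for seed in (1, 2):
        print(shlex.join(build_train_command("example_domain", seed)))
```

Each printed line can be pasted into a shell, or the list can be passed to `subprocess.run` if you prefer to launch the runs from Python.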