Implementation of a neural dialogue generator model with the pretrained XLNet architecture Yang et al. (2019) on the DailyDialog dataset Li et al. (2017). Top-k sampling Fan et al. (2018) and nucleus decoding Holtzman et al. (2019) are available as decoding techniques. Work is currently in progress on fine-tuning the input tensors (role embeddings) for the XLNet model.
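As a rough illustration of these two decoding strategies, the sketch below filters a logits vector before sampling; the function and parameter names are illustrative and not the repo's actual API.

```python
import torch

def filter_logits(logits, top_k=0, top_p=0.0, filter_value=-float("inf")):
    """Restricts a 1-D logits vector to the top-k tokens and/or to the
    smallest set of tokens whose cumulative probability exceeds top_p."""
    if top_k > 0:
        # mask every token scoring below the k-th highest logit
        kth_value = torch.topk(logits, top_k).values[-1]
        logits[logits < kth_value] = filter_value
    if top_p > 0.0:
        sorted_logits, sorted_indices = torch.sort(logits, descending=True)
        cumulative_probs = torch.softmax(sorted_logits, dim=-1).cumsum(dim=-1)
        # mask tokens once cumulative probability passes top_p, shifted
        # right so the first token above the threshold still survives
        mask = cumulative_probs > top_p
        mask[1:] = mask[:-1].clone()
        mask[0] = False
        logits[sorted_indices[mask]] = filter_value
    return logits

# sampling the next token id from the filtered distribution
logits = torch.randn(100)
probs = torch.softmax(filter_logits(logits, top_k=10, top_p=0.9), dim=-1)
next_token = torch.multinomial(probs, num_samples=1)
```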
The model uses mixed precision training from nvidia/apex. Note that apex is not required and is only used if it is available. For installation guide of this module see the official instructions.
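The optional dependency is typically handled with a guarded import along the following lines; this is a minimal sketch of the pattern (assuming a CUDA device and a toy model), not the exact code in `train.py`.

```python
import torch
from torch import nn

try:
    from apex import amp
    APEX_AVAILABLE = True
except ImportError:
    APEX_AVAILABLE = False

# toy model and optimizer standing in for the dialogue model
model = nn.Linear(8, 2).cuda()
optimizer = torch.optim.Adam(model.parameters())

if APEX_AVAILABLE:
    # patch the model and optimizer for mixed-precision training
    model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

loss = model(torch.randn(4, 8).cuda()).sum()
if APEX_AVAILABLE:
    # scale the loss to avoid fp16 gradient underflow
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
else:
    loss.backward()
optimizer.step()
```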
The model can be trained with the following command. Note that `<data_dir>` and `<model_dir>` are optional, as they are provided by default. Training with different hyperparameters can be done by running the `train.py` script and passing the desired options as command-line arguments.
./run.sh "train" "<data_dir>" "<model_dir>"
An interactive evaluation mode is available for the trained model by switching the `train` flag to `eval`.
./run.sh "eval" "<data_dir>" "<model_dir>"
Training the model is fast and easy on Google Colaboratory: create a new colab file in your Google Drive and run it with the following snippets. It is important to set the runtime type to GPU with a Tesla T4 unit, as it can fully leverage mixed-precision training and is much faster than the older K80 version. You can check the currently assigned GPU type by running the following line in a cell of your colab.
!nvidia-smi
Copy and run the following code in a cell of your colab file to install the model.
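# cloning the repository and upgrading pip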
!git clone https://github.com/bme-chatbots/dialogue-generation.git
!python -m pip install --upgrade pip
# installing apex
!git clone https://github.com/NVIDIA/apex
!cd apex; pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .
# building the cython code
!cd dialogue-generation; python setup.py build_ext --inplace
# installing the required packages
!cd dialogue-generation; pip install -r requirements.txt
The training loss and accuracy are logged with TensorboardX, which can also be tracked in the colab file if the code below is run before the training cell.
%load_ext tensorboard
%tensorboard --logdir "dialogue-generation/model"
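For reference, scalars reach this dashboard through the tensorboardX `SummaryWriter` API, roughly as in the sketch below; the repo's actual logging hooks in `train.py` may differ.

```python
from tensorboardX import SummaryWriter

# write scalars into the directory watched by the dashboard above
writer = SummaryWriter("dialogue-generation/model")
for step in range(100):
    loss = 1.0 / (step + 1)  # placeholder value
    writer.add_scalar("loss", loss, step)
    writer.add_scalar("accuracy", 1.0 - loss, step)
writer.close()
```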
The model can then be trained by simply running the `run.sh` script with the default parameters.
!./dialogue-generation/run.sh
Coming soon