Dialogue generation

General purpose conversational agent with pretrained XLNet in PyTorch.

Implementation of a neural dialogue generation model based on the pretrained XLNet architecture (Yang et al., 2019), trained on the DailyDialog dataset (Li et al., 2017). Top-k sampling (Fan et al., 2018) and nucleus sampling (Holtzman et al., 2019) are available as decoding techniques. Fine-tuning of the input tensors (role embeddings) for the XLNet model is currently in progress.
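To illustrate the two decoding techniques mentioned above, here is a minimal, dependency-free sketch of top-k and nucleus (top-p) filtering over a logit vector. This is an illustration of the general algorithms, not the repository's actual implementation:

```python
import math

def top_k_top_p_filter(logits, top_k=0, top_p=0.0):
    """Mask logits outside the top-k set and/or the nucleus whose
    cumulative probability first reaches top_p. Masked positions are
    set to -inf so they receive zero probability after softmax."""
    # softmax over the raw logits
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # token indices sorted by probability, most likely first
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    keep = set(order)
    if top_k > 0:
        # keep only the k most likely tokens
        keep &= set(order[:top_k])
    if top_p > 0.0:
        # keep the smallest prefix whose cumulative mass reaches top_p
        cum, nucleus = 0.0, []
        for i in order:
            nucleus.append(i)
            cum += probs[i]
            if cum >= top_p:
                break
        keep &= set(nucleus)
    return [l if i in keep else float("-inf") for i, l in enumerate(logits)]
```

Sampling then proceeds from the softmax of the filtered logits; with top_k=1 this degenerates to greedy decoding.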

Usage

The model uses mixed-precision training from nvidia/apex. Note that apex is not required: it is only used if it is available. For an installation guide, see the official instructions.
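A common pattern for this kind of optional dependency is to guard the import and branch on the result; this is a sketch of the general idiom, not necessarily the repository's exact code:

```python
# Try to import apex's automatic mixed-precision module; fall back to
# full fp32 training if the package is not installed.
try:
    from apex import amp  # noqa: F401
    APEX_AVAILABLE = True
except ImportError:
    APEX_AVAILABLE = False

# Training code can then check APEX_AVAILABLE to decide whether to wrap
# the model and optimizer with amp or to train in full precision.
```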

The model can be trained with the following command. Note that <data_dir> and <model_dir> are optional, as default values are provided for both. To train with different hyperparameters, run the train.py script and pass the desired options as command line arguments.

./run.sh "train" "<data_dir>" "<model_dir>"
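Command line options like the above are typically wired through argparse. The sketch below is purely illustrative: the flag names here are hypothetical, and the real options are defined in the repository's train.py:

```python
import argparse

# Hypothetical argument parser; consult train.py --help for the
# project's actual flag names and defaults.
parser = argparse.ArgumentParser()
parser.add_argument("--data_dir", default="data")
parser.add_argument("--model_dir", default="model")

# Overriding a default from the command line:
args = parser.parse_args(["--model_dir", "my_model"])
```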

An interactive evaluation mode is available for the trained model by replacing the train flag with eval.

./run.sh "eval" "<data_dir>" "<model_dir>"

Training the model is fast and easy on Google Colaboratory: create a new colab file in your Google Drive and run it with the snippets below. Make sure to set the runtime type to GPU with a Tesla T4 unit, which can fully leverage mixed-precision training and is much faster than the older K80 version. You can check the current GPU type by running the following line in a cell of your colab.

!nvidia-smi

Copy and run the following code in a cell of your colab file to install the model and its dependencies.

!git clone https://github.com/bme-chatbots/dialogue-generation.git
!python -m pip install --upgrade pip

# installing apex
!git clone https://github.com/NVIDIA/apex
!cd apex; pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .

# building the cython code
!cd dialogue-generation; python setup.py build_ext --inplace

# installing the required packages
!cd dialogue-generation; pip install -r requirements.txt

The training loss and accuracy are logged with TensorboardX, which can also be tracked in the colab file if the code below is run before the training cell.

%load_ext tensorboard
%tensorboard --logdir "dialogue-generation/model"

The model can then be trained by simply running the run.sh script with the default parameters.

!./dialogue-generation/run.sh

Results

Coming soon
