A Dialog toolkit for making favorite TV characters as chatbots

VirtChar - A toolkit for making TV characters as chat bots

This repository (https://github.com/thammegowda/virtchar) has code to create chat bots by training them on TV transcripts. See requirements.txt for the required libraries and versions.

There are two kinds of models:

  1. Retrieval based models: See docs/retrieval-bot.md
  2. Generator models: See docs/neural-generator-bot.md

A retrieval bot uses the InferSent model to understand sentences (i.e. to obtain sentence representations).
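To illustrate the general idea of retrieval over sentence representations, here is a minimal sketch. It is not the code in this repository: `embed` is a toy bag-of-words stand-in for a real sentence encoder such as InferSent, and `build_index` and `respond` are hypothetical helper names chosen for this example. The sketch pairs every utterance with the one that follows it, then answers a query with the reply whose preceding utterance is most similar to the query.

```python
import math
from collections import Counter


def embed(sentence):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector.

    Assumption: a real retrieval bot would call a trained encoder
    (e.g. InferSent) here instead.
    """
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(dialog):
    """Pair the representation of each utterance with the utterance that follows it."""
    return [(embed(prev), nxt) for prev, nxt in zip(dialog, dialog[1:])]


def respond(index, query):
    """Return the stored reply whose preceding utterance is closest to the query."""
    q = embed(query)
    best = max(index, key=lambda pair: cosine(q, pair[0]))
    return best[1]


dialog = [
    "how are you doing today",
    "i am doing fine thanks",
    "do you want some coffee",
    "yes coffee sounds great",
]
index = build_index(dialog)
reply = respond(index, "want coffee")
print(reply)
```

Swapping `embed` for a learned encoder changes the quality of the matches but not the overall lookup structure.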

A generator bot has two choices: Hierarchical Transformer and Hierarchical RNN. If you don't want to use hierarchical models, head over to Tensor2Tensor, OpenNMT-py, or RTG (if I have made it public or given you access). Stick to this repository for hierarchical models.

Some Suggestions:

The retrieval-based bot is easy to get working and it is fun, so you should start there (see docs/retrieval-bot.md).

Hierarchical NLU-based generator models have many issues and require a lot of effort to get working.

Specifically you will hit these issues:

  1. They need lots of data and time to train.
  2. They (along with non-hierarchical versions) tend to produce short answers such as "i dont know", "yes", or "No".

For the first problem, train on a huge out-of-domain corpus and then fine-tune on the desired corpus (see finetune_dialogs in the config files, and the --fine-tune option of the trainer to make the switch). Train on a GPU to speed things up (PyTorch is under the hood).

The second problem is harder and more interesting in its own right. It is the place where the MLE assumption breaks down. See this paper to know why this is hard. For now, this project uses a simple technique of down-sampling the utterances (it is enabled by default).
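As a rough illustration of what down-sampling utterances can look like, the sketch below keeps each training pair with a probability that shrinks as its response becomes more frequent. This is one possible scheme written for this example, not necessarily the one this project implements; the function name `downsample` and the parameters `max_count` and `seed` are assumptions.

```python
import random
from collections import Counter


def downsample(pairs, max_count=2, seed=0):
    """Thin out over-represented responses in (context, response) pairs.

    Assumption: each pair is kept with probability
    min(1, max_count / frequency of its response), so rare responses are
    always kept and very common ones are mostly dropped.
    """
    rng = random.Random(seed)
    freq = Counter(resp.lower().strip() for _, resp in pairs)
    kept = []
    for ctx, resp in pairs:
        keep_prob = min(1.0, max_count / freq[resp.lower().strip()])
        # rng.random() is in [0, 1), so keep_prob == 1.0 always keeps the pair
        if rng.random() < keep_prob:
            kept.append((ctx, resp))
    return kept


pairs = [("question %d" % i, "i dont know") for i in range(100)]
pairs.append(("where are you from", "i grew up in new york"))
kept = downsample(pairs)
print(len(pairs), "->", len(kept))
```

The intent is that generic replies such as "i dont know" no longer dominate the training signal, while distinctive replies are left untouched.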

Questions? Want to help/collaborate?

  • Short enquiries: Send them to me
  • Long discussions and bugs: Create an issue or pull request on this repo
