
Adelaide: A Discord-based Emotional Chatbot

Adelaide is my personal experiment with lukalabs' CakeChat program. I've added Discord support and sentiment analysis with the TextBlob library, with more to come.
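As an illustration of how sentiment analysis could feed into the bot, here is a hypothetical sketch that maps a sentiment score (such as TextBlob's `.sentiment.polarity`, which ranges over [-1.0, 1.0]) onto the emotion labels CakeChat understands. The thresholds are invented for illustration and are not Adelaide's actual logic ("fear" is hard to infer from polarity alone and is omitted here):

```python
def polarity_to_emotion(polarity):
    """Map a sentiment polarity score in [-1.0, 1.0] (e.g. from
    TextBlob's .sentiment.polarity) to one of CakeChat's emotion
    labels. Thresholds are made up for illustration only."""
    if polarity >= 0.3:
        return "joy"
    if polarity <= -0.6:
        return "anger"
    if polarity <= -0.2:
        return "sadness"
    return "neutral"

print(polarity_to_emotion(0.8))   # joy
```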



The recommended Python version is 3.6.8 so that all dependencies match up, and I recommend running it on the Linux distro of your choice. Windows also works, but you will need to install the dependencies via Anaconda. Clone the repository and run `pip install -r requirements.txt`.
Then open `tools/` and change the Discord API token to your application's bot token.
If you do not have a Discord bot, you can create one here:

Currently it uses lukalabs' pre-trained model until I can get hardware sufficient to train it myself.
Run `python tools/` to download the pre-trained model.

You can find the original CakeChat here:


User: How are you holding up?
Bot: I'm doing my job.
User: I am very proud of you, you have over 50,000 friends now!
Bot: Thank you!
User: You should feel proud too.
Bot: I'm glad that you are happy with my life.
User: I want to see Avengers.
Bot: I'll be there.
User: Awesome, we'll see it together
Bot: I'm going to see it
User: Yeah
Bot: Did you watch it?
User: I saw the older ones but Endgame isn't out yet
Bot: I'm ready to see what happens!
User: Hi there! What's going on?
Bot: I've been to the hospital since you were here.
User: Whoa, why were you in the hospital?
Bot: My stomach is just a bit swollen but I'm fine.
User: That sounds really bad. Feel better.
Bot: Thanks, can I have a hug?
User: Sure. * hugs *
Bot: * hugs back *

Original Readme

CakeChat: Emotional Generative Dialog System

CakeChat is a dialog system that is able to express emotions in a text conversation. Try it online in the demo. It is written in Theano and Lasagne. It uses end-to-end trained embeddings of 5 different emotions to generate responses conditioned on a given emotion. The code is flexible and allows you to condition a response on an arbitrary categorical variable defined for some samples in the training data. With CakeChat you can, for example, train your own persona-based neural conversational model[5] or create an emotional chatting machine without external memory[4].


Network architecture and features

  • Model:
    • Hierarchical Recurrent Encoder-Decoder (HRED) architecture for handling deep dialog context[7]
    • Multilayer RNN with GRU cells. The first layer of the utterance-level encoder is always bidirectional.
    • The thought vector is fed into the decoder on each decoding step.
    • The decoder can be conditioned on any string label, for example an emotion label or the ID of the person talking.
  • Word embedding layer:
    • May be initialized using a w2v model trained on your own corpus.
    • The embedding layer may either stay fixed or be fine-tuned along with all other weights of the network.
  • Decoding:
    • 4 different response generation algorithms: "sampling", "beamsearch", "sampling-reranking" and "beamsearch-reranking". Reranking of the generated candidates is performed according to the log-likelihood or MMI-criteria[3]. See the configuration settings description for details.
  • Metrics:
    • Perplexity
    • n-gram distinct metrics adjusted to the sample size[3]
    • Lexical similarity between samples of the model and some fixed dataset. Lexical similarity is a cosine distance between the TF-IDF vector of responses generated by the model and tokens in the dataset.
    • Ranking metrics: mean average precision and mean recall@k[8]
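To make the distinct n-gram metric mentioned above concrete, here is a minimal sketch of the common distinct-n diversity measure (unique n-grams divided by total n-grams across generated responses). This is a generic illustration, not CakeChat's exact size-adjusted variant:

```python
def distinct_n(responses, n):
    """Distinct-n: unique n-grams / total n-grams over all responses.
    Higher values mean more diverse generated text."""
    ngrams = []
    for response in responses:
        tokens = response.split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)

replies = ["i am fine", "i am here", "i am fine"]
print(distinct_n(replies, 2))  # 3 unique bigrams out of 6 -> 0.5
```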

Quick start

Run the CPU-only pre-built docker image & start CakeChat serving the model on port 8080:

```bash
docker run --name cakechat-dev -p -it lukalabs/cakechat:latest \
    bash -c "python bin/"
```

Or, using the GPU-enabled image:

```bash
nvidia-docker run --name cakechat-gpu-dev -p -it lukalabs/cakechat-gpu:latest \
    bash -c "USE_GPU=0 python bin/"
```

That's it! Now you can try it by running `python tools/ -f localhost -p 8080 -c "Hi! How are you?"` from the host command line.



This is the easiest way to set up the environment and install all the dependencies.

CPU-only setup

  1. Install Docker.
  2. Build a CPU-only docker image:

```bash
docker build -t cakechat:latest -f dockerfiles/Dockerfile.cpu dockerfiles/
```

  3. Start a docker container in the CPU-only environment:

```bash
docker run --name <CONTAINER_NAME> -it cakechat:latest
```

GPU-enabled setup

  1. Install nvidia-docker for GPU support.
  2. Build a GPU-enabled docker image:

```bash
nvidia-docker build -t cakechat-gpu:latest -f dockerfiles/Dockerfile.gpu dockerfiles/
```

  3. Start a docker container in the GPU-enabled environment:

```bash
nvidia-docker run --name <CONTAINER_NAME> -it cakechat-gpu:latest
```

That's it! Now you can train your model and chat with it.

Manual setup

If you don't want to deal with docker images and containers, you can always simply run (with sudo, --user, or inside your virtualenv):

```bash
pip install -r requirements.txt
```

Most likely this will do the job. NB: this method only provides a CPU-only environment. To get GPU support, you'll need to build and install libgpuarray yourself (see Dockerfile.gpu for an example).

Getting the model

Using a pre-trained model

Run `python tools/` to download our pre-trained model. The model is trained with a context size of 3, where the encoded sequence contains 30 tokens or less and the decoded sequence contains 32 tokens or less. Both the encoder and decoder contain 2 GRU layers with 512 hidden units each. The model was trained on preprocessed Twitter conversational data. To clean up the data, we removed URLs, retweets and citations. We also removed mentions and hashtags that are not preceded by normal words or punctuation marks, and filtered out all messages that contain more than 30 tokens. We then marked up each utterance with our emotions classifier, which predicts one of 5 emotions: "neutral", "joy", "anger", "sadness" and "fear". To mark up your own corpus with emotions you can use, for example, the DeepMoji tool or any other emotions classifier that you have.
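The cleanup steps described above could be sketched as follows. This is a hypothetical illustration of that kind of preprocessing, not the actual code used to build the training data:

```python
import re

def clean_tweet(text, max_tokens=30):
    """Sketch of Twitter-style cleanup: drop URLs and retweet markers,
    strip leading mentions/hashtags, and filter out messages longer
    than max_tokens (returns None for filtered messages)."""
    text = re.sub(r"https?://\S+", "", text)              # remove URLs
    text = re.sub(r"\bRT\b", "", text)                    # remove retweet markers
    text = re.sub(r"^[@#]\w+(\s+[@#]\w+)*\s*", "", text)  # leading mentions/hashtags
    text = " ".join(text.split())                         # normalize whitespace
    if len(text.split()) > max_tokens:
        return None
    return text

print(clean_tweet("@bob check this https://t.co/x out"))  # check this out
```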

Initializing model weights from a file

For some tools (for example, tools/) you can specify the path to the model's initialization weights via the --init_weights argument. The weights may come from a trained CakeChat model or from a model with a different architecture. In the latter case, some parameters of the CakeChat model may be left without initialization: a parameter is initialized with a saved value if the parameter's name and shape are identical to the saved parameter; otherwise the parameter keeps its default initialization weights. See the load_weights function for details.
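The name-and-shape matching described above can be sketched in plain Python. This is an illustrative toy (parameters stored as (shape, values) tuples), not CakeChat's actual load_weights implementation:

```python
def init_from_saved(params, saved):
    """Overwrite a parameter only when a saved parameter with the
    same name AND the same shape exists; otherwise keep the default
    initialization (mirrors the rule described in the text)."""
    for name, (shape, values) in params.items():
        if name in saved and saved[name][0] == shape:
            params[name] = (shape, saved[name][1])
    return params

model = {"enc_gru_W": ((2, 2), [0, 0, 0, 0]),
         "dec_out_b": ((3,), [0, 0, 0])}
saved = {"enc_gru_W": ((2, 2), [1, 2, 3, 4]),
         "dec_out_b": ((5,), [9, 9, 9, 9, 9])}  # shape mismatch: default kept

weights = init_from_saved(dict(model), saved)
```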

Training your own model

  1. Put your training text corpus in data/corpora_processed/. Each line of the corpus file should be a JSON object containing a list of dialog messages sorted in chronological order. The code is fully language-agnostic: you can use any unicode texts in datasets. Refer to our dummy corpus data/corpora_processed/train_processed_dialogs.txt to see the input format.
  2. The following datasets are used for validation and early stopping:
     • data/corpora_processed/val_processed_dialogs.txt (dummy example) - the context-sensitive validation dataset
     • data/quality/context_free_validation_set.txt - the context-free validation dataset
     • data/quality/context_free_questions.txt - used for generating responses for logging and computing distinct-metrics
     • data/quality/context_free_test_set.txt - used for computing metrics of the trained model, e.g. ranking metrics
  3. Set up training parameters in cakechat/. See the configuration settings description for more details.
  4. Run python tools/ to build the index files with tokens and conditions from the training corpus.
  5. Run python tools/. Don't forget to set the USE_GPU=<GPU_ID> environment variable (with GPU_ID as reported by nvidia-smi) if you want to use a GPU. Use SLICE_TRAINSET=N to train the model on a subset of the first N samples of your training data to speed up preprocessing for debugging.
  6. You can also set IS_DEV=1 to enable "development mode". It uses a reduced number of model parameters (decreased hidden layer dimensions, input and output sizes of token sequences, etc.), performs verbose logging and disables Theano graph optimizations. Use this mode for debugging.
  7. The weights of your model will be saved in data/nn_models/.
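A single line of the corpus described in step 1 might be built like this. The field names ("text", "condition") are an assumption based on the conditioning feature described in this README; verify them against the dummy corpus in data/corpora_processed/ before relying on them:

```python
import json

# One training line: a JSON list of dialog messages in chronological
# order, each optionally carrying a condition label (hypothetical
# field names; check the dummy corpus for the exact schema).
dialog = [
    {"text": "Hi, how are you?", "condition": "neutral"},
    {"text": "I'm great, thanks!", "condition": "joy"},
]
line = json.dumps(dialog)
print(line)
```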

Existing training datasets

You can train a dialog model on any text conversational dataset available to you. A great overview of existing conversational datasets can be found here:

Running the system

Local HTTP-server

Run a server that processes HTTP requests with given input messages (contexts) and returns the model's response messages:

```bash
python bin/
```

Specify the USE_GPU=<GPU_ID> environment variable if you want to use a certain GPU. Wait until the model is compiled. Don't forget to run tools/ prior to running bin/ if you want to start the API with our pre-trained model. To make sure everything works fine, test the model on the following conversation:

- Hi, Eddie, what's up?
- Not much, what about you?
- Fine, thanks. Are you going to the movies tomorrow?

```bash
python tools/ -f localhost -p 8080 \
    -c "Hi, Eddie, what's up?" \
    -c "Not much, what about you?" \
    -c "Fine, thanks. Are you going to the movies tomorrow?"
```
#### HTTP-server API description

##### /cakechat_api/v1/actions/get_response

JSON parameters are:

|Parameter|Type|Description|
|---|---|---|
|context|list of strings|List of previous messages from the dialog history (max. 3 are used)|
|emotion|string, one of enum|One of {'neutral', 'anger', 'joy', 'fear', 'sadness'}. An emotion to condition the response on. Optional; if not specified, 'neutral' is used|

##### Request

```
POST /cakechat_api/v1/actions/get_response
data: {
 'context': ['Hello', 'Hi!', 'How are you?'],
 'emotion': 'joy'
}
```

##### Response

```
200 OK
{
 'response': 'I\'m fine!'
}
```
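A minimal client for the API described above can be sketched with the standard library. This assumes a server is already running on localhost:8080; only building the request is shown, since sending it requires the live service:

```python
import json
import urllib.request

def build_request(context, emotion="neutral", host="localhost", port=8080):
    """Build a POST request for /cakechat_api/v1/actions/get_response
    with a JSON body of context messages and an emotion label."""
    payload = json.dumps({"context": context, "emotion": emotion}).encode("utf-8")
    return urllib.request.Request(
        "http://%s:%d/cakechat_api/v1/actions/get_response" % (host, port),
        data=payload,
        headers={"Content-Type": "application/json"},
    )

req = build_request(["Hello", "Hi!", "How are you?"], emotion="joy")
# With a running server: urllib.request.urlopen(req) returns the
# JSON body with the 'response' field.
```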

Gunicorn HTTP-server

We recommend using Gunicorn to serve the API of your model at production scale. Run a server that processes HTTP queries with input messages and returns the model's response messages:

```bash
cd bin && gunicorn cakechat_server:app -w 1 -b --timeout 2000
```

You may need to install gunicorn from pip: `pip install gunicorn`.

Telegram bot

You can also test your model in a Telegram bot: create a Telegram bot and run `python tools/ --token <YOUR_BOT_TOKEN>`.

Repository overview

  • cakechat/dialog_model/ - computational graph, training procedure and other model utilities
  • cakechat/dialog_model/inference/ - algorithms for response generation
  • cakechat/dialog_model/quality/ - code for metrics calculation and logging
  • cakechat/utils/ - utilities for text processing, w2v training, etc.
  • cakechat/api/ - functions to run the HTTP server: API configuration, error handling
  • tools/ - scripts for training, testing and evaluating your model


Important configuration settings

All the configuration parameters for the network architecture, training, predicting and logging steps are defined in cakechat/. Some inference parameters used in the HTTP server are defined in cakechat/api/.

  • Network architecture and size:
    • HIDDEN_LAYER_DIMENSION is the main parameter that defines the number of hidden units in recurrent layers.
    • WORD_EMBEDDING_DIMENSION and CONDITION_EMBEDDING_DIMENSION define the number of hidden units that each token/condition is mapped into. Together they sum up to the dimension of the input vector passed to the encoder RNN.
    • The number of units of the output layer of the decoder is defined by the number of tokens in the dictionary in the tokens_index directory.
  • Decoding algorithm:
    • PREDICTION_MODE_FOR_TESTS defines how the responses of the model are generated. The options are the following:
      - sampling: the response is sampled from the output distribution token-by-token. For every token, the temperature transform is performed prior to sampling. You can control the temperature value by tuning the DEFAULT_TEMPERATURE parameter.
      - sampling-reranking: multiple candidate responses are generated using the sampling procedure described above. After that, the candidates are ranked according to their MMI-score[3]. You can tune this mode by picking the SAMPLES_NUM_FOR_RERANKING and MMI_REVERSE_MODEL_SCORE_WEIGHT parameters.
      - beamsearch: candidates are sampled using the beam search algorithm. The candidates are ordered according to their log-likelihood score computed by the beam search procedure.
      - beamsearch-reranking: same as above, but the candidates are re-ordered after generation in the same way as in sampling-reranking mode.
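The temperature transform used in sampling mode can be illustrated with a small, generic sketch (divide the logits by the temperature, then softmax; this is the standard formulation, not CakeChat's code):

```python
import math
import random

def apply_temperature(logits, temperature):
    """Temperature transform over next-token logits: scale by 1/T,
    then softmax. Lower T sharpens the distribution toward the
    most likely token; higher T flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature=0.5):
    """Sample one token index from the temperature-adjusted distribution."""
    probs = apply_temperature(logits, temperature)
    return random.choices(range(len(probs)), weights=probs)[0]
```

For instance, with logits [1.0, 2.0, 3.0], a temperature of 0.01 puts nearly all probability mass on the last token, while a temperature of 5.0 makes the three tokens almost equally likely.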

Note that there are other parameters that affect the response generation process.

Example use cases

By providing additional condition labels within dataset entries, you can build the following models:

  • [A Persona-Based Neural Conversation Model][5] - a model that conditions responses on a persona ID to make them lexically similar to the given persona's linguistic style.
  • [Emotional Chatting Machine][4]-like model - a model that conditions responses on an emotion to provide emotional styles (anger, sadness, joy, etc.).
  • [Topic Aware Neural Response Generation][6]-like model - a model that conditions responses on a certain topic to keep the conversation topic-aware.

To make use of these extra conditions, please refer to the section Training your own model. Just set the "condition" field in the training set to one of the following: a persona ID, emotion or topic label, update the index files and start the training.


Credits & Support

CakeChat is developed and maintained by the Replika team: Michael Khalman, Nikita Smetanin, Artem Sobolev, Nicolas Ivanov, Artem Rodichev and Denis Fedorenko. Demo by Oleg Akbarov, Alexander Kuznetsov and Vladimir Chernosvitov. All issues and feature requests can be tracked here - GitHub Issues.


© 2018 Luka, Inc. Licensed under the Apache License, Version 2.0. See LICENSE file for more details.
