# Building conversational AI with the Rasa stack
![alt text](https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTaX3LNhGcAe1HnPZSuWS0oH6af0LJHXcH7If1sQgLCFAT1chNGFg)


This notebook is the basis for my workshop at PyData 2018 Berlin. If you have any questions or would like to learn more about anything included in this notebook, please get in touch at juste@rasa.com.

In this workshop we are going to build a chatbot capable of checking in on people's mood and taking the necessary actions to cheer them up. 


The tutorial consists of four parts:


*   Part 0: Installation and setup
*   Part 1: Teaching the chatbot to understand user inputs using a Rasa NLU model
*   Part 2: Teaching the chatbot to handle multi-turn conversations using a dialogue management model
*   Part 3: Resources and tips

## Part 0: Installation

### Let's start with Jupyter configuration

In [1]:
%matplotlib inline

import logging, io, json, warnings
logging.basicConfig(level="INFO")
warnings.filterwarnings('ignore')

def pprint(o):
    # small helper to make dict dumps a bit prettier
    print(json.dumps(o, indent=2))

### Installation of Rasa
Let's start with the installation of Rasa NLU, Rasa Core and a spaCy language model. If you have already installed them, you can skip this step. 

In [5]:
import sys
python = sys.executable

# In your environment run:
!{python} -m pip install -U rasa_core rasa_nlu[spacy];

# as well as install a language model:
!{python} -m spacy download en_core_web_md
!{python} -m spacy link en_core_web_md en --force;

Collecting rasa_core
  Downloading https://files.pythonhosted.org/packages/de/bc/b5c886bc0bf15785b404a369c98f2159968d1b563e6ac47ef3acb1e7509e/rasa_core-0.13.0-py3-none-any.whl (204kB)
Collecting rasa_nlu[spacy]
  Downloading https://files.pythonhosted.org/packages/5b/b7/e1211e256172284998fc0d86abb117e54110be54d646e3c7a3fadec6d0d0/rasa_nlu-0.14.1-py2.py3-none-any.whl (147kB)
[...]
Successfully built terminaltables colorclass webexteamssdk flask-jwt-simple future simplejson gast absl-py termcolor tzlocal ConfigArgParse docopt Twisted ujson regex tabulate wrapt
rasa-nlu 0.14.1 has requirement coloredlogs~=9.0, but you'll have coloredlogs 10.0 which is incompatible.
rasa-nlu 0.14.1 has requirement packaging~=17.1, but you'll have packaging 18.0 which is incompatible.
rasa-nlu 0.14.1 has requirement scikit-learn~=0.20.2, but you'll have scikit-learn 0.20.0 which is incompatible.
[...]

Let's test the installation: we should have rasa_nlu 0.14.1 and rasa_core 0.13.0 installed, and the spaCy model should be available.

In [7]:
import rasa_nlu
import rasa_core
import spacy

print("rasa_nlu: {} rasa_core: {}".format(rasa_nlu.__version__, rasa_core.__version__))
print("Loading spaCy language model...")
print(spacy.load("en")("Hello world!"))

rasa_nlu: 0.14.1 rasa_core: 0.13.0
Loading spaCy language model...
Hello world!


### Some additional tools needed
Some of the visualizations also need Graphviz. If you don't have Graphviz installed and the step below doesn't work, don't worry: I'll show you the graph, and everything apart from that visualization will still work.

Try installing with any one of these (or adapt to your operating system):

In [7]:
!sudo apt-get -qq install -y graphviz libgraphviz-dev pkg-config;
#!brew install graphviz

[sudo] password for jovyan: 


Install one more Python package and we are ready to go:

In [6]:
!{python} -m pip install pygraphviz;

Collecting pygraphviz
  Downloading https://files.pythonhosted.org/packages/7e/b1/d6d849ddaf6f11036f9980d433f383d4c13d1ebcfc3cd09bc845bda7e433/pygraphviz-1.5.zip (117kB)
Building wheels for collected packages: pygraphviz
  Running setup.py bdist_wheel for pygraphviz ... error
  Complete output from command /opt/conda/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-4_owlt42/pygraphviz/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-cnq1_2i0 --python-tag cp36:
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.6
  creating build/lib.linux-x86_64-3.6/pygraphviz
  copying pygraphviz/agraph.py -> build/lib.linux-x86_64-3.6/pygraphviz
  copying pygraphviz/release.py -> build/lib.linux-x86_64-3.6/pygrap

In [5]:
!pip install https://github.com/PyThaiNLP/pythainlp/archive/dev.zip
!pip install sklearn_crfsuite

Collecting https://github.com/PyThaiNLP/pythainlp/archive/dev.zip
  Downloading https://github.com/PyThaiNLP/pythainlp/archive/dev.zip
Collecting marisa_trie (from pythainlp==2.0)
  Downloading https://files.pythonhosted.org/packages/20/95/d23071d0992dabcb61c948fb118a90683193befc88c23e745b050a29e7db/marisa-trie-0.7.5.tar.gz (270kB)
Collecting nltk>=3.2.2 (from pythainlp==2.0)
  Downloading https://files.pythonhosted.org/packages/6f/ed/9c755d357d33bc1931e157f537721efb5b88d2c583fe593cc09603076cc3/nltk-3.4.zip (1.4MB)
Collecting tinydb (from pythainlp==2.0)
  Downloading https://files.pythonhosted.org/packages/d9/2b/98040184cfbf03113736a160ea35aa92dc3619312ba5a4d6cafaf7f81c73/tinydb-3.12.2-py2.py3-none-a

## Part 1: Natural Language Understanding

First, let's teach our chatbot how to understand user inputs. To do that, we are going to build a Rasa NLU model. Here is some data to get started: 

**Conversation_1:**   
U: Hello  
B: Hello, how are you doing?  
U: I am doing great!  
B: Great. Carry on!  
    
**Conversation_2:**  
U: Hey  
B: Hello, how are you doing?  
U: I am very sad  
B: To cheer you up, I can show you a cute picture of a cat, a dog or a bird. Choose one :)  
U: A kitten  
B: Here is something to cheer you up. Did that help?  
U: Yes  
B: Goodbye  
  
**Conversation_3:**    
U: Heya  
B: Hello, how are you doing?  
U: Not so good and the only thing that could help me feel better is a picture of a puppy  
B: Here is something to cheer you up. Did that help?  
U: No  
B: Goodbye  
U: Bye  

### Creating the training data for language understanding model


Let's create some training data here, grouping user messages by their `intents`. The intent describes what the messages *mean*. Another important part of the training data is `entities`: pieces of information which help a chatbot understand what specifically a user is asking about. Entities are labeled using the markdown link syntax: `[entity value](entity_type)`. [More information about the data format](https://nlu.rasa.com/dataformat.html#markdown-format).
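To make the entity annotation concrete, here is a minimal sketch of how an annotated example can be split into plain text and entity spans. This is not Rasa's own loader; the `parse_example` helper and the example sentence are hypothetical, and only the `[entity value](entity_type)` syntax comes from the data format described above.

```python
import re

# Matches the markdown-style annotation [entity value](entity_type).
ENTITY_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def parse_example(line):
    """Return (plain_text, entities) for one annotated training example."""
    entities = []
    plain_parts = []
    cursor = 0      # position in the annotated line
    plain_len = 0   # length of the plain text built so far
    for match in ENTITY_PATTERN.finditer(line):
        before = line[cursor:match.start()]
        plain_parts.append(before)
        plain_len += len(before)
        value, entity_type = match.group(1), match.group(2)
        entities.append({
            "start": plain_len,
            "end": plain_len + len(value),
            "value": value,
            "entity": entity_type,
        })
        plain_parts.append(value)
        plain_len += len(value)
        cursor = match.end()
    plain_parts.append(line[cursor:])
    return "".join(plain_parts), entities


# Hypothetical example using the `group` entity from this tutorial.
text, entities = parse_example("show me a picture of a [puppy](group)")
print(text)
print(entities)
```

The start and end offsets refer to the plain text, so `text[start:end]` gives back the entity value.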

In [6]:
nlu_md = """
## intent:greet
- hey
- hello there
- สวีดัด
- หวัดดี

## intent:goodbye
- cu
- good by
- cee you later
- ลาก่อย
- บาย
- บ๊าย บาย
- บัย
"""

%store nlu_md > nlu.md

Writing 'nlu_md' (str) to file 'nlu.md'.


### Defining the NLU model

Once the training data is ready, we can define our NLU model. We can do that by constructing the processing pipeline which defines how structured data is extracted from unstructured user inputs. 

In [9]:
config = """
language: "en"

pipeline:
- name: "sentiment.LextoTokenizer"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"

""" 

%store config > config.yml

Writing 'config' (str) to file 'config.yml'.
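The `intent_featurizer_count_vectors` step turns each message into bag-of-words counts before the intent classifier sees it. Below is a minimal pure-Python sketch of that idea, assuming simple whitespace tokenization as a stand-in for the tokenizer in the pipeline. The helper names are hypothetical and this is not the component's actual implementation.

```python
from collections import Counter


def build_vocabulary(messages):
    """Map every distinct whitespace-separated token to a column index."""
    tokens = sorted({token for message in messages for token in message.lower().split()})
    return {token: index for index, token in enumerate(tokens)}


def count_vector(message, vocabulary):
    """Return bag-of-words counts for one message; unknown tokens are ignored."""
    counts = Counter(message.lower().split())
    vector = [0] * len(vocabulary)
    for token, count in counts.items():
        if token in vocabulary:
            vector[vocabulary[token]] = count
    return vector


examples = ["hey", "hello there", "cu", "cee you later"]
vocabulary = build_vocabulary(examples)
print(vocabulary)
print(count_vector("hello hello you", vocabulary))
```

Each message becomes a fixed-length vector with one column per vocabulary token, which is the kind of input an intent classifier can be trained on.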


### Training the Rasa NLU Model

We're going to train a model to recognise user inputs, so that when you send a message like "hello" to your bot, it will recognise this as a `"greet"` intent.

In [10]:
from rasa_nlu.training_data import load_data
from rasa_nlu.config import RasaNLUModelConfig
from rasa_nlu.model import Trainer
from rasa_nlu import config

# loading the nlu training samples
training_data = load_data("nlu.md")

# trainer to educate our pipeline
trainer = Trainer(config.load("config.yml"))

# train the model!
interpreter = trainer.train(training_data)

# store it for future use
model_directory = trainer.persist("./models/nlu", fixed_model_name="current")

INFO:rasa_nlu.training_data.loading:Training data format of nlu.md is md
INFO:rasa_nlu.training_data.training_data:Training data stats: 
	- intent examples: 11 (2 distinct intents)
	- Found intents: 'goodbye', 'greet'
	- entity examples: 0 (0 distinct entities)
	- found entities: 

INFO:rasa_nlu.model:Starting to train component tokenizer_pylexto
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component ner_crf
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component ner_synonyms
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component intent_featurizer_count_vectors
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component intent_classifier_tensorflow_embedding
INFO:rasa_nlu.classifiers.embedding_intent_classifier:Accuracy is updated every 10 epochs
Epochs: 100%|██████████| 300/300 [00:04<00:00, 70.41it/s, loss=0.082, acc=1.000]


### Using & evaluating the NLU model

Let's see how the model is performing on some of the inputs:

In [12]:
pprint(interpreter.parse("สวัสดี"))

{
  "intent": {
    "name": "greet",
    "confidence": 0.9627447128295898
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "greet",
      "confidence": 0.9627447128295898
    },
    {
      "name": "goodbye",
      "confidence": 0.0
    }
  ],
  "text": "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35"
}
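When you use the parse result in your own code, you will often want to act on the intent only if the model is reasonably confident. Here is a small sketch that works on a dictionary of the shape printed above; the `pick_intent` helper and its `min_confidence` cut-off are assumptions for illustration, not part of Rasa.

```python
def pick_intent(parse_result, min_confidence=0.5):
    """Return the top intent name, or None when the model is not confident enough.

    `min_confidence` is an illustrative cut-off, not a Rasa default.
    """
    intent = parse_result.get("intent") or {}
    if intent.get("confidence", 0.0) >= min_confidence:
        return intent.get("name")
    return None


# Same shape as the parse result printed above.
result = {
    "intent": {"name": "greet", "confidence": 0.9627447128295898},
    "entities": [],
    "intent_ranking": [
        {"name": "greet", "confidence": 0.9627447128295898},
        {"name": "goodbye", "confidence": 0.0},
    ],
    "text": "สวัสดี",
}

print(pick_intent(result))
print(pick_intent(result, min_confidence=0.99))
```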


Instead of evaluating it by hand, the model can also be evaluated on a test data set (though for simplicity we are going to use the same data for both training and testing here):

In [None]:
from rasa_nlu.evaluate import run_evaluation

run_evaluation("nlu.md", model_directory)
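Evaluating on the training data tends to overstate how well the model generalises. The following sketch shows one way to hold out a few examples per intent before training. The `split_by_intent` helper, the 25% test fraction and the seed are assumptions for illustration; the examples are the ones from the training data above.

```python
import random


def split_by_intent(examples_by_intent, test_fraction=0.25, seed=42):
    """Shuffle each intent's examples and split them into (train, test) dicts.

    At least one example per intent goes to the test set, so an intent with
    a single example ends up with no training data.
    """
    rng = random.Random(seed)
    train, test = {}, {}
    for intent, examples in examples_by_intent.items():
        shuffled = list(examples)
        rng.shuffle(shuffled)
        test_size = max(1, int(round(len(shuffled) * test_fraction)))
        test[intent] = shuffled[:test_size]
        train[intent] = shuffled[test_size:]
    return train, test


examples_by_intent = {
    "greet": ["hey", "hello there", "สวีดัด", "หวัดดี"],
    "goodbye": ["cu", "good by", "cee you later", "ลาก่อย", "บาย", "บ๊าย บาย", "บัย"],
}

train, test = split_by_intent(examples_by_intent)
for intent in examples_by_intent:
    print(intent, "train:", len(train[intent]), "test:", len(test[intent]))
```

The two halves could then be written to separate markdown files, one for training and one for `run_evaluation`.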

## Part 2: Handling the dialogue

We have taught our chatbot how to understand user inputs. Now, it's time to teach our chatbot how to make responses by training a dialogue management model using Rasa Core.

### Writing Stories

The training data for dialogue management models is called `stories`. A story is an actual conversation where user inputs are expressed as intents as well as corresponding entities, and chatbot responses are expressed as actions.


Let's look at the format of the stories in more detail:

A story starts with `##`, followed by a name of your choice. 
Lines that start with `*` are messages sent by the user. You don't write the *actual* message, but rather the intent (and the entities) that represents what the user *means*. 
Lines that start with `-` are *actions* taken by your bot. In this case all of our actions are just messages sent back to the user, like `utter_greet`, but in general an action can do anything, including calling an API and interacting with the outside world. 

In [None]:
stories_md = """


"""

%store stories_md > stories.md
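As a starting point, here is a minimal sketch of what a story in this format could look like, together with a small check that every intent and action it uses is known to the bot. The story is a hypothetical one modelled on Conversation_1, the intent and action names are copied from the domain defined in the next section, and `parse_stories` and `unknown_names` are illustrative helpers rather than Rasa functions.

```python
# Intents and actions copied from the domain defined in this notebook.
DOMAIN_INTENTS = {"greet", "goodbye", "mood_affirm", "mood_deny",
                  "mood_great", "mood_unhappy", "inform"}
DOMAIN_ACTIONS = {"utter_greet", "utter_did_that_help", "utter_happy",
                  "utter_goodbye", "utter_unclear", "utter_ask_picture"}

# Hypothetical story modelled on Conversation_1.
example_stories_md = """
## happy path
* greet
  - utter_greet
* mood_great
  - utter_happy
"""


def parse_stories(stories_md):
    """Parse story markdown into {story_name: [(kind, name), ...]}."""
    stories = {}
    current = None
    for raw_line in stories_md.splitlines():
        line = raw_line.strip()
        if line.startswith("##"):
            current = line[2:].strip()
            stories[current] = []
        elif line.startswith("*") and current is not None:
            # Drop an entity annotation such as {"group": "dog"} after the intent name.
            intent = line[1:].strip().split("{", 1)[0].strip()
            stories[current].append(("intent", intent))
        elif line.startswith("-") and current is not None:
            stories[current].append(("action", line[1:].strip()))
    return stories


def unknown_names(stories):
    """List intents and actions used in stories but missing from the domain."""
    missing = []
    for steps in stories.values():
        for kind, name in steps:
            known = DOMAIN_INTENTS if kind == "intent" else DOMAIN_ACTIONS
            if name not in known:
                missing.append(name)
    return missing


stories = parse_stories(example_stories_md)
print(stories)
print(unknown_names(stories))
```

A check like this catches typos in intent or action names before training, where they are much harder to spot.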

### Defining a Domain

The domain specifies the universe that the bot operates in. In a chatbot's world this universe consists of intents and entities as well as the actions which appear in training stories. The domain can also contain the templates for the answers a chatbot should use to respond to the user, and slots which help the chatbot keep track of the context. Let's look into the domain of our bot:

In [None]:
domain_yml = """
intents:
- greet
- goodbye
- mood_affirm
- mood_deny
- mood_great
- mood_unhappy
- inform

    
entities:
- group

actions:
- utter_greet
- utter_did_that_help
- utter_happy
- utter_goodbye
- utter_unclear
- utter_ask_picture


templates:
  utter_greet:
  - text: "Hey! How are you?"

  utter_did_that_help:
  - text: "Did that help you?"

  utter_unclear:
  - text: "I am not sure what you are aiming for."
  
  utter_happy:
  - text: "Great carry on!"

  utter_goodbye:
  - text: "Bye"
  
  utter_ask_picture:
  - text: "To cheer you up, I can show you a cute picture of a dog, a cat or a bird. Which one do you choose?"
"""

%store domain_yml > domain.yml

### Adding Custom Actions

The responses of the chatbot can be more than just simple text responses - we can call an API to retrieve some data which can later be used to create a response to user input. Let's create a custom action for our bot which, when predicted, will make an API call and retrieve a picture of a dog, a cat or a bird, depending on which was specified by the user. The bot will know which type of picture should be retrieved by reading the value of the slot `group`.


In [None]:
from rasa_core.actions import Action
from rasa_core.events import SlotSet
from IPython.display import Image

import requests

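# Assumption: placeholder endpoint, replace it with the image API you use.
# `{}` is filled with the value of the `group` slot.
IMAGE_API_URL = "https://example.com/api/{}"

class ApiAction(Action):
    def name(self):
        return "action_retrieve_image"

    def run(self, dispatcher, tracker, domain):
        # which kind of picture the user asked for (dog, cat or bird)
        group = tracker.get_slot("group")
        r = requests.get(IMAGE_API_URL.format(group))

        # the body is expected to look like '["<image url>"]'; strip the wrapper
        response = r.content.decode()
        response = response.replace('["', "")
        response = response.replace('"]', "")

        dispatcher.utter_message("Here is something to cheer you up: {}".format(response))
        # no events (such as SlotSet) to report back to the tracker
        return []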
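The action above strips `["` and `"]` from the response body, which suggests the API answers with a JSON array containing one URL. Under that assumption, parsing the body as JSON is a little more robust than string replacement. The `first_image_url` helper and the example URL below are hypothetical.

```python
import json


def first_image_url(body):
    """Return the first URL from a JSON array body such as '["https://..."]'.

    Returns None when the body is not a non-empty JSON array of strings.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError
        return None
    if isinstance(payload, list) and payload and isinstance(payload[0], str):
        return payload[0]
    return None


# Hypothetical response body; the URL is a placeholder.
print(first_image_url('["https://example.com/dog.jpg"]'))
print(first_image_url("not json"))
```

Returning `None` for unexpected bodies lets the action send a fallback message instead of posting a broken link.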

### Pro Tip: Visualising the Training Data

You can visualise the stories to get a sense of how the conversations go. This is usually a good way to see if there are any stories which don't make sense.


In [None]:
from IPython.display import Image
from rasa_core.agent import Agent

agent = Agent('domain.yml')
agent.visualize("stories.md", "story_graph.html", max_history=2)

### Training your Dialogue Model

Now we are ready to train the dialogue management model. We can specify which policies should be used to train it; in this case, the model is a neural network implemented in Keras which learns to predict which action to take next. We can also tweak what fraction of the training examples is held out for validation and how many epochs are used for training.

In [None]:
from rasa_core.policies import FallbackPolicy, KerasPolicy, MemoizationPolicy
from rasa_core.agent import Agent

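# this will catch predictions the model isn't very certain about
# there is a threshold for the NLU predictions as well as the action predictions
# (the threshold values below are illustrative, tune them for your bot)
fallback = FallbackPolicy(fallback_action_name="utter_unclear",
                          core_threshold=0.2,
                          nlu_threshold=0.1)

agent = Agent('domain.yml', policies=[MemoizationPolicy(), KerasPolicy(), fallback])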

# loading our neatly defined training dialogues
training_data = agent.load_data('stories.md')

agent.train(
    training_data,
    validation_split=0.0,
    epochs=200
)

agent.persist('models/dialogue')
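The comments in the cell above mention a threshold for the NLU predictions and another for the action predictions. The sketch below illustrates that decision rule in plain Python. It is not Rasa's implementation; the `choose_action` helper and the threshold values are assumptions for illustration, and `utter_unclear` is the fallback response from our domain.

```python
def choose_action(nlu_confidence, action_scores,
                  nlu_threshold=0.1, core_threshold=0.2,
                  fallback_action="utter_unclear"):
    """Pick the best-scoring action, or the fallback when confidence is too low."""
    # The intent itself was recognised with too little confidence.
    if nlu_confidence < nlu_threshold or not action_scores:
        return fallback_action
    best_action = max(action_scores, key=action_scores.get)
    # The dialogue model is not sure enough about any next action.
    if action_scores[best_action] < core_threshold:
        return fallback_action
    return best_action


print(choose_action(0.96, {"utter_greet": 0.9, "utter_goodbye": 0.1}))
print(choose_action(0.05, {"utter_greet": 0.9, "utter_goodbye": 0.1}))
print(choose_action(0.96, {"utter_greet": 0.15, "utter_goodbye": 0.12}))
```

Only the first call returns `utter_greet`; the other two fall back because either the intent or the action prediction is below its threshold.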

### Starting up the bot (with NLU)

Now it's time for the fun part - starting the agent and chatting with it. We are going to start the `Agent` by loading our newly trained dialogue model and using the previously trained NLU model as an interpreter for incoming user inputs.

In [None]:
from rasa_core.agent import Agent
agent = Agent.load('models/dialogue', interpreter=model_directory)

### Talking to the Bot (with NLU)

Let's have a chat!

In [None]:
print("Your bot is ready to talk! Type your messages here or send 'stop'")
while True:
    a = input()
    if a == 'stop':
        break
    responses = agent.handle_message(a)
    for response in responses:
        print(response["text"])
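If you want to replay the same conversation repeatedly, for example after retraining, a scripted variant of the loop above can help. This is a minimal sketch: `run_scripted_chat` is a hypothetical helper that accepts any callable with the same shape as `agent.handle_message`, and `fake_handle_message` is a stand-in so the example runs without a trained agent.

```python
def run_scripted_chat(handle_message, messages):
    """Send each message to `handle_message` and collect the text replies."""
    transcript = []
    for message in messages:
        if message == "stop":
            break
        for response in handle_message(message):
            # Responses without a "text" key are skipped.
            if "text" in response:
                transcript.append((message, response["text"]))
    return transcript


def fake_handle_message(message):
    """Hypothetical stand-in for `agent.handle_message`, used only for this demo."""
    replies = {"hello": "Hey! How are you?", "bye": "Bye"}
    return [{"text": replies.get(message, "I am not sure what you are aiming for.")}]


transcript = run_scripted_chat(fake_handle_message, ["hello", "bye", "stop", "hello"])
for user_text, bot_text in transcript:
    print("U:", user_text)
    print("B:", bot_text)
```

With a trained agent you would pass `agent.handle_message` instead of the stand-in.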


### Evaluation of the dialogue model
As with the NLU model, instead of just subjectively testing the model, we can also evaluate the model on a dataset. You'll be using the training data set again, but usually you'd use a test data set separate from the training data.

In [None]:
from rasa_core.evaluate import run_story_evaluation

run_story_evaluation("stories.md", "models/dialogue", 
                     nlu_model_path=None, 
                     max_stories=None, 
                     out_file_plot="story_eval.pdf")

### Interactive learning
Unfortunately, this doesn't work in Jupyter yet, so we are going to do this on the command line. To start the interactive training session, open your command line and run the `train_online.py` script.

## Part 3: Resources and tips

- Rasa NLU [documentation](https://nlu.rasa.com/)
- Rasa Core [documentation](https://core.rasa.com/)
- Rasa Community on [Gitter](https://gitter.im/RasaHQ/home)
- Rasa [Blog](https://blog.rasa.com/)