# Building conversational AI with the Rasa stack
![alt text](https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTaX3LNhGcAe1HnPZSuWS0oH6af0LJHXcH7If1sQgLCFAT1chNGFg)


This notebook is the basis for my workshop at PyData 2018 Berlin. If you have any questions or would like to learn more about anything in this notebook, please get in touch at juste@rasa.com.

In this workshop we are going to build a chatbot capable of checking in on people's mood and taking the necessary actions to cheer them up. 


The tutorial consists of four parts:


*   Part 0: Installation and setup
*   Part 1: Teaching the chatbot to understand user inputs using a Rasa NLU model
*   Part 2: Teaching the chatbot to handle multi-turn conversations using a dialogue management model
*   Part 3: Resources and tips

## Part 0: Installation

### Let's start with Jupyter configuration

In [19]:
%matplotlib inline
from pprint import pprint
import logging, io, json, warnings
logging.basicConfig(level="INFO")
warnings.filterwarnings('ignore')


### Installation of Rasa
Let's start with the installation of Rasa NLU, Rasa Core and a spaCy language model. If you have already installed them, you can skip this step. 

In [2]:
import sys
python = sys.executable


In [2]:
# In your environment run:
!{python} -m pip install -U rasa_core rasa_nlu[spacy];


Requirement already up-to-date: rasa_core in /Users/sarit/.pyenv/versions/3.6.8/envs/rasa/lib/python3.6/site-packages (0.13.0)
Requirement already up-to-date: rasa_nlu[spacy] in /Users/sarit/.pyenv/versions/3.6.8/envs/rasa/lib/python3.6/site-packages (0.14.1)




You are using pip version 18.1, however version 19.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


In [3]:
# as well as install a language model:
!{python} -m spacy download en_core_web_md
!{python} -m spacy link en_core_web_md en --force;


    Linking successful
    /Users/sarit/.pyenv/versions/3.6.8/envs/rasa/lib/python3.6/site-packages/en_core_web_md
    -->
    /Users/sarit/.pyenv/versions/3.6.8/envs/rasa/lib/python3.6/site-packages/spacy/data/en_core_web_md

    You can now load the model via spacy.load('en_core_web_md')


    Linking successful
    /Users/sarit/.pyenv/versions/3.6.8/envs/rasa/lib/python3.6/site-packages/en_core_web_md
    -->
    /Users/sarit/.pyenv/versions/3.6.8/envs/rasa/lib/python3.6/site-packages/spacy/data/en

    You can now load the model via spacy.load('en')



Let's test the installation. The cell below prints the installed versions (this notebook was run with rasa_nlu 0.14.1 and rasa_core 0.13.0) and checks that the spaCy model is available.

In [4]:
import rasa_nlu
import rasa_core
import spacy

print("rasa_nlu: {} rasa_core: {}".format(rasa_nlu.__version__, rasa_core.__version__))
print("Loading spaCy language model...")
print(spacy.load("en")("Hello world!"))

rasa_nlu: 0.14.1 rasa_core: 0.13.0
Loading spaCy language model...
Hello world!


### Some additional tools needed
Some of the visualizations also need graphviz. If you don't have graphviz installed and the commands below don't work, don't worry: I'll show you the graph, and everything apart from that visualization will still work.

Try installing it with any one of these (or adapt to your operating system):

In [7]:
!sudo apt-get -qq install -y graphviz libgraphviz-dev pkg-config;
#!brew install graphviz

[sudo] password for jovyan: 


We also need one more Python package, and then we are ready to go:

In [5]:
!{python} -m pip install pygraphviz;



In [6]:
!pip install https://github.com/PyThaiNLP/pythainlp/archive/dev.zip
!pip install sklearn_crfsuite

Collecting https://github.com/PyThaiNLP/pythainlp/archive/dev.zip
  Downloading https://github.com/PyThaiNLP/pythainlp/archive/dev.zip
     \ 13.9MB 50.3MB/s
Building wheels for collected packages: pythainlp
  Running setup.py bdist_wheel for pythainlp ... done
  Stored in directory: /private/var/folders/h5/lj3c5_050j3_yjd77zqxcn7r0000gp/T/pip-ephem-wheel-cache-_odzptnc/wheels/79/4e/1e/26f3198c6712ecfbee92928ed1dde923a078da3d222401cc78
Successfully built pythainlp


## Part 1: Natural Language Understanding

First, let's teach our chatbot to understand user inputs. To do that, we are going to build a Rasa NLU model. Here is some data to get started: 

**Conversation_1:**   
U: Hello  
B: Hello, how are you doing?  
U: I am doing great!  
B: Great. Carry on!  
    
**Conversation_2:**  
U: Hey  
B: Hello, how are you doing?  
U: I am very sad  
B: To cheer you up, I can show you a cute picture of a cat, a dog or a bird. Choose one :)  
U: A kitten  
B: Here is something to cheer you up. Did that help?  
U: Yes  
B: Goodbye  
  
**Conversation_3:**    
U: Heya  
B: Hello, how are you doing?  
U: Not so good and the only thing that could help me feel better is a picture of a puppy  
B: Here is something to cheer you up. Did that help?  
U: No  
B: Goodbye  
U: Bye  

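**Conversation_1 (Thai):**  
U: หวัดดี (Hi)  
B: สวัสดีครับ รับลูกอมสีอะไรดีครับ? (Hello, which color of candy would you like?)  
U: มีแต่ลูกอมเหรอ (You only have candy?)  
B: ใช่ครับ ผมเชียร์อยู่อย่างเดียว (Yes, it is the only thing I promote)  
U: bye  

**Conversation_2 (Thai):**  
U: hi  
B: สวัสดีครับ รับลูกอมสีอะไรดีครับ? (Hello, which color of candy would you like?)  
U: ขายยังไง (How do you sell it?)  
B: มี 3 สี ได้แก่ แดง เขียว และน้ำเงิน ขายราคาเม็ดละ 0.5 บาท ครับ (There are 3 colors: red, green and blue, priced at 0.5 baht per piece)  
U: แดง 10 เม็ด (Red, 10 pieces)  
B: ใบเสร็จหมายเลข 111 ครับ ชำระเงินที่เคาน์เตอร์หมายเลข 1 ครับ (Receipt number 111. Please pay at counter number 1)  
U: ขอบใจ (Thanks)  
B: ยินดีรับใช้ครับ (Glad to be of service)  

**Conversation_3 (Thai):**  
U: สวัสดีครับ (Hello)  
B: สวัสดีครับ รับลูกอมสีอะไรดีครับ? (Hello, which color of candy would you like?)  
U: มีสีอะไรมั่ง (Which colors do you have?)  
B: แดง เขียว น้ำเงิน (Red, green, blue)  
U: ลูกอมหรือหลอดไฟว่ะ? (Is this candy or light bulbs?)  
B: ลูกอมจริงๆ ครับ สารสกัดธรรมชาติแต๊ๆ (It really is candy, made from genuinely natural extracts)  
U: ราคาเท่าไหร่ (How much is it?)  
B: รับกี่เม็ดครับ (How many pieces would you like?)  
U: 10  
B: ใบเสร็จหมายเลข 111 ครับ ชำระเงินที่เคาน์เตอร์หมายเลข 1 ครับ (Receipt number 111. Please pay at counter number 1)  
U: bye  
B: ยินดีรับใช้ครับ (Glad to be of service)  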

### Creating the training data for language understanding model


Let's create some training data here, grouping user messages by their `intents`. The intent describes what the messages *mean*. Another important part of the training data are `entities`: pieces of information which help a chatbot understand what specifically a user is asking about. Entities are labeled using the markdown link syntax: `[entity value](entity_type)`. See [more information about the data format](https://nlu.rasa.com/dataformat.html#markdown-format).

- หวัดดี [Thomas](PERSON)
- หนีเห่า [El](PERSON)
- รถ [Benz](CAR) สวย
- รถ [Acura](CAR) งามแท้


- สี[เขียว](COLOR)เท่าไหร่
- สี[เขียว](COLOR)ราคาเท่าไหร่

- มีลูกอมอะไรบ้าง
- มีสินค้าอะไรบ้าง
- มีอะไรบ้าง

## intent:query
- สี[เขียว](COLOR)เท่าไหร่
- สี[เขียว](COLOR)ราคาเท่าไหร่


## intent:greet
- hey
- รถ [Benz](CAR) สวย
- รถ [Acura](CAR) งามแท้

## intent:goodbye
- cu
- good by
- cee you later
- ลาก่อย
- บาย
- บ๊าย บาย
- บัย
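
To make the annotation syntax concrete, here is a minimal sketch of how one annotated line breaks down into plain text plus entity spans. The `parse_annotated` helper and its regular expression are assumptions made for illustration; they are not Rasa NLU's own parser, which handles more cases than this.

```python
import re

# Matches the markdown-link style annotation: [entity value](entity_type)
ENTITY_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def parse_annotated(example):
    """Return (plain_text, entities) for one annotated training example.

    Hypothetical helper for illustration only; it is not part of rasa_nlu.
    """
    plain_parts = []
    entities = []
    cursor = 0      # position in the annotated input
    plain_len = 0   # length of the plain text built so far
    for match in ENTITY_PATTERN.finditer(example):
        before = example[cursor:match.start()]
        plain_parts.append(before)
        plain_len += len(before)
        value, entity_type = match.group(1), match.group(2)
        entities.append({
            "entity": entity_type,
            "value": value,
            "start": plain_len,              # character offsets into the plain text
            "end": plain_len + len(value),
        })
        plain_parts.append(value)
        plain_len += len(value)
        cursor = match.end()
    plain_parts.append(example[cursor:])
    return "".join(plain_parts), entities


text, entities = parse_annotated("hello [Peter](PERSON)")
print(text)      # hello Peter
print(entities)  # [{'entity': 'PERSON', 'value': 'Peter', 'start': 6, 'end': 11}]
```

The `start` and `end` values are character offsets into the plain text, which is the same convention the trained model uses when it reports entities later in this notebook.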

In [3]:
nlu_md = """
## intent:greet
- hey
- hello [Peter](PERSON)
- hi [Andrew](PERSON)
- รถ [Benz](CAR) สวย
- รถ [Acura](CAR) งามแท้

## intent:query
- สี[เขียว](COLOR)เท่าไหร่
- สี[เขียว](COLOR)ราคาเท่าไหร่

"""

%store nlu_md > nlu.md

Writing 'nlu_md' (str) to file 'nlu.md'.


### Defining the NLU model

Once the training data is ready, we can define our NLU model. We can do that by constructing the processing pipeline which defines how structured data is extracted from unstructured user inputs. 

In [1]:
config = """
language: "en"

pipeline:
- name: "sentiment.LextoTokenizer"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"

""" 

%store config > config.yml

Writing 'config' (str) to file 'config.yml'.
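
Conceptually, each pipeline entry is a component that reads the message, adds its own annotations, and hands the message on to the next component. The toy sketch below mimics that flow with plain functions. The function names, the whitespace tokenizer and the keyword rules are stand-ins invented for illustration; they say nothing about how `ner_crf` or the TensorFlow embedding classifier work internally.

```python
def tokenize(message):
    # Stand-in tokenizer: split on whitespace (a real Thai tokenizer is far more involved).
    message["tokens"] = message["text"].split()
    return message


def extract_entities(message):
    # Stand-in entity extractor: look tokens up in a tiny hand-written dictionary.
    known = {"Benz": "CAR", "Acura": "CAR", "Peter": "PERSON"}
    message["entities"] = [
        {"entity": known[tok], "value": tok}
        for tok in message["tokens"] if tok in known
    ]
    return message


def classify_intent(message):
    # Stand-in classifier: a single keyword rule instead of a trained model.
    greetings = {"hey", "hello", "hi"}
    is_greeting = any(tok.lower() in greetings for tok in message["tokens"])
    message["intent"] = "greet" if is_greeting else "unknown"
    return message


def run_pipeline(text, components):
    """Pass one message through each component in order, as a pipeline does."""
    message = {"text": text}
    for component in components:
        message = component(message)
    return message


result = run_pipeline("hello Peter", [tokenize, extract_entities, classify_intent])
print(result["intent"], result["entities"])
```

The order matters: the entity extractor and the classifier both rely on the tokens produced by the first component, which is why the tokenizer comes first in `config.yml` as well.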


### Training the Rasa NLU Model

We're going to train a model to recognise user inputs, so that when you send a message like "hello" to your bot, it will recognise this as a `"greet"` intent.

In [14]:
# The Thai tokenizer is the first pipeline component listed in config.yml

from rasa_nlu.training_data import load_data
from rasa_nlu.model import Trainer
from rasa_nlu import config  # note: this replaces the `config` string defined in the previous cell

# loading the nlu training samples
training_data = load_data("nlu.md")

# trainer to educate our pipeline
trainer = Trainer(config.load("config.yml"))

# train the model!
interpreter = trainer.train(training_data)

# store it for future use
model_directory = trainer.persist("./models/nlu", fixed_model_name="current")

INFO:rasa_nlu.training_data.loading:Training data format of nlu.md is md
INFO:rasa_nlu.training_data.training_data:Training data stats: 
	- intent examples: 7 (2 distinct intents)
	- Found intents: 'greet', 'query'
	- entity examples: 6 (3 distinct entities)
	- found entities: 'PERSON', 'COLOR', 'CAR'

INFO:rasa_nlu.model:Starting to train component tokenizer_pylexto
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component ner_crf
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component ner_synonyms
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component intent_featurizer_count_vectors
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component intent_classifier_tensorflow_embedding
INFO:rasa_nlu.classifiers.embedding_intent_classifier:Accuracy is updated every 10 epochs
Epochs: 100%|██████████| 300/300 [00:02<00:00, 149.90it/s, l

### Using & evaluating the NLU model

Let's see how the model is performing on some of the inputs:

In [20]:
pprint(interpreter.parse("สีเขียวราคาเท่าไหร่"))

{'entities': [{'confidence': 0.8922179196177388,
               'end': 7,
               'entity': 'COLOR',
               'extractor': 'ner_crf',
               'start': 2,
               'value': 'เขียว'}],
 'intent': {'confidence': 0.9510044455528259, 'name': 'query'},
 'intent_ranking': [{'confidence': 0.9510044455528259, 'name': 'query'},
                    {'confidence': 0.044341132044792175, 'name': 'greet'}],
 'text': 'สีเขียวราคาเท่าไหร่'}
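
The parse result is a plain dictionary, so downstream code can read the intent and entities directly. Here is a small sketch that works on the output printed above. The `summarize` helper and its `min_confidence` cut-off are assumptions for illustration, not part of Rasa NLU.

```python
# The dictionary printed above, copied here so the snippet runs on its own.
result = {
    "entities": [{"confidence": 0.8922179196177388, "end": 7, "entity": "COLOR",
                  "extractor": "ner_crf", "start": 2, "value": "เขียว"}],
    "intent": {"confidence": 0.9510044455528259, "name": "query"},
    "intent_ranking": [{"confidence": 0.9510044455528259, "name": "query"},
                       {"confidence": 0.044341132044792175, "name": "greet"}],
    "text": "สีเขียวราคาเท่าไหร่",
}


def summarize(parsed, min_confidence=0.5):
    """Return (intent_name_or_None, {entity_type: value}) from a parse result.

    `min_confidence` is an arbitrary illustrative cut-off, not a Rasa default.
    """
    intent = parsed["intent"]
    name = intent["name"] if intent["confidence"] >= min_confidence else None
    found = {}
    for ent in parsed["entities"]:
        # start/end are character offsets into the original text
        span = parsed["text"][ent["start"]:ent["end"]]
        if span == ent["value"]:
            found[ent["entity"]] = ent["value"]
    return name, found


print(summarize(result))  # ('query', {'COLOR': 'เขียว'})
```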


Instead of checking examples by hand, we can also evaluate the model on a test data set (for simplicity, we reuse the training data as the test set here):

In [None]:
from rasa_nlu.evaluate import run_evaluation

run_evaluation("nlu.md", model_directory)
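
As a rough picture of what an intent evaluation measures, the sketch below compares gold intent labels with predictions and reports accuracy plus a tally of confusions. The labels are made up for illustration and `intent_report` is a hypothetical helper; `run_evaluation` produces a fuller report than this.

```python
from collections import Counter


def intent_report(gold, predicted):
    """Return (accuracy, Counter of (gold, predicted) pairs that disagree)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    errors = Counter((g, p) for g, p in zip(gold, predicted) if g != p)
    correct = len(gold) - sum(errors.values())
    accuracy = correct / len(gold) if gold else 0.0
    return accuracy, errors


# Made-up labels, purely to show the shape of the output.
gold = ["greet", "greet", "query", "query"]
predicted = ["greet", "query", "query", "query"]
accuracy, errors = intent_report(gold, predicted)
print(accuracy)      # 0.75
print(dict(errors))  # {('greet', 'query'): 1}
```

Keep in mind that scores computed on the training data itself will look better than they would on unseen messages.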

## Part 2: Handling the dialogue

We have taught our chatbot how to understand user inputs. Now it's time to teach it how to respond, by training a dialogue management model using Rasa Core.

### Writing Stories

The training data for dialogue management models is called `stories`. A story is an actual conversation where user inputs are expressed as intents as well as corresponding entities, and chatbot responses are expressed as actions.


Let's take a look into the format of the stories in more detail:

A story starts with `##`, followed by a name of your choice. 
Lines that start with `*` are messages sent by the user. You don't write the *actual* message, but rather the intent (and the entities) that represent what the user *means*. 
Lines that start with `-` are *actions* taken by your bot. In this case all of our actions are just messages sent back to the user, like `utter_greet`, but in general an action can do anything, including calling an API and interacting with the outside world. 
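
As an illustration of the format, the first English conversation in Part 1 (a greeting followed by a good mood) could be written as the story below. The story name and the choice of intents and actions are assumptions, picked from the names declared in the domain later in this notebook. The `read_story` helper is hypothetical and only shows how the `##`, `*` and `-` lines are read.

```python
example_story = """
## happy path
* greet
  - utter_greet
* mood_great
  - utter_happy
"""


def read_story(story_md):
    """Split one story into its name and an ordered list of (speaker, label) turns."""
    name, turns = None, []
    for raw in story_md.splitlines():
        line = raw.strip()
        if line.startswith("##"):
            name = line[2:].strip()
        elif line.startswith("*"):
            turns.append(("user", line[1:].strip()))  # intent expressed by the user
        elif line.startswith("-"):
            turns.append(("bot", line[1:].strip()))   # action taken by the bot
    return name, turns


name, turns = read_story(example_story)
print(name)   # happy path
print(turns)
```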

In [None]:
stories_md = """


"""

%store stories_md > stories.md

### Defining a Domain

The domain specifies the universe that the bot operates in. In a chatbot's world this universe consists of intents and entities as well as the actions which appear in training stories. The domain can also contain the templates for the answers a chatbot should use to respond to the user, and slots which help the chatbot keep track of the context. Let's look into the domain of our bot:

In [None]:
domain_yml = """
intents:
- greet
- goodbye
- mood_affirm
- mood_deny
- mood_great
- mood_unhappy
- inform

    
entities:
- group

actions:
- utter_greet
- utter_did_that_help
- utter_happy
- utter_goodbye
- utter_unclear
- utter_ask_picture


templates:
  utter_greet:
  - text: "Hey! How are you?"

  utter_did_that_help:
  - text: "Did that help you?"

  utter_unclear:
  - text: "I am not sure what you are aiming for."
  
  utter_happy:
  - text: "Great carry on!"

  utter_goodbye:
  - text: "Bye"
  
  utter_ask_picture:
  - text: "To cheer you up, I can show you a cute picture of a dog, a cat or a bird. Which one do you choose?"
"""

%store domain_yml > domain.yml
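
Rasa Core expects every action used in a story to be declared in the domain, so a quick consistency check can save a confusing training error. The sketch below uses name lists copied by hand from the domain above and a made-up list of story actions; `undeclared` is a hypothetical helper, not a Rasa function.

```python
# Actions declared in the domain above.
domain_actions = {
    "utter_greet", "utter_did_that_help", "utter_happy",
    "utter_goodbye", "utter_unclear", "utter_ask_picture",
}

# Keys of the `templates` section above.
domain_templates = {
    "utter_greet", "utter_did_that_help", "utter_unclear",
    "utter_happy", "utter_goodbye", "utter_ask_picture",
}


def undeclared(used, declared):
    """Return the names in `used` that are missing from `declared`, sorted."""
    return sorted(set(used) - set(declared))


# Made-up list of actions referenced by some stories, for illustration.
story_actions = ["utter_greet", "utter_happy", "action_retrieve_image"]

print(undeclared(story_actions, domain_actions))     # ['action_retrieve_image']
print(undeclared(domain_actions, domain_templates))  # []
```

In this example the check flags `action_retrieve_image`, a reminder that a custom action has to be listed under `actions` before stories can use it.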

### Adding Custom Actions

The responses of the chatbot can be more than just simple text responses - we can call an API to retrieve some data which can later be used to create a response to user input. Let's create a custom action for our bot which, when predicted, will make an API call and retrieve a picture of a dog, a cat or a bird, depending on which was specified by the user. The bot will know which type of picture to fetch by reading the value of the slot `group`.


In [None]:
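from rasa_core.actions import Action
from rasa_core.events import SlotSet
from IPython.display import Image

import requests

# Placeholder: the image API endpoint is not given in this notebook.
# Set it to a URL template that takes the group name, for example ".../{}".
IMAGE_API_URL = ""

class ApiAction(Action):
    def name(self):
        return "action_retrieve_image"

    def run(self, dispatcher, tracker, domain):
        # which kind of picture the user asked for (dog, cat or bird)
        group = tracker.get_slot("group")
        r = requests.get(IMAGE_API_URL.format(group))
        
        # strip the surrounding '["' and '"]' from the response body
        response = r.content.decode()
        response = response.replace('["',"")
        response = response.replace('"]',"")
        
        dispatcher.utter_message("Here is something to cheer you up: {}".format(response))
        return []  # this action sets no slots and emits no other events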
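
The two `replace` calls above suggest that the API answers with a JSON list holding a single URL. If that is the case, decoding the body with `json` is a sturdier way to get at it. This is a sketch under that assumption; the sample payload and the `first_url` helper are made up.

```python
import json


def first_url(body):
    """Return the first item of a JSON list response, or None if the list is empty.

    Assumes `body` is the decoded response text, e.g. r.content.decode().
    """
    items = json.loads(body)
    if not isinstance(items, list):
        raise ValueError("expected a JSON list, got {}".format(type(items).__name__))
    return items[0] if items else None


sample_body = '["https://example.com/cat.jpg"]'  # made-up payload for illustration
print(first_url(sample_body))  # https://example.com/cat.jpg
```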

### Pro Tip: Visualising the Training Data

You can visualise the stories to get a sense of how the conversations go. This is usually a good way to spot stories which don't make sense.


In [None]:
from IPython.display import Image
from rasa_core.agent import Agent

agent = Agent('domain.yml')
agent.visualize("stories.md", "story_graph.html", max_history=2)

### Training your Dialogue Model

Now we are good to train the dialogue management model. We can specify what policies should be used to train it - in this case, the model is a neural network implemented in Keras which learns to predict which action to take next. We can also tweak the parameters of what percentage of training examples should be used for validation and how many epochs should be used for training.

In [None]:
from rasa_core.policies import FallbackPolicy, KerasPolicy, MemoizationPolicy
from rasa_core.agent import Agent

# this will catch predictions the model isn't very certain about
# there is a threshold for the NLU predictions as well as the action predictions


agent = Agent('domain.yml', policies=[MemoizationPolicy(), KerasPolicy()])

# loading our neatly defined training dialogues
training_data = agent.load_data('stories.md')

agent.train(
    training_data,
    validation_split=0.0,
    epochs=200
)

agent.persist('models/dialogue')
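
The comments in the cell above mention thresholds for both the NLU prediction and the action prediction. The decision they describe can be sketched in plain Python as follows. The threshold values and the `choose_action` helper are illustrative assumptions, and `utter_unclear` is used as the fallback because the domain defines it for unclear situations.

```python
def choose_action(nlu_confidence, action_scores,
                  nlu_threshold=0.3, core_threshold=0.3,
                  fallback_action="utter_unclear"):
    """Pick the best-scoring action unless either confidence is too low.

    Threshold values are illustrative, not tuned for this bot.
    """
    best_action = max(action_scores, key=action_scores.get)
    if nlu_confidence < nlu_threshold:
        return fallback_action  # the bot is unsure what the user meant
    if action_scores[best_action] < core_threshold:
        return fallback_action  # the bot is unsure what to do next
    return best_action


print(choose_action(0.95, {"utter_greet": 0.9, "utter_goodbye": 0.1}))   # utter_greet
print(choose_action(0.10, {"utter_greet": 0.9, "utter_goodbye": 0.1}))   # utter_unclear
print(choose_action(0.95, {"utter_greet": 0.2, "utter_goodbye": 0.15}))  # utter_unclear
```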

### Starting up the bot (with NLU)

Now it's time for the fun part - starting the agent and chatting with it. We are going to start the `Agent` by loading the dialogue model we have just trained and using the previously trained NLU model as an interpreter for incoming user inputs.

In [None]:
from rasa_core.agent import Agent
agent = Agent.load('models/dialogue', interpreter=model_directory)

### Talking to the Bot (with NLU)

Let's have a chat!

In [None]:
print("Your bot is ready to talk! Type your messages here or send 'stop'")
while True:
    a = input()
    if a == 'stop':
        break
    responses = agent.handle_message(a)
    for response in responses:
        print(response["text"])


### Evaluation of the dialogue model
As with the NLU model, instead of just subjectively testing the model, we can also evaluate the model on a dataset. You'll be using the training data set again, but usually you'd use a test data set separate from the training data.

In [None]:
from rasa_core.evaluate import run_story_evaluation

run_story_evaluation("stories.md", "models/dialogue", 
                     nlu_model_path=None, 
                     max_stories=None, 
                     out_file_plot="story_eval.pdf")

### Interactive learning
Unfortunately, this doesn't work in Jupyter yet. Hence, we are going to do this on the command line. To start the interactive training session, open your command line and run the `train_online.py` script.

## Part 3: Resources and tips

- Rasa NLU [documentation](https://nlu.rasa.com/)
- Rasa Core [documentation](https://core.rasa.com/)
- Rasa Community on [Gitter](https://gitter.im/RasaHQ/home)
- Rasa [Blog](https://blog.rasa.com/)