# Building a Toy Chatbot with Rasa


Rasa is an open-source toolkit for building conversational agents. With it you can parse user input into structured data and build a dialogue engine on top to handle complex scenarios.


In this tutorial we care about parsing unstructured user input like:

    "I am looking for a list of computer science classes taught after 3pm"


into a structured output that captures the user's intent along with the relevant entities:

```json
{
  "intent": "course_search",
  "entities": {
    "course_type" : "computer science",
    "course_time" : "after 3pm"
  }
}
```

![rasa_overview](tutorial_content/rasa-ecosystem.png)

## Creating Training Data

Usually for a task like this we would want tens of thousands of labeled conversations. In this case we'll have to make do with the 30-odd lines of labeled text I annotated using [rasa-nlu-trainer](https://rasahq.github.io/rasa-nlu-trainer/).

![title](tutorial_content/rasa-nlu-trainer.png)


## Take a look at the training data

The training data lives in a file called `course_search_data.json`. We can print the first 30 lines to get an idea of how the labeled data looks.

    ! head -30 course_search_data.json

    {
      "rasa_nlu_data": {
        "common_examples": [
          {
            "text": "I'm looking for a math class",
            "intent": "course_search",
            "entities": [
              {
                "start": 18,
                "end": 22,
                "value": "math",
                "entity": "course_type"
              }
            ]
          },
          {
            "text": "show me computer science classes",
            "intent": "course_search",
            "entities": [
              {
                "start": 8,
                "end": 24,
                "value": "computer science",
                "entity": "course_type"
              }
            ]
          },
          {
            "text": "what english classes are there at 2pm",
            "intent": "course_search",

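Each entity is stored as character offsets into `text`, so a mislabeled span is easy to miss. Below is a minimal sketch of a sanity check, assuming the file follows the `rasa_nlu_data` / `common_examples` layout printed above. The helper name `find_bad_spans` is hypothetical and not part of Rasa.

```python
def find_bad_spans(nlu_data):
    """Return (text, entity) pairs whose offsets do not match the entity value."""
    bad = []
    for example in nlu_data["rasa_nlu_data"]["common_examples"]:
        text = example["text"]
        for entity in example.get("entities", []):
            # The labeled value should be exactly the slice text[start:end].
            if text[entity["start"]:entity["end"]] != entity["value"]:
                bad.append((text, entity))
    return bad


# Two examples copied from the output above, inlined so the sketch runs on its own.
sample = {
    "rasa_nlu_data": {
        "common_examples": [
            {
                "text": "I'm looking for a math class",
                "intent": "course_search",
                "entities": [
                    {"start": 18, "end": 22, "value": "math", "entity": "course_type"}
                ],
            },
            {
                "text": "show me computer science classes",
                "intent": "course_search",
                "entities": [
                    {"start": 8, "end": 24, "value": "computer science",
                     "entity": "course_type"}
                ],
            },
        ]
    }
}

print(find_bad_spans(sample))  # an empty list means every span lines up
```

To check the real file, load it with `json.load` and pass the result in. A mismatch is not always a mistake: a `value` that differs from the highlighted text may be a deliberate synonym, so treat the output as a list of spans to review by hand.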

# Train NLU Model

Rasa lets you assemble an NLU model from a flexible pipeline of components. In this case, spaCy does most of the heavy lifting:

- spaCy language model
- spaCy intent classification (scikit-learn classifier tuned by grid search)
- spaCy entity extraction (Conditional Random Field)
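
The contents of `config_spacy.json` are not shown in this tutorial. Here is a minimal sketch of what such a file might contain, assuming the installed `rasa_nlu` release accepts the `spacy_sklearn` pipeline template name (an assumption, so check the documentation for your version). The file name `config_spacy_sketch.json` is hypothetical, chosen so the real config is not overwritten.

```python
import json

# Assumed settings, not taken from the tutorial's actual config file:
# an English spaCy model and the predefined "spacy_sklearn" pipeline template.
config = {
    "language": "en",
    "pipeline": "spacy_sklearn",
}

# Write the sketch to a separate, hypothetical file name.
with open("config_spacy_sketch.json", "w") as f:
    json.dump(config, f, indent=2)
```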


Now that we have our annotated intent and entity data, we can train a basic model.

    from rasa_nlu import config
    from rasa_nlu.model import Trainer
    from rasa_nlu.training_data import load_data

    # Load the annotated examples and the pipeline configuration.
    training_data = load_data('course_search_data.json')
    trainer = Trainer(config.load("config_spacy.json"))

    # Train every component in the pipeline, then save the model to disk.
    trainer.train(training_data)
    model_directory = trainer.persist('models/default/')  # Returns the directory the model is stored in

    Fitting 2 folds for each of 6 candidates, totalling 12 fits
    [Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:    0.0s finished


# Load Interpreter

    from rasa_nlu.model import Interpreter

    # `model_directory` points to the folder the model was persisted in
    interpreter = Interpreter.load(model_directory)


# Predict

    interpreter.parse(u"who teaches CIS121?")

    {'entities': [{'confidence': 0.8809640292915137,
       'end': 18,
       'entity': 'course_code',
       'extractor': 'ner_crf',
       'start': 12,
       'value': 'cis121'}],
     'intent': {'confidence': 0.98982289136495005, 'name': 'course_detail'},
     'intent_ranking': [{'confidence': 0.98982289136495005,
       'name': 'course_detail'},
      {'confidence': 0.010177108635050057, 'name': 'course_search'}],
     'text': 'who teaches CIS121?'}
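
A bot usually needs only the winning intent and a flat view of the entities. Here is a minimal sketch of pulling those out of the dictionary that `parse` returns. The helper name `summarize` and the `min_confidence` threshold of 0.5 are hypothetical choices for illustration, not part of Rasa.

```python
def summarize(result, min_confidence=0.5):
    """Return (intent_name, {entity: value}), or (None, {}) when confidence is too low."""
    intent = result["intent"]
    if intent["confidence"] < min_confidence:
        return None, {}
    # Flatten the entity list into a simple name -> value mapping.
    entities = {e["entity"]: e["value"] for e in result["entities"]}
    return intent["name"], entities


# The dictionary printed above, trimmed to the fields used here.
result = {
    "entities": [{"confidence": 0.8809640292915137, "end": 18, "entity": "course_code",
                  "extractor": "ner_crf", "start": 12, "value": "cis121"}],
    "intent": {"confidence": 0.98982289136495005, "name": "course_detail"},
    "text": "who teaches CIS121?",
}

print(summarize(result))  # ('course_detail', {'course_code': 'cis121'})
```

Note that flattening keeps only the last value when the same entity type appears twice in one message, which is acceptable for a toy bot but worth revisiting later.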

# Further Directions

## Spinning up an HTTP Server for your Bot

Rasa makes it easy to run your bot on a server. Just run the command:

`python -m rasa_nlu.server --path models`

From there you can test your bot by sending a request to the endpoint, for example with `curl`:

`curl -XPOST localhost:5000/parse -d '{"q":"hello there"}'`
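
The same request can be made from Python. Below is a minimal sketch using only the standard library, assuming the server listens on `localhost:5000` as in the `curl` command. The helper name `build_parse_request` is hypothetical, and the explicit `Content-Type` header is an addition that the `curl` command does not send.

```python
import json
import urllib.request


def build_parse_request(text, host="localhost", port=5000):
    """Build (but do not send) a POST request mirroring the curl command above."""
    payload = json.dumps({"q": text}).encode("utf-8")
    return urllib.request.Request(
        "http://{}:{}/parse".format(host, port),
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_parse_request("hello there")
print(request.full_url)  # http://localhost:5000/parse

# With the server running, send the request and decode the JSON reply:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read().decode("utf-8")))
```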


# Evaluation

We can evaluate the quality of our model by running it against labeled conversation data. In this case the evaluation script tests intent classification:

`python -m rasa_nlu.evaluate --data course_search_data.json --model models/default/model_20180418-132857 --config config_spacy.json`

The output of this script gives us the following scores:

```
               precision    recall  f1-score   support

course_detail       1.00      1.00      1.00         7
course_search       1.00      1.00      1.00         9

  avg / total       1.00      1.00      1.00        16
  
```

Looks like we've got a perfect model! More likely, the scores are this high because the training data is sparse, the two intents are dissimilar, and the evaluation ran on the same `course_search_data.json` file the model was trained on. Try adding more intents and entities, or check out Rasa Core to build a dialogue engine on top of our NLU model.
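
The evaluation command reads the same `course_search_data.json` that was used for training, so a fairer estimate would hold some examples out. Here is a minimal sketch of a reproducible split. The helper name `split_examples`, the 25% test fraction and the seed are hypothetical choices, and the stand-in examples only mimic the shape of the real data.

```python
import random


def split_examples(examples, test_fraction=0.25, seed=0):
    """Shuffle examples reproducibly and split them into (train, test) lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    # Always hold out at least one example.
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in examples; in practice pass data["rasa_nlu_data"]["common_examples"].
examples = [{"text": "example {}".format(i), "intent": "course_search"} for i in range(16)]
train, test = split_examples(examples)
print(len(train), len(test))  # 12 4
```

Each list could then be written to its own JSON file under the same `rasa_nlu_data` wrapper, training on one and passing the other to `--data` during evaluation.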