# Building intelligent bots. Retrieval-based chatbots

In this section we build a retrieval-based chatbot with Rasa. Before we get to that point, we go through a few NLP methods and word vectorization.


## NLP methods for NLU

Let's take one of President Trump's speeches and divide it into sentences.

In [None]:
import spacy

with open("trump.txt", "r", encoding="utf-8") as file:
    trump = file.read()

nlp = spacy.load("en")
doc = nlp(trump)

for span in doc.sents:
    print("> ", span)

We are able to divide it using spaCy and get the part of speech of each word.

In [None]:
for span in doc.sents:
    for i in range(span.start, span.end):
        token = doc[i]
        print(i, token.text, token.pos_)    

A smaller example:

In [None]:
sample = "Broadcasting today, live from Kraków, on chatbots."

doc = nlp(sample)
for token in doc:
    print(token.text, token.pos_)

### Noun chunks

This NLP method extracts the noun phrases (noun chunks) from a sentence. They are important for understanding what the sentence is about.

In [None]:
doc = nlp(sample)
for nc in doc.noun_chunks:
    print(nc)

### Named Entity Recognition

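NER is an NLP method that extracts neither the nouns nor the parts of speech, but named entities: words or phrases that refer to things such as people, places, or organizations, each with a label describing what it is.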

In [None]:
doc = nlp(sample)
for entity in doc.ents:
    print(entity.label_, entity.text)

## Word vectorization

Word vectorization is the process of building a vector that represents each word. Gensim has an implementation of Word2Vec. We use vectors of dimension 100 and set the maximum distance between two words in a sentence (the window) to 5.

In [None]:
from gensim.test.utils import common_texts, get_tmpfile
from gensim.models import Word2Vec

model = Word2Vec(common_texts, size=100, window=5, min_count=1, workers=4)
model.save("word2vec.model")
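
To make the `window` parameter more concrete, here is a minimal sketch of how (word, context) pairs can be collected from a tokenized sentence. The function `skipgram_pairs` is a hypothetical helper written for this illustration and is not part of Gensim; the library's own training loop is more involved, so treat this only as a picture of what "distance between two words" means.

```python
def skipgram_pairs(tokens, window=5):
    """Return (word, context) pairs for every token and each neighbour
    that is at most `window` positions away from it."""
    pairs = []
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((word, tokens[j]))
    return pairs


sentence = ["human", "interface", "computer", "system"]
pairs = skipgram_pairs(sentence, window=1)
for pair in pairs:
    print(pair)
```

With `window=1` each word is paired only with its direct neighbours; a larger window adds more distant words as context.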

We can get the vocabulary as follows:

In [None]:
vocab = list(model.wv.vocab)
X = model.wv[vocab]
print(vocab[0])

To visualize the vectors, we use t-SNE to reduce their dimensionality to two:

In [None]:
from sklearn.manifold import TSNE
import pandas as pd

tsne = TSNE(n_components=2)
X_tsne = tsne.fit_transform(X)

df = pd.DataFrame(X_tsne, index=vocab, columns=['x', 'y'])
df

We can draw the words in a two-dimensional space:

In [None]:
%matplotlib inline

import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)

ax.scatter(df['x'], df['y'])

for word, pos in df.iterrows():
    ax.annotate(word, pos)
plt.show()    

## Negative sampling

Negative sampling is a cheaper way to train word2vec. It is faster because each update uses only the true context word and a few randomly sampled words (the negative samples) rather than every word in the vocabulary. This is why it is called negative sampling.
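
As a rough illustration of the idea, the sketch below computes the negative-sampling cost for a single (word, context) pair with plain Python lists. The two-dimensional vectors and the function names are made up for this example; the cost has the same form as the one accumulated in the training loop later in this section.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def neg_sampling_loss(word_vec, context_vec, negative_vecs):
    """Cost for one pair: -log s(w.c) - sum over negatives of log s(-w.n)."""
    loss = -math.log(sigmoid(dot(word_vec, context_vec)))
    for neg in negative_vecs:
        loss -= math.log(sigmoid(-dot(word_vec, neg)))
    return loss


word = [1.0, 0.0]
# Context points the same way as the word, the negative points away: low cost.
good = neg_sampling_loss(word, [2.0, 0.0], [[-2.0, 0.0]])
# Context points away and the negative points the same way: high cost.
bad = neg_sampling_loss(word, [-2.0, 0.0], [[2.0, 0.0]])
print(good, bad)
```

The cost is small when the word vector agrees with its true context and disagrees with the sampled negatives, which is what training pushes towards.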

First of all, we define helper methods that are used later.

In [None]:
import numpy as np

def zeros(*dims):
    return np.zeros(shape=tuple(dims), dtype=np.float32)

def ones(*dims):
    return np.ones(shape=tuple(dims), dtype=np.float32)

def rand(*dims):
    return np.random.rand(*dims).astype(np.float32)

def randn(*dims):
    return np.random.randn(*dims).astype(np.float32)

def sigmoid(batch, stochastic=False):
    return  1.0 / (1.0 + np.exp(-batch))

def as_matrix(vector):
    return np.reshape(vector, (-1, 1))

In [None]:
import pickle
import gzip

with gzip.open("datasets/text8.dat.gz", "rb") as f:
    train_dict, train_set, train_tokens = pickle.load(f)

train_set = np.random.permutation(train_set)

In [None]:
from collections import namedtuple
Config = namedtuple("Config", ["dict_size", "vect_size", "neg_samples", "updates", "learning_rate", 
                               "learning_rate_decay", "decay_period", "log_period"])
conf = Config(
    dict_size=len(train_dict),
    vect_size=100,
    neg_samples=10,
    updates=5000000,
    learning_rate=0.1,
    learning_rate_decay=0.995,
    decay_period=10000,
    log_period=10000)
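
The last three settings describe a step-wise learning rate decay. The sketch below, with a hypothetical helper `learning_rate_at`, assumes the rule used by the training loop in this section: the rate is multiplied by `learning_rate_decay` after every update whose index is a multiple of `decay_period`. Under that assumption the rate used for the last of the 5,000,000 updates is about 0.008.

```python
def learning_rate_at(i, initial=0.1, decay=0.995, period=10000):
    """Learning rate used for update i, when the rate is multiplied by
    `decay` after every update whose index is a multiple of `period`."""
    decays_so_far = 0 if i == 0 else (i - 1) // period + 1
    return initial * decay ** decays_so_far


for i in (0, 1, 10000, 10001, 4999999):
    print(i, learning_rate_at(i))
```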

In [None]:
def neg_sample(conf, train_set, train_tokens): # implemented
    Vp = randn(conf.dict_size, conf.vect_size)
    Vo = randn(conf.dict_size, conf.vect_size)

    J = 0.0
    learning_rate = conf.learning_rate
    for i in range(conf.updates):
        idx = i % len(train_set)

        word    = train_set[idx, 0]
        context = train_set[idx, 1]
        
        neg_context = np.random.randint(0, len(train_tokens), conf.neg_samples)
        neg_context = train_tokens[neg_context]

        word_vect = Vp[word, :]              # word vector
        context_vect = Vo[context, :]        # context vector
        negative_vects = Vo[neg_context, :]  # sampled negative vectors

        # Cost and gradient calculation starts here
        score_pos = word_vect @ context_vect.T
        score_neg = word_vect @ negative_vects.T

        J -= np.log(sigmoid(score_pos)) + np.sum(np.log(sigmoid(-score_neg)))
        if (i + 1) % conf.log_period == 0:
            print('Update {0}\tcost: {1:>2.2f}'.format(i + 1, J / conf.log_period))
            final_cost = J / conf.log_period
            J = 0.0

        pos_g = 1.0 - sigmoid(score_pos)
        neg_g = sigmoid(score_neg)

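        # Gradients of the per-pair cost with respect to each group of vectors
        word_grad = -pos_g * context_vect + neg_g @ negative_vects
        context_grad = -pos_g * word_vect
        neg_context_grad = as_matrix(neg_g) * word_vect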

        Vp[word, :] -= learning_rate * word_grad
        Vo[context, :] -= learning_rate * context_grad
        Vo[neg_context, :] -= learning_rate * neg_context_grad

        if i % conf.decay_period == 0:
            learning_rate = learning_rate * conf.learning_rate_decay

    return Vp, Vo, final_cost
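
The gradients used in the update step can be derived from the cost computed in the loop. The notation here is an assumption made for this sketch: $v_w$ is the row of `Vp` for the word, $u_c$ the row of `Vo` for the true context, $u_{n_k}$ the rows of `Vo` for the $K$ = `neg_samples` sampled words, and $\sigma$ the sigmoid. Using $\frac{d}{dx}\log\sigma(x) = 1 - \sigma(x)$:

```latex
\begin{aligned}
J &= -\log\sigma\!\left(v_w^\top u_c\right)
     - \sum_{k=1}^{K}\log\sigma\!\left(-v_w^\top u_{n_k}\right) \\
\frac{\partial J}{\partial v_w}
  &= -\bigl(1-\sigma(v_w^\top u_c)\bigr)\,u_c
     + \sum_{k=1}^{K}\sigma(v_w^\top u_{n_k})\,u_{n_k} \\
\frac{\partial J}{\partial u_c}
  &= -\bigl(1-\sigma(v_w^\top u_c)\bigr)\,v_w \\
\frac{\partial J}{\partial u_{n_k}}
  &= \sigma(v_w^\top u_{n_k})\,v_w
\end{aligned}
```

The factors $1-\sigma(v_w^\top u_c)$ and $\sigma(v_w^\top u_{n_k})$ are the quantities the loop stores in `pos_g` and `neg_g`.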

In [None]:
Vp, Vo, J = neg_sample(conf, train_set, train_tokens)

In [None]:
def lookup_word_idx(word, word_dict):
    try:
        return np.argwhere(np.array(word_dict) == word)[0][0]
    except IndexError:
        raise Exception("No such word in dict: {}".format(word))

def similar_words(embeddings, word, word_dict, hits):
    word_idx = lookup_word_idx(word, word_dict)
    similarity_scores = embeddings @ embeddings[word_idx]
    similar_word_idxs = np.argsort(-similarity_scores)    
    return [word_dict[i] for i in similar_word_idxs[:hits]]

In [None]:
print('\n\nTraining cost: {0:>2.2f}\n\n'.format(J))

sample_words = ['zero', 'computer', 'cars', 'home', 'album']

Vp_norm = Vp / as_matrix(np.linalg.norm(Vp , axis=1))
for w in sample_words:
    similar = similar_words(Vp_norm, w, train_dict, 5)
    print('Words similar to {}: {}'.format(w, ", ".join(similar)))

### Similarity measure through vectors

SpaCy already has words vectorized and we can simply check the similarity between two sentences.

In [None]:
import spacy

nlp = spacy.load('en')

doc1 = nlp(u"Warsaw is the largest city in Poland.")
doc2 = nlp(u"Croissant is baked in France.")
doc3 = nlp(u"An emu is a large bird.")

for doc in [doc1, doc2, doc3]:
    for other_doc in [doc1, doc2, doc3]:
        print(doc.similarity(other_doc))
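
By default, spaCy estimates the similarity of two documents as the cosine similarity of their averaged word vectors. The sketch below reproduces that idea with plain Python. The three-dimensional vectors in `toy_vectors` are invented by hand for illustration and have nothing to do with the vectors a real model provides.

```python
import math


def average(vectors):
    """Component-wise mean of a list of equally long vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical word vectors, chosen so that related words point the same way.
toy_vectors = {
    "warsaw": [0.9, 0.1, 0.0],
    "city": [0.8, 0.2, 0.1],
    "emu": [0.0, 0.1, 0.9],
    "bird": [0.1, 0.2, 0.8],
}


def doc_vector(words):
    return average([toy_vectors[w] for w in words])


city_doc = doc_vector(["warsaw", "city"])
bird_doc = doc_vector(["emu", "bird"])
print(cosine(city_doc, city_doc))  # a document compared with itself
print(cosine(city_doc, bird_doc))  # two unrelated documents
```

Because every word contributes equally to the average, word order is ignored, which is worth keeping in mind when reading the scores printed above.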

A nice example of word vectorization done by some researchers at Warsaw University: [Word2Vec](https://lamyiowce.github.io/word2viz/).

## Retrieval-based chatbot

In this section we use Rasa to build a very simple HR assistant bot. We can run Rasa as a server or use it directly from Python. To start the Rasa server, execute the following command: `python3 -m rasa_nlu.server --path projects`.
It starts a server on the default port 5000. You can test it using the `requests` package. We should get the intent of the phrase `hi`.

In [None]:
import requests

def get_intent(sentence):
    url = "http://localhost:5000/parse"
    payload = {"q":sentence}
    response = requests.get(url,params=payload)    
    print(response.json())
    intent = response.json()['intent']
    if intent['confidence'] > 0.5: 
        return intent['name']
    return response.json()

get_intent("hi")
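
Note that `get_intent` returns either an intent name or the whole response, so the caller has to check which one it got. One possible alternative, sketched here without any network call, is to return `None` when the confidence is too low. The function `pick_intent` and the two sample dictionaries are hypothetical; the dictionaries only mimic the `intent` field that `get_intent` reads.

```python
def pick_intent(parsed, threshold=0.5):
    """Return the intent name if its confidence exceeds `threshold`, else None."""
    intent = parsed.get("intent") or {}
    if intent.get("confidence", 0.0) > threshold:
        return intent.get("name")
    return None


# Hand-written dictionaries shaped like the parsed response used above.
confident = {"text": "hi", "intent": {"name": "greet", "confidence": 0.93}}
unsure = {"text": "hmm", "intent": {"name": "affirm", "confidence": 0.31}}

print(pick_intent(confident))
print(pick_intent(unsure))
```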

To use Rasa from Python, you need to prepare a config file that contains the pipeline and the name of the file with the examples used for training.

In [None]:
config = """
{
  "pipeline": "spacy_sklearn",
  "path" : ".",
  "data" : "./anna.json"
}
"""

config_file = open("config.json", "w")
config_file.write(config)
config_file.close()

The data file contains examples that are used for training.

In [None]:
anna_common_examples = """
{
  "rasa_nlu_data": {
    "entity_synonyms": [
      {
        "value": "candidate",
        "synonyms": ["developer", "data scientist"]
      }
    ],
    "common_examples": [
      {
        "text": "hey", 
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "howdy", 
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "hey there",
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "hello", 
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "hi", 
        "intent": "greet", 
        "entities": []
      },
      {
        "text": "good morning",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "good evening",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "dear sir",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "yes", 
        "intent": "affirm", 
        "entities": []
      }, 
      {
        "text": "yep", 
        "intent": "affirm", 
        "entities": []
      }, 
      {
        "text": "yeah", 
        "intent": "affirm", 
        "entities": []
      },
      {
        "text": "indeed",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "that's right",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "ok",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "great",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "right, thank you",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "add candidate",
        "intent": "candidate_add",
        "entities": []
      }, 
      {
        "text": "add candidate",
        "intent": "candidate_add",
        "entities": [
            {
              "start": 4,
              "end": 13,
              "value": "candidate",
              "entity": "candidate"
            }
        ]
      },         
      {
        "text": "adding candidate",
        "intent": "candidate_add",
        "entities": [
            {
              "start": 7,
              "end": 16,
              "value": "candidate",
              "entity": "candidate"
            }        
        ]
      },
      {
        "text": "please add candidate",
        "intent": "candidate_add",
        "entities": []
      },              
      {
        "text": "please add new candidate",
        "intent": "candidate_add",
        "entities": []
      },           
      {
        "text": "we have new prescreening upcoming",
        "intent": "candidate_add",
        "entities": []
      }, 
      {
        "text": "we have a new candidate for prescreening",
        "intent": "candidate_add",
        "entities": []
      },         
      {
        "text": "correct",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "great choice",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "sounds really good",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "bye", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "goodbye", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "good bye", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "stop", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "end", 
        "intent": "goodbye", 
        "entities": []
      },
      {
        "text": "farewell",
        "intent": "goodbye",
        "entities": []
      },
      {
        "text": "Bye bye",
        "intent": "goodbye",
        "entities": []
      },
      {
        "text": "have a good one",
        "intent": "goodbye",
        "entities": []
      }
    ]
  }
}
"""

training_data = open("anna.json", "w")
training_data.write(anna_common_examples)
training_data.close()
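
Entity annotations use character offsets, with `start` inclusive and `end` exclusive, so `text[start:end]` should be exactly the annotated phrase. Off-by-one mistakes are easy to make when counting by hand. The following sketch checks a training file for them; it assumes words are separated by whitespace, and the helper names and the two-example `sample` string are made up for this illustration.

```python
import json


def is_aligned(text, start, end):
    """True if text[start:end] is a non-empty span on word boundaries."""
    if not 0 <= start < end <= len(text):
        return False
    span = text[start:end]
    starts_word = start == 0 or text[start - 1].isspace()
    ends_word = end == len(text) or text[end].isspace()
    return span == span.strip() and starts_word and ends_word


def misaligned_entities(rasa_json):
    """Return (text, entity) pairs whose offsets do not match a whole phrase."""
    data = json.loads(rasa_json)
    problems = []
    for example in data["rasa_nlu_data"]["common_examples"]:
        for ent in example.get("entities", []):
            if not is_aligned(example["text"], ent["start"], ent["end"]):
                problems.append((example["text"], ent))
    return problems


# The second example starts one character too late ("eveloper").
sample = """
{"rasa_nlu_data": {"common_examples": [
  {"text": "add developer", "intent": "candidate_add",
   "entities": [{"start": 4, "end": 13, "value": "candidate", "entity": "candidate"}]},
  {"text": "add developer", "intent": "candidate_add",
   "entities": [{"start": 5, "end": 13, "value": "candidate", "entity": "candidate"}]}
]}}
"""

problems = misaligned_entities(sample)
for text, ent in problems:
    print(repr(text), "->", repr(text[ent["start"]:ent["end"]]))
```

The check compares offsets with the text only, not with `value`, because a synonym such as "developer" is deliberately mapped to a different value.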

The training is straightforward.

In [None]:
from rasa_nlu.training_data import load_data
from rasa_nlu.model import Trainer, Interpreter
from rasa_nlu.components import ComponentBuilder
import rasa_nlu.config

cfg = 'config.json'
training_data = load_data('anna.json')
trainer = Trainer(rasa_nlu.config.load(cfg))
trainer.train(training_data)
model_directory = trainer.persist('.')

To get the intent we use the parse method.

In [None]:
from rasa_nlu.model import Metadata, Interpreter

interpreter = Interpreter.load(model_directory)

interpreter.parse(u"add developer")

## EXERCISE 2

Extend the training examples and add an intent `change_status` with entities: `passed` and `failed`.

In [None]:
anna_common_examples = """
{
  "rasa_nlu_data": {
    "entity_synonyms": [
      {
        "value": "candidate",
        "synonyms": ["developer", "data scientist"]
      }
    ],
    "common_examples": [
      {
        "text": "hey", 
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "howdy", 
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "hey there",
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "hello", 
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "hi", 
        "intent": "greet", 
        "entities": []
      },
      {
        "text": "good morning",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "good evening",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "dear sir",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "yes", 
        "intent": "affirm", 
        "entities": []
      }, 
      {
        "text": "yep", 
        "intent": "affirm", 
        "entities": []
      }, 
      {
        "text": "yeah", 
        "intent": "affirm", 
        "entities": []
      },
      {
        "text": "indeed",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "that's right",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "ok",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "great",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "right, thank you",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "add candidate",
        "intent": "candidate_add",
        "entities": []
      }, 
      {
        "text": "add candidate",
        "intent": "candidate_add",
        "entities": [
            {
              "start": 4,
              "end": 13,
              "value": "candidate",
              "entity": "candidate"
            }
        ]
      },         
      {
        "text": "adding candidate",
        "intent": "candidate_add",
        "entities": [
            {
              "start": 7,
              "end": 16,
              "value": "candidate",
              "entity": "candidate"
            }
        ]
      },
      {
        "text": "please add candidate",
        "intent": "candidate_add",
        "entities": []
      },              
      {
        "text": "please add new candidate",
        "intent": "candidate_add",
        "entities": []
      },           
      {
        "text": "we have new prescreening upcoming",
        "intent": "candidate_add",
        "entities": []
      }, 
      {
        "text": "we have a new candidate for prescreening",
        "intent": "candidate_add",
        "entities": []
      },         
      {
        "text": "correct",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "great choice",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "sounds really good",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "bye", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "goodbye", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "good bye", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "stop", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "end", 
        "intent": "goodbye", 
        "entities": []
      },
      {
        "text": "farewell",
        "intent": "goodbye",
        "entities": []
      },
      {
        "text": "Bye bye",
        "intent": "goodbye",
        "entities": []
      },
      {
        "text": "have a good one",
        "intent": "goodbye",
        "entities": []
      }
    ]
  }
}
"""

training_data = open("anna_new.json", "w")
training_data.write(anna_common_examples)
training_data.close()
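
When adding new examples with entities, the offsets can be computed instead of counted by hand. The helper `make_example` below is a hypothetical convenience function, not part of Rasa, and the sentence is only a placeholder; it takes the offsets from the first occurrence of the phrase in the text.

```python
import json


def make_example(text, intent, entity=None, phrase=None, value=None):
    """Build one `common_examples` entry. If `entity` is given, its offsets
    come from the first occurrence of `phrase` in `text`."""
    entities = []
    if entity is not None:
        start = text.index(phrase)  # raises ValueError if phrase is absent
        entities.append({
            "start": start,
            "end": start + len(phrase),
            "value": value if value is not None else phrase,
            "entity": entity,
        })
    return {"text": text, "intent": intent, "entities": entities}


example = make_example("the candidate passed", "change_status",
                       entity="passed", phrase="passed")
print(json.dumps(example, indent=2))
```

The resulting dictionaries can be appended to the `common_examples` list before the file is written.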

Train it:

In [None]:
from rasa_nlu.training_data import load_data
from rasa_nlu.model import Trainer
import rasa_nlu.config

cfg = 'config.json'
training_data = load_data('anna_new.json')
trainer = Trainer(rasa_nlu.config.load(cfg))
trainer.train(training_data)
model_directory = trainer.persist('.')

Test it:

In [None]:
from rasa_nlu.model import Metadata, Interpreter

interpreter = Interpreter.load(model_directory)

interpreter.parse(u"the developer didn't pass")

## Building stories with Rasa

To build a chatbot with Rasa that focuses on context management, we need a dataset of intents as before, but also stories.

Rasa Core, which handles the stories, needs a bit more configuration than the previous example. We need to:

- configure the language and the machine learning backend,
- set up the domain with sample chatbot responses,
- define the stories.

After these steps we train Rasa Core, and then we still need to train Rasa NLU on the intents.

First, Rasa needs a basic configuration. The configuration below lists the policies that Rasa Core uses to choose the next action. The language and the pipeline, which define how the NLU model is trained, are set in the NLU configuration later in this section.

In [None]:
rasa_config = """
policies:
  - name: KerasPolicy
    epochs: 100
    max_history: 5
  - name: FallbackPolicy
    fallback_action_name: 'action_default_fallback'
  - name: MemoizationPolicy
    max_history: 5
  - name: FormPolicy
"""
%store rasa_config > rasa_config.yml

We need to define a few stories for our chatbot:

In [None]:
stories_md = """
## happy path
* greet
  - utter_greet
* mood_great
  - utter_happy

## sad path 1
* greet
  - utter_greet
* mood_unhappy
  - utter_cheer_up
  - utter_did_that_help
* mood_affirm
  - utter_happy

## sad path 2
* greet
  - utter_greet
* mood_unhappy
  - utter_cheer_up
  - utter_did_that_help
* mood_deny
  - utter_goodbye

## say goodbye
* goodbye
  - utter_goodbye
"""
%store stories_md > stories.md
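
Every action used in a story also has to be declared in the domain. A small parser can help to spot a missing declaration before training. This is only a sketch for the subset of the story format used here (story headers, `*` intent lines and `-` action lines); `parse_stories` and the sample data are hypothetical and not part of Rasa.

```python
def parse_stories(markdown):
    """Map each story name to its ordered list of (kind, name) steps, where
    kind is 'intent' for '* ...' lines and 'action' for '- ...' lines."""
    stories = {}
    current = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            stories[current] = []
        elif current is not None and line.startswith("* "):
            stories[current].append(("intent", line[2:].strip()))
        elif current is not None and line.startswith("- "):
            stories[current].append(("action", line[2:].strip()))
    return stories


sample_stories = """
## happy path
* greet
  - utter_greet
* mood_great
  - utter_happy

## say goodbye
* goodbye
  - utter_goodbye
"""

# utter_goodbye is left out on purpose to show what the check reports.
declared_actions = {"utter_greet", "utter_happy"}

parsed = parse_stories(sample_stories)
used_actions = {name
                for steps in parsed.values()
                for kind, name in steps
                if kind == "action"}
print(sorted(used_actions - declared_actions))  # → ['utter_goodbye']
```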

We need to set the domain:

In [None]:
domain_yml = """
intents:
  - greet
  - goodbye
  - mood_affirm
  - mood_deny
  - mood_great
  - mood_unhappy

actions:
- utter_greet
- utter_cheer_up
- utter_did_that_help
- utter_happy
- utter_goodbye

templates:
  utter_greet:
  - text: "Hey! How are you?"

  utter_cheer_up:
  - text: "Here is something to cheer you up:"
    image: "https://i.imgur.com/nGF1K8f.jpg"

  utter_did_that_help:
  - text: "Did that help you?"

  utter_happy:
  - text: "Great carry on!"

  utter_goodbye:
  - text: "Bye"
"""
%store domain_yml > domain.yml

Training the stories:

In [None]:
!python3 -m rasa_core.train -d domain.yml -s stories.md -o models/oreilly

Set up the intents like in the previous example:

In [None]:
nlu_md = """
## intent:greet
- hey
- hello
- hi
- good morning
- good evening
- hey there

## intent:goodbye
- bye
- goodbye
- see you around
- see you later

## intent:mood_affirm
- yes
- indeed
- of course
- that sounds good
- correct

## intent:mood_deny
- no
- never
- I don't think so
- don't like that
- no way
- not really

## intent:mood_great
- perfect
- very good
- great
- amazing
- wonderful
- I am feeling very good
- I am great
- I'm good

## intent:mood_unhappy
- sad
- very sad
- unhappy
- bad
- very bad
- awful
- terrible
- not very good
- extremely sad
- so sad
"""
%store nlu_md > nlu.md

NLU configuration:

In [None]:
nlu_config = """
language: en
pipeline: tensorflow_embedding
"""
%store nlu_config > nlu_config.yml

Rasa NLU training:

In [None]:
!python3 -m rasa_nlu.train -c nlu_config.yml --data nlu.md -o models --fixed_model_name nlu --project oreilly --verbose

Run the chatbot:

In [None]:
import IPython
from IPython.display import clear_output, HTML, display
from rasa_core.agent import Agent
from rasa_core.interpreter import RasaNLUInterpreter
import time

interpreter = RasaNLUInterpreter('models/oreilly/nlu')
messages = ["Hi! you can chat in this window. Type 'stop' to end the conversation."]
agent = Agent.load('models/oreilly', interpreter=interpreter)

def chatlogs_html(messages):
    messages_html = "".join(["<p>{}</p>".format(m) for m in messages])
    chatbot_html = """<div class="chat-window">{}</div>""".format(messages_html)
    return chatbot_html


while True:
    clear_output()
    display(HTML(chatlogs_html(messages)))
    time.sleep(0.3)
    a = input()
    messages.append(a)
    if a == 'stop':
        break
    responses = agent.handle_message(a)
    for r in responses:
        messages.append(r.get("text"))
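
The chat window inserts whatever the user types directly into HTML. A possible refinement, sketched below under the assumption that messages should be shown as plain text, is to escape them first with the standard library's `html.escape`. The function name `chatlogs_html_escaped` is made up for this example.

```python
import html


def chatlogs_html_escaped(messages):
    """Render messages as paragraphs, escaping HTML special characters."""
    paragraphs = "".join(
        "<p>{}</p>".format(html.escape(str(m))) for m in messages
    )
    return '<div class="chat-window">{}</div>'.format(paragraphs)


rendered = chatlogs_html_escaped(["hi", "<b>bold?</b>"])
print(rendered)
```

With this variant a message such as `<b>bold?</b>` is displayed literally instead of being interpreted as markup.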