## Building AI assistants that scale using machine learning and open source tools

![alt text](https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTaX3LNhGcAe1HnPZSuWS0oH6af0LJHXcH7If1sQgLCFAT1chNGFg)

In this workshop, you will learn how to build your own conversational AI assistant using machine learning and real conversational data. The goal of this workshop is to walk you through the process of building an ML-powered assistant from scratch and build an actual assistant which you can improve later.


There are no additional requirements to run this notebook, but if you encounter any issues or have more questions about the content included here, feel free to send a message to the author of this exercise at juste@gmail.com.

### Introduction

During this 3-hour workshop, you will go through each stage of chatbot development and build an assistant capable of recommending talks to attend at this conference and answering questions about the event. Below is an example conversation your assistant will be able to handle:

U: Hello  
A: Hello! I am a conference assistant ready to help you find talks to attend or answer any questions you have.  
U: What session would you recommend for Data Scientists today?  
A: You can attend 'Building AI assistants that scale using machine learning and open source tools' session!  
U: Who is the speaker and how long is the session?  
A: The speaker is Justina. The session is 3 hours long.  
U: Thanks!  
A: You are very welcome!  



      

The workshop consists of the following stages:

**0. Intro:**  
    0.1 Setup and installation 
      
**1. Stage 1: Natural language understanding:**  
    1.1. Designing the happy path  
    1.2. Generating the NLU training examples  
    1.3. Designing the training pipeline  
    1.4. Training the first NLU model  
    1.5. Handling out-of-scope inputs  
    1.6. Adding synonyms  
    1.7. Adding multi-intents  
    1.8. Re-training and testing the updated NLU model  
      
**2. Stage 2: Dialogue management model:**  
    2.1. Designing training stories  
    2.2. Setting up the backend component  
    2.3. Creating a custom action  
    2.4. Defining the domain  
    2.5. Training the dialogue model  
    2.6. Testing the dialogue model   
    2.7. Adding stories with multi-intents  
    2.8. Handling out-of-scope conversations  
    2.9. Evaluating dialogue model  
      
**3. Stage 3: Closing the feedback loop:**  
    3.1. Improving the assistant using the interactive learning   
    3.2. Storing conversation history    
    3.3. Connecting the assistant to the outside world  
   

## 0. Intro
In this section, we will install all the necessary dependencies needed to successfully run this exercise.
### 0.1. Setup and installation
The best way to install the necessary modules is to use the requirements.txt file. After creating a virtual environment, run:

**pip install -r requirements.txt**

Throughout this workshop, we will use only open source tools. The code block below checks whether Rasa NLU and Rasa Core have been installed successfully.

In [None]:
import rasa_nlu
import rasa_core
import warnings
warnings.filterwarnings('ignore')


print("rasa_nlu: {} rasa_core: {}".format(rasa_nlu.__version__, rasa_core.__version__))

## 1. Natural Language Understanding 

In this section, you will enable your assistant to understand user inputs by building a Rasa NLU model. This model takes unstructured user inputs and extracts structured data in the form of intents and entities:  
- *intent* - a label which represents the overall intention of the user's input
- *entity* - an important detail which the assistant should extract and use to steer the conversation
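
To make these two terms concrete, the sketch below shows the general shape of the dictionary that an NLU interpreter returns for a single message. The intent name `recommend_talk`, the entity label `occupation`, and the confidence value are made up for illustration, and the helper `summarise` is hypothetical rather than part of Rasa.

```python
# Illustrative shape of an NLU parse result. All values are made up.
text = "What session would you recommend for Data Scientists today?"
entity_value = "Data Scientists"
start = text.index(entity_value)

parsed = {
    "text": text,
    "intent": {"name": "recommend_talk", "confidence": 0.93},
    "entities": [
        {
            "entity": "occupation",           # entity label
            "value": entity_value,            # extracted detail
            "start": start,                   # character offsets in the text
            "end": start + len(entity_value),
        }
    ],
}


def summarise(result):
    """Return the intent name and a mapping of entity label to value."""
    intent = result["intent"]["name"]
    entities = {e["entity"]: e["value"] for e in result["entities"]}
    return intent, entities


print(summarise(parsed))
```

The dialogue model built later in the workshop works with this structured summary instead of the raw text.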

### 1.1. Designing a happy path

A good starting point is to define a happy path first. A happy path is a conversation flow where the user provides all the required information and allows the assistant to lead the conversation.

### 1.2. Designing the NLU training data

To train the NLU model you will need some labeled training data. Rasa NLU training data samples consist of the following components:  
- an intent label which starts with the prefix ## intent:
- examples of text inputs which correspond to that label
- entities which follow the format *[entity_value] (entity_label)*

We will start by generating some training data examples by hand. For a completed data file check out the *helper_files/nlu_data.md* in the repository of this exercise.
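
As a rough illustration of this format, here is a small set of hand-written samples together with a tiny parser that groups them by intent. The intent names, the `occupation` entity, and the sentences are assumptions made up for this sketch, and `parse_samples` is a hypothetical helper, not the loader Rasa NLU uses.

```python
import re

# Hypothetical training samples in the Rasa NLU markdown format.
sample_nlu_md = """
## intent:greet
- hello
- hi there

## intent:recommend_talk
- what talk would you recommend for [data scientists](occupation)
- which session should [developers](occupation) attend
"""

# Matches annotations of the form [entity_value](entity_label).
ENTITY = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def parse_samples(markdown):
    """Group example sentences by intent and list the annotated entities."""
    samples = {}
    intent = None
    for line in markdown.splitlines():
        line = line.strip()
        if line.startswith("## intent:"):
            intent = line[len("## intent:"):]
            samples[intent] = []
        elif line.startswith("- ") and intent is not None:
            text = line[2:]
            entities = ENTITY.findall(text)      # (value, label) pairs
            plain = ENTITY.sub(r"\1", text)      # sentence without markup
            samples[intent].append((plain, entities))
    return samples


for intent, examples in parse_samples(sample_nlu_md).items():
    print(intent, examples)
```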

In [None]:
nlu_md = """


"""

%store nlu_md > nlu.md



### 1.3. Designing the training pipeline

Once the training data is ready, we can define the NLU model. We can do that by constructing the processing pipeline which defines how structured data will be extracted from unstructured user inputs: how the sentences will be tokenized, what intent classifier will be used, what entity extraction model will be used, etc. Each component in a training pipeline is trained one after another and can take inputs from the previously defined component as well as pass some information to subsequent ones.
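
One possible pipeline is sketched below as the YAML text that would be stored in config.yml. The component names are assumptions based on the Rasa NLU 0.x releases distributed as the `rasa_nlu` package, so please check them against the version you have installed. The helper `component_names` is hypothetical and only lists the components in the order they would run.

```python
# One possible pipeline; component names are assumptions (Rasa NLU 0.x era).
example_configuration = """
language: "en"

pipeline:
- name: "tokenizer_whitespace"                     # split text into tokens
- name: "ner_crf"                                  # extract entities
- name: "intent_featurizer_count_vectors"          # bag-of-words features
- name: "intent_classifier_tensorflow_embedding"   # classify the intent
"""


def component_names(config_text):
    """List pipeline components in the order they would run."""
    names = []
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()       # drop inline comments
        if line.startswith("- name:"):
            names.append(line.split(":", 1)[1].strip().strip('"'))
    return names


print(component_names(example_configuration))
```

The order matters: the tokenizer has to run before the entity extractor, and the featurizer before the intent classifier, because each component consumes what the earlier ones produced.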

In [None]:
configuration = """
language: "en"

pipeline:
- name: 

""" 

%store configuration > config.yml

### 1.4. Training the first Rasa NLU model
Now, we're going to train the NLU model to recognise user inputs, so that when you send a message like "hello" to your bot, it will recognise this as a "greet" intent. Let's define the training function:

In [None]:
from rasa_nlu.training_data import load_data
from rasa_nlu.config import RasaNLUModelConfig
from rasa_nlu.model import Trainer
from rasa_nlu import config

def train_nlu_model():
    # loading the nlu training samples
    training_data = load_data("nlu.md")

    # trainer to educate our pipeline
    trainer = Trainer(config.load("config.yml"))

    # train the model!
    interpreter = trainer.train(training_data)

    # store it for future use
    model_directory = trainer.persist("./models/current", fixed_model_name="nlu")
    
    return interpreter, model_directory



Finally, let's train the model using the previously defined data and model configuration:

In [None]:
interpreter, model_directory = train_nlu_model()

### Testing the model

We have trained the first version of our NLU model! Let's test it on various inputs:

In [None]:
import logging, io, json, warnings
logging.basicConfig(level="INFO")
warnings.filterwarnings('ignore')

def pprint(o):
    # small helper function to make dict dumps a bit prettier
    print(json.dumps(o, indent=2))

# replace the empty string with an input of your choice
input_message = ""
pprint(interpreter.parse(input_message))

### 1.5. Handling out-of-scope inputs
When dealing with conversational AI, out-of-scope user inputs are a very common challenge. These inputs represent user requests which have nothing to do with the assistant's job. While it's very challenging to provide a sensible answer to each out-of-scope input, it's important to enable your assistant to identify such inputs and guide the user back to the conversation. First, let's enable our assistant to identify out-of-scope inputs. To do that, we will add a new intent called *out-of-scope* to our training dataset and provide some corresponding inputs:

In [None]:
nlu_md = """


"""

%store nlu_md > nlu.md


Let's retrain the model and see how it deals with out-of-scope inputs now:

In [None]:
interpreter, model_directory = train_nlu_model()

In [None]:
input_message = ""
pprint(interpreter.parse(input_message))

### 1.6. Adding synonyms

Synonyms are a very useful Rasa NLU feature which maps different extracted entity values to one canonical value. They are used when extracted values have to be normalised before they can be used to query a database or make an API call. In our example, the occupation of the relevant audience is a good candidate for a synonym because users can express the same occupation in a variety of ways (for example, Machine Learning and ML). Let's update our training examples with synonyms.
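
In the Rasa NLU markdown format, a synonym can be attached inline, as in `[machine learning engineers](occupation:ML)`, or listed under a `## synonym:ML` heading. The sketch below is a simplified illustration of what the mapping achieves, not Rasa's implementation. The synonym table, the `occupation` label, and the helper `normalise_entities` are assumptions made up for this example.

```python
# Simplified illustration of synonym mapping; not Rasa's implementation.
SYNONYMS = {
    "machine learning": "ML",
    "machine learning engineers": "ML",
    "ml engineers": "ML",
}


def normalise_entities(entities, synonyms=SYNONYMS):
    """Replace each entity value with its canonical form, if one is known."""
    normalised = []
    for entity in entities:
        value = entity["value"]
        canonical = synonyms.get(value.lower(), value)  # unknown values pass through
        normalised.append({**entity, "value": canonical})
    return normalised


extracted = [{"entity": "occupation", "value": "Machine Learning engineers"}]
print(normalise_entities(extracted))
```

With the value normalised to one form, a database query or API call only has to handle a single spelling of each occupation.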

In [None]:
nlu_md = """


"""

%store nlu_md > nlu.md


To train the NLU model with synonyms, we have to add the synonyms component to the model pipeline:

In [None]:
configuration = """
language: "en"

pipeline:


""" 

%store configuration > config.yml

Now, let's retrain the NLU model and test the performance.

In [None]:
interpreter, model_directory = train_nlu_model()

See how 'machine learning engineers' now gets mapped to 'ML':

In [None]:
input_message = ""
pprint(interpreter.parse(input_message))

### 1.7. Implementing multi-intents

The NLU model we have built so far works pretty well, but it supports only one intent per user input. In this step, we will use a TensorFlow embedding model to enable the assistant to recognise multi-intents - more than one intention per user input. Let's start by defining multi-intents in our training data. Multi-intents are defined in much the same way as regular intents; the only difference is that the label names consist of intent tokens and a character of your choice that separates them, for example **intent_token1+intent_token2**.
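
The idea of a label made of tokens can be sketched in a few lines. The label names and the `+` separator below are illustrative choices, and `intent_tokens` is a hypothetical helper. In the Rasa NLU 0.x embedding classifier, this behaviour is, to the best of my knowledge, switched on with the `intent_tokenization_flag` and `intent_split_symbol` options, which is worth verifying against your installed version.

```python
# Sketch of how a multi-intent label decomposes into intent tokens.
INTENT_SPLIT_SYMBOL = "+"


def intent_tokens(label, split_symbol=INTENT_SPLIT_SYMBOL):
    """Split a (multi-)intent label into its individual intent tokens."""
    return label.split(split_symbol)


print(intent_tokens("ask_speaker+ask_duration"))  # a label with two tokens
print(intent_tokens("greet"))                     # a regular intent stays whole
```

Because the classifier sees the shared tokens, a combined label can borrow from the examples of its parts instead of being learned as an unrelated class.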

In [None]:
nlu_md = """

"""

%store nlu_md > nlu.md


Next, let's modify the configuration of the model pipeline to use the tensorflow_embedding model with multi-intent support.

In [None]:
configuration = """
language: "en"

pipeline:


""" 

%store configuration > config.yml

Let's retrain the model with the new pipeline and test the performance:

In [None]:
interpreter, model_directory = train_nlu_model()

See how a two-question input now gets recognised as a multi-intent:

In [None]:
input_message = ""
pprint(interpreter.parse(input_message))

### 1.8. Evaluating the NLU model


Testing the model on various inputs is a good way to get high-level insights into its performance. However, it is a time-consuming and rather tedious way of testing. Instead of evaluating the model by hand, you can evaluate it on a test data set (for simplicity, we are going to use the same data for testing and training, which will overestimate the real performance):

In [None]:

from rasa_nlu.evaluate import run_evaluation
import IPython
from IPython import display

run_evaluation("nlu.md", model_directory)
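
For a quick sanity check on a handful of held-out sentences, intent accuracy can also be computed by hand. The sketch below assumes only that a parse function returns a dictionary with an `intent` entry; `fake_parse` is a keyword-based stand-in for `interpreter.parse`, and the sentences and intent names are made up.

```python
def intent_accuracy(parse, labelled_examples):
    """Fraction of examples whose predicted intent matches the expected one."""
    correct = sum(
        1
        for text, expected in labelled_examples
        if parse(text)["intent"]["name"] == expected
    )
    return correct / len(labelled_examples)


def fake_parse(text):
    # Stand-in for interpreter.parse: a keyword rule instead of a trained model.
    name = "greet" if "hello" in text.lower() else "recommend_talk"
    return {"intent": {"name": name}}


held_out = [
    ("Hello there", "greet"),
    ("Which talk should I attend?", "recommend_talk"),
    ("Good morning", "greet"),  # the keyword rule misses this one
]
print(intent_accuracy(fake_parse, held_out))
```

Passing `interpreter.parse` instead of `fake_parse` would score the trained model on the same sentences.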

Congratulations! You have just implemented the natural language understanding part of your assistant which means that your assistant can now understand you. In the second part of this workshop, we will delve into the next stage of the chatbot development - dialogue management.

# 2. Dialogue Management


In this section of this workshop you will build a machine learning-based dialogue model which will enable your assistant to decide on how to respond to user inputs based on the state of the conversation. 

## 2.1 Designing the training stories

Let's start with generating the training data. Rasa Core models learn by observing real conversational data between the user and the assistant. The only important thing is that this data has to be converted into the Rasa Core format: user inputs have to be expressed as corresponding intents (and entities where necessary) while the responses of the assistant are expressed as action names. Each training story follows the format:  
- the story starts with a story name which has a prefix ##  
- intents, corresponding to user inputs, start with *  
- if the NLU model extracts entities which should influence the predictions of the dialogue model, they have to be included in the stories using the following format: * intent{"entity_name": "entity_value"}  
- the responses of the bot start with -  
- the story ends with an empty line which marks the end of the story
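
A story following these rules could look like the sample below. The story name, the intent and action names, and the `occupation` entity are illustrative assumptions, and `story_turns` is a hypothetical helper that pairs each user intent with the bot actions that follow it.

```python
# A hypothetical happy-path story in the Rasa Core markdown format.
sample_story_md = """
## happy path
* greet
  - utter_greet
* recommend_talk{"occupation": "ML"}
  - action_recommend_talk
* thanks
  - utter_you_are_welcome
"""


def story_turns(story):
    """Pair each user intent line with the bot actions that follow it."""
    turns = []
    for line in story.splitlines():
        line = line.strip()
        if line.startswith("* "):
            turns.append((line[2:], []))      # a new user turn
        elif line.startswith("- ") and turns:
            turns[-1][1].append(line[2:])     # bot action for the latest turn
    return turns


for user, actions in story_turns(sample_story_md):
    print(user, "->", actions)
```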

In the next step of this tutorial, we will generate some training stories to cover the happy path. To see a complete training data example, check out the **data/stories.md** file of this repository.


In [None]:
stories_md = """

"""

%store stories_md > stories.md

## 2.2 Setting up the backend component

We want to make our assistant engaging and fun. For that reason, we will enable it to answer questions using real data stored in a SQL database. For this exercise, the assistant will be able to pull information about the conference agenda, talks, and speakers. Let's take a look at what the data in the SQL database looks like.
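
The sketch below shows the kind of lookup the assistant needs, using an in-memory SQLite database. The table name `talks`, its columns, and the helper `talks_for` are assumptions made up for this example; the workshop database may use a different schema.

```python
import sqlite3

# In-memory database with a hypothetical schema.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE talks (title TEXT, speaker TEXT, audience TEXT, duration_hours REAL)"
)
connection.execute(
    "INSERT INTO talks VALUES (?, ?, ?, ?)",
    ("Building AI assistants that scale", "Justina", "Data Scientists", 3.0),
)
connection.commit()


def talks_for(audience):
    """Return (title, speaker) rows for talks aimed at the given audience."""
    cursor = connection.execute(
        "SELECT title, speaker FROM talks WHERE audience = ?", (audience,)
    )
    return cursor.fetchall()


print(talks_for("Data Scientists"))
```

Passing the audience as a query parameter, rather than formatting it into the SQL string, keeps user-provided text from being interpreted as SQL.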

In [None]:
import sqlite3 as lite
import pandas as pd



## 2.3 Creating a custom action

We are going to use the backend integration to enable our assistant to fetch the relevant data based on the user's queries. For that, we will create custom actions which, when predicted, will collect the necessary data and use it to steer the conversation further:

In [None]:
actions = """
from rasa_core_sdk import Action
from rasa_core_sdk.events import SlotSet
import sqlite3 as lite
import random


class ActionRecommendTalk(Action):
    def name(self):
        return ""
        
    def run(self, dispatcher, tracker, domain):
        
        
        dispatcher.utter_message()


        return []

"""

%store actions > actions.py
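
One way the body of such an action could work is sketched below. To keep the example runnable without an action server, `FakeDispatcher` and `FakeTracker` are stand-ins that mimic only the `utter_message` and `get_slot` calls; they are not the `rasa_core_sdk` classes. The slot name `occupation`, the `TALKS` data, and the class name `ActionRecommendTalkSketch` are likewise assumptions.

```python
import random


# Stand-ins for the objects passed to a custom action (NOT rasa_core_sdk classes).
class FakeDispatcher:
    def __init__(self):
        self.messages = []

    def utter_message(self, text):
        self.messages.append(text)


class FakeTracker:
    def __init__(self, slots):
        self.slots = slots

    def get_slot(self, key):
        return self.slots.get(key)


# Hypothetical data; in the workshop this would come from the SQL database.
TALKS = {"ML": ["Building AI assistants that scale"]}


class ActionRecommendTalkSketch:
    def name(self):
        # The name under which the action is listed in the domain and stories.
        return "action_recommend_talk"

    def run(self, dispatcher, tracker, domain):
        occupation = tracker.get_slot("occupation")
        candidates = TALKS.get(occupation, [])
        if candidates:
            talk = random.choice(candidates)
            dispatcher.utter_message("You can attend '{}'!".format(talk))
        else:
            dispatcher.utter_message("Sorry, I could not find a matching talk.")
        return []  # no events (such as slot updates) to report


dispatcher = FakeDispatcher()
ActionRecommendTalkSketch().run(dispatcher, FakeTracker({"occupation": "ML"}), domain={})
print(dispatcher.messages)
```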

## 2.4 Defining the domain

Once we have the training data in place, we can define the domain of our assistant. A domain defines the environment in which the assistant operates - what user inputs it should expect to see, what actions it should be able to predict, what information the assistant should store throughout the conversation.
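
A small domain could look like the sketch below, written as the YAML text that would be stored in domain.yml. The layout follows the Rasa Core 0.x domain format as I understand it, and every intent, entity, slot, action, and template name is an illustrative assumption. The helper `section_items` is hypothetical and only reads the simple list sections.

```python
# A hypothetical domain; all names below are illustrative.
example_domain_yml = """
intents:
- greet
- recommend_talk
- thanks

entities:
- occupation

slots:
  occupation:
    type: text

actions:
- utter_greet
- utter_you_are_welcome
- action_recommend_talk

templates:
  utter_greet:
  - text: "Hello! I am a conference assistant."
  utter_you_are_welcome:
  - text: "You are very welcome!"
"""


def section_items(domain_text, section):
    """Return the top-level list items under the given section heading."""
    items, inside = [], False
    for line in domain_text.splitlines():
        if line and not line.startswith((" ", "-")):
            inside = line.rstrip() == section + ":"   # entering a new section
        elif inside and line.startswith("- "):
            items.append(line[2:].strip())
    return items


print(section_items(example_domain_yml, "actions"))
```

Every intent and action that appears in the training stories should also be listed here, otherwise the dialogue model has no way to predict it.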

In [None]:
domain_yml = """



"""

%store domain_yml > domain.yml

## 2.5 Training the dialogue model

We now have all the components necessary to train the dialogue management model. The code cell below will train the model using the defined policy and store the model in a specified location for us to test later.

In [None]:
from rasa_core.policies import KerasPolicy, MemoizationPolicy
from rasa_core.agent import Agent

def train_dialogue():
    # create an agent from the domain and the policies to train
    agent = Agent("domain.yml", policies=[MemoizationPolicy(), KerasPolicy()])

    # load the training stories
    training_data = agent.load_data('stories.md')

    # train the dialogue model
    agent.train(training_data)

    # store the trained model for later testing
    agent.persist('models/dialogue')


In [None]:
train_dialogue()

## 2.6 Testing the dialogue model

It's finally time for the most exciting part - testing the bot! Let's spin up the custom action server and we are ready to go. Open a new terminal and execute the following command:

**python -m rasa_core_sdk.endpoint --actions actions**

In [None]:
import IPython
from IPython.display import clear_output
from rasa_core.agent import Agent
from rasa_core.interpreter import NaturalLanguageInterpreter
from rasa_core.utils import EndpointConfig
import time

def load_assistant():
    messages = ["Hi! you can chat in this window. Type 'stop' to end the conversation."]
    interpreter = NaturalLanguageInterpreter.create(model_directory)
    endpoint = EndpointConfig('http://localhost:5055/webhook')
    agent = Agent.load('models/dialogue', interpreter=interpreter, action_endpoint = endpoint)

    print("Your bot is ready to talk! Type your messages here or send 'stop'")
    while True:
        a = input()
        if a == 'stop':
            break
        responses = agent.handle_text(a)
        for response in responses:
            print(response["text"])

In [None]:
load_assistant()

## 2.7 Adding stories with multi-intents

Next, let's add a few stories with multi-intents. Such stories follow the regular data format; the only difference is that a multi-intent can be followed by several actions for the assistant to predict:

In [None]:
stories_md = """
               


"""

%store stories_md > stories.md

In [None]:
train_dialogue()

In [None]:
load_assistant()

## 2.8 Adding out-of-scope inputs

Finally, let's design a story with out-of-scope user inputs. Here, it's important to enable an assistant to take charge of the conversation and guide the user back to the initial conversation. In our case, an assistant will let the user know that it cannot deal with the out-of-scope request and will offer other questions to be asked:

In [None]:
stories_md = """



"""

%store stories_md > stories.md

In [None]:
train_dialogue()

In [None]:
load_assistant()

## 2.9 Dialogue model evaluation

Another great way to see how good our dialogue model is, is to test it using evaluation scripts:

In [None]:

!python -m rasa_core.evaluate --core models/dialogue --stories stories.md -o results


# 3. Closing the feedback loop

In [None]:
print(model_directory)

Developing an assistant is just one part of the process. Another very important part of a successful assistant is enabling it to learn from real user feedback. In the last part of this workshop, we will cover two ways to improve your bot using real user feedback - interactive learning and the history of the conversations. We will also connect our assistant to a custom webpage to see how it works in action! We will complete this part using the command line.

## 3.1. Improving the assistant using the interactive learning 
Interactive learning is a great way to improve your assistant and generate more training examples by simply talking to your bot and providing feedback on the predictions it makes. The main idea behind it is that, instead of responding right away, the assistant tells you what it thinks it should do next and asks you for feedback. To start an interactive learning session, run the following command in the command line:


**python -m rasa_core.train interactive --core models/dialogue --nlu models/current/default/nlu --endpoints endpoints.yml**

## 3.2. Storing conversation history 
Another way to improve your assistant is to observe real conversations between the users and the bot. To do so, we first have to store the conversations. In Rasa Core, the tracker is responsible for keeping track of everything that happens throughout the conversation - user inputs, NLU model results, dialogue model predictions, etc. We can easily store all this data in a database for later use. In this step you will learn how to store this information in a Mongo tracker store. We will complete this step in the command line.

First, we will set up a MongoDB backend and store the conversation history there. For that, we will start our assistant on a server using:  
**python -m rasa_core.run -d models/dialogue -u models/current/default/nlu --port 5005  --endpoints endpoints.yml**
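
The endpoints.yml file passed to this command is where both the action server and the tracker store would be configured. The sketch below shows possible contents as YAML text. The key names follow the Rasa Core 0.x documentation as best I recall, and the host, port, and database name are assumptions, so please verify them against your setup. The helper `top_level_keys` is hypothetical.

```python
# Possible contents of endpoints.yml; keys and values are assumptions.
example_endpoints_yml = """
action_endpoint:
  url: "http://localhost:5055/webhook"

tracker_store:
  store_type: mongod
  url: "mongodb://localhost:27017"
  db: "rasa"
"""


def top_level_keys(yaml_text):
    """Return the unindented keys of a simple YAML document."""
    return [
        line.split(":", 1)[0]
        for line in yaml_text.splitlines()
        if line and not line.startswith((" ", "#")) and ":" in line
    ]


print(top_level_keys(example_endpoints_yml))
```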

##  3.3. Connecting the assistant to the outside world
In the very last step of this workshop, you will learn how to connect your assistant to a custom UI which can be easily added to a website of your choice. The repository of this workshop contains a folder called *bot_ui* where you can find a very basic html webpage. We will add a UI code stored in a *ui.html* file and connect our assistant to it. We will complete this step using a text editor and a command line.

After setting up the backend, we will start our assistant on a server and connect it to the UI using:

**python -m rasa_core.run -d models/dialogue -u models/current/default/nlu --port 5005  --endpoints endpoints.yml --credentials credentials.yml**

# 4. Summary 
In this workshop, we covered some of the most important steps of chatbot development, and you should now have a simple conference bot running on your machine. There are so many things you can do to take this assistant to a whole new level! Here are some ideas for you:
- Add new skills like:
    - showing the session timetable
    - telling what the next session at a specific venue is
    - connecting with a speaker on social media
    - recommending resources to learn more about the topic
        
- Add new entities like date and time
- Connect your assistant to the most popular messaging platforms like Facebook, Slack or Telegram

Make sure to reference [Rasa official documentation](https://rasa.com/docs) or ask questions on the [Rasa Community Forum](https://forum.rasa.com) if you are in doubt! 

Most importantly, let me know what you came up with!