### Intro
- Rasa Core and Rasa NLU together are called the Rasa Stack
- We will build a weather-reporting chatbot
    - it will perform entity extraction and intent classification
    
### Setup
1. pip install -r requirements.txt
2. language model ( en ) 
    - the language model is used to parse incoming text messages and extract the necessary information
3. Rasa NLU trainer
    - this makes generating training data a lot easier
    - it is a JS-based UI application, so we need Node.js and npm
    - download Node.js (with npm) and add it to the PATH
    - then run
        - npm i -g rasa-nlu-trainer
    - this will install rasa-nlu trainer
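
Before running the npm step it can help to confirm that Node.js and npm are actually on the PATH. A minimal sketch using only the standard library; the helper name `missing_tools` is made up for this note:

```python
import shutil


def missing_tools(tools=("node", "npm")):
    """Return the names from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("node and npm are available")
```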

In [1]:
%matplotlib inline

import logging, io, json, warnings
logging.basicConfig(level="INFO")
warnings.filterwarnings('ignore')

def pprint(o):
    # small helper to make dict dumps a bit prettier
    print(json.dumps(o, indent=2))

In [2]:
# 2. install language model 

import sys
python = sys.executable

# this downloads the English spaCy model,
# installs it and links it to the shortcut name "en"
!{python} -m spacy download en

Collecting en_core_web_sm==2.0.0 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm==2.0.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz (37.4MB)
    100% |████████████████████████████████| 37.4MB 92.1MB/s

### Training 
- we have to teach the chatbot to understand unstructured human language so that it knows what we are saying
    - so we train an NLU model that takes unstructured text messages and returns structured data, in the form of intents and entities, that the bot can work with
- Intent
    - what the message is about
- Entity
    - pieces of information such as location names, dates, etc.
    - these help the chatbot understand what specifically we are talking about and asking for
- Entity extraction and intent classification are ML problems, so we need training data to train the models

##### Rasa NLU training data
- training data should contain example messages that we want the chatbot to learn from
    - each example lists the corresponding intent, the entities in the sentence, and where in the sentence they are found
    
##### creating data
- mkdir data
- cd data
- echo 'nlu_data' > data.json
- there are two ways to create training examples for NLU models
    - one way is to write them directly into this data file
    
{
  "rasa_nlu_data":{
    "common_examples":[
    {
       "text":"Hello",
       "intent":"greet",
       "entities":[]
    },
    {
       "text":"goodbye",
       "intent":"goodbye",
       "entities":[]
    }
      ]
  }
}

    - save this file
    - another way :
        - in the data folder, launch rasa-nlu-trainer
        - add a new example
        - add the text
            - What's the weather in Berlin at the moment?
        - now highlight Berlin and add it as an entity,
            - give the entity the name location
        - now if we open data.json, we can see the entities populated
            - each entity also has start and end values that show where it occurs in the text
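
The start and end values can also be computed by hand when writing examples directly into the file. A minimal sketch, assuming the end offset is exclusive (so `text[start:end]` gives back the entity value); the helper `make_example` is hypothetical and not part of Rasa:

```python
def make_example(text, intent, entity_value=None, entity_name=None):
    """Build one training example in the same shape as the entries in data.json."""
    entities = []
    if entity_value is not None:
        # str.index raises ValueError if the value does not occur in the text
        start = text.index(entity_value)
        entities.append({
            "start": start,
            "end": start + len(entity_value),  # exclusive end offset
            "value": entity_value,
            "entity": entity_name,
        })
    return {"text": text, "intent": intent, "entities": entities}


example = make_example(
    "What's the weather in Berlin at the moment?",
    "inform",
    entity_value="Berlin",
    entity_name="location",
)
print(example["entities"])
```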

#### Amount of training data
- now we have three training examples
- we need more examples for each of these intents
    - examples should be different and diverse
- now add the examples from the data.json on GitHub to our data.json
    - it has more examples of greeting, saying goodbye and asking for the weather
- reload rasa-nlu-trainer
- now we have close to 40 examples in total, which is still very few
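
To see how the examples are spread over the intents, the loaded JSON can be counted per intent. A small sketch; `intent_counts` is a made-up helper and `sample` holds only the two examples from the snippet above, not the full data file:

```python
from collections import Counter


def intent_counts(nlu_data):
    """Count the training examples per intent in a rasa_nlu_data dictionary."""
    examples = nlu_data["rasa_nlu_data"]["common_examples"]
    return Counter(example["intent"] for example in examples)


sample = {
    "rasa_nlu_data": {
        "common_examples": [
            {"text": "Hello", "intent": "greet", "entities": []},
            {"text": "goodbye", "intent": "goodbye", "entities": []},
        ]
    }
}
print(intent_counts(sample))
```

With the real file, the dictionary would come from `json.load(open("data/data.json"))` instead of `sample`.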

#### Before training
- we need to create a configuration file
- go up out of the data folder,
    - echo 'config' > config_spacy.json
- the config file is important as it provides the parameters used for training
    - 1. pipeline : specifies which featurizers and feature extractors Rasa NLU uses to process text messages and extract the necessary information
        - Rasa NLU has two main pre-built pipelines
            - a. MITIE based
            - b. sklearn based (spacy_sklearn, used here)
    - 2. path : directory where the model is kept after training
    - 3. data : location of the training data file
    
- config_spacy.json
{
  "pipeline":"spacy_sklearn",
  "path":"./models/nlu",
  "data":"./data/data.json"
}
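
The same file can also be generated from Python instead of being typed by hand. A minimal sketch with the three keys from the config above; the helpers `render_config` and `write_config` are hypothetical names:

```python
import json

config_settings = {
    "pipeline": "spacy_sklearn",
    "path": "./models/nlu",
    "data": "./data/data.json",
}


def render_config(settings):
    """Serialise the settings as indented JSON text."""
    return json.dumps(settings, indent=2)


def write_config(path, settings):
    """Write the settings to `path` as JSON."""
    with open(path, "w") as handle:
        handle.write(render_config(settings))


print(render_config(config_settings))
# to create the file: write_config("config_spacy.json", config_settings)
```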

#### model creation

In [2]:
from rasa_nlu.training_data import load_data
from rasa_nlu import config
from rasa_nlu.model import Trainer
from rasa_nlu.model import Metadata, Interpreter

In [3]:
def train_nlu(data, configs, model_dir):
    training_data = load_data(data)
    # the Trainer needs a configuration,
    # which config.load reads from the config file
    trainer = Trainer(config.load(configs))
    trainer.train(training_data)
    # model_dir is where the trained model is saved
    model_directory = trainer.persist(model_dir, fixed_model_name="weathernlu")

In [4]:
train_nlu("./data/data.json", "config_spacy.json", "./models/nlu")

INFO:rasa_nlu.training_data.loading:Training data format of ./data/data.json is rasa_nlu
INFO:rasa_nlu.training_data.training_data:Training data stats: 
	- intent examples: 29 (3 distinct intents)
	- Found intents: 'greet', 'inform', 'goodbye'
	- entity examples: 13 (1 distinct entities)
	- found entities: 'location'

INFO:rasa_nlu.utils.spacy_utils:Trying to load spacy model with name 'en'
INFO:rasa_nlu.components:Added 'nlp_spacy' to component cache. Key 'nlp_spacy-en'.
INFO:rasa_nlu.model:Starting to train component nlp_spacy
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component tokenizer_spacy
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component intent_featurizer_spacy
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component intent_entity_featurizer_regex
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component ner_cr

Fitting 2 folds for each of 6 candidates, totalling 12 fits


- to check that the model was persisted, look for the models folder
- the Interpreter class (imported above together with Metadata) loads the model and gets it ready to use

In [5]:
def run_nlu():
    # now we load our model
    interpreter = Interpreter.load("./models/nlu/default/weathernlu")
    pprint(interpreter.parse(u"I am planning my holiday to Lithuania. I wonder what is the weather out there."))

In [6]:
run_nlu()

INFO:rasa_nlu.components:Added 'nlp_spacy' to component cache. Key 'nlp_spacy-en'.


{
  "intent": {
    "name": "inform",
    "confidence": 0.8509291192465037
  },
  "entities": [
    {
      "start": 28,
      "end": 37,
      "value": "lithuania",
      "entity": "location",
      "confidence": 0.919550976223837,
      "extractor": "ner_crf"
    }
  ],
  "intent_ranking": [
    {
      "name": "inform",
      "confidence": 0.8509291192465037
    },
    {
      "name": "goodbye",
      "confidence": 0.07978649851713138
    },
    {
      "name": "greet",
      "confidence": 0.06928438223636452
    }
  ],
  "text": "I am planning my holiday to Lithuania. I wonder what is the weather out there."
}


- Rasa NLU tells us which intent the text belongs to
- it also gives the confidence score for every other intent that we have
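
The confidence score can be used to decide whether to trust the predicted intent. A minimal sketch that works on a dictionary shaped like the parse output above; the `threshold` value and the `"fallback"` name are assumptions made for illustration, not Rasa settings:

```python
def top_intent(parsed, threshold=0.5, fallback="fallback"):
    """Return the predicted intent name, or `fallback` if the confidence is too low."""
    intent = parsed["intent"]
    if intent["confidence"] >= threshold:
        return intent["name"]
    return fallback


# trimmed copy of the parse result printed above
parsed = {
    "intent": {"name": "inform", "confidence": 0.8509291192465037},
    "intent_ranking": [
        {"name": "inform", "confidence": 0.8509291192465037},
        {"name": "goodbye", "confidence": 0.07978649851713138},
        {"name": "greet", "confidence": 0.06928438223636452},
    ],
}
print(top_intent(parsed))
```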

### Dialogue Management model
- the dialogue management model predicts what action or response the chatbot should make based on the state of the conversation
    - actions can be simple API calls, text responses or retrieving data from a DB
- Q. Why do we need ML for that ?
    - if we ask for the weather without giving a location, we want the chatbot to ask which location we mean
        - in practice these types of conversations are hard-coded in the form of flow charts.
        - code-wise we can imagine this as a bunch of if-else statements
        - this means a developer has to write out the most likely happy paths from the start of the conversation to the end goal.
        - with every intent and entity we add, the flowchart becomes more complex, and it gets difficult to keep track of all the paths a user can take to get the answers they want
    - in Rasa, we have an ML model that we can train, and it predicts what the bot should do next based on
        - a. the context
        - b. and the state of the conversation
        - as a result, the conversational flow is far more natural and we get a better user experience
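
For contrast, the hard-coded if-else approach might look like the sketch below. This is the approach the ML model replaces, not how Rasa works; the action names are placeholders chosen for this example:

```python
def next_action(intent, location=None):
    """Hand-written dialogue rules: one branch per intent and slot combination."""
    if intent == "greet":
        return "utter_greet"
    if intent == "goodbye":
        return "utter_goodbye"
    if intent == "inform":
        # without a location we first have to ask for one
        if location is None:
            return "utter_ask_location"
        return "action_weather"
    raise ValueError("no rule for intent: {}".format(intent))


print(next_action("inform"))
print(next_action("inform", location="Berlin"))
```

Every new intent or slot adds more branches here, which is the maintenance problem described above.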
        
### Building dialogue management model
- create a domain file for our chatbot
- weather_domain.yml
    - this will be a YAML file
- the domain describes the environment the chatbot operates in
- the domain consists of 5 key parts
    - 1. list of slots 
        - slots are like placeholders that help the chatbot keep track of the context of the conversation
            - ex. we are asking about the weather in a specific location
                - the chatbot should keep track of the location we asked about, so that we do not have to remind it later in the conversation which location we meant
                - the chatbot is also going to make an API call to get the weather information
        - so we will create a slot called location, and we also need to say what data type this slot has
            - the data type is important because different slot types affect how the dialogue management model makes predictions
            - for some data types the value itself matters for the prediction
            - for others, only whether the slot is populated or not affects the prediction
        - in our case, the location slot will have type text
        - we will have only one slot in our example
                
    - 2. intents
        - these are same intents that we had in NLU model
            - we had three
                 - greet
                 - goodbye
                 - inform
    - 3. entities
        - list of entities that the chatbot should be aware of and ready to get from the user
            - we had only one
                - location 
            - here location is both an entity and a slot, so the NLU model extracts the location name as an entity and its value is set as the slot.
                - that's how this value is saved and kept throughout the conversation
    - 4. list of templates
        - these are the text responses that the chatbot sends back to the user once specific actions are predicted
        - so we will initialize an action which should be executed when my 
        - for each action we write the text message that we want to reply with
            - ex. utter_greet:
                - 'hello ! how can I help ?'
            - utter_goodbye:
                - 'Talk to you later'
                - 'Bye Bye :(' 
        - we added one more possible answer for more diversity; the chatbot will pick at random which answer to use
            - utter_ask_location:
                - 'In what location?'
    - 5. list of actions
        - actions that the chatbot should be ready to execute when they are predicted
        - we already have 3 actions created in the templates
            - utter_greet
            - utter_goodbye
            - utter_ask_location
        - note : for returning weather data, we will write a custom action in Python
    - all 5 parts are important because the Rasa Core dialogue management (RCDM) model uses them to make predictions
        - the RCDM model predicts which action should be executed next, given the slots that are currently populated, based on
            - a. intents and entities returned by Rasa NLU model ie. what a user spoke about
            - b. what actions were performed previously ie. what is the state of the conversation at the moment
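
Two of the ideas above, setting a slot from an extracted entity and picking one of several template answers at random, can be illustrated in plain Python. This is only a toy model of the behaviour, not Rasa Core code; `fill_slots` and `pick_response` are made-up helpers:

```python
import random

TEMPLATES = {
    "utter_greet": ["Hello! How can I help?"],
    "utter_goodbye": ["Talk to you later.", "Bye bye :("],
    "utter_ask_location": ["In what location?"],
}


def fill_slots(slots, entities):
    """Copy entity values into slots that have the same name as the entity."""
    updated = dict(slots)
    for entity in entities:
        if entity["entity"] in updated:
            updated[entity["entity"]] = entity["value"]
    return updated


def pick_response(action_name):
    """Choose one of the template texts for the action at random."""
    return random.choice(TEMPLATES[action_name])


slots = fill_slots({"location": None}, [{"entity": "location", "value": "lithuania"}])
print(slots)
print(pick_response("utter_goodbye"))
```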
          
### Custom Actions  

In [7]:
# __future__ imports enable newer language features on older
# interpreters, so that this code also works with
# older versions of Python

from __future__ import absolute_import, division, unicode_literals

from rasa_core.actions.action import Action
from rasa_core.events import SlotSet

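# custom action that looks up the current weather
class ActionWeather(Action):
    
    # name used to refer to this action in the domain and in stories
    def name(self):
        return "action_weather"
    
    # the actual work happens here,
    # using the Apixu weather API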
    def run(self, dispatcher, tracker, domain):
        from apixu.client import ApixuClient
        api_key = "3564bf1fbe6d44d0b4c93136190906"
        # authentication
        client = ApixuClient(api_key)
        # remember we have a slot which keeps location info
            # through out the conversation
        # from tracker get value of a particular slot    
        loc = tracker.get_slot("location")
        
        # response is going to be a dictionary
            # having lot of details
        current = client.getCurrentWeather(q=loc)
        
        # now we will parse the response
        country = current['location']['country']
        city = current['location']['name']
        condition = current['current']['condition']['text']
        temp_c = current['current']['temp_c']
        humidity = current['current']['humidity']
        wind_mph = current['current']['wind_mph']
        
        # now we will create the response message
        response = (
            "It is currently {} in {}. The temperature is {} degrees, "
            "the humidity is {}% and the wind speed is {} mph."
        ).format(condition, city, temp_c, humidity, wind_mph)
        # the dispatcher sends the response to the user
        dispatcher.utter_message(response)
        
        # lastly, return a SlotSet event that keeps the location slot set
        return [SlotSet('location', loc)]

- now we have to add this action to our domain
    - under actions
        - actions.ActionWeather
    - note : here the ActionWeather class lives in the file actions.py
    - in Jupyter, add __main__.ActionWeather instead
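
The message-building part of the action can be tried without an API key by pulling it into a plain function. A sketch that reads the same fields as the action above; `format_weather` is a hypothetical helper and the values in `sample` are invented, not a real API response:

```python
def format_weather(current):
    """Build the reply text from a response dictionary shaped like the one parsed above."""
    location = current["location"]
    now = current["current"]
    return (
        "It is currently {} in {}. The temperature is {} degrees, "
        "the humidity is {}% and the wind speed is {} mph."
    ).format(
        now["condition"]["text"],
        location["name"],
        now["temp_c"],
        now["humidity"],
        now["wind_mph"],
    )


# invented sample data, for illustration only
sample = {
    "location": {"name": "Berlin", "country": "Germany"},
    "current": {
        "condition": {"text": "Sunny"},
        "temp_c": 21.0,
        "humidity": 40,
        "wind_mph": 5.6,
    },
}
print(format_weather(sample))
```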

In [8]:
domain_yml_data = """
slots:
  location:
    type: text


intents:
 - greet
 - goodbye
 - inform


entities:
 - location

templates:
  utter_greet:
    - 'Hello! How can I help?'
  utter_goodbye:
    - 'Talk to you later.'
    - 'Bye bye :('
  utter_ask_location:
    - 'In what location?'


actions:
 - utter_greet
 - utter_goodbye
 - utter_ask_location

""" 
 
%store domain_yml_data > weather_domain.yml

Writing 'domain_yml_data' (str) to file 'weather_domain.yml'.


### Data
- the dialogue management model is trained on actual conversations that users have with the bot
    - the only requirement is that these conversations are converted into a story format
    - a story is an actual conversation between a user and a chatbot, where user inputs are converted into the corresponding intents and entities, and the chatbot's responses are expressed as the actions the bot would execute at that stage
- example of how a real conversation can be written as a story
<img src="rasa1.JPG">

- but first, the question is how to get this data in the first place
    - one way is to use the Rasa Core feature called online training
        - this not only helps with generating data, but also trains a dialogue management model in real time

#### Generating Stories
- go to the data folder and create stories.md
    - it is a Markdown file
- we will create some stateless stories
    - stateless stories are conversations with one user input and one bot response
- * intent : each story starts with a line of this form, giving the user's intent

In [9]:
stories = """

## story 01
* greet
    - utter_greet

## story 02
* goodbye
    - utter_goodbye
    
## story 03
* inform
    - utter_ask_location

"""

%store stories > stories.md

Writing 'stories' (str) to file 'stories.md'.
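
To check that the stories are laid out as intended, the Markdown can be split into intents and actions with a few lines of Python. This is a simplified reader written for this note, not Rasa Core's own story parser, and it only handles the three line types used here:

```python
def parse_stories(markdown):
    """Split story Markdown into a list of {"name": ..., "steps": [...]} dictionaries."""
    parsed = []
    for raw_line in markdown.splitlines():
        line = raw_line.strip()
        if line.startswith("## "):
            # a heading starts a new story
            parsed.append({"name": line[3:], "steps": []})
        elif line.startswith("* ") and parsed:
            # a star line is a user intent
            parsed[-1]["steps"].append(("intent", line[2:]))
        elif line.startswith("- ") and parsed:
            # a dash line is a bot action
            parsed[-1]["steps"].append(("action", line[2:]))
    return parsed


sample_stories = """
## story 01
* greet
    - utter_greet

## story 03
* inform
    - utter_ask_location
"""

for story in parse_stories(sample_stories):
    print(story["name"], story["steps"])
```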


- now we can start an online training session

### training
- Agent : the object that trains the model
- KerasPolicy / MemoizationPolicy : the policies (models) used for training
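
The idea behind memoization can be shown with a lookup table: remember which action followed each recent history in the training stories, and return it when exactly the same history shows up again. This is a toy illustration only, not Rasa Core's MemoizationPolicy; the function names and the event format are assumptions:

```python
def memorize(stories, max_history=2):
    """Map the last `max_history` events before each action to that action."""
    table = {}
    for events in stories:
        for index, (kind, name) in enumerate(events):
            if kind == "action":
                history = tuple(events[max(0, index - max_history):index])
                table[history] = name
    return table


def predict(table, history, max_history=2):
    """Return the memorized action for this history, or None if it was never seen."""
    return table.get(tuple(history[-max_history:]))


training_stories = [
    [("intent", "greet"), ("action", "utter_greet")],
    [("intent", "inform"), ("action", "utter_ask_location")],
]
table = memorize(training_stories)
print(predict(table, [("intent", "greet")]))
```

A history that never occurred in training returns None, which is where a learned model such as KerasPolicy is meant to generalise instead.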

In [16]:

from __future__ import absolute_import
from __future__ import division
from __future__ import unicode_literals

import logging

from rasa_core.agent import Agent
from rasa_core.policies.keras_policy import KerasPolicy
from rasa_core.policies.memoization import MemoizationPolicy

logging.basicConfig(level='INFO')

training_data_file = 'stories.md'

# where to save model once trained
model_path = './models/dialogue'

import os

In [17]:
agent = Agent('weather_domain.yml', policies = [MemoizationPolicy(), KerasPolicy()])

agent.train(
    training_data_file,
    augmentation_factor=50,
    # max_history=2,
    epochs=500,
    batch_size=10,
    validation_split=0.2)

agent.persist(model_path)

Processed Story Blocks: 100%|██████████| 3/3 [00:00<00:00, 135.31it/s, # trackers=1]
Processed Story Blocks: 100%|██████████| 3/3 [00:00<00:00, 78.35it/s, # trackers=3]
Processed Story Blocks: 100%|██████████| 3/3 [00:00<00:00, 72.87it/s, # trackers=12]
Processed Story Blocks: 100%|██████████| 3/3 [00:00<00:00, 65.13it/s, # trackers=20]
INFO:rasa_core.featurizers:Creating states and action examples from collected trackers (by MaxHistoryTrackerFeaturizer)...
Processed trackers: 100%|██████████| 84/84 [00:05<00:00, 12.57it/s, # actions=79]
INFO:rasa_core.featurizers:Created 79 action examples.
Processed actions: 79it [00:00, 160.19it/s, # examples=79]
INFO:rasa_core.policies.memoization:Memorized 79 unique action examples.
INFO:rasa_core.featurizers:Creating states and action examples from collected trackers (by MaxHistoryTrackerFeaturizer)...
Processed trackers:  29%|██▊       | 24/84 [00:00<00:02, 23.86it/s, # actions=66]

KeyboardInterrupt: 

### Train online
- all the files saved in the steps above are needed to create the dialogue management model that we launch in the online training session
    - the online training session will improve our chatbot
- ConsoleInputChannel : an online training session is an actual conversation with the chatbot; we send our messages through the console
- RasaNLUInterpreter : loads our trained NLU model
In [18]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

import logging

from rasa_core.agent import Agent
from rasa_core.channels.console import ConsoleInputChannel
from rasa_core.interpreter import RegexInterpreter
from rasa_core.policies.keras_policy import KerasPolicy
from rasa_core.policies.memoization import MemoizationPolicy
from rasa_core.interpreter import RasaNLUInterpreter

logger = logging.getLogger(__name__)

In [20]:
def run_weather_online(input_channel, interpreter,
                          domain_file="weather_domain.yml",
                          training_data_file='data/stories.md'):
    agent = Agent(domain_file,
                  policies=[MemoizationPolicy(), KerasPolicy()],
                  interpreter=interpreter)

    agent.train_online(training_data_file,
                       input_channel=input_channel,
                      # max_history=2,
                       batch_size=50,
                       epochs=200,
                       max_training_samples=300)
    return agent

In [None]:
logging.basicConfig(level="INFO")
nlu_interpreter = RasaNLUInterpreter('./models/nlu/default/weathernlu')
run_weather_online(ConsoleInputChannel(), nlu_interpreter)

In [None]:
from rasa_core.channels import 