### Intro
- Rasa Core and Rasa NLU together are called the Rasa Stack
- We will build a weather-reporting chatbot
    - it will perform entity extraction and intent classification
    
### Setup
1. pip install -r requirements.txt
2. Language model (en)
    - the language model is used to parse incoming text messages and extract the necessary information
3. Rasa NLU trainer
    - this makes generating training data a lot easier
    - it is a JavaScript-based UI application, so we need Node.js and npm
    - download Node.js with npm and add it to the PATH
    - then run
        - npm i -g rasa-nlu-trainer
    - this installs rasa-nlu-trainer

In [20]:
%matplotlib inline

import logging, io, json, warnings
logging.basicConfig(level="INFO")
warnings.filterwarnings('ignore')

def pprint(o):
    # small helper to make dict dumps a bit prettier
    print(json.dumps(o, indent=2))

In [2]:
# 2. install language model 

import sys
python = sys.executable

# Download the English spaCy model;
# it is installed and linked to the shortcut name "en".
!{python} -m spacy download en

Collecting en_core_web_sm==2.0.0 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm==2.0.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz (37.4MB)

### Training 
- we have to teach the chatbot to understand unstructured human language so that it understands what we are saying
    - so we train an NLU model that takes unstructured text messages and returns structured data, in the form of intents and entities, that our bot can act on
- Intent
    - what the message is about
- Entity
    - pieces of information such as location names, dates, etc.
    - these help the chatbot understand what specifically we are talking about and asking for
- Entity extraction and intent classification are ML problems, so we need training data to train the models

##### Rasa NLU training data
- the training data should contain example messages that we would like our chatbot to learn from
    - along with the intent of each message, the entities it contains, and where in the sentence they are found
    
##### creating data
- mkdir data
- cd data
- echo 'nlu_data' > data.json
- there are two ways to create training examples for NLU models
    - one way is to write them directly into this data file
    
{
  "rasa_nlu_data":{
    "common_examples":[
    {
       "text":"Hello",
       "intent":"greet",
       "entities":[]
    },
    {
       "text":"goodbye",
       "intent":"goodbye",
       "entities":[]
    }
      ]
  }
}

    - save this file
    - another way:
        - in the data folder, launch rasa-nlu-trainer
        - add a new example
        - enter the text
            - What's the weather in Berlin at the moment?
        - highlight "Berlin" and add it as an entity
            - name the entity location
        - if we now open data.json, we can see the entities populated
            - each entity also has start and end values showing where it occurs in the text
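
The start and end values are character offsets into the text, with end exclusive, so `text[start:end]` gives back the entity value. A minimal sketch of how such an entry could be built by hand, assuming the `common_examples` layout used in data.json; the helper `make_example` is hypothetical (not part of Rasa), and the intent name `inform` is an assumption for illustration:

```python
def make_example(text, intent, entity_value=None, entity_name=None):
    """Build one training example in the common_examples layout (hypothetical helper)."""
    entities = []
    if entity_value is not None:
        # First occurrence of the value; raises ValueError if it is not in the text.
        start = text.index(entity_value)
        # The end offset is exclusive, so text[start:end] == entity_value.
        end = start + len(entity_value)
        entities.append({
            "start": start,
            "end": end,
            "value": entity_value,
            "entity": entity_name,
        })
    return {"text": text, "intent": intent, "entities": entities}


example = make_example(
    "What's the weather in Berlin at the moment?",
    "inform",  # assumed intent name
    entity_value="Berlin",
    entity_name="location",
)
entity = example["entities"][0]
print(entity["start"], entity["end"], example["text"][entity["start"]:entity["end"]])
```

Computing the offsets from the text avoids the off-by-one mistakes that are easy to make when typing them manually.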

#### Amount of training data
- we now have three training examples
- we need more examples for each of these intents
    - the examples should be varied and diverse
- now add the data from data.json on GitHub to our data.json
    - it contains more examples of greeting, saying goodbye, and asking about the weather
- reload rasa-nlu-trainer
- we now have close to 40 examples in total, which is still a small dataset
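
To see how the examples are spread across intents, one could count them per intent. This is a sketch assuming the `rasa_nlu_data` / `common_examples` layout shown earlier; `count_intents` is a hypothetical helper, and the in-memory `sample` stands in for the real data.json:

```python
from collections import Counter


def count_intents(nlu_data):
    """Return a Counter mapping intent name -> number of training examples."""
    examples = nlu_data["rasa_nlu_data"]["common_examples"]
    return Counter(example["intent"] for example in examples)


# Small stand-in for data.json; with the real file, load it with json.load first.
sample = {
    "rasa_nlu_data": {
        "common_examples": [
            {"text": "Hello", "intent": "greet", "entities": []},
            {"text": "hi there", "intent": "greet", "entities": []},
            {"text": "goodbye", "intent": "goodbye", "entities": []},
        ]
    }
}

counts = count_intents(sample)
for intent, n in counts.most_common():
    print(intent, n)
```

An intent with far fewer examples than the others is a good candidate for more training data.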

#### Before training
- we need to create a configuration file
- go back out of the data folder
    - echo 'config' > config_spacy.json
- the config file is important because it provides the parameters used for training
    - 1. pipeline: specifies which featurizers and feature extractors are used to process text messages and extract the necessary information in Rasa NLU
        - Rasa NLU has two main pre-built pipelines
            - a. MITIE based
            - b. sklearn based (spacy_sklearn, used here)
    - 2. path: directory where we keep the model after training
    - 3. data: location of the data file
    
- config_spacy.json
{
  "pipeline":"spacy_sklearn",
  "path":"./models/nlu",
  "data":"./data/data.json"
}
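
As an alternative to editing the placeholder file by hand, the same three keys could be written from Python. This is only a sketch: `write_config` is a hypothetical helper, and the file is written to a temporary directory here so that nothing in the project is overwritten.

```python
import json
import os
import tempfile

# Same keys and values as config_spacy.json.
nlu_config = {
    "pipeline": "spacy_sklearn",
    "path": "./models/nlu",
    "data": "./data/data.json",
}


def write_config(config_dict, path):
    """Write the configuration as indented JSON (hypothetical helper)."""
    with open(path, "w") as f:
        json.dump(config_dict, f, indent=2)


# Temporary location for the sketch; in the project this would be config_spacy.json.
config_path = os.path.join(tempfile.mkdtemp(), "config_spacy.json")
write_config(nlu_config, config_path)

# Read the file back to confirm it is valid JSON with the expected content.
with open(config_path) as f:
    loaded = json.load(f)
print(loaded["pipeline"])
```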

#### model creation

In [12]:
from rasa_nlu.training_data import load_data
from rasa_nlu import config
from rasa_nlu.model import Trainer
from rasa_nlu.model import Metadata, Interpreter

In [13]:
def train_nlu(data, configs, model_dir):
    training_data = load_data(data)
    # the Trainer needs a configuration object,
    # which we build from the config file with config.load
    trainer = Trainer(config.load(configs))
    trainer.train(training_data)
    # model_dir is where the trained model is saved
    model_directory = trainer.persist(model_dir, fixed_model_name="weathernlu")

In [9]:
train_nlu("./data/data.json", "config_spacy.json", "./models/nlu")

  from ._conv import register_converters as _register_converters


Fitting 2 folds for each of 6 candidates, totalling 12 fits


[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:    0.1s finished


- to check that the model was persisted, look for the models folder
- the Metadata and Interpreter classes are imported for loading the model; the next cell uses Interpreter.load to load it and get it ready to use

In [21]:
def run_nlu():
    # now we load our model
    interpreter = Interpreter.load("./models/nlu/default/weathernlu")
    pprint(interpreter.parse(u"I am planning my holiday to Lithuania. I wonder what is the weather out there."))

In [22]:
run_nlu()

INFO:rasa_nlu.components:Added 'nlp_spacy' to component cache. Key 'nlp_spacy-en'.


{
  "intent": {
    "name": "inform",
    "confidence": 0.843375422673486
  },
  "entities": [
    {
      "start": 28,
      "end": 37,
      "value": "lithuania",
      "entity": "location",
      "confidence": 0.919550976223837,
      "extractor": "ner_crf"
    }
  ],
  "intent_ranking": [
    {
      "name": "inform",
      "confidence": 0.843375422673486
    },
    {
      "name": "greet",
      "confidence": 0.07873710059777826
    },
    {
      "name": "goodbye",
      "confidence": 0.07788747672873546
    }
  ],
  "text": "I am planning my holiday to Lithuania. I wonder what is the weather out there."
}


- Rasa NLU reports the intent it predicts for the text
- it also reports the confidence scores for all the other intents we have
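
Downstream code usually needs only the top intent and the entity values from this result. A minimal sketch, assuming a result dict shaped like the parse output above; `summarize` and the `min_confidence` threshold of 0.5 are hypothetical choices, not part of Rasa:

```python
def summarize(result, min_confidence=0.5):
    """Return (intent name or None, {entity name: value}) from a parse result."""
    intent = result["intent"]
    # Ignore the predicted intent when the model is not confident enough.
    name = intent["name"] if intent["confidence"] >= min_confidence else None
    entities = {e["entity"]: e["value"] for e in result.get("entities", [])}
    return name, entities


# Trimmed copy of the parse output shown above.
result = {
    "intent": {"name": "inform", "confidence": 0.843375422673486},
    "entities": [
        {
            "start": 28,
            "end": 37,
            "value": "lithuania",
            "entity": "location",
            "confidence": 0.919550976223837,
            "extractor": "ner_crf",
        }
    ],
}

intent_name, entity_values = summarize(result)
print(intent_name, entity_values)
```

If the same entity name appears more than once in a message, this dict keeps only the last value; a list per entity name would be needed to keep them all.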

### Dialogue Management model
- The dialogue management model predicts which action or response the chatbot should make based on the state of the conversation
    - actions can be simple API calls, text responses, or retrieving data from a DB
- Q. Why do we need ML for that?
    - If we ask for the weather without giving a location, we want the chatbot to ask where we are based
        - In practice, these types of conversations are often hard-coded in the form of flowcharts.
        - Code-wise, we can imagine this as a bunch of if/else statements.
        - This means a developer has to lay out the most likely happy paths from the start of the conversation to the end goal.
        - With every intent and entity we add, the flowchart becomes more complex, and it gets harder to keep track of all the paths a user can take to get the answers they want.
    - In Rasa, we train an ML model that predicts what the bot should do next based on
        - a. the context
        - b. the state of the conversation
        - as a result, the conversational flow is much more natural and the user experience is better
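
To make the if/else picture concrete, here is a sketch of what a hand-written policy for the weather bot might look like. The function `next_action` and the action names (`utter_greet`, `utter_ask_location`, `action_get_weather`, and so on) are hypothetical and chosen only for illustration; this is the hard-coded approach, not Rasa's learned model.

```python
def next_action(intent, slots):
    """Hard-coded dialogue policy: every path has to be written out by hand."""
    if intent == "greet":
        return "utter_greet"
    elif intent == "goodbye":
        return "utter_goodbye"
    elif intent == "inform":
        # Asking about the weather: we need a location before we can answer.
        if slots.get("location") is None:
            return "utter_ask_location"
        return "action_get_weather"
    # Anything the rules above did not anticipate.
    return "utter_default"


print(next_action("inform", {}))
print(next_action("inform", {"location": "Berlin"}))
```

Each new intent or slot adds branches here, and combinations of them multiply the number of paths to maintain, which is the complexity a trained dialogue model is meant to absorb.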
        
### Building dialogue management model
- create a domain file for our chatbot
- 