# IPL Chatbot: NLU Exercise

Original: Learn how to Build and Deploy a Chatbot in Minutes using Rasa (IPL Case Study!) - Mohd Sanad Zaki Rizvi, Analytics Vidhya (https://www.analyticsvidhya.com/blog/2019/04/learn-build-chatbot-rasa-nlp-ipl)

**Objective:** Build a chatbot that fetches the latest information about ongoing IPL (Indian Premier League) matches from cricapi.com.

<img src="images/ipl.jpg">

### Importing Libraries

In [1]:
%matplotlib inline

# First things first
import nest_asyncio
nest_asyncio.apply()
print("Event loop ready.")

import logging, io, json, warnings
logging.basicConfig(level="INFO")
warnings.filterwarnings('ignore')

import rasa_nlu
import rasa_core
import spacy

Event loop ready.


### Preparing the NLU Training Data

The training data below teaches the model to extract the user's intent.
The format of training data for ‘intent’ is quite simple in Rasa. You just have to:

- Start the line with “## intent:intent_name”
- Supply all the examples in the following lines
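To make the two rules above concrete, here is a minimal sketch that renders a plain Python dict into that Markdown layout. The helper name `to_rasa_md` and the sample intents are hypothetical and only for illustration; they are not part of Rasa.

```python
def to_rasa_md(intents):
    """Render {intent_name: [examples]} in the Markdown layout described above."""
    blocks = []
    for name, examples in intents.items():
        # Rule 1: each block starts with a "## intent:<name>" header line
        lines = [f"## intent:{name}"]
        # Rule 2: every example follows as its own "- " bullet line
        lines += [f"- {example}" for example in examples]
        blocks.append("\n".join(lines))
    # Blocks are separated by a blank line, as in the training data below
    return "\n\n".join(blocks) + "\n"


sample = {"greet": ["Hi", "Hello"], "goodbye": ["Bye"]}
md_text = to_rasa_md(sample)
print(md_text)
```

Writing the data by hand, as the next cell does, is just as valid; a helper like this is mainly useful when the examples already live in another structure.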

In [2]:
nlu_md = """
## intent:goodbye  
- Bye 
- Goodbye
- See you later
- Bye bot
- Goodbye friend
- bye
- bye for now
- catch you later
- gotta go
- See you
- goodnight
- have a nice day
- i'm off
- see you later alligator
- we'll speak soon
- end
- finish

## intent:greet
- Hi
- Hey
- Hi bot
- Hey bot
- Hello
- Good morning
- hi again
- hi folks
- hi Mister
- hi pal!
- hi there
- greetings
- hello everybody
- hello is anybody there
- hello robot
- who are you?
- what are you?
- what's up
- how do you do?

## intent:thanks
- Thanks
- Thank you
- Thank you so much
- Thanks bot
- Thanks for that
- cheers
- cheers bro
- ok thanks!
- perfect thank you
- thanks a bunch for everything
- thanks for the help
- thanks a lot
- amazing, thanks
- cool, thanks
- cool thank you

## intent:affirm
- y
- Y
- yes
- yes sure
- absolutely
- for sure
- yes yes yes
- definitely
- yes, it did.

## intent:current_matches
- what are the current matches
- can you list the matches in ipl 2019
- which cricket match is happening right now
- which ipl match is next
- which teams are playing next in ipl
- which team will play next in ipl
- tell me some ipl news
- i want ipl updates
- can you give me ipl latest updates
- what are the latest match updates
- who won the last ipl match
- which teams are competing in the next match
- how is ipl going
- what was the result of the last match
- when is the next match

## intent:deny
- no
- never
- I don't think so
- don't like that
- no way
- not really
- n
- N
"""

%store nlu_md > data/nlu.md

Writing 'nlu_md' (str) to file 'data/nlu.md'.


You can include as many examples as you want for each intent. In fact, make sure to include the slang and short forms that you use while texting, so that the chatbot understands the way people actually type. Feel free to refer to the complete version, where I have given plenty of examples for each intent type.
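When adding examples, it can help to see how many each intent currently has. The following is a rough sketch, not a Rasa utility: the function name `count_examples` and the tiny `sample_md` string are assumptions made for illustration, and the parser only understands the `## intent:` header and `- ` bullet lines used in this notebook.

```python
import re
from collections import Counter


def count_examples(md_text):
    """Count the '- example' lines under each '## intent:<name>' header."""
    counts = Counter()
    current = None
    for raw in md_text.splitlines():
        line = raw.strip()
        header = re.match(r"##\s*intent:\s*(\S+)", line)
        if header:
            current = header.group(1)
            counts[current] = 0
        elif line.startswith("##"):
            # Some other kind of section: stop counting until the next intent
            current = None
        elif line.startswith("- ") and current is not None:
            counts[current] += 1
    return counts


sample_md = """
## intent:greet
- Hi
- Hey

## intent:deny
- no
"""

counts = count_examples(sample_md)
for intent, n in counts.most_common():
    print(intent, n)
```

In the notebook, calling `count_examples(nlu_md)` on the string defined above would give the per-intent totals, which makes thinly covered intents easy to spot.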

### Defining the NLU Model Configuration

This file lets us create a text processing pipeline in Rasa. Luckily for us, Rasa comes with two default settings based on the amount of training data we have:
- “spacy_sklearn” pipeline if you have fewer than 1000 training examples
- “tensorflow_embedding” if you have a large amount of data

Note that the configuration below uses `supervised_embeddings`, the name that newer Rasa releases give to the “tensorflow_embedding” pipeline.
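The size-based rule of thumb above can be written down as a small helper. This is only a sketch of the heuristic as stated in the text: `suggest_pipeline` is a hypothetical function, treating exactly 1000 examples as "large" is an assumption, and the pipeline names are the ones quoted above, which may differ in your Rasa version.

```python
def suggest_pipeline(n_examples, threshold=1000):
    """Apply the size-based rule of thumb quoted above (not a Rasa API)."""
    if n_examples < threshold:
        # Small dataset: the text recommends the spaCy-based pipeline
        return "spacy_sklearn"
    # Larger dataset: the text recommends the embedding pipeline
    return "tensorflow_embedding"


print(suggest_pipeline(80))
print(suggest_pipeline(5000))
```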

In [3]:
config = """# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: en
pipeline: supervised_embeddings

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
  - name: MemoizationPolicy
  - name: KerasPolicy
  - name: MappingPolicy
"""
%store config > config.yml

Writing 'config' (str) to file 'config.yml'.


### Training the NLU Classifier Model

On the command line, you can run the following command:

**python -m rasa train nlu**

Or you can train the model programmatically:

In [4]:
from rasa_nlu.training_data import load_data
from rasa_nlu.model import Trainer
# aliased so the module does not shadow the `config` string defined in In [3]
from rasa_nlu import config as nlu_config

# load the NLU training samples
training_data = load_data("data/nlu.md")

# build a trainer from the pipeline configuration
trainer = Trainer(nlu_config.load("config.yml"))

# train the model!
interpreter = trainer.train(training_data)

# # store it for future use
# model_directory = trainer.persist("./models/ipl_nlu", fixed_model_name="current")

The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

INFO:absl:Entry Point [tensor2tensor.envs.tic_tac_toe_env:TicTacToeEnv] registered with id [T2TEnv-TicTacToeEnv-v0]

INFO:rasa_nlu.model:Starting to train component WhitespaceTokenizer
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component RegexFeaturizer
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component CRFEntityExtractor
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component EntitySynonymMapper
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component 

Instructions for updating:
`tf.batch_gather` is deprecated, please use `tf.gather` with `batch_dims` instead.

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where

Epochs: 100%|█████████████████████████████████████████████████| 300/300 [00:06<00:00, 42.93it/s, loss=0.480, acc=1.000]
INFO:rasa.utils.train_utils:Finished training embedding policy, train loss=0.480, train accuracy=1.000
INFO:rasa_nlu.model:Finished training component.


### Evaluating the NLU model on a random text (first way)

Let’s test how well our model performs by giving it a sample text that it hasn’t been trained on and extracting the intent.

In [5]:
# A helper function for prettier output

def pprint(o):   
    print(json.dumps(o, indent=2))
    
pprint(interpreter.parse("what is happening in the cricket world these days?"))

{
  "intent": {
    "name": "current_matches",
    "confidence": 0.9997695088386536
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "current_matches",
      "confidence": 0.9997695088386536
    },
    {
      "name": "greet",
      "confidence": 9.037238487508148e-05
    },
    {
      "name": "affirm",
      "confidence": 8.723911741981283e-05
    },
    {
      "name": "deny",
      "confidence": 4.525366239249706e-05
    },
    {
      "name": "goodbye",
      "confidence": 5.667250661645085e-06
    },
    {
      "name": "thanks",
      "confidence": 1.893557623589004e-06
    }
  ],
  "text": "what is happening in the cricket world these days?"
}


Not only does our NLU model perform well on intent extraction, but it also ranks the other intents based on their confidence scores. This is a nifty little feature that can be really useful when the classifier is confused between multiple intents.
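One way to use the ranking is to accept the top intent only when it is both confident and clearly ahead of the runner-up. The sketch below works on a result dict shaped like the output above; the function name `pick_intent` and both threshold values are hypothetical choices for illustration, not Rasa defaults.

```python
def pick_intent(result, min_confidence=0.5, min_margin=0.2):
    """Return the top intent name, or None if the prediction is weak or ambiguous."""
    ranking = sorted(
        result.get("intent_ranking", []),
        key=lambda entry: entry["confidence"],
        reverse=True,
    )
    if not ranking:
        return None
    top = ranking[0]
    runner_up = ranking[1]["confidence"] if len(ranking) > 1 else 0.0
    if top["confidence"] < min_confidence:
        return None  # the classifier is not sure about anything
    if top["confidence"] - runner_up < min_margin:
        return None  # two intents are too close to call
    return top["name"]


confident = {
    "intent_ranking": [
        {"name": "current_matches", "confidence": 0.9997},
        {"name": "greet", "confidence": 0.00009},
    ]
}
confused = {
    "intent_ranking": [
        {"name": "greet", "confidence": 0.52},
        {"name": "affirm", "confidence": 0.46},
    ]
}

print(pick_intent(confident))
print(pick_intent(confused))
```

When the function returns `None`, the bot could ask the user to rephrase instead of acting on a guess.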