## Chatbot to help Indian farmers get answers to their queries

###### Author: Sulekha Aloorravi

### This is a step-by-step tutorial to develop an Agricultural chatbot using Rasa

#### Install the below dependencies to build this chatbot

Install build tools for Microsoft Visual C++

pip install rasa_nlu[spacy] 

pip install -U rasa_core==0.9.6

python -m spacy download en_core_web_md

python -m spacy link en_core_web_md en --force

#### Note: Use Rasa Core version 0.9.6 for this implementation

### Delete previously generated files if they already exist

In [1]:
import warnings
warnings.filterwarnings("ignore")

In [2]:
import os

def delfile(file):
    # Remove the file; do nothing if it cannot be removed (e.g. it does not exist)
    try:
        os.remove(file)
    except OSError:
        pass

In [3]:
delfile("intent.md")
delfile("query.md")
delfile("agri.yml")
delfile("config.yml")
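The cleanup matters because every later cell opens its output file in append mode (`'a'`), so re-running the notebook without deleting the files first would duplicate their content. A minimal alternative sketch using `contextlib.suppress` is shown here; `GENERATED_FILES` and `delete_files` are names chosen for illustration, and unlike `delfile` this version only ignores missing files rather than every `OSError`.

```python
import os
from contextlib import suppress

# The four files this notebook generates
GENERATED_FILES = ["intent.md", "query.md", "agri.yml", "config.yml"]

def delete_files(paths):
    """Remove each path, silently skipping files that do not exist."""
    for path in paths:
        with suppress(FileNotFoundError):
            os.remove(path)
```

Calling `delete_files(GENERATED_FILES)` would then replace the four separate `delfile` calls.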

### Load data from Indian Government's Agriculture API

#### Note: You need to create an account at data.gov.in and generate an API key

In [4]:
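import requests
api_key = r"myapikey"  # replace with the key generated at data.gov.in
outputformat = r"json"
records = 5000
request = r'https://api.data.gov.in/catalog/19ba71d9-6d58-402d-9b75-a0ebdc034a56?api-key='+api_key+'&format='+outputformat+'&limit='+str(records)
response = requests.get(request)
response.raise_for_status()  # stop early if the API returned an HTTP error
data = response.json()  # parse the JSON body safely rather than evaluating it as Python code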
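JSON literals such as `true` and `null` are not valid Python, so evaluating a response body as Python code can fail or, worse, execute untrusted input; a JSON parser avoids both. The sketch below shows one way to assemble the same URL with encoded query parameters and to parse a body with the standard library. The helper names `build_request_url` and `parse_response` are assumptions for illustration, and the second sample body is made up.

```python
import json
from urllib.parse import urlencode

# Base URL taken from the request in this notebook
BASE_URL = "https://api.data.gov.in/catalog/19ba71d9-6d58-402d-9b75-a0ebdc034a56"

def build_request_url(api_key, output_format="json", limit=5000):
    """Assemble the request URL with properly encoded query parameters."""
    query = urlencode({"api-key": api_key, "format": output_format, "limit": limit})
    return BASE_URL + "?" + query

def parse_response(text):
    """Parse a JSON response body without evaluating it as Python code."""
    return json.loads(text)

print(build_request_url("myapikey"))
# Hypothetical body: eval() would raise NameError on the JSON literal `true`
print(parse_response('{"records": [], "active": true}'))
```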

### Verify the data loaded

In [5]:
data['records'][4999]

{'season': 'RABI',
 'sector': 'AGRICULTURE',
 'category': 'Cereals',
 'crop': 'Paddy (Dhan)',
 'querytype': 'Weather',
 'querytext': 'Asking about Thiruvallur today weather Report',
 'kccans': 'Recommended for Thiruvallur today weather Report : Light Rain fall (2.5  7.5 mm)',
 'statename': 'TAMILNADU',
 'districtname': 'THIRUVALLUR',
 'blockname': 'R.K.PET',
 'createdon': '2018-03-18T08:02:52Z'}
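Before building the training files, it may be worth confirming that each record carries the five fields used later. The following is a small sketch of such a check; `REQUIRED_FIELDS` and `missing_fields` are illustrative names, and the sample is an abridged copy of the record shown above.

```python
# Fields that the later cells read from every record
REQUIRED_FIELDS = ("category", "crop", "querytype", "querytext", "kccans")

def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required keys that are absent from a record."""
    return [key for key in required if key not in record]

# Abridged copy of the record printed above
sample = {
    "category": "Cereals",
    "crop": "Paddy (Dhan)",
    "querytype": "Weather",
    "querytext": "Asking about Thiruvallur today weather Report",
    "kccans": "Recommended for Thiruvallur today weather Report : Light Rain fall (2.5  7.5 mm)",
}
print(missing_fields(sample))  # prints []
```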

### Extract required data into lists

In [6]:
category = []
crop = []
querytype = []
querytext = []
kccans = []
identifier = []

# Only the first 1,000 of the 5,000 downloaded records are used
for i in range(0,1000):
    category.append(data['records'][i]['category'])
    crop.append(data['records'][i]['crop'])
    querytype.append(data['records'][i]['querytype'])
    identifier.append(i)
    querytext.append(data['records'][i]['querytext'])
    kccans.append(data['records'][i]['kccans'])
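The loop above indexes 1,000 records directly, which raises an `IndexError` if the API returns fewer. A hedged alternative is sketched below: slicing caps the number of records without failing on short input. `extract_columns` is a hypothetical helper and the demo records are made up.

```python
def extract_columns(records, fields, limit=1000):
    """Collect each field into its own list, using at most `limit` records."""
    subset = records[:limit]  # slicing does not raise if fewer records exist
    columns = {field: [record[field] for record in subset] for field in fields}
    columns["identifier"] = list(range(len(subset)))
    return columns

# Made-up records for demonstration only
demo = [
    {"crop": "Onion", "querytype": "Weather"},
    {"crop": "Garlic", "querytype": "Plant Protection"},
]
print(extract_columns(demo, ["crop", "querytype"]))
```

Because the result is a dictionary of equal-length lists, it could be passed straight to `pd.DataFrame(...)` to build the dataframe in one step.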
    

### Create a dataframe and load the above lists into dataframe columns

In [7]:
import pandas as pd
df = pd.DataFrame()

In [8]:
df["category"] = category
df["crop"] = crop
df["querytype"] = querytype
df["querytext"] = querytext
df["kccans"] = kccans
df['identifier'] = identifier

### Preprocess each column to prepare data for intent and actions 

In [9]:
df["intent"] = df["category"]+df["crop"]+df["querytype"]#+df["identifier"]

#### Function to remove special characters from the intent column

In [10]:
import re
def cleanString(x):
    return re.sub('[^A-Za-z0-9]+', '', x)

In [11]:
df["intent"] = df.apply(lambda x: cleanString(str(x["intent"])), axis =1)
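To make the naming scheme concrete, the sketch below applies the same regular expression to the sample record shown earlier. Each intent name is simply category, crop and query type joined together with everything except ASCII letters and digits removed. `clean_string` and `intent_name` are illustrative names, not part of the notebook.

```python
import re

def clean_string(text):
    """Drop every character that is not an ASCII letter or digit."""
    return re.sub('[^A-Za-z0-9]+', '', text)

def intent_name(category, crop, querytype):
    """Join the three fields and strip special characters."""
    return clean_string(category + crop + querytype)

print(intent_name("Cereals", "Paddy (Dhan)", "Weather"))  # CerealsPaddyDhanWeather
```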

In [12]:
df["intent_md"] = "## intent:" + df["intent"]

In [13]:
df["intent_*"] = "* " + df["intent"]

In [14]:
df["intent_-"] = "- " + df["intent"]

In [15]:
df["querytext_md"] = "- " + df["querytext"]

In [16]:
df_pivot = df.pivot_table(index=['intent_md'],
                                     values='querytext_md',
                                     aggfunc=lambda x: '\n'.join(x)).reset_index()

In [17]:
# Open the file once; the with-block closes it automatically
with open('intent.md', 'a') as f:
    for i,j in zip(df_pivot["intent_md"], df_pivot["querytext_md"]):
        print(i, '\n', j, file = f)
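The resulting `intent.md` lists each intent as a heading followed by its example questions as bullet items. As a rough sketch of the same grouping without pandas, assuming only (intent, question) pairs as input, the helper below renders that layout. `render_nlu_markdown` is a hypothetical name, the questions marked hypothetical are invented, and this version keeps intents in first-seen order rather than the sorted order that `pivot_table` produces.

```python
from collections import OrderedDict

def render_nlu_markdown(examples):
    """Render (intent, question) pairs as Rasa NLU markdown training data."""
    grouped = OrderedDict()
    for intent, question in examples:
        grouped.setdefault(intent, []).append(question)
    blocks = []
    for intent, questions in grouped.items():
        lines = ["## intent:" + intent] + ["- " + q for q in questions]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"

pairs = [
    ("CerealsPaddyDhanWeather", "Asking about Thiruvallur today weather Report"),
    ("VegetablesOnionPlantProtection", "hypothetical onion pest question"),
    ("CerealsPaddyDhanWeather", "hypothetical second weather question"),
]
print(render_nlu_markdown(pairs))
```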
    

In [18]:
df["actions_md"] = "utter_" + df["intent"]

In [19]:
df["query_md"] = "## query_" + df["intent"]

In [20]:
df["actions_-"] = "- " + df["actions_md"]

In [21]:
df["actions_:"] = df["actions_md"] + ":"

#### Function to clean up answers

In [22]:
def cleanAnswer(x):
    return re.sub('[^A-Za-z0-9]+', ' ', x)

In [23]:
df["kccans"] = df["kccans"].replace(r'\\n','', regex=True) 
df["kccans"] = df.apply(lambda x: cleanAnswer(str(x["kccans"])), axis =1)
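It is worth seeing what this cleanup does to a real answer. The sketch below applies the same regular expression to the sample answer from the record shown earlier; `clean_answer` is just a snake_case copy of the function for illustration.

```python
import re

def clean_answer(text):
    """Replace each run of non-alphanumeric characters with a single space."""
    return re.sub('[^A-Za-z0-9]+', ' ', text)

raw = "Recommended for Thiruvallur today weather Report : Light Rain fall (2.5  7.5 mm)"
print(clean_answer(raw))
# Recommended for Thiruvallur today weather Report Light Rain fall 2 5 7 5 mm
```

Note the side effect: decimal points are removed as well, so `2.5` becomes `2 5`, and a trailing space is left where the closing parenthesis was. Keeping the period in the character class (for example `'[^A-Za-z0-9.]+'`) is one possible way to preserve numbers, though that variant is an untested suggestion here.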

In [24]:
df["kccans_text"] = '- text: "' + df["kccans"] + '"'

In [25]:
df2 = df[["actions_:","kccans_text"]].drop_duplicates(["actions_:"])

In [26]:
df3 = df[["query_md","intent_*","actions_-"]].drop_duplicates(["intent_*"])

### Create query.md to define intent and actions

In [27]:
# Open the file once; the with-block closes it automatically
with open('query.md', 'a') as f:
    for i,j,k in zip(df3["query_md"], df3["intent_*"], df3["actions_-"]):
        print(i, '\n', j, '\n',' ', k, '\n', '\n', file = f)
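Each block written to `query.md` is a one-turn story: a heading, the user intent prefixed with `*`, and the bot action indented and prefixed with `-`. A compact sketch of that layout, with the stray spaces from `print` removed, might look like this; `render_story` is a hypothetical helper.

```python
def render_story(intent):
    """Render one single-turn story: the user intent followed by the bot action."""
    return "## query_{0}\n* {0}\n  - utter_{0}\n".format(intent)

print(render_story("OthersOthersWeather"))
```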
    

### Create agri.yml to capture intent, actions and templates for answers

In [28]:
with open('agri.yml', 'a') as f:
    print('intents:', '\n', file = f)

In [29]:
with open('agri.yml', 'a') as f:
    for i in df["intent_-"].unique():
        print(i, '\n', file = f)

In [30]:
with open('agri.yml', 'a') as f:
    print("\n", file = f)
    print("slots:\n", file = f)
    print("   group:\n", file = f)
    print("     type: text\n", file = f)
    print("\n", file = f)
    print("entities:\n", file = f)
    print("- group\n", file = f)
    print("\n", file = f)

In [31]:
with open('agri.yml', 'a') as f:
    print('actions:', '\n', file = f)

In [32]:
with open('agri.yml', 'a') as f:
    for i in df["actions_-"].unique():
        print(i, '\n', file = f)

In [33]:
with open('agri.yml', 'a') as f:
    print('templates:', '\n', file = f)

In [34]:
with open('agri.yml', 'a') as f:
    for i,j in zip(df2["actions_:"],df2["kccans_text"]):
        print(' ',i, '\n', file = f)
        print(' ',j, '\n', file = f)
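The cells above build `agri.yml` piece by piece: intents, one text slot, one entity, actions, and a template per action. As a sketch of the same structure assembled in one place, assuming a plain dictionary that maps each action name to its answer text, the helper below returns the whole file as a string. `render_domain` and the demo answer are illustrative, not part of the notebook.

```python
def render_domain(intents, templates):
    """Build the domain file text from intent names and {action: answer} pairs."""
    lines = ["intents:"]
    lines += ["- " + name for name in intents]
    lines += ["", "slots:", "  group:", "    type: text", "",
              "entities:", "- group", ""]
    lines.append("actions:")
    lines += ["- " + action for action in templates]
    lines += ["", "templates:"]
    for action, answer in templates.items():
        lines.append("  " + action + ":")
        lines.append('  - text: "' + answer + '"')
    return "\n".join(lines) + "\n"

# Made-up answer text for demonstration only
demo_templates = {"utter_OthersOthersWeather": "hypothetical weather answer"}
print(render_domain(["OthersOthersWeather"], demo_templates))
```

Writing the returned string with a single `open('agri.yml', 'w')` would also remove the need to delete the file before each run.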

In [35]:
import rasa_nlu
import rasa_core
import spacy





### Configure and train the Rasa NLU model

In [36]:
# Named nlu_config so it does not clash with the rasa_nlu `config` module imported later
nlu_config = """
language: "en"

pipeline:
- name: "nlp_spacy"                   # loads the spacy language model
- name: "tokenizer_spacy"             # splits the sentence into tokens
- name: "ner_crf"                     # trains a conditional random field entity extractor
- name: "intent_featurizer_spacy"     # transforms the sentence into a vector representation
- name: "intent_classifier_sklearn"   # uses the vector representation to classify using SVM
- name: "ner_synonyms"                # maps synonymous entity values to one canonical value
"""
with open('config.yml', 'a') as f:
    print(nlu_config, file = f)

In [37]:
from rasa_nlu.training_data import load_data
from rasa_nlu.config import RasaNLUModelConfig
from rasa_nlu.model import Trainer
from rasa_nlu import config

# Load training data
training_data = load_data("intent.md")

# Create a trainer from the pipeline configuration in config.yml
trainer = Trainer(config.load("config.yml"))

# Train model on training data
interpreter = trainer.train(training_data)

# Save Model
model_directory = trainer.persist("./models/nlu")

Fitting 2 folds for each of 6 candidates, totalling 12 fits


[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:   16.9s finished


In [38]:
# Check the predicted intent and its confidence for a sample question
import json
def pprint(o):   
    print(json.dumps(o, indent=2))
    
pprint(interpreter.parse("What is onion price?"))

{
  "intent": {
    "name": "VegetablesOnionPlantProtection",
    "confidence": 0.045864217206574506
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "VegetablesOnionPlantProtection",
      "confidence": 0.045864217206574506
    },
    {
      "name": "OthersOthersGovernmentSchemes",
      "confidence": 0.0394773777286723
    },
    {
      "name": "VegetablesOnionNutrientManagement",
      "confidence": 0.025974981458943332
    },
    {
      "name": "OthersOthersWeather",
      "confidence": 0.02401600541741634
    },
    {
      "name": "OthersOthersPlantProtection",
      "confidence": 0.02350334982546186
    },
    {
      "name": "OthersOthersMarketInformation",
      "confidence": 0.02081199804003339
    },
    {
      "name": "CondimentsandSpicesGarlicPlantProtection",
      "confidence": 0.01796795471674674
    },
    {
      "name": "VegetablesChilliesPlantProtection",
      "confidence": 0.0173349141208744
    },
    {
      "name": "OthersOthersFieldPreparat

### Evaluate Model

In [39]:
from rasa_nlu.test import run_evaluation

# Note: this evaluates on the training data itself, so the scores are optimistic
run_evaluation("intent.md", model_directory)

100%|█████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:08<00:00, 122.03it/s]


{'intent_evaluation': {'predictions': [{'text': 'Asked about goat farm training information',
    'intent': 'AnimalBovineCowBuffaloAnimalBreeding',
    'predicted': 'OthersOthersGovernmentSchemes',
    'confidence': 0.06661780182328514},
   {'text': 'farmer want to know information about nutrition gives to our cattle??',
    'intent': 'AnimalBovineCowBuffaloAnimalBreeding',
    'predicted': 'OthersOthersGovernmentSchemes',
    'confidence': 0.051895776139643225},
   {'text': 'Asking about azolla fronds availability',
    'intent': 'AnimalBovineCowBuffaloAnimalNutrition',
    'predicted': 'OthersOthersGovernmentSchemes',
    'confidence': 0.42763315545394565},
   {'text': 'asking about veterinary farm information',
    'intent': 'AnimalBovineCowBuffaloAnimalNutrition',
    'predicted': 'OthersOthersGovernmentSchemes',
    'confidence': 0.2673081914628762},
   {'text': 'Asking about Madurai Veterinary University Training and Research Centre contact number',
    'intent': 'AnimalBovineCow
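The evaluation output is a list of predictions, each with the labelled intent, the predicted intent and a confidence. A minimal sketch for turning such a list into a single accuracy figure follows; `intent_accuracy` is a hypothetical helper, and the two sample entries are copied from the output above.

```python
def intent_accuracy(predictions):
    """Fraction of predictions whose predicted intent matches the labelled one."""
    if not predictions:
        return 0.0
    correct = sum(1 for p in predictions if p["intent"] == p["predicted"])
    return correct / len(predictions)

# Two entries copied from the evaluation output above
sample_predictions = [
    {"text": "Asked about goat farm training information",
     "intent": "AnimalBovineCowBuffaloAnimalBreeding",
     "predicted": "OthersOthersGovernmentSchemes",
     "confidence": 0.06661780182328514},
    {"text": "Asking about azolla fronds availability",
     "intent": "AnimalBovineCowBuffaloAnimalNutrition",
     "predicted": "OthersOthersGovernmentSchemes",
     "confidence": 0.42763315545394565},
]
print(intent_accuracy(sample_predictions))  # 0.0, both are misclassified
```

Assuming the return value of `run_evaluation` is stored in a variable such as `result`, the call would be `intent_accuracy(result['intent_evaluation']['predictions'])`.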

### Train the Rasa Core dialogue model

In [40]:
from rasa_core.actions import Action
from rasa_core.events import SlotSet
from rasa_core.policies import FallbackPolicy, KerasPolicy, MemoizationPolicy
from rasa_core.agent import Agent

# Fall back to a default action when the model cannot understand the question
fallback = FallbackPolicy(fallback_action_name="utter_OthersOthersGovernmentSchemes",
                          core_threshold=0.2,
                          nlu_threshold=0.1)

agent = Agent('agri.yml', policies=[MemoizationPolicy(), KerasPolicy(), fallback])

# Load query definitions
training_data = agent.load_data('query.md')

agent.train(
    training_data
)

agent.persist('models/dialogue')

Processed Story Blocks: 100%|████████████████████████████████████████| 174/174 [00:00<00:00, 1042.92it/s, # trackers=1]
Processed Story Blocks: 100%|████████████████████████████████████████| 174/174 [00:00<00:00, 379.30it/s, # trackers=20]
Processed Story Blocks: 100%|████████████████████████████████████████| 174/174 [00:00<00:00, 310.70it/s, # trackers=20]
Processed Story Blocks: 100%|████████████████████████████████████████| 174/174 [00:00<00:00, 335.99it/s, # trackers=20]
Processed actions: 349it [00:00, 1244.66it/s, # examples=349]



Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
masking (Masking)            (None, 5, 358)            0         
_________________________________________________________________
lstm (LSTM)                  (None, 32)                50048     
_________________________________________________________________
dense (Dense)                (None, 182)               6006      
_________________________________________________________________
activation (Activation)      (None, 182)               0         
Total params: 56,054
Trainable params: 56,054
Non-trainable params: 0
_________________________________________________________________
Train on 823 samples
Epoch 1/100
...
Epoch 100/100
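The fallback policy in the training cell is configured with `nlu_threshold=0.1`. As a rough illustration of what that threshold means (a simplified sketch, not Rasa Core's internal logic), the helper below checks whether the top intent of a parse result clears a given threshold. `clears_threshold` is a hypothetical name, and the sample is the top intent from the "What is onion price?" example earlier in the notebook.

```python
def clears_threshold(parse_result, threshold=0.1):
    """Return True if the top intent's confidence is at least the threshold."""
    return parse_result["intent"]["confidence"] >= threshold

# Top intent from the "What is onion price?" parse shown earlier
onion = {"intent": {"name": "VegetablesOnionPlantProtection",
                    "confidence": 0.045864217206574506}}
print(clears_threshold(onion))  # False: about 0.046 is below 0.1
```

With confidences this low, many questions can be expected to end up at the fallback action `utter_OthersOthersGovernmentSchemes`.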


### Start Agricultural chatbot

In [41]:
from rasa_core.agent import Agent
agent = Agent.load('models/dialogue', interpreter=model_directory)

The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor


In [43]:
print("Type your question here...")
while True:
    a = input()
    if a == 'stop':
        break
    responses = agent.handle_message(a)
    for response in responses:
        print(response["text"])

Type your question here...
Can I know about Cereals?
Recommended for to know about govt schemes please contact rural agriculture extension officer 
stop
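For testing, it can help to separate the loop logic from `input()` and from the trained agent. The sketch below takes any message handler and any iterable of questions; `chat` and `fake_handler` are hypothetical names, and `fake_handler` only stands in for `agent.handle_message` so the example runs without a trained model.

```python
def chat(handle_message, questions, stop_word="stop"):
    """Feed questions to a handler until the stop word; return the reply texts."""
    replies = []
    for question in questions:
        if question == stop_word:
            break
        for response in handle_message(question):
            replies.append(response["text"])
    return replies

# Stand-in for agent.handle_message, which returns a list of dicts with a "text" key
def fake_handler(message):
    return [{"text": "echo: " + message}]

print(chat(fake_handler, ["Can I know about Cereals?", "stop", "ignored"]))
```

With the real agent, something like `chat(agent.handle_message, iter(input, "stop"))` should behave like the interactive loop above, except that replies are collected and returned instead of printed as they arrive.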


### Next steps
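The model's accuracy is currently low: for the sample question "What is onion price?", the top-ranked intent scored a confidence of only about 0.05. It can be improved by adding more training questions and answers and by tuning the model hyperparameters.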
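One way to decide where more training questions are needed is to count the examples per intent and list the intents that have very few. The sketch below does this with `collections.Counter`; `sparse_intents` is a hypothetical helper, the cut-off of five examples is an arbitrary assumption, and the label counts are made up.

```python
from collections import Counter

def sparse_intents(intent_labels, min_examples=5):
    """Return intents with fewer than `min_examples` training questions."""
    counts = Counter(intent_labels)
    return sorted(name for name, n in counts.items() if n < min_examples)

# Made-up label counts for demonstration only
labels = ["OthersOthersWeather"] * 6 + ["VegetablesOnionPlantProtection"] * 2
print(sparse_intents(labels))  # ['VegetablesOnionPlantProtection']
```

In this notebook the labels could come from the `intent` column of the dataframe, for example `sparse_intents(df["intent"])`.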