
# Simple Machine Learning Chatbot using SVM Classifier

## Import the Libraries

In [118]:
import nltk
import numpy as np
import json
import random
import urllib.request  # urlopen lives in the urllib.request submodule
from nltk.stem.lancaster import LancasterStemmer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import LabelEncoder

# Uncomment on the first run to download the tokenizer data used by nltk.word_tokenize
# nltk.download('punkt')

## Load the Intents File

In [119]:
# Download the intents file and parse it; the with-block closes the connection
with urllib.request.urlopen("https://raw.githubusercontent.com/sudeepjd/Data-Analytics/master/09-Natural%20Language%20Processing/mlchat_intents.json") as file:
  data = json.loads(file.read())
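
The cells that follow read the keys `intents`, `tag`, `patterns`, and `responses` from the loaded data. As a rough illustration of that shape, here is a minimal sketch with made-up sample content (the entries of the real file are not reproduced here) and a hypothetical helper, `check_intents`, that verifies each intent carries the keys the notebook relies on.

```python
import json

# Hypothetical sample: only the key names (intents, tag, patterns, responses)
# come from how the notebook reads the file; the contents are invented.
sample_json = """
{
  "intents": [
    {
      "tag": "greeting",
      "patterns": ["Hi", "Hello there"],
      "responses": ["Hello!", "Hi, how can I help?"]
    },
    {
      "tag": "goodbye",
      "patterns": ["Bye", "See you later"],
      "responses": ["Bye! Come back again soon."]
    }
  ]
}
"""

def check_intents(intents_data):
    """Return the list of tags, raising ValueError if an intent is malformed."""
    tags = []
    for intent in intents_data["intents"]:
        for key in ("tag", "patterns", "responses"):
            if key not in intent:
                raise ValueError(f"intent is missing key: {key}")
        if not intent["patterns"] or not intent["responses"]:
            raise ValueError(f"intent {intent['tag']!r} has empty patterns or responses")
        tags.append(intent["tag"])
    return tags

sample = json.loads(sample_json)
sample_tags = check_intents(sample)
print(sample_tags)
```

Running the same check on the downloaded file would be `check_intents(data)`.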

Extract the Data

In [120]:
docs_x = []
docs_y = []

In [121]:
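stemmer = LancasterStemmer()
for intent in data['intents']:
  for pattern in intent['patterns']:
    # Tokenize, drop question marks, then lowercase and stem each word
    wrds = nltk.word_tokenize(pattern)
    wrds = [stemmer.stem(w.lower()) for w in wrds if w != "?"]
    # Store one space-joined string per pattern, labelled with its intent tag
    docs_x.append(' '.join(wrds))
    docs_y.append(intent["tag"])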
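
To see what this preprocessing does to a single pattern, here is a minimal sketch. It swaps in `wordpunct_tokenize`, a regex-based NLTK tokenizer that needs no downloaded data, in place of `nltk.word_tokenize`; that substitution and the `demo_` names are assumptions made for illustration only.

```python
from nltk.stem.lancaster import LancasterStemmer
from nltk.tokenize import wordpunct_tokenize

demo_stemmer = LancasterStemmer()
demo_pattern = "What can I do with regression?"

# Assumed stand-in for word_tokenize so the example runs without the punkt data
demo_tokens = wordpunct_tokenize(demo_pattern)

# Same filtering as the loop above: drop "?", lowercase, stem
demo_stems = [demo_stemmer.stem(w.lower()) for w in demo_tokens if w != "?"]
demo_doc = " ".join(demo_stems)

print(demo_tokens)
print(demo_doc)
```

The second printed line is the kind of string that ends up in `docs_x`: lowercase stems separated by spaces, with the question mark removed.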

Vectorize X

In [122]:
cv = CountVectorizer()
X = cv.fit_transform(docs_x).toarray()
print(X)

[[0 0 0 ... 0 0 0]
 [0 0 1 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]
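
The printed matrix is mostly zeros, which makes it hard to see what each column means. The sketch below runs `CountVectorizer` on three invented, already-stemmed strings (the `toy_` names and sample text are assumptions) and prints the vocabulary next to the count matrix. It also shows that words not seen during fitting are ignored at transform time.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical stand-ins for entries of docs_x
toy_docs = ["what is regress", "what is deep learn", "bye"]

toy_cv = CountVectorizer()
toy_X = toy_cv.fit_transform(toy_docs).toarray()

# vocabulary_ maps each word to its column index; sort the words by that index
toy_vocab = sorted(toy_cv.vocabulary_, key=toy_cv.vocabulary_.get)
print(toy_vocab)  # ['bye', 'deep', 'is', 'learn', 'regress', 'what']
print(toy_X)      # one row per document, one column per vocabulary word

# "hello" was never seen during fit, so only the "bye" column is set
toy_unseen = toy_cv.transform(["hello bye"]).toarray()
print(toy_unseen)  # [[1 0 0 0 0 0]]
```

Note that the default token pattern of `CountVectorizer` only keeps tokens of two or more characters, so single-letter words such as "i" never become columns.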


Vectorize Y

In [123]:
le = LabelEncoder()
y = le.fit_transform(docs_y)
print(y)
print(le.inverse_transform([6]))

[ 6  6  6  6  8  8  8  8  5  5  5  5 14 14 14  7  7  7  7  7  7  9  9  9
  9  9  9 12 12 12 13 13 15 15 15 16 10 10 10 10 11 11 11  0  0  0  0  1
  1  1  2  2  2  3  3  3  4  4  4  4  4]
['greeting']
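
The integers above are assigned by `LabelEncoder` in sorted order of the tag names. A minimal sketch on an invented tag list (the `toy_` names and tags are assumptions) makes the mapping visible:

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for docs_y
toy_tags = ["greeting", "greeting", "goodbye", "regression"]

toy_le = LabelEncoder()
toy_y = toy_le.fit_transform(toy_tags)

# classes_ holds the tags in sorted order; the position is the encoded value
toy_classes = list(toy_le.classes_)
print(toy_classes)  # ['goodbye', 'greeting', 'regression']
print(toy_y)        # [1 1 0 2]

# inverse_transform maps encoded values back to tag names
toy_back = list(toy_le.inverse_transform([2, 0]))
print(toy_back)     # ['regression', 'goodbye']
```

In the notebook, `le.classes_` would list all intent tags in the order of their encoded values.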


## Build and Train the Chatbot

Support Vector Machine

In [124]:
from sklearn.svm import SVC
classifier = SVC(kernel='linear')
classifier.fit(X, y)

SVC(C=1.0, break_ties=False, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='scale', kernel='linear',
    max_iter=-1, probability=False, random_state=None, shrinking=True,
    tol=0.001, verbose=False)
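
A quick sanity check after fitting is to score the classifier on the data it was trained on. The sketch below does this on a small invented dataset (the `toy_` names and sentences are assumptions, not the notebook's data); a low training accuracy would suggest that some patterns are indistinguishable after vectorization.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC

# Hypothetical toy training set
toy_docs = ["hello there", "good morning", "bye now", "see you later"]
toy_tags = ["greeting", "greeting", "goodbye", "goodbye"]

toy_X = CountVectorizer().fit_transform(toy_docs).toarray()
toy_y = LabelEncoder().fit_transform(toy_tags)

toy_clf = SVC(kernel="linear")
toy_clf.fit(toy_X, toy_y)

# score() returns mean accuracy; on training data it only shows that the model
# fits what it has seen, not how well it generalizes to new sentences
train_acc = toy_clf.score(toy_X, toy_y)
print(train_acc)
```

The equivalent call in the notebook would be `classifier.score(X, y)`.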

Predict Function

In [125]:
def provide_response(sent):
  # Pre-process the input the same way as the training patterns
  sent = nltk.word_tokenize(sent)
  sent = [stemmer.stem(w.lower()) for w in sent if w != "?"]
  sent = [' '.join(sent)]
  X_pred = cv.transform(sent).toarray()

  # Predict the encoded label and map it back to its tag
  y_pred = classifier.predict(X_pred)
  tag = le.inverse_transform(y_pred)
  tag = tag[0]

  # Look up the responses for the predicted tag
  for tg in data['intents']:
    if tg['tag'] == tag:
      responses = tg['responses']
      break

  # Return the tag together with one randomly chosen response
  return (tag, random.choice(responses))
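
One property of this approach is that the classifier always returns some tag, even when none of the input words are in the vocabulary and the feature vector is all zeros. The sketch below shows one possible guard, assuming a hypothetical `predict_or_fallback` helper and an invented toy dataset. It is not part of the notebook's method, only an illustration of the idea.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC

# Hypothetical toy training set
toy_docs = ["hello there", "good morning", "bye now", "see you later"]
toy_tags = ["greeting", "greeting", "goodbye", "goodbye"]

toy_cv = CountVectorizer()
toy_clf = SVC(kernel="linear")
toy_clf.fit(toy_cv.fit_transform(toy_docs).toarray(), toy_tags)

def predict_or_fallback(text, fallback_tag="unknown"):
    """Return the predicted tag, or fallback_tag if no known word is present."""
    vec = toy_cv.transform([text]).toarray()
    # An all-zero vector means every word was unseen during fitting
    if vec.sum() == 0:
        return fallback_tag
    return toy_clf.predict(vec)[0]

known_result = predict_or_fallback("hello")
unknown_result = predict_or_fallback("xyzzy")
print(known_result)
print(unknown_result)
```

In `provide_response`, the same check could be applied to `X_pred` before calling `classifier.predict`, returning a generic "I did not understand" reply when it is all zeros.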

Test Single Response

In [126]:
print(provide_response("What can I do with regression?"))

('regression', 'In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable and one or more independent variables')


## Execute the Chatbot

In [127]:
print("BOT : Hello! I am ML Chat.")

while True:
  inp = input("\nYOU : ")
  resp = provide_response(inp)

  print("\nBOT : " + resp[1])

  # Stop once the classifier predicts the goodbye intent
  if resp[0] == "goodbye":
    break

BOT : Hello! I am ML Chat.

YOU : deep learning

BOT : Deep learning is an artificial intelligence (AI) function that imitates the workings of the human brain in processing data and creating patterns for use in decision making. 
It can be used both for regression as well as for classification problems.

YOU : bye

BOT : Bye! Come back again soon.
