# Different Types of Chatbots

Broadly, there are two kinds of chatbots: rule-based and self-learning.

In the rule-based approach, a bot answers questions according to a set of predefined rules. The rules can range from very simple to very complex. Such bots handle simple queries well but fail on complex ones.
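
As a rough illustration of the rule-based idea, the sketch below answers from a small list of keyword rules and falls back to a default reply when no rule matches. The patterns, the canned answers, and the name `rule_based_reply` are hypothetical and are not part of the chatbot built later in this notebook.

```python
import re

# Hypothetical rules for illustration: (compiled pattern, canned answer)
RULES = [
    (re.compile(r"\b(hours|open)\b"), "We are open 9am to 5pm, Monday to Friday."),
    (re.compile(r"\b(price|cost)\b"), "The basic plan is free."),
]

FALLBACK = "Sorry, I don't have a rule for that."

def rule_based_reply(message):
    """Return the answer of the first rule whose pattern matches the message."""
    text = message.lower()
    for pattern, answer in RULES:
        if pattern.search(text):
            return answer
    # No rule matched: this is where rule-based bots fail on unforeseen queries
    return FALLBACK

print(rule_based_reply("What are your opening hours?"))
print(rule_based_reply("Can you compare two plans for a team of five?"))
```

The second query matches no rule, so the bot can only return the fallback, which is the limitation described above.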

Self-learning bots use machine-learning approaches and are generally more effective than rule-based bots. They come in two further types: retrieval-based and generative.

i) In retrieval-based models, the chatbot uses a heuristic to select the best response from a library of predefined responses, based on the message and the context of the conversation. The context can include the current position in the dialog tree, all previous messages in the conversation, and previously saved variables (e.g. the username). The selection heuristic can be engineered in many ways, from rule-based if-else logic to machine learning classifiers.

ii) Generative bots generate their answers instead of always replying with one answer from a fixed set. This can make them appear more intelligent, because they process the query word by word and compose a new answer.

We are building a retrieval-based chatbot using the TF-IDF approach.
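
To make the retrieval idea concrete before the real implementation, here is a minimal pure-Python sketch of TF-IDF weighting followed by cosine similarity. It is only an illustration under assumptions: tokens are split on whitespace, the smoothed idf formula is one common variant chosen for the sketch, and the helper names (`tfidf_vectors`, `cosine`, `best_match`) and the three-sentence corpus are hypothetical. The notebook itself relies on scikit-learn for this step.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, with weight = tf * idf."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed idf (an assumed variant): ln((1 + n) / (1 + df)) + 1
        vectors.append({t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def best_match(query, corpus):
    """Return (sentence, score) for the corpus sentence closest to the query."""
    vectors = tfidf_vectors(corpus + [query])
    query_vec = vectors[-1]
    scores = [cosine(query_vec, v) for v in vectors[:-1]]
    best = max(range(len(corpus)), key=scores.__getitem__)
    return corpus[best], scores[best]

corpus = [
    "a chatbot conducts a conversation via text",
    "tf-idf weighs rare words more heavily",
    "cosine similarity compares two vectors",
]
sentence, score = best_match("what is cosine similarity", corpus)
print(sentence)  # cosine similarity compares two vectors
```

A score of 0 means the query shares no term with any corpus sentence, which is the case a retrieval bot has to treat as "not understood".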

In [3]:
import nltk
import numpy as np
import random
import string 

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [4]:
# Read the corpus; the context manager closes the file, and undecodable bytes are ignored
with open('/content/drive/My Drive/nlp/tfidf-chatbot-master/chatbot.txt', 'r', errors='ignore') as f:
    raw = f.read()
raw = raw.lower()
nltk.download('punkt')    # tokenizer models for English
nltk.download('wordnet')  # WordNet corpus used by the lemmatizer
sent_tokens = nltk.sent_tokenize(raw)  # convert to list of sentences
word_tokens = nltk.word_tokenize(raw)  # convert to list of words

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Unzipping corpora/wordnet.zip.


In [None]:
sent_tokens[:2]

['a chatbot (also known as a smartbot, talkbot, chatterbot, bot, im bot, interactive agent, conversational interface or artificial conversational entity) is a computer program or an artificial intelligence which conducts a conversation via auditory or textual methods.',
 '[1] such programs are often designed to convincingly simulate how a human would behave as a conversational partner, thereby passing the turing test.']

In [5]:
word_tokens[:2]

['a', 'chatbot']

# Preprocessing the text

In [7]:
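lemmer = nltk.stem.WordNetLemmatizer()
# WordNet is a semantically oriented dictionary of English included in NLTK

# Translation table that maps every punctuation character to None, i.e. deletes it
remove_punct_dict = dict((ord(punct), None) for punct in string.punctuation)

def lemTokens(tokens):
    # Reduce each token to its dictionary form (lemma)
    return [lemmer.lemmatize(token) for token in tokens]

def lemNormalize(text):
    # Lowercase, strip punctuation, tokenize into words, then lemmatize
    return lemTokens(nltk.word_tokenize(text.lower().translate(remove_punct_dict)))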
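
The punctuation step can be checked in isolation with the standard library alone. The sketch below rebuilds the same kind of translation table as `remove_punct_dict` under a different, illustrative name (`remove_punct`) and applies it to a made-up sample sentence; it leaves out tokenization and lemmatization, which need the NLTK data downloaded earlier.

```python
import string

# Map every punctuation code point to None so that str.translate deletes it
remove_punct = {ord(ch): None for ch in string.punctuation}

sample = "Hello, world! What's a chatbot?"
cleaned = sample.lower().translate(remove_punct)
print(cleaned)  # hello world whats a chatbot
```

Note that the apostrophe is removed as well, so "what's" becomes "whats" before tokenization.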

# Greeting by Keyword Matching

In [13]:
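GREETING_INPUTS = ("hello", "hi", "greetings", "sup", "what's up", "hey")

GREETING_RESPONSES = ["hi", "hey", "*nods*", "hi there", "hello", "I am glad! You are talking to me"]

def greeting(sentence):
    """Return a random greeting if the sentence contains a greeting keyword, else None."""
    text = sentence.lower()
    # Strip surrounding punctuation so that "hello!" still matches "hello"
    words = [word.strip(string.punctuation) for word in text.split()]
    for keyword in GREETING_INPUTS:
        # Multi-word keywords such as "what's up" can only match as substrings
        if keyword in words or (" " in keyword and keyword in text):
            return random.choice(GREETING_RESPONSES)
    return None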

# Generating response to a query using TF-IDF

In [9]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

In [11]:
def response(user_response):
    robo_response = ''
    # Temporarily append the query so it is vectorized together with the corpus
    sent_tokens.append(user_response)
    TfidfVec = TfidfVectorizer(tokenizer=lemNormalize, stop_words='english')

    # Get the TF-IDF weighted document-term matrix
    tfidf = TfidfVec.fit_transform(sent_tokens)
    # print("Document Count", len(sent_tokens), "Term Count", len(TfidfVec.get_feature_names_out()))
    # print(TfidfVec.get_feature_names_out())

    # Cosine similarity between the query (last row) and every sentence, the query included.
    # The result is a 1 x N row of similarities.
    vals = cosine_similarity(tfidf[-1], tfidf)

    # Index of the best-matching corpus sentence: the highest score belongs to the
    # query itself, so take the second highest
    idx = vals.argsort()[0][-2]

    # Cosine similarity score of that best match (not a raw TF-IDF value)
    flat = vals.flatten()
    flat.sort()
    req_tfidf = flat[-2]

    if req_tfidf == 0:
        # No term in common with any sentence in the corpus
        robo_response = "I am sorry! I don't understand you"
    else:
        robo_response = sent_tokens[idx]

    # Remove the query appended above; pop() drops the last element, whereas
    # remove() could delete an identical corpus sentence instead
    sent_tokens.pop()
    return robo_response

# response("conference chatbot")
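
The reason `response` reads position `[-2]` rather than `[-1]` can be seen on a small example. The sketch below uses made-up similarity scores and a pure-Python stand-in for NumPy's `argsort`; the values and the name `order` are illustrative only.

```python
# Hypothetical similarity scores: three corpus sentences, then the query itself (last)
scores = [0.0, 0.42, 0.17, 1.0]

# Indices ordered by ascending score, a pure-Python analogue of argsort
order = sorted(range(len(scores)), key=lambda i: scores[i])

print(order)      # [0, 2, 1, 3]
print(order[-1])  # 3, the query matching itself with similarity 1.0
print(order[-2])  # 1, the best-matching corpus sentence
```

Because the query is appended to the corpus before vectorizing, it normally scores highest against itself, so the second-highest score identifies the answer.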

In [None]:
flag = True

print("ROBO: My name is Robo. I will answer your queries about Chatbots. If you want to exit, type Bye! ")

while flag:
    user_response = input()
    user_response = user_response.lower()

    if user_response != 'bye':
        if user_response in ('thanks', 'thank you'):
            flag = False
            print("ROBO: You are welcome..")
        else:
            # Call greeting() once so the check and the printed reply use the same result
            greeting_reply = greeting(user_response)
            if greeting_reply is not None:
                print("ROBO: " + greeting_reply)
            else:
                print("ROBO: ", end="")
                print(response(user_response))
    else:
        flag = False
        print("ROBO: Bye! take care..")

ROBO: My name is Robo. I will answer your queries about Chatbots. If you want to exit, type Bye! 
who invented chatbots


ROBO: [23][24]

a 2017 study showed 4% of companies used chatbots.
