# Chatbot in Python

**Importing the required libraries**

In [1]:
import numpy as np
import nltk
nltk.download('omw-1.4') #Open Multilingual WordNet data; some NLTK versions need it when loading WordNet
import string
import random

[nltk_data] Downloading package omw-1.4 to
[nltk_data]     /Users/rohankumar/nltk_data...
[nltk_data]   Package omw-1.4 is already up-to-date!


**Importing and reading the corpus**

In [2]:
# Read the corpus; the context manager closes the file, and undecodable bytes are skipped
with open('Chatbot.txt', 'r', errors='ignore') as f:
    raw_doc = f.read()
raw_doc = raw_doc.lower()  # Convert the text to lowercase
nltk.download('punkt')  # Punkt sentence tokenizer models
nltk.download('wordnet')  # WordNet dictionary
sent_tokens = nltk.sent_tokenize(raw_doc)  # Split the document into a list of sentences
word_tokens = nltk.word_tokenize(raw_doc)  # Split the document into a list of words

[nltk_data] Downloading package punkt to
[nltk_data]     /Users/rohankumar/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     /Users/rohankumar/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


**Looking at the first two sentence tokens**

In [3]:
sent_tokens[:2]

['data science\nfrom wikipedia, the free encyclopedia\njump to navigationjump to search\nnot to be confused with information science.',
 'the existence of comet neowise (here depicted as a series of red dots) was discovered by analyzing astronomical survey data acquired by a space telescope, the wide-field infrared survey explorer.']

**Looking at the first two word tokens**

In [4]:
word_tokens[:2]

['data', 'science']

**Text preprocessing. We now define a function called LemTokens, which takes a list of tokens and returns their lemmatized forms, and a function called LemNormalize, which lowercases the raw text, strips punctuation, tokenizes it, and lemmatizes the tokens.**

In [5]:
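lemmer = nltk.stem.WordNetLemmatizer()
# WordNet is a semantically oriented dictionary of English included in NLTK.
def LemTokens(tokens):
    # Lemmatize each token (lemmatize() treats tokens as nouns by default)
    return [lemmer.lemmatize(token) for token in tokens]
# Translation table that maps every punctuation character to None, i.e. deletes it
remove_punct_dict = dict((ord(punct), None) for punct in string.punctuation)
def LemNormalize(text):
    # Lowercase, strip punctuation, tokenize, then lemmatize
    return LemTokens(nltk.word_tokenize(text.lower().translate(remove_punct_dict)))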
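
To see what the punctuation-stripping step inside LemNormalize does on its own, here is a small sketch that uses only the standard library. The sample string is made up for illustration, and whitespace splitting stands in for nltk.word_tokenize, so the tokens only approximate what NLTK would produce.

```python
import string

# Same translation table as in the cell above: every punctuation character maps to None
remove_punct_dict = dict((ord(punct), None) for punct in string.punctuation)

sample = "Hello, World! What's data-science?"  # made-up sample text
cleaned = sample.lower().translate(remove_punct_dict)
tokens = cleaned.split()  # stand-in for nltk.word_tokenize (an assumption of this sketch)

print(cleaned)
print(tokens)
```

Deleting punctuation outright also joins hyphenated words ("data-science" becomes "datascience") and drops apostrophes ("what's" becomes "whats"), which is worth keeping in mind when matching user input against the corpus.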

**Defining the Greet Function. If a user’s input is a greeting, the bot shall return a greeting response.**

In [6]:
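GREET_INPUTS = ("hello", "hi", "greetings", "sup", "what's up", "hey")
GREET_RESPONSES = ["hi", "hey", "*nods*", "hi there", "hello", "I am glad you are talking to me!"]
def greet(sentence):
    # Return a random greeting if the input contains a known greeting, else None
    text = sentence.lower()
    if "what's up" in text:  # multi-word greeting; it can never match word by word
        return random.choice(GREET_RESPONSES)
    for word in text.split():
        # Strip surrounding punctuation so that "hi!" still counts as "hi"
        if word.strip(string.punctuation) in GREET_INPUTS:
            return random.choice(GREET_RESPONSES)
    return None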

**Response generation. To answer a question, the bot uses document similarity: each sentence is represented as a TF-IDF vector, and vectors are compared by cosine similarity. We begin by importing the necessary modules.**

In [7]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
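
As a rough illustration of what these two imports compute, the sketch below builds TF-IDF vectors and cosine similarities by hand for three made-up sentences. It is a simplification under stated assumptions: it splits on whitespace, skips stop-word removal and lemmatization, and uses the smoothed IDF formula ln((1 + n) / (1 + df)) + 1 with L2 normalization, which matches scikit-learn's documented defaults. The helper names tfidf_vectors and cosine are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, L2-normalized."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: number of documents that contain each term
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Raw term count times smoothed inverse document frequency
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else vec)
    return vectors

def cosine(a, b):
    """Cosine similarity of two L2-normalized sparse vectors (a dot product)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

docs = [
    "data science uses statistics",           # made-up corpus sentence
    "a comet was discovered by a telescope",  # made-up corpus sentence
    "what is data science",                   # made-up user query
]
vecs = tfidf_vectors(docs)
query = vecs[-1]
scores = [cosine(query, v) for v in vecs]
print([round(s, 3) for s in scores])
```

The query scores about 1.0 against itself, a positive value against the first sentence (they share "data" and "science"), and 0 against the second sentence, with which it shares no terms.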

**We define a function response that appends the user’s utterance to the list of corpus sentences, computes TF-IDF vectors for all of them, and returns the corpus sentence with the highest cosine similarity to the utterance. If no sentence shares any term with the input, it returns the response: “I am sorry! I don’t understand you.”**

In [8]:
def response(user_response):
    # Temporarily add the user's input so it is vectorized together with the corpus;
    # the caller removes it again after the reply is printed.
    sent_tokens.append(user_response)
    TfidfVec = TfidfVectorizer(tokenizer=LemNormalize, stop_words='english')
    tfidf = TfidfVec.fit_transform(sent_tokens)
    # Similarity of the user's input (last row) to every sentence, itself included
    vals = cosine_similarity(tfidf[-1], tfidf)
    # The best match is normally the input itself, so take the second-best
    idx = vals.argsort()[0][-2]
    flat = vals.flatten()
    flat.sort()
    req_tfidf = flat[-2]
    if req_tfidf == 0:
        # No term overlap with any corpus sentence
        return "I am sorry! I don't understand you"
    return sent_tokens[idx]
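
The index [-2] in response deserves a closer look: because the user's input is appended to the corpus before vectorizing, its best match is normally itself, so the reply has to come from the second-highest score. A minimal sketch with hand-picked similarity values (the numbers, the sentences, and the helper name pick_reply are all made up) shows the selection and the zero-score fallback:

```python
def pick_reply(sentences, similarities):
    """Pick the sentence with the second-highest similarity score.

    similarities[i] is the score between the query (the last sentence)
    and sentences[i]; the query's score against itself is the highest.
    """
    # Indices ordered from lowest to highest score, like numpy's argsort
    order = sorted(range(len(similarities)), key=lambda i: similarities[i])
    best_other = order[-2]
    if similarities[best_other] == 0:
        return "I am sorry! I don't understand you"
    return sentences[best_other]

sentences = ["data science is a field", "comets are icy bodies", "what is data science"]
reply_match = pick_reply(sentences, [0.62, 0.0, 1.0])  # query overlaps with the first sentence
reply_none = pick_reply(sentences, [0.0, 0.0, 1.0])    # query overlaps with nothing
print(reply_match)
print(reply_none)
```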

**Defining conversation start/end protocols. Finally, we will feed the lines that we want our bot to say while starting and ending a conversation, depending upon the user’s input.**

In [None]:
flag = True
print("BOT: My name is Stark. Let's have a conversation! Also, if you want to exit any time, just type Bye!")
while flag:
    user_response = input()
    user_response = user_response.lower()
    if user_response != 'bye':
        if user_response == 'thanks' or user_response == 'thank you':
            flag = False
            print("BOT: You are welcome.")
        else:
            greeting = greet(user_response)  # call once; every call picks a random reply
            if greeting is not None:
                print("BOT: " + greeting)
            else:
                print("BOT: ", end="")
                print(response(user_response))
                sent_tokens.pop()  # drop the input that response() appended as the last item
    else:
        flag = False
        print("BOT: Goodbye! Take care <3 ")
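
Because the loop above reads from input(), it is awkward to test. One possible refactoring, sketched here under the assumption that the greeting and retrieval functions are passed in as arguments, separates the per-turn decision from the I/O. The names handle_turn, greet_fn, respond_fn, fake_greet, and fake_respond are hypothetical and not part of the notebook.

```python
def handle_turn(user_text, greet_fn, respond_fn):
    """Return (reply, keep_going) for one line of user input."""
    text = user_text.lower()
    if text == 'bye':
        return "Goodbye! Take care <3 ", False
    if text in ('thanks', 'thank you'):
        return "You are welcome.", False
    greeting = greet_fn(text)
    if greeting is not None:
        return greeting, True
    return respond_fn(text), True

# Stand-in functions so the sketch runs without the corpus, NLTK, or scikit-learn
def fake_greet(text):
    return "hello" if "hi" in text.split() else None

def fake_respond(text):
    return "(closest corpus sentence would go here)"

for line in ["Hi", "what is data science", "Bye"]:
    reply, keep_going = handle_turn(line, fake_greet, fake_respond)
    print("BOT: " + reply)
    if not keep_going:
        break
```

This sketch leaves out the bookkeeping on sent_tokens; a real respond_fn would have to remove the appended user input itself.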