
Cyborg Cantina
Retrieval-based chatbots are commonly used in customer service environments. User questions are constrained to the specific service being provided, which approximates a closed-domain conversation. However, people are known to be especially picky with their food. Can you create a chatbot system that effectively answers all of a diner’s questions about the food on a restaurant menu?
In this project you will build a retrieval-based chatbot system for a restaurant serving Mexican cuisine. You’ll use a combination of tf-idf scoring, word embedding models, and a set of user-defined functions in order to answer any number of questions from a restaurant diner.
By the end of this project, you’ll have constructed a full retrieval-based chatbot system, conversed with your own chatbot creation, and improved the system by adding additional candidate responses!
Tasks
Chatbot Input/Output
1.
Let’s begin at the end! We must provide a way for our user to end a conversation once they have had all their questions answered. Note that exit_commands, a list of strings commonly used as exit commands from a chatbot system, is already defined in the workspace.
o	Define a .make_exit() method with self and user_message as parameters.
o	Within .make_exit(), write a for loop over each object in exit_commands.
o	For each object in exit_commands, check if the object also occurs in user_message.
o	If the object does occur in user_message, print a goodbye message to the console, and return True.
o	If the object does not occur in user_message, do nothing.
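As a rough sketch of the exit check described above, the standalone function below mirrors the substring test from this task. The names is_exit and is_exit_whole_word and the sample messages are illustrative assumptions, not part of the project files. A substring test also fires on words that merely contain a command (for example, "know" contains "no"), so the second function shows one possible word-level alternative.

```python
import re

# Same exit commands as in script.py
exit_commands = ("quit", "goodbye", "exit", "no", "end")

def is_exit(user_message):
    """Substring check, as described in the task (hypothetical standalone helper)."""
    for exit_command in exit_commands:
        if exit_command in user_message:
            return True
    return False

def is_exit_whole_word(user_message):
    """Alternative sketch: match exit commands only as whole words."""
    words = re.findall(r"\w+", user_message.lower())
    return any(command in words for command in exit_commands)

print(is_exit("goodbye for now"))                        # True
print(is_exit("do you know the specials?"))              # True, because "know" contains "no"
print(is_exit_whole_word("do you know the specials?"))   # False
```

Whether the stricter word-level match is preferable depends on how forgiving the bot should be about phrases such as "no thanks".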
2.
Of course, we should provide a method that allows our chatbot to chat:
o	Define a .chat() method with self as a parameter.
o	Within the .chat() method, write a welcoming prompt for a user question, using the input() function. Set the result equal to user_message.
o	Create a while not loop that checks whether .make_exit(user_message) is True at each iteration.
o	Within the while loop, call .respond() on user_message and assign the result to user_message.
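The control flow of .chat() can be sketched without a live console by feeding messages from a list. The run_chat function, the scripted list, and the placeholder reply text are assumptions made for illustration; the real method calls input() and .respond() instead.

```python
exit_commands = ("quit", "goodbye", "exit", "no", "end")

def make_exit(user_message):
    # True as soon as any exit command occurs in the message
    return any(command in user_message for command in exit_commands)

def run_chat(messages):
    """Hypothetical stand-in for .chat(): consumes scripted messages instead of input()."""
    pending = iter(messages)
    replies = []
    user_message = next(pending)  # plays the role of the welcome prompt
    while not make_exit(user_message):
        # Plays the role of .respond(): answer, then fetch the next message
        replies.append("(reply to: " + user_message + ")")
        user_message = next(pending)
    return replies

scripted = ["is the burrito spicy?", "what sides are there?", "goodbye"]
print(run_chat(scripted))
```

The loop keeps answering until a message triggers the exit check, which is why the scripted conversation ends with an exit command.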
3.
Let’s test out the .make_exit() functionality of the bot:
o	Outside of the ChatBot class, initialize a ChatBot instance.
o	Call the .chat() method on the ChatBot instance.
o	Run python3 script.py in the console to start your chatbot!
o	Ask a question that includes one of the terms in exit_commands.
Intent Classification
4.
We’ve already imported a collection of functions created throughout the Retrieval-based Chatbots lesson into the workspace, as well as a set of pre-defined responses for our retrieval-based bot.
o	Check out the user_functions.py file to refresh your understanding of the preprocess(), compare_overlap(), extract_nouns(), and compute_similarity() functions.
o	Look over the responses.py file to see the collection of responses already written for our bot.
5.
Let’s build a set of BoW models from our data:
o	Define a .find_intent_match() method with self, responses, and user_message as parameters.
o	In the body of .find_intent_match(), call preprocess() on user_message, then call Counter() on the result to create a bag-of-words (BoW) model. Save the result to bow_model.
o	Call preprocess() on each item in responses, then call Counter() on each result. Save the resulting list to processed_responses.
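To make the bag-of-words step concrete, here is a minimal sketch using only the standard library. The simple_preprocess function is a simplified stand-in for the project's preprocess() (it skips NLTK tokenization and stop-word removal), and the sample sentence is made up.

```python
import re
from collections import Counter

def simple_preprocess(sentence):
    """Simplified stand-in for preprocess(): lowercase, strip punctuation, split on whitespace."""
    sentence = re.sub(r"[^\w\s]", "", sentence.lower())
    return sentence.split()

bow = Counter(simple_preprocess("Is the burrito spicy? The burrito looks great!"))
print(bow["burrito"])  # 2
print(bow["the"])      # 2
print(bow["taco"])     # 0 (a Counter returns zero for missing keys)
```

With the real preprocess(), stop words such as "the" would be filtered out before counting.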
6.
Now we can select the response that best matches the intent of the user message:
o	Still in the body of .find_intent_match(), apply compare_overlap() to each response in processed_responses together with bow_model. Save the resulting list of scores to similarity_list.
o	Use Python’s .index() method and max() function to select the index of the highest similarity score in similarity_list. Save the result to response_index.
o	Use list indexing to return the element at index response_index in responses.
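The selection step can be sketched with small, hand-written BoW models. The token lists below are hypothetical and already preprocessed; compare_overlap() is rewritten compactly here but counts the same thing as the version in user_functions.py.

```python
from collections import Counter

def compare_overlap(user_message, possible_response):
    # Count tokens of the first BoW that also appear in the second
    return sum(1 for token in user_message if token in possible_response)

# Hypothetical, already-tokenized inputs
user_bow = Counter(["burrito", "spicy"])
candidate_bows = [
    Counter(["burrito", "glutenfree", "option", "vegan"]),
    Counter(["selection", "sides", "burrito"]),
    Counter(["burrito", "habanero", "bit", "spicy"]),
]

similarity_list = [compare_overlap(bow, user_bow) for bow in candidate_bows]
response_index = similarity_list.index(max(similarity_list))
print(similarity_list)  # [1, 1, 2]
print(response_index)   # 2
```

Because .index() returns the first match, ties between responses are resolved in favor of the response that appears earliest in the list.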
7.
Let’s test our .find_intent_match() method:
o	Define a method called .respond(), with self and user_message as parameters.
o	Assign the result of calling .find_intent_match(responses, user_message) to a variable called best_response.
o	Within .respond(), print best_response to the console.
o	To allow for multiple questions, use the input() function to prompt the user for another question. Assign the result to input_message.
o	Return input_message.
o	Run your script in the terminal to check if a response is returned!
Entity Recognition
8.
Let’s extract candidate entities from the user message:
o	Define a .find_entities() method with self and user_message as parameters.
o	In the body of .find_entities(), call preprocess() on user_message. Then call pos_tag() on the result and save it to tagged_user_message.
o	Call extract_nouns() on tagged_user_message. Save the result to message_nouns.
9.
Now we can fit a word2vec model on our candidate entities:
o	Use " ".join() to create a concatenated string from message_nouns. Call word2vec() on this string. Save the result to a variable called tokens.
o	Call word2vec() on blank_spot. Save the result to a variable called category.
o	Call compute_similarity() on tokens and category. Save the result to word2vec_result.
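By default, spaCy's .similarity() is a cosine similarity between word vectors. The sketch below reproduces that calculation by hand so the output of compute_similarity() is easier to picture. The three-dimensional vectors are made up for illustration and are not real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up vectors, NOT real word embeddings
toy_vectors = {
    "food": [1.0, 0.0, 0.0],
    "burrito": [0.8, 0.6, 0.0],
    "table": [0.0, 0.6, 0.8],
}

category = "food"
# Same [token, category, score] layout that compute_similarity() returns
result = [
    [word, category, cosine_similarity(toy_vectors[word], toy_vectors[category])]
    for word in ("burrito", "table")
]
print(result)  # burrito scores about 0.8, table scores 0.0
```

In the project itself the vectors come from the loaded spaCy model, so the scores depend on which model is installed.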
10.
Finally, let’s select the entity with the highest similarity score:
o	Call sort(key=lambda x: x[2]) on word2vec_result. This will sort the result list by ascending similarity score.
o	Write an if statement to check whether word2vec_result has at least one item. If False, return blank_spot. Otherwise, return the first element of the last list item in word2vec_result.
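The sort-and-select step above can be sketched as follows. The pick_entity helper and the similarity scores are hypothetical; only the [token, category, score] layout and the blank_spot fallback come from the project.

```python
blank_spot = "food"

def pick_entity(word2vec_result):
    """Hypothetical helper: return the highest-scoring token, or blank_spot if there is none."""
    if not word2vec_result:
        return blank_spot
    ranked = sorted(word2vec_result, key=lambda row: row[2])  # ascending by score
    return ranked[-1][0]

# Made-up similarity scores in the [token, category, score] layout
scores = [["table", "food", 0.21], ["burrito", "food", 0.64], ["menu", "food", 0.47]]
print(pick_entity(scores))  # burrito
print(pick_entity([]))      # food
```

The empty-list case arises when the user message contains no nouns, which is why the fallback to blank_spot matters.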
Response Selection
11.
Let’s pull together the results from the Intent Classification and Entity Recognition tasks:
o	Within .respond(), delete the line which prints best_response to the console.
o	Directly after the call to .find_intent_match, assign the result of calling .find_entities(user_message) to entity.
o	Call Python’s .format() method, with entity as an argument, on best_response. Print the result to the terminal.
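As a small illustration of the final step, the snippet below fills the {} placeholder of one response template from responses.py. The entity values are chosen by hand here; in the bot they come from .find_entities().

```python
response_c = "{} includes habanero, so it is a bit spicy!"
blank_spot = "food"

# Entity found in the user message (hand-picked for this example)
print(response_c.format("burrito"))   # burrito includes habanero, so it is a bit spicy!

# Fallback when no entity was found
print(response_c.format(blank_spot))  # food includes habanero, so it is a bit spicy!
```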
o	Call your script from the terminal to test out your bot!
Improve your bot!
12.
While the bot functions, you may find that after the first few user questions its responses become a bit repetitive. One of the major limitations of retrieval-based chatbots is their reliance on a set of pre-defined responses. Try adding more responses to responses.py in order to extend the functionality of your bot!
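One possible way to extend the candidate set is sketched below. The text of response_d is a made-up example, not part of the original project; any new response only needs a single {} placeholder so that .format(entity) keeps working.

```python
response_a = "The {} has a gluten-free option, but it is not vegan"
response_c = "{} includes habanero, so it is a bit spicy!"
responses = [response_a, response_c]

# Hypothetical additional response; the wording is an assumption for illustration
response_d = "The {} is served with rice and beans."
responses.append(response_d)

# Every template should accept exactly one entity
for template in responses:
    print(template.format("burrito"))
```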


In [None]:
#Responses.py:
response_a = "The {} has a gluten-free option, but it is not vegan"
response_b = "We have a selection of sides to go along with the {}, including mashed potatoes and steamed vegetables."
response_c = "{} includes habanero, so it is a bit spicy!"
blank_spot = "food"

responses = [response_a, response_b, response_c]

#user_functions.py:
import re
from collections import Counter
import spacy
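# Note: 'en' is a spaCy 2.x shortcut; on spaCy 3+ pass a full package name (for example 'en_core_web_md') instead.
word2vec = spacy.load('en')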
from nltk import pos_tag
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
stop_words = set(stopwords.words("english"))

def preprocess(input_sentence):
    input_sentence = input_sentence.lower()
    input_sentence = re.sub(r'[^\w\s]','',input_sentence)
    tokens = word_tokenize(input_sentence)
    input_sentence = [i for i in tokens if i not in stop_words]
    return input_sentence

def compare_overlap(user_message, possible_response):
    similar_words = 0
    for token in user_message:
        if token in possible_response:
            similar_words += 1
    return similar_words

def extract_nouns(tagged_message):
    message_nouns = list()
    for token in tagged_message:
        if token[1].startswith("N"):
            message_nouns.append(token[0])
    return message_nouns

def compute_similarity(tokens, category):
    output_list = list()
    for token in tokens:
        output_list.append([token.text, category.text, token.similarity(category)])
    return output_list

#script.py:
from collections import Counter
from responses import responses, blank_spot
from user_functions import preprocess, compare_overlap, pos_tag, extract_nouns, compute_similarity
import spacy
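# Note: 'en' is a spaCy 2.x shortcut; on spaCy 3+ pass a full package name (for example 'en_core_web_md') instead.
word2vec = spacy.load('en')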

exit_commands = ("quit", "goodbye", "exit", "no", 'end')

class ChatBot:
  #1
  #define .make_exit() below:
  def make_exit(self, user_message):
    for exit_command in exit_commands:
      if exit_command in user_message:
        print("See you later!")
        return True
    # No exit command found: return False explicitly instead of an implicit None
    return False
  #2
  #define .chat() below:
  def chat(self):
    user_message = input("Welcome to La Chata! \nHow can we help you today?\n")
    while not self.make_exit(user_message):
      user_message = self.respond(user_message)
  #5
  #define .find_intent_match() below:
  def find_intent_match(self, responses, user_message):
    bow_model = Counter(preprocess(user_message))
    processed_responses = [Counter(preprocess(response)) for response in responses]
  #6
    similarity_list = [compare_overlap(response, bow_model) for response in processed_responses]
    response_index = similarity_list.index(max(similarity_list))
    return responses[response_index]
  #8
  #define .find_entities() below:
  def find_entities(self, user_message):
    tagged_user_message = pos_tag(preprocess(user_message))
    message_nouns = extract_nouns(tagged_user_message)
    #9
    tokens = word2vec(" ".join(message_nouns))
    category = word2vec(blank_spot)
    word2vec_result = compute_similarity(tokens, category)
    #10
    word2vec_result.sort(key=lambda x: x[2])
    # Fall back to the generic category when no noun was found
    if not word2vec_result:
      return blank_spot
    return word2vec_result[-1][0]

  #7
  #define .respond() below:
  def respond(self, user_message):
    best_response = self.find_intent_match(responses, user_message)
    #11
    entity = self.find_entities(user_message)
    print(best_response.format(entity))
    #print(best_response)
    input_message = input("Do you have any other questions?\n")
    return input_message

#3
#initialize ChatBot instance below:
new_chat = ChatBot()
#call .chat() method below:
new_chat.chat()
