<a href="https://colab.research.google.com/github/Avi-000-Avi/NLP-pipeline-for-chatbots/blob/master/Building_conversational_bots.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

The process works as follows: the user interacts with the bot and types a free-form query about the store. The bot forwards that query to the NLP engine through an API, and the NLP model decides what to return for the new query (the test data).
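
As a rough sketch of that hand-off, the snippet below shows how a bot front end might pass a JSON payload to an NLP engine and wrap the reply. The `handle_request` function, the `query` and `answer` field names, and the `echo_engine` stand-in are all hypothetical and not part of this notebook; a real deployment would sit behind a web framework of some kind.

```python
import json

def handle_request(payload_json, engine):
    """Decode a JSON request, pass the query to the engine, encode the reply."""
    payload = json.loads(payload_json)
    query = payload.get("query", "")   # hypothetical field name
    answer = engine(query)
    return json.dumps({"query": query, "answer": answer})

def echo_engine(query):
    # Stand-in for the real NLP engine, used only to keep this sketch runnable
    return "You asked: " + query

response = handle_request('{"query": "What are your opening hours?"}', echo_engine)
print(response)
```

Passing the engine in as an argument keeps the transport code independent of the model, so the stand-in can be swapped for any function that maps a query string to an answer string.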

In our dataset, the questions serve as the training data and the answers as the labels. When a new query arrives, it is converted to a TF-IDF vector and compared against every stored question, which yields a similarity (confidence) score for each one. The stored question with the highest score is treated as the closest match, and its answer is what the bot returns.
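
To make the matching step concrete, here is a minimal pure-Python sketch of TF-IDF scoring with cosine similarity. The three question/answer pairs are invented for illustration, the tokenizer is a plain whitespace split on single words, and the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` is assumed to mirror scikit-learn's default. The notebook's actual engine uses `TfidfVectorizer` rather than this hand-rolled version.

```python
import math
from collections import Counter

# Hypothetical question/answer pairs, not taken from sample_data.csv
qa_pairs = [
    ("what time do you open", "We open at 8 am."),
    ("how much is a latte", "A latte costs $3.00."),
    ("do you have wifi", "Yes, free wifi is available."),
]

def tokenize(text):
    return text.lower().split()

questions = [tokenize(q) for q, _ in qa_pairs]
n_docs = len(questions)

# Document frequency: in how many questions does each term appear?
doc_freq = Counter(term for q in questions for term in set(q))

def tfidf_vector(tokens):
    """Return an L2-normalized TF-IDF vector as a {term: weight} dict."""
    counts = Counter(t for t in tokens if t in doc_freq)  # ignore unseen terms
    weights = {
        term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
        for term, tf in counts.items()
    }
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}

def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def best_match(query):
    """Return (answer, score) for the closest question, or (None, 0.0)."""
    q_vec = tfidf_vector(tokenize(query))
    scores = [cosine(tfidf_vector(q), q_vec) for q in questions]
    best = max(range(n_docs), key=scores.__getitem__)
    if scores[best] == 0:
        return None, 0.0
    return qa_pairs[best][1], scores[best]

answer, score = best_match("how much for a latte")
print(answer, round(score, 3))
```

A query whose words never appear in any stored question produces an empty vector, so every score is zero and `best_match` returns `(None, 0.0)`, which corresponds to the bot's fallback reply.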

In [0]:
import pandas as pd
import numpy as np
import operator
from sklearn.feature_extraction.text import TfidfVectorizer

In [0]:
from google.colab import files
uploaded = files.upload()

Saving sample_data.csv to sample_data.csv


In [0]:
filepath = 'sample_data.csv'


In [0]:
def bot_engine(query=''):
    # Load the Q&A pairs: column 0 holds the questions, column 1 the answers
    qa_data = pd.read_csv(filepath)

    question_list = qa_data[qa_data.columns[0]].astype(str).tolist()
    answers_list = qa_data[qa_data.columns[1]].tolist()

    vectorizer = TfidfVectorizer(
        min_df=1,            # keep every term; recent scikit-learn rejects an integer min_df of 0
        ngram_range=(2, 4),  # features are word n-grams of length 2 to 4
        strip_accents='unicode',
        norm='l2',
        encoding='ISO-8859-1'
    )

    # Fit the vocabulary on the training set (questions) and vectorize it
    X_train = vectorizer.fit_transform(question_list)

    # Transform the query sent by the user (test data) with the same vocabulary
    X_query = vectorizer.transform([query])

    # Rows are L2-normalized, so each dot product is a cosine similarity
    sim_scores = (X_train @ X_query.T).toarray().ravel()

    # Index of the stored question most similar to the query
    best_idx = int(np.argmax(sim_scores))

    if sim_scores[best_idx] > 0:
        resp = answers_list[best_idx]
    else:
        # No n-gram overlap with any stored question
        resp = "Sorry, I have no answer, please try asking a different question."

    return resp


In [0]:
bot_engine('Can I get an Americano? How much it will cost?')

'An Americano with a single shot will cost $1.40 and the double shot will cost $2.30.'

In [0]:
bot_engine('Are pets allowed inside?')

'Sorry, I have no answer, please try asking a different question.'
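
One possible reason for a fallback reply like this is the `ngram_range=(2, 4)` setting: features are built only from runs of two to four consecutive words, so a query that shares single words but no two-word sequence with any stored question scores exactly zero. Whether that is what happened here depends on the contents of `sample_data.csv`, which is not shown. The sketch below illustrates the effect with a hand-rolled n-gram extractor; the stored question is invented, and the token pattern is assumed to match scikit-learn's default of words with at least two characters.

```python
import re

def word_ngrams(text, n_min=2, n_max=4):
    """Return the set of word n-grams of length n_min..n_max in text."""
    # Assumed to mirror scikit-learn's default token pattern (2+ characters)
    tokens = re.findall(r"\b\w\w+\b", text.lower())
    return {
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    }

stored = "Do you allow pets in the store?"   # hypothetical stored question
query = "Are pets allowed inside?"

shared_words = word_ngrams(stored, 1, 1) & word_ngrams(query, 1, 1)
shared_ngrams = word_ngrams(stored) & word_ngrams(query)

print(shared_words)   # single words in common
print(shared_ngrams)  # 2- to 4-word sequences in common
```

Here the two sentences share the word "pets" but no multi-word sequence, so with `ngram_range=(2, 4)` they would have no features in common. Lowering the range to `(1, 4)` would let single shared words contribute to the score, at the cost of looser matches.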