# PROJECT 5: Hospital Chatbot


**Project Description:**

I built a chatbot for a renowned hospital to provide first-class assistance during the hospital's peak hours.
I adapted a dataset of query/response pairs to make it suitable for an interactive chatbot. After refining the dataset, I built a machine learning model that generates a response to an individual's query, supporting the hospital's customer care department.



In [None]:
# importing modules
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import save_model


# loading training data (patterns, tags, and follow-up context)
training_data = pd.read_csv("/content/training_data_chat_bot.txt", usecols=["patterns", "tags", "context"])
training_data.sample(20)


| | patterns | tags | context |
|---|---|---|---|
| 10 | Till next time | goodbye | |
| 16 | | noanswer | |
| 29 | Which drugs dont have adverse reaction? | adverse_drug | |
| 17 | | noanswer | |
| 52 | | search_hospital_by_params | search_hospital_by_type |
| 26 | Open adverse drugs module | adverse_drug | |
| 49 | I want to search hospital data | hospital_search | search_hospital_by_params |
| 47 | Lookup for hospital | hospital_search | search_hospital_by_params |
| 41 | Find me a pharmacy | pharmacy_search | search_pharmacy_by_name |
| 30 | Open blood pressure module | blood_pressure | |


1. The training data is preprocessed to standardize the text and convert it into numerical data suitable for machine learning algorithms.

2. Missing values are filled with empty strings and the TfidfVectorizer is used to transform the query texts into TF-IDF features, capturing the importance of words within the queries.

3. The 'tags' are encoded numerically and then one-hot encoded to serve as the target for model training.

4. A DNN model is then created and trained on this preprocessed data to predict the appropriate response tags based on input queries.

The model's accuracy and loss are monitored over 90 epochs (epochs=90 in the call to fit()), indicating how well it learns to classify queries into the correct response categories, i.e. the 'tags' column. Finally, the trained model is saved for future use in handling user queries.
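Step 3 can be illustrated without any libraries. The sketch below is a hand-rolled approximation of what LabelEncoder followed by pd.get_dummies produces; the tag list is a small made-up sample, and the helper names fit_label_encoder and one_hot are hypothetical.

```python
# Illustrative only: mimics label encoding + one-hot encoding on a toy tag list.
tags = ["goodbye", "noanswer", "adverse_drug", "noanswer", "hospital_search"]

def fit_label_encoder(values):
    """Map each unique label to an integer id, assigned in sorted order."""
    classes = sorted(set(values))
    return classes, {label: idx for idx, label in enumerate(classes)}

def one_hot(indices, num_classes):
    """Turn integer class ids into one-hot rows."""
    return [[1 if col == idx else 0 for col in range(num_classes)] for idx in indices]

classes, label_to_id = fit_label_encoder(tags)
encoded = [label_to_id[t] for t in tags]
targets = one_hot(encoded, len(classes))

print(classes)     # ['adverse_drug', 'goodbye', 'hospital_search', 'noanswer']
print(encoded)     # [1, 3, 0, 3, 2]
print(targets[0])  # [0, 1, 0, 0]
```

Each row of targets has exactly one 1, which is the shape the softmax output layer and categorical_crossentropy loss expect.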

In [None]:
# preprocessing training data
training_data.fillna('', inplace=True)
training_data["patterns"] = training_data["patterns"].str.lower()
training_data["context"] = training_data["context"].str.lower()
vectorizer = TfidfVectorizer(ngram_range=(1, 2), stop_words="english")
training_data_tfidf = vectorizer.fit_transform(training_data["patterns"]).toarray()

# preprocessing target variable(tags)
le = LabelEncoder()
training_data_tags_le = pd.DataFrame({"tags": le.fit_transform(training_data["tags"])})
training_data_tags_dummy_encoded = pd.get_dummies(training_data_tags_le["tags"]).to_numpy()

# creating DNN
hospitalBot = Sequential()
hospitalBot.add(Dense(10, input_shape=(len(training_data_tfidf[0]),)))
hospitalBot.add(Dense(16))
hospitalBot.add(Dense(32))
hospitalBot.add(Dense(32))
hospitalBot.add(Dense(8))
hospitalBot.add(Dense(len(training_data_tags_dummy_encoded[0]), activation="softmax"))
hospitalBot.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])

# fitting DNN
hospitalBot.fit(training_data_tfidf, training_data_tags_dummy_encoded, epochs=90, batch_size=8)

# saving model file
save_model(hospitalBot, "chatbot_model_v1")

Epoch 1/90
...
Epoch 84/90
[training log truncated; per-epoch loss and accuracy values were not captured]
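One point worth checking in the network above: the hidden Dense layers are created without an activation argument, which in Keras means a linear layer. A stack of linear layers computes the same thing as a single linear layer, as the small pure-Python sketch below shows (biases are left out for brevity, and the weights are made-up numbers). Adding something like activation="relu" to the hidden layers would be one possible refinement; that is my suggestion, not part of the original model.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

x = [[1.0, 2.0]]                          # one input row with two features
w1 = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]   # first linear layer: 2 -> 3 units
w2 = [[1.0], [2.0], [3.0]]                # second linear layer: 3 -> 1 unit

two_layers = matmul(matmul(x, w1), w2)    # apply the layers one after another
one_layer = matmul(x, matmul(w1, w2))     # apply the pre-multiplied weights once

print(two_layers, one_layer)  # [[17.0]] [[17.0]]
```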

In [None]:
# importing modules
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import LabelEncoder
import numpy as np
from tensorflow.keras.models import load_model
import json
import random
#Let’s load data, model, and responses

# importing training data
training_data = pd.read_csv("/content/training_data_chat_bot.txt", usecols=["patterns", "tags", "context"])

# Replace NaN values with empty strings in 'patterns' and 'context' columns
training_data.fillna('', inplace=True)

# loading model
chatbot_model = load_model("chatbot_model_v1")


In [None]:
# Load responses
with open("/content/response.json") as file:
    responses = json.load(file)

The fitted vectorizer was not saved alongside the model, so a new TfidfVectorizer is created with the same settings (ngram_range=(1, 2), stop_words="english") and fitted again to the lowercased 'patterns' column of the training data. Refitting on the same data with the same settings rebuilds the training vocabulary, so future input queries are converted into the same TF-IDF feature space the model was trained on.

This step ensures that the chatbot interprets user queries consistently with how it was trained. Together with the responses loaded above, it gives the chatbot a variety of replies to engage users effectively.

In [None]:
# fitting TfIdfVectorizer with training data to preprocess inputs
training_data["patterns"] = training_data["patterns"].str.lower()
vectorizer = TfidfVectorizer(ngram_range=(1, 2), stop_words="english")
vectorizer.fit(training_data["patterns"])
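To make the idea of "the same feature space" concrete, here is a simplified TF-IDF sketch in plain Python. It is an approximation for illustration only: it uses single words, no stop-word removal, smoothed IDF, and L2 normalization, whereas the notebook uses scikit-learn's TfidfVectorizer with bigrams and English stop words. The function names fit_tfidf and transform_tfidf are hypothetical.

```python
import math
from collections import Counter

def fit_tfidf(documents):
    """Learn a vocabulary and smoothed IDF weights from the training texts."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    idf = {}
    for term in vocabulary:
        doc_freq = sum(1 for doc in tokenized if term in doc)
        idf[term] = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
    return vocabulary, idf

def transform_tfidf(text, vocabulary, idf):
    """Project a text onto the fitted vocabulary; unseen words are ignored."""
    counts = Counter(text.lower().split())
    vec = [counts[term] * idf[term] for term in vocabulary]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec

patterns = ["find me a pharmacy", "lookup for hospital", "open blood pressure module"]
vocabulary, idf = fit_tfidf(patterns)

known = transform_tfidf("Find a hospital", vocabulary, idf)
unknown = transform_tfidf("xyzzy", vocabulary, idf)

print(len(vocabulary), len(known), len(unknown))  # 11 11 11
print(any(unknown))                               # False
```

Every query maps to a vector of the same length as the vocabulary, and a query made entirely of unseen words becomes an all-zero vector, which is also what the empty 'patterns' rows look like during training.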

A LabelEncoder is instantiated and fitted to the 'tags' column of the training data. Fitting maps each unique tag to a unique integer, and the same encoder is used later to convert the model's predicted class index back into its tag.

In [None]:
# fitting LabelEncoder with target variable(tags) for inverse transformation of predictions
le = LabelEncoder()
le.fit(training_data["tags"])

I have implemented context handling with the update_context() function.
It lets the chatbot remember the topic of the conversation and give relevant follow-up responses, making interactions more context-aware and improving the user experience.

In [None]:
# Create a dictionary to map intents to their follow-up contexts
intent_to_context = dict(zip(training_data["tags"], training_data["context"]))

# Context handling
current_context = ""

def update_context(tag):
    global current_context
    if tag in intent_to_context and intent_to_context[tag] != "":
        current_context = intent_to_context[tag]
    else:
        current_context = ""  # Reset context if no follow-up is required
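The following sketch traces how the context moves through a short follow-up chain. The mapping is a toy one modeled on the sample rows shown earlier (the entry for search_hospital_by_type is an assumption), and next_context is a hypothetical, global-free variant of update_context().

```python
# Toy mapping modeled on the sample rows; the real one is built from the CSV.
intent_to_context = {
    "hospital_search": "search_hospital_by_params",
    "search_hospital_by_params": "search_hospital_by_type",
    "search_hospital_by_type": "",  # assumed: no further follow-up
    "goodbye": "",
}

def next_context(tag, mapping):
    """Return the follow-up context for `tag`, or "" when there is none."""
    return mapping.get(tag, "")

trace = []
for tag in ["hospital_search", "search_hospital_by_params", "search_hospital_by_type"]:
    trace.append((tag, next_context(tag, intent_to_context)))
print(trace)

# dict(zip(...)) keeps the last value seen for a repeated key.
collapsed = dict(zip(["hospital_search", "hospital_search"],
                     ["search_hospital_by_params", ""]))
print(collapsed)  # {'hospital_search': ''}
```

The last two lines show a property of dict(zip(...)) worth keeping in mind: if rows for the same tag disagree on 'context', only the value from the final row survives in intent_to_context.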


In [None]:
# transforming input and predicting intent
def predict_tag(inp_str):
    inp_data_tfidf = vectorizer.transform([inp_str.lower()]).toarray()
    predicted_proba = chatbot_model.predict(inp_data_tfidf)
    encoded_label = [np.argmax(predicted_proba)]
    predicted_tag = le.inverse_transform(encoded_label)[0]
    return predicted_tag
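The decoding step in predict_tag() (argmax, then index back to a tag) can be sketched in plain Python. The min_confidence threshold and the fallback to 'noanswer' are my additions for illustration and are not part of the function above; the class list and probabilities are made up.

```python
def decode_prediction(probabilities, classes, min_confidence=0.0, fallback="noanswer"):
    """Return the most probable tag, or `fallback` if the top score is too low."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    if probabilities[best] < min_confidence:
        return fallback
    return classes[best]

classes = ["adverse_drug", "blood_pressure", "goodbye", "noanswer"]

confident = decode_prediction([0.05, 0.80, 0.10, 0.05], classes)
uncertain = decode_prediction([0.30, 0.28, 0.22, 0.20], classes, min_confidence=0.5)

print(confident)  # blood_pressure
print(uncertain)  # noanswer
```

With the default min_confidence=0.0 the helper behaves like a plain argmax, which matches what predict_tag() does.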

The start_chat() function initiates the chatbot session. It welcomes users, guides them on interacting with the chatbot, and enters a loop to process user inputs. The chatbot uses the predict_tag() function to determine the user's intent, fetches a relevant response based on this intent or the ongoing context, and updates the conversation's context as needed. The session continues until the user types "EXIT".

In [None]:
# defining chat function
def start_chat():
    print("---------------  Welcome to our Hospital Assistance Chatbot  -------"
      "--------")
    print()
    print("Hi! I'm here to help you with your queries regarding our services "
      "and facilities.")
    print()
    print("You can ask me about appointments, health services, hospital "
      "information, and more.")
    print()
    print("Just type your question below to start. If at any point you wish to "
      "end the conversation, type 'EXIT'.")
    print()
    while True:
        inp = input("Ask anything... : ")
        if inp == "EXIT":
            break
        if current_context == "":
            # No follow-up pending: classify the query to get its intent
            tag = predict_tag(inp)
        else:
            # A follow-up is pending: the stored context selects the response
            tag = current_context
        response = random.choice(responses[tag])
        update_context(tag)
        print("Response... : ", response)

# calling chat function to start chatting
start_chat()

---------------  Welcome to our Hospital Assistance Chatbot  ---------------

Hi! I'm here to help you with your queries regarding our services and facilities.

You can ask me about appointments, health services, hospital information, and more.

Just type your question below to start. If at any point you wish to end the conversation, type 'EXIT'.

Ask anything... : Hey there
Response... :  Good to see you again
Ask anything... : options
Response... :  Sorry, can't understand you
Ask anything... : What support is offered
Response... :  Offering support for Adverse drug reaction, Blood pressure, Hospitals and Pharmacies
Ask anything... : How to check Adverse drug reaction?
Response... :  Navigating to Adverse drug reaction module
Ask anything... : I want to log blood pressure results
Response... :  Navigating to Blood Pressure module
Ask anything... : I want to search for blood pressure result history
Response... :  Patient ID?
Ask anything... : PAT23101
Response... :  Loading Blood pres
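
The turn logic inside start_chat() can also be exercised without the trained model or input() by factoring it into a function. This is a minimal sketch under stated assumptions: chat_turn and fake_predict are hypothetical names, and the tags and responses below are stand-ins, not the contents of response.json.

```python
import random

def chat_turn(user_input, context, predict, responses, intent_to_context):
    """Process one message and return (response, new_context)."""
    # With no pending follow-up, classify the message; otherwise the
    # pending context itself selects the response.
    tag = predict(user_input) if context == "" else context
    response = random.choice(responses[tag])
    return response, intent_to_context.get(tag, "")

# Stubs standing in for the trained model and response.json (hypothetical data).
responses = {
    "blood_pressure_search": ["Patient ID?"],
    "search_blood_pressure_by_patient_id": ["Loading results"],
}
intent_to_context = {
    "blood_pressure_search": "search_blood_pressure_by_patient_id",
    "search_blood_pressure_by_patient_id": "",
}

def fake_predict(text):
    """Pretend classifier that always returns the same intent."""
    return "blood_pressure_search"

reply1, mid_context = chat_turn("I want to search for blood pressure result history",
                                "", fake_predict, responses, intent_to_context)
reply2, end_context = chat_turn("PAT23101", mid_context, fake_predict,
                                responses, intent_to_context)
print(reply1, "|", reply2, "|", repr(end_context))
```

The second call never invokes the classifier: the patient ID is treated as the answer to the pending follow-up, and the context is cleared afterwards.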