![AI Image](A-Deep-Dive-Into-Healthcare-Chatbots_-Benefits-Use-Cases-and-Challenges_big-new-min-1024x512.webp)

# Introduction
This project builds a chatbot that assists clients by answering their inquiries quickly and accurately. It uses natural language processing to interpret user input and return relevant information or support. The goal is to make customer service more efficient: shorter response times, assistance available 24/7, and more personalized interactions. The chatbot is also intended to adapt over time as client needs change. The notebook below prepares a dataset of patient questions and doctor answers, then runs a locally hosted Llama 2 chat model as the conversational backend.

In [1]:
import re

import nltk
import pandas as pd
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import LabelEncoder
from langchain.llms import CTransformers
from langchain.prompts import PromptTemplate



In [2]:
df=pd.read_csv("mchat.csv")

In [3]:
df

| Unnamed: 0 | Description | Patient | Doctor |
|---|---|---|---|
| 0 | Q. What does abutment of the nerve root mean? | Hi doctor,I am just wondering what is abutting... | Hi. I have gone through your query with dilige... |
| 1 | Q. What should I do to reduce my weight gained... | Hi doctor, I am a 22-year-old female who was d... | Hi. You have really done well with the hypothy... |
| 2 | Q. I have started to get lots of acne on my fa... | Hi doctor! I used to have clear skin but since... | Hi there Acne has multifactorial etiology. Onl... |
| 3 | Q. Why do I have uncomfortable feeling between... | Hello doctor,I am having an uncomfortable feel... | Hello. The popping and discomfort what you fel... |
| 4 | Q. My symptoms after intercourse threatns me e... | Hello doctor,Before two years had sex with a c... | Hello. The HIV test uses a finger prick blood ... |
| ... | ... | ... | ... |
| 256911 | Why is hair fall increasing while using Bontre... | I am suffering from excessive hairfall. My doc... | Hello Dear Thanks for writing to us, we are he... |
| 256912 | Why was I asked to discontinue Androanagen whi... | Hi Doctor, I have been having severe hair fall... | hello, hair4u is combination of minoxid... |
| 256913 | Can Mintop 5% Lotion be used by women for seve... | Hi..i hav sever hair loss problem so consulted... | HI I have evaluated your query thoroughly you... |
| 256914 | Is Minoxin 5% lotion advisable instead of Foli... | Hi, i am 25 year old girl, i am having massive... | Hello and Welcome to ‘Ask A Doctor’ service.I ... |


In [4]:

# Download necessary NLTK resources
nltk.download('punkt')
nltk.download('stopwords')

# df was loaded in the cell above and has the text columns
# 'Description', 'Patient', and 'Doctor'

# Initialize stopwords
stop_words = set(stopwords.words('english'))

# Function to clean and tokenize text
def clean_and_tokenize(text):
    # Convert to lowercase
    text = text.lower()
    # Remove special characters and numbers
    text = re.sub(r'[^a-z\s]', '', text)
    # Tokenize
    tokens = word_tokenize(text)
    # Remove stopwords
    tokens = [word for word in tokens if word not in stop_words]
    return ' '.join(tokens)

# Apply the cleaning and tokenization function to each column
df['Description'] = df['Description'].apply(clean_and_tokenize)
df['Patient'] = df['Patient'].apply(clean_and_tokenize)
df['Doctor'] = df['Doctor'].apply(clean_and_tokenize)

# Concatenate all text columns to form a single corpus for embeddings
df['All_Text'] = df['Description'] + " " + df['Patient'] + " " + df['Doctor']

# Convert text to TF-IDF embeddings
vectorizer = TfidfVectorizer(max_features=5000)  # Adjust the number of features as needed
X = vectorizer.fit_transform(df['All_Text']).toarray()

# Optional: Encode the Description column for classification purposes
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(df['Description'])

# Now, X contains the TF-IDF embeddings of the text, and y contains the encoded labels for descriptions
print("TF-IDF Matrix Shape:", X.shape)
print("Encoded Labels Shape:", y.shape)


[nltk_data] Downloading package punkt to C:\Users\moham/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\moham/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


TF-IDF Matrix Shape: (256916, 5000)
Encoded Labels Shape: (256916,)
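
To make the cleaning step concrete, the sketch below applies the same sequence (lowercase, strip everything except letters and whitespace, drop stopwords) to a single made-up sentence. It is a dependency-free approximation: `SAMPLE_STOPWORDS` is a small hypothetical stand-in for NLTK's English stopword list, `clean_text_sketch` is an illustrative name, and `str.split` replaces `word_tokenize`, so results on real data may differ slightly.

```python
import re

# Hypothetical, tiny stand-in for NLTK's English stopword list (assumption).
SAMPLE_STOPWORDS = {"i", "have", "had", "for"}

def clean_text_sketch(text):
    """Lowercase, drop non-letter characters, and remove stopwords."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", "", text)  # same pattern as in the cell above
    tokens = text.split()                  # whitespace split instead of word_tokenize
    return " ".join(t for t in tokens if t not in SAMPLE_STOPWORDS)

sample = "Hello doctor, I have had lower-back pain for 3 weeks!"  # made-up input
cleaned = clean_text_sketch(sample)
print(cleaned)
```

In this example `lower-back` collapses to `lowerback` and the duration `3` disappears. That may be worth keeping in mind for medical text, where numbers often carry dosage or duration information.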


In [5]:

# Define the LLM model with your model path and type
llm = CTransformers(model="C:/Users/moham/Desktop/projet adam MEDchat/llama-2-7b-chat.ggmlv3.q4_0.bin", model_type="llama")

# Define the prompt template with a specific task or question format
template = "You are a helpful assistant. Answer the following question concisely: {question}"
prompt = PromptTemplate.from_template(template)

# Chain the prompt and the LLM using RunnableSequence
chain = prompt | llm

# Define the question to be answered
question = "What are the key differences between supervised and unsupervised learning?"

# Run the chain with the input question using `.invoke()` method
response = chain.invoke({"question": question})  # Pass the question as a dictionary
print(response)



Supervised learning involves training an algorithm on labeled data, where the algorithm learns to predict the output for new input based on the patterns it observed during training. In contrast, unsupervised learning involves training an algorithm on unlabeled data, where the algorithm must find patterns or structure in the data without any prior knowledge of the expected output.
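
Before the text reaches the model, the template step roughly amounts to substituting the question into a fixed string. A minimal sketch with plain `str.format`, which needs neither LangChain nor a model file; `build_prompt` is a hypothetical helper used only for illustration, and the question is made up.

```python
TEMPLATE = (
    "You are a helpful assistant. "
    "Answer the following question concisely: {question}"
)

def build_prompt(question):
    """Fill the {question} placeholder with the user's question."""
    return TEMPLATE.format(question=question)

prompt_text = build_prompt("What is hypothyroidism?")
print(prompt_text)
```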



In [6]:
def chunk_text(df, chunk_size):
    """Chunk the DataFrame into smaller pieces based on a defined chunk size."""
    chunks = []
    current_chunk = []
    current_length = 0  # To track the number of tokens in the current chunk

    for _, row in df.iterrows():
        # Concatenate text from the row
        text = f"Description: {row['Description']} Patient: {row['Patient']} Doctor: {row['Doctor']}"
        token_count = len(text.split())  # Approximate token count based on words

        # Close the current chunk if adding this text would exceed the chunk size.
        # Checking `current_chunk` first avoids emitting an empty chunk when a
        # single row is longer than chunk_size.
        if current_chunk and current_length + token_count > chunk_size:
            chunks.append(" ".join(current_chunk))  # Join the chunk into a single string
            current_chunk = []
            current_length = 0

        current_chunk.append(text)
        current_length += token_count

    if current_chunk:  # If there is remaining text in the current chunk
        chunks.append(" ".join(current_chunk))

    return chunks


In [7]:
# Define your chunk size based on the model's maximum context length (e.g., 512 tokens)
max_tokens = 512  # Adjust according to your model's context length

# Generate chunks
text_chunks = chunk_text(df, chunk_size=max_tokens)
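
The greedy packing performed by `chunk_text` is easier to follow on a toy input. The sketch below applies the same idea to a plain list of strings instead of a DataFrame; `chunk_strings` and `rows` are hypothetical names introduced here, and word counts remain only a rough proxy for the model's actual token count.

```python
def chunk_strings(texts, chunk_size):
    """Greedily pack strings into chunks, aiming for at most chunk_size words each.

    A single string longer than chunk_size still ends up in its own chunk.
    """
    chunks, current, current_len = [], [], 0
    for text in texts:
        n_words = len(text.split())
        # Close the current chunk when the next text would overflow it.
        if current and current_len + n_words > chunk_size:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(text)
        current_len += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks

rows = ["a b c", "d e", "f g h i", "j"]  # made-up rows of 3, 2, 4, and 1 words
demo_chunks = chunk_strings(rows, chunk_size=5)
print(demo_chunks)  # ['a b c d e', 'f g h i j']
```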


In [None]:

# Initialize the LLM model
llm = CTransformers(model="C:/Users/moham/Desktop/projet adam MEDchat/llama-2-7b-chat.ggmlv3.q4_0.bin", model_type="llama")

# Define the initial conversation context
conversation_history = []

# Function to create a prompt based on the conversation history
def create_prompt(question, context):
    prompt = "You are a helpful assistant.\n\n"
    prompt += f"Previous context:\n{context}\n\n"
    prompt += f"User: {question}\nAssistant:"
    return prompt

# Function to interact with the model
def chat_with_model():
    while True:
        # Get user input
        question = input("You: ")
        if question.lower() in ['exit', 'quit']:
            print("Exiting the chat.")
            break
        
        # Chunk context to avoid exceeding token limits
        context = " ".join(conversation_history[-5:])  # Use the last 5 messages for context (adjust as needed)

        # Create a prompt with the current question and context
        prompt = create_prompt(question, context)
        
        # Get the response from the LLM using the invoke method
        try:
            response = llm.invoke(prompt)
        except Exception as e:
            print(f"An error occurred: {e}. Please try rephrasing your question.")
            continue  # Skip to the next iteration for new input
        
        # Print the assistant's response
        print(f"Assistant: {response.strip()}")
        
        # Update the conversation history
        conversation_history.append(f"User: {question}")
        conversation_history.append(f"Assistant: {response.strip()}")

# Start the chatbot interaction
chat_with_model()


Assistant: I'm so sorry to hear that you're experiencing back pain. It's important to consult with a medical professional to determine the cause of your back pain and get proper treatment. In the meantime, here are some things you can try to help alleviate your discomfort:

* Take over-the-counter pain medication as directed.
* Apply heat or cold packs to the affected area.
* Practice good posture and lifting techniques to avoid exacerbating the problem.
* Engage in gentle stretching and exercise to help maintain flexibility and mobility.
* Consider visiting a physical therapist for further evaluation and treatment.

Please let me know if you have any other questions or concerns, or if there's anything else I can do to help.
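
The context handling in the chat loop can be inspected without loading the model. The sketch below rebuilds the prompt from a made-up history using the same five-message window; `build_chat_prompt` is a hypothetical helper that mirrors the prompt layout above, and the history entries are placeholders.

```python
def build_chat_prompt(question, context):
    """Assemble the system line, prior context, and the new question."""
    prompt = "You are a helpful assistant.\n\n"
    prompt += f"Previous context:\n{context}\n\n"
    prompt += f"User: {question}\nAssistant:"
    return prompt

# Made-up history of four completed turns (eight messages).
history = []
for i in range(4):
    history.append(f"User: question {i}")
    history.append(f"Assistant: answer {i}")

window = history[-5:]  # same slice as in the chat loop
chat_prompt = build_chat_prompt("next question", " ".join(window))
print(window[0])
print(chat_prompt)
```

Because each turn adds two messages, an odd window size such as 5 starts on an assistant message whose question has already been dropped. An even window size would keep question and answer pairs together, if that matters for the use case.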


# Conclusion
In conclusion, the chatbot has been developed and tested interactively. In the sample exchange above, it understood the user's input and returned a relevant, timely response. With the model working as expected, the next step is to implement a web interface so that users can interact with the chatbot directly. This will provide a user-friendly platform, make the chatbot readily accessible to clients, and further improve the overall customer experience.