# **Dental Clinic RAG Chatbot: Interactive Customer Support System**

Welcome to the **Dental Clinic RAG Chatbot**! This notebook demonstrates an AI-powered chatbot designed to assist customers with information about services, branches, and social media platforms available at our clinic. The chatbot is built using state-of-the-art natural language processing (NLP) techniques and integrates various data sources, including clinic services, branch locations, and social media platforms, to provide context-aware responses in both Arabic and English.

## **Objective**
The goal of this system is to create a seamless customer support experience by allowing users to ask questions related to:
- **Clinic Services**: Detailed information about the treatments and services offered.
- **Branches**: Locations and available branches of the clinic across the country.
- **Social Media**: Platforms where customers can engage with the clinic and stay updated.

## **How It Works**
This chatbot leverages a combination of:
1. **Sentence Embeddings**: Using pre-trained models for multilingual embeddings to understand and process customer queries.
2. **Cosine Similarity**: To retrieve relevant documents and provide the most accurate responses based on the user’s query.
3. **Language Models (LLM)**: A pre-trained large language model (LLM) generates the final response from the retrieved context, taking past conversation turns into account to keep interactions coherent.

By running this notebook, you will be able to interact with the chatbot, ask questions, and receive responses grounded in the clinic data stored in the accompanying SQLite database.

---

### Explanation:
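1. **faiss-cpu** and **faiss-gpu**: Libraries for efficient similarity search and clustering of dense vectors, for use with CPU and GPU respectively. Both packages provide the same `faiss` module, so normally only one of them is installed. The retrieval code in this notebook uses scikit-learn's `cosine_similarity` rather than FAISS.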
2. **transformers**: A library by Hugging Face for working with pre-trained models in natural language processing (NLP).
3. **accelerate**: A library for optimized model training and deployment with multi-GPU support.
4. **torch**: PyTorch, an open-source machine learning library for neural networks and deep learning.
5. **bitsandbytes**: A library for 8-bit and 4-bit quantization of large models, used here to load the LLM in 8-bit precision.
6. **langchain_community**: Community-maintained integrations for LangChain, a framework for building applications with large language models (LLMs).
7. **langchain-huggingface**: The LangChain integration package for Hugging Face models.

---

In [None]:
!pip install -q faiss-cpu
# !pip install -q faiss-gpu  # GPU alternative; faiss-cpu and faiss-gpu provide the same module, so install only one
!pip install -q -U transformers accelerate torch
!pip install -q -U bitsandbytes
!pip install -q sentence-transformers scikit-learn  # imported below
!pip install -q langchain_community
!pip install -q -U langchain-huggingface

# Importing Required Libraries

This code imports various libraries for NLP, deep learning, and database management. These libraries will be useful for building chatbots, working with pre-trained models, and performing vector-based similarity tasks.


### Explanation of Libraries:
1. **langchain**: A framework for building applications using language models (LLMs). It provides tools for prompts, memory, and chains that make it easier to develop chatbots.
   - `PromptTemplate` is used to define the structure of the input to a language model.
   - `LLMChain` is a class that chains together a prompt template and a model.
   - `ConversationBufferMemory` stores the conversation history.

2. **langchain_huggingface**: An extension to Langchain that allows easy integration with Hugging Face models for building conversational agents.
   
3. **transformers**: A Hugging Face library that provides access to pre-trained transformer models for NLP tasks, such as text generation.
   - `pipeline` allows for easy usage of pre-trained models for tasks like text generation.
   - `AutoTokenizer` and `AutoModelForCausalLM` are used to load tokenizer and causal language models.

4. **sentence_transformers**: A library for generating sentence embeddings and computing similarity between sentences.
   - `SentenceTransformer` helps in encoding sentences into vector representations.
   - `cosine_similarity` from `sklearn` is used to compute similarity between sentence embeddings.

5. **huggingface_hub**: A library to interact with the Hugging Face model hub, such as logging into the hub to access models.
   
6. **sqlite3**: A lightweight database library for storing and querying data in SQLite databases.

---

In [None]:
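# Import from the submodules; importing these names from the top-level
# `langchain` package is deprecated in recent releases
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory
from langchain_huggingface import HuggingFacePipeline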
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

import pandas as pd
from huggingface_hub import login
import sqlite3

### Explanation of the Code:

1. **Hugging Face Hub Login**: 
   - `login("your_token_access")` logs into the Hugging Face hub using the provided authentication token. This step allows you to access and use models hosted on the Hugging Face platform.

2. **Database Path**: 
   - `DATABASE_PATH = "/kaggle/input/dental-clinic-rag-chatbot/ara_database.sqlite"` specifies the path to the SQLite database that holds the clinic data the chatbot answers from: services, branches, and social media platforms, each in an Arabic and an English table.

3. **Embedding Model**: 
   - `EMBEDDING_MODEL_NAME = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"` initializes the embedding model. This specific model, **paraphrase-multilingual-MiniLM-L12-v2**, is designed to work with multilingual text and is used for generating sentence embeddings, which can be used for semantic search or similarity tasks.

4. **Model ID for Text Generation**: 
   - `MODEL_ID = "meta-llama/Llama-3.1-8B"` selects the pre-trained **Llama-3.1-8B** model from Meta, a large language model used here for text generation. Access to this repository on the Hugging Face Hub is gated, which is why the login step above is required. Note that this is the base checkpoint rather than the instruction-tuned `Llama-3.1-8B-Instruct` variant.

This section sets up all necessary configurations for integrating Hugging Face models, preparing the embedding model, and pointing to the relevant database for storing information.
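
Hard-coding an access token in a shared notebook makes it easy to leak. The following is a minimal sketch of one alternative, assuming the token has been stored in an environment variable beforehand. The helper name `read_hf_token` and the default variable name `HF_TOKEN` are illustrative choices, not part of the original notebook.

```python
import os


def read_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the access token stored in `env_var`, or raise if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return token


# In the notebook, this could replace the hard-coded string:
# from huggingface_hub import login
# login(token=read_hf_token())
```

Failing early with a clear message is a design choice here; it avoids a less obvious authentication error later when the gated model is downloaded.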

---

In [None]:
# Logging in to Hugging Face Hub
login("your_token_access")  # replace with your own Hugging Face access token

# Define the path to the database and embedding model
DATABASE_PATH = "/kaggle/input/dental-clinic-rag-chatbot/ara_database.sqlite"
EMBEDDING_MODEL_NAME = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
# MODEL_ID = "silma-ai/SILMA-9B-Instruct-v1.0"
MODEL_ID = "meta-llama/Llama-3.1-8B"

### Model Quantization and Loading

This section of the code configures model quantization, loads the tokenizer, and loads the model with the defined quantization configuration. 

1. **Quantization Configuration**:
   - `quantization_config = BitsAndBytesConfig(...)`:
     - `load_in_8bit=True`: Enables 8-bit quantization. The weights of the linear layers are stored as 8-bit integers instead of 16- or 32-bit floats, which substantially reduces memory usage, usually with little loss in output quality.
     - `llm_int8_threshold=6.0`: Optional outlier threshold used by the LLM.int8() scheme. Activation values whose magnitude exceeds the threshold are treated as outliers and processed in 16-bit precision, while the remaining values go through the 8-bit matrix multiplication. `6.0` is the library default.
     - `llm_int8_skip_modules=None`: Optional list of modules to leave unquantized. With `None`, no additional modules are excluded beyond the library's defaults.

2. **Loading the Tokenizer**:
   - `tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)`:
     - Loads the tokenizer for the specified pre-trained model (`MODEL_ID`). Tokenizers are used to convert input text into token IDs that the model can process, and vice versa.

3. **Loading the Model**:
   - `model = AutoModelForCausalLM.from_pretrained(...)`:
     - This command loads the pre-trained causal language model (`MODEL_ID`) using the quantization configuration. 
     - The model is loaded onto the appropriate device (either GPU or CPU) based on the `device_map="auto"` parameter.
     - The quantization configuration applied during model loading ensures that the model's memory footprint is reduced while maintaining performance.
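
To get a rough sense of why 8-bit loading matters, the weight memory can be estimated as the parameter count times the bytes per parameter. The sketch below assumes roughly 8 billion parameters (taken from the model name) and ignores activations, the KV cache, and any layers kept in higher precision, so the numbers are only indicative.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


NUM_PARAMS = 8e9  # assumption: about 8 billion parameters, per the model name

for label, nbytes in [("float32", 4), ("float16", 2), ("int8", 1)]:
    print(f"{label:>8}: {weight_memory_gib(NUM_PARAMS, nbytes):.1f} GiB")
```

Under these assumptions the estimate drops from about 30 GiB in 32-bit floats to about 7.5 GiB in 8-bit integers, which is the difference between not fitting and fitting on a single 16 GB GPU.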

---

In [None]:
# Define quantization configuration
quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,  # Enable 8-bit quantization
    llm_int8_threshold=6.0,  # Optional: outlier threshold for LLM.int8() (6.0 is the default)
    llm_int8_skip_modules=None  # Optional: modules to leave unquantized
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Load model with quantization configuration
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quantization_config,
    device_map="auto"  # Automatically map to GPU/CPU
)

### Text Generation Pipeline and LangChain Integration

This section of the code sets up a text generation pipeline using a pre-trained model and tokenizer, and then wraps the pipeline for integration with LangChain.

1. **Setting the Padding Token**:
   - `tokenizer.pad_token_id = tokenizer.eos_token_id`:
     - This line ensures that the padding token (`pad_token_id`) is set to the same token as the end-of-sequence token (`eos_token_id`). This is necessary because the model may not have a separate padding token, and setting the padding token to the end-of-sequence token ensures that padding is handled correctly.

2. **Creating the Text Generation Pipeline**:
   - `text_gen_pipeline = pipeline(...)`:
     - This command sets up a text generation pipeline using the pre-trained model (`model`) and tokenizer (`tokenizer`). The pipeline is configured with the following parameters:
       - `do_sample=True`: This enables sampling during text generation, meaning that the model will generate diverse outputs based on probability distributions.
       - `temperature=0.5`: Controls the randomness of generation by rescaling the token probabilities before sampling. Values below 1 sharpen the distribution and make outputs more deterministic; values above 1 flatten it and make outputs more diverse.
       - `top_p=0.65`: Enables nucleus sampling. At each step the model samples only from the smallest set of most probable tokens whose cumulative probability reaches `p` (here 65%).
       - `max_new_tokens=256`: Limits the maximum number of tokens the model can generate in one response to 256.
       - `repetition_penalty=1.2`: This reduces the likelihood of the model repeating phrases or generating repetitive text.

3. **Wrapping the Pipeline for LangChain**:
   - `llm = HuggingFacePipeline(pipeline=text_gen_pipeline)`:
     - This line wraps the text generation pipeline for use with LangChain. The `HuggingFacePipeline` class allows LangChain to interface with the pipeline, enabling you to use it in larger, more complex applications such as chatbots or interactive systems.

This setup allows you to generate text responses based on the input, which can be integrated into a broader conversational system using LangChain.
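
The effect of `temperature` and `top_p` can be illustrated on a toy distribution. The sketch below is a simplified pure-Python rendering of the two ideas, not the Transformers implementation; the vocabulary and logits are made up for illustration.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Convert logits to probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most probable tokens whose cumulative
    probability reaches top_p, then renormalize over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


# Toy vocabulary and logits (made up for illustration)
vocab = ["cleaning", "whitening", "braces", "implants", "banana"]
logits = [2.0, 1.5, 1.0, 0.5, -1.0]

probs = apply_temperature(logits, temperature=0.5)
nucleus = top_p_filter(probs, top_p=0.65)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
choice = rng.choices(list(nucleus), weights=list(nucleus.values()), k=1)[0]
print({vocab[i]: round(p, 3) for i, p in nucleus.items()})
print("sampled:", vocab[choice])
```

With these values only the two most probable tokens survive the filter, which shows how a low temperature combined with a moderate `top_p` keeps generation focused while still leaving some variation.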

---

In [None]:
tokenizer.pad_token_id = tokenizer.eos_token_id

# Create a pipeline
text_gen_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    do_sample=True,
    temperature=0.5,
    top_p=0.65,
    max_new_tokens=256,
    repetition_penalty=1.2
)

# Wrap the pipeline for LangChain
llm = HuggingFacePipeline(pipeline=text_gen_pipeline)

### Loading the Embedding Model

1. **Loading the Sentence Transformer Model**:
   - `embedding_model = SentenceTransformer(EMBEDDING_MODEL_NAME)`:
     - This line loads a pre-trained embedding model from the `sentence-transformers` library. The model specified by `EMBEDDING_MODEL_NAME` (in this case, `"sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"`) is designed to generate sentence embeddings. These embeddings are vector representations of sentences that capture semantic meaning in a dense format, allowing you to compare and analyze text based on their meanings.

   - The `SentenceTransformer` class is specifically designed to handle tasks such as:
     - Text similarity: Comparing sentences or documents to find how similar they are.
     - Text clustering: Grouping sentences or documents with similar meanings.
     - Information retrieval: Finding the most relevant documents or answers based on the input query.

   - By using this embedding model, you can convert your input text into fixed-size vectors (embeddings) that can then be used for tasks like semantic search, document retrieval, and various NLP-based applications.
  
---

In [None]:
# Load the embedding model
embedding_model = SentenceTransformer(EMBEDDING_MODEL_NAME)

### Connecting to the SQLite Database and Reading Data

1. **Connecting to the SQLite Database**:
   - `conn = sqlite3.connect(DATABASE_PATH)`:
     - This line establishes a connection to the SQLite database using the provided `DATABASE_PATH`. The connection object `conn` allows you to interact with the database (in this case, located at the specified path, i.e., `"/kaggle/input/dental-clinic-rag-chatbot/ara_database.sqlite"`).

2. **Reading Data from the SQLite Database**:
   - The following queries use `pd.read_sql_query()` to read data from each table in the SQLite database and load it into pandas DataFrames:
     - `services_arabic_df = pd.read_sql_query("SELECT * FROM Services_Arabic", conn)`:
       - This retrieves all records from the `Services_Arabic` table and stores them in the `services_arabic_df` DataFrame.
     - `branches_arabic_df = pd.read_sql_query("SELECT * FROM Branches_Arabic", conn)`:
       - This retrieves all records from the `Branches_Arabic` table and stores them in the `branches_arabic_df` DataFrame.
     - `socialmedia_arabic_df = pd.read_sql_query("SELECT * FROM SocialMedia_Arabic", conn)`:
       - This retrieves all records from the `SocialMedia_Arabic` table and stores them in the `socialmedia_arabic_df` DataFrame.
     - `services_english_df = pd.read_sql_query("SELECT * FROM Services_English", conn)`:
       - This retrieves all records from the `Services_English` table and stores them in the `services_english_df` DataFrame.
     - `branches_english_df = pd.read_sql_query("SELECT * FROM Branches_English", conn)`:
       - This retrieves all records from the `Branches_English` table and stores them in the `branches_english_df` DataFrame.
     - `socialmedia_english_df = pd.read_sql_query("SELECT * FROM SocialMedia_English", conn)`:
       - This retrieves all records from the `SocialMedia_English` table and stores them in the `socialmedia_english_df` DataFrame.

3. **Closing the Database Connection**:
   - `conn.close()`:
     - Once all the necessary data is fetched into pandas DataFrames, the connection to the SQLite database is closed to free up resources.
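
Before reading the tables, it can help to check which tables a database actually contains. The sketch below queries SQLite's built-in `sqlite_master` catalog. It runs against an in-memory stand-in database rather than the real file; the helper name `list_tables` and the inserted row are illustrative.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all tables in the connected database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# In-memory stand-in for the clinic database (contents are made up)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Services_English (service_name TEXT)")
conn.execute("CREATE TABLE Branches_English (branch_name TEXT)")
conn.execute("INSERT INTO Services_English VALUES ('Teeth Whitening')")

tables = list_tables(conn)
print(tables)
conn.close()
```

Running the same helper on a connection to `DATABASE_PATH` would confirm that the six expected tables are present before any `SELECT` is issued.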

---

In [None]:
# Connect to the SQLite database
conn = sqlite3.connect(DATABASE_PATH)

# Read each table into a pandas DataFrame
services_arabic_df = pd.read_sql_query("SELECT * FROM Services_Arabic", conn)
branches_arabic_df = pd.read_sql_query("SELECT * FROM Branches_Arabic", conn)
socialmedia_arabic_df = pd.read_sql_query("SELECT * FROM SocialMedia_Arabic", conn)
services_english_df = pd.read_sql_query("SELECT * FROM Services_English", conn)
branches_english_df = pd.read_sql_query("SELECT * FROM Branches_English", conn)
socialmedia_english_df = pd.read_sql_query("SELECT * FROM SocialMedia_English", conn)

# Close the connection
conn.close()

### Combining Arabic and English DataFrames

1. **Services DataFrame**:
   - `services_df = pd.concat([services_arabic_df, services_english_df], keys=['Arabic', 'English'], names=['Language'])`:
     - This code combines the Arabic and English `Services` DataFrames (`services_arabic_df` and `services_english_df`) into a single DataFrame. 
     - The `keys=['Arabic', 'English']` argument adds a new level to the index, indicating the language of each entry (Arabic or English).
     - The `names=['Language']` argument assigns a name to this new index level, making it clear that this index represents the language.

2. **Branches DataFrame**:
   - `branches_df = pd.concat([branches_arabic_df, branches_english_df], keys=['Arabic', 'English'], names=['Language'])`:
     - This combines the Arabic and English `Branches` DataFrames (`branches_arabic_df` and `branches_english_df`) in a similar way as done for the services.
     - It also adds a `Language` index to differentiate between Arabic and English branches.

3. **Social Media DataFrame**:
   - `social_media_df = pd.concat([socialmedia_arabic_df, socialmedia_english_df], keys=['Arabic', 'English'], names=['Language'])`:
     - This combines the Arabic and English `SocialMedia` DataFrames (`socialmedia_arabic_df` and `socialmedia_english_df`), creating a unified DataFrame with an additional language index.
     - It follows the same approach as the previous two, allowing you to track the language of each social media entry.

### Summary
- **`pd.concat()`** is used to concatenate DataFrames along rows, with a hierarchical index created to indicate which entries are in Arabic and which are in English.
- This results in three DataFrames (`services_df`, `branches_df`, `social_media_df`), each containing both Arabic and English data, with the language clearly indicated in the index.
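
The hierarchical index is easiest to understand on a small example. The sketch below uses two toy DataFrames with made-up rows and shows two ways of working with the `Language` level: selecting one language, and turning the level into an ordinary column.

```python
import pandas as pd

# Toy stand-ins for the Arabic and English tables (values are made up)
arabic = pd.DataFrame({"service_name": ["تبييض الأسنان", "تقويم الأسنان"]})
english = pd.DataFrame({"service_name": ["Teeth Whitening", "Braces"]})

combined = pd.concat([arabic, english], keys=["Arabic", "English"], names=["Language"])

# Select the rows of one language through the outer index level
english_rows = combined.loc["English"]

# Or turn the language level into a regular column
flat = combined.reset_index(level="Language")

print(english_rows)
print(flat)
```

Keeping the language in the index leaves the original columns untouched, while the flattened form can be more convenient when filtering or grouping by language.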

---

In [None]:
# Services dataframe combining Arabic and English services
services_df = pd.concat([services_arabic_df, services_english_df], keys=['Arabic', 'English'], names=['Language'])

# Branches dataframe combining Arabic and English branches
branches_df = pd.concat([branches_arabic_df, branches_english_df], keys=['Arabic', 'English'], names=['Language'])

# Social Media dataframe combining Arabic and English social media
social_media_df = pd.concat([socialmedia_arabic_df, socialmedia_english_df], keys=['Arabic', 'English'], names=['Language'])

### Document Retrieval Function Using Cosine Similarity

The `retrieve_relevant_documents` function retrieves the most relevant documents from three different categories (services, branches, and social media) based on a query, using cosine similarity and sentence embeddings.

#### Steps Involved:
1. **Query Embedding**:
   - The function first encodes the query using the `embedding_model` (in this case, a sentence transformer model). This converts the query text into a numerical vector (embedding) that can be compared to the embeddings of other documents.


2. **Document Embeddings**:
   - The function then encodes the documents in each of the three categories (services, branches, and social media) using the same model. Only the name column of each table (`service_name`, `branch_name`, `platform_name`) is embedded, and the encoding is repeated on every call.


3. **Cosine Similarity Calculation**:
   - The query embedding is compared to each of the document embeddings using **cosine similarity**. Cosine similarity is the cosine of the angle between two vectors: values close to 1 mean the vectors point in nearly the same direction, which indicates higher semantic similarity.


4. **Retrieve Top-K Most Relevant Documents**:
   - For each category (services, branches, and social media), the function selects the top `k` most similar documents based on their cosine similarity scores. The documents with the highest similarity scores are chosen.


5. **Return Relevant Documents**:
   - The function returns a dictionary containing the top `k` relevant documents from each category. The keys in the dictionary are the names of the categories ("services", "branches", and "social_media").


#### Example Usage:
- Given a user query, the function returns up to `top_k` entries per category, 10 by default. The chatbot function in this notebook calls it with `top_k=3`.
- You can adjust the `top_k` parameter to control how many documents are returned from each category.

### Summary:
This function uses cosine similarity to compare a user’s query against a set of documents across three categories. It retrieves and returns the top `k` most relevant documents for each category based on the similarity scores.
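
The ranking steps above can be sketched without any model by using hand-made vectors in place of real embeddings. The helper names and the three-dimensional vectors below are illustrative only; real sentence embeddings have hundreds of dimensions.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k_indices(query_vec, doc_vecs, k):
    """Return the indices of the k most similar documents, best first,
    together with all similarity scores."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    ranked = sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k], scores


# Hand-made 3-dimensional "embeddings" (made up for illustration)
docs = ["Teeth Whitening", "Braces", "Root Canal"]
doc_vecs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.1], [0.2, 0.2, 0.9]]
query_vec = [0.8, 0.3, 0.1]

best, scores = top_k_indices(query_vec, doc_vecs, k=2)
for i in best:
    print(f"{docs[i]}: {scores[i]:.3f}")
```

Because the document vectors do not depend on the query, one possible refinement is to compute them once and reuse them across queries instead of re-encoding the tables each time.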

---

In [None]:
# Simple retrieval function using cosine similarity
def retrieve_relevant_documents(query, services_df, branches_df, social_media_df, top_k=10):
    query_embedding = embedding_model.encode([query])

    # Encode documents in all dataframes
    services_embeddings = embedding_model.encode(services_df['service_name'].tolist())
    branches_embeddings = embedding_model.encode(branches_df['branch_name'].tolist())
    social_media_embeddings = embedding_model.encode(social_media_df['platform_name'].tolist())

    # Compute cosine similarities
    service_similarities = cosine_similarity(query_embedding, services_embeddings)
    branch_similarities = cosine_similarity(query_embedding, branches_embeddings)
    social_media_similarities = cosine_similarity(query_embedding, social_media_embeddings)

    # Get top k most similar documents for each category
    top_services = services_df.iloc[service_similarities.argsort()[0][-top_k:][::-1]]
    top_branches = branches_df.iloc[branch_similarities.argsort()[0][-top_k:][::-1]]
    top_social_media = social_media_df.iloc[social_media_similarities.argsort()[0][-top_k:][::-1]]

    # Combine the results from services, branches, and social media
    relevant_docs = {
        "services": top_services,
        "branches": top_branches,
        "social_media": top_social_media
    }

    return relevant_docs

### Chatbot Response Generation Function

The `get_chatbot_response` function processes a user's query and generates a relevant response by retrieving documents from a set of predefined dataframes (services, branches, and social media) and utilizing a language model for generating the final response.

#### Steps Involved:

1. **Retrieve Relevant Documents**:
   - The function calls the `retrieve_relevant_documents` function to find the top `k` most relevant documents for the given query. These documents are retrieved from three categories: services, branches, and social media.


2. **Extract Relevant Information**:
   - After retrieving the relevant documents, the function extracts the necessary information from each category (service names, branch names, and social media platform names). These values are joined together into strings.


3. **Create the Input Prompt**:
   - The function creates a structured input prompt for the chatbot, including the relevant information about services, branches, and social media platforms. The prompt also contains the user's query.


4. **Set Up Prompt and Memory**:
   - The function defines a template with placeholders for the conversation history and the generated input prompt, and uses a `ConversationBufferMemory` object to hold that history.

5. **Generate Response Using LLM Chain**:
   - An `LLMChain` is built from the prompt template and the LLM, then invoked with the input prompt and the current chat history.


6. **Handle the Response**:
   - The response from the LLM is captured, and the assistant's message is extracted. The assistant's response is then saved in the conversation memory for future interactions.


7. **Return the Assistant's Message**:
   - Finally, the assistant's message (response) is returned to the user.


8. **Error Handling**:
   - If any error occurs during the process, a user-friendly error message is returned.


#### Example Usage:
- Given a user query, this function generates a relevant chatbot response by retrieving information from services, branches, and social media based on cosine similarity.
- The conversation history is maintained to ensure the chatbot provides coherent and contextually aware responses across multiple turns.

### Summary:
This function processes a user's query, retrieves the most relevant documents from services, branches, and social media, and then uses an LLM (language model) to generate a chatbot response. The conversation is saved in memory, ensuring that the chatbot can handle multi-turn conversations efficiently.
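
Conversation history only helps if the object that stores it outlives a single call. The sketch below illustrates this with a tiny stand-in buffer. `SimpleBuffer`, `build_prompt`, the template, and the sample dialogue are all hypothetical; they mimic the idea of a conversation buffer without using LangChain.

```python
class SimpleBuffer:
    """Tiny stand-in for a conversation buffer (hypothetical, for illustration)."""

    def __init__(self):
        self.turns = []

    def save(self, user_text, assistant_text):
        self.turns.append((user_text, assistant_text))

    def as_text(self):
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)


TEMPLATE = "Current conversation:\n{chat_history}\n\nQuestion: {query}\nAnswer:"


def build_prompt(query, buffer):
    """Fill the template with the stored history and the new question."""
    return TEMPLATE.format(chat_history=buffer.as_text(), query=query)


buffer = SimpleBuffer()  # created once, outside the per-query function

first_prompt = build_prompt("Where are your branches?", buffer)
buffer.save("Where are your branches?", "We have branches in Riyadh and Jeddah.")  # made-up answer

second_prompt = build_prompt("Which one is closest to the airport?", buffer)
print(second_prompt)
```

The second prompt contains the first exchange, so a follow-up such as "which one" can be resolved. If the buffer were created inside `build_prompt`, every prompt would start with an empty history.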

In [None]:
# Create the memory once, at module level, so the history survives across calls
conversation_memory = ConversationBufferMemory(memory_key="chat_history")


def get_chatbot_response(query, services_df, branches_df, social_media_df, llm,
                         memory=conversation_memory):
    try:
        # Retrieve relevant documents using the query and dataframes
        relevant_docs = retrieve_relevant_documents(query, services_df, branches_df, social_media_df, top_k=3)

        # Extract relevant information from the documents
        services = ", ".join(relevant_docs["services"]["service_name"].tolist())
        branches = ", ".join(relevant_docs["branches"]["branch_name"].tolist())
        social_media = ", ".join(relevant_docs["social_media"]["platform_name"].tolist())

        # Arabic prompt with the retrieved context. In English: "You are a chatbot in
        # Saudi Arabia. Your goal is to help users get answers to their questions about
        # our available services and any general inquiry", followed by the services,
        # branches, social media platforms, and the user's question.
        input_prompt = f"""
        أنت روبوت دردشة في المملكة العربية السعودية. هدفك هو مساعدة المستخدمين في الحصول على إجابات لأسئلتهم حول خدماتنا المتوفرة وأي استفسار عام.

        الخدمات المتوفرة في عيادتنا هي:
        {services}

        الفروع المتوفرة لدينا في المملكة هي:
        {branches}

        يمكنك التواصل معنا عبر منصات الوسائط الاجتماعية التالية:
        {social_media}

        السؤال: {query}
        """

        # Prompt template that also includes the chat history.
        # NOTE: the <|user|>/<|end|>/<|assistant|> markers follow a Phi-3-style chat
        # format; adjust them to the chat format of the model you load.
        template = """<s><|user|>Current conversation:{chat_history}

        
        {input_prompt}<|end|>
        <|assistant|>"""

        prompt = PromptTemplate(
            template=template,
            input_variables=["input_prompt", "chat_history"]
        )

        # Chain the LLM and prompt together; the history is loaded and saved explicitly below
        llm_chain = LLMChain(prompt=prompt, llm=llm)

        # Invoke the LLM with the prompt and the history stored so far
        response = llm_chain.invoke({
            "input_prompt": input_prompt,
            "chat_history": memory.load_memory_variables({})["chat_history"]
        })

        # Extract assistant's message from the response
        assistant_message = response.get("text", "").strip()

        # Save the user query and assistant response to the conversation memory
        memory.save_context(
            inputs={"query": query},
            outputs={"response": assistant_message}
        )

        # Return only the assistant's message
        return assistant_message

    except Exception as e:
        # Handle any errors and provide a user-friendly message ("Sorry, an error occurred: ...")
        return f"عذراً، حدث خطأ: {str(e)}"

### Running the Chatbot Interactively

The final cell runs a simple console loop:

1. **Welcome Message**: When the program starts, it prints a welcome message in Arabic asking the user how it can help, and explains that typing `'exit'` ends the session.

2. **User Input**: The chatbot waits for the user to enter a question. The prompts are in Arabic, but the query itself can be written in Arabic or English.

3. **Exit Condition**: If the user types `'exit'` (case-insensitive), the program prints `"إلى اللقاء!"` ("Goodbye!" in Arabic) and breaks out of the loop.

4. **Get Chatbot Response**: Otherwise, the program passes the user's query to the `get_chatbot_response` function, which generates a response using the LLM chain, the relevant documents, and the conversation history.

5. **Print the Response**: The chatbot's response is printed to the console with the label `"Robot:"`.

### How It Works:
- The loop runs continuously, processing user input and generating chatbot responses until the user exits by typing `'exit'`.
- The function `get_chatbot_response` generates each response from the conversation context and the information available in `services_df`, `branches_df`, and `social_media_df`.

### Improvements/Additional Features:
- **Better Exit Handling**: You could accept additional phrases such as `quit` or `bye`, alongside `exit`, to handle exit commands more flexibly.
- **Contextual Conversations**: `get_chatbot_response` passes the stored conversation history into the prompt. You can improve the conversational experience with more sophisticated handling of context across multiple dialogue turns.

In [None]:
if __name__ == "__main__":
    # "Hello! How can I help you? Type 'exit' to quit."
    print("مرحبًا! كيف يمكنني مساعدتك؟ اكتب 'exit' للخروج.")

    while True:
        # Get user input (the prompt reads "Your question:")
        user_input = input("سؤالك: ").strip()

        # Exit if the user types "exit"
        if user_input.lower() == "exit":
            print("إلى اللقاء!")  # "Goodbye!"
            break

        # Get the chatbot response
        assistant_message = get_chatbot_response(query=user_input,
                                        services_df=services_df,
                                        branches_df=branches_df,
                                        social_media_df=social_media_df,
                                        llm=llm)

        # Print the response
        print("Robot:", assistant_message)
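
The exit-handling improvement suggested above could look like the following minimal sketch. The set of accepted phrases and the helper name `is_exit_command` are hypothetical and can be extended as needed.

```python
EXIT_COMMANDS = {"exit", "quit", "bye", "خروج"}  # hypothetical set of accepted phrases


def is_exit_command(text: str) -> bool:
    """Return True if the user's input is one of the accepted exit phrases."""
    return text.strip().lower() in EXIT_COMMANDS


print(is_exit_command(" Quit "))
print(is_exit_command("What are your opening hours?"))
```

In the loop, the condition `user_input.lower() == "exit"` would then be replaced by `is_exit_command(user_input)`.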

# **Conclusion**

In this notebook, we've demonstrated how to build a multilingual, retrieval-augmented chatbot for a dental clinic. By integrating several data sources (clinic services, branch information, and social media platforms), the chatbot can give context-aware answers to user queries.

The system combines **sentence embeddings** to represent queries and documents, **cosine similarity** to retrieve the most relevant entries, and a **large language model** (LLM) to generate the response. It is designed to answer in both Arabic and English, which makes it a useful tool for customer engagement and support.

By running this system, the clinic can offer customers an intuitive, automated way to interact with its services, helping them find answers quickly and easily. With future improvements, such as fine-tuning for domain-specific questions or incorporating advanced features, this chatbot has the potential to enhance customer satisfaction and streamline operations.

Thank you for exploring this solution, and feel free to customize it further for your needs!

--- 