# Chatbot Implementation Using Pandas and DialoGPT

In this notebook, we will build a customer support chatbot that uses structured datasets and a pre-trained language model, DialoGPT, to answer customer queries regarding product information, order statuses, and customer data.

---

## Step 1: Importing Required Libraries

In [1]:
import pandas as pd
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import re
import datetime
import string

### **Importing Required Libraries**

In this section, we import several essential libraries:

- **Pandas** (`pd`): This is a powerful data manipulation and analysis library, especially useful for working with structured data (like CSV files). We will use it to load and manage the datasets.
- **Transformers** (`AutoModelForCausalLM`, `AutoTokenizer`): These classes are part of the Hugging Face `transformers` library. They let us load pre-trained models and tokenizers for natural language processing (NLP). In our case, we use the **DialoGPT** model for generating chatbot responses.
- **Torch** (`torch`): A deep learning framework used to manage tensors and pass data through the model. Here, it concatenates the tokenized chat history with the new input and builds the attention mask passed to the model.
- **re**: Python’s regular expressions module for pattern matching. It’s used to detect intents and extract entities (e.g., customer IDs or product names) from the user's input.
- **datetime**: This module helps us manipulate dates and times. We use it to calculate and format last login/check-out dates of customers.
- **string**: Used for basic string operations, such as removing punctuation from product names.

---

## Step 2: Loading Datasets

In [2]:
# Load datasets
customers_df = pd.read_csv("../Cleaned_Datasets/customer_SG_only.csv", index_col=0)  # customer data
orders_df = pd.read_csv("../Cleaned_Datasets/orders_generated.csv", index_col=0)  # order data
products_df = pd.read_csv("../Cleaned_Datasets/products_cleaned.csv", index_col=0)  # product data

### **Loading Datasets**

Here, we load the necessary datasets:

- **`customers_df`**: Contains customer data, such as login history, check-out details, and other customer-specific information.
- **`orders_df`**: Stores information about the orders placed by customers, including the order time, products purchased, etc.
- **`products_df`**: Contains product details such as product name, price, and rating.

These datasets are loaded from CSV files and converted into **Pandas DataFrames** for easier manipulation and querying during the chatbot's execution.
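
As a minimal sketch of what this loading step does, the example below reads a small in-memory CSV in the same way. The column values are made up for illustration and are not taken from the real files. It assumes, as the later cells suggest, that `customer_id` is a regular column and that the first, unnamed CSV column becomes the index via `index_col=0`.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for one of the real files
csv_text = """,customer_id,last_login_day
0,101,4
1,102,64
"""

# index_col=0 uses the first (unnamed) column as the row index
demo_df = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(demo_df.columns.tolist())       # columns left after the index is removed
print(demo_df["customer_id"].dtype)   # numeric IDs load as an integer dtype

# Cast IDs to strings so they compare equal to IDs extracted from user text
demo_df["customer_id"] = demo_df["customer_id"].astype(str)
match = demo_df[demo_df["customer_id"] == "102"]
print(match)
```

The string cast matters because IDs pulled out of a chat message are strings, while a numeric CSV column would otherwise be compared as integers and never match.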

---

## Step 3: Loading DialoGPT Model

In [3]:
# Load pre-trained DialoGPT model and tokenizer
model_name = "microsoft/DialoGPT-medium"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

### **Loading DialoGPT Model**

In this section, we load the pre-trained **DialoGPT-medium** model and tokenizer:

- **DialoGPT**: A conversational variant of GPT-2 developed by Microsoft, specialized for dialogue generation. It can generate natural and contextually relevant responses based on user inputs.
- **AutoTokenizer**: Tokenizes the text input (converting words into numbers that the model can process).
- **AutoModelForCausalLM**: The pre-trained model that we use to generate responses based on user input.

The **medium** variant of DialoGPT is a balance between response quality and model size, providing good performance without being too computationally expensive.

---

## Step 4: Context Management for Follow-up Queries

In [4]:
# Context management for follow-up queries
context = {
    "last_product": None,  # Stores the last product mentioned
    "last_intent": None,   # Stores the last intent (price, rating, etc.)
    "last_customer": None, # Stores last customer ID
    "last_order": None     # Stores last order ID
}

### **Context Management for Follow-up Queries**

This context dictionary is used to store information about the user's previous query. The context helps the chatbot manage **follow-up queries** where the user refers to a previously mentioned customer, product, or order without explicitly repeating the full details. It tracks four values:

- **`last_product`**: Stores the last product name mentioned.
- **`last_intent`**: Tracks whether the user asked for price, rating, or some other information in the last query.
- **`last_customer`**: Stores the last customer ID queried.
- **`last_order`**: Tracks the last order ID mentioned.

This allows for seamless conversation continuity, such as answering "What’s its price?" after previously discussing a product.
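
The follow-up behaviour described above can be sketched with a small helper. This is an illustrative simplification, not the notebook's actual control flow: `resolve_product` and `demo_context` are hypothetical names introduced only for this example.

```python
# Hypothetical, simplified context store mirroring the dictionary above
demo_context = {"last_product": None, "last_intent": None}

def resolve_product(product_name, context):
    """Return the product to look up, falling back to the last one mentioned."""
    if product_name:
        context["last_product"] = product_name  # remember it for follow-ups
        return product_name
    return context["last_product"]  # None if no product has been discussed yet

first = resolve_product("apple pencil", demo_context)  # explicit product name
follow_up = resolve_product(None, demo_context)        # e.g. "What's its price?"
print(first, "|", follow_up)
```

When nothing has been discussed yet, the helper returns `None`, which is the case where the chatbot should ask the user to name a product.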

---

## Step 5: Handling Customer Queries

In [5]:
# Function to handle customer-related queries
def get_customer_info(customer_id):
    customer_id = str(customer_id)
    customers_df['customer_id'] = customers_df['customer_id'].astype(str)

    customer = customers_df[customers_df['customer_id'] == customer_id]
    if not customer.empty:
        last_login_days_ago = customer['last_login_day'].values[0]
        last_checkout_days_ago = customer['last_checkout_day'].values[0]

        # Convert last_login_days_ago to date
        last_login_date = (datetime.datetime.now() - datetime.timedelta(days=int(last_login_days_ago))).strftime('%Y-%m-%d')

        # Handle case where last_checkout_days_ago is not a number
        # (str() keeps this safe if pandas loaded the column as a numeric type)
        if str(last_checkout_days_ago).isdigit():
            last_checkout_date = (datetime.datetime.now() - datetime.timedelta(days=int(last_checkout_days_ago))).strftime('%Y-%m-%d')
            checkout_message = f"and their most recent checkout was {last_checkout_days_ago} days ago on {last_checkout_date}."
        else:
            checkout_message = "and this customer has never checked out."

        context["last_customer"] = customer_id  # Store customer for follow-up
        return (f"Great news! We found the information for customer ID {customer_id}. "
                f"They last logged in {last_login_days_ago} days ago on {last_login_date}, "
                f"{checkout_message}")
    else:
        return f"Oh no! I couldn't find a customer with ID {customer_id}. Please double-check the ID, and I'll try again!"

### **Handling Customer Queries**

This function is responsible for retrieving information about a customer based on their **customer ID**:

1. Converts the `customer_id` to a string to ensure it matches the format in the `customers_df` DataFrame.
2. Searches for the customer in the `customers_df` using the provided ID.
3. If the customer is found, it calculates the last login date by subtracting the number of days since the last login from the current date using **`datetime`**.
4. It checks whether the customer has checked out before. If they have, it calculates the last check-out date; otherwise, it returns a message saying the customer has never checked out.
5. Updates the **context** to store the last customer queried for future follow-up questions.
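
The date arithmetic in steps 3 and 4 can be sketched in isolation. The example below is a simplified illustration that takes the reference date as a parameter so the result is reproducible, whereas the notebook uses the current time. The helper names and the `"Never checkout"` placeholder are assumptions made for this example.

```python
import datetime

def days_ago_to_date(days_ago, today=None):
    """Convert a 'days ago' count into a YYYY-MM-DD date string."""
    today = today or datetime.date.today()
    return (today - datetime.timedelta(days=int(days_ago))).strftime("%Y-%m-%d")

def describe_checkout(raw_value, today=None):
    """Handle a checkout field holding either a number or a non-numeric placeholder."""
    text = str(raw_value)
    if text.isdigit():
        return f"last checkout on {days_ago_to_date(text, today)}"
    return "never checked out"

reference = datetime.date(2024, 10, 27)  # fixed reference date for the example
print(days_ago_to_date(4, reference))
print(describe_checkout("Never checkout", reference))  # hypothetical placeholder value
```

Passing the reference date in explicitly also makes the function easy to test, since the output no longer depends on when the code is run.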

---

## Step 6: Handling Order Status Queries

In [6]:
# Function to get order status
def get_order_status(order_id):
    order_id = str(order_id)  # Ensure order ID is a string
    orders_df['order_id'] = orders_df['order_id'].astype(str)

    order = orders_df[orders_df['order_id'] == order_id]
    if not order.empty:
        product_id = order['product_id'].values[0]
        order_time = order['order_time'].values[0]
        context["last_order"] = order_id  # Store order ID for follow-up
        return (f"Good news! Order {order_id} was placed on {order_time}. "
                f"The order includes product {product_id}. ")
    else:
        return (f"Oops, it seems there’s no order with ID {order_id}. "
                "Can you please check the ID and try again? I’m here to help!")

### **Handling Order Status Queries**

This function looks up an order and reports when it was placed and which product it contains:

1. Converts the `order_id` to a string to match the format in `orders_df`.
2. Searches `orders_df` for the provided ID.
3. If the order exists, it extracts the **product ID** and **order time**, updates the **context** with the order ID for possible follow-up questions, and formats a response that includes the order’s date and the product involved.
4. If the order is not found, the function returns an error message asking the user to double-check the order ID.

---

## Step 7: Handling Product Queries (Price and Rating)

In [7]:
# Function to handle product-related queries (price or rating)
def get_product_info(product_name, info_type="price"):
    # Clean up the product name by removing punctuation
    product_name_cleaned = product_name.translate(str.maketrans('', '', string.punctuation))
    
    # Split the user input into individual keywords
    keywords = product_name_cleaned.split()
    
    # Perform a case-insensitive search for rows that contain all keywords in the title
    matching_products = products_df[
        products_df['title'].apply(lambda x: all(kw.lower() in str(x).lower() for kw in keywords))  # str() guards against missing titles
    ]
    
    if not matching_products.empty:
        # Get the first match (can be expanded to multiple matches if needed)
        product = matching_products.iloc[0]
        price = product['price_actual']
        rating = product['item_rating']
        title = product['title']
        
        # Store product context for follow-up
        context["last_product"] = title  # Use the product title for follow-up queries
        context["last_intent"] = info_type  # Whether the user asked for price or rating
        
        # Provide appropriate information based on info_type
        if info_type == "price":
            return (f"I found the product you're looking for! The '{title}' is priced at {price}. ")
        elif info_type == "rating":
            return (f"The '{title}' has a rating of {rating} stars.")
    else:
        return f"I couldn't find any product containing the keywords '{product_name}'. Please try different keywords!"

### **Handling Product Queries (Price and Rating)**

This function helps the chatbot answer queries about a product’s **price** or **rating**:

1. Cleans up the product name by removing any punctuation using **`string.punctuation`**.
2. Splits the cleaned product name into keywords and performs a case-insensitive search in the `products_df` DataFrame. It checks whether all keywords appear in the product title.
3. If a matching product is found, it retrieves the **price** and **rating**. Depending on the **info_type** (either "price" or "rating"), the function formats an appropriate response.
4. If no matching product is found, it prompts the user to try different keywords.

The function also updates the **context** to store the last product mentioned and the type of information the user requested (price or rating), allowing for follow-up queries like, "What’s its rating?"
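
The keyword search can be sketched without a DataFrame. The example below is a simplified illustration over a plain list; the titles and the `find_titles` helper are hypothetical. It also guards one edge case: a query made up only of punctuation yields no keywords, and `all()` over an empty sequence is `True`, which would otherwise match every product.

```python
import string

# Hypothetical product titles; the real ones live in products_df['title']
titles = [
    "Apple Pencil (2nd Generation)",
    "Leather Shoulder Bag for Women",
    "Mini Wireless Keyboard and Mouse Set",
]

def find_titles(query, titles):
    """Return titles that contain every keyword in the query, ignoring case."""
    cleaned = query.translate(str.maketrans("", "", string.punctuation))
    keywords = [kw.lower() for kw in cleaned.split()]
    if not keywords:
        return []  # avoid all([]) == True matching every title
    return [t for t in titles if all(kw in t.lower() for kw in keywords)]

print(find_titles("apple pencil!", titles))
print(find_titles("wireless keyboard", titles))
```

Because this is substring matching, a keyword such as "bag" would also match a title containing "baggage"; that trade-off keeps the search simple at the cost of some precision.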

---

## Step 8: Chat Generation Using DialoGPT

In [8]:
# Function to generate responses using DialoGPT
def chat_with_dialoGPT(user_input, chat_history_ids=None):
    new_input_ids = tokenizer.encode(user_input + tokenizer.eos_token, return_tensors='pt')
    bot_input_ids = torch.cat([chat_history_ids, new_input_ids], dim=-1) if chat_history_ids is not None else new_input_ids
    attention_mask = torch.ones(bot_input_ids.shape, dtype=torch.long)
    
    chat_history_ids = model.generate(
        bot_input_ids,
        max_length=1000,
        attention_mask=attention_mask,
        do_sample=True,
        top_p=0.92,
        top_k=50,
        pad_token_id=tokenizer.eos_token_id
    )

    response = tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)
    return response, chat_history_ids

### **Chat Generation Using DialoGPT**

This function generates conversational responses using the **DialoGPT** model:

1. **Tokenizes** the user input using the tokenizer, appending an end-of-sequence token (`eos_token`) to signal the end of the input.
2. **Concatenates** the current user input with any previous conversation history (`chat_history_ids`) to maintain context in the dialogue.
3. Uses the **DialoGPT model** to generate a response. It applies techniques like **sampling** with `top_p` and `top_k` to create more natural, varied outputs.
4. **Decodes** the generated tokens back into a human-readable string.

If no specific customer, product, or order query is detected, the chatbot defaults to using this function to handle general conversational inputs.
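
To give an intuition for what `top_k` and `top_p` do during sampling, here is a simplified pure-Python illustration over a toy next-token distribution. It is a conceptual sketch, not the library's implementation; the function names and the toy probabilities are made up for this example.

```python
import random

def top_k_top_p_filter(probs, top_k=50, top_p=0.92):
    """Keep the top_k most likely tokens, then the smallest prefix reaching top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    total = sum(p for _, p in ranked)
    ranked = [(tok, p / total) for tok, p in ranked]  # renormalise after top-k

    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:  # nucleus reached: drop the remaining tail
            break

    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

def sample_token(probs, rng=random):
    """Draw one token according to the filtered probabilities."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

toy_probs = {"hello": 0.5, "hi": 0.3, "hey": 0.15, "zebra": 0.05}
filtered = top_k_top_p_filter(toy_probs, top_k=3, top_p=0.8)
next_token = sample_token(filtered)
print(filtered, next_token)
```

Both filters remove unlikely tokens before sampling, which is why responses stay varied without drifting into very improbable words.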

---

## Step 9: Detecting Intent and Entities

In [9]:
# Function to detect intent and entities using regex
def detect_intent_and_entities(user_input):
    # Regex patterns to detect intents and entities
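    # Capture the digits that follow "customer ... id" rather than the first number anywhere in the input
    customer_match = re.search(r'customer.*?id\D*(\d+)', user_input, re.IGNORECASE)
    if customer_match:
        intent = "customer_query"
        customer_id = customer_match.group(1)
        return intent, {"customer_id": customer_id}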
    
    if re.search(r'order.*id', user_input, re.IGNORECASE):
        intent = "order_query"
        order_id = re.search(r'order_[0-9]+', user_input, re.IGNORECASE)
        return intent, {"order_id": order_id.group(0)} if order_id else {"order_id": None}

    if re.search(r'price|rating', user_input, re.IGNORECASE):
        product_name_match = re.search(r'(?:price|rating) of (.+)', user_input, re.IGNORECASE)
        product_name = product_name_match.group(1).strip() if product_name_match else None
        intent = "product_price" if "price" in user_input.lower() else "product_rating"
        return intent, {"product_name": product_name}

    return "general", {}

### **Detecting Intent and Entities**

This function detects the user's **intent** (i.e., whether they’re asking about a customer, order, or product) and extracts relevant **entities** from the input using regular expressions (`regex`):

1. Checks if the input mentions a **customer ID** by searching for "customer", then "id", then a number (for example, "customer ID 104511"). If found, it returns the intent as `customer_query` and extracts the customer ID.
2. Looks for **order queries** by searching for "order" followed by "id", then extracts an ID of the form `order_<digits>` (for example, `order_123`). It returns the intent as `order_query`, with the order ID set to `None` if no such ID is present.
3. Checks if the user asks for a **price** or **rating** by looking for either keyword. If the phrase "price of" or "rating of" is present, it extracts the product name that follows; otherwise the product name is `None`, which lets the main loop fall back on the stored context.
4. If no specific intent is detected, the function defaults to a "general" intent, which triggers the DialoGPT model for a conversational response.

This process allows the chatbot to identify what the user wants and which entity (e.g., a specific customer or product) the query relates to.
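
The two entity patterns can be tried in isolation. The sketch below wraps the same regular expressions in small helpers; the helper names are introduced here only for illustration.

```python
import re

def extract_product_name(text):
    """Capture everything after 'price of' or 'rating of', if present."""
    match = re.search(r"(?:price|rating) of (.+)", text, re.IGNORECASE)
    return match.group(1).strip() if match else None

def extract_order_id(text):
    """Capture an ID shaped like 'order_<digits>', if present."""
    match = re.search(r"order_[0-9]+", text, re.IGNORECASE)
    return match.group(0) if match else None

print(extract_product_name("what's the price of apple pencil?"))
print(extract_product_name("what is its rating?"))
print(extract_order_id("status of order id order_123?"))
```

Two details are worth noticing. The captured product name keeps trailing punctuation, which is why the product lookup strips punctuation before searching. A follow-up such as "what is its rating?" yields no product name at all, which is the case the context dictionary is meant to cover.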

---

## Step 10: Main Chatbot Function (The chatbot's output will be displayed after this section)

In [None]:
# Main chatbot function with improved follow-up query handling
def chatbot():
    print("Hi there! I'm your friendly chatbot, here to help you with any queries. Type 'quit' to exit the chat.")
    chat_history_ids = None  # Initialize chat_history_ids to None at the start
    
    while True:
        user_input = input("You: ").strip().lower()
        if user_input == 'quit':
            print("Goodbye! Feel free to ask for assistance anytime!")
            break

        # Enhanced Natural Language Processing for intent and entity detection
        intent, entities = detect_intent_and_entities(user_input)

        # Handle customer queries
        if intent == "customer_query":
            customer_id = entities.get("customer_id")
            if customer_id:
                response = get_customer_info(customer_id)
            else:
                response = "It seems like you asked about a customer, but I couldn't find the ID. Please provide the customer ID."

        # Handle order queries
        elif intent == "order_query":
            order_id = entities.get("order_id")
            if order_id:
                response = get_order_status(order_id)
            else:
                response = "It seems like you asked about an order, but I couldn't find the order ID. Please provide the order ID."

        # Handle product queries based on intent
        elif intent in ["product_price", "product_rating"]:
            product_name = entities.get("product_name")
            
            if product_name:
                # Update product name in the context when a new product is mentioned
                context["last_product"] = product_name.translate(str.maketrans('', '', string.punctuation))
                context["last_intent"] = "price" if intent == "product_price" else "rating"
                
                # Respond with the appropriate information (price or rating)
                response = get_product_info(product_name, info_type=context["last_intent"])
            else:
                # If product name is not explicitly mentioned, check if there's a follow-up on the last product
                if context.get("last_product"):
                    product_name = context["last_product"]
                    # Update intent based on user input (switch between price or rating)
                    if "price" in user_input:
                        context["last_intent"] = "price"
                    elif "rating" in user_input:
                        context["last_intent"] = "rating"
                    
                    # Provide the appropriate information based on the updated intent
                    response = get_product_info(product_name, info_type=context["last_intent"])
                else:
                    response = "Could you please specify which product you're asking about?"

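        # Handle follow-up queries on the last product.
        # Note: detect_intent_and_entities already returns a product intent for any input
        # containing "price" or "rating", so the two branches below act only as a safety net.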
        elif "price" in user_input and context.get("last_product"):
            # User asks for price after a previous query (like rating)
            product_name = context["last_product"]
            response = get_product_info(product_name, info_type="price")
            context["last_intent"] = "price"  # Update the intent to price

        elif "rating" in user_input and context.get("last_product"):
            # User asks for rating after a previous query (like price)
            product_name = context["last_product"]
            response = get_product_info(product_name, info_type="rating")
            context["last_intent"] = "rating"  # Update the intent to rating
        
        else:
            # Fallback to DialoGPT if no specific intent was detected
            response, chat_history_ids = chat_with_dialoGPT(user_input, chat_history_ids)
        
        # Print the chatbot's response
        print(f"Bot: {response}")

# Start the chatbot
chatbot()

### **Main Chatbot Function**

The `chatbot()` function ties everything together:

1. It prompts the user for input and detects whether the input is a **customer query**, **order query**, or **product query** using the `detect_intent_and_entities` function.
2. Based on the detected **intent**, the chatbot calls the corresponding helper function:
   - If the user asks about a customer, it calls `get_customer_info`.
   - For order queries, it calls `get_order_status`.
   - For product queries, it calls `get_product_info`.
3. If the intent is unclear or doesn’t match any of the specific query types, it defaults to using **DialoGPT** to generate a conversational response.
4. The function loops until the user types "quit", so the conversation can continue across multiple turns.

---


## Chatbot Testing Scenarios

### **Customer-Related Queries:**

**Question 1**

- **User**: What’s the last login for **customer ID** 104511?

- **Bot**: Great news! We found the information for **customer ID** 104511. They last logged in 4 days ago on 2024-10-23, and their most recent checkout was 4 days ago on 2024-10-23.


**Question 2**

- **User**: Can you check the details for **customer ID** 20596?

- **Bot**: Great news! We found the information for **customer ID** 20596. They last logged in 64 days ago on 2024-08-24, and this customer has never checked out.


**Question 3**

- **User**: Tell me the last checkout date for **customer ID** 119330!

- **Bot**: Great news! We found the information for **customer ID** 119330. They last logged in 12 days ago on 2024-10-15, and their most recent checkout was 18 days ago on 2024-10-09.


**Question 4**

- **User**: When was the last login for **customer ID** 46400?

- **Bot**: Great news! We found the information for **customer ID** 46400. They last logged in 295 days ago on 2024-01-06, and this customer has never checked out.

---

### **Order-Related Queries:**

**Question 1**

- **User**: What is the status of **order ID** order_123?

- **Bot**: Good news! **Order ID** order_123 was placed on 2019-06-25 16:16:43.205354. The order includes product 11756.


**Question 2**

- **User**: Can you check the order status for **order ID** order_987?

- **Bot**: Good news! **Order ID** order_987 was placed on 2019-10-12 22:04:03.829836. The order includes product 4533.


**Question 3**

- **User**: Tell me the order time for **order ID** order_928!

- **Bot**: Good news! **Order ID** order_928 was placed on 2019-09-10 20:11:41.736175. The order includes product 6108.


**Question 4**

- **User**: What product is included in **order ID** order_65893?

- **Bot**: Good news! **Order ID** order_65893 was placed on 2019-03-25 15:29:08.967064. The order includes product 4471.

---

### **Product-Related Queries (Price):**

**Question 1**

- **User**: What’s **the price of** shoulder bag?

- **Bot**: I found the product you're looking for! The '1081 PU SLING BAG KULIT leather Casual Beg Tangan Wanita women Shoulder Wallet Purse Travel READY STOCK QQF WHOLESALE | Shopee Malaysia' is priced at 4.017214855598254.


**Question 2**

- **User**: Can you find **the price of** wireless keyboard?

- **Bot**: I found the product you're looking for! The 'NIXXOS For iPad10 Wireless Keyboard Ccase for 10.2'' iPad 7th/8th/9th Gen 10.9'' Air4/Air5 2021/2020 Pro 11'' 7-color Backlight Magnetic Keyboard Leather Cover | Shopee Malaysia' is priced at 21.487428297386007.


**Question 3**

- **User**: Tell me **the price of** shirt.

- **Bot**: I found the product you're looking for! The 'Men Male Shirt Solid Blouse Jumper Shirts Pullover T-Shirt High neck Basic Thermal Mens Casual Long Sleeve Tops | Shopee Malaysia' is priced at 1.8684720258596528.


**Question 4**

- **User**: How much is **the price of** shoes?

- **Bot**: I found the product you're looking for! The 'Fashion Women Flat Shoes Suede Leather Colth Soft Ladies Flat Heel Working Shoe Pure Colours | Shopee Malaysia' is priced at 5.6209866777944555.


**Question 5**

- **User**: Find **the price of** apple pencil!

- **Bot**: I found the product you're looking for! The 'Apple Pencil (2nd Generation) | Shopee Malaysia' is priced at 139.823989935164.

---

### **Product-Related Queries (Rating):**

**Question 1**

- **User**: What is **the rating of** shoulder bag?

- **Bot**: The '1081 PU SLING BAG KULIT leather Casual Beg Tangan Wanita women Shoulder Wallet Purse Travel READY STOCK QQF WHOLESALE | Shopee Malaysia' has a rating of 4.9 stars.


**Question 2**

- **User**: Can you find **the rating of** wireless keyboard?

- **Bot**: The 'NIXXOS For iPad10 Wireless Keyboard Ccase for 10.2'' iPad 7th/8th/9th Gen 10.9'' Air4/Air5 2021/2020 Pro 11'' 7-color Backlight Magnetic Keyboard Leather Cover | Shopee Malaysia' has a rating of 4.9 stars.


**Question 3**

- **User**: What’s **the rating of** shirt?

- **Bot**: The 'Men Male Shirt Solid Blouse Jumper Shirts Pullover T-Shirt High neck Basic Thermal Mens Casual Long Sleeve Tops | Shopee Malaysia' has a rating of 4.8 stars.


**Question 4**

- **User**: Tell me **the rating of** shoes.

- **Bot**: The 'Fashion Women Flat Shoes Suede Leather Colth Soft Ladies Flat Heel Working Shoe Pure Colours | Shopee Malaysia' has a rating of 4.9 stars.


**Question 5**

- **User**: Find **the rating of** apple pencil!

- **Bot**: The 'Apple Pencil (2nd Generation) | Shopee Malaysia' has a rating of 4.9 stars.

---

### **Follow-Up Product Queries:**

**Conversation 1**

- **User**: What’s **the price of** apple pencil?

- **Bot**: I found the product you're looking for! The 'Apple Pencil (2nd Generation) | Shopee Malaysia' is priced at 139.823989935164.

- **User**: What is its rating? *(User can also ask: What is the rating?)*

- **Bot**: The 'Apple Pencil (2nd Generation) | Shopee Malaysia' has a rating of 4.9 stars.


**Conversation 2**

- **User**: Tell me **the rating of** the shoulder bag.

- **Bot**: The '1081 PU SLING BAG KULIT leather Casual Beg Tangan Wanita women Shoulder Wallet Purse Travel READY STOCK QQF WHOLESALE | Shopee Malaysia' has a rating of 4.9 stars.

- **User**: How much is its price? *(User can also ask: What about its price?)*

- **Bot**: I found the product you're looking for! The '1081 PU SLING BAG KULIT leather Casual Beg Tangan Wanita women Shoulder Wallet Purse Travel READY STOCK QQF WHOLESALE | Shopee Malaysia' is priced at 4.017214855598254.


**Conversation 3**

- **User**: What is **the price of** wireless keyboard?

- **Bot**: I found the product you're looking for! The '10 inch Mini Bluetooth Wireless Keyboard And Mouse Set Air Slim Keyboard For Phone Android Windows Tablet 蓝牙无线键盘 | Shopee Malaysia' is priced at 3.422417927366264.

- **User**: What about its rating? *(User can also ask: Tell me about its rating)*

- **Bot**: The '10 inch Mini Bluetooth Wireless Keyboard And Mouse Set Air Slim Keyboard For Phone Android Windows Tablet 蓝牙无线键盘 | Shopee Malaysia' has a rating of 4.9 stars.

---

### **General Queries (Fallback to DialoGPT):**

- **User**: "Hello!"
  
- **User**: "How are you today?"
  
- **User**: "What’s the weather like?"

---


### **Conclusion**

In this notebook, we built a chatbot that leverages structured datasets (customers, orders, and products) and a pre-trained conversational model (DialoGPT) to answer queries related to customers, orders, and product details. We implemented several helper functions to manage follow-up queries and provide context-aware responses.

---

# Potential Benefits of this Chatbot for the E-commerce Platform and Customers

Implementing a chatbot on an e-commerce platform offers significant benefits for both the platform itself and its customers:

#### **1. For the E-commerce Platform:**
- **Increased Efficiency in Customer Support**: By automating responses to common customer queries such as order status, product details (price, ratings), and customer account information, the platform reduces the load on human customer support representatives. This allows the platform to handle more customers simultaneously without increasing operational costs.
  
- **24/7 Availability**: The chatbot provides uninterrupted support, ensuring that customers can get their queries answered at any time, even outside normal business hours. This improves customer satisfaction and potentially increases sales by assisting customers around the clock.

- **Cost Reduction**: Automating a portion of the customer service reduces the need for hiring large teams of support agents. This results in lower staffing and training costs, enabling the platform to allocate resources to other critical areas like marketing or product development.

- **Better Data Collection**: If customer interactions are logged (this notebook version does not persist them), the platform gains valuable insights into common customer pain points, frequently asked questions, and product-related concerns. This data can be used to improve the platform's offerings and customer experience.

- **Faster Response Times**: Chatbots can instantly reply to customer inquiries, leading to a smoother and more responsive customer experience. This helps in reducing bounce rates and cart abandonment, as customers are more likely to make purchases when they get prompt assistance.

#### **2. For Customers:**
- **Immediate Assistance**: Customers no longer have to wait in long queues for help. Whether they need to track an order, ask about a product, or get help with their account, the chatbot can provide immediate responses.
  
- **Personalized Experience**: By tracking previous interactions through the chatbot’s **context management**, the platform can provide more personalized responses, such as remembering the last product a customer inquired about or their last order. This personal touch enhances the overall shopping experience.

- **Seamless Shopping Experience**: With quick access to product details, prices, and ratings, customers can make informed purchasing decisions without needing to leave the platform or contact support. This reduces friction in the buying process and can lead to increased sales.

- **Reduced Effort for Repeated Queries**: Customers often ask several questions about the same item. Because the chatbot stores context (e.g., the last product discussed), a follow-up such as "What’s its rating?" works without re-entering the product name, making the experience more efficient and user-friendly.

---

# Challenges Encountered During Development and How They Were Addressed

Developing this chatbot involved several challenges, ranging from handling natural language processing (NLP) complexities to querying multiple datasets during a conversation. Here’s a breakdown of the key challenges and the solutions implemented to address them:

#### **1. Managing Natural Language Understanding and Intent Detection**
- **Challenge**: One of the primary challenges in developing a chatbot is accurately understanding the user's intent. Users may ask about the same topic (e.g., product price or order status) in various ways, making it difficult for the chatbot to detect the correct intent and respond appropriately.

- **Solution**: We used regular expressions (`regex`) to match specific patterns in the input that signal intent. For instance, the chatbot looks for keywords such as “price of” or “order ID” to determine whether the user is asking about a product or order. This approach, combined with the use of a context dictionary, allows the chatbot to handle follow-up queries and manage different intents effectively.

#### **2. Ensuring Context Awareness for Follow-up Queries**
- **Challenge**: A key aspect of user experience in chatbots is the ability to handle follow-up questions without requiring users to repeat information. For example, if a user asks for the rating of a product and then asks for its price, the chatbot should remember which product was previously mentioned.

- **Solution**: We addressed this challenge by implementing a **context management system**. The chatbot stores the last product, customer, or order mentioned in a conversation in a dictionary (`context`). This way, when a user asks a follow-up question, the chatbot can retrieve the relevant information without needing the user to repeat themselves.

#### **3. Handling Large and Structured Datasets**
- **Challenge**: Since the chatbot interacts with multiple datasets (customers, orders, products), managing the data efficiently and querying it in real-time posed a challenge. The chatbot had to search for specific details like customer IDs, product names, and order IDs without performance issues.

- **Solution**: To ensure efficient data handling, we used **Pandas** for loading and querying the CSV files. This allowed us to filter the data quickly with boolean masks. ID columns are cast to strings before comparison so that IDs extracted from user text match the stored values. For larger datasets, setting `customer_id` or `order_id` as the DataFrame index would be a natural next step for faster lookups.

#### **4. Generating Conversational Responses with DialoGPT**
- **Challenge**: Creating human-like responses using a pre-trained model (DialoGPT) was another challenge. While DialoGPT generates coherent conversations, it requires proper management of chat history and tokens to maintain the context of the conversation across multiple exchanges.

- **Solution**: We handled this by maintaining a **chat history** (`chat_history_ids`) that tracks the user inputs and chatbot responses exchanged through DialoGPT. On each turn, the new user input is appended to the previous conversation history before being passed to the model, allowing it to generate responses that fit the ongoing context. Additionally, the sampling parameters **`top_p`** and **`top_k`** were used to control the randomness of the chatbot’s replies, keeping the responses varied without straying into unlikely tokens.

#### **5. Handling User Input Variability**
- **Challenge**: Users often phrase their queries differently. For example, someone might ask "How much is this?" or "What’s the price?" when referring to a product. This variability in language makes it difficult to capture all possible user inputs using fixed rules.

- **Solution**: We addressed this challenge by allowing the chatbot to search for keywords (e.g., “price”, “rating”) and applying string manipulation techniques such as removing punctuation and making the search case-insensitive. This increased the flexibility of the chatbot in recognizing user queries. Additionally, when specific information (like a product name) was missing, the chatbot used the **stored context** to infer what the user was referring to.

#### **6. Ensuring Scalability and Adaptability**
- **Challenge**: As the platform grows, more products, customers, and orders will be added to the datasets, increasing the complexity and size of the data that the chatbot needs to handle.

- **Solution**: The chatbot was designed to be easily scalable by allowing dynamic updates to the datasets. Since the bot relies on **Pandas** DataFrames, new rows can be added to the CSV files without breaking functionality. Additionally, pre-trained models like DialoGPT are versatile and can be fine-tuned further for specific domain tasks, allowing the chatbot to adapt as the platform’s offerings evolve.

---