# GenAI Customer Support Agent – Lightweight MVP Demo

This notebook is a **working prototype** of a GenAI-powered customer support agent — designed to show what a real system like this could do in production.

It’s not just another chatbot.

This agent can:
-  Understand support requests using Gemini  
-  Retrieve relevant policies using document embeddings (RAG)  
-  Call live functions to check invoices, payments, or order status  
-  Generate structured replies, with tone control and memory  
-  Evaluate its own output for helpfulness and quality  
-  Log everything to CSV like a proper backend system  

It’s fast. It’s readable. And it all runs inside a single notebook — with minimal dependencies and no external API server.

Is it ready to be deployed today? Not yet. But it’s a **lightweight, transparent MVP** that shows what’s possible — and how GenAI can fit into real customer service workflows.

Let’s see what it can do 👇


## My GenAI Support Agent Architecture

This isn't just a collection of GenAI tricks — it's a full, traceable support workflow.

Each step builds on the previous one to turn a raw user message into a grounded, personalized, and evaluable reply.

Here’s the flow:

1. A user message enters the system  
2. Gemini classifies it (e.g. "refund request")  
3. The system detects urgency using hybrid priority rules  
4. The message is embedded and matched against the knowledge base (RAG)  
5. Gemini optionally calls a real function (like checking an invoice or payment status)  
6. Memory is passed into the prompt when available, so replies can build on previous interactions  
7. A structured, tone-controlled reply is generated  
8. Gemini evaluates the reply for helpfulness  
9. If confidence is low or info is missing, the agent flags the case as `needs_human_review = True`  
10. Everything is logged for tracking, review, and continuous improvement

It’s built to showcase how GenAI can power real-world customer support systems.
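
As a rough illustration of how these steps chain together, here is a minimal sketch with stub functions standing in for the Gemini-backed ones. The `TicketTrace` record, the `run_pipeline` helper, and the `min_score` threshold are hypothetical names and values chosen for this sketch; they are not the notebook's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class TicketTrace:
    """Everything the pipeline learns about one message (hypothetical record type)."""
    message: str
    category: str = ""
    priority: str = ""
    docs: list = field(default_factory=list)
    reply: str = ""
    score: Optional[int] = None
    needs_human_review: bool = False


def run_pipeline(
    message: str,
    classify: Callable[[str], str],
    prioritize: Callable[[str, str], str],
    retrieve: Callable[[str], list],
    generate: Callable[[str, list], str],
    evaluate: Callable[[str, str], Optional[int]],
    min_score: int = 3,  # assumed review threshold, not taken from the notebook
) -> TicketTrace:
    trace = TicketTrace(message=message)
    trace.category = classify(message)                    # step 2
    trace.priority = prioritize(message, trace.category)  # step 3
    trace.docs = retrieve(message)                        # step 4
    trace.reply = generate(message, trace.docs)           # steps 5-7
    trace.score = evaluate(trace.reply, message)          # step 8
    # Step 9: escalate when the evaluation is missing or low.
    trace.needs_human_review = trace.score is None or trace.score < min_score
    return trace


# Stub steps stand in for the real Gemini-backed functions.
demo = run_pipeline(
    "Where is my refund?",
    classify=lambda m: "refund_request",
    prioritize=lambda m, c: "high",
    retrieve=lambda m: ["Refunds usually take 5-7 business days."],
    generate=lambda m, docs: f"Thanks for reaching out. {docs[0]}",
    evaluate=lambda r, m: 4,
)
print(demo.category, demo.priority, demo.needs_human_review)
```

Passing each step in as a callable keeps the flow easy to test without API calls; the trace object is what would eventually be written to the log in step 10.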


## GenAI Capabilities Demonstrated

This project showcases multiple GenAI capabilities — most of them working together in a single pipeline:

| Capability                            | Implemented? | Where it appears |
|--------------------------------------|--------------|------------------|
| Structured output / JSON             | ✅ Yes        | Final reply format |
| Few-shot prompting                   | ✅ Yes        | Reply generation & tone control |
| Document understanding               | ✅ Yes        | RAG via embedded support docs |
| Function Calling                     | ✅ Yes        | Invoice, order, payment tools |
| Agents                               | ⚠️ Partially | Agent-like orchestration via prompting logic |
| Long context / Memory                | ✅ Yes        | Previous message included in prompt |
| Context caching                      | ⚠️ Not yet    | Possible in future versions |
| GenAI evaluation                     | ✅ Yes        | Self-evaluation after reply |
| Grounding                            | ✅ Yes        | Docs + function results cited |
| Embeddings                           | ✅ Yes        | Gemini `text-embedding-004` |
| Retrieval augmented generation (RAG) | ✅ Yes        | Embedding-based doc retrieval |
| Vector store / vector search         | ✅ Yes        | Inline cosine similarity on embeddings |
| MLOps (light)                        | ✅ Yes        | Logs to CSV, flags for human review |
| Resilience / Error handling          | ✅ Yes        | Gemini quota fallback with retry logic |
| Prioritization                       | ✅ Yes        | Hybrid urgency detection |
| Human-in-the-loop flagging           | ✅ Yes        | `needs_human_review` flag in the output |



## Step 1: Build the Email Dataset (Examples + Categories)

We begin by creating a small dataset of realistic customer support messages.

Each message is labeled with one of five categories:
- `refund_request`  
- `technical_issue`  
- `general_question`  
- `account_problem`  
- `feedback`

This dataset serves two purposes:
- To define the categories the classifier is expected to predict (no model is trained on it)
- To simulate realistic user inputs for testing the pipeline

It’s the foundation for everything that follows — especially classification, tone detection, and reply generation.


In [1]:
import pandas as pd

# Simulated email data
data = [
    # Refunds
    ("REF001", "Refund for my recent order", "Hello, I’d like a refund for the shoes I purchased last week. They didn’t fit well.", "refund_request"),
    ("REF002", "Need my money back", "I received the wrong item and want a refund ASAP.", "refund_request"),
    ("REF003", "Request for order cancellation and refund", "Please cancel my order #45321 and issue a refund. I no longer need it.", "refund_request"),
    ("REF004", "Order returned - waiting for refund", "I’ve sent back my order a week ago. When will I get my refund?", "refund_request"),

    # Technical Issues
    ("TECH001", "Website not loading", "Your website has been down all morning. Can you please fix this?", "technical_issue"),
    ("TECH002", "App keeps crashing", "The app crashes whenever I try to check my order history.", "technical_issue"),
    ("TECH003", "Can't update my account info", "Every time I update my shipping address, it resets. Please help.", "technical_issue"),
    ("TECH004", "Payment error", "I tried to pay with my credit card, but it keeps declining without reason.", "technical_issue"),

    # General Questions
    ("GEN001", "Do you ship internationally?", "Hi, I live in Canada. Do you ship products here?", "general_question"),
    ("GEN002", "Product material details?", "What material is used in your waterproof jackets?", "general_question"),
    ("GEN003", "Discounts for students?", "Do you offer any student discounts or promo codes?", "general_question"),
    ("GEN004", "Delivery times?", "How long does it take to deliver to Los Angeles?", "general_question"),

    # Account Problems
    ("ACC001", "Can't log in", "My account is locked and I can’t reset the password.", "account_problem"),
    ("ACC002", "Verification email not received", "I signed up but didn’t get the confirmation email. Can you resend it?", "account_problem"),
    ("ACC003", "Duplicate account issue", "Looks like I accidentally created two accounts. Can you merge them?", "account_problem"),
    ("ACC004", "Need to change email", "I want to change my login email. What’s the process?", "account_problem"),

    # Feedback
    ("FB001", "Amazing service!", "Just wanted to say your customer support is fantastic. Keep it up!", "feedback"),
    ("FB002", "Disappointed with packaging", "My order arrived damaged. Packaging was very poor.", "feedback"),
    ("FB003", "Love your brand", "Big fan of your products. Love the new hoodie designs!", "feedback"),
    ("FB004", "Unsubscribing from newsletter", "Too many emails lately. Please remove me from your mailing list.", "feedback"),
]

# Convert to DataFrame
df = pd.DataFrame(data, columns=["email_id", "subject", "body", "label"])

# Preview
df.head(10)


Unnamed: 0,email_id,subject,body,label
0,REF001,Refund for my recent order,"Hello, I’d like a refund for the shoes I purch...",refund_request
1,REF002,Need my money back,I received the wrong item and want a refund ASAP.,refund_request
2,REF003,Request for order cancellation and refund,Please cancel my order #45321 and issue a refu...,refund_request
3,REF004,Order returned - waiting for refund,I’ve sent back my order a week ago. When will ...,refund_request
4,TECH001,Website not loading,Your website has been down all morning. Can yo...,technical_issue
5,TECH002,App keeps crashing,The app crashes whenever I try to check my ord...,technical_issue
6,TECH003,Can't update my account info,"Every time I update my shipping address, it re...",technical_issue
7,TECH004,Payment error,"I tried to pay with my credit card, but it kee...",technical_issue
8,GEN001,Do you ship internationally?,"Hi, I live in Canada. Do you ship products here?",general_question
9,GEN002,Product material details?,What material is used in your waterproof jackets?,general_question


In [2]:
df["label"].value_counts()

label
refund_request      4
technical_issue     4
general_question    4
account_problem     4
feedback            4
Name: count, dtype: int64

## Setting Up Gemini for Classification

To classify support emails using Gemini, we first connect to the Google Generative AI API.

This involves:
1. **Providing your API key** securely (via Kaggle secrets)  
2. **Importing the Gemini client library**  
3. **Listing available models** to confirm that everything is working

Later in the notebook, we’ll use `gemini-2.0-flash` for classification and generation tasks — fast, efficient, and perfect for real-time agents.


In [3]:
from google import genai
from google.genai import types

genai.__version__

'0.8.0'

In [4]:
#!pip uninstall -qqy jupyterlab  # Remove unused conflicting packages
!pip install -q --upgrade "google-genai==1.7.0"

In [5]:
from kaggle_secrets import UserSecretsClient

GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")

client = genai.Client(api_key=GOOGLE_API_KEY)

In [7]:

for m in client.models.list():
    if "embedContent" in m.supported_actions:
        print(m.name)

## Step 2: Classify Support Messages (Category + Priority)

Before generating a reply, the agent first needs to understand what the message is about — and how urgent it is.

We use Gemini to:
1. Classify the message into one of five support categories  
2. Detect the urgency of the message using a hybrid priority system


### Step 2a: Classify the Message Category

This function uses **zero-shot prompting** to classify support messages into one of five predefined categories:

- `refund_request`  
- `technical_issue`  
- `general_question`  
- `account_problem`  
- `feedback`

It works by sending a structured prompt to Gemini that includes:
- A role instruction and the list of allowed categories
- The new message to classify, inserted dynamically at the end

Gemini reads the full prompt and predicts the appropriate category based on its language understanding, with no model training and no labeled examples in the prompt.

---

### 🔍 Prompt Structure Example:

We pass a prompt like this to Gemini:

You are an expert customer support agent.

Classify the following email into one of these categories: refund_request, technical_issue, general_question, account_problem, feedback.

Only respond with the category name.

Email: "I haven’t received my refund for invoice INV-002."

Gemini is expected to reply with the best-fitting category name and nothing else.

---

This zero-shot classification approach is fast and flexible, which makes it a good fit for cases where retraining a traditional model would be too slow or complex.


In [8]:
def classify_email_with_gemini(email_text: str) -> str:
    categories = ["refund_request", "technical_issue", "general_question", "account_problem", "feedback"]

    prompt = f"""
You are an expert customer support agent.

Classify the following email into one of these categories:
{', '.join(categories)}.

Only respond with the category name.

Email:
\"\"\"
{email_text}
\"\"\"
"""

    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=prompt
    )

    return response.text.strip().lower()
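
Because the model returns free text, the label can come back with stray punctuation, capitalization, or extra words. A small normalization step can guard against that. The sketch below is an optional addition, not part of the original function; `normalize_category` and its `general_question` default are assumptions made for illustration.

```python
ALLOWED_CATEGORIES = (
    "refund_request", "technical_issue", "general_question",
    "account_problem", "feedback",
)


def normalize_category(raw: str, default: str = "general_question") -> str:
    """Map free-form model output onto one of the allowed labels."""
    cleaned = raw.strip().strip("\"'`.").lower().replace(" ", "_")
    if cleaned in ALLOWED_CATEGORIES:
        return cleaned
    # Tolerate extra words around the label, e.g. "Category: refund_request".
    for label in ALLOWED_CATEGORIES:
        if label in cleaned:
            return label
    return default  # assumed fallback when nothing matches


print(normalize_category("Refund_Request.\n"))
print(normalize_category("Category: technical issue"))
print(normalize_category("I am not sure"))
```

With a helper like this, downstream lookups such as a tone map only ever see one of the five known labels.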



### Step 2b: Detect Message Priority (Hybrid Approach)

Some support requests require faster attention than others.

To handle this, we implemented a **hybrid priority system** that applies three checks in order:
- A keyword scan for urgency or frustration signals such as "urgent", "asap", or "still waiting" marks the message as **high priority** immediately
- A category rule marks refund and payment messages as high priority, and general questions as low priority
- For everything else, a Gemini prompt judges the priority from tone, urgency, and topic

The priority is saved alongside the category and reply, so we can:
- Flag time-sensitive cases in the logs
- Track how many messages might need faster handling
- Potentially prioritize them for escalation in a real backend


In [9]:
def classify_priority_hybrid(user_message: str, category: str) -> str:
    """
    Uses keywords to override Gemini’s judgment if strong signals are found.
    Otherwise, returns Gemini’s own classification.
    """
    # Keyword-based override
    urgent_keywords = ["urgent", "asap", "immediately", "now", "emergency", "cancel", "complaint"]
    frustration_keywords = ["angry", "frustrated", "not acceptable", "you people", "ridiculous", "still waiting"]
    
    msg = user_message.lower()
    
    if any(word in msg for word in urgent_keywords + frustration_keywords):
        return "high"
    
    if category in ["refund", "refund_request", "payment_issue"]:
        return "high"

    if category in ["general_question", "account_access"]:
        return "low"

    # Otherwise fallback to Gemini's zero-shot classification
    prompt = f"""
You are an AI assistant. Based on tone, urgency, and topic, classify the priority of this customer message as "high", "normal", or "low".

Message:
\"\"\"{user_message}\"\"\"

Reply with just the word: high, normal, or low.
"""
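    # Use a stateless call here: the `chat` session is only created later in the notebook.
    response = client.models.generate_content(model="gemini-2.0-flash", contents=prompt)
    return response.text.strip().lower()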
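
One caveat of plain substring checks such as `word in msg` is that short keywords also match inside longer words: "now" matches "know", for example. A possible refinement, sketched here with a shortened keyword list and a hypothetical `has_urgency_signal` helper, is to compile the keywords into a single regular expression with word boundaries.

```python
import re

URGENT_TERMS = ["urgent", "asap", "immediately", "now", "still waiting"]

# \b anchors each term at word boundaries, so "now" no longer matches "know".
URGENT_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in URGENT_TERMS) + r")\b",
    re.IGNORECASE,
)


def has_urgency_signal(message: str) -> bool:
    """Return True if any urgency term appears as a whole word or phrase."""
    return URGENT_PATTERN.search(message) is not None


print(has_urgency_signal("Do you know if you ship to Canada?"))
print(has_urgency_signal("I need this fixed NOW."))
print(has_urgency_signal("I'm still waiting on my refund."))
```

The first message would be flagged by a substring check but is not flagged here; the other two still are.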


### 🧪 Test: Classify a Support Email

Let’s try the `classify_email_with_gemini()` function on a real message.

We’ll input a user support email and ask Gemini to predict the correct category using zero-shot prompting.

No training and no examples, just reasoning from the category list in the prompt.


In [10]:
email_text = """
Hi, I ordered a phone last week but My order arrived broken
"""

predicted_category = classify_email_with_gemini(email_text)

print("📨 Email:")
print(email_text.strip())
print("\n🔎 Predicted Category:")
print(f"➡️ {predicted_category}")

📨 Email:
Hi, I ordered a phone last week but My order arrived broken

🔎 Predicted Category:
➡️ refund_request


### 🚦 Priority Classification Test (Hybrid)

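This message contains two frustration signals from our keyword list ("still waiting" and "ridiculous"), so the keyword rule should mark it as high priority.

Because the keyword check runs first, the priority here does not depend on Gemini's tone and topic judgment; Gemini is only used for the category.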

Message:
> “I’m still waiting on my refund from two weeks ago. This is getting ridiculous.”


In [11]:
test_message = "I’m still waiting on my refund from two weeks ago. This is getting ridiculous."
test_category = classify_email_with_gemini(test_message)
test_priority = classify_priority_hybrid(test_message, test_category)

print(f"📬 Message: {test_message}")
print(f"📂 Category: {test_category}")
print(f"🚦 Priority: {test_priority}")


📬 Message: I’m still waiting on my refund from two weeks ago. This is getting ridiculous.
📂 Category: refund_request
🚦 Priority: high


## Step 3: Retrieve Relevant Docs Using Gemini Embeddings (RAG)

Before generating a reply, the agent needs context — like refund rules, help center answers, or return policies.

We use **RAG (Retrieval-Augmented Generation)** to ground Gemini’s answers in real documents.

Here’s how it works:
1. We embed a mini knowledge base of support articles using Gemini’s `text-embedding-004` model  
2. We embed the user’s message using the same model  
3. We compute cosine similarity and retrieve the **top 3 most relevant snippets**

This gives Gemini factual grounding, which reduces the risk of hallucinated policies and lets it respond like a trained support rep with access to internal docs.

Let’s start by building the knowledge base.


In [12]:
support_kb = [
    "You can request a refund within 30 days of purchase if the item is unused.",
    "To reset your password, go to the login page and click 'Forgot Password'.",
    "We currently ship to the US, Canada, and the EU.",
    "All our jackets are made with waterproof recycled polyester.",
    "If the app crashes, try clearing cache or reinstalling it.",
    "Refunds usually take 5–7 business days after item is received.",
    "Duplicate accounts can be merged by contacting support with both emails.",
    "For payment issues, try using a different card or contact your bank.",
    "To unsubscribe from our newsletter, click the link at the bottom of any email.",
    "We offer student discounts — email us a valid student ID to receive a code."
]

kb_df = pd.DataFrame({"Text": support_kb})


### Embed Function with Retry Logic

To generate embeddings for the knowledge base and user queries, we use Gemini’s `text-embedding-004` model.  
However, since we’re using the free tier, we may occasionally hit **rate limits (HTTP 429)** or temporary **service unavailability (HTTP 503)**.

To make the pipeline robust, we wrap the embedding function with a **retry decorator**:
- If the API throws a rate-limit error, the function will automatically retry for up to 5 minutes
- This ensures the notebook runs end-to-end without crashing

> This is a simple but important addition when working with real APIs in production or high-traffic environments.


In [13]:
from google.api_core import retry

# Retry logic for rate-limiting
is_retriable = lambda e: (isinstance(e, genai.errors.APIError) and e.code in {429, 503})

@retry.Retry(predicate=is_retriable, timeout=300.0)
def embed_text(text: str) -> list[float]:
    response = client.models.embed_content(
        model="models/text-embedding-004",
        contents=text,
        config=types.EmbedContentConfig(task_type="retrieval_document"),
    )
    return response.embeddings[0].values
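
To make the retry idea concrete without calling the API, here is a standard-library sketch of a retry decorator with exponential backoff and jitter. It is an illustration of the general technique, not how `google.api_core.retry` is implemented; `TransientError`, `retry_with_backoff`, and `flaky_embed` are hypothetical names, and the delays are kept tiny so the demo runs quickly.

```python
import functools
import random
import time


class TransientError(Exception):
    """Stand-in for a 429/503 response (hypothetical, for the demo only)."""


def retry_with_backoff(max_attempts=5, base_delay=0.01, max_delay=1.0):
    """Retry the wrapped call on TransientError, doubling the wait each time."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except TransientError:
                    if attempt == max_attempts:
                        raise  # give up after the last attempt
                    delay = min(max_delay, base_delay * 2 ** (attempt - 1))
                    time.sleep(delay + random.uniform(0, delay))  # add jitter
        return wrapper
    return decorator


calls = {"count": 0}


@retry_with_backoff(max_attempts=4)
def flaky_embed(text: str) -> list:
    """Fails twice, then returns a fake 3-dimensional embedding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("rate limited")
    return [0.1, 0.2, 0.3]


print(flaky_embed("hello"), "after", calls["count"], "calls")
```

Only errors that are expected to clear on their own are retried; anything else propagates immediately, which mirrors the role of the `is_retriable` predicate above.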


### Embed the Knowledge Base

Now that we’ve defined our embedding function, we apply it to each row in the knowledge base (`kb_df`).

For every support article:
- We pass its text to the Gemini embedding model
- We store the resulting vector in a new `Embedding` column

These embeddings let us compare the user’s message to all KB entries — and retrieve the most relevant ones dynamically.

> This step powers our RAG (Retrieval-Augmented Generation) system and ensures that replies are grounded in real knowledge.


In [14]:
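from tqdm import tqdm

tqdm.pandas(desc="Embedding KB")  # registers Series.progress_apply, which shows a progress bar
kb_df["Embedding"] = kb_df["Text"].progress_apply(embed_text)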



In [15]:
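import numpy as np
from sklearn.metrics.pairwise import cosine_similarity  # imported here so this cell runs on its own
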
def retrieve_similar_docs(query: str, top_k=3):
    query_embedding = np.array(embed_text(query)).reshape(1, -1)
    matrix = np.vstack(kb_df["Embedding"].to_numpy())
    
    similarities = cosine_similarity(query_embedding, matrix)[0]
    top_indices = similarities.argsort()[::-1][:top_k]
    
    return kb_df.iloc[top_indices]["Text"].tolist()
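
For intuition, the same retrieval step can be written in plain Python on toy vectors. This is a sketch of the technique only: the 3-dimensional vectors below are made up for the demo, and real embeddings have many more dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=3):
    """Return (index, score) pairs for the k most similar documents."""
    scored = sorted(
        ((cosine(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)),
        reverse=True,
    )
    return [(idx, round(score, 3)) for score, idx in scored[:k]]


# Made-up vectors: document 0 points almost the same way as the query.
toy_docs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query = [1.0, 0.2, 0.0]

print(top_k(toy_query, toy_docs, k=2))
```

Documents 0 and 2 rank highest because their direction is closest to the query's; vector length does not affect the score.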


In [16]:
from sklearn.metrics.pairwise import cosine_similarity
query = "How can I get a refund for an item?"
print("📬 Query:", query)

print("\n🔍 Top 3 Retrieved Snippets:")
for i, doc in enumerate(retrieve_similar_docs(query), 1):
    print(f"{i}. {doc}")


📬 Query: How can I get a refund for an item?

🔍 Top 3 Retrieved Snippets:
1. Refunds usually take 5–7 business days after item is received.
2. You can request a refund within 30 days of purchase if the item is unused.
3. For payment issues, try using a different card or contact your bank.


## Step 4: Generate an AI-Powered Reply

Now that we’ve classified the message and retrieved relevant context, it’s time to respond.

Using `gemini-2.0-flash`, we generate a helpful, grounded reply that incorporates:
- The user’s original message
- The top-matching support docs (via RAG)
- Optionally, real-time data from a backend database

---

### Step 4a: Create the Invoice Database


In this section, we take the agent to the next level by connecting it to a **simulated backend** that behaves like a real one.

Let’s say a customer asks:
> "Can you check my invoice INV-001? I think I was refunded."

To answer that, the agent will:
1. Connect to a local SQLite database with tables for invoices, orders, and payments  
2. Use **Gemini function calling** to decide which Python function to trigger based on the message  
3. Query the appropriate table and get live data  
4. Use that data to generate a personalized, accurate reply

We use the exact technique shown in the Kaggle GenAI course (Day 3):  
Gemini auto-inspects Python function **docstrings** and **type hints** to build a smart tool interface — no manual schema setup required.

> The result: a GenAI agent that can reason *and* act — like a real support system.
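
To see what information a docstring and type hints make available, here is a small introspection sketch using Python's `inspect` module. It only illustrates the idea; it is not the SDK's actual schema-building code, and `get_invoice_stub` and `describe_tool` are hypothetical names.

```python
import inspect


def get_invoice_stub(invoice_id: str) -> dict:
    """Look up an invoice by its ID (stub used only for this illustration)."""
    return {"invoice_id": invoice_id, "status": "paid"}


def describe_tool(func) -> dict:
    """Build a small, schema-like description from a function's metadata."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }


print(describe_tool(get_invoice_stub))
```

This is why clear docstrings and precise type hints matter for tool functions: they are the only description of the tool the model receives.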


#### Load SQL extension & connect to DB

In [17]:
%load_ext sql
%sql sqlite:///ecommerce.db


#### Create the 3 tables

In [18]:
%%sql

DROP TABLE IF EXISTS invoices;
DROP TABLE IF EXISTS orders;
DROP TABLE IF EXISTS payments;

CREATE TABLE invoices (
    invoice_id TEXT,
    customer_name TEXT,
    status TEXT,
    amount REAL,
    date TEXT
);

CREATE TABLE orders (
    order_id TEXT,
    invoice_id TEXT,
    item TEXT,
    order_date TEXT,
    status TEXT
);

CREATE TABLE payments (
    payment_id TEXT,
    invoice_id TEXT,
    method TEXT,
    status TEXT,
    paid_on TEXT
);


 * sqlite:///ecommerce.db
Done.
Done.
Done.
Done.
Done.
Done.


[]

#### Add sample data

In [19]:
%%sql

INSERT INTO invoices VALUES
    ('INV-001', 'Alice', 'refunded', 150.00, '2024-03-01'),
    ('INV-002', 'Bob', 'paid', 200.00, '2024-03-05'),
    ('INV-003', 'Charlie', 'pending', 75.50, '2024-03-10');

INSERT INTO orders VALUES
    ('ORD-101', 'INV-001', 'Laptop', '2024-02-28', 'returned'),
    ('ORD-102', 'INV-002', 'Tablet', '2024-03-03', 'delivered'),
    ('ORD-103', 'INV-003', 'Headphones', '2024-03-07', 'processing');

INSERT INTO payments VALUES
    ('PAY-001', 'INV-001', 'credit_card', 'refunded', '2024-03-01'),
    ('PAY-002', 'INV-002', 'paypal', 'paid', '2024-03-04'),
    ('PAY-003', 'INV-003', 'credit_card', 'pending', NULL);


 * sqlite:///ecommerce.db
3 rows affected.
3 rows affected.
3 rows affected.


[]

### Step 4b: Agent Brain – Helper Functions Before Reply Generation

Before we generate a reply, we need to define the internal tools the agent will rely on.

These include:
- Python functions that query the database (used via function calling)
- Message classifiers (topic and urgency)
- Tone detection and style control
- Evaluation logic to score replies
- Logging for analysis and debugging

We’ll define these helper functions now — and connect everything together in the main `generate_rag_db_reply()` function right after.


#### Python Functions for Gemini to Call

In [20]:
#  Import + Connect to DB
from sqlalchemy import create_engine
import pandas as pd

engine = create_engine("sqlite:///ecommerce.db")


##### Invoice Info Function

In [21]:
from sqlalchemy import text

def get_invoice_by_id(invoice_id: str) -> dict:
    """
    Retrieve invoice information by invoice ID from the 'invoices' table.
    Returns invoice_id, customer_name, status, amount, and date.
    """
    # Bind the ID as a parameter instead of formatting it into the SQL string.
    query = text("SELECT * FROM invoices WHERE invoice_id = :invoice_id")
    df = pd.read_sql(query, engine, params={"invoice_id": invoice_id})
    if df.empty:
        return {"error": "Invoice not found"}
    return df.iloc[0].to_dict()


##### Order Status Function

In [22]:
from sqlalchemy import text

def get_order_status_by_invoice(invoice_id: str) -> dict:
    """
    Retrieve order details for a given invoice ID from the 'orders' table.
    Returns order_id, item, order_date, and status.
    """
    # Bind the ID as a parameter instead of formatting it into the SQL string.
    query = text("SELECT * FROM orders WHERE invoice_id = :invoice_id")
    df = pd.read_sql(query, engine, params={"invoice_id": invoice_id})
    if df.empty:
        return {"error": "No order found for this invoice"}
    return df.iloc[0].to_dict()


##### Payment Status Function

In [23]:
from sqlalchemy import text

def get_payment_status(invoice_id: str) -> dict:
    """
    Return payment status and method for a given invoice ID from the 'payments' table.
    """
    # Bind the ID as a parameter instead of formatting it into the SQL string.
    query = text("SELECT * FROM payments WHERE invoice_id = :invoice_id")
    df = pd.read_sql(query, engine, params={"invoice_id": invoice_id})
    if df.empty:
        return {"error": "No payment found for this invoice"}
    return df.iloc[0].to_dict()
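
Since the invoice ID ultimately comes from customer text, it is worth seeing why lookups like these should bind it as a query parameter. The sketch below uses the standard-library `sqlite3` module with an in-memory table; `find_invoice` is a hypothetical helper for this demo and is separate from the notebook's `ecommerce.db`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (invoice_id TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [("INV-001", "refunded"), ("INV-002", "paid")],
)


def find_invoice(invoice_id: str) -> list:
    # The ? placeholder sends the value separately from the SQL text,
    # so quotes inside invoice_id cannot change the query.
    cursor = conn.execute(
        "SELECT invoice_id, status FROM invoices WHERE invoice_id = ?",
        (invoice_id,),
    )
    return cursor.fetchall()


print(find_invoice("INV-001"))
print(find_invoice("x' OR '1'='1"))
```

The second call returns no rows: the injection attempt is treated as a literal ID rather than as SQL, whereas string formatting would have matched every invoice.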


#### Gemini function call

In [24]:
#response = model.generate_content("Can you check the status of invoice INV-001?")

# These are the Python functions defined above.
db_tools = [get_invoice_by_id, get_order_status_by_invoice, get_payment_status]

instruction = """You are a helpful customer support assistant with access to an invoice, order, and payment database.

If the user asks about the status of an invoice, delivery of an order, or payment status,
call one of the available functions to get real-time information.

Be brief, friendly, and include relevant dates or amounts if available.
"""

client = genai.Client(api_key=GOOGLE_API_KEY)

# Start a chat with automatic function calling enabled.
chat = client.chats.create(
    model="gemini-2.0-flash",
    config=types.GenerateContentConfig(
        #system_instruction=instruction,
        system_instruction= """
You are a helpful customer support agent.

You have access to live database tools:
- get_invoice_by_id
- get_order_status_by_invoice
- get_payment_status

📌 IMPORTANT:
If the user mentions an invoice ID, payment, or order status, YOU MUST CALL the appropriate function before replying.

Do not guess. Always verify live information via the available tools.

After calling a function, summarize the result clearly in a polite and professional email reply.
""",
        tools=db_tools,
    ),
)

# Show result
# print(response)


In [25]:
resp = chat.send_message("Can you check the status of invoice INV-001?")
print(f"\n{resp.text}")


OK. I have checked the status of invoice INV-001. Here's a summary:

*   **Invoice:** INV-001 was issued to Alice on 2024-03-01 for $150 and is currently marked as **refunded**.
*   **Order:** The order associated with this invoice (ORD-101) is for a Laptop, ordered on 2024-02-28 and the order status is **returned**.
*   **Payment:** The payment (PAY-001) made via credit card on 2024-03-01 has been **refunded**.



#### Agent Intelligence Layer: Tone, Memory, Fallbacks, and Self-Evaluation

Our `generate_rag_db_reply()` function doesn’t just generate a reply — it simulates the intelligence of a real support agent.

Some of the logic is embedded *inside* the function itself.  
The rest is defined as helper functions just before it, so I’ve grouped everything here to give you the full picture.

---

##### Memory: Simulate Previous Conversations  
In real-world support, agents often reference earlier messages.  
Here, we simulate that by including the previous message (if available) in the prompt — giving Gemini conversational continuity.

This logic is directly embedded inside `generate_rag_db_reply()`.

---

##### Tone Control: Match the Message Context  
Support agents adjust their tone based on the situation.  
We assign a tone like “apologetic”, “reassuring”, or “friendly” based on the message category — and tell Gemini how to phrase the response accordingly.

This logic is also included directly inside the reply function.

---

##### Fallbacks: Handle Missing or Incomplete Data  
If Gemini calls a tool like `get_invoice_by_id()` and it returns no result, we explicitly tell the model that the data was not found.

This helps it avoid hallucinating answers — and encourages polite, professional fallback replies.

Also implemented directly in the main function logic.
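
One way to express this fallback is a small formatter that turns a tool result into prompt text and makes the "not found" case explicit. This is a sketch under assumptions: `describe_tool_result` and its wording are hypothetical, and the only thing taken from the notebook is the `{"error": ...}` shape returned by the database functions.

```python
from typing import Optional


def describe_tool_result(tool_name: str, result: Optional[dict]) -> str:
    """Turn a tool result into prompt text, making 'not found' explicit."""
    if not result or "error" in result:
        reason = (result or {}).get("error", "no data returned")
        return (
            f"Tool {tool_name} found no matching record ({reason}). "
            "Do not guess; ask the customer to confirm the ID."
        )
    fields = ", ".join(f"{key}={value}" for key, value in result.items())
    return f"Tool {tool_name} returned: {fields}"


print(describe_tool_result("get_invoice_by_id", {"error": "Invoice not found"}))
print(describe_tool_result("get_invoice_by_id", {"invoice_id": "INV-001", "status": "refunded"}))
```

Stating the missing data in words gives the model something concrete to respond to, instead of leaving a gap it might fill with invented details.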

---

##### Evaluation: Score the Generated Reply  
After generating the reply, we use the `evaluate_reply()` function to ask Gemini how helpful the answer was.

This function is defined **just before** the reply function and called inside it — to keep the logic clean and modular.

---

##### Resilience: Safe API Calls with Auto-Retry  
To avoid runtime errors (especially rate limits), we use `safe_send_message()` — a helper that wraps Gemini calls with retry logic.

This keeps the notebook stable during end-to-end runs, especially under free tier quotas.

Defined just before the reply function as well.

---

These capabilities — some inside the main function, some as helpers — form the agent’s intelligence layer.  
They make it more human, more reliable, and much more aligned with real-world support workflows.


In [28]:
def evaluate_reply(reply_text: str, original_msg: str) -> dict:
    """
    Ask Gemini to rate its own reply from 1 to 5 and explain.
    Handles both clean JSON and fallback regex if needed.
    """
    eval_prompt = f"""
Evaluate the following support reply to a customer message.

Customer message:
\"\"\"{original_msg}\"\"\"

AI-generated reply:
\"\"\"{reply_text}\"\"\"

Rate the reply from 1 to 5:
- 5 = Excellent (fully answers the question, polite, accurate)
- 4 = Good (clear and mostly accurate, maybe missing minor details)
- 3 = Okay (understands the question but misses or oversimplifies parts)
- 2 = Weak (vague, generic, or partially incorrect)
- 1 = Poor (misleading, incorrect, or unhelpful)

Also include a short comment explaining the score.

Return your evaluation in this JSON format:
{{
  "score": <number>,
  "comment": "<short explanation>"
}}
"""

    eval_response = chat.send_message(eval_prompt)

    import json
    import re

    raw = eval_response.text or ""  # .text can be None when the response has no text part

    # Try parsing strict JSON
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Fallback #1: Try to find score from common patterns
        score_match = re.search(r"score\s*[:=]?\s*(\d)", raw, re.IGNORECASE)
        comment_match = re.search(r"comment\s*[:=]?\s*['\"]?(.+?)(?:(?:['\"])?[\n\r]|$)", raw, re.IGNORECASE)

        # Fallback #2: Try to grab the first number 1–5 in the response
        loose_score = re.search(r"\b([1-5])\b", raw)

        score = int(score_match.group(1)) if score_match else int(loose_score.group(1)) if loose_score else None
        comment = comment_match.group(1).strip() if comment_match else raw.strip()

        return {
            "score": score,
            "comment": comment
        }
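
Models often wrap JSON in a Markdown code fence or add a sentence around it, which makes strict `json.loads` fail and pushes parsing into the regex fallback. A possible pre-processing step, sketched here with a hypothetical `extract_json_object` helper, is to pull out the outermost `{...}` span before parsing.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built programmatically for the demo


def extract_json_object(raw: str):
    """Return the first JSON object found in a model reply, or None."""
    # Taking the outermost {...} span skips code fences and surrounding prose.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return None


wrapped = FENCE + 'json\n{"score": 4, "comment": "Clear and polite."}\n' + FENCE
print(extract_json_object(wrapped))
print(extract_json_object("no json here"))
```

This keeps the regex path as a last resort for replies that contain no parseable object at all.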


In [29]:
import time

def safe_send_message(prompt, retries=3, wait_seconds=10):
    """
    Resilient wrapper for Gemini's chat.send_message() call.
    Retries on rate limit (429) or temporary unavailability (503) errors.
    """
    for attempt in range(retries):
        try:
            return chat.send_message(prompt)
        except genai.errors.APIError as e:
            # The google-genai SDK reports quota errors as APIError with code 429.
            if e.code not in {429, 503}:
                print("⚠️ Non-retriable API error:", e)
                break
            print(f"⚠️ Quota exceeded or service busy. Waiting {wait_seconds}s... (Attempt {attempt+1}/{retries})")
            time.sleep(wait_seconds)
        except Exception as e:
            print("⚠️ Unexpected error:", e)
            break
    return None  # Safe fallback if all retries fail


### Step 4c: Generate Reply with Function Calling + RAG + Memory

In [30]:
def generate_rag_db_reply(user_message: str, retrieved_docs: list[str], previous_message: str = None) -> dict:
    """
    Generate a customer support reply using:
    - Retrieved documents (RAG)
    - Function calling (invoice/order/payment)
    - Few-shot prompting + tone control
    - Memory (previous user message)
    - Gemini reply + evaluation
    - Resilience with fallback if Gemini fails
    """

    # 1. Classify message to detect topic
    category = classify_email_with_gemini(user_message)

    # 2. Select reply tone based on category
    tone_map = {
        "refund": "apologetic",
        "refund_request": "apologetic",
        "payment_issue": "professional",
        "order_status": "friendly",
        "account_access": "reassuring",
        "general_question": "neutral",
        "feedback": "neutral"
    }
    tone = tone_map.get(category, "neutral")

    # 3. Format support knowledge
    kb_context = "\n".join([f"- {doc}" for doc in retrieved_docs]) if retrieved_docs else "No documents found."

    # 4. Previous message context (if any)
    previous_context = f"""Previous message:\n\"\"\"{previous_message}\"\"\"\n\n""" if previous_message else ""

    # 5. Few-shot examples
    few_shot_examples = """
Example:
Customer message: "Where is my refund for invoice INV-001?"
Function result: {'status': 'refunded', 'amount': 150.0, 'date': '2024-03-01'}
Docs: Refunds take 5–7 days.

Reply:
Dear Customer,
Thank you for your message. Your refund for invoice INV-001 was processed on March 1. Please allow 5–7 business days for it to appear in your account.
Best regards,
Support Team

---

Example:
Customer message: "My order hasn’t arrived. It was a tablet."
Function result: {'status': 'delivered', 'item': 'Tablet', 'order_date': '2024-03-03'}
Docs: None

Reply:
Hello,
Your order for a tablet was delivered on March 3. If you haven’t received it, please contact our shipping partner or reply to this message.
Kind regards,
Support Team

---
"""

    # 6. First Gemini call — check if tool needed
    initial_prompt = few_shot_examples + f"""
You are a helpful customer support agent.

You have access to live tools:
- get_invoice_by_id
- get_order_status_by_invoice
- get_payment_status

📌 IMPORTANT:
If the message includes an invoice or is about refund/payment/order,
YOU MUST CALL a tool before generating the reply.

{previous_context}
Customer message:
\"\"\"{user_message}\"\"\"

Knowledge base:
{kb_context}
"""

    response = safe_send_message(initial_prompt)

    # Fallback if Gemini fails (quota, API error, etc.)
    if response is None:
        return {
            "previous_message": previous_message,
            "current_message": user_message,
            "category": category,
            "tone": tone,
            "reply": "⚠️ I’m currently unable to process your request due to a temporary issue. A human support rep will follow up shortly.",
            "needs_human_review": True,
            "function_called": None,
            "function_args": None,
            "function_result": None,
            "docs_used": retrieved_docs,
            "tokens_used": 0,
            "evaluation": {
                "score": None,
                "comment": "No AI reply generated. System fallback triggered due to error or rate limit."
            }
        }

    tool_use = response.candidates[0].content.parts[0].function_call if (
        response.candidates[0].content.parts and hasattr(response.candidates[0].content.parts[0], 'function_call')
    ) else None

    if tool_use:
        fn_name = tool_use.name
        fn_args = dict(tool_use.args or {})  # plain dict: safe for ** unpacking and json.dumps

        function_map = {
            "get_invoice_by_id": get_invoice_by_id,
            "get_order_status_by_invoice": get_order_status_by_invoice,
            "get_payment_status": get_payment_status
        }

        result = function_map[fn_name](**fn_args) if fn_name in function_map else {"error": "Unknown function"}

        # 7. Final prompt with tone control
        followup_prompt = few_shot_examples + f"""
You are a helpful and professional support assistant.

Please write the reply in a **{tone}** tone.

Tone guide:
- Apologetic → show empathy, acknowledge delays or frustration
- Professional → clear, confident, action-focused
- Friendly → warm and conversational
- Reassuring → calming and helpful
- Neutral → polite and informative

DO NOT describe what you’ll do. Write the actual message to the user.

{previous_context}
Customer message:
\"\"\"{user_message}\"\"\"

Function result:
{result}

Support documents:
{kb_context}

Respond with the reply only.
"""

        final_response = safe_send_message(followup_prompt)
        reply_text = final_response.text.strip() if final_response else "⚠️ Unable to complete reply."
        needs_human = not final_response

        evaluation = evaluate_reply(reply_text, user_message)

        return {
            "previous_message": previous_message,
            "current_message": user_message,
            "category": category,
            "tone": tone,
            "reply": reply_text,
            "needs_human_review": needs_human,
            "function_called": fn_name,
            "function_args": fn_args,
            "function_result": result,
            "docs_used": retrieved_docs,
            "tokens_used": len(reply_text.split()),
            "evaluation": evaluation
        }

    # 8. Fallback — no tool used
    reply_text = response.text.strip()
    needs_human = True  # Always escalate here: without a tool call the reply is not grounded in live data

    evaluation = evaluate_reply(reply_text, user_message)

    return {
        "previous_message": previous_message,
        "current_message": user_message,
        "category": category,
        "tone": tone,
        "reply": reply_text,
        "needs_human_review": needs_human,
        "function_called": None,
        "function_args": None,
        "function_result": None,
        "docs_used": retrieved_docs,
        "tokens_used": len(reply_text.split()),
        "evaluation": evaluation
    }


In [31]:
# email = "Can you confirm how I paid invoice INV-002 and on what date?"
# docs = retrieve_similar_docs(email)
# result = generate_rag_db_reply(email, docs)

# import json
#print(json.dumps(result["evaluation"], indent=2))
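
The function-calling branch above looks a tool up by name and calls it with the model's arguments. The sketch below isolates that dispatch step and adds a guard for badly named arguments, which the notebook code does not have. `dispatch_tool` and `fake_get_invoice` are hypothetical stand-ins; the real tools are defined earlier in the notebook.

```python
def fake_get_invoice(invoice_id: str) -> dict:
    """Hypothetical stub standing in for a real lookup tool."""
    return {"invoice_id": invoice_id, "status": "paid"}


def dispatch_tool(name: str, args: dict, registry: dict) -> dict:
    """Run the named tool with keyword arguments; always return a dict."""
    tool = registry.get(name)
    if tool is None:
        return {"error": f"Unknown function: {name}"}
    try:
        return tool(**args)
    except TypeError as exc:
        # Raised when the model supplies missing or unexpected argument names.
        # Note: this also catches a TypeError raised inside the tool itself.
        return {"error": f"Bad arguments for {name}: {exc}"}


registry = {"get_invoice_by_id": fake_get_invoice}
ok = dispatch_tool("get_invoice_by_id", {"invoice_id": "INV-002"}, registry)
unknown = dispatch_tool("delete_everything", {}, registry)
bad_args = dispatch_tool("get_invoice_by_id", {"id": "INV-002"}, registry)
```

Returning an error dict instead of raising means the follow-up prompt can still be built, and the model can ask the customer for clarification.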


### Memory in Action: Multi-Turn Support Example

Here’s a realistic customer support conversation across two messages:

---

**Previous message (from user):**  
> I received the wrong item — I ordered headphones and got a charger instead.

**Current message:**  
> I’ve returned it. When will I get my refund?

Without the previous message, the agent might not know what “it” refers to.  
With memory enabled, the agent uses context to generate a more accurate and human-like response.
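
As a minimal sketch of what "memory" means here: the previous message is simply prepended to the prompt so the model can resolve references like "it". `build_prompt` is a hypothetical helper that mirrors the `previous_context` string built inside `generate_rag_db_reply`; it is not the notebook's own function.

```python
from typing import Optional


def build_prompt(user_message: str, previous_message: Optional[str] = None) -> str:
    """Prepend the previous message (if any) so the model can resolve references."""
    parts = []
    if previous_message:
        parts.append(f'Previous message:\n"""{previous_message}"""\n')
    parts.append(f'Customer message:\n"""{user_message}"""')
    return "\n".join(parts)


with_memory = build_prompt(
    "I've returned it. When will I get my refund?",
    previous_message="I ordered headphones and got a charger instead.",
)
without_memory = build_prompt("I've returned it. When will I get my refund?")
```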


In [32]:
previous = "I received the wrong item — I ordered headphones and got a charger instead."
current = "I’ve returned it. When will I get my refund?"

# Get top 3 support docs from RAG
current_embedding = embed_text(current)
kb_df["Similarity"] = kb_df["Embedding"].apply(
    lambda x: cosine_similarity([current_embedding], [x])[0][0]
)
top_docs = kb_df.sort_values(by="Similarity", ascending=False).head(3)
retrieved_docs = top_docs["Text"].tolist()

# Call agent with memory
response = generate_rag_db_reply(current, retrieved_docs=retrieved_docs, previous_message=previous)

# Show results
#import json
#print(json.dumps(response, indent=2))
#print(response)


{'previous_message': 'I received the wrong item — I ordered headphones and got a charger instead.', 'current_message': 'I’ve returned it. When will I get my refund?', 'category': 'refund_request', 'tone': 'apologetic', 'reply': 'I need an invoice ID to check the refund status. Could you please provide the invoice ID associated with the returned item?', 'needs_human_review': True, 'function_called': None, 'function_args': None, 'function_result': None, 'docs_used': ['Refunds usually take 5–7 business days after item is received.', 'You can request a refund within 30 days of purchase if the item is unused.', 'For payment issues, try using a different card or contact your bank.'], 'tokens_used': 22, 'evaluation': {'score': 5, 'comment': ': "Excellent. The AI correctly identifies that it needs an invoice ID to proceed and asks for it politely. It\'s a necessary step before providing any information about the refund.'}}
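
The embed, score, sort, and take-top-3 lines are repeated in several cells. They could be wrapped in one helper; the sketch below shows the ranking logic in pure Python with made-up 2-D vectors. `top_k_docs`, `cosine`, and the toy data are hypothetical; the notebook itself uses `embed_text` and `cosine_similarity` over `kb_df`.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_docs(query_vec, doc_vecs, doc_texts, k=3):
    """Return the k texts whose vectors are most similar to query_vec."""
    scored = [(cosine(query_vec, vec), text) for vec, text in zip(doc_vecs, doc_texts)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy 2-D "embeddings", made up for illustration only.
texts = ["refund policy", "password reset", "payment issues"]
vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
best = top_k_docs([0.9, 0.1], vecs, texts, k=2)
```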


### Memory + Function Calling: Invoice Tracking Example

In this example, the user provides their invoice ID in the first message.

---

**Previous message:**  
> I received a damaged item and returned it. My invoice number is INV-002.

**Current message:**  
> When will the refund be processed?

The invoice ID is **not repeated**, but the agent remembers it and checks the refund status using a function call.

This simulates a real-world customer journey — and shows the power of memory + tool use + RAG working together.
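
In the notebook the model picks up the invoice ID from the previous message in the prompt. A deterministic fallback could scan both messages with a regular expression, as sketched below. The `INV-` plus digits pattern is an assumption based on examples such as INV-002, and `find_invoice_id` is a hypothetical helper.

```python
import re
from typing import Optional

INVOICE_PATTERN = re.compile(r"\bINV-\d+\b")  # assumed ID format, e.g. INV-002


def find_invoice_id(current: str, previous: Optional[str] = None) -> Optional[str]:
    """Return the first invoice ID in the current message, else in the previous one."""
    for text in (current, previous):
        if text:
            match = INVOICE_PATTERN.search(text)
            if match:
                return match.group(0)
    return None


found = find_invoice_id(
    "When will the refund be processed?",
    previous="I received a damaged item and returned it. My invoice number is INV-002.",
)
```

Checking the current message first means a newly supplied ID takes precedence over an older one.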


In [33]:
previous = "I received a damaged item and returned it. My invoice number is INV-002."
current = "When will the refund be processed?"

# Embed and retrieve top support docs
current_embedding = embed_text(current)
kb_df["Similarity"] = kb_df["Embedding"].apply(
    lambda x: cosine_similarity([current_embedding], [x])[0][0]
)
top_docs = kb_df.sort_values(by="Similarity", ascending=False).head(3)
retrieved_docs = top_docs["Text"].tolist()

# Run agent with memory
response = generate_rag_db_reply(current, retrieved_docs=retrieved_docs, previous_message=previous)

# Display output
import json
print(json.dumps(response, indent=2))


{
  "previous_message": "I received a damaged item and returned it. My invoice number is INV-002.",
  "current_message": "When will the refund be processed?",
  "category": "refund_request",
  "tone": "apologetic",
  "reply": "Dear Bob,\n\nThank you for your message. I see that invoice INV-002 for $200 is currently marked as paid. To confirm, has the returned item been received? I don't see any record of a refund being processed yet. Once the return is processed, refunds typically take 5-7 business days to appear in your account.\n\nKind regards,\nSupport Team",
  "needs_human_review": true,
  "function_called": null,
  "function_args": null,
  "function_result": null,
  "docs_used": [
    "Refunds usually take 5\u20137 business days after item is received.",
    "You can request a refund within 30 days of purchase if the item is unused.",
    "For payment issues, try using a different card or contact your bank."
  ],
  "tokens_used": 58,
  "evaluation": {
    "score": 4,
    "comment"

### Tone Control in Action: Apologetic Refund Example

Let’s test how the agent changes its tone based on the message category.

In this case, the customer is upset about a **delayed refund**, so the category should be `refund_request`, and the tone will be set to `apologetic`.

The reply should reflect empathy and professionalism.
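
Tone selection is a dictionary lookup with a `neutral` default. The sketch below repeats the mapping from `generate_rag_db_reply` and adds label normalization (case and whitespace), which is an extra step the notebook does not perform. `select_tone` is a hypothetical helper.

```python
TONE_MAP = {
    "refund": "apologetic",
    "refund_request": "apologetic",
    "payment_issue": "professional",
    "order_status": "friendly",
    "account_access": "reassuring",
    "general_question": "neutral",
    "feedback": "neutral",
}


def select_tone(category: str, default: str = "neutral") -> str:
    """Map a classifier label to a reply tone, tolerating case and stray whitespace."""
    key = category.strip().lower().replace(" ", "_")
    return TONE_MAP.get(key, default)


tones = [select_tone(c) for c in ("refund_request", " Refund Request ", "account_problem")]
```

A label that is missing from the map, such as `account_problem`, falls back to the default tone.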


In [34]:
# Refund-related message
previous = "I returned my item a week ago and haven't seen anything happen yet."
current = "Where is my refund? This is taking too long."

# Embed current message and retrieve top docs
current_embedding = embed_text(current)
kb_df["Similarity"] = kb_df["Embedding"].apply(
    lambda x: cosine_similarity([current_embedding], [x])[0][0]
)
top_docs = kb_df.sort_values(by="Similarity", ascending=False).head(3)
retrieved_docs = top_docs["Text"].tolist()

# Generate reply with memory and tone
response = generate_rag_db_reply(current, retrieved_docs=retrieved_docs, previous_message=previous)

# Display full output
import json
#print(json.dumps(response, indent=2))
print(response)

{'previous_message': "I returned my item a week ago and haven't seen anything happen yet.", 'current_message': 'Where is my refund? This is taking too long.', 'category': 'refund_request', 'tone': 'apologetic', 'reply': "I understand your frustration. To look into the status of your refund, I'll need the invoice number associated with the returned item. Could you please provide that?", 'needs_human_review': True, 'function_called': None, 'function_args': None, 'function_result': None, 'docs_used': ['Refunds usually take 5–7 business days after item is received.', 'You can request a refund within 30 days of purchase if the item is unused.', 'For payment issues, try using a different card or contact your bank.'], 'tokens_used': 27, 'evaluation': {'score': 5, 'comment': ': "Excellent. The response acknowledges the customer\'s frustration, which is important in this scenario. It clearly states what information is needed (invoice number) to resolve the issue. It is polite and sets expectati

### 🛡️ Fallback Handling Test: Invalid Invoice

In this example, the user provides an invalid invoice number.

The function lookup should fail, returning `status: "not_found"`.

Gemini should:
- Acknowledge the issue kindly
- Ask the user to confirm the invoice ID
- Avoid guessing or making up data
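
A minimal sketch of a lookup that reports a missing invoice explicitly instead of raising an exception, so the model has a clear `not_found` signal to work with. `FAKE_INVOICES` and `lookup_invoice` are hypothetical and the records are made up; the notebook's real tools and data are defined earlier.

```python
# Hypothetical in-memory "database" for illustration only.
FAKE_INVOICES = {
    "INV-001": {"status": "refunded", "amount": 150.0},
    "INV-002": {"status": "paid", "amount": 200.0},
}


def lookup_invoice(invoice_id: str) -> dict:
    """Return the invoice record, or an explicit not_found marker instead of raising."""
    record = FAKE_INVOICES.get(invoice_id.strip().upper())
    if record is None:
        return {"invoice_id": invoice_id, "status": "not_found"}
    return {"invoice_id": invoice_id, **record}


missing = lookup_invoice("INV-999")
present = lookup_invoice("INV-002")
```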


In [35]:
# Simulate a bad invoice number
previous = "I ordered a speaker and returned it last week."
current = "Can you check the status of invoice INV-999?"

# Embed and retrieve top docs
current_embedding = embed_text(current)
kb_df["Similarity"] = kb_df["Embedding"].apply(
    lambda x: cosine_similarity([current_embedding], [x])[0][0]
)
top_docs = kb_df.sort_values(by="Similarity", ascending=False).head(3)
retrieved_docs = top_docs["Text"].tolist()

# Call the agent with a bad invoice
response = generate_rag_db_reply(current, retrieved_docs=retrieved_docs, previous_message=previous)

# Display result
import json
#print(json.dumps(response, indent=2))
print(response)

{'previous_message': 'I ordered a speaker and returned it last week.', 'current_message': 'Can you check the status of invoice INV-999?', 'category': 'general_question', 'tone': 'neutral', 'reply': "Dear Customer,\n\nThank you for your message. I checked for invoice INV-999, but it doesn't seem to exist in our system. Could you please double-check the invoice number and provide it to me?\n\nKind regards,\nSupport Team", 'needs_human_review': True, 'function_called': None, 'function_args': None, 'function_result': None, 'docs_used': ['Refunds usually take 5–7 business days after item is received.', 'For payment issues, try using a different card or contact your bank.', "To reset your password, go to the login page and click 'Forgot Password'."], 'tokens_used': 37, 'evaluation': {'score': 5, 'comment': ': "Excellent. The AI correctly used the tool to check for the invoice and, upon not finding it, politely informed the customer and asked them to double-check the number. The response is p

### 🆘 Escalation Test: Vague & Frustrated User Message

This test checks whether the agent can detect when a message is unclear or requires human attention.

---

**Previous message:**  
> This has been dragging on forever and I still don't know what's happening.

**Current message:**  
> You people are no help at all. Someone better get back to me ASAP.

This message is vague, emotionally charged, and lacks specific information.

The agent should:
- Stay polite and calm
- NOT guess or invent data
- Set `"needs_human_review": true` so a human can follow up
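
One way to express an escalation rule is as a small function over the result dictionary. In the notebook the flag is set directly inside `generate_rag_db_reply`; the sketch below is a hypothetical alternative that combines two signals, missing tool data and a low or missing self-evaluation score. `should_escalate` and the `min_score` threshold of 4 are assumptions.

```python
def should_escalate(result: dict, min_score: int = 4) -> bool:
    """Flag a reply for human review when it is ungrounded or poorly rated."""
    score = (result.get("evaluation") or {}).get("score")
    no_tool_data = result.get("function_result") is None
    low_or_missing_score = score is None or score < min_score
    return no_tool_data or low_or_missing_score


grounded = {"function_result": {"status": "paid"}, "evaluation": {"score": 5}}
ungrounded = {"function_result": None, "evaluation": {"score": 5}}
weak = {"function_result": {"status": "paid"}, "evaluation": {"score": 2}}
flags = [should_escalate(r) for r in (grounded, ungrounded, weak)]
```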


In [36]:
previous = "This has been dragging on forever and I still don't know what's happening."
current = "You people are no help at all. Someone better get back to me ASAP."

# Embed and retrieve top docs
current_embedding = embed_text(current)
kb_df["Similarity"] = kb_df["Embedding"].apply(
    lambda x: cosine_similarity([current_embedding], [x])[0][0]
)
top_docs = kb_df.sort_values(by="Similarity", ascending=False).head(3)
retrieved_docs = top_docs["Text"].tolist()

# Generate the agent's reply
response = generate_rag_db_reply(current, retrieved_docs=retrieved_docs, previous_message=previous)

# Show the output clearly
import json
print(json.dumps(response, indent=2))


{
  "previous_message": "This has been dragging on forever and I still don't know what's happening.",
  "current_message": "You people are no help at all. Someone better get back to me ASAP.",
  "category": "feedback",
  "tone": "neutral",
  "reply": "I understand your frustration and apologize for the delay. To assist you better, could you please provide your invoice number or order ID? With that information, I can look into the specific details of your situation and provide a status update.",
  "needs_human_review": true,
  "function_called": null,
  "function_args": null,
  "function_result": null,
  "docs_used": [
    "For payment issues, try using a different card or contact your bank.",
    "To unsubscribe from our newsletter, click the link at the bottom of any email.",
    "If the app crashes, try clearing cache or reinstalling it."
  ],
  "tokens_used": 41,
  "evaluation": {
    "score": 5,
    "comment": ": \"Excellent. Acknowledges the customer's frustration and apologizes. 

## Step 5: Logging with MLOps-Style Output

In this final step, we implement lightweight MLOps logging to track how the agent performs over time.

For each user interaction, we log the following into a CSV file:
- ⏱️ Timestamp of the request  
- 📨 The original user message  
- 🏷️ The predicted category  
- 💬 The generated reply  
- 🔧 Any database function called (and its arguments)  
- ✅ Gemini’s self-evaluation score and comment  
- 🧑‍⚖️ Whether the reply needs human review (set by the agent's rules: no tool was called, or the Gemini call failed)  
- 🔢 Approximate token usage, measured as the reply's word count (to help estimate cost and verbosity)


---

This gives us everything we need to:
- Audit how the agent makes decisions  
- Spot weak replies (low evaluation scores)  
- Track usage patterns (e.g. high-priority issues, frequent queries)  
- Improve prompt strategies and tools based on real behavior

All logs are saved to `reply_logs.csv` — ready for export, dashboards, or future fine-tuning.
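
As an illustration of the "spot weak replies" use case, the sketch below filters a log for low evaluation scores using only the standard library. The two sample rows are made up, only a subset of the columns is shown, and `weak_replies` with its `max_score` threshold is a hypothetical helper.

```python
import csv
import io

# Hypothetical two-row log using a subset of the reply_logs.csv columns.
sample_log = io.StringIO(
    "timestamp,category,needs_human_review,evaluation_score\n"
    "2025-04-17T14:45:19,account_problem,True,5\n"
    "2025-04-17T14:50:02,refund_request,False,2\n"
)


def weak_replies(log_file, max_score=3):
    """Yield rows whose evaluation score is present and at or below max_score."""
    for row in csv.DictReader(log_file):
        score = row.get("evaluation_score") or ""
        if score.strip() and float(score) <= max_score:
            yield row


flagged = list(weak_replies(sample_log))
```

With a real log, `open("reply_logs.csv", newline="", encoding="utf-8")` would take the place of the in-memory sample.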


In [37]:
import csv
from datetime import datetime
import os

def log_to_csv(log_file: str, result: dict, user_message: str):
    """
    Logs a reply result to a CSV file.
    Creates the file with headers if it doesn't exist.
    """
    fieldnames = [
        "timestamp",
        "user_message",
        "previous_message",
        "category",
        "tone",
        "priority",
        "needs_human_review",
        "reply",
        "function_called",
        "function_args",
        "function_result",
        "docs_used",
        "evaluation_score",
        "evaluation_comment",
        "tokens_used"
    ]

    row = {
        "timestamp": datetime.now().isoformat(),
        "user_message": user_message,
        "previous_message": result.get("previous_message"),
        "category": result.get("category"),
        "tone": result.get("tone"),
        "priority": result.get("priority"),
        "needs_human_review": result.get("needs_human_review"),
        "reply": result.get("reply"),
        "function_called": result.get("function_called"),
        "function_args": str(result.get("function_args")),
        "function_result": str(result.get("function_result")),
        "docs_used": " | ".join(result.get("docs_used", [])),
        "evaluation_score": result.get("evaluation", {}).get("score"),
        "evaluation_comment": result.get("evaluation", {}).get("comment"),
        "tokens_used": result.get("tokens_used")
    }

    file_exists = os.path.isfile(log_file)

    with open(log_file, mode="a", newline='', encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if not file_exists:
            writer.writeheader()
        writer.writerow(row)


Let's test it!

In [38]:
test_msg = "Why is my invoice still showing as unpaid? I paid last week!"
previous_msg = "I already emailed you about this before."

# Embed, retrieve docs
test_embedding = embed_text(test_msg)
kb_df["Similarity"] = kb_df["Embedding"].apply(lambda x: cosine_similarity([test_embedding], [x])[0][0])
retrieved = kb_df.sort_values(by="Similarity", ascending=False).head(3)["Text"].tolist()

# Generate reply
reply = generate_rag_db_reply(test_msg, retrieved, previous_message=previous_msg)

# Log it to CSV in visible directory
log_to_csv("/kaggle/working/reply_logs.csv", reply, test_msg)

# Show reply
import json
print(json.dumps(reply, indent=2))


{
  "previous_message": "I already emailed you about this before.",
  "current_message": "Why is my invoice still showing as unpaid? I paid last week!",
  "category": "account_problem",
  "tone": "neutral",
  "reply": "I understand your concern. To investigate why your invoice is showing as unpaid, I'll need the invoice number. Could you please provide it?",
  "needs_human_review": true,
  "function_called": null,
  "function_args": null,
  "function_result": null,
  "docs_used": [
    "For payment issues, try using a different card or contact your bank.",
    "Refunds usually take 5\u20137 business days after item is received.",
    "To reset your password, go to the login page and click 'Forgot Password'."
  ],
  "tokens_used": 23,
  "evaluation": {
    "score": 5,
    "comment": ": \"Excellent. Acknowledges the customer's concern and clearly states that the invoice number is needed to proceed. The response is polite and efficient."
  }
}


In [39]:
import pandas as pd

log_df = pd.read_csv("/kaggle/working/reply_logs.csv")
log_df.tail()  # Show last few rows




Unnamed: 0,timestamp,user_message,previous_message,category,tone,priority,needs_human_review,reply,function_called,function_args,function_result,docs_used,evaluation_score,evaluation_comment,tokens_used
0,2025-04-17T14:45:19.088830,Why is my invoice still showing as unpaid? I p...,I already emailed you about this before.,account_problem,neutral,,True,I understand your concern. To investigate why ...,,,,"For payment issues, try using a different card...",5,": ""Excellent. Acknowledges the customer's conc...",23


## Step 6: Final Unified Pipeline

This is the full, end-to-end pipeline — the one function that ties it all together.

The agent will:
- 🏷️ Classify the support message into one of five categories
- 🚦 Detect urgency (priority) using hybrid logic + Gemini
- 🔍 Retrieve the top 3 relevant KB docs (RAG via embeddings)
- 🧠 Add context from previous user messages (memory)
- 🛠️ Call live tools like `get_invoice_by_id()` for real data
- ✍️ Generate a structured, tone-controlled reply with Gemini
- ✅ Self-evaluate the response for helpfulness and coverage
- 🧾 Log everything into a CSV file for future analysis

Just give it a subject and body — and the agent handles the rest from A to Z.


In [40]:
def support_agent_pipeline_gemini(subject: str, body: str, previous_message: str = None) -> dict:
    """
    End-to-end GenAI support agent pipeline:
    - Classify message category
    - Detect priority
    - Retrieve documents (RAG)
    - Use memory (previous message)
    - Call tools if needed
    - Generate a structured, tone-controlled reply
    - Evaluate the response
    - Log to CSV
    """
    email_text = f"{subject.strip()} {body.strip()}"

    # Step 1: Classify message
    category = classify_email_with_gemini(email_text)
    print(f"📂 Detected Category: {category}")

    # Step 2: Priority detection
    priority = classify_priority_hybrid(email_text, category)
    print(f"🚦 Priority Level: {priority}")

    # Step 3: Retrieve relevant support docs
    embedding = embed_text(email_text)
    kb_df["Similarity"] = kb_df["Embedding"].apply(lambda x: cosine_similarity([embedding], [x])[0][0])
    retrieved_docs = kb_df.sort_values(by="Similarity", ascending=False).head(3)["Text"].tolist()

    print("\n🔍 Retrieved Context:")
    for doc in retrieved_docs:
        print("-", doc)

    # Step 4: Generate reply (includes memory, function calling, tone control)
    result = generate_rag_db_reply(email_text, retrieved_docs, previous_message=previous_message)
    result["priority"] = priority  # Add to output for logging and analysis

    # Step 5: Display final reply
    print("\n🤖 Final Reply:")
    print(result["reply"])

    # Step 6: Log to CSV
    log_to_csv("/kaggle/working/reply_logs.csv", result, email_text)

    return result


In [41]:
# Final test email: Refund request with frustration and no invoice ID (triggers memory + escalation logic)
subject = "Still no refund??"
body = "I returned the product over a week ago and haven’t seen any money back. This is getting really frustrating."

previous_msg = "I already sent a message last week about this but got no proper answer."

result = support_agent_pipeline_gemini(subject, body, previous_message=previous_msg)



📂 Detected Category: refund_request
🚦 Priority Level: high

🔍 Retrieved Context:
- Refunds usually take 5–7 business days after item is received.
- You can request a refund within 30 days of purchase if the item is unused.
- For payment issues, try using a different card or contact your bank.

🤖 Final Reply:
I understand your frustration and apologize for the continued delay. To look into this further, could you please provide the invoice number associated with your return?


## ✅ Conclusion: From Prototype to Business-Ready Blueprint

This notebook shows how far you can go with GenAI — even without fine-tuning or heavy infrastructure.

What started as a smart auto-reply system is now a full agent that:
- Understands the request
- Retrieves the right documents
- Queries a live database
- Controls tone and memory
- Evaluates its own output
- And logs everything like a real backend system

It's lightweight, transparent, and extendable — the kind of MVP you'd build to prove value fast.

There’s still more I could add — like a feedback loop for human reviews, better dataset coverage, or a real deployment backend.

Maybe I’ll come back to it 😄

---

## 📚 Learn More

Want to see how this project evolved step by step?

### 🔗 Final Version (this notebook)  
- 📖 [Article: How I Built a Smarter GenAI Support Agent](https://decryptai.substack.com/p/how-i-built-a-smarter-genai-customer)  
- 🎥 [Video Demo](https://youtu.be/LHebeTt_JtA)

### 🔗 Original Version (V1 MVP)  
- 📖 [Article: Building a Customer Support Agent](https://decryptai.substack.com/p/building-a-customer-support-agent)  
- 🎥 [Project Walkthrough](https://youtu.be/O9rHu1t8fTM)

---

## 💡 About This Project

Built as part of the Kaggle x Google GenAI Capstone.  
By [Amal Nozieres](https://www.linkedin.com/in/amalnozieres)

- 🧠 Blog: [DecryptAI on Substack](https://decryptai.substack.com)  
- 💻 Repo: [GitHub – gemini-customer-support-agent](https://github.com/AmelNozieres/gemini-customer-support-agent)
