<a href="https://colab.research.google.com/github/shrikantvarma/AgenticAI/blob/main/Adaptive_RAG_final.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Summary of Learnings: Comparing Three RAG Approaches

This notebook demonstrated and evaluated three approaches to RAG: Traditional, Agentic, and Adaptive (Hybrid). The key learnings regarding their performance and quality tradeoffs are summarized below, including observed metrics from our experiment:

**Observed Performance and Quality Metrics:**

| Metric            | Traditional RAG (Simple Query Example) | Agentic RAG (Complex Query Example) | Adaptive (Hybrid) RAG (Average/Routing) |
| :---------------- | :------------------------------------- | :---------------------------------- | :-------------------------------------- |
| LLM Score (1-5)   | 4                                      | 5                                   | Varies by query complexity (4 for simple, 5 for complex) |
| Latency (ms)      | ~3400 (e.g., "Where can I find triggers?") | ~18000 - 21000 (e.g., "My trigger isn't firing...") | Depends on routing (faster for simple, slower for complex) |
| LLM Calls         | 1                                      | 6                                   | 1 (for simple) or 6 (for complex)       |
| Total Tokens      | ~230                                   | ~1700 - 1750                        | ~230 (for simple) or ~1700-1750 (for complex) |

*Note: The exact numbers in the table are illustrative based on typical runs observed in this notebook. Adaptive RAG metrics depend on the router's decision.*
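
The size of the gap between the two paths is easier to see as ratios. The sketch below is only arithmetic on the illustrative figures in the table (midpoints are used where the table gives a range); the dictionary names are made up for this example.

```python
# Illustrative values read from the table above; ranges are replaced by midpoints.
traditional = {"latency_ms": 3400, "llm_calls": 1, "total_tokens": 230}
agentic = {
    "latency_ms": (18000 + 21000) / 2,
    "llm_calls": 6,
    "total_tokens": (1700 + 1750) / 2,
}

def ratio(metric: str) -> float:
    """How many times larger the Agentic value is than the Traditional one."""
    return agentic[metric] / traditional[metric]

for metric in traditional:
    print(f"{metric}: Agentic is roughly {ratio(metric):.1f}x Traditional")
```

On these figures, the Agentic path costs several times more than the Traditional path in both latency and tokens for a single query.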

**Traditional RAG:**
*   **Quality:** Provides general, directional answers based on retrieved documents. Effective for simple queries but lacks depth and specific diagnosis for complex troubleshooting. Achieved lower scores from the LLM judge on complex queries compared to Agentic RAG.
*   **Latency:** Significantly faster due to a single retrieval and single LLM call.
*   **Token Usage (Cost):** Uses the fewest tokens per query, resulting in lower estimated cost.

**Agentic RAG:**
*   **Quality:** Provides specific, in-depth diagnoses and actionable steps by analyzing system state (settings and logs) alongside retrieved documents. Achieved higher scores from the LLM judge on complex troubleshooting queries.
*   **Latency:** Slower due to multiple LLM calls and sequential reasoning steps.
*   **Token Usage (Cost):** Uses significantly more tokens per query due to multiple LLM interactions, resulting in higher estimated cost.

**Adaptive (Hybrid) RAG:**
*   **Quality:** Selects the appropriate RAG method based on query complexity, aiming to deliver the best quality for each type (specific diagnosis for complex, general info for simple).
*   **Latency:** Balances speed by using the faster Traditional RAG for simple queries and accepting the higher latency of Agentic RAG only when necessary for complex queries. Overall latency depends on the mix of query types.
*   **Token Usage (Cost):** Balances cost by using the lower-token Traditional RAG for simple queries and incurring the higher token cost of Agentic RAG only for complex queries. Overall token usage depends on the mix of query types.
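
The dependence on query mix can be written as a weighted average. This is a rough sketch, assuming the illustrative per-path figures from the table above (midpoints for ranges) and a hypothetical 70/30 split between simple and complex queries. The keyword router in this notebook makes no LLM calls, so it contributes no tokens of its own.

```python
def blended(simple_value: float, complex_value: float, simple_share: float) -> float:
    """Expected per-query value when `simple_share` of traffic takes the Traditional path."""
    if not 0.0 <= simple_share <= 1.0:
        raise ValueError("simple_share must be between 0 and 1")
    return simple_share * simple_value + (1.0 - simple_share) * complex_value

# Hypothetical traffic mix: 70% simple queries, 30% complex queries.
simple_share = 0.7
avg_tokens = blended(230, 1725, simple_share)
avg_latency_ms = blended(3400, 19500, simple_share)

print(f"Average tokens per query: {avg_tokens:.1f}")
print(f"Average latency per query: {avg_latency_ms:.0f} ms")
```

Changing `simple_share` shows how quickly the averages move toward the Agentic figures as complex queries become more common.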

**Overall Learning:**
The choice of RAG architecture involves a clear tradeoff between speed/cost and the depth/specificity of the answer. Adaptive RAG provides a practical solution by intelligently routing queries to leverage the strengths of both Traditional and Agentic approaches, optimizing resource usage and user experience across varying query complexities. For real-world applications, understanding the nature of user queries is crucial for setting the complexity threshold and fine-tuning the router.

# The Experiment: Traditional, Agentic, and Adaptive/Hybrid RAG

**A Practical Comparison with ChromaDB and OpenAI**

This notebook explores **Adaptive/Hybrid Retrieval-Augmented Generation (RAG)**, demonstrating how it can intelligently combine the strengths of **Traditional RAG** and **Agentic RAG** approaches. By routing queries based on their complexity, Adaptive RAG aims to optimize the balance between **latency** (speed) and **quality** (accuracy and depth of analysis) for different types of user questions.

We use an **Admin trigger troubleshooting use case** to illustrate these concepts, leveraging:
- **ChromaDB**: A vector database for efficient storage and retrieval of troubleshooting knowledge.
- **OpenAI GPT-4o-mini**: A powerful language model for generating responses and executing diagnostic steps.

The scenario is as follows:
1. There is a trigger in the system that failed.
2. There is a document that explains how to troubleshoot triggers.
3. The trigger setting and logs are available to the LLM.
4. We use LLMs to provide answers.

* * *

## Understanding the RAG Approaches

- **Traditional RAG**:
  - **Process**: Simple, single-step approach. Retrieves relevant documents from a vector database based on the user's query and uses an LLM to generate an answer based *only* on the retrieved context from the troubleshooting guide. **It does not analyze the specific trigger settings or logs.**
  - **Characteristics**: Generally **fast** and **low cost** due to minimal LLM interaction. Provides **generic answers** without analyzing specific system states.

- **Agentic RAG**:
  - **Process**: A multi-step reasoning process that makes multiple LLM calls. When a user submits a query, relevant chunks are retrieved from the vector database and sent to the LLM with a request for a diagnostic plan based on the troubleshooting guide. The agent then runs each check with the help of the LLM, analyzing the provided trigger settings and logs, and records the finding from each check before synthesizing a final answer. In this notebook's implementation the plan is requested and parsed, but the four checks that run are fixed in code.
  - **Characteristics**: **Thorough** and provides a **specific solution** by analyzing the actual system state. However, it is **slower** and **more costly** due to the increased number of steps and LLM interactions.

- **Adaptive RAG**:
  - **Process**: Introduces an intelligent router that assesses the **complexity** of the incoming user query. Simple queries are directed to the faster Traditional RAG path, while complex queries requiring deeper analysis are sent to the more thorough Agentic RAG path.
  - **Characteristics**: Aims to achieve the **best balance** between speed and quality by using the most appropriate method for each query type.

* * *

## Demonstration and Comparison

The notebook demonstrates these approaches by:
1. Setting up a knowledge base in ChromaDB and simulating trigger state data.
2. Implementing and running both Traditional and Agentic RAG methods on a troubleshooting query.
3. Comparing their performance metrics (latency, LLM calls, token usage) and output quality, **including evaluation of output quality using an LLM as a judge.**
4. Implementing a simple Hybrid RAG router and testing how it routes different types of queries.
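
Step 3 relies on an LLM judge that returns a 1-5 score. A judge's reply is free text, so the score has to be extracted before it can be tabulated. The sketch below assumes the judge is asked to reply with a single integer; `parse_judge_score` is a hypothetical helper for illustration, not a function from this notebook.

```python
import re
from typing import Optional

def parse_judge_score(reply: str) -> Optional[int]:
    """Return the first standalone digit from 1 to 5 in the reply, or None if there is none."""
    match = re.search(r"\b([1-5])\b", reply)
    return int(match.group(1)) if match else None

# Replies of the kind a judge model might produce (made-up examples).
for reply in ["4", "Score: 5", "I would rate this a 3.", "excellent", "10"]:
    print(repr(reply), "->", parse_judge_score(reply))
```

Returning `None` instead of guessing keeps unparseable replies visible, so they can be retried or excluded from averages.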

* * *

## Key Takeaways

- No single RAG method is ideal for all situations; there's an inherent **latency vs quality tradeoff**.
- Traditional RAG is effective for quick information retrieval (simple queries).
- Agentic RAG is powerful for complex problem-solving requiring state analysis.
- **Hybrid RAG** provides a practical solution to this tradeoff by dynamically choosing the optimal approach per query.
- Effective **query complexity assessment** is fundamental to a successful Hybrid RAG implementation.

* * *

In [141]:
# ============================================================================
# CELL 1: Install Dependencies
# ============================================================================

!pip install chromadb openai python-dotenv -q

print("✓ Dependencies installed")

✓ Dependencies installed


In [142]:
# ============================================================================
# CELL 2: Setup and Imports
# ============================================================================

import os
import json
import time
from typing import Dict, List, Any
import chromadb
from chromadb.config import Settings
from openai import OpenAI

from google.colab import userdata
openai_api_key = userdata.get('OPENAI_API_KEY').strip()
client = OpenAI(api_key=openai_api_key)


print("✓ Imports loaded")
print("✓ OpenAI client initialized")

✓ Imports loaded
✓ OpenAI client initialized


In [143]:
# ============================================================================
# CELL 3: Create Knowledge Base
# ============================================================================

# Troubleshooting guide as a single string
troubleshooting_guide_text = """Trigger Disabled Issue: If trigger is not firing, first check if the trigger is enabled.
Go to Settings > Automation > Triggers and look for the toggle switch. It should be ON/green.
Disabled triggers never fire regardless of conditions.

Condition Matching: Triggers only fire when conditions match ticket data.
With ALL logic, every condition must be true. With ANY logic, at least one condition must be true.
Compare your trigger conditions against the actual ticket field values carefully.

Logic Types Explained: ALL logic means every condition must match (AND).
For example, status=new AND priority=high means both must be true.
ANY logic means at least one condition must match (OR).
Common mistake: using ALL when you meant ANY.

Execution Logs: Check trigger logs to see which tickets were evaluated,
whether the trigger fired, and the specific reason it didn't fire.
Logs are found in Settings > Automation > Logs and show execution history with error details.
"""

# Split the guide into individual documents based on paragraphs
troubleshooting_docs_content = troubleshooting_guide_text.strip().split('\n\n')

# Create a list of document dictionaries with IDs and metadata
troubleshooting_docs = [
    {
        "id": f"doc_{i+1}",
        "content": content,
        "metadata": {"category": "troubleshooting"} # Using a generic category for now
    }
    for i, content in enumerate(troubleshooting_docs_content)
]


# Trigger state (actual settings and logs)
trigger_state = {
    "trigger_settings": {
        "id": "trigger_001",
        "name": "Auto-assign high priority tickets",
        "enabled": True,
        "conditions": [
            {"field": "status", "operator": "equals", "value": "new"},
            {"field": "priority", "operator": "equals", "value": "high"}
        ],
        "logic": "ALL",
        "actions": ["assign_to_team_a"]
    },
    "execution_logs": [
        {
            "ticket_id": "TKT_123",
            "timestamp": "2025-01-15T10:30:00Z",
            "fired": False,
            "reason": "Condition mismatch: priority is 'medium', expected 'high'"
        },
        {
            "ticket_id": "TKT_124",
            "timestamp": "2025-01-15T11:15:00Z",
            "fired": True,
            "actions_executed": ["Assigned to Team A"]
        },
        {
            "ticket_id": "TKT_125",
            "timestamp": "2025-01-15T14:22:00Z",
            "fired": False,
            "reason": "Condition mismatch: status is 'open', expected 'new'"
        }
    ],
    "recent_tickets": [
        {"id": "TKT_123", "status": "new", "priority": "medium"},
        {"id": "TKT_124", "status": "new", "priority": "high"},
        {"id": "TKT_125", "status": "open", "priority": "high"}
    ]
}

print("✓ Knowledge base created")
print(f"  - {len(troubleshooting_docs)} troubleshooting documents")
print(f"  - Trigger state with {len(trigger_state['execution_logs'])} logs")

✓ Knowledge base created
  - 4 troubleshooting documents
  - Trigger state with 3 logs
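
The ALL/ANY rules in the guide can also be checked deterministically, which gives a reference point for judging the LLM's diagnosis. The sketch below copies the conditions and recent tickets from the cell above; `ticket_matches` is a hypothetical helper and only supports the `equals` operator used in this example.

```python
from typing import Dict, List

def ticket_matches(ticket: Dict[str, str], conditions: List[Dict[str, str]], logic: str) -> bool:
    """Evaluate 'equals' conditions against a ticket using ALL (AND) or ANY (OR) logic."""
    checks = []
    for cond in conditions:
        if cond["operator"] != "equals":
            raise ValueError(f"Unsupported operator: {cond['operator']}")
        checks.append(ticket.get(cond["field"]) == cond["value"])
    if logic == "ALL":
        return all(checks)
    if logic == "ANY":
        return any(checks)
    raise ValueError(f"Unsupported logic type: {logic}")

# Same conditions and tickets as in trigger_state above.
conditions = [
    {"field": "status", "operator": "equals", "value": "new"},
    {"field": "priority", "operator": "equals", "value": "high"},
]
recent_tickets = [
    {"id": "TKT_123", "status": "new", "priority": "medium"},
    {"id": "TKT_124", "status": "new", "priority": "high"},
    {"id": "TKT_125", "status": "open", "priority": "high"},
]

matches_all = [t["id"] for t in recent_tickets if ticket_matches(t, conditions, "ALL")]
matches_any = [t["id"] for t in recent_tickets if ticket_matches(t, conditions, "ANY")]
print("ALL:", matches_all)
print("ANY:", matches_any)
```

With ALL logic only TKT_124 matches, which agrees with the execution logs: the trigger fired for that ticket and reported a condition mismatch for the other two.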


In [144]:
# ============================================================================
# CELL 4: Setup ChromaDB Vector Database
# ============================================================================

# Initialize ChromaDB client
chroma_client = chromadb.Client(Settings(
    anonymized_telemetry=False,
    allow_reset=True
))

# Reset to start fresh
chroma_client.reset()

# Create collection
collection = chroma_client.create_collection(
    name="trigger_troubleshooting",
    metadata={"description": "Admin trigger troubleshooting knowledge base"}
)

# Add documents to collection
collection.add(
    documents=[doc["content"] for doc in troubleshooting_docs],
    ids=[doc["id"] for doc in troubleshooting_docs],
    metadatas=[doc["metadata"] for doc in troubleshooting_docs]
)

print("✓ ChromaDB initialized")
print(f"  - Collection: {collection.name}")
print(f"  - Documents: {collection.count()}")

✓ ChromaDB initialized
  - Collection: trigger_troubleshooting
  - Documents: 4


In [145]:
# ============================================================================
# CELL 5: Helper Functions for LLM and Retrieval
# ============================================================================

from typing import List, Dict # Import List and Dict

def call_llm(prompt: str, model: str = "gpt-4o-mini", max_tokens: int = 500):
    """Call OpenAI API and return text and token usage"""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
        temperature=0.3
    )
    return {
        "text": response.choices[0].message.content,
        "usage": response.usage # Includes prompt_tokens and completion_tokens
    }

def retrieve_docs(query: str, n_results: int = 2) -> List[Dict]:
    """Retrieve relevant documents from ChromaDB"""
    results = collection.query(
        query_texts=[query],
        n_results=n_results
    )

    docs = []
    for i in range(len(results['ids'][0])):
        docs.append({
            'id': results['ids'][0][i],
            'content': results['documents'][0][i],
            'metadata': results['metadatas'][0][i],
            'distance': results['distances'][0][i] if results.get('distances') else None
        })
    return docs

print("✓ Helper functions ready")

✓ Helper functions ready


In [146]:
# ============================================================================
# CELL 6: Traditional RAG Implementation
# ============================================================================

from typing import List, Dict # Import List and Dict

class TraditionalRAG:
    """
    Traditional RAG: Single retrieval + single generation
    - Retrieve relevant docs from vector DB
    - Generate answer in one LLM call
    - Fast but generic (no state analysis)
    """

    def answer(self, query: str) -> Dict:
        start_time = time.time()
        steps = []
        total_prompt_tokens = 0
        total_completion_tokens = 0

        # Step 1: Retrieve relevant docs
        docs = retrieve_docs(query, n_results=2)
        steps.append(f"Retrieved {len(docs)} documents from ChromaDB")

        # Step 2: Generate answer
        context = "\n\n".join([doc['content'] for doc in docs])
        prompt = f"""You are a helpful admin assistant. Answer the user's question based on this troubleshooting guide.

Question: {query}

Troubleshooting Guide:
{context}

Provide a helpful answer:"""

        llm_response = call_llm(prompt)
        answer = llm_response["text"]
        total_prompt_tokens += llm_response["usage"].prompt_tokens
        total_completion_tokens += llm_response["usage"].completion_tokens
        steps.append("Generated answer with LLM")

        elapsed = time.time() - start_time

        return {
            "answer": answer,
            "method": "Traditional RAG",
            "retrieval_calls": 1,
            "llm_calls": 1,
            "docs_retrieved": len(docs),
            "latency_ms": round(elapsed * 1000, 2),
            "steps": steps,
            "retrieved_docs": docs,
            "prompt_tokens": total_prompt_tokens,
            "completion_tokens": total_completion_tokens,
            "total_tokens": total_prompt_tokens + total_completion_tokens
        }

trad_rag = TraditionalRAG()
print("✓ Traditional RAG ready")

✓ Traditional RAG ready


In [147]:
# ============================================================================
# CELL 7: Agentic RAG Implementation
# ============================================================================

from typing import List, Dict # Import List and Dict

class AgenticRAG:
    """
    Agentic RAG: Multi-step reasoning with state analysis
    - Retrieve troubleshooting guide
    - Create diagnostic plan (LLM call #1)
    - Execute state checks (LLM calls #2-5)
    - Synthesize diagnosis (LLM call #6)
    - Slower but provides specific diagnosis
    """

    def answer(self, query: str, state: Dict = None) -> Dict:
        start_time = time.time()
        steps = []
        llm_calls = 0
        retrieval_calls = 0
        total_prompt_tokens = 0
        total_completion_tokens = 0


        if state is None:
            state = trigger_state

        # Step 1: Retrieve troubleshooting docs
        docs = retrieve_docs(query, n_results=3)
        steps.append(f"Retrieved {len(docs)} documents from ChromaDB")
        retrieval_calls += 1

        context = "\n\n".join([doc['content'] for doc in docs])

        # Step 2: Create diagnostic plan (LLM call #1)
        plan_prompt = f"""Based on the user's question and troubleshooting guide, what specific checks should we run?

Question: {query}

Guide:
{context}

Return ONLY a JSON array of 4 specific checks to run against the trigger state.
Example format: ["check one", "check two", "check three", "check four"]
DO NOT include any other text or formatting outside the JSON array."""

        plan_llm_response = call_llm(plan_prompt, max_tokens=200)
        plan_response = plan_llm_response["text"]
        total_prompt_tokens += plan_llm_response["usage"].prompt_tokens
        total_completion_tokens += plan_llm_response["usage"].completion_tokens
        llm_calls += 1
        steps.append("Created diagnostic plan")

        try:
            diagnostic_plan = json.loads(plan_response)
        except Exception as e:
            print(f"Error: Diagnostic plan could not be parsed. Response was: {plan_response}")
            print(f"Parsing error: {e}")
            # Fallback plan if parsing fails
            diagnostic_plan = ["Check enabled status", "Analyze execution logs", "Compare conditions with tickets", "Verify logic type"]
            steps.append("Using fallback diagnostic plan due to parsing error")


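        # Step 3: Execute each diagnostic check
        # Note: the four checks below are fixed; diagnostic_plan is parsed above
        # but is not used to choose or order them.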
        findings = []

        # Check 1: Is enabled?
        check_prompt = f"""Check if trigger is enabled.

Trigger settings: {json.dumps(state['trigger_settings'], indent=2)}

Answer in one sentence: Is the trigger enabled?"""
        check1_llm_response = call_llm(check_prompt, max_tokens=50)
        finding = check1_llm_response["text"]
        total_prompt_tokens += check1_llm_response["usage"].prompt_tokens
        total_completion_tokens += check1_llm_response["usage"].completion_tokens
        findings.append(f"Enabled status: {finding}")
        llm_calls += 1
        steps.append("Check #1: Verified enabled status")

        # Check 2: Analyze logs
        logs_prompt = f"""Analyze these execution logs.

Logs: {json.dumps(state['execution_logs'], indent=2)}

Answer in 2-3 sentences: What do the logs show about trigger firing?"""
        check2_llm_response = call_llm(logs_prompt, max_tokens=100)
        finding = check2_llm_response["text"]
        total_prompt_tokens += check2_llm_response["usage"].prompt_tokens
        total_completion_tokens += check2_llm_response["usage"].completion_tokens
        findings.append(f"Log analysis: {finding}")
        llm_calls += 1
        steps.append("Check #2: Analyzed execution logs")

        # Check 3: Compare conditions
        conditions_prompt = f"""Compare trigger conditions against actual tickets.

Conditions: {json.dumps(state['trigger_settings']['conditions'], indent=2)}
Recent tickets: {json.dumps(state['recent_tickets'], indent=2)}

Answer in 2-3 sentences: Which tickets match the conditions?"""
        check3_llm_response = call_llm(conditions_prompt, max_tokens=150)
        finding = check3_llm_response["text"]
        total_prompt_tokens += check3_llm_response["usage"].prompt_tokens
        total_completion_tokens += check3_llm_response["usage"].completion_tokens
        findings.append(f"Condition matching: {finding}")
        llm_calls += 1
        steps.append("Check #3: Compared conditions vs tickets")

        # Check 4: Verify logic type
        logic_prompt = f"""Explain the logic type.

Logic type: {state['trigger_settings']['logic']}
Conditions: {json.dumps(state['trigger_settings']['conditions'], indent=2)}

Answer in 1-2 sentences: What does this logic type mean?"""
        check4_llm_response = call_llm(logic_prompt, max_tokens=100)
        finding = check4_llm_response["text"]
        total_prompt_tokens += check4_llm_response["usage"].prompt_tokens
        total_completion_tokens += check4_llm_response["usage"].completion_tokens
        findings.append(f"Logic type: {finding}")
        llm_calls += 1
        steps.append("Check #4: Verified logic type")


        # Step 4: Synthesize final diagnosis (LLM call #6)
        synthesis_prompt = f"""Based on all the findings, provide a specific diagnosis.

User question: {query}

Findings:
{chr(10).join(f'{i+1}. {f}' for i, f in enumerate(findings))}

Provide a specific, actionable answer that:
1. Explains if the trigger is working correctly or not
2. Gives evidence from the actual state
3. Explains why certain tickets didn't fire
4. Suggests concrete next steps if needed"""

        synthesis_llm_response = call_llm(synthesis_prompt, max_tokens=600)
        answer = synthesis_llm_response["text"]
        total_prompt_tokens += synthesis_llm_response["usage"].prompt_tokens
        total_completion_tokens += synthesis_llm_response["usage"].completion_tokens
        llm_calls += 1
        steps.append("Synthesized final diagnosis")

        elapsed = time.time() - start_time

        return {
            "answer": answer,
            "method": "Agentic RAG",
            "retrieval_calls": retrieval_calls,
            "llm_calls": llm_calls,
            "docs_retrieved": len(docs),
            "latency_ms": round(elapsed * 1000, 2),
            "steps": steps,
            "findings": findings,
            "retrieved_docs": docs,
            "prompt_tokens": total_prompt_tokens,
            "completion_tokens": total_completion_tokens,
            "total_tokens": total_prompt_tokens + total_completion_tokens
        }

agentic_rag = AgenticRAG()
print("✓ Agentic RAG ready")

✓ Agentic RAG ready
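
The plan step asks the model for a bare JSON array, but chat models sometimes wrap JSON in a markdown code fence or return a different JSON shape. Below is a minimal sketch of a more tolerant parser, assuming the same fallback-on-failure behaviour as the cell above; `parse_plan` and `FENCE` are hypothetical names, not part of the notebook.

```python
import json
from typing import List

FENCE = "`" * 3  # markdown code-fence marker

def parse_plan(raw: str, fallback: List[str]) -> List[str]:
    """Parse a reply expected to be a JSON array of strings; return `fallback` otherwise."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag such as "json").
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # Drop the closing fence if present.
        if text.rstrip().endswith(FENCE):
            text = text.rstrip()[: -len(FENCE)]
    try:
        plan = json.loads(text)
    except json.JSONDecodeError:
        return fallback
    # Accept only a non-empty list made up entirely of strings.
    if isinstance(plan, list) and plan and all(isinstance(item, str) for item in plan):
        return plan
    return fallback

fallback_plan = ["Check enabled status", "Analyze execution logs"]
print(parse_plan('["check one", "check two"]', fallback_plan))
print(parse_plan("Sure! Here are the checks...", fallback_plan))
```

Validating the parsed value matters as much as catching the parse error: a JSON object or a list of numbers is valid JSON but not a usable plan.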


In [148]:
# ============================================================================
# CELL 8: Hybrid RAG Router
# ============================================================================

class HybridRAG:
    """
    Hybrid RAG: Intelligent routing
    - Simple queries → Traditional RAG
    - Complex queries → Agentic RAG
    """

    def __init__(self, complexity_threshold: float = 0.5):
        self.traditional = TraditionalRAG()
        self.agentic = AgenticRAG()
        self.threshold = complexity_threshold

    def assess_complexity(self, query: str) -> float:
        """Assess query complexity (0.0 = simple, 1.0 = complex)"""
        query_lower = query.lower()

        # Simple queries
        if any(word in query_lower for word in ["where", "what is", "how to find"]):
            return 0.3

        # Complex queries
        if any(word in query_lower for word in ["why", "not working", "not firing", "issue"]):
            return 0.8

        return 0.5

    def answer(self, query: str, verbose: bool = True) -> Dict:
        complexity = self.assess_complexity(query)

        if verbose:
            print(f"\n{'='*70}")
            print(f"Query: {query}")
            print(f"{'='*70}")
            print(f"Complexity: {complexity:.2f} (threshold: {self.threshold})")

        if complexity < self.threshold:
            if verbose:
                print("→ Routing to: Traditional RAG (simple query)\n")
            result = self.traditional.answer(query)
        else:
            if verbose:
                print("→ Routing to: Agentic RAG (complex query)\n")
            result = self.agentic.answer(query)

        result['complexity'] = complexity
        result['original_query'] = query  # Add the original query to the result dictionary
        return result

hybrid = HybridRAG(complexity_threshold=0.5)
print("✓ Hybrid RAG router ready")

✓ Hybrid RAG router ready
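
One limitation of plain substring matching is that contractions slip past the keyword lists: "isn't firing" does not contain "not firing", so such a query falls through to the default score of 0.5. The sketch below is one possible refinement, not the notebook's router. It keeps the same keywords and scores but expands "n't" to " not" before matching; `assess_complexity_normalized` is a hypothetical name.

```python
SIMPLE_KEYWORDS = ["where", "what is", "how to find"]
COMPLEX_KEYWORDS = ["why", "not working", "not firing", "issue"]

def assess_complexity_normalized(query: str) -> float:
    """Score a query (0.3 simple, 0.8 complex, 0.5 unknown) after expanding n't contractions."""
    # Normalize curly apostrophes, then turn "isn't" into "is not", "doesn't" into "does not", etc.
    normalized = query.lower().replace("\u2019", "'").replace("n't", " not")
    if any(keyword in normalized for keyword in SIMPLE_KEYWORDS):
        return 0.3
    if any(keyword in normalized for keyword in COMPLEX_KEYWORDS):
        return 0.8
    return 0.5

for q in [
    "Where can I find triggers?",
    "My trigger isn't firing, what's wrong?",
    "Why did trigger fire for some tickets but not others?",
]:
    print(f"{assess_complexity_normalized(q):.1f}  {q}")
```

With a threshold of 0.5 the routing decision for these three queries is unchanged, but the second query no longer depends on the tie-breaking behaviour at exactly 0.5.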


In [149]:
# ============================================================================
# CELL 9: Run Comparison Demo
# ============================================================================

def compare_methods(query: str):
    """Compare Traditional vs Agentic RAG"""

    print("\n" + "="*70)
    print("COMPARISON: Traditional RAG vs Agentic RAG")
    print("="*70)
    print(f"Query: {query}\n")

    # Traditional
    print("─"*70)
    print("METHOD 1: Traditional RAG")
    print("─"*70)
    trad_result = trad_rag.answer(query)

    print(f"\nSteps:")
    for i, step in enumerate(trad_result['steps'], 1):
        print(f"  {i}. {step}")

    print(f"\nAnswer:\n{trad_result['answer']}")
    print(f"\n📊 Performance:")
    print(f"   Retrieval calls: {trad_result['retrieval_calls']}")
    print(f"   LLM calls: {trad_result['llm_calls']}")
    print(f"   Latency: {trad_result['latency_ms']}ms")
    print(f"   Prompt tokens: {trad_result['prompt_tokens']}")
    print(f"   Completion tokens: {trad_result['completion_tokens']}")
    print(f"   Total tokens: {trad_result['total_tokens']}")


    # Agentic
    print("\n" + "─"*70)
    print("METHOD 2: Agentic RAG")
    print("─"*70)
    agentic_result = agentic_rag.answer(query)

    print(f"\nSteps:")
    for i, step in enumerate(agentic_result['steps'], 1):
        print(f"  {i}. {step}")

    print(f"\nAnswer:\n{agentic_result['answer']}")
    print(f"\n📊 Performance:")
    print(f"   Retrieval calls: {agentic_result['retrieval_calls']}")
    print(f"   LLM calls: {agentic_result['llm_calls']}")
    print(f"   Latency: {agentic_result['latency_ms']}ms")
    print(f"   Prompt tokens: {agentic_result['prompt_tokens']}")
    print(f"   Completion tokens: {agentic_result['completion_tokens']}")
    print(f"   Total tokens: {agentic_result['total_tokens']}")


    # Analysis
    print("\n" + "="*70)
    print("TRADEOFF ANALYSIS")
    print("="*70)
    speedup = agentic_result['latency_ms'] / trad_result['latency_ms']
    # Report LLM calls as a true ratio rather than the raw Agentic call count
    call_ratio = agentic_result['llm_calls'] / trad_result['llm_calls']
    print(f"⚡ Speed: Traditional is {speedup:.1f}x faster")
    print(f"🤖 LLM calls: Agentic uses {call_ratio:.0f}x as many")
    print("🎯 Quality: Agentic provides specific diagnosis with state analysis")
    print(f"💰 Cost: Agentic uses {agentic_result['total_tokens']} vs {trad_result['total_tokens']} tokens")


# Run comparison
compare_methods("Why isn't my trigger firing?")


COMPARISON: Traditional RAG vs Agentic RAG
Query: Why isn't my trigger firing?

──────────────────────────────────────────────────────────────────────
METHOD 1: Traditional RAG
──────────────────────────────────────────────────────────────────────

Steps:
  1. Retrieved 2 documents from ChromaDB
  2. Generated answer with LLM

Answer:
It sounds like your trigger might not be firing due to a couple of common issues. First, please check if the trigger is enabled. You can do this by going to **Settings > Automation > Triggers** and looking for the toggle switch next to your trigger. It should be ON (green) for it to fire; if it's OFF, simply switch it ON.

If the trigger is enabled, the next step is to ensure that the conditions set for the trigger are matching the actual ticket data. Remember that with **ALL** logic, every condition must be true for the trigger to fire, while with **ANY** logic, at least one condition must be true. Double-check the conditions against the ticket field va

In [150]:
# ============================================================================
# CELL 10: Test Hybrid Router
# ============================================================================

test_queries = [
    "Where can I find triggers?",
    "My trigger isn't firing, what's wrong?",
    "Why did trigger fire for some tickets but not others?",
]

print("\n" + "="*70)
print("HYBRID RAG ROUTER TEST")
print("="*70)

hybrid_test_results = [] # Initialize list to store results for evaluation

for query in test_queries:
    result = hybrid.answer(query, verbose=True)
    print(f"\nMethod chosen: {result['method']}")
    print(f"Final Answer:\n{result['answer']}")
    print(f"Performance: {result['llm_calls']} LLM calls, {result['latency_ms']}ms, {result['total_tokens']} tokens") # total_tokens is now at top level
    print()
    # Explicitly store the result including total_tokens
    hybrid_test_results.append({
        "Query": query, # Store the original query
        "Method Chosen": result['method'],
        "Answer": result['answer'],
        "LLM Calls": result['llm_calls'],
        "Latency (ms)": result['latency_ms'],
        "Total tokens": result['total_tokens'], # Store total_tokens
        "retrieved_docs": result.get('retrieved_docs', []) # Also store retrieved_docs for evaluation
    })

# The hybrid_test_results list is now populated and can be used by Cell 11


HYBRID RAG ROUTER TEST

Query: Where can I find triggers?
Complexity: 0.30 (threshold: 0.5)
→ Routing to: Traditional RAG (simple query)


Method chosen: Traditional RAG
Final Answer:
You can find triggers by navigating to the following path: **Settings > Automation > Triggers**. Here, you can view all your triggers and check if they are enabled. Make sure the toggle switch for each trigger is ON (green) to ensure they can fire as expected. If you're troubleshooting a specific trigger, you can also check the execution logs by going to **Settings > Automation > Logs** to see which tickets were evaluated and if there were any issues with the trigger firing.
Performance: 1 LLM calls, 3425.17ms, 230 tokens


Query: My trigger isn't firing, what's wrong?
Complexity: 0.50 (threshold: 0.5)
→ Routing to: Agentic RAG (complex query)


Method chosen: Agentic RAG
Final Answer:
### Diagnosis of Trigger Issue

1. **Is the Trigger Working Correctly?**
   Yes, the trigger is functioning correctly ba

In [151]:
# ============================================================================
# CELL 11: Evaluate and Score Hybrid RAG Answers
# ============================================================================

from typing import List, Dict
import pandas as pd
import time
import json
import chromadb # Import chromadb
from chromadb.config import Settings # Import Settings
from openai import OpenAI # Import OpenAI
from google.colab import userdata # Import userdata

# Assume these are defined in previous cells and accessible
# If not, you might need to re-run the setup cells or copy their content
# For robustness in a single cell, we'll redefine them here for clarity
openai_api_key = userdata.get('OPENAI_API_KEY').strip()
client = OpenAI(api_key=openai_api_key)

# Initialize ChromaDB client and collection (redefine for robustness)
chroma_client = chromadb.Client(Settings(
    anonymized_telemetry=False,
    allow_reset=True
))

# Assume the collection "trigger_troubleshooting" already exists and is populated
# If running this cell independently, you would need to create and populate it.
# For this context, we assume it's ready from previous cells.
collection = chroma_client.get_collection(name="trigger_troubleshooting")


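def call_llm(prompt: str, model: str = "gpt-4o-mini", max_tokens: int = 10) -> Dict: # Lower max_tokens for scoring
    """Call OpenAI API and return response and token usage.

    Note: this redefinition replaces the Cell 5 helper of the same name. The
    "text" and "usage" keys are kept so that TraditionalRAG and AgenticRAG still
    work if they are called after this cell. The default max_tokens is now 10,
    so callers that need a full answer must pass max_tokens explicitly.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
        temperature=0.3
    )
    return {
        "text": response.choices[0].message.content,  # key read by the RAG classes
        "usage": response.usage,  # usage object read by the RAG classes
        "content": response.choices[0].message.content,
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens
    }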

def retrieve_docs(query: str, n_results: int = 2) -> List[Dict]:
    """Retrieve relevant documents from ChromaDB"""
    results = collection.query(
        query_texts=[query],
        n_results=n_results
    )

    docs = []
    if results and results['ids'] and results['ids'][0]:
        for i in range(len(results['ids'][0])):
            docs.append({
                'id': results['ids'][0][i],
                'content': results['documents'][0][i],
                # .get() avoids indexing into None when a field was not returned by the query
                'metadata': results['metadatas'][0][i] if results.get('metadatas') else {},
                'distance': results['distances'][0][i] if results.get('distances') else None
            })
    return docs

# Redefine trigger_state for accessibility
trigger_state = {
    "trigger_settings": {
        "id": "trigger_001",
        "name": "Auto-assign high priority tickets",
        "enabled": True,
        "conditions": [
            {"field": "status", "operator": "equals", "value": "new"},
            {"field": "priority", "operator": "equals", "value": "high"}
        ],
        "logic": "ALL",
        "actions": ["assign_to_team_a"]
    },
    "execution_logs": [
        {
            "ticket_id": "TKT_123",
            "timestamp": "2025-01-15T10:30:00Z",
            "fired": False,
            "reason": "Condition mismatch: priority is 'medium', expected 'high'"
        },
        {
            "ticket_id": "TKT_124",
            "timestamp": "2025-01-15T11:15:00Z",
            "fired": True,
            "actions_executed": ["Assigned to Team A"]
        },
        {
            "ticket_id": "TKT_125",
            "timestamp": "2025-01-15T14:22:00Z",
            "fired": False,
            "reason": "Condition mismatch: status is 'open', expected 'new'"
        }
    ],
    "recent_tickets": [
        {"id": "TKT_123", "status": "new", "priority": "medium"},
        {"id": "TKT_124", "status": "new", "priority": "high"},
        {"id": "TKT_125", "status": "open", "priority": "high"}
    ]
}


# Redefine test_queries in this cell for accessibility
test_queries = [
    "Where can I find triggers?",
    "My trigger isn't firing, what's wrong?",
    "Why did trigger fire for some tickets but not others?",
]

# Redefine TraditionalRAG for accessibility
class TraditionalRAG:
    """
    Traditional RAG: Single retrieval + single generation
    - Retrieve relevant docs from vector DB
    - Generate answer in one LLM call
    - Fast but generic (no state analysis)
    """

    def answer(self, query: str) -> Dict:
        start_time = time.time()
        steps = []
        total_llm_tokens = 0

        # Step 1: Retrieve relevant docs
        # Ensure 'collection' and 'retrieve_docs' are available (defined in previous cells)
        # In a notebook environment, these should persist after execution
        docs = retrieve_docs(query, n_results=2)
        steps.append(f"Retrieved {len(docs)} documents from ChromaDB")

        # Step 2: Generate answer
        context = "\n\n".join([doc['content'] for doc in docs])
        prompt = f"""You are a helpful admin assistant. Answer the user's question based on this troubleshooting guide.

Question: {query}

Troubleshooting Guide:
{context}

Provide a helpful answer:"""

        # Ensure 'call_llm' is available (defined in previous cell)
        llm_response = call_llm(prompt, max_tokens=500) # Use appropriate max tokens for answer generation
        answer_content = llm_response["content"] # Access using 'content'
        total_llm_tokens += llm_response["total_tokens"]
        steps.append("Generated answer with LLM")

        elapsed = time.time() - start_time

        return {
            "answer": answer_content,
            "method": "Traditional RAG",
            "retrieval_calls": 1,
            "llm_calls": 1,
            "docs_retrieved": len(docs),
            "latency_ms": round(elapsed * 1000, 2),
            "steps": steps,
            "retrieved_docs": docs,
            "total_tokens": total_llm_tokens,
            "original_query": query # Store original query
        }

# Redefine AgenticRAG for accessibility
class AgenticRAG:
    """
    Agentic RAG: Multi-step reasoning with state analysis
    - Retrieve troubleshooting guide
    - Create diagnostic plan (LLM call #1)
    - Execute state checks (LLM calls #2-5)
    - Synthesize diagnosis (LLM call #6)
    - Slower but provides specific diagnosis
    """

    def answer(self, query: str, state: Dict = None) -> Dict:
        start_time = time.time()
        steps = []
        llm_calls = 0
        retrieval_calls = 0
        total_llm_tokens = 0

        # Ensure 'trigger_state' is available (defined in a previous cell)
        if state is None:
            state = trigger_state

        # Step 1: Retrieve troubleshooting docs
        # Ensure 'retrieve_docs' is available
        docs = retrieve_docs(query, n_results=3)
        steps.append(f"Retrieved {len(docs)} documents from ChromaDB")
        retrieval_calls += 1

        context = "\n\n".join([doc['content'] for doc in docs])

        # Step 2: Create diagnostic plan (LLM call #1)
        # Ensure 'call_llm' is available
        plan_prompt = f"""Based on the user's question and troubleshooting guide, what specific checks should we run?

Question: {query}

Guide:
{context}

Return ONLY a JSON array of 4 specific checks to run against the trigger state.
Example format: ["check one", "check two", "check three", "check four"]
DO NOT include any other text or formatting outside the JSON array."""

        plan_response = call_llm(plan_prompt, max_tokens=200)
        llm_calls += 1
        total_llm_tokens += plan_response["total_tokens"]
        steps.append("Created diagnostic plan")

        try:
            diagnostic_plan = json.loads(plan_response["content"]) # Access using 'content'
        except Exception as e:
            print(f"Error: Diagnostic plan could not be parsed. Response was: {plan_response['content']}")
            print(f"Parsing error: {e}")
            # Fallback plan if parsing fails
            diagnostic_plan = ["Check enabled status", "Analyze execution logs", "Compare conditions with tickets", "Verify logic type"]
            steps.append("Using fallback diagnostic plan due to parsing error")


        # Step 3: Execute the diagnostic checks
        # Note: the four checks below are fixed; diagnostic_plan is not used to select them.
        findings = []

        # Check 1: Is enabled?
        check_prompt = f"""Check if trigger is enabled.

Trigger settings: {json.dumps(state['trigger_settings'], indent=2)}

Answer in one sentence: Is the trigger enabled?"""
        finding_response = call_llm(check_prompt, max_tokens=50)
        findings.append(f"Enabled status: {finding_response['content']}") # Access using 'content'
        llm_calls += 1
        total_llm_tokens += finding_response["total_tokens"]
        steps.append("Check #1: Verified enabled status")

        # Check 2: Analyze logs
        logs_prompt = f"""Analyze these execution logs.

Logs: {json.dumps(state['execution_logs'], indent=2)}

Answer in 2-3 sentences: What do the logs show about trigger firing?"""
        finding_response = call_llm(logs_prompt, max_tokens=100)
        findings.append(f"Log analysis: {finding_response['content']}") # Access using 'content'
        llm_calls += 1
        total_llm_tokens += finding_response["total_tokens"]
        steps.append("Check #2: Analyzed execution logs")

        # Check 3: Compare conditions
        conditions_prompt = f"""Compare trigger conditions against actual tickets.

Conditions: {json.dumps(state['trigger_settings']['conditions'], indent=2)}
Recent tickets: {json.dumps(state['recent_tickets'], indent=2)}

Answer in 2-3 sentences: Which tickets match the conditions?"""
        finding_response = call_llm(conditions_prompt, max_tokens=150)
        findings.append(f"Condition matching: {finding_response['content']}") # Access using 'content'
        llm_calls += 1
        total_llm_tokens += finding_response["total_tokens"]
        steps.append("Check #3: Compared conditions vs tickets")

        # Check 4: Verify logic type
        logic_prompt = f"""Explain the logic type.

Logic type: {state['trigger_settings']['logic']}
Conditions: {json.dumps(state['trigger_settings']['conditions'], indent=2)}

Answer in 1-2 sentences: What does this logic type mean?"""
        finding_response = call_llm(logic_prompt, max_tokens=100)
        findings.append(f"Logic type: {finding_response['content']}") # Access using 'content'
        llm_calls += 1
        total_llm_tokens += finding_response["total_tokens"]
        steps.append("Check #4: Verified logic type")


        # Step 4: Synthesize final diagnosis (LLM call #6)
        synthesis_prompt = f"""Based on all the findings, provide a specific diagnosis.

User question: {query}

Findings:
{chr(10).join(f'{i+1}. {f}' for i, f in enumerate(findings))}

Provide a specific, actionable answer that:
1. Explains if the trigger is working correctly or not
2. Gives evidence from the actual state
3. Explains why certain tickets didn't fire
4. Suggests concrete next steps if needed"""

        answer_response = call_llm(synthesis_prompt, max_tokens=600)
        answer_content = answer_response["content"] # Access using 'content'
        llm_calls += 1
        total_llm_tokens += answer_response["total_tokens"]
        steps.append("Synthesized final diagnosis")

        elapsed = time.time() - start_time

        return {
            "answer": answer_content,
            "method": "Agentic RAG",
            "retrieval_calls": retrieval_calls,
            "llm_calls": llm_calls,
            "docs_retrieved": len(docs),
            "latency_ms": round(elapsed * 1000, 2),
            "steps": steps,
            "findings": findings,
            "retrieved_docs": docs,
            "total_tokens": total_llm_tokens,
            "original_query": query # Store original query
        }

# Redefine HybridRAG in this cell for accessibility
class HybridRAG:
    """
    Hybrid RAG: Intelligent routing
    - Simple queries → Traditional RAG
    - Complex queries → Agentic RAG
    """

    def __init__(self, complexity_threshold: float = 0.5):
        # Check if TraditionalRAG and AgenticRAG are defined globally
        # In a notebook environment, they are often defined in previous cells
        # and should be accessible here after execution.
        self.traditional = TraditionalRAG()
        self.agentic = AgenticRAG()
        self.threshold = complexity_threshold

    def assess_complexity(self, query: str) -> float:
        """Assess query complexity (0.0 = simple, 1.0 = complex)"""
        query_lower = query.lower()

        # Simple queries
        if any(word in query_lower for word in ["where", "what is", "how to find"]):
            return 0.3

        # Complex queries
        if any(word in query_lower for word in ["why", "not working", "not firing", "issue"]):
            return 0.8

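        # No keyword matched. With the default threshold of 0.5 this score is
        # not below the threshold, so such queries are routed to Agentic RAG.
        return 0.5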

    def answer(self, query: str, verbose: bool = True) -> Dict:
        complexity = self.assess_complexity(query)

        if verbose:
            print(f"\n{'='*70}")
            print(f"Query: {query}")
            print(f"{'='*70}")
            print(f"Complexity: {complexity:.2f} (threshold: {self.threshold})")

        if complexity < self.threshold:
            if verbose:
                print("→ Routing to: Traditional RAG (simple query)\n")
            result = self.traditional.answer(query)
        else:
            if verbose:
                print("→ Routing to: Agentic RAG (complex query)\n")
            result = self.agentic.answer(query)

        result['complexity'] = complexity
        result['original_query'] = query  # Add the original query to the result dictionary
        return result

# Create the hybrid object
hybrid = HybridRAG(complexity_threshold=0.5)


def evaluate_answer_with_llm(query: str, generated_answer: str, retrieved_docs: List[Dict], method: str) -> int:
    """
    Evaluate the generated answer using an LLM (acting as a judge).
    Scores the answer from 1-5 based on helpfulness and completeness relative to the query and retrieved context.
    """
    context = "\n\n".join([doc['content'] for doc in retrieved_docs])

    evaluation_prompt = f"""You are an impartial judge evaluating the quality of an AI-generated answer for a user query, based on provided context.

User Query: {query}

Retrieved Context (Troubleshooting Guide Snippets):
{context}

AI Generated Answer ({method}):
{generated_answer}

Evaluate the AI generated answer based on the following criteria (score 1-5):
1: Not helpful, irrelevant or incomplete.
2: Minimally helpful, provides some relevant information but is vague or misses key points.
3: Partially helpful, provides relevant information but is generic or lacks depth/specifics required by the query (especially for complex troubleshooting queries).
4: Helpful and relevant, addresses the query well and provides general troubleshooting steps based on the guide, but does NOT analyze the specific state or provide a specific diagnosis for the user's exact situation.
5: Very helpful, complete, accurate, specific, and actionable. For complex troubleshooting queries, the answer MUST analyze the provided state information (like trigger settings and logs) to provide a specific diagnosis of the user's exact problem and suggest concrete next steps based on that diagnosis. Answers that only provide general directions or steps the user needs to figure out themselves (without state analysis for complex queries) should NOT receive a score of 5.

Consider:
- Did the answer directly address the user's query?
- How well did it use the provided context?
- **For complex queries (like 'why isn't it firing', 'why did it fire for some but not others'), did it attempt a specific diagnosis based on state information (if Agentic) and provide actionable steps based on that specific diagnosis? This is key for scores 4 and 5.**
- Is the answer clear and easy to understand?

Provide ONLY a single integer score from 1 to 5. Do not include any other text or explanation.
"""

    # Use a reliable model for evaluation
    score_response = call_llm(evaluation_prompt, model="gpt-4o-mini", max_tokens=10) # Keep max_tokens low
    try:
        score = int(score_response['content'].strip()) # Access content from the dict
        score = max(1, min(5, score)) # Ensure score is between 1 and 5
    except Exception as e:
        print(f"Warning: Could not parse LLM score response: '{score_response['content']}'. Error: {e}. Assigning score 1.")
        score = 1 # Assign lowest score if parsing fails

    return score


print("\n" + "="*70)
print("ANSWER EVALUATION AND SCORING (LLM as Judge)")
print("="*70)

evaluation_results_llm = []

print("Evaluation in progress...")

# Assuming hybrid_test_results is populated from the previous cell
for result in hybrid_test_results: # Iterate through results from Cell 10
    # Evaluate using the LLM judge
    score = evaluate_answer_with_llm(
        query=result['Query'], # Access 'Query' from the stored result
        generated_answer=result['Answer'], # Access 'Answer' from the stored result
        retrieved_docs=result.get('retrieved_docs', []), # Access 'retrieved_docs' from the stored result
        method=result['Method Chosen'] # Access 'Method Chosen' from the stored result
    )

    evaluation_results_llm.append({
        "Query": result['Query'],
        "Method Chosen": result['Method Chosen'],
        "Score (1-5)": score,
        "LLM Calls (Method)": result['LLM Calls'], # Access 'LLM Calls' from the stored result
        "Latency (ms)": result['Latency (ms)'], # Access 'Latency (ms)' from the stored result
        "Total tokens": result['Total tokens'] # Access 'Total tokens' from the stored result
    })

# Display results in a table
eval_df = pd.DataFrame(evaluation_results_llm)

print("\n--- LLM Evaluation Results ---")
display(eval_df[['Query', 'Method Chosen', 'Score (1-5)', 'LLM Calls (Method)', 'Latency (ms)', 'Total tokens']])


ANSWER EVALUATION AND SCORING (LLM as Judge)
Evaluation in progress...

--- LLM Evaluation Results ---


| | Query | Method Chosen | Score (1-5) | LLM Calls (Method) | Latency (ms) | Total tokens |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 0 | Where can I find triggers? | Traditional RAG | 4 | 1 | 3425.17 | 230 |
| 1 | My trigger isn't firing, what's wrong? | Agentic RAG | 5 | 6 | 18080.08 | 1722 |
| 2 | Why did trigger fire for some tickets but not ... | Agentic RAG | 5 | 6 | 21288.01 | 1707 |
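
The judge call in Cell 11 converts the reply with `int(...)` and assigns a score of 1 whenever that fails, so a reply such as "Score: 4" would be recorded as the lowest score. A more tolerant parser might look like the minimal sketch below. `parse_judge_score` is a hypothetical helper, not part of the notebook, and it assumes that the first standalone digit from 1 to 5 in the reply is the intended score.

```python
import re


def parse_judge_score(reply: str, default: int = 1) -> int:
    """Return the first standalone digit 1-5 found in the judge's reply.

    Hypothetical helper (not in the notebook). Falls back to `default`
    when no such digit is present, mirroring the notebook's fallback of 1.
    """
    # Lookarounds reject digits that are part of a longer number such as "10".
    match = re.search(r"(?<!\d)[1-5](?!\d)", reply or "")
    return int(match.group()) if match else default


print(parse_judge_score("4"))
print(parse_judge_score("Score: 5"))
print(parse_judge_score("no score given"))
```

The first-digit assumption can misfire on replies that restate the scale before the score (for example "1-5 scale: 4"), so the prompt's instruction to return only a single integer still matters.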


## Forced Traditional RAG Run and Evaluation

Let's compare the LLM evaluation score for a complex query ("My trigger isn't firing, what's wrong?") when it is *forced* to use the Traditional RAG path, versus when it is routed to the Agentic RAG path (as per the Hybrid router's complexity assessment).
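
In the hybrid run, this query received a complexity of 0.50 rather than 0.8, because the keyword "not firing" does not match the contraction "isn't firing". It reached Agentic RAG only because scores that are not below the threshold are routed there. A contraction-aware variant of the keyword check could look like the sketch below. `assess_complexity_normalized` is a hypothetical name, and expanding "n't" to " not" is an assumption about how the query might be normalized, not something the notebook does.

```python
def assess_complexity_normalized(query: str) -> float:
    """Keyword-based complexity score with contractions expanded first.

    Hypothetical variant of HybridRAG.assess_complexity (not in the notebook).
    """
    # Normalize curly apostrophes, then expand "isn't" -> "is not", etc.
    # This is a rough expansion: "can't" becomes "ca not", which is harmless here.
    text = query.lower().replace("\u2019", "'").replace("n't", " not")

    # Same keyword lists and scores as the notebook's assess_complexity
    if any(kw in text for kw in ["where", "what is", "how to find"]):
        return 0.3
    if any(kw in text for kw in ["why", "not working", "not firing", "issue"]):
        return 0.8
    return 0.5


for q in ["Where can I find triggers?", "My trigger isn't firing, what's wrong?"]:
    print(f"{assess_complexity_normalized(q):.1f}  {q}")
```

With this normalization the query scores 0.8 and no longer depends on the tie at the threshold to be routed to Agentic RAG.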

In [152]:
# ============================================================================
# CELL 12: Force Traditional RAG vs Agentic RAG (for Complex Query)
# ============================================================================

print("\n" + "="*70)
print("FORCED TRADITIONAL RAG vs AGENTIC RAG (for Complex Query)")
print("="*70)

complex_query = "My trigger isn't firing, what's wrong?"

print(f"Query: '{complex_query}'\n")

# --- Run Traditional RAG ---
print("─"*70)
print("METHOD: Traditional RAG (Forced)")
print("─"*70)
# Create a new instance of TraditionalRAG for this specific run
new_trad_rag = TraditionalRAG()
forced_trad_result = new_trad_rag.answer(complex_query)

print("\nEvaluating Traditional RAG answer with LLM judge...")
forced_trad_score = evaluate_answer_with_llm(
    query=complex_query,
    generated_answer=forced_trad_result['answer'],
    retrieved_docs=forced_trad_result.get('retrieved_docs', []),
    method=forced_trad_result['method']
)

print(f"\n--- Traditional RAG Result ---")
print(f"Score (1-5): {forced_trad_score}")
print(f"LLM Calls: {forced_trad_result['llm_calls']}")
print(f"Latency (ms): {forced_trad_result['latency_ms']}")
print(f"Total tokens: {forced_trad_result['total_tokens']}")
print(f"Answer:\n{forced_trad_result['answer']}")


# --- Run Agentic RAG ---
print("\n" + "─"*70)
print("METHOD: Agentic RAG")
print("─"*70)
# Run Agentic RAG directly for comparison
new_agentic_rag = AgenticRAG()
agentic_result_for_query = new_agentic_rag.answer(complex_query)

print("\nEvaluating Agentic RAG answer with LLM judge...")
agentic_score_for_query = evaluate_answer_with_llm(
    query=complex_query,
    generated_answer=agentic_result_for_query['answer'],
    retrieved_docs=agentic_result_for_query.get('retrieved_docs', []),
    method=agentic_result_for_query['method']
)

print(f"\n--- Agentic RAG Result ---")
print(f"Score (1-5): {agentic_score_for_query}")
print(f"LLM Calls: {agentic_result_for_query['llm_calls']}")
print(f"Latency (ms): {agentic_result_for_query['latency_ms']}")
print(f"Total tokens: {agentic_result_for_query['total_tokens']}")
# Note: Displaying the full Agentic answer again might be redundant if it was shown in Cell 9 or Cell 10
# print(f"Answer:\n{agentic_result_for_query['answer']}")


# --- Comparison Summary ---
print("\n" + "="*70)
print("COMPARISON SUMMARY")
print("="*70)
print(f"Query: '{complex_query}'")
print("\n📊 Performance & Quality Comparison:")
print(f"  Traditional RAG:")
print(f"    Score: {forced_trad_score}")
print(f"    LLM Calls: {forced_trad_result['llm_calls']}")
print(f"    Latency: {forced_trad_result['latency_ms']}ms")
print(f"    Tokens: {forced_trad_result['total_tokens']}")
print(f"\n  Agentic RAG:")
print(f"    Score: {agentic_score_for_query}")
print(f"    LLM Calls: {agentic_result_for_query['llm_calls']}")
print(f"    Latency: {agentic_result_for_query['latency_ms']}ms")
print(f"    Tokens: {agentic_result_for_query['total_tokens']}")

print("\nTradeoffs:")
speedup = agentic_result_for_query['latency_ms'] / forced_trad_result['latency_ms']
print(f"⚡ Speed: Traditional is {speedup:.1f}x faster")
print(f"🤖 LLM calls: Agentic uses {agentic_result_for_query['llm_calls']} vs {forced_trad_result['llm_calls']} calls ({agentic_result_for_query['llm_calls'] / forced_trad_result['llm_calls']:.1f}x more)")
print(f"💰 Cost: Agentic uses {agentic_result_for_query['total_tokens']} vs {forced_trad_result['total_tokens']} tokens ({agentic_result_for_query['total_tokens'] / forced_trad_result['total_tokens']:.1f}x more)")
print(f"🎯 Quality (LLM Score): Agentic score {agentic_score_for_query} vs Traditional score {forced_trad_score}")


FORCED TRADITIONAL RAG vs AGENTIC RAG (for Complex Query)
Query: 'My trigger isn't firing, what's wrong?'

──────────────────────────────────────────────────────────────────────
METHOD: Traditional RAG (Forced)
──────────────────────────────────────────────────────────────────────

Evaluating Traditional RAG answer with LLM judge...

--- Traditional RAG Result ---
Score (1-5): 4
LLM Calls: 1
Latency (ms): 5410.2
Total tokens: 321
Answer:
It sounds like your trigger isn't firing, which can be frustrating. Here are a couple of things you can check:

1. **Trigger Status**: First, ensure that your trigger is enabled. Go to **Settings > Automation > Triggers** and look for the toggle switch next to your trigger. It should be ON (green). If it's off, simply toggle it to the ON position.

2. **Condition Matching**: Next, verify that the conditions set for your trigger are matching the actual ticket data. If you're using ALL logic, make sure that every condition is true for the trigger to fir

## Comparison Summary Table

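The forced comparison in Cell 12 for the complex query "My trigger isn't firing, what's wrong?" gives the key metrics below. The figures are approximate and vary between runs; the Cell 12 output captured above, for example, recorded 5410.2 ms and 321 tokens for Traditional RAG.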

| Metric          | Traditional RAG | Agentic RAG |
| :-------------- | :-------------- | :---------- |
| LLM Score (1-5) | 4               | 5           |
| LLM Calls       | 1               | 6           |
| Latency (ms)    | ~2370           | ~11880      |
| Total Tokens    | ~330            | ~1680       |

This table highlights the tradeoff for complex troubleshooting queries: Agentic RAG earns a one-point higher judge score (5 vs. 4) through a more specific diagnosis, but it uses 6 LLM calls instead of 1 and roughly five times the latency and tokens of Traditional RAG.
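
To make the tradeoff concrete, the ratios can be computed directly from the approximate values in the table. This is a small sketch: `ratio_summary` is a hypothetical helper, and the two dictionaries simply restate the table rather than any new measurement.

```python
# Approximate values copied from the comparison table (not new measurements)
traditional = {"score": 4, "llm_calls": 1, "latency_ms": 2370, "total_tokens": 330}
agentic = {"score": 5, "llm_calls": 6, "latency_ms": 11880, "total_tokens": 1680}


def ratio_summary(baseline: dict, candidate: dict) -> dict:
    """Return candidate/baseline ratios for the cost metrics, plus the score gain."""
    cost_keys = ["llm_calls", "latency_ms", "total_tokens"]
    summary = {key: round(candidate[key] / baseline[key], 1) for key in cost_keys}
    summary["score_gain"] = candidate["score"] - baseline["score"]
    return summary


print(ratio_summary(traditional, agentic))
# → {'llm_calls': 6.0, 'latency_ms': 5.0, 'total_tokens': 5.1, 'score_gain': 1}
```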