# 🤖 TestAgents.ipynb - LangGraph Multi-Agent System Testing

**Purpose**: Comprehensive testing framework for Cuttlefish4's LangGraph multi-agent RAG system  
**Status**: 🧪 Testing and Validation Framework  
**Created**: August 2025  

---

## 🎯 **Testing Scope**

This notebook systematically tests all components of the LangGraph multi-agent system:

### **🏗️ Core Components**
1. **AgentState Class** - Shared state management across all agents
2. **Common Utilities** - Content extraction, formatting, and helper functions

### **🤖 Individual Agents**
3. **BM25Agent** - Keyword-based search using BM25 algorithm
4. **ContextualCompressionAgent** - Fast semantic retrieval with reranking
5. **EnsembleAgent** - Multi-method comprehensive retrieval
6. **ResponseWriterAgent** - Final response generation using GPT-4o
7. **SupervisorAgent** - Intelligent query routing

### **⚡ LangGraph Workflow**
8. **Agent Node Functions** - LangGraph integration wrappers
9. **StateGraph Workflow** - Complete orchestration pipeline
10. **End-to-End Testing** - Real query processing with full agent chain

---

## 📋 **Prerequisites**

**Environment Variables Required**:
- `SUPABASE_URL`, `SUPABASE_KEY` - Database access
- `OPENAI_API_KEY` - For LLM operations and embeddings
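- `COHERE_API_KEY` - Optional, for advanced reranking
- `GOOGLE_PROJECT_ID`, `GOOGLE_APPLICATION_CREDENTIALS`, `TAVILY_API_KEY` - Also checked as required by the environment setup cell (Cell 1)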

**Data Requirements**:
- Populated Supabase tables: `bugs` and `pcr` with JIRA data
- Vector embeddings and full-text search indices
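A quick pre-flight check can report which of these variables are unset before any cell runs. The sketch below is a minimal illustration covering only the Supabase, OpenAI, and Cohere variables; `check_env`, `REQUIRED_VARS`, and `OPTIONAL_VARS` are hypothetical names, not part of the project code.

```python
import os
from typing import Dict, List, Mapping, Optional

# Variable names come from the prerequisites list; the helper itself is hypothetical.
REQUIRED_VARS = ("SUPABASE_URL", "SUPABASE_KEY", "OPENAI_API_KEY")
OPTIONAL_VARS = ("COHERE_API_KEY",)


def check_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, List[str]]:
    """Report required and optional variables that are unset or empty."""
    env = os.environ if env is None else env
    return {
        "missing_required": [name for name in REQUIRED_VARS if not env.get(name)],
        "missing_optional": [name for name in OPTIONAL_VARS if not env.get(name)],
    }


report = check_env()
if report["missing_required"]:
    print("Missing required variables:", ", ".join(report["missing_required"]))
if report["missing_optional"]:
    print("Optional variables not set:", ", ".join(report["missing_optional"]))
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to exercise without touching the real environment.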

---

## ⚡ **Quick Start**

1. **Run Cell 0** - Install dependencies
2. **Run Cells 1-2** - Environment setup and imports
3. **Run Cells 4-6** - Initialize components (vectorstore, agents)
4. **Run Cells 8-22** - Individual agent tests
5. **Run Cells 24-30** - LangGraph workflow tests

### 🔍 **Success Indicators**
- ✅ All agents initialize without errors
- ✅ Agent processing returns valid AgentState objects
- ✅ LangGraph workflow completes full execution
- ✅ Real query produces meaningful responses

---

*Execute cells in order for systematic testing of the entire multi-agent system.*

In [1]:
# Cell 0: Install dependencies
!pip install "sqlalchemy>=1.4.0,<2.0.0"
!pip install -q langgraph langsmith langchain-openai langchain-community langchain-cohere
!pip install -q supabase python-dotenv pandas tqdm rank_bm25


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [2]:
# Cell 1: Environment setup and path configuration
import os
import sys
from pathlib import Path

# Determine project root - we're in /app/agents/ and need to go up to project root
current_path = Path.cwd()
project_root = current_path.parent.parent  # Go up two levels: agents -> app -> cuttlefish4

app_dir = project_root / "app"
agents_dir = app_dir / "agents"  
tools_dir = app_dir / "tools"
rag_dir = app_dir / "rag"

# Add to Python path
for path in [str(project_root), str(app_dir), str(agents_dir), str(tools_dir), str(rag_dir)]:
    if path not in sys.path:
        sys.path.insert(0, path)

print("🔧 Path Configuration:")
print(f"   Project Root: {project_root}")
print(f"   App Directory: {app_dir}")
print(f"   Agents Directory: {agents_dir}")
print(f"   Current Directory: {current_path}")

# Load environment variables from project root
from dotenv import load_dotenv
env_file = project_root / ".env"
if env_file.exists():
    load_dotenv(str(env_file))
    print(f"✅ Loaded .env file from: {env_file}")
else:
    load_dotenv()  # Try current directory
    print("⚠️  Loading .env from current directory (project root .env not found)")

# Verify environment variables
required_vars = ['SUPABASE_URL', 'SUPABASE_KEY', 'OPENAI_API_KEY', 'GOOGLE_PROJECT_ID', 'GOOGLE_APPLICATION_CREDENTIALS', 'TAVILY_API_KEY']
missing_vars = [var for var in required_vars if not os.environ.get(var)]

if missing_vars:
    print(f"❌ Missing environment variables: {', '.join(missing_vars)}")
    print("Please set these in your .env file")
else:
    print("✅ All required environment variables found")
    
print(f"✅ Environment setup complete")

🔧 Path Configuration:
   Project Root: /Users/foohm/github/cuttlefish4
   App Directory: /Users/foohm/github/cuttlefish4/app
   Agents Directory: /Users/foohm/github/cuttlefish4/app/agents
   Current Directory: /Users/foohm/github/cuttlefish4/app/agents
✅ Loaded .env file from: /Users/foohm/github/cuttlefish4/.env
❌ Missing environment variables: GOOGLE_PROJECT_ID
Please set these in your .env file
✅ Environment setup complete
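The path setup above assumes the notebook is started from `app/agents`, so that two `parent` hops reach the project root. A less position-dependent alternative is to walk upward until a marker directory is found. The sketch below is only an illustration of that idea; `find_project_root` is a hypothetical helper, and using the `app` directory as the marker is an assumption based on the layout printed above.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, marker: str = "app") -> Optional[Path]:
    """Walk upward from `start` and return the first directory that contains `marker`.

    Returns None when no ancestor contains a directory with that name.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    return None


root = find_project_root(Path.cwd())
print(f"Detected project root: {root}")
```

With this approach the notebook could be launched from the project root, `app`, or `app/agents` and still resolve the same root, provided no nested directory also contains a folder named `app`.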


In [3]:
# Cell 2: Core imports
import json
import logging
from datetime import datetime
from typing import Dict, List, Any, Optional, TypedDict

# LangGraph and LangChain imports
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.documents import Document
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import SupabaseVectorStore

# Agent imports - using absolute imports for Jupyter notebook compatibility
try:
    # Try importing from the agents directory directly
    import sys
    import os
    
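    # Make sure we can import from the current directory.
    # os.path.abspath('') is the notebook's working directory (app/agents);
    # taking dirname() of it would point at app/ instead. A separate name
    # also avoids overwriting the agents_dir Path defined in Cell 1.
    notebook_dir = os.path.abspath('')
    if notebook_dir not in sys.path:
        sys.path.insert(0, notebook_dir)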
    
    # Import agent modules directly
    import common
    import bm25_agent
    import contextual_compression_agent
    import ensemble_agent
    import response_writer_agent
    import supervisor_agent
    
    # Import specific classes and functions
    from common import AgentState, measure_performance, extract_content_from_document, filter_empty_documents, format_context_for_llm, extract_ticket_info
    from bm25_agent import BM25Agent
    from contextual_compression_agent import ContextualCompressionAgent
    from ensemble_agent import EnsembleAgent
    from response_writer_agent import ResponseWriterAgent
    from supervisor_agent import SupervisorAgent
    
    print("✅ All agent imports successful")
    
except ImportError as e:
    print(f"❌ Agent import error: {e}")
    print("Trying alternative import approach...")
    
    try:
        # Alternative approach - add agents directory to path and import
        current_dir = os.getcwd()
        sys.path.insert(0, current_dir)
        
        from common import AgentState, measure_performance, extract_content_from_document, filter_empty_documents, format_context_for_llm, extract_ticket_info
        from bm25_agent import BM25Agent
        from contextual_compression_agent import ContextualCompressionAgent
        from ensemble_agent import EnsembleAgent
        from response_writer_agent import ResponseWriterAgent
        from supervisor_agent import SupervisorAgent
        
        print("✅ All agent imports successful (alternative method)")
        
    except ImportError as e2:
        print(f"❌ Alternative agent import also failed: {e2}")
        print("Please check that all agent files are present in the agents directory")
        print("Required files: common.py, bm25_agent.py, contextual_compression_agent.py,")
        print("               ensemble_agent.py, response_writer_agent.py, supervisor_agent.py")

# Supabase and vectorstore imports
try:
    from supabase import create_client, Client
    print("✅ Supabase imports successful")
except ImportError as e:
    print(f"❌ Supabase import error: {e}")

print("🚀 All imports completed")

✅ All agent imports successful
✅ Supabase imports successful
🚀 All imports completed


## 🧪 **Phase 1: Core Component Testing**

Testing the foundational components that all agents depend on.

In [4]:
# Cell 3: Test AgentState class and common utilities
print("🧪 Testing AgentState class and common utilities...")

# Test AgentState creation
try:
    test_state: AgentState = {
        'query': 'authentication error',
        'user_can_wait': True,
        'production_incident': False,
        'routing_decision': None,
        'routing_reasoning': None,
        'retrieved_contexts': [],
        'retrieval_method': None,
        'retrieval_metadata': {},
        'final_answer': None,
        'relevant_tickets': [],
        'messages': []
    }
    print("✅ AgentState creation successful")
    print(f"   Test query: '{test_state['query']}'")
    print(f"   Initial state keys: {list(test_state.keys())}")
except Exception as e:
    print(f"❌ AgentState creation failed: {e}")

# Test measure_performance
try:
    start_time = datetime.now()
    import time
    time.sleep(0.1)  # Small delay
    performance = measure_performance(start_time)
    print(f"✅ measure_performance working: {performance:.3f}s")
except Exception as e:
    print(f"❌ measure_performance failed: {e}")

# Test document utilities with mock data
try:
    # Create test document
    test_doc = Document(
        page_content="Original content",
        metadata={
            'title': 'Authentication Issue',
            'description': 'User cannot login to system',
            'key': 'HBASE-123',
            'id': 1
        }
    )
    
    # Test content extraction
    extracted_content = extract_content_from_document(test_doc)
    expected_format = "Title: Authentication Issue\nDescription: User cannot login to system"
    
    if extracted_content == expected_format:
        print("✅ extract_content_from_document working correctly")
        print(f"   Extracted: {extracted_content[:50]}...")
    else:
        print(f"⚠️  extract_content_from_document output unexpected:")
        print(f"   Got: {extracted_content}")
        print(f"   Expected: {expected_format}")
    
    # Test document filtering
    test_docs = [test_doc, Document(page_content="", metadata={})]  # Empty doc
    filtered_docs = filter_empty_documents(test_docs)
    print(f"✅ filter_empty_documents: {len(filtered_docs)}/{len(test_docs)} valid docs")
    
    # Test context formatting
    test_contexts = [{
        'content': extracted_content,
        'metadata': test_doc.metadata,
        'score': 0.8
    }]
    
    formatted_context = format_context_for_llm(test_contexts)
    print(f"✅ format_context_for_llm working: {len(formatted_context)} chars")
    
    # Test ticket extraction
    tickets = extract_ticket_info(test_contexts)
    print(f"✅ extract_ticket_info: found {len(tickets)} tickets")
    if tickets:
        print(f"   First ticket: {tickets[0]}")
    
except Exception as e:
    print(f"❌ Document utilities test failed: {e}")

print("🎯 Core component testing completed")

🧪 Testing AgentState class and common utilities...
✅ AgentState creation successful
   Test query: 'authentication error'
   Initial state keys: ['query', 'user_can_wait', 'production_incident', 'routing_decision', 'routing_reasoning', 'retrieved_contexts', 'retrieval_method', 'retrieval_metadata', 'final_answer', 'relevant_tickets', 'messages']
✅ measure_performance working: 0.103s
✅ extract_content_from_document working correctly
   Extracted: Title: Authentication Issue
Description: User cann...
✅ filter_empty_documents: 1/2 valid docs
✅ format_context_for_llm working: 80 chars
✅ extract_ticket_info: found 1 tickets
   First ticket: {'key': 'HBASE-123', 'title': 'Authentication Issue'}
🎯 Core component testing completed
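The same eleven-key state dictionary is rebuilt by hand in several test cells. A small factory keeps those cells consistent and gives every test its own list and dict instances. This is a sketch under assumptions: `AgentStateSketch` is a stand-in reconstructed from the keys printed above (the real definition lives in `common.py` and its value types may differ), and `make_initial_state` is a hypothetical helper.

```python
from typing import Any, Dict, List, Optional, TypedDict


class AgentStateSketch(TypedDict):
    """Stand-in for common.AgentState; value types are assumptions."""
    query: str
    user_can_wait: bool
    production_incident: bool
    routing_decision: Optional[str]
    routing_reasoning: Optional[str]
    retrieved_contexts: List[Dict[str, Any]]
    retrieval_method: Optional[str]
    retrieval_metadata: Dict[str, Any]
    final_answer: Optional[Any]
    relevant_tickets: List[Dict[str, Any]]
    messages: List[Any]


def make_initial_state(
    query: str,
    user_can_wait: bool = True,
    production_incident: bool = False,
) -> AgentStateSketch:
    """Build a fresh state; lists and dicts are created per call, never shared."""
    return {
        "query": query,
        "user_can_wait": user_can_wait,
        "production_incident": production_incident,
        "routing_decision": None,
        "routing_reasoning": None,
        "retrieved_contexts": [],
        "retrieval_method": None,
        "retrieval_metadata": {},
        "final_answer": None,
        "relevant_tickets": [],
        "messages": [],
    }


state = make_initial_state("authentication error")
print(sorted(state.keys()))
```

Creating a new state per test also sidesteps a subtlety of `dict.copy()`: it is a shallow copy, so list fields such as `messages` would otherwise be shared between the original and the copy.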


## 🏗️ **Phase 2: Initialize Components**

Setting up the vectorstore and LLM components needed by the agents.

In [5]:
# Cell 4: Initialize LLM components
print("🤖 Initializing LLM components...")

try:
    # Initialize LLMs
    rag_llm = ChatOpenAI(
        model="gpt-3.5-turbo",
        temperature=0.1,
        max_tokens=1000
    )
    
    supervisor_llm = ChatOpenAI(
        model="gpt-4o",
        temperature=0.1,
        max_tokens=500
    )
    
    response_writer_llm = ChatOpenAI(
        model="gpt-4o",
        temperature=0.3,
        max_tokens=2000
    )
    
    print("✅ LLMs initialized:")
    print("   • RAG LLM: GPT-3.5-turbo")
    print("   • Supervisor LLM: GPT-4o")
    print("   • Response Writer LLM: GPT-4o")
    
    # Test LLM connectivity
    test_response = rag_llm.invoke("Hello, this is a test message.")
    print(f"✅ LLM connectivity test successful: {len(test_response.content)} chars")
    
except Exception as e:
    print(f"❌ LLM initialization failed: {e}")
    print("Please check your OPENAI_API_KEY")

🤖 Initializing LLM components...
✅ LLMs initialized:
   • RAG LLM: GPT-3.5-turbo
   • Supervisor LLM: GPT-4o
   • Response Writer LLM: GPT-4o
✅ LLM connectivity test successful: 34 chars


In [6]:
# Cell 5: Initialize Supabase and vectorstore
print("🗄️ Initializing Supabase and vectorstore...")

try:
    # Initialize Supabase client
    supabase_url = os.environ.get('SUPABASE_URL')
    supabase_key = os.environ.get('SUPABASE_KEY')
    
    supabase_client: Client = create_client(supabase_url, supabase_key)
    print("✅ Supabase client initialized")
    
    # Initialize embeddings
    embeddings = OpenAIEmbeddings(
        model="text-embedding-3-small",
        dimensions=1536
    )
    print("✅ OpenAI embeddings initialized")
    
    # Use the same pattern as TestRAGTools.ipynb - import and use SupabaseRetriever directly
    try:
        # Import your custom Supabase retriever
        sys.path.insert(0, str(rag_dir))  # Make sure rag directory is in path
        from supabase_retriever import create_bugs_retriever
        
        # Create custom retriever (this works with your RPC functions)
        supabase_retriever = create_bugs_retriever()
        print("✅ SupabaseRetriever initialized (bugs collection)")
        
        # Test the custom retriever connectivity
        connection_test = supabase_retriever.test_connection()
        print(f"✅ Connection test: {'PASSED' if connection_test else 'FAILED'}")
        
        # Import required LangChain classes for proper inheritance
        from langchain_core.retrievers import BaseRetriever
        from langchain_core.callbacks import CallbackManagerForRetrieverRun
        from langchain_core.documents import Document
        from typing import List, Any
        from pydantic import Field
        
        # Proper LangChain-compatible retriever with Pydantic field validation
        class SupabaseRetrieverWrapper(BaseRetriever):
            """LangChain-compatible wrapper for SupabaseRetriever that inherits from BaseRetriever."""
            
            # Properly define the field for Pydantic v2
            supabase_retriever: Any = Field(description="The SupabaseRetriever instance")
            
            def __init__(self, supabase_retriever, **kwargs):
                super().__init__(supabase_retriever=supabase_retriever, **kwargs)
                
            def _get_relevant_documents(
                self, 
                query: str, 
                *, 
                run_manager: Optional[CallbackManagerForRetrieverRun] = None
            ) -> List[Document]:
                """Required method for BaseRetriever - this makes it a proper Runnable."""
                try:
                    # Use vector search from our SupabaseRetriever
                    results = self.supabase_retriever.vector_search(query, k=4)
                    documents = []
                    for result in results:
                        doc = Document(
                            page_content=result.get('content', ''),
                            metadata=result.get('metadata', {})
                        )
                        documents.append(doc)
                    return documents
                except Exception as e:
                    print(f"⚠️  Vector search failed in retriever wrapper: {e}")
                    # Last-resort fallback: the zero vector below carries no query information,
                    # so any rows it returns are not ranked by relevance to the query
                    try:
                        fallback_results = self.supabase_retriever._fallback_vector_search(
                            query_embedding=[0.0] * 1536,  # Dummy embedding
                            k=4, 
                            filters=None
                        )
                        documents = []
                        for result in fallback_results:
                            doc = Document(
                                page_content=result.get('content', ''),
                                metadata=result.get('metadata', {})
                            )
                            documents.append(doc)
                        return documents
                    except Exception as fallback_error:
                        print(f"⚠️  Fallback also failed: {fallback_error}")
                        # Return empty list rather than failing
                        return []
        
        # Simple wrapper class that implements the interfaces agents need
        class SimpleVectorStoreWrapper:
            """Simple wrapper that provides both vectorstore and retriever interfaces."""
            
            def __init__(self, retriever):
                self.retriever = retriever
                # Create the LangChain-compatible retriever
                try:
                    self._langchain_retriever = SupabaseRetrieverWrapper(retriever)
                    print("✅ LangChain BaseRetriever wrapper created successfully")
                except Exception as e:
                    print(f"⚠️  LangChain wrapper creation failed: {e}")
                    # Fallback to a simple retriever for basic functionality
                    self._langchain_retriever = None
                
            def similarity_search(self, query, k=4):
                """Vectorstore-style interface for similarity search."""
                try:
                    results = self.retriever.vector_search(query, k=k)
                    documents = []
                    for result in results:
                        doc = Document(
                            page_content=result.get('content', ''),
                            metadata=result.get('metadata', {})
                        )
                        documents.append(doc)
                    return documents
                except Exception as e:
                    print(f"⚠️  Similarity search failed: {e}")
                    # Return empty list for graceful degradation
                    return []
            
            def as_retriever(self, **kwargs):
                """Return the proper LangChain BaseRetriever if available, else None."""
                if self._langchain_retriever is not None:
                    return self._langchain_retriever
                else:
                    print("⚠️  LangChain retriever not available, returning None")
                    return None
                
            def get_relevant_documents(self, query, k=4):
                """Direct retriever-style interface."""
                return self.similarity_search(query, k=k)
        
        # Create simple wrapper
        vectorstore = SimpleVectorStoreWrapper(supabase_retriever)
        print("✅ Vectorstore wrapper with LangChain compatibility created")
        
        # Test the wrapper with a simple query
        try:
            test_docs = vectorstore.similarity_search("authentication error", k=1)
            print(f"✅ Similarity search test: {len(test_docs)} documents")
            
            # Test the LangChain retriever interface if available
            retriever = vectorstore.as_retriever()
            if retriever:
                try:
                    test_retriever_docs = retriever.invoke("authentication error")
                    print(f"✅ LangChain retriever test: {len(test_retriever_docs)} documents")
                except Exception as retriever_error:
                    print(f"⚠️  LangChain retriever test failed: {retriever_error}")
                    print("   Continuing with basic vectorstore functionality")
            else:
                print("⚠️  LangChain retriever not available - agents will use basic functionality")
            
            if test_docs:
                sample_doc = test_docs[0]
                print(f"   Document type: {type(sample_doc)}")
                content_preview = sample_doc.page_content[:100] if hasattr(sample_doc, 'page_content') else str(sample_doc)[:100]
                print(f"   Content preview: {content_preview}...")
                
        except Exception as wrapper_test_error:
            print(f"⚠️  Wrapper test warning: {wrapper_test_error}")
            print("   Basic vectorstore should still work for agent testing")
        
    except ImportError as import_error:
        print(f"❌ Could not import SupabaseRetriever: {import_error}")
        print("   Please check that supabase_retriever.py exists in the rag directory")
        vectorstore = None
        
    except Exception as custom_error:
        print(f"❌ SupabaseRetriever setup failed: {custom_error}")
        print("   Vectorstore is unavailable; fix the error above before running the agent tests")
        vectorstore = None
    
except Exception as e:
    print(f"❌ Supabase initialization failed: {e}")
    print("Please check your SUPABASE_URL and SUPABASE_KEY")
    vectorstore = None

# Final verification
if 'vectorstore' in locals() and vectorstore is not None:
    print(f"\n✅ Final verification: Vectorstore is ready for agent testing")
    print(f"   Type: {type(vectorstore)}")
    print(f"   LangChain retriever available: {vectorstore.as_retriever() is not None}")
else:
    print(f"\n❌ Final verification: Vectorstore initialization failed")

🗄️ Initializing Supabase and vectorstore...
✅ Supabase client initialized
✅ OpenAI embeddings initialized
✅ SupabaseRetriever initialized (bugs collection)


2025-08-19 20:40:38,433 - SupabaseRetriever_bugs - INFO - ✅ Connection to bugs table successful
2025-08-19 20:40:38,434 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication error...' in bugs
2025-08-19 20:40:38,435 - SupabaseRetriever_bugs - INFO - Parameters: k=1, similarity_threshold=0.1, filters=None


✅ Connection test: PASSED
✅ LangChain BaseRetriever wrapper created successfully
✅ Vectorstore wrapper with LangChain compatibility created


2025-08-19 20:40:39,034 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-19 20:40:39,037 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 3 above threshold 0.1
2025-08-19 20:40:39,037 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.1882']
2025-08-19 20:40:39,038 - SupabaseRetriever_bugs - INFO - Direct vector search returned 1 results (from 3 candidates)
2025-08-19 20:40:39,041 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication error...' in bugs
2025-08-19 20:40:39,042 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None


✅ Similarity search test: 1 documents


2025-08-19 20:40:39,720 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:39,726 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:39,727 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.1882']
2025-08-19 20:40:39,727 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)


✅ LangChain retriever test: 4 documents
   Document type: <class 'langchain_core.documents.base.Document'>
   Content preview: Title: context:include-filter can't find ControllerAdvice annotation

Description: {code:xml}<contex...

✅ Final verification: Vectorstore is ready for agent testing
   Type: <class '__main__.SimpleVectorStoreWrapper'>
   LangChain retriever available: True
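The wrapper classes in Cell 5 repeat the same loop that turns raw result dictionaries into `Document` objects in three places. A shared helper would remove that duplication. The following is a minimal sketch: `results_to_documents` is a hypothetical name, the `content` and `metadata` keys are taken from the cell above, and the dataclass fallback exists only so the example runs where `langchain_core` is not installed.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List

try:
    from langchain_core.documents import Document
except ImportError:
    # Fallback stand-in with the two attributes this notebook relies on.
    @dataclass
    class Document:  # type: ignore[no-redef]
        page_content: str = ""
        metadata: Dict[str, Any] = field(default_factory=dict)


def results_to_documents(results: Iterable[Dict[str, Any]]) -> List[Document]:
    """Convert retriever result dicts into Document objects.

    Missing or None values fall back to an empty string or an empty dict.
    """
    return [
        Document(
            page_content=result.get("content") or "",
            metadata=result.get("metadata") or {},
        )
        for result in results
    ]


docs = results_to_documents([{"content": "Title: Authentication Issue", "metadata": {"key": "HBASE-123"}}])
print(len(docs), docs[0].metadata)
```

Both `SupabaseRetrieverWrapper._get_relevant_documents` and `SimpleVectorStoreWrapper.similarity_search` could then return `results_to_documents(results)` directly.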


## 🤖 **Phase 3: Individual Agent Testing**

Testing each agent individually to ensure proper functionality.
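For readers unfamiliar with the scoring behind the first agent, the sketch below shows Okapi BM25 in plain Python. It is a stand-alone illustration, not the agent's implementation: the whitespace tokenizer, the `k1` and `b` defaults, and the non-negative IDF variant are all assumptions, and `bm25_scores` is a hypothetical function.

```python
import math
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Return one Okapi BM25 score per document for the given query."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq: Counter = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    query_terms = tokenize(query)
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            # Non-negative IDF variant: ln((N - n + 0.5) / (n + 0.5) + 1)
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term frequency saturation with document length normalization.
            norm = term_freq[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


corpus = [
    "user cannot login authentication error",
    "database connection timeout",
    "deployment error on portal",
]
for doc, score in zip(corpus, bm25_scores("authentication error", corpus)):
    print(f"{score:.3f}  {doc}")
```

Documents that share no term with the query score exactly zero, which is why keyword search complements the semantic retrieval used by the other agents.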

In [7]:
# Cell 6: Initialize and test BM25Agent
print("🔍 Testing BM25Agent...")

try:
    # Initialize BM25Agent
    bm25_agent = BM25Agent(
        vectorstore=vectorstore,
        rag_llm=rag_llm,
        k=5
    )
    print("✅ BM25Agent initialized")
    
    # Test BM25 retrieval directly
    test_query = "authentication error"
    print(f"\n🔍 Testing BM25 retrieval with query: '{test_query}'")
    
    bm25_results = bm25_agent.retrieve(test_query)
    print(f"✅ BM25 retrieve: {len(bm25_results)} results")
    
    if bm25_results:
        first_result = bm25_results[0]
        print(f"   First result source: {first_result.get('source')}")
        print(f"   First result score: {first_result.get('score')}")
        content_preview = first_result.get('content', '')[:100]
        print(f"   Content preview: {content_preview}...")
    
    # Test BM25 agent process method
    print(f"\n🔄 Testing BM25Agent.process()...")
    test_state: AgentState = {
        'query': test_query,
        'user_can_wait': True,
        'production_incident': False,
        'routing_decision': None,
        'routing_reasoning': None,
        'retrieved_contexts': [],
        'retrieval_method': None,
        'retrieval_metadata': {},
        'final_answer': None,
        'relevant_tickets': [],
        'messages': []
    }
    
    processed_state = bm25_agent.process(test_state.copy())
    
    print(f"✅ BM25Agent.process() completed")
    print(f"   Retrieved contexts: {len(processed_state['retrieved_contexts'])}")
    print(f"   Retrieval method: {processed_state['retrieval_method']}")
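    # Format the time separately: applying :.3f to the 'N/A' fallback string would raise ValueError
    processing_time = processed_state['retrieval_metadata'].get('processing_time')
    time_str = f"{processing_time:.3f}s" if isinstance(processing_time, (int, float)) else "N/A"
    print(f"   Processing time: {time_str}")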
    print(f"   BM25 available: {processed_state['retrieval_metadata'].get('bm25_available', 'Unknown')}")
    print(f"   Messages added: {len(processed_state['messages'])}")
    
except Exception as e:
    print(f"❌ BM25Agent test failed: {e}")
    import traceback
    traceback.print_exc()

2025-08-19 20:40:39,736 - BM25Agent - INFO - Setting up BM25 retriever...
2025-08-19 20:40:39,737 - BM25Agent - INFO - Fetching sample documents from vectorstore...
2025-08-19 20:40:39,737 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'sample query...' in bugs
2025-08-19 20:40:39,737 - SupabaseRetriever_bugs - INFO - Parameters: k=100, similarity_threshold=0.1, filters=None


🔍 Testing BM25Agent...


2025-08-19 20:40:40,769 - SupabaseRetriever_bugs - INFO - Processing 100 candidates for similarity calculation
2025-08-19 20:40:40,799 - SupabaseRetriever_bugs - INFO - Calculated 100 similarities, 94 above threshold 0.1
2025-08-19 20:40:40,799 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.1286', '0.1846', '0.1465']
2025-08-19 20:40:40,799 - SupabaseRetriever_bugs - INFO - Direct vector search returned 94 results (from 100 candidates)
2025-08-19 20:40:40,800 - BM25Agent - INFO - Retrieved 94 documents from vectorstore
2025-08-19 20:40:40,801 - BM25Agent - INFO - Document validation passed: 94/94 valid docs, avg length: 1034.9 chars
2025-08-19 20:40:40,801 - BM25Agent - INFO - Creating BM25 retriever with 94 valid documents...
2025-08-19 20:40:40,808 - BM25Agent - INFO - ✅ BM25 retriever successfully initialized with 94 documents
2025-08-19 20:40:40,808 - BM25Agent - INFO - Using BM25 retriever for query: 'authentication error...'
  docs = self.bm25_retriever.get_relevant_d

✅ BM25 retriever initialized with 94 documents
✅ BM25Agent initialized

🔍 Testing BM25 retrieval with query: 'authentication error'
✅ BM25 retrieve: 5 results
   First result source: bm25
   First result score: 1.0
   Content preview: Title: Seam Portlet deployment error on JPP 6
Description: Deployment of Seam Portal Project with Se...

🔄 Testing BM25Agent.process()...
🔍 BM25 Agent processing: 'authentication error'
✅ BM25 Agent completed: 5 results in 0.00s
✅ BM25Agent.process() completed
   Retrieved contexts: 5
   Retrieval method: BM25
   Processing time: 0.001s
   BM25 available: True
   Messages added: 1


In [8]:
# Cell 7: Initialize and test ContextualCompressionAgent
print("⚡ Testing ContextualCompressionAgent...")

try:
    # Initialize ContextualCompressionAgent
    contextual_compression_agent = ContextualCompressionAgent(
        vectorstore=vectorstore,
        rag_llm=rag_llm,
        k=5
    )
    print("✅ ContextualCompressionAgent initialized")
    
    # Test ContextualCompression retrieval directly
    test_query = "database connection timeout"
    print(f"\n⚡ Testing ContextualCompression retrieval with query: '{test_query}'")
    
    # Test both normal and urgent modes
    print("\n📊 Normal mode test:")
    normal_results = contextual_compression_agent.retrieve(test_query, is_urgent=False)
    print(f"✅ Normal mode: {len(normal_results)} results")
    
    print("\n🚨 Urgent mode test:")
    urgent_results = contextual_compression_agent.retrieve(test_query, is_urgent=True)
    print(f"✅ Urgent mode: {len(urgent_results)} results")
    
    if normal_results:
        first_result = normal_results[0]
        print(f"   First result source: {first_result.get('source')}")
        print(f"   First result score: {first_result.get('score')}")
        content_preview = first_result.get('content', '')[:100]
        print(f"   Content preview: {content_preview}...")
    
    # Test ContextualCompression agent process method
    print(f"\n🔄 Testing ContextualCompressionAgent.process()...")
    
    # Test normal processing
    test_state: AgentState = {
        'query': test_query,
        'user_can_wait': True,
        'production_incident': False,
        'routing_decision': None,
        'routing_reasoning': None,
        'retrieved_contexts': [],
        'retrieval_method': None,
        'retrieval_metadata': {},
        'final_answer': None,
        'relevant_tickets': [],
        'messages': []
    }
    
    processed_state = contextual_compression_agent.process(test_state.copy())
    
    print(f"✅ ContextualCompressionAgent.process() completed")
    print(f"   Retrieved contexts: {len(processed_state['retrieved_contexts'])}")
    print(f"   Retrieval method: {processed_state['retrieval_method']}")
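    # Format the time separately: applying :.3f to the 'N/A' fallback string would raise ValueError
    processing_time = processed_state['retrieval_metadata'].get('processing_time')
    time_str = f"{processing_time:.3f}s" if isinstance(processing_time, (int, float)) else "N/A"
    print(f"   Processing time: {time_str}")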
    print(f"   Is urgent: {processed_state['retrieval_metadata'].get('is_urgent', 'Unknown')}")
    print(f"   Primary source: {processed_state['retrieval_metadata'].get('primary_source', 'Unknown')}")
    print(f"   Messages added: {len(processed_state['messages'])}")
    
    # Test urgent processing
    print(f"\n🚨 Testing urgent/production incident mode...")
    urgent_state = test_state.copy()
    urgent_state['production_incident'] = True
    
    urgent_processed_state = contextual_compression_agent.process(urgent_state)
    print(f"✅ Urgent mode processing completed")
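    # Same guard as above: only apply the float format when a number is present
    urgent_time = urgent_processed_state['retrieval_metadata'].get('processing_time')
    urgent_time_str = f"{urgent_time:.3f}s" if isinstance(urgent_time, (int, float)) else "N/A"
    print(f"   Urgent processing time: {urgent_time_str}")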
    
except Exception as e:
    print(f"❌ ContextualCompressionAgent test failed: {e}")
    import traceback
    traceback.print_exc()

2025-08-19 20:40:40,876 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'database connection timeout...' in bugs
2025-08-19 20:40:40,876 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None


⚡ Testing ContextualCompressionAgent...
✅ ContextualCompression with Cohere reranking initialized
✅ ContextualCompressionAgent initialized

⚡ Testing ContextualCompression retrieval with query: 'database connection timeout'

📊 Normal mode test:
🔄 Using compression retriever with LangChain wrapper


2025-08-19 20:40:41,792 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:41,799 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:41,799 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2524', '0.1924', '0.1757']
2025-08-19 20:40:41,800 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:40:41,802 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'database connection timeout...' in bugs
2025-08-19 20:40:41,803 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:40:43,369 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:43,374 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:43,375 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2524', '0.1

✅ Compression retriever with content extraction: 3 results
✅ Normal mode: 3 results

🚨 Urgent mode test:
🔄 Using compression retriever with LangChain wrapper


2025-08-19 20:40:44,334 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:44,341 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:44,342 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2524', '0.1924', '0.1757']
2025-08-19 20:40:44,342 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:40:44,344 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'database connection timeout...' in bugs
2025-08-19 20:40:44,344 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:40:44,874 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:44,880 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:44,881 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2524', '0.1

✅ Compression retriever with content extraction: 3 results
✅ Urgent mode: 3 results
   First result source: contextual_compression_extracted
   First result score: 0.8
   Content preview: Title: ServletTestExecutionListener breaks old code
Description: The Javadoc for {{ServletTestExecut...

🔄 Testing ContextualCompressionAgent.process()...
⚡ ContextualCompression Agent  processing: 'database connection timeout'
🔄 Using compression retriever with LangChain wrapper


2025-08-19 20:40:49,532 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:49,538 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:49,538 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2524', '0.1924', '0.1757']
2025-08-19 20:40:49,539 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:40:49,540 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'database connection timeout...' in bugs
2025-08-19 20:40:49,541 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:40:50,167 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:50,175 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:50,176 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2524', '0.1

✅ Compression retriever with content extraction: 3 results
✅ ContextualCompression Agent completed: 3 results in 1.50s
✅ ContextualCompressionAgent.process() completed
   Retrieved contexts: 3
   Retrieval method: ContextualCompression
   Processing time: 1.499s
   Is urgent: False
   Primary source: contextual_compression_extracted
   Messages added: 1

🚨 Testing urgent/production incident mode...
⚡ ContextualCompression Agent [URGENT] processing: 'database connection timeout'
🔄 Using compression retriever with LangChain wrapper


2025-08-19 20:40:51,100 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:51,106 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:51,106 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2524', '0.1924', '0.1757']
2025-08-19 20:40:51,107 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:40:51,108 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'database connection timeout...' in bugs
2025-08-19 20:40:51,108 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:40:51,612 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:51,618 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:51,619 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2524', '0.1

✅ Compression retriever with content extraction: 3 results
✅ ContextualCompression Agent completed: 3 results in 1.24s
✅ Urgent mode processing completed
   Urgent processing time: 1.240s
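
The test cells repeat the same eleven-key state literal and rely on the shallow `dict.copy()`, which shares the nested lists and dicts between copies. Below is a minimal sketch of a factory that returns a fresh state on every call. It assumes the key names used in the cells above; `make_initial_state` is a hypothetical helper and not part of the project.

```python
from typing import Any, Dict


def make_initial_state(query: str,
                       user_can_wait: bool = True,
                       production_incident: bool = False,
                       **overrides: Any) -> Dict[str, Any]:
    """Build a fresh state dict with the keys used by the test cells.

    Hypothetical helper: the key names are copied from the notebook's
    AgentState literals; the function itself is an illustration only.
    """
    state: Dict[str, Any] = {
        'query': query,
        'user_can_wait': user_can_wait,
        'production_incident': production_incident,
        'routing_decision': None,
        'routing_reasoning': None,
        'retrieved_contexts': [],   # new list on every call
        'retrieval_method': None,
        'retrieval_metadata': {},   # new dict on every call
        'final_answer': None,
        'relevant_tickets': [],
        'messages': [],
    }
    # Reject misspelled keys instead of silently adding them
    unknown = set(overrides) - set(state)
    if unknown:
        raise KeyError(f"Unknown state keys: {sorted(unknown)}")
    state.update(overrides)
    return state


normal_state = make_initial_state('database connection timeout')
urgent_state = make_initial_state('database connection timeout',
                                  production_incident=True)
```

Because each call builds its own containers, an agent that appends to `messages` in one state cannot affect the other.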


In [9]:
# Cell 8: Initialize and test EnsembleAgent
print("🔗 Testing EnsembleAgent...")

try:
    # Initialize EnsembleAgent (requires other agents as dependencies)
    ensemble_agent = EnsembleAgent(
        vectorstore=vectorstore,
        rag_llm=rag_llm,
        bm25_agent=bm25_agent,
        contextual_compression_agent=contextual_compression_agent,
        k=8
    )
    print("✅ EnsembleAgent initialized")
    
    # Test Ensemble retrieval directly
    test_query = "memory leak issue"
    print(f"\n🔗 Testing Ensemble retrieval with query: '{test_query}'")
    
    ensemble_results = ensemble_agent.retrieve(test_query)
    print(f"✅ Ensemble retrieve: {len(ensemble_results)} results")
    
    if ensemble_results:
        # Show sources used by ensemble
        sources = [result.get('source', 'unknown') for result in ensemble_results]
        unique_sources = set(sources)
        print(f"   Sources used: {', '.join(unique_sources)}")
        
        # Show first result
        first_result = ensemble_results[0]
        print(f"   First result source: {first_result.get('source')}")
        print(f"   First result score: {first_result.get('score')}")
        content_preview = first_result.get('content', '')[:100]
        print(f"   Content preview: {content_preview}...")
    
    # Test Ensemble agent process method
    print(f"\n🔄 Testing EnsembleAgent.process()...")
    test_state: AgentState = {
        'query': test_query,
        'user_can_wait': True,
        'production_incident': False,
        'routing_decision': None,
        'routing_reasoning': None,
        'retrieved_contexts': [],
        'retrieval_method': None,
        'retrieval_metadata': {},
        'final_answer': None,
        'relevant_tickets': [],
        'messages': []
    }
    
    processed_state = ensemble_agent.process(test_state.copy())
    
    print(f"✅ EnsembleAgent.process() completed")
    print(f"   Retrieved contexts: {len(processed_state['retrieved_contexts'])}")
    print(f"   Retrieval method: {processed_state['retrieval_method']}")
    # Use a float default: formatting the string 'N/A' with :.3f raises ValueError
    print(f"   Processing time: {processed_state['retrieval_metadata'].get('processing_time', float('nan')):.3f}s")
    print(f"   Methods used: {processed_state['retrieval_metadata'].get('methods_used', [])}")
    print(f"   Primary source: {processed_state['retrieval_metadata'].get('primary_source', 'Unknown')}")
    print(f"   Messages added: {len(processed_state['messages'])}")
    
except Exception as e:
    print(f"❌ EnsembleAgent test failed: {e}")
    import traceback
    traceback.print_exc()

2025-08-19 20:40:51,752 - BM25Agent - INFO - Using BM25 retriever for query: 'memory leak issue...'
2025-08-19 20:40:51,753 - BM25Agent - INFO - BM25 retriever returned 5 documents
2025-08-19 20:40:51,754 - BM25Agent - INFO - Filtered 5 -> 5 valid documents
2025-08-19 20:40:51,755 - BM25Agent - INFO - BM25 retrieve returning 5 results with valid content
2025-08-19 20:40:51,755 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'memory leak issue...' in bugs
2025-08-19 20:40:51,755 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None


🔗 Testing EnsembleAgent...
✅ Ensemble retriever initialized with 4 methods:
   • Naive: 0.25
   • Multi-Query: 0.25
   • ContextualCompression: 0.25
   • BM25: 0.25
✅ EnsembleAgent initialized

🔗 Testing Ensemble retrieval with query: 'memory leak issue'
🔄 Using individual agent ensemble
🔄 Using compression retriever with LangChain wrapper


2025-08-19 20:40:52,435 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:52,441 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:52,442 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.3259', '0.1876', '0.2366']
2025-08-19 20:40:52,442 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:40:52,444 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'memory leak issue...' in bugs
2025-08-19 20:40:52,444 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:40:53,061 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:53,068 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:53,068 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.3259', '0.1876', '0.2

✅ Compression retriever with content extraction: 3 results


2025-08-19 20:40:53,708 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:53,715 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:53,716 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.3259', '0.1876', '0.2366']
2025-08-19 20:40:53,716 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:40:54,709 - SupabaseRetriever_bugs - INFO - Direct vector search for: '1. How can I address a problem with memory leaks i...' in bugs
2025-08-19 20:40:54,709 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:40:55,329 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:55,333 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:55,334 - SupabaseRetriever_bugs - INFO - Result simila

✅ Individual agent ensemble: 5 deduplicated results
✅ Ensemble retrieve: 5 results
   Sources used: compression_ensemble, bm25_ensemble, naive_ensemble
   First result source: bm25_ensemble
   First result score: 1.0
   Content preview: Title: Cannot start JBT 4.2.0.Alpha1 on Fedora 19
Description: Installing JBT on Eclipse -4.4.M4- 4....

🔄 Testing EnsembleAgent.process()...
🔗 Ensemble Agent processing: 'memory leak issue'
   Using comprehensive multi-method retrieval...
🔄 Using individual agent ensemble
🔄 Using compression retriever with LangChain wrapper


2025-08-19 20:40:57,107 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:57,114 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:57,114 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.3259', '0.1876', '0.2366']
2025-08-19 20:40:57,115 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:40:57,117 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'memory leak issue...' in bugs
2025-08-19 20:40:57,117 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:40:57,594 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:57,599 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:57,600 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.3259', '0.1876', '0.2

✅ Compression retriever with content extraction: 3 results


2025-08-19 20:40:58,303 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:40:58,311 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:40:58,311 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.3259', '0.1876', '0.2366']
2025-08-19 20:40:58,312 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:40:59,032 - SupabaseRetriever_bugs - INFO - Direct vector search for: '1. How can I address a problem with memory leaks i...' in bugs
2025-08-19 20:40:59,033 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:41:00,122 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:41:00,128 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:41:00,128 - SupabaseRetriever_bugs - INFO - Result simila

✅ Individual agent ensemble: 5 deduplicated results
✅ Ensemble Agent completed: 5 results in 4.65s
✅ EnsembleAgent.process() completed
   Retrieved contexts: 5
   Retrieval method: Ensemble
   Processing time: 4.651s
   Methods used: ['bm25', 'contextual_compression', 'naive', 'multi_query']
   Primary source: bm25_ensemble
   Messages added: 1
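
The output above reports "5 deduplicated results" drawn from several retrieval methods, but the notebook does not show how the merge is done. The sketch below is one possible approach, not the project's implementation. It assumes each result is a dict with `content`, `score` and `source` keys (as printed above), treats results with identical content as duplicates, and keeps the higher-scoring copy. `merge_results` is a hypothetical name.

```python
from typing import Any, Dict, List


def _score(result: Dict[str, Any]) -> float:
    """Read a result's score, treating a missing or None score as 0.0."""
    return float(result.get('score') or 0.0)


def merge_results(result_lists: List[List[Dict[str, Any]]],
                  k: int = 8) -> List[Dict[str, Any]]:
    """Merge several result lists, dropping duplicate content.

    Assumption: two results are duplicates when their stripped 'content'
    strings are equal; the copy with the higher score is kept.
    """
    best: Dict[str, Dict[str, Any]] = {}
    for results in result_lists:
        for result in results:
            key = (result.get('content') or '').strip()
            if not key:
                continue  # skip results without usable content
            current = best.get(key)
            if current is None or _score(result) > _score(current):
                best[key] = result
    # Highest score first, truncated to k results
    ranked = sorted(best.values(), key=_score, reverse=True)
    return ranked[:k]


bm25_hits = [{'content': 'Title: A', 'score': 1.0, 'source': 'bm25_ensemble'},
             {'content': 'Title: B', 'score': 0.6, 'source': 'bm25_ensemble'}]
naive_hits = [{'content': 'Title: B', 'score': 0.7, 'source': 'naive_ensemble'},
              {'content': 'Title: C', 'score': 0.3, 'source': 'naive_ensemble'}]
merged = merge_results([bm25_hits, naive_hits])
```

Scores from different retrievers are not necessarily on the same scale, so comparing them directly is a simplification made for this sketch.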


In [10]:
# Cell 9: Initialize and test ResponseWriterAgent
print("✍️  Testing ResponseWriterAgent...")

try:
    # Initialize ResponseWriterAgent
    response_writer_agent = ResponseWriterAgent(
        response_writer_llm=response_writer_llm
    )
    print("✅ ResponseWriterAgent initialized")
    
    # Test response generation directly
    test_query = "How to fix connection timeout issues?"
    print(f"\n✍️  Testing response generation for query: '{test_query}'")
    
    # Create mock retrieved contexts for testing
    mock_contexts = [
        {
            'content': 'Title: Connection Timeout Fix\nDescription: Increase timeout values in configuration',
            'metadata': {'key': 'HBASE-456', 'title': 'Connection Timeout Fix'},
            'score': 0.9,
            'source': 'test_source'
        },
        {
            'content': 'Title: Network Configuration\nDescription: Check network settings for proper connectivity',
            'metadata': {'key': 'HBASE-789', 'title': 'Network Configuration'},
            'score': 0.8,
            'source': 'test_source'
        }
    ]
    
    # Test normal response generation
    print("\n📝 Testing normal response generation...")
    normal_response = response_writer_agent.generate_response(
        query=test_query,
        retrieved_contexts=mock_contexts,
        production_incident=False,
        retrieval_method="Ensemble"
    )
    
    print(f"✅ Normal response generated: {len(normal_response)} characters")
    print(f"   Response preview: {normal_response[:200]}...")
    
    # Test urgent response generation
    print("\n🚨 Testing urgent response generation...")
    urgent_response = response_writer_agent.generate_response(
        query=test_query,
        retrieved_contexts=mock_contexts,
        production_incident=True,
        retrieval_method="ContextualCompression"
    )
    
    print(f"✅ Urgent response generated: {len(urgent_response)} characters")
    print(f"   Response preview: {urgent_response[:200]}...")
    
    # Test ResponseWriter agent process method
    print(f"\n🔄 Testing ResponseWriterAgent.process()...")
    test_state: AgentState = {
        'query': test_query,
        'user_can_wait': True,
        'production_incident': False,
        'routing_decision': 'Ensemble',
        'routing_reasoning': 'Comprehensive search needed',
        'retrieved_contexts': mock_contexts,
        'retrieval_method': 'Ensemble',
        'retrieval_metadata': {'processing_time': 2.5},
        'final_answer': None,
        'relevant_tickets': [],
        'messages': []
    }
    
    processed_state = response_writer_agent.process(test_state.copy())
    
    print(f"✅ ResponseWriterAgent.process() completed")
    print(f"   Final answer generated: {len(processed_state['final_answer'])} characters")
    print(f"   Relevant tickets extracted: {len(processed_state['relevant_tickets'])}")
    print(f"   Messages added: {len(processed_state['messages'])}")
    
    if processed_state['relevant_tickets']:
        print(f"   Tickets: {[ticket['key'] for ticket in processed_state['relevant_tickets']]}")
    
    print(f"   Final answer preview: {processed_state['final_answer'][:200]}...")
    
except Exception as e:
    print(f"❌ ResponseWriterAgent test failed: {e}")
    import traceback
    traceback.print_exc()

✍️  Testing ResponseWriterAgent...
✅ ResponseWriterAgent initialized

✍️  Testing response generation for query: 'How to fix connection timeout issues?'

📝 Testing normal response generation...
✅ Normal response generated: 873 characters
   Response preview: To address connection timeout issues, you can consider the following solutions based on the retrieved JIRA tickets:

1. **Increase Timeout Values**: As suggested in [HBASE-456], you can try increasing...

🚨 Testing urgent response generation...
✅ Urgent response generated: 914 characters
   Response preview: For addressing connection timeout issues in a production environment, it's crucial to implement immediate solutions. Based on the retrieved JIRA tickets, here are two actionable steps you can take:

1...

🔄 Testing ResponseWriterAgent.process()...
✍️  ResponseWriter Agent  generating response...
✅ ResponseWriter completed in 3.12s
   Generated response: 972 characters
   Relevant tickets: 2
✅ ResponseWriterAgent.process() completed
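
The cell above prints `ticket['key']` for each entry in `relevant_tickets`, and the run reports two relevant tickets for the two mock contexts. How the agent derives them is not shown. The sketch below is an assumption: it reads the key from each context's `metadata` and falls back to a JIRA-style pattern in the content. `extract_tickets` and `TICKET_KEY_RE` are hypothetical names.

```python
import re
from typing import Any, Dict, List

# JIRA-style keys such as HBASE-456: uppercase project prefix, dash, number
TICKET_KEY_RE = re.compile(r'\b[A-Z][A-Z0-9]+-\d+\b')


def extract_tickets(contexts: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Collect unique ticket keys from retrieved contexts, in first-seen order."""
    tickets: List[Dict[str, str]] = []
    seen = set()
    for ctx in contexts:
        metadata = ctx.get('metadata') or {}
        key = metadata.get('key')
        if not key:
            # Fallback: look for a ticket key inside the content text
            match = TICKET_KEY_RE.search(ctx.get('content') or '')
            key = match.group(0) if match else None
        if key and key not in seen:
            seen.add(key)
            tickets.append({'key': key, 'title': metadata.get('title', '')})
    return tickets


mock_contexts = [
    {'content': 'Title: Connection Timeout Fix',
     'metadata': {'key': 'HBASE-456', 'title': 'Connection Timeout Fix'}},
    {'content': 'Title: Network Configuration',
     'metadata': {'key': 'HBASE-789', 'title': 'Network Configuration'}},
    {'content': 'Duplicate of HBASE-456, see that ticket', 'metadata': {}},
]
tickets = extract_tickets(mock_contexts)
```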

In [11]:
# Cell 10: Initialize and test SupervisorAgent
print("🧠 Testing SupervisorAgent...")

try:
    # Initialize SupervisorAgent
    supervisor_agent = SupervisorAgent(
        supervisor_llm=supervisor_llm
    )
    print("✅ SupervisorAgent initialized")
    
    # Test different routing scenarios
    routing_test_cases = [
        {
            'query': 'HBASE-123 status',
            'user_can_wait': True,
            'production_incident': False,
            'expected_agent': 'BM25',
            'description': 'Specific ticket reference'
        },
        {
            'query': 'production server is down',
            'user_can_wait': False,
            'production_incident': True,
            'expected_agent': 'ContextualCompression',
            'description': 'Production incident'
        },
        {
            'query': 'comprehensive analysis of memory issues',
            'user_can_wait': True,
            'production_incident': False,
            'expected_agent': 'Ensemble',
            'description': 'Complex research query'
        },
        {
            'query': 'simple error message',
            'user_can_wait': False,
            'production_incident': False,
            'expected_agent': 'ContextualCompression',
            'description': 'Default routing'
        }
    ]
    
    print(f"\n🧠 Testing routing decisions...")
    
    for i, test_case in enumerate(routing_test_cases, 1):
        print(f"\n📋 Test Case {i}: {test_case['description']}")
        print(f"   Query: '{test_case['query']}'")
        print(f"   user_can_wait: {test_case['user_can_wait']}, production_incident: {test_case['production_incident']}")
        
        routing_result = supervisor_agent.route_query(
            query=test_case['query'],
            user_can_wait=test_case['user_can_wait'],
            production_incident=test_case['production_incident']
        )
        
        routed_agent = routing_result['agent']
        reasoning = routing_result['reasoning']
        
        # Check if routing matches expectation
        match_indicator = "✅" if routed_agent == test_case['expected_agent'] else "⚠️ "
        
        print(f"   {match_indicator} Routed to: {routed_agent} (expected: {test_case['expected_agent']})")
        print(f"   Reasoning: {reasoning}")
    
    # Test SupervisorAgent process method
    print(f"\n🔄 Testing SupervisorAgent.process()...")
    test_state: AgentState = {
        'query': 'How to fix authentication errors?',
        'user_can_wait': True,
        'production_incident': False,
        'routing_decision': None,
        'routing_reasoning': None,
        'retrieved_contexts': [],
        'retrieval_method': None,
        'retrieval_metadata': {},
        'final_answer': None,
        'relevant_tickets': [],
        'messages': []
    }
    
    processed_state = supervisor_agent.process(test_state.copy())
    
    print(f"✅ SupervisorAgent.process() completed")
    print(f"   Routing decision: {processed_state['routing_decision']}")
    print(f"   Routing reasoning: {processed_state['routing_reasoning']}")
    print(f"   Messages added: {len(processed_state['messages'])}")
    
except Exception as e:
    print(f"❌ SupervisorAgent test failed: {e}")
    import traceback
    traceback.print_exc()

🧠 Testing SupervisorAgent...
✅ SupervisorAgent initialized

🧠 Testing routing decisions...

📋 Test Case 1: Specific ticket reference
   Query: 'HBASE-123 status'
   user_can_wait: True, production_incident: False
   ✅ Routed to: BM25 (expected: BM25)
   Reasoning: The query contains a specific ticket reference 'HBASE-123', which is best handled by the BM25 agent for fast keyword-based search.

📋 Test Case 2: Production incident
   Query: 'production server is down'
   user_can_wait: False, production_incident: True
   ⚠️  Routed to: WebSearch (expected: ContextualCompression)
   Reasoning: The query mentions a service status/outage ('server is down') and it's a production incident, which requires real-time information.

📋 Test Case 3: Complex research query
   Query: 'comprehensive analysis of memory issues'
   user_can_wait: True, production_incident: False
   ✅ Routed to: Ensemble (expected: Ensemble)
   Reasoning: The query requires a comprehensive analysis of memory issues, and the
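
The routing loop above prints a per-case indicator but never aggregates the results, so a mismatch such as Test Case 2 is easy to overlook. The sketch below totals the matches and lists the mismatched queries. `score_routing` is a hypothetical helper, and `replay_router` is a stand-in that replays the three decisions visible in the output above. In the notebook it would be replaced by a call to `supervisor_agent.route_query(...)`.

```python
from typing import Callable, Dict, List, Tuple

RoutingCase = Dict[str, object]


def score_routing(cases: List[RoutingCase],
                  route: Callable[[RoutingCase], str]) -> Tuple[int, List[str]]:
    """Return (number of matches, queries whose route differed from expected)."""
    passed = 0
    mismatches: List[str] = []
    for case in cases:
        if route(case) == case['expected_agent']:
            passed += 1
        else:
            mismatches.append(str(case['query']))
    return passed, mismatches


# Stand-in router: replays the decisions printed in the output above
RECORDED_DECISIONS = {
    'HBASE-123 status': 'BM25',
    'production server is down': 'WebSearch',
    'comprehensive analysis of memory issues': 'Ensemble',
}


def replay_router(case: RoutingCase) -> str:
    return RECORDED_DECISIONS[str(case['query'])]


cases: List[RoutingCase] = [
    {'query': 'HBASE-123 status', 'expected_agent': 'BM25'},
    {'query': 'production server is down', 'expected_agent': 'ContextualCompression'},
    {'query': 'comprehensive analysis of memory issues', 'expected_agent': 'Ensemble'},
]
passed, mismatches = score_routing(cases, replay_router)
print(f"Routing accuracy: {passed}/{len(cases)}")
for query in mismatches:
    print(f"   Mismatch: '{query}'")
```

Whether `WebSearch` is an acceptable route for a production incident is a project decision. If it is, the expectation in the test case should be updated rather than the router.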

## Phase 3a: WebSearch Agent Testing

Testing the new WebSearch agent for real-time information retrieval using the Tavily API.

In [12]:
summary_data = {
    'components_tested': [],  # List of component names that were tested
    'test_results': {},       # Dict with detailed results for each component
    'recommendations': [],    # List of recommendation strings
    'overall_status': 'PENDING'  # Overall status
}

In [13]:
# Import and run WebSearch Agent tests from separate module
from test_websearch_agent import test_websearch_agent

# Run WebSearch Agent tests
summary_data = test_websearch_agent(supervisor_llm, supervisor_agent, summary_data)


🌐 TESTING: WebSearch Agent
✅ WebSearch Agent initialized successfully
   Max searches: 3
✅ Tavily API connection successful

🔍 Test Case 1: GitHub status outage...
   ✅ Retrieved 10 results
   📊 Processing time: 25.08s
   🔧 Method: WebSearch
   🔍 Searches performed: 3
   📝 Sample result content length: 387
   🔗 Has URL: True
   ✅ EXCELLENT

🔍 Test Case 2: AWS Lambda service down...
   ✅ Retrieved 10 results
   📊 Processing time: 25.15s
   🔧 Method: WebSearch
   🔍 Searches performed: 3
   📝 Sample result content length: 1349
   🔗 Has URL: True
   ✅ EXCELLENT

🔍 Test Case 3: latest security vulnerability Java Sprin...
   ✅ Retrieved 9 results
   📊 Processing time: 14.35s
   🔧 Method: WebSearch
   🔍 Searches performed: 3
   📝 Sample result content length: 1228
   🔗 Has URL: True
   ✅ EXCELLENT

📊 WEBSEARCH AGENT SUMMARY:
   Test cases run: 3
   Successful tests: 3/3
   Average results per query: 9.7
   Average processing time: 21.53s
   Overall WebSearch Status: 🟡 GOOD

🧠 Testing Supervis

## Phase 3b: LogSearch Agent Testing

Testing the new LogSearch agent for production log analysis using GCP Cloud Logging.

In [15]:
# Import and run LogSearch Agent tests from separate module
from test_logsearch_agent import test_logsearch_agent

# Run LogSearch Agent tests
summary_data = test_logsearch_agent(supervisor_llm, supervisor_agent, summary_data)


📋 TESTING: LogSearch Agent (GCP Backend)
✅ LogSearch Agent initialized successfully
   Max searches: 3
   Backend: gcp
✅ GCP backend connection successful
   Project: octopus-282815

🔍 Test Case 1: find recent certificate expired errors...
   ✅ Retrieved 1 results
   📊 Processing time: 1.93s
   🔧 Method: LogSearch
   📋 Backend: gcp
   🔍 Searches performed: 1
   📝 Sample result content length: 471
   ⏰ Has timestamp: True
   📊 Log level: ERROR
   ✅ EXCELLENT

🔍 Test Case 2: database connection timeout errors in lo...
   ✅ Retrieved 0 results
   📊 Processing time: 3.01s
   🔧 Method: LogSearch
   📋 Backend: gcp
   🔍 Searches performed: 1
   🟠 NO RESULTS

🔍 Test Case 3: application startup errors last 24 hours...
   ✅ Retrieved 0 results
   📊 Processing time: 1.98s
   🔧 Method: LogSearch
   📋 Backend: gcp
   🔍 Searches performed: 1
   🟠 NO RESULTS

🔍 Test Case 4: disk space exceeded exceptions...


Search execution failed: 400 Unparseable filter: syntax error at line 1, column 92, token 'disk'


   ✅ Retrieved 0 results
   📊 Processing time: 1.97s
   🔧 Method: LogSearch
   📋 Backend: gcp
   🔍 Searches performed: 1
   🟠 NO RESULTS

📊 LOGSEARCH AGENT SUMMARY:
   Test cases run: 4
   Successful tests: 4/4
   Average results per query: 0.2
   Average processing time: 2.22s
   Backend used: gcp
   Overall LogSearch Status: 🟠 FAIR

🧠 Testing Supervisor Routing to LogSearch:
   ✅ 'investigate recent database connect...' → LogSearch
   ✅ 'find certificate expired errors in ...' → LogSearch
   ✅ 'application timeout exceptions last...' → LogSearch
   ✅ 'disk space errors in production...' → LogSearch
   Routing accuracy: 4/4
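
The `400 Unparseable filter ... token 'disk'` error in Test Case 4 suggests that free text reached the Cloud Logging filter unquoted. The notebook does not show how the agent builds its filter, so the sketch below is only one way to avoid that class of error. It assumes search phrases are matched against `textPayload` with the `:` (has) operator and combined with a minimum severity. `quote_filter_value` and `build_log_filter` are hypothetical names.

```python
from typing import Iterable


def quote_filter_value(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace('\\', '\\\\').replace('"', '\\"')
    return f'"{escaped}"'


def build_log_filter(phrases: Iterable[str], min_severity: str = 'ERROR') -> str:
    """Build a filter string from free-text phrases.

    Assumption: each phrase is matched as a substring of textPayload;
    real log entries may keep their message in jsonPayload instead.
    """
    clauses = [f'severity>={min_severity}']
    for phrase in phrases:
        phrase = phrase.strip()
        if phrase:
            clauses.append(f'textPayload:{quote_filter_value(phrase)}')
    return ' AND '.join(clauses)


log_filter = build_log_filter(['disk space exceeded'])
print(log_filter)
```

Note also that the summary above counts Test Case 4 as successful even though the search raised an error. Tracking search errors separately from empty results would make the `4/4` figure easier to interpret.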



## ⚡ **Phase 4: LangGraph Workflow Testing**

Testing the complete LangGraph workflow orchestration system.

In [16]:
# Cell 11: Create LangGraph node functions
print("🏗️  Creating LangGraph node functions...")

# Agent Node Functions for LangGraph
# These functions wrap our agent classes for LangGraph integration

def supervisor_node(state: AgentState) -> AgentState:
    """Supervisor node for intelligent query routing."""
    return supervisor_agent.process(state)

def bm25_node(state: AgentState) -> AgentState:
    """BM25 retrieval node."""
    return bm25_agent.process(state)

def contextual_compression_node(state: AgentState) -> AgentState:
    """ContextualCompression retrieval node."""
    return contextual_compression_agent.process(state)

def ensemble_node(state: AgentState) -> AgentState:
    """Ensemble retrieval node."""
    return ensemble_agent.process(state)

def response_writer_node(state: AgentState) -> AgentState:
    """ResponseWriter node for final response generation."""
    return response_writer_agent.process(state)

# Routing function for conditional edges
def route_query(state: AgentState) -> str:
    """Route queries based on supervisor decision."""
    routing_decision = state.get('routing_decision')
    
    if routing_decision == 'BM25':
        return 'bm25'
    elif routing_decision == 'ContextualCompression':
        return 'contextual_compression'
    elif routing_decision == 'Ensemble':
        return 'ensemble'
    else:
        # Default fallback
        return 'contextual_compression'

print("✅ LangGraph node functions created:")
print("   • supervisor_node")
print("   • bm25_node")
print("   • contextual_compression_node")
print("   • ensemble_node")
print("   • response_writer_node")
print("   • route_query (conditional routing)")

# Test node functions individually
print("\n🧪 Testing individual node functions...")

try:
    # Test supervisor node
    test_state: AgentState = {
        'query': 'test node functions',
        'user_can_wait': True,
        'production_incident': False,
        'routing_decision': None,
        'routing_reasoning': None,
        'retrieved_contexts': [],
        'retrieval_method': None,
        'retrieval_metadata': {},
        'final_answer': None,
        'relevant_tickets': [],
        'messages': []
    }
    
    supervisor_result = supervisor_node(test_state.copy())
    print(f"✅ supervisor_node: routing_decision = {supervisor_result['routing_decision']}")
    
    # Test routing function
    route_result = route_query(supervisor_result)
    print(f"✅ route_query: {supervisor_result['routing_decision']} -> {route_result}")
    
    print("✅ All node functions working correctly")
    
except Exception as e:
    print(f"❌ Node function test failed: {e}")
    import traceback
    traceback.print_exc()

🏗️  Creating LangGraph node functions...
✅ LangGraph node functions created:
   • supervisor_node
   • bm25_node
   • contextual_compression_node
   • ensemble_node
   • response_writer_node
   • route_query (conditional routing)

🧪 Testing individual node functions...
🧠 Supervisor Agent analyzing query: 'test node functions'
   user_can_wait: True, production_incident: False
✅ Supervisor decision: Ensemble - The query is not urgent, and the user can wait, making it suitable for a comprehensive search.
   Analysis time: 1.11s
✅ supervisor_node: routing_decision = Ensemble
✅ route_query: Ensemble -> ensemble
✅ All node functions working correctly
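
`route_query` above falls back to `contextual_compression` for any decision it does not recognise, including the `WebSearch` decision the supervisor returned in an earlier cell, and it does so silently. The sketch below is a table-driven variant that logs the fallback. `ROUTES`, `DEFAULT_ROUTE` and `route_with_fallback` are hypothetical names, and the table lists only the nodes defined in this cell.

```python
import logging
from typing import Any, Dict

logger = logging.getLogger("route_query")

# Supervisor decision -> graph node name (only nodes defined in this notebook cell)
ROUTES: Dict[str, str] = {
    'BM25': 'bm25',
    'ContextualCompression': 'contextual_compression',
    'Ensemble': 'ensemble',
}
DEFAULT_ROUTE = 'contextual_compression'


def route_with_fallback(state: Dict[str, Any]) -> str:
    """Map the supervisor's decision to a node name, logging any fallback."""
    decision = state.get('routing_decision')
    node = ROUTES.get(decision)
    if node is None:
        logger.warning("Unknown routing decision %r; falling back to %s",
                       decision, DEFAULT_ROUTE)
        return DEFAULT_ROUTE
    return node


print(route_with_fallback({'routing_decision': 'Ensemble'}))
print(route_with_fallback({'routing_decision': 'WebSearch'}))
```

Keeping the table next to the node definitions makes it harder to add an agent without also deciding where its decisions are routed.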


In [17]:
# Cell 12: Create and compile LangGraph workflow
print("🔗 Creating LangGraph workflow...")

try:
    from langgraph.graph import StateGraph, END
    
    print("🏗️  Building LangGraph workflow...")
    
    # Create the StateGraph
    workflow = StateGraph(AgentState)
    
    # Add all nodes to the graph
    workflow.add_node("supervisor", supervisor_node)
    workflow.add_node("bm25", bm25_node)
    workflow.add_node("contextual_compression", contextual_compression_node)
    workflow.add_node("ensemble", ensemble_node)
    workflow.add_node("response_writer", response_writer_node)
    
    print("✅ Added nodes: supervisor, bm25, contextual_compression, ensemble, response_writer")
    
    # Define the workflow edges
    # Start with supervisor
    workflow.set_entry_point("supervisor")
    
    # Conditional edges from supervisor to retrieval agents
    workflow.add_conditional_edges(
        "supervisor",
        route_query,  # Function that determines the route
        {
            "bm25": "bm25",
            "contextual_compression": "contextual_compression",
            "ensemble": "ensemble"
        }
    )
    
    # All retrieval agents go to response writer
    workflow.add_edge("bm25", "response_writer")
    workflow.add_edge("contextual_compression", "response_writer")
    workflow.add_edge("ensemble", "response_writer")
    
    # Response writer goes to END
    workflow.add_edge("response_writer", END)
    
    print("✅ Added workflow edges and routing logic")
    
    # Compile the workflow
    compiled_workflow = workflow.compile()
    
    print("✅ LangGraph workflow compiled successfully!")
    print("\n🔗 Workflow structure:")
    print("   START -> supervisor -> [bm25|contextual_compression|ensemble] -> response_writer -> END")
    
    # Show workflow graph info
    if hasattr(compiled_workflow, 'get_graph'):
        graph_info = compiled_workflow.get_graph()
        print(f"   Graph nodes: {list(graph_info.nodes.keys()) if hasattr(graph_info, 'nodes') else 'N/A'}")
    
except Exception as e:
    print(f"❌ LangGraph workflow creation failed: {e}")
    import traceback
    traceback.print_exc()
    compiled_workflow = None

🔗 Creating LangGraph workflow...
🏗️  Building LangGraph workflow...
✅ Added nodes: supervisor, bm25, contextual_compression, ensemble, response_writer
✅ Added workflow edges and routing logic
✅ LangGraph workflow compiled successfully!

🔗 Workflow structure:
   START -> supervisor -> [bm25|contextual_compression|ensemble] -> response_writer -> END
   Graph nodes: ['__start__', 'supervisor', 'bm25', 'contextual_compression', 'ensemble', 'response_writer', '__end__']
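
The conditional edge above relies on `route_query` returning one of the three node names. A minimal sketch of such a router follows, assuming the supervisor writes labels such as `BM25`, `ContextualCompression`, or `Ensemble` into `routing_decision`. The name `route_query_sketch`, the `ROUTE_TO_NODE` table, and the ensemble fallback are hypothetical and may differ from the notebook's actual `route_query`.

```python
from typing import Any, Dict

# Hypothetical mapping from supervisor labels to graph node names.
ROUTE_TO_NODE = {
    "BM25": "bm25",
    "ContextualCompression": "contextual_compression",
    "Ensemble": "ensemble",
}


def route_query_sketch(state: Dict[str, Any]) -> str:
    """Return the graph node name for the supervisor's routing decision."""
    decision = state.get("routing_decision")
    # Assumption: unknown or missing labels fall back to the ensemble node.
    return ROUTE_TO_NODE.get(decision, "ensemble")


print(route_query_sketch({"routing_decision": "BM25"}))
```

Keeping the label-to-node mapping in one table makes it harder for the keys passed to `add_conditional_edges` and the router's return values to drift apart.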


In [18]:
# Cell 13: Test complete LangGraph workflow execution
print("🚀 Testing complete LangGraph workflow execution...")

if compiled_workflow is None:
    print("❌ Cannot test workflow - compilation failed in previous cell")
else:
    # Test different types of queries to exercise different routing paths
    test_queries = [
        {
            'query': 'HBASE-456 ticket details',
            'user_can_wait': True,
            'production_incident': False,
            'expected_route': 'BM25',
            'description': 'Specific ticket reference'
        },
        {
            'query': 'server crashed - need immediate help',
            'user_can_wait': False,
            'production_incident': True,
            'expected_route': 'ContextualCompression',
            'description': 'Production incident'
        },
        {
            'query': 'comprehensive analysis of authentication issues across the system',
            'user_can_wait': True,
            'production_incident': False,
            'expected_route': 'Ensemble',
            'description': 'Complex research query'
        }
    ]
    
    for i, test_case in enumerate(test_queries, 1):
        print(f"\n{'='*60}")
        print(f"🧪 WORKFLOW TEST {i}: {test_case['description']}")
        print(f"{'='*60}")
        print(f"Query: '{test_case['query']}'")
        print(f"Parameters: user_can_wait={test_case['user_can_wait']}, production_incident={test_case['production_incident']}")
        print(f"Expected route: {test_case['expected_route']}")
        
        try:
            # Create initial state
            initial_state: AgentState = {
                'query': test_case['query'],
                'user_can_wait': test_case['user_can_wait'],
                'production_incident': test_case['production_incident'],
                'routing_decision': None,
                'routing_reasoning': None,
                'retrieved_contexts': [],
                'retrieval_method': None,
                'retrieval_metadata': {},
                'final_answer': None,
                'relevant_tickets': [],
                'messages': []
            }
            
            print("\n🚀 Starting workflow execution...")
            
            # Execute the workflow
            start_time = datetime.now()
            final_state = compiled_workflow.invoke(initial_state)
            execution_time = (datetime.now() - start_time).total_seconds()
            
            print(f"\n✅ WORKFLOW EXECUTION COMPLETED in {execution_time:.2f}s")
            print(f"\n📊 EXECUTION SUMMARY:")
            print(f"   🧠 Routing Decision: {final_state.get('routing_decision')} ({final_state.get('routing_reasoning')})")
            print(f"   🔍 Retrieval Method: {final_state.get('retrieval_method')}")
            print(f"   📄 Retrieved Contexts: {len(final_state.get('retrieved_contexts', []))}")
            print(f"   🎫 Relevant Tickets: {len(final_state.get('relevant_tickets', []))}")
            print(f"   💬 Messages: {len(final_state.get('messages', []))}")
            
            # Show retrieval metadata
            retrieval_metadata = final_state.get('retrieval_metadata', {})
            if retrieval_metadata:
                processing_time = retrieval_metadata.get('processing_time', 'N/A')
                print(f"   ⏱️  Retrieval Time: {processing_time:.3f}s" if isinstance(processing_time, float) else f"   ⏱️  Retrieval Time: {processing_time}")
            
            # Show final answer preview
            final_answer = final_state.get('final_answer', '')
            if final_answer:
                print(f"\n✍️  FINAL ANSWER ({len(final_answer)} chars):")
                print(f"   {final_answer[:200]}{'...' if len(final_answer) > 200 else ''}")
            
            # Show relevant tickets
            relevant_tickets = final_state.get('relevant_tickets', [])
            if relevant_tickets:
                print(f"\n🎫 RELEVANT TICKETS:")
                for ticket in relevant_tickets[:3]:  # Show first 3
                    print(f"   • {ticket.get('key', 'N/A')}: {ticket.get('title', 'No title')[:50]}...")
            
            # Show agent messages
            messages = final_state.get('messages', [])
            if messages:
                print(f"\n💬 AGENT MESSAGES:")
                for msg in messages:
                    if hasattr(msg, 'content'):
                        print(f"   • {msg.content}")
            
            # Verify expected routing
            actual_route = final_state.get('routing_decision')
            if actual_route == test_case['expected_route']:
                print(f"\n✅ ROUTING VERIFICATION: Expected {test_case['expected_route']}, got {actual_route} ✓")
            else:
                print(f"\n⚠️  ROUTING VERIFICATION: Expected {test_case['expected_route']}, got {actual_route}")
                print("   This may reflect LLM routing variability and is not necessarily an error")
            
        except Exception as e:
            print(f"\n❌ WORKFLOW TEST {i} FAILED: {e}")
            import traceback
            traceback.print_exc()
            
        print(f"\n🏁 Test {i} completed\n")
    
    print("="*60)
    print("🎉 ALL WORKFLOW TESTS COMPLETED")
    print("="*60)

🚀 Testing complete LangGraph workflow execution...

🧪 WORKFLOW TEST 1: Specific ticket reference
Query: 'HBASE-456 ticket details'
Parameters: user_can_wait=True, production_incident=False
Expected route: BM25

🚀 Starting workflow execution...
🧠 Supervisor Agent analyzing query: 'HBASE-456 ticket details'
   user_can_wait: True, production_incident: False


2025-08-19 20:43:28,383 - BM25Agent - INFO - BM25 Agent processing query: 'HBASE-456 ticket details'
2025-08-19 20:43:28,384 - BM25Agent - INFO - Using BM25 retriever for query: 'HBASE-456 ticket details...'
2025-08-19 20:43:28,385 - BM25Agent - INFO - BM25 retriever returned 5 documents
2025-08-19 20:43:28,385 - BM25Agent - INFO - Filtered 5 -> 5 valid documents
2025-08-19 20:43:28,385 - BM25Agent - INFO - BM25 retrieve returning 5 results with valid content
2025-08-19 20:43:28,385 - BM25Agent - INFO - BM25 Agent completed in 0.00s with 5 results


✅ Supervisor decision: BM25 - The query contains a specific ticket reference 'HBASE-456', which is best handled by the BM25 agent for fast keyword-based search.
   Analysis time: 0.85s
🔍 BM25 Agent processing: 'HBASE-456 ticket details'
✅ BM25 Agent completed: 5 results in 0.00s
✍️  ResponseWriter Agent  generating response...
✅ ResponseWriter completed in 8.18s
   Generated response: 722 characters
   Relevant tickets: 5

✅ WORKFLOW EXECUTION COMPLETED in 9.07s

📊 EXECUTION SUMMARY:
   🧠 Routing Decision: BM25 (The query contains a specific ticket reference 'HBASE-456', which is best handled by the BM25 agent for fast keyword-based search.)
   🔍 Retrieval Method: BM25
   📄 Retrieved Contexts: 5
   🎫 Relevant Tickets: 5
   💬 Messages: 3
   ⏱️  Retrieval Time: 0.002s

✍️  FINAL ANSWER (722 chars):
   Thank you for your query regarding the HBASE-456 ticket details. Unfortunately, the retrieved JIRA tickets do not include any information related to HBASE-456. The tickets retrieved, such a

2025-08-19 20:43:37,642 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'server crashed - need immediate help...' in bugs
2025-08-19 20:43:37,642 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None


✅ Supervisor decision: ContextualCompression - The query indicates an urgent production incident requiring immediate assistance.
   Analysis time: 1.06s
⚡ ContextualCompression Agent [URGENT] processing: 'server crashed - need immediate help'
🔄 Using compression retriever with LangChain wrapper


2025-08-19 20:43:39,103 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:43:39,109 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 11 above threshold 0.1
2025-08-19 20:43:39,109 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.3059', '0.1878', '0.1752']
2025-08-19 20:43:39,109 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:43:39,111 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'server crashed - need immediate help...' in bugs
2025-08-19 20:43:39,112 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:43:39,887 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:43:39,893 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 11 above threshold 0.1
2025-08-19 20:43:39,895 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.30

✅ Compression retriever with content extraction: 3 results
✅ ContextualCompression Agent completed: 3 results in 2.62s
✍️  ResponseWriter Agent [PRODUCTION INCIDENT] generating response...
✅ ResponseWriter completed in 7.83s
   Generated response: 1751 characters
   Relevant tickets: 3

✅ WORKFLOW EXECUTION COMPLETED in 11.52s

📊 EXECUTION SUMMARY:
   🧠 Routing Decision: ContextualCompression (The query indicates an urgent production incident requiring immediate assistance.)
   🔍 Retrieval Method: ContextualCompression
   📄 Retrieved Contexts: 3
   🎫 Relevant Tickets: 3
   💬 Messages: 3
   ⏱️  Retrieval Time: 2.623s

✍️  FINAL ANSWER (1751 chars):
   Given the urgency of your query regarding a server crash, and considering this is a production incident, here are some immediate steps and insights based on the retrieved JIRA tickets:

1. **Memory Is...

🎫 RELEVANT TICKETS:
   • JBIDE-16308: Cannot start JBT 4.2.0.Alpha1 on Fedora 19...
   • JBIDE-16273: Java EE Web Project archetype from

2025-08-19 20:43:49,311 - BM25Agent - INFO - Using BM25 retriever for query: 'comprehensive analysis of authentication issues ac...'
2025-08-19 20:43:49,312 - BM25Agent - INFO - BM25 retriever returned 5 documents
2025-08-19 20:43:49,313 - BM25Agent - INFO - Filtered 5 -> 5 valid documents
2025-08-19 20:43:49,313 - BM25Agent - INFO - BM25 retrieve returning 5 results with valid content
2025-08-19 20:43:49,314 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'comprehensive analysis of authentication issues ac...' in bugs
2025-08-19 20:43:49,314 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None


✅ Supervisor decision: Ensemble - The query requires a comprehensive analysis of authentication issues, and the user can wait for thorough results.
   Analysis time: 1.20s
🔗 Ensemble Agent processing: 'comprehensive analysis of authentication issues across the system'
   Using comprehensive multi-method retrieval...
🔄 Using individual agent ensemble
🔄 Using compression retriever with LangChain wrapper


2025-08-19 20:43:50,107 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:43:50,112 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 11 above threshold 0.1
2025-08-19 20:43:50,113 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.1309', '0.2149', '0.1381']
2025-08-19 20:43:50,113 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:43:50,114 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'comprehensive analysis of authentication issues ac...' in bugs
2025-08-19 20:43:50,115 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:43:50,741 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:43:50,746 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 11 above threshold 0.1
2025-08-19 20:43:50,746 - SupabaseRetriever_bugs - INFO - Result simila

✅ Compression retriever with content extraction: 3 results


2025-08-19 20:43:51,613 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:43:51,619 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 11 above threshold 0.1
2025-08-19 20:43:51,619 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.1309', '0.2149', '0.1381']
2025-08-19 20:43:51,620 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:43:52,373 - SupabaseRetriever_bugs - INFO - Direct vector search for: '1. Can you provide a detailed examination of authe...' in bugs
2025-08-19 20:43:52,373 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:43:52,950 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:43:52,953 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:43:52,953 - SupabaseRetriever_bugs - INFO - Result simila

✅ Individual agent ensemble: 8 deduplicated results
✅ Ensemble Agent completed: 8 results in 4.83s
✍️  ResponseWriter Agent  generating response...
✅ ResponseWriter completed in 6.29s
   Generated response: 1115 characters
   Relevant tickets: 8

✅ WORKFLOW EXECUTION COMPLETED in 12.36s

📊 EXECUTION SUMMARY:
   🧠 Routing Decision: Ensemble (The query requires a comprehensive analysis of authentication issues, and the user can wait for thorough results.)
   🔍 Retrieval Method: Ensemble
   📄 Retrieved Contexts: 8
   🎫 Relevant Tickets: 8
   💬 Messages: 3
   ⏱️  Retrieval Time: 4.834s

✍️  FINAL ANSWER (1115 chars):
   Based on your query regarding a comprehensive analysis of authentication issues across the system, it appears that none of the retrieved JIRA tickets directly address authentication issues. The ticket...

🎫 RELEVANT TICKETS:
   • FLEX-33924: iTunes Store submit email....
   • HBASE-8902: IntegrationTestBulkLoad takes way too long...
   • SPR-11132: EhCacheFactoryBean.afterP

## 📊 **Phase 5: Performance and Integration Analysis**

This phase compares the latency and result counts of the three retrieval agents, then runs component, integration, and end-to-end workflow health checks.
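
The cells in this phase measure elapsed time with `datetime.now()`. For short intervals, a monotonic clock such as `time.perf_counter` is usually a better fit, since it is unaffected by system clock adjustments. A minimal sketch, where `timed_call` is a hypothetical helper and not part of the notebook:

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload; in the notebook this could wrap agent.process(state).
result, elapsed = timed_call(sum, range(1000))
print(f"result={result}, elapsed={elapsed:.6f}s")
```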

In [19]:
# Cell 14: Performance comparison between agents
print("📊 Performance comparison between individual agents...")

if 'bm25_agent' in locals() and 'contextual_compression_agent' in locals() and 'ensemble_agent' in locals():
    test_query = "database connection error timeout"
    performance_results = []
    
    print(f"\n🏃 Performance testing with query: '{test_query}'\n")
    
    # Test each agent's performance
    agents_to_test = [
        ('BM25', bm25_agent),
        ('ContextualCompression', contextual_compression_agent),
        ('Ensemble', ensemble_agent)
    ]
    
    for agent_name, agent in agents_to_test:
        try:
            print(f"⏱️  Testing {agent_name}Agent...")
            
            # Create test state
            test_state: AgentState = {
                'query': test_query,
                'user_can_wait': True,
                'production_incident': False,
                'routing_decision': None,
                'routing_reasoning': None,
                'retrieved_contexts': [],
                'retrieval_method': None,
                'retrieval_metadata': {},
                'final_answer': None,
                'relevant_tickets': [],
                'messages': []
            }
            
            # Measure execution time
            start_time = datetime.now()
            result_state = agent.process(test_state.copy())
            execution_time = (datetime.now() - start_time).total_seconds()
            
            # Collect results
            num_results = len(result_state.get('retrieved_contexts', []))
            retrieval_metadata = result_state.get('retrieval_metadata', {})
            
            performance_results.append({
                'agent': agent_name,
                'execution_time': execution_time,
                'num_results': num_results,
                'retrieval_time': retrieval_metadata.get('processing_time', execution_time),
                'method_type': retrieval_metadata.get('method_type', 'unknown'),
                'additional_info': retrieval_metadata
            })
            
            print(f"   ✅ {agent_name}: {execution_time:.3f}s, {num_results} results")
            
        except Exception as e:
            print(f"   ❌ {agent_name} failed: {e}")
            performance_results.append({
                'agent': agent_name,
                'execution_time': float('inf'),
                'num_results': 0,
                'error': str(e)
            })
    
    # Display performance summary
    print("\n📈 PERFORMANCE SUMMARY:")
    print(f"{'Agent':<20} {'Time (s)':<10} {'Results':<8} {'Method Type':<25}")
    print("-" * 65)
    
    # Sort by execution time
    performance_results.sort(key=lambda x: x['execution_time'])
    
    for result in performance_results:
        if 'error' not in result:
            agent = result['agent']
            exec_time = f"{result['execution_time']:.3f}"
            num_results = str(result['num_results'])
            method_type = result.get('method_type', 'unknown')[:24]
            
            print(f"{agent:<20} {exec_time:<10} {num_results:<8} {method_type:<25}")
        else:
            print(f"{result['agent']:<20} {'ERROR':<10} {'0':<8} {result.get('error', '')[:24]:<25}")
    
    # Performance insights
    print("\n💡 PERFORMANCE INSIGHTS:")
    valid_results = [r for r in performance_results if 'error' not in r]
    
    if valid_results:
        fastest = min(valid_results, key=lambda x: x['execution_time'])
        slowest = max(valid_results, key=lambda x: x['execution_time'])
        most_results = max(valid_results, key=lambda x: x['num_results'])
        
        print(f"   🏃 Fastest: {fastest['agent']} ({fastest['execution_time']:.3f}s)")
        print(f"   🐌 Slowest: {slowest['agent']} ({slowest['execution_time']:.3f}s)")
        print(f"   📊 Most results: {most_results['agent']} ({most_results['num_results']} results)")
        
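        if len(valid_results) > 1:
            time_diff = slowest['execution_time'] - fastest['execution_time']
            # Guard against a zero fastest time, which is possible at coarse timer resolution
            if fastest['execution_time'] > 0:
                pct_slower = time_diff / fastest['execution_time'] * 100
                print(f"   ⏱️  Speed difference: {time_diff:.3f}s ({pct_slower:.1f}% slower)")
            else:
                print(f"   ⏱️  Speed difference: {time_diff:.3f}s")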

else:
    print("❌ Cannot perform performance comparison - agents not properly initialized")

2025-08-19 20:44:00,501 - BM25Agent - INFO - BM25 Agent processing query: 'database connection error timeout'
2025-08-19 20:44:00,503 - BM25Agent - INFO - Using BM25 retriever for query: 'database connection error timeout...'
2025-08-19 20:44:00,508 - BM25Agent - INFO - BM25 retriever returned 5 documents
2025-08-19 20:44:00,509 - BM25Agent - INFO - Filtered 5 -> 5 valid documents
2025-08-19 20:44:00,509 - BM25Agent - INFO - BM25 retrieve returning 5 results with valid content
2025-08-19 20:44:00,509 - BM25Agent - INFO - BM25 Agent completed in 0.01s with 5 results
2025-08-19 20:44:00,510 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'database connection error timeout...' in bugs
2025-08-19 20:44:00,510 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None


📊 Performance comparison between individual agents...

🏃 Performance testing with query: 'database connection error timeout'

⏱️  Testing BM25Agent...
🔍 BM25 Agent processing: 'database connection error timeout'
✅ BM25 Agent completed: 5 results in 0.01s
   ✅ BM25: 0.008s, 5 results
⏱️  Testing ContextualCompressionAgent...
⚡ ContextualCompression Agent  processing: 'database connection error timeout'
🔄 Using compression retriever with LangChain wrapper


2025-08-19 20:44:01,175 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:44:01,179 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:44:01,180 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2513', '0.1837', '0.1588']
2025-08-19 20:44:01,180 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:44:01,181 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'database connection error timeout...' in bugs
2025-08-19 20:44:01,181 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:44:02,201 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:44:02,206 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:44:02,206 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2513'

✅ Compression retriever with content extraction: 3 results
✅ ContextualCompression Agent completed: 3 results in 1.88s
   ✅ ContextualCompression: 1.881s, 3 results
⏱️  Testing EnsembleAgent...
🔗 Ensemble Agent processing: 'database connection error timeout'
   Using comprehensive multi-method retrieval...
🔄 Using individual agent ensemble
🔄 Using compression retriever with LangChain wrapper


2025-08-19 20:44:03,194 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:44:03,198 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:44:03,198 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2513', '0.1837', '0.1588']
2025-08-19 20:44:03,199 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:44:03,200 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'database connection error timeout...' in bugs
2025-08-19 20:44:03,200 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:44:03,932 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:44:03,935 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:44:03,935 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2513'

✅ Compression retriever with content extraction: 3 results


2025-08-19 20:44:04,736 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:44:04,740 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:44:04,740 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2513', '0.1837', '0.1588']
2025-08-19 20:44:04,741 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:44:05,401 - SupabaseRetriever_bugs - INFO - Direct vector search for: '1. What are common causes of database connection e...' in bugs
2025-08-19 20:44:05,414 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:44:06,278 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:44:06,283 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 12 above threshold 0.1
2025-08-19 20:44:06,283 - SupabaseRetriever_bugs - INFO - Result simila

✅ Individual agent ensemble: 8 deduplicated results
✅ Ensemble Agent completed: 8 results in 5.00s
   ✅ Ensemble: 5.001s, 8 results

📈 PERFORMANCE SUMMARY:
Agent                Time (s)   Results  Method Type              
-----------------------------------------------------------------
BM25                 0.008      5        keyword_based            
ContextualCompression 1.881      3        semantic_with_reranking  
Ensemble             5.001      8        multi_method_ensemble    

💡 PERFORMANCE INSIGHTS:
   🏃 Fastest: BM25 (0.008s)
   🐌 Slowest: Ensemble (5.001s)
   📊 Most results: Ensemble (8 results)
   ⏱️  Speed difference: 4.992s (59808.0% slower)
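
A percentage in the tens of thousands is hard to read at a glance; a multiplicative factor conveys the same gap more directly. The sketch below uses a hypothetical helper, `slowdown`, and the rounded times from the summary table, so its factor differs slightly from the unrounded figure printed above.

```python
def slowdown(fastest_s: float, slowest_s: float) -> str:
    """Describe how much slower the slowest run is, as a multiplicative factor."""
    if fastest_s <= 0:
        # A zero baseline can occur when the timer resolution is too coarse.
        return "n/a (fastest time measured as zero)"
    factor = slowest_s / fastest_s
    return f"{factor:.0f}x slower"


# Rounded values from the performance summary (BM25 vs Ensemble).
print(slowdown(0.008, 5.001))
```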


In [20]:
# Cell 15: Integration and workflow health check
print("🔍 Integration and workflow health check...")

health_check_results = {
    'components': {},
    'integration': {},
    'workflow': {},
    'overall_status': 'UNKNOWN'
}

# Component health check
print("\n🏥 COMPONENT HEALTH CHECK:")
print("-" * 40)

components_to_check = [
    ('Supabase Client', 'supabase_client'),
    ('Vectorstore', 'vectorstore'),
    ('LLMs', 'rag_llm'),
    ('BM25 Agent', 'bm25_agent'),
    ('ContextualCompression Agent', 'contextual_compression_agent'),
    ('Ensemble Agent', 'ensemble_agent'),
    ('ResponseWriter Agent', 'response_writer_agent'),
    ('Supervisor Agent', 'supervisor_agent'),
    ('Compiled Workflow', 'compiled_workflow')
]

for component_name, variable_name in components_to_check:
    try:
        # At notebook top level, globals() holds every cell-defined variable
        if variable_name in globals():
            component = globals()[variable_name]
            if component is not None:
                print(f"✅ {component_name}: Initialized")
                health_check_results['components'][component_name] = 'OK'
            else:
                print(f"❌ {component_name}: Variable exists but is None")
                health_check_results['components'][component_name] = 'NULL'
        else:
            print(f"❌ {component_name}: Not found")
            health_check_results['components'][component_name] = 'MISSING'
    except Exception as e:
        print(f"⚠️  {component_name}: Error checking - {e}")
        health_check_results['components'][component_name] = 'ERROR'

# Integration health check (direct calls to each dependency)
print("\n🔗 INTEGRATION HEALTH CHECK:")
print("-" * 40)

# Test vectorstore directly
try:
    if vectorstore is not None:
        test_result = vectorstore.similarity_search("test", k=1)
        print(f"✅ Vectorstore Search: {len(test_result)} results")
        health_check_results['integration']['Vectorstore Search'] = 'OK'
    else:
        print("⚠️  Vectorstore Search: vectorstore is None")
        health_check_results['integration']['Vectorstore Search'] = 'NULL'
except NameError:
    print("❌ Vectorstore Search: Variable not found")
    health_check_results['integration']['Vectorstore Search'] = 'MISSING'
except Exception as e:
    print(f"❌ Vectorstore Search: Failed - {str(e)[:50]}...")
    health_check_results['integration']['Vectorstore Search'] = 'FAILED'

# Test LLM directly
try:
    if rag_llm is not None:
        test_result = rag_llm.invoke("test")
        print(f"✅ LLM Connectivity: {type(test_result).__name__}")
        health_check_results['integration']['LLM Connectivity'] = 'OK'
    else:
        print("⚠️  LLM Connectivity: rag_llm is None")
        health_check_results['integration']['LLM Connectivity'] = 'NULL'
except NameError:
    print("❌ LLM Connectivity: Variable not found")
    health_check_results['integration']['LLM Connectivity'] = 'MISSING'
except Exception as e:
    print(f"❌ LLM Connectivity: Failed - {str(e)[:50]}...")
    health_check_results['integration']['LLM Connectivity'] = 'FAILED'

# Test agent state creation
try:
    test_state = {
        'query': 'test',
        'user_can_wait': True,
        'production_incident': False,
        'routing_decision': None,
        'routing_reasoning': None,
        'retrieved_contexts': [],
        'retrieval_method': None,
        'retrieval_metadata': {},
        'final_answer': None,
        'relevant_tickets': [],
        'messages': []
    }
    print("✅ Agent State Creation: Working")
    health_check_results['integration']['Agent State Creation'] = 'OK'
except Exception as e:
    print(f"❌ Agent State Creation: Failed - {str(e)[:50]}...")
    health_check_results['integration']['Agent State Creation'] = 'FAILED'

# Test node function execution
try:
    if supervisor_node is not None:
        test_result = supervisor_node({
            'query': 'test integration',
            'user_can_wait': True,
            'production_incident': False,
            'routing_decision': None,
            'routing_reasoning': None,
            'retrieved_contexts': [],
            'retrieval_method': None,
            'retrieval_metadata': {},
            'final_answer': None,
            'relevant_tickets': [],
            'messages': []
        })
        if test_result is not None and isinstance(test_result, dict):
            print("✅ Node Function Execution: Working")
            health_check_results['integration']['Node Function Execution'] = 'OK'
        else:
            print("⚠️  Node Function Execution: Returned invalid result")
            health_check_results['integration']['Node Function Execution'] = 'NULL'
    else:
        print("⚠️  Node Function Execution: supervisor_node is None")
        health_check_results['integration']['Node Function Execution'] = 'NULL'
except NameError:
    print("❌ Node Function Execution: supervisor_node not found")
    health_check_results['integration']['Node Function Execution'] = 'MISSING'
except Exception as e:
    print(f"❌ Node Function Execution: Failed - {str(e)[:50]}...")
    health_check_results['integration']['Node Function Execution'] = 'FAILED'

# Workflow health check
print("\n⚡ WORKFLOW HEALTH CHECK:")
print("-" * 40)

if 'compiled_workflow' in locals() and compiled_workflow is not None:
    try:
        # Quick workflow test
        quick_test_state: AgentState = {
            'query': 'health check test',
            'user_can_wait': False,
            'production_incident': False,
            'routing_decision': None,
            'routing_reasoning': None,
            'retrieved_contexts': [],
            'retrieval_method': None,
            'retrieval_metadata': {},
            'final_answer': None,
            'relevant_tickets': [],
            'messages': []
        }
        
        workflow_result = compiled_workflow.invoke(quick_test_state)
        
        if workflow_result and workflow_result.get('final_answer'):
            print("✅ End-to-End Workflow: Complete execution successful")
            health_check_results['workflow']['end_to_end'] = 'OK'
            
            # Check workflow completeness
            required_fields = ['routing_decision', 'retrieval_method', 'final_answer']
            missing_fields = [field for field in required_fields if not workflow_result.get(field)]
            
            if not missing_fields:
                print("✅ Workflow Completeness: All required fields populated")
                health_check_results['workflow']['completeness'] = 'OK'
            else:
                print(f"⚠️  Workflow Completeness: Missing fields - {missing_fields}")
                health_check_results['workflow']['completeness'] = 'INCOMPLETE'
        else:
            print("⚠️  End-to-End Workflow: Execution completed but no final answer")
            health_check_results['workflow']['end_to_end'] = 'INCOMPLETE'
            
    except Exception as e:
        print(f"❌ End-to-End Workflow: Failed - {e}")
        health_check_results['workflow']['end_to_end'] = 'FAILED'
else:
    print("❌ End-to-End Workflow: Compiled workflow not available")
    health_check_results['workflow']['end_to_end'] = 'MISSING'

# Overall health assessment
print("\n🏥 OVERALL HEALTH ASSESSMENT:")
print("=" * 50)

all_statuses = []
all_statuses.extend(health_check_results['components'].values())
all_statuses.extend(health_check_results['integration'].values())
all_statuses.extend(health_check_results['workflow'].values())

ok_count = all_statuses.count('OK')
total_count = len(all_statuses)
health_percentage = (ok_count / total_count) * 100 if total_count > 0 else 0

if health_percentage >= 90:
    overall_status = "🟢 EXCELLENT"
    health_check_results['overall_status'] = 'EXCELLENT'
elif health_percentage >= 75:
    overall_status = "🟡 GOOD"
    health_check_results['overall_status'] = 'GOOD'
elif health_percentage >= 50:
    overall_status = "🟠 FAIR"
    health_check_results['overall_status'] = 'FAIR'
else:
    overall_status = "🔴 POOR"
    health_check_results['overall_status'] = 'POOR'

print(f"System Health: {overall_status} ({ok_count}/{total_count} checks passed - {health_percentage:.1f}%)")
print(f"\nComponents: {sum(1 for v in health_check_results['components'].values() if v == 'OK')}/{len(health_check_results['components'])} OK")
print(f"Integration: {sum(1 for v in health_check_results['integration'].values() if v == 'OK')}/{len(health_check_results['integration'])} OK")
print(f"Workflow: {sum(1 for v in health_check_results['workflow'].values() if v == 'OK')}/{len(health_check_results['workflow'])} OK")

if health_percentage < 100:
    failed_checks = [category + '.' + name for category, checks in health_check_results.items() 
                    if isinstance(checks, dict) for name, status in checks.items() if status != 'OK']
    print(f"\n⚠️  Issues found: {', '.join(failed_checks)}")

print("\n" + "=" * 50)

2025-08-19 20:44:07,422 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'test...' in bugs
2025-08-19 20:44:07,422 - SupabaseRetriever_bugs - INFO - Parameters: k=1, similarity_threshold=0.1, filters=None


🔍 Integration and workflow health check...

🏥 COMPONENT HEALTH CHECK:
----------------------------------------
✅ Supabase Client: Initialized
✅ Vectorstore: Initialized
✅ LLMs: Initialized
✅ BM25 Agent: Initialized
✅ ContextualCompression Agent: Initialized
✅ Ensemble Agent: Initialized
✅ ResponseWriter Agent: Initialized
✅ Supervisor Agent: Initialized
✅ Compiled Workflow: Initialized

🔗 INTEGRATION HEALTH CHECK:
----------------------------------------


2025-08-19 20:44:08,168 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-19 20:44:08,169 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 2 above threshold 0.1
2025-08-19 20:44:08,169 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.1566', '0.1008']
2025-08-19 20:44:08,169 - SupabaseRetriever_bugs - INFO - Direct vector search returned 1 results (from 3 candidates)


✅ Vectorstore Search: 1 results
✅ LLM Connectivity: AIMessage
✅ Agent State Creation: Working
🧠 Supervisor Agent analyzing query: 'test integration'
   user_can_wait: True, production_incident: False
✅ Supervisor decision: Ensemble - The user can wait, and the query is not urgent or specific, suggesting a comprehensive search is appropriate.
   Analysis time: 1.21s
✅ Node Function Execution: Working

⚡ WORKFLOW HEALTH CHECK:
----------------------------------------
🧠 Supervisor Agent analyzing query: 'health check test'
   user_can_wait: False, production_incident: False


2025-08-19 20:44:11,010 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'health check test...' in bugs
2025-08-19 20:44:11,011 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None


✅ Supervisor decision: ContextualCompression - The query is a general troubleshooting question and the user cannot wait, making ContextualCompression the best choice for fast semantic search.
   Analysis time: 1.22s
⚡ ContextualCompression Agent  processing: 'health check test'
🔄 Using compression retriever with LangChain wrapper


2025-08-19 20:44:11,760 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:44:11,763 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 6 above threshold 0.1
2025-08-19 20:44:11,763 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.1059', '0.1264', '0.1356']
2025-08-19 20:44:11,764 - SupabaseRetriever_bugs - INFO - Direct vector search returned 4 results (from 12 candidates)
2025-08-19 20:44:11,764 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'health check test...' in bugs
2025-08-19 20:44:11,765 - SupabaseRetriever_bugs - INFO - Parameters: k=4, similarity_threshold=0.1, filters=None
2025-08-19 20:44:12,471 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-19 20:44:12,475 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 6 above threshold 0.1
2025-08-19 20:44:12,475 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.1058', '0.1265', '0.135

✅ Compression retriever with content extraction: 3 results
✅ ContextualCompression Agent completed: 3 results in 1.63s
✍️  ResponseWriter Agent  generating response...
✅ ResponseWriter completed in 4.63s
   Generated response: 830 characters
   Relevant tickets: 3
✅ End-to-End Workflow: Complete execution successful
✅ Workflow Completeness: All required fields populated

🏥 OVERALL HEALTH ASSESSMENT:
System Health: 🟢 EXCELLENT (15/15 checks passed - 100.0%)

Components: 9/9 OK
Integration: 4/4 OK
Workflow: 2/2 OK
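
The overall assessment above maps the share of passing checks to a status label using the thresholds from the health check cell (90, 75, and 50 percent). As a minimal sketch, the same mapping could be factored into a small function so it can be reused or unit tested. The name `classify_health` is hypothetical and not part of the notebook.

```python
def classify_health(statuses):
    """Return (label, percentage) for a list of check statuses such as 'OK' or 'FAILED'."""
    total = len(statuses)
    ok = sum(1 for s in statuses if s == 'OK')
    percentage = (ok / total) * 100 if total > 0 else 0.0
    # Thresholds mirror the health check cell: >= 90, >= 75, >= 50, otherwise POOR
    for threshold, label in ((90, 'EXCELLENT'), (75, 'GOOD'), (50, 'FAIR')):
        if percentage >= threshold:
            return label, percentage
    return 'POOR', percentage


# Example using the counts from the run above: 15 checks, all 'OK'
label, pct = classify_health(['OK'] * 15)
print(f"{label} ({pct:.1f}%)")  # prints "EXCELLENT (100.0%)"
```

Keeping the thresholds in one tuple makes it easier to adjust them without touching the branching logic.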



## 🎯 **Phase 6: Final Summary and Recommendations**

This phase summarizes the test results from the previous phases and lists recommendations for system optimization.
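
The summary cell in this phase keeps its results in a dictionary (`test_summary`) "for potential export" but does not write it anywhere. One possible way to persist such a dictionary is sketched below, assuming JSON is an acceptable format. The helper name `export_summary`, the file name, and the example values are hypothetical; `default=str` is only a fallback for values that the `json` module cannot encode natively.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path


def export_summary(summary, path):
    """Write a summary dict to `path` as UTF-8 JSON and return the path."""
    out = Path(path)
    # ensure_ascii=False keeps the emoji labels readable in the file
    text = json.dumps(summary, indent=2, ensure_ascii=False, default=str)
    out.write_text(text, encoding='utf-8')
    return out


# Hypothetical stand-in for the notebook's test_summary dictionary
example_summary = {
    'timestamp': datetime.now().isoformat(),
    'components_tested': ['✅ BM25Agent (keyword-based retrieval)'],
    'recommendations': ['📊 MONITORING: Implement metrics collection for production usage'],
}

out_path = export_summary(
    example_summary,
    Path(tempfile.gettempdir()) / 'test_summary_example.json',
)

# Read the file back to confirm the round trip
restored = json.loads(out_path.read_text(encoding='utf-8'))
print(restored['components_tested'])
```

In the notebook itself, the call would take `test_summary` and a path of your choosing after the summary cell has run.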

In [21]:
# Cell 16: Final testing summary and recommendations
print("📋 FINAL TESTING SUMMARY AND RECOMMENDATIONS")
print("=" * 60)

# Collect all test results
summary_data = {
    'timestamp': datetime.now().isoformat(),
    'components_tested': [],
    'performance_data': {},  # replaced with a dict of timing results below when available
    'health_check': health_check_results if 'health_check_results' in locals() else {},
    'recommendations': []
}


# Components tested
components_tested = [
    '✅ AgentState class and common utilities',
    '✅ LLM components (GPT-3.5-turbo, GPT-4o)',
    '✅ Supabase vectorstore integration',
    '✅ BM25Agent (keyword-based retrieval)',
    '✅ ContextualCompressionAgent (semantic + reranking)',
    '✅ EnsembleAgent (multi-method retrieval)',
    '✅ ResponseWriterAgent (GPT-4o response generation)',
    '✅ SupervisorAgent (intelligent query routing)',
    '✅ LangGraph workflow orchestration',
    '✅ End-to-end query processing'
]

print("\n🧪 COMPONENTS TESTED:")
for component in components_tested:
    print(f"   {component}")
    summary_data['components_tested'].append(component)

# Performance insights
print("\n📊 PERFORMANCE INSIGHTS:")
if 'performance_results' in locals() and performance_results:
    valid_results = [r for r in performance_results if 'error' not in r]
    if valid_results:
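        fastest = min(valid_results, key=lambda x: x['execution_time'])
        # The slowest agent is used as a proxy for "most comprehensive";
        # it is selected by execution time, not by number of results.
        slowest = max(valid_results, key=lambda x: x['execution_time'])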
        
        print(f"   🏃 Best Performance: {fastest['agent']} ({fastest['execution_time']:.3f}s)")
        print(f"   🔍 Most Comprehensive: {slowest['agent']} ({slowest['execution_time']:.3f}s, {slowest['num_results']} results)")
        
        summary_data['performance_data'] = {
            'fastest_agent': fastest['agent'],
            'fastest_time': fastest['execution_time'],
            'most_comprehensive': slowest['agent'],
            'comprehensive_time': slowest['execution_time'],
            'all_results': performance_results
        }
else:
    print("   ⚠️  Performance data not available")

# Health check summary
if 'health_check_results' in locals():
    overall_status = health_check_results.get('overall_status', 'UNKNOWN')
    print(f"\n🏥 SYSTEM HEALTH: {overall_status}")
    
    # Count successful components
    components_ok = sum(1 for v in health_check_results.get('components', {}).values() if v == 'OK')
    integration_ok = sum(1 for v in health_check_results.get('integration', {}).values() if v == 'OK')
    workflow_ok = sum(1 for v in health_check_results.get('workflow', {}).values() if v == 'OK')
    
    print(f"   Components: {components_ok} OK")
    print(f"   Integration: {integration_ok} OK")
    print(f"   Workflow: {workflow_ok} OK")

# Generate recommendations
print("\n💡 RECOMMENDATIONS:")
recommendations = []

# Performance recommendations
if 'performance_results' in locals() and performance_results:
    valid_results = [r for r in performance_results if 'error' not in r]
    if valid_results:
        fastest_time = min(r['execution_time'] for r in valid_results)
        slowest_time = max(r['execution_time'] for r in valid_results)
        
        if slowest_time > fastest_time * 3:
            recommendations.append("🚀 PERFORMANCE: Consider caching for slower retrieval methods to improve response times")
        
        ensemble_result = next((r for r in valid_results if r['agent'] == 'Ensemble'), None)
        if ensemble_result and ensemble_result['execution_time'] > 2.0:
            recommendations.append("🔗 ENSEMBLE: Consider parallel processing for ensemble methods to reduce latency")

# Health-based recommendations
if 'health_check_results' in locals():
    failed_components = [name for name, status in health_check_results.get('components', {}).items() if status != 'OK']
    if failed_components:
        recommendations.append(f"🔧 COMPONENTS: Address issues with: {', '.join(failed_components)}")
    
    failed_integration = [name for name, status in health_check_results.get('integration', {}).items() if status != 'OK']
    if failed_integration:
        recommendations.append(f"🔗 INTEGRATION: Fix integration issues: {', '.join(failed_integration)}")
    
    workflow_status = health_check_results.get('workflow', {}).get('end_to_end')
    if workflow_status != 'OK':
        recommendations.append("⚡ WORKFLOW: Debug end-to-end workflow execution issues")

# General recommendations
general_recommendations = [
    "📊 MONITORING: Implement metrics collection for production usage",
    "🔒 SECURITY: Review API key management and access controls",
    "📈 SCALING: Consider connection pooling for high-throughput scenarios",
    "🧪 TESTING: Add automated testing for continuous integration",
    "📝 DOCUMENTATION: Document optimal query patterns for each agent type"
]

recommendations.extend(general_recommendations)

# Display recommendations
for i, recommendation in enumerate(recommendations, 1):
    print(f"   {i}. {recommendation}")
    summary_data['recommendations'].append(recommendation)

# Agent usage guidelines
print("\n🎯 AGENT USAGE GUIDELINES:")
usage_guidelines = [
    "🔍 BM25Agent: Use for specific ticket IDs, exact error messages, technical terms",
    "⚡ ContextualCompressionAgent: Use for production incidents, general troubleshooting",
    "🔗 EnsembleAgent: Use for complex research queries, comprehensive analysis",
    "🧠 SupervisorAgent: Automatically routes queries based on urgency and complexity"
]

for guideline in usage_guidelines:
    print(f"   • {guideline}")

# Final status
print("\n" + "=" * 60)
if 'health_check_results' in locals():
    overall_status = health_check_results.get('overall_status', 'UNKNOWN')
    if overall_status in ['EXCELLENT', 'GOOD']:
        print("🎉 TESTING COMPLETE: Multi-agent system is ready for production use!")
        print("   All core components are functioning correctly.")
        print("   LangGraph workflow orchestration is working as expected.")
    elif overall_status == 'FAIR':
        print("⚠️  TESTING COMPLETE: System is functional but needs attention.")
        print("   Address the recommendations above before production deployment.")
    else:
        print("❌ TESTING COMPLETE: System has significant issues.")
        print("   Critical fixes needed before production use.")
else:
    print("✅ TESTING COMPLETE: All individual components tested successfully.")
    print("   System appears ready for further integration testing.")

print("\n📋 Test results summary available in local variables for further analysis.")
print("=" * 60)

# Save summary data for potential export
test_summary = summary_data
print("\n💾 Test summary data saved to 'test_summary' variable")
print(f"   Timestamp: {test_summary['timestamp']}")
print(f"   Components tested: {len(test_summary['components_tested'])}")
print(f"   Recommendations: {len(test_summary['recommendations'])}")

📋 FINAL TESTING SUMMARY AND RECOMMENDATIONS

🧪 COMPONENTS TESTED:
   ✅ AgentState class and common utilities
   ✅ LLM components (GPT-3.5-turbo, GPT-4o)
   ✅ Supabase vectorstore integration
   ✅ BM25Agent (keyword-based retrieval)
   ✅ ContextualCompressionAgent (semantic + reranking)
   ✅ EnsembleAgent (multi-method retrieval)
   ✅ ResponseWriterAgent (GPT-4o response generation)
   ✅ SupervisorAgent (intelligent query routing)
   ✅ LangGraph workflow orchestration
   ✅ End-to-end query processing

📊 PERFORMANCE INSIGHTS:
   🏃 Best Performance: BM25 (0.008s)
   🔍 Most Comprehensive: Ensemble (5.001s, 8 results)

🏥 SYSTEM HEALTH: EXCELLENT
   Components: 9 OK
   Integration: 4 OK
   Workflow: 2 OK

💡 RECOMMENDATIONS:
   1. 🚀 PERFORMANCE: Consider caching for slower retrieval methods to improve response times
   2. 🔗 ENSEMBLE: Consider parallel processing for ensemble methods to reduce latency
   3. 📊 MONITORING: Implement metrics collection for production usage
   4. 🔒 SECURITY: Revie