# 🧪 TestRAGTools.ipynb - Advanced RAG System Testing Framework

**Purpose**: Comprehensive testing framework for Cuttlefish4's Advanced RAG (Retrieval-Augmented Generation) System  
**Status**: ✅ Complete and Functional  
**Created**: August 2025  

---

## 🎯 **Testing Scope**

This notebook systematically tests all components of the Advanced RAG system for Cuttlefish4:

### **🔍 Core RAG Functionality**
1. **Vector Search** - Semantic similarity using OpenAI embeddings
2. **Keyword Search** - BM25-style full-text search via PostgreSQL
3. **Hybrid Search** - Weighted combination of vector + keyword
4. **Contextual Compression** - Vector search with high-quality thresholds

### **🆕 Advanced Ensemble Retrieval** (Crown Jewel Feature)
**4-Method Sophisticated System**:
- **Multi-Query Expansion** - LLM generates query variations (GPT-3.5-turbo)
- **Contextual Compression** - Advanced reranking with Cohere API
- **BM25 Retrieval** - Enhanced keyword search with document frequency
- **Weighted Ensemble** - 25% each method with content hash deduplication

### **📊 Data Collections**
- **Bugs Collection**: 4,910 JIRA bug reports and technical issues
- **PCR Collection**: 2,860 Program Change Requests and enhancements

### **🛠️ Additional Testing**
- **Document Lookup** - Direct retrieval by ID and JIRA ticket numbers
- **Tool Registry** - Dynamic access to all RAG methods
- **Performance Analysis** - Speed and accuracy benchmarking
- **Error Handling** - Robustness testing with edge cases

---

## 📋 **Prerequisites**

**Environment Variables Required**:
- `CUTTLEFISH_HOME` - Project root directory path
- `SUPABASE_URL`, `SUPABASE_KEY` - Database access
- `OPENAI_API_KEY` - For embeddings and LLM operations
- `COHERE_API_KEY` - Optional, for advanced reranking

**Data Requirements**:
- Populated Supabase tables: `bugs` and `pcr` with JIRA data
- Vector embeddings and full-text search indices
- RPC function: `match_documents_vector` for vector similarity search

**Dependencies**:
- Supabase Python client, OpenAI SDK, python-dotenv
- NumPy, Pandas for data processing
- All Cuttlefish4 RAG modules properly configured

---

## ⚡ **Quick Start**

1. **Run Cell 0** - Install all required dependencies
2. **Run Cell 2** - Environment setup and path configuration
3. **Run Cell 4** - Initialize RAG tools and test connections
4. **Run Cell 6-8** - Configure test parameters and utilities
5. **Run Cell 23** - 🚨 **CRITICAL** - Advanced ensemble module reload
6. **Run Cells 10-36** - Execute all test sections systematically

### 🔍 **Success Indicators**
- ✅ All database connections successful (4,910 bugs + 2,860 PCR docs)
- ✅ All RAG methods return relevant results
- ✅ Advanced ensemble shows `Multi-query expansion`, `Contextual compression`, `BM25 retrieval`
- ✅ Performance tests complete within reasonable time limits
- ✅ System health assessment shows EXCELLENT or GOOD status

### 🎯 **Advanced Ensemble Focus**
This notebook provides comprehensive testing of the **sophisticated 4-method ensemble** that matches the complexity of the original Cuttlefish3_Complete.ipynb but adapted for Supabase. Look for:
- `source: 'advanced_ensemble_bugs'` in results
- Logs showing multiple sophisticated retrieval methods
- Content hash-based deduplication
- Weighted scoring combination

---

*Execute cells in order for systematic testing of the entire Advanced RAG system.*

## 🔧 **Step 1: Dependency Installation**

**⚠️ Run this cell first** to install all required Python packages for the RAG system.

This cell will install:
- **Supabase client** for database operations
- **OpenAI SDK** for embeddings and LLM operations  
- **Python-dotenv** for environment variable management
- **NumPy & Pandas** for data processing
- **Typing extensions** for enhanced type hints

**Note**: You may see some warnings during installation - these are normal and can be ignored.

In [1]:
# Install required dependencies
# Run this cell first if you don't have the dependencies installed

import subprocess
import sys

def install_package(package):
    """Install a package using pip."""
    try:
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])
        print(f"✅ Successfully installed {package}")
    except subprocess.CalledProcessError as e:
        print(f"❌ Failed to install {package}: {e}")

# Core dependencies for RAG tools
required_packages = [
    "supabase==2.18.0",
    "openai>=1.0.0",
    "python-dotenv>=1.0.0",
    "numpy>=1.24.0",
    "pandas>=2.0.0",
    "typing-extensions>=4.0.0"
]

print("🔧 Installing required packages...")
for package in required_packages:
    install_package(package)

print("\n📦 Package installation completed!")

print("\n🏠 CUTTLEFISH_HOME Environment Variable:")
print("This notebook uses the CUTTLEFISH_HOME environment variable to locate project files.")
print("Please set it to your project root directory:")
print("")
print("# In your terminal or .bashrc/.zshrc:")
print("export CUTTLEFISH_HOME=/Users/foohm/github/cuttlefish4")
print("")
print("# Or set it in your Jupyter environment:")
print("import os")
print("os.environ['CUTTLEFISH_HOME'] = '/Users/foohm/github/cuttlefish4'")

print("\n⚠️  Other Environment Variables Required:")
print("Make sure you also have a .env file or environment variables set for:")
print("- SUPABASE_URL=https://your-project.supabase.co")
print("- SUPABASE_KEY=your-service-role-key")
print("- OPENAI_API_KEY=your-openai-api-key")
print("- OPENAI_EMBED_MODEL=text-embedding-3-small (optional)")

print("\n🚀 After setting CUTTLEFISH_HOME, run the next cell to continue!")

🔧 Installing required packages...
✅ Successfully installed supabase==2.18.0



[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


✅ Successfully installed openai>=1.0.0



[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


✅ Successfully installed python-dotenv>=1.0.0



[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


✅ Successfully installed numpy>=1.24.0



[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


✅ Successfully installed pandas>=2.0.0



[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


✅ Successfully installed typing-extensions>=4.0.0

📦 Package installation completed!

🏠 CUTTLEFISH_HOME Environment Variable:
This notebook uses the CUTTLEFISH_HOME environment variable to locate project files.
Please set it to your project root directory:

# In your terminal or .bashrc/.zshrc:
export CUTTLEFISH_HOME=/Users/foohm/github/cuttlefish4

# Or set it in your Jupyter environment:
import os
os.environ['CUTTLEFISH_HOME'] = '/Users/foohm/github/cuttlefish4'

⚠️  Other Environment Variables Required:
Make sure you also have a .env file or environment variables set for:
- SUPABASE_URL=https://your-project.supabase.co
- SUPABASE_KEY=your-service-role-key
- OPENAI_API_KEY=your-openai-api-key
- OPENAI_EMBED_MODEL=text-embedding-3-small (optional)

🚀 After setting CUTTLEFISH_HOME, run the next cell to continue!



[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


## 🏠 **Step 2: Environment Setup & Path Configuration**

This section configures the Python environment and imports necessary modules for the RAG system.

### **What This Cell Does:**
- **Environment Detection**: Automatically detects project structure using `CUTTLEFISH_HOME`
- **Path Configuration**: Sets up Python import paths for all RAG modules
- **Environment Validation**: Loads and validates `.env` file with API keys
- **Module Imports**: Imports core RAG components with fallback error handling
- **Logging Setup**: Configures detailed logging for debugging

### **Required Environment Variables:**
- `CUTTLEFISH_HOME` - Project root directory path
- `SUPABASE_URL` - Your Supabase project URL  
- `SUPABASE_KEY` - Supabase service role key
- `OPENAI_API_KEY` - OpenAI API key for embeddings
- `COHERE_API_KEY` - (Optional) Cohere API key for advanced reranking

**📍 Important**: Make sure you're running this notebook from the `app/tools/` directory.

In [2]:
import os
import sys
import logging
from typing import List, Dict, Any
from datetime import datetime
import pandas as pd

# Import and load dotenv to read .env file
try:
    from dotenv import load_dotenv
    
    # Try to load .env from current directory first
    dotenv_loaded = load_dotenv()
    if dotenv_loaded:
        print("✅ Loaded .env file from current directory")
    else:
        # Try to load from project root if CUTTLEFISH_HOME is set
        cuttlefish_home = os.getenv('CUTTLEFISH_HOME')
        if cuttlefish_home:
            env_path = os.path.join(cuttlefish_home, '.env')
            if os.path.exists(env_path):
                load_dotenv(env_path)
                print(f"✅ Loaded .env file from: {env_path}")
            else:
                print(f"⚠️  .env file not found at: {env_path}")
        else:
            print("⚠️  No .env file found in current directory")
except ImportError:
    print("❌ python-dotenv not installed. Please install it first.")
    print("Run: pip install python-dotenv")

# Check for CUTTLEFISH_HOME environment variable
cuttlefish_home = os.getenv('CUTTLEFISH_HOME')

if not cuttlefish_home:
    print("⚠️  CUTTLEFISH_HOME environment variable not set!")
    print("Please set it to your project root directory:")
    print("export CUTTLEFISH_HOME=/Users/foohm/github/cuttlefish4")
    print("\nTrying to auto-detect project root...")
    
    # Auto-detect as fallback
    current_cwd = os.getcwd()
    if 'cuttlefish4' in current_cwd:
        parts = current_cwd.split(os.sep)
        cf4_index = next((i for i, part in enumerate(parts) if 'cuttlefish4' in part), None)
        if cf4_index is not None:
            cuttlefish_home = os.sep.join(parts[:cf4_index+1])
            print(f"🔍 Auto-detected project root: {cuttlefish_home}")
            # Try to load .env from auto-detected root
            env_path = os.path.join(cuttlefish_home, '.env')
            if os.path.exists(env_path):
                load_dotenv(env_path)
                print(f"✅ Loaded .env file from auto-detected root: {env_path}")
        else:
            cuttlefish_home = current_cwd
    else:
        cuttlefish_home = current_cwd
else:
    print(f"✅ CUTTLEFISH_HOME found: {cuttlefish_home}")

# Set up paths based on CUTTLEFISH_HOME
app_dir = os.path.join(cuttlefish_home, 'app')
rag_dir = os.path.join(app_dir, 'rag')
tools_dir = os.path.join(app_dir, 'tools')
supabase_dir = os.path.join(cuttlefish_home, 'supabase')

print(f"📁 Project root: {cuttlefish_home}")
print(f"📁 App directory: {app_dir}")
print(f"📁 RAG directory: {rag_dir}")
print(f"📁 Tools directory: {tools_dir}")
print(f"📁 Supabase directory: {supabase_dir}")

# Add directories to Python path
paths_to_add = [cuttlefish_home, app_dir, rag_dir, tools_dir, supabase_dir]
for path in paths_to_add:
    if os.path.exists(path) and path not in sys.path:
        sys.path.insert(0, path)
        print(f"✅ Added to Python path: {path}")
    elif not os.path.exists(path):
        print(f"⚠️  Path does not exist: {path}")

# Verify key directories exist
print(f"\n🔍 Directory verification:")
print(f"   App directory exists: {os.path.exists(app_dir)}")
print(f"   RAG directory exists: {os.path.exists(rag_dir)}")
print(f"   Tools directory exists: {os.path.exists(tools_dir)}")

# Try imports with better error handling
print(f"\n📦 Attempting imports...")

try:
    # Import Supabase retriever components
    from supabase_retriever import SupabaseRetriever, create_bugs_retriever, create_pcr_retriever
    print("✅ Successfully imported Supabase retriever components")
except ImportError as e:
    print(f"❌ Failed to import Supabase retriever: {e}")
    print("   Please check that supabase_retriever.py exists in the rag directory")

try:
    # Import RAG tools
    from rag_tools import RAGTools, get_rag_tools
    print("✅ Successfully imported RAG tools")
except ImportError as e:
    print(f"❌ Failed to import RAG tools: {e}")
    print("   Please check that rag_tools.py exists in the tools directory")

# Set up logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')

print(f"\n🎉 Setup completed!")
print(f"📍 Current working directory: {os.getcwd()}")

# Show environment variables status
print(f"\n🔧 Environment Variables Status:")
env_vars = ['CUTTLEFISH_HOME', 'SUPABASE_URL', 'SUPABASE_KEY', 'OPENAI_API_KEY', 'OPENAI_EMBED_MODEL']
for var in env_vars:
    value = os.getenv(var)
    if value:
        # Show first few characters for security (except CUTTLEFISH_HOME)
        if var == 'CUTTLEFISH_HOME':
            display_value = value
        else:
            display_value = value[:10] + "..." if len(value) > 10 else value
        print(f"   ✅ {var}: {display_value}")
    else:
        print(f"   ❌ {var}: Not set")

✅ Loaded .env file from current directory
✅ CUTTLEFISH_HOME found: /Users/foohm/github/cuttlefish4
📁 Project root: /Users/foohm/github/cuttlefish4
📁 App directory: /Users/foohm/github/cuttlefish4/app
📁 RAG directory: /Users/foohm/github/cuttlefish4/app/rag
📁 Tools directory: /Users/foohm/github/cuttlefish4/app/tools
📁 Supabase directory: /Users/foohm/github/cuttlefish4/supabase
✅ Added to Python path: /Users/foohm/github/cuttlefish4
✅ Added to Python path: /Users/foohm/github/cuttlefish4/app
✅ Added to Python path: /Users/foohm/github/cuttlefish4/app/rag
✅ Added to Python path: /Users/foohm/github/cuttlefish4/app/tools
✅ Added to Python path: /Users/foohm/github/cuttlefish4/supabase

🔍 Directory verification:
   App directory exists: True
   RAG directory exists: True
   Tools directory exists: True

📦 Attempting imports...
✅ Successfully imported Supabase retriever components
✅ Successfully imported RAG tools

🎉 Setup completed!
📍 Current working directory: /Users/foohm/github/cuttlef

## 🚀 **Step 3: RAG Tools Initialization**

Initialize the core RAG tools system and test database connections.

In [3]:
import os
import sys
import logging
from typing import List, Dict, Any
from datetime import datetime
import pandas as pd

# Add the parent directory to the path so we can import from app
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(''))))

# Import our RAG tools
from rag_tools import RAGTools, get_rag_tools

# Set up logging to see detailed output
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')

print("✅ Imports successful")
print(f"📁 Current working directory: {os.getcwd()}")
print(f"🐍 Python path: {sys.path[-1]}")

✅ Imports successful
📁 Current working directory: /Users/foohm/github/cuttlefish4/app/tools
🐍 Python path: /Users/foohm/github/cuttlefish4


## ⚙️ **Step 4: Test Configuration**

Define test queries, parameters, and utility functions for comprehensive testing.

In [4]:
# Initialize RAG tools with default collection as 'bugs'
print("🚀 Initializing RAG Tools...")

try:
    rag_tools = get_rag_tools(default_collection='bugs')
    print("✅ RAG tools initialized successfully")
    print(f"🔧 Default collection: {rag_tools.default_collection}")
    
    # Test connections
    print("\n🔌 Testing connections...")
    connections = rag_tools.test_connections()
    print(f"Connection status: {connections}")
    
    if not any(connections.values()):
        print("⚠️  Warning: No active connections. Some tests may fail.")
        print("Please check your environment variables and Supabase setup.")
    else:
        print("✅ At least one connection is active")
        
except Exception as e:
    print(f"❌ Failed to initialize RAG tools: {e}")
    print("Please check your environment setup and dependencies")

2025-08-14 12:27:22,225 - RAGTools - INFO - ✅ RAG tools initialized successfully
2025-08-14 12:27:22,225 - RAGTools - INFO - ✅ RAG tools initialized successfully


🚀 Initializing RAG Tools...
✅ RAG tools initialized successfully
🔧 Default collection: bugs

🔌 Testing connections...


2025-08-14 12:27:22,991 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=id&limit=1 "HTTP/2 200 OK"
2025-08-14 12:27:22,994 - SupabaseRetriever_bugs - INFO - ✅ Connection to bugs table successful
2025-08-14 12:27:22,994 - SupabaseRetriever_bugs - INFO - ✅ Connection to bugs table successful
2025-08-14 12:27:23,376 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=id&limit=1 "HTTP/2 200 OK"
2025-08-14 12:27:23,378 - SupabaseRetriever_pcr - INFO - ✅ Connection to pcr table successful
2025-08-14 12:27:23,378 - SupabaseRetriever_pcr - INFO - ✅ Connection to pcr table successful


Connection status: {'bugs': True, 'pcr': True}
✅ At least one connection is active


## 🛠️ **Step 5: Utility Functions**

Helper functions for displaying results and testing different RAG methods.

In [5]:
# Define test queries for different scenarios
test_queries = [
    "authentication error",        # Technical issue
    "Java OutOfMemoryError",       # Specific error type
    "login failed",               # User-facing issue
    "database connection timeout", # Infrastructure issue
    "Eclipse IDE",                # Tool-specific
    "feature request",            # Enhancement/PCR type
    "release",                    # PCR-related
    "JBIDE-16273"                 # Specific ticket ID
]

# Test parameters
TEST_K = 3  # Number of results to fetch in tests
TEST_SIMILARITY_THRESHOLD = 0.2
TEST_VECTOR_WEIGHT = 0.7
TEST_KEYWORD_WEIGHT = 0.3

print(f"📝 Test queries: {test_queries}")
print(f"🔢 Test parameters: k={TEST_K}, similarity_threshold={TEST_SIMILARITY_THRESHOLD}")
print(f"⚖️  Hybrid weights: vector={TEST_VECTOR_WEIGHT}, keyword={TEST_KEYWORD_WEIGHT}")

📝 Test queries: ['authentication error', 'Java OutOfMemoryError', 'login failed', 'database connection timeout', 'Eclipse IDE', 'feature request', 'release', 'JBIDE-16273']
🔢 Test parameters: k=3, similarity_threshold=0.2
⚖️  Hybrid weights: vector=0.7, keyword=0.3


## 📊 **Test Section 1: Database Connectivity**

Validates database connections and document counts across collections.

In [6]:
def display_results(results: List[Dict[str, Any]], title: str, max_results: int = 3):
    """
    Display search results in a formatted way.
    """
    print(f"\n📊 {title}: {len(results)} results")
    
    if not results:
        print("   No results found")
        return
    
    for i, result in enumerate(results[:max_results]):
        # Handle different result formats
        if isinstance(result, dict):
            # Supabase direct format
            if 'key' in result and 'title' in result:
                key = result.get('key', 'No key')
                title = result.get('title', 'No title')
                similarity = result.get('similarity', 'N/A')
                combined_score = result.get('combined_score', 'N/A')
                print(f"   {i+1}. {key}: {title[:60]}...")
                if similarity != 'N/A':
                    print(f"      Similarity: {similarity}")
                if combined_score != 'N/A':
                    print(f"      Combined Score: {combined_score}")
            
            # LangChain Document format (if used)
            elif 'page_content' in result:
                content = result.get('page_content', '')[:100]
                metadata = result.get('metadata', {})
                print(f"   {i+1}. Content: {content}...")
                print(f"      Metadata: {metadata}")
            
            # Generic dict format
            else:
                print(f"   {i+1}. {str(result)[:100]}...")
        else:
            print(f"   {i+1}. {str(result)[:100]}...")


def test_search_method(method_func, method_name: str, query: str, **kwargs):
    """
    Test a search method and display results.
    """
    try:
        start_time = datetime.now()
        results = method_func(query, **kwargs)
        end_time = datetime.now()
        duration = (end_time - start_time).total_seconds()
        
        display_results(results, f"{method_name} (query: '{query}', {duration:.2f}s)")
        return results
        
    except Exception as e:
        print(f"\n❌ {method_name} failed for query '{query}': {e}")
        return []


def create_results_summary(all_results: Dict[str, Dict[str, List]]):
    """
    Create a summary table of all test results.
    """
    summary_data = []
    
    for query, methods in all_results.items():
        for method, results in methods.items():
            summary_data.append({
                'Query': query,
                'Method': method,
                'Results Count': len(results),
                'Success': '✅' if results else '❌'
            })
    
    return pd.DataFrame(summary_data)

print("✅ Utility functions defined")

✅ Utility functions defined


In [7]:
print("🔢 Testing document counts and connections...")

# Test document counts
try:
    bugs_count = rag_tools.count_documents_bugs()
    pcr_count = rag_tools.count_documents_pcr()
    
    print(f"📊 Document counts:")
    print(f"   Bugs collection: {bugs_count:,} documents")
    print(f"   PCR collection: {pcr_count:,} documents")
    
    total_docs = bugs_count + pcr_count
    print(f"   Total documents: {total_docs:,}")
    
    if total_docs == 0:
        print("⚠️  Warning: No documents found. Upload data before testing search functions.")
    
except Exception as e:
    print(f"❌ Document count test failed: {e}")

# Test with filters
print("\n🔍 Testing filtered document counts...")
try:
    # Test with project filter (if available)
    jbide_bugs = rag_tools.count_documents_bugs(filters={'project': 'JBIDE'})
    print(f"   JBIDE project bugs: {jbide_bugs:,} documents")
    
except Exception as e:
    print(f"❌ Filtered count test failed: {e}")

2025-08-14 12:27:23,574 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=id "HTTP/2 206 Partial Content"
2025-08-14 12:27:23,577 - RAGTools - INFO - Document count (bugs): 4910
2025-08-14 12:27:23,577 - RAGTools - INFO - Document count (bugs): 4910


🔢 Testing document counts and connections...


2025-08-14 12:27:23,916 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=id "HTTP/2 206 Partial Content"
2025-08-14 12:27:23,918 - RAGTools - INFO - Document count (pcr): 2860
2025-08-14 12:27:23,918 - RAGTools - INFO - Document count (pcr): 2860


📊 Document counts:
   Bugs collection: 4,910 documents
   PCR collection: 2,860 documents
   Total documents: 7,770

🔍 Testing filtered document counts...


2025-08-14 12:27:24,282 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=id&project=eq.JBIDE "HTTP/2 206 Partial Content"
2025-08-14 12:27:24,284 - RAGTools - INFO - Document count (bugs): 2547
2025-08-14 12:27:24,284 - RAGTools - INFO - Document count (bugs): 2547


   JBIDE project bugs: 2,547 documents


## 🎯 **Test Section 2: Vector Search Tools**

Tests semantic similarity search using OpenAI embeddings with various similarity thresholds and filters.

In [8]:
print("🎯 Testing Vector Search Tools...")

vector_results = {}

# Test vector search for bugs
print("\n🐛 Vector Search - Bugs Collection:")
for query in test_queries[:4]:  # Test first 4 queries
    results = test_search_method(
        rag_tools.vector_search_bugs,
        "Vector Search (Bugs)",
        query,
        k=TEST_K,
        similarity_threshold=TEST_SIMILARITY_THRESHOLD
    )
    vector_results[f"bugs_{query}"] = results

# Test vector search for PCR
print("\n🔄 Vector Search - PCR Collection:")
for query in ["feature request", "release", "enhancement"]:
    results = test_search_method(
        rag_tools.vector_search_pcr,
        "Vector Search (PCR)",
        query,
        k=TEST_K,
        similarity_threshold=TEST_SIMILARITY_THRESHOLD
    )
    vector_results[f"pcr_{query}"] = results

# Test with filters
print("\n🔍 Vector Search with Filters:")
try:
    filtered_results = rag_tools.vector_search_bugs(
        "authentication",
        k=TEST_K,
        similarity_threshold=0.6,
        filters={'type': 'Bug'}
    )
    display_results(filtered_results, "Vector Search with Type Filter")
except Exception as e:
    print(f"❌ Filtered vector search failed: {e}")

print(f"\n📈 Vector search test summary: {len([r for r in vector_results.values() if r])} successful out of {len(vector_results)}")

2025-08-14 12:27:24,295 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication error...' in bugs
2025-08-14 12:27:24,295 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication error...' in bugs
2025-08-14 12:27:24,296 - SupabaseRetriever_bugs - INFO - Parameters: k=3, similarity_threshold=0.2, filters=None
2025-08-14 12:27:24,296 - SupabaseRetriever_bugs - INFO - Parameters: k=3, similarity_threshold=0.2, filters=None


🎯 Testing Vector Search Tools...

🐛 Vector Search - Bugs Collection:


2025-08-14 12:27:24,662 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:24,961 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=9 "HTTP/2 200 OK"
2025-08-14 12:27:24,986 - SupabaseRetriever_bugs - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:24,986 - SupabaseRetriever_bugs - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:24,991 - SupabaseRetriever_bugs - INFO - Calculated 9 similarities, 6 above threshold 0.2
2025-08-14 12:27:24,991 - SupabaseRetriever_bugs - INFO - Calculated 9 similarities, 6 above threshold 0.2
2025-08-14 12:27:24,992 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:27:24,992 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:27:24,993 - SupabaseRetriever_bugs - INFO - Direct vector search return


📊 Vector Search (Bugs) (query: 'authentication error', 0.70s): 3 results
   1. {'content': 'Title: Java EE Web Project archetype from Central cannot find dependencies\n\nDescripti...
   2. {'content': 'Title: context:include-filter can\'t find ControllerAdvice annotation\n\nDescription: {...
   3. {'content': "Title: Deployed module on OpenShift (server adapter) should be associated with '/'\n\nD...


2025-08-14 12:27:25,314 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:25,573 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=9 "HTTP/2 200 OK"
2025-08-14 12:27:25,602 - SupabaseRetriever_bugs - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:25,602 - SupabaseRetriever_bugs - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:25,606 - SupabaseRetriever_bugs - INFO - Calculated 9 similarities, 7 above threshold 0.2
2025-08-14 12:27:25,606 - SupabaseRetriever_bugs - INFO - Calculated 9 similarities, 7 above threshold 0.2
2025-08-14 12:27:25,607 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.4634', '0.2014', '0.2567']
2025-08-14 12:27:25,607 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.4634', '0.2014', '0.2567']
2025-08-14 12:27:25,608 - SupabaseRetriever_bugs - INFO - Direct vector search return


📊 Vector Search (Bugs) (query: 'Java OutOfMemoryError', 0.61s): 3 results
   1. {'content': "Title: Cannot start JBT 4.2.0.Alpha1 on Fedora 19\n\nDescription: Installing JBT on Ecl...
   2. {'content': "Title: JST Web UI tests doesn't close New Wizard Dialogs after tests finished\n\nDescri...
   3. {'content': 'Title: FactoryBean bean type detection can causes fatal early instantiation\n\nDescript...


2025-08-14 12:27:26,376 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:26,564 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=9 "HTTP/2 200 OK"
2025-08-14 12:27:26,632 - SupabaseRetriever_bugs - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:26,632 - SupabaseRetriever_bugs - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:26,636 - SupabaseRetriever_bugs - INFO - Calculated 9 similarities, 4 above threshold 0.2
2025-08-14 12:27:26,636 - SupabaseRetriever_bugs - INFO - Calculated 9 similarities, 4 above threshold 0.2
2025-08-14 12:27:26,636 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2531', '0.2409', '0.2496']
2025-08-14 12:27:26,636 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2531', '0.2409', '0.2496']
2025-08-14 12:27:26,637 - SupabaseRetriever_bugs - INFO - Direct vector search return


📊 Vector Search (Bugs) (query: 'login failed', 1.03s): 3 results
   1. {'content': "Title: Cannot start JBT 4.2.0.Alpha1 on Fedora 19\n\nDescription: Installing JBT on Ecl...
   2. {'content': 'Title: Java EE Web Project archetype from Central cannot find dependencies\n\nDescripti...
   3. {'content': "Title: EMMA code coverage tool isn't compatible with Luna\n\nDescription: The javaee an...


2025-08-14 12:27:26,959 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:27,195 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=9 "HTTP/2 200 OK"
2025-08-14 12:27:27,284 - SupabaseRetriever_bugs - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:27,284 - SupabaseRetriever_bugs - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:27,288 - SupabaseRetriever_bugs - INFO - Calculated 9 similarities, 3 above threshold 0.2
2025-08-14 12:27:27,288 - SupabaseRetriever_bugs - INFO - Calculated 9 similarities, 3 above threshold 0.2
2025-08-14 12:27:27,289 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2524', '0.2037', '0.2676']
2025-08-14 12:27:27,289 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2524', '0.2037', '0.2676']
2025-08-14 12:27:27,291 - SupabaseRetriever_bugs - INFO - Direct vector search return


📊 Vector Search (Bugs) (query: 'database connection timeout', 0.65s): 3 results
   1. {'content': "Title: JST Web UI tests doesn't close New Wizard Dialogs after tests finished\n\nDescri...
   2. {'content': "Title: Cannot start JBT 4.2.0.Alpha1 on Fedora 19\n\nDescription: Installing JBT on Ecl...
   3. {'content': 'Title: Java EE Web Project archetype from Central cannot find dependencies\n\nDescripti...

🔄 Vector Search - PCR Collection:


2025-08-14 12:27:27,753 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:28,100 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&limit=9 "HTTP/2 200 OK"
2025-08-14 12:27:28,155 - SupabaseRetriever_pcr - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:28,155 - SupabaseRetriever_pcr - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:28,158 - SupabaseRetriever_pcr - INFO - Calculated 9 similarities, 9 above threshold 0.2
2025-08-14 12:27:28,158 - SupabaseRetriever_pcr - INFO - Calculated 9 similarities, 9 above threshold 0.2
2025-08-14 12:27:28,159 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.2782', '0.2689', '0.2703']
2025-08-14 12:27:28,159 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.2782', '0.2689', '0.2703']
2025-08-14 12:27:28,159 - SupabaseRetriever_pcr - INFO - Direct vector search returned 3 res


📊 Vector Search (PCR) (query: 'feature request', 0.87s): 3 results
   1. {'content': 'Title: Apache Flex Release 4.16.8\n\nDescription: Release Apache Flex 4.16.8 with 9 bug...
   2. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 14 bu...
   3. {'content': 'Title: Apache Flex Release 4.16.6\n\nDescription: Release Apache Flex 4.16.6 with 8 bug...


2025-08-14 12:27:28,527 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:28,768 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&limit=9 "HTTP/2 200 OK"
2025-08-14 12:27:28,825 - SupabaseRetriever_pcr - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:28,825 - SupabaseRetriever_pcr - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:28,829 - SupabaseRetriever_pcr - INFO - Calculated 9 similarities, 9 above threshold 0.2
2025-08-14 12:27:28,829 - SupabaseRetriever_pcr - INFO - Calculated 9 similarities, 9 above threshold 0.2
2025-08-14 12:27:28,829 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.2301', '0.2596', '0.2406']
2025-08-14 12:27:28,829 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.2301', '0.2596', '0.2406']
2025-08-14 12:27:28,830 - SupabaseRetriever_pcr - INFO - Direct vector search returned 3 res


📊 Vector Search (PCR) (query: 'release', 0.67s): 3 results
   1. {'content': 'Title: Apache Flex Release 4.0.0\n\nDescription: Release Apache Flex 4.0.0 with 13 bug ...
   2. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 7 bug...
   3. {'content': 'Title: Apache Flex Release 4.0.0\n\nDescription: Release Apache Flex 4.0.0 with 4 bug f...


2025-08-14 12:27:29,142 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:29,494 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&limit=9 "HTTP/2 200 OK"
2025-08-14 12:27:29,552 - SupabaseRetriever_pcr - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:29,552 - SupabaseRetriever_pcr - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:29,554 - SupabaseRetriever_pcr - INFO - Calculated 9 similarities, 9 above threshold 0.2
2025-08-14 12:27:29,554 - SupabaseRetriever_pcr - INFO - Calculated 9 similarities, 9 above threshold 0.2
2025-08-14 12:27:29,555 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.2528', '0.2421', '0.2520']
2025-08-14 12:27:29,555 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.2528', '0.2421', '0.2520']
2025-08-14 12:27:29,555 - SupabaseRetriever_pcr - INFO - Direct vector search returned 3 res


📊 Vector Search (PCR) (query: 'enhancement', 0.73s): 3 results
   1. {'content': 'Title: Apache Flex Release 4.16.8\n\nDescription: Release Apache Flex 4.16.8 with 9 bug...
   2. {'content': 'Title: Apache Flex Release 4.16.6\n\nDescription: Release Apache Flex 4.16.6 with 8 bug...
   3. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 14 bu...

🔍 Vector Search with Filters:


2025-08-14 12:27:29,953 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&type=eq.Bug&limit=9 "HTTP/2 200 OK"
2025-08-14 12:27:30,026 - SupabaseRetriever_bugs - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:30,026 - SupabaseRetriever_bugs - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:30,029 - SupabaseRetriever_bugs - INFO - Calculated 9 similarities, 0 above threshold 0.6
2025-08-14 12:27:30,029 - SupabaseRetriever_bugs - INFO - Calculated 9 similarities, 0 above threshold 0.6
2025-08-14 12:27:30,033 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 9 candidates)
2025-08-14 12:27:30,033 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 9 candidates)
2025-08-14 12:27:30,033 - RAGTools - INFO - Vector search (bugs): 0 results for 'authentication...'
2025-08-14 12:27:30,033 - RAGTools - INFO - Vector search (bugs): 0 resu


📊 Vector Search with Type Filter: 0 results
   No results found

📈 Vector search test summary: 7 successful out of 7


## 📝 **Test Section 3: Keyword Search Tools**  

Tests BM25-style full-text search capabilities using PostgreSQL's built-in text search.

In [9]:
print("📝 Testing Keyword Search Tools...")

keyword_results = {}

# Test keyword search for bugs
print("\n🐛 Keyword Search - Bugs Collection:")
for query in test_queries[:4]:
    results = test_search_method(
        rag_tools.keyword_search_bugs,
        "Keyword Search (Bugs)",
        query,
        k=TEST_K
    )
    keyword_results[f"bugs_{query}"] = results

# Test keyword search for PCR
print("\n🔄 Keyword Search - PCR Collection:")
for query in ["feature request", "release", "enhancement"]:
    results = test_search_method(
        rag_tools.keyword_search_pcr,
        "Keyword Search (PCR)",
        query,
        k=TEST_K
    )
    keyword_results[f"pcr_{query}"] = results

# Test BM25 search (alias to keyword search)
print("\n📊 BM25 Search (Alias to Keyword):")
bm25_results = test_search_method(
    rag_tools.bm25_search_bugs,
    "BM25 Search (Bugs)",
    "OutOfMemoryError",
    k=TEST_K
)

print(f"\n📈 Keyword search test summary: {len([r for r in keyword_results.values() if r])} successful out of {len(keyword_results)}")

2025-08-14 12:27:30,039 - SupabaseRetriever_bugs - INFO - Direct keyword search for: 'authentication error...' in bugs
2025-08-14 12:27:30,039 - SupabaseRetriever_bugs - INFO - Direct keyword search for: 'authentication error...' in bugs


📝 Testing Keyword Search Tools...

🐛 Keyword Search - Bugs Collection:


2025-08-14 12:27:30,463 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25authentication+error%25&limit=3 "HTTP/2 200 OK"
2025-08-14 12:27:30,703 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25authentication%25&limit=3 "HTTP/2 200 OK"
2025-08-14 12:27:30,705 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 3 results
2025-08-14 12:27:30,705 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 3 results
2025-08-14 12:27:30,707 - RAGTools - INFO - Keyword search (bugs): 3 results for 'authentication error...'
2025-08-14 12:27:30,707 - RAGTools - INFO - Keyword search (bugs): 3 results for 'authentication error...'
2025-08-14 12:27:30,708 - SupabaseRetriever_bugs - INFO - Direct keyword search for: 'Java OutOfMemoryError...' in bugs
2025-08-14 12:27:30,708 - SupabaseRetriever_bugs - INFO - Direct keyword search for: 'Java OutOfMemo


📊 Keyword Search (Bugs) (query: 'authentication error', 0.67s): 3 results
   1. {'content': 'Title: TestTokenAuthentication failing on hadoop2 build with "IllegalArgumentException:...
   2. {'content': 'Title: Eclipse OpenShift plugin - Wrong error message when user authentication fails du...
   3. {'content': 'Title: HBaseClient and HBaseServer should use hbase.security.authentication when negoti...


2025-08-14 12:27:31,031 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25java%25&limit=3 "HTTP/2 200 OK"
2025-08-14 12:27:31,033 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 3 results
2025-08-14 12:27:31,033 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 3 results
2025-08-14 12:27:31,034 - RAGTools - INFO - Keyword search (bugs): 3 results for 'Java OutOfMemoryError...'
2025-08-14 12:27:31,034 - RAGTools - INFO - Keyword search (bugs): 3 results for 'Java OutOfMemoryError...'
2025-08-14 12:27:31,034 - SupabaseRetriever_bugs - INFO - Direct keyword search for: 'login failed...' in bugs
2025-08-14 12:27:31,034 - SupabaseRetriever_bugs - INFO - Direct keyword search for: 'login failed...' in bugs
2025-08-14 12:27:31,205 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25login+failed%25&limit=3 "HTTP/2 200 OK"



📊 Keyword Search (Bugs) (query: 'Java OutOfMemoryError', 0.33s): 3 results
   1. {'content': "Title: JavaScript error thrown for the orderingList in the showcase\n\nDescription: Vie...
   2. {'content': 'Title: Java EE Web Project archetype from Central cannot find dependencies\n\nDescripti...
   3. {'content': 'Title: Fix tests related to java.beans.BeanInfo changes in JDK8-b117\n\nDescription: St...


2025-08-14 12:27:31,444 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25login%25&limit=3 "HTTP/2 200 OK"
2025-08-14 12:27:31,445 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 3 results
2025-08-14 12:27:31,445 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 3 results
2025-08-14 12:27:31,446 - RAGTools - INFO - Keyword search (bugs): 3 results for 'login failed...'
2025-08-14 12:27:31,446 - RAGTools - INFO - Keyword search (bugs): 3 results for 'login failed...'
2025-08-14 12:27:31,446 - SupabaseRetriever_bugs - INFO - Direct keyword search for: 'database connection timeout...' in bugs
2025-08-14 12:27:31,446 - SupabaseRetriever_bugs - INFO - Direct keyword search for: 'database connection timeout...' in bugs
2025-08-14 12:27:31,581 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25database+connection+timeout%25&limit=3 "H


📊 Keyword Search (Bugs) (query: 'login failed', 0.41s): 3 results
   1. {'content': 'Title: RESTServer should handle the loginUser correctly\n\nDescription: HBASE-8662 intr...
   2. {'content': 'Title: Secure ThriftServer needs to login before calling HBaseHandler\n\nDescription: i...
   3. {'content': 'Title: Secure Rest server should login before getting an instance of Rest servlet\n\nDe...


2025-08-14 12:27:31,875 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25database%25&limit=3 "HTTP/2 200 OK"
2025-08-14 12:27:31,877 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 3 results
2025-08-14 12:27:31,877 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 3 results
2025-08-14 12:27:31,878 - RAGTools - INFO - Keyword search (bugs): 3 results for 'database connection timeout...'
2025-08-14 12:27:31,878 - RAGTools - INFO - Keyword search (bugs): 3 results for 'database connection timeout...'
2025-08-14 12:27:31,879 - SupabaseRetriever_pcr - INFO - Direct keyword search for: 'feature request...' in pcr
2025-08-14 12:27:31,879 - SupabaseRetriever_pcr - INFO - Direct keyword search for: 'feature request...' in pcr



📊 Keyword Search (Bugs) (query: 'database connection timeout', 0.43s): 3 results
   1. {'content': 'Title: h2 database driver creation problems/questions\n\nDescription: I am quite confus...
   2. {'content': 'Title: [WINDOWS] MiniZookeeperCluster should ensure that ZKDatabase is closed upon shut...
   3. {'content': 'Title: ResourceDatabasePopulator incredibly slow on JDK 1.7.0_06 or newer\n\nDescriptio...

🔄 Keyword Search - PCR Collection:


2025-08-14 12:27:32,316 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&title=ilike.%25feature+request%25&limit=3 "HTTP/2 200 OK"
2025-08-14 12:27:32,461 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&title=ilike.%25feature%25&limit=3 "HTTP/2 200 OK"
2025-08-14 12:27:32,759 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&title=ilike.%25request%25&limit=3 "HTTP/2 200 OK"
2025-08-14 12:27:33,111 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&description=ilike.%25feature+request%25&limit=3 "HTTP/2 200 OK"
2025-08-14 12:27:33,112 - SupabaseRetriever_pcr - INFO - Direct keyword search returned 0 results
2025-08-14 12:27:33,112 - SupabaseRetriever_pcr - INFO - Direct keyword search returned 0 results
2025-08-14 12:27:33,113 - RAGTools - INFO - Keyword search (pcr): 0 results for 'feature r


📊 Keyword Search (PCR) (query: 'feature request', 1.23s): 0 results
   No results found

📊 Keyword Search (PCR) (query: 'release', 0.18s): 3 results
   1. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 14 bu...
   2. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 7 bug...
   3. {'content': 'Title: Apache Flex Release 4.16.2\n\nDescription: Release Apache Flex 4.16.2 with 14 bu...


2025-08-14 12:27:33,421 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&title=ilike.%25enhancement%25&limit=3 "HTTP/2 200 OK"
2025-08-14 12:27:33,686 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&title=ilike.%25enhancement%25&limit=3 "HTTP/2 200 OK"
2025-08-14 12:27:33,996 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&description=ilike.%25enhancement%25&limit=3 "HTTP/2 200 OK"
2025-08-14 12:27:33,998 - SupabaseRetriever_pcr - INFO - Direct keyword search returned 3 results
2025-08-14 12:27:33,998 - SupabaseRetriever_pcr - INFO - Direct keyword search returned 3 results
2025-08-14 12:27:33,999 - RAGTools - INFO - Keyword search (pcr): 3 results for 'enhancement...'
2025-08-14 12:27:33,999 - RAGTools - INFO - Keyword search (pcr): 3 results for 'enhancement...'
2025-08-14 12:27:34,000 - SupabaseRetriever_bugs - INFO - Direct keyword 


📊 Keyword Search (PCR) (query: 'enhancement', 0.70s): 3 results
   1. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 14 bu...
   2. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 7 bug...
   3. {'content': 'Title: Apache Flex Release 4.16.2\n\nDescription: Release Apache Flex 4.16.2 with 14 bu...

📊 BM25 Search (Alias to Keyword):


2025-08-14 12:27:34,312 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25outofmemoryerror%25&limit=2 "HTTP/2 200 OK"
2025-08-14 12:27:34,541 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&description=ilike.%25OutOfMemoryError%25&limit=2 "HTTP/2 200 OK"
2025-08-14 12:27:34,543 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 3 results
2025-08-14 12:27:34,543 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 3 results
2025-08-14 12:27:34,543 - RAGTools - INFO - Keyword search (bugs): 3 results for 'OutOfMemoryError...'
2025-08-14 12:27:34,543 - RAGTools - INFO - Keyword search (bugs): 3 results for 'OutOfMemoryError...'



📊 BM25 Search (Bugs) (query: 'OutOfMemoryError', 0.54s): 3 results
   1. {'content': 'Title: Showase on Openshift Express -> java.lang.OutOfMemoryError: PermGen space\n\nDes...
   2. {'content': "Title: Cannot start JBT 4.2.0.Alpha1 on Fedora 19\n\nDescription: Installing JBT on Ecl...
   3. {'content': 'Title: Not able to find HMaster and HRegionServer processes with grep by process name o...

📈 Keyword search test summary: 6 successful out of 7


## 🔄 **Test Section 4: Hybrid Search Tools**

Tests combined vector + keyword search with configurable weighting and various combinations.

In [10]:
print("🔄 Testing Hybrid Search Tools...")

hybrid_results = {}

# Test hybrid search for bugs
print("\n🐛 Hybrid Search - Bugs Collection:")
for query in test_queries[:3]:
    results = test_search_method(
        rag_tools.hybrid_search_bugs,
        "Hybrid Search (Bugs)",
        query,
        k=TEST_K,
        similarity_threshold=TEST_SIMILARITY_THRESHOLD,
        vector_weight=TEST_VECTOR_WEIGHT,
        keyword_weight=TEST_KEYWORD_WEIGHT
    )
    hybrid_results[f"bugs_{query}"] = results

# Test hybrid search for PCR
print("\n🔄 Hybrid Search - PCR Collection:")
for query in ["feature request", "release"]:
    results = test_search_method(
        rag_tools.hybrid_search_pcr,
        "Hybrid Search (PCR)",
        query,
        k=TEST_K,
        similarity_threshold=TEST_SIMILARITY_THRESHOLD,
        vector_weight=TEST_VECTOR_WEIGHT,
        keyword_weight=TEST_KEYWORD_WEIGHT
    )
    hybrid_results[f"pcr_{query}"] = results

# Test different weight combinations
print("\n⚖️  Testing Different Weight Combinations:")
weight_tests = [
    (0.9, 0.1, "High Vector Weight"),
    (0.1, 0.9, "High Keyword Weight"),
    (0.5, 0.5, "Equal Weights")
]

for vector_w, keyword_w, description in weight_tests:
    try:
        results = rag_tools.hybrid_search_bugs(
            "authentication error",
            k=2,
            vector_weight=vector_w,
            keyword_weight=keyword_w
        )
        display_results(results, f"Hybrid Search - {description} ({vector_w:.1f}/{keyword_w:.1f})")
    except Exception as e:
        print(f"❌ {description} test failed: {e}")

print(f"\n📈 Hybrid search test summary: {len([r for r in hybrid_results.values() if r])} successful out of {len(hybrid_results)}")

2025-08-14 12:27:34,559 - SupabaseRetriever_bugs - INFO - Direct hybrid search for: 'authentication error...' in bugs
2025-08-14 12:27:34,559 - SupabaseRetriever_bugs - INFO - Direct hybrid search for: 'authentication error...' in bugs
2025-08-14 12:27:34,560 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication error...' in bugs
2025-08-14 12:27:34,560 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication error...' in bugs
2025-08-14 12:27:34,561 - SupabaseRetriever_bugs - INFO - Parameters: k=6, similarity_threshold=0.2, filters=None
2025-08-14 12:27:34,561 - SupabaseRetriever_bugs - INFO - Parameters: k=6, similarity_threshold=0.2, filters=None


🔄 Testing Hybrid Search Tools...

🐛 Hybrid Search - Bugs Collection:


2025-08-14 12:27:34,836 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:35,055 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=18 "HTTP/2 200 OK"
2025-08-14 12:27:35,195 - SupabaseRetriever_bugs - INFO - Processing 18 candidates for similarity calculation
2025-08-14 12:27:35,195 - SupabaseRetriever_bugs - INFO - Processing 18 candidates for similarity calculation
2025-08-14 12:27:35,204 - SupabaseRetriever_bugs - INFO - Calculated 18 similarities, 11 above threshold 0.2
2025-08-14 12:27:35,204 - SupabaseRetriever_bugs - INFO - Calculated 18 similarities, 11 above threshold 0.2
2025-08-14 12:27:35,204 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2479', '0.2604', '0.2196']
2025-08-14 12:27:35,204 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2479', '0.2604', '0.2196']
2025-08-14 12:27:35,205 - SupabaseRetriever_bugs - INFO - Direct vector search


📊 Hybrid Search (Bugs) (query: 'authentication error', 1.21s): 3 results
   1. {'content': 'Title: TestTokenAuthentication failing on hadoop2 build with "IllegalArgumentException:...
   2. {'content': 'Title: Eclipse OpenShift plugin - Wrong error message when user authentication fails du...
   3. {'content': 'Title: HBaseClient and HBaseServer should use hbase.security.authentication when negoti...


2025-08-14 12:27:36,805 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:37,033 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=18 "HTTP/2 200 OK"
2025-08-14 12:27:37,182 - SupabaseRetriever_bugs - INFO - Processing 18 candidates for similarity calculation
2025-08-14 12:27:37,182 - SupabaseRetriever_bugs - INFO - Processing 18 candidates for similarity calculation
2025-08-14 12:27:37,190 - SupabaseRetriever_bugs - INFO - Calculated 18 similarities, 13 above threshold 0.2
2025-08-14 12:27:37,190 - SupabaseRetriever_bugs - INFO - Calculated 18 similarities, 13 above threshold 0.2
2025-08-14 12:27:37,191 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.4634', '0.2014', '0.2568']
2025-08-14 12:27:37,191 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.4634', '0.2014', '0.2568']
2025-08-14 12:27:37,191 - SupabaseRetriever_bugs - INFO - Direct vector search


📊 Hybrid Search (Bugs) (query: 'Java OutOfMemoryError', 1.70s): 3 results
   1. {'content': 'Title: Java EE Web Project archetype from Central cannot find dependencies\n\nDescripti...
   2. {'content': "Title: Cannot start JBT 4.2.0.Alpha1 on Fedora 19\n\nDescription: Installing JBT on Ecl...
   3. {'content': "Title: JavaScript error thrown for the orderingList in the showcase\n\nDescription: Vie...


2025-08-14 12:27:37,703 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:37,950 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=18 "HTTP/2 200 OK"
2025-08-14 12:27:38,032 - SupabaseRetriever_bugs - INFO - Processing 18 candidates for similarity calculation
2025-08-14 12:27:38,032 - SupabaseRetriever_bugs - INFO - Processing 18 candidates for similarity calculation
2025-08-14 12:27:38,042 - SupabaseRetriever_bugs - INFO - Calculated 18 similarities, 5 above threshold 0.2
2025-08-14 12:27:38,042 - SupabaseRetriever_bugs - INFO - Calculated 18 similarities, 5 above threshold 0.2
2025-08-14 12:27:38,043 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2531', '0.2409', '0.2496']
2025-08-14 12:27:38,043 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2531', '0.2409', '0.2496']
2025-08-14 12:27:38,044 - SupabaseRetriever_bugs - INFO - Direct vector search r


📊 Hybrid Search (Bugs) (query: 'login failed', 0.97s): 3 results
   1. {'content': 'Title: RESTServer should handle the loginUser correctly\n\nDescription: HBASE-8662 intr...
   2. {'content': 'Title: Secure ThriftServer needs to login before calling HBaseHandler\n\nDescription: i...
   3. {'content': 'Title: Secure Rest server should login before getting an instance of Rest servlet\n\nDe...

🔄 Hybrid Search - PCR Collection:


2025-08-14 12:27:38,692 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:38,935 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&limit=18 "HTTP/2 200 OK"
2025-08-14 12:27:39,076 - SupabaseRetriever_pcr - INFO - Processing 18 candidates for similarity calculation
2025-08-14 12:27:39,076 - SupabaseRetriever_pcr - INFO - Processing 18 candidates for similarity calculation
2025-08-14 12:27:39,083 - SupabaseRetriever_pcr - INFO - Calculated 18 similarities, 18 above threshold 0.2
2025-08-14 12:27:39,083 - SupabaseRetriever_pcr - INFO - Calculated 18 similarities, 18 above threshold 0.2
2025-08-14 12:27:39,084 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.2782', '0.2689', '0.2703']
2025-08-14 12:27:39,084 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.2782', '0.2689', '0.2703']
2025-08-14 12:27:39,085 - SupabaseRetriever_pcr - INFO - Direct vector search returne


📊 Hybrid Search (PCR) (query: 'feature request', 1.33s): 3 results
   1. {'content': 'Title: Apache Flex Release 4.16.7\n\nDescription: Release Apache Flex 4.16.7 with 12 bu...
   2. {'content': 'Title: Apache Flex Release 4.16.8\n\nDescription: Release Apache Flex 4.16.8 with 9 bug...
   3. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 14 bu...


2025-08-14 12:27:40,069 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:40,280 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&limit=18 "HTTP/2 200 OK"
2025-08-14 12:27:40,417 - SupabaseRetriever_pcr - INFO - Processing 18 candidates for similarity calculation
2025-08-14 12:27:40,417 - SupabaseRetriever_pcr - INFO - Processing 18 candidates for similarity calculation
2025-08-14 12:27:40,424 - SupabaseRetriever_pcr - INFO - Calculated 18 similarities, 18 above threshold 0.2
2025-08-14 12:27:40,424 - SupabaseRetriever_pcr - INFO - Calculated 18 similarities, 18 above threshold 0.2
2025-08-14 12:27:40,425 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.2301', '0.2596', '0.2405']
2025-08-14 12:27:40,425 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.2301', '0.2596', '0.2405']
2025-08-14 12:27:40,426 - SupabaseRetriever_pcr - INFO - Direct vector search returne


📊 Hybrid Search (PCR) (query: 'release', 0.94s): 3 results
   1. {'content': 'Title: Apache Flex Release 4.0.0\n\nDescription: Release Apache Flex 4.0.0 with 13 bug ...
   2. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 7 bug...
   3. {'content': 'Title: Apache Flex Release 4.0.0\n\nDescription: Release Apache Flex 4.0.0 with 4 bug f...

⚖️  Testing Different Weight Combinations:


2025-08-14 12:27:41,066 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:41,284 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=12 "HTTP/2 200 OK"
2025-08-14 12:27:41,399 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-14 12:27:41,399 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-14 12:27:41,402 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 8 above threshold 0.2
2025-08-14 12:27:41,402 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 8 above threshold 0.2
2025-08-14 12:27:41,403 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2479', '0.2604', '0.2196']
2025-08-14 12:27:41,403 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2479', '0.2604', '0.2196']
2025-08-14 12:27:41,403 - SupabaseRetriever_bugs - INFO - Direct vector search r


📊 Hybrid Search - High Vector Weight (0.9/0.1): 2 results
   1. {'content': 'Title: Java EE Web Project archetype from Central cannot find dependencies\n\nDescripti...
   2. {'content': 'Title: context:include-filter can\'t find ControllerAdvice annotation\n\nDescription: {...


2025-08-14 12:27:42,130 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:42,338 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=12 "HTTP/2 200 OK"
2025-08-14 12:27:42,454 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-14 12:27:42,454 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-14 12:27:42,458 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 8 above threshold 0.2
2025-08-14 12:27:42,458 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 8 above threshold 0.2
2025-08-14 12:27:42,458 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:27:42,458 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:27:42,459 - SupabaseRetriever_bugs - INFO - Direct vector search r


📊 Hybrid Search - High Keyword Weight (0.1/0.9): 2 results
   1. {'content': 'Title: TestTokenAuthentication failing on hadoop2 build with "IllegalArgumentException:...
   2. {'content': 'Title: Eclipse OpenShift plugin - Wrong error message when user authentication fails du...


2025-08-14 12:27:43,075 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:43,285 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=12 "HTTP/2 200 OK"
2025-08-14 12:27:43,424 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-14 12:27:43,424 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-14 12:27:43,431 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 8 above threshold 0.2
2025-08-14 12:27:43,431 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 8 above threshold 0.2
2025-08-14 12:27:43,432 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:27:43,432 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:27:43,432 - SupabaseRetriever_bugs - INFO - Direct vector search r


📊 Hybrid Search - Equal Weights (0.5/0.5): 2 results
   1. {'content': 'Title: TestTokenAuthentication failing on hadoop2 build with "IllegalArgumentException:...
   2. {'content': 'Title: Eclipse OpenShift plugin - Wrong error message when user authentication fails du...

📈 Hybrid search test summary: 5 successful out of 5


## 🎯 **Test Section 5: Contextual Compression Tools**

Tests high-quality vector search with optional Cohere reranking for improved precision.

In [11]:
print("🎯 Testing Contextual Compression Tools...")

compression_results = {}

# Test contextual compression for bugs
print("\n🐛 Contextual Compression - Bugs Collection:")
for query in ["authentication error", "OutOfMemoryError"]:
    results = test_search_method(
        rag_tools.contextual_compression_search_bugs,
        "Contextual Compression (Bugs)",
        query,
        k=3,  # Fewer results for compression
        similarity_threshold=0.8  # Higher threshold
    )
    compression_results[f"bugs_{query}"] = results

# Test contextual compression for PCR
print("\n🔄 Contextual Compression - PCR Collection:")
for query in ["feature request", "enhancement"]:
    results = test_search_method(
        rag_tools.contextual_compression_search_pcr,
        "Contextual Compression (PCR)",
        query,
        k=3,
        similarity_threshold=0.8
    )
    compression_results[f"pcr_{query}"] = results

# Compare with regular vector search
print("\n📊 Comparison: Contextual Compression vs Regular Vector Search:")
try:
    query = "authentication error"
    
    # Regular vector search
    regular_results = rag_tools.vector_search_bugs(query, k=5, similarity_threshold=0.6)
    
    # Contextual compression
    compression_results_comp = rag_tools.contextual_compression_search_bugs(query, k=3, similarity_threshold=0.8)
    
    print(f"   Regular vector search (threshold=0.6): {len(regular_results)} results")
    print(f"   Contextual compression (threshold=0.8): {len(compression_results_comp)} results")
    print("   → Compression should return fewer, higher-quality results")
    
except Exception as e:
    print(f"❌ Comparison test failed: {e}")

print(f"\n📈 Contextual compression test summary: {len([r for r in compression_results.values() if r])} successful out of {len(compression_results)}")

2025-08-14 12:27:43,871 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication error...' in bugs
2025-08-14 12:27:43,871 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication error...' in bugs
2025-08-14 12:27:43,872 - SupabaseRetriever_bugs - INFO - Parameters: k=3, similarity_threshold=0.8, filters=None
2025-08-14 12:27:43,872 - SupabaseRetriever_bugs - INFO - Parameters: k=3, similarity_threshold=0.8, filters=None


🎯 Testing Contextual Compression Tools...

🐛 Contextual Compression - Bugs Collection:


2025-08-14 12:27:44,561 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:44,889 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=9 "HTTP/2 200 OK"
2025-08-14 12:27:44,898 - SupabaseRetriever_bugs - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:44,898 - SupabaseRetriever_bugs - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:44,903 - SupabaseRetriever_bugs - INFO - Calculated 9 similarities, 0 above threshold 0.8
2025-08-14 12:27:44,903 - SupabaseRetriever_bugs - INFO - Calculated 9 similarities, 0 above threshold 0.8
2025-08-14 12:27:44,908 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 9 candidates)
2025-08-14 12:27:44,908 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 9 candidates)
2025-08-14 12:27:44,908 - RAGTools - INFO - Contextual compression se


📊 Contextual Compression (Bugs) (query: 'authentication error', 1.04s): 0 results
   No results found


2025-08-14 12:27:46,119 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:46,316 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=9 "HTTP/2 200 OK"
2025-08-14 12:27:46,467 - SupabaseRetriever_bugs - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:46,467 - SupabaseRetriever_bugs - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:46,473 - SupabaseRetriever_bugs - INFO - Calculated 9 similarities, 0 above threshold 0.8
2025-08-14 12:27:46,473 - SupabaseRetriever_bugs - INFO - Calculated 9 similarities, 0 above threshold 0.8
2025-08-14 12:27:46,476 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 9 candidates)
2025-08-14 12:27:46,476 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 9 candidates)
2025-08-14 12:27:46,476 - RAGTools - INFO - Contextual compression se


📊 Contextual Compression (Bugs) (query: 'OutOfMemoryError', 1.57s): 0 results
   No results found

🔄 Contextual Compression - PCR Collection:


2025-08-14 12:27:46,863 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&limit=9 "HTTP/2 200 OK"
2025-08-14 12:27:46,929 - SupabaseRetriever_pcr - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:46,929 - SupabaseRetriever_pcr - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:46,934 - SupabaseRetriever_pcr - INFO - Calculated 9 similarities, 0 above threshold 0.8
2025-08-14 12:27:46,934 - SupabaseRetriever_pcr - INFO - Calculated 9 similarities, 0 above threshold 0.8
2025-08-14 12:27:46,941 - SupabaseRetriever_pcr - INFO - Direct vector search returned 0 results (from 9 candidates)
2025-08-14 12:27:46,941 - SupabaseRetriever_pcr - INFO - Direct vector search returned 0 results (from 9 candidates)
2025-08-14 12:27:46,941 - RAGTools - INFO - Contextual compression search (pcr): 0 results for 'feature request...'
2025-08-14 12:27:46,941 - RAGTools - INFO - Contextual compression search (


📊 Contextual Compression (PCR) (query: 'feature request', 0.47s): 0 results
   No results found


2025-08-14 12:27:47,158 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:47,346 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&limit=9 "HTTP/2 200 OK"
2025-08-14 12:27:47,418 - SupabaseRetriever_pcr - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:47,418 - SupabaseRetriever_pcr - INFO - Processing 9 candidates for similarity calculation
2025-08-14 12:27:47,424 - SupabaseRetriever_pcr - INFO - Calculated 9 similarities, 0 above threshold 0.8
2025-08-14 12:27:47,424 - SupabaseRetriever_pcr - INFO - Calculated 9 similarities, 0 above threshold 0.8
2025-08-14 12:27:47,430 - SupabaseRetriever_pcr - INFO - Direct vector search returned 0 results (from 9 candidates)
2025-08-14 12:27:47,430 - SupabaseRetriever_pcr - INFO - Direct vector search returned 0 results (from 9 candidates)
2025-08-14 12:27:47,430 - RAGTools - INFO - Contextual compression search (p


📊 Contextual Compression (PCR) (query: 'enhancement', 0.49s): 0 results
   No results found

📊 Comparison: Contextual Compression vs Regular Vector Search:


2025-08-14 12:27:47,791 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=15 "HTTP/2 200 OK"
2025-08-14 12:27:47,932 - SupabaseRetriever_bugs - INFO - Processing 15 candidates for similarity calculation
2025-08-14 12:27:47,932 - SupabaseRetriever_bugs - INFO - Processing 15 candidates for similarity calculation
2025-08-14 12:27:47,940 - SupabaseRetriever_bugs - INFO - Calculated 15 similarities, 0 above threshold 0.6
2025-08-14 12:27:47,940 - SupabaseRetriever_bugs - INFO - Calculated 15 similarities, 0 above threshold 0.6
2025-08-14 12:27:47,949 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 15 candidates)
2025-08-14 12:27:47,949 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 15 candidates)
2025-08-14 12:27:47,950 - RAGTools - INFO - Vector search (bugs): 0 results for 'authentication error...'
2025-08-14 12:27:47,950 - RAGTools - INFO - Vector search (bugs): 0 res

   Regular vector search (threshold=0.6): 0 results
   Contextual compression (threshold=0.8): 0 results
   → Compression should return fewer, higher-quality results

📈 Contextual compression test summary: 0 successful out of 4


## 🎪 **Test Section 6: Advanced Ensemble Retrieval**

**⭐ The Crown Jewel**: Tests the sophisticated 4-method ensemble system matching the original Cuttlefish3 design.

### **🔄 Module Reload Required**
Before running this test, you **must run the module reload cell below** to load the advanced ensemble functionality.

### **What Gets Tested:**
- **Basic Ensemble** (3 methods): Vector + Keyword + Hybrid
- **🆕 Advanced Ensemble** (4 methods): Multi-Query + Contextual Compression + BM25 + Vector
- **Comparison Analysis**: Direct comparison between basic vs advanced approaches
- **Individual Components**: Direct testing of each advanced retrieval method

### **Expected Advanced Ensemble Logs:**
Look for these signatures that confirm the sophisticated system is working:
- `Multi-query expansion` with LLM-generated query variations
- `Contextual compression` with Cohere reranking  
- `BM25 retrieval` with advanced scoring
- `Advanced deduplication` with content hashing
- Source: `advanced_ensemble_bugs` in results

## 6. Ensemble Search Tools Testing

### 🔄 **Advanced Ensemble Module Reload**

**⚠️ CRITICAL**: Run this cell **BEFORE** testing the advanced ensemble to ensure the sophisticated retrieval methods are loaded.

This cell performs a comprehensive module reload to pick up the advanced ensemble functionality:
- Clears module cache and reloads `advanced_retrievers.py` 
- Recreates RAG tools instance with `use_advanced=True` support
- Tests that all 4 advanced methods are available
- Validates the advanced ensemble parameter is recognized

**Success Indicators:**
- ✅ `Advanced ensemble parameter 'use_advanced' found!`
- ✅ `CONFIRMED: Advanced ensemble is active!`
- ✅ All advanced retriever components created successfully

In [12]:
print("🎪 Testing Ensemble Search Tools...")

ensemble_results = {}

# Test BASIC ensemble search for bugs (original 3-method)
print("\n🐛 Basic Ensemble Search - Bugs Collection:")
for query in ["authentication error", "OutOfMemoryError"]:
    results = test_search_method(
        lambda q, k=5: rag_tools.ensemble_search_bugs(q, k, use_advanced=False),
        "Basic Ensemble Search (Bugs)",
        query,
        k=5  # More results to see ensemble effect
    )
    ensemble_results[f"basic_bugs_{query}"] = results

# Test BASIC ensemble search for PCR (original 3-method)
print("\n🔄 Basic Ensemble Search - PCR Collection:")
for query in ["feature request", "release"]:
    results = test_search_method(
        lambda q, k=5: rag_tools.ensemble_search_pcr(q, k, use_advanced=False),
        "Basic Ensemble Search (PCR)",
        query,
        k=5
    )
    ensemble_results[f"basic_pcr_{query}"] = results

# Test ADVANCED ensemble search (NEW! - 4 sophisticated methods)
print("\n🎯 ADVANCED Ensemble Search - Bugs Collection:")
print("   This uses sophisticated retrieval methods like Multi-Query, Contextual Compression, BM25")
print("   Similar to the original Cuttlefish3_Complete.ipynb implementation")

for query in ["authentication error", "XML parsing issues"]:
    try:
        results = test_search_method(
            lambda q, k=5: rag_tools.ensemble_search_bugs(q, k, use_advanced=True),
            "ADVANCED Ensemble Search (Bugs)",
            query,
            k=5
        )
        ensemble_results[f"advanced_bugs_{query}"] = results
        
        # Show detailed source breakdown for advanced ensemble
        if results:
            sources = {}
            for result in results:
                source = result.get('source', 'unknown')
                sources[source] = sources.get(source, 0) + 1
            print(f"   → Sources used: {dict(sources)}")
            
            # Show search types
            search_types = {}
            for result in results:
                search_type = result.get('search_type', 'unknown')
                search_types[search_type] = search_types.get(search_type, 0) + 1
            print(f"   → Search types: {dict(search_types)}")
            
    except Exception as e:
        print(f"   ❌ Advanced ensemble failed for '{query}': {e}")
        import traceback
        traceback.print_exc()

# Test ADVANCED ensemble search for PCR (NEW! - 4 sophisticated methods)
print("\n🔄 ADVANCED Ensemble Search - PCR Collection:")
for query in ["feature enhancement", "version release"]:
    try:
        results = test_search_method(
            lambda q, k=5: rag_tools.ensemble_search_pcr(q, k, use_advanced=True),
            "ADVANCED Ensemble Search (PCR)",
            query,
            k=5
        )
        ensemble_results[f"advanced_pcr_{query}"] = results
        
        # Show source breakdown
        if results:
            sources = {}
            for result in results:
                source = result.get('source', 'unknown')
                sources[source] = sources.get(source, 0) + 1
            print(f"   → Sources used: {dict(sources)}")
        
    except Exception as e:
        print(f"   ❌ Advanced ensemble failed for '{query}': {e}")

# Compare BASIC vs ADVANCED ensemble
print("\n📊 BASIC vs ADVANCED Ensemble Comparison:")
try:
    query = "authentication error"
    k = 3
    
    print(f"   Query: '{query}' (k={k})")
    
    # Basic ensemble (3 methods: vector + keyword + hybrid)
    basic_results = rag_tools.ensemble_search_bugs(query, k=k, use_advanced=False)
    
    # Advanced ensemble (4 methods: naive + multi-query + contextual compression + BM25)
    advanced_results = rag_tools.ensemble_search_bugs(query, k=k, use_advanced=True)
    
    print(f"   📈 Basic ensemble (3 methods): {len(basic_results)} results")
    print(f"   🎯 Advanced ensemble (4 methods): {len(advanced_results)} results")
    
    if basic_results and advanced_results:
        print(f"   📊 Basic top result score: {basic_results[0].get('score', 0):.3f}")
        print(f"   📊 Advanced top result score: {advanced_results[0].get('score', 0):.3f}")
        
        # Check for different results
        basic_titles = set()
        advanced_titles = set()
        
        for result in basic_results:
            title = result.get('metadata', {}).get('title', '')
            if title:
                basic_titles.add(title[:50])
        
        for result in advanced_results:
            title = result.get('metadata', {}).get('title', '')
            if title:
                advanced_titles.add(title[:50])
        
        unique_to_basic = basic_titles - advanced_titles
        unique_to_advanced = advanced_titles - basic_titles
        overlap = basic_titles & advanced_titles
        
        print(f"   🔄 Overlap: {len(overlap)} results")
        print(f"   📈 Unique to basic: {len(unique_to_basic)} results")
        print(f"   🎯 Unique to advanced: {len(unique_to_advanced)} results")
        
        if unique_to_advanced:
            print(f"   → Advanced found unique results not in basic!")
        else:
            print(f"   → Results have significant overlap")
    
except Exception as e:
    print(f"❌ Ensemble comparison failed: {e}")

# Test individual ADVANCED components (if available)
print("\n🔧 Testing Individual Advanced Components:")

try:
    # Test if we can access the advanced retrievers directly
    from advanced_retrievers import MultiQueryRetriever, ContextualCompressionRetriever, BM25Retriever
    from supabase_retriever import create_bugs_retriever
    
    base_retriever = create_bugs_retriever()
    
    # Test Multi-Query Retriever
    print("   🔍 Multi-Query Retriever (LLM query expansion):")
    multi_query = MultiQueryRetriever(base_retriever)
    multi_results = multi_query.retrieve("authentication error", k=3)
    print(f"     Results: {len(multi_results)}")
    if multi_results:
        print(f"     Sample result source: {multi_results[0].source}")
    
    # Test Contextual Compression Retriever  
    print("   🎯 Contextual Compression Retriever (with reranking):")
    compression = ContextualCompressionRetriever(base_retriever)
    compression_results = compression.retrieve("authentication error", k=3)
    print(f"     Results: {len(compression_results)}")
    if compression_results:
        print(f"     Sample result source: {compression_results[0].source}")
    
    # Test BM25 Retriever
    print("   📝 BM25 Retriever (advanced keyword search):")
    bm25 = BM25Retriever(base_retriever)
    bm25_results = bm25.retrieve("authentication error", k=3)
    print(f"     Results: {len(bm25_results)}")
    if bm25_results:
        print(f"     Sample result source: {bm25_results[0].source}")
    
except ImportError as e:
    print(f"   ⚠️  Advanced components not available for direct testing: {e}")
    print("   This is expected - they're integrated into the ensemble")
except Exception as e:
    print(f"   ❌ Advanced component testing failed: {e}")

print(f"\n📈 Ensemble search test summary: {len([r for r in ensemble_results.values() if r])} successful out of {len(ensemble_results)}")

# Advanced ensemble information
print(f"\n🎯 Advanced Ensemble Methods (similar to Cuttlefish3):")
print(f"   1. 🎯 Naive Vector Search (25% weight) - Basic vector similarity")
print(f"   2. 🔍 Multi-Query Retrieval (25% weight) - LLM generates query variations")
print(f"   3. 🎯 Contextual Compression (25% weight) - Vector search + reranking")
print(f"   4. 📝 BM25 Retrieval (25% weight) - Advanced keyword search")
print(f"   Plus: Advanced deduplication with content hashing")
print(f"   🔄 Weighted combination using standardized scoring")

print(f"\n💡 The Advanced Ensemble should show logs from multiple sophisticated methods!")
print(f"   Look for: Multi-query expansion, Contextual compression, BM25 retrieval")

2025-08-14 12:27:48,369 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication error...' in bugs
2025-08-14 12:27:48,369 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication error...' in bugs
2025-08-14 12:27:48,370 - SupabaseRetriever_bugs - INFO - Parameters: k=2, similarity_threshold=0.2, filters=None
2025-08-14 12:27:48,370 - SupabaseRetriever_bugs - INFO - Parameters: k=2, similarity_threshold=0.2, filters=None


🎪 Testing Ensemble Search Tools...

🐛 Basic Ensemble Search - Bugs Collection:


2025-08-14 12:27:48,639 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:48,824 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=6 "HTTP/2 200 OK"
2025-08-14 12:27:48,888 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:27:48,888 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:27:48,892 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 4 above threshold 0.2
2025-08-14 12:27:48,892 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 4 above threshold 0.2
2025-08-14 12:27:48,892 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:27:48,892 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:27:48,893 - SupabaseRetriever_bugs - INFO - Direct vector search return


📊 Basic Ensemble Search (Bugs) (query: 'authentication error', 1.80s): 4 results
   1. {'content': 'Title: TestTokenAuthentication failing on hadoop2 build with "IllegalArgumentException:...
   2. {'content': 'Title: Eclipse OpenShift plugin - Wrong error message when user authentication fails du...
   3. {'content': 'Title: context:include-filter can\'t find ControllerAdvice annotation\n\nDescription: {...


2025-08-14 12:27:50,956 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:51,158 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=6 "HTTP/2 200 OK"
2025-08-14 12:27:51,201 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:27:51,201 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:27:51,206 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 5 above threshold 0.2
2025-08-14 12:27:51,206 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 5 above threshold 0.2
2025-08-14 12:27:51,207 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.3938', '0.2088', '0.2512']
2025-08-14 12:27:51,207 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.3938', '0.2088', '0.2512']
2025-08-14 12:27:51,207 - SupabaseRetriever_bugs - INFO - Direct vector search return


📊 Basic Ensemble Search (Bugs) (query: 'OutOfMemoryError', 2.42s): 3 results
   1. {'content': 'Title: Showase on Openshift Express -> java.lang.OutOfMemoryError: PermGen space\n\nDes...
   2. {'content': "Title: Cannot start JBT 4.2.0.Alpha1 on Fedora 19\n\nDescription: Installing JBT on Ecl...
   3. {'content': "Title: EMMA code coverage tool isn't compatible with Luna\n\nDescription: The javaee an...

🔄 Basic Ensemble Search - PCR Collection:


2025-08-14 12:27:59,478 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:27:59,746 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&limit=6 "HTTP/2 200 OK"
2025-08-14 12:27:59,815 - SupabaseRetriever_pcr - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:27:59,815 - SupabaseRetriever_pcr - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:27:59,819 - SupabaseRetriever_pcr - INFO - Calculated 6 similarities, 6 above threshold 0.2
2025-08-14 12:27:59,819 - SupabaseRetriever_pcr - INFO - Calculated 6 similarities, 6 above threshold 0.2
2025-08-14 12:27:59,820 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.2782', '0.2689', '0.2703']
2025-08-14 12:27:59,820 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.2782', '0.2689', '0.2703']
2025-08-14 12:27:59,820 - SupabaseRetriever_pcr - INFO - Direct vector search returned 2 res


📊 Basic Ensemble Search (PCR) (query: 'feature request', 9.32s): 3 results
   1. {'content': 'Title: Apache Flex Release 4.16.8\n\nDescription: Release Apache Flex 4.16.8 with 9 bug...
   2. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 14 bu...
   3. {'content': 'Title: Apache Flex Release 4.16.2\n\nDescription: Release Apache Flex 4.16.2 with 14 bu...


2025-08-14 12:28:02,441 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:02,627 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&limit=6 "HTTP/2 200 OK"
2025-08-14 12:28:02,680 - SupabaseRetriever_pcr - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:02,680 - SupabaseRetriever_pcr - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:02,684 - SupabaseRetriever_pcr - INFO - Calculated 6 similarities, 6 above threshold 0.2
2025-08-14 12:28:02,684 - SupabaseRetriever_pcr - INFO - Calculated 6 similarities, 6 above threshold 0.2
2025-08-14 12:28:02,685 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.2301', '0.2596', '0.2406']
2025-08-14 12:28:02,685 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.2301', '0.2596', '0.2406']
2025-08-14 12:28:02,685 - SupabaseRetriever_pcr - INFO - Direct vector search returned 2 res


📊 Basic Ensemble Search (PCR) (query: 'release', 1.96s): 4 results
   1. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 14 bu...
   2. {'content': 'Title: Apache Flex Release 4.0.0\n\nDescription: Release Apache Flex 4.0.0 with 13 bug ...
   3. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 7 bug...

🎯 ADVANCED Ensemble Search - Bugs Collection:
   This uses sophisticated retrieval methods like Multi-Query, Contextual Compression, BM25
   Similar to the original Cuttlefish3_Complete.ipynb implementation


2025-08-14 12:28:04,128 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:04,313 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=6 "HTTP/2 200 OK"
2025-08-14 12:28:04,378 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:04,378 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:04,382 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 4 above threshold 0.2
2025-08-14 12:28:04,382 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 4 above threshold 0.2
2025-08-14 12:28:04,383 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:28:04,383 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:28:04,384 - SupabaseRetriever_bugs - INFO - Direct vector search return


📊 ADVANCED Ensemble Search (Bugs) (query: 'authentication error', 1.74s): 4 results
   1. {'content': 'Title: TestTokenAuthentication failing on hadoop2 build with "IllegalArgumentException:...
   2. {'content': 'Title: Eclipse OpenShift plugin - Wrong error message when user authentication fails du...
   3. {'content': 'Title: context:include-filter can\'t find ControllerAdvice annotation\n\nDescription: {...
   → Sources used: {'supabase_bugs': 4}
   → Search types: {'direct_keyword_search': 2, 'direct_vector_search': 2}


2025-08-14 12:28:05,996 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=6 "HTTP/2 200 OK"
2025-08-14 12:28:06,054 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:06,054 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:06,058 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 4 above threshold 0.2
2025-08-14 12:28:06,058 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 4 above threshold 0.2
2025-08-14 12:28:06,059 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2291', '0.2023', '0.2374']
2025-08-14 12:28:06,059 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2291', '0.2023', '0.2374']
2025-08-14 12:28:06,060 - SupabaseRetriever_bugs - INFO - Direct vector search returned 2 results (from 6 candidates)
2025-08-14 12:28:06,060 - SupabaseRetriever_bugs - INFO - Direct vector search ret


📊 ADVANCED Ensemble Search (Bugs) (query: 'XML parsing issues', 2.43s): 4 results
   1. {'content': 'Title: JSF Code completion does not work for http://xmlns.jcp.org/jsf/facelets namespac...
   2. {'content': 'Title: CLONE - Validation problem in faces-config.xml for JSF 2.2 projects\n\nDescripti...
   3. {'content': 'Title: FactoryBean bean type detection can causes fatal early instantiation\n\nDescript...
   → Sources used: {'supabase_bugs': 4}
   → Search types: {'direct_keyword_search': 2, 'direct_vector_search': 2}

🔄 ADVANCED Ensemble Search - PCR Collection:


2025-08-14 12:28:08,387 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:08,574 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&limit=6 "HTTP/2 200 OK"
2025-08-14 12:28:08,629 - SupabaseRetriever_pcr - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:08,629 - SupabaseRetriever_pcr - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:08,631 - SupabaseRetriever_pcr - INFO - Calculated 6 similarities, 6 above threshold 0.2
2025-08-14 12:28:08,631 - SupabaseRetriever_pcr - INFO - Calculated 6 similarities, 6 above threshold 0.2
2025-08-14 12:28:08,632 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.3522', '0.3401', '0.3464']
2025-08-14 12:28:08,632 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.3522', '0.3401', '0.3464']
2025-08-14 12:28:08,632 - SupabaseRetriever_pcr - INFO - Direct vector search returned 2 res


📊 ADVANCED Ensemble Search (PCR) (query: 'feature enhancement', 2.40s): 4 results
   1. {'content': 'Title: Apache Flex Release 4.16.8\n\nDescription: Release Apache Flex 4.16.8 with 9 bug...
   2. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 6 bug...
   3. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 14 bu...
   → Sources used: {'supabase_pcr': 4}


2025-08-14 12:28:10,861 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:11,043 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&limit=6 "HTTP/2 200 OK"
2025-08-14 12:28:11,104 - SupabaseRetriever_pcr - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:11,104 - SupabaseRetriever_pcr - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:11,107 - SupabaseRetriever_pcr - INFO - Calculated 6 similarities, 6 above threshold 0.2
2025-08-14 12:28:11,107 - SupabaseRetriever_pcr - INFO - Calculated 6 similarities, 6 above threshold 0.2
2025-08-14 12:28:11,108 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.3872', '0.3970', '0.3945']
2025-08-14 12:28:11,108 - SupabaseRetriever_pcr - INFO - Result similarities: ['0.3872', '0.3970', '0.3945']
2025-08-14 12:28:11,109 - SupabaseRetriever_pcr - INFO - Direct vector search returned 2 res


📊 ADVANCED Ensemble Search (PCR) (query: 'version release', 2.17s): 4 results
   1. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 14 bu...
   2. {'content': 'Title: Apache Flex Release 4.16.0\n\nDescription: Release Apache Flex 4.16.0 with 7 bug...
   3. {'content': 'Title: Apache Flex Release 4.0.0\n\nDescription: Release Apache Flex 4.0.0 with 4 bug f...
   → Sources used: {'supabase_pcr': 4}

📊 BASIC vs ADVANCED Ensemble Comparison:
   Query: 'authentication error' (k=3)


2025-08-14 12:28:12,973 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=3 "HTTP/2 200 OK"
2025-08-14 12:28:12,975 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:12,975 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:12,978 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 2 above threshold 0.2
2025-08-14 12:28:12,978 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 2 above threshold 0.2
2025-08-14 12:28:12,978 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604']
2025-08-14 12:28:12,978 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604']
2025-08-14 12:28:12,979 - SupabaseRetriever_bugs - INFO - Direct vector search returned 1 results (from 3 candidates)
2025-08-14 12:28:12,979 - SupabaseRetriever_bugs - INFO - Direct vector search returned 1 results (fro

   📈 Basic ensemble (3 methods): 2 results
   🎯 Advanced ensemble (4 methods): 2 results
   📊 Basic top result score: 0.800
   📊 Advanced top result score: 0.800
   🔄 Overlap: 2 results
   📈 Unique to basic: 0 results
   🎯 Unique to advanced: 0 results
   → Results have significant overlap

🔧 Testing Individual Advanced Components:
   🔍 Multi-Query Retriever (LLM query expansion):


2025-08-14 12:28:16,905 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-08-14 12:28:16,913 - MultiQueryRetriever_bugs - INFO - Generated 4 query variations
2025-08-14 12:28:16,913 - MultiQueryRetriever_bugs - INFO - Generated 4 query variations
2025-08-14 12:28:16,914 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication error...' in bugs
2025-08-14 12:28:16,914 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication error...' in bugs
2025-08-14 12:28:16,914 - SupabaseRetriever_bugs - INFO - Parameters: k=2, similarity_threshold=0.1, filters=None
2025-08-14 12:28:16,914 - SupabaseRetriever_bugs - INFO - Parameters: k=2, similarity_threshold=0.1, filters=None
2025-08-14 12:28:17,428 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:17,641 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2

     Results: 3
     Sample result source: multi_query_bugs
   🎯 Contextual Compression Retriever (with reranking):


2025-08-14 12:28:19,195 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:19,728 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=60 "HTTP/2 200 OK"
2025-08-14 12:28:19,795 - SupabaseRetriever_bugs - INFO - Processing 60 candidates for similarity calculation
2025-08-14 12:28:19,795 - SupabaseRetriever_bugs - INFO - Processing 60 candidates for similarity calculation
2025-08-14 12:28:19,832 - SupabaseRetriever_bugs - INFO - Calculated 60 similarities, 59 above threshold 0.1
2025-08-14 12:28:19,832 - SupabaseRetriever_bugs - INFO - Calculated 60 similarities, 59 above threshold 0.1
2025-08-14 12:28:19,834 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.1882']
2025-08-14 12:28:19,834 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.1882']
2025-08-14 12:28:19,834 - SupabaseRetriever_bugs - INFO - Direct vector search

     Results: 3
     Sample result source: contextual_compression_bugs
   📝 BM25 Retriever (advanced keyword search):


2025-08-14 12:28:20,140 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25authentication%25&limit=3 "HTTP/2 200 OK"
2025-08-14 12:28:20,143 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 3 results
2025-08-14 12:28:20,143 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 3 results
2025-08-14 12:28:20,144 - BM25Retriever_bugs - INFO - BM25 retrieval: 3 results
2025-08-14 12:28:20,144 - BM25Retriever_bugs - INFO - BM25 retrieval: 3 results


     Results: 3
     Sample result source: bm25_bugs

📈 Ensemble search test summary: 8 successful out of 8

🎯 Advanced Ensemble Methods (similar to Cuttlefish3):
   1. 🎯 Naive Vector Search (25% weight) - Basic vector similarity
   2. 🔍 Multi-Query Retrieval (25% weight) - LLM generates query variations
   3. 🎯 Contextual Compression (25% weight) - Vector search + reranking
   4. 📝 BM25 Retrieval (25% weight) - Advanced keyword search
   Plus: Advanced deduplication with content hashing
   🔄 Weighted combination using standardized scoring

💡 The Advanced Ensemble should show logs from multiple sophisticated methods!
   Look for: Multi-query expansion, Contextual compression, BM25 retrieval


## 🔍 **Test Section 7: Document Lookup Tools**

Tests direct document retrieval by ID and JIRA ticket numbers.

In [13]:
print("🔍 Testing Document Lookup Tools...")

# First, try to find some document IDs to test with
print("\n📋 Finding test document IDs...")
try:
    # Get some documents to extract IDs
    sample_docs = rag_tools.vector_search_bugs("authentication", k=2)
    
    test_ids = []
    test_jira_ids = []
    
    for doc in sample_docs:
        # Check both direct access and metadata access
        doc_metadata = doc.get('metadata', {}) if isinstance(doc, dict) else {}
        
        # For document ID
        doc_id = doc.get('id') or doc_metadata.get('id')
        if doc_id:
            test_ids.append(doc_id)
            
        # For JIRA key
        jira_key = doc.get('key') or doc_metadata.get('key')
        if jira_key:
            test_jira_ids.append(jira_key)
    
    print(f"   Found {len(test_ids)} document IDs: {test_ids[:3]}")
    print(f"   Found {len(test_jira_ids)} JIRA IDs: {test_jira_ids[:3]}")
    
except Exception as e:
    print(f"❌ Failed to find test IDs: {e}")
    test_ids = []
    test_jira_ids = []

# Test document lookup by ID
print("\n🆔 Testing Document Lookup by ID:")
if test_ids:
    for doc_id in test_ids[:2]:
        try:
            # Test with bugs collection
            result = rag_tools.get_document_by_id(doc_id, collection='bugs')
            if result:
                # Access title from metadata
                metadata = result.get('metadata', {})
                title = metadata.get('title', 'No title')
                key = metadata.get('key', 'No key')
                print(f"   ✅ Found document {doc_id}: {key} - {title[:50]}...")
            else:
                print(f"   ❌ Document {doc_id} not found in bugs collection")
                
        except Exception as e:
            print(f"   ❌ Lookup by ID {doc_id} failed: {e}")
else:
    print("   ⚠️  No test IDs available")

# Test document lookup by JIRA ID
print("\n🎫 Testing Document Lookup by JIRA ID:")
if test_jira_ids:
    for jira_id in test_jira_ids[:2]:
        try:
            # Test with bugs collection
            result = rag_tools.get_document_by_jira_id(jira_id, collection='bugs')
            if result:
                # Access title from metadata
                metadata = result.get('metadata', {})
                title = metadata.get('title', 'No title')
                print(f"   ✅ Found JIRA {jira_id}: {title[:50]}...")
            else:
                print(f"   ❌ JIRA {jira_id} not found in bugs collection")
                
        except Exception as e:
            print(f"   ❌ Lookup by JIRA ID {jira_id} failed: {e}")
else:
    print("   ⚠️  No test JIRA IDs available")

# Test with common JIRA ID patterns
print("\n🔍 Testing Common JIRA ID Patterns:")
common_jira_patterns = ["JBIDE-16308", "SPR-11209", "HBASE-123", "PROJECT-1"]

for jira_id in common_jira_patterns:
    try:
        result = rag_tools.get_document_by_jira_id(jira_id)
        if result:
            # Access title from metadata
            metadata = result.get('metadata', {})
            title = metadata.get('title', 'No title')
            print(f"   ✅ Found {jira_id}: {title[:50]}...")
            
            # Show some additional details for verification
            project = metadata.get('project', 'Unknown')
            issue_type = metadata.get('type', 'Unknown')
            status = metadata.get('status', 'Unknown')
            print(f"       Project: {project}, Type: {issue_type}, Status: {status}")
        else:
            print(f"   ❌ {jira_id} not found")
    except Exception as e:
        print(f"   ❌ Lookup for {jira_id} failed: {e}")

print("\n📈 Document lookup tests completed")

2025-08-14 12:28:20,165 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication...' in bugs
2025-08-14 12:28:20,165 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication...' in bugs
2025-08-14 12:28:20,166 - SupabaseRetriever_bugs - INFO - Parameters: k=2, similarity_threshold=0.2, filters=None
2025-08-14 12:28:20,166 - SupabaseRetriever_bugs - INFO - Parameters: k=2, similarity_threshold=0.2, filters=None


🔍 Testing Document Lookup Tools...

📋 Finding test document IDs...


2025-08-14 12:28:20,546 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:20,729 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=6 "HTTP/2 200 OK"
2025-08-14 12:28:20,791 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:20,791 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:20,795 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 1 above threshold 0.2
2025-08-14 12:28:20,795 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 1 above threshold 0.2
2025-08-14 12:28:20,795 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2112']
2025-08-14 12:28:20,795 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2112']
2025-08-14 12:28:20,796 - SupabaseRetriever_bugs - INFO - Direct vector search returned 1 results (from 6 candidates)
2025-08

   Found 1 document IDs: [2]
   Found 1 JIRA IDs: ['SPR-11224']

🆔 Testing Document Lookup by ID:
   ✅ Found document 2: SPR-11224 - context:include-filter can't find ControllerAdvice...

🎫 Testing Document Lookup by JIRA ID:


2025-08-14 12:28:21,033 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&key=eq.SPR-11224 "HTTP/2 200 OK"
2025-08-14 12:28:21,035 - RAGTools - INFO - Document lookup by JIRA ID: found
2025-08-14 12:28:21,035 - RAGTools - INFO - Document lookup by JIRA ID: found
2025-08-14 12:28:21,149 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&key=eq.JBIDE-16308 "HTTP/2 200 OK"
2025-08-14 12:28:21,151 - RAGTools - INFO - Document lookup by JIRA ID: found
2025-08-14 12:28:21,151 - RAGTools - INFO - Document lookup by JIRA ID: found


   ✅ Found JIRA SPR-11224: context:include-filter can't find ControllerAdvice...

🔍 Testing Common JIRA ID Patterns:
   ✅ Found JBIDE-16308: Cannot start JBT 4.2.0.Alpha1 on Fedora 19...
       Project: JBIDE, Type: Bug, Status: Closed


2025-08-14 12:28:21,264 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&key=eq.SPR-11209 "HTTP/2 200 OK"
2025-08-14 12:28:21,266 - RAGTools - INFO - Document lookup by JIRA ID: found
2025-08-14 12:28:21,266 - RAGTools - INFO - Document lookup by JIRA ID: found
2025-08-14 12:28:21,379 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&key=eq.HBASE-123 "HTTP/2 200 OK"
2025-08-14 12:28:21,382 - RAGTools - INFO - Document lookup by JIRA ID: not found
2025-08-14 12:28:21,382 - RAGTools - INFO - Document lookup by JIRA ID: not found


   ✅ Found SPR-11209: Recently changes of GenericTypeAwarePropertyDescri...
       Project: SPR, Type: Bug, Status: Closed
   ❌ HBASE-123 not found


2025-08-14 12:28:21,507 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&key=eq.PROJECT-1 "HTTP/2 200 OK"
2025-08-14 12:28:21,509 - RAGTools - INFO - Document lookup by JIRA ID: not found
2025-08-14 12:28:21,509 - RAGTools - INFO - Document lookup by JIRA ID: not found


   ❌ PROJECT-1 not found

📈 Document lookup tests completed


## 🧰 **Test Section 8: Tool Registry & Dynamic Access**

Tests the tool registry system and dynamic access to all RAG methods, including advanced ensemble tools.

In [14]:
print("🧰 Testing Tool Registry...")

# Get all available tools
try:
    all_tools = rag_tools.get_all_tools()
    
    print(f"\n📋 Available Tools ({len(all_tools)} total):")
    
    # Categorize tools
    tool_categories = {
        'Vector Search': [name for name in all_tools.keys() if 'vector_search' in name],
        'Keyword Search': [name for name in all_tools.keys() if 'keyword_search' in name or 'bm25' in name],
        'Hybrid Search': [name for name in all_tools.keys() if 'hybrid_search' in name],
        'Contextual Compression': [name for name in all_tools.keys() if 'contextual_compression' in name],
        'Ensemble': [name for name in all_tools.keys() if 'ensemble' in name],
        'Document Lookup': [name for name in all_tools.keys() if 'get_document' in name],
        'Utility': [name for name in all_tools.keys() if 'count' in name or 'test' in name]
    }
    
    for category, tools in tool_categories.items():
        if tools:
            print(f"\n   📁 {category}:")
            for tool in tools:
                if 'advanced_ensemble' in tool:
                    print(f"      • {tool} ⭐ (NEW - Advanced 4-method ensemble)")
                else:
                    print(f"      • {tool}")
    
    # Test dynamic tool access
    print("\n🔧 Testing Dynamic Tool Access:")
    test_tool_name = 'vector_search_bugs'
    if test_tool_name in all_tools:
        tool_func = all_tools[test_tool_name]
        try:
            results = tool_func("test query", k=1)
            print(f"   ✅ Dynamic access to '{test_tool_name}' successful: {len(results)} results")
        except Exception as e:
            print(f"   ❌ Dynamic access to '{test_tool_name}' failed: {e}")
    else:
        print(f"   ❌ Tool '{test_tool_name}' not found in registry")
    
    # Test NEW advanced ensemble tools
    print("\n🎯 Testing NEW Advanced Ensemble Tools:")
    advanced_tools = ['advanced_ensemble_search_bugs', 'advanced_ensemble_search_pcr']
    
    for tool_name in advanced_tools:
        if tool_name in all_tools:
            tool_func = all_tools[tool_name]
            try:
                print(f"   🔧 Testing '{tool_name}'...")
                results = tool_func("authentication error", k=2)
                print(f"   ✅ Advanced tool '{tool_name}' successful: {len(results)} results")
                
                # Show advanced features used
                if results:
                    sources = set(result.get('source', 'unknown') for result in results)
                    search_types = set(result.get('search_type', 'unknown') for result in results)
                    print(f"      → Sources: {', '.join(sources)}")
                    print(f"      → Search types: {', '.join(search_types)}")
                
            except Exception as e:
                print(f"   ❌ Advanced tool '{tool_name}' failed: {e}")
                import traceback
                traceback.print_exc()
        else:
            print(f"   ❌ Advanced tool '{tool_name}' not found in registry")
    
    # Test basic vs advanced ensemble comparison
    print("\n📊 Basic vs Advanced Ensemble Tool Comparison:")
    try:
        query = "authentication error"
        
        # Basic ensemble
        basic_func = all_tools.get('ensemble_search_bugs')
        if basic_func:
            basic_results = basic_func(query, k=2)
            print(f"   📈 Basic ensemble: {len(basic_results)} results")
        
        # Advanced ensemble
        advanced_func = all_tools.get('advanced_ensemble_search_bugs')
        if advanced_func:
            advanced_results = advanced_func(query, k=2)
            print(f"   🎯 Advanced ensemble: {len(advanced_results)} results")
            
            if advanced_results:
                # Show which advanced methods were used
                sources = [result.get('source', 'unknown') for result in advanced_results]
                unique_sources = set(sources)
                print(f"      → Advanced methods used: {', '.join(unique_sources)}")
        
    except Exception as e:
        print(f"   ❌ Ensemble comparison failed: {e}")
    
except Exception as e:
    print(f"❌ Tool registry test failed: {e}")

print("\n📈 Tool registry tests completed")

# Advanced ensemble capabilities summary
print(f"\n🎯 Advanced Ensemble Capabilities (NEW):")
print(f"   ✨ Multi-Query Expansion: Uses LLM to generate query variations")
print(f"   ✨ Contextual Compression: Advanced reranking for higher quality")  
print(f"   ✨ Sophisticated BM25: Enhanced keyword search with scoring")
print(f"   ✨ Weighted Combination: Equal 25% weights for each method")
print(f"   ✨ Content Hashing: Advanced deduplication prevents duplicates")
print(f"   ✨ Fallback Support: Graceful degradation if components fail")

print(f"\n🔍 Expected Advanced Ensemble Log Signatures:")
print(f"   • 'Multi-Query Retrieval' - LLM generates query variations")
print(f"   • 'Contextual Compression' - Vector search + reranking")
print(f"   • 'BM25 Retrieval' - Advanced keyword search")
print(f"   • 'Advanced deduplication' - Content hash-based deduplication")
print(f"   • Source: 'advanced_ensemble_bugs' - Confirms advanced ensemble was used")

2025-08-14 12:28:21,524 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'test query...' in bugs
2025-08-14 12:28:21,524 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'test query...' in bugs
2025-08-14 12:28:21,525 - SupabaseRetriever_bugs - INFO - Parameters: k=1, similarity_threshold=0.2, filters=None
2025-08-14 12:28:21,525 - SupabaseRetriever_bugs - INFO - Parameters: k=1, similarity_threshold=0.2, filters=None


🧰 Testing Tool Registry...

📋 Available Tools (19 total):

   📁 Vector Search:
      • vector_search_bugs
      • vector_search_pcr

   📁 Keyword Search:
      • keyword_search_bugs
      • keyword_search_pcr
      • bm25_search_bugs
      • bm25_search_pcr

   📁 Hybrid Search:
      • hybrid_search_bugs
      • hybrid_search_pcr

   📁 Contextual Compression:
      • contextual_compression_search_bugs
      • contextual_compression_search_pcr

   📁 Ensemble:
      • ensemble_search_bugs
      • ensemble_search_pcr
      • advanced_ensemble_search_bugs ⭐ (NEW - Advanced 4-method ensemble)
      • advanced_ensemble_search_pcr ⭐ (NEW - Advanced 4-method ensemble)

   📁 Document Lookup:
      • get_document_by_id
      • get_document_by_jira_id

   📁 Utility:
      • count_documents_bugs
      • count_documents_pcr
      • test_connections

🔧 Testing Dynamic Tool Access:


2025-08-14 12:28:21,837 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:22,031 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=3 "HTTP/2 200 OK"
2025-08-14 12:28:22,034 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:22,034 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:22,036 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 0 above threshold 0.2
2025-08-14 12:28:22,036 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 0 above threshold 0.2
2025-08-14 12:28:22,039 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 3 candidates)
2025-08-14 12:28:22,039 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 3 candidates)
2025-08-14 12:28:22,040 - RAGTools - INFO - Vector search (bugs): 0 r

   ✅ Dynamic access to 'vector_search_bugs' successful: 0 results

🎯 Testing NEW Advanced Ensemble Tools:
   🔧 Testing 'advanced_ensemble_search_bugs'...


2025-08-14 12:28:22,583 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:22,768 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=3 "HTTP/2 200 OK"
2025-08-14 12:28:22,770 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:22,770 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:22,772 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 2 above threshold 0.2
2025-08-14 12:28:22,772 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 2 above threshold 0.2
2025-08-14 12:28:22,773 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2479', '0.2604']
2025-08-14 12:28:22,773 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2479', '0.2604']
2025-08-14 12:28:22,774 - SupabaseRetriever_bugs - INFO - Direct vector search returned 1 results (from 3

   ✅ Advanced tool 'advanced_ensemble_search_bugs' successful: 2 results
      → Sources: supabase_bugs
      → Search types: direct_keyword_search, direct_vector_search
   🔧 Testing 'advanced_ensemble_search_pcr'...


2025-08-14 12:28:24,031 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:24,240 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/pcr?select=%2A&limit=3 "HTTP/2 200 OK"
2025-08-14 12:28:24,244 - SupabaseRetriever_pcr - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:24,244 - SupabaseRetriever_pcr - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:24,246 - SupabaseRetriever_pcr - INFO - Calculated 3 similarities, 0 above threshold 0.2
2025-08-14 12:28:24,246 - SupabaseRetriever_pcr - INFO - Calculated 3 similarities, 0 above threshold 0.2
2025-08-14 12:28:24,250 - SupabaseRetriever_pcr - INFO - Direct vector search returned 0 results (from 3 candidates)
2025-08-14 12:28:24,250 - SupabaseRetriever_pcr - INFO - Direct vector search returned 0 results (from 3 candidates)
2025-08-14 12:28:24,251 - RAGTools - INFO - Vector search (pcr): 0 results f

   ✅ Advanced tool 'advanced_ensemble_search_pcr' successful: 0 results

📊 Basic vs Advanced Ensemble Tool Comparison:


2025-08-14 12:28:26,233 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:26,405 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=3 "HTTP/2 200 OK"
2025-08-14 12:28:26,408 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:26,408 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:26,411 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 2 above threshold 0.2
2025-08-14 12:28:26,411 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 2 above threshold 0.2
2025-08-14 12:28:26,412 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604']
2025-08-14 12:28:26,412 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604']
2025-08-14 12:28:26,413 - SupabaseRetriever_bugs - INFO - Direct vector search returned 1 results (from 3

   📈 Basic ensemble: 2 results


2025-08-14 12:28:28,288 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:28,478 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=3 "HTTP/2 200 OK"
2025-08-14 12:28:28,479 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:28,479 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:28,481 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 2 above threshold 0.2
2025-08-14 12:28:28,481 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 2 above threshold 0.2
2025-08-14 12:28:28,482 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604']
2025-08-14 12:28:28,482 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604']
2025-08-14 12:28:28,482 - SupabaseRetriever_bugs - INFO - Direct vector search returned 1 results (from 3

   🎯 Advanced ensemble: 2 results
      → Advanced methods used: supabase_bugs

📈 Tool registry tests completed

🎯 Advanced Ensemble Capabilities (NEW):
   ✨ Multi-Query Expansion: Uses LLM to generate query variations
   ✨ Contextual Compression: Advanced reranking for higher quality
   ✨ Sophisticated BM25: Enhanced keyword search with scoring
   ✨ Weighted Combination: Equal 25% weights for each method
   ✨ Content Hashing: Advanced deduplication prevents duplicates
   ✨ Fallback Support: Graceful degradation if components fail

🔍 Expected Advanced Ensemble Log Signatures:
   • 'Multi-Query Retrieval' - LLM generates query variations
   • 'Contextual Compression' - Vector search + reranking
   • 'BM25 Retrieval' - Advanced keyword search
   • 'Advanced deduplication' - Content hash-based deduplication
   • Source: 'advanced_ensemble_bugs' - Confirms advanced ensemble was used


## ⚡ **Test Section 9: Performance & Stress Testing**

Benchmarks performance across different retrieval methods and tests system behavior under various loads.

In [15]:
print("⚡ Running Performance and Stress Tests...")

import time

# Performance comparison
print("\n🏃 Performance Comparison:")
test_query = "authentication error"
performance_results = {}

methods_to_test = [
    ('Vector Search', lambda: rag_tools.vector_search_bugs(test_query, k=5)),
    ('Keyword Search', lambda: rag_tools.keyword_search_bugs(test_query, k=5)),
    ('Hybrid Search', lambda: rag_tools.hybrid_search_bugs(test_query, k=5)),
    ('Ensemble Search', lambda: rag_tools.ensemble_search_bugs(test_query, k=5))
]

for method_name, method_func in methods_to_test:
    try:
        start_time = time.time()
        results = method_func()
        end_time = time.time()
        duration = end_time - start_time
        
        performance_results[method_name] = {
            'duration': duration,
            'results_count': len(results),
            'results_per_second': len(results) / duration if duration > 0 else float('inf')
        }
        
        print(f"   {method_name}: {duration:.3f}s, {len(results)} results ({len(results)/duration:.1f} results/sec)")
        
    except Exception as e:
        print(f"   ❌ {method_name} failed: {e}")
        performance_results[method_name] = {'duration': float('inf'), 'results_count': 0, 'results_per_second': 0}

# Concurrent query test
print("\n🔄 Concurrent Query Test:")
try:
    queries = ["authentication", "error", "timeout", "connection", "memory"]
    start_time = time.time()
    
    all_results = []
    for query in queries:
        results = rag_tools.hybrid_search_bugs(query, k=2)
        all_results.extend(results)
    
    end_time = time.time()
    total_duration = end_time - start_time
    
    print(f"   Processed {len(queries)} queries in {total_duration:.3f}s")
    print(f"   Average time per query: {total_duration/len(queries):.3f}s")
    print(f"   Total results retrieved: {len(all_results)}")
    
except Exception as e:
    print(f"   ❌ Concurrent query test failed: {e}")

# Large result set test
print("\n📊 Large Result Set Test:")
try:
    large_k_values = [10, 25, 50]
    
    for k in large_k_values:
        start_time = time.time()
        results = rag_tools.vector_search_bugs("error", k=k)
        end_time = time.time()
        duration = end_time - start_time
        
        print(f"   k={k}: {duration:.3f}s, {len(results)} results")
        
except Exception as e:
    print(f"   ❌ Large result set test failed: {e}")

print("\n📈 Performance tests completed")

2025-08-14 12:28:29,501 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication error...' in bugs
2025-08-14 12:28:29,501 - SupabaseRetriever_bugs - INFO - Direct vector search for: 'authentication error...' in bugs
2025-08-14 12:28:29,502 - SupabaseRetriever_bugs - INFO - Parameters: k=5, similarity_threshold=0.2, filters=None
2025-08-14 12:28:29,502 - SupabaseRetriever_bugs - INFO - Parameters: k=5, similarity_threshold=0.2, filters=None


⚡ Running Performance and Stress Tests...

🏃 Performance Comparison:


2025-08-14 12:28:29,695 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:29,854 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=15 "HTTP/2 200 OK"
2025-08-14 12:28:30,051 - SupabaseRetriever_bugs - INFO - Processing 15 candidates for similarity calculation
2025-08-14 12:28:30,051 - SupabaseRetriever_bugs - INFO - Processing 15 candidates for similarity calculation
2025-08-14 12:28:30,060 - SupabaseRetriever_bugs - INFO - Calculated 15 similarities, 9 above threshold 0.2
2025-08-14 12:28:30,060 - SupabaseRetriever_bugs - INFO - Calculated 15 similarities, 9 above threshold 0.2
2025-08-14 12:28:30,061 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:28:30,061 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:28:30,062 - SupabaseRetriever_bugs - INFO - Direct vector search r

   Vector Search: 0.562s, 5 results (8.9 results/sec)


2025-08-14 12:28:30,410 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25authentication%25&limit=5 "HTTP/2 200 OK"
2025-08-14 12:28:30,544 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25error%25&limit=1 "HTTP/2 200 OK"
2025-08-14 12:28:30,546 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 5 results
2025-08-14 12:28:30,546 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 5 results
2025-08-14 12:28:30,546 - RAGTools - INFO - Keyword search (bugs): 5 results for 'authentication error...'
2025-08-14 12:28:30,546 - RAGTools - INFO - Keyword search (bugs): 5 results for 'authentication error...'
2025-08-14 12:28:30,547 - SupabaseRetriever_bugs - INFO - Direct hybrid search for: 'authentication error...' in bugs
2025-08-14 12:28:30,547 - SupabaseRetriever_bugs - INFO - Direct hybrid search for: 'authentication error...' in bugs

   Keyword Search: 0.484s, 5 results (10.3 results/sec)


2025-08-14 12:28:31,070 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:31,301 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=30 "HTTP/2 200 OK"
2025-08-14 12:28:31,485 - SupabaseRetriever_bugs - INFO - Processing 30 candidates for similarity calculation
2025-08-14 12:28:31,485 - SupabaseRetriever_bugs - INFO - Processing 30 candidates for similarity calculation
2025-08-14 12:28:31,500 - SupabaseRetriever_bugs - INFO - Calculated 30 similarities, 17 above threshold 0.2
2025-08-14 12:28:31,500 - SupabaseRetriever_bugs - INFO - Calculated 30 similarities, 17 above threshold 0.2
2025-08-14 12:28:31,501 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:28:31,501 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:28:31,502 - SupabaseRetriever_bugs - INFO - Direct vector search

   Hybrid Search: 1.508s, 5 results (3.3 results/sec)


2025-08-14 12:28:32,279 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:32,479 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=6 "HTTP/2 200 OK"
2025-08-14 12:28:32,527 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:32,527 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:32,531 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 4 above threshold 0.2
2025-08-14 12:28:32,531 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 4 above threshold 0.2
2025-08-14 12:28:32,532 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:28:32,532 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2478', '0.2604', '0.2196']
2025-08-14 12:28:32,533 - SupabaseRetriever_bugs - INFO - Direct vector search return

   Ensemble Search: 1.686s, 4 results (2.4 results/sec)

🔄 Concurrent Query Test:


2025-08-14 12:28:34,003 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:34,214 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=12 "HTTP/2 200 OK"
2025-08-14 12:28:34,336 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-14 12:28:34,336 - SupabaseRetriever_bugs - INFO - Processing 12 candidates for similarity calculation
2025-08-14 12:28:34,342 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 1 above threshold 0.2
2025-08-14 12:28:34,342 - SupabaseRetriever_bugs - INFO - Calculated 12 similarities, 1 above threshold 0.2
2025-08-14 12:28:34,343 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2112']
2025-08-14 12:28:34,343 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2112']
2025-08-14 12:28:34,344 - SupabaseRetriever_bugs - INFO - Direct vector search returned 1 results (from 12 candidates)
2

   Processed 5 queries in 3.759s
   Average time per query: 0.752s
   Total results retrieved: 10

📊 Large Result Set Test:


2025-08-14 12:28:37,863 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=30 "HTTP/2 200 OK"
2025-08-14 12:28:38,061 - SupabaseRetriever_bugs - INFO - Processing 30 candidates for similarity calculation
2025-08-14 12:28:38,061 - SupabaseRetriever_bugs - INFO - Processing 30 candidates for similarity calculation
2025-08-14 12:28:38,075 - SupabaseRetriever_bugs - INFO - Calculated 30 similarities, 19 above threshold 0.2
2025-08-14 12:28:38,075 - SupabaseRetriever_bugs - INFO - Calculated 30 similarities, 19 above threshold 0.2
2025-08-14 12:28:38,077 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2638', '0.2176', '0.2219']
2025-08-14 12:28:38,077 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2638', '0.2176', '0.2219']
2025-08-14 12:28:38,077 - SupabaseRetriever_bugs - INFO - Direct vector search returned 10 results (from 30 candidates)
2025-08-14 12:28:38,077 - SupabaseRetriever_bugs - INFO - Direct vector s

   k=10: 0.577s, 10 results


2025-08-14 12:28:38,322 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:38,631 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=75 "HTTP/2 200 OK"
2025-08-14 12:28:38,908 - SupabaseRetriever_bugs - INFO - Processing 75 candidates for similarity calculation
2025-08-14 12:28:38,908 - SupabaseRetriever_bugs - INFO - Processing 75 candidates for similarity calculation
2025-08-14 12:28:38,937 - SupabaseRetriever_bugs - INFO - Calculated 75 similarities, 45 above threshold 0.2
2025-08-14 12:28:38,937 - SupabaseRetriever_bugs - INFO - Calculated 75 similarities, 45 above threshold 0.2
2025-08-14 12:28:38,938 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2638', '0.2176', '0.2219']
2025-08-14 12:28:38,938 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2638', '0.2176', '0.2219']
2025-08-14 12:28:38,938 - SupabaseRetriever_bugs - INFO - Direct vector search

   k=25: 0.861s, 25 results


2025-08-14 12:28:39,494 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=100 "HTTP/2 200 OK"
2025-08-14 12:28:39,605 - SupabaseRetriever_bugs - INFO - Processing 100 candidates for similarity calculation
2025-08-14 12:28:39,605 - SupabaseRetriever_bugs - INFO - Processing 100 candidates for similarity calculation
2025-08-14 12:28:39,637 - SupabaseRetriever_bugs - INFO - Calculated 100 similarities, 57 above threshold 0.2
2025-08-14 12:28:39,637 - SupabaseRetriever_bugs - INFO - Calculated 100 similarities, 57 above threshold 0.2
2025-08-14 12:28:39,637 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2638', '0.2176', '0.2219']
2025-08-14 12:28:39,637 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2638', '0.2176', '0.2219']
2025-08-14 12:28:39,638 - SupabaseRetriever_bugs - INFO - Direct vector search returned 50 results (from 100 candidates)
2025-08-14 12:28:39,638 - SupabaseRetriever_bugs - INFO - Direct ve

   k=50: 0.700s, 50 results

📈 Performance tests completed


## 🛡️ **Test Section 10: Error Handling & Edge Cases**

Validates system robustness with invalid inputs, edge cases, and error conditions.

In [16]:
print("🛡️ Testing Error Handling and Edge Cases...")

# Test empty queries
print("\n🔍 Empty Query Tests:")
empty_query_tests = ["", " ", "   ", None]

for i, query in enumerate(empty_query_tests):
    try:
        results = rag_tools.vector_search_bugs(query or "", k=1)
        print(f"   Empty query {i+1}: {len(results)} results (query: '{query}')")
    except Exception as e:
        print(f"   Empty query {i+1} failed as expected: {type(e).__name__}")

# Test invalid parameters
print("\n⚠️  Invalid Parameter Tests:")
try:
    # Negative k
    results = rag_tools.vector_search_bugs("test", k=-1)
    print(f"   Negative k: {len(results)} results (should handle gracefully)")
except Exception as e:
    print(f"   Negative k failed as expected: {type(e).__name__}")

try:
    # Invalid similarity threshold
    results = rag_tools.vector_search_bugs("test", k=1, similarity_threshold=2.0)
    print(f"   Invalid threshold: {len(results)} results (should handle gracefully)")
except Exception as e:
    print(f"   Invalid threshold failed as expected: {type(e).__name__}")

try:
    # Invalid collection
    results = rag_tools.get_document_by_id("test", collection="invalid_collection")
    print(f"   Invalid collection: {results} (should be None or error)")
except Exception as e:
    print(f"   Invalid collection failed as expected: {type(e).__name__}")

# Test very long queries
print("\n📏 Long Query Tests:")
long_queries = [
    "a" * 100,  # 100 characters
    "authentication error " * 20,  # Repeated phrase
    "This is a very long query that contains many words and should test how the system handles extremely verbose search queries that users might occasionally input" * 5
]

for i, query in enumerate(long_queries):
    try:
        results = rag_tools.hybrid_search_bugs(query[:500], k=1)  # Truncate to 500 chars
        print(f"   Long query {i+1} ({len(query)} chars): {len(results)} results")
    except Exception as e:
        print(f"   Long query {i+1} failed: {type(e).__name__}")

# Test special characters
print("\n🔣 Special Character Tests:")
special_queries = [
    "@#$%^&*()",
    "query with spaces    and    tabs\t\t",
    "unicode: ñáéíóú 中文 русский",
    "'quotes' and \"double quotes\"",
    "SQL injection'; DROP TABLE bugs; --"
]

for i, query in enumerate(special_queries):
    try:
        results = rag_tools.keyword_search_bugs(query, k=1)
        print(f"   Special chars {i+1}: {len(results)} results (safe)")
    except Exception as e:
        print(f"   Special chars {i+1} failed: {type(e).__name__}")

print("\n📈 Error handling tests completed")

2025-08-14 12:28:39,645 - SupabaseRetriever_bugs - INFO - Direct vector search for: '...' in bugs
2025-08-14 12:28:39,645 - SupabaseRetriever_bugs - INFO - Direct vector search for: '...' in bugs
2025-08-14 12:28:39,646 - SupabaseRetriever_bugs - INFO - Parameters: k=1, similarity_threshold=0.2, filters=None
2025-08-14 12:28:39,646 - SupabaseRetriever_bugs - INFO - Parameters: k=1, similarity_threshold=0.2, filters=None


🛡️ Testing Error Handling and Edge Cases...

🔍 Empty Query Tests:


2025-08-14 12:28:39,913 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:40,105 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=3 "HTTP/2 200 OK"
2025-08-14 12:28:40,108 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:40,108 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:40,110 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 0 above threshold 0.2
2025-08-14 12:28:40,110 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 0 above threshold 0.2
2025-08-14 12:28:40,112 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 3 candidates)
2025-08-14 12:28:40,112 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 3 candidates)
2025-08-14 12:28:40,113 - RAGTools - INFO - Vector search (bugs): 0 r

   Empty query 1: 0 results (query: '')


2025-08-14 12:28:40,774 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:40,889 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=3 "HTTP/2 200 OK"
2025-08-14 12:28:40,892 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:40,892 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:40,894 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 0 above threshold 0.2
2025-08-14 12:28:40,894 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 0 above threshold 0.2
2025-08-14 12:28:40,896 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 3 candidates)
2025-08-14 12:28:40,896 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 3 candidates)
2025-08-14 12:28:40,896 - RAGTools - INFO - Vector search (bugs): 0 r

   Empty query 2: 0 results (query: ' ')


2025-08-14 12:28:41,243 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=3 "HTTP/2 200 OK"
2025-08-14 12:28:41,244 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:41,244 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:41,246 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 0 above threshold 0.2
2025-08-14 12:28:41,246 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 0 above threshold 0.2
2025-08-14 12:28:41,248 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 3 candidates)
2025-08-14 12:28:41,248 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 3 candidates)
2025-08-14 12:28:41,248 - RAGTools - INFO - Vector search (bugs): 0 results for '   ...'
2025-08-14 12:28:41,248 - RAGTools - INFO - Vector search (bugs): 0 results for '   ...'
2025-0

   Empty query 3: 0 results (query: '   ')


2025-08-14 12:28:41,664 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:41,854 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=3 "HTTP/2 200 OK"
2025-08-14 12:28:41,857 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:41,857 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:41,860 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 0 above threshold 0.2
2025-08-14 12:28:41,860 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 0 above threshold 0.2
2025-08-14 12:28:41,862 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 3 candidates)
2025-08-14 12:28:41,862 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 3 candidates)
2025-08-14 12:28:41,863 - RAGTools - INFO - Vector search (bugs): 0 r

   Empty query 4: 0 results (query: 'None')

⚠️  Invalid Parameter Tests:


2025-08-14 12:28:42,173 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=-3 "HTTP/2 416 Requested Range Not Satisfiable"
2025-08-14 12:28:42,175 - SupabaseRetriever_bugs - INFO - Using text-based fallback search for: 'test...'
2025-08-14 12:28:42,175 - SupabaseRetriever_bugs - INFO - Using text-based fallback search for: 'test...'
2025-08-14 12:28:42,277 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25test%25&limit=-1 "HTTP/2 416 Requested Range Not Satisfiable"
2025-08-14 12:28:42,279 - SupabaseRetriever_bugs - ERROR - Text fallback search error: {'message': 'Requested range not satisfiable', 'code': 'PGRST103', 'hint': None, 'details': 'Limit should be greater than or equal to zero.'}
2025-08-14 12:28:42,279 - SupabaseRetriever_bugs - ERROR - Text fallback search error: {'message': 'Requested range not satisfiable', 'code': 'PGRST103', 'hint': None, 'details': '

   Negative k: 0 results (should handle gracefully)


2025-08-14 12:28:42,606 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:42,819 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=3 "HTTP/2 200 OK"
2025-08-14 12:28:42,822 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:42,822 - SupabaseRetriever_bugs - INFO - Processing 3 candidates for similarity calculation
2025-08-14 12:28:42,824 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 0 above threshold 2.0
2025-08-14 12:28:42,824 - SupabaseRetriever_bugs - INFO - Calculated 3 similarities, 0 above threshold 2.0
2025-08-14 12:28:42,828 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 3 candidates)
2025-08-14 12:28:42,828 - SupabaseRetriever_bugs - INFO - Direct vector search returned 0 results (from 3 candidates)
2025-08-14 12:28:42,828 - RAGTools - INFO - Vector search (bugs): 0 r

   Invalid threshold: 0 results (should handle gracefully)
   Invalid collection: None (should be None or error)

📏 Long Query Tests:


2025-08-14 12:28:43,090 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:43,276 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=6 "HTTP/2 200 OK"
2025-08-14 12:28:43,329 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:43,329 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:43,333 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 1 above threshold 0.2
2025-08-14 12:28:43,333 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 1 above threshold 0.2
2025-08-14 12:28:43,333 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2019']
2025-08-14 12:28:43,333 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2019']
2025-08-14 12:28:43,334 - SupabaseRetriever_bugs - INFO - Direct vector search returned 1 results (from 6 candidates)
2025-08

   Long query 1 (100 chars): 1 results


2025-08-14 12:28:44,247 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=6 "HTTP/2 200 OK"
2025-08-14 12:28:44,303 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:44,303 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:44,305 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 1 above threshold 0.2
2025-08-14 12:28:44,305 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 1 above threshold 0.2
2025-08-14 12:28:44,306 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2014']
2025-08-14 12:28:44,306 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2014']
2025-08-14 12:28:44,306 - SupabaseRetriever_bugs - INFO - Direct vector search returned 1 results (from 6 candidates)
2025-08-14 12:28:44,306 - SupabaseRetriever_bugs - INFO - Direct vector search returned 1 results (from 6 candidates)
2025

   Long query 2 (420 chars): 1 results


2025-08-14 12:28:45,036 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-08-14 12:28:45,257 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&limit=6 "HTTP/2 200 OK"
2025-08-14 12:28:45,287 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:45,287 - SupabaseRetriever_bugs - INFO - Processing 6 candidates for similarity calculation
2025-08-14 12:28:45,291 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 1 above threshold 0.2
2025-08-14 12:28:45,291 - SupabaseRetriever_bugs - INFO - Calculated 6 similarities, 1 above threshold 0.2
2025-08-14 12:28:45,292 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2210']
2025-08-14 12:28:45,292 - SupabaseRetriever_bugs - INFO - Result similarities: ['0.2210']
2025-08-14 12:28:45,293 - SupabaseRetriever_bugs - INFO - Direct vector search returned 1 results (from 6 candidates)
2025-08

   Long query 3 (790 chars): 1 results

🔣 Special Character Tests:


2025-08-14 12:28:45,875 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25%40%23%24%25%5E%26%2A%28%29%25&limit=1 "HTTP/2 200 OK"
2025-08-14 12:28:46,077 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&description=ilike.%25%40%23%24%25%5E%26%2A%28%29%25&limit=1 "HTTP/2 200 OK"
2025-08-14 12:28:46,078 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 0 results
2025-08-14 12:28:46,078 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 0 results
2025-08-14 12:28:46,079 - RAGTools - INFO - Keyword search (bugs): 0 results for '@#$%^&*()...'
2025-08-14 12:28:46,079 - RAGTools - INFO - Keyword search (bugs): 0 results for '@#$%^&*()...'
2025-08-14 12:28:46,079 - SupabaseRetriever_bugs - INFO - Direct keyword search for: 'query with spaces    and    tabs		...' in bugs
2025-08-14 12:28:46,079 - SupabaseRetriever_bugs - INFO - Direct keyword search for

   Special chars 1: 0 results (safe)


2025-08-14 12:28:46,364 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25query%25&limit=1 "HTTP/2 200 OK"
2025-08-14 12:28:46,366 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 1 results
2025-08-14 12:28:46,366 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 1 results
2025-08-14 12:28:46,366 - RAGTools - INFO - Keyword search (bugs): 1 results for 'query with spaces    and    tabs		...'
2025-08-14 12:28:46,366 - RAGTools - INFO - Keyword search (bugs): 1 results for 'query with spaces    and    tabs		...'
2025-08-14 12:28:46,367 - SupabaseRetriever_bugs - INFO - Direct keyword search for: 'unicode: ñáéíóú 中文 русский...' in bugs
2025-08-14 12:28:46,367 - SupabaseRetriever_bugs - INFO - Direct keyword search for: 'unicode: ñáéíóú 中文 русский...' in bugs
2025-08-14 12:28:46,485 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%2

   Special chars 2: 1 results (safe)


2025-08-14 12:28:46,591 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25unicode%3A%25&limit=1 "HTTP/2 200 OK"
2025-08-14 12:28:46,715 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25%C3%B1%C3%A1%C3%A9%C3%AD%C3%B3%C3%BA%25&limit=1 "HTTP/2 200 OK"
2025-08-14 12:28:46,844 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8%D0%B9%25&limit=1 "HTTP/2 200 OK"
2025-08-14 12:28:47,040 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&description=ilike.%25unicode%3A+%C3%B1%C3%A1%C3%A9%C3%AD%C3%B3%C3%BA+%E4%B8%AD%E6%96%87+%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8%D0%B9%25&limit=1 "HTTP/2 200 OK"
2025-08-14 12:28:47,042 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 0 results
2025-08-14 12:28:47,042 - Supa

   Special chars 3: 0 results (safe)


2025-08-14 12:28:47,292 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25%27quotes%27%25&limit=1 "HTTP/2 200 OK"
2025-08-14 12:28:47,417 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25and%25&limit=1 "HTTP/2 200 OK"
2025-08-14 12:28:47,419 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 1 results
2025-08-14 12:28:47,419 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 1 results
2025-08-14 12:28:47,420 - RAGTools - INFO - Keyword search (bugs): 1 results for ''quotes' and "double quotes"...'
2025-08-14 12:28:47,420 - RAGTools - INFO - Keyword search (bugs): 1 results for ''quotes' and "double quotes"...'
2025-08-14 12:28:47,420 - SupabaseRetriever_bugs - INFO - Direct keyword search for: 'SQL injection'; DROP TABLE bugs; --...' in bugs
2025-08-14 12:28:47,420 - SupabaseRetriever_bugs - INFO - Direct keyword search for: 'SQL

   Special chars 4: 1 results (safe)


2025-08-14 12:28:47,663 - httpx - INFO - HTTP Request: GET https://jzstozvrjjhmwigycjtj.supabase.co/rest/v1/bugs?select=%2A&title=ilike.%25sql%25&limit=1 "HTTP/2 200 OK"
2025-08-14 12:28:47,664 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 1 results
2025-08-14 12:28:47,664 - SupabaseRetriever_bugs - INFO - Direct keyword search returned 1 results
2025-08-14 12:28:47,664 - RAGTools - INFO - Keyword search (bugs): 1 results for 'SQL injection'; DROP TABLE bugs; --...'
2025-08-14 12:28:47,664 - RAGTools - INFO - Keyword search (bugs): 1 results for 'SQL injection'; DROP TABLE bugs; --...'


   Special chars 5: 1 results (safe)

📈 Error handling tests completed


## 📊 **Test Section 11: Results Summary & Analysis**

Comprehensive analysis of all test results with performance metrics and recommendations.

In [17]:
print("📊 Test Results Summary and Analysis")

# Compile all test results
all_test_results = {}

# Add results from different test sections - handle both dict and list formats
if 'vector_results' in locals():
    if isinstance(vector_results, dict):
        all_test_results.update({f"vector_{k}": v for k, v in vector_results.items()})
    else:
        all_test_results['vector_results'] = vector_results

if 'keyword_results' in locals():
    if isinstance(keyword_results, dict):
        all_test_results.update({f"keyword_{k}": v for k, v in keyword_results.items()})
    else:
        all_test_results['keyword_results'] = keyword_results

if 'hybrid_results' in locals():
    if isinstance(hybrid_results, dict):
        all_test_results.update({f"hybrid_{k}": v for k, v in hybrid_results.items()})
    else:
        all_test_results['hybrid_results'] = hybrid_results

if 'compression_results' in locals():
    if isinstance(compression_results, dict):
        all_test_results.update({f"compression_{k}": v for k, v in compression_results.items()})
    else:
        all_test_results['compression_results'] = compression_results

if 'ensemble_results' in locals():
    if isinstance(ensemble_results, dict):
        all_test_results.update({f"ensemble_{k}": v for k, v in ensemble_results.items()})
    else:
        all_test_results['ensemble_results'] = ensemble_results

# Create summary statistics
total_tests = len(all_test_results)
successful_tests = len([r for r in all_test_results.values() if r])
failed_tests = total_tests - successful_tests

print(f"\n🎯 Overall Test Results:")
print(f"   Total tests: {total_tests}")
print(f"   Successful: {successful_tests} ({successful_tests/total_tests*100:.1f}%)" if total_tests > 0 else "   No tests found")
print(f"   Failed: {failed_tests} ({failed_tests/total_tests*100:.1f}%)" if total_tests > 0 else "")

# Method-wise breakdown
method_stats = {}
for test_name, results in all_test_results.items():
    method = test_name.split('_')[0]
    if method not in method_stats:
        method_stats[method] = {'total': 0, 'successful': 0}
    method_stats[method]['total'] += 1
    if results:  # Check if results exist (could be list or dict)
        method_stats[method]['successful'] += 1

print(f"\n📈 Method-wise Performance:")
for method, stats in method_stats.items():
    success_rate = stats['successful'] / stats['total'] * 100 if stats['total'] > 0 else 0
    print(f"   {method.capitalize()}: {stats['successful']}/{stats['total']} ({success_rate:.1f}%)")

# Performance summary
if 'performance_results' in locals():
    print(f"\n⚡ Performance Summary:")
    fastest_method = min(performance_results.items(), key=lambda x: x[1]['duration'])
    most_results = max(performance_results.items(), key=lambda x: x[1]['results_count'])
    
    print(f"   Fastest method: {fastest_method[0]} ({fastest_method[1]['duration']:.3f}s)")
    print(f"   Most results: {most_results[0]} ({most_results[1]['results_count']} results)")

# Recommendations
print(f"\n💡 Recommendations:")

if total_tests > 0:
    if successful_tests / total_tests >= 0.8:
        print("   ✅ RAG tools are working well overall")
    elif successful_tests / total_tests >= 0.5:
        print("   ⚠️  RAG tools have moderate success rate - check configuration")
    else:
        print("   ❌ RAG tools have low success rate - check environment and setup")

    # Check which methods work best
    best_methods = [method for method, stats in method_stats.items() 
                    if stats['successful'] / stats['total'] >= 0.8]

    if best_methods:
        print(f"   🏆 Best performing methods: {', '.join(best_methods)}")
else:
    print("   ℹ️  No test results found to analyze")

# Environment checks
print(f"\n🔧 Environment Status:")
try:
    import os
    env_vars = ['SUPABASE_URL', 'SUPABASE_KEY', 'OPENAI_API_KEY']
    missing_vars = [var for var in env_vars if not os.getenv(var)]
    
    if missing_vars:
        print(f"   ⚠️  Missing environment variables: {', '.join(missing_vars)}")
    else:
        print("   ✅ All required environment variables are set")
        
except Exception as e:
    print(f"   ❌ Environment check failed: {e}")

print(f"\n🎉 Testing completed! Check the results above for detailed analysis.")

📊 Test Results Summary and Analysis

🎯 Overall Test Results:
   Total tests: 28
   Successful: 27 (96.4%)
   Failed: 1 (3.6%)

📈 Method-wise Performance:
   Vector: 7/7 (100.0%)
   Keyword: 6/7 (85.7%)
   Hybrid: 5/5 (100.0%)
   Compression: 1/1 (100.0%)
   Ensemble: 8/8 (100.0%)

⚡ Performance Summary:
   Fastest method: Keyword Search (0.484s)
   Most results: Vector Search (5 results)

💡 Recommendations:
   ✅ RAG tools are working well overall
   🏆 Best performing methods: vector, keyword, hybrid, compression, ensemble

🔧 Environment Status:
   ✅ All required environment variables are set

🎉 Testing completed! Check the results above for detailed analysis.


## 🎮 **Interactive Testing Section**

**Playground for Custom Testing**: Use this section to test specific queries, compare methods, or explore the RAG system with your own inputs.

### **Available Examples:**
- **Custom Query Testing**: Test any query with any retrieval method
- **Method Comparison**: Side-by-side comparison of different approaches  
- **Document Lookup**: Search for specific JIRA tickets or documents
- **Advanced Filtering**: Test with custom metadata filters

**💡 Tip**: Uncomment and modify the example code below to run your own tests!

## 12. Interactive Testing Section

In [18]:
print("🎮 Interactive Testing Section")
print("Use this cell to test specific queries or methods manually.")
print("Uncomment and modify the examples below:")

# Example: Test a specific query
# test_query = "your custom query here"
# results = rag_tools.hybrid_search_bugs(test_query, k=5)
# display_results(results, f"Custom query: {test_query}")

# Example: Compare different methods
# query = "authentication error"
# vector_results = rag_tools.vector_search_bugs(query, k=3)
# keyword_results = rag_tools.keyword_search_bugs(query, k=3)
# hybrid_results = rag_tools.hybrid_search_bugs(query, k=3)
#
# print(f"Comparison for '{query}':")
# display_results(vector_results, "Vector Search")
# display_results(keyword_results, "Keyword Search")
# display_results(hybrid_results, "Hybrid Search")

# Example: Test specific document lookup
# jira_id = "JBIDE-16273"
# doc = rag_tools.get_document_by_jira_id(jira_id)
# if doc:
#     print(f"Found document {jira_id}:")
#     print(f"Title: {doc.get('title', 'N/A')}")
#     print(f"Description: {doc.get('description', 'N/A')[:200]}...")
# else:
#     print(f"Document {jira_id} not found")

# Example: Test with custom filters
# results = rag_tools.vector_search_bugs(
#     "error", 
#     k=5, 
#     filters={'type': 'Bug', 'priority': 'High'}
# )
# display_results(results, "Filtered Search")

print("\n💡 Uncomment and run the examples above to test interactively!")

🎮 Interactive Testing Section
Use this cell to test specific queries or methods manually.
Uncomment and modify the examples below:

💡 Uncomment and run the examples above to test interactively!
