# RAG vs Structured RAG: Hotel Search Comparison

This notebook compares two hotel search approaches:
1. **Traditional RAG**: Vector search using description embeddings
2. **Structured RAG**: Hybrid search with metadata filtering and dual embeddings

## Part 1: Traditional RAG Implementation

## 1. Install Required Libraries

First, we need to install the necessary packages for our RAG implementation.

In [130]:
%pip install openai qdrant-client python-dotenv pandas numpy matplotlib seaborn

Note: you may need to restart the kernel to use updated packages.




## 2. Import Dependencies

Import all necessary libraries for the RAG implementation.

In [None]:
import json
import os
import openai
from openai import OpenAI
import pandas as pd
import numpy as np
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct, NamedVector, PayloadSchemaType
from typing import List, Dict, Any
import time

## 3. Load and Explore Dataset

Load the hotel dataset and explore its structure to understand the data we'll be working with.

In [153]:
# Load the queries.json file
import json

print("Loading queries.json file...")
with open('queries.json', 'r') as f:
    test_data = json.load(f)

queries_data = test_data
structured_queries_data = test_data

print(f"Loaded {len(test_data)} queries from queries.json")
print(f"Query types: {len([q for q in test_data if q.get('type') == 'normal'])} normal, {len([q for q in test_data if q.get('type') == 'tricky'])} tricky")

# Show first few queries as a sample
print("\nSample queries:")
for i, query in enumerate(test_data[:3]):
    print(f"  {i+1}. ({query.get('type', 'unknown')}): {query.get('query', '')[:60]}...")
    if query.get('expected_results'):
        print(f"     Expected: {query.get('expected_results', [])}")
    print()

Loading queries.json file...
Loaded 89 queries from queries.json
Query types: 44 normal, 45 tricky

Sample queries:
  1. (normal): I need a casino hotel in Asia with gambling and nightlife....
     Expected: ['The Colosseum', 'The Shamrock', 'The Palace of Versailles', 'Marina Bay Sands']

  2. (normal): Find me a luxury beach resort in Europe with spa services....
     Expected: ['Hotel du Cap-Eden-Roc']

  3. (normal): I'm looking for a mountain lodge in North America for skiing...
     Expected: ['Kyoto Gardens Inn', 'Fairmont Banff Springs', 'The Outback', 'Bavarian Village Hotel']



In [None]:
# Load the Hotels dataset
with open('Dataset.json', 'r', encoding='utf-8') as file:
    hotels_data = json.load(file)

print(f"Total number of hotels: {len(hotels_data)}")
print("\nFirst hotel example:")
print(json.dumps(hotels_data[0], indent=2))

print(f"\nDescription length statistics:")
descriptions = [hotel['description'] for hotel in hotels_data]
desc_lengths = [len(desc) for desc in descriptions]
print(f"Average description length: {np.mean(desc_lengths):.1f} characters")
print(f"Min description length: {min(desc_lengths)}")
print(f"Max description length: {max(desc_lengths)}")

# Extract just the descriptions for embedding
hotel_descriptions = descriptions
print(f"\nSample descriptions:")
for i, desc in enumerate(hotel_descriptions[:3]):
    print(f"{i+1}. {desc[:100]}...")

Total number of hotels: 144

First hotel example:
{
  "description": "Experience Nordic elegance at 'The Stockholm Lale', a unique hotel in Sweden featuring exquisite Turkish-inspired decor and serving authentic Anatolian cuisine. It's a taste of Turkey in the heart of Scandinavia, perfect for a vibrant city break any time of year.",
  "structured_description": {
    "name": "The Stockholm Lale",
    "region": "Europe",
    "country": "Sweden",
    "type": "City",
    "atmosphere": "Vibrant",
    "activities": [
      "Cultural",
      "Cuisine"
    ],
    "services": [
      "Turkish food",
      "Turkish decor"
    ],
    "best_months": [
      "All"
    ],
    "description": "A unique hotel in Stockholm with Turkish-inspired design and food."
  }
}

Description length statistics:
Average description length: 211.7 characters
Min description length: 175
Max description length: 297

Sample descriptions:
1. Experience Nordic elegance at 'The Stockholm Lale', a unique hotel in Sweden fea

## 4. Initialize Qdrant Client

Set up the connection to the Qdrant vector database. Replace the placeholder values below with your own Qdrant URL and API key before running the cell.

In [None]:
# Initialize Qdrant client
# Replace these with your actual Qdrant credentials
QDRANT_URL = "YOUR_QDRANT_API_URL"  # e.g., "https://xyz-example.eu-central.aws.cloud.qdrant.io:6333"
QDRANT_API_KEY = "YOUR_QDRANT_API_KEY"

try:
    client = QdrantClient(
        url=QDRANT_URL,
        api_key=QDRANT_API_KEY,
    )
    
    # Test connection
    collections = client.get_collections()
    print("Successfully connected to Qdrant!")
    print(f"Existing collections: {[col.name for col in collections.collections]}")
    
except Exception as e:
    print(f"Failed to connect to Qdrant: {e}")
    print("Please check your QDRANT_URL and QDRANT_API_KEY")

Successfully connected to Qdrant!
Existing collections: ['hotels_rag', 'hotels_structured_rag']
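
Hard-coded keys are easy to leak when a notebook is shared. As a minimal sketch, assuming the credentials are exported as environment variables named `QDRANT_URL` and `QDRANT_API_KEY` (both names and the `load_qdrant_settings` helper are illustrative and not part of the original notebook), the placeholders could be read like this:

```python
import os
from typing import Mapping, Optional, Tuple


def load_qdrant_settings(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Read Qdrant credentials from environment variables (hypothetical helper)."""
    # Fall back to the real process environment when no mapping is supplied.
    env = os.environ if env is None else env
    missing = [name for name in ("QDRANT_URL", "QDRANT_API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["QDRANT_URL"], env["QDRANT_API_KEY"]
```

With this in place, `QDRANT_URL, QDRANT_API_KEY = load_qdrant_settings()` would replace the two placeholder assignments. Since `python-dotenv` is already installed, calling `load_dotenv()` first is one way to populate the environment from a local `.env` file.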


## 5. Setup OpenAI Embeddings

Configure the OpenAI client and define a helper function that generates embeddings with the `text-embedding-3-small` model.

In [None]:
# Initialize OpenAI client

OPENAI_API_KEY = "YOUR_OPENAI_API_KEY"  # Replace this with your actual API key
openai_client = OpenAI(api_key=OPENAI_API_KEY)

def get_embedding(text: str, model: str = "text-embedding-3-small") -> List[float]:
    """
    Generate embeddings for given text using OpenAI's embedding model.
    
    Args:
        text (str): The text to embed
        model (str): The embedding model to use
    
    Returns:
        List[float]: The embedding vector, or None if the API call fails
    """
    try:
        text = text.replace("\n", " ")
        response = openai_client.embeddings.create(input=[text], model=model)
        return response.data[0].embedding
    except Exception as e:
        print(f"Error generating embedding: {e}")
        return None

# Test the embedding function
test_text = "A beautiful hotel with ocean views and excellent service."
test_embedding = get_embedding(test_text)

if test_embedding:
    print(f"Embedding generated successfully!")
    print(f"Embedding dimension: {len(test_embedding)}")
    print(f"First 5 values: {test_embedding[:5]}")
else:
    print("Failed to generate embedding. Please check your OpenAI API key.")

Embedding generated successfully!
Embedding dimension: 1536
First 5 values: [-0.0390862375497818, 0.02253010682761669, -0.0018335554050281644, -0.005651958752423525, -0.013013940304517746]


## 6. Create Embeddings for Hotel Descriptions

Generate embeddings for all hotel descriptions in our dataset. This takes a while because the loop below makes one API call per description.

In [None]:
def create_hotel_embeddings(hotels_data: List[Dict]) -> List[Dict]:
    """
    Create embeddings for all hotel descriptions.
    
    Args:
        hotels_data (List[Dict]): List of hotel dictionaries
    
    Returns:
        List[Dict]: Hotels with embeddings added
    """
    hotels_with_embeddings = []
    
    print(f"Processing {len(hotels_data)} hotels...")
    
    for i, hotel in enumerate(hotels_data):
        print(f"Processing hotel {i+1}/{len(hotels_data)}: {hotel.get('structured_description', {}).get('name', 'Unknown')}")
        
        # Get embedding for the description
        embedding = get_embedding(hotel['description'])
        
        if embedding:
            hotel_with_embedding = hotel.copy()
            hotel_with_embedding['embedding'] = embedding
            hotels_with_embeddings.append(hotel_with_embedding)
        else:
            print(f"Failed to get embedding for hotel {i+1}")
        
        # Add a small delay to avoid rate limiting
        time.sleep(0.1)
    
    print(f"Successfully created embeddings for {len(hotels_with_embeddings)} hotels")
    return hotels_with_embeddings

# Generate embeddings for all hotels
hotels_with_embeddings = create_hotel_embeddings(hotels_data)

# Display statistics
if hotels_with_embeddings:
    print(f"\nEmbedding Statistics:")
    print(f"Total hotels with embeddings: {len(hotels_with_embeddings)}")
    print(f"Embedding dimension: {len(hotels_with_embeddings[0]['embedding'])}")
    
    # Save embeddings for future use
    with open('hotels_with_embeddings.json', 'w', encoding='utf-8') as f:
        json.dump(hotels_with_embeddings, f, indent=2, ensure_ascii=False)
    print("Embeddings saved to 'hotels_with_embeddings.json'")

Processing 144 hotels...
Processing hotel 1/144: The Stockholm Lale
Processing hotel 2/144: The Bosphorus Gem
Processing hotel 3/144: Kyoto Gardens Inn
Processing hotel 4/144: The Parisian Adventure
Processing hotel 5/144: The Florence
Processing hotel 6/144: Java Lounge Hotel
Processing hotel 7/144: Riad Quebec
Processing hotel 8/144: Athens Beach Club
Processing hotel 9/144: Mumbai Palace
Processing hotel 10/144: Valhalla Landing
Processing hotel 11/144: Pyramid of the Sun
Processing hotel 12/144: Casa de Samba
Processing hotel 13/144: The Georgia Peach Inn
Processing hotel 14/144: The Amazon
Processing hotel 15/144: Siam Wellness Retreat
Processing hotel 16/144: The Brussels Bond
Processing hotel 17/144: Savannah Lodge
Processing hotel 18/144: The Continental
Processing hotel 19/144: The Winter Palace
Processing hotel 20/144: The Hermitage
Processing hotel 21/144: Oktoberfest Grand Hotel
Processing hotel 22/144: Hollywood Hotel
Processing hotel 23/144: Havana Nights
Processing hotel
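
The per-description loop above issues 144 separate requests. The embeddings endpoint accepts a list of inputs (the helper already passes `input=[text]`), so requests can be grouped. The sketch below is one possible batching approach, not the notebook's method: `chunked`, `embed_in_batches`, and the injected `embed_batch` callable are hypothetical names, and `embed_batch` is assumed to return one vector per input in the same order.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int = 64,
) -> List[List[float]]:
    """Embed `texts` batch by batch, preserving input order.

    `embed_batch` is an assumed callable that maps a list of strings to a
    list of vectors in the same order.
    """
    vectors: List[List[float]] = []
    for batch in chunked(texts, batch_size):
        batch_vectors = embed_batch(batch)
        # Guard against a backend that silently drops or adds items.
        if len(batch_vectors) != len(batch):
            raise ValueError("embed_batch returned a different number of vectors")
        vectors.extend(batch_vectors)
    return vectors
```

With a batch size of 64, the 144 descriptions would need three requests instead of 144. An `embed_batch` built on the client from section 5 could wrap `openai_client.embeddings.create(input=list(batch), model="text-embedding-3-small")` and collect `item.embedding` from `response.data`; checking each item's `index` field is a cheap way to confirm the ordering assumption.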

## 7. Store Embeddings in Qdrant

Create a collection in Qdrant and store our hotel embeddings along with metadata for efficient retrieval.

In [None]:
def setup_qdrant_collection(client: QdrantClient, collection_name: str, embedding_dim: int):
    """
    Create or recreate a Qdrant collection for storing hotel embeddings.
    
    Args:
        client: Qdrant client instance
        collection_name: Name of the collection
        embedding_dim: Dimension of the embeddings
    """
    try:
        # Delete collection if it exists
        try:
            client.delete_collection(collection_name=collection_name)
            print(f"Deleted existing collection '{collection_name}'")
        except Exception:  # a bare except would also swallow KeyboardInterrupt
            print(f"Collection '{collection_name}' doesn't exist, creating new one")
        
        # Create new collection
        client.create_collection(
            collection_name=collection_name,
            vectors_config=VectorParams(size=embedding_dim, distance=Distance.COSINE),
        )
        print(f"Created collection '{collection_name}' with dimension {embedding_dim}")
        
    except Exception as e:
        print(f"Error setting up collection: {e}")

def store_hotels_in_qdrant(client: QdrantClient, hotels_with_embeddings: List[Dict], collection_name: str):
    """
    Store hotel embeddings in Qdrant collection with only basic information.
    
    Args:
        client: Qdrant client instance
        hotels_with_embeddings: List of hotels with embeddings
        collection_name: Name of the collection
    """
    points = []
    
    for i, hotel in enumerate(hotels_with_embeddings):
        # Only store the description - no structured metadata
        metadata = {
            "hotel_id": i,
            "description": hotel['description']
        }
        
        point = PointStruct(
            id=i,
            vector=hotel['embedding'],
            payload=metadata
        )
        points.append(point)
    
    # Upload points to Qdrant
    try:
        client.upsert(
            collection_name=collection_name,
            points=points
        )
        print(f"Successfully stored {len(points)} hotels in Qdrant collection '{collection_name}'")
        
        # Verify the upload
        collection_info = client.get_collection(collection_name=collection_name)
        print(f"Collection info: {collection_info.points_count} points stored")
        
    except Exception as e:
        print(f"Error storing hotels in Qdrant: {e}")

# Setup and populate Qdrant collection
COLLECTION_NAME = "hotels_rag"

if hotels_with_embeddings:
    embedding_dim = len(hotels_with_embeddings[0]['embedding'])
    
    # Setup collection
    setup_qdrant_collection(client, COLLECTION_NAME, embedding_dim)
    
    # Store embeddings
    store_hotels_in_qdrant(client, hotels_with_embeddings, COLLECTION_NAME)
else:
    print("No hotels with embeddings to store. Please run the embedding generation first.")

🗑️ Deleted existing collection 'hotels_rag'
✅ Created collection 'hotels_rag' with dimension 1536
✅ Successfully stored 144 hotels in Qdrant collection 'hotels_rag'
Collection info: 144 points stored


## 8. Hotel Search Utilities

Helper functions to map between hotel IDs, names, and data structures for result processing and evaluation.

In [None]:
# Utility: Find Hotel by Name
def find_hotel_by_name(hotel_name: str, hotels_data: List[Dict]) -> Dict:
    """
    Find a hotel by name in the dataset.
    
    Args:
        hotel_name: Name of the hotel to find
        hotels_data: List of all hotels
    
    Returns:
        Hotel dictionary if found, None otherwise
    """
    for hotel in hotels_data:
        structured_desc = hotel.get('structured_description', {})
        if structured_desc.get('name', '').lower() == hotel_name.lower():
            return hotel
    return None

def get_hotel_name_by_id(point_id: int, hotels_data: List[Dict]) -> str:
    """
    Get hotel name by point ID from hotels_data.
    
    Args:
        point_id: The Qdrant point ID
        hotels_data: List of all hotels
    
    Returns:
        Hotel name if found, 'Unknown' otherwise
    """
    try:
        if 0 <= point_id < len(hotels_data):  # reject negative IDs, which would index from the end
            hotel = hotels_data[point_id]
            structured_desc = hotel.get('structured_description', {})
            return structured_desc.get('name', 'Unknown')
        return 'Unknown'
    except (IndexError, KeyError):
        return 'Unknown'

print("Hotel utility functions defined")

✅ Hotel utility functions defined
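
One caveat about `get_hotel_name_by_id`: it indexes into `hotels_data`, while `store_hotels_in_qdrant` assigns point IDs by position in `hotels_with_embeddings`. The two agree only when every embedding call succeeds, as it did in the run above (144 of 144). The sketch below shows one way to keep IDs aligned when some calls fail; `embed_with_source_index` and the `source_index` key are hypothetical names, not part of the notebook.

```python
from typing import Callable, Dict, List, Optional


def embed_with_source_index(
    hotels: List[Dict],
    embed: Callable[[str], Optional[List[float]]],
) -> List[Dict]:
    """Return one record per successfully embedded hotel, tagged with its dataset index."""
    records = []
    for index, hotel in enumerate(hotels):
        vector = embed(hotel["description"])
        if vector is None:
            continue  # a skipped hotel leaves a gap instead of shifting later IDs
        # Hotel fields first, so the two added keys always win on a name clash.
        records.append({**hotel, "source_index": index, "embedding": vector})
    return records
```

Using `record["source_index"]` as the point ID when building each `PointStruct` would keep the lookup in `get_hotel_name_by_id` valid even if the stored collection has fewer points than the dataset.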


## Part 2: Traditional RAG Testing & Evaluation

Now that we have implemented the core Traditional RAG system, let's test it systematically with our query dataset and evaluate its performance.

The testing workflow has three steps:
1. Run the RAG search with `top_k=5`.
2. Map the returned point IDs to hotel names.
3. Save the results to `traditional_rag_results.json`.

## Traditional RAG Search Function

Core RAG implementation that performs semantic search using query embeddings and vector similarity.

In [154]:
def rag_search(query: str, client: QdrantClient, collection_name: str, top_k: int = 5) -> List[Dict]:
    """
    Perform RAG search to find similar hotels based on query.
    
    Args:
        query (str): User's search query
        client: Qdrant client instance
        collection_name (str): Name of the collection to search
        top_k (int): Number of results to return
    
    Returns:
        List[Dict]: List of similar hotels with similarity scores
    """
    try:
        # Generate embedding for the query
        query_embedding = get_embedding(query)
        
        if not query_embedding:
            print("Failed to generate embedding for query")
            return []
        
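        # Search in Qdrant. Note: recent qdrant-client releases deprecate
        # `search` in favor of `query_points`, so this call may emit a warning.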
        search_results = client.search(
            collection_name=collection_name,
            query_vector=query_embedding,
            limit=top_k,
            with_payload=True
        )
        
        # Format results - only using description
        results = []
        for result in search_results:
            hotel_info = {
                'id': result.id,
                'score': result.score,
                'description': result.payload.get('description', '')
            }
            results.append(hotel_info)
        
        return results
        
    except Exception as e:
        print(f"Error during RAG search: {e}")
        return []

def display_search_results(query: str, results: List[Dict]):
    """
    Display search results in a formatted way - simplified for description only.
    
    Args:
        query (str): The original search query
        results (List[Dict]): Search results from RAG
    """
    print(f"Search Query: '{query}'")
    print(f"Found {len(results)} results:\n")
    
    for i, hotel in enumerate(results, 1):
        print(f"{i}. Hotel #{hotel['id']} (Similarity Score: {hotel['score']:.3f})")
        print(f"   Description: {hotel['description']}")
        print("-" * 80)
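
Because the collection was created with `Distance.COSINE`, the `score` on each result is the cosine similarity between the query embedding and the stored description embedding. A minimal pure-Python sketch of that computation, for illustration only (Qdrant computes it server-side; `cosine_similarity` here is a hypothetical helper):

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

The value depends only on direction, not magnitude, so `[1, 2]` and `[2, 4]` score 1.0 while orthogonal vectors score 0.0.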

## Traditional RAG Wrapper

Enhanced search function that formats results with hotel names and scores for evaluation.

In [155]:
# Step 2: Traditional RAG Search Function
def run_traditional_rag_search(query: str, client, collection_name: str, top_k: int = 5) -> List[Dict]:
    """
    Run traditional RAG search and return results with hotel names and scores.
    
    Args:
        query: Search query
        client: Qdrant client
        collection_name: Collection name
        top_k: Number of results to return
    
    Returns:
        List of dictionaries with name and score
    """
    try:
        # Use the existing rag_search function that works
        rag_results = rag_search(query, client, collection_name, top_k)
        
        # Format results to include hotel names from hotels_data
        formatted_results = []
        for result in rag_results:
            point_id = result['id']
            hotel_name = get_hotel_name_by_id(point_id, hotels_data) if 'hotels_data' in globals() else 'Unknown'
            formatted_results.append({
                "name": hotel_name,
                "score": round(result['score'], 3)
            })
        
        return formatted_results
        
    except Exception as e:
        print(f"Error in traditional RAG search: {e}")
        return []

print("Traditional RAG search function defined")

Traditional RAG search function defined


## Traditional RAG Pipeline

Complete pipeline that runs Traditional RAG on all test queries and saves results for evaluation.

In [156]:
# Step 3: Run Traditional RAG on All Queries and Save Results
def run_traditional_rag_pipeline(queries_data: List[Dict], save_results: bool = True) -> List[Dict]:
    """
    Run traditional RAG on all queries and save results in specified format.
    
    Args:
        queries_data: List of query dictionaries from queries.json
        save_results: Whether to save results to JSON file
    
    Returns:
        List of results in specified format
    """
    print("Starting Traditional RAG Pipeline...")
    print(f"Processing {len(queries_data)} queries with top_k=5")
    print("=" * 60)
    
    results = []
    
    for i, query_item in enumerate(queries_data):
        query_text = query_item['query']
        query_type = query_item['type']
        
        print(f"\nQuery {i+1}/{len(queries_data)} [{query_type}]")
        print(f"Query: {query_text}")
        
        # Run traditional RAG search
        start_time = time.time()
        retrieved_results = run_traditional_rag_search(
            query=query_text,
            client=client,
            collection_name=COLLECTION_NAME,
            top_k=5
        )
        end_time = time.time()
        
        # Format result according to specified structure
        result_entry = {
            "query": query_text,
            "query_type": query_type,
            "expected_results": query_item.get('expected_results', []),
            "retrieved_results": retrieved_results,
            "response_time": round(end_time - start_time, 3)
        }
        
        results.append(result_entry)
        
        # Display results
        print(f"Retrieved {len(retrieved_results)} results:")
        for j, result in enumerate(retrieved_results, 1):
            print(f"  {j}. {result['name']} (score: {result['score']})")
        
        print(f"Response time: {end_time - start_time:.3f}s")
        
        # Progress update
        if (i + 1) % 10 == 0:
            print(f"\nProcessed {i+1}/{len(queries_data)} queries")
    
    # Save results to JSON file
    if save_results:
        filename = 'traditional_rag_results.json'
        with open(filename, 'w', encoding='utf-8') as f:
            json.dump(results, f, indent=2, ensure_ascii=False)
        print(f"\nResults saved to {filename}")
        print(f"Total queries processed: {len(results)}")
    
    return results

# Check if we have the required data and run the pipeline
if 'queries_data' in locals() and 'client' in locals() and 'COLLECTION_NAME' in locals():
    print("All required data available. Running Traditional RAG Pipeline...")
    
    # Run on entire dataset
    test_queries = queries_data
    print(f"Testing on all {len(test_queries)} queries from queries.json")
    print(f"Processing entire dataset")
    
    traditional_rag_results = run_traditional_rag_pipeline(test_queries)
    print(f"\nTraditional RAG Pipeline completed!")
else:
    print("Missing required data. Please ensure queries_data, client, and COLLECTION_NAME are loaded.")

All required data available. Running Traditional RAG Pipeline...
Testing on all 89 queries from queries.json
Processing entire dataset
Starting Traditional RAG Pipeline...
Processing 89 queries with top_k=5

Query 1/89 [normal]
Query: I need a casino hotel in Asia with gambling and nightlife.
Retrieved 5 results:
  1. The Shamrock (score: 0.436)
  2. The Palace of Versailles (score: 0.425)
  3. Pharaoh's Casino (score: 0.416)
  4. The Parthenon (score: 0.402)
  5. The Colosseum (score: 0.378)
Response time: 0.599s

Query 2/89 [normal]
Query: Find me a luxury beach resort in Europe with spa services.
Retrieved 5 results:
  1. Siam Wellness Retreat (score: 0.424)
  2. Roman Holiday (score: 0.414)
  3. Belmond Hotel Cipriani (score: 0.412)
  4. Spartan Training Center (score: 0.411)
  5. The Wild West (score: 0.392)
Response time: 0.481s

Query 3/89 [normal]
Query: I'm looking for a mountain lodge in North America for skiing and hiking.
Retrieved 5 results:
  1. The Alps (score: 0.434)
  2. The Outback (score: 0.422)
  3. Fairmont Banff Springs (score: 0.406)
  4. Kyoto Gardens Inn (score: 0.385)
  5. Alpine Horn (score: 0.383)
Response time: 0.200s

Query 4/89 [normal]
Query: Where can I find a business hotel in Asia with conference facilities?
Retrieved 5 results:
  1. Siam 

## Results Management

Functions to load, display, and analyze Traditional RAG results for evaluation.

In [157]:
# Step 4: Load and Display Saved Results
def load_traditional_rag_results(filename: str = 'traditional_rag_results.json') -> List[Dict]:
    """
    Load previously saved traditional RAG results.
    
    Args:
        filename: Name of the JSON file to load
    
    Returns:
        List of results or None if file doesn't exist
    """
    try:
        with open(filename, 'r', encoding='utf-8') as f:
            results = json.load(f)
        
        print(f"Loaded {len(results)} results from {filename}")
        return results
        
    except FileNotFoundError:
        print(f"File {filename} not found")
        return None
    except Exception as e:
        print(f"Error loading {filename}: {e}")
        return None

def display_results_summary(results: List[Dict]):
    """Display a summary of the traditional RAG results."""
    if not results:
        print("No results to display")
        return
    
    print(f"Traditional RAG Results Summary")
    print("=" * 50)
    print(f"Total queries processed: {len(results)}")
    
    # Display first few results as examples
    print(f"\nSample Results:")
    for i, result in enumerate(results[:3]):
        print(f"\n{i+1}. Query: \"{result['query']}\"")
        print(f"   Top Results:")
        for j, retrieved in enumerate(result['retrieved_results'][:3], 1):
            print(f"     {j}. {retrieved['name']} (score: {retrieved['score']})")
    
    if len(results) > 3:
        print(f"\n... and {len(results) - 3} more results")

# Try to load existing results
print("Checking for existing traditional RAG results...")
existing_results = load_traditional_rag_results()

if existing_results:
    print("\nDisplaying existing results:")
    display_results_summary(existing_results)
else:
    print("No existing results found. Run the pipeline above to generate results.")

Checking for existing traditional RAG results...
Loaded 89 results from traditional_rag_results.json

Displaying existing results:
Traditional RAG Results Summary
Total queries processed: 89

Sample Results:

1. Query: "I need a casino hotel in Asia with gambling and nightlife."
   Top Results:
     1. The Shamrock (score: 0.436)
     2. The Palace of Versailles (score: 0.425)
     3. Pharaoh's Casino (score: 0.416)

2. Query: "Find me a luxury beach resort in Europe with spa services."
   Top Results:
     1. Siam Wellness Retreat (score: 0.424)
     2. Roman Holiday (score: 0.414)
     3. Belmond Hotel Cipriani (score: 0.412)

3. Query: "I'm looking for a mountain lodge in North America for skiing and hiking."
   Top Results:
     1. The Alps (score: 0.434)
     2. The Outback (score: 0.422)
     3. Fairmont Banff Springs (score: 0.406)

... and 86 more results
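
Each saved entry holds both `expected_results` and `retrieved_results`, which is enough to score retrieval quality. The following sketch computes precision and recall at k for a single entry. It assumes case-insensitive exact name matching, and `precision_recall_at_k` is a hypothetical helper rather than the evaluation code used elsewhere in this notebook.

```python
from typing import Dict


def precision_recall_at_k(result_entry: Dict, k: int = 5) -> Dict[str, float]:
    """Score one pipeline result entry against its expected hotel names."""
    expected = {name.lower() for name in result_entry.get("expected_results", [])}
    retrieved = [r["name"].lower() for r in result_entry.get("retrieved_results", [])[:k]]
    # Count each expected hotel at most once, even if it were retrieved twice.
    hits = sum(1 for name in set(retrieved) if name in expected)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(expected) if expected else 0.0
    return {"precision": precision, "recall": recall}
```

For the first query above, three of the five retrieved hotels (The Shamrock, The Palace of Versailles, The Colosseum) appear among the four expected ones, giving a precision of 0.6 and a recall of 0.75.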


## Part 3: Structured RAG Implementation

Now we'll implement an enhanced RAG approach that leverages structured metadata for better search precision and filtering capabilities.

### Query Parsing & Serialization

Functions to parse natural language queries into structured format and serialize structured data for embeddings.

In [158]:
def parse_query_to_structured(query: str) -> Dict:
    """
    Parse a natural language query into structured format using GPT with structured output.
    Enhanced with better error handling and fallback mechanisms.
    
    Args:
        query (str): Natural language hotel search query
    
    Returns:
        Dict: Structured query in the same format as hotel metadata
    """
    
    # Define the JSON schema for structured output
    schema = {
        "type": "object",
        "properties": {
            "region": {
                "type": ["string", "null"],
                "enum": ["Europe", "Asia", "North America", "South America", "Africa", "Middle East", "Oceania", "Caribbean", None]
            },
            "country": {
                "type": ["string", "null"],
            },
            "type": {
                "type": ["string", "null"],
                "enum": ["Beach", "Mountain", "City", "Countryside", "Desert", "Entertainment", "Exotic", "Airport", "Coastal", None]
            },
            "atmosphere": {
                "type": ["string", "null"],
                "enum": ["Luxurious", "Quiet", "Adventure", "Family", "Historic", "Charming", "Vibrant", "Minimalist", "Business", "Intense", None]
            },
            "activities": {
                "type": "array",
                "items": {
                    "type": "string",
                    "enum": ["Beach", "Skiing", "Hiking", "Cultural", "Nightlife", "Gambling", "Wellness", "Entertainment", "Cuisine", "Fitness", "Weddings", "Business", "Dancing"]
                }
            },
            "services": {
                "type": "array",
                "items": {
                    "type": "string"
                }
            },
            "best_months": {
                "type": "array",
                "items": {
                    "type": "string",
                    "enum": ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December", "All"]
                }
            }
        },
        "required": ["region", "country", "type", "atmosphere", "activities", "services", "best_months"],
        "additionalProperties": False
    }
    
    try:
        # Make API call with structured output
        response = openai_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {
                    "role": "system", 
                    "content": """You are an expert at parsing hotel search queries into structured data. Parse the user's query into the required JSON format. 

Guidelines:
- Use null for fields that are not mentioned or unclear
- For activities and services, return arrays (can be empty)
- For best_months, infer from context or use ['All'] if unclear
- Map similar concepts to the closest enum value
- Be conservative - only fill fields when you're confident

Examples:
- "beach hotel in Europe" -> region: "Europe", type: "Beach"
- "luxury spa in Asia" -> region: "Asia", atmosphere: "Luxurious", services: ["spa"]
- "mountain lodge for skiing" -> type: "Mountain", activities: ["Skiing"]"""
                },
                {
                    "role": "user", 
                    "content": f"Parse this hotel search query: '{query}'"
                }
            ],
            response_format={
                "type": "json_schema",
                "json_schema": {
                    "name": "hotel_query_structure",
                    "schema": schema,
                    "strict": True
                }
            },
            temperature=0
        )
        
        # Parse the JSON response - this should always be valid JSON now
        structured_query = json.loads(response.choices[0].message.content)
        
        # Validate that we got a reasonable result
        if isinstance(structured_query, dict):
            return structured_query
        else:
            print(f"Unexpected response format: {type(structured_query)}")
            return None
        
    except json.JSONDecodeError as e:
        print(f"JSON parsing error: {e}")
        print(f"Raw response: {response.choices[0].message.content if 'response' in locals() else 'No response'}")
        return None
        
    except Exception as e:
        error_msg = str(e).lower()
        
        # Check for specific error types
        if any(keyword in error_msg for keyword in ['rate limit', 'quota', 'billing']):
            print(f"API quota/billing issue: {e}")
        elif any(keyword in error_msg for keyword in ['timeout', 'connection', 'network']):
            print(f"Network/connection issue: {e}")
        elif any(keyword in error_msg for keyword in ['token', 'length', 'too long']):
            print(f"Token limit issue: {e}")
        else:
            print(f"Unexpected parsing error: {e}")
        
        # Return None to trigger retry mechanism
        return None

def serialize_structured_data(structured_data: Dict) -> str:
    """
    Convert structured data into a labeled string for embedding.
    
    Args:
        structured_data (Dict): Structured hotel data or query
    
    Returns:
        str: Serialized string with labels
    """
    parts = []
    
    if structured_data.get('region'):
        parts.append(f"Region: {structured_data['region']}")
    
    if structured_data.get('country'):
        parts.append(f"Country: {structured_data['country']}")
    
    if structured_data.get('type'):
        parts.append(f"Type: {structured_data['type']}")
    
    if structured_data.get('atmosphere'):
        parts.append(f"Atmosphere: {structured_data['atmosphere']}")
    
    if structured_data.get('activities'):
        activities = ', '.join(structured_data['activities'])
        parts.append(f"Activities: {activities}")
    
    if structured_data.get('services'):
        services = ', '.join(structured_data['services'])
        parts.append(f"Services: {services}")
    
    if structured_data.get('best_months'):
        months = ', '.join(structured_data['best_months'])
        parts.append(f"Best Months: {months}")
    
    return '. '.join(parts)

# Test the enhanced query parsing
test_query = "I want to stay in a tropical resort in Oceania with overwater bungalows."
print(f"Testing enhanced query parsing...")
print(f"Test Query: '{test_query}'")

parsed_query = parse_query_to_structured(test_query)
if parsed_query:
    print(f"Parsing successful!")
    print(f"Parsed Structure:")
    print(json.dumps(parsed_query, indent=2))
    
    serialized = serialize_structured_data(parsed_query)
    print(f"\nSerialized String:")
    print(f"'{serialized}'")
    
    # Test embedding the serialized structure
    structure_embedding = get_embedding(serialized)
    if structure_embedding:
        print(f"\n Structure embedding generated successfully!")
        print(f"Embedding dimension: {len(structure_embedding)}")
    else:
        print(f"\n Failed to generate structure embedding")
else:
    print(f" Parsing failed - this will trigger retry logic in structured_rag_search")

Testing enhanced query parsing...
Test Query: 'I want to stay in a tropical resort in Oceania with overwater bungalows.'
Parsing successful!
Parsed Structure:
{
  "region": "Oceania",
  "country": null,
  "type": "Exotic",
  "atmosphere": null,
  "activities": [],
  "services": [],
  "best_months": [
    "All"
  ]
}

Serialized String:
'Region: Oceania. Type: Exotic. Best Months: All'

 Structure embedding generated successfully!
Embedding dimension: 1536



### Create Structured RAG Collection

Now we'll create a new Qdrant collection that stores both the regular description embeddings and structured metadata embeddings, along with filterable metadata fields.
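
As a rough illustration of what each stored record looks like, the sketch below builds a plain-dict stand-in for one point: two named vectors plus a payload of filterable fields. The helper `build_point` and the toy 3-dimensional vectors are assumptions made for this example only; the notebook itself stores `PointStruct` objects with 1536-dimensional embeddings.

```python
from typing import Any, Dict, List


def build_point(hotel_id: int, desc_vec: List[float], struct_vec: List[float],
                structured: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble a plain-dict stand-in for one stored point (hypothetical helper)."""
    # This notebook configures both named vectors with the same dimension.
    if len(desc_vec) != len(struct_vec):
        raise ValueError("description and structure vectors must have the same length")
    return {
        "id": hotel_id,
        # One entry per named vector in the collection
        "vector": {"description": desc_vec, "structure": struct_vec},
        # Flat copies of the structured fields, so they can be indexed and filtered
        "payload": {
            "region": structured.get("region"),
            "country": structured.get("country"),
            "type": structured.get("type"),
            "activities": structured.get("activities", []),
        },
    }


point = build_point(
    0,
    [0.1, 0.2, 0.3],  # toy vectors, not real embeddings
    [0.3, 0.2, 0.1],
    {"region": "Oceania", "type": "Exotic", "activities": ["snorkeling"]},
)
print(sorted(point["vector"]))
print(point["payload"])
```

Keeping the filterable fields at the top level of the payload is what lets a keyword index be created on each of them by field name.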

In [None]:
def setup_structured_qdrant_collection(client: QdrantClient, collection_name: str, embedding_dim: int):
    """
    Create a Qdrant collection for structured RAG with multiple vector types and indexed fields.
    
    Args:
        client: Qdrant client instance
        collection_name: Name of the collection
        embedding_dim: Dimension of the embeddings
    """
    try:
        # Delete collection if it exists
        try:
            client.delete_collection(collection_name=collection_name)
            print(f"Deleted existing collection '{collection_name}'")
        except Exception:
            print(f"Collection '{collection_name}' doesn't exist, creating new one")
        
        # Create new collection with multiple vectors
        vectors_config = {
            "description": VectorParams(size=embedding_dim, distance=Distance.COSINE),
            "structure": VectorParams(size=embedding_dim, distance=Distance.COSINE)
        }
        
        client.create_collection(
            collection_name=collection_name,
            vectors_config=vectors_config,
        )
        print(f" Created structured collection '{collection_name}' with dual vectors")
        
        # Create indexes for filterable fields
        indexable_fields = ["region", "country", "type", "atmosphere", "activities", "services", "best_months"]
        
        for field in indexable_fields:
            try:
                client.create_payload_index(
                    collection_name=collection_name,
                    field_name=field,
                    field_schema=PayloadSchemaType.KEYWORD
                )
                print(f" Created index for field: {field}")
            except Exception as e:
                print(f" Could not create index for {field}: {e}")
        
    except Exception as e:
        print(f" Error setting up structured collection: {e}")

def store_hotels_structured(client: QdrantClient, hotels_data: List[Dict], collection_name: str):
    """
    Store hotel data in structured Qdrant collection with both description and structure embeddings.
    
    Args:
        client: Qdrant client instance
        hotels_data: List of hotels with embeddings
        collection_name: Name of the collection
    """
    points = []
    
    print(f"Processing {len(hotels_data)} hotels for structured storage...")
    
    for i, hotel in enumerate(hotels_data):
        print(f"Processing hotel {i+1}/{len(hotels_data)}")
        
        # Embed the free-text description (recomputed here; a cached embedding is not reused)
        desc_embedding = get_embedding(hotel['description'])
        
        # Get structured data and create embedding
        structured_desc = hotel.get('structured_description', {})
        serialized_structure = serialize_structured_data(structured_desc)
        structure_embedding = get_embedding(serialized_structure)
        
        if desc_embedding and structure_embedding:
            # Prepare metadata for filtering
            metadata = {
                "hotel_id": i,
                "description": hotel['description'],
                "structured_description": structured_desc,
                "serialized_structure": serialized_structure,
                # Filterable fields
                "region": structured_desc.get('region'),
                "country": structured_desc.get('country'),
                "type": structured_desc.get('type'),
                "atmosphere": structured_desc.get('atmosphere'),
                "activities": structured_desc.get('activities', []),
                "services": structured_desc.get('services', []),
                "best_months": structured_desc.get('best_months', []),
                "name": structured_desc.get('name', f"Hotel_{i}")
            }
            
            # Create point with multiple named vectors
            vectors = {
                "description": desc_embedding,
                "structure": structure_embedding
            }
            
            point = PointStruct(
                id=i,
                vector=vectors,
                payload=metadata
            )
            points.append(point)
        else:
            print(f"Failed to get embeddings for hotel {i+1}")
        
        # Add delay to avoid rate limiting
        time.sleep(0.1)
    
    # Upload points to Qdrant
    try:
        client.upsert(
            collection_name=collection_name,
            points=points
        )
        print(f"Successfully stored {len(points)} hotels in structured collection '{collection_name}'")
        
        # Verify the upload
        collection_info = client.get_collection(collection_name=collection_name)
        print(f"Collection info: {collection_info.points_count} points stored")
        
    except Exception as e:
        print(f"Error storing hotels in structured collection: {e}")

# Setup and populate structured collection
STRUCTURED_COLLECTION_NAME = "hotels_structured_rag"

if hotels_with_embeddings:
    embedding_dim = len(hotels_with_embeddings[0]['embedding'])
    
    # Setup structured collection
    setup_structured_qdrant_collection(client, STRUCTURED_COLLECTION_NAME, embedding_dim)
    
    # Store hotels with structured data
    store_hotels_structured(client, hotels_data, STRUCTURED_COLLECTION_NAME)
else:
    print("No hotels with embeddings available. Please run the embedding generation first.")

### Implement Structured RAG Search

Now we'll create the structured search function that combines metadata filtering with vector similarity search on both description and structure embeddings.
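
The two ideas in this step, filtering and score fusion, can be sketched in plain Python without a Qdrant server. The helpers `passes_filter` and `fuse_scores` are hypothetical names written for this illustration. The filter logic reflects Qdrant's documented semantics as far as I understand them: every `must` condition has to hold, and when `should` conditions are present at least one of them has to hold. The 0.6/0.4 weights are the ones used in the search function in this notebook.

```python
from typing import Any, Dict, List, Set, Tuple


def passes_filter(payload: Dict[str, Any], must: Dict[str, Any],
                  should: Dict[str, Set[Any]]) -> bool:
    """Mimic a must/should filter: all `must` fields match AND any `should` field matches."""
    for field, required in must.items():
        if payload.get(field) != required:
            return False
    if not should:
        return True  # no optional conditions, so nothing more to check
    for field, accepted in should.items():
        stored = payload.get(field)
        values = stored if isinstance(stored, list) else [stored]
        if any(v in accepted for v in values):
            return True
    return False


def fuse_scores(desc_hits: List[Tuple[int, float]], struct_hits: List[Tuple[int, float]],
                w_desc: float = 0.6, w_struct: float = 0.4,
                top_k: int = 5) -> List[Tuple[int, float]]:
    """Weighted sum of the two similarity scores; a hotel missing from one list gets 0 there."""
    combined: Dict[int, float] = {}
    for hotel_id, score in desc_hits:
        combined[hotel_id] = combined.get(hotel_id, 0.0) + w_desc * score
    for hotel_id, score in struct_hits:
        combined[hotel_id] = combined.get(hotel_id, 0.0) + w_struct * score
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)[:top_k]


# A "City" hotel still passes a query that asked for type "Exotic",
# because its atmosphere satisfies one of the `should` conditions.
city_hotel = {"region": "Asia", "type": "City", "atmosphere": "Luxurious"}
should = {"type": {"Exotic"}, "atmosphere": {"Luxurious"}}
print(passes_filter(city_hotel, {"region": "Asia"}, should))

# Hotel 1 appears in both result lists, hotel 2 only in the description list.
ranked = fuse_scores([(1, 0.5), (2, 0.8)], [(1, 1.0)])
print([hotel_id for hotel_id, _ in ranked])
```

One consequence of the weighted sum is that a hotel returned by only one of the two searches is capped at that search's weight, so hotels found by both tend to rank higher.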

In [159]:
from qdrant_client.models import Filter, FieldCondition, MatchValue, MatchAny

def structured_rag_search(query: str, client: QdrantClient, collection_name: str, top_k: int = 5, use_filters: bool = True) -> List[Dict]:
    """
    Perform structured RAG search with metadata filtering and dual vector search.
    Falls back to Traditional RAG if structured parsing fails.
    
    Args:
        query (str): User's search query
        client: Qdrant client instance
        collection_name (str): Name of the structured collection
        top_k (int): Number of results to return
        use_filters (bool): Whether to apply metadata filters
    
    Returns:
        List[Dict]: List of hotels with similarity scores
    """
    
    def traditional_rag_fallback(query: str, client: QdrantClient, collection_name: str, top_k: int) -> List[Dict]:
        """
        Fallback to traditional RAG search using only description embeddings.
        """
        print(f"Falling back to Traditional RAG for query: '{query}'")
        
        try:
            # Generate embedding for the query
            desc_embedding = get_embedding(query)
            if not desc_embedding:
                print("Failed to generate embedding for fallback")
                return []
            
            # Perform simple vector search on description embeddings
            search_results = client.search(
                collection_name=collection_name,
                query_vector=("description", desc_embedding),
                limit=top_k,
                with_payload=True,
                score_threshold=0.0
            )
            
            # Format results to match structured RAG output format
            final_results = []
            for result in search_results:
                hotel_info = {
                    'id': result.id,
                    'combined_score': result.score,  # Use description score as combined score
                    'description_score': result.score,
                    'structure_score': 0.0,  # No structure score in fallback
                    'description': result.payload.get('description', ''),
                    'name': result.payload.get('name', ''),
                    'region': result.payload.get('region', ''),
                    'country': result.payload.get('country', ''),
                    'type': result.payload.get('type', ''),
                    'atmosphere': result.payload.get('atmosphere', ''),
                    'activities': result.payload.get('activities', []),
                    'structured_description': result.payload.get('structured_description', {})
                }
                final_results.append(hotel_info)
            
            print(f"Fallback found {len(final_results)} traditional RAG results")
            return final_results
            
        except Exception as e:
            print(f"Fallback Traditional RAG also failed: {e}")
            return []
    
    try:
        # Parse query into structured format
        print(f"Parsing query: '{query}'")
        structured_query = parse_query_to_structured(query)
        
        if not structured_query:
            print("Failed to parse query into structured format")
            print("Switching to Traditional RAG fallback...")
            return traditional_rag_fallback(query, client, collection_name, top_k)
        
        print(f"Parsed structure: {json.dumps(structured_query, indent=2)}")
        
        # Generate embeddings for both description and structure
        desc_embedding = get_embedding(query)
        serialized_query = serialize_structured_data(structured_query)
        structure_embedding = get_embedding(serialized_query)
        
        print(f"Serialized query: '{serialized_query}'")
        
        if not desc_embedding or not structure_embedding:
            print("Failed to generate query embeddings")
            print("Switching to Traditional RAG fallback...")
            return traditional_rag_fallback(query, client, collection_name, top_k)
        
        # Build the metadata filter. In Qdrant, `must` conditions are ANDed together and
        # `should` conditions are ORed: when any are given, a point has to match at least
        # one of them. `should` narrows the candidate set; it does not boost scores.
        filters = None
        if use_filters:
            must_conditions = []
            should_conditions = []
            
            # Required filter: the region must match exactly
            if structured_query.get('region'):
                must_conditions.append(
                    FieldCondition(key="region", match=MatchValue(value=structured_query['region']))
                )
            
            # Optional filters: a point must satisfy at least one of these when any are set
            if structured_query.get('country'):
                should_conditions.append(
                    FieldCondition(key="country", match=MatchValue(value=structured_query['country']))
                )
            
            if structured_query.get('type'):
                should_conditions.append(
                    FieldCondition(key="type", match=MatchValue(value=structured_query['type']))
                )
            
            if structured_query.get('atmosphere'):
                should_conditions.append(
                    FieldCondition(key="atmosphere", match=MatchValue(value=structured_query['atmosphere']))
                )
            
            if structured_query.get('activities'):
                should_conditions.append(
                    FieldCondition(key="activities", match=MatchAny(any=structured_query['activities']))
                )
            
            if structured_query.get('services'):
                should_conditions.append(
                    FieldCondition(key="services", match=MatchAny(any=structured_query['services']))
                )
            
            if structured_query.get('best_months') and structured_query['best_months'] != ['All']:
                should_conditions.append(
                    FieldCondition(key="best_months", match=MatchAny(any=structured_query['best_months']))
                )
            
            # Create filter with must and should conditions
            if must_conditions or should_conditions:
                filters = Filter(
                    must=must_conditions if must_conditions else None,
                    should=should_conditions if should_conditions else None
                )
                print(f"Applied {len(must_conditions)} required filters and {len(should_conditions)} preference filters")
            else:
                print("No specific filters applied")
        
        # Perform hybrid search (description + structure)
        try:
            # Search using description vector
            desc_results = client.search(
                collection_name=collection_name,
                query_vector=("description", desc_embedding),
                query_filter=filters,
                limit=top_k * 2,  # Get more results for merging
                with_payload=True,
                score_threshold=0.0  # Accept all results
            )
            
            # Search using structure vector
            structure_results = client.search(
                collection_name=collection_name,
                query_vector=("structure", structure_embedding),
                query_filter=filters,
                limit=top_k * 2,  # Get more results for merging
                with_payload=True,
                score_threshold=0.0  # Accept all results
            )
            
        except Exception as search_error:
            print(f"Vector search failed: {search_error}")
            print("Switching to Traditional RAG fallback...")
            return traditional_rag_fallback(query, client, collection_name, top_k)
        
        # Merge and rank results
        results_dict = {}
        
        # Add description results with weight
        for result in desc_results:
            hotel_id = result.id
            desc_score = result.score * 0.6  # 60% weight for description
            results_dict[hotel_id] = {
                'id': hotel_id,
                'description_score': result.score,
                'structure_score': 0.0,
                'combined_score': desc_score,
                'payload': result.payload
            }
        
        # Add structure results with weight and combine scores
        for result in structure_results:
            hotel_id = result.id
            struct_score = result.score * 0.4  # 40% weight for structure
            
            if hotel_id in results_dict:
                # Update existing result
                results_dict[hotel_id]['structure_score'] = result.score
                results_dict[hotel_id]['combined_score'] += struct_score
            else:
                # Add new result
                results_dict[hotel_id] = {
                    'id': hotel_id,
                    'description_score': 0.0,
                    'structure_score': result.score,
                    'combined_score': struct_score,
                    'payload': result.payload
                }
        
        # Sort by combined score and take top_k
        sorted_results = sorted(results_dict.values(), key=lambda x: x['combined_score'], reverse=True)[:top_k]
        
        # Format final results
        final_results = []
        for result in sorted_results:
            hotel_info = {
                'id': result['id'],
                'combined_score': result['combined_score'],
                'description_score': result['description_score'],
                'structure_score': result['structure_score'],
                'description': result['payload'].get('description', ''),
                'name': result['payload'].get('name', ''),
                'region': result['payload'].get('region', ''),
                'country': result['payload'].get('country', ''),
                'type': result['payload'].get('type', ''),
                'atmosphere': result['payload'].get('atmosphere', ''),
                'activities': result['payload'].get('activities', []),
                'structured_description': result['payload'].get('structured_description', {})
            }
            final_results.append(hotel_info)
        
        print(f"Found {len(final_results)} structured results")
        return final_results
        
    except Exception as e:
        print(f"Error during structured RAG search: {e}")
        print("Switching to Traditional RAG fallback...")
        return traditional_rag_fallback(query, client, collection_name, top_k)

def display_structured_results(query: str, results: List[Dict]):
    """
    Display structured search results in a formatted way.
    
    Args:
        query (str): The original search query
        results (List[Dict]): Search results from structured RAG
    """
    print(f"Structured RAG Query: '{query}'")
    print(f"Found {len(results)} results:\n")
    
    for i, hotel in enumerate(results, 1):
        print(f"{i}. {hotel['name']} - Combined Score: {hotel['combined_score']:.3f}")
        print(f"   Description: {hotel['description_score']:.3f} | Structure: {hotel['structure_score']:.3f}")
        print(f"   {hotel['region']}, {hotel['country']} | {hotel['type']} | {hotel['atmosphere']}")
        print(f"   Activities: {', '.join(hotel['activities'][:3]) if hotel['activities'] else 'None'}")
        print()

# Structured RAG Testing

This section provides a clean workflow for testing the Structured RAG system:
1. Run structured RAG search with top_k=5
2. Extract hotel names and scores 
3. Save results to structured_rag_results.json with response time tracking
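
A minimal sketch of the timing and record-building part of this workflow is shown below. `timed_search` and `fake_search` are hypothetical stand-ins so the example runs without Qdrant or the OpenAI API, and `time.perf_counter()` is used here as one reasonable choice for measuring elapsed time (the pipeline cell in this notebook uses `time.time()`).

```python
import json
import time
from typing import Callable, Dict, List


def timed_search(query: str, search_fn: Callable[[str], List[Dict]]) -> Dict:
    """Run one search, keep only name and rounded score, and record the elapsed time."""
    start = time.perf_counter()
    hits = search_fn(query)
    elapsed = time.perf_counter() - start
    return {
        "query": query,
        "retrieved_results": [
            {"name": hit["name"], "combined_score": round(hit["score"], 3)}
            for hit in hits
        ],
        "response_time": round(elapsed, 3),
    }


def fake_search(query: str) -> List[Dict]:
    """Stand-in for the real search function; returns one made-up hit."""
    return [{"name": "Example Hotel", "score": 0.54219}]


record = timed_search("palace hotel in Asia", fake_search)
print(json.dumps(record, indent=2))
```

Rounding at the point of recording keeps the JSON file readable, at the cost of losing precision that later tie-breaking might need.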

In [160]:
def run_structured_rag_search(query: str, client: QdrantClient, collection_name: str, top_k: int = 5) -> List[Dict]:
    """
    Run structured RAG search and return results with hotel names and scores.
    
    Args:
        query (str): User's search query
        client: Qdrant client instance
        collection_name (str): Name of the structured collection
        top_k (int): Number of results to return
    
    Returns:
        List[Dict]: List of hotels with names and scores
    """
    try:
        # Use the existing structured_rag_search function
        structured_results = structured_rag_search(query, client, collection_name, top_k, use_filters=True)
        
        if not structured_results:
            print(f"No results found for query: '{query}'")
            return []
        
        # Format results with hotel names and scores
        formatted_results = []
        for result in structured_results:
            hotel_name = result.get('name', 'Unknown')
            combined_score = result.get('combined_score', 0.0)
            
            formatted_results.append({
                'name': hotel_name,
                'score': round(combined_score, 3),
                'description_score': round(result.get('description_score', 0.0), 3),
                'structure_score': round(result.get('structure_score', 0.0), 3),
                'id': result.get('id', -1),
                'region': result.get('region', ''),
                'country': result.get('country', ''),
                'type': result.get('type', ''),
                'atmosphere': result.get('atmosphere', '')
            })
        
        return formatted_results
        
    except Exception as e:
        print(f"Error during structured RAG search: {e}")
        return []

# Test the function with a sample query
test_query = "I want to stay in a palace hotel in Asia with luxury amenities."
print(f"Testing structured RAG search with: '{test_query}'")

test_results = run_structured_rag_search(test_query, client, STRUCTURED_COLLECTION_NAME, top_k=5)
if test_results:
    print(f"Found {len(test_results)} results:")
    for i, result in enumerate(test_results, 1):
        print(f"  {i}. {result['name']} - Combined Score: {result['score']}")
        print(f"     📍 {result['region']}, {result['country']} | {result['type']} | {result['atmosphere']}")
else:
    print("No results found in test")

Testing structured RAG search with: 'I want to stay in a palace hotel in Asia with luxury amenities.'
Parsing query: 'I want to stay in a palace hotel in Asia with luxury amenities.'
Parsed structure: {
  "region": "Asia",
  "country": null,
  "type": "Exotic",
  "atmosphere": "Luxurious",
  "activities": [],
  "services": [
    "luxury amenities"
  ],
  "best_months": [
    "All"
  ]
}
Serialized query: 'Region: Asia. Type: Exotic. Atmosphere: Luxurious. Services: luxury amenities. Best Months: All'
Applied 1 required filters and 3 preference filters
Found 4 structured results
Found 4 results:
  1. The Palace of Versailles - Combined Score: 0.55
     📍 Asia, Macau | City | Luxurious
  2. The Orient Express - Combined Score: 0.542
     📍 Asia, Singapore | Exotic | Luxurious
  3. The



In [161]:
# Utility functions for Structured RAG
def get_structured_hotel_name_by_id(hotel_id: int, hotels_data: List[Dict]) -> str:
    """
    Get hotel name by ID from the structured hotels data.
    
    Args:
        hotel_id (int): Hotel ID from search results
        hotels_data (List[Dict]): List of hotel data
    
    Returns:
        str: Hotel name or 'Unknown' if not found
    """
    try:
        if 0 <= hotel_id < len(hotels_data):
            hotel = hotels_data[hotel_id]
            # Try structured_description.name first
            if 'structured_description' in hotel and 'name' in hotel['structured_description']:
                return hotel['structured_description']['name']
            # Fall back to a placeholder name when no structured name is available
            return f"Hotel_{hotel_id}"
        return "Unknown"
    except Exception as e:
        print(f"Error getting hotel name for ID {hotel_id}: {e}")
        return "Unknown"

def find_structured_hotel_by_name(name: str, hotels_data: List[Dict]) -> Dict:
    """
    Find a hotel by name in the structured hotels data.
    
    Args:
        name (str): Hotel name to search for
        hotels_data (List[Dict]): List of hotel data
    
    Returns:
        Dict: Hotel information or None if not found
    """
    try:
        for hotel in hotels_data:
            if 'structured_description' in hotel and 'name' in hotel['structured_description']:
                if hotel['structured_description']['name'].lower() == name.lower():
                    return hotel
        return None
    except Exception as e:
        print(f"Error finding hotel by name '{name}': {e}")
        return None

print("Structured RAG utility functions loaded")
print("Functions available:")
print("  - get_structured_hotel_name_by_id()")
print("  - find_structured_hotel_by_name()")

Structured RAG utility functions loaded
Functions available:
  - get_structured_hotel_name_by_id()
  - find_structured_hotel_by_name()


In [None]:
def run_structured_rag_pipeline(queries_data: List[Dict], client: QdrantClient, collection_name: str, output_file: str = 'structured_rag_results.json', top_k: int = 5, test_last_n: int = 10):
    """
    Run the Structured RAG pipeline on the last `test_last_n` queries and save the results.
    
    Args:
        queries_data (List[Dict]): List of queries from queries.json
        client: Qdrant client instance
        collection_name (str): Name of the structured collection
        output_file (str): Output JSON file name
        top_k (int): Number of results per query
        test_last_n (int): Number of queries to test (last N queries)
    """
    print(f"Starting Structured RAG Pipeline")
    print(f"Testing last {test_last_n} queries (out of {len(queries_data)} total)")
    print(f"Top-K results per query: {top_k}")
    print(f"Results will be saved to: {output_file}")
    print("=" * 60)
    
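    # Use the last N queries for testing (these include the challenging queries).
    # Guard against N <= 0, because a slice starting at -0 would return the whole list.
    test_queries = queries_data[-test_last_n:] if test_last_n > 0 else []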
    
    all_results = []
    
    for i, query_item in enumerate(test_queries, 1):
        query = query_item['query']
        query_type = query_item.get('type', 'unknown')
        
        print(f"\nQuery {i}/{len(test_queries)} ({query_type}): '{query}'")
        
        # Track response time
        start_time = time.time()
        
        # Run structured RAG search
        try:
            search_results = run_structured_rag_search(query, client, collection_name, top_k)
            
            # Record end time
            end_time = time.time()
            
            if search_results:
                # Format results for JSON output
                result_entry = {
                    "query": query,
                    "query_type": query_type,
                    "expected_results": query_item.get('expected_results', []),
                    "retrieved_results": [
                        {
                            "name": result['name'],
                            "combined_score": result['score'],
                            "description_score": result['description_score'],
                            "structure_score": result['structure_score'],
                            "region": result['region'],
                            "country": result['country'],
                            "type": result['type'],
                            "atmosphere": result['atmosphere']
                        }
                        for result in search_results
                    ],
                    "response_time": round(end_time - start_time, 3)
                }
                
                all_results.append(result_entry)
                
                print(f"Found {len(search_results)} results (Response time: {result_entry['response_time']}s)")
                for j, result in enumerate(search_results, 1):
                    print(f"  {j}. {result['name']} - Combined: {result['score']}")
                    print(f"      {result['region']}, {result['country']}")
            else:
                # Record end time even for failed searches
                end_time = time.time()
                
                result_entry = {
                    "query": query,
                    "query_type": query_type,
                    "expected_results": query_item.get('expected_results', []),
                    "retrieved_results": [],
                    "response_time": round(end_time - start_time, 3)
                }
                all_results.append(result_entry)
                print(f"No results found (Response time: {result_entry['response_time']}s)")
                
        except Exception as e:
            end_time = time.time()
            print(f"Error processing query: {e}")
            
            result_entry = {
                "query": query,
                "query_type": query_type,
                "expected_results": query_item.get('expected_results', []),
                "retrieved_results": [],
                "response_time": round(end_time - start_time, 3),
                "error": str(e)
            }
            all_results.append(result_entry)
    
    # Save results to JSON file
    try:
        with open(output_file, 'w', encoding='utf-8') as f:
            json.dump(all_results, f, indent=2, ensure_ascii=False)
        
        print(f"\nPipeline completed successfully!")
        print(f"Processed {len(all_results)} queries")
        print(f"Results saved to: {output_file}")
        
        # Calculate summary statistics
        successful_queries = [r for r in all_results if r['retrieved_results']]
        failed_queries = [r for r in all_results if not r['retrieved_results']]
        # Avoid dividing by zero when no queries were processed
        avg_response_time = (sum(r['response_time'] for r in all_results) / len(all_results)) if all_results else 0.0
        
        print(f"\n Summary Statistics:")
        print(f"  Successful queries: {len(successful_queries)}/{len(all_results)}")
        print(f"  Failed queries: {len(failed_queries)}/{len(all_results)}")
        print(f"  Average response time: {avg_response_time:.3f}s")
        
    except Exception as e:
        print(f"Error saving results: {e}")

# Run the Structured RAG pipeline on the entire dataset
run_structured_rag_pipeline(structured_queries_data, client, STRUCTURED_COLLECTION_NAME, 'structured_rag_results.json', top_k=5, test_last_n=len(structured_queries_data))

## Part 4: Structured RAG Testing & Evaluation

Test the Structured RAG implementation and compare its performance with Traditional RAG.
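
Before computing the full metrics, a quick sanity check is to compare a simple hit rate for the two systems: the share of queries where at least one expected hotel was retrieved. The sketch below is an assumption-laden illustration, not the evaluation used in this notebook. `hit_rate` is a hypothetical helper, the records are toy data with made-up hotel names, and `expected_results` is assumed to be a list of hotel names.

```python
from typing import Dict, List


def hit_rate(results: List[Dict]) -> float:
    """Share of queries (with non-empty expectations) where any expected hotel was retrieved."""
    scored = [r for r in results if r.get("expected_results")]
    if not scored:
        return 0.0
    hits = 0
    for record in scored:
        retrieved = {hit["name"] for hit in record.get("retrieved_results", [])}
        if retrieved & set(record["expected_results"]):
            hits += 1
    return hits / len(scored)


# Toy records in the same shape as the saved result files (names are invented)
toy_traditional = [
    {"query": "q1", "expected_results": ["Hotel A"], "retrieved_results": [{"name": "Hotel A"}]},
    {"query": "q2", "expected_results": ["Hotel B"], "retrieved_results": [{"name": "Hotel C"}]},
]
toy_structured = [
    {"query": "q1", "expected_results": ["Hotel A"], "retrieved_results": [{"name": "Hotel A"}]},
    {"query": "q2", "expected_results": ["Hotel B"], "retrieved_results": [{"name": "Hotel B"}]},
]

print(f"Traditional hit rate: {hit_rate(toy_traditional):.2f}")
print(f"Structured hit rate:  {hit_rate(toy_structured):.2f}")
```

Hit rate ignores how many irrelevant hotels were returned, so it complements rather than replaces precision, recall, and F1.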

In [None]:
# Load RAG Results for Evaluation
import json
import numpy as np
from typing import List, Dict, Tuple
import matplotlib.pyplot as plt
import seaborn as sns

def load_rag_results(traditional_file: str = 'traditional_rag_results.json', 
                     structured_file: str = 'structured_rag_results.json') -> Tuple[List[Dict], List[Dict]]:
    """
    Load both Traditional and Structured RAG results from JSON files.
    
    Returns:
        Tuple of (traditional_results, structured_results)
    """
    try:
        # Load Traditional RAG results
        with open(traditional_file, 'r', encoding='utf-8') as f:
            traditional_results = json.load(f)
        print(f"Loaded {len(traditional_results)} Traditional RAG results")
        
        # Load Structured RAG results
        with open(structured_file, 'r', encoding='utf-8') as f:
            structured_results = json.load(f)
        print(f"Loaded {len(structured_results)} Structured RAG results")
        
        return traditional_results, structured_results
        
    except FileNotFoundError as e:
        print(f"Error loading results: {e}")
        print("Please ensure both result files exist in the current directory")
        return [], []
    except json.JSONDecodeError as e:
        print(f"Error parsing JSON: {e}")
        return [], []

# Load both result sets
traditional_results, structured_results = load_rag_results()

# Verify data consistency
if len(traditional_results) == len(structured_results):
    print(f"Dataset consistency verified: {len(traditional_results)} queries in both result sets")
else:
    print(f"Warning: Different number of results - Traditional: {len(traditional_results)}, Structured: {len(structured_results)}")

# Preview data structure
if traditional_results:
    print("\nTraditional RAG result structure:")
    sample = traditional_results[0]
    for key, value in sample.items():
        if key == 'retrieved_results' and value:
            # Entries may be dicts with a 'name' field or plain hotel-name strings
            first = value[0].get('name') if isinstance(value[0], dict) else value[0]
            print(f"  {key}: {len(value)} results (sample: {first})")
        else:
            print(f"  {key}: {type(value).__name__}")

if structured_results:
    print("\nStructured RAG result structure:")
    sample = structured_results[0]
    for key, value in sample.items():
        if key == 'retrieved_results' and value:
            # Entries may be dicts with a 'name' field or plain hotel-name strings
            first = value[0].get('name') if isinstance(value[0], dict) else value[0]
            print(f"  {key}: {len(value)} results (sample: {first})")
        else:
            print(f"  {key}: {type(value).__name__}")
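
The length check in the cell above only confirms that both files hold the same number of records. A per-query comparison also assumes the two lists cover the same queries. The sketch below pairs records by their `query` text and reports any query that appears in only one set. `pair_by_query` is a hypothetical helper, and matching on the exact query string is an assumption about how the two result files were produced.

```python
from typing import Dict, List, Tuple


def pair_by_query(
    traditional: List[Dict], structured: List[Dict]
) -> Tuple[List[Tuple[Dict, Dict]], List[str]]:
    """Pair records that share the same query text.

    Returns (pairs, unmatched_queries). If a query text repeats within one
    list, the last record with that text wins.
    """
    trad_by_query = {r.get("query", ""): r for r in traditional}
    struct_by_query = {r.get("query", ""): r for r in structured}

    # Matched pairs keep the order of the traditional list
    pairs = [
        (trad_by_query[q], struct_by_query[q])
        for q in trad_by_query
        if q in struct_by_query
    ]
    # Queries present in exactly one of the two result sets
    unmatched = sorted(set(trad_by_query) ^ set(struct_by_query))
    return pairs, unmatched


# Toy data for illustration only
trad = [{"query": "a", "retrieved_results": []}, {"query": "b", "retrieved_results": []}]
struct = [{"query": "b", "retrieved_results": []}, {"query": "c", "retrieved_results": []}]
pairs, unmatched = pair_by_query(trad, struct)
print(len(pairs), unmatched)  # 1 ['a', 'c']
```

An empty `unmatched` list is a stronger consistency signal than equal lengths alone.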

In [None]:
# Evaluation Metrics Functions
def calculate_precision_recall_f1(retrieved_hotels: List[str], expected_hotels: List[str]) -> Tuple[float, float, float]:
    """
    Calculate Precision, Recall, and F1-Score for a single query.
    
    Args:
        retrieved_hotels: List of hotel names retrieved by the system
        expected_hotels: List of expected/correct hotel names
    
    Returns:
        Tuple of (precision, recall, f1_score)
    """
    if not retrieved_hotels and not expected_hotels:
        return 1.0, 1.0, 1.0  # Perfect match when both are empty
    
    if not retrieved_hotels:
        return 0.0, 0.0, 0.0  # No results retrieved
    
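    if not expected_hotels:
        # Nothing is relevant, so every retrieved hotel is a false positive (precision 0);
        # recall is vacuously 1.0 because there was nothing to miss
        return 0.0, 1.0, 0.0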
    
    # Convert to sets for intersection calculation
    retrieved_set = set(retrieved_hotels)
    expected_set = set(expected_hotels)
    
    # Calculate metrics
    true_positives = len(retrieved_set.intersection(expected_set))
    
    precision = true_positives / len(retrieved_set) if retrieved_set else 0.0
    recall = true_positives / len(expected_set) if expected_set else 0.0
    
    # F1 Score
    if precision + recall == 0:
        f1_score = 0.0
    else:
        f1_score = 2 * (precision * recall) / (precision + recall)
    
    return precision, recall, f1_score

def calculate_mrr(retrieved_hotels: List[str], expected_hotels: List[str]) -> float:
    """
    Calculate the reciprocal rank of the first correct result for one query.
    Averaging this value over all queries gives the Mean Reciprocal Rank (MRR).
    
    Args:
        retrieved_hotels: List of hotel names retrieved by the system (in rank order)
        expected_hotels: List of expected/correct hotel names
    
    Returns:
        Reciprocal rank (1/rank of first correct result, 0 if no correct results)
    """
    if not retrieved_hotels or not expected_hotels:
        return 0.0
    
    expected_set = set(expected_hotels)
    
    for rank, hotel in enumerate(retrieved_hotels, 1):
        if hotel in expected_set:
            return 1.0 / rank
    
    return 0.0  # No correct results found

def calculate_weighted_mrr(retrieved_hotels: List[str], expected_hotels: List[str]) -> float:
    """
    Calculate a recall-weighted variant of MRR for one query.
    The mean reciprocal rank of all correct results is scaled by
    (0.5 + 0.5 * recall), so finding more of the expected hotels raises the score.
    
    Args:
        retrieved_hotels: List of hotel names retrieved by the system (in rank order)
        expected_hotels: List of expected/correct hotel names
    
    Returns:
        Weighted MRR score
    """
    if not retrieved_hotels or not expected_hotels:
        return 0.0
    
    expected_set = set(expected_hotels)
    reciprocal_ranks = []
    
    # Find ALL correct results and their ranks
    for rank, hotel in enumerate(retrieved_hotels, 1):
        if hotel in expected_set:
            reciprocal_ranks.append(1.0 / rank)
    
    if not reciprocal_ranks:
        return 0.0
    
    # Calculate recall bonus (reward for finding more relevant items)
    recall = len(reciprocal_ranks) / len(expected_set)
    
    # Weighted score: mean of reciprocal ranks * recall bonus
    mean_reciprocal_rank = sum(reciprocal_ranks) / len(reciprocal_ranks)
    weighted_score = mean_reciprocal_rank * (0.5 + 0.5 * recall)  # Base 50% + up to 50% recall bonus
    
    return weighted_score

def calculate_ndcg_at_k(retrieved_hotels: List[str], expected_hotels: List[str], k: int = 5) -> float:
    """
    Calculate Normalized Discounted Cumulative Gain (NDCG@K) with binary relevance.
    Rewards placing correct results near the top and is normalized against the
    ideal ordering, so it reflects both position and completeness.
    
    Args:
        retrieved_hotels: List of hotel names retrieved by the system (in rank order)
        expected_hotels: List of expected/correct hotel names
        k: Number of top results to consider
    
    Returns:
        NDCG@K score (0 to 1, higher is better)
    """
    if not retrieved_hotels or not expected_hotels:
        return 0.0
    
    expected_set = set(expected_hotels)
    
    # Calculate DCG (Discounted Cumulative Gain)
    dcg = 0.0
    for i, hotel in enumerate(retrieved_hotels[:k]):
        if hotel in expected_set:
            # Binary relevance: 1 for an expected hotel, 0 otherwise.
            # i is 0-based, so the rank is i + 1 and the discount is
            # 1 / log2(rank + 1) = 1 / log2(i + 2).
            dcg += 1.0 / np.log2(i + 2)
    
    # Calculate IDCG (Ideal DCG): the DCG if every expected hotel filled the top positions
    idcg = 0.0
    num_relevant = min(len(expected_set), k)  # use the set so duplicate names are not double-counted
    for i in range(num_relevant):
        idcg += 1.0 / np.log2(i + 2)
    
    # NDCG = DCG / IDCG
    if idcg == 0:
        return 0.0
    
    return dcg / idcg

def evaluate_rag_system(results: List[Dict], system_name: str) -> Dict:
    """
    Evaluate a RAG system using multiple metrics.
    
    Args:
        results: List of query results from the RAG system
        system_name: Name of the system being evaluated
    
    Returns:
        Dictionary containing evaluation metrics
    """
    print(f"\nEvaluating {system_name}...")
    
    precisions = []
    recalls = []
    f1_scores = []
    mrr_scores = []
    weighted_mrr_scores = []
    ndcg_scores = []
    response_times = []
    
    queries_with_results = 0
    queries_with_expected = 0
    
    for i, result in enumerate(results):
        query = result.get('query', '')
        expected_hotels = result.get('expected_results', [])
        retrieved_results = result.get('retrieved_results', [])
        response_time = result.get('response_time', 0)
        
        # Extract hotel names from retrieved results
        if isinstance(retrieved_results, list) and retrieved_results:
            if isinstance(retrieved_results[0], dict):
                # Structured format with name and score
                retrieved_hotels = [r.get('name', '') for r in retrieved_results if r.get('name')]
            else:
                # Simple list format
                retrieved_hotels = retrieved_results
        else:
            retrieved_hotels = []
        
        # Calculate metrics for this query
        precision, recall, f1 = calculate_precision_recall_f1(retrieved_hotels, expected_hotels)
        mrr = calculate_mrr(retrieved_hotels, expected_hotels)
        weighted_mrr = calculate_weighted_mrr(retrieved_hotels, expected_hotels)
        ndcg = calculate_ndcg_at_k(retrieved_hotels, expected_hotels, k=5)
        
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
        mrr_scores.append(mrr)
        weighted_mrr_scores.append(weighted_mrr)
        ndcg_scores.append(ndcg)
        response_times.append(response_time)
        
        if retrieved_hotels:
            queries_with_results += 1
        if expected_hotels:
            queries_with_expected += 1
        
        # Print details for first few queries
        if i < 3:
            print(f"  Query {i+1}: '{query[:50]}...'")
            print(f"    Expected: {expected_hotels}")
            print(f"    Retrieved: {retrieved_hotels}")
            print(f"    Metrics: P={precision:.3f}, R={recall:.3f}, F1={f1:.3f}")
            print(f"    Ranking: MRR={mrr:.3f}, Weighted-MRR={weighted_mrr:.3f}, NDCG@5={ndcg:.3f}")
    
    # Calculate aggregate metrics
    avg_precision = np.mean(precisions)
    avg_recall = np.mean(recalls)
    avg_f1 = np.mean(f1_scores)
    avg_mrr = np.mean(mrr_scores)
    avg_weighted_mrr = np.mean(weighted_mrr_scores)
    avg_ndcg = np.mean(ndcg_scores)
    avg_response_time = np.mean(response_times)
    
    metrics = {
        'system_name': system_name,
        'total_queries': len(results),
        'queries_with_results': queries_with_results,
        'queries_with_expected': queries_with_expected,
        'avg_precision': avg_precision,
        'avg_recall': avg_recall,
        'avg_f1_score': avg_f1,
        'avg_mrr': avg_mrr,
        'avg_weighted_mrr': avg_weighted_mrr,
        'avg_ndcg': avg_ndcg,
        'avg_response_time': avg_response_time,
        'precision_scores': precisions,
        'recall_scores': recalls,
        'f1_scores': f1_scores,
        'mrr_scores': mrr_scores,
        'weighted_mrr_scores': weighted_mrr_scores,
        'ndcg_scores': ndcg_scores,
        'response_times': response_times
    }
    
    return metrics

print("Evaluation functions defined with multiple ranking metrics")
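
As a sanity check, the metrics can be recomputed by hand for the first Traditional RAG query shown in the evaluation output. The sketch below uses only the standard library and restates the same formulas inline rather than calling the functions above. It assumes the retrieved list has no duplicate names and is already limited to five results.

```python
import math

# Expected and retrieved hotels for Query 1, copied from the evaluation output
expected = ['The Colosseum', 'The Shamrock', 'The Palace of Versailles', 'Marina Bay Sands']
retrieved = ['The Shamrock', 'The Palace of Versailles', "Pharaoh's Casino",
             'The Parthenon', 'The Colosseum']

expected_set = set(expected)
# 1-based ranks at which an expected hotel appears: here 1, 2, and 5
hit_ranks = [rank for rank, hotel in enumerate(retrieved, 1) if hotel in expected_set]

precision = len(hit_ranks) / len(retrieved)      # 3 of 5 retrieved are correct
recall = len(hit_ranks) / len(expected_set)      # 3 of 4 expected were found
f1 = 2 * precision * recall / (precision + recall)

# Reciprocal rank of the first hit, and the recall-weighted variant
rr = 1.0 / hit_ranks[0]
mean_rr = sum(1.0 / r for r in hit_ranks) / len(hit_ranks)
weighted_mrr = mean_rr * (0.5 + 0.5 * recall)

# NDCG@5 with binary relevance: discount each hit by log2(rank + 1)
dcg = sum(1.0 / math.log2(r + 1) for r in hit_ranks)
ideal_hits = min(len(expected_set), 5)
idcg = sum(1.0 / math.log2(r + 1) for r in range(1, ideal_hits + 1))
ndcg = dcg / idcg

print(f"P={precision:.3f}, R={recall:.3f}, F1={f1:.3f}")
# P=0.600, R=0.750, F1=0.667
print(f"MRR={rr:.3f}, Weighted-MRR={weighted_mrr:.3f}, NDCG@5={ndcg:.3f}")
# MRR=1.000, Weighted-MRR=0.496, NDCG@5=0.788
```

Note how the reciprocal rank is a perfect 1.000 because the first result is correct, while the weighted variant drops to 0.496: the later hits at ranks 2 and 5 pull the mean reciprocal rank down, and the missed fourth hotel reduces the recall factor.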

In [None]:
# Create comparison summary
if traditional_results and structured_results:
    # Evaluate Traditional RAG
    traditional_metrics = evaluate_rag_system(traditional_results, "Traditional RAG")
    # Evaluate Structured RAG
    structured_metrics = evaluate_rag_system(structured_results, "Structured RAG")
    print(f"\nEVALUATION RESULTS SUMMARY (Enhanced with Better Ranking Metrics)")
    print("=" * 90)
    print(f"{'Metric':<25} {'Traditional RAG':<20} {'Structured RAG':<20} {'Winner':<15}")
    print("-" * 90)
    
    metrics_to_compare = [
        ('Precision', 'avg_precision', 'higher'),
        ('Recall', 'avg_recall', 'higher'),
        ('F1-Score', 'avg_f1_score', 'higher'),
        ('MRR (Traditional)', 'avg_mrr', 'higher'),
        ('Weighted MRR', 'avg_weighted_mrr', 'higher'),
        ('NDCG@5', 'avg_ndcg', 'higher'),
        ('Avg Response Time (s)', 'avg_response_time', 'lower')
    ]
    
    for metric_name, metric_key, better in metrics_to_compare:
        trad_value = traditional_metrics[metric_key]
        struct_value = structured_metrics[metric_key]
        
        if better == 'higher':
            if trad_value > struct_value:
                winner = "Traditional RAG"
            elif struct_value > trad_value:
                winner = "Structured RAG"
            else:
                winner = "Tie"
        else:  # lower is better (for response time)
            if trad_value < struct_value:
                winner = "Traditional RAG"
            elif struct_value < trad_value:
                winner = "Structured RAG"
            else:
                winner = "Tie"
        
        print(f"{metric_name:<25} {trad_value:<20.4f} {struct_value:<20.4f} {winner:<15}")
    
    print("-" * 90)
else:
    print("Cannot run evaluation - missing result files")
    print("Please ensure both traditional_rag_results.json and structured_rag_results.json exist")


🔍 Evaluating Traditional RAG...
  Query 1: 'I need a casino hotel in Asia with gambling and ni...'
    Expected: ['The Colosseum', 'The Shamrock', 'The Palace of Versailles', 'Marina Bay Sands']
    Retrieved: ['The Shamrock', 'The Palace of Versailles', "Pharaoh's Casino", 'The Parthenon', 'The Colosseum']
    Metrics: P=0.600, R=0.750, F1=0.667
    Ranking: MRR=1.000, Weighted-MRR=0.496, NDCG@5=0.788
  Query 2: 'Find me a luxury beach resort in Europe with spa s...'
    Expected: ['Hotel du Cap-Eden-Roc']
    Retrieved: ['Siam Wellness Retreat', 'Roman Holiday', 'Belmond Hotel Cipriani', 'Spartan Training Center', 'The Wild West']
    Metrics: P=0.000, R=0.000, F1=0.000
    Ranking: MRR=0.000, Weighted-MRR=0.000, NDCG@5=0.000
  Query 3: 'I'm looking for a mountain lodge in North America ...'
    Expected: ['Kyoto Gardens Inn', 'Fairmont Banff Springs', 'The Outback', 'Bavarian Village Hotel']
    Retrieved: ['The Alps', 'The Outback', 'Fairmont Banff Springs', 'Kyoto Gardens Inn', '