# Hotel Support Agent Tutorial

This notebook demonstrates the Agent Catalog hotel support agent, built with LangChain on a Couchbase vector store and evaluated with Arize Phoenix.


## Setup and Imports

Import all necessary modules for the hotel support agent.


In [1]:
import base64
import getpass
import json
import logging
import os
import sys
import time
from datetime import timedelta

import agentc
import agentc_langchain
import dotenv
import requests
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.management.buckets import CreateBucketSettings
from couchbase.management.search import SearchIndex
from couchbase.options import ClusterOptions
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.prompts import PromptTemplate
from langchain_core.tools import Tool
from langchain_couchbase.vectorstores import CouchbaseVectorStore
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Setup logging
logging.basicConfig(
    level=logging.INFO, format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
logger = logging.getLogger(__name__)

# Suppress verbose logging
logging.getLogger("openai").setLevel(logging.WARNING)
logging.getLogger("httpx").setLevel(logging.WARNING)
logging.getLogger("httpcore").setLevel(logging.WARNING)
logging.getLogger("agentc_core").setLevel(logging.WARNING)

# Load environment variables
dotenv.load_dotenv(override=True)

# Set default values for travel-sample bucket configuration
DEFAULT_BUCKET = "travel-sample"
DEFAULT_SCOPE = "agentc_data"
DEFAULT_COLLECTION = "hotel_data"
DEFAULT_INDEX = "hotel_data_index"


## Environment Setup

Set up environment variables and configuration for the hotel support agent.


In [2]:
def setup_capella_ai_config():
    """Setup Capella AI configuration - requires environment variables to be set."""
    required_capella_vars = [
        "CB_USERNAME",
        "CB_PASSWORD", 
        "CAPELLA_API_ENDPOINT",
        "CAPELLA_API_EMBEDDING_MODEL",
        "CAPELLA_API_LLM_MODEL",
    ]
    missing_vars = [var for var in required_capella_vars if not os.getenv(var)]
    if missing_vars:
        raise ValueError(f"Missing required Capella AI environment variables: {missing_vars}")

    return {
        "endpoint": os.getenv("CAPELLA_API_ENDPOINT"),
        "embedding_model": os.getenv("CAPELLA_API_EMBEDDING_MODEL"),
        "llm_model": os.getenv("CAPELLA_API_LLM_MODEL"),
    }


def test_capella_connectivity():
    """Test connectivity to Capella AI services."""
    try:
        endpoint = os.getenv("CAPELLA_API_ENDPOINT")
        if not endpoint:
            logger.warning("CAPELLA_API_ENDPOINT not configured")
            return False

        # Test embedding model (requires API key)
        if os.getenv("CB_USERNAME") and os.getenv("CB_PASSWORD"):
            api_key = base64.b64encode(
                f"{os.getenv('CB_USERNAME')}:{os.getenv('CB_PASSWORD')}".encode()
            ).decode()

            headers = {
                "Authorization": f"Basic {api_key}",
                "Content-Type": "application/json",
            }

            # Test embedding
            logger.info("Testing Capella AI connectivity...")
            embedding_data = {
                "model": os.getenv("CAPELLA_API_EMBEDDING_MODEL", "intfloat/e5-mistral-7b-instruct"),
                "input": "test connectivity",
            }

            response = requests.post(
                f"{endpoint}/embeddings",
                json=embedding_data,
                headers=headers,
                timeout=30,  # avoid hanging the notebook if the endpoint is unreachable
            )
            if response.status_code == 200:
                logger.info("✅ Capella AI embedding test successful")
                return True
            else:
                logger.warning(f"❌ Capella AI embedding test failed: {response.text}")
                return False
        else:
            logger.warning("Capella AI credentials not available")
            return False
    except Exception as e:
        logger.warning(f"❌ Capella AI connectivity test failed: {e}")
        return False


def _set_if_undefined(env_var: str, default_value: str = None):
    """Set environment variable if not already defined."""
    if not os.getenv(env_var):
        if default_value is None:
            value = getpass.getpass(f"Enter {env_var}: ")
        else:
            value = default_value
        os.environ[env_var] = value


def setup_environment():
    """Setup required environment variables with defaults for travel-sample configuration."""
    logger.info("Setting up environment variables...")

    # Set default bucket configuration
    _set_if_undefined("CB_BUCKET", DEFAULT_BUCKET)
    _set_if_undefined("CB_SCOPE", DEFAULT_SCOPE)
    _set_if_undefined("CB_COLLECTION", DEFAULT_COLLECTION)
    _set_if_undefined("CB_INDEX", DEFAULT_INDEX)

    # Required Couchbase connection variables
    _set_if_undefined("CB_CONN_STRING")
    _set_if_undefined("CB_USERNAME")
    _set_if_undefined("CB_PASSWORD")

    # Optional Capella AI configuration
    endpoint = os.getenv("CAPELLA_API_ENDPOINT")
    if endpoint:
        # Ensure endpoint has /v1 suffix for OpenAI compatibility.
        # Strip trailing slashes first so ".../v1/" is not turned into ".../v1/v1".
        endpoint = endpoint.rstrip("/")
        if not endpoint.endswith("/v1"):
            endpoint += "/v1"
            logger.info(f"Added /v1 suffix to endpoint: {endpoint}")
        os.environ["CAPELLA_API_ENDPOINT"] = endpoint

    # Test Capella AI connectivity
    test_capella_connectivity()


# Setup environment
setup_environment()


2025-07-23 19:24:56,324 - __main__ - INFO - Setting up environment variables...
2025-07-23 19:24:56,325 - __main__ - INFO - Added /v1 suffix to endpoint: https://6u2le4tknje7yzkg.ai.cloud.couchbase.com/v1
2025-07-23 19:24:56,325 - __main__ - INFO - Testing Capella AI connectivity...
2025-07-23 19:24:57,401 - __main__ - INFO - ✅ Capella AI embedding test successful
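
The setup cell performs two small transformations that are easy to get subtly wrong: normalizing the endpoint so it ends in `/v1`, and building the Basic credential from `username:password`. The following is a minimal standalone sketch of both. The helper names `normalize_endpoint` and `basic_auth_token` are illustrative and not part of the notebook, and the host and credentials are placeholders. The sketch strips any trailing slash before the suffix check so that an endpoint already ending in `/v1/` is not suffixed twice.

```python
import base64


def normalize_endpoint(endpoint: str) -> str:
    """Return the endpoint with exactly one trailing /v1 segment."""
    endpoint = endpoint.rstrip("/")
    if not endpoint.endswith("/v1"):
        endpoint += "/v1"
    return endpoint


def basic_auth_token(username: str, password: str) -> str:
    """Base64-encode 'username:password' for a Basic Authorization header."""
    return base64.b64encode(f"{username}:{password}".encode()).decode()


# Placeholder values, not real credentials or hosts.
print(normalize_endpoint("https://example.invalid/"))
print(basic_auth_token("user", "pass"))
```

The same token is what the later cells pass as `api_key` to the OpenAI-compatible clients.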


## CouchbaseClient Class

Define the CouchbaseClient for all database operations.


In [3]:
class CouchbaseClient:
    """Centralized Couchbase client for all database operations."""

    def __init__(self, conn_string: str, username: str, password: str, bucket_name: str):
        """Initialize Couchbase client with connection details."""
        self.conn_string = conn_string
        self.username = username
        self.password = password
        self.bucket_name = bucket_name
        self.cluster = None
        self.bucket = None
        self._collections = {}

    def connect(self):
        """Establish connection to Couchbase cluster."""
        try:
            auth = PasswordAuthenticator(self.username, self.password)
            options = ClusterOptions(auth)

            # Use WAN profile for better timeout handling with remote clusters
            options.apply_profile("wan_development")
            self.cluster = Cluster(self.conn_string, options)
            self.cluster.wait_until_ready(timedelta(seconds=10))
            logger.info("Successfully connected to Couchbase")
            return self.cluster
        except Exception as e:
            raise ConnectionError(f"Failed to connect to Couchbase: {e!s}")

    def setup_collection(self, scope_name: str, collection_name: str):
        """Create the scope and collection if missing; if the collection already exists, clear its documents (the scope itself is left intact)."""
        try:
            # Ensure cluster connection
            if not self.cluster:
                self.connect()

            # For travel-sample bucket, assume it exists
            if not self.bucket:
                self.bucket = self.cluster.bucket(self.bucket_name)
                logger.info(f"Connected to bucket '{self.bucket_name}'")

            # Setup scope
            bucket_manager = self.bucket.collections()
            scopes = bucket_manager.get_all_scopes()
            scope_exists = any(scope.name == scope_name for scope in scopes)

            if not scope_exists and scope_name != "_default":
                logger.info(f"Creating scope '{scope_name}'...")
                bucket_manager.create_scope(scope_name)
                logger.info(f"Scope '{scope_name}' created successfully")

            # Setup collection - clear if exists, create if doesn't
            collections = bucket_manager.get_all_scopes()
            collection_exists = any(
                scope.name == scope_name
                and collection_name in [col.name for col in scope.collections]
                for scope in collections
            )

            if collection_exists:
                logger.info(f"Collection '{collection_name}' exists, clearing data...")
                # Clear existing data
                self.clear_collection_data(scope_name, collection_name)
            else:
                logger.info(f"Creating collection '{collection_name}'...")
                bucket_manager.create_collection(scope_name, collection_name)
                logger.info(f"Collection '{collection_name}' created successfully")

            time.sleep(3)

            # Create primary index
            try:
                self.cluster.query(
                    f"CREATE PRIMARY INDEX IF NOT EXISTS ON `{self.bucket_name}`.`{scope_name}`.`{collection_name}`"
                ).execute()
                logger.info("Primary index created successfully")
            except Exception as e:
                logger.warning(f"Error creating primary index: {e}")

            logger.info("Collection setup complete")
            return self.bucket.scope(scope_name).collection(collection_name)

        except Exception as e:
            raise RuntimeError(f"Error setting up collection: {e!s}")

    def clear_collection_data(self, scope_name: str, collection_name: str):
        """Clear all data from a collection."""
        try:
            logger.info(f"Clearing data from {self.bucket_name}.{scope_name}.{collection_name}...")
            
            # Use N1QL to delete all documents with explicit execution
            delete_query = f"DELETE FROM `{self.bucket_name}`.`{scope_name}`.`{collection_name}`"
            result = self.cluster.query(delete_query)
            
            # Execute the query and get the results
            rows = list(result)
            
            # Wait a moment for the deletion to propagate
            time.sleep(2)
            
            # Verify collection is empty
            count_query = f"SELECT COUNT(*) as count FROM `{self.bucket_name}`.`{scope_name}`.`{collection_name}`"
            count_result = self.cluster.query(count_query)
            count_row = list(count_result)[0]
            remaining_count = count_row['count']
            
            if remaining_count == 0:
                logger.info(f"Collection cleared successfully, {remaining_count} documents remaining")
            else:
                logger.warning(f"Collection clear incomplete, {remaining_count} documents remaining")
            
        except Exception as e:
            # If the N1QL delete fails, log it and continue anyway
            logger.warning(f"Error clearing collection data: {e}")

    def get_collection(self, scope_name: str, collection_name: str):
        """Get a collection object."""
        key = f"{scope_name}.{collection_name}"
        if key not in self._collections:
            self._collections[key] = self.bucket.scope(scope_name).collection(collection_name)
        return self._collections[key]

    def setup_vector_search_index(self, index_definition: dict, scope_name: str):
        """Setup vector search index."""
        try:
            scope_index_manager = self.bucket.scope(scope_name).search_indexes()

            existing_indexes = scope_index_manager.get_all_indexes()
            index_name = index_definition["name"]

            if index_name not in [index.name for index in existing_indexes]:
                logger.info(f"Creating vector search index '{index_name}'...")
                search_index = SearchIndex.from_json(index_definition)
                scope_index_manager.upsert_index(search_index)
                logger.info(f"Vector search index '{index_name}' created successfully")
            else:
                logger.info(f"Vector search index '{index_name}' already exists")
        except Exception as e:
            raise RuntimeError(f"Error setting up vector search index: {e!s}")

    def load_hotel_data(self, scope_name, collection_name, index_name, embeddings):
        """Load hotel data into Couchbase."""
        try:
            # Import hotel data loading function
            sys.path.append(os.path.join(os.getcwd(), "data"))
            from hotel_data import load_hotel_data_to_couchbase

            # Load hotel data using the data loading script
            load_hotel_data_to_couchbase(
                cluster=self.cluster,
                bucket_name=self.bucket_name,
                scope_name=scope_name,
                collection_name=collection_name,
                embeddings=embeddings,
                index_name=index_name,
            )
            logger.info("Hotel data loaded into vector store successfully")

        except Exception as e:
            raise RuntimeError(f"Error loading hotel data: {e!s}")

    def setup_vector_store(
        self, scope_name: str, collection_name: str, index_name: str, embeddings
    ):
        """Setup vector store with hotel data."""
        try:
            # Load hotel data
            self.load_hotel_data(scope_name, collection_name, index_name, embeddings)

            # Create vector store instance
            vector_store = CouchbaseVectorStore(
                cluster=self.cluster,
                bucket_name=self.bucket_name,
                scope_name=scope_name,
                collection_name=collection_name,
                embedding=embeddings,
                index_name=index_name,
            )

            logger.info("Vector store setup complete")
            return vector_store

        except Exception as e:
            raise RuntimeError(f"Error setting up vector store: {e!s}")
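
The existence check inside `setup_collection` walks the scopes returned by `get_all_scopes()` and looks for a matching collection name. As a rough illustration of that logic in isolation, the sketch below uses two hypothetical stand-in classes, `FakeScope` and `FakeCollection`, that expose only the `name` and `collections` attributes the check relies on. They are not Couchbase SDK types.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCollection:
    """Hypothetical stand-in for a collection spec (only 'name' is used)."""
    name: str


@dataclass
class FakeScope:
    """Hypothetical stand-in for a scope spec ('name' and 'collections')."""
    name: str
    collections: list = field(default_factory=list)


def collection_exists(scopes, scope_name: str, collection_name: str) -> bool:
    """Return True if scope_name contains a collection named collection_name."""
    return any(
        scope.name == scope_name
        and any(col.name == collection_name for col in scope.collections)
        for scope in scopes
    )


scopes = [
    FakeScope("_default", [FakeCollection("_default")]),
    FakeScope("agentc_data", [FakeCollection("hotel_data")]),
]
print(collection_exists(scopes, "agentc_data", "hotel_data"))  # True
print(collection_exists(scopes, "agentc_data", "missing"))     # False
```

Note that when the check succeeds, `setup_collection` deletes every document in the collection, so rerunning the notebook reloads the hotel data from scratch.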


## Hotel Support Agent Setup

Set up the hotel support agent with Agent Catalog integration.


In [4]:
def setup_hotel_support_agent():
    """Setup the hotel support agent with Agent Catalog integration."""
    try:
        # Initialize Agent Catalog with single application span
        catalog = agentc.catalog.Catalog()
        application_span = catalog.Span(name="Hotel Support Agent")

        # Setup Couchbase connection and collections
        couchbase_client = CouchbaseClient(
            conn_string=os.getenv("CB_CONN_STRING"),
            username=os.getenv("CB_USERNAME"),
            password=os.getenv("CB_PASSWORD"),
            bucket_name=os.getenv("CB_BUCKET", DEFAULT_BUCKET),
        )
        couchbase_client.connect()
        couchbase_client.setup_collection(
            os.getenv("CB_SCOPE", DEFAULT_SCOPE),
            os.getenv("CB_COLLECTION", DEFAULT_COLLECTION)
        )

        # Setup vector index
        try:
            with open("agentcatalog_index.json", "r") as file:
                index_definition = json.load(file)
            logger.info("Loaded vector search index definition from agentcatalog_index.json")
        except Exception as e:
            raise ValueError(f"Error loading index definition: {e!s}")
        
        couchbase_client.setup_vector_search_index(
            index_definition, os.getenv("CB_SCOPE", DEFAULT_SCOPE)
        )

        # Setup embeddings using Capella AI
        try:
            if (
                os.getenv("CB_USERNAME")
                and os.getenv("CB_PASSWORD")
                and os.getenv("CAPELLA_API_ENDPOINT")
                and os.getenv("CAPELLA_API_EMBEDDING_MODEL")
            ):
                api_key = base64.b64encode(
                    f"{os.getenv('CB_USERNAME')}:{os.getenv('CB_PASSWORD')}".encode()
                ).decode()
                embeddings = OpenAIEmbeddings(
                    model=os.getenv("CAPELLA_API_EMBEDDING_MODEL"),
                    api_key=api_key,
                    base_url=os.getenv("CAPELLA_API_ENDPOINT"),
                )
                logger.info("✅ Using Capella AI for embeddings")
            else:
                raise ValueError("Capella AI credentials not available")
        except Exception as e:
            logger.error(f"❌ Capella AI embeddings failed: {e}")
            raise RuntimeError("Capella AI embeddings required for this configuration")

        couchbase_client.setup_vector_store(
            os.getenv("CB_SCOPE", DEFAULT_SCOPE),
            os.getenv("CB_COLLECTION", DEFAULT_COLLECTION),
            os.getenv("CB_INDEX", DEFAULT_INDEX),
            embeddings,
        )

        # Setup LLM with Agent Catalog callback - try Capella AI first, fallback to OpenAI
        try:
            # Create API key for Capella AI
            api_key = base64.b64encode(
                f"{os.getenv('CB_USERNAME')}:{os.getenv('CB_PASSWORD')}".encode()
            ).decode()

            llm = ChatOpenAI(
                api_key=api_key,
                base_url=os.getenv("CAPELLA_API_ENDPOINT"),
                model=os.getenv("CAPELLA_API_LLM_MODEL"),
                temperature=0,
                callbacks=[agentc_langchain.chat.Callback(span=application_span)],
            )
            # Test the LLM works
            llm.invoke("Hello")
            logger.info("✅ Using Capella AI LLM")
        except Exception as e:
            logger.warning(f"⚠️ Capella AI LLM failed: {e}")
            logger.info("🔄 Falling back to OpenAI LLM...")
            _set_if_undefined("OPENAI_API_KEY")
            llm = ChatOpenAI(
                api_key=os.getenv("OPENAI_API_KEY"),
                model="gpt-4o",
                temperature=0,
                callbacks=[agentc_langchain.chat.Callback(span=application_span)],
            )
            logger.info("✅ Using OpenAI LLM as fallback")

        # Load tools and create agent
        tool_search = catalog.find("tool", name="search_vector_database")
        if not tool_search:
            raise ValueError(
                "Could not find search_vector_database tool. Make sure it's indexed with 'agentc index tools/'"
            )

        tools = [
            Tool(
                name=tool_search.meta.name,
                description=tool_search.meta.description,
                func=tool_search.func,
            ),
        ]

        hotel_prompt = catalog.find("prompt", name="hotel_search_assistant")
        if not hotel_prompt:
            raise ValueError(
                "Could not find hotel_search_assistant prompt in catalog. Make sure it's indexed with 'agentc index prompts/'"
            )

        custom_prompt = PromptTemplate(
            template=hotel_prompt.content.strip(),
            input_variables=["input", "agent_scratchpad"],
            partial_variables={
                "tools": "\n".join([f"{tool.name}: {tool.description}" for tool in tools]),
                "tool_names": ", ".join([tool.name for tool in tools]),
            },
        )

        agent = create_react_agent(llm, tools, custom_prompt)
        agent_executor = AgentExecutor(
            agent=agent,
            tools=tools,
            verbose=True,
            handle_parsing_errors=True,
            max_iterations=5,
            max_execution_time=60,
        )

        return agent_executor, application_span

    except Exception as e:
        logger.exception(f"Error setting up hotel support agent: {e}")
        raise


# Setup the hotel support agent
agent_executor, application_span = setup_hotel_support_agent()


2025-07-23 19:24:59,898 - __main__ - INFO - Successfully connected to Couchbase
2025-07-23 19:25:01,404 - __main__ - INFO - Connected to bucket 'travel-sample'
2025-07-23 19:25:03,399 - __main__ - INFO - Collection 'hotel_data' exists, clearing data...
2025-07-23 19:25:03,400 - __main__ - INFO - Clearing data from travel-sample.agentc_data.hotel_data...
2025-07-23 19:25:08,368 - __main__ - INFO - Collection cleared successfully, 0 documents remaining
2025-07-23 19:25:12,462 - __main__ - INFO - Primary index created successfully
2025-07-23 19:25:12,462 - __main__ - INFO - Collection setup complete
2025-07-23 19:25:12,463 - __main__ - INFO - Loaded vector search index definition from agentcatalog_index.json
2025-07-23 19:25:13,691 - __main__ - INFO - Vector search index 'hotel_data_index' already exists
2025-07-23 19:25:13,898 - __main__ - INFO - ✅ Using Capella AI for embeddings
2025-07-23 19:25:17,480 - hotel_data - INFO - Loading hotel data from travel-sample.inventory.hotel...
2025-0
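
The `partial_variables` passed to `PromptTemplate` in the setup cell pre-fill the `tools` and `tool_names` placeholders, leaving only `input` and `agent_scratchpad` to be supplied per call. The sketch below reproduces that string construction with plain Python so the result can be inspected without LangChain. `ToolInfo`, the tool description, and the shortened template are all hypothetical; the real prompt text lives in the catalog.

```python
from dataclasses import dataclass


@dataclass
class ToolInfo:
    """Hypothetical stand-in exposing the two attributes the prompt needs."""
    name: str
    description: str


def render_tool_variables(tools):
    """Build the 'tools' and 'tool_names' strings used to pre-fill the prompt."""
    return {
        "tools": "\n".join(f"{t.name}: {t.description}" for t in tools),
        "tool_names": ", ".join(t.name for t in tools),
    }


# The description here is a placeholder, not the catalog's actual text.
tools = [ToolInfo("search_vector_database", "Search hotels by semantic similarity.")]
variables = render_tool_variables(tools)

# A much-shortened, hypothetical template for illustration only.
template = "Tools:\n{tools}\n\nAction must be one of [{tool_names}]\nQuestion: {input}"
print(template.format(input="Find a hotel in London", **variables))
```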

## Test Functions

Define test functions to demonstrate the hotel support agent functionality.


In [5]:
def run_hotel_query(query: str, agent_executor, application_span):
    """Run a single hotel query with error handling."""
    logger.info(f"🔍 Hotel Query: {query}")
    
    try:
        with application_span.new(f"Hotel Query: {query}") as query_span:
            query_span["query"] = query
            
            # Run the agent
            response = agent_executor.invoke({"input": query})
            result = response.get("output", "No response generated")
            
            query_span["result"] = result
            logger.info(f"🤖 AI Response: {result}")
            logger.info("✅ Query completed successfully")
            
            return result
            
    except Exception as e:
        logger.exception(f"❌ Query failed: {e}")
        return f"Error: {str(e)}"


def test_hotel_data_loading():
    """Test hotel data loading from travel-sample independently."""
    logger.info("Testing Hotel Data Loading from travel-sample")
    logger.info("=" * 50)
    
    try:
        # Import hotel data functions
        sys.path.append(os.path.join(os.getcwd(), "data"))
        from hotel_data import get_hotel_count, get_hotel_texts
        
        # Test hotel count
        count = get_hotel_count()
        logger.info(f"✅ Hotel count in travel-sample.inventory.hotel: {count}")
        
        # Test hotel text generation
        texts = get_hotel_texts()
        logger.info(f"✅ Generated {len(texts)} hotel texts for embeddings")
        
        if texts:
            logger.info(f"✅ First hotel text sample: {texts[0][:200]}...")
        
        logger.info("✅ Data loading test completed successfully")
        
    except Exception as e:
        logger.exception(f"❌ Data loading test failed: {e}")


# Test hotel data loading
test_hotel_data_loading()


2025-07-23 19:26:35,777 - __main__ - INFO - Testing Hotel Data Loading from travel-sample
2025-07-23 19:26:38,175 - __main__ - INFO - ✅ Hotel count in travel-sample.inventory.hotel: 917
2025-07-23 19:26:40,182 - hotel_data - INFO - Loading hotel data from travel-sample.inventory.hotel...
2025-07-23 19:26:45,215 - hotel_data - INFO - Loaded 917 hotels from travel-sample.inventory.hotel
Processing hotels: 100%|██████████| 917/917 [00:00<00:00, 130458.48it/s]
2025-07-23 19:26:45,225 - hotel_data - INFO - Generated 917 hotel text embeddings
2025-07-23 19:26:45,227 - __main__ - INFO - ✅ Generated 917 hotel texts for embeddings
2025-07-23 19:26:45,228 - __main__ - INFO - ✅ First hotel text sample: 'La Mirande Hotel in Avignon, France. Title: Avignon. Description: 5 star hotel housed in a 700 year old converted townhouse. Address: 4 place de la Mirande,F- AVIGNON. Price: €400 and up. State: Prov...
2025-07-23 19:26:45,228 - __main__ - INFO - ✅ Data loading test completed successfully
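
`run_hotel_query` never raises: it returns the agent's `output` field, a default string when that field is missing, or an `Error: ...` string when the call fails. The sketch below mirrors that control flow without spans or logging so the three outcomes can be seen side by side. `StubExecutor` and `run_query` are hypothetical names used only for this illustration.

```python
class StubExecutor:
    """Hypothetical stand-in for AgentExecutor: returns canned output or raises."""

    def __init__(self, output=None, error=None):
        self.output = output
        self.error = error

    def invoke(self, inputs: dict) -> dict:
        if self.error is not None:
            raise self.error
        response = {"input": inputs["input"]}
        if self.output is not None:
            response["output"] = self.output
        return response


def run_query(query: str, executor) -> str:
    """Mirror the return behaviour of run_hotel_query, minus spans and logging."""
    try:
        response = executor.invoke({"input": query})
        return response.get("output", "No response generated")
    except Exception as e:
        return f"Error: {e!s}"


print(run_query("q", StubExecutor(output="One hotel found")))
print(run_query("q", StubExecutor()))
print(run_query("q", StubExecutor(error=TimeoutError("agent timed out"))))
```

One consequence of this design is that callers can only tell a failure from an answer by inspecting the returned string, which is convenient in a demo notebook but worth revisiting in production code.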


## Test 1: Hotel Search in San Francisco

Search for hotels in San Francisco with free parking and breakfast.


In [6]:
result1 = run_hotel_query(
    "Find me a hotel in San Francisco with free parking and breakfast",
    agent_executor,
    application_span
)


2025-07-23 19:26:45,234 - __main__ - INFO - 🔍 Hotel Query: Find me a hotel in San Francisco with free parking and breakfast




> Entering new AgentExecutor chain...
Okay, so the user is asking for a hotel in San Francisco with free parking and breakfast. I need to figure out how to find this using the search_vector_database tool. First, I should make sure I understand the tool's capabilities. It can search for hotels based on semantic similarity, so I need to phrase the query correctly.

I'll start by considering the location, which is San Francisco. Then, the amenities are free parking and breakfast. Combining these, the search query should be "San Francisco free parking breakfast". This should help the tool find hotels that have both these amenities.

I should also remember the critical format rules. I need to present only the hotels actually located in San Francisco. If the search returns hotels from other cities, I have to filter them out. So, I'll make sure to check the results carefully.

Once I run the search, I'll get a list of hotels. I'll go through each one to confirm their lo

2025-07-23 19:27:11,363 - __main__ - INFO - 🤖 AI Response: No matching hotels found in San Francisco with both free parking and breakfast.
2025-07-23 19:27:11,364 - __main__ - INFO - ✅ Query completed successfully


[32;1m[1;3mAlright, let's break this down step by step. The user is asking for a hotel in San Francisco with both free parking and breakfast. I'll start by crafting a search query that includes the location and the amenities. The query should be "San Francisco free parking breakfast" to ensure the tool finds the right matches.

I'll use the search_vector_database tool with this query. Once the results come in, I need to carefully review each hotel's location. The tool might return hotels from other cities, so I'll filter those out. Only San Francisco-based hotels should be included in the final answer.

Looking at the results, I see five hotels listed. The first one is in Calistoga, which is outside San Francisco, so I'll exclude it. The second is in Llanaber, UK, which is definitely not San Francisco. The third is in Birmingham, UK, another non-match. The fourth is in San Francisco, so that's a keeper. The fifth is in Treflach, UK, which is also not the right location.

Only Hotel 4
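
The trace shows that semantic search alone does not constrain location: the tool returns hotels from several cities and the agent relies on the LLM to discard the wrong ones. A deterministic post-filter is one way to make that step less dependent on the model. The sketch below assumes each search result is available as a dictionary with a `city` field, which is an assumption about the tool's output and not something the notebook confirms. The hotel names are placeholders; the cities echo those mentioned in the trace.

```python
def filter_by_city(results, city: str):
    """Keep only results whose 'city' field matches the requested city."""
    wanted = city.strip().casefold()
    return [r for r in results if r.get("city", "").strip().casefold() == wanted]


# Hypothetical search results; names are placeholders.
results = [
    {"name": "Hotel 1", "city": "Calistoga"},
    {"name": "Hotel 2", "city": "Llanaber"},
    {"name": "Hotel 3", "city": "Birmingham"},
    {"name": "Hotel 4", "city": "San Francisco"},
    {"name": "Hotel 5", "city": "Treflach"},
]
print(filter_by_city(results, "san francisco"))
```

An exact match on the city name is strict by design; a suburb such as Courbevoie would be dropped for a Paris query unless the filter is loosened.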

## Test 2: Hotel Search in London

Search for hotels in London with free internet access.


In [7]:
result2 = run_hotel_query(
    "I need a hotel in London with free internet access",
    agent_executor,
    application_span
)


2025-07-23 19:27:11,379 - __main__ - INFO - 🔍 Hotel Query: I need a hotel in London with free internet access




> Entering new AgentExecutor chain...
Okay, so the user is asking for a hotel in London with free internet access. I need to figure out how to find that using the search_vector_database tool. First, I should understand what the tool does. It searches for hotels based on semantic similarity, so I need to phrase my query correctly.

I remember from the example that the search query should be in the format "[city] [ameniTY]". So, in this case, it should be "London free internet". That makes sense because it includes both the location and the desired amenities.

Next, I'll use the search_vector_database tool with the query "London free internet". The tool will return a list of hotels that match this criteria. I should wait for the tool's response, which will be the Observation step.

Once I have the results, I need to analyze them. I should check if all the hotels listed are actually located in London. If any are from other cities, I have to filter them out or mentio

2025-07-23 19:27:40,850 - __main__ - INFO - 🤖 AI Response: Here is the list of hotels found in London with free internet access:

1. **Chelsea House Hotel**  
   - Address: 96 Redcliffe Gdn, SW10 9HH, London  
   - Features: Free breakfast, Free internet, Free parking  
   - Review: "The breakfast is uninteresting, except the coffee, which is awful! But, the tea and the juice are good. It is close to many pubs, cafés and shops."  

If no other hotels match your criteria, this is the only hotel found in London with free internet access.
2025-07-23 19:27:40,851 - __main__ - INFO - ✅ Query completed successfully


[32;1m[1;3mAlright, let's break this down step by step. The user is asking for hotels in London with free internet access. I need to use the search_vector_database tool to find the best matches.

First, I'll craft a search query that includes both the location and the desired amenities. The tool works best with clear, concise terms, so "London free internet" seems appropriate.

Next, I'll use the tool with this query. The tool will return a list of hotels that match these criteria. I'll wait for the results to see which hotels are available.

Upon receiving the results, I notice that the first hotel listed is in Treflach, which is near Treflach, not London. That's a problem because the user specifically asked for London. I'll need to filter out any hotels not located in London.

Looking at the remaining hotels, the second one is in London, South Kensington. The third is in Calistoga, California, which is a different city altogether. The fourth is in Liverpool, and the fifth is in Kin

## Test 3: Hotel Search in Paris

Search for luxury hotels in Paris with specific amenities.


In [8]:
result3 = run_hotel_query(
    "Show me luxury hotels in Paris with spa and fitness facilities",
    agent_executor,
    application_span
)


2025-07-23 19:27:40,857 - __main__ - INFO - 🔍 Hotel Query: Show me luxury hotels in Paris with spa and fitness facilities




> Entering new AgentExecutor chain...
Okay, so the user is asking for luxury hotels in Paris with both a spa and fitness facilities. I need to figure out how to approach this using the search_vector_database tool.

First, I should understand the user's needs. They're looking for high-end accommodations in Paris, specifically those that have a spa and a fitness center. It's important to make sure the hotels are actually located in Paris and not in surrounding areas. 

I'll start by crafting a search query that includes the location and the amenities. The tool uses semantic similarity, so using "Paris luxury hotel spa fitness" should work. I need to make sure the query is clear and includes all the key terms the user is interested in.

Next, I'll use the search_vector_database tool with this query. The tool will return a list of matching hotels. I should then analyze these results to ensure they are all located in Paris. If any hotels are in different cities, I'll

2025-07-23 19:28:09,138 - __main__ - INFO - 🤖 AI Response: Here is a hotel in Paris that offers the amenities you are looking for:  
- **Fraser Suites Harmonie Paris La Défense**  
  - **Location:** Courbevoie, France (a suburb of Paris)  
  - **Amenities:** Spa and fitness center  

If you're looking for more options, you might try searching with different terms like "Paris luxury spa resort" or "Paris fitness center hotel."
2025-07-23 19:28:09,139 - __main__ - INFO - ✅ Query completed successfully


[32;1m[1;3mOkay, so I need to help the user find luxury hotels in Paris with both spa and fitness facilities. Let me start by understanding the user's request. They're looking for high-end accommodations in Paris, specifically those that have a spa and a fitness center. It's important to ensure that the hotels are actually located in Paris and not in other cities.

I'll use the search_vector_database tool with the query "Paris luxury hotel spa fitness." This should help find hotels that match the criteria. Once I get the results, I'll need to filter out any hotels that aren't in Paris. If there are no results, I'll inform the user accordingly.

After running the search, I see that the tool returned five hotels, but only one of them is in Paris. The other four are in different cities like Liverpool, Highland, Birmingham, and Palm Springs. Since the user specifically asked for Paris, I'll present only the Paris hotel.

The Paris hotel is the Fraser Suites Harmonie Paris La Défense in C

## Arize Phoenix Evaluation

This section demonstrates how to evaluate the hotel support agent using the Arize Phoenix observability platform. The evaluation includes:

- **Relevance Scoring**: Using Phoenix RelevanceEvaluator to score how relevant responses are to queries
- **QA Scoring**: Using Phoenix QAEvaluator to score answer quality
- **Hallucination Detection**: Using Phoenix HallucinationEvaluator to detect fabricated information  
- **Toxicity Detection**: Using Phoenix ToxicityEvaluator to detect harmful content
- **Phoenix UI**: Real-time observability dashboard at `http://localhost:6006/`

We'll run hotel search queries and evaluate the responses for quality and safety.
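
As a rough illustration of how label-constrained ("rails") evaluation works, the sketch below maps a judge model's free-text answer onto exactly one allowed label. The function `snap_to_label`, the `UNPARSABLE` sentinel, and the label sets are hypothetical names chosen for this example; this is not Phoenix's own implementation.

```python
from typing import Sequence

UNPARSABLE = "UNPARSABLE"  # hypothetical sentinel used only in this sketch


def snap_to_label(raw_output: str, rails: Sequence[str]) -> str:
    """Return the single allowed label found in raw_output, else UNPARSABLE."""
    text = raw_output.strip().lower()
    # Collect every allowed label that appears in the judge's answer.
    matches = [rail for rail in rails if rail.lower() in text]
    # Drop a label that only matched as part of a longer matched label,
    # e.g. "correct" inside "incorrect" or "toxic" inside "non-toxic".
    matches = [
        rail
        for rail in matches
        if not any(rail != other and rail.lower() in other.lower() for other in matches)
    ]
    return matches[0] if len(matches) == 1 else UNPARSABLE


qa_rails = ["correct", "incorrect"]  # illustrative label set
print(snap_to_label("The answer is correct.", qa_rails))
print(snap_to_label("incorrect", qa_rails))
print(snap_to_label("I cannot tell.", qa_rails))
```

Restricting the judge to a small label set keeps results easy to aggregate, at the cost of discarding answers that do not contain a recognizable label.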


In [9]:
# Import Phoenix evaluation components
try:
    import phoenix as px
    from phoenix.evals import (
        HALLUCINATION_PROMPT_RAILS_MAP,
        HALLUCINATION_PROMPT_TEMPLATE,
        QA_PROMPT_RAILS_MAP,
        QA_PROMPT_TEMPLATE,
        RAG_RELEVANCY_PROMPT_RAILS_MAP,
        RAG_RELEVANCY_PROMPT_TEMPLATE,
        TOXICITY_PROMPT_RAILS_MAP,
        TOXICITY_PROMPT_TEMPLATE,
        OpenAIModel,
        llm_classify,
    )
    import pandas as pd
    
    ARIZE_AVAILABLE = True
    logger.info("✅ Arize Phoenix evaluation components available")
except ImportError as e:
    logger.warning(f"Arize dependencies not available: {e}")
    logger.warning("Skipping evaluation section...")
    ARIZE_AVAILABLE = False

if ARIZE_AVAILABLE:
    # Start Phoenix session for observability
    try:
        px.launch_app(port=6006)
        logger.info("🚀 Phoenix UI available at http://localhost:6006/")
    except Exception as e:
        logger.warning(f"Could not start Phoenix UI: {e}")

    # Demo queries for evaluation
    hotel_demo_queries = [
        "Find me a hotel in San Francisco with free parking and breakfast",
        "I need a hotel in London with free internet access"
    ]
    
    # Run demo queries and collect responses for evaluation
    hotel_demo_results = []
    
    for i, query in enumerate(hotel_demo_queries, 1):
        try:
            logger.info(f"🔍 Running evaluation query {i}: {query}")
            
            # Run the agent
            response = agent_executor.invoke({"input": query})
            output = response.get("output", "No response generated")
    
            hotel_demo_results.append({
                "query": query,
                "response": output,
                "query_type": f"hotel_demo_{i}",
                "success": True
            })
            
            logger.info(f"✅ Query {i} completed successfully")
    
        except Exception as e:
            logger.exception(f"❌ Query {i} failed: {e}")
            hotel_demo_results.append({
                "query": query,
                "response": f"Error: {e!s}",
                "query_type": f"hotel_demo_{i}",
                "success": False
            })
    
    # Convert to DataFrame for evaluation
    hotel_results_df = pd.DataFrame(hotel_demo_results)
    logger.info(f"📊 Collected {len(hotel_results_df)} responses for evaluation")
    
    # Display results summary
    for _, row in hotel_results_df.iterrows():
        logger.info(f"Query: {row['query']}")
        logger.info(f"Response: {row['response'][:200]}...")
        logger.info(f"Success: {row['success']}")
        logger.info("-" * 50)
    
    logger.info("💡 Visit Phoenix UI at http://localhost:6006/ to see detailed traces and evaluations")
    logger.info("💡 Use the evaluation script at evals/eval_arize.py for comprehensive evaluation")

else:
    logger.info("Arize evaluation not available - install phoenix-evals to enable evaluation")


2025-07-23 19:28:09,153 - __main__ - INFO - Arize evaluation not available - install phoenix-evals to enable evaluation
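
The log line above shows that the evaluation was skipped in this run because the imports failed. A minimal sketch for checking which modules are missing before running the cell is given below; the helper name `missing_modules` is hypothetical, and the module names are the ones imported in the cell above.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "phoenix" for "phoenix.evals")
            # is absent or has no usable spec.
            found = False
        if not found:
            missing.append(name)
    return missing


# Module names taken from the import cell above.
print(missing_modules(["phoenix", "phoenix.evals", "pandas"]))
```

An empty list means the imports in the evaluation cell should at least be resolvable; it does not guarantee that every imported name exists in the installed version.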


In [10]:
if ARIZE_AVAILABLE and len(hotel_demo_results) > 0:
    logger.info("🔍 Running comprehensive Phoenix evaluations...")
    
    # Setup evaluator LLM (using OpenAI for consistency)
    evaluator_llm = OpenAIModel(model="gpt-4o", temperature=0.1)
    
    # Prepare evaluation data with proper column names for Phoenix evaluators
    hotel_eval_data = []
    for _, row in hotel_results_df.iterrows():
        hotel_eval_data.append({
            "input": row["query"],
            "output": row["response"],
            "reference": "A helpful and accurate response about hotels with specific hotel information and amenities",
            "text": row["response"]  # For toxicity evaluation
        })
    
    hotel_eval_df = pd.DataFrame(hotel_eval_data)
    
    try:
        # 1. Relevance Evaluation
        logger.info("🔍 Running Relevance Evaluation...")
        hotel_relevance_results = llm_classify(
            # Judge the agent's response by passing it as the template's reference text
            data=hotel_eval_df[["input", "output"]].rename(columns={"output": "reference"}),
            model=evaluator_llm,
            template=RAG_RELEVANCY_PROMPT_TEMPLATE,
            rails=list(RAG_RELEVANCY_PROMPT_RAILS_MAP.values()),
            provide_explanation=True  # Also return the judge's reasoning
        )
        
        logger.info("✅ Relevance Evaluation Results:")
        # Read labels and explanations, falling back to defaults if a column is missing
        if hasattr(hotel_relevance_results, "columns"):
            relevance_labels = hotel_relevance_results["label"].tolist() if "label" in hotel_relevance_results.columns else ["unknown"] * len(hotel_eval_data)
            relevance_explanations = hotel_relevance_results["explanation"].tolist() if "explanation" in hotel_relevance_results.columns else ["No explanation"] * len(hotel_eval_data)
            
            for i, (label, explanation) in enumerate(zip(relevance_labels, relevance_explanations)):
                query = hotel_eval_data[i]["input"]
                logger.info(f"   Query: {query}")
                logger.info(f"   Relevance: {label}")
                logger.info(f"   Explanation: {explanation}")
                logger.info("   " + "-"*30)
        
        # 2. QA Evaluation
        logger.info("🔍 Running QA Evaluation...")
        hotel_qa_results = llm_classify(
            data=hotel_eval_df[["input", "output", "reference"]],
            model=evaluator_llm,
            template=QA_PROMPT_TEMPLATE,
            rails=list(QA_PROMPT_RAILS_MAP.values()),
            provide_explanation=True
        )
        
        logger.info("✅ QA Evaluation Results:")
        if hasattr(hotel_qa_results, "columns"):
            qa_labels = hotel_qa_results["label"].tolist() if "label" in hotel_qa_results.columns else ["unknown"] * len(hotel_eval_data)
            qa_explanations = hotel_qa_results["explanation"].tolist() if "explanation" in hotel_qa_results.columns else ["No explanation"] * len(hotel_eval_data)
            
            for i, (label, explanation) in enumerate(zip(qa_labels, qa_explanations)):
                query = hotel_eval_data[i]["input"]
                logger.info(f"   Query: {query}")
                logger.info(f"   QA Score: {label}")
                logger.info(f"   Explanation: {explanation}")
                logger.info("   " + "-"*30)
        
        # 3. Hallucination Evaluation
        logger.info("🔍 Running Hallucination Evaluation...")
        hotel_hallucination_results = llm_classify(
            data=hotel_eval_df[["input", "reference", "output"]],
            model=evaluator_llm,
            template=HALLUCINATION_PROMPT_TEMPLATE,
            rails=list(HALLUCINATION_PROMPT_RAILS_MAP.values()),
            provide_explanation=True
        )
        
        logger.info("✅ Hallucination Evaluation Results:")
        if hasattr(hotel_hallucination_results, "columns"):
            hallucination_labels = hotel_hallucination_results["label"].tolist() if "label" in hotel_hallucination_results.columns else ["unknown"] * len(hotel_eval_data)
            hallucination_explanations = hotel_hallucination_results["explanation"].tolist() if "explanation" in hotel_hallucination_results.columns else ["No explanation"] * len(hotel_eval_data)
            
            for i, (label, explanation) in enumerate(zip(hallucination_labels, hallucination_explanations)):
                query = hotel_eval_data[i]["input"]
                logger.info(f"   Query: {query}")
                logger.info(f"   Hallucination: {label}")
                logger.info(f"   Explanation: {explanation}")
                logger.info("   " + "-"*30)
        
        # 4. Toxicity Evaluation
        logger.info("🔍 Running Toxicity Evaluation...")
        hotel_toxicity_results = llm_classify(
            data=hotel_eval_df[["text"]],
            model=evaluator_llm,
            template=TOXICITY_PROMPT_TEMPLATE,
            rails=list(TOXICITY_PROMPT_RAILS_MAP.values()),
            provide_explanation=True
        )
        
        logger.info("✅ Toxicity Evaluation Results:")
        if hasattr(hotel_toxicity_results, "columns"):
            toxicity_labels = hotel_toxicity_results["label"].tolist() if "label" in hotel_toxicity_results.columns else ["unknown"] * len(hotel_eval_data)
            toxicity_explanations = hotel_toxicity_results["explanation"].tolist() if "explanation" in hotel_toxicity_results.columns else ["No explanation"] * len(hotel_eval_data)
            
            for i, (label, explanation) in enumerate(zip(toxicity_labels, toxicity_explanations)):
                query = hotel_eval_data[i]["input"]
                logger.info(f"   Query: {query}")
                logger.info(f"   Toxicity: {label}")
                logger.info(f"   Explanation: {explanation}")
                logger.info("   " + "-"*30)
        
        # Summary of all evaluations
        logger.info("📊 EVALUATION SUMMARY")
        logger.info("=" * 50)
        
        for i, query in enumerate([item["input"] for item in hotel_eval_data]):
            logger.info(f"Query {i+1}: {query}")
            logger.info(f"  Relevance: {relevance_labels[i] if 'relevance_labels' in locals() else 'N/A'}")
            logger.info(f"  QA Score: {qa_labels[i] if 'qa_labels' in locals() else 'N/A'}")
            logger.info(f"  Hallucination: {hallucination_labels[i] if 'hallucination_labels' in locals() else 'N/A'}")
            logger.info(f"  Toxicity: {toxicity_labels[i] if 'toxicity_labels' in locals() else 'N/A'}")
            logger.info("  " + "-"*40)
        
        logger.info("✅ All Phoenix evaluations completed successfully!")
        
    except Exception as e:
        logger.exception(f"❌ Phoenix evaluation failed: {e}")
        logger.info("💡 This might be due to API rate limits or model availability")
        
else:
    if not ARIZE_AVAILABLE:
        logger.info("❌ Phoenix evaluations skipped - Arize dependencies not available")
    else:
        logger.info("❌ Phoenix evaluations skipped - No demo results to evaluate")

2025-07-23 19:28:09,204 - __main__ - INFO - ❌ Phoenix evaluations skipped - Arize dependencies not available
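
The evaluation cell logs one label per query and metric. When more queries are added, an aggregate view can be easier to read. The following is a minimal sketch under that assumption; `summarize_labels` and the sample labels are hypothetical and are not results produced by this notebook.

```python
from collections import Counter
from typing import Dict, List


def summarize_labels(labels_by_metric: Dict[str, List[str]]) -> Dict[str, Dict[str, int]]:
    """Count how often each label occurs for every metric."""
    return {
        metric: dict(Counter(labels))
        for metric, labels in labels_by_metric.items()
    }


# Made-up labels for illustration only.
example = {
    "qa": ["correct", "correct", "incorrect"],
    "toxicity": ["non-toxic", "non-toxic", "non-toxic"],
}
for metric, counts in summarize_labels(example).items():
    print(metric, counts)
```

In the notebook, the input dictionary could be built from lists such as `qa_labels` and `toxicity_labels` once the evaluations have run.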


## Summary

This notebook demonstrates a complete hotel support agent implementation using:

1. **Agent Catalog Integration**: Using agentc to find tools and prompts
2. **LangChain Framework**: ReAct agent pattern with Agent Catalog callbacks
3. **Couchbase Vector Store**: Storing and searching hotel data from travel-sample bucket
4. **Single Tool Architecture**: Focused on `search_vector_database` for hotel search
5. **Comprehensive Evaluation**: Phoenix-based evaluation with multiple metrics

The agent can handle various hotel-related queries including:
- Hotel search by location and amenities
- Finding hotels with specific features (parking, internet, spa, etc.)
- Luxury hotel searches with premium amenities
- Location-based hotel recommendations

## Phoenix Evaluation Metrics

The notebook demonstrates all four key Phoenix evaluation types:

1. **Relevance Evaluation**: Measures how relevant responses are to user queries
2. **QA Evaluation**: Assesses the quality and accuracy of answers
3. **Hallucination Detection**: Identifies fabricated or incorrect information
4. **Toxicity Detection**: Screens for harmful or inappropriate content

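Each evaluation returns, for every query:
- A categorical label drawn from the template's rails (e.g., "correct"/"incorrect")
- A free-text explanation of the judge model's reasoning, because `provide_explanation=True` is set

The code in this notebook reads only the `label` and `explanation` columns; it does not compute confidence scores.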
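
The evaluation cell repeats the same "read a column or fall back to a default" logic for each metric. One way to factor that out is sketched below. It assumes the results have been converted to plain dictionaries first (for example with `DataFrame.to_dict("records")`); the helper name `column_or_default` is hypothetical.

```python
from typing import Any, Dict, List


def column_or_default(records: List[Dict[str, Any]], key: str, default: Any) -> List[Any]:
    """Return records[i][key] for every record, using default when it is absent or None."""
    values = []
    for record in records:
        value = record.get(key)
        values.append(default if value is None else value)
    return values


# Made-up records for illustration only.
records = [
    {"label": "correct", "explanation": "Matches the request."},
    {"label": "incorrect"},  # explanation missing
]
print(column_or_default(records, "label", "unknown"))
print(column_or_default(records, "explanation", "No explanation"))
```

Note that this sketch treats only `None` as missing; values that pandas represents as `NaN` would need an extra check.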

## Key Features

This hotel support agent implementation:
- **Uses travel-sample bucket**: Leverages existing Couchbase sample data
- **Capella AI integration**: Supports both embeddings and LLM through Capella AI
- **Fallback to OpenAI**: Graceful fallback when Capella AI is unavailable
- **Single focused tool**: Simplified architecture with one search tool
- **Comprehensive evaluation**: Full Phoenix evaluation pipeline
- **Agent Catalog callbacks**: Integrated observability and tracing

## Data Source

The agent uses hotel data from the `travel-sample.inventory.hotel` collection, which contains:
- Real hotel information with names, locations, and amenities
- Structured data with address, city, country, and facilities
- Rich text descriptions suitable for vector embedding
- Hotels in several countries (the sample results in this notebook include properties in France, the United Kingdom, and the United States)
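
To make "suitable for vector embedding" concrete, the sketch below flattens one hotel document into a single text string that could be passed to an embedding model. The helper `hotel_to_text`, the `AMENITY_FLAGS` mapping, and the field names are assumptions about the document schema made for illustration; the notebook's actual loading code may build its text differently.

```python
from typing import Any, Dict

# Assumed boolean fields on a hotel document, mapped to readable phrases.
AMENITY_FLAGS = {
    "free_breakfast": "free breakfast",
    "free_internet": "free internet",
    "free_parking": "free parking",
}


def hotel_to_text(hotel: Dict[str, Any]) -> str:
    """Join the name, location, amenities, and description into one string."""
    parts = []
    if hotel.get("name"):
        parts.append(str(hotel["name"]))
    location = ", ".join(str(hotel[k]) for k in ("city", "country") if hotel.get(k))
    if location:
        parts.append(f"Location: {location}")
    amenities = [label for key, label in AMENITY_FLAGS.items() if hotel.get(key)]
    if amenities:
        parts.append("Amenities: " + ", ".join(amenities))
    if hotel.get("description"):
        parts.append(str(hotel["description"]))
    return ". ".join(parts)


# Made-up document for illustration only.
sample = {
    "name": "Example Inn",
    "city": "Paris",
    "country": "France",
    "free_parking": True,
    "free_internet": False,
    "description": "Quiet rooms near the river",
}
print(hotel_to_text(sample))
```

Putting the location and amenities into the embedded text is what lets a query such as "Paris hotel with free parking" match on more than the free-form description.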

For production use, consider:
- Setting up proper monitoring with Arize Phoenix
- Implementing comprehensive evaluation pipelines
- Adding error handling and retry logic
- Scaling the vector store for larger datasets
- Adding more sophisticated query understanding
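
The retry item above can be sketched with exponential backoff. This is a minimal example under stated assumptions: `with_retries` and its parameters are hypothetical names, and a production version would usually catch only transient errors (such as rate limits) rather than every exception.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on failure and doubling the delay after each attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

In this notebook, a call could look like `with_retries(lambda: agent_executor.invoke({"input": query}))`. The `sleep` parameter is injectable so the backoff schedule can be checked without actually waiting.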

## Usage Instructions

To run this notebook:
1. Set up the required environment variables (Couchbase connection, API keys)
2. Install dependencies: `pip install -r requirements.txt`
3. Ensure travel-sample bucket is available in your Couchbase cluster
4. Publish your agent catalog: `agentc index . && agentc publish`
5. Run the notebook cells sequentially

The agent will automatically load hotel data from travel-sample and create embeddings for semantic search capabilities.
