<a href="https://colab.research.google.com/github/KaifAhmad1/code-test/blob/main/CommVersion_Assignment_Mohd_Kaif.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### **RealtyFlow AI: Real Estate Lead Qualification System**

A complete implementation of a conversational AI agent for real estate businesses
that qualifies leads by collecting necessary information based on whether they
want to buy or sell property.

**RealtyFlow AI** is an intelligent chatbot designed to streamline initial customer interactions for real estate businesses. Built using LangGraph for state management, Google Gemini for natural language understanding and generation, and FAISS for efficient postcode similarity searching, it guides users through a predefined decision tree to understand their intent (buy/sell), budget, and location preferences.

The chatbot automates lead qualification, provides instant responses 24/7, and ensures consistent information gathering, ultimately enhancing agent productivity and improving the customer experience.

### **Core Functionality & Flow**

The chatbot operates based on a structured decision tree:

1.  **Initial Greeting & Intent Capture:**
    *   Bot: "Home. How may I help you? (e.g., 'I want to buy', 'I'm looking to sell')"
    *   User provides input (e.g., "I want to buy a house").
    *   **Gemini (Intent Classifier):** Determines if the user wants to "buy" or "sell".

2.  **Contact Information Gathering:**
    *   Bot: "Great. Can I get your name?" -> User provides name.
    *   Bot: "Can I get your phone number?" -> User provides phone.
    *   Bot: "Can I get your email address?" -> User provides email.

3.  **Path Divergence (Buy vs. Sell):**

    *   **If Intent is "Buy":**
        *   Bot: "Are you looking for a new home or a re-sale home?"
        *   **Gemini (Buy Type Classifier):** Determines "new\_home" or "re\_sale".
        *   Bot: "What is your budget?"
        *   **Budget Processor Tool:** Parses the budget amount.
            *   **Rule (New Home):** If budget < £1,000,000 for a "new\_home", the bot informs the user about the minimum budget and ends the interaction with a referral to call the office.
        *   Bot (if budget is sufficient or re-sale): "Can I know the postcode of your location of interest?"
        *   **Postcode Processor Tool (Exact Match & FAISS):**
            *   Checks if the normalized postcode is in the pre-approved list.
            *   If not an exact match, FAISS suggests similar valid postcodes.
            *   **Rule (New Home, Postcode Not Covered):** Informs the user and ends with a referral.
            *   **Rule (Postcode Covered or Re-sale/Sell with Uncovered Postcode):** Proceeds to reassistance.

    *   **If Intent is "Sell":**
        *   (After contact info) Bot: "What is your postcode?"
        *   **Postcode Processor Tool (Exact Match & FAISS):**
            *   Checks eligibility. FAISS suggests similar if no exact match.
            *   Proceeds to reassistance regardless of coverage (as per flowchart, only a message change).

4.  **Outcome & Reassistance:**

    *   **If Postcode Covered (for any valid path):**
        *   Bot: "Great! That postcode is covered. You can expect someone to get in touch with you within 24 hours via phone or email. Is there anything else I can help you with? (yes/no)"
    *   **If Postcode NOT Covered (for Sell or Re-sale Buy path):**
        *   Bot: "Sorry, we don't cater to the postcode '{postcode}' that you provided. (Did you perhaps mean {suggestion}?) Please call the office on {phone\_number} to get help. Is there anything else I can help you with? (yes/no)"

5.  **Handling Reassistance Choice:**
    *   **Gemini (Yes/No Classifier):** Determines if the user said "yes" or "no".
    *   If "yes": Bot: "Okay, let's start over." (Restarts the flow).
    *   If "no": Bot: "Thank you for chatting with us. Good bye." (Ends conversation).
    *   If unclear: Re-prompts for yes/no.

6.  **Error/Max Attempts Handling:**
    *   If the bot fails to understand an input after 3 attempts, it politely ends the conversation and refers the user to call the office.
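
The branching rules listed above can be condensed into a single pure function. The sketch below is illustrative only: `qualify_lead` and its outcome labels are hypothetical names that do not appear in the notebook, and it assumes every question on the path has already been answered. The £1,000,000 threshold is the one stated in step 3.

```python
from typing import Optional

MIN_BUDGET_NEW_HOME = 1_000_000  # minimum new-home budget from step 3


def qualify_lead(intent: str, buy_type: Optional[str],
                 budget: Optional[float], postcode_covered: bool) -> str:
    """Return the outcome the decision tree reaches (labels are hypothetical)."""
    if intent == "buy" and buy_type == "new_home":
        if budget is None:
            raise ValueError("budget is required on the new-home path")
        # New-home buyers are screened on budget first, then on coverage.
        if budget < MIN_BUDGET_NEW_HOME or not postcode_covered:
            return "end_with_referral"
        return "reassistance_covered"
    # Re-sale buyers and sellers always reach the reassistance prompt;
    # only the message changes when the postcode is not covered.
    return "reassistance_covered" if postcode_covered else "reassistance_not_covered"


print(qualify_lead("buy", "new_home", 900_000, True))
print(qualify_lead("sell", None, None, False))
```

Keeping the routing in one side-effect-free function like this makes each branch of the flowchart easy to unit test without calling the LLM.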

### **Technical Stack & Components**

*   **Orchestration:** `LangGraph` (manages the conversational state and flow between nodes).
*   **Language Model (LLM):** `Google Gemini` (via `ChatGoogleGenerativeAI` from `langchain-google-genai`) for:
    *   Intent classification (buy/sell, new\_home/re\_sale, yes/no).
    *   Generating conversational responses.
*   **Embeddings:** `GoogleGenerativeAIEmbeddings` (model: `models/embedding-001`) for converting postcodes into vector representations.
*   **Vector Search:** `FAISS` (Facebook AI Similarity Search) for:
    *   Building an index of eligible postcode embeddings.
    *   Finding the most similar eligible postcodes if a user's input isn't an exact match (useful for typo correction/suggestions).
*   **Data Handling:** `pandas` for loading the initial list of eligible postcodes.
*   **Core Logic:** Python functions and classes acting as "tools" or "agents":
    *   `EnhancedIntentClassifierAgent`: Wraps Gemini calls for classification tasks.
    *   `EnhancedBudgetProcessorAgent`: Parses budget strings into numerical values.
    *   `EnhancedPostcodeProcessorAgent`: Handles exact postcode validation and FAISS-based similarity search.
*   **State Management:** A `TypedDict` (`ChatState`) holds all relevant information during the conversation (history, extracted details, pending actions).

### **Usefulness & Problem Solving**

**RealtyFlow AI** is particularly useful for:

*   **Real Estate Agencies & Property Developers:** To automate initial customer engagement and lead qualification.

**In essence, RealtyFlow AI streamlines the top of the sales funnel, enhances agent productivity, improves customer experience, and helps reduce operational costs by acting as an intelligent, automated front-desk for real estate businesses.**

In [1]:
!pip install -q langchain langgraph langchain-google-genai google-generativeai pandas faiss-cpu tiktoken python-dotenv

In [14]:
import os

# Read the Google API key from the environment instead of hardcoding a secret in the notebook
GOOGLE_API_KEY = os.environ.get("GOOGLE_API_KEY", "")

import re
import json
import logging
from enum import Enum
from typing import Dict, List, Optional, Literal, Set, Tuple, Any, Union, TypedDict
import pandas as pd
import numpy as np
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser
from langgraph.graph import StateGraph, END
import faiss

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger("RealtyFlow")

# Constants
MIN_BUDGET_NEW_HOME = 1_000_000
COMPANY_PHONE_NUMBER = "1800 111 222"
MAX_ATTEMPTS = 3
POSTCODE_FILE = "uk_postcodes.csv"

# Enums for better type safety and readability
class Intent(str, Enum):
    BUY = "buy"
    SELL = "sell"
    UNKNOWN = "unknown"

class BuyType(str, Enum):
    NEW_HOME = "new_home"
    RE_SALE = "re_sale"
    UNKNOWN = "unknown"

class YesNo(str, Enum):
    YES = "yes"
    NO = "no"
    UNKNOWN = "unknown"

class ConversationStage(str, Enum):
    GREETING = "greeting"
    INTENT = "intent"
    NAME = "name"
    PHONE = "phone"
    EMAIL = "email"
    BUY_TYPE = "buy_type"
    BUDGET = "budget"
    POSTCODE = "postcode"
    REASSISTANCE = "reassistance"
    ENDED = "ended"

In [15]:
# Enhanced State Definition
class ChatState(TypedDict):
    messages: List[Union[AIMessage, HumanMessage, SystemMessage]]
    intent: Optional[Intent]
    buy_type: Optional[BuyType]
    name: Optional[str]
    phone: Optional[str]
    email: Optional[str]
    budget: Optional[float]
    postcode: Optional[str]
    postcode_covered: Optional[bool]
    suggested_postcode: Optional[str]
    pending_action: Optional[str]
    attempts: int
    conversation_ended: bool
    conversation_stage: ConversationStage
    last_error: Optional[str]
    session_id: str
    interaction_history: List[Dict[str, Any]]

In [16]:
# Utility Functions
def normalize_postcode(postcode: str) -> str:
    """Standardize postcode format by uppercasing and removing every non-alphanumeric character."""
    if not postcode:
        return ""
    return re.sub(r'[^A-Z0-9]', '', postcode.upper())

def load_eligible_postcodes(file_path: str) -> Tuple[Set[str], List[str]]:
    """Load postcodes from file or use sample data if file not found."""
    # Sample data must already be normalized (no spaces) so set lookups can match
    sample_postcodes = ["SW1A1AA", "SW1A2AA", "W1A1AA", "E1W3SS", "B153TR"]
    try:
        df = pd.read_csv(file_path)
        postcode_col = next((col for col in df.columns if col.lower() == "postcode"), df.columns[0])
        postcodes = [normalize_postcode(str(pc)) for pc in df[postcode_col] if pd.notna(pc)]
        logger.info(f"Loaded {len(postcodes)} postcodes from {file_path}")
        return set(postcodes), postcodes
    except Exception as e:
        logger.warning(f"Error loading postcodes from {file_path}: {e}. Using sample data.")
        return set(sample_postcodes), sample_postcodes

def log_interaction(state: ChatState, action: str, input_data: Optional[str] = None, output_data: Optional[str] = None):
    """Log interaction details for analytics and debugging."""
    interaction = {
        "timestamp": pd.Timestamp.now().isoformat(),
        "session_id": state["session_id"],
        "stage": state["conversation_stage"],
        "action": action,
        "attempts": state["attempts"],
        "input": input_data,
        "output": output_data
    }
    state["interaction_history"].append(interaction)
    return state
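
Because the eligible set stores normalized keys, every lookup has to pass through the same normalization. The short example below shows differently typed variants collapsing to one key. `normalize_postcode` is copied from the cell above so the snippet runs on its own; the sample inputs are made up for illustration.

```python
import re


def normalize_postcode(postcode: str) -> str:
    """Uppercase the input and drop every character that is not A-Z or 0-9."""
    if not postcode:
        return ""
    return re.sub(r"[^A-Z0-9]", "", postcode.upper())


# Three spellings of the same postcode produce a single lookup key.
variants = ["sw1a 1aa", "SW1A-1AA", " Sw1a1aa "]
keys = {normalize_postcode(v) for v in variants}
print(keys)  # {'SW1A1AA'}
```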

In [17]:
# Agent 1: Enhanced Intent Classifier Agent with Better Prompt Engineering
class EnhancedIntentClassifierAgent:
    def __init__(self, llm):
        self.llm = llm
        self.logger = logging.getLogger("IntentClassifier")

    def _build_classification_chain(self, allowed_options: List[str], description: str, examples: List[Dict[str, str]]):
        """Build a classification chain with robust prompt engineering including examples."""
        examples_text = "\n".join([f"User message: '{e['input']}' → Classification: {e['output']}" for e in examples])

        prompt_template = f"""# Task: {description}

## Allowed Options: {', '.join(allowed_options)}

## Instructions:
- Analyze the user message carefully
- Choose EXACTLY ONE option from the allowed options
- Return ONLY the classification without any explanation or additional text
- If unsure, choose the closest match based on intent

## Examples:
{examples_text}

## Input:
User message: {{user_message}}

## Output (classification ONLY):"""

        prompt = ChatPromptTemplate.from_template(prompt_template)
        return prompt | self.llm | StrOutputParser()

    def classify_intent(self, message: str) -> Intent:
        """Classify if the user wants to buy or sell a property."""
        examples = [
            {"input": "I'm interested in purchasing a house", "output": "buy"},
            {"input": "Looking to buy a property in London", "output": "buy"},
            {"input": "Want to sell my flat", "output": "sell"},
            {"input": "I need to put my house on the market", "output": "sell"},
            {"input": "I want to list my property", "output": "sell"}
        ]

        chain = self._build_classification_chain(
            [Intent.BUY, Intent.SELL],
            "Determine if the user wants to buy or sell a property.",
            examples
        )

        try:
            res = chain.invoke({"user_message": message}).strip().lower()
            self.logger.info(f"Intent classification: {message} -> {res}")
            return Intent(res) if res in [Intent.BUY, Intent.SELL] else Intent.UNKNOWN
        except Exception as e:
            self.logger.error(f"Intent classification error: {e}")
            return Intent.UNKNOWN

    def classify_buy_type(self, message: str) -> BuyType:
        """Classify if the buyer wants a new home or a re-sale property."""
        examples = [
            {"input": "I want a brand new house", "output": "new_home"},
            {"input": "Looking for new constructions", "output": "new_home"},
            {"input": "I'm interested in existing properties", "output": "re_sale"},
            {"input": "Something that's already built", "output": "re_sale"},
            {"input": "Second-hand home", "output": "re_sale"}
        ]

        chain = self._build_classification_chain(
            [BuyType.NEW_HOME, BuyType.RE_SALE],
            "Determine if the buyer wants a new home or a re-sale property.",
            examples
        )

        try:
            res = chain.invoke({"user_message": message}).strip().lower()
            self.logger.info(f"Buy type classification: {message} -> {res}")
            return BuyType(res) if res in [BuyType.NEW_HOME, BuyType.RE_SALE] else BuyType.UNKNOWN
        except Exception as e:
            self.logger.error(f"Buy type classification error: {e}")
            return BuyType.UNKNOWN

    def classify_yes_no(self, message: str) -> YesNo:
        """Classify if the answer is yes or no."""
        examples = [
            {"input": "Yes please", "output": "yes"},
            {"input": "Sure", "output": "yes"},
            {"input": "No thanks", "output": "no"},
            {"input": "Not interested", "output": "no"},
            {"input": "I don't think so", "output": "no"}
        ]

        chain = self._build_classification_chain(
            [YesNo.YES, YesNo.NO],
            "Determine if the answer is yes or no.",
            examples
        )

        try:
            res = chain.invoke({"user_message": message}).strip().lower()
            self.logger.info(f"Yes/No classification: {message} -> {res}")
            return YesNo(res) if res in [YesNo.YES, YesNo.NO] else YesNo.UNKNOWN
        except Exception as e:
            self.logger.error(f"Yes/No classification error: {e}")
            return YesNo.UNKNOWN
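
The three classifiers accept a reply only when it equals an allowed label after `strip().lower()`. If the model decorates its answer with punctuation or backticks, that comparison falls through to `UNKNOWN`. The helper below is a hypothetical sketch of a more tolerant cleanup step; `clean_label` is not part of the notebook, and whether Gemini actually returns decorated labels is an assumption.

```python
import re
from typing import Iterable


def clean_label(raw: str, allowed: Iterable[str], default: str = "unknown") -> str:
    """Map a raw model reply onto one of the allowed labels, else `default`."""
    # Lowercase, then keep only letters and underscores so "Buy." or "`re_sale`" still match.
    candidate = re.sub(r"[^a-z_]", "", raw.lower())
    return candidate if candidate in set(allowed) else default


print(clean_label(" Buy.\n", ["buy", "sell"]))
print(clean_label("I think they want to buy", ["buy", "sell"]))
```

A full sentence still maps to the default, so the retry logic driven by `attempts` keeps working as before.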

In [18]:
# Agent 2: Enhanced Information Gatherer Agent with Input Validation
class EnhancedInfoGathererAgent:
    def __init__(self, llm):
        self.llm = llm
        self.logger = logging.getLogger("InfoGatherer")

    def _validate_with_llm(self, field: str, value: str) -> Tuple[bool, str]:
        """Use LLM to validate user input with explanations."""
        # Plain (non-f) template: {field} and {value} are filled in by PromptTemplate,
        # and the doubled braces keep the JSON example literal.
        template = """# Validation Task: Check if the provided {field} is valid

## Input:
{value}

## Instructions:
- Carefully check if the input is a valid {field}
- Return a JSON object with two fields:
  - "is_valid": true/false
  - "reason": brief explanation of why it's valid or invalid

## Format:
{{
  "is_valid": boolean,
  "reason": "string"
}}"""

        prompt = PromptTemplate.from_template(template)
        chain = prompt | self.llm

        try:
            response = chain.invoke({"field": field, "value": value})
            # Extract JSON from the response
            json_match = re.search(r'({[\s\S]*})', response.content)
            if json_match:
                result = json.loads(json_match.group(1))
                return result.get("is_valid", False), result.get("reason", "Invalid input")
            return False, "Unable to validate input"
        except Exception as e:
            self.logger.error(f"Validation error for {field}: {e}")
            return False, f"Error validating {field}"

    def get_name(self, state: ChatState) -> ChatState:
        """Process and validate user's name."""
        user_msg = state["messages"][-1].content.strip()

        if not user_msg or len(user_msg) < 2:
            state["attempts"] += 1
            state["messages"].append(AIMessage(content="Please provide a valid name."))
            return log_interaction(state, "name_invalid", user_msg, "Invalid name")

        is_valid, reason = self._validate_with_llm("name", user_msg)

        if not is_valid:
            state["attempts"] += 1
            state["messages"].append(AIMessage(content=f"Please provide a valid name. {reason}"))
            return log_interaction(state, "name_invalid", user_msg, reason)

        state["name"] = user_msg
        state["messages"].append(AIMessage(content=f"Thanks {user_msg}! What is your phone number?"))
        state["pending_action"] = "get_phone"
        state["conversation_stage"] = ConversationStage.PHONE
        state["attempts"] = 0
        return log_interaction(state, "name_valid", user_msg, "Name accepted")

    def get_phone(self, state: ChatState) -> ChatState:
        """Process and validate user's phone number."""
        user_msg = state["messages"][-1].content.strip()

        # Basic validation: count digits only, so spaced or dashed numbers are accepted
        if len(re.sub(r'\D', '', user_msg)) < 7:
            state["attempts"] += 1
            state["messages"].append(AIMessage(content="Please enter a valid phone number with at least 7 digits."))
            return log_interaction(state, "phone_invalid", user_msg, "Invalid phone format")

        is_valid, reason = self._validate_with_llm("phone number", user_msg)

        if not is_valid:
            state["attempts"] += 1
            state["messages"].append(AIMessage(content=f"Please enter a valid phone number. {reason}"))
            return log_interaction(state, "phone_invalid", user_msg, reason)

        state["phone"] = user_msg
        state["messages"].append(AIMessage(content="Please share your email address."))
        state["pending_action"] = "get_email"
        state["conversation_stage"] = ConversationStage.EMAIL
        state["attempts"] = 0
        return log_interaction(state, "phone_valid", user_msg, "Phone accepted")

    def get_email(self, state: ChatState) -> ChatState:
        """Process and validate user's email address."""
        user_msg = state["messages"][-1].content.strip()

        # Basic email validation
        if "@" not in user_msg or "." not in user_msg.split("@")[-1]:
            state["attempts"] += 1
            state["messages"].append(AIMessage(content="Please provide a valid email address."))
            return log_interaction(state, "email_invalid", user_msg, "Invalid email format")

        is_valid, reason = self._validate_with_llm("email address", user_msg)

        if not is_valid:
            state["attempts"] += 1
            state["messages"].append(AIMessage(content=f"Please provide a valid email address. {reason}"))
            return log_interaction(state, "email_invalid", user_msg, reason)

        state["email"] = user_msg

        if state.get("intent") == Intent.BUY:
            state["messages"].append(AIMessage(content="Are you looking for a new home or a re-sale?"))
            state["pending_action"] = "get_buy_type"
            state["conversation_stage"] = ConversationStage.BUY_TYPE
        else:
            state["messages"].append(AIMessage(content="Please provide the postcode of the property you want to sell."))
            state["pending_action"] = "get_postcode"
            state["conversation_stage"] = ConversationStage.POSTCODE

        state["attempts"] = 0
        return log_interaction(state, "email_valid", user_msg, "Email accepted")
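
The validation helper pulls a JSON object out of free-form model output with a greedy brace-matching regex. A standalone sketch of that step is shown below, with the parsing failure handled explicitly. `extract_json_object` is a hypothetical name, and the sample reply is invented rather than a real model response.

```python
import json
import re
from typing import Any, Dict, Optional


def extract_json_object(reply: str) -> Optional[Dict[str, Any]]:
    """Parse the span from the first '{' to the last '}' as JSON, or return None."""
    match = re.search(r"\{[\s\S]*\}", reply)
    if match is None:
        return None
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


reply = 'Sure, here is the result: {"is_valid": true, "reason": "Looks like a real name"} Hope that helps.'
result = extract_json_object(reply)
print(result)
```

Because the pattern is greedy, a reply containing two separate JSON objects would be captured as one span and fail to parse, in which case the function returns `None`.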

In [19]:
# Agent 3: Enhanced Budget Processor Agent with Better Parsing
class EnhancedBudgetProcessorAgent:
    def __init__(self, llm):
        self.llm = llm
        self.logger = logging.getLogger("BudgetProcessor")

    def extract_budget_with_llm(self, text: str) -> Optional[float]:
        """Use LLM to extract budget information from text."""
        # Doubled braces keep the JSON example literal; {text} stays a template variable
        template = """# Task: Extract Budget Amount

## Input Text:
"{text}"

## Instructions:
- Extract the numerical budget amount from the input text
- Convert to a standard numerical value in pounds (£)
- Handle different formats (e.g., "1 million", "500k", "1.2m", etc.)
- Return a JSON object with the extracted budget and confidence

## Output Format:
```json
{{
  "budget": float,  // The budget in pounds (e.g., 1000000.0)
  "confidence": float  // How confident you are (0.0-1.0)
}}
```

If no budget could be extracted, return null for the budget and explain why.
"""

        # Pipe the prompt object (not a pre-formatted string) into the LLM
        prompt = PromptTemplate.from_template(template)
        chain = prompt | self.llm

        try:
            response = chain.invoke({"text": text})
            # Extract JSON from the response
            json_match = re.search(r'({[\s\S]*})', response.content)
            if json_match:
                result = json.loads(json_match.group(1))
                if result.get("budget") is not None and result.get("confidence", 0) > 0.7:
                    return float(result["budget"])
            return None
        except Exception as e:
            self.logger.error(f"Budget extraction error: {e}")
            return None

    def parse_budget(self, text: str) -> Optional[float]:
        """Parse budget amount from text with improved handling."""
        if not text:
            return None

        # First try with the LLM
        llm_result = self.extract_budget_with_llm(text)
        if llm_result is not None:
            return llm_result

        # Fallback to rule-based parsing
        txt = text.lower()
        multiplier = 1.0

        if "million" in txt or re.search(r'\d+\s*m\b', txt):
            multiplier = 1_000_000.0
            txt = re.sub(r"(million|m)\b", "", txt)
        elif "thousand" in txt or re.search(r'\d+\s*k\b', txt):
            multiplier = 1_000.0
            txt = re.sub(r"(thousand|k)\b", "", txt)

        # Clean text and extract numbers
        txt = re.sub(r"[£$,€]", "", txt)
        numbers = re.findall(r"\d+(?:\.\d+)?", txt)

        if not numbers:
            return None

        try:
            return float(max(float(num) for num in numbers)) * multiplier
        except ValueError:
            return None

    def process_budget(self, state: ChatState) -> ChatState:
        """Process and validate the user's budget."""
        user_msg = state["messages"][-1].content.strip()
        budget_val = self.parse_budget(user_msg)

        if budget_val is None:
            state["attempts"] += 1
            state["messages"].append(AIMessage(
                content="Please enter a valid budget (e.g., 1,200,000, 1.2m, or 500k)."
            ))
            return log_interaction(state, "budget_invalid", user_msg, "Could not parse budget")

        state["budget"] = budget_val
        self.logger.info(f"Parsed budget: {user_msg} -> {budget_val}")

        if state.get("buy_type") == BuyType.NEW_HOME and budget_val < MIN_BUDGET_NEW_HOME:
            message = (
                f"I see you're interested in a new home with a budget of £{budget_val:,.2f}. "
                f"Unfortunately, new homes in our portfolio require a minimum budget of £{MIN_BUDGET_NEW_HOME:,}. "
                f"Please call {COMPANY_PHONE_NUMBER} for assistance with available options. Goodbye."
            )
            state["messages"].append(AIMessage(content=message))
            state["conversation_ended"] = True
            state["conversation_stage"] = ConversationStage.ENDED
            return log_interaction(state, "budget_too_low", user_msg, "Budget below minimum")
        else:
            message = (
                f"Thank you. Your budget of £{budget_val:,.2f} is noted. "
                f"Now, can you please enter the postcode of your location of interest?"
            )
            state["messages"].append(AIMessage(content=message))
            state["pending_action"] = "get_postcode"
            state["conversation_stage"] = ConversationStage.POSTCODE
            state["attempts"] = 0
            return log_interaction(state, "budget_valid", user_msg, f"Budget accepted: £{budget_val:,.2f}")
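
To see what the rule-based fallback does without calling the LLM, the snippet below lifts that branch of `parse_budget` into a standalone function and runs it on a few sample strings. `parse_budget_rule_based` is a name introduced here for illustration, and the sample inputs are made up.

```python
import re
from typing import Optional


def parse_budget_rule_based(text: str) -> Optional[float]:
    """Rule-based budget parsing, mirroring the fallback branch of parse_budget."""
    txt = text.lower()
    multiplier = 1.0
    if "million" in txt or re.search(r"\d+\s*m\b", txt):
        multiplier = 1_000_000.0
        txt = re.sub(r"(million|m)\b", "", txt)
    elif "thousand" in txt or re.search(r"\d+\s*k\b", txt):
        multiplier = 1_000.0
        txt = re.sub(r"(thousand|k)\b", "", txt)
    txt = re.sub(r"[£$,€]", "", txt)  # drop currency symbols and thousands separators
    numbers = re.findall(r"\d+(?:\.\d+)?", txt)
    if not numbers:
        return None
    # When several numbers appear, the largest one is taken as the budget.
    return max(float(n) for n in numbers) * multiplier


for sample in ["£1,200,000", "1.2m", "500k", "around 2 million", "no idea"]:
    print(sample, "->", parse_budget_rule_based(sample))
```

One multiplier applies to the whole string, so mixed inputs such as "1m or 900k" are not handled by this path.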

In [20]:
# Agent 4: Enhanced Postcode Processor Agent with Improved Similarity Search
class EnhancedPostcodeProcessorAgent:
    def __init__(self, eligible_set: Set[str], eligible_list: List[str], embedding_model, llm):
        self.eligible_set = eligible_set
        self.eligible_list = eligible_list
        self.embedding_model = embedding_model
        self.llm = llm
        self.logger = logging.getLogger("PostcodeProcessor")
        self.index = None
        self.dimension = 0
        if self.eligible_list:
            self.build_index()

    def build_index(self):
        """Build FAISS index for postcode similarity search."""
        try:
            docs = [str(pc) for pc in self.eligible_list]
            embeddings = np.array(self.embedding_model.embed_documents(docs)).astype('float32')
            self.dimension = embeddings.shape[1]
            self.index = faiss.IndexFlatL2(self.dimension)
            self.index.add(embeddings)
            self.logger.info(f"Built FAISS index with {len(docs)} postcodes and {self.dimension} dimensions")
        except Exception as e:
            self.logger.error(f"Error building FAISS index: {e}")

    def is_valid_uk_postcode(self, postcode: str) -> bool:
        """Validate if the string follows UK postcode format."""
        # UK postcode regex pattern (simplified)
        uk_pattern = r'^[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}$'
        return bool(re.match(uk_pattern, postcode.upper()))

    def validate_postcode_with_llm(self, postcode: str) -> Tuple[bool, str]:
        """Use LLM to validate if a string looks like a UK postcode."""
        # Doubled braces keep the JSON example literal; {postcode} stays a template variable
        template = """# Task: Validate UK Postcode

## Input:
{postcode}

## Instructions:
- Determine if the input looks like a valid UK postcode format
- UK postcodes typically follow patterns like "SW1A 1AA" or "M1 1AA"
- Return a JSON with validation result and reason

## Output Format:
```json
{{
  "is_valid": boolean,
  "reason": "string explanation"
}}
```"""

        # Pipe the prompt object (not a pre-formatted string) into the LLM
        prompt = PromptTemplate.from_template(template)
        chain = prompt | self.llm

        try:
            response = chain.invoke({"postcode": postcode})
            json_match = re.search(r'({[\s\S]*})', response.content)
            if json_match:
                result = json.loads(json_match.group(1))
                return result.get("is_valid", False), result.get("reason", "Invalid format")
            return False, "Could not validate postcode format"
        except Exception as e:
            self.logger.error(f"Postcode validation error: {e}")
            return False, "Error validating postcode"

    def is_covered(self, postcode: str) -> bool:
        """Check if a postcode is in the eligible set."""
        return normalize_postcode(postcode) in self.eligible_set

    def find_similar(self, postcode: str, k: int = 3) -> List[Tuple[str, float]]:
        """Find k most similar postcodes and their distances."""
        if self.index is None:
            return []

        try:
            q_embed = np.array(self.embedding_model.embed_query(postcode)).astype('float32')
            if q_embed.ndim == 1:
                q_embed = q_embed.reshape(1, -1)

            distances, indices = self.index.search(q_embed, k)
            results = []

            for i in range(min(k, len(indices[0]))):
                idx = indices[0][i]
                # FAISS pads missing neighbours with -1, so reject negative indices too
                if 0 <= idx < len(self.eligible_list):
                    results.append((self.eligible_list[idx], float(distances[0][i])))

            return results
        except Exception as e:
            self.logger.error(f"Error in similarity search: {e}")
            return []

    def process_postcode(self, state: ChatState) -> ChatState:
        """Process and validate the user's postcode."""
        user_msg = state["messages"][-1].content.strip()

        # Initial validation
        is_valid, reason = self.validate_postcode_with_llm(user_msg)
        if not is_valid:
            state["attempts"] += 1
            state["messages"].append(AIMessage(
                content=f"That doesn't appear to be a valid UK postcode. {reason} Please try again."
            ))
            return log_interaction(state, "postcode_invalid_format", user_msg, reason)

        norm_pc = normalize_postcode(user_msg)
        state["postcode"] = norm_pc

        if self.is_covered(norm_pc):
            state["postcode_covered"] = True

            # Personalized response based on intent
            if state.get("intent") == Intent.BUY:
                property_type = "properties" if state.get("buy_type") == BuyType.RE_SALE else "new homes"
                message = (
                    f"Great news! We cover {user_msg} for {property_type}. "
                    f"One of our agents will contact you within 24 hours via your provided "
                    f"phone ({state.get('phone')}) or email ({state.get('email')}). "
                    f"Is there anything else I can help you with today? (yes/no)"
                )
            else:  # SELL
                message = (
                    f"Excellent! We're active in your area ({user_msg}). "
                    f"A member of our sales team will contact you within 24 hours to discuss "
                    f"selling your property. Is there anything else I can help you with? (yes/no)"
                )

            state["messages"].append(AIMessage(content=message))

        else:
            state["postcode_covered"] = False
            similar_results = self.find_similar(norm_pc)
            suggestion = similar_results[0][0] if similar_results and similar_results[0][1] < 0.5 else None
            state["suggested_postcode"] = suggestion

            # Different message based on intent and property type
            if state.get("intent") == Intent.BUY and state.get("buy_type") == BuyType.NEW_HOME:
                message = (
                    f"I'm sorry, we don't currently have new home developments in {user_msg}. "
                )
                if suggestion:
                    message += f"Did you perhaps mean {suggestion}? "

                message += (
                    f"Please call our new homes team at {COMPANY_PHONE_NUMBER} to discuss options. "
                    f"Is there anything else I can help you with? (yes/no)"
                )
            else:
                message = (
                    f"I notice that {user_msg} is not in our primary coverage area. "
                )
                if suggestion:
                    message += f"Did you perhaps mean {suggestion}? "

                message += (
                    f"Please call {COMPANY_PHONE_NUMBER} and we can discuss your requirements. "
                    f"Is there anything else I can help you with? (yes/no)"
                )

            state["messages"].append(AIMessage(content=message))

        state["pending_action"] = "check_reassistance"
        state["conversation_stage"] = ConversationStage.REASSISTANCE
        state["attempts"] = 0
        return log_interaction(state, "postcode_processed", user_msg,
                              f"Postcode covered: {state['postcode_covered']}")
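
`faiss.IndexFlatL2` performs an exhaustive search and reports squared Euclidean distances, and `process_postcode` only offers a suggestion when the top hit is closer than 0.5. The pure-Python sketch below reproduces that logic on toy 2-D vectors so the thresholding is easy to follow. `nearest_by_squared_l2` and `pick_suggestion` are hypothetical names, and the vectors are invented; real vectors would come from the embedding model.

```python
from typing import List, Optional, Sequence, Tuple


def nearest_by_squared_l2(query: Sequence[float],
                          vectors: Sequence[Sequence[float]],
                          labels: Sequence[str],
                          k: int = 3) -> List[Tuple[str, float]]:
    """Exhaustive k-nearest-neighbour search using squared Euclidean distance."""
    scored = [
        (label, sum((q - v) ** 2 for q, v in zip(query, vec)))
        for label, vec in zip(labels, vectors)
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


def pick_suggestion(hits: List[Tuple[str, float]], max_distance: float = 0.5) -> Optional[str]:
    """Suggest the top hit only when it is closer than `max_distance`."""
    return hits[0][0] if hits and hits[0][1] < max_distance else None


# Toy 2-D "embeddings" standing in for real embedding vectors.
labels = ["SW1A1AA", "W1A1AA", "E1W3SS"]
vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]

close_hits = nearest_by_squared_l2([0.1, 0.1], vectors, labels)
far_hits = nearest_by_squared_l2([5.0, 5.0], vectors, labels)
print(pick_suggestion(close_hits))
print(pick_suggestion(far_hits))
```

The right threshold depends on the scale of the embedding model's vectors, so 0.5 is best treated as a value to tune against real typos.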

In [21]:
# Agent 5: Enhanced Orchestrator Agent with Better Context Awareness
class EnhancedOrchestratorAgent:
    def __init__(self, llm, embedding_model, eligible_set, eligible_list):
        self.llm = llm
        self.intent_agent = EnhancedIntentClassifierAgent(llm)
        self.info_agent = EnhancedInfoGathererAgent(llm)
        self.budget_agent = EnhancedBudgetProcessorAgent(llm)
        self.postcode_agent = EnhancedPostcodeProcessorAgent(eligible_set, eligible_list, embedding_model, llm)
        self.logger = logging.getLogger("Orchestrator")

    def _get_conversation_summary(self, state: ChatState) -> str:
        """Generate a summary of the conversation state for context."""
        summary = []

        if state.get("name"):
            summary.append(f"User: {state['name']}")

        if state.get("intent"):
            summary.append(f"Intent: {state['intent'].value}")

        if state.get("buy_type"):
            summary.append(f"Property Type: {state['buy_type'].value}")

        if state.get("budget"):
            summary.append(f"Budget: £{state['budget']:,.2f}")

        if state.get("postcode"):
            coverage = "covered" if state.get("postcode_covered") else "not covered"
            summary.append(f"Postcode: {state['postcode']} ({coverage})")

        return ", ".join(summary)

    def handle_max_attempts(self, state: ChatState) -> ChatState:
        """Handle cases where user has exceeded maximum attempts for a response."""
        if state["attempts"] >= MAX_ATTEMPTS:
            context = self._get_conversation_summary(state)

            message = (
                f"I'm having trouble understanding your responses. "
                f"For better assistance, please call our customer service at {COMPANY_PHONE_NUMBER}. "
            )

            if context:
                message += f"I'll share what I've gathered so far: {context}. "

            message += "Thank you for your interest. Goodbye."

            state["messages"].append(AIMessage(content=message))
            state["conversation_ended"] = True
            state["conversation_stage"] = ConversationStage.ENDED
            return log_interaction(state, "max_attempts_reached", None, f"Max attempts ({MAX_ATTEMPTS}) reached")

        return state

    def process_intent(self, state: ChatState) -> ChatState:
        """Process user's intent to buy or sell property."""
        user_msg = state["messages"][-1].content.strip()
        intent = self.intent_agent.classify_intent(user_msg)

        if intent == Intent.UNKNOWN:
            state["attempts"] += 1
            state["messages"].append(AIMessage(content=(
                "I'm not sure if you're looking to buy or sell a property. "
                "Could you please clarify if you're interested in buying or selling?"
            )))
            return self.handle_max_attempts(
                log_interaction(state, "intent_unknown", user_msg, "Could not determine intent")
            )

        state["intent"] = intent
        state["messages"].append(AIMessage(content=(
            f"Great! I understand you're looking to {intent.value} a property. "
            f"To help you better, could I get your name please?"
        )))
        state["pending_action"] = "get_name"
        state["conversation_stage"] = ConversationStage.NAME
        state["attempts"] = 0
        return log_interaction(state, "intent_identified", user_msg, f"Intent: {intent.value}")

    def process_buy_type(self, state: ChatState) -> ChatState:
        """Process user's property type preference."""
        user_msg = state["messages"][-1].content.strip()
        buy_type = self.intent_agent.classify_buy_type(user_msg)

        if buy_type == BuyType.UNKNOWN:
            state["attempts"] += 1
            state["messages"].append(AIMessage(content=(
                "I'm not sure if you're looking for a new home or a re-sale property. "
                "Please clarify if you want a newly built home or an existing property."
            )))
            return self.handle_max_attempts(
                log_interaction(state, "buy_type_unknown", user_msg, "Could not determine property type")
            )

        state["buy_type"] = buy_type
        state["messages"].append(AIMessage(content=(
            f"Thank you for confirming you're interested in a {buy_type.value.replace('_', ' ')}. "
            f"What is your budget for this purchase?"
        )))
        state["pending_action"] = "get_budget"
        state["conversation_stage"] = ConversationStage.BUDGET
        state["attempts"] = 0
        return log_interaction(state, "buy_type_identified", user_msg, f"Buy type: {buy_type.value}")

    def process_reassistance(self, state: ChatState) -> ChatState:
        """Process whether the user wants additional help."""
        user_msg = state["messages"][-1].content.strip()
        yes_no = self.intent_agent.classify_yes_no(user_msg)

        if yes_no == YesNo.UNKNOWN:
            state["attempts"] += 1
            state["messages"].append(AIMessage(content=(
                "I didn't understand your response. Could you please answer with yes or no?"
            )))
            return self.handle_max_attempts(
                log_interaction(state, "reassistance_unknown", user_msg, "Could not determine yes/no")
            )

        if yes_no == YesNo.YES:
            state["messages"].append(AIMessage(content=(
                "How can I help you further today?"
            )))
            state["pending_action"] = "handle_followup"
            # Reset the retry counter so earlier unclear answers do not count against the follow-up.
            state["attempts"] = 0
            return log_interaction(state, "reassistance_yes", user_msg, "User wants more help")
        else:
            state["messages"].append(AIMessage(content=(
                "Thank you for chatting with us today. A representative will be in touch shortly. Goodbye!"
            )))
            state["conversation_ended"] = True
            state["conversation_stage"] = ConversationStage.ENDED
            return log_interaction(state, "reassistance_no", user_msg, "User doesn't need more help")

    def handle_followup(self, state: ChatState) -> ChatState:
        """Handle follow-up questions from the user."""
        user_msg = state["messages"][-1].content.strip()

        # Use LLM to generate contextual response to followup
        template = """# Follow-up Response Generation

## Conversation Context:
User is interested in {intent} a property.
{context}

## Latest User Message:
{user_message}

## Instructions:
- Provide a helpful response to the user's follow-up question
- Keep the response concise and real-estate focused
- If you can't answer the question, offer to connect them with an agent
- Always include the company phone number ({phone}) for any detailed inquiries

## Response:"""

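        context = self._get_conversation_summary(state)
        # Keep the PromptTemplate as a Runnable: calling .format() here would return
        # a plain str, which cannot be piped into the LLM with `|`.
        prompt = PromptTemplate.from_template(template)
        # The intent may be missing or None if the user reached this point early.
        intent = state.get("intent") or Intent.UNKNOWN

        chain = prompt | self.llm | StrOutputParser()

        try:
            # Template variables are supplied at invocation time.
            response = chain.invoke({
                "intent": intent.value,
                "context": context,
                "user_message": user_msg,
                "phone": COMPANY_PHONE_NUMBER,
            })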
            state["messages"].append(AIMessage(content=response))
            return log_interaction(state, "followup_handled", user_msg, response)
        except Exception as e:
            self.logger.error(f"Error handling followup: {e}")
            fallback_response = (
                f"I'm not sure I can answer that question fully. For detailed assistance, "
                f"please contact our team at {COMPANY_PHONE_NUMBER}. Is there anything else "
                f"I can help you with today?"
            )
            state["messages"].append(AIMessage(content=fallback_response))
            return log_interaction(state, "followup_error", user_msg, str(e))

    def initialize_conversation(self, state: ChatState) -> ChatState:
        """Initialize conversation with greeting if not already started."""
        if not state["messages"]:
            state["messages"] = [
                SystemMessage(content=(
                    "You are an AI assistant for a high-end real estate agency. "
                    "Your role is to qualify leads by collecting necessary information."
                )),
                AIMessage(content=(
                    "Hello! I'm the virtual assistant for RealtyFlow Properties. "
                    "How can I help you today with buying or selling a property?"
                ))
            ]
            state["conversation_stage"] = ConversationStage.GREETING
            return log_interaction(state, "conversation_initialized", None, "Greeting sent")
        return state

    def handle_general_query(self, state: ChatState) -> ChatState:
        """Handle general queries that don't fit into the structured flow."""
        user_msg = state["messages"][-1].content.strip()

        # Create a prompt for handling general real estate queries
        template = """# General Real Estate Query Response

## Query:
{query}

## Conversation Context:
{context}

## Instructions:
- Provide a helpful, concise response to the real estate related query
- If it's something that requires an agent's expertise, suggest contacting one
- Keep responses accurate and within typical real estate agent knowledge
- Always mention the company phone number ({phone}) for detailed inquiries
- If the query is unrelated to real estate, gently redirect to property matters

## Response:"""

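        context = self._get_conversation_summary(state)
        # Keep the PromptTemplate as a Runnable: calling .format() here would return
        # a plain str, which cannot be piped into the LLM with `|`.
        prompt = PromptTemplate.from_template(template)

        chain = prompt | self.llm | StrOutputParser()

        try:
            # Template variables are supplied at invocation time.
            response = chain.invoke({
                "query": user_msg,
                "context": context,
                "phone": COMPANY_PHONE_NUMBER,
            })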
            state["messages"].append(AIMessage(content=response))

            # Suggest continuing with the qualification process if not completed
            if (state.get("conversation_stage") != ConversationStage.ENDED and
                not state.get("name") and not state.get("intent")):
                state["messages"].append(AIMessage(content=(
                    "To help you better, would you mind telling me if you're looking to buy or sell a property?"
                )))

            return log_interaction(state, "general_query_handled", user_msg, response)
        except Exception as e:
            self.logger.error(f"Error handling general query: {e}")
            fallback = (
                f"I apologize, but I'm having trouble processing your request. For immediate assistance, "
                f"please contact our team at {COMPANY_PHONE_NUMBER}."
            )
            state["messages"].append(AIMessage(content=fallback))
            return log_interaction(state, "general_query_error", user_msg, str(e))

    def error_recovery(self, state: ChatState, error: Exception) -> ChatState:
        """Recovery mechanism for handling errors gracefully."""
        self.logger.error(f"Error in conversation flow: {error}")

        # Log the error
        state["last_error"] = str(error)

        # Get the current conversation stage for context
        stage = state.get("conversation_stage", ConversationStage.GREETING)
        context = self._get_conversation_summary(state)

        # Craft an appropriate recovery message based on the stage
        if stage in [ConversationStage.GREETING, ConversationStage.INTENT]:
            message = (
                "I apologize, but I'm having trouble understanding your request. "
                "Could you please tell me clearly if you're looking to buy or sell a property?"
            )
        elif stage in [ConversationStage.NAME, ConversationStage.PHONE, ConversationStage.EMAIL]:
            message = (
                "I seem to be having trouble processing your information. "
                f"For better assistance, you can call us directly at {COMPANY_PHONE_NUMBER}. "
                "Would you like to continue with our chat instead?"
            )
        else:
            message = (
                "I apologize for the confusion. There seems to be an error in our conversation. "
                f"Please contact our customer service at {COMPANY_PHONE_NUMBER} for immediate assistance."
            )
            # Only mention gathered details when there is something to share.
            if context:
                message += f" They'll have access to the information we've gathered so far: {context}"

        state["messages"].append(AIMessage(content=message))
        return log_interaction(state, "error_recovery", None, f"Recovered from: {error}")