<a href="https://colab.research.google.com/github/KaifAhmad1/code-test/blob/main/CommVersion_Assignment_Mohd_Kaif.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### **RealtyFlow AI: Intelligent Real Estate Chatbot**

RealtyFlow AI: Intelligent Real Estate Chatbot
---------------------------------------------
A complete implementation of a conversational AI agent for real estate businesses
that qualifies leads by collecting necessary information based on whether they
want to buy or sell property.

**RealtyFlow AI** is an intelligent chatbot designed to streamline initial customer interactions for real estate businesses. Built using LangGraph for state management, Google Gemini for natural language understanding and generation, and FAISS for efficient postcode similarity searching, it guides users through a predefined decision tree to understand their intent (buy/sell), budget, and location preferences.

The chatbot automates lead qualification, provides instant responses 24/7, and ensures consistent information gathering, ultimately enhancing agent productivity and improving the customer experience.

### **Core Functionality & Flow**

The chatbot operates based on a structured decision tree:

1.  **Initial Greeting & Intent Capture:**
    *   Bot: "Home. How may I help you? (e.g., 'I want to buy', 'I'm looking to sell')"
    *   User provides input (e.g., "I want to buy a house").
    *   **Gemini (Intent Classifier):** Determines if the user wants to "buy" or "sell".

2.  **Contact Information Gathering:**
    *   Bot: "Great. Can I get your name?" -> User provides name.
    *   Bot: "Can I get your phone number?" -> User provides phone.
    *   Bot: "Can I get your email address?" -> User provides email.

3.  **Path Divergence (Buy vs. Sell):**

    *   **If Intent is "Buy":**
        *   Bot: "Are you looking for a new home or a re-sale home?"
        *   **Gemini (Buy Type Classifier):** Determines "new\_home" or "re\_sale".
        *   Bot: "What is your budget?"
        *   **Budget Processor Tool:** Parses the budget amount.
            *   **Rule (New Home):** If budget < £1,000,000 for a "new\_home", the bot informs the user about the minimum budget and ends the interaction with a referral to call the office.
        *   Bot (if budget is sufficient or re-sale): "Can I know the postcode of your location of interest?"
        *   **Postcode Processor Tool (Exact Match & FAISS):**
            *   Checks if the normalized postcode is in the pre-approved list.
            *   If not an exact match, FAISS suggests similar valid postcodes.
            *   **Rule (New Home, Postcode Not Covered):** Informs the user and ends with a referral.
            *   **Rule (Postcode Covered or Re-sale/Sell with Uncovered Postcode):** Proceeds to reassistance.

    *   **If Intent is "Sell":**
        *   (After contact info) Bot: "What is your postcode?"
        *   **Postcode Processor Tool (Exact Match & FAISS):**
            *   Checks eligibility. FAISS suggests similar if no exact match.
            *   Proceeds to reassistance regardless of coverage (as per flowchart, only a message change).

4.  **Outcome & Reassistance:**

    *   **If Postcode Covered (for any valid path):**
        *   Bot: "Great! That postcode is covered. I can expect someone to get in touch with you within 24 hours via phone or email. Is there anything else I can help you with? (yes/no)"
    *   **If Postcode NOT Covered (for Sell or Re-sale Buy path):**
        *   Bot: "Sorry, we don't cater to the Post code '{postcode}' that you provided. (Did you perhaps mean {suggestion}?) Please call the office on {phone\_number} to get help. Is there anything else I can help you with? (yes/no)"

5.  **Handling Reassistance Choice:**
    *   **Gemini (Yes/No Classifier):** Determines if the user said "yes" or "no".
    *   If "yes": Bot: "Okay, let's start over." (Restarts the flow).
    *   If "no": Bot: "Thank you for chatting with us. Good bye." (Ends conversation).
    *   If unclear: Re-prompts for yes/no.

6.  **Error/Max Attempts Handling:**
    *   If the bot fails to understand an input after 3 attempts, it politely ends the conversation and refers the user to call the office.

### **Technical Stack & Components**

*   **Orchestration:** `LangGraph` (manages the conversational state and flow between nodes).
*   **Language Model (LLM):** `Google Gemini` (via `ChatGoogleGenerativeAI` from `langchain-google-genai`) for:
    *   Intent classification (buy/sell, new\_home/re\_sale, yes/no).
    *   Generating conversational responses.
*   **Embeddings:** `GoogleGenerativeAIEmbeddings` (model: `models/embedding-001`) for converting postcodes into vector representations.
*   **Vector Search:** `FAISS` (Facebook AI Similarity Search) for:
    *   Building an index of eligible postcode embeddings.
    *   Finding the most similar eligible postcodes if a user's input isn't an exact match (useful for typo correction/suggestions).
*   **Data Handling:** `pandas` for loading the initial list of eligible postcodes.
*   **Core Logic:** Python functions and classes acting as "tools" or "agents":
    *   `IntentClassifierAgent`: Wraps Gemini calls for classification tasks.
    *   `BudgetProcessorTool`: Parses budget strings into numerical values.
    *   `PostcodeProcessorTool`: Handles exact postcode validation and FAISS-based similarity search.
*   **State Management:** A `TypedDict` (`ChatbotState`) holds all relevant information during the conversation (history, extracted details, pending actions).

### **Usefulness & Problem Solving**

**RealtyFlow AI** is particularly useful for:

*   **Real Estate Agencies & Property Developers:** To automate initial customer engagement and lead qualification.

**In essence, RealtyFlow AI streamlines the top of the sales funnel, enhances agent productivity, improves customer experience, and helps reduce operational costs by acting as an intelligent, automated front-desk for real estate businesses.**

In [1]:
!pip install -q langchain langgraph langchain-google-genai google-generativeai pandas faiss-cpu tiktoken python-dotenv

[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m43.5/43.5 kB[0m [31m2.7 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m151.2/151.2 kB[0m [31m7.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m42.0/42.0 kB[0m [31m2.9 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m31.3/31.3 MB[0m [31m61.9 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.2/1.2 MB[0m [31m56.7 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m42.3/42.3 kB[0m [31m2.8 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m47.6/47.6 kB[0m [31m3.4 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m194.8/194.8 kB[0m [31m15.8 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

In [2]:
# Initialize Google API key at the start of the script
GOOGLE_API_KEY = "AIzaSyD9ljvMl4t9ucEnQpi3RfAJsoCgViE7O9Q"  # Replace with your actual API key

import os
import re
from typing import Dict, List, Optional, Literal, Set, Tuple, Any
import pandas as pd
import numpy as np
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langgraph.graph import StateGraph, END
import faiss

# Constants
MIN_BUDGET_NEW_HOME = 1_000_000
COMPANY_PHONE_NUMBER = "1800 111 222"
MAX_ATTEMPTS = 3
POSTCODE_FILE = "uk_postcodes.csv"  # Adjust path as needed

In [4]:
from typing import TypedDict
# State Definition
class ChatState(TypedDict):
    messages: List[AIMessage | HumanMessage]
    intent: Optional[Literal["buy", "sell"]]
    buy_type: Optional[Literal["new_home", "re_sale"]]
    name: Optional[str]
    phone: Optional[str]
    email: Optional[str]
    budget: Optional[float]
    postcode: Optional[str]
    postcode_covered: Optional[bool]
    suggested_postcode: Optional[str]
    pending_action: Optional[str]
    attempts: int
    conversation_ended: bool

In [5]:
# Utility Functions
def normalize_postcode(postcode: str) -> str:
    return postcode.upper().replace(" ", "")

def load_eligible_postcodes(file_path: str) -> Tuple[Set[str], List[str]]:
    sample_postcodes = ["SW1A1AA", "SW1A2AA", "W1A1AA", "E1W3SS", "B15 3TR"]
    try:
        df = pd.read_csv(file_path)
        postcode_col = next((col for col in df.columns if col.lower() == "postcode"), df.columns[0])
        postcodes = [normalize_postcode(str(pc)) for pc in df[postcode_col] if pd.notna(pc)]
        return set(postcodes), postcodes
    except FileNotFoundError:
        return set(sample_postcodes), sample_postcodes

In [6]:
# Agent 1: Intent Classifier Agent
class IntentClassifierAgent:
    def __init__(self, llm):
        self.llm = llm

    def _build_chain(self, allowed_options: List[str], description: str):
        prompt_template = f"""Task: {description}
Allowed Options: {', '.join(allowed_options)}.
Answer with ONLY ONE option without extra text.
User message: {{user_message}}
Classification:"""
        prompt = ChatPromptTemplate.from_template(prompt_template)
        return prompt | self.llm | StrOutputParser()

    def classify_intent(self, message: str) -> Literal["buy", "sell", "unknown"]:
        chain = self._build_chain(["buy", "sell"], "Determine if the user wants to buy or sell a property.")
        res = chain.invoke({"user_message": message}).strip().lower()
        return res if res in ["buy", "sell"] else "unknown"

    def classify_buy_type(self, message: str) -> Literal["new_home", "re_sale", "unknown"]:
        chain = self._build_chain(["new_home", "re_sale"], "Determine if the buyer wants a new home or a re-sale property.")
        res = chain.invoke({"user_message": message}).strip().lower()
        return res if res in ["new_home", "re_sale"] else "unknown"

    def classify_yes_no(self, message: str) -> Literal["yes", "no", "unknown"]:
        chain = self._build_chain(["yes", "no"], "Determine if the answer is yes or no.")
        res = chain.invoke({"user_message": message}).strip().lower()
        return res if res in ["yes", "no"] else "unknown"

In [7]:
# Agent 2: Information Gatherer Agent
class InfoGathererAgent:
    def get_name(self, state: ChatState) -> ChatState:
        user_msg = state["messages"][-1].content.strip()
        if not user_msg or len(user_msg) < 2:
            state["attempts"] += 1
            state["messages"].append(AIMessage(content="Please provide a valid name."))
        else:
            state["name"] = user_msg
            state["messages"].append(AIMessage(content="Thanks! What is your phone number?"))
            state["pending_action"] = "get_phone"
            state["attempts"] = 0
        return state

    def get_phone(self, state: ChatState) -> ChatState:
        user_msg = state["messages"][-1].content.strip()
        if not re.search(r'\d{7,}', user_msg):
            state["attempts"] += 1
            state["messages"].append(AIMessage(content="Please enter a valid phone number with at least 7 digits."))
        else:
            state["phone"] = user_msg
            state["messages"].append(AIMessage(content="Please share your email address."))
            state["pending_action"] = "get_email"
            state["attempts"] = 0
        return state

    def get_email(self, state: ChatState) -> ChatState:
        user_msg = state["messages"][-1].content.strip()
        if "@" not in user_msg or "." not in user_msg.split("@")[-1]:
            state["attempts"] += 1
            state["messages"].append(AIMessage(content="Please provide a valid email address."))
        else:
            state["email"] = user_msg
            if state.get("intent") == "buy":
                state["messages"].append(AIMessage(content="Are you looking for a new home or a re-sale?"))
                state["pending_action"] = "get_buy_type"
            else:
                state["messages"].append(AIMessage(content="Please provide the postcode of the property."))
                state["pending_action"] = "get_postcode"
            state["attempts"] = 0
        return state

In [8]:
# Agent 3: Budget Processor Agent
class BudgetProcessorAgent:
    def parse_budget(self, text: str) -> Optional[float]:
        if not text:
            return None
        txt = text.lower()
        multiplier = 1.0
        if "million" in txt or re.search(r'\d+m\b', txt):
            multiplier = 1_000_000.0
            txt = re.sub(r"(million|m)", "", txt)
        elif "thousand" in txt or re.search(r'\d+k\b', txt):
            multiplier = 1_000.0
            txt = re.sub(r"(thousand|k)", "", txt)
        txt = re.sub(r"[£$,€]", "", txt)
        numbers = re.findall(r"\d+(?:\.\d+)?", txt)
        if not numbers:
            return None
        try:
            return max(float(num) * multiplier for num in numbers)
        except ValueError:
            return None

    def process_budget(self, state: ChatState) -> ChatState:
        user_msg = state["messages"][-1].content.strip()
        budget_val = self.parse_budget(user_msg)
        if budget_val is None:
            state["attempts"] += 1
            state["messages"].append(AIMessage(content="Please enter a valid budget (e.g., 1,200,000, 1.2m, or 500k)."))
        else:
            state["budget"] = budget_val
            if state.get("buy_type") == "new_home" and budget_val < MIN_BUDGET_NEW_HOME:
                state["messages"].append(AIMessage(content=f"New homes require at least £{MIN_BUDGET_NEW_HOME:,}. Please call {COMPANY_PHONE_NUMBER} for assistance. Goodbye."))
                state["conversation_ended"] = True
            else:
                state["messages"].append(AIMessage(content="Please enter the postcode of your location of interest."))
                state["pending_action"] = "get_postcode"
            state["attempts"] = 0
        return state

In [9]:
# Agent 4: Postcode Processor Agent
class PostcodeProcessorAgent:
    def __init__(self, eligible_set: Set[str], eligible_list: List[str], embedding_model):
        self.eligible_set = eligible_set
        self.eligible_list = eligible_list
        self.embedding_model = embedding_model
        self.index = None
        self.dimension = 0
        if self.eligible_list:
            self.build_index()

    def build_index(self):
        docs = [str(pc) for pc in self.eligible_list]
        embeddings = np.array(self.embedding_model.embed_documents(docs)).astype('float32')
        self.dimension = embeddings.shape[1]
        self.index = faiss.IndexFlatL2(self.dimension)
        self.index.add(embeddings)

    def is_covered(self, postcode: str) -> bool:
        return normalize_postcode(postcode) in self.eligible_set

    def find_similar(self, postcode: str, k: int = 1) -> Optional[Tuple[str, float]]:
        if self.index is None:
            return None
        q_embed = np.array(self.embedding_model.embed_query(postcode)).astype('float32')
        if q_embed.ndim == 1:
            q_embed = q_embed.reshape(1, -1)
        distances, indices = self.index.search(q_embed, k)
        idx = indices[0][0]
        if idx < len(self.eligible_list):
            return self.eligible_list[idx], float(distances[0][0])
        return None

    def process_postcode(self, state: ChatState) -> ChatState:
        user_msg = state["messages"][-1].content.strip()
        norm_pc = normalize_postcode(user_msg)
        state["postcode"] = norm_pc
        if self.is_covered(norm_pc):
            state["postcode_covered"] = True
            state["messages"].append(AIMessage(content="Great! That postcode is covered. Our team will contact you within 24 hours. Is there anything else I can help you with? (yes/no)"))
        else:
            state["postcode_covered"] = False
            similar = self.find_similar(norm_pc)
            suggestion = similar[0] if similar and similar[1] < 0.7 else None
            msg = f"Sorry, we don’t cover '{norm_pc}'."
            if suggestion:
                msg += f" Did you mean '{suggestion}'?"
            msg += f" Please call {COMPANY_PHONE_NUMBER} for assistance. Is there anything else I can help you with? (yes/no)"
            state["suggested_postcode"] = suggestion
            state["messages"].append(AIMessage(content=msg))
        state["pending_action"] = "check_reassistance"
        state["attempts"] = 0
        return state

In [10]:
# Agent 5: Orchestrator Agent (Manages Flow and Collaboration)
class OrchestratorAgent:
    def __init__(self, llm, embedding_model, eligible_set, eligible_list):
        self.intent_agent = IntentClassifierAgent(llm)
        self.info_agent = InfoGathererAgent()
        self.budget_agent = BudgetProcessorAgent()
        self.postcode_agent = PostcodeProcessorAgent(eligible_set, eligible_list, embedding_model)

    def handle_max_attempts(self, state: ChatState) -> ChatState:
        if state["attempts"] >= MAX_ATTEMPTS:
            state["messages"].append(AIMessage(content=f"I’m having trouble understanding. Please call {COMPANY_PHONE_NUMBER} for assistance. Goodbye."))
            state["conversation_ended"] = True
        return state

    def process_intent(self, state: ChatState) -> ChatState:
        user_msg = state["messages"][-1].content.strip()
        intent = self.intent_agent.classify_intent(user_msg)
        if intent == "unknown":
            state["attempts"] += 1
            state["messages"].append(AIMessage(content="I didn't understand. Are you looking to buy or sell a property?"))
        else:
            state["intent"] = intent
            state["messages"].append(AIMessage(content="Great! Can I get your name?"))
            state["pending_action"] = "get_name"
            state["attempts"] = 0
        return self.handle_max_attempts(state)

    def get_buy_type(self, state: ChatState) -> ChatState:
        user_msg = state["messages"][-1].content.strip()
        buy_type = self.intent_agent.classify_buy_type(user_msg)
        if buy_type == "unknown":
            state["attempts"] += 1
            state["messages"].append(AIMessage(content="I didn't catch that. Are you looking for a new home or a re-sale property?"))
        else:
            state["buy_type"] = buy_type
            state["messages"].append(AIMessage(content="What is your budget?"))
            state["pending_action"] = "get_budget"
            state["attempts"] = 0
        return self.handle_max_attempts(state)

    def check_reassistance(self, state: ChatState) -> ChatState:
        user_msg = state["messages"][-1].content.strip()
        resp = self.intent_agent.classify_yes_no(user_msg)
        if resp == "yes":
            state.update({
                "intent": None,
                "buy_type": None,
                "budget": None,
                "postcode": None,
                "postcode_covered": None,
                "suggested_postcode": None,
                "pending_action": "process_intent",
                "attempts": 0
            })
            state["messages"].append(AIMessage(content="Okay, let's start over. Are you looking to buy or sell a property?"))
        elif resp == "no":
            state["messages"].append(AIMessage(content="Thank you for choosing RealtyFlow. Goodbye!"))
            state["conversation_ended"] = True
        else:
            state["attempts"] += 1
            state["messages"].append(AIMessage(content="Please answer with yes or no."))
        return self.handle_max_attempts(state)

# RealtyFlow Chatbot Class
class RealtyFlowChatbot:
    def __init__(self):
        self.llm = ChatGoogleGenerativeAI(
            model="gemini-1.5-flash-latest",
            temperature=0.1,
            google_api_key=GOOGLE_API_KEY
        )
        self.embedding_model = GoogleGenerativeAIEmbeddings(
            model="models/embedding-001",
            google_api_key=GOOGLE_API_KEY
        )
        eligible_set, eligible_list = load_eligible_postcodes(POSTCODE_FILE)
        self.orchestrator = OrchestratorAgent(self.llm, self.embedding_model, eligible_set, eligible_list)
        self.state = self.initial_state()
        self.workflow = self.build_workflow()

    def initial_state(self) -> ChatState:
        return {
            "messages": [AIMessage(content="Hello! Welcome to RealtyFlow. Are you looking to buy or sell a property?")],
            "intent": None,
            "buy_type": None,
            "name": None,
            "phone": None,
            "email": None,
            "budget": None,
            "postcode": None,
            "postcode_covered": None,
            "suggested_postcode": None,
            "pending_action": "process_intent",
            "attempts": 0,
            "conversation_ended": False
        }

    def build_workflow(self) -> StateGraph:
        workflow = StateGraph(ChatState)
        workflow.add_node("process_intent", self.orchestrator.process_intent)
        workflow.add_node("get_name", self.orchestrator.info_agent.get_name)
        workflow.add_node("get_phone", self.orchestrator.info_agent.get_phone)
        workflow.add_node("get_email", self.orchestrator.info_agent.get_email)
        workflow.add_node("get_buy_type", self.orchestrator.get_buy_type)
        workflow.add_node("get_budget", self.orchestrator.budget_agent.process_budget)
        workflow.add_node("get_postcode", self.orchestrator.postcode_agent.process_postcode)
        workflow.add_node("check_reassistance", self.orchestrator.check_reassistance)
        workflow.set_entry_point("process_intent")
        return workflow.compile()

    def process_message(self, user_message: str) -> str:
        self.state["messages"].append(HumanMessage(content=user_message))
        pending = self.state.get("pending_action")
        nodes = {
            "process_intent": "process_intent",
            "get_name": "get_name",
            "get_phone": "get_phone",
            "get_email": "get_email",
            "get_buy_type": "get_buy_type",
            "get_budget": "get_budget",
            "get_postcode": "get_postcode",
            "check_reassistance": "check_reassistance"
        }
        if pending in nodes:
            self.state = self.workflow.invoke(self.state, config={"node": nodes[pending]})
        else:
            self.state["messages"].append(AIMessage(content="I'm sorry, I encountered an error."))
            self.state["conversation_ended"] = True
        return self.state["messages"][-1].content

    def is_conversation_ended(self) -> bool:
        return self.state.get("conversation_ended", False)

    def get_collected_info(self) -> Dict[str, Any]:
        keys = ["name", "phone", "email", "intent", "buy_type", "budget", "postcode", "postcode_covered", "suggested_postcode"]
        return {k: self.state.get(k) for k in keys}

In [None]:
# Main Execution
def run_conversation():
    chatbot = RealtyFlowChatbot()
    print("\n--- RealtyFlow AI Chatbot ---\n")
    print("Bot:", chatbot.state["messages"][0].content)
    while not chatbot.is_conversation_ended():
        user_input = input("You: ")
        if user_input.lower() in ["quit", "exit"]:
            print("Conversation terminated by user.")
            break
        response = chatbot.process_message(user_input)
        print("Bot:", response)
    print("\n--- Collected Information ---")
    info = chatbot.get_collected_info()
    for key, value in info.items():
        if value:
            print(f"{key.capitalize()}: {value}")

if __name__ == "__main__":
    run_conversation()


--- RealtyFlow AI Chatbot ---

Bot: Hello! Welcome to RealtyFlow. Are you looking to buy or sell a property?
You: Yes 
Bot: Great! Can I get your name?
You: Mohd Kaif 
Bot: Great! Can I get your name?
You: Mohd Kaif 
Bot: Great! Can I get your name?
You: 
Bot: Great! Can I get your name?
You: 
Bot: Great! Can I get your name?
