# HomeMatch Agent - Personalizing Real Estate with Generative AI

### Summary
In this project, I developed HomeMatch, an application that uses OpenAI's GPT models and a ChromaDB vector database to personalize real estate listings based on buyer preferences. Buyer inputs, such as the preferred number of bedrooms, outdoor spaces, eco-friendly features, and neighborhood qualities, were converted into a vector embedding with text-embedding-ada-002 and matched against the embedded property listings by vector distance.

Each matched listing was further personalized using GPT-3.5 Turbo, enhancing the descriptions to align with buyer needs while maintaining factual accuracy.

All generated listings were exported into a structured text file for review and future use.

### Step 1
Installing the necessary packages and setting the OpenAI API key

In [1]:
!pip install --upgrade openai chromadb


In [2]:
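import os

# Prefer an environment variable so the key is not saved with the notebook;
# the placeholder is only a fallback.
openai_api_key = os.environ.get("OPENAI_API_KEY", "YOUR_API_KEY")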

### Step 2

Generating real estate listings with a large language model from predefined key attributes

In [4]:
import openai

openai_client = openai.Client(api_key=openai_api_key)

def generate_listing_with_features(neighborhood, price, bedrooms, bathrooms, size, features):
    features_text = ", ".join(features)
    prompt = f"""
    Generate a real estate listing:
    Neighborhood: {neighborhood}
    Price: {price}
    Bedrooms: {bedrooms}
    Bathrooms: {bathrooms}
    House Size: {size} sqft
    Special Features: {features_text}
    
    Provide a detailed description of the house and the neighborhood, highlighting the special features.
    """
    response = openai_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant that generates real estate listings."},
            {"role": "user", "content": prompt}
        ],
        max_tokens=300,
        temperature=0.7
    )
    
    return response.choices[0].message.content.strip()

listings = [
    {
        "neighborhood": "Green Oaks",
        "price": "$800,000",
        "bedrooms": 3,
        "bathrooms": 2,
        "size": "2,000",
        "features": ["spacious kitchen", "solar panels", "large backyard"]
    },
    {
        "neighborhood": "Sunny Vale",
        "price": "$1,200,000",
        "bedrooms": 4,
        "bathrooms": 3,
        "size": "2,500",
        "features": ["pool", "modern appliances", "two-car garage"]
    },
    {
        "neighborhood": "Blue Ridge",
        "price": "$950,000",
        "bedrooms": 3,
        "bathrooms": 2,
        "size": "2,100",
        "features": ["energy-efficient heating", "hardwood floors", "solar panels"]
    },
    {
        "neighborhood": "Maplewood",
        "price": "$650,000",
        "bedrooms": 2,
        "bathrooms": 2,
        "size": "1,500",
        "features": ["small garden", "fireplace", "updated kitchen"]
    },
    {
        "neighborhood": "Pinehurst",
        "price": "$1,050,000",
        "bedrooms": 4,
        "bathrooms": 3,
        "size": "2,400",
        "features": ["spacious living room", "central air", "large backyard"]
    },
    {
        "neighborhood": "Lakeview",
        "price": "$700,000",
        "bedrooms": 3,
        "bathrooms": 2,
        "size": "1,800",
        "features": ["private deck", "solar panels", "garden space"]
    },
    {
        "neighborhood": "Silverbrook",
        "price": "$890,000",
        "bedrooms": 3,
        "bathrooms": 3,
        "size": "2,300",
        "features": ["finished basement", "energy-efficient lighting", "pool"]
    },
    {
        "neighborhood": "Elmwood",
        "price": "$990,000",
        "bedrooms": 4,
        "bathrooms": 3,
        "size": "2,600",
        "features": ["home office", "two-car garage", "open-concept design"]
    },
    {
        "neighborhood": "Cedar Park",
        "price": "$750,000",
        "bedrooms": 3,
        "bathrooms": 2,
        "size": "2,000",
        "features": ["solar roof", "modern appliances", "outdoor kitchen"]
    },
    {
        "neighborhood": "River Bend",
        "price": "$1,300,000",
        "bedrooms": 5,
        "bathrooms": 4,
        "size": "3,200",
        "features": ["pool", "guest house", "spacious patio"]
    }
]
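
The triple-quoted f-string in `generate_listing_with_features` carries the function body's indentation into the prompt text. As a minimal sketch of one way to avoid that, the prompt can be built separately and passed through `textwrap.dedent`. The helper name `build_listing_prompt` is hypothetical and not part of the original notebook; the prompt wording is copied from the cell above.

```python
import textwrap


def build_listing_prompt(neighborhood, price, bedrooms, bathrooms, size, features):
    """Return the listing prompt without the indentation of the source code."""
    features_text = ", ".join(features)
    prompt = f"""
    Generate a real estate listing:
    Neighborhood: {neighborhood}
    Price: {price}
    Bedrooms: {bedrooms}
    Bathrooms: {bathrooms}
    House Size: {size} sqft
    Special Features: {features_text}

    Provide a detailed description of the house and the neighborhood, highlighting the special features.
    """
    # dedent() removes the common leading whitespace; strip() drops the
    # blank first and last lines left by the triple quotes.
    return textwrap.dedent(prompt).strip()


# The dictionary keys match the parameter names, so a listing can be unpacked directly.
sample_listing = {
    "neighborhood": "Green Oaks",
    "price": "$800,000",
    "bedrooms": 3,
    "bathrooms": 2,
    "size": "2,000",
    "features": ["spacious kitchen", "solar panels", "large backyard"],
}
sample_prompt = build_listing_prompt(**sample_listing)
print(sample_prompt)
```

Because the builder makes no API call, the prompt can be inspected or unit-tested before any tokens are spent.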


### Step 3
Storing listings with their corresponding embeddings in a vector database

In [5]:
import chromadb
import re

chromadb_client = chromadb.Client()
collection_name = "real_estate_listings"
existing_collections = chromadb_client.list_collections()

if collection_name in [col.name for col in existing_collections]:
    chromadb_client.delete_collection(collection_name)
    print(f"Deleted existing collection: {collection_name}")

collection = chromadb_client.create_collection(collection_name)

def clean_text(text):
    text = re.sub(r'[^\x00-\x7F]+', '', text)
    text = re.sub(r'\*\*', '', text)
    text = text.replace('\n', ' ')
    text = re.sub(r' {2,}', ' ', text)
    text = re.sub(r'-+', '', text)
    text = re.sub(r'[^\w\s,.]', '', text)
    
    return text.strip()

def generate_embedding(text):
    response = openai_client.embeddings.create(
        model="text-embedding-ada-002",
        input=text
    )
    
    return response.data[0].embedding

def store_listing_in_db(listing_id, listing_text):
    # Embed the cleaned text, but store the original text as the document.
    clean_listing_text = clean_text(listing_text)
    embedding = generate_embedding(clean_listing_text)
    collection.add(
        ids=[listing_id],
        documents=[listing_text],
        embeddings=[embedding]
    )

for i, listing in enumerate(listings):
    description = generate_listing_with_features(
        listing["neighborhood"],
        listing["price"],
        listing["bedrooms"],
        listing["bathrooms"],
        listing["size"],
        listing["features"]
    )
    store_listing_in_db(f"listing_{i+1}", description)
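
One side effect of `clean_text` is worth noting: it deletes hyphens outright, so "4-bedroom" reaches the embedding model as "4bedroom", and the later removals can leave doubled spaces behind. The sketch below is a possible variant, not the notebook's method: it replaces hyphens with spaces and collapses whitespace as the last step. The name `clean_text_keep_words` is hypothetical.

```python
import re


def clean_text_keep_words(text):
    """Variant of clean_text that keeps hyphenated words separated."""
    text = re.sub(r'[^\x00-\x7F]+', '', text)  # drop emoji and other non-ASCII characters
    text = text.replace('**', '')               # drop Markdown bold markers
    text = re.sub(r'-+', ' ', text)             # "4-bedroom" becomes "4 bedroom"
    text = re.sub(r'[^\w\s,.]', '', text)       # drop remaining punctuation such as ":" and "$"
    text = re.sub(r'\s+', ' ', text)            # collapse newlines and repeated spaces last
    return text.strip()


sample_text = "🏡 **Price:** $1,200,000\n\nStunning 4-bedroom, 3-bathroom home"
cleaned = clean_text_keep_words(sample_text)
print(cleaned)
# Price 1,200,000 Stunning 4 bedroom, 3 bathroom home
```

Whether this changes retrieval quality in practice would need to be checked against real embeddings; the sketch only shows the text transformation.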


In [6]:
def check_database():
    results = collection.get(include=["documents", "embeddings"])
    
    print(f"Number of listings stored: {len(results['documents'])}")
    
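    # Show the second stored listing (index 1); the length check avoids an
    # IndexError when fewer than two listings are stored.
    if len(results['documents']) > 1:
        print("\nSample listing:")
        print(f"Listing: {results['documents'][1]}")
        print(f"Embedding: {results['embeddings'][1][:5]}...")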

check_database()

Number of listings stored: 10

Sample listing:
Listing: 🏡 **Real Estate Listing**
   
   **Neighborhood:** Sunny Vale
   
   **Price:** $1,200,000
   
   **Bedrooms:** 4
   
   **Bathrooms:** 3
   
   **House Size:** 2,500 sqft
   
   **Special Features:** Pool, Modern Appliances, Two-Car Garage

   **Description:**
   Welcome to this stunning 4-bedroom, 3-bathroom home located in the desirable neighborhood of Sunny Vale. This modern residence boasts a spacious 2,500 sqft of living space, perfect for comfortable family living.

   Step inside to discover a beautifully designed interior with sleek finishes and an abundance of natural light. The open-concept layout seamlessly connects the living room, dining area, and gourmet kitchen equipped with modern appliances, making it ideal for entertaining guests or everyday family meals.

   The master suite offers a private retreat with a luxurious en-suite bathroom featuring a soaking tub and separate shower. Three additional bedrooms provide

In [7]:
def save_listings_to_file_from_db(collection, filename="listings.txt"):
    results = collection.get(include=["documents"])
    if not results.get('documents'):
        print("No listings found in the database.")
        return
    
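    # Explicit UTF-8 keeps the emoji in the generated listings writable on
    # platforms whose default encoding is not UTF-8.
    with open(filename, "w", encoding="utf-8") as file: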
        for i, document in enumerate(results['documents']):
            file.write(f"Listing {i+1}:\n")
            file.write(f"{document}\n")
            file.write("\n" + "-"*40 + "\n\n")
    
    print(f"Listings saved to {filename}")

save_listings_to_file_from_db(collection)

Listings saved to listings.txt


### Step 4
Building the User Preference Interface

In [8]:
# Defining the buyer persona
questions = [
    "How many bedrooms are you looking for in your house?",
    "Would you prefer a house with outdoor spaces like a deck or garden?",
    "Do you prefer a house with eco-friendly features, such as solar panels?",
    "What neighborhood qualities are important to you in terms of peacefulness and nature?"
]
answers = [
    "I'm looking for a three-bedroom house.",
    "Yes, I would love a house with outdoor spaces like a deck or a garden.",
    "Yes, I'm very interested in houses with solar panels and other eco-friendly features.",
    "I want a neighborhood that is peaceful and well-connected to nature."
]

def generate_buyer_embedding(answers):
    preferences_text = " ".join(answers)
    response = openai_client.embeddings.create(
        model="text-embedding-ada-002",
        input=[preferences_text]
    )
    
    return response.data[0].embedding

buyer_embedding = generate_buyer_embedding(answers)

# Preview the first five dimensions instead of printing the full vector
buyer_embedding[:5]

[-0.00761585496366024,
 0.019766923040151596,
 -0.007018782198429108,
 0.008778241463005543,
 -0.018229778856039047]
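
The answers above are hardcoded for a single buyer persona. As a rough sketch of how they could be collected instead, the helper below asks each question and falls back to a default answer when the reply is empty. The function `collect_answers` and its `ask` parameter are assumptions introduced for illustration; the original notebook does not include an interactive prompt.

```python
def collect_answers(questions, defaults, ask=input):
    """Ask each question and fall back to the default when the reply is empty."""
    answers = []
    for question, default in zip(questions, defaults):
        reply = ask(f"{question}\n> ").strip()
        answers.append(reply or default)
    return answers


# A scripted stand-in for input() so the example runs non-interactively.
scripted_replies = iter(["Three bedrooms, please.", ""])
demo_answers = collect_answers(
    [
        "How many bedrooms are you looking for in your house?",
        "Would you prefer a house with outdoor spaces like a deck or garden?",
    ],
    [
        "I'm looking for a three-bedroom house.",
        "Yes, I would love a house with outdoor spaces like a deck or a garden.",
    ],
    ask=lambda prompt: next(scripted_replies),
)
print(demo_answers)
```

In a notebook, calling `collect_answers(questions, answers)` with the default `ask=input` would prompt for each question and keep the persona's answer wherever the reply is left blank.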

In [9]:
results = collection.query(
    query_embeddings=[buyer_embedding],
    n_results=1,
    include=["documents", "distances"]
)

# query() returns one inner list per query embedding; [0] selects the matches
# for the single buyer embedding.
for i, (doc, distance) in enumerate(zip(results['documents'][0], results['distances'][0])):
    print(f"Match {i+1}:")
    print(f"Listing: {doc}")
    print(f"Distance: {distance}")
    print("-----")

Match 1:
Listing: 🏡 **Beautiful Home in Lakeview Neighborhood**

💰 **Price:** $700,000

🛏️ **Bedrooms:** 3  
🛁 **Bathrooms:** 2  
📏 **House Size:** 1,800 sqft

🌟 **Special Features:** Private deck, solar panels, garden space

Nestled in the sought-after Lakeview neighborhood, this charming 3-bedroom, 2-bathroom home offers a perfect blend of modern amenities and cozy living spaces. The house spans 1,800 square feet, providing ample room for comfortable living.

The private deck is an ideal spot for relaxing and entertaining, offering serene views of the neighborhood. Imagine enjoying your morning coffee or hosting gatherings in this tranquil outdoor space.

Equipped with solar panels, this home is energy-efficient and environmentally friendly, helping you save on utility costs while reducing your carbon footprint. Embrace sustainable living in style.

The property also boasts a lovely garden space, perfect for those with a green thumb or anyone looking to create their

### Step 5
Searching Based on Preferences

In [10]:
def retrieve_best_listings(buyer_embedding, collection, n_results=5, distance_threshold=0.4):
    results = collection.query(
        query_embeddings=[buyer_embedding],
        n_results=n_results,
        include=["documents", "distances"]
    )
    
    documents = results['documents'][0]
    distances = results['distances'][0]
    # Keep only matches within the distance threshold (smaller distance = closer match).
    filtered_results = [
        {"listing": document, "distance": distance}
        for document, distance in zip(documents, distances)
        if distance <= distance_threshold
    ]
    
    return sorted(filtered_results, key=lambda x: x['distance'])

best_listings = retrieve_best_listings(buyer_embedding, collection, n_results=5)
for i, listing in enumerate(best_listings):
    print(f"Listing {i+1}: {listing['listing']}\nDistance: {listing['distance']}\n")


Listing 1: 🏡 **Beautiful Home in Lakeview Neighborhood**

💰 **Price:** $700,000

🛏️ **Bedrooms:** 3  
🛁 **Bathrooms:** 2  
📏 **House Size:** 1,800 sqft

🌟 **Special Features:** Private deck, solar panels, garden space

Nestled in the sought-after Lakeview neighborhood, this charming 3-bedroom, 2-bathroom home offers a perfect blend of modern amenities and cozy living spaces. The house spans 1,800 square feet, providing ample room for comfortable living.

The private deck is an ideal spot for relaxing and entertaining, offering serene views of the neighborhood. Imagine enjoying your morning coffee or hosting gatherings in this tranquil outdoor space.

Equipped with solar panels, this home is energy-efficient and environmentally friendly, helping you save on utility costs while reducing your carbon footprint. Embrace sustainable living in style.

The property also boasts a lovely garden space, perfect for those with a green thumb or anyone looking to create their own oasis right at home.
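
The `distance_threshold=0.4` cutoff is easier to interpret as a similarity score. The sketch below assumes two things that the notebook does not state explicitly: that the collection uses Chroma's default squared-L2 distance, and that the embeddings have unit length (as OpenAI documents for its embedding models). Under those assumptions, squared L2 distance equals `2 - 2 * cosine_similarity`, so a cutoff of 0.4 corresponds to a cosine similarity of roughly 0.8 or higher. The toy vectors are made up for illustration.

```python
import math


def squared_l2(a, b):
    """Sum of squared coordinate differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


# Two toy 3-dimensional vectors standing in for real embeddings.
buyer_vec = normalize([0.2, 0.9, 0.4])
listing_vec = normalize([0.1, 0.8, 0.6])

distance = squared_l2(buyer_vec, listing_vec)
similarity = cosine_similarity(buyer_vec, listing_vec)

# For unit vectors: squared L2 distance = 2 - 2 * cosine similarity.
assert math.isclose(distance, 2 - 2 * similarity)

# Under the same assumption, convert the distance cutoff to a similarity cutoff.
threshold_as_similarity = 1 - 0.4 / 2
print(f"distance={distance:.4f}, similarity={similarity:.4f}")
print(f"distance <= 0.4 is equivalent to similarity >= {threshold_as_similarity:.2f}")
```

If the collection were created with a different distance space, the threshold would need to be reinterpreted accordingly.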

### Step 6
Personalizing Listing Descriptions

In [11]:
def personalize_listing(listing, questions, answers):
    prompt = f"""
    Original Listing Description:
    {listing['listing']}

    Buyer Preferences:
    1. {questions[0]} Answer: {answers[0]}
    2. {questions[1]} Answer: {answers[1]}
    3. {questions[2]} Answer: {answers[2]}
    4. {questions[3]} Answer: {answers[3]}

    Task: Rewrite the original listing description to highlight features that match the buyer's preferences. Emphasize the aspects of the listing that align with the buyer’s needs, such as the number of bedrooms, outdoor spaces, eco-friendly features, and the peacefulness of the neighborhood. Do not alter any factual details.
    """
    response = openai_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful real estate assistant."},
            {"role": "user", "content": prompt}
        ],
        max_tokens=300,
        temperature=0.7
    )
    
    return response.choices[0].message.content.strip()

for i, listing in enumerate(best_listings):
    personalized_description = personalize_listing(listing, questions, answers)
    print(f"Personalized Listing {i+1}:\n{personalized_description}\n")

Personalized Listing 1:
🏡 **Perfect 3-Bedroom Retreat in Tranquil Lakeview Neighborhood**

💰 **Price:** $700,000

🛏️ **Bedrooms:** 3  
🛁 **Bathrooms:** 2  
📏 **House Size:** 1,800 sqft

🌟 **Special Features:** Private deck, solar panels, garden space

Discover your dream home in the serene Lakeview neighborhood, where this delightful 3-bedroom, 2-bathroom residence awaits. With a spacious 1,800 square feet of living space, this home offers a cozy yet modern lifestyle tailored to your needs.

Step outside to your private deck, a peaceful sanctuary perfect for unwinding or hosting intimate gatherings. Enjoy the tranquil views of the neighborhood while savoring your morning coffee or relaxing under the stars in this inviting outdoor retreat.

Experience sustainable living at its finest with solar panels adorning the roof of this eco-conscious home. Not only will you benefit from reduced utility costs, but you'll also play a part in preserving the environment while enjoying the comforts of
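
The prompt in `personalize_listing` indexes exactly four question/answer pairs, so it would need editing if the questionnaire changed. A small sketch of a more flexible alternative is shown below; `format_preferences` is a hypothetical helper that renders any number of pairs in the same numbered layout.

```python
def format_preferences(questions, answers):
    """Render question/answer pairs as the numbered block used in the prompt."""
    lines = [
        f"{number}. {question} Answer: {answer}"
        for number, (question, answer) in enumerate(zip(questions, answers), start=1)
    ]
    return "\n".join(lines)


demo_questions = [
    "How many bedrooms are you looking for in your house?",
    "Do you prefer a house with eco-friendly features, such as solar panels?",
]
demo_replies = [
    "I'm looking for a three-bedroom house.",
    "Yes, I'm very interested in houses with solar panels and other eco-friendly features.",
]
preferences_block = format_preferences(demo_questions, demo_replies)
print(preferences_block)
```

The returned string could be interpolated under the "Buyer Preferences:" line of the prompt in place of the four hardcoded entries.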