# Project 4 - Personalized Real Estate Agent

This notebook builds HomeMatch, an application that uses an LLM and a vector database to match synthetic real estate listings to a buyer's stated preferences and then rewrite the matched listings for that buyer.

In [14]:
%pip install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.


In [None]:
# Import Required Libraries and Set Up Environment
import os
import chromadb
from chromadb.config import Settings
from openai import OpenAI

BASE_URL = os.getenv("OPENAI_BASE_URL", "https://openai.vocareum.com/v1")
API_KEY = os.getenv("OPENAI_API_KEY", "").strip()
if not API_KEY:
    raise ValueError("OPENAI_API_KEY is not set. Please set it in your environment.")

client = OpenAI(base_url=BASE_URL, api_key=API_KEY)

def _obfuscate(key: str, show: int = 4) -> str:
    return (key[:show] + "...") if key else ""

print(f"Using base URL: {BASE_URL}")
print(f"API key starts with: {_obfuscate(API_KEY)}")

Using base URL: https://openai.vocareum.com/v1
API key starts with: voc-...


## Step 2: Generate Synthetic Real Estate Listings

We'll use an LLM to generate at least 10 synthetic property listings for our database.

In [16]:
# Field layout each generated listing should follow
listing_prompt = """Generate a real estate listing with the following format:\nNeighborhood: [name]\nPrice: [amount]\nBedrooms: [number]\nBathrooms: [number]\nHouse Size: [size]\nDescription: [description]\nNeighborhood Description: [description]"""

# Generate listings and save them to a file for deliverables
def generate_listings(n: int = 10) -> list:
    # Reuse listing_prompt so every listing carries the same fields; the blank
    # line between listings is what the split below relies on.
    prompt = (
        f"Generate {n} diverse real estate listings, separated by blank lines. "
        f"For each one: {listing_prompt}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=1500,
    )
    text = response.choices[0].message.content
    if text is None:
        print("No content returned from OpenAI API.")
        return []
    # Drop empty chunks produced by extra blank lines in the response
    listings = [chunk.strip() for chunk in text.split("\n\n") if chunk.strip()]
    with open("listings.txt", "w") as f:
        f.write("\n\n".join(listings) + "\n")
    return listings

listings = generate_listings()
print(listings[0] if listings else "No listings generated.")

1. Neighborhood: West Hollywood, Los Angeles
   Price: $1,200,000
   Bedrooms: 3
   Bathrooms: 2
   Size: 1,800 sqft
   Description: Stunning modern home in the heart of West Hollywood. This newly renovated property features an open floor plan, gourmet kitchen, and luxurious master suite.
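
Since later steps treat each listing as free text, it can help to parse the `Key: value` lines into a dictionary, for example to attach price or bedroom count as metadata. The following is a minimal sketch, assuming the model keeps the `Key: value` layout shown in the output above; `parse_listing` is a hypothetical helper and not part of the notebook.

```python
import re

def parse_listing(text: str) -> dict:
    """Parse the 'Key: value' lines of one listing into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        # Drop a leading list number such as "1. " plus surrounding whitespace.
        line = re.sub(r"^\s*\d+\.\s*", "", line).strip()
        # Split on the first colon only, so colons inside a value are preserved.
        key, sep, value = line.partition(":")
        if sep and key.strip() and value.strip():
            fields[key.strip()] = value.strip()
    return fields

sample = """1. Neighborhood: West Hollywood, Los Angeles
   Price: $1,200,000
   Bedrooms: 3
   Bathrooms: 2
   Size: 1,800 sqft"""

parsed = parse_listing(sample)
print(parsed["Neighborhood"], parsed["Price"], parsed["Bedrooms"])
```

Values stay as strings here; converting prices or sizes to numbers would need extra handling for formats such as `1,800 sq. ft.`.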


## Step 3: Store Listings in a Vector Database

We'll use ChromaDB to store the listings and their embeddings.

In [17]:
# Load listings from file
with open("listings.txt", "r") as f:
    listings = [l.strip() for l in f.read().split("\n\n") if l.strip()]

# Generate embeddings for each listing using OpenAI API
def get_embedding(text, model="text-embedding-ada-002"):
    response = client.embeddings.create(
        input=text,
        model=model
    )
    return response.data[0].embedding

embeddings = [get_embedding(listing) for listing in listings]

# Initialize and configure ChromaDB
chroma_client = chromadb.Client(Settings())
collection = chroma_client.get_or_create_collection(name="home_listings")

# Store listings and embeddings in ChromaDB in a single batch
collection.add(
    documents=listings,
    embeddings=embeddings,
    ids=[str(idx) for idx in range(len(listings))],
)

print(f"Stored {len(listings)} listings in ChromaDB.")

Stored 10 listings in ChromaDB.
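
Conceptually, the collection answers a query by comparing the query vector against the stored vectors and returning the closest ones. The sketch below illustrates that idea with a brute-force scan over made-up 3-dimensional vectors standing in for real embeddings. Squared Euclidean distance is used as the metric for illustration, and `nearest` is a hypothetical helper; ChromaDB itself uses an approximate index rather than a full scan.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(query, vectors, k=2):
    """Return (id, distance) pairs for the k stored vectors closest to query."""
    scored = [(vid, squared_l2(query, vec)) for vid, vec in vectors.items()]
    # Smaller distance means a closer match, so sort ascending.
    return sorted(scored, key=lambda pair: pair[1])[:k]

# Toy vectors keyed by string ids, mirroring the ids used for the collection.
toy_vectors = {
    "0": [1.0, 0.0, 0.0],
    "1": [0.0, 1.0, 0.0],
    "2": [0.9, 0.1, 0.0],
}

top = nearest([1.0, 0.0, 0.0], toy_vectors, k=2)
print(top)
```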


## Step 4: Collect Buyer Preferences

In [18]:
questions = [
    "How big do you want your house to be?",
    "What are the 3 most important things for you in choosing this property?",
    "Which amenities would you like?",
    "Which transportation options are important to you?",
    "How urban do you want your neighborhood to be?",
]
answers = [
    "A comfortable three-bedroom house with a spacious kitchen and a cozy living room.",
    "A quiet neighborhood, good local schools, and convenient shopping options.",
    "A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.",
    "Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.",
    "A balance between suburban tranquility and access to urban amenities like restaurants and theaters.",
]
buyer_profile = " ".join(answers)
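
Joining only the answers drops the context that each question supplies. One alternative, shown here as a sketch, pairs every question with its answer before the text is embedded. Whether this improves retrieval for these listings is untested, and `build_profile` is a hypothetical helper, not part of the notebook.

```python
def build_profile(questions, answers):
    """Combine questions and answers into one text block for embedding."""
    if len(questions) != len(answers):
        raise ValueError("questions and answers must have the same length")
    return "\n".join(f"Q: {q}\nA: {a}" for q, a in zip(questions, answers))

# Shortened demo inputs; the notebook's full lists could be passed instead.
demo_questions = [
    "How big do you want your house to be?",
    "Which amenities would you like?",
]
demo_answers = [
    "A comfortable three-bedroom house.",
    "A backyard for gardening.",
]

profile = build_profile(demo_questions, demo_answers)
print(profile)
```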

## Step 5: Semantic Search for Matching Listings

In [19]:
# Convert buyer preferences to embedding
buyer_embedding = get_embedding(buyer_profile)

# Query ChromaDB for top 5 most relevant listings
results = collection.query(
    query_embeddings=[buyer_embedding],
    n_results=5
)

matched_listings = results['documents'][0]
matched_scores = results['distances'][0]

# Display matched listings. Chroma returns distances, so a lower value
# printed as "Similarity Score" means a closer match.
for i, (listing, score) in enumerate(zip(matched_listings, matched_scores)):
    print(f"- Matched Listing {i+1} (Similarity Score: {score:.4f}) -\n{listing}\n")

- Matched Listing 1 (Similarity Score: 0.3701) -
6. Neighborhood: Capitol Hill, Seattle, WA
   Price: $1,200,000
   Bedrooms: 4
   Bathrooms: 3
   Size: 2,000 sq. ft.
   Description: Craftsman-style home with a spacious backyard and covered porch. Features a chef's kitchen, hardwood floors, and a master suite with a soaking tub.

- Matched Listing 2 (Similarity Score: 0.3831) -
4. Neighborhood: Old Town, Alexandria, VA
   Price: $700,000
   Bedrooms: 3
   Bathrooms: 2.5
   Size: 1,800 sq. ft.
   Description: Charming historic townhouse with a brick facade and updated interior. Features a fenced backyard, gourmet kitchen, and original wood floors.

- Matched Listing 3 (Similarity Score: 0.3885) -
7. Neighborhood: Park Slope, Brooklyn, NY
   Price: $1,300,000
   Bedrooms: 3
   Bathrooms: 2
   Size: 1,600 sq. ft.
   Description: Renovated brownstone apartment with exposed brick walls and a private terrace. Located near Prospect Park, top-rated schools, and trendy boutiques.

- Matched Lis
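
The values printed above are distances, so lower means closer. If a similarity in the usual sense is wanted, a distance can be converted under two assumptions that are not verified in this notebook: the collection uses Chroma's default squared L2 metric, and the embeddings are unit-length (as OpenAI documents for its embedding models). For unit vectors, squared L2 distance d and cosine similarity s satisfy d = 2 - 2s. `l2sq_to_cosine` is a hypothetical helper.

```python
def l2sq_to_cosine(distance: float) -> float:
    """Convert squared L2 distance between unit vectors to cosine similarity.

    Assumes both vectors have length 1, so d = 2 - 2 * cos and cos = 1 - d / 2.
    """
    return 1.0 - distance / 2.0

# Distances copied from the matched-listing output above.
for d in (0.3701, 0.3831, 0.3885):
    print(f"distance={d:.4f} -> cosine similarity ~ {l2sq_to_cosine(d):.4f}")
```

If the collection were created with a cosine metric instead, the conversion would differ, so the metric should be checked before relying on these numbers.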

## Step 6: Personalize Listing Descriptions

In [20]:
# Personalize Listing Descriptions using LLM
personalized_listings = []
for listing in matched_listings:
    prompt = f"Rewrite the following real estate listing to emphasize the buyer's preferences below, without changing any factual details.\n\nBuyer Preferences: {buyer_profile}\n\nListing: {listing}\n\nPersonalized Description:"
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=500
    )
    # content can be None, so fall back to an empty string before stripping
    personalized = (response.choices[0].message.content or "").strip()
    personalized_listings.append(personalized)

# Display personalized listings
for i, pl in enumerate(personalized_listings):
    print(f"- Personalized Listing {i+1} -\n{pl}\n")

- Personalized Listing 1 -
Ideal for the discerning buyer seeking a comfortable three-bedroom home with a spacious kitchen and cozy living room. Located in a quiet neighborhood with good local schools and convenient shopping options. This property boasts a backyard perfect for gardening, a two-car garage, and a modern, energy-efficient heating system. Enjoy easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads. Experience suburban tranquility while still having access to urban amenities like restaurants and theaters.

- Personalized Listing 2 -
Ideal for the discerning buyer seeking a comfortable three-bedroom home in a quiet neighborhood, this charming historic townhouse in Old Town, Alexandria, VA ticks all the boxes. The spacious kitchen is perfect for those who love to cook and entertain, while the cozy living room offers a welcoming retreat. With good local schools nearby and convenient shopping options just a stone's throw away, this home offer
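
The prompt asks the model not to change factual details, but nothing in the loop checks that. A crude check, sketched below, extracts numeric tokens (prices, counts, sizes) from the original listing and reports any that do not appear in the rewrite. `missing_facts` is a hypothetical helper, and the `rewritten` string is an invented example, not model output. The substring test can produce false negatives (for instance, `3` also matches inside `$1,300,000`), so it only flags obvious omissions.

```python
import re

def missing_facts(original: str, rewritten: str) -> list:
    """Return numeric tokens from original that are absent from rewritten."""
    # Matches values such as "$700,000", "1,800", "2.5", and "3".
    tokens = re.findall(r"\$?\d+(?:,\d{3})*(?:\.\d+)?", original)
    return [t for t in tokens if t not in rewritten]

original = "Price: $700,000\nBedrooms: 3\nBathrooms: 2.5\nSize: 1,800 sq. ft."
# Invented rewrite for illustration; it omits the bathroom count.
rewritten = "A charming 3-bedroom townhouse priced at $700,000 with 1,800 sq. ft. of space."

print(missing_facts(original, rewritten))
```

A non-empty result could be used to retry the rewrite or to fall back to the original listing text.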