This is a starter notebook for the project. You'll need to import the libraries you use; the ones available in this workspace are listed in its requirements.txt file.

In [1]:
!pip install -r requirements.txt

Defaulting to user installation because normal site-packages is not writeable

In [2]:
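import os

# Route OpenAI calls through the Vocareum proxy. The key below is a
# workspace-issued classroom key; keep real keys out of notebooks.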
os.environ["OPENAI_API_KEY"] = "voc-205140254126677398870367f26a48400452.87094479"
os.environ["OPENAI_API_BASE"] = "https://openai.vocareum.com/v1"

from langchain.llms import OpenAI


In [3]:
import openai
import os
import json

# Set up the OpenAI API with Vocareum routing
# (module-level settings used by the pre-1.0 openai package)
openai.api_key = os.environ["OPENAI_API_KEY"]
openai.api_base = os.environ["OPENAI_API_BASE"]

def generate_listing():
    """Ask the chat model for one listing in a fixed, labelled format and return it as text."""
    prompt = (
        "Generate a realistic real estate listing with the following format:\n\n"
        "Neighborhood: <neighborhood>\n"
        "Price: <price>\n"
        "Bedrooms: <number>\n"
        "Bathrooms: <number>\n"
        "House Size: <sqft>\n"
        "Description: <listing description>\n"
        "Neighborhood Description: <description of the neighborhood>\n\n"
        "Make it engaging, factual, and unique."
    )
    
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}]
    )
    
    return response['choices'][0]['message']['content'].strip()

# Generate 10 listings
listings = []
for i in range(10):
    print(f"Generating listing {i+1}...")
    listing = generate_listing()
    listings.append(listing)

# Save to file
with open("listings.txt", "w") as f:
    for item in listings:
        f.write(item + "\n\n")

Generating listing 1...
Generating listing 2...
Generating listing 3...
Generating listing 4...
Generating listing 5...
Generating listing 6...
Generating listing 7...
Generating listing 8...
Generating listing 9...
Generating listing 10...
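
The prompt asks for seven labelled fields, but nothing in the cell above checks that the model actually returned all of them. A minimal sketch of such a check follows. The name `missing_labels` and the rule that a label must start a line are assumptions made for illustration, not part of the starter notebook.

```python
from typing import List

# Labels copied from the prompt in generate_listing().
REQUIRED_LABELS = [
    "Neighborhood:",
    "Price:",
    "Bedrooms:",
    "Bathrooms:",
    "House Size:",
    "Description:",
    "Neighborhood Description:",
]

def missing_labels(listing: str) -> List[str]:
    """Return the required labels that do not start any line of the listing."""
    lines = [line.strip() for line in listing.splitlines()]
    return [
        label
        for label in REQUIRED_LABELS
        if not any(line.startswith(label) for line in lines)
    ]

# Hypothetical, deliberately incomplete listing.
sample = "Neighborhood: North End\nPrice: $650,000\nBedrooms: 3"
print(missing_labels(sample))
```

A listing with a non-empty result could be regenerated before it is appended to `listings`.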


In [4]:
with open("listings.txt") as f:
    print(f.read()[:1000])  # Preview first 1000 characters

Neighborhood: North End
Price: $650,000
Bedrooms: 3
Bathrooms: 2.5
House Size: 2,200 sqft
Description: Welcome to this charming 1920s Craftsman home located in the desirable North End neighborhood. This beautifully renovated home features a gourmet kitchen with stainless steel appliances, a cozy fireplace in the living room, and a spacious backyard perfect for entertaining. The master suite boasts a luxurious en-suite bathroom and walk-in closet. 
Neighborhood Description: The North End is known for its tree-lined streets, historic homes, and vibrant community atmosphere. Enjoy easy access to local parks, shops, restaurants, and excellent schools. Don't miss this opportunity to own a piece of this coveted neighborhood.

Neighborhood: Beverly Hills
Price: $5,000,000
Bedrooms: 5
Bathrooms: 6
House Size: 6,000 sqft
Description: This stunning modern estate in prime Beverly Hills features sleek lines, high-end finishes, and panoramic views of the city. The spacious open concept living area 
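
Since the listings are saved as plain text, a later session may want to reload them without calling the API again. The sketch below splits the file contents at every line that begins with `Neighborhood: `. It assumes each listing starts with that label, as the prompt requests; `split_listings` is a hypothetical helper name.

```python
import re
from typing import List

def split_listings(text: str) -> List[str]:
    """Split saved file contents into listings, one per 'Neighborhood: ' line."""
    # Zero-width split at the start of each line that opens a new listing.
    # "Neighborhood Description:" does not match because of the colon position.
    parts = re.split(r"^(?=Neighborhood: )", text, flags=re.MULTILINE)
    return [part.strip() for part in parts if part.strip()]

# Hypothetical file contents in the same layout as listings.txt.
demo_text = (
    "Neighborhood: A\nPrice: $1\n\nNeighborhood Description: quiet\n\n"
    "Neighborhood: B\nPrice: $2\n\n"
)
print(len(split_listings(demo_text)))
```

With the real file this would be called as `split_listings(open("listings.txt").read())`. Splitting on the label rather than on blank lines keeps a listing intact even if the model inserted blank lines inside it.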

In [5]:
import chromadb
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction

# Setup Chroma client
chroma_client = chromadb.Client()

# Create embedding function using OpenAI
embedding_function = OpenAIEmbeddingFunction(
    api_key=os.environ["OPENAI_API_KEY"],
    api_base=os.environ["OPENAI_API_BASE"]
)

# Create a collection (or get it if it already exists)
collection = chroma_client.get_or_create_collection(
    name="real_estate_listings",
    embedding_function=embedding_function
)

# Store listings in ChromaDB in a single batched call
collection.add(
    documents=listings,
    ids=[f"listing-{i+1}" for i in range(len(listings))]
)

print(f"Stored {len(listings)} listings in ChromaDB.")

Stored 10 listings in ChromaDB.


In [6]:
collection.count()  

10

In [7]:
results = collection.get()
for doc in results["documents"]:
    print(doc[:200], "\n---\n")

Neighborhood: North End
Price: $650,000
Bedrooms: 3
Bathrooms: 2.5
House Size: 2,200 sqft
Description: Welcome to this charming 1920s Craftsman home located in the desirable North End neighborhood. Th 
---

Neighborhood: Beverly Hills
Price: $5,000,000
Bedrooms: 5
Bathrooms: 6
House Size: 6,000 sqft
Description: This stunning modern estate in prime Beverly Hills features sleek lines, high-end finishes, a 
---

Neighborhood: Silver Lake
Price: $850,000
Bedrooms: 3
Bathrooms: 2.5
House Size: 1,800 sqft
Description: This stunning modern 3-bedroom, 2.5-bathroom home in trendy Silver Lake boasts an open floor pl 
---

Neighborhood: Green Hills
Price: $550,000
Bedrooms: 3
Bathrooms: 2.5
House Size: 2,200 sqft
Description: Welcome to this charming 3 bedroom, 2.5 bathroom home located in the desirable Green Hills neig 
---

Neighborhood: Park Slope, Brooklyn
Price: $1,200,000
Bedrooms: 4
Bathrooms: 2.5
House Size: 2,500 sqft
Description: Welcome to this charming brownstone in the heart of Park

In [8]:
### Collecting Buyer Preferences

# Simulate buyer answering natural language questions
questions = [
    "How big do you want your house to be?",
    "What are the 3 most important things for you in choosing this property?",
    "Which amenities would you like?",
    "Which transportation options are important to you?",
    "How urban do you want your neighborhood to be?",
]

answers = [
    "A modern two-bedroom apartment with lots of sunlight and a balcony.",
    "Close to downtown, near public transport, and safe for walking.",
    "In-building gym, covered parking, and pet-friendly environment.",
    "Short walk to a train station and bus stop.",
    "Urban, lively environment with shops and cafes nearby."
]

# Combine all answers into one single input for semantic search
buyer_preferences = " ".join(answers)

print("Combined Buyer Preferences:")
print(buyer_preferences)

Combined Buyer Preferences:
A modern two-bedroom apartment with lots of sunlight and a balcony. Close to downtown, near public transport, and safe for walking. In-building gym, covered parking, and pet-friendly environment. Short walk to a train station and bus stop. Urban, lively environment with shops and cafes nearby.
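
The cell above joins only the answers, so the `questions` list is never used. One alternative is to keep each question next to its answer so the query text carries that context. This is a sketch only: `format_preferences` is a hypothetical helper, and whether the extra context improves retrieval has not been measured here.

```python
from typing import List

def format_preferences(questions: List[str], answers: List[str]) -> str:
    """Interleave questions and answers into one query string."""
    if len(questions) != len(answers):
        raise ValueError("questions and answers must have the same length")
    return "\n".join(f"Q: {q}\nA: {a}" for q, a in zip(questions, answers))

# Two of the question/answer pairs from the cell above.
demo = format_preferences(
    ["Which amenities would you like?",
     "Which transportation options are important to you?"],
    ["In-building gym, covered parking, and pet-friendly environment.",
     "Short walk to a train station and bus stop."],
)
print(demo)
```

The length check also catches the case where a question is added without a matching answer.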


In [9]:
### Semantic Search Based on Buyer Preferences

# Number of results to return
TOP_K = 3

# Query the ChromaDB vector store
search_results = collection.query(
    query_texts=[buyer_preferences],
    n_results=TOP_K
)

# Display the matched listings
print(f"Top {TOP_K} Listings Matching Buyer Preferences:\n")
for i, doc in enumerate(search_results["documents"][0]):
    print(f"--- Result {i+1} ---")
    print(doc)
    print("\n")

Top 3 Listings Matching Buyer Preferences:

--- Result 1 ---
Neighborhood: Downtown Manhattan
Price: $1,250,000
Bedrooms: 2
Bathrooms: 2
House Size: 1,200 sqft
Description: This stunning apartment in the heart of Downtown Manhattan offers the perfect blend of luxury and convenience. Featuring two spacious bedrooms, two modern bathrooms, and 1,200 square feet of living space, this home is designed for comfortable city living. The open concept kitchen and living area is perfect for entertaining guests, while the large windows provide breathtaking views of the iconic city skyline. The building also offers top-notch amenities, including a fitness center, rooftop terrace, and 24-hour concierge service.
Neighborhood Description: Downtown Manhattan is known for its vibrant energy, with a bustling mix of shops, restaurants, and cultural attractions. Residents can enjoy easy access to public transportation, world-class dining options, and exciting entertainment venues. From the historic charm o

In [10]:
print(search_results["ids"][0])

['listing-7', 'listing-6', 'listing-5']
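
The ids alone do not show how close each match is. Chroma's `query` normally returns a `distances` list alongside `ids` and `documents`. Assuming that key is present in `search_results`, a small helper like the one below could line the three up. The helper name and the placeholder values in the demo are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple

def summarise_matches(results: Dict[str, Any]) -> List[Tuple[str, Optional[float], str]]:
    """Pair each returned id with its distance and the listing's first line."""
    ids = results["ids"][0]
    docs = results["documents"][0]
    # Fall back to None when distances were not included in the result.
    distances = (results.get("distances") or [[None] * len(ids)])[0]
    return [
        (doc_id, dist, doc.splitlines()[0] if doc else "")
        for doc_id, dist, doc in zip(ids, distances, docs)
    ]

# Placeholder data shaped like a query result; the numbers are made up.
fake_results = {
    "ids": [["listing-a", "listing-b"]],
    "documents": [["Neighborhood: X\nPrice: $1", "Neighborhood: Y\nPrice: $2"]],
    "distances": [[0.1, 0.2]],
}
for row in summarise_matches(fake_results):
    print(row)
```

With the default distance setting, a smaller value should mean a closer match, so a large gap between the first and third result may indicate that only the first is a good fit.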


In [11]:
### Personalise Listings Based on Buyer Preferences

def personalise_listing(original_listing, preferences):
    """Ask the chat model to rewrite one listing so it emphasises the buyer's preferences."""
    prompt = (
        f"You are a real estate agent.\n"
        f"The buyer has the following preferences:\n\"{preferences}\"\n\n"
        f"Here is a property listing:\n\"{original_listing}\"\n\n"
        f"Rewrite this listing to appeal to the buyer by highlighting the most relevant features.\n"
        f"Do not make up facts. Keep all factual info accurate.\n"
        f"Keep the output concise and professional."
    )

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}]
    )

    return response['choices'][0]['message']['content'].strip()


# Loop over matched listings and personalise them
print("Personalised Listings:\n")
for i, listing in enumerate(search_results["documents"][0]):
    personalised = personalise_listing(listing, buyer_preferences)
    print(f"--- Personalised Listing {i+1} ---\n{personalised}\n\n")

Personalised Listings:

--- Personalised Listing 1 ---
"Modern two-bedroom apartment in Downtown Manhattan, priced at $1,250,000. Enjoy an abundance of sunlight and a balcony with stunning city views. In-building gym, covered parking, and a pet-friendly environment included. Easy access to public transport, a short walk to train station and bus stop. Urban, lively neighborhood with shops and cafes nearby. 1,200 sqft of luxurious living space with top-notch amenities, including a fitness center and rooftop terrace."


--- Personalised Listing 2 ---
Welcome to this modern 2-bedroom apartment, located in an urban and lively environment in Park Slope, Brooklyn. This sunlit apartment features a balcony, in-building gym, covered parking, and is pet-friendly. Enjoy the convenience of being close to downtown, public transport, shops, cafes, a train station, and a bus stop. This apartment is perfect for those looking for a safe and walkable neighborhood with all amenities at your doorstep. Pric