### REAL ESTATE AGENT ########################

Steps to build the agent:

1. Use an LLM to generate listings
2. Ask the client about their preferences, lifestyle, etc.
3. Use an LLM to generate a description of the ideal apartment
4. Query the listings using vector search (embeddings to find matching apartments)
5. Filter by hard constraints (such as budget or number of rooms)
6. Present the final listing, adapting its description to the user's preferences

Tests:

To evaluate the agent we will run two scenarios:
1. Family with kids looking for calm places
2. Single person looking for vibrant places
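
As a rough sketch, the two scenarios can be written down as data so that the same pipeline can be run on each. The answer values are the ones used later in this notebook; the names `SCENARIOS` and `budget_of` are illustrative only and are not part of the notebook.

```python
# Answers follow the questionnaire order used later in the notebook:
# work mode, kids, neighborhood, cooking, indoors/outdoors, budget in euros.
SCENARIOS = {
    "family_with_kids": ["Remotely", "Yes", "Calm", "Yes, I love!", "Indoors", "400000"],
    "single_person": ["Remotely", "No", "Vibrant", "Yes, I love!", "Indoors", "300000"],
}


def budget_of(answers):
    """Return the budget (the last answer) as a float."""
    return float(answers[-1])


for name, answers in SCENARIOS.items():
    print(f"{name}: prefers {answers[2].lower()} places, budget {budget_of(answers):.0f} EUR")
```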

In [77]:
!pip install -qU numpy
!pip install -qU openai
!pip install -qU langchain-openai
!pip install -U --quiet lancedb pandas pydantic
!pip install pylance

In [89]:
# Load environment variables (copy .env.dist to .env and set API_KEY and BASE_URL there)
from dotenv import load_dotenv
load_dotenv()  

True

In [90]:
import os
# Import from the packages installed above (langchain-openai pulls in langchain-core)
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
import openai

client = openai.OpenAI(
    base_url = os.environ.get('BASE_URL'),
    api_key = os.environ.get('API_KEY')
)

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0, api_key=os.environ["API_KEY"], base_url=os.environ.get('BASE_URL'))

In [85]:
import json

if not os.path.exists('listings.json'):
    response = llm.invoke("Generate 10 real estate listings for houses in the city of Porto, Portugal. Each listing should include the following fields: price, number_of_bedrooms, square_footage, description and neighborhood_description. Return only a JSON array, with no surrounding text.")

    listings = json.loads(response.content)
    with open('listings.json', 'w') as f:
        json.dump(listings, f)
else:
    with open('listings.json', 'r') as f:
        listings = json.load(f)
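
The cell above passes the model's reply straight to `json.loads`, which fails if the reply is wrapped in a Markdown code fence or lacks a field. A minimal sketch of a more defensive parser follows; `parse_listings`, `REQUIRED_FIELDS` and `FENCE` are hypothetical names introduced here, and the set of wrappers handled is an assumption about how a model might format its reply.

```python
import json

REQUIRED_FIELDS = {"price", "number_of_bedrooms", "square_footage",
                   "description", "neighborhood_description"}
FENCE = "`" * 3  # Markdown code-fence marker


def parse_listings(raw: str) -> list:
    """Parse an LLM reply into a list of listing dicts (hypothetical helper)."""
    text = raw.strip()
    # Drop a surrounding Markdown code fence if the reply has one.
    if text.startswith(FENCE):
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    data = json.loads(text)
    # Some replies wrap the list in a single-key object, e.g. {"listings": [...]}.
    if isinstance(data, dict) and len(data) == 1:
        data = next(iter(data.values()))
    if not isinstance(data, list):
        raise ValueError("expected a JSON list of listings")
    # Fail early if any listing lacks a field the rest of the notebook relies on.
    for i, item in enumerate(data):
        missing = REQUIRED_FIELDS - set(item)
        if missing:
            raise ValueError(f"listing {i} is missing fields: {sorted(missing)}")
    return data
```

With such a helper, `listings = parse_listings(response.content)` could stand in for the direct `json.loads` call.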

Define the base questionnaire and fill in the answers for the first scenario: Family with kids looking for calm places.

In [86]:
# questions to ask the user about their personal preferences
personal_questions = [  "Do you work in person or remotely?",
                        "Do you have kids?",
                        "Do you prefer vibrant neighborhoods or calm places?",
                        "Do you like to cook in your free time?",
                        "Do you prefer to spend time indoors or outdoors?",
                        "What is your budget limit (in euros)?",
                       ]

# personal_answers = [ ] 
# for question in personal_questions:
#    answer = input(question)
#    personal_answers.append(answer)
    
# list of your personal answers to the questions above
# first scenario: family with kids
personal_answers = ['Remotely', 'Yes', 'Calm', 'Yes, I love!', 'Indoors', '400000']

In [93]:
def fill_questionaire(personal_questions, personal_answers):
    # Pair each question with its answer, skipping the last question (budget),
    # which is applied later as a hard price filter instead of prompt text.
    questionnaire = ""
    for question, answer in zip(personal_questions[:-1], personal_answers[:-1]):
        questionnaire += question + " " + answer + "\n"
    return questionnaire
questionnaire = fill_questionaire(personal_questions, personal_answers)
print(questionnaire)

Do you work in person or remotely? Remotely
Do you have kids? Yes
Do you prefer vibrant neighborhoods or calm places? Calm
Do you like to cook in your free time? Yes, I love!
Do you prefer to spend time indoors or outdoors? Indoors



In [94]:
prompt_target_description = PromptTemplate(
    template="{question}\n{questionnaire}",
    input_variables=["question","questionnaire"],
)
question_target_description = """
    You are a real estate agent. Based on the context provided, generate a description of the ideal house for the user,
    taking into account their personal preferences described in the questionnaire below.
"""

query = prompt_target_description.format(question = question_target_description, questionnaire = questionnaire)
print(query)


    You are a real estate agent. Based on the context provided, generate a description of the ideal house for the user,
    taking into account their personal preferences described in the questionnaire below.

Do you work in person or remotely? Remotely
Do you have kids? Yes
Do you prefer vibrant neighborhoods or calm places? Calm
Do you like to cook in your free time? Yes, I love!
Do you prefer to spend time indoors or outdoors? Indoors



Using the questionnaire, we will generate the description of the ideal listing for this user:

In [95]:
target_description = llm.invoke(query).content
print(target_description)

Based on your preferences, the ideal house for you would be a spacious and cozy home located in a quiet and peaceful neighborhood. The house should have a large, well-equipped kitchen where you can indulge in your love for cooking. It should also have plenty of indoor living space for you and your kids to enjoy, with comfortable and inviting areas for relaxation and entertainment. Additionally, a backyard or outdoor space for some fresh air and relaxation would be a nice bonus, even though you prefer spending most of your time indoors. Overall, a tranquil and comfortable home that caters to your love for cooking and indoor activities would be perfect for you and your family.


Now it's time to create the embeddings for our vector database. We will build them from each listing's description and neighborhood description.

In [96]:
import time

print(f"Total listings to process: {len(listings)}")

# Generate embeddings for all listings
print("Generating embeddings for listings...")
embeddings = []
batch_size = 10  # Process in batches to avoid rate limits

for i in range(0, len(listings), batch_size):
    batch = listings[i:i+batch_size]
    batch_texts = [listing['description'] + " " + listing['neighborhood_description'] for listing in batch]

    try:
        response = client.embeddings.create(
            input=batch_texts,
            model="text-embedding-3-small"
        )
        
        batch_embeddings = [embedding.embedding for embedding in response.data]
        embeddings.extend(batch_embeddings)
        
        print(f"Processed batch {i//batch_size + 1}/{(len(listings) + batch_size - 1)//batch_size}")
        
        # Small delay to respect rate limits
        time.sleep(0.1)
        
    except Exception as e:
        print(f"Error processing batch {i//batch_size + 1}: {e}")
        break

print(f"Generated {len(embeddings)} embeddings")

Total listings to process: 10
Generating embeddings for listings...
Processed batch 1/1
Generated 10 embeddings


Now we will create the LanceDB table named listings to store our listings along with their embeddings.

In [None]:
import lancedb

def convert_to_float(price_str):
    try:
        return float(price_str.replace("$", "").replace(",", "").replace("\u20ac", "").strip())
    except ValueError:
        return 0.0

db = lancedb.connect("./lancedb")

# Prepare data for LanceDB
lance_data = []
for i, listing in enumerate(listings):
    if i < len(embeddings):
        lance_data.append({
            "id": i,
            "number_of_bedrooms": listing['number_of_bedrooms'],
            "square_footage": listing['square_footage'],
            "description": listing['description'],
            "neighborhood_description": listing['neighborhood_description'],
            "price": convert_to_float(listing['price']),
            "vector": embeddings[i]
        })

print(f"Prepared {len(lance_data)} listings for LanceDB")

# Create table (drop it first if it already exists)
try:
    db.drop_table("listings")
except Exception:
    # The table does not exist yet, so there is nothing to drop
    pass

table = db.create_table("listings", lance_data)
print("Created LanceDB table with listings and embeddings")

Prepared 10 listings for LanceDB
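
Two caveats about `convert_to_float` as written: it returns 0.0 for a price it cannot parse, and a 0.0 price would then pass any `price <` budget filter; it also assumes a string, so a numeric price from the LLM would raise `AttributeError` rather than the `ValueError` it catches. A minimal sketch of an alternative follows. `parse_price` is a hypothetical name, and treating commas as thousands separators is carried over from the original function.

```python
from typing import Optional, Union


def parse_price(value: Union[str, int, float]) -> Optional[float]:
    """Return the price as a float, or None when it cannot be parsed."""
    if isinstance(value, bool):
        return None  # bool is a subclass of int; never treat it as a price
    if isinstance(value, (int, float)):
        return float(value)
    if not isinstance(value, str):
        return None
    # Strip currency symbols and thousands separators (assumes "," is not a decimal mark).
    cleaned = value.replace("$", "").replace("\u20ac", "").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


print(parse_price("\u20ac300,000"))        # 300000.0
print(parse_price(450000))                 # 450000.0
print(parse_price("price on request"))     # None
```

Listings whose price comes back as `None` could then be skipped when building `lance_data` instead of being stored with a price of 0.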

In [98]:
from typing import Optional

def similarity_search(table, query_text: str, top_k: int = 5, budget_limit: Optional[float] = None):
    """Return up to top_k (id, text, price, distance) tuples closest to query_text."""
    print(f"Searching for: '{query_text}'")
    
    try:
        # Generate the query embedding with the same model used for the listings
        query_response = client.embeddings.create(
            input=[query_text],
            model="text-embedding-3-small"
        )
        query_embedding = query_response.data[0].embedding
        
        # Perform similarity search using LanceDB; apply the price filter
        # only when a budget is given (otherwise the filter would read "price < None")
        search = table.search(query_embedding)
        if budget_limit is not None:
            search = search.where(f"price < {budget_limit}")
        results = search.limit(top_k).to_pandas()

        # Format results
        search_results = []
        for _, row in results.iterrows():
            search_results.append((
                row['id'],
                row['description'] + " " + row['neighborhood_description'],
                row['price'],
                row['_distance']
            ))
        
        return search_results
        
    except Exception as e:
        print(f"Error during search: {e}")
        return []
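
To make the idea behind this search concrete, here is a small pure-Python sketch of a budget-filtered nearest-neighbor lookup on toy 2-D vectors. It is not how LanceDB is implemented. `filtered_nearest` and the toy data are hypothetical, and the squared Euclidean distance is an assumption chosen for simplicity; the `_distance` values LanceDB reports depend on the metric configured for the table.

```python
def filtered_nearest(vectors, prices, query, budget_limit, top_k=3):
    """Return (index, squared distance) pairs for the closest rows under budget."""
    scored = []
    for idx, (vec, price) in enumerate(zip(vectors, prices)):
        # Hard constraint first: skip rows at or above the budget.
        if price >= budget_limit:
            continue
        # Squared Euclidean distance between this row and the query vector.
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((idx, dist))
    # Smaller distance means a closer match.
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]


toy_vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
toy_prices = [550000, 300000, 350000]
print(filtered_nearest(toy_vectors, toy_prices, query=[1.0, 0.0], budget_limit=400000))
```

Row 0 matches the query exactly but is over budget, so it is excluded and row 2 ranks first. This is why a low budget can return fewer than `top_k` results.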

Now we can finally search for the best listing using the target_description:

In [99]:
# convert_to_float already returns a float, so no extra float() cast is needed
results = similarity_search(table, target_description, top_k=3, budget_limit=convert_to_float(personal_answers[-1]))

print("\nTop 3 listings:")
for i, (listing_id, text, price, distance) in enumerate(results, 1):
    print(f"\n{i}. Listing ID: {listing_id}")
    print(f"   Price: {price}")
    print(f"   Distance: {distance:.4f}")
    print(f"   Text: {text}")
    print("-" * 80)

Searching for: 'Based on your preferences, the ideal house for you would be a spacious and cozy home located in a quiet and peaceful neighborhood. The house should have a large, well-equipped kitchen where you can indulge in your love for cooking. It should also have plenty of indoor living space for you and your kids to enjoy, with comfortable and inviting areas for relaxation and entertainment. Additionally, a backyard or outdoor space for some fresh air and relaxation would be a nice bonus, even though you prefer spending most of your time indoors. Overall, a tranquil and comfortable home that caters to your love for cooking and indoor activities would be perfect for you and your family.'

Top 3 listings:

1. Listing ID: 6
   Price: 300000.0
   Distance: 1.0699
   Text: Charming and well-maintained apartment with a private garden, a spacious kitchen, and hardwood floors. Located in a family-friendly neighborhood offering many recreational parks, trendy cafes, and grocery stores.
---

This is the prompt to personalize the listing to our user:

In [101]:
prompt_final_presentation = PromptTemplate(
    template="{question}\nQUESTIONNAIRE:\n{questionnaire}\nLISTING:\n{listing}",
    input_variables=["question","questionnaire","listing"],
)
question_final_presentation = """
    You are a real estate agent that needs to present a property listing. Based on the context provided, generate the best description of the ideal house for the user,
    taking into account their personal preferences described in the questionnaire and listing below.
"""
if results:
    listing = listings[results[0][0]]
else:
    raise ValueError("No listings found")
query = prompt_final_presentation.format(question = question_final_presentation, questionnaire = questionnaire, listing = listing)
print(query)


    You are a real estate agent that needs to present a property listing. Based on the context provided, generate the best description of the ideal house for the user,
    taking into account their personal preferences described in the questionnaire and listing below.

QUESTIONNAIRE:
Do you work in person or remotely? Remotely
Do you have kids? Yes
Do you prefer vibrant neighborhoods or calm places? Calm
Do you like to cook in your free time? Yes, I love!
Do you prefer to spend time indoors or outdoors? Indoors

LISTING:
{'price': '€300,000', 'number_of_bedrooms': 2, 'square_footage': 1600, 'description': 'Charming and well-maintained apartment with a private garden, a spacious kitchen, and hardwood floors.', 'neighborhood_description': 'Located in a family-friendly neighborhood offering many recreational parks, trendy cafes, and grocery stores.'}
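
The LISTING section above is the Python repr of the listing dict. As an optional variation, the listing could be rendered as labeled lines before it is placed in the prompt. The helper `format_listing` and its labels are hypothetical and not part of the notebook; whether this changes the model's output has not been tested here.

```python
def format_listing(listing: dict) -> str:
    """Render a listing dict as labeled lines for use inside a prompt."""
    labels = {
        "price": "Price",
        "number_of_bedrooms": "Bedrooms",
        "square_footage": "Square footage",
        "description": "Description",
        "neighborhood_description": "Neighborhood",
    }
    # Keep a fixed field order and skip any field the listing does not have.
    return "\n".join(
        f"{label}: {listing[key]}" for key, label in labels.items() if key in listing
    )


example = {
    "price": "€300,000",
    "number_of_bedrooms": 2,
    "square_footage": 1600,
    "description": "Charming and well-maintained apartment with a private garden, a spacious kitchen, and hardwood floors.",
    "neighborhood_description": "Located in a family-friendly neighborhood offering many recreational parks, trendy cafes, and grocery stores.",
}
print(format_listing(example))
```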


Finally, we present the recommendation to our family looking for a new home:

In [102]:
recommended_description = llm.invoke(query).content
print("**** Welcome to your new home!******\n")
print(recommended_description)

**** Welcome to your new home!******

Based on your preferences, I have found the perfect property for you! This charming and well-maintained apartment is priced at €300,000 and offers 2 bedrooms with a square footage of 1600. It features a private garden, a spacious kitchen (perfect for your love of cooking), and hardwood floors for a cozy feel indoors. 

Located in a family-friendly neighborhood, this apartment is surrounded by recreational parks, trendy cafes, and grocery stores, providing a calm and vibrant balance that suits your lifestyle. It's the ideal space for you to enjoy spending time indoors with your kids while still having access to outdoor amenities nearby. Don't miss out on this perfect home for you!


To test our agent, we will adapt the answers to the second scenario (Single person looking for vibrant places):

In [103]:
### Update personal_answers to test different budgets and preferences (now less money, no kids, vibrant neighborhood)
personal_answers = ['Remotely', 'No', 'Vibrant', 'Yes, I love!', 'Indoors', '300000']

query = prompt_target_description.format(question = question_target_description, questionnaire = fill_questionaire(personal_questions, personal_answers))
target_description = llm.invoke(query).content

results = similarity_search(table, target_description, top_k=3, budget_limit=convert_to_float(personal_answers[-1]))

if results:
    listing = listings[results[0][0]]
else:
    raise ValueError("No listings found")
query = prompt_final_presentation.format(question = question_final_presentation, questionnaire = fill_questionaire(personal_questions, personal_answers), listing = listing)

recommended_description = llm.invoke(query).content
print("**** Welcome to your new home!******\n")
print(recommended_description)

Searching for: 'Based on your preferences, the ideal house for you would be a modern and vibrant urban apartment located in a bustling neighborhood with plenty of restaurants, cafes, and entertainment options. The apartment should have a spacious and well-equipped kitchen where you can indulge in your love for cooking. Additionally, the living space should be cozy and inviting, perfect for spending time indoors. Look for a property with easy access to amenities and a lively atmosphere to suit your remote work lifestyle and vibrant personality. Happy house hunting!'
**** Welcome to your new home!******

Based on your preferences, I have found the perfect property for you! This cozy apartment in a modern building is ideal for someone who works remotely and loves to cook. With its fully fitted kitchen and open-plan living area, you can enjoy cooking while taking in stunning city views from the large windows and balcony. 

Located in Porto's bustling downtown, this vibrant neighborhood off