# Project: HomeMatch AI Real Estate Agent

The HomeMatch AI agent helps buyers explore property options without the intervention of a human agent.

In [None]:
!pip install chromadb
!pip install pandas

In [7]:
import os

os.environ["OPENAI_API_KEY"] = "voc-2666137141266773666733673c46b3b31115.28063286"
os.environ["OPENAI_API_BASE"] = "https://openai.vocareum.com/v1"

from langchain.llms import OpenAI
model = 'gpt-3.5-turbo'

### Generating Real Estate Listings

In [2]:
from langchain.chat_models import ChatOpenAI
from langchain import LLMChain, PromptTemplate

# constructing prompt to get 10 listings

prompt_template = PromptTemplate(
    input_variables=[],
    template="""
Generate 10 diverse and different types property listings in Mexico in the following format. Include different themes like lakefront, boho, urban, apartment, condo, studio, downtown. Total number of listing should be ten:
Neighborhood: Green Oaks
Price: $800,000
Bedrooms: 3
Bathrooms: 2
House Size: 2,000 sqft

Description: Welcome to this eco-friendly oasis nestled in the heart of Green Oaks. This charming 3-bedroom, 2-bathroom home boasts energy-efficient features such as solar panels and a well-insulated structure. Natural light floods the living spaces, highlighting the beautiful hardwood floors and eco-conscious finishes. The open-concept kitchen and dining area lead to a spacious backyard with a vegetable garden, perfect for the eco-conscious family.

Neighborhood Description: Green Oaks is a close-knit, environmentally-conscious community with access to organic grocery stores, community gardens, and bike paths. Take a stroll through the nearby Green Oaks Park or grab a cup of coffee at the cozy Green Bean Cafe. With easy access to public transportation and bike lanes, commuting is a breeze.

return the list in the json format so that we can parse each listing
""")

In [48]:
# Initialize the OpenAI LLM
llm = ChatOpenAI(model=model, temperature=1, max_tokens = 4000)

# Create the LLMChain
chain = LLMChain(llm=llm, prompt=prompt_template)

# Generate the listings
response = chain.run({})

# Print the generated listings
print(response)

{
    "listings": [
        {
            "Neighborhood": "Green Oaks",
            "Price": "$800,000",
            "Bedrooms": 3,
            "Bathrooms": 2,
            "House Size": "2,000 sqft",
            "Description": "Welcome to this eco-friendly oasis nestled in the heart of Green Oaks. This charming 3-bedroom, 2-bathroom home boasts energy-efficient features such as solar panels and a well-insulated structure. Natural light floods the living spaces, highlighting the beautiful hardwood floors and eco-conscious finishes. The open-concept kitchen and dining area lead to a spacious backyard with a vegetable garden, perfect for the eco-conscious family.",
            "Neighborhood Description": "Green Oaks is a close-knit, environmentally-conscious community with access to organic grocery stores, community gardens, and bike paths. Take a stroll through the nearby Green Oaks Park or grab a cup of coffee at the cozy Green Bean Cafe. With easy access to public transportation and bi

In [62]:
import json
data = json.loads(response)
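
`json.loads` only succeeds when the model returns bare JSON. Chat models sometimes wrap the payload in a Markdown code fence or add a sentence before it, so a more tolerant parser can help. The following is a minimal sketch under that assumption; `extract_json` is a hypothetical helper, not part of LangChain, and the sample response is made up.

```python
import json


def extract_json(text: str) -> dict:
    """Parse the first JSON object found in `text`.

    Hypothetical helper: it slices from the first '{' to the last '}',
    which is enough to skip a Markdown fence or a short preamble.
    """
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end == -1 or end < start:
        raise ValueError("no JSON object found in the model response")
    return json.loads(text[start:end + 1])


# Made-up response, for illustration only
fence = "`" * 3
sample = f'Here you go:\n{fence}json\n{{"listings": [{{"Neighborhood": "Green Oaks"}}]}}\n{fence}'
sample_data = extract_json(sample)
print(sample_data["listings"][0]["Neighborhood"])
```

In the notebook, `data = extract_json(response)` could stand in for the plain `json.loads(response)` call. A truncated response (for example one cut off by `max_tokens`) would still fail to parse, which is probably the right behaviour.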


In [61]:
import csv

# Extract the listings
listings = data['listings']

# Define the CSV file name
csv_file = "property_listings.csv"

# Write the listings to a CSV file
with open(csv_file, mode='w', newline='', encoding='utf-8') as file:
    writer = csv.DictWriter(file, fieldnames=listings[0].keys())
    writer.writeheader()
    writer.writerows(listings)

print(f"Data has been written to {csv_file}")

Data has been written to property_listings.csv
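
Because the CSV header is taken from the first listing only, a later listing with a missing or extra key can either leave blank cells or make `csv.DictWriter` raise a `ValueError`. A quick check before writing can catch this. The sketch below assumes the seven field names used in the prompt; `find_incomplete` is a hypothetical helper and the sample data is made up.

```python
REQUIRED_FIELDS = [
    "Neighborhood", "Price", "Bedrooms", "Bathrooms",
    "House Size", "Description", "Neighborhood Description",
]


def find_incomplete(listings, required=REQUIRED_FIELDS):
    """Return (index, missing_fields) pairs for listings lacking required keys."""
    problems = []
    for i, listing in enumerate(listings):
        missing = [field for field in required if field not in listing]
        if missing:
            problems.append((i, missing))
    return problems


# Made-up data: the second listing is missing most fields
sample_listings = [
    {field: "x" for field in REQUIRED_FIELDS},
    {"Neighborhood": "Urban Loft", "Price": "$700,000"},
]
print(find_incomplete(sample_listings))
```

An empty result means every listing carries all expected fields, so the header derived from `listings[0].keys()` is safe to use.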


In [14]:
import pandas as pd
df = pd.read_csv("property_listings.csv")
df['id'] = df.index.astype(str)
df

| | Neighborhood | Price | Bedrooms | Bathrooms | House Size | Description | Neighborhood Description | id |
|---|---|---|---|---|---|---|---|---|
| 0 | Green Oaks | $800,000 | 3 | 2 | 2,000 sqft | Welcome to this eco-friendly oasis nestled in ... | Green Oaks is a close-knit, environmentally-co... | 0 |
| 1 | Lakefront Paradise | $1,200,000 | 4 | 3 | 3,500 sqft | Escape to this luxurious lakefront paradise fe... | Located in a serene lakefront community, this ... | 1 |
| 2 | Boho Hideaway | $500,000 | 2 | 1 | 1,200 sqft | Step into this bohemian-style hideaway with 2 ... | Located in a trendy boho neighborhood, this pr... | 2 |
| 3 | Urban Loft | $700,000 | 1 | 1 | 1,000 sqft | Experience city living at its finest in this m... | Located in the heart of the bustling city cent... | 3 |
| 4 | Luxury Condo | $1,500,000 | 3 | 3 | 2,500 sqft | Indulge in luxury living in this spacious cond... | Situated in an exclusive high-rise building, t... | 4 |
| 5 | Chic Studio | $300,000 | 0 | 1 | 500 sqft | Live in style in this chic studio apartment fe... | Located in a trendy downtown area, this studio... | 5 |
| 6 | Downtown Penthouse | $2,000,000 | 4 | 4 | 3,000 sqft | Elevate your living experience in this stunnin... | Located in the heart of the vibrant downtown d... | 6 |
| 7 | Historic Colonial Mansion | $1,800,000 | 6 | 5 | 4,500 sqft | Step back in time in this majestic colonial ma... | Located in a historic district, this mansion i... | 7 |
| 8 | Beachfront Villa | $2,500,000 | 5 | 4 | 5,000 sqft | Escape to paradise in this luxurious beachfron... | Located in a prestigious beachfront community,... | 8 |
| 9 | Mountain Retreat | $900,000 | 4 | 3 | 2,800 sqft | Discover serenity in this mountain retreat fea... | Located in a tranquil mountainous area, this p... | 9 |
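
`Price` and `House Size` are stored as formatted strings, so numeric comparisons such as "under $1,000,000" cannot be applied to them directly. One possible approach is to derive numeric values from them first. The sketch below assumes values always look like `$800,000` or `2,000 sqft` (no decimals or abbreviations such as `1.5M`); `parse_number` is a hypothetical helper.

```python
import re


def parse_number(text: str) -> int:
    """Extract the digits from strings such as '$800,000' or '2,000 sqft'."""
    digits = re.sub(r"[^\d]", "", text)
    if not digits:
        raise ValueError(f"no digits found in {text!r}")
    return int(digits)


print(parse_number("$1,200,000"))
print(parse_number("500 sqft"))
```

In the notebook this could be applied column-wise, for example `df['Price'].map(parse_number)`, and the result stored in a separate numeric column so the original display strings stay intact.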


### Storing Listings in a Vector Database

In [17]:
# Combine both description fields into one string for easier embedding
df['combined_description'] = df['Description'] + " " + df['Neighborhood Description']


### Generating embeddings

We use OpenAI's "text-embedding-ada-002" model to generate an embedding for each combined_description.

In [27]:
from langchain.embeddings import OpenAIEmbeddings

# Initialise the embeddings model
embeddings = OpenAIEmbeddings()

# One embedding vector per listing, in the same order as the DataFrame rows
embedding_vectors = embeddings.embed_documents(df['combined_description'].tolist())
df['embedding'] = embedding_vectors
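
To build some intuition for how these vectors are used later, the sketch below ranks two toy vectors against a query by cosine similarity. The 3-dimensional vectors are made up for illustration (the real embeddings here have 1536 dimensions), and the distance values Chroma reports depend on the collection's configured metric, so they are not directly comparable to these numbers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for this example
query = [1.0, 0.0, 1.0]
candidates = {
    "lake_house": [0.9, 0.1, 0.8],
    "city_loft": [0.0, 1.0, 0.1],
}

# Highest similarity first
ranked = sorted(
    candidates.items(),
    key=lambda item: cosine_similarity(query, item[1]),
    reverse=True,
)
print([name for name, _ in ranked])
```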

### Storing in ChromaDB

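In [175]:
from chromadb import Client

# Initialise ChromaDB (in-memory client)
client = Client()

In [176]:
# Drop any collection left over from a previous run so IDs do not collide,
# then create a fresh one
try:
    client.delete_collection(name="real_estate_listings")
except Exception:
    pass  # the collection did not exist yet

collection = client.get_or_create_collection(name="real_estate_listings")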

# Add each listing with its precomputed embedding and metadata
for idx, row in df.iterrows():
    embedding = row['embedding']
    print(f"adding document with id {idx} and embedding dim: {len(embedding)}")
    collection.add(
        documents=[row['combined_description']],
        embeddings=[embedding],
        metadatas=[{
            "Neighborhood": row["Neighborhood"],
            "Price": row["Price"],
            "Bedrooms": row["Bedrooms"],
            "Bathrooms": row["Bathrooms"],
            "House Size": row["House Size"]
        }],
        ids=[str(idx)]
    )

adding document with id 0 and embedding dim: 1536
adding document with id 1 and embedding dim: 1536
adding document with id 2 and embedding dim: 1536
adding document with id 3 and embedding dim: 1536
adding document with id 4 and embedding dim: 1536
adding document with id 5 and embedding dim: 1536
adding document with id 6 and embedding dim: 1536
adding document with id 7 and embedding dim: 1536
adding document with id 8 and embedding dim: 1536
adding document with id 9 and embedding dim: 1536


### Semantic Search of Listings Based on Buyer Preferences

We can search for listings with precise metadata filters, such as the number of bedrooms or bathrooms, or without filters, using only a brief description.

In [74]:
# embedding the query
query_embed = embeddings.embed_query("look for a property beside lake")

In [71]:
# Search by text similarity
results = collection.query(query_embeddings=query_embed,
                          n_results=5)


# show results
for i in range(len(results['documents'][0])):
    print(results['metadatas'][0][i])
    print(results['documents'][0][i])
    

{'Bathrooms': 3, 'Bedrooms': 4, 'House Size': '3,500 sqft', 'Neighborhood': 'Lakefront Paradise', 'Price': '$1,200,000'}
Escape to this luxurious lakefront paradise featuring 4 bedrooms, 3 bathrooms, and breathtaking views of the crystal-clear water. The spacious open-concept living area opens up to a large deck, perfect for entertaining friends and family. Enjoy fishing, swimming, and boating right from your backyard. Located in a serene lakefront community, this property offers peace and tranquility away from the hustle and bustle of city life. Explore hiking trails, waterfalls, and wildlife in the surrounding natural beauty.
{'Bathrooms': 3, 'Bedrooms': 4, 'House Size': '2,800 sqft', 'Neighborhood': 'Mountain Retreat', 'Price': '$900,000'}
Discover serenity in this mountain retreat featuring 4 bedrooms, 3 bathrooms, and scenic views of the surrounding peaks. The cozy fireplace, hot tub, and outdoor deck create a peaceful escape for relaxation and rejuvenation. Located in a tranquil 

In [77]:
# search with additional filters

filters = {
    "Bathrooms": 3
}

results = collection.query(query_embeddings=query_embed,
                           where=filters,
                           n_results=1)

# show results
for i in range(len(results['documents'][0])):
    print(results['metadatas'][0][i])
    print(results['documents'][0][i])

{'ids': [['1']], 'distances': [[0.29515165090560913]], 'metadatas': [[{'Bathrooms': 3, 'Bedrooms': 4, 'House Size': '3,500 sqft', 'Neighborhood': 'Lakefront Paradise', 'Price': '$1,200,000'}]], 'embeddings': None, 'documents': [['Escape to this luxurious lakefront paradise featuring 4 bedrooms, 3 bathrooms, and breathtaking views of the crystal-clear water. The spacious open-concept living area opens up to a large deck, perfect for entertaining friends and family. Enjoy fishing, swimming, and boating right from your backyard. Located in a serene lakefront community, this property offers peace and tranquility away from the hustle and bustle of city life. Explore hiking trails, waterfalls, and wildlife in the surrounding natural beauty.']]}
{'Bathrooms': 3, 'Bedrooms': 4, 'House Size': '3,500 sqft', 'Neighborhood': 'Lakefront Paradise', 'Price': '$1,200,000'}
Escape to this luxurious lakefront paradise featuring 4 bedrooms, 3 bathrooms, and breathtaking views of the crystal-clear water. 

## Personalizing Listing Descriptions using OpenAI LLM

In [92]:
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.schema import Document

In [94]:
documents = []
for idx, row in df.iterrows():
    doc = Document(
        page_content=row['combined_description'],
        metadata={
            "Neighborhood": row["Neighborhood"],
            "Price": row["Price"],
            "Bedrooms": row["Bedrooms"],
            "Bathrooms": row["Bathrooms"],
            "House Size": row["House Size"]
        }
    )
    documents.append(doc)

# Create the Chroma vectorstore with texts and metadata
db = Chroma.from_documents(documents=documents, embedding=embeddings)

In [95]:
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI()

qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=db.as_retriever())

In [165]:
prompt_template = """
You are an expert AI real estate agent. Based on the buyer's preferences, provide a personalized summary to buyer for each property.

Buyer's Preferences:

Neighborhood: {Neighborhood}
Price: {Price}
Bedrooms: {Bedrooms}
Bathrooms: {Bathrooms}
House Size: {House_Size}
List of Properties:
{properties}
Task: Based on the buyer's preferences, provide a personalized summary for each property, highlighting how well it matches their criteria and any unique features that may appeal to them. Do not change the original details of the property.
If you did not find any property matching their preferences, suggest something similar and tell them how it is also a better option. If nothing similar is found either, say that no properties matching their requirements were found.
Display all details of the property first and then give your personalised summary to the buyer."""

In [167]:
results = db.similarity_search_with_score("Looking for a lakehouse far from urban area", k=3)
results

[(Document(page_content='Escape to this luxurious lakefront paradise featuring 4 bedrooms, 3 bathrooms, and breathtaking views of the crystal-clear water. The spacious open-concept living area opens up to a large deck, perfect for entertaining friends and family. Enjoy fishing, swimming, and boating right from your backyard. Located in a serene lakefront community, this property offers peace and tranquility away from the hustle and bustle of city life. Explore hiking trails, waterfalls, and wildlife in the surrounding natural beauty.', metadata={'Bathrooms': 3, 'Bedrooms': 4, 'House Size': '3,500 sqft', 'Neighborhood': 'Lakefront Paradise', 'Price': '$1,200,000'}),
  0.26687294244766235),
 (Document(page_content='Escape to this luxurious lakefront paradise featuring 4 bedrooms, 3 bathrooms, and breathtaking views of the crystal-clear water. The spacious open-concept living area opens up to a large deck, perfect for entertaining friends and family. Enjoy fishing, swimming, and boating r

In [168]:
# Custom parser function: flatten a Document into a plain dict
def parse_metadata(doc):
    # Fall back to an empty dict so the function always returns a value
    metadata = doc.metadata or {}
    return {
        "Bathrooms": metadata.get("Bathrooms"),
        "Bedrooms": metadata.get("Bedrooms"),
        "House Size": metadata.get("House Size"),
        "Neighborhood": metadata.get("Neighborhood"),
        "Price": metadata.get("Price"),
        "Description": doc.page_content
    }

properties = []
# results is a list of (Document, score) tuples; parse every returned document
for doc, _score in results:
    property_details = parse_metadata(doc)
    properties.append(property_details)
    print(properties)



[{'Bathrooms': 3, 'Bedrooms': 4, 'House Size': '3,500 sqft', 'Neighborhood': 'Lakefront Paradise', 'Price': '$1,200,000', 'Description': 'Escape to this luxurious lakefront paradise featuring 4 bedrooms, 3 bathrooms, and breathtaking views of the crystal-clear water. The spacious open-concept living area opens up to a large deck, perfect for entertaining friends and family. Enjoy fishing, swimming, and boating right from your backyard. Located in a serene lakefront community, this property offers peace and tranquility away from the hustle and bustle of city life. Explore hiking trails, waterfalls, and wildlife in the surrounding natural beauty.'}]
[{'Bathrooms': 3, 'Bedrooms': 4, 'House Size': '3,500 sqft', 'Neighborhood': 'Lakefront Paradise', 'Price': '$1,200,000', 'Description': 'Escape to this luxurious lakefront paradise featuring 4 bedrooms, 3 bathrooms, and breathtaking views of the crystal-clear water. The spacious open-concept living area opens up to a large deck, perfect for 
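
The `properties` list is interpolated into the prompt as a Python list of dictionaries, which the model has to read as raw `repr` output. Rendering it as labelled text first may make the prompt easier for the model to follow. This is a sketch of one possible formatting; `format_properties` is a hypothetical helper and the sample entry is abbreviated.

```python
def format_properties(properties):
    """Render a list of property dicts as numbered, labelled text blocks."""
    blocks = []
    for n, prop in enumerate(properties, start=1):
        lines = [f"Property {n}:"]
        lines += [f"{key}: {value}" for key, value in prop.items()]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


# Abbreviated sample entry for illustration
sample_properties = [
    {"Neighborhood": "Lakefront Paradise", "Price": "$1,200,000", "Bedrooms": 4},
]
print(format_properties(sample_properties))
```

The resulting string could be passed as the `properties` value in the buyer preferences instead of the raw list.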

In [169]:
from langchain.chat_models import ChatOpenAI
from langchain import LLMChain, PromptTemplate
prompt_template = PromptTemplate(
    input_variables=["Neighborhood", "Price", "Bedrooms", "Bathrooms", "House_Size", "properties"],
    template=prompt_template)

In [170]:
buyer_preferences = {
    "Neighborhood": "No preference",
    "Price": "$1,000,000 - $1,500,000",
    "Bedrooms": "2",
    "Bathrooms": "2",
    "House_Size": "1,000 - 4,000 sqft",
    "properties": properties}

In [171]:
# Initialize the OpenAI LLM
llm = ChatOpenAI(model=model, temperature=1, max_tokens = 4000)

# Create the LLMChain
chain = LLMChain(llm=llm, prompt=prompt_template)

# Generate the personalized summaries
response = chain.run(buyer_preferences)

# Print the personalized summaries
print(response)

Property 1:
Neighborhood: Lakefront Paradise
Price: $1,200,000
Bedrooms: 4
Bathrooms: 3
House Size: 3,500 sqft
Description: Escape to this luxurious lakefront paradise featuring 4 bedrooms, 3 bathrooms, and breathtaking views of the crystal-clear water. The spacious open-concept living area opens up to a large deck, perfect for entertaining friends and family. Enjoy fishing, swimming, and boating right from your backyard. Located in a serene lakefront community, this property offers peace and tranquility away from the hustle and bustle of city life. Explore hiking trails, waterfalls, and wildlife in the surrounding natural beauty.

Personalized Summary: This property exceeds your criteria in terms of bedrooms, bathrooms, and house size. The Lakefront Paradise neighborhood offers a peaceful retreat with stunning lake views and easy access to water activities. The spacious deck is perfect for entertaining, and the natural surroundings provide a tranquil atmosphere for relaxation. This pr

We observe that the results are much more promising when an LLM is used to present the suggestions.