This is a starter notebook for the project. Import the libraries you need; the ones available in this workspace are listed in its requirements.txt file.

In [1]:
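import os

# Set the credentials before importing openai: older versions of the client
# read these environment variables when the module is first imported.
os.environ["OPENAI_API_KEY"] = "KEY"
os.environ["OPENAI_API_BASE"] = "https://openai.vocareum.com/v1"

import pandas as pd
import openai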

from langchain.document_loaders.csv_loader import CSVLoader
from langchain.schema import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI

# Generating Synthetic Real Estate Listings Using LLM

In [172]:
prompt = """

Generate a list of 10 real estate listings in Riyadh city (Saudi Arabia).

This is an example:
Neighborhood: Alhamra
Price: 4,800,000 Saudi Riyals
Bedrooms: 5
Bathrooms: 3
House_Size: 600 m2
Property_Description: This charming 5-bedroom, 3-bathroom home boasts energy-efficient features such as solar panels and a well-insulated structure. Natural light floods the living spaces, highlighting the beautiful hardwood floors and eco-conscious finishes. The open-concept kitchen and dining area lead to a spacious backyard with a vegetable garden, perfect for the eco-conscious family. Embrace sustainable living without compromising on style.
Neighborhood_Description: Alhamra is a close-knit, environmentally-conscious community with access to organic grocery stores, community gardens, and bike paths. Take a stroll through the nearby Park or grab a cup of coffee at the cozy Cafe. With easy access to public transportation and bike lanes, commuting is a breeze.

The output should use the following format:
Neighborhood: <Neighborhood Name>
Price: <Price>
Bedrooms: <Number of Bedrooms>
Bathrooms: <Number of Bathrooms>
House_Size: <Size in m2>
Property_Description: <Detailed description of the property>
Neighborhood_Description: <Description of the neighborhood>

"""


response = openai.Completion.create(
    prompt=prompt,
    model="gpt-3.5-turbo-instruct",
    max_tokens=3000)

real_estate_str = response["choices"][0]["text"].strip()
print(real_estate_str)

real_estate_lines = real_estate_str.splitlines()

1. Neighborhood: Al Yamamah
Price: 12,500,000 Saudi Riyals
Bedrooms: 8
Bathrooms: 6
House_Size: 950 m2
Property_Description: This luxurious 8-bedroom, 6-bathroom villa is situated in the prestigious Al Yamamah neighborhood. With tasteful design and high-end finishes, this home offers spacious living areas, a modern kitchen, and a private pool. The expansive master suite boasts a walk-in closet and a luxurious en-suite bathroom. Perfect for those who value privacy and sophistication.
Neighborhood_Description: Al Yamamah is known for its exclusive villas and upscale living. The neighborhood offers convenient access to shopping centers, schools, and hospitals. With lush green parks and well-manicured streets, Al Yamamah is the epitome of luxury living.

2. Neighborhood: Al Munsiyah
Price: 3,500,000 Saudi Riyals
Bedrooms: 4
Bathrooms: 3
House_Size: 400 m2
Property_Description: This charming 4-bedroom, 3-bathroom family home is located in the quiet and family-friendly neighborhood of Al Mun

In [173]:
import re

columns = ["Neighborhood", "Price", "Bedrooms", "Bathrooms", "House_Size",
           "Property_Description", "Neighborhood_Description"]

def field_values(field):
    # Match lines that start with "<field>:", allowing an optional list prefix such as "1. ".
    pattern = re.compile(rf"^\s*(?:\d+\.\s*)?{field}:\s*(.*)$")
    matches = (pattern.match(line) for line in real_estate_lines)
    return [m.group(1).strip() for m in matches if m]

# Build one column per field; the full text after the label is kept, even if it contains ": ".
df = pd.DataFrame({column: field_values(column) for column in columns})

df.head(10)

,Neighborhood,Price,Bedrooms,Bathrooms,House_Size,Property_Description,Neighborhood_Description
0,Al Yamamah,"12,500,000 Saudi Riyals",8,6,950 m2,"This luxurious 8-bedroom, 6-bathroom villa is ...",Al Yamamah is known for its exclusive villas a...
1,Al Munsiyah,"3,500,000 Saudi Riyals",4,3,400 m2,"This charming 4-bedroom, 3-bathroom family hom...",Al Munsiyah is a peaceful and welcoming neighb...
2,Al Waha,"2,600,000 Saudi Riyals",3,2,280 m2,"This modern 3-bedroom, 2-bathroom townhouse is...",Al Waha is a family-friendly community with a ...
3,Al Hamraa,"7,500,000 Saudi Riyals",6,4,600 m2,"This elegant 6-bedroom, 4-bathroom villa is si...",Known for its upscale lifestyle and high-end r...
4,Al Olaya,"5,200,000 Saudi Riyals",5,3,500 m2,"This spacious 5-bedroom, 3-bathroom villa is l...",Al Olaya is known for its luxurious villas and...
5,Al Sahafah,"1,800,000 Saudi Riyals",3,2,320 m2,"This cozy 3-bedroom, 2-bathroom townhouse is l...",Al Sahafah is a family-oriented neighborhood w...
6,Al Izdihar,"6,500,000 Saudi Riyals",4,5,750 m2,"This luxurious 4-bedroom, 5-bathroom villa is ...",Al Izdihar is a well-established and highly so...
7,Al Manar,"9,000,000 Saudi Riyals",7,4,800 m2,"This grand 7-bedroom, 4-bathroom villa is loca...",Al Manar is known for its luxurious and spacio...
8,Al Falah,"800,000 Saudi Riyals",2,2,150 m2,"This cozy 2-bedroom, 2-bathroom apartment is l...","Al Falah is a rapidly developing neighborhood,..."
9,Al Yasmin,"1,200,000 Saudi Riyals",3,2,200 m2,"This modern 3-bedroom, 2-bathroom townhouse is...",Al Yasmin is a quiet and welcoming neighborhoo...
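
The Price and House_Size columns in the table above are still strings such as "12,500,000 Saudi Riyals" and "950 m2". If numeric sorting or filtering is needed later, one option is to strip the units first. The helper below is a minimal sketch: the name `leading_number` is hypothetical, and it assumes every value starts with an integer that may use commas as thousands separators.

```python
import re

def leading_number(text):
    """Return the leading integer of strings like '12,500,000 Saudi Riyals' or '950 m2'.

    Hypothetical helper: assumes the value starts with digits,
    optionally grouped with commas.
    """
    match = re.match(r"\s*(\d[\d,]*)", str(text))
    if match is None:
        raise ValueError(f"no leading number in {text!r}")
    return int(match.group(1).replace(",", ""))

# Sample values copied from the table above.
prices = [leading_number(p) for p in ["12,500,000 Saudi Riyals", "800,000 Saudi Riyals"]]
sizes = [leading_number(s) for s in ["950 m2", "150 m2"]]
print(prices)
print(sizes)
```

On the DataFrame itself this could be applied with `df["Price"].map(leading_number)`, keeping the original string column if the currency label is still wanted for display.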


In [174]:
df.to_csv("real_estate_listings.csv", index=False)

# Creating Vector Database and Storing Embeddings


In [2]:
df = pd.read_csv("real_estate_listings.csv")
df.head()

,Neighborhood,Price,Bedrooms,Bathrooms,House_Size,Property_Description,Neighborhood_Description
0,Al Yamamah,"12,500,000 Saudi Riyals",8,6,950 m2,"This luxurious 8-bedroom, 6-bathroom villa is ...",Al Yamamah is known for its exclusive villas a...
1,Al Munsiyah,"3,500,000 Saudi Riyals",4,3,400 m2,"This charming 4-bedroom, 3-bathroom family hom...",Al Munsiyah is a peaceful and welcoming neighb...
2,Al Waha,"2,600,000 Saudi Riyals",3,2,280 m2,"This modern 3-bedroom, 2-bathroom townhouse is...",Al Waha is a family-friendly community with a ...
3,Al Hamraa,"7,500,000 Saudi Riyals",6,4,600 m2,"This elegant 6-bedroom, 4-bathroom villa is si...",Known for its upscale lifestyle and high-end r...
4,Al Olaya,"5,200,000 Saudi Riyals",5,3,500 m2,"This spacious 5-bedroom, 3-bathroom villa is l...",Al Olaya is known for its luxurious villas and...


In [48]:
# loader = CSVLoader(file_path='real_estate_listings.csv')
# docs = loader.load()
# print(len(docs))
# print(docs)

In [3]:
documents = []
for index, row in df.iterrows():
    documents.append(Document(page_content=row['Property_Description'], metadata={'id': str(index)}))
print(len(documents))
print(documents)

10
[Document(page_content='This luxurious 8-bedroom, 6-bathroom villa is situated in the prestigious Al Yamamah neighborhood. With tasteful design and high-end finishes, this home offers spacious living areas, a modern kitchen, and a private pool. The expansive master suite boasts a walk-in closet and a luxurious en-suite bathroom. Perfect for those who value privacy and sophistication.', metadata={'id': '0'}), Document(page_content='This charming 4-bedroom, 3-bathroom family home is located in the quiet and family-friendly neighborhood of Al Munsiyah. With a spacious layout, including a large living area and a modern kitchen, this home is perfect for entertaining guests. The outdoor terrace and garden provide a peaceful oasis for relaxation.', metadata={'id': '1'}), Document(page_content='This modern 3-bedroom, 2-bathroom townhouse is located in the sought-after neighborhood of Al Waha. The open-concept living and dining area leads to a private garden, perfect for outdoor gatherings. 
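
The loop above stores only Property_Description as the page content, so fields such as Bedrooms or Price are not part of the text that gets embedded. One possible alternative, sketched here as an assumption rather than as what this notebook does, is to join all fields into a single text block per listing. The names `FIELDS` and `listing_to_text` are hypothetical.

```python
# Column names taken from the DataFrame above.
FIELDS = ["Neighborhood", "Price", "Bedrooms", "Bathrooms", "House_Size",
          "Property_Description", "Neighborhood_Description"]

def listing_to_text(row):
    """Join the available listing fields into 'Field: value' lines.

    `row` can be a dict or a pandas Series; missing fields are skipped.
    """
    return "\n".join(f"{field}: {row[field]}" for field in FIELDS if field in row)

# Values copied from the first listing above.
example = {
    "Neighborhood": "Al Yamamah",
    "Price": "12,500,000 Saudi Riyals",
    "Bedrooms": 8,
    "Bathrooms": 6,
    "House_Size": "950 m2",
}
print(listing_to_text(example))
```

The returned string could then be passed as `page_content` in place of `row['Property_Description']`. Whether this improves retrieval for queries like "6-bedroom villa" would need to be checked on the actual data.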

In [4]:
splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,
    chunk_overlap=100,
    length_function=len,
    add_start_index=True,
)
split_docs = splitter.split_documents(documents)
print(f"Split {len(documents)} documents into {len(split_docs)} chunks.")
print(split_docs[:2])

Split 10 documents into 16 chunks.
[Document(page_content='This luxurious 8-bedroom, 6-bathroom villa is situated in the prestigious Al Yamamah neighborhood. With tasteful design and high-end finishes, this home offers spacious living areas, a modern kitchen, and a private pool. The expansive master suite boasts a walk-in closet and a luxurious en-suite', metadata={'id': '0', 'start_index': 0}), Document(page_content='and a private pool. The expansive master suite boasts a walk-in closet and a luxurious en-suite bathroom. Perfect for those who value privacy and sophistication.', metadata={'id': '0', 'start_index': 201})]
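
In the output above, the second chunk starts at index 201 and repeats the end of the first one. That is the effect of `chunk_overlap=100`: neighbouring chunks share some text so that a sentence cut at a boundary still appears whole in at least one chunk. The sketch below illustrates the idea with plain fixed-size character windows. It is a simplification and not the algorithm used by `RecursiveCharacterTextSplitter`, which breaks on separators such as whitespace; the function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text, chunk_size=300, chunk_overlap=100):
    """Split text into fixed-size character windows that overlap.

    Simplified illustration only: cuts at exact character offsets
    instead of looking for separators.
    Returns a list of (start_index, chunk) pairs.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append((start, text[start:start + chunk_size]))
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks

sample = "x" * 450  # stand-in for a 450-character description
for start, chunk in sliding_chunks(sample):
    print(start, len(chunk))
```

With these settings each new window begins 200 characters after the previous one, so consecutive windows share 100 characters.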


In [5]:
embedding_function = OpenAIEmbeddings()
db = Chroma.from_documents(
    split_docs, embedding_function, persist_directory="chroma")
db.persist()
print(f"Stored {len(split_docs)} split documents in Chroma.")

Stored 16 split documents in Chroma.


# Semantic Search for Listings Based on User Preference

In [6]:
buyer_preference = "A comfortable 6-bedroom villa with backyard" 

In [7]:
def get_personalized_response(buyer_preference, template):
    # Retrieve the 3 chunks most relevant to the buyer's preference, with relevance scores.
    results = db.similarity_search_with_relevance_scores(buyer_preference, k=3)

    # Give up if nothing was retrieved or the first (best-ranked) match scores below 0.6.
    if len(results) == 0 or results[0][1] < 0.6:
        print("Sorry, we couldn't find any results matching your preference.")
    else:
        # Join the retrieved chunks into a single context block for the prompt.
        context = "\n---\n".join([res.page_content for res, _score in results])
        prompt = template.format(context=context, question=buyer_preference)
        print("Prompt: ", prompt)

        model = ChatOpenAI()
        response_text = model.predict(prompt)
        # Report the listing ids of the chunks that were used as context.
        sources = [res.metadata.get("id", None) for res, _score in results]
        formatted_response = f"Response: {response_text}\nSources: {sources}"
        print(formatted_response)
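
The retrieval step in the function above ranks chunks by a relevance score and rejects the whole result set when the best score is below 0.6. The toy example below sketches that logic with hand-written 2-dimensional vectors and cosine similarity. This is an assumption made for illustration: the score Chroma returns depends on how the collection's distance metric is configured, and the names `cosine_similarity` and `top_k_above_threshold` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_above_threshold(query_vec, doc_vecs, k=3, threshold=0.6):
    """Return the k best (doc_id, score) pairs, or [] if the best score is too low."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    top = scored[:k]
    # As in the function above, only the best score is compared to the threshold.
    if not top or top[0][1] < threshold:
        return []
    return top

# Toy embeddings; real embeddings have many more dimensions.
toy_docs = {"0": [1.0, 0.0], "1": [0.0, 1.0], "2": [1.0, 1.0]}
print(top_k_above_threshold([1.0, 0.0], toy_docs, k=2))
print(top_k_above_threshold([-1.0, 0.0], toy_docs, k=2))
```

Because only the first result is checked, lower-ranked chunks with weak scores can still end up in the context. Filtering every result against the threshold is a stricter option.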

In [8]:
template = """
Based on the context below:
{context}
---
Answer this question: {question}
"""
get_personalized_response(buyer_preference, template)

Prompt:  
Based on the context below:
private backyard, perfect for outdoor entertaining. Located in a family-friendly neighborhood, this villa offers the perfect balance of style and convenience.
---
a private backyard with a BBQ area. Located in a popular residential community, this villa offers a peaceful and convenient lifestyle.
---
This cozy 3-bedroom, 2-bathroom villa is perfect for those looking for a budget-friendly home in a family-friendly neighborhood. The single-story layout includes an open-concept living and dining area, with a fully equipped kitchen. The private backyard features a small garden, perfect for outdoor
---
Answer this question: A comfortable 6-bedroom villa with backyard

Response: Based on the context provided, it can be inferred that a comfortable 6-bedroom villa with a private backyard would likely offer a spacious and stylish living space perfect for entertaining, located in a family-friendly neighborhood or popular residential community.
Sources: ['8',

# Augmented Response Generation

LLM Augmentation: For each retrieved listing, use the LLM to augment the description, tailoring it to resonate with the buyer's specific preferences. This involves subtly emphasizing aspects of the property that align with what the buyer is looking for.

Maintaining Factual Integrity: Ensure that the augmentation process enhances the appeal of the listing without altering factual information.
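
One way to spot-check factual integrity is to compare the numbers stated in the augmented text against the retrieved listing. The sketch below is a minimal example under stated assumptions: it only looks at bedroom and bathroom counts written as digits ("6-bedroom", "2 bathrooms"), and the function names `numeric_facts` and `contradicts_source` are hypothetical.

```python
import re

# Matches phrases such as "6-bedroom", "6 bedrooms", or "2-bathroom".
FACT_PATTERN = re.compile(r"(\d+)[\s-]*(bedroom|bathroom)s?", re.IGNORECASE)

def numeric_facts(text):
    """Collect facts like ('bedroom', 6) from a description."""
    return {(kind.lower(), int(count)) for count, kind in FACT_PATTERN.findall(text)}

def contradicts_source(source_text, augmented_text):
    """Return facts in the augmented text that disagree with the source listing.

    Only kinds that the source also mentions are compared.
    """
    source = dict(numeric_facts(source_text))
    return sorted(
        (kind, count)
        for kind, count in numeric_facts(augmented_text)
        if kind in source and source[kind] != count
    )

source = "This cozy 3-bedroom, 2-bathroom villa is perfect for a small family."
augmented = "This luxurious 6-bedroom villa with 2 bathrooms has a private backyard."
print(contradicts_source(source, augmented))
```

A non-empty result suggests the augmented description changed a count that appears in the listing. An empty result is not proof of accuracy, since prices, sizes, and facts written in words are not covered by this check.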

In [10]:
augment_template = """
Based on the context below:
{context}
---
Write a response that not only answers the question {question}, but also ensures that your explanation is captivating and customized to align with the buyer's specific preferences. This involves subtly emphasizing aspects of the property that align with what the buyer is looking for.

"""
        
get_personalized_response(buyer_preference, augment_template)

Prompt:  
Based on the context below:
private backyard, perfect for outdoor entertaining. Located in a family-friendly neighborhood, this villa offers the perfect balance of style and convenience.
---
a private backyard with a BBQ area. Located in a popular residential community, this villa offers a peaceful and convenient lifestyle.
---
This cozy 3-bedroom, 2-bathroom villa is perfect for those looking for a budget-friendly home in a family-friendly neighborhood. The single-story layout includes an open-concept living and dining area, with a fully equipped kitchen. The private backyard features a small garden, perfect for outdoor
---
Write a response that not only answers the question A comfortable 6-bedroom villa with backyard, but also ensures that your explanation is captivating and customized to align with the buyer's specific preferences. This involves subtly emphasizing aspects of the property that align with what the buyer is looking for.


Response: This luxurious 6-bedroom vi