This is a starter notebook for the project. You'll need to import the libraries you use; the ones available in this workspace are listed in its requirements.txt file.

# Project: Personalised Real Estate Agent

## 1. Imports and API Setup

In [16]:
import os

os.environ["OPENAI_API_KEY"] = "YOUR OPEN AI KEY"
os.environ["OPENAI_API_BASE"] = "https://openai.vocareum.com/v1"

from langchain.llms import OpenAI
from langchain.document_loaders.csv_loader import CSVLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.prompts import PromptTemplate
from langchain.vectorstores import Chroma


## 2. Generating Real Estate Listings

In [22]:
import openai

openai.api_base = "https://openai.vocareum.com/v1"
openai.api_key = os.getenv("OPENAI_API_KEY")

prompt = """
Generate 12 realistic real estate listings.
Return ONLY valid CSV.

Column names (the header row MUST use exactly these names):
Neighborhood, Price, Bedrooms, Bathrooms, House Size, Description

Rules (MUST be followed):
    - Neighborhood: realistic names (e.g. "Sunset Valley", "Fresh Land")
    - Bedrooms: integer 1-5
    - Bathrooms: integer 1-4
    - Price: realistic USD amount like "$550,000"
    - House Size: number + "sqft" (like "2,000 sqft")
    - Description: a single detailed but factual sentence, with no line breaks, highlighting property features and neighborhood appeal.
    - Enclose text fields in double quotes (e.g. "2,000 sqft")
    - Do not add extra lines, commentary, markdown, or labels.
    - No None or empty cells.
    - Exactly one property per row.
    - Do not use semicolons anywhere.
    - Do not combine multiple neighborhoods, prices, etc. in one cell.

"""

response = openai.ChatCompletion.create(
    model = "gpt-3.5-turbo",
    messages = [{"role" : "user", "content" : prompt}],
    temperature = 0.5
)

# Extract content
csv_text = response["choices"][0]["message"]["content"].strip()
# print(csv_text)

# Save CSV file
with open("listing.csv", "w", encoding="utf-8") as f:
    f.write(csv_text)
    
print("CSV generated successfully!")

CSV generated successfully!
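
The model is only asked to return valid CSV; nothing above verifies that it did. A minimal sketch of a check that could run before the file is loaded, assuming the six column names from the prompt (`validate_listing_csv` and `sample` are hypothetical names, not part of the original notebook):

```python
import csv
import io

EXPECTED_COLUMNS = ["Neighborhood", "Price", "Bedrooms", "Bathrooms", "House Size", "Description"]

def validate_listing_csv(csv_text, expected_rows=12):
    """Return a list of problems found in the generated CSV text (empty if none)."""
    problems = []
    # skipinitialspace tolerates a space after each comma, which models often emit.
    rows = list(csv.reader(io.StringIO(csv_text), skipinitialspace=True))
    if not rows:
        return ["CSV is empty"]
    header = [name.strip() for name in rows[0]]
    if header != EXPECTED_COLUMNS:
        problems.append(f"Unexpected header: {header}")
    data_rows = [row for row in rows[1:] if row]  # ignore blank lines
    if len(data_rows) != expected_rows:
        problems.append(f"Expected {expected_rows} listings, found {len(data_rows)}")
    for number, row in enumerate(data_rows, start=1):
        if len(row) != len(EXPECTED_COLUMNS):
            problems.append(f"Row {number} has {len(row)} fields")
        elif any(not cell.strip() for cell in row):
            problems.append(f"Row {number} has an empty cell")
    return problems

# Hypothetical one-row sample in the format the prompt asks for.
sample = (
    'Neighborhood,Price,Bedrooms,Bathrooms,House Size,Description\n'
    '"Fresh Land","$700,000",4,3,"3,000 sqft","Spacious modern house with a large backyard."\n'
)
print(validate_listing_csv(sample, expected_rows=1))
```

In the notebook this could be called as `validate_listing_csv(csv_text)` before writing `listing.csv`. An unquoted price such as `$550,000` shows up as a row with too many fields, because the comma splits it into two cells.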


## 3. Loading the Listings and Building the Vector Database

In [23]:
# Load the CSV; restval="" fills any missing trailing fields with an empty string
loader = CSVLoader(file_path="listing.csv", csv_args={"restval": ""})
docs = loader.load()

# Split long descriptions before storing them in the vector database
splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs_split = splitter.split_documents(docs)

# Creating embeddings
embeddings = OpenAIEmbeddings()

# Create the Chroma vector db
db = Chroma.from_documents(docs_split, embeddings)

print(f"Vector database is successfully created with {len(docs_split)} documents.")

Vector database is successfully created with 12 documents.
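
Judging by the listings printed in section 5, each CSV row becomes one document made of `Column: value` lines. The following sketch imitates that formatting with the standard library only, to show what text actually gets embedded. `row_to_page_content` is a hypothetical helper, not the loader's own code, and the sample row reuses one of the listings shown later:

```python
import csv
import io

def row_to_page_content(row):
    """Format one CSV row (a dict) as 'Column: value' lines, one per column."""
    return "\n".join(f"{key.strip()}: {value.strip()}" for key, value in row.items())

sample_csv = (
    'Neighborhood,Price,Bedrooms,Bathrooms,House Size,Description\n'
    '"Green Meadows","$500,000",3,2,"2,100 sqft","Spacious split-level home with a '
    'two-car garage and a backyard perfect for summer barbecues."\n'
)
rows = list(csv.DictReader(io.StringIO(sample_csv)))
page_content = row_to_page_content(rows[0])

print(page_content)
# Compare the document length with the splitter's chunk_size of 1000.
print(len(page_content) < 1000)
```

A listing of this size is far shorter than `chunk_size=1000`, which is consistent with the 12 rows producing 12 documents after splitting.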


## 4. Building the Buyer Preferences Interface

In [24]:
questions = [
    "How big do you want your house to be?",
    "What are 3 most important things for you in choosing this property?",
    "Which amenities would you like?",
    "Which transportation options are important to you?",
    "How urban do you want your neighborhood to be?",
]

answers = [
    "A comfortable three-bedroom house with a spacious kitchen and a cozy living room.",
    "A quiet neighborhood, good local schools, and convenient shopping options.",
    "A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.",
    "Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.",
    "A balance between suburban tranquility and access to urban amenities like restaurants and theaters.",
]

# Pair each question with its answer, one pair per line
preference_summary = "\n".join([f"{q} {a}" for q, a in zip(questions, answers)])

print(preference_summary)

How big do you want your house to be? A comfortable three-bedroom house with a spacious kitchen and a cozy living room.
What are 3 most important things for you in choosing this property? A quiet neighborhood, good local schools, and convenient shopping options.
Which amenities would you like? A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.
Which transportation options are important to you? Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.
How urban do you want your neighborhood to be? A balance between suburban tranquility and access to urban amenities like restaurants and theaters.
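
`zip` stops at the shorter of its inputs without raising, so a mismatch between `questions` and `answers` goes unnoticed. One easy way to create such a mismatch is a missing comma between two string literals, which Python silently joins into a single string. A small guard makes the mistake visible; `summarise_preferences` and the toy data below are hypothetical, shown only as a sketch:

```python
def summarise_preferences(questions, answers):
    """Pair each question with its answer, refusing lists of different lengths."""
    if len(questions) != len(answers):
        raise ValueError(
            f"{len(questions)} questions but {len(answers)} answers; "
            "check for a missing comma between string literals"
        )
    return "\n".join(f"{q} {a}" for q, a in zip(questions, answers))

# The missing comma after the first literal merges two questions into one string.
broken_questions = ["How big?" "What matters most?", "Which amenities?"]
toy_answers = ["Three bedrooms.", "Quiet streets.", "A garage."]

try:
    summarise_preferences(broken_questions, toy_answers)
except ValueError as error:
    print(error)
```

On Python 3.10 or later, `zip(questions, answers, strict=True)` raises a `ValueError` for the same situation without the explicit length check.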


## 5. Semantic Search Based on Preferences

In [28]:
semantic_query = (
    "Find listings that match these buyer preferences: \n\n" + preference_summary
    )
results = db.similarity_search(semantic_query, k=3)

print("Top 3 matching listings: ")
for i, doc in enumerate(results, 1):
    print(f"\n--- Listing {i} ---\n{doc.page_content}")

Top 3 matching listings: 

--- Listing 1 ---
Neighborhood: Fresh Land
Price: $700,000
Bedrooms: 4
Bathrooms: 3
House Size: 3,000 sqft
Description: Spacious modern house with a large backyard perfect for entertaining, located in a highly sought-after area.

--- Listing 2 ---
Neighborhood: Green Meadows
Price: $500,000
Bedrooms: 3
Bathrooms: 2
House Size: 2,100 sqft
Description: Spacious split-level home with a two-car garage and a backyard perfect for summer barbecues.

--- Listing 3 ---
Neighborhood: Meadowbrook
Price: $350,000
Bedrooms: 2
Bathrooms: 1
House Size: 1,100 sqft
Description: Adorable starter home in Meadowbrook with a fenced backyard and easy access to shopping and dining options.
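
`similarity_search` embeds the query and returns the `k` stored documents whose vectors lie closest to it. The toy sketch below illustrates that ranking idea with cosine similarity and made-up 3-dimensional vectors. Both are assumptions for illustration: real embeddings have far more dimensions, and the distance metric the vector store uses depends on its configuration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vector, documents, k=3):
    """Rank (name, vector) pairs by similarity to the query and keep the best k names."""
    scored = [(cosine_similarity(query_vector, vector), name) for name, vector in documents]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

# Made-up vectors standing in for embedded listings.
toy_documents = [
    ("listing A", [0.9, 0.1, 0.0]),
    ("listing B", [0.1, 0.9, 0.0]),
    ("listing C", [0.7, 0.3, 0.1]),
    ("listing D", [0.0, 0.1, 0.9]),
]
toy_query = [1.0, 0.2, 0.0]

print(top_k(toy_query, toy_documents, k=2))
```

Ranking by vector closeness does not enforce hard constraints, which may be why the results above include listings with 4 and 2 bedrooms even though the buyer asked for three.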


## 6. Personalised Listing Augmentation

In [30]:
import openai
top_listing = results[0].page_content

template = """
        You are a personalised real estate engine.
        
        Buyer Preferences:
        {preferences}
        
        Original Listing:
        {listing}
        
        Rewrite the listing to highlight what the buyer cares about.
        Do not change any factual details.
        
        Make the tone warm, persuasive, and helpful.
"""

prompt_text = template.format(
        preferences = preference_summary,
        listing = top_listing
)

response = openai.ChatCompletion.create(
    model = "gpt-3.5-turbo",
    messages = [{"role": "user", "content": prompt_text}],
    temperature = 0.7
)

personalised_description = response["choices"][0]["message"]["content"]
print(personalised_description)

Welcome to your dream home in the desirable neighborhood of Fresh Land! Priced at $700,000, this spacious modern house features 4 bedrooms, 3 bathrooms, and a generous 3,000 sqft of living space. 

Step inside to discover a comfortable three-bedroom layout, complete with a spacious kitchen and a cozy living room - perfect for entertaining friends and family. The large backyard is ideal for gardening enthusiasts, and the two-car garage provides plenty of space for your vehicles.

Located in a quiet neighborhood with good local schools and convenient shopping options, this home offers the perfect blend of comfort and convenience. You'll also appreciate the modern, energy-efficient heating system, ensuring your home is always cozy and welcoming.

With easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads, this home is ideal for those seeking a balance of urban living and tranquility. Don't miss out on the opportunity to make this house your own - schedu
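
The prompt asks the model not to change factual details, but nothing checks the result. A rough sketch of a post-check, assuming the `Column: value` listing format shown in section 5 (`parse_listing` and `missing_facts` are hypothetical helpers, and exact substring matching is a deliberately simple choice):

```python
def parse_listing(page_content):
    """Turn 'Column: value' lines into a dict."""
    fields = {}
    for line in page_content.splitlines():
        key, separator, value = line.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    return fields

def missing_facts(page_content, rewritten, keys=("Neighborhood", "Price", "House Size")):
    """Return the listing fields whose exact value does not appear in the rewritten text."""
    fields = parse_listing(page_content)
    return [key for key in keys if key in fields and fields[key] not in rewritten]

listing = (
    "Neighborhood: Fresh Land\n"
    "Price: $700,000\n"
    "Bedrooms: 4\n"
    "Bathrooms: 3\n"
    "House Size: 3,000 sqft\n"
    "Description: Spacious modern house with a large backyard perfect for entertaining, "
    "located in a highly sought-after area."
)
rewritten = (
    "Welcome to your dream home in the desirable neighborhood of Fresh Land! Priced at "
    "$700,000, this spacious modern house features 4 bedrooms, 3 bathrooms, and a "
    "generous 3,000 sqft of living space."
)

print(missing_facts(listing, rewritten))
```

This only detects facts that were dropped or altered. It would not catch claims the model added, such as the "three-bedroom layout" phrase in the output above, which conflicts with the listing's 4 bedrooms.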

## 7. Saving the Final Output

In [31]:
with open("personalised_output.txt", "w", encoding="utf-8") as f:
    f.write(personalised_description)