## Step 0: Prerequisites of the application

<div class="alert alert-block alert-info">
<b>Instructions:</b> <br>
✅ Uncomment the lines below and run the cell.<br>
✅ Comment the lines out again, restart the kernel, and run the remaining cells.
</div>

In [1]:
# !pip install pandas
# !pip install -q -r ./requirements.txt


## Step 1: Setting up Python Application

<div class="alert alert-block alert-info">
<b>Project Instructions:</b> <br>
✅ Initialize a Python Project: Create a new Python project, setting up a virtual environment and installing necessary packages like LangChain, a suitable LLM library (e.g., OpenAI's GPT), and a vector database package compatible with Python (e.g., ChromaDB or LanceDB). If you don't wish to create your files from scratch, starter files are available in the workspace on the next page as an application skeleton.
</div>

### 1.1. Importing Libraries

In [3]:
import json
from langchain.chains import ConversationChain
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chat_models import ChatOpenAI
from langchain.chains.conversational_retrieval.base import ConversationalRetrievalChain
from langchain.output_parsers import PydanticOutputParser
from langchain.memory import ConversationSummaryMemory, ConversationBufferMemory, CombinedMemory, ChatMessageHistory
from langchain.prompts import PromptTemplate
from langchain.schema import Document
from langchain.vectorstores import Chroma
import os
from pydantic import BaseModel, Field, NonNegativeInt
from typing import Any, Dict, List, Optional, Tuple
import pandas as pd


### 1.2. Setting OpenAI Variables

In [4]:
os.environ["OPENAI_API_KEY"] = "<Insert OpenAI Key here>"
os.environ["OPENAI_API_BASE"] = "https://openai.vocareum.com/v1"
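Hard-coding the key in a notebook cell makes it easy to commit by accident. A minimal sketch of an alternative, assuming the key was exported in the shell before the notebook was started; `require_env` is a hypothetical helper and not part of the original notebook.

```python
import os


def require_env(name: str, env=os.environ) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = env.get(name, "").strip()
    if not value or value.startswith("<"):
        # Treat an empty value or an unfilled "<...>" placeholder as missing.
        raise RuntimeError(f"{name} is not set; export it before starting the notebook")
    return value


# Usage in the notebook (assumes the variable was exported beforehand):
# os.environ["OPENAI_API_KEY"] = require_env("OPENAI_API_KEY")
```

Failing early here gives a clearer error than an authentication failure several cells later.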


## Step 2: Generating Real Estate Listings

<div class="alert alert-block alert-info">
<b>Project Instructions:</b> <br>
✅ Generate real estate listings using a Large Language Model.<br>
✅ Generate at least 10 listings. This can involve creating prompts for the LLM to produce descriptions of various properties.
</div>

### 2.1. Setting Constants

In [5]:
MODEL_NAME = "gpt-3.5-turbo"
TEMPERATURE = 0
MAX_TOKENS = 1500
NUMBER_OF_LISTINGS = 12

### 2.2. Setting the LLM Model

In [6]:
llm = ChatOpenAI(model_name=MODEL_NAME, temperature=TEMPERATURE, max_tokens=MAX_TOKENS)

In [7]:
class RealEstateListing(BaseModel):
    neighborhood: str = Field(description="The neighborhood of the real estate")
    price: NonNegativeInt = Field(description="The listing price of the real estate in CAD")
    bedrooms: NonNegativeInt = Field(description="The number of bedrooms in the real estate")
    bathrooms: NonNegativeInt = Field(description="The number of bathrooms in the real estate")
    size: NonNegativeInt = Field(description="The sq ft. area of the real estate")
    description: str = Field(description="The description of the real estate")
    neighborhood_description: str = Field(description="The description of the neighborhood of the real estate")
        
class ListOfRealEstateListing(BaseModel):
    listOfRealEstates: List[RealEstateListing] = Field(description="List of Real Estate Listings")

In [8]:
parser = PydanticOutputParser(pydantic_object=ListOfRealEstateListing)

In [9]:
print(parser.get_format_instructions())

The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:
```
{"properties": {"listOfRealEstates": {"title": "Listofrealestates", "description": "List of Real Estate Listings", "type": "array", "items": {"$ref": "#/definitions/RealEstateListing"}}}, "required": ["listOfRealEstates"], "definitions": {"RealEstateListing": {"title": "RealEstateListing", "type": "object", "properties": {"neighborhood": {"title": "Neighborhood", "description": "The neighborhood of the real estate", "type": "string"}, "price": {"title": "Price", "description": "The listing price of the real estate in CAD", "minimum": 0, "type"

### 2.3. Setting the prompt template

In [10]:
INSTRUCTION = "Generate {number_of_listings} real estate listing in Metro Vancouver region. It should be a mix of different housing types like condominiums (condos), townhouses, semi-detached homes, detached houses, duplexes, and laneway houses.".format(number_of_listings=NUMBER_OF_LISTINGS)

LISTING_EXAMPLE = """
Below is an example of a listing:
Neighborhood: Green Oaks
Price: CAD800,000
Bedrooms: 3
Bathrooms: 2
House Size: 2,000 sqft
Description: Welcome to this eco-friendly oasis nestled in the heart of Green Oaks. This charming 3-bedroom, 2-bathroom home boasts energy-efficient features such as solar panels and a well-insulated structure. Natural light floods the living spaces, highlighting the beautiful hardwood floors and eco-conscious finishes. The open-concept kitchen and dining area lead to a spacious backyard with a vegetable garden, perfect for the eco-conscious family. Embrace sustainable living without compromising on style in this Green Oaks gem.
Neighborhood Description: Green Oaks is a close-knit, environmentally-conscious community with access to organic grocery stores, community gardens, and bike paths. Take a stroll through the nearby Green Oaks Park or grab a cup of coffee at the cozy Green Bean Cafe. With easy access to public transportation and bike lanes, commuting is a breeze.
"""

### 2.4. Building the prompt

In [11]:
prompt = PromptTemplate(
    template="{instruction}\n{format_instructions}\nContext: {example}",
    input_variables=["instruction", "example"],
    partial_variables={"format_instructions": parser.get_format_instructions},
)

query = prompt.format(instruction=INSTRUCTION, example=LISTING_EXAMPLE)
print(query)

Generate 12 real estate listing in Metro Vancouver region. It should be a mix of different housing types like condominiums (condos), townhouses, semi-detached homes, detached houses, duplexes, and laneway houses.
The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:
```
{"properties": {"listOfRealEstates": {"title": "Listofrealestates", "description": "List of Real Estate Listings", "type": "array", "items": {"$ref": "#/definitions/RealEstateListing"}}}, "required": ["listOfRealEstates"], "definitions": {"RealEstateListing": {"title": "RealEstateListing", "type": "object", "properties": {"neighborhoo

### 2.5. Generate output from LLM

In [12]:
output = llm.predict(query)

In [13]:
print(output)

{
  "listOfRealEstates": [
    {
      "neighborhood": "Kitsilano",
      "price": 1200000,
      "bedrooms": 2,
      "bathrooms": 2,
      "size": 1500,
      "description": "Beautiful waterfront condo with stunning views of the ocean and mountains. Modern design with high-end finishes and amenities.",
      "neighborhood_description": "Kitsilano is a vibrant neighborhood known for its trendy shops, restaurants, and outdoor activities."
    },
    {
      "neighborhood": "Mount Pleasant",
      "price": 950000,
      "bedrooms": 3,
      "bathrooms": 2,
      "size": 1800,
      "description": "Spacious townhouse with a private rooftop deck and attached garage. Close to parks, schools, and public transportation.",
      "neighborhood_description": "Mount Pleasant is a hip neighborhood with a mix of residential and commercial spaces, known for its arts and culture scene."
    },
    {
      "neighborhood": "Burnaby Heights",
      "price": 1600000,
      "bedrooms": 4,
      "bathroom

In [14]:
result = parser.parse(output)
print(result.listOfRealEstates)

[RealEstateListing(neighborhood='Kitsilano', price=1200000, bedrooms=2, bathrooms=2, size=1500, description='Beautiful waterfront condo with stunning views of the ocean and mountains. Modern design with high-end finishes and amenities.', neighborhood_description='Kitsilano is a vibrant neighborhood known for its trendy shops, restaurants, and outdoor activities.'), RealEstateListing(neighborhood='Mount Pleasant', price=950000, bedrooms=3, bathrooms=2, size=1800, description='Spacious townhouse with a private rooftop deck and attached garage. Close to parks, schools, and public transportation.', neighborhood_description='Mount Pleasant is a hip neighborhood with a mix of residential and commercial spaces, known for its arts and culture scene.'), RealEstateListing(neighborhood='Burnaby Heights', price=1600000, bedrooms=4, bathrooms=3, size=2200, description='Luxurious detached house with a large backyard and modern kitchen. Perfect for families looking for space and comfort.', neighborho
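`parser.parse` can fail when the model wraps the JSON in extra prose or returns a field that violates the schema. The following is a rough, standard-library-only sketch of those two checks, not the parser's actual implementation. `extract_json` and `validate_listing` are hypothetical helpers; the field names are taken from `RealEstateListing`. The type check is stricter than Pydantic, which may coerce some values.

```python
import json
import re

INT_FIELDS = ("price", "bedrooms", "bathrooms", "size")
STR_FIELDS = ("neighborhood", "description", "neighborhood_description")


def extract_json(text: str) -> dict:
    """Parse the span from the first '{' to the last '}', ignoring surrounding prose."""
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))


def validate_listing(item: dict) -> dict:
    """Check that one listing has the expected fields with non-negative integers."""
    for field in STR_FIELDS:
        if not isinstance(item.get(field), str):
            raise ValueError(f"{field} must be a string")
    for field in INT_FIELDS:
        value = item.get(field)
        # bool is a subclass of int, so exclude it explicitly.
        if not isinstance(value, int) or isinstance(value, bool) or value < 0:
            raise ValueError(f"{field} must be a non-negative integer")
    return item


# Sample payload shaped like the output above, wrapped in extra prose.
sample = {
    "listOfRealEstates": [
        {
            "neighborhood": "Kitsilano",
            "price": 1200000,
            "bedrooms": 2,
            "bathrooms": 2,
            "size": 1500,
            "description": "Waterfront condo.",
            "neighborhood_description": "Vibrant area.",
        }
    ]
}
raw = "Here are the listings:\n" + json.dumps(sample) + "\nLet me know if you need more."
listings = [validate_listing(x) for x in extract_json(raw)["listOfRealEstates"]]
print(len(listings), listings[0]["neighborhood"])
```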

In [15]:
df = pd.DataFrame([listing.dict() for listing in result.listOfRealEstates])
df = df.reset_index()
df.head(15)

| | index | neighborhood | price | bedrooms | bathrooms | size | description | neighborhood_description |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Kitsilano | 1200000 | 2 | 2 | 1500 | Beautiful waterfront condo with stunning views... | Kitsilano is a vibrant neighborhood known for ... |
| 1 | 1 | Mount Pleasant | 950000 | 3 | 2 | 1800 | Spacious townhouse with a private rooftop deck... | Mount Pleasant is a hip neighborhood with a mi... |
| 2 | 2 | Burnaby Heights | 1600000 | 4 | 3 | 2200 | Luxurious detached house with a large backyard... | Burnaby Heights is a family-friendly neighborh... |
| 3 | 3 | Richmond | 800000 | 2 | 2 | 1200 | Cozy duplex with a private garden and updated ... | Richmond is a diverse and multicultural commun... |
| 4 | 4 | Yaletown | 1500000 | 2 | 2 | 1600 | Sleek and modern condo in the heart of Yaletow... | Yaletown is a trendy neighborhood with upscale... |
| 5 | 5 | West End | 1100000 | 2 | 2 | 1400 | Charming heritage townhouse with original feat... | West End is a diverse and vibrant neighborhood... |
| 6 | 6 | Coquitlam | 1000000 | 3 | 2 | 2000 | Spacious semi-detached home with a large backy... | Coquitlam is a family-friendly suburb with gre... |
| 7 | 7 | North Vancouver | 1300000 | 4 | 3 | 2400 | Modern laneway house with a rooftop deck and e... | North Vancouver is a nature lover's paradise w... |
| 8 | 8 | Surrey | 900000 | 3 | 2 | 1800 | Bright and spacious detached house with a larg... | Surrey is a growing city with a mix of urban a... |
| 9 | 9 | Port Moody | 850000 | 2 | 2 | 1600 | Modern duplex with a private patio and mountai... | Port Moody is a charming waterfront community ... |


In [16]:
df.to_csv('real_estate_listings.csv')
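The table above has 10 rows (indices 0 to 9) although `NUMBER_OF_LISTINGS` is 12, so the model returned fewer listings than requested. One possible guard is to keep requesting the shortfall until the target is reached. In this sketch, `generate_batch` is a hypothetical callable standing in for the prompt, `llm.predict`, and `parser.parse` steps. It is not part of the original notebook.

```python
from typing import Callable, List


def collect_listings(
    generate_batch: Callable[[int], List[dict]],
    target: int,
    max_rounds: int = 3,
) -> List[dict]:
    """Call generate_batch for the remaining shortfall until target listings exist."""
    listings: List[dict] = []
    for _ in range(max_rounds):
        missing = target - len(listings)
        if missing <= 0:
            break
        # Keep at most `missing` items in case the model over-delivers.
        listings.extend(generate_batch(missing)[:missing])
    return listings


# Stand-in generator (hypothetical) that returns at most 10 items per call.
def fake_batch(n: int) -> List[dict]:
    return [{"neighborhood": f"Area {i}"} for i in range(min(n, 10))]


print(len(collect_listings(fake_batch, target=12)))  # prints 12
```

With `TEMPERATURE = 0`, a repeated request may return duplicate listings, so a real follow-up prompt would likely need to mention the neighborhoods already generated.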


## Step 3: Storing Listings in a Vector Database

<div class="alert alert-block alert-info">
<b>Project Instructions:</b> <br>
✅ Vector Database Setup: Initialize and configure ChromaDB or a similar vector database to store real estate listings.<br>
✅ Generating and Storing Embeddings: Convert the LLM-generated listings into suitable embeddings that capture the semantic content of each listing, and store these embeddings in the vector database.
</div>

### 3.1. Vector Database Setup

#### Build a document for each real estate listing


In [17]:
documents = [
    Document(
        page_content=f"{row['description']} Neighborhood: {row['neighborhood_description']}",
        metadata={
            "index": row["index"],
            "neighborhood": row["neighborhood"],
            "price": row["price"],
            "bedrooms": row["bedrooms"],
            "bathrooms": row["bathrooms"],
            "size": row["size"]
        }
    )
    for _, row in df.iterrows()
]

### 3.2. Generate and Store Embeddings

In [18]:
# !rm -rf ./real_estate_chroma


In [19]:
db = Chroma.from_documents(
    documents=documents,
    embedding=OpenAIEmbeddings(),
    persist_directory="./real_estate_chroma"
)
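Conceptually, the vector store embeds each document and later ranks documents by how close their vectors are to the query vector. The toy sketch below illustrates that ranking with hand-made three-dimensional vectors in place of `OpenAIEmbeddings` output. The vectors and their axis meanings are invented for illustration, and Chroma's actual index and default distance metric may differ.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query: List[float], store: Dict[str, List[float]], k: int = 2) -> List[Tuple[str, float]]:
    """Return the k stored items most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in store.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Made-up vectors; the axes loosely mean (urban, family-friendly, waterfront).
toy_store = {
    "Yaletown": [0.9, 0.1, 0.3],
    "Coquitlam": [0.2, 0.9, 0.1],
    "Port Moody": [0.3, 0.5, 0.8],
}

# A query that leans heavily toward "family-friendly".
for name, score in rank([0.1, 1.0, 0.2], toy_store):
    print(name, round(score, 3))
```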



## Step 4: Building the User Preference Interface


<div class="alert alert-block alert-info">
<b>Project Instructions:</b> <br>
✅ Collect buyer preferences, such as the number of bedrooms, bathrooms, location, and other specific requirements, either through a set of questions or by asking the buyer to enter their preferences in natural language. You can hard-code the buyer preferences in questions and answers, or collect them interactively however you'd like.<br>
✅ Buyer Preference Parsing: Implement logic to interpret and structure these preferences for querying the vector database.
</div>

In [20]:
INSTRUCTION = """
    Based on the Preferences in the context, tell me real estate listings that match the preferences.
    Make sure you do not paraphrase the listings.
    Only show the listings which exactly match the preferences mentioned below.
    
    """

QUESTIONS = [
    "How big do you want your house to be?",
    "What are the 3 most important things for you in choosing this property?",
    "Which amenities would you like?",
    "Which transportation options are important to you?",
    "How urban do you want your neighborhood to be?",
]
ANSWERS = [
    "A comfortable three-bedroom house with a spacious kitchen and a cozy living room.",
    "A quiet neighborhood, good local schools, and convenient shopping options.",
    "A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.",
    "Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.",
    "A balance between suburban tranquility and access to urban amenities like restaurants and theaters."
]
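The project instructions ask for the preferences to be structured before querying, while the cell above keeps them as free text. One way this might be approached is to join the answers into a single search string and pull out any fields that map onto stored metadata. `parse_bedrooms` and `build_preferences` are hypothetical helpers, and the regular expression only covers simple phrasings such as "three-bedroom" or "3 bedrooms".

```python
import re
from typing import Any, Dict, List, Optional

NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}


def parse_bedrooms(text: str) -> Optional[int]:
    """Find a bedroom count such as 'three-bedroom' or '3 bedrooms' in free text."""
    match = re.search(r"\b(\d+|[a-z]+)[\s-]+bedrooms?\b", text.lower())
    if match is None:
        return None
    token = match.group(1)
    # Unknown words before "bedroom" (for example "cozy") yield None.
    return int(token) if token.isdigit() else NUMBER_WORDS.get(token)


def build_preferences(answers: List[str]) -> Dict[str, Any]:
    """Combine the answers into one search string plus any structured fields found."""
    search_text = " ".join(answers)
    return {"search_text": search_text, "bedrooms": parse_bedrooms(search_text)}


prefs = build_preferences(
    ["A comfortable three-bedroom house with a spacious kitchen and a cozy living room."]
)
print(prefs["bedrooms"])  # prints 3
```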

### 4.1. Create and Store Memory

#### Add User message in memory

In [21]:
history = ChatMessageHistory()
history.add_user_message(f"""The goal is to create a personalized experience for each buyer, making the property search process more engaging and tailored to individual preferences. Recommend a real estate listing based on their answers to personal questions. Ask the user {len(QUESTIONS)} questions.""")

for i in range(len(QUESTIONS)):
    history.add_ai_message(QUESTIONS[i])
    history.add_user_message(ANSWERS[i])



#### Add AI message in memory

In [22]:
history.add_ai_message("""Now tell me a summary of what type of real estate you're considering, and specify what characteristics you are looking for in your new home!""")
summary_memory = ConversationSummaryMemory(
    llm=llm,
    memory_key="chat_history",
    input_key="input",
    buffer=f"The human answered {len(QUESTIONS)} personal questions. Use them to identify the characteristics of the real estate listing they want and which listing they will like the most.",
    return_messages=True)


#### Create Conversational Memory

In [23]:
conversational_memory = ConversationBufferMemory(
    chat_memory=history,
    memory_key="convo",
    input_key="question"
)

#### Create combined memory

In [24]:
memory = CombinedMemory(memories=[conversational_memory, summary_memory])


#### Create a prompt

In [25]:
RECOMMENDER_TEMPLATE = """
You are an AI assistant at 'Future Homes Realty', a forward-thinking real estate company.
Where personalization is key to customer satisfaction, you revolutionize how clients interact with real estate listings.
Using the provided context and the customer's preferences, offer relevant information about available homes.
Keep your response concise and engaging, with a maximum of 300 characters.
\n Context: {context}
\nCustomer's preferences: {chat_history}
\nHuman: {input}
\nAI:
"""
PROMPT = PromptTemplate(
    template=RECOMMENDER_TEMPLATE,
    input_variables=['context', 'chat_history', 'input']
    )
print(PROMPT)


input_variables=['context', 'chat_history', 'input'] template="\nYou are an AI assistant at 'Future Homes Realty', a forward-thinking real estate company.\nWhere personalization is key to customer satisfaction, you revolutionize how clients interact with real estate listings.\nUsing the provided context and the customer's preferences, offer relevant information about available homes.\nKeep your response concise and engaging, with a maximum of 300 characters.\n\n Context: {context}\n\nCustomer's preferences: {chat_history}\n\nHuman: {input}\n\nAI:\n"


## Step 5: Searching Based on Preferences


<div class="alert alert-block alert-info">
<b>Project Instructions:</b> <br>
✅ Semantic Search Implementation: Use the structured buyer preferences to perform a semantic search on the vector database, retrieving listings that most closely match the user's requirements.<br>
✅ Listing Retrieval Logic: Fine-tune the retrieval algorithm to ensure that the most relevant listings are selected based on the semantic closeness to the buyer’s preferences.
</div>

In [26]:
chain = ConversationalRetrievalChain.from_llm(
                                llm=llm,
                                chain_type="stuff",
                                retriever=db.as_retriever(),
                                combine_docs_chain_kwargs={'prompt': PROMPT},
                                memory=memory
                            )
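The retriever ranks listings by semantic similarity alone, so a three-bedroom request can still surface two-bedroom listings. One option is to apply hard constraints to the metadata stored in Step 3, either before or after retrieval. The sketch below does this in plain Python on rows taken from the table in Step 2. `filter_by_metadata` is a hypothetical helper, and any native metadata filtering offered by the vector store is not shown here.

```python
from typing import Any, Dict, List, Optional


def filter_by_metadata(
    listings: List[Dict[str, Any]],
    bedrooms: Optional[int] = None,
    max_price: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Keep listings that satisfy the hard constraints; None means no constraint."""
    kept = []
    for item in listings:
        if bedrooms is not None and item["bedrooms"] != bedrooms:
            continue
        if max_price is not None and item["price"] > max_price:
            continue
        kept.append(item)
    return kept


# Metadata for four of the generated listings.
candidates = [
    {"neighborhood": "Kitsilano", "price": 1200000, "bedrooms": 2},
    {"neighborhood": "Mount Pleasant", "price": 950000, "bedrooms": 3},
    {"neighborhood": "Coquitlam", "price": 1000000, "bedrooms": 3},
    {"neighborhood": "Surrey", "price": 900000, "bedrooms": 3},
]

matches = filter_by_metadata(candidates, bedrooms=3, max_price=950000)
print([m["neighborhood"] for m in matches])
```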

## Step 6: Personalizing Listing Descriptions

<div class="alert alert-block alert-info">
<b>Project Instructions:</b> <br>
✅ LLM Augmentation: For each retrieved listing, use the LLM to augment the description, tailoring it to resonate with the buyer’s specific preferences. This involves subtly emphasizing aspects of the property that align with what the buyer is looking for. <br>
✅ Maintaining Factual Integrity: Ensure that the augmentation process enhances the appeal of the listing without altering factual information.
</div>


In [27]:
query = """
For each retrieved listing, tailor the description to resonate with the buyer’s specific preferences. 
This involves subtly emphasizing aspects of the property that align with what the buyer is looking for.
Ensure that the augmentation process enhances the appeal of the listing without altering factual information.
"""

result = chain({"question": query, "input": "Suggest a suitable home. Tell me exactly how it matches my preferences for each of the characteristics I mentioned. If no home is found, tell me that no home is available based on my preferences."})


In [28]:
print(result['answer'])

Based on your preferences, the beautiful heritage-style townhouse in Mount Pleasant matches your desire for a spacious layout, high ceilings, and a private backyard oasis. The gourmet kitchen is perfect for entertaining, and the cozy fireplace aligns with your preference for relaxing evenings. The neighborhood's artsy vibe, trendy eateries, and vibrant community also cater to your preference for a diverse and eclectic area. This home offers the perfect blend of comfort and convenience for you.
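The prompt asks the model not to alter factual information, but nothing in the notebook verifies the answer. A simple, partial check is to confirm that every number in the augmented text also appears in the stored listing. `numbers_in` and `unsupported_numbers` are hypothetical helpers. This sketch only covers numeric facts and would not catch qualitative additions such as features the listing never mentions.

```python
import re
from typing import Any, Dict, List


def numbers_in(text: str) -> List[int]:
    """Extract integers from text, accepting thousands separators such as 1,200,000."""
    return [int(token.replace(",", "")) for token in re.findall(r"\d[\d,]*", text)]


def unsupported_numbers(augmented: str, listing: Dict[str, Any]) -> List[int]:
    """Return numbers mentioned in the augmented text that no listing field contains."""
    allowed = set()
    for value in listing.values():
        if isinstance(value, int):
            allowed.add(value)
        elif isinstance(value, str):
            allowed.update(numbers_in(value))
    return [n for n in numbers_in(augmented) if n not in allowed]


# The Mount Pleasant listing from the table in Step 2.
listing = {
    "neighborhood": "Mount Pleasant",
    "price": 950000,
    "bedrooms": 3,
    "bathrooms": 2,
    "size": 1800,
}

print(unsupported_numbers("This 3-bedroom townhouse offers 1,800 sq ft.", listing))
print(unsupported_numbers("This 4-bedroom townhouse is listed at 950,000.", listing))
```

An empty list means no unsupported number was found. A non-empty list flags values to review before showing the text to a buyer.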
