# Personalized Real Estate Agent

## Step 1: Setting Up the Python Application

```
pip install -r requirements.txt
```

Contents of `requirements.txt`:

```
langchain==0.1.12
openai==0.28.1
pydantic>=1.10.12
pytest>=7.4.0
sentence-transformers>=2.2.0
transformers>=4.31.0
chromadb==0.4.15
jupyter==1.0.0
```

Set the OpenAI API key as an environment variable so that LangChain can pick it up from the environment.

In [1]:
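import os

# Read the key from a local file; strip() drops the trailing newline.
with open("../../api_keys/openai.key", "rt") as key_file:
    openai_api_key = key_file.read().strip()
# LangChain and the openai package look for the upper-case variable name.
os.environ["OPENAI_API_KEY"] = openai_api_key
OpenAI_CHAT_MODEL = "gpt-3.5-turbo"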
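A slightly more defensive way to load the key is sketched below. It prefers a value already present in the environment and falls back to a key file. The helper name `load_api_key` and its parameters are hypothetical and not part of the notebook; `OPENAI_API_KEY` is the variable name the OpenAI and LangChain clients look up.

```python
import os
from pathlib import Path


def load_api_key(env_name="OPENAI_API_KEY", key_path=None):
    """Return the API key from the environment, else from a key file.

    Hypothetical helper for illustration; raises RuntimeError if no key is found.
    """
    key = os.environ.get(env_name, "").strip()
    if key:
        return key
    if key_path is not None and Path(key_path).is_file():
        # strip() removes the trailing newline that most editors add
        key = Path(key_path).read_text(encoding="utf-8").strip()
        if key:
            # Export it so libraries that read the environment can see it
            os.environ[env_name] = key
            return key
    raise RuntimeError(f"No API key found in {env_name} or {key_path!r}")
```

In the notebook this could be called as `openai_api_key = load_api_key(key_path="../../api_keys/openai.key")`.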

In [2]:
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(openai_api_key=openai_api_key,
            model_name=OpenAI_CHAT_MODEL,
            temperature=0.6,
            max_tokens=2000)


## Step 2: Generating Real Estate Listings

Generate real estate listings using a large language model. This notebook generates 20 listings.

In [3]:
import openai

def generate_review(prompt):
    OpenAI_CHAT_MODEL = "gpt-3.5-turbo"
    openai_instruction = [
        {
            "role": "system",
            "content": "You are a real estate agent."},
        {
            "role": "user",
            "content": f"prompt_text = {prompt}"}
        ]

    try:
        response = openai.ChatCompletion.create(
            model=OpenAI_CHAT_MODEL,
            messages=openai_instruction,
            temperature=0.6,
            max_tokens=2000
        )
        review = response.choices[0].message['content'].strip()
    except openai.error.OpenAIError as err:
        # Fall back to an empty string so the return below never hits an unbound variable.
        print(f"Encountered an error: {err}")
        review = ""

    return review



Sample prompts that generate a synthetic sales pitch and a neighborhood description.

In [4]:
location = "Furano, Hokkaido, Japan"

In [5]:
review_len = 40
prompt_text = f"Write a sales talk for a cozy real estate and its equipments in {location}."
generated_review = generate_review(prompt_text)
generated_review

"🏡 Welcome to your dream home in Furano, Hokkaido, Japan! Nestled in the heart of picturesque surroundings, this cozy real estate gem offers the perfect blend of comfort and tranquility.\n\nAs you step inside, you'll be greeted by the warm ambiance of the spacious living room, complete with a traditional Japanese hearth to keep you cozy during the winter months. The open-concept kitchen is a chef's delight, equipped with modern appliances and ample counter space for all your culinary adventures.\n\nVenture outside to the beautifully landscaped garden, where you can relax and unwind in your very own private oasis. Imagine sipping a cup of tea on the veranda as you take in the stunning views of the surrounding mountains and lush greenery.\n\nThis property comes fully furnished with high-quality furnishings and tasteful decor, ensuring that you can move in and start enjoying your new home right away. With a convenient location close to amenities and outdoor activities, you'll have everyth

In [6]:
prompt_text = f"Write a positive statement around the neighbor of a real estate in {location}."
generated_neighbor = generate_review(prompt_text)
generated_neighbor

'The charming neighborhood surrounding this property in Furano, Hokkaido, Japan offers a perfect blend of tranquility and convenience, with stunning views of the surrounding mountains and easy access to local amenities.'

Now generate the data to store in the vector database as reference listings.

### Generate catalog spec CSV data

In [7]:
entries = 20
area = "Hokkaido, Japan"
prompt_text = f"Generate a CSV file of {entries} entries of real estate data with the column headers Neighborhood, Price, Bedrooms, Bathrooms, House Size in {area}. Bedrooms and Bathrooms are counts, House Size is in square meters, and Price is in Japanese yen. Vary the number of bathrooms from 1 to 3. Respond with the CSV only."
csv_data = generate_review(prompt_text)
csv_data

'Neighborhood,Price,Bedrooms,Bathrooms,House Size\nSapporo,55000000,4,2,180\nOtaru,32000000,3,1,120\nAsahikawa,45000000,3,2,150\nHakodate,60000000,4,3,200\nObihiro,38000000,3,1,130\nKitami,42000000,3,2,140\nTomakomai,48000000,4,2,160\nFurano,35000000,2,1,100\nNiseko,70000000,5,3,220\nKushiro,40000000,3,1,130\nAbashiri,32000000,2,1,90\nMuroran,42000000,3,2,140\nIwamizawa,55000000,4,2,180\nShibetsu,30000000,2,1,80\nRumoi,38000000,3,1,120\nNemuro,46000000,3,2,150\nChitose,50000000,4,2,170\nKutchan,68000000,4,3,210\nYoichi,33000000,2,1,100\nAibetsu,40000000,3,2,140'

In [8]:
# Save the data so as not to repeatedly generate the data with OpenAI APIs.
with open('./HomeAgent.csv', mode='w') as f:
    f.write(csv_data)
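Because the CSV text comes straight from the model, it can help to check its shape before relying on it. The sketch below uses only the standard library and checks the header, the row count, and the numeric columns, including the bathroom range of 1 to 3 requested in the prompt. The function name `validate_listings_csv` is hypothetical.

```python
import csv
import io

EXPECTED_COLUMNS = ["Neighborhood", "Price", "Bedrooms", "Bathrooms", "House Size"]


def validate_listings_csv(csv_text, expected_rows=None):
    """Return a list of problems found in the CSV text (an empty list means none)."""
    reader = csv.DictReader(io.StringIO(csv_text.strip()))
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]

    problems = []
    rows = list(reader)
    if expected_rows is not None and len(rows) != expected_rows:
        problems.append(f"expected {expected_rows} rows, got {len(rows)}")

    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        for column in EXPECTED_COLUMNS[1:]:
            value = (row[column] or "").strip()
            if not value.isdigit():
                problems.append(f"line {line_no}: {column} is not a whole number: {value!r}")
        bathrooms = (row["Bathrooms"] or "").strip()
        if bathrooms.isdigit() and not 1 <= int(bathrooms) <= 3:
            problems.append(f"line {line_no}: Bathrooms out of range 1-3: {bathrooms}")
    return problems


sample_csv = (
    "Neighborhood,Price,Bedrooms,Bathrooms,House Size\n"
    "Sapporo,55000000,4,2,180\n"
    "Otaru,32000000,3,1,120"
)
sample_problems = validate_listings_csv(sample_csv, expected_rows=2)
```

If `validate_listings_csv(csv_data, expected_rows=entries)` returns a non-empty list, regenerating the data is probably cheaper than repairing it by hand.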

### Add sales description about the properties

In [9]:
# load the stored catalog data
import pandas as pd

df = pd.read_csv('./HomeAgent.csv')
df

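| Unnamed: 0 | Neighborhood | Price | Bedrooms | Bathrooms | House Size |
|---|---|---|---|---|---|
| 0 | Sapporo | 55000000 | 4 | 2 | 180 |
| 1 | Otaru | 32000000 | 3 | 1 | 120 |
| 2 | Asahikawa | 45000000 | 3 | 2 | 150 |
| 3 | Hakodate | 60000000 | 4 | 3 | 200 |
| 4 | Obihiro | 38000000 | 3 | 1 | 130 |
| 5 | Kitami | 42000000 | 3 | 2 | 140 |
| 6 | Tomakomai | 48000000 | 4 | 2 | 160 |
| 7 | Furano | 35000000 | 2 | 1 | 100 |
| 8 | Niseko | 70000000 | 5 | 3 | 220 |
| 9 | Kushiro | 40000000 | 3 | 1 | 130 |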


In [10]:
description = []
neighborhood_description = []

# One sales pitch and one neighborhood blurb per listing (two API calls per row)
for index, row in df.iterrows():
    review_len = 40
    location = row['Neighborhood']
    prompt_text = f"Write a {review_len} word sales talk for a cozy real estate and its equipments in {location}. Mention randomly about the amenities equipped."
    description.append(generate_review(prompt_text))

    review_len = 30
    prompt_text = f"Write a {review_len} word positive statement around the neighbor of a real estate in {location}. Mention randomly about the transportation lines, schools, shoppings, and leisure facilities nearby. Also mention about degree of urban randomly in its vicinity."
    neighborhood_description.append(generate_review(prompt_text))

df['Description'] = description
df['Neighborhood Description'] = neighborhood_description

In [11]:
df

| Unnamed: 0 | Neighborhood | Price | Bedrooms | Bathrooms | House Size | Description | Neighborhood Description |
|---|---|---|---|---|---|---|---|
| 0 | Sapporo | 55000000 | 4 | 2 | 180 | Step into your cozy Sapporo retreat, complete ... | Located in Sapporo, this real estate boasts a ... |
| 1 | Otaru | 32000000 | 3 | 1 | 120 | Step into your dream home in picturesque Otaru... | Nestled in Otaru's charming neighborhood, this... |
| 2 | Asahikawa | 45000000 | 3 | 2 | 150 | Step into your dream home in Asahikawa! This c... | Nestled in Asahikawa, this property boasts eas... |
| 3 | Hakodate | 60000000 | 4 | 3 | 200 | Step into your dream home in Hakodate, where c... | Nestled in Hakodate, this property boasts easy... |
| 4 | Obihiro | 38000000 | 3 | 1 | 130 | Step into your dream home in Obihiro, where co... | "Located in scenic Obihiro, this property boas... |
| 5 | Kitami | 42000000 | 3 | 2 | 140 | Step into your dream home in Kitami! This cozy... | Nestled in Kitami's vibrant neighborhood, this... |
| 6 | Tomakomai | 48000000 | 4 | 2 | 160 | Step into your dream home in Tomakomai! This c... | Located in Tomakomai, this property boasts exc... |
| 7 | Furano | 35000000 | 2 | 1 | 100 | Step into your dream home in Furano, where coz... | Nestled in Furano, this property boasts excell... |
| 8 | Niseko | 70000000 | 5 | 3 | 220 | Step into your dream home in Niseko! This char... | Located in Niseko, this property boasts a vibr... |
| 9 | Kushiro | 40000000 | 3 | 1 | 130 | Step into your dream home in Kushiro - a cozy ... | Nestled in Kushiro, this property boasts easy ... |


Save the CSV so that the file does not have to be regenerated on every run.

In [12]:
df.to_csv('./HomeAgent2.csv')
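The save-then-reload pattern used for both CSV files can be wrapped in a small cache helper, sketched here with the standard library only. `load_or_generate` is a hypothetical name, and `generate` stands for any zero-argument function that produces the text, such as a wrapper around the listing-generation call.

```python
from pathlib import Path


def load_or_generate(path, generate):
    """Return the cached text at `path`, calling `generate()` only if the file is missing."""
    path = Path(path)
    if path.exists():
        # Cache hit: no API call is made
        return path.read_text(encoding="utf-8")
    text = generate()
    path.write_text(text, encoding="utf-8")
    return text
```

For example, `csv_data = load_or_generate('./HomeAgent.csv', lambda: generate_review(prompt_text))` would call the API only on the first run. Deleting the file forces regeneration.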

## Step 3: Storing Listings in a Vector Database


In [13]:
# Chroma DB packages
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA, ConversationalRetrievalChain
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chains.question_answering import load_qa_chain

Load the dataset from a file.

In [14]:
from langchain.document_loaders.csv_loader import CSVLoader
from langchain.text_splitter import CharacterTextSplitter

loader = CSVLoader(file_path='./HomeAgent2.csv')
docs = loader.load()

splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
split_docs = splitter.split_documents(docs)
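To make the shape of the loaded documents concrete, here is a plain-Python approximation of what a CSV loader does with each row. It is a sketch, not LangChain's implementation: each row becomes one text block of `column: value` lines plus metadata with the source and row number. The function name `rows_to_documents` and the dictionary layout are assumptions for illustration.

```python
import csv
import io


def rows_to_documents(csv_text, source="listings.csv"):
    """Turn each CSV row into a dict with 'page_content' and 'metadata' (illustrative only)."""
    documents = []
    for row_number, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        # One "column: value" line per field, in column order
        content = "\n".join(f"{key}: {value}" for key, value in row.items())
        documents.append(
            {"page_content": content, "metadata": {"source": source, "row": row_number}}
        )
    return documents


# The first column has an empty header, as happens when a DataFrame index is written to CSV
sample_csv = ",Neighborhood,Price\n18,Yoichi,33000000\n"
sample_docs = rows_to_documents(sample_csv)
```

The empty header on the index column is why a document's text can start with a line such as `: 18`. Writing the file with `index=False` would avoid that column.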


- **Vector Database Setup**: Initialize and configure ChromaDB or a similar vector database to store real estate listings.
- **Generating and Storing Embeddings**: Convert the LLM-generated listings into suitable embeddings that capture the semantic content of each listing, and store these embeddings in the vector database.
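As a rough illustration of the idea behind the two points above, the sketch below ranks a few toy vectors against a query vector by cosine similarity. The three-dimensional vectors are made up for the example and are not real embeddings, and the distance metric a vector database uses by default may differ from cosine similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector, indexed_vectors, k=2):
    """Return the names of the k stored vectors most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in indexed_vectors.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy "embeddings": invented numbers, for illustration only
toy_index = {
    "Yoichi": [0.9, 0.1, 0.0],
    "Abashiri": [0.7, 0.3, 0.1],
    "Niseko": [0.0, 0.2, 0.9],
}
best_matches = top_k([1.0, 0.0, 0.0], toy_index, k=2)
```

A real embedding model produces vectors with hundreds or thousands of dimensions, but the ranking step works the same way in principle.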

In [15]:
#db = Chroma.from_documents(chunks, OpenAIEmbeddings())
db = Chroma.from_documents(documents = split_docs,
                        embedding = OpenAIEmbeddings(),
                        persist_directory='./data')



Test the vector database by retrieving the entries most similar to a given requirement.

In [16]:
matching_docs = db.similarity_search("Which resident has good transportation?")
matching_docs

[Document(page_content=": 18\nNeighborhood: Yoichi\nPrice: 33000000\nBedrooms: 2\nBathrooms: 1\nHouse Size: 100\nDescription: Step into your dream home in Yoichi! This cozy property boasts modern appliances, spacious living areas, and breathtaking views. Enjoy the convenience of a fully equipped kitchen, luxurious bathroom, and private outdoor space. Don't miss out on this perfect blend of comfort and style!\nNeighborhood Description: Yoichi offers a vibrant neighborhood with excellent transportation links, top-rated schools, shopping centers, and leisure facilities. Its perfect blend of urban amenities and natural beauty makes it a desirable location.", metadata={'row': 18, 'source': './HomeAgent2.csv'}),
 Document(page_content=': 10\nNeighborhood: Abashiri\nPrice: 32000000\nBedrooms: 2\nBathrooms: 1\nHouse Size: 90\nDescription: Step into your dream home in Abashiri, where cozy comfort meets modern convenience. This charming property boasts a fully-equipped kitchen, spacious living a

The following test passes the retrieved documents to the OpenAI chat model.

In [17]:
from langchain.chains.question_answering import load_qa_chain
chain = load_qa_chain(llm, chain_type="stuff",verbose=True)

answer =  chain.run(input_documents=matching_docs, question="Which resident has good transportation?")
answer

> Entering new StuffDocumentsChain chain...

> Entering new LLMChain chain...
Prompt after formatting:
System: Use the following pieces of context to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
: 18
Neighborhood: Yoichi
Price: 33000000
Bedrooms: 2
Bathrooms: 1
House Size: 100
Description: Step into your dream home in Yoichi! This cozy property boasts modern appliances, spacious living areas, and breathtaking views. Enjoy the convenience of a fully equipped kitchen, luxurious bathroom, and private outdoor space. Don't miss out on this perfect blend of comfort and style!
Neighborhood Description: Yoichi offers a vibrant neighborhood with excellent transportation links, top-rated schools, shopping centers, and leisure facilities. Its perfect blend of urban amenities and natural beauty makes it a desirable location.

: 10
Neighborhood: Abashiri
Price: 32000000
Bedro

'Both residents in Sapporo and Abashiri have properties that boast excellent access to transportation lines.'

Another example with RetrievalQA.

In [18]:
from langchain.chains import RetrievalQA
retrieval_chain = RetrievalQA.from_chain_type(llm, chain_type="stuff", retriever=db.as_retriever())
retrieval_chain.run("Which resident has good transportation?")

'Both the residents of Yoichi and Sapporo have access to good transportation. Yoichi offers excellent transportation links, while Sapporo boasts excellent access to transportation lines.'

## Step 4: Building the User Preference Interface

Collect buyer preferences, such as the number of bedrooms and bathrooms, location, and other specific requirements, either by asking a fixed set of questions or by letting the buyer describe their preferences in natural language. In this notebook the questions and answers are hard-coded as below.

In [19]:
questions = [
                "How big do you want your house to be?",
                "What are 3 most important things for you in choosing this property?",
                "Which amenities would you like?",
                "Which transportation options are important to you?",
                "How urban do you want your neighborhood to be?",
            ]
answers = [
                "A comfortable three-bedroom house with a spacious kitchen and a cozy living room.",
                "A quiet neighborhood, good local schools, and convenient shopping options.",
                "A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.",
                "Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.",
                "A balance between suburban tranquility and access to urban amenities like restaurants and theaters."
            ]

Buyer Preference Parsing: Implement logic to interpret and structure these preferences so they can be used to query the vector database.
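One simple way to structure the preferences, sketched below under the assumption that the answers alone carry the searchable content, is to pair each question with its answer and join the answers into a single query string. The function name `build_preference_query` and the sample data are hypothetical.

```python
def build_preference_query(questions, answers):
    """Pair questions with answers and build one query string from the answers."""
    if len(questions) != len(answers):
        raise ValueError("questions and answers must have the same length")
    pairs = [(q.strip(), a.strip()) for q, a in zip(questions, answers)]
    # Only the answers describe the buyer's wishes, so only they go into the query
    query_text = " ".join(answer for _, answer in pairs if answer)
    return pairs, query_text


sample_questions = [
    "How big do you want your house to be?",
    "What matters most to you?",
]
sample_answers = [
    "A comfortable three-bedroom house.",
    "A quiet neighborhood.",
]
sample_pairs, sample_query = build_preference_query(sample_questions, sample_answers)
```

The resulting string could then be passed to a similarity search on the vector store, for example `db.similarity_search(sample_query)`, to shortlist listings before asking the model for a recommendation.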

In [20]:
from langchain.memory import ConversationBufferMemory, ChatMessageHistory
#from typing import Any, Dict, Optional, Tuple


Create a conversation memory from the questions and answers that capture the user's preferences.

In [21]:
# Construct a conversational history
history = ChatMessageHistory()
for question, answer in zip(questions, answers):
    history.add_ai_message(question)
    history.add_user_message(answer)
history.add_ai_message("""Now I can recommend a home we're considering best suited for you from the listings.""")

# you could choose to store some of the q/a in memory as well, in addition to original questions
# class MementoBufferMemory(ConversationBufferMemory):
#     def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
#         input_str, output_str = self._get_input_output(inputs, outputs)
#         self.chat_memory.add_user_message(input_str)
#         self.chat_memory.add_ai_message(output_str)

preset_memory = ConversationBufferMemory(
    chat_memory=history,
    memory_key = 'chat_history',
    return_messages=True
)


Create a chain that makes a recommendation based on the question-and-answer memory and the listings stored in the vector database.

In [22]:
from langchain.chains import ConversationalRetrievalChain

qa = ConversationalRetrievalChain.from_llm(
    llm = llm,
    retriever =db.as_retriever(),
    memory=preset_memory,
    verbose = True)
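With `verbose = True`, the chain prints a prompt that rewrites the follow-up question into a standalone question using the stored chat history. The sketch below imitates how such a prompt could be assembled in plain Python. It is an illustration rather than LangChain's code; in particular the closing `Follow Up Input:` and `Standalone question:` labels are assumptions, since the verbose log in this notebook is cut off before that point.

```python
ROLE_LABELS = {"ai": "Assistant", "human": "Human"}


def format_chat_history(turns):
    """Render (role, text) turns as 'Assistant: ...' / 'Human: ...' lines."""
    return "\n".join(f"{ROLE_LABELS[role]}: {text}" for role, text in turns)


def build_condense_prompt(turns, follow_up):
    """Build a prompt asking the model to rewrite the follow-up as a standalone question."""
    return (
        "Given the following conversation and a follow up question, "
        "rephrase the follow up question to be a standalone question, "
        "in its original language.\n\n"
        "Chat History:\n"
        f"{format_chat_history(turns)}\n"
        # The two labels below are assumptions made for this sketch
        f"Follow Up Input: {follow_up}\n"
        "Standalone question:"
    )


sample_turns = [
    ("ai", "How big do you want your house to be?"),
    ("human", "A three-bedroom house."),
]
sample_prompt = build_condense_prompt(sample_turns, "Which listing fits me best?")
```

The standalone question produced by this first step is what the chain sends to the retriever, which is why the preset memory influences which listings are retrieved.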

Test with an example inquiry.

In [23]:
result = qa.invoke("Considering my preference, which resident in your listings is the best?")

> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Assistant: How big do you want your house to be?
Human: A comfortable three-bedroom house with a spacious kitchen and a cozy living room.
Assistant: What are 3 most important things for you in choosing this property?
Human: A quiet neighborhood, good local schools, and convenient shopping options.
Assistant: Which amenities would you like?
Human: A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.
Assistant: Which transportation options are important to you?
Human: Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.
Assistant: How urban do you want your neighborhood to be?
Human: A balance between suburban tranquility and access to urban amenities like restaurants and th

Display the answer returned by the model.

In [24]:
display(result["answer"])

'Based on your preferences for a cozy property with modern amenities, a spacious kitchen, luxurious bathroom, and a private outdoor space, the property in Yoichi with 2 bedrooms, 1 bathroom, and a house size of 100 might be the best suited for you. It offers breathtaking views and is located in a vibrant neighborhood with excellent transportation links and top-rated schools.'

## Step 6: Test with an app



In [25]:
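import gradio

def runChain(query, history):
    # Gradio also passes the chat history; the chain keeps its own memory, so it is unused here.
    return qa.invoke(query)["answer"]

app = gradio.ChatInterface(runChain)
app.queue()
app.launch(share=False, debug=True)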

Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
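The chat callback can be exercised without launching the UI by building it from any object that exposes an `invoke` method returning a dictionary with an `"answer"` key. The sketch below uses a stand-in chain for that purpose. `make_chat_handler` and `FakeChain` are hypothetical names, not part of Gradio or LangChain.

```python
def make_chat_handler(chain):
    """Return a (message, history) callback that forwards the message to the chain."""
    def handler(message, history):
        # history comes from the chat UI; it is unused because the chain
        # is assumed to keep its own conversation memory
        return chain.invoke(message)["answer"]
    return handler


class FakeChain:
    """Stand-in for the real chain so the handler can be tested offline."""

    def invoke(self, query):
        return {"answer": f"You asked: {query}"}


handler = make_chat_handler(FakeChain())
reply = handler("Which listing fits me?", [])
```

In the notebook, `make_chat_handler(qa)` would produce a callback that could be handed to `gradio.ChatInterface`.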


> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Assistant: How big do you want your house to be?
Human: A comfortable three-bedroom house with a spacious kitchen and a cozy living room.
Assistant: What are 3 most important things for you in choosing this property?
Human: A quiet neighborhood, good local schools, and convenient shopping options.
Assistant: Which amenities would you like?
Human: A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.
Assistant: Which transportation options are important to you?
Human: Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.
Assistant: How urban do you want your neighborhood to be?
Human: A balance between suburban tranquility and access to urban amenities like restaurants and th



In [None]:
app.close()

Closing server running on port: 7860


## Sample chat app output

![chat app](./chatapp.png)
