# Personalized Real Estate Agent

## Step 1: Setting Up the Python Application

```
pip install -r requirements.txt

requirements.txt
    langchain==0.1.12
    openai==0.28.1
    pydantic>=1.10.12
    pytest>=7.4.0
    sentence-transformers>=2.2.0
    transformers>=4.31.0
    chromadb==0.4.15
    jupyter==1.0.0
```

In [1]:
#from langchain.llms import OpenAI
from langchain.chains import ConversationChain

import requests

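Set the OpenAI API key as an environment variable so that LangChain and the `openai` client can read it from the environment.

In [2]:
import os

# Read the key from a local file; strip the trailing newline so the key stays valid.
with open("../../api_keys/openai.key", "rt") as key_file:
    openai_api_key = key_file.read().strip()

# Both libraries look up the upper-case name, and variable names are case-sensitive on Linux and macOS.
os.environ["OPENAI_API_KEY"] = openai_api_key
OpenAI_CHAT_MODEL = "gpt-3.5-turbo"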


## Step 2: Generating Real Estate Listings

Generate at least 10 real estate listings using a Large Language Model. This can involve creating prompts for the LLM to produce descriptions of various properties. The helper below sends a single prompt to the chat model and returns the generated text:

In [4]:
import openai

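def generate_review(prompt, min_len):
    """Send `prompt` to the chat model and return the generated text ("" on API error).

    `min_len` is accepted for the calls below but is not used here, because the
    target length is already written into the prompt text.
    """
    OpenAI_CHAT_MODEL = "gpt-3.5-turbo"
    openai_instruction = [
        {
            "role": "system",
            "content": "You are a real estate agent."},
        {
            "role": "user",
            "content": f"prompt_text = {prompt}"}
        ]

    # Fallback value so the function still returns something if the API call fails.
    review = ""
    try:
        response = openai.ChatCompletion.create(
            model = OpenAI_CHAT_MODEL,
            messages = openai_instruction,
            temperature=0.6,
            max_tokens=2000
        )
        review = response.choices[0].message['content'].strip()

    except openai.error.OpenAIError as err:
        print(f"Encountered an error: {err}")

    return review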



Sample prompts that generate a synthetic sales pitch and a neighborhood description.

In [5]:
location = "Furano, Hokkaido, Japan"

In [6]:
review_len = 40
prompt_text = f"Write a {review_len} word sales talk for a cozy real estate and its equipments in {location}."
generated_review = generate_review(prompt_text, min_len=review_len)
generated_review

'Step into your dream home in Furano, Hokkaido, Japan! This cozy real estate offers a perfect blend of comfort and style, complete with top-of-the-line appliances and modern furnishings. Embrace the tranquil surroundings and make every day a retreat in this charming property.'

In [7]:
review_len = 30
prompt_text = f"Write a {review_len} word positive statement around the neighbor of a real estate in {location}."
generated_neighbor = generate_review(prompt_text, min_len=review_len)
generated_neighbor

'Nestled in the picturesque town of Furano, Hokkaido, Japan, this charming neighborhood offers a tranquil setting with stunning mountain views and easy access to outdoor activities all year round.'

In [8]:
entries = 20
area = "Hokkaido, Japan"
prompt_text = f"Generate a CSV file of {entries} entries of real estate data with the column headers Neighborhood, Price, Bedrooms, Bathrooms, House Size in {area}. Bedrooms and Bathrooms are counts, and House Size is in square meters. Price is in Japanese Yen. Vary the number of bathrooms from 1 to 3. Respond with the CSV only."
csv_data = generate_review(prompt_text, min_len=review_len)
csv_data

'Neighborhood,Price,Bedrooms,Bathrooms,House Size\nSapporo,65000000,3,2,150\nOtaru,45000000,2,1,100\nHakodate,78000000,4,3,200\nAsahikawa,55000000,3,2,120\nFurano,70000000,3,2,160\nObihiro,48000000,2,1,90\nKitami,60000000,3,2,140\nTomakomai,53000000,3,2,110\nAbashiri,72000000,4,3,180\nWakkanai,50000000,3,1,130\nMorioka,68000000,3,2,150\nHachinohe,52000000,2,1,100\nAomori,75000000,4,3,190\nHirosaki,62000000,3,2,130\nAkita,57000000,3,2,120\nYamagata,69000000,3,2,160\nMorioka,51000000,3,1,110\nSendai,74000000,4,3,180\nFukushima,64000000,3,2,140\nKoriyama,59000000,3,2,130'

In [9]:
with open('./HomeAgent.csv', mode='w') as f:
    f.write(csv_data)
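
LLM output is not guaranteed to be well-formed CSV, and the sample above repeats a neighborhood (Morioka appears twice). A minimal sketch of a sanity check that could run before the data is used further, relying only on the standard library. `validate_listings_csv`, `EXPECTED_COLUMNS`, and `NUMERIC_COLUMNS` are hypothetical names, not part of the original notebook:

```python
import csv
import io

EXPECTED_COLUMNS = ["Neighborhood", "Price", "Bedrooms", "Bathrooms", "House Size"]
NUMERIC_COLUMNS = EXPECTED_COLUMNS[1:]

def validate_listings_csv(csv_text):
    """Parse LLM-generated CSV text and return (rows, problems)."""
    reader = csv.DictReader(io.StringIO(csv_text.strip()))
    problems = []
    if reader.fieldnames != EXPECTED_COLUMNS:
        problems.append(f"unexpected header: {reader.fieldnames}")
        return [], problems

    rows, seen = [], set()
    for line_no, row in enumerate(reader, start=2):
        # Every numeric column must hold a plain non-negative integer.
        bad = [c for c in NUMERIC_COLUMNS if not (row[c] or "").strip().isdigit()]
        if bad:
            problems.append(f"line {line_no}: non-numeric {bad}")
            continue
        # Keep only the first row for each neighborhood.
        if row["Neighborhood"] in seen:
            problems.append(f"line {line_no}: duplicate neighborhood {row['Neighborhood']}")
            continue
        seen.add(row["Neighborhood"])
        rows.append(row)
    return rows, problems

sample = (
    "Neighborhood,Price,Bedrooms,Bathrooms,House Size\n"
    "Sapporo,65000000,3,2,150\n"
    "Morioka,68000000,3,2,150\n"
    "Morioka,51000000,3,1,110\n"
)
rows, problems = validate_listings_csv(sample)
print(len(rows), problems)
```

Dropping duplicates is one possible policy; regenerating the offending rows would be another.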

In [10]:
import pandas as pd

df = pd.read_csv('./HomeAgent.csv')
df

Unnamed: 0,Neighborhood,Price,Bedrooms,Bathrooms,House Size
0,Sapporo,65000000,3,2,150
1,Otaru,45000000,2,1,100
2,Hakodate,78000000,4,3,200
3,Asahikawa,55000000,3,2,120
4,Furano,70000000,3,2,160
5,Obihiro,48000000,2,1,90
6,Kitami,60000000,3,2,140
7,Tomakomai,53000000,3,2,110
8,Abashiri,72000000,4,3,180
9,Wakkanai,50000000,3,1,130


In [11]:
description = []
neighborhood_description = []

# Generate a sales pitch and a neighborhood blurb for every listing row.
for index, row in df.iterrows():
    review_len = 40
    location = row['Neighborhood']
    prompt_text = f"Write a {review_len} word sales talk for a cozy real estate and its equipments in {location}. Mention randomly about the amenities equipped."
    description.append(generate_review(prompt_text, min_len=review_len))

    review_len = 30
    prompt_text = f"Write a {review_len} word positive statement around the neighbor of a real estate in {location}. Mention randomly about the transportation lines, schools, shoppings, and leisure facilities nearby. Also mention about degree of urban randomly in its vicinity."
    neighborhood_description.append(generate_review(prompt_text, min_len=review_len))

df['Description'] = description
df['Neighborhood Description'] = neighborhood_description

In [12]:
df

Unnamed: 0,Neighborhood,Price,Bedrooms,Bathrooms,House Size,Description,Neighborhood Description
0,Sapporo,65000000,3,2,150,Welcome to this charming Sapporo property! Thi...,"Located in Sapporo, this real estate boasts a ..."
1,Otaru,45000000,2,1,100,"Step into your dream home in Otaru, where cozy...","Located in Otaru, this property boasts easy ac..."
2,Hakodate,78000000,4,3,200,"Step into your dream home in Hakodate, where c...","""Located in Hakodate, this property boasts eas..."
3,Asahikawa,55000000,3,2,120,Step into your cozy Asahikawa retreat featurin...,"Nestled in Asahikawa, this property boasts eas..."
4,Furano,70000000,3,2,160,"Step into your dream home in Furano, where coz...",Discover Furano's charming neighborhood with c...
5,Obihiro,48000000,2,1,90,Step into your dream home in Obihiro! This coz...,Discover the charm of living in Obihiro! This ...
6,Kitami,60000000,3,2,140,Step into your dream home in Kitami! This cozy...,"Nestled in Kitami, this property boasts easy a..."
7,Tomakomai,53000000,3,2,110,Step into your dream home in Tomakomai! This c...,Tomakomai offers a vibrant neighborhood with e...
8,Abashiri,72000000,4,3,180,Step into your dream home in Abashiri! This co...,Discover the charm of living in Abashiri! Enjo...
9,Wakkanai,50000000,3,1,130,Step into your dream home in Wakkanai! This co...,"Located in Wakkanai, this property boasts exce..."


Save the DataFrame to a CSV file so that the listings do not have to be regenerated on every run.

In [13]:
df.to_csv('./HomeAgent2.csv')

Load the dataset from the saved file.
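
The save-then-load pattern can be wrapped in a small caching helper so that the LLM is called only when the file is missing. This is a sketch under assumptions: `load_or_generate` and the `generate` callback are hypothetical names, and the demo writes to a temporary directory rather than `./HomeAgent2.csv`.

```python
import tempfile
from pathlib import Path

def load_or_generate(path, generate):
    """Return the text stored at `path`; call `generate()` and save it if missing."""
    path = Path(path)
    if path.exists():
        return path.read_text(encoding="utf-8")
    text = generate()  # e.g. a function that calls the LLM and returns CSV text
    path.write_text(text, encoding="utf-8")
    return text

calls = []

def fake_generate():
    # Stand-in for the expensive LLM call; records how often it runs.
    calls.append(1)
    return "Neighborhood,Price\nFurano,70000000\n"

with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "listings.csv"
    first = load_or_generate(cache_file, fake_generate)
    second = load_or_generate(cache_file, fake_generate)

print(len(calls))  # the generator ran only for the first call
```

In the notebook itself, the saved file can be read back with `pd.read_csv('./HomeAgent2.csv', index_col=0)`.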

## Step 3: Storing Listings in a Vector Database

- **Vector Database Setup**: Initialize and configure ChromaDB or a similar vector database to store real estate listings.
- **Generating and Storing Embeddings**: Convert the LLM-generated listings into suitable embeddings that capture the semantic content of each listing, and store these embeddings in the vector database.
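
To make the idea of semantic closeness concrete, the sketch below ranks toy embedding vectors by cosine similarity. The three-dimensional vectors are made up for illustration; real embeddings come from the embedding model and have far more dimensions, and a vector database may use a different distance metric and an index instead of this brute-force loop.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, store, k=2):
    """Return the k listing names whose vectors are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda name: cosine_similarity(query_vec, store[name]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical embeddings: the axes loosely mean (transport, schools, nature).
toy_store = {
    "Sapporo": [0.9, 0.8, 0.2],
    "Furano": [0.2, 0.3, 0.9],
    "Otaru": [0.6, 0.4, 0.5],
}
query = [1.0, 0.5, 0.0]  # a buyer who cares mostly about transport
print(top_k(query, toy_store))
```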

In [14]:
# Chroma DB packages
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA, ConversationalRetrievalChain
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chains.question_answering import load_qa_chain

In [15]:
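import os

# Read the key and expose it under the upper-case name that LangChain and openai look up.
with open("../../api_keys/openai.key", "rt") as key_file:
    openai_api_key = key_file.read().strip()
os.environ["OPENAI_API_KEY"] = openai_api_key
OpenAI_CHAT_MODEL = "gpt-3.5-turbo"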


In [16]:
from langchain.document_loaders.csv_loader import CSVLoader
from langchain.text_splitter import CharacterTextSplitter

loader = CSVLoader(file_path='./HomeAgent2.csv')
docs = loader.load()

splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
split_docs = splitter.split_documents(docs)


In [17]:
#db = Chroma.from_documents(chunks, OpenAIEmbeddings())
db = Chroma.from_documents(documents = split_docs,
                        embedding = OpenAIEmbeddings(),
                        persist_directory='./data')


In [18]:
matching_docs = db.similarity_search("Which resident has good transportation?")
matching_docs

[Document(page_content=': 0\nNeighborhood: Sapporo\nPrice: 45000000\nBedrooms: 3\nBathrooms: 2\nHouse Size: 150\nDescription: Step into your dream home in Sapporo! This cozy real estate gem comes fully equipped with modern amenities like a spacious kitchen, luxurious bathroom, and a cozy fireplace. Enjoy the convenience of nearby parks, shopping centers, and public transportation. Welcome home!\nNeighborhood Description: "Located in Sapporo, this property boasts excellent access to transportation lines, top-rated schools, shopping centers, and leisure facilities. Enjoy the perfect blend of urban convenience and peaceful surroundings."', metadata={'row': 0, 'source': './HomeAgent2.csv'}),
 Document(page_content=': 0\nNeighborhood: Sapporo\nPrice: 45000000\nBedrooms: 3\nBathrooms: 2\nHouse Size: 150\nDescription: Step into your dream home in Sapporo! This cozy real estate gem comes fully equipped with modern amenities like a spacious kitchen, luxurious bathroom, and a cozy fireplace. Enj

## Step 4: Building the User Preference Interface

Collect buyer preferences, such as the number of bedrooms, the number of bathrooms, the location, and other specific requirements, either through a set of questions or by asking the buyer to describe their preferences in natural language. You can hard-code the preferences as questions and answers or collect them interactively, for example:

In [19]:
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(openai_api_key=openai_api_key,
            model_name=OpenAI_CHAT_MODEL,
            temperature=0.6,
            max_tokens=2000)


In [20]:
from langchain.chains.question_answering import load_qa_chain
chain = load_qa_chain(llm, chain_type="stuff",verbose=True)

answer =  chain.run(input_documents=matching_docs, question="Which resident has good transportation?")
answer


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
System: Use the following pieces of context to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
: 0
Neighborhood: Sapporo
Price: 45000000
Bedrooms: 3
Bathrooms: 2
House Size: 150
Description: Step into your dream home in Sapporo! This cozy real estate gem comes fully equipped with modern amenities like a spacious kitchen, luxurious bathroom, and a cozy fireplace. Enjoy the convenience of nearby parks, shopping centers, and public transportation. Welcome home!
Neighborhood Description: "Located in Sapporo, this property boasts excellent access to transportation lines, top-rated schools, shopping centers, and leisure facilities. Enjoy the perfect blend of urban convenience and peaceful surroundings."

: 0
Neighborhood: Sapporo
Price: 45000000
Bedrooms: 3
Bathrooms: 2


'Both residents in Sapporo and Asahikawa boast excellent access to transportation lines, so both neighborhoods have good transportation options for residents.'

In [21]:
from langchain.chains import RetrievalQA
retrieval_chain = RetrievalQA.from_chain_type(llm, chain_type="stuff", retriever=db.as_retriever())
retrieval_chain.run("Which resident has good transportation?")

'Both residents in Sapporo and Asahikawa have good transportation options. The properties in both neighborhoods are described as having excellent access to transportation lines, making it convenient for residents to get around.'

**Buyer Preference Parsing**: Implement logic to interpret and structure these preferences so they can be used to query the vector database.

In [22]:
from langchain.prompts import PromptTemplate
from langchain.schema import AIMessage, HumanMessage, SystemMessage
from langchain.memory import ConversationSummaryMemory, ConversationBufferMemory, CombinedMemory, ChatMessageHistory
from typing import Any, Dict, Optional, Tuple


In [23]:
questions = [
                "How big do you want your house to be?",
                "What are 3 most important things for you in choosing this property?",
                "Which amenities would you like?",
                "Which transportation options are important to you?",
                "How urban do you want your neighborhood to be?",
            ]
answers = [
                "A comfortable three-bedroom house with a spacious kitchen and a cozy living room.",
                "A quiet neighborhood, good local schools, and convenient shopping options.",
                "A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.",
                "Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.",
                "A balance between suburban tranquility and access to urban amenities like restaurants and theaters."
            ]
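
One simple way to structure these answers for retrieval is to fold each question and answer pair into a single text block that can be embedded as one query. A minimal sketch: `build_preference_query` is a hypothetical helper, and the two sample pairs are shortened from the lists above.

```python
def build_preference_query(questions, answers):
    """Combine Q/A pairs into one text block suitable for a similarity search."""
    if len(questions) != len(answers):
        raise ValueError("questions and answers must have the same length")
    lines = [f"Q: {q.strip()}\nA: {a.strip()}" for q, a in zip(questions, answers)]
    return "\n".join(lines)

sample_questions = [
    "How big do you want your house to be?",
    "Which transportation options are important to you?",
]
sample_answers = [
    "A comfortable three-bedroom house.",
    "Easy access to a reliable bus line.",
]
preference_query = build_preference_query(sample_questions, sample_answers)
print(preference_query)
```

The resulting string could be passed to `db.similarity_search(preference_query)` in place of a hand-written question.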

In [24]:
# Construct a conversational history
history = ChatMessageHistory()
for question, answer in zip(questions, answers):
    history.add_ai_message(question)
    history.add_user_message(answer)
history.add_ai_message("""Now I can recommend a home we're considering best suited for you from the listings.""")

# you could choose to store some of the q/a in memory as well, in addition to original questions
# class MementoBufferMemory(ConversationBufferMemory):
#     def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
#         input_str, output_str = self._get_input_output(inputs, outputs)
#         self.chat_memory.add_user_message(input_str)
#         self.chat_memory.add_ai_message(output_str)

preset_memory = ConversationBufferMemory(
    chat_memory=history,
    memory_key = 'chat_history',
    return_messages=True
)


In [25]:
from langchain.chains import ConversationalRetrievalChain

qa = ConversationalRetrievalChain.from_llm(
    llm = llm,
    retriever =db.as_retriever(),
    memory=preset_memory,
    verbose = True)

In [26]:
result = qa.invoke("Which resident has good school for 2 kids, and has more than 3 bed rooms?")



> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Assistant: How big do you want your house to be?
Human: A comfortable three-bedroom house with a spacious kitchen and a cozy living room.
Assistant: What are 3 most important things for you in choosing this property?
Human: A quiet neighborhood, good local schools, and convenient shopping options.
Assistant: Which amenities would you like?
Human: A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.
Assistant: Which transportation options are important to you?
Human: Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.
Assistant: How urban do you want your neighborhood to be?
Human: A balance between suburban tranquility and access to urban amenities like restaurants and th

In [27]:
display(result["answer"])

'The property in Sapporo would be a good choice for a family with two kids looking for a residence with good schools and more than three bedrooms. It has 3 bedrooms, 2 bathrooms, and is located in a neighborhood with top-rated schools.'

In [27]:
RECOMMENDER_TEMPLATE = """
You are a realtor. Recommend the listing that best suits the human's needs.
The human's preferences can be derived from their answers to the questions.

### Personal Questions and Answers:

{qa}

### Summary of Recommendations:


### Context:

{context}

Human: {query}

AI:
"""

PROMPT = PromptTemplate(
    template=RECOMMENDER_TEMPLATE,

    input_variables=["context", "qa", "query"],
)

In [37]:
from langchain.chains import LLMChain, LLMRequestsChain, SimpleSequentialChain

seq_chain = SimpleSequentialChain(
    chains = [conv_chain, retrieval_chain]
)

In [40]:
seq_chain.invoke({"input" :"Which resident has good transportation?"})

KeyError: 'query'

## Step 5: Searching Based on Preferences

**Semantic Search Implementation**: Use the structured buyer preferences to perform a semantic search on the vector database, retrieving listings that most closely match the user's requirements.

**Listing Retrieval Logic**: Fine-tune the retrieval algorithm to ensure that the most relevant listings are selected based on the semantic closeness to the buyer’s preferences.
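
Semantic similarity alone does not enforce hard requirements such as a minimum bedroom count or a budget. One possible refinement, sketched below in plain Python, applies numeric filters first and then orders the remaining listings by a similarity score. The `score` values are placeholders standing in for whatever the vector search returns, and `select_listings` is a hypothetical helper.

```python
def select_listings(candidates, min_bedrooms=0, max_price=None, k=3):
    """Filter candidates by hard constraints, then keep the k most similar."""
    kept = [
        c for c in candidates
        if c["Bedrooms"] >= min_bedrooms
        and (max_price is None or c["Price"] <= max_price)
    ]
    # A higher score means semantically closer to the buyer's preferences.
    kept.sort(key=lambda c: c["score"], reverse=True)
    return kept[:k]

# Rows taken from the generated table; the scores are made up for illustration.
candidates = [
    {"Neighborhood": "Sapporo", "Price": 65000000, "Bedrooms": 3, "score": 0.91},
    {"Neighborhood": "Otaru", "Price": 45000000, "Bedrooms": 2, "score": 0.88},
    {"Neighborhood": "Hakodate", "Price": 78000000, "Bedrooms": 4, "score": 0.84},
    {"Neighborhood": "Asahikawa", "Price": 55000000, "Bedrooms": 3, "score": 0.80},
]
best = select_listings(candidates, min_bedrooms=3, max_price=70000000, k=2)
print([c["Neighborhood"] for c in best])
```

Filtering before ranking keeps a highly similar but unaffordable listing from crowding out a suitable one.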

## Step 6: Personalizing Listing Descriptions

**LLM Augmentation**: For each retrieved listing, use the LLM to augment the description, tailoring it to resonate with the buyer’s specific preferences. This involves subtly emphasizing aspects of the property that align with what the buyer is looking for.

**Maintaining Factual Integrity**: Ensure that the augmentation process enhances the appeal of the listing without altering factual information.
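
A lightweight guard for factual integrity is to compare the numbers in the augmented text against the listing's own fields and reject any rewrite that introduces a figure the listing does not contain. This is only a heuristic sketch (it ignores numbers written as words such as "three"), and `extract_numbers` and `introduces_new_numbers` are hypothetical helpers:

```python
import re

def extract_numbers(text):
    """Return the set of integers in `text`, ignoring thousands separators."""
    return {int(m.replace(",", "")) for m in re.findall(r"\d[\d,]*", text)}

def introduces_new_numbers(listing, augmented_text):
    """True if the augmented text contains a number absent from the listing."""
    allowed = {v for v in listing.values() if isinstance(v, int)}
    return not extract_numbers(augmented_text) <= allowed

listing = {"Neighborhood": "Sapporo", "Price": 65000000, "Bedrooms": 3,
           "Bathrooms": 2, "House Size": 150}

faithful = "This 3-bedroom, 2-bathroom Sapporo home offers 150 square meters near top-rated schools."
altered = "This 4-bedroom Sapporo home offers 150 square meters near top-rated schools."

print(introduces_new_numbers(listing, faithful))
print(introduces_new_numbers(listing, altered))
```

A rewrite flagged by this check could be regenerated or replaced with the original description.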

In [28]:
from langchain.callbacks import StdOutCallbackHandler
handler = StdOutCallbackHandler()

# The "stuff" chain supplies `context` and `question`, so rename `{query}` accordingly
# and pre-fill `{qa}` with the buyer's answers collected in Step 4.
qa_text = "\n".join(f"Q: {q}\nA: {a}" for q, a in zip(questions, answers))
personalized_prompt = PromptTemplate.from_template(
    RECOMMENDER_TEMPLATE.replace("{query}", "{question}")
).partial(qa=qa_text)

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(search_type="similarity"),
    chain_type_kwargs = {"prompt": personalized_prompt, "verbose": True},
    callbacks=[handler]
)

In [29]:

question = "List up to one you think is the best?"
result = qa.invoke({"query": question})

In [None]:

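# gradio is not listed in requirements.txt and has to be installed separately.
import gradio

def runChain(query, history):
    # RetrievalQA returns a dict; the generated answer is stored under "result".
    return qa.invoke({"query": query})["result"]

app = gradio.ChatInterface(runChain)
app.queue()
app.launch(share=False, debug=True)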