# Personalized Real Estate Agent

## Step 1: Setting Up the Python Application

```
pip install -r requirements.txt

requirements.txt
    langchain==0.1.12
    openai==0.28.1
    pydantic>=1.10.12
    pytest>=7.4.0
    sentence-transformers>=2.2.0
    transformers>=4.31.0
    chromadb==0.4.15
    jupyter==1.0.0
```

Set the OpenAI API key as an environment variable so that LangChain components can read it from the environment.

In [1]:
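import os
# Read the key from a local file; strip() removes a trailing newline.
with open("../../api_keys/openai.key", "rt") as key_file:
    openai_api_key = key_file.read().strip()
os.environ["OPENAI_API_KEY"] = openai_api_key  # the name OpenAI clients look up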
OpenAI_CHAT_MODEL = "gpt-3.5-turbo"
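
Key loading can also be wrapped in a small helper that fails early when no key is available. The sketch below is illustrative only: the helper name `load_api_key` and the fallback to an already-set environment variable are assumptions, not part of the original notebook.

```python
import os
from pathlib import Path


def load_api_key(path: str, env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in `path`, or fall back to `env_var`.

    Hypothetical helper: strips surrounding whitespace and exports the key
    under `env_var` so libraries that read the environment can find it.
    """
    key_file = Path(path)
    if key_file.is_file():
        key = key_file.read_text(encoding="utf-8").strip()
    else:
        key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"No API key found in {path!r} or ${env_var}")
    os.environ[env_var] = key
    return key
```

With such a helper, the first cell could reduce to `openai_api_key = load_api_key("../../api_keys/openai.key")`.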

In [2]:
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(openai_api_key=openai_api_key,
            model_name=OpenAI_CHAT_MODEL,
            temperature=0.6,
            max_tokens=2000)


## Step 3: Storing Listings in a Vector Database


In [3]:
# Chroma DB packages
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA, ConversationalRetrievalChain
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chains.question_answering import load_qa_chain

Load the dataset from a file.

In [4]:
from langchain.document_loaders.csv_loader import CSVLoader
from langchain.text_splitter import CharacterTextSplitter

loader = CSVLoader(file_path='./experiences.csv')
docs = loader.load()

splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
split_docs = splitter.split_documents(docs)
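
To make the loader's output easier to picture, here is a rough standard-library approximation of one-document-per-row CSV loading. It is a sketch rather than the library's implementation; the function name `rows_to_documents` and the sample columns are made up for illustration.

```python
import csv
import io


def rows_to_documents(csv_text: str, source: str) -> list[dict]:
    """Approximate one-document-per-row loading of a CSV file."""
    documents = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_index, row in enumerate(reader):
        # Each column becomes a "name: value" line in the page content.
        content = "\n".join(
            f"{key.strip()}: {value.strip()}" for key, value in row.items()
        )
        documents.append(
            {"page_content": content, "metadata": {"source": source, "row": row_index}}
        )
    return documents


# Hypothetical sample data, not the real experiences.csv.
sample_csv = "name,tag,skill\nFleet IoT,AIML,Python\nFleet IoT,AIML,Pandas\n"
sample_docs = rows_to_documents(sample_csv, source="./experiences.csv")
print(sample_docs[0]["page_content"])
```

Each row ends up as a short text block plus metadata recording where it came from, which is the shape the similarity-search results take later in this notebook.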


- **Vector Database Setup**: Initialize and configure ChromaDB or a similar vector database to store real estate listings.
- **Generating and Storing Embeddings**: Convert the LLM-generated listings into suitable embeddings that capture the semantic content of each listing, and store these embeddings in the vector database.

In [5]:
#db = Chroma.from_documents(chunks, OpenAIEmbeddings())
db = Chroma.from_documents(documents = split_docs,
                        embedding = OpenAIEmbeddings(),
                        persist_directory='./data')
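
Conceptually, the store keeps one vector per document and ranks documents by how close their vectors are to the query vector. The toy sketch below uses word counts in place of real embeddings and cosine similarity as a stand-in metric; the distance function the actual store uses may differ, and all names here are illustrative.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real model returns dense vectors."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_similarity_search(query: str, texts: list[str], k: int = 2) -> list[str]:
    """Return the k texts whose vectors are closest to the query vector."""
    query_vec = embed(query)
    ranked = sorted(texts, key=lambda t: cosine(query_vec, embed(t)), reverse=True)
    return ranked[:k]


toy_texts = [
    "workshop on streaming analytics for iot",
    "fleet scoring with machine learning",
    "customer facing solution engineering",
]
top_match = toy_similarity_search("machine learning scoring", toy_texts, k=1)
print(top_match)
```

A real embedding model also places texts with related meaning but different wording close together, which plain word counts cannot do.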


Test the vector database by retrieving the entries most similar to a query.

In [10]:
matching_docs = db.similarity_search("AI related experiences")
matching_docs

[Document(page_content='Name: experiences\nname.1: AIoT Streaming Analytics\ntag: AIML\nskill: Apama\ndescription: Workshop to empower a partner ready for AIoT(AI+IoT) streaming analytics\nURL: NA', metadata={'row': 38, 'source': './experiences.csv'}),
 Document(page_content='Name: experiences\nname.1: AIoT Streaming Analytics\ntag: AIML\nskill: Cumulocity\ndescription: Workshop to empower a partner ready for AIoT(AI+IoT) streaming analytics\nURL: NA', metadata={'row': 39, 'source': './experiences.csv'}),
 Document(page_content='Name: experiences\nname.1: AIoT Streaming Analytics\ntag: IoT\nskill: Apama\ndescription: Workshop to empower a partner ready for AIoT(AI+IoT) streaming analytics\nURL: NA', metadata={'row': 35, 'source': './experiences.csv'}),
 Document(page_content='Name: experiences\nname.1: AIoT Streaming Analytics\ntag: AIML\nskill: Python\ndescription: Workshop to empower a partner ready for AIoT(AI+IoT) streaming analytics\nURL: NA', metadata={'row': 37, 'source': './exp

The following test passes the retrieved documents to the OpenAI model through a question-answering chain.

In [9]:
from langchain.chains.question_answering import load_qa_chain
chain = load_qa_chain(llm, chain_type="stuff",verbose=True)

answer =  chain.run(input_documents=matching_docs, question="Which area is mostly worked?")
answer



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
System: Use the following pieces of context to answer the user's question. 
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
Name: experiences
name.1: Fleet IoT
tag: AIML
skill: Python
description: Fleet solution packaging project targeting the Japanese market. Install in-vehicle devices and collect data from platforms. From on-site data collection, provided the scoring function of the driver's behavior data using ML methods (4 months)
URL: NA

Name: experiences
name.1: Fleet IoT
tag: AIML
skill: Pandas
description: Fleet solution packaging project targeting the Japanese market. Install in-vehicle devices and collect data from platforms. From on-site data collection, provided the scoring function of the driver's behavior data using ML methods (4 months)
URL: NA

Name: experiences
name.1: Fleet 

'Based on the provided context, the individual has mostly worked on a Fleet IoT project with a focus on Artificial Intelligence and Machine Learning (AIML) skills. They have experience in Python, Pandas, SKLearn, and Jupyter as part of this project. The main area of work seems to be developing a fleet solution for the Japanese market, involving the installation of in-vehicle devices, data collection from platforms, and providing a scoring function for driver behavior data using machine learning methods.'
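
The verbose log above suggests what the "stuff" strategy does: every retrieved document is placed into a single prompt together with the question. A minimal sketch of that idea follows; the system text is copied from the log, while the function name and the message layout are assumptions for illustration.

```python
SYSTEM_TEMPLATE = (
    "Use the following pieces of context to answer the user's question.\n"
    "If you don't know the answer, just say that you don't know, "
    "don't try to make up an answer.\n"
    "----------------\n"
    "{context}"
)


def build_stuff_prompt(page_contents: list[str], question: str) -> list[dict]:
    """Join every document into one context block, then append the question."""
    context = "\n\n".join(page_contents)
    return [
        {"role": "system", "content": SYSTEM_TEMPLATE.format(context=context)},
        {"role": "user", "content": question},
    ]


messages = build_stuff_prompt(
    ["skill: Python", "skill: Pandas"], "Which area is mostly worked?"
)
print(messages[0]["content"])
```

Because all documents share one prompt, this approach only works while the combined text fits within the model's context window.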

Another example uses RetrievalQA, which retrieves the matching documents and answers the question in a single chain.

In [8]:
from langchain.chains import RetrievalQA
retrieval_chain = RetrievalQA.from_chain_type(llm, chain_type="stuff", retriever=db.as_retriever())
retrieval_chain.run("Which computer language is frequently experienced?")


'The computer languages frequently experienced in the provided context are Python and Jupyter.'

## Step 4: Building the User Preference Interface

Collect buyer preferences, such as the number of bedrooms, bathrooms, location, and other specific requirements, either through a set of questions or by asking the buyer to describe them in natural language. The preferences can also be hard-coded as questions and answers, as below.

In [11]:
questions = [
                "Which role are you seeking for?",
                "What are the technical strength you have obtained through your carier?",
                "Which roles have you played?",
                "What kind of teams have you lead?",
                "Can you run business in English and Japanese",
            ]
answers = [
                "An indivisual contributor role in AI and IoT area",
                "I can work on wide range of IT skill from HW to business applications",
                "Solution Engineer role and customer facing role in the tough situation",
                "Team consist of member from diverse areas and roles in the world. India, US, Europe, China. Product Managers, devlopers and etc.",
                "Yes. I can communicat in both languages."
            ]

Buyer Preference Parsing: Implement logic to interpret and structure these preferences for querying the vector database.
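
One simple way to structure the preferences is to pair each question with its answer and derive a single search string from the answers. This is only a sketch under that assumption: the `Preference` class and `build_search_query` are hypothetical names, and the sample answers are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Preference:
    """One question asked by the agent and the user's answer to it."""
    question: str
    answer: str


def build_search_query(preferences: list[Preference]) -> str:
    """Concatenate non-empty answers into one text query for similarity search."""
    parts = [p.answer.strip() for p in preferences if p.answer.strip()]
    return " ".join(parts)


sample_preferences = [
    Preference("Which role do you want?", "AI and IoT role"),
    Preference("Anything else?", "   "),
    Preference("Which languages?", "English and Japanese"),
]
search_query = build_search_query(sample_preferences)
print(search_query)
```

The resulting string could then be passed to `db.similarity_search(...)` in the same way as the earlier test query.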

In [12]:
from langchain.memory import ConversationBufferMemory, ChatMessageHistory
#from typing import Any, Dict, Optional, Tuple


Create a conversation memory from the question-and-answer pairs that capture the user's preferences.

In [18]:
# Construct a conversational history
history = ChatMessageHistory()
for question, answer in zip(questions, answers):
    history.add_ai_message(question)
    history.add_user_message(answer)
history.add_ai_message("""Now I can recommend you a job we're considering best suited for you.""")

# you could choose to store some of the q/a in memory as well, in addition to original questions
# class MementoBufferMemory(ConversationBufferMemory):
#     def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
#         input_str, output_str = self._get_input_output(inputs, outputs)
#         self.chat_memory.add_user_message(input_str)
#         self.chat_memory.add_ai_message(output_str)

preset_memory = ConversationBufferMemory(
    chat_memory=history,
    memory_key = 'chat_history',
    return_messages=True
)
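
The memory is essentially an ordered list of alternating agent and user messages. The standard-library sketch below mirrors how the history is built and how it appears in the chain's verbose log ("Assistant: ..." / "Human: ..."); the function names are illustrative, not LangChain APIs.

```python
def build_history(
    questions: list[str], answers: list[str], closing: str
) -> list[tuple[str, str]]:
    """Interleave agent questions and user answers, then add a closing AI turn."""
    messages = []
    for question, answer in zip(questions, answers):
        messages.append(("ai", question))
        messages.append(("human", answer))
    messages.append(("ai", closing))
    return messages


def render_history(messages: list[tuple[str, str]]) -> str:
    """Render (role, text) pairs one per line, prefixed with a speaker label."""
    labels = {"ai": "Assistant", "human": "Human"}
    return "\n".join(f"{labels[role]}: {text}" for role, text in messages)


toy_history = build_history(["Q1", "Q2"], ["A1", "A2"], "Done")
print(render_history(toy_history))
```

Note that `zip` silently stops at the shorter list, so a missing answer drops its question; on Python 3.10 or later, `zip(questions, answers, strict=True)` raises an error instead.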


Create a chain that makes a recommendation based on the question-and-answer conversation memory and the listings the agent has.

In [19]:
from langchain.prompts import PromptTemplate
prompt_template = """
Answer the question as precisely and accurately based on the context below
{context}
Question: {question}
Chat History:
{chat_history}
Answer:"""
PROMPT = PromptTemplate(
    template=prompt_template,
    input_variables=["context", "question", "chat_history"])
chain_type_kwargs = {"prompt": PROMPT}
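
To preview what the model will receive, the same template string can be filled with plain `str.format`, which handles named placeholders much like the template's default formatting. The values below are placeholders for illustration, not real retrieved data.

```python
prompt_template = """
Answer the question as precisely and accurately based on the context below
{context}
Question: {question}
Chat History:
{chat_history}
Answer:"""

# Hypothetical values standing in for retrieved context and stored history.
filled_prompt = prompt_template.format(
    context="skill: Python",
    question="Which language is used?",
    chat_history="Human: I work in AI and IoT.",
)
print(filled_prompt)
```

All three placeholders must be supplied; with `str.format`, omitting one raises a `KeyError`.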

In [20]:
from langchain.chains import ConversationalRetrievalChain

qa = ConversationalRetrievalChain.from_llm(
    llm = llm,
    retriever =db.as_retriever(),
    memory=preset_memory,
    combine_docs_chain_kwargs=chain_type_kwargs,
    verbose = True)

Test with an example inquiry.

In [21]:
result = qa.invoke("Considering my background, what kind of job fits to me?")



> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Assistant: Which role are you seeking for?
Human: An indivisual contributor role in AI and IoT area
Assistant: What are the technical strength you have obtained through your carier?
Human: I can work on wide range of IT skill from HW to business applications
Assistant: Which roles have you played?
Human: Solution Engineer role and customer facing role in the tough situation
Assistant: What kind of teams have you lead?
Human: Team consist of member from diverse areas and roles in the world. India, US, Europe, China. Product Managers, devlopers and etc.
Assistant: Can you run business in English and Japanese
Human: Yes. I can communicat in both languages.
Assistant: Now I can recommend you a job we're considering best suited for you.
Follow Up Input

Display the answer returned by the model.

In [22]:
display(result["answer"])

'Based on your background and skills in IT, AI, and IoT, as well as your experience as a Solution Engineer and leading diverse teams, a role as a Data Scientist or Machine Learning Engineer could be a good fit for you. Your experience with Azure ML, SKLearn, Jupyter, Pandas, and Python, as well as your ability to communicate in English and Japanese, make you well-suited for roles that involve working with data, developing prediction models, and implementing AI solutions. Consider exploring opportunities in companies that focus on AI, machine learning, and data science.'
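
The verbose log hints at the flow behind this answer: the follow-up question is first rewritten into a standalone question using the chat history, documents are retrieved for that question, and the final prompt is answered. The sketch below models this flow with plain callables standing in for the LLM and the retriever; it is an illustration under those assumptions, not the library's code.

```python
from typing import Callable


def conversational_answer(
    question: str,
    history: str,
    condense: Callable[[str, str], str],
    retrieve: Callable[[str], list[str]],
    answer: Callable[[str, str, str], str],
) -> str:
    """Condense the follow-up, retrieve context for it, then answer."""
    # With no history there is nothing to condense, so the question is used as-is.
    standalone = condense(history, question) if history else question
    context = "\n\n".join(retrieve(standalone))
    return answer(context, standalone, history)


# Stub callables; a real setup would call the LLM and the vector store here.
demo_result = conversational_answer(
    question="What fits me?",
    history="Human: I work in AI and IoT.",
    condense=lambda h, q: f"{q} (given: {h})",
    retrieve=lambda q: ["skill: Python", "skill: Pandas"],
    answer=lambda ctx, q, h: f"[{q}] -> {ctx.count('skill')} snippets",
)
print(demo_result)
```

Keeping the three stages separate makes it easy to test each one in isolation before wiring in the real model.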

## Step 6: Test with an app



In [23]:
import gradio

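def runChain(query, history):
    # Gradio also passes the chat history; the chain keeps its own memory, so it is unused here.
    return qa.invoke(query)["answer"]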

app = gradio.ChatInterface(runChain)
app.queue()
app.launch(share=False, debug=True)

Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.





> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Assistant: Which role are you seeking for?
Human: An indivisual contributor role in AI and IoT area
Assistant: What are the technical strength you have obtained through your carier?
Human: I can work on wide range of IT skill from HW to business applications
Assistant: Which roles have you played?
Human: Solution Engineer role and customer facing role in the tough situation
Assistant: What kind of teams have you lead?
Human: Team consist of member from diverse areas and roles in the world. India, US, Europe, China. Product Managers, devlopers and etc.
Assistant: Can you run business in English and Japanese
Human: Yes. I can communicat in both languages.
Assistant: Now I can recommend you a job we're considering best suited for you.
Human: Consider



In [24]:
app.close()

Closing server running on port: 7860
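
For a demo app it can help to guard the chat handler so that empty input or a failed model call produces a readable message instead of an error in the UI. The wrapper below is a sketch; `make_chat_handler` is a hypothetical name, and the `(message, history)` signature matches what a chat interface handler receives.

```python
from typing import Callable


def make_chat_handler(ask: Callable[[str], str]) -> Callable[[str, list], str]:
    """Wrap `ask` as a (message, history) chat handler with basic guards."""

    def handler(message: str, history: list) -> str:
        if not message.strip():
            return "Please enter a question."
        try:
            return ask(message)
        except Exception as error:  # report failures in the chat instead of crashing
            return f"Sorry, something went wrong: {error}"

    return handler


# Stand-in for the real chain call, used here only to exercise the wrapper.
demo_handler = make_chat_handler(lambda q: q.upper())
print(demo_handler("hello", []))
```

In the notebook, the wrapped function would be something like `lambda q: qa.invoke(q)["answer"]`, and the resulting handler would be passed to `gradio.ChatInterface`.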


## Sample chat app output

![chat app](./chatapp.png)
