<a href="https://colab.research.google.com/github/heber-augusto/udacity-generative-ai-nanodegree/blob/main/personalized-real-state-agent/personalized_real_state_agent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Personalized Real Estate Agent
Project from the Udacity Generative AI Nanodegree.

The task is to develop an innovative application named "HomeMatch". This application leverages large language models (LLMs) and vector databases to transform standard real estate listings into personalized narratives that resonate with potential buyers' unique preferences and needs.

This notebook implements a Retrieval-Augmented Generation (RAG) system to provide personalized real estate recommendations. The system retrieves the listings most relevant to the user's question from a vector database and passes them, together with the user's conversation history, to an LLM that generates the recommendation. It combines OpenAI embeddings, a Chroma vector store, and LangChain's message-history runnables.

## Libraries and API Key Setup
This section reads the OpenAI API key and installs the required libraries for the project.
It ensures that all dependencies are properly configured before proceeding.

### Read the OpenAI API key

In [1]:
from google.colab import userdata
import os
os.environ['OPENAI_API_KEY'] = userdata.get('openai_api_key')

### Library installation

In [2]:
!pip install langchain openai==0.28 chromadb tiktoken
!pip install langchain-community langchain-core

Collecting langchain
  Downloading langchain-0.2.3-py3-none-any.whl (974 kB)
Collecting openai==0.28
  Downloading openai-0.28.0-py3-none-any.whl (76 kB)
Collecting chromadb
  Downloading chromadb-0.5.0-py3-none-any.whl (526 kB)
Collecting tiktoken
  Downloading tiktoken-0.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
Collecting langchain-core<0.3.0,>=0.2.0 (from langchain)
  Downloading langchain_core-0.2.5-py3-none-any.whl (314 kB)

### Library imports

In [3]:
from langchain.chat_models import ChatOpenAI
from langchain.llms import OpenAI
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.prompts import PromptTemplate
from langchain.schema import AIMessage, HumanMessage, SystemMessage, BaseMessage
from langchain.memory import ConversationSummaryMemory, ConversationBufferMemory, CombinedMemory, ChatMessageHistory
from langchain.chains import ConversationChain
from typing import Any, Dict, Optional, Tuple
from langchain.vectorstores import Chroma
from langchain.docstore.document import Document
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.pydantic_v1 import BaseModel, Field

from langchain_core.runnables import (
    RunnableLambda,
    ConfigurableFieldSpec,
    RunnablePassthrough,
)
from langchain_core.runnables.history import RunnableWithMessageHistory

from operator import itemgetter
from typing import List



## Generating Real Estate Listings
This section uses an LLM to generate at least 10 real estate listings.
It provides a prompt template and configures the model to create detailed property descriptions.

### Generating Real Estate Listings with an LLM

In [4]:
model_name="gpt-4o"
temperature = 1.2

chat_llm = ChatOpenAI(
    model_name=model_name,
    temperature=temperature,
    max_tokens = 4096)

prompt_template = """
Generate at least 10 real estate listings to produce descriptions of various properties. An example of a listing might be:
-----------
Neighborhood: Green Oaks
Price: $800,000
Bedrooms: 3
Bathrooms: 2
House Size: 2,000 sqft

Description: Welcome to this eco-friendly oasis nestled in the heart of Green Oaks. This charming 3-bedroom, 2-bathroom home boasts energy-efficient features such as solar panels and a well-insulated structure. Natural light floods the living spaces, highlighting the beautiful hardwood floors and eco-conscious finishes. The open-concept kitchen and dining area lead to a spacious backyard with a vegetable garden, perfect for the eco-conscious family. Embrace sustainable living without compromising on style in this Green Oaks gem.

Neighborhood Description: Green Oaks is a close-knit, environmentally-conscious community with access to organic grocery stores, community gardens, and bike paths. Take a stroll through the nearby Green Oaks Park or grab a cup of coffee at the cozy Green Bean Cafe. With easy access to public transportation and bike lanes, commuting is a breeze.
"""

messages = [
    HumanMessage(content=prompt_template),
]

# Uncomment the lines below to regenerate listings.txt when the file is not provided
#output = chat_llm(messages)

#with open('listings.txt', 'w') as fd:
#    fd.write(output.content)



### Read file from repository and put the content inside a list
The following code assumes that the file listings.txt is already in the repository.

In [5]:
import requests

# Download the pre-generated listings and fail early on an HTTP error
response = requests.get(
    'https://raw.githubusercontent.com/heber-augusto/udacity-generative-ai-nanodegree/main/personalized-real-state-agent/files/listings.txt')
response.raise_for_status()

### Create a list with listings
Split the text on the `-----------` separator, drop the first element (the text before the first listing), and trim the last five characters of each generated listing.

In [6]:
listings = []
for prop in response.text.split('-----------')[1:]:
  listings.append(prop[:-5])
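
The splitting step above can be tried on a small in-memory sample. The sketch below is only an illustration: `sample_text` is a hypothetical stand-in for the contents of listings.txt, and `parse_listings` is a helper name introduced here. It swaps the fixed `[:-5]` slice for `strip()`, which removes surrounding whitespace regardless of how many trailing characters follow each listing. This is an alternative, not the approach used in the cell above.

```python
# Hypothetical sample in the same shape as listings.txt (an assumption).
SEPARATOR = "-----------"
sample_text = (
    "Here are some listings:\n"
    "-----------\n"
    "Neighborhood: Green Oaks\nPrice: $800,000\n\n"
    "-----------\n"
    "Neighborhood: Cedar Pines\nPrice: $525,000\n"
)


def parse_listings(text, separator=SEPARATOR):
    """Split on the separator, drop the preamble, and strip whitespace."""
    parts = text.split(separator)[1:]  # the first element is the text before the first listing
    return [part.strip() for part in parts if part.strip()]


sample_listings = parse_listings(sample_text)
print(len(sample_listings))
print(sample_listings[0])
```

Stripping whitespace also skips empty fragments, for example when the file ends with a separator.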

## Semantic Search

### Creating a Vector Database and Storing Listings

In [7]:
from langchain.text_splitter import CharacterTextSplitter
from langchain.schema.document import Document
from langchain.vectorstores import Chroma

docs = [Document(page_content=x) for x in listings]

splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
split_docs = splitter.split_documents(docs)

embeddings = OpenAIEmbeddings()

db = Chroma.from_documents(split_docs, embeddings)


### Function for Semantic Search of Listings Based on a Query

In [8]:
def get_similar_listings(query, topk=2):
  similar_docs = db.similarity_search(
      query,
      k=topk)
  return similar_docs
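
To make the idea behind `similarity_search` concrete, here is a minimal sketch of top-k ranking by cosine similarity. It is not how Chroma or OpenAI embeddings work internally: the `embed` function below is a toy word-count vector used as a stand-in for a real embedding model, and all names (`embed`, `cosine_similarity`, `toy_similarity_search`, `toy_documents`) are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_similarity_search(query, documents, k=2):
    """Return the k documents whose vectors are closest to the query vector."""
    query_vec = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, embed(doc)),
        reverse=True,
    )
    return ranked[:k]


toy_documents = [
    "quiet neighborhood with three bedrooms and a garden",
    "downtown loft close to nightlife and restaurants",
    "three bedrooms in a calm and quiet suburb",
]
top = toy_similarity_search("three bedrooms quiet neighborhood", toy_documents, k=2)
print(top)
```

A real embedding model captures meaning rather than exact word overlap, so it can match "calm" with "serene" where this toy version cannot.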

### Testing semantic search functionality

In [9]:
similar_listings = get_similar_listings(
  query = "I want a house with 3 or more bedrooms inside a calm neighborhood"
)


print(similar_listings[0].page_content)

Neighborhood: Cedar Pines
Price: $525,000
Bedrooms: 3
Bathrooms: 2.5
House Size: 2,300 sqft

Description: Welcome to Cedar Pines where this charming 3-bedroom, 2.5-bathroom home awaits. Featuring an open floor plan, a cozy fireplace, and a sunroom that provides ample light, this home combines comfort and elegance. The spacious master bedroom includes an en-suite bath with a soaking tub and walk-in closet. The landscaped backyard offers enough space for customizable gardening or entertainment.

Neighborhood Description: Cedar Pines is a serene neighborhood residing amongst natural woodlands, offering myriad outdoor activities including hiking trails and bird watching. The community emphasizes sustainable living while providing access to top-tier schools and local craft shops.


In [10]:
listings_retriever = RunnableLambda(get_similar_listings)

print(listings_retriever.invoke("I want a house with 3 or more bedrooms inside a calm neighborhood"))

[Document(page_content='Neighborhood: Cedar Pines\nPrice: $525,000\nBedrooms: 3\nBathrooms: 2.5\nHouse Size: 2,300 sqft\n\nDescription: Welcome to Cedar Pines where this charming 3-bedroom, 2.5-bathroom home awaits. Featuring an open floor plan, a cozy fireplace, and a sunroom that provides ample light, this home combines comfort and elegance. The spacious master bedroom includes an en-suite bath with a soaking tub and walk-in closet. The landscaped backyard offers enough space for customizable gardening or entertainment.\n\nNeighborhood Description: Cedar Pines is a serene neighborhood residing amongst natural woodlands, offering myriad outdoor activities including hiking trails and bird watching. The community emphasizes sustainable living while providing access to top-tier schools and local craft shops.'), Document(page_content='Neighborhood: Riverwalk\nPrice: $695,000\nBedrooms: 3\nBathrooms: 3\nHouse Size: 2,200 sqft\n\nDescription: Discover this Riverwalk beauty with 3 bedrooms a

## Augmented Response Generation

### Interaction simulation

In [11]:
questions = [
                "How big do you want your house to be?",
                "What are 3 most important things for you in choosing this property?",
                "Which amenities would you like?",
                "Which transportation options are important to you?",
                "How urban do you want your neighborhood to be?",
            ]
answers1 = [
    "A comfortable three-bedroom house with a spacious kitchen and a cozy living room.",
    "A quiet neighborhood, good local schools, and convenient shopping options.",
    "A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.",
    "Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.",
    "A balance between suburban tranquility and access to urban amenities like restaurants and theaters."]


answers2 = [
    "A four-bedroom house with enough space for a game room and an office.",
    "Proximity to nature, a good local community, and access to recreational activities.",
    "A large backyard for the kids to play, a modern kitchen, and a fireplace.",
    "Easy access to hiking and biking trails, and a short drive to the city.",
    "I prefer a more rural or quiet suburban neighborhood with easy access to nature and outdoor activities."
]

### Create conversation history examples

In [12]:
from langchain_core.chat_history import BaseChatMessageHistory
class InMemoryHistory(BaseChatMessageHistory, BaseModel):
    """In memory implementation of chat message history."""

    messages: List[BaseMessage] = Field(default_factory=list)

    def add_message(self, message: BaseMessage) -> None:
        """Add a self-created message to the store"""
        self.messages.append(message)

    def clear(self) -> None:
        self.messages = []


store = {}


def get_session_history(user_id: str, conversation_id: str) -> BaseChatMessageHistory:
    if (user_id, conversation_id) not in store:
        store[(user_id, conversation_id)] = InMemoryHistory()
    return store[(user_id, conversation_id)]

history1 = get_session_history("1", "1")
history2 = get_session_history("2", "2")

for i in range(len(questions)):
  history1.add_ai_message(questions[i])
  history2.add_ai_message(questions[i])
  history1.add_user_message(answers1[i])
  history2.add_user_message(answers2[i])
print(store)

{('1', '1'): InMemoryHistory(messages=[AIMessage(content='How big do you want your house to be?'), HumanMessage(content='A comfortable three-bedroom house with a spacious kitchen and a cozy living room.'), AIMessage(content='What are 3 most important things for you in choosing this property?'), HumanMessage(content='A quiet neighborhood, good local schools, and convenient shopping options.'), AIMessage(content='Which amenities would you like?'), HumanMessage(content='A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.'), AIMessage(content='Which transportation options are important to you?'), HumanMessage(content='Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.'), AIMessage(content='How urban do you want your neighborhood to be?'), HumanMessage(content='A balance between suburban tranquility and access to urban amenities like restaurants and theaters.')]), ('2', '2'): InMemoryHistory(messages=[AIMessage(c

### Prompt template and Chain definition

This code sets up a conversational chain that provides personalized real estate recommendations. It combines a chat prompt template, a function that formats the retrieved documents, and a pipeline that handles the user's question and conversation history. The chain retrieves the listings most similar to the user's question, inserts them into the prompt as context, and asks the LLM for a recommendation.

In [13]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You're an assistant who's good at {ability}. Here is some {context}",
        ),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


context = itemgetter("question") | listings_retriever | format_docs
first_step = RunnablePassthrough.assign(context=context)
chain = first_step | prompt | chat_llm

with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history=get_session_history,
    input_messages_key="question",
    history_messages_key="history",
    history_factory_config=[
        ConfigurableFieldSpec(
            id="user_id",
            annotation=str,
            name="User ID",
            description="Unique identifier for the user.",
            default="",
            is_shared=True,
        ),
        ConfigurableFieldSpec(
            id="conversation_id",
            annotation=str,
            name="Conversation ID",
            description="Unique identifier for the conversation.",
            default="",
            is_shared=True,
        ),
    ],
)
sample_question = "I want a house with 3 or more bedrooms inside a calm neighborhood"
my_app_ability = "real estate recommendations"
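
The data flow of the chain can be sketched in plain Python, without LangChain or an API key. This is an illustration under stated assumptions, not the library's implementation: `fake_retriever` is a hypothetical stand-in for `listings_retriever` that returns fixed texts, `join_texts` mirrors `format_docs` on plain strings, and `build_messages` assembles the messages in the same order as the prompt template (system, history, human).

```python
def fake_retriever(question):
    """Hypothetical stand-in for listings_retriever; returns fixed texts."""
    return ["Neighborhood: A\nBedrooms: 3", "Neighborhood: B\nBedrooms: 4"]


def join_texts(texts):
    """Mirrors format_docs, but on plain strings instead of Document objects."""
    return "\n\n".join(texts)


def build_messages(inputs, history):
    """Assemble (role, content) pairs in the order the prompt template uses."""
    # Step 1: retrieve context for the question and format it as one string.
    context = join_texts(fake_retriever(inputs["question"]))
    # Step 2: fill the system message, then append history and the question.
    system = (
        f"You're an assistant who's good at {inputs['ability']}. "
        f"Here is some {context}"
    )
    return [("system", system), *history, ("human", inputs["question"])]


sketch_history = [
    ("ai", "How big do you want your house to be?"),
    ("human", "A comfortable three-bedroom house."),
]
sketch_messages = build_messages(
    {"ability": "real estate recommendations",
     "question": "What is your recommendation?"},
    sketch_history,
)
for role, content in sketch_messages:
    print(f"{role}: {content}")
```

In the actual chain, `RunnableWithMessageHistory` looks up the history by `user_id` and `conversation_id` and fills the `history` placeholder, so the caller only passes `ability` and `question`.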

### Test recommendation for user 1

In [14]:
response_from_chain = with_message_history.invoke(
        {"ability": my_app_ability,
         "question": "What is your real estate recommendation?"},
        config={
            "configurable": {"user_id": "1", "conversation_id": "1"}
        },
    )

print(response_from_chain.content)



Based on your preferences, here's a custom recommendation:

### Neighborhood: Greenhill Estates
**Price:** $635,000
**Bedrooms:** 3
**Bathrooms:** 2
**House Size:** 2,200 sqft

#### Description:
This delightful 3-bedroom, 2-bath home in Greenhill Estates features a spacious open-concept kitchen with granite countertops and energy-efficient stainless steel appliances. The cozy living room, complete with a gas fireplace, opens up to a beautifully landscaped backyard perfect for gardening. The home also includes a two-car garage and a modern, energy-efficient heating system to keep utility costs low.

#### Neighborhood Description:
Greenhill Estates is a peaceful neighborhood known for its excellent local schools and convenient shopping options. The community offers a suburban feel with lush green spaces, ideal for walking and biking. Despite its tranquility, it's just a short drive from a variety of dining and entertainment venues.

**Amenities:**
- Community playgrounds and parks
- Bike

### Test recommendation for user 2

In [15]:
response_from_chain = with_message_history.invoke(
        {"ability": my_app_ability,
         "question": "What is your real estate recommendation?"},
        config={
            "configurable": {"user_id": "2", "conversation_id": "2"}
        },
    )

print(response_from_chain.content)



Based on your preferences, I recommend the property in **Redstone Acres**. Here’s why:

### Property Recommendation
- **Neighborhood: Redstone Acres**
- **Price: $575,000**
- **Bedrooms: 4**
- **Bathrooms: 2.5**
- **House Size: 2,700 sqft**

#### Description
This charming 4-bedroom, 2.5-bath residence features:
- An **open floor plan** for spacious living.
- A **newly renovated kitchen**—ideal for family gatherings and modern convenience.
- **High ceilings** that provide an airy and roomy environment.
- A **wood-paneled study** perfect for a home office.
- A **sun-drenched living room** that opens up to a large fenced backyard—great for kids to play and ideal for pets.
- The potential for a **game room** given the ample living space available.
- A **fireplace** that adds a cozy, rustic charm to the home.

### Neighborhood Perks
- **Countryside tranquility** blended with modern amenities such as PGA-tier golf courses community pools, tennis courts, and equestrian facilities.
- Proximity