
# Workspace setup

## Install required libraries

In [5]:
!pip install openai
!pip install langchain
!pip install chromadb
!pip install tiktoken

Collecting openai
  Downloading openai-1.30.1-py3-none-any.whl (320 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Downloading httpx-0.27.0-py3-none-any.whl (75 kB)
Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai)
  Downloading httpcore-1.0.5-py3-none-any.whl (77 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.23.0->openai)
  Downloading h11-0.14.0-

## Load Libraries

In [6]:
import os
import ast
import pandas as pd

from langchain.document_loaders.csv_loader import CSVLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain_community.chat_models import ChatOpenAI
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain import LLMChain
from langchain.chains.question_answering import load_qa_chain

from openai import OpenAI

# Final Project

In [7]:
from google.colab import userdata

api_key = userdata.get('OPENAI_API_KEY')

In [8]:
#api_key  = os.environ.get("OPENAI_API_KEY")
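The two cells above read the key either from Colab secrets or from an environment variable. A minimal sketch of combining both follows; `resolve_api_key` is a hypothetical helper, not part of the original notebook, and it assumes that a missing Colab secret surfaces as an exception from `userdata.get`.

```python
import os


def resolve_api_key(name="OPENAI_API_KEY"):
    """Return the secret called `name`, preferring Colab secrets when present."""
    try:
        # google.colab can only be imported inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        userdata = None

    if userdata is not None:
        try:
            value = userdata.get(name)
            if value:
                return value
        except Exception:
            # Secret missing or notebook access not granted: fall through.
            pass

    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set in Colab secrets or the environment")
    return value
```

With such a helper the notebook could run unchanged on Colab and on a local machine, as long as one of the two sources holds the key.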

In [9]:
open_ai = OpenAI(
    api_key=api_key
)

In [10]:
file_path = "Listings.csv"

## 1 Synthetic Data Generation

In [11]:
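def get_completition(client, prompt, model="gpt-3.5-turbo"):
  """Send a single-turn prompt to the chat model and return the reply text."""
  messages = [{"role": "user", "content": prompt}]
  response = client.chat.completions.create(
      model=model,
      messages=messages,
      temperature=0)  # temperature 0 keeps the output as repeatable as possible
  return response.choices[0].message.content

def convert_to_dataframe(text):
  """Parse a Python-literal list of dicts (the requested output format) into a DataFrame."""
  # literal_eval only evaluates literals, so it is safer than eval on model output
  text_dict = ast.literal_eval(text)
  df = pd.DataFrame(text_dict)
  return df

def save_as_csv(df,file_path):
  """Write the DataFrame to CSV; the index is written as an unnamed first column."""
  df.to_csv(file_path)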
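`convert_to_dataframe` assumes the reply is nothing but a list literal. A hedged sketch of a more defensive parser is shown below; `extract_listings` is a hypothetical helper that tolerates prose around the list and fails with a clear error when the reply is cut off.

```python
import ast


def extract_listings(text):
    """Parse the outermost [...] span in `text` into a list of dicts."""
    start = text.find("[")
    end = text.rfind("]")
    if start == -1 or end <= start:
        raise ValueError("no list literal found in model output")
    # literal_eval accepts the single-quoted, trailing-comma style used in the prompt.
    records = ast.literal_eval(text[start:end + 1])
    if not isinstance(records, list) or not all(isinstance(r, dict) for r in records):
        raise ValueError("expected a list of dictionaries")
    return records


reply = "Here are the listings:\n[{'Building name': 'Poblado Suites', 'Price in COP': '3,500,000',},]"
print(extract_listings(reply))
```

The returned list can be passed straight to `pd.DataFrame`, exactly as the notebook does with the raw `literal_eval` result.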

In [12]:
instruction = """
Create a list of apartments with descriptions for rent.

The list must meet the following criteria:
  - Should have at least 10 properties.
  - Each property should have the following description:
    - Building name
    - Number of bedrooms
    - Number of bathrooms
    - Property size in meters
    - Neigborhood name
    - Has parking
    - Has an elevator
    - Has garbage shut
    - Has a pool
    - Is pet friendly
    - Is near market places
    - Is near schools or universities
    - Is near hospitals
    - Is near public transport
    - Description
    - Price in COP
  - Near means a distance of around 1000 meters to the apartment
"""

output_format = """
[
  {
    'Building name': '<answer_here>',
    'Number of bedrooms':'<answer_here>',
    'Number of bathrooms':'<answer_here>',
    'Property size in meters':'<answer_here>',
    'Neigborhood name':'<answer_here>',
    'Has parking':'<answer_here>',
    'Has an elevator':'<answer_here>',
    'Has garbage shut':'<answer_here>',
    'Has a pool':'<answer_here>',
    'Is pet friendly': '<answer_here>',
    'Is near market places': '<answer_here>',
    'Is near schools or universities': '<answer_here>',
    'Is near hospitals': '<answer_here>',
    'Is near public transport': '<answer_here>',
    'Description':'<answer_here>',
    'Price in COP':'<answer_here>',
    },
]
"""

prompt = f"""
Act as a real estate agent from the city of Medellin in Colombia.

{instruction}

You should fill the <answer_here> parts of the following output format:

{output_format}

"""

In [13]:
response = get_completition(open_ai, prompt, model="gpt-3.5-turbo")
print(response)

[
  {
    'Building name': 'Poblado Suites',
    'Number of bedrooms': '3',
    'Number of bathrooms': '2',
    'Property size in meters': '120',
    'Neigborhood name': 'El Poblado',
    'Has parking': 'Yes',
    'Has an elevator': 'Yes',
    'Has garbage shut': 'Yes',
    'Has a pool': 'Yes',
    'Is pet friendly': 'No',
    'Is near market places': 'Yes',
    'Is near schools or universities': 'Yes',
    'Is near hospitals': 'Yes',
    'Is near public transport': 'Yes',
    'Description': 'Modern apartment with great amenities and beautiful views of the city.',
    'Price in COP': '3,500,000',
  },
  {
    'Building name': 'Laureles Towers',
    'Number of bedrooms': '2',
    'Number of bathrooms': '1',
    'Property size in meters': '80',
    'Neigborhood name': 'Laureles',
    'Has parking': 'Yes',
    'Has an elevator': 'Yes',
    'Has garbage shut': 'Yes',
    'Has a pool': 'No',
    'Is pet friendly': 'Yes',
    'Is near market places': 'Yes',
    'Is near schools or universiti

In [14]:
df = convert_to_dataframe(response)
df.head()

Unnamed: 0,Building name,Number of bedrooms,Number of bathrooms,Property size in meters,Neigborhood name,Has parking,Has an elevator,Has garbage shut,Has a pool,Is pet friendly,Is near market places,Is near schools or universities,Is near hospitals,Is near public transport,Description,Price in COP
0,Poblado Suites,3,2,120,El Poblado,Yes,Yes,Yes,Yes,No,Yes,Yes,Yes,Yes,Modern apartment with great amenities and beau...,3500000
1,Laureles Towers,2,1,80,Laureles,Yes,Yes,Yes,No,Yes,Yes,Yes,Yes,Yes,Cozy apartment in a quiet neighborhood with ea...,2000000
2,Envigado Gardens,4,3,150,Envigado,Yes,No,Yes,Yes,Yes,Yes,Yes,Yes,Yes,Spacious family-friendly apartment with a beau...,4800000
3,Belén Heights,1,1,60,Belén,No,Yes,Yes,No,No,Yes,Yes,Yes,Yes,Cozy apartment with modern finishes and close ...,1500000
4,Robledo Residences,3,2,100,Robledo,Yes,No,Yes,No,Yes,Yes,Yes,Yes,Yes,Comfortable apartment in a residential area wi...,2300000
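The model returns every value as a string, including counts, prices such as `'3,500,000'`, and the Yes/No flags. The sketch below shows one possible way to convert a record before further processing; `normalize_listing` is a hypothetical helper and assumes commas are thousands separators and that numeric fields hold whole numbers.

```python
def normalize_listing(record):
    """Return a copy of `record` with numeric and Yes/No strings converted."""
    numeric_fields = {
        "Number of bedrooms",
        "Number of bathrooms",
        "Property size in meters",
        "Price in COP",
    }
    cleaned = {}
    for key, value in record.items():
        text = str(value).strip()
        if key in numeric_fields:
            # Drop thousands separators such as "3,500,000".
            cleaned[key] = int(text.replace(",", ""))
        elif text in ("Yes", "No"):
            cleaned[key] = text == "Yes"
        else:
            cleaned[key] = text
    return cleaned


sample = {"Building name": "Poblado Suites", "Has a pool": "Yes", "Price in COP": "3,500,000"}
print(normalize_listing(sample))
```

Typed values would allow exact filters (for example, a minimum number of bathrooms) alongside the semantic search used later.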


In [15]:
save_as_csv(df,file_path)

## 2 Semantic Search

In [16]:
loader = CSVLoader(file_path=file_path)
docs = loader.load()
print(docs)

[Document(page_content=': 0\nBuilding name: Poblado Suites\nNumber of bedrooms: 3\nNumber of bathrooms: 2\nProperty size in meters: 120\nNeigborhood name: El Poblado\nHas parking: Yes\nHas an elevator: Yes\nHas garbage shut: Yes\nHas a pool: Yes\nIs pet friendly: No\nIs near market places: Yes\nIs near schools or universities: Yes\nIs near hospitals: Yes\nIs near public transport: Yes\nDescription: Modern apartment with great amenities and beautiful views of the city.\nPrice in COP: 3,500,000', metadata={'source': 'Listings.csv', 'row': 0}), Document(page_content=': 1\nBuilding name: Laureles Towers\nNumber of bedrooms: 2\nNumber of bathrooms: 1\nProperty size in meters: 80\nNeigborhood name: Laureles\nHas parking: Yes\nHas an elevator: Yes\nHas garbage shut: Yes\nHas a pool: No\nIs pet friendly: Yes\nIs near market places: Yes\nIs near schools or universities: Yes\nIs near hospitals: Yes\nIs near public transport: Yes\nDescription: Cozy apartment in a quiet neighborhood with easy acce
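Each `page_content` above starts with `: 0`, `: 1`, and so on. That is consistent with the CSV having been saved with its index, which leaves an empty header cell in the first column. The sketch below reproduces the effect with the standard `csv` module; `rows_to_text` is only a rough stand-in for how the loader appears to format rows, not LangChain's actual code.

```python
import csv
import io

# A CSV written with its index has an empty first header cell.
csv_text = ',Building name,Price in COP\n0,Poblado Suites,"3,500,000"\n'


def rows_to_text(raw, skip_blank_headers=False):
    """Format each CSV row as 'column: value' lines."""
    pages = []
    for row in csv.DictReader(io.StringIO(raw)):
        lines = [
            f"{key.strip()}: {value.strip()}"
            for key, value in row.items()
            if not (skip_blank_headers and not key.strip())
        ]
        pages.append("\n".join(lines))
    return pages


print(rows_to_text(csv_text)[0])
print(rows_to_text(csv_text, skip_blank_headers=True)[0])
```

Writing the file with `df.to_csv(file_path, index=False)` would avoid the unnamed column at the source, at the cost of changing the stored documents shown in this notebook.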

In [17]:
splitter = CharacterTextSplitter(chunk_size = 1000, chunk_overlap=0)
split_docs = splitter.split_documents(docs)
embeddings = OpenAIEmbeddings(openai_api_key=api_key)
db = Chroma.from_documents(split_docs, embeddings)
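The vector store ranks listings by how close their embeddings are to the query embedding. The toy example below illustrates the idea with cosine similarity and made-up three-dimensional vectors; the metric Chroma actually applies depends on how the collection is configured, so treat this as a conceptual sketch only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up "embeddings"; real OpenAI embeddings have far more dimensions.
toy_vectors = {
    "Poblado Suites": [0.9, 0.1, 0.3],
    "Laureles Towers": [0.2, 0.8, 0.1],
}
query_vector = [0.8, 0.2, 0.4]

ranked = sorted(
    toy_vectors,
    key=lambda name: cosine_similarity(query_vector, toy_vectors[name]),
    reverse=True,
)
print(ranked)  # most similar listing first
```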


In [18]:
def search_properties(db,
                      building_name=None,
                      num_bedrooms=None,
                      num_bathrooms=None,
                      property_size=None,
                      neighborhood_name=None,
                      has_parking=None,
                      has_elevator=None,
                      has_garbage_shut=None,
                      has_pool=None,
                      is_pet_friendly=None,
                      is_near_marketplaces=None,
                      is_near_schools_or_universities=None,
                      is_near_hospitals=None,
                      is_near_public_transport=None,
                      price=None):
    """Build a text query from the given criteria and return the most similar listings."""
    # Construct the query string based on the provided parameters
    query_parts = []
    if building_name:
        query_parts.append(f"Building name: {building_name}")
    if num_bedrooms is not None:
        query_parts.append(f"Number of bedrooms: {num_bedrooms}")
    if num_bathrooms is not None:
        query_parts.append(f"Number of bathrooms: {num_bathrooms}")
    if property_size is not None:
        query_parts.append(f"Property size: {property_size}m²")
    if neighborhood_name:
        query_parts.append(f"Neighborhood: {neighborhood_name}")
    if has_parking is not None:
        query_parts.append(f"Has parking: {'Yes' if has_parking else 'No'}")
    if has_elevator is not None:
        query_parts.append(f"Has elevator: {'Yes' if has_elevator else 'No'}")
    if has_garbage_shut is not None:
        query_parts.append(f"Has garbage shut: {'Yes' if has_garbage_shut else 'No'}")
    if has_pool is not None:
        query_parts.append(f"Has pool: {'Yes' if has_pool else 'No'}")
    if is_pet_friendly is not None:
        query_parts.append(f"Is pet friendly: {'Yes' if is_pet_friendly else 'No'}")
    if is_near_marketplaces is not None:
        query_parts.append(f"Is near market places: {'Yes' if is_near_marketplaces else 'No'}")
    if is_near_schools_or_universities is not None:
        query_parts.append(f"Is near schools or universities: {'Yes' if is_near_schools_or_universities else 'No'}")
    if is_near_hospitals is not None:
        query_parts.append(f"Is near hospitals: {'Yes' if is_near_hospitals else 'No'}")
    if is_near_public_transport is not None:
        query_parts.append(f"Is near public transport: {'Yes' if is_near_public_transport else 'No'}")
    if price is not None:
        query_parts.append(f"Price: {price} COP")

    query = ". ".join(query_parts)

    # Perform the search
    results = db.similarity_search(query)

    return results

# Example usage
query_results = search_properties(
    db,
    building_name="Sunset Plaza",
    num_bedrooms=3,
    num_bathrooms=2,
    property_size=120,
    neighborhood_name="El Poblado",
    has_parking=True,
    has_elevator=True,
    has_garbage_shut=False,
    has_pool=True,
    is_pet_friendly=True,
    is_near_marketplaces=True,
    is_near_schools_or_universities=True,
    is_near_hospitals=True,
    is_near_public_transport=True,
    price=1500000
)

# Print the query results
for i, result in enumerate(query_results):
    print(f"Result {i+1}:")
    print(result)
    print("\n")

Result 1:
page_content=': 0\nBuilding name: Poblado Suites\nNumber of bedrooms: 3\nNumber of bathrooms: 2\nProperty size in meters: 120\nNeigborhood name: El Poblado\nHas parking: Yes\nHas an elevator: Yes\nHas garbage shut: Yes\nHas a pool: Yes\nIs pet friendly: No\nIs near market places: Yes\nIs near schools or universities: Yes\nIs near hospitals: Yes\nIs near public transport: Yes\nDescription: Modern apartment with great amenities and beautiful views of the city.\nPrice in COP: 3,500,000' metadata={'row': 0, 'source': 'Listings.csv'}


Result 2:
page_content=': 6\nBuilding name: Castilla Gardens\nNumber of bedrooms: 3\nNumber of bathrooms: 2\nProperty size in meters: 110\nNeigborhood name: Castilla\nHas parking: Yes\nHas an elevator: Yes\nHas garbage shut: Yes\nHas a pool: Yes\nIs pet friendly: Yes\nIs near market places: Yes\nIs near schools or universities: Yes\nIs near hospitals: Yes\nIs near public transport: Yes\nDescription: Modern apartment with a spacious living area and a
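The query built by `search_properties` uses labels such as `Property size` and `Has pool`, while the stored documents use `Property size in meters` and `Has a pool`. A table-driven variant that reuses the CSV column labels might look like the sketch below; `FIELD_LABELS` and `build_query` are hypothetical names, and only a subset of the fields is shown.

```python
# Hypothetical mapping from keyword arguments to the column labels in Listings.csv.
FIELD_LABELS = {
    "building_name": "Building name",
    "num_bedrooms": "Number of bedrooms",
    "num_bathrooms": "Number of bathrooms",
    "property_size": "Property size in meters",
    "neighborhood_name": "Neigborhood name",  # spelled as in the generated CSV
    "has_parking": "Has parking",
    "has_pool": "Has a pool",
    "price": "Price in COP",
}


def build_query(**criteria):
    """Render the given criteria in the same 'label: value' form as the documents."""
    parts = []
    for name, label in FIELD_LABELS.items():
        value = criteria.get(name)
        if value is None:
            continue
        if isinstance(value, bool):
            value = "Yes" if value else "No"
        parts.append(f"{label}: {value}")
    return "\n".join(parts)


print(build_query(num_bedrooms=3, has_pool=True, price=1500000))
```

Whether matching the labels improves retrieval for this dataset is untested here; the resulting string could be passed to `db.similarity_search` in place of the hand-built query to compare.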

## 3 Augmented Response Generation

### Basic Question and Answer

In [19]:
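def generate_suggestions(db, llm, query):
  """Answer `query` with a RetrievalQA chain over the two most similar listings."""
  retriever = db.as_retriever(search_type="similarity", search_kwargs={"k":2})
  rag = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever)
  # invoke() replaces the deprecated direct call rag({...})
  result = rag.invoke({"query": query})
  return result["result"]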
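With `chain_type="stuff"`, all retrieved documents are placed into a single prompt together with the question. The sketch below shows that idea in plain Python; `stuff_prompt` and its wording are illustrative assumptions, not LangChain's actual template.

```python
def stuff_prompt(question, documents):
    """Concatenate all retrieved documents into one prompt (the "stuff" strategy)."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


retrieved = [
    "Building name: San Javier Residences\nNumber of bedrooms: 4",
    "Building name: Laureles Towers\nNumber of bedrooms: 2",
]
print(stuff_prompt("Which apartment suits a family of 4?", retrieved))
```

Because the retriever is configured with `k=2`, the model only ever sees two listings per question, so its suggestions are limited to those two.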

In [20]:
model_name = "gpt-3.5-turbo"
llm = ChatOpenAI(api_key=api_key, model_name = model_name, temperature=0, max_tokens=2000)


In [21]:
query = "Based on the available apartments suggest a couple of apartments for a family of 4 members near a school"
print(generate_suggestions(db, llm, query))


Based on the available information, the apartment in San Javier Residences would be a good option for a family of 4 members near a school. It has 4 bedrooms, 3 bathrooms, is pet-friendly, and is near schools and universities. Another option could be the Laureles Towers apartment, which has 2 bedrooms, 1 bathroom, is pet-friendly, and is also near schools and universities. Both apartments are near public transport, making them convenient choices for a family with school-going children.


In [22]:
query = "Based on the available apartments suggest a couple of apartments for a family of 4 members near a school, should have at least 2 bathrooms"
print(generate_suggestions(db, llm, query))

Based on the information provided, I would recommend the following apartments for a family of 4 members near a school and with at least 2 bathrooms:

1. San Javier Residences:
   - Number of bedrooms: 4
   - Number of bathrooms: 3
   - Property size: 140 sqm
   - Neigborhood: San Javier
   - Has parking: Yes
   - Has an elevator: Yes
   - Has garbage shut: Yes
   - Has a pool: No
   - Is pet friendly: Yes
   - Is near market places: Yes
   - Is near schools or universities: Yes
   - Is near hospitals: Yes
   - Is near public transport: Yes
   - Description: Spacious apartment with modern amenities and easy access to public transport.
   - Price: 4,200,000 COP

2. Robledo Residences:
   - Number of bedrooms: 3
   - Number of bathrooms: 2
   - Property size: 100 sqm
   - Neigborhood: Robledo
   - Has parking: Yes
   - Has an elevator: No
   - Has garbage shut: Yes
   - Has a pool: No
   - Is pet friendly: Yes
   - Is near market places: Yes
   - Is near schools or universities: Yes
   - 

### Customization

In [23]:
questions = [
    "Do you have separate rooms for your children?",
    "Do you have extra space for an office? I am working from home",
    "Do you have an apartment with a panoramic view of the city?",
    "We don't have our own transportation, Do you have any apartment near a public transportation system."
]

answers = [
    "We have a 3-room apartment with extra space for a desk.",
    "There is a beautiful apartment with a dedicated studio that could be used as an office.",
    "We have a couple of apartments on floor 21 and 23 where the view is awesome.",
    "We have an apartment that is 100 meters from the metro station.",
]

questions_and_answers_examples = ""

for question, answer in zip(questions, answers):
    questions_and_answers_examples += f"Question: {question}\nAnswer: {answer}\n\n"

# Remove the trailing new line characters for a clean output
questions_and_answers_examples = questions_and_answers_examples.strip()

# Print the result (for verification, if needed)
print(questions_and_answers_examples)

Question: Do you have separate rooms for your children?
Answer: We have a 3-room apartment with extra space for a desk.

Question: Do you have extra space for an office? I am working from home
Answer: There is a beautiful apartment with a dedicated studio that could be used as an office.

Question: Do you have an apartment with a panoramic view of the city?
Answer: We have a couple of apartments on floor 21 and 23 where the view is awesome.

Question: We don't have our own transportation, Do you have any apartment near a public transportation system.
Answer: We have an apartment that is 100 meters from the metro station.


In [27]:
context = "{context}"
query = "{query}"

prompt_template = f"""
Act as a real estate agent from the city of Medellin in Colombia where you have the following context:

{context}

Answer the following question:

{query}

Also, provide a custom answer for customers who are searching for a good apartment to rent. These are some examples of the expected questions and answers:

{questions_and_answers_examples}
"""

print(prompt_template)


Act as a real estate agent from the city of Medellin in Colombia where you have the following context:

{context}

Answer the following question:

{query}

Also, provide a custom answer for customers who are searching for a good apartment to rent. These are some examples of the expected questions and answers:

Question: Do you have separate rooms for your children?
Answer: We have a 3-room apartment with extra space for a desk.

Question: Do you have extra space for an office? I am working from home
Answer: There is a beautiful apartment with a dedicated studio that could be used as an office.

Question: Do you have an apartment with a panoramic view of the city?
Answer: We have a couple of apartments on floor 21 and 23 where the view is awesome.

Question: We don't have our own transportation, Do you have any apartment near a public transportation system.
Answer: We have an apartment that is 100 meters from the metro station.
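The template is filled in two stages: the f-string inserts the few-shot examples immediately and re-emits `{context}` and `{query}` as literal placeholders for later. The sketch below reproduces this with `str.format`, assuming the prompt template uses the same brace syntax; `escape_braces` is a hypothetical helper for examples that themselves contain braces.

```python
context = "{context}"
query = "{query}"
examples = "Question: Is there a pool?\nAnswer: Yes, on the roof."

# Stage 1: the f-string fills `examples` now and re-emits the two placeholders.
template = f"Context:\n{context}\n\nQuestion:\n{query}\n\nExamples:\n{examples}"
assert "{context}" in template and "{query}" in template

# Stage 2: a later format call fills the remaining placeholders.
filled = template.format(context="Laureles Towers, 2 bedrooms", query="Anything near a school?")
print(filled)


def escape_braces(text):
    """Double literal braces so a later str.format call leaves them alone."""
    return text.replace("{", "{{").replace("}", "}}")


risky = "Answer: prices look like {amount} COP"
safe_template = f"{query}\n{escape_braces(risky)}"
print(safe_template.format(query="How much?"))
```

Without the escaping step, a brace inside an example would be treated as an extra template variable in stage 2 and the formatting call would fail.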



In [28]:
query = "We're searching for an apartment for our two kids near school"

In [29]:
similar_docs = db.similarity_search(query,k=5)
prompt = PromptTemplate(
    template=prompt_template,
    input_variables=["context","query"],
)
chain = load_qa_chain(llm,prompt=prompt,chain_type="stuff")
print(chain.run(input_documents=similar_docs, query=query))

For your specific needs of finding an apartment near a school for your two kids, I would recommend considering the Laureles Towers in the Laureles neighborhood. This cozy apartment is pet-friendly, near schools and universities, and has easy access to public transportation. It has 2 bedrooms, 1 bathroom, and comes with parking. The property size is 80 square meters and the price is 2,000,000 COP.

For customers searching for a good apartment to rent, we have a variety of options available to suit your needs. Whether you're looking for extra space for your children, a dedicated office area, a panoramic view of the city, or easy access to public transportation, we can help you find the perfect apartment. Just let us know your preferences and we'll find the ideal property for you.
