## Step 1: Setting Up the Python Application

In [1]:
import sys
sys.path.append("../")

In [12]:
import os
import asyncio
import json
from scripts.home_matcher import HomeMatcher
from scripts.db_semantic_searcher import ListingSearcher
from scripts.listing_personalizer import ListingPersonalizer
from scripts.listings_creator_langchain import ListingsGenerator
from scripts.resources.consts import BUYER_PREFERENCES_STR
from scripts.models import ListingConverter

In [3]:
# Jupyter already runs an event loop; nest_asyncio lets asyncio.run() be called inside it.
import nest_asyncio
nest_asyncio.apply()

In [4]:
# Replace the placeholder with your own key before running this cell.
os.environ['OPENAI_API_KEY'] = '<YOUR_OPENAI_KEY>'
api_key = os.getenv('OPENAI_API_KEY')
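
Writing the key directly into a cell makes it easy to commit it by accident. A minimal sketch of an alternative, assuming the key is either already exported in the shell or typed in interactively. `resolve_api_key` is a hypothetical helper for illustration, not part of the project's `scripts` package.

```python
import os
from getpass import getpass


def resolve_api_key(env_var: str = "OPENAI_API_KEY", prompt: bool = True) -> str:
    """Return the key from the environment, prompting only if it is missing.

    Hypothetical helper: the name and behavior are assumptions for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key and prompt:
        # getpass hides the typed value so it does not show up in the cell output.
        key = getpass(f"Enter {env_var}: ").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    # Store the cleaned value so libraries that read the env var can see it.
    os.environ[env_var] = key
    return key
```

With this in place, the cell above could become `api_key = resolve_api_key()`, and the placeholder string never needs to be edited in the notebook.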

## Steps 2 & 3: Generating Real Estate Listings and Storing Them in a Vector Database


In [5]:
listings_generator = ListingsGenerator()
asyncio.run(listings_generator.generate_listings())

Generating listings...
Saving listings to resources/listings.json
Listings are stored in ChromaDB path: resources/listings.db
Count of items in db: 10
Listings stored in ChromaDB


In [6]:
listings_generator.load_listings()

{'property_1': {'Neighborhood': 'Green Oaks',
  'Price': '$800,000',
  'Bedrooms': 3,
  'Bathrooms': 2,
  'House Size': '2,000 sqft',
  'Description': 'Welcome to this eco-friendly oasis nestled in the heart of Green Oaks. This charming 3-bedroom, 2-bathroom home boasts energy-efficient features such as solar panels and a well-insulated structure. Natural light floods the living spaces, highlighting the beautiful hardwood floors and eco-conscious finishes. The open-concept kitchen and dining area lead to a spacious backyard with a vegetable garden, perfect for the eco-conscious family. Embrace sustainable living without compromising on style in this Green Oaks gem. Neighborhood Description: Green Oaks is a close-knit, environmentally-conscious community with access to organic grocery stores, community gardens, and bike paths. Take a stroll through the nearby Green Oaks Park or grab a cup of coffee at the cozy Green Bean Cafe. With easy access to public transportation and bike lanes, co

## Steps 4, 5 & 6: Building the User Preference Interface, Searching Based on Preferences, and Personalizing Listing Descriptions



In [16]:
# Step 4 - The buyer preferences are hardcoded in scripts.resources.consts:
from scripts.resources.consts import BUYER_PREFERENCES_STR, \
    BUYER_PERSONALIZATION_PROMPT, BUYER_PERSONALIZATION_SYSTEM_PROMPT, BUYER_PERSONALIZATION_FEW_SHOT_EXAMPLES

#Step 5 - Search based on preferences:
listing_searcher = ListingSearcher()
listings = listing_searcher.search_listings(BUYER_PREFERENCES_STR)
print(f"\nThe {len(listings)} top most similar listings are:\n{listings}")

Searching for similar listing
Count of items in db: 10

The 5 top most similar listings are:
[Document(page_content="neighborhood:Riverside\nprice:$900,000\nbedrooms:4.0\nbathrooms:3.0\nhouse size:2,800 sqft\ndescription:Stunning 4-bedroom, 3-bathroom home in the desirable Riverside neighborhood. This home features a bright and open floor plan with a gourmet kitchen, a spacious family room with a fireplace, and a formal dining area. The master suite includes a walk-in closet and a spa-like bathroom. The backyard is an entertainer's dream with a covered patio and built-in BBQ. Neighborhood Description: Riverside is a vibrant community known for its excellent schools, parks, and recreational facilities. Enjoy a day at Riverside Park or explore the local shops and restaurants. With convenient access to major highways and public transportation, commuting is easy."), Document(page_content='neighborhood:Brookfield\nprice:$650,000\nbedrooms:3.0\nbathrooms:2.0\nhouse size:2,100 sqft\ndescripti
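
The search above returns the listings whose embeddings are closest to the embedding of the buyer preferences. The internals of `ListingSearcher` and the distance metric configured in ChromaDB are not shown in this notebook, so the following is only a sketch of the general idea, using cosine similarity over toy vectors. `cosine_similarity` and `top_k` are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=5):
    """Return the ids of the k documents most similar to the query, best first.

    doc_vecs maps a document id to its embedding vector (toy data in this sketch).
    """
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]
```

A real vector store uses an approximate index rather than scoring every document, but the ranking principle is the same.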

In [22]:
# Extract only the original descriptions from the search results:
houselistings_formatted_listings = [ListingConverter().convert_text_to_houselisting(listing) for listing in listings]
original_descriptions = [listing.description for listing in houselistings_formatted_listings]
original_descriptions

["Stunning 4-bedroom, 3-bathroom home in the desirable Riverside neighborhood. This home features a bright and open floor plan with a gourmet kitchen, a spacious family room with a fireplace, and a formal dining area. The master suite includes a walk-in closet and a spa-like bathroom. The backyard is an entertainer's dream with a covered patio and built-in BBQ. Neighborhood Description: Riverside is a vibrant community known for its excellent schools, parks, and recreational facilities. Enjoy a day at Riverside Park or explore the local shops and restaurants. With convenient access to major highways and public transportation, commuting is easy.",
 'Beautiful 3-bedroom, 2-bathroom home in the charming Brookfield neighborhood. This home features a spacious living room with a fireplace, a modern kitchen with stainless steel appliances, and a master suite with a private bath. The backyard is perfect for entertaining with a large patio and mature trees. Neighborhood Description: Brookfield 
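
Each search result stores its fields as `key:value` lines inside `page_content`, as the output of the search cell shows. The implementation of `ListingConverter.convert_text_to_houselisting` is not shown here, so the following is a sketch of how such text could be turned back into fields, not the project's actual code. `parse_listing_text` is a hypothetical name, and the sketch assumes every field sits on a single line.

```python
def parse_listing_text(page_content: str) -> dict:
    """Split 'key:value' lines into a dict, normalizing keys like 'house size' to 'house_size'."""
    fields = {}
    for line in page_content.split("\n"):
        # Split on the first colon only, so colons inside the description are preserved.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that contain no colon at all
        fields[key.strip().replace(" ", "_")] = value.strip()
    return fields
```

All values come back as strings (for example `"4.0"` for bedrooms); converting them to numbers would be a separate step.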

In [20]:
#Step 6 - Creating personalized listing description
listing_personalizer = ListingPersonalizer()
personalized_listings = asyncio.run(listing_personalizer.personalize_listings(BUYER_PREFERENCES_STR, listings))
personalized_listings_json = [listing.dict() for listing in personalized_listings]


Creating personalized listings...
Max retries reached, proceeding without augmented description
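
The "Max retries reached" message suggests that the personalizer retries a failed LLM call a few times and then falls back to the original description. The actual logic inside `ListingPersonalizer` is not shown in this notebook; the pattern might look roughly like the sketch below, where `with_retries` and its parameters are hypothetical.

```python
import asyncio


async def with_retries(make_call, fallback, max_retries=3, delay_seconds=0.0):
    """Await make_call() up to max_retries times; return fallback if every attempt fails.

    make_call is a zero-argument function returning a coroutine (an assumption of this sketch).
    """
    for attempt in range(1, max_retries + 1):
        try:
            return await make_call()
        except Exception as exc:
            print(f"Attempt {attempt} failed: {exc}")
            # Pause before the next attempt (also runs after the final failure).
            await asyncio.sleep(delay_seconds)
    print("Max retries reached, proceeding without augmented description")
    return fallback
```

Returning a fallback instead of raising keeps one bad response from aborting the whole batch, at the cost of silently passing through an unpersonalized listing.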


In [29]:
augmented_descriptions = [listing["augmented_description"] for listing in personalized_listings_json]
augmented_descriptions

["Stunning 4-bedroom, 3-bathroom home in the desirable Riverside neighborhood. This spacious home features a bright and open floor plan with a gourmet kitchen, a spacious family room with a fireplace, and a formal dining area, perfect for family gatherings. The master suite includes a walk-in closet and a spa-like bathroom, providing a luxurious retreat. The backyard is an entertainer's dream with a covered patio and built-in BBQ, ideal for hosting friends and family. Riverside is a vibrant community known for its excellent schools and parks, making it an ideal location for families. Enjoy a day at Riverside Park or explore the local shops and restaurants. With convenient access to major highways and public transportation, commuting is easy.",
 'Beautiful 3-bedroom, 2-bathroom home in the charming Brookfield neighborhood. This home features a spacious living room with a fireplace, a modern kitchen with stainless steel appliances, and a master suite with a private bath. The backyard is 

In [35]:
from difflib import SequenceMatcher
from IPython.display import display, HTML
import re

#Creating a comparison:
diff_list = list(zip(original_descriptions, augmented_descriptions))


# This helper was created with GenAI. Red marks text from the original that was removed in the augmented version; green marks text added in the augmented version.
def word_by_word_diff(before, after):
    before_words = re.findall(r'\S+|\s+', before)
    after_words = re.findall(r'\S+|\s+', after)

    matcher = SequenceMatcher(None, before_words, after_words)
    diff_html = ''
    for opcode, i1, i2, j1, j2 in matcher.get_opcodes():
        if opcode == 'equal':
            diff_html += ''.join(before_words[i1:i2])
        elif opcode == 'insert':
            diff_html += '<span style="background-color: #a6f3a6;">' + ''.join(after_words[j1:j2]) + '</span>'
        elif opcode == 'delete':
            diff_html += '<span style="background-color: #f3a6a6;">' + ''.join(before_words[i1:i2]) + '</span>'
        elif opcode == 'replace':
            diff_html += '<span style="background-color: #f3a6a6;">' + ''.join(before_words[i1:i2]) + '</span>'
            diff_html += '<span style="background-color: #a6f3a6;">' + ''.join(after_words[j1:j2]) + '</span>'
    return diff_html

def display_diffs(before_after_list):
    for before, after in before_after_list:
        diff_html = word_by_word_diff(before, after)
        display(HTML('<pre style="white-space: pre-wrap;">{}</pre>'.format(diff_html)))


display_diffs(diff_list)
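
The highlighted diff is useful for reading, but a numeric summary makes it easier to see how much each description actually changed. A sketch using the same `SequenceMatcher` approach, counting words only and ignoring whitespace. `word_change_summary` is a hypothetical helper, not part of the project.

```python
import re
from difflib import SequenceMatcher


def word_change_summary(before: str, after: str) -> dict:
    """Count words added and removed between two texts, plus a 0..1 similarity ratio."""
    before_words = re.findall(r"\S+", before)
    after_words = re.findall(r"\S+", after)
    # autojunk=False disables the popularity heuristic, which can skew long inputs.
    matcher = SequenceMatcher(None, before_words, after_words, autojunk=False)
    added = removed = 0
    for opcode, i1, i2, j1, j2 in matcher.get_opcodes():
        if opcode in ("delete", "replace"):
            removed += i2 - i1
        if opcode in ("insert", "replace"):
            added += j2 - j1
    return {"added": added, "removed": removed, "similarity": round(matcher.ratio(), 3)}
```

It can be applied to the pairs built above, for example `[word_change_summary(before, after) for before, after in diff_list]`. A similarity of 1.0 would indicate that the augmented description is identical to the original.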


## Main Script Usage

In [3]:
# HomeMatcher runs the whole pipeline end to end: generate, store, search, and personalize.

home_matcher = HomeMatcher(api_key=api_key)
asyncio.run(home_matcher.match())

Generating listings...
Saving listings to resources/listings.json
Count of items in db: 40
Listings stored in ChromaDB
Searching for similar listing
Count of items in db: 40
Creating personalized listings...
Max retries reached, proceeding without augmented description
Personalized listings saved to resources/personalized_listings.json


In [5]:
results = home_matcher.load_matches()
results

[{'neighborhood': 'Cedar Park',
  'price': '$780,000',
  'bedrooms': 4.0,
  'bathrooms': 3.0,
  'house_size': '2,300 sqft',
  'description': 'Beautiful 4-bedroom, 3-bathroom home in the Cedar Park neighborhood. This home features an open floor plan with a spacious living room, a modern kitchen with quartz countertops, and a dining area that opens to a large deck. The master suite includes a walk-in closet and a luxurious bathroom with a soaking tub. The finished basement offers additional living space and a home office. Neighborhood Description: Cedar Park is a family-friendly community with excellent schools, parks, and recreational facilities. Enjoy hiking and biking trails, community events, and easy access to shopping and dining.',
  'augmented_description': "Beautiful 4-bedroom, 3-bathroom home in the Cedar Park neighborhood. This home features an open floor plan with a spacious living room, a modern kitchen with quartz countertops, and a dining area that opens to a large deck. Th