# HomeMatch Project

Imagine you're a talented developer at "Future Homes Realty", a forward-thinking real estate company. In an industry where personalization is key to customer satisfaction, your company wants to revolutionize how clients interact with real estate listings. The goal is to create a personalized experience for each buyer, making the property search process more engaging and tailored to individual preferences.

Your task is to develop an innovative application named "HomeMatch". This application leverages large language models (LLMs) and vector databases to transform standard real estate listings into personalized narratives that resonate with potential buyers' unique preferences and needs.

* Step 1: Generating Real Estate Listings

Generate real estate listings using a Large Language Model. Generate at least 10 listings. This can involve creating prompts for the LLM to produce descriptions of various properties.


* Step 2: Storing Listings in a Vector Database

Initialize and configure ChromaDB or a similar vector database to store real estate listings.

Convert the LLM-generated listings into suitable embeddings that capture the semantic content of each listing, and store these embeddings in the vector database.

* Step 3: Building the User Preference Interface

Collect buyer preferences, such as the number of bedrooms, bathrooms, location, and other specific requirements, either through a set of questions or by asking the buyer to describe their preferences in natural language.

* Step 4: Searching Based on Preferences

Use the structured buyer preferences to perform a semantic search on the vector database, retrieving listings that most closely match the user's requirements.

* Step 5: Personalizing Listing Descriptions

For each retrieved listing, use the LLM to augment the description, tailoring it to resonate with the buyer's specific preferences. This involves subtly emphasizing aspects of the property that align with what the buyer is looking for.

## Step 0: Pip installs and libraries

In [None]:
pip install openai==0.28
pip install sentence_transformers
pip install langchain
pip install --upgrade chromadb==0.4.14

In [25]:
from langchain.llms import OpenAI
import openai
import json
import pandas as pd
import pickle

## Step 1: Generating Real Estate Listings

In [12]:
api_key = "YOUR_API_KEY"
openai.api_key = api_key
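
Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of one alternative, assuming the key has been exported in an environment variable named `OPENAI_API_KEY`. The helper `load_api_key` is hypothetical and not part of the original notebook:

```python
import os


def load_api_key(var_name="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    Raises RuntimeError with a readable message when the variable is
    missing or empty, instead of failing later inside an API call.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key
```

With this in place, the cell above could read `openai.api_key = load_api_key()`.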

In [13]:
def generate_real_estate_listings(prompt, num_listings=13):
    try:
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "system",
                    "content": "You are a knowledgeable real estate agent. Generate {} real estate examples based on the provided example.".format(num_listings)
                },
                {
                    "role": "user",
                    "content": prompt
                }
            ],
            temperature=0.7,
            max_tokens=250 * num_listings,
            top_p=1,
            frequency_penalty=0,
            presence_penalty=0
        )
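        # One string per returned choice; with the default n=1 this is a
        # single string that contains all generated listings.
        return [choice.message.content for choice in response.choices]
    except Exception as e:
        # Raise instead of returning a JSON string, so callers never mistake
        # an error message for a listing.
        raise RuntimeError(f"Listing generation failed: {e}") from e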

In [15]:
prompt = """Neighborhood: Green Oaks; Price: $800,000, Bedrooms: 3, Bathrooms: 2; House Size: 2,000 sqft; Description: Welcome to this eco-friendly oasis nestled in the heart of Green Oaks. This charming 3-bedroom, 2-bathroom home boasts energy-efficient features such as solar panels and a well-insulated structure. Natural light floods the living spaces, highlighting the beautiful hardwood floors and eco-conscious finishes. The open-concept kitchen and dining area lead to a spacious backyard with a vegetable garden, perfect for the eco-conscious family.; Neighborhood Description: Green Oaks is a close-knit, environmentally-conscious community with access to organic grocery stores, community gardens, and bike paths. Take a stroll through the nearby Green Oaks Park or grab a cup of coffee at the cozy Green Bean Cafe. With easy access to public transportation and bike lanes, commuting is a breeze."""

generated_listings = generate_real_estate_listings(prompt)

In [16]:
generated_listings

["Neighborhood: Lakefront Estates; Price: $1,200,000; Bedrooms: 4; Bathrooms: 3.5; House Size: 3,500 sqft; Description: Enjoy luxurious lakeside living in this stunning 4-bedroom, 3.5-bathroom estate in Lakefront Estates. The grand foyer welcomes you into a spacious open-concept living area with panoramic views of the lake. The gourmet kitchen features high-end appliances and a large island, perfect for entertaining. The expansive backyard offers a private dock and boat lift, ideal for water enthusiasts. Neighborhood Description: Lakefront Estates is an exclusive waterfront community known for its upscale homes, private marina, and scenic walking trails. Residents can enjoy boating, fishing, and lakeside picnics just steps from their front door. Close proximity to upscale shopping, fine dining, and top-rated schools make this neighborhood highly desirable. \n\nNeighborhood: Mountain View Heights; Price: $750,000; Bedrooms: 3; Bathrooms: 2.5; House Size: 2,300 sqft; Description: Nestled

In [17]:
listings = generated_listings[0].split('\n\n')
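
Splitting on exactly `'\n\n'` works for the response shown here, but it keeps trailing whitespace and would produce empty entries if the model emitted extra blank lines. A slightly more defensive sketch, where `split_listings` is a hypothetical helper and the blank-line separator is an assumption based on the output above:

```python
import re


def split_listings(raw_text):
    """Split one model response into individual listings on blank lines.

    Blank lines may contain stray spaces; empty chunks are dropped and
    each listing is stripped of surrounding whitespace.
    """
    chunks = re.split(r"\n\s*\n", raw_text)
    return [chunk.strip() for chunk in chunks if chunk.strip()]
```

It would be called as `listings = split_listings(generated_listings[0])`.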

In [18]:
listings[0]

'Neighborhood: Lakefront Estates; Price: $1,200,000; Bedrooms: 4; Bathrooms: 3.5; House Size: 3,500 sqft; Description: Enjoy luxurious lakeside living in this stunning 4-bedroom, 3.5-bathroom estate in Lakefront Estates. The grand foyer welcomes you into a spacious open-concept living area with panoramic views of the lake. The gourmet kitchen features high-end appliances and a large island, perfect for entertaining. The expansive backyard offers a private dock and boat lift, ideal for water enthusiasts. Neighborhood Description: Lakefront Estates is an exclusive waterfront community known for its upscale homes, private marina, and scenic walking trails. Residents can enjoy boating, fishing, and lakeside picnics just steps from their front door. Close proximity to upscale shopping, fine dining, and top-rated schools make this neighborhood highly desirable. '

In [20]:
listings[12]

'Neighborhood: Coastal Cottage Community; Price: $750,000; Bedrooms: 3; Bathrooms: 2.5; House Size: 2,200 sqft; Description: Experience coastal living in this charming 3-bedroom, 2.5-bathroom cottage located in the quaint Coastal Cottage Community. The light and airy living area features a cozy fireplace, shiplap accents, and French doors leading to a screened porch. The gourmet kitchen boasts quartz countertops, subway tile backsplash, and stainless steel appliances. The landscaped backyard is a private oasis with a fire pit, outdoor dining area, and lush gardens. Neighborhood Description: Coastal Cottage Community is a seaside neighborhood known for its sandy beaches, seaside cafes, and laid-back lifestyle. Residents can enjoy beachcombing, surfing, and sunset picnics just a short walk from their front door. With a strong sense of community and a relaxed vibe, this neighborhood offers the perfect coastal escape.'
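
Each listing follows the same `Label: value` layout, so it can also be broken into structured fields, for example to check the bedroom count or price in code. The sketch below assumes the seven labels seen in the listings above and that a label never appears followed by a colon inside a description; `parse_listing` is a hypothetical helper:

```python
import re

# Longer labels come first so "Neighborhood Description" is not read as
# "Neighborhood" or as "Description".
FIELD_PATTERN = re.compile(
    r"\b(Neighborhood Description|Neighborhood|Price|Bedrooms|Bathrooms"
    r"|House Size|Description):\s*"
)


def parse_listing(text):
    """Return a dict mapping each label to the text that follows it."""
    matches = list(FIELD_PATTERN.finditer(text))
    fields = {}
    for current, following in zip(matches, matches[1:] + [None]):
        # A field's value runs until the next label, or to the end of the text.
        end = following.start() if following else len(text)
        value = text[current.end():end].strip().rstrip(";,").strip()
        fields[current.group(1)] = value
    return fields
```

For instance, `parse_listing(listings[0])["Bedrooms"]` would give the bedroom count as a string.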

In [26]:
with open('listings.pkl', 'wb') as f:
    pickle.dump(listings, f)
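
Since `listings` is a plain list of strings, it could also be saved as JSON, which stays human-readable and can be opened outside Python. This is only a sketch of that alternative. The file name `listings.json` and both helper names are assumptions, not part of the original notebook:

```python
import json


def save_listings(listings, path="listings.json"):
    """Write the list of listing strings to a UTF-8 JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(listings, f, ensure_ascii=False, indent=2)


def load_listings(path="listings.json"):
    """Read the list of listing strings back from a JSON file."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```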

## Step 2: Storing Listings in a Vector Database

In [27]:
with open('listings.pkl', 'rb') as f:
    listings = pickle.load(f)

I create a new ChromaDB collection to hold the listings generated above. Each listing is stored directly as a document, which simplifies searching against client interests entered in natural language.

Because no embedding function is supplied, Chroma falls back to its default embedding function, which uses the Sentence Transformers all-MiniLM-L6-v2 model (downloaded on first use, as the log after the next cell shows).

In [29]:
import chromadb
client = chromadb.PersistentClient()
collection = client.get_or_create_collection("home_features")

for i, item in enumerate(listings):
    collection.add(
        documents=[item],
        ids=[str(i)]
    )

/root/.cache/chroma/onnx_models/all-MiniLM-L6-v2/onnx.tar.gz: 100%|██████████| 79.3M/79.3M [00:01<00:00, 68.6MiB/s]
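
The loop above issues one `add` call per listing. Because `collection.add` accepts lists for `documents` and `ids`, the same data can be prepared once and sent in a single call. A minimal sketch, where `build_batch` is a hypothetical helper. Note that it skips blank entries, so the ids could differ from the loop above if `listings` contained any:

```python
def build_batch(listings):
    """Prepare keyword arguments for a single collection.add(...) call.

    Blank entries are skipped; ids are the positions of the remaining
    documents, as strings.
    """
    documents = [text.strip() for text in listings if text.strip()]
    ids = [str(i) for i in range(len(documents))]
    return {"documents": documents, "ids": ids}
```

It would be used as `collection.add(**build_batch(listings))`.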


The collection now contains 13 documents; the first one is shown below as an example:


In [30]:
collection.count()

13

In [33]:
collection.peek(1)

{'ids': ['0'],
 'embeddings': [[0.15903078019618988,
   -0.05696413666009903,
   0.041905347257852554,
   -0.04410164803266525,
   -0.0034898160956799984,
   -0.009429418481886387,
   -0.016508642584085464,
   0.010040447115898132,
   -0.04713725298643112,
   0.021636007353663445,
   0.03540901839733124,
   -0.012959107756614685,
   0.03908371925354004,
   0.0003529523964971304,
   0.05585859715938568,
   0.04027879238128662,
   0.0964839980006218,
   -0.06388022005558014,
   0.034443359822034836,
   0.007843921892344952,
   -0.057935334742069244,
   -0.0038062159437686205,
   0.057928744703531265,
   -0.017242150381207466,
   0.08734144270420074,
   0.09170781821012497,
   -0.009139681234955788,
   0.03129904344677925,
   -0.02294132299721241,
   0.05235237628221512,
   0.08088673651218414,
   0.04833973944187164,
   -0.03599729388952255,
   -0.02561216987669468,
   0.020298819988965988,
   -0.0023264142218977213,
   -0.025495333597064018,
   -0.022442936897277832,
   -0.0249308682978
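
Each listing is stored as a vector like the (truncated) one above, and a search compares the query's vector against every stored vector. The toy sketch below illustrates that comparison with two-dimensional vectors, assuming the collection uses squared Euclidean (L2) distance, which to my knowledge is Chroma's default. Both function names are hypothetical, and a real index avoids this brute-force scan:

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length.")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_by_distance(query, vectors):
    """Return (id, distance) pairs sorted from closest to farthest."""
    scored = [(doc_id, squared_l2(query, vec)) for doc_id, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy data: three "documents" embedded in two dimensions.
toy_vectors = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [3.0, 4.0]}
ranking = rank_by_distance([0.0, 0.0], toy_vectors)
```

A smaller distance means a closer match, which is how the `distances` returned by a query should be read.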

## Step 3: Building the User Preference Interface

I obtain the client's interests dynamically: an LLM plays the role of the buyer, answers a fixed set of questions, and its response is saved as the buyer's preferences.

In [41]:
def generate_user_interests(prompt):
    try:
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "system",
                    "content": "You are a user who is trying to buy a house. Answer the following questions to find out what interests you. Give me your answer in less than 200 tokens."
                },
                {
                    "role": "user",
                    "content": prompt
                }
            ],
            temperature=0.7,
            max_tokens=200,
            top_p=1,
            frequency_penalty=0,
            presence_penalty=0
        )
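        # One answer string per returned choice (a single one by default).
        answers = [choice.message.content for choice in response.choices]
        return answers
    except Exception as e:
        # Raise instead of returning a JSON string, so an error message is
        # never used as the search query.
        raise RuntimeError(f"Generating user interests failed: {e}") from e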

In [42]:
prompt = ["How big do you want your house to be?",
          "How many bedrooms are you looking for?",
          "Are you looking for a house with more than one bathroom?",
          "How urban do you want your neighborhood to be?",
          "What approximate price are you looking for?"
          ]
user_interests = generate_user_interests(" ".join(prompt))

In [43]:
user_interests

['I am looking for a medium-sized house with 3-4 bedrooms and at least 2 bathrooms. I prefer a suburban neighborhood with easy access to urban amenities. My budget is around $500,000.']
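
Here the buyer is simulated by the LLM. To collect answers from a real buyer instead, the same questions could be asked interactively. A minimal sketch, where `collect_preferences` is a hypothetical helper and the `ask` parameter (defaulting to the built-in `input`) exists only so the function can be driven without a keyboard:

```python
def collect_preferences(questions, ask=input):
    """Ask each question and join the non-empty answers into one string.

    Each answer is kept next to its question so the resulting text
    carries enough context for a semantic search.
    """
    answers = []
    for question in questions:
        answer = ask(question + " ").strip()
        if answer:
            answers.append(f"{question} {answer}")
    return " ".join(answers)
```

The result is a single string; wrapping it in a list, as in `[collect_preferences(prompt)]`, gives it the same shape as `user_interests`.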

## Step 4: Searching Based on Preferences

I search the vector database for the 2 listings that most closely match the client's specifications.

In [52]:
response = collection.query(
    query_texts=user_interests,
    n_results=2
)
response

{'ids': [['2', '4']],
 'distances': [[0.8289262174209224, 0.9366515334660342]],
 'metadatas': [[None, None]],
 'embeddings': None,
 'documents': [['Neighborhood: Urban Loft District; Price: $500,000; Bedrooms: 2; Bathrooms: 2; House Size: 1,800 sqft; Description: Experience modern urban living in this stylish 2-bedroom, 2-bathroom loft located in the vibrant Urban Loft District. The industrial-chic design features exposed brick walls, high ceilings, and polished concrete floors. The open-concept living area is perfect for entertaining, with a gourmet kitchen and a spacious dining area. Floor-to-ceiling windows flood the space with natural light, creating a bright and airy ambiance. Neighborhood Description: The Urban Loft District is a trendy neighborhood known for its converted loft spaces, art galleries, and hip eateries. Residents can walk to boutique shops, cafes, and cultural attractions, making it a vibrant and lively community for young professionals and creatives. ',
   'Neighb
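
The query result nests one inner list per query text, which is why later cells index it as `response['documents'][0][0]`. A small sketch that flattens the structure shown above into one record per match. `flatten_query_results` is a hypothetical helper and relies only on the `ids`, `distances`, and `documents` keys visible in the output:

```python
def flatten_query_results(response, query_index=0):
    """Turn a query response into a list of {id, distance, document} dicts.

    query_index selects which query text's results to read; the notebook
    sends a single query, so the default of 0 applies.
    """
    ids = response["ids"][query_index]
    distances = response["distances"][query_index]
    documents = response["documents"][query_index]
    return [
        {"id": doc_id, "distance": distance, "document": document}
        for doc_id, distance, document in zip(ids, distances, documents)
    ]
```

The matches keep the order of the response, so the first record is the closest listing.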

## Step 5: Personalizing Listing Descriptions

In [67]:
def generate_house_description(prompt):
    try:
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "system",
                    "content": "You are a knowledgeable real estate agent. Make a description of the house considering the specifications given by the client and the features of the house you want to sell. When there is a mismatch between them, you always have to consider the House features as the source of truth."
                },
                {
                    "role": "user",
                    "content": prompt
                }
            ],
            temperature=0.7,
            max_tokens=500,
            top_p=1,
            frequency_penalty=0,
            presence_penalty=0
        )
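        # One description string per returned choice (a single one by default).
        descriptions = [choice.message.content for choice in response.choices]
        return descriptions
    except Exception as e:
        # Raise instead of returning a JSON string, so an error message is
        # never presented as a property description.
        raise RuntimeError(f"Generating the house description failed: {e}") from e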

Property that most closely resembles the client's specifications:

In [71]:
response['documents'][0][0]

'Neighborhood: Urban Loft District; Price: $500,000; Bedrooms: 2; Bathrooms: 2; House Size: 1,800 sqft; Description: Experience modern urban living in this stylish 2-bedroom, 2-bathroom loft located in the vibrant Urban Loft District. The industrial-chic design features exposed brick walls, high ceilings, and polished concrete floors. The open-concept living area is perfect for entertaining, with a gourmet kitchen and a spacious dining area. Floor-to-ceiling windows flood the space with natural light, creating a bright and airy ambiance. Neighborhood Description: The Urban Loft District is a trendy neighborhood known for its converted loft spaces, art galleries, and hip eateries. Residents can walk to boutique shops, cafes, and cultural attractions, making it a vibrant and lively community for young professionals and creatives. '

Corresponding description from the real estate agent:

In [68]:
prompt = ["Client specifications:" + user_interests[0] + ". House features: " + response['documents'][0][0]]
house_description = generate_house_description(" ".join(prompt))
house_description

["Welcome to this stunning 2-bedroom, 2-bathroom loft located in the heart of the vibrant Urban Loft District, priced at $500,000. Spanning 1,800 sqft, this modern urban dwelling boasts a contemporary industrial-chic design with exposed brick walls, high ceilings, and polished concrete floors, offering a unique and stylish living experience.\n\nThe open-concept living area is designed for both comfort and entertainment, featuring a gourmet kitchen and a spacious dining area. Floor-to-ceiling windows flood the space with natural light, creating a bright and airy ambiance that complements the modern aesthetic of the home.\n\nSituated in the sought-after Urban Loft District, this property offers easy access to urban amenities, boutique shops, cafes, art galleries, and trendy eateries, all within walking distance. The neighborhood is known for its dynamic and lively atmosphere, making it the perfect place for young professionals and creatives looking to immerse themselves in a vibrant comm

Second property that most closely resembles the client's specifications:

In [70]:
response['documents'][0][1]

'Neighborhood: Historic Downtown; Price: $600,000; Bedrooms: 3; Bathrooms: 2; House Size: 1,600 sqft; Description: Step back in time with this charming 3-bedroom, 2-bathroom historic home in Downtown. The restored Victorian features original hardwood floors, crown molding, and period details throughout. The cozy living room is anchored by a vintage fireplace, perfect for chilly evenings. The wrap-around porch is ideal for sipping lemonade and watching the world go by. Neighborhood Description: Historic Downtown is a quaint district filled with tree-lined streets, historic homes, and unique shops. Residents can walk to local cafes, farmers markets, and cultural events, immersing themselves in the rich history and charm of the neighborhood. '

Corresponding description from the real estate agent:

In [69]:
prompt = ["Client specifications:" + user_interests[0] + ". House features: " + response['documents'][0][1]]
house_description = generate_house_description(" ".join(prompt))
house_description

["Welcome to this charming historic home located in the heart of Downtown! This 3-bedroom, 2-bathroom Victorian beauty offers a glimpse into the past with its original hardwood floors, crown molding, and period details that exude timeless elegance. The cozy living room, complete with a vintage fireplace, provides the perfect setting for intimate gatherings and cozy nights in.\n\nStep outside onto the wrap-around porch, where you can enjoy a relaxing moment sipping lemonade and soaking in the picturesque surroundings of tree-lined streets and historic homes that define the quaint charm of Historic Downtown.\n\nWhile this home may not match the client's initial specifications in terms of the number of bedrooms and bathrooms, its unique features and character offer a rare opportunity to own a piece of history in a highly desirable neighborhood. With a house size of 1,600 sqft and priced at $600,000, this property presents a blend of comfort, nostalgia, and community that is sure to captiv
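
The two cells above repeat the same steps for each match. A sketch of how the prompt construction and the loop over matches could be factored out is shown below. Both helper names are hypothetical, and the `describe` argument stands in for any function that takes a prompt string, such as `generate_house_description`:

```python
def build_personalization_prompt(preferences, listing):
    """Combine the buyer's preferences and one listing into a single prompt."""
    return (
        f"Client specifications: {preferences.strip()} "
        f"House features: {listing.strip()}"
    )


def personalize_all(preferences, listings, describe):
    """Return one personalized result per listing, in the same order."""
    return [
        describe(build_personalization_prompt(preferences, listing))
        for listing in listings
    ]
```

With the objects from this notebook, the call would be `personalize_all(user_interests[0], response['documents'][0], generate_house_description)`.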