# HomeMatch
Follow the installation instructions in `HomeMatchReadme.md`, then run each cell below to demonstrate the functionality of `HomeMatch`. To generate new test listings and save them to `Listings.txt`, change the `generate_listings` variable to `True`.

In [1]:

from langchain.llms import OpenAI
from langchain.document_loaders.csv_loader import CSVLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.prompts import PromptTemplate
from langchain.chains.question_answering import load_qa_chain
from langchain.schema import Document
from langchain.chat_models import ChatOpenAI
from typing import List
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import TextLoader
import random

import csv
import os
from dotenv import load_dotenv

load_dotenv()  # take environment variables from .env

True

## Generate Listings Using LLM

Set `generate_listings` to `True` to generate new listing data and save it to `Listings.txt`

In [2]:
generate_listings = False
num_listings = 20
filename = "Listings.txt"

model_name = 'gpt-3.5-turbo'

# Chat model used to write the listing descriptions
llm = ChatOpenAI(model_name=model_name)

# Separator written between listings in the output file
delimiter = "\n-----------------------\n"


# Neighborhoods in Austin, TX
neighborhoods = ["SoCo", "Bouldin Creek", "Travis Heights", "Mueller", "Clarksville",
                 "West Lake Hills", "Tarrytown", "Hyde Park", "Rosedale", "North Loop", "Brentwood", "Crestview", "Allandale",
                 "East Austin", "Cherrywood", "Windsor Park", "Circle C Ranch", "Steiner Ranch", "Milwood"]

# Generate random properties
def generate_random_property():
    price = random.randint(150000, 2000000)  # Random price
    bedrooms = random.randint(2, 6)
    bathrooms = random.randint(1, bedrooms) # Don't recommend more bathrooms than bedrooms
    sqft = random.randint(bedrooms*500, bedrooms*1000)  # Square footage
    neighborhood = random.choice(neighborhoods)

    return {
        "price": f"${price:,}",
        "square_footage": sqft,
        "bedrooms": bedrooms,
        "bathrooms": bathrooms,
        "neighborhood": neighborhood
    }

def generate_fake_listings(num_listings):
        
    # Generate multiple listings
    properties = [generate_random_property() for _ in range(num_listings)]
    
    # Format properties into descriptions using LLM
    property_descriptions = []
    prompt_template = PromptTemplate(
        input_variables=["price", "square_footage", "bedrooms", "bathrooms", "neighborhood"],
        template="""
        Generate a compelling real estate listing description for a house in {neighborhood}, Austin, Texas.
        It has {bedrooms} bedrooms, {bathrooms} bathrooms, and spans {square_footage} square feet.
        The asking price is {price}. Be sure to also include a description
        of the neighborhood. Keep it engaging!
        """
    )


    listings = []
    for prop_id, prop in enumerate(properties, start=1):
        response = llm.invoke(prompt_template.format(**prop))
        listings.append(f"Listing ID: {prop_id}\n{response.content}")

    with open(filename, mode='w', encoding='utf-8') as file:
        file.write(delimiter.join(listings))
    

if generate_listings:
    # Writes the listings to `filename`; the function returns nothing
    generate_fake_listings(num_listings)
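
Before spending API calls, it can help to preview the prompt that would be sent for a randomly generated property. The following is a minimal offline sketch under some assumptions: it uses plain `str.format` in place of `PromptTemplate`, a shortened template and neighborhood list, and the hypothetical helpers `sample_property` and `preview_prompt`, none of which are part of the notebook.

```python
import random

# Shortened neighborhood list for illustration (assumption: any subset works)
sample_neighborhoods = ["Mueller", "Hyde Park", "Crestview"]

# Shortened template with the same placeholders as the notebook's prompt
template = (
    "Generate a listing description for a house in the {neighborhood} "
    "neighborhood. It has {bedrooms} bedrooms, {bathrooms} bathrooms, and "
    "spans {square_footage} square feet. The asking price is {price}."
)

def sample_property(rng):
    """Mirror generate_random_property, but draw from an explicit RNG."""
    bedrooms = rng.randint(2, 6)
    return {
        "price": f"${rng.randint(150000, 2000000):,}",
        "square_footage": rng.randint(bedrooms * 500, bedrooms * 1000),
        "bedrooms": bedrooms,
        "bathrooms": rng.randint(1, bedrooms),
        "neighborhood": rng.choice(sample_neighborhoods),
    }

def preview_prompt(seed):
    """Hypothetical helper: build one property and the prompt text for it."""
    prop = sample_property(random.Random(seed))
    return prop, template.format(**prop)

prop, prompt_text = preview_prompt(seed=0)
print(prompt_text)
```

Seeding the generator makes the preview repeatable, which is convenient when adjusting the prompt wording.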

    


In [3]:
# Load the document (the file is written as UTF-8, so read it the same way)
loader = TextLoader(filename, encoding="utf-8")
docs = loader.load()

# Manually split the document using the delimiter
split_texts = docs[0].page_content.split(delimiter)  # Split on delimiter

# Convert into LangChain Document objects
split_docs = [Document(page_content=text.strip()) for text in split_texts if text.strip()]
    
embeddings = OpenAIEmbeddings()
db = Chroma.from_documents(split_docs, embeddings)
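
The cell above stores each listing as plain text, so the listing ID lives only inside the text itself. The sketch below repeats the delimiter split on an in-memory string and pulls the ID out of the `Listing ID: N` header line. The helper `parse_listings` and the sample text are hypothetical and not part of the notebook.

```python
import re

delimiter = "\n-----------------------\n"

# Made-up sample in the same layout that generate_fake_listings writes
sample_file_text = delimiter.join([
    "Listing ID: 1\nCharming bungalow near the park.",
    "Listing ID: 2\nModern home with a large backyard.",
])

def parse_listings(text):
    """Split on the delimiter and return a list of (listing_id, body) pairs."""
    parsed = []
    for chunk in text.split(delimiter):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip empty pieces, e.g. from a trailing delimiter
        match = re.match(r"Listing ID: (\d+)\n", chunk)
        if match is None:
            continue  # assumption: chunks without an ID header are ignored
        parsed.append((int(match.group(1)), chunk[match.end():].strip()))
    return parsed

for listing_id, body in parse_listings(sample_file_text):
    print(listing_id, body)
```

One possible use is to attach the ID as metadata, for example `Document(page_content=body, metadata={"listing_id": listing_id})`, so it can be read back from search results without parsing the text again.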

## Semantic Search Using User Preferences
The user's preferences are hardcoded below as question-and-answer pairs. Feel free to change the questions and answers to test
new preferences. The preferences are used to retrieve the 3 closest listings from the vector database.

In [4]:
questions = [
    "How big do you want your house to be?",
    "What are 3 most important things for you in choosing this property?",
    "Which amenities would you like?",
    "Which transportation options are important to you?",
    "How urban do you want your neighborhood to be?",
]
answers = [
    "A comfortable three-bedroom house with a spacious kitchen and a cozy living room.",
    "A quiet neighborhood, good local schools, and convenient shopping options.",
    "A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.",
    "Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.",
    "A balance between suburban tranquility and access to urban amenities like restaurants and theaters."
]

preferences = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in zip(questions, answers))

query = """
Based on a list of user preferences in the form of questions and answers, recommend real estate listings
from the provided context that most closely match the user's preferences.

## User Preferences
{preferences}
""".format(preferences=preferences)

similar_docs = db.similarity_search(query, k=3)
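
Conceptually, `similarity_search` embeds the query and returns the `k` stored documents whose vectors are closest to it. The pure-Python sketch below illustrates that ranking with cosine similarity over tiny hand-written vectors. Both the vectors and the choice of cosine similarity are assumptions for illustration: real embeddings have many more dimensions, and the vector store may use a different distance metric.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=3):
    """Return the k document names most similar to the query, best first."""
    ranked = sorted(
        doc_vecs,
        key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
        reverse=True,
    )
    return ranked[:k]

# Toy 3-dimensional "embeddings" (made up; not produced by any model)
toy_docs = {
    "listing_1": [0.9, 0.1, 0.0],
    "listing_2": [0.1, 0.9, 0.0],
    "listing_3": [0.7, 0.3, 0.1],
    "listing_4": [0.0, 0.1, 0.9],
}
toy_query = [1.0, 0.2, 0.0]

print(top_k(toy_query, toy_docs, k=3))
```

The documents pointing in nearly the same direction as the query rank first, which is why a long, descriptive preference text can still match listings that share its themes rather than its exact words.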

## Generate Personalized Descriptions
Here we generate a personalized description for each of the user's top 3 listings, based on their preferences.

In [5]:
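# gpt-3.5-turbo is a chat model, so use the chat wrapper rather than the completion-style OpenAI class
llm = ChatOpenAI(model_name=model_name, temperature=0, max_tokens=2000)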

query = """
For each of the real estate listings in context, create a personalized description of the listing that resonates
with the user preferences given. For each description, be sure to only use information from the
real estate listing. For each listing, output only the listing ID, the price, the number of bedrooms and bathrooms, and a single description.

## User Preferences
{preferences}
""".format(preferences=preferences)

prompt = PromptTemplate(
    template="{query}\nContext: {context}",
    input_variables=["query", "context"],
)
# "stuff" places all retrieved listings into a single prompt as {context}
chain = load_qa_chain(llm, chain_type="stuff", prompt=prompt)
print(chain.run(input_documents=similar_docs, query=query))




Listing ID: 8
Price: $895,781
This stunning 4 bedroom, 4 bathroom house in vibrant East Austin is a dream come true! With over 3,000 square feet of luxurious space, this home offers a modern design and spacious living perfect for entertaining or relaxing. The gourmet kitchen with top-of-the-line appliances and a large island is a chef's delight, while the master suite provides a private oasis with a spa-like bathroom. Outside, the beautifully landscaped yard is ideal for enjoying the Texas sunshine. Located in a neighborhood known for its eclectic mix of restaurants, shops, and entertainment options, this East Austin gem is waiting for you to make it your own.

Listing ID: 14
Price: $492,483
Welcome to your dream home in the heart of vibrant East Austin! This stunning 5-bedroom, 2-bathroom house offers a spacious 4137 square feet of living space, perfect for families of all sizes. The modern design and open layout create a welcoming atmosphere for entertaining guests or relaxing in sty