This is a starter notebook for the project, you'll have to import the libraries you'll need, you can find a list of the ones available in this workspace in the requirements.txt file in this workspace. 

In [17]:
!pip install -U chromadb pandas langchain langchain-openai langchain-chroma langchain-community

Defaulting to user installation because normal site-packages is not writeable
Collecting langchain-community
  Downloading langchain_community-0.3.27-py3-none-any.whl (2.5 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.5/2.5 MB[0m [31m25.0 MB/s[0m eta [36m0:00:00[0ma [36m0:00:01[0m
Collecting pydantic-settings<3.0.0,>=2.4.0
  Downloading pydantic_settings-2.10.1-py3-none-any.whl (45 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m45.2/45.2 kB[0m [31m8.6 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting httpx-sse<1.0.0,>=0.4.0
  Downloading httpx_sse-0.4.1-py3-none-any.whl (8.1 kB)


Installing collected packages: httpx-sse, pydantic-settings, langchain-community
Successfully installed httpx-sse-0.4.1 langchain-community-0.3.27 pydantic-settings-2.10.1


In [22]:
import os
import json
import pandas as pd
import chromadb
from io import StringIO
from langchain_openai import OpenAI
#from langchain.document_loaders.csv_loader import CSVLoader
from langchain_community.document_loaders import CSVLoader
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain import LLMChain
from langchain.chains.question_answering import load_qa_chain
#from langchain.output_parsers import PydanticOutputParser
#from fastapi.encoders import jsonable_encoder
from langchain.schema import SystemMessage, HumanMessage
#from langchain_core.output_parsers import StrOutputParser

os.environ["OPENAI_API_KEY"] = "YOUR_API_KEY_GOES_HERE"
os.environ["OPENAI_API_BASE"] = "https://openai.vocareum.com/v1"
client = chromadb.PersistentClient(path="./chroma_db") # Stores data in a local directory

llm = OpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
    max_tokens=500,
    api_key=os.getenv("OPENAI_API_KEY")
)

instruction = """
Generate 25 Real Estate lisitngs in CSV format with the following fields: Neighborhood, Price, Bedrooms, Bathrooms, House Size, Description.
Follow this format
Neighborhood: Green Oaks
Price: $800,000
Bedrooms: 3
Bathrooms: 2
House Size: 2,000 sqft

Description: Welcome to this eco-friendly oasis nestled in the heart of Green Oaks. 
This charming 3-bedroom, 2-bathroom home boasts energy-efficient features such as solar panels 
and a well-insulated structure. Natural light floods the living spaces, highlighting the beautiful 
hardwood floors and eco-conscious finishes. The open-concept kitchen and dining area lead to a spacious 
backyard with a vegetable garden, perfect for the eco-conscious family. Embrace sustainable living without 
compromising on style in this Green Oaks gem.

Neighborhood Description: Green Oaks is a close-knit, environmentally-conscious community with access to 
organic grocery stores, community gardens, and bike paths. Take a stroll through the nearby Green Oaks Park
or grab a cup of coffee at the cozy Green Bean Cafe. With easy access to public transportation and bike lanes, 
commuting is a breeze.
"""

RE_listings = llm.invoke(instruction)
#try:
#    RE_listings = llm.invoke(instruction).content
#except Exception as e:
    # In case quota not enough
#    raise RuntimeError(
#        f"LLM call failed (Please check quota). Underlying error: {e}"
#    )


# Convert RE_listings->DataFrame
df = pd.read_csv(StringIO(RE_listings), sep=';')
df.to_csv('RE_listings.csv', index=False)

# Load CSV
csv_loader = CSVLoader(file_path='./RE_listings.csv')
RE_listing_docs = csv_loader.load()

# Embeddings created
embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",   
    api_key=os.getenv("OPENAI_API_KEY"),
    base_url=os.getenv("OPENAI_API_BASE")  
)


# Build Chroma Database
db = Chroma.from_documents(
    documents=RE_listing_docs,
    embedding=embeddings,
    collection_name="RE_listings",
    persist_directory="data",
)
#db.persist() not needed in current version; writes are automatically persisted 

# Test User Input
user_input = input("Welcome to House Match; Please enter what you are looking for: ")
print("You entered:", user_input)

# Retrieval 
results = db.similarity_search(user_input, k=5)
for i, d in enumerate(results, 1):
    print(f"\nResult {i}:\n{d.page_content}\nMetadata: {d.metadata}") 

Welcome to House Match; Please enter what you are looking for: Solar Panels
You entered: Solar Panels

Result 1:
```: Green Oaks,$800,000,3,2,2,000 sqft,"Welcome to this eco-friendly oasis nestled in the heart of Green Oaks. This charming 3-bedroom, 2-bathroom home boasts energy-efficient features such as solar panels and a well-insulated structure. Natural light floods the living spaces, highlighting the beautiful hardwood floors and eco-conscious finishes. The open-concept kitchen and dining area lead to a spacious backyard with a vegetable garden, perfect for the eco-conscious family. Embrace sustainable living without compromising on style in this Green Oaks gem."
Metadata: {'source': './RE_listings.csv', 'row': 1}

Result 2:
```: Green Oaks,$800,000,3,2,2000 sqft,"Welcome to this eco-friendly oasis nestled in the heart of Green Oaks. This charming 3-bedroom, 2-bathroom home boasts energy-efficient features such as solar panels and a well-insulated structure. Natural light floods t

In [20]:
print(RE_listings)

```
Neighborhood,Price,Bedrooms,Bathrooms,House Size,Description
Green Oaks,$800,000,3,2,2,000 sqft,"Welcome to this eco-friendly oasis nestled in the heart of Green Oaks. This charming 3-bedroom, 2-bathroom home boasts energy-efficient features such as solar panels and a well-insulated structure. Natural light floods the living spaces, highlighting the beautiful hardwood floors and eco-conscious finishes. The open-concept kitchen and dining area lead to a spacious backyard with a vegetable garden, perfect for the eco-conscious family. Embrace sustainable living without compromising on style in this Green Oaks gem."
Maplewood,$650,000,4,3,2,500 sqft,"Discover your dream home in Maplewood! This stunning 4-bedroom, 3-bathroom residence features a modern open floor plan and high-end finishes throughout. The spacious kitchen is equipped with stainless steel appliances and a large island, perfect for entertaining. The backyard oasis includes a patio and pool, ideal for summer gatherings. Ma