This is a starter notebook for the project, you'll have to import the libraries you'll need, you can find a list of the ones available in this workspace in the requirements.txt file in this workspace. 

In [1]:
# Step 1: Setting Up the Python Application

"""
Initialize a Python Project: Create a new Python project, setting up a virtual environment and installing necessary 
packages like LangChain, a suitable LLM library (e.g., OpenAI's GPT), and a vector database package compatible with Python 
(e.g., ChromaDB or LanceDB). If you don't wish to create your files from scratch, starter files are available in the 
workspace on the next page as an application skeleton.
"""

from langchain.llms import OpenAI

# in case of Chain Of Thought (COT), you need these libraries:
from langchain.prompts import PromptTemplate
# from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain import LLMChain
# ChromaDB
from langchain.vectorstores import Chroma

from langchain.embeddings.openai import OpenAIEmbeddings
# from langchain.document_loaders.csv_loader import CSVLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain import LLMChain
from langchain.chains.question_answering import load_qa_chain
from langchain.docstore.document import Document


import os
os.environ["OPENAI_API_KEY"] = 'xxx'

In [3]:
# Step 2: Generating Real Estate Listings

"""
Generate real estate listings using a Large Language Model. Generate at least 10 listings: this can involve creating prompts 
for the LLM to produce descriptions of various properties.
"""


current_model_name = "gpt-3.5-turbo"
temperature = 1.0
my_llm = OpenAI(model_name=current_model_name, temperature=temperature, max_tokens = 500)

prompt_template = PromptTemplate.from_template(
    """You are a real estate agent/seller in charge of producing descriptions of various house properties. 
       Generate {num_descriptions} descriptions of possible properties that are at sell. These descriptions must include these parameters:
           Neighborhood: (you have to invent the name)
           Price: (in $)
           Bedrooms (max 6)
           Bathrooms (max 5)
           House Size (max 7,000 sqft)
           Description of the property: (max 50 words)
           Description of the Neighborhood: (max 50 words)
           
       Use this example as a template:
            Neighborhood: Green Oaks
            
            Price: $800,000
            
            Bedrooms: 3
            
            Bathrooms: 2
            
            House Size: 2,000 sqft
            
            Description: Welcome to this eco-friendly oasis nestled in the heart of Green Oaks. This charming 3-bedroom, 2-bathroom home boasts energy-efficient features such as solar panels and a well-insulated structure. Natural light floods the living spaces, highlighting the beautiful hardwood floors and eco-conscious finishes. The open-concept kitchen and dining area lead to a spacious backyard with a vegetable garden, perfect for the eco-conscious family. Embrace sustainable living without compromising on style in this Green Oaks gem.
            
            Neighborhood Description: Green Oaks is a close-knit, environmentally conscious community with access to organic grocery stores, community gardens, and bike paths. Take a stroll through the nearby Green Oaks Park or grab a cup of coffee at the cozy Green Bean Cafe. With easy access to public transportation and bike lanes, commuting is a breeze.
       
       """
)

#llm_chain = LLMChain(
#    prompt=prompt_template,
#    llm=my_llm(prompt_template.format("num_descriptions" = 2))
#)

llm=my_llm(prompt_template.format(num_descriptions = 3))

print("====OUTPUT=====\n")
# output = llm_chain()
output = llm
print(output)


====OUTPUT=====

Neighborhood: Maple Grove

Price: $1,200,000

Bedrooms: 4

Bathrooms: 3

House Size: 3,500 sqft

Description: Step into luxury in this stunning 4-bedroom, 3-bathroom home in the sought-after neighborhood of Maple Grove. Impeccable finishes, high ceilings, and bespoke features make this property a dream home. The gourmet kitchen, elegant master suite, and expansive outdoor living space are just a few highlights of this exquisite property.

Neighborhood Description: Maple Grove is a prestigious neighborhood known for its tree-lined streets, upscale shopping, and top-rated schools. Residents enjoy access to parks, country clubs, and fine dining options. With a strong sense of community and convenient location, Maple Grove offers a truly exceptional quality of life. 


Neighborhood: Sunset Ridge

Price: $900,000

Bedrooms: 5

Bathrooms: 4

House Size: 4,200 sqft

Description: Discover the perfect family home in the charming neighborhood of Sunset Ridge. This spacious 5-bed

In [4]:
# Step 3: Storing Listings in a Vector Database

"""
* Vector Database Setup: Initialize and configure ChromaDB or a similar vector database to store real estate listings.

* Generating and Storing Embeddings: Convert the LLM-generated listings into suitable embeddings that capture the semantic 
content of each listing and store these embeddings in the vector database.
"""

embeddings = OpenAIEmbeddings()

documented_descriptions = Document(page_content = output)
vector_descr = [documented_descriptions]

splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
split_docs = splitter.split_documents(vector_descr)

#db = Chroma.from_documents(desc, embeddings)
db = Chroma.from_documents(split_docs, embeddings)

In [6]:
# Step 4: Building the User Preference Interface
"""
Collect buyer preferences, such as the number of bedrooms, bathrooms, location, and other specific requirements from a set 
of questions or telling the buyer to enter their preferences in natural language. You can hard-code the buyer preferences 
in questions and answers, or collect them interactively however you'd like, example:
"""

questions = [   
"How big do you want your house to be?",
"What are 3 most important things for you in choosing this property?", 
"Which amenities would you like?", 
"Which transportation options are important to you?",
"How urban do you want your neighborhood to be?",   
            ]
answers = [
"A comfortable three-bedroom house with a spacious kitchen and a cozy living room.",
"A quiet neighborhood, good local schools, and convenient shopping options.",
"A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.",
"Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.",
"A balance between suburban tranquility and access to urban amenities like restaurants and theaters."
            ]
# Buyer Preference Parsing: Implement logic to interpret and structure these preferences for querying the vector database.

prompt_template_questionary = PromptTemplate.from_template(
    "Real estate agent/seller questions: " + 
    questions[0] + answers[0] + 
    questions[1] + answers[1] + 
    questions[2] + answers[2] + 
    questions[3] + answers[3] + 
    questions[4] + answers[4]
)

print (prompt_template_questionary)

query = """
Based on these questions and answers made to the customers, which property name would you recommend? 
Make sure you do not paraphrase the descriptions, and only use the information provided in the descriptions.
"""

use_chain_helper = True

if use_chain_helper:
    # Initialize the retriever
    retriever = db.as_retriever()
    print("Retriever initialized:", retriever)
    # Create the RetrievalQA chain
    try:
        rag = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever)
        print("RetrievalQA chain initialized")
        # Run the query
        result = rag.run(query)
        print("Query result:", result)
    except Exception as e:
        print("Error initializing RetrievalQA or running query:", e)
else:
    similar_docs = db.similarity_search(query, k=3)
    prompt = PromptTemplate(
        template="{query}\nContext: {context}",
        input_variables=["query", "context"]
    )
    chain = load_qa_chain(llm, prompt=prompt, chain_type="stuff")
    print(chain.run(input_documents=similar_docs, query=query))

input_variables=[] template='Real estate agent/seller questions: How big do you want your house to be?A comfortable three-bedroom house with a spacious kitchen and a cozy living room.What are 3 most important things for you in choosing this property?A quiet neighborhood, good local schools, and convenient shopping options.Which amenities would you like?A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.Which transportation options are important to you?Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.How urban do you want your neighborhood to be?A balance between suburban tranquility and access to urban amenities like restaurants and theaters.'
Retriever initialized: tags=['Chroma', 'OpenAIEmbeddings'] vectorstore=<langchain.vectorstores.chroma.Chroma object at 0x79901928e6b0>
Error initializing RetrievalQA or running query: 1 validation error for LLMChain
llm
  value is not a valid dict (type=type_error.dic

In [None]:
# Step 5: Searching Based on Preferences
"""
* Semantic Search Implementation: Use the structured buyer preferences to perform a semantic search on the vector database, 
retrieving listings that most closely match the user's requirements.

* Listing Retrieval Logic: Fine-tune the retrieval algorithm to ensure that the most relevant listings are selected based 
on the semantic closeness to the buyer’s preferences.
"""

In [None]:
# Step 6: Personalizing Listing Descriptions
"""
* LLM Augmentation: For each retrieved listing, use the LLM to augment the description, tailoring it to resonate with 
the buyer’s specific preferences. This involves subtly emphasizing aspects of the property that align with what 
the buyer is looking for.

* Maintaining Factual Integrity: Ensure that the augmentation process enhances the appeal of the listing without 
altering factual information.
"""

In [None]:
Step 7: Deliverables and Testing
•	Test your "HomeMatch" application and make sure it meets all of the requirements in the rubric. Your project code will be run when it's assessed. Enter different "buyer preferences" and ensure it works.
•	Jupyter Notebook/Python Program: Compile the application code in a Jupyter notebook or a standalone Python program. Ensure the code is well-commented and logically structured.
•	Example Outputs: Include example outputs showcasing how user preferences are processed and how the application generates personalized listing descriptions. You can include these in comments in your application or in a Jupyter notebook that's saved with outputs.
