This is a starter notebook for the project. You will need to import the libraries you use; the ones available in this workspace are listed in its requirements.txt file.

In [1]:
!pip install pandas

Defaulting to user installation because normal site-packages is not writeable
Collecting pandas
  Downloading pandas-2.2.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.0 MB)
Collecting tzdata>=2022.7
  Downloading tzdata-2024.1-py2.py3-none-any.whl (345 kB)
Installing collected packages: tzdata, pandas
Successfully installed pandas-2.2.2 tzdata-2024.1


In [1]:
import os

os.environ["OPENAI_API_KEY"] = "<YOUR API KEY>"
os.environ["OPENAI_API_BASE"] = "https://openai.vocareum.com/v1"
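Hard-coding the key in a cell makes it easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the key is either already exported in the shell or typed in interactively; the helper name `resolve_api_key` is hypothetical and not part of the original notebook or of any library:

```python
import os
from getpass import getpass


def resolve_api_key(env=os.environ, prompt_fn=getpass):
    """Return the API key from the environment, prompting only if it is missing.

    `resolve_api_key` is a hypothetical helper, not part of any library.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key or key == "<YOUR API KEY>":
        # Fall back to an interactive prompt so the key never lands in the notebook.
        key = prompt_fn("OpenAI API key: ").strip()
    if not key:
        raise ValueError("No API key provided")
    return key
```

In the notebook this could be used as `os.environ["OPENAI_API_KEY"] = resolve_api_key()`.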



In [2]:
# from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.vectorstores import Chroma
from langchain.prompts import PromptTemplate
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.document_loaders.csv_loader import CSVLoader
import pandas as pd

In [3]:
model_name = 'gpt-3.5-turbo'
llm = ChatOpenAI(model=model_name, temperature=0, max_tokens=2000)

### Synthetic Data Generation
### (No need to run this section again: the synthetic data has already been generated and saved to data/property_listings.csv)

In [43]:
prompt_template = """Generate {num_listings} property listings in CSV format. Each listing must contain these fields:
Location, Price, Size in Sqft, Number of Bedrooms, Number of Bathrooms, Property Age. Generate multiple listings for each location with varying property sizes.
Only CSV format is acceptable. Use a semicolon as the separator.
"""

synthetic_data_prompt = PromptTemplate.from_template(prompt_template)

In [50]:
listings = llm.predict(synthetic_data_prompt.format(num_listings=25))

In [52]:
# Save the synthetic data as a CSV file (the data/ directory must already exist)
with open('data/property_listings.csv', 'w') as f:
    f.write(listings)
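Model output is not guaranteed to be well-formed CSV, so it may be worth checking it before writing the file. The sketch below is one possible check, assuming the semicolon separator and the six column names requested in the prompt; `validate_listings_csv` is a hypothetical helper, not something the notebook defines.

```python
import csv
import io

# Column names as requested in the prompt (assumed to come back verbatim).
EXPECTED_COLUMNS = [
    "Location", "Price", "Size in Sqft",
    "Number of Bedrooms", "Number of Bathrooms", "Property Age",
]


def validate_listings_csv(text, delimiter=";"):
    """Parse LLM output as CSV and return the data rows, or raise ValueError."""
    rows = list(csv.reader(io.StringIO(text.strip()), delimiter=delimiter))
    if not rows:
        raise ValueError("empty output")
    header = [cell.strip() for cell in rows[0]]
    if header != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {header}")
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(EXPECTED_COLUMNS):
            raise ValueError(
                f"line {line_no}: expected {len(EXPECTED_COLUMNS)} fields, got {len(row)}"
            )
    return rows[1:]
```

Calling `validate_listings_csv(listings)` before the `open(...)` call would surface a malformed response early instead of at load time.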


### Loading Generated Synthetic data 



In [4]:
# Validate the CSV schema with pandas (pandas must be installed to run this cell)
df = pd.read_csv('data/property_listings.csv', sep=';')
df

      Location    Price  Size in Sqft  Number of Bedrooms  Number of Bathrooms  Property Age
0     New York  1000000          1500                   3                    2             5
1     New York  1500000          2000                   4                    3            10
2     New York   800000          1200                   2                    1             2
3     New York  1200000          1800                   3                    2             7
4     New York   900000          1300                   2                    1             4
5  Los Angeles   900000          1800                   3                    2             8
6  Los Angeles  1200000          2500                   4                    3            12
7  Los Angeles   750000          1500                   2                    1             5
8  Los Angeles  1100000          2000                   3                    2            10
9  Los Angeles   850000          1600                   2                    1             6


In [4]:
# Loading Generated Synthetic data
loader = CSVLoader(file_path='data/property_listings.csv')
docs = loader.load()

In [5]:
docs

[Document(page_content='Location;Price;Size in Sqft;Number of Bedrooms;Number of Bathrooms;Property Age: New York;1000000;1500;3;2;5', metadata={'source': 'data/property_listings.csv', 'row': 0}),
 Document(page_content='Location;Price;Size in Sqft;Number of Bedrooms;Number of Bathrooms;Property Age: New York;1500000;2000;4;3;10', metadata={'source': 'data/property_listings.csv', 'row': 1}),
 Document(page_content='Location;Price;Size in Sqft;Number of Bedrooms;Number of Bathrooms;Property Age: New York;800000;1200;2;1;2', metadata={'source': 'data/property_listings.csv', 'row': 2}),
 Document(page_content='Location;Price;Size in Sqft;Number of Bedrooms;Number of Bathrooms;Property Age: New York;1200000;1800;3;2;7', metadata={'source': 'data/property_listings.csv', 'row': 3}),
 Document(page_content='Location;Price;Size in Sqft;Number of Bedrooms;Number of Bathrooms;Property Age: New York;900000;1300;2;1;4', metadata={'source': 'data/property_listings.csv', 'row': 4}),
 Document(page_c
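In the output above, every row ended up in a single field whose name is the entire header line, which suggests the file was read with the default comma delimiter rather than the semicolon it was written with. The sketch below reproduces the effect with the standard-library `csv.DictReader` and formats each row as `column: value` lines; that this mirrors what `CSVLoader` does internally is an assumption. `CSVLoader` accepts a `csv_args` dictionary, so passing `csv_args={'delimiter': ';'}` would likely be the corresponding fix in the notebook.

```python
import csv
import io

# Two lines in the same shape as data/property_listings.csv.
SAMPLE = (
    "Location;Price;Size in Sqft;Number of Bedrooms;Number of Bathrooms;Property Age\n"
    "New York;1000000;1500;3;2;5\n"
)


def rows_as_text(text, delimiter=","):
    """Render each CSV row as 'column: value' lines, one string per row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return ["\n".join(f"{k}: {v}" for k, v in row.items()) for row in reader]


# Default comma delimiter: the whole line is treated as one column.
print(rows_as_text(SAMPLE)[0])
# Semicolon delimiter: each field gets its own labelled line.
print(rows_as_text(SAMPLE, delimiter=";")[0])
```

With the semicolon delimiter each document carries labelled fields such as `Location: New York`, which gives both the embedding model and the LLM cleaner text to work with.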

### Semantic search on Vector DB

In [6]:

splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
split_docs = splitter.split_documents(docs)

embeddings = OpenAIEmbeddings()

db = Chroma.from_documents(split_docs, embeddings)
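Conceptually, the vector store keeps one embedding per document and answers a query by returning the documents whose embeddings lie closest to the query embedding. The toy sketch below illustrates that ranking step with hand-written 3-dimensional vectors and cosine similarity. The vectors, the `top_k` helper, and the choice of cosine similarity are all illustrative assumptions; they do not reflect the real embedding dimensions or the distance metric and index that Chroma actually uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return the ids of the k documents most similar to the query (hypothetical helper)."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Made-up embeddings, for illustration only.
doc_vecs = {
    "new_york_listing": [0.9, 0.1, 0.0],
    "los_angeles_listing": [0.1, 0.9, 0.0],
    "miami_listing": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]
print(top_k(query_vec, doc_vecs, k=2))
```

The query vector points mostly along the first axis, so the New York listing ranks first; the store always returns the `k` nearest documents, whether or not any of them is a good match.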

In [7]:
queries = [
    """Show me some properties in San Francisco.""",
    """Find me top properties in Miami with at least 3 bedrooms.""",
    """Recommend properties in New York not older than 5 years""",
]


top_n_results = 3



In [8]:
def get_search_results(queries):
    """Return the top_n_results most similar documents for each query."""
    answers = []
    for query in queries:
        answer = db.similarity_search(query, k=top_n_results)
        answers.append(answer)

    return answers
    

In [9]:
answers = get_search_results(queries)
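Similarity search ranks listings by how close their text is to the query; it does not by itself enforce numeric conditions such as "at least 3 bedrooms" or "not older than 5 years". One way to handle such conditions, sketched here as an assumption rather than something the notebook does, is to filter the parsed rows explicitly before or after retrieval. The `filter_listings` helper and its parameters are hypothetical; the sample rows are taken from the listings shown earlier.

```python
import csv
import io

SAMPLE = (
    "Location;Price;Size in Sqft;Number of Bedrooms;Number of Bathrooms;Property Age\n"
    "New York;1000000;1500;3;2;5\n"
    "New York;1500000;2000;4;3;10\n"
    "Los Angeles;750000;1500;2;1;5\n"
)


def filter_listings(text, location=None, min_bedrooms=0, max_age=None):
    """Return rows (as dicts) that satisfy the structured constraints."""
    matches = []
    for row in csv.DictReader(io.StringIO(text), delimiter=";"):
        if location is not None and row["Location"].strip() != location:
            continue
        if int(row["Number of Bedrooms"]) < min_bedrooms:
            continue
        if max_age is not None and int(row["Property Age"]) > max_age:
            continue
        matches.append(row)
    return matches


# Listings in New York that are at most 5 years old.
recent_ny = filter_listings(SAMPLE, location="New York", max_age=5)
print([row["Price"] for row in recent_ny])
```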

### Augmented Response Generation

In [10]:
def get_personalized_recommendations(answer):
    """Ask the LLM to rewrite the retrieved listings as a personalized recommendation."""
    augmented_query = f"""Give me a personalized property listing based on my preferences: {answer}
    Use creative language to describe the property elaborately, but do not include any information beyond what is provided.
    """
    augmented_response = llm.predict(augmented_query)

    return augmented_response
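Note that the function above interpolates the raw list of retrieved `Document` objects into the prompt and never passes along the user's original query. One possible refinement, sketched below under the assumption that each retrieved item exposes its text as a string, is to build the prompt from both the query and the listing text. `build_augmented_prompt` is a hypothetical helper and the prompt wording is only an example.

```python
def build_augmented_prompt(query, listing_texts):
    """Combine the user's query with the retrieved listing texts into one prompt."""
    if not listing_texts:
        raise ValueError("no listings retrieved for this query")
    # Number the listings so the model can tell them apart.
    numbered = "\n".join(
        f"{i}. {text}" for i, text in enumerate(listing_texts, start=1)
    )
    return (
        f"Buyer request: {query}\n\n"
        f"Matching listings:\n{numbered}\n\n"
        "Describe these listings in creative language tailored to the request, "
        "but do not include any facts beyond the listings above."
    )


prompt = build_augmented_prompt(
    "Recommend properties in New York not older than 5 years",
    ["Location: New York; Price: 1000000; Property Age: 5"],
)
print(prompt)
```

In the notebook this could be called as `build_augmented_prompt(queries[i], [doc.page_content for doc in answer])`, with the result passed to `llm.predict`.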
    
    

In [12]:
for i, answer in enumerate(answers):
    
    augmented_response = get_personalized_recommendations(answer)
    print(f'Query {i+1}:\n {queries[i]}\n')
    print(f'Response {i+1}:\n {augmented_response}\n')
    print('_________________________________________')

Query 1:
 Show me some properties in San Francisco.

Response 1:
 Welcome to your dream home in San Francisco! This stunning property boasts 2500 sqft of luxurious living space, featuring 4 spacious bedrooms and 3 beautifully appointed bathrooms. With a price tag of $2,000,000, this home offers the perfect blend of elegance and comfort. The property is just 10 years old, ensuring modern amenities and a contemporary design. Don't miss out on this rare opportunity to own a piece of paradise in the heart of San Francisco.

Alternatively, you may also consider a charming 3-bedroom, 2-bathroom home in San Francisco. Priced at $1,800,000, this cozy property offers 2000 sqft of living space and is perfect for those seeking a more intimate setting. With a property age of 8 years, this home combines modern convenience with classic charm. Don't let this gem slip through your fingers!

Looking for a blend of style and affordability? Look no further than this 3-bedroom, 2-bathroom property in San 