## This notebook generates 10 records that mimic real estate listings using a one-shot approach

In [1]:
import json
import pandas as pd

# Only the completion-style OpenAI wrapper is used in this notebook
from langchain_openai import OpenAI

In [2]:
import os
from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv()) # read local .env file
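
`load_dotenv` does not complain when the `.env` file or a key is missing, so a missing key only surfaces later, when the model is called. A minimal sketch of an early check follows. The helper name `require_env` is hypothetical, and it assumes the key is stored under `OPENAI_API_KEY`, the variable `langchain_openai` reads by default.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Possible usage in the notebook, right after load_dotenv(...):
# require_env("OPENAI_API_KEY")
```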


In [3]:
model_name = "gpt-3.5-turbo-instruct"

temperature = 0.7

llm = OpenAI(model_name=model_name, temperature=temperature, max_tokens=3000)


In [4]:
prompt = """
You are a real estate agent with 15 years of experience, and you are helping a data scientist
generate 10 training records that mimic real estate listings in a small handful of suburban neighborhoods,
with prices ranging from $700,000 to $1,000,000 and an average size of 3 bedrooms, 2 bathrooms, and 2,000 square feet.
Each record can deviate from these averages by as much as 50%.
The Listing Number starts with an "L" followed by a unique 6-digit number.
Be creative and provide fun facts while writing "House Description" and "Neighborhood Description".
** Do not use the same content more than once **.
The output should contain only a list of JSON objects, nothing else.
Each record needs to be in JSON format, with key attributes following the example provided below.


### example ###
Listing Number: L123456
Neighborhood: Green Oaks
Price: $800,000
Bedrooms: 3
Bathrooms: 2
House Size: 2,000 sqft

House Description: Welcome to this eco-friendly oasis nestled in the heart of Green Oaks. This charming 3-bedroom, 2-bathroom home boasts energy-efficient features such as solar panels and a well-insulated structure. Natural light floods the living spaces, highlighting the beautiful hardwood floors and eco-conscious finishes. The open-concept kitchen and dining area lead to a spacious backyard with a vegetable garden, perfect for the eco-conscious family. Embrace sustainable living without compromising on style in this Green Oaks gem.

Neighborhood Description: Green Oaks is a close-knit, environmentally-conscious community with access to organic grocery stores, community gardens, and bike paths. Take a stroll through the nearby Green Oaks Park or grab a cup of coffee at the cozy Green Bean Cafe. With easy access to public transportation and bike lanes, commuting is a breeze.
"""

In [5]:
output = llm.invoke(prompt)

In [6]:
type(output)

str

In [7]:
print(output)


[
  {
    "Listing Number": "L123456",
    "Neighborhood": "Green Oaks",
    "Price": "$800,000",
    "Bedrooms": 3,
    "Bathrooms": 2,
    "House Size": "2,000 sqft",
    "House Description": "Welcome to this eco-friendly oasis nestled in the heart of Green Oaks. This charming 3-bedroom, 2-bathroom home boasts energy-efficient features such as solar panels and a well-insulated structure. Natural light floods the living spaces, highlighting the beautiful hardwood floors and eco-conscious finishes. The open-concept kitchen and dining area lead to a spacious backyard with a vegetable garden, perfect for the eco-conscious family. Embrace sustainable living without compromising on style in this Green Oaks gem.",
    "Neighborhood Description": "Green Oaks is a close-knit, environmentally-conscious community with access to organic grocery stores, community gardens, and bike paths. Take a stroll through the nearby Green Oaks Park or grab a cup of coffee at the cozy Green Bean Cafe. With ea

In [8]:
data = json.loads(output)

In [9]:
type(data)

list
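
`json.loads(output)` works here because the completion is a bare JSON array. If the model adds text around the array, or the completion is cut off at the `max_tokens` limit, the call raises an error. The sketch below is one way to make that step more defensive. The helper name `parse_listings` is hypothetical, and the sample string stands in for a real completion.

```python
import json


def parse_listings(raw: str) -> list:
    """Extract and parse the outermost JSON array from a raw completion string."""
    start = raw.find("[")
    end = raw.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in model output")
    # json.JSONDecodeError is a subclass of ValueError, so callers can
    # catch ValueError for both a missing and a malformed array.
    records = json.loads(raw[start:end + 1])
    if not all(isinstance(record, dict) for record in records):
        raise ValueError("expected a list of JSON objects")
    return records


# Stand-in for `output`: leading newlines plus a one-record array
sample = '\n\n[\n  {"Listing Number": "L123456", "Bedrooms": 3}\n]\n'
records = parse_listings(sample)
```

In the notebook, `data = parse_listings(output)` could replace the direct `json.loads` call. Wrapping it in a `try`/`except ValueError` block would let a failed generation be retried instead of stopping the run.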

In [10]:
df = pd.DataFrame(data)
df.head(10)

,Listing Number,Neighborhood,Price,Bedrooms,Bathrooms,House Size,House Description,Neighborhood Description
0,L123456,Green Oaks,"$800,000",3,2,"2,000 sqft",Welcome to this eco-friendly oasis nestled in ...,"Green Oaks is a close-knit, environmentally-co..."
1,L234567,Maple Hills,"$750,000",3,2,"2,500 sqft","Welcome to this beautiful 3-bedroom, 2-bathroo...","Maple Hills is a friendly, family-oriented nei..."
2,L345678,Willow Creek,"$850,000",4,3,"1,800 sqft","Welcome to this charming 4-bedroom, 3-bathroom...","Willow Creek is a quiet, tree-lined neighborho..."
3,L456789,Oakwood Heights,"$950,000",3,2,"2,500 sqft","Welcome to this stunning 3-bedroom, 2-bathroom...",Oakwood Heights is a prestigious neighborhood ...
4,L567890,Chestnut Grove,"$900,000",3,2,"2,200 sqft","Welcome to this charming 3-bedroom, 2-bathroom...","Chestnut Grove is a peaceful, family-friendly ..."
5,L678901,Pine Valley,"$700,000",4,3,"1,800 sqft","Welcome to this spacious 4-bedroom, 3-bathroom...","Pine Valley is a quiet, family-oriented neighb..."
6,L789012,Cedar Heights,"$950,000",3,2,"2,500 sqft","Welcome to this stunning 3-bedroom, 2-bathroom...","Cedar Heights is a prestigious, gated communit..."
7,L890123,Aspen Grove,"$800,000",3,2,"1,800 sqft","Welcome to this charming 3-bedroom, 2-bathroom...","Aspen Grove is a peaceful, tree-lined neighbor..."
8,L901234,Evergreen Estates,"$900,000",4,3,"2,500 sqft","Welcome to this stunning 4-bedroom, 3-bathroom...",Evergreen Estates is a prestigious neighborhoo...
9,L012345,Hickory Hills,"$750,000",3,2,"2,200 sqft","Welcome to this beautiful 3-bedroom, 2-bathroo...","Hickory Hills is a quiet, family-friendly neig..."
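
The prompt asks for listing numbers made of an "L" followed by a unique 6-digit number, but nothing in the notebook verifies that the model complied. A minimal sketch of such a check follows. The function name `find_listing_number_problems` is hypothetical, and the sample records reuse three listing numbers from the table above.

```python
import re
from collections import Counter

# "L" followed by exactly six digits, as requested in the prompt
LISTING_RE = re.compile(r"L\d{6}")


def find_listing_number_problems(records: list) -> list:
    """Return human-readable problems found in the "Listing Number" field."""
    problems = []
    numbers = [str(record.get("Listing Number", "")) for record in records]
    for number in numbers:
        if not LISTING_RE.fullmatch(number):
            problems.append(f"bad format: {number!r}")
    for number, count in Counter(numbers).items():
        if count > 1:
            problems.append(f"duplicate: {number!r} appears {count} times")
    return problems


sample_records = [
    {"Listing Number": "L123456"},
    {"Listing Number": "L234567"},
    {"Listing Number": "L012345"},
]
problems = find_listing_number_problems(sample_records)
```

An empty list means every listing number is well formed and unique. In the notebook the function could be called on `data` before the DataFrame is built.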


In [11]:
df.dtypes

Listing Number              object
Neighborhood                object
Price                       object
Bedrooms                     int64
Bathrooms                    int64
House Size                  object
House Description           object
Neighborhood Description    object
dtype: object
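
The `dtypes` output shows that `Price` and `House Size` are stored as `object` (strings such as "$800,000" and "2,000 sqft"), so they cannot be filtered or aggregated numerically as they stand. A minimal sketch of a conversion helper follows. The name `to_number` is hypothetical, and it assumes the values are whole numbers with optional currency symbols, thousands separators, and unit suffixes.

```python
import re


def to_number(text: str) -> int:
    """Strip everything except digits and return the result as an integer."""
    digits = re.sub(r"\D", "", text)
    if not digits:
        raise ValueError(f"no digits found in {text!r}")
    return int(digits)


price = to_number("$800,000")
size = to_number("2,000 sqft")
```

In the notebook this could be applied column-wise before saving, for example `df["Price"] = df["Price"].map(to_number)`, and likewise for `House Size`.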

In [12]:
df.to_csv('listing_5.csv', index=False)