## GenerateListings

**Purpose**: Generates real estate listings using AI for the HomeMatch application.

**Main Components**:
- AI-powered listing generation (GPT-3.5)
- Batch processing to handle token limits
- CSV export functionality

**Output**:
- Creates `house_listings.csv` with property listings including:
  - Neighborhood, Property Type, Price, Bedrooms, Bathrooms, House Size, Description, Neighborhood Description
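
The `Price` and `House Size` columns are stored as text (for example `€150,000` and `50 sqm`), so they need converting before any numeric filtering. A minimal sketch of that conversion, assuming prices never carry a decimal part; `parse_price` and `parse_size_sqm` are hypothetical helpers and are not part of the notebook:

```python
import re


def parse_price(value: str) -> int:
    """Convert a price string such as "€150,000" to whole euros."""
    # Assumption: no decimal part, so every non-digit character can be dropped.
    digits = re.sub(r"[^\d]", "", value)
    if not digits:
        raise ValueError(f"no digits found in price: {value!r}")
    return int(digits)


def parse_size_sqm(value: str) -> float:
    """Convert a size string such as "50 sqm" to square metres."""
    match = re.search(r"\d+(?:\.\d+)?", value.replace(",", ""))
    if match is None:
        raise ValueError(f"no number found in size: {value!r}")
    return float(match.group())


print(parse_price("€150,000"))   # 150000
print(parse_size_sqm("50 sqm"))  # 50.0
```

With pandas, these could be applied column-wise, for example `df["Price"].map(parse_price)`.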

**Configuration**:
- City: Lisbon
- Price Range: €100,000 - €1,500,000
- Batch Size: 10 listings per API call
- Listings generated: 100 (the `num_listings` value passed in the last cell; the function's own default is 1)
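
The notebook's loop asks for a full batch of 10 on every call, so a total that is not a multiple of the batch size is effectively rounded up. The sketch below shows one way to compute an exact split instead; `batch_sizes` is a hypothetical helper and is not part of the notebook:

```python
from typing import List


def batch_sizes(num_listings: int, batch_size: int = 10) -> List[int]:
    """Return how many listings to request in each API call."""
    if num_listings < 0 or batch_size <= 0:
        raise ValueError("num_listings must be >= 0 and batch_size must be > 0")
    full, remainder = divmod(num_listings, batch_size)
    # Full batches first, then one smaller batch for whatever is left over.
    return [batch_size] * full + ([remainder] if remainder else [])


print(batch_sizes(25))   # [10, 10, 5]
print(batch_sizes(100))  # ten batches of 10
```

Each value could then be passed as `num_listings_gen` when formatting the prompt for that call.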

**Prerequisites**:
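- Replace `"YOUR API KEY"` with your OpenAI API key (the notebook also sets `OPENAI_API_BASE` to `https://openai.vocareum.com/v1`; change it if you use a different endpoint)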
- Install dependencies: `pip install -r requirements.txt`
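
As an alternative to editing the key into the notebook, the key can be read from the environment and checked before any API call is made. This is only a sketch: `resolve_api_key` is a hypothetical helper, and it assumes the key has already been exported as `OPENAI_API_KEY` in the shell that launches Jupyter.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    placeholder: str = "YOUR API KEY",
) -> str:
    """Return the API key from the environment, rejecting the placeholder."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key or key == placeholder:
        raise RuntimeError("Set OPENAI_API_KEY to a real key before running the notebook.")
    return key


# Example with an explicit mapping instead of the real environment:
print(resolve_api_key({"OPENAI_API_KEY": "sk-example"}))  # sk-example
```

Failing early here avoids running a full batch loop only to hit an authentication error on the first request.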

**Usage**:
- Run all cells sequentially
- Adjust the `num_listings` argument in the last cell to change the number of listings generated

In [None]:
import os
import pandas as pd
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage
from langchain.prompts import PromptTemplate
from io import StringIO



os.environ["OPENAI_API_KEY"] = "YOUR API KEY"
os.environ["OPENAI_API_BASE"] = "https://openai.vocareum.com/v1"


save_file_name = "house_listings.csv"
batch_size = 10  # generate listings in batches so each response stays within the token limit

# init llm
model_name = "gpt-3.5-turbo"
temperature = 1.0
llm = ChatOpenAI(model_name=model_name, temperature=temperature, max_tokens=4000)


In [20]:
# create house listings
def create_house_listings(num_listings=1):
    # Example: Template with variables that can be filled
    template_with_vars = """
    Generate a real estate listing in only one city {city} with the following requirements:

    - Price range: {price_range}
    - Property type: {property_type}
    - Bedrooms: from 1 to 5
    - Bathrooms: from 1 to 3


    Each listing must have these columns, in this order:
    - Neighborhood: unique neighborhood name
    - Property Type: property type
    - Price: price with the € symbol (for example "€750,000")
    - Bedrooms: number
    - Bathrooms: number
    - House Size: size in sqm (include the unit, for example "50 sqm")
    - Description: detailed property description, 1-2 sentences
    - Neighborhood Description: neighborhood description, 1-2 sentences

    Generate {num_listings_gen} listings.
    Make each listing unique and appealing.
    Return the listings as CSV: one header row with the column names above, then one row per listing.
    IMPORTANT: All fields that contain commas must be enclosed in double quotes. For example: "€750,000" or "A beautiful apartment, with modern amenities."
    """

    # Create the PromptTemplate
    prompt = PromptTemplate(
        input_variables=["city", "price_range", "property_type", "num_listings_gen"],
        template=template_with_vars
    )

    # Store values in a dictionary (this is what you'll unpack)
    prompt_vars = {
        "city": "Lisbon",
        "price_range": "€100,000 - €1,500,000",
        "property_type": "Apartment, House",
        "num_listings_gen": batch_size,
    }

    all_listings = []
    header_written = False

    # Format the template using ** to unpack the dictionary
    formatted_prompt = prompt.format(**prompt_vars)
    for i in range(0, num_listings, batch_size):
        # Run in batches so each response stays within the token limit;
        # print progress before the (slow) API call rather than after it
        print("running for interval", i, "to", i+batch_size)
        response = llm([HumanMessage(content=formatted_prompt)])

        # Clean up the response (remove markdown code blocks if present)
        csv_content = response.content.strip()
        if csv_content.startswith("```csv"):
            # Slice by the marker's actual length (6); the later strip() removes the newline
            csv_content = csv_content[len("```csv"):]
        if csv_content.startswith("```"):
            csv_content = csv_content[3:]
        if csv_content.endswith("```"):
            csv_content = csv_content[:-3]
        
        # Combine results
        lines = csv_content.strip().split('\n')
        if not header_written and len(lines) > 0:
            all_listings.append(lines[0])  # Header
            header_written = True
            if len(lines) > 1:
                all_listings.extend(lines[1:])  # Data rows
        elif len(lines) > 1:
            all_listings.extend(lines[1:])  # Skip header, add data rows
    
    return '\n'.join(all_listings)
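
The fence clean-up inside the loop can also be written as a small line-based helper, which is easier to test in isolation. This is a sketch only: `strip_code_fence` is a hypothetical name, and it assumes that any fence markers sit on their own first and last lines of the response.

```python
FENCE = "`" * 3  # the three-backtick markdown fence marker


def strip_code_fence(text: str) -> str:
    """Remove a leading and trailing markdown fence line, if present."""
    lines = text.strip().splitlines()
    # Drop an opening fence line, with or without a language tag such as "csv".
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
    # Drop a closing fence line.
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]
    return "\n".join(lines).strip()


fenced = FENCE + "csv\nNeighborhood,Price\nAlfama,\"€150,000\"\n" + FENCE
print(strip_code_fence(fenced))
```

Text without fences passes through unchanged apart from surrounding whitespace.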


In [21]:
# save the listings to a csv file
def save_listings_csv(listings):
    # Try to parse, skip bad lines automatically
    try:
        df = pd.read_csv(StringIO(listings), on_bad_lines='skip')
    except pd.errors.ParserError:
        # If still fails, remove last line and try again
        lines = listings.strip().split('\n')
        if len(lines) > 1:
            listings = '\n'.join(lines[:-1])
        df = pd.read_csv(StringIO(listings), on_bad_lines='skip')
    
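    # If the frame has the expected 8 columns, drop the last row when any of its
    # values is missing or empty (typically a listing cut off by the token limit)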
    if len(df) > 0 and len(df.columns) == 8:
        last_row = df.iloc[-1]
        if last_row.isna().any() or (last_row == '').any():
            df = df.iloc[:-1]  # Remove incomplete last entry
    
    # pandas quotes fields that contain commas by default;
    # index=True also writes the row index as an unnamed first column
    df.to_csv(save_file_name, index=True)
    return df
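
To see which rows would fail the 8-column expectation before handing the text to pandas, the rows can be pre-checked with the standard `csv` module. A minimal sketch, assuming quotes are balanced within each row; `split_complete_rows` is a hypothetical helper and is not part of the notebook:

```python
import csv
from io import StringIO
from typing import List, Tuple


def split_complete_rows(
    csv_text: str, expected_cols: int = 8
) -> Tuple[List[List[str]], List[List[str]]]:
    """Split parsed rows into (complete, rejected) by field count."""
    complete: List[List[str]] = []
    rejected: List[List[str]] = []
    for row in csv.reader(StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        if len(row) == expected_cols:
            complete.append(row)
        else:
            rejected.append(row)
    return complete, rejected


sample = 'A,B,C\n1,"x, y",3\n4,5\n'
complete, rejected = split_complete_rows(sample, expected_cols=3)
print(len(complete), len(rejected))  # 2 1
```

Unlike `on_bad_lines='skip'`, this keeps the rejected rows available for inspection or logging.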


In [23]:
# run this to create the listings
listings = create_house_listings(num_listings=100)
print(listings)
save_listings_csv(listings)

running for interval 0 to 10
running for interval 10 to 20
running for interval 20 to 30
running for interval 30 to 40
running for interval 40 to 50
running for interval 50 to 60
running for interval 60 to 70
running for interval 70 to 80
running for interval 80 to 90
running for interval 90 to 100
Neighborhood,Property Type,Price,Bedrooms,Bathrooms,House Size,Description,Neighborhood Description
"Alfama","Apartment","€150,000",1,1,"50 sqm","Charming one-bedroom apartment with views of the historic Alfama district. Perfect for a cozy city retreat.","Alfama is a picturesque neighborhood known for its narrow streets, Fado music, and stunning views of the Tagus River."
"Chiado","House","€1,200,000",3,2,"200 sqm","Beautiful three-bedroom house with modern amenities and a rooftop terrace overlooking the city. Ideal for entertaining guests and enjoying the Lisbon lifestyle.","Chiado is a trendy neighborhood filled with boutique shops, cafes, and art galleries, making it a vibrant cultural hu

Unnamed: 0,Neighborhood,Property Type,Price,Bedrooms,Bathrooms,House Size,Description,Neighborhood Description
0,Alfama,Apartment,"€150,000",1,1,50 sqm,Charming one-bedroom apartment with views of t...,Alfama is a picturesque neighborhood known for...
1,Chiado,House,"€1,200,000",3,2,200 sqm,Beautiful three-bedroom house with modern amen...,Chiado is a trendy neighborhood filled with bo...
2,Baixa,Apartment,"€350,000",2,1,80 sqm,Recently renovated two-bedroom apartment in th...,"Baixa is Lisbon's downtown area, known for its..."
3,Estrela,House,"€950,000",4,3,300 sqm,Stunning four-bedroom house with a private gar...,Estrela is an upscale residential area known f...
4,Campo de Ourique,Apartment,"€280,000",1,1,65 sqm,Cozy one-bedroom apartment in the vibrant Camp...,Campo de Ourique is a lively neighborhood with...
...,...,...,...,...,...,...,...,...
95,Estrela,House,"€950,000",5,2,250 sqm,Stunning villa with a rooftop terrace and pano...,Estrela is an elegant district with beautiful ...
96,Belem,Apartment,"€300,000",2,2,70 sqm,Modern apartment with river views and proximit...,Belem is a historic area with famous monuments...
97,Santos,House,"€1,100,000",3,2,220 sqm,Renovated townhouse with a courtyard and desig...,Santos is a lively district with a mix of trad...
98,Parque das Nações,Apartment,"€500,000",3,2,90 sqm,Contemporary apartment in a waterfront develop...,Parque das Nações is a modern district with a ...
