# 🏠 AI-Driven Property Insights & Outreach Automation  
### 2025 **AI Property Insights Hackathon** — Powered by **BatchData**

---

### 🧠 Overview

This project demonstrates an **end-to-end automation pipeline** for identifying and contacting owners of **distressed or high-interest properties** using data from the **BatchData API** and generative AI.

The notebook:
1. Searches for **distressed properties** (e.g., preforeclosure, tax-default, vacant, or inherited) in a target market such as *Phoenix, AZ*.
2. **Skip-traces** those properties to find verified contact information (phone numbers, emails, and mailing addresses).
3. Uses **OpenAI** to automatically generate:
   - Personalized email templates  
   - Compliant phone call scripts  
   - Professional letter content for mail outreach  

---

### ⚙️ How It Works

1. **BatchData Property Search API** – Finds qualifying properties based on distress signals and filters.  
2. **BatchData Skip Trace API** – Retrieves validated owner contact data.  
3. **OpenAI API** – Writes ready-to-use, friendly, and compliant outreach messages.  
4. **Data Output** – Results can be exported to CSV or integrated with downstream automation systems.

---

### 🚀 Future Integrations

This prototype could be extended to connect to:
- **Amazon SES** – for automated email campaigns.  
- **Amazon Connect** – for AI-guided outbound calls.  
- **Printing or fulfillment services** – for physical mailers and letters.  
- **Market analytics pipelines** – to identify *“hot markets”*, owner clusters, or multi-property investors using additional BatchData signals.

---

### 💡 Purpose

The goal is to streamline real-estate lead generation and owner contact through automation and AI, turning raw property data into personalized outreach at scale.  
This system reduces manual effort, filters out phone numbers flagged as Do Not Call, and lays the foundation for a fully automated **AI-driven property insights platform**.

---

*Created for the 2025 AI Property Insights Hackathon, sponsored by BatchData.*


### 🔧 Setup and Configuration

This section initializes the environment and API connections.

- **dotenv**: Loads environment variables (such as API keys) from a `.env` file.  
- **requests**: Sends HTTP requests to the BatchData API.  
- **pandas**: Imported for organizing and exporting data (for example, to CSV).  
- **openai**: Provides access to OpenAI’s API for generating outreach messages.  

After loading the environment variables, the code sets up the BatchData and OpenAI API keys, builds the endpoint URLs, and prepares the authorization headers required for BatchData requests.


In [None]:
import json
import os

import pandas as pd
import requests
from dotenv import find_dotenv, load_dotenv
from openai import OpenAI

load_dotenv(find_dotenv())

BD_API_KEY = os.getenv("BATCHDATA_API_PROD_KEY")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

BD_API_BASE_URL = os.getenv("BATCHDATA_API_BASE_URL")
BD_PROPERTY_SEARCH_ENDPOINT = f"{BD_API_BASE_URL}/property/search"
BD_SKIP_TRACE_ENDPOINT = f"{BD_API_BASE_URL}/property/skip-trace"

bd_headers = {
    "Authorization": f"Bearer {BD_API_KEY}",
    "Content-Type": "application/json",
    "Accept": "application/json",
}
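
The setup above silently produces `None` values (and URLs such as `None/property/search`) when a variable is missing from `.env`. A minimal sketch of a fail-fast check follows; `require_env` and `build_endpoints` are hypothetical helpers, not part of the notebook or of any library, and the example base URL is a placeholder rather than the real BatchData address.

```python
import os


def require_env(names, env=None):
    """Return {name: value} for every name, or raise listing all that are missing."""
    env = os.environ if env is None else env
    values = {name: (env.get(name) or "").strip() for name in names}
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return values


def build_endpoints(base_url):
    """Join the base URL with the two paths used in this notebook."""
    base = base_url.rstrip("/")  # avoid a double slash if the base ends with "/"
    return {
        "property_search": f"{base}/property/search",
        "skip_trace": f"{base}/property/skip-trace",
    }


# Placeholder values for illustration only (hypothetical, not real credentials).
example_env = {
    "BATCHDATA_API_PROD_KEY": "example-key",
    "OPENAI_API_KEY": "example-key",
    "BATCHDATA_API_BASE_URL": "https://api.example.invalid/v1/",
}
settings = require_env(list(example_env), env=example_env)
endpoints = build_endpoints(settings["BATCHDATA_API_BASE_URL"])
print(endpoints["property_search"])
```

In the notebook itself, calling `require_env([...])` without the `env` argument would read from the real environment after `load_dotenv()` has run.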

### 🏘️ Property Search Request

This section queries the **BatchData Property Search API** to retrieve a list of properties based on specific criteria.

- The `searchCriteria` object defines the target location (`"Phoenix, AZ"`) and filters results through the `orQuickLists` field, a list of **QuickLists** such as:
  - `preforeclosure`
  - `vacant`
  - `tax-default`
  - `inherited`, etc.

- The `options` object specifies pagination (`skip` and `take` values), controlling how many records are returned.

The request is sent via `requests.post()` to the property search endpoint using the authorization headers defined earlier.  
If the request succeeds (HTTP 200), the JSON response is parsed and the list under `results.properties` is stored in `properties`; otherwise `properties` is an empty list.


In [None]:
# body = {"searchCriteria": {"query": "Phoenix, AZ"}, "options": {"skip": 0, "take": 10}}
body = {
    "searchCriteria": {
        "query": "Phoenix, AZ",
        "orQuickLists": [
            "preforeclosure",
            "notice-of-lis-pendens",
            "notice-of-default",
            "tax-default",
            "involuntary-lien",
            "vacant",
            "mailing-address-vacant",
            "absentee-owner",
            "inherited",
        ],
    },
    "options": {"skip": 0, "take": 10},
}

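# A timeout keeps the cell from hanging indefinitely if the API does not respond.
resp = requests.post(
    BD_PROPERTY_SEARCH_ENDPOINT, headers=bd_headers, json=body, timeout=30
)

if resp.status_code == 200:
    print("OK:", resp.status_code)
    data = resp.json()
else:
    # Show the start of the error body to make failures easier to diagnose.
    print(f"Request failed: {resp.status_code}")
    print(resp.text[:500])
    data = None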

if data:
    properties = data.get("results", {}).get("properties", [])
else:
    properties = []

print(f"Got {len(properties)} properties")
print(json.dumps(properties[:2], indent=2))
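
The search above fetches a single page of 10 records. A minimal sketch of walking several pages with `skip` and `take` follows. It assumes, without confirmation from the API documentation, that a page shorter than `take` marks the end of the results. `fetch_all` and `fake_fetch_page` are hypothetical names, and the fake function stands in for the real HTTP call so the example runs offline.

```python
def fetch_all(fetch_page, take=10, max_records=50):
    """Collect records page by page until a short page or max_records is reached.

    fetch_page(skip, take) is a caller-supplied function returning a list of records.
    """
    records = []
    skip = 0
    while len(records) < max_records:
        page = fetch_page(skip, take)
        records.extend(page)
        if len(page) < take:  # assumption: a short (or empty) page is the last one
            break
        skip += take
    return records[:max_records]


# Stand-in for the real API call: serves slices of 23 made-up records.
fake_records = [{"id": i} for i in range(23)]


def fake_fetch_page(skip, take):
    return fake_records[skip : skip + take]


all_records = fetch_all(fake_fetch_page, take=10, max_records=50)
print(len(all_records))
```

In the notebook, `fetch_page` would wrap the `requests.post()` call with `"options": {"skip": skip, "take": take}` and return the list under `results.properties`. The `max_records` cap is there to keep API usage bounded during experiments.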

### ☎️ Skip Trace Request

This section builds and sends a **Skip Trace API** request to BatchData.

- A new request body, `skip_trace_body`, is created to look up **owner contact information** (phone numbers, emails, mailing addresses) for each property retrieved earlier.
- The `"filters": {"dnc": False}` field is intended to limit results to records **not flagged as “Do Not Call” (DNC)**, which supports compliant outreach.
- Each property’s address is extracted from the previous search results and appended to the `"requests"` list.

The code then sends a POST request to the `BD_SKIP_TRACE_ENDPOINT`.  
If successful, the response is parsed into JSON and stored as `skip_trace_data` for later use.  
This data will form the foundation for generating AI-powered outreach messages.


In [None]:
skip_trace_body = {
    "requests": [],
    "filters": {"dnc": False},
}

for p in properties:
    addr = p.get("address", {})
    prop_req = {
        "propertyAddress": {
            "street": addr.get("street", ""),
            "city": addr.get("city", ""),
            "state": addr.get("state", ""),
            "zip": addr.get("zip", ""),
        }
    }
    skip_trace_body["requests"].append(prop_req)

# print(f"Prepared {len(skip_trace_body['requests'])} skip-trace requests")
# print(json.dumps(skip_trace_body, indent=2))

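# Default to None so later cells still run if the request or JSON parsing fails.
skip_trace_data = None

response = requests.post(
    BD_SKIP_TRACE_ENDPOINT, headers=bd_headers, json=skip_trace_body, timeout=60
)

# --- Inspect response ---
print(f"Status code: {response.status_code}")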

if response.status_code == 200:
    try:
        skip_trace_data = response.json()
        print(json.dumps(skip_trace_data, indent=2))
    except ValueError:
        print("⚠️ Response was not valid JSON:")
        print(response.text[:500])
else:
    print(f"❌ Request failed ({response.status_code})")
    print(response.text[:500])
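
The loop above substitutes empty strings for missing address fields, so incomplete addresses are still sent to the API. A minimal sketch of filtering them out first follows. It assumes that all four fields are needed for a useful lookup, which the notebook does not confirm. `build_skip_trace_requests` is a hypothetical helper and the sample addresses are made up.

```python
REQUIRED_FIELDS = ("street", "city", "state", "zip")  # assumption: all four are needed


def build_skip_trace_requests(properties):
    """Return (requests, skipped): request items for complete addresses, and the rest."""
    requests_list, skipped = [], []
    for prop in properties:
        addr = prop.get("address") or {}
        fields = {key: str(addr.get(key) or "").strip() for key in REQUIRED_FIELDS}
        if all(fields.values()):
            requests_list.append({"propertyAddress": fields})
        else:
            skipped.append(prop)
    return requests_list, skipped


# Made-up sample data: one complete address, one missing a street, one with no address.
sample_properties = [
    {"address": {"street": "123 Example St", "city": "Phoenix", "state": "AZ", "zip": "85001"}},
    {"address": {"street": "", "city": "Phoenix", "state": "AZ", "zip": "85001"}},
    {"address": None},
]
ready, skipped = build_skip_trace_requests(sample_properties)
print(len(ready), len(skipped))
```

Returning the skipped properties alongside the requests makes it easy to log how many records were dropped before the paid lookup.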

### 🧩 Extracting and Cleaning Contact Data

This section processes the skip-trace results to build a clean, structured list of property owner contacts.

1. **Access Results**  
   The `"persons"` list is extracted from the `skip_trace_data` response.

2. **Format Addresses**  
   The helper function `format_address()` combines street, city, state, and ZIP (+4) fields into a single readable string.

3. **Filter Phone Numbers**  
   Only phone numbers explicitly flagged `"dnc": false` are kept; numbers that are flagged or that lack the flag are dropped, which errs on the side of Do Not Call compliance.

4. **Collect Emails**  
   Email addresses are extracted and trimmed for use in outreach messages.

Each contact record is then stored in a list called `contacts`, containing:
- `name` (owner or entity name)  
- `address` (full formatted property address)  
- `phones_ok_to_call` (phone numbers explicitly flagged as not DNC)  
- `emails` (available email addresses)

This produces a clean dataset ready for AI-generated communication templates.


In [None]:
persons = (skip_trace_data or {}).get("results", {}).get("persons", [])


def format_address(addr: dict) -> str:
    if not isinstance(addr, dict):
        return ""
    street = addr.get("street", "")
    city = addr.get("city", "")
    state = addr.get("state", "")
    zip5 = addr.get("zip", "")
    zip4 = addr.get("zipPlus4")
    zip_full = f"{zip5}-{zip4}" if zip4 and "-" not in str(zip5) else zip5
    parts = [street, city, state, zip_full]
    return ", ".join([p for p in parts if p])


contacts = []
for r in persons:
    # address
    address = format_address(r.get("propertyAddress", {}))

    # phones: only those with phone.dnc == False
    phones = [
        str(ph.get("number")).strip()
        for ph in (r.get("phoneNumbers") or [])
        if ph.get("dnc") is False and ph.get("number")
    ]

    # emails
    emails = [
        (em.get("email") or "").strip()
        for em in (r.get("emails") or [])
        if em.get("email")
    ]

    # `or {}` guards against an explicit null "name" in the response.
    name = r.get("name") or {}

    contacts.append(
        {
            "name": name.get("full") or name.get("first") or "Property Owner",
            "address": address,
            "phones_ok_to_call": phones,  # filtered by phone-level DNC
            "emails": emails,
        }
    )
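
The phone filter above uses `is False` rather than a truthiness test, so the distinction matters for records with no `dnc` flag. A minimal sketch of that behaviour on made-up data follows. `callable_phones` is a hypothetical wrapper around the same condition, and the field names simply mirror those used in the cell above.

```python
def callable_phones(person):
    """Keep numbers whose "dnc" flag is explicitly False; drop True, None, or missing."""
    return [
        str(ph.get("number")).strip()
        for ph in (person.get("phoneNumbers") or [])
        if ph.get("dnc") is False and ph.get("number")
    ]


# Made-up sample record for illustration.
sample_person = {
    "phoneNumbers": [
        {"number": "5550100001", "dnc": False},  # kept
        {"number": "5550100002", "dnc": True},  # dropped: flagged
        {"number": "5550100003"},  # dropped: flag missing
        {"number": "", "dnc": False},  # dropped: no number
    ]
}
print(callable_phones(sample_person))
```

Treating a missing flag as "do not call" is the conservative choice: a number is only used when the data positively says it may be.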

### 🤖 Generating AI-Based Outreach Content

This section uses the **OpenAI API** to automatically create outreach materials for each property owner.

1. **Initialize OpenAI Client**  
   The `OpenAI` client is configured using the API key loaded from the environment.

2. **Generate Outreach Function**  
   The function `generate_outreach_for_contact()` takes a contact record (name, address, phones, emails) and sends a structured prompt to OpenAI’s `gpt-4o-mini` model.

   - The prompt instructs the model to write in a **professional, friendly, and compliant** tone.  
   - It requests four specific outreach assets, returned in JSON format:
     - `email_subject` — short, up to 7 words  
     - `email_body` — concise, 90–130 words  
     - `call_script` — 45–75 seconds of natural dialog  
     - `letter_body` — 120–180 words suitable for print mail

3. **AI Response Handling**  
   The model’s response is parsed as JSON; if parsing fails, the raw text is kept under a `raw` key instead.

4. **Batch Generation**  
   The notebook loops through all `contacts` and generates outreach content for each, storing the results in a list called `outreach_assets`.

This transforms contact data into ready-to-use, AI-generated communication materials for email, phone, and mail outreach.
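
Because the model is not guaranteed to honor the requested keys and lengths, it can help to check each result before using it. A minimal sketch follows, with limits copied from the list above. `validate_assets` is a hypothetical helper, and the call script is not checked because its limit is given in seconds rather than words.

```python
EXPECTED_KEYS = {"email_subject", "email_body", "call_script", "letter_body"}
# (min_words, max_words) taken from the prompt; the call script is timed, not counted.
WORD_LIMITS = {
    "email_subject": (1, 7),
    "email_body": (90, 130),
    "letter_body": (120, 180),
}


def validate_assets(assets):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    if not isinstance(assets, dict):
        return ["assets is not a JSON object"]
    problems = []
    missing = EXPECTED_KEYS - set(assets)
    extra = set(assets) - EXPECTED_KEYS
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))
    if extra:
        problems.append("unexpected keys: " + ", ".join(sorted(extra)))
    for key, (low, high) in WORD_LIMITS.items():
        if key in assets:
            count = len(str(assets[key]).split())
            if not low <= count <= high:
                problems.append(f"{key}: {count} words, expected {low}-{high}")
    return problems


# Made-up inputs: a well-formed result and the fallback produced when parsing fails.
good = {
    "email_subject": "A quick question about your property",
    "email_body": "word " * 100,
    "call_script": "Hello, this is a short script.",
    "letter_body": "word " * 150,
}
print(validate_assets(good))
print(validate_assets({"raw": "not json"}))
```

Results with a non-empty problem list could be regenerated or set aside for manual review rather than sent.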


In [None]:
client = OpenAI(api_key=OPENAI_API_KEY)


def generate_outreach_for_contact(contact):
    """
    contact = {
      "name": "Cedarbrook LLC",
      "address": "4522 W Ravina Ln, Phoenix, AZ, 85086-1431",
      "phones_ok_to_call": ["6464966437", ...],
      "emails": ["name@example.com", ...]
    }
    """

    user_msg = f"""
        Owner name: {contact.get('name', 'Property Owner')}
        Property address: {contact.get('address', '')}

        Write outreach assets with this tone: professional, friendly, low-pressure, compliant.
        Never mention do-not-call lists, where data came from, or any legal claims or guarantees.
        Try to add some local flair to the messages, if possible (e.g., mention the city or state and fun things about the area).
        It's OK to be a little goofy with the "local color", but keep it professional.
        Sign it off as:
        Best regards,
        Alfonso Hernandez, BatchData AI in Real Estate Hackathon

        Return a JSON object with EXACT keys:
        - email_subject  (<= 7 words)
        - email_body     (90–130 words)
        - call_script    (45–75 seconds; greeting, brief intro, reason for call, 1–2 short qualifying questions, clear opt-out line)
        - letter_body    (120–180 words; mailing letter, include the property address once in the first paragraph)

        Focus: expressing interest in discussing options regarding the property, offering a no-obligation conversation, and providing a polite close.
    """.strip()

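    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0.4,
        # The four assets total roughly 350-500 words, which can exceed 500 tokens
        # and truncate the JSON; a higher cap leaves headroom.
        max_tokens=1200,
        # JSON mode asks the model to return a single valid JSON object.
        response_format={"type": "json_object"},
        messages=[
            {
                "role": "system",
                "content": "You write concise, compliant real-estate outreach. Avoid legal claims, pressure, and references to data sources or DNC.",
            },
            {"role": "user", "content": user_msg},
        ],
    )

    # `content` can be None, so fall back to an empty string before stripping.
    text = (resp.choices[0].message.content or "").strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Keep the raw text so a failed parse can still be inspected.
        return {"raw": text}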


outreach_assets = []
for c in contacts:
    assets = generate_outreach_for_contact(c)
    outreach_assets.append(
        {
            "name": c.get("name"),
            "address": c.get("address"),
            "phones_ok_to_call": c.get("phones_ok_to_call", []),
            "emails": c.get("emails", []),
            "assets": assets,
        }
    )

# Inspect a couple
print(json.dumps(outreach_assets[:2], indent=2, ensure_ascii=False))
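
The overview mentions exporting results to CSV, but `outreach_assets` is nested and contains lists. A minimal sketch of flattening it into one row per contact follows. `flatten_outreach` is a hypothetical helper, the sample record reuses the illustrative values from the docstring above, and the asset texts are made up.

```python
def flatten_outreach(outreach_assets):
    """Turn nested outreach records into flat dicts with one row per contact."""
    rows = []
    for record in outreach_assets:
        assets = record.get("assets") or {}
        rows.append(
            {
                "name": record.get("name", ""),
                "address": record.get("address", ""),
                # Lists are joined so each fits in a single CSV cell.
                "phones_ok_to_call": "; ".join(record.get("phones_ok_to_call") or []),
                "emails": "; ".join(record.get("emails") or []),
                "email_subject": assets.get("email_subject", ""),
                "email_body": assets.get("email_body", ""),
                "call_script": assets.get("call_script", ""),
                "letter_body": assets.get("letter_body", ""),
                "raw": assets.get("raw", ""),  # set only when JSON parsing failed
            }
        )
    return rows


# Made-up sample in the same shape as `outreach_assets`.
sample = [
    {
        "name": "Cedarbrook LLC",
        "address": "4522 W Ravina Ln, Phoenix, AZ, 85086-1431",
        "phones_ok_to_call": ["6464966437"],
        "emails": ["name@example.com"],
        "assets": {"email_subject": "Hello from Phoenix", "email_body": "..."},
    },
    {"name": "Property Owner", "address": "", "assets": {"raw": "unparsed text"}},
]
rows = flatten_outreach(sample)
print(rows[0]["emails"], "|", rows[1]["raw"])
```

With pandas already imported in the notebook, the rows could then be written out with `pd.DataFrame(rows).to_csv("outreach_assets.csv", index=False)`, where the filename is only an example.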