# Zillow Research Data - Quick Start Guide

This notebook demonstrates how to use the **ZillowConnector** to analyze Zillow's housing market data.

**IMPORTANT: Get Real Zillow Data**

Zillow provides FREE housing data as downloadable CSV files (no API available):

**Step 1: Download Real Data**
1. Visit [Zillow Research Data](https://www.zillow.com/research/data/)
2. Click on a dataset category (e.g., "Home Values (ZHVI)")
3. Download the CSV file (e.g., the "Metro" level ZHVI data)
4. Save to your downloads folder or preferred location

**Step 2: Update File Paths**
- Replace the `zhvi_file` path in cell 2 with the path to your downloaded file
- Example: `zhvi_file = "/Users/yourname/Downloads/Metro_zhvi_uc_sfrcondo_tier_0.33_0.67_sm_sa_month.csv"`

**Popular Datasets:**
- **ZHVI (Home Values)**: Metro, County, City, ZIP levels
- **ZRI or ZORI (Rent Index)**: Metro, County, City, ZIP levels
- **Inventory**: For-sale homes, new listings
- **Sales**: Median sale price, homes sold

**What This Connector Does:**
- Loads Zillow CSV files into pandas DataFrames
- Filters by state, metro, county, or ZIP
- Calculates growth rates (YoY, MoM)
- Converts to time series format
- Exports processed data

---

*© 2025 KR-Labs. Licensed under Apache-2.0.*

## 1. Setup and Import

In [1]:
from krl_data_connectors.housing import ZillowConnector
import pandas as pd

# Initialize connector
zillow = ZillowConnector()
print("✓ Zillow connector initialized")

{"timestamp": "2025-10-20T17:16:27.880878Z", "level": "INFO", "name": "ZillowConnector", "message": "Connector initialized", "source": {"file": "base_connector.py", "line": 79, "function": "__init__"}, "levelname": "INFO", "taskName": "Task-17", "connector": "ZillowConnector", "cache_dir": "~/.krl_cache", "cache_ttl": 86400, "has_api_key": false}
{"timestamp": "2025-10-20T17:16:27.881081Z", "level": "INFO", "name": "ZillowConnector", "message": "ZillowConnector initialized", "source": {"file": "zillow_connector.py", "line": 119, "function": "__init__"}, "levelname": "INFO", "taskName": "Task-17", "data_source": "Zillow Research Data (file-based)"}
✓ Zillow connector initialized

## 2. Load ZHVI Data (Home Values)

Load Zillow Home Value Index data from a downloaded CSV file.

In [2]:
# REPLACE THIS PATH with your downloaded Zillow ZHVI file
# Download from: https://www.zillow.com/research/data/
# Example: Metro_zhvi_uc_sfrcondo_tier_0.33_0.67_sm_sa_month.csv

zhvi_file = "/path/to/your/downloaded/zhvi_file.csv"

# Uncomment and modify this line with your actual file path:
# zhvi_file = "/Users/yourname/Downloads/Metro_zhvi_uc_sfrcondo_tier_0.33_0.67_sm_sa_month.csv"

# Load data
try:
    zhvi_data = zillow.load_zhvi_data(zhvi_file)
    
    print(f"Loaded {len(zhvi_data)} geographic areas")
    print(f"Columns: {len(zhvi_data.columns)} total")
    
    # Show date columns
    date_cols = [col for col in zhvi_data.columns if col.startswith('20')]
    print(f"Date range: {date_cols[0]} to {date_cols[-1]}")
    print(f"Total months: {len(date_cols)}")
    
    print(f"\nSample areas:")
    display_cols = ['RegionName', 'State', 'Metro'] if 'Metro' in zhvi_data.columns else ['RegionName', 'State']
    print(zhvi_data[display_cols].head())
    
    # Print explicitly: a bare expression inside try/except is not rendered
    print(zhvi_data.head())
    
except FileNotFoundError:
    print("❌ File not found!")
    print("\n📥 Please download Zillow data:")
    print("   1. Visit https://www.zillow.com/research/data/")
    print("   2. Download a ZHVI dataset (e.g., Metro level)")
    print("   3. Update the 'zhvi_file' path above with your downloaded file")
    print("\nExample datasets:")
    print("   - Metro ZHVI: All home types by metro area")
    print("   - County ZHVI: All home types by county")
    print("   - ZIP ZHVI: All home types by ZIP code")

{"timestamp": "2025-10-20T17:16:27.886589Z", "level": "INFO", "name": "ZillowConnector", "message": "Loading ZHVI data", "source": {"file": "zillow_connector.py", "line": 189, "function": "load_zhvi_data"}, "levelname": "INFO", "taskName": "Task-19", "filepath": "/path/to/your/downloaded/zhvi_file.csv"}
❌ File not found!

📥 Please download Zillow data:
   1. Visit https://www.zillow.com/research/data/
   2. Download a ZHVI dataset (e.g., Metro level)
   3. Update the 'zhvi_file' path above with your downloaded file

Example datasets:
   - Metro ZHVI: All home types by metro area
   - County ZHVI: All home types by county
   - ZIP ZHVI: All home types by ZIP code
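
Because the run above stopped at the placeholder path, it may help to see the same loading step on data small enough to read. This is a minimal sketch in plain pandas, not the connector's implementation. The CSV text is synthetic (made-up regions and values), and the column layout (`RegionName`, `State`, then one `YYYY-MM-DD` column per month) is an assumption based on the columns this notebook references. The `is_date_column` helper is hypothetical; it swaps the `startswith('20')` check for an explicit date parse.

```python
import io
from datetime import datetime

import pandas as pd

# Synthetic stand-in for a downloaded Zillow CSV (values are made up).
csv_text = """RegionID,RegionName,State,2023-01-31,2023-02-28
1,Metro A,RI,400000.0,402000.0
2,Metro B,MA,600000.0,603000.0
"""


def is_date_column(name):
    """Hypothetical helper: True if the column name parses as YYYY-MM-DD."""
    try:
        datetime.strptime(name, "%Y-%m-%d")
        return True
    except ValueError:
        return False


df = pd.read_csv(io.StringIO(csv_text))

# Split columns into date columns (one per month) and identifier columns.
date_cols = [col for col in df.columns if is_date_column(col)]
id_cols = [col for col in df.columns if not is_date_column(col)]

print(f"Loaded {len(df)} geographic areas")
print(f"Date range: {date_cols[0]} to {date_cols[-1]}")
print(f"Identifier columns: {id_cols}")
```

Parsing the column name is slightly stricter than a prefix check, since it rejects any non-date column whose name happens to start with `20`.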

## 3. Filter by State

Get housing data for specific states.

In [3]:
if 'zhvi_data' in locals():
    # Get Rhode Island home values
    ri_homes = zillow.get_state_data(zhvi_data, 'RI')
    
    print(f"Rhode Island metros: {len(ri_homes)}")
    print("\nRegions:")
    print(ri_homes[['RegionName', 'Metro']].drop_duplicates())
else:
    print("⚠️  No data loaded. Please run cell 2 first to load ZHVI data.")

⚠️  No data loaded. Please run cell 2 first to load ZHVI data.
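
For readers without a data file yet, the filtering step can be sketched in plain pandas. This assumes a `State` column holding two-letter codes, as the cell above implies. `filter_by_state` is a hypothetical helper written for illustration, and the rows are synthetic; how `get_state_data` works internally is not shown in this notebook.

```python
import pandas as pd

# Synthetic wide-format rows (made-up regions and values).
wide = pd.DataFrame({
    "RegionName": ["Metro A", "Metro B", "Metro C"],
    "State": ["RI", "MA", "RI"],
    "2023-01-31": [400000.0, 600000.0, 350000.0],
})


def filter_by_state(df, states):
    """Hypothetical helper: keep rows whose State matches one code or a list of codes."""
    if isinstance(states, str):
        states = [states]
    return df[df["State"].isin(states)].copy()


ri = filter_by_state(wide, "RI")
northeast = filter_by_state(wide, ["RI", "MA"])

print(ri[["RegionName", "State"]])
print(f"Rows for RI and MA combined: {len(northeast)}")
```

Accepting either a string or a list mirrors how this notebook calls `get_state_data` with `'RI'` in one cell and with a list of states in a later one.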


## 4. Calculate Year-over-Year Growth

Analyze home value appreciation rates.

In [4]:
if 'ri_homes' in locals() and not ri_homes.empty:
    # Get most recent date column from the data
    date_cols = [col for col in ri_homes.columns if col.startswith('20')]
    if date_cols:
        recent_date = date_cols[-1]  # Most recent month
        
        ri_growth = zillow.calculate_yoy_growth(ri_homes, recent_date)
        
        print(f"Year-over-Year Growth Rates (as of {recent_date}):")
        print(ri_growth[['RegionName', recent_date, 'yoy_growth_pct']].round(2))
    else:
        print("No date columns found in data")
else:
    print("⚠️  No RI data available. Run previous cells first.")

⚠️  No RI data available. Run previous cells first.
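
Year-over-year growth compares a month's value with the value for the same month one year earlier: `(current / prior - 1) * 100`. The sketch below applies that formula to synthetic wide-format data in plain pandas. `month_key` and `yoy_growth_pct` are hypothetical helpers, and matching columns by year and month is an assumption about the file layout, not a description of `calculate_yoy_growth`.

```python
from datetime import datetime

import pandas as pd

# Synthetic values: two regions, the same month one year apart.
wide = pd.DataFrame({
    "RegionName": ["Metro A", "Metro B"],
    "2023-01-31": [400000.0, 500000.0],
    "2024-01-31": [420000.0, 490000.0],
})


def month_key(col):
    """Hypothetical helper: (year, month) for a YYYY-MM-DD column name, else None."""
    try:
        parsed = datetime.strptime(col, "%Y-%m-%d")
    except ValueError:
        return None
    return (parsed.year, parsed.month)


def yoy_growth_pct(df, date_col):
    """Percent change from the same month one year earlier to `date_col`."""
    year, month = month_key(date_col)
    # Index the date columns by (year, month) so month-end day differences do not matter.
    columns_by_month = {month_key(c): c for c in df.columns if month_key(c)}
    prior_col = columns_by_month.get((year - 1, month))
    if prior_col is None:
        raise KeyError(f"No column found one year before {date_col}")
    return (df[date_col] / df[prior_col] - 1.0) * 100.0


result = wide[["RegionName", "2024-01-31"]].copy()
result["yoy_growth_pct"] = yoy_growth_pct(wide, "2024-01-31")
print(result.round(2))
```

In this made-up data, Metro A rises from 400,000 to 420,000, a 5% gain, while Metro B falls from 500,000 to 490,000, a 2% decline. Matching on year and month rather than subtracting a fixed number of days avoids mismatches around leap-year February month ends.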


## 5. Get Time Series Data

Zillow files are wide, with one column per month. Convert them to long format, with one row per region per month, for time series analysis.

In [5]:
if 'ri_homes' in locals() and not ri_homes.empty:
    # Convert to time series format
    time_series = zillow.get_time_series(ri_homes, value_name='ZHVI')
    
    print(f"Time series records: {len(time_series)}")
    print(f"Date range: {time_series['Date'].min()} to {time_series['Date'].max()}")
    print("\nSample data:")
    # Print explicitly: a bare expression inside an if block is not rendered
    print(time_series.head(10))
else:
    print("⚠️  No RI data available. Run previous cells first.")

⚠️  No RI data available. Run previous cells first.
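
The wide-to-long reshape can be illustrated with `DataFrame.melt` on synthetic data. This is a sketch under assumptions: the `Date` and `ZHVI` column names follow the cell above, the identifier columns are chosen by hand, and the connector's own `get_time_series` may differ in details such as sorting.

```python
import pandas as pd

# Synthetic wide-format data: one row per region, one column per month.
wide = pd.DataFrame({
    "RegionName": ["Metro A", "Metro B"],
    "State": ["RI", "RI"],
    "2023-01-31": [400000.0, 350000.0],
    "2023-02-28": [402000.0, 351000.0],
})

id_cols = ["RegionName", "State"]

# Each month column becomes its own row; the column name moves into "Date".
long = wide.melt(id_vars=id_cols, var_name="Date", value_name="ZHVI")
long["Date"] = pd.to_datetime(long["Date"])
long = long.sort_values(["RegionName", "Date"]).reset_index(drop=True)

print(f"Time series records: {len(long)}")
print(long)
```

Two regions with two months each give four rows. The long layout is what most plotting and time series tools expect, since the date becomes an ordinary column that can be sorted, filtered, and grouped.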


## 6. Compare Multiple States

Analyze housing markets across regions.

In [6]:
if 'zhvi_data' in locals():
    # Compare Northeast states
    northeast = zillow.get_state_data(zhvi_data, ['RI', 'MA', 'CT', 'NY'])
    
    # Get summary statistics
    summary = zillow.calculate_summary_statistics(northeast)
    
    print("Summary Statistics by State:")
    print(summary[['State', 'mean', 'median', 'min', 'max']].round(0))
else:
    print("⚠️  No data loaded. Run cell 2 first to load ZHVI data.")

⚠️  No data loaded. Run cell 2 first to load ZHVI data.
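
A per-state summary with the same column names (`State`, `mean`, `median`, `min`, `max`) can be sketched with a pandas `groupby`. The notebook does not say which values `calculate_summary_statistics` aggregates, so this example assumes the most recent month only, and the data is synthetic.

```python
import pandas as pd

# Synthetic values for a single month (made-up regions and prices).
wide = pd.DataFrame({
    "RegionName": ["Metro A", "Metro B", "Metro C", "Metro D"],
    "State": ["RI", "RI", "MA", "MA"],
    "2024-01-31": [400000.0, 350000.0, 600000.0, 500000.0],
})

latest = "2024-01-31"  # assumed: summarize the most recent month only

# One row per state with mean, median, min, and max across its regions.
summary = (
    wide.groupby("State")[latest]
    .agg(["mean", "median", "min", "max"])
    .reset_index()
)

print(summary.round(0))
```

With only two regions per state the mean and median coincide. On real files, a gap between them suggests a few expensive or inexpensive regions are pulling the average.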


## 7. Load Rental Data (ZRI)

Access Zillow Rent Index for rental market analysis.

In [7]:
# REPLACE THIS PATH with your downloaded Zillow ZRI (rent) file
# Download from: https://www.zillow.com/research/data/
# Example: Metro_zri_uc_sfrcondomfr_sm_sa_month.csv

zri_file = "/path/to/your/downloaded/zri_file.csv"

# Uncomment and modify with your actual file path:
# zri_file = "/Users/yourname/Downloads/Metro_zri_uc_sfrcondomfr_sm_sa_month.csv"

try:
    zri_data = zillow.load_zri_data(zri_file)
    
    # Filter to Rhode Island
    ri_rents = zillow.get_state_data(zri_data, 'RI')
    
    print(f"Rhode Island rental metros: {len(ri_rents)}")
    # Print explicitly: a bare expression inside try/except is not rendered
    print(ri_rents.head())
    
except FileNotFoundError:
    print("❌ ZRI file not found!")
    print("\n📥 Download Zillow Rent Index (ZRI) data:")
    print("   Visit https://www.zillow.com/research/data/")
    print("   Look for 'Zillow Observed Rent Index (ZORI)' or 'ZRI' datasets")

{"timestamp": "2025-10-20T17:16:27.909172Z", "level": "INFO", "name": "ZillowConnector", "message": "Loading ZRI data", "source": {"file": "zillow_connector.py", "line": 221, "function": "load_zri_data"}, "levelname": "INFO", "taskName": "Task-29", "filepath": "/path/to/your/downloaded/zri_file.csv"}
❌ ZRI file not found!

📥 Download Zillow Rent Index (ZRI) data:
   Visit https://www.zillow.com/research/data/
   Look for 'Zillow Observed Rent Index (ZORI)' or 'ZRI' datasets


## 8. Export Results

Save processed data for further analysis.

In [8]:
if 'ri_homes' in locals() and not ri_homes.empty:
    # Export Rhode Island home values
    zillow.export_to_csv(ri_homes, 'ri_home_values.csv')
    print("Exported ri_home_values.csv")

    # Export time series if available
    if 'time_series' in locals() and not time_series.empty:
        zillow.export_to_csv(time_series, 'ri_time_series.csv')
        print("Exported ri_time_series.csv")

    print("\nData exported successfully")
else:
    print("⚠️  No data available to export. Load data first.")

⚠️  No data available to export. Load data first.


## Next Steps

**Explore More:**
- Filter by metro area: `get_metro_data()`
- Filter by county: `get_county_data()`
- Filter by ZIP code: `get_zip_data()`
- Calculate month-over-month growth: `calculate_mom_growth()`
- Get latest N periods: `get_latest_values()`

**Data Sources:**
- [Zillow Research Data Downloads](https://www.zillow.com/research/data/)
- [Zillow Research Homepage](https://www.zillow.com/research/)

**Documentation:**
- Full connector API reference
- Advanced filtering examples
- Visualization tutorials