# Housing Datasets:

## Gov DataSets: (Free to Use)

The United States Census Bureau offers very specific data on real estate.
Example (2021 New York City Housing and Vacancy Survey microdata):
[Link to example](https://www.census.gov/data/datasets/2021/demo/nychvs/microdata.html)


More Government Housing DataSets:
[Link to Data](https://www.census.gov/topics/housing/data/datasets.html)

## Zillow DataSets: (Free to Use/Download CSV Files)

[Link to zillow dataset](https://www.zillow.com/research/data/)

**Home Values**: Zillow Home Value Index (ZHVI)
1. Typical home values in different percentiles
2. Available for
    - single-family residences
    - condo/co-op homes
    - homes with 1, 2, 3, 4, and 5+ bedrooms

**Home Value Forecasts:** Zillow Home Value Forecast (ZHVF)
1. Month-ahead, quarter-ahead, and year-ahead forecasts
2. For all homes (SFR, Condo, Co-op)
3. Metro and US

**Rentals**: Zillow Observed Rent Index (ZORI)
1. Market rate rents
2. Categories
   - all homes
   - single family
   - multi family
3. Metro and US

**Rental Forecast**: Zillow Observed Rent Forecast (ZORF)
1. Forecast of ZORI for a month, quarter, and year ahead
2. US

**For Sale Listings**: 
1. For sale inventory (count of active listings each month)
2. New listings (every month)
3. Newly pending listings (listings that moved to pending status)
4. Median list price
5. Metro/US

**Sales**: 
1. Sales Count nowcast
2. Sale price (median & mean)
3. Total transaction value
4. Sale to list ratio (sale price vs list price)
5. Metro/US
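
As a small illustration of the sale-to-list ratio in item 4, the sketch below divides a sale price by a list price. The function name and the sample prices are made up for illustration, and Zillow's published series may aggregate the ratio differently.

```python
def sale_to_list_ratio(sale_price: float, list_price: float) -> float:
    """Return sale price divided by list price.

    Values above 1.0 mean the home sold above its list price,
    values below 1.0 mean it sold below.
    """
    if list_price <= 0:
        raise ValueError("list_price must be positive")
    return sale_price / list_price


# Made-up example prices, not real Zillow data.
ratio = sale_to_list_ratio(sale_price=412_000, list_price=400_000)
print(f"{ratio:.3f}")  # 1.030
```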

**Market Heat Index**:
1. Balance of supply and demand in a market
2. Single family + condo homes
3. Metro / US

**New Construction**: 
1. Number of new construction homes sold
2. New construction median sale price
3. Median sale price per sqft
4. Metro/US
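
The downloaded CSVs appear to use a wide layout: a few region identifier columns (`RegionID`, `SizeRank`, `RegionName`, `RegionType`, `StateName`) followed by one column per month-end date. This can be seen in the column listing for the For Sale Listings file in the exploration output further down. For plotting or joining, a long layout is often easier to work with. The following is a minimal sketch of that reshaping, assuming the other files share the same identifier columns. The sample rows and the `date`/`value` column names are made up for illustration.

```python
import io

import pandas as pd

# Made-up sample in the wide layout: ID columns, then one column per month.
# The numbers are illustrative only and are not real Zillow values.
sample_csv = io.StringIO(
    "RegionID,SizeRank,RegionName,RegionType,StateName,2018-03-31,2018-04-30\n"
    "1,0,United States,country,,100.0,110.0\n"
    "2,1,Metro A,msa,ST,50.0,55.0\n"
)
wide_df = pd.read_csv(sample_csv)

id_columns = ["RegionID", "SizeRank", "RegionName", "RegionType", "StateName"]

# melt() turns every date column into one (date, value) row per region.
long_df = wide_df.melt(id_vars=id_columns, var_name="date", value_name="value")
long_df["date"] = pd.to_datetime(long_df["date"])

print(long_df.sort_values(["RegionID", "date"]).to_string(index=False))
```

With the data in this shape, filtering by region or date range becomes a plain row filter instead of column selection.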

## Zillow API: (via Bridge Interactive and MLS Integrations)

- The API is part of Zillow’s developer platform and allows access to more granular data, such as property-level details and comprehensive real estate records.
- Through the Bridge Interactive platform, Zillow connects to Multiple Listing Service (MLS) datasets, which are extensive, property-level databases maintained by real estate professionals.
- This data is much richer and includes information like listings, sales history, taxes, photos, and agent-provided details.
- The APIs require developer accounts, authentication, and sometimes payment for higher usage tiers.
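
To make the authentication requirement concrete, here is a generic sketch of how a token-authenticated HTTP request is commonly built in Python. Everything specific in it is an assumption: the URL is a placeholder and not a real Zillow or Bridge endpoint, the token is a dummy value, and the bearer-token scheme should be checked against the API documentation you are given with your developer account. The request is only constructed, not sent.

```python
import urllib.request

# Placeholder values: replace with the endpoint and token issued to your account.
API_URL = "https://api.example.com/listings?limit=5"  # hypothetical URL
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"  # hypothetical credential


def build_authenticated_request(url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that carries a bearer token."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


request = build_authenticated_request(API_URL, ACCESS_TOKEN)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

Sending it would be a call to `urllib.request.urlopen(request)` once a real endpoint and token are in place.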

## Kaggle Housing DataSets:

Keywords: House Prices USA, Real Estate USA, Housing DataSets USA, Zillow Data

Links to potentially interesting data sets:
USA Real Estate Dataset (300k+ entries): [Link to resource](https://www.kaggle.com/discussions/general/333339) <br>
**Attributes:**
- status
- price
- bed
- bath
- acre_lot
- full_address
- street
- city
- state
- zip_code
- house_size
- sold_date
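
A minimal sketch of loading a file with these attributes using pandas follows. The two rows are made up and inlined so the snippet runs on its own. It assumes `house_size` is in square feet and that `sold_date` is an ISO-style date, neither of which is confirmed here. The derived `price_per_sqft` column is also an illustration and not part of the dataset.

```python
import io

import pandas as pd

# Two made-up rows using the attribute names listed above; not real listings.
sample_csv = io.StringIO(
    '''status,price,bed,bath,acre_lot,full_address,street,city,state,zip_code,house_size,sold_date
for_sale,250000,3,2,0.25,"1 Example St, Sampletown, MA, 01001",1 Example St,Sampletown,MA,01001,1500,2021-06-15
for_sale,400000,4,3,0.50,"2 Example Ave, Sampletown, MA, 01002",2 Example Ave,Sampletown,MA,01002,2000,
'''
)

listings = pd.read_csv(
    sample_csv,
    dtype={"zip_code": str},    # read as text so leading zeros (01001) survive
    parse_dates=["sold_date"],  # missing dates become NaT
)

# Assumes house_size is measured in square feet.
listings["price_per_sqft"] = listings["price"] / listings["house_size"]

print(listings[["city", "zip_code", "price", "house_size", "price_per_sqft", "sold_date"]])
```

Reading `zip_code` as a string matters for states whose ZIP codes begin with zero, since a numeric column would silently drop the leading digit.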


```python
import os

import pandas as pd

data_directory = "./Zillow_DataSets"

# Map each dataset name to the path of its CSV file
csv_files = {
    "For Sale Listings": os.path.join(data_directory, "For_Sale_Listing.csv"),
    "Home Values": os.path.join(data_directory, "Home_Values.csv"),
    "Home Values Forecast": os.path.join(data_directory, "Home_Values_Forecast.csv"),
    "Market Heat Index": os.path.join(data_directory, "Market_Heat_Index.csv"),
    "Rentals": os.path.join(data_directory, "Rentals.csv"),
    "Sales": os.path.join(data_directory, "Sales.csv"),
}


def load_and_explore_csv(file_path):
    """Load a CSV, print its column names and first five rows, and return the DataFrame."""
    print(f"Loading data from: {file_path}")
    df = pd.read_csv(file_path)

    # Display basic information
    print("\nAttributes (Columns):")
    print(df.columns.tolist())

    print("\nFirst 5 Rows:")
    print(df.head())

    print("\n---\n")
    return df


for dataset_name, file_path in csv_files.items():
    print(f"Exploring dataset: {dataset_name}")
    load_and_explore_csv(file_path)
```


Exploring dataset: For Sale Listings
Loading data from: ./Zillow_DataSets/For_Sale_Listing.csv

Attributes (Columns):
['RegionID', 'SizeRank', 'RegionName', 'RegionType', 'StateName', '2018-03-31', '2018-04-30', '2018-05-31', '2018-06-30', '2018-07-31', '2018-08-31', '2018-09-30', '2018-10-31', '2018-11-30', '2018-12-31', '2019-01-31', '2019-02-28', '2019-03-31', '2019-04-30', '2019-05-31', '2019-06-30', '2019-07-31', '2019-08-31', '2019-09-30', '2019-10-31', '2019-11-30', '2019-12-31', '2020-01-31', '2020-02-29', '2020-03-31', '2020-04-30', '2020-05-31', '2020-06-30', '2020-07-31', '2020-08-31', '2020-09-30', '2020-10-31', '2020-11-30', '2020-12-31', '2021-01-31', '2021-02-28', '2021-03-31', '2021-04-30', '2021-05-31', '2021-06-30', '2021-07-31', '2021-08-31', '2021-09-30', '2021-10-31', '2021-11-30', '2021-12-31', '2022-01-31', '2022-02-28', '2022-03-31', '2022-04-30', '2022-05-31', '2022-06-30', '2022-07-31', '2022-08-31', '2022-09-30', '2022-10-31', '2022-11-30', '2022-12-31', '202