# Project Overview

> This in-depth notebook explores the Austin Housing Dataset with several different models. It includes a thorough EDA and cleaning section, natural language processing on the text descriptions, a comparison of models under two categorical encoding methods (one-hot encoding vs. target encoding) with extensive parameter tuning, an evaluation of the final model, and visualizations.
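
As a rough illustration of the two categorical encoding methods named above, the sketch below applies both to a tiny made-up table. The column names (`zipcode`, `price`) and the values are assumptions chosen for the example, not data from the notebook, and the target encoding shown is a plain per-category mean rather than whatever smoothing the notebook may use later.

```python
import pandas as pd

# Toy data; column names and values are illustrative only
toy = pd.DataFrame({
    "zipcode": ["78701", "78702", "78701", "78703", "78702"],
    "price": [500_000, 350_000, 540_000, 800_000, 370_000],
})

# One-hot encoding: one indicator column per category
one_hot = pd.get_dummies(toy["zipcode"], prefix="zip")

# Target (mean) encoding: replace each category with the mean target value
# observed for that category. The means should be learned on the training
# split only, otherwise the target leaks into the features.
zip_means = toy.groupby("zipcode")["price"].mean()
target_encoded = toy["zipcode"].map(zip_means)

print(one_hot)
print(target_encoded)
```

One-hot encoding adds one column per category, which grows quickly for high-cardinality fields, while target encoding keeps a single numeric column at the cost of needing care around leakage.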

* **Business Objective**


* **Notebook Preparation**
    * Importing our Modules


* **Preprocessing**
    * EDA and Cleaning
        * Scaling target value for Time Series
        * Duplicates
        * Outlier Detection
        * Missing Data
        * Binary Data
        * Studying our Target Variable
    * Natural Language Processing
    * Create Holdout Set
    * Feature Engineering
    * Correlations and Multicollinearity
    * EDA & Process Train Set
        * Categoricals
        * Continuous
            * Standardize Continuous Data
            * Find Interactions
            * Adding Polynomial Features
        * NLP
    * Process Test Set
        * Categoricals
        * Continuous
        * NLP
    * Create Train/Test Final


* **Model Explorations**
    * Picking our Base Features
    * Linear Regressions
        * Basic LR with Top Features One-Hot Encoded
        * Basic LR with Top Features Target Encoded
        * LR with ALL model features
        * Linear Regression with various Feature Selection Methods
            * Permutation Importance
            * Forward-Backward Selector
            * RFECV
    * K-Nearest Neighbors
    * Support Vector Regression
    * Decision Tree Models
        * Decision Tree - One Hot Encoded
            * Random Forest Regressor
            * XGBoost
        * Decision Tree - Target Encoded
            * Random Forest Regressor
            * XGBoost



* **Regression Results and Model Selection**
    * Evaluate results of all attempted models and choose best model


* **Final Model**
    * Process and train final model on data
    * Saving model assets for later use
    * Evaluating final model stats


* **Making Predictions with New Data**
    * Making new predictions on one entry and imported csv


* **Visualizations**
    * Feature visualizations


> Analysis
* Statistical analysis and explanations

> Conclusions and Recommendations
* Answers to business questions

> Future Work

> Explanation of Attempts - Feature Engineering/Selection


# Objective

Build a model that accurately predicts house prices in Austin.
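
One way to make "accurately" concrete is to score predictions with an error metric in dollars. The sketch below computes mean absolute error (MAE) and root mean squared error (RMSE) with NumPy on made-up prices. The helper names `mae` and `rmse` and the sample values are assumptions for illustration, and the notebook may settle on a different metric.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large misses more than MAE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical sale prices and predictions, in dollars
actual = [300_000, 450_000, 600_000]
predicted = [310_000, 430_000, 650_000]

example_mae = mae(actual, predicted)
example_rmse = rmse(actual, predicted)
print(f"MAE:  {example_mae:,.0f}")
print(f"RMSE: {example_rmse:,.0f}")
```

The same quantities are available as `mean_absolute_error` and `mean_squared_error` in `sklearn.metrics`, both of which this notebook imports.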

# Notebook Preparation

In [2]:
# data processing tools
import pandas as pd
import numpy as np
from numpy import mean, std
from math import sqrt
import itertools
from collections import Counter

# model tools
import statsmodels.api as sm
from statsmodels.formula.api import ols
import scipy.stats as stats
from scipy.stats import norm
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.feature_selection import RFECV
from sklearn.model_selection import cross_val_score, RepeatedKFold, train_test_split, GridSearchCV
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn import neighbors
import xgboost as xgb

# NLP tools
import spacy
import re
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Visualization tools
import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline

# Ignore warnings
import warnings
warnings.filterwarnings('ignore')

# Display every column when printing a DataFrame
pd.options.display.max_columns = None

# Preprocessing

In [3]:
df = pd.read_pickle("listings_cleaned_na.pkl")
df
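
Because the loaded frame is very wide, a grouped summary of the column names can be easier to read than the full preview. The sketch below counts columns by the text before the first `/`. The helper `summarize_column_groups` and the small `demo` frame are hypothetical, introduced only for this example.

```python
from collections import Counter

import pandas as pd

def summarize_column_groups(frame, sep="/"):
    """Count columns by the part of the name before the first separator."""
    return Counter(str(col).split(sep, 1)[0] for col in frame.columns)

# Small stand-in frame with a few column names in the same flattened style
demo = pd.DataFrame(columns=[
    "address/zipcode", "address/streetAddress", "bathrooms",
    "priceHistory/0/price", "priceHistory/0/time", "priceHistory/1/price",
])

groups = summarize_column_groups(demo)
print(groups.most_common())
```

On the notebook's own frame, `summarize_column_groups(df).most_common()` would give the equivalent per-prefix counts.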

Output (DataFrame preview; only the column header survives, and it is cut off in the source). The columns are, in order: `Unnamed: 0`; `address/community`, `address/neighborhood`, `address/streetAddress`, `address/subdivision`, `address/zipcode`; `bathrooms`, `bedrooms`, `dateposted`, `description`, `homeStatus`, `latitude`, `livingArea`, `longitude`, `price`; `priceHistory` and its flattened entries `priceHistory/0/...` through `priceHistory/29/...` (fields such as buyer and seller agent details, `event`, `postingIsRental`, `price`, `priceChangeRate`, `showCountyLink`, `time`); `propertyTaxRate`; and the flattened `resoFactsStats/...` fields (appliances, association details, bathroom and bedroom counts, construction materials, cooling, heating, flooring, schools, `otherFacts/0` through `otherFacts/92` name/value pairs, parking, `rooms/0` through `rooms/13`, security, sewer, utilities, vegetation, view, ...).
s/view/5,resoFactsStats/virtualTour,resoFactsStats/waterfrontFeatures,resoFactsStats/waterfrontFeatures/0,resoFactsStats/waterfrontFeatures/1,resoFactsStats/waterfrontFeatures/2,resoFactsStats/waterfrontFeatures/3,resoFactsStats/waterfrontFeatures/4,resoFactsStats/waterfrontFeatures/5,resoFactsStats/waterfrontFeatures/6,resoFactsStats/windowFeatures,resoFactsStats/windowFeatures/0,resoFactsStats/windowFeatures/1,resoFactsStats/windowFeatures/2,resoFactsStats/windowFeatures/3,resoFactsStats/woodedArea,resoFactsStats/yearBuiltEffective,resoFactsStats/zoning,resoFactsStats/zoningDescription,schools,schools/0/assigned,schools/0/distance,schools/0/level,schools/0/name,schools/0/rating,schools/0/size,schools/0/studentsPerTeacher,schools/0/totalCount,schools/1/assigned,schools/1/distance,schools/1/level,schools/1/name,schools/1/rating,schools/1/size,schools/1/studentsPerTeacher,schools/1/totalCount,schools/2/assigned,schools/2/distance,schools/2/grades,schools/2/isAssigned,schools/2/level,schools/2/link,schools/2/name,schools/2/rating,schools/2/size,schools/2/studentsPerTeacher,schools/2/totalCount,schools/2/type,yearBuilt
0,,,60 Terrace View Ave,,10463.0,2.0,5.0,1.610134e+12,"Discover Marble Hill, a neighborhood rich with...",FOR_SALE,40.877743,1889.0,-73.910866,799999.0,,,,,,,Listed for sale,799999.0,0.335558,,,,,,1.610064e+12,,,,,,Listing removed,0.0,599000.0,0.000000,,,,,,0.0,1.459469e+12,,,,,,Listed for sale,0.0,599000.0,0.711429,,,,,,0.0,1.426810e+12,,,,,,Listing removed,0.0,350000.0,0.000000,,,,,,0.0,1.293062e+12,,,,,,Price change,0.0,350000.0,-0.066667,,,,,,0.0,1.276128e+12,,,,,,Price change,0.0,375000.0,0.071429,,,,,,0.0,1.275610e+12,,,,,,Listed for sale,0.0,350000.0,0.000000,,,,,,0.0,1.265328e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.88,,,,,,,,,,,,,,,,,,,,,,"Victorian,Trilevel",,,,,,,,,,,,,,,,,,,,,,,,"Natural Gas, Hot Water",,Driveway,,12 Days,$424,,Finished,2.0,1.0,1.0,,,,5.0,,,,,,,,,,,,,,New York,,,Park,,,,,,,Frame,Vinyl Siding,,,,,,,,,,,,,,,,,,Call Listing Agent,Bronx 10,,,,,,,,,,,,,,,Back Yard,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Frame,Vinyl Siding,,,,,,,,,,0,0,1.0,,,0,1.0,0,,0,,,Natural Gas,Hot Water,,,,,,,,,,,,Call Listing Agent,Bronx 10,Residential,,,,,,,3.00,,,"1,889 sqft",,,,Call Listing Agent,Bronx 10,,,1.610064e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,NO TAX ID FOUND,0,Driveway,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Public Sewer,,,,,,,,,5096.0,711000.0,,,,,,,,,,,,,,,,,,,,,https://my.matterport.com/show/?m=jZtoZmynvwi,,,,,,,,,,,,,,,,,,,,0.1,Elementary,Ps 37 Multiple Intelligence School,4.0,647.0,14.0,1.0,,0.1,Middle,In Tech Academy Aka 
Ms High School 368,3.0,993.0,14.0,1.0,,,,,,,,,,,,,1920.0
1,,,625 W 246th St,,10471.0,8.0,8.0,1.595968e+12,EXCLUSIVE BRAND NEW\nLavish Newly Built 8-Bd. ...,FOR_SALE,40.892689,7000.0,-73.910667,3995000.0,,,,,,,Price change,3995000.0,-0.111235,,,,,,1.607299e+12,,,,,,Price change,0.0,4495000.0,-0.080401,,,,,,0.0,1.601510e+12,,,,,,Listed for sale,0.0,4888000.0,0.087430,,,,,,0.0,1.595894e+12,,,,,,Listing removed,0.0,4495000.0,0.000000,,,,,,0.0,1.584144e+12,,,,,,Listed for sale,0.0,4495000.0,3.610256,,,,,,0.0,1.572566e+12,,,,,,Sold,0.0,975000.0,-0.025000,,Trebach Realty,,,/profile/Trebach-Realty/,0.0,1.450397e+12,,,,,,Listing removed,0.0,1000000.0,0.000000,,,,,,0.0,1.447200e+12,,,,,,Pending sale,0.0,1000000.0,0.000000,,,,,,0.0,1.439856e+12,,,,,,Listed for sale,0.0,1000000.0,0.00,,,,,,0.0,1.430438e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.95,,,,,,,Dishwasher,Dryer,Washer,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Central,"Garage, Garage - Attached",0.29 Acres,176 Days,$571,,,8.0,7.0,1.0,0.0,,0.0,8.0,,,,,,,,,,,,,,Bronx,,,,,,,,,,,,,,,,,Central,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Hardwood,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,1.0,,1.0,0,0.0,0,,0,,,,,,,,,,,,,,,,,,Single Family,0.0,,,,,,,,,"7,000 sqft",0.29 Acres,,,,,,,1.595894e+12,,,Clubhouse,,Granite countertop,,Playground,,Stainless steel appliances,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,059130860,0,Garage,Garage - Attached,,,,,,,,,,,,,Other,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1.0,,Other,13941.0,1937000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1940.0,,,,,0.4,Elementary,Ps 24 Spuyten 
Duyvil,10.0,907.0,16.0,1.0,,0.3,Middle,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,,,,,,,,,,,,,1940.0
2,,,716 W 231st St,,10463.0,3.0,4.0,1.592668e+12,This 4233 square foot single family home has 4...,FOR_SALE,40.883419,4233.0,-73.918106,1495000.0,,,,,,,Price change,1495000.0,-0.002668,,,,,,1.611101e+12,,,,,,Listed for sale,0.0,1499000.0,0.000000,,,,,,0.0,1.592611e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.95,,,,,,,Dishwasher,Dryer,Washer,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"Garage, Garage - Attached",0.42 Acres,214 Days,$353,,,3.0,3.0,0.0,0.0,,0.0,4.0,,,,,,,,,,,,,,Bronx,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0.0,,,0,0.0,0,,0,,,,,,,,,,,,,,,,,,Single Family,0.0,,,,,,,,,"4,233 sqft",0.42 Acres,,,,,,,1.592611e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,057500494,0,Garage,Garage - Attached,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2.0,,,12253.0,2341000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1920.0,,,,,0.3,Elementary,Ps 24 Spuyten Duyvil,10.0,907.0,16.0,1.0,,0.4,Middle,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,,,,,,,,,,,,,1920.0
3,,,750 W 232nd St,,10463.0,6.0,5.0,1.600814e+12,EXCLUSIVE NEW TO MARKET\nPrime Renovation Oppo...,FOR_SALE,40.885033,7000.0,-73.917793,3450000.0,,,,,,,Price change,3450000.0,-0.092105,,,,,,1.608163e+12,,,,,,Listed for sale,0.0,3800000.0,0.225806,,,,,,0.0,1.600733e+12,,,,,,Sold,0.0,3100000.0,-0.156463,,,,,,0.0,1.551917e+12,,,,,,Listing removed,0.0,3675000.0,0.000000,,,,,,0.0,1.550707e+12,,,,,,Pending sale,0.0,3675000.0,0.000000,,,,,,0.0,1.510272e+12,,,,,,Listed for sale,0.0,3675000.0,-0.125000,,,,,,0.0,1.506298e+12,,,,,,Sold,0.0,4200000.0,0.000000,,,,,,0.0,9.650880e+11,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.95,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Central,"Garage, Garage - Attached",0.26 Acres,120 Days,$493,,,6.0,6.0,0.0,0.0,,0.0,5.0,,,,,,,,,,,,,,Bronx,,,,,,,,,,,,,,,,,Central,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,1.0,,1.0,0,0.0,0,,0,,,,,,,,,,,,,,,,,,Single Family,0.0,,,,,,,,,"7,000 sqft",0.26 Acres,,,,,,,1.600733e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,057510300,0,Garage,Garage - Attached,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2.0,,,19472.0,3011000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1950.0,,,,,0.2,Elementary,Ps 24 Spuyten Duyvil,10.0,907.0,16.0,1.0,,0.3,Middle,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,,,,,,,,,,,,,1950.0
4,,,24 Cooper St #5CD,,10034.0,2.0,3.0,1.611091e+12,"Due to Coronavirus 19, outbreak, ALL showings ...",FOR_SALE,40.867687,994.0,-73.924606,230000.0,,,,,,,Listed for sale,230000.0,0.000000,,,,,,1.611014e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.88,,,,,,,,,,,,,,,,,,,,,,,,,,"$1,472/mo",,,,,,,,,,,,,,,,,,,,,,0 spaces,"$1,472/mo",1 Day,$231,,,2.0,2.0,0.0,0.0,,0.0,3.0,,,,,,,,,,,,,,New York,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1.0,0,0,0.0,,,0,0.0,0,,0,,,,,,,,,,,,,,,,,,Condo,0.0,,,,,,,,,994 sqft,,,,,,,,1.611014e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.8,Elementary,Ps 18 Park Terrace,5.0,349.0,13.0,3.0,,1.9,Middle,Ms 319 Marie Teresa,7.0,421.0,12.0,3.0,,0.1,9-12,1.0,High,https://www.greatschools.org/school?id=18169&s...,INWOOD EARLY COLLEGE FOR HEALTH AND INFORMATIO...,3.0,371.0,,1.0,Public,1925.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
33839,,,93-19 71st Ave,,11375.0,2.0,4.0,,This house is a rare combination of superb loc...,SOLD,40.712009,2200.0,-73.850281,1255000.0,,,,,,,Sold,1255000.0,0.004804,,David Yakubov,,https://photos.zillowstatic.com/h_e/ISd8b63jno...,/profile/user3094820/,1.530230e+12,,,,,,Pending sale,0.0,1249000.0,0.000000,,,,,,0.0,1.523837e+12,,,,,,Listed for sale,0.0,1249000.0,1.401923,,,,,,0.0,1.519603e+12,,,,,,Sold,0.0,520000.0,0.000000,,,,,,0.0,1.512086e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,,,,,,,Dishwasher,Dryer,Washer,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"Garage, Garage - Attached","2,500 sqft",,,,,2.0,1.0,1.0,0.0,,0.0,4.0,,,,,,,,,,,,,,Forest Hills,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Community District 28,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0.0,,,0,0.0,0,,0,,,,,,,,,,,,,,,,,Community District 28,Single Family,,,,,,,,,,"2,200 sqft","2,500 sqft",,,,Community District 28,,,,,Den/Family Room,Y,Detached/Attached,Det,Driveway,Pvt,Eat In Kitchen,Y,Picture,Y,Water,Public,Attic,Y,Heat,Steam,Sewer,Y,Wood Floors,Y,Fuel,Gas,Office,Y,Zone,10,A/C,7,# Fireplaces,0,# Kitchens,1,County,Queens,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,032220033,0,Garage,Garage - Attached,,,,,,,,,,,,,,,,,,,,DiningRoom,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Y,,,,,,,,Loft,7129.0,1034000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1930.0,,,,,0.2,Primary,Ps 144 Col Jeromus Remsen,8.0,893.0,16.0,1.0,,0.7,Middle,Jhs 190 Russell 
Sage,7.0,1054.0,15.0,1.0,,0.4,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1930.0
33840,,,6829 Manse St,,11375.0,2.0,3.0,,Wonderful 1 Family Home. First Floor Features ...,SOLD,40.714203,2417.0,-73.855263,825000.0,,,,,,,Sold,825000.0,-0.049539,,Annie/Steve Your Home Sold Guaranteed,,https://photos.zillowstatic.com/h_e/ISqh2r0uwq...,/profile/Agardi-Team/,1.532995e+12,,,,,,Listing removed,0.0,868000.0,0.000000,,,,,,0.0,1.519690e+12,,,,,,Listed for sale,0.0,868000.0,0.000000,,,,,,0.0,1.518566e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Other,,"Garage, Garage - Attached","2,417 sqft",,,,,2.0,0.0,0.0,0.0,,0.0,3.0,,,,,,,,,,,,,,Flushing,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0.0,,,0,1.0,0,,0,,,Other,,,,,,,,,,,,,,,Single Family,,,,,,,,,,"2,417 sqft","2,417 sqft",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,031950052,0,Garage,Garage - Attached,,,,,,,,,,,,,,,,,,,,DiningRoom,,,,,,,,FamilyRoom,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2.0,,,6447.0,907000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1920.0,,,,,0.2,Primary,Ps 144 Col Jeromus Remsen,8.0,893.0,16.0,1.0,,0.6,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,,0.4,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1920.0
33841,,,82 Greenway Ter,,11375.0,6.0,6.0,,"""DISTINQUISHED FIELDSTONE TOWNHOUSE TREASURE""\...",SOLD,40.717163,6085.0,-73.843124,2704000.0,,,GiGi Malek,,https://photos.zillowstatic.com/h_e/ISvww3cyma...,/profile/ForestHillsGiGi/,Sold,2704000.0,0.040400,,Linda Weiss,,https://photos.zillowstatic.com/h_e/ISbxza59gh...,/profile/lindaweiss11/,1.561939e+12,,,,,,Listing removed,0.0,2599000.0,0.000000,,,,,,0.0,1.560989e+12,,,,,,Pending sale,0.0,2599000.0,0.000000,,,,,,0.0,1.556410e+12,,,,,,Listed for sale,0.0,2599000.0,0.000000,,,,,,0.0,1.554336e+12,,,,,,Pending sale,0.0,2599000.0,0.000000,,,,,,0.0,1.553558e+12,,,,,,Listing removed,1.0,7000.0,0.000000,,,,,,0.0,1.553126e+12,,,,,,Price change,1.0,7000.0,-0.176471,,,,,,0.0,1.552349e+12,,,,,,Listed for sale,0.0,2599000.0,1.652041,,,,,,0.0,1.551139e+12,,,,,,Price change,1.0,8500.0,-0.15,,,,,,0.0,1.547424e+12,,,,,,Listed for rent,1.0,10000.0,0.0,,,,,,0.0,1.540166e+12,,Linda Weiss,,https://photos.zillowstatic.com/h_e/ISbxza59gh...,/profile/lindaweiss11/,Sold,0.0,980000.0,10.95122,,Linda Weiss,,https://photos.zillowstatic.com/h_e/ISbxza59gh...,/profile/lindaweiss11/,0.0,9.742464e+11,,Terrace Sotheby's International Realty,,https://photos.zillowstatic.com/h_e/ISbpd8y5jp...,/profile/terracesir/,Sold,0.0,82000.0,0.0,,Terrace Sotheby's International Realty,,https://photos.zillowstatic.com/h_e/ISbpd8y5jp...,/profile/terracesir/,0.0,1.849824e+11,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"Garage, Garage - Attached","3,255 sqft",,,,,6.0,5.0,1.0,0.0,,0.0,6.0,,,,,,,,,,,,,,Forest Hills Gardens,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0.0,,0.0,0,0.0,0,0.0,0,,,,,,,,,,,,,,,,,,Townhouse,,,,,,,,,,"6,085 sqft","3,255 sqft",,,,,,,,,,Fios 
Available,,Parking,Parking Type,Garage,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,032740007,0,Garage,Garage - Attached,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2.0,,,18430.0,2513000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1925.0,,,,,0.1,Primary,Ps 101 School In The Gardens,9.0,654.0,16.0,1.0,,0.7,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,,1.0,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1925.0
33842,,,86 Greenway Ter,,11375.0,5.0,6.0,,EXCLUSIVE LISTING OF TERRACE SOTHEBY'S INTERNA...,SOLD,40.717052,4564.0,-73.843025,2750000.0,,,Terrace Sotheby's International Realty,,https://photos.zillowstatic.com/h_e/ISbpd8y5jp...,/profile/terracesir/,Sold,2750000.0,-0.075630,,Sheldon Stivelman,,https://photos.zillowstatic.com/h_e/ISf403agde...,/profile/SheldonStivelman/,1.532390e+12,,,,,,Pending sale,0.0,2975000.0,0.000000,,,,,,0.0,1.523318e+12,,,,,,Listed for sale,0.0,2975000.0,0.000000,,,,,,0.0,1.521677e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0 spaces,"6,603 sqft",,,,,5.0,4.0,1.0,0.0,,0.0,6.0,,,,,,,,,,,,,,Forest Hills Gardens,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0.0,,0.0,0,0.0,0,0.0,0,,,,,,,,,,,,,,,,,,Townhouse,,,,,,,,,,"4,564 sqft","6,603 sqft",,,,,,,,,Features,"Special Program/QC Approved Listing, Garage Co...",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,032740004,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2.0,,,24649.0,2893000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1925.0,,,,,0.1,Primary,Ps 101 School In The Gardens,9.0,654.0,16.0,1.0,,0.7,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,,1.0,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1925.0


## EDA and Cleaning

### Scale Time Series

We need to bring all of our home sale prices to the same point in time. It's easy to overlook that these homes were sold over several years, and even one year is a long time in real estate. We'll appreciate every sale price forward to the most recent period in the data, month by month.

In [None]:
df.rename(columns={'latestPrice':'price'}, inplace=True)

In [None]:
df['latest_saleyear'].unique()

# our data spans from 2018 to 2021.

In [None]:
df['latest_saledate'].min()
# earliest sale in January 2018

In [None]:
df['latest_saledate'].max()
# latest sale in January 2021

We need an external source for house price appreciation. Starting from the Austin Board of Realtors site, I got this info from Texas A&M for the Austin metro area that this data covers:  
https://www.recenter.tamu.edu/data/housing-activity/#!/activity/MSA/Austin-Round_Rock

* Jan 2018 median: 287000
* Jan 2019 median: 294000 +2.4%
* Jan 2020 median: 305000 +3.7%
* Jan 2021 median: 363830 +19.3% (!!!)
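
The percentages above follow directly from the medians. As a minimal sketch (the names `medians`, `yoy` and `monthly` are illustrative, not from the notebook), the year-over-year change and a simple per-month rate can be derived like this. Note that dividing by 12 spreads the annual change evenly rather than compounding it, which matches the approach used in the next cell.

```python
# January median prices for the Austin-Round Rock MSA, as listed above
medians = {2018: 287000, 2019: 294000, 2020: 305000, 2021: 363830}

years = sorted(medians)

# year-over-year change between consecutive January medians
yoy = {
    f"{start}->{end}": medians[end] / medians[start] - 1
    for start, end in zip(years, years[1:])
}

# simple (non-compounded) monthly rate: the annual change split over 12 months
monthly = {span: change / 12 for span, change in yoy.items()}

for span, change in yoy.items():
    print(f"{span}: {change:+.1%} per year, {monthly[span]:+.3%} per month")
```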

In [None]:
jan2018_to_2019 = .024/12
jan2019_to_2020 = .037/12
jan2020_to_2021 = .193/12

In [None]:
# months remaining in the sale year, counting the sale month itself (January -> 12, December -> 1)
df['time_series'] = 13 - df['latest_salemonth']

In [None]:
# adjust 2018 sales: the remaining months of 2018, then the full years 2019 and 2020
df.loc[df['latest_saleyear']==2018, 'adj_price'] = df.loc[df['latest_saleyear']==2018].apply(lambda x: int( (x['price'])*(1+(jan2018_to_2019*x['time_series']))*(1+(jan2019_to_2020*12))*(1+(jan2020_to_2021*12)) ) , axis=1 )

# adjust 2019 sales: the remaining months of 2019, then the full year 2020
df.loc[df['latest_saleyear']==2019, 'adj_price'] = df.loc[df['latest_saleyear']==2019].apply(lambda x: int( (x['price'])*(1+(jan2019_to_2020*x['time_series']))*(1+(jan2020_to_2021*12)) ) , axis=1)

# adjust 2020 sales: the remaining months of 2020
df.loc[df['latest_saleyear']==2020, 'adj_price'] = df.loc[df['latest_saleyear']==2020].apply(lambda x: int( (x['price'])*(1+(jan2020_to_2021*x['time_series'])) ), axis=1)

# 2021 sales are already at the latest price level, so copy them unchanged
df.loc[df['latest_saleyear']==2021, 'adj_price'] = df.loc[df['latest_saleyear']==2021].apply(lambda x: int(x['price']), axis=1)
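
As a quick check on the arithmetic, here is a minimal sketch of the adjustment for a single hypothetical 2020 sale. The helper name `appreciate_2020_sale` and the example price are illustrative assumptions, not part of the notebook. The monthly rate is the Jan 2020 to Jan 2021 change spread evenly over 12 months.

```python
# monthly appreciation rate for 2020, same value as jan2020_to_2021 in the notebook
JAN2020_TO_2021 = 0.193 / 12


def appreciate_2020_sale(price, sale_month):
    """Scale a 2020 sale price forward to the January 2021 price level.

    sale_month is 1-12; months_remaining mirrors the notebook's time_series column.
    """
    months_remaining = 13 - sale_month
    return price * (1 + JAN2020_TO_2021 * months_remaining)


# hypothetical home sold for $300,000 in October 2020 (3 months remaining)
adjusted = appreciate_2020_sale(300000, 10)
print(round(adjusted))
```

A December sale is scaled by one month of appreciation and a January sale by twelve, so earlier sales receive the larger adjustment.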

In [None]:
# rename original price column and make adj_price our price column
df.rename(columns={'price' : 'orig_price', 'adj_price' : 'price'}, inplace=True)

# put price at the front of the data frame
df.set_index('price', inplace=True)
df.reset_index(inplace=True)

### Checking homeType

In [None]:
# what are the homeTypes?

df['homeType'].value_counts(normalize=True)

In [None]:
# Ultimately, with Single Family, Condo and Townhouse making up most of the data, we are going to remove all multi-family type listings
df = df.loc[df['homeType'].isin(['Single Family', 'Condo', 'Townhouse'])]

### Duplicate Data

In [None]:
# check for duplicate data

df[df.duplicated(subset=['latitude','longitude'], keep=False)].sort_values('latitude')

# no duplicate data
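
For reference, a minimal sketch of how this check behaves on a toy frame (the data here is made up for illustration). With `keep=False`, every row that shares its coordinates with another row is flagged, not only the later copies, which makes it easy to inspect the candidates side by side.

```python
import pandas as pd

# made-up listings: the first two share the same coordinates
toy = pd.DataFrame({
    "latitude": [30.25, 30.25, 30.30],
    "longitude": [-97.75, -97.75, -97.70],
    "price": [400000, 410000, 350000],
})

# keep=False marks all members of each duplicate group
dupes = toy[toy.duplicated(subset=["latitude", "longitude"], keep=False)]
print(dupes.sort_values("latitude"))
```

Matching coordinates do not always mean a true duplicate (units in the same building can share them), so the flagged rows are worth reviewing before dropping anything.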

### Outlier Detection

In [None]:
'''# plotting latitude and longitude as a visual scatter plot to look for location-based outliers

plt.figure(figsize=(25,25))

sns.scatterplot(data=df, x="longitude", y="latitude", hue="price", palette="magma_r");'''

Plotting latitude against longitude gives us a rough map of the Austin area, which makes location outliers easy to spot. Nothing sits well outside the Austin area except for a few listings in the far southeast, so we'll cut off our latitude just above 30.1.

This visualization suggests that location is very important to home price. We'll check that out more directly.

In [None]:
# drop latitude below 30.12 to remove the few outliers in the SE
df.drop(df[df['latitude']<30.12].index , inplace=True)

In [None]:
# looking for outliers in the percentiles

df.describe()

We see potential outliers in price, lotSizeSqFt, livingAreaSqFt, numOfBathrooms, numOfBedrooms, propertyTaxRate, garageSpaces, and parkingSpaces.

In [None]:
'''# check how our histograms are looking
df.hist(figsize=(18,15), bins=100);
'''

In [None]:
#check what is going on with the lotSizeSqFt outliers by sorting descending
df.sort_values('lotSizeSqFt', ascending=False).head(5)

# This top listing is legitimate. But we have a problem here where condo and townhouse listings are using the 
# size of the overall lot for their lot, and that isn't really accurate/representative
# We'll fix this in a little bit

In [None]:
#check what is going on with the livingAreaSqFt outliers by sorting ascending
df.sort_values('livingAreaSqFt', ascending=True).head(5)

# just tiny houses, I guess?

In [None]:
#check what is going on with the livingAreaSqFt outliers by sorting descending
df.sort_values('livingAreaSqFt', ascending=False).head(5)

In [None]:
# we're dropping the top two listings here. One is a lot, and the other is clearly mistaken.
df.drop(index=[705, 2557], inplace=True)

In [None]:
#check what is going on with the numOfBathrooms outliers by sorting descending
df.sort_values('numOfBathrooms', ascending=False).head(5)

In [None]:
# I'm going to say this top listing has 2.5 bathrooms not 27. That is clearly a typo.
df.loc[df.index==2838, 'numOfBathrooms'] = 2.5

In [None]:
#check what is going on with the numOfBathrooms outliers by sorting ascending
df.sort_values('numOfBathrooms', ascending=True).head(5)

In [None]:
# most listings with 0 bathrooms also have 0 bedrooms. This is clearly wrong, but I'm not going to guess if there are no bedrooms.
# I will impute a typical bathroom count based on bedroom count and year built
# (the 3+ bedroom rules run first; otherwise the broader rules would fill those rows before they are reached)
# then drop any remaining listings with 0 bathrooms or 0 bedrooms

df.loc[(df['numOfBathrooms']==0) & (df['numOfBedrooms']>=3) & (df['yearBuilt'] > 1989), 'numOfBathrooms'] = 2.5
df.loc[(df['numOfBathrooms']==0) & (df['numOfBedrooms']>=3) & (df['yearBuilt'] <= 1989), 'numOfBathrooms'] = 2
df.loc[(df['numOfBathrooms']==0) & (df['numOfBedrooms']>0) & (df['yearBuilt'] > 1989), 'numOfBathrooms'] = 2
df.loc[(df['numOfBathrooms']==0) & (df['numOfBedrooms']>0) & (df['yearBuilt'] <= 1989), 'numOfBathrooms'] = 1

df.drop(df[df['numOfBathrooms']==0].index, inplace=True)
df.drop(df[df['numOfBedrooms']==0].index, inplace=True)
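
The imputation rules overlap: a home with 3 or more bedrooms also has more than 0 bedrooms, so the order in which the rules run decides which value wins. One way to make that priority explicit is `np.select`, which takes the first matching condition for each row. This is a sketch on made-up data, not the notebook's method; the frame `toy` and the mask names are illustrative.

```python
import numpy as np
import pandas as pd

# made-up listings: four with a missing (0) bathroom count, one already valid
toy = pd.DataFrame({
    "numOfBathrooms": [0.0, 0.0, 0.0, 0.0, 3.0],
    "numOfBedrooms": [4, 3, 2, 1, 4],
    "yearBuilt": [2005, 1975, 1995, 1960, 2010],
})

missing = (toy["numOfBathrooms"] == 0) & (toy["numOfBedrooms"] > 0)
newer = toy["yearBuilt"] > 1989
large = toy["numOfBedrooms"] >= 3

# most specific rules first: np.select uses the first condition that is True
conditions = [
    missing & large & newer,
    missing & large & ~newer,
    missing & newer,
    missing & ~newer,
]
choices = [2.5, 2.0, 2.0, 1.0]

# rows matching no condition keep their existing value
toy["numOfBathrooms"] = np.select(
    conditions, choices, default=toy["numOfBathrooms"].to_numpy()
)
print(toy["numOfBathrooms"].tolist())
```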

In [None]:
#check what is going on with the numOfBedrooms outliers by sorting descending
df.sort_values('numOfBedrooms', ascending=False).head(5)

In [None]:
# That condo is supposed to have 2 bedrooms, not 20.
df.loc[df.index==8597, 'numOfBedrooms'] = 2


In [None]:
#check what is going on with the propertyTaxRate outliers by sorting descending
df.sort_values('propertyTaxRate', ascending=False).head(5)

# looks like some of these are just in a higher tax rate area.

In [None]:
#check what is going on with the garageSpaces outliers by sorting descending
df.sort_values('garageSpaces', ascending=False).head(10)

In [None]:
# a bunch of these garage spaces are definitely just bogus numbers. I'm going to force change a lot of them to numbers that make sense
df.loc[(df['garageSpaces'] > 3) & (df['price'] < 1000000) & (df['homeType'] == 'Single Family'), 'garageSpaces'] = 3
df.loc[(df['garageSpaces'] > 5) & (df['price'] > 1000000)& (df['homeType'] == 'Single Family'), 'garageSpaces'] = 4
df.loc[df.index==6885, 'garageSpaces'] = 2

In [None]:
#check what is going on with the parkingSpaces outliers by sorting descending
df.sort_values('parkingSpaces', ascending=False).head(5)

In [None]:
# We are going to do the same forced conversions on parking spaces
# each comparison gets its own parentheses, because & binds tighter than < and >
df.loc[(df['parkingSpaces'] > 3) & (df['price'] < 1000000) & (df['homeType'] == 'Single Family'), 'parkingSpaces'] = 3
df.loc[(df['parkingSpaces'] > 5) & (df['price'] > 1000000) & (df['homeType'] == 'Single Family'), 'parkingSpaces'] = 5
df.loc[df.index==6885, 'parkingSpaces'] = 2

df.sort_values('parkingSpaces', ascending=False).head(5)

In [None]:
df['city'].value_counts()

In [None]:
'''# check how our histograms are looking for our columns that seem to have outliers

df.hist(figsize=(18,15), bins=100);'''

For the square footage variables, I ultimately concluded that extremely large houses and lots are so under-represented in the dataset that we won't be able to predict on them reliably, so they are better left out.

I opt to remove these outliers via IQR.

To limit data loss, I use an IQR multiplier of 1.6 instead of the standard 1.5.
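
To get a feel for what the wider multiplier buys, here is a minimal sketch on synthetic, right-skewed data. The `iqr_fences` helper and the generated sample are illustrative assumptions, not the notebook's data; the sketch only counts how many rows survive the 1.5 and 1.6 fences.

```python
import numpy as np

def iqr_fences(values, k):
    """Return (lower, upper) fences at Q1 - k*IQR and Q3 + k*IQR."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# synthetic right-skewed sample standing in for a square-footage column
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=7.5, sigma=0.5, size=5_000)

kept = {}
for k in (1.5, 1.6):
    lower, upper = iqr_fences(sample, k)
    # count the rows that fall inside the fences for this multiplier
    kept[k] = int(((sample >= lower) & (sample <= upper)).sum())
    print(f"k={k}: fences=({lower:.0f}, {upper:.0f}), rows kept={kept[k]}")
```

A wider multiplier can only keep the same rows or more, so the trade-off is between retaining data and tolerating more extreme values.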

In [None]:
# A lot of our variables are not normally shaped, so we can't reliably remove outliers via standard deviation.
# We will use IQR to remove our outliers with the following function

def iqr_outliers(column):
    """return the lower range and upper range for the data based on IQR*1.6"""
    Q1,Q3 = np.percentile(column , [25,75])
    iqr = Q3 - Q1
    lower_range = Q1 - (1.6 * iqr)
    upper_range = Q3 + (1.6 * iqr)
    return lower_range,upper_range  

In [None]:
# determining our IQR ranges for lot size and living area square footage
lotlower,lotupper = iqr_outliers(df.lotSizeSqFt)
sqftlower, sqftupper = iqr_outliers(df.livingAreaSqFt)

# dropping the things outside of our lower and upper range
df.drop(df[ (df.lotSizeSqFt > lotupper) | (df.lotSizeSqFt < lotlower) ].index , inplace=True)
df.drop(df[ (df.livingAreaSqFt > sqftupper) | (df.livingAreaSqFt < sqftlower) ].index , inplace=True)

In [None]:
# We're imputing the median lot size into small condo and townhouse listings that are over-listed for lot size square feet.
# The homeType test is wrapped in its own parentheses because & binds tighter than |
small_attached = ((df['homeType']=='Condo') | (df['homeType']=='Townhouse')) & (df['livingAreaSqFt']<1200) & (df['lotSizeSqFt']>8000)
df.loc[small_attached, 'lotSizeSqFt'] = df['lotSizeSqFt'].median()

In [None]:
'''# check how our histograms are looking

df.hist(figsize=(18,20), bins=100);

# much better'''

#### Manually locating price outliers

In [None]:
# we're using the median house value for a zip code to determine the zip code's sort, so we can visualize the zip code

# group our dataframe by zipcode on median home price, sorted ascending. 
zipsorted = pd.DataFrame(df.groupby('zipcode')['price'].median().sort_values(ascending=True))

# divide our dataframe into groups with entries per group as specified above,
# and assign this number to a new column
zipsorted['rank'] = np.divmod(np.arange(len(zipsorted)), 1)[0]+1

# function that looks up a segment that a data entry belongs to
def make_group(x, frame, column):
    y = frame.loc[(frame.index == x)][column]
    z = np.array(y)
    return z[0]

# make a new column on our dataframe. Look up each zip entry's group, and append to the column.
df['zip_rank'] = df['zipcode'].apply(lambda x: make_group(x, zipsorted, 'rank'))

# apply the median home price per zip code to the data frame
df['median_zip'] = df['zipcode'].apply(lambda x: round(df.loc[df['zipcode']==x]['price'].median(), 0))
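
The per-row `apply` lookups above rescan the frame for every listing. A vectorized sketch of the same idea, on a made-up toy frame, uses `groupby(...).transform` for the median and `Series.map` for the rank. One assumption to note: `rank(method="first")` breaks ties between equal medians by order of appearance, which may not match the tie order produced by the sort above.

```python
import pandas as pd

# toy frame; the zip codes and prices are invented for illustration
toy = pd.DataFrame({
    "zipcode": [78701, 78701, 78702, 78702, 78703],
    "price":   [500000, 700000, 300000, 350000, 900000],
})

# median price of each row's zip code, broadcast back onto the rows
toy["median_zip"] = toy.groupby("zipcode")["price"].transform("median").round(0)

# rank zip codes 1..n by ascending median price, then map the rank onto rows
zip_medians = toy.groupby("zipcode")["price"].median()
zip_rank = zip_medians.rank(method="first").astype(int)
toy["zip_rank"] = toy["zipcode"].map(zip_rank)

print(toy)
```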

In [None]:
'''# visualize zip code as a color function

fig, ax = plt.subplots(figsize=(20, 15))

ax.scatter(df['median_zip'], df['price'] /100000, c=df['zip_rank'], cmap='magma_r')

ax.set_xlabel('Median Home Price per Zip', fontsize=12)
ax.set_ylabel('Price in $100,000', fontsize=12)
ax.set_title('Price per Zip Code Median, by Zip Code Median Rank', fontsize=20)
;'''

We can see that a few of our zip codes are very high value. There are also some clear outliers in this data set. We'll take care of removing those, and then come back to this visual again later after we've done some cleanup.

In [None]:
'''# visualize zip code as a color function, on a plot of price per square footage

fig, ax = plt.subplots(figsize=(20, 15))

ax.scatter(df['livingAreaSqFt'], df['price'] /100000, c=df['zip_rank'], cmap='magma_r')

ax.set_xlabel('Square Feet of Living Space', fontsize=12)
ax.set_ylabel('Price in $100,000', fontsize=12)
ax.set_title('Price per Total Square Feet, by Zip Code Median Rank', fontsize=20)
;'''

In [None]:
# we're dropping the values above 3 million, and the 3 entries from zipcode 78734
df.drop(df[df['price']>3000000].index, inplace=True)
df.drop(df[df['zipcode']==78734].index, inplace=True)

In [None]:
# check price stats by zip code and display the top 35 zip codes by mean

find_zip_outliers = df.groupby('zipcode')['price'].describe()
find_zip_outliers.sort_values('mean', ascending=False).head(35)

# very suspicious values in many zip codes for min

In [None]:
# nothing priced at 75k or under is a legitimate market-value sale.
# anything in this range is almost certainly an inter-family sale,
# a non-commercial transfer, or some other unusual sale type.
# We are dropping all of those.
df.drop(df.loc[(df['price'] <= 75000)].index, axis=0, inplace=True)

In [None]:
# Eliminating outliers on a per-zipcode basis using our IQR 1.6

zipcodes = df['zipcode'].unique()

for i in zipcodes:
    lower, upper = iqr_outliers(df[df['zipcode'] == i]['price'])
    df.drop(df[ ( (df.price > upper) & (df['zipcode'] == i) ) | ( (df.price < lower)  & (df['zipcode'] == i) ) ].index , inplace=True)


In [None]:
'''#We can check our price per zip code histograms.

df['price'].hist(by=df['zipcode'], figsize=(30,30));    

# some of our zip codes don't have enough sales to give us information'''

In [None]:
# We're going to drop our few zip codes where we have only a couple of data points

df.drop(df.loc[df['zipcode'].isin([78653, 78738, 78719, 78652, 78742])].index, axis=0, inplace=True)

In [None]:
# redo our zip code medians and rankings after outlier removal

# apply the median home price per zip code to the data frame again after outlier removal
df['median_zip'] = df['zipcode'].apply(lambda x: round(df.loc[df['zipcode']==x]['price'].median(), 0))

# group our dataframe by zipcode on median home price, sorted ascending. We want to bin like-medians together.
zipsorted = pd.DataFrame(df.groupby('zipcode')['price'].median().sort_values(ascending=True))

# divide our dataframe into groups with entries per group as specified above,
# and assign this number to a new column
zipsorted['rank'] = np.divmod(np.arange(len(zipsorted)), 1)[0]+1

# make a new column on our dataframe. Look up each zip entry's group, and append to the column.
df['zip_rank'] = df['zipcode'].apply(lambda x: make_group(x, zipsorted, 'rank'))

In [None]:
'''# re-visualize zip code as a color function, using the median zip after outlier removal. 

fig, ax = plt.subplots(figsize=(20, 15))

ax.scatter(df['median_zip'], df['price'] /100000, c=df['zip_rank'], cmap='magma_r')

ax.set_xlabel('Zip Code by Median Rank', fontsize=12)
ax.set_ylabel('Price in $100,000', fontsize=12)
ax.set_title('Price per Zip Code Median, by Zip Code Median Home Value', fontsize=20);

# save visualization to png
#plt.savefig('images/zip_prices.png')'''

In [None]:
# compute each listing's price per square foot of living area
df['pr_sqft'] = df.apply(lambda x: round( (x['price'] / x['livingAreaSqFt'] ), 0), axis=1 )

In [None]:
'''fig, ax = plt.subplots(figsize=(20, 15))

ax.scatter(df['livingAreaSqFt'], df['pr_sqft'], c=df['zip_rank'], cmap='magma_r')

ax.set_xlabel('Total Square Footage', fontsize=12)
ax.set_ylabel('Price Per Square Foot', fontsize=12)
ax.set_title('Price Per Square Foot to Total Square Footage, by Zip Code Median Rank', fontsize=20);

# save visualization to png
#plt.savefig('images/zip_prices.png')'''

In [None]:
#dropping irrationally high pr/sqft
df.drop(df[df['pr_sqft']>1000].index, inplace=True)

In [None]:
'''# visualize zip code as a color function, on a plot of price per square footage

fig, ax = plt.subplots(figsize=(20, 15))

ax.scatter(df['livingAreaSqFt'], df['price'] /100000, c=df['zip_rank'], cmap='magma_r')

ax.set_xlabel('Square Feet of Living Space', fontsize=12)
ax.set_ylabel('Price in $100,000', fontsize=12)
ax.set_title('Price per Total Square Footage, by Zip Code Median Rank', fontsize=20)
;'''

In [None]:
df['price'].mean()

In [None]:
low_zips = df.loc[df['median_zip']<500000]
high_zips = df.loc[df['median_zip']>=500000]

In [None]:
'''# visualize zip code as a color function, on a plot of price per square footage

fig, ax = plt.subplots(figsize=(20, 15))

ax.scatter(low_zips['livingAreaSqFt'], low_zips['price'] /100000, c=low_zips['zip_rank'], cmap='magma_r')

ax.set_xlabel('Square Feet of Living Space', fontsize=12)
ax.set_ylabel('Price in $100,000', fontsize=12)
ax.set_title('Price per Total Square Footage, by Zip Code Median Rank \nFor Zip Medians under 500k', fontsize=20)
;'''

In [None]:
'''# visualize zip code as a color function, on a plot of price per square footage

fig, ax = plt.subplots(figsize=(20, 15))

ax.scatter(high_zips['livingAreaSqFt'], high_zips['price'] /100000, c=high_zips['zip_rank'], cmap='magma_r')

ax.set_xlabel('Square Feet of Living Space', fontsize=12)
ax.set_ylabel('Price in $100,000', fontsize=12)
ax.set_title('Price per Total Square Footage, by Zip Code Median Rank\nFor Zip Medians over 500k', fontsize=20)
;'''

Here's a fun way to see the improvements to our data quality after we clean outliers! A much deeper color map.

In [None]:
'''# plotting latitude and longitude as a visual scatter plot. The improved color map actually visually demonstrates
# the removal of extreme price outliers.

plt.figure(figsize=(25,25))

sns.scatterplot(data=df, x="longitude", y="latitude", hue="price", palette="magma_r");'''

In [None]:
'''# we can also map our zip codes in this way.

plt.figure(figsize=(25,25))

sns.scatterplot(data=df, x="longitude", y="latitude", hue="zip_rank", palette="magma_r");'''

### Missing Data

In [None]:
# look for nulls

df.isna().sum()

# no missing data. impressive!

In [None]:
# check data types

df.dtypes

# data types all look correct

### Binary Data

In [None]:
# we're going to convert some of our ordinal features to binary 0/1
convert_to_bool = ['numOfAccessibilityFeatures', 'numOfAppliances', 'numOfParkingFeatures', 'numOfPatioAndPorchFeatures', 'numOfSecurityFeatures', 'numOfWaterfrontFeatures', 'numOfWindowFeatures', 'numOfCommunityFeatures']

df_convert_to_bool = df[convert_to_bool]
df_convert_to_bool.describe()

In [None]:
# Any feature count that is still 0 at the 50th percentile is getting converted to a binary flag

# change all non-null values > 0 in those columns to 1
df.loc[df['numOfAccessibilityFeatures'] > 0, 'numOfAccessibilityFeatures'] = 1
df.loc[df['numOfPatioAndPorchFeatures'] > 0, 'numOfPatioAndPorchFeatures'] = 1
df.loc[df['numOfSecurityFeatures'] > 0, 'numOfSecurityFeatures'] = 1
df.loc[df['numOfWaterfrontFeatures'] > 0, 'numOfWaterfrontFeatures'] = 1
df.loc[df['numOfWindowFeatures'] > 0, 'numOfWindowFeatures'] = 1
df.loc[df['numOfCommunityFeatures'] > 0, 'numOfCommunityFeatures'] = 1

# now anything that is not a 1 becomes a 0
df.loc[df['numOfAccessibilityFeatures']!= 1, 'numOfAccessibilityFeatures'] = 0
df.loc[df['numOfPatioAndPorchFeatures'] != 1, 'numOfPatioAndPorchFeatures'] = 0
df.loc[df['numOfSecurityFeatures'] != 1, 'numOfSecurityFeatures'] = 0
df.loc[df['numOfWaterfrontFeatures'] != 1, 'numOfWaterfrontFeatures'] = 0
df.loc[df['numOfWindowFeatures'] != 1, 'numOfWindowFeatures'] = 0
df.loc[df['numOfCommunityFeatures'] != 1, 'numOfCommunityFeatures'] = 0

# rename to reflect binary
df.rename(columns={'numOfAccessibilityFeatures' : 'accessibility', 'numOfPatioAndPorchFeatures' : 'patioporch', 'numOfSecurityFeatures': 'security', 
                  'numOfWaterfrontFeatures': 'waterfront', 'numOfWindowFeatures' : 'windowfeatures', 'numOfCommunityFeatures' : 'community'}, inplace=True)
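
The two-pass conversion above (set positives to 1, then everything else to 0) can also be written as a single comparison. This is a minimal sketch on a toy frame with two of the count columns; the values are made up.

```python
import pandas as pd

# toy feature counts; values are invented for illustration
toy = pd.DataFrame({
    "numOfSecurityFeatures": [0, 2, 1, 0],
    "numOfWindowFeatures":   [3, 0, 0, 1],
})

count_cols = ["numOfSecurityFeatures", "numOfWindowFeatures"]

# True where a listing has at least one feature, then cast True/False to 1/0
toy[count_cols] = (toy[count_cols] > 0).astype(int)

print(toy)
```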

In [None]:
# convert original boolean columns to binary 0/1
boolean = ['hasAssociation', 'hasCooling', 'hasGarage', 'hasHeating', 'hasSpa', 'hasView']

# astype works on the whole list of columns at once, so no loop is needed
df[boolean] = df[boolean].astype(int)

### Study Target Variable

In [None]:
#histogram and normal probability plot
sns.distplot(df['price'], fit=norm);
fig = plt.figure()

res = stats.probplot(df['price'], plot=plt)

# our sales price histogram is positively skewed and has a high peak
# Our QQ-plot shows that we have heavy tails with right skew

In [None]:
#skewness and kurtosis
print("Skewness: %f" % df['price'].skew())
print("Kurtosis: %f" % df['price'].kurt())

# price is highly right skewed
# very positive kurtosis, indicating lots in the tails. We can see those tails in the right skew.

In [None]:
# log transform our target price to improve normality of distribution
df_target_log = np.log(df['price'])

#histogram and normal probability plot
sns.distplot(df_target_log, fit=norm);
fig = plt.figure()
res = stats.probplot(df_target_log, plot=plt)

# Our target price is more normally distributed when log transformed, so we'll be doing that when we make our model
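
One consequence of modelling log price is that predictions come back in log units and need `np.exp` to return to dollars. The sketch below uses a handful of made-up prices to show the skew shrinking under the log and the round trip back to the original scale.

```python
import numpy as np
import pandas as pd

# made-up right-skewed prices, for illustration only
prices = pd.Series([150_000, 250_000, 400_000, 650_000, 2_500_000], name="price")

log_prices = np.log(prices)
print("skew before log:", round(prices.skew(), 3))
print("skew after log: ", round(log_prices.skew(), 3))

# a model fit on log price predicts log price; np.exp maps it back to dollars
recovered = np.exp(log_prices)
```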


## Natural Language Processing

Our data set includes the listing text for each sale. We're going to use Natural Language Processing methods to extract relevant information from the listing text to boost the effectiveness of our model.

We're using spaCy. After the basic package installation, we also need to load the English language pipeline.

In [None]:
# Load spaCy with English language processor
nlp = spacy.load("en_core_web_sm")

# add real estate related stop words to default stop word list
nlp.Defaults.stop_words |= {"bedroom", "bathroom","bath","home", "austin", "tx", "pron", "sq", "ft", "rent", "mo",
                            "w", "bed", 'single', 'family', 'contain', 'st', 'dr', 'square', 'foot', 'room', 'square', 'feet',
                            '-pron-', 'garage', 'pflugerville', 'story', '1st', '1story', '2car', '2nd',
                            '2story', '3rd', '4th', '5th', '6th', '7th', '8th', '9th', '10th', 'street', 'avenue', 'ave', 
                            'sac', 
                            
                           }

nlp.Defaults.stop_words.remove('is')
nlp.Defaults.stop_words.remove('as')

In [None]:
# text processing functions for NLP

def preprocessor(word):
    '''processes an individual word to remove punctuation, numbers, special characters etc
    Returns processed word, or blank string if character removal resulted in no word
    ARGUMENT:
    word from line of text'''
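    if not isinstance(word, str):
        return ''
    # raw strings keep the regex escapes from being read as string escapes
    word = re.sub(r'[^\w\s]', '', word)
    word = re.sub(r'<[^>]*>', '', word)
    word = re.sub(r'<[0-9]*>', '', word)
    word = re.sub(r'[\W]+', '', word.lower())
    # drop tokens that are purely numeric
    try:
        int(word)
        return ''
    except ValueError:
        return word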

def word_processor(line):
    '''Takes a line of text. Tokenizes each word of sentence. 
    If token is stop word, goes to next token. If not stop word,
    calls preprocessor on word
    Returns processed words from line
    ARGUMENT: 
    line of text'''
    
    tokens = nlp(line) # nlp reads line and creates tokens from each word  
    words = [] # empty list of words for this line
    
    for token in tokens:
        if not token.is_stop: # only continues if token is not stop word
            token_preprocessed = preprocessor(token.lemma_) # calls preprocessor on word
            if token_preprocessed != '': # only continues if returned word is not empty
                words.append(token_preprocessed) # appends word to list of words
    return(words) # return list of words for this line

def text_block_processor(text):
    '''Takes a block of text. Divides block into sentences with words lemmatized.
    Sends each sentence to word processor. Concatenates all words into one string
    If the string contains "zestimate", returns a DEFAULT listing note
    Otherwise returns string of cleaned and processed words from text block
    ARGUMENTS:
    block of text
    '''
    
    make_sentences = nlp(text)
    
    sentences_lemmata_list = [sentence.lemma_.lower() for sentence in make_sentences.sents]
    
    these_processed_sentences = ''

    
    for item in sentences_lemmata_list:
        words = word_processor(item)
        line = ' '.join(words)
        these_processed_sentences += (' ' + line)
        
    if 'zestimate' in these_processed_sentences:
        return 'DEFAULT'
    else:
        return these_processed_sentences

In [None]:
# reset indices on original data frame before making a copy
df.reset_index(drop=True, inplace=True)

In [None]:
'''# copy the description column to a new data frame for text processing
listing_text = pd.DataFrame(df['description'])
listing_text

# cleaning all of the text in our listing_text description field and adding it to the listing_text data frame 
sentences = []

listing_text['sentences'] = None

for row in listing_text.index:
    thistext = listing_text['description'][row]
    the_sentences = text_block_processor(thistext)
    listing_text['sentences'][row] = the_sentences
    sentences += the_sentences

# drop the description field and export our listing_text to a pickle file so we don't have to run it again
listing_text.drop('description', axis=1, inplace=True)

listing_text.to_pickle("listing_text.pkl")'''

In [None]:
listing_desc = pd.read_pickle("listing_text.pkl")
listing_desc

In [None]:
# append our listing text to our original data frame
df = pd.concat([df, listing_desc], axis=1)

df

## Create Holdout Set

We need to create our holdout data before any further processing.

The reasons for this are:
   * We will standardize our continuous variables. The scaler should be fit on the train set only, and that fitted scaler then applied to the test set.
   * We will be feature engineering on our train set, and applying that later to our test set. We cannot have our test set data leak into our engineered features.
   * We'll be doing some natural language processing, fitting on our train set and applying to our test set. We don't want data leakage for the same reasons as above.
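
The first point can be sketched as follows: the scaler learns its mean and standard deviation from the train split, and the test split is transformed with those same statistics. The data here is synthetic and the variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# synthetic single-feature data standing in for a continuous column
rng = np.random.default_rng(2)
X = rng.normal(loc=2000, scale=500, size=(200, 1))

X_train, X_test = train_test_split(X, test_size=0.2, random_state=2)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # statistics learned from train only
X_test_scaled = scaler.transform(X_test)        # reuses the train mean and std
```

Calling `fit_transform` on the test split instead would let test-set statistics leak into the features.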
    

In [None]:
# set our random seed for the notebook. We could randomize this each time the notebook is run,
# but ultimately we want all of our train/test splits to use the same data
randomstate = 2

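y = pd.DataFrame(df['price'])
x = df.drop('price', axis=1)

# creating our train/validation sets and our test sets
# we split the full df rather than x, so train_data and holdout keep the price column that target encoding uses below
train_data, holdout, y_train, y_test = train_test_split(df, y, test_size=0.2, random_state=randomstate)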

# reset indices to prevent any index mismatches
train_data.reset_index(inplace=True)
train_data.drop('index', axis=1, inplace=True)

holdout.reset_index(inplace=True)
holdout.drop('index', axis=1, inplace=True)

y_train.reset_index(inplace=True, drop=True)
y_test.reset_index(inplace=True, drop=True)

## Feature Engineering

In [None]:
# Adding target encoding, which we will opt to try instead of one-hot with a few models

# smooth mean function by Max Halford at https://maxhalford.github.io/blog/target-encoding/

def calc_smooth_mean(df, by, on, m, target_df):
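    '''Input a pandas.DataFrame (df), a categorical column name (by), the name of the target column (on),
    a smoothing weight (m), and the DataFrame to encode (target_df).
    Returns the rounded smoothed group mean for each row of target_df.'''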
    # Compute the global mean
    mean = df[on].mean() 

    # Compute the number of values and the mean of each group
    agg = df.groupby(by)[on].agg(['count', 'mean'])  
    counts = agg['count']
    means = agg['mean']

    # Compute the "smoothed" means
    smooth = (counts * means + m * mean) / (counts + m)

    # Replace each value by the according smoothed mean
    return round(target_df[by].map(smooth), 0) 

num_of_samples = train_data.shape[0]
zip_samples = num_of_samples/train_data['zipcode'].unique().shape[0]
month_samples = num_of_samples/train_data['latest_salemonth'].unique().shape[0]


# create smooth additive encoded variables for zipcode, year built, and monthsold
train_data['zip_smooth'] = calc_smooth_mean(train_data, 'zipcode', 'price', zip_samples, train_data)
train_data['year_smooth'] = calc_smooth_mean(train_data, 'yearBuilt', 'price', 300, train_data)
train_data['month_smooth'] = calc_smooth_mean(train_data, 'latest_salemonth', 'price', month_samples, train_data)

# Create a wider lat and long zone to calculate an area mean
train_data['lat_zone'] = round(train_data['latitude'], 2)
train_data['long_zone'] = round(train_data['longitude'], 2)

lat_samples = num_of_samples/train_data['lat_zone'].unique().shape[0]
long_samples = num_of_samples/train_data['long_zone'].unique().shape[0]

# calculate smooth mean variables for lat and long, then create an interaction variable describing both together
train_data['lat_smooth'] = calc_smooth_mean(train_data, 'lat_zone', 'price', lat_samples, train_data)
train_data['long_smooth'] = calc_smooth_mean(train_data, 'long_zone', 'price', long_samples, train_data)
train_data['lat_long'] = round(np.sqrt(train_data['lat_smooth'] * train_data['long_smooth']), 0)
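
To make the smoothing concrete, here is a small worked sketch of the same formula, `(count * group_mean + m * global_mean) / (count + m)`, on made-up numbers. It also shows one way the mapping learned on the train set might be applied to other rows; filling unseen categories with the global mean is an assumption of this sketch, not something the notebook specifies.

```python
import pandas as pd

# toy train set: group A has four sales, group B has a single expensive one
train = pd.DataFrame({
    "zipcode": ["A", "A", "A", "A", "B"],
    "price":   [100, 100, 100, 100, 500],
})

m = 4                                   # smoothing weight
global_mean = train["price"].mean()     # (4*100 + 500) / 5 = 180

agg = train.groupby("zipcode")["price"].agg(["count", "mean"])
smooth = (agg["count"] * agg["mean"] + m * global_mean) / (agg["count"] + m)
# A: (4*100 + 4*180) / 8 = 140
# B: (1*500 + 4*180) / 5 = 244  (pulled strongly toward the global mean)

# apply the train-derived mapping to new rows; "C" never appeared in train
holdout_toy = pd.DataFrame({"zipcode": ["B", "A", "C"]})
holdout_toy["zip_smooth"] = holdout_toy["zipcode"].map(smooth).fillna(global_mean)

print(smooth)
print(holdout_toy)
```

The single sale in group B moves its encoded value from 500 to 244, which is the point of the weight `m`: sparse groups lean on the global mean.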

## Correlations/Multicollinearity

In [None]:
# look for multicollinearity of features
fig, ax = plt.subplots(figsize=(20, 20))

sns.heatmap(train_data.corr(), center=0,  
           vmin=-1, vmax=1,  square=True)

# title
plt.title('PEARSON CORRELATION MATRIX', fontsize=18)

plt.show()

In [None]:
train_data.corr()

In [None]:
#Get our list of highly correlated feature pairs with following steps:

# save correlation matrix as a new data frame
# converts all values to absolute value
# stacks the row:column pairs into a multiindex
# reset the index to set the multiindex to separate columns
# sort values. 0 is the column automatically generated by the stacking
df_correlations = train_data.corr().abs().stack().reset_index().sort_values(0, ascending=False)

# zip the variable name columns in a new column named "pairs"
df_correlations['pairs'] = list(zip(df_correlations.level_0, df_correlations.level_1))

# set index to pairs
df_correlations.set_index(['pairs'], inplace = True)

# rename our results column to correlation
df_correlations.rename(columns={0: "correlation"}, inplace=True)

# Drop 1:1 correlations to get rid of self pairs
df_correlations.drop(df_correlations[df_correlations['correlation'] == 1.000000].index, inplace=True)

# view pairs above 75% correlation and below 95% correlation (engineered features will correlate with each other above 95%)
df_correlations[(df_correlations.correlation>.75) & (df_correlations.correlation<.95)]
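
Stacking the full correlation matrix lists every pair twice, once as (a, b) and once as (b, a). A possible alternative, sketched here on synthetic columns, masks everything except the upper triangle before stacking so each pair appears once and self-pairs never show up. The column names and data are illustrative assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
a = rng.normal(size=500)
toy = pd.DataFrame({
    "a": a,
    "b": a + rng.normal(scale=0.1, size=500),  # nearly a copy of a
    "c": rng.normal(size=500),                 # unrelated noise
})

corr = toy.corr().abs()

# keep only entries strictly above the diagonal; everything else becomes NaN
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
upper = corr.where(mask)

# one row per unique pair, sorted from most to least correlated
pairs = upper.stack().dropna().sort_values(ascending=False)
high_pairs = pairs[pairs > 0.75]
print(high_pairs)
```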


In [None]:
# Check out our variables correlating with price
df_correlations = train_data.corr().abs().stack().reset_index().sort_values(0, ascending=False)
df_correlations.loc[df_correlations['level_0'] == 'price'].sort_values(0, ascending=False)

We'll drop:

* parkingSpaces, hasGarage, numOfParkingFeatures and keep garageSpaces (higher relationship with price)
* numOfElementarySchools, numOfPrimarySchools, numOfMiddleSchools and numOfHighSchools, after summing them into a single numOfSchools feature
* MedianStudentsPerTeacher, keeping avgSchoolRating
* numOfBathrooms correlates with square footage, but I'm not dropping either

We can get a sense of the features most important to price from our correlation table. Zipcode as a plain variable does not correlate, which makes sense: without some sort of transformation it is an arbitrary, unordered number. Transformed into median_zip or zip_rank, it becomes the MOST important contributor to price. Big contributors to price include:

* Lat/long in a target encoded form
* zip code (in some altered form, not as an arbitrary number)
* livingAreaSqFt
* numOfBathrooms
* avgSchoolRating

In [None]:
train_data['numOfSchools'] = train_data['numOfPrimarySchools'] + train_data['numOfElementarySchools'] + train_data['numOfMiddleSchools'] + train_data['numOfHighSchools']
holdout['numOfSchools'] = holdout['numOfPrimarySchools']  + holdout['numOfElementarySchools'] + holdout['numOfMiddleSchools'] + holdout['numOfHighSchools']

In [None]:
# drop multicollinear features
train_data.drop(['parkingSpaces', 'hasGarage', 'numOfElementarySchools', 'numOfPrimarySchools', 'numOfMiddleSchools', 
         'numOfHighSchools', 'MedianStudentsPerTeacher', 'numOfParkingFeatures'], axis=1, inplace=True)
holdout.drop(['parkingSpaces', 'hasGarage', 'numOfElementarySchools', 'numOfPrimarySchools', 'numOfMiddleSchools', 
         'numOfHighSchools', 'MedianStudentsPerTeacher', 'numOfParkingFeatures'], axis=1, inplace=True)

## EDA and Processing Train Set

### Categoricals

In [None]:
categories = ['zipcode', 'yearBuilt', 'hasAssociation', 'hasCooling', 'hasHeating', 'hasSpa', 'hasView', 'accessibility', 'patioporch', 'security',
          'waterfront', 'windowfeatures', 'community', 'latest_salemonth', 'numOfSchools', 'garageSpaces', 'propertyTaxRate', ]

# copy so the column assignments below don't operate on a view of train_data
df_categoricals = train_data[categories].copy()

In [None]:
# adding price to our dataframe so that we can do some visualizations    

df_categoricals['price'] = train_data['price']

# plot our categoricals as box plots vs price
def boxplot(x, y, **kwargs):
    sns.boxplot(x=x, y=y)
    x=plt.xticks(rotation=90)
    
f = pd.melt(df_categoricals, id_vars=['price'], value_vars=categories)
g = sns.FacetGrid(f, col="variable",  col_wrap=2, sharex=False, sharey=False, height=5)  # 'size' was renamed to 'height' in seaborn 0.9
g = g.map(boxplot, "value", "price")

df_categoricals.drop('price', axis=1, inplace=True)

In [None]:
# there is only ONE listing with 5 schools, so we will change that one to 4
df_categoricals.loc[df_categoricals['numOfSchools']==5, 'numOfSchools'] = 4

In [None]:
# binning year built into quantile bins
# We're also saving the bin information to year_bins because we will need the bin information to make new predictions
num_bins = 30
labels = np.array(range(1,num_bins+1))
df_categoricals["year_block"], year_bins = pd.qcut(df_categoricals['yearBuilt'], q=num_bins, retbins=True, labels=labels)

df_categoricals.drop('yearBuilt', axis=1, inplace=True)

# telling Pandas that these columns are categoricals
for item in df_categoricals.columns:
    df_categoricals[item] = df_categoricals[item].astype('category')

In [None]:
# make a processed bins file for use with linear regression
# We're making TWO categorical sets. One is high one hot encoding. One is low one hot encoding, and the 
# categoricals in that one will be target encoded as continuous instead

high_one_hot_cat =  ['zipcode', 'year_block', 'hasAssociation', 
                 'hasCooling', 'hasHeating', 'hasSpa', 'hasView', 
                 'accessibility', 'patioporch', 'security', 'numOfSchools',
              'waterfront', 'windowfeatures', 'community', 'latest_salemonth',
                    'garageSpaces', 'propertyTaxRate', ]
low_one_hot_cat =  ['hasAssociation', 
                'hasCooling', 'hasHeating', 'hasSpa', 
                'hasView', 'accessibility', 'patioporch', 'numOfSchools',
                'security', 'waterfront', 'windowfeatures', 'community', 'garageSpaces', 'propertyTaxRate', ]

df_cats_high_one_hot = pd.get_dummies(df_categoricals[high_one_hot_cat], prefix=high_one_hot_cat, drop_first=True)
df_cats_low_one_hot = pd.get_dummies(df_categoricals[low_one_hot_cat], prefix=low_one_hot_cat, drop_first=True)

### Continuous

In [None]:
continuous = ['numPriceChanges', 
              'lotSizeSqFt', 'livingAreaSqFt', 'avgSchoolDistance', 
              'avgSchoolRating', 'avgSchoolSize', 'numOfBedrooms', 
              'numOfStories', 'numOfPhotos', 
              'numOfAppliances', 'latest_salemonth',
             'zip_smooth', 'year_smooth', 'month_smooth', 'lat_long'] 

# copy so adding the price column doesn't operate on a view of train_data
x_continuous = train_data[continuous].copy()
x_continuous['price'] = train_data['price']

Let's look at mean price by month and see if there are any better insights.

In [None]:
small_cont = ['numPriceChanges', 
              'avgSchoolRating', 'numOfBedrooms', 
              'numOfStories', 'numOfAppliances', 
              'latest_salemonth']
# plot our continuous as box plots vs price
def boxplot(x, y, **kwargs):
    sns.boxplot(x=x, y=y)
    x=plt.xticks(rotation=90)

f = pd.melt(x_continuous, id_vars=['price'], value_vars=small_cont)
g = sns.FacetGrid(f, col="variable",  col_wrap=2, sharex=False, sharey=False, height=5)  # 'size' was renamed to 'height' in seaborn 0.9
g = g.map(boxplot, "value", "price")


In [None]:
large_cont = ['lotSizeSqFt', 'livingAreaSqFt', 'avgSchoolDistance', 
              'avgSchoolRating', 'avgSchoolSize', 'numOfPhotos',
             'zip_smooth', 'year_smooth', 'month_smooth', 'lat_long']

# check linearity of continuous predictors

fig, axes = plt.subplots(nrows=5, ncols=2, figsize=(15,25), sharey=True)

for ax, column in zip(axes.flatten(), large_cont):
    ax.scatter(x_continuous[column], x_continuous['price']/100000, label=column, alpha=.1)
    ax.set_title(f'Sale Price vs {column}')
    ax.set_xlabel(column)
    ax.set_ylabel('Sale Price in $100,000')

fig.tight_layout()

Positive relationship observed with:
* lot size
* square footage
* school rating
* number of bedrooms
* lat/long

Negative relationship observed with:
* number of price changes

Others seem neutral/uncertain

In [None]:
# Checking our mean sales price per sale month for linearity

monthly_prices = train_data.groupby('latest_salemonth')['price'].mean()

plt.scatter(monthly_prices.index, monthly_prices)
plt.title("Linearity check")
plt.xlabel('sale month')
plt.ylabel('mean sales price')
plt.show()

Our average per month looks polynomial.

In [None]:
# Checking our mean sales price per living area square footage for linearity
sqft_prices = train_data.groupby('livingAreaSqFt')['price'].mean()

plt.scatter(sqft_prices.index, sqft_prices)
plt.title("Linearity check")
plt.xlabel('living area sq ft')
plt.ylabel('sales price')
plt.show()

In [None]:
# Checking our mean sales price per lot size for linearity
lot_prices = train_data.groupby('lotSizeSqFt')['price'].mean()

plt.scatter(lot_prices.index, lot_prices)
plt.title("Linearity check")
plt.xlabel('lot size sq ft')
plt.ylabel('sales price')
plt.show()

In [None]:
# Checking our mean sales price per average school size for linearity
school_size_prices = train_data.groupby('avgSchoolSize')['price'].mean()

plt.scatter(school_size_prices.index, school_size_prices)
plt.title("Linearity check")
plt.xlabel('average school size')
plt.ylabel('sales price')
plt.show()

#### Standardize

In [None]:
# check out our histograms

x_continuous.hist(figsize=(18,15), bins='auto');

In [None]:
# don't need price in there anymore
x_continuous.drop('price', axis=1, inplace=True)

# standardize all of our values with scikit-learn StandardScaler
scaler = StandardScaler()

#transformed_scaled_continuous = pd.DataFrame(scaler.fit_transform(x_train_cont_log),columns = x_train_cont_log.columns)
scaled_continuous = pd.DataFrame(scaler.fit_transform(x_continuous),columns = x_continuous.columns)
scaled_continuous.head(5)

In [None]:
# make a processed bins file for use with linear regression
# We're making TWO continuous sets. One is high one hot encoding. One is low one hot encoding, and includes the 
# categoricals that are target encoded as continuous instead

high_one_hot_cont =  ['numPriceChanges', 
              'lotSizeSqFt', 'livingAreaSqFt', 'avgSchoolDistance', 
              'avgSchoolRating', 'avgSchoolSize', 'numOfBedrooms', 
              'numOfStories', 'numOfPhotos', 
              'numOfAppliances']
low_one_hot_cont =  ['numPriceChanges', 
              'lotSizeSqFt', 'livingAreaSqFt', 'avgSchoolDistance', 
              'avgSchoolRating', 'avgSchoolSize', 'numOfBedrooms', 
              'numOfStories', 'numOfPhotos', 
              'numOfAppliances',  
             'zip_smooth', 'year_smooth', 'month_smooth', 'lat_long']

df_cont_high_one_hot = scaled_continuous[high_one_hot_cont]
# copy so that adding columns later does not write into a slice of scaled_continuous
df_cont_low_one_hot = scaled_continuous[low_one_hot_cont].copy()

#### Finding Interactions

I wrote a function which finds all of the pairwise feature combinations possible in our dataset. For each pair, the function adds the product of the two features as an interaction term, runs a linear regression with repeated 5-fold cross validation, and records the mean r^2 score. Each score is compared with the base score of the model without interactions, and the resulting table gives the r^2 improvement from each feature combo. The function and its calls are commented out below because the search is slow to run.

In [None]:
'''def test_feature_combinations(price, variables):
    
    """Function takes in target price and a dataframe of independent variables, and 
    tests model improvement for each combination of variables
    ARGUMENTS:
    Y of target values
    X-dataframe of continuous features
    Returns dataframe of score improvements over base score for each interaction combination"""
    
    # select our estimator and our cross validation plan
    regression = LinearRegression()
    cv = RepeatedKFold(n_splits=5, n_repeats=2, random_state=1)
    
    # prepare our scoring dataframe
    scoring_df = pd.DataFrame()
    
    # prepare our lists to store our features and scores as we iterate
    scores = []
    feature1 = []
    feature2 = []
    
    # Get a list of all of our features (the target is passed in separately as price)
    features = list(variables.columns)

    # make a list of all of our possible feature combinations
    feature_combos = itertools.combinations(features, 2)
    feature_combos = list(feature_combos)
    
    # set our y-value as our target variable
    y = price
    
    # prepare our x-value with our independent variables. We do an initial split here in order to run a 
    # linear regression to get a base r^2 on our basic model without interactions
    X = variables
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=randomstate)
    base_score = round(np.mean(cross_val_score(regression, X_train, y_train, scoring='r2', cv=cv)), 4)   
    print("Model base score is ",base_score)
    
    # now we run the regression on each feature combo
    for feature in feature_combos:
        feat1, feat2 = feature[0], feature[1]
        
        # create the test interaction on our data set
        variables['test_interaction'] = variables[feat1] * variables[feat2]
        # create a new X which includes the test interaction
        X = variables
        # make a new split so that our x-splits include the test interaction
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=randomstate)
        
        # Run a linear regression with cross-val just like our base model, and append the score to our scores list
        new_score = round(np.mean(cross_val_score(regression, X_train, y_train, scoring='r2', cv=cv)), 4)
        scores.append(new_score)
        # put feature 1 on a list
        feature1.append(feat1)
        # put feature 2 on a list
        feature2.append(feat2)
        print(feat1, feat2, new_score)
        
        
    
    # load all of our lists into the scoring dataframe
    scoring_df['feature1'] = feature1
    scoring_df['feature2'] = feature2
    scoring_df['scores'] = scores
    scoring_df['improvement'] = scoring_df['scores'] - base_score
    variables.drop('test_interaction', axis=1, inplace=True)
    
    # return our scoring dataframe
    return scoring_df'''

In [None]:
# running our function on our continuous variables to look for improvement
# our R2 is much lower for model base score because we aren't including our categorical variables in this improvement assessment

#scoring_df = test_feature_combinations(y_train, df_cont_low_one_hot)

In [None]:
# showing our improvement scores for our interactions

#scoring_df.sort_values('improvement', ascending=False)

We won't add any interactions. None of these improvements were significant enough.

#### Add polynomial features

In [None]:
def plot_polys(y, xlabel, title):
    '''Takes in a y-axis, x-axis label, and title and plots with various polynomial levels
    ARGUMENTS:
    y axis variable values
    x-axis label
    visualization title'''
    x = y.index
    
    # express numbers as arrays and reshape
    y = np.array(y)
    x = np.array(x)
    x = x.reshape(-1, 1)
    
    # make sure indices match up
    y = y[x[:,0].argsort()]
    x = x[x[:, 0].argsort()]

    # plot figure
    plt.figure(figsize=(16, 8))

    # standard linear regression
    linreg = LinearRegression()
    linreg.fit(x, y)

    # 2nd degree polynomial regression
    poly2 = PolynomialFeatures(degree=2)
    x_poly2 = poly2.fit_transform(x)
    poly_reg2 = LinearRegression()
    poly_reg2.fit(x_poly2, y)

    # third degree polynomial regression 
    poly3 = PolynomialFeatures(degree=3)
    x_poly3 = poly3.fit_transform(x)
    poly_reg3 = LinearRegression()
    poly_reg3.fit(x_poly3, y)

    # predict on x values
    pred = linreg.predict(x)
    pred2 = poly_reg2.predict(x_poly2)
    pred3 = poly_reg3.predict(x_poly3)

    # plot regression lines
    plt.scatter(x, y)
    plt.yscale('log')
    plt.title(title)
    plt.xlabel(xlabel)
    plt.ylabel('Average')
    plt.plot(x, pred, c='red', label='Linear regression line')
    plt.plot(x, pred2, c='yellow', label='Polynomial regression line 2')
    plt.plot(x, pred3, c='#a3cfa3', label='Polynomial regression line 3')
    # show the legend so the labels above are visible
    plt.legend();

In [None]:
# group by month sold and take the mean price to see the relationship
y = train_data.groupby('latest_salemonth')['price'].mean()
plot_polys(y, "Month", "Month Sold Mean")

In [None]:
# adding our chosen polynomial features

def create_polynomial_array(data, column, num_features):
    values = data[column]
    poly_array = np.array(values)
    poly_array = poly_array.reshape(-1,1)
    poly_fit = PolynomialFeatures(degree=num_features, include_bias=False)
    fit_features = poly_fit.fit_transform(poly_array)
    poly_df = pd.DataFrame(fit_features)
    return poly_df

month_poly = create_polynomial_array(df_cont_low_one_hot, 'month_smooth',2)

# column 0 is month_smooth itself; column 1 is its square
df_cont_low_one_hot['month1'] = month_poly[1]
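
As a sanity check on the column indexing above, here is a minimal sketch, using plain NumPy on made-up values, of what a degree-2 expansion without a bias column contains. For a single input column, `PolynomialFeatures(degree=2, include_bias=False)` returns its columns in the same order, `[x, x^2]`, so `month_poly[1]` is the squared term. The helper name `degree2_expansion` is hypothetical and not part of the notebook.

```python
import numpy as np

def degree2_expansion(values):
    """Return the columns [x, x**2] for a 1-D input (no bias column)."""
    x = np.asarray(values, dtype=float).reshape(-1, 1)
    return np.hstack([x, x ** 2])

# made-up standardized values, for illustration only
expanded = degree2_expansion([-1.0, 0.5, 2.0])

# index 1 is the squared term, the column the notebook stores as 'month1'
squared_term = expanded[:, 1]
print(squared_term)
```
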


### NLP 

In [None]:
nlp = ['sentences']

nlp_train = train_data[nlp]

In [None]:
nlp_train.head(10)

In [None]:
v = TfidfVectorizer(sublinear_tf=True, max_df=0.9, min_df=.005, ngram_range=(1,4), max_features=1000)
x = v.fit_transform(nlp_train['sentences'])

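# get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() is available from 1.0
train_word_vectors = pd.DataFrame(x.toarray(), columns=v.get_feature_names_out())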

In [None]:
predictors_train = sm.add_constant(train_word_vectors)
model = sm.OLS(y_train, predictors_train).fit()
model.summary()

In [None]:
def stepwise_selection(X, y, 
                       initial_list=[], 
                       threshold_in=0.01, 
                       threshold_out = 0.05, 
                       verbose=True):
    """ 
    Perform a forward-backward feature selection 
    based on p-value from statsmodels.api.OLS
    Arguments:
        X - pandas.DataFrame with candidate features
        y - list-like with the target
        initial_list - list of features to start with (column names of X)
        threshold_in - include a feature if its p-value < threshold_in
        threshold_out - exclude a feature if its p-value > threshold_out
        verbose - whether to print the sequence of inclusions and exclusions
    Returns: list of selected features 
    Always set threshold_in < threshold_out to avoid infinite looping.
    See https://en.wikipedia.org/wiki/Stepwise_regression for the details
    """
    included = list(initial_list)
    while True:
        changed=False
        # forward step
        excluded = list(set(X.columns)-set(included))
        new_pval = pd.Series(index=excluded, dtype=float)
        for new_column in excluded:
            model = sm.OLS(y, sm.add_constant(pd.DataFrame(X[included+[new_column]]))).fit()
            new_pval[new_column] = model.pvalues[new_column]
        best_pval = new_pval.min()
        if best_pval < threshold_in:
            best_feature = new_pval.idxmin()
            included.append(best_feature)
            changed=True
            if verbose:
                print('Add  {:30} with p-value {:.6}'.format(best_feature, best_pval))

        # backward step
        model = sm.OLS(y, sm.add_constant(pd.DataFrame(X[included]))).fit()
        # use all coefs except intercept
        pvalues = model.pvalues.iloc[1:]
        worst_pval = pvalues.max() # null if pvalues is empty
        if worst_pval > threshold_out:
            changed=True
            worst_feature = pvalues.idxmax()
            included.remove(worst_feature)
            if verbose:
                print('Drop {:30} with p-value {:.6}'.format(worst_feature, worst_pval))
        if not changed:
            break
    return included



In [None]:
# this was the result for ngram_range=(1, 4) with max_features=1000

#important_ngrams = stepwise_selection(train_word_vectors, y_train, verbose=True)

important_ngrams = ['zilker', 'design', 'view', 'wine', 'outdoor', 'main', 'laminate', 'barton', 'spa', 'great', 'heart', 'finish', 'i35', 'paint', 'hill', 'meadow', 'classic', 'chef kitchen', 'airport', 'lake', 'community', 'pool', 'guest', 'gourmet kitchen', 'default', 'office', 'custom', 'congress', 'luxury', 'flat', 'hardwood', 'build', 'vinyl', 'suite', 'courtyard', 'original', 'travis', 'medium', 'marble', 'level', 'walkable', 'tile', 'condo', 'west', 'modern', 'playground', 'addition', 'construction', 'south', 'detach', 'northwest', 'entertainer', 'oak', 'washer', 'central', 'character', 'city', 'investment', 'park pool', 'study', 'wall', 'garden tub', 'quality', 'fan', 'mckinney', 'height', 'north', 'granite counter', 'shed', 'indoor', 'branch', 'isd', 'community pool', 'concept', 'floor', 'acre', 'major', 'prime', 'anderson', 'luxurious', 'downstairs', 'line', 'contemporary', 'ut', 'stunning', 'wet bar', 'ground', 'easy access', 'price', 'convenient', 'minute', 'upgrade include', 'mile downtown', 'adorable', 'waterfall', 'pre', 'patio', 'en', 'en suite', 'screen', 'circle', 'like', 'plenty', 'car', 'main living', 'block', 'east', 'unique', 'replace', 'tankless', 'space', 'breakfast', 'staircase', 'new', 'window', 'playroom', 'yes', 'amenity', 'restaurant', 'canyon']

#print(important_ngrams)


In [None]:
# this was the result for ngram_range=(2, 4) with max_features=600

#important_ngrams = stepwise_selection(train_word_vectors, y_train, verbose=True)

#important_ngrams = ['zilker park', 'chef kitchen', 'gourmet kitchen', 'main house', 'outdoor kitchen', 'main level', 'barton creek', 'pool spa', 'master suite', 'outdoor living', 'high end', 'ceiling window', 'easy access', 'hardwood floor', 'wet bar', 'custom build', 'garden tub', 'granite counter', 'southpark meadow', 'bird lake', 'hill country', 'custom cabinetry', 'en suite', 'south congress', 'spa like', 'laminate flooring', 'washer dryer', 'ceiling fan', 'major employer', 'wall window', 'new construction', 'south park', 'tile flooring', 'fresh paint', 'community pool', 'great location', 'large backyard', 'attention detail', 'outdoor space', 'laminate floor', 'mile downtown', 'gas fireplace', 'laminate wood', 'wood floor', 'brand new', 'investment opportunity', 'acre lot', 'walnut creek', 'walk distance', 'mid century', 'tankless water', 'vinyl plank', 'great open', 'double oven', 'sys yes', 'floor ceiling', 'tile backsplash', 'new roof', 'storage shed', 'onion creek', 'hyde park', 'price sell', 'custom cabinet', 'mopac i35', 'major highway', 'hard tile', 'stainless steel', 'quiet neighborhood', 'floor large', 'picture window', 'floor master', 'open living', 'backyard perfect', 'glass door', 'upgrade include', 'office space', 'location close', 'award win', 'park pool', 'ceramic tile', 'screen porch', 'wood burn', 'soak tub', 'open concept', 'concept live', 'new carpet', 'new flooring', 'yard large', 'large window', 'mature oak', 'water heater', 'pool playground', 'upstairs game', 'cover patio', 'floor open', 'quick access', 'tall ceiling', 'wood look', 'creek greenbelt', 'default listing']

#print(important_ngrams)

In [None]:
'''model = LinearRegression()
model.fit(train_word_vectors, y_train)

from sklearn.inspection import permutation_importance
r = permutation_importance(model, train_word_vectors, y_train,
                           n_repeats=10,
                            random_state=0,
                          n_jobs=-1)

importances = {}

for i in r.importances_mean.argsort()[::-1]:
    if r.importances_mean[i] >= 0.001:
        importances[train_word_vectors.columns[i]] = r.importances_mean[i]
    else: continue
        
importances

important_ngrams = list(importances.keys())
important_ngrams '''

In [None]:
important_ngrams = ['en suite',
 'en',
 'quiet cul de',
 'quiet cul',
 'sys yes',
 'sprinkler sys yes',
 'de',
 'cul de',
 'de lot',
 'cul de lot',
 'default',
 'pool',
 'main',
 'zilker',
 'steel',
 'cul',
 'natural light',
 'stainless steel',
 'large corner lot',
 'mid century',
 'award win',
 'barton',
 'washer dryer',
 'large corner',
 'natural',
 'floor',
 'wine',
 'heart',
 'award',
 'corner lot',
 'design',
 'corner',
 'washer',
 'view',
 'finish',
 'steel appliance',
 'swimming',
 'round rock',
 'garden tub separate',
 'hill country',
 'brand new',
 'easy access',
 'south',
 'open concept',
 'hill',
 'new',
 'energy efficient',
 'brand',
 'ton natural light',
 'distance',
 'lake',
 'hike bike',
 'walk distance',
 'country',
 'curb appeal',
 'year old',
 'level',
 'luxury',
 'dryer',
 'condo',
 'vault ceiling',
 'community',
 'chef kitchen',
 'appliance',
 'isd',
 'pane',
 'minute',
 'park pool',
 'curb',
 'space',
 'old',
 'formal',
 'living',
 'new construction',
 'vault',
 'walk closet',
 'floor plan',
 'classic',
 'lot natural light',
 'nest',
 'nest thermostat',
 'marble',
 'meadow',
 'recess lighting',
 'convenient',
 'courtyard',
 'tub separate',
 'green space',
 'main floor',
 'congress',
 'walkable',
 'flat',
 'sys',
 'bike trail',
 'main living',
 'double pane',
 'lot natural',
 'master suite',
 'century',
 'vinyl',
 'addition',
 'conveniently locate',
 'hardwood',
 'guest',
 'character',
 'miss',
 'garden tub separate shower',
 'formal living',
 'west',
 'swimming pool',
 'main level',
 'energy',
 'subway',
 'medium',
 'efficient',
 'sport',
 'ton natural',
 'ut',
 'living area',
 'access mopac',
 'gas fireplace',
 'downstairs',
 'formal dining',
 'recess',
 'laminate',
 'mid',
 'steiner ranch',
 'stone fireplace',
 'creek greenbelt',
 'tennis court',
 'access',
 'build',
 'community pool',
 'interior exterior paint',
 'fresh',
 'fan',
 'outdoor kitchen',
 'airport',
 'sport court',
 'double',
 'tile',
 'investment',
 'walnut creek',
 'amenity',
 'elementary',
 'contemporary',
 'mckinney',
 'great',
 'elementary school',
 'ground',
 'garden tub',
 'ih35',
 'entertainer',
 'car',
 'detach',
 'parking',
 'south congress',
 'east',
 'indoor',
 'round',
 'replace',
 'like',
 'city',
 'adorable',
 'bamboo',
 'i35',
 'restaurant',
 'quartz',
 'mother',
 'soak tub',
 'office',
 'upgrade',
 'large',
 'yes',
 'quality',
 'set',
 'builder',
 'northwest',
 'green',
 'breakfast bar',
 'height',
 'tankless',
 'exterior paint',
 'north',
 'vinyl plank',
 'branch',
 'outdoor living',
 'paint',
 'large backyard',
 'easy access mopac',
 'light',
 'stone',
 'lot',
 'luxurious',
 'large walk closet',
 'gourmet kitchen',
 'warranty',
 'living kitchen',
 'tennis',
 'walnut',
 'travertine',
 'hoa',
 'subway tile',
 'concept']

In [None]:
# Run our linear regression again, using only the features recommended by our feature selector

train_word_vectors_refined = train_word_vectors[important_ngrams]

predictors_int = sm.add_constant(train_word_vectors_refined)
model = sm.OLS(y_train, predictors_int).fit()
model.summary()

## Process Test Set

### Categoricals

In [None]:
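# copy so that the column assignments below do not write into a slice of holdout
holdout_categoricals = holdout[categories].copy()

# binning our year built bins
# NOTE: qcut computes its quantile edges from the holdout set itself, so the edges can differ from those found on the train set
holdout_categoricals["year_block"], year_bins = pd.qcut(holdout_categoricals['yearBuilt'], q=num_bins, retbins=True, labels=labels)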

holdout_categoricals.drop('yearBuilt', axis=1, inplace=True)

# telling Pandas that these columns are categoricals
for item in holdout_categoricals.columns:
    holdout_categoricals[item] = holdout_categoricals[item].astype('category')

# make a processed bins file for use with linear regression
df_cats_high_one_hot_holdout = pd.get_dummies(holdout_categoricals[high_one_hot_cat], prefix=high_one_hot_cat, drop_first=True)
df_cats_low_one_hot_holdout = pd.get_dummies(holdout_categoricals[low_one_hot_cat], prefix=low_one_hot_cat, drop_first=True)
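
Calling `pd.get_dummies` separately on the train and holdout sets only yields matching columns when both sets contain the same category levels. The following is a minimal sketch, on made-up zip codes, of one way to force the holdout dummies onto the train column layout with `DataFrame.reindex`. The toy frames and variable names here are illustrative assumptions, not objects from this notebook.

```python
import pandas as pd

# toy stand-ins for the train and holdout categoricals (made-up values)
train_cats = pd.DataFrame({'zipcode': ['78701', '78702', '78703']})
holdout_cats = pd.DataFrame({'zipcode': ['78702', '78704']})

# train dummies define the reference column layout
train_dummies = pd.get_dummies(train_cats['zipcode'], prefix='zipcode', drop_first=True)

# encode every holdout level, then let the train layout decide what is kept
holdout_dummies = pd.get_dummies(holdout_cats['zipcode'], prefix='zipcode')

# keep exactly the train columns, in train order:
# train levels missing from the holdout are filled with 0,
# and levels seen only in the holdout are dropped
holdout_aligned = holdout_dummies.reindex(columns=train_dummies.columns, fill_value=0)
print(holdout_aligned.astype(int))
```

A holdout row whose level was never seen in train ends up with all zeros, which the model treats the same as the dropped reference level.
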

### Continuous

In [None]:
# apply target encoding to test data, using train data to map

# create smooth additive encoded variables for zipcode, year built, and monthsold
holdout['zip_smooth'] = calc_smooth_mean(train_data, 'zipcode', 'price', zip_samples, holdout)
holdout['year_smooth'] = calc_smooth_mean(train_data, 'yearBuilt', 'price', 300, holdout)
holdout['month_smooth'] = calc_smooth_mean(train_data, 'latest_salemonth', 'price', month_samples, holdout)

# Create a wider lat and long zone to calculate an area mean
holdout['lat_zone'] = round(holdout['latitude'], 2)
holdout['long_zone'] = round(holdout['longitude'], 2)

# calculate smooth mean variables for lat and long, then create an interaction variable (their geometric mean) describing both together
holdout['lat_smooth'] = calc_smooth_mean(train_data, 'lat_zone', 'price', lat_samples, holdout)
holdout['long_smooth'] = calc_smooth_mean(train_data, 'long_zone', 'price', long_samples, holdout)
holdout['lat_long'] = round(np.sqrt(holdout['lat_smooth'] * holdout['long_smooth']), 0)
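
For readers without the earlier definition at hand, here is a minimal sketch of additive-smoothed target encoding learned on train data and mapped onto another frame. It assumes `calc_smooth_mean` follows the usual formula `(n * category_mean + m * global_mean) / (n + m)`; the function `smooth_mean_encode` and the toy data are hypothetical and may differ from the notebook's own helper.

```python
import pandas as pd

def smooth_mean_encode(train, by, on, m, apply_to):
    """Additive-smoothed target encoding (hypothetical helper).

    The per-category means are learned on `train` only and then
    mapped onto `apply_to`, so no target values from `apply_to` are used.
    """
    global_mean = train[on].mean()
    agg = train.groupby(by)[on].agg(['count', 'mean'])
    # categories with few rows are pulled toward the global mean
    smooth = (agg['count'] * agg['mean'] + m * global_mean) / (agg['count'] + m)
    # categories never seen in train map to NaN and need a fallback value
    return apply_to[by].map(smooth)

# made-up data, for illustration only
train = pd.DataFrame({'zipcode': [1, 1, 2], 'price': [100.0, 200.0, 400.0]})
test = pd.DataFrame({'zipcode': [1, 2, 3]})
encoded = smooth_mean_encode(train, 'zipcode', 'price', m=1, apply_to=test)
print(encoded)
```

The third test row has a category that train never saw, so it comes back as NaN. The same thing can happen to `year_smooth` for build years that appear only in the holdout set, which is why the cells that follow check for and fill missing values.
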

In [None]:
holdout['year_smooth'].isna().sum()

In [None]:
holdout.loc[holdout['year_smooth'].isna(), 'year_smooth'] = train_data['year_smooth'].mean()

In [None]:
holdout_continuous = holdout[continuous]

In [None]:
scaled_holdout_continuous = pd.DataFrame(scaler.transform(holdout_continuous),columns = holdout_continuous.columns)

# making our two continuous sets
df_cont_high_one_hot_holdout = scaled_holdout_continuous[high_one_hot_cont]
# copy so that adding the polynomial column later does not write into a slice
df_cont_low_one_hot_holdout = scaled_holdout_continuous[low_one_hot_cont].copy()

In [None]:
# adding polynomial features

month_poly = create_polynomial_array(df_cont_low_one_hot_holdout, 'month_smooth',2)

df_cont_low_one_hot_holdout['month1'] = month_poly[1]

### NLP

In [None]:
nlp_holdout = holdout[nlp]

In [None]:
x = v.transform(nlp_holdout['sentences'])

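# get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() is available from 1.0
holdout_word_vectors = pd.DataFrame(x.toarray(), columns=v.get_feature_names_out())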

holdout_word_vectors_refined = holdout_word_vectors[important_ngrams]

## Create Train/Test Final Set

In [None]:
# make our train sets for one-hot encoded and target-encoded categoricals
X_train_onehot = pd.concat([df_cont_high_one_hot, df_cats_high_one_hot, train_word_vectors_refined], axis=1)
X_train_encoded = pd.concat([df_cont_low_one_hot, df_cats_low_one_hot, train_word_vectors_refined], axis=1)

# make our test sets for one-hot encoded and target-encoded categoricals
X_test_onehot = pd.concat([df_cont_high_one_hot_holdout, df_cats_high_one_hot_holdout, holdout_word_vectors_refined], axis=1)
X_test_encoded = pd.concat([df_cont_low_one_hot_holdout, df_cats_low_one_hot_holdout, holdout_word_vectors_refined], axis=1)

# make our target variable train and test sets, after log transforming our target variable
target = 'price' # target variable
y = np.log(df[target]) # our log-transformed target variable

y_train, y_test = train_test_split(y, test_size=0.2, random_state=randomstate)
test_actual = np.exp(y_test)

y_train.reset_index(inplace=True, drop=True)
y_test.reset_index(inplace=True, drop=True)

### Data Sets Reference

Our final data sets include:

* X_train_onehot, X_test_onehot - train/test split predictors for one-hot sets
* X_train_encoded, X_test_encoded - train/test split predictors for encoded sets
* y_train, y_test - target values for all sets
* y - log transformed price
* test_actual - exponentiated y_test prices

# Model Explorations

We're going to evaluate a few different variations of our linear regression model, as well as a few more complex model types. In order to keep track of our results, we'll be making a dictionary to store our model accuracy results.

In [None]:
# prepare dictionary to store results
models = {}
models['Models'] = []
models['r2'] = []
models['mae'] = []
models['mae_500'] = []

values = pd.DataFrame({'actual':test_actual})
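
A dictionary of parallel lists like this converts directly into a comparison table once a few models have been recorded. The sketch below shows the idea with made-up numbers; the `record` helper and the model names are hypothetical and are not used elsewhere in the notebook.

```python
import pandas as pd

# same layout as the results dictionary: one list per column
results = {'Models': [], 'r2': [], 'mae': [], 'mae_500': []}

def record(results, name, r2, mae, mae_500):
    """Append one model's scores to every list so the lists stay aligned."""
    results['Models'].append(name)
    results['r2'].append(r2)
    results['mae'].append(mae)
    results['mae_500'].append(mae_500)

# made-up scores, for illustration only
record(results, 'model A', 0.70, 90000.0, 60000.0)
record(results, 'model B', 0.75, 80000.0, 55000.0)

# one row per model, best (lowest) MAE first
summary = pd.DataFrame(results).sort_values('mae').reset_index(drop=True)
print(summary)
```

Appending to all four lists in one place avoids the lists drifting out of step, which would make the `pd.DataFrame` call fail with a length mismatch.
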


## Picking our Base Features

We can use residual plots to judge whether a feature is worth adding to our model. If we regress our target on one predictor and then plot the residuals against a DIFFERENT predictor, any visible pattern suggests that the new feature explains variance the first predictor missed.

We're planning to one-hot encode our zip codes, but we'll start with our continuous median_zip variable for now.

We're going to add features in order of their correlation with price on our correlation heat map, so our base feature is zip code because it has the strongest correlation.

We regress our target on median zip, then plot our residuals against living area square footage.

In [None]:
x, y = np.array(df['median_zip']).reshape(-1,1), df['price']
z = np.array(df['livingAreaSqFt']).reshape(-1,1)

model = LinearRegression()
model.fit(x, y)

test_predictions = model.predict(x)

residuals = y - test_predictions

fig = plt.figure(figsize=(15,10))

# Add labels for x and y axes
plt.xlabel('Total Square Footage')
plt.ylabel('Residuals')

# Add a title for the plot
plt.title('Residuals vs Square Footage - Predictor is Median_Zip')


plt.scatter(z, residuals, label="sample");

We regress on median zip again, this time plotting the residuals against average school rating. The relationship may not be as strong as we expected.

In [None]:
x, y = np.array(df['median_zip']).reshape(-1,1), df['price']
z = np.array(df['avgSchoolRating']).reshape(-1,1)

model = LinearRegression()
model.fit(x, y)

test_predictions = model.predict(x)

residuals = y - test_predictions

fig = plt.figure(figsize=(15,10))

# Add labels for x and y axes
plt.xlabel('Average School Rating')
plt.ylabel('Residuals')

# Add a title for the plot
plt.title('Residuals vs Average School Rating - Predictor is Median_Zip')


plt.scatter(z, residuals, label="sample");

In [None]:
x, y = np.array(df['median_zip']).reshape(-1,1), df['price']
z = np.array(df['lotSizeSqFt']).reshape(-1,1)

model = LinearRegression()
model.fit(x, y)

test_predictions = model.predict(x)

residuals = y - test_predictions

fig = plt.figure(figsize=(15,10))

# Add labels for x and y axes
plt.xlabel('Lot Size Square Feet')
plt.ylabel('Residuals')

# Add a title for the plot
plt.title('Residuals vs Lot Size Square Feet - Predictor is Median_Zip')

plt.scatter(z, residuals, label="sample");

Interesting how past a certain lot size, larger rounded numbers are used instead of specific numbers. There does seem to be a relationship here.

In [None]:
x, y = np.array(df['median_zip']).reshape(-1,1), df['price']
z = np.array(df['numOfBedrooms']).reshape(-1,1)

model = LinearRegression()
model.fit(x, y)

test_predictions = model.predict(x)

residuals = y - test_predictions

fig = plt.figure(figsize=(15,10))

# Add labels for x and y axes
plt.xlabel('Number of Bedrooms')
plt.ylabel('Residuals')

# Add a title for the plot
plt.title('Residuals vs Number of Bedrooms - Predictor is Median_Zip')

plt.scatter(z, residuals, label="sample");

If we can see a pattern when we plot residuals vs a different predictor, it can tell us if a feature might add value to our model.
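
The visual check can be paired with a quick numeric one. The sketch below, on synthetic data, fits the target on a single base predictor and reports how strongly the residuals correlate with a candidate feature. The helper `residuals_vs_candidate` is hypothetical, and a Pearson correlation only detects linear patterns, so it complements the plots rather than replacing them.

```python
import numpy as np

def residuals_vs_candidate(base, candidate, target):
    """Fit target ~ base by least squares, then return the residuals and
    their Pearson correlation with a candidate predictor."""
    base = np.asarray(base, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    target = np.asarray(target, dtype=float)
    # single-predictor linear fit: target ~ slope * base + intercept
    slope, intercept = np.polyfit(base, target, deg=1)
    residuals = target - (slope * base + intercept)
    corr = np.corrcoef(candidate, residuals)[0, 1]
    return residuals, corr

# synthetic data: the target depends on both the base and the candidate
rng = np.random.default_rng(0)
base = rng.normal(size=200)
useful = rng.normal(size=200)
target = 2.0 * base + 3.0 * useful + rng.normal(scale=0.1, size=200)

_, corr_useful = residuals_vs_candidate(base, useful, target)
print(round(corr_useful, 3))
```

A correlation far from zero says the candidate explains part of what the base predictor left over; a value near zero means at most that no linear pattern remains.
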

## Linear Regressions

### Basic Model Top Features Only

We're going to build our baseline model using only the top four features:

* zipcode
* avgSchoolRating
* livingAreaSqFt
* numOfBathrooms

We identified these top features from our correlation heat map.

#### One-Hot Encoded Categoricals

In [None]:
# put together our basic feature set and preprocess

# one-hot encode categorical
base_cat = pd.DataFrame()
base_cat['zipcode'] = df['zipcode']
base_cat['zipcode'] = base_cat['zipcode'].astype('category')
base_cat_processed = pd.get_dummies(base_cat['zipcode'], prefix='zipcode', drop_first=True)
base_cat_processed.reset_index(inplace=True)
base_cat_processed.drop('index', axis=1, inplace=True)

# standard scale our continuous features (the log transform below is disabled)
base_cont = df[['avgSchoolRating', 'livingAreaSqFt', 'numOfBathrooms']]
#base_cont = np.log(base_cont)

scaler = StandardScaler()
base_cont_processed = pd.DataFrame(scaler.fit_transform(base_cont),columns = base_cont.columns)

#join cat and cont into predictor data frame
#x_base_set = pd.concat([base_cont_processed, base_cat_processed], axis=1)
x_base_set = base_cont_processed.join([base_cat_processed], how='inner') 

# train/test split
x_base_train, x_base_test = train_test_split(x_base_set, test_size=0.2, random_state=randomstate)

In [None]:
# run model for R^2 score

model = LinearRegression()
model.fit(x_base_train, y_train)
cv_5 = cross_val_score(model, x_base_train, y_train, cv=5)
r2 = cv_5.mean()
r2

In [None]:
# apply our model to our test set and get predicted values
test_predictions = model.predict(x_base_test)
test_predictions

# reverse log transform our predicted values
test_predictions_unscaled = np.exp(test_predictions).astype(int)
test_predictions_unscaled = test_predictions_unscaled.flatten()

# get residuals
residuals = test_actual - test_predictions_unscaled

fig = plt.figure(figsize=(20,15))
plt.scatter(test_predictions_unscaled, residuals);
# Residuals plot

In [None]:
# Calculate our mean absolute error

mae = round(mean_absolute_error(test_actual, test_predictions_unscaled), 2)
mae

In [None]:
# MAE for properties UNDER 500k

under500 = pd.DataFrame(list(zip(test_actual, test_predictions_unscaled)), columns =['actual', 'predictions'])
under500 = under500[under500['actual'] < 500000]
mae500 = round(mean_absolute_error(under500['actual'], under500['predictions']), 2)
mae500
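
The same two error figures are computed for every model in this section, so the calculation could be wrapped in a small helper. The sketch below is one hedged way to do that with NumPy; the name `mae_report` and the sample prices are made up for illustration.

```python
import numpy as np

def mae_report(actual, predicted, cap=500_000):
    """Return (MAE over all rows, MAE over rows whose actual price is below cap)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    abs_err = np.abs(actual - predicted)
    # select rows by the ACTUAL price, as in the cell above
    under_cap = actual < cap
    mae_all = round(float(abs_err.mean()), 2)
    mae_under = round(float(abs_err[under_cap].mean()), 2)
    return mae_all, mae_under

# made-up prices in dollars, for illustration only
actual = [300_000, 450_000, 900_000]
predicted = [320_000, 400_000, 700_000]
mae_all, mae_under = mae_report(actual, predicted)
print(mae_all, mae_under)
```

If no row falls under the cap, the second value is NaN, so a caller working with a small sample may want to check for that case.
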

In [None]:
plt.figure(figsize=(10,5))
sns.distplot(np.exp(y_test), hist=True, kde=False)
sns.distplot(test_predictions_unscaled, hist=True, kde=False)
plt.title("Predictions vs Actual")
plt.legend(labels=['Actual Values of Price', 'Predicted Values of Price'])
plt.xlim(0,1500000);

In [None]:
# append our results to our lists

models['Models'].append('Bare Bones Features LR - One-Hot Zip')
models['r2'].append(r2)
models['mae'].append(mae)
models['mae_500'].append(mae500)
values['lin_pred'] = test_predictions_unscaled

Our baseline model has an R^2 of 70% on only a few features. Our MAE is pretty high. We will see if we can improve on that with some other feature selection methods, and even some other model types.

#### Target Encoded Categoricals

In [None]:
# put together our basic feature set and preprocess

# standard scale our continuous features, including the target-encoded zip
base_cont = train_data[['avgSchoolRating', 'livingAreaSqFt', 'numOfBathrooms', 'zip_smooth']]
scaler = StandardScaler()
x_base_train = pd.DataFrame(scaler.fit_transform(base_cont),columns = base_cont.columns)


# scale the holdout set with the scaler fit on the train set, rather than re-fitting on the holdout
test_cont = holdout[['avgSchoolRating', 'livingAreaSqFt', 'numOfBathrooms', 'zip_smooth']]
x_base_test = pd.DataFrame(scaler.transform(test_cont),columns = test_cont.columns)

In [None]:
# run model for R^2 score

model = LinearRegression()
model.fit(x_base_train, y_train)
cv_5 = cross_val_score(model, x_base_train, y_train, cv=5)
r2 = cv_5.mean()
r2

Our R-squared of 60.5% is much lower than when we one-hot encoded our zip codes as categoricals.

In [None]:
# apply our model to our test set and get predicted values
test_predictions = model.predict(x_base_test)
test_predictions

# reverse log transform our predicted values
test_predictions_unscaled = np.exp(test_predictions).flatten().astype(int)

# get residuals
residuals = test_actual - test_predictions_unscaled

fig = plt.figure(figsize=(20,15))
plt.scatter(test_predictions_unscaled, residuals);
# Residuals plot

In [None]:
# Calculate our mean absolute error

mae = round(mean_absolute_error(test_actual, test_predictions_unscaled), 2)
mae

In [None]:
# MAE for properties UNDER 500k

under500 = pd.DataFrame(list(zip(test_actual, test_predictions_unscaled)), columns =['actual', 'predictions'])
under500 = under500[under500['actual'] < 500000]
mae500 = round(mean_absolute_error(under500['actual'], under500['predictions']), 2)
mae500

In [None]:
# append our results to our lists

models['Models'].append('Bare Bones Features LR - Encoded Zip')
models['r2'].append(r2)
models['mae'].append(mae)
models['mae_500'].append(mae500)
values['lin_pred'] = test_predictions_unscaled

### Linear Regression Model - ALL Features

#### One Hot Set

Run a base model with no cross-validation or specific feature selection with ALL possible features. We're going to use our one-hot encoded set which performed better in our first test.

In [None]:
predictors_train = sm.add_constant(X_train_onehot)
model = sm.OLS(y_train, predictors_train).fit()
model.summary()

There is multicollinearity somewhere in our feature set. Let's check.

The correlations it picked up are circumstantial.

A good number of features in this model have a p-value over .05. For those features, a coefficient at least as large as the estimated one would appear more than 5% of the time from sampling noise alone if the feature truly had no effect, so we cannot distinguish their effect from zero. Many of our other features have a very low p-value, which is strong evidence that their coefficients are not zero.
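
One common way to run a multicollinearity check like the one mentioned above is the variance inflation factor (VIF): regress each feature on all of the others and compute `1 / (1 - R^2)`. The sketch below is a NumPy-only illustration on synthetic data, not the check used in this notebook. The function name `vif` is hypothetical, and the often-quoted cutoff of 10 is a rule of thumb rather than a fixed rule.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of a 2-D array."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # regress column j on the remaining columns plus an intercept
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        out[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out

# synthetic data: column c is almost a copy of column a, column b is independent
rng = np.random.default_rng(1)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.05 * rng.normal(size=500)
vifs = vif(np.column_stack([a, b, c]))
print(np.round(vifs, 1))
```

Here the first and third columns get large values because each is nearly determined by the other, while the independent column stays close to 1.
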

Now we perform cross-validation with our base model over 5 splits and get our mean R^2.

In [None]:
model = LinearRegression()
model.fit(X_train_onehot, y_train)
cv_5 = cross_val_score(model, X_train_onehot, y_train, cv=5)
r2 = cv_5.mean()
r2

In [None]:
# How many predictors are in our base model?
print("{} predictors used for this model".format(X_train_onehot.shape[1]))

In [None]:
# apply our model to our test set and get predicted values
test_predictions = model.predict(X_test_onehot)
test_predictions

# reverse log transform our predicted values
test_predictions_unscaled = np.exp(test_predictions)
test_predictions_unscaled = test_predictions_unscaled.flatten().astype(int)

# get residuals
residuals = test_actual - test_predictions_unscaled

fig = plt.figure(figsize=(20,15))
plt.scatter(test_predictions_unscaled, residuals);

# Residuals plot

In [None]:
# Calculate our mean absolute error

mae = round(mean_absolute_error(test_actual, test_predictions_unscaled), 2)
mae

In [None]:
# MAE for properties UNDER 500k

under500 = pd.DataFrame(list(zip(test_actual, test_predictions_unscaled)), columns =['actual', 'predictions'])
under500 = under500[under500['actual'] < 500000]
mae500 = round(mean_absolute_error(under500['actual'], under500['predictions']), 2)
mae500

In [None]:
# append our results to our lists

models['Models'].append('Basic LR - One-Hot')
models['r2'].append(r2)
models['mae'].append(mae)
models['mae_500'].append(mae500)
values['lin_pred'] = test_predictions_unscaled
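The same evaluation steps are repeated for every model in this notebook: reverse the log transform, compute the overall MAE, then compute the MAE for properties under 500k. As a sketch, they could be wrapped in one helper. The function name `evaluate_predictions` and the toy prices are hypothetical, and unlike the cells above this version does not truncate predictions to integers before scoring.

```python
import numpy as np

def evaluate_predictions(actual, log_predictions, cap=500_000):
    """Return (mae, mae_under_cap) in dollars for log-scale predictions."""
    actual = np.asarray(actual, dtype=float).ravel()
    # reverse the log transform to get back to dollars
    predicted = np.exp(np.asarray(log_predictions, dtype=float)).ravel()
    abs_err = np.abs(actual - predicted)
    # restrict the second metric to properties whose actual price is under the cap
    under_cap = actual < cap
    mae = round(float(abs_err.mean()), 2)
    mae_under = round(float(abs_err[under_cap].mean()), 2)
    return mae, mae_under

# Toy check with made-up prices
actual = [300_000, 450_000, 900_000]
log_preds = np.log([310_000, 430_000, 800_000])
mae, mae_under = evaluate_predictions(actual, log_preds)
print(mae, mae_under)
```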

##### Study Residuals

In [None]:
# We need our statsmodels model again to plot residuals
predictors_train = sm.add_constant(X_train_onehot)
model = sm.OLS(y_train, predictors_train).fit()

In [None]:
fig = plt.figure(figsize=(15,8))
fig = sm.graphics.plot_regress_exog(model, "livingAreaSqFt", fig=fig)
plt.show()

In [None]:
fig = plt.figure(figsize=(15,8))
fig = sm.graphics.plot_regress_exog(model, "avgSchoolRating", fig=fig)
plt.show()

In [None]:
fig = plt.figure(figsize=(15,8))
fig = sm.graphics.plot_regress_exog(model, "lotSizeSqFt", fig=fig)
plt.show()

In [None]:
fig = plt.figure(figsize=(15,8))
fig = sm.graphics.plot_regress_exog(model, "numOfBedrooms", fig=fig)
plt.show()

#### Target Encoded Categoricals

Run a base model with no cross-validation or specific feature selection with ALL possible features. We're using our target categorical encoded set which performed worse in our first test.

In [None]:
predictors_train = sm.add_constant(X_train_encoded)
model = sm.OLS(y_train, predictors_train).fit()
model.summary()

In [None]:
model = LinearRegression()
model.fit(X_train_encoded, y_train)
cv_5 = cross_val_score(model, X_train_encoded, y_train, cv=5)
r2 = cv_5.mean()
r2

In [None]:
# How many predictors are in our base model?
print("{} predictors used for this model".format(X_train_encoded.shape[1]))

In [None]:
# apply our model to our test set and get predicted values
test_predictions = model.predict(X_test_encoded)
test_predictions

# reverse log transform our predicted values
test_predictions_unscaled = np.exp(test_predictions)
test_predictions_unscaled = test_predictions_unscaled.flatten().astype(int)

# get residuals
residuals = test_actual - test_predictions_unscaled

fig = plt.figure(figsize=(20,15))
plt.scatter(test_predictions_unscaled, residuals);

# Residuals plot

In [None]:
# Calculate our mean absolute error

mae = round(mean_absolute_error(test_actual, test_predictions_unscaled), 2)
mae

In [None]:
# MAE for properties UNDER 500k

under500 = pd.DataFrame(list(zip(test_actual, test_predictions_unscaled)), columns =['actual', 'predictions'])
under500 = under500[under500['actual'] < 500000]
mae500 = round(mean_absolute_error(under500['actual'], under500['predictions']), 2)
mae500

In [None]:
# append our results to our lists

models['Models'].append('Basic LR - Encoded')
models['r2'].append(r2)
models['mae'].append(mae)
models['mae_500'].append(mae500)
values['lin_pred'] = test_predictions_unscaled

## Linear Regression - Feature Selectors

Feature selectors are different methods to help us pick which features we want to use in our model. In our example above, where we used ALL predictors in our linear regression, several of our features had a p-value over .05, which means the data do not give us enough evidence, at the 5% level, to conclude that those features' coefficients differ from zero. We want features whose p-value is below a threshold that we specify, so that we are reasonably confident each feature is contributing to the model rather than fitting random noise.

### Permutation Importance

In [None]:
model = LinearRegression()
model.fit(X_train_onehot, y_train)

from sklearn.inspection import permutation_importance
r = permutation_importance(model, X_train_onehot, y_train,
                           n_repeats=15,
                           random_state=0,
                           n_jobs=-1)
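To show what permutation importance measures, here is a hand-rolled sketch: shuffle one column at a time and record how much the model's R^2 drops. This is an illustration under simplifying assumptions, not sklearn's implementation; `permutation_importance_sketch` and the toy data are hypothetical names and values.

```python
import numpy as np

def permutation_importance_sketch(predict, X, y, n_repeats=15, seed=0):
    """Mean drop in R^2 when each column of X is shuffled in turn."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    def r2(y_true, y_pred):
        ss_res = np.sum((y_true - y_pred) ** 2)
        ss_tot = np.sum((y_true - y_true.mean()) ** 2)
        return 1.0 - ss_res / ss_tot

    baseline = r2(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # break the link between column j and the target
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - r2(y, predict(X_perm)))
        importances[j] = np.mean(drops)
    return importances

# Toy data: y depends strongly on column 0, weakly on column 1, not on column 2
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=400)

# Ordinary least squares fit with an intercept
coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(len(X)), X]), y, rcond=None)
predict = lambda A: coef[0] + A @ coef[1:]

imp = permutation_importance_sketch(predict, X, y)
print(imp.round(3))
```

Note that the cell above scores importance on the training data; scoring on held-out data instead would measure how much each feature matters for generalization.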

In [None]:
for i in r.importances_mean.argsort()[::-1]:
    if r.importances_mean[i] >= 0.001:
        print(f"{X_train_onehot.columns[i]:<8} "
            f"\t\tImportance: {r.importances_mean[i]:.3f} ")

In [None]:
importances = {}

# keep every feature whose mean importance is at least 0.001
for i in r.importances_mean.argsort()[::-1]:
    if r.importances_mean[i] >= 0.001:
        importances[X_train_onehot.columns[i]] = r.importances_mean[i]

importances

important_features_again = list(importances.keys())
important_features_again

In [None]:
permutation_x_train = X_train_onehot[important_features_again]
permutation_x_test = X_test_onehot[important_features_again]

model = LinearRegression()
model.fit(permutation_x_train, y_train)
cv_5 = cross_val_score(model, permutation_x_train, y_train, cv=5)
r2 = cv_5.mean()
r2

In [None]:
# apply our model to our test set and get predicted values
test_predictions = model.predict(permutation_x_test)
test_predictions

# reverse log transform our predicted values
test_predictions_unscaled = np.exp(test_predictions)
test_predictions_unscaled = test_predictions_unscaled.flatten().astype(int)

# get residuals
residuals = test_actual - test_predictions_unscaled

fig = plt.figure(figsize=(20,15))
plt.scatter(test_predictions_unscaled, residuals);

# Residuals plot

In [None]:
# Calculate our mean absolute error

mae = round(mean_absolute_error(test_actual, test_predictions_unscaled), 2)
mae

In [None]:
# MAE for properties UNDER 500k

under500 = pd.DataFrame(list(zip(test_actual, test_predictions_unscaled)), columns =['actual', 'predictions'])
under500 = under500[under500['actual'] < 500000]
mae500 = round(mean_absolute_error(under500['actual'], under500['predictions']), 2)
mae500

In [None]:
# append our results to our lists

models['Models'].append('LR - Permutation Importance')
models['r2'].append(r2)
models['mae'].append(mae)
models['mae_500'].append(mae500)
values['lin_pred'] = test_predictions_unscaled

### Forward-Backward Selector

First we'll try a simple forward-backward feature selection model based on p-value, using a statsmodels OLS linear regression model. This selector starts with zero features, internally runs the model for each feature individually, and adds the feature with the lowest p-value to its list to include. It then runs the model again with that feature included and tries adding each remaining feature individually. At each step it either adds the next best feature under the threshold or removes an existing feature whose p-value has risen above the threshold. This process iterates until every feature in the model is under the p-value threshold and no remaining feature qualifies to be added.

This selector takes quite some time to run. The feature list from a previous run is kept as a commented-out `result` assignment at the end of the cell below, so it can be restored without re-running the selection.
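The procedure described above can be sketched in plain NumPy. This is not the notebook's `stepwise_selection` function, which is defined elsewhere; `ols_pvalues`, `stepwise_sketch`, and both thresholds are hypothetical. The p-values use a normal approximation to the t distribution, which is only reasonable for large samples.

```python
import math
import numpy as np

def ols_pvalues(X, y):
    """Two-sided p-values for OLS slopes (an intercept is fitted, then dropped).

    Assumption: normal approximation to the t distribution (large samples).
    """
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (A.shape[0] - A.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.pinv(A.T @ A)))
    z = np.abs(beta / se)
    return np.array([math.erfc(v / math.sqrt(2)) for v in z])[1:]

def stepwise_sketch(X, y, threshold_in=0.01, threshold_out=0.05):
    """Return the column indices kept by forward-backward selection."""
    included = []
    while True:
        changed = False
        # Forward step: try each excluded column, add the most significant one
        excluded = [j for j in range(X.shape[1]) if j not in included]
        best_p, best_j = 1.0, None
        for j in excluded:
            p = ols_pvalues(X[:, included + [j]], y)[-1]
            if p < best_p:
                best_p, best_j = p, j
        if best_j is not None and best_p < threshold_in:
            included.append(best_j)
            changed = True
        # Backward step: drop the least significant column if it is too weak
        if included:
            pvals = ols_pvalues(X[:, included], y)
            worst = int(np.argmax(pvals))
            if pvals[worst] > threshold_out:
                included.pop(worst)
                changed = True
        if not changed:
            return included

# Toy data: only columns 0 and 3 drive the target
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.5, size=300)
selected = stepwise_sketch(X, y)
print(sorted(selected))
```

Keeping the entry threshold stricter than the removal threshold stops a feature from being added and removed in an endless loop.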

In [None]:

result = stepwise_selection(X_train_onehot, y_train, verbose=True)

#print('resulting features:', result)


#result = ['livingAreaSqFt', 'hasAssociation_1', 'zipcode_78704', 'avgSchoolRating', 'zipcode_78703', 'avgSchoolSize', 'numPriceChanges', 'zipcode_78731', 'propertyTaxRate', 'zipcode_78753', 'zipcode_78737', 'zipcode_78757', 'zipcode_78746', 'zipcode_78756', 'zipcode_78702', 'zipcode_78735', 'modern', 'zipcode_78732', 'zipcode_78747', 'zipcode_78723', 'zipcode_78751', 'lotSizeSqFt', 'year_block_30', 'year_block_29', 'year_block_28', 'zipcode_78744', 'zipcode_78705', 'zipcode_78722', 'zipcode_78759', 'patioporch_1', 'condo', 'great', 'numOfPhotos', 'zipcode_78726', 'design', 'latest_salemonth_12', 'latest_salemonth_11', 'latest_salemonth_10', 'year_block_27', 'latest_salemonth_9', 'numOfStories', 'zipcode_78750', 'avgSchoolDistance', 'zipcode_78701', 'zipcode_78728', 'latest_salemonth_8', 'latest_salemonth_7', 'zipcode_78745', 'hasSpa_1', 'year_block_26', 'zipcode_78730', 'washer', 'zipcode_78733', 'latest_salemonth_6', 'price', 'convenient', 'investment', 'walkable', 'garageSpaces', 'zipcode_78721', 'zipcode_78752', 'zipcode_78741', 'window', 'outdoor', 'congress', 'default', 'accessibility_1', 'spa', 'zipcode_78749', 'zipcode_78729', 'zipcode_78717', 'year_block_13', 'latest_salemonth_5', 'hill', 'luxury', 'wine', 'classic', 'waterfall', 'unique', 'tankless', 'floor', 'view', 'detach', 'travis', 'zilker', 'new', 'laminate', 'hasHeating_1', 'zipcode_78739', 'zipcode_78758', 'zipcode_78727', 'zipcode_78748', 'zipcode_78736', 'zipcode_78724', 'year_block_24', 'zipcode_78660', 'year_block_25', 'indoor', 'community', 'medium', 'fan']

In [None]:
# Run our linear regression again, using only the features recommended by our feature selector

X_train_refined = X_train_onehot[result]
X_test_refined = X_test_onehot[result]

predictors_int = sm.add_constant(X_train_refined)
model = sm.OLS(y_train, predictors_int).fit()
model.summary()

In [None]:
print("{} predictors used".format(len(result)))

In [None]:
model = LinearRegression()
model.fit(X_train_refined, y_train)
cv = RepeatedKFold(n_splits=5, n_repeats=2, random_state=1)

cv_5 = cross_val_score(model, X_train_refined, y_train, cv=cv)
r2 = cv_5.mean()
r2

In [None]:
# apply our model to our test set and get predicted values
test_predictions_refined = model.predict(X_test_refined)

# reverse log transform our predicted values
test_predictions_refined_unscaled = np.exp(test_predictions_refined)
test_predictions_refined_unscaled=test_predictions_refined_unscaled.flatten()

# get residuals
residuals = test_actual - test_predictions_refined_unscaled

# plot residuals
fig = plt.figure(figsize=(20,15))
plt.scatter(test_predictions_refined_unscaled, residuals)

In [None]:
# get mean absolute error
mae = round(mean_absolute_error(test_actual, test_predictions_refined_unscaled), 2)
mae

In [None]:
# MAE for properties UNDER 500k

under500 = pd.DataFrame(list(zip(test_actual, test_predictions_refined_unscaled)), columns =['actual', 'predictions'])
under500 = under500[under500['actual'] < 500000]
mae500 = round(mean_absolute_error(under500['actual'], under500['predictions']), 2)
mae500

In [None]:
# append our results to our lists

models['Models'].append('Forw-Back Selector')
models['r2'].append(r2)
models['mae'].append(mae)
models['mae_500'].append(mae500)
values['forw_back_pred'] = test_predictions_refined_unscaled

### Recursive Feature Elimination with Cross Validation - Linear Regression

RFECV works in the opposite direction from the forward-backward selector. It starts the model with all features in use, removes the weakest one, and iterates, using integrated cross-validation to find the feature set with the best cross-validated score. We score on mean absolute error.
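As a rough illustration of the idea, the sketch below drops the feature with the smallest absolute coefficient each round and keeps the subset with the lowest cross-validated MAE. It is a simplification, not sklearn's `RFECV`: it ranks features on the full training data rather than inside each fold, and it assumes the features are on comparable scales. All function names and the toy data are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients, intercept first."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def cv_mae(X, y, n_splits=5, seed=1):
    """Mean absolute error of OLS averaged over shuffled k-fold splits."""
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, n_splits)
    errors = []
    for k in range(n_splits):
        test = folds[k]
        train = np.concatenate([folds[m] for m in range(n_splits) if m != k])
        beta = fit_ols(X[train], y[train])
        pred = beta[0] + X[test] @ beta[1:]
        errors.append(np.mean(np.abs(y[test] - pred)))
    return float(np.mean(errors))

def rfe_cv_sketch(X, y):
    """Eliminate the weakest feature each round; return the best subset and its score."""
    remaining = list(range(X.shape[1]))
    best_score, best_subset = np.inf, list(remaining)
    while remaining:
        score = cv_mae(X[:, remaining], y)
        if score <= best_score:  # ties favor the smaller subset
            best_score, best_subset = score, list(remaining)
        if len(remaining) == 1:
            break
        beta = fit_ols(X[:, remaining], y)
        weakest = int(np.argmin(np.abs(beta[1:])))
        remaining.pop(weakest)
    return best_subset, best_score

# Toy data: only columns 1 and 4 drive the target
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 1] - 1.0 * X[:, 4] + rng.normal(scale=0.3, size=200)
subset, score = rfe_cv_sketch(X, y)
print(subset, round(score, 3))
```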

In [None]:
# Using sklearn RFECV to perform integrated CV while picking the number of features
# picks the number of features itself

model = LinearRegression(n_jobs=-4)
cv = RepeatedKFold(n_splits=5, n_repeats=2, random_state=1)

rfecv = RFECV(estimator=model, step=1, cv=cv, scoring='neg_mean_absolute_error', n_jobs=-4)

# fit model to train set
rfecv.fit(X_train_onehot, y_train)

# print optimal number of features
print('Optimal number of features: {}'.format(rfecv.n_features_))

In [None]:
dset = pd.DataFrame()
dset['attr'] = X_train_onehot.columns
dset['used'] = rfecv.support_

# make a list of the features used in the rfecv
rfecv_result = list(dset[(dset['used'] == True)]['attr'])

# Show the features that RFECV did not use
dset[dset['used']==False]

In [None]:
# Run our linear regression again in statsmodels, using the features recommended by our feature selector

X_train_rfecv = X_train_onehot[rfecv_result]
X_test_rfecv = X_test_onehot[rfecv_result]

predictors_int = sm.add_constant(X_train_rfecv)
model = sm.OLS(y_train, predictors_int).fit()
model.summary()

In [None]:
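# getting the r2 score of our best feature set
# note: this is the in-sample R^2 reported by statsmodels, not a cross-validated mean like the other models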
r2 = model.rsquared
r2

RFECV still includes features with a p-value over .05. Overall though, accuracy is higher than other feature selection methods.


In [None]:
plt.figure(figsize=(16, 9))
plt.title('Recursive Feature Elimination with Cross-Validation', fontsize=18, fontweight='bold', pad=20)
plt.xlabel('Number of features selected', fontsize=14, labelpad=20)
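plt.ylabel('Negative MAE (log price)', fontsize=14, labelpad=20)
# grid_scores_ was removed in scikit-learn 1.2; newer versions expose cv_results_ instead
rfecv_scores = rfecv.cv_results_['mean_test_score'] if hasattr(rfecv, 'cv_results_') else rfecv.grid_scores_
plt.plot(range(1, len(rfecv_scores) + 1), rfecv_scores, color='#303F9F', linewidth=3)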

plt.show()

In [None]:
# predict on new data
rfecv_predictions = rfecv.predict(X_test_onehot)

rfecv_predictions_unscaled = np.exp(rfecv_predictions)
rfecv_predictions_unscaled = rfecv_predictions_unscaled.flatten().astype(int)

# get residuals
residuals = test_actual - rfecv_predictions_unscaled

#plot residuals 
fig = plt.figure(figsize=(20,15))
plt.scatter(rfecv_predictions_unscaled, residuals);

In [None]:
# get mean absolute error

mae = round(mean_absolute_error(test_actual, rfecv_predictions_unscaled), 2)
mae

In [None]:
# MAE for properties UNDER 500k

under500 = pd.DataFrame(list(zip(test_actual, rfecv_predictions_unscaled)), columns =['actual', 'predictions'])
under500 = under500[under500['actual'] < 500000]
mae500 = round(mean_absolute_error(under500['actual'], under500['predictions']), 2)
mae500

In [None]:
plt.figure(figsize=(10,5))
sns.distplot(np.exp(y_test), hist=True, kde=False)
sns.distplot(rfecv_predictions_unscaled, hist=True, kde=False)
plt.legend(labels=['Actual Values of Price', 'Predicted Values of Price'])
plt.xlim(0,1500000);

In [None]:
models['Models'].append('RFECV')
models['r2'].append(r2)
models['mae'].append(mae)
models['mae_500'].append(mae500)
values['rfecv_pred'] = rfecv_predictions_unscaled

## K-Nearest Neighbors Model

K-Nearest Neighbors is more commonly used for classification. Its basic premise is to ask "what is this like?" when making a prediction, by looking at the training points that are closest to it in feature space. We can pick how many neighbors it assesses; for regression, the prediction is the average of those neighbors' target values. As we will see, it doesn't work very well for this type of application (or I've not tuned the hyperparameters properly and/or don't know how to use it well).

We're using our target encoded data set on this.
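The "what is this like" premise can be sketched in a few lines of NumPy: find the k closest training rows and average their targets. This is an illustration only, not sklearn's `KNeighborsRegressor`; `knn_predict` and the toy data are made up.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """Predict each query row as the mean target of its k nearest training rows."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    # Pairwise Euclidean distances, shape (n_query, n_train)
    dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    # Indices of the k closest training rows for every query row
    nearest = np.argsort(dists, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

# Toy data: one feature, target = 10 * feature
X_toy = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y_toy = 10.0 * X_toy.ravel()
knn_result = knn_predict(X_toy, y_toy, [[2.1], [10.9]], k=3)
print(knn_result)
```

Because the distance treats every feature equally, features on larger scales dominate the neighbor search, which is one reason standardized inputs matter for this model.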

In [None]:
mae_val = []  # to store mae values for different k

# checks mean absolute error scores on k from 1 to 25
# note: k is picked by its MAE on the test set here; picking it by
# cross-validation on the train set would avoid leaking test information
for K in range(1, 26):

    # set up the KNN regressor
    model = neighbors.KNeighborsRegressor(n_neighbors=K)

    model.fit(X_train_encoded, y_train)  # fit the model
    pred = model.predict(X_test_encoded)  # make prediction on test set
    error = mean_absolute_error(y_test, pred)  # calculate mae (on log price)
    mae_val.append(error)  # store mae values
    print('MAE value for k=', K, 'is:', error)

# gets optimal k-value based on score minimum
index_min = np.argmin(mae_val) + 1

# makes model and fits using optimal k
model = neighbors.KNeighborsRegressor(n_neighbors = index_min)
model.fit(X_train_encoded, y_train)  #fit the model

# Get R^2 with cv
scores = cross_val_score(model, X_train_encoded, y_train, scoring='r2', cv=5, n_jobs=-1, error_score='raise')
r2 = np.mean(scores)
r2

In [None]:
#make prediction on test set
pred_knn = model.predict(X_test_encoded)
pred_knn = np.exp(pred_knn).flatten().astype(int)

# get residuals
residuals = test_actual - pred_knn

# plot residuals
fig = plt.figure(figsize=(20,15))
plt.scatter(pred_knn, residuals)

In [None]:
# get mean absolute error

mae = round(mean_absolute_error(test_actual, pred_knn), 2)
mae

In [None]:
# MAE for properties UNDER 500k

under500 = pd.DataFrame(list(zip(test_actual, pred_knn)), columns =['actual', 'predictions'])
under500 = under500[under500['actual'] < 500000]
mae500 = round(mean_absolute_error(under500['actual'], under500['predictions']), 2)
mae500

In [None]:
plt.figure(figsize=(10,5))
sns.distplot(np.exp(y_test), hist=True, kde=False)
sns.distplot(pred_knn, hist=True, kde=False)
plt.legend(labels=['Actual Values of Price', 'Predicted Values of Price'])
plt.xlim(0,1500000);

In [None]:
models['Models'].append('KNN')
models['r2'].append(r2)
models['mae'].append(mae)
models['mae_500'].append(mae500)
values['knn_pred'] = pred_knn

## Support Vector Regression

Support vector regression is a form of regression that allows us to define the acceptable error in our model and then finds the line that best fits the data according to our specifications. This is really useful with something like housing price predictions, where we are ok with our prediction being within a certain margin of the actual price. SVR will attempt to get all of the predictions within that margin when possible. This will result in a fit line that is different than a linear regression would have produced, but it should result in a lower absolute error, which is a reasonable scoring metric for housing price predictions.

We're going to use sklearn's GridSearchCV to find the optimal hyperparameters to use with our SVR! Here are the parameters we are trying out:

* kernel: linear fits a straight-line (parametric) relationship, while rbf is non-parametric and can fit non-linear relationships. One of these should perform better. Our data is not totally normal, so it might be rbf.
* epsilon: How much error we're ok with accepting without assigning a penalty to the model
* C: The regularization parameter, which sets how strongly the model is penalized for points that fall outside our epsilon margin (a larger C means a stronger penalty)

Our epsilon needs to be in scale with our output variable, which is our log-transformed price, and the best C depends on that scale as well.
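To make the role of epsilon concrete, here is a small sketch of the epsilon-insensitive loss that SVR penalizes: errors inside the margin cost nothing, and errors outside it cost only the excess. The function name and the toy prices are hypothetical. Because our target is log price, an epsilon of .05 corresponds to roughly a 5% error in price.

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.05):
    """Zero inside the epsilon margin, linear in the excess error outside it."""
    err = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return np.maximum(0.0, err - epsilon)

# Made-up actual and predicted prices, compared on the log scale
y_true_log = np.log([300_000, 450_000, 900_000])
y_pred_log = np.log([310_000, 450_000, 700_000])
losses = epsilon_insensitive_loss(y_true_log, y_pred_log, epsilon=0.05)
print(losses.round(4))
```

The first two predictions fall inside the margin and contribute no loss; only the third, which is far too low, is penalized, and C scales how much that penalty weighs against keeping the model simple.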


In [None]:
# Parameter Tuning
'''
param_grid = {'C' : [.5, 1, 3, 5]            
              }

svr = SVR(epsilon=.05, kernel='linear')
grid_search = GridSearchCV(svr, param_grid, scoring='neg_mean_absolute_error', cv=5)

grid_search.fit(X_train_onehot, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

In [None]:
'''# Parameter Tuning

param_grid = {'epsilon' : [.05, .1, .5, 1]            
              }

svr = SVR(C=5, kernel='linear')
grid_search = GridSearchCV(svr, param_grid, scoring='neg_mean_absolute_error', cv=5)

grid_search.fit(X_train_onehot, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

The tuning cell below is commented out, like the ones above, so it produces no output when the notebook is run; its results from an earlier run are reproduced in the markdown that follows it.

In [None]:
'''# Parameter Tuning

param_grid = {'kernel' : ['linear', 'rbf', 'poly'],
              'gamma' : ['scale', 'auto']
              }

svr = SVR(C=5, epsilon=.05, tol=.01, verbose=True)
grid_search = GridSearchCV(svr, param_grid, verbose=10, scoring='neg_mean_absolute_error', cv=3, n_jobs=-1)

grid_search.fit(X_train_onehot, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

[LibSVM]Best parameters set found on train set: 

    {'gamma': 'auto', 'kernel': 'rbf'}

    Grid scores on train set:

    -0.146 (+/-0.003) for {'gamma': 'scale', 'kernel': 'linear'}
    -0.157 (+/-0.005) for {'gamma': 'scale', 'kernel': 'rbf'}
    -0.184 (+/-0.007) for {'gamma': 'scale', 'kernel': 'poly'}
    -0.146 (+/-0.003) for {'gamma': 'auto', 'kernel': 'linear'}
    -0.139 (+/-0.002) for {'gamma': 'auto', 'kernel': 'rbf'}
    -0.194 (+/-0.004) for {'gamma': 'auto', 'kernel': 'poly'}

In [None]:
# setting up estimator with our optimal parameters
params = {'kernel' : 'rbf', 'C' : 5, 'epsilon' : .05, 'gamma':'auto'}
svr = SVR(**params, verbose=True, tol=.008)

# fitting our estimator to train data
svr.fit(X_train_onehot, y_train)

# getting R^2 with cv
cv_5 = cross_val_score(svr, X_train_onehot, y_train, cv=5, n_jobs=-1)
r2 = cv_5.mean()
r2

In [None]:
'''# Parameter Tuning

param_grid = {'C' : [1,3,5],            
              }

svr = SVR(epsilon=.05, tol=.008, verbose=True, kernel='rbf', gamma = 'auto')
grid_search = GridSearchCV(svr, param_grid, verbose=10, scoring='neg_mean_absolute_error', cv=3, n_jobs=-1)

grid_search.fit(X_train_onehot, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

[LibSVM]Best parameters set found on train set: 
    
    {'C': 5}
    
    Grid scores on train set:
    
    -0.142 (+/-0.003) for {'C': 1}
    -0.139 (+/-0.002) for {'C': 3}
    -0.139 (+/-0.002) for {'C': 5}

In [None]:
# make new predictions on test
predictions_SVR = svr.predict(X_test_onehot)
predictions_SVR_unscaled = np.exp(predictions_SVR)

# get residuals
residuals = test_actual - predictions_SVR_unscaled

# plot residuals
fig = plt.figure(figsize=(20,15))
plt.scatter(predictions_SVR_unscaled, residuals)

In [None]:
# get mean absolute error

mae = round(mean_absolute_error(test_actual, predictions_SVR_unscaled), 2)
mae

In [None]:
# MAE for properties UNDER 500k

under500 = pd.DataFrame(list(zip(test_actual, predictions_SVR_unscaled)), columns =['actual', 'predictions'])
under500 = under500[under500['actual'] < 500000]
mae500 = round(mean_absolute_error(under500['actual'], under500['predictions']), 2)
mae500

In [None]:
# plot actual and predictions
plt.figure(figsize=(10,5))
sns.distplot(np.exp(y_test), hist=True, kde=False)
sns.distplot(predictions_SVR_unscaled, hist=True, kde=False)
plt.legend(labels=['Actual Values of Price', 'Predicted Values of Price'])
plt.xlim(0,1500000);

In [None]:
models['Models'].append('SVR')
models['r2'].append(r2)
models['mae'].append(mae)
models['mae_500'].append(mae500)
values['svr_pred'] = predictions_SVR_unscaled

## Redo Data for Decision-Based Regressors

We're now going to work with tree-based model types that are entirely different from linear regression.

There's conflicting information on whether we should use one-hot encoding or target encoding with these models. We'll solve this by trying both and figuring out what works best for our data set.

### Decision Tree Regressors - One Hot

We need to slightly redo our one-hot encodings to not drop the first entry. We'll also make year_built into total one-hot encodings rather than bins.

In [None]:
dummies_boost_cats_test = pd.get_dummies(holdout_categoricals[high_one_hot_cat], prefix=high_one_hot_cat, drop_first=False)
dummies_boost_cats_train = pd.get_dummies(df_categoricals[high_one_hot_cat], prefix=high_one_hot_cat, drop_first=False)

x_train_boost = pd.concat([df_cont_high_one_hot, dummies_boost_cats_train, train_word_vectors_refined], axis=1)
x_test_boost = pd.concat([df_cont_high_one_hot_holdout, dummies_boost_cats_test, holdout_word_vectors_refined], axis=1)

# redoing our y_train and y_test as non-log transformed
y = df[target] # our target variable

# creating our train/validation sets and our test sets
y_train, y_test = train_test_split(y, test_size=0.2, random_state=randomstate)

# reset indices to avoid index mismatches
y_train = pd.DataFrame(y_train).reset_index(drop=True)
y_test = pd.DataFrame(y_test).reset_index(drop=True)

#### Random Forest

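Random Forest is an ensemble of decision trees. It builds a large number of randomized decision trees, each trained on a bootstrap sample of the data and optionally a random subset of features at each split, and averages their results to make a prediction. Each tree is built independently of the others, without a directed or iterative plan.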

We're going to do extensive parameter tuning to find the best hyperparameters for our random forest, tuning only one or two hyperparameters at a time. When we find an optimal hyperparameter we will add it to our model and tune a different parameter. This iterative approach allows us to test parameters in less time than a large GridSearchCV, although a larger grid search would prove more exhaustive to explore parameter interactions.

Here are the parameters we are trying out:

* max_depth: The maximum depth of each tree, i.e. how many splits a sample can pass through before reaching a prediction. We don't know what is best here, so we are trying a wide range from 10 to 100 first and seeing what looks right.
* min_samples_split: The minimum number of samples required to split an internal node
* min_samples_leaf: The minimum number of samples required to be at a leaf node
* max_features: The number of features to consider when looking for the best split
* bootstrap: Whether bootstrap samples are used when building trees
* n_estimators: The number of trees in the forest
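The one-parameter-at-a-time tuning described above can be sketched as a greedy loop: search one parameter, lock in its best value, then move to the next. The names `tune_one_at_a_time` and `toy_score` are hypothetical; `toy_score` is a made-up stand-in for the cross-validated score that GridSearchCV would compute on a RandomForestRegressor.

```python
def tune_one_at_a_time(score_fn, defaults, search_order):
    """Greedy tuning: fix the best value of each parameter before moving on.

    score_fn     -- callable taking a params dict, returning a score (higher is better)
    defaults     -- starting params dict
    search_order -- list of (param_name, candidate_values) pairs
    """
    best_params = dict(defaults)
    for name, candidates in search_order:
        # score every candidate for this parameter with the others held fixed
        scored = [(score_fn({**best_params, name: value}), value) for value in candidates]
        best_score, best_value = max(scored, key=lambda pair: pair[0])
        best_params[name] = best_value
    return best_params

# Hypothetical score surface that peaks at max_depth=20, min_samples_split=2
def toy_score(params):
    return -abs(params["max_depth"] - 20) - 2 * abs(params["min_samples_split"] - 2)

best = tune_one_at_a_time(
    toy_score,
    defaults={"max_depth": 100, "min_samples_split": 8},
    search_order=[
        ("max_depth", [20, 40, 60, 80, 100]),
        ("min_samples_split", [2, 5, 8]),
    ],
)
print(best)
```

With these two grids the greedy loop scores 5 + 3 = 8 settings, where a full grid would score 5 x 3 = 15. The saving grows quickly with more parameters, at the cost of missing interactions between them.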

In [None]:
# visualize changes to model score as it is tried on different max depths from 10 to 100, to get a starting point for max depth

from sklearn.model_selection import validation_curve

def ValidationCurve(estimator, predictors, target, param_name, hyperparam):
    param_range = np.arange(10, 110, 10)

    # param_name and param_range must be passed by keyword in recent scikit-learn versions
    train_score, test_score = validation_curve(estimator, predictors, target,
                                               param_name=param_name, param_range=param_range,
                                               cv=5, scoring='r2', verbose=10, n_jobs=-4)
    Rsquared_train = train_score.mean(axis=1)
    Rsquared_test = test_score.mean(axis=1)

    plt.figure(figsize=(10,5))
    plt.plot(param_range, Rsquared_train, color='r', linestyle='-', marker='o', label='Training Set')
    plt.plot(param_range, Rsquared_test, color='b', linestyle='-', marker='x', label='Testing Set')
    plt.legend()
    plt.xlabel(hyperparam)
    plt.ylabel('R_squared')
    plt.title("R^2 for {} on Train/Test".format(hyperparam))

ValidationCurve(RandomForestRegressor(), x_train_boost, y_train, 'max_depth', 'Maximum Depth')

In [None]:
'''# Parameter Tuning

param_grid = {'max_depth' : [20, 40, 60, 80, 100],            
              }

model = RandomForestRegressor(criterion='mae', verbose=10, n_jobs=-4)

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-4)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

Best parameters set found on train set:

    {'max_depth': 20}

    Grid scores on train set:

    0.782 (+/-0.028) for {'max_depth': 20}
    0.782 (+/-0.027) for {'max_depth': 40}
    0.781 (+/-0.027) for {'max_depth': 60}
    0.781 (+/-0.025) for {'max_depth': 80}
    0.781 (+/-0.024) for {'max_depth': 100}

In [None]:
'''# Parameter Tuning

param_grid = {'max_depth' : [15, 20, 25],            
              }

model = RandomForestRegressor(criterion='mae', verbose=10)

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-7)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

Best parameters set found on train set: 

    {'max_depth': 20}

    Grid scores on train set:

    0.778 (+/-0.029) for {'max_depth': 15}
    0.782 (+/-0.028) for {'max_depth': 20}
    0.779 (+/-0.027) for {'max_depth': 25}

In [None]:
'''# Parameter Tuning

param_grid = {'min_samples_split': [2, 5, 8]            
              }

model = RandomForestRegressor(criterion='mae', verbose=10, max_depth=20, n_jobs=-4)

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-4)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

Best parameters set found on train set: 

    {'min_samples_split': 2}

    Grid scores on train set:

    0.783 (+/-0.028) for {'min_samples_split': 2}
    0.782 (+/-0.027) for {'min_samples_split': 5}
    0.781 (+/-0.025) for {'min_samples_split': 8}

In [None]:
'''#Parameter Tuning

param_grid = {'min_samples_split': [2, 3, 4]            
              }

model = RandomForestRegressor(criterion='mae', verbose=10, max_depth=20, n_jobs=-4)

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-4)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

Best parameters set found on train set: 

    {'min_samples_split': 2}

    Grid scores on train set:

    0.782 (+/-0.028) for {'min_samples_split': 2}
    0.780 (+/-0.024) for {'min_samples_split': 3}
    0.781 (+/-0.024) for {'min_samples_split': 4}

In [None]:
'''# Parameter Tuning

param_grid = {'min_samples_leaf': [1, 3],
              'max_features': ['auto', 'sqrt']
              }

model = RandomForestRegressor(criterion='mae', max_depth=20, min_samples_split=2, n_jobs=-4)

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-4)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

Best parameters set found on train set: 

    {'max_features': 'auto', 'min_samples_leaf': 1}

    Grid scores on train set:

    0.782 (+/-0.023) for {'max_features': 'auto', 'min_samples_leaf': 1}
    0.780 (+/-0.025) for {'max_features': 'auto', 'min_samples_leaf': 3}
    0.747 (+/-0.028) for {'max_features': 'sqrt', 'min_samples_leaf': 1}
    0.724 (+/-0.030) for {'max_features': 'sqrt', 'min_samples_leaf': 3}

In [None]:
'''# Parameter Tuning

param_grid = {
         'bootstrap': [True, False]
        }

model = RandomForestRegressor(criterion='mae', max_depth=20, min_samples_split=2, max_features='auto', min_samples_leaf=1, n_jobs=-4)

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-4)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

Best parameters set found on train set: 

    {'bootstrap': True}

    Grid scores on train set:

    0.781 (+/-0.025) for {'bootstrap': True}
    0.598 (+/-0.074) for {'bootstrap': False}

In [None]:
'''# Parameter Tuning

param_grid = {
 'n_estimators' : [100, 250, 500]
    }

model = RandomForestRegressor(criterion='mae', max_depth=20, min_samples_split=2, bootstrap=True, max_features='auto', min_samples_leaf=1, n_jobs=-4)

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-4)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

Best parameters set found on train set: 

    {'n_estimators': 250}

    Grid scores on train set:

    0.780 (+/-0.024) for {'n_estimators': 100}
    0.784 (+/-0.025) for {'n_estimators': 250}
    0.783 (+/-0.026) for {'n_estimators': 500}

In [None]:
# set our regressor with the optimal parameters. The n_estimators scores were all very close, so we go with the best-scoring value, 250.
randomforest = RandomForestRegressor(criterion='absolute_error',  # named 'mae' before scikit-learn 1.0
                                     max_depth=20, 
                                     min_samples_split=2, 
                                     max_features=1.0,  # use all features; what 'auto' meant for regressors
                                     min_samples_leaf=1, 
                                     bootstrap=True, 
                                     n_estimators=250, 
                                     n_jobs=-4)

# fit random forest regressor to train
randomforest.fit(x_train_boost, y_train)

# check cross_val R^2
cv = RepeatedKFold(n_splits=5, n_repeats=2, random_state=1)
cv_5 = cross_val_score(randomforest, x_train_boost, y_train, cv=cv, n_jobs=-4)
r2 = cv_5.mean()
r2

In [None]:
# apply our model to our test set and get predicted values
forest_predictions = randomforest.predict(x_test_boost)

# calculate residuals
residuals = test_actual - forest_predictions

# plot residuals
fig = plt.figure(figsize=(20,15))
plt.scatter(forest_predictions, residuals)

In [None]:
# Calculate our mean absolute error

mae = round(mean_absolute_error(test_actual, forest_predictions), 2)
mae

In [None]:
# MAE for properties UNDER 500k

under500 = pd.DataFrame(list(zip(test_actual, forest_predictions)), columns =['actual', 'predictions'])
under500 = under500[under500['actual'] < 500000]
mae500 = round(mean_absolute_error(under500['actual'], under500['predictions']), 2)
mae500

In [None]:
models['Models'].append('Random Forest - One Hot')
models['r2'].append(r2)
models['mae'].append(mae)
models['mae_500'].append(mae500)
values['forest_pred'] = forest_predictions

#### XGBoost

This model works very differently from linear regression. Gradient boosting builds an ensemble of decision trees, adding trees to the model one at a time and leaving the existing trees unchanged. Each new tree is fit to the residuals of the ensemble so far, so it tries to correct the errors left by the trees before it, and its contribution is scaled by the learning rate. Overall it is a much more opaque process than linear regression.
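
To make the "fit the next tree to the residuals" idea concrete, here is a minimal sketch of plain gradient boosting for squared error. It is not XGBoost's actual algorithm (which adds regularization and second-order information); the function names and the toy data are assumptions made for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_boosted_trees(X, y, n_trees=50, learning_rate=0.1, max_depth=3):
    """Fit shallow trees one at a time, each on the current residuals."""
    baseline = float(np.mean(y))              # initial prediction: the target mean
    prediction = np.full(len(y), baseline)
    trees = []
    for _ in range(n_trees):
        residuals = y - prediction            # what the ensemble still gets wrong
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residuals)
        prediction = prediction + learning_rate * tree.predict(X)
        trees.append(tree)                    # earlier trees are never modified
    return baseline, trees

def predict_boosted(baseline, trees, X, learning_rate=0.1):
    """Sum the baseline and the scaled contribution of every tree."""
    prediction = np.full(X.shape[0], baseline)
    for tree in trees:
        prediction = prediction + learning_rate * tree.predict(X)
    return prediction

# synthetic toy data, for illustration only
rng = np.random.default_rng(0)
X_toy = rng.uniform(0, 10, size=(200, 1))
y_toy = np.sin(X_toy[:, 0]) + rng.normal(0, 0.1, size=200)

baseline, trees = fit_boosted_trees(X_toy, y_toy)
boosted_mse = np.mean((y_toy - predict_boosted(baseline, trees, X_toy)) ** 2)
baseline_mse = np.mean((y_toy - baseline) ** 2)
```

Comparing `boosted_mse` with `baseline_mse` shows how much the stacked trees improve on simply predicting the mean.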

Gradient Boosting performs best with optimal parameter tuning. We're going to use sklearn's GridSearchCV to find the optimal hyperparameters to use with our gradient booster! Here are the parameters we are trying out:

* n_estimators: Number of boosting rounds (trees) to fit. Gradient boosting is fairly robust to over-fitting, so more is often better, though the gains level off.
* max_depth: Maximum depth of each tree. Deeper trees capture more complex interactions but over-fit more easily. We don't know what is best here, so we try values from 2 to 10 to see what works best.
* min_child_weight: Minimum sum of instance weights needed in a child node. Higher values make the algorithm more conservative.
* gamma: Minimum loss reduction required to make another partition on a leaf node. Higher values make the algorithm more conservative.
* subsample: Fraction of training rows sampled for each tree. .5 means that half of the training data is randomly sampled before a tree is grown. Sampling occurs once per boosting iteration.
* colsample_bytree: Fraction of columns sampled when building each tree.
* reg_alpha: L1 regularization on the leaf weights. Higher values make the model more conservative.
* reg_lambda: L2 regularization on the leaf weights. Higher values make the model more conservative.
* learning_rate: Shrinks the contribution of each new tree, which controls how much the model corrects after each boost. .1 is a common rate and we will test a lower and higher rate as well.
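
The tuning in this notebook searches one or two parameters at a time and carries the winners into the next search. That pattern could be sketched as a small helper like the one below. The helper name `tune_in_stages` is hypothetical, and a `DecisionTreeRegressor` on synthetic data stands in for the XGBoost model so the sketch stays self-contained.

```python
from sklearn.base import clone
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

def tune_in_stages(estimator, X, y, stages, cv=5):
    """Run one small grid search per stage, carrying earlier winners forward."""
    best_params = {}
    for param_grid in stages:
        # start each stage from the best settings found so far
        model = clone(estimator).set_params(**best_params)
        search = GridSearchCV(model, param_grid=param_grid, cv=cv)
        search.fit(X, y)
        best_params.update(search.best_params_)
    return best_params

# synthetic data, for illustration only
X_demo, y_demo = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

stages = [
    {"max_depth": [2, 4, 6]},
    {"min_samples_leaf": [1, 5, 10]},
]
best = tune_in_stages(DecisionTreeRegressor(random_state=0), X_demo, y_demo, stages)
```

Staged searches are much cheaper than one full grid, at the cost of possibly missing combinations that only work well together.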

In [None]:
# visualize how the model score changes across max depths from 10 to 80, to get a starting point for max depth

from sklearn.model_selection import validation_curve

def ValidationCurve(estimator, predictors, target, param_name, hyperparam):
    param_range = np.arange(10, 90, 10)

    # param_name and param_range must be passed by keyword in recent scikit-learn versions
    train_score, test_score = validation_curve(estimator, predictors, target,
                                               param_name=param_name, param_range=param_range,
                                               cv=5, scoring='r2', n_jobs=-4)
    rsquared_train = train_score.mean(axis=1)
    rsquared_test = test_score.mean(axis=1)

    plt.figure(figsize=(10,5))
    plt.plot(param_range, rsquared_train, color='r', linestyle='-', marker='o', label='Training Set')
    plt.plot(param_range, rsquared_test, color='b', linestyle='-', marker='x', label='Testing Set')
    plt.legend()
    plt.xlabel(hyperparam)
    plt.ylabel('R^2')
    plt.title("R^2 for {} on Train/Test".format(hyperparam))

ValidationCurve(xgb.XGBRegressor(), x_train_boost, y_train, 'max_depth', 'Maximum Depth')

In [None]:
'''# Parameter Tuning max_depth

param_grid = {"max_depth": [5, 10],
                          
              }

model = xgb.XGBRegressor(
                 n_estimators=250,                                                                    
                 seed=42,
                 missing=0,
                 eval_metric='mae' )

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-4)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

Best parameters set found on train set: 

    {'max_depth': 5}

    Grid scores on train set:
    
    0.790 (+/-0.016) for {'max_depth': 5}
    0.770 (+/-0.037) for {'max_depth': 10}

In [None]:
'''# Parameter Tuning max_depth and min_child_weight

param_grid = {"max_depth": [3, 5, 7],
              "min_child_weight" : [6, 8, 10]            
              }

model = xgb.XGBRegressor(
                 n_estimators=250,                                                                    
                 seed=42,
                 missing=0,
                 eval_metric='mae' )

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-1)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

Best parameters set found on train set: 

    {'max_depth': 3, 'min_child_weight': 8}

    Grid scores on train set:

    0.794 (+/-0.019) for {'max_depth': 3, 'min_child_weight': 6}
    0.797 (+/-0.025) for {'max_depth': 3, 'min_child_weight': 8}
    0.795 (+/-0.021) for {'max_depth': 3, 'min_child_weight': 10}
    0.787 (+/-0.018) for {'max_depth': 5, 'min_child_weight': 6}
    0.785 (+/-0.019) for {'max_depth': 5, 'min_child_weight': 8}
    0.792 (+/-0.022) for {'max_depth': 5, 'min_child_weight': 10}
    0.783 (+/-0.028) for {'max_depth': 7, 'min_child_weight': 6}
    0.780 (+/-0.023) for {'max_depth': 7, 'min_child_weight': 8}
    0.780 (+/-0.031) for {'max_depth': 7, 'min_child_weight': 10}

In [None]:
'''# Parameter Tuning max_depth and min_child_weight

param_grid = {"max_depth": [2, 3, 4],
              "min_child_weight" : [7, 8, 9]            
              }

model = xgb.XGBRegressor(
                 n_estimators=250,                                                                    
                 seed=42,
                 missing=0,
                 eval_metric='mae' )

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-1)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

Best parameters set found on train set: 

    {'max_depth': 3, 'min_child_weight': 8}

    Grid scores on train set:

    0.794 (+/-0.019) for {'max_depth': 2, 'min_child_weight': 7}
    0.793 (+/-0.023) for {'max_depth': 2, 'min_child_weight': 8}
    0.795 (+/-0.024) for {'max_depth': 2, 'min_child_weight': 9}
    0.797 (+/-0.022) for {'max_depth': 3, 'min_child_weight': 7}
    0.797 (+/-0.025) for {'max_depth': 3, 'min_child_weight': 8}
    0.795 (+/-0.026) for {'max_depth': 3, 'min_child_weight': 9}
    0.790 (+/-0.016) for {'max_depth': 4, 'min_child_weight': 7}
    0.793 (+/-0.019) for {'max_depth': 4, 'min_child_weight': 8}
    0.792 (+/-0.029) for {'max_depth': 4, 'min_child_weight': 9}

In [None]:
'''# Parameter Tuning gamma

param_grid = {'gamma':[.1, .3, .5, .7, .9]            
              }

model = xgb.XGBRegressor(
                n_estimators=250,
                max_depth = 3,
                min_child_weight = 8,
                seed=42,
                missing=0,
                eval_metric='mae' )

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-1)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

Best parameters set found on train set: 

    {'gamma': 0.1}

    Grid scores on train set:

    0.797 (+/-0.025) for {'gamma': 0.1}
    0.797 (+/-0.025) for {'gamma': 0.3}
    0.797 (+/-0.025) for {'gamma': 0.5}
    0.797 (+/-0.025) for {'gamma': 0.7}
    0.797 (+/-0.025) for {'gamma': 0.9}

In [None]:
'''# Parameter Tuning subsample

param_grid = {
 'subsample':[.2, .4, .6, .8, 1],
 
    }

model = xgb.XGBRegressor(
                n_estimators=250,
                max_depth = 3,
                min_child_weight = 8,
                gamma = .1,
                seed=42,
                missing=0,
                eval_metric='mae' )

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-1)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

In [None]:
'''# Parameter Tuning colsample_bytree

param_grid = {
 'colsample_bytree':[.2, .4, .6, .8, 1]
    }

model = xgb.XGBRegressor(
                n_estimators=250,
                max_depth = 3,
                min_child_weight = 8,
                gamma = .1,
                seed=42,
                missing=0,
                eval_metric='mae' )

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-1)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

In [None]:
# Parameter Tuning alpha

param_grid = {
    'reg_alpha':[1e-5, 1e-2, 0.1, 1, 10, 100]
    }

model = xgb.XGBRegressor(
                n_estimators=250,
                max_depth = 3,
                min_child_weight = 8,
                gamma = .1,
                colsample_bytree=.8,
                reg_lambda = 1,
                seed=42,
                missing=0,
                eval_metric='mae' )

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-1)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))

Best parameters set found on train set: 

    {'reg_alpha': 10}

    Grid scores on train set:

    0.803 (+/-0.019) for {'reg_alpha': 1e-05}
    0.803 (+/-0.019) for {'reg_alpha': 0.01}
    0.803 (+/-0.019) for {'reg_alpha': 0.1}
    0.803 (+/-0.019) for {'reg_alpha': 1}
    0.803 (+/-0.019) for {'reg_alpha': 10}
    0.802 (+/-0.020) for {'reg_alpha': 100}

In [None]:
'''# Parameter Tuning alpha

param_grid = {
    'reg_alpha':[5, 10, 25, 50]
    }

model = xgb.XGBRegressor(
                n_estimators=250,
                max_depth = 3,
                min_child_weight = 8,
                gamma = .1,
                colsample_bytree=.8,
                reg_lambda = 1,
                seed=42,
                missing=0,
                eval_metric='mae' )

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-1)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

Best parameters set found on train set: 

    {'reg_alpha': 10}

    Grid scores on train set:

    0.803 (+/-0.019) for {'reg_alpha': 5}
    0.803 (+/-0.019) for {'reg_alpha': 10}
    0.802 (+/-0.020) for {'reg_alpha': 25}
    0.802 (+/-0.020) for {'reg_alpha': 50}

In [None]:
# Parameter Tuning lambda

param_grid = {'lambda':[0.1, 1, 10, 100, 500, 1000, 2500]            
              }

model = xgb.XGBRegressor(
                n_estimators=250,
                max_depth = 3,
                min_child_weight = 8,
                gamma = .1,
                colsample_bytree=.8,
                reg_alpha = 10,
                seed=42,
                missing=0,
                eval_metric='mae' )

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-1)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))

Best parameters set found on train set: 

    {'lambda': 1}

    Grid scores on train set:

    0.797 (+/-0.022) for {'lambda': 0.1}
    0.803 (+/-0.019) for {'lambda': 1}
    0.802 (+/-0.013) for {'lambda': 10}
    0.794 (+/-0.018) for {'lambda': 100}
    0.780 (+/-0.029) for {'lambda': 500}
    0.772 (+/-0.023) for {'lambda': 1000}
    0.755 (+/-0.021) for {'lambda': 2500}

In [None]:
# Parameter Tuning lambda

param_grid = {'lambda':[0.5, 1, 5]            
              }

model = xgb.XGBRegressor(
                n_estimators=250,
                max_depth = 3,
                min_child_weight = 8,
                gamma = .1,
                colsample_bytree=.8,
                reg_alpha = 10,
                seed=42,
                missing=0,
                eval_metric='mae' )

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-1)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))

Best parameters set found on train set: 

    {'lambda': 1}

    Grid scores on train set:

    0.799 (+/-0.023) for {'lambda': 0.5}
    0.803 (+/-0.019) for {'lambda': 1}
    0.801 (+/-0.023) for {'lambda': 5}

In [None]:
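'''# Parameter Tuning n_estimators
# NOTE: the XGBRegressor parameter is named 'n_estimators'. The 'num_estimators' key used in this
# grid was most likely ignored, so every run kept n_estimators=250, which would explain why all
# four recorded scores are identical. Use 'n_estimators' in the grid to actually vary the tree count.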

param_grid = {'num_estimators':[250, 500, 1000, 5000]            
              }

model = xgb.XGBRegressor(
                n_estimators=250,
                max_depth = 3,
                min_child_weight = 8,
                gamma = .1,
                colsample_bytree=.8,
                reg_alpha = 10,
                reg_lambda = 1,
                seed=42,
                missing=0,
                eval_metric='mae' )

grid_search = GridSearchCV(model, param_grid=param_grid, cv = 5, 
                                   verbose=10, n_jobs=-1)
grid_search.fit(x_train_boost, y_train)

grid_search.best_estimator_

print("Best parameters set found on train set: \n")
print(grid_search.best_params_)
print("\nGrid scores on train set:\n")
means = grid_search.cv_results_['mean_test_score']
stds = grid_search.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, grid_search.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))'''

Best parameters set found on train set: 

    {'num_estimators': 250}

    Grid scores on train set:

    0.803 (+/-0.019) for {'num_estimators': 250}
    0.803 (+/-0.019) for {'num_estimators': 500}
    0.803 (+/-0.019) for {'num_estimators': 1000}
    0.803 (+/-0.019) for {'num_estimators': 5000}

In [None]:
best_xgb_model = xgb.XGBRegressor(
                n_estimators=250,
                max_depth = 3,
                min_child_weight = 8,
                gamma = .1,
                colsample_bytree=.8,
                reg_alpha = 10,
                reg_lambda = 1,
                seed=42,
                missing=0,
                eval_metric='mae' )
best_xgb_model.fit(x_train_boost, y_train)

cv_5 = cross_val_score(best_xgb_model, x_train_boost, y_train, cv=5)
r2 = cv_5.mean()
r2

In [None]:
# make prediction
preds = best_xgb_model.predict(x_test_boost)

# calculate residuals
residuals = test_actual - preds

# plot residuals
fig = plt.figure(figsize=(20,15))
plt.scatter(preds, residuals)

In [None]:
# Calculate our mean absolute error

mae = round(mean_absolute_error(test_actual, preds), 2)
mae

In [None]:
# MAE for properties UNDER 500k

under500 = pd.DataFrame(list(zip(test_actual, preds)), columns =['actual', 'predictions'])
under500 = under500[under500['actual'] < 500000]
mae500 = round(mean_absolute_error(under500['actual'], under500['predictions']), 2)
mae500

In [None]:
models['Models'].append('XGBoost')
models['r2'].append(r2)
models['mae'].append(mae)
models['mae_500'].append(mae500)
values['boost_pred'] = preds

### Decision Tree Regressors - Target Encoding

# Model Selection

We ran several different types of models and logged the R^2 and mean absolute error for each one. Which model performed the best for us?

In [None]:
# make data frame from our models dictionary
model_types = pd.DataFrame(models)

# sort data frame by mae and use the model names as the index
model_types = model_types.sort_values('mae', ascending=True).set_index('Models')

model_types

In [None]:
# plot model mae

plt.figure(figsize=(15,10))
plt.plot(model_types['mae'])
plt.title("Mean Absolute Error")
plt.xticks(rotation=90)
plt.xlabel('Model')
plt.ylabel("MAE");

In [None]:
# get residuals for a few types
values['rfecv_resid'] = (values['actual'] - values['rfecv_pred'])/10000
values['compsmodel_resid'] = (values['actual'] - values['compsmodel_pred'])/10000
values['boost_resid'] = (values['actual'] - values['boost_pred'])/10000

In [None]:
# plot overlapping residuals for a few types
fig = plt.figure(figsize=(20,15))

plt.scatter(values['boost_pred'], values['boost_resid'], color='red')
plt.scatter(values['compsmodel_pred'], values['compsmodel_resid'], color='blue')
plt.scatter(values['rfecv_pred'], values['rfecv_resid'], color='green')
plt.title("Residuals")
#plt.xticks(rotation=90)
plt.xlabel('Predictions in millions')
plt.ylabel("Distance from Actual, in $10,000");

Ultimately we will select the sklearn RFECV method for our model. This method performs optimal feature selection via cross validation in order to determine the best feature set for the data. 

Support Vector Regression and Ridge performed better, but they don't fulfill two of the asks for this project: that the model be a linear regression, and that it be easily explained. RFECV also had the lowest MAE on properties under 500k, so we're happy with its overall performance.

# Final Model

Now that we've chosen a favorite regression for this problem (RFECV), we want to be able to actually use the model to predict new data. Our steps are as follows:

* Build a model using our ENTIRE dataset for deployment
* Write our own standardization functions that we can apply to incoming new data
* Prepare a more intuitive GUI for predicting on new data

## Prepare full data set for final model

In [None]:
# merge our x_test and x_train continuous variables into one data frame with a fresh index
# (DataFrame.append was removed in pandas 2.0, so we use pd.concat)
X_all_continuous = pd.concat([x_test_cont, x_train_cont], ignore_index=True)

# merge our x_test and x_train categorical variables into one data frame with a fresh index
X_all_categoricals = pd.concat([x_test_cat, x_train_cat], ignore_index=True)

While forming our model, we used the sklearn StandardScaler. But how do we scale brand new features that we are trying to predict from? In order to do so, we need to write a standardization function for each of our features, that we can then save and apply to our new prediction data.
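
A minimal sketch of what such reusable standardization functions could look like is shown here. It assumes the continuous features are log-transformed first and that the mean and standard deviation are recorded on that log scale; the function names, column names, and values are hypothetical.

```python
import numpy as np
import pandas as pd

def fit_standardizer(df):
    """Record the mean and standard deviation of each log-transformed column."""
    logged = np.log(df)
    return {col: (logged[col].mean(), logged[col].std()) for col in logged.columns}

def apply_standardizer(df, coeffs):
    """Log-transform new data and scale it with the stored coefficients."""
    logged = np.log(df)
    scaled = pd.DataFrame(index=df.index)
    for col, (mean, std) in coeffs.items():
        scaled[col] = (logged[col] - mean) / std
    return scaled

# hypothetical column names and values, for illustration only
train = pd.DataFrame({"living_area": [1200.0, 1800.0, 2400.0, 3000.0],
                      "lot_size": [5000.0, 6500.0, 8000.0, 12000.0]})
new_listing = pd.DataFrame({"living_area": [2000.0], "lot_size": [7000.0]})

coeffs = fit_standardizer(train)                      # save these alongside the model
train_scaled = apply_standardizer(train, coeffs)
new_scaled = apply_standardizer(new_listing, coeffs)  # same scaling for unseen data
```

Note that pandas `.std()` uses the sample standard deviation (ddof=1), while sklearn's `StandardScaler` uses the population version (ddof=0), so the two give slightly different scaled values.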

In [None]:
# log transform all continuous
X_all_continuous = np.log(X_all_continuous)

# make dictionary to store standardization coefficients
standardization_coeffs = {}

# make standardization coefficients for each column, computed on the log-transformed
# values so that they match the scale of the data they are applied to
for item in X_all_continuous:
    standardization_coeffs[item+'_mean'] = X_all_continuous[item].mean()
    standardization_coeffs[item+'_std'] = X_all_continuous[item].std()

# apply standardization to all continuous
for item in X_all_continuous:
    X_all_continuous[item] = (X_all_continuous[item] - standardization_coeffs[item+'_mean'])/standardization_coeffs[item+'_std']
    
# concat continuous and categoricals into a final X
X_final = pd.concat([X_all_continuous, X_all_categoricals], axis=1)

# concat test and train y into a final y (same order as X), reset index so it matches X
y_final = pd.concat([y_test, y_train_val]).reset_index(drop=True)

## Train Final Model

Using sklearn RFECV to perform integrated CV while picking the number of features.

We're running a heavy cv to ensure that we really, truly have the right feature selection. We're using 10 folds, repeated 5 times with different splits, with mean absolute error as the scoring metric. We're looking for the lowest error.
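
As a small, self-contained illustration of how RFECV combines with a repeated k-fold plan, the sketch below runs the same kind of search on synthetic data. The dataset, the fold counts, and the variable names are assumptions chosen to keep it quick to run, not the notebook's actual settings.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import RepeatedKFold

# synthetic data: 10 features, only 4 of which carry signal
X_demo, y_demo = make_regression(n_samples=300, n_features=10, n_informative=4,
                                 noise=5.0, random_state=0)

# every candidate feature count is scored on 5 folds, repeated twice with new splits
cv = RepeatedKFold(n_splits=5, n_repeats=2, random_state=1)

selector = RFECV(estimator=LinearRegression(), step=1, cv=cv,
                 scoring="neg_mean_absolute_error")
selector.fit(X_demo, y_demo)

n_kept = selector.n_features_   # number of features in the best-scoring subset
kept_mask = selector.support_   # boolean mask over the original columns
```

The `support_` mask is what lets us recover the selected column names from the original data frame.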

In [None]:
# define estimator and cv plan
model = LinearRegression()
cv = RepeatedKFold(n_splits=10, n_repeats=5, random_state=1)

# create final model using RFECV
rfecv = RFECV(estimator=model, step=1, cv=cv, scoring='neg_mean_absolute_error')

# fit final model to all data
rfecv.fit(X_final, y_final)

# final model r^2
rfecv.score(X_final, y_final)

### Get Coefficient Importance

In [None]:
print('Optimal number of features: {}'.format(rfecv.n_features_))

In [None]:
# get list of features used by the final model
dset = pd.DataFrame()
dset['attr'] = X_final.columns
dset['used'] = rfecv.support_

# make a list of the features used in the rfecv
rfecv_result = list(dset[dset['used']]['attr'])

In [None]:
# Run the model with the selected features in statsmodels OLS so we can get
# a list of model pvalues
X_specific = X_final[rfecv_result]

predictors_train = sm.add_constant(X_specific)
model = sm.OLS(y_final, predictors_train).fit()

model.summary()

Some features have p-values over .05, so we're going to drop those by running the stepwise feature selector on this feature set.

In [None]:
# run feature selector and get list of features with p-value under .05
stepwise_result = stepwise_selection(X_specific, y_final, verbose=True)

Our stepwise selector keeps only the features that are statistically significant. Grade, sqft_living, condition, and several zip codes were especially strong features. Almost all included features had p-values near zero, meaning an association this strong would be very unlikely if the feature had no real relationship with price. Our highest included p-value, for zipcode 98003, is only about .003.

In [None]:
# make one last data set using the features that the selector accepted
X_specific = X_specific[stepwise_result]

# make final model
final_model = LinearRegression()
final_model.fit(X_specific, y_final)

# evaluate final model
predictors_train = sm.add_constant(X_specific)
model = sm.OLS(y_final, predictors_train).fit()

model.summary()

In [None]:
# save our final summary to image
plt.rc('figure', figsize=(10,23))
plt.text(0.01, 0.05, str(model.summary()), {'fontsize': 10}, fontproperties = 'monospace') # monospace keeps the summary table columns aligned
plt.axis('off')
plt.tight_layout()
#plt.savefig('images/output.png')

In [None]:
# final model MAE

end_z = np.exp(final_model.predict(X_specific))
mae = round(mean_absolute_error(np.exp(y_final), end_z), 2)
mae

In [None]:
# get residuals
residuals = np.exp(y_final) - end_z

# plot residuals
fig = plt.figure(figsize=(20,15))
plt.scatter(end_z, residuals)
plt.title("Residuals")
#plt.xticks(rotation=90)
plt.xlabel('Predictions in millions')
plt.ylabel("Distance from Actual, in dollars");

#plt.savefig('images/residuals.png')

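Our residuals fan out as the predicted price increases, which is a sign of heteroscedasticity. (The Durbin-Watson score in the summary measures autocorrelation of the residuals, so it does not describe this pattern.)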
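
One rough way to put a number on what the residual plot shows is to compare the spread of residuals across bins of predicted price. The sketch below does this on synthetic data whose noise grows with the prediction; the helper name and the data are assumptions for illustration.

```python
import numpy as np

def residual_spread_by_bin(predictions, residuals, n_bins=4):
    """Return the residual standard deviation within each prediction quantile bin."""
    predictions = np.asarray(predictions, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    order = np.argsort(predictions)            # indices from lowest to highest prediction
    bins = np.array_split(order, n_bins)       # roughly equal-sized groups
    return [float(np.std(residuals[idx])) for idx in bins]

# synthetic example: the noise scale is 10% of the predicted value
rng = np.random.default_rng(0)
preds_demo = rng.uniform(100_000, 1_000_000, size=2000)
resid_demo = rng.normal(0, 0.1 * preds_demo)

spread = residual_spread_by_bin(preds_demo, resid_demo)
```

A spread that climbs steadily from the first bin to the last is the numeric counterpart of the fan shape in the scatter plot.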

In [None]:
fig, ax = plt.subplots(figsize=(15,10))
fig = sm.graphics.qqplot(model.resid, dist=stats.norm, line='45', fit=True, ax=ax)

QQ-plot of residuals shows heavy tails.

### Observations on final model summary

> 88.2% of the variation in price can be explained by our model variables. Our R^2 and adjusted R^2 are equal. Adjusted R^2 penalizes R^2 based on the number of variables, so an identical score suggests that we are not carrying many features that add nothing to the model.
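
For reference, the adjusted R^2 penalty can be written out directly. In the sketch below only the R^2 of 0.882 comes from the summary; the sample and feature counts are placeholders, chosen to show that the penalty is small when there are many more rows than features.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

# placeholder counts, for illustration only
few_features = adjusted_r2(0.882, n_samples=10_000, n_features=80)
many_features = adjusted_r2(0.882, n_samples=10_000, n_features=2_000)
```

With a large sample the two scores can match at the displayed precision even with dozens of features, so equal values are encouraging but are not proof that every feature contributes.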

> F-statistic and Prob(F-statistic) tell us if our variable group as a whole is statistically significant. The null hypothesis is a model with no variables (only intercept). Prob(F-statistic) is the probability of seeing an F-statistic at least this large if that null hypothesis were true and our variables had no effect. On our model this probability is so low that it's functionally 0. The F-statistic evaluates all of the coefficients together, so it tells us that our MODEL is significant without telling us if any individual feature is significant. The p-values of the individual coefficients are still important.
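
The overall F-statistic can be recovered from R^2 and the model dimensions, which shows why a high R^2 on a large sample produces a very large F. This is a minimal sketch for an OLS model with an intercept; the function name and the example numbers are assumptions.

```python
def f_statistic(r2, n_samples, n_features):
    """Overall F-statistic of an OLS fit that includes an intercept."""
    explained = r2 / n_features                            # explained variation per feature
    unexplained = (1 - r2) / (n_samples - n_features - 1)  # residual variation per degree of freedom
    return explained / unexplained

# small made-up example: R^2 = 0.5 with 12 rows and 1 feature
f_small = f_statistic(0.5, n_samples=12, n_features=1)
```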

> Omnibus and prob(omnibus) are telling us that our residuals are not normally distributed. We want to see a value close to 0 for Omnibus, and a value close to 1 for Prob(Omnibus). Neither of these things is the case. We saw heteroscedasticity on the residuals for all of our model types that we tried. All of our models seem to perform well at lower values and then variance increases as price increases. This was true on both linear and nonlinear approaches such as decision trees.

> Our residuals are skewed slightly left. They also have high kurtosis, which means a sharp peak around zero combined with heavy tails: more extreme residuals than a normal distribution would produce. This agrees with the heavy tails in our QQ-plot.
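
As a reference point for reading the skew and kurtosis figures, the sketch below computes both from scratch and compares a normal sample with a heavier-tailed one. The statsmodels summary reports kurtosis on the scale where a normal distribution scores about 3. The function names and samples are assumptions for illustration.

```python
import numpy as np

def skewness(x):
    """Third standardized moment; 0 for a symmetric distribution."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))

def kurtosis(x):
    """Fourth standardized moment; about 3 for a normal distribution."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4))

rng = np.random.default_rng(0)
normal_sample = rng.normal(size=50_000)
heavy_tailed_sample = rng.laplace(size=50_000)  # Laplace has heavier tails than normal

normal_kurtosis = kurtosis(normal_sample)
heavy_kurtosis = kurtosis(heavy_tailed_sample)
```

The heavier-tailed sample scores well above 3, which is the same signal a heavy-tailed QQ-plot gives.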

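> Durbin-Watson is a test for autocorrelation in the residuals, not for homoscedasticity. It ranges from 0 to 4, and a value near 2 indicates no autocorrelation. Ours is slightly above 2, which points to mild negative autocorrelation at most. The heteroscedasticity we saw comes from the residual plot, where the spread of the errors grows with price.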
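
The Durbin-Watson statistic is simple enough to compute by hand: the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below uses synthetic residual series (an assumption, for illustration) to show how the value moves with autocorrelation.

```python
import numpy as np

def durbin_watson_stat(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

rng = np.random.default_rng(0)
independent = rng.normal(size=5000)              # no autocorrelation
trending = np.cumsum(rng.normal(size=5000))      # strong positive autocorrelation
alternating = np.array([1.0, -1.0] * 2500)       # strong negative autocorrelation

dw_independent = durbin_watson_stat(independent)
dw_trending = durbin_watson_stat(trending)
dw_alternating = durbin_watson_stat(alternating)
```

Independent residuals land near 2, positively autocorrelated residuals fall toward 0, and negatively autocorrelated residuals rise toward 4.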

> Our low condition number suggests little multicollinearity among our features.

> Overall, our diagnostics are mediocre. Our high Omnibus score is problematic, our Durbin-Watson score sits above 2, and our QQ-plot shows very heavy tails. But no better model was found to correct for the heteroscedasticity.



# Analysis

> Our final model utilizes a combination of continuous variables and one-hot-encoded categoricals to produce a linear regression with R^2 of 88.2% and a mean absolute error of 56k. I tried several different transformations including polynomial features, mean target encoding, lower-granularity binning, and median rank as a continuous, and ALL of these efforts resulted in a lower R^2 and higher mean absolute error, leading to a final decision to one-hot encode all 70 zip codes individually. Similar efforts on other categoricals such as age and month sold also did not improve the model over one-hot encoding. This resulted in the greatest accuracy despite a model that is more "messy" with a large number of features.
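
To make the comparison between one-hot encoding and mean target encoding concrete, here is a small sketch of both on a toy frame. The zip codes and prices are made-up values for illustration, not rows from the dataset.

```python
import pandas as pd

# made-up listings, for illustration only
df = pd.DataFrame({
    "zipcode": ["78701", "78701", "78704", "78704", "78745"],
    "price": [500_000.0, 700_000.0, 400_000.0, 600_000.0, 300_000.0],
})

# one-hot encoding: one indicator column per zip code
one_hot = pd.get_dummies(df["zipcode"], prefix="zip")

# mean target encoding: replace each zip code with its mean price
zip_means = df.groupby("zipcode")["price"].mean()
target_encoded = df["zipcode"].map(zip_means)
```

One-hot encoding adds a column per category, while target encoding keeps a single column. In practice the target means should be computed on training data only, so that the test prices do not leak into the feature.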

> Features were selected using the sklearn "Recursive Feature Elimination with Cross Validation" function, or RFECV. RFECV begins the model with all features, then eliminates the weakest feature and scores the model again, using cross-validation at each iteration. At the end it returns the feature set that produced the highest cross-validated score. Since we are predicting prices, I used mean absolute error as my preferred scoring metric. Then, starting with the features recommended by RFECV, I ran the model again through a forward-backward feature selector, which starts with zero features and iteratively adds and removes features until all features have a p-value under the threshold (in this case .05).
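
> The RFECV step can be sketched roughly as follows. This is a minimal example on synthetic data from `make_regression`, standing in for the housing features; the sample size, feature count, and fold count are assumptions for illustration. It does not cover the forward-backward p-value selector that was run afterwards.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LinearRegression

# synthetic stand-in for the housing features: 10 columns, 4 of them informative
X, y = make_regression(n_samples=300, n_features=10, n_informative=4,
                       noise=10.0, random_state=0)

selector = RFECV(
    estimator=LinearRegression(),
    step=1,                              # drop one feature per iteration
    cv=5,                                # 5-fold cross-validation at each step
    scoring="neg_mean_absolute_error",   # sklearn maximizes scores, so MAE is negated
)
selector.fit(X, y)

# column indices that RFECV decided to keep
selected = [i for i, keep in enumerate(selector.support_) if keep]
print(selector.n_features_, selected)
```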

> Almost all included features have p-values so low that, if the feature truly had no effect on price, the chance of seeing a coefficient this large would be near zero. Our highest included p-value, for zip code 98003, is only about .003 (.3%).

> 88.2% of the variation in price can be explained by our model variables. Our equal R^2 and adjusted R^2 tell us that all of our features are contributing to the model. Adjusted R^2 penalizes R^2 based on the number of variables, so matching scores tell us that our variables are all contributors.
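
> To make the penalty concrete, here is a small sketch of the adjusted R^2 formula, `1 - (1 - R^2) * (n - 1) / (n - p - 1)`. The observation count `n` and feature counts `p` are placeholders, not the figures from our model.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for a model with n observations and p features."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# hypothetical counts, for illustration only
few_features = adjusted_r2(0.882, n=20000, p=100)
many_features = adjusted_r2(0.882, n=20000, p=5000)
print(round(few_features, 3), round(many_features, 3))
```

> The penalty stays small when there are many more observations than features, and grows as the feature count approaches the number of observations.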


# Conclusions and Recommendations

#### What are the primary factors influencing housing prices in the King County metro area?
> As square footage increases so does quality of materials. Most importantly you can see the upward price trend with both increased square footage and materials grade. I was intrigued that our lower bound of data points is very linear, but as our square footage increases, the upper bound gradually breaks away with higher variance. 

> I ranked the 70 zip codes in King County by median home value, and used those ranks to color our data points. Our low-median zip codes have a low price per square foot, and price per square foot increases with zip code median, which makes sense, but also shows the importance of zip code to pricing. I found it interesting that while most zip codes exhibit a clear trend of price per square foot decreasing with increased total square footage, which is entirely normal, certain very high value zip codes seem to retain their high price per square foot regardless of total square footage. Certain zip codes seem immune to the usual price per square foot decay.
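
> A minimal sketch of how a zip code median rank and a price per square foot can be derived with pandas. The `zipcode` column name, the toy rows, and the rank direction (1 = lowest median) are assumptions for illustration; `price`, `sqft_living`, `median_zip`, `zip_rank`, and `pr_sf` mirror the column names used in the visualization cells.

```python
import pandas as pd

# toy sales records; the zip codes and prices are made up for illustration
sales = pd.DataFrame({
    "zipcode": ["A", "A", "B", "B", "C", "C"],
    "price": [300000, 340000, 900000, 1100000, 500000, 520000],
    "sqft_living": [1500, 2000, 1800, 2200, 1600, 2100],
})

# median sale price of each row's zip code
sales["median_zip"] = sales.groupby("zipcode")["price"].transform("median")
# assumed direction: rank 1 = lowest median, highest rank = most expensive zip code
sales["zip_rank"] = sales["median_zip"].rank(method="dense").astype(int)
# price per square foot, rounded to cents
sales["pr_sf"] = (sales["price"] / sales["sqft_living"]).round(2)

print(sales)
```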

> As they say, location is everything, and it is the primary influencing factor for a home price in the King County metro area. Our darkest areas, and therefore highest-value sales, are clustered in and around Seattle to the west of Lake Washington and in the eastern lake cities of Bellevue and Redmond, which are the technology employer hubs of the region. As we move away from Seattle and the tech hubs into the suburbs, our prices clearly go down.

> These three features alone explain 85% of the price variance.

#### Can we effectively use a regression model based system for realtors to determine a proper list price?
> Our regression model, while explaining over 88% of the price variance with our features, was nonetheless far from accurate in absolute terms. A mean absolute error of 55k in either direction is a huge variance for a home price, one so large that it renders the prediction much less meaningful and useful. Other models need to be explored, better data needs to be sourced, or easy-to-use features that an average realtor is capable of evaluating or acquiring should be added to the model to improve its predictive quality. The model is providing at best a baseline starting point.
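
> To show how a mean absolute error translates into a share of the sale price, the sketch below computes both the absolute and the percentage error. The prices and predictions are invented for illustration and are not results from our model.

```python
def mean_absolute_error(actual, predicted):
    """Average size of the prediction error, in dollars."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """Average prediction error as a share of each home's actual price."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# hypothetical sale prices and model predictions
actual = [350000, 480000, 620000, 910000]
predicted = [390000, 450000, 700000, 820000]

mae = mean_absolute_error(actual, predicted)
mape = mean_absolute_percentage_error(actual, predicted)
print(mae, round(mape, 3))
```

> The same dollar error is a much larger miss on a lower-priced home than on an expensive one, which is why the percentage view is useful alongside the dollar figure.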

#### Is a model-based system more accurate for determining list price than the traditional comps-based system?
> At present, the regression model is only slightly more accurate than the comps-based system on our test data, and this is even with our comps-based simulator lacking a feature set as robust as our model's. In fact, our realtor simulator performed better on new data than our model did. The model is lacking something (quality starting data, predictive features, proper feature engineering, or something else), and we should identify and include it.

> One major pro of our model is that it is a mathematical model that we can use without housing data on hand. It would benefit from up-to-date data, but it doesn't require it to function. This certainly results in less labor overhead, as it only needs occasional updating. However, the wide error range gives only a starting point, and it ultimately requires human intervention to determine the final price, which is likely a comps-based manual pass anyway.

> The comps-based system, on the other hand, is a familiar method for realtors. It somehow captures the location granularity that the model misses. Most importantly, it can still be accomplished programmatically, as I've demonstrated, which can provide a quicker starting point for a realtor than identifying the comps manually. The downside here is cost of maintenance: the comps finder requires the entire data set to be present to function, and it must be updated frequently, preferably daily, for the most accurate results. With that in mind, the realtor simulator requires a lot more overhead to remain usable.
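
> As a rough idea of what a programmatic comps finder can look like, the sketch below picks the nearest recent sales of similar size and applies their average price per square foot to the subject home. This is a hypothetical illustration, not the realtor simulator used in this notebook; the function names, the 25% size tolerance, `k=3`, and the sales records are all assumptions.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 3958.8 * asin(sqrt(a))


def comps_estimate(subject, sales, k=3, sqft_tolerance=0.25):
    """Estimate a price from the k nearest sales of similar square footage."""
    low = subject["sqft"] * (1 - sqft_tolerance)
    high = subject["sqft"] * (1 + sqft_tolerance)
    similar = [s for s in sales if low <= s["sqft"] <= high]
    nearest = sorted(
        similar,
        key=lambda s: haversine_miles(subject["lat"], subject["lon"], s["lat"], s["lon"]),
    )[:k]
    if not nearest:
        return None  # no comparable sales available
    avg_price_per_sqft = sum(s["price"] / s["sqft"] for s in nearest) / len(nearest)
    return avg_price_per_sqft * subject["sqft"]


# made-up sales, for illustration only
sales = [
    {"lat": 47.61, "lon": -122.33, "sqft": 2000, "price": 800000},
    {"lat": 47.62, "lon": -122.32, "sqft": 1800, "price": 756000},
    {"lat": 47.60, "lon": -122.34, "sqft": 2200, "price": 836000},
    {"lat": 47.30, "lon": -122.20, "sqft": 2000, "price": 400000},    # similar size, far away
    {"lat": 47.61, "lon": -122.33, "sqft": 5000, "price": 3000000},   # nearby, very different size
]
subject = {"lat": 47.61, "lon": -122.33, "sqft": 2000}
estimate = comps_estimate(subject, sales)
print(estimate)
```

> Because every estimate is built from the sales list itself, this approach only works when the full, current data set is on hand, which is the maintenance cost described above.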

#### What easy-to-use features can we add to our model to increase its accuracy?

> I would consider two different models that separate urban and suburban areas, to better capture the non-decaying price per square foot in our high-value zip codes that we saw back on our square footage visualization.

> I'd source better data for this project. Data that includes the type of sale would be incredibly important to weed out non-market sales which may be hurting our model.  When I searched out new data on which to test our completed model I saw a large proportion of non-standard sales on record, and this adds noise to our data.

> I want to use latitude and longitude as a more granular location identifier in place of zip code.
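
> One simple way to turn latitude and longitude into a location feature is to bin coordinates into a fixed grid and treat each cell as a category. The sketch below is an assumption about how this could be done, not something tested in this notebook; the function name and the cell size of 100 cells per degree (roughly 1.1 km north to south) are illustrative choices.

```python
from math import floor


def grid_cell(lat, lon, cells_per_degree=100):
    """Return the (row, column) index of the grid cell containing a coordinate."""
    return (floor(lat * cells_per_degree), floor(lon * cells_per_degree))


# made-up coordinates, for illustration only
home_a = grid_cell(47.6153, -122.3312)
home_b = grid_cell(47.6187, -122.3378)   # a few blocks from home_a
home_c = grid_cell(47.6251, -122.3312)   # about a kilometer further north

print(home_a, home_b, home_c)
```

> Homes that fall in the same cell share a category, so the cell index can be encoded the same way zip code is, at a finer resolution.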

> If we add features to the model, they need to be ones that an average realtor is capable of finding and using with average Google skills. We could add frontend GUI features and backend web scraping that allow some of these lookups to be run by the model software. For example, using latitude and longitude, GreatSchools.org can tell a property's exact school assignments and ratings. A realtor doesn't know lat and long, but we can use under-the-hood web scraping that takes the address and uses an address-to-coordinates lookup such as https://www.latlong.net/Show-Latitude-Longitude.html to determine lat/long, then takes that result to GreatSchools.org to scrape for school information. Lat/long can also be used within the model in combination with the many GIS map tools available from King County about public services, parks, school districts, area income and more to obtain and enter data without the realtor needing to do any more than enter an address.


# Visualizations

## Feature Visualizations

In [None]:
# refresh on our original data frame
df

In [None]:
# get the columns we are going to make visualizations from
viz_df = df[['price', 'sqft_living', 'median_zip', 'zip_rank', 'grade']].copy() # copy so the new column is not set on a slice of df
viz_df['pr_sf'] = round(viz_df['price']/viz_df['sqft_living'], 2) # price per square foot
viz_df

In [None]:
# make simpler variables for our visualization variables
viz_target = viz_df['price']/100000
viz_sqft = viz_df['sqft_living']
viz_grade = viz_df['grade']
viz_zip = viz_df['zip_rank'] 
viz_zip2 = viz_df['median_zip']
viz_pr_sf = viz_df['pr_sf']

In [None]:
fig, ax=plt.subplots(figsize=(20,15)) # prepare our figure

sns.set(font_scale = 1.5) # set our font scale bigger for this vis

# scatter our data
scatter2 = sns.scatterplot(x="sqft_living", y=viz_target, data=viz_df, hue='grade', palette='magma_r')

# label axes and title
ax.set_xlabel('Total Square Footage', fontsize=16)
ax.set_ylabel('Price in $100,000', fontsize=16)
ax.set_title("Price to Total Square Footage\nby Grade of Materials", fontsize=20)

# label and position our legend
plt.legend(title='Grade', loc='upper left', title_fontsize=20);

# save visualization to png
#plt.savefig('images/pr_grade.png')

In [None]:
fig, ax = plt.subplots(figsize=(20, 15))

#ax.scatter(viz_sqft, viz_pr_sf, c=viz_zip, cmap='magma_r')
scatter2 = sns.scatterplot(x="sqft_living", y="pr_sf", data=viz_df, hue='grade', palette='magma_r')

# label axes and title
ax.set_xlabel('Total Square Footage', fontsize=16)
ax.set_ylabel('Price per Square Foot', fontsize=16)
ax.set_title("Price per Square Foot to Total Square Footage\nby Grade of Materials", fontsize=20)

# label and position our legend
plt.legend(title='Grade', loc='upper right', title_fontsize=20);

# save visualization to png
#plt.savefig('images/pr_sf_grade.png');

In [None]:
# prepare figure
fig, ax = plt.subplots(figsize=(20, 15))

#scatter our data
scatter3 = sns.scatterplot(x="sqft_living", y=viz_target, data=viz_df, hue='zip_rank', palette='magma_r') # y scaled to $100,000 to match the axis label
#ax.scatter(viz_sqft, viz_target, c=viz_zip, cmap='magma_r')

# label our axes and title
ax.set_xlabel('Total Square Footage', fontsize=16)
ax.set_ylabel('Price in $100,000', fontsize=16)
ax.set_title("Price to Total Square Footage\nby Zip Code Median Value Rank", fontsize=20);

# save visualization to png
#plt.savefig('images/sqft.png');

In [None]:
# prepare figure
fig, ax = plt.subplots(figsize=(20, 15))

#scatter our data
scatter3 = sns.scatterplot(x="sqft_living", y="pr_sf", data=viz_df, hue='zip_rank', palette='magma_r')
#ax.scatter(viz_sqft, viz_target, c=viz_zip, cmap='magma_r')

# label our axes and title
ax.set_xlabel('Total Square Footage', fontsize=16)
ax.set_ylabel('Price per Square Foot', fontsize=16)
ax.set_title("Price per Square Foot to Total Square Footage\nby Zip Code Median Value Rank", fontsize=20);

# save visualization to png
#plt.savefig('images/pr_sf_zip.png');

In [None]:
viz_y = viz_df['price']
viz_x = viz_df.drop('price', axis=1)

fig = plt.figure(figsize=(20,15))
ax = fig.add_subplot(111, projection='3d')

ax.scatter(viz_sqft, viz_grade, viz_target, c=viz_zip, cmap='magma_r')
#ax.scatter(viz_sqft, viz_grade, viz_target, c='red', label="Predictions")
#ax.scatter(viz_sqft, viz_grade, end_z/100000, c='green', label="Actuals")

ax.set_xlabel('Square Feet of Living Space', fontsize=12)
ax.set_ylabel('Grade of Materials', fontsize=12)
ax.set_zlabel('Price in $100,000', fontsize=12)

ax.set_title("Price per Square Footage and Grade of Materials, by Zip Median Rank", fontsize=20)

# first num is tilt angle, second num is turn angle
# default is about 30,305
# 0, 270 creates side view of pr/sqft
# 0, 360 creates side view of pr/grade
ax.view_init(30, 305)


# save visualization to png
#plt.savefig('images/3d_feats.png');

# DEPRECATED

In [22]:
# load and look at our austin housing data
df = pd.read_csv('newyork_housing.csv')
df.head()

Unnamed: 0,address/city,address/community,address/neighborhood,address/state,address/streetAddress,address/subdivision,address/zipcode,bathrooms,bedrooms,currency,dateposted,description,homeStatus,latitude,livingArea,longitude,photos/0,photos/1,photos/2,photos/3,photos/4,photos/5,photos/6,photos/7,photos/8,photos/9,photos/10,photos/11,photos/12,photos/13,photos/14,photos/15,photos/16,photos/17,photos/18,photos/19,photos/20,photos/21,photos/22,photos/23,photos/24,photos/25,photos/26,photos/27,photos/28,photos/29,photos/30,photos/31,photos/32,photos/33,photos/34,photos/35,photos/36,photos/37,photos/38,photos/39,photos/40,photos/41,photos/42,photos/43,photos/44,photos/45,photos/46,photos/47,photos/48,photos/49,photos/50,photos/51,photos/52,photos/53,photos/54,photos/55,photos/56,photos/57,photos/58,photos/59,photos/60,photos/61,photos/62,photos/63,photos/64,photos/65,photos/66,photos/67,photos/68,photos/69,photos/70,photos/71,photos/72,photos/73,photos/74,photos/75,photos/76,photos/77,photos/78,photos/79,photos/80,photos/81,photos/82,photos/83,photos/84,photos/85,photos/86,photos/87,photos/88,photos/89,photos/90,photos/91,photos/92,photos/93,photos/94,photos/95,photos/96,photos/97,photos/98,photos/99,photos/100,photos/101,photos/102,photos/103,photos/104,photos/105,photos/106,photos/107,photos/108,photos/109,photos/110,photos/111,photos/112,photos/113,photos/114,photos/115,photos/116,photos/117,photos/118,photos/119,photos/120,photos/121,photos/122,photos/123,photos/124,photos/125,photos/126,photos/127,photos/128,photos/129,photos/130,photos/131,photos/132,photos/133,photos/134,photos/135,photos/136,photos/137,photos/138,photos/139,photos/140,photos/141,photos/142,photos/143,photos/144,photos/145,photos/146,photos/147,photos/148,photos/149,photos/150,photos/151,photos/152,photos/153,photos/154,photos/155,photos/156,photos/157,photos/158,photos/159,photos/160,photos/161,photos/162,photos/163,photos/164,photos/165,photos/166,photos/167,photos/168,photos/169,photos/170,ph
otos/171,photos/172,photos/173,photos/174,photos/175,photos/176,photos/177,photos/178,photos/179,photos/180,photos/181,photos/182,photos/183,photos/184,photos/185,photos/186,photos/187,photos/188,photos/189,photos/190,photos/191,photos/192,photos/193,photos/194,photos/195,photos/196,photos/197,photos/198,photos/199,photos/200,photos/201,photos/202,photos/203,photos/204,photos/205,photos/206,photos/207,photos/208,photos/209,photos/210,photos/211,photos/212,photos/213,photos/214,photos/215,photos/216,photos/217,photos/218,photos/219,photos/220,photos/221,photos/222,photos/223,photos/224,photos/225,photos/226,photos/227,photos/228,photos/229,photos/230,photos/231,photos/232,photos/233,photos/234,photos/235,photos/236,photos/237,photos/238,photos/239,photos/240,photos/241,photos/242,photos/243,photos/244,photos/245,photos/246,photos/247,photos/248,photos/249,photos/250,photos/251,photos/252,photos/253,photos/254,photos/255,photos/256,photos/257,photos/258,photos/259,photos/260,photos/261,photos/262,photos/263,photos/264,photos/265,photos/266,photos/267,photos/268,photos/269,photos/270,photos/271,photos/272,photos/273,photos/274,photos/275,photos/276,photos/277,price,priceHistory,priceHistory/0/attributeSource/infoString1,priceHistory/0/attributeSource/infoString2,priceHistory/0/attributeSource/infoString3,priceHistory/0/buyerAgent,priceHistory/0/buyerAgent/name,priceHistory/0/buyerAgent/photo,priceHistory/0/buyerAgent/photo/url,priceHistory/0/buyerAgent/profileUrl,priceHistory/0/event,priceHistory/0/postingIsRental,priceHistory/0/price,priceHistory/0/priceChangeRate,priceHistory/0/sellerAgent,priceHistory/0/sellerAgent/name,priceHistory/0/sellerAgent/photo,priceHistory/0/sellerAgent/photo/url,priceHistory/0/sellerAgent/profileUrl,priceHistory/0/showCountyLink,priceHistory/0/source,priceHistory/0/time,priceHistory/1/attributeSource/infoString1,priceHistory/1/attributeSource/infoString2,priceHistory/1/attributeSource/infoString3,priceHistory/1/buyerAgent,priceHistory/1/bu
yerAgent/name,priceHistory/1/buyerAgent/photo,priceHistory/1/buyerAgent/photo/url,priceHistory/1/buyerAgent/profileUrl,priceHistory/1/event,priceHistory/1/postingIsRental,priceHistory/1/price,priceHistory/1/priceChangeRate,priceHistory/1/sellerAgent,priceHistory/1/sellerAgent/name,priceHistory/1/sellerAgent/photo,priceHistory/1/sellerAgent/photo/url,priceHistory/1/sellerAgent/profileUrl,priceHistory/1/showCountyLink,priceHistory/1/source,priceHistory/1/time,priceHistory/2/attributeSource/infoString1,priceHistory/2/attributeSource/infoString2,priceHistory/2/attributeSource/infoString3,priceHistory/2/buyerAgent,priceHistory/2/buyerAgent/name,priceHistory/2/buyerAgent/photo,priceHistory/2/buyerAgent/photo/url,priceHistory/2/buyerAgent/profileUrl,priceHistory/2/event,priceHistory/2/postingIsRental,priceHistory/2/price,priceHistory/2/priceChangeRate,priceHistory/2/sellerAgent,priceHistory/2/sellerAgent/name,priceHistory/2/sellerAgent/photo,priceHistory/2/sellerAgent/photo/url,priceHistory/2/sellerAgent/profileUrl,priceHistory/2/showCountyLink,priceHistory/2/source,priceHistory/2/time,priceHistory/3/attributeSource/infoString1,priceHistory/3/attributeSource/infoString2,priceHistory/3/attributeSource/infoString3,priceHistory/3/buyerAgent,priceHistory/3/buyerAgent/name,priceHistory/3/buyerAgent/photo,priceHistory/3/buyerAgent/photo/url,priceHistory/3/buyerAgent/profileUrl,priceHistory/3/event,priceHistory/3/postingIsRental,priceHistory/3/price,priceHistory/3/priceChangeRate,priceHistory/3/sellerAgent,priceHistory/3/sellerAgent/name,priceHistory/3/sellerAgent/photo,priceHistory/3/sellerAgent/photo/url,priceHistory/3/sellerAgent/profileUrl,priceHistory/3/showCountyLink,priceHistory/3/source,priceHistory/3/time,priceHistory/4/attributeSource/infoString1,priceHistory/4/attributeSource/infoString2,priceHistory/4/attributeSource/infoString3,priceHistory/4/buyerAgent,priceHistory/4/buyerAgent/name,priceHistory/4/buyerAgent/photo,priceHistory/4/buyerAgent/photo/url,priceHistory/4/b
uyerAgent/profileUrl,priceHistory/4/event,priceHistory/4/postingIsRental,priceHistory/4/price,priceHistory/4/priceChangeRate,priceHistory/4/sellerAgent,priceHistory/4/sellerAgent/name,priceHistory/4/sellerAgent/photo,priceHistory/4/sellerAgent/photo/url,priceHistory/4/sellerAgent/profileUrl,priceHistory/4/showCountyLink,priceHistory/4/source,priceHistory/4/time,priceHistory/5/attributeSource/infoString1,priceHistory/5/attributeSource/infoString2,priceHistory/5/attributeSource/infoString3,priceHistory/5/buyerAgent,priceHistory/5/buyerAgent/name,priceHistory/5/buyerAgent/photo,priceHistory/5/buyerAgent/photo/url,priceHistory/5/buyerAgent/profileUrl,priceHistory/5/event,priceHistory/5/postingIsRental,priceHistory/5/price,priceHistory/5/priceChangeRate,priceHistory/5/sellerAgent,priceHistory/5/sellerAgent/name,priceHistory/5/sellerAgent/photo,priceHistory/5/sellerAgent/photo/url,priceHistory/5/sellerAgent/profileUrl,priceHistory/5/showCountyLink,priceHistory/5/source,priceHistory/5/time,priceHistory/6/attributeSource/infoString1,priceHistory/6/attributeSource/infoString2,priceHistory/6/attributeSource/infoString3,priceHistory/6/buyerAgent,priceHistory/6/buyerAgent/name,priceHistory/6/buyerAgent/photo,priceHistory/6/buyerAgent/photo/url,priceHistory/6/buyerAgent/profileUrl,priceHistory/6/event,priceHistory/6/postingIsRental,priceHistory/6/price,priceHistory/6/priceChangeRate,priceHistory/6/sellerAgent,priceHistory/6/sellerAgent/name,priceHistory/6/sellerAgent/photo,priceHistory/6/sellerAgent/photo/url,priceHistory/6/sellerAgent/profileUrl,priceHistory/6/showCountyLink,priceHistory/6/source,priceHistory/6/time,priceHistory/7/attributeSource/infoString1,priceHistory/7/attributeSource/infoString2,priceHistory/7/attributeSource/infoString3,priceHistory/7/buyerAgent,priceHistory/7/buyerAgent/name,priceHistory/7/buyerAgent/photo,priceHistory/7/buyerAgent/photo/url,priceHistory/7/buyerAgent/profileUrl,priceHistory/7/event,priceHistory/7/postingIsRental,priceHistory/7/price,pric
eHistory/7/priceChangeRate,priceHistory/7/sellerAgent,priceHistory/7/sellerAgent/name,priceHistory/7/sellerAgent/photo,priceHistory/7/sellerAgent/photo/url,priceHistory/7/sellerAgent/profileUrl,priceHistory/7/showCountyLink,priceHistory/7/source,priceHistory/7/time,priceHistory/8/attributeSource/infoString1,priceHistory/8/attributeSource/infoString2,priceHistory/8/attributeSource/infoString3,priceHistory/8/buyerAgent,priceHistory/8/buyerAgent/name,priceHistory/8/buyerAgent/photo,priceHistory/8/buyerAgent/photo/url,priceHistory/8/buyerAgent/profileUrl,priceHistory/8/event,priceHistory/8/postingIsRental,priceHistory/8/price,priceHistory/8/priceChangeRate,priceHistory/8/sellerAgent,priceHistory/8/sellerAgent/name,priceHistory/8/sellerAgent/photo,priceHistory/8/sellerAgent/photo/url,priceHistory/8/sellerAgent/profileUrl,priceHistory/8/showCountyLink,priceHistory/8/source,priceHistory/8/time,priceHistory/9/attributeSource/infoString1,priceHistory/9/attributeSource/infoString2,priceHistory/9/attributeSource/infoString3,priceHistory/9/buyerAgent,priceHistory/9/buyerAgent/name,priceHistory/9/buyerAgent/photo,priceHistory/9/buyerAgent/photo/url,priceHistory/9/buyerAgent/profileUrl,priceHistory/9/event,priceHistory/9/postingIsRental,priceHistory/9/price,priceHistory/9/priceChangeRate,priceHistory/9/sellerAgent,priceHistory/9/sellerAgent/name,priceHistory/9/sellerAgent/photo,priceHistory/9/sellerAgent/photo/url,priceHistory/9/sellerAgent/profileUrl,priceHistory/9/showCountyLink,priceHistory/9/source,priceHistory/9/time,priceHistory/10/attributeSource/infoString1,priceHistory/10/attributeSource/infoString2,priceHistory/10/attributeSource/infoString3,priceHistory/10/buyerAgent,priceHistory/10/buyerAgent/name,priceHistory/10/buyerAgent/photo,priceHistory/10/buyerAgent/photo/url,priceHistory/10/buyerAgent/profileUrl,priceHistory/10/event,priceHistory/10/postingIsRental,priceHistory/10/price,priceHistory/10/priceChangeRate,priceHistory/10/sellerAgent,priceHistory/10/sellerAgent/nam
e,priceHistory/10/sellerAgent/photo,priceHistory/10/sellerAgent/photo/url,priceHistory/10/sellerAgent/profileUrl,priceHistory/10/showCountyLink,priceHistory/10/source,priceHistory/10/time,priceHistory/11/attributeSource/infoString1,priceHistory/11/attributeSource/infoString2,priceHistory/11/attributeSource/infoString3,priceHistory/11/buyerAgent,priceHistory/11/buyerAgent/name,priceHistory/11/buyerAgent/photo,priceHistory/11/buyerAgent/photo/url,priceHistory/11/buyerAgent/profileUrl,priceHistory/11/event,priceHistory/11/postingIsRental,priceHistory/11/price,priceHistory/11/priceChangeRate,priceHistory/11/sellerAgent,priceHistory/11/sellerAgent/name,priceHistory/11/sellerAgent/photo,priceHistory/11/sellerAgent/photo/url,priceHistory/11/sellerAgent/profileUrl,priceHistory/11/showCountyLink,priceHistory/11/source,priceHistory/11/time,priceHistory/12/attributeSource/infoString1,priceHistory/12/attributeSource/infoString2,priceHistory/12/attributeSource/infoString3,priceHistory/12/buyerAgent,priceHistory/12/buyerAgent/name,priceHistory/12/buyerAgent/photo/url,priceHistory/12/buyerAgent/profileUrl,priceHistory/12/event,priceHistory/12/postingIsRental,priceHistory/12/price,priceHistory/12/priceChangeRate,priceHistory/12/sellerAgent,priceHistory/12/sellerAgent/name,priceHistory/12/sellerAgent/photo,priceHistory/12/sellerAgent/photo/url,priceHistory/12/sellerAgent/profileUrl,priceHistory/12/showCountyLink,priceHistory/12/source,priceHistory/12/time,priceHistory/13/attributeSource/infoString1,priceHistory/13/attributeSource/infoString2,priceHistory/13/attributeSource/infoString3,priceHistory/13/buyerAgent,priceHistory/13/buyerAgent/name,priceHistory/13/buyerAgent/photo,priceHistory/13/buyerAgent/photo/url,priceHistory/13/buyerAgent/profileUrl,priceHistory/13/event,priceHistory/13/postingIsRental,priceHistory/13/price,priceHistory/13/priceChangeRate,priceHistory/13/sellerAgent,priceHistory/13/sellerAgent/name,priceHistory/13/sellerAgent/photo,priceHistory/13/sellerAgent/photo/u
rl,priceHistory/13/sellerAgent/profileUrl,priceHistory/13/showCountyLink,priceHistory/13/source,priceHistory/13/time,priceHistory/14/attributeSource/infoString1,priceHistory/14/attributeSource/infoString2,priceHistory/14/attributeSource/infoString3,priceHistory/14/buyerAgent,priceHistory/14/buyerAgent/name,priceHistory/14/buyerAgent/photo,priceHistory/14/buyerAgent/photo/url,priceHistory/14/buyerAgent/profileUrl,priceHistory/14/event,priceHistory/14/postingIsRental,priceHistory/14/price,priceHistory/14/priceChangeRate,priceHistory/14/sellerAgent,priceHistory/14/sellerAgent/name,priceHistory/14/sellerAgent/photo,priceHistory/14/sellerAgent/photo/url,priceHistory/14/sellerAgent/profileUrl,priceHistory/14/showCountyLink,priceHistory/14/source,priceHistory/14/time,priceHistory/15/attributeSource/infoString1,priceHistory/15/attributeSource/infoString2,priceHistory/15/attributeSource/infoString3,priceHistory/15/buyerAgent,priceHistory/15/buyerAgent/name,priceHistory/15/buyerAgent/photo,priceHistory/15/buyerAgent/photo/url,priceHistory/15/buyerAgent/profileUrl,priceHistory/15/event,priceHistory/15/postingIsRental,priceHistory/15/price,priceHistory/15/priceChangeRate,priceHistory/15/sellerAgent,priceHistory/15/sellerAgent/name,priceHistory/15/sellerAgent/photo,priceHistory/15/sellerAgent/photo/url,priceHistory/15/sellerAgent/profileUrl,priceHistory/15/showCountyLink,priceHistory/15/source,priceHistory/15/time,priceHistory/16/attributeSource/infoString1,priceHistory/16/attributeSource/infoString2,priceHistory/16/attributeSource/infoString3,priceHistory/16/buyerAgent,priceHistory/16/buyerAgent/name,priceHistory/16/buyerAgent/photo/url,priceHistory/16/buyerAgent/profileUrl,priceHistory/16/event,priceHistory/16/postingIsRental,priceHistory/16/price,priceHistory/16/priceChangeRate,priceHistory/16/sellerAgent,priceHistory/16/sellerAgent/name,priceHistory/16/sellerAgent/photo,priceHistory/16/sellerAgent/photo/url,priceHistory/16/sellerAgent/profileUrl,priceHistory/16/showCountyLin
k,priceHistory/16/source,priceHistory/16/time,priceHistory/17/attributeSource/infoString1,priceHistory/17/attributeSource/infoString2,priceHistory/17/attributeSource/infoString3,priceHistory/17/buyerAgent,priceHistory/17/buyerAgent/name,priceHistory/17/buyerAgent/photo,priceHistory/17/buyerAgent/photo/url,priceHistory/17/buyerAgent/profileUrl,priceHistory/17/event,priceHistory/17/postingIsRental,priceHistory/17/price,priceHistory/17/priceChangeRate,priceHistory/17/sellerAgent,priceHistory/17/sellerAgent/name,priceHistory/17/sellerAgent/photo,priceHistory/17/sellerAgent/photo/url,priceHistory/17/sellerAgent/profileUrl,priceHistory/17/showCountyLink,priceHistory/17/source,priceHistory/17/time,priceHistory/18/attributeSource/infoString1,priceHistory/18/attributeSource/infoString2,priceHistory/18/attributeSource/infoString3,priceHistory/18/buyerAgent,priceHistory/18/buyerAgent/name,priceHistory/18/buyerAgent/photo/url,priceHistory/18/buyerAgent/profileUrl,priceHistory/18/event,priceHistory/18/postingIsRental,priceHistory/18/price,priceHistory/18/priceChangeRate,priceHistory/18/sellerAgent,priceHistory/18/sellerAgent/name,priceHistory/18/sellerAgent/photo,priceHistory/18/sellerAgent/photo/url,priceHistory/18/sellerAgent/profileUrl,priceHistory/18/showCountyLink,priceHistory/18/source,priceHistory/18/time,priceHistory/19/attributeSource/infoString1,priceHistory/19/attributeSource/infoString2,priceHistory/19/attributeSource/infoString3,priceHistory/19/buyerAgent,priceHistory/19/buyerAgent/name,priceHistory/19/buyerAgent/photo/url,priceHistory/19/buyerAgent/profileUrl,priceHistory/19/event,priceHistory/19/postingIsRental,priceHistory/19/price,priceHistory/19/priceChangeRate,priceHistory/19/sellerAgent,priceHistory/19/sellerAgent/name,priceHistory/19/sellerAgent/photo,priceHistory/19/sellerAgent/photo/url,priceHistory/19/sellerAgent/profileUrl,priceHistory/19/showCountyLink,priceHistory/19/source,priceHistory/19/time,priceHistory/20/attributeSource/infoString1,priceHistory/2
0/attributeSource/infoString2,priceHistory/20/attributeSource/infoString3,priceHistory/20/buyerAgent,priceHistory/20/buyerAgent/name,priceHistory/20/buyerAgent/photo/url,priceHistory/20/buyerAgent/profileUrl,priceHistory/20/event,priceHistory/20/postingIsRental,priceHistory/20/price,priceHistory/20/priceChangeRate,priceHistory/20/sellerAgent,priceHistory/20/sellerAgent/name,priceHistory/20/sellerAgent/photo/url,priceHistory/20/sellerAgent/profileUrl,priceHistory/20/showCountyLink,priceHistory/20/source,priceHistory/20/time,priceHistory/21/attributeSource/infoString1,priceHistory/21/attributeSource/infoString2,priceHistory/21/attributeSource/infoString3,priceHistory/21/buyerAgent,priceHistory/21/buyerAgent/name,priceHistory/21/buyerAgent/photo/url,priceHistory/21/buyerAgent/profileUrl,priceHistory/21/event,priceHistory/21/postingIsRental,priceHistory/21/price,priceHistory/21/priceChangeRate,priceHistory/21/sellerAgent,priceHistory/21/sellerAgent/name,priceHistory/21/sellerAgent/photo/url,priceHistory/21/sellerAgent/profileUrl,priceHistory/21/showCountyLink,priceHistory/21/source,priceHistory/21/time,priceHistory/22/attributeSource/infoString1,priceHistory/22/attributeSource/infoString2,priceHistory/22/attributeSource/infoString3,priceHistory/22/buyerAgent,priceHistory/22/buyerAgent/name,priceHistory/22/buyerAgent/photo,priceHistory/22/buyerAgent/photo/url,priceHistory/22/buyerAgent/profileUrl,priceHistory/22/event,priceHistory/22/postingIsRental,priceHistory/22/price,priceHistory/22/priceChangeRate,priceHistory/22/sellerAgent,priceHistory/22/sellerAgent/name,priceHistory/22/sellerAgent/photo,priceHistory/22/sellerAgent/photo/url,priceHistory/22/sellerAgent/profileUrl,priceHistory/22/showCountyLink,priceHistory/22/source,priceHistory/22/time,priceHistory/23/attributeSource/infoString1,priceHistory/23/attributeSource/infoString2,priceHistory/23/attributeSource/infoString3,priceHistory/23/buyerAgent,priceHistory/23/buyerAgent/name,priceHistory/23/buyerAgent/photo/url,pr
iceHistory/23/buyerAgent/profileUrl,priceHistory/23/event,priceHistory/23/postingIsRental,priceHistory/23/price,priceHistory/23/priceChangeRate,priceHistory/23/sellerAgent,priceHistory/23/sellerAgent/name,priceHistory/23/sellerAgent/photo,priceHistory/23/sellerAgent/photo/url,priceHistory/23/sellerAgent/profileUrl,priceHistory/23/showCountyLink,priceHistory/23/source,priceHistory/23/time,priceHistory/24/attributeSource/infoString1,priceHistory/24/attributeSource/infoString2,priceHistory/24/attributeSource/infoString3,priceHistory/24/buyerAgent,priceHistory/24/buyerAgent/name,priceHistory/24/buyerAgent/photo/url,priceHistory/24/buyerAgent/profileUrl,priceHistory/24/event,priceHistory/24/postingIsRental,priceHistory/24/price,priceHistory/24/priceChangeRate,priceHistory/24/sellerAgent,priceHistory/24/sellerAgent/name,priceHistory/24/sellerAgent/photo/url,priceHistory/24/sellerAgent/profileUrl,priceHistory/24/showCountyLink,priceHistory/24/source,priceHistory/24/time,priceHistory/25/attributeSource/infoString1,priceHistory/25/attributeSource/infoString2,priceHistory/25/attributeSource/infoString3,priceHistory/25/buyerAgent,priceHistory/25/buyerAgent/name,priceHistory/25/buyerAgent/photo/url,priceHistory/25/buyerAgent/profileUrl,priceHistory/25/event,priceHistory/25/postingIsRental,priceHistory/25/price,priceHistory/25/priceChangeRate,priceHistory/25/sellerAgent,priceHistory/25/sellerAgent/name,priceHistory/25/sellerAgent/photo/url,priceHistory/25/sellerAgent/profileUrl,priceHistory/25/showCountyLink,priceHistory/25/source,priceHistory/25/time,priceHistory/26/attributeSource/infoString1,priceHistory/26/attributeSource/infoString2,priceHistory/26/attributeSource/infoString3,priceHistory/26/buyerAgent,priceHistory/26/buyerAgent/name,priceHistory/26/buyerAgent/photo/url,priceHistory/26/buyerAgent/profileUrl,priceHistory/26/event,priceHistory/26/postingIsRental,priceHistory/26/price,priceHistory/26/priceChangeRate,priceHistory/26/sellerAgent,priceHistory/26/showCountyLink,pr
iceHistory/26/source,priceHistory/26/time,priceHistory/27/attributeSource/infoString1,priceHistory/27/attributeSource/infoString2,priceHistory/27/attributeSource/infoString3,priceHistory/27/buyerAgent,priceHistory/27/buyerAgent/name,priceHistory/27/buyerAgent/photo/url,priceHistory/27/buyerAgent/profileUrl,priceHistory/27/event,priceHistory/27/postingIsRental,priceHistory/27/price,priceHistory/27/priceChangeRate,priceHistory/27/sellerAgent,priceHistory/27/sellerAgent/name,priceHistory/27/sellerAgent/photo/url,priceHistory/27/sellerAgent/profileUrl,priceHistory/27/showCountyLink,priceHistory/27/source,priceHistory/27/time,priceHistory/28/attributeSource/infoString1,priceHistory/28/attributeSource/infoString2,priceHistory/28/attributeSource/infoString3,priceHistory/28/buyerAgent,priceHistory/28/event,priceHistory/28/postingIsRental,priceHistory/28/price,priceHistory/28/priceChangeRate,priceHistory/28/sellerAgent,priceHistory/28/showCountyLink,priceHistory/28/source,priceHistory/28/time,priceHistory/29/attributeSource/infoString1,priceHistory/29/attributeSource/infoString2,priceHistory/29/attributeSource/infoString3,priceHistory/29/buyerAgent,priceHistory/29/event,priceHistory/29/postingIsRental,priceHistory/29/price,priceHistory/29/priceChangeRate,priceHistory/29/sellerAgent,priceHistory/29/showCountyLink,priceHistory/29/source,priceHistory/29/time,propertyTaxRate,resoFactsStats/aboveGradeFinishedArea,resoFactsStats/accessibilityFeatures,resoFactsStats/accessibilityFeatures/0,resoFactsStats/accessibilityFeatures/1,resoFactsStats/additionalParcelsDescription,resoFactsStats/appliances,resoFactsStats/appliances/0,resoFactsStats/appliances/1,resoFactsStats/appliances/2,resoFactsStats/appliances/3,resoFactsStats/appliances/4,resoFactsStats/appliances/5,resoFactsStats/appliances/6,resoFactsStats/appliances/7,resoFactsStats/appliances/8,resoFactsStats/appliances/9,resoFactsStats/appliances/10,resoFactsStats/appliances/11,resoFactsStats/appliances/12,resoFactsStats/appliances
/13,resoFactsStats/appliances/14,resoFactsStats/architecturalStyle,resoFactsStats/associationAmenities,resoFactsStats/associationAmenities/0,resoFactsStats/associationAmenities/1,resoFactsStats/associationFee,resoFactsStats/associationFee2,resoFactsStats/associationFeeIncludes,resoFactsStats/associationFeeIncludes/0,resoFactsStats/associationFeeIncludes/1,resoFactsStats/associationFeeIncludes/2,resoFactsStats/associationFeeIncludes/3,resoFactsStats/associationFeeIncludes/4,resoFactsStats/associationFeeIncludes/5,resoFactsStats/associationFeeIncludes/6,resoFactsStats/associationFeeIncludes/7,resoFactsStats/associationFeeIncludes/8,resoFactsStats/associationFeeIncludes/9,resoFactsStats/associationFeeIncludes/10,resoFactsStats/associationFeeIncludes/11,resoFactsStats/associationName,resoFactsStats/associationName2,resoFactsStats/associationPhone,resoFactsStats/associationPhone2,resoFactsStats/atAGlanceFacts,resoFactsStats/atAGlanceFacts/0/factLabel,resoFactsStats/atAGlanceFacts/0/factValue,resoFactsStats/atAGlanceFacts/1/factLabel,resoFactsStats/atAGlanceFacts/1/factValue,resoFactsStats/atAGlanceFacts/2/factLabel,resoFactsStats/atAGlanceFacts/2/factValue,resoFactsStats/atAGlanceFacts/3/factLabel,resoFactsStats/atAGlanceFacts/3/factValue,resoFactsStats/atAGlanceFacts/4/factLabel,resoFactsStats/atAGlanceFacts/4/factValue,resoFactsStats/atAGlanceFacts/5/factLabel,resoFactsStats/atAGlanceFacts/5/factValue,resoFactsStats/atAGlanceFacts/6/factLabel,resoFactsStats/atAGlanceFacts/6/factValue,resoFactsStats/atAGlanceFacts/7/factLabel,resoFactsStats/atAGlanceFacts/7/factValue,resoFactsStats/atAGlanceFacts/8/factLabel,resoFactsStats/atAGlanceFacts/8/factValue,resoFactsStats/basement,resoFactsStats/bathrooms,resoFactsStats/bathroomsFull,resoFactsStats/bathroomsHalf,resoFactsStats/bathroomsOneQuarter,resoFactsStats/bathroomsPartial,resoFactsStats/bathroomsThreeQuarter,resoFactsStats/bedrooms,resoFactsStats/belowGradeFinishedArea,resoFactsStats/builderModel,resoFactsStats/builderNam
e,resoFactsStats/buildingArea,resoFactsStats/buildingAreaSource,resoFactsStats/buildingFeatures,resoFactsStats/buildingFeatures/0,resoFactsStats/buildingFeatures/1,resoFactsStats/buildingFeatures/2,resoFactsStats/buildingFeatures/3,resoFactsStats/buildingFeatures/4,resoFactsStats/buildingFeatures/5,resoFactsStats/buildingName,resoFactsStats/canRaiseHorses,resoFactsStats/carportSpaces,resoFactsStats/cityRegion,resoFactsStats/commonWalls,resoFactsStats/communityFeatures,resoFactsStats/communityFeatures/0,resoFactsStats/communityFeatures/1,resoFactsStats/communityFeatures/2,resoFactsStats/communityFeatures/3,resoFactsStats/communityFeatures/4,resoFactsStats/communityFeatures/5,resoFactsStats/constructionMaterials,resoFactsStats/constructionMaterials/0,resoFactsStats/constructionMaterials/1,resoFactsStats/constructionMaterials/2,resoFactsStats/constructionMaterials/3,resoFactsStats/constructionMaterials/4,resoFactsStats/constructionMaterials/5,resoFactsStats/cooling,resoFactsStats/cooling/0,resoFactsStats/cooling/1,resoFactsStats/cooling/2,resoFactsStats/cooling/3,resoFactsStats/coveredSpaces,resoFactsStats/developmentStatus,resoFactsStats/electric,resoFactsStats/electric/0,resoFactsStats/electric/1,resoFactsStats/electric/2,resoFactsStats/electric/3,resoFactsStats/electric/4,resoFactsStats/elementarySchool,resoFactsStats/elementarySchoolDistrict,resoFactsStats/entryLevel,resoFactsStats/entryLocation,resoFactsStats/exteriorFeatures,resoFactsStats/exteriorFeatures/0,resoFactsStats/exteriorFeatures/1,resoFactsStats/exteriorFeatures/2,resoFactsStats/exteriorFeatures/3,resoFactsStats/exteriorFeatures/4,resoFactsStats/exteriorFeatures/5,resoFactsStats/exteriorFeatures/6,resoFactsStats/exteriorFeatures/7,resoFactsStats/exteriorFeatures/8,resoFactsStats/exteriorFeatures/9,resoFactsStats/exteriorFeatures/10,resoFactsStats/fencing,resoFactsStats/fireplaceFeatures,resoFactsStats/fireplaceFeatures/0,resoFactsStats/fireplaces,resoFactsStats/flooring,resoFactsStats/flooring/0,resoFa
ctsStats/flooring/1,resoFactsStats/flooring/2,resoFactsStats/flooring/3,resoFactsStats/flooring/4,resoFactsStats/flooring/5,resoFactsStats/flooring/6,resoFactsStats/flooring/7,resoFactsStats/flooring/8,resoFactsStats/foundationDetails,resoFactsStats/foundationDetails/0,resoFactsStats/foundationDetails/1,resoFactsStats/foundationDetails/2,resoFactsStats/frontageLength,resoFactsStats/frontageType,resoFactsStats/furnished,resoFactsStats/garageSpaces,resoFactsStats/gas,resoFactsStats/gas/0,resoFactsStats/greenBuildingVerificationType,resoFactsStats/greenBuildingVerificationType/0,resoFactsStats/greenBuildingVerificationType/1,resoFactsStats/greenEnergyEfficient,resoFactsStats/greenEnergyEfficient/0,resoFactsStats/greenEnergyEfficient/1,resoFactsStats/greenEnergyEfficient/2,resoFactsStats/greenEnergyEfficient/3,resoFactsStats/greenIndoorAirQuality,resoFactsStats/greenSustainability,resoFactsStats/greenSustainability/0,resoFactsStats/greenSustainability/1,resoFactsStats/greenSustainability/2,resoFactsStats/greenSustainability/3,resoFactsStats/greenSustainability/4,resoFactsStats/greenSustainability/5,resoFactsStats/greenWaterConservation,resoFactsStats/greenWaterConservation/0,resoFactsStats/greenWaterConservation/1,resoFactsStats/greenWaterConservation/2,resoFactsStats/hasAdditionalParcels,resoFactsStats/hasAssociation,resoFactsStats/hasAttachedGarage,resoFactsStats/hasAttachedProperty,resoFactsStats/hasCarport,resoFactsStats/hasCooling,resoFactsStats/hasElectricOnProperty,resoFactsStats/hasFireplace,resoFactsStats/hasGarage,resoFactsStats/hasHeating,resoFactsStats/hasHomeWarranty,resoFactsStats/hasLandLease,resoFactsStats/hasOpenParking,resoFactsStats/hasPetsAllowed,resoFactsStats/hasPrivatePool,resoFactsStats/hasRentControl,resoFactsStats/hasSpa,resoFactsStats/hasView,resoFactsStats/hasWaterfrontView,resoFactsStats/heating,resoFactsStats/heating/0,resoFactsStats/heating/1,resoFactsStats/heating/2,resoFactsStats/heating/3,resoFactsStats/heating/4,resoFactsStats/heating/
5,resoFactsStats/heating/6,resoFactsStats/heating/7,resoFactsStats/heating/8,resoFactsStats/heating/9,resoFactsStats/heating/10,resoFactsStats/heating/11,resoFactsStats/heating/12,resoFactsStats/highSchool,resoFactsStats/highSchoolDistrict,resoFactsStats/homeType,resoFactsStats/isNewConstruction,resoFactsStats/isSeniorCommunity,resoFactsStats/landLeaseAmount,resoFactsStats/laundryFeatures,resoFactsStats/laundryFeatures/0,resoFactsStats/laundryFeatures/1,resoFactsStats/levels,resoFactsStats/listAOR,resoFactsStats/listingId,resoFactsStats/livingArea,resoFactsStats/lotSize,resoFactsStats/lotSizeDimensions,resoFactsStats/mainLevelBathrooms,resoFactsStats/middleOrJuniorSchool,resoFactsStats/middleOrJuniorSchoolDistrict,resoFactsStats/numberOfUnitsInCommunity,resoFactsStats/numberOfUnitsVacant,resoFactsStats/onMarketDate,resoFactsStats/openParkingSpaces,resoFactsStats/otherFacts/0/name,resoFactsStats/otherFacts/0/value,resoFactsStats/otherFacts/1/name,resoFactsStats/otherFacts/1/value,resoFactsStats/otherFacts/2/name,resoFactsStats/otherFacts/2/value,resoFactsStats/otherFacts/3/name,resoFactsStats/otherFacts/3/value,resoFactsStats/otherFacts/4/name,resoFactsStats/otherFacts/4/value,resoFactsStats/otherFacts/5/name,resoFactsStats/otherFacts/5/value,resoFactsStats/otherFacts/6/name,resoFactsStats/otherFacts/6/value,resoFactsStats/otherFacts/7/name,resoFactsStats/otherFacts/7/value,resoFactsStats/otherFacts/8/name,resoFactsStats/otherFacts/8/value,resoFactsStats/otherFacts/9/name,resoFactsStats/otherFacts/9/value,resoFactsStats/otherFacts/10/name,resoFactsStats/otherFacts/10/value,resoFactsStats/otherFacts/11/name,resoFactsStats/otherFacts/11/value,resoFactsStats/otherFacts/12/name,resoFactsStats/otherFacts/12/value,resoFactsStats/otherFacts/13/name,resoFactsStats/otherFacts/13/value,resoFactsStats/otherFacts/14/name,resoFactsStats/otherFacts/14/value,resoFactsStats/otherFacts/15/name,resoFactsStats/otherFacts/15/value,resoFactsStats/otherFacts/16/name,resoFactsStats/otherFa
cts/16/value,resoFactsStats/otherFacts/17/name,resoFactsStats/otherFacts/17/value,resoFactsStats/otherFacts/18/name,resoFactsStats/otherFacts/18/value,resoFactsStats/otherFacts/19/name,resoFactsStats/otherFacts/19/value,resoFactsStats/otherFacts/20/name,resoFactsStats/otherFacts/20/value,resoFactsStats/otherFacts/21/name,resoFactsStats/otherFacts/21/value,resoFactsStats/otherFacts/22/name,resoFactsStats/otherFacts/22/value,resoFactsStats/otherFacts/23/name,resoFactsStats/otherFacts/23/value,resoFactsStats/otherFacts/24/name,resoFactsStats/otherFacts/24/value,resoFactsStats/otherFacts/25/name,resoFactsStats/otherFacts/25/value,resoFactsStats/otherFacts/26/name,resoFactsStats/otherFacts/26/value,resoFactsStats/otherFacts/27/name,resoFactsStats/otherFacts/27/value,resoFactsStats/otherFacts/28/name,resoFactsStats/otherFacts/28/value,resoFactsStats/otherFacts/29/name,resoFactsStats/otherFacts/29/value,resoFactsStats/otherFacts/30/name,resoFactsStats/otherFacts/30/value,resoFactsStats/otherFacts/31/name,resoFactsStats/otherFacts/31/value,resoFactsStats/otherFacts/32/name,resoFactsStats/otherFacts/32/value,resoFactsStats/otherFacts/33/name,resoFactsStats/otherFacts/33/value,resoFactsStats/otherFacts/34/name,resoFactsStats/otherFacts/34/value,resoFactsStats/otherFacts/35/name,resoFactsStats/otherFacts/35/value,resoFactsStats/otherFacts/36/name,resoFactsStats/otherFacts/36/value,resoFactsStats/otherFacts/37/name,resoFactsStats/otherFacts/37/value,resoFactsStats/otherFacts/38/name,resoFactsStats/otherFacts/38/value,resoFactsStats/otherFacts/39/name,resoFactsStats/otherFacts/39/value,resoFactsStats/otherFacts/40/name,resoFactsStats/otherFacts/40/value,resoFactsStats/otherFacts/41/name,resoFactsStats/otherFacts/41/value,resoFactsStats/otherFacts/42/name,resoFactsStats/otherFacts/42/value,resoFactsStats/otherFacts/43/name,resoFactsStats/otherFacts/43/value,resoFactsStats/otherFacts/44/name,resoFactsStats/otherFacts/44/value,resoFactsStats/otherFacts/45/name,resoFactsStats/otherF
acts/45/value,resoFactsStats/otherFacts/46/name,resoFactsStats/otherFacts/46/value,resoFactsStats/otherFacts/47/name,resoFactsStats/otherFacts/47/value,resoFactsStats/otherFacts/48/name,resoFactsStats/otherFacts/48/value,resoFactsStats/otherFacts/49/name,resoFactsStats/otherFacts/49/value,resoFactsStats/otherFacts/50/name,resoFactsStats/otherFacts/50/value,resoFactsStats/otherFacts/51/name,resoFactsStats/otherFacts/51/value,resoFactsStats/otherFacts/52/name,resoFactsStats/otherFacts/52/value,resoFactsStats/otherFacts/53/name,resoFactsStats/otherFacts/53/value,resoFactsStats/otherFacts/54/name,resoFactsStats/otherFacts/54/value,resoFactsStats/otherFacts/55/name,resoFactsStats/otherFacts/55/value,resoFactsStats/otherFacts/56/name,resoFactsStats/otherFacts/56/value,resoFactsStats/otherFacts/57/name,resoFactsStats/otherFacts/57/value,resoFactsStats/otherFacts/58/name,resoFactsStats/otherFacts/58/value,resoFactsStats/otherFacts/59/name,resoFactsStats/otherFacts/59/value,resoFactsStats/otherFacts/60/name,resoFactsStats/otherFacts/60/value,resoFactsStats/otherFacts/61/name,resoFactsStats/otherFacts/61/value,resoFactsStats/otherFacts/62/name,resoFactsStats/otherFacts/62/value,resoFactsStats/otherFacts/63/name,resoFactsStats/otherFacts/63/value,resoFactsStats/otherFacts/64/name,resoFactsStats/otherFacts/64/value,resoFactsStats/otherFacts/65/name,resoFactsStats/otherFacts/65/value,resoFactsStats/otherFacts/66/name,resoFactsStats/otherFacts/66/value,resoFactsStats/otherFacts/67/name,resoFactsStats/otherFacts/67/value,resoFactsStats/otherFacts/68/name,resoFactsStats/otherFacts/68/value,resoFactsStats/otherFacts/69/name,resoFactsStats/otherFacts/69/value,resoFactsStats/otherFacts/70/name,resoFactsStats/otherFacts/70/value,resoFactsStats/otherFacts/71/name,resoFactsStats/otherFacts/71/value,resoFactsStats/otherFacts/72/name,resoFactsStats/otherFacts/72/value,resoFactsStats/otherFacts/73/name,resoFactsStats/otherFacts/73/value,resoFactsStats/otherFacts/74/name,resoFactsStats/other
Facts/74/value,resoFactsStats/otherFacts/75/name,resoFactsStats/otherFacts/75/value,resoFactsStats/otherFacts/76/name,resoFactsStats/otherFacts/76/value,resoFactsStats/otherFacts/77/name,resoFactsStats/otherFacts/77/value,resoFactsStats/otherFacts/78/name,resoFactsStats/otherFacts/78/value,resoFactsStats/otherFacts/79/name,resoFactsStats/otherFacts/79/value,resoFactsStats/otherFacts/80/name,resoFactsStats/otherFacts/80/value,resoFactsStats/otherFacts/81/name,resoFactsStats/otherFacts/81/value,resoFactsStats/otherFacts/82/name,resoFactsStats/otherFacts/82/value,resoFactsStats/otherFacts/83/name,resoFactsStats/otherFacts/83/value,resoFactsStats/otherFacts/84/name,resoFactsStats/otherFacts/84/value,resoFactsStats/otherFacts/85/name,resoFactsStats/otherFacts/85/value,resoFactsStats/otherFacts/86/name,resoFactsStats/otherFacts/86/value,resoFactsStats/otherFacts/87/name,resoFactsStats/otherFacts/87/value,resoFactsStats/otherFacts/88/name,resoFactsStats/otherFacts/88/value,resoFactsStats/otherFacts/89/name,resoFactsStats/otherFacts/89/value,resoFactsStats/otherFacts/90/name,resoFactsStats/otherFacts/90/value,resoFactsStats/otherFacts/91/name,resoFactsStats/otherFacts/91/value,resoFactsStats/otherFacts/92/name,resoFactsStats/otherFacts/92/value,resoFactsStats/otherParking,resoFactsStats/otherParking/0,resoFactsStats/otherStructures,resoFactsStats/otherStructures/0,resoFactsStats/otherStructures/1,resoFactsStats/parcelNumber,resoFactsStats/parking,resoFactsStats/parkingFeatures/0,resoFactsStats/parkingFeatures/1,resoFactsStats/parkingFeatures/2,resoFactsStats/parkingFeatures/3,resoFactsStats/parkingFeatures/4,resoFactsStats/parkingFeatures/5,resoFactsStats/parkingFeatures/6,resoFactsStats/parkingFeatures/7,resoFactsStats/parkingFeatures/8,resoFactsStats/patioAndPorchFeatures,resoFactsStats/patioAndPorchFeatures/0,resoFactsStats/patioAndPorchFeatures/1,resoFactsStats/patioAndPorchFeatures/2,resoFactsStats/propertyCondition,resoFactsStats/roofType,resoFactsStats/rooms/0/area,r
esoFactsStats/rooms/0/description,resoFactsStats/rooms/0/dimensions,resoFactsStats/rooms/0/features,resoFactsStats/rooms/0/length,resoFactsStats/rooms/0/level,resoFactsStats/rooms/0/roomType,resoFactsStats/rooms/0/width,resoFactsStats/rooms/1/area,resoFactsStats/rooms/1/description,resoFactsStats/rooms/1/dimensions,resoFactsStats/rooms/1/features,resoFactsStats/rooms/1/length,resoFactsStats/rooms/1/level,resoFactsStats/rooms/1/roomType,resoFactsStats/rooms/1/width,resoFactsStats/rooms/2/area,resoFactsStats/rooms/2/description,resoFactsStats/rooms/2/dimensions,resoFactsStats/rooms/2/features,resoFactsStats/rooms/2/length,resoFactsStats/rooms/2/level,resoFactsStats/rooms/2/roomType,resoFactsStats/rooms/2/width,resoFactsStats/rooms/3/area,resoFactsStats/rooms/3/description,resoFactsStats/rooms/3/dimensions,resoFactsStats/rooms/3/features,resoFactsStats/rooms/3/length,resoFactsStats/rooms/3/level,resoFactsStats/rooms/3/roomType,resoFactsStats/rooms/3/width,resoFactsStats/rooms/4/area,resoFactsStats/rooms/4/description,resoFactsStats/rooms/4/dimensions,resoFactsStats/rooms/4/features,resoFactsStats/rooms/4/length,resoFactsStats/rooms/4/level,resoFactsStats/rooms/4/roomType,resoFactsStats/rooms/4/width,resoFactsStats/rooms/5/area,resoFactsStats/rooms/5/description,resoFactsStats/rooms/5/dimensions,resoFactsStats/rooms/5/features,resoFactsStats/rooms/5/length,resoFactsStats/rooms/5/level,resoFactsStats/rooms/5/roomType,resoFactsStats/rooms/5/width,resoFactsStats/rooms/6/area,resoFactsStats/rooms/6/description,resoFactsStats/rooms/6/dimensions,resoFactsStats/rooms/6/features,resoFactsStats/rooms/6/length,resoFactsStats/rooms/6/level,resoFactsStats/rooms/6/roomType,resoFactsStats/rooms/6/width,resoFactsStats/rooms/7/area,resoFactsStats/rooms/7/description,resoFactsStats/rooms/7/dimensions,resoFactsStats/rooms/7/features,resoFactsStats/rooms/7/length,resoFactsStats/rooms/7/level,resoFactsStats/rooms/7/roomType,resoFactsStats/rooms/7/width,resoFactsStats/rooms/8/area,resoFacts
Stats/rooms/8/description,resoFactsStats/rooms/8/dimensions,resoFactsStats/rooms/8/features,resoFactsStats/rooms/8/length,resoFactsStats/rooms/8/level,resoFactsStats/rooms/8/roomType,resoFactsStats/rooms/8/width,resoFactsStats/rooms/9/area,resoFactsStats/rooms/9/description,resoFactsStats/rooms/9/dimensions,resoFactsStats/rooms/9/features,resoFactsStats/rooms/9/length,resoFactsStats/rooms/9/level,resoFactsStats/rooms/9/roomType,resoFactsStats/rooms/9/width,resoFactsStats/rooms/10/area,resoFactsStats/rooms/10/description,resoFactsStats/rooms/10/dimensions,resoFactsStats/rooms/10/features,resoFactsStats/rooms/10/length,resoFactsStats/rooms/10/level,resoFactsStats/rooms/10/roomType,resoFactsStats/rooms/10/width,resoFactsStats/rooms/11/area,resoFactsStats/rooms/11/description,resoFactsStats/rooms/11/dimensions,resoFactsStats/rooms/11/features,resoFactsStats/rooms/11/length,resoFactsStats/rooms/11/level,resoFactsStats/rooms/11/roomType,resoFactsStats/rooms/11/width,resoFactsStats/rooms/12/area,resoFactsStats/rooms/12/description,resoFactsStats/rooms/12/dimensions,resoFactsStats/rooms/12/features,resoFactsStats/rooms/12/length,resoFactsStats/rooms/12/level,resoFactsStats/rooms/12/roomType,resoFactsStats/rooms/12/width,resoFactsStats/rooms/13/area,resoFactsStats/rooms/13/description,resoFactsStats/rooms/13/dimensions,resoFactsStats/rooms/13/features,resoFactsStats/rooms/13/length,resoFactsStats/rooms/13/level,resoFactsStats/rooms/13/roomType,resoFactsStats/rooms/13/width,resoFactsStats/securityFeatures,resoFactsStats/securityFeatures/0,resoFactsStats/securityFeatures/1,resoFactsStats/securityFeatures/2,resoFactsStats/sewer,resoFactsStats/sewer/0,resoFactsStats/sewer/1,resoFactsStats/sewer/2,resoFactsStats/spaFeatures,resoFactsStats/spaFeatures/0,resoFactsStats/specialListingConditions,resoFactsStats/stories,resoFactsStats/storiesTotal,resoFactsStats/structureType,resoFactsStats/taxAnnualAmount,resoFactsStats/taxAssessedValue,resoFactsStats/topography,resoFactsStats/utiliti
es,resoFactsStats/utilities/0,resoFactsStats/utilities/1,resoFactsStats/utilities/2,resoFactsStats/utilities/3,resoFactsStats/utilities/4,resoFactsStats/utilities/5,resoFactsStats/utilities/6,resoFactsStats/utilities/7,resoFactsStats/vegetation,resoFactsStats/vegetation/0,resoFactsStats/vegetation/1,resoFactsStats/view,resoFactsStats/view/0,resoFactsStats/view/1,resoFactsStats/view/2,resoFactsStats/view/3,resoFactsStats/view/4,resoFactsStats/view/5,resoFactsStats/virtualTour,resoFactsStats/waterSources,resoFactsStats/waterfrontFeatures,resoFactsStats/waterfrontFeatures/0,resoFactsStats/waterfrontFeatures/1,resoFactsStats/waterfrontFeatures/2,resoFactsStats/waterfrontFeatures/3,resoFactsStats/waterfrontFeatures/4,resoFactsStats/waterfrontFeatures/5,resoFactsStats/waterfrontFeatures/6,resoFactsStats/windowFeatures,resoFactsStats/windowFeatures/0,resoFactsStats/windowFeatures/1,resoFactsStats/windowFeatures/2,resoFactsStats/windowFeatures/3,resoFactsStats/woodedArea,resoFactsStats/yearBuilt,resoFactsStats/yearBuiltEffective,resoFactsStats/zoning,resoFactsStats/zoningDescription,schools,schools/0/assigned,schools/0/distance,schools/0/grades,schools/0/isAssigned,schools/0/level,schools/0/link,schools/0/name,schools/0/rating,schools/0/size,schools/0/studentsPerTeacher,schools/0/totalCount,schools/0/type,schools/1/assigned,schools/1/distance,schools/1/grades,schools/1/isAssigned,schools/1/level,schools/1/link,schools/1/name,schools/1/rating,schools/1/size,schools/1/studentsPerTeacher,schools/1/totalCount,schools/1/type,schools/2/assigned,schools/2/distance,schools/2/grades,schools/2/isAssigned,schools/2/level,schools/2/link,schools/2/name,schools/2/rating,schools/2/size,schools/2/studentsPerTeacher,schools/2/totalCount,schools/2/type,url,yearBuilt,zpid
0,New York,,,NY,60 Terrace View Ave,,10463.0,2.0,5.0,USD,1610134000000.0,"Discover Marble Hill, a neighborhood rich with...",FOR_SALE,40.877743,1889.0,-73.910866,https://photos.zillowstatic.com/fp/006eb7ab9c3...,https://photos.zillowstatic.com/fp/8b10648864c...,https://photos.zillowstatic.com/fp/08edf887608...,https://photos.zillowstatic.com/fp/d61f7b6017e...,https://photos.zillowstatic.com/fp/4a8d68dccb2...,https://photos.zillowstatic.com/fp/9fd2c273f39...,https://photos.zillowstatic.com/fp/7bc3ee9bc4c...,https://photos.zillowstatic.com/fp/644388de162...,https://photos.zillowstatic.com/fp/7bc3ee9bc4c...,https://photos.zillowstatic.com/fp/ca5af4b6f89...,https://photos.zillowstatic.com/fp/a02acd504da...,https://photos.zillowstatic.com/fp/13d4c861d6c...,https://photos.zillowstatic.com/fp/d9e3016fec3...,https://photos.zillowstatic.com/fp/d2356973069...,https://photos.zillowstatic.com/fp/8b5653ce255...,https://photos.zillowstatic.com/fp/f6dd48ce052...,https://photos.zillowstatic.com/fp/5cc0bd941fe...,https://photos.zillowstatic.com/fp/d0794595e03...,https://photos.zillowstatic.com/fp/8913d507ba5...,https://photos.zillowstatic.com/fp/cbbc0e4f7d2...,https://photos.zillowstatic.com/fp/e3b451e19b9...,https://photos.zillowstatic.com/fp/bc86ac6df63...,https://photos.zillowstatic.com/fp/438509c0a62...,https://photos.zillowstatic.com/fp/4c261f1022e...,https://photos.zillowstatic.com/fp/a03737cb83e...,https://photos.zillowstatic.com/fp/75b66bdea1e...,https://photos.zillowstatic.com/fp/1393169ba0b...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,799999.0,,,Keller Williams via MLS,,,,,,,Listed for sale,False,799999.0,0.335558,,,,,,False,Keller Williams via MLS,1610064000000.0,FYD4510324,RE/MAX IN THE CITY,,,,,,,Listing removed,False,599000.0,0.0,,,,,,False,RE/MAX IN THE 
CITY,1459469000000.0,4510324,RE/MAX In The City,,,,,,,Listed for sale,False,599000.0,0.711429,,,,,,False,RE/MAX In The City,1426810000000.0,,RE/MAX Voyage,,,,,,,Listing removed,False,350000.0,0.0,,,,,,False,RE/MAX Voyage,1293062000000.0,,RE/MAX Voyage,,,,,,,Price change,False,350000.0,-0.066667,,,,,,False,RE/MAX Voyage,1276128000000.0,,RE/MAX Voyage,,,,,,,Price change,False,375000.0,0.071429,,,,,,False,RE/MAX Voyage,1275610000000.0,,Remax Voyage,,,,,,,Listed for sale,False,350000.0,0.0,,,,,,False,Remax Voyage,1265328000000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.88,,,,,,,,,,,,,,,,,,,,,,"Victorian,Trilevel",,,,,,,,,,,,,,,,,,,,,,,,Type,Residential,Year Built,1920,Heating,"Natural Gas, Hot Water",Cooling,,Parking,Driveway,Lot,,Days on Zillow,12 Days,Price/sqft,$424,,,Finished,2.0,1.0,1.0,,,,5.0,,,,,,,,,,,,,,False,,New York,,,Park,,,,,,,Frame,Vinyl Siding,,,,,,,,,,,,,,,,,,Call Listing Agent,Bronx 10,,,,,,,,,,,,,,,Back Yard,,,,,,,,,,,,,,,,,,,,False,0,,,,,,,,,,,,,Frame,Vinyl Siding,,,,,,,,,False,,False,False,False,True,,,False,True,False,False,False,False,,False,False,False,,,Natural Gas,Hot Water,,,,,,,,,,,,Call Listing Agent,Bronx 10,Residential,,,,,,,3.0,,,"1,889 sqft",,,,Call Listing Agent,Bronx 10,,,1610064000000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,NO TAX ID FOUND,0,Driveway,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Public 
Sewer,,,,,,,,,5096.0,711000.0,,,,,,,,,,,,,,,,,,,,,https://my.matterport.com/show/?m=jZtoZmynvwi,,,,,,,,,,,,,,,,1920.0,,,,,,0.1,K-8,True,Elementary,https://www.greatschools.org/school?id=01885&s...,Ps 37 Multiple Intelligence School,4.0,647.0,14.0,1.0,Public,,0.1,6-12,True,Middle,https://www.greatschools.org/school?id=06426&s...,In Tech Academy Aka Ms High School 368,3.0,993.0,14.0,1.0,Public,,,,,,,,,,,,,https://www.zillow.com/homedetails/60-Terrace-...,1920.0,31554050.0
1,Bronx,,,NY,625 W 246th St,,10471.0,8.0,8.0,USD,1595968000000.0,EXCLUSIVE BRAND NEW\nLavish Newly Built 8-Bd. ...,FOR_SALE,40.892689,7000.0,-73.910667,https://photos.zillowstatic.com/fp/9c91425f8fb...,https://photos.zillowstatic.com/fp/82a019ae300...,https://photos.zillowstatic.com/fp/d1d0ed56f85...,https://photos.zillowstatic.com/fp/65103165fc7...,https://photos.zillowstatic.com/fp/8d20b0ec4ef...,https://photos.zillowstatic.com/fp/7b9f904cb69...,https://photos.zillowstatic.com/fp/20e8df506bb...,https://photos.zillowstatic.com/fp/f356dcac67b...,https://photos.zillowstatic.com/fp/8d4738c7035...,https://photos.zillowstatic.com/fp/381211d9a4b...,https://photos.zillowstatic.com/fp/f630db124fc...,https://photos.zillowstatic.com/fp/6b068122b76...,https://photos.zillowstatic.com/fp/0b980e850df...,https://photos.zillowstatic.com/fp/d2d206a9dfa...,https://photos.zillowstatic.com/fp/ec0ee308ede...,https://photos.zillowstatic.com/fp/07c1adf8728...,https://photos.zillowstatic.com/fp/b172b360d8c...,https://photos.zillowstatic.com/fp/98852036c55...,https://photos.zillowstatic.com/fp/1a13e70d310...,https://photos.zillowstatic.com/fp/51ee870df37...,https://photos.zillowstatic.com/fp/5bb38514311...,https://photos.zillowstatic.com/fp/8e49441db57...,https://photos.zillowstatic.com/fp/927f1eb3bd3...,https://photos.zillowstatic.com/fp/edaf44c6406...,https://photos.zillowstatic.com/fp/07f0edd3200...,https://photos.zillowstatic.com/fp/0d5f4e68cc8...,https://photos.zillowstatic.com/fp/57245954056...,https://photos.zillowstatic.com/fp/32c9f16821f...,https://photos.zillowstatic.com/fp/e5d0eca2d0c...,https://photos.zillowstatic.com/fp/10ebc500fdd...,https://photos.zillowstatic.com/fp/215b83e8307...,https://photos.zillowstatic.com/fp/2c282c9bea7...,https://photos.zillowstatic.com/fp/89066632fb8...,https://photos.zillowstatic.com/fp/21c99b54739...,https://photos.zillowstatic.com/fp/626307fd49e...,https://photos.zillowstatic.com/fp/be99ce8bd2b...,https://photos.zillowstatic.com/fp/32e2f4343a7..
.,https://photos.zillowstatic.com/fp/0a24cccaa57...,https://photos.zillowstatic.com/fp/1191f7008f2...,https://photos.zillowstatic.com/fp/8d2cac2af75...,https://photos.zillowstatic.com/fp/efae0d4a4b8...,https://photos.zillowstatic.com/fp/f7a194a729c...,https://photos.zillowstatic.com/fp/c45d4ccc01f...,https://photos.zillowstatic.com/fp/1bf74886b03...,https://photos.zillowstatic.com/fp/31650031d32...,https://photos.zillowstatic.com/fp/900af1d012f...,https://photos.zillowstatic.com/fp/8ea8f816183...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,3995000.0,,7e926562cd277890e4e3da97173191ad,Trebach Realty,,,,,,,Price change,False,3995000.0,-0.111235,,,,,,False,Trebach Realty,1607299000000.0,7e926562cd277890e4e3da97173191ad,Trebach Realty,,,,,,,Price change,False,4495000.0,-0.080401,,,,,,False,Trebach Realty,1601510000000.0,7e926562cd277890e4e3da97173191ad,Trebach Realty,,,,,,,Listed for sale,False,4888000.0,0.08743,,,,,,False,Trebach Realty,1595894000000.0,9ab560ae9634619d2e8d9742db9bc276,Trebach Realty,,,,,,,Listing removed,False,4495000.0,0.0,,,,,,False,Trebach Realty,1584144000000.0,9ab560ae9634619d2e8d9742db9bc276,Trebach Realty,,,,,,,Listed for sale,False,4495000.0,3.610256,,,,,,False,Trebach Realty,1572566000000.0,,Public Record,,,,,,,Sold,False,975000.0,-0.025,,Trebach Realty,,,/profile/Trebach-Realty/,False,Public Record,1450397000000.0,9ab560ae9634619d2e8d9742db9bc276,Trebach Realty,,,,,,,Listing removed,False,1000000.0,0.0,,,,,,False,Trebach Realty,1447200000000.0,9ab560ae9634619d2e8d9742db9bc276,Trebach Realty,,,,,,,Pending sale,False,1000000.0,0.0,,,,,,False,Trebach Realty,1439856000000.0,9ab560ae9634619d2e8d9742db9bc276,Trebach Realty,,,,,,,Listed for sale,False,1000000.0,0.0,,,,,,False,Trebach 
Realty,1430438000000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.95,,,,,,,Dishwasher,Dryer,Washer,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Type,Single Family,Year Built,1940,Heating,,Cooling,Central,Parking,"Garage, Garage - Attached",Lot,0.29 Acres,Days on Zillow,176 Days,Price/sqft,$571,,,,8.0,7.0,1.0,0.0,,0.0,8.0,,,,,,,,,,,,,,False,,Bronx,,,,,,,,,,,,,,,,,Central,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Hardwood,,,,,,,,,,,,,,,False,0,,,,,,,,,,,,,,,,,,,,,,,False,,False,False,False,True,,True,False,False,False,False,False,False,,False,False,False,,,,,,,,,,,,,,,,,,Single Family,False,,,,,,,,,"7,000 sqft",0.29 Acres,,,,,,,1595894000000.0,,,Clubhouse,,Granite countertop,,Playground,,Stainless steel appliances,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,059130860,0,Garage,Garage - Attached,,,,,,,,,,,,,Other,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1.0,,Other,13941.0,1937000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1940.0,1940.0,,,,,0.4,K-5,True,Elementary,https://www.greatschools.org/school?id=02100&s...,Ps 24 Spuyten Duyvil,10.0,907.0,16.0,1.0,Public,,0.3,6-12,True,Middle,https://www.greatschools.org/school?id=08367&s...,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,Public,,,,,,,,,,,,,https://www.zillow.com/homedetails/625-W-246th...,1940.0,29854120.0
2,Bronx,,,NY,716 W 231st St,,10463.0,3.0,4.0,USD,1592668000000.0,This 4233 square foot single family home has 4...,FOR_SALE,40.883419,4233.0,-73.918106,https://photos.zillowstatic.com/fp/797eb4a9695...,https://photos.zillowstatic.com/fp/956034ac704...,https://photos.zillowstatic.com/fp/b799eae9027...,https://photos.zillowstatic.com/fp/4b393c3aff3...,https://photos.zillowstatic.com/fp/a207d715629...,https://photos.zillowstatic.com/fp/5724a2d40d9...,https://photos.zillowstatic.com/fp/e4f91d8b114...,https://photos.zillowstatic.com/fp/6a4794be3fa...,https://photos.zillowstatic.com/fp/8f8ba2f927e...,https://photos.zillowstatic.com/fp/b94ce3a3760...,https://photos.zillowstatic.com/fp/2c1f970ac7f...,https://photos.zillowstatic.com/fp/917bd9c4e54...,https://photos.zillowstatic.com/fp/db01538258a...,https://photos.zillowstatic.com/fp/7f2028a428a...,https://photos.zillowstatic.com/fp/0f05d322288...,https://photos.zillowstatic.com/fp/eecbde358e0...,https://photos.zillowstatic.com/fp/e7fcad63756...,https://photos.zillowstatic.com/fp/7754ce0ffb4...,https://photos.zillowstatic.com/fp/df91e0b59e8...,https://photos.zillowstatic.com/fp/8524b0e1c00...,https://photos.zillowstatic.com/fp/4b3880fc069...,https://photos.zillowstatic.com/fp/dbe2a87f0d0...,https://photos.zillowstatic.com/fp/29b13baaece...,https://photos.zillowstatic.com/fp/9b90c4d0b0a...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1495000.0,,,"William Raveis Real Estate, Mortgage & Insurance",,,,,,,Price change,False,1495000.0,-0.002668,,,,,,False,"William Raveis Real Estate, Mortgage & Insurance",1611101000000.0,,William Raveis Real Estate,,,,,,,Listed for sale,False,1499000.0,0.0,,,,,,False,William Raveis Real 
Estate,1592611000000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.95,,,,,,,Dishwasher,Dryer,Washer,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Type,Single Family,Year Built,1920,Heating,,Cooling,,Parking,"Garage, Garage - Attached",Lot,0.42 Acres,Days on Zillow,214 Days,Price/sqft,$353,,,,3.0,3.0,0.0,0.0,,0.0,4.0,,,,,,,,,,,,,,False,,Bronx,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,False,0,,,,,,,,,,,,,,,,,,,,,,,False,,False,False,False,False,,,False,False,False,False,False,False,,False,False,False,,,,,,,,,,,,,,,,,,Single Family,False,,,,,,,,,"4,233 sqft",0.42 Acres,,,,,,,1592611000000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,057500494,0,Garage,Garage - Attached,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2.0,,,12253.0,2341000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1920.0,1920.0,,,,,0.3,K-5,True,Elementary,https://www.greatschools.org/school?id=02100&s...,Ps 24 Spuyten Duyvil,10.0,907.0,16.0,1.0,Public,,0.4,6-12,True,Middle,https://www.greatschools.org/school?id=08367&s...,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,Public,,,,,,,,,,,,,https://www.zillow.com/homedetails/716-W-231st...,1920.0,29851860.0
3,Bronx,,,NY,750 W 232nd St,,10463.0,6.0,5.0,USD,1600814000000.0,EXCLUSIVE NEW TO MARKET\nPrime Renovation Oppo...,FOR_SALE,40.885033,7000.0,-73.917793,https://photos.zillowstatic.com/fp/36a83a8d300...,https://photos.zillowstatic.com/fp/90000a23b0b...,https://photos.zillowstatic.com/fp/9d7bb173624...,https://photos.zillowstatic.com/fp/6e171567e67...,https://photos.zillowstatic.com/fp/f75d80b80be...,https://photos.zillowstatic.com/fp/8c2d97e9c45...,https://photos.zillowstatic.com/fp/ba73c3d72d9...,https://photos.zillowstatic.com/fp/ae9d8311876...,https://photos.zillowstatic.com/fp/adb5f981b0b...,https://photos.zillowstatic.com/fp/a460a6dfd4f...,https://photos.zillowstatic.com/fp/0b4fce6c3ea...,https://photos.zillowstatic.com/fp/6ef5773f2e3...,https://photos.zillowstatic.com/fp/2d586b0dc99...,https://photos.zillowstatic.com/fp/110af9eb437...,https://photos.zillowstatic.com/fp/47a750500f5...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,3450000.0,,8fc1856e2d9a089a07098efa626531d7,StreetEasy,,,,,,,Price change,False,3450000.0,-0.092105,,,,,,False,StreetEasy,1608163000000.0,b47b78b7b2de6980413e5c3903cf7920,Trebach Realty,,,,,,,Listed for sale,False,3800000.0,0.225806,,,,,,False,Trebach Realty,1600733000000.0,,Public Record,,,,,,,Sold,False,3100000.0,-0.156463,,,,,,False,Public Record,1551917000000.0,edf73c0971e6f990409268b83a3d7508,Trebach Realty,,,,,,,Listing removed,False,3675000.0,0.0,,,,,,False,Trebach Realty,1550707000000.0,edf73c0971e6f990409268b83a3d7508,Trebach Realty,,,,,,,Pending sale,False,3675000.0,0.0,,,,,,False,Trebach Realty,1510272000000.0,edf73c0971e6f990409268b83a3d7508,Trebach Realty,,,,,,,Listed for sale,False,3675000.0,-0.125,,,,,,False,Trebach Realty,1506298000000.0,,Public Record,,,,,,,Sold,False,4200000.0,0.0,,,,,,False,Public 
Record,965088000000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.95,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Type,Single Family,Year Built,1950,Heating,,Cooling,Central,Parking,"Garage, Garage - Attached",Lot,0.26 Acres,Days on Zillow,120 Days,Price/sqft,$493,,,,6.0,6.0,0.0,0.0,,0.0,5.0,,,,,,,,,,,,,,False,,Bronx,,,,,,,,,,,,,,,,,Central,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,False,0,,,,,,,,,,,,,,,,,,,,,,,False,,False,False,False,True,,True,False,False,False,False,False,False,,False,False,False,,,,,,,,,,,,,,,,,,Single Family,False,,,,,,,,,"7,000 sqft",0.26 Acres,,,,,,,1600733000000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,057510300,0,Garage,Garage - Attached,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2.0,,,19472.0,3011000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1950.0,1950.0,,,,,0.2,K-5,True,Elementary,https://www.greatschools.org/school?id=02100&s...,Ps 24 Spuyten Duyvil,10.0,907.0,16.0,1.0,Public,,0.3,6-12,True,Middle,https://www.greatschools.org/school?id=08367&s...,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,Public,,,,,,,,,,,,,https://www.zillow.com/homedetails/750-W-232nd...,1950.0,29851860.0
4,Bronx,,,NY,632 W 230th St,,10463.0,6.0,5.0,USD,1605751000000.0,EXCLUSIVE JUST LISTED\nNewly Built 5-Bd. Stucc...,FOR_SALE,40.881702,,-73.914185,https://photos.zillowstatic.com/fp/daa98ebe0c5...,https://photos.zillowstatic.com/fp/fe816fa637b...,https://photos.zillowstatic.com/fp/2b2e0075694...,https://photos.zillowstatic.com/fp/ccfe5f7baf0...,https://photos.zillowstatic.com/fp/f2c81a6d161...,https://photos.zillowstatic.com/fp/d0f78479941...,https://photos.zillowstatic.com/fp/5e7fb1fe73d...,https://photos.zillowstatic.com/fp/76d72a4dfac...,https://photos.zillowstatic.com/fp/425f1f7017b...,https://photos.zillowstatic.com/fp/ae3df89de6a...,https://photos.zillowstatic.com/fp/944026d5209...,https://photos.zillowstatic.com/fp/ce447bd1c65...,https://photos.zillowstatic.com/fp/ee8b3c836d8...,https://photos.zillowstatic.com/fp/d469ac88321...,https://photos.zillowstatic.com/fp/ffeda9cdd3b...,https://photos.zillowstatic.com/fp/196b1e1202e...,https://photos.zillowstatic.com/fp/86780857849...,https://photos.zillowstatic.com/fp/c4efa2cb67b...,https://photos.zillowstatic.com/fp/471a8b76fc5...,https://photos.zillowstatic.com/fp/c799c277d12...,https://photos.zillowstatic.com/fp/eedd9642ac6...,https://photos.zillowstatic.com/fp/19adc20f541...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1790000.0,,336048b9faf140601813bb877d92efd2,Trebach Realty,,,,,,,Listed for sale,False,1790000.0,0.0,,,,,,False,Trebach 
Realty,1605658000000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.95,,,,,,,Dishwasher,Dryer,Washer,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Type,Single Family,Year Built,2020,Heating,,Cooling,Central,Parking,0 spaces,Lot,,Days on Zillow,62 Days,,,,,,6.0,5.0,1.0,0.0,,0.0,5.0,,,,,,,,,,,,,,False,,Bronx,,,,,,,,,,,,,,,,,Central,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,False,0,,,,,,,,,,,,,,,,,,,,,,,False,,False,False,False,True,,,False,False,False,False,False,False,,False,False,False,,,,,,,,,,,,,,,,,,Single Family,False,,,,,,,,,,,,,,,,,1605744000000.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2020.0,,,,,,0.3,K-5,True,Elementary,https://www.greatschools.org/school?id=02100&s...,Ps 24 Spuyten Duyvil,10.0,907.0,16.0,1.0,Public,,0.4,6-12,True,Middle,https://www.greatschools.org/school?id=08367&s...,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,Public,,,,,,,,,,,,,https://www.zillow.com/homedetails/632-W-230th...,2020.0,2077107000.0


In [23]:
df.shape

(75630, 1507)

In [24]:
# Drop every column whose name contains 'factLabel'.
df.drop(columns=df.columns[df.columns.str.contains('factLabel')], inplace=True)

In [25]:
# Drop every column whose name contains 'photos'.
df.drop(columns=df.columns[df.columns.str.contains('photos')], inplace=True)
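
The two column drops above both select columns by a substring of their name. As a minimal sketch of that pattern on a toy frame (the column names below are made up to resemble the flattened export and are not taken from the real dataset), the name mask can be inspected before anything is removed:

```python
import pandas as pd

# Toy frame; these column names are illustrative assumptions only.
toy = pd.DataFrame({
    "price": [1, 2],
    "photos/0": ["a", "b"],
    "photos/1": ["c", "d"],
    "resoFactsStats/atAGlanceFacts/0/factLabel": ["Type", "Type"],
    "resoFactsStats/atAGlanceFacts/0/factValue": ["Condo", "Co-op"],
})

# Boolean mask over the column names: True where the name matches either
# substring (str.contains treats the pattern as a regular expression).
mask = toy.columns.str.contains("factLabel|photos")
print(list(toy.columns[mask]))

# Keep only the columns that did not match.
trimmed = toy.loc[:, ~mask]
print(list(trimmed.columns))
```

Printing the matched names first is a cheap check that a substring such as `photos` does not also catch a column that should be kept.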

In [26]:
df.isna().sum().head(20)

address/city                                      1
address/community                             75630
address/neighborhood                          74953
address/state                                     1
address/streetAddress                             1
address/subdivision                           75628
address/zipcode                                  19
bathrooms                                     19053
bedrooms                                      19464
currency                                          0
dateposted                                    52200
description                                    6684
homeStatus                                        0
latitude                                         26
livingArea                                     9211
longitude                                        26
price                                            39
priceHistory                                  75630
priceHistory/0/attributeSource/infoString1    51477
priceHistory

In [27]:
# Drop rows missing any of the three core size features.
df.dropna(subset=['bathrooms', 'bedrooms', 'livingArea'], inplace=True)

In [32]:
df.isna().sum().head(20)

address/city                                      0
address/community                             49943
address/neighborhood                          49315
address/state                                     0
address/streetAddress                             0
address/subdivision                           49941
address/zipcode                                   1
bathrooms                                         0
bedrooms                                          0
currency                                          0
dateposted                                    32138
description                                    4087
homeStatus                                        0
latitude                                         23
livingArea                                        0
longitude                                        23
price                                            29
priceHistory                                  49943
priceHistory/0/attributeSource/infoString1    31267
priceHistory

In [33]:
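# Keep only columns with at least 25% non-null values (thresh counts non-nulls).
# This drop must run to produce the reduced column list shown in the next output.
limit = int(len(df) * 0.25)
df.dropna(thresh=limit, axis=1, inplace=True)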

In [34]:
df.isna().sum().head(20)

address/city                                      0
address/state                                     0
address/streetAddress                             0
address/zipcode                                   1
bathrooms                                         0
bedrooms                                          0
currency                                          0
dateposted                                    32138
description                                    4087
homeStatus                                        0
latitude                                         23
livingArea                                        0
longitude                                        23
price                                            29
priceHistory/0/attributeSource/infoString1    31267
priceHistory/0/attributeSource/infoString2     3534
priceHistory/0/buyerAgent/name                34172
priceHistory/0/buyerAgent/photo/url           34831
priceHistory/0/buyerAgent/profileUrl          34172
priceHistory
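
The `thresh` argument in the `dropna(thresh=limit, axis=1)` call of cell 33 is easy to misread: it is the minimum number of non-null values a column needs in order to be kept, not the number of nulls tolerated. A minimal sketch on a toy frame (the column names and values are invented for illustration):

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "full": [1, 2, 3, 4],                       # 4 non-null values
    "half": [1, np.nan, 3, np.nan],             # 2 non-null values
    "empty": [np.nan, np.nan, np.nan, np.nan],  # 0 non-null values
})

# Same rule as the notebook: a column survives if at least 25% of its
# rows are non-null. For this 4-row frame that means at least 1 value.
limit = int(len(toy) * 0.25)
kept = toy.dropna(thresh=limit, axis=1)
print(list(kept.columns))
```

Under this rule a column that is 75% missing still survives, which is consistent with `dateposted` remaining in the output above despite 32138 missing values.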

In [37]:
df.columns

Index(['address/city', 'address/state', 'address/streetAddress',
       'address/zipcode', 'bathrooms', 'bedrooms', 'currency', 'dateposted',
       'description', 'homeStatus',
       ...
       'schools/2/link', 'schools/2/name', 'schools/2/rating',
       'schools/2/size', 'schools/2/studentsPerTeacher',
       'schools/2/totalCount', 'schools/2/type', 'url', 'yearBuilt', 'zpid'],
      dtype='object', length=210)

In [38]:
df['homeStatus'].unique()

array(['FOR_SALE', 'SOLD', 'RECENTLY_SOLD', 'PRE_FORECLOSURE', 'FOR_RENT',
       'FORECLOSED'], dtype=object)

In [39]:
# Drop rental and foreclosure listings in one pass.
# 'OTHER' is not in the unique() output above, so including it is a no-op here.
excluded = ['FOR_RENT', 'PRE_FORECLOSURE', 'FORECLOSED', 'OTHER']
df.drop(df.index[df['homeStatus'].isin(excluded)], inplace=True)

In [40]:
df.loc[df['priceHistory/0/postingIsRental'] == True]

Unnamed: 0,address/city,address/state,address/streetAddress,address/zipcode,bathrooms,bedrooms,currency,dateposted,description,homeStatus,latitude,livingArea,longitude,price,priceHistory/0/attributeSource/infoString1,priceHistory/0/attributeSource/infoString2,priceHistory/0/buyerAgent/name,priceHistory/0/buyerAgent/photo/url,priceHistory/0/buyerAgent/profileUrl,priceHistory/0/event,priceHistory/0/postingIsRental,priceHistory/0/price,priceHistory/0/priceChangeRate,priceHistory/0/sellerAgent/name,priceHistory/0/sellerAgent/photo/url,priceHistory/0/sellerAgent/profileUrl,priceHistory/0/showCountyLink,priceHistory/0/source,priceHistory/0/time,priceHistory/1/attributeSource/infoString1,priceHistory/1/attributeSource/infoString2,priceHistory/1/event,priceHistory/1/postingIsRental,priceHistory/1/price,priceHistory/1/priceChangeRate,priceHistory/1/showCountyLink,priceHistory/1/source,priceHistory/1/time,priceHistory/2/attributeSource/infoString1,priceHistory/2/attributeSource/infoString2,priceHistory/2/event,priceHistory/2/postingIsRental,priceHistory/2/price,priceHistory/2/priceChangeRate,priceHistory/2/showCountyLink,priceHistory/2/source,priceHistory/2/time,priceHistory/3/attributeSource/infoString1,priceHistory/3/attributeSource/infoString2,priceHistory/3/event,priceHistory/3/postingIsRental,priceHistory/3/price,priceHistory/3/priceChangeRate,priceHistory/3/showCountyLink,priceHistory/3/source,priceHistory/3/time,priceHistory/4/attributeSource/infoString1,priceHistory/4/attributeSource/infoString2,priceHistory/4/event,priceHistory/4/postingIsRental,priceHistory/4/price,priceHistory/4/priceChangeRate,priceHistory/4/showCountyLink,priceHistory/4/source,priceHistory/4/time,priceHistory/5/attributeSource/infoString1,priceHistory/5/attributeSource/infoString2,priceHistory/5/event,priceHistory/5/postingIsRental,priceHistory/5/price,priceHistory/5/priceChangeRate,priceHistory/5/showCountyLink,priceHistory/5/source,priceHistory/5/time,priceHistory/6/attributeSource/infoString2,
priceHistory/6/event,priceHistory/6/postingIsRental,priceHistory/6/price,priceHistory/6/priceChangeRate,priceHistory/6/showCountyLink,priceHistory/6/source,priceHistory/6/time,priceHistory/7/attributeSource/infoString2,priceHistory/7/event,priceHistory/7/postingIsRental,priceHistory/7/price,priceHistory/7/priceChangeRate,priceHistory/7/showCountyLink,priceHistory/7/source,priceHistory/7/time,propertyTaxRate,resoFactsStats/appliances/0,resoFactsStats/appliances/1,resoFactsStats/appliances/2,resoFactsStats/atAGlanceFacts/0/factValue,resoFactsStats/atAGlanceFacts/1/factValue,resoFactsStats/atAGlanceFacts/2/factValue,resoFactsStats/atAGlanceFacts/3/factValue,resoFactsStats/atAGlanceFacts/4/factValue,resoFactsStats/atAGlanceFacts/5/factValue,resoFactsStats/basement,resoFactsStats/bathrooms,resoFactsStats/bathroomsFull,resoFactsStats/bathroomsHalf,resoFactsStats/bathroomsOneQuarter,resoFactsStats/bathroomsThreeQuarter,resoFactsStats/bedrooms,resoFactsStats/canRaiseHorses,resoFactsStats/cityRegion,resoFactsStats/constructionMaterials/0,resoFactsStats/cooling/0,resoFactsStats/elementarySchoolDistrict,resoFactsStats/flooring/0,resoFactsStats/furnished,resoFactsStats/garageSpaces,resoFactsStats/hasAdditionalParcels,resoFactsStats/hasAttachedGarage,resoFactsStats/hasAttachedProperty,resoFactsStats/hasCarport,resoFactsStats/hasCooling,resoFactsStats/hasFireplace,resoFactsStats/hasGarage,resoFactsStats/hasHeating,resoFactsStats/hasHomeWarranty,resoFactsStats/hasLandLease,resoFactsStats/hasOpenParking,resoFactsStats/hasPetsAllowed,resoFactsStats/hasPrivatePool,resoFactsStats/hasRentControl,resoFactsStats/hasSpa,resoFactsStats/hasView,resoFactsStats/heating/0,resoFactsStats/heating/1,resoFactsStats/highSchoolDistrict,resoFactsStats/homeType,resoFactsStats/isNewConstruction,resoFactsStats/livingArea,resoFactsStats/lotSize,resoFactsStats/middleOrJuniorSchoolDistrict,resoFactsStats/onMarketDate,resoFactsStats/otherFacts/0/name,resoFactsStats/otherFacts/0/value,resoFactsStats/otherFac
ts/1/name,resoFactsStats/otherFacts/1/value,resoFactsStats/otherFacts/2/name,resoFactsStats/otherFacts/2/value,resoFactsStats/otherFacts/3/name,resoFactsStats/otherFacts/3/value,resoFactsStats/otherFacts/4/name,resoFactsStats/otherFacts/4/value,resoFactsStats/otherFacts/5/name,resoFactsStats/otherFacts/5/value,resoFactsStats/otherFacts/6/name,resoFactsStats/otherFacts/6/value,resoFactsStats/otherFacts/7/name,resoFactsStats/otherFacts/7/value,resoFactsStats/otherFacts/8/name,resoFactsStats/otherFacts/8/value,resoFactsStats/otherFacts/9/name,resoFactsStats/otherFacts/9/value,resoFactsStats/otherFacts/10/value,resoFactsStats/otherFacts/11/value,resoFactsStats/parcelNumber,resoFactsStats/parking,resoFactsStats/parkingFeatures/0,resoFactsStats/parkingFeatures/1,resoFactsStats/rooms/0/roomType,resoFactsStats/sewer/0,resoFactsStats/stories,resoFactsStats/structureType,resoFactsStats/taxAnnualAmount,resoFactsStats/taxAssessedValue,resoFactsStats/yearBuilt,resoFactsStats/yearBuiltEffective,schools/0/distance,schools/0/grades,schools/0/isAssigned,schools/0/level,schools/0/link,schools/0/name,schools/0/rating,schools/0/size,schools/0/studentsPerTeacher,schools/0/totalCount,schools/0/type,schools/1/distance,schools/1/grades,schools/1/isAssigned,schools/1/level,schools/1/link,schools/1/name,schools/1/rating,schools/1/size,schools/1/studentsPerTeacher,schools/1/totalCount,schools/1/type,schools/2/distance,schools/2/grades,schools/2/isAssigned,schools/2/level,schools/2/link,schools/2/name,schools/2/rating,schools/2/size,schools/2/studentsPerTeacher,schools/2/totalCount,schools/2/type,url,yearBuilt,zpid
136,New York,NY,581 Academy St APT 2C,10034.0,1.0,1.0,USD,,Sunny One Bedroom Apartment Near Parks & Trans...,SOLD,40.863861,625.0,-73.923073,399000.0,080ffee64010474abfd07ee9846fec18,New Heights Realty,,,,Listing removed,True,1650.0,0.0,,,,False,New Heights Realty,1.528330e+12,080ffee64010474abfd07ee9846fec18,New Heights Realty,Price change,True,1650.0,-0.029412,False,New Heights Realty,1.528157e+12,080ffee64010474abfd07ee9846fec18,New Heights Realty,Price change,True,1700.0,-0.028571,False,New Heights Realty,1.526947e+12,080ffee64010474abfd07ee9846fec18,New Heights Realty,Listed for rent,True,1750.0,0.000000,False,New Heights Realty,1.526429e+12,,Public Record,Sold,False,399000.0,0.000000,False,Public Record,1.525651e+12,25a22c198fd6cf93c3fe060cc3390136,Corcoran,Listing removed,False,399000.0,0.000000,False,Corcoran,1.524787e+12,Corcoran,Pending sale,False,399000.0,0.000000,False,Corcoran,1.516752e+12,Corcoran,Listed for sale,False,399000.0,0.266667,False,Corcoran,1.514851e+12,0.88,Dishwasher,,,Multiple Occupancy,,,Central,0 spaces,$413/mo,,1.0,0.0,0.0,0.0,0.0,1.0,False,New York,,Central,,,False,0,False,False,False,False,True,,False,False,False,False,False,False,,False,False,False,,,,Multiple Occupancy,,625 sqft,,,,,Bicycle Storage,,Large Dogs Allowed,,Live In Super,,Small Dogs Allowed,,,,,,,,,,,,,,,022211160,0,,,,,5.0,,3152.0,66533.0,,,0.3,PK-5,True,Primary,https://www.greatschools.org/school?id=01799&s...,Ps 5 Ellen Lurie,7.0,593.0,12.0,1.0,Public,0.5,6-8,True,Middle,https://www.greatschools.org/school?id=01793&s...,Is 218 Salome Urena,4.0,173.0,9.0,1.0,Public,0.2,9-12,True,High,https://www.greatschools.org/school?id=13289&s...,High School For Excellence And Innovation,1.0,194.0,11.0,1.0,Public,https://www.zillow.com/homedetails/581-Academy...,,31554258.0
137,Bronx,NY,5 Rivercrest Rd,10471.0,3.0,4.0,USD,1.610521e+12,,SOLD,40.903187,1800.0,-73.908684,1500000.0,,Trebach Realty,,,,Listing removed,True,6900.0,0.0,,,,False,Trebach Realty,1.573776e+12,,Trebach Realty,Listed for rent,True,6900.0,0.000000,False,Trebach Realty,1.569974e+12,H4840568,OneKey™ MLS,Sold,False,1500000.0,-0.090358,False,OneKey™ MLS,1.555459e+12,,Julia B Fee Sotheby's International Realty,Pending sale,False,1649000.0,0.000000,False,Julia B Fee Sotheby's International Realty,1.555286e+12,,Julia B Fee Sotheby's International Realty,Price change,False,1649000.0,-0.083380,False,Julia B Fee Sotheby's International Realty,1.540339e+12,,Julia B Fee Sotheby's International Realty,Listed for sale,False,1799000.0,0.000000,False,Julia B Fee Sotheby's International Realty,1.538006e+12,,,,,,,,,,,,,,,,,0.95,Dishwasher,Dryer,Range,Residential,1935,"Natural Gas, Oil, Forced Air",Central Air,"Detached, Driveway",0.36 Acres,Partially Finished,3.0,3.0,0.0,,,4.0,False,Bronx,Brick,Central Air,Bronx 10,Hardwood,False,1,False,False,False,False,True,True,True,True,False,False,False,False,,False,False,False,Natural Gas,Oil,Bronx 10,Residential,,"1,800 sqft",0.36 Acres,Bronx 10,1.537920e+12,,,,,,,,,,,,,,,,,,,,,,,05949-0308,0,Detached,Driveway,,Sewer,,,12706.0,1786000.0,1935.0,,0.2,K-5,True,Elementary,https://www.greatschools.org/school?id=02288&s...,Ps 81 Robert J Christen,7.0,725.0,14.0,1.0,Public,1.0,6-12,True,Middle,https://www.greatschools.org/school?id=08367&s...,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,Public,,,,,,,,,,,,https://www.zillow.com/homedetails/5-Rivercres...,1935.0,29854613.0
317,Jamaica,NY,9806 163rd Ave,11414.0,4.0,6.0,USD,,"9806 163rd Ave, Jamaica, NY 11414 is a multi f...",SOLD,40.652981,1152.0,-73.833176,750000.0,8b66e0f4b02701d2b9112706acf6b790,PH Home LLC,,,,Listing removed,True,4800.0,0.0,,,,False,PH Home LLC,1.550189e+12,8b66e0f4b02701d2b9112706acf6b790,PH Home LLC,Listed for rent,True,4800.0,0.000000,False,PH Home LLC,1.547078e+12,,Public Record,Sold,False,750000.0,-0.048223,False,Public Record,1.543190e+12,8bf4af9f26247df2db59db0d28afdc55,Keller Williams Landmark Realty,Listing removed,False,788000.0,0.000000,False,Keller Williams Landmark Realty,1.531354e+12,8bf4af9f26247df2db59db0d28afdc55,Keller Williams Landmark Realty,Price change,False,788000.0,-0.048309,False,Keller Williams Landmark Realty,1.529971e+12,8bf4af9f26247df2db59db0d28afdc55,Keller Williams Landmark Realty,Price change,False,828000.0,-0.051425,False,Keller Williams Landmark Realty,1.527811e+12,Keller Williams Rlty Landmark,Listed for sale,False,872888.0,0.163851,False,Keller Williams Rlty Landmark,1.523578e+12,Howard Beach Realty Inc,Listing removed,False,750000.0,0.000000,False,Howard Beach Realty Inc,1.475194e+12,0.84,Dishwasher,Dryer,Refrigerator,Multiple Occupancy,1940,,Other,"Garage, Garage - Attached, Covered",,,4.0,0.0,0.0,0.0,0.0,6.0,False,Jamaica,,Other,Community District 27,,False,0,False,False,False,False,True,True,False,False,False,False,False,False,False,False,False,False,,,Community District 27,Multiple Occupancy,,"1,152 sqft","3,998 sqft",Community District 27,,,Appliances \ REF \ Inc,,Appliances \ STOVE \ Inc,,BSMT \ Full,,Basement \ BSMT \ Full,,Bedrooms \ MBR 1ST FLOOR \ Y,,CLASS \ Ren,,COUNTY \ Queens,,FIN \ Y,,KIT \ Eik,,Kitchen \ KIT \ Eik,LEASE \ Y,Location \ COUNTY \ Queens,142060003,0,Garage,Garage - Attached,DiningRoom,,,Colonial,6614.0,775000.0,1940.0,1940.0,0.5,PK-8,True,Primary,https://www.greatschools.org/school?id=02489&s...,Ps 146 Howard 
Beach,8.0,681.0,14.0,2.0,Public,1.6,9-12,True,High,https://www.greatschools.org/school?id=02007&s...,John Adams High School,3.0,2364.0,14.0,1.0,Public,,,,,,,,,,,,https://www.zillow.com/homedetails/9806-163rd-...,1940.0,32217899.0
378,Howard Beach,NY,10242 164th Dr,11414.0,2.0,3.0,USD,,"Freshly painted apartment with ample space, th...",SOLD,40.651054,2100.0,-73.828117,2150.0,5763a69dc921149846bb92becc65ae02,Zillow Rental Manager,,,,Listing removed,True,2700.0,0.0,,,,False,Zillow Rental Manager,1.608336e+12,5763a69dc921149846bb92becc65ae02,Zillow Rental Manager,Price change,True,2700.0,0.080000,False,Zillow Rental Manager,1.607818e+12,5763a69dc921149846bb92becc65ae02,Zillow Rental Manager,Listed for rent,True,2500.0,0.000000,False,Zillow Rental Manager,1.607299e+12,,,Sold,False,2150.0,-0.995939,False,Agent Provided,1.560643e+12,,Public Record,Sold,False,529490.0,1.614765,False,Public Record,1.443658e+12,,Public Record,Sold,False,202500.0,0.000000,False,Public Record,1.165277e+12,,,,,,,,,,,,,,,,,0.84,Dryer,Washer,,Single Family,2012,Forced air,Other,Off-street,"3,200 sqft",,2.0,0.0,0.0,0.0,0.0,3.0,False,Howard Beach,Frame,Other,,,False,0,False,False,False,False,True,,False,True,False,False,False,False,,False,False,False,Forced air,,,Single Family,,"2,100 sqft","3,200 sqft",,,,Broker Exclusive,Cooling System,Air Conditioning,,Electricity not included in rent,,Gas not included in rent,Laundry,In Unit,,Water not included in rent,,,,,,,,,,,142551707,0,Off-street,,,,2.0,,6177.0,611000.0,2012.0,,0.7,PK-8,True,Primary,https://www.greatschools.org/school?id=02489&s...,Ps 146 Howard Beach,8.0,681.0,14.0,2.0,Public,1.8,9-12,True,High,https://www.greatschools.org/school?id=02007&s...,John Adams High School,3.0,2364.0,14.0,1.0,Public,,,,,,,,,,,,https://www.zillow.com/homedetails/10242-164th...,2012.0,112516361.0
408,Far Rockaway,NY,1341 Chandler St,11691.0,1.0,2.0,USD,,Amazing 2 fam corner property !Fully renovated...,SOLD,40.606884,1666.0,-73.756783,538000.0,a5b89baebe4e850175451ee321db674e,Realty Trends Corp,,,,Listing removed,True,1950.0,0.0,,,,False,Realty Trends Corp,1.572221e+12,a5b89baebe4e850175451ee321db674e,Realty Trends Corp,Listed for rent,True,1950.0,0.218750,False,Realty Trends Corp,1.570925e+12,,Public Record,Sold,False,538000.0,-0.001855,False,Public Record,1.531354e+12,738871f840ff2586e22dbe94b5db6678,New York 1 Homes Network Inc,Listing removed,False,539000.0,0.000000,False,New York 1 Homes Network Inc,1.528762e+12,738871f840ff2586e22dbe94b5db6678,New York 1 Homes Network Inc,Listed for sale,False,539000.0,0.146809,False,New York 1 Homes Network Inc,1.522800e+12,738871f840ff2586e22dbe94b5db6678,New York 1 Homes Network Inc,Listing removed,False,470000.0,0.000000,False,New York 1 Homes Network Inc,1.513123e+12,New York 1 Homes Network Inc,Price change,False,470000.0,-0.010526,False,New York 1 Homes Network Inc,1.497830e+12,New York 1 Homes Network Inc,Price change,False,475000.0,-0.008351,False,New York 1 Homes Network Inc,1.495066e+12,0.84,Dryer,Refrigerator,Washer,Apartment,1930,Forced air,,"Garage, Off-street, Covered",,Unfinished,1.0,0.0,0.0,0.0,0.0,2.0,False,Far Rockaway,,,Community District 27,Hardwood,False,1,False,False,False,False,False,False,True,True,False,False,False,False,False,False,False,False,Forced air,,Community District 27,Apartment,,"1,666 sqft","1,799 sqft",Community District 27,,,Appliances \ REF \ Inc,,Appliances \ STOVE \ Inc,,BSMT \ -,,Basement \ BSMT \ -,,Bedrooms \ MBR 1ST FLOOR \ Y,,CLASS \ Ren,,COUNTY \ Queens,,ELEC \ Not Inc,,GAS \ Not Inc,,KIT \ Eik,Kitchen \ KIT \ Eik,Location \ COUNTY \ Queens,156620001,1,Garage,Off-street,DiningRoom,,,Colonial,3261.0,419000.0,1930.0,1930.0,0.5,PK-5,True,Primary,https://www.greatschools.org/school?id=02357&s...,Ps 104 The Bays 
Water,6.0,685.0,14.0,1.0,Public,0.5,6-12,True,Middle,https://www.greatschools.org/school?id=13371&s...,Acadey of Medical Technology - A College Board...,4.0,645.0,13.0,1.0,Public,,,,,,,,,,,,https://www.zillow.com/homedetails/1341-Chandl...,1930.0,32221415.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
75403,Flushing,NY,6125 Austin St,11374.0,2.5,3.0,USD,,Beautiful detached whole house for rent in the...,SOLD,40.729725,1500.0,-73.869637,870000.0,0c423f1f8ddcda3ef2c026387d0d5afa,City Realty Group,,,,Listing removed,True,3500.0,0.0,,,,False,City Realty Group,1.578010e+12,0c423f1f8ddcda3ef2c026387d0d5afa,City Realty Group,Listed for rent,True,3500.0,0.000000,False,City Realty Group,1.577664e+12,,Public Record,Sold,False,870000.0,-0.116751,False,Public Record,1.574035e+12,c7261ce1aaa694db09c319b383a9b1d3,Cyclone Realty,Listing removed,False,985000.0,0.000000,False,Cyclone Realty,1.546906e+12,c7261ce1aaa694db09c319b383a9b1d3,Cyclone Realty,Pending sale,False,985000.0,0.000000,False,Cyclone Realty,1.508112e+12,44d11ad3a3896b038cdcb2d79b3420d8,Cyclone Realty,Listed for sale,False,985000.0,0.647157,False,Cyclone Realty,1.504742e+12,Public Record,Sold,False,598000.0,0.000000,False,Public Record,1.137024e+12,,,,,,,,,0.84,Dishwasher,Dryer,Washer,Single Family,1925.0,,,"Garage, Garage - Detached","3,449 sqft",,,0.0,0.0,0.0,0.0,3.0,False,Flushing,,,Community District 28,Hardwood,False,0,False,False,False,False,False,,False,False,False,False,False,False,,False,False,False,,,Community District 28,Single Family,,"1,500 sqft","3,449 sqft",Community District 28,,,Courtyard,,Fios Available,Laundry,In Unit,Parking Type,Garage,,,,,,,,,,,,,,,030900057,0,Garage,Garage - Detached,DiningRoom,,,Colonial,4273.0,1031000.0,1925.0,1925.0,0.4,K-5,True,Elementary,https://www.greatschools.org/school?id=02466&s...,Ps 139 Rego Park,7.0,748.0,16.0,1.0,Public,1.2,6-8,True,Middle,https://www.greatschools.org/school?id=02599&s...,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,Public,1.5,9-12,True,High,https://www.greatschools.org/school?id=01960&s...,Forest Hills High School,5.0,3784.0,20.0,1.0,Public,https://www.zillow.com/homedetails/6125-Austin...,1925.0,31998864.0
75434,Glendale,NY,8851 Aubrey Ave,11385.0,1.5,4.0,USD,,"Beautifully renovated 4 bedroom apt, large roo...",SOLD,40.711185,3074.0,-73.862823,900000.0,11ce660762d700db8e1df593bc304728,RE/MAX City Square,,,,Listing removed,True,2700.0,0.0,,,,False,RE/MAX City Square,1.573862e+12,11ce660762d700db8e1df593bc304728,RE/MAX City Square,Listed for rent,True,2700.0,0.000000,False,RE/MAX City Square,1.572048e+12,,Public Record,Sold,False,900000.0,-0.302866,False,Public Record,1.538438e+12,f451499cd6da5eb3bc531b1098c68174,Winzone Realty Inc,Price change,False,1291000.0,-0.064493,False,Winzone Realty Inc,1.527034e+12,0e9e987845a8b924049c730ccfa0cacf,Winzone Realty Inc,Price change,False,1380000.0,-0.067568,False,Winzone Realty Inc,1.519690e+12,0e9e987845a8b924049c730ccfa0cacf,Winzone Realty Inc,Listed for sale,False,1480000.0,0.238494,False,Winzone Realty Inc,1.519085e+12,CENTURY 21 American Homes,Listing removed,False,1195000.0,0.000000,False,CENTURY 21 American Homes,1.482538e+12,Century 21 Best Inc,Listed for sale,False,1195000.0,0.770402,False,Century 21 Best Inc,1.461542e+12,0.84,Refrigerator,,,Single Family,1970.0,,,"Garage, Garage - Attached","3,075 sqft",,,0.0,0.0,0.0,0.0,4.0,False,Glendale,Frame,,Community District 24,,False,0,False,False,False,False,False,,False,False,False,False,False,False,,False,False,False,,,Community District 24,Single Family,,"3,074 sqft","3,075 sqft",Community District 24,,MLS Listing ID,3175827,MLS Name,MLS of Long Island,,,,,,,,,,,,,,,,,,,038510154,0,Garage,Garage - Attached,DiningRoom,,,Colonial,8591.0,908000.0,1970.0,1970.0,0.4,K-8,True,Elementary,https://www.greatschools.org/school?id=02386&s...,Ps 113 Isaac Chauncey,9.0,895.0,17.0,1.0,Public,2.0,9-12,True,High,https://www.greatschools.org/school?id=02023&s...,Newtown High School,3.0,1913.0,17.0,2.0,Public,,,,,,,,,,,,https://www.zillow.com/homedetails/8851-Aubrey...,1970.0,32019133.0
75461,Flushing,NY,6736 79th St,11379.0,1.0,2.0,USD,,Beautiful 2 Bedroom in Middle Village\nHardwoo...,SOLD,40.712185,3825.0,-73.873779,999000.0,e15e491142d25982b4c0360a0b788190,Concrete Jungle NYC,,,,Listing removed,True,1850.0,0.0,,,,False,Concrete Jungle NYC,1.562112e+12,e15e491142d25982b4c0360a0b788190,Concrete Jungle NYC,Listed for rent,True,1850.0,0.000000,False,Concrete Jungle NYC,1.561766e+12,,Public Record,Sold,False,999000.0,-0.184490,False,Public Record,1.543968e+12,3f6e3d2ba5e09f81da11eb800c0a3ecb,Keller Williams Realty Landmark II,Listing removed,False,1225000.0,0.000000,False,Keller Williams Realty Landmark II,1.527206e+12,3f6e3d2ba5e09f81da11eb800c0a3ecb,Keller Williams Realty Landmark II,Price change,False,1225000.0,-0.125000,False,Keller Williams Realty Landmark II,1.518739e+12,c66cb65e9c248de916b7cbff1fd87e35,Keller Williams Landmark II,Listed for sale,False,1400000.0,0.631702,False,Keller Williams Landmark II,1.510618e+12,Keller Williams Realty Landmark II,Listing removed,False,858000.0,0.000000,False,Keller Williams Realty Landmark II,1.313626e+12,Keller Williams Realty Landmark,Listed for sale,False,858000.0,0.865217,False,Keller Williams Realty Landmark,1.282262e+12,0.84,Range / Oven,Refrigerator,,Multiple Occupancy,2005.0,,,,,,1.0,0.0,0.0,0.0,0.0,2.0,False,Flushing,,,Community District 24,Hardwood,False,0,False,False,False,False,False,,False,False,False,False,False,False,,False,False,False,,,Community District 24,Multiple Occupancy,,"3,825 sqft","2,500 sqft",Community District 24,,roof_types,Unknown,taxes_annual,1820,,,,,,,,,,,,,,,,,,,037770034,0,,,,,,Contemporary,6249.0,1454000.0,2005.0,,0.1,PK-8,True,Primary,https://www.greatschools.org/school?id=02306&s...,Ps 87 Middle Village,6.0,608.0,11.0,1.0,Public,2.2,9-12,True,High,https://www.greatschools.org/school?id=01968&s...,Grover Cleveland H.S.,3.0,1751.0,16.0,2.0,Public,,,,,,,,,,,,https://www.zillow.com/homedetails/6736-79th-S...,2005.0,32016614.0
75561,Glendale,NY,7978 77th Rd,11385.0,1.0,2.0,USD,,Beautiful 2 Bedroom in Glendale! Apartment Fea...,SOLD,40.705814,750.0,-73.868980,550000.0,d76e349ac4bb6cf70c43bd0e872f828f,EXP Realty,,,,Listing removed,True,2200.0,0.0,,,,False,EXP Realty,1.598918e+12,d76e349ac4bb6cf70c43bd0e872f828f,EXP Realty,Listed for rent,True,2200.0,0.517241,False,EXP Realty,1.595981e+12,,Public Record,Sold,False,550000.0,0.111111,False,Public Record,1.554163e+12,6ebba6a3bb49cb7851813b4ffff4865c,FILLMORE,Listing removed,True,1450.0,0.000000,False,FILLMORE,1.416442e+12,6ebba6a3bb49cb7851813b4ffff4865c,FILLMORE,Listed for rent,True,1450.0,0.000000,False,FILLMORE,1.415232e+12,,Public Record,Sold,False,495000.0,0.000000,False,Public Record,1.055462e+12,,,,,,,,,,,,,,,,,0.84,Range / Oven,Refrigerator,,Apartment,1960.0,,,,,,1.0,0.0,0.0,0.0,0.0,2.0,False,Glendale,,,,,False,0,False,False,False,False,False,,False,False,False,False,False,False,,False,False,False,,,,Apartment,,750 sqft,"2,700 sqft",,,,ArchitecturalStyle \ Apt In Bldg,,AssociationFeeIncludes \ Heat,,AssociationFeeIncludes \ Water,,CommunityFeatures \ Near Public Transportation,,CommunityFeatures \ Park,Elementary School,Is 119 Glendale (The),,FireplaceYN \ 0,,LotFeatures \ Near Public Transit,MLS Listing ID,3237531,MLS Name,OneKey ZDD (OneKey ZDD),PoolPrivateYN \ 0,PropertyType \ Residential Lease,038180032,0,,,,,2.0,,15415.0,532000.0,1960.0,,0.4,K-8,True,Elementary,https://www.greatschools.org/school?id=02402&s...,I.S. 119 the Glendale,8.0,1278.0,17.0,1.0,Public,2.5,9-12,True,High,https://www.greatschools.org/school?id=01968&s...,Grover Cleveland H.S.,3.0,1751.0,16.0,1.0,Public,,,,,,,,,,,,https://www.zillow.com/homedetails/7978-77th-R...,1960.0,32017911.0


In [41]:
# Remove rows whose latest price-history entry (index 0) is a rental posting
df.drop(df[df['priceHistory/0/postingIsRental'] == True].index, inplace=True)

In [42]:
# Encode boolean flags as 0/1
df.replace({False: 0, True: 1}, inplace=True)

In [43]:
df.describe()

Unnamed: 0,address/zipcode,bathrooms,bedrooms,dateposted,latitude,livingArea,longitude,price,priceHistory/0/postingIsRental,priceHistory/0/price,priceHistory/0/priceChangeRate,priceHistory/0/showCountyLink,priceHistory/0/time,priceHistory/1/postingIsRental,priceHistory/1/price,priceHistory/1/priceChangeRate,priceHistory/1/showCountyLink,priceHistory/1/time,priceHistory/2/postingIsRental,priceHistory/2/price,priceHistory/2/priceChangeRate,priceHistory/2/showCountyLink,priceHistory/2/time,priceHistory/3/postingIsRental,priceHistory/3/price,priceHistory/3/priceChangeRate,priceHistory/3/showCountyLink,priceHistory/3/time,priceHistory/4/postingIsRental,priceHistory/4/price,priceHistory/4/priceChangeRate,priceHistory/4/showCountyLink,priceHistory/4/time,priceHistory/5/postingIsRental,priceHistory/5/price,priceHistory/5/priceChangeRate,priceHistory/5/showCountyLink,priceHistory/5/time,priceHistory/6/postingIsRental,priceHistory/6/price,priceHistory/6/priceChangeRate,priceHistory/6/showCountyLink,priceHistory/6/time,priceHistory/7/postingIsRental,priceHistory/7/price,priceHistory/7/priceChangeRate,priceHistory/7/showCountyLink,priceHistory/7/time,propertyTaxRate,resoFactsStats/bathrooms,resoFactsStats/bathroomsFull,resoFactsStats/bathroomsHalf,resoFactsStats/bathroomsOneQuarter,resoFactsStats/bathroomsThreeQuarter,resoFactsStats/bedrooms,resoFactsStats/canRaiseHorses,resoFactsStats/furnished,resoFactsStats/garageSpaces,resoFactsStats/hasAdditionalParcels,resoFactsStats/hasAttachedGarage,resoFactsStats/hasAttachedProperty,resoFactsStats/hasCarport,resoFactsStats/hasCooling,resoFactsStats/hasFireplace,resoFactsStats/hasGarage,resoFactsStats/hasHeating,resoFactsStats/hasHomeWarranty,resoFactsStats/hasLandLease,resoFactsStats/hasOpenParking,resoFactsStats/hasPetsAllowed,resoFactsStats/hasPrivatePool,resoFactsStats/hasRentControl,resoFactsStats/hasSpa,resoFactsStats/hasView,resoFactsStats/isNewConstruction,resoFactsStats/onMarketDate,resoFactsStats/parking,resoFactsStats/stories,resoFactsStats/taxAnnualAmount,resoFactsStats/taxAssessedValue,resoFactsStats/yearBuilt,resoFactsStats/yearBuiltEffective,schools/0/distance,schools/0/isAssigned,schools/0/rating,schools/0/size,schools/0/studentsPerTeacher,schools/0/totalCount,schools/1/distance,schools/1/isAssigned,schools/1/rating,schools/1/size,schools/1/studentsPerTeacher,schools/1/totalCount,schools/2/distance,schools/2/isAssigned,schools/2/rating,schools/2/size,schools/2/studentsPerTeacher,schools/2/totalCount,yearBuilt,zpid
count,46791.0,46792.0,46792.0,17206.0,46791.0,46792.0,46791.0,46763.0,46784.0,46747.0,46749.0,46784.0,46784.0,41688.0,41596.0,41603.0,41688.0,41688.0,39280.0,39181.0,39185.0,39280.0,39280.0,33483.0,33351.0,33365.0,33483.0,33483.0,27240.0,27063.0,27088.0,27240.0,27240.0,21916.0,21711.0,21742.0,21916.0,21916.0,17552.0,17345.0,17366.0,17552.0,17552.0,13933.0,13754.0,13770.0,13933.0,13933.0,46785.0,45955.0,44399.0,44382.0,31528.0,39119.0,46792.0,46792.0,46792.0,46792.0,46792.0,46792.0,46792.0,46792.0,45998.0,14680.0,46792.0,46615.0,46792.0,46792.0,46792.0,46792.0,18131.0,46792.0,46792.0,46792.0,13628.0,17206.0,46792.0,29817.0,41024.0,40236.0,44439.0,24564.0,46722.0,46722.0,46203.0,46444.0,46184.0,46722.0,46268.0,46268.0,46192.0,46173.0,45315.0,46268.0,35167.0,35167.0,35147.0,35139.0,34831.0,35167.0,44419.0,46792.0
mean,10838.975551,2.822816,4.0424,1603914000000.0,40.693087,2775.702919,-73.944114,1003919.0,0.0,1009257.0,280.3336,0.0,1570025000000.0,0.037853,960181.0,231.3589,0.0,1530793000000.0,0.04002,936630.2,596.0665,0.0,1521413000000.0,0.039871,900366.4,285.805323,0.0,1462156000000.0,0.052349,895817.9,686.5025,0.0,1432649000000.0,0.054116,816131.5,250.7897,0.0,1411707000000.0,0.054923,795836.3,475.1121,0.0,1392504000000.0,0.0567,779645.6,293.0819,0.0,1379118000000.0,0.833163,2.831966,2.082614,0.519377,0.002633,0.150847,4.0424,0.0,0.006283,0.559839,0.0,0.041396,0.071615,0.03475,0.453585,0.400204,0.26543,0.535064,0.0,0.0,0.05159,0.021029,0.166896,0.0,0.009275,0.094375,0.059657,1593902000000.0,0.518401,2.57712,15595.21,1112901.0,1948.397151,2766.269,0.389795,1.0,6.198537,726.256395,13.912047,1.195347,0.891813,1.0,5.661089,996.635458,14.237449,1.239388,1.21585,1.0,3.688537,2111.404821,17.289397,1.184093,1949.274432,307007700.0
std,528.21038,6.451641,7.349804,9907187000.0,0.103393,13503.476184,0.139483,1831713.0,0.0,1809040.0,16966.01,0.0,29636880000.0,0.190842,1725092.0,15957.2,0.0,103052400000.0,0.196009,1663645.0,21535.56,0.0,105647600000.0,0.195659,4199573.0,9409.190782,0.0,169672000000.0,0.222735,6256834.0,31553.26,0.0,183056300000.0,0.226251,1733687.0,10381.98,0.0,187279900000.0,0.227835,1700066.0,20217.76,0.0,190364500000.0,0.231277,1688207.0,15158.59,0.0,188643600000.0,0.097864,6.508337,1.513018,6.442786,0.059278,0.458427,7.349804,0.0,0.079018,7.812925,0.0,0.199206,0.257852,0.183147,0.497846,0.489956,0.441567,0.498774,0.0,0.0,0.2212,0.143483,0.372894,0.0,0.095861,0.292353,0.236858,19691760000.0,7.814519,2.655639,131019.1,9029546.0,57.288685,128630.1,0.281603,0.0,2.113628,289.794506,2.040061,0.771877,0.709641,0.0,2.222016,664.874297,2.36623,0.787799,0.899164,0.0,1.666988,1329.406117,2.707508,0.591514,39.665745,687262600.0
min,148.0,0.5,1.0,1451194000000.0,40.498634,1.0,-74.253983,1.0,0.0,0.0,-1.0,0.0,1200010000000.0,0.0,0.0,-1.0,0.0,300326400000.0,0.0,0.0,-1.0,0.0,165196800000.0,0.0,0.0,-1.0,0.0,55036800000.0,0.0,0.0,-0.9999975,0.0,27561600000.0,0.0,0.0,-1.0,0.0,-86400000.0,0.0,1.0,-0.999999,0.0,173923200000.0,0.0,0.0,-1.0,0.0,154569600000.0,0.65,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1451174000000.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,1.0,1.0,115.0,6.0,1.0,0.0,1.0,1.0,85.0,7.0,1.0,0.0,1.0,1.0,101.0,7.0,1.0,1.0,29776660.0
25%,10312.0,2.0,3.0,1599951000000.0,40.606413,1296.0,-74.077274,510000.0,0.0,515000.0,-0.05309735,0.0,1543882000000.0,0.0,485000.0,0.0,0.0,1524182000000.0,0.0,479000.0,-0.02004008,0.0,1518134000000.0,0.0,399900.0,-0.021322,0.0,1438560000000.0,0.0,355000.0,-0.01788909,0.0,1379030000000.0,0.0,330000.0,-0.01578688,0.0,1342051000000.0,0.0,315000.0,-0.01392758,0.0,1318961000000.0,0.0,299900.0,-0.01393734,0.0,1305245000000.0,0.84,2.0,1.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1593043000000.0,0.0,2.0,4329.0,489000.0,1925.0,1920.0,0.2,1.0,5.0,532.0,13.0,1.0,0.4,1.0,4.0,504.0,13.0,1.0,0.5,1.0,3.0,641.0,15.0,1.0,1925.0,30731000.0
50%,11004.0,3.0,3.0,1606156000000.0,40.685352,1800.0,-73.914764,689000.0,0.0,695000.0,-0.01557286,0.0,1571357000000.0,0.0,669000.0,0.0,0.0,1554077000000.0,0.0,659000.0,0.0,0.0,1545955000000.0,0.0,599000.0,0.0,0.0,1528157000000.0,0.0,559999.0,0.0,0.0,1511136000000.0,0.0,525000.0,0.0,0.0,1493035000000.0,0.0,499000.0,0.0,0.0,1459296000000.0,0.0,474250.0,0.0,0.0,1432512000000.0,0.84,3.0,2.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1599951000000.0,0.0,2.0,5651.0,622000.0,1943.0,1940.0,0.3,1.0,7.0,690.0,14.0,1.0,0.7,1.0,6.0,973.0,14.0,1.0,1.0,1.0,3.0,1984.0,18.0,1.0,1944.0,32198580.0
75%,11363.0,3.0,5.0,1610390000000.0,40.760981,2500.0,-73.83733,960000.0,0.0,969000.0,0.0,0.0,1600128000000.0,0.0,949000.0,0.0,0.0,1582762000000.0,0.0,930000.0,0.02127886,0.0,1573430000000.0,0.0,889000.0,0.001178,0.0,1561507000000.0,0.0,839000.0,0.0,0.0,1551830000000.0,0.0,799900.0,0.003836777,0.0,1540620000000.0,0.0,779000.0,0.01499332,0.0,1531267000000.0,0.0,749000.0,0.002222168,0.0,1523146000000.0,0.87,3.0,3.0,1.0,0.0,0.0,5.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1606176000000.0,0.0,3.0,7415.0,886000.0,1975.0,1970.0,0.5,1.0,8.0,878.0,15.0,1.0,1.2,1.0,7.0,1341.0,16.0,1.0,1.7,1.0,4.0,3281.0,19.0,1.0,1975.0,32375900.0
max,29512.0,1346.0,1502.0,1611281000000.0,40.911961,986641.0,-73.700432,90000000.0,0.0,90000000.0,2649999.0,0.0,1611274000000.0,1.0,88000000.0,1794999.0,0.0,1611014000000.0,1.0,98000000.0,1999999.0,0.0,1610842000000.0,1.0,699000000.0,878999.0,0.0,1610842000000.0,1.0,690900000.0,3302999.0,0.0,1610323000000.0,1.0,73123490.0,1008332.0,0.0,1608509000000.0,1.0,54000000.0,1449999.0,0.0,1607990000000.0,1.0,54000000.0,1449999.0,0.0,1607645000000.0,2.21,1346.0,70.0,1344.0,4.0,6.0,1502.0,0.0,1.0,1422.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,1.0,1.0,1.0,0.0,1.0,1.0,1.0,1611274000000.0,1422.0,112.0,7123810.0,1680558000.0,4073.0,20162020.0,6.7,1.0,10.0,2000.0,20.0,17.0,6.6,1.0,10.0,5839.0,24.0,12.0,6.6,1.0,10.0,5839.0,24.0,5.0,4073.0,2146979000.0
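
The `dateposted`, `priceHistory/*/time`, and `resoFactsStats/onMarketDate` columns in the summary above hold values around 1.6e12, which is consistent with Unix timestamps in milliseconds. That reading is an assumption; nothing in this output states the unit. Under that assumption, a minimal sketch for turning one such value into a readable date (the helper name `ms_to_utc` is hypothetical):

```python
from datetime import datetime, timezone


def ms_to_utc(ms: float) -> datetime:
    """Treat `ms` as milliseconds since the Unix epoch (assumed unit) and return a UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# Max of `dateposted` as displayed (rounded) in the describe() output above
latest_posting = ms_to_utc(1611281000000.0)
print(latest_posting.isoformat())
```

On the full frame, `pd.to_datetime(df['dateposted'], unit='ms')` performs the same conversion column-wise.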


In [44]:
df['address/city'].unique()

array(['New York', 'Bronx', 'Manhattan', 'new york', 'New york', 'Street',
       'Pelham', 'Howard Beach', 'Broad Channel', 'Far Rockaway',
       'Jamaica', 'Rosedale', 'Rockaway Park', 'Belle Harbor', 'Neponsit',
       'Breezy Point', 'Queens', 'Far rockaway', 'BELLE HARBOR',
       'Belle harbor', 'rosedale', 'Rockaway point', 'Far Rockway',
       'Washington Heights', 'Avenue', 'Brooklyn', 'Maspeth',
       'Little Neck', 'Flushing', 'NEW YORK', 'Staten Island',
       'Staten island', 'staten Island', 'staten island', 'BROOKLYN',
       'Cambria Heights', 'Queens Village', 'Springfield Gardens',
       'Arverne', 'far rockaway', 'Lawrence', 'belle harbor',
       'New york City', 'NY', 'Yonkers', 'College', 'bronx', 'Cen',
       'elmont', 'boulevard', 'Breezy Pt', 'Rockaway Beach',
       'Rockaway Point', 'Rockaway park', '350w42ndst', 'Oval', 'BRONX',
       'East 214th Street', 'West 156th', 'Concourse', 'Plains',
       'New York CIty', 'New York City', 'West', 'Long Islan
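
Several `address/city` values above differ only by case ('New York', 'new york', 'NEW YORK'; 'Staten Island', 'staten island'). One possible cleanup, sketched here as an assumption rather than the step this notebook actually takes, is to collapse whitespace and title-case each name. The helper `normalize_city` is hypothetical.

```python
def normalize_city(name):
    """Collapse case and whitespace variants of a city name.

    Non-string values (for example NaN) are returned unchanged.
    """
    if not isinstance(name, str):
        return name
    return " ".join(name.split()).title()


# A few of the variants from the unique() output above
raw = ['New York', 'new york', 'New york', 'NEW YORK',
       'New York CIty', 'Staten island', 'staten Island']
cleaned = sorted({normalize_city(c) for c in raw})
print(cleaned)
```

Applied to the frame, this would be `df['address/city'] = df['address/city'].map(normalize_city)`. It does not repair misspellings such as 'Far Rockway' or fragments such as 'Street' and 'Avenue'; those would need a manual mapping.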

In [45]:
df['resoFactsStats/cityRegion'].unique()

array(['New York', 'Bronx', 'Manhattan', 'Street', 'Pelham',
       'Howard Beach', 'Broad Channel', 'Far Rockaway', 'Jamaica',
       'Rosedale', 'Rockaway Park', 'Belle Harbor', 'Neponsit',
       'Breezy Point', 'Queens', 'Rockaway Point', 'Far Rockway',
       'Washington Heights', 'Avenue', 'Brooklyn', 'Maspeth',
       'Little Neck', 'Flushing', 'Staten Island', 'Cambria Heights',
       'Queens Village', 'Springfield Gardens', 'Arverne', 'Lawrence',
       'New York City', 'Ny', 'Yonkers', 'College', 'Cen', 'Elmont',
       'Boulevard', 'Breezy Pt', 'Rockaway Beach', '350 W 42 Ndst',
       'Oval', 'East 214th Street', 'West 156th', 'Concourse', 'Plains',
       'West', 'Long Island City', 'Astoria', 'College Pt',
       'East Elmhurst', 'Woodside', 'College Point', 'Corona',
       'Sunnyside', 'Ridgewood', 'Greenpoint', 'Bayside', 'Whitestone',
       'Beechhurst', '16th', 'Douglaston', 'Douglas Manor', 'Floral Park',
       'Great Neck', 'Flushing Ny 11355 Queensboro Hills',


In [46]:
# Drop every column whose name mentions a data source (either capitalization)
df.drop(columns=df.columns[df.columns.str.contains('source|Source')], inplace=True)
# Drop identifiers, links, flags that describe() shows as constant, and other unused columns
df.drop(columns=['currency', 'resoFactsStats/atAGlanceFacts/1/factValue', 'resoFactsStats/atAGlanceFacts/0/factValue',
                 'priceHistory/0/postingIsRental', 'priceHistory/0/showCountyLink', 'resoFactsStats/canRaiseHorses',
                 'resoFactsStats/yearBuilt', 'zpid', 'url', 'schools/0/link', 'schools/1/link',
                 'resoFactsStats/hasAdditionalParcels', 'resoFactsStats/hasHomeWarranty', 'resoFactsStats/hasLandLease',
                 'resoFactsStats/hasRentControl', 'schools/0/isAssigned', 'schools/1/isAssigned',
                 'resoFactsStats/hasPetsAllowed', 'schools/0/grades', 'schools/1/grades', 'address/state'],
        inplace=True)

In [47]:
df

Unnamed: 0,address/city,address/streetAddress,address/zipcode,bathrooms,bedrooms,dateposted,description,homeStatus,latitude,livingArea,longitude,price,priceHistory/0/buyerAgent/name,priceHistory/0/buyerAgent/photo/url,priceHistory/0/buyerAgent/profileUrl,priceHistory/0/event,priceHistory/0/price,priceHistory/0/priceChangeRate,priceHistory/0/sellerAgent/name,priceHistory/0/sellerAgent/photo/url,priceHistory/0/sellerAgent/profileUrl,priceHistory/0/time,priceHistory/1/event,priceHistory/1/postingIsRental,priceHistory/1/price,priceHistory/1/priceChangeRate,priceHistory/1/showCountyLink,priceHistory/1/time,priceHistory/2/event,priceHistory/2/postingIsRental,priceHistory/2/price,priceHistory/2/priceChangeRate,priceHistory/2/showCountyLink,priceHistory/2/time,priceHistory/3/event,priceHistory/3/postingIsRental,priceHistory/3/price,priceHistory/3/priceChangeRate,priceHistory/3/showCountyLink,priceHistory/3/time,priceHistory/4/event,priceHistory/4/postingIsRental,priceHistory/4/price,priceHistory/4/priceChangeRate,priceHistory/4/showCountyLink,priceHistory/4/time,priceHistory/5/event,priceHistory/5/postingIsRental,priceHistory/5/price,priceHistory/5/priceChangeRate,priceHistory/5/showCountyLink,priceHistory/5/time,priceHistory/6/event,priceHistory/6/postingIsRental,priceHistory/6/price,priceHistory/6/priceChangeRate,priceHistory/6/showCountyLink,priceHistory/6/time,priceHistory/7/event,priceHistory/7/postingIsRental,priceHistory/7/price,priceHistory/7/priceChangeRate,priceHistory/7/showCountyLink,priceHistory/7/time,propertyTaxRate,resoFactsStats/appliances/0,resoFactsStats/appliances/1,resoFactsStats/appliances/2,resoFactsStats/atAGlanceFacts/2/factValue,resoFactsStats/atAGlanceFacts/3/factValue,resoFactsStats/atAGlanceFacts/4/factValue,resoFactsStats/atAGlanceFacts/5/factValue,resoFactsStats/basement,resoFactsStats/bathrooms,resoFactsStats/bathroomsFull,resoFactsStats/bathroomsHalf,resoFactsStats/bathroomsOneQuarter,resoFactsStats/bathroomsThreeQuarter,resoFactsStats/bedrooms,resoFactsStats/cityRegion,resoFactsStats/constructionMaterials/0,resoFactsStats/cooling/0,resoFactsStats/elementarySchoolDistrict,resoFactsStats/flooring/0,resoFactsStats/furnished,resoFactsStats/garageSpaces,resoFactsStats/hasAttachedGarage,resoFactsStats/hasAttachedProperty,resoFactsStats/hasCarport,resoFactsStats/hasCooling,resoFactsStats/hasFireplace,resoFactsStats/hasGarage,resoFactsStats/hasHeating,resoFactsStats/hasOpenParking,resoFactsStats/hasPrivatePool,resoFactsStats/hasSpa,resoFactsStats/hasView,resoFactsStats/heating/0,resoFactsStats/heating/1,resoFactsStats/highSchoolDistrict,resoFactsStats/homeType,resoFactsStats/isNewConstruction,resoFactsStats/livingArea,resoFactsStats/lotSize,resoFactsStats/middleOrJuniorSchoolDistrict,resoFactsStats/onMarketDate,resoFactsStats/otherFacts/0/name,resoFactsStats/otherFacts/0/value,resoFactsStats/otherFacts/1/name,resoFactsStats/otherFacts/1/value,resoFactsStats/otherFacts/2/name,resoFactsStats/otherFacts/2/value,resoFactsStats/otherFacts/3/name,resoFactsStats/otherFacts/3/value,resoFactsStats/otherFacts/4/name,resoFactsStats/otherFacts/4/value,resoFactsStats/otherFacts/5/name,resoFactsStats/otherFacts/5/value,resoFactsStats/otherFacts/6/name,resoFactsStats/otherFacts/6/value,resoFactsStats/otherFacts/7/name,resoFactsStats/otherFacts/7/value,resoFactsStats/otherFacts/8/name,resoFactsStats/otherFacts/8/value,resoFactsStats/otherFacts/9/name,resoFactsStats/otherFacts/9/value,resoFactsStats/otherFacts/10/value,resoFactsStats/otherFacts/11/value,resoFactsStats/parcelNumber,resoFactsStats/parking,resoFactsStats/parkingFeatures/0,resoFactsStats/parkingFeatures/1,resoFactsStats/rooms/0/roomType,resoFactsStats/sewer/0,resoFactsStats/stories,resoFactsStats/structureType,resoFactsStats/taxAnnualAmount,resoFactsStats/taxAssessedValue,resoFactsStats/yearBuiltEffective,schools/0/distance,schools/0/level,schools/0/name,schools/0/rating,schools/0/size,schools/0/studentsPerTeacher,schools/0/totalCount,schools/0/type,schools/1/distance,schools/1/level,schools/1/name,schools/1/rating,schools/1/size,schools/1/studentsPerTeacher,schools/1/totalCount,schools/1/type,schools/2/distance,schools/2/grades,schools/2/isAssigned,schools/2/level,schools/2/link,schools/2/name,schools/2/rating,schools/2/size,schools/2/studentsPerTeacher,schools/2/totalCount,schools/2/type,yearBuilt
0,New York,60 Terrace View Ave,10463.0,2.0,5.0,1.610134e+12,"Discover Marble Hill, a neighborhood rich with...",FOR_SALE,40.877743,1889.0,-73.910866,799999.0,,,,Listed for sale,799999.0,0.335558,,,,1.610064e+12,Listing removed,0.0,599000.0,0.000000,0.0,1.459469e+12,Listed for sale,0.0,599000.0,0.711429,0.0,1.426810e+12,Listing removed,0.0,350000.0,0.000000,0.0,1.293062e+12,Price change,0.0,350000.0,-0.066667,0.0,1.276128e+12,Price change,0.0,375000.0,0.071429,0.0,1.275610e+12,Listed for sale,0.0,350000.0,0.000000,0.0,1.265328e+12,,,,,,,0.88,,,,"Natural Gas, Hot Water",,Driveway,,Finished,2.0,1.0,1.0,,,5.0,New York,Frame,,Bronx 10,,0,0,0,0,0,1.0,,0,1.0,0,,0,0,Natural Gas,Hot Water,Bronx 10,Residential,,"1,889 sqft",,Bronx 10,1.610064e+12,,,,,,,,,,,,,,,,,,,,,,,NO TAX ID FOUND,0,Driveway,,,Public Sewer,,,5096.0,711000.0,,0.1,Elementary,Ps 37 Multiple Intelligence School,4.0,647.0,14.0,1.0,Public,0.1,Middle,In Tech Academy Aka Ms High School 368,3.0,993.0,14.0,1.0,Public,,,,,,,,,,,,1920.0
1,Bronx,625 W 246th St,10471.0,8.0,8.0,1.595968e+12,EXCLUSIVE BRAND NEW\nLavish Newly Built 8-Bd. ...,FOR_SALE,40.892689,7000.0,-73.910667,3995000.0,,,,Price change,3995000.0,-0.111235,,,,1.607299e+12,Price change,0.0,4495000.0,-0.080401,0.0,1.601510e+12,Listed for sale,0.0,4888000.0,0.087430,0.0,1.595894e+12,Listing removed,0.0,4495000.0,0.000000,0.0,1.584144e+12,Listed for sale,0.0,4495000.0,3.610256,0.0,1.572566e+12,Sold,0.0,975000.0,-0.025000,0.0,1.450397e+12,Listing removed,0.0,1000000.0,0.000000,0.0,1.447200e+12,Pending sale,0.0,1000000.0,0.000000,0.0,1.439856e+12,0.95,Dishwasher,Dryer,Washer,,Central,"Garage, Garage - Attached",0.29 Acres,,8.0,7.0,1.0,0.0,0.0,8.0,Bronx,,Central,,Hardwood,0,0,0,0,0,1.0,1.0,0,0.0,0,,0,0,,,,Single Family,0.0,"7,000 sqft",0.29 Acres,,1.595894e+12,,Clubhouse,,Granite countertop,,Playground,,Stainless steel appliances,,,,,,,,,,,,,,,059130860,0,Garage,Garage - Attached,,,1.0,Other,13941.0,1937000.0,1940.0,0.4,Elementary,Ps 24 Spuyten Duyvil,10.0,907.0,16.0,1.0,Public,0.3,Middle,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,Public,,,,,,,,,,,,1940.0
2,Bronx,716 W 231st St,10463.0,3.0,4.0,1.592668e+12,This 4233 square foot single family home has 4...,FOR_SALE,40.883419,4233.0,-73.918106,1495000.0,,,,Price change,1495000.0,-0.002668,,,,1.611101e+12,Listed for sale,0.0,1499000.0,0.000000,0.0,1.592611e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.95,Dishwasher,Dryer,Washer,,,"Garage, Garage - Attached",0.42 Acres,,3.0,3.0,0.0,0.0,0.0,4.0,Bronx,,,,,0,0,0,0,0,0.0,,0,0.0,0,,0,0,,,,Single Family,0.0,"4,233 sqft",0.42 Acres,,1.592611e+12,,,,,,,,,,,,,,,,,,,,,,,057500494,0,Garage,Garage - Attached,,,2.0,,12253.0,2341000.0,1920.0,0.3,Elementary,Ps 24 Spuyten Duyvil,10.0,907.0,16.0,1.0,Public,0.4,Middle,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,Public,,,,,,,,,,,,1920.0
3,Bronx,750 W 232nd St,10463.0,6.0,5.0,1.600814e+12,EXCLUSIVE NEW TO MARKET\nPrime Renovation Oppo...,FOR_SALE,40.885033,7000.0,-73.917793,3450000.0,,,,Price change,3450000.0,-0.092105,,,,1.608163e+12,Listed for sale,0.0,3800000.0,0.225806,0.0,1.600733e+12,Sold,0.0,3100000.0,-0.156463,0.0,1.551917e+12,Listing removed,0.0,3675000.0,0.000000,0.0,1.550707e+12,Pending sale,0.0,3675000.0,0.000000,0.0,1.510272e+12,Listed for sale,0.0,3675000.0,-0.125000,0.0,1.506298e+12,Sold,0.0,4200000.0,0.000000,0.0,9.650880e+11,,,,,,,0.95,,,,,Central,"Garage, Garage - Attached",0.26 Acres,,6.0,6.0,0.0,0.0,0.0,5.0,Bronx,,Central,,,0,0,0,0,0,1.0,1.0,0,0.0,0,,0,0,,,,Single Family,0.0,"7,000 sqft",0.26 Acres,,1.600733e+12,,,,,,,,,,,,,,,,,,,,,,,057510300,0,Garage,Garage - Attached,,,2.0,,19472.0,3011000.0,1950.0,0.2,Elementary,Ps 24 Spuyten Duyvil,10.0,907.0,16.0,1.0,Public,0.3,Middle,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,Public,,,,,,,,,,,,1950.0
5,New York,24 Cooper St #5CD,10034.0,2.0,3.0,1.611091e+12,"Due to Coronavirus 19, outbreak, ALL showings ...",FOR_SALE,40.867687,994.0,-73.924606,230000.0,,,,Listed for sale,230000.0,0.000000,,,,1.611014e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.88,,,,,,0 spaces,"$1,472/mo",,2.0,2.0,0.0,0.0,0.0,3.0,New York,,,,,0,0,0,0,0,0.0,,0,0.0,0,,0,0,,,,Condo,0.0,994 sqft,,,1.611014e+12,,,,,,,,,,,,,,,,,,,,,,,,0,,,,,,,,,,0.8,Elementary,Ps 18 Park Terrace,5.0,349.0,13.0,3.0,Public,1.9,Middle,Ms 319 Marie Teresa,7.0,421.0,12.0,3.0,Public,0.1,9-12,1.0,High,https://www.greatschools.org/school?id=18169&s...,INWOOD EARLY COLLEGE FOR HEALTH AND INFORMATIO...,3.0,371.0,,1.0,Public,1925.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
75622,Forest Hills,93-19 71st Ave,11375.0,2.0,4.0,,This house is a rare combination of superb loc...,SOLD,40.712009,2200.0,-73.850281,1255000.0,,,,Sold,1255000.0,0.004804,David Yakubov,https://photos.zillowstatic.com/h_e/ISd8b63jno...,/profile/user3094820/,1.530230e+12,Pending sale,0.0,1249000.0,0.000000,0.0,1.523837e+12,Listed for sale,0.0,1249000.0,1.401923,0.0,1.519603e+12,Sold,0.0,520000.0,0.000000,0.0,1.512086e+12,,,,,,,,,,,,,,,,,,,,,,,,,0.84,Dishwasher,Dryer,Washer,,,"Garage, Garage - Attached","2,500 sqft",,2.0,1.0,1.0,0.0,0.0,4.0,Forest Hills,,,Community District 28,,0,0,0,0,0,0.0,,0,0.0,0,,0,0,,,Community District 28,Single Family,,"2,200 sqft","2,500 sqft",Community District 28,,Den/Family Room,Y,Detached/Attached,Det,Driveway,Pvt,Eat In Kitchen,Y,Picture,Y,Water,Public,Attic,Y,Heat,Steam,Sewer,Y,Wood Floors,Y,Gas,Y,032220033,0,Garage,Garage - Attached,DiningRoom,Y,,Loft,7129.0,1034000.0,1930.0,0.2,Primary,Ps 144 Col Jeromus Remsen,8.0,893.0,16.0,1.0,Public,0.7,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,Public,0.4,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1930.0
75625,Flushing,6829 Manse St,11375.0,2.0,3.0,,Wonderful 1 Family Home. First Floor Features ...,SOLD,40.714203,2417.0,-73.855263,825000.0,,,,Sold,825000.0,-0.049539,Annie/Steve Your Home Sold Guaranteed,https://photos.zillowstatic.com/h_e/ISqh2r0uwq...,/profile/Agardi-Team/,1.532995e+12,Listing removed,0.0,868000.0,0.000000,0.0,1.519690e+12,Listed for sale,0.0,868000.0,0.000000,0.0,1.518566e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,,,,Other,,"Garage, Garage - Attached","2,417 sqft",,2.0,0.0,0.0,0.0,0.0,3.0,Flushing,,,,,0,0,0,0,0,0.0,,0,1.0,0,,0,0,Other,,,Single Family,,"2,417 sqft","2,417 sqft",,,,,,,,,,,,,,,,,,,,,,,,,031950052,0,Garage,Garage - Attached,DiningRoom,,2.0,,6447.0,907000.0,1920.0,0.2,Primary,Ps 144 Col Jeromus Remsen,8.0,893.0,16.0,1.0,Public,0.6,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,Public,0.4,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1920.0
75626,Forest Hills Gardens,82 Greenway Ter,11375.0,6.0,6.0,,"""DISTINQUISHED FIELDSTONE TOWNHOUSE TREASURE""\...",SOLD,40.717163,6085.0,-73.843124,2704000.0,GiGi Malek,https://photos.zillowstatic.com/h_e/ISvww3cyma...,/profile/ForestHillsGiGi/,Sold,2704000.0,0.040400,Linda Weiss,https://photos.zillowstatic.com/h_e/ISbxza59gh...,/profile/lindaweiss11/,1.561939e+12,Listing removed,0.0,2599000.0,0.000000,0.0,1.560989e+12,Pending sale,0.0,2599000.0,0.000000,0.0,1.556410e+12,Listed for sale,0.0,2599000.0,0.000000,0.0,1.554336e+12,Pending sale,0.0,2599000.0,0.000000,0.0,1.553558e+12,Listing removed,1.0,7000.0,0.000000,0.0,1.553126e+12,Price change,1.0,7000.0,-0.176471,0.0,1.552349e+12,Listed for sale,0.0,2599000.0,1.652041,0.0,1.551139e+12,0.84,,,,,,"Garage, Garage - Attached","3,255 sqft",,6.0,5.0,1.0,0.0,0.0,6.0,Forest Hills Gardens,,,,,0,0,0,0,0,0.0,0.0,0,0.0,0,0.0,0,0,,,,Townhouse,,"6,085 sqft","3,255 sqft",,,,Fios Available,,Parking,Parking Type,Garage,,,,,,,,,,,,,,,,,032740007,0,Garage,Garage - Attached,,,2.0,,18430.0,2513000.0,1925.0,0.1,Primary,Ps 101 School In The Gardens,9.0,654.0,16.0,1.0,Public,0.7,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,Public,1.0,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1925.0
75627,Forest Hills Gardens,86 Greenway Ter,11375.0,5.0,6.0,,EXCLUSIVE LISTING OF TERRACE SOTHEBY'S INTERNA...,SOLD,40.717052,4564.0,-73.843025,2750000.0,Terrace Sotheby's International Realty,https://photos.zillowstatic.com/h_e/ISbpd8y5jp...,/profile/terracesir/,Sold,2750000.0,-0.075630,Sheldon Stivelman,https://photos.zillowstatic.com/h_e/ISf403agde...,/profile/SheldonStivelman/,1.532390e+12,Pending sale,0.0,2975000.0,0.000000,0.0,1.523318e+12,Listed for sale,0.0,2975000.0,0.000000,0.0,1.521677e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,,,,,,0 spaces,"6,603 sqft",,5.0,4.0,1.0,0.0,0.0,6.0,Forest Hills Gardens,,,,,0,0,0,0,0,0.0,0.0,0,0.0,0,0.0,0,0,,,,Townhouse,,"4,564 sqft","6,603 sqft",,,Features,"Special Program/QC Approved Listing, Garage Co...",,,,,,,,,,,,,,,,,,,,,032740004,0,,,,,2.0,,24649.0,2893000.0,1925.0,0.1,Primary,Ps 101 School In The Gardens,9.0,654.0,16.0,1.0,Public,0.7,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,Public,1.0,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1925.0


In [48]:
df['resoFactsStats/homeType'].unique()

array(['Residential', 'Single Family', 'Condo', 'Apartment',
       'Multiple Occupancy', 'Townhouse', 'Residential Income', 'Other',
       'Mobile / Manufactured', 'Vacant Land'], dtype=object)

In [49]:
# Drop land, income, mixed-use, and other non-standard home types
excluded_home_types = ['Multiple Occupancy', 'Vacant Land', 'Residential Income', 'Land',
                       'Other', 'Mixed Use', 'Mobile / Manufactured']
df.drop(df[df['resoFactsStats/homeType'].isin(excluded_home_types)].index, inplace=True)

In [50]:
df['resoFactsStats/homeType'].unique()

array(['Residential', 'Single Family', 'Condo', 'Apartment', 'Townhouse'],
      dtype=object)

In [51]:
df['schools/0/type'].unique()

array(['Public', nan, 'Charter'], dtype=object)

In [52]:
df['schools/0/totalCount'].unique()

array([ 1.,  3.,  2., nan,  7.,  4.,  6.,  8.,  5., 16., 11., 17.])

In [53]:
df.loc[df['resoFactsStats/garageSpaces'] > 0]

Unnamed: 0,address/city,address/streetAddress,address/zipcode,bathrooms,bedrooms,dateposted,description,homeStatus,latitude,livingArea,longitude,price,priceHistory/0/buyerAgent/name,priceHistory/0/buyerAgent/photo/url,priceHistory/0/buyerAgent/profileUrl,priceHistory/0/event,priceHistory/0/price,priceHistory/0/priceChangeRate,priceHistory/0/sellerAgent/name,priceHistory/0/sellerAgent/photo/url,priceHistory/0/sellerAgent/profileUrl,priceHistory/0/time,priceHistory/1/event,priceHistory/1/postingIsRental,priceHistory/1/price,priceHistory/1/priceChangeRate,priceHistory/1/showCountyLink,priceHistory/1/time,priceHistory/2/event,priceHistory/2/postingIsRental,priceHistory/2/price,priceHistory/2/priceChangeRate,priceHistory/2/showCountyLink,priceHistory/2/time,priceHistory/3/event,priceHistory/3/postingIsRental,priceHistory/3/price,priceHistory/3/priceChangeRate,priceHistory/3/showCountyLink,priceHistory/3/time,priceHistory/4/event,priceHistory/4/postingIsRental,priceHistory/4/price,priceHistory/4/priceChangeRate,priceHistory/4/showCountyLink,priceHistory/4/time,priceHistory/5/event,priceHistory/5/postingIsRental,priceHistory/5/price,priceHistory/5/priceChangeRate,priceHistory/5/showCountyLink,priceHistory/5/time,priceHistory/6/event,priceHistory/6/postingIsRental,priceHistory/6/price,priceHistory/6/priceChangeRate,priceHistory/6/showCountyLink,priceHistory/6/time,priceHistory/7/event,priceHistory/7/postingIsRental,priceHistory/7/price,priceHistory/7/priceChangeRate,priceHistory/7/showCountyLink,priceHistory/7/time,propertyTaxRate,resoFactsStats/appliances/0,resoFactsStats/appliances/1,resoFactsStats/appliances/2,resoFactsStats/atAGlanceFacts/2/factValue,resoFactsStats/atAGlanceFacts/3/factValue,resoFactsStats/atAGlanceFacts/4/factValue,resoFactsStats/atAGlanceFacts/5/factValue,resoFactsStats/basement,resoFactsStats/bathrooms,resoFactsStats/bathroomsFull,resoFactsStats/bathroomsHalf,resoFactsStats/bathroomsOneQuarter,resoFactsStats/bathroomsThreeQuarter,resoFactsStats/bedrooms,resoFactsStats/cityRegion,resoFactsStats/constructionMaterials/0,resoFactsStats/cooling/0,resoFactsStats/elementarySchoolDistrict,resoFactsStats/flooring/0,resoFactsStats/furnished,resoFactsStats/garageSpaces,resoFactsStats/hasAttachedGarage,resoFactsStats/hasAttachedProperty,resoFactsStats/hasCarport,resoFactsStats/hasCooling,resoFactsStats/hasFireplace,resoFactsStats/hasGarage,resoFactsStats/hasHeating,resoFactsStats/hasOpenParking,resoFactsStats/hasPrivatePool,resoFactsStats/hasSpa,resoFactsStats/hasView,resoFactsStats/heating/0,resoFactsStats/heating/1,resoFactsStats/highSchoolDistrict,resoFactsStats/homeType,resoFactsStats/isNewConstruction,resoFactsStats/livingArea,resoFactsStats/lotSize,resoFactsStats/middleOrJuniorSchoolDistrict,resoFactsStats/onMarketDate,resoFactsStats/otherFacts/0/name,resoFactsStats/otherFacts/0/value,resoFactsStats/otherFacts/1/name,resoFactsStats/otherFacts/1/value,resoFactsStats/otherFacts/2/name,resoFactsStats/otherFacts/2/value,resoFactsStats/otherFacts/3/name,resoFactsStats/otherFacts/3/value,resoFactsStats/otherFacts/4/name,resoFactsStats/otherFacts/4/value,resoFactsStats/otherFacts/5/name,resoFactsStats/otherFacts/5/value,resoFactsStats/otherFacts/6/name,resoFactsStats/otherFacts/6/value,resoFactsStats/otherFacts/7/name,resoFactsStats/otherFacts/7/value,resoFactsStats/otherFacts/8/name,resoFactsStats/otherFacts/8/value,resoFactsStats/otherFacts/9/name,resoFactsStats/otherFacts/9/value,resoFactsStats/otherFacts/10/value,resoFactsStats/otherFacts/11/value,resoFactsStats/parcelNumber,resoFactsStats/parking,resoFactsStats/parkingFeatures/0,resoFactsStats/parkingFeatures/1,resoFactsStats/rooms/0/roomType,resoFactsStats/sewer/0,resoFactsStats/stories,resoFactsStats/structureType,resoFactsStats/taxAnnualAmount,resoFactsStats/taxAssessedValue,resoFactsStats/yearBuiltEffective,schools/0/distance,schools/0/level,schools/0/name,schools/0/rating,schools/0/size,schools/0/studentsPerTeacher,schools/0/totalCount,schools/0/type,schools/1/dis
tance,schools/1/level,schools/1/name,schools/1/rating,schools/1/size,schools/1/studentsPerTeacher,schools/1/totalCount,schools/1/type,schools/2/distance,schools/2/grades,schools/2/isAssigned,schools/2/level,schools/2/link,schools/2/name,schools/2/rating,schools/2/size,schools/2/studentsPerTeacher,schools/2/totalCount,schools/2/type,yearBuilt
6,Bronx,1 Ploughmans Bush,10471.0,3.0,3.0,1.606156e+12,Looking for a country retreat not too far from...,FOR_SALE,40.892391,1864.0,-73.912140,1275000.0,,,,Price change,1275000.0,-0.055556,,,,1.607645e+12,Price change,0.0,1350000.0,-0.068966,0.0,1.596586e+12,Listed for sale,0.0,1450000.0,0.188525,0.0,1.593562e+12,Sold,0.0,1220000.0,0.000000,0.0,1.240877e+12,,,,,,,,,,,,,,,,,,,,,,,,,0.95,,,,"Natural Gas, Forced Air",Central Air,"Detached, Driveway",0.13 Acres,Full,3.0,2.0,1.0,,,3.0,Bronx,Other,Central Air,Bronx 10,,0,2,0,0,0,1.0,,1,1.0,0,,0,0,Natural Gas,Forced Air,Bronx 10,Residential,,"1,864 sqft",0.13 Acres,Bronx 10,1.601856e+12,,,,,,,,,,,,,,,,,,,,,,,05924-0517,0,Detached,Driveway,,Public Sewer,,,11836.0,1941000.0,,0.8,Elementary,Ps 81 Robert J Christen,7.0,725.0,14.0,1.0,Public,0.3,Middle,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,Public,,,,,,,,,,,,1930.0
8,Bronx,5450 Palisade Ave,10471.0,4.0,4.0,1.596225e+12,This spectacular 1864 carriage house in the Es...,FOR_SALE,40.904579,3500.0,-73.911598,2895000.0,,,,Listed for sale,2895000.0,0.366694,,,,1.596154e+12,Sold,0.0,2118250.0,-0.090880,0.0,1.437437e+12,Listing removed,0.0,2330000.0,0.000000,0.0,1.432253e+12,Price change,0.0,2330000.0,-0.041152,0.0,1.431043e+12,Listed for sale,0.0,2430000.0,-0.028000,0.0,1.423181e+12,Listing removed,0.0,2500000.0,0.000000,0.0,1.400890e+12,Listed for sale,0.0,2500000.0,0.000000,0.0,1.383869e+12,Listing removed,0.0,2500000.0,0.000000,0.0,1.319587e+12,0.95,Dishwasher,Dryer,Washer,Forced air,Central,"Garage, Garage - Attached, Covered",0.48 Acres,,4.0,3.0,1.0,0.0,0.0,4.0,Bronx,Frame,Central,Call Listing Agent,Hardwood,0,2,0,0,0,1.0,1.0,1,1.0,0,1.0,0,0,Forced air,,Call Listing Agent,Single Family,0.0,"3,500 sqft",0.48 Acres,Call Listing Agent,1.596154e+12,,,,,,,,,,,,,,,,,,,,,,,059470016,2,Garage,Garage - Attached,WalkInCloset,,2.0,Other,16364.0,2174000.0,1901.0,0.4,Elementary,Ps 81 Robert J Christen,7.0,725.0,14.0,1.0,Public,1.0,Middle,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,Public,,,,,,,,,,,,1901.0
11,Bronx,3104 Netherland Ave,10463.0,5.0,6.0,1.606156e+12,"Mediterranean house with a garden & patio, bui...",FOR_SALE,40.882561,3850.0,-73.911613,2498000.0,,,,Price change,2498000.0,-0.039231,,,,1.609718e+12,Listed for sale,0.0,2600000.0,2.611111,0.0,1.599005e+12,Sold,0.0,720000.0,1.376238,0.0,1.352765e+12,Sold,0.0,303000.0,0.000000,0.0,9.313056e+11,,,,,,,,,,,,,,,,,,,,,,,,,0.95,Dishwasher,Dryer,,"Electric, Forced Air",Central Air,Attached,,Walk-Out Access,5.0,4.0,1.0,,,6.0,Bronx,Brick,Central Air,Call Listing Agent,,0,2,1,0,0,1.0,,1,1.0,0,,0,0,Electric,Forced Air,Call Listing Agent,Residential,,"3,850 sqft",,Call Listing Agent,1.598918e+12,,,,,,,,,,,,,,,,,,,,,,,057390264,0,Attached,,,Public Sewer,,,11987.0,1494000.0,,0.3,Elementary,Ps 24 Spuyten Duyvil,10.0,907.0,16.0,1.0,Public,0.4,Middle,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,Public,,,,,,,,,,,,2016.0
12,Bronx,739 Ladd Rd,10471.0,5.0,4.0,1.592943e+12,EXCLUSIVE JUST LISTED\nStylish & Light-Filled ...,FOR_SALE,40.905693,5972.0,-73.910690,2650000.0,,,,Price change,2650000.0,-0.045045,,,,1.606694e+12,Price change,0.0,2775000.0,-0.067227,0.0,1.600992e+12,Listed for sale,0.0,2975000.0,0.652778,0.0,1.592870e+12,Sold,0.0,1800000.0,0.000000,0.0,1.145578e+12,,,,,,,,,,,,,,,,,,,,,,,,,0.95,Dishwasher,Dryer,Washer,"Forced air, Gas",Central,"Garage, Garage - Attached, Covered",0.43 Acres,Finished,5.0,4.0,1.0,0.0,0.0,4.0,Bronx,,Central,Call Listing Agent,Tile,0,1,0,0,0,1.0,1.0,1,1.0,0,1.0,0,1,Forced air,Gas,Call Listing Agent,Single Family,0.0,"5,972 sqft",0.43 Acres,Call Listing Agent,1.592870e+12,Appliances,"Dryer, Refrigerator, Dishwasher, Washer, Range...",Cooling,Central Air,ExteriorFeatures,Sprinkler Lawn System,FireplaceYN,1,Flooring,Hardwood,Heating,"Forced Air, Natural Gas",Inclusions,"Dryer, Dishwasher, Refrigerator, Washer, Oven/...",InteriorFeatures,"Eat-in Kitchen, Master Downstairs, 1st Floor B...",PropertyType,Residential,PatioAndPorchFeatures,Patio,Sewer,Public,059470035,1,Garage,Garage - Attached,DiningRoom,Sewer,2.0,Other,19416.0,1886000.0,2005.0,0.4,Elementary,Ps 81 Robert J Christen,7.0,725.0,14.0,1.0,Public,1.1,Middle,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,Public,,,,,,,,,,,,1960.0
13,Bronx,2655 Netherland Ave,10463.0,4.0,4.0,1.610420e+12,Riverdale is the Hot Spot! Just outside of Man...,FOR_SALE,40.879478,2300.0,-73.914932,1250000.0,,,,Price change,1250000.0,-0.038462,,,,1.610410e+12,Listing removed,1.0,5500.0,0.000000,0.0,1.603411e+12,Listed for sale,0.0,1300000.0,0.000000,0.0,1.603066e+12,Price change,1.0,5500.0,-0.083333,0.0,1.600128e+12,Price change,1.0,6000.0,-0.142857,0.0,1.594339e+12,Listed for rent,1.0,7000.0,0.000000,0.0,1.592784e+12,,,,,,,,,,,,,0.95,,,,"Natural Gas, Baseboard",Central Air,"Attached, Driveway",0.09 Acres,"Finished,Walk-Out Access",4.0,3.0,1.0,,,4.0,Bronx,Frame,Central Air,Bronx 10,Hardwood,0,1,1,0,0,1.0,1.0,1,1.0,0,,0,1,Natural Gas,Baseboard,Bronx 10,Residential,,"2,300 sqft",0.09 Acres,Bronx 10,1.610323e+12,,,,,,,,,,,,,,,,,,,,,,,05724-0712,0,Attached,Driveway,,Public Sewer,,,9000.0,1752000.0,2020.0,0.5,Elementary,Ps 24 Spuyten Duyvil,10.0,907.0,16.0,1.0,Public,0.5,Middle,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,Public,,,,,,,,,,,,1988.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
75587,Forest Hills,7032 Juno St,11375.0,3.0,4.0,,PLEASE CALL OWNER.\nVIEWING BY APPOINTMENT ONL...,SOLD,40.714142,2268.0,-73.849945,1490000.0,,,,Listing removed,1795000.0,0.000000,,,,1.599610e+12,Price change,0.0,1795000.0,-0.050265,0.0,1.597104e+12,Listed for sale,0.0,1890000.0,0.268456,0.0,1.593821e+12,Sold,0.0,1490000.0,-0.079679,0.0,1.548806e+12,Listing removed,0.0,1619000.0,0.000000,0.0,1.547424e+12,Listed for sale,0.0,1619000.0,0.000000,0.0,1.546646e+12,Pending sale,0.0,1619000.0,0.000000,0.0,1.539043e+12,Price change,0.0,1619000.0,-0.047087,0.0,1.534464e+12,0.84,Dishwasher,Dryer,Freezer,,,"Garage, Garage - Detached, Covered","3,998 sqft",Finished,3.0,3.0,0.0,0.0,0.0,4.0,Forest Hills,,,,Hardwood,0,2,0,0,0,0.0,1.0,1,0.0,0,,0,0,,,,Single Family,,"2,268 sqft","3,998 sqft",,,Features,Special Program/QC Approved Listing,,,,,,,,,,,,,,,,,,,,,03225002001,2,Garage,Garage - Detached,DiningRoom,,2.0,Tudor,12624.0,1749000.0,1940.0,0.2,Primary,Ps 144 Col Jeromus Remsen,8.0,893.0,16.0,1.0,Public,0.6,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,Public,0.5,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1940.0
75595,Forest Hills GARDENS,44 Slocum Cres,11375.0,2.0,4.0,,Philosphers Row Townhome in Forest Hills Garde...,SOLD,40.717140,1765.0,-73.842094,1675000.0,Linda Weiss,https://photos.zillowstatic.com/h_e/ISbxza59gh...,/profile/lindaweiss11/,Sold,1675000.0,-0.068927,Terrace Sotheby's International Realty,https://photos.zillowstatic.com/h_e/ISbpd8y5jp...,/profile/terracesir/,1.548720e+12,Pending sale,0.0,1799000.0,0.000000,0.0,1.540771e+12,Price change,0.0,1799000.0,-0.027568,0.0,1.534810e+12,Listed for sale,0.0,1850000.0,0.307420,0.0,1.534378e+12,Sold,0.0,1415000.0,-0.053512,0.0,1.442966e+12,Pending sale,0.0,1495000.0,0.000000,0.0,1.438733e+12,Listed for sale,0.0,1495000.0,1.723133,0.0,1.434672e+12,Sold,0.0,549000.0,0.000000,0.0,9.623232e+11,0.84,Dishwasher,Dryer,Freezer,"Heat pump, Electric",Central,"On-street, Covered","1,751 sqft",Partially finished,2.0,2.0,0.0,0.0,0.0,4.0,Forest Hills Gardens,,Central,,Hardwood,0,3,0,0,0,1.0,1.0,1,1.0,0,0.0,0,1,Heat pump,Electric,,Townhouse,,"1,765 sqft","1,751 sqft",,,,,,,,,,,,,,,,,,,,,,,,,032760012,3,On-street,Covered,DiningRoom,,3.0,,9404.0,1413000.0,1925.0,0.1,Primary,Ps 101 School In The Gardens,9.0,654.0,16.0,1.0,Public,0.7,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,Public,1.0,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1925.0
75606,Forest Hills,15 Slocum Cres,11375.0,3.0,3.0,,A brilliantly cared for Grosvenor Atterbury-de...,RECENTLY_SOLD,40.717827,2275.0,-73.842270,1510000.0,GiGi Malek,https://photos.zillowstatic.com/h_e/ISvww3cyma...,/profile/ForestHillsGiGi/,Sold,1510000.0,-0.055069,GiGi Malek,https://photos.zillowstatic.com/h_e/ISvww3cyma...,/profile/ForestHillsGiGi/,1.607299e+12,Listing removed,0.0,1598000.0,0.000000,0.0,1.605830e+12,Pending sale,0.0,1598000.0,0.000000,0.0,1.599178e+12,Listed for sale,0.0,1598000.0,-0.000625,0.0,1.594685e+12,Listing removed,0.0,1599000.0,0.000000,0.0,1.576714e+12,Listed for sale,0.0,1599000.0,15.831579,0.0,1.574208e+12,Sold,0.0,95000.0,0.900000,0.0,3.003264e+11,Sold,0.0,50000.0,0.000000,0.0,1.545696e+11,0.84,Dishwasher,Dryer,Microwave,,,"On-street, Covered, Garage",$52/mo,Finished,3.0,2.0,0.0,0.0,1.0,3.0,Forest Hills,,,,Tile,0,4,0,0,0,0.0,1.0,1,0.0,0,,0,1,,,,Single Family,,"2,275 sqft",994 sqft,,,,,,,,,,,,,,,,,,,,,,,,,032750103,4,On-street,Covered,DiningRoom,,3.0,Queen Anne / Victorian,8426.0,1484000.0,1925.0,0.2,Primary,Ps 101 School In The Gardens,9.0,654.0,16.0,1.0,Public,0.7,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,Public,1.0,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1925.0
75607,Forest Hills,7044 Kessel St,11375.0,4.0,4.0,,Charming English Tudor,RECENTLY_SOLD,40.713360,3249.0,-73.849953,1590000.0,Monica Sharma,https://photos.zillowstatic.com/h_e/ISn2wo1xkl...,/profile/monicaSharm/,Sold,1590000.0,-0.058615,GiGi Malek,https://photos.zillowstatic.com/h_e/ISvww3cyma...,/profile/ForestHillsGiGi/,1.601942e+12,Listing removed,0.0,1689000.0,0.000000,0.0,1.601251e+12,Pending sale,0.0,1689000.0,0.000000,0.0,1.595894e+12,Listed for sale,0.0,1689000.0,2.795506,0.0,1.594685e+12,Sold,0.0,445000.0,0.000000,0.0,9.172224e+11,,,,,,,,,,,,,,,,,,,0.84,Dishwasher,Dryer,Freezer,,Central,"Garage, Garage - Detached, On-street, Covered",$75/mo,Finished,4.0,3.0,1.0,0.0,0.0,4.0,Forest Hills,,Central,,Hardwood,0,1,0,0,0,1.0,1.0,1,0.0,0,,0,1,,,,Single Family,,"3,249 sqft","4,000 sqft",,,,,,,,,,,,,,,,,,,,,,,,,032240026,1,Garage,Garage - Detached,WalkInCloset,,2.0,Tudor,12952.0,1451000.0,1940.0,0.2,Primary,Ps 144 Col Jeromus Remsen,8.0,893.0,16.0,1.0,Public,0.6,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,Public,0.5,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1937.0


In [54]:
df.loc[df['resoFactsStats/furnished'] > 0]

Unnamed: 0,address/city,address/streetAddress,address/zipcode,bathrooms,bedrooms,dateposted,description,homeStatus,latitude,livingArea,longitude,price,priceHistory/0/buyerAgent/name,priceHistory/0/buyerAgent/photo/url,priceHistory/0/buyerAgent/profileUrl,priceHistory/0/event,priceHistory/0/price,priceHistory/0/priceChangeRate,priceHistory/0/sellerAgent/name,priceHistory/0/sellerAgent/photo/url,priceHistory/0/sellerAgent/profileUrl,priceHistory/0/time,priceHistory/1/event,priceHistory/1/postingIsRental,priceHistory/1/price,priceHistory/1/priceChangeRate,priceHistory/1/showCountyLink,priceHistory/1/time,priceHistory/2/event,priceHistory/2/postingIsRental,priceHistory/2/price,priceHistory/2/priceChangeRate,priceHistory/2/showCountyLink,priceHistory/2/time,priceHistory/3/event,priceHistory/3/postingIsRental,priceHistory/3/price,priceHistory/3/priceChangeRate,priceHistory/3/showCountyLink,priceHistory/3/time,priceHistory/4/event,priceHistory/4/postingIsRental,priceHistory/4/price,priceHistory/4/priceChangeRate,priceHistory/4/showCountyLink,priceHistory/4/time,priceHistory/5/event,priceHistory/5/postingIsRental,priceHistory/5/price,priceHistory/5/priceChangeRate,priceHistory/5/showCountyLink,priceHistory/5/time,priceHistory/6/event,priceHistory/6/postingIsRental,priceHistory/6/price,priceHistory/6/priceChangeRate,priceHistory/6/showCountyLink,priceHistory/6/time,priceHistory/7/event,priceHistory/7/postingIsRental,priceHistory/7/price,priceHistory/7/priceChangeRate,priceHistory/7/showCountyLink,priceHistory/7/time,propertyTaxRate,resoFactsStats/appliances/0,resoFactsStats/appliances/1,resoFactsStats/appliances/2,resoFactsStats/atAGlanceFacts/2/factValue,resoFactsStats/atAGlanceFacts/3/factValue,resoFactsStats/atAGlanceFacts/4/factValue,resoFactsStats/atAGlanceFacts/5/factValue,resoFactsStats/basement,resoFactsStats/bathrooms,resoFactsStats/bathroomsFull,resoFactsStats/bathroomsHalf,resoFactsStats/bathroomsOneQuarter,resoFactsStats/bathroomsThreeQuarter,resoFactsStats/bedrooms,resoFactsStats/cityRegion,resoFactsStats/constructionMaterials/0,resoFactsStats/cooling/0,resoFactsStats/elementarySchoolDistrict,resoFactsStats/flooring/0,resoFactsStats/furnished,resoFactsStats/garageSpaces,resoFactsStats/hasAttachedGarage,resoFactsStats/hasAttachedProperty,resoFactsStats/hasCarport,resoFactsStats/hasCooling,resoFactsStats/hasFireplace,resoFactsStats/hasGarage,resoFactsStats/hasHeating,resoFactsStats/hasOpenParking,resoFactsStats/hasPrivatePool,resoFactsStats/hasSpa,resoFactsStats/hasView,resoFactsStats/heating/0,resoFactsStats/heating/1,resoFactsStats/highSchoolDistrict,resoFactsStats/homeType,resoFactsStats/isNewConstruction,resoFactsStats/livingArea,resoFactsStats/lotSize,resoFactsStats/middleOrJuniorSchoolDistrict,resoFactsStats/onMarketDate,resoFactsStats/otherFacts/0/name,resoFactsStats/otherFacts/0/value,resoFactsStats/otherFacts/1/name,resoFactsStats/otherFacts/1/value,resoFactsStats/otherFacts/2/name,resoFactsStats/otherFacts/2/value,resoFactsStats/otherFacts/3/name,resoFactsStats/otherFacts/3/value,resoFactsStats/otherFacts/4/name,resoFactsStats/otherFacts/4/value,resoFactsStats/otherFacts/5/name,resoFactsStats/otherFacts/5/value,resoFactsStats/otherFacts/6/name,resoFactsStats/otherFacts/6/value,resoFactsStats/otherFacts/7/name,resoFactsStats/otherFacts/7/value,resoFactsStats/otherFacts/8/name,resoFactsStats/otherFacts/8/value,resoFactsStats/otherFacts/9/name,resoFactsStats/otherFacts/9/value,resoFactsStats/otherFacts/10/value,resoFactsStats/otherFacts/11/value,resoFactsStats/parcelNumber,resoFactsStats/parking,resoFactsStats/parkingFeatures/0,resoFactsStats/parkingFeatures/1,resoFactsStats/rooms/0/roomType,resoFactsStats/sewer/0,resoFactsStats/stories,resoFactsStats/structureType,resoFactsStats/taxAnnualAmount,resoFactsStats/taxAssessedValue,resoFactsStats/yearBuiltEffective,schools/0/distance,schools/0/level,schools/0/name,schools/0/rating,schools/0/size,schools/0/studentsPerTeacher,schools/0/totalCount,schools/0/type,schools/1/distance,schools/1/level,schools/1/name,schools/1/rating,schools/1/size,schools/1/studentsPerTeacher,schools/1/totalCount,schools/1/type,schools/2/distance,schools/2/grades,schools/2/isAssigned,schools/2/level,schools/2/link,schools/2/name,schools/2/rating,schools/2/size,schools/2/studentsPerTeacher,schools/2/totalCount,schools/2/type,yearBuilt
2121,Bronx,4020 Pratt Ave,10466.0,4.0,8.0,,Multifamily home: Two 3 bedrooms / 1 bath unit...,RECENTLY_SOLD,40.891418,3413.0,-73.833641,740000.0,,,,Sold,740000.0,0.138462,,,,1.604275e+12,Listing removed,0.0,650000.0,0.0,0.0,1.547597e+12,Pending sale,0.0,650000.0,0.000000,0.0,1.535501e+12,Listed for sale,0.0,650000.0,0.000000,0.0,1.535501e+12,,,,,,,,,,,,,,,,,,,,,,,,,0.95,Refrigerator,,,,,"Garage, Garage - Detached","3,750 sqft",,4.0,2.0,2.0,0.0,0.0,8.0,Bronx,brick,,UNKNOWN,Hardwood,1,0,0,0,0,0.0,,0,0.0,0,,0,0,,,UNKNOWN,Single Family,,"3,413 sqft","3,750 sqft",UNKNOWN,,Sewer Type,Municipal,Water Source Type,Municipal,Construction Type,Masonry Brick,Building Style,1000,View Street,yes,HOA or Building Fee,0.00,Appliance Oven,yes,,,,,,,,,049660018,0,Garage,Garage - Detached,DiningRoom,,3.0,Colonial,6504.0,666000.0,1920.0,0.3,Primary,Ps 68,7.0,677.0,9.0,1.0,Public,0.6,Middle,Baychester Middle School,4.0,301.0,10.0,2.0,Public,1.1,9-12,1.0,High,https://www.greatschools.org/school?id=01969&s...,Harry S Truman High School,3.0,1984.0,19.0,1.0,Public,1920.0
2122,Bronx,4020 Pratt Ave,10466.0,4.0,8.0,,Multifamily home: Two 3 bedrooms / 1 bath unit...,RECENTLY_SOLD,40.891418,3413.0,-73.833641,740000.0,,,,Sold,740000.0,0.138462,,,,1.604275e+12,Listing removed,0.0,650000.0,0.0,0.0,1.547597e+12,Pending sale,0.0,650000.0,0.000000,0.0,1.535501e+12,Listed for sale,0.0,650000.0,0.000000,0.0,1.535501e+12,,,,,,,,,,,,,,,,,,,,,,,,,0.95,Refrigerator,,,,,"Garage, Garage - Detached","3,750 sqft",,4.0,2.0,2.0,0.0,0.0,8.0,Bronx,brick,,UNKNOWN,Hardwood,1,0,0,0,0,0.0,,0,0.0,0,,0,0,,,UNKNOWN,Single Family,,"3,413 sqft","3,750 sqft",UNKNOWN,,Sewer Type,Municipal,Water Source Type,Municipal,Construction Type,Masonry Brick,Building Style,1000,View Street,yes,HOA or Building Fee,0.00,Appliance Oven,yes,,,,,,,,,049660018,0,Garage,Garage - Detached,DiningRoom,,3.0,Colonial,6504.0,666000.0,1920.0,0.3,Primary,Ps 68,7.0,677.0,9.0,1.0,Public,0.6,Middle,Baychester Middle School,4.0,301.0,10.0,2.0,Public,1.1,9-12,1.0,High,https://www.greatschools.org/school?id=01969&s...,Harry S Truman High School,3.0,1984.0,19.0,1.0,Public,1920.0
2124,Bronx,4020 Pratt Ave,10466.0,4.0,8.0,,Multifamily home: Two 3 bedrooms / 1 bath unit...,RECENTLY_SOLD,40.891418,3413.0,-73.833641,740000.0,,,,Sold,740000.0,0.138462,,,,1.604275e+12,Listing removed,0.0,650000.0,0.0,0.0,1.547597e+12,Pending sale,0.0,650000.0,0.000000,0.0,1.535501e+12,Listed for sale,0.0,650000.0,0.000000,0.0,1.535501e+12,,,,,,,,,,,,,,,,,,,,,,,,,0.95,Refrigerator,,,,,"Garage, Garage - Detached","3,750 sqft",,4.0,2.0,2.0,0.0,0.0,8.0,Bronx,brick,,UNKNOWN,Hardwood,1,0,0,0,0,0.0,,0,0.0,0,,0,0,,,UNKNOWN,Single Family,,"3,413 sqft","3,750 sqft",UNKNOWN,,Sewer Type,Municipal,Water Source Type,Municipal,Construction Type,Masonry Brick,Building Style,1000,View Street,yes,HOA or Building Fee,0.00,Appliance Oven,yes,,,,,,,,,049660018,0,Garage,Garage - Detached,DiningRoom,,3.0,Colonial,6504.0,666000.0,1920.0,0.3,Primary,Ps 68,7.0,677.0,9.0,1.0,Public,0.6,Middle,Baychester Middle School,4.0,301.0,10.0,2.0,Public,1.1,9-12,1.0,High,https://www.greatschools.org/school?id=01969&s...,Harry S Truman High School,3.0,1984.0,19.0,1.0,Public,1920.0
2125,Bronx,4020 Pratt Ave,10466.0,4.0,8.0,,Multifamily home: Two 3 bedrooms / 1 bath unit...,RECENTLY_SOLD,40.891418,3413.0,-73.833641,740000.0,,,,Sold,740000.0,0.138462,,,,1.604275e+12,Listing removed,0.0,650000.0,0.0,0.0,1.547597e+12,Pending sale,0.0,650000.0,0.000000,0.0,1.535501e+12,Listed for sale,0.0,650000.0,0.000000,0.0,1.535501e+12,,,,,,,,,,,,,,,,,,,,,,,,,0.95,Refrigerator,,,,,"Garage, Garage - Detached","3,750 sqft",,4.0,2.0,2.0,0.0,0.0,8.0,Bronx,brick,,UNKNOWN,Hardwood,1,0,0,0,0,0.0,,0,0.0,0,,0,0,,,UNKNOWN,Single Family,,"3,413 sqft","3,750 sqft",UNKNOWN,,Sewer Type,Municipal,Water Source Type,Municipal,Construction Type,Masonry Brick,Building Style,1000,View Street,yes,HOA or Building Fee,0.00,Appliance Oven,yes,,,,,,,,,049660018,0,Garage,Garage - Detached,DiningRoom,,3.0,Colonial,6504.0,666000.0,1920.0,0.3,Primary,Ps 68,7.0,677.0,9.0,1.0,Public,0.6,Middle,Baychester Middle School,4.0,301.0,10.0,2.0,Public,1.1,9-12,1.0,High,https://www.greatschools.org/school?id=01969&s...,Harry S Truman High School,3.0,1984.0,19.0,1.0,Public,1920.0
2126,Bronx,4020 Pratt Ave,10466.0,4.0,8.0,,Multifamily home: Two 3 bedrooms / 1 bath unit...,RECENTLY_SOLD,40.891418,3413.0,-73.833641,740000.0,,,,Sold,740000.0,0.138462,,,,1.604275e+12,Listing removed,0.0,650000.0,0.0,0.0,1.547597e+12,Pending sale,0.0,650000.0,0.000000,0.0,1.535501e+12,Listed for sale,0.0,650000.0,0.000000,0.0,1.535501e+12,,,,,,,,,,,,,,,,,,,,,,,,,0.95,Refrigerator,,,,,"Garage, Garage - Detached","3,750 sqft",,4.0,2.0,2.0,0.0,0.0,8.0,Bronx,brick,,UNKNOWN,Hardwood,1,0,0,0,0,0.0,,0,0.0,0,,0,0,,,UNKNOWN,Single Family,,"3,413 sqft","3,750 sqft",UNKNOWN,,Sewer Type,Municipal,Water Source Type,Municipal,Construction Type,Masonry Brick,Building Style,1000,View Street,yes,HOA or Building Fee,0.00,Appliance Oven,yes,,,,,,,,,049660018,0,Garage,Garage - Detached,DiningRoom,,3.0,Colonial,6504.0,666000.0,1920.0,0.3,Primary,Ps 68,7.0,677.0,9.0,1.0,Public,0.6,Middle,Baychester Middle School,4.0,301.0,10.0,2.0,Public,1.1,9-12,1.0,High,https://www.greatschools.org/school?id=01969&s...,Harry S Truman High School,3.0,1984.0,19.0,1.0,Public,1920.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
72826,Flushing,8604 Grand Ave APT 4D,11373.0,1.0,2.0,,Call Gail at Nest Properties to see this renov...,SOLD,40.735531,900.0,-73.879990,332000.0,Gail Opromalla,https://photos.zillowstatic.com/h_e/ISfkkdsize...,/profile/Gopromalla/,Sold,332000.0,0.021538,Gail Opromalla,https://photos.zillowstatic.com/h_e/ISfkkdsize...,/profile/Gopromalla/,1.529280e+12,Listing removed,0.0,325000.0,0.0,0.0,1.522627e+12,Listed for sale,0.0,325000.0,0.000000,0.0,1.519776e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,Dishwasher,,,,,"Garage, Garage - Attached",$489/mo,,1.0,1.0,0.0,0.0,0.0,2.0,Flushing,,,,,1,0,0,0,0,0.0,,0,0.0,0,,0,0,,,,Condo,,900 sqft,0.36 Acres,,,,,,,,,,,,,,,,,,,,,,,,,0286400174D,0,Garage,Garage - Attached,,,6.0,,169931.0,4641000.0,,0.2,Primary,Ps 102 Bayview,9.0,1331.0,15.0,2.0,Public,0.5,High,Newtown High School,3.0,1913.0,17.0,1.0,Public,,,,,,,,,,,,1959.0
72943,Flushing,8511 57th Rd,11373.0,2.5,3.0,,"Location!!! Southern Exposure, Renovated, Love...",SOLD,40.731827,1742.0,-73.876266,820000.0,,,,Sold,820000.0,2.346939,,,,1.547424e+12,Listing removed,1.0,1000.0,0.0,0.0,1.345075e+12,Listed for rent,1.0,1000.0,0.000000,0.0,1.344730e+12,Sold,0.0,245000.0,0.000000,0.0,9.520416e+11,,,,,,,,,,,,,,,,,,,,,,,,,0.84,Dishwasher,Dryer,Freezer,Forced air,Central,"Garage, Carport","1,800 sqft",,,,,,,3.0,Flushing,,Central,,Tile,1,0,0,0,0,1.0,1.0,0,1.0,0,1.0,1,0,Forced air,,,Single Family,,"1,742 sqft","1,800 sqft",,,,,,,,,,,,,,,,,,,,,,,,,028820016,0,Garage,Carport,WalkInCloset,,2.0,,6218.0,719000.0,1935.0,0.1,Primary,Ps 102 Bayview,9.0,1331.0,15.0,2.0,Public,0.6,High,Newtown High School,3.0,1913.0,17.0,1.0,Public,,,,,,,,,,,,1935.0
73011,Flushing,8737 Justice Ave,11373.0,1.0,2.0,,Tenants pay for utilities.,SOLD,40.737148,1000.0,-73.875221,1349628.0,,,,Sold,1349628.0,0.420661,,,,1.554077e+12,Listing removed,1.0,2000.0,0.0,0.0,1.275091e+12,Listed for rent,1.0,2000.0,0.000000,0.0,1.268006e+12,Sold,0.0,950000.0,0.057906,0.0,1.251072e+12,Sold,0.0,898000.0,0.000000,0.0,1.198282e+12,,,,,,,,,,,,,,,,,,,0.84,,,,,,0 spaces,,,1.0,0.0,0.0,0.0,0.0,2.0,Flushing,,,Community District 26,,1,0,0,0,0,0.0,,0,0.0,0,,0,0,,,Community District 26,Apartment,,"1,000 sqft",142 sqft,Community District 26,,,,,,,,,,,,,,,,,,,,,,,,018390057,0,,,,,3.0,,100482.0,1442000.0,,0.5,Primary,Ps 7 Louis F Simeone,9.0,1002.0,15.0,1.0,Public,0.8,Middle,Is 5 The Walter Crowley Intermediate School,9.0,1796.0,15.0,1.0,Public,0.2,9-12,1.0,High,https://www.greatschools.org/school?id=02023&s...,Newtown High School,3.0,1913.0,17.0,1.0,Public,2004.0
75157,Flushing,6264 Saunders St,11374.0,1.0,1.0,,Fully furnished bright and sunny walk up apart...,RECENTLY_SOLD,40.730694,700.0,-73.866920,319000.0,,,,Sold,319000.0,0.603015,,,,1.606694e+12,Listing removed,0.0,199000.0,0.0,0.0,1.460678e+12,Listed for sale,0.0,199000.0,0.117978,0.0,1.460592e+12,Listing removed,1.0,1450.0,0.000000,0.0,1.421280e+12,Listed for rent,1.0,1450.0,-0.033333,0.0,1.420934e+12,Listing removed,0.0,178000.0,0.0,0.0,1.415318e+12,Listed for sale,0.0,178000.0,0.194631,0.0,1.414195e+12,Listing removed,1.0,1500.0,0.0,0.0,1.412381e+12,0.84,Range / Oven,Refrigerator,,Forced air,Other,On-street,$557/mo,,1.0,0.0,0.0,0.0,0.0,1.0,Flushing,,Other,Community District 28,Hardwood,1,0,0,0,0,1.0,,0,1.0,0,,0,0,Forced air,,Community District 28,Apartment,,700 sqft,0.48 Acres,Community District 28,,,,,,,,,,,,,,,,,,,,,,,,030790076,0,On-street,,WalkInCloset,,1.0,,230081.0,5321000.0,,0.4,Primary,Ps 206 The Horace Harding School,6.0,612.0,14.0,1.0,Public,1.1,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,Public,1.4,9-12,1.0,High,https://www.greatschools.org/school?id=01960&s...,Forest Hills High School,5.0,3784.0,20.0,1.0,Public,1935.0
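
Several rows in the output above (4020 Pratt Ave) look identical apart from their index, and printing the full frame for each flag makes that hard to spot. A minimal sketch of a more compact check follows, assuming the flag columns hold 0/1 values as they appear to in the output. The `toy` frame and its values are hypothetical stand-ins for the notebook's `df`, not real data.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's `df`; values are illustrative only.
toy = pd.DataFrame({
    "address/streetAddress": ["4020 Pratt Ave", "4020 Pratt Ave",
                              "8511 57th Rd", "2609 Seagirt Ave"],
    "price": [740000.0, 740000.0, 820000.0, 494000.0],
    "resoFactsStats/furnished": [1, 1, 1, 0],
    "resoFactsStats/hasCarport": [0, 0, 0, 1],
})

flag_cols = ["resoFactsStats/furnished", "resoFactsStats/hasCarport"]

# Count rows where each flag is set (> 0) instead of printing whole frames.
flag_counts = (toy[flag_cols] > 0).sum()

# Rows flagged as furnished, restricted to a few identifying columns.
furnished = toy.loc[toy["resoFactsStats/furnished"] > 0,
                    ["address/streetAddress", "price"]]

# How many of those rows exactly repeat an earlier row on the selected columns.
n_repeats = int(furnished.duplicated().sum())

print(flag_counts)
print(f"furnished rows: {len(furnished)}, exact repeats: {n_repeats}")
```

On the real data, the same pattern would use `df` in place of `toy`; which columns to compare when looking for repeats is a judgment call.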


In [55]:
df.loc[df['resoFactsStats/hasCarport'] > 0]

Unnamed: 0,address/city,address/streetAddress,address/zipcode,bathrooms,bedrooms,dateposted,description,homeStatus,latitude,livingArea,longitude,price,priceHistory/0/buyerAgent/name,priceHistory/0/buyerAgent/photo/url,priceHistory/0/buyerAgent/profileUrl,priceHistory/0/event,priceHistory/0/price,priceHistory/0/priceChangeRate,priceHistory/0/sellerAgent/name,priceHistory/0/sellerAgent/photo/url,priceHistory/0/sellerAgent/profileUrl,priceHistory/0/time,priceHistory/1/event,priceHistory/1/postingIsRental,priceHistory/1/price,priceHistory/1/priceChangeRate,priceHistory/1/showCountyLink,priceHistory/1/time,priceHistory/2/event,priceHistory/2/postingIsRental,priceHistory/2/price,priceHistory/2/priceChangeRate,priceHistory/2/showCountyLink,priceHistory/2/time,priceHistory/3/event,priceHistory/3/postingIsRental,priceHistory/3/price,priceHistory/3/priceChangeRate,priceHistory/3/showCountyLink,priceHistory/3/time,priceHistory/4/event,priceHistory/4/postingIsRental,priceHistory/4/price,priceHistory/4/priceChangeRate,priceHistory/4/showCountyLink,priceHistory/4/time,priceHistory/5/event,priceHistory/5/postingIsRental,priceHistory/5/price,priceHistory/5/priceChangeRate,priceHistory/5/showCountyLink,priceHistory/5/time,priceHistory/6/event,priceHistory/6/postingIsRental,priceHistory/6/price,priceHistory/6/priceChangeRate,priceHistory/6/showCountyLink,priceHistory/6/time,priceHistory/7/event,priceHistory/7/postingIsRental,priceHistory/7/price,priceHistory/7/priceChangeRate,priceHistory/7/showCountyLink,priceHistory/7/time,propertyTaxRate,resoFactsStats/appliances/0,resoFactsStats/appliances/1,resoFactsStats/appliances/2,resoFactsStats/atAGlanceFacts/2/factValue,resoFactsStats/atAGlanceFacts/3/factValue,resoFactsStats/atAGlanceFacts/4/factValue,resoFactsStats/atAGlanceFacts/5/factValue,resoFactsStats/basement,resoFactsStats/bathrooms,resoFactsStats/bathroomsFull,resoFactsStats/bathroomsHalf,resoFactsStats/bathroomsOneQuarter,resoFactsStats/bathroomsThreeQuarter,resoFactsStats/bedrooms,resoFactsStats/cityRegion,resoFactsStats/constructionMaterials/0,resoFactsStats/cooling/0,resoFactsStats/elementarySchoolDistrict,resoFactsStats/flooring/0,resoFactsStats/furnished,resoFactsStats/garageSpaces,resoFactsStats/hasAttachedGarage,resoFactsStats/hasAttachedProperty,resoFactsStats/hasCarport,resoFactsStats/hasCooling,resoFactsStats/hasFireplace,resoFactsStats/hasGarage,resoFactsStats/hasHeating,resoFactsStats/hasOpenParking,resoFactsStats/hasPrivatePool,resoFactsStats/hasSpa,resoFactsStats/hasView,resoFactsStats/heating/0,resoFactsStats/heating/1,resoFactsStats/highSchoolDistrict,resoFactsStats/homeType,resoFactsStats/isNewConstruction,resoFactsStats/livingArea,resoFactsStats/lotSize,resoFactsStats/middleOrJuniorSchoolDistrict,resoFactsStats/onMarketDate,resoFactsStats/otherFacts/0/name,resoFactsStats/otherFacts/0/value,resoFactsStats/otherFacts/1/name,resoFactsStats/otherFacts/1/value,resoFactsStats/otherFacts/2/name,resoFactsStats/otherFacts/2/value,resoFactsStats/otherFacts/3/name,resoFactsStats/otherFacts/3/value,resoFactsStats/otherFacts/4/name,resoFactsStats/otherFacts/4/value,resoFactsStats/otherFacts/5/name,resoFactsStats/otherFacts/5/value,resoFactsStats/otherFacts/6/name,resoFactsStats/otherFacts/6/value,resoFactsStats/otherFacts/7/name,resoFactsStats/otherFacts/7/value,resoFactsStats/otherFacts/8/name,resoFactsStats/otherFacts/8/value,resoFactsStats/otherFacts/9/name,resoFactsStats/otherFacts/9/value,resoFactsStats/otherFacts/10/value,resoFactsStats/otherFacts/11/value,resoFactsStats/parcelNumber,resoFactsStats/parking,resoFactsStats/parkingFeatures/0,resoFactsStats/parkingFeatures/1,resoFactsStats/rooms/0/roomType,resoFactsStats/sewer/0,resoFactsStats/stories,resoFactsStats/structureType,resoFactsStats/taxAnnualAmount,resoFactsStats/taxAssessedValue,resoFactsStats/yearBuiltEffective,schools/0/distance,schools/0/level,schools/0/name,schools/0/rating,schools/0/size,schools/0/studentsPerTeacher,schools/0/totalCount,schools/0/type,schools/1/distance,schools/1/level,schools/1/name,schools/1/rating,schools/1/size,schools/1/studentsPerTeacher,schools/1/totalCount,schools/1/type,schools/2/distance,schools/2/grades,schools/2/isAssigned,schools/2/level,schools/2/link,schools/2/name,schools/2/rating,schools/2/size,schools/2/studentsPerTeacher,schools/2/totalCount,schools/2/type,yearBuilt
467,Far Rockaway,2609 Seagirt Ave,11691.0,2.0,3.0,1.603132e+12,New Construction (2006) Single Family Brick Fu...,FOR_SALE,40.594967,1600.0,-73.759659,494000.0,,,,Price change,494000.0,-0.010020,,,,1.609891e+12,Listed for sale,0.0,499000.0,0.108889,0.0,1.603066e+12,Sold,0.0,450000.0,0.022730,0.0,1.583798e+12,Listing removed,0.0,439999.0,0.000000,0.0,1.575504e+12,Price change,0.0,439999.0,-0.022224,0.0,1.570147e+12,Price change,0.0,450000.0,-0.032258,0.0,1.565309e+12,Price change,0.0,465000.0,-0.029228,0.0,1.562544e+12,Price change,0.0,479000.0,-0.04008,0.0,1.561075e+12,0.84,,,,"Natural Gas, Baseboard",,"Carport, 1 Space, 2 Spaces, 3 Spaces","2,500 sqft",Crawl Space,2.0,2.0,0.0,,0.0,3.0,Far Rockaway,Brick,,,Hardwood,0,6,0,0,1,1.0,,1,1.0,0,,0,0,Natural Gas,Baseboard,,Residential,,"1,600 sqft","2,500 sqft",,1.603132e+12,,,,,,,,,,,,,,,,,,,,,,,158180239,6,Carport,1 Space,,,2.0,Duplex,4697.0,369000.0,,0.1,Primary,Ps 43,2.0,894.0,14.0,3.0,Public,0.5,High,Queens High School For Information Research And T,3.0,436.0,19.0,1.0,Public,,,,,,,,,,,,2006.0
2443,Staten Island,127 Fields Ave,10314.0,2.0,3.0,1.608428e+12,"Centrally located in Willowbrook, this 3 bedro...",FOR_SALE,40.599575,1296.0,-74.137489,559000.0,,,,Listed for sale,559000.0,0.000000,,,,1.608422e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.87,,,,"Forced Air, Natural Gas",Central Air,"No Garage, Carport","2,700 sqft",Full,2.0,1.0,1.0,,0.0,3.0,Staten Island,,Central Air,,,0,0,0,1,1,1.0,,0,1.0,0,0.0,0,0,Forced Air,Natural Gas,,Residential,0.0,"1,296 sqft","2,700 sqft",,1.608428e+12,,,,,,,,,,,,,,,,,,,,,,,01984-0009,0,No Garage,Carport,Bathroom,Public Sewer,2.0,,5685.0,502000.0,,0.2,Primary,Ps 54 Charles W Leng,8.0,808.0,14.0,1.0,Public,1.4,Middle,Is 72 Rocco Laurie,7.0,1407.0,14.0,1.0,Public,0.8,9-12,1.0,High,https://www.greatschools.org/school?id=02797&s...,Susan E Wagner High School,4.0,3281.0,20.0,1.0,Public,1970.0
2448,Staten Island,39 Rockland Ave,10306.0,2.0,3.0,1.602706e+12,Beautiful Home Nestled Back On Sprawling 30x19...,FOR_SALE,40.577282,1464.0,-74.126915,749999.0,,,,Price change,779000.0,-0.026249,,,,1.608077e+12,Price change,0.0,799999.0,-0.030303,0.0,1.603670e+12,Listed for sale,0.0,824999.0,0.000000,0.0,1.602634e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.87,,,,"Hot Water, Natural Gas",Units,"Detached, Garage, Carport, Off Street, On Stre...","5,730 sqft",Full,2.0,1.0,0.0,,1.0,3.0,Staten Island,Stone,Units,,,0,2,0,0,1,1.0,,1,1.0,1,0.0,0,0,Hot Water,Natural Gas,,Residential,0.0,"1,464 sqft","5,730 sqft",,1.602706e+12,,,,,,,,,,,,,,,,,,,,,,,00950-0011,2,Detached,Garage,Bathroom,Public Sewer,3.0,,5296.0,613000.0,2016.0,0.6,Primary,Ps 23 Richmondtown,9.0,562.0,13.0,1.0,Public,2.3,Middle,Is 24 Myra S Barnes,6.0,1264.0,12.0,1.0,Public,1.3,9-12,1.0,High,https://www.greatschools.org/school?id=02797&s...,Susan E Wagner High School,4.0,3281.0,20.0,1.0,Public,1930.0
2449,Staten Island,47 Drysdale St,10314.0,2.0,4.0,1.606159e+12,"This lovely, two family home is located on a q...",FOR_SALE,40.599617,1990.0,-74.138329,799000.0,,,,Listed for sale,799000.0,0.117483,,,,1.606090e+12,Listing removed,1.0,2400.0,0.000000,0.0,1.496362e+12,Listed for rent,1.0,2400.0,0.000000,0.0,1.494979e+12,Sold,0.0,715000.0,0.000000,0.0,1.483574e+12,,,,,,,,,,,,,,,,,,,,,,,,,0.87,,,,"Forced Air, Natural Gas",Central Air,"Built-in, Garage, Carport",58 Days,,2.0,2.0,0.0,,0.0,4.0,Staten Island,,Central Air,,,0,1,0,0,1,1.0,,1,1.0,0,0.0,0,0,Forced Air,Natural Gas,,Residential,0.0,"1,990 sqft","4,583 sqft",,1.606159e+12,,,,,,,,,,,,,,,,,,,,,,,01979-0044,1,Built-in,Garage,Bathroom,Public Sewer,2.0,,7132.0,801000.0,,0.2,Primary,Ps 54 Charles W Leng,8.0,808.0,14.0,1.0,Public,1.3,Middle,Is 72 Rocco Laurie,7.0,1407.0,14.0,1.0,Public,0.8,9-12,1.0,High,https://www.greatschools.org/school?id=02797&s...,Susan E Wagner High School,4.0,3281.0,20.0,1.0,Public,1970.0
2456,Staten Island,162 Portage Ave,10314.0,6.0,7.0,1.609528e+12,Fully Detached 2 family House in Willowbrook ...,FOR_SALE,40.600338,2700.0,-74.126228,859900.0,,,,Listed for sale,859900.0,0.000000,,,,1.609459e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.87,Dishwasher,Dryer,Refrigerator,"Hot Water, Natural Gas",Central Air,"Built-in, Garage, Carport, Off Street",19 Days,,6.0,3.0,2.0,,1.0,7.0,Staten Island,,Central Air,,,0,1,0,0,1,1.0,,1,1.0,0,0.0,0,0,Hot Water,Natural Gas,,Residential,0.0,"2,700 sqft","7,800 sqft",,1.609528e+12,,,,,,,,,,,,,,,,,,,,,,,00802-1021,1,Built-in,Garage,Bathroom,Public Sewer,2.0,,9044.0,,2020.0,1.0,Primary,Ps 29 Bardwell,7.0,642.0,16.0,1.0,Public,1.8,Middle,Is 27 Anning S Prall,5.0,1001.0,14.0,1.0,Public,0.2,9-12,1.0,High,https://www.greatschools.org/school?id=02797&s...,Susan E Wagner High School,4.0,3281.0,20.0,1.0,Public,1988.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
59772,Brooklyn,1960 E 34th St,11234.0,2.0,2.0,1.606757e+12,Marine Park home located on a quiet block. T...,FOR_SALE,40.609180,994.0,-73.933586,569000.0,,,,Listed for sale,569000.0,-0.015571,,,,1.606694e+12,Listing removed,0.0,578000.0,0.000000,0.0,1.600128e+12,Price change,0.0,578000.0,0.001733,0.0,1.581466e+12,Listed for sale,0.0,577000.0,0.030357,0.0,1.571702e+12,Listing removed,0.0,560000.0,0.000000,0.0,1.564963e+12,Price change,0.0,560000.0,-0.034483,0.0,1.558397e+12,Listed for sale,0.0,580000.0,-0.056911,0.0,1.549843e+12,Listing removed,0.0,615000.0,0.00000,0.0,1.542154e+12,0.65,Stove,,,"Natural Gas, Steam/Radiator",,"Carport, 1 Space, 2 Spaces","1,800 sqft",Finished,2.0,1.0,1.0,,0.0,2.0,Brooklyn,Siding,,,Laminate,0,3,0,1,1,,,1,1.0,0,,0,0,Natural Gas,Steam/Radiator,,Residential,,994 sqft,"1,800 sqft",,1.606757e+12,,,,,,,,,,,,,,,,,,,,,,,085000072,3,Carport,1 Space,,,2.0,,5224.0,620000.0,,0.4,Primary,Ps 207 Elizabeth G Leary,7.0,1230.0,15.0,2.0,Public,0.9,High,James Madison High School,4.0,3655.0,21.0,1.0,Public,,,,,,,,,,,,1930.0
59948,Brooklyn,3701 Fillmore Ave,11234.0,2.0,3.0,1.603209e+12,Marine Park - Terrific 3 Bedroom/2 Bathroom Du...,FOR_SALE,40.612099,1064.0,-73.932518,699000.0,,,,Listed for sale,699000.0,0.000000,,,,1.603152e+12,Listing removed,0.0,699000.0,0.000000,0.0,1.556496e+12,Listed for sale,0.0,699000.0,0.594071,0.0,1.555373e+12,Sold,0.0,438500.0,0.000000,0.0,1.296518e+12,Listing removed,0.0,438500.0,0.000000,0.0,1.289261e+12,Listed for sale,0.0,438500.0,0.624074,0.0,1.285718e+12,Sold,0.0,270000.0,0.000000,0.0,1.060214e+12,,,,,,,0.65,Dishwasher,Dryer,Microwave,"Natural Gas, Baseboard",Wall Unit(s),"Carport, 1 Space","1,360 sqft",Finished,2.0,2.0,0.0,,0.0,3.0,Brooklyn,Aluminum Siding,Wall Unit(s),,Hardwood,0,1,0,1,1,1.0,,1,1.0,0,,0,0,Natural Gas,Baseboard,,Residential,,"1,064 sqft","1,360 sqft",,1.603209e+12,,,,,,,,,,,,,,,,,,,,,,,084810011,1,Carport,1 Space,,,2.0,Duplex,4500.0,605000.0,,0.2,Primary,Ps 207 Elizabeth G Leary,7.0,1230.0,15.0,2.0,Public,1.0,High,James Madison High School,4.0,3655.0,21.0,1.0,Public,,,,,,,,,,,,1925.0
60564,Brooklyn,2819 Avenue P,11229.0,4.0,8.0,1.599951e+12,Presenting this beautiful large 2 Fam house co...,SOLD,40.612125,2480.0,-73.945366,1097000.0,Shu Yu Zeng,https://photos.zillowstatic.com/h_e/IShbckmjkj...,/profile/Shu-Yu-Zeng/,Sold,1097000.0,-0.084307,Shu Yu Zeng,https://photos.zillowstatic.com/h_e/IShbckmjkj...,/profile/Shu-Yu-Zeng/,1.578874e+12,Listed for sale,0.0,1198000.0,0.000000,0.0,1.575936e+12,Pending sale,0.0,1198000.0,0.000000,0.0,1.573085e+12,Price change,0.0,1198000.0,-0.077042,0.0,1.566691e+12,Listed for sale,0.0,1298000.0,0.000000,0.0,1.562803e+12,,,,,,,,,,,,,,,,,,,0.65,Dryer,Microwave,Refrigerator,"Hot Water, Steam, Natural Gas",Units,"Attached, Garage, Assigned, Carport, Garage Do...",,Full,4.0,3.0,1.0,,0.0,8.0,Brooklyn,Stone,Units,,,0,2,1,1,1,1.0,,1,1.0,0,0.0,0,0,Hot Water,Steam,,Residential,0.0,"2,480 sqft","2,500 sqft",,1.599951e+12,,,,,,,,,,,,,,,,,,,,,,,07689-0002,2,Attached,Garage,Bathroom,Public Sewer,2.0,,8413.0,818000.0,,0.4,Primary,Ps 222 Katherine R Snyder,8.0,935.0,12.0,1.0,Public,0.8,Middle,Jhs 234 Arthur W Cunningham,8.0,1909.0,18.0,1.0,Public,0.2,9-12,1.0,High,https://www.greatschools.org/school?id=02004&s...,James Madison High School,4.0,3655.0,21.0,1.0,Public,1930.0
61128,Brooklyn,8113 Avenue L,11236.0,3.0,3.0,1.609979e+12,18174H-BRICK ONE FAMILY WITH PRIVATE CARPORT I...,SOLD,40.631508,1120.0,-73.906845,444000.0,Salvatore Taormina,https://photos.zillowstatic.com/h_e/ISbdcfl37e...,/profile/Salshomes/,Sold,444000.0,-0.043103,Salvatore Taormina,https://photos.zillowstatic.com/h_e/ISbdcfl37e...,/profile/Salshomes/,1.550189e+12,Listing removed,0.0,464000.0,0.000000,0.0,1.546301e+12,Pending sale,0.0,464000.0,0.000000,0.0,1.530317e+12,Price change,0.0,464000.0,0.013100,0.0,1.530058e+12,Price change,0.0,458000.0,-0.012931,0.0,1.528243e+12,Price change,0.0,464000.0,-0.021097,0.0,1.523405e+12,Listed for sale,0.0,474000.0,0.358166,0.0,1.522541e+12,Listing removed,0.0,349000.0,0.00000,0.0,1.315094e+12,0.65,Indoor Grill,Refrigerator,,"Forced Air, Natural Gas",Central Air,"No Garage, Carport",,Full,3.0,1.0,2.0,,0.0,3.0,Brooklyn,,Central Air,,,0,0,0,1,1,1.0,,0,1.0,0,0.0,0,0,Forced Air,Natural Gas,,Residential,0.0,"1,120 sqft","1,445 sqft",,1.609979e+12,,,,,,,,,,,,,,,,,,,,,,,08047-0006,0,No Garage,Carport,Bathroom,Public Sewer,2.0,,4018.0,462000.0,,0.2,Primary,Ps 276 Louis Marshall,3.0,615.0,13.0,1.0,Public,0.3,Middle,Is 68 Isaac Bildersee,6.0,347.0,10.0,1.0,Public,0.7,9-12,1.0,High,https://www.greatschools.org/school?id=13337&s...,Academy for Conservation and the Environment,3.0,282.0,12.0,1.0,Public,1965.0


In [56]:
df.loc[df['resoFactsStats/hasOpenParking'] > 0]

Unnamed: 0,address/city,address/streetAddress,address/zipcode,bathrooms,bedrooms,dateposted,description,homeStatus,latitude,livingArea,longitude,price,priceHistory/0/buyerAgent/name,priceHistory/0/buyerAgent/photo/url,priceHistory/0/buyerAgent/profileUrl,priceHistory/0/event,priceHistory/0/price,priceHistory/0/priceChangeRate,priceHistory/0/sellerAgent/name,priceHistory/0/sellerAgent/photo/url,priceHistory/0/sellerAgent/profileUrl,priceHistory/0/time,priceHistory/1/event,priceHistory/1/postingIsRental,priceHistory/1/price,priceHistory/1/priceChangeRate,priceHistory/1/showCountyLink,priceHistory/1/time,priceHistory/2/event,priceHistory/2/postingIsRental,priceHistory/2/price,priceHistory/2/priceChangeRate,priceHistory/2/showCountyLink,priceHistory/2/time,priceHistory/3/event,priceHistory/3/postingIsRental,priceHistory/3/price,priceHistory/3/priceChangeRate,priceHistory/3/showCountyLink,priceHistory/3/time,priceHistory/4/event,priceHistory/4/postingIsRental,priceHistory/4/price,priceHistory/4/priceChangeRate,priceHistory/4/showCountyLink,priceHistory/4/time,priceHistory/5/event,priceHistory/5/postingIsRental,priceHistory/5/price,priceHistory/5/priceChangeRate,priceHistory/5/showCountyLink,priceHistory/5/time,priceHistory/6/event,priceHistory/6/postingIsRental,priceHistory/6/price,priceHistory/6/priceChangeRate,priceHistory/6/showCountyLink,priceHistory/6/time,priceHistory/7/event,priceHistory/7/postingIsRental,priceHistory/7/price,priceHistory/7/priceChangeRate,priceHistory/7/showCountyLink,priceHistory/7/time,propertyTaxRate,resoFactsStats/appliances/0,resoFactsStats/appliances/1,resoFactsStats/appliances/2,resoFactsStats/atAGlanceFacts/2/factValue,resoFactsStats/atAGlanceFacts/3/factValue,resoFactsStats/atAGlanceFacts/4/factValue,resoFactsStats/atAGlanceFacts/5/factValue,resoFactsStats/basement,resoFactsStats/bathrooms,resoFactsStats/bathroomsFull,resoFactsStats/bathroomsHalf,resoFactsStats/bathroomsOneQuarter,resoFactsStats/bathroomsThreeQuarter,resoFactsStats/bedrooms,resoFactsStats/cityRegion,resoFactsStats/constructionMaterials/0,resoFactsStats/cooling/0,resoFactsStats/elementarySchoolDistrict,resoFactsStats/flooring/0,resoFactsStats/furnished,resoFactsStats/garageSpaces,resoFactsStats/hasAttachedGarage,resoFactsStats/hasAttachedProperty,resoFactsStats/hasCarport,resoFactsStats/hasCooling,resoFactsStats/hasFireplace,resoFactsStats/hasGarage,resoFactsStats/hasHeating,resoFactsStats/hasOpenParking,resoFactsStats/hasPrivatePool,resoFactsStats/hasSpa,resoFactsStats/hasView,resoFactsStats/heating/0,resoFactsStats/heating/1,resoFactsStats/highSchoolDistrict,resoFactsStats/homeType,resoFactsStats/isNewConstruction,resoFactsStats/livingArea,resoFactsStats/lotSize,resoFactsStats/middleOrJuniorSchoolDistrict,resoFactsStats/onMarketDate,resoFactsStats/otherFacts/0/name,resoFactsStats/otherFacts/0/value,resoFactsStats/otherFacts/1/name,resoFactsStats/otherFacts/1/value,resoFactsStats/otherFacts/2/name,resoFactsStats/otherFacts/2/value,resoFactsStats/otherFacts/3/name,resoFactsStats/otherFacts/3/value,resoFactsStats/otherFacts/4/name,resoFactsStats/otherFacts/4/value,resoFactsStats/otherFacts/5/name,resoFactsStats/otherFacts/5/value,resoFactsStats/otherFacts/6/name,resoFactsStats/otherFacts/6/value,resoFactsStats/otherFacts/7/name,resoFactsStats/otherFacts/7/value,resoFactsStats/otherFacts/8/name,resoFactsStats/otherFacts/8/value,resoFactsStats/otherFacts/9/name,resoFactsStats/otherFacts/9/value,resoFactsStats/otherFacts/10/value,resoFactsStats/otherFacts/11/value,resoFactsStats/parcelNumber,resoFactsStats/parking,resoFactsStats/parkingFeatures/0,resoFactsStats/parkingFeatures/1,resoFactsStats/rooms/0/roomType,resoFactsStats/sewer/0,resoFactsStats/stories,resoFactsStats/structureType,resoFactsStats/taxAnnualAmount,resoFactsStats/taxAssessedValue,resoFactsStats/yearBuiltEffective,schools/0/distance,schools/0/level,schools/0/name,schools/0/rating,schools/0/size,schools/0/studentsPerTeacher,schools/0/totalCount,schools/0/type,schools/1/distance,schools/1/level,schools/1/name,schools/1/rating,schools/1/size,schools/1/studentsPerTeacher,schools/1/totalCount,schools/1/type,schools/2/distance,schools/2/grades,schools/2/isAssigned,schools/2/level,schools/2/link,schools/2/name,schools/2/rating,schools/2/size,schools/2/studentsPerTeacher,schools/2/totalCount,schools/2/type,yearBuilt
269,Jamaica,14977 Huxley St,11422.0,4.0,5.0,1.600376e+12,"Wonderful multifamily treasure on quiet, tree ...",FOR_SALE,40.651073,1996.0,-73.737541,750000.0,,,,Listed for sale,750000.0,0.420455,,,,1.600301e+12,Listing removed,1.0,1400.0,0.000000,0.0,1.366848e+12,Price change,1.0,1400.0,-0.066667,0.0,1.365638e+12,Listed for rent,1.0,1500.0,0.000000,0.0,1.365206e+12,Sold,0.0,528000.0,0.000000,0.0,1.153440e+12,,,,,,,,,,,,,,,,,,,0.84,Gas Water Heater,,,"Natural Gas, Other",Wall Unit(s),"Private Drive, Other",125 Days,,4.0,2.0,2.0,,0.0,5.0,Jamaica,Aluminum Siding,Wall Unit(s),,Hardwood,0,0,0,1,0,1.0,,0,1.0,1,,0,0,Natural Gas,Other,,Residential,,"1,996 sqft","3,167 sqft",,1.600376e+12,,,,,,,,,,,,,,,,,,,,,,,136460043,0,Private Drive,Other,,,2.0,,6005.0,655000.0,,0.2,Primary,Ps 195 William Haberle,2.0,560.0,15.0,1.0,Public,1.6,Middle,Collaborative Arts Middle School,4.0,303.0,14.0,1.0,Public,1.6,9-12,1.0,High,https://www.greatschools.org/school?id=08444&s...,Excelsior Preparatorty High School,4.0,477.0,19.0,1.0,Public,1960.0
318,Far Rockaway,1121 Mcbride St,11691.0,3.0,6.0,1.600889e+12,"Great investment opportunity, 3 family all 2 b...",FOR_SALE,40.604797,3594.0,-73.758415,725000.0,,,,Listed for sale,725000.0,3.061625,,,,1.600819e+12,Sold,0.0,178500.0,1.005618,0.0,1.330906e+12,Listing removed,0.0,,,0.0,1.314662e+12,Listed for sale,0.0,89000.0,-0.841956,0.0,1.314317e+12,Sold,0.0,563136.0,-0.158242,0.0,1.269475e+12,Sold,0.0,669000.0,0.000000,0.0,1.156723e+12,,,,,,,,,,,,,0.84,Refrigerator,Stove,Gas Water Heater,"Natural Gas, Steam/Radiator",,"Street, None",119 Days,,3.0,3.0,0.0,,0.0,6.0,Far Rockaway,Brick,,,Ceramic Tile,0,0,0,1,0,,,0,1.0,1,,0,0,Natural Gas,Steam/Radiator,,Residential,,"3,594 sqft","2,378 sqft",,1.600889e+12,,,,,,,,,,,,,,,,,,,,,,,157140188,0,Street,,,,3.0,,7092.0,706000.0,,0.3,Primary,Wave Preparatory Elementary School,9.0,557.0,14.0,1.0,Public,0.6,Middle,Ms 53 Brian Piccolo,3.0,271.0,12.0,2.0,Public,0.4,9-12,1.0,High,https://www.greatschools.org/school?id=08441&s...,Frederick Douglas Academy Vi High School,1.0,336.0,16.0,1.0,Public,2006.0
345,Neponsit,464 Beach 145 St,11694.0,3.0,4.0,1.601422e+12,"NEPONSIT Corner detached 1 family center hall,...",FOR_SALE,40.575123,2571.0,-73.863052,1499999.0,,,,Listed for sale,1499999.0,0.000000,,,,1.610496e+12,Listing removed,0.0,1499999.0,0.000000,0.0,1.601942e+12,Price change,0.0,1499999.0,-0.062501,0.0,1.595635e+12,Listed for sale,0.0,1600000.0,-0.085192,0.0,1.590106e+12,Listing removed,0.0,1749000.0,0.000000,0.0,1.562976e+12,Price change,0.0,1749000.0,-0.000571,0.0,1.562112e+12,Listed for sale,0.0,1750000.0,0.0,0.0,1.555373e+12,,,,,,,0.84,Dryer,Refrigerator,Stove,"Natural Gas, Hot Water","A/C Unit, Central Air","Private Drive, 6+ Spaces, Attached Garage","6,000 sqft",Finished,3.0,2.0,1.0,,0.0,4.0,Neponsit,Brick,A/C Unit,,Carpet,0,6,1,0,0,1.0,,1,1.0,1,,0,0,Natural Gas,Hot Water,,Residential,,"2,571 sqft","6,000 sqft",,1.601422e+12,,,,,,,,,,,,,,,,,,,,,,,,6,Private Drive,6+ Spaces,,,2.0,Duplex,11842.0,,,0.6,Primary,Ps Ms 114 Belle Harbor,8.0,669.0,13.0,2.0,Public,2.4,Middle,Scholars Academy,8.0,1375.0,19.0,1.0,Public,,,,,,,,,,,,1931.0
482,Far Rockaway,2511 Edgemere Ave,11691.0,1.0,3.0,1.599882e+12,Subject to short sale. Drive by view only. Ple...,FOR_SALE,40.596405,1122.0,-73.759117,400000.0,,,,Listed for sale,400000.0,0.253918,,,,1.583453e+12,Listing removed,0.0,319000.0,0.000000,0.0,1.483229e+12,Price change,0.0,319000.0,0.281124,0.0,1.467158e+12,Listed for sale,0.0,249000.0,-0.038610,0.0,1.466726e+12,Listing removed,0.0,259000.0,0.000000,0.0,1.460506e+12,Listed for sale,0.0,259000.0,0.000000,0.0,1.455494e+12,Listing removed,0.0,259000.0,0.0,0.0,1.443571e+12,Price change,0.0,259000.0,-0.26,0.0,1.431043e+12,0.84,Gas Water Heater,,,"Natural Gas, Steam/Radiator",,"Private Drive, 1 Space","2,100 sqft",Unfinished,1.0,1.0,0.0,,0.0,3.0,Far Rockaway,Siding,,,Hardwood,0,1,0,1,0,,,1,1.0,1,,0,0,Natural Gas,Steam/Radiator,,Residential,,"1,122 sqft","2,100 sqft",,1.599882e+12,,,,,,,,,,,,,,,,,,,,,,,157830053,1,Private Drive,1 Space,,,2.0,,3088.0,345000.0,,0.2,Primary,Ps 43,2.0,894.0,14.0,3.0,Public,0.4,High,Queens High School For Information Research And T,3.0,436.0,19.0,1.0,Public,,,,,,,,,,,,1920.0
486,Far Rockaway,16-44 Seagirt Blvd,11691.0,2.0,3.0,1.606156e+12,Spacious 3 bedroom condo/ townhouse centrally ...,FOR_SALE,40.595592,1224.0,-73.750786,475000.0,,,,Listed for sale,475000.0,0.000000,,,,1.599264e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,,,,"Natural Gas, Forced Air",Central Air,Attached,$400/mo,Partially Finished,2.0,1.0,1.0,,,3.0,Far Rockaway,Brick,Central Air,Queens 27,,0,0,1,0,0,1.0,0.0,1,1.0,1,0.0,0,0,Natural Gas,Forced Air,Queens 27,Residential,0.0,"1,224 sqft",1.60 Acres,Queens 27,1.599005e+12,,,,,,,,,,,,,,,,,,,,,,,Q15632-1014,0,Attached,,,,,,,,,0.5,Primary,Ps 197 The Ocean School,6.0,495.0,13.0,1.0,Public,0.5,Middle,Ms 53 Brian Piccolo,3.0,271.0,12.0,2.0,Public,0.9,9-12,1.0,High,https://www.greatschools.org/school?id=13369&s...,Queens High School For Information Research And T,3.0,436.0,19.0,1.0,Public,1967.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
73184,Woodside,62-10 Woodside Ave #207,11377.0,2.0,2.0,1.606156e+12,Large condo apartment. Doorman building Access...,FOR_SALE,40.744015,985.0,-73.901970,719000.0,,,,Listed for sale,719000.0,0.000000,,,,1.597363e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,Dishwasher,Microwave,Refrigerator,"Natural Gas, Hot Water",Central Air,Assigned,$580/mo,,2.0,2.0,0.0,,,2.0,Woodside,,Central Air,Queens 30,Hardwood,0,1,0,0,0,1.0,,1,1.0,1,,0,0,Natural Gas,Hot Water,Queens 30,Residential,0.0,985 sqft,0.98 Acres,Queens 30,1.597363e+12,,,,,,,,,,,,,,,,,,,,,,,Q01337-1024,1,Assigned,,,,,,,,,0.5,Elementary,Ps 11 Kathryn Phelan,8.0,1024.0,14.0,1.0,Public,1.2,Middle,Is 10 Horace Greeley,8.0,740.0,14.0,1.0,Public,1.0,9-12,1.0,High,https://www.greatschools.org/school?id=02805&s...,William Cullen Bryant High School,3.0,2378.0,17.0,1.0,Public,1922.0
74394,Rego Park,64-75 Austin St #7A,11374.0,2.0,2.0,1.610381e+12,Penthouse Top Floor Totally Renovated Condo in...,FOR_SALE,40.726528,914.0,-73.860222,849888.0,,,,Listed for sale,849888.0,-0.054630,,,,1.610323e+12,Listing removed,0.0,899000.0,0.000000,0.0,1.603670e+12,Listed for sale,0.0,899000.0,0.000000,0.0,1.584317e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,Dishwasher,Dryer,Range,Natural Gas,,Attached,$468/mo,,2.0,2.0,0.0,,,2.0,Rego Park,Brick,,Queens 28,Hardwood,0,1,1,0,0,,,1,1.0,1,,0,1,Natural Gas,,Queens 28,Residential,0.0,914 sqft,,Queens 28,1.610323e+12,,,,,,,,,,,,,,,,,,,,,,,,1,Attached,,,,,,355.0,,2020.0,0.2,Elementary,Ps 139 Rego Park,7.0,748.0,16.0,1.0,Public,0.6,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,Public,1.0,9-12,1.0,High,https://www.greatschools.org/school?id=01960&s...,Forest Hills High School,5.0,3784.0,20.0,1.0,Public,2015.0
74555,Rego Park,99-31 66 #4D,11374.0,2.0,2.0,1.606155e+12,large 2 bed 2 bath condo with terrace steps to...,FOR_SALE,40.728745,1200.0,-73.853745,929000.0,,,,Listed for sale,929000.0,0.000000,,,,1.610496e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,Dishwasher,Dryer,Range,"Electric, Steam",Central Air,0 spaces,"$1,050/mo",Full,2.0,2.0,0.0,,,2.0,Rego Park,Brick,Central Air,Queens 28,Hardwood,0,0,0,0,0,1.0,,0,1.0,1,,0,0,Electric,Steam,Queens 28,Residential,1.0,"1,200 sqft",,Queens 28,1.583453e+12,,,,,,,,,,,,,,,,,,,,,,,,0,,,,,,,500.0,,,0.1,Primary,Ps 175 The Lynn Gross Discovery School,7.0,809.0,19.0,1.0,Public,0.2,Middle,Jhs 157 Stephen A Halsey,9.0,1632.0,16.0,1.0,Public,0.5,9-12,1.0,High,https://www.greatschools.org/school?id=01960&s...,Forest Hills High School,5.0,3784.0,20.0,1.0,Public,2008.0
74560,Forest Hills,65-10 108th St #1L,11375.0,1.0,1.0,1.609824e+12,"Large living, Large Master Bedroom, Updated Ba...",FOR_SALE,40.731186,900.0,-73.849129,318000.0,,,,Listed for sale,318000.0,0.000000,,,,1.609805e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,,,,"Natural Gas, Steam",,Garage,$678/mo,,1.0,1.0,0.0,,,1.0,Forest Hills,Brick,,Queens 28,Hardwood,0,0,0,0,0,,,1,1.0,1,,0,0,Natural Gas,Steam,Queens 28,Residential,0.0,900 sqft,,Queens 28,1.609805e+12,,,,,,,,,,,,,,,,,,,,,,,,0,Garage,,,,,,,,,0.2,Primary,Ps 175 The Lynn Gross Discovery School,7.0,809.0,19.0,1.0,Public,0.3,Middle,Jhs 157 Stephen A Halsey,9.0,1632.0,16.0,1.0,Public,0.3,9-12,1.0,High,https://www.greatschools.org/school?id=01960&s...,Forest Hills High School,5.0,3784.0,20.0,1.0,Public,1950.0
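
Displaying every matching row, as the cell above does, makes it hard to see how common each flag is. As a rough sketch (not part of the original notebook), the binary `has*` columns can be summarized in one small table before inspecting rows. The `summarize_flags` helper and the tiny `sample` frame are hypothetical stand-ins; in the notebook, `df` would be passed instead.

```python
import pandas as pd

# Hypothetical miniature stand-in for the notebook's `df`; values are illustrative only.
sample = pd.DataFrame({
    "resoFactsStats/hasOpenParking": [1, 0, 1, 0],
    "resoFactsStats/hasSpa": [0, 0, 1, None],
    "resoFactsStats/hasCarport": [1, 1, 0, 0],
})


def summarize_flags(frame, columns):
    """Return per-column counts of positive, zero, and missing values."""
    subset = frame[columns]
    return pd.DataFrame({
        "positive": (subset > 0).sum(),   # rows the `> 0` filter would select
        "zero": (subset == 0).sum(),      # rows explicitly flagged as absent
        "missing": subset.isna().sum(),   # rows with no recorded value
    })


flag_columns = [
    "resoFactsStats/hasOpenParking",
    "resoFactsStats/hasSpa",
    "resoFactsStats/hasCarport",
]
summary = summarize_flags(sample, flag_columns)
print(summary)
```

Separating "zero" from "missing" matters here because a `> 0` filter silently excludes both, and several of the flag columns in the output above contain blanks as well as `0` values.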


In [57]:
df.loc[df['resoFactsStats/hasSpa'] > 0]

Unnamed: 0,address/city,address/streetAddress,address/zipcode,bathrooms,bedrooms,dateposted,description,homeStatus,latitude,livingArea,longitude,price,priceHistory/0/buyerAgent/name,priceHistory/0/buyerAgent/photo/url,priceHistory/0/buyerAgent/profileUrl,priceHistory/0/event,priceHistory/0/price,priceHistory/0/priceChangeRate,priceHistory/0/sellerAgent/name,priceHistory/0/sellerAgent/photo/url,priceHistory/0/sellerAgent/profileUrl,priceHistory/0/time,priceHistory/1/event,priceHistory/1/postingIsRental,priceHistory/1/price,priceHistory/1/priceChangeRate,priceHistory/1/showCountyLink,priceHistory/1/time,priceHistory/2/event,priceHistory/2/postingIsRental,priceHistory/2/price,priceHistory/2/priceChangeRate,priceHistory/2/showCountyLink,priceHistory/2/time,priceHistory/3/event,priceHistory/3/postingIsRental,priceHistory/3/price,priceHistory/3/priceChangeRate,priceHistory/3/showCountyLink,priceHistory/3/time,priceHistory/4/event,priceHistory/4/postingIsRental,priceHistory/4/price,priceHistory/4/priceChangeRate,priceHistory/4/showCountyLink,priceHistory/4/time,priceHistory/5/event,priceHistory/5/postingIsRental,priceHistory/5/price,priceHistory/5/priceChangeRate,priceHistory/5/showCountyLink,priceHistory/5/time,priceHistory/6/event,priceHistory/6/postingIsRental,priceHistory/6/price,priceHistory/6/priceChangeRate,priceHistory/6/showCountyLink,priceHistory/6/time,priceHistory/7/event,priceHistory/7/postingIsRental,priceHistory/7/price,priceHistory/7/priceChangeRate,priceHistory/7/showCountyLink,priceHistory/7/time,propertyTaxRate,resoFactsStats/appliances/0,resoFactsStats/appliances/1,resoFactsStats/appliances/2,resoFactsStats/atAGlanceFacts/2/factValue,resoFactsStats/atAGlanceFacts/3/factValue,resoFactsStats/atAGlanceFacts/4/factValue,resoFactsStats/atAGlanceFacts/5/factValue,resoFactsStats/basement,resoFactsStats/bathrooms,resoFactsStats/bathroomsFull,resoFactsStats/bathroomsHalf,resoFactsStats/bathroomsOneQuarter,resoFactsStats/bathroomsThreeQuarter,resoFactsStats/bedrooms,resoFactsStats/cityRegion,resoFactsStats/constructionMaterials/0,resoFactsStats/cooling/0,resoFactsStats/elementarySchoolDistrict,resoFactsStats/flooring/0,resoFactsStats/furnished,resoFactsStats/garageSpaces,resoFactsStats/hasAttachedGarage,resoFactsStats/hasAttachedProperty,resoFactsStats/hasCarport,resoFactsStats/hasCooling,resoFactsStats/hasFireplace,resoFactsStats/hasGarage,resoFactsStats/hasHeating,resoFactsStats/hasOpenParking,resoFactsStats/hasPrivatePool,resoFactsStats/hasSpa,resoFactsStats/hasView,resoFactsStats/heating/0,resoFactsStats/heating/1,resoFactsStats/highSchoolDistrict,resoFactsStats/homeType,resoFactsStats/isNewConstruction,resoFactsStats/livingArea,resoFactsStats/lotSize,resoFactsStats/middleOrJuniorSchoolDistrict,resoFactsStats/onMarketDate,resoFactsStats/otherFacts/0/name,resoFactsStats/otherFacts/0/value,resoFactsStats/otherFacts/1/name,resoFactsStats/otherFacts/1/value,resoFactsStats/otherFacts/2/name,resoFactsStats/otherFacts/2/value,resoFactsStats/otherFacts/3/name,resoFactsStats/otherFacts/3/value,resoFactsStats/otherFacts/4/name,resoFactsStats/otherFacts/4/value,resoFactsStats/otherFacts/5/name,resoFactsStats/otherFacts/5/value,resoFactsStats/otherFacts/6/name,resoFactsStats/otherFacts/6/value,resoFactsStats/otherFacts/7/name,resoFactsStats/otherFacts/7/value,resoFactsStats/otherFacts/8/name,resoFactsStats/otherFacts/8/value,resoFactsStats/otherFacts/9/name,resoFactsStats/otherFacts/9/value,resoFactsStats/otherFacts/10/value,resoFactsStats/otherFacts/11/value,resoFactsStats/parcelNumber,resoFactsStats/parking,resoFactsStats/parkingFeatures/0,resoFactsStats/parkingFeatures/1,resoFactsStats/rooms/0/roomType,resoFactsStats/sewer/0,resoFactsStats/stories,resoFactsStats/structureType,resoFactsStats/taxAnnualAmount,resoFactsStats/taxAssessedValue,resoFactsStats/yearBuiltEffective,schools/0/distance,schools/0/level,schools/0/name,schools/0/rating,schools/0/size,schools/0/studentsPerTeacher,schools/0/totalCount,schools/0/type,schools/1/distance,schools/1/level,schools/1/name,schools/1/rating,schools/1/size,schools/1/studentsPerTeacher,schools/1/totalCount,schools/1/type,schools/2/distance,schools/2/grades,schools/2/isAssigned,schools/2/level,schools/2/link,schools/2/name,schools/2/rating,schools/2/size,schools/2/studentsPerTeacher,schools/2/totalCount,schools/2/type,yearBuilt
62,New York,34 Fort Charles Pl,10463.0,3.0,6.0,,"34 Fort Charles Pl, New York, NY 10463 is a si...",SOLD,40.875870,3000.0,-73.910347,56687.0,,,,Sold,56687.0,-0.943171,,,,1.540253e+12,Listed for sale,0.0,997500.0,0.000000,0.0,1.533341e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.88,,,,Gas,,"Garage - Attached, Covered","2,591 sqft",Finished,3.0,3.0,0.0,0.0,0.0,6.0,New York,,,City of New York,,0,1,0,0,0,0.0,1.0,1,1.0,0,1.0,1,0,Gas,,City of New York,Single Family,,"3,000 sqft","2,591 sqft",City of New York,,Garbage,Public,Property Type,Single Family,Status,Active,Amenities,1st Fl Master Bedroom,Air Conditioning,Window Units,Year Built Exception,Estimated,Sq Ft Source,Other,Attic Description,Finished,Hotwater,Electric Stand Alone,Water Description,Community,Radiator,Municipality,022150496,1,Garage - Attached,Covered,,,4.0,Queen Anne / Victorian,5487.0,689000.0,2010.0,0.2,Elementary,Ps 37 Multiple Intelligence School,4.0,647.0,14.0,1.0,Public,0.2,Middle,In Tech Academy Aka Ms High School 368,3.0,993.0,14.0,1.0,Public,,,,,,,,,,,,1920.0
335,Broad Channel,808 Church Rd,11693.0,2.0,4.0,,"808 Church Rd, Broad Channel, NY 11693 is a si...",SOLD,40.606899,1298.0,-73.818039,250000.0,,,,Sold,250000.0,39.000000,,,,1.525651e+12,Sold,0.0,6250.0,0.000000,0.0,9.229248e+11,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,Dishwasher,Dryer,Freezer,"Baseboard, Oil","Refrigerator, Wall","On-street, Garage","2,500 sqft",,2.0,2.0,0.0,0.0,0.0,4.0,Broad Channel,,Refrigerator,,Tile,0,0,0,0,0,1.0,0.0,0,1.0,0,0.0,1,0,Baseboard,Oil,,Single Family,,"1,298 sqft","2,500 sqft",,,,,,,,,,,,,,,,,,,,,,,,,154620007,0,On-street,Garage,WalkInCloset,,2.0,Other,1220.0,412000.0,1925.0,0.2,Primary,Ps 47 Chris Galas,7.0,218.0,10.0,1.0,Public,1.3,High,Rockaway Park High School For Environmental Su...,2.0,310.0,14.0,1.0,Public,,,,,,,,,,,,1925.0
337,Neponsit,146-16 Rockaway Beach Blvd,11694.0,7.0,7.0,1.610402e+12,Call Owner or email\nThe recently renovated Ho...,FOR_SALE,40.570076,6322.0,-73.862450,2950000.0,,,,Listed for sale,2950000.0,0.000000,,,,1.610323e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,Dishwasher,Dryer,Freezer,,,"Garage - Detached, Covered","10,705 sqft",Finished,7.0,1.0,1.0,0.0,5.0,7.0,Neponsit,,,,Tile,0,12,0,0,0,0.0,1.0,1,0.0,0,,1,1,,,,Single Family,0.0,"6,322 sqft","10,705 sqft",,1.610323e+12,,,,,,,,,,,,,,,,,,,,,,,,12,Garage - Detached,Covered,WalkInCloset,,3.0,Colonial,,,1995.0,0.7,Primary,Ps Ms 114 Belle Harbor,8.0,669.0,13.0,2.0,Public,2.5,Middle,Scholars Academy,8.0,1375.0,19.0,1.0,Public,,,,,,,,,,,,1910.0
640,Far Rockaway,270 Beach 137th St,11694.0,4.0,4.0,,Spectacular center hall colonial with 4 spacio...,SOLD,40.575317,3600.0,-73.855217,1575000.0,,,,Sold,1575000.0,0.285714,Robin Shapiro,https://photos.zillowstatic.com/h_e/ISqok07z9m...,/profile/Robin-Shapiro/,1.521763e+12,Sold,0.0,1225000.0,-0.092593,0.0,1.433808e+12,Listing removed,0.0,1350000.0,0.000000,0.0,1.427069e+12,Listed for sale,0.0,1350000.0,0.356784,0.0,1.426896e+12,Sold,0.0,995000.0,-0.091324,0.0,1.206403e+12,Sold,0.0,1095000.0,1.085714,0.0,1.182211e+12,Sold,0.0,525000.0,0.000000,0.0,9.538560e+11,,,,,,,0.84,Dishwasher,Dryer,Freezer,"Baseboard, Gas, Solar","Central, Solar","Garage, Garage - Attached","5,998 sqft",Finished,4.0,4.0,0.0,0.0,0.0,4.0,Far Rockaway,,Central,,Tile,0,0,0,0,0,1.0,1.0,0,1.0,0,1.0,1,0,Baseboard,Gas,,Single Family,,"3,600 sqft","5,998 sqft",,,,,,,,,,,,,,,,,,,,,,,,,162700086,0,Garage,Garage - Attached,WalkInCloset,,2.0,Colonial,10603.0,1604000.0,1940.0,0.2,Primary,Ps Ms 114 Belle Harbor,8.0,669.0,13.0,2.0,Public,2.0,Middle,Scholars Academy,8.0,1375.0,19.0,1.0,Public,,,,,,,,,,,,1940.0
734,Far Rockaway,14015 Rockaway Beach Blvd,11694.0,5.0,4.0,,"Location, Location, Location!!!! Lovely, elega...",SOLD,40.571941,2740.0,-73.857193,1180000.0,Carla Belisario,https://photos.zillowstatic.com/h_e/ISatcykpz5...,/profile/carlabelisario/,Sold,1180000.0,-0.063492,Talk of the Town RE,https://photos.zillowstatic.com/h_e/ISmy0ygdih...,/profile/Talk-of-the-Town-RE/,1.525046e+12,Price change,0.0,1260000.0,-0.022498,0.0,1.503878e+12,Price change,0.0,1289000.0,-0.008462,0.0,1.501546e+12,Price change,0.0,1300000.0,-0.071429,0.0,1.491264e+12,Listed for sale,0.0,1400000.0,0.000000,0.0,1.482710e+12,,,,,,,,,,,,,,,,,,,0.84,Dishwasher,Dryer,Microwave,"Other, Radiant, Gas",Central,"Garage, Carport, Garage - Attached, Covered","6,172 sqft",Finished,5.0,4.0,1.0,0.0,0.0,4.0,Far Rockaway,,Central,,Tile,0,10,0,0,0,1.0,1.0,1,1.0,0,0.0,1,0,Other,Radiant,,Single Family,,"2,740 sqft","6,172 sqft",,,,,,,,,,,,,,,,,,,,,,,,,162850001,10,Garage,Carport,FamilyRoom,,2.0,Tudor,11271.0,1059000.0,1971.0,0.4,Primary,Ps Ms 114 Belle Harbor,8.0,669.0,13.0,2.0,Public,2.1,Middle,Scholars Academy,8.0,1375.0,19.0,1.0,Public,,,,,,,,,,,,1945.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
74621,Flushing,11034 68th Rd,11375.0,3.5,4.0,,Location Location Location.\nfully detached b...,SOLD,40.727116,5000.0,-73.842636,10.0,,,,Sold,10.0,-0.999993,,,,1.527638e+12,Listing removed,1.0,5250.0,0.000000,0.0,1.507334e+12,Price change,1.0,5250.0,0.050210,0.0,1.507075e+12,Price change,1.0,4999.0,-0.049439,0.0,1.503965e+12,Listed for rent,1.0,5259.0,0.000000,0.0,1.502928e+12,Sold,0.0,1465000.0,0.000000,0.0,1.439770e+12,,,,,,,,,,,,,0.84,Dishwasher,,,,,"Garage, Garage - Attached","5,000 sqft",,,0.0,0.0,0.0,0.0,4.0,Flushing,,,,,0,0,0,0,0,0.0,1.0,0,0.0,0,,1,0,,,,Single Family,,"5,000 sqft","5,000 sqft",,,,Courtyard,,Fios Available,,,,,,,,,,,,,,,,,,,022270022,0,Garage,Garage - Attached,,,2.0,,16231.0,1583000.0,1930.0,0.4,Primary,Ps 196 Grand Central Parkway,10.0,996.0,18.0,1.0,Public,0.7,Middle,Jhs 157 Stephen A Halsey,9.0,1632.0,16.0,1.0,Public,0.2,9-12,1.0,High,https://www.greatschools.org/school?id=01960&s...,Forest Hills High School,5.0,3784.0,20.0,1.0,Public,1930.0
75050,Flushing,6945 Fleet St,11375.0,5.0,4.0,,NYC Well positioned on a charming Fleet street...,SOLD,40.717712,3995.0,-73.849693,1685000.0,,,,Listing removed,1995000.0,0.000000,,,,1.582330e+12,Pending sale,0.0,1995000.0,0.000000,0.0,1.572912e+12,Listed for sale,0.0,1995000.0,0.183976,0.0,1.572912e+12,Sold,0.0,1685000.0,-0.063369,0.0,1.570493e+12,Price change,0.0,1799000.0,-0.026515,0.0,1.561853e+12,Listed for sale,0.0,1848000.0,0.000000,0.0,1.559174e+12,Listing removed,0.0,1848000.0,0.000000,0.0,1.556323e+12,Price change,0.0,1848000.0,-0.026856,0.0,1.555373e+12,0.84,Dishwasher,Dryer,Freezer,"Baseboard, Other, Radiant, Electric, Gas",Central,"Garage, Garage - Detached, Off-street, On-stre...","4,000 sqft",Finished,5.0,3.0,2.0,0.0,0.0,4.0,Flushing,,Central,,Tile,0,2,0,0,0,1.0,1.0,1,1.0,0,0.0,1,1,Baseboard,Other,,Single Family,,"3,995 sqft","4,000 sqft",,,,,,,,,,,,,,,,,,,,,,,,,032170046,2,Garage,Garage - Detached,WalkInCloset,,3.0,Modern,13879.0,1564000.0,1925.0,0.3,Primary,Ps 144 Col Jeromus Remsen,8.0,893.0,16.0,1.0,Public,0.4,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,Public,0.7,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1925.0
75586,Forest Hills Gardens,94 Groton St,11375.0,5.0,5.0,,"Triple-Mint Tudor Masterpiece, This Is One Of ...",SOLD,40.715370,3885.0,-73.845581,3000000.0,Rachel Borut,https://photos.zillowstatic.com/h_e/ISp5ldpee9...,/profile/rachel-borut/,Sold,3000000.0,-0.141631,,,,1.552435e+12,Listing removed,0.0,3495000.0,0.000000,0.0,1.541981e+12,Price change,0.0,3495000.0,-0.080263,0.0,1.526602e+12,Listed for sale,0.0,3800000.0,1.054054,0.0,1.521677e+12,Sold,0.0,1850000.0,-0.191434,0.0,1.367885e+12,Listing removed,0.0,2288000.0,0.000000,0.0,1.340323e+12,Price change,0.0,2288000.0,-0.084434,0.0,1.338509e+12,Listed for sale,0.0,2499000.0,2.519718,0.0,1.328486e+12,0.84,Range / Oven,Refrigerator,Washer,"Forced air, Stove, Wall, Gas, Other",Central,"Garage, Garage - Attached, Covered","5,523 sqft",Finished,5.0,4.0,1.0,0.0,0.0,5.0,Forest Hills Gardens,,Central,Community District 28,Hardwood,0,2,0,0,0,1.0,1.0,1,1.0,0,0.0,1,1,Forced air,Stove,Community District 28,Single Family,,"3,885 sqft","5,523 sqft",Community District 28,,Den/Family Room,Y,Detached/Attached,Det,Driveway,Pvt,Eat In Kitchen,Y,Picture,Y,Water,Public,Attic,Y,Sewer,Y,Wood Floors,Y,Fuel,Gas,Y,Other,032480059,2,Garage,Garage - Attached,WalkInCloset,Y,,Contemporary,16610.0,2030000.0,1920.0,0.1,Primary,Ps 101 School In The Gardens,9.0,654.0,16.0,1.0,Public,0.6,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,Public,0.8,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1920.0
75618,Flushing,140 71st Ave,11375.0,7.0,5.0,,"5,638 SF, 5 beds 7 baths Elevator, EXCELLENT ...",SOLD,40.715462,5638.0,-73.847389,3012500.0,,,,Sold,3012500.0,-0.397500,,,,1.531181e+12,Listed for sale,0.0,5000000.0,0.000000,0.0,1.442534e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,Dishwasher,Dryer,Freezer,Gas,Central,"Garage, Garage - Attached","$2,500/mo",Finished,7.0,5.0,2.0,0.0,0.0,5.0,Flushing,,Central,,Tile,1,0,0,0,0,1.0,1.0,0,1.0,0,0.0,1,1,Gas,,,Single Family,,"5,638 sqft","8,558 sqft",,,,,,,,,,,,,,,,,,,,,,,,,032470004,0,Garage,Garage - Attached,WalkInCloset,,3.0,Colonial,28999.0,2810000.0,1998.0,0.2,Primary,Ps 101 School In The Gardens,9.0,654.0,16.0,1.0,Public,0.6,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,Public,0.7,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1920.0


In [58]:
# Inspect the listings flagged as having an attached property
df.loc[df['resoFactsStats/hasAttachedProperty'] > 0]

Unnamed: 0,address/city,address/streetAddress,address/zipcode,bathrooms,bedrooms,dateposted,description,homeStatus,latitude,livingArea,longitude,price,priceHistory/0/buyerAgent/name,priceHistory/0/buyerAgent/photo/url,priceHistory/0/buyerAgent/profileUrl,priceHistory/0/event,priceHistory/0/price,priceHistory/0/priceChangeRate,priceHistory/0/sellerAgent/name,priceHistory/0/sellerAgent/photo/url,priceHistory/0/sellerAgent/profileUrl,priceHistory/0/time,priceHistory/1/event,priceHistory/1/postingIsRental,priceHistory/1/price,priceHistory/1/priceChangeRate,priceHistory/1/showCountyLink,priceHistory/1/time,priceHistory/2/event,priceHistory/2/postingIsRental,priceHistory/2/price,priceHistory/2/priceChangeRate,priceHistory/2/showCountyLink,priceHistory/2/time,priceHistory/3/event,priceHistory/3/postingIsRental,priceHistory/3/price,priceHistory/3/priceChangeRate,priceHistory/3/showCountyLink,priceHistory/3/time,priceHistory/4/event,priceHistory/4/postingIsRental,priceHistory/4/price,priceHistory/4/priceChangeRate,priceHistory/4/showCountyLink,priceHistory/4/time,priceHistory/5/event,priceHistory/5/postingIsRental,priceHistory/5/price,priceHistory/5/priceChangeRate,priceHistory/5/showCountyLink,priceHistory/5/time,priceHistory/6/event,priceHistory/6/postingIsRental,priceHistory/6/price,priceHistory/6/priceChangeRate,priceHistory/6/showCountyLink,priceHistory/6/time,priceHistory/7/event,priceHistory/7/postingIsRental,priceHistory/7/price,priceHistory/7/priceChangeRate,priceHistory/7/showCountyLink,priceHistory/7/time,propertyTaxRate,resoFactsStats/appliances/0,resoFactsStats/appliances/1,resoFactsStats/appliances/2,resoFactsStats/atAGlanceFacts/2/factValue,resoFactsStats/atAGlanceFacts/3/factValue,resoFactsStats/atAGlanceFacts/4/factValue,resoFactsStats/atAGlanceFacts/5/factValue,resoFactsStats/basement,resoFactsStats/bathrooms,resoFactsStats/bathroomsFull,resoFactsStats/bathroomsHalf,resoFactsStats/bathroomsOneQuarter,resoFactsStats/bathroomsThreeQuarter,resoFactsStats/bedrooms,resoFactsStats/cityRegion,resoFactsStats/constructionMaterials/0,resoFactsStats/cooling/0,resoFactsStats/elementarySchoolDistrict,resoFactsStats/flooring/0,resoFactsStats/furnished,resoFactsStats/garageSpaces,resoFactsStats/hasAttachedGarage,resoFactsStats/hasAttachedProperty,resoFactsStats/hasCarport,resoFactsStats/hasCooling,resoFactsStats/hasFireplace,resoFactsStats/hasGarage,resoFactsStats/hasHeating,resoFactsStats/hasOpenParking,resoFactsStats/hasPrivatePool,resoFactsStats/hasSpa,resoFactsStats/hasView,resoFactsStats/heating/0,resoFactsStats/heating/1,resoFactsStats/highSchoolDistrict,resoFactsStats/homeType,resoFactsStats/isNewConstruction,resoFactsStats/livingArea,resoFactsStats/lotSize,resoFactsStats/middleOrJuniorSchoolDistrict,resoFactsStats/onMarketDate,resoFactsStats/otherFacts/0/name,resoFactsStats/otherFacts/0/value,resoFactsStats/otherFacts/1/name,resoFactsStats/otherFacts/1/value,resoFactsStats/otherFacts/2/name,resoFactsStats/otherFacts/2/value,resoFactsStats/otherFacts/3/name,resoFactsStats/otherFacts/3/value,resoFactsStats/otherFacts/4/name,resoFactsStats/otherFacts/4/value,resoFactsStats/otherFacts/5/name,resoFactsStats/otherFacts/5/value,resoFactsStats/otherFacts/6/name,resoFactsStats/otherFacts/6/value,resoFactsStats/otherFacts/7/name,resoFactsStats/otherFacts/7/value,resoFactsStats/otherFacts/8/name,resoFactsStats/otherFacts/8/value,resoFactsStats/otherFacts/9/name,resoFactsStats/otherFacts/9/value,resoFactsStats/otherFacts/10/value,resoFactsStats/otherFacts/11/value,resoFactsStats/parcelNumber,resoFactsStats/parking,resoFactsStats/parkingFeatures/0,resoFactsStats/parkingFeatures/1,resoFactsStats/rooms/0/roomType,resoFactsStats/sewer/0,resoFactsStats/stories,resoFactsStats/structureType,resoFactsStats/taxAnnualAmount,resoFactsStats/taxAssessedValue,resoFactsStats/yearBuiltEffective,schools/0/distance,schools/0/level,schools/0/name,schools/0/rating,schools/0/size,schools/0/studentsPerTeacher,schools/0/totalCount,schools/0/type,schools/1/distance,schools/1/level,schools/1/name,schools/1/rating,schools/1/size,schools/1/studentsPerTeacher,schools/1/totalCount,schools/1/type,schools/2/distance,schools/2/grades,schools/2/isAssigned,schools/2/level,schools/2/link,schools/2/name,schools/2/rating,schools/2/size,schools/2/studentsPerTeacher,schools/2/totalCount,schools/2/type,yearBuilt
269,Jamaica,14977 Huxley St,11422.0,4.0,5.0,1.600376e+12,"Wonderful multifamily treasure on quiet, tree ...",FOR_SALE,40.651073,1996.0,-73.737541,750000.0,,,,Listed for sale,750000.0,0.420455,,,,1.600301e+12,Listing removed,1.0,1400.0,0.000000,0.0,1.366848e+12,Price change,1.0,1400.0,-0.066667,0.0,1.365638e+12,Listed for rent,1.0,1500.0,0.000000,0.0,1.365206e+12,Sold,0.0,528000.0,0.000000,0.0,1.153440e+12,,,,,,,,,,,,,,,,,,,0.84,Gas Water Heater,,,"Natural Gas, Other",Wall Unit(s),"Private Drive, Other",125 Days,,4.0,2.0,2.0,,0.0,5.0,Jamaica,Aluminum Siding,Wall Unit(s),,Hardwood,0,0,0,1,0,1.0,,0,1.0,1,,0,0,Natural Gas,Other,,Residential,,"1,996 sqft","3,167 sqft",,1.600376e+12,,,,,,,,,,,,,,,,,,,,,,,136460043,0,Private Drive,Other,,,2.0,,6005.0,655000.0,,0.2,Primary,Ps 195 William Haberle,2.0,560.0,15.0,1.0,Public,1.6,Middle,Collaborative Arts Middle School,4.0,303.0,14.0,1.0,Public,1.6,9-12,1.0,High,https://www.greatschools.org/school?id=08444&s...,Excelsior Preparatorty High School,4.0,477.0,19.0,1.0,Public,1960.0
318,Far Rockaway,1121 Mcbride St,11691.0,3.0,6.0,1.600889e+12,"Great investment opportunity, 3 family all 2 b...",FOR_SALE,40.604797,3594.0,-73.758415,725000.0,,,,Listed for sale,725000.0,3.061625,,,,1.600819e+12,Sold,0.0,178500.0,1.005618,0.0,1.330906e+12,Listing removed,0.0,,,0.0,1.314662e+12,Listed for sale,0.0,89000.0,-0.841956,0.0,1.314317e+12,Sold,0.0,563136.0,-0.158242,0.0,1.269475e+12,Sold,0.0,669000.0,0.0,0.0,1.156723e+12,,,,,,,,,,,,,0.84,Refrigerator,Stove,Gas Water Heater,"Natural Gas, Steam/Radiator",,"Street, None",119 Days,,3.0,3.0,0.0,,0.0,6.0,Far Rockaway,Brick,,,Ceramic Tile,0,0,0,1,0,,,0,1.0,1,,0,0,Natural Gas,Steam/Radiator,,Residential,,"3,594 sqft","2,378 sqft",,1.600889e+12,,,,,,,,,,,,,,,,,,,,,,,157140188,0,Street,,,,3.0,,7092.0,706000.0,,0.3,Primary,Wave Preparatory Elementary School,9.0,557.0,14.0,1.0,Public,0.6,Middle,Ms 53 Brian Piccolo,3.0,271.0,12.0,2.0,Public,0.4,9-12,1.0,High,https://www.greatschools.org/school?id=08441&s...,Frederick Douglas Academy Vi High School,1.0,336.0,16.0,1.0,Public,2006.0
482,Far Rockaway,2511 Edgemere Ave,11691.0,1.0,3.0,1.599882e+12,Subject to short sale. Drive by view only. Ple...,FOR_SALE,40.596405,1122.0,-73.759117,400000.0,,,,Listed for sale,400000.0,0.253918,,,,1.583453e+12,Listing removed,0.0,319000.0,0.000000,0.0,1.483229e+12,Price change,0.0,319000.0,0.281124,0.0,1.467158e+12,Listed for sale,0.0,249000.0,-0.038610,0.0,1.466726e+12,Listing removed,0.0,259000.0,0.000000,0.0,1.460506e+12,Listed for sale,0.0,259000.0,0.0,0.0,1.455494e+12,Listing removed,0.0,259000.0,0.0,0.0,1.443571e+12,Price change,0.0,259000.0,-0.26,0.0,1.431043e+12,0.84,Gas Water Heater,,,"Natural Gas, Steam/Radiator",,"Private Drive, 1 Space","2,100 sqft",Unfinished,1.0,1.0,0.0,,0.0,3.0,Far Rockaway,Siding,,,Hardwood,0,1,0,1,0,,,1,1.0,1,,0,0,Natural Gas,Steam/Radiator,,Residential,,"1,122 sqft","2,100 sqft",,1.599882e+12,,,,,,,,,,,,,,,,,,,,,,,157830053,1,Private Drive,1 Space,,,2.0,,3088.0,345000.0,,0.2,Primary,Ps 43,2.0,894.0,14.0,3.0,Public,0.4,High,Queens High School For Information Research And T,3.0,436.0,19.0,1.0,Public,,,,,,,,,,,,1920.0
2435,Staten Island,16 Leason Pl,10314.0,4.0,3.0,1.599951e+12,20006R: NEW oversized & custom One Family semi...,FOR_SALE,40.583450,1650.0,-74.149788,699999.0,,,,Listed for sale,699999.0,0.000000,,,,1.607645e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.87,Dishwasher,,,"Hot Water, Natural Gas",Central Air,"No Garage, Off Street, On Street","2,550 sqft",Full,4.0,1.0,1.0,,2.0,3.0,Staten Island,Stone,Central Air,,,0,0,0,1,0,1.0,,0,1.0,1,0.0,0,0,Hot Water,Natural Gas,,Residential,1.0,"1,650 sqft","2,550 sqft",,1.599951e+12,,,,,,,,,,,,,,,,,,,,,,,02390-0169,0,No Garage,Off Street,Bathroom,Public Sewer,2.0,,7000.0,,,0.7,Primary,Ps 69 Daniel D Tompkins,5.0,936.0,10.0,1.0,Public,0.8,Middle,Is 72 Rocco Laurie,7.0,1407.0,14.0,1.0,Public,1.8,9-12,1.0,High,https://www.greatschools.org/school?id=02797&s...,Susan E Wagner High School,4.0,3281.0,20.0,1.0,Public,2019.0
2443,Staten Island,127 Fields Ave,10314.0,2.0,3.0,1.608428e+12,"Centrally located in Willowbrook, this 3 bedro...",FOR_SALE,40.599575,1296.0,-74.137489,559000.0,,,,Listed for sale,559000.0,0.000000,,,,1.608422e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.87,,,,"Forced Air, Natural Gas",Central Air,"No Garage, Carport","2,700 sqft",Full,2.0,1.0,1.0,,0.0,3.0,Staten Island,,Central Air,,,0,0,0,1,1,1.0,,0,1.0,0,0.0,0,0,Forced Air,Natural Gas,,Residential,0.0,"1,296 sqft","2,700 sqft",,1.608428e+12,,,,,,,,,,,,,,,,,,,,,,,01984-0009,0,No Garage,Carport,Bathroom,Public Sewer,2.0,,5685.0,502000.0,,0.2,Primary,Ps 54 Charles W Leng,8.0,808.0,14.0,1.0,Public,1.4,Middle,Is 72 Rocco Laurie,7.0,1407.0,14.0,1.0,Public,0.8,9-12,1.0,High,https://www.greatschools.org/school?id=02797&s...,Susan E Wagner High School,4.0,3281.0,20.0,1.0,Public,1970.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
61313,Brooklyn,2006 E 34th St,11234.0,3.0,3.0,1.609979e+12,18676H-VERY PLEASED TO OFFER THIS WELL BUILT B...,SOLD,40.608479,1939.0,-73.932816,700000.0,MicheleDinatale,https://photos.zillowstatic.com/h_e/IS1725z4tq...,/profile/MicheleDinatale/,Sold,700000.0,-0.077734,MicheleDinatale,https://photos.zillowstatic.com/h_e/IS1725z4tq...,/profile/MicheleDinatale/,1.554682e+12,Listing removed,0.0,759000.0,0.000000,0.0,1.546560e+12,Price change,0.0,759000.0,-0.038023,0.0,1.543968e+12,Price change,0.0,789000.0,-0.024722,0.0,1.542067e+12,Listed for sale,0.0,809000.0,0.000000,0.0,1.541376e+12,,,,,,,,,,,,,,,,,,,0.65,Dishwasher,Microwave,Refrigerator,"Hot Water, Natural Gas",Units,"Built-in, Garage, Off Street, Garage Door Opener",,,3.0,1.0,1.0,,1.0,3.0,Brooklyn,,Units,,,0,1,0,1,0,1.0,,1,1.0,0,0.0,0,0,Hot Water,Natural Gas,,Residential,0.0,"1,939 sqft","1,800 sqft",,1.609979e+12,,,,,,,,,,,,,,,,,,,,,,,08519-0044,1,Built-in,Garage,Bathroom,Public Sewer,3.0,,4757.0,736000.0,,0.4,Primary,Ps 207 Elizabeth G Leary,7.0,1230.0,15.0,2.0,Public,1.0,High,James Madison High School,4.0,3655.0,21.0,1.0,Public,,,,,,,,,,,,1935.0
61329,Brooklyn,2035 E 24th St,11229.0,2.0,7.0,1.599951e+12,"Location, Location, Location! Great Neighborh...",SOLD,40.601196,2640.0,-73.947563,1352000.0,,,,Sold,1352000.0,0.040080,Deana Gambino,https://photos.zillowstatic.com/h_e/ISbhny6fzj...,/profile/dd60927/,1.576022e+12,Listing removed,0.0,1299900.0,0.000000,0.0,1.568506e+12,Listed for sale,0.0,1299900.0,0.000000,0.0,1.562544e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.65,Dishwasher,Dryer,Indoor Grill,"Hot Water, Natural Gas",Units,"Detached, Garage, Off Street, On Street",,Full,2.0,2.0,0.0,,0.0,7.0,Brooklyn,,Units,,,0,2,0,1,0,1.0,,1,1.0,1,0.0,0,0,Hot Water,Natural Gas,,Residential,0.0,"2,640 sqft","2,500 sqft",,1.599951e+12,,,,,,,,,,,,,,,,,,,,,,,07329-0076,2,Detached,Garage,Bathroom,Public Sewer,2.0,,8400.0,1381000.0,,0.3,Primary,Ps 206 Joseph F Lamb,6.0,1503.0,16.0,2.0,Public,0.5,High,James Madison High School,4.0,3655.0,21.0,1.0,Public,,,,,,,,,,,,1930.0
61356,Brooklyn,9319 Schenck St,11236.0,2.0,3.0,1.599951e+12,Semi Detached home in Canarsie Brooklyn is now...,SOLD,40.630463,1920.0,-73.888741,620000.0,Mark Lanfranchi,https://photos.zillowstatic.com/h_e/ISzztyxc8n...,/profile/markl255/,Sold,620000.0,-0.046007,Mark Lanfranchi,https://photos.zillowstatic.com/h_e/ISzztyxc8n...,/profile/markl255/,1.569283e+12,Listing removed,0.0,649900.0,0.000000,0.0,1.565827e+12,Listed for sale,0.0,649900.0,0.000000,0.0,1.561680e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.65,Dryer,Microwave,Refrigerator,"Forced Air, Natural Gas",Central Air,"No Garage, Off Street, On Street","1,715 sqft",Full,2.0,1.0,1.0,,0.0,3.0,Brooklyn,,Central Air,,,0,0,0,1,0,1.0,,0,1.0,1,0.0,0,0,Forced Air,Natural Gas,,Residential,0.0,"1,920 sqft","1,715 sqft",,1.599951e+12,,,,,,,,,,,,,,,,,,,,,,,08324-0002,0,No Garage,Off Street,Bathroom,Public Sewer,2.0,,4604.0,640000.0,2019.0,0.4,Primary,Ps 272 Curtis Estabrook,3.0,471.0,11.0,1.0,Public,1.4,Middle,Is 68 Isaac Bildersee,6.0,347.0,10.0,1.0,Public,0.9,9-12,1.0,High,https://www.greatschools.org/school?id=13333&s...,High School For Innovation In Advertising And ...,3.0,253.0,13.0,1.0,Public,1960.0
61644,Brooklyn,2046 E 73rd St,11234.0,3.0,3.0,1.602528e+12,"Bright and spacious home features 3bedrooms, 3...",RECENTLY_SOLD,40.621391,1232.0,-73.907097,690000.0,Mark Gleyzerman,https://photos.zillowstatic.com/h_e/IS9t4wxnex...,/profile/MGley/,Sold,690000.0,-0.048276,Mark Gleyzerman,https://photos.zillowstatic.com/h_e/IS9t4wxnex...,/profile/MGley/,1.605830e+12,Listing removed,0.0,725000.0,0.000000,0.0,1.605139e+12,Listed for sale,0.0,725000.0,0.294643,0.0,1.602461e+12,Sold,0.0,560000.0,0.000000,0.0,1.438128e+12,,,,,,,,,,,,,,,,,,,,,,,,,0.65,Dishwasher,Dryer,Microwave,"Forced Air, Hot Water, Natural Gas",Units,"No Garage, Carport","2,000 sqft",Full,3.0,1.0,1.0,,1.0,3.0,Brooklyn,,Units,,,0,0,0,1,1,1.0,,0,1.0,0,1.0,0,0,Forced Air,Hot Water,,Residential,0.0,"1,232 sqft","2,000 sqft",,1.602528e+12,,,,,,,,,,,,,,,,,,,,,,,08413-0053,0,No Garage,Carport,Bathroom,,2.0,,5182.0,564000.0,2015.0,0.1,Primary,Ps 312 Bergen Beach,6.0,791.0,14.0,1.0,Public,0.4,Middle,Jhs 78 Roy H Mann,5.0,563.0,14.0,1.0,Public,2.7,9-12,1.0,High,https://www.greatschools.org/school?id=02004&s...,James Madison High School,4.0,3655.0,21.0,1.0,Public,1925.0
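
The `dateposted` and `priceHistory/*/time` values in the output above are around 1.6e12, which is consistent with Unix timestamps in milliseconds. The following is a minimal sketch, assuming that interpretation is correct, of converting such columns to readable datetimes. The helper name `epoch_ms_to_datetime` and the toy frame are hypothetical and not part of the original notebook.

```python
import pandas as pd

def epoch_ms_to_datetime(frame, columns):
    """Return a copy of `frame` with the given epoch-millisecond columns as datetimes."""
    out = frame.copy()
    for col in columns:
        # unit="ms" treats the numbers as milliseconds since 1970-01-01 UTC;
        # missing values become NaT
        out[col] = pd.to_datetime(out[col], unit="ms")
    return out

# Toy frame reusing two timestamp values printed above (assumed to be epoch ms)
sample = pd.DataFrame({
    "dateposted": [1.600376e12, None],
    "priceHistory/0/time": [1.600301e12, 1.600819e12],
})

converted = epoch_ms_to_datetime(sample, ["dateposted", "priceHistory/0/time"])
print(converted.dtypes)
```

Working on a copy keeps the raw numeric columns available in the original frame in case they are still needed elsewhere.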


In [59]:
# Drop columns that are not used in the rest of the analysis
df.drop(columns=['address/city', 'schools/0/type', 'schools/1/type', 'resoFactsStats/garageSpaces', 'resoFactsStats/hasAttachedGarage', 'resoFactsStats/furnished', 'resoFactsStats/hasSpa'], inplace=True)

In [60]:
df.describe()

Unnamed: 0,address/zipcode,bathrooms,bedrooms,dateposted,latitude,livingArea,longitude,price,priceHistory/0/price,priceHistory/0/priceChangeRate,priceHistory/0/time,priceHistory/1/postingIsRental,priceHistory/1/price,priceHistory/1/priceChangeRate,priceHistory/1/showCountyLink,priceHistory/1/time,priceHistory/2/postingIsRental,priceHistory/2/price,priceHistory/2/priceChangeRate,priceHistory/2/showCountyLink,priceHistory/2/time,priceHistory/3/postingIsRental,priceHistory/3/price,priceHistory/3/priceChangeRate,priceHistory/3/showCountyLink,priceHistory/3/time,priceHistory/4/postingIsRental,priceHistory/4/price,priceHistory/4/priceChangeRate,priceHistory/4/showCountyLink,priceHistory/4/time,priceHistory/5/postingIsRental,priceHistory/5/price,priceHistory/5/priceChangeRate,priceHistory/5/showCountyLink,priceHistory/5/time,priceHistory/6/postingIsRental,priceHistory/6/price,priceHistory/6/priceChangeRate,priceHistory/6/showCountyLink,priceHistory/6/time,priceHistory/7/postingIsRental,priceHistory/7/price,priceHistory/7/priceChangeRate,priceHistory/7/showCountyLink,priceHistory/7/time,propertyTaxRate,resoFactsStats/bathrooms,resoFactsStats/bathroomsFull,resoFactsStats/bathroomsHalf,resoFactsStats/bathroomsOneQuarter,resoFactsStats/bathroomsThreeQuarter,resoFactsStats/bedrooms,resoFactsStats/hasAttachedProperty,resoFactsStats/hasCarport,resoFactsStats/hasCooling,resoFactsStats/hasFireplace,resoFactsStats/hasGarage,resoFactsStats/hasHeating,resoFactsStats/hasOpenParking,resoFactsStats/hasPrivatePool,resoFactsStats/hasView,resoFactsStats/isNewConstruction,resoFactsStats/onMarketDate,resoFactsStats/parking,resoFactsStats/stories,resoFactsStats/taxAnnualAmount,resoFactsStats/taxAssessedValue,resoFactsStats/yearBuiltEffective,schools/0/distance,schools/0/rating,schools/0/size,schools/0/studentsPerTeacher,schools/0/totalCount,schools/1/distance,schools/1/rating,schools/1/size,schools/1/studentsPerTeacher,schools/1/totalCount,schools/2/distance,schools/2/isAssigned,schools/2/rating,schools/2/size,schools/2/studentsPerTeacher,schools/2/totalCount,yearBuilt
count,33843.0,33844.0,33844.0,13226.0,33843.0,33844.0,33843.0,33835.0,33825.0,33826.0,33838.0,30031.0,29975.0,29981.0,30031.0,30031.0,28315.0,28236.0,28240.0,28315.0,28315.0,24125.0,24030.0,24040.0,24125.0,24125.0,19526.0,19400.0,19417.0,19526.0,19526.0,15627.0,15466.0,15493.0,15627.0,15627.0,12452.0,12313.0,12325.0,12452.0,12452.0,9922.0,9793.0,9805.0,9922.0,9922.0,33841.0,33172.0,32312.0,32306.0,21811.0,29412.0,33844.0,33844.0,33844.0,33306.0,10692.0,33844.0,33721.0,33844.0,14605.0,33844.0,11180.0,13226.0,33844.0,22646.0,29295.0,28633.0,16889.0,33792.0,33505.0,33679.0,33529.0,33792.0,33517.0,33484.0,33477.0,32885.0,33517.0,26316.0,26316.0,26310.0,26308.0,26058.0,26316.0,31970.0
mean,10794.807553,2.602683,3.435941,1603643000000.0,40.678698,2730.136509,-73.960902,966733.9,969819.2,191.9423,1570375000000.0,0.035297,930618.5,132.7336,0.0,1533056000000.0,0.034469,915461.6,477.8968,0.0,1523948000000.0,0.033326,901464.9,288.731967,0.0,1464040000000.0,0.042661,917956.2,177.477669,0.0,1434414000000.0,0.042875,834405.4,230.7985,0.0,1415150000000.0,0.042644,818005.8,319.2425,0.0,1397651000000.0,0.04354,812766.4,247.243055,0.0,1381606000000.0,0.836489,2.610756,1.751145,0.588374,0.003072,0.187916,3.435941,0.099013,0.047571,0.502282,0.468294,0.273579,0.565375,0.069732,0.190962,0.091892,0.0678,1596825000000.0,0.525056,2.595469,18334.64,1175702.0,3143.208,0.409884,6.387733,713.287033,13.900355,1.177734,0.944407,5.766396,1047.36416,14.345659,1.252827,1.330974,1.0,3.85572,2279.162802,17.617469,1.177079,1951.923866
std,542.126751,7.438426,8.291188,9424258000.0,0.101378,15220.475839,0.149962,1958550.0,1929019.0,11808.7,29596480000.0,0.184532,1882280.0,11072.79,0.0,101027600000.0,0.182435,1858818.0,16546.74,0.0,105193800000.0,0.179491,4910203.0,8675.00847,0.0,170121900000.0,0.202097,7304644.0,6942.938632,0.0,184933900000.0,0.202581,1953747.0,10252.02,0.0,187119400000.0,0.202061,1901577.0,14655.56,0.0,188129900000.0,0.204079,1900557.0,10360.213257,0.0,188224400000.0,0.089741,7.511967,1.16344,7.523867,0.065927,0.50121,8.291188,0.298684,0.21286,0.500002,0.499017,0.445802,0.495715,0.254698,0.393073,0.288878,0.251413,16998920000.0,9.089507,2.925505,150102.8,10642930.0,155128.0,0.297713,2.083572,273.208952,2.117014,0.757481,0.73962,2.229512,654.438856,2.312655,0.850986,0.927931,0.0,1.679862,1307.940922,2.585937,0.585038,38.803584
min,148.0,0.5,1.0,1451194000000.0,40.498634,1.0,-74.253983,1.0,0.0,-1.0,1200010000000.0,0.0,1.0,-0.9999996,0.0,300326400000.0,0.0,10.0,-0.9999983,0.0,165196800000.0,0.0,0.0,-0.999999,0.0,55036800000.0,0.0,0.0,-0.999997,0.0,27561600000.0,0.0,0.0,-1.0,0.0,-86400000.0,0.0,1.0,-0.999999,0.0,173923200000.0,0.0,0.0,-1.0,0.0,154569600000.0,0.65,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1451174000000.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0,115.0,6.0,1.0,0.0,1.0,85.0,7.0,1.0,0.0,1.0,1.0,101.0,7.0,1.0,1.0
25%,10309.0,2.0,3.0,1599951000000.0,40.595226,1200.0,-74.10704,475000.0,480000.0,-0.05015555,1544400000000.0,0.0,459000.0,0.0,0.0,1525867000000.0,0.0,459000.0,-0.02197802,0.0,1520208000000.0,0.0,389000.0,-0.022688,0.0,1444003000000.0,0.0,349900.0,-0.02002,0.0,1380866000000.0,0.0,329000.0,-0.01716247,0.0,1345680000000.0,0.0,314900.0,-0.01669449,0.0,1324080000000.0,0.0,299900.0,-0.014631,0.0,1307059000000.0,0.84,2.0,1.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1598918000000.0,0.0,2.0,4185.0,464000.0,1925.0,0.2,5.0,540.0,13.0,1.0,0.4,4.0,573.0,13.0,1.0,0.6,1.0,3.0,1088.0,16.0,1.0,1925.0
50%,10469.0,2.0,3.0,1605120000000.0,40.663528,1560.0,-73.941345,630000.0,635000.0,-0.01655456,1571875000000.0,0.0,618000.0,0.0,0.0,1555459000000.0,0.0,615000.0,0.0,0.0,1548374000000.0,0.0,569000.0,0.0,0.0,1529885000000.0,0.0,540000.0,0.0,0.0,1515110000000.0,0.0,500000.0,0.0,0.0,1496880000000.0,0.0,479900.0,0.0,0.0,1464739000000.0,0.0,456663.0,0.0,0.0,1433765000000.0,0.87,2.0,2.0,0.0,0.0,0.0,3.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1600042000000.0,0.0,2.0,5455.0,586000.0,1945.0,0.4,7.0,688.0,14.0,1.0,0.7,6.0,993.0,14.0,1.0,1.1,1.0,4.0,2590.0,18.0,1.0,1950.0
75%,11362.0,3.0,4.0,1609979000000.0,40.747591,2162.25,-73.835815,885500.0,891000.0,0.0,1600301000000.0,0.0,879444.0,0.0,0.0,1583453000000.0,0.0,875000.0,0.01540832,0.0,1574640000000.0,0.0,838000.0,0.0,0.0,1562717000000.0,0.0,799000.0,0.0,0.0,1553818000000.0,0.0,779000.0,0.0,0.0,1542326000000.0,0.0,750000.0,0.005221932,0.0,1533341000000.0,0.0,725000.0,0.0,0.0,1525046000000.0,0.87,3.0,2.0,1.0,0.0,0.0,4.0,0.0,0.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,1607040000000.0,0.0,3.0,7258.0,842943.0,1975.0,0.5,8.0,867.0,15.0,1.0,1.3,8.0,1368.0,16.0,1.0,1.9,1.0,5.0,3352.0,19.0,1.0,1980.0
max,29512.0,1346.0,1502.0,1611281000000.0,40.911961,986641.0,-73.700432,90000000.0,90000000.0,1749999.0,1611274000000.0,1.0,88000000.0,1554999.0,0.0,1611014000000.0,1.0,98000000.0,1447999.0,0.0,1610842000000.0,1.0,699000000.0,524999.0,0.0,1610842000000.0,1.0,690900000.0,677499.0,0.0,1610323000000.0,1.0,73123490.0,1008332.0,0.0,1608509000000.0,1.0,54000000.0,1105555.0,0.0,1607990000000.0,1.0,54000000.0,629999.0,0.0,1607645000000.0,2.21,1346.0,40.0,1344.0,4.0,6.0,1502.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1611274000000.0,1422.0,112.0,7123810.0,1680558000.0,20162020.0,6.7,10.0,2000.0,20.0,17.0,6.6,10.0,5839.0,24.0,12.0,6.6,1.0,10.0,5839.0,24.0,5.0,2200.0
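
The summary above contains values that are hard to believe for residential listings, such as a maximum of 1502 bedrooms, 1346 bathrooms, a `yearBuilt` range of 1 to 2200, and a minimum `price` of 1. As a minimal sketch of how such rows could be surfaced for review, the snippet below flags values outside a set of plausibility bounds. The bounds and the helper name `flag_implausible` are assumptions chosen for illustration, not thresholds taken from the notebook.

```python
import pandas as pd

# Hypothetical plausibility bounds (inclusive); tune these to the dataset at hand
PLAUSIBLE_BOUNDS = {
    "bedrooms": (1, 20),
    "bathrooms": (0.5, 20),
    "yearBuilt": (1700, 2021),
    "price": (10_000, 100_000_000),
}

def flag_implausible(frame, bounds):
    """Return a boolean DataFrame that is True where a value lies outside its bounds."""
    flags = pd.DataFrame(index=frame.index)
    for col, (low, high) in bounds.items():
        if col in frame.columns:
            # between() is inclusive; missing values are left unflagged here
            flags[col] = frame[col].notna() & ~frame[col].between(low, high)
    return flags

# Toy rows echoing the medians and extremes reported by describe()
sample = pd.DataFrame({
    "bedrooms": [3.0, 1502.0, 4.0],
    "bathrooms": [2.0, 1346.0, 3.0],
    "yearBuilt": [1950.0, 1.0, 2200.0],
    "price": [630000.0, 1.0, 885500.0],
})

flags = flag_implausible(sample, PLAUSIBLE_BOUNDS)
suspect_rows = sample[flags.any(axis=1)]  # rows with at least one flagged value
print(flags.sum())
```

Flagging rather than dropping keeps the decision about each suspect row (correct it, cap it, or remove it) separate from the detection step.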


In [61]:
# Renumber the rows from 0 and discard the old index
df.reset_index(inplace=True, drop=True)

In [62]:
df

Unnamed: 0,address/streetAddress,address/zipcode,bathrooms,bedrooms,dateposted,description,homeStatus,latitude,livingArea,longitude,price,priceHistory/0/buyerAgent/name,priceHistory/0/buyerAgent/photo/url,priceHistory/0/buyerAgent/profileUrl,priceHistory/0/event,priceHistory/0/price,priceHistory/0/priceChangeRate,priceHistory/0/sellerAgent/name,priceHistory/0/sellerAgent/photo/url,priceHistory/0/sellerAgent/profileUrl,priceHistory/0/time,priceHistory/1/event,priceHistory/1/postingIsRental,priceHistory/1/price,priceHistory/1/priceChangeRate,priceHistory/1/showCountyLink,priceHistory/1/time,priceHistory/2/event,priceHistory/2/postingIsRental,priceHistory/2/price,priceHistory/2/priceChangeRate,priceHistory/2/showCountyLink,priceHistory/2/time,priceHistory/3/event,priceHistory/3/postingIsRental,priceHistory/3/price,priceHistory/3/priceChangeRate,priceHistory/3/showCountyLink,priceHistory/3/time,priceHistory/4/event,priceHistory/4/postingIsRental,priceHistory/4/price,priceHistory/4/priceChangeRate,priceHistory/4/showCountyLink,priceHistory/4/time,priceHistory/5/event,priceHistory/5/postingIsRental,priceHistory/5/price,priceHistory/5/priceChangeRate,priceHistory/5/showCountyLink,priceHistory/5/time,priceHistory/6/event,priceHistory/6/postingIsRental,priceHistory/6/price,priceHistory/6/priceChangeRate,priceHistory/6/showCountyLink,priceHistory/6/time,priceHistory/7/event,priceHistory/7/postingIsRental,priceHistory/7/price,priceHistory/7/priceChangeRate,priceHistory/7/showCountyLink,priceHistory/7/time,propertyTaxRate,resoFactsStats/appliances/0,resoFactsStats/appliances/1,resoFactsStats/appliances/2,resoFactsStats/atAGlanceFacts/2/factValue,resoFactsStats/atAGlanceFacts/3/factValue,resoFactsStats/atAGlanceFacts/4/factValue,resoFactsStats/atAGlanceFacts/5/factValue,resoFactsStats/basement,resoFactsStats/bathrooms,resoFactsStats/bathroomsFull,resoFactsStats/bathroomsHalf,resoFactsStats/bathroomsOneQuarter,resoFactsStats/bathroomsThreeQuarter,resoFactsStats/bedrooms,resoFactsStats/cityRegion,resoFactsStats/constructionMaterials/0,resoFactsStats/cooling/0,resoFactsStats/elementarySchoolDistrict,resoFactsStats/flooring/0,resoFactsStats/hasAttachedProperty,resoFactsStats/hasCarport,resoFactsStats/hasCooling,resoFactsStats/hasFireplace,resoFactsStats/hasGarage,resoFactsStats/hasHeating,resoFactsStats/hasOpenParking,resoFactsStats/hasPrivatePool,resoFactsStats/hasView,resoFactsStats/heating/0,resoFactsStats/heating/1,resoFactsStats/highSchoolDistrict,resoFactsStats/homeType,resoFactsStats/isNewConstruction,resoFactsStats/livingArea,resoFactsStats/lotSize,resoFactsStats/middleOrJuniorSchoolDistrict,resoFactsStats/onMarketDate,resoFactsStats/otherFacts/0/name,resoFactsStats/otherFacts/0/value,resoFactsStats/otherFacts/1/name,resoFactsStats/otherFacts/1/value,resoFactsStats/otherFacts/2/name,resoFactsStats/otherFacts/2/value,resoFactsStats/otherFacts/3/name,resoFactsStats/otherFacts/3/value,resoFactsStats/otherFacts/4/name,resoFactsStats/otherFacts/4/value,resoFactsStats/otherFacts/5/name,resoFactsStats/otherFacts/5/value,resoFactsStats/otherFacts/6/name,resoFactsStats/otherFacts/6/value,resoFactsStats/otherFacts/7/name,resoFactsStats/otherFacts/7/value,resoFactsStats/otherFacts/8/name,resoFactsStats/otherFacts/8/value,resoFactsStats/otherFacts/9/name,resoFactsStats/otherFacts/9/value,resoFactsStats/otherFacts/10/value,resoFactsStats/otherFacts/11/value,resoFactsStats/parcelNumber,resoFactsStats/parking,resoFactsStats/parkingFeatures/0,resoFactsStats/parkingFeatures/1,resoFactsStats/rooms/0/roomType,resoFactsStats/sewer/0,resoFactsStats/stories,resoFactsStats/structureType,resoFactsStats/taxAnnualAmount,resoFactsStats/taxAssessedValue,resoFactsStats/yearBuiltEffective,schools/0/distance,schools/0/level,schools/0/name,schools/0/rating,schools/0/size,schools/0/studentsPerTeacher,schools/0/totalCount,schools/1/distance,schools/1/level,schools/1/name,schools/1/rating,schools/1/size,schools/1/studentsPerTeacher,schools/1/totalCount,schools/2/distance,schools/2/grades,schools/2/isAssigned,schools/2/level,schools/2/link,schools/2/name,schools/2/rating,schools/2/size,schools/2/studentsPerTeacher,schools/2/totalCount,schools/2/type,yearBuilt
0,60 Terrace View Ave,10463.0,2.0,5.0,1.610134e+12,"Discover Marble Hill, a neighborhood rich with...",FOR_SALE,40.877743,1889.0,-73.910866,799999.0,,,,Listed for sale,799999.0,0.335558,,,,1.610064e+12,Listing removed,0.0,599000.0,0.000000,0.0,1.459469e+12,Listed for sale,0.0,599000.0,0.711429,0.0,1.426810e+12,Listing removed,0.0,350000.0,0.000000,0.0,1.293062e+12,Price change,0.0,350000.0,-0.066667,0.0,1.276128e+12,Price change,0.0,375000.0,0.071429,0.0,1.275610e+12,Listed for sale,0.0,350000.0,0.000000,0.0,1.265328e+12,,,,,,,0.88,,,,"Natural Gas, Hot Water",,Driveway,,Finished,2.0,1.0,1.0,,,5.0,New York,Frame,,Bronx 10,,0,0,1.0,,0,1.0,0,,0,Natural Gas,Hot Water,Bronx 10,Residential,,"1,889 sqft",,Bronx 10,1.610064e+12,,,,,,,,,,,,,,,,,,,,,,,NO TAX ID FOUND,0,Driveway,,,Public Sewer,,,5096.0,711000.0,,0.1,Elementary,Ps 37 Multiple Intelligence School,4.0,647.0,14.0,1.0,0.1,Middle,In Tech Academy Aka Ms High School 368,3.0,993.0,14.0,1.0,,,,,,,,,,,,1920.0
1,625 W 246th St,10471.0,8.0,8.0,1.595968e+12,EXCLUSIVE BRAND NEW\nLavish Newly Built 8-Bd. ...,FOR_SALE,40.892689,7000.0,-73.910667,3995000.0,,,,Price change,3995000.0,-0.111235,,,,1.607299e+12,Price change,0.0,4495000.0,-0.080401,0.0,1.601510e+12,Listed for sale,0.0,4888000.0,0.087430,0.0,1.595894e+12,Listing removed,0.0,4495000.0,0.000000,0.0,1.584144e+12,Listed for sale,0.0,4495000.0,3.610256,0.0,1.572566e+12,Sold,0.0,975000.0,-0.025000,0.0,1.450397e+12,Listing removed,0.0,1000000.0,0.000000,0.0,1.447200e+12,Pending sale,0.0,1000000.0,0.000000,0.0,1.439856e+12,0.95,Dishwasher,Dryer,Washer,,Central,"Garage, Garage - Attached",0.29 Acres,,8.0,7.0,1.0,0.0,0.0,8.0,Bronx,,Central,,Hardwood,0,0,1.0,1.0,0,0.0,0,,0,,,,Single Family,0.0,"7,000 sqft",0.29 Acres,,1.595894e+12,,Clubhouse,,Granite countertop,,Playground,,Stainless steel appliances,,,,,,,,,,,,,,,059130860,0,Garage,Garage - Attached,,,1.0,Other,13941.0,1937000.0,1940.0,0.4,Elementary,Ps 24 Spuyten Duyvil,10.0,907.0,16.0,1.0,0.3,Middle,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,,,,,,,,,,,,1940.0
2,716 W 231st St,10463.0,3.0,4.0,1.592668e+12,This 4233 square foot single family home has 4...,FOR_SALE,40.883419,4233.0,-73.918106,1495000.0,,,,Price change,1495000.0,-0.002668,,,,1.611101e+12,Listed for sale,0.0,1499000.0,0.000000,0.0,1.592611e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.95,Dishwasher,Dryer,Washer,,,"Garage, Garage - Attached",0.42 Acres,,3.0,3.0,0.0,0.0,0.0,4.0,Bronx,,,,,0,0,0.0,,0,0.0,0,,0,,,,Single Family,0.0,"4,233 sqft",0.42 Acres,,1.592611e+12,,,,,,,,,,,,,,,,,,,,,,,057500494,0,Garage,Garage - Attached,,,2.0,,12253.0,2341000.0,1920.0,0.3,Elementary,Ps 24 Spuyten Duyvil,10.0,907.0,16.0,1.0,0.4,Middle,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,,,,,,,,,,,,1920.0
3,750 W 232nd St,10463.0,6.0,5.0,1.600814e+12,EXCLUSIVE NEW TO MARKET\nPrime Renovation Oppo...,FOR_SALE,40.885033,7000.0,-73.917793,3450000.0,,,,Price change,3450000.0,-0.092105,,,,1.608163e+12,Listed for sale,0.0,3800000.0,0.225806,0.0,1.600733e+12,Sold,0.0,3100000.0,-0.156463,0.0,1.551917e+12,Listing removed,0.0,3675000.0,0.000000,0.0,1.550707e+12,Pending sale,0.0,3675000.0,0.000000,0.0,1.510272e+12,Listed for sale,0.0,3675000.0,-0.125000,0.0,1.506298e+12,Sold,0.0,4200000.0,0.000000,0.0,9.650880e+11,,,,,,,0.95,,,,,Central,"Garage, Garage - Attached",0.26 Acres,,6.0,6.0,0.0,0.0,0.0,5.0,Bronx,,Central,,,0,0,1.0,1.0,0,0.0,0,,0,,,,Single Family,0.0,"7,000 sqft",0.26 Acres,,1.600733e+12,,,,,,,,,,,,,,,,,,,,,,,057510300,0,Garage,Garage - Attached,,,2.0,,19472.0,3011000.0,1950.0,0.2,Elementary,Ps 24 Spuyten Duyvil,10.0,907.0,16.0,1.0,0.3,Middle,Riverdale Kingsbridge Academy (Ms High School ...,5.0,1516.0,15.0,1.0,,,,,,,,,,,,1950.0
4,24 Cooper St #5CD,10034.0,2.0,3.0,1.611091e+12,"Due to Coronavirus 19, outbreak, ALL showings ...",FOR_SALE,40.867687,994.0,-73.924606,230000.0,,,,Listed for sale,230000.0,0.000000,,,,1.611014e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.88,,,,,,0 spaces,"$1,472/mo",,2.0,2.0,0.0,0.0,0.0,3.0,New York,,,,,0,0,0.0,,0,0.0,0,,0,,,,Condo,0.0,994 sqft,,,1.611014e+12,,,,,,,,,,,,,,,,,,,,,,,,0,,,,,,,,,,0.8,Elementary,Ps 18 Park Terrace,5.0,349.0,13.0,3.0,1.9,Middle,Ms 319 Marie Teresa,7.0,421.0,12.0,3.0,0.1,9-12,1.0,High,https://www.greatschools.org/school?id=18169&s...,INWOOD EARLY COLLEGE FOR HEALTH AND INFORMATIO...,3.0,371.0,,1.0,Public,1925.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
33839,93-19 71st Ave,11375.0,2.0,4.0,,This house is a rare combination of superb loc...,SOLD,40.712009,2200.0,-73.850281,1255000.0,,,,Sold,1255000.0,0.004804,David Yakubov,https://photos.zillowstatic.com/h_e/ISd8b63jno...,/profile/user3094820/,1.530230e+12,Pending sale,0.0,1249000.0,0.000000,0.0,1.523837e+12,Listed for sale,0.0,1249000.0,1.401923,0.0,1.519603e+12,Sold,0.0,520000.0,0.000000,0.0,1.512086e+12,,,,,,,,,,,,,,,,,,,,,,,,,0.84,Dishwasher,Dryer,Washer,,,"Garage, Garage - Attached","2,500 sqft",,2.0,1.0,1.0,0.0,0.0,4.0,Forest Hills,,,Community District 28,,0,0,0.0,,0,0.0,0,,0,,,Community District 28,Single Family,,"2,200 sqft","2,500 sqft",Community District 28,,Den/Family Room,Y,Detached/Attached,Det,Driveway,Pvt,Eat In Kitchen,Y,Picture,Y,Water,Public,Attic,Y,Heat,Steam,Sewer,Y,Wood Floors,Y,Gas,Y,032220033,0,Garage,Garage - Attached,DiningRoom,Y,,Loft,7129.0,1034000.0,1930.0,0.2,Primary,Ps 144 Col Jeromus Remsen,8.0,893.0,16.0,1.0,0.7,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,0.4,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1930.0
33840,6829 Manse St,11375.0,2.0,3.0,,Wonderful 1 Family Home. First Floor Features ...,SOLD,40.714203,2417.0,-73.855263,825000.0,,,,Sold,825000.0,-0.049539,Annie/Steve Your Home Sold Guaranteed,https://photos.zillowstatic.com/h_e/ISqh2r0uwq...,/profile/Agardi-Team/,1.532995e+12,Listing removed,0.0,868000.0,0.000000,0.0,1.519690e+12,Listed for sale,0.0,868000.0,0.000000,0.0,1.518566e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,,,,Other,,"Garage, Garage - Attached","2,417 sqft",,2.0,0.0,0.0,0.0,0.0,3.0,Flushing,,,,,0,0,0.0,,0,1.0,0,,0,Other,,,Single Family,,"2,417 sqft","2,417 sqft",,,,,,,,,,,,,,,,,,,,,,,,,031950052,0,Garage,Garage - Attached,DiningRoom,,2.0,,6447.0,907000.0,1920.0,0.2,Primary,Ps 144 Col Jeromus Remsen,8.0,893.0,16.0,1.0,0.6,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,0.4,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1920.0
33841,82 Greenway Ter,11375.0,6.0,6.0,,"""DISTINQUISHED FIELDSTONE TOWNHOUSE TREASURE""\...",SOLD,40.717163,6085.0,-73.843124,2704000.0,GiGi Malek,https://photos.zillowstatic.com/h_e/ISvww3cyma...,/profile/ForestHillsGiGi/,Sold,2704000.0,0.040400,Linda Weiss,https://photos.zillowstatic.com/h_e/ISbxza59gh...,/profile/lindaweiss11/,1.561939e+12,Listing removed,0.0,2599000.0,0.000000,0.0,1.560989e+12,Pending sale,0.0,2599000.0,0.000000,0.0,1.556410e+12,Listed for sale,0.0,2599000.0,0.000000,0.0,1.554336e+12,Pending sale,0.0,2599000.0,0.000000,0.0,1.553558e+12,Listing removed,1.0,7000.0,0.000000,0.0,1.553126e+12,Price change,1.0,7000.0,-0.176471,0.0,1.552349e+12,Listed for sale,0.0,2599000.0,1.652041,0.0,1.551139e+12,0.84,,,,,,"Garage, Garage - Attached","3,255 sqft",,6.0,5.0,1.0,0.0,0.0,6.0,Forest Hills Gardens,,,,,0,0,0.0,0.0,0,0.0,0,0.0,0,,,,Townhouse,,"6,085 sqft","3,255 sqft",,,,Fios Available,,Parking,Parking Type,Garage,,,,,,,,,,,,,,,,,032740007,0,Garage,Garage - Attached,,,2.0,,18430.0,2513000.0,1925.0,0.1,Primary,Ps 101 School In The Gardens,9.0,654.0,16.0,1.0,0.7,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,1.0,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1925.0
33842,86 Greenway Ter,11375.0,5.0,6.0,,EXCLUSIVE LISTING OF TERRACE SOTHEBY'S INTERNA...,SOLD,40.717052,4564.0,-73.843025,2750000.0,Terrace Sotheby's International Realty,https://photos.zillowstatic.com/h_e/ISbpd8y5jp...,/profile/terracesir/,Sold,2750000.0,-0.075630,Sheldon Stivelman,https://photos.zillowstatic.com/h_e/ISf403agde...,/profile/SheldonStivelman/,1.532390e+12,Pending sale,0.0,2975000.0,0.000000,0.0,1.523318e+12,Listed for sale,0.0,2975000.0,0.000000,0.0,1.521677e+12,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.84,,,,,,0 spaces,"6,603 sqft",,5.0,4.0,1.0,0.0,0.0,6.0,Forest Hills Gardens,,,,,0,0,0.0,0.0,0,0.0,0,0.0,0,,,,Townhouse,,"4,564 sqft","6,603 sqft",,,Features,"Special Program/QC Approved Listing, Garage Co...",,,,,,,,,,,,,,,,,,,,,032740004,0,,,,,2.0,,24649.0,2893000.0,1925.0,0.1,Primary,Ps 101 School In The Gardens,9.0,654.0,16.0,1.0,0.7,Middle,Jhs 190 Russell Sage,7.0,1054.0,15.0,1.0,1.0,9-12,1.0,High,https://www.greatschools.org/school?id=13512&s...,Queens Metropolitan High School,6.0,1088.0,15.0,1.0,Public,1925.0


In [None]:
# Save the current DataFrame to disk; pickle keeps dtypes and NaN values intact
df.to_pickle("listings_cleaned_na.pkl")
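
A quick way to confirm that a pickled frame reloads unchanged is a round-trip check. The snippet below is a minimal sketch rather than part of the original notebook: the `sample` frame and its column names (`address`, `price`, `living_area`, `tax_assessed_value`) are hypothetical stand-ins, populated with a few values from the rows displayed above, and the file is written to a temporary directory so nothing on disk is overwritten.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Hypothetical stand-in for the cleaned listings frame; column names are
# assumptions, values are taken from rows shown in the output above.
sample = pd.DataFrame(
    {
        "address": ["60 Terrace View Ave", "625 W 246th St", "24 Cooper St #5CD"],
        "price": [799999.0, 3995000.0, 230000.0],
        "living_area": [1889.0, 7000.0, 994.0],
        "tax_assessed_value": [711000.0, 1937000.0, np.nan],  # missing for the condo
    }
)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "listings_cleaned_na.pkl"
    sample.to_pickle(path)            # same call the notebook uses
    restored = pd.read_pickle(path)   # reload from disk

# DataFrame.equals treats NaNs in matching positions as equal
round_trip_ok = restored.equals(sample)
dtypes_ok = bool((restored.dtypes == sample.dtypes).all())
print(round_trip_ok, dtypes_ok)
```

Unlike a CSV export, the pickle preserves column dtypes and missing values without re-parsing. Pickle files can execute arbitrary code when loaded, so `pd.read_pickle` should only be used on files from a trusted source.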