### Introduction: Housing Stock Analysis

This notebook walks through a simple housing stock analysis. It addresses two questions:
- Which markets have the highest/lowest percentage of single family homes?
- Which markets have seen the greatest increase/decrease in single family homes as a percentage of all housing stock over the last 5 years?

To follow along with this analysis, you will need a [Parcl Labs API Key](https://dashboard.parcllabs.com/signup).

In [None]:
import os

import pandas as pd

from parcllabs import ParclLabsClient
from parcllabs.search.top_markets import get_top_n_metros

In [3]:
api_key = os.getenv('PARCL_LABS_API_KEY')

client = ParclLabsClient(api_key=api_key)

In [4]:
top_markets = get_top_n_metros(n=50)
top_markets.iloc[0]
# Official MSA Name is derived from Census; parcl_id is what's important here.
# The parcl_id is your key to everything about New York metro housing.

Common Name                             New York
Official MSA Name    New York-Newark-Jersey City
parcl_id                                 2900187
Name: 0, dtype: object

In [5]:
# let's set aside the NY MSA parcl_id for analysis
ny_msa_parcl_id = top_markets.iloc[0]['parcl_id']
ny_msa_parcl_id

2900187

In [9]:
# let's set aside all top markets as well
top_market_ids = top_markets['parcl_id'].tolist()
top_markets.head()

Unnamed: 0,Common Name,Official MSA Name,parcl_id
0,New York,New York-Newark-Jersey City,2900187
1,Los Angeles,Los Angeles-Long Beach-Anaheim,2900078
2,Chicago,Chicago-Naperville-Elgin,2899845
3,Dallas,Dallas-Fort Worth-Arlington,2899734
4,Houston,Houston-Pasadena-The Woodlands,2899967


#### Retrieve Housing Stock for a Single Market

In [8]:
# Let's start with the basics: the breakdown of housing stock in the New York metro.
# Housing stock is the mix of condos, single family homes, and townhomes in a market. This mix changes all the time: urban
# areas get built out, creating a denser concentration of units, and Covid caused a suburban shock that increased the velocity of
# suburban home development. Assuming a fixed denominator in housing is a mistake.

housing_stock_ny_msa = client.market_metrics_housing_stock.retrieve(
    parcl_id=ny_msa_parcl_id,
    params={
        'limit': 1 # let's get the most recent stock
    },
    as_dataframe=True # make life easy on ourselves
)

housing_stock_ny_msa

Unnamed: 0,date,single_family,condo,townhouse,other,all_properties,parcl_id
0,2024-03-01,2802362,957017,76608,1583688,5419675,2900187
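
As a quick sanity check on the row above, the four property-type counts should add up to `all_properties`, and dividing each count by that total gives the market mix. The sketch below redoes that arithmetic in plain Python with the New York figures from the output. The `ny_stock` and `shares` names are illustrative only and are not part of the Parcl Labs client.

```python
# Counts copied from the New York row above (2024-03-01).
ny_stock = {
    "single_family": 2802362,
    "condo": 957017,
    "townhouse": 76608,
    "other": 1583688,
}
all_properties = 5419675

# The four categories should account for every unit in the market.
assert sum(ny_stock.values()) == all_properties

# Share of total stock held by each property type.
shares = {kind: count / all_properties for kind, count in ny_stock.items()}
for kind, share in shares.items():
    print(f"{kind:>13}: {share:.1%}")
```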


#### Retrieve Housing Stock for Many Markets

In [10]:
# as of March 2024, there are 5.4 million units within the NY metro, 2.8 million of which are single family homes
# and roughly a million of which are condos.

# let's see how this mix compares to other metros on a proportional basis. 
housing_stock = client.market_metrics_housing_stock.retrieve_many(
    parcl_ids=top_market_ids,
    params={
        'limit': 1 # let's get most recent again
    },
    as_dataframe=True
)

housing_stock.head()

Unnamed: 0,date,single_family,condo,townhouse,other,all_properties,parcl_id
0,2024-03-01,2802362,957017,76608,1583688,5419675,2900187
0,2024-03-01,1997656,858838,19719,555526,3431739,2900078
0,2024-03-01,2017899,768604,123514,588914,3498931,2899845
0,2024-03-01,1921281,457709,41641,373657,2794288,2899734
0,2024-03-01,1765489,383733,33520,355410,2538152,2899967


In [13]:
# add names back
housing_stock = pd.merge(housing_stock, top_markets, on='parcl_id')
housing_stock.head()

Unnamed: 0,date,single_family,condo,townhouse,other,all_properties,parcl_id,Common Name,Official MSA Name
0,2024-03-01,2802362,957017,76608,1583688,5419675,2900187,New York,New York-Newark-Jersey City
1,2024-03-01,1997656,858838,19719,555526,3431739,2900078,Los Angeles,Los Angeles-Long Beach-Anaheim
2,2024-03-01,2017899,768604,123514,588914,3498931,2899845,Chicago,Chicago-Naperville-Elgin
3,2024-03-01,1921281,457709,41641,373657,2794288,2899734,Dallas,Dallas-Fort Worth-Arlington
4,2024-03-01,1765489,383733,33520,355410,2538152,2899967,Houston,Houston-Pasadena-The Woodlands
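
The merge above relies on `parcl_id` being unique in both frames. A minimal sketch of how that assumption could be checked, using the `validate` argument of `pd.merge` on two rows copied from the outputs above. The small `stock` and `names` frames are stand-ins for illustration, not API responses.

```python
import pandas as pd

stock = pd.DataFrame({
    "parcl_id": [2900187, 2900078],
    "single_family": [2802362, 1997656],
    "all_properties": [5419675, 3431739],
})
names = pd.DataFrame({
    "Common Name": ["New York", "Los Angeles"],
    "parcl_id": [2900187, 2900078],
})

# validate="one_to_one" raises pandas.errors.MergeError if parcl_id is
# duplicated on either side, instead of silently multiplying rows.
merged = pd.merge(stock, names, on="parcl_id", validate="one_to_one")
print(merged)
```

For a frame with many rows per market, such as a multi-date history, `validate="many_to_one"` would be the analogous check.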


In [14]:
# let's focus on mix of single family homes, condos, and townhouses
housing_stock['pct_single_family'] = housing_stock['single_family']/housing_stock['all_properties']
housing_stock['pct_condo'] = housing_stock['condo']/housing_stock['all_properties']
housing_stock['pct_townhouse'] = housing_stock['townhouse']/housing_stock['all_properties']

In [15]:
# which market has the highest percentage of single family homes?
housing_stock.sort_values('pct_single_family', ascending=False).head(5)
# Oklahoma City, Sacramento, Fresno, Richmond, and Indianapolis all have over 75% of the mix allocated towards
# single family homes

Unnamed: 0,date,single_family,condo,townhouse,other,all_properties,parcl_id,Common Name,Official MSA Name,pct_single_family,pct_condo,pct_townhouse
41,2024-03-01,442187,32060,4333,78573,557153,2900205,Oklahoma City,Oklahoma City,0.793655,0.057543,0.007777
27,2024-03-01,630344,81408,3133,101917,816802,2900315,Sacramento,Sacramento-Roseville-Folsom,0.771722,0.099667,0.003836
47,2024-03-01,211759,20236,344,42830,275169,2899715,Fresno,Fresno,0.76956,0.07354,0.00125
44,2024-03-01,397404,49416,18702,54384,519906,2900292,Richmond,Richmond,0.764377,0.095048,0.035972
33,2024-03-01,611490,68375,22103,103202,805170,2899979,Indianapolis,Indianapolis-Carmel-Greenwood,0.759455,0.08492,0.027451


In [16]:
# which markets have the smallest percentage of single family homes? 
housing_stock.sort_values('pct_single_family').head(5)
# Miami, Boston, Washington DC, and Baltimore are all under 50% single family homes; New York is just above at about 52%.

# Why is this important? Indices like the Case-Shiller Index only track single family homes, so they leave out a lot of the activity in these markets.

Unnamed: 0,date,single_family,condo,townhouse,other,all_properties,parcl_id,Common Name,Official MSA Name,pct_single_family,pct_condo,pct_townhouse
8,2024-03-01,939341,1121073,226524,284459,2571397,2900128,Miami,Miami-Fort Lauderdale-West Palm Beach,0.365304,0.435978,0.088094
10,2024-03-01,897671,586304,22644,346885,1853504,2899625,Boston,Boston-Cambridge-Newton,0.48431,0.316322,0.012217
5,2024-03-01,1136154,577772,398207,198000,2310133,2900475,Washington,Washington-Arlington-Alexandria,0.491813,0.250103,0.172374
19,2024-03-01,523636,145202,257462,135495,1061795,2887292,Baltimore,Baltimore-Columbia-Towson,0.493161,0.136751,0.242478
0,2024-03-01,2802362,957017,76608,1583688,5419675,2900187,New York,New York-Newark-Jersey City,0.517072,0.176582,0.014135
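
To make the coverage point concrete, the sketch below computes the share of units that are not single family homes in the five markets above, which is the portion of the stock a single-family-only measure would not see. The values are the `pct_single_family` figures from the output; the `excluded` name is illustrative.

```python
# pct_single_family values copied from the output above.
pct_single_family = {
    "Miami": 0.365304,
    "Boston": 0.484310,
    "Washington": 0.491813,
    "Baltimore": 0.493161,
    "New York": 0.517072,
}

# Everything that is not a single family home: condos, townhouses, and other.
excluded = {market: 1 - pct for market, pct in pct_single_family.items()}

for market, share in excluded.items():
    print(f"{market:<10} {share:.1%} of units are not single family homes")
```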


#### Retrieve Housing Stock for Many Markets Over Time

In [25]:
# now let's see how this has changed over the last 5 years, by market.
# let's find the markets with the greatest increase in the share of single family homes over the last
# 5 years and the markets with the greatest decline in that share

start_date = '2019-01-01'
end_date = '2024-04-01'
housing_stock_hist = client.market_metrics_housing_stock.retrieve_many(
    parcl_ids=top_market_ids,
    start_date=start_date,
    end_date=end_date,
    params={
        'limit': 200 # let's expand the limit to collect all observations in one call
    },
    as_dataframe=True
)

housing_stock_hist.head()
# add names
housing_stock_hist = pd.merge(housing_stock_hist, top_markets, on='parcl_id')

In [26]:
# recalc percentages
housing_stock_hist['pct_single_family'] = housing_stock_hist['single_family']/housing_stock_hist['all_properties']
housing_stock_hist['pct_condo'] = housing_stock_hist['condo']/housing_stock_hist['all_properties']
housing_stock_hist['pct_townhouse'] = housing_stock_hist['townhouse']/housing_stock_hist['all_properties']

In [27]:
# get the first value at 2019-01-01
hs_first = housing_stock_hist.loc[housing_stock_hist['date'] == start_date][['parcl_id', 'pct_single_family', 'pct_condo', 'pct_townhouse']]
hs_first = hs_first.rename(
    columns={
    'pct_single_family': 'pct_single_family_start',
    'pct_condo': 'pct_condo_start',
    'pct_townhouse': 'pct_townhouse_start'
    }
)

In [28]:
# join with full history
housing_stock_hist_v2 = pd.merge(housing_stock_hist, hs_first, on='parcl_id')
housing_stock_hist_v2.head()

Unnamed: 0,date,single_family,condo,townhouse,other,all_properties,parcl_id,Common Name,Official MSA Name,pct_single_family,pct_condo,pct_townhouse,pct_single_family_start,pct_condo_start,pct_townhouse_start
0,2024-03-01,2802362,957017,76608,1583688,5419675,2900187,New York,New York-Newark-Jersey City,0.517072,0.176582,0.014135,0.518624,0.174174,0.014015
1,2024-02-01,2802343,956595,76556,1583503,5418997,2900187,New York,New York-Newark-Jersey City,0.517133,0.176526,0.014127,0.518624,0.174174,0.014015
2,2024-01-01,2802321,956061,76525,1583326,5418233,2900187,New York,New York-Newark-Jersey City,0.517202,0.176453,0.014124,0.518624,0.174174,0.014015
3,2023-12-01,2802282,955583,76498,1583164,5417527,2900187,New York,New York-Newark-Jersey City,0.517262,0.176387,0.01412,0.518624,0.174174,0.014015
4,2023-11-01,2802256,955308,76461,1583070,5417095,2900187,New York,New York-Newark-Jersey City,0.517299,0.176351,0.014115,0.518624,0.174174,0.014015
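
The rows above also allow a rough look at what was added to the New York stock over a short window. The sketch below differences the 2023-11-01 and 2024-03-01 counts from the output and computes each property type's share of the net change. Treating net change in stock as a proxy for new construction is an assumption: the difference would also pick up demolitions, conversions, or data revisions.

```python
# Counts copied from the New York rows above.
nov_2023 = {"single_family": 2802256, "condo": 955308, "townhouse": 76461, "other": 1583070}
mar_2024 = {"single_family": 2802362, "condo": 957017, "townhouse": 76608, "other": 1583688}

# Net units added per property type between the two dates.
net_added = {kind: mar_2024[kind] - nov_2023[kind] for kind in mar_2024}
total_added = sum(net_added.values())

# Share of the net change attributable to each property type.
share_of_added = {kind: n / total_added for kind, n in net_added.items()}

print(net_added)
print(f"single family share of net additions: {share_of_added['single_family']:.1%}")
```

Single family homes make up about 52% of the existing New York stock but only about 4% of the net additions in this window, which is why the single family percentage drifts down even though the single family count rises.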


In [30]:
# going back to our original question: which market has had the largest increase in single family home percentage?
housing_stock_hist_v2['pct_single_family_delta'] = housing_stock_hist_v2['pct_single_family']-housing_stock_hist_v2['pct_single_family_start']

housing_stock_hist_v2.loc[housing_stock_hist_v2['date'] == '2024-03-01'].sort_values('pct_single_family_delta', ascending=False)[['Common Name', 'pct_single_family_delta']].head(5)

Unnamed: 0,Common Name,pct_single_family_delta
189,Dallas,0.007585
1638,Austin,0.007053
2331,Jacksonville,0.007003
1764,Las Vegas,0.006152
1323,Orlando,0.005611
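
Because each delta is a difference between two proportions, multiplying by 10,000 expresses it in basis points. A small sketch using the values from the output above; `to_basis_points` is a hypothetical helper written for this example.

```python
def to_basis_points(delta: float) -> float:
    """Convert a difference between two proportions into basis points."""
    return delta * 10_000


# pct_single_family_delta values copied from the output above.
deltas = {
    "Dallas": 0.007585,
    "Austin": 0.007053,
    "Jacksonville": 0.007003,
    "Las Vegas": 0.006152,
    "Orlando": 0.005611,
}

for market, delta in deltas.items():
    print(f"{market:<13} {to_basis_points(delta):+.1f} bps")
```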


In [None]:
# Dallas, Austin, Jacksonville, Las Vegas, and Orlando have each increased their single family share by more than 50 basis points.
# In other words, single family homes have gained more than half a percentage point of the total housing stock in these markets since January 2019.
# One thesis is that consumers in these markets particularly prefer single family
# homes over other types of housing stock.

In [31]:
# what about the inverse? 
housing_stock_hist_v2.loc[housing_stock_hist_v2['date'] == '2024-03-01'].sort_values('pct_single_family_delta', ascending=True)[['Common Name', 'pct_single_family_delta']].head(5)

Unnamed: 0,Common Name,pct_single_family_delta
1386,Charlotte,-0.011159
882,Seattle,-0.009557
2835,Salt Lake City,-0.007737
2142,Nashville,-0.006484
630,Boston,-0.005247
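
The baseline-and-delta steps used in this section could also be written without the intermediate merge. The sketch below is one possible alternative, not the notebook's method: it sorts by date and uses `groupby(...).transform("first")` to broadcast each market's earliest share to all of its rows. It uses three New York values from the outputs above, and it assumes the earliest row in the frame is the intended baseline rather than matching `start_date` exactly.

```python
import pandas as pd

# New York shares copied from the outputs above (baseline is 2019-01-01).
hist = pd.DataFrame({
    "parcl_id": [2900187, 2900187, 2900187],
    "date": ["2024-03-01", "2024-02-01", "2019-01-01"],
    "pct_single_family": [0.517072, 0.517133, 0.518624],
})
hist["date"] = pd.to_datetime(hist["date"])

# Sort so the earliest observation comes first within each market, then
# broadcast that first value to every row of the same market.
hist = hist.sort_values(["parcl_id", "date"])
hist["pct_single_family_start"] = (
    hist.groupby("parcl_id")["pct_single_family"].transform("first")
)
hist["pct_single_family_delta"] = (
    hist["pct_single_family"] - hist["pct_single_family_start"]
)

# Keep only the most recent row per market.
latest = hist.groupby("parcl_id").tail(1)
print(latest[["parcl_id", "date", "pct_single_family_delta"]])
```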


In [None]:
# Charlotte, Seattle, Salt Lake City, Nashville, and Boston have seen the largest declines in single family share
# over the last 5 years, which suggests new construction in these markets has skewed toward other housing types