<center>
<h1>Welcome to the Lab 🥼🧪</h1>
</center>

## How do we identify who is actually providing supply to housing markets?

We will analyze whether supply is coming from investors, new construction, or existing homeowners. We will break out investors by portfolio size and analyze the impact of each group on the housing market.

#### Need help getting started?

As a reminder, you can get your Parcl Labs API key [here](https://dashboard.parcllabs.com/signup) to follow along.

To run this immediately, you can use Google Colab. Remember, you must set your `PARCL_LABS_API_KEY`.

Run in Colab --> [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ParclLabs/parcllabs-cookbook/blob/main/examples/experimental/supply_and_demand/who_is_providing_supply.ipynb)

In [None]:
# if needed, install and/or upgrade to the latest version of the Parcl Labs Python library
%pip install --upgrade parcllabs nbformat

In [1]:
import os
import pandas as pd
import plotly.express as px
from datetime import datetime
import plotly.graph_objects as go
from parcllabs import ParclLabsClient
from parcllabs.beta.charting.styling import SIZE_CONFIG
from parcllabs.beta.ts_stats import TimeSeriesAnalysis
from parcllabs.beta.charting.utils import create_labs_logo_dict
from parcllabs.beta.charting.default_charts import create_dual_axis_chart
from parcllabs.beta.charting.styling import default_style_config as style_config


client = ParclLabsClient(
    api_key=os.environ.get('PARCL_LABS_API_KEY', "<your Parcl Labs API key if not set as environment variable>"), 
    limit=200 # set global default limit, will be handy when retrieving the market data itself
)
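
The client above falls back to a placeholder string when `PARCL_LABS_API_KEY` is not set, which only surfaces later as an authentication failure. A minimal sketch of an earlier check is shown here; `resolve_api_key` is a hypothetical helper, not part of the `parcllabs` library.

```python
import os


def resolve_api_key(env_var: str = "PARCL_LABS_API_KEY") -> str:
    """Return the key stored in `env_var`, or raise a readable error.

    Hypothetical helper for this notebook; not part of parcllabs.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before creating the client."
        )
    return key


# Possible usage (left commented so the sketch runs without a key):
# client = ParclLabsClient(api_key=resolve_api_key(), limit=200)
```

Failing fast here keeps a missing key from being mistaken for a data or network problem further down the notebook.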

In [44]:
# STEP 1. Retrieve Markets

# let's analyze the top 100 markets, compared against the US average

metros = client.search.markets.retrieve(
    sort_by='TOTAL_POPULATION',
    sort_order='DESC',
    location_type='CBSA',
    limit=100 # get top 100 metros based on population
)

# add the US national market so metros can be compared against national numbers
us = client.search.markets.retrieve(
    query='United States',
    limit=1
)

markets = pd.concat([metros, us])
market_parcl_ids = markets['parcl_id'].tolist()

markets['state'] = markets['name'].apply(lambda x: x.split(',')[-1].strip().upper().split('-')[0])
markets['clean_name'] = markets.apply(lambda x: f"{x['name'].split('-')[0].split(',')[0].strip()}, {x['state']}", axis=1)
markets['clean_name'] = markets['clean_name'].replace({'United States Of America, UNITED STATES OF AMERICA': 'USA'})
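
To make the name cleaning above easier to follow, here is the same string logic applied to a few plain strings. `clean_market_name` is a hypothetical wrapper introduced only for illustration; the split and replace rules are copied from the cell above.

```python
def clean_market_name(name: str) -> str:
    """Collapse a CBSA name to 'First City, ST' (illustrative helper)."""
    # State: text after the last comma, upper-cased, first state only.
    state = name.split(',')[-1].strip().upper().split('-')[0]
    # City: text before the first hyphen, then before the first comma.
    city = name.split('-')[0].split(',')[0].strip()
    label = f"{city}, {state}"
    # The national market has no comma, so city and state are both the full name.
    if label == 'United States Of America, UNITED STATES OF AMERICA':
        return 'USA'
    return label


examples = [
    "New York-Newark-Jersey City, Ny-Nj-Pa",
    "Rochester, Ny",
    "United States Of America",
    "Winston-Salem, Nc",
]
cleaned = [clean_market_name(n) for n in examples]
print(cleaned)
# ['New York, NY', 'Rochester, NY', 'USA', 'Winston, NC']
```

The last example shows a limitation of splitting on the first hyphen: a city whose own name contains a hyphen is truncated. That is acceptable for chart labels but worth knowing before joining on `clean_name`.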

In [45]:
PROPERTY_TYPE = 'ALL_PROPERTIES'
# get supply side of the market
supply = client.for_sale_market_metrics.for_sale_inventory.retrieve(
    parcl_ids=market_parcl_ids,
    limit=300,
    property_type=PROPERTY_TYPE,
)

# get price changing dynamics
price_changes = client.for_sale_market_metrics.for_sale_inventory_price_changes.retrieve(
    parcl_ids=market_parcl_ids,
    limit=300,
    property_type=PROPERTY_TYPE,
)

|████████████████████████████████████████| 101/101 [100%] in 8.6s (11.72/s) 
|████████████████████████████████████████| 101/101 [100%] in 9.1s (11.06/s) 


In [46]:
supply = supply.merge(price_changes[['parcl_id', 'date', 'count_price_drop']], on=['parcl_id', 'date'])
supply.head()

Unnamed: 0,date,for_sale_inventory,parcl_id,property_type,count_price_drop
0,2024-07-01,29089,2900187,ALL_PROPERTIES,6032
1,2024-06-24,29253,2900187,ALL_PROPERTIES,6008
2,2024-06-17,29468,2900187,ALL_PROPERTIES,5878
3,2024-06-10,29206,2900187,ALL_PROPERTIES,5771
4,2024-06-03,27513,2900187,ALL_PROPERTIES,5490


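In [47]:
# share of for-sale inventory that has had a price drop
supply['pct_price_drops'] = supply['count_price_drop'] / supply['for_sale_inventory']

In [54]:
# 52 weekly observations back = year-over-year change, computed per market
supply = supply.sort_values(['parcl_id', 'date'])
supply['yoy_change_in_price_drops'] = supply.groupby('parcl_id')['pct_price_drops'].pct_change(52) * 100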
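
The year-over-year figure relies on the series being weekly, so that 52 rows back within a market is roughly one year earlier. The sketch below reproduces that calculation on made-up data for two hypothetical markets; the values are invented for illustration and do not come from the API.

```python
import pandas as pd

# 53 weekly Mondays ending on 2024-07-01 (toy data, not API output).
dates = pd.date_range("2023-07-03", periods=53, freq="W-MON")
toy = pd.concat(
    [
        pd.DataFrame({"parcl_id": 1, "date": dates,
                      "pct_price_drops": [0.20] * 52 + [0.30]}),
        pd.DataFrame({"parcl_id": 2, "date": dates,
                      "pct_price_drops": [0.10] * 52 + [0.08]}),
    ],
    ignore_index=True,
)

# Sort so that "52 rows back" means "52 weeks earlier" inside each market.
toy = toy.sort_values(["parcl_id", "date"])
toy["yoy_change"] = (
    toy.groupby("parcl_id")["pct_price_drops"].pct_change(52) * 100
)

latest = toy[toy["date"] == dates[-1]].set_index("parcl_id")["yoy_change"]
print(latest.round(1).to_dict())
```

Only the final week has a value here because every earlier row lacks an observation 52 weeks before it. If a market is missing weeks, a positional shift of 52 no longer lines up with the calendar, so checking for gaps first is a reasonable precaution.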

In [48]:
ath = supply.groupby('parcl_id')['pct_price_drops'].max().reset_index(name='ath_pct_drops')
ath_supply = supply.merge(ath, on=['parcl_id'], how='left')
pids = ath_supply.loc[(ath_supply['date']=='2024-07-01') & (ath_supply['pct_price_drops'] == ath_supply['ath_pct_drops'])]['parcl_id'].tolist()
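
The all-time-high check above can also be written without the intermediate merge by broadcasting each market's maximum back onto its rows. This is a sketch on invented numbers, using `transform("max")` as an alternative to the `groupby` plus `merge` in the cell above, not a change to the original logic.

```python
import pandas as pd

# Toy data: market 1 peaks in the latest week, market 2 peaked earlier.
toy = pd.DataFrame({
    "parcl_id": [1, 1, 1, 2, 2, 2],
    "date": pd.to_datetime(["2024-06-17", "2024-06-24", "2024-07-01"] * 2),
    "pct_price_drops": [0.18, 0.19, 0.21, 0.25, 0.22, 0.20],
})

latest = toy["date"].max()
# Per-market maximum, repeated on every row of that market.
series_max = toy.groupby("parcl_id")["pct_price_drops"].transform("max")

at_high = toy.loc[
    (toy["date"] == latest) & (toy["pct_price_drops"] == series_max),
    "parcl_id",
].tolist()
print(at_high)
# [1]
```

Note that "all-time" here means the highest value within the retrieved history, which is bounded by the `limit` and date range passed to the API call.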


In [55]:
top_n = supply.loc[supply['date']=='2024-07-01'].sort_values('yoy_change_in_price_drops', ascending=False).head(20)
top_n_ids = top_n['parcl_id'].tolist()
if us['parcl_id'].values[0] not in top_n_ids:
    top_n_ids.append(us['parcl_id'].values[0])

chart = supply.loc[supply['parcl_id'].isin(top_n_ids)]

In [56]:

chart = chart.merge(markets[['parcl_id', 'clean_name']], on='parcl_id')

In [57]:
max_date_for_chart = chart['date'].max().date()
max_date_for_chart = max_date_for_chart.strftime('%B %d, %Y')

# Create the line chart using Plotly Express
fig = px.line(
    chart,
    x='date',
    y='pct_price_drops',
    color='clean_name',
    line_group='clean_name',
    labels={'pct_price_drops': '% of Inventory with Price Cuts'},
    title=f'Percentage of Inventory with Price Reductions ({max_date_for_chart})'
)

# Update traces to apply specific styles
for trace in fig.data:
    if trace.name == 'USA':
        trace.update(
            line=dict(color='red', width=4),
            opacity=1
        )
    else:
        trace.update(
            line=dict(color='lightblue', dash='dash', width=2),
            opacity=0.8
        )
    # Remove text annotations from traces
    trace.update(
        mode='lines'
    )

# Find the latest date in the dataset
latest_date = max(chart['date'])

# Add annotations for each line on the far right
annotations = []
y_positions = []

for trace in fig.data:
    # Get the last y-value for each clean_name
    last_y_value = chart[
        (chart['clean_name'] == trace.name) &
        (chart['date'] == latest_date)
    ]['pct_price_drops'].values[0]
    
    # Only add the annotation if it doesn't overlap with existing annotations
    if not any(abs(last_y_value - y) < 0.01 for y in y_positions):  # Adjust threshold as needed
        annotations.append(dict(
            x=latest_date,
            y=last_y_value,
            xref='x',
            yref='y',
            text=trace.name,
            showarrow=False,
            xanchor='left',
            font=dict(size=12)  # Adjust the font size if needed
        ))
        y_positions.append(last_y_value)

fig.add_layout_image(
        create_labs_logo_dict()
)

# Update layout for axes, title, and other styling
fig.update_layout(
    width=1600,
    height=800,
    xaxis=dict(
        title='',
        showgrid=style_config['showgrid'],
        gridwidth=style_config['gridwidth'],
        gridcolor=style_config['grid_color'],
        # tickangle=style_config['tick_angle'],
        linecolor=style_config['line_color_axis'],
        linewidth=style_config['linewidth'],
        titlefont=style_config['title_font_axis']
    ),
    yaxis=dict(
        title='% Price Reductions',
        showgrid=style_config['showgrid'],
        gridwidth=style_config['gridwidth'],
        gridcolor=style_config['grid_color'],
        tickfont=style_config['axis_font'],
        zeroline=False,
        tickformat='.0%',
        linecolor=style_config['line_color_axis'],
        linewidth=style_config['linewidth'],
        titlefont=style_config['title_font_axis']
    ),
    plot_bgcolor=style_config['background_color'],
    paper_bgcolor=style_config['background_color'],
    font=dict(color=style_config['font_color']),
    showlegend=False,  # Remove the legend
    margin=dict(l=40, r=40, t=80, b=40),
    title={
        'y': 0.98,
        'x': 0.5,
        'xanchor': 'center',
        'yanchor': 'top',
        'font': dict(size=24)
    },
    annotations=annotations  # Add annotations
)

fig.show()
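
The annotation loop above labels a line only if its final value is at least 0.01 away from every label already placed. The same rule is isolated below in plain Python so its behavior is easy to test; `pick_labels` and the sample values are hypothetical.

```python
def pick_labels(last_values: dict[str, float], min_gap: float = 0.01) -> list[str]:
    """Return the names whose end-of-line label would not overlap an earlier one."""
    kept: list[str] = []
    used: list[float] = []
    for name, y in last_values.items():
        # Skip this label if it sits within min_gap of one already placed.
        if any(abs(y - other) < min_gap for other in used):
            continue
        kept.append(name)
        used.append(y)
    return kept


labels = pick_labels({"USA": 0.30, "A": 0.305, "B": 0.35, "C": 0.342})
print(labels)
# ['USA', 'B']
```

The rule is order dependent: whichever trace comes first keeps its label. If the national line must always be labeled, placing it first in the iteration order is one simple way to guarantee that.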


In [11]:
start_date = '2022-09-01'

supply = client.for_sale_market_metrics.for_sale_inventory.retrieve(
    parcl_ids=market_parcl_ids,
    start_date=start_date
)

new_listings = client.market_metrics.housing_event_counts.retrieve(
    parcl_ids=market_parcl_ids,
    start_date=start_date
)

# listing counts come from three separate endpoints: all listings (above), new construction, and investors
nc_listings = client.new_construction_metrics.housing_event_counts.retrieve(
    parcl_ids=market_parcl_ids,
    start_date=start_date
)

investor_listings = client.investor_metrics.housing_event_counts.retrieve(
    parcl_ids=market_parcl_ids,
    start_date=start_date
)

|████████████████████████████████████████| 301/301 [100%] in 25.7s (11.71/s) 
|████████████████████████████████████████| 301/301 [100%] in 24.3s (12.39/s) 
|████████████████████████████████████████| 301/301 [100%] in 24.8s (12.12/s) 
|████████████████████████████████████████| 301/301 [100%] in 27.6s (10.90/s) 


In [22]:
nc_listings.head()

Unnamed: 0,date,sales,new_listings_for_sale,new_rental_listings,parcl_id,property_type
0,2024-05-01,658,848,1178,2900187,ALL_PROPERTIES
1,2024-04-01,868,728,876,2900187,ALL_PROPERTIES
2,2024-03-01,994,768,940,2900187,ALL_PROPERTIES
3,2024-02-01,847,808,913,2900187,ALL_PROPERTIES
4,2024-01-01,926,786,1171,2900187,ALL_PROPERTIES


In [12]:
# get investor ownership
investor_ownership = client.investor_metrics.housing_stock_ownership.retrieve(
    parcl_ids=market_parcl_ids,
    start_date=start_date
)

investor_ownership

|████████████████████████████████████████| 301/301 [100%] in 1:44.3 (2.88/s) 


Unnamed: 0,date,count,pct_ownership,parcl_id
0,2024-05-01,352299,6.47,2900187
1,2024-04-01,351312,6.45,2900187
2,2024-03-01,350113,6.43,2900187
3,2024-02-01,348896,6.41,2900187
4,2024-01-01,347824,6.39,2900187
...,...,...,...,...
6316,2023-01-01,9924827,7.93,5826765
6317,2022-12-01,9922237,7.93,5826765
6318,2022-11-01,9921070,7.93,5826765
6319,2022-10-01,9920807,7.93,5826765


In [17]:
# need to index supply monthly as it's currently a weekly series
supply['date_month'] = supply['date'].dt.to_period('M').dt.to_timestamp()
max_weekly_date = supply.groupby(['parcl_id', 'date_month'])['date'].max().reset_index()
supply = supply.merge(max_weekly_date, on=['parcl_id', 'date_month', 'date'], how='inner') # inner join will get us the last week of each month
supply = supply.rename(columns={
    'date': 'date_arch',
    'date_month': 'date'
})
supply = supply[['date', 'parcl_id', 'for_sale_inventory']]
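
The weekly-to-monthly step keeps, for each market and month, the last weekly reading and then stamps it with the first day of that month so it can join to the monthly series. A small sketch on invented inventory numbers, using `transform("max")` in place of the self-merge above:

```python
import pandas as pd

# Toy weekly inventory for one hypothetical market.
weekly = pd.DataFrame({
    "parcl_id": [1] * 6,
    "date": pd.to_datetime([
        "2024-05-13", "2024-05-20", "2024-05-27",
        "2024-06-03", "2024-06-10", "2024-06-17",
    ]),
    "for_sale_inventory": [100, 110, 120, 130, 140, 150],
})

# Month bucket, expressed as the first day of the month.
weekly["date_month"] = weekly["date"].dt.to_period("M").dt.to_timestamp()
# Latest weekly date observed within each (market, month).
last_week = weekly.groupby(["parcl_id", "date_month"])["date"].transform("max")

monthly = (
    weekly.loc[weekly["date"] == last_week,
               ["date_month", "parcl_id", "for_sale_inventory"]]
    .rename(columns={"date_month": "date"})
    .reset_index(drop=True)
)
print(monthly)
```

In this toy frame the May row carries the 2024-05-27 reading and the June row the 2024-06-17 reading. For a month that is still in progress, the "last week" is simply the most recent week available, so the final monthly point is provisional.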

In [18]:
cols = ['date', 'parcl_id', 'new_listings_for_sale']
nl = new_listings[cols]
ncl = nc_listings[cols]
nil = investor_listings[cols]

In [19]:
ncl = ncl.rename(columns={
    'new_listings_for_sale': 'new_construction_listings_for_sale' 
})

nil = nil.rename(columns={
    'new_listings_for_sale': 'new_investor_listings_for_sale'
})

data = pd.merge(nl, ncl, on=['date', 'parcl_id'])
data = data.merge(nil, on=['date', 'parcl_id'])
data = pd.merge(markets[['parcl_id', 'name']], data, on='parcl_id')
data = data.merge(supply, on=['parcl_id', 'date'])
data = data.merge(investor_ownership[['date', 'parcl_id', 'pct_ownership']], on=['date', 'parcl_id'])
data['pct_ownership'] = data['pct_ownership']/100
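
The chain of merges above assumes each frame has exactly one row per `parcl_id` and `date`. Any duplicate would silently multiply rows and inflate the ratios computed later. As a hedged suggestion, pandas can enforce that assumption through the `validate` argument of `merge`; the frames below are toy data.

```python
import pandas as pd

keys = {
    "parcl_id": [1, 1],
    "date": pd.to_datetime(["2024-04-01", "2024-05-01"]),
}
left = pd.DataFrame({**keys, "new_listings_for_sale": [10, 12]})
right = pd.DataFrame({**keys, "for_sale_inventory": [40, 50]})

# Passes: both sides are unique on the join keys.
merged = left.merge(right, on=["parcl_id", "date"], validate="one_to_one")

# A duplicated key on the right side is rejected instead of fanning out rows.
dup = pd.concat([right, right.iloc[[0]]], ignore_index=True)
try:
    left.merge(dup, on=["parcl_id", "date"], validate="one_to_one")
    duplicate_detected = False
except pd.errors.MergeError:
    duplicate_detected = True

print(len(merged), duplicate_detected)
```

Because these are inner joins, it is also worth comparing row counts before and after each merge, since markets or months missing from any one endpoint drop out of `data` entirely.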

In [20]:
data.head()

Unnamed: 0,parcl_id,name,date,new_listings_for_sale,new_construction_listings_for_sale,new_investor_listings_for_sale,for_sale_inventory,pct_ownership
0,2900187,"New York-Newark-Jersey City, Ny-Nj-Pa",2024-05-01,14754,848,1439,27629,0.0647
1,2900187,"New York-Newark-Jersey City, Ny-Nj-Pa",2024-04-01,14475,728,1347,26348,0.0645
2,2900187,"New York-Newark-Jersey City, Ny-Nj-Pa",2024-03-01,13290,768,1308,24987,0.0643
3,2900187,"New York-Newark-Jersey City, Ny-Nj-Pa",2024-02-01,12080,808,1233,22718,0.0641
4,2900187,"New York-Newark-Jersey City, Ny-Nj-Pa",2024-01-01,11417,786,1203,18667,0.0639


In [21]:
data['pct_new_listings_of_all'] = data['new_listings_for_sale']/data['for_sale_inventory']
data['pct_new_construction_listings_of_new'] = data['new_construction_listings_for_sale']/data['new_listings_for_sale']
data['pct_new_investor_listings_of_new'] = data['new_investor_listings_for_sale'] / data['new_listings_for_sale']
data['pct_new_construction_listings_of_all'] = data['new_construction_listings_for_sale']/data['for_sale_inventory']
data['pct_new_investor_listings_of_all'] = data['new_investor_listings_for_sale']/data['for_sale_inventory']
data['ownership_to_list_skew_new_listings'] = data['pct_new_investor_listings_of_new'] - data['pct_ownership']
data['ownership_to_list_skew_of_all'] = data['pct_new_investor_listings_of_all'] - data['pct_ownership']
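
As a worked check of these ratios, the arithmetic below uses the May 2024 Rochester, NY figures that appear in the output table further down. The variable names are illustrative only.

```python
# Rochester, NY, 2024-05-01 (values taken from the notebook's output table).
new_listings = 3057
investor_new_listings = 399
for_sale_inventory = 2678
investor_ownership_share = 0.056  # pct_ownership after dividing by 100

# Share of the month's new listings that came from investors.
investor_share_of_new = investor_new_listings / new_listings

# Skew: how far investors' share of new listings exceeds their share of the stock.
skew = investor_share_of_new - investor_ownership_share

# New listings relative to standing inventory.
new_to_inventory = new_listings / for_sale_inventory

print(round(investor_share_of_new, 5), round(skew, 5), round(new_to_inventory, 6))
# 0.13052 0.07452 1.141524
```

A positive skew means investors account for a larger share of new listings than of the housing stock they own. The last ratio exceeds 1 here, which is possible because it divides a monthly flow of listings by a point-in-time inventory count; it reads better as a turnover-style ratio than as a strict percentage.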

In [64]:
data.loc[data['date']=='5/1/2024'].sort_values('ownership_to_list_skew_new_listings', ascending=False).head(20)

Unnamed: 0,parcl_id,name,date,new_listings_for_sale,new_construction_listings_for_sale,new_investor_listings_for_sale,for_sale_inventory,pct_ownership,pct_new_listings_of_all,pct_new_construction_listings_of_new,pct_new_investor_listings_of_new,pct_new_construction_listings_of_all,pct_new_investor_listings_of_all,ownership_to_list_skew_new_listings,ownership_to_list_skew_of_all
1071,2900301,"Rochester, Ny",2024-05-01,3057,121,399,2678,0.056,1.141524,0.039581,0.13052,0.045183,0.148992,0.07452,0.092992
903,2900292,"Richmond, Va",2024-05-01,2139,184,236,3219,0.0442,0.664492,0.086022,0.110332,0.057161,0.073315,0.066132,0.029115
756,2900462,"Virginia Beach-Norfolk-Newport News, Va-Nc",2024-05-01,2682,135,307,4775,0.0499,0.561675,0.050336,0.114467,0.028272,0.064293,0.064567,0.014393
420,2900321,"St. Louis, Mo-Il",2024-05-01,4673,96,847,8313,0.1179,0.562132,0.020544,0.181254,0.011548,0.101889,0.063354,-0.016011
1281,2887291,"Bakersfield, Ca",2024-05-01,1202,32,207,2071,0.109,0.580396,0.026622,0.172213,0.015451,0.099952,0.063213,-0.009048
1953,2900429,"Toledo, Oh",2024-05-01,1010,26,155,1479,0.0916,0.682894,0.025743,0.153465,0.017579,0.104801,0.061865,0.013201
1995,2900229,"Palm Bay-Melbourne-Titusville, Fl",2024-05-01,2117,114,354,4928,0.1076,0.429586,0.05385,0.167218,0.023133,0.071834,0.059618,-0.035766
1596,2899822,"Cape Coral-Fort Myers, Fl",2024-05-01,3496,239,738,10354,0.1526,0.337647,0.068364,0.211098,0.023083,0.071277,0.058498,-0.081323
1722,2899854,"Akron, Oh",2024-05-01,868,14,133,1232,0.0955,0.704545,0.016129,0.153226,0.011364,0.107955,0.057726,0.012455
693,2899654,"Cleveland-Elyria, Oh",2024-05-01,3069,66,447,4611,0.0882,0.665582,0.021505,0.14565,0.014314,0.096942,0.05745,0.008742


In [9]:
data.loc[data['date']=='5/1/2024'].sort_values('pct_new_construction_listings_of_new', ascending=False).head(20)

Unnamed: 0,parcl_id,name,date,new_listings_for_sale,new_construction_listings_for_sale,new_investor_listings_for_sale,for_sale_inventory,pct_ownership,pct_new_listings_of_all,pct_new_construction_listings_of_new,pct_new_investor_listings_of_new,pct_new_construction_listings_of_all,pct_new_investor_listings_of_all,ownership_to_list_skew_new_listings,ownership_to_list_skew_of_all
1344,2900116,"Mcallen-Edinburg-Mission, Tx",2024-05-01,866,160,53,1927,0.0338,0.449403,0.184758,0.061201,0.083031,0.027504,0.027401,-0.006296
1701,2899752,"Des Moines-West Des Moines, Ia",2024-05-01,1538,250,124,3059,0.0715,0.502779,0.162549,0.080624,0.081726,0.040536,0.009124,-0.030964
714,2900174,"Nashville-Davidson--Murfreesboro--Franklin, Tn",2024-05-01,5350,848,436,9232,0.0671,0.579506,0.158505,0.081495,0.091854,0.047227,0.014395,-0.019873
567,2887289,"Austin-Round Rock-Georgetown, Tx",2024-05-01,5984,904,388,14821,0.0509,0.403751,0.15107,0.06484,0.060995,0.026179,0.01394,-0.024721
882,2900122,"Memphis, Tn-Ms-Ar",2024-05-01,517,78,43,1110,0.1111,0.465766,0.15087,0.083172,0.07027,0.038739,-0.027928,-0.072361
1848,2900276,"Provo-Orem, Ut",2024-05-01,1294,194,115,2490,0.1346,0.519679,0.149923,0.088872,0.077912,0.046185,-0.045728,-0.088415
84,2899967,"Houston-The Woodlands-Sugar Land, Tx",2024-05-01,15024,2185,1323,30807,0.0546,0.487681,0.145434,0.088059,0.070925,0.042945,0.033459,-0.011655
1974,2887286,"Augusta-Richmond County, Ga-Sc",2024-05-01,1233,178,94,2494,0.0685,0.494387,0.144363,0.076237,0.071371,0.03769,0.007737,-0.03081
1617,2899621,"Boise City, Id",2024-05-01,2052,289,167,3926,0.0921,0.522669,0.140838,0.081384,0.073612,0.042537,-0.010716,-0.049563
483,2900331,"San Antonio-New Braunfels, Tx",2024-05-01,5256,737,514,12981,0.0543,0.404899,0.140221,0.097793,0.056775,0.039596,0.043493,-0.014704


In [66]:
data['state'] = data['name'].apply(lambda x: x.split(',')[-1].strip().upper().split('-')[0])
data['clean_name'] = data.apply(lambda x: f"{x['name'].split('-')[0].split(',')[0].strip()}, {x['state']}", axis=1)
data['clean_name'] = data['clean_name'].replace({'United States Of America, UNITED STATES OF AMERICA': 'USA'})

In [40]:
def multi_market_line_chart_as_pct(
    data: pd.DataFrame,
    y: str='pct_new_construction_listings_of_new',
    x: str='date',
    title: str='Percentage of New Listings coming from New Construction',
    label: str='% of New Listings from New Construction',
    color: str='clean_name'
): 

    max_date_for_chart = data['date'].max().date()
    max_date_for_chart = max_date_for_chart.strftime('%B %d, %Y')

    # Create the line chart using Plotly Express
    fig = px.line(
        data,
        x=x,
        y=y,
        color=color,
        line_group=color,
        labels={y: label},
        title=f'{title} ({max_date_for_chart})'
    )

    # Update traces to apply specific styles
    for trace in fig.data:
        if trace.name == 'USA':
            trace.update(
                line=dict(color='red', width=4),
                opacity=1
            )
        else:
            trace.update(
                line=dict(color='lightblue', dash='dash', width=2),
                opacity=0.8
            )
        # Remove text annotations from traces
        trace.update(
            mode='lines'
        )

    # Find the latest date in the dataset
    latest_date = max(data[x])

    # Add annotations for each line on the far right
    annotations = []
    y_positions = []

    for trace in fig.data:
        # Get the last y-value for each clean_name
        last_y_value = data[
            (data[color] == trace.name) &
            (data[x] == latest_date)
        ][y].values[0]
        
        # Only add the annotation if it doesn't overlap with existing annotations
        if not any(abs(last_y_value - prev_y) < 0.01 for prev_y in y_positions):  # Adjust threshold as needed; prev_y avoids shadowing the `y` column argument
            annotations.append(dict(
                x=latest_date,
                y=last_y_value,
                xref='x',
                yref='y',
                text=trace.name,
                showarrow=False,
                xanchor='left',
                font=dict(size=12)  # Adjust the font size if needed
            ))
            y_positions.append(last_y_value)

    fig.add_layout_image(
            create_labs_logo_dict()
    )

    # Update layout for axes, title, and other styling
    fig.update_layout(
        width=1600,
        height=800,
        xaxis=dict(
            title='',
            showgrid=style_config['showgrid'],
            gridwidth=style_config['gridwidth'],
            gridcolor=style_config['grid_color'],
            # tickangle=style_config['tick_angle'],
            linecolor=style_config['line_color_axis'],
            linewidth=style_config['linewidth'],
            titlefont=style_config['title_font_axis']
        ),
        yaxis=dict(
            title=label,  # use the caller's label rather than a hard-coded axis title
            showgrid=style_config['showgrid'],
            gridwidth=style_config['gridwidth'],
            gridcolor=style_config['grid_color'],
            tickfont=style_config['axis_font'],
            zeroline=False,
            tickformat='.0%',
            linecolor=style_config['line_color_axis'],
            linewidth=style_config['linewidth'],
            titlefont=style_config['title_font_axis']
        ),
        plot_bgcolor=style_config['background_color'],
        paper_bgcolor=style_config['background_color'],
        font=dict(color=style_config['font_color']),
        showlegend=False,  # Remove the legend
        margin=dict(l=40, r=40, t=80, b=40),
        title={
            'y': 0.98,
            'x': 0.5,
            'xanchor': 'center',
            'yanchor': 'top',
            'font': dict(size=24)
        },
        annotations=annotations  # Add annotations
    )

    fig.show()


In [42]:
multi_market_line_chart_as_pct(
    data=data,
    y='pct_new_construction_listings_of_new',
    x='date',
    title='Percentage of New Listings coming from New Construction',
    label='% of New Listings from New Construction',
    color='clean_name'
)


In [70]:
multi_market_line_chart_as_pct(
    # Florida metros plus parcl_id 5826765 (assumed here to be the US national market)
    data=data.loc[data['parcl_id'].isin([
        5826765,
        2900128,
        2900417,
        2900213,
        2899989,
        2900192,
        2899822,
        2900041,
        2899748,
        2900229,
    ])],
    y='pct_new_investor_listings_of_new',
    x='date',
    title='Percentage of New Listings coming from Investors',
    label='% of New Listings from Investors',
    color='clean_name'
)

In [69]:
# look up the parcl_ids of the Florida markets (Cape Coral-Fort Myers is 2899822)
data.loc[data['clean_name'].str.contains(', FL')]['parcl_id'].unique().tolist()

[2900128,
 2900417,
 2900213,
 2899989,
 2900192,
 2899822,
 2900041,
 2899748,
 2900229]