<center>
<h1>Welcome to the Lab 🥼🧪</h1>
</center>

## Who is providing the actual supply to housing markets?

We will analyze whether supply is coming from investors, new construction, or existing homeowners. We will break out investors by portfolio size and analyze the impact of each group on the housing market.

#### Need help getting started?

As a reminder, you can get your Parcl Labs API key [here](https://dashboard.parcllabs.com/signup) to follow along.

To run this immediately, you can use Google Colab. Remember, you must set your `PARCL_LABS_API_KEY`.

Run in Colab --> [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ParclLabs/parcllabs-cookbook/blob/main/examples/experimental/supply_and_demand/who_is_providing_supply.ipynb)

In [None]:
# if needed, install and/or upgrade to the latest version of the Parcl Labs Python library
%pip install --upgrade parcllabs nbformat

In [4]:
import os
import pandas as pd
import plotly.express as px
from datetime import datetime
import plotly.graph_objects as go
from parcllabs import ParclLabsClient
from parcllabs.beta.charting.styling import SIZE_CONFIG
from parcllabs.beta.ts_stats import TimeSeriesAnalysis
from parcllabs.beta.charting.utils import create_labs_logo_dict
from parcllabs.beta.charting.default_charts import create_dual_axis_chart
from parcllabs.beta.charting.styling import default_style_config as style_config


client = ParclLabsClient(
    api_key=os.environ.get('PARCL_LABS_API_KEY', "<your Parcl Labs API key if not set as environment variable>"), 
    limit=200 # set global default limit, will be handy when retrieving the market data itself
)

In [7]:
# STEP 1. Retrieve Markets

# let's analyze the top 100 markets by population, compared against the US average

metros = client.search.markets.retrieve(
    sort_by='TOTAL_POPULATION',
    sort_order='DESC',
    location_type='CBSA',
    limit=100 # get the top 100 metros based on population
)

# add the US national market to compare metros against national numbers
us = client.search.markets.retrieve(
    query='United States',
    limit=1
)

markets = pd.concat([metros, us])
market_parcl_ids = markets['parcl_id'].tolist()

In [9]:
start_date = '2022-09-01'

supply = client.for_sale_market_metrics.for_sale_inventory.retrieve(
    parcl_ids=market_parcl_ids,
    start_date=start_date
)

new_listings = client.market_metrics.housing_event_counts.retrieve(
    parcl_ids=market_parcl_ids,
    start_date=start_date
)

# new listings come from 3 separate endpoints: all listings (above), new construction, and investors
nc_listings = client.new_construction_metrics.housing_event_counts.retrieve(
    parcl_ids=market_parcl_ids,
    start_date=start_date
)

investor_listings = client.investor_metrics.housing_event_counts.retrieve(
    parcl_ids=market_parcl_ids,
    start_date=start_date
)

|████████████████████████████████████████| 101/101 [100%] in 8.2s (12.38/s) 
|████████████████████████████████████████| 101/101 [100%] in 7.5s (13.51/s) 
|████████████████████████████████████████| 101/101 [100%] in 7.7s (13.04/s) 
|████████████████████████████████████████| 101/101 [100%] in 7.6s (13.29/s) 


In [52]:
# get investor ownership
investor_ownership = client.investor_metrics.housing_stock_ownership.retrieve(
    parcl_ids=market_parcl_ids,
    start_date=start_date
)

investor_ownership

|████████████████████████████████████████| 101/101 [100%] in 8.5s (11.82/s) 


Unnamed: 0,date,count,pct_ownership,parcl_id
0,2024-05-01,352299,6.47,2900187
1,2024-04-01,351312,6.45,2900187
2,2024-03-01,350113,6.43,2900187
3,2024-02-01,348896,6.41,2900187
4,2024-01-01,347824,6.39,2900187
...,...,...,...,...
2116,2023-01-01,9924827,7.93,5826765
2117,2022-12-01,9922237,7.93,5826765
2118,2022-11-01,9921070,7.93,5826765
2119,2022-10-01,9920807,7.93,5826765


In [23]:
# index supply monthly, since it is currently a weekly series
supply['date_month'] = supply['date'].dt.to_period('M').dt.to_timestamp()
max_weekly_date = supply.groupby(['parcl_id', 'date_month'])['date'].max().reset_index()
supply = supply.merge(max_weekly_date, on=['parcl_id', 'date_month', 'date'], how='inner') # inner join will get us the last week of each month
supply = supply.rename(columns={
    'date': 'date_arch',
    'date_month': 'date'
})
supply = supply[['date', 'parcl_id', 'for_sale_inventory']]
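
The weekly-to-monthly step above keeps the last weekly observation in each month and re-labels it with the first day of that month. Here is a minimal sketch of the same idea on synthetic data, using `idxmax` instead of the groupby-then-merge pattern. The `weekly` frame and its values are made up for illustration; only the column names mirror the ones used in this notebook.

```python
import pandas as pd

# Synthetic weekly inventory for one hypothetical market (parcl_id 1)
weekly = pd.DataFrame({
    'parcl_id': [1] * 6,
    'date': pd.to_datetime([
        '2024-04-15', '2024-04-22', '2024-04-29',
        '2024-05-06', '2024-05-13', '2024-05-20',
    ]),
    'for_sale_inventory': [100, 110, 120, 130, 140, 150],
})

# Label every week with the first day of its month
weekly['date_month'] = weekly['date'].dt.to_period('M').dt.to_timestamp()

# idxmax returns the row label of the latest weekly date in each (market, month) group
last_rows = weekly.groupby(['parcl_id', 'date_month'])['date'].idxmax()

monthly = (
    weekly.loc[last_rows, ['date_month', 'parcl_id', 'for_sale_inventory']]
    .rename(columns={'date_month': 'date'})
    .reset_index(drop=True)
)
print(monthly)
```

Each month ends up with a single row carrying the inventory from its final weekly reading, which makes the series joinable against the monthly listing counts.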

In [12]:
cols = ['date', 'parcl_id', 'new_listings_for_sale']
nl = new_listings[cols]
ncl = nc_listings[cols]
nil = investor_listings[cols]

In [59]:
ncl = ncl.rename(columns={
    'new_listings_for_sale': 'new_construction_listings_for_sale' 
})

nil = nil.rename(columns={
    'new_listings_for_sale': 'new_investor_listings_for_sale'
})

data = pd.merge(nl, ncl, on=['date', 'parcl_id'])
data = data.merge(nil, on=['date', 'parcl_id'])
data = pd.merge(markets[['parcl_id', 'name']], data, on='parcl_id')
data = data.merge(supply, on=['parcl_id', 'date'])
data = data.merge(investor_ownership[['date', 'parcl_id', 'pct_ownership']], on=['date', 'parcl_id'])
data['pct_ownership'] = data['pct_ownership']/100
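
Every merge above uses the default inner join, so any market-month missing from one source silently drops out of `data`. A minimal sketch of how such drops could be audited, using pandas' `indicator` and `validate` options on toy frames (the `left` and `right` frames and their values are hypothetical, not Parcl Labs data):

```python
import pandas as pd

# Toy stand-ins for two of the monthly series (values are made up)
left = pd.DataFrame({
    'parcl_id': [1, 1, 2],
    'date': pd.to_datetime(['2024-04-01', '2024-05-01', '2024-05-01']),
    'new_listings_for_sale': [10, 12, 7],
})
right = pd.DataFrame({
    'parcl_id': [1, 2],
    'date': pd.to_datetime(['2024-05-01', '2024-05-01']),
    'for_sale_inventory': [30, 20],
})

# A left join with indicator=True marks rows that have no match on the right;
# validate='one_to_one' raises if either side has duplicate (parcl_id, date) keys
audit = left.merge(
    right,
    on=['parcl_id', 'date'],
    how='left',
    indicator=True,
    validate='one_to_one',
)

# These are the rows an inner join would have dropped
dropped = audit.loc[audit['_merge'] == 'left_only']
print(dropped[['parcl_id', 'date']])
```

Running a check like this before the inner joins shows which markets or months are lost, and from which source.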

In [60]:
data.head()

Unnamed: 0,parcl_id,name,date,new_listings_for_sale,new_construction_listings_for_sale,new_investor_listings_for_sale,for_sale_inventory,pct_ownership
0,2900187,"New York-Newark-Jersey City, Ny-Nj-Pa",2024-05-01,14754,848,1439,27642,0.0647
1,2900187,"New York-Newark-Jersey City, Ny-Nj-Pa",2024-04-01,14475,728,1347,26328,0.0645
2,2900187,"New York-Newark-Jersey City, Ny-Nj-Pa",2024-03-01,13290,768,1308,24960,0.0643
3,2900187,"New York-Newark-Jersey City, Ny-Nj-Pa",2024-02-01,12080,808,1233,22714,0.0641
4,2900187,"New York-Newark-Jersey City, Ny-Nj-Pa",2024-01-01,11417,786,1203,18649,0.0639


In [61]:
data['pct_new_listings_of_all'] = data['new_listings_for_sale']/data['for_sale_inventory']
data['pct_new_construction_listings_of_new'] = data['new_construction_listings_for_sale']/data['new_listings_for_sale']
data['pct_new_investor_listings_of_new'] = data['new_investor_listings_for_sale'] / data['new_listings_for_sale']
data['pct_new_construction_listings_of_all'] = data['new_construction_listings_for_sale']/data['for_sale_inventory']
data['pct_new_investor_listings_of_all'] = data['new_investor_listings_for_sale']/data['for_sale_inventory']
data['ownership_to_list_skew_new_listings'] = data['pct_new_investor_listings_of_new'] - data['pct_ownership']
data['ownership_to_list_skew_of_all'] = data['pct_new_investor_listings_of_all'] - data['pct_ownership']
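
As a worked example of these ratios, the sketch below recomputes three of them for a single row, using the May 2024 Rochester, NY values that appear in the output further down. Note that `pct_new_listings_of_all` divides a monthly flow (new listings) by a point-in-time stock (inventory from the last week of the month), so it can exceed 1, as it does here.

```python
import pandas as pd

# May 2024 values for Rochester, NY, copied from the notebook output
row = pd.DataFrame({
    'name': ['Rochester, Ny'],
    'new_listings_for_sale': [3057],
    'new_investor_listings_for_sale': [399],
    'for_sale_inventory': [2678],
    'pct_ownership': [0.056],  # investors own 5.6% of housing stock
})

# Share of new listings that come from investors
row['pct_new_investor_listings_of_new'] = (
    row['new_investor_listings_for_sale'] / row['new_listings_for_sale']
)

# Skew: investor share of new listings minus investor share of ownership.
# A positive value means investors are listing more than their ownership share.
row['ownership_to_list_skew_new_listings'] = (
    row['pct_new_investor_listings_of_new'] - row['pct_ownership']
)

# Flow over stock: above 1 when a month's new listings outnumber end-of-month inventory
row['pct_new_listings_of_all'] = (
    row['new_listings_for_sale'] / row['for_sale_inventory']
)

print(row.round(5).T)
```

Investors account for roughly 13.1% of Rochester's new listings against 5.6% of ownership, a skew of about 7.5 percentage points, which is why it tops the ranking below.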

In [64]:
data.loc[data['date']=='5/1/2024'].sort_values('ownership_to_list_skew_new_listings', ascending=False).head(20)

Unnamed: 0,parcl_id,name,date,new_listings_for_sale,new_construction_listings_for_sale,new_investor_listings_for_sale,for_sale_inventory,pct_ownership,pct_new_listings_of_all,pct_new_construction_listings_of_new,pct_new_investor_listings_of_new,pct_new_construction_listings_of_all,pct_new_investor_listings_of_all,ownership_to_list_skew_new_listings,ownership_to_list_skew_of_all
1071,2900301,"Rochester, Ny",2024-05-01,3057,121,399,2678,0.056,1.141524,0.039581,0.13052,0.045183,0.148992,0.07452,0.092992
903,2900292,"Richmond, Va",2024-05-01,2139,184,236,3219,0.0442,0.664492,0.086022,0.110332,0.057161,0.073315,0.066132,0.029115
756,2900462,"Virginia Beach-Norfolk-Newport News, Va-Nc",2024-05-01,2682,135,307,4775,0.0499,0.561675,0.050336,0.114467,0.028272,0.064293,0.064567,0.014393
420,2900321,"St. Louis, Mo-Il",2024-05-01,4673,96,847,8313,0.1179,0.562132,0.020544,0.181254,0.011548,0.101889,0.063354,-0.016011
1281,2887291,"Bakersfield, Ca",2024-05-01,1202,32,207,2071,0.109,0.580396,0.026622,0.172213,0.015451,0.099952,0.063213,-0.009048
1953,2900429,"Toledo, Oh",2024-05-01,1010,26,155,1479,0.0916,0.682894,0.025743,0.153465,0.017579,0.104801,0.061865,0.013201
1995,2900229,"Palm Bay-Melbourne-Titusville, Fl",2024-05-01,2117,114,354,4928,0.1076,0.429586,0.05385,0.167218,0.023133,0.071834,0.059618,-0.035766
1596,2899822,"Cape Coral-Fort Myers, Fl",2024-05-01,3496,239,738,10354,0.1526,0.337647,0.068364,0.211098,0.023083,0.071277,0.058498,-0.081323
1722,2899854,"Akron, Oh",2024-05-01,868,14,133,1232,0.0955,0.704545,0.016129,0.153226,0.011364,0.107955,0.057726,0.012455
693,2899654,"Cleveland-Elyria, Oh",2024-05-01,3069,66,447,4611,0.0882,0.665582,0.021505,0.14565,0.014314,0.096942,0.05745,0.008742


In [28]:
data.loc[data['name'].str.contains('Tampa')]

Unnamed: 0,parcl_id,name,date,new_listings_for_sale,new_construction_listings_for_sale,new_investor_listings_for_sale,for_sale_inventory,pct_new_listings_of_all,pct_new_construction_listings_of_new,pct_new_investor_listings_of_new,pct_new_construction_listings_of_all,pct_new_investor_listings_of_all
357,2900417,"Tampa-St. Petersburg-Clearwater, Fl",2024-05-01,8994,395,1148,22583,0.398264,0.043918,0.127641,0.017491,0.050835
358,2900417,"Tampa-St. Petersburg-Clearwater, Fl",2024-04-01,9478,399,1112,22880,0.414248,0.042097,0.117324,0.017439,0.048601
359,2900417,"Tampa-St. Petersburg-Clearwater, Fl",2024-03-01,9672,348,1096,22792,0.424359,0.03598,0.113317,0.015269,0.048087
360,2900417,"Tampa-St. Petersburg-Clearwater, Fl",2024-02-01,9535,336,1128,21917,0.43505,0.035239,0.118301,0.015331,0.051467
361,2900417,"Tampa-St. Petersburg-Clearwater, Fl",2024-01-01,8993,429,1136,17945,0.501142,0.047704,0.12632,0.023906,0.063305
362,2900417,"Tampa-St. Petersburg-Clearwater, Fl",2023-12-01,5749,280,703,16418,0.350164,0.048704,0.122282,0.017054,0.042819
363,2900417,"Tampa-St. Petersburg-Clearwater, Fl",2023-11-01,7747,470,975,18297,0.423403,0.060669,0.125855,0.025687,0.053287
364,2900417,"Tampa-St. Petersburg-Clearwater, Fl",2023-10-01,8479,478,1091,18200,0.465879,0.056375,0.128671,0.026264,0.059945
365,2900417,"Tampa-St. Petersburg-Clearwater, Fl",2023-09-01,8508,447,993,16702,0.5094,0.052539,0.116714,0.026763,0.059454
366,2900417,"Tampa-St. Petersburg-Clearwater, Fl",2023-08-01,8167,395,1006,15276,0.534629,0.048365,0.123179,0.025858,0.065855


In [33]:
data.loc[data['date']=='5/1/2024'].sort_values('pct_new_construction_listings_of_new', ascending=False).head(20)

Unnamed: 0,parcl_id,name,date,new_listings_for_sale,new_construction_listings_for_sale,new_investor_listings_for_sale,for_sale_inventory,pct_new_listings_of_all,pct_new_construction_listings_of_new,pct_new_investor_listings_of_new,pct_new_construction_listings_of_all,pct_new_investor_listings_of_all
1344,2900116,"Mcallen-Edinburg-Mission, Tx",2024-05-01,866,160,53,1929,0.448937,0.184758,0.061201,0.082945,0.027475
1701,2899752,"Des Moines-West Des Moines, Ia",2024-05-01,1538,250,124,3057,0.503108,0.162549,0.080624,0.08178,0.040563
714,2900174,"Nashville-Davidson--Murfreesboro--Franklin, Tn",2024-05-01,5350,848,436,9230,0.579632,0.158505,0.081495,0.091874,0.047237
567,2887289,"Austin-Round Rock-Georgetown, Tx",2024-05-01,5984,904,388,14983,0.399386,0.15107,0.06484,0.060335,0.025896
882,2900122,"Memphis, Tn-Ms-Ar",2024-05-01,517,78,43,1107,0.467028,0.15087,0.083172,0.070461,0.038844
1848,2900276,"Provo-Orem, Ut",2024-05-01,1294,194,115,2487,0.520306,0.149923,0.088872,0.078006,0.04624
84,2899967,"Houston-The Woodlands-Sugar Land, Tx",2024-05-01,15024,2185,1323,30860,0.486844,0.145434,0.088059,0.070804,0.042871
1974,2887286,"Augusta-Richmond County, Ga-Sc",2024-05-01,1233,178,94,2498,0.493595,0.144363,0.076237,0.071257,0.03763
1617,2899621,"Boise City, Id",2024-05-01,2052,289,167,3912,0.52454,0.140838,0.081384,0.073875,0.042689
483,2900331,"San Antonio-New Braunfels, Tx",2024-05-01,5256,737,514,12981,0.404899,0.140221,0.097793,0.056775,0.039596


In [66]:
data['state'] = data['name'].apply(lambda x: x.split(',')[-1].strip().upper().split('-')[0])
data['clean_name'] = data.apply(lambda x: f"{x['name'].split('-')[0].split(',')[0].strip()}, {x['state']}", axis=1)
data['clean_name'] = data['clean_name'].replace({'United States Of America, UNITED STATES OF AMERICA': 'USA'})
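
To make the name cleaning easier to follow, here is the same string logic pulled into a standalone function and applied to a few names. `clean_market_name` is a hypothetical helper written for illustration, not part of the `parcllabs` library, and `Winston-Salem, Nc` is a hypothetical input rather than a name taken from the output above.

```python
def clean_market_name(name: str) -> str:
    """Reduce a CBSA name to '<first city>, <first state>'."""
    # State: text after the last comma, upper-cased, first of any hyphenated states
    state = name.split(',')[-1].strip().upper().split('-')[0]
    # City: text before the first hyphen, then before the first comma
    city = name.split('-')[0].split(',')[0].strip()
    return f"{city}, {state}"


examples = [
    "New York-Newark-Jersey City, Ny-Nj-Pa",
    "St. Louis, Mo-Il",
    "Nashville-Davidson--Murfreesboro--Franklin, Tn",
    "United States Of America",
    "Winston-Salem, Nc",  # hypothetical: a city whose own name contains a hyphen
]

for raw in examples:
    print(f"{raw!r} -> {clean_market_name(raw)!r}")
```

Two behaviors are worth knowing. A name with no comma, such as the national market, repeats itself on both sides of the comma, which is what the `replace` call above maps to `USA`. And because the city is taken before the first hyphen, a hyphenated city name is truncated (`Winston-Salem` becomes `Winston`).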

In [40]:
def multi_market_line_chart_as_pct(
    data: pd.DataFrame,
    y: str='pct_new_construction_listings_of_new',
    x: str='date',
    title: str='Percentage of New Listings coming from New Construction',
    label: str='% of New Listings from New Construction',
    color: str='clean_name'
): 

    max_date_for_chart = data['date'].max().date()
    max_date_for_chart = max_date_for_chart.strftime('%B %d, %Y')

    # Create the line chart using Plotly Express
    fig = px.line(
        data,
        x=x,
        y=y,
        color=color,
        line_group=color,
        labels={y: label},
        title=f'{title} ({max_date_for_chart})'
    )

    # Update traces to apply specific styles
    for trace in fig.data:
        if trace.name == 'USA':
            trace.update(
                line=dict(color='red', width=4),
                opacity=1
            )
        else:
            trace.update(
                line=dict(color='lightblue', dash='dash', width=2),
                opacity=0.8
            )
        # Remove text annotations from traces
        trace.update(
            mode='lines'
        )

    # Find the latest date in the dataset
    latest_date = max(data[x])

    # Add annotations for each line on the far right
    annotations = []
    y_positions = []

    for trace in fig.data:
        # Get the last y-value for each clean_name
        last_y_value = data[
            (data[color] == trace.name) &
            (data[x] == latest_date)
        ][y].values[0]
        
        # Only add the annotation if it doesn't overlap with existing annotations
        if not any(abs(last_y_value - pos) < 0.01 for pos in y_positions):  # 0.01 = 1 percentage point; adjust as needed
            annotations.append(dict(
                x=latest_date,
                y=last_y_value,
                xref='x',
                yref='y',
                text=trace.name,
                showarrow=False,
                xanchor='left',
                font=dict(size=12)  # Adjust the font size if needed
            ))
            y_positions.append(last_y_value)

    fig.add_layout_image(
            create_labs_logo_dict()
    )

    # Update layout for axes, title, and other styling
    fig.update_layout(
        width=1600,
        height=800,
        xaxis=dict(
            title='',
            showgrid=style_config['showgrid'],
            gridwidth=style_config['gridwidth'],
            gridcolor=style_config['grid_color'],
            # tickangle=style_config['tick_angle'],
            linecolor=style_config['line_color_axis'],
            linewidth=style_config['linewidth'],
            titlefont=style_config['title_font_axis']
        ),
        yaxis=dict(
            title=label,
            showgrid=style_config['showgrid'],
            gridwidth=style_config['gridwidth'],
            gridcolor=style_config['grid_color'],
            tickfont=style_config['axis_font'],
            zeroline=False,
            tickformat='.0%',
            linecolor=style_config['line_color_axis'],
            linewidth=style_config['linewidth'],
            titlefont=style_config['title_font_axis']
        ),
        plot_bgcolor=style_config['background_color'],
        paper_bgcolor=style_config['background_color'],
        font=dict(color=style_config['font_color']),
        showlegend=False,  # Remove the legend
        margin=dict(l=40, r=40, t=80, b=40),
        title={
            'y': 0.98,
            'x': 0.5,
            'xanchor': 'center',
            'yanchor': 'top',
            'font': dict(size=24)
        },
        annotations=annotations  # Add annotations
    )

    fig.show()
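

The label de-overlap rule inside the chart function is greedy: a line gets a right-edge label only if its last value is at least one percentage point away from every label already placed. A minimal sketch of that rule in isolation, with `pick_labels` as a hypothetical helper and rounded example shares:

```python
def pick_labels(last_values: dict, min_gap: float = 0.01) -> list:
    """Return the names that receive a label, in first-come order."""
    kept = []
    positions = []
    for name, value in last_values.items():
        # Skip this label if it sits within min_gap of one already placed
        if not any(abs(value - pos) < min_gap for pos in positions):
            kept.append(name)
            positions.append(value)
    return kept


# Rounded, illustrative last-month shares keyed by clean_name
last_values = {
    'USA': 0.05,
    'Austin, TX': 0.151,
    'Nashville, TN': 0.1585,  # within 0.01 of Austin, so skipped
    'Houston, TX': 0.145,     # within 0.01 of Austin, so skipped
}

print(pick_labels(last_values))
```

Because the rule is first-come, which labels survive depends on trace order. If the `USA` label must always appear, one option is to process it first.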


In [42]:
multi_market_line_chart_as_pct(
    data=data,
    y='pct_new_construction_listings_of_new',
    x='date',
    title='Percentage of New Listings coming from New Construction',
    label='% of New Listings from New Construction',
    color='clean_name'
)


In [70]:
multi_market_line_chart_as_pct(
    data=data.loc[data['parcl_id'].isin([
        5826765,  # national benchmark
        # Florida metros
        2900128, 2900417, 2900213, 2899989, 2900192,
        2899822, 2900041, 2899748, 2900229,
    ])],
    y='pct_new_investor_listings_of_new',
    x='date',
    title='Percentage of New Listings coming from Investors',
    label='% of New Listings from Investors',
    color='clean_name'
)

In [69]:
# list parcl_ids for Florida metros (e.g. Cape Coral: 2899822, Tampa: 2900417); 5826765 is the US national market
data.loc[data['clean_name'].str.contains(', FL')]['parcl_id'].unique().tolist()

[2900128,
 2900417,
 2900213,
 2899989,
 2900192,
 2899822,
 2900041,
 2899748,
 2900229]