<center>
<h1>Welcome to the Lab 🥼🧪</h1>
</center>

### How can I analyze supply, demand, and price trends for new construction?

In this notebook, we analyze new construction in the US housing market by comparing demand (the number of units sold) against the median price of new construction sales. The same workflow applies to any US location, whether a zip code, city, or metro area. To fine-tune it for your location, modify the search criteria in step 2.

#### What will you create in this notebook?

<p align="center">
  <img src="../../../images/new_construction_pricing_and_demand.png" alt="Dual-axis chart of US new construction median sales price and number of units sold">
</p>

#### Need help getting started?

As a reminder, you can get your Parcl Labs API key [here](https://dashboard.parcllabs.com/signup) to follow along.

To run this immediately, you can use Google Colab. Remember, you must set your `PARCL_LABS_API_KEY`.

Run in Colab: [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ParclLabs/parcllabs-cookbook/blob/main/examples/housing_market_research/supply_and_demand/new_construction_trends.ipynb)

### 1. Import the Parcl Labs Python Library

In [None]:
# if needed, install and/or upgrade to the latest version of the Parcl Labs Python library
%pip install --upgrade parcllabs nbformat

In [2]:
import os
import pandas as pd
from parcllabs import ParclLabsClient
from parcllabs.beta.charting.styling import SIZE_CONFIG
from parcllabs.beta.charting.default_charts import create_dual_axis_chart

client = ParclLabsClient(
    api_key=os.environ.get('PARCL_LABS_API_KEY', "<your Parcl Labs API key if not set as environment variable>"), 
    limit=12 # set default limit
)
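
If the environment variable is missing, the cell above passes the placeholder string to the client instead of a real key. A minimal sketch of failing early with a clear message instead; `require_api_key` is a hypothetical helper written for this notebook, not part of the `parcllabs` package:

```python
import os


def require_api_key(env=os.environ, name="PARCL_LABS_API_KEY"):
    """Return the API key found in `env`, or raise if it is missing or blank.

    Hypothetical helper for illustration; not part of the parcllabs package.
    """
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell or set it in the "
            "notebook before creating the client."
        )
    return key
```

You could then create the client with `ParclLabsClient(api_key=require_api_key(), limit=12)`.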

### 2. Search for Markets

In [3]:
# in this case, let's look at the US market overall
us_market = client.search.markets.retrieve(
    query='United States',
    sort_by='TOTAL_POPULATION',
    sort_order='DESC',
    limit=1
)

us_market.head()

Unnamed: 0,parcl_id,country,geoid,state_fips_code,name,state_abbreviation,region,location_type,total_population,median_income,parcl_exchange_market,pricefeed_market,case_shiller_10_market,case_shiller_20_market
0,5826765,USA,,,United States Of America,,,COUNTRY,331097593,75149,1,1,0,0


In [4]:
us_parcl_id = us_market['parcl_id'].values[0]
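
Indexing `.values[0]` raises an `IndexError` when a search returns no rows, which can happen once you change the query for your own location. The sketch below shows one way to guard against that. `first_parcl_id` is a hypothetical helper, and the one-row frame stands in for the search result shown above so the example runs without an API call:

```python
import pandas as pd


def first_parcl_id(markets: pd.DataFrame) -> int:
    """Return the parcl_id of the first row, or raise if the search was empty.

    Hypothetical helper for illustration; not part of the parcllabs package.
    """
    if markets.empty:
        raise ValueError("Search returned no markets; adjust the query.")
    # Convert to a plain Python int rather than a NumPy scalar.
    return int(markets["parcl_id"].iloc[0])


# Stand-in for `us_market`, using values from the output above.
example_market = pd.DataFrame(
    {
        "parcl_id": [5826765],
        "name": ["United States Of America"],
        "location_type": ["COUNTRY"],
    }
)
example_parcl_id = first_parcl_id(example_market)
```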

### 3. Retrieve the Data

In [5]:
# let's get new construction demand and supply counts in addition to prices
new_construction_housing_event_prices = client.new_construction_metrics.housing_event_prices.retrieve(
    parcl_ids=[us_parcl_id],
    limit=100, # let's get the full series
)

new_construction_housing_event_counts = client.new_construction_metrics.housing_event_counts.retrieve(
    parcl_ids=[us_parcl_id],
    limit=100
)



In [6]:
new_construction_housing_event_prices.head()

Unnamed: 0,date,price_median_sales,price_median_new_listings_for_sale,price_median_new_rental_listings,price_per_square_foot_median_sales,price_per_square_foot_median_new_listings_for_sale,price_per_square_foot_median_new_rental_listings,parcl_id,property_type
0,2024-05-01,430000.0,462989.0,2300.0,213.7,221.03,1.42,5826765,ALL_PROPERTIES
1,2024-04-01,425000.0,464700.0,2295.0,213.35,220.05,1.4,5826765,ALL_PROPERTIES
2,2024-03-01,428950.0,459995.0,2290.0,211.52,219.82,1.41,5826765,ALL_PROPERTIES
3,2024-02-01,426490.0,449160.0,2200.0,209.3,217.53,1.39,5826765,ALL_PROPERTIES
4,2024-01-01,413911.0,459000.0,2250.0,206.2,216.9,1.4,5826765,ALL_PROPERTIES


In [7]:
new_construction_housing_event_counts.head()

Unnamed: 0,date,sales,new_listings_for_sale,new_rental_listings,parcl_id,property_type
0,2024-05-01,30685,39000.0,38257.0,5826765,ALL_PROPERTIES
1,2024-04-01,33180,36892.0,36475.0,5826765,ALL_PROPERTIES
2,2024-03-01,33275,35527.0,33806.0,5826765,ALL_PROPERTIES
3,2024-02-01,29958,34417.0,32334.0,5826765,ALL_PROPERTIES
4,2024-01-01,28885,34064.0,39281.0,5826765,ALL_PROPERTIES


### 4. Prepare the data for analysis/charting

In [8]:
# in this notebook, we focus on median sales price and demand counts,
# so let's merge the two dataframes on date and parcl_id
new_construction = pd.merge(
    new_construction_housing_event_prices[['parcl_id', 'date', 'price_median_sales']],
    new_construction_housing_event_counts[['parcl_id', 'date', 'sales', 'new_listings_for_sale']],
    on=['date', 'parcl_id'],
    how='inner'
)
new_construction

Unnamed: 0,parcl_id,date,price_median_sales,sales,new_listings_for_sale
0,5826765,2024-05-01,430000.0,30685,39000.0
1,5826765,2024-04-01,425000.0,33180,36892.0
2,5826765,2024-03-01,428950.0,33275,35527.0
3,5826765,2024-02-01,426490.0,29958,34417.0
4,5826765,2024-01-01,413911.0,28885,34064.0
...,...,...,...,...,...
60,5826765,2019-05-01,287707.0,115795,
61,5826765,2019-04-01,284681.0,109841,
62,5826765,2019-03-01,288000.0,103179,
63,5826765,2019-02-01,285000.0,83707,
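
An inner merge silently drops any month that appears in only one of the two dataframes. The sketch below shows one way to check for that, using pandas' `validate` and `indicator` options on an outer merge. The two small frames are stand-ins built from the first rows of the outputs above, and the names `prices`, `counts`, `checked`, and `unmatched` are illustrative:

```python
import pandas as pd

# Stand-ins for the prices and counts dataframes (values from the outputs above).
prices = pd.DataFrame(
    {
        "parcl_id": [5826765, 5826765],
        "date": ["2024-05-01", "2024-04-01"],
        "price_median_sales": [430000.0, 425000.0],
    }
)
counts = pd.DataFrame(
    {
        "parcl_id": [5826765, 5826765],
        "date": ["2024-05-01", "2024-04-01"],
        "sales": [30685, 33180],
        "new_listings_for_sale": [39000.0, 36892.0],
    }
)

# validate raises MergeError if a (date, parcl_id) key repeats on either side;
# indicator adds a `_merge` column saying where each row came from.
checked = pd.merge(
    prices,
    counts,
    on=["date", "parcl_id"],
    how="outer",
    validate="one_to_one",
    indicator=True,
)

# Rows present in only one input; empty means the inner merge loses nothing.
unmatched = checked[checked["_merge"] != "both"]
```

If `unmatched` is empty, the inner merge used in this notebook keeps every month.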


### 5. Chart the Data

In [9]:
new_construction = new_construction.sort_values('date')
# rename sales to Number of Units Sold for readability
new_construction = new_construction.rename(columns={'sales': 'Number of Units Sold'})
new_construction.head()

Unnamed: 0,parcl_id,date,price_median_sales,Number of Units Sold,new_listings_for_sale
64,5826765,2019-01-01,287240.0,86498,
63,5826765,2019-02-01,285000.0,83707,
62,5826765,2019-03-01,288000.0,103179,
61,5826765,2019-04-01,284681.0,109841,
60,5826765,2019-05-01,287707.0,115795,
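
Sorting works here because the dates are ISO-formatted strings, but parsing them as datetimes makes the ordering explicit and makes it easy to compute changes over time. The sketch below is an optional extra rather than part of the original workflow. It uses the January to May 2024 rows shown earlier, and the column names `price_mom_change` and `units_mom_change` are illustrative:

```python
import pandas as pd

# Rows copied from the January-May 2024 output shown earlier in this notebook.
sample = pd.DataFrame(
    {
        "date": ["2024-05-01", "2024-04-01", "2024-03-01", "2024-02-01", "2024-01-01"],
        "price_median_sales": [430000.0, 425000.0, 428950.0, 426490.0, 413911.0],
        "Number of Units Sold": [30685, 33180, 33275, 29958, 28885],
    }
)

# Parse dates so the sort is chronological regardless of string format.
sample["date"] = pd.to_datetime(sample["date"])
sample = sample.sort_values("date").reset_index(drop=True)

# Fractional change from the previous month; the first row has no
# prior month, so its value is NaN.
sample["price_mom_change"] = sample["price_median_sales"].pct_change()
sample["units_mom_change"] = sample["Number of Units Sold"].pct_change()
```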


In [10]:
create_dual_axis_chart(
    title='US New Construction Median Sales Price vs. Number of Units Sold',
    line_data=new_construction,
    bar1_data=new_construction,
    bar2_data=new_construction,
    line_series='price_median_sales',
    bar1_series='Number of Units Sold',
    yaxis1_title='Median Sales Price',
    yaxis2_title='Number of Units Sold',
    height=SIZE_CONFIG['x']['height'],
    width=SIZE_CONFIG['x']['width']
)