# Welcome to the Lab 🥼🧪
## How do I search markets?

In this notebook, we build basic intuition for how market search works, starting with simple lookups and moving on to more advanced queries.

**Note** This notebook will work with any of the 70k+ markets supported by the Parcl Labs API.

As a reminder, you can get your Parcl Labs API key [here](https://dashboard.parcllabs.com/signup) to follow along. 

To view the technical docs for this endpoint, click [here](https://docs.parcllabs.com/reference/search_markets_v1_search_markets_get-1).

To run this immediately, you can use Google Colab. Remember, you must set your `PARCL_LABS_API_KEY` as a secret. See this [guide](https://medium.com/@parthdasawant/how-to-use-secrets-in-google-colab-450c38e3ec75) for more information.

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ParclLabs/parcllabs-examples/blob/main/python/introduction/search.ipynb)

In [1]:
import os
import sys
import json
import subprocess
from datetime import datetime
from urllib.request import urlopen

# Colab setup when launched from the badge above
if "google.colab" in sys.modules:
    from google.colab import userdata
    %pip install parcllabs plotly kaleido
    api_key = userdata.get('PARCL_LABS_API_KEY')
else:
    api_key = os.getenv('PARCL_LABS_API_KEY')

In [2]:
import parcllabs
from parcllabs import ParclLabsClient

print(f"Parcl Labs Version: {parcllabs.__version__}")

Parcl Labs Version: 0.2.0


In [3]:
# Initialize the Parcl Labs client
client = ParclLabsClient(api_key)

In [14]:
# Search for a specific market by name and type
# In this case, we are going to search for New York CBSA (Core Based Statistical Area, also known as Metro Area)
market = client.search_markets.retrieve(
    query='New York',
    location_type='CBSA',
)

market

{'items': [{'parcl_id': 2900187,
   'country': 'USA',
   'geoid': '35620',
   'state_fips_code': None,
   'name': 'New York-Newark-Jersey City, Ny-Nj-Pa',
   'state_abbreviation': None,
   'region': None,
   'location_type': 'CBSA',
   'total_population': 19908595,
   'median_income': 93610,
   'parcl_exchange_market': 0,
   'pricefeed_market': 1,
   'case_shiller_10_market': 1,
   'case_shiller_20_market': 1}],
 'total': 1,
 'limit': 12,
 'offset': 0,
 'links': {'first': 'https://api.parcllabs.com/v1/search/markets?query=New+York&location_type=CBSA&limit=12&offset=0',
  'last': 'https://api.parcllabs.com/v1/search/markets?query=New+York&location_type=CBSA&limit=12&offset=0',
  'self': 'https://api.parcllabs.com/v1/search/markets?query=New+York&location_type=CBSA&limit=12',
  'next': None,
  'prev': None}}
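
Because matches come back as a list under `items`, indexing `items[0]` directly raises an `IndexError` when a query matches nothing. Below is a minimal sketch of a guard, assuming the response shape printed above. `first_market` is a hypothetical helper written for this example, not part of the `parcllabs` package.

```python
def first_market(response):
    """Return the first item of a search response, or None if nothing matched.

    `response` is assumed to be a dict shaped like the output above,
    with a list of market dicts under the 'items' key.
    """
    items = response.get("items") or []
    return items[0] if items else None


# Trimmed copy of the response printed above
sample = {
    "items": [
        {
            "parcl_id": 2900187,
            "name": "New York-Newark-Jersey City, Ny-Nj-Pa",
            "location_type": "CBSA",
        }
    ],
    "total": 1,
    "limit": 12,
    "offset": 0,
}

match = first_market(sample)
print(match["parcl_id"])  # 2900187
print(first_market({"items": [], "total": 0}))  # None
```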

In [15]:
# The key thing to note here is the parcl_id, the unique identifier for the market.
# It lets us distinguish between markets that share a name and gives us a common language for describing geographic areas.
# Why can't we use FIPS codes? Boundaries change over time, whereas a parcl_id is unique to a boundary at a point in time.
# FIPS codes can also start with 0, which can cause issues with some programming languages and data conversions.

print(f"Market Parcl ID: {market['items'][0]['parcl_id']} -- Market Name: {market['items'][0]['name']}")

Market Parcl ID: 2900187 -- Market Name: New York-Newark-Jersey City, Ny-Nj-Pa
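
The leading-zero problem is easy to reproduce. The sketch below uses plain Python and the five-digit county FIPS code for Los Angeles County, California (`06037`). Once the code passes through an integer, the zero is gone, and it can only be restored if you already know the expected width.

```python
fips_as_text = "06037"  # Los Angeles County, CA (state 06 + county 037)

# Parsing the code as a number silently drops the leading zero
as_number = int(fips_as_text)
print(as_number)  # 6037

# Restoring it requires knowing that county FIPS codes are 5 digits wide
restored = str(as_number).zfill(5)
print(restored)  # 06037
```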


In [8]:
# Now let's get this as a DataFrame, which is easier to work with once we start
# analyzing the data and finding more markets
market_df = client.search_markets.retrieve(
    query='New York',
    location_type='CBSA',
    as_dataframe=True
)

market_df

Unnamed: 0,parcl_id,country,geoid,state_fips_code,name,state_abbreviation,region,location_type,total_population,median_income,parcl_exchange_market,pricefeed_market,case_shiller_10_market,case_shiller_20_market
0,2900187,USA,35620,,"New York-Newark-Jersey City, Ny-Nj-Pa",,,CBSA,19908595,93610,0,1,1,1


In [36]:
# Now let's say you want to analyze the entire country
market_df = client.search_markets.retrieve(
    query='United States',
    as_dataframe=True
)

market_df

Unnamed: 0,parcl_id,country,geoid,state_fips_code,name,state_abbreviation,region,location_type,total_population,median_income,parcl_exchange_market,pricefeed_market,case_shiller_10_market,case_shiller_20_market
0,5826765,USA,,,United States Of America,,,COUNTRY,331097593,75149,1,1,0,0


In [10]:
# Now let's search for New York City (the five boroughs)
market_df = client.search_markets.retrieve(
    query='New York City',
    location_type='CITY',
    as_dataframe=True
)

market_df

Unnamed: 0,parcl_id,country,geoid,state_fips_code,name,state_abbreviation,region,location_type,total_population,median_income,parcl_exchange_market,pricefeed_market,case_shiller_10_market,case_shiller_20_market
0,5372594,USA,3651000,36,New York City,NY,MIDDLE_ATLANTIC,CITY,8622467,76607,1,1,0,0


In [11]:
# Now let's search for a specific zip code in New York City
market_df = client.search_markets.retrieve(
    query='10013',
    location_type='ZIP5',
    as_dataframe=True
)
market_df

Unnamed: 0,parcl_id,country,geoid,state_fips_code,name,state_abbreviation,region,location_type,total_population,median_income,parcl_exchange_market,pricefeed_market,case_shiller_10_market,case_shiller_20_market
0,5281250,USA,,36,10013,NY,MIDDLE_ATLANTIC,ZIP5,29453,150675,0,0,0,0


In [20]:
# Now let's search for all zip codes in New York State
market_df = client.search_markets.retrieve(
    state_abbreviation='NY',
    location_type='ZIP5',
    as_dataframe=True,
    params={'limit': 1000},  # expand the default limit of 12 to 1000
    auto_paginate=True # Traverse all results until we have obtained all the zip codes
)
market_df

Unnamed: 0,parcl_id,country,geoid,state_fips_code,name,state_abbreviation,region,location_type,total_population,median_income,parcl_exchange_market,pricefeed_market,case_shiller_10_market,case_shiller_20_market
0,5453067,USA,,36,11368,NY,MIDDLE_ATLANTIC,ZIP5,112750,69391.0,0,0,0,0
1,5358467,USA,,36,11208,NY,MIDDLE_ATLANTIC,ZIP5,108180,56298.0,0,0,0,0
2,5453121,USA,,36,11385,NY,MIDDLE_ATLANTIC,ZIP5,105521,85127.0,0,0,0,0
3,5452957,USA,,36,11373,NY,MIDDLE_ATLANTIC,ZIP5,102618,67489.0,0,0,0,0
4,5358103,USA,,36,11226,NY,MIDDLE_ATLANTIC,ZIP5,101053,75947.0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1771,5453171,USA,,36,10020,NY,MIDDLE_ATLANTIC,ZIP5,0,,0,0,0,0
1772,5452918,USA,,36,12722,NY,MIDDLE_ATLANTIC,ZIP5,0,,0,0,0,0
1773,5269128,USA,,36,11931,NY,MIDDLE_ATLANTIC,ZIP5,0,,0,0,0,0
1774,5358566,USA,,36,10165,NY,MIDDLE_ATLANTIC,ZIP5,0,,0,0,0,0
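
`auto_paginate=True` hides the paging loop. As a rough illustration of what walking pages by `limit` and `offset` involves, here is a sketch against an in-memory stand-in. `fetch_page` and `fetch_all` are hypothetical functions written for this example and say nothing about how the `parcllabs` client is actually implemented.

```python
def fetch_page(records, limit, offset):
    """Stand-in for one API call: return a page shaped like the search response."""
    return {
        "items": records[offset:offset + limit],
        "total": len(records),
        "limit": limit,
        "offset": offset,
    }


def fetch_all(records, limit):
    """Request pages until `total` items have been collected."""
    collected = []
    offset = 0
    while True:
        page = fetch_page(records, limit, offset)
        collected.extend(page["items"])
        offset += limit
        # Stop once the next offset is past the end, or a page comes back empty
        if offset >= page["total"] or not page["items"]:
            break
    return collected


# 1,775 fake rows, matching the row count of the New York ZIP5 result above
fake_records = [{"row": i} for i in range(1775)]
all_rows = fetch_all(fake_records, limit=1000)
print(len(all_rows))  # 1775
```

With `limit=1000`, the loop makes two requests: one returning 1,000 rows and one returning the remaining 775.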


In [21]:
# Now let's say we know the parcl_id and want to get the market details
market_df = client.search_markets.retrieve(
    parcl_id=2900187, # New York CBSA
    as_dataframe=True
)
market_df

Unnamed: 0,parcl_id,country,geoid,state_fips_code,name,state_abbreviation,region,location_type,total_population,median_income,parcl_exchange_market,pricefeed_market,case_shiller_10_market,case_shiller_20_market
0,2900187,USA,35620,,"New York-Newark-Jersey City, Ny-Nj-Pa",,,CBSA,19908595,93610,0,1,1,1


In [22]:
import pandas as pd

# Now let's say we organize our data internally by FIPS codes and want the market details
# and parcl_ids so we can join against external data
fips = [
    '36061', # New York County
    '36047', # Kings County
    '36081', # Queens County
    '36085', # Richmond County
    '36005', # Bronx County
]

all_markets = []

for fip in fips:
    market_df = client.search_markets.retrieve(
        geoid=fip,
        as_dataframe=True
    )
    all_markets.append(market_df)

all_markets_df = pd.concat(all_markets)
all_markets_df

Unnamed: 0,parcl_id,country,geoid,state_fips_code,name,state_abbreviation,region,location_type,total_population,median_income,parcl_exchange_market,pricefeed_market,case_shiller_10_market,case_shiller_20_market
0,5822484,USA,36061,36,New York County,NY,MIDDLE_ATLANTIC,COUNTY,1645867,99880,0,1,0,0
0,5822447,USA,36047,36,Kings County,NY,MIDDLE_ATLANTIC,COUNTY,2679620,74692,1,1,0,0
0,5822371,USA,36081,36,Queens County,NY,MIDDLE_ATLANTIC,COUNTY,2360826,82431,0,0,0,0
0,5821951,USA,36085,36,Richmond County,NY,MIDDLE_ATLANTIC,COUNTY,492925,96185,0,0,0,0
0,5821247,USA,36005,36,Bronx County,NY,MIDDLE_ATLANTIC,COUNTY,1443229,47036,0,0,0,0
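
Once each FIPS code has a `parcl_id`, the join to internal data is a key lookup. Here is a minimal sketch using plain dictionaries. The `geoid` and `parcl_id` values are copied from the output above, while `internal_rows` and its `analyst` field are made-up placeholders for whatever you store internally.

```python
# geoid -> parcl_id, copied from the search output above
geoid_to_parcl_id = {
    "36061": 5822484,  # New York County
    "36047": 5822447,  # Kings County
    "36081": 5822371,  # Queens County
    "36085": 5821951,  # Richmond County
    "36005": 5821247,  # Bronx County
}

# Hypothetical internal table keyed by FIPS code (kept as strings)
internal_rows = [
    {"fips": "36061", "analyst": "A"},
    {"fips": "36047", "analyst": "B"},
    {"fips": "99999", "analyst": "C"},  # no matching market
]

# Left join: keep every internal row, with parcl_id set to None when unmatched
joined = [
    {**row, "parcl_id": geoid_to_parcl_id.get(row["fips"])}
    for row in internal_rows
]

for row in joined:
    print(row)
```

Keeping the FIPS codes as strings on both sides avoids the leading-zero mismatch described earlier.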


In [27]:
# Now let's say I want the top 100 metros in the country by population
market_df = client.search_markets.retrieve(
    location_type='CBSA',
    sort_by='TOTAL_POPULATION',
    sort_order='DESC', # most populous first
    as_dataframe=True,
    params={'limit': 100} # truncate to top 100
)

market_df.head(5)

Unnamed: 0,parcl_id,country,geoid,state_fips_code,name,state_abbreviation,region,location_type,total_population,median_income,parcl_exchange_market,pricefeed_market,case_shiller_10_market,case_shiller_20_market
0,2900187,USA,35620,,"New York-Newark-Jersey City, Ny-Nj-Pa",,,CBSA,19908595,93610,0,1,1,1
1,2900078,USA,31080,,"Los Angeles-Long Beach-Anaheim, Ca",,,CBSA,13111917,89105,0,1,1,1
2,2899845,USA,16980,,"Chicago-Naperville-Elgin, Il-In-Wi",,,CBSA,9566955,85087,0,1,1,1
3,2899734,USA,19100,,"Dallas-Fort Worth-Arlington, Tx",,,CBSA,7673379,83398,0,1,0,1
4,2899967,USA,26420,,"Houston-The Woodlands-Sugar Land, Tx",,,CBSA,7142603,78061,0,1,0,0


In [26]:
# Same query, sorted by median income instead
market_df = client.search_markets.retrieve(
    location_type='CBSA',
    sort_by='MEDIAN_INCOME',
    sort_order='DESC', # highest median income first
    as_dataframe=True,
    params={'limit': 100} # truncate to top 100
)

market_df.head(5)

Unnamed: 0,parcl_id,country,geoid,state_fips_code,name,state_abbreviation,region,location_type,total_population,median_income,parcl_exchange_market,pricefeed_market,case_shiller_10_market,case_shiller_20_market
0,2900338,USA,41940,,"San Jose-Sunnyvale-Santa Clara, Ca",,,CBSA,1981584,151713,0,1,0,0
1,2900077,USA,31060,,"Los Alamos, Nm",,,CBSA,19253,135801,0,0,0,0
2,2900336,USA,41860,,"San Francisco-Oakland-Berkeley, Ca",,,CBSA,4692242,129315,0,1,1,1
3,2900475,USA,47900,,"Washington-Arlington-Alexandria, Dc-Va-Md-Wv",,,CBSA,6346083,119803,0,1,1,1
4,2899948,USA,25720,,"Heber, Ut",,,CBSA,77533,114857,0,0,0,0
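
To make the effect of `sort_by` and `sort_order='DESC'` concrete, here is a local sketch that ranks three metros from the two outputs above both ways. It uses Python's built-in `sorted` on copied rows and does not call the API.

```python
# Rows copied from the population and income outputs above
metros = [
    {"name": "New York-Newark-Jersey City, Ny-Nj-Pa", "total_population": 19908595, "median_income": 93610},
    {"name": "Los Angeles-Long Beach-Anaheim, Ca", "total_population": 13111917, "median_income": 89105},
    {"name": "San Jose-Sunnyvale-Santa Clara, Ca", "total_population": 1981584, "median_income": 151713},
]

# reverse=True plays the role of sort_order='DESC'
by_population = sorted(metros, key=lambda m: m["total_population"], reverse=True)
by_income = sorted(metros, key=lambda m: m["median_income"], reverse=True)

print([m["name"].split("-")[0] for m in by_population])  # ['New York', 'Los Angeles', 'San Jose']
print([m["name"].split("-")[0] for m in by_income])      # ['San Jose', 'New York', 'Los Angeles']
```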


In [28]:
# Now let's say you want all cities in the EAST_NORTH_CENTRAL Census Region,
# ordered by total_population
market_df = client.search_markets.retrieve(
    region='EAST_NORTH_CENTRAL',
    location_type='CITY',
    as_dataframe=True,
    params={'limit': 1000},  # expand the default limit of 12 to 1000
    sort_by='TOTAL_POPULATION',
    sort_order='DESC', # most populous first
    auto_paginate=True # Traverse all results until we have obtained all the cities
)

market_df.head(5)

Unnamed: 0,parcl_id,country,geoid,state_fips_code,name,state_abbreviation,region,location_type,total_population,median_income,parcl_exchange_market,pricefeed_market,case_shiller_10_market,case_shiller_20_market
0,5387853,USA,1714000,17,Chicago City,IL,EAST_NORTH_CENTRAL,CITY,2721914,71673.0,1,1,0,0
1,5332060,USA,3918000,39,Columbus City,OH,EAST_NORTH_CENTRAL,CITY,902449,62994.0,0,1,0,0
2,5288667,USA,1836003,18,Indianapolis City (Balance),IN,EAST_NORTH_CENTRAL,CITY,882006,59110.0,0,0,0,0
3,5278514,USA,2622000,26,Detroit City,MI,EAST_NORTH_CENTRAL,CITY,636787,37761.0,0,1,0,0
4,5333209,USA,5553000,55,Milwaukee City,WI,EAST_NORTH_CENTRAL,CITY,573299,49733.0,0,1,0,0


In [31]:
# Now let's say you want markets that we have a daily price feed for.
# Sorting by PRICEFEED_MARKET lists the markets with that flag set first, as the output shows.
market_df = client.search_markets.retrieve(
    sort_by='PRICEFEED_MARKET',
    as_dataframe=True,
    params={'limit': 100},  # raise the default limit of 12 to 100
)

market_df.head(5)

Unnamed: 0,parcl_id,country,geoid,state_fips_code,name,state_abbreviation,region,location_type,total_population,median_income,parcl_exchange_market,pricefeed_market,case_shiller_10_market,case_shiller_20_market
0,2900122,USA,32820,,"Memphis, Tn-Ms-Ar",,,CBSA,1335804.0,62178.0,0,1,0,0
1,2900417,USA,45300,,"Tampa-St. Petersburg-Clearwater, Fl",,,CBSA,3194310.0,67197.0,0,1,0,1
2,2899989,USA,27260,,"Jacksonville, Fl",,,CBSA,1613587.0,73194.0,0,1,0,0
3,5290547,USA,3755000,37.0,Raleigh City,NC,SOUTH_ATLANTIC,CITY,465517.0,78631.0,0,1,0,0
4,2899626,USA,14500,,"Boulder, Co",,,CBSA,328658.0,99770.0,0,1,0,0


In [33]:
# Now let's say you want the price feed markets that are on the Parcl Exchange
market_df = client.search_markets.retrieve(
    sort_by='PARCL_EXCHANGE_MARKET',
    as_dataframe=True,
    params={'limit': 14},  # raise the default limit of 12 to 14
)

market_df

Unnamed: 0,parcl_id,country,geoid,state_fips_code,name,state_abbreviation,region,location_type,total_population,median_income,parcl_exchange_market,pricefeed_market,case_shiller_10_market,case_shiller_20_market
0,5384169,USA,1304000.0,13.0,Atlanta City,GA,SOUTH_ATLANTIC,CITY,494838,77655,1,1,0,0
1,5407714,USA,2507000.0,25.0,Boston City,MA,NEW_ENGLAND,CITY,665945,89212,1,1,0,0
2,5387853,USA,1714000.0,17.0,Chicago City,IL,EAST_NORTH_CENTRAL,CITY,2721914,71673,1,1,0,0
3,5380879,USA,4805000.0,48.0,Austin City,TX,WEST_SOUTH_CENTRAL,CITY,958202,86556,1,1,0,0
4,5353022,USA,1245025.0,12.0,Miami Beach City,FL,SOUTH_ATLANTIC,CITY,82400,65116,1,1,0,0
5,5377230,USA,3240000.0,32.0,Las Vegas City,NV,MOUNTAIN,CITY,644835,66356,1,1,0,0
6,5503877,USA,1150000.0,11.0,Washington City,DC,SOUTH_ATLANTIC,CITY,670587,101722,1,1,0,0
7,5822447,USA,36047.0,36.0,Kings County,NY,MIDDLE_ATLANTIC,COUNTY,2679620,74692,1,1,0,0
8,5372594,USA,3651000.0,36.0,New York City,NY,MIDDLE_ATLANTIC,CITY,8622467,76607,1,1,0,0
9,5373892,USA,644000.0,6.0,Los Angeles City,CA,PACIFIC,CITY,3881041,76244,1,1,0,0
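
Sorting orders results rather than filtering them, so it is safer not to assume that every returned row has the flag set. Here is a sketch of filtering on the flag columns after the fact, using three county rows copied from an earlier output in this notebook.

```python
# Rows copied from the county search output earlier in this notebook
rows = [
    {"parcl_id": 5822484, "name": "New York County", "parcl_exchange_market": 0, "pricefeed_market": 1},
    {"parcl_id": 5822447, "name": "Kings County", "parcl_exchange_market": 1, "pricefeed_market": 1},
    {"parcl_id": 5822371, "name": "Queens County", "parcl_exchange_market": 0, "pricefeed_market": 0},
]

# Keep only markets that are both price feed markets and on the exchange
on_exchange = [
    r for r in rows
    if r["parcl_exchange_market"] == 1 and r["pricefeed_market"] == 1
]

print([r["name"] for r in on_exchange])  # ['Kings County']
```

On a DataFrame, the equivalent is a boolean mask such as `market_df[(market_df['parcl_exchange_market'] == 1) & (market_df['pricefeed_market'] == 1)]`.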


In [34]:
# Now you have a basic understanding of search. It is the entry point to the Parcl Labs ecosystem:
# the parcl_ids it returns can be passed to the other endpoints, for example

# find supply/demand for many markets
supply_demand = client.market_metrics_housing_event_counts.retrieve_many(
    parcl_ids=market_df['parcl_id'].tolist(),
    as_dataframe=True,
    params={'limit': 1} # get most recent
)

supply_demand # note 5826765 corresponds to the entire country

|████████████████████████████████████████| 14/14 [100%] in 1.9s (7.51/s) 


Unnamed: 0,date,sales,new_listings_for_sale,new_rental_listings,parcl_id
0,2024-04-01,1183,1592,7964,5384169
1,2024-04-01,1081,1687,11752,5407714
2,2024-04-01,5392,3345,18027,5387853
3,2024-04-01,840,2072,11724,5380879
4,2024-04-01,293,503,1297,5353022
5,2024-04-01,1996,1696,2636,5377230
6,2024-04-01,1164,1429,7745,5503877
7,2024-04-01,1586,1512,2039,5822447
8,2024-04-01,5573,6092,9329,5372594
9,2024-04-01,5621,3526,13655,5373892
