<center>
<h1>Welcome to the Lab 🥼🧪</h1>
</center>

## Deep dive analysis into a market

In this notebook we take a deep dive into the dynamics of a single market. We use the property search endpoint to identify all single family homes currently listed for sale in the Washington, D.C. market, then build a map showing where that inventory is located.

#### What will you create in this notebook?

##### Location of Single Family Homes for Sale in Washington DC
<p align="center">
  <img src="../../../images/dc_properties_map.png" alt="Map of single family homes for sale in Washington DC">
</p>

#### Need help getting started?

As a reminder, you can get your Parcl Labs API key [here](https://dashboard.parcllabs.com/signup) to follow along.

To run this immediately, you can use Google Colab. Remember, you must set your `PARCL_LABS_API_KEY`.

Run in Colab --> [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ParclLabs/parcllabs-cookbook/blob/main/examples/experimental/supply_and_demand/active_inventory_dc.ipynb)

### Import required packages and setup the Parcl Labs API key

In [None]:
# if needed, install and/or upgrade to the latest version of the Parcl Labs Python library
%pip install --upgrade parcllabs geopandas plotly

In [19]:
import os
import pandas as pd
import geopandas as gpd
import plotly.express as px
from datetime import datetime
import plotly.graph_objects as go
from parcllabs import ParclLabsClient
import plotly.io as pio
from PIL import Image
import base64
from io import BytesIO


# Create a ParclLabsClient instance
client = ParclLabsClient(
    api_key=os.environ.get('PARCL_LABS_API_KEY', "<your Parcl Labs API key if not set as environment variable>"), 
    limit=1000, 
    turbo_mode=True # set turbo mode to True
)

In [None]:
# We now search for the Washington DC market so we can identify the parcl id to download the data
markets = client.search.markets.retrieve(
    query = 'Washington',
    location_type = 'city',
    sort_by='TOTAL_POPULATION',  # Sort by total population
    sort_order='DESC',           # In descending order
    limit=100                    # Limit results to the top 100 matches
)
# subset to only include the Washington DC market by selecting the first row
market_for_analysis_id = int(markets.iloc[0]['parcl_id']) # make sure the value is an integer
market_for_analysis_name = markets.iloc[0]['name']
market_for_analysis_state = markets.iloc[0]['state_abbreviation']
print(f'The market for analysis id is {market_for_analysis_id} and the name is {market_for_analysis_name}'
      f' in the {market_for_analysis_state} state')
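
Selecting `markets.iloc[0]` relies on the population sort placing Washington, D.C. in the first row. A more defensive selection could filter on the state abbreviation before taking the top row. The sketch below uses a small hand-made DataFrame rather than real API output: the `parcl_id` values and populations are placeholders, the `total_population` column name is an assumption, and `pick_market` is a hypothetical helper.

```python
import pandas as pd

# Toy stand-in for the `markets` DataFrame; all values are made up for illustration.
toy_markets = pd.DataFrame(
    {
        "parcl_id": [111, 222, 333],
        "name": ["Washington City", "Washington City", "Washington Township"],
        "state_abbreviation": ["DC", "UT", "NJ"],
        "total_population": [670000, 28000, 48000],  # assumed column name
    }
)


def pick_market(markets: pd.DataFrame, state: str) -> pd.Series:
    """Return the most populous row for the given state, or raise if none match."""
    matches = markets[markets["state_abbreviation"] == state]
    if matches.empty:
        raise ValueError(f"No market found for state {state!r}")
    return matches.sort_values("total_population", ascending=False).iloc[0]


row = pick_market(toy_markets, "DC")
market_id = int(row["parcl_id"])  # make sure the value is a plain int
print(market_id, row["name"], row["state_abbreviation"])
```

Raising on an empty match makes a changed search result fail loudly instead of silently analyzing the wrong market.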



This API call retrieves all single family homes currently for sale in Washington, D.C. The endpoint supports a wide variety of filters (see the [documentation](https://docs.parcllabs.com/reference/search_v1_property_search_get)), but here we only need `property_type` and `current_on_market_flag`, which indicates whether a property is on the market at the time of the query. This new API functionality is updated daily, so the results reflect the most up-to-date view of the real estate market.

Calling this endpoint **can consume a lot of credits**, because it pulls every active property in the selected market, so be careful.

In [None]:
# now we can search for all the single family homes for sale in the Washington DC market

# Define the search parameters
search_params = {
    'parcl_ids': [market_for_analysis_id],  # Required
    'property_type': 'SINGLE_FAMILY',  # Required
    'current_on_market_flag':True,
    #'current_owner_occupied_flag': True,
    #'current_investor_owned_flag': False
}

# We search for properties in the market we defined above using the parameters that are not commented out.
active_properties_dc = client.property.search.retrieve(**search_params)

print(f"Found {len(active_properties_dc)} active single family homes in Washington D.C.")

In [None]:
# explore the data
active_properties_dc.head()
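
Because each search call consumes credits, it can help to cache the result on disk while iterating on the rest of the notebook. The following is a minimal sketch and not part of the Parcl Labs library: `load_or_fetch` and the cache file name are hypothetical, and `fetch` stands for any zero-argument function that returns a DataFrame (for example, a lambda wrapping the search call above).

```python
import os
import tempfile
from typing import Callable

import pandas as pd


def load_or_fetch(path: str, fetch: Callable[[], pd.DataFrame]) -> pd.DataFrame:
    """Return the cached CSV at `path` if present; otherwise call `fetch` and cache it."""
    if os.path.exists(path):
        return pd.read_csv(path)
    df = fetch()
    df.to_csv(path, index=False)
    return df


# Demonstration with a fake fetch function that counts how often it is called.
calls = {"n": 0}


def fake_fetch() -> pd.DataFrame:
    calls["n"] += 1
    return pd.DataFrame({"parcl_property_id": [1, 2], "bedrooms": [3, 4]})


with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "active_properties_cache.csv")
    first = load_or_fetch(cache_path, fake_fetch)   # calls fake_fetch and writes the CSV
    second = load_or_fetch(cache_path, fake_fetch)  # reads the CSV, no second call

print(calls["n"], len(first), len(second))
```

Since on-market status is updated daily, a cache like this should be deleted whenever fresh data is needed.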


With these results we can look up the listing prices of the homes for sale in Washington, D.C.

In [None]:
# now retrieve information for sale events to get prices
# Pass the parcl_property_ids from the search results to a list named search_results_ids to retrieve the sale events 
# for those properties.
search_results_ids = active_properties_dc['parcl_property_id'].tolist()

# Define the parameters we want to use in the search for property events.
property_events_parameters = {
    'parcl_property_ids': search_results_ids,
    'event_type': 'LISTING',
    #'entity_owner_name': None, # Specify one of the options or None
    #'start_date': '2020-01-01',
    #'end_date': '2021-01-01',
}

# unpack the property_events_parameters dictionary into the retrieve method using **
listing_events = client.property.events.retrieve(
    **property_events_parameters
    )

print(f"Found {len(listing_events)} events matching the criteria.")
print(listing_events.head(2))

In [None]:
# keep only the most recent listing event per property so we can get its latest price
# (the merge with the original properties happens in the next cell)
last_events = (listing_events
               .sort_values(by=['parcl_property_id','event_date'], ascending=[True,True])
               .groupby('parcl_property_id')
               .last()
               .reset_index())

last_events.head(2)
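
To see what the sort-then-`groupby(...).last()` pattern does, here is a small sketch on made-up listing events (the IDs, dates, and prices are illustrative only). It also shows one caveat worth knowing: pandas' `GroupBy.last()` takes the last non-null value of each column independently, so a missing price on the newest event is filled from an older one. `drop_duplicates(keep="last")` is shown as an alternative that keeps the newest row intact.

```python
import pandas as pd

# Toy listing events; all values are made up for illustration.
toy_events = pd.DataFrame(
    {
        "parcl_property_id": [10, 10, 20, 20],
        "event_date": ["2024-03-01", "2024-01-15", "2024-02-10", "2024-04-01"],
        "price": [510000.0, 525000.0, 800000.0, float("nan")],
    }
)

# ISO-formatted date strings sort chronologically.
sorted_events = toy_events.sort_values(by=["parcl_property_id", "event_date"])

# Same pattern as the cell above: last non-null value per column within each group.
latest_values = sorted_events.groupby("parcl_property_id").last().reset_index()

# Alternative: keep the newest row as-is, including any missing values.
latest_rows = sorted_events.drop_duplicates("parcl_property_id", keep="last")

print(latest_values)
print(latest_rows)
```

For property 20, `latest_values` pairs the 2024-04-01 date with the 800,000 price from the earlier event, while `latest_rows` keeps the missing price. Which behavior is preferable depends on whether an older price is an acceptable fallback.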


In [None]:
# merge for the map
pd.set_option('display.max_columns', None)
listing_events_final = active_properties_dc.merge(last_events[[
    'parcl_property_id',
    'event_date',
    'price'
    ]], 
    on='parcl_property_id', 
    how='left')
listing_events_final.head(1)
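
With `how='left'`, every property from the search is kept, and properties without a matching listing event end up with a missing `price`. A quick way to check how many rows are affected is pandas' `indicator=True` option, sketched here on made-up data (the frames and values are illustrative, not API output).

```python
import pandas as pd

# Toy inputs; all values are made up for illustration.
toy_properties = pd.DataFrame(
    {"parcl_property_id": [10, 20, 30], "bedrooms": [3, 4, 2]}
)
toy_last_events = pd.DataFrame(
    {
        "parcl_property_id": [10, 20],
        "event_date": ["2024-03-01", "2024-02-10"],
        "price": [510000, 800000],
    }
)

merged = toy_properties.merge(
    toy_last_events[["parcl_property_id", "event_date", "price"]],
    on="parcl_property_id",
    how="left",
    indicator=True,  # adds a `_merge` column: "both" or "left_only"
)

missing_price = int((merged["_merge"] == "left_only").sum())
print(merged)
print(f"{missing_price} of {len(merged)} properties have no listing event")
```

Rows flagged `left_only` will show no price in the map's hover text, so a large count here is worth investigating before plotting.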

In [28]:

# Data preparation - make sure lat/long columns are properly formatted
listing_events_final['latitude'] = pd.to_numeric(listing_events_final['latitude'], errors='coerce')
listing_events_final['longitude'] = pd.to_numeric(listing_events_final['longitude'], errors='coerce')

# Drop rows with invalid coordinates
listing_events_final = listing_events_final.dropna(subset=['latitude', 'longitude'])

# Create hover text for each property with improved typography
hover_text = []
for idx, row in listing_events_final.iterrows():
    property_info = []
    
    # Add address if available - making it more prominent
    if 'address' in row and not pd.isna(row['address']):
        property_info.append(f"<b style='font-family: Arial, sans-serif; font-size: 14px;'>{row['address']}</b>")
    
    # Add city, state, zip if available
    location_parts = []
    if 'city' in row and not pd.isna(row['city']):
        location_parts.append(row['city'])
    if 'state_abbreviation' in row and not pd.isna(row['state_abbreviation']):
        location_parts.append(row['state_abbreviation'])
    if 'zip_code' in row and not pd.isna(row['zip_code']):
        location_parts.append(str(row['zip_code']))
    
    if location_parts:
        property_info.append(f"<span style='font-family: Arial, sans-serif;'>{', '.join(location_parts)}</span>")
    
    # Add property details with improved labels and readability
    if 'property_type' in row and not pd.isna(row['property_type']):
        property_info.append(f"<span style='font-family: Arial, sans-serif;'><b>Type:</b> {row['property_type']}</span>")
    if 'bedrooms' in row and not pd.isna(row['bedrooms']):
        property_info.append(f"<span style='font-family: Arial, sans-serif;'><b>Beds:</b> {row['bedrooms']}</span>")
    if 'bathrooms' in row and not pd.isna(row['bathrooms']):
        property_info.append(f"<span style='font-family: Arial, sans-serif;'><b>Baths:</b> {row['bathrooms']}</span>")
    if 'square_footage' in row and not pd.isna(row['square_footage']):
        property_info.append(f"<span style='font-family: Arial, sans-serif;'><b>Sq Ft:</b> {row['square_footage']:,.0f}</span>")
    if 'year_built' in row and not pd.isna(row['year_built']):
        property_info.append(f"<span style='font-family: Arial, sans-serif;'><b>Year Built:</b> {int(row['year_built'])}</span>")
    if 'price' in row and not pd.isna(row['price']):
        property_info.append(f"<span style='font-family: Arial, sans-serif;'><b>Price:</b> ${row['price']:,.0f}</span>")
    
    hover_text.append("<br>".join(property_info))

listing_events_final['hover_text'] = hover_text

# Create the map using scatter_mapbox with fixed figure size
fig = px.scatter_mapbox(
    listing_events_final, 
    lat="latitude", 
    lon="longitude", 
    hover_name="address" if "address" in listing_events_final.columns else None,
    hover_data={"latitude": False, "longitude": False},
    custom_data=["hover_text"],
    color_discrete_sequence=["#1f77b4"],  # marker color; this hex blue is an assumed value
    zoom=11, 
    height=800,
    width=1200,
    title="Washington DC Properties")

# Update the hover template to use our custom hover text
fig.update_traces(
    hovertemplate="%{customdata[0]}",
    marker=dict(opacity=0.65)  # Reduced opacity for better map readability
)

# Use a dark theme map style
fig.update_layout(
    mapbox_style="carto-darkmatter",
    mapbox=dict(
        center=dict(lat=38.9072, lon=-77.0369),  # Center on DC
        zoom=11
    ),
    margin=dict(l=0, r=0, t=50, b=0),
    paper_bgcolor="rgb(30, 30, 30)",
    plot_bgcolor="rgb(30, 30, 30)",
    title=dict(
        text="Washington DC Active Properties",
        x=0.5,
        xanchor="center",
        font=dict(family="Arial, sans-serif", size=24, color="white")
    ),
    hoverlabel=dict(
        bgcolor="rgba(50, 50, 50, 0.9)",
        bordercolor="white",
        font=dict(
            family="Arial, sans-serif",
            size=14,
            color="white"
        )
    )
)

# Add logo to the lower right corner
logo_path = '../../../images/Logo_ParclLabs_White_Dec2024.png'
try:
    # Load the logo image
    logo_img = Image.open(logo_path)
    
    # Convert to base64 string for embedding
    buffer = BytesIO()
    logo_img.save(buffer, format="PNG")
    logo_base64 = base64.b64encode(buffer.getvalue()).decode('utf-8')
    
    # Add as layout image using base64 encoding
    fig.add_layout_image(
        dict(
            source=f'data:image/png;base64,{logo_base64}',
            xref="paper", yref="paper",
            x=0.97, y=0.05,  # Position in the lower right
            sizex=0.15, sizey=0.15,  # Size of the image relative to the plot
            xanchor="right", yanchor="bottom"  # Anchor the image by its bottom-right corner
        )
    )
except Exception as e:
    print(f"Failed to add logo: {e}")

# Save the figure as a PNG file using plotly.io
try:
    # Using plotly.io with higher resolution (scale=2)
    pio.write_image(fig, "dc_properties_map.png", scale=2)
    print("Figure saved as dc_properties_map.png")
except Exception as e:
    print(f"Failed to save image: {e}")
    print("Make sure the kaleido package is installed: pip install -U kaleido")

# Display the map
fig.show()


*scatter_mapbox* is deprecated! Use *scatter_map* instead. Learn more at: https://plotly.com/python/mapbox-to-maplibre/



Figure saved as dc_properties_map.png


In [27]:
# optionally save the active property search results to CSV (the map PNG was already saved above)
active_properties_dc.to_csv('active_properties_dc.csv', index=False)