<center>
<h1>Welcome to the Lab 🥼🧪</h1>
</center>

### In this notebook, we download individual units for a given `parcl_id` and retrieve the events of interest for those units using the Parcl Labs API.


#### Need help getting started?

As a reminder, you can get your Parcl Labs API key [here](https://dashboard.parcllabs.com/signup) to follow along.

To run this immediately, you can use Google Colab. Remember, you must set your `PARCL_LABS_API_KEY`.

You will need a paid account to get your API key; you can get one [here](https://dashboard.parcllabs.com/).

Run in Colab --> [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ParclLabs/parcllabs-cookbook/blob/main/examples/getting_started/property_data_download.ipynb)

In [1]:
%pip install --upgrade parcllabs

Note: you may need to restart the kernel to use updated packages.


After installing the required libraries, we load them and start the Parcl Labs client with our `API_KEY`. The client is a Python library that handles searching, retrieving, and formatting the data, so you can focus on building rather than on request handling. You can paste your `API_KEY` directly into the code, but saving it as an environment variable is recommended for added security. If you are using Colab, you can follow these [steps](https://medium.com/@parthdasawant/how-to-use-secrets-in-google-colab-450c38e3ec75).
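
One way to enforce the environment-variable recommendation is to fail early when the key is missing, rather than passing a placeholder string to the client. The sketch below assumes the variable name `PARCL_LABS_API_KEY` used in this notebook; the helper `load_api_key` is hypothetical and not part of the `parcllabs` package.

```python
import os


def load_api_key(env=None, name="PARCL_LABS_API_KEY"):
    """Return the API key stored in the environment, or raise if it is missing.

    Illustrative helper only; it is not part of the parcllabs package.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the notebook.")
    return key


# Example with an explicit mapping so the snippet runs without a real key
print(load_api_key({"PARCL_LABS_API_KEY": "demo-key"}))
```

In the notebook, the result of `load_api_key()` could be passed as the `api_key` argument when the client is created.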


In [2]:
import os
from datetime import datetime, timedelta
import pandas as pd
from parcllabs import ParclLabsClient

In [3]:
# Instantiate the client and make sure we have a folder to download the data
client = ParclLabsClient(
    api_key=os.environ.get('PARCL_LABS_API_KEY', "<your Parcl Labs API key if not set as environment variable>"), 
    limit=10 # set default limit
)

# Create the directory if it doesn't exist and return its path
def ensure_directory(directory):
    os.makedirs(directory, exist_ok=True)
    return directory

# Create a 'downloads' directory for our CSV files
download_dir = ensure_directory('downloads')

print("Setup complete. You can now use the 'client' object to interact with the Parcl Labs API.")

Setup complete. You can now use the 'client' object to interact with the Parcl Labs API.


With the client object, you can now interact with the Parcl Labs API. The client exposes multiple methods for downloading data; you can find more information about them [here](https://github.com/ParclLabs/parcllabs-python). In this case, we are interested in the [property search v2 endpoint](https://docs.parcllabs.com/reference/property_search_v2_v2_property_search_post-1). For this example, we retrieve all sales between $300,000 and $500,000 for single-family homes in the city of Pittsburgh, PA (`parcl_id`: `5377717`) that are between 2,000 and 4,000 square feet and were built in 2000 or later.

In the example below, we define the search parameters and then call the Parcl Labs client to download the data. A handful of additional parameters are commented out; those help narrow the search further, but for now we use `parcl_ids`, `event_names`, `min_price`, `max_price`, `property_types`, `min_sqft`, `max_sqft`, and `min_year_built`. We also set `include_property_details` to `True`, so every event record also returns property metadata. If you do not need the property metadata, feel free to change this field to `False`. The parameters defined in `search_params` are the main tool for controlling what the API returns: they let you make your search as wide or as narrow as you wish.
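
When the filters come from user input or a config file, it can help to assemble the dictionary programmatically. The following is a minimal sketch, not part of the `parcllabs` package: the helper `build_search_params` is hypothetical, and it only encodes the rule stated in this notebook that one of `parcl_ids`, `parcl_property_ids`, or `geo_coordinates` is required.

```python
LOCATION_KEYS = ("parcl_ids", "parcl_property_ids", "geo_coordinates")


def build_search_params(**filters):
    """Drop filters set to None and check that a location filter is present."""
    params = {key: value for key, value in filters.items() if value is not None}
    if not any(key in params for key in LOCATION_KEYS):
        raise ValueError(f"Provide one of: {', '.join(LOCATION_KEYS)}")
    return params


params = build_search_params(
    parcl_ids=[5377717],
    event_names=["ALL_SOLD"],
    min_price=300000,
    max_price=500000,
    min_beds=None,  # unused filter, removed from the result
)
print(sorted(params))
```

The resulting dictionary can be unpacked with `**params` in the same way as a hand-written `search_params`.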

If you are interested in other markets you can search them [following these steps](https://github.com/ParclLabs/parcllabs-cookbook/blob/main/examples/getting_started/search.ipynb).

In [None]:
# For a full list of available parameters along with their documentation, see the property search v2 endpoint documentation linked in the cell above.
search_params = {
    'parcl_ids': [5377717],  # One of parcl_ids, parcl_property_ids, or geo_coordinates is required
    # 'parcl_property_ids': [78353317, 135921544],
    # 'geo_coordinates': {"latitude": 36.159445, "longitude": -86.483244, "radius": 1},
    'event_names': ["ALL_SOLD"], # See docs for full list of event names
    # 'min_event_date': "2023-01-01",
    # 'max_event_date': "2024-12-31",
    'min_price': 300000,
    'max_price': 500000,
    # 'is_new_construction': False,
    # 'min_record_updated_date': "2024-01-01",
    # 'max_record_updated_date': "2024-12-31",
    # 'is_current_owner': True,
    # 'owner_name': ["BLACKSTONE"],
    # 'is_investor_owned': True,
    # 'is_owner_occupied': False,
    'include_property_details': True,
    'property_types': ["SINGLE_FAMILY"],
    # 'min_beds': 1,
    # 'max_beds': 5,
    # 'min_baths': 1,
    # 'max_baths': 3,
    'min_sqft': 2000,
    'max_sqft': 4000,
    'min_year_built': 2000,
    # 'max_year_built': 2020,
    # 'min_record_added_date': "2024-12-13",
    # 'max_record_added_date': "2024-12-31",
    # 'current_on_market_flag': True,
    # 'current_on_market_rental_flag': False,
    # 'limit': 100,
}

# we can pass the search_params dictionary to the retrieve method to get the search results using **search_params
search_results, filter_data = client.property_v2.search.retrieve(**search_params)

print(f"Found {search_results['parcl_property_id'].nunique()} distinct properties and {len(search_results)} events matching the criteria.")
print(search_results.head(2))

# Save the search results to a CSV file, using today's date in the filename for easier tracking
home_search_filename = f'pittsburgh_property_homes_{datetime.now().strftime("%Y-%m-%d")}.csv'
search_results_file_path = os.path.join(download_dir, home_search_filename)
search_results.to_csv(search_results_file_path, index=False)

print(f"Search results saved to {search_results_file_path}")

Processing property search request...
No limit provided. Setting limit to maximum value of 50000.
Found 28 distinct properties and 44 events matching the criteria.
   parcl_property_id  property_metadata_bathrooms  property_metadata_bedrooms  \
0           66128550                          5.0                           4   
1           66128550                          5.0                           4   

   property_metadata_sq_ft  property_metadata_year_built  \
0                     4749                          2017   
1                     4749                          2017   

  property_metadata_property_type property_metadata_address1  \
0                   SINGLE_FAMILY         212 S HOMEWOOD AVE   
1                   SINGLE_FAMILY         212 S HOMEWOOD AVE   

  property_metadata_address2 property_metadata_city property_metadata_state  \
0                       None             PITTSBURGH                      PA   
1                       None             PITTSBURGH         

These parameters narrow the results to relevant homes in our desired market. A targeted search like this is the first step before retrieving all the events associated with a given home.

We access the event history for any given home using another client method (`client.property.events.retrieve`), which takes a list of `parcl_property_id` values. In this example, we use it to retrieve the sales history for the homes found in the search results. We could also modify the parameters to get listing or rental events. The complete documentation for this API endpoint can be found [here](https://docs.parcllabs.com/reference/property_events_v1_property_event_history_post).
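
Because the search results contain one row per event, a property with several matching events appears more than once in the `parcl_property_id` column. A minimal sketch of preparing the ID list in plain Python follows: it removes duplicates while keeping order, then splits the list into batches. The helper names and the batch size are illustrative assumptions; the source does not say whether the client requires manual batching.

```python
def unique_in_order(ids):
    """Remove duplicate IDs while preserving first-seen order."""
    return list(dict.fromkeys(ids))


def chunked(ids, size):
    """Yield consecutive slices of `ids` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


raw_ids = [66128550, 66128550, 101, 102, 101]  # made-up IDs with repeats
ids = unique_in_order(raw_ids)
batches = list(chunked(ids, size=2))
print(batches)  # [[66128550, 101], [102]]
```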


In [None]:
# Collect the distinct parcl_property_ids from the search results so we can retrieve the sale events
# for those properties. The search returns one row per event, so a property can appear more than once.
search_results_ids = search_results['parcl_property_id'].unique().tolist()

# Define the parameters we want to use in the search for property events.
property_events_parameters = {
    'parcl_property_ids': search_results_ids,
    'event_type': 'SALE',
    #'entity_owner_name': None, # Specify one of the options or None
    #'start_date': '2020-01-01',
    #'end_date': '2021-01-01',
}

# Call the client with the list of property ids and event_type 'SALE' to retrieve the sale events for the properties.
# Unpack the property_events_parameters dictionary into the retrieve method with **property_events_parameters
sale_events = client.property.events.retrieve(
    **property_events_parameters
    )

print(f"Found {len(sale_events)} events matching the criteria.")
print(sale_events.head(2))

Just as in the case of property search, we can also modify parameters to get more information about a particular home. For a detailed list of parameters, you can see the documentation [here](https://docs.parcllabs.com/reference/property_events_v1_property_event_history_post). In the example below, we will modify the `event_type` parameter from `SALE` to `ALL`.

In [None]:
# Re-run the results for all event types by modifying the event_type parameter to 'ALL'
property_events_parameters = {
    'parcl_property_ids': search_results_ids,
    'event_type': 'ALL',
    #'entity_owner_name': None, # Specify one of the options or None
    #'start_date': '2020-01-01',
    #'end_date': '2021-01-01',
    }

all_events = client.property.events.retrieve(
    **property_events_parameters
    )

print(f"Found {len(all_events)} events matching the criteria.")
print(all_events.head(2))

With the new parameters, we now get all the events associated with the selected properties, including sales, listings, and rentals. If we want to look at events since 2022, we can simply uncomment the `start_date` parameter and set it to '2022-01-01'.
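
For a rolling window instead of a fixed date, the `start_date` and `end_date` strings can be computed with `datetime`. This is a small sketch under the assumption that both parameters accept `YYYY-MM-DD` strings, as in the commented examples in this notebook; the helper `date_window` is hypothetical.

```python
from datetime import date, timedelta


def date_window(days_back, today=None):
    """Return start_date/end_date strings covering the last `days_back` days."""
    today = today or date.today()
    start = today - timedelta(days=days_back)
    return {"start_date": start.isoformat(), "end_date": today.isoformat()}


# Fixed reference date so the printed result is reproducible
window = date_window(365, today=date(2025, 1, 1))
print(window)  # {'start_date': '2024-01-02', 'end_date': '2025-01-01'}
```

The returned dictionary could be merged into `property_events_parameters` with `property_events_parameters.update(window)`.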

In [None]:
# Re-run the request for all event types, this time with start_date set to '2022-01-01'
property_events_parameters = {
    'parcl_property_ids': search_results_ids,
    'event_type': 'ALL',
    #'entity_owner_name': None, # Specify one of the options or None
    'start_date': '2022-01-01',
    #'end_date': '2023-01-01',
}

all_events = client.property.events.retrieve(
    **property_events_parameters
    )

print(f"Found {len(all_events)} events matching the criteria.")
print(all_events.head(2))

When you are ready to save your data, you can use the following code to merge the events with the property metadata and write the result to a CSV file.

In [None]:
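# Attach the property metadata to each event. search_results holds one row per event,
# so keep a single metadata row per property first to avoid duplicating events in the merge.
property_details = search_results.filter(regex=r'^(parcl_property_id|property_metadata_)').drop_duplicates(subset='parcl_property_id')
final_data_events = all_events.merge(property_details, on='parcl_property_id', how='left')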

print(f"Final data shape: {final_data_events.shape}")
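
A left merge on `parcl_property_id` repeats event rows whenever the right-hand frame has more than one row per property, which is the case for search results that return one row per event. The toy example below illustrates the effect and one way to guard against it with pandas' `validate` argument. The frames and values are made up for illustration, and the `event_type` column name is an assumption.

```python
import pandas as pd

# Toy stand-ins for the notebook's frames; values are invented for illustration
events = pd.DataFrame({
    "parcl_property_id": [1, 1, 2],
    "event_type": ["SALE", "LISTING", "SALE"],
})
search = pd.DataFrame({
    "parcl_property_id": [1, 1, 2],  # property 1 matched two search events
    "property_metadata_sq_ft": [2100, 2100, 3000],
})

# Naive merge: each event of property 1 is paired with both search rows
naive = events.merge(search, on="parcl_property_id", how="left")

# Keep one metadata row per property, then let pandas verify the right keys are unique
details = search.drop_duplicates(subset="parcl_property_id")
safe = events.merge(details, on="parcl_property_id", how="left", validate="many_to_one")

print(len(events), len(naive), len(safe))  # 3 5 3
```

With `validate="many_to_one"`, pandas raises a `MergeError` if the right-hand frame still contains duplicate keys, so an inflated row count cannot pass unnoticed.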

In [None]:
# Save the event results to a CSV file using today's date in the filename for easier tracking
events_filename = f'pittsburgh_property_events_all_events_{datetime.now().strftime("%Y-%m-%d")}.csv'
events_file_path = os.path.join(download_dir, events_filename)
final_data_events.to_csv(events_file_path, index=False)

print(f"Event history saved to {events_file_path}")
print(f"Total events retrieved: {len(final_data_events)}")