<center>
<h1>Welcome to the Lab 🥼🧪</h1>
</center>

## Tracking Invitation Homes Activity Over Time

In this notebook, we will analyze Invitation Homes activity in five tertiary markets to answer the following questions:
- How many homes does Invitation Homes own in these markets today? How many did they own a year ago?
- What is the rent rate that Invitation Homes is charging renters in these homes? What was it a year ago? 
- How have these metrics changed year over year?

The notebook is broken up into the following sections:
1. Import required packages and set up the Parcl Labs API key
2. Pull Parcl IDs for Five Tertiary Markets
3. Retrieve the Data for the Current Invitation Homes Properties
4. Prepare the Data for the Current Invitation Homes Properties
5. Retrieve the Data for the Nov 2023 Invitation Homes Properties
6. Prepare the Data for the November 2023 Invitation Homes Properties
7. Calculate the YoY Change for the Invitation Homes Portfolio in the Tertiary Markets

#### What will you create in this notebook?

##### Understand changes in supply and demand YoY
<p align="center">
  <img src="../../../images/INVH_YoY_Changes_tertiary_markets.png" alt="Invitation Homes YoY changes in tertiary markets">
</p>

#### Need help getting started?

**Reminders:**

- You can get your Parcl Labs API key [here](https://dashboard.parcllabs.com/signup) to follow along.

- To run this immediately, you can use Google Colab. Remember, you must set your `PARCL_LABS_API_KEY`. 
- To run this notebook at scale and download data for multiple markets and endpoints, you will need to upgrade your Parcl Labs API account from free to starter to get additional credits. You can easily upgrade at any time by visiting your [Parcl Labs dashboard](https://dashboard.parcllabs.com/login), clicking "Upgrade Now" ($99, no commitment). This will unlock more credits immediately.

Run in Colab --> [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ParclLabs/parcllabs-cookbook/blob/main/examples/housing_market_research/investor_analytics/invh_yoy_change.ipynb)

### 1. Import required packages and set up the Parcl Labs API key

In [None]:
# If needed, install and/or upgrade to the latest version of the Parcl Labs Python library
%pip install --upgrade parcllabs nbformat

In [2]:
import os
import pandas as pd
from parcllabs import ParclLabsClient
from requests.exceptions import RequestException

In [3]:
api_key = os.getenv('PARCL_LABS_API_KEY')
client = ParclLabsClient(api_key, turbo_mode=True)
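
`os.getenv` returns `None` when the variable is not set, so a missing key may only surface later as a failed request. A minimal sketch of an early check follows; `require_api_key` is a hypothetical helper written for this notebook, not part of the `parcllabs` library.

```python
import os


def require_api_key(env_var="PARCL_LABS_API_KEY"):
    """Return the key stored in env_var, or raise a clear error if it is unset or blank.

    Hypothetical helper for illustration; not part of the parcllabs package.
    """
    key = os.getenv(env_var)
    if not key or not key.strip():
        raise RuntimeError(
            f"Set the {env_var} environment variable before creating the client."
        )
    # Strip stray whitespace that can sneak in when a key is pasted
    return key.strip()
```

Calling `api_key = require_api_key()` before constructing the client would then fail fast with a readable message instead of an authentication error further down the notebook.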

### 2. Pull Parcl IDs for Five Tertiary Markets

In [4]:
# List of markets
queries = ['Indianapolis', 'Cincinnati', 'Columbia, SC', 'Columbus']

# Empty list to store parcl_ids
market_parcl_id_list = []

# Loop through each query and make the API request
for query in queries:
    markets = client.search.markets.retrieve(
        query=query,
        location_type='CBSA'
    )

    # Append Parcl IDs
    parcl_id = int(markets['parcl_id'].iloc[0]) if hasattr(markets, 'iloc') else markets['parcl_id']
    market_parcl_id_list.append(parcl_id)

### 3. Retrieve the Data for the Current Invitation Homes Properties

In steps 3 and 4, we use the property search endpoint to pull the current Invitation Homes portfolio, so that we pass fewer properties to the events endpoint. Once we have the Parcl property IDs for the homes IH currently owns in these markets, we can partition on Parcl property ID and sort by event date descending to get the most recent event at each property.

Reminder: this will consume a lot of credits at scale, and you will need to upgrade your Parcl Labs API account from free to starter to get additional credits. To use fewer credits, consider running the analysis for only one market.
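
The "partition and sort" step can be tried on a small table before spending any credits. The sketch below uses made-up rental events (not Parcl Labs data) that borrow the column names used later in this notebook, and keeps the most recent row per property.

```python
import pandas as pd

# Toy rental events with made-up values, for illustration only
events = pd.DataFrame({
    "parcl_property_id": [101, 101, 102, 102, 102],
    "event_date": ["2024-03-01", "2024-09-15", "2023-12-01", "2024-06-10", "2024-02-20"],
    "price": [1800, 1850, 2100, 2200, 2150],
})
events["event_date"] = pd.to_datetime(events["event_date"])

# Sort newest-first within each property, then keep the first row per property
latest = (
    events.sort_values(["parcl_property_id", "event_date"], ascending=[True, False])
    .drop_duplicates(subset="parcl_property_id", keep="first")
    .reset_index(drop=True)
)
print(latest)
```

Each property ends up with exactly one row, its latest event, which is the shape the aggregation in step 4 expects.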

In [None]:
# pass in the parcl IDs and use the invitation homes parameter to pull current IH properties in the 5 markets
current_ih_units = client.property.search.retrieve(
    parcl_ids=market_parcl_id_list,
    property_type='single_family',
    current_entity_owner_name='invitation_homes'
)

# DataFrame of properties with CBSA and current owner
current_ih_units_df = current_ih_units[['parcl_property_id', 'cbsa', 'cbsa_parcl_id', 'current_entity_owner_name']]

# Create the Parcl property ID list to pass into the events endpoint
parcl_property_id_list = current_ih_units_df['parcl_property_id'].tolist()

In [6]:
# pass in the parcl_property_id_list and use the RENTAL parameter to pull in all rental events at the properties
rental_events = client.property.events.retrieve(
        parcl_property_ids=parcl_property_id_list,
        event_type='RENTAL',
        end_date='2024-11-30'
)

### 4. Prepare the Data for the Current Invitation Homes Properties

In [7]:
# Pulling only the current rent rate (either PRICE_CHANGE or LISTED_RENT) at a given property
filtered_rental_events = rental_events[
    rental_events['event_name'].isin(['PRICE_CHANGE', 'LISTED_RENT'])
]

filtered_rental_events = filtered_rental_events.sort_values(
    by=['parcl_property_id', 'event_date'], ascending=[True, False]
)

latest_events_df = filtered_rental_events.drop_duplicates(
    subset=['parcl_property_id'], keep='first'
)

latest_events_df = latest_events_df[['parcl_property_id', 'price']]

In [None]:
merged_current_props = current_ih_units_df.merge(latest_events_df, on='parcl_property_id', how='left')

# Calculate inventory and rent rates
current_state_invh = merged_current_props.groupby(['cbsa', 'current_entity_owner_name']).agg(
    Current_Inventory=('parcl_property_id', 'count'),  # Count of 'parcl_property_id'
    Current_Rent_Rates=('price', 'median')  # Median of 'price'
).reset_index()

# Rename columns for the final output format
current_state_invh.rename(columns={
    'cbsa': 'Market',
    'current_entity_owner_name': 'Entity_Name',
}, inplace=True)

# Display the final current-state DataFrame
current_state_invh
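
One detail of the aggregation above is worth seeing on toy data: because the merge is a left join, homes without a rental event carry a missing `price`. Counting `parcl_property_id` still includes those homes in inventory, while the median skips the missing values. The sketch below uses made-up rows; the `priced_homes` column is added only for illustration.

```python
import numpy as np
import pandas as pd

# Toy portfolio with made-up values: property 3 has no rental event, so its price is missing
toy = pd.DataFrame({
    "cbsa": ["Market A", "Market A", "Market A", "Market B"],
    "parcl_property_id": [1, 2, 3, 4],
    "price": [1500.0, 1700.0, np.nan, 2000.0],
})

summary = toy.groupby("cbsa").agg(
    inventory=("parcl_property_id", "count"),  # counts every home
    priced_homes=("price", "count"),           # counts only homes with a rent
    median_rent=("price", "median"),           # NaN values are skipped
).reset_index()
print(summary)
```

Comparing `inventory` with `priced_homes` is a quick way to see how many homes actually back each median rent figure.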


### 5. Retrieve the Data for the Nov 2023 Invitation Homes Properties

In steps 5 and 6, we show how the historical depth of the data reveals what an operator such as Invitation Homes was doing a year ago. First, we pull all properties and events through 11/30/2023 in the five markets. With the Parcl property IDs in hand, we partition on Parcl property ID and sort by event date descending to get the most recent event at each property on or before 11/30/2023. We then filter those events on owner to identify whether IH owned the home at the time of that event. With that universe defined, we can continue with the analysis. Reminder: this process will consume millions of credits because of the amount of data it pulls from the API.
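
The point-in-time logic can be sketched on toy data: drop events after the cutoff, keep the latest remaining event per property, then filter on the owner recorded on that event. The rows and the `OTHER_OWNER` label below are made up for illustration. In the notebook itself the date cutoff is applied by the `end_date` argument of the events request, not by a pandas filter.

```python
import pandas as pd

# Toy event history with made-up values
events = pd.DataFrame({
    "parcl_property_id": [1, 1, 2, 2, 3],
    "event_date": pd.to_datetime(
        ["2022-05-01", "2024-02-01", "2021-08-01", "2023-10-15", "2023-06-01"]
    ),
    "entity_owner_name": [
        "INVITATION_HOMES", "OTHER_OWNER", "OTHER_OWNER", "INVITATION_HOMES", "OTHER_OWNER",
    ],
})
cutoff = pd.Timestamp("2023-11-30")

# 1. Ignore anything that happened after the snapshot date
before_cutoff = events[events["event_date"] <= cutoff]

# 2. Latest remaining event per property
latest = before_cutoff.sort_values(
    ["parcl_property_id", "event_date"], ascending=[True, False]
).drop_duplicates(subset="parcl_property_id", keep="first")

# 3. Keep properties whose latest event lists Invitation Homes as owner
owned = latest[latest["entity_owner_name"] == "INVITATION_HOMES"]
owned_ids = sorted(owned["parcl_property_id"].tolist())
print(owned_ids)
```

Property 1 changes owner in 2024, but that event falls after the cutoff, so it still counts as held by Invitation Homes in the November 2023 snapshot.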

In [9]:
# Since we want a point in time portfolio analysis (Nov 2023), we cannot pull the current IH portfolio to truncate this query and must grab all units in the five markets, a much larger query
total_market_units = client.property.search.retrieve(
    parcl_ids=market_parcl_id_list,
    property_type='single_family'
)

# DataFrame of properties with CBSA and property type
total_market_units = total_market_units[['parcl_property_id', 'cbsa', 'cbsa_parcl_id', 'property_type']]

# Parcl property ID list to pass into the events endpoint
total_unit_list = total_market_units['parcl_property_id'].tolist()

In [10]:
# Pass in total_unit_list to pull all events in the five markets through November 30, 2023

market_events_2023 = client.property.events.retrieve(
        parcl_property_ids=total_unit_list,
        end_date='2023-11-30'
)

### 6. Prepare the Data for the November 2023 Invitation Homes Properties

In [11]:
# Keep only the latest event at every property on or before November 30, 2023
latest_2023_event = market_events_2023.sort_values(
    by=['parcl_property_id', 'event_date'], ascending=[True, False]
)

latest_2023_event_dedupe = latest_2023_event.drop_duplicates(
    subset=['parcl_property_id'], keep='first'
)

In [12]:
# find the props owned by IH at the time

props_owned_by_ih_nov_2023 = latest_2023_event_dedupe[
    latest_2023_event_dedupe['entity_owner_name'].isin(['INVITATION_HOMES'])
]

In [13]:
# Merge the two DataFrames on 'parcl_property_id'
parcls_events_ih = props_owned_by_ih_nov_2023.merge(total_market_units, on='parcl_property_id', how='inner')

# Select only the columns 'parcl_property_id', 'cbsa', 'entity_owner_name', and 'transfer_index'
final_events = parcls_events_ih[['parcl_property_id', 'cbsa', 'entity_owner_name', 'transfer_index']]

# Filter latest_2023_event_dedupe for rows where event_type is 'RENTAL' and event_name is either 'LISTED_RENT' or 'PRICE_CHANGE'
rentals_2023 = latest_2023_event_dedupe[
    (latest_2023_event_dedupe['event_type'] == 'RENTAL') &
    (latest_2023_event_dedupe['event_name'].isin(['LISTED_RENT', 'PRICE_CHANGE'])) &
    (latest_2023_event_dedupe['price'] <= 8000)
]

# Select only the columns needed for the join
rentals_2023 = rentals_2023[['parcl_property_id', 'transfer_index', 'price']]

# Left join final_events with rentals_2023 on 'parcl_property_id' and 'transfer_index'
final_events_with_prices = final_events.merge(
    rentals_2023,
    on=['parcl_property_id', 'transfer_index'],
    how='left',
    suffixes=('', '_rental')  # to avoid column name conflicts if needed
)

In [None]:
# Calculate IH 2023 inventory and rent rates
invh_2023 = final_events_with_prices.groupby(['cbsa', 'entity_owner_name']).agg(
    November_2023_Inventory=('parcl_property_id', 'count'),  # Count of 'parcl_property_id'
    November_2023_Rent_Rates=('price', 'median')  # Median of 'price'
).reset_index()

# Rename columns for the final output format
invh_2023.rename(columns={
    'cbsa': 'Market',
    'entity_owner_name': 'Entity_Name',
}, inplace=True)

invh_2023

### 7. Calculate the YoY Change for the Invitation Homes Portfolio in the Tertiary Markets

In [None]:
# Calc the YoY changes
YoY_changes = pd.merge(
    invh_2023,
    current_state_invh,
    on=['Market', 'Entity_Name'],
    how='left'
)

# Calculate YoY Inventory Change and YoY Rent Change
YoY_changes['YoY Inventory Change'] = ((YoY_changes['Current_Inventory'] - YoY_changes['November_2023_Inventory']) 
                                     / YoY_changes['November_2023_Inventory']) * 100
YoY_changes['YoY Rent Change'] = ((YoY_changes['Current_Rent_Rates'] - YoY_changes['November_2023_Rent_Rates']) 
                                / YoY_changes['November_2023_Rent_Rates']) * 100

# Rename columns for the final output
YoY_changes.rename(columns={
    'November_2023_Inventory': 'November 2023 Inventory',
    'Current_Inventory': 'Current Inventory',
    'November_2023_Rent_Rates': 'November 2023 Rent Rates',
    'Current_Rent_Rates': 'Current Rent Rates'
}, inplace=True)

# Reorder columns to match the desired format
final = YoY_changes[['Market', 'Entity_Name', 'November 2023 Inventory', 'Current Inventory', 
                      'YoY Inventory Change', 'November 2023 Rent Rates', 'Current Rent Rates', 
                      'YoY Rent Change']]

# Export the final DataFrame to CSV
final.to_csv('final_market_data.csv', index=False)

# Display the final DataFrame to confirm
final
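
The year-over-year figures above are plain percent changes, `(current - prior) / prior * 100`. A minimal sketch on made-up inventory counts shows the calculation in isolation; `pct_change` is a hypothetical helper name, and the column names here are illustrative, not the notebook's.

```python
import pandas as pd


def pct_change(current, prior):
    """Percent change from prior to current; works elementwise on pandas Series."""
    return (current - prior) / prior * 100


# Toy inventory counts with made-up values
toy = pd.DataFrame({
    "Market": ["Market A", "Market B"],
    "prior_inventory": [200, 400],
    "current_inventory": [250, 380],
})
toy["yoy_inventory_change"] = pct_change(toy["current_inventory"], toy["prior_inventory"])
print(toy)
```

A positive value means the portfolio grew over the year and a negative value means it shrank. Note that the left merge in this step starts from the November 2023 table, so a market present only in the current table would not appear in the output.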