[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1_wJz9HUb3EHO9WDjbk9u3EGW0kzOO_II?usp=sharing)
# Identifying Tired Landlord + Vacancy Leads with Zillow Rentals

## Overview
| Detail Tag            | Information                                                                                        |
|-----------------------|----------------------------------------------------------------------------------------------------|
| Originally Created By | Ariel Herrera arielherrera@analyticsariel.com |
| External References   | Scrapeak Zillow API |

## History
| Date         | Developed By  | Reason                                                |
|--------------|---------------|-------------------------------------------------------|
| 24th Oct 2024 | Ariel Herrera | Create notebook. |

## Getting Started
1. Copy this notebook -> File -> Save a Copy in Drive
2. Directions

## Useful Resources
- [Google Colab Cheat Sheet](https://towardsdatascience.com/cheat-sheet-for-google-colab-63853778c093)

# 🎯 **Goal of this Notebook: Finding Tired Landlord Leads via Zillow Rental Listings**

This notebook will guide you through how to gather and analyze rental listings from Zillow to identify potential tired landlords.
The aim is to help you find properties that are likely vacant and could be owned by landlords interested in selling.

## 💡 **Who is this for?**
This guide is perfect for those starting out in real estate investment, wholesaling, or anyone looking for creative ways to find off-market deals.

---

### 📖 **Overview: What You'll Learn**
- **How to use Zillow's rental listings to gather leads**
- **How to process and filter the data to target tired landlords**
- **How to visualize the listings and analyze potential opportunities**

➡️ Join our LIVE Alternative Leads Mini Series: <a href="https://www.coffeeclozers.com/landing-pages/alternative-leads-lists-mini-series" target="_blank">
  <button style="padding:10px; background-color:#4CAF50; color:white; border:none; border-radius:5px; cursor:pointer;">
    Learn More
  </button>
</a>

![Image Description](https://drive.google.com/uc?export=view&id=1UQVlnsyaDAerRybzeiiFxn_iXIyJ-Jdw)

## 📦 **Step 1: Install Necessary Packages**

Before we start, make sure you have the necessary libraries installed to scrape Zillow rental listings and perform data analysis.

- **requests**: To interact with web resources
- **pandas**: For data manipulation
- **plotly**: For visualizations

In [14]:
!pip install requests pandas plotly -q

## 🛠 **Step 2: Setup (Import Libraries, Functions, Locals & Constants)**

These are the essential Python libraries that you'll use throughout this notebook to fetch and process the rental data.

- **pandas**: For handling datasets and making it easy to work with tabular data.
- **getpass**: To safely input passwords or sensitive information if needed.
- **plotly.express**: To visualize the rental listings on a map and analyze price data.

### <font color="blue">Imports</font>

In [15]:
# Import the necessary libraries for this analysis
# requests: to fetch data from Zillow or other web APIs
# pandas: for handling and analyzing structured datasets
# getpass: to securely input sensitive information (e.g., passwords or API keys)
# plotly.express: for creating interactive visualizations

import requests
import pandas as pd
import getpass
from datetime import datetime
import plotly.express as px

pd.set_option('display.max_columns', None)

### <font color="blue">Functions</font>

In [16]:
# This section sends a request to the Zillow API to retrieve rental listings data
# Replace the URL and query parameters with the specific ones needed for your request
# The response contains raw rental listings data, which we'll process and analyze later

def get_zillow_listings(api_key, listing_url):
  #/ API endpoint and default headers
  api_url = "https://app.scrapeak.com/v1/scrapers/zillow/listing"

  parameters = {"api_key": api_key, "url": listing_url}

  #/ Make the API request
  response = requests.get(api_url, params=parameters)
  return response


def normalize_column_with_clipping(df, column_name, lower_percentile=1, upper_percentile=99):
    """
    Normalize the values in a given column to the range [0, 1], handling outliers by clipping.

    Parameters:
    df (pandas.DataFrame): The dataframe containing the column to normalize.
    column_name (str): The name of the column to normalize.
    lower_percentile (float): The lower percentile to clip outliers (default is 1st percentile).
    upper_percentile (float): The upper percentile to clip outliers (default is 99th percentile).

    Returns:
    pandas.Series: A series with normalized values between 0 and 1.
    """
    # Calculate the clipping thresholds
    lower_bound = df[column_name].quantile(lower_percentile / 100)
    upper_bound = df[column_name].quantile(upper_percentile / 100)

    # Clip the values in the column to the specified percentiles
    clipped_column = df[column_name].clip(lower=lower_bound, upper=upper_bound)

    # Normalize the clipped column
    col_min = clipped_column.min()
    col_max = clipped_column.max()

    # Avoid division by zero if all values in the column are the same
    if col_max - col_min == 0:
        return clipped_column.apply(lambda x: 1 if col_max != 0 else 0)

    return (clipped_column - col_min) / (col_max - col_min)

### <font color="blue">Locals & Constants</font>

Get Your [Zillow API Key by signing up for a free Scrapeak account](https://bit.ly/3YVU3Ga)

In [17]:
# Set your API key and listing URL
api_key = getpass.getpass("Enter your api key: ")  # Replace with your actual API key
listing_url = "https://www.zillow.com/toledo-oh/rentals/?searchQueryState=%7B%22isMapVisible%22%3Atrue%2C%22mapBounds%22%3A%7B%22north%22%3A41.81569685823597%2C%22south%22%3A41.49711053161347%2C%22east%22%3A-83.3011579580078%2C%22west%22%3A-83.91501904199218%7D%2C%22filterState%22%3A%7B%22fr%22%3A%7B%22value%22%3Atrue%7D%2C%22fsba%22%3A%7B%22value%22%3Afalse%7D%2C%22fsbo%22%3A%7B%22value%22%3Afalse%7D%2C%22nc%22%3A%7B%22value%22%3Afalse%7D%2C%22cmsn%22%3A%7B%22value%22%3Afalse%7D%2C%22auc%22%3A%7B%22value%22%3Afalse%7D%2C%22fore%22%3A%7B%22value%22%3Afalse%7D%2C%22tow%22%3A%7B%22value%22%3Afalse%7D%2C%22apco%22%3A%7B%22value%22%3Afalse%7D%2C%22apa%22%3A%7B%22value%22%3Afalse%7D%2C%22con%22%3A%7B%22value%22%3Afalse%7D%7D%2C%22isListVisible%22%3Atrue%2C%22mapZoom%22%3A11%2C%22regionSelection%22%3A%5B%7B%22regionId%22%3A34303%2C%22regionType%22%3A6%7D%5D%2C%22pagination%22%3A%7B%7D%2C%22usersSearchTerm%22%3A%22Toledo%20OH%22%7D"  # The URL of the listing page with the 'searchQueryState' parameter.

Enter your password: ··········


## 📋 **Step 3: Data Collection - Fetch Zillow Rental Listings**

Here, we will use the Zillow API to gather rental listings. The goal is to find landlords who have listed properties but may be having trouble renting them out.

The code in this section will:
- Query the Zillow API for rental listings
- Filter the results to target listings that have been on the market for a long time (indicating potential vacancies)


In [18]:
# make request
response = get_zillow_listings(api_key, listing_url) # sample of ~40 listings

In [19]:
# view keys
response.json().keys()

dict_keys(['is_success', 'data', 'message', 'info'])

In [20]:
# view results count
len(response.json()['data']['cat1']['searchResults']['listResults'])

41

In [21]:
# view results for a single listing
x = response.json()['data']['cat1']['searchResults']['listResults'][1]
x

{'zpid': '34635601',
 'id': '34635601',
 'rawHomeStatusCd': 'ForRent',
 'marketingStatusSimplifiedCd': 'For Rent',
 'providerListingId': '2d9prjz6apdjt',
 'imgSrc': 'https://photos.zillowstatic.com/fp/9f710a057691d166265807757d570349-p_e.jpg',
 'hasImage': True,
 'detailUrl': 'https://www.zillow.com/homedetails/1110-Camden-St-Toledo-OH-43605/34635601_zpid/',
 'statusType': 'FOR_RENT',
 'statusText': 'House for rent',
 'countryCurrency': '$',
 'price': '$1,150/mo',
 'unformattedPrice': 1150,
 'address': '1110 Camden St, Toledo, OH 43605',
 'addressStreet': '1110 Camden St',
 'addressCity': 'Toledo',
 'addressState': 'OH',
 'addressZipcode': '43605',
 'isUndisclosedAddress': False,
 'beds': 3,
 'baths': 1.0,
 'area': 982,
 'latLong': {'latitude': 41.635185, 'longitude': -83.51398},
 'isZillowOwned': False,
 'variableData': {'type': 'TIME_ON_INFO',
  'text': '9 days ago',
  'data': {'isRead': None, 'isFresh': False}},
 'badgeInfo': {'type': 'ACCEPTS_APPLICATIONS', 'text': 'Apply instantly

In [25]:
# index to get specific fields
print(x['zpid'])
print(x['detailUrl'])
print(x['imgSrc'])
print(x['unformattedPrice'])
print(x['address'])
print(x['beds'])
print(x['baths'])
print(x['area'])
print(x['latLong'])
print(x['latLong']['latitude'])
print(x['latLong']['longitude'])
print(x['hdpData']['homeInfo']['zestimate'])
print(x['hdpData']['homeInfo']['rentZestimate'])
print(x['hdpData']['homeInfo']['timeOnZillow'])
print(x['availabilityDate'])
print(x['marketingTreatments'])

34635601
https://www.zillow.com/homedetails/1110-Camden-St-Toledo-OH-43605/34635601_zpid/
https://photos.zillowstatic.com/fp/9f710a057691d166265807757d570349-p_e.jpg
1150
1110 Camden St, Toledo, OH 43605
3
1.0
982
{'latitude': 41.635185, 'longitude': -83.51398}
41.635185
-83.51398
68000
1242
812953000
2024-10-15 00:00:00
['paid', 'zillowRentalManager']


## 🔍 **Step 4: Data Filtering and Analysis**

Once you have the raw rental listings, the next step is to clean and filter the data to focus on landlords who might be motivated to sell.
In this section, you'll learn how to filter for:
- Properties listed for extended periods
- Properties where the price has dropped multiple times (a key indicator of tired landlords)

In [26]:
# get all listings to list
rental_listings_l = response.json()['data']['cat1']['searchResults']['listResults']

In [27]:
# iterate through listings to select a subset of fields
formatted_listings_l = []
for x in rental_listings_l:
  # check property data is available
  if 'hdpData' in x:
    # check if single family home
    if x['hdpData']['homeInfo']['homeType'] == 'SINGLE_FAMILY':
      d = {
          'zpid': x['zpid'],
          'detailUrl': x['detailUrl'],
          'imgSrc': x['imgSrc'],
          'price': x['unformattedPrice'],
          'address': x['address'],
          'beds': x['beds'],
          'baths': x['baths'],
          'area': x['area'],
          'homeType': x['hdpData']['homeInfo']['homeType'],
          'latitude': x['latLong']['latitude'],
          'longitude': x['latLong']['longitude'],
          'zestimate': x['hdpData']['homeInfo']['zestimate'] if 'zestimate' in x else None,
          'rentZestimate': x['hdpData']['homeInfo']['rentZestimate'] if 'rentZestimate' in x else None,
          'daysOnZillow': x['hdpData']['homeInfo']['daysOnZillow'],
          'priceChange': x['hdpData']['homeInfo']['priceChange'] if 'priceChange' in x['hdpData']['homeInfo'] else None,
          'datePriceChanged': x['hdpData']['homeInfo']['datePriceChanged'] if 'datePriceChanged' in x['hdpData']['homeInfo'] else None,
          'availabilityDate': x['availabilityDate'] if 'availabilityDate' in x else None,
          'marketingTreatments': x['marketingTreatments']
      }
      formatted_listings_l.append(d)

In [28]:
# transform to a dataframe
df = pd.DataFrame(formatted_listings_l)
df.head(2)

Unnamed: 0,zpid,detailUrl,imgSrc,price,address,beds,baths,area,homeType,latitude,longitude,zestimate,rentZestimate,daysOnZillow,priceChange,datePriceChanged,availabilityDate,marketingTreatments
0,34635601,https://www.zillow.com/homedetails/1110-Camden...,https://photos.zillowstatic.com/fp/9f710a05769...,1150,"1110 Camden St, Toledo, OH 43605",3,1.0,982,SINGLE_FAMILY,41.635185,-83.51398,68000.0,,9,,,2024-10-15 00:00:00,"[paid, zillowRentalManager]"
1,34633440,https://www.zillow.com/homedetails/3507-Saint-...,https://photos.zillowstatic.com/fp/be7a5b6a02c...,1150,"3507 Saint Bernard Dr, Toledo, OH 43613",2,1.0,933,SINGLE_FAMILY,41.684124,-83.59744,107100.0,,7,,,2024-11-05 00:00:00,"[paid, zillowRentalManager]"


In [29]:
# view high level stats on price by bed count
df.groupby(['beds']).agg({'zpid': 'count', 'price': 'mean'}).reset_index()

Unnamed: 0,beds,zpid,price
0,1,1,695.0
1,2,8,1159.375
2,3,29,1272.37931
3,4,2,1400.0


In [30]:
# create features
df_features = df.copy()
df_features['price_per_sqft'] = df_features['price'] / df_features['area']
df_features['rent_to_price_ratio'] = (df_features['price']*12) / df_features['zestimate']
df_features['one_percent_rule'] = (df_features['price'] / df_features['zestimate']) > 0.01
df_features.head(2)

Unnamed: 0,zpid,detailUrl,imgSrc,price,address,beds,baths,area,homeType,latitude,longitude,zestimate,rentZestimate,daysOnZillow,priceChange,datePriceChanged,availabilityDate,marketingTreatments,price_per_sqft,rent_to_price_ratio,one_percent_rule
0,34635601,https://www.zillow.com/homedetails/1110-Camden...,https://photos.zillowstatic.com/fp/9f710a05769...,1150,"1110 Camden St, Toledo, OH 43605",3,1.0,982,SINGLE_FAMILY,41.635185,-83.51398,68000.0,,9,,,2024-10-15 00:00:00,"[paid, zillowRentalManager]",1.171079,0.202941,True
1,34633440,https://www.zillow.com/homedetails/3507-Saint-...,https://photos.zillowstatic.com/fp/be7a5b6a02c...,1150,"3507 Saint Bernard Dr, Toledo, OH 43613",2,1.0,933,SINGLE_FAMILY,41.684124,-83.59744,107100.0,,7,,,2024-11-05 00:00:00,"[paid, zillowRentalManager]",1.232583,0.128852,True


## 📊 **Step 5a: Visualizing the Data Characteristics**

In [31]:
# 1. Price Distribution by Beds
price_by_beds_type = px.bar(
    df_features.groupby('beds')['price'].mean().reset_index(),
    x='beds', y='price',
    title="Average Rental Price by Home Type",
    labels={'price': 'Average Price ($)', 'beds': 'Beds'}
)
price_by_beds_type.show()

In [32]:
# 2. Price vs Area Scatter Plot
price_vs_area = px.scatter(
    df_features, x='area', y='price',
    title="Rental Price vs Area",
    labels={'area': 'Area (sq ft)', 'price': 'Price ($)'},
    trendline="ols"  # This adds a trendline
)
price_vs_area.show()

In [33]:
# 3. Price per Square Foot by Number of Beds
price_per_sqft_beds = px.bar(
    df_features.groupby('beds')['price_per_sqft'].mean().reset_index(),
    x='beds', y='price_per_sqft',
    title="Average Price per Square Foot by Number of Beds",
    labels={'price_per_sqft': 'Price per Sq Ft ($)', 'beds': 'Number of Beds'}
)
price_per_sqft_beds.show()

In [34]:
# 4. Distribution of Rental Prices
price_distribution = px.histogram(
    df_features, x='price', nbins=20,
    title="Distribution of Rental Prices",
    labels={'price': 'Price ($)'}
)
price_distribution.show()

## 📊 **Step 5b: Visualizing the Rental Listings on a Map**

After collecting and filtering the data, you will want to visualize the listings on a map.
This will give you an overview of where these properties are located and help you prioritize your outreach efforts.

- Larger bubbles represent higher rental prices, while smaller bubbles represent lower-priced properties.


In [None]:
# Create an interactive map visualization using Plotly
# Each point on the map represents a rental listing, with bubble size corresponding to price
# This helps identify geographic clusters of potential tired landlord properties

# Bubble Map showing rental listings, where bubble size corresponds to price
bubble_map = px.scatter_mapbox(
    df_features,
    lat='latitude',
    lon='longitude',
    size='price',
    color='price',  # Color by price for additional visual cue
    hover_name='address',  # Display address when hovering over the bubble
    hover_data={'price': True, 'beds': True, 'baths': True, 'area': True},  # Extra data to show on hover
    title="Rental Listings: Bubble Size by Price",
    zoom=10,  # Adjust zoom level for the map
    size_max=15  # Maximum bubble size
)

# Set the map style and layout
bubble_map.update_layout(
    mapbox_style="open-street-map",  # Using Open Street Map
    margin={"r":0,"t":0,"l":0,"b":0}  # Removing excess margins
)

# Show the map
bubble_map.show()

## 🧐 **Step 6: Interpreting the Results**

Now that you have visualized the data, you can start analyzing the patterns. Look for clusters of high-priced properties that have been listed for long periods,
as these are often signs of landlords struggling to find tenants.

Ask yourself:
- Are there areas with multiple vacant properties?
- Are certain neighborhoods more likely to have price reductions?
- Are there opportunities to target landlords who might be interested in selling?

These insights will help you target the best leads.

In [35]:
# score leads based on motivation and profitability
df_score = df_features.copy()
df_score['norm_days_on_zillow'] = normalize_column_with_clipping(df_score, 'daysOnZillow')
df_score['norm_rent_to_price_ratio'] = normalize_column_with_clipping(df_score, 'rent_to_price_ratio')
df_score['lead_score'] = df_score.apply(lambda x:
  (x['norm_days_on_zillow'] * 0.40) + \
  (x['norm_rent_to_price_ratio'] * 0.40) + \
  ((1 if pd.isna(x['priceChange']) == False else 0) * 0.20)
  , axis=1)
df_score.sort_values(by='lead_score', ascending=False).head(5)

Unnamed: 0,zpid,detailUrl,imgSrc,price,address,beds,baths,area,homeType,latitude,longitude,zestimate,rentZestimate,daysOnZillow,priceChange,datePriceChanged,availabilityDate,marketingTreatments,price_per_sqft,rent_to_price_ratio,one_percent_rule,norm_days_on_zillow,norm_rent_to_price_ratio,lead_score
25,34628144,https://www.zillow.com/homedetails/1011-E-Manh...,https://photos.zillowstatic.com/fp/048e6e6f0f3...,1150,"1011 E Manhattan Blvd, Toledo, OH 43608",3,1.0,1316,SINGLE_FAMILY,41.687164,-83.524216,63800.0,,27,-200.0,1728457000000.0,2024-09-23 00:00:00,"[trustedListing, paid, singleFamilyPaid, feedC...",0.87386,0.216301,True,0.538704,0.842141,0.752338
38,34692149,https://www.zillow.com/homedetails/238-Dickens...,https://photos.zillowstatic.com/fp/1769c004023...,1150,"238 Dickens Dr, Toledo, OH 43607",3,1.0,960,SINGLE_FAMILY,41.64111,-83.637085,78100.0,,37,-50.0,1727507000000.0,2024-10-01 00:00:00,"[trustedListing, paid, singleFamilyPaid, feedC...",1.197917,0.176697,True,0.749053,0.566748,0.726321
23,34685822,https://www.zillow.com/homedetails/640-Nevada-...,https://photos.zillowstatic.com/fp/e2b26eabdf9...,1195,"640 Nevada St, Toledo, OH 43605",3,1.0,1232,SINGLE_FAMILY,41.639603,-83.519485,71100.0,,24,-100.0,1728284000000.0,2024-09-25 00:00:00,"[trustedListing, paid, singleFamilyPaid, feedC...",0.969968,0.201688,True,0.475599,0.740527,0.686451
37,34638762,https://www.zillow.com/homedetails/1814-Bigelo...,https://photos.zillowstatic.com/fp/7386cdfc82b...,850,"1814 Bigelow St, Toledo, OH 43613",2,1.0,787,SINGLE_FAMILY,41.679058,-83.5872,46600.0,,16,-75.0,1728025000000.0,,"[paid, zillowRentalManager]",1.080051,0.218884,True,0.30732,0.860104,0.66697
9,34645040,https://www.zillow.com/homedetails/1225-Harvar...,https://photos.zillowstatic.com/fp/b15a1ecc684...,1499,"1225 Harvard Blvd, Toledo, OH 43614",3,1.0,1195,SINGLE_FAMILY,41.615574,-83.58626,116000.0,,33,-100.0,1728976000000.0,,"[paid, zillowRentalManager]",1.254393,0.155069,True,0.664914,0.416358,0.632509


## 🔎 **Step 7: Property Investigation**
1️⃣ **View Rental Listing**<br>
<img src="https://drive.google.com/uc?export=view&id=19A3a6pMeN2QTEcfECh6sHGMAD0FMHrid" alt="Image Description" width="70%"><br>
2️⃣ **View Property Records**<br>
<img src="https://drive.google.com/uc?export=view&id=1Ne23mEnrF6ZGBsqDM9_SN7Lo8af5TQhl" alt="Image Description" width="70%"><br>
3️⃣ **Find Individual Owner**<br>
<img src="https://drive.google.com/uc?export=view&id=1KU3m2B-3mOUKcpBid9BdkFap0KBijD9j" alt="Image Description" width="50%"><br>
4️⃣ **Skip Trace**<br>
<img src="https://drive.google.com/uc?export=view&id=1Rp7k8KWyy9HFbpwWpdC-QKdI2wQ990am" alt="Image Description" width="50%"><br>

In [36]:
# save to a csv file (i.e. open it up in google sheets or excel)
df_score.to_csv('sample_rental_listings.csv', index=False)

## 🔮 **Considerations for the Future**

While this notebook provides a solid foundation for finding tired landlord leads via Zillow rental listings, there are several ways to extend and improve this process:

➡️ Join our LIVE Alternative Leads Mini Series: <a href="https://www.coffeeclozers.com/landing-pages/alternative-leads-lists-mini-series" target="_blank">
  <button style="padding:10px; background-color:#4CAF50; color:white; border:none; border-radius:5px; cursor:pointer;">
    Learn More
  </button>
</a>

### 1. **Data Accuracy & Enrichment**
   - **APIs for Additional Data**: Consider integrating more data sources, such as property history and owner financing information, to enrich your dataset and make more informed decisions.
   - **Skip Tracing Services**: Once you’ve identified potential leads, using a skip tracing service to find the landlord's contact information can streamline your outreach efforts.

### 2. **Automating Data Retrieval**
   - **Scheduling the Scrape**: Instead of manually running this notebook, you could set up a process to scrape rental listings at regular intervals using a cloud service (e.g., AWS Lambda or Google Cloud Functions) to stay updated with new leads.
   - **Web Scraping with Automations**: If Zillow's API becomes restrictive, consider using tools like **Browse AI** or **PhantomBuster** for scraping listings dynamically without violating terms of service.

### 3. **Enhanced Analysis**
   - **Sentiment Analysis on Listings**: You could perform a text analysis on the listing descriptions to detect motivated sellers. For example, phrases like "price negotiable" can provide additional insights into seller motivation.
   - **Time-on-Market Analysis**: Visualize the time on market over several months to detect trends in particular neighborhoods, indicating areas where landlords are struggling to rent out properties.

### 4. **Contact Management & Outreach**
   - **CRM Integration**: Once you have a list of tired landlord leads, consider integrating the data into a CRM like **GoHighLevel** or **HubSpot** for automated outreach. This would allow you to create automated email or SMS campaigns targeting landlords.
   - **Personalization in Outreach**: The more personalized your outreach, the better. Consider using the listing details in your email or SMS templates to make your offers more appealing.

### 5. **Predictive Modeling**
   - **Machine Learning for Lead Scoring**: You could build a predictive model that scores each property based on how likely the landlord is to sell. Inputs could include time on market, number of price reductions, and rental market conditions in the area.
   - **Market Demand Prediction**: Predict future rental demand in certain areas using external data such as employment trends, population growth, and new developments.

---

By incorporating some or all of these enhancements, you can take your lead generation efforts to the next level and stay competitive in finding off-market real estate deals.

# End Notebook