# How to Automate Building Your Foreclosure List with Selenium in Python | Easy Tutorial

## Overview
| Detail Tag            | Information                                                                                        |
|-----------------------|----------------------------------------------------------------------------------------------------|
| Originally Created By | Ariel Herrera arielherrera@analyticsariel.com                                                      |
| External References   | <a href="https://www.hillsborough.realforeclose.com/index.cfm" target="_blank">County Foreclosure Site</a>|
| Input Datasets        | Login Credentials                                                                                    |
| Output Datasets       | Table of Foreclosure Properties                                              |
| Input Data Source     | Dataframe                                                                                                |
| Output Data Source    | Dataframe                                                                                                   |

## History
| Date         | Developed By  | Reason                                                |
|--------------|---------------|-------------------------------------------------------|
| 6th Feb 2021 | Ariel Herrera | Notebook created to automate creating foreclosure list. |

## Other Details
This Notebook is a prototype.

## <font color="blue">Install Packages</font>

In [1]:
# Install a pip package in the current Jupyter kernel
import sys
!{sys.executable} -m pip install beautifulsoup4
!{sys.executable} -m pip install selenium
!{sys.executable} -m pip install webdriver-manager



In [2]:
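# ChromeDriver is downloaded automatically by webdriver-manager when the driver is set up below.
# Manual install reference: https://sites.google.com/a/chromium.org/chromedriver/home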

## <font color="blue">Imports</font>

In [3]:
from bs4 import BeautifulSoup
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
import pandas as pd

## <font color="blue">Local and Constant Variables</font>

In [4]:
# set chrome options
chrome_options = webdriver.ChromeOptions()
# chrome_options.add_argument('--headless') # run in background
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')

In [5]:
# set parameters
url = "https://www.hillsborough.realforeclose.com/index.cfm"

# login_credentials.csv holds one row per setting in 'key' and 'value' columns
df_login = pd.read_csv('login_credentials.csv')
username = df_login.loc[df_login['key'] == "username", 'value'].iloc[0]
password = df_login.loc[df_login['key'] == "password", 'value'].iloc[0]

# date
date_str = "2021-02-26"

print(username)
print(date_str)

aherrera614
2021-02-26
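
The credential lookup above filters the frame once per key. As a minimal sketch, assuming the file keeps the `key` and `value` header columns those lookups imply, the same file can be read into a plain dictionary with the standard library. The helper name `load_credentials` and the placeholder values are hypothetical and only serve the illustration.

```python
import csv
import os
import tempfile


def load_credentials(path):
    """Return {key: value} from a CSV with 'key' and 'value' header columns."""
    with open(path, newline="") as f:
        return {row["key"]: row["value"] for row in csv.DictReader(f)}


# Demo with a throwaway file holding placeholder values (not real credentials).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "login_credentials.csv")
    with open(demo_path, "w", newline="") as f:
        f.write("key,value\nusername,example_user\npassword,example_pass\n")
    creds = load_credentials(demo_path)

print(creds["username"])  # example_user
```

A dictionary makes a missing key fail loudly with a `KeyError` instead of an `IndexError` from `.iloc[0]` on an empty selection.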


## <font color="blue">Navigate Website</font>

In [6]:
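# set up driver (Selenium 4 takes the driver path through a Service object)
from selenium.webdriver.chrome.service import Service
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=chrome_options)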
# get url
driver.get(url)

[WDM] - Current google-chrome version is 88.0.4324
[WDM] - Get LATEST driver version for 88.0.4324
[WDM] - Driver [/Users/arielherrera/.wdm/drivers/chromedriver/mac64/88.0.4324.96/chromedriver] found in cache


In [7]:
# enter in credentials
from selenium.webdriver.common.by import By  # locator strategies for find_element

# username
login_user_name_elem = driver.find_element(By.ID, "LogName")
login_user_name_elem.send_keys(username)

# password
login_password_elem = driver.find_element(By.ID, "LogPass")
login_password_elem.send_keys(password)

In [8]:
# login
login_elem = driver.find_element(By.ID, "LogButton")
login_elem.click()

In [9]:
# bypass notice
enter_notice_elem = driver.find_element(By.XPATH, "//input[@type='button'][@value='OK']")
enter_notice_elem.click()

In [10]:
# calendar
calendar_elem = driver.find_element(By.XPATH, "//span[contains(text(), 'Calendar')]")
calendar_elem.click()

In [11]:
# day input
day_str = str(int(date_str.split("-")[-1])) # drop any leading zero, e.g. "05" -> "5"
print(date_str)
print("Day:", day_str)

# calendar date: only the day is used, so the calendar must already show the month in date_str;
# contains() is a substring match and find_element returns the first matching span
calendar_date_elem = driver.find_element(
    By.XPATH, "//span[@class='CALNUM'][contains(text(), '{0}')]".format(day_str))
calendar_date_elem.click()

2021-02-26
Day: 26
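
The day is extracted above by splitting the string, which accepts any text that ends in digits. As a minimal sketch, assuming `date_str` always follows the `YYYY-MM-DD` pattern shown, `datetime.strptime` yields the same day value and rejects malformed or impossible dates. The helper name `day_of_month` is hypothetical.

```python
from datetime import datetime


def day_of_month(date_str):
    """Parse a YYYY-MM-DD string and return the day without a leading zero."""
    return str(datetime.strptime(date_str, "%Y-%m-%d").day)


print(day_of_month("2021-02-26"))  # 26
print(day_of_month("2021-03-05"))  # 5
```

An invalid date such as `"2021-02-30"` raises `ValueError` before any browser interaction takes place.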


#### <font color="green">Get Auction Content</font>

In [12]:
# get page contents
soup = BeautifulSoup(driver.page_source, 'html.parser')

##### <font color="purple">Get Auction Detail</font>

In [15]:
# create list
auction_details_list = []

# find auctions stats
auction_details_content = soup.find_all("div", {"class": "AUCTION_DETAILS"})

In [16]:
# view contents of first element
print(auction_details_content[0].prettify())

<div class="AUCTION_DETAILS">
 <table class="ad_tab" tabindex="0">
  <tbody>
   <tr>
    <th class="AD_LBL" scope="row">
     Auction Type:
    </th>
    <td class="AD_DTA">
     FORECLOSURE
    </td>
   </tr>
   <tr>
    <th aria-label="Case Number" class="AD_LBL" scope="row">
     Case #:
    </th>
    <td class="AD_DTA">
     <a href="/index.cfm?zaction=auction&amp;zmethod=details&amp;AID=1282294&amp;bypassPage=1">
      292019CA012595A001HC
     </a>
    </td>
   </tr>
   <tr>
    <th class="AD_LBL" scope="row">
     Final Judgment Amount:
    </th>
    <td class="AD_DTA">
     $197,273.19
    </td>
   </tr>
   <tr>
    <th class="AD_LBL" scope="row">
     Parcel ID:
    </th>
    <td class="AD_DTA">
     2031065WA000015000450U
    </td>
   </tr>
   <tr>
    <th class="AD_LBL" scope="row">
     Property Address:
    </th>
    <td class="AD_DTA">
     10313 HUNTERS HAVEN BLVD
    </td>
   </tr>
   <tr>
    <th class="AD_LBL" scope="row">
    </th>
    <td class="AD_DTA">
     RIVERV

In [17]:
auction_details_content[0].find_all("th", {"class": "AD_LBL"})

[<th class="AD_LBL" scope="row">Auction Type:</th>,
 <th aria-label="Case Number" class="AD_LBL" scope="row">Case #:</th>,
 <th class="AD_LBL" scope="row">Final Judgment Amount:</th>,
 <th class="AD_LBL" scope="row">Parcel ID:</th>,
 <th class="AD_LBL" scope="row">Property Address:</th>,
 <th class="AD_LBL" scope="row"></th>,
 <th class="AD_LBL" scope="row">Assessed Value:</th>,
 <th class="AD_LBL" scope="row">Plaintiff Max Bid:</th>]

In [18]:
auction_details_content[0].find_all("td", {"class": "AD_DTA"})

[<td class="AD_DTA">FORECLOSURE</td>,
 <td class="AD_DTA"><a href="/index.cfm?zaction=auction&amp;zmethod=details&amp;AID=1282294&amp;bypassPage=1">292019CA012595A001HC</a></td>,
 <td class="AD_DTA">$197,273.19</td>,
 <td class="AD_DTA">2031065WA000015000450U</td>,
 <td class="AD_DTA">10313 HUNTERS HAVEN BLVD</td>,
 <td class="AD_DTA">RIVERVIEW, FL- 33578</td>,
 <td class="AD_DTA">$147,997.00</td>,
 <td class="AD_DTA ASTAT_MSGPB">Hidden</td>]
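
The second address cell packs city, state, and ZIP code into one string such as `RIVERVIEW, FL- 33578`. As a minimal sketch, assuming every city line follows that `CITY, ST- 12345` shape, a regular expression can split it into separate fields. The pattern and the helper name `split_city_line` are illustrative assumptions and were only checked against the sample values shown in this notebook.

```python
import re

# City, two-letter state, a dash, then a five-digit ZIP code.
CITY_LINE = re.compile(
    r"^\s*(?P<city>[^,]+?)\s*,\s*(?P<state>[A-Z]{2})\s*-\s*(?P<zip>\d{5})\s*$"
)


def split_city_line(text):
    """Return {'city', 'state', 'zip'} for a 'CITY, ST- 12345' string, else None."""
    match = CITY_LINE.match(text)
    return match.groupdict() if match else None


print(split_city_line("RIVERVIEW, FL- 33578"))
# {'city': 'RIVERVIEW', 'state': 'FL', 'zip': '33578'}
```

Returning `None` for text that does not fit keeps rows with a blank address, like the ones that appear in the final table, from raising an error.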

In [19]:
# map each page label to its output column
label_to_column = {
    "Auction Type:": 'AUCTION_TYPE',
    "Case #:": 'CASE_NUMBER',
    "Final Judgment Amount:": 'FINAL_JUDGEMENT_AMOUNT',
    "Parcel ID:": 'PARCEL_ID',
    "Property Address:": 'PROPERTY_STREET',
    "Assessed Value:": 'ASSESSED_VALUE',
    "Plaintiff Max Bid:": 'PLAINTIFF_MAX_BID',
}
columns = ['AUCTION_STARTS', 'AUCTION_TYPE', 'CASE_NUMBER', 'FINAL_JUDGEMENT_AMOUNT', 'PARCEL_ID',
           'PROPERTY_STREET', 'PROPERTY_CITY', 'ASSESSED_VALUE', 'PLAINTIFF_MAX_BID']

# auction start times are looked up once; they are indexed in the same order as the details
auction_starts = soup.find_all("div", {"class": "ASTAT_MSGB Astat_DATA"})
d = {}

# iterate through details
for a_detail_row, a_detail in enumerate(auction_details_content):
    # start every row with empty strings so a missing label cannot raise or reuse old values
    row = dict.fromkeys(columns, "")
    row['AUCTION_STARTS'] = auction_starts[a_detail_row].text
    prior_label = ""

    # get all labels and values
    auction_detail_labels = a_detail.find_all("th", {"class": "AD_LBL"})
    auction_detail_values = a_detail.find_all("td", {"class": "AD_DTA"})

    # iterate through label/value pairs
    for label_elem, value_elem in zip(auction_detail_labels, auction_detail_values):
        label = label_elem.text.strip()
        if label in label_to_column:
            row[label_to_column[label]] = value_elem.text
        elif label == "" and prior_label == "Property Address:":
            # a blank label right after the street holds the city, state and zip
            row['PROPERTY_CITY'] = value_elem.text

        # set prior label before moving on to next
        prior_label = label

    d[a_detail_row] = row

In [20]:
df = pd.DataFrame(data=d).T # convert dictionary to dataframe
df.to_csv('{0}_auction_properties.csv'.format(date_str), index=False) # save
df

|   | AUCTION_STARTS | AUCTION_TYPE | CASE_NUMBER | FINAL_JUDGEMENT_AMOUNT | PARCEL_ID | PROPERTY_STREET | PROPERTY_CITY | ASSESSED_VALUE | PLAINTIFF_MAX_BID |
|---|----------------|--------------|-------------|------------------------|-----------|-----------------|---------------|----------------|-------------------|
| 0 | 02/26/2021 10:00 AM ET | FORECLOSURE | 292019CA012595A001HC | $197,273.19 | 2031065WA000015000450U | 10313 HUNTERS HAVEN BLVD | RIVERVIEW, FL- 33578 | $147,997.00 | Hidden |
| 1 | 02/26/2021 10:00 AM ET | FORECLOSURE | 292019CC065643A001HC | $4,578.43 |  |  |  |  | $4,578.43 |
| 2 | 02/26/2021 10:00 AM ET | FORECLOSURE | 292020CA007527A001HC | $154,930.44 | 20302080P000010000020U | 9225 STONE RIVER PL | RIVERVIEW, FL- 33578 | $152,322.00 | Hidden |
| 3 | 02/26/2021 10:00 AM ET | FORECLOSURE | 292020CC037960A001HC | $9,132.88 | 1932099NP000004000370U | 2022 PEACEFUL PALM ST | RUSKIN, FL- 33570 | $136,764.00 | Hidden |
| 4 | 02/26/2021 10:00 AM ET | FORECLOSURE | 292020CC040346A001HC | $10,449.27 | 1830053X9000000000060A | 4405 W FAIR OAKS AVE 6 | TAMPA, FL- 33611 | $71,687.00 | $10,805.57 |
| 5 | 02/26/2021 10:00 AM ET | FORECLOSURE | 292020CC043311A001HC | $5,938.50 | 2030189CO000060000080U | 6919 TOWERING SPRUCE DR | RIVERVIEW, FL- 33578 | $113,756.00 | Hidden |
| 6 | Canceled per County | FORECLOSURE | 292019CA012595A001HC | $0.00 |  |  |  |  | Hidden |
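
The dollar columns in the table are stored as text such as `$197,273.19`, and some cells hold `Hidden` or nothing at all, so they cannot be sorted or compared numerically as they stand. As a minimal sketch, assuming amounts always use a leading `$` and comma thousands separators as in the rows shown, they can be converted to `Decimal` values. The helper name `parse_dollars` is hypothetical.

```python
from decimal import Decimal, InvalidOperation


def parse_dollars(text):
    """Convert '$197,273.19' to Decimal('197273.19'); return None for non-amounts."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        # Covers placeholders such as 'Hidden' and empty cells.
        return None


print(parse_dollars("$197,273.19"))  # 197273.19
print(parse_dollars("Hidden"))       # None
```

With a pandas frame, a call such as `df['ASSESSED_VALUE'].map(parse_dollars)` would apply the conversion to a whole column.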


# End Notebook