## Research Question

**What spatial patterns emerge from eviction warrant data in Baltimore City, and how can this data be used to inform policies that address housing instability and displacement?**


## Abstract

This exercise analyzes eviction warrant data in Baltimore City to identify spatial patterns related to housing instability. Using geospatial analysis techniques, it examines the geographic distribution of eviction events and the socio-economic factors that may contribute to those patterns. The analysis aims to inform policy recommendations that mitigate displacement and promote housing stability in vulnerable neighborhoods, and to show how eviction data can support urban planning and policy decisions aimed at reducing housing insecurity.


In [1]:
# Enable autoreloading so edits to the local modules are picked up automatically.
%load_ext autoreload
%autoreload 2

# Import third-party libraries.
import pandas as pd
import geopandas as gpd

# Import local helper modules for this exercise.
import utils
import census_geocode
from census_geocode import geocode_csvs
import exercise03
from exercise03 import prep_warrants_for_geocoding


In [2]:
# Load warrants and make sure zip codes are stored as strings without decimals.
warrants_df = pd.read_csv('md_eviction_warrants_through_2024.csv')
warrants_df['TenantZipCode'] = warrants_df['TenantZipCode'].astype('Int64').astype('string')
len(warrants_df)

411040
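
The two-step cast above matters because a zip code column with missing values is read as floats, so `21201` would otherwise become the string `21201.0` and fail to match during later joins. A minimal sketch on made-up values (the series below is illustrative, not taken from the warrant file):

```python
import pandas as pd

# Toy zip column as pandas would infer it from a CSV with gaps:
# the missing value forces a float dtype, so 21201 is stored as 21201.0.
raw = pd.Series([21201.0, None, 21218.0], name="TenantZipCode")

# Nullable Int64 drops the ".0" while keeping the gap as <NA>;
# the string cast then yields clean text keys for merging.
clean = raw.astype("Int64").astype("string")

print(clean.tolist())
```

One caveat of going through an integer type is that leading zeros are lost, so any out-of-state zip codes that begin with `0` would need to be re-padded, for example with `.str.zfill(5)`.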

In [3]:
# Prepare unique addresses for geocoding.
geocode_input_df = exercise03.prep_warrants_for_geocoding(warrants_df)

411040 warrants input
Reduced to 167949 unique addresses
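
The body of `prep_warrants_for_geocoding` is not shown in this notebook. As a rough sketch of the kind of deduplication it reports, assuming it keeps one row per distinct combination of the four address columns, a hypothetical stand-in (`unique_addresses` is an illustrative name, and the toy rows are made up) could look like this:

```python
import pandas as pd

ADDRESS_COLS = ["TenantAddress", "TenantCity", "TenantState", "TenantZipCode"]


def unique_addresses(warrants: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in: one row per distinct address, with a fresh 0..n-1 index."""
    unique = warrants[ADDRESS_COLS].drop_duplicates().reset_index(drop=True)
    print(f"{len(warrants)} warrants input")
    print(f"Reduced to {len(unique)} unique addresses")
    return unique


# Made-up example: two warrants share an address, so three rows reduce to two.
toy = pd.DataFrame({
    "TenantAddress": ["100 Main St", "100 Main St", "5 Oak Ave"],
    "TenantCity": ["Baltimore", "Baltimore", "Baltimore"],
    "TenantState": ["MD", "MD", "MD"],
    "TenantZipCode": ["21201", "21201", "21218"],
})
addresses = unique_addresses(toy)
```

Geocoding each distinct address once and joining the result back to the warrants avoids sending the same address to the geocoder repeatedly.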


In [4]:
# Split dataframe into smaller chunks (sub-dataframes) with fewer than 10,000 rows each.
geocode_input_dfs = utils.chunk_dataframe(geocode_input_df, 9999)

# Save each dataframe as a CSV without a header.
utils.save_dfs_to_csv(geocode_input_dfs, 'geocode_inputs', header=False)

split dataframe into 17 chunks
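
The chunking step exists because the Census batch geocoder limits how many records a single file may contain, so the address list has to be split before submission. The implementation of `utils.chunk_dataframe` is not shown here; a minimal sketch of the same idea, using a hypothetical helper named `chunk_rows`, might be:

```python
import pandas as pd


def chunk_rows(df: pd.DataFrame, chunk_size: int) -> list:
    """Split df into consecutive slices of at most chunk_size rows (illustrative helper)."""
    return [df.iloc[start:start + chunk_size] for start in range(0, len(df), chunk_size)]


# A placeholder frame with the same number of rows as the unique address list.
demo = pd.DataFrame({"row": range(167949)})
chunks = chunk_rows(demo, 9999)
print(f"split dataframe into {len(chunks)} chunks")
```

With 167,949 rows and a chunk size of 9,999, this yields 16 full chunks plus a final chunk of 7,965 rows, which agrees with the 17 chunks reported above.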


In [5]:
# Geocode addresses with the Census Geocoder.
census_geocode.geocode_csvs('geocode_inputs', 'geocode_outputs')

Processing file: geocode_inputs\df_0.csv
Saved results to: geocode_outputs\geocoderesult_df_0.csv
Processing file: geocode_inputs\df_1.csv
Saved results to: geocode_outputs\geocoderesult_df_1.csv
Processing file: geocode_inputs\df_10.csv
Saved results to: geocode_outputs\geocoderesult_df_10.csv
Processing file: geocode_inputs\df_11.csv
Saved results to: geocode_outputs\geocoderesult_df_11.csv
Processing file: geocode_inputs\df_12.csv
Saved results to: geocode_outputs\geocoderesult_df_12.csv
Processing file: geocode_inputs\df_13.csv
Saved results to: geocode_outputs\geocoderesult_df_13.csv
Processing file: geocode_inputs\df_14.csv
Saved results to: geocode_outputs\geocoderesult_df_14.csv
Processing file: geocode_inputs\df_15.csv
Saved results to: geocode_outputs\geocoderesult_df_15.csv
Processing file: geocode_inputs\df_16.csv
Saved results to: geocode_outputs\geocoderesult_df_16.csv
Processing file: geocode_inputs\df_2.csv
Saved results to: geocode_outputs\geocoderesult_df_2.csv
Proces

In [6]:
# Recombine outputs from geocoder into a single dataframe.
geocode_output_df = exercise03.combine_census_geocoded_csvs('geocode_outputs')
len(geocode_output_df)

167949
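
The geocoder log above lists files in lexicographic order (`df_0`, `df_1`, `df_10`, ...), so simply concatenating the outputs would not reproduce the original row order. The code of `combine_census_geocoded_csvs` is not shown; the sketch below assumes the outputs are header-less CSVs whose first field is the input row's id, and it reuses the `match_*` column names that appear later in this notebook. The helper name `combine_geocoded` and the two sample rows are made up for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

GEOCODE_COLS = [
    "unique_id", "input_address", "match_status", "match_type",
    "match_address", "match_lon_lat", "match_tiger_line_id", "match_tiger_line_side",
]


def combine_geocoded(folder) -> pd.DataFrame:
    """Read every header-less result CSV in folder and restore order by unique_id."""
    frames = [
        pd.read_csv(path, header=None, names=GEOCODE_COLS, dtype=str)
        for path in sorted(Path(folder).glob("*.csv"))
    ]
    combined = pd.concat(frames, ignore_index=True)
    combined["unique_id"] = combined["unique_id"].astype(int)
    return combined.set_index("unique_id").sort_index()


# Made-up result files: one matched row and one unmatched row with fewer fields.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "geocoderesult_df_0.csv").write_text(
        '"1","1 Nowhere Rd, Baltimore, MD, 21201","No_Match"\n'
    )
    Path(tmp, "geocoderesult_df_1.csv").write_text(
        '"0","100 Main St, Baltimore, MD, 21201","Match","Exact",'
        '"100 MAIN ST, BALTIMORE, MD, 21201","-76.61,39.29","123","L"\n'
    )
    combined = combine_geocoded(tmp)

print(combined[["match_status", "match_type", "match_lon_lat"]])
```

Indexing by the id and sorting is what would make a later index-based merge with the input addresses safe, provided the id written to each input file was the input dataframe's index.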

In [7]:
# Join geocoder results to the input addresses by index, so each result keeps its separate address, city, state, and zip fields.
geocoded_df = geocode_input_df.merge(geocode_output_df, left_index=True, right_index=True)
len(geocoded_df)

167949

In [8]:
# Use address, city, state, and zip columns to join geocodes onto original warrant records.
warrants_df = warrants_df.merge(geocoded_df, on=['TenantAddress','TenantCity','TenantState','TenantZipCode'])
len(warrants_df)

411040
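
The unchanged row count (411,040) suggests the join neither dropped nor duplicated warrants. One way to make that check explicit, sketched here on made-up rows rather than the real data, is to use the `how`, `validate`, and `indicator` options of `DataFrame.merge`:

```python
import pandas as pd

KEYS = ["TenantAddress", "TenantCity", "TenantState", "TenantZipCode"]

# Made-up warrants: the first two share an address.
warrants = pd.DataFrame({
    "CaseNumber": ["A1", "A2", "B1"],
    "TenantAddress": ["100 Main St", "100 Main St", "5 Oak Ave"],
    "TenantCity": ["Baltimore"] * 3,
    "TenantState": ["MD"] * 3,
    "TenantZipCode": ["21201", "21201", "21218"],
})

# Made-up geocodes: one row per unique address.
geocoded = pd.DataFrame({
    "TenantAddress": ["100 Main St", "5 Oak Ave"],
    "TenantCity": ["Baltimore"] * 2,
    "TenantState": ["MD"] * 2,
    "TenantZipCode": ["21201", "21218"],
    "match_type": ["Exact", "Non_Exact"],
})

# how="left" keeps every warrant; validate raises if an address appears twice
# on the right side; indicator adds a _merge column showing which rows matched.
merged = warrants.merge(geocoded, on=KEYS, how="left",
                        validate="many_to_one", indicator=True)
print(merged["_merge"].value_counts())
```

With a left join, warrants whose address has no geocode row would show up as `left_only` instead of silently disappearing, as they would under the default inner join.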

In [9]:
# Convert warrants into a geodataframe with points.
warrants_gdf = utils.lonlat_str_to_geodataframe(warrants_df, 'match_lon_lat')
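
The helper `utils.lonlat_str_to_geodataframe` is not shown. Judging by the `lon` and `lat` columns that appear in the final dataframe, it presumably parses the `"lon,lat"` string returned by the geocoder. The sketch below covers only that parsing step with pandas, using a hypothetical function `split_lon_lat` and made-up coordinates:

```python
import pandas as pd


def split_lon_lat(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return a copy of df with numeric lon and lat columns parsed from 'lon,lat' strings."""
    parts = df[column].str.split(",", expand=True)
    out = df.copy()
    out["lon"] = pd.to_numeric(parts[0], errors="coerce")
    out["lat"] = pd.to_numeric(parts[1], errors="coerce")
    return out


# Made-up values: one geocoded address and one without a match.
toy = pd.DataFrame({"match_lon_lat": ["-76.61,39.29", None]})
parsed = split_lon_lat(toy, "match_lon_lat")
print(parsed)
```

From there, `geopandas.points_from_xy(parsed["lon"], parsed["lat"])` can build the point geometries for a `GeoDataFrame`; the coordinate reference system to assign is left open here and should follow the geocoder's documentation.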

In [10]:
# Calculate proportion of records that received a valid geocode.
len(warrants_gdf[warrants_gdf.lon.notnull()]) / len(warrants_gdf)

0.9462971973530556

In [11]:
# Calculate proportion of records with exact geocode matches.
len(warrants_gdf[warrants_gdf.match_type == 'Exact']) / len(warrants_gdf)

0.5531189178668743
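
The two proportions above (about 94.6% of warrants geocoded, about 55.3% matched exactly) can also be read from a single summary. A small sketch on made-up rows, assuming the same `lon` and `match_type` columns:

```python
import pandas as pd

# Made-up records: three geocoded (two exact, one non-exact) and one unmatched.
toy = pd.DataFrame({
    "lon": [-76.61, -76.60, None, -76.59],
    "match_type": ["Exact", "Non_Exact", None, "Exact"],
})

# Share of rows with a coordinate: the mean of a boolean mask.
geocoded_share = toy["lon"].notna().mean()

# Share of each match type, counting unmatched rows as their own group.
match_shares = toy["match_type"].value_counts(normalize=True, dropna=False)

print(geocoded_share)
print(match_shares)
```

Keeping `dropna=False` means the shares are computed over all records, the same denominator used in the two cells above.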

In [19]:
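# Save the geocoded warrants as GeoParquet so the geocoding steps do not need to be rerun.
warrants_gdf.to_parquet('md_eviction_warrants_through_2024.geoparquet')

In [20]:
# Reload the saved file to confirm that it reads back as a geodataframe.
gdf = gpd.read_parquet('md_eviction_warrants_through_2024.geoparquet')

In [21]:
# List the columns: original warrant fields, geocoder outputs, and the point geometry.
gdf.columns.tolist()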

['ID',
 'EventDate',
 'EventType',
 'EventComment',
 'County',
 'Location',
 'TenantAddress',
 'TenantCity',
 'TenantState',
 'TenantZipCode',
 'CaseType',
 'CaseNumber',
 'EvictedDate',
 'Source',
 'SourceDate',
 'Year',
 'EvictionYear',
 'unique_id',
 'input_address',
 'match_status',
 'match_type',
 'match_address',
 'match_lon_lat',
 'match_tiger_line_id',
 'match_tiger_line_side',
 'lon',
 'lat',
 'geometry']