<a href="https://colab.research.google.com/github/Tompotmelon/CSR_eeo2121_worksample/blob/main/EOgunsanya_HW6.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Hello there!

Welcome to an exercise on using Python for basic data analysis. For this walkthrough to be most helpful, you should have a very basic understanding of how to use Python (what a method is, what a dataframe is, how to read in data from a CSV file, etc.).

Here is an overview of what we'll be working with.

#### **Questions we are trying to answer:**

A [New York State-wide eviction moratorium](https://www.npr.org/sections/coronavirus-live-updates/2020/12/29/951042050/new-york-approves-eviction-moratorium-until-may) went into effect on December 28, 2020. This exercise begins to evaluate whether (in New York City) the moratorium was an effective tool for keeping New Yorkers in their homes. The most pressing questions become:
1. With the onset of the pandemic and the loss or significant reduction of income for many New Yorkers, did eviction rates increase?
2. Has the number of eviction proceedings in New York City decreased since December 28, 2020?
3. Are there any changes in the number of tenant harassment complaints that coincide with the eviction moratorium (given that eviction proceedings have been postponed)?
4. Based on the answers to questions 1 through 3, is there a need for further analysis focused on whether landlords receiving financial aid from the City through the Housing New York Plan participated in eviction proceedings during 2020 and 2021?

 

#### **Hypothesis:**

Given the economic shock of the pandemic, many households were unable to afford their rents. Evictions must have increased at the start of the pandemic and decreased with the introduction (and recent extension) of the COVID-19 Emergency Eviction and Foreclosure Prevention Act of 2020. In order to allow for a moratorium, the City would have to provide either rental assistance or some other form of funding given directly to landlords.


#### **Datasets we will use:**


> [Evictions (Department of Investigation)](https://data.cityofnewyork.us/City-Government/Evictions/6z8x-wfk4)

> [Housing Litigations (Department of Housing Preservation and Development)](https://data.cityofnewyork.us/Housing-Development/Housing-Litigations/59kj-x8nc)

# Structure of Operations

1. Set up the digital workspace
2. Working with City Evictions data: how did eviction rates change during 2020?
3. Working with Housing Litigations data: were there variations in complaints about tenant harassment?

# Set up
Setup begins with importing the necessary Python packages and then naming the locations of both datasets. We will be using the NYC Open Data API. An API (**A**pplication **P**rogramming **I**nterface) is a type of software that helps two computer programs communicate. In our case, the two programs are Python and the program holding the NYC Open Data database.

Before reading in and manipulating the data, it is important to look through the accompanying data dictionaries and other attached documents to understand what the data contains. These documents can be found at each linked page, along with the API endpoints defined below.


In [None]:
# Setup begins with importing the necessary Python packages.
# sodapy and requests will help us read data from NYC Open Data's API, pandas will hold the data in dataframes, and plotly.express will help us create charts.

!pip install sodapy

import pandas as pd
from sodapy import Socrata
import requests
import plotly.express as px

client = Socrata("data.cityofnewyork.us", None)
# The above line of code identifies us as unauthenticated users.
# If you do have an app token, a username, and a password, you can
# substitute that line of code with the following:
#  client = Socrata("data.cityofnewyork.us",
#                  MyAppToken,
#                  username="user@example.com",
#                  password="AFakePassword")

# Next we define the locations of the data for easier reference later on
# City Evictions data
url1 = 'https://data.cityofnewyork.us/resource/6z8x-wfk4.json'

# Housing Litigations data
url2 = 'https://data.cityofnewyork.us/resource/59kj-x8nc.json'






# Working with City Evictions Data
## How eviction rates changed during 2020 and the beginning of 2021

*Main questions:*
 

1.   *With the onset of the pandemic and the loss or significant reduction of income for many New Yorkers, did eviction rates increase?*
2.   *Has the number of eviction proceedings in New York City decreased since December 28, 2020?*


In this section we will use the API to pull the data we want from NYC Open Data and create a dataframe. The Evictions data contains over 66,000 rows with data from 2017 to 2021. We want to focus on the period starting in March 2019 for our comparisons.

It is helpful to look at the API documentation NYC Open Data provides. You can find that [here](https://dev.socrata.com/foundry/data.cityofnewyork.us/6z8x-wfk4).

### Data Retrieval Method 1:
This method uses client.get() to retrieve data. It is important to note that we use the [$limit](https://dev.socrata.com/docs/queries/limit.html) parameter to increase the limit because Socrata automatically caps it at 1000 rows per response. A quick scan of the data on the NYC Open Data site shows that we have 66,403 rows. 17,077 of these rows contain data from within the range of dates that interest us the most: between Mar/22/2019 (one year before New York PAUSE came into effect) and May/01/2021 (the initial end date for the eviction moratorium).
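
If the row count is not known ahead of time, one option is to request the data in pages by combining `$limit` with `$offset`. The sketch below only computes the page parameters and makes no network calls. The helper name `page_params` and the page size of 20,000 are illustrative choices, not part of sodapy or the Socrata API.

```python
def page_params(total_rows, page_size=1000):
    """Return the $limit/$offset pairs needed to cover total_rows."""
    pages = []
    for offset in range(0, total_rows, page_size):
        # The final page may be shorter than page_size.
        limit = min(page_size, total_rows - offset)
        pages.append({"$limit": limit, "$offset": offset})
    return pages


pages = page_params(66403, page_size=20000)
print(len(pages))   # number of requests needed
print(pages[-1])    # parameters for the final, shorter page
```

Each dictionary could then be sent with one request and the results concatenated into a single dataframe.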

In [None]:
response = client.get('6z8x-wfk4', limit=66403)
data = pd.json_normalize(response)
evictions = pd.DataFrame(data)
evictions

Unnamed: 0,court_index_number,docket_number,eviction_address,eviction_apt_num,executed_date,marshal_first_name,marshal_last_name,residential_commercial_ind,borough,eviction_zip,ejectment,eviction_possession,latitude,longitude,community_board,council_district,census_tract,bin,bbl,nta
0,26615/17,335203,2042 CHATTERTON AVENUE,BASEMENT,2018-01-10T00:00:00.000,Thomas,Bia,Residential,BRONX,10472,Not an Ejectment,Possession,40.827703,-73.854822,9,18,78,2025982,2037970028,Westchester-Unionport
1,Q069737/17,385710,105 BEACH 56TH PLACE,401,2017-11-09T00:00:00.000,Richard,McCoy,Residential,QUEENS,11692,Not an Ejectment,Possession,40.591060,-73.786140,14,31,97202,4459312,4159260001,Hammels-Arverne-Edgemere
2,23921/16,320746,820 ST ANNS AVENUE,4A,2017-03-16T00:00:00.000,John,Villanueva,Residential,BRONX,10455,Not an Ejectment,Possession,40.821114,-73.909942,1,17,75,2116169,2026187501,Melrose South-Mott Haven North
3,60752/18,084398,449 BEACH 68TH ST,2ND FL,2019-03-01T00:00:00.000,Henry,Daley,Residential,QUEENS,11692,Not an Ejectment,Possession,40.593820,-73.797276,14,31,954,4302875,4160420048,Hammels-Arverne-Edgemere
4,68407/17,14047,387 WARWICK STREET,,2018-01-23T00:00:00.000,Edward,Guida,Residential,BROOKLYN,11207,Not an Ejectment,Possession,40.674150,-73.886026,5,37,1150,3088866,3039990008,East New York
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
66398,62624/18,082096,1 ADRIAN AVENUE,3F,2019-01-17T00:00:00.000,Justin,Grossman,Residential,MANHATTAN,10463,Not an Ejectment,Possession,40.875895,-73.912471,8,10,309,1064542,1022150225,Marble Hill-Inwood
66399,54927/18,7834,172 EAST 106TH STREET - RETAIL STORE AND BASEMENT,,2018-04-19T00:00:00.000,Robert,Renzulli,Commercial,MANHATTAN,10016,Not an Ejectment,Possession,,,,,,,,
66400,74557/18,8665,122-25 LAKEVIEW LANE,,2019-02-14T00:00:00.000,Bernard,Blake,Residential,QUEENS,11434,Not an Ejectment,Possession,40.677843,-73.782051,12,28,288,4265875,4122540017,Baisley Park
66401,86541/16,004556,"453 CENTRAL AVE FRONT RM, SOUTH SIDE",2C,2017-01-04T00:00:00.000,Frank,Siracusa,Residential,BROOKLYN,11221,Not an Ejectment,Possession,,,,,,,,


In [None]:
# Filter for dates between Mar/22/2019 and May/01/2021.
evictions_recent = evictions[(evictions.executed_date > "2019-03-21T11:59:59.000")& (evictions.executed_date < "2021-05-02T00:00:00.000")]
evictions_recent

Unnamed: 0,court_index_number,docket_number,eviction_address,eviction_apt_num,executed_date,marshal_first_name,marshal_last_name,residential_commercial_ind,borough,eviction_zip,ejectment,eviction_possession,latitude,longitude,community_board,council_district,census_tract,bin,bbl,nta
15,55311/19,017241,404 WEST 40TH STREET,2,2019-08-06T00:00:00.000,George,"Essock, Jr.",Residential,MANHATTAN,10018,Not an Ejectment,Possession,40.757260,-73.993860,4,3,115,1013003,1007370041,Clinton
18,68486/18,095474,115 WEST 137TH ST,5A,2019-06-25T00:00:00.000,Henry,Daley,Residential,MANHATTAN,10030,Not an Ejectment,Possession,40.815722,-73.940643,10,9,228,1059997,1020060022,Central Harlem North-Polo Grounds
20,73916/17,10469,2301 7TH AVENUE - APARTMENT # 5,5,2019-05-08T00:00:00.000,Robert,Renzulli,Residential,MANHATTAN,10030,Not an Ejectment,Possession,,,,,,,,
24,61171/18,171328,3153 SEYMOUR AVENUE,1A,2019-06-06T00:00:00.000,Alfred,Locascio,Residential,BRONX,10469,Not an Ejectment,Possession,40.871432,-73.846479,12,12,364,2061682,2047590019,Eastchester-Edenwald-Baychester
25,63764/19,098891,1076 BROADWAY,UNIT: %STORE A-B + C,2019-11-15T00:00:00.000,Henry,Daley,Commercial,BROOKLYN,11221,Not an Ejectment,Possession,40.694588,-73.931011,3,36,289,3043191,3016000006,Stuyvesant Heights
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
66375,51615/19R,100174,480 TARGEE STREET A /K/A 478 TARGEE STREET & ...,-,2019-12-06T00:00:00.000,Justin,Grossman,Commercial,STATEN ISLAND,10304,Not an Ejectment,Possession,40.615473,-74.084691,1,49,29,5016385,5006470001,Stapleton-Rosebank
66378,49743/18,352272,1194 UNIVERSITY AVENUE,3,2019-05-15T00:00:00.000,Thomas,Bia,Residential,BRONX,10452,Not an Ejectment,Possession,40.837977,-73.927454,4,16,199,2003499,2025280008,Highbridge
66381,13692/19,116804,182 ST NICHOLAS AVE,06C,2019-11-26T00:00:00.000,Maxine,Chevlowe,Residential,MANHATTAN,10026,Not an Ejectment,Eviction,40.805962,-73.952803,10,9,220,1058446,1019250015,Central Harlem South
66383,53363/18,007134,15 HOLLAND AVENUE ALL ROOMS APT 1,1,2019-05-06T00:00:00.000,Frank,Siracusa,Residential,STATEN ISLAND,10303,Not an Ejectment,Possession,,,,,,,,


### Data Retrieval Method 2: The (Quicker) Alternative
This is an alternative method for getting the data into Google Colab. It filters for the desired date range in the request itself.
**Note that the date range start times are different for the two methods!**
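
The request URL for this method can also be assembled from a dictionary of parameters rather than typed by hand, which helps avoid stray spaces or unescaped quotes. This is a minimal sketch using only the standard library. The function name `build_query_url` is hypothetical, and the block only builds the string without contacting the API.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.cityofnewyork.us/resource/6z8x-wfk4.json"


def build_query_url(start, end, limit=20000):
    """Build an endpoint URL whose $where clause selects executed_date between start and end."""
    params = {
        "$limit": limit,
        "$where": f"executed_date between '{start}' and '{end}'",
    }
    # urlencode escapes spaces, quotes and '$' so the URL is well-formed.
    return f"{BASE_URL}?{urlencode(params)}"


url = build_query_url("2019-03-22T00:00:00.000", "2021-05-02T00:00:00.000")
print(url)
```

The resulting string can be passed to `requests.get`. Alternatively, `requests.get` accepts the same dictionary through its `params` argument.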

In [None]:
response2 = requests.get("https://data.cityofnewyork.us/resource/6z8x-wfk4.json?$limit=20000&$where= executed_date between '2019-03-22T00:00:00.000' and '2021-05-02T00:00:00.000'")
df2 = response2.json()
evict = pd.DataFrame(df2)
evict

Unnamed: 0,court_index_number,docket_number,eviction_address,eviction_apt_num,executed_date,marshal_first_name,marshal_last_name,residential_commercial_ind,borough,eviction_zip,ejectment,eviction_possession,latitude,longitude,community_board,council_district,census_tract,bin,bbl,nta
0,74278/18,352614,55 EAST 115TH STREET,6-I,2019-03-22T00:00:00.000,Thomas,Bia,Residential,MANHATTAN,10029,Not an Ejectment,Possession,40.799000,-73.944645,11,8,184,1088480,1016217502,East Harlem North
1,28394/18,348628,828 JACKSON AVENUE,3,2019-03-22T00:00:00.000,Thomas,Bia,Residential,BRONX,10456,Not an Ejectment,Possession,40.820392,-73.906138,1,17,77,2113379,2026470011,Melrose South-Mott Haven North
2,66036/18A,090409,137-15 220TH PLACE,2ND FLOOR,2019-03-22T00:00:00.000,Justin,Grossman,Residential,QUEENS,11413,Not an Ejectment,Possession,40.674571,-73.750687,13,31,358,4281935,4131270003,Laurelton
3,11106/18,20985,178 LOCKMAN AVE.,06B,2019-03-22T00:00:00.000,Edward,Guida,Residential,STATEN ISLAND,10303,Not an Ejectment,Eviction,40.632655,-74.161691,1,49,31901,5109118,5012450001,Mariner's Harbor-Arlington-Port Ivory-Granitev...
4,70531/18,087690,1324 ROGERS AVENUE,1L,2019-03-22T00:00:00.000,Justin,Grossman,Residential,BROOKLYN,11210,Not an Ejectment,Possession,40.637441,-73.951067,14,45,788,3120916,3052280062,Flatbush
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
17072,901310/19,358618,117 ROCKWOOD STREET,FIRST,2021-01-11T00:00:00.000,Thomas,Bia,Commercial,BRONX,10452,Not an Ejectment,Possession,40.841944,-73.913260,4,14,209,2008013,2028360030,West Concourse
17073,55269/20,13639,66-31 OTTO ROAD,,2021-01-12T00:00:00.000,Robert,Renzulli,Commercial,QUEENS,11385,Not an Ejectment,Possession,40.705458,-73.889294,5,30,61301,4089237,4036670579,Ridgewood
17074,65405/19,12421,442A EAST 14TH STREET - THE STREET LEVEL STORE...,,2021-02-04T00:00:00.000,Robert,Renzulli,Commercial,MANHATTAN,10009,Not an Ejectment,Possession,,,,,,,,
17075,79035/19,106064,2253 STRAUSS STRE ET A/K/A 47 NEWPORT STREET,STOREFRONT,2021-02-09T00:00:00.000,Justin,Grossman,Commercial,BROOKLYN,11212,Not an Ejectment,Possession,40.658878,-73.913787,16,42,896,3082249,3035970006,Brownsville


### Data Analysis
We will use line graphs to better understand our data. We will disaggregate the charts by rental unit type (column: residential_commercial_ind).

1. Total eviction proceedings per day between Mar/22/2019 and May/01/2021;
2. Total eviction proceedings per month during the entire date range.
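
Both counts rest on the same grouping idea: group the rows by date (and unit type), then count the rows in each group. Here is a small sketch on made-up records, assuming only the two column names used in this notebook.

```python
import pandas as pd

# Made-up records for illustration, not taken from the Evictions dataset.
toy = pd.DataFrame({
    "executed_date": pd.to_datetime(
        ["2019-03-22", "2019-03-22", "2019-03-22", "2019-03-25"]
    ),
    "residential_commercial_ind": [
        "Residential", "Residential", "Commercial", "Residential"
    ],
})

# size() counts the rows in each (date, unit type) group;
# reset_index turns the group labels back into ordinary columns.
daily = (
    toy.groupby(["executed_date", "residential_commercial_ind"])
       .size()
       .reset_index(name="count_evictions")
)
print(daily)
```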

#### Evictions per day

In [None]:
# Format date column
evict['executed_date'] = pd.to_datetime(evict['executed_date'])

# Create a new dataframe that counts the number of eviction proceedings per day.
count_evict = evict.groupby(['executed_date','residential_commercial_ind']).size().reset_index(name='count_evictions')
print(count_evict)

    executed_date residential_commercial_ind  count_evictions
0      2019-03-22                 Commercial                2
1      2019-03-22                Residential               47
2      2019-03-25                 Commercial               10
3      2019-03-25                Residential               65
4      2019-03-26                 Commercial                8
..            ...                        ...              ...
495    2021-01-11                 Commercial                1
496    2021-01-12                 Commercial                1
497    2021-02-04                 Commercial                1
498    2021-02-09                 Commercial                1
499    2021-03-02                 Commercial                1

[500 rows x 3 columns]

In [None]:
#Total evictions per day
total_evict_d = evict.groupby('executed_date').size().reset_index(name='count')
total_evict_d

Unnamed: 0,executed_date,count
0,2019-03-22,49
1,2019-03-25,75
2,2019-03-26,76
3,2019-03-27,82
4,2019-03-28,61
...,...,...
257,2021-01-11,1
258,2021-01-12,1
259,2021-02-04,1
260,2021-02-09,1


#### Evictions per month

We start off with the total aggregated evictions per month and then move on to the disaggregated data.
 
  
*Disaggregated Data*

Eviction proceedings for commercial units are separated from those for residential units to make it easier to regroup them by month instead of by day. We will merge the resulting two dataframes into a new data frame to create a chart showing the number of monthly eviction proceedings.
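
As an alternative to splitting and merging, the monthly counts for both unit types could be built in one step by grouping on a month period and spreading the unit types into columns. This is a sketch on made-up daily counts, not the approach the cells in this notebook take. The variable names are illustrative.

```python
import pandas as pd

# Made-up daily counts for illustration, not taken from the Evictions dataset.
toy_daily = pd.DataFrame({
    "executed_date": pd.to_datetime(
        ["2020-12-03", "2020-12-03", "2020-12-10", "2021-01-04"]
    ),
    "residential_commercial_ind": [
        "Residential", "Commercial", "Residential", "Commercial"
    ],
    "count_evictions": [1, 2, 2, 2],
})

# Label each row with its calendar month, e.g. 2020-12.
month = toy_daily["executed_date"].dt.to_period("M").rename("month_year")

# Sum counts per month and unit type, then spread unit types into columns.
# fill_value=0 keeps months in which one unit type has no rows.
monthly = (
    toy_daily.groupby([month, "residential_commercial_ind"])["count_evictions"]
             .sum()
             .unstack(fill_value=0)
             .reset_index()
)
print(monthly)
```

Because every month that appears for either unit type gets a row, no month is lost when one type has no proceedings.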



In [None]:
# Total evictions per month across the date range.
total_evict_m = total_evict_d.resample('M', on='executed_date').sum().reset_index()
total_evict_m['month_year'] = pd.to_datetime(total_evict_m['executed_date']).dt.to_period('M')
total_evict_m

Unnamed: 0,executed_date,count,month_year
0,2019-03-31,429,2019-03
1,2019-04-30,1746,2019-04
2,2019-05-31,1795,2019-05
3,2019-06-30,1583,2019-06
4,2019-07-31,1652,2019-07
5,2019-08-31,1515,2019-08
6,2019-09-30,1425,2019-09
7,2019-10-31,1432,2019-10
8,2019-11-30,1158,2019-11
9,2019-12-31,948,2019-12


In [None]:
# Disaggregated data.
comm_evict = count_evict[count_evict['residential_commercial_ind']=='Commercial']
comm_evict
res_evict = count_evict[count_evict['residential_commercial_ind']=='Residential']
res_evict

Unnamed: 0,executed_date,residential_commercial_ind,count_evictions
1,2019-03-22,Residential,47
3,2019-03-25,Residential,65
5,2019-03-26,Residential,68
7,2019-03-27,Residential,63
9,2019-03-28,Residential,57
...,...,...,...
486,2020-12-03,Residential,1
488,2020-12-07,Residential,1
490,2020-12-10,Residential,2
491,2020-12-11,Residential,1


In [None]:
# Commercial unit evictions per month
comm_evict_m = comm_evict.resample('M', on='executed_date').sum().reset_index()
comm_evict_m['month_year'] = pd.to_datetime(comm_evict_m['executed_date']).dt.to_period('M')
comm_evict_m

Unnamed: 0,executed_date,count_evictions,month_year
0,2019-03-31,50,2019-03
1,2019-04-30,161,2019-04
2,2019-05-31,177,2019-05
3,2019-06-30,128,2019-06
4,2019-07-31,159,2019-07
5,2019-08-31,109,2019-08
6,2019-09-30,107,2019-09
7,2019-10-31,139,2019-10
8,2019-11-30,113,2019-11
9,2019-12-31,88,2019-12


In [None]:
# Residential unit evictions per month
res_evict_m = res_evict.resample('M', on='executed_date').sum().reset_index()
res_evict_m['month_year'] = pd.to_datetime(res_evict_m['executed_date']).dt.to_period('M')
res_evict_m

Unnamed: 0,executed_date,count_evictions,month_year
0,2019-03-31,379,2019-03
1,2019-04-30,1585,2019-04
2,2019-05-31,1618,2019-05
3,2019-06-30,1455,2019-06
4,2019-07-31,1493,2019-07
5,2019-08-31,1406,2019-08
6,2019-09-30,1318,2019-09
7,2019-10-31,1293,2019-10
8,2019-11-30,1045,2019-11
9,2019-12-31,860,2019-12


**A quick warning: the merge below does not preserve the last three rows of comm_evict_m (January, February, and March 2021), so I manually add them back in.**
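
Rows can go missing in a merge when `pd.merge` uses its default inner join, which keeps only the keys found in both dataframes. In case that is what happened here, the sketch below contrasts an inner and an outer merge on made-up numbers.

```python
import pandas as pd

# Made-up monthly counts for illustration, not taken from the Evictions dataset.
res = pd.DataFrame({"month_year": ["2020-11", "2020-12"], "Residential": [1, 5]})
comm = pd.DataFrame({"month_year": ["2020-12", "2021-01"], "Commercial": [3, 4]})

# Default (inner) merge keeps only months present in both frames.
inner = pd.merge(res, comm, on="month_year")

# Outer merge keeps every month; gaps become NaN, filled here with 0.
outer = (
    pd.merge(res, comm, on="month_year", how="outer")
      .fillna(0)
      .sort_values("month_year")
      .reset_index(drop=True)
)
print(inner)
print(outer)
```

With `how="outer"`, months that exist on only one side survive the merge, so no rows need to be added by hand.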

In [None]:
# Merge the disaggregated data into a new dataframe using the month_year column as the key.
evict_m1 = pd.merge(left=res_evict_m, right=comm_evict_m, left_on= 'month_year', right_on='month_year').fillna(0)

# Rename the columns to avoid confusion
evict_m1.rename(columns={'count_evictions_x': 'Residential', 'count_evictions_y': 'Commercial'}, inplace=True)

# Delete the redundant columns
columns = ['executed_date_x', 'executed_date_y']
evict_m1.drop(columns, inplace=True, axis=1)

# Manually add in missing comm_evict_m rows for January, February, and March 2021.
# Each column is named explicitly so the counts land in the Commercial column.
# (pd.concat replaces DataFrame.append, which was removed in pandas 2.0.)
missing_rows = pd.DataFrame({'month_year': ['2021-01', '2021-02', '2021-03'],
                             'Commercial': [4, 2, 1],
                             'Residential': [0, 0, 0]})
evict_m2 = pd.concat([evict_m1, missing_rows], ignore_index=True)

# Reorder columns
evict_m = evict_m2[['month_year','Commercial','Residential']]
evict_m

Unnamed: 0,month_year,Commercial,Residential
0,2019-03,50,379
1,2019-04,161,1585
2,2019-05,177,1618
3,2019-06,128,1455
4,2019-07,159,1493
5,2019-08,109,1406
6,2019-09,107,1318
7,2019-10,139,1293
8,2019-11,113,1045
9,2019-12,88,860


#### Charts

In [None]:
# Create a line graph for evictions per day.
evict_chart_d = px.line(count_evict,
              x='executed_date',
              y='count_evictions',
              color='residential_commercial_ind',
              title='Eviction proceedings per day (Mar/22/2019 to May/01/2021)'
              )
evict_chart_d.update_layout(
    xaxis_title = 'Date',
    yaxis_title = 'Number of eviction proceedings',
    legend_title = 'Rental unit type')

evict_chart_d.show()

In [None]:
evict_m.dtypes

month_year     object
Commercial      int64
Residential     int64
dtype: object

In [None]:
# Create a line graph for evictions per month. But first, upgrade plotly.
!pip install plotly --upgrade

# Change dtype for month_year column to avoid ValueError when generating the chart.
evict_m['month_year']= evict_m['month_year'].astype(str)

evict_chart_m = px.line(evict_m,
              x='month_year',
              y=['Commercial','Residential'],
              title= 'Eviction Proceedings per month (Mar/22/2019 to May/01/2021)')
evict_chart_m.update_layout(
    xaxis_title = 'Month',
    yaxis_title = 'Number of eviction proceedings',
    legend_title = 'Rental unit type')

evict_chart_m

Requirement already up-to-date: plotly in /usr/local/lib/python3.7/dist-packages (4.14.3)


In [None]:
evict_bar_m = px.bar(evict_m,
              x='month_year',
              y=['Commercial','Residential'],
              title= 'Eviction Proceedings per month (Mar/22/2019 to May/01/2021)',
              height= 1000)

# We increase the bar graph height to make it easier to see the data from
# November 2020 to February 2021.

evict_bar_m.update_layout(
    xaxis_title = 'Month',
    yaxis_title = 'Number of eviction proceedings',
    legend_title = 'Rental unit type'
    )

evict_bar_m

## Conclusions
*Main questions:*



**1.   With the onset of the pandemic and the loss or significant reduction of income for many New Yorkers, did eviction rates increase?**

Contrary to the earlier hypothesis, eviction rates fell after the onset of the COVID-19 pandemic in 2020. Between March 14, 2020 and November 12, 2020 there were zero new eviction proceedings of either type. Residential units saw zero new proceedings between March 14, 2020 and November 29, 2020 (down from 52 on March 13). Commercial units followed a similar pattern, with zero proceedings from March 14, 2020 to November 12, 2020 (down from one on March 13). This suggests that another factor led to the temporary halt in eviction proceedings; more research is needed to determine what that factor was.

**2.   Has the number of eviction proceedings in New York City decreased since December 28, 2020?**

Yes and no. The number of eviction proceedings for residential units in New York City cannot be said to have decreased since the moratorium, because residential proceedings had already stopped completely after December 14, 2020. The number of eviction proceedings for commercial units dropped to zero between December 29, 2020 and January 3, 2021. It rose to 2 on January 4, 2021 before falling back to zero.
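
Stretches with zero proceedings like the ones described above can be found by looking for the widest gap between consecutive dates that have at least one proceeding. Here is a minimal sketch with made-up dates. The function name `longest_gap` is illustrative.

```python
from datetime import date

# Made-up dates on which at least one proceeding was recorded.
active_days = [
    date(2020, 3, 12),
    date(2020, 3, 13),
    date(2020, 11, 30),
    date(2020, 12, 3),
]


def longest_gap(days):
    """Return (last_day_before, first_day_after, gap_in_days) for the widest gap."""
    ordered = sorted(days)
    best = None
    for earlier, later in zip(ordered, ordered[1:]):
        gap = (later - earlier).days
        if best is None or gap > best[2]:
            best = (earlier, later, gap)
    return best


print(longest_gap(active_days))
```

On the real data, the same idea could be applied to the unique values of `executed_date` for each unit type.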

# Working with Housing Litigations Data

*Main Question*

3. *Are there any changes in the number of tenant harassment complaints that coincide with the eviction moratorium (given that eviction proceedings have been postponed)?*

### Data Retrieval

We will use one of the methods we used while exploring the Evictions data to retrieve litigation data that cites tenant harassment.

**Note: the data misspells harassment as "harrassment" so a correct spelling (or another incorrect spelling) will not yield any results.**

In [None]:
# Get the litigation data citing tenant harassment. 
response3 = requests.get("https://data.cityofnewyork.us/resource/59kj-x8nc.json?$limit=200000&$where= casetype like 'Tenant Action/Harrassment'")
# & caseopendate between '03/22/2019 00:00:00' and '05/02/2021 00:00:00'")
df3 = response3.json()
litigation = pd.DataFrame(df3)
litigation

Unnamed: 0,litigationid,buildingid,boroid,housenumber,streetname,zip,block,lot,casetype,caseopendate,casestatus,casejudgement,respondent,latitude,longitude,community_district,council_district,census_tract,bin,bbl,nta,findingofharassment,findingdate,penalty
0,377150,808361,3,1711,FULTON STREET,11233,1691,12,Tenant Action/Harrassment,10/14/2020 00:00:00,PENDING,NO,VERTICAL HOLDINGS,40.679338,-73.930374,3,36,297,3325183,3016910012,Crown Heig,,,
1,315486,81993,2,940,GRAND CONCOURSE,10451,2461,45,Tenant Action/Harrassment,07/31/2017 00:00:00,CLOSED,NO,940 CORPORATION,40.828783,-73.921617,4,8,18301,2002814,2024610045,East Conco,,,
2,127254,398217,3,171,WYCKOFF AVENUE,11237,3271,6,Tenant Action/Harrassment,04/14/2010 00:00:00,CLOSED,NO,"LUZVIMINDA RODRIGUEZ,ROMEO RODRIGUEZ",40.703255,-73.917516,4,37,443,3074568,3032710006,Bushwick N,,,
3,377611,81661,2,1188,GRAND CONCOURSE,10456,2456,163,Tenant Action/Harrassment,10/28/2020 00:00:00,PENDING,NO,"C/O MILEVOI,ML 1188 GRAND CONCOURSE",40.834089,-73.918106,4,16,18102,2002748,2024560163,East Conco,,,
4,96462,31015,1,418,WEST 20 STREET,10011,717,53,Tenant Action/Harrassment,03/17/2009 00:00:00,CLOSED,NO,LUBOMIR CHMELAR,40.744948,-74.003490,4,3,89,1012589,1007170053,Hudson Yar,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
11495,382386,668122,4,192-19,JAMAICA AVENUE,11423,10448,16,Tenant Action/Harrassment,02/17/2021 00:00:00,CLOSED,NO,NOVA TRADING INCORPORATED,40.713109,-73.767370,12,23,482,4222147,4104480016,Hollis,,,
11496,385592,926432,2,1313,SENECA AVENUE,10474,2761,112,Tenant Action/Harrassment,04/15/2021 00:00:00,PENDING,NO,MANUEL JIMENEZ,40.818945,-73.887057,2,17,11502,2114219,2027610112,Hunts Poin,,,
11497,386104,148873,3,969,40 STREET,11219,5583,77,Tenant Action/Harrassment,04/28/2021 00:00:00,PENDING,NO,CINDY XIV LIANGE,40.644387,-73.994004,12,38,110,3135208,3055830077,Sunset Par,,,
11498,384597,442847,4,94-19,54 AVENUE,11373,1893,29,Tenant Action/Harrassment,03/25/2021 00:00:00,CLOSED,NO,ELIZABETH CANELO,40.738625,-73.867989,4,25,457,4046868,4018930029,Elmhurst,,,


In [None]:
#Ensure that the caseopendate column is formatted as a datetime64 dtype.
litigation['caseopendate'] = pd.to_datetime(litigation['caseopendate'])
litigation.dtypes

litigationid                   object
buildingid                     object
boroid                         object
housenumber                    object
streetname                     object
zip                            object
block                          object
lot                            object
casetype                       object
caseopendate           datetime64[ns]
casestatus                     object
casejudgement                  object
respondent                     object
latitude                       object
longitude                      object
community_district             object
council_district               object
census_tract                   object
bin                            object
bbl                            object
nta                            object
findingofharassment            object
findingdate                    object
penalty                        object
dtype: object

In [None]:
# Filter for the desired date range.
litigation_recent = litigation[(litigation.caseopendate>'03/22/2019')&(litigation.caseopendate< '05/02/2021')]
litigation_recent

Unnamed: 0,litigationid,buildingid,boroid,housenumber,streetname,zip,block,lot,casetype,caseopendate,casestatus,casejudgement,respondent,latitude,longitude,community_district,council_district,census_tract,bin,bbl,nta,findingofharassment,findingdate,penalty
0,377150,808361,3,1711,FULTON STREET,11233,1691,12,Tenant Action/Harrassment,2020-10-14,PENDING,NO,VERTICAL HOLDINGS,40.679338,-73.930374,3,36,297,3325183,3016910012,Crown Heig,,,
3,377611,81661,2,1188,GRAND CONCOURSE,10456,2456,163,Tenant Action/Harrassment,2020-10-28,PENDING,NO,"C/O MILEVOI,ML 1188 GRAND CONCOURSE",40.834089,-73.918106,4,16,18102,2002748,2024560163,East Conco,,,
7,353561,30028,1,22,WEST 12 STREET,10011,575,41,Tenant Action/Harrassment,2019-04-12,CLOSED,NO,DOLORES GARCIA MORENO,40.735000,-73.995385,2,3,63,1009578,1005750041,West Villa,,,
14,377155,363590,3,1147,ROGERS AVENUE,11226,5193,64,Tenant Action/Harrassment,2020-10-14,PENDING,NO,PATRICK MARTINEZ,40.641956,-73.951521,17,45,828,3119949,3051930064,Erasmus,,,
15,376897,60515,2,2386,DAVIDSON AVENUE,10468,3199,53,Tenant Action/Harrassment,2020-09-23,PENDING,NO,ELIZA GOZIVODA,40.861404,-73.903217,7,14,253,2014319,2031990053,Kingsbridg,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
11495,382386,668122,4,192-19,JAMAICA AVENUE,11423,10448,16,Tenant Action/Harrassment,2021-02-17,CLOSED,NO,NOVA TRADING INCORPORATED,40.713109,-73.767370,12,23,482,4222147,4104480016,Hollis,,,
11496,385592,926432,2,1313,SENECA AVENUE,10474,2761,112,Tenant Action/Harrassment,2021-04-15,PENDING,NO,MANUEL JIMENEZ,40.818945,-73.887057,2,17,11502,2114219,2027610112,Hunts Poin,,,
11497,386104,148873,3,969,40 STREET,11219,5583,77,Tenant Action/Harrassment,2021-04-28,PENDING,NO,CINDY XIV LIANGE,40.644387,-73.994004,12,38,110,3135208,3055830077,Sunset Par,,,
11498,384597,442847,4,94-19,54 AVENUE,11373,1893,29,Tenant Action/Harrassment,2021-03-25,CLOSED,NO,ELIZABETH CANELO,40.738625,-73.867989,4,25,457,4046868,4018930029,Elmhurst,,,


### Data Analysis

#### Litigations per day

In [None]:
# Count of litigations per day
count_lit_d = litigation_recent.groupby(['caseopendate']).size().reset_index(name='counts')
count_lit_d

Unnamed: 0,caseopendate,counts
0,2019-03-25,9
1,2019-03-26,7
2,2019-03-27,7
3,2019-03-28,3
4,2019-03-29,9
...,...,...
488,2021-04-23,5
489,2021-04-26,9
490,2021-04-27,2
491,2021-04-28,4


#### Litigations per month

In [None]:
# Count of litigations per month
count_lit_m = count_lit_d.resample('M', on='caseopendate').sum().reset_index()
count_lit_m['month_year'] = pd.to_datetime(count_lit_m['caseopendate']).dt.to_period('M')
count_lit_m

Unnamed: 0,caseopendate,counts,month_year
0,2019-03-31,35,2019-03
1,2019-04-30,131,2019-04
2,2019-05-31,142,2019-05
3,2019-06-30,130,2019-06
4,2019-07-31,164,2019-07
5,2019-08-31,141,2019-08
6,2019-09-30,169,2019-09
7,2019-10-31,187,2019-10
8,2019-11-30,127,2019-11
9,2019-12-31,145,2019-12


#### Litigations per month compared to evictions per month

Before merging the two dataframes (one on evictions by month and the other on litigations by month), we need to make sure the month_year key is an object in both, since we turned it into an object when working with the eviction data.
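
The sketch below shows the dtype alignment on two small frames built from the first three months of the tables in this notebook. Converting the Period values to strings such as `'2019-03'` gives both keys the same dtype before merging. The frame names are illustrative.

```python
import pandas as pd

# One frame keyed by Period values, the other by plain strings.
lit = pd.DataFrame({
    "month_year": pd.period_range("2019-03", periods=3, freq="M"),
    "counts": [35, 131, 142],
})
evictions_by_month = pd.DataFrame({
    "month_year": ["2019-03", "2019-04", "2019-05"],
    "Residential": [379, 1585, 1618],
})

# Convert the Period keys to strings so both keys share a dtype.
lit["month_year"] = lit["month_year"].astype(str)

merged = pd.merge(evictions_by_month, lit, on="month_year")
print(merged)
```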

In [None]:
count_lit_m.dtypes

caseopendate    datetime64[ns]
counts                   int64
month_year           period[M]
dtype: object

In [None]:
# Change dtype for month_year column in count_lit_m
count_lit_m['month_year'] = count_lit_m['month_year'].astype(str)

# Merge
compare_lit_evict1 =  pd.merge(left=evict_m, right=count_lit_m, left_on= 'month_year', right_on='month_year').fillna(0)

# Rename columns
compare_lit_evict1.rename(columns={'counts': 'Litigations', 'Commercial': 'Commercial evictions', 'Residential': 'Residential evictions'}, inplace=True)

# Delete caseopendate column
compare_lit_evict1.drop('caseopendate', inplace=True, axis=1)

# Manually add missing data from April 2021
# (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
april_2021 = pd.DataFrame([['2021-04', 0, 0, 113]], columns=compare_lit_evict1.columns)
compare_lit_evict = pd.concat([compare_lit_evict1, april_2021], ignore_index=True)

compare_lit_evict

Unnamed: 0,month_year,Commercial evictions,Residential evictions,Litigations
0,2019-03,50,379,35
1,2019-04,161,1585,131
2,2019-05,177,1618,142
3,2019-06,128,1455,130
4,2019-07,159,1493,164
5,2019-08,109,1406,141
6,2019-09,107,1318,169
7,2019-10,139,1293,187
8,2019-11,113,1045,127
9,2019-12,88,860,145


#### Charts

In [None]:
# A line graph showing number of tenant harassment litigations per day.
lit_chart_d = px.line(count_lit_d,
              x= 'caseopendate',
              y= 'counts',
              title = 'Tenant Harassment Litigations per day (Mar/22/2019 to May/01/2021)'
          )
lit_chart_d.update_layout(
    xaxis_title = 'Date',
    yaxis_title = 'Number of tenant harassment litigations',
    )
lit_chart_d

In [None]:
# A line graph showing number of litigations per month
lit_chart_m = px.line(count_lit_m,
              x= 'month_year',
              y= 'counts',
              title = 'Tenant Harassment Litigations per month (Mar/22/2019 to May/01/2021)'
          )
lit_chart_m.update_layout(
    xaxis_title = 'Month',
    yaxis_title = 'Number of tenant harassment litigations'
    )
lit_chart_m

In [None]:
# A scatterplot showing number of litigations per month after converting month_year to datetime.

count_lit_m['month_year'] = pd.to_datetime(count_lit_m['month_year'])

lit_chart_m = px.scatter(count_lit_m,
              x= 'month_year',
              y= 'counts',
              title = 'Tenant Harassment Litigations per month (Mar/22/2019 to May/01/2021)',
              trendline = 'ols'
          )
lit_chart_m.update_layout(
    xaxis_title = 'Month',
    yaxis_title = 'Number of tenant harassment litigations'
    )
lit_chart_m

In [None]:
# A bar chart showing number of tenant harassment litigations filed per month.
# Comparing the line chart above with this bar chart shows that the latter
# is easier to read.
lit_chart_m = px.bar(count_lit_m,
              x= 'month_year',
              y= 'counts',
              title = 'Tenant Harassment Litigations per month (Mar/22/2019 to May/01/2021)'
          )
lit_chart_m.update_layout(
    xaxis_title = 'Month',
    yaxis_title = 'Number of tenant harassment litigations'
    )
lit_chart_m

In [None]:
# A line graph comparing evictions and tenant harassment litigations per month.
compare_chart = px.line(compare_lit_evict,
              x= 'month_year',
              y= ['Commercial evictions', 'Residential evictions', 'Litigations'],
              title = 'Eviction proceedings and tenant harassment litigations per month (Mar/22/2019 to May/01/2021)'
          )
compare_chart.update_layout(
    xaxis_title = 'Month',
    yaxis_title = 'Number'
    )
compare_chart.show()

In [None]:
# Create a grouped bar chart comparing monthly evictions and tenant harassment litigations.
import plotly.graph_objects as go

compare_bar = go.Figure(data=[
    go.Bar(name='Commercial unit evictions', x=compare_lit_evict['month_year'], y=compare_lit_evict['Commercial evictions']),
    go.Bar(name='Residential unit evictions', x=compare_lit_evict['month_year'], y=compare_lit_evict['Residential evictions']),
    go.Bar(name='Tenant harassment litigations', x=compare_lit_evict['month_year'], y= compare_lit_evict['Litigations'] )] 
    )
# Label the axes, set the height and title, and group the bars side by side
compare_bar.update_layout(
    xaxis_title = 'Month',
    yaxis_title = 'Number',
    height = 700,
    title = 'Eviction proceedings and tenant harassment litigations per month (Mar/22/2019 to May/01/2021)',
    barmode ='group')
compare_bar.show()

### Conclusions



**Are there any changes in the number of tenant harassment complaints that coincide with the eviction moratorium (given that eviction proceedings have been postponed)?**

Based on the data, the changes in the number of tenant harassment complaints do not appear to be related to the enactment of the eviction moratorium. Tenant harassment litigations had been increasing for several months before the moratorium, and that trend continued after it took effect. Between Mar/22/2019 and May/01/2021, the number of litigations did not vary much, except from April to June 2020.
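One rough way to put a number on this reading is to compare the average monthly count before and after the moratorium date. The sketch below is an assumption about how that check could look, not the method used in this notebook: `mean_before_after` and `demo` are hypothetical names, and the demo rows are a handful of values copied from the table output plus the appended 2021-04 row.

```python
import pandas as pd

MORATORIUM_START = pd.Timestamp('2020-12-28')  # date given in the introduction

def mean_before_after(df, value_col, date_col='month_year', cutoff=MORATORIUM_START):
    """Return the mean monthly count before the cutoff and on or after it."""
    # Month labels such as '2020-12' parse to the first day of that month,
    # so December 2020 is counted as "before" the moratorium.
    months = pd.to_datetime(df[date_col])
    after = months >= cutoff
    return {
        'before': df.loc[~after, value_col].mean(),
        'after': df.loc[after, value_col].mean(),
    }

# Illustrative rows only, copied from values shown earlier in this section.
# In the notebook, pass the full compare_lit_evict dataframe instead.
demo = pd.DataFrame({
    'month_year': ['2019-10', '2019-11', '2019-12', '2021-04'],
    'Litigations': [187, 127, 145, 113],
})
summary = mean_before_after(demo, 'Litigations')
print(summary)
```

A difference in means says nothing about cause on its own, so this is best read alongside the charts rather than in place of them.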

The pandemic may have driven the increase after April 2020: the number of litigations rose steeply in June 2020, a few months after the pandemic began, when its financial strain would have become more apparent.

One interesting pattern in the data is the reduction in litigations between January 2020 and April 2020, which coincides with the reductions in commercial and residential evictions.
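To check whether several series move together over a window like this, one option is to compute the percent change in each column between the first and last month of the window. The sketch below assumes the column layout of `compare_lit_evict`; `pct_change_over_window` and `demo` are hypothetical names, and the demo uses 2019 rows from the table output because the 2020 rows are not reproduced here.

```python
import pandas as pd

def pct_change_over_window(df, start, end, date_col='month_year'):
    """Percent change in each count column between two month labels."""
    indexed = df.set_index(date_col)
    first, last = indexed.loc[start], indexed.loc[end]
    # A count of zero in the starting month would divide by zero here.
    return ((last - first) / first * 100).round(1)

# Illustrative rows only, copied from the table output earlier in this section.
# In the notebook, pass compare_lit_evict with start='2020-01', end='2020-04'.
demo = pd.DataFrame({
    'month_year': ['2019-04', '2019-08', '2019-12'],
    'Commercial evictions': [161, 109, 88],
    'Residential evictions': [1585, 1406, 860],
    'Litigations': [131, 141, 145],
})
change = pct_change_over_window(demo, '2019-04', '2019-12')
print(change)
```

If all three columns show negative changes of similar size over the same window, that supports the observation that they fell together. It still would not show that one caused the other.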

# Final Takeaways

The data suggest that the eviction moratorium was probably not targeting tenant evictions that pass through the courts via eviction proceedings. It seems more likely that the moratorium was addressing potential evictions, coerced vacancies, and evictions that happened outside of the court system. This highlights the need for further analysis of data on evictions that happened without formal eviction proceedings and on acts of coercion that caused tenants to vacate their rental units.

**Eviction data**

Given that there was public pressure for an eviction moratorium, this analysis highlights that not all evictions are administered through formal eviction proceedings. At least one other dataset capturing evictions that occurred without a court hearing is necessary. Data on the Hardship Declaration Forms that renters can submit to their landlords to prevent eviction proceedings while the moratorium is in effect would also be helpful, but this information is not publicly accessible. Aggregated data may not exist at all, since the form seems to be a tenant-landlord exchange that does not require a government agency as an intermediary or arbiter.

**Litigation data**

Litigations that cited tenant harassment were not closely aligned with the eviction moratorium. They showed an unexpected drop at the start of the pandemic, but quickly returned to pre-pandemic levels by July 2020. Given the nature of coercion, it is possible that other forms of litigation became more pronounced after the pandemic and after the eviction moratorium. To check whether that hypothesis is true, a much broader data analysis exercise needs to be carried out.

**Question 4**

*Based on the answers to questions 1 through 3, is there a need for further analysis focused on whether landlords receiving financial aid from the City through the Housing New York Plan participated in eviction proceedings during 2020 and 2021?*

The answer to question 4 is no. To evaluate the effectiveness of the eviction moratorium, one does not need to know if any landlords participating in eviction proceedings received financial assistance from the city government. 


# Farewell!

Thank you for choosing to join me in this quick exercise. Hopefully you have a better grasp of how Python can be used to analyze data and can tackle some of your own projects!

Until next time,

Stay safe and be kind to yourself.

– E.