# **Introduction**

## **Chi-test (boroughs + svi)**
## **Bar-chart with svi as regression/scatterplot (boroughs first)**

source: https://www.atsdr.cdc.gov/place-health/media/pdfs/2024/10/SVI2022Documentation.pdf

source: https://www.atsdr.cdc.gov/place-health/php/svi/svi-interactive-map.html

Chi-test: with the whole dataset considered, the chi-square test gives a more moderate and more accurate result. There is no statistically strong association between high-SVI NTAs and high-eviction NTAs. (Even so, the odds that an NTA with higher-than-average SVI is also a high-eviction NTA are still 2.8 times those of an NTA with lower-than-average SVI.) There is a statistically strong association between NTAs with a high combined Black and Hispanic share and high-eviction NTAs (odds ratio 3.81).
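
A large odds ratio can coexist with a non-significant chi-square test when the table contains only a few NTAs. The sketch below illustrates this with a hand-rolled Pearson chi-square for a 2x2 table. The counts are hypothetical and the helper name `chi_square_2x2` is an assumption, not the notebook's code. The notebook itself imports `scipy.stats.chi2_contingency`, which applies Yates' continuity correction to 2x2 tables by default, so its statistic would be somewhat smaller than the uncorrected one computed here.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction), p-value, and odds
    ratio for the 2x2 table [[a, b], [c, d]]. Hypothetical helper."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio

# Hypothetical NTA counts, NOT the notebook's data:
# rows = high / low SVI, columns = high / low eviction rate.
chi2, p_value, odds_ratio = chi_square_2x2(14, 5, 10, 10)
print(f"chi2={chi2:.3f}, p={p_value:.3f}, odds ratio={odds_ratio:.2f}")
```

With these made-up counts the odds ratio is 2.8, yet the p-value stays above 0.05, which is the pattern the paragraph above describes.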

In [2]:
# !pip install geopandas folium matplotlib seaborn scipy
# !pip install esda
# !pip install splot
# !pip install geopandas contextily
# For Google Colab, some packages had to be reinstalled.

In [None]:
# !pip install geopandas folium matplotlib seaborn scipy esda splot

In [3]:
import pandas as pd
import geopandas as gpd
import numpy as np
import datetime as dt
import scipy

from sklearn.cluster import DBSCAN
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import silhouette_score
from shapely.geometry import Point
from sklearn.neighbors import NearestNeighbors

# visualization
import matplotlib.pyplot as plt
from matplotlib import colors as mcolors
import seaborn as sns
import folium
from folium.plugins import HeatMap
from folium import Marker
from folium.plugins import MarkerCluster
import plotly.express as px
import plotly.io as pio


import contextily as ctx
from scipy.stats import f_oneway
from sklearn.decomposition import PCA
from functools import reduce
from scipy.stats import chi2_contingency
import statsmodels.api as sm

# spatial statistics
from esda.moran import Moran
from esda.getisord import G_Local
from libpysal.weights import Queen, Rook

# system and utility
import warnings
import os
import io
from IPython.display import IFrame
from google.colab import files

# spatial plotting (Queen, Rook, Moran, and plt are already imported above)
from splot.esda import moran_scatterplot

# suppress warnings
warnings.filterwarnings('ignore')

# inline
%matplotlib inline

In [4]:
pd.set_option('display.float_format', lambda x: '%.4f' % x)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# **Step 1: Get the eviction data**

In [5]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [6]:
# data source:
file_path1 = '/content/drive/My Drive/X999/bbl_evictions_311_svi_normal_times.csv'
file_path2 = '/content/drive/My Drive/X999/bbl_evictions_311_svi_covid.csv'

In [7]:
evictions_pre_post_raw = pd.read_csv(file_path1)
evictions_covid_raw = pd.read_csv(file_path2)
evictions_covid_raw.shape, evictions_pre_post_raw.shape
# 91 vs. 92 columns: the normal-times file carries one extra SVI-derived column (svi_group)

((5386, 91), (66397, 92))
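
When two tables differ by a column or two, comparing the column indexes directly is quicker than reading the headers by eye. A minimal sketch, using tiny hypothetical stand-ins for the two eviction tables (the frame names and the reduced column lists are assumptions for illustration):

```python
import pandas as pd

# Hypothetical miniature stand-ins, NOT the real files.
normal_times = pd.DataFrame(columns=["bbl", "nta", "svi_quartile", "svi_group"])
covid = pd.DataFrame(columns=["bbl", "nta", "svi_quartile"])

# Index.difference returns the labels present on the left but not on the right.
only_in_normal = normal_times.columns.difference(covid.columns).tolist()
only_in_covid = covid.columns.difference(normal_times.columns).tolist()
print(only_in_normal, only_in_covid)
```

Applied to `evictions_pre_post_raw` and `evictions_covid_raw`, the same two calls would list whichever columns account for the 92 versus 91 difference.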

In [8]:
evictions_pre_post = evictions_pre_post_raw.copy()
evictions_covid = evictions_covid_raw.copy()

In [9]:
evictions_pre_post.head(2)

Unnamed: 0,primary_key,bbl,court_index_number,docket_number,eviction_address,eviction_apartment_number,executed_date,borough,zipcode,ejectment,eviction/legal_possession,latitude,longitude,community_board,council_district,census_tract,bin,nta,year,month_year,geometry,average_year_eviction_count,yearbuilt,bldgclass,numfloors,unitsres,ownername,bldgarea,building_type,building_category,is_condo,floor_category,rent_era,architectural_style,economic_period,residential_units_category,is_llc,building_size_category,size_quartile,decade,fips,e_totpop,rpl_theme1,rpl_theme2,rpl_theme3,rpl_theme4,rpl_themes,ep_pov150,ep_unemp,ep_nohsdp,ep_uninsur,ep_age65,ep_age17,ep_disabl,ep_limeng,ep_noveh,ep_crowd,ep_hburd,ep_afam,ep_hisp,ep_asian,ep_aian,ep_nhpi,ep_twomore,ep_otherrace,ep_minrty,ep_white,invalid_zip,svi_quartile,svi_group,air_quality,animal_issues,appliances,building_exterior,doors_windows,electrical_issues,elevator_issues,floors_stairs,general_complaints,graffiti_posting,heat_hot_water,homeless_issues,noise_complaints,other_issues,pest_issues,plumbing_issues,police_matters,public_nuisance,safety_concerns,sanitation_issues,walls_ceilings,total_complaints
0,*308072/22_5865,3037420029,*308072/22,5865,356 MILLER AVE,1 AND BASEMENT,2024-12-04,BROOKLYN,11207,Not an Ejectment,Possession,40.6721,-73.8911,5.0,37.0,1152.0,3083989,East New York,2024,2024-12,POINT (-73.891105 40.672121),0.8,1930.0,C0,3.0,3.0,356 MILLER LLC,2700.0,pre-war,walk-up,False,low-rise,"Pre-1947, pre-rent-control","1921–1930, Art Deco Skyscrapers","1930-1945, great depression and WWII",3-5 units,True,small,Q3 (50-75%),1930-1939,11207,96801.0,0.9788,0.914,0.9808,0.9812,0.9839,33.9,11.1,19.1,6.0,13.8,22.5,13.8,5.3,57.8,9.1,44.7,55.9,32.8,1.5,0.0,0.0,2.9,1.6,94.7,5.3,False,Q3,medium-high,0.0,0.0,1.0,0.0,1.0,2.0,0.0,0.0,1.0,0.0,3.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,3.0,5.0,19.0
1,*313639/23_5202,3057940012,*313639/23,5202,710 61ST STREET,2ND FLOOR,2024-03-04,BROOKLYN,11220,Not an Ejectment,Possession,40.6359,-74.0119,7.0,38.0,118.0,3143881,Sunset Park East,2024,2024-03,POINT (-74.011883 40.635941),0.6,1920.0,B2,2.0,2.0,"A.R.M. PARKING, LLC",1204.0,pre-war,two-family,False,low-rise,"Pre-1947, pre-rent-control","1900–1920, Beaux-Arts","Pre-1929, pre-great depression",2-unit,True,very small,Q1 (smallest 25%),1920-1929,11220,93008.0,0.9885,0.7635,0.9594,0.9179,0.9662,37.5,7.5,37.9,11.6,13.1,25.4,8.4,40.2,61.7,23.7,43.6,1.7,40.9,40.7,0.4,0.0,1.2,0.2,85.0,15.0,False,Q3,medium-high,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,1.0,4.0


In [10]:
evictions_covid.head(2)

Unnamed: 0,primary_key,bbl,court_index_number,docket_number,eviction_address,eviction_apartment_number,executed_date,borough,zipcode,ejectment,eviction/legal_possession,latitude,longitude,community_board,council_district,census_tract,bin,nta,year,month_year,geometry,average_year_eviction_count,yearbuilt,bldgclass,numfloors,unitsres,ownername,bldgarea,building_type,building_category,is_condo,floor_category,rent_era,architectural_style,economic_period,residential_units_category,is_llc,building_size_category,size_quartile,decade,fips,e_totpop,rpl_theme1,rpl_theme2,rpl_theme3,rpl_theme4,rpl_themes,ep_pov150,ep_unemp,ep_nohsdp,ep_uninsur,ep_age65,ep_age17,ep_disabl,ep_limeng,ep_noveh,ep_crowd,ep_hburd,ep_afam,ep_hisp,ep_asian,ep_aian,ep_nhpi,ep_twomore,ep_otherrace,ep_minrty,ep_white,invalid_zip,svi_quartile,air_quality,animal_issues,appliances,building_exterior,doors_windows,electrical_issues,elevator_issues,floors_stairs,general_complaints,graffiti_posting,heat_hot_water,homeless_issues,noise_complaints,other_issues,pest_issues,plumbing_issues,police_matters,public_nuisance,safety_concerns,sanitation_issues,walls_ceilings,total_complaints
0,004123/20_209969,2032140141,004123/20,209969,2541 A GRAND AVE,ROOM 3B,2022-08-22,BRONX,10468,Not an Ejectment,Possession,40.8654,-73.9013,7.0,14.0,265.0,2113173,Kingsbridge Heights,2022,2022-08,POINT (-73.901317 40.865396),0.2,2004.0,C0,3.0,3.0,MONJU SARKER,3420.0,post-war,walk-up,False,low-rise,"1994–Present, vacancy decontrol","2001-present, New Architecture","1991–2008, modern economic growth",3-5 units,False,medium-small,Q4 (largest 25%),2000-2009,10468,81397.0,0.9954,0.9407,0.987,0.947,0.9874,39.5,11.6,28.3,9.2,11.2,26.4,12.2,26.9,71.8,19.2,56.7,15.6,78.0,2.3,0.0,0.0,0.5,0.5,96.9,3.1,False,Q3,0.0,0.0,0.0,0.0,3.0,0.0,0.0,2.0,0.0,0.0,1.0,0.0,2.0,0.0,0.0,2.0,0.0,0.0,0.0,3.0,1.0,14.0
1,0050153/20_106030,4031560133,0050153/20,106030,98-05 67TH AVENUE,12F,2022-04-14,QUEENS,11375,Not an Ejectment,Possession,40.7242,-73.8556,6.0,29.0,71306.0,4074666,Forest Hills,2022,2022-04,POINT (-73.855552 40.724241),0.2,1960.0,D3,13.0,181.0,MARSEILLES LEASING LIMITED PARTNERSHIP,177710.0,post-war,elevator,False,high-rise,"1947–1969, rent-control","1951–1980, the International Style, Alternativ...","1946–1975, pst war economic boom",100+ units,False,mega,Q4 (largest 25%),1960-1969,11375,75212.0,0.4759,0.5698,0.8789,0.8057,0.7322,12.0,4.8,6.1,3.7,20.4,18.0,10.5,7.9,41.9,5.8,25.4,2.7,16.4,28.5,0.1,0.0,4.6,0.7,53.0,47.0,False,Q1 (Low),0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0.0,62.0,0.0,34.0,0.0,0.0,4.0,1.0,0.0,0.0,2.0,5.0,112.0


In [11]:
evictions_pre_post.columns, \
evictions_covid.columns, \
evictions_pre_post.shape, \
evictions_covid.shape

(Index(['primary_key', 'bbl', 'court_index_number', 'docket_number',
        'eviction_address', 'eviction_apartment_number', 'executed_date',
        'borough', 'zipcode', 'ejectment', 'eviction/legal_possession',
        'latitude', 'longitude', 'community_board', 'council_district',
        'census_tract', 'bin', 'nta', 'year', 'month_year', 'geometry',
        'average_year_eviction_count', 'yearbuilt', 'bldgclass', 'numfloors',
        'unitsres', 'ownername', 'bldgarea', 'building_type',
        'building_category', 'is_condo', 'floor_category', 'rent_era',
        'architectural_style', 'economic_period', 'residential_units_category',
        'is_llc', 'building_size_category', 'size_quartile', 'decade', 'fips',
        'e_totpop', 'rpl_theme1', 'rpl_theme2', 'rpl_theme3', 'rpl_theme4',
        'rpl_themes', 'ep_pov150', 'ep_unemp', 'ep_nohsdp', 'ep_uninsur',
        'ep_age65', 'ep_age17', 'ep_disabl', 'ep_limeng', 'ep_noveh',
        'ep_crowd', 'ep_hburd', 'ep_afam', 'ep_hisp

In [12]:
link = '/content/drive/My Drive/X999/svi_cleaned.csv'

In [13]:
svi_df = pd.read_csv(link)
svi_df.head(2)

Unnamed: 0,fips,location,area_sqmi,e_totpop,m_totpop,e_hu,m_hu,e_hh,m_hh,e_pov150,m_pov150,e_unemp,m_unemp,e_hburd,m_hburd,e_nohsdp,m_nohsdp,e_uninsur,m_uninsur,e_age65,m_age65,e_age17,m_age17,e_disabl,m_disabl,e_sngpnt,m_sngpnt,e_limeng,m_limeng,e_minrty,m_minrty,e_munit,m_munit,e_mobile,m_mobile,e_crowd,m_crowd,e_noveh,m_noveh,e_groupq,m_groupq,ep_pov150,mp_pov150,ep_unemp,mp_unemp,ep_hburd,mp_hburd,ep_nohsdp,mp_nohsdp,ep_uninsur,mp_uninsur,ep_age65,mp_age65,ep_age17,mp_age17,ep_disabl,mp_disabl,ep_sngpnt,mp_sngpnt,ep_limeng,mp_limeng,ep_minrty,mp_minrty,ep_munit,mp_munit,ep_mobile,mp_mobile,ep_crowd,mp_crowd,ep_noveh,mp_noveh,ep_groupq,mp_groupq,epl_pov150,epl_unemp,epl_hburd,epl_nohsdp,epl_uninsur,spl_theme1,rpl_theme1,epl_age65,epl_age17,epl_disabl,epl_sngpnt,epl_limeng,spl_theme2,rpl_theme2,epl_minrty,spl_theme3,rpl_theme3,epl_munit,epl_mobile,epl_crowd,epl_noveh,epl_groupq,spl_theme4,rpl_theme4,spl_themes,rpl_themes,f_pov150,f_unemp,f_hburd,f_nohsdp,f_uninsur,f_theme1,f_age65,f_age17,f_disabl,f_sngpnt,f_limeng,f_theme2,f_minrty,f_theme3,f_munit,f_mobile,f_crowd,f_noveh,f_groupq,f_theme4,f_total,e_daypop,e_noint,m_noint,e_afam,m_afam,e_hisp,m_hisp,e_asian,m_asian,e_aian,m_aian,e_nhpi,m_nhpi,e_twomore,m_twomore,e_otherrace,m_otherrace,ep_noint,mp_noint,ep_afam,mp_afam,ep_hisp,mp_hisp,ep_asian,mp_asian,ep_aian,mp_aian,ep_nhpi,mp_nhpi,ep_twomore,mp_twomore,ep_otherrace,mp_otherrace
0,10001,ZCTA5 10001,0.6238,27004,1827,16975,831,14375,782,5248,797,761,266,3314,531,1930,534,831,289,3428,432,2694,643,2310,499,501,215,1381,405,13460,2305,15840,898,15,23,389,135,12285,840,2213,218,20.3,2.7,4.3,1.5,23.1,3.5,9.1,2.4,3.1,1.0,12.7,1.6,10.0,2.1,8.6,1.9,3.5,1.5,5.3,1.5,49.8,7.8,93.3,2.7,0.1,0.1,2.7,0.9,85.5,2.8,8.2,0.6,0.6108,0.4574,0.5573,0.5902,0.4436,2.6593,0.5688,0.142,0.1161,0.1891,0.4707,0.8777,1.7956,0.1692,0.867,0.867,0.867,0.9853,0.271,0.7402,0.9949,0.9104,3.9018,0.9806,9.2237,0.7414,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,3,3,239407,1047,389,2220,576,5206,943,5031,774,0,25,0,25,780,326,223,169,7.3,2.6,8.2,2.2,19.3,3.0,18.6,2.9,0.0,0.1,0.0,0.1,2.9,1.2,0.8,0.6
1,10002,ZCTA5 10002,0.8223,76518,2894,39094,1241,36028,1326,27908,2853,2833,574,14688,1367,18301,1376,4074,766,17681,1287,10028,1549,9896,1062,2211,499,18393,1640,56964,3226,35725,1677,16,28,2461,449,29828,1403,2090,39,36.8,3.5,7.6,1.4,40.8,3.5,30.0,2.0,5.4,1.0,23.1,1.7,13.1,1.8,13.0,1.4,6.1,1.4,24.7,2.0,74.4,3.1,91.4,3.2,0.0,0.1,6.8,1.2,82.8,1.8,2.7,0.1,0.9148,0.7946,0.9219,0.9741,0.7207,4.3261,0.9639,0.7296,0.1831,0.5186,0.739,0.9944,3.1647,0.8781,0.9369,0.9369,0.9369,0.979,0.0,0.9105,0.9915,0.773,3.654,0.9254,12.0817,0.9656,1,0,1,1,0,3,0,0,0,0,1,1,1,1,1,0,1,1,0,3,8,64307,8590,1110,6141,1194,19864,2190,28477,1989,74,83,24,45,1810,486,574,394,23.8,2.9,8.0,1.5,26.0,2.5,37.2,2.2,0.1,0.1,0.0,0.1,2.4,0.6,0.8,0.5


In [14]:
svi_df.shape

(204, 153)

In [15]:
# list(svi_df.columns)

In [16]:
svi_df.ep_nhpi.unique()

array([ 0.00e+00,  1.00e-01,  3.00e-01,  2.00e-01, -9.99e+02,  8.00e-01,
        1.20e+00,  5.00e-01,  4.00e-01])

# **Step 2: SVI items**

A quick double check against the raw SVI file.

In [17]:
link = "/content/drive/My Drive/X999/NewYork_ZCTA.csv"

In [18]:
svi_raw = pd.read_csv(link)
svi_raw.head(2)

Unnamed: 0,ST,STATE,ST_ABBR,FIPS,LOCATION,AREA_SQMI,E_TOTPOP,M_TOTPOP,E_HU,M_HU,E_HH,M_HH,E_POV150,M_POV150,E_UNEMP,M_UNEMP,E_HBURD,M_HBURD,E_NOHSDP,M_NOHSDP,E_UNINSUR,M_UNINSUR,E_AGE65,M_AGE65,E_AGE17,M_AGE17,E_DISABL,M_DISABL,E_SNGPNT,M_SNGPNT,E_LIMENG,M_LIMENG,E_MINRTY,M_MINRTY,E_MUNIT,M_MUNIT,E_MOBILE,M_MOBILE,E_CROWD,M_CROWD,E_NOVEH,M_NOVEH,E_GROUPQ,M_GROUPQ,EP_POV150,MP_POV150,EP_UNEMP,MP_UNEMP,EP_HBURD,MP_HBURD,EP_NOHSDP,MP_NOHSDP,EP_UNINSUR,MP_UNINSUR,EP_AGE65,MP_AGE65,EP_AGE17,MP_AGE17,EP_DISABL,MP_DISABL,EP_SNGPNT,MP_SNGPNT,EP_LIMENG,MP_LIMENG,EP_MINRTY,MP_MINRTY,EP_MUNIT,MP_MUNIT,EP_MOBILE,MP_MOBILE,EP_CROWD,MP_CROWD,EP_NOVEH,MP_NOVEH,EP_GROUPQ,MP_GROUPQ,EPL_POV150,EPL_UNEMP,EPL_HBURD,EPL_NOHSDP,EPL_UNINSUR,SPL_THEME1,RPL_THEME1,EPL_AGE65,EPL_AGE17,EPL_DISABL,EPL_SNGPNT,EPL_LIMENG,SPL_THEME2,RPL_THEME2,EPL_MINRTY,SPL_THEME3,RPL_THEME3,EPL_MUNIT,EPL_MOBILE,EPL_CROWD,EPL_NOVEH,EPL_GROUPQ,SPL_THEME4,RPL_THEME4,SPL_THEMES,RPL_THEMES,F_POV150,F_UNEMP,F_HBURD,F_NOHSDP,F_UNINSUR,F_THEME1,F_AGE65,F_AGE17,F_DISABL,F_SNGPNT,F_LIMENG,F_THEME2,F_MINRTY,F_THEME3,F_MUNIT,F_MOBILE,F_CROWD,F_NOVEH,F_GROUPQ,F_THEME4,F_TOTAL,E_DAYPOP,E_NOINT,M_NOINT,E_AFAM,M_AFAM,E_HISP,M_HISP,E_ASIAN,M_ASIAN,E_AIAN,M_AIAN,E_NHPI,M_NHPI,E_TWOMORE,M_TWOMORE,E_OTHERRACE,M_OTHERRACE,EP_NOINT,MP_NOINT,EP_AFAM,MP_AFAM,EP_HISP,MP_HISP,EP_ASIAN,MP_ASIAN,EP_AIAN,MP_AIAN,EP_NHPI,MP_NHPI,EP_TWOMORE,MP_TWOMORE,EP_OTHERRACE,MP_OTHERRACE
0,36,New York,NY,6390,ZCTA5 06390,4.0467,53,39,253,49,19,19,17,16,0,13,9,26,0,13,27,34,0,13,6,11,31,33,0,18,9,53,20,51,0,18,4,5,0,18,0,13,17,16,32.1,18.8,0.0,52.7,47.4,100.0,0.0,51.4,50.9,45.7,0.0,45.2,11.3,19.0,58.5,41.4,0.0,94.7,17.0,99.2,37.7,92.1,0.0,7.1,1.6,2.0,0.0,94.7,0.0,75.5,32.1,18.8,0.879,0.0,0.9635,0.0,0.996,2.8385,0.6342,0.0,0.1408,0.9944,0.0,0.9775,2.1127,0.3009,0.8062,0.8062,0.8062,0.0,0.4654,0.0,0.0,0.9735,1.4389,0.2205,7.1963,0.4192,0,0,1,0,1,2,0,0,1,0,1,2,0,0,0,0,0,0,1,1,5,601,9,14,0,13,9,19,0,13,0,13,8,16,3,7,0,13,47.4,51.8,0.0,45.2,17.0,35.0,0.0,45.2,0.0,45.2,15.1,32.1,5.7,12.1,0.0,45.2
1,36,New York,NY,10001,ZCTA5 10001,0.6238,27004,1827,16975,831,14375,782,5248,797,761,266,3314,531,1930,534,831,289,3428,432,2694,643,2310,499,501,215,1381,405,13460,2305,15840,898,15,23,389,135,12285,840,2213,218,20.3,2.7,4.3,1.5,23.1,3.5,9.1,2.4,3.1,1.0,12.7,1.6,10.0,2.1,8.6,1.9,3.5,1.5,5.3,1.5,49.8,7.8,93.3,2.7,0.1,0.1,2.7,0.9,85.5,2.8,8.2,0.6,0.6108,0.4574,0.5573,0.5902,0.4436,2.6593,0.5688,0.142,0.1161,0.1891,0.4707,0.8777,1.7956,0.1692,0.867,0.867,0.867,0.9853,0.271,0.7402,0.9949,0.9104,3.9018,0.9806,9.2237,0.7414,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,3,3,239407,1047,389,2220,576,5206,943,5031,774,0,25,0,25,780,326,223,169,7.3,2.6,8.2,2.2,19.3,3.0,18.6,2.9,0.0,0.1,0.0,0.1,2.9,1.2,0.8,0.6


In [19]:
def is_nyc_zipcode(zipcode):
    zip_int = int(zipcode) if isinstance(zipcode, str) else zipcode

    # Manhattan: 10001-10282
    if 10001 <= zip_int <= 10282:
        return True
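    # Staten Island (103xx) and the Bronx (104xx): 10300-10499
    # (this broad range already covers the narrower Bronx and Staten Island checks below)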
    if 10300 <= zip_int <= 10499:
        return True
    # Bronx: 10451-10475
    if 10451 <= zip_int <= 10475:
        return True
    # Brooklyn: 11201-11256
    if 11201 <= zip_int <= 11256:
        return True
    # Queens: 11351-11436, 11101-11109
    if (11351 <= zip_int <= 11436) or (11101 <= zip_int <= 11109):
        return True
    # Staten Island: 10301-10314
    if 10301 <= zip_int <= 10314:
        return True
    # additional Queens ZIPs
    if zip_int in [11004, 11005, 11411, 11412, 11413, 11418, 11419, 11420, 11421, 11422, 11423, 11426, 11427, 11428, 11429]:
        return True
    return False

In [20]:
nyc_df = svi_raw[svi_raw['FIPS'].apply(is_nyc_zipcode)]

In [21]:
nyc_df.shape

(204, 156)
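
The ZIP filter above can also be written as a table of ranges, which makes overlapping checks easier to spot. The sketch below is an equivalent formulation for integer or numeric-string input. The ranges are the ones used in `is_nyc_zipcode`, not an authoritative list of NYC ZIP codes, and the names `NYC_ZIP_RANGES` and `in_nyc_ranges` are hypothetical.

```python
# Ranges taken from the notebook's is_nyc_zipcode, with overlaps merged.
NYC_ZIP_RANGES = [
    (10001, 10282),  # Manhattan
    (10300, 10499),  # Staten Island and the Bronx
    (11004, 11005),  # the two Queens ZIPs outside the main Queens range
    (11101, 11109),  # Queens (Long Island City area)
    (11201, 11256),  # Brooklyn
    (11351, 11436),  # Queens
]

def in_nyc_ranges(zipcode):
    """Return True if zipcode (int or numeric string) falls in any listed range."""
    z = int(zipcode)
    return any(lo <= z <= hi for lo, hi in NYC_ZIP_RANGES)

print(in_nyc_ranges(10001), in_nyc_ranges("06390"))
```

Because the `FIPS` column is read as an integer, a ZCTA such as 06390 arrives as 6390 and falls outside every range, as it does in the original function.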

In [22]:
nyc_df.EP_NHPI.unique()
# -999 comes from the source data: it is the CDC SVI placeholder for a missing value, not a real percentage

array([ 0.00e+00,  1.00e-01,  3.00e-01,  2.00e-01, -9.99e+02,  8.00e-01,
        1.20e+00,  5.00e-01,  4.00e-01])

In [23]:
svi_raw.EP_NHPI.unique()

array([ 1.51e+01,  0.00e+00,  1.00e-01,  3.00e-01,  2.00e-01, -9.99e+02,
        8.00e-01,  1.20e+00,  5.00e-01,  4.00e-01,  1.10e+00,  7.00e-01,
        1.50e+00,  1.80e+00,  9.00e-01,  6.00e-01,  1.40e+00,  2.10e+00,
        2.30e+00,  1.00e+00])

In [24]:
-9.99e+02, 0.00e+00, 9.00e-01, 2.00e-01

(-999.0, 0.0, 0.9, 0.2)
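
In the CDC SVI files, -999 marks a missing value, so any average taken over a column that still contains it is pulled far below zero. A minimal sketch of masking the sentinel before aggregating, using hypothetical rows (the `demo` frame and its values are assumptions, not the notebook's data):

```python
import numpy as np
import pandas as pd

# Hypothetical rows, NOT the notebook's data.
demo = pd.DataFrame({
    "nta": ["A", "A", "B", "B"],
    "ep_nhpi": [0.1, -999.0, 0.2, 0.4],
})

# Naive mean: the sentinel is treated as a real percentage.
naive = demo.groupby("nta")["ep_nhpi"].mean()

# Masked mean: -999 becomes NaN, which mean() skips by default.
cleaned = demo.replace(-999, np.nan).groupby("nta")["ep_nhpi"].mean()

print(naive, cleaned, sep="\n")
```

The same `replace(-999, np.nan)` call could be applied to the eviction tables before any per-NTA averaging, assuming -999 never occurs as a legitimate value in those columns.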

# **Step 3: All boroughs and their eviction rates**

In [25]:
evictions_pre_post_mean = evictions_pre_post[['ep_afam', 'ep_asian', 'ep_hisp', 'ep_nhpi', 'ep_white', 'ep_twomore', 'ep_otherrace']].mean()
evictions_pre_post_mean
# may need to merge ep_twomore and ep_otherrace together

Unnamed: 0,0
ep_afam,29.2346
ep_asian,8.904
ep_hisp,38.7307
ep_nhpi,0.0115
ep_white,19.0236
ep_twomore,2.8025
ep_otherrace,1.0239


In [26]:
evictions_pre_post_mean = evictions_pre_post_mean.reset_index()

In [27]:
evictions_pre_post_mean.rename(columns = {'index':'race_svi', 0: "racial percentage"}, inplace=True)

In [28]:
evictions_pre_post_mean

Unnamed: 0,race_svi,racial percentage
0,ep_afam,29.2346
1,ep_asian,8.904
2,ep_hisp,38.7307
3,ep_nhpi,0.0115
4,ep_white,19.0236
5,ep_twomore,2.8025
6,ep_otherrace,1.0239


In [29]:
# type(evictions_pre_post_mean)
# so this is correct

In [30]:
# evictions_pre_post.columns

In [160]:
neighbor_evictions = evictions_pre_post.groupby('nta').agg({'average_year_eviction_count': 'mean', 'borough': 'first'}).reset_index()
neighbor_evictions.sort_values('average_year_eviction_count', ascending=False, inplace=True)
neighbor_evictions

Unnamed: 0,nta,average_year_eviction_count,borough
185,park-cemetery-etc-Bronx,6.4667,BRONX
143,Seagate-Coney Island,4.3904,BROOKLYN
36,Corona,4.2597,QUEENS
163,University Heights-Morris Heights,3.5051,BRONX
125,Park Slope-Gowanus,2.3725,BROOKLYN
75,Grymes Hill-Clifton-Fox Hills,2.336,STATEN ISLAND
106,Morrisania-Melrose,2.255,BRONX
19,Bronxdale,2.2396,BRONX
21,Brownsville,2.0909,BROOKLYN
31,Claremont-Bathgate,1.9783,BRONX


In [163]:
# NTA name and mean eviction count only (covers all boroughs, despite the `man_` prefix)
man_nta_df = (
    neighbor_evictions[['nta', 'average_year_eviction_count']]
    .rename(columns={'average_year_eviction_count': 'eviction_rates'})
)
man_nta_df

Unnamed: 0,nta,eviction_rates
185,park-cemetery-etc-Bronx,6.4667
143,Seagate-Coney Island,4.3904
36,Corona,4.2597
163,University Heights-Morris Heights,3.5051
125,Park Slope-Gowanus,2.3725
75,Grymes Hill-Clifton-Fox Hills,2.336
106,Morrisania-Melrose,2.255
19,Bronxdale,2.2396
21,Brownsville,2.0909
31,Claremont-Bathgate,1.9783


In [164]:
avg_per_nta = neighbor_evictions.average_year_eviction_count.mean()
avg_per_nta
# mean evictions per building per year, averaged across NTAs, which is why it is close to the borough-level figure

np.float64(0.9049163589883761)

### **Step 3.2. Racial composition of all neighborhoods**

In [170]:
race_columns = ['ep_afam', 'ep_asian', 'ep_hisp', 'ep_nhpi', 'ep_white', 'ep_twomore', 'ep_otherrace']
racial_avg_all = evictions_pre_post.groupby('nta')[['average_year_eviction_count', *race_columns, 'rpl_themes']].mean()
racial_avg_all.sort_values('average_year_eviction_count', ascending=False, inplace=True)
racial_avg_all.reset_index(inplace=True)
racial_avg_all

Unnamed: 0,nta,average_year_eviction_count,ep_afam,ep_asian,ep_hisp,ep_nhpi,ep_white,ep_twomore,ep_otherrace,rpl_themes
0,park-cemetery-etc-Bronx,6.4667,28.6,6.2,52.9,0.0,8.7,2.3,0.7,0.9925
1,Seagate-Coney Island,4.3904,20.1385,9.827,19.4159,0.0,47.3927,2.9144,0.0159,0.9911
2,Corona,4.2597,7.3968,11.305,75.7668,0.0,4.4275,0.5039,0.4997,0.9632
3,University Heights-Morris Heights,3.5051,24.4407,1.6079,69.3353,0.0,2.0544,1.8489,0.7046,0.9979
4,Park Slope-Gowanus,2.3725,7.4853,9.501,16.3461,0.0,60.6373,5.4206,0.5765,0.6327
5,Grymes Hill-Clifton-Fox Hills,2.336,25.1149,11.6023,22.3858,0.0,38.7568,1.5584,0.4703,0.9444
6,Morrisania-Melrose,2.255,35.4504,0.6731,59.1572,0.0,2.5927,1.5814,0.3022,0.9965
7,Bronxdale,2.2396,29.4301,6.1401,52.1306,0.0,8.6984,2.2915,0.7095,0.9909
8,Brownsville,2.0909,70.0757,0.8424,19.3509,0.0002,3.6573,5.3733,0.6034,0.9933
9,Claremont-Bathgate,1.9783,30.5182,1.0987,63.2627,0.0,3.0336,1.4791,0.4059,0.9971
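
The chi-test described in the introduction needs each NTA labeled as high or low on both measures. A minimal sketch of that step, assuming "high" means above the mean across NTAs (this section does not state the exact cutoff used); the four rows and the `nta_stats` name are hypothetical.

```python
import pandas as pd

# Hypothetical per-NTA values, NOT the notebook's data.
nta_stats = pd.DataFrame({
    "nta": ["A", "B", "C", "D"],
    "average_year_eviction_count": [2.0, 0.4, 1.2, 0.4],
    "rpl_themes": [0.99, 0.30, 0.95, 0.60],
})

labels = {True: "high", False: "low"}
evi = nta_stats["average_year_eviction_count"]
svi = nta_stats["rpl_themes"]

# Label each NTA relative to the mean across NTAs (assumed cutoff).
nta_stats["evi_level"] = (evi > evi.mean()).map(labels)
nta_stats["svi_level"] = (svi > svi.mean()).map(labels)

# 2x2 contingency table: rows = SVI level, columns = eviction level.
table = pd.crosstab(nta_stats["svi_level"], nta_stats["evi_level"])
print(table)
```

A table built this way from `racial_avg_all` is the kind of input that `chi2_contingency` expects.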


In [166]:
racial_avg_all.T

nta,park-cemetery-etc-Bronx,Seagate-Coney Island,Corona,University Heights-Morris Heights,Park Slope-Gowanus,Grymes Hill-Clifton-Fox Hills,Morrisania-Melrose,Bronxdale,Brownsville,Claremont-Bathgate,Westchester-Unionport,Battery Park City-Lower Manhattan,Fordham South,Turtle Bay-East Midtown,Oakwood-Oakwood Beach,Pelham Parkway,East Concourse-Concourse Village,Mount Hope,Ft. Totten-Bay Terrace-Clearview,Melrose South-Mott Haven North,East Tremont,Flushing,Woodlawn-Wakefield,Highbridge,Kew Gardens,Crown Heights South,West Concourse,Central Harlem North-Polo Grounds,Springfield Gardens North,Co-op City,Belmont,Kew Gardens Hills,Midtown-Midtown South,Crotona Park East,Clinton Hill,Bedford Park-Fordham North,Prospect Lefferts Gardens-Wingate,Hunts Point,North Riverdale-Fieldston-Riverdale,Van Cortlandt Village,West Farms-Bronx River,Soundview-Bruckner,Williamsbridge-Olinville,Soundview-Castle Hill-Clason Point-Harding Park,Norwood,West Brighton,Longwood,Dyker Heights,Briarwood-Jamaica Hills,Starrett City,Ocean Hill,Spuyten Duyvil-Kingsbridge,Douglas Manor-Douglaston-Little Neck,Murray Hill,Kingsbridge Heights,Fort Greene,Parkchester,Mariner's Harbor-Arlington-Port Ivory-Graniteville,East Harlem North,East Williamsburg,East Flatbush-Farragut,Jamaica,Hudson Yards-Chelsea-Flatiron-Union Square,Bushwick South,DUMBO-Vinegar Hill-Downtown Brooklyn-Boerum Hill,Rego Park,East New York,East Harlem South,Marble Hill-Inwood,Sheepshead Bay-Gerritsen Beach-Manhattan Beach,North Side-South Side,Midwood,Auburndale,Mott Haven-Port Morris,Forest Hills,Hollis,Erasmus,Bushwick North,Bedford,Lincoln Square,Crown Heights North,Elmhurst,Brighton Beach,Flatlands,Hunters Point-Sunnyside-West Maspeth,Fresh Meadows-Utopia,Rugby-Remsen Village,Washington Heights South,Flatbush,Lower East Side,Upper West Side,Clinton,Washington Heights North,West New Brighton-New Brighton-St. 
George,Elmhurst-Maspeth,Bensonhurst East,Jamaica Estates-Holliswood,Jackson Heights,Williamsburg,Homecrest,Central Harlem South,Hamilton Heights,Carroll Gardens-Columbia Street-Red Hook,Pelham Bay-Country Club-City Island,Bayside-Bayside Hills,Stuyvesant Heights,Canarsie,East New York (Pennsylvania Ave),Manhattanville,Ocean Parkway South,Stapleton-Rosebank,Madison,Bath Beach,Steinway,Murray Hill-Kips Bay,Chinatown,Allerton-Pelham Gardens,Gravesend,Queens Village,Upper East Side-Carnegie Hill,Cypress Hills-City Line,Yorkville,Bensonhurst West,Van Nest-Morris Park-Westchester Square,Bay Ridge,Stuyvesant Town-Cooper Village,Morningside Heights,SoHo-TriBeCa-Civic Center-Little Italy,Eastchester-Edenwald-Baychester,New Brighton-Silver Lake,Gramercy,Woodhaven,Old Astoria,South Jamaica,Kensington-Ocean Parkway,Grasmere-Arrochar-Ft. Wadsworth,New Dorp-Midland Beach,Rosedale,Woodside,St. Albans,Astoria,Lenox Hill-Roosevelt Island,Glendale,Todt Hill-Emerson Hill-Heartland Village-Lighthouse Hill,Schuylerville-Throgs Neck-Edgewater Park,Laurelton,Old Town-Dongan Hills-South Beach,Ridgewood,Pomonok-Flushing Heights-Hillcrest,Middle Village,Baisley Park,East Elmhurst,Richmond Hill,Queensbridge-Ravenswood-Long Island City,Georgetown-Marine Park-Bergen Beach-Mill Basin,Sunset Park East,Springfield Gardens South-Brookville,South Ozone Park,College Point,Sunset Park West,East Flushing,Port Richmond,Brooklyn Heights-Cobble Hill,Borough Park,Prospect Heights,Windsor Terrace,North Corona,Whitestone,Ozone Park,Lindenwood-Howard Beach,West Village,Cambria Heights,East Village,New Springville-Bloomfield-Travis,Bellerose,Queensboro Hill,Oakland Gardens,Greenpoint,Charleston-Richmond Valley-Tottenville,Great Kills,Annadale-Huguenot-Prince's Bay-Eltingville,Westerleigh,Maspeth,Arden Heights,Glen Oaks-Floral Park-New Hyde Park,Rossville-Woodrow,park-cemetery-etc-Brooklyn
average_year_eviction_count,6.4667,4.3904,4.2597,3.5051,2.3725,2.336,2.255,2.2396,2.0909,1.9783,1.9408,1.7746,1.6845,1.6833,1.6722,1.6411,1.6058,1.5958,1.58,1.5741,1.5485,1.5223,1.5069,1.4761,1.4738,1.4379,1.4167,1.4098,1.4075,1.4012,1.3974,1.3964,1.3892,1.3511,1.3447,1.3362,1.3238,1.314,1.3126,1.3,1.2898,1.2836,1.2645,1.2506,1.2479,1.243,1.1946,1.1926,1.1921,1.1795,1.1774,1.173,1.128,1.0903,1.0777,1.0685,1.0623,1.0523,1.0314,1.0216,1.02,1.0086,1.0078,1.0003,0.9541,0.942,0.9293,0.9178,0.9141,0.914,0.9124,0.9109,0.9,0.8987,0.8943,0.871,0.8645,0.8557,0.8555,0.855,0.8474,0.8458,0.8373,0.8351,0.8293,0.8226,0.8211,0.7778,0.7657,0.7504,0.7383,0.736,0.7316,0.7241,0.7151,0.7123,0.7014,0.6998,0.6615,0.6607,0.6557,0.6395,0.6355,0.633,0.62,0.6075,0.6047,0.5943,0.5773,0.575,0.5726,0.5656,0.5641,0.5548,0.5468,0.5377,0.5339,0.5302,0.5187,0.5138,0.5084,0.4964,0.4955,0.4917,0.4845,0.473,0.4706,0.4696,0.4667,0.4639,0.463,0.4487,0.4406,0.4396,0.4331,0.4204,0.4146,0.4068,0.4051,0.4026,0.4024,0.3978,0.3945,0.3905,0.3904,0.389,0.3867,0.3805,0.3754,0.3722,0.3687,0.362,0.3608,0.3587,0.3544,0.3495,0.3464,0.3456,0.3447,0.3394,0.3388,0.3345,0.3333,0.3325,0.3312,0.329,0.3231,0.3222,0.3158,0.2984,0.2945,0.2923,0.2796,0.2789,0.2651,0.2645,0.2632,0.2531,0.2516,0.2488,0.2471,0.2471,0.2467,0.2235,0.2235,0.2235,0.2
ep_afam,28.6,20.1385,7.3968,24.4407,7.4853,25.1149,35.4504,29.4301,70.0757,30.5182,22.5381,5.1135,19.7701,2.5679,3.0,19.8323,37.2462,28.4259,1.005,29.0565,28.0218,2.5128,46.3838,28.0424,6.7883,54.7185,29.7957,53.1706,78.4795,58.96,19.2911,7.125,5.8074,27.1736,24.8126,16.0227,63.2727,27.4511,9.398,14.4409,24.4517,23.1,46.3329,30.7604,28.6,13.6025,26.5898,1.0864,21.3177,61.7,65.7064,11.9581,1.092,2.379,15.8222,19.7271,22.2,27.7042,37.4339,14.5885,66.0359,21.6398,6.2457,37.1939,14.1253,3.3957,49.8602,24.4735,6.8807,3.4201,5.9004,14.3727,4.7833,25.111,2.9038,41.5005,63.5239,17.0372,33.916,4.5269,54.5405,1.5945,2.4114,48.9986,3.5968,11.0357,73.8256,9.4778,50.3284,7.546,6.7846,5.3544,3.2136,19.9547,1.5192,2.9276,21.3094,4.278,7.2038,5.0105,43.8578,25.3079,11.1444,7.8839,2.855,43.7517,79.3561,55.8968,28.9019,11.1609,21.0813,6.4066,1.5683,4.5262,5.2682,7.7371,51.6827,5.5611,44.4504,2.7172,47.2724,3.7964,1.444,19.5497,2.943,7.2048,19.8254,2.9904,55.0526,19.9309,6.563,3.9,6.8923,48.4326,9.2556,9.6898,3.4146,77.9907,1.5051,73.3103,4.7717,1.73,2.4202,5.1357,9.6984,85.2791,6.935,2.3485,10.2669,1.025,68.0141,2.8165,9.2388,9.3452,40.5737,3.2142,80.6366,17.6146,0.65,2.9753,1.6265,21.2966,11.4,2.9941,23.675,7.7032,7.4986,0.3056,6.7684,3.7148,2.8794,85.0327,6.2083,3.3,13.2767,4.8677,1.6,2.6,0.3323,1.2951,0.6765,11.5412,0.945,0.8,5.0824,0.5353,21.1
ep_asian,6.2,9.827,11.305,1.6079,9.501,11.6023,0.6731,6.1401,0.8424,1.0987,10.2746,18.7738,2.2371,15.7632,13.8,15.9261,0.8263,0.8842,29.76,1.1642,1.3305,69.6381,2.3225,1.0293,22.4743,2.7929,1.095,3.806,3.0306,2.7267,2.7425,27.9187,18.7385,1.0732,6.9238,3.1098,2.9446,0.2221,3.5199,2.7809,5.7368,8.2,4.3523,2.1141,6.2,11.4456,0.4757,32.5111,30.9335,2.6,1.6955,3.6905,37.38,64.5205,2.3325,8.1823,16.6,10.4827,5.7044,6.2317,3.7104,34.6499,15.705,4.7231,11.6771,30.9804,5.4992,10.7473,3.3681,17.4785,5.9106,14.5658,46.0556,0.5096,28.9331,25.7903,3.2934,5.8536,5.5412,16.0862,2.4402,48.2681,14.8568,6.6704,30.6782,42.9817,1.2907,4.0466,6.2411,25.3629,8.9182,17.9763,2.3211,6.9297,39.5795,30.5268,38.021,22.9115,5.7577,24.7288,6.871,4.0009,6.0226,10.9935,42.8083,4.2734,2.5177,1.5462,6.582,16.4906,13.6288,23.6603,36.3623,10.1821,18.5861,34.1605,5.4953,27.8376,19.5081,9.5957,6.9815,8.7272,33.4381,11.5567,16.4915,13.819,9.4149,22.4956,4.356,6.8887,16.4863,24.2869,15.8664,16.9343,17.7152,18.1898,13.5927,3.0669,36.5899,6.6202,17.3413,10.931,7.3275,16.7333,6.5829,1.6604,17.555,7.1182,40.1715,14.6125,5.6036,2.1835,27.7371,25.8087,8.6921,31.9137,3.4473,32.9424,34.4,33.9505,60.5551,6.5286,14.2,22.4337,6.9,15.8355,11.4552,26.1889,28.7342,6.341,9.6448,1.5981,14.9575,18.0,40.5419,56.5581,48.6,4.9,3.2581,9.139,6.8059,12.5,15.3967,8.7,47.4706,4.6412,7.5
ep_hisp,52.9,19.4159,75.7668,69.3353,16.3461,22.3858,59.1572,52.1306,19.3509,63.2627,55.1108,9.5817,72.1502,8.4167,16.2,47.6325,56.9323,66.751,11.275,64.9251,64.7416,14.9697,26.4032,65.4042,25.8432,11.8073,63.4311,28.1355,11.2336,30.4533,69.5754,18.5071,17.3736,65.525,13.985,72.5584,11.5881,68.8494,34.0166,69.3459,62.4244,61.7,38.9846,62.0966,52.9,15.6823,69.2973,15.2309,25.9996,15.9,15.7823,51.3399,11.248,14.3215,76.504,13.995,47.2,36.938,40.0579,32.6691,7.4676,23.1637,16.3429,32.8167,12.6747,22.8543,36.1831,44.8647,65.9,9.0215,24.2743,11.1658,18.6333,69.6073,17.8522,13.5161,14.357,50.2836,22.017,10.145,14.5104,42.4846,9.1264,8.2359,27.3174,21.0122,14.6261,66.1568,12.8934,25.2146,18.6,18.2173,70.1921,27.1777,40.5945,14.2061,19.9659,50.9053,20.9923,12.0759,21.6834,50.0514,14.8081,47.29,15.4267,27.1091,9.2142,32.7744,37.3204,11.6844,20.637,8.7828,17.3407,24.3393,11.9763,20.8311,29.8205,15.1705,19.1415,9.5336,37.8611,11.3452,14.984,52.2908,21.9995,21.781,24.2418,9.6741,29.6155,27.1351,10.2918,53.4142,25.7587,17.6007,16.3987,16.5653,16.5122,9.1432,39.3741,9.9275,26.6804,7.4325,44.7761,16.1571,47.5134,6.9242,16.115,44.8673,20.2715,24.1847,12.6763,45.3759,33.7512,22.3698,8.9026,36.8353,8.7786,22.3905,39.7713,42.154,16.3102,40.2134,11.0,13.7491,14.0625,16.1065,75.2322,15.5472,39.0171,25.1492,11.26,7.0904,17.8939,15.2,19.9488,21.1677,16.4,15.6,10.4258,12.5707,11.4824,24.2529,36.77,12.8,12.9118,9.9765,14.1
ep_nhpi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0002,0.0,0.0257,0.0413,0.0,0.0933,0.0,0.0,0.0,0.0,0.0433,0.0,0.0,0.2739,0.0,0.0,0.0,0.0,0.0,0.0001,0.0,0.0,0.0,0.0004,0.1061,0.0,0.0,0.0,0.0001,0.0,0.0,0.0001,0.0551,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0398,0.0,0.0004,0.0,0.0,0.3759,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0596,0.0137,0.0493,0.0,0.0935,0.1182,0.0,0.0,0.0297,0.0004,0.0858,0.0,0.5501,0.0255,0.0687,0.0007,0.013,0.0147,0.0169,0.0004,0.0014,0.0,0.0,0.0,0.0,0.0001,0.0,0.0136,0.0003,0.0,0.1559,0.0001,0.0,0.0027,0.0,0.0529,0.0799,0.0,0.0377,0.0,0.0,1.1323,0.0,0.0,0.0555,0.0,0.0014,0.0,0.0844,0.0,0.0987,0.0,0.0,0.0017,0.0437,0.0,0.0,0.0114,0.0698,0.165,0.0128,0.0,0.0,0.0,0.0,0.0,0.0007,0.0,0.0,0.0425,0.0,0.0,0.0017,0.0,0.0,0.0,0.0034,0.0,0.204,0.0,0.169,0.0018,0.0,0.0,0.0044,0.0,0.0,0.0038,0.0125,0.0024,-12.6443,0.0326,0.0,0.0026,0.0,0.0134,0.2104,0.1181,0.0,0.0857,0.0,0.0,0.0,0.0,0.0,0.0,0.1972,0.0,0.0,0.0097,0.0,0.0459,0.0,0.0,0.0613,0.0,0.0,0.0645,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
ep_white,8.7,47.3927,4.4275,2.0544,60.6373,38.7568,2.5927,8.6984,3.6573,3.0336,8.5335,61.0143,3.9206,68.9144,63.2,13.3175,2.7574,1.856,55.16,2.7172,3.1983,10.4855,19.5334,2.7253,38.6612,25.8661,3.053,10.7591,1.9993,4.9,6.0168,41.6759,53.4791,3.3835,47.701,6.2559,17.0358,2.1526,49.1656,11.4502,3.2033,2.1,5.714,1.9302,8.7,55.1152,1.7514,49.0025,13.0843,15.0,12.3684,29.1115,46.332,15.8959,4.1492,52.0,10.4,21.1215,13.0285,43.2914,17.0875,12.0938,56.9549,21.2338,54.5024,36.1913,3.9025,16.3084,20.764,64.9294,60.0699,54.857,27.4556,2.6292,44.5541,8.8687,13.7612,21.8724,32.8712,64.5531,23.1759,5.2125,67.8136,30.9677,33.7452,22.5052,4.2998,17.8351,25.4287,37.8,61.7132,53.3678,21.332,41.5975,15.6699,48.1211,12.4942,18.3224,61.4346,54.1529,22.9082,16.5456,60.3677,31.9943,36.1283,20.5893,3.9173,5.2764,22.2668,55.7234,42.5566,57.5265,40.7341,56.4071,58.8023,32.9545,9.0134,47.3698,6.6801,74.7638,3.3022,72.5574,46.0653,13.5962,55.4398,52.2524,41.8333,60.9415,6.3871,41.7093,62.1342,12.6112,45.3839,5.6046,51.2371,53.1,62.6927,5.6144,19.8259,2.6801,46.197,76.7173,42.8972,58.6976,33.6246,1.4538,56.5633,43.1687,26.1369,56.4292,2.2534,21.9646,13.6151,36.3524,37.0368,25.5174,2.125,5.3608,23.5691,18.6702,18.7673,29.621,57.5,56.6195,47.9375,54.9774,4.686,55.1944,15.5671,61.459,72.6042,3.2308,56.2602,60.3,18.4,15.0258,30.2,69.6,84.3645,74.8878,79.1471,48.4529,44.535,76.1,29.2588,82.6294,52.1
ep_twomore,2.3,2.9144,0.5039,1.8489,5.4206,1.5584,1.5814,2.2915,5.3733,1.4791,1.8548,4.5833,1.3248,3.5507,3.5,2.172,1.6357,1.4164,1.5867,1.3918,1.6115,1.2782,2.7199,1.403,3.7029,3.7998,1.3816,3.0765,3.1664,2.7333,1.6611,3.2125,3.9919,1.7266,5.7636,1.4718,4.35,0.7526,3.1205,1.144,1.8585,1.8,2.2981,1.6034,2.3,3.7443,1.3332,1.3543,3.735,3.3,3.6325,2.4378,2.936,1.6605,0.6403,5.1072,2.4,3.3169,2.8921,2.6288,4.9143,4.2445,4.0345,3.1821,6.2782,5.4087,2.849,2.0242,2.3974,4.6068,3.196,3.7827,2.5111,0.8451,4.8548,4.3318,4.1827,3.4703,4.737,2.7188,4.2121,1.7357,5.1882,4.1132,3.9427,1.9443,5.3272,1.9685,4.1145,3.3274,3.451,4.4961,1.8446,3.4786,1.7356,2.7689,4.921,2.9211,4.2154,3.0665,3.9573,3.2988,5.6032,1.0235,1.8283,3.545,4.3909,2.9083,3.8175,3.6281,1.5612,3.2146,3.0084,3.9488,5.0358,3.0994,2.2984,2.7423,4.7516,2.7974,2.8249,2.8557,2.7369,1.8914,2.6201,4.2048,3.7995,3.4215,2.305,3.4835,3.8904,3.1723,3.7133,3.3761,4.5795,1.9551,3.4878,2.6576,1.7152,3.7258,4.1337,2.6845,2.1642,2.5357,1.7952,2.1484,2.3733,2.0737,2.3538,1.9958,5.3916,-10.5127,6.2089,4.9897,3.8211,1.5411,2.617,8.9785,0.8606,1.7677,1.9408,1.9193,5.3,1.9107,6.7875,4.6452,0.5301,1.8694,4.9079,2.9689,3.2152,2.2192,4.0735,2.4,4.5698,1.4677,2.5,6.5,1.3355,1.8756,1.3235,2.5235,1.3367,1.2,2.3059,1.4647,4.1
ep_otherrace,0.7,0.0159,0.4997,0.7046,0.5765,0.4703,0.3022,0.7095,0.6034,0.4059,1.4325,0.8952,0.5114,0.6397,0.4,0.7675,0.3825,0.5394,1.1267,0.6533,0.7708,0.599,2.3837,1.0964,2.1874,1.0145,0.9442,0.9779,1.8142,0.1333,0.4844,1.3674,0.4764,0.7472,0.6816,0.4192,0.7072,0.1642,0.8099,0.6361,2.0909,3.0,1.855,1.1973,0.7,0.1835,0.3745,0.742,4.3913,1.2,0.2397,0.8953,0.372,0.5867,0.5249,0.8243,0.8,0.408,0.7767,0.4604,0.7734,3.4744,0.6224,0.7687,0.6394,1.0609,1.5832,1.5818,0.4966,0.4147,0.5093,1.2673,0.4611,0.5355,0.8019,5.1272,0.6934,1.2368,0.8051,1.9519,0.9645,0.4099,0.5036,1.0044,0.5849,0.4261,0.5474,0.4406,0.9041,0.6992,0.4398,0.3882,0.9654,0.2721,0.5438,1.1346,2.6065,0.5392,0.3577,0.8293,0.6441,0.7076,0.8202,0.7187,0.75,0.5082,0.703,1.5961,0.9071,1.2484,0.4667,0.2132,0.7946,0.4179,0.2775,1.1293,0.9961,1.1987,5.3386,0.4862,1.5919,0.6875,1.0142,0.7971,0.384,0.6,0.6995,0.3837,1.7445,0.2691,0.4178,2.6131,1.6811,6.8437,0.9106,0.4245,0.3951,1.3458,0.5943,3.1374,0.7455,0.2588,0.4128,0.4786,0.7561,2.4275,0.415,0.4202,0.6631,1.6542,5.5988,-12.3253,8.6533,1.1341,0.8711,0.7389,2.1955,11.0244,0.6096,0.2374,0.5531,0.4059,0.5,2.1,0.5375,0.8097,0.5,0.5028,4.8237,0.2639,0.3079,0.9173,0.4602,0.5,3.1744,0.6742,0.3,0.6,0.2419,0.2366,0.5235,0.4235,0.8917,0.4,2.8941,0.6647,0.9
rpl_themes,0.9925,0.9911,0.9632,0.9979,0.6327,0.9444,0.9965,0.9909,0.9933,0.9971,0.9805,0.3961,0.9921,0.5234,0.8739,0.9706,0.9974,0.9985,0.7743,0.9943,0.9962,0.9448,0.964,0.9931,0.8872,0.8916,0.9925,0.9694,0.9319,0.976,0.9909,0.9494,0.7432,0.9945,0.7963,0.9888,0.8996,0.9951,0.8462,0.987,0.9933,0.9937,0.9845,0.9865,0.9925,0.9775,0.9952,0.9135,0.9596,0.9845,0.945,0.9788,0.7325,0.9341,0.9878,0.8567,0.9708,0.9247,0.9831,0.9371,0.9278,0.9627,0.6906,0.9531,0.6721,0.8588,0.9682,0.9703,0.9602,0.9387,0.8905,0.962,0.8954,0.9984,0.7642,0.9294,0.923,0.9446,0.8881,0.6556,0.9298,0.9424,0.9524,0.9077,0.7801,0.9052,0.9694,0.9725,0.927,0.9164,0.7518,0.7875,0.9781,0.9212,0.9145,0.9599,0.9564,0.9263,0.8649,0.9386,0.9183,0.9384,0.7827,0.9442,0.8567,0.9404,0.9119,0.9837,0.9314,0.9654,0.9344,0.9056,0.9652,0.8261,0.4907,0.9213,0.951,0.9599,0.8862,0.5451,0.9617,0.5769,0.9587,0.9763,0.8954,0.8059,0.8909,0.6857,0.9643,0.9205,0.5312,0.906,0.818,0.9471,0.9204,0.906,0.8744,0.7996,0.9021,-2.4252,0.8473,0.5378,0.8815,0.8112,0.9515,0.7489,0.8938,0.8814,0.9125,0.8658,0.9105,-11.6977,0.9267,0.848,0.8969,0.9576,0.8174,0.876,0.8925,0.9492,0.9197,0.9112,0.6233,0.9433,0.7095,0.8414,0.9625,0.857,0.8964,0.8641,0.4954,0.7059,0.7145,0.7861,0.8152,0.9445,0.824,0.637,0.4412,0.5291,0.546,0.8489,0.8709,0.5333,0.7721,0.5606,0.9427


## **This is for the race composites and neighborhoods bar chart**

# **Step 4: We want a dataframe with neighborhoods (NTAs) as rows and average_year_eviction_count and rpl_themes as columns**

In [171]:
evi_svi_df = evictions_pre_post.groupby('nta')[['average_year_eviction_count','rpl_themes']].mean()
evi_svi_df.sort_values('average_year_eviction_count', ascending=False, inplace=True)
evi_svi_df

Unnamed: 0_level_0,average_year_eviction_count,rpl_themes
nta,Unnamed: 1_level_1,Unnamed: 2_level_1
park-cemetery-etc-Bronx,6.4667,0.9925
Seagate-Coney Island,4.3904,0.9911
Corona,4.2597,0.9632
University Heights-Morris Heights,3.5051,0.9979
Park Slope-Gowanus,2.3725,0.6327
Grymes Hill-Clifton-Fox Hills,2.336,0.9444
Morrisania-Melrose,2.255,0.9965
Bronxdale,2.2396,0.9909
Brownsville,2.0909,0.9933
Claremont-Bathgate,1.9783,0.9971


In [145]:
evi_svi_df.reset_index(inplace=True)
evi_svi_df

Unnamed: 0,nta,average_year_eviction_count,rpl_themes
0,park-cemetery-etc-Bronx,6.4667,0.9925
1,Seagate-Coney Island,4.3904,0.9911
2,Corona,4.2597,0.9632
3,University Heights-Morris Heights,3.5051,0.9979
4,Park Slope-Gowanus,2.3725,0.6327
5,Grymes Hill-Clifton-Fox Hills,2.336,0.9444
6,Morrisania-Melrose,2.255,0.9965
7,Bronxdale,2.2396,0.9909
8,Brownsville,2.0909,0.9933
9,Claremont-Bathgate,1.9783,0.9971


In [146]:
avg_per_nta

np.float64(0.9049163589883761)

In [147]:
average_svi = 0.80198
# https://www.atsdr.cdc.gov/place-health/php/svi/svi-interactive-map.html

In [202]:
evi_svi_df['above_eviction_average'] = evi_svi_df['average_year_eviction_count'] > avg_per_nta
evi_svi_df

Unnamed: 0_level_0,average_year_eviction_count,rpl_themes,above_svi_average,above_eviction_average
nta,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
park-cemetery-etc-Bronx,6.4667,0.9925,True,True
Seagate-Coney Island,4.3904,0.9911,True,True
Corona,4.2597,0.9632,True,True
University Heights-Morris Heights,3.5051,0.9979,True,True
Park Slope-Gowanus,2.3725,0.6327,False,True
Grymes Hill-Clifton-Fox Hills,2.336,0.9444,True,True
Morrisania-Melrose,2.255,0.9965,True,True
Bronxdale,2.2396,0.9909,True,True
Brownsville,2.0909,0.9933,True,True
Claremont-Bathgate,1.9783,0.9971,True,True


In [203]:
evi_svi_df['above_svi_average'] = evi_svi_df['rpl_themes'] > average_svi
evi_svi_df

Unnamed: 0_level_0,average_year_eviction_count,rpl_themes,above_svi_average,above_eviction_average
nta,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
park-cemetery-etc-Bronx,6.4667,0.9925,True,True
Seagate-Coney Island,4.3904,0.9911,True,True
Corona,4.2597,0.9632,True,True
University Heights-Morris Heights,3.5051,0.9979,True,True
Park Slope-Gowanus,2.3725,0.6327,False,True
Grymes Hill-Clifton-Fox Hills,2.336,0.9444,True,True
Morrisania-Melrose,2.255,0.9965,True,True
Bronxdale,2.2396,0.9909,True,True
Brownsville,2.0909,0.9933,True,True
Claremont-Bathgate,1.9783,0.9971,True,True


# **Step 4.3 Run Chi-Square test**
  **Null Hypothesis: There is no association between a neighborhood having an above-average SVI score and having an above-average eviction count**

In [204]:
contingency_table = pd.crosstab(evi_svi_df.above_svi_average, evi_svi_df.above_eviction_average)
contingency_table

above_eviction_average,False,True
above_svi_average,Unnamed: 1_level_1,Unnamed: 2_level_1
False,29,9
True,86,63


In [205]:
chi2, p_value, dof, expected = chi2_contingency(contingency_table)
chi2, p_value, dof, expected

(np.float64(3.672242587459067),
 np.float64(0.05532569618900214),
 1,
 array([[23.36898396, 14.63101604],
        [91.63101604, 57.36898396]]))

### **The p-value is 0.0553 and the chi-square statistic is 3.67, so at the 0.05 level we cannot reject the null hypothesis. The p-value is borderline, however, and the statistic points to a small-to-moderate deviation from independence, so the relationship is still worth a closer look.**

# **Conclusion: there is no statistically significant association between "above average eviction" and "above average SVI score" neighborhoods.**
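The chi-square value reported above can be reproduced by hand, which makes it clear what `chi2_contingency` does with a 2x2 table. The sketch below is not part of the original notebook: `yates_chi2_2x2` is a hypothetical helper that assumes SciPy's default behaviour for 2x2 tables (`correction=True`, i.e. Yates' continuity correction) and uses the fact that, for one degree of freedom, the p-value equals `erfc(sqrt(chi2 / 2))`.

```python
from math import erfc, sqrt


def yates_chi2_2x2(table):
    """Yates-corrected chi-square statistic and p-value for a 2x2 table (illustrative helper)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / N
            expected = row_totals[i] * col_totals[j] / n
            # Yates: shrink |O - E| by 0.5, but never below zero
            diff = max(abs(observed - expected) - 0.5, 0.0)
            chi2 += diff ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Counts from the SVI vs. eviction contingency table above
chi2_svi, p_svi = yates_chi2_2x2([[29, 9], [86, 63]])
print(round(chi2_svi, 3), round(p_svi, 4))  # 3.672 0.0553
```

Because the correction pulls every observed count 0.5 closer to its expected value, the statistic is slightly smaller than the uncorrected Pearson chi-square, which matters for a borderline result like this one.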

In [153]:
a = evi_svi_df[(evi_svi_df['above_svi_average'] == True) & (evi_svi_df['above_eviction_average'] == True)].shape[0]
b = evi_svi_df[(evi_svi_df['above_svi_average'] == True) & (evi_svi_df['above_eviction_average'] == False)].shape[0]
c = evi_svi_df[(evi_svi_df['above_svi_average'] == False) & (evi_svi_df['above_eviction_average'] == True)].shape[0]
d = evi_svi_df[(evi_svi_df['above_svi_average'] == False) & (evi_svi_df['above_eviction_average'] == False)].shape[0]

In [156]:
observed = np.array([[a, b], [c, d]])
observed

array([[63.5, 86.5],
       [ 9.5, 29.5]])

In [157]:
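# Optional 0.5 correction on every cell (avoids division by zero).
# The outputs shown in this part (the observed array and the odds ratio) include it.
# a += 0.5
# b += 0.5
# c += 0.5
# d += 0.5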

In [158]:
odds_ratio = (a * d) / (b * c)
odds_ratio
# odds ratio above 1, but the chi-square test above is not significant at 0.05, so this is not a strong association

2.2795862488591423

**This means that neighborhoods with above-average SVI had 2.28 times the odds of having above-average eviction counts compared to low-SVI neighborhoods.** This figure includes the 0.5 added to each cell (see the `observed` array above); with the raw counts the ratio is (63 × 29) / (86 × 9) ≈ 2.36. The association is far from deterministic: 86 of the 149 above-average-SVI neighborhoods still had below-average evictions.
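A point estimate of the odds ratio says nothing about its precision. As a rough sketch (not something the notebook computes), the function below adds an approximate 95% confidence interval using the log-odds (Woolf) method. The helper name `odds_ratio_with_ci`, the `add_half` flag and the choice `z=1.96` are assumptions made for illustration.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(a, b, c, d, add_half=False, z=1.96):
    """Odds ratio (a*d)/(b*c) with an approximate confidence interval (illustrative helper)."""
    if add_half:
        # Same 0.5-per-cell adjustment the notebook uses to avoid zero cells
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# a, b, c, d as defined in the cells above: high SVI & high eviction, high SVI & low eviction, ...
or_raw, lo_raw, hi_raw = odds_ratio_with_ci(63, 86, 9, 29)
or_adj, lo_adj, hi_adj = odds_ratio_with_ci(63, 86, 9, 29, add_half=True)
print(or_raw, (lo_raw, hi_raw))
print(or_adj, (lo_adj, hi_adj))
```

Without `add_half=True` the function raises `ZeroDivisionError` when any cell is zero, which is the situation the 0.5 adjustment is meant to handle.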

In [159]:
chi2, p_value, dof, expected = chi2_contingency(observed)
chi2, p_value

(np.float64(3.494209694415174), np.float64(0.061583795661582674))

# **Step 5: We also need a dataframe with neighborhoods (NTAs) as rows and the black and hispanic percentage as a column**

In [173]:
racial_avg_all['black_hispanic_pct'] = racial_avg_all['ep_afam'] + racial_avg_all['ep_hisp']
racial_avg_all

Unnamed: 0,nta,average_year_eviction_count,ep_afam,ep_asian,ep_hisp,ep_nhpi,ep_white,ep_twomore,ep_otherrace,rpl_themes,black_hispanic_pct
0,park-cemetery-etc-Bronx,6.4667,28.6,6.2,52.9,0.0,8.7,2.3,0.7,0.9925,81.5
1,Seagate-Coney Island,4.3904,20.1385,9.827,19.4159,0.0,47.3927,2.9144,0.0159,0.9911,39.5544
2,Corona,4.2597,7.3968,11.305,75.7668,0.0,4.4275,0.5039,0.4997,0.9632,83.1637
3,University Heights-Morris Heights,3.5051,24.4407,1.6079,69.3353,0.0,2.0544,1.8489,0.7046,0.9979,93.776
4,Park Slope-Gowanus,2.3725,7.4853,9.501,16.3461,0.0,60.6373,5.4206,0.5765,0.6327,23.8314
5,Grymes Hill-Clifton-Fox Hills,2.336,25.1149,11.6023,22.3858,0.0,38.7568,1.5584,0.4703,0.9444,47.5007
6,Morrisania-Melrose,2.255,35.4504,0.6731,59.1572,0.0,2.5927,1.5814,0.3022,0.9965,94.6076
7,Bronxdale,2.2396,29.4301,6.1401,52.1306,0.0,8.6984,2.2915,0.7095,0.9909,81.5607
8,Brownsville,2.0909,70.0757,0.8424,19.3509,0.0002,3.6573,5.3733,0.6034,0.9933,89.4267
9,Claremont-Bathgate,1.9783,30.5182,1.0987,63.2627,0.0,3.0336,1.4791,0.4059,0.9971,93.7809


In [180]:
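# .copy() makes evi_bh_df independent of racial_avg_all, so adding columns later does not trigger SettingWithCopyWarning
evi_bh_df = racial_avg_all[['nta', 'average_year_eviction_count', 'black_hispanic_pct']].copy()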
evi_bh_df

Unnamed: 0,nta,average_year_eviction_count,black_hispanic_pct
0,park-cemetery-etc-Bronx,6.4667,81.5
1,Seagate-Coney Island,4.3904,39.5544
2,Corona,4.2597,83.1637
3,University Heights-Morris Heights,3.5051,93.776
4,Park Slope-Gowanus,2.3725,23.8314
5,Grymes Hill-Clifton-Fox Hills,2.336,47.5007
6,Morrisania-Melrose,2.255,94.6076
7,Bronxdale,2.2396,81.5607
8,Brownsville,2.0909,89.4267
9,Claremont-Bathgate,1.9783,93.7809


In [181]:
average_bh_pct = evi_bh_df['black_hispanic_pct'].mean()
average_bh_pct

np.float64(49.541793265928135)

In [182]:
avg_per_nta, average_bh_pct
# evictions and black + hispanic pct

(np.float64(0.9049163589883761), np.float64(49.541793265928135))

In [183]:
evi_bh_df['above_evi_average'] = evi_bh_df['average_year_eviction_count'] > avg_per_nta
evi_bh_df

Unnamed: 0,nta,average_year_eviction_count,black_hispanic_pct,above_evi_average
0,park-cemetery-etc-Bronx,6.4667,81.5,True
1,Seagate-Coney Island,4.3904,39.5544,True
2,Corona,4.2597,83.1637,True
3,University Heights-Morris Heights,3.5051,93.776,True
4,Park Slope-Gowanus,2.3725,23.8314,True
5,Grymes Hill-Clifton-Fox Hills,2.336,47.5007,True
6,Morrisania-Melrose,2.255,94.6076,True
7,Bronxdale,2.2396,81.5607,True
8,Brownsville,2.0909,89.4267,True
9,Claremont-Bathgate,1.9783,93.7809,True


In [187]:
evi_bh_df['above_bh_average'] = evi_bh_df['black_hispanic_pct'] > average_bh_pct
evi_bh_df

Unnamed: 0,nta,average_year_eviction_count,black_hispanic_pct,above_evi_average,above_bh_average
0,park-cemetery-etc-Bronx,6.4667,81.5,True,True
1,Seagate-Coney Island,4.3904,39.5544,True,False
2,Corona,4.2597,83.1637,True,True
3,University Heights-Morris Heights,3.5051,93.776,True,True
4,Park Slope-Gowanus,2.3725,23.8314,True,False
5,Grymes Hill-Clifton-Fox Hills,2.336,47.5007,True,False
6,Morrisania-Melrose,2.255,94.6076,True,True
7,Bronxdale,2.2396,81.5607,True,True
8,Brownsville,2.0909,89.4267,True,True
9,Claremont-Bathgate,1.9783,93.7809,True,True


### **This is the neighborhood evictions + black/hispanic chi-test df:**

# **Run Chi-Square test**
  **Null Hypothesis: There is no association between a neighborhood having an above-average black and hispanic percentage and having an above-average eviction count**

In [188]:
contingency_table = pd.crosstab(evi_bh_df.above_bh_average, evi_bh_df.above_evi_average)
contingency_table

above_evi_average,False,True
above_bh_average,Unnamed: 1_level_1,Unnamed: 2_level_1
False,80,27
True,35,45


In [189]:
chi2, p_value, dof, expected = chi2_contingency(contingency_table)
chi2, p_value, dof, expected

(np.float64(17.311162196233465),
 np.float64(3.1731742700266107e-05),
 1,
 array([[65.80213904, 41.19786096],
        [49.19786096, 30.80213904]]))

### **The p-value is 0.0000317 (3.17e-05, extremely small) and the chi-square statistic is 17.311. We can reject the null hypothesis: in the full dataset there is a strong association between a high black+hispanic percentage and high eviction counts, i.e. neighborhoods with a high black+hispanic percentage are more likely to have high evictions (and vice versa).**

# **Conclusion: there is a statistically significant association between "above average eviction" and "above average black+hispanic pct" neighborhoods.**
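The chi-square statistic grows with sample size, so it is hard to compare the SVI and black+hispanic results by looking at 3.67 and 17.31 alone. One common way to express the strength of a 2x2 association is the phi coefficient, `sqrt(chi2 / N)`. The snippet below is an illustrative addition, not part of the original analysis; `phi_coefficient` is a hypothetical helper, and N = 187 is the sum of the four cells in either contingency table.

```python
from math import sqrt


def phi_coefficient(chi2, n):
    """Effect size for a 2x2 table: sqrt(chi-square / sample size)."""
    return sqrt(chi2 / n)


n_neighborhoods = 187  # 29 + 9 + 86 + 63, same total in both full-dataset tables
phi_svi = phi_coefficient(3.672242587459067, n_neighborhoods)
phi_bh = phi_coefficient(17.311162196233465, n_neighborhoods)
print(f"{phi_svi:.3f} {phi_bh:.3f}")  # 0.140 0.304
```

Under Cohen's conventional benchmarks (about 0.1 small, 0.3 medium, 0.5 large), the SVI association would count as small and the black+hispanic association as medium. Both values are computed from the Yates-corrected statistic, so they are slightly lower than the uncorrected phi would be.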

In [190]:
a = evi_bh_df[(evi_bh_df['above_bh_average'] == True) & (evi_bh_df['above_evi_average'] == True)].shape[0]
b = evi_bh_df[(evi_bh_df['above_bh_average'] == True) & (evi_bh_df['above_evi_average'] == False)].shape[0]
c = evi_bh_df[(evi_bh_df['above_bh_average'] == False) & (evi_bh_df['above_evi_average'] == True)].shape[0]
d = evi_bh_df[(evi_bh_df['above_bh_average'] == False) & (evi_bh_df['above_evi_average'] == False)].shape[0]

In [191]:
observed = np.array([[a, b], [c, d]])
observed

array([[45, 35],
       [27, 80]])

In [193]:
odds_ratio = (a * d) / (b * c)
odds_ratio

3.8095238095238093

**This means that neighborhoods with an above-average b+h pct had 3.81 times the odds of having above-average eviction counts compared to low-bh-pct neighborhoods in NYC.** The association is strong but not absolute: 35 of the 80 high-bh-pct neighborhoods had below-average evictions, and 27 of the 72 high-eviction neighborhoods had a below-average bh pct.

In [194]:
chi2, p_value, dof, expected = chi2_contingency(observed)
chi2, p_value
# 17.31 is a strong deviation from what is expected under the null hypothesis

(np.float64(17.311162196233465), np.float64(3.1731742700266107e-05))

### **Quick extreme cases**

In [207]:
evi_svi_df_top_10 = evi_svi_df.sort_values('average_year_eviction_count', ascending=False).head(10)
evi_svi_df_bottom_10 = evi_svi_df.sort_values('average_year_eviction_count', ascending=False).tail(10)
evi_svi_df_extremes = pd.concat([evi_svi_df_top_10, evi_svi_df_bottom_10])
evi_svi_df_extremes

Unnamed: 0_level_0,average_year_eviction_count,rpl_themes,above_svi_average,above_eviction_average
nta,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
park-cemetery-etc-Bronx,6.4667,0.9925,True,True
Seagate-Coney Island,4.3904,0.9911,True,True
Corona,4.2597,0.9632,True,True
University Heights-Morris Heights,3.5051,0.9979,True,True
Park Slope-Gowanus,2.3725,0.6327,False,True
Grymes Hill-Clifton-Fox Hills,2.336,0.9444,True,True
Morrisania-Melrose,2.255,0.9965,True,True
Bronxdale,2.2396,0.9909,True,True
Brownsville,2.0909,0.9933,True,True
Claremont-Bathgate,1.9783,0.9971,True,True


In [211]:
contingency_table = pd.crosstab(evi_svi_df_extremes.above_svi_average, evi_svi_df_extremes.above_eviction_average)
contingency_table

above_eviction_average,False,True
above_svi_average,Unnamed: 1_level_1,Unnamed: 2_level_1
False,7,1
True,3,9


In [212]:
chi2, p_value, dof, expected = chi2_contingency(contingency_table)
chi2, p_value, dof, expected

(np.float64(5.208333333333334),
 np.float64(0.022478873366125265),
 1,
 array([[4., 4.],
        [6., 6.]]))

### **The p-value is 0.022 and the chi-square statistic is 5.208. We can reject the null hypothesis at the 0.05 level: among the 20 extreme cases (top 10 and bottom 10 neighborhoods by eviction count), high-SVI neighborhoods are more likely to have high evictions (and vice versa). The sample is small and two expected counts are 4, below the usual minimum of 5 for the chi-square approximation, so this result is only indicative.**

# **Conclusion: among the extreme cases, there is a statistically significant association between "above average eviction" and "above average SVI score" neighborhoods.**
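With only 20 neighborhoods and expected counts as low as 4, the chi-square approximation is shaky, and an exact test is a common alternative. The sketch below implements a two-sided Fisher's exact test from the hypergeometric distribution using only the standard library. It is an illustrative addition rather than the notebook's method, and `fisher_exact_two_sided` is a hypothetical helper name.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2x2 table of integer counts (illustrative helper)."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(x):
        # Number of tables with x in the top-left cell, given fixed margins
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed_weight = weight(a)
    # Sum the probabilities of all tables that are no more likely than the observed one
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed_weight
    )
    return extreme / comb(n, col1)


# Extreme-case SVI table above: rows = above_svi_average (False, True)
p_svi_extremes = fisher_exact_two_sided([[7, 1], [3, 9]])
print(round(p_svi_extremes, 4))  # 0.0198
```

Working with integer weights avoids floating-point ties when comparing table probabilities. SciPy offers the same test as `scipy.stats.fisher_exact`.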

In [214]:
a = evi_svi_df_extremes[(evi_svi_df_extremes['above_svi_average'] == True) & (evi_svi_df_extremes['above_eviction_average'] == True)].shape[0]
b = evi_svi_df_extremes[(evi_svi_df_extremes['above_svi_average'] == True) & (evi_svi_df_extremes['above_eviction_average'] == False)].shape[0]
c = evi_svi_df_extremes[(evi_svi_df_extremes['above_svi_average'] == False) & (evi_svi_df_extremes['above_eviction_average'] == True)].shape[0]
d = evi_svi_df_extremes[(evi_svi_df_extremes['above_svi_average'] == False) & (evi_svi_df_extremes['above_eviction_average'] == False)].shape[0]

In [215]:
observed = np.array([[a, b], [c, d]])
observed

array([[9, 3],
       [1, 7]])

In [216]:
odds_ratio = (a * d) / (b * c)
odds_ratio

21.0

**This means that, among the 20 extreme cases, neighborhoods with above-average SVI had 21 times the odds of having above-average eviction counts compared to low-SVI neighborhoods.** The ratio is this large because the extremes separate almost cleanly: 9 of the 12 high-SVI neighborhoods had high evictions, and 7 of the 8 low-SVI neighborhoods had low evictions. With only 20 neighborhoods the estimate is very imprecise.

In [217]:
evi_bh_df_top_10 = evi_bh_df.sort_values('average_year_eviction_count', ascending=False).head(10)
evi_bh_df_bottom_10 = evi_bh_df.sort_values('average_year_eviction_count', ascending=False).tail(10)
evi_bh_df_extremes = pd.concat([evi_bh_df_top_10, evi_bh_df_bottom_10])
evi_bh_df_extremes

Unnamed: 0,nta,average_year_eviction_count,black_hispanic_pct,above_evi_average,above_bh_average
0,park-cemetery-etc-Bronx,6.4667,81.5,True,True
1,Seagate-Coney Island,4.3904,39.5544,True,False
2,Corona,4.2597,83.1637,True,True
3,University Heights-Morris Heights,3.5051,93.776,True,True
4,Park Slope-Gowanus,2.3725,23.8314,True,False
5,Grymes Hill-Clifton-Fox Hills,2.336,47.5007,True,False
6,Morrisania-Melrose,2.255,94.6076,True,True
7,Bronxdale,2.2396,81.5607,True,True
8,Brownsville,2.0909,89.4267,True,True
9,Claremont-Bathgate,1.9783,93.7809,True,True


In [220]:
contingency_table = pd.crosstab(evi_bh_df_extremes.above_bh_average, evi_bh_df_extremes.above_evi_average)
contingency_table

above_evi_average,False,True
above_bh_average,Unnamed: 1_level_1,Unnamed: 2_level_1
False,10,3
True,0,7


In [221]:
chi2, p_value, dof, expected = chi2_contingency(contingency_table)
chi2, p_value, dof, expected

(np.float64(7.912087912087912),
 np.float64(0.004910556162172649),
 1,
 array([[6.5, 6.5],
        [3.5, 3.5]]))

### **The p-value is 0.0049 and the chi-square statistic is 7.912. We can reject the null hypothesis: among the 20 extreme cases, neighborhoods with a high black+hispanic percentage are more likely to have high evictions (and vice versa). The sample is small and two expected counts are 3.5, so treat this as indicative only.**

# **Conclusion: among the extreme cases, there is a statistically significant association between "above average eviction" and "above average black+hispanic pct" neighborhoods.**

In [224]:
a = evi_bh_df_extremes[(evi_bh_df_extremes['above_bh_average'] == True) & (evi_bh_df_extremes['above_evi_average'] == True)].shape[0]
b = evi_bh_df_extremes[(evi_bh_df_extremes['above_bh_average'] == True) & (evi_bh_df_extremes['above_evi_average'] == False)].shape[0]
c = evi_bh_df_extremes[(evi_bh_df_extremes['above_bh_average'] == False) & (evi_bh_df_extremes['above_evi_average'] == True)].shape[0]
d = evi_bh_df_extremes[(evi_bh_df_extremes['above_bh_average'] == False) & (evi_bh_df_extremes['above_evi_average'] == False)].shape[0]

In [225]:
observed = np.array([[a, b], [c, d]])
observed

array([[ 7,  0],
       [ 3, 10]])

In [226]:
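# Add 0.5 to every cell (Haldane-Anscombe correction): b is 0 here,
# so the uncorrected odds ratio would divide by zero.
a += 0.5
b += 0.5
c += 0.5
d += 0.5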

In [227]:
odds_ratio = (a * d) / (b * c)
odds_ratio

45.0

**This means that, among the 20 extreme cases, neighborhoods with an above-average b+h pct had 45 times the odds of having above-average eviction counts compared to low-bh-pct neighborhoods in NYC.** The ratio is this large because all 7 high-bh-pct neighborhoods in the extremes had high evictions (the zero cell), so the value depends heavily on the 0.5 added to each cell and should not be read as a precise estimate.
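The same three steps (crosstab, chi-square test, odds ratio) are repeated four times above, with the four cell counts extracted by hand each time. One possible way to bundle them is sketched below. `two_by_two_summary` is a hypothetical helper and not part of the original notebook; it assumes both columns are boolean flags like `above_bh_average` and `above_evi_average`, and the demo frame is rebuilt from the cell counts of the full black+hispanic table rather than from the real data.

```python
import pandas as pd
from scipy.stats import chi2_contingency


def two_by_two_summary(df, exposure_col, outcome_col, add_half=False):
    """Chi-square test and odds ratio for two boolean columns (illustrative helper)."""
    table = pd.crosstab(df[exposure_col], df[outcome_col])
    # Fix the cell order to [[a, b], [c, d]] and fill any missing cell with 0
    table = table.reindex(index=[True, False], columns=[True, False], fill_value=0)
    chi2, p_value, dof, expected = chi2_contingency(table)
    a, b, c, d = (float(x) for x in table.to_numpy().ravel())
    if add_half:
        # 0.5 added to every cell, as the notebook does for the zero-cell case
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return {"chi2": chi2, "p_value": p_value, "odds_ratio": (a * d) / (b * c)}


# Toy frame with the same cell counts as the full black+hispanic contingency table
cells = (
    [(True, True)] * 45 + [(True, False)] * 35
    + [(False, True)] * 27 + [(False, False)] * 80
)
demo = pd.DataFrame(cells, columns=["above_bh_average", "above_evi_average"])
summary = two_by_two_summary(demo, "above_bh_average", "above_evi_average")
print(summary)
```

Passing `add_half=True` only changes the odds ratio; the chi-square test is still run on the raw counts, so the test and the ratio are not silently computed on different tables.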