# Data Prep - Amsterdam Open Data -  Predicting Housing Prices

In this notebook we prepare and join data retrieved as csv-files from Amsterdam Open Data. We have lot's of information about the different neighborhoods in the city such as "safety index", "crime index", % of people with certain income levels etc.

In [2]:
# Import libraries
import pandas as pd
import re

In [2]:
# Read excelfiles

xl = pd.ExcelFile(r'C:\Users\tiina\Downloads\buurt_info3.xlsx')
xl.sheet_names
# read a specific sheet to DataFrame
safety = xl.parse('VI_heel 2019')

In [3]:
safety.head()

Unnamed: 0,VM14gebiednaam afgekort,VM14gebiednummer,VM14 gebiedsnaam voluit,totale veiligheidsindex,index gemiddelde inschatting (ontwikkeling) buurtcriminaliteit en -overlast,index gemiddelde score onveiligheids-gevoelens buurt,index gemiddelde score vermijding plekken en situaties in buurt,totale index onveiligheids-beleving,deelindex overlast: personenoverlast,deelindex overlast: verloedering,...,populatie per 1-1-2017,aantal woningen per 1-1-2017,aantal winkels per 1-1-2017,aantal bedrijven (niet-winkels) per 1-1-2017,aantal werkzame personen,"aantal studenten (VO, HBO en WO)",aantal bezoekers op een gemiddelde dag,"som werkzame personen, studenten en bezoekers",aantal verblijvers in een buurt waarbij nbezoek voor 1/3 meetelt,aantal verblijvers in een buurt waarbij werkzame personen voor 1/3 en andere bezoekers voor 1/8 meetellen
0,vmgebiednaam,vmgebied16nr,,veiligheidsindex,ipsl_risicoperceptie,ipsl_onveiligheidsgevoelens,ipsl_vermijding,Onveiligheidsbelevingindex,Ipersonenoverlast,Iverloedering,...,pop,nwoninge,nwink,nbedr,werkpers,student,bezoek,nbezoek,verbtot3,verbnieuw
1,A00,1,Burgwallen-Oude Zijde,178.1,136.67,99.27,84.26,106.73,507.91,143.21,...,4293,2917,222,1934,6233,1746,80626,88605,33828,16667.2
2,A01,2,Burgwallen-Nieuwe Zijde,168.84,122.61,94.46,73.79,96.95,453.97,130.8,...,4130,2884,594,2350,9428,1689,139692,150809,54399.7,24945.3
3,A02,3,Grachtengordel-West,79.61,110.42,52.24,45.99,69.55,132.56,100.52,...,6374,4314,245,3043,6989,0,43940,50929,23350.3,14196.2
4,A03,4,Grachtengordel-Zuid,137.45,112.97,65.4,45.37,74.58,268.8,127.85,...,5383,3449,177,1799,6350,0,29459,35809,17319.3,11182


In [4]:
# Clean up some messy rows to get the data right
safety = safety[safety['VM14gebiednaam afgekort'] != 'vmgebiednaam']

In [5]:
# Read income dataset

xl = pd.ExcelFile(r'C:\Users\tiina\Downloads\income staadsdelen.xlsx')

xl.sheet_names

# read a specific sheet to DataFrame
income = xl.parse('2020_stadsdelen_3.18')

In [6]:
income.head(60)

Unnamed: 0,3.18a Personen met inkomen in particuliere huishoudens (incl. studenten) (x 1.000) naar inkomensklassen,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7
0,(20%-groepen van de landelijke ve...,,,,,,,
1,,,,,,,,
2,,,minder dan,11.100 euro-,20.000 euro-,31.200 euro-,46.800 euro,
3,wijk/std,naam wijk/std,11.100 euro,<20.000 euro,<31.200 euro,<46.800 euro,en meer,totaal
4,,,,,,,,
5,A00,Burgwallen-Oude Zijde,0.8,0.7,0.7,0.7,0.9,3.7
6,A01,Burgwallen-Nieuwe Zijde,0.6,0.6,0.6,0.7,0.9,3.4
7,A02,Grachtengordel-West,1,0.6,0.6,0.9,2.1,5.2
8,A03,Grachtengordel-Zuid,0.8,0.5,0.6,0.8,1.7,4.3
9,A04,Nieuwmarkt/Lastage,1.4,1.5,1.3,1.5,2.4,8.1


In [7]:
# Rename columns and translate to english

column_names = ['neighborhood', 'income_low', 'income_med_low', 'income_med', 'income_med_high', 'ingome_high', 'total']

In [8]:
# Drop unnecessary column
income.drop(columns={'3.18a   Personen met inkomen in particuliere huishoudens (incl. studenten) (x 1.000) naar inkomensklassen'},inplace=True)

In [9]:
income.columns

Index(['Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4', 'Unnamed: 5',
       'Unnamed: 6', 'Unnamed: 7'],
      dtype='object')

In [10]:
# Drop null values
income.dropna(inplace=True)

In [11]:
# Assign new columns
income.columns = column_names

In [12]:
# remove messy rows
income = income[income.neighborhood != 'naam wijk/std']

In [13]:
# Calculate percentage of people at certain income level
for col in income.columns:
    if col != 'neighborhood':
        income[col] = pd.to_numeric(income[col], errors='coerce')
        income[col] = income[col]/income['total']

In [None]:
# See the unique values for different neighborhoods
income.neighborhood.unique()

In [16]:
# Read the houses dataset - so we can start combining data
houses = pd.read_csv('houses_coordinates.csv')

## Building keys to join data from different datasets

In this project  I have data coming from different sources, and with that also comes many variants of naming etc. I will use two datasets from CBS to first combine data with postalcodes, and then retrieving the neighborhood names, which will then naturally be unified accross the dataset.

In [17]:
# Read postcode data
post_code_statsdel = pd.read_csv(r'C:\Users\tiina\Downloads\2019-cbs-pc6huisnr20190801_buurt\pc6hnr20190801_gwb.csv', sep=";")

In [18]:
# Read neighborhood names
stads_del = pd.read_csv(r'C:\Users\tiina\Downloads\2019-cbs-pc6huisnr20190801_buurt\wijk2019.csv', sep=";")

In [19]:
# Drop duplicate values - data was on a more granural level than postcodes
post_wijken = post_code_statsdel.drop_duplicates(subset=['PC6'])

In [20]:
# Merge datasets together to retrieve the name
post_wijken = post_wijken.merge(stads_del, how='left', left_on='Wijk2019', right_on='Wijkcode2019')

In [21]:
# Filter only Amsterdam with code 363
post_wijken = post_wijken[post_wijken['Gemeente2019']==363]
post_wijken.head()

Unnamed: 0,PC6,Huisnummer,Buurt2019,Wijk2019,Gemeente2019,Wijkcode2019,Wijknaam_2019K_NAAM
0,1011AB,105,3630400,36304,363,36304,Nieuwmarkt/Lastage
1,1011AC,125,3630400,36304,363,36304,Nieuwmarkt/Lastage
2,1011AD,1,3630400,36304,363,36304,Nieuwmarkt/Lastage
3,1011AE,8,3630400,36304,363,36304,Nieuwmarkt/Lastage
4,1011AG,96,3630403,36304,363,36304,Nieuwmarkt/Lastage


In [22]:
# Drop unnecessary columns
post_wijken = post_wiken.drop(columns={'Huisnummer', 'Buurt2019', 'Gemeente2019', 'Wijkcode2019'})

In [23]:
# Clean neighborhood names
post_wijken['Wijknaam_2019K_NAAM'] = post_wijken.Wijknaam_2019K_NAAM.apply(lambda x: re.sub('-', ' ', x))

In [24]:
# Compare neighborhood names 
sorted(post_wijken.Wijknaam_2019K_NAAM.unique())

['Amstel III/Bullewijk',
 'Apollobuurt',
 'Banne Buiksloot',
 'Bedrijventerrein Sloterdijk',
 'Betondorp',
 'Bijlmer Centrum (D,F,H)',
 'Bijlmer Oost (E,G,K)',
 'Buikslotermeer',
 'Buitenveldert Oost',
 'Buitenveldert West',
 'Burgwallen Nieuwe Zijde',
 'Burgwallen Oude Zijde',
 'Centrale Markt',
 'Chassébuurt',
 'Da Costabuurt',
 'Dapperbuurt',
 'De Kolenkit',
 'De Punt',
 'De Weteringschans',
 'Driemond',
 'Eendracht',
 'Elzenhagen',
 'Erasmuspark',
 'Frankendael',
 'Frederik Hendrikbuurt',
 'Gein',
 'Geuzenbuurt',
 'Geuzenveld',
 'Grachtengordel West',
 'Grachtengordel Zuid',
 'Haarlemmerbuurt',
 'Helmersbuurt',
 'Holendrecht/Reigersbos',
 'Hoofddorppleinbuurt',
 'Hoofdweg e.o.',
 'Houthavens',
 'IJburg Oost',
 'IJburg West',
 'IJburg Zuid',
 'IJplein/Vogelbuurt',
 'IJselbuurt',
 'Indische Buurt Oost',
 'Indische Buurt West',
 'Jordaan',
 'Kadoelen',
 'Kinkerbuurt',
 'Landlust',
 'Lutkemeer/Ookmeer',
 'Middelveldsche Akerpolder',
 'Middenmeer',
 'Museumkwartier',
 'Nellestein',
 'Ni

In [25]:
# Identifying neighborhood values that don't match our neighborhood names
not_found = []
for wijk in income.neighborhood:
    if re.sub('-', ' ', wijk)not in post_wijken.Wijknaam_2019K_NAAM.unique():
        not_found.append(wijk)

In [26]:
len(not_found)

10

In [27]:
print(not_found)

['Centrum', 'Westpoort', 'West', 'Nieuw-West', 'Zuid', 'De Omval/Overamstel', 'Oost', 'Noord', 'Zuidoost', 'Amsterdam']


In [28]:
# modify:
sorted(income.neighborhood.unique())

['Amstel III/Bullewijk',
 'Amsterdam',
 'Apollobuurt',
 'Banne Buiksloot',
 'Bedrijventerrein Sloterdijk',
 'Betondorp',
 'Bijlmer-Centrum (D,F,H)',
 'Bijlmer-Oost (E,G,K)',
 'Buikslotermeer',
 'Buitenveldert-Oost',
 'Buitenveldert-West',
 'Burgwallen-Nieuwe Zijde',
 'Burgwallen-Oude Zijde',
 'Centrale Markt',
 'Centrum',
 'Chassébuurt',
 'Da Costabuurt',
 'Dapperbuurt',
 'De Kolenkit',
 'De Omval/Overamstel',
 'De Punt',
 'De Weteringschans',
 'Driemond',
 'Eendracht',
 'Elzenhagen',
 'Erasmuspark',
 'Frankendael',
 'Frederik Hendrikbuurt',
 'Gein',
 'Geuzenbuurt',
 'Geuzenveld',
 'Grachtengordel-West',
 'Grachtengordel-Zuid',
 'Haarlemmerbuurt',
 'Helmersbuurt',
 'Holendrecht/Reigersbos',
 'Hoofddorppleinbuurt',
 'Hoofdweg e.o.',
 'Houthavens',
 'IJburg-Oost',
 'IJburg-West',
 'IJburg-Zuid',
 'IJplein/Vogelbuurt',
 'IJselbuurt',
 'Indische Buurt-Oost',
 'Indische Buurt-West',
 'Jordaan',
 'Kadoelen',
 'Kinkerbuurt',
 'Landlust',
 'Lutkemeer/Ookmeer',
 'Middelveldsche Akerpolder',
 'M

In [29]:
# Merge data on post codes
houses = pd.merge(houses, post_wijken[['PC6','Wijknaam_2019K_NAAM']], how='left', left_on='post_code', right_on='PC6')

In [30]:
houses.isnull().sum()

Unnamed: 0                0
Unnamed: 0.1              0
price                     1
post_code                 0
neighborhood              0
living_area               0
rooms                     0
year                      0
offered_since            14
status                   14
monthly_cost_vve          0
type                     14
bedrooms                130
bathrooms               484
isolation               688
energy_label           1038
heating                 411
parking                 539
garage                  271
balcony                 180
garden                  183
storage                 933
location               2010
loc                       0
PC6                       4
Wijknaam_2019K_NAAM       4
dtype: int64

In [31]:
houses[houses['Wijknaam_2019K_NAAM'].isnull()]

Unnamed: 0.2,Unnamed: 0,Unnamed: 0.1,price,post_code,neighborhood,living_area,rooms,year,offered_since,status,...,heating,parking,garage,balcony,garden,storage,location,loc,PC6,Wijknaam_2019K_NAAM
1958,1958,1958,196000.0,1087VR,Centrumeiland,46,1,20112020,3 maanden,Beschikbaar,...,"Stadsverwarming, Warmtepomp",,Ja,Niet aanwezig,Niet aanwezig,Aanwezig,,"['-', '-']",,
1959,1959,1959,445000.0,1087VR,Centrumeiland,107,4,20112020,3 maanden,Beschikbaar,...,"Stadsverwarming, Warmtepomp",,Ja,Niet aanwezig,Niet aanwezig,Aanwezig,,"['-', '-']",,
1960,1960,1960,357000.0,1087VR,Centrumeiland,86,3,20112020,3 maanden,Beschikbaar,...,"Stadsverwarming, Warmtepomp",,Ja,Niet aanwezig,Niet aanwezig,Aanwezig,,"['-', '-']",,
1961,1961,1961,241000.0,1087VR,Centrumeiland,58,2,20112020,3 maanden,Beschikbaar,...,"Stadsverwarming, Warmtepomp",,Ja,Niet aanwezig,Niet aanwezig,Aanwezig,,"['-', '-']",,


In [32]:
# Adjust names that they match what is required for combining the income information (as we don't have post_code level data)
income.neighborhood = income.neighborhood.apply(lambda x: re.sub('-', ' ', x))
income.neighborhood = income.neighborhood.apply(lambda x: re.sub('De Omval/Overamstel', 'Omval/Overamstel', x))


In [33]:
houses = pd.merge(houses, income, how='left', left_on='Wijknaam_2019K_NAAM', right_on='neighborhood')

In [34]:
houses[houses['total'].isnull()]

Unnamed: 0.2,Unnamed: 0,Unnamed: 0.1,price,post_code,neighborhood_x,living_area,rooms,year,offered_since,status,...,loc,PC6,Wijknaam_2019K_NAAM,neighborhood_y,income_low,income_med_low,income_med,income_med_high,ingome_high,total
541,541,541,399000.0,1087VA,Centrumeiland,81,3,2021,2 weken,Beschikbaar,...,"[52.3523901, 5.011545]",1087VA,IJburg Oost,IJburg Oost,,,,,,
542,542,542,550000.0,1087VA,Centrumeiland,123,4,2021,2 weken,Beschikbaar,...,"[52.3523901, 5.011545]",1087VA,IJburg Oost,IJburg Oost,,,,,,
543,543,543,325000.0,1087VA,Centrumeiland,64,2,2021,2 weken,Beschikbaar,...,"[52.3523901, 5.011545]",1087VA,IJburg Oost,IJburg Oost,,,,,,
1890,1890,1890,675000.0,1087VA,Centrumeiland,150,5,20112020,3 maanden,Beschikbaar,...,"[52.3523901, 5.011545]",1087VA,IJburg Oost,IJburg Oost,,,,,,
1895,1895,1895,495000.0,1087VA,Centrumeiland,105,4,20112020,2 maanden,Beschikbaar,...,"[52.3523901, 5.011545]",1087VA,IJburg Oost,IJburg Oost,,,,,,
1896,1896,1896,540000.0,1087VA,Centrumeiland,120,6,20112020,2 maanden,Beschikbaar,...,"[52.3523901, 5.011545]",1087VA,IJburg Oost,IJburg Oost,,,,,,
1897,1897,1897,285000.0,1087VA,Centrumeiland,60,3,20112020,2 maanden,Beschikbaar,...,"[52.3523901, 5.011545]",1087VA,IJburg Oost,IJburg Oost,,,,,,
1958,1958,1958,196000.0,1087VR,Centrumeiland,46,1,20112020,3 maanden,Beschikbaar,...,"['-', '-']",,,,,,,,,
1959,1959,1959,445000.0,1087VR,Centrumeiland,107,4,20112020,3 maanden,Beschikbaar,...,"['-', '-']",,,,,,,,,
1960,1960,1960,357000.0,1087VR,Centrumeiland,86,3,20112020,3 maanden,Beschikbaar,...,"['-', '-']",,,,,,,,,


Some of the newer neighborhoods don't have information on these datasets. I will deal with null values at later stage in this project.

# Neighborhood info on Safety - Preparation

In [36]:
safety.head()

Unnamed: 0,VM14gebiednaam afgekort,VM14gebiednummer,VM14 gebiedsnaam voluit,totale veiligheidsindex,index gemiddelde inschatting (ontwikkeling) buurtcriminaliteit en -overlast,index gemiddelde score onveiligheids-gevoelens buurt,index gemiddelde score vermijding plekken en situaties in buurt,totale index onveiligheids-beleving,deelindex overlast: personenoverlast,deelindex overlast: verloedering,...,populatie per 1-1-2017,aantal woningen per 1-1-2017,aantal winkels per 1-1-2017,aantal bedrijven (niet-winkels) per 1-1-2017,aantal werkzame personen,"aantal studenten (VO, HBO en WO)",aantal bezoekers op een gemiddelde dag,"som werkzame personen, studenten en bezoekers",aantal verblijvers in een buurt waarbij nbezoek voor 1/3 meetelt,aantal verblijvers in een buurt waarbij werkzame personen voor 1/3 en andere bezoekers voor 1/8 meetellen
1,A00,1,Burgwallen-Oude Zijde,178.1,136.67,99.27,84.26,106.73,507.91,143.21,...,4293,2917,222,1934,6233,1746,80626,88605,33828.0,16667.2
2,A01,2,Burgwallen-Nieuwe Zijde,168.84,122.61,94.46,73.79,96.95,453.97,130.8,...,4130,2884,594,2350,9428,1689,139692,150809,54399.7,24945.3
3,A02,3,Grachtengordel-West,79.61,110.42,52.24,45.99,69.55,132.56,100.52,...,6374,4314,245,3043,6989,0,43940,50929,23350.3,14196.2
4,A03,4,Grachtengordel-Zuid,137.45,112.97,65.4,45.37,74.58,268.8,127.85,...,5383,3449,177,1799,6350,0,29459,35809,17319.3,11182.0
5,A04,5,Nieuwmarkt,120.79,117.95,75.9,68.92,87.59,203.33,119.49,...,9676,6190,128,2218,6830,2217,35470,44517,24515.0,16663.5


In [37]:
# Pick specific columns 
safety = safety[['VM14 gebiedsnaam voluit', 'totale veiligheidsindex', 'totale index onveiligheids-beleving', 'totale overlastindex', 'totale criminaliteitsindex', 'aantal winkels per  1-1-2017', 'aantal bedrijven (niet-winkels) per  1-1-2017', 'aantal bezoekers op een gemiddelde dag']]

In [38]:
safety

Unnamed: 0,VM14 gebiedsnaam voluit,totale veiligheidsindex,totale index onveiligheids-beleving,totale overlastindex,totale criminaliteitsindex,aantal winkels per 1-1-2017,aantal bedrijven (niet-winkels) per 1-1-2017,aantal bezoekers op een gemiddelde dag
1,Burgwallen-Oude Zijde,178.1,106.73,325.56,102.01,222,1934,80626
2,Burgwallen-Nieuwe Zijde,168.84,96.95,292.38,117.2,594,2350,139692
3,Grachtengordel-West,79.61,69.55,116.54,52.74,245,3043,43940
4,Grachtengordel-Zuid,137.45,74.58,198.33,139.46,177,1799,29459
5,Nieuwmarkt,120.79,87.59,161.41,113.35,128,2218,35470
...,...,...,...,...,...,...,...,...
137,De Kwakel,43.03,61.9,45.58,21.6,10,499,1800
138,Kudelstaart,58.24,84.37,60.93,29.42,18,454,2245
139,Aalsmeer Hornmeer,63.96,64.37,74.14,53.36,8,340,1388
140,Aalsmeer Oost,57.32,74.79,60.76,36.4,39,1031,2776


In [39]:
# Translate columns names
new_columns = ['buurt', 'safety_index', 'safety_perception_index', 'nuisance_index', 'crime_index', 'no_shops', 'no_companies', 'daily_visitors']

In [40]:
# Assign new column names
safety.columns = new_columns

In [41]:
safety.head()

Unnamed: 0,buurt,safety_index,safety_perception_index,nuisance_index,crime_index,no_shops,no_companies,daily_visitors
1,Burgwallen-Oude Zijde,178.1,106.73,325.56,102.01,222,1934,80626
2,Burgwallen-Nieuwe Zijde,168.84,96.95,292.38,117.2,594,2350,139692
3,Grachtengordel-West,79.61,69.55,116.54,52.74,245,3043,43940
4,Grachtengordel-Zuid,137.45,74.58,198.33,139.46,177,1799,29459
5,Nieuwmarkt,120.79,87.59,161.41,113.35,128,2218,35470


In [43]:
# Transform names to match our requested format
safety.buurt = safety.buurt.apply(lambda x: re.sub('-', ' ', x))

In [44]:
# Merge datasets together
houses = pd.merge(houses, safety, how='left', left_on='Wijknaam_2019K_NAAM', right_on='buurt')

In [45]:
houses.columns

Index(['Unnamed: 0', 'Unnamed: 0.1', 'price', 'post_code', 'neighborhood_x',
       'living_area', 'rooms', 'year', 'offered_since', 'status',
       'monthly_cost_vve', 'type', 'bedrooms', 'bathrooms', 'isolation',
       'energy_label', 'heating', 'parking', 'garage', 'balcony', 'garden',
       'storage', 'location', 'loc', 'PC6', 'Wijknaam_2019K_NAAM',
       'neighborhood_y', 'income_low', 'income_med_low', 'income_med',
       'income_med_high', 'ingome_high', 'total', 'buurt', 'safety_index',
       'safety_perception_index', 'nuisance_index', 'crime_index', 'no_shops',
       'no_companies', 'daily_visitors'],
      dtype='object')

In [46]:
houses.head()

146                Houthavens
405     Westelijk Havengebied
544               IJburg Oost
545               IJburg Oost
546               IJburg Oost
1030               Houthavens
1117               Houthavens
1406               Houthavens
1655    Westelijk Havengebied
1724               Houthavens
1906              IJburg Oost
1911              IJburg Oost
1912              IJburg Oost
1913              IJburg Oost
1935               Houthavens
1976                      NaN
1977                      NaN
1978                      NaN
1979                      NaN
Name: Wijknaam_2019K_NAAM, dtype: object

In [47]:
houses.drop(columns={'Unnamed: 0', 'Unnamed: 0.1', 'location', 'PC6', 'n'}), 

Unnamed: 0.2,Unnamed: 0,Unnamed: 0.1,price,post_code,neighborhood_x,living_area,rooms,year,offered_since,status,...,ingome_high,total,buurt,safety_index,safety_perception_index,nuisance_index,crime_index,no_shops,no_companies,daily_visitors
0,0,0,550000.0,1055NL,Gibraltarbuurt,100,4,1937,30-06-2020,Beschikbaar,...,0.195946,1.0,Landlust,107.58,98.26,129.33,95.15,67,2791,6219
1,1,1,675000.0,1054GB,Vondelparkbuurt Oost,88,4,1950,30-06-2020,Beschikbaar,...,0.357143,1.0,Vondelbuurt,95.98,73.36,106.85,107.73,76,2022,8159
2,2,2,1000000.0,1075EG,Willemsparkbuurt Noord,109,4,1906,30-06-2020,Beschikbaar,...,0.487179,1.0,Willemspark,76.5,72.79,74.59,82.13,31,1263,3225
3,3,3,475000.0,1064DD,Noordoever Sloterplas,107,4,1999,30-06-2020,Beschikbaar,...,0.11194,1.0,Slotermeer Zuidwest,128.64,122.43,138.62,124.88,7,525,3365
4,4,4,485000.0,1016TA,Groenmarktkadebuurt,71,3,1906,30-06-2020,Beschikbaar,...,0.277108,1.0,Jordaan,92.96,71.03,130.83,77.02,270,4004,31695


In [48]:
# Save dataset for next stage
houses.to_csv('final_houses.csv')

Index(['Unnamed: 0', 'Unnamed: 0.1', 'price', 'post_code', 'neighborhood_x',
       'living_area', 'rooms', 'year', 'offered_since', 'status',
       'monthly_cost_vve', 'type', 'bedrooms', 'bathrooms', 'isolation',
       'energy_label', 'heating', 'parking', 'garage', 'balcony', 'garden',
       'storage', 'location', 'loc', 'PC6', 'Wijknaam_2019K_NAAM',
       'neighborhood_y', 'income_low', 'income_med_low', 'income_med',
       'income_med_high', 'ingome_high', 'total', 'buurt', 'safety_index',
       'safety_perception_index', 'nuisance_index', 'crime_index', 'no_shops',
       'no_companies', 'daily_visitors'],
      dtype='object')