The goal of this project is to provide insights into neighborhood amenity accessibility for homebuyers, renters, and city officials in Gouda, Netherlands. This project will generate an interactive map comparing the walkability and bikeability of the city's 51 neighborhoods.

All data was obtained from the "Gouda in Cijfers" database, provided by the municipality of Gouda. The data can be accessed [here](https://gouda.incijfers.nl/mosaic/wijkprofielen/).

In [1]:
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# to show files uploaded to online workbook:
import os 
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

/kaggle/input/gouda-ammenities-by-neighborhood/Ammenities by Neighborhood.csv
/kaggle/input/gouda-population/Gouda_Population.csv


In [2]:
data = pd.read_csv("/kaggle/input/gouda-ammenities-by-neighborhood/Ammenities by Neighborhood.csv", sep=";")
data_pop = pd.read_csv("/kaggle/input/gouda-population/Gouda_Population.csv", sep=";")

# Need to change one index to match Gouda's spatial data set
data.loc[data.Buurten == "Mammoet", "Buurten"] = "De Mammoet"
data_pop.loc[data_pop.Buurten == "Mammoet", "Buurten"] = "De Mammoet"

In [3]:
data = data.set_index("Buurten").join(data_pop.set_index("Buurten"))
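
Because the join above matches rows on neighborhood names, a name spelled differently in the two files would silently produce missing values. Below is a minimal sketch of one way to check for that before joining. The toy frames, the column names `distance` and `density`, and the values are invented for illustration and are not taken from the Gouda data.

```python
import pandas as pd

# Toy stand-ins for the two CSV files; names and values are illustrative only.
amenities = pd.DataFrame({"Buurten": ["De Mammoet", "Raam e.o."], "distance": [0.4, 0.3]})
population = pd.DataFrame({"Buurten": ["Mammoet", "Raam e.o."], "density": [5000, 13000]})

# An outer merge with indicator=True adds a "_merge" column that labels
# each row as "both", "left_only", or "right_only".
check = amenities.merge(population, on="Buurten", how="outer", indicator=True)

# Any name that is not present in both frames needs to be reconciled.
unmatched = check.loc[check["_merge"] != "both", "Buurten"].tolist()
print(unmatched)
```

An empty list would indicate that every neighborhood name appears in both files.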

Unfortunately, the municipality of Gouda uses commas as a decimal separator rather than periods. We need to clean up the data a bit so Python can make sense of it.

In [4]:
def convert_to_float(p):
    """Converts strings to floats, replacing all commas with periods. Will
    skip any non-string objects provided, and any strings that cannot be
    parsed as a number.
    
    >>> convert_to_float("3,2")
    3.2
    
    """
    try:
        return float(p.replace(',', '.'))
    except (ValueError, AttributeError):
        # ValueError: string is not numeric; AttributeError: p is not a string
        return p

data = data.map(convert_to_float)
# Mark unknown values as real missing values rather than the string "NaN"
data.replace("?", float("nan"), inplace=True)
data.head()

Buurten,Afstand tot grote supermarkt|2008,Afstand tot grote supermarkt|2009,Afstand tot grote supermarkt|2010,Afstand tot grote supermarkt|2011,Afstand tot grote supermarkt|2012,Afstand tot grote supermarkt|2013,Afstand tot grote supermarkt|2014,Afstand tot grote supermarkt|2015,Afstand tot grote supermarkt|2016,Afstand tot grote supermarkt|2017,...,Bevolkingsdichtheid|2012,Bevolkingsdichtheid|2013,Bevolkingsdichtheid|2014,Bevolkingsdichtheid|2015,Bevolkingsdichtheid|2016,Bevolkingsdichtheid|2017,Bevolkingsdichtheid|2018,Bevolkingsdichtheid|2019,Bevolkingsdichtheid|2020,Bevolkingsdichtheid|2022
Nieuwe Markt e.o.,0.4,0.4,0.4,0.4,0.3,0.3,0.3,0.3,0.3,0.3,...,7571.0,7571.0,7786.0,7857.0,8429.0,8714.0,8857.0,9571.0,9857.0,10393.0
De Baan e.o.,0.6,0.6,0.6,0.5,0.5,0.5,0.5,0.6,0.6,0.6,...,8441.0,8588.0,8471.0,8412.0,8206.0,8441.0,8647.0,8794.0,8882.0,9059.0
Turfmarkt e.o.,0.3,0.3,0.3,0.3,0.3,0.3,0.4,0.4,0.4,0.4,...,10500.0,10458.0,10583.0,10583.0,10250.0,10417.0,10583.0,10708.0,10625.0,10625.0
Raam e.o.,0.3,0.3,0.2,0.2,0.3,0.2,0.4,0.4,0.4,0.4,...,10188.0,10063.0,10938.0,10938.0,12063.0,12500.0,12469.0,12906.0,13344.0,13438.0
Nieuwe Park Oost,0.3,0.4,0.8,0.8,0.7,0.7,0.7,0.8,0.8,0.8,...,3283.0,3750.0,3891.0,3935.0,3957.0,4022.0,4065.0,4163.0,4413.0,4511.0
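
As an alternative to converting values after loading, `pd.read_csv` accepts `decimal` and `na_values` parameters that handle decimal commas and placeholder strings at read time. The sketch below uses a small made-up sample rather than the actual Gouda files, and the column name `afstand` is hypothetical.

```python
import io
import pandas as pd

# Made-up sample in the same shape as the source files: semicolon-separated,
# decimal commas, and "?" for unknown values.
sample = io.StringIO("Buurten;afstand\nA;0,4\nB;?\nC;1,25\n")

parsed = pd.read_csv(
    sample,
    sep=";",
    decimal=",",       # treat "," as the decimal separator
    na_values="?",     # treat "?" as a missing value
    index_col="Buurten",
)
print(parsed.dtypes)
```

With this approach the numeric columns arrive as floats directly, so a separate conversion step may not be needed.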


Let's focus on the most recent data (2022) related to walking distances. Let's also keep only neighborhoods with a population density of at least 300, since neighborhoods below that threshold are very rural and are not representative of Gouda as a city.

In [5]:
data_2022 = data.filter(like='2022', axis=1).rename(columns={"Bevolkingsdichtheid|2022": "pop_density"})
data_2022_distances = data_2022.loc[data_2022.pop_density >= 300].filter(like="Afstand", axis=1).rename(columns={"Afstand tot grote supermarkt|2022":"supermarket", "Afstand tot overige dagelijkse levensmiddelen|2022":"other_groceries", "Afstand tot huisartsenpraktijk|2022":"doctors_office", "Afstand tot apotheek|2022":"pharmacy", "Afstand tot school basisonderwijs|2022":"primary_school", "Afstand tot school voortgezet onderwijs totaal|2022":"secondary_school", "Afstand tot bibliotheek|2022":"library", "Afstand tot bioscoop|2022":"cinema"})
data_2022_distances.head()

Buurten,supermarket,other_groceries,doctors_office,pharmacy,primary_school,secondary_school,library,cinema
Nieuwe Markt e.o.,0.9,0.1,0.7,0.4,0.6,1.5,0.8,1.7
De Baan e.o.,0.8,0.1,0.9,0.7,0.3,1.6,0.9,1.5
Turfmarkt e.o.,0.5,0.2,0.4,1.4,1.2,1.1,1.8,1.1
Raam e.o.,0.4,0.2,0.7,1.3,0.7,1.2,1.5,1.2
Nieuwe Park Oost,0.6,0.6,0.4,1.1,1.2,0.5,1.4,1.1


We'll develop the walkability index on a 100-point scale, where 100 is a very walkable neighborhood and 0 is a completely unwalkable neighborhood.

People walk, on average, at a speed of roughly 0.06 km per minute. Let's say a distance of 0.3 km or less (a walk of up to 5 minutes) is very walkable and equivalent to 100 in our walkability index. Anything over 1.8 km (more than a 30-minute walk) is considered not walkable and given a 0. Between 0.3 km and 1.8 km the score decreases linearly with distance, so a distance of 0.9 km (a 15-minute walk) scores about 60.

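Writing the score as A - B*d for a distance d in km, the two boundary conditions (a score of 0 at 1.8 km and a score of 100 at 0.3 km) give:

0 = A - 1.8B, so A = 1.8B
100 = A - 0.3B = 1.5B
B = 66.667
A = 120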
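
The two boundary conditions above can be solved for any pair of cutoffs. The helper below is a hypothetical illustration (the name `linear_coefficients` and its parameters are not part of the notebook) that returns the intercept and slope for a score that falls linearly from a top score to 0.

```python
def linear_coefficients(full_km, zero_km, top_score=100):
    """Return (A, B) such that score = A - B * distance,
    with score == top_score at full_km and score == 0 at zero_km."""
    slope = top_score / (zero_km - full_km)   # B: points lost per km
    intercept = slope * zero_km               # A: from 0 = A - B * zero_km
    return intercept, slope


# Walking cutoffs from the text: full score at 0.3 km, zero at 1.8 km.
A, B = linear_coefficients(full_km=0.3, zero_km=1.8)
print(round(A, 3), round(B, 3))  # prints: 120.0 66.667
```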

In [6]:
def rate_walkability(value):
    if value <= 0.3:
        return 100
    elif value <= 1.8:
        return int(120 - (66.666667 * value))
    else:
        return 0
walkability_rankings = data_2022_distances.map(rate_walkability)
walkability_rankings.head()

Buurten,supermarket,other_groceries,doctors_office,pharmacy,primary_school,secondary_school,library,cinema
Nieuwe Markt e.o.,59,100,73,93,79,19,66,6
De Baan e.o.,66,100,59,73,100,13,59,19
Turfmarkt e.o.,86,100,93,26,39,46,0,46
Raam e.o.,93,100,73,33,73,39,19,39
Nieuwe Park Oost,79,79,93,46,39,86,26,46


We should assign weights to the different amenities covered in this table, as the walkability of some amenities is more important to residents than others. Based on the core principles of the [15-minute city concept](https://www.c40knowledgehub.org/s/article/How-to-build-back-better-with-a-15-minute-city?language=en_US#:~:text=A%20successful%2015%2Dminute%20neighbourhood,recreation%2C%20working%20spaces%20and%20more.), groceries and medical care are the most important amenities for urban residents to have access to. So, let's give "supermarket" and "doctors_office" the top weight. We'll give "other_groceries" and "pharmacy" the second-largest weighting, along with "primary_school". That leaves "secondary_school", "library", and "cinema" with the lowest weights for the walkability index.

In [7]:
ammenity_weights = {"supermarket":3, "doctors_office":3, "other_groceries":2, "pharmacy":2, "primary_school":2, "secondary_school":1, "library":1, "cinema":1}
index_denominator = sum(ammenity_weights.values())

Applying these weights to the walkability rankings:

In [8]:
def index_weigh(df, weights):
    weighed_dataframe = pd.DataFrame(index = df.index)
    for key in weights:
        result = df.loc[:,key].map(lambda p: p*weights[key])
        weighed_dataframe = weighed_dataframe.join(result)
    return weighed_dataframe
        
weighed_walkability_rankings = index_weigh(walkability_rankings, ammenity_weights)
weighed_walkability_rankings.head()

Buurten,supermarket,doctors_office,other_groceries,pharmacy,primary_school,secondary_school,library,cinema
Nieuwe Markt e.o.,177,219,200,186,158,19,66,6
De Baan e.o.,198,177,200,146,200,13,59,19
Turfmarkt e.o.,258,279,200,52,78,46,0,46
Raam e.o.,279,219,200,66,146,39,19,39
Nieuwe Park Oost,237,279,158,92,78,86,26,46


Now that we have the weighted values, all we need to do is sum up each row and divide it by the index denominator (the sum of the weights).

In [9]:
walkability_index_result = weighed_walkability_rankings.apply(lambda p: int(sum(p) / index_denominator), axis="columns")
print(walkability_index_result)

Buurten
Nieuwe Markt e.o.                   68
De Baan e.o.                        67
Turfmarkt e.o.                      63
Raam e.o.                           67
Nieuwe Park Oost                    66
De Korte Akkeren Oud                76
De Korte Akkeren Nieuw              70
Industrieterrein Kromme Gouwe       32
Weidebloemkwartier                  47
Boerhaavekwartier                   73
Windrooskwartier en Heesterbuurt    78
Groenhovenkwartier                  65
Bloemendaalseweg                    68
De Goudse Poort                     59
Gaardenbuurt                        54
Hoef- en Veldbuurt                  78
Zomenbuurt                          68
Hoevenbuurt                         60
Lusten-, Burgen- en Steinenbuurt    67
Grassen- Waterbuurt                 46
Bodegraafsestraatweg                13
Wervenbuurt                         46
Ouwe Gouwe                          77
Statensingel                        87
Wethouder Venteweg                  55
Achterwillenseweg
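
The weighting and averaging steps above can also be written as a single vectorized expression. This is a sketch of an equivalent approach, not the notebook's own method. It uses the "Nieuwe Markt e.o." scores from the walkability table above and the same weights; the variable names are illustrative.

```python
import pandas as pd

# Walkability scores for one neighborhood, taken from the table above.
scores = pd.DataFrame(
    {"supermarket": [59], "other_groceries": [100], "doctors_office": [73],
     "pharmacy": [93], "primary_school": [79], "secondary_school": [19],
     "library": [66], "cinema": [6]},
    index=pd.Index(["Nieuwe Markt e.o."], name="Buurten"),
)
weights = pd.Series(
    {"supermarket": 3, "doctors_office": 3, "other_groceries": 2, "pharmacy": 2,
     "primary_school": 2, "secondary_school": 1, "library": 1, "cinema": 1}
)

# DataFrame.mul aligns the Series with the columns by label,
# so the column order of the two objects does not need to match.
weighted_mean = scores.mul(weights, axis="columns").sum(axis="columns") / weights.sum()

# astype(int) truncates toward zero, like int() in the cell above.
index_values = weighted_mean.astype(int)
print(index_values)
```

For this row the weighted sum is 1031 and the weights add up to 15, which gives the same value of 68 as the notebook's result.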

We'll construct our bikeability index the same way, except we'll change the cutoff points for our rankings. If we assume the average speed of an urban cyclist to be 15 km/h, then one can cover 1.25 km in five minutes, so distances of 1.25 km or less will receive a bikeability index score of 100. Conversely, any trip above 7.5 km would take more than roughly half an hour by bike, so distances above 7.5 km will receive a bikeability index score of 0. Scores for distances in between decrease linearly from 100 to 0.

In [10]:
def rate_bikeability(value):
    if value <= 1.25:
        return 100
    elif value <= 7.5:
        return int(120 - (16 * value))
    else:
        return 0
bikeability_rankings = data_2022_distances.map(rate_bikeability)
bikeability_rankings.head()

Buurten,supermarket,other_groceries,doctors_office,pharmacy,primary_school,secondary_school,library,cinema
Nieuwe Markt e.o.,100,100,100,100,100,96,100,92
De Baan e.o.,100,100,100,100,100,94,100,96
Turfmarkt e.o.,100,100,100,97,100,100,91,100
Raam e.o.,100,100,100,99,100,100,96,100
Nieuwe Park Oost,100,100,100,100,100,100,97,100
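
The distance cutoffs for both indices follow from speed multiplied by travel time. The short sketch below reproduces that arithmetic for the 5-minute and 30-minute thresholds. The function and constant names are hypothetical, and the speeds are the assumptions stated in the text (0.06 km per minute on foot, 15 km/h by bike).

```python
def reachable_km(speed_kmh, minutes):
    """Distance in km covered at a constant speed in the given number of minutes."""
    return speed_kmh * minutes / 60


WALK_KMH = 3.6   # 0.06 km per minute, as assumed for walking
BIKE_KMH = 15.0  # assumed average urban cycling speed

for label, speed in [("walk", WALK_KMH), ("bike", BIKE_KMH)]:
    full_score_km = reachable_km(speed, 5)    # at or below this: score 100
    zero_score_km = reachable_km(speed, 30)   # above this: score 0
    print(label, round(full_score_km, 2), round(zero_score_km, 2))
```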


In [11]:
weighed_bikeability_rankings = index_weigh(bikeability_rankings, ammenity_weights)
weighed_bikeability_rankings.head()

Buurten,supermarket,doctors_office,other_groceries,pharmacy,primary_school,secondary_school,library,cinema
Nieuwe Markt e.o.,300,300,200,200,200,96,100,92
De Baan e.o.,300,300,200,200,200,94,100,96
Turfmarkt e.o.,300,300,200,194,200,100,91,100
Raam e.o.,300,300,200,198,200,100,96,100
Nieuwe Park Oost,300,300,200,200,200,100,97,100


In [12]:
bikeability_index_result = weighed_bikeability_rankings.apply(lambda p: int(sum(p) / index_denominator), axis="columns")
print(bikeability_index_result)

Buurten
Nieuwe Markt e.o.                    99
De Baan e.o.                         99
Turfmarkt e.o.                       99
Raam e.o.                            99
Nieuwe Park Oost                     99
De Korte Akkeren Oud                 99
De Korte Akkeren Nieuw               98
Industrieterrein Kromme Gouwe        97
Weidebloemkwartier                   97
Boerhaavekwartier                    99
Windrooskwartier en Heesterbuurt     99
Groenhovenkwartier                   98
Bloemendaalseweg                     98
De Goudse Poort                      98
Gaardenbuurt                         99
Hoef- en Veldbuurt                   98
Zomenbuurt                           98
Hoevenbuurt                          97
Lusten-, Burgen- en Steinenbuurt     98
Grassen- Waterbuurt                  98
Bodegraafsestraatweg                 93
Wervenbuurt                          97
Ouwe Gouwe                           99
Statensingel                        100
Wethouder Venteweg              

In [13]:
indices_final = pd.concat([walkability_index_result, bikeability_index_result], axis=1, keys=["Walkability Index","Bikeability Index"])
print(indices_final)

                                  Walkability Index  Bikeability Index
Buurten                                                               
Nieuwe Markt e.o.                                68                 99
De Baan e.o.                                     67                 99
Turfmarkt e.o.                                   63                 99
Raam e.o.                                        67                 99
Nieuwe Park Oost                                 66                 99
De Korte Akkeren Oud                             76                 99
De Korte Akkeren Nieuw                           70                 98
Industrieterrein Kromme Gouwe                    32                 97
Weidebloemkwartier                               47                 97
Boerhaavekwartier                                73                 99
Windrooskwartier en Heesterbuurt                 78                 99
Groenhovenkwartier                               65                 98
Bloeme

In [14]:
indices_final.to_csv("Gouda_Indices_By_Neighborhood.csv")