# Capstone Project - Amsterdam Pizzeria

### Introduction
Imagine that you want to open a new pizzeria in Amsterdam. The big question is: What is a good location?
This notebook is trying to answer this question by examining the amount of pizzerias and population per neighborhood.

### Data
Three different data sources will be used in this project:
- The geodata about the neighborhoods is retrieved from https://data.overheid.nl/dataset/mea3qdtnvln9ca. This data is a so called shapefile. To read and modify this data, the geopandas package is used.
- The population data about the neighborhoods is retrieved from https://data.amsterdam.nl/datasets/DMknRs8hEH-CtA/bevolking-wijken/. The most recent file is used, 2020.
- The amount of pizzerias in Amsterdam is retrieved by using the Foursquari API. Please note that only the venues with category Pizzeria are retrieved by setting this condition in the query

### Analysis
At first, we import all the required packages.

In [1]:
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd
import folium
import requests
import zipfile
import shapely
from pandas.io.json import json_normalize
import json
import numpy as np
import folium.plugins as plugins

In [2]:
# Set working directory (delete cell before upload)
%cd "C:\Users\nick_\.jupyter\WorkingDirectory\Applied Data Science Capstone"

C:\Users\nick_\.jupyter\WorkingDirectory\Applied Data Science Capstone


We start with downloading the geodata of the neighborhoods of Amsterdam.

In [3]:
# Define the url
url = 'https://e85bcf2124fb4437b1bc6eb75dfc3abf.objectstore.eu/dcatd/6fa5809fafea43d09894ac6b1818c85a'
# Download and save data
r = requests.get(url, allow_redirects=True)
open('geodata.zip', 'wb').write(r.content)

53395

After this, the files have to be unzipped:

In [4]:
# Unzip all files in working directory
with zipfile.ZipFile("geodata.zip","r") as zip_ref:
    zip_ref.extractall()

It is now possible to read the data using geopandas:

In [5]:
geodata = gpd.read_file("woonbrt10_region.shp").to_crs(4326)

We can visualize the data in a folium map:

In [6]:
# Convert to json file
gjson = geodata.to_json()

# Create map
map_ams = folium.Map([52.3702157, 4.8951679],
                  zoom_start=12,
                  tiles='cartodbpositron')
                  

# Add neighborhoods
points = folium.features.GeoJson(gjson)

map_ams.add_child(points)
map_ams

For data wrangling and merging purposes, the data is converted to a normal dataframe:

In [7]:
geo_df = pd.DataFrame(geodata)
geo_df.shape

(386, 11)

As you can see, 386 different areas are defined. Below a snapshot of the data:

In [8]:
geo_df.head()

Unnamed: 0,BUURT,BC,SD,BCNAAM,SDBRT,SD10,BRTK2010,XZWAAR,YZWAAR,AFSTDAM,geometry
0,60a,60,N,Van der Pekbuurt,N60a,N,N60a,122402,488916,0,"POLYGON ((4.90208 52.38304, 4.90611 52.38484, ..."
1,60b,60,N,Bloemenbuurt Zuid,N60b,N,N60b,122869,489764,0,"POLYGON ((4.90918 52.39982, 4.91048 52.39996, ..."
2,60c,60,N,Bloemenbuurt Noord,N60c,N,N60c,122881,490105,0,"POLYGON ((4.90918 52.39982, 4.90831 52.40117, ..."
3,61a,61,N,IJplein e.o.,N61a,N,N61a,122558,488400,0,"POLYGON ((4.90784 52.38172, 4.91013 52.38339, ..."
4,61b,61,N,Vogelbuurt Zuid,N61b,N,N61b,122892,488817,0,"POLYGON ((4.91031 52.38330, 4.91013 52.38339, ..."


Now the geodata is loaded, we move on to the population data. At first, we download and read it:

In [9]:
# Define the url
url = 'https://api.data.amsterdam.nl/dcatd/datasets/DMknRs8hEH-CtA/purls/23'
# Download and save data
r = requests.get(url, allow_redirects=True)
open('pop.xlsx', 'wb').write(r.content)

18652

Read the data in the dataframe:

In [10]:
pop_df = pd.read_excel('pop.xlsx', engine = 'openpyxl', header=None, skiprows=4, names=['Code','Name', '2020', '2025', '2030', '2040', '2050', 'Drop'])
pop_df.head(5)

Unnamed: 0,Code,Name,2020,2025,2030,2040,2050,Drop
0,A00,Burgwallen-Oude Zijde,4465,4479.0,4424.0,4370.0,4328,
1,A01,Burgwallen-Nieuwe Zijde,4134,4259.0,4209.0,4186.0,4177,
2,A02,Grachtengordel-West,6440,6332.0,6220.0,6147.0,6102,
3,A03,Grachtengordel-Zuid,5436,5374.0,5276.0,5201.0,5148,
4,A04,Nieuwmarkt/Lastage,9703,9766.0,9557.0,9411.0,9314,


As shown above, some data wrangling is needed:

In [11]:
# Remove last column
pop_df.drop(['Drop'], axis=1, inplace=True)
# Filter out rows with NaN in first and third column
pop_df = pop_df[pop_df['Code'].notna()]
pop_df = pop_df[pop_df['2020'].notna()]
# Replace - with 0 in column 2020
pop_df['2020'].replace(to_replace='-', value=0, inplace=True)
# Change columns to integer
cols = ['2020','2025','2030','2040','2050']
pop_df[cols] = pop_df[cols].astype(int)
# Remove unneccessary rows
mask = (pop_df['Code'].str.len() == 3) & (pop_df['Code'] != 'ADS')
pop_df = pop_df.loc[mask].reset_index(drop=True)
#show dataset
pop_df.head()

Unnamed: 0,Code,Name,2020,2025,2030,2040,2050
0,A00,Burgwallen-Oude Zijde,4465,4479,4424,4370,4328
1,A01,Burgwallen-Nieuwe Zijde,4134,4259,4209,4186,4177
2,A02,Grachtengordel-West,6440,6332,6220,6147,6102
3,A03,Grachtengordel-Zuid,5436,5374,5276,5201,5148
4,A04,Nieuwmarkt/Lastage,9703,9766,9557,9411,9314


We can now start working on the merge of the both dataframes to get population and geodata in one dataframe. However, there are some problems. Firs, the levels of both dataframes are not the same, so we need to aggregate the frame with geodata on BC level:

In [12]:
# Aggregate one level up to match levels with population data
temp = geodata[['BC','geometry']].dissolve(by='BC')
# Add centroid points
temp['X'] = temp.centroid.x
temp['Y'] = temp.centroid.y
# Replace geo_data frame
geo_df = pd.DataFrame(temp)
# Create index
geo_df = geo_df.reset_index()
geo_df.head()


  temp['X'] = temp.centroid.x

  temp['Y'] = temp.centroid.y


Unnamed: 0,BC,geometry,X,Y
0,0,"POLYGON ((4.89871 52.37098, 4.89549 52.36752, ...",4.897064,52.372613
1,1,"POLYGON ((4.89549 52.36752, 4.89589 52.36744, ...",4.894429,52.374289
2,2,"POLYGON ((4.88810 52.36855, 4.88856 52.36812, ...",4.886786,52.372964
3,3,"POLYGON ((4.90056 52.36564, 4.90060 52.36555, ...",4.893906,52.364363
4,4,"POLYGON ((4.90147 52.36631, 4.90134 52.36635, ...",4.904852,52.371703


We will be using the following map:

In [13]:
# Convert to json file
gjson = temp['geometry'].to_crs(epsg='4326').to_json()

# Create map
map_ams = folium.Map([52.3702157, 4.8951679],
                  zoom_start=12,
                  tiles='cartodbpositron')

# Add neighborhoods
area = folium.features.GeoJson(gjson)

map_ams.add_child(area)
map_ams

The second problem is the fact that the codes of the neighborhood are not equivalent at the moment. To fix this, we need to change the code in the population data:

In [14]:
# Extract the last two characters
pop_df = pop_df.assign(Code=pop_df['Code'].str[1:3])
pop_df.head()

Unnamed: 0,Code,Name,2020,2025,2030,2040,2050
0,0,Burgwallen-Oude Zijde,4465,4479,4424,4370,4328
1,1,Burgwallen-Nieuwe Zijde,4134,4259,4209,4186,4177
2,2,Grachtengordel-West,6440,6332,6220,6147,6102
3,3,Grachtengordel-Zuid,5436,5374,5276,5201,5148
4,4,Nieuwmarkt/Lastage,9703,9766,9557,9411,9314


We can now merge both columns based on BC (geodata) and Code (population). The table with geodata is leading.

In [15]:
# Merge the frames
merge_df = pd.merge(geo_df, pop_df, left_on='BC', right_on='Code')
# Drop the code column
merge_df.drop(['Code'], axis=1, inplace=True)
merge_df.head()

Unnamed: 0,BC,geometry,X,Y,Name,2020,2025,2030,2040,2050
0,0,"POLYGON ((4.89871 52.37098, 4.89549 52.36752, ...",4.897064,52.372613,Burgwallen-Oude Zijde,4465,4479,4424,4370,4328
1,1,"POLYGON ((4.89549 52.36752, 4.89589 52.36744, ...",4.894429,52.374289,Burgwallen-Nieuwe Zijde,4134,4259,4209,4186,4177
2,2,"POLYGON ((4.88810 52.36855, 4.88856 52.36812, ...",4.886786,52.372964,Grachtengordel-West,6440,6332,6220,6147,6102
3,3,"POLYGON ((4.90056 52.36564, 4.90060 52.36555, ...",4.893906,52.364363,Grachtengordel-Zuid,5436,5374,5276,5201,5148
4,4,"POLYGON ((4.90147 52.36631, 4.90134 52.36635, ...",4.904852,52.371703,Nieuwmarkt/Lastage,9703,9766,9557,9411,9314


We show some summary statistics regarding the population:

In [16]:
merge_df[cols].describe()

Unnamed: 0,2020,2025,2030,2040,2050
count,91.0,91.0,91.0,91.0,91.0
mean,9296.901099,9755.604396,10071.912088,10728.692308,11163.67033
std,5571.873208,5818.566328,5962.490286,6435.329907,6935.461238
min,0.0,1121.0,1098.0,1063.0,1047.0
25%,5187.5,5566.5,5489.0,5906.5,5886.5
50%,8537.0,8831.0,8964.0,9387.0,9592.0
75%,12576.0,13829.0,14181.5,15011.0,15360.0
max,29788.0,29999.0,31944.0,31619.0,35390.0


We can visualize the population data by using a choropleth map:

In [17]:
# Create new map
map_pop = folium.Map([52.3702157, 4.8951679],
                     zoom_start=12,
                     tiles='cartodbpositron')

# Generate choropleth map based on population in 2020                  
map_pop.choropleth(
    geo_data=gjson,
    data=merge_df,
    columns=['BC', '2020'],
    key_on='feature.id',
    fill_color='YlOrRd', 
    fill_opacity=0.7, 
    line_opacity=0.2,
    legend_name='Population in 2020'
)

map_pop



We are at the stage that we retrieve the foursquare data. First, let set some variables:

In [18]:
CLIENT_ID = 'NQ3QOIFN1QWSWIMNWKVZ31REVQDC3XYOVLXF1YXTNXAIMCYT' # your Foursquare ID
CLIENT_SECRET = 'D0UZ15XVBHKQVOZU42L1V24BTZ2V2ERUGPQJ2S50FXH1LF2K' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 1000
CATEGORY = '4bf58dd8d48988d1ca941735' #category of pizza place
PLACE = 'Amsterdam'

Set the url we need to use:

In [19]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&near={}&categoryId={}&v={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, PLACE, CATEGORY, VERSION, LIMIT)

Send request and check results:

In [20]:
results = requests.get(url).json()

Transform relevant part into dataframe

In [21]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# tranform venues into a dataframe
pizza = json_normalize(venues)

# Show only places in Amsterdam
pizza = pizza[pizza['location.city'] == 'Amsterdam']
pizza.shape

# clean column names by keeping only last term
pizza.columns = [column.split('.')[-1] for column in pizza.columns]

  pizza = json_normalize(venues)


In [22]:
pizza.shape

(41, 18)

So currently there are only 41 pizza places in Amsterdam. Let's check the dataframe

In [23]:
pizza.head()

Unnamed: 0,id,name,categories,referralId,hasPerk,address,lat,lng,labeledLatLngs,postalCode,cc,city,state,country,formattedAddress,neighborhood,crossStreet,id.1
0,5648c396498edda4852d4c23,La Zoccola del Pacioccone,"[{'id': '4bf58dd8d48988d110941735', 'name': 'I...",v-1612527677,False,Nieuwe Nieuwstraat 22HS,52.375297,4.893965,"[{'label': 'display', 'lat': 52.37529702834186...",1012 NH,NL,Amsterdam,Noord-Holland,Nederland,"[Nieuwe Nieuwstraat 22HS, 1012 NH Amsterdam, N...",,,
1,5967d58b9cadd94f2c04a767,FIKO,"[{'id': '4bf58dd8d48988d110941735', 'name': 'I...",v-1612527677,False,Eerste Constantijn Huygensstraat 60,52.370523,4.868961,"[{'label': 'display', 'lat': 52.370523, 'lng':...",1054 BR,NL,Amsterdam,Noord-Holland,Nederland,"[Eerste Constantijn Huygensstraat 60, 1054 BR ...",Da Costabuurt,,
2,5d7d25f09212480007bae77a,Eatmosfera Oost,"[{'id': '4bf58dd8d48988d1ca941735', 'name': 'P...",v-1612527677,False,,52.363061,4.932581,"[{'label': 'display', 'lat': 52.363061, 'lng':...",1094 JB,NL,Amsterdam,Noord-Holland,Nederland,"[1094 JB Amsterdam, Nederland]",,,
3,51b84b1b498eb3038b03d58c,Kebec Micro Bakery,"[{'id': '4bf58dd8d48988d1ca941735', 'name': 'P...",v-1612527677,False,TT Melaniaweg 12,52.403495,4.892019,"[{'label': 'display', 'lat': 52.40349510715673...",,NL,Amsterdam,Noord-Holland,Nederland,"[TT Melaniaweg 12, Amsterdam, Nederland]",,,
4,5c02846364c8e1002c9dd3c9,Porchetteria,"[{'id': '4bf58dd8d48988d1c5941735', 'name': 'S...",v-1612527677,False,Frans Halsstraat 63H,52.355954,4.888717,"[{'label': 'display', 'lat': 52.355954, 'lng':...",1072 BM,NL,Amsterdam,Noord-Holland,Nederland,"[Frans Halsstraat 63H (Saenredamstraat), 1072 ...",,Saenredamstraat,


Let's see the locations on the map:

In [24]:
for lat, lng, label in zip(pizza.lat, pizza.lng, pizza.name):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(map_pop)
    
map_pop

So there are a lot of areas without a pizzeria. Let's check all the foodplaces now, for further analysis. We will run this for every neighborhood based on the centroid. First, we define the function to do this:

In [25]:
# Set new category variable
CATEGORY = '4d4b7105d754a06374d81259'

# Function to loop
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/search?ll={},{}&categoryId={}&radius={}&client_id={}&client_secret={}&v={}&limit={}'.format(
            lat,
            lng,
            CATEGORY,
            radius,
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            1000)
            
        # make the GET request
        results = requests.get(url).json()["response"]['venues']
        results
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['name'], 
            v['location']['lat'], 
            v['location']['lng'],  
            v['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

Run the function for every area:

In [26]:
ams_venues = getNearbyVenues(names=merge_df['Name'],
                                   latitudes=merge_df['Y'],
                                   longitudes=merge_df['X']
                                  )

Burgwallen-Oude Zijde
Burgwallen-Nieuwe Zijde
Grachtengordel-West
Grachtengordel-Zuid
Nieuwmarkt/Lastage
Haarlemmerbuurt
Jordaan
De Weteringschans
Weesperbuurt/Plantage
Oostelijke Eilanden/Kadijken
Houthavens
Spaarndammer- en Zeeheldenbuurt
Staatsliedenbuurt
Centrale Markt
Frederik Hendrikbuurt
Da Costabuurt
Kinkerbuurt
Van Lennepbuurt
Helmersbuurt
Overtoomse Sluis
Vondelbuurt
Oude Pijp
Nieuwe Pijp
Zuid-Pijp
Weesperzijde
Oosterparkbuurt
Dapperbuurt
Transvaalbuurt
Indische Buurt-West
Indische Buurt-Oost
Oostelijk Havengebied
Zeeburgereiland/Nieuwe Diep
IJburg-West
Sloterdijk
Landlust
Erasmuspark
De Kolenkit
Geuzenbuurt
Van Galenbuurt
Hoofdweg e.o.
Westindische Buurt
Hoofddorppleinbuurt
Schinkelbuurt
Willemspark
Museumkwartier
Stadionbuurt
Apollobuurt
IJburg-Oost
IJburg-Zuid
Scheldebuurt
IJselbuurt
Rijnbuurt
Frankendael
Middenmeer
Betondorp
De Omval/Overamstel
Prinses Irenebuurt e.o.
Volewijck
IJplein/Vogelbuurt
Tuindorp Nieuwendam
Tuindorp Buiksloot
Nieuwendammerdijk/Buiksloterdijk
Tuin

Check the results:

In [27]:
print(ams_venues.shape)
ams_venues.head()

(3472, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Burgwallen-Oude Zijde,52.372613,4.897064,Wok to Go,52.372241,4.894869,Noodle House
1,Burgwallen-Oude Zijde,52.372613,4.897064,Coffeecompany,52.371581,4.896721,Coffee Shop
2,Burgwallen-Oude Zijde,52.372613,4.897064,Vincent Kaas & Vlees,52.372929,4.900333,Deli / Bodega
3,Burgwallen-Oude Zijde,52.372613,4.897064,De Koffieschenkerij,52.374043,4.898427,Coffee Shop
4,Burgwallen-Oude Zijde,52.372613,4.897064,De Bakkerswinkel,52.375047,4.897911,Bakery


Plot the venues in map:

In [28]:
# Create new map
map_merge = folium.Map([52.3702157, 4.8951679],
                     zoom_start=12,
                     tiles='cartodbpositron')

# Generate choropleth map based on population in 2020                  
map_merge.choropleth(
    geo_data=gjson,
    data=merge_df,
    columns=['BC', '2020'],
    key_on='feature.id',
    fill_color='YlOrRd', 
    fill_opacity=0.7, 
    line_opacity=0.2,
    legend_name='Population in 2020'
)

#Plot venues
for lat, lng, label in zip(ams_venues['Venue Latitude'], ams_venues['Venue Longitude'], ams_venues.Venue):
    folium.CircleMarker(
        [lat, lng],
        radius=3,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(map_merge)
    
map_merge



Group the results for all neighboorhood to see the amount of foodplaces per neighborhood:

In [29]:
venues_group = ams_venues[['Neighborhood','Venue']].groupby('Neighborhood').count()
venues_group.reset_index(inplace=True)
venues_group

Unnamed: 0,Neighborhood,Venue
0,Apollobuurt,43
1,Banne Buiksloot,11
2,Betondorp,7
3,"Bijlmer-Centrum (D,F,H)",49
4,"Bijlmer-Oost (E,G,K)",23
...,...,...
85,Westindische Buurt,49
86,Westlandgracht,48
87,Willemspark,50
88,Zeeburgereiland/Nieuwe Diep,11


Let's merge the dataframe with merge_df:

In [30]:
# Merge the frames
final_df = pd.merge(merge_df, venues_group, left_on='Name', right_on='Neighborhood', how='left')
# Drop the code column
final_df.drop(['Neighborhood'], axis=1, inplace=True)
# Fill with 0 if error
final_df['Venue'] = final_df['Venue'].fillna(0)
# Change column venue to int
final_df['Venue'] = final_df['Venue'].astype(int)
final_df.head()

Unnamed: 0,BC,geometry,X,Y,Name,2020,2025,2030,2040,2050,Venue
0,0,"POLYGON ((4.89871 52.37098, 4.89549 52.36752, ...",4.897064,52.372613,Burgwallen-Oude Zijde,4465,4479,4424,4370,4328,50
1,1,"POLYGON ((4.89549 52.36752, 4.89589 52.36744, ...",4.894429,52.374289,Burgwallen-Nieuwe Zijde,4134,4259,4209,4186,4177,50
2,2,"POLYGON ((4.88810 52.36855, 4.88856 52.36812, ...",4.886786,52.372964,Grachtengordel-West,6440,6332,6220,6147,6102,50
3,3,"POLYGON ((4.90056 52.36564, 4.90060 52.36555, ...",4.893906,52.364363,Grachtengordel-Zuid,5436,5374,5276,5201,5148,50
4,4,"POLYGON ((4.90147 52.36631, 4.90134 52.36635, ...",4.904852,52.371703,Nieuwmarkt/Lastage,9703,9766,9557,9411,9314,50


Lets examine the amount of venue's in the neighborhood:

In [31]:
final_df['Venue'].describe()

count    91.000000
mean     38.153846
std      16.400490
min       0.000000
25%      25.500000
50%      48.000000
75%      50.000000
max      50.000000
Name: Venue, dtype: float64

Show in map:

In [32]:
# Create new map
map_venue = folium.Map([52.3702157, 4.8951679],
                     zoom_start=12,
                     tiles='cartodbpositron')

# Generate choropleth map based on population in 2020                  
map_venue.choropleth(
    geo_data=gjson,
    data=final_df,
    columns=['BC', 'Venue'],
    key_on='feature.id',
    fill_color='YlOrRd', 
    fill_opacity=0.7, 
    line_opacity=0.2,
    legend_name='Food places in 2020'
)

map_venue



As you can see, the food places are centered in the centre of Amsterdam. We will now divide the population of the neighborhood by the amount of venues:

In [33]:
# Divide
final_df['Ratio 2020'] = final_df['2020'] / final_df['Venue']
final_df['Ratio 2025'] = final_df['2025'] / final_df['Venue']
final_df['Ratio 2030'] = final_df['2030'] / final_df['Venue']
final_df['Ratio 2040'] = final_df['2040'] / final_df['Venue']
final_df['Ratio 2050'] = final_df['2050'] / final_df['Venue']

# Replace inf with 0
final_df = final_df.replace(np.inf, 0)

# Replace 0 with population of specific year
final_df['Ratio 2020'] = np.where(final_df['Ratio 2020'] == 0, final_df['2020'], final_df['Ratio 2020'])
final_df['Ratio 2025'] = np.where(final_df['Ratio 2025'] == 0, final_df['2025'], final_df['Ratio 2025'])
final_df['Ratio 2030'] = np.where(final_df['Ratio 2030'] == 0, final_df['2030'], final_df['Ratio 2030'])
final_df['Ratio 2040'] = np.where(final_df['Ratio 2040'] == 0, final_df['2040'], final_df['Ratio 2040'])
final_df['Ratio 2050'] = np.where(final_df['Ratio 2050'] == 0, final_df['2050'], final_df['Ratio 2050'])

Check the places where there are not much venues compared to population:

In [34]:
final_df[['Name','Venue', '2020','Ratio 2020']].sort_values(by='Ratio 2020', ascending=False).head()

Unnamed: 0,Name,Venue,2020,Ratio 2020
87,Nellestein,2,3037,1518.5
89,Gein,8,11327,1415.875
67,Banne Buiksloot,11,14781,1343.727273
86,"Bijlmer-Oost (E,G,K)",23,29788,1295.130435
72,Geuzenveld,13,16535,1271.923077


Check the results in general:

In [35]:
final_df['Ratio 2020'].describe()

count      91.000000
mean      361.174931
std       357.542519
min         0.000000
25%       143.950204
50%       236.354167
75%       408.318182
max      1518.500000
Name: Ratio 2020, dtype: float64

So, there is 1 food venue for every 361 people in Amsterdam! Create a map to summarize all the information:

In [36]:
# Create new map
map_final= folium.Map([52.3702157, 4.8951679],
                     zoom_start=12,
                     tiles='cartodbpositron')

# Generate choropleth map based on population in 2020                  
map_final.choropleth(
    geo_data=gjson,
    data=final_df,
    columns=['BC', 'Ratio 2020'],
    key_on='feature.id',
    fill_color='YlOrRd', 
    fill_opacity=0.7, 
    line_opacity=0.2,
    legend_name='Population per food venue in 2020'
)

# Add the pizza places
for lat, lng, label in zip(pizza.lat, pizza.lng, pizza.name):
    folium.CircleMarker(
        [lat, lng],
        radius=3,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(map_final)

# instantiate a mark cluster object for centroids of neighborhood
cluster = plugins.MarkerCluster().add_to(map_final)

# loop through the dataframe and add each data point to the mark cluster
for lat, lng, label, in zip(final_df.Y, final_df.X, final_df.Name):
    folium.Marker(
        location=[lat, lng],
        icon=None,
        popup=label,
    ).add_to(cluster)  

map_final   



Check the ratio between population and venues with expected population in 2025:

In [37]:
final_df[['Name','Venue', '2025','Ratio 2025']].sort_values(by='Ratio 2025', ascending=False).head()

Unnamed: 0,Name,Venue,2025,Ratio 2025
87,Nellestein,2,2863,1431.5
89,Gein,8,11297,1412.125
67,Banne Buiksloot,11,14787,1344.272727
86,"Bijlmer-Oost (E,G,K)",23,29999,1304.304348
72,Geuzenveld,13,16407,1262.076923


Check for expected population in 2025:

In [38]:
# Create new map
map_final= folium.Map([52.3702157, 4.8951679],
                     zoom_start=12,
                     tiles='cartodbpositron')

# Generate choropleth map based on population in 2025                 
map_final.choropleth(
    geo_data=gjson,
    data=final_df,
    columns=['BC', 'Ratio 2025'],
    key_on='feature.id',
    fill_color='YlOrRd', 
    fill_opacity=0.7, 
    line_opacity=0.2,
    legend_name='Population per food venue in 2025'
)

# Add the pizza places
for lat, lng, label in zip(pizza.lat, pizza.lng, pizza.name):
    folium.CircleMarker(
        [lat, lng],
        radius=3,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(map_final)

# instantiate a mark cluster object for centroids of neighborhood
cluster = plugins.MarkerCluster().add_to(map_final)

# loop through the dataframe and add each data point to the mark cluster
for lat, lng, label, in zip(final_df.Y, final_df.X, final_df.Name):
    folium.Marker(
        location=[lat, lng],
        icon=None,
        popup=label,
    ).add_to(cluster)  

map_final

