# The battle of neighbourhoods - Report

## Introduction: the business problem

The client is a big hotel chain that wants to establish itself in the city of Munich. The current subdivision of Munich into neighbourhoods, while historically grounded, fails to mirror the actual internal structure of the city. The client is interested in getting a picture of the predominant activities in the different sectors of the city, so as to understand where to open which kind of hotel: a family resort would ideally be placed in an area with parks rather than in an industrial area, a hostel-style accommodation close to cafes and pubs, and so on.

It is a typical clustering problem, where we care both about the geographical proximity of points (a neighbourhood should be connected) and about some kind of "cultural" proximity, i.e. similar kinds of activities.

## Data

We will use the following modules:

In [38]:
import pandas as pd
import numpy as np
from scipy.spatial import ConvexHull
import folium
from folium.plugins import BeautifyIcon
from pandas import json_normalize  # top-level import; the pandas.io.json path is deprecated since pandas 1.0
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors
import requests

We review the situation as-is, i.e. the historical neighbourhood/postal codes subdivision:

In [2]:
pc_mun = pd.read_csv("postal_codes_munich.csv")
pc_mun.head()

Unnamed: 0,zipcode,Neighbourhood,latitude,longitude
0,80331,Altstadt-Lehel,48.1345,11.571
1,80333,Altstadt-Lehel,48.1452,11.5668
2,80333,Maxvorstadt,48.1452,11.5668
3,80335,Altstadt-Lehel,48.1427,11.5552
4,80335,Ludwigsvorstadt-Isarvorstadt,48.1427,11.5552


It contains 25 different neighbourhoods and 74 different postal codes:

In [393]:
print("Unique neighbourhoods: " + str(len(pc_mun["Neighbourhood"].unique())))
print("Unique postal codes: " + str(len(pc_mun["zipcode"].unique())))

Unique neighbourhoods: 25
Unique postal codes: 74


Unfortunately, postal codes are not a refinement of the neighbourhoods, as the first rows of the dataset already show: 80333 corresponds to both Altstadt-Lehel and Maxvorstadt, two areas which we would expect to see in different clusters.
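The overlap can be checked directly. The following is a minimal sketch that uses only the five rows displayed above as sample data and assumes the same column names as `pc_mun`; the names `sample`, `per_zip` and `ambiguous` are illustrative and not part of the notebook.

```python
import pandas as pd

# Sample rows copied from the head of pc_mun shown above
sample = pd.DataFrame({
    "zipcode": [80331, 80333, 80333, 80335, 80335],
    "Neighbourhood": ["Altstadt-Lehel", "Altstadt-Lehel", "Maxvorstadt",
                      "Altstadt-Lehel", "Ludwigsvorstadt-Isarvorstadt"],
})

# Count the distinct neighbourhoods attached to each postal code
per_zip = sample.groupby("zipcode")["Neighbourhood"].nunique()

# Postal codes that straddle more than one neighbourhood
ambiguous = per_zip[per_zip > 1].index.tolist()
print(ambiguous)
```

Run against the full `pc_mun` dataframe instead of `sample`, the same two lines would list every postal code that cannot be assigned to a single neighbourhood.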

So what do we do? We build a grid over Munich and then cluster the points of the grid, that's what we do:

In [394]:
latitudes = np.linspace(start = 48.09, stop = 48.20, num = 20)
longitudes = np.linspace(start = 11.48, stop = 11.65, num = 20)

grid = pd.DataFrame(index = pd.MultiIndex.from_product([latitudes, longitudes], names = ["Latitude", "Longitude"])).reset_index()
grid["Neighbourhood"] = grid.index
grid.head()

Unnamed: 0,Latitude,Longitude,Neighbourhood
0,48.09,11.48,0
1,48.09,11.488947,1
2,48.09,11.497895,2
3,48.09,11.506842,3
4,48.09,11.515789,4


This is the grid on a map:

In [395]:
map_munich = folium.Map(location=[48.1351, 11.5620], zoom_start=12)

for lat, lng in zip(grid['Latitude'], grid['Longitude']):
    label = folium.Popup(str(lat) + " " + str(lng), parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        popup = label,
        radius=5,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_munich)  
    
map_munich

Neat, right? Now let's do the clustering.

## Methodology

We set up the clustering algorithm and let it run, just like in the previous course assignments. Around every element of the grid we take a circle whose radius is about 75% of the distance between neighbouring points (the 500 m default radius of getNearbyVenues below), so that we cover the whole map, at the cost of counting some venues twice.
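As a rough check of the 75% figure, the grid spacing can be converted to metres. This is a sketch under stated assumptions: it uses the haversine formula with a mean Earth radius of 6,371 km, the grid bounds and the 20 points per axis from the grid cell above, and the 500 m default radius of `getNearbyVenues`. The helper `haversine_m` is hypothetical and not part of the notebook.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))

# 20 points per axis means 19 intervals between the bounds of the grid
lat_step = (48.20 - 48.09) / 19
lon_step = (11.65 - 11.48) / 19
mid_lat = (48.09 + 48.20) / 2

north_south_m = haversine_m(48.09, 11.48, 48.09 + lat_step, 11.48)
east_west_m = haversine_m(mid_lat, 11.48, mid_lat, 11.48 + lon_step)

search_radius_m = 500  # default radius of getNearbyVenues
print(round(north_south_m), round(east_west_m))
print(round(search_radius_m / north_south_m, 2), round(search_radius_m / east_west_m, 2))
```

Neighbouring circles on a square grid cover the plane without gaps once the radius exceeds about 71% of the spacing (half the diagonal of a cell), which is why a ratio around 75% is enough.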

Setup:

In [285]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_CLIENT_ID
CLIENT_SECRET:YOUR_CLIENT_SECRET


Four useful functions:

In [9]:
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [283]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
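The list comprehension in `getNearbyVenues` indexes `categories[0]` directly, so it would raise an `IndexError` if a venue came back without any category. Below is a minimal sketch of a more tolerant extraction step. The helper `extract_venues` and the hand-written `sample_items` are hypothetical; the sample mirrors only the fields that the function above reads, and its second venue is made up to exercise the empty-category case.

```python
def extract_venues(name, lat, lng, items):
    """Flatten Foursquare-style items into tuples, tolerating missing categories."""
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # Fall back to None instead of failing when no category is attached
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng,
                     venue["name"],
                     venue["location"]["lat"],
                     venue["location"]["lng"],
                     category))
    return rows

# Hand-written sample: first venue taken from the output table in this report,
# second venue invented to show the empty-category case
sample_items = [
    {"venue": {"name": "REWE",
               "location": {"lat": 48.089149, "lng": 11.480729},
               "categories": [{"name": "Supermarket"}]}},
    {"venue": {"name": "Venue without category",
               "location": {"lat": 48.09, "lng": 11.48},
               "categories": []}},
]

rows = extract_venues(0, 48.09, 11.48, sample_items)
print(rows)
```

Rows with a `None` category would then have to be dropped or relabelled before the one-hot encoding step.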

In [11]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[3:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [373]:
def find_boundary_points(df):
    
    grouped = df.groupby("Cluster Labels")
    
    vertices = []
    names = []
    for name, group in grouped:
        if len(group[["Latitude", "Longitude"]]) > 3: 
            vertices.append(group.iloc[ConvexHull(group[["Latitude", "Longitude"]], qhull_options='QJ').vertices,:][["Latitude", "Longitude"]])
        else:
            vertices.append(group[["Latitude", "Longitude"]])
        names.append(name)
    
    return [names, vertices]

We can load the activities for each grid point from Foursquare:

In [286]:
munich_venues = getNearbyVenues(names=grid['Neighbourhood'],
                                latitudes=grid['Latitude'],
                                longitudes=grid['Longitude'])

0
1
2
...

The dataframe we obtain looks like this:

In [287]:
munich_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,0,48.09,11.48,Rossmann,48.087505,11.484407,Drugstore
1,0,48.09,11.48,Schweizer Platz,48.088626,11.479916,Plaza
2,0,48.09,11.48,REWE,48.089149,11.480729,Supermarket
3,0,48.09,11.48,Ratschiller's,48.089207,11.480467,Bakery
4,0,48.09,11.48,Wochenmarkt am Schweizer Platz,48.089108,11.480147,Farmers Market


In [288]:
print('There are {} unique categories.'.format(len(munich_venues['Venue Category'].unique())))

There are 320 unique categories.


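Now we need to prepare the data for the clustering. This entails one-hot encoding of the various venue types, but also rescaling latitude and longitude to the [0, 1] interval. Note that the strict inequalities in the filter below drop the outermost ring of grid points, which is why the grouped table starts at grid point 21.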

In [290]:
# one hot encoding
munich_onehot = pd.get_dummies(munich_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
munich_onehot['Neighborhood'] = munich_venues['Neighborhood'] 

# process latitude and longitude
munich_onehot['Latitude'] = munich_venues['Neighborhood Latitude']
munich_onehot['Longitude'] = munich_venues['Neighborhood Longitude']
munich_onehot = munich_onehot.loc[lambda df : (df['Latitude'] > 48.09) & (df['Latitude'] < 48.20) & (df['Longitude'] > 11.48) & (df['Longitude'] < 11.65), :]
munich_onehot['Latitude'] = (munich_onehot['Latitude'] - munich_onehot['Latitude'].min())/(munich_onehot['Latitude'].max() - munich_onehot['Latitude'].min())
munich_onehot['Longitude'] = (munich_onehot['Longitude'] - munich_onehot['Longitude'].min())/(munich_onehot['Longitude'].max() - munich_onehot['Longitude'].min())

# move the latitude and longitude columns to the front
fixed_columns = [munich_onehot.columns[-2]] + [munich_onehot.columns[-1]]+ list(munich_onehot.columns[:-2])
munich_onehot = munich_onehot[fixed_columns]

# group
munich_grouped = munich_onehot.groupby('Neighborhood').mean().reset_index()
munich_grouped

Unnamed: 0,Neighborhood,Latitude,Longitude,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,American Restaurant,Aquarium,Arcade,...,Vietnamese Restaurant,Volleyball Court,Water Park,Waterfall,Wine Bar,Wine Shop,Xinjiang Restaurant,Yoga Studio,Zoo,Zoo Exhibit
0,21,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0
1,22,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0
2,23,0.0,0.117647,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0
3,24,0.0,0.176471,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0
4,25,0.0,0.235294,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
317,374,1.0,0.764706,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0
318,375,1.0,0.823529,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0
319,376,1.0,0.882353,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0
320,377,1.0,0.941176,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0
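The preparation step can be reproduced on a toy table to see what each number means. This is a minimal sketch with made-up venues for two hypothetical grid points, not data from the Foursquare query; it shows that the grouped mean of the one-hot columns is the share of each category among the venues of a grid point, and that min-max scaling maps the coordinates to [0, 1].

```python
import pandas as pd

# Made-up venues for two hypothetical grid points
toy = pd.DataFrame({
    "Neighborhood": [0, 0, 1, 1, 1, 1],
    "Latitude": [48.10, 48.10, 48.12, 48.12, 48.12, 48.12],
    "Venue Category": ["Bakery", "Bar", "Bakery", "Bakery", "Park", "Hotel"],
})

# One-hot encode the categories, then average per grid point:
# each value is the fraction of that grid point's venues in the category
onehot = pd.get_dummies(toy["Venue Category"], dtype=float)
onehot["Neighborhood"] = toy["Neighborhood"]
freq = onehot.groupby("Neighborhood").mean()

# Min-max rescaling of the grid latitude to the [0, 1] interval
lat = toy.groupby("Neighborhood")["Latitude"].first()
lat_scaled = (lat - lat.min()) / (lat.max() - lat.min())

print(freq)
print(lat_scaled)
```

Each row of `freq` sums to 1, so a category value such as 0.0625 in the table above means that one venue in sixteen around that grid point belongs to the category.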


And now for the clustering - we drop the Neighbourhood ID from the dataset, since we don't want the algorithm to use it:

In [366]:
# set number of clusters
kclusters = 28

munich_grouped_clustering = munich_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(munich_grouped_clustering)

# check the first cluster labels
kmeans.labels_[:12]

array([12, 12, 12, 12, 12, 12, 12, 10, 10,  7,  7,  7], dtype=int32)
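K-means compares rows by Euclidean distance over all columns, so the rescaled coordinates and the category frequencies end up in the same sum. The sketch below separates the two contributions for three made-up feature rows (not taken from the dataset). The `geo_weight` parameter is a hypothetical knob that the notebook does not use; the notebook corresponds to `geo_weight=1.0`.

```python
import numpy as np

# Made-up feature rows: [Latitude, Longitude, freq_bar, freq_hotel, freq_park]
a = np.array([0.0, 0.0, 0.5, 0.5, 0.0])
b = np.array([1.0, 0.0, 0.5, 0.5, 0.0])  # same venue mix as a, opposite edge of the grid
c = np.array([0.0, 0.0, 0.0, 0.0, 1.0])  # same position as a, different venue mix

def squared_distance(p, q, geo_weight=1.0):
    """Squared Euclidean distance, with an optional weight on the coordinate part."""
    geo = geo_weight * np.sum((p[:2] - q[:2]) ** 2)
    venues = np.sum((p[2:] - q[2:]) ** 2)
    return float(geo + venues)

d_ab = squared_distance(a, b)                        # purely geographic difference
d_ac = squared_distance(a, c)                        # purely venue-mix difference
d_ab_damped = squared_distance(a, b, geo_weight=0.25)

print(d_ab, d_ac, d_ab_damped)
```

With several hundred sparse category columns the venue part of the distance is typically small, so the coordinates carry a lot of weight; whether to damp them is a modelling choice that this report leaves at the default.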

Now we have the clusters and we can merge everything back in the original dataframe. We add columns with the most common venues per grid point, which will help us analyse the results later:

In [367]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = munich_grouped['Neighborhood']

for ind in np.arange(munich_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(munich_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

munich_merged = grid

munich_merged = munich_merged.merge(neighborhoods_venues_sorted.set_index('Neighborhood'), left_on='Neighbourhood', right_on = "Neighborhood")

munich_merged.head() # check the last columns!

Unnamed: 0,Latitude,Longitude,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,48.095789,11.488947,21,12,German Restaurant,Playground,Italian Restaurant,Castle,Accessories Store,Outdoor Sculpture,Organic Grocery,Optical Shop,Opera House,Office
1,48.095789,11.497895,22,12,Supermarket,Asian Restaurant,Bowling Alley,Bakery,Spa,Massage Studio,Drugstore,Metro Station,Nightclub,Optical Shop
2,48.095789,11.506842,23,12,Gym / Fitness Center,Hotel,Furniture / Home Store,Pet Store,Modern European Restaurant,Supermarket,Organic Grocery,Drugstore,Asian Restaurant,Electronics Store
3,48.095789,11.515789,24,12,Hotel,Bank,Café,Supermarket,Pet Store,Drugstore,Pizza Place,Construction & Landscaping,Greek Restaurant,Ice Cream Shop
4,48.095789,11.524737,25,12,Bakery,Gym / Fitness Center,Greek Restaurant,Supermarket,Doner Restaurant,Rental Car Location,Café,Noodle House,Organic Grocery,Optical Shop
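The ranking step can be illustrated on a single made-up row. This is a sketch, not the notebook's code: `ordinal` is a hypothetical helper that replaces the try/except on the `indicators` list, and the frequencies are invented. As in `return_most_common_venues`, the first three entries (ID and coordinates) are skipped before sorting.

```python
import pandas as pd

def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1st, 2nd, 3rd, 4th, 11th."""
    if 11 <= n % 100 <= 13:
        return "{}th".format(n)
    return "{}{}".format(n, {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th"))

# Made-up grouped row: ID, rescaled coordinates, then category frequencies
row = pd.Series({"Neighborhood": 21, "Latitude": 0.0, "Longitude": 0.0,
                 "Bakery": 0.1, "Bar": 0.4, "Hotel": 0.3, "Park": 0.2})

num_top_venues = 3
top = row.iloc[3:].sort_values(ascending=False).index[:num_top_venues]

# Pair each rank label with the corresponding category name
labels = {"{} Most Common Venue".format(ordinal(i + 1)): cat for i, cat in enumerate(top)}
print(labels)
```

Categories with a frequency of zero still receive a rank when a grid point has fewer distinct categories than `num_top_venues`, which explains the alphabetical-looking tails in some rows of the table above.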


## Results

Here's the number of points per cluster:

In [404]:
munich_merged.groupby("Cluster Labels").agg({"Cluster Labels":"count"})

Unnamed: 0_level_0,Cluster Labels
Cluster Labels,Unnamed: 1_level_1
0,18
1,12
2,14
3,3
4,12
5,25
6,17
7,22
8,19
9,5


Of course, it is much better to visualize this on a map. To construct the map we build the convex hull of the grid points of each cluster. Since latitude and longitude went into the clustering algorithm, compact bubbles are bound to appear.

In [396]:
# create map
map_clusters= folium.Map(location=[48.1001, 11.5620], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add one polygon per cluster to the map
names, boundaries = find_boundary_points(munich_merged)
for i, (cluster_name, points) in enumerate(zip(names, boundaries)):
    label = folium.Popup('Cluster ' + str(cluster_name), parse_html=True)
    folium.Polygon(points,
                   color="black",
                   popup = label,
                   fill=True,
                   fill_color=rainbow[i],
                   fill_opacity=0.2).add_to(map_clusters)
       
map_clusters
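The polygons come from `find_boundary_points`, which relies on scipy's `ConvexHull`. To show what a hull keeps and what it discards, here is a small pure-Python sketch based on Andrew's monotone chain algorithm. It is a swapped-in illustration, not the Qhull routine that scipy uses, and the sample points are made up.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return the hull vertices in counterclockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last vertex while it would create a clockwise or straight turn
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one
    return lower[:-1] + upper[:-1]

# Made-up cluster: four corner points and one interior point
cluster_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
hull = convex_hull(cluster_points)
print(hull)
```

Only the corner points survive, in an order that can be drawn as a closed polygon. A cluster whose points are not contiguous on the grid would still get a single hull, which may then overlap the hulls of neighbouring clusters on the map.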

## Discussion

The city center is divided into four parts, corresponding to clusters 0, 8, 20 and 6 - let's look at them one by one as an example.

### Cluster 0 - Maxvorstadt and Schwabing West

These are the points in the cluster:

In [397]:
munich_merged.loc[munich_merged['Cluster Labels'] == 0, munich_merged.columns[list(range(2, munich_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
153,190,0,Restaurant,Plaza,Café,German Restaurant,Church,Tram Station,Botanical Garden,Nightclub,Shopping Mall,Middle Eastern Restaurant
154,191,0,Café,Plaza,Boutique,French Restaurant,Cocktail Bar,Clothing Store,Bar,Hotel,Restaurant,Italian Restaurant
170,209,0,Café,Asian Restaurant,Middle Eastern Restaurant,Theater,History Museum,Restaurant,Salad Place,Coffee Shop,Sushi Restaurant,Steakhouse
171,210,0,Café,History Museum,Art Museum,Italian Restaurant,Japanese Restaurant,Plaza,Sushi Restaurant,Peruvian Restaurant,Event Space,Field
172,211,0,Café,Italian Restaurant,Ice Cream Shop,Breakfast Spot,Bar,Cocktail Bar,Bakery,Restaurant,Burger Joint,Plaza
173,212,0,Café,Italian Restaurant,Ice Cream Shop,Surf Spot,River,Eastern European Restaurant,Frozen Yogurt Shop,Snack Place,Nightclub,Beer Garden
188,229,0,Café,Asian Restaurant,Steakhouse,Bar,Bakery,German Restaurant,Falafel Restaurant,Ramen Restaurant,Doner Restaurant,Bookstore
189,230,0,Café,Bar,Bakery,Italian Restaurant,Mediterranean Restaurant,Vietnamese Restaurant,Spanish Restaurant,Gastropub,Cocktail Bar,French Restaurant
190,231,0,Bar,Café,Italian Restaurant,Ice Cream Shop,Asian Restaurant,German Restaurant,Restaurant,Steakhouse,Burger Joint,Breakfast Spot
191,232,0,Ice Cream Shop,Irish Pub,Beer Garden,Café,Restaurant,Optical Shop,Bagel Shop,Steakhouse,Bar,Monument / Landmark


This is where we'd build the youth hostel - look at the number of cafes and bars in the top three positions. Maxvorstadt and Schwabing are indeed known to be hip neighbourhoods.

### Cluster 8 - Altstadt-Lehel and Au-Haidhausen

These are the points in the cluster:

In [398]:
munich_merged.loc[munich_merged['Cluster Labels'] == 8, munich_merged.columns[list(range(2, munich_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
101,132,8,Plaza,Hotel,Burger Joint,Bakery,Beer Garden,Supermarket,Greek Restaurant,Brewery,Café,Tram Station
102,133,8,Italian Restaurant,Plaza,French Restaurant,Restaurant,Café,Supermarket,Turkish Restaurant,Organic Grocery,German Restaurant,Bus Stop
103,134,8,Hotel,Gym / Fitness Center,Climbing Gym,Pub,Nightclub,Beach Bar,Beer Bar,Gym,Supermarket,Fried Chicken Joint
104,135,8,Bus Stop,Pub,Discount Store,Shipping Store,Beach Bar,Liquor Store,Nightclub,Austrian Restaurant,Turkish Restaurant,German Restaurant
118,151,8,Café,Bavarian Restaurant,Coffee Shop,Cocktail Bar,Pizza Place,Bookstore,German Restaurant,Theater,Tea Room,Bistro
119,152,8,Indian Restaurant,Ice Cream Shop,Pizza Place,Concert Hall,Science Museum,Supermarket,Hotel,Doner Restaurant,Afghan Restaurant,Gourmet Shop
120,153,8,Italian Restaurant,Café,Plaza,Bakery,German Restaurant,Indian Restaurant,Bar,Ice Cream Shop,French Restaurant,Concert Hall
121,154,8,German Restaurant,Hotel,Italian Restaurant,Café,Plaza,Indian Restaurant,Spanish Restaurant,Vegetarian / Vegan Restaurant,Bakery,Donut Shop
122,155,8,Italian Restaurant,Hotel,Portuguese Restaurant,Indian Restaurant,Home Service,Pizza Place,Doner Restaurant,Coffee Shop,Restaurant,Climbing Gym
136,171,8,Café,Bavarian Restaurant,Coffee Shop,Hotel,Plaza,Bookstore,Pizza Place,German Restaurant,Clothing Store,Italian Restaurant


This is a more residential area - we often see drugstores and supermarkets in the first positions. It's still central, but looking at the number of restaurants versus bars we can tell that the target is different. Hotels are also more common.

### Cluster 20 - Ludwigsvorstadt-Isarvorstadt and Sendling

These are the points in the cluster:

In [399]:
munich_merged.loc[munich_merged['Cluster Labels'] == 20, munich_merged.columns[list(range(2, munich_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
43,68,20,Plaza,Turkish Restaurant,Soccer Field,Construction & Landscaping,Hotel,Bakery,Beer Garden,BBQ Joint,Climbing Gym,Greek Restaurant
44,69,20,Park,Beach,Gastropub,Café,Rest Area,Seafood Restaurant,Beer Garden,Accessories Store,Nightclub,Optical Shop
61,88,20,Plaza,Italian Restaurant,German Restaurant,Turkish Restaurant,Gas Station,Bakery,Park,Supermarket,Organic Grocery,Hotel
62,89,20,Athletics & Sports,Restaurant,Park,Trail,Gas Station,Soccer Field,Beach,Rest Area,Newsstand,Optical Shop
63,90,20,Taverna,Plaza,Drugstore,Bus Line,Spa,Bus Stop,Café,German Restaurant,Greek Restaurant,Gym
78,107,20,German Restaurant,Doner Restaurant,Bank,Gastropub,Bus Stop,Vietnamese Restaurant,Café,Italian Restaurant,Drugstore,Spanish Restaurant
79,108,20,Italian Restaurant,Supermarket,Market,Food & Drink Shop,Gastropub,Grocery Store,Falafel Restaurant,Gym Pool,Bar,Bus Stop
80,109,20,Bar,Café,Italian Restaurant,Supermarket,Turkish Restaurant,Park,Bakery,Cocktail Bar,Food Court,Trail
81,110,20,German Restaurant,Drugstore,Soccer Field,Plaza,Cupcake Shop,Gastropub,Bar,Beach,Beer Garden,Taverna
82,111,20,Italian Restaurant,Bar,Plaza,Pizza Place,German Restaurant,Café,Bakery,Drugstore,Brewery,Doner Restaurant


Yet another cafe area - it probably got separated from Cluster 0 because k-means, with the coordinates among its features, tends to prefer compact, roughly circular neighbourhoods.

### Cluster 6 - Schwantalerhöhe and Neuhausen

In [409]:
munich_merged.loc[munich_merged['Cluster Labels'] == 18, munich_merged.columns[list(range(2, munich_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
192,233,18,Bus Stop,Monument / Landmark,Hotel Pool,Bavarian Restaurant,Restaurant,Beer Garden,Snack Place,Recreation Center,Athletics & Sports,Hotel
193,234,18,Bank,Bakery,Supermarket,Park,Gourmet Shop,Athletics & Sports,Hostel,German Restaurant,Organic Grocery,Bus Stop
210,253,18,Bar,Tunnel,Dog Run,Comedy Club,Trattoria/Osteria,Boat Rental,German Restaurant,Snack Place,Convenience Store,Beer Garden
211,254,18,Bathing Area,Bavarian Restaurant,Tennis Court,Café,Accessories Store,Noodle House,Organic Grocery,Optical Shop,Opera House,Office
212,255,18,Bathing Area,Trattoria/Osteria,Bus Stop,Accessories Store,Newsstand,Organic Grocery,Optical Shop,Opera House,Office,Noodle House
228,273,18,Hotel,German Restaurant,Bar,Afghan Restaurant,Bus Stop,Trattoria/Osteria,Nightclub,Outlet Store,Outdoor Sculpture,Organic Grocery
229,274,18,Photography Studio,Bathing Area,Bavarian Restaurant,Park,Tennis Court,Volleyball Court,Beer Garden,Newsstand,Opera House,Office
231,276,18,Bus Stop,Asian Restaurant,Gas Station,Bakery,Theater,Bar,Italian Restaurant,Music Store,Opera House,Outlet Store
247,294,18,Stadium,Bed & Breakfast,Trail,Beer Garden,Moving Target,Newsstand,Optical Shop,Mountain,Opera House,Office
248,295,18,Indie Theater,Lake,Park,Bathing Area,Dog Run,Skate Park,Music Venue,Nature Preserve,New American Restaurant,Newsstand


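This is the area around the central station, where hotels are especially common. Note that the cell above filters on label 18 rather than 6, so the table shows the park cluster mentioned below instead of the station area.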

### Other clusters

The algorithm caught some other interesting features, for example the industrial area in the northwest (Cluster 9), the zoo in the south (Cluster 10) and the park (Cluster 18).

## Conclusion

The algorithm provides a new rationale for the subdivision of the city, grouping and splitting old neighbourhoods into new ones.