# Capstone Project: Munich or Vienna, Which City Has the Better Urban Design?

## Introduction

**Munich**, the third-largest city in Germany, has more than 1.5 million inhabitants and is famous for its beer, its folk festival, the beautiful English Garden and, of course,
its universities and wide variety of businesses.

**Vienna**, the capital of Austria, has more than 1.9 million inhabitants. The city has long been known for its balanced and elegant design. Every corner of the city tells
its own story.

For young graduates choosing their first job after university, it is both interesting and relevant to learn more about life in a city before
moving there. This project also aims to give a reasoned answer to the question of where to make your home.

## Table of Contents

1. **Data explanation**
2. **Methodology** 

    2.1 Data cleaning
    
    2.2 Foursquare API
    
    2.3 KMeans Clustering
    
3. **Results**
4. **Discussions**
5. **Conclusion**
    

### 1. Data explanation

The German Wikipedia page for Munich (https://de.wikipedia.org/wiki/M%C3%BCnchen) contains a table of all **25 boroughs of Munich with their areas, numbers of inhabitants and
proportions of foreigners**.

The page for Vienna (https://de.wikipedia.org/wiki/Wien) contains a table with the same structure for that city's boroughs.

Using **Foursquare** geo data, I would like to explore these two cities and get the **10 most common venue categories** for each borough of the two cities.

Similar boroughs of each city will be clustered with **KMeans** and **visualized** on two interactive maps.

Let's begin.

## 2. Methodology
### 2.1 Data cleaning

In [5]:
# Import the necessary modules and libraries
!pip install lxml
import lxml
import pandas as pd

Collecting lxml
  Downloading https://files.pythonhosted.org/packages/55/6f/c87dffdd88a54dd26a3a9fef1d14b6384a9933c455c54ce3ca7d64a84c88/lxml-4.5.1-cp36-cp36m-manylinux1_x86_64.whl (5.5MB)
Installing collected packages: lxml
Successfully installed lxml-4.5.1


In [6]:
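# Read the tables of the Wikipedia page with pd.read_html
# Note: German Wikipedia writes "." for thousands and "," for decimals, so thousands=","
# misreads some values (e.g. 3,15 becomes 315); the cells below rescale them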
url1="https://de.wikipedia.org/wiki/M%C3%BCnchen"
table_mun=pd.read_html(url1, header=0, thousands=",")
print(table_mun[1])

     Nr.                                        Stadtbezirk  Fläche(km²)  \
0    1.0                                     Altstadt-Lehel          315   
1    2.0                       Ludwigsvorstadt-Isarvorstadt          440   
2    3.0                                        Maxvorstadt          430   
3    4.0                                     Schwabing-West          436   
4    5.0                                      Au-Haidhausen          422   
5    6.0                                           Sendling          394   
6    7.0                                  Sendling-Westpark          781   
7    8.0                                   Schwanthalerhöhe          207   
8    9.0                              Neuhausen-Nymphenburg         1291   
9   10.0                                            Moosach         1109   
10  11.0                              Milbertshofen-Am Hart         1342   
11  12.0                                 Schwabing-Freimann         2567   
12  13.0    

In [7]:
df_mun=pd.DataFrame(table_mun[1])
df_mun.head(5)

Unnamed: 0,Nr.,Stadtbezirk,Fläche(km²),Einwohner,Dichte(Einw./km²),Ausländer(%)
0,1.0,Altstadt-Lehel,315,21.1,6.708,261
1,2.0,Ludwigsvorstadt-Isarvorstadt,440,51.644,11.734,284
2,3.0,Maxvorstadt,430,51.402,11.96,254
3,4.0,Schwabing-West,436,68.527,15.706,227
4,5.0,Au-Haidhausen,422,61.356,14.541,235


In [8]:
# delete the redundant column "Nr." 
df_muc=df_mun.drop("Nr.",axis=1)
df_muc.columns=["Borough","Area","Inhabitant","Density","Foreigner%"] # Translate the columns to English
df_muc.head()

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%
0,Altstadt-Lehel,315,21.1,6.708,261
1,Ludwigsvorstadt-Isarvorstadt,440,51.644,11.734,284
2,Maxvorstadt,430,51.402,11.96,254
3,Schwabing-West,436,68.527,15.706,227
4,Au-Haidhausen,422,61.356,14.541,235


In [9]:
# Rescale "Area" and "Foreigner%": their decimal commas were dropped when the table was read
df_muc["Area"]=df_muc["Area"].values/100
df_muc["Foreigner%"]=df_muc["Foreigner%"].values/10
df_muc.head()

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%
0,Altstadt-Lehel,3.15,21.1,6.708,26.1
1,Ludwigsvorstadt-Isarvorstadt,4.4,51.644,11.734,28.4
2,Maxvorstadt,4.3,51.402,11.96,25.4
3,Schwabing-West,4.36,68.527,15.706,22.7
4,Au-Haidhausen,4.22,61.356,14.541,23.5
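
The rescaling above is needed because the source table uses German number notation, where `3,15` means 3.15 and `21.100` means 21,100. As an alternative, the raw strings could be converted directly. The helper below is only a sketch: `parse_german_number` is a hypothetical name, and the sample values are taken from the first Munich row shown earlier. `pd.read_html` also accepts `thousands` and `decimal` arguments, so passing `thousands="."` and `decimal=","` may make the rescaling unnecessary, although that has not been checked against this page.

```python
import pandas as pd

def parse_german_number(series):
    """Convert strings in German notation ('21.100', '3,15') to floats."""
    cleaned = (
        series.astype(str)
        .str.replace(".", "", regex=False)   # drop thousands separators
        .str.replace(",", ".", regex=False)  # decimal comma -> decimal point
    )
    # Anything that still cannot be parsed becomes NaN
    return pd.to_numeric(cleaned, errors="coerce")

# Area, inhabitants, density and foreigner share of Altstadt-Lehel as raw strings
raw = pd.Series(["3,15", "21.100", "6.708", "26,1"])
parsed = parse_german_number(raw)
print(parsed.tolist())
```

With this approach, each column ends up in its natural unit (km², persons, persons per km², percent) without column-specific scale factors.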


In [10]:

# Convert "Density" to inhabitants per km² (the "." in the source is a thousands separator)
df_muc["Density"]=df_muc["Density"].values*1000



In [11]:
# Convert "Inhabitant" to numeric (unparseable entries become NaN) and scale to absolute counts
df_muc["Inhabitant"]=pd.to_numeric(df_muc["Inhabitant"],errors="coerce")
df_muc["Inhabitant"]=df_muc["Inhabitant"].values*1000


In [20]:
df_muc.dtypes

Borough        object
Area          float64
Inhabitant    float64
Density       float64
Foreigner%    float64
dtype: object

In [12]:
df_muc.tail()

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%
21,Aubing-Lochhausen-Langwied,34.06,47813.0,1404.0,28.4
22,Allach-Untermenzing,15.45,33355.0,2159.0,24.2
23,Feldmoching-Hasenbergl,28.94,61774.0,2135.0,32.4
24,Laim,5.29,56546.0,10698.0,28.5
25,Landeshauptstadt München,310.71,,4963.0,28.1


In [22]:
# Delete the last row, because it holds the totals for the whole city
df_muc=df_muc.iloc[0:25,:].copy()
df_muc.tail()

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%
20,Pasing-Obermenzing,16.5,74625.0,4523.0,22.9
21,Aubing-Lochhausen-Langwied,34.06,47813.0,1404.0,28.4
22,Allach-Untermenzing,15.45,33355.0,2159.0,24.2
23,Feldmoching-Hasenbergl,28.94,61774.0,2135.0,32.4
24,Laim,5.29,56546.0,10698.0,28.5


In [23]:
df_muc.shape

(25, 5)
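
Slicing by position works here because the summary row happens to be last. A variant that does not depend on row order is to filter on the label of the summary row. This is a sketch on a toy frame built from the tail of the table above; `TOTAL_LABEL` and `boroughs_only` are illustrative names.

```python
import pandas as pd

# Toy rows copied from the tail of the Munich table
df = pd.DataFrame({
    "Borough": ["Feldmoching-Hasenbergl", "Laim", "Landeshauptstadt München"],
    "Area": [28.94, 5.29, 310.71],
})

TOTAL_LABEL = "Landeshauptstadt München"  # label of the city-wide summary row

# Keep every row except the summary row; .copy() gives an independent frame
boroughs_only = df[df["Borough"] != TOTAL_LABEL].copy()
print(boroughs_only["Borough"].tolist())
```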

In [13]:
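# Change the longest borough name to a shorter one
# (assign the result back; an in-place replace on a selected column may not update the dataframe)
df_muc["Borough"]=df_muc["Borough"].replace("Thalkirchen-Obersendling-Forstenried-Fürstenried-Solln","Thalkirchen-Obersendling")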

In [25]:
df_muc["Borough"]

0                   Altstadt-Lehel
1     Ludwigsvorstadt-Isarvorstadt
2                      Maxvorstadt
3                   Schwabing-West
4                    Au-Haidhausen
5                         Sendling
6                Sendling-Westpark
7                 Schwanthalerhöhe
8            Neuhausen-Nymphenburg
9                          Moosach
10           Milbertshofen-Am Hart
11              Schwabing-Freimann
12                     Bogenhausen
13                    Berg am Laim
14                  Trudering-Riem
15              Ramersdorf-Perlach
16         Obergiesing-Fasangarten
17         Untergiesing-Harlaching
18        Thalkirchen-Obersendling
19                          Hadern
20              Pasing-Obermenzing
21      Aubing-Lochhausen-Langwied
22             Allach-Untermenzing
23          Feldmoching-Hasenbergl
24                            Laim
Name: Borough, dtype: object

#### Use the same method to process the Vienna table

In [14]:
url2="https://de.wikipedia.org/wiki/Wien"
table_wien=pd.read_html(url2, header=0, thousands=",")
print(table_wien[1])

               Gemeindebezirk  Fläche(km²)  Einwohner  Einwohnerpro km²  \
0           01., Innere Stadt          287     16.465             5.737   
1           02., Leopoldstadt         1924    105.003             5.458   
2             03., Landstraße          740     90.183            12.187   
3                 04., Wieden          178     33.035            18.559   
4             05., Margareten          201     55.356            27.540   
5              06., Mariahilf          146     31.865            21.825   
6                 07., Neubau          161     32.197            19.998   
7             08., Josefstadt          109     25.528            23.420   
8             09., Alsergrund          297     42.709            14.380   
9              10., Favoriten         3182    198.083             6.225   
10             11., Simmering         2326    100.137             4.305   
11              12., Meidling          810     95.955            11.846   
12              13., Hiet

In [15]:
df_wien=pd.DataFrame(table_wien[1])
df_wien.head()

Unnamed: 0,Gemeindebezirk,Fläche(km²),Einwohner,Einwohnerpro km²,Einwohner mit ausländischer Herkunft(Prozent)[34] Stand: 23. Mai 2019
0,"01., Innere Stadt",287,16.465,5.737,365
1,"02., Leopoldstadt",1924,105.003,5.458,452
2,"03., Landstraße",740,90.183,12.187,414
3,"04., Wieden",178,33.035,18.559,422
4,"05., Margareten",201,55.356,27.54,483


In [16]:
df_wien.columns=["Borough","Area","Inhabitant","Density","Foreigner%"]

In [29]:
df_wien.head()


Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%
0,"01., Innere Stadt",287,16.465,5.737,365
1,"02., Leopoldstadt",1924,105.003,5.458,452
2,"03., Landstraße",740,90.183,12.187,414
3,"04., Wieden",178,33.035,18.559,422
4,"05., Margareten",201,55.356,27.54,483


In [17]:
df_wien["Borough"]=df_wien["Borough"].str[5:]  # strip the leading district number, e.g. "01., "
df_wien["Borough"]


0             Innere Stadt
1             Leopoldstadt
2               Landstraße
3                   Wieden
4               Margareten
5                Mariahilf
6                   Neubau
7               Josefstadt
8               Alsergrund
9                Favoriten
10               Simmering
11                Meidling
12                Hietzing
13                 Penzing
14    Rudolfsheim-Fünfhaus
15               Ottakring
16                 Hernals
17                 Währing
18                 Döbling
19             Brigittenau
20             Floridsdorf
21              Donaustadt
22                 Liesing
23                    Wien
Name: Borough, dtype: object

In [18]:
df_wien=df_wien.iloc[0:23,:].copy() # delete the last row of the dataframe because it holds the city-wide totals
df_wien.head(2)

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%
0,Innere Stadt,287,16.465,5.737,365
1,Leopoldstadt,1924,105.003,5.458,452


In [19]:
# correct the values in the rest of the columns 
df_wien["Area"]=df_wien["Area"].values/100
df_wien["Foreigner%"]=df_wien["Foreigner%"].values/10
df_wien.head(2)

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%
0,Innere Stadt,2.87,16.465,5.737,36.5
1,Leopoldstadt,19.24,105.003,5.458,45.2


In [33]:
df_wien.dtypes

Borough        object
Area          float64
Inhabitant     object
Density       float64
Foreigner%    float64
dtype: object

In [20]:
#change the "Inhabitant" column to type "float"
df_wien["Inhabitant"]=pd.to_numeric(df_wien["Inhabitant"], errors="coerce")
df_wien.dtypes

Borough        object
Area          float64
Inhabitant    float64
Density       float64
Foreigner%    float64
dtype: object

In [21]:
df_wien["Inhabitant"]=df_wien["Inhabitant"].values*1000
df_wien["Density"]=df_wien["Density"].values*1000
df_wien

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%
0,Innere Stadt,2.87,16465.0,5737.0,36.5
1,Leopoldstadt,19.24,105003.0,5458.0,45.2
2,Landstraße,7.4,90183.0,12187.0,41.4
3,Wieden,1.78,33035.0,18559.0,42.2
4,Margareten,2.01,55356.0,27540.0,48.3
5,Mariahilf,1.46,31865.0,21825.0,40.3
6,Neubau,1.61,32197.0,19998.0,38.8
7,Josefstadt,1.09,25528.0,23420.0,38.9
8,Alsergrund,2.97,42709.0,14380.0,41.1
9,Favoriten,31.82,198083.0,6225.0,47.8


In [22]:
df_postal_code_wien = pd.read_excel("Postal_Code_Wien.xlsx")
df_postal_code_wien.head()

Unnamed: 0,Borough,Postal Code
0,Innere Stadt,1010
1,Leopoldstadt,1020
2,Landstraße,1030
3,Wieden,1040
4,Margareten,1050


In [50]:
df_postal_code_wien.shape

(23, 2)

In [23]:
df_wien_new=df_wien.merge(df_postal_code_wien,on="Borough")
df_wien_new.head()

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%,Postal Code
0,Innere Stadt,2.87,16465.0,5737.0,36.5,1010
1,Leopoldstadt,19.24,105003.0,5458.0,45.2,1020
2,Landstraße,7.4,90183.0,12187.0,41.4,1030
3,Wieden,1.78,33035.0,18559.0,42.2,1040
4,Margareten,2.01,55356.0,27540.0,48.3,1050


### 2.2 Foursquare API

Before using the Foursquare API, we should add the latitude and longitude coordinates of each borough to the dataframes.

In [24]:
!pip install geopy
import geopy

Collecting geopy
  Downloading https://files.pythonhosted.org/packages/07/e1/9c72de674d5c2b8fcb0738a5ceeb5424941fefa080bfe4e240d0bacb5a38/geopy-2.0.0-py3-none-any.whl (111kB)
Collecting geographiclib<2,>=1.49 (from geopy)
  Downloading https://files.pythonhosted.org/packages/8b/62/26ec95a98ba64299163199e95ad1b0e34ad3f4e176e221c40245f211e425/geographiclib-1.50-py3-none-any.whl
Installing collected packages: geographiclib, geopy
Successfully installed geographiclib-1.50 geopy-2.0.0


In [25]:
from geopy.geocoders import Nominatim

In [26]:
# Geocode "VIE, Austria", intended as the city centre of Vienna
geolocator= Nominatim(user_agent="me")
location_wien = geolocator.geocode("VIE, Austria")
print((location_wien.latitude, location_wien.longitude))


(46.61340635, 13.826961098702938)


In [27]:
location_muc=geolocator.geocode("MUC,Germany")
print((location_muc.latitude, location_muc.longitude))


(48.35376735, 11.778011507058581)


In [28]:
geolocator= Nominatim(user_agent="me")
wien_coordinates=[]

# Geocode every borough and collect [address, latitude, longitude]
for borough in df_wien["Borough"]:
    wien_address=borough+", VIE, Austria"
    location_wien=geolocator.geocode(wien_address)
    wien_latitude=location_wien.latitude
    wien_longitude=location_wien.longitude
    wien_coordinates.append([wien_address, wien_latitude,wien_longitude])

In [29]:
wien_coordinates_df=pd.DataFrame(wien_coordinates,columns=["Borough","Latitude","Longitude"])

In [30]:
wien_coordinates_df.head()

Unnamed: 0,Borough,Latitude,Longitude
0,"Innere Stadt, VIE, Austria",46.612454,13.846583
1,"Leopoldstadt, VIE, Austria",48.200638,16.426948
2,"Landstraße, VIE, Austria",48.298303,14.291469
3,"Wieden, VIE, Austria",48.14549,14.888293
4,"Margareten, VIE, Austria",48.188073,16.353386


In [None]:
# Prepare to merge "wien_coordinates_df" and "df_wien":
# strip the ", VIE, Austria" suffix from the "Borough" column of "wien_coordinates_df"

In [31]:
wien_coordinates_df["Borough"]=wien_coordinates_df["Borough"].str[:-14]
wien_coordinates_df["Borough"]

0             Innere Stadt
1             Leopoldstadt
2               Landstraße
3                   Wieden
4               Margareten
5                Mariahilf
6                   Neubau
7               Josefstadt
8               Alsergrund
9                Favoriten
10               Simmering
11                Meidling
12                Hietzing
13                 Penzing
14    Rudolfsheim-Fünfhaus
15               Ottakring
16                 Hernals
17                 Währing
18                 Döbling
19             Brigittenau
20             Floridsdorf
21              Donaustadt
22                 Liesing
Name: Borough, dtype: object

In [32]:
df_wien_merged=df_wien.merge(wien_coordinates_df,on="Borough")
df_wien_merged

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%,Latitude,Longitude
0,Innere Stadt,2.87,16465.0,5737.0,36.5,46.612454,13.846583
1,Leopoldstadt,19.24,105003.0,5458.0,45.2,48.200638,16.426948
2,Landstraße,7.4,90183.0,12187.0,41.4,48.298303,14.291469
3,Wieden,1.78,33035.0,18559.0,42.2,48.14549,14.888293
4,Margareten,2.01,55356.0,27540.0,48.3,48.188073,16.353386
5,Mariahilf,1.46,31865.0,21825.0,40.3,48.195475,16.347023
6,Neubau,1.61,32197.0,19998.0,38.8,48.201881,16.349056
7,Josefstadt,1.09,25528.0,23420.0,38.9,48.210598,16.35175
8,Alsergrund,2.97,42709.0,14380.0,41.1,48.225073,16.358398
9,Favoriten,31.82,198083.0,6225.0,47.8,48.173423,16.377914
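
Some coordinates in the merged table lie far from Vienna. Innere Stadt, for example, is placed at about 46.61°N, 13.85°E, while the other central boroughs cluster around 48.2°N, 16.35°E. A quick distance check can flag such rows before they are passed to Foursquare. The sketch below assumes an approximate city-centre coordinate (`VIENNA_CENTRE`) and an arbitrary 25 km threshold. Neither value comes from the notebook.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

VIENNA_CENTRE = (48.2082, 16.3738)  # assumed approximate city-centre coordinates
MAX_DISTANCE_KM = 25                # assumed plausibility threshold

# First five geocoding results from the table above
geocoded = {
    "Innere Stadt": (46.612454, 13.846583),
    "Leopoldstadt": (48.200638, 16.426948),
    "Landstraße": (48.298303, 14.291469),
    "Wieden": (48.14549, 14.888293),
    "Margareten": (48.188073, 16.353386),
}

# Collect every borough whose geocoded point is implausibly far from the centre
suspicious = [name for name, (lat, lon) in geocoded.items()
              if haversine_km(lat, lon, *VIENNA_CENTRE) > MAX_DISTANCE_KM]
print(suspicious)
```

Rows flagged this way would need a different query string (for instance one that names the city rather than an abbreviation) before their venue data can be trusted.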


#### Apply the same procedure to df_muc, using postal codes because the geocoder cannot resolve the Munich borough names

In [33]:
# read the webpage "https://www.muenchen.de/int/en/living/postal-codes.html"
url_muc_postal="https://www.muenchen.de/int/en/living/postal-codes.html"
table_muc_postal=pd.read_html(url_muc_postal, header=0)
print(table_muc_postal[0])

                                             District  \
0                                 Allach-Untermenzing   
1                                      Altstadt-Lehel   
2                                       Au-Haidhausen   
3                          Aubing-Lochhausen-Langwied   
4                                        Berg am Laim   
5                                         Bogenhausen   
6                              Feldmoching-Hasenbergl   
7                                              Hadern   
8                                                Laim   
9                        Ludwigsvorstadt-Isarvorstadt   
10                                        Maxvorstadt   
11                              Milbertshofen-Am Hart   
12                                            Moosach   
13                              Neuhausen-Nymphenburg   
14                                        Obergiesing   
15                                 Pasing-Obermenzing   
16                             

In [34]:
df_muc_postal=pd.DataFrame(table_muc_postal[0])

In [35]:
df_muc_postal

Unnamed: 0,District,Postal Code
0,Allach-Untermenzing,"80995, 80997, 80999, 81247, 81249"
1,Altstadt-Lehel,"80331, 80333, 80335, 80336, 80469, 80538, 80539"
2,Au-Haidhausen,"81541, 81543, 81667, 81669, 81671, 81675, 81677"
3,Aubing-Lochhausen-Langwied,"81243, 81245, 81249"
4,Berg am Laim,"81671, 81673, 81735, 81825"
5,Bogenhausen,"81675, 81677, 81679, 81925, 81927, 81929"
6,Feldmoching-Hasenbergl,"80933, 80935, 80995"
7,Hadern,"80689, 81375, 81377"
8,Laim,"80686, 80687, 80689"
9,Ludwigsvorstadt-Isarvorstadt,"80335, 80336, 80337, 80469"


In [36]:
df_muc_postal.columns=["Borough","Postal Code"]
df_muc_postal.shape

(25, 2)

Two values in the "Borough" columns of the two dataframes do not match, so we must align them before merging.
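
One way to locate such mismatches is a set difference between the two name columns. The sketch below uses a short excerpt of the names. With the real dataframes the sets would be built as `set(df_muc["Borough"])` and `set(df_muc_postal["Borough"])`.

```python
# Borough names as they appear in the two tables (excerpt)
wiki_names = {"Altstadt-Lehel", "Obergiesing-Fasangarten", "Thalkirchen-Obersendling"}
postal_names = {
    "Altstadt-Lehel",
    "Obergiesing",
    "Thalkirchen-Obersendling-Fürstenried-Forstenried-Solln",
}

# Names present in one table but missing from the other
only_in_wiki = sorted(wiki_names - postal_names)
only_in_postal = sorted(postal_names - wiki_names)
print(only_in_wiki)
print(only_in_postal)
```

Both lists being empty would indicate that an inner merge on "Borough" keeps every row.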

In [45]:
df_muc_postal["Borough"]=df_muc_postal["Borough"].replace("Obergiesing","Obergiesing-Fasangarten")

In [46]:
df_muc_postal["Borough"]=df_muc_postal["Borough"].replace("Thalkirchen-Obersendling-Fürstenried-Forstenried-Solln","Thalkirchen-Obersendling")

In [47]:
df_muc_postal

Unnamed: 0,Borough,Postal Code
0,Allach-Untermenzing,"80995, 80997, 80999, 81247, 81249"
1,Altstadt-Lehel,"80331, 80333, 80335, 80336, 80469, 80538, 80539"
2,Au-Haidhausen,"81541, 81543, 81667, 81669, 81671, 81675, 81677"
3,Aubing-Lochhausen-Langwied,"81243, 81245, 81249"
4,Berg am Laim,"81671, 81673, 81735, 81825"
5,Bogenhausen,"81675, 81677, 81679, 81925, 81927, 81929"
6,Feldmoching-Hasenbergl,"80933, 80935, 80995"
7,Hadern,"80689, 81375, 81377"
8,Laim,"80686, 80687, 80689"
9,Ludwigsvorstadt-Isarvorstadt,"80335, 80336, 80337, 80469"


In [51]:
#Merge the two dataframes again.
df_muc_postal_merged=df_muc.merge(df_muc_postal,on="Borough",how="inner")
df_muc_postal_merged.shape


(25, 6)

In [52]:
geolocator_muc=Nominatim(user_agent="you")
muc_coordinates=[]
# Use the first postal code listed for each borough as its geocoding query
temp=df_muc_postal_merged["Postal Code"].str[0:5]
for p in df_muc_postal_merged.index:
    address_muc=temp[p]
    locator_muc=geolocator_muc.geocode(address_muc)
    latitude_muc=locator_muc.latitude
    longitude_muc=locator_muc.longitude
    muc_coordinates.append([address_muc, latitude_muc, longitude_muc])
muc_coordinates

[['80331', 48.136065871544204, 11.573454570315972],
 ['80335', 48.145699615463236, 11.555925619440337],
 ['80333', 48.15114575, 11.562479204882363],
 ['80796', 48.1628675, 11.569807298619128],
 ['81541', 48.1258722, 11.580782024104963],
 ['80336', 48.133549200000004, 11.558182811298703],
 ['80686', 48.1317129, 11.516606790116903],
 ['80335', 48.145699615463236, 11.555925619440337],
 ['80634', 48.150301, 11.528453192816965],
 ['80637', 48.162216442177844, 11.536099784402223],
 ['80807', 48.18299020227114, 11.585847735691877],
 ['80538', 48.14232396465845, 11.590455807100975],
 ['81675', 48.1352758, 11.614246078423617],
 ['81671', 48.12326145, 11.61187684710469],
 ['81735', 48.10749725, 11.65583345502645],
 ['81539', 48.11011395, 11.591103189401016],
 ['81539', 48.11011395, 11.591103189401016],
 ['81543', 48.1216816, 11.576267],
 ['81379', 48.0919671, 11.5263946],
 ['80689', 48.13116667718667, 11.489509729797593],
 ['80687', 48.1386722, 11.518571],
 ['81243', 48.148806851027935, 11.43125
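
The loops above read `.latitude` directly from the geocoding result. geopy's `geocode` returns `None` when it finds no match, and in that case the attribute access would raise an error. A small wrapper can make the loop tolerant of missing results. The sketch below runs offline: `fake_geocode` and `Location` are hypothetical stand-ins for `geolocator_muc.geocode` and its result object, and `safe_coordinates` is an illustrative name.

```python
from collections import namedtuple

# Stand-in for the result object returned by a geocoder
Location = namedtuple("Location", ["latitude", "longitude"])

def fake_geocode(query):
    """Hypothetical offline geocoder: returns None for unknown queries."""
    known = {"80331": Location(48.136066, 11.573455)}
    return known.get(query)

def safe_coordinates(geocode, query):
    """Return (latitude, longitude), or (None, None) when nothing is found."""
    location = geocode(query)
    if location is None:
        return (None, None)
    return (location.latitude, location.longitude)

print(safe_coordinates(fake_geocode, "80331"))
print(safe_coordinates(fake_geocode, "00000"))
```

In the notebook, the real geocoder would be passed in place of `fake_geocode`, and rows with missing coordinates could be inspected afterwards.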

In [53]:
muc_coordinates_df=pd.DataFrame(muc_coordinates, columns=["1st Postal code","Latitude","Longitude"])
muc_coordinates_df.head()

Unnamed: 0,1st Postal code,Latitude,Longitude
0,80331,48.136066,11.573455
1,80335,48.1457,11.555926
2,80333,48.151146,11.562479
3,80796,48.162867,11.569807
4,81541,48.125872,11.580782


In [54]:
#merge the Muc dataframes according to their indexes
merge_muc=df_muc_postal_merged.merge(muc_coordinates_df,left_index=True,right_index=True)
merge_muc

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%,Postal Code,1st Postal code,Latitude,Longitude
0,Altstadt-Lehel,3.15,21100.0,6708.0,26.1,"80331, 80333, 80335, 80336, 80469, 80538, 80539",80331,48.136066,11.573455
1,Ludwigsvorstadt-Isarvorstadt,4.4,51644.0,11734.0,28.4,"80335, 80336, 80337, 80469",80335,48.1457,11.555926
2,Maxvorstadt,4.3,51402.0,11960.0,25.4,"80333, 80335, 80539, 80636, 80797, 80798, 8079...",80333,48.151146,11.562479
3,Schwabing-West,4.36,68527.0,15706.0,22.7,"80796, 80797, 80798, 80799, 80801, 80803, 8080...",80796,48.162867,11.569807
4,Au-Haidhausen,4.22,61356.0,14541.0,23.5,"81541, 81543, 81667, 81669, 81671, 81675, 81677",81541,48.125872,11.580782
5,Sendling,3.94,40983.0,10405.0,26.9,"80336, 80337, 80469, 81369, 81371, 81373, 81379",80336,48.133549,11.558183
6,Sendling-Westpark,7.81,59643.0,7632.0,28.9,"80686, 81369, 81373, 81377, 81379",80686,48.131713,11.516607
7,Schwanthalerhöhe,2.07,29743.0,14367.0,33.5,"80335, 80339",80335,48.1457,11.555926
8,Neuhausen-Nymphenburg,12.91,98814.0,7651.0,24.3,"80634, 80636, 80637, 80638, 80639",80634,48.150301,11.528453
9,Moosach,11.09,54223.0,4888.0,31.5,"80637, 80638, 80992, 80993, 80997",80637,48.162216,11.5361


In [55]:
merge_muc_cleared=merge_muc.drop(columns=["Postal Code","1st Postal code"])
merge_muc_cleared.head(3)

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%,Latitude,Longitude
0,Altstadt-Lehel,3.15,21100.0,6708.0,26.1,48.136066,11.573455
1,Ludwigsvorstadt-Isarvorstadt,4.4,51644.0,11734.0,28.4,48.1457,11.555926
2,Maxvorstadt,4.3,51402.0,11960.0,25.4,48.151146,11.562479


#### Use the Foursquare API to find venues in each borough of the two cities

In [60]:
CLIENT_ID = "ZWXJGKTL0JAIFUHVXFJLJXTSDFYY30R1YRKDSEBR5S3Z****"   
CLIENT_SECRET = 'SB2EPOCRJFZDMR2DAKJF5V4CQB5GFEYNTHMP2KSAQC3A****' 
VERSION = '20180605' # Foursquare API version

In [61]:
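import requests  # needed for the GET request inside the function

LIMIT = 100  # maximum number of venues returned per request
def getNearbyVenues(names, latitudes, longitudes, radius=500):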
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
        
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Borough', 
                  'Borough Latitude', 
                  'Borough Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
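
The list comprehension in `getNearbyVenues` indexes `categories[0]` for every venue, which would fail for a venue returned without a category. A more defensive flattening step might look like the sketch below. `venue_rows` is a hypothetical helper, and `sample_items` is a made-up payload that only mimics the fields the function reads.

```python
def venue_rows(items, name, lat, lng):
    """Flatten Foursquare-style 'items' into tuples, tolerating missing categories."""
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # Fall back to None when a venue carries no category
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng, venue["name"],
                     venue["location"]["lat"], venue["location"]["lng"], category))
    return rows

sample_items = [  # hypothetical payload with the same field names as above
    {"venue": {"name": "Marienplatz",
               "location": {"lat": 48.137125, "lng": 11.575483},
               "categories": [{"name": "Plaza"}]}},
    {"venue": {"name": "Unnamed spot",
               "location": {"lat": 48.1, "lng": 11.5},
               "categories": []}},
]

rows = venue_rows(sample_items, "Altstadt-Lehel", 48.136066, 11.573455)
for row in rows:
    print(row)
```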

  

In [62]:
import json
import requests
from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import, pandas >= 1.0)

In [63]:
venues_muc=getNearbyVenues(names=merge_muc["Borough"],latitudes=merge_muc["Latitude"],longitudes=merge_muc["Longitude"],radius=500)
venues_muc.head()

Altstadt-Lehel
Ludwigsvorstadt-Isarvorstadt
Maxvorstadt
Schwabing-West
Au-Haidhausen
Sendling
Sendling-Westpark
Schwanthalerhöhe
Neuhausen-Nymphenburg
Moosach
Milbertshofen-Am Hart
Schwabing-Freimann
Bogenhausen
Berg am Laim
Trudering-Riem
Ramersdorf-Perlach
Obergiesing-Fasangarten
Untergiesing-Harlaching
Thalkirchen-Obersendling
Hadern
Pasing-Obermenzing
Aubing-Lochhausen-Langwied
Allach-Untermenzing
Feldmoching-Hasenbergl
Laim


Unnamed: 0,Borough,Borough Latitude,Borough Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Altstadt-Lehel,48.136066,11.573455,Marienplatz,48.137125,11.575483,Plaza
1,Altstadt-Lehel,48.136066,11.573455,Kustermann,48.136242,11.574897,Department Store
2,Altstadt-Lehel,48.136066,11.573455,St. Peter,48.13653,11.575615,Church
3,Altstadt-Lehel,48.136066,11.573455,Viktualienmarkt,48.135296,11.576368,Farmers Market
4,Altstadt-Lehel,48.136066,11.573455,Venchi Gelato,48.134563,11.574657,Ice Cream Shop


In [64]:
venues_muc.shape

(698, 7)

In [65]:
venues_muc.groupby(["Borough"]).count()

Unnamed: 0_level_0,Borough Latitude,Borough Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Borough,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Allach-Untermenzing,4,4,4,4,4,4
Altstadt-Lehel,100,100,100,100,100,100
Au-Haidhausen,22,22,22,22,22,22
Aubing-Lochhausen-Langwied,9,9,9,9,9,9
Berg am Laim,34,34,34,34,34,34
Bogenhausen,29,29,29,29,29,29
Feldmoching-Hasenbergl,12,12,12,12,12,12
Hadern,8,8,8,8,8,8
Laim,4,4,4,4,4,4
Ludwigsvorstadt-Isarvorstadt,48,48,48,48,48,48


In [66]:
print("There are {} categories in Munich.".format(len(venues_muc["Venue Category"].unique())))

There are 163 categories in Munich.


In [67]:
# one-hot encoding of the venue categories
muc_onehot=pd.get_dummies(venues_muc["Venue Category"],prefix="",prefix_sep="")

muc_onehot["Borough"]=venues_muc["Borough"]
muc_onehot.head()


Unnamed: 0,Afghan Restaurant,Arcade,Argentinian Restaurant,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Austrian Restaurant,Automotive Shop,BBQ Joint,...,Tram Station,Trattoria/Osteria,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfall,Wine Bar,Wine Shop,Borough
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Altstadt-Lehel
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Altstadt-Lehel
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Altstadt-Lehel
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Altstadt-Lehel
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Altstadt-Lehel


In [70]:
#Move the "Borough" column to the beginning of the df

new_columns=[muc_onehot.columns[-1]]+list(muc_onehot.columns[:-1])
muc_onehot=muc_onehot[new_columns]
muc_onehot.head()

Unnamed: 0,Wine Shop,Borough,Afghan Restaurant,Arcade,Argentinian Restaurant,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Austrian Restaurant,...,Theater,Theme Park Ride / Attraction,Tram Station,Trattoria/Osteria,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfall,Wine Bar
0,0,Altstadt-Lehel,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,Altstadt-Lehel,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,Altstadt-Lehel,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,Altstadt-Lehel,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,Altstadt-Lehel,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
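
The output above lists `Wine Shop` before `Borough`. That would be consistent with the reordering cell having been executed twice, since each run moves whatever column is last to the front. A version that selects the column by name gives the same result however often it runs. The sketch below uses a toy frame, and `borough_first` is an illustrative name.

```python
import pandas as pd

def borough_first(df):
    """Return df with 'Borough' as the first column; safe to call repeatedly."""
    others = [c for c in df.columns if c != "Borough"]
    return df[["Borough"] + others]

toy = pd.DataFrame({"Wine Bar": [0], "Wine Shop": [1], "Borough": ["Altstadt-Lehel"]})
once = borough_first(toy)
twice = borough_first(once)  # a second call leaves the order unchanged
print(list(twice.columns))
```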


In [71]:
muc_onehot.shape

(698, 164)

In [72]:
# get the frequency for each category in each borough
muc_onehot_grouped=muc_onehot.groupby(["Borough"]).mean().reset_index()
muc_onehot_grouped.head()

Unnamed: 0,Borough,Wine Shop,Afghan Restaurant,Arcade,Argentinian Restaurant,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Austrian Restaurant,...,Theater,Theme Park Ride / Attraction,Tram Station,Trattoria/Osteria,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfall,Wine Bar
0,Allach-Untermenzing,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Altstadt-Lehel,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01
2,Au-Haidhausen,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0
3,Aubing-Lochhausen-Langwied,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0
4,Berg am Laim,0.0,0.0,0.0,0.0,0.0,0.029412,0.029412,0.0,0.0,...,0.029412,0.029412,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0


In [73]:
muc_onehot_grouped.shape

(25, 164)
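
Taking the mean of one-hot columns per borough yields the share of each category among that borough's venues, so every row of the grouped table sums to 1. A toy example with made-up boroughs "A" and "B" illustrates the arithmetic:

```python
import pandas as pd

venues = pd.DataFrame({
    "Borough": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Café", "Café", "Hotel", "Bakery", "Café", "Hotel"],
})

# One column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot["Borough"] = venues["Borough"]

# Mean of the indicator columns = relative frequency per borough
freq = onehot.groupby("Borough").mean()
print(freq)
```

Borough "A" has four venues, two of them cafés, so its `Café` frequency is 0.5. A borough with only a handful of venues therefore produces coarse frequencies, which is worth keeping in mind when comparing boroughs.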

#### Find the ten most common venues in each borough of Munich

In [74]:
def return_most_common_venue(category_row, num_top_venues):
    category_row = category_row.iloc[1:]
    categories_sorted = category_row.sort_values(ascending=False) # sort values in descending order
    
    return categories_sorted.index.values[0:num_top_venues]

In [75]:
import numpy as np

In [76]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Borough']
for q in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(q+1, indicators[q]))
    except IndexError:  # positions beyond the third take the "th" suffix
        columns.append('{}th Most Common Venue'.format(q+1))

# create a new dataframe
borough_venues_sorted_muc = pd.DataFrame(columns=columns)
borough_venues_sorted_muc['Borough'] = muc_onehot_grouped['Borough']
borough_venues_sorted_muc.head()



Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Allach-Untermenzing,,,,,,,,,,
1,Altstadt-Lehel,,,,,,,,,,
2,Au-Haidhausen,,,,,,,,,,
3,Aubing-Lochhausen-Langwied,,,,,,,,,,
4,Berg am Laim,,,,,,,,,,


In [77]:
borough_venues_sorted_muc.shape

(25, 11)
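
The column names above are built by indexing a three-element suffix list and falling back to "th" on failure, which is sufficient for ten columns. If more columns were ever needed, a small helper could produce correct English ordinals in general. This is only a sketch, and `ordinal` is an illustrative name.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the other teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

num_top_venues = 10
columns = ["Borough"] + [f"{ordinal(i)} Most Common Venue"
                         for i in range(1, num_top_venues + 1)]
print(columns[:4])
```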

In [78]:
# Fill the empty dataframe with values
for m in np.arange(muc_onehot_grouped.shape[0]):
    borough_venues_sorted_muc.iloc[m, 1:] = return_most_common_venue(muc_onehot_grouped.iloc[m, :],
                                                                        num_top_venues)

borough_venues_sorted_muc.head()

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Allach-Untermenzing,Bus Stop,Greek Restaurant,Lake,Wine Bar,Event Space,Food & Drink Shop,Fish Market,Fast Food Restaurant,Farmers Market,Falafel Restaurant
1,Altstadt-Lehel,German Restaurant,Bavarian Restaurant,Café,Clothing Store,Church,Gourmet Shop,Hotel,Italian Restaurant,Coffee Shop,Cosmetics Shop
2,Au-Haidhausen,Café,Italian Restaurant,Hotel,Bakery,Food & Drink Shop,Brewery,Fast Food Restaurant,Beer Garden,Fair,Beach
3,Aubing-Lochhausen-Langwied,Rest Area,Miscellaneous Shop,Bus Stop,Supermarket,Light Rail Station,Lake,Bowling Alley,Park,Tunnel,Drugstore
4,Berg am Laim,Nightclub,Gym / Fitness Center,Supermarket,Coffee Shop,Beach Bar,Shipping Store,Restaurant,Burger Joint,Pub,Planetarium



### The same procedure for Vienna

In [79]:
df_wien_merged.head()


Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%,Latitude,Longitude
0,Innere Stadt,2.87,16465.0,5737.0,36.5,46.612454,13.846583
1,Leopoldstadt,19.24,105003.0,5458.0,45.2,48.200638,16.426948
2,Landstraße,7.4,90183.0,12187.0,41.4,48.298303,14.291469
3,Wieden,1.78,33035.0,18559.0,42.2,48.14549,14.888293
4,Margareten,2.01,55356.0,27540.0,48.3,48.188073,16.353386


In [80]:
# Use the Foursquare API to find the venues for each borough of Vienna.
LIMIT=100
venues_wien=getNearbyVenues(names=df_wien_merged["Borough"],latitudes=df_wien_merged["Latitude"],longitudes=df_wien_merged["Longitude"],radius=500)
venues_wien.head()

Innere Stadt
Leopoldstadt
Landstraße
Wieden
Margareten
Mariahilf
Neubau
Josefstadt
Alsergrund
Favoriten
Simmering
Meidling
Hietzing
Penzing
Rudolfsheim-Fünfhaus
Ottakring
Hernals
Währing
Döbling
Brigittenau
Floridsdorf
Donaustadt
Liesing


Unnamed: 0,Borough,Borough Latitude,Borough Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Innere Stadt,46.612454,13.846583,Restaurant Delphi,46.615144,13.847048,Greek Restaurant
1,Innere Stadt,46.612454,13.846583,Hauptplatz Villach,46.613949,13.846609,Plaza
2,Innere Stadt,46.612454,13.846583,Wascher's Bar,46.611099,13.84312,Lounge
3,Innere Stadt,46.612454,13.846583,Holiday Inn,46.615444,13.850119,Hotel
4,Innere Stadt,46.612454,13.846583,Trastevere,46.613703,13.844146,Italian Restaurant
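
The first venue rows for Innere Stadt include "Hauptplatz Villach", which suggests that some geocoded borough coordinates do not lie in Vienna at all. Below is a minimal sketch of a sanity check. It assumes a city-center reference point of roughly (48.2082, 16.3738) and a 20 km threshold chosen only for illustration; `haversine_km` and `flag_far_boroughs` are hypothetical helpers, not part of the notebook.

```python
import numpy as np
import pandas as pd

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))

def flag_far_boroughs(df, center_lat, center_lon, max_km=20.0):
    """Return rows whose coordinates lie more than max_km from the city center."""
    dist = haversine_km(df["Latitude"], df["Longitude"], center_lat, center_lon)
    return df.assign(distance_km=dist)[dist > max_km]

# Rows copied from df_wien_merged.head() above
sample = pd.DataFrame({
    "Borough": ["Innere Stadt", "Leopoldstadt", "Landstraße", "Wieden", "Margareten"],
    "Latitude": [46.612454, 48.200638, 48.298303, 48.14549, 48.188073],
    "Longitude": [13.846583, 16.426948, 14.291469, 14.888293, 16.353386],
})

# Assumed reference point for the center of Vienna
suspicious = flag_far_boroughs(sample, 48.2082, 16.3738)
print(suspicious[["Borough", "distance_km"]])
```

Rows flagged this way would need to be geocoded again (for example with a more specific query string) before the venue search is meaningful for them.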


In [81]:
venues_wien.shape

(566, 7)

In [82]:
venues_wien.groupby(["Borough"]).count()

Unnamed: 0_level_0,Borough Latitude,Borough Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Borough,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Alsergrund,51,51,51,51,51,51
Brigittenau,11,11,11,11,11,11
Donaustadt,5,5,5,5,5,5
Döbling,3,3,3,3,3,3
Favoriten,42,42,42,42,42,42
Floridsdorf,9,9,9,9,9,9
Hernals,2,2,2,2,2,2
Hietzing,2,2,2,2,2,2
Innere Stadt,29,29,29,29,29,29
Josefstadt,75,75,75,75,75,75


In [83]:
print("There are {} categories in Vienna".format(len(venues_wien["Venue Category"].unique())))

There are 145 categories in Vienna


In [84]:
wien_onehot=pd.get_dummies(venues_wien["Venue Category"],prefix="",prefix_sep="")
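# Note: this assignment aligns on the index of df_wien_merged (23 rows), so only the first 23 venue rows get a borough;
# venues_wien["Borough"] would label every venue row.
wien_onehot["Borough"]=df_wien_merged["Borough"]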
wien_onehot.head()

Unnamed: 0,Accessories Store,Afghan Restaurant,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Austrian Restaurant,BBQ Joint,Bakery,Bar,...,Tram Station,Trattoria/Osteria,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Vineyard,Wine Bar,Wine Shop,Winery,Borough
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Innere Stadt
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Leopoldstadt
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Landstraße
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Wieden
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Margareten
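
Assigning a column from another DataFrame aligns on the index, so the borough label should come from a table with one row per venue. Here is a small sketch with toy data (the venue rows are made up for illustration) that takes the label from the venue table itself and then computes the share of each category per borough.

```python
import pandas as pd

# Toy stand-in for venues_wien: one row per venue
venues = pd.DataFrame({
    "Borough": ["Innere Stadt", "Innere Stadt",
                "Leopoldstadt", "Leopoldstadt", "Leopoldstadt"],
    "Venue Category": ["Café", "Hotel", "Café", "Plaza", "Café"],
})

# One indicator column per category
onehot = pd.get_dummies(venues["Venue Category"], prefix="", prefix_sep="", dtype=int)

# Take the borough label from the same table, so every venue row keeps its borough
onehot.insert(0, "Borough", venues["Borough"])

# Mean of the indicators = share of each category within a borough
grouped = onehot.groupby("Borough").mean().reset_index()
print(grouped)
```

With this alignment the grouped table holds fractions between 0 and 1 for every borough that has at least one venue.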


In [85]:
wien_onehot.shape

(566, 146)

In [86]:
fixed_columns=[wien_onehot.columns[-1]]+list(wien_onehot.columns[:-1])

In [87]:
wien_onehot=wien_onehot[fixed_columns]
wien_onehot.head(3)

Unnamed: 0,Borough,Accessories Store,Afghan Restaurant,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Austrian Restaurant,BBQ Joint,Bakery,...,Train Station,Tram Station,Trattoria/Osteria,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Vineyard,Wine Bar,Wine Shop,Winery
0,Innere Stadt,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Leopoldstadt,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Landstraße,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [88]:
wien_onehot_grouped=wien_onehot.groupby(["Borough"]).mean().reset_index()

In [89]:
wien_onehot_grouped.head()


Unnamed: 0,Borough,Accessories Store,Afghan Restaurant,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Austrian Restaurant,BBQ Joint,Bakery,...,Train Station,Tram Station,Trattoria/Osteria,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Vineyard,Wine Bar,Wine Shop,Winery
0,Alsergrund,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Brigittenau,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Donaustadt,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Döbling,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Favoriten,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


#### Find the 10 most common venues in Vienna

In [90]:
num_top_venues_wien = 10

indicators_wien = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns_wien = ['Borough']
for r in np.arange(num_top_venues_wien):
    try:
        # 1st, 2nd, 3rd use the suffixes defined above for Vienna
        columns_wien.append('{}{} Most Common Venue'.format(r+1, indicators_wien[r]))
    except IndexError:
        # 4th and beyond
        columns_wien.append('{}th Most Common Venue'.format(r+1))

# create a new dataframe
borough_venues_sorted_wien = pd.DataFrame(columns=columns_wien)
borough_venues_sorted_wien['Borough'] = wien_onehot_grouped['Borough']
borough_venues_sorted_wien.head()



Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Alsergrund,,,,,,,,,,
1,Brigittenau,,,,,,,,,,
2,Donaustadt,,,,,,,,,,
3,Döbling,,,,,,,,,,
4,Favoriten,,,,,,,,,,


In [91]:
# Fill the empty dataframe with values
for s in np.arange(wien_onehot_grouped.shape[0]):
    borough_venues_sorted_wien.iloc[s, 1:] = return_most_common_venue(wien_onehot_grouped.iloc[s, :],
                                                                        num_top_venues_wien)

borough_venues_sorted_wien.head()

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Alsergrund,Café,Winery,Convenience Store,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore
1,Brigittenau,Restaurant,Food Court,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore,Doner Restaurant,Winery
2,Donaustadt,Café,Winery,Convenience Store,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore
3,Döbling,Japanese Restaurant,Winery,Drugstore,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Doner Restaurant
4,Favoriten,Hotel,Hungarian Restaurant,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore,Doner Restaurant
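
The helper `return_most_common_venue` is defined earlier in the notebook and is not shown in this section. As a rough sketch of what such a helper might do, the function below assumes that each row holds the borough name followed by numeric category frequencies. `top_categories` is a hypothetical name, and unlike the original it leaves out categories with zero frequency instead of filling the list with them.

```python
import pandas as pd

def top_categories(row, n):
    """Return up to n category names from a row, ordered by descending frequency.

    Assumes the first entry is the borough name and the remaining entries
    are numeric frequencies. Zero-frequency categories are dropped and
    the result is padded with None up to length n.
    """
    freqs = row.iloc[1:].astype(float)
    freqs = freqs[freqs > 0].sort_values(ascending=False, kind="mergesort")
    names = list(freqs.index[:n])
    return names + [None] * (n - len(names))

# Toy row with made-up frequencies
row = pd.Series({"Borough": "Toy Borough", "Bakery": 0.2, "Café": 0.5,
                 "Hotel": 0.0, "Plaza": 0.3})
print(top_categories(row, 3))
```

Dropping zero-frequency categories makes it visible when a borough has fewer than ten distinct venue types, rather than repeating the same filler categories across boroughs.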


In [92]:
borough_venues_sorted_wien.shape

(23, 11)

### 2.3 KMeans Clustering

In [93]:
from sklearn.cluster import KMeans

In [94]:
# add values "Density" and "Foreigner%" into df for KMeans analysis (Vienna)

wien_clustering=df_wien_merged.merge(wien_onehot_grouped, on="Borough")
wien_clustering.head()




Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%,Latitude,Longitude,Accessories Store,Afghan Restaurant,American Restaurant,...,Train Station,Tram Station,Trattoria/Osteria,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Vineyard,Wine Bar,Wine Shop,Winery
0,Innere Stadt,2.87,16465.0,5737.0,36.5,46.612454,13.846583,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Leopoldstadt,19.24,105003.0,5458.0,45.2,48.200638,16.426948,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Landstraße,7.4,90183.0,12187.0,41.4,48.298303,14.291469,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Wieden,1.78,33035.0,18559.0,42.2,48.14549,14.888293,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Margareten,2.01,55356.0,27540.0,48.3,48.188073,16.353386,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [95]:
# add values "Density" and "Foreigners%" into df for KMeans analysis (Munich)

muc_clustering=merge_muc_cleared.merge(muc_onehot_grouped, on="Borough")
muc_clustering.head()

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%,Latitude,Longitude,Wine Shop,Afghan Restaurant,Arcade,...,Theater,Theme Park Ride / Attraction,Tram Station,Trattoria/Osteria,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfall,Wine Bar
0,Altstadt-Lehel,3.15,21100.0,6708.0,26.1,48.136066,11.573455,0.01,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01
1,Ludwigsvorstadt-Isarvorstadt,4.4,51644.0,11734.0,28.4,48.1457,11.555926,0.0,0.020833,0.0,...,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0
2,Maxvorstadt,4.3,51402.0,11960.0,25.4,48.151146,11.562479,0.0,0.0,0.022727,...,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0
3,Schwabing-West,4.36,68527.0,15706.0,22.7,48.162867,11.569807,0.0,0.016667,0.0,...,0.0,0.0,0.016667,0.033333,0.0,0.0,0.0,0.066667,0.0,0.0
4,Au-Haidhausen,4.22,61356.0,14541.0,23.5,48.125872,11.580782,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0


In [96]:
wien_clustering=wien_clustering.drop(["Borough","Area","Inhabitant","Latitude","Longitude"],axis=1)


In [97]:
wien_clustering.head()

Unnamed: 0,Density,Foreigner%,Accessories Store,Afghan Restaurant,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Austrian Restaurant,BBQ Joint,...,Train Station,Tram Station,Trattoria/Osteria,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Vineyard,Wine Bar,Wine Shop,Winery
0,5737.0,36.5,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,5458.0,45.2,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,12187.0,41.4,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,18559.0,42.2,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,27540.0,48.3,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [98]:
muc_clustering=muc_clustering.drop(["Borough","Area","Inhabitant","Latitude","Longitude"],axis=1)
muc_clustering.shape

(25, 165)

In [99]:
muc_clustering.head()

Unnamed: 0,Density,Foreigner%,Wine Shop,Afghan Restaurant,Arcade,Argentinian Restaurant,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,...,Theater,Theme Park Ride / Attraction,Tram Station,Trattoria/Osteria,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfall,Wine Bar
0,6708.0,26.1,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01
1,11734.0,28.4,0.0,0.020833,0.0,0.0,0.0,0.0,0.083333,0.0,...,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0
2,11960.0,25.4,0.0,0.0,0.022727,0.0,0.022727,0.0,0.022727,0.0,...,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0
3,15706.0,22.7,0.0,0.016667,0.0,0.0,0.0,0.0,0.016667,0.0,...,0.0,0.0,0.016667,0.033333,0.0,0.0,0.0,0.066667,0.0,0.0
4,14541.0,23.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0


In [100]:
kclusters_wien = 5

# run k-means clustering
kmeans_wien = KMeans(n_clusters=kclusters_wien, random_state=0).fit(wien_clustering)

# check cluster labels generated for each row in the dataframe
kmeans_wien.labels_[0:10] 

array([0, 0, 3, 1, 4, 1, 1, 4, 3, 0], dtype=int32)

In [101]:
kclusters_muc = 5

# venue-only features (not used below; the model is fit on muc_clustering)
muc_venue_clustering = muc_onehot_grouped.drop('Borough', axis=1)

# run k-means clustering
kmeans_muc = KMeans(n_clusters=kclusters_muc, random_state=0).fit(muc_clustering)

# check cluster labels generated for each row in the dataframe
kmeans_muc.labels_[0:10] 

array([0, 1, 1, 4, 4, 1, 0, 4, 0, 3], dtype=int32)
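
Density is in the thousands while the venue frequencies lie between 0 and 1, so Euclidean distances in k-means are dominated by density unless the features are put on a comparable scale. The notebook fits on the raw values; the sketch below only illustrates z-score standardization with pandas as one possible alternative. The `Café` and `Waterfall` values are made up, and `standardize` is a hypothetical helper. The scaled frame could be passed to `KMeans(...).fit` in place of the raw one.

```python
import pandas as pd

def standardize(df):
    """Z-score each column; constant columns are left at zero."""
    std = df.std(ddof=0).replace(0, 1.0)  # avoid division by zero
    return (df - df.mean()) / std

features = pd.DataFrame({
    "Density": [6708.0, 11734.0, 11960.0, 15706.0],   # from the Munich table above
    "Foreigner%": [26.1, 28.4, 25.4, 22.7],           # from the Munich table above
    "Café": [0.05, 0.02, 0.09, 0.03],                 # illustrative values
    "Waterfall": [0.0, 0.0, 0.0, 0.0],                # a constant column
})

scaled = standardize(features)
print(scaled.round(2))
```

Whether to rescale is a modelling choice: after scaling, every venue category counts as much as density, which may or may not be what the comparison is after.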

In [102]:
# add clustering labels (inserted by position, so the row order must match the table KMeans was fit on)
borough_venues_sorted_wien.insert(0, 'Cluster Labels', kmeans_wien.labels_)


In [103]:
borough_venues_sorted_muc.insert(0, 'Cluster Labels', kmeans_muc.labels_)

In [104]:
borough_venues_sorted_muc.head()

Unnamed: 0,Cluster Labels,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,0,Allach-Untermenzing,Bus Stop,Greek Restaurant,Lake,Wine Bar,Event Space,Food & Drink Shop,Fish Market,Fast Food Restaurant,Farmers Market,Falafel Restaurant
1,1,Altstadt-Lehel,German Restaurant,Bavarian Restaurant,Café,Clothing Store,Church,Gourmet Shop,Hotel,Italian Restaurant,Coffee Shop,Cosmetics Shop
2,1,Au-Haidhausen,Café,Italian Restaurant,Hotel,Bakery,Food & Drink Shop,Brewery,Fast Food Restaurant,Beer Garden,Fair,Beach
3,4,Aubing-Lochhausen-Langwied,Rest Area,Miscellaneous Shop,Bus Stop,Supermarket,Light Rail Station,Lake,Bowling Alley,Park,Tunnel,Drugstore
4,4,Berg am Laim,Nightclub,Gym / Fitness Center,Supermarket,Coffee Shop,Beach Bar,Shipping Store,Restaurant,Burger Joint,Pub,Planetarium
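
`DataFrame.insert` attaches the label array by position, so it matches only if both tables list the boroughs in the same order. Here the clustering tables appear to follow the order of the merged borough tables, while the sorted-venue tables are ordered alphabetically by the group-by. A sketch of attaching the labels by borough name instead, with toy data and hypothetical variable names:

```python
import pandas as pd

# Row order used to fit the model (toy stand-in for the merged borough table)
fit_order = pd.DataFrame({"Borough": ["Innere Stadt", "Leopoldstadt", "Alsergrund"]})
labels = [2, 0, 1]  # stand-in for kmeans.labels_, one label per row of fit_order

# Venue table in alphabetical order, as produced by groupby
venues_sorted = pd.DataFrame({
    "Borough": ["Alsergrund", "Innere Stadt", "Leopoldstadt"],
    "1st Most Common Venue": ["Café", "Hotel", "Plaza"],
})

# Pair each label with its borough name, then join on the name
label_table = fit_order[["Borough"]].assign(**{"Cluster Labels": labels})
labelled = venues_sorted.merge(label_table, on="Borough", how="left")
print(labelled)
```

Joining on the name keeps each label with the borough it was computed for, whatever order the two tables happen to be in.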


In [105]:
merge_muc_cleared.head()

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%,Latitude,Longitude
0,Altstadt-Lehel,3.15,21100.0,6708.0,26.1,48.136066,11.573455
1,Ludwigsvorstadt-Isarvorstadt,4.4,51644.0,11734.0,28.4,48.1457,11.555926
2,Maxvorstadt,4.3,51402.0,11960.0,25.4,48.151146,11.562479
3,Schwabing-West,4.36,68527.0,15706.0,22.7,48.162867,11.569807
4,Au-Haidhausen,4.22,61356.0,14541.0,23.5,48.125872,11.580782


In [106]:
muc_merged=merge_muc_cleared.merge(borough_venues_sorted_muc,on="Borough")
muc_merged.head()

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Altstadt-Lehel,3.15,21100.0,6708.0,26.1,48.136066,11.573455,1,German Restaurant,Bavarian Restaurant,Café,Clothing Store,Church,Gourmet Shop,Hotel,Italian Restaurant,Coffee Shop,Cosmetics Shop
1,Ludwigsvorstadt-Isarvorstadt,4.4,51644.0,11734.0,28.4,48.1457,11.555926,3,Italian Restaurant,Asian Restaurant,Hotel,Middle Eastern Restaurant,Beer Garden,Gym / Fitness Center,Indian Restaurant,Indie Movie Theater,Restaurant,Salad Place
2,Maxvorstadt,4.3,51402.0,11960.0,25.4,48.151146,11.562479,3,Café,Restaurant,German Restaurant,Coffee Shop,Steakhouse,Vietnamese Restaurant,Burger Joint,Pub,Martial Arts Dojo,Peruvian Restaurant
3,Schwabing-West,4.36,68527.0,15706.0,22.7,48.162867,11.569807,3,Vietnamese Restaurant,Italian Restaurant,Bar,Indian Restaurant,Greek Restaurant,Thai Restaurant,Supermarket,Sushi Restaurant,Japanese Restaurant,Café
4,Au-Haidhausen,4.22,61356.0,14541.0,23.5,48.125872,11.580782,1,Café,Italian Restaurant,Hotel,Bakery,Food & Drink Shop,Brewery,Fast Food Restaurant,Beer Garden,Fair,Beach


In [107]:
muc_merged.shape

(25, 18)

In [108]:
# merge the dataframes of Vienna
wien_merged=df_wien_merged.merge(borough_venues_sorted_wien,on="Borough")
wien_merged.head()

Unnamed: 0,Borough,Area,Inhabitant,Density,Foreigner%,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Innere Stadt,2.87,16465.0,5737.0,36.5,46.612454,13.846583,3,Greek Restaurant,Winery,Food Court,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore,Doner Restaurant
1,Leopoldstadt,19.24,105003.0,5458.0,45.2,48.200638,16.426948,3,Plaza,Winery,Doner Restaurant,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore,Dive Bar
2,Landstraße,7.4,90183.0,12187.0,41.4,48.298303,14.291469,2,Lounge,Winery,French Restaurant,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore
3,Wieden,1.78,33035.0,18559.0,42.2,48.14549,14.888293,2,Hotel,Hungarian Restaurant,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore,Doner Restaurant
4,Margareten,2.01,55356.0,27540.0,48.3,48.188073,16.353386,2,Italian Restaurant,Winery,Drugstore,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Doner Restaurant


In [109]:
wien_merged.shape

(23, 18)

### Show the clustered boroughs of the two cities

In [110]:
wien_merged.loc[wien_merged["Cluster Labels"]==0,wien_merged.columns[[0,3,4] + list(range(8, wien_merged.shape[1]))]]

Unnamed: 0,Borough,Density,Foreigner%,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,Neubau,19998.0,38.8,Supermarket,Winery,Drugstore,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Doner Restaurant
7,Josefstadt,23420.0,38.9,Mexican Restaurant,Convenience Store,Food Court,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Winery
8,Alsergrund,14380.0,41.1,Café,Winery,Convenience Store,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore
15,Ottakring,12033.0,46.9,Café,Winery,Convenience Store,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore
19,Brigittenau,15213.0,50.1,Restaurant,Food Court,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore,Doner Restaurant,Winery


In [111]:
wien_merged.loc[wien_merged["Cluster Labels"]==1,wien_merged.columns[[0,3,4]+list(range(8,wien_merged.shape[1]))]]

Unnamed: 0,Borough,Density,Foreigner%,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Mariahilf,21825.0,40.3,Plaza,Winery,Doner Restaurant,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore,Dive Bar
16,Hernals,5020.0,43.7,Café,Winery,Convenience Store,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore
18,Döbling,2891.0,34.8,Japanese Restaurant,Winery,Drugstore,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Doner Restaurant
20,Floridsdorf,3571.0,33.2,Asian Restaurant,Winery,Drugstore,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Doner Restaurant


In [112]:
wien_merged.loc[wien_merged["Cluster Labels"]==2,wien_merged.columns[[0,3,4]+list(range(8,wien_merged.shape[1]))]]

Unnamed: 0,Borough,Density,Foreigner%,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Landstraße,12187.0,41.4,Lounge,Winery,French Restaurant,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore
3,Wieden,18559.0,42.2,Hotel,Hungarian Restaurant,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore,Doner Restaurant
4,Margareten,27540.0,48.3,Italian Restaurant,Winery,Drugstore,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Doner Restaurant
10,Simmering,4305.0,40.1,Gastropub,Winery,Food Court,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore,Doner Restaurant
13,Penzing,2735.0,35.2,Austrian Restaurant,Winery,Drugstore,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Doner Restaurant
17,Währing,8052.0,36.7,Gastropub,Winery,Food Court,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore,Doner Restaurant
22,Liesing,3151.0,27.9,Coffee Shop,Winery,Drugstore,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Doner Restaurant


In [113]:
wien_merged.loc[wien_merged["Cluster Labels"]==3,wien_merged.columns[[0,3,4]+list(range(8,wien_merged.shape[1]))]]

Unnamed: 0,Borough,Density,Foreigner%,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Innere Stadt,5737.0,36.5,Greek Restaurant,Winery,Food Court,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore,Doner Restaurant
1,Leopoldstadt,5458.0,45.2,Plaza,Winery,Doner Restaurant,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore,Dive Bar
11,Meidling,11846.0,45.7,Café,Winery,Convenience Store,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore
14,Rudolfsheim-Fünfhaus,20153.0,53.6,Multiplex,Winery,Convenience Store,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore
21,Donaustadt,1800.0,30.4,Café,Winery,Convenience Store,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore


In [114]:
wien_merged.loc[wien_merged["Cluster Labels"]==4,wien_merged.columns[[0,3,4]+list(range(8,wien_merged.shape[1]))]]

Unnamed: 0,Borough,Density,Foreigner%,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
9,Favoriten,6225.0,47.8,Hotel,Hungarian Restaurant,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore,Doner Restaurant
12,Hietzing,1436.0,28.7,Steakhouse,French Restaurant,Food & Drink Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Drugstore,Doner Restaurant


#### Clustered boroughs in Munich

In [115]:
muc_merged.loc[muc_merged["Cluster Labels"]==0,muc_merged.columns[[0,3,4]+list(range(8,muc_merged.shape[1]))]]

Unnamed: 0,Borough,Density,Foreigner%,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,Neuhausen-Nymphenburg,7651.0,24.3,German Restaurant,Supermarket,Drugstore,Bakery,Shipping Store,Italian Restaurant,Ice Cream Shop,Café,Pizza Place,Restaurant
11,Schwabing-Freimann,3036.0,29.3,German Restaurant,Italian Restaurant,Hotel,Bar,Park,Plaza,Tram Station,Art Museum,Asian Restaurant,Mexican Restaurant
22,Allach-Untermenzing,2159.0,24.2,Bus Stop,Greek Restaurant,Lake,Wine Bar,Event Space,Food & Drink Shop,Fish Market,Fast Food Restaurant,Farmers Market,Falafel Restaurant
23,Feldmoching-Hasenbergl,2135.0,32.4,Clothing Store,Supermarket,Pharmacy,Bus Stop,Fast Food Restaurant,Drugstore,Gas Station,Thai Restaurant,Fish Market,Farmers Market
24,Laim,10698.0,28.5,Hardware Store,Tram Station,Supermarket,Greek Restaurant,Wine Bar,English Restaurant,Fish Market,Fast Food Restaurant,Farmers Market,Falafel Restaurant


In [116]:
muc_merged.loc[muc_merged["Cluster Labels"]==1,muc_merged.columns[[0,3,4]+list(range(8,muc_merged.shape[1]))]]

Unnamed: 0,Borough,Density,Foreigner%,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Altstadt-Lehel,6708.0,26.1,German Restaurant,Bavarian Restaurant,Café,Clothing Store,Church,Gourmet Shop,Hotel,Italian Restaurant,Coffee Shop,Cosmetics Shop
4,Au-Haidhausen,14541.0,23.5,Café,Italian Restaurant,Hotel,Bakery,Food & Drink Shop,Brewery,Fast Food Restaurant,Beer Garden,Fair,Beach
12,Bogenhausen,3709.0,24.4,Supermarket,Italian Restaurant,Bus Stop,Drugstore,Gym,Greek Restaurant,German Restaurant,Gastropub,Electronics Store,Shopping Mall
15,Ramersdorf-Perlach,5847.0,33.9,Bakery,Plaza,Bus Stop,Metro Station,Supermarket,Drugstore,German Restaurant,Park,Greek Restaurant,Shipping Store
17,Untergiesing-Harlaching,6601.0,24.1,Italian Restaurant,Gastropub,Park,German Restaurant,Pizza Place,Asian Restaurant,Café,Brewery,Plaza,Laundromat


In [117]:
muc_merged.loc[muc_merged["Cluster Labels"]==2,muc_merged.columns[[0,3,4]+list(range(8,muc_merged.shape[1]))]]

Unnamed: 0,Borough,Density,Foreigner%,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,Sendling-Westpark,7632.0,28.9,Hardware Store,Tram Station,Supermarket,Greek Restaurant,Wine Bar,English Restaurant,Fish Market,Fast Food Restaurant,Farmers Market,Falafel Restaurant
9,Moosach,4888.0,31.5,Café,Hotel,Supermarket,German Restaurant,Gastropub,Italian Restaurant,Beer Garden,Trattoria/Osteria,Gym,Sushi Restaurant
10,Milbertshofen-Am Hart,5597.0,40.8,Hotel,Italian Restaurant,Burger Joint,Furniture / Home Store,Rental Car Location,Clothing Store,Kebab Restaurant,Gym,Bus Stop,Tram Station
14,Trudering-Riem,3261.0,23.3,Bus Stop,Tennis Court,Doner Restaurant,Supermarket,German Restaurant,Miscellaneous Shop,Food & Drink Shop,Fish Market,Fast Food Restaurant,Farmers Market
16,Obergiesing-Fasangarten,9485.0,31.1,Bakery,Plaza,Bus Stop,Metro Station,Supermarket,Drugstore,German Restaurant,Park,Greek Restaurant,Shipping Store
18,Thalkirchen-Obersendling,5445.0,27.4,Supermarket,Bakery,Spa,Laser Tag,Bus Stop,Restaurant,Tennis Court,Drugstore,Pharmacy,Gym


In [118]:
muc_merged.loc[muc_merged["Cluster Labels"]==3,muc_merged.columns[[0,3,4]+list(range(8,muc_merged.shape[1]))]]

Unnamed: 0,Borough,Density,Foreigner%,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Ludwigsvorstadt-Isarvorstadt,11734.0,28.4,Italian Restaurant,Asian Restaurant,Hotel,Middle Eastern Restaurant,Beer Garden,Gym / Fitness Center,Indian Restaurant,Indie Movie Theater,Restaurant,Salad Place
2,Maxvorstadt,11960.0,25.4,Café,Restaurant,German Restaurant,Coffee Shop,Steakhouse,Vietnamese Restaurant,Burger Joint,Pub,Martial Arts Dojo,Peruvian Restaurant
3,Schwabing-West,15706.0,22.7,Vietnamese Restaurant,Italian Restaurant,Bar,Indian Restaurant,Greek Restaurant,Thai Restaurant,Supermarket,Sushi Restaurant,Japanese Restaurant,Café
5,Sendling,10405.0,26.9,Hotel,Middle Eastern Restaurant,Italian Restaurant,Plaza,Café,Supermarket,Wine Bar,German Restaurant,Pizza Place,Park
7,Schwanthalerhöhe,14367.0,33.5,Italian Restaurant,Asian Restaurant,Hotel,Middle Eastern Restaurant,Beer Garden,Gym / Fitness Center,Indian Restaurant,Indie Movie Theater,Restaurant,Salad Place
20,Pasing-Obermenzing,4523.0,22.9,Supermarket,Italian Restaurant,Climbing Gym,Automotive Shop,Bank,Bakery,Sushi Restaurant,Light Rail Station,Thai Restaurant,Drugstore


In [119]:
muc_merged.loc[muc_merged["Cluster Labels"]==4,muc_merged.columns[[0,3,4]+list(range(8,muc_merged.shape[1]))]]

Unnamed: 0,Borough,Density,Foreigner%,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
13,Berg am Laim,7300.0,31.9,Nightclub,Gym / Fitness Center,Supermarket,Coffee Shop,Beach Bar,Shipping Store,Restaurant,Burger Joint,Pub,Planetarium
19,Hadern,5410.0,27.3,Bus Stop,Supermarket,Shop & Service,Drugstore,Music Store,Bakery,Wine Bar,Fish Market,Fast Food Restaurant,Farmers Market
21,Aubing-Lochhausen-Langwied,1404.0,28.4,Rest Area,Miscellaneous Shop,Bus Stop,Supermarket,Light Rail Station,Lake,Bowling Alley,Park,Tunnel,Drugstore
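
The ten cells above differ only in the city and the cluster label. As a sketch, a small helper can return the same column selection for any label. `cluster_view` is a hypothetical name, and the toy frame only mimics the layout of `muc_merged` (the `Area` values are made up).

```python
import pandas as pd

def cluster_view(merged, label):
    """Rows of one cluster with Borough, Density, Foreigner% and the venue columns."""
    venue_cols = [c for c in merged.columns if c.endswith("Most Common Venue")]
    cols = ["Borough", "Density", "Foreigner%"] + venue_cols
    return merged.loc[merged["Cluster Labels"] == label, cols]

toy = pd.DataFrame({
    "Borough": ["Hadern", "Laim", "Moosach"],
    "Area": [9.2, 5.3, 11.1],                      # illustrative values
    "Density": [5410.0, 10698.0, 4888.0],
    "Foreigner%": [27.3, 28.5, 31.5],
    "Cluster Labels": [4, 0, 2],
    "1st Most Common Venue": ["Bus Stop", "Hardware Store", "Café"],
})

# Show every cluster in turn
for label in sorted(toy["Cluster Labels"].unique()):
    print("Cluster", label)
    print(cluster_view(toy, label))
```

Selecting columns by name rather than by position also keeps the view correct if columns are added to the merged table later.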


### Make maps for a better visualization

In [2]:
!conda install -c conda-forge folium=0.5.0 --yes 
import folium # map rendering library
import json


In [120]:
import matplotlib.cm as cm
import matplotlib.colors as colors

In [123]:
# A map of Vienna and its boroughs

map_clusters_wien = folium.Map(location=[location_wien.latitude, location_wien.longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters_wien)
ys = [p + x + (p*x)**2 for p in range(kclusters_wien)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(p) for p in colors_array]

# add markers to the map
markers_colors = []
for lat1, lon1, bo1, cluster1 in zip(wien_merged['Latitude'], wien_merged['Longitude'], wien_merged['Borough'], 
                                  wien_merged['Cluster Labels']):
    label_wien = folium.Popup(str(bo1) + ' Cluster ' + str(cluster1), parse_html=True)
    folium.CircleMarker(
        [lat1, lon1],
        radius=5,
        popup=label_wien,
        color=rainbow[cluster1-1],
        fill=True,
        fill_color=rainbow[cluster1-1],
        fill_opacity=0.7).add_to(map_clusters_wien)
       
map_clusters_wien

In [122]:
# A map of Munich and its boroughs

map_clusters_muc = folium.Map(location=[location_muc.latitude, location_muc.longitude], zoom_start=10)


# add markers to the map
markers_colors1 = []
for lat2, lon2, bo2, cluster2 in zip(muc_merged['Latitude'], muc_merged['Longitude'], muc_merged['Borough'], 
                                  muc_merged['Cluster Labels']):
    label_muc = folium.Popup(str(bo2) + ' Cluster ' + str(cluster2), parse_html=True)
    folium.CircleMarker(
        [lat2, lon2],
        radius=5,
        popup=label_muc,
        color=rainbow[cluster2-1],
        fill=True,
        fill_color=rainbow[cluster2-1],
        fill_opacity=0.7).add_to(map_clusters_muc)
       
map_clusters_muc

## 3. Results

**Munich's** 25 boroughs have 163 categories of common venues, while **Vienna's** 23 boroughs have 145.

**Munich**: we have divided the 25 boroughs of Munich into five groups according to their common venues, their population densities and their proportions of foreigners.

We can see that the boroughs in the city center (Ludwigsvorstadt-Isarvorstadt, Maxvorstadt, Schwabing-West, Sendling, Schwanthalerhöhe) and the new center of the satellite city (Pasing-Obermenzing) are in the same group. These boroughs (except Pasing-Obermenzing) have a very high density (>10,000 people per km²) and a relatively low proportion of foreigners (below 30% in 5 of 6). A great diversity of restaurants and cafés makes up the 10 most common venues. (***green spots in the MUC-Map***)

Milbertshofen-Am Hart, Moosach, Thalkirchen-Obersendling, Trudering-Riem, Obergiesing-Fasangarten and Sendling-Westpark (***blue spots in the MUC-Map***) lie in the outer parts of Munich. Their densities are moderate to fairly high (about 3,300 to 9,500 people per km²), and with the highest proportion of foreigners (40.8%), Milbertshofen-Am Hart may be the most multicultural borough of Munich. The common venues are not fancy international restaurants but local food and drink shops, supermarkets, cafés and public transport stops.

Altstadt-Lehel is the traditional old town of Munich, and Bogenhausen, Au-Haidhausen and Untergiesing-Harlaching lie close around it; Ramersdorf-Perlach falls into the same group. With mostly moderate densities and mostly low proportions of international inhabitants, these boroughs are known for their bourgeois style: shopping malls, fancy gourmet shops, wine shops, parks and public squares. (***purple spots in the MUC-Map***)

Laim, Schwabing-Freimann and Neuhausen-Nymphenburg are in the same group as Allach-Untermenzing and Feldmoching-Hasenbergl, which surprised me quite a bit. The first three boroughs lie between the city center and the outer parts of Munich. Laim has the highest density in this group (>10,000 people per km²) because many companies are located there. The last two boroughs are somewhat farther from the city center and have common venues for everyday life. (***red spots in the MUC-Map***)

The last group consists of Hadern, Berg am Laim and Aubing-Lochhausen-Langwied, which lie in the wider Munich area. Besides venues for everyday life, people in Berg am Laim can also easily find nightclubs and fitness studios. (***orange spots in the MUC-Map***)

**Vienna**:
Vienna's boroughs can also be divided into five groups.

Neubau, Josefstadt, Alsergrund, Brigittenau and Ottakring constitute the city center of Vienna. They have a very high density (>12,000 people per km²) and a high proportion of international inhabitants (>38%). Fancy restaurants, wineries, farmers markets and event spaces are around every corner. (***red spots in the Vienna-Map***)

Mariahilf, which has the highest density, and Hernals, Floridsdorf and Döbling, whose density is moderate, form the second group. The common venues are quite similar to those of the previous group, and Asian restaurants are easier to find here. (***purple spots in the Vienna-Map***)

Liesing, Penzing, Simmering and Währing are suburban boroughs of Vienna, while Margareten, Landstraße and Wieden still belong to the center, so I am surprised by this group. The density of the last three is quite high (>12,000 people per km²), whereas that of the first three is quite low (around 4,000). Judging by their common venues, however, they are quite similar. (***blue spots in the Vienna-Map***)

Innere Stadt, Leopoldstadt, Meidling, Donaustadt and Rudolfsheim are in the same group. Rudolfsheim lies close to the city center and has a very high density (>20,000 people per km²); its most common venue is the multiplex. (***green spots in the Vienna-Map***)

Hietzing and Favoriten constitute the final group. Both lie in the suburbs of Vienna. In Favoriten, hotels are the easiest venue to find, and in Hietzing, steakhouses and French restaurants are the most common venues, even though Hietzing has the lowest density and the lowest proportion of international inhabitants of all the boroughs in Vienna.




# 4 Discussion

The KMeans analysis still has much room for improvement. Based on the common venues alone, we get only a rough profile of the boroughs in the two cities. Together with the density and the proportion of international inhabitants, the big picture can be depicted, but for a finer one we need to take other relevant parameters into account, such as the average real estate price in each borough, which would influence the results enormously.
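As a rough illustration of this point, the sketch below appends standardized borough-level attributes to the venue-frequency features before running KMeans. The function name `cluster_boroughs`, the column names and all values are hypothetical placeholders (random numbers, not the Wikipedia or Foursquare data), and the real estate price column stands in for data that this notebook does not contain.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def cluster_boroughs(venue_freq, extra, n_clusters=5, random_state=0):
    """Cluster boroughs on venue frequencies plus standardized extra columns."""
    # Put density, foreigner share, price, ... on a comparable scale.
    extra_scaled = pd.DataFrame(
        StandardScaler().fit_transform(extra),
        index=extra.index,
        columns=extra.columns,
    )
    # Keep only boroughs that appear in both tables.
    features = venue_freq.join(extra_scaled, how="inner")
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = model.fit_predict(features.to_numpy())
    return pd.Series(labels, index=features.index, name="cluster")


# Synthetic placeholder data: 12 made-up boroughs, 6 made-up venue categories.
rng = np.random.default_rng(0)
names = [f"borough_{i}" for i in range(12)]
venue_freq = pd.DataFrame(
    rng.dirichlet(np.ones(6), size=12),
    index=names,
    columns=[f"venue_{j}" for j in range(6)],
)
extra = pd.DataFrame(
    {
        "density": rng.uniform(2000, 20000, size=12),
        "foreigner_share": rng.uniform(0.15, 0.45, size=12),
        "price_per_m2": rng.uniform(4000, 10000, size=12),  # hypothetical column
    },
    index=names,
)

labels = cluster_boroughs(venue_freq, extra, n_clusters=3)
print(labels.value_counts())
```

Because venue frequencies lie between 0 and 1 while the standardized columns have unit variance, the extra attributes may dominate the distance computation. Down-weighting them or rescaling the venue features would be a reasonable next step to experiment with.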

Moreover, the categories obtained through the Foursquare API are very fine-grained. A factor analysis to reduce the number of dimensions would help to characterize the boroughs better.
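One possible way to try this is sketched below with scikit-learn's `FactorAnalysis`, which compresses the venue-category columns into a few latent factors before clustering. This was not run as part of the analysis above; the helper name `reduce_and_cluster`, the number of factors and the random input matrix are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import FactorAnalysis


def reduce_and_cluster(venue_freq, n_factors=3, n_clusters=5, random_state=0):
    """Reduce venue categories to latent factors, then cluster the factor scores."""
    fa = FactorAnalysis(n_components=n_factors, random_state=random_state)
    scores = fa.fit_transform(venue_freq.to_numpy())
    factor_df = pd.DataFrame(
        scores,
        index=venue_freq.index,
        columns=[f"factor_{k + 1}" for k in range(n_factors)],
    )
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    factor_df["cluster"] = kmeans.fit_predict(scores)
    return factor_df


# Synthetic placeholder data: 20 made-up boroughs, 15 made-up venue categories.
rng = np.random.default_rng(1)
venue_freq = pd.DataFrame(
    rng.random((20, 15)),
    index=[f"borough_{i}" for i in range(20)],
    columns=[f"category_{j}" for j in range(15)],
)

factor_df = reduce_and_cluster(venue_freq, n_factors=3, n_clusters=4)
print(factor_df.head())
```

The number of factors is a free choice here. In practice it would have to be justified, for example by comparing the model's log-likelihood (`FactorAnalysis.score`) across several values.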

# 5 Conclusion

To sum up, the huge diversity of restaurants in the city center of Munich is quite noticeable. Parks and public squares adorn its upscale residential areas. International inhabitants in Munich usually do not live in the city center but in the northern part of the city, while in Vienna the situation is the opposite. Overall, the proportion of foreigners in Vienna is around 10% higher than in Munich. Moreover, the city center of Vienna is twice as crowded as that of Munich. Vienna offers fewer kinds of restaurants than Munich, but you can find more event venues, wineries and farmers' markets.

So if you are a young scholar with a great interest in gourmet food and an upscale lifestyle, the center of Munich may be the better choice. And if you are a young international scholar, you may more easily find people from different cultures in the city center of Vienna.