# Capstone Project: Opening a Restaurant in Munich

### Note: This notebook is for the capstone project required by the IBM Data Science Specialization on Coursera.

#### Introduction

Munich is the capital of Bavaria, a state in Germany. It is the third biggest city in the country and the eleventh biggest city in Europe. With multiple high-value industries and millions of tourists each year, it has a vibrant economy.

The purpose of this project is to analyze the neighborhoods of Munich in order to determine 
possible locations for opening a restaurant. I intend to analyze relevant data and provide valuable insights.


#### Data

We will need the following data:
1.	District data of Munich which we can find at: https://www.muenchen.de/int/en/living/postal-codes.html
2.	Geographical coordinates of Munich and each of its neighborhoods
3.	Venue data for neighborhoods in Munich


I intend to do the following:
1. Import all necessary libraries
2. Find Munich's district data and read it into Jupyter
3. Create a Dataframe that presents District, Latitude and Longitude for each Postal Code
4. Define a function to fetch the most common venues in Munich from Foursquare
5. Cluster Districts according to their most common venue categories
6. Visualize the clusters on a map and analyze each cluster's most common venue categories
7. Determine which clusters are viable options for opening a restaurant, and which are not

#### Imports

In [1]:
# Imports
import json  # library to handle JSON files

import numpy as np  # library to handle data in a vectorized manner
import pandas as pd  # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import requests  # library to handle HTTP requests
from pandas import json_normalize  # transform JSON into a pandas dataframe

# Matplotlib modules used to color the clusters
import matplotlib.cm as cm
import matplotlib.colors as colors

# k-means for the clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim  # convert an address into latitude and longitude values

!conda install -c conda-forge folium=0.5.0 --yes
import folium  # map rendering library

!pip install geocoder
import geocoder

print('Libraries imported.')

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.




#### Importing Dataset and creating Dataframe

In [2]:
df = pd.read_html('https://www.muenchen.de/int/en/living/postal-codes.html')[0] #import data
df.head()

Unnamed: 0,District,Postal Code
0,Allach-Untermenzing,"80995, 80997, 80999, 81247, 81249"
1,Altstadt-Lehel,"80331, 80333, 80335, 80336, 80469, 80538, 80539"
2,Au-Haidhausen,"81541, 81543, 81667, 81669, 81671, 81675, 81677"
3,Aubing-Lochhausen-Langwied,"81243, 81245, 81249"
4,Berg am Laim,"81671, 81673, 81735, 81825"


In [3]:
# Split places according to their Postal Code

items = []
for district, codes in zip(df['District'], df['Postal Code']):
    for code in codes.split(','):
        items.append({'District': district, 'Postal Code': code.strip()})

# Build the dataframe in one step; the column order matches the output below
df_cleaned = pd.DataFrame(items, columns=['Postal Code', 'District'])
print('Munich presents {} districts'.format(df_cleaned['District'].nunique()))
df_cleaned.head()

Munich presents 25 districts


Unnamed: 0,Postal Code,District
0,80995,Allach-Untermenzing
1,80997,Allach-Untermenzing
2,80999,Allach-Untermenzing
3,81247,Allach-Untermenzing
4,81249,Allach-Untermenzing
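
The same split can also be sketched without an explicit loop, using `str.split` followed by `DataFrame.explode`. The toy frame below only mirrors the layout of the scraped table (two districts copied from the output above); the names `raw` and `exploded` are illustrative and not part of the notebook.

```python
import pandas as pd

# Toy frame mirroring the layout of the scraped table.
raw = pd.DataFrame({
    'District': ['Allach-Untermenzing', 'Aubing-Lochhausen-Langwied'],
    'Postal Code': ['80995, 80997, 80999, 81247, 81249', '81243, 81245, 81249'],
})

# Split on commas, give each code its own row, then trim whitespace.
exploded = (
    raw.assign(**{'Postal Code': raw['Postal Code'].str.split(',')})
       .explode('Postal Code')
       .reset_index(drop=True)
)
exploded['Postal Code'] = exploded['Postal Code'].str.strip()

print(exploded)
```

Note that a postal code such as 81249 can belong to more than one district, so the resulting rows are district/postal-code pairs rather than unique postal codes.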


In [4]:
df_districts = pd.DataFrame(df_cleaned['District'].unique())
df_districts

Unnamed: 0,0
0,Allach-Untermenzing
1,Altstadt-Lehel
2,Au-Haidhausen
3,Aubing-Lochhausen-Langwied
4,Berg am Laim
5,Bogenhausen
6,Feldmoching-Hasenbergl
7,Hadern
8,Laim
9,Ludwigsvorstadt-Isarvorstadt


In [5]:
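# Foursquare credentials
# Read the keys from environment variables instead of hard-coding them in the notebook
# (the variable names below are illustrative)
import os

CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID']
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET']
VERSION = '20201206'
LIMIT = 100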

In [6]:
# Create new dataframe with latitude and longitude
# Reuse one geocoder for every lookup instead of creating it inside the loop
geolocator = Nominatim(user_agent="mu_explorer")

# Loop over the data and collect one row per postal code
items = []
for district, code in zip(df_cleaned['District'], df_cleaned['Postal Code']):
    address = district + ', ' + code
    location = geolocator.geocode(address)
    # geocode() returns None when nothing matches, so guard before reading coordinates
    items.append({'District': district,
                  'Postal Code': code,
                  'Latitude': location.latitude if location else np.nan,
                  'Longitude': location.longitude if location else np.nan})

df_coor = pd.DataFrame(items, columns=['District', 'Postal Code', 'Latitude', 'Longitude'])
print(df_coor.shape)
df_coor.head()

(127, 4)


Unnamed: 0,District,Postal Code,Latitude,Longitude
0,Allach-Untermenzing,80995,48.195157,11.462973
1,Allach-Untermenzing,80997,48.195157,11.462973
2,Allach-Untermenzing,80999,48.195157,11.462973
3,Allach-Untermenzing,81247,48.195157,11.462973
4,Allach-Untermenzing,81249,48.195157,11.462973
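
In the rows shown above, every postal code of Allach-Untermenzing resolves to the same coordinates. If that holds for the other districts too (an assumption, not verified here), one lookup per distinct address would be enough. The sketch below illustrates that idea with a small cache; `geocode_unique` and `fake_geocode` are hypothetical helpers, and the stand-in geocoder replaces Nominatim so the example runs offline.

```python
from typing import Callable, Dict, Optional, Tuple

Coords = Optional[Tuple[float, float]]

def geocode_unique(addresses, geocode: Callable[[str], Coords]) -> Dict[str, Coords]:
    """Call `geocode` once per distinct address and reuse the result for repeats."""
    cache: Dict[str, Coords] = {}
    for address in addresses:
        if address not in cache:
            cache[address] = geocode(address)
    return cache

# Stand-in geocoder so the sketch runs offline; a real run would wrap Nominatim instead.
calls = []
def fake_geocode(address: str) -> Coords:
    calls.append(address)
    known = {'Allach-Untermenzing': (48.195157, 11.462973)}
    return known.get(address)  # None when the address is unknown

districts = ['Allach-Untermenzing'] * 5 + ['Unknown district']
coords = geocode_unique(districts, fake_geocode)
print(coords)
print(len(calls), 'lookups for', len(districts), 'rows')
```

Fewer lookups also make it easier to stay within the rate limits of a public geocoding service.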


In [7]:
# Districts and postal codes
df = df_coor

# Each row is one district/postal-code pair, so the second number counts rows
print('The dataframe contains {} districts and {} district/postal-code rows'.format(
    df['District'].nunique(), len(df)))

The dataframe contains 25 districts and 127 district/postal-code rows


#### Visualizing data

In [8]:
# Munich Latitude and Longitude

address = 'Munich, DE'
geolocator = Nominatim(user_agent = "mu_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Munich are {}, {}'.format(latitude, longitude))

The geographical coordinates of Munich are 48.1371079, 11.5753822


In [9]:
# Creating a map of Munich with Districts superimposed on top

map_munich = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, district in zip(df['Latitude'], df['Longitude'], df['District']):
    label = folium.Popup(str(district), parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_munich)
    
map_munich

#### District Exploration

In [10]:
# Now we will explore each District in Munich. For that, we will define a function

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    """Query Foursquare for venues within `radius` meters of each point."""
    url = 'https://api.foursquare.com/v2/venues/explore'
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        # build the query parameters for the API request
        params = {'client_id': CLIENT_ID,
                  'client_secret': CLIENT_SECRET,
                  'v': VERSION,
                  'll': '{},{}'.format(lat, lng),
                  'radius': radius,
                  'limit': LIMIT}

        # make the GET request
        results = requests.get(url, params=params).json()['response']['groups'][0]['items']

        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name, lat, lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame(
        [item for venue_list in venues_list for item in venue_list],
        columns=['District', 'Latitude', 'Longitude', 'Venue',
                 'Venue Latitude', 'Venue Longitude', 'Venue Category'])

    return nearby_venues
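
The function above indexes straight into the response, so a reply without groups or a venue without categories would raise an error. As a minimal sketch of a more defensive parsing step, the hypothetical helper `extract_venues` below walks the same keys but tolerates missing pieces. The `sample` payload is hand-written for illustration and contains only the keys the notebook reads; it is not a real API response.

```python
def extract_venues(payload):
    """Return (name, lat, lng, category) tuples from an explore-style payload."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories', [])
        # Some venues may carry no category; record None instead of raising IndexError.
        category = categories[0]['name'] if categories else None
        rows.append((venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows

# Hand-written payload for illustration; only the keys used by getNearbyVenues are included.
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Bäckerei Schuhmair',
               'location': {'lat': 48.197175, 'lng': 11.459016},
               'categories': [{'name': 'Bakery'}]}},
    {'venue': {'name': 'Hypothetical venue',
               'location': {'lat': 48.19, 'lng': 11.46},
               'categories': []}},
]}]}}

print(extract_venues(sample))
print(extract_venues({}))
```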

In [11]:
# Munich District venues

munich_venues = getNearbyVenues(names=df['District'],
                                   latitudes=df['Latitude'],
                                   longitudes=df['Longitude']
                                  )

Allach-Untermenzing
Allach-Untermenzing
Allach-Untermenzing
Allach-Untermenzing
Allach-Untermenzing
Altstadt-Lehel
Altstadt-Lehel
Altstadt-Lehel
Altstadt-Lehel
Altstadt-Lehel
Altstadt-Lehel
Altstadt-Lehel
Au-Haidhausen
Au-Haidhausen
Au-Haidhausen
Au-Haidhausen
Au-Haidhausen
Au-Haidhausen
Au-Haidhausen
Aubing-Lochhausen-Langwied
Aubing-Lochhausen-Langwied
Aubing-Lochhausen-Langwied
Berg am Laim
Berg am Laim
Berg am Laim
Berg am Laim
Bogenhausen
Bogenhausen
Bogenhausen
Bogenhausen
Bogenhausen
Bogenhausen
Feldmoching-Hasenbergl
Feldmoching-Hasenbergl
Feldmoching-Hasenbergl
Hadern
Hadern
Hadern
Laim
Laim
Laim
Ludwigsvorstadt-Isarvorstadt
Ludwigsvorstadt-Isarvorstadt
Ludwigsvorstadt-Isarvorstadt
Ludwigsvorstadt-Isarvorstadt
Maxvorstadt
Maxvorstadt
Maxvorstadt
Maxvorstadt
Maxvorstadt
Maxvorstadt
Maxvorstadt
Maxvorstadt
Maxvorstadt
Milbertshofen-Am Hart
Milbertshofen-Am Hart
Milbertshofen-Am Hart
Milbertshofen-Am Hart
Moosach
Moosach
Moosach
Moosach
Moosach
Neuhausen-Nymphenburg
Neuhausen-Nym

In [12]:
# Dataframe size
print(munich_venues.shape)
munich_venues.head()

(3387, 7)


Unnamed: 0,District,Latitude,Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Allach-Untermenzing,48.195157,11.462973,Bäckerei Schuhmair,48.197175,11.459016,Bakery
1,Allach-Untermenzing,48.195157,11.462973,Sport Bittl,48.191447,11.466553,Sporting Goods Shop
2,Allach-Untermenzing,48.195157,11.462973,dm-drogerie markt,48.194118,11.46564,Drugstore
3,Allach-Untermenzing,48.195157,11.462973,Sicilia,48.193331,11.459387,Italian Restaurant
4,Allach-Untermenzing,48.195157,11.462973,Lidl,48.194428,11.465612,Supermarket


In [13]:
# Venues per District
munich_venues.groupby('District').count()

District,Latitude,Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Allach-Untermenzing,40,40,40,40,40,40
Altstadt-Lehel,700,700,700,700,700,700
Au-Haidhausen,266,266,266,266,266,266
Berg am Laim,29,29,29,29,29,29
Bogenhausen,72,72,72,72,72,72
Feldmoching-Hasenbergl,6,6,6,6,6,6
Hadern,33,33,33,33,33,33
Laim,63,63,63,63,63,63
Ludwigsvorstadt-Isarvorstadt,400,400,400,400,400,400
Maxvorstadt,387,387,387,387,387,387


In [14]:
# Unique categories
print('There are {} unique venue categories'.format(munich_venues['Venue Category'].nunique()))

There are 182 unique venue categories
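
Since the goal is to judge locations for a restaurant, one quantity that may be useful is the share of venues in each district whose category is a restaurant. The sketch below is only an illustration on the five rows shown in the `munich_venues` output above; matching on the substring "Restaurant" is an assumption and would miss categories such as "Pizza Place" or "Burger Joint".

```python
import pandas as pd

# The five venue rows shown in the munich_venues output above.
venues = pd.DataFrame({
    'District': ['Allach-Untermenzing'] * 5,
    'Venue Category': ['Bakery', 'Sporting Goods Shop', 'Drugstore',
                       'Italian Restaurant', 'Supermarket'],
})

# Flag categories that contain the word "Restaurant", then average per district.
is_restaurant = venues['Venue Category'].str.contains('Restaurant')
share = is_restaurant.groupby(venues['District']).mean()

print(share)
```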


#### Analyzing each District

In [15]:
# One hot encoding
munich_onehot = pd.get_dummies(munich_venues[['Venue Category']], prefix="", prefix_sep="")

# Adding District column to munich_onehot
munich_onehot.insert(0, 'District', munich_venues['District'])
print(munich_onehot.shape)
munich_onehot.head()

(3387, 183)


Unnamed: 0,District,ATM,Afghan Restaurant,American Restaurant,Arcade,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,Austrian Restaurant,Auto Dealership,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Bavarian Restaurant,Beach,Beach Bar,Beer Garden,Beer Store,Big Box Store,Bistro,Board Shop,Bookstore,Boutique,Breakfast Spot,Brewery,Burger Joint,Burrito Place,Bus Line,Bus Stop,Butcher,Cafeteria,Café,Candy Store,Chinese Restaurant,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,Comic Shop,Community Center,Concert Hall,Convenience Store,Cosmetics Shop,Costume Shop,Cultural Center,Cupcake Shop,Currywurst Joint,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Doner Restaurant,Drugstore,Eastern European Restaurant,Electronics Store,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish Market,Flower Shop,Food,Food & Drink Shop,Food Court,Fountain,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Gourmet Shop,Greek Restaurant,Grilled Meat Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Hardware Store,Hawaiian Restaurant,Hill,Historic Site,Hookah Bar,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Kebab Restaurant,Lake,Laundry Service,Light Rail Station,Liquor Store,Lounge,Manti Place,Market,Martial Arts School,Men's Store,Metro Station,Middle Eastern Restaurant,Mobile Phone Shop,Modern European Restaurant,Motel,Motorcycle Shop,Movie Theater,Museum,Music Venue,Nightclub,Opera House,Optical Shop,Organic Grocery,Outdoor Sculpture,Paper / Office Supplies Store,Park,Pastry Shop,Pedestrian Plaza,Peruvian Restaurant,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Pool,Post Office,Pub,Ramen Restaurant,Record Shop,Restaurant,River,Rock Club,Salad Place,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shipping Store,Shopping Mall,Smoke Shop,Snack Place,Soccer Field,Soup Place,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Steakhouse,Strip Club,Supermarket,Sushi Restaurant,Tapas Restaurant,Taverna,Tea Room,Thai Restaurant,Theater,Tiki Bar,Trail,Tram Station,Trattoria/Osteria,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Water Park,Wine Bar,Wine Shop,Yoga Studio
0,Allach-Untermenzing,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Allach-Untermenzing,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Allach-Untermenzing,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Allach-Untermenzing,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Allach-Untermenzing,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [16]:
# Let's group by District and take the mean of the frequency of each category
munich_grouped = munich_onehot.groupby('District').mean().reset_index()
print(munich_grouped.shape)
munich_grouped.head()

(24, 183)


Unnamed: 0,District,ATM,Afghan Restaurant,American Restaurant,Arcade,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,Austrian Restaurant,Auto Dealership,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Bavarian Restaurant,Beach,Beach Bar,Beer Garden,Beer Store,Big Box Store,Bistro,Board Shop,Bookstore,Boutique,Breakfast Spot,Brewery,Burger Joint,Burrito Place,Bus Line,Bus Stop,Butcher,Cafeteria,Café,Candy Store,Chinese Restaurant,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,Comic Shop,Community Center,Concert Hall,Convenience Store,Cosmetics Shop,Costume Shop,Cultural Center,Cupcake Shop,Currywurst Joint,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Doner Restaurant,Drugstore,Eastern European Restaurant,Electronics Store,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish Market,Flower Shop,Food,Food & Drink Shop,Food Court,Fountain,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Gourmet Shop,Greek Restaurant,Grilled Meat Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Hardware Store,Hawaiian Restaurant,Hill,Historic Site,Hookah Bar,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Kebab Restaurant,Lake,Laundry Service,Light Rail Station,Liquor Store,Lounge,Manti Place,Market,Martial Arts School,Men's Store,Metro Station,Middle Eastern Restaurant,Mobile Phone Shop,Modern European Restaurant,Motel,Motorcycle Shop,Movie Theater,Museum,Music Venue,Nightclub,Opera House,Optical Shop,Organic Grocery,Outdoor Sculpture,Paper / Office Supplies Store,Park,Pastry Shop,Pedestrian Plaza,Peruvian Restaurant,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Pool,Post Office,Pub,Ramen Restaurant,Record Shop,Restaurant,River,Rock Club,Salad Place,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shipping Store,Shopping Mall,Smoke Shop,Snack Place,Soccer Field,Soup Place,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Steakhouse,Strip Club,Supermarket,Sushi Restaurant,Tapas Restaurant,Taverna,Tea Room,Thai Restaurant,Theater,Tiki Bar,Trail,Tram Station,Trattoria/Osteria,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Water Park,Wine Bar,Wine Shop,Yoga Studio
0,Allach-Untermenzing,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Altstadt-Lehel,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.07,0.01,0.0,0.03,0.0,0.03,0.03,0.04,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.05,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.05,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.07,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.01,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0
2,Au-Haidhausen,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.026316,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.052632,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.026316,0.026316,0.026316,0.0,0.0,0.0,0.0,0.078947,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.026316,0.0,0.026316,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.026316,0.026316,0.0,0.0,0.0,0.0,0.026316,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.026316,0.0,0.026316,0.026316,0.0,0.0,0.0,0.0,0.0
3,Berg am Laim,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.103448,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.103448,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.137931,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.103448,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.103448,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.103448,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.241379,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Bogenhausen,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.138889,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.138889,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.097222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.069444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.069444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013889,0.0,0.0,0.0
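
To make the meaning of these numbers concrete: averaging one-hot columns per district gives the fraction of that district's venues falling in each category, so every row sums to 1. The toy example below (made-up districts "A" and "B", not notebook data) shows this, and also that `pd.crosstab` with `normalize='index'` produces the same table in one call.

```python
import pandas as pd

# Made-up venues for two districts.
toy = pd.DataFrame({
    'District': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Bakery', 'Bakery', 'Café', 'Hotel', 'Café', 'Café'],
})

# One-hot encode the categories, then average per district.
onehot = pd.get_dummies(toy['Venue Category'], dtype=float)
onehot.insert(0, 'District', toy['District'])
grouped = onehot.groupby('District').mean()

# Same frequencies computed directly as row-normalized counts.
shares = pd.crosstab(toy['District'], toy['Venue Category'], normalize='index')

print(grouped)
print(shares)
```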


In [17]:
# Each District top 5 venues
num_top_venues = 5

for hood in munich_grouped['District']:
    print('----'+hood+'----')
    temp = munich_grouped[munich_grouped['District'] == hood].T.reset_index()
    temp.columns = ['Venue', 'Frequency']
    temp = temp.iloc[1:]
    temp['Frequency'] = temp['Frequency'].astype(float)
    temp = temp.round({'Frequency':2})
    print(temp.sort_values('Frequency', ascending = False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Allach-Untermenzing----
                Venue  Frequency
0         Supermarket       0.25
1           Drugstore       0.25
2              Bakery       0.12
3  Italian Restaurant       0.12
4     Automotive Shop       0.12


----Altstadt-Lehel----
                 Venue  Frequency
0  Bavarian Restaurant       0.08
1                 Café       0.07
2                Plaza       0.07
3                Hotel       0.05
4    German Restaurant       0.05


----Au-Haidhausen----
                Venue  Frequency
0  Italian Restaurant       0.08
1         Coffee Shop       0.05
2     Thai Restaurant       0.05
3        Concert Hall       0.05
4   French Restaurant       0.05


----Berg am Laim----
           Venue  Frequency
0    Supermarket       0.24
1      Drugstore       0.14
2  Metro Station       0.10
3      Gastropub       0.10
4           Café       0.10


----Bogenhausen----
                Venue  Frequency
0            Bus Stop       0.22
1              Bakery       0.14
2          

In [18]:
# Let's define a function to sort the venues in descending order
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:] 
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
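
As a quick illustration of what this function returns, the sketch below repeats its definition so the block runs on its own and applies it to a single hand-built row. The three frequencies are the rounded Berg am Laim values printed above; the `row` variable is illustrative.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # Skip the first entry (the district name) and rank the category frequencies.
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# One row shaped like munich_grouped: district name first, then category frequencies.
row = pd.Series({'District': 'Berg am Laim',
                 'Café': 0.10,
                 'Drugstore': 0.14,
                 'Supermarket': 0.24})

top = return_most_common_venues(row, 2)
print(list(top))
```

When several categories share the same frequency, their relative order depends on the sort algorithm, which may explain why tied categories can appear in a different order between two outputs. Passing `kind='mergesort'` to `sort_values` requests a stable sort if a reproducible order matters.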

In [19]:
# Now let's create a Dataframe to show the top 10 venues of each District

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['District']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
district_venues_sorted = pd.DataFrame(columns=columns)
district_venues_sorted['District'] = munich_grouped['District']

for ind in np.arange(munich_grouped.shape[0]):
    district_venues_sorted.iloc[ind, 1:] = return_most_common_venues(munich_grouped.iloc[ind, :], num_top_venues)

district_venues_sorted.head()

Unnamed: 0,District,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Allach-Untermenzing,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
1,Altstadt-Lehel,Bavarian Restaurant,Café,Plaza,Hotel,German Restaurant,Restaurant,Coffee Shop,Cocktail Bar,Church,Clothing Store
2,Au-Haidhausen,Italian Restaurant,Concert Hall,Coffee Shop,French Restaurant,Thai Restaurant,Gourmet Shop,Doner Restaurant,Bistro,Movie Theater,Rock Club
3,Berg am Laim,Supermarket,Drugstore,Café,Metro Station,Hotel,Gastropub,Bakery,Cafeteria,Light Rail Station,Eastern European Restaurant
4,Bogenhausen,Bus Stop,Drugstore,Bakery,Italian Restaurant,Greek Restaurant,Park,Bank,Supermarket,Pharmacy,Water Park


#### Cluster Districts

In [20]:
# It is time to cluster Districts and see how they are similar and different from one another

In [21]:
# set number of clusters
kclusters = 5
munich_grouped_clustering = munich_grouped.drop(columns='District')
# n_init=10 is stated explicitly so the result does not depend on the library's default
kmeans = KMeans(n_clusters=kclusters, n_init=10, random_state=0).fit(munich_grouped_clustering)

# Check labels generated for each row in the Dataframe
kmeans.labels_

array([3, 0, 0, 3, 2, 1, 2, 2, 0, 0, 2, 3, 0, 0, 3, 2, 4, 0, 0, 2, 4, 2,
       3, 2])
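
The notebook fixes the number of clusters at 5. One common way to sanity-check such a choice is to compare silhouette scores across several values of k. The sketch below is not part of the original analysis: it runs on synthetic points standing in for `munich_grouped_clustering`, and the candidate range for k is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)

# Synthetic stand-in for munich_grouped_clustering: 24 rows in three tight, well-separated blobs.
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.1 * rng.standard_normal((8, 2)) for c in centers])

# Fit k-means for each candidate k and record the mean silhouette score.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores)
print('best k:', best_k)
```

On the real frequency table the scores would likely be much less clear-cut, so this is a guide for comparison rather than a rule for picking k.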

In [22]:
# Add clusters to district_venues_sorted DataFrame
district_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
district_venues_sorted.head()

Unnamed: 0,Cluster Labels,District,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,3,Allach-Untermenzing,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
1,0,Altstadt-Lehel,Bavarian Restaurant,Café,Plaza,Hotel,German Restaurant,Restaurant,Coffee Shop,Cocktail Bar,Church,Clothing Store
2,0,Au-Haidhausen,Italian Restaurant,Concert Hall,Coffee Shop,French Restaurant,Thai Restaurant,Gourmet Shop,Doner Restaurant,Bistro,Movie Theater,Rock Club
3,3,Berg am Laim,Supermarket,Drugstore,Café,Metro Station,Hotel,Gastropub,Bakery,Cafeteria,Light Rail Station,Eastern European Restaurant
4,2,Bogenhausen,Bus Stop,Drugstore,Bakery,Italian Restaurant,Greek Restaurant,Park,Bank,Supermarket,Pharmacy,Water Park


In [23]:
# Adding Latitude and Longitude
munich = df_coor.join(district_venues_sorted.set_index('District'), on='District')

print(munich.shape)
munich.head()

(127, 15)


Unnamed: 0,District,Postal Code,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Allach-Untermenzing,80995,48.195157,11.462973,3.0,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
1,Allach-Untermenzing,80997,48.195157,11.462973,3.0,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
2,Allach-Untermenzing,80999,48.195157,11.462973,3.0,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
3,Allach-Untermenzing,81247,48.195157,11.462973,3.0,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
4,Allach-Untermenzing,81249,48.195157,11.462973,3.0,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market


In [24]:
# I got NaN for District Aubing-Lochhausen-Langwied, which only represents 3 rows of the munich Dataframe
# (no venues were returned for it, so it is missing from munich_grouped)
# Since there are only 3 rows, I discard them with dropna

munich.dropna(subset=['Cluster Labels'], axis=0, inplace=True)
print(munich.shape) # now we have 124 rows instead of 127 --> we successfully dropped the 3 NaN rows
munich.head()

(124, 15)


Unnamed: 0,District,Postal Code,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Allach-Untermenzing,80995,48.195157,11.462973,3.0,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
1,Allach-Untermenzing,80997,48.195157,11.462973,3.0,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
2,Allach-Untermenzing,80999,48.195157,11.462973,3.0,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
3,Allach-Untermenzing,81247,48.195157,11.462973,3.0,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
4,Allach-Untermenzing,81249,48.195157,11.462973,3.0,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market


In [25]:
# Convert column 'Cluster Labels' from float to integer
munich['Cluster Labels'] = munich['Cluster Labels'].astype(int)

In [26]:
# Let's visualize the clusters

# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(munich['Latitude'], munich['Longitude'], munich['District'], munich['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],  # labels run from 0 to kclusters - 1
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)

       
map_clusters

#### Examining each Cluster

In [27]:
# Cluster 0
cluster0 = munich.loc[munich['Cluster Labels'] == 0, munich.columns[[0] + list(range(5,15))]]
cluster0

Unnamed: 0,District,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Altstadt-Lehel,Bavarian Restaurant,Café,Plaza,Hotel,German Restaurant,Restaurant,Coffee Shop,Cocktail Bar,Church,Clothing Store
6,Altstadt-Lehel,Bavarian Restaurant,Café,Plaza,Hotel,German Restaurant,Restaurant,Coffee Shop,Cocktail Bar,Church,Clothing Store
7,Altstadt-Lehel,Bavarian Restaurant,Café,Plaza,Hotel,German Restaurant,Restaurant,Coffee Shop,Cocktail Bar,Church,Clothing Store
8,Altstadt-Lehel,Bavarian Restaurant,Café,Plaza,Hotel,German Restaurant,Restaurant,Coffee Shop,Cocktail Bar,Church,Clothing Store
9,Altstadt-Lehel,Bavarian Restaurant,Café,Plaza,Hotel,German Restaurant,Restaurant,Coffee Shop,Cocktail Bar,Church,Clothing Store
10,Altstadt-Lehel,Bavarian Restaurant,Café,Plaza,Hotel,German Restaurant,Restaurant,Coffee Shop,Cocktail Bar,Church,Clothing Store
11,Altstadt-Lehel,Bavarian Restaurant,Café,Plaza,Hotel,German Restaurant,Restaurant,Coffee Shop,Cocktail Bar,Church,Clothing Store
12,Au-Haidhausen,Italian Restaurant,Concert Hall,Coffee Shop,French Restaurant,Thai Restaurant,Gourmet Shop,Doner Restaurant,Bistro,Movie Theater,Rock Club
13,Au-Haidhausen,Italian Restaurant,Concert Hall,Coffee Shop,French Restaurant,Thai Restaurant,Gourmet Shop,Doner Restaurant,Bistro,Movie Theater,Rock Club
14,Au-Haidhausen,Italian Restaurant,Concert Hall,Coffee Shop,French Restaurant,Thai Restaurant,Gourmet Shop,Doner Restaurant,Bistro,Movie Theater,Rock Club


In [28]:
# Cluster 0 Value Counts
cluster0['1st Most Common Venue'].value_counts()

Café                     15
Vietnamese Restaurant     8
Bavarian Restaurant       7
Italian Restaurant        7
Bakery                    5
Park                      4
Name: 1st Most Common Venue, dtype: int64

In [29]:
# Cluster 1
cluster1 = munich.loc[munich['Cluster Labels'] == 1, munich.columns[[0] + list(range(5,15))]]
cluster1

Unnamed: 0,District,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
32,Feldmoching-Hasenbergl,Motorcycle Shop,Greek Restaurant,Yoga Studio,Doner Restaurant,Fish Market,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store
33,Feldmoching-Hasenbergl,Motorcycle Shop,Greek Restaurant,Yoga Studio,Doner Restaurant,Fish Market,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store
34,Feldmoching-Hasenbergl,Motorcycle Shop,Greek Restaurant,Yoga Studio,Doner Restaurant,Fish Market,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store


In [36]:
# Cluster 1 Value Counts
cluster1['1st Most Common Venue'].value_counts()

Motorcycle Shop    3
Name: 1st Most Common Venue, dtype: int64

In [30]:
# Cluster 2
cluster2 = munich.loc[munich['Cluster Labels'] == 2, munich.columns[[0] + list(range(5,15))]]
cluster2

Unnamed: 0,District,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
26,Bogenhausen,Bus Stop,Drugstore,Bakery,Italian Restaurant,Greek Restaurant,Park,Bank,Supermarket,Pharmacy,Water Park
27,Bogenhausen,Bus Stop,Drugstore,Bakery,Italian Restaurant,Greek Restaurant,Park,Bank,Supermarket,Pharmacy,Water Park
28,Bogenhausen,Bus Stop,Drugstore,Bakery,Italian Restaurant,Greek Restaurant,Park,Bank,Supermarket,Pharmacy,Water Park
29,Bogenhausen,Bus Stop,Drugstore,Bakery,Italian Restaurant,Greek Restaurant,Park,Bank,Supermarket,Pharmacy,Water Park
30,Bogenhausen,Bus Stop,Drugstore,Bakery,Italian Restaurant,Greek Restaurant,Park,Bank,Supermarket,Pharmacy,Water Park
31,Bogenhausen,Bus Stop,Drugstore,Bakery,Italian Restaurant,Greek Restaurant,Park,Bank,Supermarket,Pharmacy,Water Park
35,Hadern,Supermarket,ATM,Sushi Restaurant,Liquor Store,Sandwich Place,Ice Cream Shop,Trattoria/Osteria,Greek Restaurant,German Restaurant,Bakery
36,Hadern,Supermarket,ATM,Sushi Restaurant,Liquor Store,Sandwich Place,Ice Cream Shop,Trattoria/Osteria,Greek Restaurant,German Restaurant,Bakery
37,Hadern,Supermarket,ATM,Sushi Restaurant,Liquor Store,Sandwich Place,Ice Cream Shop,Trattoria/Osteria,Greek Restaurant,German Restaurant,Bakery
38,Laim,Supermarket,Bank,Snack Place,Plaza,Drugstore,Organic Grocery,Doner Restaurant,Restaurant,Sporting Goods Shop,Mobile Phone Shop


In [31]:
# Cluster 2 Value Counts
cluster2['1st Most Common Venue'].value_counts()

Supermarket          22
German Restaurant    10
Bus Stop              6
Name: 1st Most Common Venue, dtype: int64

In [32]:
# Cluster 3
cluster3 = munich.loc[munich['Cluster Labels'] == 3, munich.columns[[0] + list(range(5,15))]]
cluster3

Unnamed: 0,District,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Allach-Untermenzing,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
1,Allach-Untermenzing,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
2,Allach-Untermenzing,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
3,Allach-Untermenzing,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
4,Allach-Untermenzing,Supermarket,Drugstore,Bakery,Sporting Goods Shop,Automotive Shop,Italian Restaurant,Yoga Studio,Fish Market,Fast Food Restaurant,Farmers Market
22,Berg am Laim,Supermarket,Drugstore,Café,Metro Station,Hotel,Gastropub,Bakery,Cafeteria,Light Rail Station,Eastern European Restaurant
23,Berg am Laim,Supermarket,Drugstore,Café,Metro Station,Hotel,Gastropub,Bakery,Cafeteria,Light Rail Station,Eastern European Restaurant
24,Berg am Laim,Supermarket,Drugstore,Café,Metro Station,Hotel,Gastropub,Bakery,Cafeteria,Light Rail Station,Eastern European Restaurant
25,Berg am Laim,Supermarket,Drugstore,Café,Metro Station,Hotel,Gastropub,Bakery,Cafeteria,Light Rail Station,Eastern European Restaurant
58,Moosach,Bakery,Drugstore,Supermarket,Hotel,German Restaurant,Food,Bus Stop,Big Box Store,Italian Restaurant,Gastropub


In [33]:
# Cluster 3 Value Counts
cluster3['1st Most Common Venue'].value_counts()

Supermarket    13
Spa             6
Bakery          5
Name: 1st Most Common Venue, dtype: int64

In [34]:
# Cluster 4
cluster4 = munich.loc[munich['Cluster Labels'] == 4, munich.columns[[0] + list(range(5,15))]]
cluster4

Unnamed: 0,District,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
85,Schwabing-Freimann,Fast Food Restaurant,Gym / Fitness Center,Intersection,Auto Dealership,Nightclub,Greek Restaurant,Automotive Shop,Beach,Bus Stop,Hotel
86,Schwabing-Freimann,Fast Food Restaurant,Gym / Fitness Center,Intersection,Auto Dealership,Nightclub,Greek Restaurant,Automotive Shop,Beach,Bus Stop,Hotel
87,Schwabing-Freimann,Fast Food Restaurant,Gym / Fitness Center,Intersection,Auto Dealership,Nightclub,Greek Restaurant,Automotive Shop,Beach,Bus Stop,Hotel
88,Schwabing-Freimann,Fast Food Restaurant,Gym / Fitness Center,Intersection,Auto Dealership,Nightclub,Greek Restaurant,Automotive Shop,Beach,Bus Stop,Hotel
89,Schwabing-Freimann,Fast Food Restaurant,Gym / Fitness Center,Intersection,Auto Dealership,Nightclub,Greek Restaurant,Automotive Shop,Beach,Bus Stop,Hotel
90,Schwabing-Freimann,Fast Food Restaurant,Gym / Fitness Center,Intersection,Auto Dealership,Nightclub,Greek Restaurant,Automotive Shop,Beach,Bus Stop,Hotel
91,Schwabing-Freimann,Fast Food Restaurant,Gym / Fitness Center,Intersection,Auto Dealership,Nightclub,Greek Restaurant,Automotive Shop,Beach,Bus Stop,Hotel
92,Schwabing-Freimann,Fast Food Restaurant,Gym / Fitness Center,Intersection,Auto Dealership,Nightclub,Greek Restaurant,Automotive Shop,Beach,Bus Stop,Hotel
110,Sendling-Westpark,Ice Cream Shop,Post Office,Gym / Fitness Center,Metro Station,Coffee Shop,Italian Restaurant,Tunnel,Supermarket,Fast Food Restaurant,Bus Stop
111,Sendling-Westpark,Ice Cream Shop,Post Office,Gym / Fitness Center,Metro Station,Coffee Shop,Italian Restaurant,Tunnel,Supermarket,Fast Food Restaurant,Bus Stop


In [35]:
# Cluster 4 Value Counts
cluster4['1st Most Common Venue'].value_counts()

Fast Food Restaurant    8
Ice Cream Shop          5
Name: 1st Most Common Venue, dtype: int64
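
The cluster tables above repeat each District once per postal code, so `value_counts()` counts postal-code rows rather than Districts. A minimal sketch, assuming per-District counts are wanted, on a few rows copied from the outputs above (`sample` is a stand-in for `munich`):

```python
import pandas as pd

# A few rows copied from the cluster tables above. Each district appears
# once per postal code, as in the notebook output.
sample = pd.DataFrame({
    "District": ["Schwabing-Freimann", "Schwabing-Freimann",
                 "Sendling-Westpark", "Sendling-Westpark",
                 "Feldmoching-Hasenbergl"],
    "Cluster Labels": [4, 4, 4, 4, 1],
    "1st Most Common Venue": ["Fast Food Restaurant", "Fast Food Restaurant",
                              "Ice Cream Shop", "Ice Cream Shop",
                              "Motorcycle Shop"],
})

# Keep one row per district, then count the top venue inside every
# cluster in a single pass instead of one cell per cluster.
per_district = sample.drop_duplicates(subset="District")
summary = (per_district
           .groupby("Cluster Labels")["1st Most Common Venue"]
           .value_counts())
```

The result is indexed by cluster label and venue category, which makes it easier to compare clusters side by side without letting Districts that have many postal codes dominate the counts.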

# Results and Conclusion

By analyzing the five clusters we see that some of them are more suited for opening a restaurant than others.

Cluster 0: This cluster is spread across Munich and includes the centre and most tourist locations. Demand should therefore be strong, but rent is likely to be higher than in other Districts.
Its most common venues include cafés, different types of restaurants (Italian, Vietnamese, Asian, steakhouse) and cocktail bars. On the one hand, these Districts offer many potential customers; on the other hand, competition is numerous and diverse. To operate in this cluster, we must offer a highly differentiated product that attracts customers' attention and convinces them to try our restaurant.

Cluster 1: This cluster consists of a single District (Feldmoching-Hasenbergl), which is far away from the others. It has some food offerings (Greek, Doner, Fast Food and Falafel restaurants), but not many. We can assume this District offers more accessible prices to consumers, as well as lower costs for companies (rent, for example). It might be a good opportunity if we do not have much capital to invest.

Cluster 2: Also spread across Munich, except for the centre, these Districts have many supermarkets, German restaurants and bus stops, which suggests residential areas. People in these Districts seem to prefer traditional food (German restaurants) over Fast Food, Italian, Asian, Falafel, steakhouse, etc. It is therefore an interesting option if our target customers are Munich's residents rather than tourists.

Cluster 3: Districts in this cluster are away from the centre of Munich and do not have restaurants as their most common venues (supermarkets, spas and bakeries lead instead). Therefore, we ought to discard this cluster.

Cluster 4: In this cluster, Fast Food restaurants are the most common venue. Considering their low prices and how difficult it is to compete with them, we ought to discard this cluster.
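
The comparison of clusters can be made slightly more concrete by measuring how many of a District's top-10 venue categories are restaurants. The sketch below is a rough illustration on three rows copied from the cluster tables. Treating any category whose name contains "Restaurant" as a restaurant is an assumption, and `restaurant_share` is a hypothetical helper.

```python
# Top-10 venue categories copied from the cluster tables above.
top_venues = {
    "Altstadt-Lehel": [
        "Bavarian Restaurant", "Café", "Plaza", "Hotel", "German Restaurant",
        "Restaurant", "Coffee Shop", "Cocktail Bar", "Church", "Clothing Store"],
    "Feldmoching-Hasenbergl": [
        "Motorcycle Shop", "Greek Restaurant", "Yoga Studio", "Doner Restaurant",
        "Fish Market", "Fast Food Restaurant", "Farmers Market",
        "Falafel Restaurant", "Event Space", "Electronics Store"],
    "Allach-Untermenzing": [
        "Supermarket", "Drugstore", "Bakery", "Sporting Goods Shop",
        "Automotive Shop", "Italian Restaurant", "Yoga Studio", "Fish Market",
        "Fast Food Restaurant", "Farmers Market"],
}


def restaurant_share(categories):
    """Fraction of categories whose name contains 'Restaurant' (assumed rule)."""
    hits = [c for c in categories if "Restaurant" in c]
    return len(hits) / len(categories)


shares = {district: restaurant_share(cats) for district, cats in top_venues.items()}
```

The top-10 lists carry ranks rather than venue counts, so this share is only a coarse signal of competition and should be read alongside the qualitative assessment above.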