<img src = "https://image-src.bcg.com/Images/Zurich_76734213_2360x922_tcm58-177074.jpg" width = "383" align="center" >







<h1 align=left><font size = 5>Where should I live in Zurich? IBM Data Science Capstone Project</font></h1>
<h3 align=left><font size = 3>Author: Gian Fink</font></h3>

## ABSTRACT
This data science project is submitted in partial fulfillment of the IBM Data Science Professional Certificate. Its goal is to identify the five most attractive urban living areas for young people in Zurich. To do so, publicly available census data at the neighbourhood level was combined with Foursquare venue data, and a k-means clustering algorithm was used to group the neighbourhoods. My findings suggest that young people who want to live in an urban neighbourhood in Zurich should look for an apartment in (1) Höngg, (2) Seebach, (3) Wipkingen, (4) Hard or (5) Oerlikon.

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1. <a href="#item1">Introduction </a>

2. <a href="#item2">Data</a>
   
3. <a href="#item3">Methodology</a>

4. <a href="#item4">Results</a>

5. <a href="#item5">Discussion</a>  
    
6. <a href="#item6">Conclusion</a> 
</font>
</div>

<a id='item10'></a>

# Week 1 Submission

<a id='item1'></a>

## 1. Introduction

This section discusses the problem, its background and its relevance for specific target audiences.

__Background:__ 

Zurich is the largest Swiss city and Switzerland's financial and economic capital. The city is located at the northwestern tip of Lake Zurich and, as of January 2020, the municipality had 434'335 inhabitants. The consulting firm Mercer has ranked Zurich as the city with the highest living standard for several consecutive years (Mercer, 2007; Mercer, 2008). According to Mercer, Zurich scores particularly high in the categories work, housing, leisure, education and safety.

Despite the high overall living standard, Zurich's 34 neighbourhoods differ from one another. Urban downtown neighbourhoods usually have many bars and restaurants, whereas neighbourhoods on the outskirts of Zurich have less to offer. On the other hand, downtown neighbourhoods are often noisier and busier than those on the outskirts. Hence, there is a trade-off between enjoying the benefits of living downtown and enjoying the quiet of the outskirts.



__Problem Definition:__ 

This project aims to rank the top 5 urban neighbourhoods for young people in Zurich. In order to achieve this goal, three questions have to be asked: 
- Which data determines the attractiveness of a neighbourhood for young people?
- What are Zurich's urban neighbourhoods? What are Zurich's outskirts?
- Which of the urban neighbourhoods is most attractive for living?

__Targeted Audience:__

The results of this project might be relevant for people who intend to move to Zurich (e.g. experts, exchange students) or who want to spend a holiday in Zurich. They can provide a starting point for finding an apartment in an attractive neighbourhood. This contribution is valuable because moving to another city is often difficult when people lack knowledge about the quality of its different neighbourhoods.

<a id='item2'></a>

## 2. Data

This section describes the datasets used in this project. Three data sources are incorporated:
- __Zurich Neighbourhoods Geo Location:__ As no structured list with geolocation data for Zurich's neighbourhoods was found, the coordinates of each of the 34 neighbourhoods were gathered via GeoHack and stored in an Excel spreadsheet. This dataset includes the variables _district, quarter, latitude_ and _longitude_. (Source: https://tools.wmflabs.org/geohack/)
- __Census Data on Neighbourhood Level:__ The statistical office of Zurich provides detailed statistics for each neighbourhood. This data was downloaded and matched with the Zurich Neighbourhoods Geo Locations (see above). In this way, additional data on the population and commercial density of each neighbourhood was obtained. (Source: https://www.stadt-zuerich.ch/prd/de/index/statistik/kreise-quartiere.html)
- __Foursquare Venue Data:__ This data was used to cluster Zurich's neighbourhoods based on venue characteristics with a k-means algorithm.
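
The matching step described in the list above can be sketched with a pandas merge. This is a minimal illustration under assumptions, not the actual preparation script: the two small frames are stand-ins for the GeoHack spreadsheet and the census download, and only the column names and the Rathaus and Hochschulen values are taken from the dataset printed in the Methodology section.

```python
import pandas as pd

# Stand-in for the hand-collected GeoHack spreadsheet (two rows only)
geo = pd.DataFrame({
    "district": ["Kreis 1", "Kreis 1"],
    "quarter": ["Rathaus", "Hochschulen"],
    "latitude": [47.38, 47.38],
    "longitude": [8.55, 8.55],
})

# Stand-in for the census download, deliberately in a different row order
census = pd.DataFrame({
    "quarter": ["Hochschulen", "Rathaus"],
    "pop2019": [678, 3307],
})

# Match on the quarter name; validate raises if a quarter appears twice
merged = geo.merge(census, on="quarter", how="left", validate="one_to_one")
print(merged)
```

Using `validate="one_to_one"` is a design choice for this sketch: it fails loudly if a quarter name is duplicated in either source, which is an easy mistake with hand-collected data.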

<a id='item3'></a>

## 3. Methodology

This Section describes the methodology including all required steps for this data science project.

__Import libraries:__ Before importing the different datasets, the relevant libraries are installed and imported.

In [1]:
import numpy as np # library to handle data in a vectorized manner

import os

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON into a pandas dataframe (pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Libraries imported.


__Import dataset:__ I created a new neighbourhood-level (quarters) dataset by combining data from different sources (see Section Data). The dataset is stored as a .csv file on my computer, and the following lines of code import it as a pandas dataframe.

In [2]:
# import data from .csv
df = pd.read_csv("zurich_data.csv", sep=';', delimiter=None, header='infer')
print("Data imported!")

Data imported!
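
Counts in this dataset use the Swiss apostrophe as a thousands separator (for example `10'199`), so a plain `read_csv` call keeps those columns as strings. The sketch below shows one way to parse them as integers with the `thousands` argument of `pd.read_csv`. It reads a two-row inline sample rather than `zurich_data.csv`, and the sample is an assumption made for illustration.

```python
import io

import pandas as pd

# Inline sample that mimics the layout of the semicolon-separated file
raw = (
    "quarter;employed;workplaces\n"
    "Rathaus;10'199;1'588\n"
    "Hochschulen;14'033;734\n"
)

# thousands="'" tells the parser to drop the apostrophes in numeric fields
sample = pd.read_csv(io.StringIO(raw), sep=";", thousands="'")
print(sample.dtypes)
```

With numeric columns, later steps such as sorting by population or recomputing densities work without manual string cleaning.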


In [3]:
# display data to check whether import worked
df.head()

Unnamed: 0,district,quarter,area_km2,latitude,longitude,employed,workplaces,wp_density,pop_density,wp_pop_ratio,pop2019,pop2018,pop2017,pop2016,pop2015,pop2014,pop2013,pop2012,pop2011,pop2010,pop2009,foreign2019,foreign2018,foreign2017,foreign2016,foreign2015,foreign2014,foreign2013,foreign2012,foreign2011,foreign2010,foreign2009
0,Kreis 1,Rathaus,0.38,47.38,8.55,10'199,1'588,4'179,8'703,0.48,3'307,3'267,3'266,3'226,3'197,3'236,3'194,3'164,3'140,3'099,3'110,30.6,30.1,30.5,30.6,30.2,29.7,29.0,28.8,28.6,28.8,28.7
1,Kreis 1,Hochschulen,0.56,47.38,8.55,14'033,734,1'311,1'211,1.08,678,664,677,640,641,634,665,673,681,678,673,31.7,34.3,32.6,32.5,31.8,29.8,30.7,30.8,28.9,28.9,30.8
2,Kreis 1,Lindenhof,0.23,47.37,8.54,13'851,1'194,5'191,4'387,1.18,1'009,990,955,972,974,955,923,935,950,951,940,29.0,30.1,28.7,27.3,27.2,27.4,25.8,26.4,24.9,25.0,23.4
3,Kreis 1,City,0.64,47.36,8.57,32'084,1'924,3'006,1'245,2.41,797,829,830,810,805,791,783,799,779,835,853,31.7,30.0,30.5,29.5,30.2,29.0,29.9,31.4,30.6,32.2,33.3
4,Kreis 2,Wollishofen,5.75,47.34,8.53,7'811,1'170,203,3'343,0.06,19'225,18'923,17'892,16'567,16'244,16'137,15'937,16'029,16'055,15'988,15'854,29.6,29.1,28.3,27.2,26.9,26.6,25.9,25.8,25.5,24.8,24.5
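
The derived columns are not defined in the text, but the printed values appear consistent with `wp_density = workplaces / area_km2`, `pop_density = pop2019 / area_km2` and `wp_pop_ratio = workplaces / pop2019`. The sketch below recomputes them for the first two rows as a plausibility check. These formulas are an inference from the numbers, not a documented definition.

```python
import pandas as pd

# First two rows of the table above, with apostrophes removed
sample = pd.DataFrame({
    "quarter": ["Rathaus", "Hochschulen"],
    "area_km2": [0.38, 0.56],
    "workplaces": [1588, 734],
    "pop2019": [3307, 678],
})

# Assumed definitions, inferred from the printed values
sample["wp_density"] = (sample["workplaces"] / sample["area_km2"]).round().astype(int)
sample["pop_density"] = (sample["pop2019"] / sample["area_km2"]).round().astype(int)
sample["wp_pop_ratio"] = (sample["workplaces"] / sample["pop2019"]).round(2)

print(sample[["quarter", "wp_density", "pop_density", "wp_pop_ratio"]])
```

Under these definitions, a low `wp_pop_ratio` describes a quarter with few workplaces per resident, which reads as a more residential area.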


__Fetch the geo coordinates of Zurich to create a map of Zurich:__

In [4]:
address = 'Zurich, ZH'

geolocator = Nominatim(user_agent="zh_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Zurich are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Zurich are 47.3723941, 8.5423328.


__Create a map of Zurich with geo coordinates for each neighbourhood in Zurich:__

In [5]:
# create map of Zurich using latitude and longitude values
map_zurich = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, district, quarter in zip(df.latitude, df.longitude, df.district, df.quarter):
    label = '{}, {}'.format(quarter, district)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_zurich)  

    
map_zurich

__Foursquare Credentials:__

In [6]:
CLIENT_ID = 'ABBJUMT3NUSPKLEURXMV5SEGENGBO1SMXUE51F5WDSKTM2RS' # your Foursquare ID
CLIENT_SECRET = 'MPU1JRH1RPVEZBYFUXNEWVOV3MTJYEGOHTVRJP1UUDLAZX4J' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: ABBJUMT3NUSPKLEURXMV5SEGENGBO1SMXUE51F5WDSKTM2RS
CLIENT_SECRET:MPU1JRH1RPVEZBYFUXNEWVOV3MTJYEGOHTVRJP1UUDLAZX4J
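
Hardcoding and printing API credentials makes them visible to anyone who reads the notebook. A hedged alternative is to read them from environment variables and assemble the query string with the standard library. In the sketch below, the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `build_explore_url` are hypothetical; the endpoint and query parameters are the ones used in this notebook.

```python
import os
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, radius=500, limit=100, version="20180605"):
    """Assemble the explore URL; credentials come from the environment."""
    params = {
        # Hypothetical variable names; placeholders are used if they are unset
        "client_id": os.environ.get("FOURSQUARE_CLIENT_ID", "YOUR_CLIENT_ID"),
        "client_secret": os.environ.get("FOURSQUARE_CLIENT_SECRET", "YOUR_CLIENT_SECRET"),
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
    }
    # urlencode escapes special characters such as the comma in "ll"
    return f"{EXPLORE_ENDPOINT}?{urlencode(params)}"


url = build_explore_url(47.38, 8.55)
```

The function only builds the URL and sends nothing, so it can be inspected without network access.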


__Explore the "Rathaus" quarter in Zurich:__

In [7]:
df.loc[0, 'quarter']

quarter_latitude = df.loc[0, 'latitude'] # quarter latitude value
quarter_longitude = df.loc[0, 'longitude'] # quarter longitude value

quarter_name = df.loc[0, 'quarter'] # quarter name



print('Latitude and longitude values of {} are {}, {}.'.format(quarter_name, 
                                                               quarter_latitude, 
                                                               quarter_longitude))

LIMIT = 100 # limit of number of venues returned by Foursquare API

radius = 500 # define radius


url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    quarter_latitude, 
    quarter_longitude, 
    radius, 
    LIMIT)
url # display URL

results = requests.get(url).json()
results

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        # flattened responses store the list under 'venue.categories'
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
    
    

venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()


    
print('{} venues were returned for the "Rathaus" quarter by Foursquare.'.format(nearby_venues.shape[0]))


Latitude and longitude values of Rathaus are 47.38, 8.55.
22 venues were returned for the "Rathaus" quarter by Foursquare.
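
To make the 500 m search radius concrete, the great-circle distance between a quarter's coordinates and a venue can be computed with the haversine formula. The sketch below is an illustration added here, not a step of the notebook. It uses the Rathaus coordinates from the dataset and the coordinates of the first Rathaus venue listed in the results table, and it assumes a mean Earth radius of 6,371 km.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2, earth_radius_m=6_371_000):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


# Rathaus quarter coordinates vs. "The Alehouse - Palmhof"
distance = haversine_m(47.38, 8.55, 47.379138, 8.548037)
print(round(distance), "m from the quarter coordinates")
```

Because the quarter coordinates in the dataset are rounded to two decimals, the centre of each search circle is only approximate, which may affect which venues fall inside the radius.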




__Explore all the 34 quarters in Zurich:__

_The code below defines a function that retrieves Foursquare venues for all of the 34 quarters in Zurich:_

In [8]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['quarter', 
                  'latitude', 
                  'longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
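
The list comprehension in `getNearbyVenues` indexes `categories[0]` directly, which would raise an `IndexError` for a venue without any category. The sketch below shows a more defensive way to extract the same fields. The helper name `extract_venue_rows` and the hand-written payload are assumptions for illustration; only the field names follow the response structure used in this notebook.

```python
def extract_venue_rows(quarter, lat, lng, items):
    """Turn explore items into row tuples, tolerating missing fields."""
    rows = []
    for item in items:
        venue = item.get("venue", {})
        categories = venue.get("categories") or []
        # Fall back to None instead of failing on an empty category list
        category = categories[0]["name"] if categories else None
        location = venue.get("location", {})
        rows.append((
            quarter, lat, lng,
            venue.get("name"),
            location.get("lat"),
            location.get("lng"),
            category,
        ))
    return rows


# Hand-written payload for illustration only, not real API output
items = [
    {"venue": {"name": "Venue A", "location": {"lat": 47.379, "lng": 8.548},
               "categories": [{"name": "Gastropub"}]}},
    {"venue": {"name": "Venue B", "location": {"lat": 47.378, "lng": 8.547},
               "categories": []}},
]
rows = extract_venue_rows("Rathaus", 47.38, 8.55, items)
```

Rows with a `None` category can then be dropped or kept explicitly before the one-hot encoding step.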

_This code retrieves the Foursquare venues for each neighbourhood in Zurich:_

In [9]:
zurich_venues = getNearbyVenues(names=df['quarter'],
                                   latitudes=df['latitude'],
                                   longitudes=df['longitude']
                                  )

Rathaus
Hochschulen
Lindenhof
City
Wollishofen
Leimbach
Enge
Alt-Wiedikon
Friesenberg
Sihlfeld
Werd
Langstrasse
Hard
Gewerbeschule
Escher Wyss
Unterstrass
Oberstrass
Fluntern
Hottingen
Hirslanden
Witikon
Seefeld
Mühlebach
Weinegg
Albisrieden
Altstetten
Höngg
Wipkingen
Affoltern
Oerlikon
Seebach
Saatlen
Schwamedingen-Mitte
Hirzenbach


_The table below shows the Foursquare venues for each neighbourhood in Zurich:_

In [10]:
print(zurich_venues.shape)
zurich_venues.head()

(969, 7)


Unnamed: 0,quarter,latitude,longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Rathaus,47.38,8.55,The Alehouse - Palmhof,47.379138,8.548037,Gastropub
1,Rathaus,47.38,8.55,K55 GmbH,47.378553,8.548116,Electronics Store
2,Rathaus,47.38,8.55,Kleine Freiheit,47.379216,8.544588,Snack Place
3,Rathaus,47.38,8.55,Restaurant Haldenbach,47.379901,8.547052,Restaurant
4,Rathaus,47.38,8.55,"focusTerra, ETH Zürich",47.378231,8.547438,Science Museum


_Let's count the venues in each neighbourhood:_

In [11]:
zurich_venues.groupby('quarter').count()

quarter,latitude,longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Affoltern,10,10,10,10,10,10
Albisrieden,1,1,1,1,1,1
Alt-Wiedikon,33,33,33,33,33,33
Altstetten,5,5,5,5,5,5
City,10,10,10,10,10,10
Enge,32,32,32,32,32,32
Escher Wyss,90,90,90,90,90,90
Fluntern,13,13,13,13,13,13
Friesenberg,11,11,11,11,11,11
Gewerbeschule,86,86,86,86,86,86


_How many unique venue categories are there in Zurich overall?_

In [12]:
print('There are {} unique categories.'.format(len(zurich_venues['Venue Category'].unique())))

There are 135 unique categories.


__Let's have a look at the venues of each Zurich neighbourhood:__

In [13]:
# one hot encoding
zurich_onehot = pd.get_dummies(zurich_venues[['Venue Category']], prefix="", prefix_sep="")

# add quarter column back to dataframe
zurich_onehot['quarter'] = zurich_venues['quarter'] 

# move quarter column to the first column
fixed_columns = [zurich_onehot.columns[-1]] + list(zurich_onehot.columns[:-1])
zurich_onehot = zurich_onehot[fixed_columns]

zurich_onehot.head()

Unnamed: 0,quarter,Accessories Store,Advertising Agency,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bar,Baseball Field,Beer Garden,Bistro,Board Shop,Bookstore,Boutique,Burger Joint,Bus Station,Business Service,Café,Cambodian Restaurant,Candy Store,Cheese Shop,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cosmetics Shop,Cultural Center,Cupcake Shop,Dance Studio,Department Store,Dessert Shop,Diner,Discount Store,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Fast Food Restaurant,Food & Drink Shop,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gas Station,Gastropub,General Entertainment,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Historic Site,History Museum,Hookah Bar,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Kids Store,Korean Restaurant,Laser Tag,Light Rail Station,Liquor Store,Lounge,Market,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Moroccan Restaurant,Movie Theater,Multiplex,Museum,Music Store,Nightclub,Optical Shop,Other Great Outdoors,Outdoors & Recreation,Park,Pedestrian Plaza,Performing Arts Venue,Peruvian Restaurant,Pharmacy,Pizza Place,Playground,Plaza,Pool,Pool Hall,Pub,Restaurant,River,Rock Club,Salad Place,Sandwich Place,Science Museum,Shoe Store,Shopping Mall,Skate Park,Snack Place,Soccer Field,Soup Place,Southern / Soul Food Restaurant,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Steakhouse,Supermarket,Sushi Restaurant,Swiss Restaurant,Tennis Court,Thai Restaurant,Theater,Trade School,Trail,Train Station,Tram Station,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Wine Shop,Yoga Studio
0,Rathaus,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Rathaus,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Rathaus,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Rathaus,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Rathaus,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [14]:
# What is the new dataframe's size?
zurich_onehot.shape

(969, 136)

In [15]:
# This code groups the rows by quarter (neighbourhoods) and computes the mean frequency for each category.
zurich_grouped = zurich_onehot.groupby('quarter').mean().reset_index()
zurich_grouped.head()

Unnamed: 0,quarter,Accessories Store,Advertising Agency,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bar,Baseball Field,Beer Garden,Bistro,Board Shop,Bookstore,Boutique,Burger Joint,Bus Station,Business Service,Café,Cambodian Restaurant,Candy Store,Cheese Shop,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cosmetics Shop,Cultural Center,Cupcake Shop,Dance Studio,Department Store,Dessert Shop,Diner,Discount Store,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Fast Food Restaurant,Food & Drink Shop,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gas Station,Gastropub,General Entertainment,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Historic Site,History Museum,Hookah Bar,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Kids Store,Korean Restaurant,Laser Tag,Light Rail Station,Liquor Store,Lounge,Market,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Moroccan Restaurant,Movie Theater,Multiplex,Museum,Music Store,Nightclub,Optical Shop,Other Great Outdoors,Outdoors & Recreation,Park,Pedestrian Plaza,Performing Arts Venue,Peruvian Restaurant,Pharmacy,Pizza Place,Playground,Plaza,Pool,Pool Hall,Pub,Restaurant,River,Rock Club,Salad Place,Sandwich Place,Science Museum,Shoe Store,Shopping Mall,Skate Park,Snack Place,Soccer Field,Soup Place,Southern / Soul Food Restaurant,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Steakhouse,Supermarket,Sushi Restaurant,Swiss Restaurant,Tennis Court,Thai Restaurant,Theater,Trade School,Trail,Train Station,Tram Station,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Wine Shop,Yoga Studio
0,Affoltern,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0
1,Albisrieden,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Alt-Wiedikon,0.0,0.030303,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.030303,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.151515,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.030303,0.0,0.030303,0.0,0.060606,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.060606,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.060606,0.030303,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.060606,0.0,0.030303,0.0,0.0,0.030303,0.0,0.030303,0.0,0.030303,0.0,0.0,0.0
3,Altstetten,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,City,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.1,0.0


In [16]:
zurich_grouped.shape

(34, 136)
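
What the grouped table contains is easiest to see on a toy example: after one-hot encoding, the mean per quarter is the share of that quarter's venues that fall into each category. The quarters `A` and `B` and their venues below are made up for illustration.

```python
import pandas as pd

# Made-up venues: quarter A has 2 venues, quarter B has 4
toy = pd.DataFrame({
    "quarter": ["A", "A", "B", "B", "B", "B"],
    "Venue Category": ["Café", "Bar", "Café", "Café", "Park", "Bar"],
})

# One column per category, 1 where the venue belongs to it
onehot = pd.get_dummies(toy["Venue Category"], dtype=int)
onehot.insert(0, "quarter", toy["quarter"])

# Mean of the 0/1 columns = relative frequency of each category per quarter
grouped = onehot.groupby("quarter").mean()
print(grouped)
```

Each row sums to 1, so quarters are compared by venue mix rather than venue count. This also means that a quarter with very few venues gets extreme frequencies, as with Albisrieden, whose single venue yields a frequency of 1.0 for `Trail`.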

__This code shows the top 5 venues for each Zurich neighbourhood:__

In [17]:
num_top_venues = 5

for hood in zurich_grouped['quarter']:
    print("----"+hood+"----")
    temp = zurich_grouped[zurich_grouped['quarter'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Affoltern----
                venue  freq
0         Bus Station   0.2
1         Supermarket   0.2
2    Department Store   0.1
3  Italian Restaurant   0.1
4  Light Rail Station   0.1


----Albisrieden----
              venue  freq
0             Trail   1.0
1  Pedestrian Plaza   0.0
2       Music Store   0.0
3         Nightclub   0.0
4      Optical Shop   0.0


----Alt-Wiedikon----
                venue  freq
0                Café  0.15
1    Swiss Restaurant  0.06
2         Pizza Place  0.06
3   French Restaurant  0.06
4  Italian Restaurant  0.06


----Altstetten----
                       venue  freq
0                Bus Station   0.4
1                Pizza Place   0.2
2  Middle Eastern Restaurant   0.2
3               Soccer Field   0.2
4          Accessories Store   0.0


----City----
                venue  freq
0    Swiss Restaurant   0.2
1  Italian Restaurant   0.1
2               River   0.1
3       Grocery Store   0.1
4   French Restaurant   0.1


----Enge----
                

__This code puts the data above into a pandas dataframe:__

In [18]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [19]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['quarter']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
quarter_venues_sorted = pd.DataFrame(columns=columns)
quarter_venues_sorted['quarter'] = zurich_grouped['quarter']

for ind in np.arange(zurich_grouped.shape[0]):
    quarter_venues_sorted.iloc[ind, 1:] = return_most_common_venues(zurich_grouped.iloc[ind, :], num_top_venues)

quarter_venues_sorted

Unnamed: 0,quarter,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Affoltern,Bus Station,Supermarket,Hotel,Light Rail Station,Department Store,Italian Restaurant,Train Station,Athletics & Sports,Food & Drink Shop,Fast Food Restaurant
1,Albisrieden,Trail,Yoga Studio,Electronics Store,French Restaurant,Food & Drink Shop,Fast Food Restaurant,Falafel Restaurant,Event Space,Ethiopian Restaurant,Eastern European Restaurant
2,Alt-Wiedikon,Café,Pizza Place,French Restaurant,Italian Restaurant,Swiss Restaurant,Supermarket,Ethiopian Restaurant,Fast Food Restaurant,Fried Chicken Joint,Playground
3,Altstetten,Bus Station,Pizza Place,Middle Eastern Restaurant,Soccer Field,Fried Chicken Joint,Food & Drink Shop,Fast Food Restaurant,Falafel Restaurant,Event Space,Ethiopian Restaurant
4,City,Swiss Restaurant,Mediterranean Restaurant,Italian Restaurant,Museum,River,French Restaurant,Grocery Store,Tram Station,Wine Shop,Dessert Shop
5,Enge,Italian Restaurant,Park,Bar,History Museum,Restaurant,Swiss Restaurant,Supermarket,Hotel,Tram Station,Grocery Store
6,Escher Wyss,Café,Bar,Hotel,Nightclub,Restaurant,Gym / Fitness Center,Italian Restaurant,Park,Clothing Store,Concert Hall
7,Fluntern,Bakery,Plaza,Tram Station,Grocery Store,Bus Station,Pizza Place,Gastropub,Supermarket,Electronics Store,Falafel Restaurant
8,Friesenberg,Supermarket,Restaurant,Indian Restaurant,Art Gallery,Tram Station,Diner,Beer Garden,Gym,Bus Station,Lounge
9,Gewerbeschule,Bar,Italian Restaurant,Thai Restaurant,Café,Asian Restaurant,Chinese Restaurant,Swiss Restaurant,Bakery,Mediterranean Restaurant,Vegetarian / Vegan Restaurant
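
For quarters with few venues, the table above fills the remaining positions with categories whose frequency is zero (for example, every entry after `Trail` for Albisrieden). A possible variant, sketched below under the assumption that only categories actually present should be listed, filters out zero frequencies before ranking. The helper name `top_categories` is hypothetical.

```python
import pandas as pd


def top_categories(row, n):
    """Return up to n category names with a positive frequency, highest first."""
    positive = row[row > 0]
    return positive.nlargest(n).index.tolist()


# Frequencies shaped like a one-venue quarter (values for illustration)
sparse_row = pd.Series({"Trail": 1.0, "Yoga Studio": 0.0, "Electronics Store": 0.0})
# Made-up frequencies for a quarter with a broader mix
mixed_row = pd.Series({"Bus Station": 0.4, "Pizza Place": 0.2, "Park": 0.1, "Café": 0.3})

print(top_categories(sparse_row, 3))
print(top_categories(mixed_row, 3))
```

The returned list can be shorter than `n`, so a caller that needs fixed-width columns would have to pad it, for example with `None`.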


__Let's cluster Zurich's neighbourhoods based on their venues on Foursquare:__

In [34]:
# set number of clusters
kclusters = 5

zurich_grouped_clustering = zurich_grouped.drop(columns='quarter')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(zurich_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 2, 1, 0, 1, 1, 1, 1, 0, 1], dtype=int32)
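
The value `kclusters = 5` is set without a stated justification. One common heuristic, which this notebook does not apply, is to compare the within-cluster sum of squares (`inertia_`) across several values of k and look for the point where further clusters bring little improvement. The sketch below runs that comparison on synthetic data, since the real feature matrix is not reproduced here. The data, the range of k and `n_init=10` are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for zurich_grouped_clustering: 34 rows in two clear groups
X = np.vstack([
    rng.normal(loc=0.0, scale=0.05, size=(17, 6)),
    rng.normal(loc=1.0, scale=0.05, size=(17, 6)),
])

# Fit k-means for several k and record the within-cluster sum of squares
inertias = {}
for k in range(1, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(k, round(value, 3))
```

On the real data, the same loop over `zurich_grouped_clustering` would give a curve to inspect before settling on a number of clusters.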

In [35]:
# add clustering labels (drop any existing 'labels' column first so the cell can be re-run)
quarter_venues_sorted = quarter_venues_sorted.drop(columns='labels', errors='ignore')
quarter_venues_sorted.insert(0, 'labels', kmeans.labels_)

# join the census data with the sorted venues to add the cluster label and top venues for each quarter
zurich_merged = df.join(quarter_venues_sorted.set_index('quarter'), on='quarter')

zurich_merged.head() # check the last columns!

__Visualization of the Zurich neighbourhood clusters:__

In [36]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(zurich_merged.latitude, zurich_merged.longitude, zurich_merged.quarter, zurich_merged.labels):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [37]:
# Cluster 1
cluster1 = zurich_merged.loc[zurich_merged['labels'] == 0, zurich_merged.columns[[1] + list(range(5, zurich_merged.shape[1]))]]
cluster1.sort_values(by=['wp_pop_ratio'])

Unnamed: 0,quarter,employed,workplaces,wp_density,pop_density,wp_pop_ratio,pop2019,pop2018,pop2017,pop2016,pop2015,pop2014,pop2013,pop2012,pop2011,pop2010,pop2009,foreign2019,foreign2018,foreign2017,foreign2016,foreign2015,foreign2014,foreign2013,foreign2012,foreign2011,foreign2010,foreign2009,labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Leimbach,717,212,73,2'107,0.03,6'152,6'320,6'212,6'173,6'102,5'936,5'730,5'354,5'340,5'293,5'287,33.0,33.6,32.1,31.1,30.2,27.9,26.8,24.0,24.2,23.4,23.6,0,Grocery Store,Tennis Court,Bus Station,Moroccan Restaurant,Dessert Shop,Diner,Discount Store,Eastern European Restaurant,Electronics Store,Fried Chicken Joint
28,Affoltern,4'106,911,151,4'422,0.03,26'710,26'562,26'177,26'054,25'874,25'902,25'082,24'855,24'437,22'972,22'383,32.7,33.0,32.9,32.8,33.1,33.5,32.5,32.5,32.1,31.4,31.7,0,Bus Station,Supermarket,Hotel,Light Rail Station,Department Store,Italian Restaurant,Train Station,Athletics & Sports,Food & Drink Shop,Fast Food Restaurant
31,Saatlen,1'516,236,209,7'824,0.03,8'841,8'582,8'388,8'283,8'508,7'563,7'280,7'118,7'131,7'175,7'132,27.9,28.8,30.2,30.8,31.2,31.0,31.0,31.9,32.2,31.8,32.7,0,Bagel Shop,Bus Station,Pool,Tram Station,Lounge,Restaurant,Supermarket,Yoga Studio,Fast Food Restaurant,Falafel Restaurant
8,Friesenberg,5'177,406,79,2'157,0.04,11'107,10'933,10'860,11'002,10'698,10'695,10'696,10'986,10'622,10'596,11'003,18.5,18.3,18.5,19.5,19.4,19.8,19.6,21.0,20.6,20.1,23.5,0,Supermarket,Restaurant,Indian Restaurant,Art Gallery,Tram Station,Diner,Beer Garden,Gym,Bus Station,Lounge
4,Wollishofen,7'811,1'170,203,3'343,0.06,19'225,18'923,17'892,16'567,16'244,16'137,15'937,16'029,16'055,15'988,15'854,29.6,29.1,28.3,27.2,26.9,26.6,25.9,25.8,25.5,24.8,24.5,0,Irish Pub,Restaurant,Bus Station,Supermarket,Cheese Shop,Dessert Shop,Diner,Discount Store,Eastern European Restaurant,French Restaurant
25,Altstetten,43'417,2'670,357,4'590,0.08,34'285,33'461,32'603,31'724,32'003,31'486,31'115,31'438,31'381,30'659,29'845,36.2,35.9,35.8,35.3,35.3,35.4,35.4,36.1,36.3,35.6,36.1,0,Bus Station,Pizza Place,Middle Eastern Restaurant,Soccer Field,Fried Chicken Joint,Food & Drink Shop,Fast Food Restaurant,Falafel Restaurant,Event Space,Ethiopian Restaurant


__Finally, let's have a look at the venues of each Zurich neighbourhood cluster:__

In [38]:
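# Cluster 2 (k-means label 1): select column 1 (quarter) plus all columns from position 5 onward
cluster2 = zurich_merged.loc[zurich_merged['labels'] == 1, zurich_merged.columns[[1] + list(range(5, zurich_merged.shape[1]))]]
# Display the cluster sorted by workplace/population density ratio, lowest first
cluster2.sort_values(by=['wp_pop_ratio'])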


Unnamed: 0,quarter,employed,workplaces,wp_density,pop_density,wp_pop_ratio,pop2019,pop2018,pop2017,pop2016,pop2015,pop2014,pop2013,pop2012,pop2011,pop2010,pop2009,foreign2019,foreign2018,foreign2017,foreign2016,foreign2015,foreign2014,foreign2013,foreign2012,foreign2011,foreign2010,foreign2009,labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
26,Höngg,9'000,1'126,161,3'490,0.05,24'358,24'020,23'797,23'423,22'320,21'826,21'581,21'537,21'323,21'179,21'294,25.0,25.1,25.3,24.7,23.4,23.1,22.8,22.3,22.3,21.7,21.2,1,Tram Station,Pizza Place,Grocery Store,Plaza,Bus Station,Mexican Restaurant,Café,Gas Station,Sporting Goods Shop,Supermarket
30,Seebach,29'652,1'552,329,5'467,0.06,25'806,25'568,25'817,25'198,24'991,24'431,24'008,23'310,22'255,22'037,21'489,39.2,39.3,39.1,38.4,38.1,38.0,37.2,36.5,36.1,35.6,35.9,1,Hookah Bar,Supermarket,Bakery,Grocery Store,Korean Restaurant,Laser Tag,Hotel,Pool,Plaza,Tram Station
27,Wipkingen,7'402,1'154,547,7'808,0.07,16'474,16'321,16'544,16'141,15'879,15'717,15'835,15'791,15'869,15'700,15'645,27.4,28.3,28.7,28.4,28.8,29.3,30.0,30.5,30.1,29.7,30.4,1,Café,Bar,Hotel,Nightclub,Restaurant,Gym / Fitness Center,Italian Restaurant,Park,Clothing Store,Concert Hall
12,Hard,6'954,961,658,8'945,0.07,13'060,13'163,13'057,12'891,13'072,13'232,13'241,12'994,12'744,12'883,12'902,36.2,37.4,38.0,38.6,38.4,39.9,39.9,40.3,40.6,41.0,42.6,1,Swiss Restaurant,Restaurant,Café,Bar,Italian Restaurant,Hotel,Park,Sports Bar,Mediterranean Restaurant,Gym / Fitness Center
29,Oerlikon,23'553,1'858,696,8'722,0.08,23'288,23'214,22'815,22'161,21'966,21'895,21'658,21'331,21'253,20'787,20'740,38.3,38.3,38.8,38.0,37.2,37.1,36.3,35.7,35.7,34.7,34.2,1,Hookah Bar,Supermarket,Bakery,Grocery Store,Korean Restaurant,Laser Tag,Hotel,Pool,Plaza,Tram Station
9,Sihlfeld,12'141,1'969,1'201,13'501,0.09,22'141,21'680,21'660,21'177,21'339,21'195,20'931,20'831,20'307,20'115,20'464,31.0,31.2,31.7,31.4,31.8,31.6,32.2,32.4,32.2,32.0,33.5,1,Café,Pizza Place,French Restaurant,Italian Restaurant,Swiss Restaurant,Supermarket,Ethiopian Restaurant,Fast Food Restaurant,Fried Chicken Joint,Playground
15,Unterstrass,15'671,2'166,880,9'744,0.09,23'971,23'394,22'768,22'476,22'226,22'126,21'876,21'442,21'240,21'233,21'080,27.7,27.8,28.1,28.3,27.9,27.9,27.9,27.4,26.6,26.4,25.7,1,Italian Restaurant,Café,Middle Eastern Restaurant,Bakery,Swiss Restaurant,Grocery Store,Food & Drink Shop,Falafel Restaurant,Sporting Goods Shop,Bus Station
19,Hirslanden,3'520,757,344,3'410,0.1,7'503,7'488,7'465,7'321,7'380,7'403,7'285,7'131,7'024,6'998,6'956,26.9,27.5,27.2,26.6,27.3,27.5,26.2,25.1,24.8,23.1,22.1,1,Swiss Restaurant,Mediterranean Restaurant,Italian Restaurant,Museum,River,French Restaurant,Grocery Store,Tram Station,Wine Shop,Dessert Shop
17,Fluntern,15'382,902,318,3'042,0.1,8'639,8'485,8'221,8'038,7'953,7'865,7'856,7'779,7'873,7'637,7'528,30.5,30.7,32.1,31.9,31.9,31.5,31.0,30.6,30.4,27.9,28.2,1,Bakery,Plaza,Tram Station,Grocery Store,Bus Station,Pizza Place,Gastropub,Supermarket,Electronics Store,Falafel Restaurant
7,Alt-Wiedikon,27'291,2'012,1'088,9'662,0.11,17'874,17'956,17'522,17'321,17'158,16'918,16'706,16'109,16'014,15'988,15'504,34.3,34.8,34.2,33.6,33.6,34.2,33.4,32.3,32.1,31.7,32.1,1,Café,Pizza Place,French Restaurant,Italian Restaurant,Swiss Restaurant,Supermarket,Ethiopian Restaurant,Fast Food Restaurant,Fried Chicken Joint,Playground


In [39]:
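# Cluster 3 (k-means label 2): select column 1 (quarter) plus all columns from position 5 onward
cluster3 = zurich_merged.loc[zurich_merged['labels'] == 2, zurich_merged.columns[[1] + list(range(5, zurich_merged.shape[1]))]]
# Display the cluster sorted by workplace/population density ratio, lowest first
cluster3.sort_values(by=['wp_pop_ratio'])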

Unnamed: 0,quarter,employed,workplaces,wp_density,pop_density,wp_pop_ratio,pop2019,pop2018,pop2017,pop2016,pop2015,pop2014,pop2013,pop2012,pop2011,pop2010,pop2009,foreign2019,foreign2018,foreign2017,foreign2016,foreign2015,foreign2014,foreign2013,foreign2012,foreign2011,foreign2010,foreign2009,labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
24,Albisrieden,8'772,1'104,240,4'859,0.05,22'352,22'304,22'113,21'174,19'325,19'199,19'146,18'999,18'432,17'835,17'675,26.8,27.1,27.2,26.5,25.9,25.6,25.6,25.4,25.2,24.9,24.8,2,Trail,Yoga Studio,Electronics Store,French Restaurant,Food & Drink Shop,Fast Food Restaurant,Falafel Restaurant,Event Space,Ethiopian Restaurant,Eastern European Restaurant


In [40]:
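# Cluster 4 (k-means label 3): select column 1 (quarter) plus all columns from position 5 onward
cluster4 = zurich_merged.loc[zurich_merged['labels'] == 3, zurich_merged.columns[[1] + list(range(5, zurich_merged.shape[1]))]]
# Display the cluster sorted by workplace/population density ratio, lowest first
cluster4.sort_values(by=['wp_pop_ratio'])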

Unnamed: 0,quarter,employed,workplaces,wp_density,pop_density,wp_pop_ratio,pop2019,pop2018,pop2017,pop2016,pop2015,pop2014,pop2013,pop2012,pop2011,pop2010,pop2009,foreign2019,foreign2018,foreign2017,foreign2016,foreign2015,foreign2014,foreign2013,foreign2012,foreign2011,foreign2010,foreign2009,labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
33,Hirzenbach,2'229,316,121,5'011,0.02,13'129,12'801,12'627,12'463,11'930,11'679,11'153,11'404,11'516,11'459,11'610,35.5,35.7,35.9,36.5,36.3,36.0,35.7,35.2,35.1,34.5,36.0,3,Tram Station,Soccer Field,Baseball Field,Steakhouse,Ethiopian Restaurant,French Restaurant,Food & Drink Shop,Fast Food Restaurant,Falafel Restaurant,Event Space
32,Schwamedingen-Mitte,3'293,514,230,5'050,0.05,11'261,11'100,11'012,11'076,11'315,11'301,11'209,10'934,10'863,10'903,10'857,42.7,42.1,42.3,41.6,41.8,41.2,40.4,39.7,39.4,38.8,39.3,3,Tram Station,Pizza Place,Italian Restaurant,Yoga Studio,Electronics Store,Food & Drink Shop,Fast Food Restaurant,Falafel Restaurant,Event Space,Ethiopian Restaurant


In [41]:
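# Cluster 5 (k-means label 4): select column 1 (quarter) plus all columns from position 5 onward
cluster5 = zurich_merged.loc[zurich_merged['labels'] == 4, zurich_merged.columns[[1] + list(range(5, zurich_merged.shape[1]))]]
# Display the cluster sorted by workplace/population density ratio, lowest first
cluster5.sort_values(by=['wp_pop_ratio'])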

Unnamed: 0,quarter,employed,workplaces,wp_density,pop_density,wp_pop_ratio,pop2019,pop2018,pop2017,pop2016,pop2015,pop2014,pop2013,pop2012,pop2011,pop2010,pop2009,foreign2019,foreign2018,foreign2017,foreign2016,foreign2015,foreign2014,foreign2013,foreign2012,foreign2011,foreign2010,foreign2009,labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
20,Witikon,1'878,519,105,2'242,0.05,11'054,10'953,10'600,10'667,10'639,10'406,10'267,10'246,10'258,10'242,10'284,26.6,26.0,25.3,25.2,24.9,24.1,23.5,22.5,21.8,21.0,20.8,4,Other Great Outdoors,Swiss Restaurant,Yoga Studio,Food & Drink Shop,Fast Food Restaurant,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Fried Chicken Joint


<a id='item4'></a>

## 4. Results

__Descriptive Results:__

Before turning to the cluster analysis, some descriptive results need to be discussed. The data from the Zurich statistical office contained _neighbourhood area, neighbourhood population, number of neighbourhood workplaces_ and _share of foreign citizens_. Based on these data, the ratio of workplace density to population density was computed as a measure of whether a neighbourhood is primarily a "living neighbourhood" or a "working neighbourhood". The intuition is that neighbourhoods with a low ratio are more likely to be "living neighbourhoods", because they have relatively few workplaces compared with their resident population.
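
As a minimal sketch of how this ratio can be reproduced, the snippet below parses the apostrophe-grouped figures used in the tables (for example `2'107`) and divides workplace density by population density. The helper names `parse_swiss_int` and `wp_pop_ratio` are illustrative assumptions, not functions from the notebook; the three input pairs are copied from the `wp_density` and `pop_density` columns of the cluster tables.

```python
def parse_swiss_int(text: str) -> int:
    """Convert a figure such as "2'107" (apostrophe as thousands separator) to an int."""
    return int(text.replace("'", ""))


def wp_pop_ratio(wp_density: str, pop_density: str) -> float:
    """Workplace density divided by population density, rounded to two decimals."""
    return round(parse_swiss_int(wp_density) / parse_swiss_int(pop_density), 2)


# (wp_density, pop_density) pairs as printed in the cluster tables
densities = {
    "Leimbach": ("73", "2'107"),
    "Altstetten": ("357", "4'590"),
    "Hirzenbach": ("121", "5'011"),
}

for quarter, (wp, pop) in densities.items():
    print(quarter, wp_pop_ratio(wp, pop))
```

The rounded values agree with the `wp_pop_ratio` column reported for these three quarters (0.03, 0.08 and 0.02).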

__Results of the Cluster Analyses:__

The k-means segmentation algorithm applied to the Foursquare venue data resulted in five neighbourhood clusters for Zurich. Analyzing the most frequently occurring venues in each cluster leads to the following cluster classification:

__Cluster 1:__ _Urban Outskirts (red cluster)_
- Neighbourhoods in this cluster are characterized by many public transport venues, grocery stores and restaurants. Based on the most frequently occurring venues, it seems likely that these neighbourhoods are located at the boundary between the urban area and the rural outskirts of Zurich. This interpretation is supported by the relatively low ratio of workplace density to population density, which lies between 0.03 and 0.08 and indicates that there are relatively few workplaces compared with the population living in these neighbourhoods.

__Cluster 2:__ _Urban Downtown (purple cluster)_
- The most frequent venues for neighbourhoods in the _Urban Downtown_ cluster are coffee shops, bars, restaurants and hotels. Hence, these neighbourhoods seem to be part of a lively urban area with plenty of activities on offer. A look at the (workplace density/population density)-ratio reveals that some of the _Urban Downtown_ areas are not really residential areas. With a threshold of ratio > 0.40, the neighbourhoods _Rathaus, Hochschule, Lindenhof_ and _City_ turn out to be mainly business districts. Hence, these neighbourhoods might not be the first choice for people searching for an apartment in Zurich.
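
A minimal sketch of this threshold rule follows. The ratios of the business-district quarters are not listed in this section, so the dictionary uses only ratios printed in the Cluster 2 table plus one clearly labelled hypothetical entry; the function name `split_by_ratio` is an assumption made for illustration.

```python
RATIO_THRESHOLD = 0.40  # cut-off named in the text


def split_by_ratio(ratios: dict[str, float], threshold: float = RATIO_THRESHOLD):
    """Return (residential, business) quarter lists; business means ratio > threshold."""
    residential = sorted(q for q, r in ratios.items() if r <= threshold)
    business = sorted(q for q, r in ratios.items() if r > threshold)
    return residential, business


ratios = {
    "Höngg": 0.05,      # from the Cluster 2 table
    "Wipkingen": 0.07,  # from the Cluster 2 table
    "Oerlikon": 0.08,   # from the Cluster 2 table
    "Example business quarter": 0.55,  # hypothetical value, not from the data
}

residential, business = split_by_ratio(ratios)
print("residential:", residential)
print("business:", business)
```

Keeping the cut-off in a single named constant makes it easy to check how sensitive the residential/business split is to the chosen value.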

__Cluster 3:__ _Outskirt Cluster West (blue cluster)_

__Cluster 4:__ _Outskirt Cluster North (green cluster)_

__Cluster 5:__ _Outskirt Cluster South (orange cluster)_
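
The cluster names in this classification come from reading off which venue categories recur across a cluster's neighbourhoods. A minimal sketch of such a tally is shown below, using the three most common venues of four Cluster 2 rows from the tables earlier in this notebook. The function name `top_venues` and the choice to count only the first three venue columns are assumptions for illustration, not the notebook's actual procedure.

```python
from collections import Counter


def top_venues(rows: dict[str, list[str]], n: int = 3) -> list[tuple[str, int]]:
    """Count how often each venue category appears across the given neighbourhoods."""
    counts = Counter(venue for venues in rows.values() for venue in venues)
    return counts.most_common(n)


# 1st to 3rd most common venue per quarter, copied from the Cluster 2 table
cluster2_rows = {
    "Wipkingen": ["Café", "Bar", "Hotel"],
    "Hard": ["Swiss Restaurant", "Restaurant", "Café"],
    "Sihlfeld": ["Café", "Pizza Place", "French Restaurant"],
    "Unterstrass": ["Italian Restaurant", "Café", "Middle Eastern Restaurant"],
}

for venue, count in top_venues(cluster2_rows):
    print(venue, count)
```

In this small sample, "Café" appears in all four quarters, which is in line with the _Urban Downtown_ reading of Cluster 2.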

<a id='item5'></a>

## 5. Discussion

_will be updated in week 2_

<a id='item6'></a>

## 6. Conclusion

_will be updated in week 2_

## 7. References

- Mercer (2007). _2007 World-wide quality of living survey_. Retrieved from https://web.archive.org/web/20110812003533/http://www.mercer.com/referencecontent.htm?idContent=1173105
- Mercer (2008). _Mercer's 2008 Quality of living survey highlights_. Retrieved from https://mobilityexchange.mercer.com/Insights/quality-of-living-rankings