# Capstone Project - The Battle of Neighbourhoods (Week 1)

# Background

London normally secures a top place in any published ranking of global cities, whatever the criteria used. It is a truly global city by any measure.

Every year, London attracts thousands of tourists who come to explore its culture, international students who come to advance their studies, and professionals looking for the next chapter of their careers. London is attractive in every respect, and so is its property market.

London's property market is very dynamic and complicated. Investigating it takes a huge effort, and it is nearly impossible to find the right property without exploring and comparing neighbourhoods. It is therefore a good idea to carry out an initial screening of rough locations based on a set of pre-defined conditions.

This report aims to help people who want to settle in London by comparing the London Boroughs in terms of transport accessibility, crime records, and their neighbourhoods.

# Problem Statement

London is organised into Boroughs. For many reasons, the Boroughs differ significantly on a variety of factors that affect overall quality of life.

We select three factors for this project: Public Transport Accessibility Level, recorded crime, and nearest venues. We would like to split the London Boroughs into a few categories that showcase their similarities, so that people can narrow down their preferred options based on different criteria.

# Methodology

K-means clustering will be adopted to segment and cluster the London Boroughs.

## Data Sources: 

a), Wikipedia: List of Boroughs and their coordinates: https://en.wikipedia.org/wiki/List_of_London_boroughs

b), Public Transport Accessibility Levels by Boroughs: https://data.london.gov.uk/dataset/public-transport-accessibility-levels

c), London Recorded Crime Data within last 24 months by Boroughs: https://data.london.gov.uk/dataset/recorded_crime_summary

d), Nearest venues data retrieved per Borough from the Foursquare API

## Step by Step:

1), Download raw data from sources or via API

2), Examine the data to understand its properties and explore the neighbourhoods

3), Merge the various data into one dataframe

4), Perform K-means cluster analysis

5), Examine each cluster to define its category

6), Discuss and compare the categories

7), Make recommendations

# Capstone Project - The Battle of Neighborhoods (Week 2)

In [4]:
!pip install beautifulsoup4
!pip3 install requests
!pip install openpyxl
!pip install geopy

from geopy.geocoders import Nominatim
from bs4 import BeautifulSoup
from sklearn.cluster import KMeans

import pandas as pd
import numpy as np
import requests
import os
import folium 

import matplotlib.cm as cm
import matplotlib.colors as colors


## Step 1), Download raw data from sources or via API

In [7]:
# Download and load crime records data
london_crime_data = pd.read_csv("MPS Borough Level Crime (most recent 24 months).csv")
london_crime_data.head()

Unnamed: 0,MajorText,MinorText,LookUp_BoroughName,201810,201811,201812,201901,201902,201903,201904,...,201912,202001,202002,202003,202004,202005,202006,202007,202008,202009
0,Arson and Criminal Damage,Arson,Barking and Dagenham,8,5,1,5,2,5,5,...,6,4,5,6,2,2,4,3,5,2
1,Arson and Criminal Damage,Criminal Damage,Barking and Dagenham,132,105,88,97,127,138,130,...,122,97,103,107,80,86,121,122,114,116
2,Burglary,Burglary - Business and Community,Barking and Dagenham,32,39,33,45,24,29,27,...,25,31,17,28,29,16,16,28,24,32
3,Burglary,Burglary - Residential,Barking and Dagenham,94,106,164,114,107,99,96,...,130,116,123,97,57,42,63,72,63,54
4,Drug Offences,Drug Trafficking,Barking and Dagenham,9,7,4,6,2,6,5,...,3,15,6,6,13,13,11,22,8,10


In [8]:
london_crime_data.shape

(1567, 27)

In [9]:
london_crime_data['Sum']=london_crime_data.iloc[:,3:27].sum(axis=1)


In [10]:
london_crime_data.head()

Unnamed: 0,MajorText,MinorText,LookUp_BoroughName,201810,201811,201812,201901,201902,201903,201904,...,202001,202002,202003,202004,202005,202006,202007,202008,202009,Sum
0,Arson and Criminal Damage,Arson,Barking and Dagenham,8,5,1,5,2,5,5,...,4,5,6,2,2,4,3,5,2,115
1,Arson and Criminal Damage,Criminal Damage,Barking and Dagenham,132,105,88,97,127,138,130,...,97,103,107,80,86,121,122,114,116,2705
2,Burglary,Burglary - Business and Community,Barking and Dagenham,32,39,33,45,24,29,27,...,31,17,28,29,16,16,28,24,32,686
3,Burglary,Burglary - Residential,Barking and Dagenham,94,106,164,114,107,99,96,...,116,123,97,57,42,63,72,63,54,2236
4,Drug Offences,Drug Trafficking,Barking and Dagenham,9,7,4,6,2,6,5,...,15,6,6,13,13,11,22,8,10,207


In [12]:
london_crime_data_new=london_crime_data[['LookUp_BoroughName','Sum']]

In [14]:
crime_data = london_crime_data_new.groupby(['LookUp_BoroughName'], as_index=False).sum()

In [15]:
crime_data.head()

Unnamed: 0,LookUp_BoroughName,Sum
0,Barking and Dagenham,39176
1,Barnet,59370
2,Bexley,34005
3,Brent,59730
4,Bromley,47481


In [18]:
crime_data.sort_values(by='Sum', ascending=False).head()

Unnamed: 0,LookUp_BoroughName,Sum
32,Westminster,139135
27,Southwark,73450
5,Camden,71487
24,Newham,69982
20,Lambeth,68329


In [19]:
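# Scale each total by the highest borough total (Westminster, 139135), so the value lies in (0, 1]
crime_data['crime_rate']=crime_data['Sum']/crime_data['Sum'].max()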

In [21]:
crime_data_final=crime_data[['LookUp_BoroughName','crime_rate']]

In [22]:
crime_data_final.columns=['Borough','crime_level']

In [23]:
crime_data_final.head()

Unnamed: 0,Borough,crime_level
0,Barking and Dagenham,0.281568
1,Barnet,0.426708
2,Bexley,0.244403
3,Brent,0.429295
4,Bromley,0.341258


In [27]:
# Download transport data
london_transport_data = pd.read_csv("Borough AvPTAI2015 (1).csv")
london_transport_data.head()

Unnamed: 0,Borough Code,Borough Name,AvPTAI2015,PTAL
0,E09000021,Kingston upon Thames,5.425275,2
1,E09000008,Croydon,6.757744,2
2,E09000006,Bromley,3.592084,1b
3,E09000018,Hounslow,5.842208,2
4,E09000009,Ealing,8.50351,2


In [28]:
london_transport_data_new=london_transport_data[['Borough Name','PTAL']]

In [29]:
london_transport_data_new.head()

Unnamed: 0,Borough Name,PTAL
0,Kingston upon Thames,2
1,Croydon,2
2,Bromley,1b
3,Hounslow,2
4,Ealing,2


In [30]:
london_transport_data_new.columns=['Borough','transport_level']

In [31]:
transport_data_final=london_transport_data_new

In [32]:
transport_data_final.head()

Unnamed: 0,Borough,transport_level
0,Kingston upon Thames,2
1,Croydon,2
2,Bromley,1b
3,Hounslow,2
4,Ealing,2


In [49]:
df=pd.merge(crime_data_final,transport_data_final,how='outer',on="Borough")

In [58]:
df.head()

Unnamed: 0,Borough,crime_level,transport_level
0,Barking and Dagenham,0.281568,2
1,Barnet,0.426708,2
2,Bexley,0.244403,1b
3,Brent,0.429295,2
4,Bromley,0.341258,1b
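
An outer merge keeps boroughs whose names do not match across the two sources as rows with missing values, which are easy to overlook. The sketch below shows one way to list such rows with the `indicator` option of `pd.merge`. The two small frames are stand-ins for `crime_data_final` and `transport_data_final`; the "City of London" row and its level are hypothetical and only there to force a mismatch.

```python
import pandas as pd

# Toy frames standing in for crime_data_final and transport_data_final.
# The "City of London" entry is a hypothetical row used to create a mismatch.
crime = pd.DataFrame({"Borough": ["Barnet", "Bexley", "Westminster"],
                      "crime_level": [0.426708, 0.244403, 1.0]})
transport = pd.DataFrame({"Borough": ["Barnet", "Bexley", "City of London"],
                          "transport_level": ["2", "1b", "6b"]})

# indicator=True adds a "_merge" column: "both", "left_only" or "right_only"
merged = pd.merge(crime, transport, how="outer", on="Borough", indicator=True)

# Rows that were found in only one of the two sources
unmatched = merged.loc[merged["_merge"] != "both", ["Borough", "_merge"]]
print(unmatched)
```

Checking `unmatched` before calling `dropna()` makes it clear which boroughs would be lost and why.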


In [59]:
# Get the borough coordinates from Wikipedia

In [60]:
URL = 'https://en.wikipedia.org/wiki/List_of_London_boroughs'
page = requests.get(URL).text

In [61]:
soup = BeautifulSoup(page, "html.parser")

In [62]:
table=soup.find('table')

In [63]:
table

<table class="wikitable sortable" style="font-size:100%" width="100%">
<tbody><tr>
<th>Borough
</th>
<th>Inner
</th>
<th>Status
</th>
<th>Local authority
</th>
<th>Political control
</th>
<th>Headquarters
</th>
<th>Area (sq mi)
</th>
<th>Population (2013 est)<sup class="reference" id="cite_ref-1"><a href="#cite_note-1">[1]</a></sup>
</th>
<th>Co-ordinates
</th>
<th><span style="background:#67BCD3"> Nr. in map </span>
</th></tr>
<tr>
<td><a href="/wiki/London_Borough_of_Barking_and_Dagenham" title="London Borough of Barking and Dagenham">Barking and Dagenham</a> <sup class="reference" id="cite_ref-2"><a href="#cite_note-2">[note 1]</a></sup>
</td>
<td>
</td>
<td>
</td>
<td><a href="/wiki/Barking_and_Dagenham_London_Borough_Council" title="Barking and Dagenham London Borough Council">Barking and Dagenham London Borough Council</a>
</td>
<td><a href="/wiki/Labour_Party_(UK)" title="Labour Party (UK)">Labour</a>
</td>
<td><a href="/wiki/Barking_Town_Hall" title="Barking Town Hall">Town Hal

In [64]:

BoroughName = []
Population = []
Coordinates = []

for row in soup.find('table').find_all('tr'):
    cells = row.find_all('td')
    if len(cells) > 0:
        BoroughName.append(cells[0].text.rstrip('\n'))
        Population.append(cells[7].text.rstrip('\n'))
        Coordinates.append(cells[8].text.rstrip('\n'))

In [65]:

# Form a dataframe
# Use a descriptive name so the built-in dict is not shadowed
borough_dict = {'BoroughName' : BoroughName,
       'Population' : Population,
       'Coordinates': Coordinates}
info = pd.DataFrame.from_dict(borough_dict)
info.head()

Unnamed: 0,BoroughName,Population,Coordinates
0,Barking and Dagenham [note 1],194352,51°33′39″N 0°09′21″E﻿ / ﻿51.5607°N 0.1557°E﻿ /...
1,Barnet,369088,51°37′31″N 0°09′06″W﻿ / ﻿51.6252°N 0.1517°W﻿ /...
2,Bexley,236687,51°27′18″N 0°09′02″E﻿ / ﻿51.4549°N 0.1505°E﻿ /...
3,Brent,317264,51°33′32″N 0°16′54″W﻿ / ﻿51.5588°N 0.2817°W﻿ /...
4,Bromley,317899,51°24′14″N 0°01′11″E﻿ / ﻿51.4039°N 0.0198°E﻿ /...


In [66]:
# Strip unwanted texts
info['BoroughName'] = info['BoroughName'].map(lambda x: x.rstrip(']'))
info['BoroughName'] = info['BoroughName'].map(lambda x: x.rstrip('1234567890.'))
info['BoroughName'] = info['BoroughName'].str.replace('note','')
info['BoroughName'] = info['BoroughName'].map(lambda x: x.rstrip(' ['))
info.head()

Unnamed: 0,BoroughName,Population,Coordinates
0,Barking and Dagenham,194352,51°33′39″N 0°09′21″E﻿ / ﻿51.5607°N 0.1557°E﻿ /...
1,Barnet,369088,51°37′31″N 0°09′06″W﻿ / ﻿51.6252°N 0.1517°W﻿ /...
2,Bexley,236687,51°27′18″N 0°09′02″E﻿ / ﻿51.4549°N 0.1505°E﻿ /...
3,Brent,317264,51°33′32″N 0°16′54″W﻿ / ﻿51.5588°N 0.2817°W﻿ /...
4,Bromley,317899,51°24′14″N 0°01′11″E﻿ / ﻿51.4039°N 0.0198°E﻿ /...


In [67]:
info[['Coordinates1','Coordinates2','Coordinates3']] = info['Coordinates'].str.split('/',expand=True)
info.head()

Unnamed: 0,BoroughName,Population,Coordinates,Coordinates1,Coordinates2,Coordinates3
0,Barking and Dagenham,194352,51°33′39″N 0°09′21″E﻿ / ﻿51.5607°N 0.1557°E﻿ /...,51°33′39″N 0°09′21″E﻿,﻿51.5607°N 0.1557°E﻿,51.5607; 0.1557﻿ (Barking and Dagenham)
1,Barnet,369088,51°37′31″N 0°09′06″W﻿ / ﻿51.6252°N 0.1517°W﻿ /...,51°37′31″N 0°09′06″W﻿,﻿51.6252°N 0.1517°W﻿,51.6252; -0.1517﻿ (Barnet)
2,Bexley,236687,51°27′18″N 0°09′02″E﻿ / ﻿51.4549°N 0.1505°E﻿ /...,51°27′18″N 0°09′02″E﻿,﻿51.4549°N 0.1505°E﻿,51.4549; 0.1505﻿ (Bexley)
3,Brent,317264,51°33′32″N 0°16′54″W﻿ / ﻿51.5588°N 0.2817°W﻿ /...,51°33′32″N 0°16′54″W﻿,﻿51.5588°N 0.2817°W﻿,51.5588; -0.2817﻿ (Brent)
4,Bromley,317899,51°24′14″N 0°01′11″E﻿ / ﻿51.4039°N 0.0198°E﻿ /...,51°24′14″N 0°01′11″E﻿,﻿51.4039°N 0.0198°E﻿,51.4039; 0.0198﻿ (Bromley)


In [68]:
info.drop(labels=['Coordinates','Coordinates1','Coordinates2'], axis=1,inplace = True)
info[['Latitude','Longitude']] = info['Coordinates3'].str.split(';',expand=True)
info.head()

Unnamed: 0,BoroughName,Population,Coordinates3,Latitude,Longitude
0,Barking and Dagenham,194352,51.5607; 0.1557﻿ (Barking and Dagenham),51.5607,0.1557﻿ (Barking and Dagenham)
1,Barnet,369088,51.6252; -0.1517﻿ (Barnet),51.6252,-0.1517﻿ (Barnet)
2,Bexley,236687,51.4549; 0.1505﻿ (Bexley),51.4549,0.1505﻿ (Bexley)
3,Brent,317264,51.5588; -0.2817﻿ (Brent),51.5588,-0.2817﻿ (Brent)
4,Bromley,317899,51.4039; 0.0198﻿ (Bromley),51.4039,0.0198﻿ (Bromley)


In [None]:
info.drop(labels=['Coordinates3'], axis=1,inplace = True)
info['Latitude'] = info['Latitude'].map(lambda x: x.rstrip(u'\ufeff'))
info['Latitude'] = info['Latitude'].map(lambda x: x.lstrip())
info['Longitude'] = info['Longitude'].map(lambda x: x.rstrip(')'))
info['Longitude'] = info['Longitude'].map(lambda x: x.rstrip('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ '))
info['Longitude'] = info['Longitude'].map(lambda x: x.rstrip(' ('))
info['Longitude'] = info['Longitude'].map(lambda x: x.rstrip(u'\ufeff'))
info['Longitude'] = info['Longitude'].map(lambda x: x.lstrip())
info['Population'] = info['Population'].str.replace(',','')



In [74]:
info.head()

Unnamed: 0,BoroughName,Population,Latitude,Longitude
0,Barking and Dagenham,194352,51.5607,0.1557
1,Barnet,369088,51.6252,-0.1517
2,Bexley,236687,51.4549,0.1505
3,Brent,317264,51.5588,-0.2817
4,Bromley,317899,51.4039,0.0198
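
The chain of `rstrip` calls above could also be replaced by a single regular expression. The following is a minimal sketch of that alternative, not the method used in this notebook: it assumes the decimal pair always appears as `<lat>; <lon>` in the `Coordinates3` text, and the two sample strings are copied from the output shown earlier.

```python
import pandas as pd

# Two raw values in the shape of the 'Coordinates3' column, including the
# stray U+FEFF character that comes with the Wikipedia table text.
raw = pd.Series([
    " 51.5607; 0.1557\ufeff (Barking and Dagenham)",
    " 51.6252; -0.1517\ufeff (Barnet)",
])

# Capture "<lat>; <lon>" as signed decimals; everything else is ignored.
pattern = r"(?P<Latitude>-?\d+(?:\.\d+)?)\s*;\s*(?P<Longitude>-?\d+(?:\.\d+)?)"

# Named groups become the column names of the extracted DataFrame.
coords = raw.str.extract(pattern).astype(float)
print(coords)
```

Because the values are converted to `float` directly, the later `pd.to_numeric` calls on latitude and longitude would not be needed with this approach.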


In [78]:
print(venues.shape)
venues.head()

(1139, 7)


Unnamed: 0,BoroughName,Borough Latitude,Borough Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Barking and Dagenham,51.5607,0.1557,Central Park,51.55956,0.161981,Park
1,Barking and Dagenham,51.5607,0.1557,Crowlands Heath Golf Course,51.562457,0.155818,Golf Course
2,Barking and Dagenham,51.5607,0.1557,Robert Clack Leisure Centre,51.560808,0.152704,Martial Arts School
3,Barking and Dagenham,51.5607,0.1557,Morrisons,51.559774,0.148752,Supermarket
4,Barking and Dagenham,51.5607,0.1557,Becontree Heath Bus Station,51.561065,0.150998,Bus Station


In [81]:
info

Unnamed: 0,BoroughName,Population,Latitude,Longitude
0,Barking and Dagenham,194352,51.5607,0.1557
1,Barnet,369088,51.6252,-0.1517
2,Bexley,236687,51.4549,0.1505
3,Brent,317264,51.5588,-0.2817
4,Bromley,317899,51.4039,0.0198
5,Camden,229719,51.529,-0.1255
6,Croydon,372752,51.3714,-0.0977
7,Ealing,342494,51.513,-0.3089
8,Enfield,320524,51.6538,-0.0799
9,Greenwich,264008,51.4892,0.0648


In [82]:
df

Unnamed: 0,Borough,crime_level,transport_level
0,Barking and Dagenham,0.281568,2
1,Barnet,0.426708,2
2,Bexley,0.244403,1b
3,Brent,0.429295,2
4,Bromley,0.341258,1b
5,Camden,0.513796,5
6,Croydon,0.482359,2
7,Ealing,0.439494,2
8,Enfield,0.425292,1b
9,Greenwich,0.395573,2


In [86]:
df.columns=['BoroughName','crime_level','transport_level']

In [87]:
df

Unnamed: 0,BoroughName,crime_level,transport_level
0,Barking and Dagenham,0.281568,2
1,Barnet,0.426708,2
2,Bexley,0.244403,1b
3,Brent,0.429295,2
4,Bromley,0.341258,1b
5,Camden,0.513796,5
6,Croydon,0.482359,2
7,Ealing,0.439494,2
8,Enfield,0.425292,1b
9,Greenwich,0.395573,2


In [88]:
df=pd.merge(df,info,how='outer',on="BoroughName")

In [89]:
df

Unnamed: 0,BoroughName,crime_level,transport_level,Population,Latitude,Longitude
0,Barking and Dagenham,0.281568,2,194352.0,51.5607,0.1557
1,Barnet,0.426708,2,369088.0,51.6252,-0.1517
2,Bexley,0.244403,1b,236687.0,51.4549,0.1505
3,Brent,0.429295,2,317264.0,51.5588,-0.2817
4,Bromley,0.341258,1b,317899.0,51.4039,0.0198
5,Camden,0.513796,5,229719.0,51.529,-0.1255
6,Croydon,0.482359,2,372752.0,51.3714,-0.0977
7,Ealing,0.439494,2,342494.0,51.513,-0.3089
8,Enfield,0.425292,1b,320524.0,51.6538,-0.0799
9,Greenwich,0.395573,2,264008.0,51.4892,0.0648


In [96]:
df = df.dropna()

In [97]:
df

Unnamed: 0,BoroughName,crime_level,transport_level,Population,Latitude,Longitude
0,Barking and Dagenham,0.281568,2,194352,51.5607,0.1557
1,Barnet,0.426708,2,369088,51.6252,-0.1517
2,Bexley,0.244403,1b,236687,51.4549,0.1505
3,Brent,0.429295,2,317264,51.5588,-0.2817
4,Bromley,0.341258,1b,317899,51.4039,0.0198
5,Camden,0.513796,5,229719,51.529,-0.1255
6,Croydon,0.482359,2,372752,51.3714,-0.0977
7,Ealing,0.439494,2,342494,51.513,-0.3089
8,Enfield,0.425292,1b,320524,51.6538,-0.0799
9,Greenwich,0.395573,2,264008,51.4892,0.0648


In [98]:
# Read the Foursquare credentials from environment variables rather than
# hard-coding them in the notebook (the variable names are an assumption).
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
print('Foursquare credentials loaded.')

Foursquare credentials loaded.


In [105]:
def getNearbyVenues(names, latitudes, longitudes):
    radius=500
    LIMIT=100
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['BoroughName', 
                  'BoroughName Latitude', 
                  'BoroughName Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
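
The request inside `getNearbyVenues` builds the URL by string formatting and indexes straight into the JSON response, so an HTTP error or an empty result raises an unhelpful `KeyError` or `IndexError`. The sketch below shows a more defensive variant using the same endpoint and the same query parameters. The helper names `build_explore_params` and `fetch_venue_items` are hypothetical, and the 10-second timeout is an arbitrary choice.

```python
import requests

# Same endpoint as in getNearbyVenues above
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_params(client_id, client_secret, version, lat, lng,
                         radius=500, limit=100):
    """Return the query parameters for one venues/explore request."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }


def fetch_venue_items(params, timeout=10):
    """Hypothetical helper: request venues and fail loudly on HTTP errors."""
    response = requests.get(EXPLORE_URL, params=params, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    groups = response.json().get("response", {}).get("groups", [])
    # Return an empty list when no venues are found for the location
    return groups[0]["items"] if groups else []


# Building the parameters needs no network access; "ID" and "SECRET" are placeholders.
params = build_explore_params("ID", "SECRET", "20180604", "51.5607", "0.1557")
print(params["ll"])
```

Passing the parameters as a dictionary lets `requests` handle the URL encoding, and the empty-list fallback keeps the list comprehension in `getNearbyVenues` working for boroughs with no nearby venues.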

In [106]:
london_venues = getNearbyVenues(names=df['BoroughName'],
                                   latitudes=df['Latitude'],
                                   longitudes=df['Longitude']
                                  )

Barking and Dagenham
Barnet
Bexley
Brent
Bromley
Camden
Croydon
Ealing
Enfield
Greenwich
Hackney
Hammersmith and Fulham
Haringey
Harrow
Havering
Hillingdon
Hounslow
Islington
Kensington and Chelsea
Kingston upon Thames
Lambeth
Lewisham
Merton
Newham
Redbridge
Richmond upon Thames
Southwark
Sutton
Tower Hamlets
Waltham Forest
Wandsworth
Westminster


In [107]:
london_venues.head()

Unnamed: 0,BoroughName,BoroughName Latitude,BoroughName Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Barking and Dagenham,51.5607,0.1557,Central Park,51.55956,0.161981,Park
1,Barking and Dagenham,51.5607,0.1557,Crowlands Heath Golf Course,51.562457,0.155818,Golf Course
2,Barking and Dagenham,51.5607,0.1557,Robert Clack Leisure Centre,51.560808,0.152704,Martial Arts School
3,Barking and Dagenham,51.5607,0.1557,Morrisons,51.559774,0.148752,Supermarket
4,Barking and Dagenham,51.5607,0.1557,Becontree Heath Bus Station,51.561065,0.150998,Bus Station


In [111]:
london_venues.groupby('BoroughName').count()

Unnamed: 0_level_0,BoroughName Latitude,BoroughName Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
BoroughName,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Barking and Dagenham,7,7,7,7,7,7
Barnet,5,5,5,5,5,5
Bexley,29,29,29,29,29,29
Brent,76,76,76,76,76,76
Bromley,39,39,39,39,39,39
Camden,79,79,79,79,79,79
Croydon,38,38,38,38,38,38
Ealing,74,74,74,74,74,74
Enfield,52,52,52,52,52,52
Greenwich,40,40,40,40,40,40


In [170]:
london_onehot = pd.get_dummies(london_venues[['Venue Category']], prefix="", prefix_sep="")

In [171]:
london_onehot = pd.get_dummies(london_onehot,drop_first=True)

In [172]:
london_onehot.insert(loc=0, column='BoroughName', value=london_venues['BoroughName'] )
london_onehot.shape

(1456, 223)

In [173]:
london_grouped = london_onehot.groupby('BoroughName').mean().reset_index()
london_grouped.head()

Unnamed: 0,BoroughName,African Restaurant,Airport,Airport Lounge,Airport Service,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,...,Used Bookstore,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Winery,Women's Store,Yoga Studio
0,Barking and Dagenham,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Barnet,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Bexley,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,...,0.0,0.0,0.034483,0.0,0.034483,0.0,0.0,0.0,0.0,0.0
3,Brent,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Bromley,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [174]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [175]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['BoroughName']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
BoroughName_venues_sorted = pd.DataFrame(columns=columns)
BoroughName_venues_sorted['BoroughName'] = london_grouped['BoroughName']

for ind in np.arange(london_grouped.shape[0]):
    BoroughName_venues_sorted.iloc[ind, 1:] = return_most_common_venues(london_grouped.iloc[ind, :], num_top_venues)

BoroughName_venues_sorted.head()

Unnamed: 0,BoroughName,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Barking and Dagenham,Pool,Bus Station,Supermarket,Martial Arts School,Park,Gym / Fitness Center,Golf Course,Escape Room,Flea Market,Fish Market
1,Barnet,Café,Bus Stop,Home Service,Salon / Barbershop,Yoga Studio,Escape Room,Food Court,Flea Market,Fish Market,Fish & Chips Shop
2,Bexley,Clothing Store,Pub,Coffee Shop,Furniture / Home Store,Fast Food Restaurant,Pharmacy,Supermarket,Portuguese Restaurant,Bakery,Italian Restaurant
3,Brent,Hotel,Coffee Shop,Clothing Store,Bar,Sporting Goods Shop,Grocery Store,Italian Restaurant,Sandwich Place,Indian Restaurant,American Restaurant
4,Bromley,Coffee Shop,Clothing Store,Pizza Place,Bar,Burger Joint,Gym / Fitness Center,Pub,Fast Food Restaurant,Gelato Shop,Bookstore


In [176]:
london_grouped=pd.merge(london_grouped,df, how='outer',on="BoroughName")

In [177]:
london_grouped

Unnamed: 0,BoroughName,African Restaurant,Airport,Airport Lounge,Airport Service,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,...,Wine Bar,Wine Shop,Winery,Women's Store,Yoga Studio,crime_level,transport_level,Population,Latitude,Longitude
0,Barking and Dagenham,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.281568,2,194352,51.5607,0.1557
1,Barnet,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.426708,2,369088,51.6252,-0.1517
2,Bexley,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.244403,1b,236687,51.4549,0.1505
3,Brent,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.429295,2,317264,51.5588,-0.2817
4,Bromley,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.341258,1b,317899,51.4039,0.0198
5,Camden,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,...,0.0,0.0,0.0,0.0,0.0,0.513796,5,229719,51.529,-0.1255
6,Croydon,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.482359,2,372752,51.3714,-0.0977
7,Ealing,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,...,0.013514,0.0,0.0,0.0,0.0,0.439494,2,342494,51.513,-0.3089
8,Enfield,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.019231,0.0,0.425292,1b,320524,51.6538,-0.0799
9,Greenwich,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.395573,2,264008,51.4892,0.0648


In [181]:
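# Recode only the transport_level column. Without the column selector,
# .loc[mask] = 1 overwrites every column of the matching rows, which is
# what the outputs recorded below show (e.g. rows 2 and 4 are all 1).
london_grouped.loc[london_grouped['transport_level']=='1b', 'transport_level']=1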

In [183]:
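# PTAL band 6b denotes better access than 6a, so it gets the higher code.
# The column is selected explicitly so only transport_level is recoded.
london_grouped.loc[london_grouped['transport_level']=='6a', 'transport_level']=6
london_grouped.loc[london_grouped['transport_level']=='6b', 'transport_level']=7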

In [186]:
london_grouped.head()

Unnamed: 0,BoroughName,African Restaurant,Airport,Airport Lounge,Airport Service,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,...,Wine Bar,Wine Shop,Winery,Women's Store,Yoga Studio,crime_level,transport_level,Population,Latitude,Longitude
0,Barking and Dagenham,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.281568,2,194352,51.5607,0.1557
1,Barnet,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.426708,2,369088,51.6252,-0.1517
2,1,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1,1,1.0,1.0
3,Brent,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.429295,2,317264,51.5588,-0.2817
4,1,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1,1,1.0,1.0
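
Recoding the PTAL bands one at a time with `.loc` is easy to get wrong. A dictionary lookup with `Series.map` handles all bands in one step and leaves the other columns untouched. This is a minimal sketch, assuming the usual PTAL ordering from 0 (worst) to 6b (best); the integer codes are an arbitrary choice for illustration and differ from the codes used in this notebook.

```python
import pandas as pd

# Ordinal encoding for PTAL bands, assuming the ordering
# 0 < 1a < 1b < 2 < 3 < 4 < 5 < 6a < 6b (6b = best access).
PTAL_ORDER = ["0", "1a", "1b", "2", "3", "4", "5", "6a", "6b"]
PTAL_CODE = {band: code for code, band in enumerate(PTAL_ORDER)}

# Sample values in the shape of the transport_level column
levels = pd.Series(["2", "1b", "5", "6a", "6b"], name="transport_level")

# Map every band to its code; unknown bands would become NaN
encoded = levels.astype(str).map(PTAL_CODE)
print(encoded.tolist())
```

Any band missing from the dictionary shows up as `NaN`, which makes unexpected values visible instead of silently leaving strings in the column.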


In [190]:
london_grouped.dtypes

BoroughName            object
African Restaurant    float64
Airport               float64
Airport Lounge        float64
Airport Service       float64
                       ...   
crime_level           float64
transport_level        object
Population             object
Latitude               object
Longitude              object
Length: 228, dtype: object

In [192]:
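# astype returns a new Series, so assign the result back to the column
london_grouped['transport_level'] = london_grouped['transport_level'].astype(float)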
del london_grouped['BoroughName']
del london_grouped['Population']
del london_grouped['Latitude']
del london_grouped['Longitude']

london_grouped


Unnamed: 0,African Restaurant,Airport,Airport Lounge,Airport Service,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,...,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Winery,Women's Store,Yoga Studio,crime_level,transport_level
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.281568,2
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.426708,2
2,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1
3,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.429295,2
4,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1
5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.513796,5
6,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.482359,2
7,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,...,0.013514,0.027027,0.013514,0.013514,0.0,0.0,0.0,0.0,0.439494,2
8,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1
9,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.025,0.025,0.0,0.0,0.0,0.0,0.0,0.395573,2


In [194]:
# set number of clusters
kclusters = 5

london_grouped_clustering = london_grouped

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(london_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([3, 3, 2, 3, 2, 4, 3, 3, 2, 3], dtype=int32)
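
The run above fixes `kclusters = 5` and clusters the raw features, which sit on different scales: venue frequencies lie in [0, 1] while the transport level goes up to 7. One common way to check both choices is to standardise the features and compare silhouette scores over a range of k. The sketch below illustrates this on synthetic data only (three artificial groups, not the borough data), so it says nothing about the best k for London.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature matrix: three well-separated groups
# of 12 points each in 4 dimensions (not the real borough data).
centers = np.array([[0, 0, 0, 0], [8, 8, 0, 0], [0, 8, 8, 8]], dtype=float)
X = np.vstack([c + rng.normal(scale=0.5, size=(12, 4)) for c in centers])

# Put every feature on a comparable scale before measuring distances.
X_scaled = StandardScaler().fit_transform(X)

# Fit k-means for several k and record the mean silhouette score of each.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, labels)

# The k with the highest silhouette score is the best-separated partition.
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

With only 32 boroughs and more than 200 venue columns, the silhouette scores on the real data may be low for every k, which would itself be a useful signal to reduce the number of features first.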

In [197]:
BoroughName_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

london_merged = df

# join the sorted venues (with cluster labels) onto the borough data to add crime, transport and latitude/longitude for each borough
london_merged = london_merged.join(BoroughName_venues_sorted.set_index('BoroughName'), on='BoroughName')

london_merged.head(100)

Unnamed: 0,BoroughName,crime_level,transport_level,Population,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Barking and Dagenham,0.281568,2,194352,51.5607,0.1557,3,Pool,Bus Station,Supermarket,Martial Arts School,Park,Gym / Fitness Center,Golf Course,Escape Room,Flea Market,Fish Market
1,Barnet,0.426708,2,369088,51.6252,-0.1517,3,Café,Bus Stop,Home Service,Salon / Barbershop,Yoga Studio,Escape Room,Food Court,Flea Market,Fish Market,Fish & Chips Shop
2,Bexley,0.244403,1b,236687,51.4549,0.1505,2,Clothing Store,Pub,Coffee Shop,Furniture / Home Store,Fast Food Restaurant,Pharmacy,Supermarket,Portuguese Restaurant,Bakery,Italian Restaurant
3,Brent,0.429295,2,317264,51.5588,-0.2817,3,Hotel,Coffee Shop,Clothing Store,Bar,Sporting Goods Shop,Grocery Store,Italian Restaurant,Sandwich Place,Indian Restaurant,American Restaurant
4,Bromley,0.341258,1b,317899,51.4039,0.0198,2,Coffee Shop,Clothing Store,Pizza Place,Bar,Burger Joint,Gym / Fitness Center,Pub,Fast Food Restaurant,Gelato Shop,Bookstore
5,Camden,0.513796,5,229719,51.529,-0.1255,4,Coffee Shop,Hotel,Café,Pub,Italian Restaurant,Burger Joint,Breakfast Spot,Pizza Place,Plaza,Modern European Restaurant
6,Croydon,0.482359,2,372752,51.3714,-0.0977,3,Pub,Coffee Shop,Asian Restaurant,Portuguese Restaurant,Mediterranean Restaurant,Bookstore,Gaming Cafe,Breakfast Spot,Malay Restaurant,Spanish Restaurant
7,Ealing,0.439494,2,342494,51.513,-0.3089,3,Coffee Shop,Clothing Store,Pub,Italian Restaurant,Park,Bakery,Vietnamese Restaurant,Café,Pizza Place,Gym / Fitness Center
8,Enfield,0.425292,1b,320524,51.6538,-0.0799,2,Clothing Store,Coffee Shop,Supermarket,Café,Pub,Optical Shop,Shopping Mall,Pharmacy,Department Store,Gift Shop
9,Greenwich,0.395573,2,264008,51.4892,0.0648,3,Pub,Clothing Store,Fast Food Restaurant,Coffee Shop,Hotel,Grocery Store,Pharmacy,Supermarket,Plaza,African Restaurant


In [199]:
london_merged.dtypes

BoroughName                object
crime_level               float64
transport_level            object
Population                 object
Latitude                   object
Longitude                  object
Cluster Labels              int32
1st Most Common Venue      object
2nd Most Common Venue      object
3rd Most Common Venue      object
4th Most Common Venue      object
5th Most Common Venue      object
6th Most Common Venue      object
7th Most Common Venue      object
8th Most Common Venue      object
9th Most Common Venue      object
10th Most Common Venue     object
dtype: object

In [202]:
london_merged['Population'] = pd.to_numeric(london_merged['Population'])
london_merged['Latitude'] = pd.to_numeric(london_merged['Latitude'])
london_merged['Longitude'] = pd.to_numeric(london_merged['Longitude'])
london_merged.dtypes

BoroughName                object
crime_level               float64
transport_level            object
Population                  int64
Latitude                  float64
Longitude                 float64
Cluster Labels              int32
1st Most Common Venue      object
2nd Most Common Venue      object
3rd Most Common Venue      object
4th Most Common Venue      object
5th Most Common Venue      object
6th Most Common Venue      object
7th Most Common Venue      object
8th Most Common Venue      object
9th Most Common Venue      object
10th Most Common Venue     object
dtype: object

In [203]:
address = 'London, UK'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of London are {}, {}.'.format(latitude, longitude))

The geographical coordinates of London are 51.5073219, -0.1276474.


In [206]:
london_merged = london_merged[london_merged['Cluster Labels'] >= 1] 
london_merged['Cluster Labels'] =london_merged['Cluster Labels'].astype(int)

map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(london_merged['Latitude'], london_merged['Longitude'], london_merged['BoroughName'], london_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

# Results and Discussion

In [237]:
london_merged['Cluster Labels_new'] = london_merged['Cluster Labels'] + 1

In [239]:
london_merged

Unnamed: 0,BoroughName,crime_level,transport_level,Population,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels_new
0,Barking and Dagenham,0.281568,2,194352,51.5607,0.1557,3,Pool,Bus Station,Supermarket,Martial Arts School,Park,Gym / Fitness Center,Golf Course,Escape Room,Flea Market,Fish Market,4
1,Barnet,0.426708,2,369088,51.6252,-0.1517,3,Café,Bus Stop,Home Service,Salon / Barbershop,Yoga Studio,Escape Room,Food Court,Flea Market,Fish Market,Fish & Chips Shop,4
2,Bexley,0.244403,1b,236687,51.4549,0.1505,2,Clothing Store,Pub,Coffee Shop,Furniture / Home Store,Fast Food Restaurant,Pharmacy,Supermarket,Portuguese Restaurant,Bakery,Italian Restaurant,3
3,Brent,0.429295,2,317264,51.5588,-0.2817,3,Hotel,Coffee Shop,Clothing Store,Bar,Sporting Goods Shop,Grocery Store,Italian Restaurant,Sandwich Place,Indian Restaurant,American Restaurant,4
4,Bromley,0.341258,1b,317899,51.4039,0.0198,2,Coffee Shop,Clothing Store,Pizza Place,Bar,Burger Joint,Gym / Fitness Center,Pub,Fast Food Restaurant,Gelato Shop,Bookstore,3
5,Camden,0.513796,5,229719,51.529,-0.1255,4,Coffee Shop,Hotel,Café,Pub,Italian Restaurant,Burger Joint,Breakfast Spot,Pizza Place,Plaza,Modern European Restaurant,5
6,Croydon,0.482359,2,372752,51.3714,-0.0977,3,Pub,Coffee Shop,Asian Restaurant,Portuguese Restaurant,Mediterranean Restaurant,Bookstore,Gaming Cafe,Breakfast Spot,Malay Restaurant,Spanish Restaurant,4
7,Ealing,0.439494,2,342494,51.513,-0.3089,3,Coffee Shop,Clothing Store,Pub,Italian Restaurant,Park,Bakery,Vietnamese Restaurant,Café,Pizza Place,Gym / Fitness Center,4
8,Enfield,0.425292,1b,320524,51.6538,-0.0799,2,Clothing Store,Coffee Shop,Supermarket,Café,Pub,Optical Shop,Shopping Mall,Pharmacy,Department Store,Gift Shop,3
9,Greenwich,0.395573,2,264008,51.4892,0.0648,3,Pub,Clothing Store,Fast Food Restaurant,Coffee Shop,Hotel,Grocery Store,Pharmacy,Supermarket,Plaza,African Restaurant,4


In [240]:
# Cluster 1: areas with the best transport links, but a relatively high crime rate
london_merged.loc[london_merged['Cluster Labels'] == 1]

Unnamed: 0,BoroughName,crime_level,transport_level,Population,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels_new
17,Islington,0.409581,6a,215667,51.5416,-0.1022,1,Pub,Mediterranean Restaurant,Cocktail Bar,Theater,Bakery,Boutique,Burger Joint,Park,Ice Cream Shop,Kebab Restaurant,2
32,Westminster,1.0,6a,226841,51.4973,-0.1372,1,Hotel,Coffee Shop,Sandwich Place,Theater,Sushi Restaurant,Gym / Fitness Center,Italian Restaurant,Pub,Hotel Bar,Juice Bar,2


In [241]:
# Cluster 2: areas with the worst transport links, but a relatively low crime rate
london_merged.loc[london_merged['Cluster Labels'] == 2]

Unnamed: 0,BoroughName,crime_level,transport_level,Population,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels_new
2,Bexley,0.244403,1b,236687,51.4549,0.1505,2,Clothing Store,Pub,Coffee Shop,Furniture / Home Store,Fast Food Restaurant,Pharmacy,Supermarket,Portuguese Restaurant,Bakery,Italian Restaurant,3
4,Bromley,0.341258,1b,317899,51.4039,0.0198,2,Coffee Shop,Clothing Store,Pizza Place,Bar,Burger Joint,Gym / Fitness Center,Pub,Fast Food Restaurant,Gelato Shop,Bookstore,3
8,Enfield,0.425292,1b,320524,51.6538,-0.0799,2,Clothing Store,Coffee Shop,Supermarket,Café,Pub,Optical Shop,Shopping Mall,Pharmacy,Department Store,Gift Shop,3
14,Havering,0.262357,1b,242080,51.5812,0.1837,2,Coffee Shop,Clothing Store,Shopping Mall,Pub,Fast Food Restaurant,Department Store,Bookstore,Bakery,Café,Hotel,3
15,Hillingdon,0.381054,1b,286806,51.5441,-0.476,2,Coffee Shop,Clothing Store,Pharmacy,Italian Restaurant,Fast Food Restaurant,Department Store,Burger Joint,Pub,Bookstore,Toy / Game Store,3
26,Richmond upon Thames,0.183635,1b,191365,51.4479,-0.326,2,Pub,Coffee Shop,Italian Restaurant,Bus Stop,Grocery Store,Steakhouse,Indian Restaurant,Pharmacy,Mediterranean Restaurant,Deli / Bodega,3


In [242]:
# Cluster 3: areas with slightly better transport links than cluster 2, but a medium crime rate
london_merged.loc[london_merged['Cluster Labels'] == 3]

Unnamed: 0,BoroughName,crime_level,transport_level,Population,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels_new
0,Barking and Dagenham,0.281568,2,194352,51.5607,0.1557,3,Pool,Bus Station,Supermarket,Martial Arts School,Park,Gym / Fitness Center,Golf Course,Escape Room,Flea Market,Fish Market,4
1,Barnet,0.426708,2,369088,51.6252,-0.1517,3,Café,Bus Stop,Home Service,Salon / Barbershop,Yoga Studio,Escape Room,Food Court,Flea Market,Fish Market,Fish & Chips Shop,4
3,Brent,0.429295,2,317264,51.5588,-0.2817,3,Hotel,Coffee Shop,Clothing Store,Bar,Sporting Goods Shop,Grocery Store,Italian Restaurant,Sandwich Place,Indian Restaurant,American Restaurant,4
6,Croydon,0.482359,2,372752,51.3714,-0.0977,3,Pub,Coffee Shop,Asian Restaurant,Portuguese Restaurant,Mediterranean Restaurant,Bookstore,Gaming Cafe,Breakfast Spot,Malay Restaurant,Spanish Restaurant,4
7,Ealing,0.439494,2,342494,51.513,-0.3089,3,Coffee Shop,Clothing Store,Pub,Italian Restaurant,Park,Bakery,Vietnamese Restaurant,Café,Pizza Place,Gym / Fitness Center,4
9,Greenwich,0.395573,2,264008,51.4892,0.0648,3,Pub,Clothing Store,Fast Food Restaurant,Coffee Shop,Hotel,Grocery Store,Pharmacy,Supermarket,Plaza,African Restaurant,4
13,Harrow,0.23884,2,243372,51.5898,-0.3346,3,Indian Restaurant,Platform,Grocery Store,Supermarket,Coffee Shop,Thai Restaurant,Indie Movie Theater,Fish Market,Flea Market,Fish & Chips Shop,4
16,Hounslow,0.375211,2,262407,51.4746,-0.368,3,Pizza Place,Café,Bed & Breakfast,Park,Yoga Studio,Event Space,French Restaurant,Food Court,Flea Market,Fish Market,4
19,Kingston upon Thames,0.180896,2,166793,51.4085,-0.3064,3,Coffee Shop,Clothing Store,Café,Pub,Italian Restaurant,Department Store,Burger Joint,Sushi Restaurant,Bakery,Sandwich Place,4
23,Merton,0.204269,2,203223,51.4014,-0.1958,3,Café,Italian Restaurant,Supermarket,Indian Restaurant,Park,Garden Center,Fast Food Restaurant,Sandwich Place,Bar,Bakery,4


In [243]:
# Cluster 4: areas with good transport links and a medium crime rate
london_merged.loc[london_merged['Cluster Labels'] == 4]

Unnamed: 0,BoroughName,crime_level,transport_level,Population,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels_new
5,Camden,0.513796,5,229719,51.529,-0.1255,4,Coffee Shop,Hotel,Café,Pub,Italian Restaurant,Burger Joint,Breakfast Spot,Pizza Place,Plaza,Modern European Restaurant,5
10,Hackney,0.472879,4,257379,51.545,-0.0553,4,Pub,Coffee Shop,Cocktail Bar,Bakery,Café,Brewery,Vegetarian / Vegan Restaurant,Organic Grocery,Grocery Store,Modern European Restaurant,5
11,Hammersmith and Fulham,0.315952,4,178685,51.4927,-0.2339,4,Pub,Café,Italian Restaurant,Coffee Shop,Indian Restaurant,Grocery Store,Clothing Store,Gastropub,Hotel,Vietnamese Restaurant,5
18,Kensington and Chelsea,0.317943,5,155594,51.502,-0.1947,4,Café,Juice Bar,Clothing Store,Restaurant,French Restaurant,Italian Restaurant,Gym / Fitness Center,Hotel,Burger Joint,Bakery,5
20,Lambeth,0.491099,5,314242,51.4607,-0.1163,4,Caribbean Restaurant,Pub,Market,Coffee Shop,Beer Bar,Pizza Place,Gym / Fitness Center,Nightclub,Sandwich Place,Mexican Restaurant,5
27,Southwark,0.527905,5,298464,51.5035,-0.0804,4,Coffee Shop,Pub,Bar,Cocktail Bar,Restaurant,French Restaurant,Scenic Lookout,Indian Restaurant,Italian Restaurant,English Restaurant,5
29,Tower Hamlets,0.484328,4,272890,51.5099,-0.0059,4,Coffee Shop,Hotel,Sandwich Place,Italian Restaurant,Outdoor Sculpture,Chinese Restaurant,Pizza Place,Café,Convenience Store,Grocery Store,5
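
The cluster descriptions in the cells above can be cross-checked with a per-cluster summary. The sketch below rebuilds a small sample (two boroughs per cluster, with values copied from the tables above) and computes the mean `crime_level` and the distinct `transport_level` bands in each cluster. The names `sample` and `summary` are illustrative only, and because the sample is partial its means will differ from those of the full clusters.

```python
import pandas as pd

# A small sample copied from the cluster tables above (not the full dataset).
sample = pd.DataFrame({
    'BoroughName': ['Islington', 'Westminster', 'Bexley', 'Bromley',
                    'Barnet', 'Brent', 'Camden', 'Hackney'],
    'crime_level': [0.409581, 1.0, 0.244403, 0.341258,
                    0.426708, 0.429295, 0.513796, 0.472879],
    'transport_level': ['6a', '6a', '1b', '1b', '2', '2', '5', '4'],
    'Cluster Labels': [1, 1, 2, 2, 3, 3, 4, 4],
})

# One row per cluster: borough count, mean crime level, distinct PTAL bands.
summary = sample.groupby('Cluster Labels').agg(
    boroughs=('BoroughName', 'count'),
    mean_crime=('crime_level', 'mean'),
    transport_levels=('transport_level', lambda s: ', '.join(sorted(s.unique()))),
)
print(summary)
```

In the notebook, the same `groupby` could be run on `london_merged` directly to summarise every borough in each cluster.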


# Conclusion 

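This report presents four clusters of London boroughs based on their similarities in crime rate, transport accessibility level and most common venues.
The results show that the four clusters are broadly aligned with the London transport zones, from zone 1 to zone 5: the best-connected cluster (Islington and Westminster) lies in the centre, while the least-connected cluster (for example Bexley, Bromley and Havering) lies on the outskirts.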