# Introduction

In the 21st century, almost every process we rely on is automated, from ordering our favorite food or requesting a taxi to executing millions of transactions at the press of a button. All of this is thanks to major progress in information technology.

Nowadays, if we want to find a lovely place for dinner in a foreign country, we simply open a maps application: using our geolocation, it can infer our preferences and suggest the top-rated restaurants nearby in no time. This is not only user-friendly and reliable, it can also be extremely valuable for commercial use. For example, suppose we are asked to build a new venue (an office building, department store, grocery store, restaurant and so on) in a city we have never been to. By using machine learning algorithms, we can cluster the city's dataset and visualize it on a map to predict the most promising spot for the new business. That is exactly what we will be doing today. Imagine that we have a business project to build a new department store in the City of London, London's historic financial district, and we need to find the best neighbourhood to proceed with. Please take your seat and let me take you through this fascinating journey, in which we will explore and cluster the neighbourhoods of the City of London to find the one that best suits our needs.



# Data

First things first: to start analyzing neighbourhoods in the City of London, we need a dataset that includes borough and neighbourhood names together with their coordinates, because we will need them later for data visualization. We will use the dataset provided by doogal.co.uk, which is almost ideal for our project. We then only need to clean the data a little by dropping the columns we do not need and renaming the District column to Borough and the Ward column to Neighbourhood, so that the data is easier to read.

Afterwards we will call the Foursquare API to retrieve the venues around each neighbourhood and identify its 20 most common venue categories, so that we can cluster the neighbourhoods and decide which one is the most suitable for our project.

# Methodology

In this section, we will first clean our dataset to prepare it for visualization and clustering.

Secondly, once the data is cleaned, we will visualize it on a map to see how it is distributed.

Thirdly, we will group the neighbourhoods using the k-means clustering algorithm.

Lastly, we will plot the processed dataset on the map, so that we can predict the most suitable place for our project.
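
To make the clustering step more concrete, here is a minimal, dependency-free sketch of the two alternating steps behind k-means (assign each point to its nearest centroid, then move each centroid to the mean of its points). The notebook itself imports `KMeans` from scikit-learn; the `kmeans` helper and the toy frequency values below are hypothetical and only illustrate the idea.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical (coffee shop, hotel) category frequencies for four neighbourhoods.
frequencies = [(0.30, 0.02), (0.28, 0.03), (0.05, 0.25), (0.04, 0.27)]
labels, centroids = kmeans(frequencies, centroids=[frequencies[0], frequencies[2]])
print(labels)
```

Neighbourhoods with similar venue profiles end up with the same label. In practice the result depends on the initial centroids, which is why library implementations usually try several initializations.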

### Installing and importing all required libraries for our project


In [1]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
import json # library to handle JSON files
import requests # library to handle requests
from pandas import json_normalize # flattens JSON into a table (pandas.io.json.json_normalize is deprecated since pandas 1.0)
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
#!conda install folium -c conda-forge --yes
import folium # map rendering library
#!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim

### Reading the data and examining the size of our dataframe

In [2]:
df = pd.read_csv('https://www.doogal.co.uk/UKPostcodesCSV.ashx?area=London')
df.head(10)

Unnamed: 0,Postcode,In Use?,Latitude,Longitude,Easting,Northing,Grid Ref,County,District,Ward,...,Quality,User Type,Last updated,Nearest station,Distance to station,Postcode area,Postcode district,Police force,Water company,Plus Code
0,BR1 1AA,Yes,51.401546,0.015415,540291,168873,TQ402688,Greater London,Bromley,Bromley Town,...,1,0,2019-11-23,Bromley South,0.218257,BR,BR1,Metropolitan Police,Thames Water,9F32C228+J5
1,BR1 1AB,Yes,51.406333,0.015208,540262,169405,TQ402694,Greater London,Bromley,Bromley Town,...,1,0,2019-11-23,Bromley North,0.253666,BR,BR1,Metropolitan Police,Thames Water,9F32C248+G3
2,BR1 1AD,No,51.400057,0.016715,540386,168710,TQ403687,Greater London,Bromley,Bromley Town,...,1,1,2019-11-23,Bromley South,0.044559,BR,BR1,Metropolitan Police,,9F32C228+2M
3,BR1 1AE,Yes,51.404543,0.014195,540197,169204,TQ401692,Greater London,Bromley,Bromley Town,...,1,0,2019-11-23,Bromley North,0.462939,BR,BR1,Metropolitan Police,Thames Water,9F32C237+RM
4,BR1 1AF,Yes,51.401392,0.014948,540259,168855,TQ402688,Greater London,Bromley,Bromley Town,...,1,0,2019-11-23,Bromley South,0.227664,BR,BR1,Metropolitan Police,Thames Water,9F32C227+HX
5,BR1 1AG,Yes,51.401392,0.014948,540259,168855,TQ402688,Greater London,Bromley,Bromley Town,...,1,0,2019-11-23,Bromley South,0.227664,BR,BR1,Metropolitan Police,Thames Water,9F32C227+HX
6,BR1 1AH,Yes,51.400441,0.01739,540432,168754,TQ404687,Greater London,Bromley,Bromley Town,...,1,0,2019-11-23,Bromley South,0.048906,BR,BR1,Metropolitan Police,Thames Water,9F32C228+5X
7,BR1 1AJ,Yes,51.400489,0.018833,540532,168762,TQ405687,Greater London,Bromley,Bromley Town,...,1,0,2019-11-23,Bromley South,0.115632,BR,BR1,Metropolitan Police,Thames Water,9F32C229+5G
8,BR1 1AL,Yes,51.406549,0.01313,540117,169425,TQ401694,Greater London,Bromley,Bromley Town,...,1,0,2019-11-23,Bromley North,0.332674,BR,BR1,Metropolitan Police,,9F32C247+J7
9,BR1 1AX,No,51.408226,0.017578,540421,169620,TQ404696,Greater London,Bromley,Bromley Town,...,1,1,2019-11-23,Bromley North,0.042067,BR,BR1,Metropolitan Police,,9F32C259+72


In [3]:
df.shape

(321375, 46)

### Cleaning our dataset and checking the size afterwards

In [4]:
# Keeping only the columns required for the analysis
df_lon = df[['District', 'Ward', 'Latitude', 'Longitude', 'Nearest station', 'Postcode']]
# Renaming District to Borough and Ward to Neighbourhood
df_lon = df_lon.rename(columns={'District':'Borough', 'Ward':'Neighbourhood'})
df_lon.head(10)

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
0,Bromley,Bromley Town,51.401546,0.015415,Bromley South,BR1 1AA
1,Bromley,Bromley Town,51.406333,0.015208,Bromley North,BR1 1AB
2,Bromley,Bromley Town,51.400057,0.016715,Bromley South,BR1 1AD
3,Bromley,Bromley Town,51.404543,0.014195,Bromley North,BR1 1AE
4,Bromley,Bromley Town,51.401392,0.014948,Bromley South,BR1 1AF
5,Bromley,Bromley Town,51.401392,0.014948,Bromley South,BR1 1AG
6,Bromley,Bromley Town,51.400441,0.01739,Bromley South,BR1 1AH
7,Bromley,Bromley Town,51.400489,0.018833,Bromley South,BR1 1AJ
8,Bromley,Bromley Town,51.406549,0.01313,Bromley North,BR1 1AL
9,Bromley,Bromley Town,51.408226,0.017578,Bromley North,BR1 1AX


In [5]:
df_lon.shape

(321375, 6)

### Creating the City of London dataset and checking its size

In [6]:
city_of_london_data = df_lon[df_lon['Borough'] == 'City of London'].reset_index(drop=True)

city_of_london_data.head(10)

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
0,City of London,Bishopsgate,51.518895,-0.078378,Liverpool Street,E1 6AN
1,City of London,Portsoken,51.515567,-0.075635,Aldgate,E1 7AA
2,City of London,Portsoken,51.515457,-0.076718,Aldgate,E1 7AD
3,City of London,Portsoken,51.515613,-0.076899,Aldgate,E1 7AE
4,City of London,Portsoken,51.515613,-0.076899,Aldgate,E1 7AF
5,City of London,Portsoken,51.51563,-0.076279,Aldgate,E1 7AW
6,City of London,Aldgate,51.515526,-0.078592,Aldgate,E1 7AX
7,City of London,Aldgate,51.515526,-0.078592,Aldgate,E1 7AY
8,City of London,Aldgate,51.515175,-0.07761,Aldgate,E1 7BH
9,City of London,Portsoken,51.515432,-0.076806,Aldgate,E1 7BS


In [7]:
city_of_london_data.shape

(6800, 6)

### Creating a dictionary that maps each unique neighbourhood to its rows

In [8]:
city_of_london_data_unique = {Neighbourhood: city_of_london_data[city_of_london_data['Neighbourhood'] == Neighbourhood] for Neighbourhood in city_of_london_data['Neighbourhood'].unique()}

### Converting the dictionary into a list and creating a dataframe for each neighbourhood

In [9]:
dictlist =[]
for key, value in city_of_london_data_unique.items():
    temp = [key,value]
    dictlist.append(temp)

In [10]:
Bishopsgate = dictlist[0][1]
Bishopsgate.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
0,City of London,Bishopsgate,51.518895,-0.078378,Liverpool Street,E1 6AN
13,City of London,Bishopsgate,51.51578,-0.078867,Aldgate,E1 7DD
14,City of London,Bishopsgate,51.51578,-0.078867,Aldgate,E1 7DG
15,City of London,Bishopsgate,51.515791,-0.078869,Aldgate,E1 7DJ
16,City of London,Bishopsgate,51.51578,-0.078867,Aldgate,E1 7DP


In [11]:
Portsoken = dictlist[1][1]
Portsoken.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
1,City of London,Portsoken,51.515567,-0.075635,Aldgate,E1 7AA
2,City of London,Portsoken,51.515457,-0.076718,Aldgate,E1 7AD
3,City of London,Portsoken,51.515613,-0.076899,Aldgate,E1 7AE
4,City of London,Portsoken,51.515613,-0.076899,Aldgate,E1 7AF
5,City of London,Portsoken,51.51563,-0.076279,Aldgate,E1 7AW


In [12]:
Aldgate = dictlist[2][1]
Aldgate.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
6,City of London,Aldgate,51.515526,-0.078592,Aldgate,E1 7AX
7,City of London,Aldgate,51.515526,-0.078592,Aldgate,E1 7AY
8,City of London,Aldgate,51.515175,-0.07761,Aldgate,E1 7BH
10,City of London,Aldgate,51.514988,-0.077185,Aldgate,E1 7BT
18,City of London,Aldgate,51.514882,-0.078905,Aldgate,E1 7DS


In [13]:
Tower = dictlist[3][1]
Tower.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
60,City of London,Tower,51.511193,-0.073295,Tower Gateway,E1 8AR
61,City of London,Tower,51.511193,-0.073295,Tower Gateway,E1 8AS
62,City of London,Tower,51.511017,-0.073562,Tower Gateway,E1 8AT
63,City of London,Tower,51.511646,-0.073521,Tower Gateway,E1 8AW
64,City of London,Tower,51.511207,-0.074159,Tower Gateway,E1 8BJ


In [14]:
Bread_Street = dictlist[4][1]
Bread_Street.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
82,City of London,Bread Street,51.51521,-0.099069,St. Pauls,EC1A 1AE
212,City of London,Bread Street,51.515233,-0.100509,St. Pauls,EC1A 7AR
214,City of London,Bread Street,51.515233,-0.100509,St. Pauls,EC1A 7AT
215,City of London,Bread Street,51.515233,-0.100509,St. Pauls,EC1A 7AU
216,City of London,Bread Street,51.515069,-0.098109,St. Pauls,EC1A 7AW


In [15]:
Farringdon_Within = dictlist[5][1]
Farringdon_Within.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
83,City of London,Farringdon Within,51.517883,-0.097516,Barbican,EC1A 1AL
84,City of London,Farringdon Within,51.516359,-0.098906,St. Pauls,EC1A 1HQ
85,City of London,Farringdon Within,51.516359,-0.098906,St. Pauls,EC1A 1LP
86,City of London,Farringdon Within,51.516109,-0.099031,St. Pauls,EC1A 1ZZ
87,City of London,Farringdon Within,51.516273,-0.102498,City Thameslink,EC1A 2AA


In [16]:
Farringdon_Without = dictlist[6][1]
Farringdon_Without.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
92,City of London,Farringdon Without,51.517473,-0.103327,Farringdon,EC1A 2AL
98,City of London,Farringdon Without,51.517187,-0.102878,Farringdon,EC1A 2AY
106,City of London,Farringdon Without,51.516996,-0.102727,City Thameslink,EC1A 2BS
110,City of London,Farringdon Without,51.516842,-0.102676,City Thameslink,EC1A 2DH
111,City of London,Farringdon Without,51.516996,-0.102727,City Thameslink,EC1A 2DJ


In [17]:
Castle_Baynard = dictlist[7][1]
Castle_Baynard.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
93,City of London,Castle Baynard,51.517147,-0.107636,Chancery Lane,EC1A 2AP
105,City of London,Castle Baynard,51.517147,-0.107636,Chancery Lane,EC1A 2BQ
107,City of London,Castle Baynard,51.516727,-0.107783,Chancery Lane,EC1A 2DA
126,City of London,Castle Baynard,51.517147,-0.107636,Chancery Lane,EC1A 2ES
127,City of London,Castle Baynard,51.517147,-0.107636,Chancery Lane,EC1A 2ET


In [18]:
Cheap = dictlist[8][1]
Cheap.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
139,City of London,Cheap,51.515226,-0.096733,St. Pauls,EC1A 4AA
141,City of London,Cheap,51.515226,-0.096733,St. Pauls,EC1A 4AD
144,City of London,Cheap,51.516085,-0.097591,St. Pauls,EC1A 4AQ
145,City of London,Cheap,51.516411,-0.097721,St. Pauls,EC1A 4AS
146,City of London,Cheap,51.516411,-0.097721,St. Pauls,EC1A 4BB


In [19]:
Aldersgate = dictlist[9][1]
Aldersgate.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
142,City of London,Aldersgate,51.521025,-0.097212,Barbican,EC1A 4AJ
167,City of London,Aldersgate,51.516554,-0.096562,St. Pauls,EC1A 4ER
169,City of London,Aldersgate,51.516984,-0.09702,St. Pauls,EC1A 4EU
171,City of London,Aldersgate,51.517469,-0.097533,St. Pauls,EC1A 4HD
173,City of London,Aldersgate,51.517029,-0.096442,St. Pauls,EC1A 4HJ


In [20]:
Cripplegate = dictlist[10][1]
Cripplegate.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
481,City of London,Cripplegate,51.522476,-0.097411,Barbican,EC1M 7AA
553,City of London,Cripplegate,51.52158,-0.095358,Barbican,EC1Y 0AA
554,City of London,Cripplegate,51.522513,-0.095262,Barbican,EC1Y 0HA
555,City of London,Cripplegate,51.521813,-0.095911,Barbican,EC1Y 0RB
556,City of London,Cripplegate,51.522121,-0.095999,Barbican,EC1Y 0RD


In [21]:
Coleman_Street = dictlist[11][1]
Coleman_Street.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
579,City of London,Coleman Street,51.520686,-0.090178,Moorgate,EC1Y 4AG
580,City of London,Coleman Street,51.520794,-0.090706,Moorgate,EC1Y 4SA
581,City of London,Coleman Street,51.520728,-0.092208,Moorgate,EC1Y 4SD
582,City of London,Coleman Street,51.520485,-0.091642,Moorgate,EC1Y 4SJ
583,City of London,Coleman Street,51.520485,-0.091642,Moorgate,EC1Y 4SL


In [22]:
Broad_Street = dictlist[12][1]
Broad_Street.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
651,City of London,Broad Street,51.516838,-0.084732,Liverpool Street,EC2M 1BB
652,City of London,Broad Street,51.515874,-0.084628,Liverpool Street,EC2M 1DL
654,City of London,Broad Street,51.515874,-0.084628,Liverpool Street,EC2M 1JA
655,City of London,Broad Street,51.516686,-0.083559,Liverpool Street,EC2M 1JB
656,City of London,Broad Street,51.517078,-0.084422,Liverpool Street,EC2M 1JD


In [23]:
Cornhill = dictlist[13][1]
Cornhill.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
716,City of London,Cornhill,51.515851,-0.083188,Liverpool Street,EC2M 1RJ
717,City of London,Cornhill,51.515851,-0.083188,Liverpool Street,EC2M 1RL
718,City of London,Cornhill,51.515851,-0.083188,Liverpool Street,EC2M 1RN
804,City of London,Cornhill,51.51594,-0.082452,Liverpool Street,EC2M 3AL
811,City of London,Cornhill,51.515851,-0.083188,Liverpool Street,EC2M 3TA


In [24]:
Walbrook = dictlist[14][1]
Walbrook.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
1339,City of London,Walbrook,51.514541,-0.08614,Bank,EC2N 1AR
1385,City of London,Walbrook,51.514124,-0.087584,Bank,EC2N 1EY
1392,City of London,Walbrook,51.514532,-0.086644,Bank,EC2N 1HH
1396,City of London,Walbrook,51.514532,-0.086644,Bank,EC2N 1HP
1398,City of London,Walbrook,51.514532,-0.086644,Bank,EC2N 1HR


In [25]:
Lime_Street = dictlist[15][1]
Lime_Street.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
1533,City of London,Lime Street,51.514054,-0.083263,Liverpool Street,EC2N 4AE
1534,City of London,Lime Street,51.514694,-0.082833,Liverpool Street,EC2N 4AF
1536,City of London,Lime Street,51.514936,-0.082779,Liverpool Street,EC2N 4AH
1537,City of London,Lime Street,51.515033,-0.08266,Liverpool Street,EC2N 4AJ
1541,City of London,Lime Street,51.514054,-0.083263,Liverpool Street,EC2N 4AQ


In [26]:
Cordwainer = dictlist[16][1]
Cordwainer.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
1569,City of London,Cordwainer,51.513272,-0.090502,Bank,EC2P 2AH
1572,City of London,Cordwainer,51.513824,-0.093448,Mansion House,EC2P 2BA
1815,City of London,Cordwainer,51.513296,-0.091942,Bank,EC2R 8AT
1830,City of London,Cordwainer,51.513272,-0.090502,Bank,EC2R 8BS
1836,City of London,Cordwainer,51.513296,-0.091942,Bank,EC2R 8DH


In [27]:
Bassishaw = dictlist[17][1]
Bassishaw.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
1570,City of London,Bassishaw,51.517168,-0.091146,Moorgate,EC2P 2AJ
1571,City of London,Bassishaw,51.517168,-0.091146,Moorgate,EC2P 2AT
1576,City of London,Bassishaw,51.517168,-0.091146,Moorgate,EC2P 2DY
1578,City of London,Bassishaw,51.516015,-0.09327,St. Pauls,EC2P 2NA
1579,City of London,Bassishaw,51.516671,-0.093776,St. Pauls,EC2P 2NQ


In [28]:
Langbourn = dictlist[18][1]
Langbourn.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
2561,City of London,Langbourn,51.512209,-0.080458,Fenchurch Street,EC3A 2AB
2568,City of London,Langbourn,51.512209,-0.080458,Fenchurch Street,EC3A 2AY
3106,City of London,Langbourn,51.512233,-0.081898,Fenchurch Street,EC3M 2RB
3107,City of London,Langbourn,51.512209,-0.080458,Fenchurch Street,EC3M 2RD
3108,City of London,Langbourn,51.512209,-0.080458,Fenchurch Street,EC3M 2RH


In [29]:
Bridge = dictlist[19][1]
Bridge.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
3022,City of London,Bridge,51.510603,-0.085599,Monument,EC3M 1AA
3024,City of London,Bridge,51.510483,-0.084854,Monument,EC3M 1AD
3025,City of London,Bridge,51.510606,-0.085253,Monument,EC3M 1AE
3026,City of London,Bridge,51.510815,-0.085388,Monument,EC3M 1AG
3027,City of London,Bridge,51.510578,-0.08521,Monument,EC3M 1AH


In [30]:
Candlewick = dictlist[20][1]
Candlewick.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
3023,City of London,Candlewick,51.51068,-0.085956,Monument,EC3M 1AB
3040,City of London,Candlewick,51.510506,-0.086294,Monument,EC3M 1BJ
3041,City of London,Candlewick,51.510506,-0.086294,Monument,EC3M 1BL
3160,City of London,Candlewick,51.511405,-0.086257,Monument,EC3M 3DU
3877,City of London,Candlewick,51.510506,-0.086294,Monument,EC3R 6BP


In [31]:
Billingsgate = dictlist[21][1]
Billingsgate.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
3029,City of London,Billingsgate,51.510437,-0.083718,Monument,EC3M 1AL
3030,City of London,Billingsgate,51.510437,-0.083718,Monument,EC3M 1AN
3031,City of London,Billingsgate,51.50956,-0.083452,Monument,EC3M 1AP
3037,City of London,Billingsgate,51.510459,-0.083414,Monument,EC3M 1BB
3055,City of London,Billingsgate,51.50956,-0.083452,Monument,EC3M 1DY


In [32]:
Vintry = dictlist[22][1]
Vintry.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
4964,City of London,Vintry,51.512467,-0.0963,Mansion House,EC4M 5SA
4966,City of London,Vintry,51.512444,-0.09486,Mansion House,EC4M 5SD
4969,City of London,Vintry,51.512444,-0.09486,Mansion House,EC4M 5SJ
4970,City of London,Vintry,51.512444,-0.09486,Mansion House,EC4M 5SL
4971,City of London,Vintry,51.512444,-0.09486,Mansion House,EC4M 5SN


In [33]:
Queenhithe = dictlist[23][1]
Queenhithe.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
4978,City of London,Queenhithe,51.511545,-0.094898,Mansion House,EC4M 6GB
4987,City of London,Queenhithe,51.511545,-0.094898,Mansion House,EC4M 6XN
5337,City of London,Queenhithe,51.511545,-0.094898,Mansion House,EC4N 4TX
5341,City of London,Queenhithe,51.511545,-0.094898,Mansion House,EC4N 4UB
5444,City of London,Queenhithe,51.511545,-0.094898,Mansion House,EC4N 6LT


In [34]:
Dowgate = dictlist[24][1]
Dowgate.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
5359,City of London,Dowgate,51.511475,-0.090577,Cannon Street,EC4N 5AH
5361,City of London,Dowgate,51.511475,-0.090577,Cannon Street,EC4N 5AL
5362,City of London,Dowgate,51.511475,-0.090577,Cannon Street,EC4N 5AN
5364,City of London,Dowgate,51.511475,-0.090577,Cannon Street,EC4N 5AQ
5371,City of London,Dowgate,51.511475,-0.090577,Cannon Street,EC4N 5AY


### Taking the first 20 rows from each neighbourhood as samples and combining them into a new dataframe

In [35]:
Bishopsgate_samples = Bishopsgate.head(20)

In [36]:
Portsoken_samples = Portsoken.head(20)

In [37]:
Aldgate_samples = Aldgate.head(20)

In [38]:
Tower_samples = Tower.head(20)

In [39]:
Bread_Street_samples = Bread_Street.head(20)

In [40]:
Farringdon_Within_samples = Farringdon_Within.head(20)

In [41]:
Farringdon_Without_samples = Farringdon_Without.head(20)

In [42]:
Castle_Baynard_samples = Castle_Baynard.head(20)

In [43]:
Cheap_samples = Cheap.head(20)

In [44]:
Aldersgate_samples = Aldersgate.head(20)

In [45]:
Cripplegate_samples = Cripplegate.head(20)

In [46]:
Coleman_Street_samples = Coleman_Street.head(20)

In [47]:
Broad_Street_samples = Broad_Street.head(20)

In [48]:
Cornhill_samples = Cornhill.head(20)

In [49]:
Walbrook_samples = Walbrook.head(20)

In [50]:
Lime_Street_samples = Lime_Street.head(20)

In [51]:
Cordwainer_samples = Cordwainer.head(20)

In [52]:
Bassishaw_samples = Bassishaw.head(20)

In [53]:
Langbourn_samples = Langbourn.head(20)

In [54]:
Bridge_samples = Bridge.head(20)

In [55]:
Candlewick_samples = Candlewick.head(20)

In [56]:
Billingsgate_samples = Billingsgate.head(20)

In [57]:
Vintry_samples = Vintry.head(20)

In [58]:
Queenhithe_samples = Queenhithe.head(20)

In [59]:
Dowgate_samples = Dowgate.head(20)

In [60]:
# Combining all samples in one dataframe
frames = [Bishopsgate_samples, Portsoken_samples, Aldgate_samples, Tower_samples, Bread_Street_samples,
Farringdon_Within_samples, Farringdon_Without_samples, Castle_Baynard_samples, Cheap_samples, Aldersgate_samples,
Cripplegate_samples, Coleman_Street_samples, Broad_Street_samples, Cornhill_samples, Walbrook_samples, Lime_Street_samples, Cordwainer_samples,
Bassishaw_samples, Langbourn_samples, Bridge_samples, Candlewick_samples, Billingsgate_samples, Vintry_samples, Queenhithe_samples, Dowgate_samples]
city_of_london_data_samples = pd.concat(frames)
city_of_london_data_samples

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Nearest station,Postcode
0,City of London,Bishopsgate,51.518895,-0.078378,Liverpool Street,E1 6AN
13,City of London,Bishopsgate,51.515780,-0.078867,Aldgate,E1 7DD
14,City of London,Bishopsgate,51.515780,-0.078867,Aldgate,E1 7DG
15,City of London,Bishopsgate,51.515791,-0.078869,Aldgate,E1 7DJ
16,City of London,Bishopsgate,51.515780,-0.078867,Aldgate,E1 7DP
...,...,...,...,...,...,...
5392,City of London,Dowgate,51.511380,-0.090235,Cannon Street,EC4N 6AG
5393,City of London,Dowgate,51.511499,-0.092017,Cannon Street,EC4N 6AJ
5394,City of London,Dowgate,51.511499,-0.092017,Cannon Street,EC4N 6AL
5396,City of London,Dowgate,51.511264,-0.090283,Cannon Street,EC4N 6AP
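
The per-neighbourhood cells above can be condensed. As a hedged sketch (the `sample_per_group` helper and the toy frame are illustrative, not part of the original notebook), grouping by neighbourhood and concatenating the first `n` rows of each group should give the same kind of result without one variable per neighbourhood:

```python
import pandas as pd

def sample_per_group(df, group_col, n):
    """Return the first n rows of every group, with groups in order of first appearance."""
    return pd.concat([group.head(n) for _, group in df.groupby(group_col, sort=False)])

# Toy frame with a handful of postcodes taken from the tables above.
toy = pd.DataFrame({
    'Neighbourhood': ['Bishopsgate', 'Portsoken', 'Portsoken', 'Bishopsgate', 'Portsoken'],
    'Postcode': ['E1 6AN', 'E1 7AA', 'E1 7AD', 'E1 7DD', 'E1 7AE'],
})
samples = sample_per_group(toy, 'Neighbourhood', n=2)
print(samples)
```

On the real data this would be called as `sample_per_group(city_of_london_data, 'Neighbourhood', 20)`. Passing `sort=False` keeps the neighbourhoods in the order in which they first appear, which matches the order of the manually built list.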


### Finding latitude and longitude of City of London

In [61]:
address = 'City of London'
geolocator = Nominatim(user_agent="lon_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of City of London are {}, {}'.format(latitude, longitude))

The geographical coordinates of City of London are 51.5156177, -0.0919983


### Visualizing the neighbourhoods of the City of London on a map

In [62]:
city_of_london_map = folium.Map(location=[latitude, longitude], zoom_start=14.5)

# add markers to map
for lat, lng, label in zip(city_of_london_data_samples['Latitude'], city_of_london_data_samples['Longitude'], city_of_london_data_samples['Neighbourhood']):
    label = folium.Popup(label)
    folium.CircleMarker(
    [lat, lng],
    radius=5,
    popup=label,
    color='purple',
    fill=True,
    fill_color='#3186cc',
    fill_opacity=0.7,
    parse_html=False).add_to(city_of_london_map)
    
city_of_london_map

### Defining Foursquare credentials and version

In [63]:
# @hidden_cell
CLIENT_ID = 'P3UQT5BGLDRLIOQJ3HP32HG3RMPAJ5KPVUKVHO4BAWIA4ZWF' # your Foursquare ID
CLIENT_SECRET = 'PK5TD1CDMDPW1CIQF5BLSLISM0SVVSBO2TJCIVNGBLGVNXZI' # your Foursquare Secret.
VERSION = '20180605' # Foursquare API version
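
The cell above is marked `# @hidden_cell`, which suggests the credentials are meant to stay private. One common way to keep keys out of a shared notebook is to read them from environment variables. The sketch below assumes two hypothetical variable names, `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, which are not part of the original notebook.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read the Foursquare credentials from environment variables (names are assumptions)."""
    try:
        return env['FOURSQUARE_CLIENT_ID'], env['FOURSQUARE_CLIENT_SECRET']
    except KeyError as missing:
        raise RuntimeError('Missing environment variable: {}'.format(missing)) from None

# Usage in the notebook, once both variables are set in the environment:
# CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()
```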

### Creating the city_of_london_venues dataframe by calling the Foursquare API and finding up to 100 venues within 500 metres of each sampled location

In [64]:
%%capture

def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT=100):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
    
city_of_london_venues = getNearbyVenues(names=city_of_london_data_samples['Neighbourhood'],
                                   latitudes=city_of_london_data_samples['Latitude'],
                                   longitudes=city_of_london_data_samples['Longitude']
                                  )
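
The function above indexes straight into the response, so a reply with no groups or a venue without categories would raise an error. As a hedged sketch, the parsing step could be separated into a small helper that tolerates those cases. The `extract_venues` name, the `'Unknown'` placeholder and the hand-written payload are assumptions for illustration; the payload only mimics the fields that `getNearbyVenues` accesses.

```python
def extract_venues(payload, name, lat, lng):
    """Flatten one explore-style response into rows of venue details."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories', [])
        # Fall back to a placeholder when a venue has no category listed.
        category = categories[0]['name'] if categories else 'Unknown'
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows

# Hand-written payload; the second venue is hypothetical and has no category.
sample_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Pizza Union',
               'location': {'lat': 51.517699, 'lng': -0.077416},
               'categories': [{'name': 'Pizza Place'}]}},
    {'venue': {'name': 'Uncategorised Venue',
               'location': {'lat': 51.518, 'lng': -0.078},
               'categories': []}},
]}]}}

rows = extract_venues(sample_payload, 'Bishopsgate', 51.518895, -0.078378)
empty = extract_venues({'response': {}}, 'Bishopsgate', 51.518895, -0.078378)
print(rows)
```

Keeping the parsing separate from the HTTP request also makes it possible to check the row format without calling the API.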

### Examining the size and appearance of the dataframe

In [65]:
city_of_london_venues.head(10)

Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Bishopsgate,51.518895,-0.078378,Kastner & Ovens,51.517913,-0.076465,Café
1,Bishopsgate,51.518895,-0.078378,Ottolenghi,51.518272,-0.077177,Mediterranean Restaurant
2,Bishopsgate,51.518895,-0.078378,Pizza Union,51.517699,-0.077416,Pizza Place
3,Bishopsgate,51.518895,-0.078378,Old Spitalfields Market,51.519668,-0.075375,Flea Market
4,Bishopsgate,51.518895,-0.078378,Honest Burgers,51.518042,-0.077957,Burger Joint
5,Bishopsgate,51.518895,-0.078378,The Mayor of Scaredy Cat Town,51.518524,-0.078882,Speakeasy
6,Bishopsgate,51.518895,-0.078378,The Breakfast Club,51.518386,-0.078784,Breakfast Spot
7,Bishopsgate,51.518895,-0.078378,New Street Wine,51.517335,-0.079857,Wine Shop
8,Bishopsgate,51.518895,-0.078378,The Williams Ale & Cider House,51.518517,-0.078447,Pub
9,Bishopsgate,51.518895,-0.078378,Vita Mojo,51.518925,-0.077676,Salad Place


In [66]:
city_of_london_venues.shape

(48043, 7)

### Checking how many venues were returned for each neighbourhood

In [75]:
city_of_london_venues.groupby('Neighbourhood').count()

Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Aldersgate,1914,1914,1914,1914,1914,1914
Aldgate,2000,2000,2000,2000,2000,2000
Bassishaw,1946,1946,1946,1946,1946,1946
Billingsgate,1751,1751,1751,1751,1751,1751
Bishopsgate,2000,2000,2000,2000,2000,2000
Bread Street,2000,2000,2000,2000,2000,2000
Bridge,1984,1984,1984,1984,1984,1984
Broad Street,1989,1989,1989,1989,1989,1989
Candlewick,1818,1818,1818,1818,1818,1818
Castle Baynard,2000,2000,2000,2000,2000,2000


### Finding out how many unique categories can be curated from all the returned venues

In [77]:
print('There are {} unique categories.'.format(len(city_of_london_venues['Venue Category'].unique())))

There are 166 unique categories.


### Analyzing the occurrence of each venue category per neighbourhood with one-hot encoding

In [80]:
city_of_london_onehot = pd.get_dummies(city_of_london_venues[['Venue Category']], prefix="", prefix_sep="")

# adding neighbourhood column back to dataframe
city_of_london_onehot['Neighbourhood'] = city_of_london_venues['Neighbourhood'] 

# moving the Neighbourhood column to the first position
fixed_columns = [city_of_london_onehot.columns[-1]] + list(city_of_london_onehot.columns[:-1])
city_of_london_onehot = city_of_london_onehot[fixed_columns]

city_of_london_onehot.head(10)

Unnamed: 0,Neighbourhood,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,...,Turkish Restaurant,Udon Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Bishopsgate,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Bishopsgate,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Bishopsgate,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Bishopsgate,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Bishopsgate,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5,Bishopsgate,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
6,Bishopsgate,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
7,Bishopsgate,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
8,Bishopsgate,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
9,Bishopsgate,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


### Examining the size of the newly created dataframe

In [81]:
city_of_london_onehot.shape

(48043, 167)

### Grouping rows by neighbourhood and taking the mean frequency of occurrence of each category


In [82]:
city_of_london_grouped = city_of_london_onehot.groupby('Neighbourhood').mean().reset_index()
city_of_london_grouped

Unnamed: 0,Neighbourhood,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,...,Turkish Restaurant,Udon Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Aldersgate,0.003135,0.0,0.0,0.026123,0.0,0.0,0.007837,0.004702,0.005225,...,0.009927,0.0,0.0,0.015674,0.0,0.015674,0.0,0.0,0.010449,0.001567
1,Aldgate,0.0,0.0,0.012,0.0025,0.0,0.0,0.0175,0.01,0.0,...,0.018,0.0,0.0,0.005,0.0,0.0165,0.01,0.0,0.0025,0.0
2,Bassishaw,0.0,0.0,0.0,0.030832,0.0,0.0,0.014388,0.0,0.0,...,0.004625,0.004111,0.003083,0.017472,0.0,0.013361,0.0,0.0,0.004111,0.014388
3,Billingsgate,0.0,0.0,0.0,0.0,0.0,0.0,0.034266,0.011422,0.0,...,0.011422,0.0,0.0,0.0,0.0,0.011422,0.0,0.0,0.0,0.0
4,Bishopsgate,0.0,0.0,0.01,0.011,0.0,0.0,0.0025,0.01,0.0,...,0.006,0.0,0.0,0.02,0.0,0.016,0.0135,0.0015,0.01,0.0
5,Bread Street,0.01,0.0,0.0,0.0045,0.0,0.0,0.019,0.0,0.0,...,0.0,0.0025,0.001,0.029,0.0,0.0315,0.0,0.0,0.008,0.002
6,Bridge,0.0,0.0,0.0,0.0,0.0,0.0,0.045867,0.010081,0.0,...,0.010081,0.00252,0.0,0.0,0.0,0.010081,0.0,0.0,0.0,0.0
7,Broad Street,0.0,0.0,0.008044,0.0,0.0,0.0,0.017094,0.0,0.0,...,0.010055,0.0,0.0,0.001006,0.002514,0.010055,0.011061,0.0,0.001006,0.011061
8,Candlewick,0.0,0.0,0.0,0.00055,0.0,0.0,0.036304,0.0011,0.0,...,0.012101,0.011001,0.0,0.00165,0.0,0.011001,0.0,0.0,0.0,0.0044
9,Castle Baynard,0.0005,0.0005,0.0095,0.001,0.0,0.0,0.0135,0.0,0.0,...,0.0,0.0005,0.0005,0.011,0.009,0.03,0.0,0.0,0.001,0.0
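
To make the grouping step concrete, here is a minimal sketch on a made-up table of four venues. The `toy_venues` frame and its `Venue Category` column name are illustrative assumptions, not rows from the real venue data. One-hot encoding followed by `groupby(...).mean()` gives, for each neighbourhood, the share of its venues that fall in each category, so the category columns of every row sum to 1.

```python
import pandas as pd

# Hypothetical toy data standing in for the real venue table
toy_venues = pd.DataFrame({
    "Neighbourhood": ["Aldgate", "Aldgate", "Aldgate", "Bridge"],
    "Venue Category": ["Hotel", "Pub", "Hotel", "Pub"],
})

# One indicator column per category, 1.0 where the venue belongs to it
toy_onehot = pd.get_dummies(toy_venues["Venue Category"], dtype=float)
toy_onehot.insert(0, "Neighbourhood", toy_venues["Neighbourhood"])

# The mean of the indicators is the relative frequency of each category
toy_grouped = toy_onehot.groupby("Neighbourhood").mean().reset_index()
print(toy_grouped)
```

In this toy case two of Aldgate's three venues are hotels, so its `Hotel` frequency is 2/3 and its `Pub` frequency is 1/3.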


### Creating a dataframe with the top 20 venues for each neighbourhood

In [104]:
num_top_venues = 20

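def return_most_common_venues(row, num_top_venues):
    # skip the first entry (the neighbourhood name) and keep only the category frequencies
    row_categories = row.iloc[1:]
    # sort categories from most to least frequent
    row_categories_sorted = row_categories.sort_values(ascending=False)

    return row_categories_sorted.index.values[0:num_top_venues]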

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # only 1st, 2nd and 3rd have their own suffix; every later rank here uses 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighbourhoods_venues_sorted = pd.DataFrame(columns=columns)
neighbourhoods_venues_sorted['Neighbourhood'] = city_of_london_grouped['Neighbourhood']

for ind in np.arange(city_of_london_grouped.shape[0]):
    neighbourhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(city_of_london_grouped.iloc[ind, :], num_top_venues)

neighbourhoods_venues_sorted

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Aldersgate,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Café,Sandwich Place,Plaza,Pub,Food Truck,French Restaurant,...,Sushi Restaurant,Cocktail Bar,Park,Concert Hall,Hotel,Modern European Restaurant,Scenic Lookout,Garden,Indie Movie Theater,Wine Bar
1,Aldgate,Hotel,Coffee Shop,Gym / Fitness Center,Cocktail Bar,Restaurant,Pub,English Restaurant,Salad Place,Garden,...,Italian Restaurant,Indian Restaurant,Turkish Restaurant,Asian Restaurant,Café,Beer Bar,Steakhouse,Wine Bar,Event Space,Sushi Restaurant
2,Bassishaw,Coffee Shop,Italian Restaurant,Hotel,Gym / Fitness Center,Art Gallery,Sandwich Place,Café,Restaurant,Steakhouse,...,Sushi Restaurant,Clothing Store,French Restaurant,Vietnamese Restaurant,Concert Hall,Roof Deck,Indie Movie Theater,Indian Restaurant,Plaza,Cocktail Bar
3,Billingsgate,Hotel,Coffee Shop,Gym / Fitness Center,Pub,Restaurant,Asian Restaurant,French Restaurant,Garden,Sandwich Place,...,English Restaurant,Café,Salad Place,Historic Site,Italian Restaurant,Hotel Bar,Scenic Lookout,Burger Joint,Seafood Restaurant,Steakhouse
4,Bishopsgate,Coffee Shop,Cocktail Bar,Pub,Hotel,Gym / Fitness Center,Chinese Restaurant,Burger Joint,Restaurant,Salad Place,...,Pizza Place,Thai Restaurant,Clothing Store,Sushi Restaurant,Breakfast Spot,Plaza,Mediterranean Restaurant,Vietnamese Restaurant,Street Food Gathering,Japanese Restaurant
5,Bread Street,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Sandwich Place,Wine Bar,Pub,Modern European Restaurant,Vietnamese Restaurant,Japanese Restaurant,...,Scenic Lookout,French Restaurant,Plaza,Café,Park,Asian Restaurant,Grocery Store,Restaurant,Burger Joint,Bakery
6,Bridge,Coffee Shop,Hotel,Gym / Fitness Center,Pub,Asian Restaurant,Restaurant,Sandwich Place,French Restaurant,Italian Restaurant,...,Historic Site,Bar,Fast Food Restaurant,English Restaurant,Garden,Cocktail Bar,Salad Place,Café,Seafood Restaurant,History Museum
7,Broad Street,Coffee Shop,Pub,Restaurant,Hotel,Gym / Fitness Center,Cocktail Bar,Italian Restaurant,Burger Joint,Sushi Restaurant,...,Salad Place,English Restaurant,Breakfast Spot,Plaza,Boxing Gym,Event Space,Seafood Restaurant,Indian Restaurant,Historic Site,Japanese Restaurant
8,Candlewick,Coffee Shop,Gym / Fitness Center,Italian Restaurant,Hotel,Restaurant,Pub,Asian Restaurant,Bar,Burger Joint,...,English Restaurant,Historic Site,Garden,Seafood Restaurant,Fast Food Restaurant,Sandwich Place,Steakhouse,Salad Place,Event Space,Café
9,Castle Baynard,Coffee Shop,Pub,Sandwich Place,Italian Restaurant,Gym / Fitness Center,Wine Bar,French Restaurant,Burrito Place,Hotel,...,Japanese Restaurant,Falafel Restaurant,Fast Food Restaurant,Bar,Salad Place,Deli / Bodega,Asian Restaurant,Sushi Restaurant,Tea Room,Café
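
The same ranking can be sketched more compactly with `Series.nlargest`. This is only an illustration on a hypothetical three-category table (`toy_grouped`, `top_venues` and `ordinal` are names introduced here, not part of the notebook). The `ordinal` helper also covers ranks such as 21st or 22nd, which would matter only if `num_top_venues` were raised above 20.

```python
import pandas as pd

# Hypothetical frequency table in the same shape as city_of_london_grouped
toy_grouped = pd.DataFrame({
    "Neighbourhood": ["Aldgate", "Bridge"],
    "Coffee Shop": [0.2, 0.5],
    "Hotel": [0.5, 0.3],
    "Pub": [0.3, 0.2],
})

def ordinal(k):
    """Format 1 -> '1st', 2 -> '2nd', 11 -> '11th', 21 -> '21st'."""
    # 11, 12 and 13 take 'th' even though they end in 1, 2 and 3
    if 10 <= k % 100 <= 20:
        return f"{k}th"
    return f"{k}" + {1: "st", 2: "nd", 3: "rd"}.get(k % 10, "th")

def top_venues(grouped, n):
    """Return one row per neighbourhood with its n most frequent categories."""
    freqs = grouped.set_index("Neighbourhood")
    # nlargest keeps the n highest frequencies; its index holds the category names
    rows = [list(row.nlargest(n).index) for _, row in freqs.iterrows()]
    columns = [f"{ordinal(i + 1)} Most Common Venue" for i in range(n)]
    return pd.DataFrame(rows, index=freqs.index, columns=columns).reset_index()

toy_top = top_venues(toy_grouped, 2)
print(toy_top)
```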


### Clustering neighbourhoods

In [None]:
kclusters = 5

# drop the label column so that only the numeric frequency features remain
city_of_london_grouped_clustering = city_of_london_grouped.drop(columns='Neighbourhood')

# running k-means clustering
kmeans = KMeans(init="k-means++", n_clusters=kclusters, random_state=4, n_init=12).fit(city_of_london_grouped_clustering)

# checking cluster labels generated for each row in the dataframe
kmeans.labels_[0:25]
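
The notebook fixes `kclusters = 5` without showing how that value was chosen. One common sanity check, sketched here on synthetic data rather than on the real frequency matrix, is to compare silhouette scores over a range of candidate values of k. The array `X` and its three blobs are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the frequency matrix: three well-separated blobs
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((20, 2)) for c in centres])

scores = {}
for k in range(2, 7):
    model = KMeans(n_clusters=k, init="k-means++", n_init=12, random_state=4)
    labels = model.fit_predict(X)
    # Silhouette is close to 1 when clusters are compact and far apart
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On real data the scores are rarely this clear-cut, so the result is best read as a hint to weigh alongside how interpretable the clusters are.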

### Creating a new dataframe with clusters and top 20 venues for each neighbourhood

In [None]:
neighbourhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

city_of_london_merged = city_of_london_data_samples

# joining the cluster labels and top venues onto city_of_london_data_samples, which already holds the latitude/longitude of each postcode
city_of_london_merged = city_of_london_merged.join(neighbourhoods_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')

city_of_london_merged.head(10)
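
Because `join` performs a left join by default, any postcode whose neighbourhood has no venue summary receives `NaN` in the new columns, which also turns the cluster labels into floats. The sketch below shows this on hypothetical tables (`toy_postcodes`, `toy_summary` and the "Unknown Ward" row are made up) together with one possible way of handling it.

```python
import pandas as pd

# Hypothetical postcode-level table (one row per postcode)
toy_postcodes = pd.DataFrame({
    "Postcode": ["E1 6AN", "E1 7AX", "EC9 9ZZ"],
    "Neighbourhood": ["Bishopsgate", "Aldgate", "Unknown Ward"],
})

# Hypothetical neighbourhood-level table (one row per neighbourhood)
toy_summary = pd.DataFrame({
    "Cluster Labels": [0, 3],
    "Neighbourhood": ["Bishopsgate", "Aldgate"],
    "1st Most Common Venue": ["Coffee Shop", "Hotel"],
})

# Left join: every postcode row is kept and receives its neighbourhood's columns
toy_merged = toy_postcodes.join(toy_summary.set_index("Neighbourhood"), on="Neighbourhood")

# Rows whose neighbourhood has no venue data end up with NaN labels
unmatched = toy_merged[toy_merged["Cluster Labels"].isna()]

# Drop them and restore integer labels so they can be used as list indices later
toy_merged = toy_merged.dropna(subset=["Cluster Labels"]).astype({"Cluster Labels": int})
print(toy_merged)
```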

### Examining the size of the dataset

In [107]:
city_of_london_merged.shape

(500, 27)

### Examining Clusters

#### Cluster 1

In [110]:
city_of_london_merged.loc[city_of_london_merged['Cluster Labels'] == 0, city_of_london_merged.columns[[1] + list(range(5, city_of_london_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,...,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Bishopsgate,E1 6AN,0,Coffee Shop,Cocktail Bar,Pub,Hotel,Gym / Fitness Center,Chinese Restaurant,Burger Joint,...,Pizza Place,Thai Restaurant,Clothing Store,Sushi Restaurant,Breakfast Spot,Plaza,Mediterranean Restaurant,Vietnamese Restaurant,Street Food Gathering,Japanese Restaurant
13,Bishopsgate,E1 7DD,0,Coffee Shop,Cocktail Bar,Pub,Hotel,Gym / Fitness Center,Chinese Restaurant,Burger Joint,...,Pizza Place,Thai Restaurant,Clothing Store,Sushi Restaurant,Breakfast Spot,Plaza,Mediterranean Restaurant,Vietnamese Restaurant,Street Food Gathering,Japanese Restaurant
14,Bishopsgate,E1 7DG,0,Coffee Shop,Cocktail Bar,Pub,Hotel,Gym / Fitness Center,Chinese Restaurant,Burger Joint,...,Pizza Place,Thai Restaurant,Clothing Store,Sushi Restaurant,Breakfast Spot,Plaza,Mediterranean Restaurant,Vietnamese Restaurant,Street Food Gathering,Japanese Restaurant
15,Bishopsgate,E1 7DJ,0,Coffee Shop,Cocktail Bar,Pub,Hotel,Gym / Fitness Center,Chinese Restaurant,Burger Joint,...,Pizza Place,Thai Restaurant,Clothing Store,Sushi Restaurant,Breakfast Spot,Plaza,Mediterranean Restaurant,Vietnamese Restaurant,Street Food Gathering,Japanese Restaurant
16,Bishopsgate,E1 7DP,0,Coffee Shop,Cocktail Bar,Pub,Hotel,Gym / Fitness Center,Chinese Restaurant,Burger Joint,...,Pizza Place,Thai Restaurant,Clothing Store,Sushi Restaurant,Breakfast Spot,Plaza,Mediterranean Restaurant,Vietnamese Restaurant,Street Food Gathering,Japanese Restaurant
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2719,Lime Street,EC3A 6AB,0,Coffee Shop,Hotel,Cocktail Bar,Gym / Fitness Center,Restaurant,Pub,English Restaurant,...,Wine Bar,Event Space,Boxing Gym,Sandwich Place,Turkish Restaurant,Burger Joint,French Restaurant,Steakhouse,Sushi Restaurant,Lounge
2720,Lime Street,EC3A 6AD,0,Coffee Shop,Hotel,Cocktail Bar,Gym / Fitness Center,Restaurant,Pub,English Restaurant,...,Wine Bar,Event Space,Boxing Gym,Sandwich Place,Turkish Restaurant,Burger Joint,French Restaurant,Steakhouse,Sushi Restaurant,Lounge
2721,Lime Street,EC3A 6AE,0,Coffee Shop,Hotel,Cocktail Bar,Gym / Fitness Center,Restaurant,Pub,English Restaurant,...,Wine Bar,Event Space,Boxing Gym,Sandwich Place,Turkish Restaurant,Burger Joint,French Restaurant,Steakhouse,Sushi Restaurant,Lounge
2722,Lime Street,EC3A 6AH,0,Coffee Shop,Hotel,Cocktail Bar,Gym / Fitness Center,Restaurant,Pub,English Restaurant,...,Wine Bar,Event Space,Boxing Gym,Sandwich Place,Turkish Restaurant,Burger Joint,French Restaurant,Steakhouse,Sushi Restaurant,Lounge


#### Cluster 2

In [111]:
city_of_london_merged.loc[city_of_london_merged['Cluster Labels'] == 1, city_of_london_merged.columns[[1] + list(range(5, city_of_london_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,...,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
139,Cheap,EC1A 4AA,1,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Plaza,Sandwich Place,Scenic Lookout,Vietnamese Restaurant,...,Art Gallery,Burger Joint,Clothing Store,French Restaurant,Falafel Restaurant,Restaurant,Grocery Store,Bakery,Cocktail Bar,Sushi Restaurant
141,Cheap,EC1A 4AD,1,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Plaza,Sandwich Place,Scenic Lookout,Vietnamese Restaurant,...,Art Gallery,Burger Joint,Clothing Store,French Restaurant,Falafel Restaurant,Restaurant,Grocery Store,Bakery,Cocktail Bar,Sushi Restaurant
144,Cheap,EC1A 4AQ,1,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Plaza,Sandwich Place,Scenic Lookout,Vietnamese Restaurant,...,Art Gallery,Burger Joint,Clothing Store,French Restaurant,Falafel Restaurant,Restaurant,Grocery Store,Bakery,Cocktail Bar,Sushi Restaurant
145,Cheap,EC1A 4AS,1,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Plaza,Sandwich Place,Scenic Lookout,Vietnamese Restaurant,...,Art Gallery,Burger Joint,Clothing Store,French Restaurant,Falafel Restaurant,Restaurant,Grocery Store,Bakery,Cocktail Bar,Sushi Restaurant
146,Cheap,EC1A 4BB,1,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Plaza,Sandwich Place,Scenic Lookout,Vietnamese Restaurant,...,Art Gallery,Burger Joint,Clothing Store,French Restaurant,Falafel Restaurant,Restaurant,Grocery Store,Bakery,Cocktail Bar,Sushi Restaurant
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5392,Dowgate,EC4N 6AG,1,Coffee Shop,Gym / Fitness Center,Italian Restaurant,Asian Restaurant,Pub,Hotel,Steakhouse,...,Bar,English Restaurant,Burger Joint,Vietnamese Restaurant,Juice Bar,Café,Pedestrian Plaza,French Restaurant,Modern European Restaurant,Roof Deck
5393,Dowgate,EC4N 6AJ,1,Coffee Shop,Gym / Fitness Center,Italian Restaurant,Asian Restaurant,Pub,Hotel,Steakhouse,...,Bar,English Restaurant,Burger Joint,Vietnamese Restaurant,Juice Bar,Café,Pedestrian Plaza,French Restaurant,Modern European Restaurant,Roof Deck
5394,Dowgate,EC4N 6AL,1,Coffee Shop,Gym / Fitness Center,Italian Restaurant,Asian Restaurant,Pub,Hotel,Steakhouse,...,Bar,English Restaurant,Burger Joint,Vietnamese Restaurant,Juice Bar,Café,Pedestrian Plaza,French Restaurant,Modern European Restaurant,Roof Deck
5396,Dowgate,EC4N 6AP,1,Coffee Shop,Gym / Fitness Center,Italian Restaurant,Asian Restaurant,Pub,Hotel,Steakhouse,...,Bar,English Restaurant,Burger Joint,Vietnamese Restaurant,Juice Bar,Café,Pedestrian Plaza,French Restaurant,Modern European Restaurant,Roof Deck


#### Cluster 3

In [112]:
city_of_london_merged.loc[city_of_london_merged['Cluster Labels'] == 2, city_of_london_merged.columns[[1] + list(range(5, city_of_london_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,...,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
142,Aldersgate,EC1A 4AJ,2,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Café,Sandwich Place,Plaza,Pub,...,Sushi Restaurant,Cocktail Bar,Park,Concert Hall,Hotel,Modern European Restaurant,Scenic Lookout,Garden,Indie Movie Theater,Wine Bar
167,Aldersgate,EC1A 4ER,2,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Café,Sandwich Place,Plaza,Pub,...,Sushi Restaurant,Cocktail Bar,Park,Concert Hall,Hotel,Modern European Restaurant,Scenic Lookout,Garden,Indie Movie Theater,Wine Bar
169,Aldersgate,EC1A 4EU,2,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Café,Sandwich Place,Plaza,Pub,...,Sushi Restaurant,Cocktail Bar,Park,Concert Hall,Hotel,Modern European Restaurant,Scenic Lookout,Garden,Indie Movie Theater,Wine Bar
171,Aldersgate,EC1A 4HD,2,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Café,Sandwich Place,Plaza,Pub,...,Sushi Restaurant,Cocktail Bar,Park,Concert Hall,Hotel,Modern European Restaurant,Scenic Lookout,Garden,Indie Movie Theater,Wine Bar
173,Aldersgate,EC1A 4HJ,2,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Café,Sandwich Place,Plaza,Pub,...,Sushi Restaurant,Cocktail Bar,Park,Concert Hall,Hotel,Modern European Restaurant,Scenic Lookout,Garden,Indie Movie Theater,Wine Bar
182,Aldersgate,EC1A 4JP,2,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Café,Sandwich Place,Plaza,Pub,...,Sushi Restaurant,Cocktail Bar,Park,Concert Hall,Hotel,Modern European Restaurant,Scenic Lookout,Garden,Indie Movie Theater,Wine Bar
183,Aldersgate,EC1A 4JR,2,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Café,Sandwich Place,Plaza,Pub,...,Sushi Restaurant,Cocktail Bar,Park,Concert Hall,Hotel,Modern European Restaurant,Scenic Lookout,Garden,Indie Movie Theater,Wine Bar
184,Aldersgate,EC1A 4LA,2,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Café,Sandwich Place,Plaza,Pub,...,Sushi Restaurant,Cocktail Bar,Park,Concert Hall,Hotel,Modern European Restaurant,Scenic Lookout,Garden,Indie Movie Theater,Wine Bar
193,Aldersgate,EC1A 4LX,2,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Café,Sandwich Place,Plaza,Pub,...,Sushi Restaurant,Cocktail Bar,Park,Concert Hall,Hotel,Modern European Restaurant,Scenic Lookout,Garden,Indie Movie Theater,Wine Bar
231,Aldersgate,EC1A 7BQ,2,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Café,Sandwich Place,Plaza,Pub,...,Sushi Restaurant,Cocktail Bar,Park,Concert Hall,Hotel,Modern European Restaurant,Scenic Lookout,Garden,Indie Movie Theater,Wine Bar


#### Cluster 4

In [113]:
city_of_london_merged.loc[city_of_london_merged['Cluster Labels'] == 3, city_of_london_merged.columns[[1] + list(range(5, city_of_london_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,...,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
6,Aldgate,E1 7AX,3,Hotel,Coffee Shop,Gym / Fitness Center,Cocktail Bar,Restaurant,Pub,English Restaurant,...,Italian Restaurant,Indian Restaurant,Turkish Restaurant,Asian Restaurant,Café,Beer Bar,Steakhouse,Wine Bar,Event Space,Sushi Restaurant
7,Aldgate,E1 7AY,3,Hotel,Coffee Shop,Gym / Fitness Center,Cocktail Bar,Restaurant,Pub,English Restaurant,...,Italian Restaurant,Indian Restaurant,Turkish Restaurant,Asian Restaurant,Café,Beer Bar,Steakhouse,Wine Bar,Event Space,Sushi Restaurant
8,Aldgate,E1 7BH,3,Hotel,Coffee Shop,Gym / Fitness Center,Cocktail Bar,Restaurant,Pub,English Restaurant,...,Italian Restaurant,Indian Restaurant,Turkish Restaurant,Asian Restaurant,Café,Beer Bar,Steakhouse,Wine Bar,Event Space,Sushi Restaurant
10,Aldgate,E1 7BT,3,Hotel,Coffee Shop,Gym / Fitness Center,Cocktail Bar,Restaurant,Pub,English Restaurant,...,Italian Restaurant,Indian Restaurant,Turkish Restaurant,Asian Restaurant,Café,Beer Bar,Steakhouse,Wine Bar,Event Space,Sushi Restaurant
18,Aldgate,E1 7DS,3,Hotel,Coffee Shop,Gym / Fitness Center,Cocktail Bar,Restaurant,Pub,English Restaurant,...,Italian Restaurant,Indian Restaurant,Turkish Restaurant,Asian Restaurant,Café,Beer Bar,Steakhouse,Wine Bar,Event Space,Sushi Restaurant
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3084,Billingsgate,EC3M 1JQ,3,Hotel,Coffee Shop,Gym / Fitness Center,Pub,Restaurant,Asian Restaurant,French Restaurant,...,English Restaurant,Café,Salad Place,Historic Site,Italian Restaurant,Hotel Bar,Scenic Lookout,Burger Joint,Seafood Restaurant,Steakhouse
3085,Billingsgate,EC3M 1JS,3,Hotel,Coffee Shop,Gym / Fitness Center,Pub,Restaurant,Asian Restaurant,French Restaurant,...,English Restaurant,Café,Salad Place,Historic Site,Italian Restaurant,Hotel Bar,Scenic Lookout,Burger Joint,Seafood Restaurant,Steakhouse
3089,Billingsgate,EC3M 1LA,3,Hotel,Coffee Shop,Gym / Fitness Center,Pub,Restaurant,Asian Restaurant,French Restaurant,...,English Restaurant,Café,Salad Place,Historic Site,Italian Restaurant,Hotel Bar,Scenic Lookout,Burger Joint,Seafood Restaurant,Steakhouse
3090,Billingsgate,EC3M 1LB,3,Hotel,Coffee Shop,Gym / Fitness Center,Pub,Restaurant,Asian Restaurant,French Restaurant,...,English Restaurant,Café,Salad Place,Historic Site,Italian Restaurant,Hotel Bar,Scenic Lookout,Burger Joint,Seafood Restaurant,Steakhouse


#### Cluster 5

In [114]:
city_of_london_merged.loc[city_of_london_merged['Cluster Labels'] == 4, city_of_london_merged.columns[[1] + list(range(5, city_of_london_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,...,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
82,Bread Street,EC1A 1AE,4,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Sandwich Place,Wine Bar,Pub,Modern European Restaurant,...,Scenic Lookout,French Restaurant,Plaza,Café,Park,Asian Restaurant,Grocery Store,Restaurant,Burger Joint,Bakery
212,Bread Street,EC1A 7AR,4,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Sandwich Place,Wine Bar,Pub,Modern European Restaurant,...,Scenic Lookout,French Restaurant,Plaza,Café,Park,Asian Restaurant,Grocery Store,Restaurant,Burger Joint,Bakery
214,Bread Street,EC1A 7AT,4,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Sandwich Place,Wine Bar,Pub,Modern European Restaurant,...,Scenic Lookout,French Restaurant,Plaza,Café,Park,Asian Restaurant,Grocery Store,Restaurant,Burger Joint,Bakery
215,Bread Street,EC1A 7AU,4,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Sandwich Place,Wine Bar,Pub,Modern European Restaurant,...,Scenic Lookout,French Restaurant,Plaza,Café,Park,Asian Restaurant,Grocery Store,Restaurant,Burger Joint,Bakery
216,Bread Street,EC1A 7AW,4,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Sandwich Place,Wine Bar,Pub,Modern European Restaurant,...,Scenic Lookout,French Restaurant,Plaza,Café,Park,Asian Restaurant,Grocery Store,Restaurant,Burger Joint,Bakery
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
491,Castle Baynard,EC1N 2HQ,4,Coffee Shop,Pub,Sandwich Place,Italian Restaurant,Gym / Fitness Center,Wine Bar,French Restaurant,...,Japanese Restaurant,Falafel Restaurant,Fast Food Restaurant,Bar,Salad Place,Deli / Bodega,Asian Restaurant,Sushi Restaurant,Tea Room,Café
494,Castle Baynard,EC1N 2HT,4,Coffee Shop,Pub,Sandwich Place,Italian Restaurant,Gym / Fitness Center,Wine Bar,French Restaurant,...,Japanese Restaurant,Falafel Restaurant,Fast Food Restaurant,Bar,Salad Place,Deli / Bodega,Asian Restaurant,Sushi Restaurant,Tea Room,Café
495,Castle Baynard,EC1N 2HU,4,Coffee Shop,Pub,Sandwich Place,Italian Restaurant,Gym / Fitness Center,Wine Bar,French Restaurant,...,Japanese Restaurant,Falafel Restaurant,Fast Food Restaurant,Bar,Salad Place,Deli / Bodega,Asian Restaurant,Sushi Restaurant,Tea Room,Café
496,Castle Baynard,EC1N 2HX,4,Coffee Shop,Pub,Sandwich Place,Italian Restaurant,Gym / Fitness Center,Wine Bar,French Restaurant,...,Japanese Restaurant,Falafel Restaurant,Fast Food Restaurant,Bar,Salad Place,Deli / Bodega,Asian Restaurant,Sushi Restaurant,Tea Room,Café
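
Since every postcode in a neighbourhood repeats the same venue ranking, the cluster tables are long but carry little extra information per row. A more compact view, sketched here on a handful of rows copied from the tables above (`toy_merged` is a stand-in for the full `city_of_london_merged` frame), lists the distinct neighbourhoods in each cluster.

```python
import pandas as pd

# A few postcode rows taken from the cluster tables, used as a small stand-in
toy_merged = pd.DataFrame({
    "Neighbourhood": ["Bishopsgate", "Bishopsgate", "Lime Street", "Aldgate"],
    "Postcode": ["E1 6AN", "E1 7DD", "EC3A 6AB", "E1 7AX"],
    "Cluster Labels": [0, 0, 0, 3],
})

# Collapse the postcode rows into one sorted list of neighbourhoods per cluster
cluster_members = (
    toy_merged.groupby("Cluster Labels")["Neighbourhood"]
    .unique()
    .apply(sorted)
)
print(cluster_members)
```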


### Visualizing clustered neighbourhoods on map

In [116]:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=14.5)

# set color scheme for the clusters: one evenly spaced rainbow colour per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(city_of_london_merged['Latitude'], city_of_london_merged['Longitude'], city_of_london_merged['Neighbourhood'], city_of_london_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
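
The colouring step boils down to giving each cluster label its own hex colour. As a minimal sketch that uses only the standard library (the `cluster_palette` helper is hypothetical and is not how the notebook builds its `rainbow` list), evenly spaced hues can be generated and then indexed directly by label:

```python
import colorsys

def cluster_palette(n_clusters):
    """Return n_clusters evenly spaced hex colours, one per cluster label."""
    palette = []
    for i in range(n_clusters):
        # Spread the hues evenly around the colour wheel
        r, g, b = colorsys.hsv_to_rgb(i / n_clusters, 0.8, 0.9)
        palette.append("#{:02x}{:02x}{:02x}".format(int(r * 255), int(g * 255), int(b * 255)))
    return palette

palette = cluster_palette(5)

# Index the palette by label, so cluster 0 gets palette[0], cluster 1 gets palette[1], ...
colour_for = {label: palette[label] for label in range(5)}
print(colour_for)
```

A list built this way could be passed to the `color` and `fill_color` arguments of `folium.CircleMarker` in the same way as the `rainbow` list.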