<h1>Comparing the demographics and facilities of London neighbourhoods</h1>



<h2>Introduction to the project: background and aim</h2>

London is one of the busiest cities in the world, with over 8 million people calling it home. But do the demographics of Londoners, where they live and the facilities available to them follow distinct patterns? This project aims to find out.

We will look at both the demographics of the population and the types of venues in each borough of London to paint a picture of the people and their lives. This will be done in two parts:

1.   We will explore the boroughs by their demographic data. This will involve grouping the boroughs based on their demographic metrics and exploring their characteristics. This will provide some insight into the people living in each borough.
2.   We will then use the Foursquare API to investigate the types of venues found in each borough. This covers both practical facilities (shops, supermarkets, leisure centres) and social facilities (pubs, cafes, music venues). The analysis will include finding the most common categories of venues in each borough.

The types of venues in an area influence and are influenced by the people who live there. Therefore, with both of these analyses we can draw conclusions about the differences and similarities between boroughs.

The application of this could be to identify areas which could benefit from certain facilities based on their demographics. It could also inform suggestions for government funding in certain regions. For example, does a borough with lots of young people but a low happiness score need more youth clubs? Are there too few supermarkets in a borough with a high population density? How is the frequency of gambling shops linked to an area's demographics?

The audience for this could be a local government authority that wants to compare its facilities with those of boroughs with similar demographics. On a city-wide level, it could also help government identify regions which require more funding for certain facilities. It could also give people who want to open a business in London an indication of where their audience lives and which other facilities are in the area.

<h2>Description of the data used</h2>

The project will use two key datasets.

1. Demographic data by borough from the Greater London Authority. This includes key population metrics such as population density, average age, % of resident population born abroad and happiness score. This will be used to characterise each London borough's population as described above.
https://data.london.gov.uk/dataset/london-borough-profiles 
2. Foursquare API data on venues in each borough. We will use the API to explore a radius around each borough and the different venues within this radius. These are split into categories such as restaurant, supermarket, pub etc. We can then calculate the proportion of facilities in different categories for each area. This is useful to understand the types of places that make up a neighbourhood.

<h2>Methodology</h2>

<h3>Part 1: Demographic data of London</h3>

<h5>Load and clean the data</h5>

In [2]:
# pip install scikit-learn

In [3]:
# pip install geocoder

In [4]:
# pip install xlrd

In [5]:
# pip install folium

In [6]:
# pip install geopy

In [7]:
# import libraries
import pandas as pd
import numpy as np
from sklearn.cluster import KMeans # import k-means from clustering stage
import geocoder
import folium # map rendering library
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import matplotlib.cm as cm
import matplotlib.colors as colors

In [8]:
# load data
df_demographics = pd.read_excel("data/london-borough-profiles.xlsx", sheet_name='Data')

In [9]:
# select the columns of interest (copy so later edits do not act on a view of df_demographics)
columns = ['New code','Area name','Population density (per hectare) 2017','Average Age, 2017','% of resident population born abroad (2015)','Happiness score 2011-14 (out of 10)']
df_london = df_demographics[columns].copy()

In [10]:
# drop rows with missing values
df_london.dropna(inplace=True)


In [11]:
df_london = df_london.drop(df_london.index[33:38])

In [12]:
df_london = df_london.drop(df_london.index[0])

In [13]:
# drop columns
london_grouped_clustering = df_london.drop(columns=['New code','Area name'])

<h5>Cluster the data based on demographic properties of the boroughs</h5>

In [14]:
# set number of clusters
kclusters = 5

# drop the neighbourhood column
# london_grouped_clustering = df_london.drop(['New code','Area name'], 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(london_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 4, 3, 4, 1, 0, 0, 0, 0])

In [15]:
df_london.insert(0, 'Cluster Labels', kmeans.labels_)
df_london.head(5)

Unnamed: 0,Cluster Labels,New code,Area name,Population density (per hectare) 2017,"Average Age, 2017",% of resident population born abroad (2015),Happiness score 2011-14 (out of 10)
2,0,E09000002,Barking and Dagenham,57.8822,32.9,37.8,7.05
3,0,E09000003,Barnet,44.9115,37.3,35.2,7.37
4,4,E09000004,Bexley,40.3264,39.0,16.1,7.21
5,3,E09000005,Brent,76.817,35.6,53.9,7.22
6,4,E09000006,Bromley,21.8404,40.2,18.3,7.44
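
The four metrics are on very different scales: in the table above, population density differs by tens of units between boroughs while happiness scores differ by only a few tenths. Because k-means relies on Euclidean distance, a clustering of the raw values is likely to be driven mostly by density. A minimal sketch of one common remedy, z-score standardisation, is shown below using the five boroughs displayed above. The short column names and the `standardise` helper are hypothetical, and this is an optional variation rather than the method used in this analysis.

```python
import pandas as pd

# The five boroughs shown in the table above (values copied from that output).
sample = pd.DataFrame(
    {
        "density": [57.8822, 44.9115, 40.3264, 76.817, 21.8404],
        "average_age": [32.9, 37.3, 39.0, 35.6, 40.2],
        "born_abroad_pct": [37.8, 35.2, 16.1, 53.9, 18.3],
        "happiness": [7.05, 7.37, 7.21, 7.22, 7.44],
    },
    index=["Barking and Dagenham", "Barnet", "Bexley", "Brent", "Bromley"],
)


def standardise(df):
    """Rescale every column to zero mean and unit (population) standard deviation."""
    return (df - df.mean()) / df.std(ddof=0)


scaled = standardise(sample)

# Raw spread per column: density and % born abroad dwarf the happiness score.
print(sample.std(ddof=0).round(2))
# After scaling, every column contributes on the same footing.
print(scaled.round(2))
```

The scaled frame could be passed to `KMeans(...).fit(scaled)` in place of the raw values; the resulting clusters may differ from those reported in this notebook.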


In [16]:
for column in ['Population density (per hectare) 2017','% of resident population born abroad (2015)']:
    df_london[column] = df_london[column].astype(str).astype(float)

<h5>Aggregate the data by the cluster</h5>

In [17]:
# average each demographic metric within a cluster (select the columns with a list, not a tuple)
metric_columns = ['Population density (per hectare) 2017','Average Age, 2017','% of resident population born abroad (2015)','Happiness score 2011-14 (out of 10)']
df_london_grouped = df_london.groupby('Cluster Labels')[metric_columns].mean()


In [18]:
df_london_grouped

Cluster Labels,Population density (per hectare) 2017,"Average Age, 2017",% of resident population born abroad (2015),Happiness score 2011-14 (out of 10)
0,51.052854,36.441667,37.216667,7.266667
1,116.615242,36.333333,42.816667,7.245
2,151.099346,33.1,37.0,7.083333
3,86.106888,34.65,42.083333,7.223333
4,29.037102,38.94,20.28,7.294


<h5>Plot the boroughs on a map</h5>

In [19]:
# create an empty list to be populated with dictionaries
coords = []

# populate the list using geocoder
for borough in df_london['Area name']:
    g = geocoder.arcgis('%s, London, United Kingdom' % borough)
    lat_lng_coords = g.latlng
    coord = {'Area name': borough, 'Latitude': lat_lng_coords[0], 'Longitude': lat_lng_coords[1]}
    coords.append(coord)

# convert list to dataframe
df_coords = pd.DataFrame(coords)
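
A geocoding service can fail to resolve an address, in which case the returned coordinates may be empty (`None` or an empty list) and indexing them would raise an error. The sketch below shows one way to guard against that. `geocode_boroughs` and its `lookup` argument are hypothetical names, and a stub stands in for the real geocoder call so the example runs offline.

```python
def geocode_boroughs(boroughs, lookup):
    """Geocode each borough, separating successes from failures.

    `lookup` is any callable that takes an address string and returns
    [latitude, longitude], or None / an empty list when nothing is found.
    """
    found, missing = [], []
    for borough in boroughs:
        latlng = lookup('%s, London, United Kingdom' % borough)
        if not latlng:
            missing.append(borough)  # record the failure instead of crashing
            continue
        found.append({'Area name': borough, 'Latitude': latlng[0], 'Longitude': latlng[1]})
    return found, missing


# Offline stub standing in for a real geocoder; the coordinates are illustrative only.
def fake_lookup(address):
    known = {'Bexley, London, United Kingdom': [51.45, 0.07]}
    return known.get(address)


found, missing = geocode_boroughs(['Bexley', 'Atlantis'], fake_lookup)
print(found)
print(missing)
```

With a live service, `lookup` could be something like `lambda address: geocoder.arcgis(address).latlng`, and any names collected in `missing` could be retried or corrected by hand.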

In [20]:
df_london_coords = df_london.join(df_coords.set_index('Area name'), on='Area name')
df_london_coords.head()

Unnamed: 0,Cluster Labels,New code,Area name,Population density (per hectare) 2017,"Average Age, 2017",% of resident population born abroad (2015),Happiness score 2011-14 (out of 10),Latitude,Longitude
2,0,E09000002,Barking and Dagenham,57.882203,32.9,37.8,7.05,51.543932,0.133157
3,0,E09000003,Barnet,44.911536,37.3,35.2,7.37,51.527095,-0.066826
4,4,E09000004,Bexley,40.326396,39.0,16.1,7.21,51.452078,0.069931
5,3,E09000005,Brent,76.816966,35.6,53.9,7.22,51.609783,-0.194672
6,4,E09000006,Bromley,21.840359,40.2,18.3,7.44,51.601511,-0.066365


In [21]:
# get the coordinates of London

address = 'London, United Kingdom'

geolocator = Nominatim(user_agent="london_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of London are {}, {}.'.format(latitude, longitude))

The geographical coordinates of London are 51.5073219, -0.1276474.


In [22]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=10)

# set color scheme for the clusters: one evenly spaced rainbow colour per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map, coloured by cluster label
for lat, lon, poi, cluster in zip(df_london_coords['Latitude'], df_london_coords['Longitude'], df_london_coords['Area name'], df_london_coords['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)

map_clusters

<h3>Part 2: Venues in London</h3>

<h5>Load and clean the data</h5>

In [23]:
pip install lxml






In [24]:
pip install OSGridConverter



In [25]:
from OSGridConverter import grid2latlong
import requests # library to handle requests

In [26]:
url = 'https://en.wikipedia.org/wiki/List_of_areas_of_London'
table = pd.read_html(url)
df_london_raw = table[1]
df_london_raw.head()

Unnamed: 0,Location,London borough,Post town,Postcode district,Dial code,OS grid ref
0,Abbey Wood,"Bexley, Greenwich [7]",LONDON,SE2,20,TQ465785
1,Acton,"Ealing, Hammersmith and Fulham[8]",LONDON,"W3, W4",20,TQ205805
2,Addington,Croydon[8],CROYDON,CR0,20,TQ375645
3,Addiscombe,Croydon[8],CROYDON,CR0,20,TQ345665
4,Albany Park,Bexley,"BEXLEY, SIDCUP","DA5, DA14",20,TQ478728


In [27]:
# clean the data: strip footnote markers such as [8] and keep only rows whose post town is LONDON
df_london_clean = df_london_raw.replace(to_replace=r'\[\d{,2}\]', value="", regex=True)
df_london_town = df_london_clean[df_london_clean['Post town']=='LONDON'].copy()
df_london_town.columns = ['Location','London Borough','Town','Postcode','Dial code','OS grid ref']

In [28]:
df_london_town.drop(columns=['Dial code','Town'], inplace=True)


In [29]:
df_london_group = df_london_town.groupby('OS grid ref').agg(lambda x : ','.join(set(x)))
df_london_group.head()

OS grid ref,Location,London Borough,Postcode
TQ153802,"Hanwell,West Ealing",Ealing,"W7,W13"
TQ175805,Ealing,Ealing,"W5, W13"
TQ195785,Gunnersbury,Hounslow,W4
TQ195828,Park Royal,"Brent, Ealing",NW10
TQ195885,Kingsbury,Brent,NW9


In [30]:
df_london_group.reset_index(level=0, inplace=True)

<h5>Find locations</h5>

In [31]:
# create an empty list to be populated with dictionaries
coords = []

# populate the list by converting each OS grid reference to latitude and longitude
for gridref in df_london_group['OS grid ref']:
    latlong = grid2latlong(gridref)
    coord = {'OS grid ref': gridref, 'Latitude': latlong.latitude, 'Longitude': latlong.longitude}
    coords.append(coord)

# convert list to dataframe
df_coords = pd.DataFrame(coords)

In [32]:
df_london_coords = df_london_group.join(df_coords.set_index('OS grid ref'), on='OS grid ref')

In [42]:
df_london_coords.head()

Unnamed: 0,OS grid ref,Location,London Borough,Postcode,Latitude,Longitude
0,TQ153802,"Hanwell,West Ealing",Ealing,"W7,W13",51.508979,-0.33963
1,TQ175805,Ealing,Ealing,"W5, W13",51.511222,-0.307823
2,TQ195785,Gunnersbury,Hounslow,W4,51.492828,-0.279675
3,TQ195828,Park Royal,"Brent, Ealing",NW10,51.531474,-0.278218
4,TQ195885,Kingsbury,Brent,NW9,51.582702,-0.276281
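
For readers unfamiliar with Ordnance Survey grid references, the sketch below decodes a six-figure reference such as `TQ465785` into an easting and northing in metres. The two letters name a 100 km square and the digits give the position inside it, here to 100 m resolution. Only the `TQ` square, which covers most of London, is hard-coded; `SQUARE_ORIGINS` and `gridref_to_easting_northing` are hypothetical names. Converting eastings and northings to latitude and longitude, which `grid2latlong` handles in this notebook, is a separate step that is not shown.

```python
# Origin (easting, northing) in metres of each 100 km square this sketch supports.
# Only TQ is included; other squares would need adding to this hypothetical table.
SQUARE_ORIGINS = {'TQ': (500_000, 100_000)}


def gridref_to_easting_northing(gridref):
    """Decode a reference like 'TQ465785' to the south-west corner of its cell."""
    ref = gridref.replace(' ', '').upper()
    letters, digits = ref[:2], ref[2:]
    if letters not in SQUARE_ORIGINS:
        raise ValueError('unsupported grid square: %s' % letters)
    if not digits.isdigit() or len(digits) % 2 != 0 or len(digits) > 10:
        raise ValueError('expected 2 to 10 digits in equal halves: %s' % gridref)

    half = len(digits) // 2
    # Each half is a distance within the 100 km square; scale it to metres.
    scale = 10 ** (5 - half)
    origin_e, origin_n = SQUARE_ORIGINS[letters]
    easting = origin_e + int(digits[:half]) * scale
    northing = origin_n + int(digits[half:]) * scale
    return easting, northing


print(gridref_to_easting_northing('TQ465785'))  # (546500, 178500)
```

Because a six-figure reference identifies a 100 m cell rather than a point, the coordinates derived from it are only approximate locations for each area.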


<h5>Foursquare</h5>

In [34]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # replace with your own Foursquare client ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # replace with your own Foursquare client secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

In [35]:
def getNearbyVenues(names, latitudes, longitudes, radius=800):
    """Return a dataframe of venues within `radius` metres of each (latitude, longitude) pair."""

    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
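
The function above assumes that every request succeeds and that every venue has at least one category. A more defensive version of the extraction step might look like the sketch below. `extract_venue_rows` is a hypothetical helper, and the sample dictionary only mimics the fields that `getNearbyVenues` reads from the response; it is not real API output.

```python
def extract_venue_rows(name, lat, lng, payload):
    """Flatten one explore-style response into rows, skipping incomplete venues."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []

    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        categories = venue.get('categories', [])
        if not categories or 'lat' not in location or 'lng' not in location:
            continue  # skip venues missing a category or coordinates
        rows.append((name, lat, lng, venue.get('name'),
                     location['lat'], location['lng'], categories[0]['name']))
    return rows


# Made-up payload with one complete venue and one venue without a category.
sample_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Example Pub', 'location': {'lat': 51.5, 'lng': -0.3},
               'categories': [{'name': 'Pub'}]}},
    {'venue': {'name': 'Uncategorised Place', 'location': {'lat': 51.5, 'lng': -0.3},
               'categories': []}},
]}]}}

rows = extract_venue_rows('TQ153802', 51.508979, -0.33963, sample_payload)
print(rows)
print(extract_venue_rows('TQ153802', 51.508979, -0.33963, {}))  # empty response
```

Returning an empty list for a malformed or empty response lets the calling loop continue with the remaining neighbourhoods instead of stopping at the first failure.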

In [36]:
london_venues = getNearbyVenues(names=df_london_coords['OS grid ref'],
                                   latitudes=df_london_coords['Latitude'],
                                   longitudes=df_london_coords['Longitude']
                                  )

TQ153802
TQ175805
TQ195785
TQ195828
TQ195885
TQ203839
TQ205755
TQ205775
TQ205785
TQ205805
TQ205918
TQ208793
TQ209852
TQ213897
TQ215715
TQ215835
TQ215855
TQ215885
TQ215888
TQ216823
TQ217905
TQ225745
TQ225765
TQ225815
TQ225865
TQ225925
TQ226776
TQ227846
TQ227891
TQ229887
TQ230874
TQ233786
TQ233807
TQ235685
TQ235755
TQ235798
TQ235825
TQ235855
TQ239709
TQ245765
TQ245805
TQ245835
TQ245845
TQ245865
TQ245945
TQ246783
TQ246798
TQ246832
TQ248876
TQ250695
TQ253779
TQ254784
TQ255705
TQ255735
TQ255755
TQ255765
TQ255795
TQ255805
TQ255825
TQ255855
TQ255905
TQ256925
TQ257853
TQ262818
TQ265735
TQ265765
TQ265785
TQ265835
TQ265855
TQ265885
TQ265895
TQ265925
TQ265935
TQ266842
TQ267814
TQ273845
TQ275705
TQ275715
TQ275775
TQ275795
TQ275825
TQ276920
TQ278918
TQ280932
TQ281844
TQ281896
TQ282838
TQ285735
TQ285765
TQ285805
TQ285815
TQ285845
TQ285855
TQ285875
TQ285945
TQ287897
TQ293816
TQ295755
TQ295775
TQ295785
TQ295795
TQ295805
TQ295815
TQ295825
TQ295845
TQ295855
TQ295885
TQ295925
TQ296942
TQ297867
TQ298914
...

In [70]:
london_venues.head()

Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,TQ153802,51.508979,-0.33963,The Golden Chip,51.507057,-0.339348,Fish & Chips Shop
1,TQ153802,51.508979,-0.33963,The Dodo Micropub,51.506697,-0.337481,Pub
2,TQ153802,51.508979,-0.33963,Fade To Black,51.508698,-0.3378,Coffee Shop
3,TQ153802,51.508979,-0.33963,Big Bites Cafe,51.508726,-0.338032,Café
4,TQ153802,51.508979,-0.33963,Brent Lodge Park,51.512359,-0.348323,Park


In [71]:
# one hot encoding
london_onehot = pd.get_dummies(london_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
london_onehot['Neighbourhood'] = london_venues['Neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [london_onehot.columns[-1]] + list(london_onehot.columns[:-1])
london_onehot = london_onehot[fixed_columns]

london_grouped = london_onehot.groupby('Neighbourhood').mean().reset_index()

In [72]:
london_grouped.head()

Unnamed: 0,Neighbourhood,ATM,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,American Restaurant,Animal Shelter,Antique Shop,Aquarium,...,Whisky Bar,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store,Xinjiang Restaurant,Yoga Studio,Zoo,Zoo Exhibit
0,TQ153802,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,TQ175805,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,TQ195785,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,TQ195828,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,TQ195885,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
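
To make the encoding step concrete, the toy example below applies the same one-hot-then-average pattern to a handful of made-up venues. Because each venue contributes a single 1 in its category column, the group mean is the share of a neighbourhood's venues that fall in each category. The neighbourhood names and venues are invented for illustration.

```python
import pandas as pd

toy_venues = pd.DataFrame({
    'Neighbourhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Pub', 'Pub', 'Café', 'Park', 'Park', 'Park'],
})

# One indicator column per category, cast to float so the means are proportions.
toy_onehot = pd.get_dummies(toy_venues['Venue Category']).astype(float)
toy_onehot.insert(0, 'Neighbourhood', toy_venues['Neighbourhood'])

# Averaging the indicators gives each category's share of venues per neighbourhood.
toy_grouped = toy_onehot.groupby('Neighbourhood').mean().reset_index()
print(toy_grouped)
```

Each row of the result sums to 1 across the category columns, which is why the values in the table above can be read as proportions.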


In [75]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [77]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))
        
# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighbourhood'] = london_grouped['Neighbourhood']

# london_temp = london_grouped.drop('Cluster Labels', axis=1)

for ind in np.arange(london_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(london_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,TQ153802,Park,Pub,Café,Bus Stop,Coffee Shop,Asian Restaurant,Canal Lock,Train Station,Fish & Chips Shop,Music Store
1,TQ175805,Coffee Shop,Pub,Café,Park,Italian Restaurant,Pizza Place,Platform,Gym / Fitness Center,Burger Joint,Bakery
2,TQ195785,Coffee Shop,Pub,Café,Park,Sandwich Place,Gym / Fitness Center,Pizza Place,Sporting Goods Shop,Gastropub,Bookstore
3,TQ195828,Movie Theater,Café,Hotel,Hookah Bar,Sandwich Place,Bar,Gym / Fitness Center,Chinese Restaurant,Shipping Store,Grocery Store
4,TQ195885,Grocery Store,Indian Restaurant,Coffee Shop,Fast Food Restaurant,Sandwich Place,Snack Place,Supermarket,Herbs & Spices Store,Playground,Dessert Shop
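
The column labels above come out right because only ten ranks are generated: any index past `['st', 'nd', 'rd']` falls back to `th`. If `num_top_venues` were raised above 20, a rank such as 21 would be labelled `21th`. A small helper, given the hypothetical name `ordinal` here, would handle the general case:

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


columns = ['Neighbourhood'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)]
print(columns[:4])
```

For the ten ranks used in this notebook the helper produces the same labels as the original loop.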


In [78]:
# set number of clusters
kclusters = 5

# drop the neighbourhood column
london_grouped_clustering = london_grouped.drop(columns='Neighbourhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(london_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 0, 3, 2, 2, 0, 0, 0, 0])

In [79]:
# add clustering labels
london_grouped.insert(0, 'Cluster Labels', kmeans.labels_)
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

df_london_coords.rename(columns={'OS grid ref': 'Neighbourhood'}, inplace=True)

# merge df_london_coords with neighborhoods_venues_sorted so each neighbourhood has its
# location, latitude/longitude, cluster label and most common venues in one dataframe
london_merged = df_london_coords.join(neighborhoods_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')

london_merged.head()

Unnamed: 0,Neighbourhood,Location,London Borough,Postcode,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,TQ153802,"Hanwell,West Ealing",Ealing,"W7,W13",51.508979,-0.33963,0,Park,Pub,Café,Bus Stop,Coffee Shop,Asian Restaurant,Canal Lock,Train Station,Fish & Chips Shop,Music Store
1,TQ175805,Ealing,Ealing,"W5, W13",51.511222,-0.307823,0,Coffee Shop,Pub,Café,Park,Italian Restaurant,Pizza Place,Platform,Gym / Fitness Center,Burger Joint,Bakery
2,TQ195785,Gunnersbury,Hounslow,W4,51.492828,-0.279675,0,Coffee Shop,Pub,Café,Park,Sandwich Place,Gym / Fitness Center,Pizza Place,Sporting Goods Shop,Gastropub,Bookstore
3,TQ195828,Park Royal,"Brent, Ealing",NW10,51.531474,-0.278218,3,Movie Theater,Café,Hotel,Hookah Bar,Sandwich Place,Bar,Gym / Fitness Center,Chinese Restaurant,Shipping Store,Grocery Store
4,TQ195885,Kingsbury,Brent,NW9,51.582702,-0.276281,2,Grocery Store,Indian Restaurant,Coffee Shop,Fast Food Restaurant,Sandwich Place,Snack Place,Supermarket,Herbs & Spices Store,Playground,Dessert Shop


In [80]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow colour per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map, coloured by cluster label
for lat, lon, poi, cluster in zip(london_merged['Latitude'], london_merged['Longitude'], london_merged['Neighbourhood'], london_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)

map_clusters

In [81]:
# Cluster 0
london_merged.loc[london_merged['Cluster Labels'] == 0, london_merged.columns[[1] + list(range(5, london_merged.shape[1]))]]

Unnamed: 0,Location,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Hanwell,West Ealing",-0.339630,0,Park,Pub,Café,Bus Stop,Coffee Shop,Asian Restaurant,Canal Lock,Train Station,Fish & Chips Shop,Music Store
1,Ealing,-0.307823,0,Coffee Shop,Pub,Café,Park,Italian Restaurant,Pizza Place,Platform,Gym / Fitness Center,Burger Joint,Bakery
2,Gunnersbury,-0.279675,0,Coffee Shop,Pub,Café,Park,Sandwich Place,Gym / Fitness Center,Pizza Place,Sporting Goods Shop,Gastropub,Bookstore
6,"Mortlake,East Sheen",-0.266291,0,Pub,Coffee Shop,Grocery Store,Pizza Place,Golf Course,Park,Middle Eastern Restaurant,Tennis Court,Chinese Restaurant,Tapas Restaurant
7,Grove Park,-0.265609,0,Pub,Café,Gym,Convenience Store,Thai Restaurant,Athletics & Sports,Soccer Field,Pharmacy,Flea Market,Train Station
...,...,...,...,...,...,...,...,...,...,...,...,...,...
231,Canning Town,0.024071,0,Pub,Hotel,Platform,Convenience Store,Coffee Shop,Sandwich Place,Fast Food Restaurant,Tapas Restaurant,Tennis Court,Wine Bar
235,Wanstead,0.026867,0,Pub,Park,Café,Coffee Shop,Restaurant,Metro Station,Grocery Store,Bakery,English Restaurant,Sushi Restaurant
236,South Woodford,0.027668,0,Grocery Store,Coffee Shop,Italian Restaurant,Supermarket,Café,Fast Food Restaurant,Pizza Place,Thai Restaurant,Performing Arts Venue,Greek Restaurant
237,Woodford,0.028068,0,Italian Restaurant,Indian Restaurant,Grocery Store,Turkish Restaurant,Coffee Shop,Pizza Place,Playground,Metro Station,Performing Arts Venue,Chinese Restaurant


In [82]:
# Cluster 1
london_merged.loc[london_merged['Cluster Labels'] == 1, london_merged.columns[[1] + list(range(5, london_merged.shape[1]))]]

Unnamed: 0,Location,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
25,Mill Hill,-0.231577,1,Event Service,Farm,Pub,Athletics & Sports,Fish & Chips Shop,Event Space,Exhibit,Fabric Shop,Factory,Falafel Restaurant
26,Castelnau,-0.235325,1,Pub,Café,French Restaurant,Gastropub,Trail,Park,Coffee Shop,Lake,Italian Restaurant,Flower Shop
44,Totteridge,-0.201972,1,Café,Pub,Bus Stop,Fish Market,Event Space,Exhibit,Fabric Shop,Factory,Falafel Restaurant,Farm
54,Wandsworth,-0.194297,1,Pub,Gym / Fitness Center,Park,Breakfast Spot,Pizza Place,Grocery Store,Furniture / Home Store,Supermarket,Italian Restaurant,Music Store
93,"Archway,Highgate",-0.146726,1,Pub,Café,Indian Restaurant,Italian Restaurant,Coffee Shop,Japanese Restaurant,Pizza Place,Historic Site,Park,Bakery
128,Grange Park,-0.103055,1,English Restaurant,Supermarket,Pub,Golf Course,Zoo Exhibit,Filipino Restaurant,Event Space,Exhibit,Fabric Shop,Factory
130,Oval,-0.11457,1,Construction & Landscaping,Farm,Pub,Fish & Chips Shop,Zoo Exhibit,Fish Market,Event Space,Exhibit,Fabric Shop,Factory
146,West Norwood,-0.095004,1,Pub,Brewery,Park,Performing Arts Venue,Italian Restaurant,Record Shop,Indian Restaurant,Bus Station,Coffee Shop,Nature Preserve
152,Canonbury,-0.090125,1,Pub,Coffee Shop,Café,Gastropub,Thai Restaurant,Grocery Store,Park,Deli / Bodega,Platform,Train Station
177,East Dulwich,-0.06509,1,Pub,Café,Garden,Grocery Store,Park,Fish & Chips Shop,Coffee Shop,Vegetarian / Vegan Restaurant,Gastropub,Thai Restaurant


In [83]:
# Cluster 2
london_merged.loc[london_merged['Cluster Labels'] == 2, london_merged.columns[[1] + list(range(5, london_merged.shape[1]))]]

Unnamed: 0,Location,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Kingsbury,-0.276281,2,Grocery Store,Indian Restaurant,Coffee Shop,Fast Food Restaurant,Sandwich Place,Snack Place,Supermarket,Herbs & Spices Store,Playground,Dessert Shop
5,Stonebridge,-0.266306,2,Rental Car Location,Indian Restaurant,Grocery Store,Plaza,Diner,Hotel,Warehouse Store,Bakery,Dam,Fast Food Restaurant
10,The Hale,-0.260712,2,Construction & Landscaping,Fried Chicken Joint,Bakery,Juice Bar,Park,Grocery Store,Indian Restaurant,Argentinian Restaurant,Falafel Restaurant,Farmers Market
14,Kingston Vale,-0.253265,2,Stables,Outdoors & Recreation,Grocery Store,Sandwich Place,Soccer Field,Coffee Shop,Bar,Flower Shop,Event Service,Event Space
28,"Burroughs, The",-0.229877,2,Grocery Store,Coffee Shop,Bus Stop,Gym / Fitness Center,Pizza Place,Middle Eastern Restaurant,BBQ Joint,Bagel Shop,Metro Station,Chinese Restaurant
29,Hendon,-0.227129,2,Grocery Store,Coffee Shop,Pizza Place,Sushi Restaurant,Gym / Fitness Center,Chinese Restaurant,Bagel Shop,Café,Park,Sandwich Place
48,Golders Green,-0.200091,2,Grocery Store,Coffee Shop,Korean Restaurant,Bakery,Café,Italian Restaurant,Japanese Restaurant,Turkish Restaurant,Park,Sushi Restaurant
49,Merton Park,-0.203615,2,Tram Station,Park,Grocery Store,Pub,Diner,Coffee Shop,Train Station,Pizza Place,Trail,Thai Restaurant
76,Colliers Wood,-0.167301,2,Pub,Grocery Store,Coffee Shop,Indian Restaurant,Bar,Italian Restaurant,Chinese Restaurant,Convenience Store,Supermarket,Asian Restaurant
81,Friern Barnet,-0.158079,2,Grocery Store,Italian Restaurant,Indian Restaurant,Dessert Shop,Supermarket,Playground,Park,Residential Building (Apartment / Condo),Indoor Play Area,Coffee Shop


In [84]:
# Cluster 3
london_merged.loc[london_merged['Cluster Labels'] == 3, london_merged.columns[[1] + list(range(5, london_merged.shape[1]))]]

Unnamed: 0,Location,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Park Royal,-0.278218,3,Movie Theater,Café,Hotel,Hookah Bar,Sandwich Place,Bar,Gym / Fitness Center,Chinese Restaurant,Shipping Store,Grocery Store
12,Brent Park,-0.257206,3,Scandinavian Restaurant,Metro Station,Indian Restaurant,Supermarket,Portuguese Restaurant,Sandwich Place,Bus Stop,Furniture / Home Store,Fast Food Restaurant,Exhibit
13,Colindale,-0.249882,3,Supermarket,Pub,Gym / Fitness Center,Chinese Restaurant,Fast Food Restaurant,Bubble Tea Shop,Italian Restaurant,Pizza Place,Food Court,Café
16,Neasden,-0.248446,3,Sandwich Place,Coffee Shop,Supermarket,Betting Shop,Grocery Store,Chinese Restaurant,Shopping Mall,Harbor / Marina,Discount Store,Metro Station
17,West Hendon,-0.247409,3,Bus Stop,Asian Restaurant,Reservoir,Supermarket,Coffee Shop,Pet Store,Hookah Bar,Hotel,Café,Ice Cream Shop
...,...,...,...,...,...,...,...,...,...,...,...,...,...
239,Custom House,0.028076,3,Hotel,Coffee Shop,Light Rail Station,Restaurant,Deli / Bodega,Hotel Bar,Bar,Café,Pub,Convenience Store
243,Charlton,0.037283,3,Coffee Shop,Grocery Store,Bus Stop,Supermarket,Soccer Stadium,Go Kart Track,Thai Restaurant,Carpet Store,Gym / Fitness Center,Electronics Store
245,Middle Park,0.038279,3,Construction & Landscaping,Soccer Field,Historic Site,Pet Store,Zoo Exhibit,Fast Food Restaurant,Event Service,Event Space,Exhibit,Fabric Shop
246,Eltham,0.050076,3,Pub,Fast Food Restaurant,Grocery Store,Department Store,Pizza Place,Pharmacy,Mediterranean Restaurant,Bookstore,Chinese Restaurant,Supermarket


In [86]:
# Cluster 4
london_merged.loc[london_merged['Cluster Labels'] == 4, london_merged.columns[[1] + list(range(5, london_merged.shape[1]))]]

Unnamed: 0,Location,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
20,Grahame Park,-0.243829,4,Park,Supermarket,History Museum,Bus Stop,Gym / Fitness Center,Café,Metro Station,Chinese Restaurant,Grocery Store,Gift Shop
22,Barnes,-0.237147,4,Park,Café,Farmers Market,Pub,French Restaurant,Movie Theater,Track,Bakery,Bookstore,Indie Movie Theater
23,Wormwood Scrubs,-0.235411,4,Soccer Field,Convenience Store,Park,Gym / Fitness Center,Thai Restaurant,Grocery Store,Bar,Track Stadium,Harbor / Marina,Tunnel
33,Raynes Park,-0.225536,4,Pharmacy,Gym / Fitness Center,Bus Stop,Park,Athletics & Sports,Zoo Exhibit,Filipino Restaurant,Exhibit,Fabric Shop,Factory
43,Childs Hill,-0.204811,4,Park,Gym / Fitness Center,Bus Stop,Hotel,Sushi Restaurant,Coffee Shop,Breakfast Spot,Grocery Store,Food Truck,Fabric Shop
69,Hampstead Garden Suburb,-0.175231,4,Bakery,Breakfast Spot,Pet Store,Grocery Store,Park,Outdoors & Recreation,Food Stand,Farmers Market,Fountain,Exhibit
83,Brunswick Park,-0.151862,4,Park,Electronics Store,Bus Stop,Café,Zoo Exhibit,Fish & Chips Shop,Exhibit,Fabric Shop,Factory,Falafel Restaurant
238,Horn Park,0.025365,4,Track,Pet Store,Grocery Store,Pub,Park,Soccer Field,Farmers Market,Event Service,Event Space,Exhibit
241,Mottingham,0.034881,4,Pet Store,Other Repair Shop,Gym,Park,Motorcycle Shop,Gym / Fitness Center,Convenience Store,Food Court,Farmers Market,Forest
242,Kidbrooke,0.036481,4,Grocery Store,Café,Park,Bus Stop,Rugby Pitch,Gastropub,Pool,Fish & Chips Shop,Pub,Fast Food Restaurant
