<h1> The Battle of Neighborhoods

London is a multicultural city made up of many neighborhoods, each with its own history and populated by different ethnic groups. Among these groups, the Italian population is one of the most numerous, and Italian cuisine is one of the most appreciated.

In this project, we use a data-driven approach to understand which neighborhoods offer the best business opportunities for a new Italian restaurant. Specifically, we would like to answer the following question: if someone is looking to open an Italian restaurant, where should they open it?

To answer this question, we segment and cluster the city based on its neighborhoods. This allows us to analyse the distribution of Italian restaurants in London, and thus to identify the areas where this business is still at an early stage.

<h1> Data

To perform the analysis, we extract data on London neighborhoods from publicly available Wikipedia pages. We then use Foursquare to obtain data on the venues in each neighborhood.


We start by importing the required packages:

In [182]:
import pandas as pd

import numpy as np

from bs4 import BeautifulSoup

import json

import requests
import lxml
from pandas import json_normalize  # pandas.io.json.json_normalize is deprecated since pandas 1.0

import matplotlib.cm as cm
import matplotlib.colors as colors

import folium 

from sklearn.cluster import KMeans

import pgeocode

London is made up of the City of London and 32 London boroughs. A list of the areas within them can be found on Wikipedia at the following link:

https://en.wikipedia.org/wiki/List_of_areas_of_London

Using BeautifulSoup, we scrape this page and locate the table by its CSS class, as follows:

In [183]:
link = 'https://en.wikipedia.org/wiki/List_of_areas_of_London'
page = requests.get(link)
print(page)
soup = BeautifulSoup(page.content, 'html')
table = soup.find('table', {'class':'wikitable sortable'}).tbody
table

<Response [200]>


<tbody><tr>
<th>Location</th>
<th>London borough</th>
<th>Post town</th>
<th>Postcode district</th>
<th>Dial code</th>
<th>OS grid ref
</th></tr>
<tr>
<td><a href="/wiki/Abbey_Wood" title="Abbey Wood">Abbey Wood</a></td>
<td>Bexley,  Greenwich <sup class="reference" id="cite_ref-mills1_1-0"><a href="#cite_note-mills1-1">[1]</a></sup></td>
<td>LONDON</td>
<td>SE2</td>
<td>020</td>
<td><span class="plainlinks nourlexpansion" style="white-space: nowrap"><a class="external text" href="https://tools.wmflabs.org/os/coor_g/?pagename=List_of_areas_of_London&amp;params=TQ465785_region%3AGB_scale%3A25000">TQ465785</a></span>
</td></tr>
<tr>
<td><a href="/wiki/Acton,_London" title="Acton, London">Acton</a></td>
<td>Ealing, Hammersmith and Fulham<sup class="reference" id="cite_ref-mills2_2-0"><a href="#cite_note-mills2-2">[2]</a></sup></td>
<td>LONDON</td>
<td>W3, W4</td>
<td>020</td>
<td><span class="plainlinks nourlexpansion" style="white-space: nowrap"><a class="external text" href="https://too

Then, we find all the table rows, we obtain the column headers, and we build the dataframe:

In [184]:

rows = table.find_all('tr')

columns = [i.text.replace('\n', '') for i in rows[0].find_all('th')]

df = pd.DataFrame(columns = columns)

df.columns[3]

'Postcode\xa0district'

Then, we fill the dataframe using a for loop:

In [185]:
records = []
for row in rows[1:]:
    tds = row.find_all('td')
    # skip rows whose number of cells does not match the header
    if len(tds) != len(columns):
        continue
    # strip newlines and non-breaking spaces from every cell
    records.append([td.text.replace('\n', '').replace('\xa0', '') for td in tds])

# build the dataframe in one go (DataFrame.append was removed in pandas 2.0)
df = pd.DataFrame(records, columns=columns)

df.head()

Unnamed: 0,Location,London borough,Post town,Postcode district,Dial code,OS grid ref
0,Abbey Wood,"Bexley, Greenwich [1]",LONDON,SE2,20,TQ465785
1,Acton,"Ealing, Hammersmith and Fulham[2]",LONDON,"W3, W4",20,TQ205805
2,Addington,Croydon[2],CROYDON,CR0,20,TQ375645
3,Addiscombe,Croydon[2],CROYDON,CR0,20,TQ345665
4,Albany Park,Bexley,"BEXLEY, SIDCUP","DA5, DA14",20,TQ478728


We rename some columns for convenience and drop the columns we do not need. We also keep only the locations whose post town is London:

In [186]:
df=df.rename(columns = {'London\xa0borough':'Borough'})
df=df.rename(columns = {'Postcode\xa0district':'Postcode'})
df['Borough'] = df['Borough'].map(lambda x: x.rstrip(']').rstrip('0123456789').rstrip('['))

df = df[['Location', 'Borough', 'Postcode', 'Post town']].reset_index(drop=True)

df=df[df['Post town']=='LONDON']

df.head()

Unnamed: 0,Location,Borough,Postcode,Post town
0,Abbey Wood,"Bexley, Greenwich",SE2,LONDON
1,Acton,"Ealing, Hammersmith and Fulham","W3, W4",LONDON
6,Aldgate,City,EC3,LONDON
7,Aldwych,Westminster,WC2,LONDON
9,Anerley,Bromley,SE20,LONDON
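The footnote markers that Wikipedia appends to borough names (for example `Croydon[2]`) are what the `rstrip` chain above removes. The following is a minimal sketch on strings taken from the table. It also shows a regular-expression variant, which is an alternative and not what the notebook uses, that trims the whitespace left in front of the marker. The helper names are introduced here for illustration only.

```python
import re

def strip_footnote_rstrip(name):
    # Same chain as in the notebook: drop ']', then digits, then '['
    return name.rstrip(']').rstrip('0123456789').rstrip('[')

def strip_footnote_regex(name):
    # Alternative (not in the notebook): remove one trailing '[n]' marker
    # together with any whitespace around it
    return re.sub(r'\s*\[\d+\]\s*$', '', name)

print(strip_footnote_rstrip('Croydon[2]'))
print(repr(strip_footnote_rstrip('Bexley, Greenwich [1]')))  # a trailing space survives
print(strip_footnote_regex('Bexley, Greenwich [1]'))
```

The `rstrip` chain would also eat trailing digits of a name that carries no footnote at all, which the anchored regular expression avoids.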


When a location has multiple postcodes, we keep only the first one:

In [187]:
df['Postcode']=df['Postcode'].str.split(',', expand=True).iloc[:,0]


df.Postcode = df.Postcode.str.strip()
#df= df[df['Postcode'].str.startswith(('SW'))].reset_index(drop=True)
df= df.reset_index(drop=True)
df.head()

Unnamed: 0,Location,Borough,Postcode,Post town
0,Abbey Wood,"Bexley, Greenwich",SE2,LONDON
1,Acton,"Ealing, Hammersmith and Fulham",W3,LONDON
2,Aldgate,City,EC3,LONDON
3,Aldwych,Westminster,WC2,LONDON
4,Anerley,Bromley,SE20,LONDON
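Keeping the first postcode amounts to splitting on the comma and trimming whitespace. Here is a plain-Python sketch of the same operation on values from the table (the helper name `first_postcode` is introduced for illustration):

```python
def first_postcode(postcodes):
    # 'W3, W4' -> ['W3', ' W4'] -> 'W3'
    return postcodes.split(',')[0].strip()

for raw in ['SE2', 'W3, W4', 'DA5, DA14']:
    print(raw, '->', first_postcode(raw))
```

Because only the postcode district is kept, locations that share a district are later geocoded to the same point, which would explain why some neighborhoods, such as Barnes and Castelnau, show identical venue profiles further down.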


Let us check the shape:

In [188]:
df.shape

(299, 4)

To obtain the coordinates (latitude and longitude) of the locations, we use the pgeocode package:

In [189]:
nomi = pgeocode.Nominatim('gb')
lat=[]
lon=[]
for x in df['Postcode'].tolist():
    nomi_db = nomi.query_postal_code(x)
    lat.append(nomi_db.latitude)
    lon.append(nomi_db.longitude)

data = {'Latitude': lat, 'Longitude': lon}

data

{'Latitude': [51.4869,
  51.5114,
  51.5085,
  51.5142,
  51.4065,
  51.5201,
  51.5649,
  51.6197,
  51.4469,
  51.4987,
  51.5201,
  51.4735,
  51.5344,
  51.4747,
  51.5143,
  51.4927,
  51.5015,
  51.4449,
  51.5574,
  51.4987,
  51.55,
  51.5125,
  51.4647,
  51.4647,
  51.4906,
  51.5236,
  51.6197,
  51.525,
  51.6,
  51.5649,
  51.55,
  51.4548,
  51.46100000000001,
  51.525,
  51.4876,
  51.5433,
  51.6197,
  51.6,
  51.4739,
  51.55,
  51.5406,
  51.4906,
  51.5768,
  51.4999,
  51.5344,
  51.4735,
  51.4449,
  51.5406,
  51.5142,
  51.4842,
  51.4876,
  51.5649,
  51.5166,
  51.4448,
  51.6303,
  51.4927,
  51.55,
  51.601000000000006,
  51.474,
  51.5201,
  51.5864,
  51.4163,
  51.6197,
  51.5142,
  51.5649,
  51.46100000000001,
  51.4869,
  51.5835,
  51.4193,
  51.4906,
  51.4999,
  51.54600000000001,
  51.4406,
  51.5344,
  51.4739,
  51.4776,
  51.5649,
  51.4439,
  51.5122,
  51.487,
  51.4439,
  51.4544,
  51.5895,
  51.5308,
  51.4655,
  nan,
  51.4987,
  51.4506,
 

We then add them to the main dataframe:

In [190]:
coordinates = pd.DataFrame.from_dict(data)
df['Latitude'] = coordinates['Latitude']
df['Longitude'] = coordinates['Longitude']

df.dropna(inplace=True)
df= df.reset_index(drop=True)
df.head()


Unnamed: 0,Location,Borough,Postcode,Post town,Latitude,Longitude
0,Abbey Wood,"Bexley, Greenwich",SE2,LONDON,51.4869,0.1075
1,Acton,"Ealing, Hammersmith and Fulham",W3,LONDON,51.5114,-0.265717
2,Aldgate,City,EC3,LONDON,51.5085,-0.1257
3,Aldwych,Westminster,WC2,LONDON,51.5142,-0.123382
4,Anerley,Bromley,SE20,LONDON,51.4065,-0.05695
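pgeocode returns NaN for a postcode it cannot resolve (one such value is visible in the latitude list above), and `dropna` discards those rows. Below is a small sketch of that filtering step in plain Python. The second record, with its `'XX0'` postcode, is hypothetical and only stands in for an unresolved entry.

```python
import math

records = [
    {'Location': 'Abbey Wood', 'Postcode': 'SE2', 'Latitude': 51.4869, 'Longitude': 0.1075},
    # hypothetical unresolved postcode: both coordinates are NaN
    {'Location': 'Example', 'Postcode': 'XX0', 'Latitude': float('nan'), 'Longitude': float('nan')},
    {'Location': 'Anerley', 'Postcode': 'SE20', 'Latitude': 51.4065, 'Longitude': -0.05695},
]

def has_coordinates(record):
    # NaN never compares equal to itself, so test with math.isnan
    return not (math.isnan(record['Latitude']) or math.isnan(record['Longitude']))

located = [r for r in records if has_coordinates(r)]
print([r['Location'] for r in located])
```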


We then check the final dimensions of our dataframe:

In [191]:
df.shape

(298, 6)

Finally, we use Foursquare to retrieve the venues around each of these London locations.
This allows us to suggest the most promising locations.


<h1>Segmenting and Clustering in West London

In [192]:
# geopy converts an address into latitude and longitude values
# (the remaining packages were already imported at the top of the notebook)
from geopy.geocoders import Nominatim


address = 'London, United Kingdom'

geolocator = Nominatim(user_agent="my_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of London are {}, {}.'.format(latitude, longitude))

The geographical coordinates of London are 51.5073219, -0.1276474.


<h1>Create a map of London with neighborhoods superimposed on top.

In [193]:
# create map of London using latitude and longitude values
map_london = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(df['Latitude'], df['Longitude'], df['Borough'], df['Location']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_london)  
    
map_london

<h1>Define Foursquare Credentials and Version

In [194]:
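import os

# read the credentials from the environment instead of hard-coding them
# (the variable names FOURSQUARE_CLIENT_ID / FOURSQUARE_CLIENT_SECRET are our own choice)
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret
VERSION = '20180605' # Foursquare API version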

#print('Your credentails:')
#print('CLIENT_ID: ' + CLIENT_ID)
#print('CLIENT_SECRET:' + CLIENT_SECRET)

<h1>Explore Neighborhoods in London

In [195]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
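        # make the GET request; an unsuccessful call (for example, rejected
        # credentials or an exhausted quota) comes back without 'groups',
        # so skip that location instead of raising a KeyError
        response = requests.get(url).json().get('response', {})
        if not response.get('groups'):
            continue
        results = response['groups'][0]['items']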
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    columns = ['Neighborhood', 
               'Neighborhood Latitude', 
               'Neighborhood Longitude', 
               'Venue', 
               'Venue Latitude', 
               'Venue Longitude', 
               'Venue Category']
    # pass the column names at construction so that an empty result still yields a valid dataframe
    nearby_venues = pd.DataFrame(
        [item for venue_list in venues_list for item in venue_list],
        columns=columns)
    
    return nearby_venues
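The request URL above is assembled with `str.format`. As a sketch of an alternative, `urllib.parse.urlencode` can build the same query string and takes care of escaping. The helper names `build_explore_url` and `extract_items`, the placeholder credentials, and the error payload in the last line are assumptions made for this example, and no network request is sent. `extract_items` returns an empty list when the payload has no `groups` key, which is the case the `['groups'][0]` lookup cannot handle.

```python
from urllib.parse import urlencode, urlparse, parse_qs

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    # Same parameters as the notebook's URL, encoded by the standard library
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    return EXPLORE_ENDPOINT + '?' + urlencode(params)

def extract_items(payload):
    # Return the venue items, or [] when the response carries no groups
    groups = payload.get('response', {}).get('groups') or []
    return groups[0]['items'] if groups else []

# Placeholder credentials: nothing is sent over the network here
url = build_explore_url('ID_PLACEHOLDER', 'SECRET_PLACEHOLDER', '20180605', 51.4469, -0.1384)
query = parse_qs(urlparse(url).query)
print(query['ll'], query['radius'], query['limit'])

# Illustrative error payload with an empty 'response'
print(extract_items({'meta': {'code': 429}, 'response': {}}))
```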

In [197]:
LIMIT = 100 # limit of number of venues returned by Foursquare API

radius = 500 # define radius

venues = getNearbyVenues(names=df['Location'],
                         latitudes=df['Latitude'],
                         longitudes=df['Longitude'],
                         radius=radius)



Abbey Wood


KeyError: 'groups'

In [198]:
print(venues.shape)
venues.head()

(1469, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Balham,51.4469,-0.1384,Grafton Tennis & Squash,51.444039,-0.135662,Tennis Court
1,Balham,51.4469,-0.1384,Poynders Fish Bar,51.449992,-0.136211,Fish & Chips Shop
2,Balham,51.4469,-0.1384,Unit 9 Rehearsal Studio,51.444807,-0.14441,Music Store
3,Balham,51.4469,-0.1384,The Aquatic Design Centre,51.445771,-0.145124,Pet Service
4,Barnes,51.4735,-0.2484,Barnes Farmers Market,51.472922,-0.247694,Farmers Market


Let's check how many venues were returned for each location:

In [199]:
venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Balham,4,4,4,4,4,4
Barnes,20,20,20,20,20,20
Battersea,23,23,23,23,23,23
Belgravia,50,50,50,50,50,50
Brixton,18,18,18,18,18,18
Brompton,60,60,60,60,60,60
Castelnau,20,20,20,20,20,20
Chelsea,60,60,60,60,60,60
Clapham,31,31,31,31,31,31
Colliers Wood,40,40,40,40,40,40


In [200]:
# one hot encoding
onehot = pd.get_dummies(venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
onehot['Neighborhood'] = venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [onehot.columns[-1]] + list(onehot.columns[:-1])
onehot = onehot[fixed_columns]

onehot.head()

Unnamed: 0,Neighborhood,American Restaurant,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,Australian Restaurant,BBQ Joint,Bagel Shop,Bakery,...,Train Station,Tram Station,Vegetarian / Vegan Restaurant,Veterinarian,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,Balham,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Balham,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Balham,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Balham,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Barnes,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [201]:
onehot.shape

(1469, 146)

In [202]:
grouped = onehot.groupby('Neighborhood').mean().reset_index()
grouped.head()

Unnamed: 0,Neighborhood,American Restaurant,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,Australian Restaurant,BBQ Joint,Bagel Shop,Bakery,...,Train Station,Tram Station,Vegetarian / Vegan Restaurant,Veterinarian,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,Balham,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Barnes,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.05,...,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Battersea,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Belgravia,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02
4,Brixton,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [203]:
grouped.shape

(42, 146)
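Averaging the one-hot columns per neighborhood is the same as computing, for every neighborhood, the share of its venues that fall in each category. The sketch below reproduces that arithmetic without pandas, using the four Balham venues and the first Barnes venue from the venues table above. `category_shares` is a name introduced for the example.

```python
from collections import Counter, defaultdict

# (neighborhood, venue category) pairs taken from the venues table above;
# only the first of Barnes' venues is included, so its share is not the real one
venue_rows = [
    ('Balham', 'Tennis Court'),
    ('Balham', 'Fish & Chips Shop'),
    ('Balham', 'Music Store'),
    ('Balham', 'Pet Service'),
    ('Barnes', 'Farmers Market'),
]

def category_shares(rows):
    # Count categories per neighborhood, then divide by the neighborhood total
    counts = defaultdict(Counter)
    for neighborhood, category in rows:
        counts[neighborhood][category] += 1
    return {
        hood: {cat: n / sum(cats.values()) for cat, n in cats.items()}
        for hood, cats in counts.items()
    }

shares = category_shares(venue_rows)
print(shares['Balham'])
```

Each of Balham's four categories gets a share of 0.25, in line with the frequencies printed for Balham in the next cells.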

Let's print each neighborhood along with its top 3 most common venues:

In [204]:
num_top_venues = 3

for hood in grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = grouped[grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Balham----
          venue  freq
0   Pet Service  0.25
1   Music Store  0.25
2  Tennis Court  0.25


----Barnes----
            venue  freq
0             Pub  0.15
1   Grocery Store  0.10
2  Farmers Market  0.10


----Battersea----
                venue  freq
0                 Pub  0.13
1  Italian Restaurant  0.09
2            Bus Stop  0.09


----Belgravia----
           venue  freq
0         Palace  0.06
1  Historic Site  0.06
2          Hotel  0.06


----Brixton----
               venue  freq
0        Coffee Shop  0.11
1        Music Venue  0.11
2  Convenience Store  0.11


----Brompton----
                 venue  freq
0               Bakery  0.07
1   English Restaurant  0.07
2  Japanese Restaurant  0.07


----Castelnau----
            venue  freq
0             Pub  0.15
1   Grocery Store  0.10
2  Farmers Market  0.10


----Chelsea----
                 venue  freq
0               Bakery  0.07
1   English Restaurant  0.07
2  Japanese Restaurant  0.07


----Clapham----
           

In [205]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [213]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = grouped['Neighborhood']

for ind in np.arange(grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Balham,Fish & Chips Shop,Pet Service,Tennis Court,Music Store,Event Service,Exhibit,English Restaurant,Farmers Market,Fast Food Restaurant,Furniture / Home Store
1,Barnes,Pub,Farmers Market,Grocery Store,Italian Restaurant,Gym / Fitness Center,Park,Community Center,Coffee Shop,Food & Drink Shop,Café
2,Battersea,Pub,Italian Restaurant,Thai Restaurant,Bus Stop,Playground,Park,Other Great Outdoors,Flea Market,Chinese Restaurant,Soccer Field
3,Belgravia,Historic Site,Coffee Shop,Gym / Fitness Center,Hotel,Palace,Park,English Restaurant,Garden,Pub,Sushi Restaurant
4,Brixton,Convenience Store,Music Venue,Coffee Shop,Pub,Caribbean Restaurant,Pizza Place,Fried Chicken Joint,Modern European Restaurant,Grocery Store,Café


## Cluster Neighborhoods

In [216]:
# set number of clusters
kclusters = 8

grouped_clustering = grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:20] 

array([4, 0, 5, 1, 0, 1, 0, 1, 7, 5, 1, 6, 0, 0, 1, 3, 1, 5, 1, 0],
      dtype=int32)
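k-means alternates between assigning each row to its nearest centroid and moving every centroid to the mean of its rows. The sketch below spells out those two steps in plain Python on four made-up two-dimensional points. It is a simplified illustration with fixed starting centroids, not scikit-learn's implementation, which among other things uses k-means++ initialization by default.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centroids, n_iter=10):
    labels = []
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for every point
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # centroids stopped moving: converged
        centroids = new_centroids
    return labels, centroids

# Made-up venue-frequency vectors (two categories) for four neighborhoods
points = [(0.0, 0.1), (0.1, 0.0), (0.9, 1.0), (1.0, 0.9)]
labels, centroids = kmeans(points, centroids=[(0.0, 0.0), (1.0, 1.0)])
print(labels)
```

In the notebook, each row of `grouped_clustering` plays the role of a point, with one coordinate per venue category.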

Let us check how many occurrences of each venue category are present:

In [217]:
venues_count = venues['Venue Category'].value_counts().to_frame(name='Count')
venues_count.head()

Unnamed: 0,Count
Pub,86
Coffee Shop,84
Hotel,51
Café,49
Italian Restaurant,46


In [244]:
# add clustering labels (dropping any labels left over from a previous run, so the cell can be re-executed)
neighborhoods_venues_sorted = neighborhoods_venues_sorted.drop(columns='Cluster Labels', errors='ignore')
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

merged = df

merged = merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Location')

merged.head() # check the last columns!

Create a map of the clusters:

In [230]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(merged['Latitude'], merged['Longitude'], merged['Location'], merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    
    if np.isnan(cluster):
        pass
    else:
        folium.CircleMarker(
            [lat, lon],
            radius=20,
            popup=label,
            color=rainbow[int(cluster)-1],
            fill=True,
            fill_color=rainbow[int(cluster)-1],
            fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

<h1>Examine Clusters

Cluster 1

In [229]:
cluster_1=merged.loc[merged['Cluster Labels'] == 0, merged.columns[[1] + list(range(5, merged.shape[1]))]]

cluster_1.head()

Unnamed: 0,Borough,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,Richmond upon Thames,-0.2484,0.0,Pub,Farmers Market,Grocery Store,Italian Restaurant,Gym / Fitness Center,Park,Community Center,Coffee Shop,Food & Drink Shop,Café
31,Lambeth,-0.1158,0.0,Convenience Store,Music Venue,Coffee Shop,Pub,Caribbean Restaurant,Pizza Place,Fried Chicken Joint,Modern European Restaurant,Grocery Store,Café
45,Richmond upon Thames,-0.2484,0.0,Pub,Farmers Market,Grocery Store,Italian Restaurant,Gym / Fitness Center,Park,Community Center,Coffee Shop,Food & Drink Shop,Café
84,Richmond upon Thames,-0.2669,0.0,Coffee Shop,Café,Grocery Store,Pizza Place,Pub,Italian Restaurant,Supermarket,Beer Store,Bookstore,Chinese Restaurant
97,Hammersmith and Fulham,-0.1993,0.0,Coffee Shop,Pub,Climbing Gym,Café,Bakery,Thai Restaurant,Gastropub,Grocery Store,Multiplex,Food Court


Cluster 2

In [221]:
merged.loc[merged['Cluster Labels'] == 1, merged.columns[[1] + list(range(5, merged.shape[1]))]]

Unnamed: 0,Borough,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
16,Westminster,-0.141792,1.0,Historic Site,Coffee Shop,Gym / Fitness Center,Hotel,Palace,Park,English Restaurant,Garden,Pub,Sushi Restaurant
34,"Kensington and Chelsea,Hammersmith and Fulham",-0.165875,1.0,Japanese Restaurant,English Restaurant,Bakery,Italian Restaurant,Pizza Place,Coffee Shop,Pub,Park,Café,Ice Cream Shop
50,Kensington and Chelsea,-0.165875,1.0,Japanese Restaurant,English Restaurant,Bakery,Italian Restaurant,Pizza Place,Coffee Shop,Pub,Park,Café,Ice Cream Shop
79,Kensington and Chelsea,-0.191,1.0,Italian Restaurant,Pub,Garden,Park,Pizza Place,Café,Indie Theater,Hotel,Gym / Fitness Center,Grocery Store
138,Kensington and Chelsea,-0.176283,1.0,Hotel,Exhibit,Science Museum,Italian Restaurant,Bakery,Ice Cream Shop,Burger Joint,Café,Sandwich Place,Gift Shop
145,Westminster,-0.141792,1.0,Historic Site,Coffee Shop,Gym / Fitness Center,Hotel,Palace,Park,English Restaurant,Garden,Pub,Sushi Restaurant
170,Westminster,-0.141792,1.0,Historic Site,Coffee Shop,Gym / Fitness Center,Hotel,Palace,Park,English Restaurant,Garden,Pub,Sushi Restaurant
200,Westminster,-0.141792,1.0,Historic Site,Coffee Shop,Gym / Fitness Center,Hotel,Palace,Park,English Restaurant,Garden,Pub,Sushi Restaurant
224,Kensington and Chelsea,-0.176283,1.0,Hotel,Exhibit,Science Museum,Italian Restaurant,Bakery,Ice Cream Shop,Burger Joint,Café,Sandwich Place,Gift Shop
233,Westminster,-0.141792,1.0,Historic Site,Coffee Shop,Gym / Fitness Center,Hotel,Palace,Park,English Restaurant,Garden,Pub,Sushi Restaurant


Cluster 3 

In [231]:
merged.loc[merged['Cluster Labels'] == 2, merged.columns[[1] + list(range(5, merged.shape[1]))]]

Unnamed: 0,Borough,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
208,Merton,-0.22818,2.0,Playground,Road,Yoga Studio,Fast Food Restaurant,French Restaurant,Food Court,Food & Drink Shop,Flea Market,Fish Market,Fish & Chips Shop


Cluster 4

In [232]:
merged.loc[merged['Cluster Labels'] == 3, merged.columns[[1] + list(range(5, merged.shape[1]))]]

Unnamed: 0,Borough,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
144,Kingston upon Thames,-0.2306,3.0,Business Service,Park,Bus Station,Pub,Hotel,Event Service,Fried Chicken Joint,French Restaurant,Food Court,Food & Drink Shop
205,Wandsworth,-0.2306,3.0,Business Service,Park,Bus Station,Pub,Hotel,Event Service,Fried Chicken Joint,French Restaurant,Food Court,Food & Drink Shop
209,Wandsworth,-0.2306,3.0,Business Service,Park,Bus Station,Pub,Hotel,Event Service,Fried Chicken Joint,French Restaurant,Food Court,Food & Drink Shop


Cluster 5 

In [233]:
merged.loc[merged['Cluster Labels'] == 4, merged.columns[[1] + list(range(5, merged.shape[1]))]]

Unnamed: 0,Borough,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,Wandsworth,-0.1384,4.0,Fish & Chips Shop,Pet Service,Tennis Court,Music Store,Event Service,Exhibit,English Restaurant,Farmers Market,Fast Food Restaurant,Furniture / Home Store


Cluster 6

In [235]:
merged.loc[merged['Cluster Labels'] == 5, merged.columns[[1] + list(range(5, merged.shape[1]))]]

Unnamed: 0,Borough,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
13,Wandsworth,-0.161933,5.0,Pub,Italian Restaurant,Thai Restaurant,Bus Stop,Playground,Park,Other Great Outdoors,Flea Market,Chinese Restaurant,Soccer Field
61,Merton,-0.197333,5.0,Grocery Store,Bar,Theater,Café,Sushi Restaurant,Restaurant,Thai Restaurant,Italian Restaurant,Coffee Shop,Portuguese Restaurant
166,Merton,-0.197333,5.0,Grocery Store,Bar,Theater,Café,Sushi Restaurant,Restaurant,Thai Restaurant,Italian Restaurant,Coffee Shop,Portuguese Restaurant
182,Croydon,-0.124267,5.0,Pharmacy,Pizza Place,Pub,Trail,Chinese Restaurant,Clothing Store,Fast Food Restaurant,Coffee Shop,Bar,Grocery Store
226,Merton,-0.197333,5.0,Grocery Store,Bar,Theater,Café,Sushi Restaurant,Restaurant,Thai Restaurant,Italian Restaurant,Coffee Shop,Portuguese Restaurant
245,Lambeth,-0.124267,5.0,Pharmacy,Pizza Place,Pub,Trail,Chinese Restaurant,Clothing Store,Fast Food Restaurant,Coffee Shop,Bar,Grocery Store
291,Merton,-0.197333,5.0,Grocery Store,Bar,Theater,Café,Sushi Restaurant,Restaurant,Thai Restaurant,Italian Restaurant,Coffee Shop,Portuguese Restaurant


Cluster 7

In [239]:
merged.loc[merged['Cluster Labels'] == 6, merged.columns[[1] + list(range(5, merged.shape[1]))]]

Unnamed: 0,Borough,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
80,Wandsworth,-0.1971,6.0,Italian Restaurant,Bar,Indoor Play Area,Gym,Furniture / Home Store,Pizza Place,Pub,Rental Car Location,Café,Indian Restaurant
230,Wandsworth,-0.1971,6.0,Italian Restaurant,Bar,Indoor Play Area,Gym,Furniture / Home Store,Pizza Place,Pub,Rental Car Location,Café,Indian Restaurant
271,Wandsworth,-0.1971,6.0,Italian Restaurant,Bar,Indoor Play Area,Gym,Furniture / Home Store,Pizza Place,Pub,Rental Car Location,Café,Indian Restaurant
