## Exploring London Wards

The City of London is one of the most important and international districts in the world. It goes by several names, including "the City of London" or simply "the City". It covers about 2.9 km^2 (Greater London as a whole is roughly 1,570 km^2) and is divided into 25 wards.

This analysis is addressed to those who seek to open a Pizza place in London. The goal is to understand which area is the most suitable.

I found the wards of the City of London, with their latitude and longitude, at https://www.doogal.co.uk/AdministrativeAreas.php?district=E09000001. The data comes as a CSV file, which I use later. It contains postal codes, latitude, longitude, wards, and some other columns that are not relevant to my analysis.

I will use Foursquare location data to explore the wards and see what types of restaurant each one contains. In particular, I want to find out how many pizza restaurants, as well as other types of restaurant, each ward has.
Starting from the dataset of wards, I will explore each ward, count its restaurants, and identify which ward is the most suitable for opening a pizza place.

My metric for deciding whether it is convenient to open a pizza place is the concentration of competition.
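
One way to make this metric concrete, as a minimal sketch: treat the share of a ward's venues that belong to competing categories as its competition concentration. The function name `competition_share`, the sample records, and the choice of competing categories are assumptions for illustration and are not part of the original analysis.

```python
from collections import Counter


def competition_share(venues, competitor_categories):
    """Return {ward: fraction of that ward's venues in competitor_categories}.

    `venues` is an iterable of (ward, category) pairs.
    """
    totals = Counter()
    competitors = Counter()
    for ward, category in venues:
        totals[ward] += 1
        if category in competitor_categories:
            competitors[ward] += 1
    return {ward: competitors[ward] / totals[ward] for ward in totals}


# Hypothetical sample records, not real Foursquare results
sample = [
    ("Ward A", "Pizza Place"),
    ("Ward A", "Coffee Shop"),
    ("Ward A", "Italian Restaurant"),
    ("Ward A", "Pub"),
    ("Ward B", "Coffee Shop"),
    ("Ward B", "Hotel"),
]
shares = competition_share(sample, {"Pizza Place", "Italian Restaurant"})
print(shares)
```

How a high or low share should be read is a separate decision that the notebook makes later on.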

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import folium 

In [3]:
import json # library to handle JSON files
import requests # library to handle requests
from pandas import json_normalize # transform JSON into a pandas dataframe
import matplotlib.cm as cm
import matplotlib.colors as colors
# import k-means from clustering stage
from sklearn.cluster import KMeans
import geopy
import geopandas

In [19]:
df=pd.read_csv('London_postcodes.csv')

In [20]:
df.shape

(6799, 16)

In [21]:
df.columns

Index(['Postcode', 'In Use?', 'Latitude', 'Longitude', 'Easting', 'Northing',
       'Grid Ref', 'Ward', 'Parish', 'Introduced', 'Terminated', 'Altitude',
       'Country', 'Last Updated', 'Quality', 'LSOA Code'],
      dtype='object')

In [22]:
df=df[df['In Use?']=='Yes']
df.shape

(1709, 16)

## The Data

As I said in the introduction, I found the data at the following link: https://www.doogal.co.uk/AdministrativeAreas.php?district=E09000001.
It lists all the wards of the City of London together with their latitude and longitude. The data comes as a CSV file.

The CSV contains postal codes, latitude, longitude, wards, and some other columns that are not relevant to my analysis.

Thanks to this file I have the wards and their geographical coordinates, but each ward has many different coordinates because it contains many postal codes. For simplicity I will use just one pair per ward, taken from its most recently updated postcode.
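
As a hedged alternative to picking a single postcode, the coordinates of all postcodes in a ward could be averaged to get an approximate ward centre. The sketch below is not what this notebook does; the frame `postcodes` and its values are illustrative stand-ins for the real file.

```python
import pandas as pd

# Hypothetical rows in the shape of the postcode file (values are illustrative)
postcodes = pd.DataFrame({
    "Ward": ["Portsoken", "Portsoken", "Bishopsgate"],
    "Latitude": [51.5155, 51.5157, 51.5189],
    "Longitude": [-0.0756, -0.0768, -0.0784],
})

# One row per ward: the mean of its postcode coordinates
ward_centroids = (
    postcodes.groupby("Ward", as_index=False)[["Latitude", "Longitude"]].mean()
)
print(ward_centroids)
```

Averaging gives a point nearer the middle of each ward, which may matter because the venue search later uses a fixed radius around the chosen point.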

The dataset consists of 6,799 observations, but this number drops considerably, to 1,709, once only the postal codes that are still in use are kept.

In this notebook I keep only the wards and their geographical coordinates and drop all the other columns. Later on I will use the Foursquare API to explore each ward and look at the pizza restaurants in each one.

In [23]:
df.head()

Unnamed: 0,Postcode,In Use?,Latitude,Longitude,Easting,Northing,Grid Ref,Ward,Parish,Introduced,Terminated,Altitude,Country,Last Updated,Quality,LSOA Code
0,E1 6AN,Yes,51.518895,-0.078378,533425,181747,TQ334817,Bishopsgate,"City of London, unparished area",1980-01-01,,32,England,2020-02-19,Within the building of the matched address clo...,E01032739
1,E1 7AA,Yes,51.515567,-0.075635,533625,181382,TQ336813,Portsoken,"City of London, unparished area",2000-12-01,,28,England,2020-02-19,Within the building of the matched address clo...,E01000005
2,E1 7AD,Yes,51.515457,-0.076718,533550,181368,TQ335813,Portsoken,"City of London, unparished area",2013-09-01,,31,England,2020-02-19,Within the building of the matched address clo...,E01000005
3,E1 7AE,Yes,51.515613,-0.076899,533537,181385,TQ335813,Portsoken,"City of London, unparished area",2013-07-01,,30,England,2020-02-19,Within the building of the matched address clo...,E01000005
4,E1 7AF,Yes,51.515613,-0.076899,533537,181385,TQ335813,Portsoken,"City of London, unparished area",2013-01-01,,30,England,2020-02-19,Within the building of the matched address clo...,E01000005


In [24]:
df1=df.sort_values(by="Last Updated").drop_duplicates(subset=["Ward"], keep="last")

In [25]:
df1.reset_index(inplace=True)
df1.head()

Unnamed: 0,index,Postcode,In Use?,Latitude,Longitude,Easting,Northing,Grid Ref,Ward,Parish,Introduced,Terminated,Altitude,Country,Last Updated,Quality,LSOA Code
0,5878,EC4V 2AD,Yes,51.511636,-0.094937,532297,180910,TQ322809,Queenhithe,"City of London, unparished area",2009-07-01,,22,England,2020-02-19,Within the building of the matched address clo...,E01032739
1,5877,EC4V 2AB,Yes,51.512132,-0.093893,532368,180967,TQ323809,Vintry,"City of London, unparished area",1991-09-01,,27,England,2020-02-19,Within the building of the matched address clo...,E01032739
2,5805,EC4R 9AF,Yes,51.510483,-0.08654,532883,180797,TQ328807,Candlewick,"City of London, unparished area",1999-12-01,,31,England,2020-02-19,Within the building of the matched address clo...,E01032739
3,5802,EC4R 9AB,Yes,51.510225,-0.088338,532759,180765,TQ327807,Dowgate,"City of London, unparished area",1980-01-01,,25,England,2020-02-19,Within the building of the matched address clo...,E01032739
4,482,EC1N 2HA,Yes,51.517092,-0.107537,531407,181494,TQ314814,Castle Baynard,"City of London, unparished area",1998-12-01,,31,England,2020-02-19,Within the building of the matched address clo...,E01032740


In [26]:
len(df1)

25

In [107]:
#list of all Wards
df1.Ward.value_counts()

Bread Street          1
Cordwainer            1
Farringdon Without    1
Cripplegate           1
Candlewick            1
Dowgate               1
Portsoken             1
Langbourn             1
Walbrook              1
Billingsgate          1
Vintry                1
Bridge                1
Bishopsgate           1
Farringdon Within     1
Broad Street          1
Bassishaw             1
Cornhill              1
Castle Baynard        1
Aldersgate            1
Tower                 1
Cheap                 1
Queenhithe            1
Aldgate               1
Coleman Street        1
Lime Street           1
Name: Ward, dtype: int64

In [28]:
df1.columns

Index(['index', 'Postcode', 'In Use?', 'Latitude', 'Longitude', 'Easting',
       'Northing', 'Grid Ref', 'Ward', 'Parish', 'Introduced', 'Terminated',
       'Altitude', 'Country', 'Last Updated', 'Quality', 'LSOA Code'],
      dtype='object')

## Creation of the dataset with only Wards and their latitude and longitude


In [29]:
df2=df1[['Ward','Latitude', 'Longitude']]
df2.head()

Unnamed: 0,Ward,Latitude,Longitude
0,Queenhithe,51.511636,-0.094937
1,Vintry,51.512132,-0.093893
2,Candlewick,51.510483,-0.08654
3,Dowgate,51.510225,-0.088338
4,Castle Baynard,51.517092,-0.107537


In [71]:
# plot of the different wards
# centre the map on the mean of the ward coordinates
Latitude = df2['Latitude'].mean()
Longitude = df2['Longitude'].mean()
# create map of the City of London using latitude and longitude values
wards_map = folium.Map(location=[Latitude, Longitude], zoom_start=11)

# add one marker per ward to the map
for lat, lng, label in zip(df2['Latitude'], df2['Longitude'], df2['Ward']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(wards_map)

wards_map

## London City venues

In [32]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID (redacted; do not publish real credentials)
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret (redacted)
VERSION = '20180605' # Foursquare API version


In [126]:
LIMIT = 300 # limit of number of venues returned by Foursquare API
radius = 1000 # defined here but not passed to getNearbyVenues below, which therefore uses its default of 500 m

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Ward', 
                  'Ward Latitude', 
                  'Ward Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
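
The function above assumes that every response has at least one group and that every venue has at least one category. As a minimal sketch of a more defensive variant, the two helpers below build the request URL with encoded query parameters and flatten a response while tolerating missing pieces. The helper names `build_explore_url` and `parse_items` and the `fake` payload are hypothetical; the payload only mimics the fields the notebook reads.

```python
from urllib.parse import urlencode

EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build the explore request URL with properly encoded query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


def parse_items(payload, ward, ward_lat, ward_lng):
    """Flatten a response dict into rows; missing fields become None."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else None
        location = venue.get("location", {})
        rows.append((ward, ward_lat, ward_lng, venue.get("name"),
                     location.get("lat"), location.get("lng"), category))
    return rows


# Hypothetical payload shaped like the fields used in getNearbyVenues
fake = {"response": {"groups": [{"items": [
    {"venue": {"name": "Example Pizzeria",
               "location": {"lat": 51.5, "lng": -0.09},
               "categories": [{"name": "Pizza Place"}]}},
    {"venue": {"name": "No Category Venue",
               "location": {"lat": 51.6, "lng": -0.1},
               "categories": []}},
]}]}}
url = build_explore_url("ID", "SECRET", "20180605", 51.5, -0.09, 500, 100)
rows = parse_items(fake, "Ward A", 51.5, -0.1)
```

Rows with a `None` category can then be dropped or kept explicitly, rather than raising an `IndexError` in the middle of the loop.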

In [127]:

London_venues = getNearbyVenues(names=df2['Ward'],
                                   latitudes=df2['Latitude'],
                                   longitudes=df2['Longitude']
                                  )

Queenhithe
Vintry
Candlewick
Dowgate
Castle Baynard
Farringdon Within
Bishopsgate
Portsoken
Lime Street
Langbourn
Aldgate
Billingsgate
Bridge
Tower
Bread Street
Cornhill
Broad Street
Cripplegate
Walbrook
Cordwainer
Coleman Street
Aldersgate
Bassishaw
Cheap
Farringdon Without


In [128]:
print(London_venues.shape)
London_venues

(2392, 7)


Unnamed: 0,Ward,Ward Latitude,Ward Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Queenhithe,51.511636,-0.094937,Rosslyn,51.512574,-0.093381,Coffee Shop
1,Queenhithe,51.511636,-0.094937,Porterford Butchers,51.513032,-0.093737,Butcher
2,Queenhithe,51.511636,-0.094937,The Merchant House,51.513264,-0.093039,Cocktail Bar
3,Queenhithe,51.511636,-0.094937,Host,51.512629,-0.093211,Coffee Shop
4,Queenhithe,51.511636,-0.094937,M&S Simply Food,51.513590,-0.095297,Grocery Store
...,...,...,...,...,...,...,...
2387,Farringdon Without,51.517484,-0.112941,Soho Coffee,51.513333,-0.113325,Coffee Shop
2388,Farringdon Without,51.517484,-0.112941,Vital Ingredient,51.514238,-0.109501,Salad Place
2389,Farringdon Without,51.517484,-0.112941,El Vino,51.514175,-0.109422,Wine Bar
2390,Farringdon Without,51.517484,-0.112941,Pret A Manger,51.513432,-0.112947,Sandwich Place


In [129]:
London_venues['Venue Category'].unique()

array(['Coffee Shop', 'Butcher', 'Cocktail Bar', 'Grocery Store', 'Café',
       'Roof Deck', 'Vietnamese Restaurant', 'Gym / Fitness Center',
       'Wine Bar', 'Seafood Restaurant', 'Hotel', 'Italian Restaurant',
       'Bookstore', 'Udon Restaurant', 'Chinese Restaurant', 'Park',
       'Sandwich Place', 'Restaurant', 'Indian Restaurant',
       'Scandinavian Restaurant', 'Bar', 'Modern European Restaurant',
       'English Restaurant', 'Pizza Place', 'French Restaurant',
       'Theater', 'Asian Restaurant', 'Pedestrian Plaza', 'Gym', 'Lounge',
       'Scenic Lookout', 'Steakhouse', 'Burger Joint', 'Food Truck',
       'Mexican Restaurant', 'History Museum', 'Falafel Restaurant',
       'Nightclub', 'New American Restaurant', 'Bakery', 'Shopping Mall',
       'Deli / Bodega', 'Clothing Store', 'Plaza', 'American Restaurant',
       'Fast Food Restaurant', 'Art Gallery', 'Hotel Bar',
       'Sushi Restaurant', 'Juice Bar', 'Trail', 'Pub',
       'Turkish Restaurant', 'Salad Place', 

In [130]:
pizza_places=London_venues[London_venues['Venue Category']=='Pizza Place']
pizza_places

Unnamed: 0,Ward,Ward Latitude,Ward Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
28,Queenhithe,51.511636,-0.094937,Homeslice,51.512259,-0.092557,Pizza Place
76,Queenhithe,51.511636,-0.094937,Franco Manca,51.513731,-0.100302,Pizza Place
132,Vintry,51.512132,-0.093893,Homeslice,51.512259,-0.092557,Pizza Place
184,Vintry,51.512132,-0.093893,Franco Manca,51.513731,-0.100302,Pizza Place
276,Candlewick,51.510483,-0.08654,Homeslice,51.512259,-0.092557,Pizza Place
349,Dowgate,51.510225,-0.088338,Homeslice,51.512259,-0.092557,Pizza Place
481,Farringdon Within,51.519765,-0.099199,Pixxa,51.520053,-0.101632,Pizza Place
552,Bishopsgate,51.515796,-0.08038,Pizza Union,51.517699,-0.077416,Pizza Place
567,Bishopsgate,51.515796,-0.08038,1n1 Fashion Pizza,51.516037,-0.075865,Pizza Place
591,Bishopsgate,51.515796,-0.08038,Franco Manca,51.51878,-0.083448,Pizza Place


In [131]:
pizza_places.Ward.value_counts()

Lime Street          3
Portsoken            3
Bishopsgate          3
Cheap                2
Vintry               2
Bread Street         2
Tower                2
Queenhithe           2
Bassishaw            1
Cordwainer           1
Farringdon Within    1
Aldersgate           1
Broad Street         1
Walbrook             1
Coleman Street       1
Aldgate              1
Dowgate              1
Candlewick           1
Name: Ward, dtype: int64

## First, analyse each Ward

In [132]:
# one hot encoding
London_onehot = pd.get_dummies(London_venues[['Venue Category']], prefix="", prefix_sep="")
# add ward column back to dataframe
London_onehot['Ward'] = London_venues['Ward'] 

# move ward column to the first column
fixed_columns = [London_onehot.columns[-1]] + list(London_onehot.columns[:-1])
London_onehot = London_onehot[fixed_columns]
London_onehot.head()

Unnamed: 0,Ward,American Restaurant,Arcade,Argentinian Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,...,Trail,Turkish Restaurant,Udon Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,Queenhithe,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Queenhithe,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Queenhithe,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Queenhithe,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Queenhithe,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [133]:
London_grouped = London_onehot.groupby('Ward').mean().reset_index()
London_grouped

Unnamed: 0,Ward,American Restaurant,Arcade,Argentinian Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,...,Trail,Turkish Restaurant,Udon Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,Aldersgate,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.01,...,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.01,0.01
1,Aldgate,0.0,0.0,0.02,0.0,0.0,0.02,0.01,0.0,0.01,...,0.0,0.02,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0
2,Bassishaw,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.0,...,0.0,0.0,0.01,0.0,0.02,0.0,0.02,0.0,0.0,0.02
3,Billingsgate,0.0,0.0,0.010526,0.0,0.0,0.031579,0.010526,0.0,0.0,...,0.010526,0.021053,0.0,0.0,0.0,0.0,0.010526,0.0,0.0,0.0
4,Bishopsgate,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,...,0.0,0.01,0.0,0.0,0.02,0.0,0.02,0.01,0.0,0.0
5,Bread Street,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,...,0.0,0.0,0.01,0.0,0.03,0.0,0.02,0.0,0.01,0.0
6,Bridge,0.0,0.0,0.01,0.0,0.0,0.03,0.01,0.0,0.0,...,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0
7,Broad Street,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.0,...,0.01,0.01,0.01,0.0,0.02,0.0,0.01,0.0,0.0,0.01
8,Candlewick,0.0,0.0,0.011628,0.0,0.0,0.046512,0.0,0.0,0.0,...,0.023256,0.011628,0.011628,0.0,0.0,0.0,0.011628,0.0,0.0,0.0
9,Castle Baynard,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,...,0.0,0.0,0.0,0.02,0.01,0.01,0.03,0.0,0.0,0.0


In [134]:
num_top_venues = 15
for hood in London_grouped['Ward']:
    print("----"+hood+"----")
    temp = London_grouped[London_grouped['Ward'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Aldersgate----
                         venue  freq
0                  Coffee Shop  0.08
1           Italian Restaurant  0.05
2             Sushi Restaurant  0.05
3   Modern European Restaurant  0.04
4                        Plaza  0.04
5         Gym / Fitness Center  0.04
6                  Art Gallery  0.03
7               Clothing Store  0.03
8                 Cocktail Bar  0.03
9                   Steakhouse  0.03
10                Burger Joint  0.03
11                  Restaurant  0.03
12                Concert Hall  0.02
13                        Park  0.02
14                        Café  0.02


----Aldgate----
                   venue  freq
0                  Hotel  0.11
1   Gym / Fitness Center  0.07
2           Cocktail Bar  0.05
3             Restaurant  0.04
4            Coffee Shop  0.04
5                    Pub  0.03
6     English Restaurant  0.03
7                 Garden  0.03
8      French Restaurant  0.03
9       Asian Restaurant  0.02
10             Hotel Bar  0.02

In [135]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [136]:
num_top_venues = 15
indicators = ['st', 'nd', 'rd']
# create columns according to number of top venues
columns = ['Ward']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
Ward_venues_sorted = pd.DataFrame(columns=columns)
Ward_venues_sorted['Ward'] = London_grouped['Ward']

for ind in np.arange(London_grouped.shape[0]):
    Ward_venues_sorted.iloc[ind, 1:] = return_most_common_venues(London_grouped.iloc[ind, :], num_top_venues)

Ward_venues_sorted.head()

Unnamed: 0,Ward,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
0,Aldersgate,Coffee Shop,Italian Restaurant,Sushi Restaurant,Plaza,Gym / Fitness Center,Modern European Restaurant,Cocktail Bar,Burger Joint,Restaurant,Steakhouse,Art Gallery,Clothing Store,Sandwich Place,Garden,Grocery Store
1,Aldgate,Hotel,Gym / Fitness Center,Cocktail Bar,Restaurant,Coffee Shop,French Restaurant,Garden,English Restaurant,Pub,Salad Place,Café,Scenic Lookout,Hotel Bar,Indian Restaurant,Italian Restaurant
2,Bassishaw,Coffee Shop,Italian Restaurant,Café,Scenic Lookout,Art Gallery,Clothing Store,Steakhouse,Gym / Fitness Center,Sushi Restaurant,Lounge,Roof Deck,Restaurant,Plaza,Park,Yoga Studio
3,Billingsgate,Hotel,Restaurant,Coffee Shop,Gym / Fitness Center,Pub,French Restaurant,Café,Cocktail Bar,Salad Place,Asian Restaurant,Garden,Turkish Restaurant,Fast Food Restaurant,Burger Joint,Beer Bar
4,Bishopsgate,Coffee Shop,Cocktail Bar,Gym / Fitness Center,Indian Restaurant,Pizza Place,Mediterranean Restaurant,Salad Place,Pub,Hotel,Restaurant,Italian Restaurant,Japanese Restaurant,Boxing Gym,Sandwich Place,Middle Eastern Restaurant


From this first analysis we can say that pizza places are not strongly present, whereas Italian restaurants are. This is good for my purposes, but I must also admit that Italian restaurants probably serve pizza too, so to be more precise I will also focus on this category.

Another important consideration is that many of the pizza places are chain stores, whereas the place I suggest opening is more likely a family-run business, closer to the "Italian type". This leads to a completely different type of competitor.
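
A rough way to check the chain-store observation, sketched under assumptions: flag any venue name that appears at more than one distinct location. The helper `find_chains` is hypothetical, and the rows are copied from the `pizza_places` output shown earlier. A chain with only one branch inside the search area will not be flagged by this heuristic.

```python
from collections import defaultdict


def find_chains(venues):
    """Return the names that appear at more than one distinct location.

    `venues` is an iterable of (name, latitude, longitude) tuples; the same
    venue returned for several wards counts as a single location.
    """
    locations = defaultdict(set)
    for name, lat, lng in venues:
        locations[name].add((lat, lng))
    return sorted(name for name, locs in locations.items() if len(locs) > 1)


# Venue name and coordinates taken from the pizza_places table
pizza_rows = [
    ("Homeslice", 51.512259, -0.092557),
    ("Franco Manca", 51.513731, -0.100302),
    ("Homeslice", 51.512259, -0.092557),
    ("Pixxa", 51.520053, -0.101632),
    ("Pizza Union", 51.517699, -0.077416),
    ("Franco Manca", 51.51878, -0.083448),
]
chains = find_chains(pizza_rows)
print(chains)
```

The same grouping also shows that one venue can be counted for several wards, because the search circles of neighbouring wards overlap.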

## Clustering Wards

Since the City of London is not very big, I decided to group the wards into three clusters.
With this clustering I will explore each cluster and look for the one where the presence of Italian food is highest. I will not focus only on pizza places, because there are just a few of them, which is the reason I decided to open one. I will also pay attention to Italian restaurants, which probably serve pizza too.

My goal is to find a division into clusters that lets me say that one cluster is better than the others.
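
The choice of three clusters is a judgment call. One possible sanity check, sketched here on synthetic data rather than on the ward frequencies, is to compare silhouette scores for a few values of k. The points, the range of k, and the variable names are assumptions for illustration; the notebook itself simply fixes k = 3.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic, well-separated 2-D points standing in for the ward frequency vectors
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centers])

# Fit k-means for several k and record the mean silhouette score of each fit
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(points)
    scores[k] = silhouette_score(points, labels)

# The k with the highest score is the best-separated clustering on this data
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On the real data the same loop would take `London_clustering` in place of `points`. With only 25 wards the scores may be close to each other, so they are a guide rather than a verdict.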

In [137]:
# set number of clusters
kclusters = 3
London_clustering = London_grouped.drop(columns='Ward')
# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(London_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 1, 0, 1, 1, 0, 1, 0, 0, 2])

In [138]:
# add clustering labels
Ward_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
London_merged = df2
# join Ward_venues_sorted onto the ward coordinates to add the cluster label and top venues for each ward
London_merged = London_merged.join(Ward_venues_sorted.set_index('Ward'), on='Ward')

London_merged.head() # check the last columns!

Unnamed: 0,Ward,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
0,Queenhithe,51.511636,-0.094937,0,Coffee Shop,Italian Restaurant,Wine Bar,Gym / Fitness Center,Restaurant,Pub,Seafood Restaurant,Asian Restaurant,Modern European Restaurant,Burger Joint,French Restaurant,Vietnamese Restaurant,Pedestrian Plaza,Scenic Lookout,Falafel Restaurant
1,Vintry,51.512132,-0.093893,0,Coffee Shop,Italian Restaurant,Restaurant,Burger Joint,Gym / Fitness Center,Vietnamese Restaurant,Asian Restaurant,Modern European Restaurant,Sushi Restaurant,Seafood Restaurant,Cocktail Bar,Lounge,Salad Place,Steakhouse,Bar
2,Candlewick,51.510483,-0.08654,0,Coffee Shop,Restaurant,Asian Restaurant,Pub,Gym / Fitness Center,Italian Restaurant,Hotel,Historic Site,Fast Food Restaurant,Burger Joint,Garden,Sandwich Place,Indian Restaurant,French Restaurant,Seafood Restaurant
3,Dowgate,51.510225,-0.088338,0,Coffee Shop,Pub,Italian Restaurant,Restaurant,Gym / Fitness Center,Historic Site,Hotel,Steakhouse,Asian Restaurant,Burger Joint,Café,Trail,Seafood Restaurant,Cocktail Bar,Butcher
4,Castle Baynard,51.517092,-0.107537,2,Coffee Shop,Italian Restaurant,Pub,Sandwich Place,French Restaurant,Gym / Fitness Center,Wine Bar,Hotel,Burrito Place,Beer Bar,Fast Food Restaurant,Salad Place,Japanese Restaurant,Bar,Falafel Restaurant


In [139]:
# create map
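# centre the map on the mean of the ward coordinates
map_clusters = folium.Map(location=[London_merged['Latitude'].mean(), London_merged['Longitude'].mean()], zoom_start=12)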

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(London_merged['Latitude'], London_merged['Longitude'], London_merged['Ward'], London_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

## Exploring Clusters

Cluster 1

In [140]:
first=London_merged.loc[London_merged['Cluster Labels'] == 0, London_merged.columns[[0] + list(range(4, London_merged.shape[1]))]]
first

Unnamed: 0,Ward,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
0,Queenhithe,Coffee Shop,Italian Restaurant,Wine Bar,Gym / Fitness Center,Restaurant,Pub,Seafood Restaurant,Asian Restaurant,Modern European Restaurant,Burger Joint,French Restaurant,Vietnamese Restaurant,Pedestrian Plaza,Scenic Lookout,Falafel Restaurant
1,Vintry,Coffee Shop,Italian Restaurant,Restaurant,Burger Joint,Gym / Fitness Center,Vietnamese Restaurant,Asian Restaurant,Modern European Restaurant,Sushi Restaurant,Seafood Restaurant,Cocktail Bar,Lounge,Salad Place,Steakhouse,Bar
2,Candlewick,Coffee Shop,Restaurant,Asian Restaurant,Pub,Gym / Fitness Center,Italian Restaurant,Hotel,Historic Site,Fast Food Restaurant,Burger Joint,Garden,Sandwich Place,Indian Restaurant,French Restaurant,Seafood Restaurant
3,Dowgate,Coffee Shop,Pub,Italian Restaurant,Restaurant,Gym / Fitness Center,Historic Site,Hotel,Steakhouse,Asian Restaurant,Burger Joint,Café,Trail,Seafood Restaurant,Cocktail Bar,Butcher
5,Farringdon Within,Café,Hotel,Pub,Wine Bar,Coffee Shop,Italian Restaurant,Plaza,Park,Beer Bar,Indie Movie Theater,Modern European Restaurant,Art Gallery,Garden,French Restaurant,Gym / Fitness Center
14,Bread Street,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Plaza,Vietnamese Restaurant,Scenic Lookout,Burger Joint,Clothing Store,Modern European Restaurant,Steakhouse,Indian Restaurant,Sandwich Place,Pizza Place,Seafood Restaurant,Park
15,Cornhill,Coffee Shop,Restaurant,Gym / Fitness Center,Italian Restaurant,Cocktail Bar,Hotel,Steakhouse,English Restaurant,Pub,Salad Place,Asian Restaurant,Burger Joint,Fast Food Restaurant,Modern European Restaurant,Bookstore
16,Broad Street,Coffee Shop,Hotel,Restaurant,Gym / Fitness Center,Sushi Restaurant,French Restaurant,Italian Restaurant,Pub,Cocktail Bar,Burger Joint,Boxing Gym,Modern European Restaurant,Steakhouse,Indian Restaurant,Bookstore
17,Cripplegate,Coffee Shop,Gym / Fitness Center,Hotel,Sandwich Place,Italian Restaurant,Art Gallery,Pub,Sushi Restaurant,Food Truck,Deli / Bodega,Indie Movie Theater,Bar,German Restaurant,Juice Bar,History Museum
18,Walbrook,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Hotel,Asian Restaurant,Seafood Restaurant,Cocktail Bar,Café,Steakhouse,Restaurant,Pub,Wine Bar,French Restaurant,Roof Deck,Clothing Store


Cluster 2

In [141]:
London_merged.loc[London_merged['Cluster Labels'] == 1, London_merged.columns[[0] + list(range(4, London_merged.shape[1]))]]

Unnamed: 0,Ward,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
6,Bishopsgate,Coffee Shop,Cocktail Bar,Gym / Fitness Center,Indian Restaurant,Pizza Place,Mediterranean Restaurant,Salad Place,Pub,Hotel,Restaurant,Italian Restaurant,Japanese Restaurant,Boxing Gym,Sandwich Place,Middle Eastern Restaurant
7,Portsoken,Hotel,Coffee Shop,Cocktail Bar,Gym / Fitness Center,Café,Pub,Indian Restaurant,Pizza Place,Salad Place,Turkish Restaurant,Vietnamese Restaurant,Middle Eastern Restaurant,Asian Restaurant,English Restaurant,Hotel Bar
8,Lime Street,Coffee Shop,Cocktail Bar,Hotel,Mediterranean Restaurant,Pizza Place,Salad Place,Restaurant,Gym / Fitness Center,Italian Restaurant,Sushi Restaurant,Breakfast Spot,Boxing Gym,Lounge,Burger Joint,Japanese Restaurant
9,Langbourn,Hotel,Gym / Fitness Center,Coffee Shop,French Restaurant,Pub,Salad Place,Cocktail Bar,Restaurant,English Restaurant,Italian Restaurant,Garden,Historic Site,Falafel Restaurant,Wine Bar,Café
10,Aldgate,Hotel,Gym / Fitness Center,Cocktail Bar,Restaurant,Coffee Shop,French Restaurant,Garden,English Restaurant,Pub,Salad Place,Café,Scenic Lookout,Hotel Bar,Indian Restaurant,Italian Restaurant
11,Billingsgate,Hotel,Restaurant,Coffee Shop,Gym / Fitness Center,Pub,French Restaurant,Café,Cocktail Bar,Salad Place,Asian Restaurant,Garden,Turkish Restaurant,Fast Food Restaurant,Burger Joint,Beer Bar
12,Bridge,Hotel,Restaurant,Coffee Shop,Pub,Gym / Fitness Center,Cocktail Bar,Historic Site,Café,Salad Place,Asian Restaurant,Garden,French Restaurant,Italian Restaurant,English Restaurant,Burger Joint
13,Tower,Hotel,Coffee Shop,Indian Restaurant,Gym / Fitness Center,Cocktail Bar,Pub,Café,Modern European Restaurant,Salad Place,Castle,Hotel Bar,French Restaurant,Garden,Pizza Place,Fast Food Restaurant


Cluster 3

In [142]:
third=London_merged.loc[London_merged['Cluster Labels'] == 2, London_merged.columns[[0] + list(range(4, London_merged.shape[1]))]]
third

Unnamed: 0,Ward,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
4,Castle Baynard,Coffee Shop,Italian Restaurant,Pub,Sandwich Place,French Restaurant,Gym / Fitness Center,Wine Bar,Hotel,Burrito Place,Beer Bar,Fast Food Restaurant,Salad Place,Japanese Restaurant,Bar,Falafel Restaurant
24,Farringdon Without,Coffee Shop,Pub,Sandwich Place,French Restaurant,Gym / Fitness Center,Italian Restaurant,Hotel,Korean Restaurant,Bar,Fast Food Restaurant,Japanese Restaurant,Restaurant,Café,Bookstore,History Museum


In [143]:
first['Ward'].unique()

array(['Queenhithe', 'Vintry', 'Candlewick', 'Dowgate',
       'Farringdon Within', 'Bread Street', 'Cornhill', 'Broad Street',
       'Cripplegate', 'Walbrook', 'Cordwainer', 'Coleman Street',
       'Aldersgate', 'Bassishaw', 'Cheap'], dtype=object)

From exploring the clusters, I would say that the first cluster (label 0) is the best one, since it shows a much higher concentration of Italian restaurants and, in a few cases, pizza places among the top 15 venues.

To choose the best wards I based my decision on the concentration of competition. To be precise, the competition I looked at here is that from Italian restaurants, because of the nature of the pizza place I want to open.

The wards where, in my opinion, it is convenient to open a pizza place are:
- Vintry 
- Bread Street
- Queenhithe 
- Walbrook
- Cordwainer
- Coleman Street
- Aldersgate
- Bassishaw
- Cheap
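
The notebook does not state the exact rule behind this shortlist. As one possible rule, sketched under assumptions, a ward could be kept when "Italian Restaurant" ranks within a chosen cut-off in its top-venue list. The helper `rank_of`, the cut-off `max_rank`, and the use of only the first six venues per ward are illustrative choices; the venue lists are copied from the cluster 0 table above.

```python
def rank_of(category, top_venues):
    """1-based position of `category` in a ward's top-venue list, or None."""
    return top_venues.index(category) + 1 if category in top_venues else None


# First six entries of the top-venue lists for a few cluster-0 wards
top_venues_by_ward = {
    "Queenhithe": ["Coffee Shop", "Italian Restaurant", "Wine Bar",
                   "Gym / Fitness Center", "Restaurant", "Pub"],
    "Candlewick": ["Coffee Shop", "Restaurant", "Asian Restaurant", "Pub",
                   "Gym / Fitness Center", "Italian Restaurant"],
    "Dowgate": ["Coffee Shop", "Pub", "Italian Restaurant", "Restaurant",
                "Gym / Fitness Center", "Historic Site"],
    "Walbrook": ["Coffee Shop", "Italian Restaurant", "Gym / Fitness Center",
                 "Hotel", "Asian Restaurant", "Seafood Restaurant"],
}

max_rank = 2  # hypothetical cut-off, not taken from the notebook

# Rank of "Italian Restaurant" in each ward, then keep wards within the cut-off
ranks = {ward: rank_of("Italian Restaurant", venues)
         for ward, venues in top_venues_by_ward.items()}
shortlist = sorted(ward for ward, r in ranks.items()
                   if r is not None and r <= max_rank)
print(ranks, shortlist)
```

Making the rule explicit like this lets the shortlist be regenerated if the venue data or the cut-off changes.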