# Capstone Project - The Battle of the Neighborhoods (Week 2)
### Applied Data Science Capstone by IBM/Coursera

## <span style="color:darkred">Table of contents</span>
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussions](#results)
* [Conclusion](#conclusion)

## <span style="color:darkred"> 1. Introduction: Business Problem <a name="introduction"></a></span>

In this project, we investigate how the poverty rate of a London borough influences the facilities, venues and amenities available in it, and how these insights can support targeted local and business development across London boroughs.

<br><br>
<center><img src="poverty_rate.jpg"
     alt="London Poverty Rates"
     width = "700"
     style="float: centre; margin-left: 10px;" /></center><br><br>


This will be done through a comparative study of the facilities and amenities in the two London boroughs with the highest and the lowest poverty rates, **Tower Hamlets (TH)** and **Bromley (Brom)**, as shown in the figure above.

We will use the data science principles and techniques learned in the course to build a model of these boroughs, looking at the venues and facilities available in these areas, and to provide insights to stakeholders such as local authorities and chambers of commerce.

### Questions to answer:

1. What types of facilities (venues) and amenities are available in wards (neighbourhoods) with different poverty rates?
2. How do the venues change with spending power?
3. What are the distinctive venues that represent these boroughs?
4. Suggestions and recommendations.

By answering the above questions, the findings can be used to target development in the rest of the London boroughs, so that unnecessary development can be avoided and the overall budget can be sustained.

======================================================================================================================== 

## <span style="color:darkred">2. Data <a name="data"></a></span>

Based on the problem definition, the factors that will influence the decision in this project are:

* the number and type of venues and facilities available in the surrounding area of these boroughs
* the most frequent venues and facilities in each borough

To define the surrounding area of each borough, we will be using:
* London boroughs poverty information from: **https://www.trustforlondon.org.uk/data/poverty-borough**. *Accessed: 11/03/2019*
* latitudes and longitudes of Tower Hamlets and Bromley obtained from: **https://www.distancesto.com/coordinates/gb/**. *Accessed: 11/03/2019*
* venues, their types and locations in every borough, obtained using the **Foursquare API**

Further information of the boroughs can be found at:
* Bromley: **https://en.wikipedia.org/wiki/London_Borough_of_Bromley** 
* Tower Hamlets: **https://en.wikipedia.org/wiki/London_Borough_of_Tower_Hamlets**

## Boroughs Locations 
Based on the information obtained from https://www.distancesto.com/coordinates/gb/, the latitude and longitude of TH and Brom are as follows:

In [1]:
TH_coordinates = (51.520261, -0.02934)
BROM_coordinates = (51.367971, 0.070062)
london_coordinates = (51.509865, -0.118092)

Let's visualise the locations of these boroughs on London map:

In [2]:
import folium
from folium.features import DivIcon

london_map = folium.Map(location = london_coordinates, zoom_start = 10)

folium.Circle(
    radius=2500,   # the radius is calculated based on the area coverage of the borough 
    location= TH_coordinates,
    color= 'crimson',
    fill=False,
).add_to(london_map)

folium.Marker(
    TH_coordinates,
    popup='Tower Hamlets',
    # 'red' is one of the colours folium.Icon supports; 'info-sign' is a glyphicon icon
    icon=folium.Icon(color='red',
    icon_color='white', icon='info-sign', angle=0, prefix='glyphicon')
).add_to(london_map)

folium.Circle(
    radius=6900, # the radius is calculated based on the area coverage of the borough 
    location= BROM_coordinates, 
    popup = 'Bromley',
    color='darkblue',
    fill=False,
).add_to(london_map)

folium.Marker(
    BROM_coordinates,
    popup='Bromley',
    # 'info-sign' is a glyphicon icon, so the prefix must be 'glyphicon'
    icon=folium.Icon(color='darkblue',
    icon_color='white', icon='info-sign', angle=0, prefix='glyphicon')
).add_to(london_map)



london_map

Preliminary observations from the above map show that Tower Hamlets is located very near to the centre of London, whereas Bromley is located at the boundary of the M25 ring road, about a 2 hours' drive from the centre. In terms of area, Tower Hamlets covers about 19.77 km² and Bromley about 150.2 km². We can use this information as a distance reference when retrieving venues, their types and locations using the Foursquare API.
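The circle radii used on the map (2500 m and 6900 m) are described as being calculated from the area of each borough. A minimal sketch of one way to reproduce them, assuming each borough is approximated by a circle of equal area (the helper name `radius_from_area_km2` is hypothetical):

```python
import math

def radius_from_area_km2(area_km2):
    """Radius in metres of a circle whose area equals area_km2 (assumed approximation)."""
    return math.sqrt(area_km2 / math.pi) * 1000

# Areas quoted in the text above
th_radius = radius_from_area_km2(19.77)
brom_radius = radius_from_area_km2(150.2)

print('Tower Hamlets radius: {:.0f} m'.format(th_radius))
print('Bromley radius: {:.0f} m'.format(brom_radius))
```

Under this assumption the results come out close to 2.5 km and 6.9 km, which agrees with the rounded radii passed to `folium.Circle`.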

Now that we have the information about the boroughs, let's load the ward (neighbourhood) information for each borough.

In [3]:
import pandas as pd

TH_data = pd.read_csv('TH_neighbourhoods.csv')
TH_data.head()

Unnamed: 0,neighbourhood,latitude,longitude
0,Bethnal Green,51.526962,-0.06674
1,Blackwall and Cubitt Town,51.495182,-0.009826
2,Bow East,51.528309,-0.019482
3,Bow West,51.528309,-0.019482
4,Canary Wharf,51.505219,-0.0189


In [4]:
print('Borough of Tower Hamlets has {} neighbourhoods.'.format(
        len(TH_data['neighbourhood'].unique())
    )
)

Borough of Tower Hamlets has 15 neighbourhoods.


In [5]:
BROM_data = pd.read_csv('BROM_neighbourhoods.csv')
BROM_data.head()

Unnamed: 0,neighbourhood,latitude,longitude
0,Bickley,51.40174,0.043712
1,Biggin Hill,51.331959,0.029057
2,Bromley Common & Keston,51.375875,0.043819
3,Bromley Town,51.402805,0.014814
4,Chelsfield & Pratts Bottom,51.357943,0.127288


In [6]:
print('Borough of Bromley has {} neighbourhoods.'.format(
        len(BROM_data['neighbourhood'].unique())
    )
)

Borough of Bromley has 20 neighbourhoods.


Now, let's confirm the locations of all the wards within each borough on the map, based on the area coverage we defined earlier.

In [7]:
london_map = folium.Map(location = london_coordinates, zoom_start = 10)

folium.Circle(
    radius=2500,   # the radius is calculated based on the area coverage of the borough 
    location= TH_coordinates,
    color= 'crimson',
    fill=False,
).add_to(london_map)

# add markers to map
for lat, lng, label in zip(TH_data['latitude'], TH_data['longitude'], TH_data['neighbourhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        ).add_to(london_map)

folium.Circle(
    radius=6900, # the radius is calculated based on the area coverage of the borough 
    location= BROM_coordinates, 
    popup = 'Bromley',
    color='darkblue',
    fill=False,
).add_to(london_map)

# add markers to map
for lat, lng, label in zip(BROM_data['latitude'], BROM_data['longitude'], BROM_data['neighbourhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        ).add_to(london_map)


london_map
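The visual check on the map can be complemented with a numeric one. The following is a sketch only: it computes the great-circle (haversine) distance from the borough centre to a ward centre and compares it with the circle radius. The helper `haversine_m` and the two-row sample are illustrative; in practice the full `TH_data` dataframe would be used.

```python
import math
import pandas as pd

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    r = 6371000  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Borough centre and radius as used for the map above
TH_coordinates = (51.520261, -0.02934)
TH_radius_m = 2500

# Two wards copied from the TH_data.head() output (sample only)
wards = pd.DataFrame({
    'neighbourhood': ['Bethnal Green', 'Canary Wharf'],
    'latitude': [51.526962, 51.505219],
    'longitude': [-0.06674, -0.0189],
})

wards['distance_m'] = [
    haversine_m(TH_coordinates[0], TH_coordinates[1], lat, lng)
    for lat, lng in zip(wards['latitude'], wards['longitude'])
]
wards['inside_circle'] = wards['distance_m'] <= TH_radius_m
print(wards[['neighbourhood', 'distance_m', 'inside_circle']])
```

Because a circle only approximates the borough boundary, a ward centre near the edge of the borough can fall slightly outside it, which is worth keeping in mind when reading the map.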

## Foursquare API

Now that we have our location candidates, let's use the Foursquare API to get information on the venues in each of the wards within each borough.

As an exploratory project, we will retrieve the venues based on the areas of each borough. We will then do the necessary manipulations and analysis to achieve our objectives.

### Define Foursquare Credentials and Version

In [8]:
import os

# Read the credentials from environment variables instead of hard-coding them
# (the variable names FOURSQUARE_CLIENT_ID / FOURSQUARE_CLIENT_SECRET are illustrative)
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

# Do not print the secret itself; only confirm that the credentials were found
print('Foursquare credentials loaded.')

Foursquare credentials loaded.


In [9]:
# importing the necessary libraries for the tasks

from pandas import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0)
import pandas as pd
import numpy as np
import json # library to handle JSON files
import requests # library to handle requests

# import k-means from clustering stage
from sklearn.cluster import KMeans

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
import matplotlib.pyplot as plt
%matplotlib inline 

In [10]:
# function to repeat the same process of retrieving venues for all the neighbourhoods

def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT=100):

    url = 'https://api.foursquare.com/v2/venues/explore'
    columns = ['Neighbourhood',
               'Neighbourhood Latitude',
               'Neighbourhood Longitude',
               'Venue',
               'Venue Latitude',
               'Venue Longitude',
               'Venue Category']

    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        # build the query parameters for the API request
        params = {
            'client_id': CLIENT_ID,
            'client_secret': CLIENT_SECRET,
            'v': VERSION,
            'll': '{},{}'.format(lat, lng),
            'radius': radius,
            'limit': LIMIT,
        }

        # make the GET request and stop early on an HTTP error
        response = requests.get(url, params=params, timeout=30)
        response.raise_for_status()
        results = response.json()['response']['groups'][0]['items']

        # keep only the relevant information for each nearby venue,
        # skipping any venue that comes back without a category
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results if v['venue']['categories']])

    # flatten the per-neighbourhood lists into one dataframe
    nearby_venues = pd.DataFrame(
        [item for venue_list in venues_list for item in venue_list],
        columns=columns)

    return nearby_venues
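To make the nested lookups in `getNearbyVenues` easier to follow, here is a small offline sketch. The dictionary below is hand-written to mimic only the fields the function reads; it is not a real API response, and the two venues are taken from the Tower Hamlets output shown further down.

```python
import pandas as pd

# Hand-written sample with the same nesting the function walks through
# (illustrative only, not an actual Foursquare response)
sample_response = {
    'response': {
        'groups': [{
            'items': [
                {'venue': {'name': "The King's Arms",
                           'location': {'lat': 51.525754, 'lng': -0.065868},
                           'categories': [{'name': 'Pub'}]}},
                {'venue': {'name': "Sam's Cafe",
                           'location': {'lat': 51.526424, 'lng': -0.065056},
                           'categories': [{'name': 'Café'}]}},
            ]
        }]
    }
}

# Same path as in the function: response -> groups[0] -> items
items = sample_response['response']['groups'][0]['items']

rows = [(v['venue']['name'],
         v['venue']['location']['lat'],
         v['venue']['location']['lng'],
         v['venue']['categories'][0]['name']) for v in items]

sample_venues = pd.DataFrame(
    rows, columns=['Venue', 'Venue Latitude', 'Venue Longitude', 'Venue Category'])
print(sample_venues)
```

Each item contributes one row, and only the first entry of `categories` is kept as the venue category.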

### Tower Hamlets: Let's get the venues within a radius of 500 metres of each neighbourhood

In [11]:
TH_venues = getNearbyVenues(names=TH_data['neighbourhood'],
                                   latitudes = TH_data['latitude'],
                                   longitudes = TH_data['longitude']
                                  )



Bethnal Green
Blackwall and Cubitt Town
Bow East
Bow West
Canary Wharf
Island Gardens 
Lansbury
Limehouse 
Poplar
Shadwell
Spitalfields and Banglatown
St Katharine's and Wapping 
Stepney Green
Weavers
Whitechapel


In [12]:
print(TH_venues.shape)

TH_venues.to_csv('TH_venues.csv', index = False)
TH_venues.head()

(520, 7)


Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Bethnal Green,51.526962,-0.06674,The King's Arms,51.525754,-0.065868,Pub
1,Bethnal Green,51.526962,-0.06674,Sam's Cafe,51.526424,-0.065056,Café
2,Bethnal Green,51.526962,-0.06674,Woolidando,51.526377,-0.066518,Café
3,Bethnal Green,51.526962,-0.06674,Jonestown,51.526092,-0.067936,Coffee Shop
4,Bethnal Green,51.526962,-0.06674,E Pellicci,51.526516,-0.063426,Café


In [13]:
TH_venues.groupby('Neighbourhood').count()

Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Bethnal Green,63,63,63,63,63,63
Blackwall and Cubitt Town,32,32,32,32,32,32
Bow East,13,13,13,13,13,13
Bow West,13,13,13,13,13,13
Canary Wharf,100,100,100,100,100,100
Island Gardens,30,30,30,30,30,30
Lansbury,16,16,16,16,16,16
Limehouse,28,28,28,28,28,28
Poplar,13,13,13,13,13,13
Shadwell,22,22,22,22,22,22


#### Let's find out how many unique categories we obtained from Tower Hamlets

In [14]:
print('There are {} unique venue categories in Tower Hamlets.'.format(len(TH_venues['Venue Category'].unique())))

There are 156 unique venue categories in Tower Hamlets.


### Bromley: Let's get the venues within a radius of 500 metres of each neighbourhood

In [15]:
BROM_venues = getNearbyVenues(names=BROM_data['neighbourhood'],
                                   latitudes = BROM_data['latitude'],
                                   longitudes = BROM_data['longitude']
                                  )



Bickley
Biggin Hill
Bromley Common & Keston
Bromley Town
Chelsfield & Pratts Bottom
Chislehurst
Copers Cope
Cray Valley East
Crystal Palace
Darwin
Farnborough & Croftton
Hayes & Coney Hall
Kelsey & Eden Park
Mottingham & Chislehurst North
Orpington
Penge & Cator
Petts Wood & Knoll
Plaistow & Sunbridge 
Shortlangs 
West Wickham 


In [16]:
print(BROM_venues.shape)

BROM_venues.to_csv('BROM_venues.csv', index = False)
BROM_venues.head()

(193, 7)


Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Bickley,51.40174,0.043712,Bickley Railway Station (BKL),51.400032,0.045351,Train Station
1,Bickley,51.40174,0.043712,Bickley Park Cricket Club,51.401508,0.046263,Cricket Ground
2,Bickley,51.40174,0.043712,J Henry Flooring,51.401227,0.040652,Home Service
3,Bickley,51.40174,0.043712,Village Sandwich Bar,51.399301,0.047951,Café
4,Biggin Hill,51.331959,0.029057,Biggin Hill Airport (BQH) (Biggin Hill Airport),51.331794,0.028845,Airport


In [17]:
BROM_venues.groupby('Neighbourhood').count()

Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Bickley,4,4,4,4,4,4
Biggin Hill,4,4,4,4,4,4
Bromley Common & Keston,5,5,5,5,5,5
Bromley Town,45,45,45,45,45,45
Chelsfield & Pratts Bottom,3,3,3,3,3,3
Chislehurst,6,6,6,6,6,6
Copers Cope,4,4,4,4,4,4
Cray Valley East,28,28,28,28,28,28
Crystal Palace,22,22,22,22,22,22
Darwin,3,3,3,3,3,3


In [18]:
print('There are {} unique venue categories in Bromley.'.format(len(BROM_venues['Venue Category'].unique())))

There are 76 unique venue categories in Bromley.


####  In summary:
* Total venues in Tower Hamlets neighbourhoods returned by Foursquare: **520**
* Total venues in Bromley neighbourhoods returned by Foursquare: **193**
* Total unique venue categories in Tower Hamlets = **156**
* Total unique venue categories in Bromley = **76**
* Tower Hamlets venues dataframe is called: **TH_venues**
* Bromley venues dataframe is called: **BROM_venues**
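Since the two boroughs have a different number of neighbourhoods (15 and 20), the totals are easier to compare per neighbourhood. A short sketch of that arithmetic, using only the figures reported above:

```python
# Figures reported in the summary and in the neighbourhood counts above
summary = {
    'Tower Hamlets': {'venues': 520, 'neighbourhoods': 15},
    'Bromley': {'venues': 193, 'neighbourhoods': 20},
}

# Average number of venues returned per neighbourhood
venues_per_hood = {
    borough: s['venues'] / s['neighbourhoods'] for borough, s in summary.items()
}

for borough, value in venues_per_hood.items():
    print('{}: {:.2f} venues per neighbourhood'.format(borough, value))
```

This is only an average over the 500 m search circles, so it describes what Foursquare returned around each ward centre rather than the borough as a whole.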

## <span style="color:darkred">3. Methodology <a name="methodology"></a></span>

In this project, we concentrate only on the two boroughs with the largest gap in poverty rate.

In the first step, we look at the top 5 most common venues for each borough, which provides an overview of the types of venues that are popular there.

In the second step, we cluster the neighbourhoods of each borough to investigate how the clusters form.

In the third and final step, we drill into each cluster to obtain reviews of some of the venues, as an indication of the quality of service at venues in each borough.

With this analysis, we will be able to draw some conclusions on how the poverty rate of a borough affects the venues in the area, using Foursquare geospatial data.
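As a rough illustration of the clustering step, the sketch below runs K-Means on a toy frequency table shaped like the grouped dataframes built in the analysis: one row per neighbourhood and one column per venue category. The neighbourhood names, the frequencies and the choice of two clusters are all made up for the example.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Toy frequency table (values are invented for illustration)
toy_grouped = pd.DataFrame({
    'Neighbourhood': ['A', 'B', 'C', 'D'],
    'Coffee Shop': [0.50, 0.45, 0.00, 0.05],
    'Pub': [0.30, 0.35, 0.10, 0.05],
    'Park': [0.20, 0.20, 0.90, 0.90],
})

kclusters = 2  # assumed value for the toy data

# K-Means works on the numeric columns only, so drop the name column
features = toy_grouped.drop(columns='Neighbourhood')
kmeans = KMeans(n_clusters=kclusters, n_init=10, random_state=0).fit(features)

# Attach the cluster label of each neighbourhood back to the table
toy_grouped['Cluster Label'] = kmeans.labels_
print(toy_grouped[['Neighbourhood', 'Cluster Label']])
```

Neighbourhoods with a similar mix of venue categories end up with the same label; the label numbers themselves carry no meaning.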

======================================================================================================================== 

## <span style="color:darkred">4. Analysis <a name="analysis"></a></span>

## Analyse Each Borough:

### Tower Hamlets:

In [19]:
# one hot encoding
TH_onehot = pd.get_dummies(TH_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
TH_onehot['Neighbourhood'] = TH_venues['Neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [TH_onehot.columns[-1]] + list(TH_onehot.columns[:-1])
TH_onehot = TH_onehot[fixed_columns]

TH_onehot.head()

Unnamed: 0,Neighbourhood,Art Gallery,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,Bar,Beach,Beer Bar,...,Trail,Train Station,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Wings Joint,Women's Store,Yoga Studio
0,Bethnal Green,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Bethnal Green,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Bethnal Green,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Bethnal Green,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Bethnal Green,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


### Bromley:

In [20]:
# one hot encoding
BROM_onehot = pd.get_dummies(BROM_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
BROM_onehot['Neighbourhood'] = BROM_venues['Neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [BROM_onehot.columns[-1]] + list(BROM_onehot.columns[:-1])
BROM_onehot = BROM_onehot[fixed_columns]

BROM_onehot.head()

Unnamed: 0,Neighbourhood,Airport,Airport Service,American Restaurant,Asian Restaurant,Athletics & Sports,Bakery,Bar,Bike Shop,Bookstore,...,Stationery Store,Supermarket,Sushi Restaurant,Tennis Court,Theater,Track Stadium,Train Station,Turkish Restaurant,Wine Shop,Women's Store
0,Bickley,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
1,Bickley,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Bickley,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Bickley,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Biggin Hill,1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


#### Next, let's group the rows by neighbourhood and take the mean of the frequency of occurrence of each category

In [21]:
TH_grouped = TH_onehot.groupby('Neighbourhood').mean().reset_index()
TH_grouped

Unnamed: 0,Neighbourhood,Art Gallery,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,Bar,Beach,Beer Bar,...,Trail,Train Station,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Wings Joint,Women's Store,Yoga Studio
0,Bethnal Green,0.015873,0.0,0.0,0.0,0.031746,0.015873,0.0,0.0,0.015873,...,0.0,0.0,0.0,0.015873,0.0,0.0,0.031746,0.0,0.0,0.0
1,Blackwall and Cubitt Town,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Bow East,0.076923,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Bow West,0.076923,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Canary Wharf,0.0,0.01,0.0,0.01,0.0,0.03,0.01,0.0,0.01,...,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0
5,Island Gardens,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,...,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Lansbury,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Limehouse,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.035714,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0
8,Poplar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Shadwell,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,...,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0


In [22]:
BROM_grouped = BROM_onehot.groupby('Neighbourhood').mean().reset_index()
BROM_grouped

Unnamed: 0,Neighbourhood,Airport,Airport Service,American Restaurant,Asian Restaurant,Athletics & Sports,Bakery,Bar,Bike Shop,Bookstore,...,Stationery Store,Supermarket,Sushi Restaurant,Tennis Court,Theater,Track Stadium,Train Station,Turkish Restaurant,Wine Shop,Women's Store
0,Bickley,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0
1,Biggin Hill,0.25,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Bromley Common & Keston,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Bromley Town,0.0,0.0,0.0,0.022222,0.0,0.022222,0.022222,0.0,0.022222,...,0.022222,0.0,0.022222,0.0,0.0,0.0,0.0,0.022222,0.0,0.022222
4,Chelsfield & Pratts Bottom,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Chislehurst,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Copers Cope,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Cray Valley East,0.0,0.0,0.0,0.035714,0.0,0.035714,0.0,0.0,0.035714,...,0.035714,0.035714,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Crystal Palace,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.045455,0.0,...,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.0,0.045455,0.0
9,Darwin,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
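The one-hot encoding followed by a grouped mean can be hard to picture from the wide tables above. The sketch below repeats the same two steps on a tiny dataframe: the four Bickley rows come from the Bromley venues output, while "Example Ward" and its venues are hypothetical.

```python
import pandas as pd

toy_venues = pd.DataFrame({
    'Neighbourhood': ['Bickley', 'Bickley', 'Bickley', 'Bickley',
                      'Example Ward', 'Example Ward'],
    'Venue Category': ['Train Station', 'Cricket Ground', 'Home Service', 'Café',
                       'Pub', 'Café'],
})

# One column per category, with a 1 (or True) in the row of each venue
toy_onehot = pd.get_dummies(toy_venues[['Venue Category']], prefix="", prefix_sep="")
toy_onehot['Neighbourhood'] = toy_venues['Neighbourhood']

# The mean of a 0/1 column is the share of venues in that category
toy_grouped = toy_onehot.groupby('Neighbourhood').mean().reset_index()
print(toy_grouped)
```

Each row of the grouped table sums to 1, so the values can be read as the fraction of a neighbourhood's venues that fall in each category. With four venues in four different categories, every Bickley category gets 0.25.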


#### Let's print each Tower Hamlets neighbourhood along with the top 5 most common venues

In [23]:
num_top_venues = 5

for hood in TH_grouped['Neighbourhood']:
    print("----"+hood+"----")
    temp = TH_grouped[TH_grouped['Neighbourhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bethnal Green----
         venue  freq
0  Coffee Shop  0.13
1          Pub  0.11
2         Café  0.06
3         Park  0.03
4   Bagel Shop  0.03


----Blackwall and Cubitt Town----
               venue  freq
0  Indian Restaurant  0.09
1               Park  0.09
2               Café  0.06
3    Harbor / Marina  0.06
4      Grocery Store  0.06


----Bow East----
                      venue  freq
0                       Pub  0.15
1  Bike Rental / Bike Share  0.08
2      Fast Food Restaurant  0.08
3         Convenience Store  0.08
4               Coffee Shop  0.08


----Bow West----
                      venue  freq
0                       Pub  0.15
1  Bike Rental / Bike Share  0.08
2      Fast Food Restaurant  0.08
3         Convenience Store  0.08
4               Coffee Shop  0.08


----Canary Wharf----
                venue  freq
0         Coffee Shop  0.09
1        Burger Joint  0.05
2      Sandwich Place  0.05
3               Plaza  0.04
4  Italian Restaurant  0.03


----Island Gard

#### Let's print each Bromley neighborhood along with the top 5 most common venues

In [24]:
for hood in BROM_grouped['Neighbourhood']:
    print("----"+hood+"----")
    temp = BROM_grouped[BROM_grouped['Neighbourhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bickley----
            venue  freq
0    Home Service  0.25
1   Train Station  0.25
2            Café  0.25
3  Cricket Ground  0.25
4         Airport  0.00


----Biggin Hill----
                venue  freq
0             Airport  0.25
1                 Pub  0.25
2      Massage Studio  0.25
3     Airport Service  0.25
4  Mexican Restaurant  0.00


----Bromley Common & Keston----
                  venue  freq
0           Bus Station   0.2
1                   Pub   0.2
2     Indian Restaurant   0.2
3                   Bar   0.2
4  Fast Food Restaurant   0.2


----Bromley Town----
                  venue  freq
0           Coffee Shop  0.13
1        Clothing Store  0.11
2  Gym / Fitness Center  0.07
3                   Pub  0.07
4           Pizza Place  0.04


----Chelsfield & Pratts Bottom----
                     venue  freq
0  Health & Beauty Service  0.33
1                      Pub  0.33
2            Fishing Store  0.33
3             Home Service  0.00
4           Ice Cream Shop  0.0

Function to sort the venues in descending order.

In [25]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
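A quick usage sketch of this function on a single hand-built row. The function is repeated so the example runs on its own; the row mimics one line of a grouped dataframe, with frequencies taken from the Bromley Town output above and a reduced set of categories.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]  # skip the leading 'Neighbourhood' entry
    row_categories_sorted = row_categories.sort_values(ascending=False)

    return row_categories_sorted.index.values[0:num_top_venues]

# One row shaped like a line of the grouped dataframe (subset of categories)
row = pd.Series({
    'Neighbourhood': 'Bromley Town',
    'Airport': 0.0,
    'Clothing Store': 0.11,
    'Coffee Shop': 0.13,
    'Pub': 0.07,
})

top3 = return_most_common_venues(row, 3)
print(list(top3))
```

The function returns category names, not frequencies. Categories with equal frequencies may come back in any order, so the ranking among ties (for example, several categories at 0.25 or at 0.00 in a small neighbourhood) should not be over-interpreted.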

========================================================================================================================

### Top 10 venues for each neighborhood.

### Tower Hamlets:

In [26]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
TH_neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
TH_neighborhoods_venues_sorted['Neighbourhood'] = TH_grouped['Neighbourhood']

for ind in np.arange(TH_grouped.shape[0]):
    TH_neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(TH_grouped.iloc[ind, :], num_top_venues)


### Bromley:

In [27]:
indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
BROM_neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
BROM_neighborhoods_venues_sorted['Neighbourhood'] = BROM_grouped['Neighbourhood']

for ind in np.arange(BROM_grouped.shape[0]):
    BROM_neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(BROM_grouped.iloc[ind, :], num_top_venues)
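The column names in the two cells above are built by catching the error raised once the `indicators` list runs out. An alternative sketch that avoids exception handling is a small helper returning the ordinal directly; `ordinal` is a hypothetical name introduced here, not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10

# Same column layout as in the cells above
columns = ['Neighbourhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```

For the ten columns used here both approaches give the same names; the helper simply stays correct if `num_top_venues` is raised beyond 20.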


### Top 10 venues for each neighborhood.

In [28]:
TH_neighborhoods_venues_sorted

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bethnal Green,Coffee Shop,Pub,Café,Park,Bagel Shop,Restaurant,Wine Bar,Print Shop,Music Store,Pizza Place
1,Blackwall and Cubitt Town,Indian Restaurant,Park,Harbor / Marina,Pizza Place,Grocery Store,Pub,Café,Soccer Field,Light Rail Station,Dim Sum Restaurant
2,Bow East,Pub,Bike Rental / Bike Share,Hotel,Gym,Metro Station,Office,Fast Food Restaurant,Convenience Store,Coffee Shop,Burger Joint
3,Bow West,Pub,Bike Rental / Bike Share,Hotel,Gym,Metro Station,Office,Fast Food Restaurant,Convenience Store,Coffee Shop,Burger Joint
4,Canary Wharf,Coffee Shop,Burger Joint,Sandwich Place,Plaza,Café,Sushi Restaurant,Italian Restaurant,Bakery,Food Truck,Shopping Mall
5,Island Gardens,Pub,History Museum,Boat or Ferry,Park,Art Gallery,Church,Burger Joint,Bus Stop,Café,Chinese Restaurant
6,Lansbury,Indian Restaurant,Grocery Store,Beer Garden,Café,Food & Drink Shop,Pakistani Restaurant,Supermarket,Park,Fast Food Restaurant,Canal
7,Limehouse,Pub,Chinese Restaurant,Indian Restaurant,Café,Italian Restaurant,Pizza Place,Park,Bus Stop,Gastropub,Light Rail Station
8,Poplar,Park,Fried Chicken Joint,Grocery Store,Chinese Restaurant,Steakhouse,Coffee Shop,Café,Light Rail Station,Tunnel,English Restaurant
9,Shadwell,Grocery Store,Hotel,Pizza Place,Dive Bar,Burger Joint,Fast Food Restaurant,Coffee Shop,Park,Market,Event Space


In [29]:
BROM_neighborhoods_venues_sorted

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bickley,Cricket Ground,Train Station,Café,Home Service,Women's Store,Fast Food Restaurant,Department Store,Diner,Donut Shop,Electronics Store
1,Biggin Hill,Airport,Airport Service,Pub,Massage Studio,Farm,Cricket Ground,Department Store,Diner,Donut Shop,Electronics Store
2,Bromley Common & Keston,Pub,Bar,Indian Restaurant,Bus Station,Fast Food Restaurant,Women's Store,Farm,Cricket Ground,Department Store,Diner
3,Bromley Town,Coffee Shop,Clothing Store,Gym / Fitness Center,Pub,Pizza Place,Burger Joint,Department Store,Irish Pub,Ice Cream Shop,Furniture / Home Store
4,Chelsfield & Pratts Bottom,Health & Beauty Service,Pub,Fishing Store,Women's Store,Farm,Cosmetics Shop,Cricket Ground,Department Store,Diner,Donut Shop
5,Chislehurst,Pub,Italian Restaurant,Gastropub,Pizza Place,Indian Restaurant,Women's Store,Cosmetics Shop,Cricket Ground,Department Store,Diner
6,Copers Cope,Athletics & Sports,Soccer Field,Indoor Play Area,Cricket Ground,Department Store,Diner,Donut Shop,Electronics Store,Farm,Fast Food Restaurant
7,Cray Valley East,Coffee Shop,Clothing Store,Gym / Fitness Center,Sandwich Place,Gastropub,Café,Ice Cream Shop,Electronics Store,Movie Theater,Burger Joint
8,Crystal Palace,Platform,Breakfast Spot,Park,Sculpture Garden,Wine Shop,Gym / Fitness Center,Farm,History Museum,Outdoor Sculpture,Café
9,Darwin,History Museum,Bar,Farm,Women's Store,Fast Food Restaurant,Cricket Ground,Department Store,Diner,Donut Shop,Electronics Store


## Cluster Neighbourhoods

In [30]:
# Combine the venues of both boroughs into one dataframe, ready for K-Means clustering
TH_BROM_venues = pd.concat([TH_venues, BROM_venues], ignore_index=True)
TH_BROM_venues

Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Bethnal Green,51.526962,-0.066740,The King's Arms,51.525754,-0.065868,Pub
1,Bethnal Green,51.526962,-0.066740,Sam's Cafe,51.526424,-0.065056,Café
2,Bethnal Green,51.526962,-0.066740,Woolidando,51.526377,-0.066518,Café
3,Bethnal Green,51.526962,-0.066740,Jonestown,51.526092,-0.067936,Coffee Shop
4,Bethnal Green,51.526962,-0.066740,E Pellicci,51.526516,-0.063426,Café
5,Bethnal Green,51.526962,-0.066740,Tas Firin,51.525458,-0.070273,Turkish Restaurant
6,Bethnal Green,51.526962,-0.066740,Cafe 338,51.526582,-0.063258,Café
7,Bethnal Green,51.526962,-0.066740,Brawn,51.528913,-0.070313,Restaurant
8,Bethnal Green,51.526962,-0.066740,Columbia Road Flower Market,51.529358,-0.069566,Market
9,Bethnal Green,51.526962,-0.066740,The Carpenters Arms,51.523927,-0.067441,Pub


In [31]:
# one hot encoding
TH_BROM_onehot = pd.get_dummies(TH_BROM_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
TH_BROM_onehot['Neighbourhood'] = TH_BROM_venues['Neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [TH_BROM_onehot.columns[-1]] + list(TH_BROM_onehot.columns[:-1])
TH_BROM_onehot = TH_BROM_onehot[fixed_columns]

TH_BROM_grouped = TH_BROM_onehot.groupby('Neighbourhood').mean().reset_index()
TH_BROM_grouped

Unnamed: 0,Neighbourhood,Airport,Airport Service,American Restaurant,Art Gallery,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,...,Train Station,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Bethnal Green,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.031746,0.015873,...,0.0,0.0,0.015873,0.0,0.0,0.031746,0.0,0.0,0.0,0.0
1,Bickley,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Biggin Hill,0.25,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Blackwall and Cubitt Town,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Bow East,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Bow West,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Bromley Common & Keston,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Bromley Town,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.022222,...,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.022222,0.0
8,Canary Wharf,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.03,...,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0
9,Chelsfield & Pratts Bottom,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
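One reason to encode the two boroughs together is that every neighbourhood then shares the same set of category columns, which a clustering algorithm needs in order to compare rows. The sketch below shows the difference on two tiny, partly made-up dataframes.

```python
import pandas as pd

# Small stand-ins for TH_venues and BROM_venues (rows chosen for illustration)
th = pd.DataFrame({'Neighbourhood': ['Bethnal Green', 'Bethnal Green'],
                   'Venue Category': ['Pub', 'Café']})
brom = pd.DataFrame({'Neighbourhood': ['Biggin Hill', 'Bickley'],
                     'Venue Category': ['Airport', 'Café']})

# Encoded separately, the boroughs end up with different columns
th_cols = set(pd.get_dummies(th['Venue Category']).columns)
brom_cols = set(pd.get_dummies(brom['Venue Category']).columns)
print('Same columns when encoded separately:', th_cols == brom_cols)

# Encoded together, every neighbourhood is described by the same columns
combined = pd.concat([th, brom], ignore_index=True)
combined_onehot = pd.get_dummies(combined[['Venue Category']], prefix="", prefix_sep="")
combined_onehot['Neighbourhood'] = combined['Neighbourhood']
combined_grouped = combined_onehot.groupby('Neighbourhood').mean().reset_index()
print(list(combined_grouped.columns))
```

A category that never occurs in a neighbourhood simply gets a frequency of 0 there, which is why the combined table above contains many zero entries.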


In [32]:
num_top_venues = 5

for hood in TH_BROM_grouped['Neighbourhood']:
    print("----"+hood+"----")
    temp = TH_BROM_grouped[TH_BROM_grouped['Neighbourhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bethnal Green----
         venue  freq
0  Coffee Shop  0.13
1          Pub  0.11
2         Café  0.06
3     Wine Bar  0.03
4   Bagel Shop  0.03


----Bickley----
            venue  freq
0            Café  0.25
1  Cricket Ground  0.25
2    Home Service  0.25
3   Train Station  0.25
4         Airport  0.00


----Biggin Hill----
             venue  freq
0          Airport  0.25
1              Pub  0.25
2   Massage Studio  0.25
3  Airport Service  0.25
4        Wine Shop  0.00


----Blackwall and Cubitt Town----
               venue  freq
0  Indian Restaurant  0.09
1               Park  0.09
2               Café  0.06
3                Pub  0.06
4        Pizza Place  0.06


----Bow East----
                      venue  freq
0                       Pub  0.15
1      Fast Food Restaurant  0.08
2             Metro Station  0.08
3                       Gym  0.08
4  Bike Rental / Bike Share  0.08


----Bow West----
                      venue  freq
0                       Pub  0.15
1      Fas

In [33]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
TH_BROM_neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
TH_BROM_neighborhoods_venues_sorted['Neighbourhood'] = TH_BROM_grouped['Neighbourhood']

for ind in np.arange(TH_BROM_grouped.shape[0]):
    TH_BROM_neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(TH_BROM_grouped.iloc[ind, :], num_top_venues)
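
The loop above calls `return_most_common_venues`, which is not defined in this section. The following is a minimal sketch of what such a helper might look like, assuming it receives one row of `TH_BROM_grouped` (neighbourhood name first, then venue frequencies) and returns the category names ordered by descending frequency. The body is an assumption, not necessarily the definition used in the notebook.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # Assumed layout: first entry is the neighbourhood name, the rest are frequencies
    row_categories = row.iloc[1:].astype(float)
    # Rank the categories from most to least frequent
    row_categories_sorted = row_categories.sort_values(ascending=False)
    # Return the names of the top categories
    return row_categories_sorted.index.values[0:num_top_venues]

# Illustrative row with made-up frequencies
example_row = pd.Series({'Neighbourhood': 'Example Ward',
                         'Pub': 0.5, 'Café': 0.3, 'Park': 0.2})
top_two = list(return_most_common_venues(example_row, 2))
print(top_two)
```

Categories with equal frequency (for example, the many zero entries in small neighbourhoods) are returned in whatever order the sort leaves them, so the lower ranks of a sparse neighbourhood carry little meaning.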

In [34]:
TH_BROM_neighborhoods_venues_sorted

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bethnal Green,Coffee Shop,Pub,Café,Wine Bar,Park,Bagel Shop,Restaurant,Shoe Store,Print Shop,Jewelry Store
1,Bickley,Home Service,Cricket Ground,Café,Train Station,Yoga Studio,Farm,Food Court,Food & Drink Shop,Flower Shop,Flea Market
2,Biggin Hill,Airport,Airport Service,Pub,Massage Studio,Farm,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
3,Blackwall and Cubitt Town,Indian Restaurant,Park,Harbor / Marina,Pizza Place,Grocery Store,Café,Pub,Restaurant,Light Rail Station,Bus Stop
4,Bow East,Pub,Hotel,Bar,Burger Joint,Metro Station,Office,Fast Food Restaurant,Bike Rental / Bike Share,Convenience Store,Gym
5,Bow West,Pub,Hotel,Bar,Burger Joint,Metro Station,Office,Fast Food Restaurant,Bike Rental / Bike Share,Convenience Store,Gym
6,Bromley Common & Keston,Bus Station,Indian Restaurant,Pub,Bar,Fast Food Restaurant,Farm,Food Stand,Food Court,Food & Drink Shop,Flower Shop
7,Bromley Town,Coffee Shop,Clothing Store,Gym / Fitness Center,Pub,Burger Joint,Pizza Place,Movie Theater,Burrito Place,Park,Café
8,Canary Wharf,Coffee Shop,Sandwich Place,Burger Joint,Plaza,Food Truck,Bakery,Italian Restaurant,Shopping Mall,Café,Sushi Restaurant
9,Chelsfield & Pratts Bottom,Pub,Health & Beauty Service,Fishing Store,Yoga Studio,Falafel Restaurant,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fish Market


In [35]:
# set number of clusters
kclusters = 5

TH_BROM_grouped_clustering = TH_BROM_grouped.drop(columns='Neighbourhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(TH_BROM_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 2, 4, 2, 4, 4, 4, 2, 2, 4], dtype=int32)
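
The next table carries a `Cluster Label` column, but the cell that attaches the labels is not shown here. One way this could be done is sketched below with `DataFrame.insert`, using the first three rows and labels from the outputs above. The `sorted_sample` and `labels_sample` names are illustrative stand-ins for `TH_BROM_neighborhoods_venues_sorted` and `kmeans.labels_`.

```python
import numpy as np
import pandas as pd

# First three rows of the sorted-venues table (other columns omitted for brevity)
sorted_sample = pd.DataFrame({
    'Neighbourhood': ['Bethnal Green', 'Bickley', 'Biggin Hill'],
    '1st Most Common Venue': ['Coffee Shop', 'Home Service', 'Airport'],
})
# The matching first three k-means labels from the array printed above
labels_sample = np.array([2, 2, 4], dtype=np.int32)

# Place the labels in the first column; rows must be in the same order as the labels
sorted_sample.insert(0, 'Cluster Label', labels_sample)
print(sorted_sample)
```

`insert` raises a `ValueError` if the column already exists, so a cell like this fails when re-run on the same dataframe unless the column is dropped first.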

In [42]:
TH_BROM_neighborhoods_venues_sorted

Unnamed: 0,Cluster Label,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,2,Bethnal Green,Coffee Shop,Pub,Café,Wine Bar,Park,Bagel Shop,Restaurant,Shoe Store,Print Shop,Jewelry Store
1,2,Bickley,Home Service,Cricket Ground,Café,Train Station,Yoga Studio,Farm,Food Court,Food & Drink Shop,Flower Shop,Flea Market
2,4,Biggin Hill,Airport,Airport Service,Pub,Massage Studio,Farm,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
3,2,Blackwall and Cubitt Town,Indian Restaurant,Park,Harbor / Marina,Pizza Place,Grocery Store,Café,Pub,Restaurant,Light Rail Station,Bus Stop
4,4,Bow East,Pub,Hotel,Bar,Burger Joint,Metro Station,Office,Fast Food Restaurant,Bike Rental / Bike Share,Convenience Store,Gym
5,4,Bow West,Pub,Hotel,Bar,Burger Joint,Metro Station,Office,Fast Food Restaurant,Bike Rental / Bike Share,Convenience Store,Gym
6,4,Bromley Common & Keston,Bus Station,Indian Restaurant,Pub,Bar,Fast Food Restaurant,Farm,Food Stand,Food Court,Food & Drink Shop,Flower Shop
7,2,Bromley Town,Coffee Shop,Clothing Store,Gym / Fitness Center,Pub,Burger Joint,Pizza Place,Movie Theater,Burrito Place,Park,Café
8,2,Canary Wharf,Coffee Shop,Sandwich Place,Burger Joint,Plaza,Food Truck,Bakery,Italian Restaurant,Shopping Mall,Café,Sushi Restaurant
9,4,Chelsfield & Pratts Bottom,Pub,Health & Beauty Service,Fishing Store,Yoga Studio,Falafel Restaurant,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fish Market


========================================================================================================================

## <span style="color:darkred">5. Results and Discussions <a name="results"></a></span>

In [36]:
print(TH_neighborhoods_venues_sorted['1st Most Common Venue'].value_counts().head(3))
print(TH_neighborhoods_venues_sorted['2nd Most Common Venue'].value_counts().head(3))
print(TH_neighborhoods_venues_sorted['3rd Most Common Venue'].value_counts().head(3))

Pub                  7
Coffee Shop          3
Indian Restaurant    2
Name: 1st Most Common Venue, dtype: int64
Pub                         3
Hotel                       2
Bike Rental / Bike Share    2
Name: 2nd Most Common Venue, dtype: int64
Hotel             2
Sandwich Place    2
Boat or Ferry     1
Name: 3rd Most Common Venue, dtype: int64


In [37]:
print(BROM_neighborhoods_venues_sorted['1st Most Common Venue'].value_counts().head(3))
print(BROM_neighborhoods_venues_sorted['2nd Most Common Venue'].value_counts().head(3))
print(BROM_neighborhoods_venues_sorted['3rd Most Common Venue'].value_counts().head(3))

Pub            4
Coffee Shop    2
Park           2
Name: 1st Most Common Venue, dtype: int64
Italian Restaurant    2
Train Station         2
Clothing Store        2
Name: 2nd Most Common Venue, dtype: int64
Park                    2
Gym / Fitness Center    2
Pub                     2
Name: 3rd Most Common Venue, dtype: int64


### Analysis of the most common venues 

Let's compare the most common venues in the two boroughs:
* Tower Hamlets: looking at the three most common venues in Tower Hamlets, most are food and beverage venues, accounting for 84% of the venues. Most venues in Tower Hamlets appear to be providers of essential sustenance. In addition, owing to its location, the borough has bike rental and bike share venues that residents can use to get around the area.
* Bromley: although Bromley also has food and beverage venues, its most common venues also include clothing stores, gyms, parks and Italian restaurants. These are lifestyle luxuries that are more typical of areas with low poverty rates.
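
A food and beverage share like the one quoted for Tower Hamlets depends on which categories are counted and how many ranks are included. The sketch below shows one way to compute such a share from the `head(3)` counts printed above for Tower Hamlets. The `food_and_drink` grouping is an assumption, and because only the printed counts are used, the result is not expected to match the 84% figure.

```python
from collections import Counter

# value_counts().head(3) output for the 1st, 2nd and 3rd most common venues
printed_counts = [
    {'Pub': 7, 'Coffee Shop': 3, 'Indian Restaurant': 2},
    {'Pub': 3, 'Hotel': 2, 'Bike Rental / Bike Share': 2},
    {'Hotel': 2, 'Sandwich Place': 2, 'Boat or Ferry': 1},
]

# Add up the counts per category across the three ranks
top3_counts = Counter()
for rank_counts in printed_counts:
    top3_counts.update(rank_counts)

# Assumed grouping of categories into "food and beverage"
food_and_drink = {'Pub', 'Coffee Shop', 'Indian Restaurant', 'Sandwich Place'}

food_total = sum(n for venue, n in top3_counts.items() if venue in food_and_drink)
all_total = sum(top3_counts.values())
food_share = food_total / all_total
print(f"{food_total} of {all_total} entries ({food_share:.0%})")
```

Stating the grouping explicitly makes the percentage reproducible and shows how sensitive it is to borderline categories such as hotels.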

Looking at the bigger picture, the majority of venues in Tower Hamlets are food and beverage or sustenance venues, such as cafés, pubs, restaurants and grocery stores. Bromley, by contrast, offers more lifestyle venues, such as gyms, massage studios, health and beauty services, golf courses, theatres and garden centres, to name a few.

In summary, there are clear discrepancies between the venues in these two boroughs, in line with their different poverty rates.

Now that we know the discrepancies between the two boroughs, are there any similarities between them?

### Examine Clusters

#### Cluster 1:

In [51]:
TH_BROM_neighborhoods_venues_sorted.loc[TH_BROM_neighborhoods_venues_sorted['Cluster Label'] == 0]

Unnamed: 0,Cluster Label,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
16,0,Hayes & Coney Hall,Park,Pub,Hardware Store,Grocery Store,Yoga Studio,Falafel Restaurant,Flower Shop,Flea Market,Fishing Store,Fish Market
18,0,Kelsey & Eden Park,Café,Tennis Court,Park,Yoga Studio,Fast Food Restaurant,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
25,0,Plaistow & Sunbridge,Chinese Restaurant,Diner,Supermarket,Other Repair Shop,Fast Food Restaurant,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
26,0,Poplar,Park,Steakhouse,Pizza Place,Light Rail Station,Café,Fried Chicken Joint,Intersection,Tunnel,Chinese Restaurant,English Restaurant
31,0,Stepney Green,Park,Chinese Restaurant,Pub,Dessert Shop,Thrift / Vintage Store,Sandwich Place,Fried Chicken Joint,Farm,Supermarket,Italian Restaurant
33,0,West Wickham,Park,Furniture / Home Store,Business Service,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store,Fish Market


Let's examine cluster 1. Of the six neighbourhoods in this cluster, four are in Bromley, while Poplar and Stepney Green are in Tower Hamlets. Further investigation indicates that Poplar sits on the boundary of Canary Wharf, the financial hub of London, where many financial-sector employees reside and the poverty rate is low. For this reason, its venues are similar to those of the Bromley neighbourhoods in this cluster.

#### Cluster 2:

In [53]:
TH_BROM_neighborhoods_venues_sorted.loc[TH_BROM_neighborhoods_venues_sorted['Cluster Label'] == 1]

Unnamed: 0,Cluster Label,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,1,Darwin,Farm,History Museum,Bar,Yoga Studio,Food Stand,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store


Cluster 2 has only one neighbourhood, Darwin, located in Bromley. Its venues suggest a leisure area with more lifestyle venues, such as a yoga studio, a flower shop and bars.

#### Cluster 3:

In [54]:
TH_BROM_neighborhoods_venues_sorted.loc[TH_BROM_neighborhoods_venues_sorted['Cluster Label'] == 2]

Unnamed: 0,Cluster Label,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,2,Bethnal Green,Coffee Shop,Pub,Café,Wine Bar,Park,Bagel Shop,Restaurant,Shoe Store,Print Shop,Jewelry Store
1,2,Bickley,Home Service,Cricket Ground,Café,Train Station,Yoga Studio,Farm,Food Court,Food & Drink Shop,Flower Shop,Flea Market
3,2,Blackwall and Cubitt Town,Indian Restaurant,Park,Harbor / Marina,Pizza Place,Grocery Store,Café,Pub,Restaurant,Light Rail Station,Bus Stop
7,2,Bromley Town,Coffee Shop,Clothing Store,Gym / Fitness Center,Pub,Burger Joint,Pizza Place,Movie Theater,Burrito Place,Park,Café
8,2,Canary Wharf,Coffee Shop,Sandwich Place,Burger Joint,Plaza,Food Truck,Bakery,Italian Restaurant,Shopping Mall,Café,Sushi Restaurant
12,2,Cray Valley East,Coffee Shop,Clothing Store,Gym / Fitness Center,Chocolate Shop,Bookstore,Stationery Store,Furniture / Home Store,Movie Theater,Café,Sandwich Place
13,2,Crystal Palace,Platform,Breakfast Spot,History Museum,Track Stadium,Bus Station,Sculpture Garden,Garden,Farm,Bike Shop,Park
19,2,Lansbury,Indian Restaurant,Playground,Grocery Store,Café,Canal,Fast Food Restaurant,Supermarket,Pakistani Restaurant,Food & Drink Shop,Beer Garden
22,2,Orpington,Fast Food Restaurant,Supermarket,Italian Restaurant,Restaurant,Café,Coffee Shop,Chinese Restaurant,Memorial Site,Bakery,Portuguese Restaurant
24,2,Petts Wood & Knoll,Home Service,Convenience Store,Gift Shop,Yoga Studio,Farm,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store


#### Cluster 4:

In [55]:
TH_BROM_neighborhoods_venues_sorted.loc[TH_BROM_neighborhoods_venues_sorted['Cluster Label'] == 3]

Unnamed: 0,Cluster Label,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,3,Copers Cope,Athletics & Sports,Soccer Field,Indoor Play Area,Yoga Studio,Farm,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store


Cluster 4 has only one neighbourhood, Copers Cope, located in Bromley. Its venues suggest a leisure area with sport-oriented lifestyle venues, such as athletics & sports, a soccer field, an indoor play area and a yoga studio.

#### Cluster 5:

In [52]:
TH_BROM_neighborhoods_venues_sorted.loc[TH_BROM_neighborhoods_venues_sorted['Cluster Label'] == 4]

Unnamed: 0,Cluster Label,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,4,Biggin Hill,Airport,Airport Service,Pub,Massage Studio,Farm,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
4,4,Bow East,Pub,Hotel,Bar,Burger Joint,Metro Station,Office,Fast Food Restaurant,Bike Rental / Bike Share,Convenience Store,Gym
5,4,Bow West,Pub,Hotel,Bar,Burger Joint,Metro Station,Office,Fast Food Restaurant,Bike Rental / Bike Share,Convenience Store,Gym
6,4,Bromley Common & Keston,Bus Station,Indian Restaurant,Pub,Bar,Fast Food Restaurant,Farm,Food Stand,Food Court,Food & Drink Shop,Flower Shop
9,4,Chelsfield & Pratts Bottom,Pub,Health & Beauty Service,Fishing Store,Yoga Studio,Falafel Restaurant,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fish Market
10,4,Chislehurst,Pub,Indian Restaurant,Pizza Place,Italian Restaurant,Gastropub,Yoga Studio,Farm,Food & Drink Shop,Flower Shop,Flea Market
15,4,Farnborough & Croftton,Pub,American Restaurant,Coffee Shop,Theater,Yoga Studio,Farm,Food Court,Food & Drink Shop,Flower Shop,Flea Market
17,4,Island Gardens,Pub,History Museum,Boat or Ferry,Park,Portuguese Restaurant,Church,Restaurant,Rugby Pitch,Chinese Restaurant,Food & Drink Shop
20,4,Limehouse,Pub,Chinese Restaurant,Café,Indian Restaurant,Italian Restaurant,Park,Bus Stop,Convenience Store,Light Rail Station,Gastropub
21,4,Mottingham & Chislehurst North,Pub,Pizza Place,Fish & Chips Shop,Grocery Store,Yoga Studio,Falafel Restaurant,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store


Finally, cluster 5 is split evenly between the two boroughs. This cluster generally consists of food and drink venues in both boroughs. From this cluster, we can see that pubs are the favourite leisure venues for residents of both areas, regardless of the poverty rate.
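
The observation about pubs can be checked against the table with `value_counts`. The sketch below uses only the `1st Most Common Venue` values of the ten cluster 5 rows displayed above; `first_venue` is an illustrative name, and in the notebook the same column could be selected directly from the filtered dataframe.

```python
import pandas as pd

# '1st Most Common Venue' for the ten cluster 5 rows displayed above
first_venue = pd.Series(
    ['Airport', 'Pub', 'Pub', 'Bus Station', 'Pub',
     'Pub', 'Pub', 'Pub', 'Pub', 'Pub'],
    index=['Biggin Hill', 'Bow East', 'Bow West', 'Bromley Common & Keston',
           'Chelsfield & Pratts Bottom', 'Chislehurst', 'Farnborough & Croftton',
           'Island Gardens', 'Limehouse', 'Mottingham & Chislehurst North'],
    name='1st Most Common Venue',
)

# How often each category is the top venue, and the share taken by pubs
venue_counts = first_venue.value_counts()
pub_share = venue_counts['Pub'] / len(first_venue)
print(venue_counts)
print(f"Pub is the top venue in {pub_share:.0%} of the displayed rows")
```

Among the displayed rows, Pub is the top category in eight of ten neighbourhoods, spanning wards from both boroughs.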

======================================================================================================================== 

## <span style="color:darkred">6. Conclusions <a name="conclusion"></a></span>

In conclusion, there are clear discrepancies between the venues in the two boroughs, in line with their poverty rates. In Tower Hamlets, most venues are essential venues for day-to-day life, whereas in Bromley many of the venues are lifestyle venues that require a certain level of income.

======================================================================================================================== 