# Capstone Project - The Battle of the Neighborhoods (Week 2)
### Applied Data Science Capstone by IBM/Coursera

## <span style="color:darkred">Table of contents</span>
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## <span style="color:darkred"> 1. Introduction: Business Problem <a name="introduction"></a></span>

In this project, we investigate how the poverty rate of a borough influences its facilities and amenities, and how these insights can support targeted local and business development across the London boroughs. It will be interesting to see how poverty affects the facilities, venues and amenities available in each borough. 

<br><br>
<center><img src="poverty_rate.jpg"
     alt="London Poverty Rates"
     width = "700"
     style="margin-left: 10px;" /></center><br><br>


This will be done through a comparative study of the facilities and amenities in the two London boroughs with the highest and the lowest poverty rates, **Tower Hamlet (TH)** and **Bromley (Brom)**, as shown in the figure above.  

We will use the data science principles and techniques learned in this course to build a picture of these boroughs, looking at the venues and facilities available in each area, and to provide insights to stakeholders such as the local authorities and chambers of commerce.

### Questions to answer:

1.	What types of facilities (venues) and amenities are available in areas with different poverty rates?
2.	How do venues change with spending power?
3.	What are the distinctive venues that characterise each of these boroughs?
4.	Suggestions and recommendations. 

By answering the above questions, the findings can be used to target development in the rest of the London boroughs, so that unnecessary development is avoided and the overall budget is sustained.

======================================================================================================================== 

## <span style="color:darkred">2. Data <a name="data"></a></span>

Based on the problem definition, the factors that will influence the decision in this project are:

* the number and type of venues and facilities available in the surrounding area of these boroughs
* the most frequent venues and facilities in each borough

To define the surrounding area of each borough, we will use:
* London borough poverty information from: **https://www.trustforlondon.org.uk/data/poverty-borough**. *Accessed: 11/03/2019*
* latitudes and longitudes of Tower Hamlet and Bromley, obtained from: **https://www.distancesto.com/coordinates/gb/**. *Accessed: 11/03/2019*  
* venues, their types and locations in each borough, obtained using the **Foursquare API**

Further information of the boroughs can be found at:
* Bromley: **https://en.wikipedia.org/wiki/London_Borough_of_Bromley** 
* Tower Hamlet: **https://en.wikipedia.org/wiki/London_Borough_of_Tower_Hamlets**

## Boroughs Locations 
Based on the information obtained from https://www.distancesto.com/coordinates/gb/, the latitude and longitude of TH and Brom are as follows:

In [None]:
TH_coordinates = (51.520261, -0.02934)
BROM_coordinates = (51.367971, 0.070062)
london_coordinates = (51.509865, -0.118092)

Let's visualise the locations of these boroughs on London map:

In [None]:
import folium
from folium.features import DivIcon

london_map = folium.Map(location = london_coordinates, zoom_start = 10)

folium.Circle(
    radius=2500,   # the radius is calculated based on the area coverage of the borough 
    location= TH_coordinates,
    color= 'crimson',
    fill=False,
).add_to(london_map)

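folium.Marker(
    TH_coordinates, 
    popup=('Tower Hamlet'), 
    # 'crimson' is not one of folium.Icon's supported colours, and 'info-sign' is a glyphicon rather than a Font Awesome icon
    icon=folium.Icon(color='red', 
    icon_color='white', icon='info-sign', angle=0, prefix='glyphicon')
).add_to(london_map)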

folium.Circle(
    radius=6900, # the radius is calculated based on the area coverage of the borough 
    location= BROM_coordinates, 
    popup = 'Bromley',
    color='darkblue',
    fill=False,
).add_to(london_map)

folium.Marker(
    BROM_coordinates, 
    popup=('Bromley'), 
    # 'info-sign' is a glyphicon, so the matching prefix is 'glyphicon' rather than 'fa'
    icon=folium.Icon(color='darkblue', 
    icon_color='white', icon='info-sign', angle=0, prefix='glyphicon')
).add_to(london_map)



london_map

Preliminary observations from the above map show that Tower Hamlet is located very close to the centre of London, whereas Bromley lies near the boundary of the M25 ring road, about a two-hour drive from the centre. In terms of area, Tower Hamlet covers about 19.77 km² and Bromley about 150.2 km². Based on this information, we can use these areas as distance references when retrieving venues, their types and locations using the Foursquare API.
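
The circle radii used on the map (2500 m and 6900 m) are consistent with treating each borough as a circle of equal area, r = sqrt(A / π). That reading is an assumption, since the notebook only says the radius is based on the area coverage, but a quick check lands close to the rounded values:

```python
import math

def equal_area_radius_m(area_km2):
    """Radius in metres of a circle whose area equals area_km2 (assumed model)."""
    return math.sqrt(area_km2 / math.pi) * 1000

# Areas quoted in the Data section
th_radius = equal_area_radius_m(19.77)     # Tower Hamlet
brom_radius = equal_area_radius_m(150.2)   # Bromley

print(round(th_radius), round(brom_radius))
```

Real borough boundaries are far from circular, so these radii are only a rough search area around each centre point.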

## Foursquare API

Now that we have our candidate locations, let's use the Foursquare API to get information on the venues in each borough.

As an exploratory project, we will retrieve the venues based on the area of each borough. We will then do the necessary manipulation and analysis to achieve our objectives.  

### Define Foursquare Credential and Version 

In [None]:
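import os

# Credentials are read from environment variables instead of being hard-coded in the notebook;
# the variable names below are an assumption, so adjust them to your own setup.
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + '*' * len(CLIENT_SECRET)) # masked so the secret is not echoed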

### Tower Hamlet: Let's get the top venues that are in Tower Hamlet within a radius of 2500 meters 

In [None]:
# importing the necessary libraries for the tasks 

from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import, pandas >= 1.0)
import pandas as pd
import json # library to handle JSON files
import requests # library to handle requests

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
import matplotlib.pyplot as plt
%matplotlib inline 

In [None]:
TH_latitude = 51.520261
TH_longitude = -0.02934
radius = 2500
LIMIT = 100

url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, TH_latitude, TH_longitude, VERSION, radius, LIMIT)
url
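
As a hedged alternative to the long `format` call, the query string can be assembled from a dictionary with the standard library, which keeps each parameter next to its name. The helper name `build_explore_url` and the placeholder credentials are hypothetical; the endpoint and parameter names are the ones used in the cell above.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'  # same endpoint as above

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Hypothetical helper: assemble the explore URL from named parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    # urlencode percent-encodes the values, so the comma in 'll' becomes %2C
    return EXPLORE_ENDPOINT + '?' + urlencode(params)

# Placeholder credentials, for illustration only
example_url = build_explore_url('MY_ID', 'MY_SECRET', '20180605', 51.520261, -0.02934, 2500, 100)
print(example_url)
```

The same helper could be reused for Bromley by changing only the coordinates, radius and limit.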

Send the GET request and examine the results of the retrieval from Foursquare API

In [None]:
TH_results = requests.get(url).json()
TH_results

In [None]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:  # fall back to the flattened column name
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

Now we are ready to clean the JSON and structure it into a pandas dataframe

In [None]:
TH_venues = TH_results['response']['groups'][0]['items']
    
TH_nearby_venues = json_normalize(TH_venues) # flatten JSON

# filter columns
TH_filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
TH_nearby_venues = TH_nearby_venues.loc[:, TH_filtered_columns]

# filter the category for each row
TH_nearby_venues['venue.categories'] = TH_nearby_venues.apply(get_category_type, axis=1)

# clean columns
TH_nearby_venues.columns = [col.split(".")[-1] for col in TH_nearby_venues.columns]

TH_nearby_venues.head()
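
To make the structure of the response explicit, the same flattening can be sketched in plain Python on a small mock payload. The venue names and coordinates below are invented for illustration; only the nesting (`response`, `groups`, `items`, `venue`) is taken from the cells above.

```python
# Mock payload with the same nesting as the explore response (values are made up)
mock_results = {
    'response': {'groups': [{'items': [
        {'venue': {'name': 'Example Pub',
                   'categories': [{'name': 'Pub'}],
                   'location': {'lat': 51.52, 'lng': -0.03}}},
        {'venue': {'name': 'Uncategorised Place',
                   'categories': [],
                   'location': {'lat': 51.51, 'lng': -0.02}}},
    ]}]}
}

def flatten_items(results):
    """Return one dict per venue with name, first category (or None), lat and lng."""
    rows = []
    for item in results['response']['groups'][0]['items']:
        venue = item['venue']
        categories = venue.get('categories', [])
        rows.append({
            'name': venue['name'],
            'categories': categories[0]['name'] if categories else None,
            'lat': venue['location']['lat'],
            'lng': venue['location']['lng'],
        })
    return rows

mock_rows = flatten_items(mock_results)
print(mock_rows)
```

Passing `mock_rows` to `pd.DataFrame` would give the same four columns (`name`, `categories`, `lat`, `lng`) as the cleaned dataframe above.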

In [None]:
# Number of venues returned by Foursquare
print('{} venues in Tower Hamlet were returned by Foursquare.'.format(TH_nearby_venues.shape[0]))

### Bromley: Let's get the top venues that are in Bromley within a radius of 6900 meters

In [None]:
BROM_latitude = 51.367971
BROM_longitude = 0.070062
radius = 6900
LIMIT = 200

url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, BROM_latitude,BROM_longitude, VERSION, radius, LIMIT)
url

Send the GET request and examine the results of the retrieval from Foursquare API

In [None]:
BROM_results = requests.get(url).json()
BROM_results

Now we are ready to clean the JSON and structure it into a pandas dataframe

In [None]:
BROM_venues = BROM_results['response']['groups'][0]['items']
    
BROM_nearby_venues = json_normalize(BROM_venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
BROM_nearby_venues = BROM_nearby_venues.loc[:, filtered_columns]

# filter the category for each row
BROM_nearby_venues['venue.categories'] = BROM_nearby_venues.apply(get_category_type, axis=1)

# clean columns
BROM_nearby_venues.columns = [col.split(".")[-1] for col in BROM_nearby_venues.columns]

BROM_nearby_venues.head()

======================================================================================================================== 

In [None]:
# Number of venues returned by Foursquare
print('{} venues in Bromley were returned by Foursquare.'.format(BROM_nearby_venues.shape[0]))

####  In summary:
* Total venues in Tower Hamlet returned by Foursquare: **100**
* Total venues in Bromley returned by Foursquare: **100**
* Tower Hamlet venues dataframe is called: **TH_nearby_venues**
* Bromley venues dataframe is called: **BROM_nearby_venues**

### Create a map of London with venues of each borough on it.

In [None]:
london_map = folium.Map(location = (51.476852, -0.000500), zoom_start = 10)

# add Tower Hamlet markers to map
for lat, lng, name, categories in zip(TH_nearby_venues['lat'], TH_nearby_venues['lng'], TH_nearby_venues['name'], TH_nearby_venues['categories']):
    label = '{}, {}'.format(name, categories)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(london_map)  
    
# add Bromley markers to map
for lat, lng, name, categories in zip(BROM_nearby_venues['lat'], BROM_nearby_venues['lng'], BROM_nearby_venues['name'], BROM_nearby_venues['categories']):
    label = '{}, {}'.format(name, categories)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='darkblue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(london_map)
    
folium.Circle(
    radius=2500,   # the radius is calculated based on the area coverage of the borough 
    location= TH_coordinates,
    color= 'crimson',
    fill=False,
).add_to(london_map)

folium.Circle(
    radius=6900, # the radius is calculated based on the area coverage of the borough 
    location= BROM_coordinates, 
    popup = 'Bromley',
    color='darkblue',
    fill=False,
).add_to(london_map)


    
london_map

## <span style="color:darkred">3. Methodology <a name="methodology"></a></span>

In this project, we concentrate only on the two boroughs with the largest gap in poverty rate. 

In the first step, we look at the top 10 venue categories in each borough, which provides an overview of the types of venues that are popular there. 

In the second step, we look at the top 5 venues and how their locations are spread, to give an indication of the possible area each venue covers for the local population.

In the third and final step, we drill into each category to obtain reviews of some of the venues and assess the quality of service in each borough.

With this analysis, we will be able to draw some conclusions on how the poverty rate of a borough affects the venues in the area, using Foursquare geospatial data.
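
For the second step, one possible way to quantify how venues are spread is the great-circle (haversine) distance between each venue and the borough centre. This is a sketch under assumptions rather than the method used in this notebook: the helper names are hypothetical and the venue points are made up. Only the two borough centres come from the Data section.

```python
import math

def haversine_km(point_a, point_b):
    """Great-circle distance in km between two (lat, lng) pairs given in degrees."""
    lat1, lng1 = map(math.radians, point_a)
    lat2, lng2 = map(math.radians, point_b)
    dlat, dlng = lat2 - lat1, lng2 - lng1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlng / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))  # 6371 km: mean Earth radius

def mean_distance_from_centre_km(centre, venue_points):
    """Average distance of a list of (lat, lng) venues from a borough centre."""
    return sum(haversine_km(centre, p) for p in venue_points) / len(venue_points)

# Borough centres from the Data section; the venue points are invented
TH_centre = (51.520261, -0.02934)
BROM_centre = (51.367971, 0.070062)
sample_venues = [(51.525, -0.035), (51.515, -0.020), (51.530, -0.040)]

centre_gap_km = haversine_km(TH_centre, BROM_centre)
sample_spread_km = mean_distance_from_centre_km(TH_centre, sample_venues)
print(centre_gap_km, sample_spread_km)
```

With the real data, the venue points could be taken from the `lat` and `lng` columns of `TH_nearby_venues` or `BROM_nearby_venues`, filtered to a single category.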

======================================================================================================================== 

## <span style="color:darkred">4. Analysis <a name="analysis"></a></span>

### Analyse Each Borough 

In [None]:
print("----Tower Hamlet----")
TH_nearby_venues['categories'].value_counts().head(10)

In [None]:
print("----Bromley----")
BROM_nearby_venues['categories'].value_counts().head(10)

From the above, we can see that, based on the venue categories available from Foursquare, both boroughs have almost the same top 10 venues. The top venue is clearly **Pub**, which is typical for London: it is part of British culture for people to relax and have drinks with friends and family in the local pub. 
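
The overlap between the two top lists can also be checked programmatically instead of by eye. The sketch below uses made-up category lists as stand-ins for the two dataframes, and `top_categories` is a hypothetical helper, so the printed categories are illustrative only.

```python
from collections import Counter

def top_categories(categories, n=10):
    """Names of the n most frequent categories, most common first."""
    return [name for name, _ in Counter(categories).most_common(n)]

# Made-up category columns standing in for the two dataframes
th_sample = ['Pub', 'Pub', 'Pub', 'Park', 'Park', 'Café', 'Café', 'Turkish Restaurant']
brom_sample = ['Pub', 'Pub', 'Pub', 'Garden Center', 'Garden Center',
               'Supermarket', 'Supermarket', 'Park']

th_top = top_categories(th_sample, n=3)
brom_top = top_categories(brom_sample, n=3)

shared = sorted(set(th_top) & set(brom_top))      # categories in both top lists
only_th = sorted(set(th_top) - set(brom_top))     # distinctive to Tower Hamlet
only_brom = sorted(set(brom_top) - set(th_top))   # distinctive to Bromley
print(shared, only_th, only_brom)
```

On the real data, `top_categories(TH_nearby_venues['categories'])` and `top_categories(BROM_nearby_venues['categories'])` would take the place of the sample lists.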

One surprising observation: with an area of just 19.77 km², Tower Hamlet has 6 **parks** available for local residents, whereas Bromley, with an area of 150.2 km², has only 3 designated parks. 
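
To put the park counts on a comparable footing, they can be divided by borough area. The sketch below uses the counts quoted above and the areas given in the Data section (19.77 km² and 150.2 km²); `venues_per_km2` is a hypothetical helper. Because the Foursquare results are capped by `LIMIT`, the counts are a sample rather than a full census of parks.

```python
def venues_per_km2(venue_count, area_km2):
    """Venue density: count divided by borough area in km²."""
    return venue_count / area_km2

# Park counts from the analysis above, areas from the Data section
th_park_density = venues_per_km2(6, 19.77)     # Tower Hamlet
brom_park_density = venues_per_km2(3, 150.2)   # Bromley

print(round(th_park_density, 3), round(brom_park_density, 3))
print(round(th_park_density / brom_park_density, 1))
```

On these figures, the density of listed parks in Tower Hamlet is roughly fifteen times that of Bromley.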

As a borough located on the outskirts of London, Bromley is considered to be part of the London countryside. It is therefore not surprising that **Garden Center** is one of the popular venue categories in the borough.


In [None]:
london_map = folium.Map(location = (51.476852, -0.000500), zoom_start = 11)

# marker colour assigned to each highlighted category ('Caffee Shop' corrected to 'Coffee Shop' so it can match)
TH_colours = {'Pub': 'red', 'Café': 'blue', 'Coffee Shop': 'green', 'Park': 'yellow', 'Turkish Restaurant': 'black'}
BROM_colours = {'Pub': 'red', 'Coffee Shop': 'blue', 'Pizza Place': 'green', 'Supermarket': 'yellow', 'Garden Center': 'black'}

def add_category_markers(venues, colours):
    # draw a marker only for venues whose category has a colour assigned
    for lat, lng, name, categories in zip(venues['lat'], venues['lng'], venues['name'], venues['categories']):
        colour = colours.get(categories)
        if colour is None:
            continue
        label = folium.Popup('{}'.format(name), parse_html=True)
        folium.CircleMarker([lat, lng], radius=5, popup=label, color=colour, fill=True, fill_color=colour, fill_opacity=0.7).add_to(london_map)

# add Tower Hamlet markers to map
add_category_markers(TH_nearby_venues, TH_colours)

# add Bromley markers to map
add_category_markers(BROM_nearby_venues, BROM_colours)
    
folium.Circle(
    radius=2500,   # the radius is calculated based on the area coverage of the borough 
    location= TH_coordinates,
    color= 'crimson',
    fill=False,
).add_to(london_map)

folium.Circle(
    radius=6900, # the radius is calculated based on the area coverage of the borough 
    location= BROM_coordinates, 
    popup = 'Bromley',
    color='darkblue',
    fill=False,
).add_to(london_map)

# display the London Map   
london_map

======================================================================================================================== 

## <span style="color:darkred">5. Results and Discussion <a name="results"></a></span>

======================================================================================================================== 

## <span style="color:darkred">6. Conclusions <a name="conclusion"></a></span>

======================================================================================================================== 