# Capstone Project - The Battle of the Neighborhoods (Week 1 and Week2)
### Applied Data Science Capstone by IBM/Coursera

**Evaluation Criteria**
1. A description of the problem and a discussion of the background. (15 marks)
2. A description of the data and how it will be used to solve the problem. (15 marks)
3. A link to your Notebook on your Github repository, showing your code. (15 marks)
4. A full report consisting of all of the following components. (15 marks)
5. A link to your Notebook on your Github repository pushed showing your code. (15 marks)
6. Your choice of a presentation or blogpost. (10 marks)

## Table of contents
1. [Introduction: Business Problem](#introduction)
2. [Data](#data)
3. [Methodology](#methodology)
4. [Analysis](#analysis)
5. [Results and Discussion](#results)
6. [Conclusion](#conclusion)

THIS IS THE CONTENTS ABOUT THIS PROJECT:
1. Introduction where we discuss the business problem and who would be interested in this project.
2. Data where we describe the data that will be used to solve the problem and the source of the data.
3. Methodology section which represents the main compnent of the report where we discuss and describe any exploratory data analysis that we did, any inferential statistical testing that you performed, and what machine learnings were used and why.
4. Results section where we discuss the results.
5. Discussion section where we discuss any observations we noted and any recommendations we can make based on the results.
6. Conclusion section where we conclude the report.

## 1. Introduction: Business Problem <a name="introduction"></a>

In this project, we will try to find an optimal location for a new Asian restaurant in London. Specifically, this report will be targeted to stakeholders interested in starting an **business for Asian restaurant in London, UK.** They hope to expand their restaurant in Europe, especially in London. Because foods in London is notorious, the delicious asian foods' taste is more competitive than the foods there.

As Asians are increasing more and more in London, there are not enough stores for Asian restaurants. Since there are lots of restaurants for different countries' people in London, we will try to detect **areas with no Asian restaurants in vicinity**. We would also prefer locations **as close to city center as possible**, assuming that first two conditions are met.

We will use our data science powers to generate a few most promising neighborhoods based on this criteria. Advantages of each area will then be clearly expressed so that best possible final location can be chosen by stakeholders.

## 2. Data Section<a name="data"></a>

I analyze the data based on the area data from Wikipedia page and the map data from google maps API and Foursquare API.  

**London Area data:** Use the Notebook to build the code to scrape the following Wikipedia page, https://en.wikipedia.org/wiki/List_of_areas_of_London, in order to obtain the data that is in the table of postal codes and to transform the data into a pandas dataframe.

* There are different website scraping libraries and packages in Python. One of the most common packages is BeautifulSoup. Here is the package's main documentation page: http://beautiful-soup-4.readthedocs.io/en/latest/

* The package is so popular that there is a plethora of tutorials and examples of how to use it. Here is a very good Youtube video on how to use the BeautifulSoup package: https://www.youtube.com/watch?v=ng2o98k983k

* Use the BeautifulSoup package or any other way you are comfortable with to transform the data in the table on the Wikipedia page into the above pandas dataframe

**Ethnicity data:** Use the Notebook to build the code to scrape the following Wikipedia page, https://en.wikipedia.org/wiki/Demography_of_London, in order to obtain the data that is in the table of postal codes and to transform the data into a pandas dataframe.

Using the proportions of ethnicity data in London, we can detect the areas where a lot of Asians live. And, we can find that the area with the highest Asian population in London is Newham, Harrow, Redbride, Tower Hamlets, and Hounslow.

And, we can focus on the specific area and analyze that area in more detailed.

**Geographical Data:** Use the Geocoder package or the csv file to get the geographical coordinates of the neighborhood

* centers of candidate areas will be generated algorithmically and approximate addresses of centers of those areas will be obtained using **Google Maps API reverse geocoding**
* number of restaurants and their type and location in every neighborhood will be obtained using **Foursquare API**
* coordinate of Berlin center will be obtained using **Google Maps API geocoding** of London location

**Data visualization:** After the data preprocessing,  we can create map of London using latitude and longitude values using the folium function.
Folium is a great visualization library. 

Feel free to zoom into the above map, and click on each circle mark to reveal the name of the neighborhood and its respective borough.

#### First, we need to import the library.

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

# !conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

from bs4 import BeautifulSoup
import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

# !conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


## Download and Explore Dataset 
### Obtain the data that is in the table of postal codes and to transform the data into a pandas dataframe

In [2]:
# Get url from wiki page and create soup object
url = " https://en.wikipedia.org/wiki/List_of_areas_of_London"
url_text= requests.get(url).text
soup = BeautifulSoup(url_text, 'lxml')

In [3]:
# Build the code to scrape the following Wikipedia page
data = []
columns = []
london_post = soup.find(class_='wikitable')
for index, tr in enumerate(london_post.find_all('tr')):
    cell = []
    for td in tr.find_all(['th','td']):
        cell.append(td.text.rstrip())    
    #First row of data is the header
    if (index == 0):
        columns = cell
    else:
        data.append(cell)
# Create the DataFrame
df = pd.DataFrame(data = data,columns = columns)
# The dataframe will consist of three columns: PostalCode, Borough, and Neighborhood
df.head(12)

Unnamed: 0,Location,London borough,Post town,Postcode district,Dial code,OS grid ref
0,Abbey Wood,"Bexley, Greenwich [1]",LONDON,SE2,20,TQ465785
1,Acton,"Ealing, Hammersmith and Fulham[2]",LONDON,"W3, W4",20,TQ205805
2,Addington,Croydon[2],CROYDON,CR0,20,TQ375645
3,Addiscombe,Croydon[2],CROYDON,CR0,20,TQ345665
4,Albany Park,Bexley,"BEXLEY, SIDCUP","DA5, DA14",20,TQ478728
5,Aldborough Hatch,Redbridge[3],ILFORD,IG2,20,TQ455895
6,Aldgate,City[4],LONDON,EC3,20,TQ334813
7,Aldwych,Westminster[4],LONDON,WC2,20,TQ307810
8,Alperton,Brent[5],WEMBLEY,HA0,20,TQ185835
9,Anerley,Bromley[5],LONDON,SE20,20,TQ345695


In [4]:
df.columns = ['Location', 'Borough', 'PostTown', 'PostCode', 'DialCode', 'OSGridRef']
df.columns
df.head()

Unnamed: 0,Location,Borough,PostTown,PostCode,DialCode,OSGridRef
0,Abbey Wood,"Bexley, Greenwich [1]",LONDON,SE2,20,TQ465785
1,Acton,"Ealing, Hammersmith and Fulham[2]",LONDON,"W3, W4",20,TQ205805
2,Addington,Croydon[2],CROYDON,CR0,20,TQ375645
3,Addiscombe,Croydon[2],CROYDON,CR0,20,TQ345665
4,Albany Park,Bexley,"BEXLEY, SIDCUP","DA5, DA14",20,TQ478728


In [5]:
df['Borough'] = df['Borough'].map(lambda x: x.rstrip(']').rstrip('0123456789').rstrip('['))
df.head()

Unnamed: 0,Location,Borough,PostTown,PostCode,DialCode,OSGridRef
0,Abbey Wood,"Bexley, Greenwich",LONDON,SE2,20,TQ465785
1,Acton,"Ealing, Hammersmith and Fulham",LONDON,"W3, W4",20,TQ205805
2,Addington,Croydon,CROYDON,CR0,20,TQ375645
3,Addiscombe,Croydon,CROYDON,CR0,20,TQ345665
4,Albany Park,Bexley,"BEXLEY, SIDCUP","DA5, DA14",20,TQ478728


In [6]:
df = df.drop('PostCode', axis=1).join(df['PostCode'].str.split(',', expand=True).stack().reset_index(level=1, drop=True).rename('PostCode'))
df.head()

Unnamed: 0,Location,Borough,PostTown,DialCode,OSGridRef,PostCode
0,Abbey Wood,"Bexley, Greenwich",LONDON,20,TQ465785,SE2
1,Acton,"Ealing, Hammersmith and Fulham",LONDON,20,TQ205805,W3
1,Acton,"Ealing, Hammersmith and Fulham",LONDON,20,TQ205805,W4
2,Addington,Croydon,CROYDON,20,TQ375645,CR0
3,Addiscombe,Croydon,CROYDON,20,TQ345665,CR0


In [7]:
df = df[['Location', 'Borough', 'PostCode', 'PostTown']].reset_index(drop=True)
df.head()

Unnamed: 0,Location,Borough,PostCode,PostTown
0,Abbey Wood,"Bexley, Greenwich",SE2,LONDON
1,Acton,"Ealing, Hammersmith and Fulham",W3,LONDON
2,Acton,"Ealing, Hammersmith and Fulham",W4,LONDON
3,Addington,Croydon,CR0,CROYDON
4,Addiscombe,Croydon,CR0,CROYDON


In [8]:
df = df[df['PostTown'].str.contains('LONDON')]
df.head()

Unnamed: 0,Location,Borough,PostCode,PostTown
0,Abbey Wood,"Bexley, Greenwich",SE2,LONDON
1,Acton,"Ealing, Hammersmith and Fulham",W3,LONDON
2,Acton,"Ealing, Hammersmith and Fulham",W4,LONDON
8,Aldgate,City,EC3,LONDON
9,Aldwych,Westminster,WC2,LONDON


In [9]:
# Get url from wiki page and create soup object
url = "https://en.wikipedia.org/wiki/Demography_of_London"
url_text= requests.get(url).text
soup = BeautifulSoup(url_text, 'lxml')

In [10]:
# Build the code to scrape the following Wikipedia page
data = []
columns = []
ethnic_post = soup.find(class_='wikitable sortable')
for index, tr in enumerate(ethnic_post.find_all('tr')):
    cell = []
    for td in tr.find_all(['th','td']):
        cell.append(td.text.rstrip())    
    #First row of data is the header
    if (index == 0):
        columns = cell
    else:
        data.append(cell)
# Create the DataFrame
race = pd.DataFrame(data = data,columns = columns)
# The dataframe will consist of three columns: PostalCode, Borough, and Neighborhood
race.head(12)

Unnamed: 0,Local authority,White,Mixed,Asian,Black,Other
0,Barnet,64.1,4.8,18.5,7.7,4.9
1,Barking and Dagenham,58.3,4.2,15.9,20.0,1.6
2,Bexley,81.9,2.3,6.6,8.5,0.8
3,Brent,36.3,5.1,34.1,18.8,5.8
4,Bromley,84.3,3.5,5.2,6.0,0.9
5,Camden,66.3,5.6,16.1,8.2,3.8
6,City of London,78.6,3.9,12.7,2.6,2.1
7,Croydon,55.1,6.6,16.4,20.2,1.8
8,Ealing,49.0,4.5,29.7,10.9,6.0
9,Enfield,61.0,5.5,11.2,17.2,5.1


In [11]:
race['Asian'] = race['Asian'].astype('float')

In [12]:
race_sorted = race.sort_values(by='Asian', ascending = False)
race_sorted.head()

Unnamed: 0,Local authority,White,Mixed,Asian,Black,Other
24,Newham,29.0,4.5,43.5,19.6,3.5
13,Harrow,42.2,4.0,42.6,8.2,2.9
25,Redbridge,42.5,4.1,41.8,8.9,2.7
29,Tower Hamlets,45.2,4.1,41.1,7.3,2.3
17,Hounslow,51.4,4.1,34.4,6.6,3.6


We can detect the areas where a lot of Asians lived.

The Area with the highest Asian population in London is the Newham, Harrow, Redbride, Tower Hamlets, and Hounslow.

In [13]:
london_top_asian = df[df['Borough'].isin(['Newham', 'Harrow', 'Redbridge', 'Tower Hamlets', 'Hounslow'])].reset_index(drop=True)
london_top_asian.head()

Unnamed: 0,Location,Borough,PostCode,PostTown
0,Beckton,Newham,E6,"LONDON, BARKING"
1,Beckton,Newham,E16,"LONDON, BARKING"
2,Beckton,Newham,IG11,"LONDON, BARKING"
3,Bethnal Green,Tower Hamlets,E2,LONDON
4,Blackwall,Tower Hamlets,E14,LONDON


## Use the Geocoder package or the csv file to get the geographical coordinates of the neighborhoods

Explore and cluster the neighborhoods in London. You can decide to work with only boroughs that contain the word London and then replicate the same analysis we did to the Toronto data. It is up to you.

Just make sure:
1. add enough Markdown cells to explain what you decided to do and to report any observations you make.
2. generate maps to visualize your neighborhoods and how they cluster together.

In [14]:
# !conda install -c conda-forge geocoder --yes  
import geocoder

Let's now use Google Maps API to get approximate addresses of those locations

In [15]:
# Defining a function to use --> get_latlng()'''
def get_latlng(arcgis_geocoder):
    
    # Initialize the Location (lat. and long.) to "None"
    lat_lng_coords = None
    
    # While loop helps to create a continous run until all the location coordinates are geocoded
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, London, United Kingdom'.format(arcgis_geocoder))
        lat_lng_coords = g.latlng
    return lat_lng_coords
# Geocoder ends here

In [16]:
postal_codes = london_top_asian['PostCode']    
coordinates = [get_latlng(postal_code) for postal_code in postal_codes.tolist()]

In [17]:
london_data = london_top_asian

london_coordinates = pd.DataFrame(coordinates, columns = ['Latitude', 'Longitude'])
london_data['Latitude'] = london_coordinates['Latitude']
london_data['Longitude'] = london_coordinates['Longitude']
london_data.head()

Unnamed: 0,Location,Borough,PostCode,PostTown,Latitude,Longitude
0,Beckton,Newham,E6,"LONDON, BARKING",51.53292,0.05461
1,Beckton,Newham,E16,"LONDON, BARKING",51.50913,0.01528
2,Beckton,Newham,IG11,"LONDON, BARKING",51.532674,0.085256
3,Bethnal Green,Tower Hamlets,E2,LONDON,51.52669,-0.06257
4,Blackwall,Tower Hamlets,E14,LONDON,51.51122,-0.01264


In [18]:
#geocoders
from geopy.geocoders import Nominatim
#get the coordinates for Toronto
address = 'London, United Kingdom'

geolocator = Nominatim(user_agent="ln_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

51.5073219 -0.1276474


In [19]:
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

#create map of Toronto using latitude and longitude values
map_london = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, loc in zip(london_data['Latitude'], london_data['Longitude'], london_data['Borough'], london_data['Location']):
    label = '{} - {}'.format(loc, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_london)  
    
map_london

Folium is a great visualization library. Feel free to zoom into the above map, and click on each circle mark to reveal the name of the neighborhood and its respective borough.
However, for illustration purposes, let's simplify the above map and segment and cluster only the neighborhoods in Toronto. So let's slice the original dataframe and create a new dataframe of the Toronto data.
Next, we are going to start utilizing the Foursquare API to explore the neighborhoods and segment them.

### Next, we are going to start utilizing the Foursquare API to explore the neighborhoods and segment them.

### Foursquare
Now that we have our location candidates, let's use Foursquare API to get info on restaurants in each neighborhood.

We're interested in venues in 'food' category, but only those that are proper restaurants - coffe shops, pizza places, bakeries etc. are not direct competitors so we don't care about those. So we will include in out list only venues that have 'restaurant' in category name, and we'll make sure to detect and include all the subcategories of specific 'Italian restaurant' category, as we need info on Italian restaurants in the neighborhood.

#### Define Foursquare Credentials and Version

In [20]:
CLIENT_ID = 'KVHRH3GNQEDO0ZD4QGBMVVTECINSXMLKIC4NAKITAT1BVEQ0' # your Foursquare ID
CLIENT_SECRET = 'OLDPBX11WOOA0AOY2EGWSVSEQUH45N3EBL1JQ42RTUCPM4IS' # your Foursquare Secret
VERSION = '20180604'
print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: KVHRH3GNQEDO0ZD4QGBMVVTECINSXMLKIC4NAKITAT1BVEQ0
CLIENT_SECRET:OLDPBX11WOOA0AOY2EGWSVSEQUH45N3EBL1JQ42RTUCPM4IS


#### Let's explore the first neighborhood in our dataframe.

Get the neighborhood's name.

In [21]:
london_data.head()

Unnamed: 0,Location,Borough,PostCode,PostTown,Latitude,Longitude
0,Beckton,Newham,E6,"LONDON, BARKING",51.53292,0.05461
1,Beckton,Newham,E16,"LONDON, BARKING",51.50913,0.01528
2,Beckton,Newham,IG11,"LONDON, BARKING",51.532674,0.085256
3,Bethnal Green,Tower Hamlets,E2,LONDON,51.52669,-0.06257
4,Blackwall,Tower Hamlets,E14,LONDON,51.51122,-0.01264


In [22]:
london_data.loc[0, 'Location']

'Beckton'

Get the neighborhood's latitude and longitude values.

In [23]:
neighborhood_latitude = london_data.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = london_data.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = london_data.loc[0, 'Location'] # Location name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Beckton are 51.53292000000005, 0.05461000000002514.


## 3. Methodology <a name="methodology"></a>

In this project, we will direct our efforts on detecting areas of London. We will limit our analysis to area around city center. In first step we have collected the required data: location and type (category) of every restaurant near London center. And, we will focus on most promising areas and within those create clusters of locations. We will present map of all such locations but also create clusters (using **k-means clustering) of those locations to identify general zones / neighborhoods / addresses which should be a starting point for final 'street level' exploration and search for optimal venue location by stakeholders.

## 4. Analysis <a name="analysis"></a>

Let's perform some basic explanatory data analysis and derive some additional info from our raw data. First let's count the **number of restaurants in every area candidate**:

### Analyze one Neighborhood

A single Neighborhood within the London area where a lot of Asians lived are examined by using the Foursquare API.

**Now, let's get the top 100 venues that are in Beckton (Borough: Newham) within a radius of 2000 meters.** (Newham borough is the area where a lot of Asians live.)

First, let's create the GET request URL. Name your URL url.

In [24]:
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 2000 # define radius
# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=KVHRH3GNQEDO0ZD4QGBMVVTECINSXMLKIC4NAKITAT1BVEQ0&client_secret=OLDPBX11WOOA0AOY2EGWSVSEQUH45N3EBL1JQ42RTUCPM4IS&v=20180604&ll=51.53292000000005,0.05461000000002514&radius=2000&limit=100'

Send the GET request and examine the resutls

In [25]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5cf42192db04f52f63f0729f'},
 'response': {'groups': [{'items': [{'reasons': {'count': 0,
       'items': [{'reasonName': 'globalInteractionReason',
         'summary': 'This spot is popular',
         'type': 'general'}]},
      'referralId': 'e-0-4d06283cc2e537044020c267-0',
      'venue': {'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/fastfood_',
          'suffix': '.png'},
         'id': '4bf58dd8d48988d16e941735',
         'name': 'Fast Food Restaurant',
         'pluralName': 'Fast Food Restaurants',
         'primary': True,
         'shortName': 'Fast Food'}],
       'id': '4d06283cc2e537044020c267',
       'location': {'address': '28-32 High St. N',
        'cc': 'GB',
        'city': 'East Ham',
        'country': 'United Kingdom',
        'distance': 142,
        'formattedAddress': ['28-32 High St. N',
         'East Ham',
         'Greater London',
         'E6 2HJ',
         'United Kingdom'],
        

From the Foursquare lab in the previous module, we know that all the information is in the items key. Before we proceed, let's borrow the get_category_type function from the Foursquare lab.

In [26]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [27]:
import json # library to handle JSON files values
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,McDonald's,Fast Food Restaurant,51.53404,0.053628
1,The Miller's Well (Wetherspoon),Pub,51.533406,0.056379
2,Central Park,Park,51.528808,0.052901
3,Taste Of India,Indian Restaurant,51.542572,0.050107
4,Chennai Dosa,Indian Restaurant,51.538225,0.05136


In [28]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

87 venues were returned by Foursquare.


In [29]:
beckton_unique = nearby_venues['categories'].value_counts().to_frame(name='Count')

In [30]:
beckton_unique.head(5)

Unnamed: 0,Count
Grocery Store,10
Indian Restaurant,7
Coffee Shop,6
Fast Food Restaurant,6
Hotel,5


### Analyze multiple Neighborhoods in London

#### Let's create a function to repeat the same process to all the neighborhoods in London

In [31]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

Now write the code to run the above function on each neighborhood and create a new dataframe called london_venues

In [32]:
LIMIT = 100 # limit of number of venues returned by Foursquare API

london_venues = getNearbyVenues(names=london_data['Location'],
                                   latitudes=london_data['Latitude'],
                                   longitudes=london_data['Longitude']
                                  )

Beckton
Beckton
Beckton
Bethnal Green
Blackwall
Bow
Bromley (also Bromley-by-Bow)
Cambridge Heath
Canary Wharf
Canning Town
Cubitt Town
Custom House
East Ham
Forest Gate
Grove Park
Gunnersbury
Isle of Dogs
Leamouth
Limehouse
Little Ilford
Manor Park
Maryland
Mile End
Millwall
North Woolwich
Old Ford
Plaistow
Poplar
Ratcliff
Shadwell
Silvertown
South Woodford
Spitalfields
Stepney
Stratford
Tower Hill
Upton Park
Upton Park
Wanstead
Wapping
West Ham
West Ham
Whitechapel
Woodford
Woodford


In [33]:
print(london_venues.shape)
london_venues.head()

(1005, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Beckton,51.53292,0.05461,McDonald's,51.53404,0.053628,Fast Food Restaurant
1,Beckton,51.53292,0.05461,The Miller's Well (Wetherspoon),51.533406,0.056379,Pub
2,Beckton,51.53292,0.05461,Central Park,51.528808,0.052901,Park
3,Beckton,51.53292,0.05461,Costa Coffee,51.534517,0.053365,Coffee Shop
4,Beckton,51.53292,0.05461,Primark,51.535303,0.052308,Clothing Store


Let's check how many venues were returned for each neighborhood

In [34]:
london_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Beckton,43,43,43,43,43,43
Bethnal Green,33,33,33,33,33,33
Blackwall,9,9,9,9,9,9
Bow,10,10,10,10,10,10
Bromley (also Bromley-by-Bow),10,10,10,10,10,10
Cambridge Heath,33,33,33,33,33,33
Canary Wharf,9,9,9,9,9,9
Canning Town,22,22,22,22,22,22
Cubitt Town,9,9,9,9,9,9
Custom House,22,22,22,22,22,22


**Let's find out how many unique categories can be curated from all the returned venues**

In [35]:
print('There are {} uniques categories.'.format(len(london_venues['Venue Category'].unique())))

There are 134 uniques categories.


In [36]:
london_venues__count = london_venues['Venue Category'].value_counts().to_frame(name='Count')
london_venues__count.head()

Unnamed: 0,Count
Pub,90
Coffee Shop,67
Café,53
Hotel,45
Grocery Store,39


### Analyze Each Neighborhood

In [37]:
# one hot encoding
london_onehot = pd.get_dummies(london_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
london_onehot['Neighborhood'] = london_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [london_onehot.columns[-1]] + list(london_onehot.columns[:-1])
london_onehot = london_onehot[fixed_columns]

london_onehot.head()

Unnamed: 0,Neighborhood,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Beach,Beer Bar,Bookstore,Boutique,Brewery,Burger Joint,Bus Station,Bus Stop,Café,Caribbean Restaurant,Champagne Bar,Chinese Restaurant,Church,Clothing Store,Cocktail Bar,Coffee Shop,Colombian Restaurant,Comfort Food Restaurant,Convenience Store,Cosmetics Shop,Creperie,Currywurst Joint,Department Store,Dessert Shop,Diner,Discount Store,Doner Restaurant,Electronics Store,English Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fountain,French Restaurant,Fried Chicken Joint,Fruit & Vegetable Store,Furniture / Home Store,Garden,Gas Station,Gay Bar,General Entertainment,Gift Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Halal Restaurant,Historic Site,History Museum,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Italian Restaurant,Japanese Restaurant,Jewelry Store,Kebab Restaurant,Lebanese Restaurant,Light Rail Station,Liquor Store,Locksmith,Lounge,Market,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Monument / Landmark,Moroccan Restaurant,Movie Theater,Moving Target,North Indian Restaurant,Office,Opera House,Organic Grocery,Outdoor Sculpture,Pakistani Restaurant,Park,Pharmacy,Pier,Pizza Place,Platform,Plaza,Poke Place,Portuguese Restaurant,Pub,Ramen Restaurant,Recreation Center,Rental Car Location,Restaurant,Salad Place,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shopping Mall,Snack Place,Social Club,South American Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Stationery Store,Steakhouse,Street Food Gathering,Supermarket,Sushi Restaurant,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Thrift / Vintage Store,Train Station,Turkish Restaurant,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Yoga Studio
0,Beckton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Beckton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Beckton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Beckton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Beckton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [38]:
# And let's examine the new dataframe size.
london_onehot.shape

(1005, 135)

In [39]:
london_onehot.loc[london_onehot['Asian Restaurant'] != 0]

Unnamed: 0,Neighborhood,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Beach,Beer Bar,Bookstore,Boutique,Brewery,Burger Joint,Bus Station,Bus Stop,Café,Caribbean Restaurant,Champagne Bar,Chinese Restaurant,Church,Clothing Store,Cocktail Bar,Coffee Shop,Colombian Restaurant,Comfort Food Restaurant,Convenience Store,Cosmetics Shop,Creperie,Currywurst Joint,Department Store,Dessert Shop,Diner,Discount Store,Doner Restaurant,Electronics Store,English Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fountain,French Restaurant,Fried Chicken Joint,Fruit & Vegetable Store,Furniture / Home Store,Garden,Gas Station,Gay Bar,General Entertainment,Gift Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Halal Restaurant,Historic Site,History Museum,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Italian Restaurant,Japanese Restaurant,Jewelry Store,Kebab Restaurant,Lebanese Restaurant,Light Rail Station,Liquor Store,Locksmith,Lounge,Market,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Monument / Landmark,Moroccan Restaurant,Movie Theater,Moving Target,North Indian Restaurant,Office,Opera House,Organic Grocery,Outdoor Sculpture,Pakistani Restaurant,Park,Pharmacy,Pier,Pizza Place,Platform,Plaza,Poke Place,Portuguese Restaurant,Pub,Ramen Restaurant,Recreation Center,Rental Car Location,Restaurant,Salad Place,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shopping Mall,Snack Place,Social Club,South American Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Stationery Store,Steakhouse,Street Food Gathering,Supermarket,Sushi Restaurant,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Thrift / Vintage Store,Train Station,Turkish Restaurant,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Yoga Studio
388,Maryland,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
636,Stratford,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
668,Tower Hill,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
727,Tower Hill,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
865,West Ham,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


Using this code, we can check the area where there is an asian restaurant.

In [40]:
# Next, let's group rows by neighborhood and by taking the mean of the frequency of occurrence of each category
london_grouped = london_onehot.groupby('Neighborhood').mean().reset_index()
london_grouped

Unnamed: 0,Neighborhood,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Beach,Beer Bar,Bookstore,Boutique,Brewery,Burger Joint,Bus Station,Bus Stop,Café,Caribbean Restaurant,Champagne Bar,Chinese Restaurant,Church,Clothing Store,Cocktail Bar,Coffee Shop,Colombian Restaurant,Comfort Food Restaurant,Convenience Store,Cosmetics Shop,Creperie,Currywurst Joint,Department Store,Dessert Shop,Diner,Discount Store,Doner Restaurant,Electronics Store,English Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fountain,French Restaurant,Fried Chicken Joint,Fruit & Vegetable Store,Furniture / Home Store,Garden,Gas Station,Gay Bar,General Entertainment,Gift Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Halal Restaurant,Historic Site,History Museum,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Italian Restaurant,Japanese Restaurant,Jewelry Store,Kebab Restaurant,Lebanese Restaurant,Light Rail Station,Liquor Store,Locksmith,Lounge,Market,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Monument / Landmark,Moroccan Restaurant,Movie Theater,Moving Target,North Indian Restaurant,Office,Opera House,Organic Grocery,Outdoor Sculpture,Pakistani Restaurant,Park,Pharmacy,Pier,Pizza Place,Platform,Plaza,Poke Place,Portuguese Restaurant,Pub,Ramen Restaurant,Recreation Center,Rental Car Location,Restaurant,Salad Place,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shopping Mall,Snack Place,Social Club,South American Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Stationery Store,Steakhouse,Street Food Gathering,Supermarket,Sushi Restaurant,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Thrift / Vintage Store,Train Station,Turkish Restaurant,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Yoga Studio
0,Beckton,0.0,0.0,0.0,0.0,0.0,0.0,0.023256,0.023256,0.023256,0.023256,0.023256,0.0,0.0,0.0,0.0,0.0,0.0,0.023256,0.023256,0.023256,0.0,0.023256,0.0,0.046512,0.0,0.023256,0.0,0.0,0.023256,0.0,0.0,0.0,0.023256,0.0,0.023256,0.023256,0.0,0.023256,0.0,0.0,0.0,0.023256,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.046512,0.023256,0.023256,0.023256,0.0,0.0,0.0,0.069767,0.0,0.0,0.0,0.0,0.0,0.023256,0.0,0.023256,0.0,0.0,0.023256,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.023256,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.023256,0.0,0.023256,0.0,0.023256,0.023256,0.0,0.0,0.023256,0.0,0.0,0.0,0.0,0.0,0.023256,0.023256,0.023256,0.0,0.023256,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.023256,0.0,0.023256,0.0,0.0,0.0,0.023256,0.0,0.0,0.0,0.0,0.0,0.023256,0.0,0.0,0.0
1,Bethnal Green,0.030303,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.030303,0.212121,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.060606,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.030303,0.0,0.030303
2,Blackwall,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.111111,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Bow,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.3,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Bromley (also Bromley-by-Bow),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.3,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Cambridge Heath,0.030303,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.030303,0.212121,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.060606,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.030303,0.0,0.030303
6,Canary Wharf,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.111111,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Canning Town,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.045455,0.0,0.0,0.0,0.0,0.136364,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Cubitt Town,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.111111,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Custom House,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.045455,0.0,0.0,0.0,0.0,0.136364,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [41]:
#checking the grouped dataframe size
london_grouped.shape

(40, 135)

#### Let's print each neighborhood along with the top 5 most common venues

In [42]:
# Let's print each neighborhood along with the top 5 most common venues
num_top_venues = 5

for hood in london_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = london_grouped[london_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Beckton----
              venue  freq
0             Hotel  0.07
1     Grocery Store  0.05
2    Clothing Store  0.05
3               Pub  0.02
4  Department Store  0.02


----Bethnal Green----
              venue  freq
0       Coffee Shop  0.21
1              Café  0.18
2               Pub  0.09
3     Grocery Store  0.06
4  Arepa Restaurant  0.03


----Blackwall----
            venue  freq
0            Park  0.22
1            Café  0.11
2   Grocery Store  0.11
3  Sandwich Place  0.11
4     Coffee Shop  0.11


----Bow----
          venue  freq
0           Pub   0.3
1     Locksmith   0.1
2      Bus Stop   0.1
3  Burger Joint   0.1
4           Bar   0.1


----Bromley (also Bromley-by-Bow)----
          venue  freq
0           Pub   0.3
1     Locksmith   0.1
2      Bus Stop   0.1
3  Burger Joint   0.1
4           Bar   0.1


----Cambridge Heath----
              venue  freq
0       Coffee Shop  0.21
1              Café  0.18
2               Pub  0.09
3     Grocery Store  0.06
4  Arepa R

In [43]:
# Let's put that into a pandas dataframe
# First, let's write a function to sort the venues in descending order
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [44]:
#create the new dataframe and display the top 10 venues for each neighborhood:

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = london_grouped['Neighborhood']

for ind in np.arange(london_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(london_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Beckton,Hotel,Grocery Store,Clothing Store,Gym / Fitness Center,Platform,Middle Eastern Restaurant,Sandwich Place,Caribbean Restaurant,Chinese Restaurant,Pub
1,Bethnal Green,Coffee Shop,Café,Pub,Grocery Store,Yoga Studio,Social Club,Art Gallery,Bar,Brewery,Cocktail Bar
2,Blackwall,Park,Chinese Restaurant,English Restaurant,Light Rail Station,Café,Sandwich Place,Grocery Store,Coffee Shop,Yoga Studio,Fish & Chips Shop
3,Bow,Pub,Bus Stop,Rental Car Location,Burger Joint,Grocery Store,Locksmith,Light Rail Station,Bar,Falafel Restaurant,Farmers Market
4,Bromley (also Bromley-by-Bow),Pub,Bus Stop,Rental Car Location,Burger Joint,Grocery Store,Locksmith,Light Rail Station,Bar,Falafel Restaurant,Farmers Market


## Cluster Neighborhoods

Before creating K-means clustering,  we are needed to check the elbow method and get the best K.

In [45]:
# Run *k*-means to cluster the neighborhood into 5 clusters.
# set number of clusters

london_grouped_clustering = london_grouped.drop('Neighborhood', 1)

In [46]:
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# SSE is initialize with empty values
# n_clusters is the "k" 
sse = {}
for n_cluster1 in range(2, 12):
    kmeans1 = KMeans(n_clusters = n_cluster1, max_iter = 1000).fit(london_grouped_clustering)
    london_grouped_clustering["Labels"] = kmeans1.labels_
    
    # The inertia is the sum of distances of samples to their closest cluster centre
    sse[n_cluster1] = kmeans1.inertia_ 
plt.figure()
plt.plot(list(sse.keys()), list(sse.values()))
plt.xlabel("Number of Clusters, k")
plt.ylabel("Sum of Squared Error, SSE")

plt.show()

<matplotlib.figure.Figure at 0x7fb692ef0630>

Using the elbow method, we can select the best number of k. So, k is 3.

In [47]:
# set number of clusters
kclusters = 3

# run k-means clustering
kmeans = KMeans(n_clusters = kclusters, random_state=0).fit(london_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 1, 1, 2, 2, 1, 1, 0, 1, 0], dtype=int32)

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [48]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Labels', kmeans.labels_)

london_merged = london_data

# merge downtown_grouped with toronto data to add latitude/longitude for each neighborhood
# I realized that I've misspelled the NeighboUrhood column name in Toronto dataframe. oops...
london_merged = london_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Location')

london_merged.head()

Unnamed: 0,Location,Borough,PostCode,PostTown,Latitude,Longitude,Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Beckton,Newham,E6,"LONDON, BARKING",51.53292,0.05461,0,Hotel,Grocery Store,Clothing Store,Gym / Fitness Center,Platform,Middle Eastern Restaurant,Sandwich Place,Caribbean Restaurant,Chinese Restaurant,Pub
1,Beckton,Newham,E16,"LONDON, BARKING",51.50913,0.01528,0,Hotel,Grocery Store,Clothing Store,Gym / Fitness Center,Platform,Middle Eastern Restaurant,Sandwich Place,Caribbean Restaurant,Chinese Restaurant,Pub
2,Beckton,Newham,IG11,"LONDON, BARKING",51.532674,0.085256,0,Hotel,Grocery Store,Clothing Store,Gym / Fitness Center,Platform,Middle Eastern Restaurant,Sandwich Place,Caribbean Restaurant,Chinese Restaurant,Pub
3,Bethnal Green,Tower Hamlets,E2,LONDON,51.52669,-0.06257,1,Coffee Shop,Café,Pub,Grocery Store,Yoga Studio,Social Club,Art Gallery,Bar,Brewery,Cocktail Bar
4,Blackwall,Tower Hamlets,E14,LONDON,51.51122,-0.01264,1,Park,Chinese Restaurant,English Restaurant,Light Rail Station,Café,Sandwich Place,Grocery Store,Coffee Shop,Yoga Studio,Fish & Chips Shop


Finally, let's visualize the resulting clusters

In [49]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=13)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(london_merged['Latitude'], london_merged['Longitude'], london_merged['Location'], london_merged['Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=9,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

## Examine Clusters

Now, you can examine each cluster and determine the discriminating venue categories that distinguish each cluster. Based on the defining categories, you can then assign a name to each cluster. 

#### Cluster 1

In [50]:
#Cluster 1
london_merged.loc[london_merged['Labels'] == 0, london_merged.columns[[1] + list(range(6, london_merged.shape[1]))]]

Unnamed: 0,Borough,Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Newham,0,Hotel,Grocery Store,Clothing Store,Gym / Fitness Center,Platform,Middle Eastern Restaurant,Sandwich Place,Caribbean Restaurant,Chinese Restaurant,Pub
1,Newham,0,Hotel,Grocery Store,Clothing Store,Gym / Fitness Center,Platform,Middle Eastern Restaurant,Sandwich Place,Caribbean Restaurant,Chinese Restaurant,Pub
2,Newham,0,Hotel,Grocery Store,Clothing Store,Gym / Fitness Center,Platform,Middle Eastern Restaurant,Sandwich Place,Caribbean Restaurant,Chinese Restaurant,Pub
9,Newham,0,Hotel,Plaza,Beach,Pier,Platform,Convenience Store,Middle Eastern Restaurant,Scenic Lookout,Science Museum,Diner
11,Newham,0,Hotel,Plaza,Beach,Pier,Platform,Convenience Store,Middle Eastern Restaurant,Scenic Lookout,Science Museum,Diner
13,Newham,0,Grocery Store,Bus Stop,Pub,Fast Food Restaurant,Bakery,Comfort Food Restaurant,Moving Target,Fish & Chips Shop,Café,Train Station
22,Tower Hamlets,0,Pub,Supermarket,Chinese Restaurant,Movie Theater,Burger Joint,Sandwich Place,Cosmetics Shop,Thrift / Vintage Store,Platform,Thai Restaurant
24,Newham,0,Hotel,Plaza,Beach,Pier,Platform,Convenience Store,Middle Eastern Restaurant,Scenic Lookout,Science Museum,Diner
28,Tower Hamlets,0,Pub,Supermarket,Chinese Restaurant,Movie Theater,Burger Joint,Sandwich Place,Cosmetics Shop,Thrift / Vintage Store,Platform,Thai Restaurant
29,Tower Hamlets,0,Pub,Supermarket,Chinese Restaurant,Movie Theater,Burger Joint,Sandwich Place,Cosmetics Shop,Thrift / Vintage Store,Platform,Thai Restaurant


Looking at the result from cluster 1, we can conclude that this is the Cultural & Tourist area & Hub.(Hotel and Park are popular.)

In [51]:
#Cluster 2
london_merged.loc[london_merged['Labels'] == 1, london_merged.columns[[1] + list(range(6, london_merged.shape[1]))]]

Unnamed: 0,Borough,Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Tower Hamlets,1,Coffee Shop,Café,Pub,Grocery Store,Yoga Studio,Social Club,Art Gallery,Bar,Brewery,Cocktail Bar
4,Tower Hamlets,1,Park,Chinese Restaurant,English Restaurant,Light Rail Station,Café,Sandwich Place,Grocery Store,Coffee Shop,Yoga Studio,Fish & Chips Shop
7,Tower Hamlets,1,Coffee Shop,Café,Pub,Grocery Store,Yoga Studio,Social Club,Art Gallery,Bar,Brewery,Cocktail Bar
8,Tower Hamlets,1,Park,Chinese Restaurant,English Restaurant,Light Rail Station,Café,Sandwich Place,Grocery Store,Coffee Shop,Yoga Studio,Fish & Chips Shop
10,Tower Hamlets,1,Park,Chinese Restaurant,English Restaurant,Light Rail Station,Café,Sandwich Place,Grocery Store,Coffee Shop,Yoga Studio,Fish & Chips Shop
14,Hounslow,1,Café,Pub,Italian Restaurant,Bakery,Coffee Shop,Bookstore,Pharmacy,Creperie,Pizza Place,English Restaurant
15,Hounslow,1,Café,Pub,Italian Restaurant,Bakery,Coffee Shop,Bookstore,Pharmacy,Creperie,Pizza Place,English Restaurant
16,Tower Hamlets,1,Park,Chinese Restaurant,English Restaurant,Light Rail Station,Café,Sandwich Place,Grocery Store,Coffee Shop,Yoga Studio,Fish & Chips Shop
17,Tower Hamlets,1,Park,Chinese Restaurant,English Restaurant,Light Rail Station,Café,Sandwich Place,Grocery Store,Coffee Shop,Yoga Studio,Fish & Chips Shop
18,Tower Hamlets,1,Park,Chinese Restaurant,English Restaurant,Light Rail Station,Café,Sandwich Place,Grocery Store,Coffee Shop,Yoga Studio,Fish & Chips Shop


Even though cluster 2 is similar to cluster 3, it falls within more of a main residential district with Pubs.

In [52]:
#Cluster 3
london_merged.loc[london_merged['Labels'] == 2, london_merged.columns[[1] + list(range(6, london_merged.shape[1]))]]

Unnamed: 0,Borough,Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Tower Hamlets,2,Pub,Bus Stop,Rental Car Location,Burger Joint,Grocery Store,Locksmith,Light Rail Station,Bar,Falafel Restaurant,Farmers Market
6,Tower Hamlets,2,Pub,Bus Stop,Rental Car Location,Burger Joint,Grocery Store,Locksmith,Light Rail Station,Bar,Falafel Restaurant,Farmers Market
12,Newham,2,Clothing Store,Discount Store,Gym Pool,Sandwich Place,Fast Food Restaurant,Coffee Shop,Grocery Store,Bakery,Jewelry Store,Pub
25,Tower Hamlets,2,Pub,Bus Stop,Rental Car Location,Burger Joint,Grocery Store,Locksmith,Light Rail Station,Bar,Falafel Restaurant,Farmers Market
26,Newham,2,Pub,Bus Station,Grocery Store,Park,Café,Gym,Fruit & Vegetable Store,Fried Chicken Joint,French Restaurant,Fountain
31,Redbridge,2,Yoga Studio,BBQ Joint,Seafood Restaurant,Coffee Shop,Bar,Pub,Café,English Restaurant,Falafel Restaurant,Farmers Market
36,Newham,2,Pub,Park,Clothing Store,Grocery Store,Electronics Store,Gym,Sandwich Place,Café,Discount Store,Bus Station
37,Newham,2,Pub,Park,Clothing Store,Grocery Store,Electronics Store,Gym,Sandwich Place,Café,Discount Store,Bus Station


Cluster 3 falls within much more residential area and there are pubs and café there.

## 5. Results and Discussion <a name="results"></a>

After the cluster analysis, Cluster 1  is the Cultural & Tourist area and Hub (Hotel and Park are popular), cluster 2 falls within more of a main residential district with Pubs and cluster 3 is  much more residential area and there are pubs and café there.

In Newham area where many Asians lived, Indian restaurants are popular.

In conclusion, pubs and hotels are popular in the Asian district in London. And, Most popular venue type in Downtown London is Pubs. 

London is the capital and the largest city with a high population density in a small area. There are 32 boroughs in London. As London is considered as one of the world's most global cities, there are many creative approaches in clustering and classification studies.

The K-means algorithm are used for clustering study. Through using the Elbow method, the best number of k is 3. For future study, more data set can be added and more details of the neighborhood can also be included.

When we recommend the zones where stakeholders can consider for opening their restaurants, we need to check if there is no nearby competition. and the conditions in that area are related with the restaurants in which they try to open.

## 6. Conclusion <a name="conclusion"></a>

The purpose of the project is to find out areas in London center with low number of Asian restaurants in order to help stakeholders narrowing down the candidate of best location for new Asian restaurants.

So, after the analysis of the venues in London, we can recommend our stakeholders that if the Asian restaurants are open in the cluster 1 area, it will be successful. This is because there are a lots of asian restaurant (Especially Chinese restaurant) and china restaurants (3rd Most Common Venue) are already popular in cluster 2 area. Also, the cluster 3 is a tourist area, it is not good for asian restaurants. In the areas which belongs to the cluster 1, there is no competitor for Asian restaurants and a lot of asian lived there so these areas are good place ofr new Asian restaurants.

And, if there were more data about the housing price, traffic access, ratings or so on, we can get better understanding of the results and give the good insights for stakeholders.

* Recommendation

When stakeholders start a business in a big city, it is needed to understand what the district needed. If they understand this clearly, people can achieve better outcomes through their analysis. There are different approaches to analyze the big cities.

The techniques which I used in London, can be used to analyze the different big cities' cases. And, these skills are useful to analyze the other cities.

And, we need to remember that "There is no free lunch", which means that there are a lot of methods to analyze the dataset and according to these techniques, the results are very different so we need to try diverse different methods to figure out what we need and we can find the best outcomes.