# 1. Motivation/Business Problem

My client is an entrepreneur who founded and owns a restaurant chain offering a unique fusion of Greek and Lebanese cuisines.  The franchise was established 5 years ago in Athens and now operates 18 stores in major cities of Greece, Cyprus and Lebanon.
 
Given the success of recent years, my client has decided to expand his franchise to world-class cities in wealthy countries such as the UK (London), France (Paris) and the UAE (Dubai).  He has decided to start with London, a truly global city with a highly diverse population and hence a constant demand for high-quality cuisine.  London boasts a wide array of fine restaurants serving cuisines from all over the world, including Mediterranean and Middle Eastern, meaning that competition will be fierce.  However, my client is confident that he has found an interesting gastronomic niche with the fusion of Greek and Lebanese dishes, and believes that demand for such culinary novelties is behind the franchise’s success in the countries where it currently operates.

London offers an obvious advantage in its huge ethnic diversity.  The franchise currently operates in large cities (e.g. Athens, Limassol, Beirut) whose populations are relatively homogeneous, as there is very limited immigration to these countries.  Expanding to London would offer an opportunity to tap into a vast customer base, not only because of London’s size (almost 9 million) but also because of its age and ethnicity demographics.  The aim is to capitalise on the city’s multiculturalism and to make the brand known throughout Europe and eventually the Middle East.

However, because London is so large, my client requires deeper insight in order to decide where to establish the first London store of his franchise.  Specifically, he is interested in identifying areas with large Greek and Arab communities, where residents are likely to appreciate the chance to enjoy familiar tastes.  Londoners from other Mediterranean countries may also find a Greek/Lebanese fusion appealing, so such populations can eventually be examined as well.

# 2. Data

The project will use data from Wikipedia (https://en.wikipedia.org/wiki/List_of_areas_of_London) and Foursquare.  Wikipedia will be used to collect and organise boroughs and their postcodes.  The Foursquare API will be used to obtain venue and location data for the areas of London with large Greek and Arab populations.

One assumption that will be used in this investigation is that the demographics of Greek and Arab areas will not shift dramatically in the next 5 to 10 years.

As a first step, some research was carried out to identify areas of London with large Greek, Cypriot and Arab populations. 

Greek-speaking populations (Greece and Cyprus):
The areas in Central London with significant populations are Chelsea, Bayswater (site of Saint Sophia Cathedral on Moscow Road), Kensington and Belgravia.  In suburban London, Palmers Green is home to one of the largest Cypriot populations outside Cyprus.        

Arab populations:
The centre of London, including SW1, NW London, W2 and W1, especially around Edgware Road, has a thriving Arab community.  The Borough of Westminster has the highest density of Arabic speakers in London and is also one of the most expensive areas.

To explore the venues in each location, data will be accessed through the Foursquare API and arranged as a dataframe for visualisation.

In [2]:
import json # library to handle JSON files

import numpy as np
import pandas as pd
import requests
from bs4 import BeautifulSoup
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
from pandas import json_normalize # transform JSON into a flat pandas dataframe (needs pandas >= 1.0)

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # comment this line out if folium is already installed
import folium # map rendering library

print("Libraries imported")


# Dataset 1: London Boroughs

In [3]:
from bs4 import BeautifulSoup
url = 'https://en.wikipedia.org/wiki/List_of_areas_of_London'

# Load wikipedia page, turn into soup and get the <table>.
wiki_pg = requests.get(url)
soup = BeautifulSoup(wiki_pg.content, 'html.parser')

table = soup.find('table', {'class':'wikitable sortable'}).tbody

rows = table.find_all('tr')
columns = [i.text.replace('\n', '') for i in rows[0].find_all('th')]

# Collect one cleaned list of cell values per table row
records = []
for row in rows[1:]:
    tds = row.find_all('td')
    # Strip newlines and non-breaking spaces from every cell
    values = [td.text.replace('\n', '').replace('\xa0', '') for td in tds]
    # Skip rows whose cell count does not match the header
    if len(values) == len(columns):
        records.append(values)

# Build the dataframe in one step (DataFrame.append was removed in pandas 2.0)
df_lon = pd.DataFrame(records, columns = columns)

In [4]:
df_lon.shape
df_lon.head(15)

Unnamed: 0,Location,London borough,Post town,Postcode district,Dial code,OS grid ref
0,Abbey Wood,"Bexley, Greenwich [7]",LONDON,SE2,20,TQ465785
1,Acton,"Ealing, Hammersmith and Fulham[8]",LONDON,"W3, W4",20,TQ205805
2,Addington,Croydon[8],CROYDON,CR0,20,TQ375645
3,Addiscombe,Croydon[8],CROYDON,CR0,20,TQ345665
4,Albany Park,Bexley,"BEXLEY, SIDCUP","DA5, DA14",20,TQ478728
5,Aldborough Hatch,Redbridge[9],ILFORD,IG2,20,TQ455895
6,Aldgate,City[10],LONDON,EC3,20,TQ334813
7,Aldwych,Westminster[10],LONDON,WC2,20,TQ307810
8,Alperton,Brent[11],WEMBLEY,HA0,20,TQ185835
9,Anerley,Bromley[11],LONDON,SE20,20,TQ345695
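
The scraping logic above can be tried offline on a small inline table. The following is a minimal sketch under that assumption: the HTML snippet and the name `toy_table` are illustrative and are not part of the original notebook.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Inline HTML standing in for the Wikipedia table (content is illustrative)
html = """
<table class="wikitable sortable">
  <tr><th>Location</th><th>Postcode&nbsp;district</th></tr>
  <tr><td>Abbey Wood</td><td>SE2</td></tr>
  <tr><td>Acton</td><td>W3, W4</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
rows = soup.find("table", {"class": "wikitable sortable"}).find_all("tr")

# Header cells keep the non-breaking space, as in the notebook's column names
columns = [th.text.replace("\n", "") for th in rows[0].find_all("th")]
records = [[td.text.strip() for td in row.find_all("td")] for row in rows[1:]]

toy_table = pd.DataFrame(records, columns=columns)
print(toy_table)
```

This also shows why later cells refer to columns such as `'Postcode\xa0district'`: the header text is not stripped of non-breaking spaces.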


In [5]:
# Filter out bad rows
df_lon = df_lon[~df_lon['London\xa0borough'].isnull()].copy() # copy so the assignment below does not act on a view

# Remove Borough reference numbers with "[]"
df_lon['London\xa0borough'] = df_lon['London\xa0borough'].map(lambda x: x.rstrip("]").rstrip("0123456789").rstrip("["))
                                                                                        
# Separate rows with more than one postcode into multiple rows                                                                                           
df_cleaned = df_lon.drop('Postcode\xa0district', axis=1).join(df_lon["Postcode\xa0district"].str.split(",", expand=True).stack().reset_index(level=1, drop=True).rename("Postcode\xa0district"))

# Drop columns we don't need, i.e. Dial code, OS grid ref
df1 = df_cleaned[['Location', 'London\xa0borough', 'Postcode\xa0district', 'Post town']].reset_index(drop=True)

# Only keep rows that have a London postcode
df2 = df1 
df_new = df2[df2['Post town'].str.contains('LONDON')]

df_new.head(15)

Unnamed: 0,Location,London borough,Postcode district,Post town
0,Abbey Wood,"Bexley, Greenwich",SE2,LONDON
1,Acton,"Ealing, Hammersmith and Fulham",W3,LONDON
2,Acton,"Ealing, Hammersmith and Fulham",W4,LONDON
8,Aldgate,City,EC3,LONDON
9,Aldwych,Westminster,WC2,LONDON
11,Anerley,Bromley,SE20,LONDON
12,Angel,Islington,EC1,LONDON
13,Angel,Islington,N1,LONDON
15,Archway,Islington,N19,LONDON
17,Arkley,Barnet,EN5,"BARNET, LONDON"


In [6]:
df_new.shape

(381, 4)
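
The cleaning step above splits rows that hold several postcode districts into one row per district. Below is a minimal sketch of the same idea on toy data, assuming pandas 0.25 or later for `DataFrame.explode`. The toy rows are illustrative. The sketch also trims whitespace around each postcode, which the original cell does not do explicitly.

```python
import pandas as pd

# Toy rows mimicking the scraped table (values are illustrative only)
toy = pd.DataFrame({
    "Location": ["Acton", "Aldgate"],
    "Postcode district": ["W3, W4", "EC3"],
})

# Split on commas, give each postcode its own row, and trim stray spaces
exploded = (
    toy.assign(**{"Postcode district": toy["Postcode district"].str.split(",")})
       .explode("Postcode district")
)
exploded["Postcode district"] = exploded["Postcode district"].str.strip()
exploded = exploded.reset_index(drop=True)
print(exploded)
```

Trimming matters when the postcodes are later matched with `isin`, since a value such as `" W4"` would not equal `"W4"`.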

# Dataset 2: Candidate Neighbourhoods

It is now time to focus on the neighbourhoods of interest. 

For Greek populations in London, these are Chelsea, Bayswater, Kensington and Belgravia in inner London. In outer London, Palmers Green has a large Cypriot population.  The postcodes of these areas are as follows:

- Chelsea (SW1, SW3, SW10)
- Bayswater (W2)
- Kensington (SW3)
- Belgravia (SW1)
- Palmers Green (N13)

For Arab populations in London, the areas of interest are in Central London in the following postcodes:

- W1
- W2
- SW1

In [7]:
# Collect postcodes of interest
postcodes = ['W1', 'W2', 'SW1', 'SW3', 'SW10', 'N13']

# Define new dataframe containing only areas of interest 
df_LondonAreas = df_new.loc[df_new['Postcode\xa0district'].isin(postcodes)].reset_index(drop=True)
df_LondonAreas.head(15)

Unnamed: 0,Location,London borough,Postcode district,Post town
0,Bayswater,Westminster,W2,LONDON
1,Belgravia,Westminster,SW1,LONDON
2,Brompton,Kensington and ChelseaHammersmith and Fulham,SW3,LONDON
3,Chelsea,Kensington and Chelsea,SW3,LONDON
4,Chinatown,Westminster,W1,LONDON
5,Fitzrovia,Camden,W1,LONDON
6,Knightsbridge,Westminster,SW1,LONDON
7,Marylebone (also St Marylebone),Westminster,W1,LONDON
8,Mayfair,Westminster,W1,LONDON
9,Millbank,Westminster,SW1,LONDON


In [8]:
df_LondonAreas.shape

(17, 4)

The Geocoder package is used with the arcgis_geocoder to obtain the latitude and longitude of the required locations.

In [9]:
!pip -q install geocoder
import geocoder
import time

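def get_latlng(postal_code, max_retries=5):
    """ Return [latitude, longitude] for a London postcode district using ArcGIS """
    
    # Initialise the location (lat. and long.) to "None"
    lat_lng_coords = None
    attempts = 0
    
    # Retry until the geocoder returns coordinates, giving up after max_retries
    while lat_lng_coords is None and attempts < max_retries:
        g = geocoder.arcgis('{}, London, United Kingdom'.format(postal_code))
        lat_lng_coords = g.latlng
        attempts += 1
        if lat_lng_coords is None:
            time.sleep(1) # brief pause before trying again
    
    if lat_lng_coords is None:
        raise RuntimeError('Could not geocode {}'.format(postal_code))
        
    return lat_lng_coords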


In [10]:
geocoder_example = get_latlng('SW1')
print(geocoder_example)

[51.49714000000006, -0.13828999999992675]


In [11]:
# Apply get_latlng to df_LondonAreas
postal_codes = df_LondonAreas['Postcode\xa0district']  
coords = [get_latlng(postal_code) for postal_code in postal_codes.tolist()]

# Merge latitude and longitude coordinates with the London dataframe 
df_loc = df_LondonAreas 
df_AreaCoords = pd.DataFrame(coords, columns = ['Latitude', 'Longitude'])
df_loc['Latitude'] = df_AreaCoords['Latitude']
df_loc['Longitude'] = df_AreaCoords['Longitude']
df_loc.head(15)

Unnamed: 0,Location,London borough,Postcode district,Post town,Latitude,Longitude
0,Bayswater,Westminster,W2,LONDON,51.51494,-0.18048
1,Belgravia,Westminster,SW1,LONDON,51.49714,-0.13829
2,Brompton,Kensington and ChelseaHammersmith and Fulham,SW3,LONDON,51.49014,-0.16248
3,Chelsea,Kensington and Chelsea,SW3,LONDON,51.49014,-0.16248
4,Chinatown,Westminster,W1,LONDON,51.51656,-0.1477
5,Fitzrovia,Camden,W1,LONDON,51.51656,-0.1477
6,Knightsbridge,Westminster,SW1,LONDON,51.49714,-0.13829
7,Marylebone (also St Marylebone),Westminster,W1,LONDON,51.51656,-0.1477
8,Mayfair,Westminster,W1,LONDON,51.51656,-0.1477
9,Millbank,Westminster,SW1,LONDON,51.49714,-0.13829
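
Because several neighbourhoods share a postcode district, the list comprehension above geocodes the same string several times, which is why many rows carry identical coordinates. The sketch below shows one way to cache lookups by postcode. It is an illustration only: `fake_lookup` is a hypothetical stand-in for the real geocoding call so that the example runs offline, and its coordinates are the rounded values from the table above.

```python
from functools import lru_cache

calls = []

def fake_lookup(postcode):
    """Hypothetical stand-in for a geocoding call; records each invocation."""
    calls.append(postcode)
    table = {"SW1": (51.49714, -0.13829), "W1": (51.51656, -0.1477)}
    return table[postcode]

@lru_cache(maxsize=None)
def cached_latlng(postcode):
    # Each distinct postcode reaches fake_lookup only once
    return fake_lookup(postcode)

postcodes = ["SW1", "W1", "SW1", "W1", "W1"]
coords = [cached_latlng(p) for p in postcodes]
print(len(coords), "coordinates from", len(calls), "lookups")
```

In the notebook, the same decorator could wrap `get_latlng`, assuming its argument is a plain string.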


In [12]:
# Verify that the coordinate dataframe has one row per neighbourhood
# (a bare comparison inside try/except never raises, so test the row counts explicitly)
if len(df_AreaCoords) != len(df_LondonAreas):
    print("ERROR: dataframe dimensions do not match!")

# 3. Methodology I: Single Neighbourhood Exploration

Here we explore a single neighbourhood to check that the Foursquare API returns usable venue data. Palmers Green, in the London Borough of Enfield (postcode N13), is used.

In [13]:
# Resets the current index to a new one
palmers_green = df_loc.reset_index().drop('index', axis = 1)
palmers_green.loc[palmers_green['Location'] == 'Palmers Green']

Unnamed: 0,Location,London borough,Postcode district,Post town,Latitude,Longitude
11,Palmers Green,Enfield,N13,LONDON,51.62003,-0.1066


In [14]:
palmersgreen_lat = palmers_green.loc[11, 'Latitude']
palmersgreen_long = palmers_green.loc[11, 'Longitude']
palmersgreen_loc = palmers_green.loc[11, 'Location']
palmersgreen_postcode = palmers_green.loc[11, 'Postcode\xa0district']
print('The latitude and longitude values of {} with postcode {}, are {}, {}.'.format(palmersgreen_loc, palmersgreen_postcode, palmersgreen_lat, palmersgreen_long))

The latitude and longitude values of Palmers Green with postcode N13, are 51.62003000000004, -0.10659999999995762.


Now we are ready to explore the top 50 venues that are within 2000 metres of Palmers Green. 

In [15]:
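# Credentials: read from the environment rather than hard-coded in the notebook
# (the environment variable names below are an assumption; set them before running)
import os
CLIENT_ID = os.environ["FOURSQUARE_CLIENT_ID"]
CLIENT_SECRET = os.environ["FOURSQUARE_CLIENT_SECRET"]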
VERSION = 20200301

LIMIT = 50 # limit of number of venues returned by Foursquare API
radius = 2000 # define radius

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION,
    palmersgreen_lat, 
    palmersgreen_long, 
    radius, 
    LIMIT)

url

'https://api.foursquare.com/v2/venues/explore?&client_id=<CLIENT_ID>&client_secret=<CLIENT_SECRET>&v=20200301&ll=51.62003000000004,-0.10659999999995762&radius=2000&limit=50'
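
As an alternative to formatting the query string by hand, the request URL can be assembled from a dictionary of parameters. This is a minimal sketch using only the standard library. The helper name `build_explore_url` and the placeholder credentials are hypothetical, and the parameter names are copied from the notebook's own URL.

```python
from urllib.parse import urlencode

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Assemble a venues/explore URL from keyword-style parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    # urlencode percent-escapes values, so the comma in "ll" becomes %2C
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params)

example_url = build_explore_url("YOUR_ID", "YOUR_SECRET", 20200301,
                                51.62003, -0.1066, 2000, 50)
print(example_url)
```

Keeping the parameters in a dictionary makes it easier to change the radius or limit per neighbourhood without editing a long format string.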

In [16]:
results_palmersgreen = requests.get(url).json()
results_palmersgreen 

{'meta': {'code': 200, 'requestId': '5efbd1a328da647940dea4aa'},
 'response': {'headerLocation': 'Palmers Green',
  'headerFullLocation': 'Palmers Green, London',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 95,
  'suggestedBounds': {'ne': {'lat': 51.63803001800006,
    'lng': -0.07766270109091306},
   'sw': {'lat': 51.602029982000026, 'lng': -0.13553729890900218}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4bd5bf1b637ba593f8e8f670',
       'name': 'Aksular Restaurant',
       'location': {'address': '232 Green Lanes',
        'lat': 51.61712427127939,
        'lng': -0.10905404931080269,
        'labeledLatLngs': [{'label': 'display',
          'lat': 51.61712427127939,
          'lng': -0.10905404931080269}],
        'distance': 365,
        '

In [17]:
# Now we would like to obtain necessary information from the items key. 

def get_category_type(row):
    """ Function that extracts the type of venue """
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    # Venues without any category get None
    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']

In [22]:
# The result is then cleaned up from json to a structured pandas dataframe 

venues_palmersgreen = results_palmersgreen['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues_palmersgreen) # flatten the nested JSON
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns] # keep only the columns of interest
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1) # extract the category name for each row
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns] # clean column names
nearby_venues.head(15)

Unnamed: 0,name,categories,lat,lng
0,Aksular Restaurant,Turkish Restaurant,51.617124,-0.109054
1,Kiva,Breakfast Spot,51.620046,-0.106582
2,Broomfield Park,Park,51.617616,-0.115644
3,Yasar Halim,Grocery Store,51.624133,-0.102029
4,Baskervilles Tea Shop,Tea Room,51.619121,-0.113068
5,Nostos Taverna,Greek Restaurant,51.611494,-0.109334
6,Nissi,Greek Restaurant,51.619022,-0.112809
7,Morrisons,Supermarket,51.61759,-0.110528
8,Little Waitrose & Partners,Grocery Store,51.618181,-0.108253
9,Starfish Coffee & Restaurant,Café,51.61949,-0.114345
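
The flattening step can be checked offline on hand-written items that mirror the structure of the response shown earlier. This is a sketch under stated assumptions: it needs pandas 1.0 or later for `pd.json_normalize`, and the second venue is hypothetical, included only to show how an empty category list is handled.

```python
import pandas as pd

# Hand-written items mirroring the nested structure of the API response
items = [
    {"venue": {"name": "Aksular Restaurant",
               "categories": [{"name": "Turkish Restaurant"}],
               "location": {"lat": 51.617124, "lng": -0.109054}}},
    {"venue": {"name": "Uncategorised Place",  # hypothetical venue with no category
               "categories": [],
               "location": {"lat": 51.62, "lng": -0.10}}},
]

# Nested keys become dotted column names such as "venue.location.lat"
flat = pd.json_normalize(items)
flat = flat[["venue.name", "venue.categories",
             "venue.location.lat", "venue.location.lng"]].copy()

# Keep the first category name, or None when the list is empty
flat["venue.categories"] = flat["venue.categories"].apply(
    lambda cats: cats[0]["name"] if cats else None)
flat.columns = [col.split(".")[-1] for col in flat.columns]
print(flat)
```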


In [25]:
nearby_venues_palmersgreen_unique = nearby_venues['categories'].value_counts().to_frame(name='Count')
nearby_venues_palmersgreen_unique.head(10)

Unnamed: 0,Count
Greek Restaurant,5
Grocery Store,5
Coffee Shop,4
Park,4
Pub,3
Bakery,2
Supermarket,2
Café,2
Pizza Place,2
Mediterranean Restaurant,1


There is an interesting finding. In line with the demographics of Palmers Green (large Greek-Cypriot population), Greek restaurants are the joint most common venue category, with 5 in the area. This suggests local demand for Greek cuisine, so a Greek-Lebanese restaurant may find a receptive audience, although it would also face established competitors. We consider it a candidate area.

Next we will try Chelsea, a Central London neighbourhood that is extremely affluent. The postcode for Chelsea is SW3.

In [50]:
# Resets the current index to a new one
sw3 = df_loc.reset_index().drop('index', axis = 1)
sw3.loc[sw3['Postcode\xa0district'] == 'SW3']

Unnamed: 0,Location,London borough,Postcode district,Post town,Latitude,Longitude
2,Brompton,Kensington and ChelseaHammersmith and Fulham,SW3,LONDON,51.49014,-0.16248
3,Chelsea,Kensington and Chelsea,SW3,LONDON,51.49014,-0.16248


In [54]:
sw3_lat = sw3.loc[3, 'Latitude']
sw3_long = sw3.loc[3, 'Longitude']
sw3_loc = sw3.loc[3, 'Location']
sw3_postcode = sw3.loc[3, 'Postcode\xa0district']

In [56]:
# Another call to the Foursquare API, this time for Chelsea venues 

# CLIENT_ID, CLIENT_SECRET and VERSION are reused from the Palmers Green request above

LIMIT = 100 # This time choose 100, since Chelsea is a much denser area 
radius = 3000 # define radius

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION,
    sw3_lat, 
    sw3_long, 
    radius, 
    LIMIT)

url

'https://api.foursquare.com/v2/venues/explore?&client_id=<CLIENT_ID>&client_secret=<CLIENT_SECRET>&v=20200301&ll=51.49014000000005,-0.16247999999995955&radius=3000&limit=100'

In [57]:
results_chelsea = requests.get(url).json()
results_chelsea 

{'meta': {'code': 200, 'requestId': '5efbebba4348d913c6dae49a'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'London',
  'headerFullLocation': 'London',
  'headerLocationGranularity': 'city',
  'totalResults': 239,
  'suggestedBounds': {'ne': {'lat': 51.51714002700008,
    'lng': -0.11919782719612296},
   'sw': {'lat': 51.463139973000025, 'lng': -0.20576217280379613}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '584056722a1982460a750bcb',
       'name': 'Venchi',
       'location': {'address': "71 King's Rd",
        'lat': 51.489239341994235,
        'lng': -0.16426476718180824,
        'labeledLatLngs': [{'label': 'display',
          'lat': 51.489239341994235,
          '

In [61]:
# The result is then cleaned up from json to a structured pandas dataframe 

venues_chelsea = results_chelsea['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues_chelsea) 
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns] # keep only the columns of interest
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1) # extract the category name for each row
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns] # clean column names
nearby_venues.head(30)

Unnamed: 0,name,categories,lat,lng
0,Venchi,Ice Cream Shop,51.489239,-0.164265
1,Amorino,Ice Cream Shop,51.489455,-0.163803
2,Duke of York Square,Plaza,51.491272,-0.159827
3,JOE & THE JUICE,Juice Bar,51.489397,-0.164001
4,Anthropologie,Women's Store,51.488012,-0.166653
5,Royal Court Theatre,Theater,51.492525,-0.156911
6,Sloane Square,Plaza,51.4925,-0.157435
7,Waitrose & Partners,Supermarket,51.488325,-0.16706
8,Chelsea Gardener,Garden Center,51.488213,-0.16938
9,Hagen,Coffee Shop,51.487985,-0.167429


In [63]:
nearby_venues_chelsea_unique = nearby_venues['categories'].value_counts().to_frame(name='Count')
nearby_venues_chelsea_unique.head(30)

Unnamed: 0,Count
Hotel,11
Park,5
Café,5
French Restaurant,4
Ice Cream Shop,4
Plaza,4
Garden,3
Coffee Shop,3
Supermarket,3
Department Store,2


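Chelsea has only a single Mediterranean restaurant and 4 French restaurants, possibly reflecting its large French population. In terms of venues, Chelsea is dominated by hotels (11), parks (5) and cafés (5), in line with it being a far more affluent inner-city neighbourhood than the suburb of Palmers Green. Note that this query used a larger radius (3000 m) and limit (100) than the Palmers Green query (2000 m, 50), so the raw counts are not directly comparable.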

# Methodology II: Multiple Neighbourhoods

Here we explore all neighbourhoods of interest, using the function *getNearbyVenues* to iterate over them.

In [21]:
def getNearbyVenues(names, latitudes, longitudes, radius=2000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            # guard against venues that have no category
            v['venue']['categories'][0]['name'] if v['venue']['categories'] else None) for v in results])
        
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                             'Neighbourhood Latitude', 
                             'Neighbourhood Longitude', 
                             'Venue', 
                             'Venue Latitude', 
                             'Venue Longitude', 
                             'Venue Category']
    
    return(nearby_venues)

In [28]:
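# Note: despite its name, palmers_green holds every candidate neighbourhood, not only Palmers Green
LondonArea_venues = getNearbyVenues(names = palmers_green['Location'],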
                                   latitudes = palmers_green['Latitude'],
                                   longitudes = palmers_green['Longitude'])

Bayswater
Belgravia
Brompton
Chelsea
Chinatown
Fitzrovia
Knightsbridge
Marylebone (also St Marylebone)
Mayfair
Millbank
Paddington
Palmers Green
Pimlico
Soho
St James's
West Brompton
Westminster


In [29]:
LondonArea_venues.head(15)

Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Bayswater,51.51494,-0.18048,The Westbourne Hyde Park,51.513263,-0.177836,Hotel
1,Bayswater,51.51494,-0.18048,Grand Union Canal | Paddington Arm,51.518277,-0.177283,Canal
2,Bayswater,51.51494,-0.18048,Waitrose & Partners,51.516346,-0.187347,Supermarket
3,Bayswater,51.51494,-0.18048,Paramount Lebanese Kitchen,51.515801,-0.174681,Lebanese Restaurant
4,Bayswater,51.51494,-0.18048,Darcie & May Green,51.518738,-0.178263,Café
5,Bayswater,51.51494,-0.18048,Italian Gardens,51.510844,-0.175576,Garden
6,Bayswater,51.51494,-0.18048,Heist Bank,51.518811,-0.176037,Beer Bar
7,Bayswater,51.51494,-0.18048,Amorino,51.510972,-0.187172,Ice Cream Shop
8,Bayswater,51.51494,-0.18048,M&S Simply Food,51.516335,-0.176644,Grocery Store
9,Bayswater,51.51494,-0.18048,The Victoria,51.513381,-0.171149,Pub


In [30]:
LondonArea_venues.shape

(850, 7)
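
With all venues in a single dataframe, one simple way to gauge direct competition is to count the relevant restaurant categories per neighbourhood. The following is a minimal sketch on toy rows shaped like `LondonArea_venues`. The choice of which categories count as competing is an assumption made for illustration, not something the analysis above establishes.

```python
import pandas as pd

# Toy rows in the same shape as LondonArea_venues (values are illustrative)
toy = pd.DataFrame({
    "Neighbourhood": ["Bayswater", "Bayswater", "Bayswater",
                      "Palmers Green", "Palmers Green"],
    "Venue Category": ["Lebanese Restaurant", "Hotel", "Greek Restaurant",
                       "Greek Restaurant", "Park"],
})

# Categories treated as direct competition (an assumption for illustration)
competing = ["Greek Restaurant", "Lebanese Restaurant", "Mediterranean Restaurant"]

competition = (
    toy[toy["Venue Category"].isin(competing)]
    .groupby("Neighbourhood")["Venue Category"]
    .count()
    .rename("Competing venues")
)
print(competition)
```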

In [38]:
LondonAreas_venue_unique_count = LondonArea_venues['Venue Category'].value_counts().to_frame(name='Count')
print(LondonAreas_venue_unique_count)

                               Count
Hotel                             63
Coffee Shop                       40
Park                              38
Café                              37
Plaza                             30
French Restaurant                 25
Garden                            23
Bakery                            20
Juice Bar                         20
Burger Joint                      20
Department Store                  20
Lounge                            17
Gym / Fitness Center              17
Grocery Store                     15
Pub                               15
Art Gallery                       14
Mediterranean Restaurant          13
Dessert Shop                      12
Italian Restaurant                12
Ice Cream Shop                    12
Hotel Bar                         11
Japanese Restaurant               11
Pedestrian Plaza                  11
Beer Bar                          11
Wine Shop                         11
Greek Restaurant                  10
T

# Methodology III: Clustering

In [40]:
address = 'London, United Kingdom'
geolocator = Nominatim(user_agent="ln_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of London are {}, {}.'.format(latitude, longitude))

The geographical coordinates of London are 51.5073219, -0.1276474.


In [45]:
map_london = folium.Map(location = [latitude, longitude], zoom_start = 12)
map_london

In [65]:
# Adding markers to map
for lat, lng, borough, loc in zip(palmers_green['Latitude'], 
                                  palmers_green['Longitude'],
                                  palmers_green['London\xa0borough'],
                                  palmers_green['Location']):
    label = '{} - {}'.format(loc, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_london)  
    
display(map_london)
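
This section is titled clustering and `KMeans` is imported at the top of the notebook, but no clustering step appears in the cells above. The sketch below shows how such a step might look; it is not the author's method. It assumes scikit-learn is installed, uses made-up venue rows for four hypothetical neighbourhoods (A to D), and picks two clusters arbitrarily.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Toy venue list in the same shape as LondonArea_venues (values are made up)
toy_venues = pd.DataFrame({
    "Neighbourhood": ["A", "A", "A", "B", "B", "B",
                      "C", "C", "C", "D", "D", "D"],
    "Venue Category": ["Hotel", "Hotel", "Café",
                       "Hotel", "Hotel", "Plaza",
                       "Greek Restaurant", "Greek Restaurant", "Grocery Store",
                       "Greek Restaurant", "Greek Restaurant", "Bakery"],
})

# One-hot encode categories, then average per neighbourhood to get frequencies
onehot = pd.get_dummies(toy_venues["Venue Category"], dtype=float)
onehot["Neighbourhood"] = toy_venues["Neighbourhood"]
freq = onehot.groupby("Neighbourhood").mean()

# Cluster neighbourhoods on their category-frequency profiles
kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(freq)
freq["Cluster"] = kmeans.labels_
print(freq["Cluster"])
```

On this toy data the hotel-heavy neighbourhoods (A, B) fall in one cluster and the Greek-restaurant-heavy ones (C, D) in the other. On the real data, the number of clusters would need to be chosen with more care.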

# 4. Results

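Across the candidate areas, most of which are in Central London, the most common venue categories are hotels (63), coffee shops (40), parks (38) and cafés (37).  Among restaurant categories, French is the most common (25), followed by Mediterranean (13), Italian (12), Japanese (11) and Greek (10).  The northern suburb of Palmers Green has five Greek restaurants, consistent with its large Cypriot population.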

# 5. Discussion & Conclusions

There does not seem to be one ideal place to open the first Greek-Lebanese fusion restaurant. In terms of fitting a niche, one of the Central London areas could be a good fit, as several Greek, Mediterranean, Italian and French restaurants are spread across them. Palmers Green appears to be a Greek hotspot, but this may well mean more competition concentrated in a smaller area (Palmers Green vs. Zone 1 of London).

Another factor that has to be considered is the cost of renting space for this operation. A Central London location would be far more expensive than a suburban one, but on the other hand it would be in a more densely populated area with a higher average income. Further investigation is certainly required to make an informed decision.