# Epicure Restaurant Expansion Project 

Epicure is a high-end African restaurant located in Sandton, an upscale neighborhood of Johannesburg, South Africa. The owner is Chef Coco. "Multi award-winning chef Coco Reinarhz is African epicurean elegance personified. World citizen... his food philosophy always engages with what it means to be African in a global gourmet context. With a childhood spent at his mother’s side in her restaurant in Kinshasa and a formal training at the Ecole Hotelière de Namur in Belgium, Coco is uniquely qualified to lead the new wave of South African cuisine. Under his careful stewardship, modern French flair and exquisite African innovation consistently make respectful and compatible culinary companions."

Chef Coco is looking to expand the Epicure franchise by opening a second restaurant. We are running this project to advise him on the best locations for his next restaurant in Johannesburg, South Africa.

https://epicurerestaurant.co.za

## Data used for the project 

To identify the best location for the next Epicure restaurant, we use location data in four steps.
1- We will use Foursquare location data to identify the characteristics of the current Epicure restaurant location.
2- We will then analyze the main competitor, which has a larger footprint, to confirm that these characteristics are key success factors for a high-end African restaurant.
3- We will then review the various neighborhoods of Johannesburg to select the ones that present these characteristics.
4- We will finally rank the neighborhoods and recommend the top 3 choices for the next Epicure restaurant location.
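
Steps 3 and 4 above do not say how neighborhoods are compared or ranked. One possible approach, shown here only as a sketch, is to describe each location by the share of venues in each category and rank candidates by cosine similarity to the Epicure profile. The function names, neighborhood names and category lists below are hypothetical placeholders, not project data or the project's confirmed method.

```python
import math
from collections import Counter


def category_profile(categories):
    """Return the share of venues per category, e.g. {'Hotel': 0.25, ...}."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse profiles stored as dicts."""
    dot = sum(value * q.get(key, 0.0) for key, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def rank_neighborhoods(reference, candidates, top_n=3):
    """Rank candidate neighborhoods by similarity to the reference location."""
    ref_profile = category_profile(reference)
    scored = [
        (name, cosine_similarity(ref_profile, category_profile(cats)))
        for name, cats in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical example data (not taken from the Foursquare results)
reference = ['Hotel', 'Seafood Restaurant', 'Coffee Shop', 'Shopping Mall']
candidates = {
    'Neighborhood A': ['Hotel', 'Coffee Shop', 'Shopping Mall', 'Gym'],
    'Neighborhood B': ['Pharmacy', 'Supermarket'],
    'Neighborhood C': ['Hotel', 'Seafood Restaurant', 'Coffee Shop', 'Shopping Mall'],
}
ranking = rank_neighborhoods(reference, candidates)
print(ranking)
```

Cosine similarity ignores how many venues a neighborhood has and compares only the mix of categories, which suits a comparison of areas of different sizes. Other similarity measures or a clustering step would be equally defensible.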

The current Epicure restaurant address is: Central Square, 3-5 Lower Road, Morningside, Sandton, Johannesburg, South Africa

The main competitor's restaurants are:

Moyo Melrose Arch: Shop 5, The High Street, Melrose Arch, Johannesburg, Gauteng, 2196

Moyo Zoo Lake: Zoo Lake Park, 1 Prince of Wales Drive, Parkview, Johannesburg, South Africa, 2192

http://www.moyo.co.za

## 1- Exploring the Epicure restaurant location

In [143]:
# Let's import the necessary libraries
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# function for transforming a JSON file into a pandas dataframe
from pandas.io.json import json_normalize

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')

Fetching package metadata .............
Solving package specifications: .

# All requested packages already installed.
# packages in environment at /opt/conda/envs/DSX-Python35:
#
geopy                     1.18.1                     py_0    conda-forge
Fetching package metadata .............
Solving package specifications: .

# All requested packages already installed.
# packages in environment at /opt/conda/envs/DSX-Python35:
#
folium                    0.5.0                      py_0    conda-forge
Folium installed
Libraries imported.


In [144]:
#Let's view a map of Johannesburg
jhb_map = folium.Map(location=[-26.195246, 28.034088], zoom_start=10) # generate johannesburg map
jhb_map

In [145]:
# Let's define the Foursquare credentials
CLIENT_ID = 'MBYOYZAVJU2LAMDJVQFLFZWKAQG4WVN0M1D3TQXTHNRHH0BC' # your Foursquare ID
CLIENT_SECRET = 'FSLIHDA5HZZWR323RW5BUFG3JWHCYC152SS30GICKQID0Y2L' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: MBYOYZAVJU2LAMDJVQFLFZWKAQG4WVN0M1D3TQXTHNRHH0BC
CLIENT_SECRET:FSLIHDA5HZZWR323RW5BUFG3JWHCYC152SS30GICKQID0Y2L
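
Hardcoding and printing API credentials in a shared notebook exposes them to every reader. A common alternative, sketched below, is to read them from environment variables and print only a masked form. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are assumptions made for this illustration. They are not part of the original notebook or of the Foursquare API.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read credentials from a mapping of environment variables.

    The variable names are hypothetical; use whatever names your setup defines.
    """
    try:
        return env['FOURSQUARE_CLIENT_ID'], env['FOURSQUARE_CLIENT_SECRET']
    except KeyError as missing:
        raise RuntimeError('Missing environment variable: {}'.format(missing)) from None


def mask(secret, visible=4):
    """Show only the first few characters when printing a secret."""
    return secret[:visible] + '*' * max(len(secret) - visible, 0)


# Demonstration with a stand-in mapping instead of the real environment
demo_env = {
    'FOURSQUARE_CLIENT_ID': 'abcd1234',
    'FOURSQUARE_CLIENT_SECRET': 'wxyz5678',
}
CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials(demo_env)
print('CLIENT_ID: ' + mask(CLIENT_ID))
print('CLIENT_SECRET: ' + mask(CLIENT_SECRET))
```

Calling `load_foursquare_credentials()` without arguments reads from the real process environment.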


In [146]:
# Let's find the coordinates of the Epicure restaurant
address = '5 Lower Road, Sandton South Africa'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

-26.0975473 28.050935
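
geopy's `geocode` returns `None` when it finds no match, in which case `location.latitude` raises an `AttributeError`. The sketch below wraps the lookup so that a missing result fails with a clearer message. `geocode_or_fail` and `fake_geocode` are hypothetical helpers. The fake geocoder stands in for `Nominatim(...).geocode` so the example runs without network access.

```python
from types import SimpleNamespace


def geocode_or_fail(geocode, address):
    """Return (latitude, longitude) for an address, or raise if nothing was found.

    `geocode` is any callable that behaves like geopy's Nominatim.geocode:
    it returns an object with .latitude and .longitude, or None on no match.
    """
    location = geocode(address)
    if location is None:
        raise ValueError('No geocoding result for: {!r}'.format(address))
    return location.latitude, location.longitude


def fake_geocode(address):
    """Stand-in geocoder that only knows the address used in this notebook."""
    known = {
        '5 Lower Road, Sandton South Africa':
            SimpleNamespace(latitude=-26.0975473, longitude=28.050935),
    }
    return known.get(address)


latitude, longitude = geocode_or_fail(fake_geocode, '5 Lower Road, Sandton South Africa')
print(latitude, longitude)
```

In the notebook itself the call would be `geocode_or_fail(geolocator.geocode, address)`.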


In [147]:
radius = 500

url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    latitude, 
    longitude, 
    VERSION, 
    radius, 
    LIMIT)
url

'https://api.foursquare.com/v2/venues/explore?client_id=MBYOYZAVJU2LAMDJVQFLFZWKAQG4WVN0M1D3TQXTHNRHH0BC&client_secret=FSLIHDA5HZZWR323RW5BUFG3JWHCYC152SS30GICKQID0Y2L&ll=-26.0975473,28.050935&v=20180604&radius=500&limit=30'
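
The same URL template is formatted by hand for every location in this notebook. A small helper, sketched here with the standard library only, builds the query string from named parameters, which avoids mixing up positional arguments. `build_explore_url` is a hypothetical name, and the credentials passed in the example are placeholders.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'


def build_explore_url(client_id, client_secret, latitude, longitude,
                      version='20180604', radius=500, limit=30):
    """Build the explore URL used in this notebook from named parameters."""
    params = [
        ('client_id', client_id),
        ('client_secret', client_secret),
        ('ll', '{},{}'.format(latitude, longitude)),
        ('v', version),
        ('radius', radius),
        ('limit', limit),
    ]
    # safe=',' keeps the comma in the "ll" value readable instead of %2C
    return EXPLORE_ENDPOINT + '?' + urlencode(params, safe=',')


example_url = build_explore_url('YOUR_ID', 'YOUR_SECRET', -26.0975473, 28.050935)
print(example_url)
```

The defaults mirror the `VERSION`, `radius` and `LIMIT` values set earlier in the notebook.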

In [148]:
results = requests.get(url).json()
'There are {} venues around Epicure restaurant.'.format(len(results['response']['groups'][0]['items']))

'There are 19 venues around Epicure restaurant.'
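
The expression `results['response']['groups'][0]['items']` raises a `KeyError` or `IndexError` if the response does not have the expected shape, for example after a failed request. A defensive accessor, sketched below, returns an empty list instead. `extract_items` is a hypothetical helper and the sample payloads are made up for the illustration.

```python
def extract_items(payload):
    """Return the list of recommended items, or [] if the expected keys are missing."""
    groups = payload.get('response', {}).get('groups', [])
    if not groups:
        return []
    return groups[0].get('items', [])


# Made-up payloads with the same nesting as the response used in this notebook
good_payload = {'response': {'groups': [{'items': [{'venue': {'name': 'A'}}]}]}}
bad_payload = {'meta': {}, 'response': {}}

print(len(extract_items(good_payload)))
print(len(extract_items(bad_payload)))
```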

In [149]:
items = results['response']['groups'][0]['items']
items[0]

{'reasons': {'count': 0,
  'items': [{'reasonName': 'globalInteractionReason',
    'summary': 'This spot is popular',
    'type': 'general'}]},
 'referralId': 'e-0-54058c73498e9e7e7eeb865b-0',
 'venue': {'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/seafood_',
     'suffix': '.png'},
    'id': '4bf58dd8d48988d1ce941735',
    'name': 'Seafood Restaurant',
    'pluralName': 'Seafood Restaurants',
    'primary': True,
    'shortName': 'Seafood'}],
  'id': '54058c73498e9e7e7eeb865b',
  'location': {'address': '3 Stan Close',
   'cc': 'ZA',
   'city': 'Sandton',
   'country': 'iNingizimu Afrika',
   'crossStreet': 'Grayston Drive',
   'distance': 227,
   'formattedAddress': ['3 Stan Close (Grayston Drive)',
    'Sandton',
    '2196',
    'iNingizimu Afrika'],
   'labeledLatLngs': [{'label': 'display',
     'lat': -26.098068736579865,
     'lng': 28.05313461839715}],
   'lat': -26.098068736579865,
   'lng': 28.05313461839715,
   'postalCode': '2196',
   'sta
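
The nested structure printed above can be flattened by hand, which makes explicit what the next cell does with `json_normalize` and `get_category_type`. The sketch below uses plain dictionaries. `venue_to_row` is a hypothetical helper, and the sample item is an abbreviated copy of the first result, with the name taken from the table further down.

```python
def venue_to_row(item):
    """Flatten one Foursquare item into a flat dict with the fields used later."""
    venue = item['venue']
    location = venue.get('location', {})
    categories = venue.get('categories', [])
    return {
        'name': venue.get('name'),
        # keep only the name of the first (primary) category, if any
        'categories': categories[0]['name'] if categories else None,
        'city': location.get('city'),
        'lat': location.get('lat'),
        'lng': location.get('lng'),
    }


# Abbreviated version of the first item in the response
sample_item = {
    'venue': {
        'name': 'the Codfather Sandton Skye',
        'categories': [{'name': 'Seafood Restaurant', 'primary': True}],
        'location': {
            'city': 'Sandton',
            'lat': -26.098068736579865,
            'lng': 28.05313461839715,
        },
    }
}
row = venue_to_row(sample_item)
print(row)
```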

In [150]:
# transform venues into a dataframe
dataframe = json_normalize(items) # flatten JSON

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# keep only the venue name, category, location fields and id
filtered_columns = ['venue.name', 'venue.categories'] + [col for col in dataframe.columns if col.startswith('venue.location.')] + ['venue.id']
dataframe_filtered = dataframe.loc[:, filtered_columns].copy() # copy so the assignment below does not touch a view

# replace the list of category dicts with the primary category name for each row
dataframe_filtered['venue.categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only the last term
dataframe_filtered.columns = [col.split('.')[-1] for col in dataframe_filtered.columns]

dataframe_filtered.head(20)

Unnamed: 0,name,categories,address,cc,city,country,crossStreet,distance,formattedAddress,labeledLatLngs,lat,lng,postalCode,state,id
0,the Codfather Sandton Skye,Seafood Restaurant,3 Stan Close,ZA,Sandton,iNingizimu Afrika,Grayston Drive,227,"[3 Stan Close (Grayston Drive), Sandton, 2196,...","[{'lng': 28.05313461839715, 'label': 'display'...",-26.098069,28.053135,2196.0,IGauteng,54058c73498e9e7e7eeb865b
1,Kauai,Juice Bar,Shop No 21A Benmore Gardens Shopping Center,ZA,"Benmore Gardens, Sandton",iNingizimu Afrika,1 Benmore Rd,292,[Shop No 21A Benmore Gardens Shopping Center (...,"[{'lng': 28.04936742606543, 'label': 'display'...",-26.099764,28.049367,,IGauteng,4da43ccc63b5a35db37f211a
2,Fournos Bakery,Bakery,Shop G40-G46 Benmore Gardens Shopping Centre,ZA,Parkmore,iNingizimu Afrika,Cnr Grayston Dr & Benmore Rd,215,[Shop G40-G46 Benmore Gardens Shopping Centre ...,"[{'lng': 28.049960559157427, 'label': 'display...",-26.099272,28.049961,2196.0,IGauteng,4bac822ef964a5205ff83ae3
3,Delhi Darbar,Indian Restaurant,134 11th Street,ZA,Benmore,iNingizimu Afrika,,406,"[134 11th Street, Benmore, iNingizimu Afrika]","[{'lng': 28.047639791936295, 'label': 'display...",-26.09968,28.04764,,IGauteng,4bc8948a6501c9b618654029
4,The Generator,Gastropub,130 11th Street,ZA,Parkmore,iNingizimu Afrika,,439,"[130 11th Street, Parkmore, iNingizimu Afrika]","[{'lng': 28.04702214239844, 'label': 'display'...",-26.099356,28.047022,,IGauteng,5638800bcd10c0eb6259c538
5,Hydro Park on Grayston Apartment Sandton,Hotel,86 Grayston Drive,ZA,Sandton,iNingizimu Afrika,,388,"[86 Grayston Drive, Sandton, iNingizimu Afrika]","[{'lng': 28.053729812489976, 'label': 'display...",-26.099973,28.05373,,IGauteng,4bdf22f17ea362b52f5943c4
6,Simply Asia,Asian Restaurant,Benmore Shopping Centre,ZA,,iNingizimu Afrika,,361,"[Benmore Shopping Centre, iNingizimu Afrika]","[{'lng': 28.04900024685335, 'label': 'display'...",-26.100287,28.049,,,54099394498e14be8b7a2541
7,Woolworths,Grocery Store,Benmore Gardens Shopping Centre,ZA,EGoli,iNingizimu Afrika,Cnr Grayston Dr & Benmore Rd,274,[Benmore Gardens Shopping Centre (Cnr Grayston...,"[{'lng': 28.04947021875389, 'label': 'display'...",-26.099629,28.04947,,IGauteng,4c398a4318e72d7f7b6d1af5
8,Wellness in motion Gym&Spa: The Sandton Emperor,Gym,8 West Rd South,ZA,EGoli,iNingizimu Afrika,,438,"[8 West Rd South, EGoli, 2096, iNingizimu Afrika]","[{'lng': 28.055310483379277, 'label': 'display...",-26.097219,28.05531,2096.0,IGauteng,4dcd0cbfb3adb047f4e9f399
9,Kawayi Sushi Bar,Sushi Restaurant,138 11th Street,ZA,Sandton,iNingizimu Afrika,,403,"[138 11th Street, Sandton, iNingizimu Afrika]","[{'lng': 28.047707835357983, 'label': 'display...",-26.099731,28.047708,,IGauteng,4b9a78d5f964a52056b835e3


In [151]:
#Let's clean up the data
filtered_columns = ['city', 'lat', 'lng', 'categories', 'name']
epicure_nearby_venues =dataframe_filtered.loc[:, filtered_columns]
epicure_nearby_venues

Unnamed: 0,city,lat,lng,categories,name
0,Sandton,-26.098069,28.053135,Seafood Restaurant,the Codfather Sandton Skye
1,"Benmore Gardens, Sandton",-26.099764,28.049367,Juice Bar,Kauai
2,Parkmore,-26.099272,28.049961,Bakery,Fournos Bakery
3,Benmore,-26.09968,28.04764,Indian Restaurant,Delhi Darbar
4,Parkmore,-26.099356,28.047022,Gastropub,The Generator
5,Sandton,-26.099973,28.05373,Hotel,Hydro Park on Grayston Apartment Sandton
6,,-26.100287,28.049,Asian Restaurant,Simply Asia
7,EGoli,-26.099629,28.04947,Grocery Store,Woolworths
8,EGoli,-26.097219,28.05531,Gym,Wellness in motion Gym&Spa: The Sandton Emperor
9,Sandton,-26.099731,28.047708,Sushi Restaurant,Kawayi Sushi Bar


In [152]:
epicure_nearby_venues =epicure_nearby_venues.replace({'EGoli': 'jhb_cbd'}, regex=True)
epicure_nearby_venues

Unnamed: 0,city,lat,lng,categories,name
0,Sandton,-26.098069,28.053135,Seafood Restaurant,the Codfather Sandton Skye
1,"Benmore Gardens, Sandton",-26.099764,28.049367,Juice Bar,Kauai
2,Parkmore,-26.099272,28.049961,Bakery,Fournos Bakery
3,Benmore,-26.09968,28.04764,Indian Restaurant,Delhi Darbar
4,Parkmore,-26.099356,28.047022,Gastropub,The Generator
5,Sandton,-26.099973,28.05373,Hotel,Hydro Park on Grayston Apartment Sandton
6,,-26.100287,28.049,Asian Restaurant,Simply Asia
7,jhb_cbd,-26.099629,28.04947,Grocery Store,Woolworths
8,jhb_cbd,-26.097219,28.05531,Gym,Wellness in motion Gym&Spa: The Sandton Emperor
9,Sandton,-26.099731,28.047708,Sushi Restaurant,Kawayi Sushi Bar
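
Two details of the relabeling above are worth noting. With `regex=True` the replacement also applies to any longer string that contains `EGoli`, and row 6 has no city at all, so it is dropped by the later `groupby('city')`. The sketch below shows an exact-match alternative in plain Python that also labels missing cities. `normalize_city`, `CITY_ALIASES` and the `'Unknown'` label are assumptions made for this illustration.

```python
CITY_ALIASES = {'EGoli': 'jhb_cbd'}  # same mapping as in the cell above


def normalize_city(city, aliases=CITY_ALIASES, missing_label='Unknown'):
    """Map a raw city label to a canonical one using exact matches only."""
    # Missing values appear as None or as float NaN (NaN is not equal to itself)
    if city is None or (isinstance(city, float) and city != city):
        return missing_label
    city = city.strip()
    return aliases.get(city, city)


raw_cities = ['Sandton', 'EGoli', None, float('nan'), 'Parkmore ']
clean_cities = [normalize_city(c) for c in raw_cities]
print(clean_cities)
```

With pandas, the same function could be applied with `epicure_nearby_venues['city'].apply(normalize_city)`.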


In [153]:
# one hot encoding
epicure_nearby_venues_onehot = pd.get_dummies(epicure_nearby_venues[['categories']], prefix="", prefix_sep="")

# add city column back to dataframe
epicure_nearby_venues_onehot['city'] = epicure_nearby_venues['city'] 

# move city column to the first column
fixed_columns = [epicure_nearby_venues_onehot.columns[-1]] + list(epicure_nearby_venues_onehot.columns[:-1])
epicure_nearby_venues_onehot = epicure_nearby_venues_onehot[fixed_columns]

epicure_nearby_venues_onehot.head(20)

Unnamed: 0,city,Asian Restaurant,Bakery,Chinese Restaurant,Coffee Shop,Gastropub,Grocery Store,Gym,Hotel,Indian Restaurant,Juice Bar,Mediterranean Restaurant,Pharmacy,Portuguese Restaurant,Seafood Restaurant,Shopping Mall,Supermarket,Sushi Restaurant,Toy / Game Store
0,Sandton,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0
1,"Benmore Gardens, Sandton",0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0
2,Parkmore,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Benmore,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
4,Parkmore,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
5,Sandton,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
6,,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,jhb_cbd,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0
8,jhb_cbd,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0
9,Sandton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0


In [154]:
epicure_nearby_venues_grouped = epicure_nearby_venues_onehot.groupby('city').mean().reset_index()
epicure_nearby_venues_grouped

Unnamed: 0,city,Asian Restaurant,Bakery,Chinese Restaurant,Coffee Shop,Gastropub,Grocery Store,Gym,Hotel,Indian Restaurant,Juice Bar,Mediterranean Restaurant,Pharmacy,Portuguese Restaurant,Seafood Restaurant,Shopping Mall,Supermarket,Sushi Restaurant,Toy / Game Store
0,Benmore,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Benmore Gardens, Johannesburg",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0
2,"Benmore Gardens, Sandton",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Morningside,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Parkmore,0.0,0.333333,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0
5,Sandton,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.111111,0.0,0.0,0.222222,0.111111,0.111111,0.111111,0.111111
6,jhb_cbd,0.0,0.0,0.0,0.0,0.0,0.5,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
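
Taking the mean of one-hot columns per city is equivalent to computing, for each city, the share of its venues that fall in each category. The plain-Python sketch below reproduces that calculation for two of the cities in the table above. `category_frequencies` is a hypothetical helper, and the rows are the Parkmore and jhb_cbd venues implied by the grouped output.

```python
from collections import Counter, defaultdict


def category_frequencies(rows):
    """rows: iterable of (city, category). Returns {city: {category: share}}."""
    counts = defaultdict(Counter)
    for city, category in rows:
        counts[city][category] += 1
    return {
        city: {cat: n / sum(counter.values()) for cat, n in counter.items()}
        for city, counter in counts.items()
    }


rows = [
    ('Parkmore', 'Bakery'),
    ('Parkmore', 'Gastropub'),
    ('Parkmore', 'Pharmacy'),
    ('jhb_cbd', 'Grocery Store'),
    ('jhb_cbd', 'Gym'),
]
frequencies = category_frequencies(rows)
print(frequencies['Parkmore'])
print(frequencies['jhb_cbd'])
```

Unlike the dataframe version, this representation omits categories with a share of zero.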


In [155]:
num_top_venues = 5

for hood in epicure_nearby_venues_grouped['city']:
    print("----"+hood+"----")
    temp = epicure_nearby_venues_grouped[epicure_nearby_venues_grouped['city'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Benmore----
                      venue  freq
0         Indian Restaurant   1.0
1          Asian Restaurant   0.0
2  Mediterranean Restaurant   0.0
3          Sushi Restaurant   0.0
4               Supermarket   0.0


----Benmore Gardens, Johannesburg----
                   venue  freq
0  Portuguese Restaurant   1.0
1       Asian Restaurant   0.0
2                 Bakery   0.0
3       Sushi Restaurant   0.0
4            Supermarket   0.0


----Benmore Gardens, Sandton----
              venue  freq
0         Juice Bar   1.0
1            Bakery   0.0
2  Sushi Restaurant   0.0
3       Supermarket   0.0
4     Shopping Mall   0.0


----Morningside----
                      venue  freq
0               Coffee Shop   1.0
1          Asian Restaurant   0.0
2  Mediterranean Restaurant   0.0
3          Sushi Restaurant   0.0
4               Supermarket   0.0


----Parkmore----
                      venue  freq
0                 Gastropub  0.33
1                  Pharmacy  0.33
2               

In [156]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
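
As the printout above shows for Benmore, a city can have a single category with a nonzero frequency. Sorting still returns five names, so the remaining "most common" venues are categories that do not occur there at all. The sketch below is a variant that keeps only categories with a positive frequency and breaks ties alphabetically. It works on a plain dict and is an illustration, not a drop-in replacement for `return_most_common_venues`.

```python
def most_common_nonzero(frequencies, num_top_venues):
    """Return up to num_top_venues categories that actually occur, most frequent first."""
    ranked = sorted(
        ((cat, freq) for cat, freq in frequencies.items() if freq > 0),
        key=lambda pair: (-pair[1], pair[0]),  # by frequency, then by name
    )
    return [cat for cat, _ in ranked[:num_top_venues]]


benmore = {'Indian Restaurant': 1.0, 'Asian Restaurant': 0.0, 'Sushi Restaurant': 0.0}
parkmore = {'Gastropub': 1 / 3, 'Pharmacy': 1 / 3, 'Bakery': 1 / 3, 'Hotel': 0.0}

print(most_common_nonzero(benmore, 5))
print(most_common_nonzero(parkmore, 5))
```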

In [157]:
num_top_venues = 5

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['city']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError: # no special suffix beyond 'rd', fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
epicure_nearby_venues_sorted = pd.DataFrame(columns=columns)
epicure_nearby_venues_sorted['city'] = epicure_nearby_venues_grouped['city']

for ind in np.arange(epicure_nearby_venues_grouped.shape[0]):
    epicure_nearby_venues_sorted.iloc[ind, 1:] = return_most_common_venues(epicure_nearby_venues_grouped.iloc[ind, :], num_top_venues)

epicure_nearby_venues_sorted.head()

Unnamed: 0,city,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Benmore,Indian Restaurant,Sushi Restaurant,Bakery,Chinese Restaurant,Coffee Shop
1,"Benmore Gardens, Johannesburg",Portuguese Restaurant,Toy / Game Store,Hotel,Bakery,Chinese Restaurant
2,"Benmore Gardens, Sandton",Juice Bar,Toy / Game Store,Hotel,Bakery,Chinese Restaurant
3,Morningside,Coffee Shop,Toy / Game Store,Sushi Restaurant,Bakery,Chinese Restaurant
4,Parkmore,Bakery,Gastropub,Pharmacy,Toy / Game Store,Hotel
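
The column names above are built from a three-element suffix list with a fallback to `th`, which works for the five columns used here but would produce labels such as `21th` for larger values. A general ordinal helper, sketched below, handles those cases. `ordinal` is a hypothetical function that is not part of the original notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


num_top_venues = 5
columns = ['city'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```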


In [158]:
epicure_map = folium.Map(location=[latitude, longitude], zoom_start=16) # generate map centred around Epicure

# add a red circle marker to represent the Epicure restaurant
folium.features.CircleMarker(
    [-26.0975473, 28.050935],
    radius=10,
    color='red',
    popup='Epicure Restaurant',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(epicure_map)

# add the nearby venues as blue circle markers
for lat, lng, label in zip(epicure_nearby_venues.lat, epicure_nearby_venues.lng, epicure_nearby_venues.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(epicure_map)

# display map
epicure_map

## 2- Competitor Analysis: Moyo Restaurants

In [159]:
# Let's find the coordinates of the Moyo Melrose Arch restaurant
address = 'Melrose Arch, sandton South Africa'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

-26.133655 28.0676007


In [160]:
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    latitude, 
    longitude, 
    VERSION, 
    radius, 
    LIMIT)
url

'https://api.foursquare.com/v2/venues/explore?client_id=MBYOYZAVJU2LAMDJVQFLFZWKAQG4WVN0M1D3TQXTHNRHH0BC&client_secret=FSLIHDA5HZZWR323RW5BUFG3JWHCYC152SS30GICKQID0Y2L&ll=-26.133655,28.0676007&v=20180604&radius=500&limit=30'

In [161]:
results = requests.get(url).json()
'There are {} venues around Moyo Melrose Arch restaurant.'.format(len(results['response']['groups'][0]['items']))

'There are 30 venues around Moyo Melrose Arch restaurant.'

In [162]:
items_M1 = results['response']['groups'][0]['items']
items_M1[0]

{'reasons': {'count': 0,
  'items': [{'reasonName': 'globalInteractionReason',
    'summary': 'This spot is popular',
    'type': 'general'}]},
 'referralId': 'e-0-4f28455ae4b04e256a5905f8-0',
 'venue': {'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/hotel_',
     'suffix': '.png'},
    'id': '4bf58dd8d48988d1fa931735',
    'name': 'Hotel',
    'pluralName': 'Hotels',
    'primary': True,
    'shortName': 'Hotel'}],
  'id': '4f28455ae4b04e256a5905f8',
  'location': {'address': '1 Melrose Square, Melrose Arch',
   'cc': 'ZA',
   'city': 'Johannesburg',
   'country': 'iNingizimu Afrika',
   'crossStreet': 'btwn Melrose Blvd. & High St.',
   'distance': 5,
   'formattedAddress': ['1 Melrose Square, Melrose Arch (btwn Melrose Blvd. & High St.)',
    'Johannesburg',
    '2196',
    'iNingizimu Afrika'],
   'labeledLatLngs': [{'label': 'display',
     'lat': -26.13361309621831,
     'lng': 28.067621466270452}],
   'lat': -26.13361309621831,
   'lng': 28.067

In [163]:

# transform venues into a dataframe
dataframe_M1 = json_normalize(items_M1) # flatten JSON

# keep only the venue name, category, location fields and id
filtered_columns_M1 = ['venue.name', 'venue.categories'] + [col for col in dataframe_M1.columns if col.startswith('venue.location.')] + ['venue.id']
dataframe_filtered_M1 = dataframe_M1.loc[:, filtered_columns_M1].copy() # copy so the assignment below does not touch a view

# replace the list of category dicts with the primary category name for each row
# (get_category_type is defined in the Epicure section above)
dataframe_filtered_M1['venue.categories'] = dataframe_filtered_M1.apply(get_category_type, axis=1)

# clean column names by keeping only the last term
dataframe_filtered_M1.columns = [col.split('.')[-1] for col in dataframe_filtered_M1.columns]

dataframe_filtered_M1.head(20)



Unnamed: 0,name,categories,address,cc,city,country,crossStreet,distance,formattedAddress,labeledLatLngs,lat,lng,neighborhood,postalCode,state,id
0,African Pride Melrose Arch Hotel,Hotel,"1 Melrose Square, Melrose Arch",ZA,Johannesburg,iNingizimu Afrika,btwn Melrose Blvd. & High St.,5,"[1 Melrose Square, Melrose Arch (btwn Melrose ...","[{'lng': 28.067621466270452, 'label': 'display...",-26.133613,28.067621,,2196.0,ZA-GT,4f28455ae4b04e256a5905f8
1,Pigalle,Restaurant,Melrose Arch,ZA,"Birnam Park, Johannesburg",iNingizimu Afrika,Melrose Blvd,250,"[Melrose Arch (Melrose Blvd), Birnam Park, Joh...","[{'lng': 28.06854774101176, 'label': 'display'...",-26.131568,28.068548,,,IGauteng,4c111c3e6b7e2d7f10b42835
2,JB's Corner,American Restaurant,11-17 Melrose Blvd.,ZA,EGoli,iNingizimu Afrika,The High St.,84,"[11-17 Melrose Blvd. (The High St.), EGoli, 21...","[{'lng': 28.068411684035134, 'label': 'display...",-26.13388,28.068412,,2196.0,IGauteng,4b7c1669f964a520257c2fe3
3,Melrose Arch,Shopping Mall,30 Melrose Blvd. Birnam,ZA,EGoli,iNingizimu Afrika,at Collins Rd.,90,"[30 Melrose Blvd. Birnam (at Collins Rd.), EGo...","[{'lng': 28.067944049835205, 'label': 'display...",-26.132901,28.067944,,2196.0,IGauteng,4d19d40885fc6dcb2067c04e
4,The Grind Coffee Company,Coffee Shop,30 Melrose Boulevard,ZA,EGoli,iNingizimu Afrika,,222,"[30 Melrose Boulevard, EGoli, iNingizimu Afrika]","[{'lng': 28.067724426560567, 'label': 'display...",-26.131657,28.067724,,,IGauteng,56bd9db1498e369507db1700
5,Paul Melrose Arch,Bakery,,ZA,,iNingizimu Afrika,,105,[iNingizimu Afrika],"[{'lng': 28.067986355875924, 'label': 'display...",-26.132773,28.067986,,,,58b2f4a919b1ad5493766327
6,Jamie's,English Restaurant,,ZA,EGoli,iNingizimu Afrika,,63,"[EGoli, iNingizimu Afrika]","[{'lng': 28.068114200674184, 'label': 'display...",-26.13332,28.068114,,,IGauteng,5852dd3fa83a251b3d8aab6a
7,Tashas,Café,Shop 14 The Piazza Melrose Arch,ZA,EGoli,iNingizimu Afrika,Melrose Blvd,249,[Shop 14 The Piazza Melrose Arch (Melrose Blvd...,"[{'lng': 28.068550028531213, 'label': 'display...",-26.131583,28.06855,Melrose,2196.0,IGauteng,4b7fc08ef964a520383c30e3
8,Protea Hotel Fire & Ice,Hotel,"22 Whiteley Road, Melrose Arch Precinct",ZA,Johannesburg,iNingizimu Afrika,Melrose Arch,139,"[22 Whiteley Road, Melrose Arch Precinct (Melr...","[{'lng': 28.067675, 'label': 'display', 'lat':...",-26.132407,28.067675,,2076.0,ZA-GT,4bd89579f645c9b60d5da8e0
9,Mezepoli,Greek Restaurant,"Shop 26, Melrose Arch",ZA,Melrose,iNingizimu Afrika,Melrose Blvd,272,"[Shop 26, Melrose Arch (Melrose Blvd), Melrose...","[{'lng': 28.0686605746955, 'label': 'display',...",-26.131398,28.068661,,,IGauteng,4b76ee38f964a520aa6b2ee3


In [164]:
#Let's clean up the data
filtered_columns_M1 = ['city', 'lat', 'lng', 'categories', 'name']
M1_nearby_venues =dataframe_filtered_M1.loc[:, filtered_columns_M1]
M1_nearby_venues

Unnamed: 0,city,lat,lng,categories,name
0,Johannesburg,-26.133613,28.067621,Hotel,African Pride Melrose Arch Hotel
1,"Birnam Park, Johannesburg",-26.131568,28.068548,Restaurant,Pigalle
2,EGoli,-26.13388,28.068412,American Restaurant,JB's Corner
3,EGoli,-26.132901,28.067944,Shopping Mall,Melrose Arch
4,EGoli,-26.131657,28.067724,Coffee Shop,The Grind Coffee Company
5,,-26.132773,28.067986,Bakery,Paul Melrose Arch
6,EGoli,-26.13332,28.068114,English Restaurant,Jamie's
7,EGoli,-26.131583,28.06855,Café,Tashas
8,Johannesburg,-26.132407,28.067675,Hotel,Protea Hotel Fire & Ice
9,Melrose,-26.131398,28.068661,Greek Restaurant,Mezepoli


In [165]:
M1_nearby_venues =M1_nearby_venues.replace({'EGoli': 'jhb_cbd'}, regex=True)
M1_nearby_venues

Unnamed: 0,city,lat,lng,categories,name
0,Johannesburg,-26.133613,28.067621,Hotel,African Pride Melrose Arch Hotel
1,"Birnam Park, Johannesburg",-26.131568,28.068548,Restaurant,Pigalle
2,jhb_cbd,-26.13388,28.068412,American Restaurant,JB's Corner
3,jhb_cbd,-26.132901,28.067944,Shopping Mall,Melrose Arch
4,jhb_cbd,-26.131657,28.067724,Coffee Shop,The Grind Coffee Company
5,,-26.132773,28.067986,Bakery,Paul Melrose Arch
6,jhb_cbd,-26.13332,28.068114,English Restaurant,Jamie's
7,jhb_cbd,-26.131583,28.06855,Café,Tashas
8,Johannesburg,-26.132407,28.067675,Hotel,Protea Hotel Fire & Ice
9,Melrose,-26.131398,28.068661,Greek Restaurant,Mezepoli


In [166]:
# one hot encoding
M1_nearby_venues_onehot = pd.get_dummies(M1_nearby_venues[['categories']], prefix="", prefix_sep="")

# add city column back to dataframe
M1_nearby_venues_onehot['city'] = M1_nearby_venues['city'] 

# move city column to the first column
M1_fixed_columns = [M1_nearby_venues_onehot.columns[-1]] + list(M1_nearby_venues_onehot.columns[:-1])
M1_nearby_venues_onehot = M1_nearby_venues_onehot[M1_fixed_columns]

M1_nearby_venues_onehot.head(20)

Unnamed: 0,city,African Restaurant,American Restaurant,Asian Restaurant,Bakery,Burger Joint,Café,Clothing Store,Coffee Shop,Department Store,...,Ice Cream Shop,Italian Restaurant,Plaza,Restaurant,Seafood Restaurant,Shopping Mall,Steakhouse,Vegetarian / Vegan Restaurant,Whisky Bar,Wine Bar
0,Johannesburg,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,"Birnam Park, Johannesburg",0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,0,0,0,0
2,jhb_cbd,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,jhb_cbd,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0
4,jhb_cbd,0,0,0,0,0,0,0,1,0,...,0,0,0,0,0,0,0,0,0,0
5,,0,0,0,1,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
6,jhb_cbd,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
7,jhb_cbd,0,0,0,0,0,1,0,0,0,...,0,0,0,0,0,0,0,0,0,0
8,Johannesburg,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
9,Melrose,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [167]:
M1_nearby_venues_grouped = M1_nearby_venues_onehot.groupby('city').mean().reset_index()
M1_nearby_venues_grouped

Unnamed: 0,city,African Restaurant,American Restaurant,Asian Restaurant,Bakery,Burger Joint,Café,Clothing Store,Coffee Shop,Department Store,...,Ice Cream Shop,Italian Restaurant,Plaza,Restaurant,Seafood Restaurant,Shopping Mall,Steakhouse,Vegetarian / Vegan Restaurant,Whisky Bar,Wine Bar
0,Aucklandpark,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
1,"Birnam Park, Johannesburg",0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,...,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0
2,Johannesburg,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Melrose,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.166667,0.166667,...,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0
4,jhb_cbd,0.076923,0.076923,0.0,0.0,0.076923,0.076923,0.0,0.230769,0.0,...,0.076923,0.0,0.076923,0.0,0.076923,0.076923,0.0,0.076923,0.0,0.0


In [168]:
num_top_venues = 5

for hood in M1_nearby_venues_grouped['city']:
    print("----"+hood+"----")
    temp = M1_nearby_venues_grouped[M1_nearby_venues_grouped['city'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Aucklandpark----
                           venue  freq
0                       Wine Bar   1.0
1            American Restaurant   0.0
2                     Whisky Bar   0.0
3  Vegetarian / Vegan Restaurant   0.0
4                     Steakhouse   0.0


----Birnam Park, Johannesburg----
        venue  freq
0         Gym  0.33
1        Café  0.33
2  Restaurant  0.33
3       Hotel  0.00
4  Whisky Bar  0.00


----Johannesburg----
                           venue  freq
0                          Hotel   1.0
1             African Restaurant   0.0
2            American Restaurant   0.0
3                     Whisky Bar   0.0
4  Vegetarian / Vegan Restaurant   0.0


----Melrose----
              venue  freq
0        Whisky Bar  0.17
1    Clothing Store  0.17
2       Coffee Shop  0.17
3  Department Store  0.17
4  Greek Restaurant  0.17


----jhb_cbd----
                           venue  freq
0                    Coffee Shop  0.23
1             African Restaurant  0.08
2  Vegetarian / Vegan R

In [169]:
num_top_venues = 5

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['city']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError: # no special suffix beyond 'rd', fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
M1_nearby_venues_sorted = pd.DataFrame(columns=columns)
M1_nearby_venues_sorted['city'] = M1_nearby_venues_grouped['city']

for ind in np.arange(M1_nearby_venues_grouped.shape[0]):
    M1_nearby_venues_sorted.iloc[ind, 1:] = return_most_common_venues(M1_nearby_venues_grouped.iloc[ind, :], num_top_venues)

M1_nearby_venues_sorted.head()

Unnamed: 0,city,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Aucklandpark,Wine Bar,Whisky Bar,American Restaurant,Asian Restaurant,Bakery
1,"Birnam Park, Johannesburg",Restaurant,Café,Gym,Wine Bar,English Restaurant
2,Johannesburg,Hotel,Wine Bar,English Restaurant,American Restaurant,Asian Restaurant
3,Melrose,Greek Restaurant,Clothing Store,Italian Restaurant,Coffee Shop,Department Store
4,jhb_cbd,Coffee Shop,African Restaurant,Vegetarian / Vegan Restaurant,American Restaurant,Shopping Mall


In [170]:
venues_map_M1 = folium.Map(location=[latitude, longitude], zoom_start=17) # generate map centred around Moyo Melrose Arch

# add a red circle marker to represent Moyo Melrose Arch 
folium.features.CircleMarker(
    [-26.133655, 28.0676007],
    radius=10,
    color='red',
    popup='Moyo Melrose Arch',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map_M1)

# add the venues as blue circle markers
for lat, lng, label in zip(dataframe_filtered_M1.lat, dataframe_filtered_M1.lng, dataframe_filtered_M1.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map_M1)

# display map
venues_map_M1

In [171]:
# Let's find the coordinates of the Moyo Zoo Lake restaurant
address = '1 Prince of Wales Drive Johannesburg South Africa'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

-26.2399844 28.0295006


In [172]:
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    latitude, 
    longitude, 
    VERSION, 
    radius, 
    LIMIT)
url

'https://api.foursquare.com/v2/venues/explore?client_id=MBYOYZAVJU2LAMDJVQFLFZWKAQG4WVN0M1D3TQXTHNRHH0BC&client_secret=FSLIHDA5HZZWR323RW5BUFG3JWHCYC152SS30GICKQID0Y2L&ll=-26.2399844,28.0295006&v=20180604&radius=500&limit=30'

In [173]:
results = requests.get(url).json()
'There are {} venues around Moyo Zoo Lake restaurant.'.format(len(results['response']['groups'][0]['items']))

'There are 1 venues around Moyo Zoo Lake restaurant.'

In [174]:
items_M2 = results['response']['groups'][0]['items']
items_M2[0]

{'reasons': {'count': 0,
  'items': [{'reasonName': 'globalInteractionReason',
    'summary': 'This spot is popular',
    'type': 'general'}]},
 'referralId': 'e-0-4efeede461af974bcd291a48-0',
 'venue': {'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/cafe_',
     'suffix': '.png'},
    'id': '4bf58dd8d48988d16d941735',
    'name': 'Café',
    'pluralName': 'Cafés',
    'primary': True,
    'shortName': 'Café'}],
  'id': '4efeede461af974bcd291a48',
  'location': {'address': 'Southdale',
   'cc': 'ZA',
   'city': 'Johannesburg South',
   'country': 'iNingizimu Afrika',
   'distance': 444,
   'formattedAddress': ['Southdale',
    'Johannesburg south',
    'iNingizimu Afrika'],
   'labeledLatLngs': [{'label': 'display',
     'lat': -26.242385369435585,
     'lng': 28.02594194965284}],
   'lat': -26.242385369435585,
   'lng': 28.02594194965284,
   'state': 'IGauteng'},
  'name': 'Cafe mino',
  'photos': {'count': 0, 'groups': []}}}
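
The `distance` value of 444 reported for this venue can be cross-checked against the coordinates. The following is a minimal sketch, not part of the original analysis: `haversine_m` is a hypothetical helper that computes the great-circle distance, assuming a spherical Earth with a mean radius of 6,371 km.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lng1, lat2, lng2, earth_radius_m=6371000.0):
    """Great-circle distance in metres between two (lat, lng) points (hypothetical helper)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))

# geocoded query point and the single venue returned, both copied from the outputs above
query_point = (-26.2399844, 28.0295006)
venue_point = (-26.242385369435585, 28.02594194965284)

distance_m = haversine_m(*query_point, *venue_point)
print(round(distance_m))
```

The result should land close to the distance reported by Foursquare, which confirms that the venue was measured from the geocoded point rather than from some other reference location.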

In [175]:
# transform venues into a dataframe
dataframe_M2 = json_normalize(items_M2) # flatten JSON

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# keep only columns that include venue name, and anything that is associated with location
filtered_columns_M2 = ['venue.name', 'venue.categories'] + [col for col in dataframe_M2.columns if col.startswith('venue.location.')] + ['venue.id']
dataframe_filtered_M2 = dataframe_M2.loc[:, filtered_columns_M2].copy()

# filter the category for each row
dataframe_filtered_M2['venue.categories'] = dataframe_filtered_M2.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered_M2.columns = [col.split('.')[-1] for col in dataframe_filtered_M2.columns]

dataframe_filtered_M2.head()



Unnamed: 0,name,categories,address,cc,city,country,distance,formattedAddress,labeledLatLngs,lat,lng,state,id
0,Cafe mino,Café,Southdale,ZA,Johannesburg South,iNingizimu Afrika,444,"[Southdale, Johannesburg south, iNingizimu Afr...","[{'lng': 28.02594194965284, 'label': 'display'...",-26.242385,28.025942,IGauteng,4efeede461af974bcd291a48


In [176]:
venues_map_M2 = folium.Map(location=[latitude, longitude], zoom_start=15) # generate map centred around Moyo Zoo Lake

# add a red circle marker to represent Moyo Zoo Lake
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Moyo Zoo Lake',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map_M2)

# add the venues as blue circle markers
for lat, lng, label in zip(dataframe_filtered_M2.lat, dataframe_filtered_M2.lng, dataframe_filtered_M2.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map_M2)

# display map
venues_map_M2

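Note: Foursquare returns only one venue within 500 m of the geocoded Moyo Zoo Lake coordinates, which is not enough data to analyze. That venue is listed in Southdale, Johannesburg South, rather than Parkview, so the geocoder may have resolved the address to a different location than the restaurant. We will therefore proceed with Moyo Melrose Arch only, joining its most-common-venue table with the Epicure table on the city column.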

In [177]:
pd.merge(epicure_nearby_venues_sorted, M1_nearby_venues_sorted, on='city', how='inner')

Unnamed: 0,city,1st Most Common Venue_x,2nd Most Common Venue_x,3rd Most Common Venue_x,4th Most Common Venue_x,5th Most Common Venue_x,1st Most Common Venue_y,2nd Most Common Venue_y,3rd Most Common Venue_y,4th Most Common Venue_y,5th Most Common Venue_y
0,jhb_cbd,Grocery Store,Gym,Toy / Game Store,Sushi Restaurant,Bakery,Coffee Shop,African Restaurant,Vegetarian / Vegan Restaurant,American Restaurant,Shopping Mall
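
An inner join keeps only the city values present in both tables, which is why a single row (jhb_cbd) survives above. A small sketch with made-up rows illustrates this; the `left` and `right` frames, the placeholder city names and the `indicator=True` outer join are illustrative assumptions, not the notebook's data.

```python
import pandas as pd

# toy stand-ins for the two "most common venue" tables
left = pd.DataFrame({'city': ['left_only_city', 'jhb_cbd'],
                     '1st Most Common Venue': ['Coffee Shop', 'Grocery Store']})
right = pd.DataFrame({'city': ['right_only_city', 'jhb_cbd'],
                      '1st Most Common Venue': ['Greek Restaurant', 'Coffee Shop']})

# inner join: only cities found in both tables remain
inner = pd.merge(left, right, on='city', how='inner')

# outer join with an indicator column shows which rows were unmatched
outer = pd.merge(left, right, on='city', how='outer', indicator=True)

print(inner)
print(outer[['city', '_merge']])
```

The `_x` and `_y` suffixes in the merged output come from pandas disambiguating columns that share a name in both tables.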


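Next, we will look for clusters of Johannesburg neighborhoods whose 1st, 2nd and 3rd most common venues are similar to those found around the benchmark restaurants (jhb_cbd is the only city value shared by both tables).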

## 3- Johannesburg neighborhood review 

In [178]:
import pandas as pd

Data = {'Neighborhood' : ['Diepsloot', 'Kya Sand', 'Northgate', 'Constantia Kloof', 'Roodepoort', 'Dobsonville', 'Soweto', 'Protea Glen', 'Lenasia',
             'Ennerdale', 'Orange farm', 'Midrand', 'Fourways', 'Sunninghill', 'woodmead', 'Strijdom Park', 'Randburg', 'Sandton', 
             'Northcliff', 'Rosebank', 'Parktown', 'CDB', 'Meadowlands', 'Diepkloof', 'Aeroton', 'South gate', 'Ivory Park', 'Wynberg', 
             'Alexandra', 'Bruma'],
'lattitude' : [-25.9301490, -26.0291190, -26.0526100, -26.1422500, -26.1422500, -26.2216400, -26.2485370, -26.2685910, -26.3760200, 
              -26.4085100, -26.4922400, -25.9991800, -26.0044900, -26.0378600, -26.0457300, -26.0744000, -26.1438410, -26.1075670, 
              -26.1404000, -26.1447800, -26.1753010, -26.2014500, -26.2207000, -26.2586000, -26.2577710, -26.0095800, -26.0023600, 
              -26.1139000, -26.1095600, -26.1760800],
'longitude' : [28.0115200, 27.9488600, 27.9511600, 27.8994100, 27.8994100, 27.8623800, 27.8540330, 27.8158210, 27.8783800, 27.8460200, 
               27.8771300, 28.1262930, 28.0105300, 28.0739500, 28.0816900, 27.9746900, 27.9951860, 28.0567020, 27.9769200, 28.0397600, 
               28.0404400, 28.0454900, 27.9010000, 27.9565900, 27.9769710, 27.9855700, 28.1887900, 28.0873700, 28.0958100, 28.1091400]
       }

jhb_df_2 = pd.DataFrame(Data, columns=['Neighborhood', 'lattitude', 'longitude'])

jhb_df_2


Unnamed: 0,Neighborhood,lattitude,longitude
0,Diepsloot,-25.930149,28.01152
1,Kya Sand,-26.029119,27.94886
2,Northgate,-26.05261,27.95116
3,Constantia Kloof,-26.14225,27.89941
4,Roodepoort,-26.14225,27.89941
5,Dobsonville,-26.22164,27.86238
6,Soweto,-26.248537,27.854033
7,Protea Glen,-26.268591,27.815821
8,Lenasia,-26.37602,27.87838
9,Ennerdale,-26.40851,27.84602


In [179]:
neighborhood_latitude = jhb_df_2.loc[0, 'lattitude'] # neighborhood latitude value
neighborhood_longitude = jhb_df_2.loc[0, 'longitude'] # neighborhood longitude value

neighborhood_name = jhb_df_2.loc[0, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Diepsloot are -25.930149, 28.01152.


In [180]:
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 500 # define radius
# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=MBYOYZAVJU2LAMDJVQFLFZWKAQG4WVN0M1D3TQXTHNRHH0BC&client_secret=FSLIHDA5HZZWR323RW5BUFG3JWHCYC152SS30GICKQID0Y2L&v=20180604&ll=-25.930149,28.01152&radius=500&limit=100'
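
Displaying the formatted URL also prints the API credentials. As a hedged alternative, the query string can be assembled from a parameter dictionary with the credentials read from the environment. In this sketch `build_explore_url` is a hypothetical helper and the environment variable names are assumptions; the endpoint and parameter names are the ones used in the URL above.

```python
import os
from urllib.parse import urlencode

BASE_URL = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(lat, lng, radius=500, limit=100, version='20180604'):
    """Assemble the explore URL from a parameter dict (hypothetical helper)."""
    params = {
        # assumed environment variable names; placeholders are used if they are unset
        'client_id': os.environ.get('FOURSQUARE_CLIENT_ID', 'YOUR_CLIENT_ID'),
        'client_secret': os.environ.get('FOURSQUARE_CLIENT_SECRET', 'YOUR_CLIENT_SECRET'),
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    # safe=',' keeps the comma in the ll parameter unescaped
    return BASE_URL + '?' + urlencode(params, safe=',')

url = build_explore_url(-25.930149, 28.01152)
```

With this approach the URL can be passed to `requests.get` without ever being echoed as cell output.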

In [181]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5c6ecd1f4434b957857dcf6c'},
 'response': {'groups': [{'items': [{'reasons': {'count': 0,
       'items': [{'reasonName': 'globalInteractionReason',
         'summary': 'This spot is popular',
         'type': 'general'}]},
      'referralId': 'e-0-4cc2b99fbe40a35d8233734c-0',
      'venue': {'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/shops/food_grocery_',
          'suffix': '.png'},
         'id': '4bf58dd8d48988d118951735',
         'name': 'Grocery Store',
         'pluralName': 'Grocery Stores',
         'primary': True,
         'shortName': 'Grocery Store'}],
       'id': '4cc2b99fbe40a35d8233734c',
       'location': {'cc': 'ZA',
        'country': 'iNingizimu Afrika',
        'distance': 383,
        'formattedAddress': ['iNingizimu Afrika'],
        'labeledLatLngs': [{'label': 'display',
          'lat': -25.933201024486834,
          'lng': 28.013299491606574}],
        'lat': -25.933201024486834,
        '

In [182]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [183]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Shoprite (Diepsloot),Grocery Store,-25.933201,28.013299


In [184]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
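
`getNearbyVenues` indexes `["response"]['groups'][0]` and `categories[0]` directly, so an error response or a venue without a category would raise an exception. A hedged sketch of a more defensive parsing step follows; `extract_venue_rows` is a hypothetical helper, and the second venue in the sample payload is made up to exercise the empty-category case.

```python
def extract_venue_rows(name, lat, lng, payload):
    """Turn one explore response into row tuples, tolerating missing pieces (hypothetical helper)."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories', [])
        # fall back to None when a venue carries no category
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows

# sample payload mirroring the response structure; the uncategorised venue is invented
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Shoprite (Diepsloot)',
               'location': {'lat': -25.933201, 'lng': 28.013299},
               'categories': [{'name': 'Grocery Store'}]}},
    {'venue': {'name': 'Uncategorised venue',
               'location': {'lat': -25.93, 'lng': 28.01},
               'categories': []}},
]}]}}

rows = extract_venue_rows('Diepsloot', -25.930149, 28.01152, sample)
empty = extract_venue_rows('Nowhere', 0.0, 0.0, {'meta': {'code': 400}})
```

Rows with a `None` category can then be dropped or kept explicitly, rather than aborting the whole loop.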

In [185]:
jhb_df_2_venues = getNearbyVenues(names=jhb_df_2['Neighborhood'],
                                   latitudes=jhb_df_2['lattitude'],
                                   longitudes=jhb_df_2['longitude']
                                  )

Diepsloot
Kya Sand
Northgate
Constantia Kloof
Roodepoort
Dobsonville
Soweto
Protea Glen
Lenasia
Ennerdale
Orange farm
Midrand
Fourways
Sunninghill
woodmead
Strijdom Park
Randburg
Sandton
Northcliff
Rosebank
Parktown
CDB
Meadowlands
Diepkloof
Aeroton
South gate
Ivory Park
Wynberg
Alexandra
Bruma


In [186]:
print(jhb_df_2_venues.shape)
jhb_df_2_venues.head()

(267, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Diepsloot,-25.930149,28.01152,Shoprite (Diepsloot),-25.933201,28.013299,Grocery Store
1,Kya Sand,-26.029119,27.94886,Wonderwall Climbing Gym,-26.02507,27.950431,Climbing Gym
2,Kya Sand,-26.029119,27.94886,Sandwich baron,-26.031253,27.950422,American Restaurant
3,Kya Sand,-26.029119,27.94886,Wetherlys - Randburg,-26.028327,27.945592,Furniture / Home Store
4,Kya Sand,-26.029119,27.94886,Mistry's Pine Furniture,-26.032186,27.946607,Furniture / Home Store


In [187]:
jhb_df_2_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Aeroton,1,1,1,1,1,1
Alexandra,3,3,3,3,3,3
Bruma,4,4,4,4,4,4
CDB,13,13,13,13,13,13
Constantia Kloof,1,1,1,1,1,1
Diepkloof,1,1,1,1,1,1
Diepsloot,1,1,1,1,1,1
Dobsonville,2,2,2,2,2,2
Ennerdale,2,2,2,2,2,2
Fourways,3,3,3,3,3,3


In [188]:
print('There are {} unique categories.'.format(len(jhb_df_2_venues['Venue Category'].unique())))

There are 100 unique categories.


In [189]:
# one hot encoding
jhb_df_2_onehot = pd.get_dummies(jhb_df_2_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
jhb_df_2_onehot['Neighborhood'] = jhb_df_2_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [jhb_df_2_onehot.columns[-1]] + list(jhb_df_2_onehot.columns[:-1])
jhb_df_2_onehot = jhb_df_2_onehot[fixed_columns]

jhb_df_2_onehot.head()

Unnamed: 0,Neighborhood,Afghan Restaurant,African Restaurant,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bakery,Bar,...,Snack Place,Soccer Field,Spa,Sporting Goods Shop,Steakhouse,Supermarket,Tapas Restaurant,Thai Restaurant,Train Station,Whisky Bar
0,Diepsloot,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Kya Sand,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Kya Sand,0,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Kya Sand,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Kya Sand,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [190]:
jhb_df_2_onehot.shape

(267, 101)

In [191]:
jhb_df_2_grouped = jhb_df_2_onehot.groupby('Neighborhood').mean().reset_index()
jhb_df_2_grouped

Unnamed: 0,Neighborhood,Afghan Restaurant,African Restaurant,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bakery,Bar,...,Snack Place,Soccer Field,Spa,Sporting Goods Shop,Steakhouse,Supermarket,Tapas Restaurant,Thai Restaurant,Train Station,Whisky Bar
0,Aeroton,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Alexandra,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Bruma,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,CDB,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Constantia Kloof,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Diepkloof,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Diepsloot,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Dobsonville,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Ennerdale,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Fourways,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
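
Averaging the one-hot rows per neighborhood gives, for each category, the share of that neighborhood's venues that fall in the category. The sketch below checks this on a small hand-built table (the four Bruma venue categories and the single Aeroton one are those reported in this notebook; the variable names are illustrative) and compares the result with `pd.crosstab(..., normalize='index')`, which is one alternative way to obtain the same frequencies.

```python
import pandas as pd

venues = pd.DataFrame({
    'Neighborhood': ['Bruma', 'Bruma', 'Bruma', 'Bruma', 'Aeroton'],
    'Venue Category': ['Gym / Fitness Center', 'Portuguese Restaurant',
                       'Shopping Mall', 'Fast Food Restaurant', 'Soccer Field'],
})

# one-hot encode the categories, then average per neighborhood
onehot = pd.get_dummies(venues['Venue Category']).astype(float)
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])
grouped = onehot.groupby('Neighborhood').mean()

# same frequencies computed directly as row-normalised counts
freq = pd.crosstab(venues['Neighborhood'], venues['Venue Category'], normalize='index')

print(grouped)
```

Because each neighborhood's row sums to 1, neighborhoods with very few venues get coarse frequencies such as 1.0 or 0.25, which is worth keeping in mind when comparing them with busier areas.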


In [192]:
jhb_df_2_grouped.shape

(26, 101)
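
The grouped table has 26 rows although 30 neighborhoods were queried, which suggests that four neighborhoods returned no venue within the 500 m radius and therefore dropped out of the analysis. A minimal sketch for listing such neighborhoods, using toy stand-ins for jhb_df_2 and jhb_df_2_venues (the rows are illustrative, not the notebook's results):

```python
import pandas as pd

# toy stand-ins: three queried neighborhoods, venues found for only two of them
queried = pd.DataFrame({'Neighborhood': ['A', 'B', 'C']})
venues = pd.DataFrame({'Neighborhood': ['A', 'A', 'C'],
                       'Venue': ['v1', 'v2', 'v3']})

# neighborhoods that never appear in the venues table returned nothing
has_venues = queried['Neighborhood'].isin(venues['Neighborhood'])
missing = queried.loc[~has_venues, 'Neighborhood'].tolist()

print(missing)
```

Neighborhoods found this way could be re-queried with a larger radius before concluding that they have no relevant venues.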

In [193]:
num_top_venues = 5

for hood in jhb_df_2_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = jhb_df_2_grouped[jhb_df_2_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Aeroton----
                     venue  freq
0             Soccer Field   1.0
1        Afghan Restaurant   0.0
2           Massage Studio   0.0
3  North Indian Restaurant   0.0
4                Nightclub   0.0


----Alexandra----
               venue  freq
0  Afghan Restaurant  0.33
1            Butcher  0.33
2              Hotel  0.33
3        Art Gallery  0.00
4     Ice Cream Shop  0.00


----Bruma----
                   venue  freq
0   Gym / Fitness Center  0.25
1  Portuguese Restaurant  0.25
2          Shopping Mall  0.25
3   Fast Food Restaurant  0.25
4      Afghan Restaurant  0.00


----CDB----
                   venue  freq
0   Fast Food Restaurant  0.31
1  Portuguese Restaurant  0.23
2                  Hotel  0.08
3         Scenic Lookout  0.08
4       Department Store  0.08


----Constantia Kloof----
                      venue  freq
0         Martial Arts Dojo   1.0
1         Afghan Restaurant   0.0
2  Mediterranean Restaurant   0.0
3           Other Nightlife   0.0
4   N

In [194]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [195]:
num_top_venues = 5

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = jhb_df_2_grouped['Neighborhood']

for ind in np.arange(jhb_df_2_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(jhb_df_2_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Aeroton,Soccer Field,Whisky Bar,Fast Food Restaurant,Convenience Store,Convention Center
1,Alexandra,Afghan Restaurant,Butcher,Hotel,American Restaurant,Flea Market
2,Bruma,Gym / Fitness Center,Shopping Mall,Portuguese Restaurant,Fast Food Restaurant,Furniture / Home Store
3,CDB,Fast Food Restaurant,Portuguese Restaurant,Music Venue,Hotel,Shopping Mall
4,Constantia Kloof,Martial Arts Dojo,Fast Food Restaurant,Convenience Store,Convention Center,Cosmetics Shop
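
Several entries in this table have a frequency of zero. Aeroton, for instance, has a single venue (Soccer Field), so its 2nd to 5th "most common" venues are simply whichever zero-frequency categories happen to sort first. One possible variant, sketched here as an assumption rather than the notebook's method, keeps only categories with a non-zero frequency and pads the rest with None. The name `return_most_common_venues_nonzero` is hypothetical.

```python
import pandas as pd

def return_most_common_venues_nonzero(row, num_top_venues):
    """Top categories with a frequency above zero, padded with None (hypothetical variant)."""
    freqs = pd.to_numeric(row.iloc[1:])  # skip the Neighborhood label
    # stable sort so that ties keep their original column order
    freqs = freqs[freqs > 0].sort_values(ascending=False, kind='mergesort')
    top = list(freqs.index[:num_top_venues])
    return top + [None] * (num_top_venues - len(top))

# toy row shaped like one row of jhb_df_2_grouped
row = pd.Series({'Neighborhood': 'Aeroton', 'Afghan Restaurant': 0.0,
                 'Soccer Field': 1.0, 'Whisky Bar': 0.0})
top3 = return_most_common_venues_nonzero(row, 3)
print(top3)
```

With this variant, sparse neighborhoods show their gaps explicitly instead of appearing to share venues such as Whisky Bar or Convention Center.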


In [196]:
import random 
import numpy as np 
import matplotlib.pyplot as plt 
from sklearn.cluster import KMeans 
%matplotlib inline

# set number of clusters
kclusters = 5

# drop the label column so that only the venue frequencies are clustered
jhb_df_2_grouped_clustering = jhb_df_2_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(jhb_df_2_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 0, 0, 1, 0, 4, 2, 3, 0], dtype=int32)
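
The choice of five clusters is not justified in the notebook. One common way to compare candidate values of k is the silhouette score. The sketch below runs it on synthetic data only: the random matrix `X` stands in for jhb_df_2_grouped_clustering, and the range of k values is an arbitrary assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# synthetic stand-in: 26 rows of 10 frequency-like columns
X = rng.random((26, 10))

scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    # silhouette ranges from -1 to 1; higher means better separated clusters
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On the real frequency table the scores would need to be read with care, since many neighborhoods have only one or two venues and tend to form singleton clusters.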

In [197]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster', kmeans.labels_)

jhb_df_2_merged = jhb_df_2

# merge to add latitude/longitude for each neighborhood
jhb_df_2_merged = jhb_df_2_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

jhb_df_2_merged.head() # check the last columns!

Unnamed: 0,Neighborhood,lattitude,longitude,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Diepsloot,-25.930149,28.01152,4.0,Grocery Store,Fast Food Restaurant,Convenience Store,Convention Center,Cosmetics Shop
1,Kya Sand,-26.029119,27.94886,0.0,Furniture / Home Store,American Restaurant,Sporting Goods Shop,Climbing Gym,Whisky Bar
2,Northgate,-26.05261,27.95116,0.0,Sporting Goods Shop,Bar,Flower Shop,Fast Food Restaurant,Whisky Bar
3,Constantia Kloof,-26.14225,27.89941,1.0,Martial Arts Dojo,Fast Food Restaurant,Convenience Store,Convention Center,Cosmetics Shop
4,Roodepoort,-26.14225,27.89941,1.0,Martial Arts Dojo,Fast Food Restaurant,Convenience Store,Convention Center,Cosmetics Shop
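
The Cluster column is displayed as a float (4.0, 0.0, ...). This is what pandas does when a join leaves missing values in an integer column, which would be the case for neighborhoods that returned no venues and so received no cluster label. A sketch of one way to handle this, on toy data with illustrative names:

```python
import pandas as pd

# toy stand-ins: coordinates for three neighborhoods, cluster labels for two
coords = pd.DataFrame({'Neighborhood': ['A', 'B', 'C'],
                       'lattitude': [-26.0, -26.1, -26.2],
                       'longitude': [28.0, 28.1, 28.2]})
labels = pd.DataFrame({'Neighborhood': ['A', 'C'], 'Cluster': [0, 4]})

# 'B' has no label, so the joined Cluster column becomes float with a NaN
merged = coords.join(labels.set_index('Neighborhood'), on='Neighborhood')

# keep only clustered neighborhoods and restore integer labels
clustered = merged.dropna(subset=['Cluster']).copy()
clustered['Cluster'] = clustered['Cluster'].astype(int)

print(clustered)
```

Integer labels are also convenient when the cluster number is later used to index a list of map colors.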


In [198]:
jhb_df_2_merged.loc[jhb_df_2_merged['Cluster'] == 0, jhb_df_2_merged.columns[[0] + list(range(4, jhb_df_2_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
1,Kya Sand,Furniture / Home Store,American Restaurant,Sporting Goods Shop,Climbing Gym,Whisky Bar
2,Northgate,Sporting Goods Shop,Bar,Flower Shop,Fast Food Restaurant,Whisky Bar
6,Soweto,Market,Gas Station,Department Store,Portuguese Restaurant,Shopping Mall
7,Protea Glen,Café,Pub,Business Service,Whisky Bar,Concert Hall
11,Midrand,Hotel,Gym / Fitness Center,Fried Chicken Joint,Cosmetics Shop,Portuguese Restaurant
12,Fourways,Convention Center,Soccer Field,Restaurant,Whisky Bar,Electronics Store
13,Sunninghill,Café,Cocktail Bar,Dance Studio,Whisky Bar,Concert Hall
15,Strijdom Park,Other Nightlife,Dog Run,Other Repair Shop,Print Shop,Road
16,Randburg,Indian Restaurant,Grocery Store,BBQ Joint,Italian Restaurant,Pool
17,Sandton,Hotel,Clothing Store,Coffee Shop,Fast Food Restaurant,Shopping Mall


In [199]:
jhb_df_2_merged.loc[jhb_df_2_merged['Cluster'] == 1, jhb_df_2_merged.columns[[0] + list(range(4, jhb_df_2_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
3,Constantia Kloof,Martial Arts Dojo,Fast Food Restaurant,Convenience Store,Convention Center,Cosmetics Shop
4,Roodepoort,Martial Arts Dojo,Fast Food Restaurant,Convenience Store,Convention Center,Cosmetics Shop


In [200]:
jhb_df_2_merged.loc[jhb_df_2_merged['Cluster'] == 2, jhb_df_2_merged.columns[[0] + list(range(4, jhb_df_2_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
5,Dobsonville,African Restaurant,Music Venue,Whisky Bar,Fast Food Restaurant,Convention Center
26,Ivory Park,Music Venue,Whisky Bar,Electronics Store,Convenience Store,Convention Center


In [201]:
jhb_df_2_merged.loc[jhb_df_2_merged['Cluster'] == 3, jhb_df_2_merged.columns[[0] + list(range(4, jhb_df_2_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
9,Ennerdale,Convenience Store,Whisky Bar,Fast Food Restaurant,Convention Center,Cosmetics Shop


In [202]:
jhb_df_2_merged.loc[jhb_df_2_merged['Cluster'] == 4, jhb_df_2_merged.columns[[0] + list(range(4, jhb_df_2_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Diepsloot,Grocery Store,Fast Food Restaurant,Convenience Store,Convention Center,Cosmetics Shop


## 4- Recommendations

In [203]:
benchmark1 = pd.concat([epicure_nearby_venues_sorted, M1_nearby_venues_sorted])
benchmark1 = benchmark1.drop('city', axis=1)
benchmark1

Unnamed: 0,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Indian Restaurant,Sushi Restaurant,Bakery,Chinese Restaurant,Coffee Shop
1,Portuguese Restaurant,Toy / Game Store,Hotel,Bakery,Chinese Restaurant
2,Juice Bar,Toy / Game Store,Hotel,Bakery,Chinese Restaurant
3,Coffee Shop,Toy / Game Store,Sushi Restaurant,Bakery,Chinese Restaurant
4,Bakery,Gastropub,Pharmacy,Toy / Game Store,Hotel
5,Seafood Restaurant,Toy / Game Store,Supermarket,Shopping Mall,Chinese Restaurant
6,Grocery Store,Gym,Toy / Game Store,Sushi Restaurant,Bakery
0,Wine Bar,Whisky Bar,American Restaurant,Asian Restaurant,Bakery
1,Restaurant,Café,Gym,Wine Bar,English Restaurant
2,Hotel,Wine Bar,English Restaurant,American Restaurant,Asian Restaurant


In [204]:
# count how often each venue category appears at each rank, then total across the five ranks
result = benchmark1.apply(pd.Series.value_counts).fillna(0)
result['total'] = result.iloc[:, 0:5].sum(axis=1)
result = result.sort_values(by='total', ascending=False)
result

Unnamed: 0,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,total
Bakery,1.0,0.0,1.0,3.0,2.0,7.0
Toy / Game Store,0.0,4.0,1.0,1.0,0.0,6.0
Chinese Restaurant,0.0,0.0,0.0,1.0,4.0,5.0
Hotel,1.0,0.0,2.0,0.0,1.0,4.0
Coffee Shop,2.0,0.0,0.0,1.0,1.0,4.0
Sushi Restaurant,0.0,1.0,1.0,1.0,0.0,3.0
American Restaurant,0.0,0.0,1.0,2.0,0.0,3.0
Wine Bar,1.0,1.0,0.0,1.0,0.0,3.0
Asian Restaurant,0.0,0.0,0.0,1.0,1.0,2.0
Gym,0.0,1.0,1.0,0.0,0.0,2.0


### The most common venues in the ideal neighborhood are Bakery, Toy / Game Store, Chinese Restaurant, Hotel and Coffee Shop

In [205]:
jhb_df_2_merged.loc[jhb_df_2_merged['Cluster'] == 0, jhb_df_2_merged.columns[[0] + list(range(4, jhb_df_2_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
1,Kya Sand,Furniture / Home Store,American Restaurant,Sporting Goods Shop,Climbing Gym,Whisky Bar
2,Northgate,Sporting Goods Shop,Bar,Flower Shop,Fast Food Restaurant,Whisky Bar
6,Soweto,Market,Gas Station,Department Store,Portuguese Restaurant,Shopping Mall
7,Protea Glen,Café,Pub,Business Service,Whisky Bar,Concert Hall
11,Midrand,Hotel,Gym / Fitness Center,Fried Chicken Joint,Cosmetics Shop,Portuguese Restaurant
12,Fourways,Convention Center,Soccer Field,Restaurant,Whisky Bar,Electronics Store
13,Sunninghill,Café,Cocktail Bar,Dance Studio,Whisky Bar,Concert Hall
15,Strijdom Park,Other Nightlife,Dog Run,Other Repair Shop,Print Shop,Road
16,Randburg,Indian Restaurant,Grocery Store,BBQ Joint,Italian Restaurant,Pool
17,Sandton,Hotel,Clothing Store,Coffee Shop,Fast Food Restaurant,Shopping Mall
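
The notebook does not show how the neighborhoods in this cluster were compared with the benchmark categories. One possible approach, offered only as an illustration and not as the method actually used, is to count how many of a neighborhood's top five venues appear among the benchmark categories. The three rows below are copied from the table above; `overlap_score` is a hypothetical helper.

```python
# benchmark categories taken from the top of the totals table
benchmark = {'Bakery', 'Toy / Game Store', 'Chinese Restaurant', 'Hotel', 'Coffee Shop'}

# top five venues for three cluster 0 neighborhoods, copied from the table above
top_venues = {
    'Sandton': ['Hotel', 'Clothing Store', 'Coffee Shop', 'Fast Food Restaurant', 'Shopping Mall'],
    'Midrand': ['Hotel', 'Gym / Fitness Center', 'Fried Chicken Joint', 'Cosmetics Shop', 'Portuguese Restaurant'],
    'Randburg': ['Indian Restaurant', 'Grocery Store', 'BBQ Joint', 'Italian Restaurant', 'Pool'],
}

def overlap_score(venues, benchmark_categories):
    """Number of a neighborhood's top venues found in the benchmark set (hypothetical score)."""
    return sum(1 for venue in venues if venue in benchmark_categories)

scores = {hood: overlap_score(venues, benchmark) for hood, venues in top_venues.items()}
ranking = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(ranking)
```

Such a score says nothing about rent, footfall or competition, and Sandton is where the current restaurant already sits, so it would serve only as a first filter.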


### The recommended areas for the second Epicure restaurant are:
### 1- Rosebank
### 2- Parktown
### 3- Midrand