## Market Research for setting up a new Asian-Mediterranean food market store
#### The target location is one of the major boroughs of New York City. This research aims to identify a location that is practical for starting the new business

#### Importing all necessary libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!python -m pip install folium
import folium

print('Libraries imported.')

Libraries imported.


### I. Gather information about restaurants in New York City
We want to gather data on Asia Pacific dining activity in and around New York City in order to find a suitable location for an Asian food market. The market will cater to a wide range of cuisines, from the Middle East all the way to Japan, including every country in the continent.

To do this, we gather Asia Pacific restaurant information across three boroughs of NYC and use it to determine a good location for our store. Since patrons who visit Asia Pacific restaurants tend to like Asian food, a store close to these restaurant clusters could be a great place to start the business.

#### 1. Gathering NYC borough coordinates

We select only the Manhattan, Brooklyn and Queens boroughs of NYC, which is where we want to set up our new venture.

In [2]:
# Only the following boroughs are being explored
nyc_boroughs = ["Manhattan","Brooklyn","Queens"]

##### FourSquare and Opencagedata API keys and tokens

In [42]:
# Hidden keys for APIs (this cell defines CLIENT_ID, CLIENT_SECRET, VERSION and OPENCAGEAPIKEY, which are used below)

#### Fetch the borough coordinates from Opencagedata API

In [4]:
nyc_borough_coordinates = []
for city in nyc_boroughs:
    url = "https://api.opencagedata.com/geocode/v1/json?key={}&q={}%2C%20NY&pretty=1".format(OPENCAGEAPIKEY, city)
    result = requests.get(url).json()["results"][0]["geometry"]
    nyc_borough_coordinates.append([city, result["lat"], result["lng"]])


nyc_borough_coord_pd = pd.DataFrame(nyc_borough_coordinates, columns=["City","Latitude","Longitude"])
nyc_borough_coord_pd

Unnamed: 0,City,Latitude,Longitude
0,Manhattan,40.789624,-73.959894
1,Brooklyn,40.650104,-73.949582
2,Queens,40.749824,-73.797634
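
The geocoding cell above builds its request URL by hand, with `%2C%20` hard-coded for the comma and space. A minimal sketch of letting the standard library do the percent-encoding instead, which also copes with multi-word place names, is shown here. The helper name `build_geocode_url`, the placeholder key and the "Staten Island" input are illustrative assumptions and not part of the original notebook.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the request URL used in the notebook
OPENCAGE_ENDPOINT = "https://api.opencagedata.com/geocode/v1/json"

def build_geocode_url(api_key, place, state="NY"):
    """Return a geocoding URL with the query text percent-encoded."""
    params = {"key": api_key, "q": "{}, {}".format(place, state)}
    # quote_via=quote encodes spaces as %20, matching the hand-written URL
    return "{}?{}".format(OPENCAGE_ENDPOINT, urlencode(params, quote_via=quote))

# Hypothetical inputs, for illustration only
url = build_geocode_url("YOUR_API_KEY", "Staten Island")
print(url)
```

The resulting string can be passed to `requests.get` exactly as in the cell above.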


#### 2. Gather all restaurant categories from NYC neighborhoods

In [5]:
radius = 10000
LIMIT = 10000
food_category_id = '4d4b7105d754a06374d81259'

In [6]:
# Get all restaurants in each borough in "Food" category
def getNearbyRestaurants(data, categoryId, radius=10000):
    
    restaurant_list=[]
    
    for city in data:
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&categoryId={}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            city[1], 
            city[2], 
            categoryId,
            radius, 
            LIMIT)

        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']

        # return only relevant information for each nearby venue
        restaurant_list.append([(
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['id'],
            v['venue']['categories'][0]['name'],
            v['venue']['categories'][0]['shortName'],) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in restaurant_list for item in venue_list])
    nearby_venues.columns = ["Name","Latitude","Longitude","Category Id","Category Name","Category Short Name"]
    
    return(nearby_venues)


## Get all restaurant categories
def getAllRestaurantCategories(location, categoryId, radius=5000):
    restaurant_cat_list=[]
    # create the API request URL
    url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&near={}&intent=global&categoryId={}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            location,
            categoryId,
            radius, 
            LIMIT)
            
    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']
    #return requests.get(url).json()
        
    # return only relevant information for each nearby venue
    restaurant_cat_list.append([( 
            v['venue']['categories'][0]['id'],
            v['venue']['categories'][0]['name'],
            v['venue']['categories'][0]['shortName'],) for v in results])

    restaurant_categories = pd.DataFrame([item for venue_list in restaurant_cat_list for item in venue_list])
    restaurant_categories.columns = ["Category Id","Category Name","Category Short Name"]
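    # Note: drop_duplicates() returns a new DataFrame; its result is not assigned here,
    # so the frame returned below still holds one row per venue
    restaurant_categories.drop_duplicates()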
    
    return(restaurant_categories)

#### Fetching restaurants for all 3 boroughs

In [7]:
food_category_id = '4d4b7105d754a06374d81259'
nyc_restaurant_data = getNearbyRestaurants(nyc_borough_coordinates, food_category_id)

In [8]:
nyc_restaurant_data.head()

Unnamed: 0,Name,Latitude,Longitude,Category Id,Category Name,Category Short Name
0,Da Capo,40.787679,-73.953899,4bf58dd8d48988d16d941735,Café,Café
1,Earl's Beer & Cheese,40.787331,-73.951725,4bf58dd8d48988d14e941735,American Restaurant,American
2,Marinara Pizza Upper East,40.782538,-73.953359,4bf58dd8d48988d1ca941735,Pizza Place,Pizza
3,Dig Inn,40.780332,-73.954728,4bf58dd8d48988d14e941735,American Restaurant,American
4,Levain Bakery,40.777354,-73.955284,4bf58dd8d48988d16a941735,Bakery,Bakery


#### Fetching all restaurant categories in NYC neighborhoods

In [9]:
restaurant_categories = getAllRestaurantCategories("New York, NY",categoryId=food_category_id)

In [10]:
print("Total number of restaurant categories found in NYC:", restaurant_categories.shape[0])
restaurant_categories.head()

Total number of restaurant categories found in NYC: 100


Unnamed: 0,Category Id,Category Name,Category Short Name
0,4bf58dd8d48988d111941735,Japanese Restaurant,Japanese
1,4bf58dd8d48988d1d2941735,Sushi Restaurant,Sushi
2,4bf58dd8d48988d148941735,Donut Shop,Donuts
3,4bf58dd8d48988d14e941735,American Restaurant,American
4,4bf58dd8d48988d1bd941735,Salad Place,Salad
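
As written, `getAllRestaurantCategories` appears to return one row per venue, so the same category can occur several times and the printed total counts venue rows rather than distinct categories. The following is a small sketch of de-duplicating such records while keeping first-seen order, using only the standard library. The helper name `unique_categories` is an assumption, and the third sample row is a made-up repeat added to show the effect.

```python
def unique_categories(records):
    """Return category records with duplicates removed, keeping first-seen order."""
    # dict keys are unique and preserve insertion order
    return list(dict.fromkeys(records))

# The first two rows come from the output above; the third is an illustrative repeat
venue_categories = [
    ("4bf58dd8d48988d111941735", "Japanese Restaurant", "Japanese"),
    ("4bf58dd8d48988d1d2941735", "Sushi Restaurant", "Sushi"),
    ("4bf58dd8d48988d111941735", "Japanese Restaurant", "Japanese"),
]

unique = unique_categories(venue_categories)
print(len(venue_categories), "venue rows,", len(unique), "distinct categories")
```

With pandas, the equivalent step would be to assign the result of `drop_duplicates()` back to a variable.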


In [11]:
restaurant_categories[["Category Short Name"]].head()

Unnamed: 0,Category Short Name
0,Japanese
1,Sushi
2,Donuts
3,American
4,Salad


### 3. Data Analysis

#### Let's explore the data acquired. Let's prepare the data to match our criteria and extract features for our decision making

#### Filter all the restaurant categories to retrieve only Asian/Mediterranean types

In [12]:
#All restaurant types
restaurant_types = [x.lower() for x in list(restaurant_categories["Category Short Name"].unique())]

In [13]:
##  Asian and Mediterranean restaurant types, filtered from all restaurant types.
##  Title case is used so the values match the "Category Short Name" column exactly.
ap_restaurant_types = ['Chinese','Falafel','Greek','Japanese','Korean','Lebanese','Mediterranean','Moroccan',
 'Seafood','Sushi','Szechuan','Thai','Udon','Vegetarian / Vegan']

#### Filtering NYC restaurants to only Asian/Mediterranean restaurants. The list also includes the Vegetarian/Vegan and Seafood categories, since the store carries inventory specific to these types

In [14]:
ap_nyc_restaurant_data = nyc_restaurant_data[nyc_restaurant_data["Category Short Name"].isin(ap_restaurant_types)].reset_index(drop=True)
print("Total restaurants that determine our store's location:",ap_nyc_restaurant_data.shape)
ap_nyc_restaurant_data.head()


Total restaurants that determine our store's location: (57, 6)


Unnamed: 0,Name,Latitude,Longitude,Category Id,Category Name,Category Short Name
0,The Mermaid Inn,40.788744,-73.974243,4bf58dd8d48988d1ce941735,Seafood Restaurant,Seafood
1,PuTawn Local Thai Kitchen,40.774599,-73.951042,4bf58dd8d48988d149941735,Thai Restaurant,Thai
2,Sala Thai,40.780124,-73.980475,4bf58dd8d48988d149941735,Thai Restaurant,Thai
3,Up Thai,40.769898,-73.957598,4bf58dd8d48988d149941735,Thai Restaurant,Thai
4,Tanoshi Sushi,40.767747,-73.953203,4bf58dd8d48988d1d2941735,Sushi Restaurant,Sushi
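
The filter above relies on `isin`, which compares strings exactly, so the category names in the list must match the capitalization of the "Category Short Name" column. A hedged sketch of a case-insensitive comparison in plain Python follows. The helper `matches_category` and the sample values are assumptions for illustration only.

```python
def matches_category(short_name, wanted):
    """True if short_name is in wanted, ignoring case and surrounding whitespace."""
    wanted_normalized = {w.strip().lower() for w in wanted}
    return short_name.strip().lower() in wanted_normalized

# Illustrative inputs: a lowercase wanted list and mixed-case category names
wanted_types = ["chinese", "thai", "vegetarian / vegan"]
sample_short_names = ["Thai", "American", "Vegetarian / Vegan", "Pizza"]

kept = [name for name in sample_short_names if matches_category(name, wanted_types)]
print(kept)
```

The same idea could be applied to the DataFrame by lowercasing the column with `.str.lower()` before calling `isin`.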


### Setting up map modules to visualize restaurant clusters

In [15]:
# create map of NYC using latitude and longitude values

def mapIt(data, latitude, longitude, zoom_start = 10):
    
    city_map = folium.Map(location=[latitude, longitude], zoom_start=zoom_start)

    # add markers to map
    for lat, lng, name, category in zip(data['Latitude'], data['Longitude'], data['Name'], data['Category Short Name']):
        label = '{}, {}'.format(category, name)
        label = folium.Popup(label, parse_html=True)
        folium.CircleMarker(
            [lat, lng],
            radius=5,
            popup=label,
            color='blue',
            fill=True,
            fill_color='#3186cc',
            fill_opacity=0.7,
            parse_html=False).add_to(city_map)  

    return city_map

#### Plot restaurants on the map to visualize the distribution

In [16]:
nyc_map = mapIt(ap_nyc_restaurant_data, nyc_borough_coordinates[0][1], nyc_borough_coordinates[0][2], zoom_start=11)
nyc_map

In [17]:
ap_nyc_restaurant_data.head()

Unnamed: 0,Name,Latitude,Longitude,Category Id,Category Name,Category Short Name
0,The Mermaid Inn,40.788744,-73.974243,4bf58dd8d48988d1ce941735,Seafood Restaurant,Seafood
1,PuTawn Local Thai Kitchen,40.774599,-73.951042,4bf58dd8d48988d149941735,Thai Restaurant,Thai
2,Sala Thai,40.780124,-73.980475,4bf58dd8d48988d149941735,Thai Restaurant,Thai
3,Up Thai,40.769898,-73.957598,4bf58dd8d48988d149941735,Thai Restaurant,Thai
4,Tanoshi Sushi,40.767747,-73.953203,4bf58dd8d48988d1d2941735,Sushi Restaurant,Sushi


In [18]:
ap_nyc_restaurant_data["Value"] = 1

In [19]:
#ap_nyc_restaurant_data.pivot(index = ["Name","Latitude","Longitude","Category Id"],columns="Category Short Name", values = "Value")

In [20]:
ap_nyc_restaurant_grouped = ap_nyc_restaurant_data[["Category Short Name","Value"]].groupby('Category Short Name').sum().reset_index()
ap_nyc_restaurant_grouped.head()

Unnamed: 0,Category Short Name,Value
0,Chinese,4
1,Greek,7
2,Japanese,9
3,Korean,7
4,Mediterranean,5


In [21]:
ap_nyc_restaurant_data.reset_index().head()

Unnamed: 0,index,Name,Latitude,Longitude,Category Id,Category Name,Category Short Name,Value
0,0,The Mermaid Inn,40.788744,-73.974243,4bf58dd8d48988d1ce941735,Seafood Restaurant,Seafood,1
1,1,PuTawn Local Thai Kitchen,40.774599,-73.951042,4bf58dd8d48988d149941735,Thai Restaurant,Thai,1
2,2,Sala Thai,40.780124,-73.980475,4bf58dd8d48988d149941735,Thai Restaurant,Thai,1
3,3,Up Thai,40.769898,-73.957598,4bf58dd8d48988d149941735,Thai Restaurant,Thai,1
4,4,Tanoshi Sushi,40.767747,-73.953203,4bf58dd8d48988d1d2941735,Sushi Restaurant,Sushi,1


In [22]:
ap_nyc_restaurant_data.head()

Unnamed: 0,Name,Latitude,Longitude,Category Id,Category Name,Category Short Name,Value
0,The Mermaid Inn,40.788744,-73.974243,4bf58dd8d48988d1ce941735,Seafood Restaurant,Seafood,1
1,PuTawn Local Thai Kitchen,40.774599,-73.951042,4bf58dd8d48988d149941735,Thai Restaurant,Thai,1
2,Sala Thai,40.780124,-73.980475,4bf58dd8d48988d149941735,Thai Restaurant,Thai,1
3,Up Thai,40.769898,-73.957598,4bf58dd8d48988d149941735,Thai Restaurant,Thai,1
4,Tanoshi Sushi,40.767747,-73.953203,4bf58dd8d48988d1d2941735,Sushi Restaurant,Sushi,1


In [23]:
ap_nyc_restaurant_data2 = ap_nyc_restaurant_data[["Name","Latitude","Longitude"]].reset_index()
ap_nyc_restaurant_data2.head()

Unnamed: 0,index,Name,Latitude,Longitude
0,0,The Mermaid Inn,40.788744,-73.974243
1,1,PuTawn Local Thai Kitchen,40.774599,-73.951042
2,2,Sala Thai,40.780124,-73.980475
3,3,Up Thai,40.769898,-73.957598
4,4,Tanoshi Sushi,40.767747,-73.953203


In [24]:
# Creating a dataframe with only the coordinates to run K Means clustering
nyc_grouped_clustering = ap_nyc_restaurant_data[["Latitude","Longitude"]].copy()

In [25]:
nyc_grouped_clustering.head()

Unnamed: 0,Latitude,Longitude
0,40.788744,-73.974243
1,40.774599,-73.951042
2,40.780124,-73.980475
3,40.769898,-73.957598
4,40.767747,-73.953203


### 4. K Means Clustering
#### Let's create clusters of the restaurants scattered across the boroughs and label them

In [26]:
# set number of clusters
kclusters = 4

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(nyc_grouped_clustering)

print(kmeans.cluster_centers_)
# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:] 

[[ 40.76515813 -73.93195378]
 [ 40.74758713 -73.8033496 ]
 [ 40.75807782 -73.98640458]
 [ 40.67501736 -73.97174628]]


array([2, 0, 2, 0, 0, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 2, 0, 0, 2, 2, 2, 2,
       2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int32)

In [27]:
nyc_grouped_clustering.insert(0, 'Cluster Labels', kmeans.labels_)
nyc_grouped_clustering.reset_index().head()
#nyc_grouped_clustering.head()

Unnamed: 0,index,Cluster Labels,Latitude,Longitude
0,0,2,40.788744,-73.974243
1,1,0,40.774599,-73.951042
2,2,2,40.780124,-73.980475
3,3,0,40.769898,-73.957598
4,4,0,40.767747,-73.953203


In [28]:
ap_nyc_restaurant_data_clustered = ap_nyc_restaurant_data.reset_index().join(nyc_grouped_clustering.reset_index().set_index("index"), on="index", lsuffix='_left', rsuffix='_right')
ap_nyc_restaurant_data_clustered.head()


Unnamed: 0,index,Name,Latitude_left,Longitude_left,Category Id,Category Name,Category Short Name,Value,Cluster Labels,Latitude_right,Longitude_right
0,0,The Mermaid Inn,40.788744,-73.974243,4bf58dd8d48988d1ce941735,Seafood Restaurant,Seafood,1,2,40.788744,-73.974243
1,1,PuTawn Local Thai Kitchen,40.774599,-73.951042,4bf58dd8d48988d149941735,Thai Restaurant,Thai,1,0,40.774599,-73.951042
2,2,Sala Thai,40.780124,-73.980475,4bf58dd8d48988d149941735,Thai Restaurant,Thai,1,2,40.780124,-73.980475
3,3,Up Thai,40.769898,-73.957598,4bf58dd8d48988d149941735,Thai Restaurant,Thai,1,0,40.769898,-73.957598
4,4,Tanoshi Sushi,40.767747,-73.953203,4bf58dd8d48988d1d2941735,Sushi Restaurant,Sushi,1,0,40.767747,-73.953203


In [29]:
ap_nyc_restaurant_data_clustered = ap_nyc_restaurant_data_clustered.drop(["Latitude_right","Longitude_right", "Value","index"], axis=1).reset_index(drop=True)
ap_nyc_restaurant_data_clustered.columns=["Name","Latitude","Longitude","CategoryId","CategoryName","CategoryShortName","ClusterLabel"]
ap_nyc_restaurant_data_clustered.head()

Unnamed: 0,Name,Latitude,Longitude,CategoryId,CategoryName,CategoryShortName,ClusterLabel
0,The Mermaid Inn,40.788744,-73.974243,4bf58dd8d48988d1ce941735,Seafood Restaurant,Seafood,2
1,PuTawn Local Thai Kitchen,40.774599,-73.951042,4bf58dd8d48988d149941735,Thai Restaurant,Thai,0
2,Sala Thai,40.780124,-73.980475,4bf58dd8d48988d149941735,Thai Restaurant,Thai,2
3,Up Thai,40.769898,-73.957598,4bf58dd8d48988d149941735,Thai Restaurant,Thai,0
4,Tanoshi Sushi,40.767747,-73.953203,4bf58dd8d48988d1d2941735,Sushi Restaurant,Sushi,0


In [30]:
# create map
map_clusters = folium.Map(location=[nyc_borough_coordinates[0][1], nyc_borough_coordinates[0][2]], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(ap_nyc_restaurant_data_clustered['Latitude'], ap_nyc_restaurant_data_clustered['Longitude'], ap_nyc_restaurant_data_clustered['Name'], ap_nyc_restaurant_data_clustered['ClusterLabel']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color='brown',
        fill=True,
        fill_color=rainbow[cluster],  # labels run from 0 to kclusters-1, so index directly
        fill_opacity=0.7).add_to(map_clusters)



       
map_clusters

In [31]:
centroids = kmeans.cluster_centers_
counter = 0
for center in zip(centroids):
        label = '{} {}'.format("Cluster" + str(counter), "Center" )
        label = folium.Popup(label, parse_html=True)
        lat = center[0][0]
        lng = center[0][1]
        counter += 1
        folium.CircleMarker(
            [lat, lng],
            radius=5,
            popup=label,
            color='red',
            fill=True,
            fill_color='yellow',
            fill_opacity=0.7,
            parse_html=False).add_to(map_clusters)  

map_clusters

#### Calculating center of the centroids

In [32]:
len(centroids)

4

In [33]:
lx, ly = 0,0
for center in centroids:
    lx += center[0]
    ly += center[1]
lx = lx/kclusters
ly = ly/kclusters
print(lx, ly)

40.73646010852825 -73.92336356073892
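
The loop above takes the plain average of the four centroids, so every cluster counts equally regardless of how many restaurants it contains. The sketch below reproduces that average and, as an optional variation that is not part of the original analysis, also computes an average weighted by cluster size. The centroid values are copied from the K-means output, and the counts (9, 20, 15 and 13 for clusters 0 to 3) are tallied from the label array printed earlier. The helper name `mean_point` is an assumption.

```python
# Centroids as printed by kmeans.cluster_centers_ (latitude, longitude)
centroids = [
    (40.76515813, -73.93195378),
    (40.74758713, -73.8033496),
    (40.75807782, -73.98640458),
    (40.67501736, -73.97174628),
]
# Restaurants per cluster 0..3, tallied from kmeans.labels_
counts = [9, 20, 15, 13]

def mean_point(points, weights=None):
    """Return the (optionally weighted) mean latitude and longitude of points."""
    if weights is None:
        weights = [1] * len(points)
    total = sum(weights)
    lat = sum(w * p[0] for p, w in zip(points, weights)) / total
    lng = sum(w * p[1] for p, w in zip(points, weights)) / total
    return lat, lng

unweighted = mean_point(centroids)        # same result as the loop above
weighted = mean_point(centroids, counts)  # pulled toward the larger clusters
print("Unweighted:", unweighted)
print("Weighted by cluster size:", weighted)
```

Whether the weighted or unweighted point is preferable depends on whether the store should sit between the clusters or nearer to where most restaurants are.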


In [34]:
label = '{}'.format("Suggested Store Location 1")
label = folium.Popup(label, parse_html=True)
lat = lx
lng = ly
counter += 1
folium.CircleMarker(
    [lat, lng],
    radius=5,
    popup=label,
    color='black',
    fill=True,
    fill_color='yellow',
    fill_opacity=0.8,
    parse_html=False).add_to(map_clusters)  

map_clusters

### Exploring the clusters
#### Let's find out which clusters contain the most restaurants and which offer the best prospects as a potential location for our store

Let's sort the clusters by number of restaurants, highest first

In [35]:
#ap_nyc_restaurant_data_clustered.groupby("ClusterLabel")["Name"].count().reset_index()
ap_nyc_grouped = ap_nyc_restaurant_data_clustered.groupby("ClusterLabel")["Name"].count().reset_index()
ap_nyc_grouped.columns = ["Cluster", "Number of Restaurants"]
ap_nyc_grouped = ap_nyc_grouped.sort_values(by="Number of Restaurants", ascending = False)
ap_nyc_grouped = ap_nyc_grouped.reset_index(drop=True)
ap_nyc_grouped.head()

Unnamed: 0,Cluster,Number of Restaurants
0,1,20
1,2,15
2,3,13
3,0,9
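
As a cross-check on the table above, the same ranking can be tallied directly from the label array that K-means printed. This is a minimal sketch using `collections.Counter`; the list below is transcribed from that output, with the long runs of 3s and 1s written as repetitions.

```python
from collections import Counter

# Labels transcribed from the kmeans.labels_ output (57 restaurants in total)
labels = (
    [2, 0, 2, 0, 0, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 2, 0, 0, 2, 2, 2, 2]
    + [2, 2]
    + [3] * 13
    + [1] * 20
)

cluster_sizes = Counter(labels)
# most_common() sorts clusters by restaurant count, largest first
for cluster, size in cluster_sizes.most_common():
    print("Cluster", cluster, "has", size, "restaurants")
```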


#### Cluster 0

In [36]:
# Cluster 0
ap_nyc_restaurant_data_clustered[ap_nyc_restaurant_data_clustered.ClusterLabel == 0].reset_index(drop=True)

Unnamed: 0,Name,Latitude,Longitude,CategoryId,CategoryName,CategoryShortName,ClusterLabel
0,PuTawn Local Thai Kitchen,40.774599,-73.951042,4bf58dd8d48988d149941735,Thai Restaurant,Thai,0
1,Up Thai,40.769898,-73.957598,4bf58dd8d48988d149941735,Thai Restaurant,Thai,0
2,Tanoshi Sushi,40.767747,-73.953203,4bf58dd8d48988d1d2941735,Sushi Restaurant,Sushi,0
3,Enthaice,40.763165,-73.92115,4bf58dd8d48988d149941735,Thai Restaurant,Thai,0
4,Pye Boat Noodle,40.760324,-73.921711,4bf58dd8d48988d149941735,Thai Restaurant,Thai,0
5,Taverna Kyclades,40.77521,-73.909381,4bf58dd8d48988d10e941735,Greek Restaurant,Greek,0
6,JJ's Asian Fusion,40.762024,-73.918452,4bf58dd8d48988d1d2941735,Sushi Restaurant,Sushi,0
7,Hibino LIC,40.74277,-73.952127,4bf58dd8d48988d111941735,Japanese Restaurant,Japanese,0
8,Loukoumi Taverna,40.770687,-73.902921,4bf58dd8d48988d10e941735,Greek Restaurant,Greek,0


#### Cluster 1

In [37]:
# Cluster 1
ap_nyc_restaurant_data_clustered[ap_nyc_restaurant_data_clustered.ClusterLabel == 1].reset_index(drop=True)

Unnamed: 0,Name,Latitude,Longitude,CategoryId,CategoryName,CategoryShortName,ClusterLabel
0,Arirang,40.761282,-73.802799,4bf58dd8d48988d113941735,Korean Restaurant,Korean,1
1,Mad For Chicken,40.763426,-73.807724,4bf58dd8d48988d113941735,Korean Restaurant,Korean,1
2,BKNY Thai Restaurant,40.753085,-73.779995,4bf58dd8d48988d149941735,Thai Restaurant,Thai,1
3,Mapo BBQ,40.762309,-73.81488,4bf58dd8d48988d113941735,Korean Restaurant,Korean,1
4,The Oneness-Fountain-Heart,40.727897,-73.811311,4bf58dd8d48988d1d3941735,Vegetarian / Vegan Restaurant,Vegetarian / Vegan,1
5,Avli Little Greek Tavern,40.765729,-73.771972,4bf58dd8d48988d10e941735,Greek Restaurant,Greek,1
6,Tong Sam Gyup Goo Yi Restaurant,40.762059,-73.803123,4bf58dd8d48988d113941735,Korean Restaurant,Korean,1
7,Hahm Ji Bach - 함지박,40.763022,-73.815042,4bf58dd8d48988d113941735,Korean Restaurant,Korean,1
8,OK Ryan,40.75607,-73.832368,4bf58dd8d48988d145941735,Chinese Restaurant,Chinese,1
9,Nikko Hibachi,40.726321,-73.790242,4bf58dd8d48988d111941735,Japanese Restaurant,Japanese,1


#### Cluster 2

In [38]:
# Cluster 2
ap_nyc_restaurant_data_clustered[ap_nyc_restaurant_data_clustered.ClusterLabel == 2].reset_index(drop=True)

Unnamed: 0,Name,Latitude,Longitude,CategoryId,CategoryName,CategoryShortName,ClusterLabel
0,The Mermaid Inn,40.788744,-73.974243,4bf58dd8d48988d1ce941735,Seafood Restaurant,Seafood,2
1,Sala Thai,40.780124,-73.980475,4bf58dd8d48988d149941735,Thai Restaurant,Thai,2
2,Marea,40.767452,-73.981114,4bf58dd8d48988d1ce941735,Seafood Restaurant,Seafood,2
3,Kashkaval Garden,40.766888,-73.986469,4bf58dd8d48988d1c0941735,Mediterranean Restaurant,Mediterranean,2
4,Taboon,40.766103,-73.990997,4bf58dd8d48988d1c0941735,Mediterranean Restaurant,Mediterranean,2
5,Taladwat,40.762703,-73.98946,4bf58dd8d48988d149941735,Thai Restaurant,Thai,2
6,Pure Thai Cookhouse,40.764319,-73.988307,4bf58dd8d48988d149941735,Thai Restaurant,Thai,2
7,The Lobster Club,40.758457,-73.971741,4bf58dd8d48988d1ce941735,Seafood Restaurant,Seafood,2
8,Mémé Mediterranean,40.760905,-73.994601,4bf58dd8d48988d1c0941735,Mediterranean Restaurant,Mediterranean,2
9,Jongro BBQ,40.747574,-73.987043,4bf58dd8d48988d113941735,Korean Restaurant,Korean,2


#### Cluster 3

In [39]:
# Cluster 3
ap_nyc_restaurant_data_clustered[ap_nyc_restaurant_data_clustered.ClusterLabel == 3].reset_index(drop=True)

Unnamed: 0,Name,Latitude,Longitude,CategoryId,CategoryName,CategoryShortName,ClusterLabel
0,Thai Farm Kitchen,40.644148,-73.976047,4bf58dd8d48988d149941735,Thai Restaurant,Thai,3
1,Silver Rice,40.674187,-73.957037,4bf58dd8d48988d1d2941735,Sushi Restaurant,Sushi,3
2,East Wind Snack Shop,40.660297,-73.980169,4bf58dd8d48988d145941735,Chinese Restaurant,Chinese,3
3,Gen,40.677575,-73.963721,4bf58dd8d48988d1d2941735,Sushi Restaurant,Sushi,3
4,Sushi Katsuei,40.670615,-73.978504,4bf58dd8d48988d111941735,Japanese Restaurant,Japanese,3
5,China New Star,40.61614,-73.929742,4bf58dd8d48988d145941735,Chinese Restaurant,Chinese,3
6,Avlee Greek Kitchen,40.679975,-73.994988,4bf58dd8d48988d10e941735,Greek Restaurant,Greek,3
7,Charm Kao,40.689212,-73.98625,4bf58dd8d48988d149941735,Thai Restaurant,Thai,3
8,Grand Army,40.688329,-73.986612,4bf58dd8d48988d1ce941735,Seafood Restaurant,Seafood,3
9,Kichin,40.697706,-73.927023,4bf58dd8d48988d113941735,Korean Restaurant,Korean,3


#### Gathering street addresses for all centroids including centroid of centroids

In [40]:
# Defining a function to fetch street address for any set of coordinates

def findStreetAddress(lat, lng):
    url = "https://api.opencagedata.com/geocode/v1/json?key={}&q={}&pretty=1".format(OPENCAGEAPIKEY, str(lat) + "," + str(lng))
    return requests.get(url).json()["results"][0]["formatted"]
    #return address


#### Address of all Centroids

In [41]:
for i in range(len(centroids)):
    print("Address of Centroid", i, " : ", findStreetAddress(centroids[i][0], centroids[i][1]))

print("Address of Centroid of Centroids: ", findStreetAddress(lx, ly))

Address of Centroid 0  :  14-48 Broadway, New York, NY 11106, United States of America
Address of Centroid 1  :  164th Street, New York, NY 11358:11432, United States of America
Address of Centroid 2  :  One Astor Place, 1515 Broadway, New York, NY 10019, United States of America
Address of Centroid 3  :  11 Plaza Street West, New York, NY 11217, United States of America
Address of Centroid of Centroids:  50-48 43rd Street, New York, NY 11377, United States of America
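
To get a feel for how close the centroid of centroids actually is to each cluster, the great-circle distance can be computed with the haversine formula. The sketch below uses only the standard library and the coordinates printed earlier. The function name `haversine_km` and the mean Earth radius of 6371 km are assumptions of this sketch, and the distances are straight-line, not travel distances.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value for this sketch)

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# Cluster centroids and the centroid of centroids, copied from the outputs above
cluster_centers = [
    (40.76515813, -73.93195378),
    (40.74758713, -73.8033496),
    (40.75807782, -73.98640458),
    (40.67501736, -73.97174628),
]
store = (40.73646010852825, -73.92336356073892)

distances_km = [haversine_km(store[0], store[1], lat, lng) for lat, lng in cluster_centers]
for i, d in enumerate(distances_km):
    print("Cluster {} centroid: {:.1f} km from the suggested location".format(i, d))
```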


### 5. Results

##### As evident, Cluster 1 has the highest number of restaurants (20), followed by Cluster 2 (15), Cluster 3 (13) and Cluster 0 (9).

If the business intends to open only one store outlet that's close to all these clusters, then the most feasible location would be the centroid of centroids.

If the business has resources to open multiple outlets, the recommended sequence of locations based on the above analysis is as follows:

1. Manhattan (Cluster 2)
2. Brooklyn (Cluster 3)
3. Queens (Cluster 1)


### 6. Conclusion

The business stakeholders must consider the available resources, leasing costs in the recommended locations, infrastructure and setup expenses, marketing costs, etc. before making a final decision. Based on budget, resource allocation and time, the business can set up either multiple outlets or a single outlet to meet its venture goals.