<a href="https://cognitiveclass.ai"><img src = "https://ibm.box.com/shared/static/9gegpsmnsoo25ikkbl4qzlvlyjbgxs5x.png" width = 400> </a>

<h1 align=center><font size = 5>Segmenting and Clustering Neighborhoods in New York City</font></h1>

## Introduction

In this lab, you will learn how to convert addresses into their equivalent latitude and longitude values. Also, you will use the Foursquare API to explore neighborhoods in New York City. You will use the **explore** function to get the most common venue categories in each neighborhood, and then use this feature to group the neighborhoods into clusters. You will use the *k*-means clustering algorithm to complete this task. Finally, you will use the Folium library to visualize the neighborhoods in New York City and their emerging clusters.

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1. <a href="#item1">Download and Explore Dataset</a>

2. <a href="#item2">Explore Neighborhoods in New York City</a>

3. <a href="#item3">Analyze Each Neighborhood</a>

4. <a href="#item4">Cluster Neighborhoods</a>

5. <a href="#item5">Examine Clusters</a>    
</font>
</div>

Before we get the data and start exploring it, let's install and import all the dependencies that we will need.

In [192]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # comment out this line if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # comment out this line if folium is already installed
import folium # map rendering library

print('Libraries imported.')

Solving environment: done

# All requested packages already installed.

Solving environment: done

# All requested packages already installed.

Libraries imported.


<a id='item1'></a>

## 1. Download and Explore Dataset

First, let's download a dataset that contains the 5 boroughs and the neighborhoods in each borough, as well as the latitude and longitude coordinates of each neighborhood, from https://geo.nyu.edu/download/file/nyu-2451-34572-geojson.json:

In [427]:
!wget -q -O 'newyork_data_2.json' https://geo.nyu.edu/download/file/nyu-2451-34572-geojson.json
print('Data downloaded!')

Data downloaded!


#### Load and explore the data

Next, let's load the data.

In [426]:
with open('newyork_data_2.json') as json_data:
    newyork_data = json.load(json_data)

All the relevant data is in the *features* key, which is basically a list of the neighborhoods. So, let's define a new variable that includes this data.

In [194]:
neighborhoods_data = newyork_data['features']

Let's take a look at the first item in this list.

In [428]:
neighborhoods_data[0]

{'type': 'Feature',
 'id': 'nyu_2451_34572.1',
 'geometry': {'type': 'Point',
  'coordinates': [-73.84720052054902, 40.89470517661]},
 'geometry_name': 'geom',
 'properties': {'name': 'Wakefield',
  'stacked': 1,
  'annoline1': 'Wakefield',
  'annoline2': None,
  'annoline3': None,
  'annoangle': 0.0,
  'borough': 'Bronx',
  'bbox': [-73.84720052054902,
   40.89470517661,
   -73.84720052054902,
   40.89470517661]}}
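
Note that GeoJSON stores point coordinates as `[longitude, latitude]`, the reverse of the order most mapping calls expect. A minimal sketch of pulling the four fields used below out of one feature, assuming every feature has the same shape as the record above (the helper name `feature_to_row` is hypothetical and not part of the notebook):

```python
def feature_to_row(feature):
    """Flatten one GeoJSON Point feature into a flat dict (hypothetical helper)."""
    lon, lat = feature['geometry']['coordinates']  # GeoJSON order: longitude first
    props = feature['properties']
    return {
        'Borough': props['borough'],
        'Neighborhood': props['name'],
        'Latitude': lat,
        'Longitude': lon,
    }


# The first record shown above, trimmed to the fields we need
sample = {
    'type': 'Feature',
    'geometry': {'type': 'Point',
                 'coordinates': [-73.84720052054902, 40.89470517661]},
    'properties': {'name': 'Wakefield', 'borough': 'Bronx'},
}
row = feature_to_row(sample)
print(row)
```

Swapping the two coordinates is an easy mistake to make and would place every marker far from New York, so it is worth checking one record by hand.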

#### Transform the data into a *pandas* dataframe

Let's transform this data of nested Python dictionaries into a *pandas* dataframe:

In [283]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

Let's loop through the data, collect one row per neighborhood, and build the dataframe from the collected rows.

In [285]:
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_lon, neighborhood_lat = data['geometry']['coordinates']

    rows.append({'Borough': borough,
                 'Neighborhood': "%s (%s)" % (neighborhood_name, borough),
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

# DataFrame.append was removed in pandas 2.0, so build the frame in one step
neighborhoods = pd.DataFrame(rows, columns=column_names)

#### Use geopy library to get the latitude and longitude values of New York City.

In [286]:
address = 'New York City, NY'

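geolocator = Nominatim(user_agent="ny_explorer") # recent geopy versions require a custom user_agent; the value is arbitrary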
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7308619, -73.9871558.


#### Create a map of New York with neighborhoods superimposed on top.

In [287]:
# create map of New York using latitude and longitude values
map_newyork = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_newyork)  
    
map_newyork

Define a function to extract a single borough's neighborhoods:

In [429]:
def getBoroughData(neighborhoods, borough):
    return neighborhoods[neighborhoods['Borough'] == borough].reset_index(drop=True)

Let's check it by extracting Brooklyn's neighborhoods, for example:

In [289]:
borough_data = getBoroughData(neighborhoods, 'Brooklyn')
borough_data.head()

,Borough,Neighborhood,Latitude,Longitude
0,Brooklyn,Bay Ridge (Brooklyn),40.625801,-74.030621
1,Brooklyn,Bensonhurst (Brooklyn),40.611009,-73.99518
2,Brooklyn,Sunset Park (Brooklyn),40.645103,-74.010316
3,Brooklyn,Greenpoint (Brooklyn),40.730201,-73.954241
4,Brooklyn,Gravesend (Brooklyn),40.59526,-73.973471


Next, we are going to start utilizing the Foursquare API to explore the neighborhoods and segment them.

#### Define Foursquare Credentials and Version

In [290]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID
CLIENT_SECRET: YOUR_FOURSQUARE_CLIENT_SECRET


Define a function to get nearby restaurants:

In [291]:
def getNearbyRestaurants(neighborhood_data, client_id, client_secret, version = '20180605', radius=500, limit = 100):
    
    categoryId = '4d4b7105d754a06374d81259'   # Foursquare's top-level Food category (restaurants, cafes, delis, ...)
    venues_list=[]
    for name, lat, lng in zip(neighborhood_data['Neighborhood'], neighborhood_data['Latitude'], neighborhood_data['Longitude']):
        #print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId={}'.format(
            client_id, 
            client_secret, 
            version, 
            lat, 
            lng, 
            radius, 
            limit,
            categoryId)
            
        # make the GET request
        results = requests.get(url).json()["response"]['venues']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['name'], 
            v['location']['lat'], 
            v['location']['lng'],  
            v['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
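
The request and the parsing can also be separated so that the parsing is testable without network access. The sketch below is an illustration, not the function used in this notebook: `build_search_params` and `parse_venues` are hypothetical helper names, and the response shape is assumed to match what `getNearbyRestaurants` reads above (`response` → `venues`, each with `name`, `location.lat`, `location.lng`, and `categories[0].name`). It also skips venues whose `categories` list is empty, which would otherwise raise an `IndexError`.

```python
def build_search_params(client_id, client_secret, lat, lng,
                        category_id, version='20180605', radius=500, limit=100):
    """Query parameters for the venues/search call (hypothetical helper)."""
    return {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
        'categoryId': category_id,
    }


def parse_venues(name, lat, lng, payload):
    """Turn one decoded JSON response into rows matching the nearby_venues columns."""
    rows = []
    for v in payload['response']['venues']:
        if not v.get('categories'):  # assumption: uncategorized venues are skipped
            continue
        rows.append((name, lat, lng,
                     v['name'],
                     v['location']['lat'],
                     v['location']['lng'],
                     v['categories'][0]['name']))
    return rows


# Hand-written payload in the assumed response shape (not real API output)
fake_payload = {'response': {'venues': [
    {'name': 'Example Pizza', 'location': {'lat': 40.6, 'lng': -74.0},
     'categories': [{'name': 'Pizza Place'}]},
    {'name': 'Uncategorized Spot', 'location': {'lat': 40.6, 'lng': -74.0},
     'categories': []},
]}}
rows = parse_venues('Bay Ridge (Brooklyn)', 40.625801, -74.030621, fake_payload)
params = build_search_params('ID', 'SECRET', 40.625801, -74.030621,
                             '4d4b7105d754a06374d81259')
print(rows)
```

With `requests`, the dictionary would be passed as `requests.get('https://api.foursquare.com/v2/venues/search', params=params)`, which also takes care of URL encoding.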

Get Brooklyn restaurants:

In [381]:
neighborhood_data = getBoroughData(neighborhoods, 'Brooklyn')
brooklyn_restaurants = getNearbyRestaurants(neighborhood_data, CLIENT_ID, CLIENT_SECRET)
brooklyn_restaurants.shape

(3132, 7)

Get Queens restaurants:

In [382]:
neighborhood_data = getBoroughData(neighborhoods, 'Queens')
queens_restaurants = getNearbyRestaurants(neighborhood_data, CLIENT_ID, CLIENT_SECRET)
queens_restaurants.shape

(2855, 7)

Get Manhattan restaurants:

In [383]:
neighborhood_data = getBoroughData(neighborhoods, 'Manhattan')
manhattan_restaurants = getNearbyRestaurants(neighborhood_data, CLIENT_ID, CLIENT_SECRET)
manhattan_restaurants.shape

(2000, 7)

Get Bronx restaurants:

In [384]:
neighborhood_data = getBoroughData(neighborhoods, 'Bronx')
bronx_restaurants = getNearbyRestaurants(neighborhood_data, CLIENT_ID, CLIENT_SECRET)
bronx_restaurants.shape

(2160, 7)

Get Staten Island restaurants:

In [385]:
neighborhood_data = getBoroughData(neighborhoods, 'Staten Island')
staten_restaurants = getNearbyRestaurants(neighborhood_data, CLIENT_ID, CLIENT_SECRET)
staten_restaurants.shape

(1452, 7)

Let's take a look at the category counts for one of the boroughs:

In [431]:
brooklyn_restaurants['Venue Category'].value_counts()

Deli / Bodega                      322
Pizza Place                        207
Food                               198
Coffee Shop                        180
Chinese Restaurant                 163
Bakery                             154
Caribbean Restaurant               130
Italian Restaurant                  95
Café                                94
Fast Food Restaurant                80
American Restaurant                 79
Mexican Restaurant                  76
Bagel Shop                          72
Donut Shop                          71
Restaurant                          51
Sandwich Place                      51
Food Truck                          51
Fried Chicken Joint                 48
Ice Cream Shop                      48
Seafood Restaurant                  42
Sushi Restaurant                    42
Burger Joint                        41
Diner                               37
Latin American Restaurant           36
Juice Bar                           32
Thai Restaurant          

Many of these categories (for example *Food*, *Coffee Shop*, or *Bakery*) are too generic to be useful in our analysis. Let's filter them out and keep only categories that hint at a particular cuisine or ethnic background.

Define a function to clean up the venue list:

In [403]:
def cleanupRestaurants(restaurants):
    keeps = [
    #'American Restaurant',
    'Italian Restaurant',
    'Mexican Restaurant',
    'Chinese Restaurant',
    'French Restaurant',
    #'New American Restaurant',
    'Ramen Restaurant',
    'Thai Restaurant',
    'Mediterranean Restaurant',
    'Japanese Restaurant',
    'Indian Restaurant',
    'Spanish Restaurant',
    'Sushi Restaurant',
    'Latin American Restaurant',
    'Asian Restaurant',
    'Greek Restaurant',
    'Taco Place',
    'Noodle House',
    'Korean Restaurant',
    'Caribbean Restaurant',
    'Vietnamese Restaurant',
    'Tapas Restaurant',
    'Dim Sum Restaurant',
    'Cuban Restaurant',
    'Dumpling Restaurant',
    'Udon Restaurant',
    'Falafel Restaurant',
    'Lebanese Restaurant',
    'Burrito Place',
    'Southern / Soul Food Restaurant',
    'German Restaurant',
    'Beer Garden',
    'Filipino Restaurant',
    'Australian Restaurant',
    'Swiss Restaurant',
    'Moroccan Restaurant',
    'Peruvian Restaurant',
    'Sake Bar',
    'Ethiopian Restaurant',
    'Middle Eastern Restaurant',
    'Poke Place',
    'Taiwanese Restaurant',
    'Malay Restaurant',
    'Szechuan Restaurant',
    'Hawaiian Restaurant',
    'Soba Restaurant',
    'Turkish Restaurant',
    'Kosher Restaurant',
    'Israeli Restaurant',
    'Jewish Restaurant',
    'Cantonese Restaurant',
    'South Indian Restaurant',
    'Creperie',
    'Shanghai Restaurant',
    'Arepa Restaurant',
    'African Restaurant',
    'Tex-Mex Restaurant',
    'Irish Pub',
    'Halal Restaurant',
    'English Restaurant',
    'Paella Restaurant',
    'Japanese Curry Restaurant',
    'Kebab Restaurant',
    'Argentinian Restaurant',
    'Ukrainian Restaurant'
    ]
    
    return restaurants.loc[restaurants['Venue Category'].isin(keeps)]

In [404]:
brooklyn_restaurants_cleaned = cleanupRestaurants(brooklyn_restaurants)
queens_restaurants_cleaned = cleanupRestaurants(queens_restaurants)
manhattan_restaurants_cleaned = cleanupRestaurants(manhattan_restaurants)
bronx_restaurants_cleaned = cleanupRestaurants(bronx_restaurants)
staten_restaurants_cleaned = cleanupRestaurants(staten_restaurants)
# only the last expression in a cell is displayed, so the output below is the Staten Island shape
brooklyn_restaurants_cleaned.shape
queens_restaurants_cleaned.shape
manhattan_restaurants_cleaned.shape
bronx_restaurants_cleaned.shape
staten_restaurants_cleaned.shape

(367, 7)

Concatenate the borough data sets:

In [405]:
nyc_restaurants = pd.concat([brooklyn_restaurants_cleaned, queens_restaurants_cleaned, manhattan_restaurants_cleaned, bronx_restaurants_cleaned, staten_restaurants_cleaned])
nyc_restaurants.shape

(3343, 7)

Define a function to compute the frequency of each venue category per neighborhood:

In [406]:
def getFrequencies(venues):
    # one hot encoding
    onehot = pd.get_dummies(venues[['Venue Category']], prefix="", prefix_sep="")
    
    # add neighborhood column back to dataframe
    onehot['Neighborhood'] = venues['Neighborhood'] 

    # move neighborhood column to the first column
    fixed_columns = [onehot.columns[-1]] + list(onehot.columns[:-1])
    onehot = onehot[fixed_columns]
    
    # get frequencies
    return onehot.groupby('Neighborhood').mean().reset_index()

#brooklyn_grouped = getFrequencies(brooklyn_restaurants)
#brooklyn_grouped
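
To make the result of `getFrequencies` concrete: one-hot encoding followed by a per-neighborhood mean gives, for each neighborhood, the share of its venues that fall in each category. The following pure-Python sketch computes the same shares on made-up data (the venue pairs and the helper name `category_shares` are illustrative assumptions, not notebook output):

```python
from collections import Counter, defaultdict


def category_shares(pairs):
    """Map each neighborhood to {category: fraction of its venues in that category}."""
    counts = defaultdict(Counter)
    for neighborhood, category in pairs:
        counts[neighborhood][category] += 1
    shares = {}
    for neighborhood, counter in counts.items():
        total = sum(counter.values())
        shares[neighborhood] = {cat: n / total for cat, n in counter.items()}
    return shares


# Made-up (neighborhood, category) pairs for illustration only
pairs = [
    ('A', 'Italian Restaurant'),
    ('A', 'Italian Restaurant'),
    ('A', 'Thai Restaurant'),
    ('A', 'Greek Restaurant'),
    ('B', 'Caribbean Restaurant'),
]
shares = category_shares(pairs)
print(shares['A'])  # Italian accounts for 2 of A's 4 venues
```

Unlike the dataframe version, categories absent from a neighborhood are simply missing here rather than stored as 0. In both versions the shares for a neighborhood sum to 1, so a neighborhood with a single venue gets a share of 1.0 for that venue's category.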

Define a function that returns a neighborhood's most common venue categories, sorted in descending order of frequency:

In [407]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
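
One consequence worth noting: when a neighborhood has fewer than `num_top_venues` distinct categories, the remaining slots are filled by categories with frequency 0, and their order depends only on how the sort breaks ties. A small sketch of the same ranking with an explicit tie-break (alphabetical, which is an assumption chosen here for reproducibility; `top_categories` is a hypothetical helper):

```python
def top_categories(freqs, n):
    """Return the n category names with the highest frequency; ties broken alphabetically."""
    ranked = sorted(freqs.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:n]]


# Made-up frequencies for one neighborhood
freqs = {
    'Sushi Restaurant': 1.0,
    'Vietnamese Restaurant': 0.0,
    'Ethiopian Restaurant': 0.0,
    'Italian Restaurant': 0.0,
}
top3 = top_categories(freqs, 3)
print(top3)
```

Only the first entry carries information in this example; the other two are zero-frequency fillers, which is worth keeping in mind when reading the "most common venue" tables.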

Clustering:

In [408]:
nyc_grouped = getFrequencies(nyc_restaurants)
nyc_grouped = nyc_grouped.sort_values(by='Neighborhood').reset_index(drop=True)
nyc_grouped.shape

(291, 64)

Define functions to find the optimal number of clusters using the silhouette score:

In [409]:
from sklearn.metrics import silhouette_score

def findOptimalK(grouped_for_clustering, kmax = 20):
    # track the best silhouette score seen so far (scores range from -1 to 1)
    max_score = -1
    opt_clusters = None

    for kclusters in range(3, kmax):    # tries k = 3, ..., kmax - 1
        cluster_labels = KMeans(n_clusters=kclusters, random_state=0).fit_predict(grouped_for_clustering)
        silhouette_avg = silhouette_score(grouped_for_clustering, cluster_labels)
        print("For n_clusters =", kclusters, "The average silhouette_score is :", silhouette_avg, "labels =: ", len(cluster_labels))
        if (silhouette_avg > max_score):
            max_score = silhouette_avg
            opt_clusters = kclusters
    
    return opt_clusters

def findClusters(freqs):
    grouped_for_clustering = freqs.drop(columns='Neighborhood')

    # set number of clusters
    kclusters = findOptimalK(grouped_for_clustering)

    # run k-means clustering
    cluster_labels = KMeans(n_clusters=kclusters, random_state=0).fit_predict(grouped_for_clustering)
    print("labels =: ", len(cluster_labels))
    
    return (kclusters, cluster_labels)
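
For reference, the silhouette value of a point compares its mean distance `a` to the other members of its own cluster with the smallest mean distance `b` to the members of any other cluster: `s = (b - a) / max(a, b)`. Values near 1 indicate well-separated clusters, values near 0 indicate overlapping ones, and negative values suggest misassigned points. The pure-Python sketch below implements this definition for small inputs with Euclidean distance; it is meant to illustrate what `silhouette_score` measures, not to replace it (the function name `mean_silhouette` and the sample points are made up).

```python
import math


def mean_silhouette(points, labels):
    """Average silhouette value over all points, using Euclidean distance."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError('silhouette needs at least two clusters')

    def mean_dist(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    total = 0.0
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            continue  # convention: a point alone in its cluster contributes 0
        a = mean_dist(i, own)
        b = min(mean_dist(i, members)
                for other, members in clusters.items() if other != label)
        denom = max(a, b)
        total += (b - a) / denom if denom else 0.0
    return total / len(points)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = mean_silhouette(points, [0, 0, 1, 1])  # clusters match the two groups
bad = mean_silhouette(points, [0, 1, 0, 1])   # clusters cut across the groups
print(good, bad)
```

Seen in this light, the scores of roughly 0.13 to 0.18 printed below indicate that the neighborhood clusters overlap considerably, whichever `k` is chosen.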


In [410]:
nyc_grouped.shape

(291, 64)

In [411]:
k_clusters, labels = findClusters(nyc_grouped)

For n_clusters = 3 The average silhouette_score is : 0.1594182612197137 labels =:  291
For n_clusters = 4 The average silhouette_score is : 0.17512347647915594 labels =:  291
For n_clusters = 5 The average silhouette_score is : 0.15480323139950145 labels =:  291
For n_clusters = 6 The average silhouette_score is : 0.15670265663812336 labels =:  291
For n_clusters = 7 The average silhouette_score is : 0.15797086577112154 labels =:  291
For n_clusters = 8 The average silhouette_score is : 0.15396317697018455 labels =:  291
For n_clusters = 9 The average silhouette_score is : 0.15523497813217546 labels =:  291
For n_clusters = 10 The average silhouette_score is : 0.1339339915921139 labels =:  291
For n_clusters = 11 The average silhouette_score is : 0.15183515788376128 labels =:  291
For n_clusters = 12 The average silhouette_score is : 0.16208708484098897 labels =:  291
For n_clusters = 13 The average silhouette_score is : 0.1396968508131247 labels =:  291
For n_clusters = 14 The average

In [419]:
len(labels)
k_clusters

4

Create a dataset with the top 10 venue categories for each neighborhood:

In [413]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = nyc_grouped['Neighborhood']

for ind in np.arange(nyc_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(nyc_grouped.iloc[ind, :], num_top_venues)
    
neighborhoods_venues_sorted.head()


,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Allerton (Bronx),Mexican Restaurant,Chinese Restaurant,Spanish Restaurant,Vietnamese Restaurant,Halal Restaurant,Filipino Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Hawaiian Restaurant
1,Annadale (Staten Island),Sushi Restaurant,Vietnamese Restaurant,Ethiopian Restaurant,Japanese Restaurant,Japanese Curry Restaurant,Italian Restaurant,Israeli Restaurant,Irish Pub,Indian Restaurant,Hawaiian Restaurant
2,Arlington (Staten Island),Caribbean Restaurant,Peruvian Restaurant,Vietnamese Restaurant,Falafel Restaurant,Japanese Restaurant,Japanese Curry Restaurant,Italian Restaurant,Israeli Restaurant,Irish Pub,Indian Restaurant
3,Arrochar (Staten Island),Italian Restaurant,Taco Place,Middle Eastern Restaurant,Latin American Restaurant,Mediterranean Restaurant,Chinese Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Halal Restaurant
4,Arverne (Queens),Thai Restaurant,Chinese Restaurant,Vietnamese Restaurant,Hawaiian Restaurant,Filipino Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Halal Restaurant,Indian Restaurant
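
The `indicators` lookup used to build the column names works for the ten columns needed here, but for a larger `num_top_venues` it would label column 21 as `21th`. If that ever matters, a general suffix helper could look like the sketch below (`ordinal` is a hypothetical name, not part of the notebook):

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th are irregular
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


columns = ['Neighborhood'] + ['{} Most Common Venue'.format(ordinal(i))
                              for i in range(1, 11)]
print(columns[1], '...', columns[-1])
```

For the first ten columns this produces exactly the same names as the loop in the cell above.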


In [414]:
neighborhoods.shape

(306, 4)

In [415]:
nyc_grouped.shape

(291, 64)

In [416]:
nyc_merged = neighborhoods.loc[neighborhoods['Neighborhood'].isin(nyc_grouped['Neighborhood'])].sort_values(by='Neighborhood')
nyc_merged.shape

(291, 4)

Now let's add cluster labels to our dataset:

In [417]:
# add clustering labels
# (positional assignment: this relies on nyc_merged and nyc_grouped both being sorted by Neighborhood)
nyc_merged['Cluster Labels'] = labels

# join the top-venue columns onto each neighborhood row, matching on the neighborhood name
nyc_merged = nyc_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

nyc_merged.head() # check the last columns!

,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
298,Bronx,Allerton (Bronx),40.865788,-73.859319,3,Mexican Restaurant,Chinese Restaurant,Spanish Restaurant,Vietnamese Restaurant,Halal Restaurant,Filipino Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Hawaiian Restaurant
215,Staten Island,Annadale (Staten Island),40.538114,-74.178549,0,Sushi Restaurant,Vietnamese Restaurant,Ethiopian Restaurant,Japanese Restaurant,Japanese Curry Restaurant,Italian Restaurant,Israeli Restaurant,Irish Pub,Indian Restaurant,Hawaiian Restaurant
227,Staten Island,Arlington (Staten Island),40.635325,-74.165104,2,Caribbean Restaurant,Peruvian Restaurant,Vietnamese Restaurant,Falafel Restaurant,Japanese Restaurant,Japanese Curry Restaurant,Italian Restaurant,Israeli Restaurant,Irish Pub,Indian Restaurant
228,Staten Island,Arrochar (Staten Island),40.596313,-74.067124,0,Italian Restaurant,Taco Place,Middle Eastern Restaurant,Latin American Restaurant,Mediterranean Restaurant,Chinese Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Halal Restaurant
177,Queens,Arverne (Queens),40.589144,-73.791992,3,Thai Restaurant,Chinese Restaurant,Vietnamese Restaurant,Hawaiian Restaurant,Filipino Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Halal Restaurant,Indian Restaurant


Let's visualize the resulting clusters:

In [445]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(k_clusters)
ys = [i+x+(i*x)**2 for i in range(k_clusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(nyc_merged['Latitude'], nyc_merged['Longitude'], nyc_merged['Neighborhood'], nyc_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster + 1), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
        
map_clusters

## Let's Examine Clusters:

#### Cluster 1

In [450]:
nyc_merged.loc[nyc_merged['Cluster Labels'] == 0, nyc_merged.columns[[1] + list(range(5, nyc_merged.shape[1]))]]

,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
215,Annadale (Staten Island),Sushi Restaurant,Vietnamese Restaurant,Ethiopian Restaurant,Japanese Restaurant,Japanese Curry Restaurant,Italian Restaurant,Israeli Restaurant,Irish Pub,Indian Restaurant,Hawaiian Restaurant
228,Arrochar (Staten Island),Italian Restaurant,Taco Place,Middle Eastern Restaurant,Latin American Restaurant,Mediterranean Restaurant,Chinese Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Halal Restaurant
129,Astoria (Queens),Greek Restaurant,Italian Restaurant,Latin American Restaurant,Noodle House,Moroccan Restaurant,Middle Eastern Restaurant,Mediterranean Restaurant,Poke Place,Falafel Restaurant,Beer Garden
152,Auburndale (Queens),Korean Restaurant,Italian Restaurant,Sushi Restaurant,Greek Restaurant,Japanese Restaurant,Mexican Restaurant,Noodle House,Vietnamese Restaurant,Thai Restaurant,Taco Place
79,Bath Beach (Brooklyn),Chinese Restaurant,Asian Restaurant,Cantonese Restaurant,Dim Sum Restaurant,Japanese Restaurant,German Restaurant,Spanish Restaurant,Sushi Restaurant,Poke Place,Indian Restaurant
127,Battery Park City (Manhattan),Mexican Restaurant,Italian Restaurant,Sushi Restaurant,French Restaurant,Beer Garden,Burrito Place,Chinese Restaurant,German Restaurant,Greek Restaurant,Halal Restaurant
46,Bay Ridge (Brooklyn),Italian Restaurant,Greek Restaurant,Thai Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Chinese Restaurant,Vietnamese Restaurant,Sushi Restaurant,Japanese Restaurant,Taco Place
175,Bay Terrace (Queens),Asian Restaurant,Italian Restaurant,Mediterranean Restaurant,Peruvian Restaurant,Indian Restaurant,Greek Restaurant,Mexican Restaurant,Halal Restaurant,French Restaurant,German Restaurant
151,Bayside (Queens),Sushi Restaurant,Indian Restaurant,Italian Restaurant,Greek Restaurant,Mediterranean Restaurant,Mexican Restaurant,Chinese Restaurant,Korean Restaurant,Noodle House,Shanghai Restaurant
63,Bedford Stuyvesant (Brooklyn),Mexican Restaurant,Japanese Restaurant,Chinese Restaurant,Vietnamese Restaurant,Arepa Restaurant,Japanese Curry Restaurant,Italian Restaurant,Israeli Restaurant,Irish Pub,Falafel Restaurant


#### Cluster 2

In [451]:
nyc_merged.loc[nyc_merged['Cluster Labels'] == 1, nyc_merged.columns[[1] + list(range(5, nyc_merged.shape[1]))]]

,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
235,Bay Terrace (Staten Island),Italian Restaurant,French Restaurant,Sushi Restaurant,Chinese Restaurant,Vietnamese Restaurant,Halal Restaurant,Filipino Restaurant,German Restaurant,Greek Restaurant,Hawaiian Restaurant
34,Belmont (Bronx),Italian Restaurant,Spanish Restaurant,Mexican Restaurant,Chinese Restaurant,Vietnamese Restaurant,Filipino Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Hawaiian Restaurant
208,Castleton Corners (Staten Island),Italian Restaurant,Japanese Restaurant,Thai Restaurant,Indian Restaurant,Vietnamese Restaurant,Ethiopian Restaurant,Japanese Curry Restaurant,Israeli Restaurant,Irish Pub,Hawaiian Restaurant
244,Chelsea (Staten Island),Italian Restaurant,Spanish Restaurant,Vietnamese Restaurant,Ethiopian Restaurant,Japanese Restaurant,Japanese Curry Restaurant,Israeli Restaurant,Irish Pub,Indian Restaurant,Hawaiian Restaurant
12,City Island (Bronx),Italian Restaurant,French Restaurant,Tapas Restaurant,Spanish Restaurant,Vietnamese Restaurant,Greek Restaurant,Falafel Restaurant,Filipino Restaurant,German Restaurant,Hawaiian Restaurant
254,Concord (Staten Island),Italian Restaurant,Peruvian Restaurant,Chinese Restaurant,Mexican Restaurant,Spanish Restaurant,Halal Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Vietnamese Restaurant
231,Dongan Hills (Staten Island),Italian Restaurant,Mediterranean Restaurant,Chinese Restaurant,Sushi Restaurant,Greek Restaurant,Mexican Restaurant,Spanish Restaurant,Vietnamese Restaurant,Filipino Restaurant,French Restaurant
287,Egbertville (Staten Island),Italian Restaurant,Vietnamese Restaurant,Ethiopian Restaurant,Japanese Restaurant,Japanese Curry Restaurant,Israeli Restaurant,Irish Pub,Indian Restaurant,Hawaiian Restaurant,Halal Restaurant
255,Emerson Hill (Staten Island),Italian Restaurant,Vietnamese Restaurant,Ethiopian Restaurant,Japanese Restaurant,Japanese Curry Restaurant,Israeli Restaurant,Irish Pub,Indian Restaurant,Hawaiian Restaurant,Halal Restaurant
279,Fulton Ferry (Brooklyn),Italian Restaurant,Mexican Restaurant,Ramen Restaurant,Sushi Restaurant,Chinese Restaurant,Middle Eastern Restaurant,Falafel Restaurant,Indian Restaurant,Halal Restaurant,Hawaiian Restaurant


#### Cluster 3

In [452]:
nyc_merged.loc[nyc_merged['Cluster Labels'] == 2, nyc_merged.columns[[1] + list(range(5, nyc_merged.shape[1]))]]

,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
227,Arlington (Staten Island),Caribbean Restaurant,Peruvian Restaurant,Vietnamese Restaurant,Falafel Restaurant,Japanese Restaurant,Japanese Curry Restaurant,Italian Restaurant,Israeli Restaurant,Irish Pub,Indian Restaurant
168,Cambria Heights (Queens),Caribbean Restaurant,Mexican Restaurant,Chinese Restaurant,Vietnamese Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Halal Restaurant,Hawaiian Restaurant,Indian Restaurant
74,Canarsie (Brooklyn),Caribbean Restaurant,Chinese Restaurant,African Restaurant,Thai Restaurant,Asian Restaurant,Middle Eastern Restaurant,Taco Place,Halal Restaurant,French Restaurant,German Restaurant
105,Central Harlem (Manhattan),Caribbean Restaurant,Chinese Restaurant,French Restaurant,Southern / Soul Food Restaurant,African Restaurant,Italian Restaurant,Tex-Mex Restaurant,Tapas Restaurant,Ethiopian Restaurant,Filipino Restaurant
267,Claremont Village (Bronx),Chinese Restaurant,Caribbean Restaurant,Southern / Soul Food Restaurant,Spanish Restaurant,African Restaurant,Irish Pub,Indian Restaurant,Hawaiian Restaurant,Halal Restaurant,Ethiopian Restaurant
55,Crown Heights (Brooklyn),Caribbean Restaurant,Sushi Restaurant,Kosher Restaurant,Chinese Restaurant,Southern / Soul Food Restaurant,Mexican Restaurant,Falafel Restaurant,French Restaurant,German Restaurant,Greek Restaurant
221,Ditmas Park (Brooklyn),Caribbean Restaurant,Chinese Restaurant,Japanese Restaurant,Taco Place,Ramen Restaurant,Spanish Restaurant,Filipino Restaurant,Mexican Restaurant,Southern / Soul Food Restaurant,Latin American Restaurant
141,East Elmhurst (Queens),Caribbean Restaurant,Mexican Restaurant,Chinese Restaurant,Latin American Restaurant,Asian Restaurant,German Restaurant,Greek Restaurant,Halal Restaurant,Hawaiian Restaurant,Indian Restaurant
56,East Flatbush (Brooklyn),Caribbean Restaurant,Chinese Restaurant,Vietnamese Restaurant,Falafel Restaurant,Japanese Restaurant,Japanese Curry Restaurant,Italian Restaurant,Israeli Restaurant,Irish Pub,Indian Restaurant
2,Eastchester (Bronx),Caribbean Restaurant,Chinese Restaurant,Asian Restaurant,Vietnamese Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Halal Restaurant,Hawaiian Restaurant,Indian Restaurant


#### Cluster 4

In [453]:
nyc_merged.loc[nyc_merged['Cluster Labels'] == 3, nyc_merged.columns[[1] + list(range(5, nyc_merged.shape[1]))]]

,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
298,Allerton (Bronx),Mexican Restaurant,Chinese Restaurant,Spanish Restaurant,Vietnamese Restaurant,Halal Restaurant,Filipino Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Hawaiian Restaurant
177,Arverne (Queens),Thai Restaurant,Chinese Restaurant,Vietnamese Restaurant,Hawaiian Restaurant,Filipino Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Halal Restaurant,Indian Restaurant
266,Astoria Heights (Queens),Italian Restaurant,Chinese Restaurant,Spanish Restaurant,Greek Restaurant,Vietnamese Restaurant,Halal Restaurant,Filipino Restaurant,French Restaurant,German Restaurant,Hawaiian Restaurant
10,Baychester (Bronx),Chinese Restaurant,Spanish Restaurant,Mexican Restaurant,Vietnamese Restaurant,Halal Restaurant,Filipino Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Hawaiian Restaurant
13,Bedford Park (Bronx),Mexican Restaurant,Chinese Restaurant,Latin American Restaurant,Spanish Restaurant,Hawaiian Restaurant,French Restaurant,German Restaurant,Greek Restaurant,Halal Restaurant,Vietnamese Restaurant
174,Beechhurst (Queens),Chinese Restaurant,Italian Restaurant,Caribbean Restaurant,Vietnamese Restaurant,Falafel Restaurant,Japanese Restaurant,Japanese Curry Restaurant,Israeli Restaurant,Irish Pub,Indian Restaurant
194,Bellaire (Queens),Chinese Restaurant,Italian Restaurant,Indian Restaurant,Halal Restaurant,Greek Restaurant,Vietnamese Restaurant,Ethiopian Restaurant,Japanese Restaurant,Japanese Curry Restaurant,Israeli Restaurant
190,Belle Harbor (Queens),Italian Restaurant,Mexican Restaurant,Chinese Restaurant,Vietnamese Restaurant,Falafel Restaurant,Japanese Restaurant,Japanese Curry Restaurant,Israeli Restaurant,Irish Pub,Indian Restaurant
156,Bellerose (Queens),Chinese Restaurant,Italian Restaurant,Japanese Restaurant,Argentinian Restaurant,Asian Restaurant,Indian Restaurant,Vietnamese Restaurant,Halal Restaurant,French Restaurant,German Restaurant
47,Bensonhurst (Brooklyn),Chinese Restaurant,Italian Restaurant,Mexican Restaurant,Sushi Restaurant,Asian Restaurant,Latin American Restaurant,Dumpling Restaurant,Noodle House,German Restaurant,Greek Restaurant
