# Capstone Project - The Battle of Neighborhoods

## Import Libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # installs geopy; comment this line out if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/jupyterlab/conda/envs/python

  added / updated specs:
    - geopy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    geographiclib-1.50         |             py_0          34 KB  conda-forge
    geopy-1.21.0               |             py_0          58 KB  conda-forge
    openssl-1.1.1g             |       h516909a_0         2.1 MB  conda-forge
    ------------------------------------------------------------
                                           Total:         2.2 MB

The following NEW packages will be INSTALLED:

  geographiclib      conda-forge/noarch::geographiclib-1.50-py_0
  geopy              conda-forge/noarch::geopy-1.21.0-py_0

The following packages will be UPDATED:

  openssl                                 1.1.1f-h516909a_0 --> 1.1.1g-h51

## Download and Explore Dataset

New York City has a total of 5 boroughs and 306 neighborhoods. To segment and explore the neighborhoods, we need a dataset that contains the 5 boroughs, the neighborhoods in each borough, and the latitude and longitude coordinates of each neighborhood.

We get the dataset from the following link: https://geo.nyu.edu/catalog/nyu_2451_34572

We can simply run a wget command and access the data. So let's go ahead and do that.

In [2]:
!wget -q -O 'newyork_data.json' https://cocl.us/new_york_dataset
print('Data downloaded!')

Data downloaded!


## Load and Explore the Data

In [3]:
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)

All the relevant data is in the features key, which is basically a list of the neighborhoods. So, let's define a new variable that includes this data.

In [4]:
neighborhoods_data = newyork_data['features']
neighborhoods_data[0]

{'type': 'Feature',
 'id': 'nyu_2451_34572.1',
 'geometry': {'type': 'Point',
  'coordinates': [-73.84720052054902, 40.89470517661]},
 'geometry_name': 'geom',
 'properties': {'name': 'Wakefield',
  'stacked': 1,
  'annoline1': 'Wakefield',
  'annoline2': None,
  'annoline3': None,
  'annoangle': 0.0,
  'borough': 'Bronx',
  'bbox': [-73.84720052054902,
   40.89470517661,
   -73.84720052054902,
   40.89470517661]}}
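
Each entry in the features list has the same shape as the one above, so the four fields needed later can be pulled out with a small helper. The sketch below is illustrative only: `feature_to_row` and `sample_feature` are hypothetical names, and the sample is a trimmed copy of the first feature. Note that GeoJSON stores coordinates as `[longitude, latitude]`.

```python
def feature_to_row(feature):
    """Flatten one GeoJSON point feature into the four fields used in this notebook."""
    # GeoJSON coordinate order is [longitude, latitude]
    lon, lat = feature['geometry']['coordinates']
    props = feature['properties']
    return {
        'Borough': props['borough'],
        'Neighborhood': props['name'],
        'Latitude': lat,
        'Longitude': lon,
    }

# trimmed copy of the first feature shown above (hypothetical variable name)
sample_feature = {
    'type': 'Feature',
    'geometry': {'type': 'Point',
                 'coordinates': [-73.84720052054902, 40.89470517661]},
    'properties': {'name': 'Wakefield', 'borough': 'Bronx'},
}

row = feature_to_row(sample_feature)
print(row)
```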

## Transform the data into a Pandas Dataframe

In [5]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)
neighborhoods

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude


Then let's loop through the data, collect one row per neighborhood, and build the dataframe from those rows.

In [6]:
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

# build the dataframe in one step (DataFrame.append was removed in pandas 2.0)
neighborhoods = pd.DataFrame(rows, columns=column_names)

neighborhoods.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


In [7]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]
    )
)

The dataframe has 5 boroughs and 306 neighborhoods.


### Use geopy library to get the latitude and longitude values of New York City.

In order to define an instance of the geocoder, we need to define a user_agent. We will name our agent ny_explorer, as shown below.

In [8]:
address = 'New York City, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7127281, -74.0060152.


In [9]:
# create map of New York using latitude and longitude values
map_newyork = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_newyork)  
    
map_newyork

Let's simplify the above map by segmenting and clustering only the neighborhoods in Manhattan. To do that, we slice the original dataframe and create a new dataframe of the Manhattan data.

In [10]:
manhattan_data = neighborhoods[neighborhoods['Borough'] == 'Manhattan'].reset_index(drop=True)
manhattan_data.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688


Let's get the geographical coordinates of Manhattan.

In [11]:
address = 'Manhattan, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Manhattan are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Manhattan are 40.7896239, -73.9598939.


As we did with all of New York City, let's visualize Manhattan and the neighborhoods in it.

In [12]:
# create map of Manhattan using latitude and longitude values
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)  
    
map_manhattan

## Getting venues through Foursquare API

In [13]:
def getNearbyVenues(names, latitudes, longitudes, radius=5000, categoryIds=''):
    # CLIENT_ID, CLIENT_SECRET, VERSION and LIMIT are notebook globals defined in the next cell
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, VERSION, lat, lng, radius, LIMIT)

        # optionally restrict the search to the given category ID(s)
        if categoryIds != '':
            url = url + '&categoryId={}'.format(categoryIds)

        # make the GET request
        response = requests.get(url).json()
        try:
            results = response['response']['venues']
        except KeyError:
            # show the failing request, then re-raise instead of hiding the error
            print(url)
            print(response)
            raise

        # keep only the relevant fields of venues that have at least one category
        for v in results:
            if not v.get('categories'):
                continue
            venues_list.append((
                name,
                lat,
                lng,
                v['name'],
                v['location']['lat'],
                v['location']['lng'],
                v['categories'][0]['name']
            ))

    nearby_venues = pd.DataFrame(venues_list, columns=[
        'Neighborhood',
        'Neighborhood Latitude',
        'Neighborhood Longitude',
        'Venue',
        'Venue Latitude',
        'Venue Longitude',
        'Venue Category'
    ])

    return nearby_venues
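
As an aside, the request URL can also be assembled with the standard library rather than string formatting, which takes care of escaping the parameter values. This is only a sketch: `build_search_url` is a hypothetical helper, the credential defaults are placeholders, and the parameter names simply mirror the ones used in the URL above.

```python
from urllib.parse import urlencode

def build_search_url(lat, lng, radius, limit, category_id='',
                     client_id='YOUR_CLIENT_ID',          # placeholder, not a real key
                     client_secret='YOUR_CLIENT_SECRET',  # placeholder, not a real key
                     version='20180605'):
    """Assemble a venue search URL from its query parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    # only add the category filter when one was requested
    if category_id:
        params['categoryId'] = category_id
    # safe=',' keeps the comma in "lat,lng" readable instead of encoding it as %2C
    return 'https://api.foursquare.com/v2/venues/search?' + urlencode(params, safe=',')

url = build_search_url(40.876551, -73.91066, radius=5000, limit=500,
                       category_id='4bf58dd8d48988d10f941735')
print(url)
```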

In [14]:
LIMIT = 500 
radius = 5000 
CLIENT_ID = 'your-client-ID' # your Foursquare ID (redacted)
CLIENT_SECRET = 'your-client-secret' # your Foursquare secret (redacted)
VERSION = '20180605'

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: your-client-ID
CLIENT_SECRET: your-client-secret


In [15]:
#https://developer.foursquare.com/docs/resources/categories
#Indian = 4bf58dd8d48988d10f941735
manhattan_venues_indian = getNearbyVenues(names=manhattan_data['Neighborhood'], latitudes=manhattan_data['Latitude'], longitudes=manhattan_data['Longitude'], radius=5000, categoryIds='4bf58dd8d48988d10f941735')
manhattan_venues_indian.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Marble Hill,40.876551,-73.91066,Riverdale Indian Cuisine,40.880886,-73.9088,Indian Restaurant
1,Marble Hill,40.876551,-73.91066,Aman Restaurant,40.885174,-73.87955,Indian Restaurant
2,Marble Hill,40.876551,-73.91066,Shahi Kabab and Curry,40.854604,-73.868137,Indian Restaurant
3,Marble Hill,40.876551,-73.91066,Spice Mantra,40.893799,-73.975175,Indian Restaurant
4,Marble Hill,40.876551,-73.91066,Delhi Masala Express,40.834512,-73.944967,Indian Restaurant


In [16]:
manhattan_venues_indian.shape

(1928, 7)
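
Since the search was issued with `radius=5000` (meters), it can be useful to check how far a returned venue actually lies from its neighborhood center. The sketch below uses the haversine great-circle formula with a mean Earth radius of 6,371 km; `haversine_m` is a hypothetical helper, and the coordinates are copied from the first row of the table above.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (latitude, longitude) points."""
    r = 6371000.0  # mean Earth radius in meters (an approximation)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

marble_hill = (40.876551, -73.91066)   # neighborhood center
venue = (40.880886, -73.9088)          # Riverdale Indian Cuisine

distance = haversine_m(*marble_hill, *venue)
print('{:.0f} m from the neighborhood center; within 5000 m: {}'.format(distance, distance <= 5000))
```

Applying the same check to every row would show whether all results respect the requested radius.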

In [17]:
def addToMap(df, color, existingMap):
    for lat, lng, local, venue, venueCat in zip(df['Venue Latitude'], df['Venue Longitude'], df['Neighborhood'], df['Venue'], df['Venue Category']):
        label = '{} ({}) - {}'.format(venue, venueCat, local)
        label = folium.Popup(label, parse_html=True)
        folium.CircleMarker(
            [lat, lng],
            radius=5,
            popup=label,
            color=color,
            fill=True,
            fill_color=color,
            fill_opacity=0.7).add_to(existingMap)

In [18]:
map_manhattan_indian = folium.Map(location=[latitude, longitude], zoom_start=10)
addToMap(manhattan_venues_indian, 'red', map_manhattan_indian)

map_manhattan_indian

In [19]:
def addColumn(startDf, columnTitle, dataDf):
    # add a column to startDf holding the number of venues per neighborhood found in dataDf
    grouped = dataDf.groupby('Neighborhood').count()
    
    for n in startDf['Neighborhood']:
        try:
            startDf.loc[startDf['Neighborhood'] == n,columnTitle] = grouped.loc[n, 'Venue']
        except KeyError:
            # neighborhood has no venues in dataDf
            startDf.loc[startDf['Neighborhood'] == n,columnTitle] = 0

In [20]:
manhattan_grouped = manhattan_venues_indian.groupby('Neighborhood').count()
manhattan_grouped

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Battery Park City,50,50,50,50,50,50
Carnegie Hill,50,50,50,50,50,50
Central Harlem,50,50,50,50,50,50
Chelsea,50,50,50,50,50,50
Chinatown,50,50,50,50,50,50
Civic Center,50,50,50,50,50,50
Clinton,50,50,50,50,50,50
East Harlem,50,50,50,50,50,50
East Village,50,50,50,50,50,50
Financial District,50,50,50,50,50,50


In [21]:
print('There are {} unique categories.'.format(len(manhattan_venues_indian['Venue Category'].unique())))

There are 12 unique categories.


## Analyzing Each Neighborhood

In [22]:
# one hot encoding
manhattan_onehot = pd.get_dummies(manhattan_venues_indian[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
manhattan_onehot['Neighborhood'] = manhattan_venues_indian['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [manhattan_onehot.columns[-1]] + list(manhattan_onehot.columns[:-1])
manhattan_onehot = manhattan_onehot[fixed_columns]

manhattan_onehot.head()

Unnamed: 0,Neighborhood,Asian Restaurant,Caribbean Restaurant,Chinese Restaurant,Coffee Shop,Deli / Bodega,Food Truck,Grocery Store,Indian Restaurant,North Indian Restaurant,Snack Place,South Indian Restaurant,Tibetan Restaurant
0,Marble Hill,0,0,0,0,0,0,0,1,0,0,0,0
1,Marble Hill,0,0,0,0,0,0,0,1,0,0,0,0
2,Marble Hill,0,0,0,0,0,0,0,1,0,0,0,0
3,Marble Hill,0,0,0,0,0,0,0,1,0,0,0,0
4,Marble Hill,0,0,0,0,0,0,0,1,0,0,0,0


In [23]:
manhattan_grouped = manhattan_onehot.groupby('Neighborhood').mean().reset_index()
manhattan_grouped

Unnamed: 0,Neighborhood,Asian Restaurant,Caribbean Restaurant,Chinese Restaurant,Coffee Shop,Deli / Bodega,Food Truck,Grocery Store,Indian Restaurant,North Indian Restaurant,Snack Place,South Indian Restaurant,Tibetan Restaurant
0,Battery Park City,0.0,0.0,0.02,0.0,0.02,0.04,0.02,0.84,0.02,0.0,0.04,0.0
1,Carnegie Hill,0.02,0.0,0.02,0.0,0.02,0.0,0.0,0.88,0.02,0.0,0.02,0.02
2,Central Harlem,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0
3,Chelsea,0.0,0.0,0.0,0.0,0.02,0.04,0.02,0.86,0.02,0.0,0.04,0.0
4,Chinatown,0.0,0.0,0.02,0.02,0.02,0.04,0.02,0.82,0.02,0.0,0.04,0.0
5,Civic Center,0.0,0.0,0.02,0.0,0.02,0.04,0.02,0.84,0.02,0.0,0.04,0.0
6,Clinton,0.0,0.0,0.0,0.0,0.02,0.04,0.02,0.86,0.02,0.0,0.04,0.0
7,East Harlem,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.94,0.0,0.02,0.0,0.02
8,East Village,0.0,0.0,0.04,0.0,0.02,0.04,0.02,0.82,0.02,0.0,0.04,0.0
9,Financial District,0.0,0.0,0.02,0.02,0.02,0.04,0.02,0.82,0.02,0.0,0.04,0.0
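
The values in this table are category shares: taking the mean of the one-hot rows for a neighborhood is the same as dividing the count of each category by the total number of venues in that neighborhood. The plain-Python sketch below illustrates this equivalence on toy data; `category_shares` and the sample rows are hypothetical and not taken from the Foursquare results.

```python
from collections import Counter, defaultdict

def category_shares(rows):
    """Return {neighborhood: {category: share of venues}} from (neighborhood, category) pairs."""
    counts = defaultdict(Counter)
    for neighborhood, category in rows:
        counts[neighborhood][category] += 1

    shares = {}
    for neighborhood, counter in counts.items():
        total = sum(counter.values())
        # the mean of a 0/1 one-hot column equals count / total
        shares[neighborhood] = {cat: n / total for cat, n in counter.items()}
    return shares

# toy (neighborhood, venue category) pairs, made up for illustration
rows = [
    ('A', 'Indian Restaurant'),
    ('A', 'Indian Restaurant'),
    ('A', 'Food Truck'),
    ('A', 'Indian Restaurant'),
    ('B', 'Indian Restaurant'),
]

shares = category_shares(rows)
print(shares)
```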


In [24]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [25]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the third use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = manhattan_grouped['Neighborhood']

for ind in np.arange(manhattan_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(manhattan_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Battery Park City,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Chinese Restaurant,Tibetan Restaurant,Snack Place,Coffee Shop
1,Carnegie Hill,Indian Restaurant,Tibetan Restaurant,South Indian Restaurant,North Indian Restaurant,Deli / Bodega,Chinese Restaurant,Asian Restaurant,Snack Place,Grocery Store,Food Truck
2,Central Harlem,Indian Restaurant,Tibetan Restaurant,South Indian Restaurant,Snack Place,North Indian Restaurant,Grocery Store,Food Truck,Deli / Bodega,Coffee Shop,Chinese Restaurant
3,Chelsea,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Tibetan Restaurant,Snack Place,Coffee Shop,Chinese Restaurant
4,Chinatown,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Coffee Shop,Chinese Restaurant,Tibetan Restaurant,Snack Place
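
The column labels above rely on a three-element suffix list plus a fallback, which is enough for ten columns. If more columns were ever needed (for example 11th or 21st), a small helper could generate the suffix explicitly. This is a sketch under that assumption; `ordinal` is a hypothetical function, not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        # 10th through 20th (and 110th through 120th, ...) always take 'th'
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```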


## Clustering Neighborhoods

In [26]:
# set number of clusters
kclusters = 5

manhattan_grouped_clustering = manhattan_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(manhattan_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([3, 2, 1, 3, 0, 3, 3, 4, 0, 0], dtype=int32)
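
To make the clustering step less of a black box, here is a minimal pure-Python sketch of the basic k-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points). It is not scikit-learn's implementation: the `kmeans` function, the fixed starting centroids, and the choice of k=2 are assumptions made for illustration. The four points are the (Indian Restaurant, Food Truck) shares of Battery Park City, Chelsea, Central Harlem, and East Harlem from the table above.

```python
def kmeans(points, centroids, max_iter=100):
    """Basic k-means loop: returns (labels, centroids)."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # assignment step: index of the nearest centroid (squared Euclidean distance)
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])))
            for point in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels

        # update step: move each centroid to the mean of its assigned points
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids

# (Indian Restaurant share, Food Truck share) for four neighborhoods
points = [(0.84, 0.04), (0.86, 0.04), (1.0, 0.0), (0.94, 0.0)]
initial_centroids = [(0.8, 0.05), (1.0, 0.0)]  # hand-picked starting points

labels, centroids = kmeans(points, initial_centroids)
print(labels)
print(centroids)
```

With only two features and two clusters, the grouping will not match the five-cluster labels above; the point is only to show the assign-and-update cycle.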

In [27]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

manhattan_merged = manhattan_data
manhattan_merged = manhattan_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

manhattan_merged.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Manhattan,Marble Hill,40.876551,-73.91066,4,Indian Restaurant,Caribbean Restaurant,Tibetan Restaurant,South Indian Restaurant,Snack Place,North Indian Restaurant,Grocery Store,Food Truck,Deli / Bodega,Coffee Shop
1,Manhattan,Chinatown,40.715618,-73.994279,0,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Coffee Shop,Chinese Restaurant,Tibetan Restaurant,Snack Place
2,Manhattan,Washington Heights,40.851903,-73.9369,4,Indian Restaurant,Food Truck,Caribbean Restaurant,Tibetan Restaurant,South Indian Restaurant,Snack Place,North Indian Restaurant,Grocery Store,Deli / Bodega,Coffee Shop
3,Manhattan,Inwood,40.867684,-73.92121,4,Indian Restaurant,Caribbean Restaurant,Tibetan Restaurant,South Indian Restaurant,Snack Place,North Indian Restaurant,Grocery Store,Food Truck,Deli / Bodega,Coffee Shop
4,Manhattan,Hamilton Heights,40.823604,-73.949688,1,Indian Restaurant,Tibetan Restaurant,South Indian Restaurant,Snack Place,North Indian Restaurant,Grocery Store,Food Truck,Deli / Bodega,Coffee Shop,Chinese Restaurant


In [28]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(manhattan_merged['Latitude'], manhattan_merged['Longitude'], manhattan_merged['Neighborhood'], manhattan_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)],
        fill=True,
        fill_color=rainbow[int(cluster)],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
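
The map needs one distinct hex color per cluster label. As a dependency-free alternative to the matplotlib colormap, the colors can be spaced evenly around the hue wheel with the standard library. This is a sketch only: `cluster_colors` is a hypothetical helper and is not what the cell above uses.

```python
import colorsys

def cluster_colors(k):
    """Return k distinct hex colors, evenly spaced in hue."""
    hex_codes = []
    for i in range(k):
        # full saturation and brightness; only the hue changes per cluster
        r, g, b = colorsys.hsv_to_rgb(i / k, 1.0, 1.0)
        hex_codes.append('#{:02x}{:02x}{:02x}'.format(
            round(r * 255), round(g * 255), round(b * 255)))
    return hex_codes

palette = cluster_colors(5)
print(palette)

# look up the color for a cluster label directly by index
color_for_cluster_3 = palette[3]
```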

### Cluster 1

In [35]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 0, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Chinatown,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Coffee Shop,Chinese Restaurant,Tibetan Restaurant,Snack Place
19,East Village,Indian Restaurant,South Indian Restaurant,Food Truck,Chinese Restaurant,North Indian Restaurant,Grocery Store,Deli / Bodega,Tibetan Restaurant,Snack Place,Coffee Shop
20,Lower East Side,Indian Restaurant,South Indian Restaurant,Food Truck,Chinese Restaurant,North Indian Restaurant,Grocery Store,Deli / Bodega,Coffee Shop,Tibetan Restaurant,Snack Place
27,Gramercy,Indian Restaurant,South Indian Restaurant,Food Truck,Chinese Restaurant,North Indian Restaurant,Grocery Store,Deli / Bodega,Tibetan Restaurant,Snack Place,Coffee Shop
29,Financial District,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Coffee Shop,Chinese Restaurant,Tibetan Restaurant,Snack Place
31,Noho,Indian Restaurant,South Indian Restaurant,Food Truck,Chinese Restaurant,North Indian Restaurant,Grocery Store,Deli / Bodega,Tibetan Restaurant,Snack Place,Coffee Shop
37,Stuyvesant Town,Indian Restaurant,South Indian Restaurant,Food Truck,Chinese Restaurant,North Indian Restaurant,Grocery Store,Deli / Bodega,Tibetan Restaurant,Snack Place,Coffee Shop


### Cluster 2

In [36]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 1, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Hamilton Heights,Indian Restaurant,Tibetan Restaurant,South Indian Restaurant,Snack Place,North Indian Restaurant,Grocery Store,Food Truck,Deli / Bodega,Coffee Shop,Chinese Restaurant
5,Manhattanville,Indian Restaurant,Food Truck,Tibetan Restaurant,South Indian Restaurant,Snack Place,North Indian Restaurant,Grocery Store,Deli / Bodega,Coffee Shop,Chinese Restaurant
6,Central Harlem,Indian Restaurant,Tibetan Restaurant,South Indian Restaurant,Snack Place,North Indian Restaurant,Grocery Store,Food Truck,Deli / Bodega,Coffee Shop,Chinese Restaurant
25,Manhattan Valley,Indian Restaurant,South Indian Restaurant,North Indian Restaurant,Tibetan Restaurant,Snack Place,Grocery Store,Food Truck,Deli / Bodega,Coffee Shop,Chinese Restaurant
26,Morningside Heights,Indian Restaurant,Food Truck,Tibetan Restaurant,South Indian Restaurant,Snack Place,North Indian Restaurant,Grocery Store,Deli / Bodega,Coffee Shop,Chinese Restaurant


### Cluster 3

In [37]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 2, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,Upper East Side,Indian Restaurant,South Indian Restaurant,Tibetan Restaurant,North Indian Restaurant,Deli / Bodega,Chinese Restaurant,Asian Restaurant,Snack Place,Grocery Store,Food Truck
9,Yorkville,Indian Restaurant,Tibetan Restaurant,South Indian Restaurant,Snack Place,North Indian Restaurant,Deli / Bodega,Chinese Restaurant,Asian Restaurant,Grocery Store,Food Truck
10,Lenox Hill,Indian Restaurant,South Indian Restaurant,Tibetan Restaurant,North Indian Restaurant,Deli / Bodega,Chinese Restaurant,Asian Restaurant,Snack Place,Grocery Store,Food Truck
11,Roosevelt Island,Indian Restaurant,South Indian Restaurant,Tibetan Restaurant,North Indian Restaurant,Deli / Bodega,Chinese Restaurant,Asian Restaurant,Snack Place,Grocery Store,Food Truck
30,Carnegie Hill,Indian Restaurant,Tibetan Restaurant,South Indian Restaurant,North Indian Restaurant,Deli / Bodega,Chinese Restaurant,Asian Restaurant,Snack Place,Grocery Store,Food Truck
34,Sutton Place,Indian Restaurant,South Indian Restaurant,Tibetan Restaurant,North Indian Restaurant,Deli / Bodega,Chinese Restaurant,Asian Restaurant,Snack Place,Grocery Store,Food Truck
35,Turtle Bay,Indian Restaurant,South Indian Restaurant,Tibetan Restaurant,North Indian Restaurant,Deli / Bodega,Chinese Restaurant,Asian Restaurant,Snack Place,Grocery Store,Food Truck


### Cluster 4

In [38]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 3, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,Clinton,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Tibetan Restaurant,Snack Place,Coffee Shop,Chinese Restaurant
15,Midtown,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Chinese Restaurant,Tibetan Restaurant,Snack Place,Coffee Shop
16,Murray Hill,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Chinese Restaurant,Tibetan Restaurant,Snack Place,Coffee Shop
17,Chelsea,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Tibetan Restaurant,Snack Place,Coffee Shop,Chinese Restaurant
18,Greenwich Village,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Chinese Restaurant,Tibetan Restaurant,Snack Place,Coffee Shop
21,Tribeca,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Chinese Restaurant,Tibetan Restaurant,Snack Place,Coffee Shop
22,Little Italy,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Chinese Restaurant,Tibetan Restaurant,Snack Place,Coffee Shop
23,Soho,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Chinese Restaurant,Tibetan Restaurant,Snack Place,Coffee Shop
24,West Village,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Tibetan Restaurant,Snack Place,Coffee Shop,Chinese Restaurant
28,Battery Park City,Indian Restaurant,South Indian Restaurant,Food Truck,North Indian Restaurant,Grocery Store,Deli / Bodega,Chinese Restaurant,Tibetan Restaurant,Snack Place,Coffee Shop


### Cluster 5

In [39]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 4, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Marble Hill,Indian Restaurant,Caribbean Restaurant,Tibetan Restaurant,South Indian Restaurant,Snack Place,North Indian Restaurant,Grocery Store,Food Truck,Deli / Bodega,Coffee Shop
2,Washington Heights,Indian Restaurant,Food Truck,Caribbean Restaurant,Tibetan Restaurant,South Indian Restaurant,Snack Place,North Indian Restaurant,Grocery Store,Deli / Bodega,Coffee Shop
3,Inwood,Indian Restaurant,Caribbean Restaurant,Tibetan Restaurant,South Indian Restaurant,Snack Place,North Indian Restaurant,Grocery Store,Food Truck,Deli / Bodega,Coffee Shop
7,East Harlem,Indian Restaurant,Tibetan Restaurant,Snack Place,Asian Restaurant,South Indian Restaurant,North Indian Restaurant,Grocery Store,Food Truck,Deli / Bodega,Coffee Shop
12,Upper West Side,Indian Restaurant,South Indian Restaurant,North Indian Restaurant,Deli / Bodega,Tibetan Restaurant,Snack Place,Grocery Store,Food Truck,Coffee Shop,Chinese Restaurant
13,Lincoln Square,Indian Restaurant,South Indian Restaurant,North Indian Restaurant,Deli / Bodega,Tibetan Restaurant,Snack Place,Grocery Store,Food Truck,Coffee Shop,Chinese Restaurant
