# Capstone Project - The Battle of Neighborhoods

## Import Libraries

In this section we import the libraries that will be required to process the data.

The first library is Pandas.
Pandas is an open source, BSD-licensed library, providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

In [1]:
import numpy as np
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim
import urllib.request
import json
from bs4 import BeautifulSoup
from urllib.request import urlopen
import requests
from pandas.io.json import json_normalize

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes
import folium

print('Libraries imported.')

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Libraries imported.


## Download and Explore Dataset


New York City has a total of 5 boroughs and 306 neighborhoods. In order to segment the neighborhoods and explore them, we need a dataset that contains the 5 boroughs and the neighborhoods in each borough, as well as the latitude and longitude coordinates of each neighborhood.

Luckily, this dataset exists for free on the web. Feel free to try to find this dataset on your own, but here is the link to the dataset: https://geo.nyu.edu/catalog/nyu_2451_34572

For your convenience, I downloaded the file and placed it on the server, so you can simply run a wget command to access the data. The cell below reads the downloaded file, `nyu_2451_34572-geojson.json`, from the working directory.

In [2]:
with open('nyu_2451_34572-geojson.json') as json_data:
    newyork_data = json.load(json_data)

#### Transform the data into a *pandas* dataframe

In [3]:
neighborhoods_data = newyork_data['features']
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude']

# collect one row per neighborhood, then build the dataframe in a single step
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON stores a point as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

# DataFrame.append was removed in pandas 2.0, so the rows go to the constructor instead
neighborhoods = pd.DataFrame(rows, columns=column_names)
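
To make the loop above concrete, here is a minimal sketch of the record shape it reads. The `sample_feature` dictionary is a hypothetical, trimmed-down feature (the real file carries more fields), and `feature_to_row` is an illustrative helper, not part of the notebook. Only the `borough`, `name`, and `coordinates` keys are taken from the code above, and the coordinates are the ones this notebook shows for Marble Hill. Note that GeoJSON orders a point as longitude first, then latitude.

```python
import json

# Hypothetical, trimmed-down GeoJSON feature; the real file has more fields.
sample_feature = json.loads("""
{
  "type": "Feature",
  "properties": {"name": "Marble Hill", "borough": "Manhattan"},
  "geometry": {"type": "Point", "coordinates": [-73.91066, 40.876551]}
}
""")


def feature_to_row(feature):
    """Flatten one feature into the four columns used by the dataframe."""
    lon, lat = feature["geometry"]["coordinates"]  # GeoJSON order: [lon, lat]
    return {
        "Borough": feature["properties"]["borough"],
        "Neighborhood": feature["properties"]["name"],
        "Latitude": lat,
        "Longitude": lon,
    }


row = feature_to_row(sample_feature)
print(row)
```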

In [34]:
neighborhoods.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688


In [35]:
neighborhoods.count()

Borough         40
Neighborhood    40
Latitude        40
Longitude       40
dtype: int64

#### Use geopy library to get the latitude and longitude values of New York City.

In order to define an instance of the geocoder, we need to define a user_agent. We will name our agent *ny_explorer*, as shown below.

In [5]:
address = 'New York City, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7127281, -74.0060152.


#### Create a map of New York with neighborhoods superimposed on top.

In [30]:
manhattan_data = neighborhoods[neighborhoods['Borough'] == 'Manhattan'].reset_index(drop=True)
manhattan_data.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688


In [31]:
manhattan_data.count()

Borough         40
Neighborhood    40
Latitude        40
Longitude       40
dtype: int64

In [7]:
import folium
# create map of New York using latitude and longitude values
map_newyork = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Borough'], manhattan_data['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='green',
        fill=True,
        fill_color='green',
        fill_opacity=0.7,
        parse_html=False).add_to(map_newyork)  
    
map_newyork

## Foursquare venues


In [8]:
def getNearbyVenues(names, latitudes, longitudes, radius=5000, categoryIds=''):
    """Query the Foursquare venue search around each neighborhood and
    return one row per venue that has a category."""
    columns = ['Neighborhood',
               'Neighborhood Latitude',
               'Neighborhood Longitude',
               'Venue',
               'Venue Latitude',
               'Venue Longitude',
               'Venue Category']
    venues_list = []

    for name, lat, lng in zip(names, latitudes, longitudes):
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, VERSION, lat, lng, radius, LIMIT)

        if categoryIds != '':
            url = url + '&categoryId={}'.format(categoryIds)

        # make the GET request
        response = requests.get(url).json()
        try:
            results = response['response']['venues']
        except KeyError:
            # report the failed request and continue with the next neighborhood
            print(name)
            print(response)
            continue

        # keep only the relevant information for each nearby venue
        for v in results:
            # skip venues that carry no category
            if not v.get('categories'):
                continue
            venues_list.append((
                name,
                lat,
                lng,
                v['name'],
                v['location']['lat'],
                v['location']['lng'],
                v['categories'][0]['name']
            ))

    return pd.DataFrame(venues_list, columns=columns)
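
The function above does two things per neighborhood: it assembles a request URL and it pulls a handful of fields out of the JSON response. The sketch below isolates both steps using only the standard library. `build_search_url` and `extract_rows` are hypothetical helper names, the credentials are placeholders, and `fake_payload` is a hand-written stand-in shaped like the `response` → `venues` structure the function reads; no request is sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"


def build_search_url(client_id, client_secret, version, lat, lng,
                     radius, limit, category_ids=""):
    """Assemble the query string with urlencode instead of str.format."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    if category_ids:
        params["categoryId"] = category_ids
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def extract_rows(name, lat, lng, payload):
    """Turn one decoded response into rows, skipping venues without a category."""
    rows = []
    for v in payload["response"]["venues"]:
        if not v.get("categories"):
            continue
        rows.append((name, lat, lng, v["name"],
                     v["location"]["lat"], v["location"]["lng"],
                     v["categories"][0]["name"]))
    return rows


# Placeholder credentials and a hand-written payload, for illustration only.
url = build_search_url("ID", "SECRET", "20181020", 40.876551, -73.91066,
                       1000, 500, "4d4b7105d754a06372d81259")
fake_payload = {"response": {"venues": [
    {"name": "IN-Tech Academy",
     "location": {"lat": 40.879101, "lng": -73.911026},
     "categories": [{"name": "High School"}]},
    {"name": "Uncategorized place",
     "location": {"lat": 40.0, "lng": -73.0},
     "categories": []},
]}}
rows = extract_rows("Marble Hill", 40.876551, -73.91066, fake_payload)
query = parse_qs(urlsplit(url).query)
print(url)
print(rows)
```

Letting `urlencode` build the query string means values are escaped automatically, which matters if a parameter ever contains characters such as `&` or spaces.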

In [9]:
LIMIT = 500 
radius = 5000 
CLIENT_ID = 'your-foursquare-client-id'  # replace with your own Foursquare client ID
CLIENT_SECRET = 'your-foursquare-client-secret'  # replace with your own secret; do not publish it
VERSION = '20181020'
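
Credentials written directly into a notebook are visible to anyone who reads it. One common alternative, sketched here, is to read them from environment variables. The helper name `load_credentials` and the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` are assumptions made for this example; they are not defined by Foursquare or by this notebook.

```python
import os


def load_credentials(env=os.environ):
    """Read the Foursquare credentials from a mapping of environment variables.

    The variable names are hypothetical; use whatever suits your setup.
    """
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first")
    return client_id, client_secret


# Demonstration with a stand-in mapping instead of the real environment.
demo_env = {"FOURSQUARE_CLIENT_ID": "demo-id",
            "FOURSQUARE_CLIENT_SECRET": "demo-secret"}
demo_id, demo_secret = load_credentials(demo_env)
print(demo_id)
```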

In [10]:
#https://developer.foursquare.com/docs/resources/categories
#College  = 4d4b7105d754a06372d81259
neighborhoods = neighborhoods[neighborhoods['Borough'] == 'Manhattan'].reset_index(drop=True)
newyork_venues_College = getNearbyVenues(names=neighborhoods['Neighborhood'], latitudes=neighborhoods['Latitude'], longitudes=neighborhoods['Longitude'], radius=1000, categoryIds='4d4b7105d754a06372d81259')
newyork_venues_College.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Marble Hill,40.876551,-73.91066,IN-Tech Academy,40.879101,-73.911026,High School
1,Marble Hill,40.876551,-73.91066,Columbia University Medical,40.87695,-73.901602,Medical School
2,Marble Hill,40.876551,-73.91066,Robert K. Kraft Field,40.872951,-73.916023,College Football Field
3,Marble Hill,40.876551,-73.91066,Lawrence A. Wien Stadium,40.872543,-73.915749,College Stadium
4,Marble Hill,40.876551,-73.91066,Wein Stadium,40.872599,-73.916518,College Stadium


In [11]:
newyork_venues_College.shape

(1855, 7)

In [12]:
def addToMap(df, color, existingMap):
    for lat, lng, local, venue, venueCat in zip(df['Venue Latitude'], df['Venue Longitude'], df['Neighborhood'], df['Venue'], df['Venue Category']):
        label = '{} ({}) - {}'.format(venue, venueCat, local)
        label = folium.Popup(label, parse_html=True)
        folium.CircleMarker(
            [lat, lng],
            radius=5,
            popup=label,
            color=color,
            fill=True,
            fill_color=color,
            fill_opacity=0.7).add_to(existingMap)

In [13]:
map_newyork_College = folium.Map(location=[latitude, longitude], zoom_start=10)
addToMap(newyork_venues_College, 'brown', map_newyork_College)

map_newyork_College

In [14]:
def addColumn(startDf, columnTitle, dataDf):
    # count the venues found in each neighborhood
    grouped = dataDf.groupby('Neighborhood').count()
    
    for n in startDf['Neighborhood']:
        try:
            startDf.loc[startDf['Neighborhood'] == n, columnTitle] = grouped.loc[n, 'Venue']
        except KeyError:
            # neighborhoods with no venues in dataDf get a count of 0
            startDf.loc[startDf['Neighborhood'] == n, columnTitle] = 0
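
The `addColumn` helper above counts venues per neighborhood and falls back to 0 when a neighborhood has none. A minimal standard-library sketch of the same idea uses `collections.Counter`, which returns 0 for missing keys. The five pairs are the Marble Hill rows shown earlier in this notebook; treating Inwood as having no venues is purely an assumption for the illustration.

```python
from collections import Counter

# (neighborhood, venue) pairs taken from the Marble Hill rows shown earlier.
venues = [
    ("Marble Hill", "IN-Tech Academy"),
    ("Marble Hill", "Columbia University Medical"),
    ("Marble Hill", "Robert K. Kraft Field"),
    ("Marble Hill", "Lawrence A. Wien Stadium"),
    ("Marble Hill", "Wein Stadium"),
]

# Count how many venues each neighborhood has.
counts = Counter(neighborhood for neighborhood, _ in venues)

# A Counter returns 0 for keys it has never seen, so no try/except is needed.
for neighborhood in ["Marble Hill", "Inwood"]:
    print(neighborhood, counts[neighborhood])
```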

In [15]:
manhattan_grouped = newyork_venues_College.groupby('Neighborhood').count()
manhattan_grouped
#print('There are {} uniques categories.'.format(len(newyork_venues_sushi['Venue Category'].unique())))

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Battery Park City,48,48,48,48,48,48
Carnegie Hill,47,47,47,47,47,47
Central Harlem,48,48,48,48,48,48
Chelsea,45,45,45,45,45,45
Chinatown,48,48,48,48,48,48
Civic Center,49,49,49,49,49,49
Clinton,45,45,45,45,45,45
East Harlem,47,47,47,47,47,47
East Village,50,50,50,50,50,50
Financial District,46,46,46,46,46,46


## Analyze Each Neighborhood

In [16]:
# one hot encoding
manhattan_onehot = pd.get_dummies(newyork_venues_College[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
manhattan_onehot['Neighborhood'] = newyork_venues_College['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [manhattan_onehot.columns[-1]] + list(manhattan_onehot.columns[:-1])
manhattan_onehot = manhattan_onehot[fixed_columns]

manhattan_onehot.head()

Unnamed: 0,Neighborhood,Art Gallery,Athletics & Sports,Auditorium,College & University,College Academic Building,College Administrative Building,College Arts Building,College Auditorium,College Baseball Diamond,College Basketball Court,College Bookstore,College Cafeteria,College Classroom,College Communications Building,College Engineering Building,College Football Field,College Gym,College History Building,College Lab,College Library,College Math Building,College Quad,College Rec Center,College Residence Hall,College Science Building,College Stadium,College Technology Building,College Tennis Court,College Theater,College Track,Community College,Deli / Bodega,Doctor's Office,Elementary School,Food Court,Fraternity House,General College & University,Gym,Gym / Fitness Center,High School,Hospital,Hotel,IT Services,Indie Theater,Law School,Library,Medical School,Office,Performing Arts Venue,Residential Building (Apartment / Condo),School,Sorority House,Student Center,Tech Startup,Tennis Court,Trade School,University
0,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
2,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Marble Hill,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [17]:
manhattan_grouped = manhattan_onehot.groupby('Neighborhood').mean().reset_index()
manhattan_grouped

Unnamed: 0,Neighborhood,Art Gallery,Athletics & Sports,Auditorium,College & University,College Academic Building,College Administrative Building,College Arts Building,College Auditorium,College Baseball Diamond,College Basketball Court,College Bookstore,College Cafeteria,College Classroom,College Communications Building,College Engineering Building,College Football Field,College Gym,College History Building,College Lab,College Library,College Math Building,College Quad,College Rec Center,College Residence Hall,College Science Building,College Stadium,College Technology Building,College Tennis Court,College Theater,College Track,Community College,Deli / Bodega,Doctor's Office,Elementary School,Food Court,Fraternity House,General College & University,Gym,Gym / Fitness Center,High School,Hospital,Hotel,IT Services,Indie Theater,Law School,Library,Medical School,Office,Performing Arts Venue,Residential Building (Apartment / Condo),School,Sorority House,Student Center,Tech Startup,Tennis Court,Trade School,University
0,Battery Park City,0.0,0.0,0.0,0.0,0.125,0.083333,0.0,0.020833,0.0,0.0,0.0,0.020833,0.125,0.0,0.0,0.0,0.020833,0.020833,0.0625,0.0,0.020833,0.0625,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.041667,0.020833,0.0,0.0,0.0,0.020833,0.0,0.020833,0.0,0.0,0.020833,0.104167
1,Carnegie Hill,0.0,0.0,0.0,0.0,0.085106,0.042553,0.021277,0.021277,0.0,0.021277,0.0,0.021277,0.042553,0.0,0.0,0.0,0.021277,0.0,0.085106,0.0,0.0,0.021277,0.0,0.085106,0.021277,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.021277,0.0,0.06383,0.085106,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.042553,0.042553,0.085106,0.0,0.0,0.06383,0.06383
2,Central Harlem,0.0,0.0,0.0,0.0,0.1875,0.083333,0.041667,0.0,0.0,0.0,0.0,0.041667,0.1875,0.0,0.0,0.0,0.020833,0.0,0.083333,0.020833,0.0,0.041667,0.0,0.020833,0.041667,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.020833,0.0625
3,Chelsea,0.022222,0.0,0.0,0.0,0.111111,0.022222,0.066667,0.066667,0.0,0.0,0.0,0.0,0.088889,0.022222,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.022222,0.0,0.022222,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.088889,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.022222,0.022222,0.022222,0.0,0.088889,0.111111
4,Chinatown,0.0,0.020833,0.020833,0.020833,0.020833,0.0625,0.041667,0.0,0.0,0.0,0.020833,0.020833,0.145833,0.0,0.0,0.0,0.0625,0.020833,0.0625,0.041667,0.0,0.020833,0.020833,0.083333,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.104167,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.020833,0.0,0.0,0.0,0.0625
5,Civic Center,0.0,0.020408,0.020408,0.0,0.122449,0.081633,0.020408,0.0,0.0,0.0,0.0,0.0,0.081633,0.0,0.0,0.0,0.040816,0.0,0.081633,0.020408,0.020408,0.061224,0.020408,0.040816,0.0,0.0,0.0,0.0,0.0,0.020408,0.020408,0.0,0.0,0.0,0.0,0.0,0.040816,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.061224,0.040816,0.0,0.0,0.0,0.0,0.020408,0.040816,0.0,0.0,0.0,0.020408,0.102041
6,Clinton,0.0,0.0,0.0,0.0,0.133333,0.022222,0.0,0.022222,0.0,0.0,0.044444,0.0,0.088889,0.0,0.0,0.0,0.022222,0.0,0.044444,0.022222,0.022222,0.044444,0.022222,0.022222,0.0,0.0,0.022222,0.0,0.044444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.022222,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.066667,0.0,0.0,0.155556,0.111111
7,East Harlem,0.0,0.0,0.0,0.042553,0.085106,0.0,0.0,0.042553,0.0,0.0,0.021277,0.021277,0.0,0.0,0.0,0.0,0.042553,0.0,0.12766,0.0,0.0,0.0,0.0,0.106383,0.021277,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.06383,0.085106,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.0,0.042553,0.170213,0.0,0.0,0.021277,0.021277
8,East Village,0.0,0.0,0.0,0.0,0.16,0.12,0.06,0.04,0.0,0.0,0.0,0.06,0.04,0.02,0.0,0.0,0.02,0.0,0.0,0.04,0.0,0.02,0.0,0.1,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.08,0.0,0.0,0.0,0.1
9,Financial District,0.0,0.0,0.0,0.0,0.108696,0.086957,0.0,0.0,0.0,0.0,0.0,0.021739,0.152174,0.0,0.0,0.0,0.021739,0.021739,0.065217,0.0,0.0,0.043478,0.0,0.065217,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.130435,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.021739,0.0,0.0,0.0,0.043478,0.0,0.043478,0.0,0.0,0.021739,0.108696
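
The two cells above first one-hot encode the venue category and then average the indicator columns per neighborhood, which yields the relative frequency of each category. The plain-Python sketch below reproduces that arithmetic on the five Marble Hill rows shown earlier. Because it uses only those five sample rows, the numbers are illustrative and will not match the full dataset.

```python
# Venue categories of the five Marble Hill rows shown earlier.
categories = ["High School", "Medical School", "College Football Field",
              "College Stadium", "College Stadium"]

# One column per distinct category, one row per venue.
columns = sorted(set(categories))
onehot = [[1 if category == column else 0 for column in columns]
          for category in categories]

# The mean of each indicator column is the category's relative frequency.
frequency = {
    column: sum(row[i] for row in onehot) / len(onehot)
    for i, column in enumerate(columns)
}
print(frequency)
```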


In [18]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
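
The function above sorts one neighborhood's category frequencies in descending order and keeps the first `num_top_venues` names. The sketch below does the same with a plain dictionary. `top_categories` is a hypothetical helper, and the frequencies are derived from the five Marble Hill sample rows shown earlier, not from the full dataset.

```python
def top_categories(frequency, n):
    """Return the n category names with the highest frequency."""
    # sorted() is stable, so tied categories keep their dictionary order.
    ranked = sorted(frequency.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n]]


sample_frequency = {
    "High School": 0.2,
    "Medical School": 0.2,
    "College Football Field": 0.2,
    "College Stadium": 0.4,
}
top_two = top_categories(sample_frequency, 2)
print(top_two)
```

How ties are ordered depends on the sorting algorithm, so categories with equal frequency may appear in a different order in the pandas output.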

In [19]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = manhattan_grouped['Neighborhood']

for ind in np.arange(manhattan_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(manhattan_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Battery Park City,College Classroom,College Academic Building,University,General College & University,College Administrative Building,College Quad,College Lab,Law School,College Residence Hall,Community College
1,Carnegie Hill,General College & University,College Academic Building,Student Center,College Residence Hall,College Lab,University,Trade School,Fraternity House,College Classroom,School
2,Central Harlem,College Academic Building,College Classroom,General College & University,College Administrative Building,College Lab,University,College Arts Building,College Cafeteria,College Quad,College Science Building
3,Chelsea,University,College Academic Building,General College & University,College Classroom,Trade School,College Arts Building,College Auditorium,College Lab,College Theater,College Residence Hall
4,Chinatown,College Classroom,General College & University,College Residence Hall,University,College Lab,College Administrative Building,College Gym,Sorority House,College Library,College Arts Building
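
The column headers above are built from a three-item suffix list with a fallback to `th`. That is correct for ten columns, but it would label a 21st column as `21th`. A small sketch of a general ordinal helper follows; `ordinal` is a hypothetical name introduced for this example.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    # 11, 12 and 13 (and 111, 112, ...) take 'th' despite ending in 1, 2, 3.
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)


headers = ["{} Most Common Venue".format(ordinal(i)) for i in range(1, 11)]
print(headers)
```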


## Cluster Neighborhoods


In [20]:
# set number of clusters
kclusters = 8

manhattan_grouped_clustering = manhattan_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(manhattan_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([5, 7, 5, 1, 5, 5, 1, 7, 2, 5])
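
For readers who want to see what the clustering step does, here is a minimal pure-Python sketch of the basic k-means loop (Lloyd's algorithm): assign each point to its nearest centroid, then move each centroid to the mean of its points, and repeat until the assignment stops changing. It is not scikit-learn's implementation. In particular, it seeds the centroids with the first `k` points, which is an assumption made to keep the example deterministic, whereas `KMeans` uses k-means++ initialization by default. The four 2-D points are made up.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Basic Lloyd's algorithm; returns (labels, centroids)."""
    # Assumption: seed with the first k points (scikit-learn uses k-means++).
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Made-up points forming two obvious groups.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```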

In [37]:
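# add clustering labels (guarded so that re-running the cell does not fail)
if 'Cluster Labels' not in neighborhoods_venues_sorted.columns:
    neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

manhattan_merged = manhattan_data
manhattan_merged = manhattan_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

manhattan_merged.head() 

Running this cell a second time without the guard raises `ValueError: cannot insert Cluster Labels, already exists`, because the column is already present after the first run.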

In [22]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(manhattan_merged['Latitude'], manhattan_merged['Longitude'], manhattan_merged['Neighborhood'], manhattan_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
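
The map cell above needs one visually distinct color per cluster and gets it from matplotlib's rainbow colormap. As a rough standard-library alternative, the sketch below spaces hues evenly around the color wheel with `colorsys`. `cluster_palette` is a hypothetical helper, and the resulting palette is not identical to the rainbow colormap.

```python
import colorsys


def cluster_palette(k):
    """Return k hex colors with evenly spaced hues."""
    palette = []
    for i in range(k):
        # Full saturation and brightness; only the hue changes per cluster.
        r, g, b = colorsys.hsv_to_rgb(i / k, 1.0, 1.0)
        palette.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette


palette = cluster_palette(8)
# Index the palette directly by cluster label (labels run from 0 to k - 1).
print(palette[0], palette[5])
```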

In [23]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 0, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,Upper East Side,College Academic Building,College Classroom,College Library,University,General College & University,Student Center,College Lab,College Administrative Building,College Arts Building,College Basketball Court
12,Upper West Side,Student Center,General College & University,College Academic Building,College Classroom,College Residence Hall,College Library,College Arts Building,College Science Building,College Technology Building,Trade School
13,Lincoln Square,College Academic Building,General College & University,College Residence Hall,College Classroom,University,College Administrative Building,College Arts Building,College Library,College Theater,Student Center


In [24]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 1, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
9,Yorkville,University,College Classroom,Student Center,Trade School,College Academic Building,Sorority House,School,College Administrative Building,College Lab,College Auditorium
14,Clinton,Trade School,College Academic Building,University,College Classroom,Student Center,College Bookstore,College Lab,College Quad,College Theater,College Residence Hall
15,Midtown,College Academic Building,University,General College & University,College Classroom,Trade School,College Administrative Building,College Lab,Doctor's Office,Fraternity House,College Theater
17,Chelsea,University,College Academic Building,General College & University,College Classroom,Trade School,College Arts Building,College Auditorium,College Lab,College Theater,College Residence Hall
33,Midtown South,University,College Academic Building,Student Center,Trade School,College Administrative Building,College Classroom,College Lab,College Theater,College Quad,College Library
39,Hudson Yards,University,Trade School,College Academic Building,College Classroom,College Administrative Building,College Bookstore,College Quad,College Residence Hall,College Theater,Student Center


In [25]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 2, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Manhattanville,College Academic Building,University,College Quad,College Administrative Building,College Library,College Residence Hall,College Cafeteria,College Science Building,College Arts Building,General College & University
18,Greenwich Village,College Academic Building,University,College Cafeteria,Student Center,College Arts Building,College Auditorium,Law School,College Residence Hall,College Library,College Gym
19,East Village,College Academic Building,College Administrative Building,University,College Residence Hall,Student Center,College Arts Building,College Cafeteria,General College & University,School,College Auditorium
22,Little Italy,College Academic Building,College Administrative Building,University,Law School,College Gym,College Library,College Lab,College Cafeteria,College Auditorium,College Classroom
23,Soho,College Academic Building,Law School,College Library,College Cafeteria,College Administrative Building,College Residence Hall,Student Center,College Lab,College Arts Building,College Gym
24,West Village,College Academic Building,Law School,University,College Library,College Cafeteria,College Auditorium,College Residence Hall,Student Center,Performing Arts Venue,General College & University
26,Morningside Heights,College Academic Building,College Residence Hall,University,College Library,General College & University,College Quad,College Administrative Building,College Cafeteria,College Engineering Building,College Science Building
31,Noho,College Academic Building,Student Center,College Cafeteria,College Arts Building,College Residence Hall,College Library,Law School,College Administrative Building,College Auditorium,College Gym
38,Flatiron,College Academic Building,University,College Cafeteria,General College & University,College Arts Building,College Auditorium,College Residence Hall,College Administrative Building,College Classroom,College Library


In [26]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 3, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
25,Manhattan Valley,College Residence Hall,General College & University,College Library,Fraternity House,College Academic Building,Student Center,College Cafeteria,University,College Classroom,College Administrative Building
27,Gramercy,College Residence Hall,College Academic Building,University,College Administrative Building,Student Center,College Library,Medical School,College Theater,College Science Building,College Auditorium
37,Stuyvesant Town,College Residence Hall,College Lab,College Academic Building,Student Center,College Library,Fraternity House,College Arts Building,Medical School,College Quad,General College & University


In [27]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 4, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,Lenox Hill,College Academic Building,Medical School,College Lab,College Library,General College & University,College Science Building,University,College Administrative Building,Student Center,College Cafeteria
11,Roosevelt Island,Medical School,College Lab,University,College Library,College Administrative Building,College Science Building,College Quad,General College & University,College Academic Building,College Gym
16,Murray Hill,College Academic Building,University,College Classroom,College Residence Hall,Medical School,College Lab,Student Center,General College & University,College Library,College Administrative Building
34,Sutton Place,College Academic Building,Medical School,General College & University,University,College Lab,College Science Building,College Library,College Classroom,College Administrative Building,College Gym
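
The per-cluster listings above filter the merged dataframe one label at a time. The same grouping can be sketched in a single pass with a dictionary of lists. The names are the first ten rows of `manhattan_grouped` and the labels are the first ten values of `kmeans.labels_`, both as printed earlier in this notebook; the remaining neighborhoods are left out.

```python
from collections import defaultdict

# First ten neighborhoods of manhattan_grouped and their printed labels.
names = ["Battery Park City", "Carnegie Hill", "Central Harlem", "Chelsea",
         "Chinatown", "Civic Center", "Clinton", "East Harlem",
         "East Village", "Financial District"]
cluster_labels = [5, 7, 5, 1, 5, 5, 1, 7, 2, 5]

# Collect the neighborhood names under their cluster label.
clusters = defaultdict(list)
for name, label in zip(names, cluster_labels):
    clusters[label].append(name)

for label in sorted(clusters):
    print(label, clusters[label])
```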
