# **Battle of Neighbourhoods: IBM Capstone Project**

## **Introduction**

For my capstone project, I compare two New York boroughs, Manhattan and the Bronx, in depth using location data from the Foursquare API. The aim is to find the ten most common venue types in each neighbourhood of the two boroughs and then cluster the neighbourhoods by those venue types, so that interested parties such as data analysts, location data managers and tourists can see how the areas differ.

**New York**, often called the Big Apple, comprises five boroughs sitting where the Hudson River meets the Atlantic Ocean. It is the largest city in the United States, with a long history of international immigration and a population of 8.149 million as of 2019. Its iconic sites include skyscrapers such as the Empire State Building and the sprawling Central Park. The city's demographics show a large and ethnically diverse metropolis, and New York City is known worldwide as a cultural melting pot. While other places have had immigration surges, none compares in the diversity and sheer number of immigrants who have made their way to the city. It is also the world's financial epicentre, home to the NYSE and NASDAQ, and a hub for communications, real estate, insurance, technology, entertainment and healthcare.

**Manhattan**, originally the only borough in the city, has the smallest land area of the five and a diverse population of 1.6 million (example neighbourhoods include Soho and Harlem). It has a rich culture, is a leading centre for the performing arts, and is home to iconic landmarks: Central Park, skyscrapers, the Empire State Building, historic cathedrals, Wall Street, Grand Central Station and more.
People get around by subway, on foot or by taxi. The borough is divided into three areas, Uptown, Midtown and Downtown, and its roads are laid out as Avenues (running north to south) and Streets (running east to west).


**The Bronx** is the fourth-largest borough, the northernmost, and the only one on the American mainland. It has a population of 1.3 million and is largely residential, with vibrant neighbourhoods, although some areas became a symbol of the urban decay of the 1960s to 1980s. It is the home of hip hop music, Yankee Stadium and the New York Botanical Garden.

### **Data**

To compare the two boroughs, Manhattan and the Bronx, the Foursquare API is used to gather data on the ten most common venue types in each of their neighbourhoods.
Data scraping and cleaning are done with Python packages.
Neighbourhood and borough data for New York come from the week 3 lab of the course: https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/labs/newyork_data.json

In [1]:
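# Libraries
import json  # parse the neighbourhood JSON file

import numpy as np  # vectorised array operations
import pandas as pd  # tabular data analysis
import requests  # HTTP requests to the Foursquare API

# geopy converts an address into latitude and longitude values
!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim

# Flatten nested JSON into a DataFrame (top-level import, pandas 1.0+)
from pandas import json_normalize

# Matplotlib colour utilities for the cluster map
import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans  # k-means clustering

print('Libraries imported.')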

Libraries imported.


In [2]:
!pip install folium



In [3]:
import folium

## **Gathering the Data**

In [4]:
address = 'New York City, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7127281, -74.0060152.


In [5]:
!wget -q -O 'newyork_data.json' https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/labs/newyork_data.json
print('Data downloaded!')

Data downloaded!


In [6]:
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)

In [7]:
newyork_data

{'type': 'FeatureCollection',
 'totalFeatures': 306,
 'features': [{'type': 'Feature',
   'id': 'nyu_2451_34572.1',
   'geometry': {'type': 'Point',
    'coordinates': [-73.84720052054902, 40.89470517661]},
   'geometry_name': 'geom',
   'properties': {'name': 'Wakefield',
    'stacked': 1,
    'annoline1': 'Wakefield',
    'annoline2': None,
    'annoline3': None,
    'annoangle': 0.0,
    'borough': 'Bronx',
    'bbox': [-73.84720052054902,
     40.89470517661,
     -73.84720052054902,
     40.89470517661]}},
  {'type': 'Feature',
   'id': 'nyu_2451_34572.2',
   'geometry': {'type': 'Point',
    'coordinates': [-73.82993910812398, 40.87429419303012]},
   'geometry_name': 'geom',
   'properties': {'name': 'Co-op City',
    'stacked': 2,
    'annoline1': 'Co-op',
    'annoline2': 'City',
    'annoline3': None,
    'annoangle': 0.0,
    'borough': 'Bronx',
    'bbox': [-73.82993910812398,
     40.87429419303012,
     -73.82993910812398,
     40.87429419303012]}},
  {'type': 'Feature',
 

In [8]:
ny_data = newyork_data['features']

In [9]:
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 
ny_df = pd.DataFrame(columns=column_names)
ny_df

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude


In [10]:
# Collect one row per neighbourhood, then build the DataFrame in a single step
rows = []
for data in ny_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

ny_df = pd.DataFrame(rows, columns=column_names)

In [11]:
ny_df.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


In [12]:
# create map of New York using latitude and longitude values
map_newyork = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(ny_df['Latitude'], ny_df['Longitude'], ny_df['Borough'], ny_df['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_newyork)
    
map_newyork

In [13]:
CLIENT_ID = 'MBYQSQXJ3J13FRYK5V3BSAGLW2KNSDVBKCHOX5SEEWBF5UMQ' #  Foursquare ID
CLIENT_SECRET = '5SNXXVII5HAQ5QDJ4KLTJ3OO20ZN5MXAWWBXMY2AAETXYJQE' #  Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: MBYQSQXJ3J13FRYK5V3BSAGLW2KNSDVBKCHOX5SEEWBF5UMQ
CLIENT_SECRET:5SNXXVII5HAQ5QDJ4KLTJ3OO20ZN5MXAWWBXMY2AAETXYJQE
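
Hardcoding and printing API credentials in a shared notebook exposes them to every reader. A minimal sketch of one alternative is to read them from environment variables and print only a masked form. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are assumptions made for this example, not part of the original notebook or of the Foursquare API.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from a mapping of environment variables.

    FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET are hypothetical
    variable names chosen for this sketch.
    """
    try:
        return env['FOURSQUARE_CLIENT_ID'], env['FOURSQUARE_CLIENT_SECRET']
    except KeyError as missing:
        raise RuntimeError('Missing environment variable: {}'.format(missing)) from None


def mask(secret, visible=4):
    """Show only the first few characters so a value can be checked without leaking it."""
    return secret[:visible] + '*' * max(len(secret) - visible, 0)


# Example with a stand-in mapping; in the notebook, call
# load_foursquare_credentials() with no argument to read os.environ.
demo_id, demo_secret = load_foursquare_credentials(
    {'FOURSQUARE_CLIENT_ID': 'demo-id', 'FOURSQUARE_CLIENT_SECRET': 'demo-secret'})
print('CLIENT_ID: ' + mask(demo_id))
```

With this approach the notebook can be published without the keys appearing in either the code or the cell output.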


In [17]:
bronx_data = ny_df[ny_df['Borough'] == 'Bronx'].reset_index(drop=True)
bronx_data.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


In [18]:
bronx_data.loc[4, 'Neighborhood']

'Riverdale'

In [19]:
neighborhood_latitude = bronx_data.loc[4, 'Latitude']
neighborhood_longitude = bronx_data.loc[4, 'Longitude']

neighborhood_name = bronx_data.loc[4, 'Neighborhood']

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Riverdale are 40.890834493891305, -73.9125854610857.


In [20]:
LIMIT = 100   # maximum number of venues to request
radius = 500  # search radius in metres
url="https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url

'https://api.foursquare.com/v2/venues/explore?&client_id=MBYQSQXJ3J13FRYK5V3BSAGLW2KNSDVBKCHOX5SEEWBF5UMQ&client_secret=5SNXXVII5HAQ5QDJ4KLTJ3OO20ZN5MXAWWBXMY2AAETXYJQE&v=20180605&ll=40.890834493891305,-73.9125854610857&radius=500&limit=100'
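
The request URL above is assembled with positional `format` placeholders, which is easy to get out of order. As a sketch of an alternative, the same query string can be built from named parameters with the standard library's `urlencode`. The function name `build_explore_url` and the placeholder credentials are assumptions for illustration; the endpoint and parameter names are the ones already used in the notebook.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'  # endpoint used in the notebook


def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    """Assemble the explore URL from named parameters instead of positional placeholders."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    return EXPLORE_ENDPOINT + '?' + urlencode(params)


# Placeholder credentials, for illustration only
demo_url = build_explore_url('demo-id', 'demo-secret', '20180605', 40.890834, -73.912585)

# Parse the query string back to confirm each parameter survived encoding
demo_params = parse_qs(urlsplit(demo_url).query)
print(demo_params['ll'])
```

Because every value is named, swapping two arguments by accident becomes much harder, and `urlencode` takes care of escaping characters such as the comma in `ll`.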

In [21]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '607b66b15474d24d96ee1876'},
 'response': {'headerLocation': 'Riverdale',
  'headerFullLocation': 'Riverdale, Bronx',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 11,
  'suggestedBounds': {'ne': {'lat': 40.89533449839131,
    'lng': -73.90664385942961},
   'sw': {'lat': 40.8863344893913, 'lng': -73.91852706274179}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4c268383136d20a15a83e561',
       'name': 'Riverdale Ave',
       'location': {'lat': 40.890424929507866,
        'lng': -73.91024841803598,
        'labeledLatLngs': [{'label': 'display',
          'lat': 40.890424929507866,
          'lng': -73.91024841803598}],
        'distance': 201,
        'postalCode': '10463',
        'cc': 'US',
        'city': 'B
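
Each venue in the response carries a `distance` field, which appears to be the distance in metres from the `ll` point sent in the request. As a rough check, and assuming a spherical Earth, the haversine formula can be applied to the two coordinate pairs shown in this notebook. The function name `haversine_m` and the Earth radius constant are choices made for this sketch.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres (a common approximation)


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (latitude, longitude) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# Riverdale centre and the 'Riverdale Ave' venue, both copied from the outputs above
centre = (40.890834493891305, -73.9125854610857)
venue = (40.890424929507866, -73.91024841803598)
distance = haversine_m(*centre, *venue)
print(round(distance))
```

The result should land close to the reported `'distance': 201` and well inside the 500 m search radius.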

In [22]:
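def get_category_type(row):
    """Return the name of a venue's first category, or None if it has none."""
    try:
        categories_list = row['categories']
    except KeyError:
        # Column not renamed yet: fall back to the flattened JSON name
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']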

In [23]:
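venues = results['response']['groups'][0]['items']

# Flatten the nested JSON into a DataFrame
nearby_venues = json_normalize(venues)

# Keep only the columns needed for the analysis
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# Replace each list of categories with the name of the first category
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# Strip the 'venue.' and 'venue.location.' prefixes from the column names
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()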


Unnamed: 0,name,categories,lat,lng
0,Riverdale Ave,Plaza,40.890425,-73.910248
1,Bell Tower Park,Park,40.889178,-73.908331
2,Chase Bank,Bank,40.888089,-73.907921
3,Seton Park,Park,40.887914,-73.916113
4,"Ankle, Back & Knee Braces",Medical Supply Store,40.89312,-73.911102


In [24]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    """Query Foursquare for venues near each neighbourhood and return them as one DataFrame."""
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        # Build the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)

        # Make the GET request and keep only the list of recommended venues
        results = requests.get(url).json()["response"]['groups'][0]['items']

        # Keep only the relevant fields for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # Flatten the per-neighbourhood lists into a single table
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                             'Neighborhood Latitude',
                             'Neighborhood Longitude',
                             'Venue',
                             'Venue Latitude',
                             'Venue Longitude',
                             'Venue Category']

    return nearby_venues
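
`getNearbyVenues` indexes straight into the response, so a venue with an empty `categories` list or a response without `groups` would raise an exception partway through the loop. The sketch below shows one possible defensive version of the parsing step only. The function name `extract_venues` and the sample payload are made up for illustration, and the payload merely mimics the response shape printed earlier in this notebook.

```python
def extract_venues(payload, neighborhood, lat, lng):
    """Turn one explore response (already decoded from JSON) into a list of row tuples.

    Venues without a category are kept with None rather than raising IndexError.
    """
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else None
        rows.append((neighborhood, lat, lng,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows


# Hand-written payload that mimics the response shape (not real API output)
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Bell Tower Park',
               'location': {'lat': 40.889178, 'lng': -73.908331},
               'categories': [{'name': 'Park'}]}},
    {'venue': {'name': 'Unlabelled Spot',
               'location': {'lat': 40.889, 'lng': -73.909},
               'categories': []}},
]}]}}

sample_rows = extract_venues(sample, 'Riverdale', 40.890834, -73.912585)
empty_rows = extract_venues({'response': {}}, 'Riverdale', 40.890834, -73.912585)
print(len(sample_rows), len(empty_rows))
```

Keeping the parsing separate from the HTTP call also makes it possible to test the logic without spending API quota.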

In [25]:
bronx_venues = getNearbyVenues(names=bronx_data['Neighborhood'],
                                   latitudes=bronx_data['Latitude'],
                                   longitudes=bronx_data['Longitude']
                                  )

Wakefield
Co-op City
Eastchester
Fieldston
Riverdale
Kingsbridge
Woodlawn
Norwood
Williamsbridge
Baychester
Pelham Parkway
City Island
Bedford Park
University Heights
Morris Heights
Fordham
East Tremont
West Farms
High  Bridge
Melrose
Mott Haven
Port Morris
Longwood
Hunts Point
Morrisania
Soundview
Clason Point
Throgs Neck
Country Club
Parkchester
Westchester Square
Van Nest
Morris Park
Belmont
Spuyten Duyvil
North Riverdale
Pelham Bay
Schuylerville
Edgewater Park
Castle Hill
Olinville
Pelham Gardens
Concourse
Unionport
Edenwald
Claremont Village
Concourse Village
Mount Eden
Mount Hope
Bronxdale
Allerton
Kingsbridge Heights


In [26]:
print(bronx_venues.shape)
bronx_venues.head()

(1196, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Wakefield,40.894705,-73.847201,Lollipops Gelato,40.894123,-73.845892,Dessert Shop
1,Wakefield,40.894705,-73.847201,Rite Aid,40.896649,-73.844846,Pharmacy
2,Wakefield,40.894705,-73.847201,Walgreens,40.896528,-73.8447,Pharmacy
3,Wakefield,40.894705,-73.847201,Carvel Ice Cream,40.890487,-73.848568,Ice Cream Shop
4,Wakefield,40.894705,-73.847201,Dunkin',40.890459,-73.849089,Donut Shop


In [27]:
bronx_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Allerton,29,29,29,29,29,29
Baychester,23,23,23,23,23,23
Bedford Park,34,34,34,34,34,34
Belmont,99,99,99,99,99,99
Bronxdale,13,13,13,13,13,13
Castle Hill,6,6,6,6,6,6
City Island,25,25,25,25,25,25
Claremont Village,21,21,21,21,21,21
Clason Point,13,13,13,13,13,13
Co-op City,16,16,16,16,16,16


In [28]:
bronx_onehot = pd.get_dummies(bronx_venues[['Venue Category']], prefix="", prefix_sep="")

bronx_onehot['Neighborhood'] = bronx_venues['Neighborhood'] 

fixed_columns = [bronx_onehot.columns[-1]] + list(bronx_onehot.columns[:-1])
bronx_onehot = bronx_onehot[fixed_columns]

bronx_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,African Restaurant,American Restaurant,Arcade,Arepa Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,...,Trail,Train Station,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Waste Facility,Wine Shop,Wings Joint,Women's Store
0,Wakefield,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Wakefield,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Wakefield,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Wakefield,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Wakefield,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [29]:
bronx_onehot.shape

(1196, 169)

In [31]:
bronx_grouped = bronx_onehot.groupby('Neighborhood').mean().reset_index()
bronx_grouped

Unnamed: 0,Neighborhood,Accessories Store,African Restaurant,American Restaurant,Arcade,Arepa Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,...,Trail,Train Station,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Waste Facility,Wine Shop,Wings Joint,Women's Store
0,Allerton,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Baychester,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Bedford Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Belmont,0.0,0.0,0.010101,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010101,0.0
4,Bronxdale,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Castle Hill,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,City Island,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.04,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0
7,Claremont Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Clason Point,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Co-op City,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [32]:
bronx_grouped.shape

(52, 169)
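
The one-hot encoding followed by `groupby('Neighborhood').mean()` works because the mean of a 0/1 column within a group is simply the share of that group's rows where the value is 1. In other words, each cell of the grouped table is the fraction of a neighbourhood's venues that fall in a given category. The sketch below reproduces the same arithmetic with the standard library on a toy data set; the function name and the data are invented for illustration.

```python
from collections import Counter, defaultdict


def category_frequencies(venues):
    """Map each neighbourhood to {category: share of its venues in that category}."""
    counts = defaultdict(Counter)
    for neighborhood, category in venues:
        counts[neighborhood][category] += 1
    return {
        neighborhood: {cat: n / sum(counter.values()) for cat, n in counter.items()}
        for neighborhood, counter in counts.items()
    }


# Toy (neighbourhood, category) pairs, invented for illustration
toy_venues = [
    ('A', 'Pizza Place'), ('A', 'Pizza Place'), ('A', 'Park'), ('A', 'Bank'),
    ('B', 'Diner'), ('B', 'Park'),
]
freqs = category_frequencies(toy_venues)
print(freqs['A']['Pizza Place'])
```

One consequence is that the frequencies in each row sum to 1, so a neighbourhood with very few venues can show large frequencies from only one or two places.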

In [33]:
num_top_venues = 5

for hood in bronx_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = bronx_grouped[bronx_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Allerton----
                venue  freq
0         Pizza Place  0.14
1       Deli / Bodega  0.10
2         Supermarket  0.10
3      Discount Store  0.07
4  Chinese Restaurant  0.03


----Baychester----
                  venue  freq
0            Donut Shop  0.09
1    Mexican Restaurant  0.04
2     Electronics Store  0.04
3        Discount Store  0.04
4  Fast Food Restaurant  0.04


----Bedford Park----
                venue  freq
0               Diner  0.12
1         Pizza Place  0.12
2  Mexican Restaurant  0.09
3  Chinese Restaurant  0.06
4      Sandwich Place  0.06


----Belmont----
                venue  freq
0  Italian Restaurant  0.18
1         Pizza Place  0.10
2       Deli / Bodega  0.09
3              Bakery  0.05
4      Sandwich Place  0.03


----Bronxdale----
                venue  freq
0  Chinese Restaurant  0.15
1       Deli / Bodega  0.08
2         Pizza Place  0.08
3      Breakfast Spot  0.08
4         Supermarket  0.08


----Castle Hill----
         venue  freq
0     

In [35]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [36]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # Past 3rd, every rank up to 10th takes the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = bronx_grouped['Neighborhood']

for ind in np.arange(bronx_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(bronx_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Allerton,Pizza Place,Deli / Bodega,Supermarket,Discount Store,Chinese Restaurant,Electronics Store,Spa,Bus Station,Gas Station,Pharmacy
1,Baychester,Donut Shop,Mexican Restaurant,Electronics Store,Discount Store,Fast Food Restaurant,Sandwich Place,Pizza Place,Convenience Store,Pet Store,Fried Chicken Joint
2,Bedford Park,Diner,Pizza Place,Mexican Restaurant,Chinese Restaurant,Sandwich Place,Deli / Bodega,Baseball Field,Bus Station,Pub,Burger Joint
3,Belmont,Italian Restaurant,Pizza Place,Deli / Bodega,Bakery,Sandwich Place,Dessert Shop,Grocery Store,Bank,Bar,Shoe Store
4,Bronxdale,Chinese Restaurant,Deli / Bodega,Pizza Place,Breakfast Spot,Supermarket,Mexican Restaurant,Gym,Performing Arts Venue,Bank,Spanish Restaurant
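
The column labels above rely on a three-item suffix list plus exception handling, which is fine for ten ranks but would mislabel ranks such as 21 or 22 if `num_top_venues` were ever raised. A small helper can generate the suffix directly. This is an optional sketch; the names `ordinal` and `column_labels` are not part of the original notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th and 13th are irregular
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


# Same labels as the notebook produces for the top ten venue types
column_labels = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)]
print(column_labels[1], '|', column_labels[-1])
```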


In [37]:
kclusters = 5

bronx_grouped_clustering = bronx_grouped.drop(columns='Neighborhood')

kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(bronx_grouped_clustering)

kmeans.labels_[0:10]

array([2, 2, 2, 2, 2, 2, 2, 2, 1, 2], dtype=int32)
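
Nine of the first ten labels fall in cluster 2, and the per-cluster tables later in this notebook appear to show clusters 0, 3 and 4 holding a single neighbourhood each. Before interpreting the clusters it can help to count how many neighbourhoods land in each one. The sketch below does that for the ten labels printed above; in the notebook the full `kmeans.labels_` array would be passed instead. The helper `tiny_clusters` and its `min_size` threshold are assumptions for this example.

```python
from collections import Counter


def tiny_clusters(labels, min_size=2):
    """Return the cluster labels that have fewer than min_size members."""
    return sorted(label for label, size in Counter(labels).items() if size < min_size)


# First ten labels, copied from the output above
first_labels = [2, 2, 2, 2, 2, 2, 2, 2, 1, 2]

cluster_sizes = Counter(first_labels)
for label, size in sorted(cluster_sizes.items()):
    print('Cluster {}: {} neighbourhoods'.format(label, size))

print('Clusters with a single member:', tiny_clusters(first_labels))
```

Very small clusters often point to outlier neighbourhoods with only a handful of venues, which may be a reason to try a different `kclusters` value.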

In [38]:
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

bronx_merged = bronx_data
bronx_merged = bronx_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

bronx_merged.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bronx,Wakefield,40.894705,-73.847201,1,Pharmacy,Deli / Bodega,Laundromat,Donut Shop,Sandwich Place,Dessert Shop,Ice Cream Shop,Lake,Pizza Place,Paper / Office Supplies Store
1,Bronx,Co-op City,40.874294,-73.829939,2,Pizza Place,Accessories Store,Basketball Court,Donut Shop,Restaurant,Fast Food Restaurant,Pharmacy,Park,Bus Station,Discount Store
2,Bronx,Eastchester,40.887556,-73.827806,2,Bus Station,Caribbean Restaurant,Diner,Deli / Bodega,Seafood Restaurant,Platform,Bus Stop,Bowling Alley,Pizza Place,Chinese Restaurant
3,Bronx,Fieldston,40.895437,-73.905643,0,Plaza,River,Bus Station,Medical Supply Store,Accessories Store,Peruvian Restaurant,Nightclub,Other Great Outdoors,Outlet Store,Paper / Office Supplies Store
4,Bronx,Riverdale,40.890834,-73.912585,1,Bus Station,Park,Plaza,Playground,Home Service,Gym,Medical Supply Store,Baseball Field,Bank,Accessories Store


In [39]:
map_clusters = folium.Map(location=[40.7127281,-74.0060152], zoom_start=11)


# One rainbow colour per cluster, as hex strings
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# Add one marker per neighbourhood, coloured by its cluster
for lat, lon, poi, cluster in zip(bronx_merged['Latitude'], bronx_merged['Longitude'], bronx_merged['Neighborhood'], bronx_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [40]:
#Red cluster
bronx_merged.loc[bronx_merged['Cluster Labels'] == 0, bronx_merged.columns[[1] + list(range(5, bronx_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Fieldston,Plaza,River,Bus Station,Medical Supply Store,Accessories Store,Peruvian Restaurant,Nightclub,Other Great Outdoors,Outlet Store,Paper / Office Supplies Store


In [41]:
#Purple Cluster
bronx_merged.loc[bronx_merged['Cluster Labels'] == 1, bronx_merged.columns[[1] + list(range(5, bronx_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Wakefield,Pharmacy,Deli / Bodega,Laundromat,Donut Shop,Sandwich Place,Dessert Shop,Ice Cream Shop,Lake,Pizza Place,Paper / Office Supplies Store
4,Riverdale,Bus Station,Park,Plaza,Playground,Home Service,Gym,Medical Supply Store,Baseball Field,Bank,Accessories Store
15,Fordham,Fast Food Restaurant,Shoe Store,Donut Shop,Mobile Phone Shop,Clothing Store,Bank,Pharmacy,Supplement Shop,Supermarket,Spanish Restaurant
17,West Farms,Park,Bus Stop,Donut Shop,Bus Station,Convenience Store,Chinese Restaurant,Pizza Place,Basketball Court,Coffee Shop,Playground
21,Port Morris,Brewery,Restaurant,Furniture / Home Store,Storage Facility,Grocery Store,Peruvian Restaurant,Spanish Restaurant,Donut Shop,Distillery,Latin American Restaurant
22,Longwood,Deli / Bodega,Diner,Latin American Restaurant,Donut Shop,Sandwich Place,Mexican Restaurant,Grocery Store,Fast Food Restaurant,Pet Store,Outlet Store
26,Clason Point,Park,Recording Studio,South American Restaurant,Pool,Playground,Convenience Store,Bus Stop,Boat or Ferry,Home Service,Grocery Store
34,Spuyten Duyvil,Park,Scenic Lookout,Grocery Store,Tennis Court,Tennis Stadium,Thai Restaurant,Bank,Intersection,Pharmacy,Pet Store
36,Pelham Bay,Bank,Fast Food Restaurant,Diner,Italian Restaurant,Gym / Fitness Center,Donut Shop,Convenience Store,Latin American Restaurant,Asian Restaurant,Supermarket
41,Pelham Gardens,Pharmacy,Bus Station,Spanish Restaurant,Boat or Ferry,Grocery Store,Bank,Sandwich Place,Playground,Mobile Phone Shop,Locksmith


In [42]:
#Blue Cluster
bronx_merged.loc[bronx_merged['Cluster Labels'] == 2, bronx_merged.columns[[1] + list(range(5, bronx_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Co-op City,Pizza Place,Accessories Store,Basketball Court,Donut Shop,Restaurant,Fast Food Restaurant,Pharmacy,Park,Bus Station,Discount Store
2,Eastchester,Bus Station,Caribbean Restaurant,Diner,Deli / Bodega,Seafood Restaurant,Platform,Bus Stop,Bowling Alley,Pizza Place,Chinese Restaurant
5,Kingsbridge,Pizza Place,Sandwich Place,Bar,Mexican Restaurant,Latin American Restaurant,Bakery,Supermarket,Burger Joint,Spanish Restaurant,Pharmacy
6,Woodlawn,Pub,Deli / Bodega,Food & Drink Shop,Pizza Place,Bakery,Park,Grocery Store,Food Truck,Pharmacy,Bar
7,Norwood,Pizza Place,Park,Chinese Restaurant,Pharmacy,Bank,American Restaurant,Caribbean Restaurant,Bus Station,Spanish Restaurant,Fried Chicken Joint
8,Williamsbridge,Nightclub,Soup Place,Bar,Caribbean Restaurant,Accessories Store,Music Venue,Other Great Outdoors,Outlet Store,Paper / Office Supplies Store,Park
9,Baychester,Donut Shop,Mexican Restaurant,Electronics Store,Discount Store,Fast Food Restaurant,Sandwich Place,Pizza Place,Convenience Store,Pet Store,Fried Chicken Joint
10,Pelham Parkway,Italian Restaurant,Pizza Place,Sandwich Place,Food,Smoke Shop,Bus Station,Gourmet Shop,Donut Shop,Frozen Yogurt Shop,Chinese Restaurant
11,City Island,Seafood Restaurant,Thrift / Vintage Store,Harbor / Marina,Bar,Spanish Restaurant,Café,Music Venue,Park,History Museum,French Restaurant
12,Bedford Park,Diner,Pizza Place,Mexican Restaurant,Chinese Restaurant,Sandwich Place,Deli / Bodega,Baseball Field,Bus Station,Pub,Burger Joint


In [43]:
#Green Cluster
bronx_merged.loc[bronx_merged['Cluster Labels'] == 3, bronx_merged.columns[[1] + list(range(5, bronx_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
28,Country Club,Sandwich Place,Playground,Athletics & Sports,Accessories Store,Peruvian Restaurant,Nail Salon,Nightclub,Other Great Outdoors,Outlet Store,Paper / Office Supplies Store


In [44]:
#Orange Cluster
bronx_merged.loc[bronx_merged['Cluster Labels'] == 4, bronx_merged.columns[[1] + list(range(5, bronx_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
44,Edenwald,Fried Chicken Joint,Supermarket,Grocery Store,Accessories Store,Music Venue,Nail Salon,Nightclub,Other Great Outdoors,Outlet Store,Paper / Office Supplies Store


In [45]:
manhattan_data = ny_df[ny_df['Borough'] == 'Manhattan'].reset_index(drop=True)
manhattan_data.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688


In [46]:
manhattan_data.loc[2, 'Neighborhood']

'Washington Heights'

In [47]:
neighborhood_latitude = manhattan_data.loc[2, 'Latitude']
neighborhood_longitude = manhattan_data.loc[2, 'Longitude']

neighborhood_name = manhattan_data.loc[2, 'Neighborhood']

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Washington Heights are 40.85190252555305, -73.93690027985234.


In [48]:
manhattan_venues = getNearbyVenues(names=manhattan_data['Neighborhood'],
                                   latitudes=manhattan_data['Latitude'],
                                   longitudes=manhattan_data['Longitude']
                                  )

Marble Hill
Chinatown
Washington Heights
Inwood
Hamilton Heights
Manhattanville
Central Harlem
East Harlem
Upper East Side
Yorkville
Lenox Hill
Roosevelt Island
Upper West Side
Lincoln Square
Clinton
Midtown
Murray Hill
Chelsea
Greenwich Village
East Village
Lower East Side
Tribeca
Little Italy
Soho
West Village
Manhattan Valley
Morningside Heights
Gramercy
Battery Park City
Financial District
Carnegie Hill
Noho
Civic Center
Midtown South
Sutton Place
Turtle Bay
Tudor City
Stuyvesant Town
Flatiron
Hudson Yards


In [49]:
print(manhattan_venues.shape)
manhattan_venues.head()

(3243, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Marble Hill,40.876551,-73.91066,Arturo's,40.874412,-73.910271,Pizza Place
1,Marble Hill,40.876551,-73.91066,Bikram Yoga,40.876844,-73.906204,Yoga Studio
2,Marble Hill,40.876551,-73.91066,Tibbett Diner,40.880404,-73.908937,Diner
3,Marble Hill,40.876551,-73.91066,Astral Fitness & Wellness Center,40.876705,-73.906372,Gym
4,Marble Hill,40.876551,-73.91066,Dunkin',40.877136,-73.906666,Donut Shop


In [50]:
manhattan_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Battery Park City,79,79,79,79,79,79
Carnegie Hill,92,92,92,92,92,92
Central Harlem,45,45,45,45,45,45
Chelsea,100,100,100,100,100,100
Chinatown,100,100,100,100,100,100
Civic Center,100,100,100,100,100,100
Clinton,100,100,100,100,100,100
East Harlem,40,40,40,40,40,40
East Village,100,100,100,100,100,100
Financial District,100,100,100,100,100,100


In [51]:
manhattan_onehot = pd.get_dummies(manhattan_venues[['Venue Category']], prefix="", prefix_sep="")

manhattan_onehot['Neighborhood'] = manhattan_venues['Neighborhood'] 

fixed_columns = [manhattan_onehot.columns[-1]] + list(manhattan_onehot.columns[:-1])
manhattan_onehot = manhattan_onehot[fixed_columns]

manhattan_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,American Restaurant,Antique Shop,Arepa Restaurant,Argentinian Restaurant,Art Gallery,...,Video Store,Vietnamese Restaurant,Volleyball Court,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Marble Hill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Marble Hill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1
2,Marble Hill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Marble Hill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Marble Hill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [52]:
manhattan_onehot.shape

(3243, 329)

In [53]:
manhattan_grouped = manhattan_onehot.groupby('Neighborhood').mean().reset_index()
manhattan_grouped

Unnamed: 0,Neighborhood,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,American Restaurant,Antique Shop,Arepa Restaurant,Argentinian Restaurant,Art Gallery,...,Video Store,Vietnamese Restaurant,Volleyball Court,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Battery Park City,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.025316,0.0,0.0,0.0
1,Carnegie Hill,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.01087,0.0,...,0.0,0.01087,0.0,0.0,0.0,0.01087,0.032609,0.0,0.01087,0.032609
2,Central Harlem,0.0,0.0,0.0,0.044444,0.044444,0.0,0.0,0.0,0.044444,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Chelsea,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.04,...,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.0
4,Chinatown,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,...,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Civic Center,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.01,...,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.02
6,Clinton,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.01,0.03,0.0,0.0,0.0
7,East Harlem,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,East Village,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.01,...,0.0,0.02,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.0
9,Financial District,0.01,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.0


In [54]:
manhattan_grouped.shape

(40, 329)
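
The two steps above (one-hot encoding, then a grouped mean) can be checked on a tiny example. This is only a sketch on made-up data; the names `toy_venues`, `toy_onehot` and `toy_grouped` are hypothetical and not part of the notebook. Because each one-hot column holds only 0s and 1s, its mean within a neighbourhood equals the share of that neighbourhood's venues in the category.

```python
import pandas as pd

# Toy venue table (made-up data, not taken from the Foursquare results)
toy_venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Park", "Park", "Bakery", "Hotel", "Bakery", "Bakery"],
})

# One column per category, holding 1 where the venue belongs to it
toy_onehot = pd.get_dummies(toy_venues["Venue Category"], dtype=int)
toy_onehot.insert(0, "Neighborhood", toy_venues["Neighborhood"])

# The mean of a 0/1 column is the share of venues in that category
toy_grouped = toy_onehot.groupby("Neighborhood").mean().reset_index()
print(toy_grouped)
```

In this toy case half of neighbourhood A's venues are parks, so its `Park` value is 0.5, which is the same kind of figure that appears in `manhattan_grouped`.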

In [55]:
num_top_venues = 5

# print the most frequent venue categories for each neighbourhood
for hood in manhattan_grouped['Neighborhood']:
    print("----"+hood+"----")
    # transpose the neighbourhood's row into (venue, freq) pairs
    temp = manhattan_grouped[manhattan_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    # drop the first row, which holds the neighbourhood name
    temp = temp.iloc[1:].copy()
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Battery Park City----
           venue  freq
0           Park  0.10
1    Coffee Shop  0.06
2          Hotel  0.05
3  Memorial Site  0.04
4     Playground  0.04


----Carnegie Hill----
               venue  freq
0        Coffee Shop  0.08
1               Café  0.07
2        Yoga Studio  0.03
3          Bookstore  0.03
4  French Restaurant  0.03


----Central Harlem----
                  venue  freq
0            Public Art  0.04
1  Gym / Fitness Center  0.04
2    Seafood Restaurant  0.04
3                   Bar  0.04
4   Fried Chicken Joint  0.04


----Chelsea----
                 venue  freq
0          Coffee Shop  0.07
1               Bakery  0.05
2    French Restaurant  0.04
3  American Restaurant  0.04
4          Art Gallery  0.04


----Chinatown----
                 venue  freq
0               Bakery  0.09
1   Chinese Restaurant  0.08
2         Cocktail Bar  0.05
3    Hotpot Restaurant  0.04
4  American Restaurant  0.04


----Civic Center----
                  venue  freq
0     

In [56]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = manhattan_grouped['Neighborhood']

for ind in np.arange(manhattan_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(manhattan_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Battery Park City,Park,Coffee Shop,Hotel,Memorial Site,Playground,Gym,Clothing Store,BBQ Joint,Burger Joint,Beer Garden
1,Carnegie Hill,Coffee Shop,Café,Yoga Studio,Bookstore,French Restaurant,Cosmetics Shop,Gym,Pizza Place,Wine Shop,Gym / Fitness Center
2,Central Harlem,Public Art,Gym / Fitness Center,Seafood Restaurant,Bar,Fried Chicken Joint,Art Gallery,French Restaurant,Chinese Restaurant,African Restaurant,American Restaurant
3,Chelsea,Coffee Shop,Bakery,French Restaurant,American Restaurant,Art Gallery,Nightclub,Italian Restaurant,Wine Shop,Seafood Restaurant,Ice Cream Shop
4,Chinatown,Bakery,Chinese Restaurant,Cocktail Bar,Hotpot Restaurant,American Restaurant,Spa,Dessert Shop,Salon / Barbershop,Ice Cream Shop,Boutique
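
The cell above calls `return_most_common_venues`, which is not defined in this part of the notebook. A minimal sketch of what such a helper might look like is given below, assuming each row has the same layout as `manhattan_grouped` (the neighbourhood name first, then one frequency per category). The body and the toy row are assumptions for illustration, not necessarily the definition used earlier.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    """Return the category names with the highest frequencies in one row."""
    # Skip the first entry (the neighbourhood name) and keep the frequencies
    frequencies = row.iloc[1:].astype(float)
    ranked = frequencies.sort_values(ascending=False)
    return ranked.index.values[:num_top_venues]

# Made-up row in the same layout as a row of manhattan_grouped
toy_row = pd.Series({
    "Neighborhood": "Toyville",
    "Bakery": 0.10,
    "Park": 0.40,
    "Hotel": 0.25,
    "Gym": 0.05,
})
top_three = return_most_common_venues(toy_row, 3)
print(list(top_three))
```

Categories with equal frequencies may come back in either order, so ties in the ranked columns should not be read as a meaningful ordering.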


In [57]:
kclusters = 5

manhattan_grouped_clustering = manhattan_grouped.drop(columns='Neighborhood')

kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(manhattan_grouped_clustering)

kmeans.labels_[0:10]

array([1, 1, 1, 1, 1, 1, 1, 0, 1, 1], dtype=int32)
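
Before interpreting the clusters it can help to count how many neighbourhoods fall under each label, since the tables later in this section list only one neighbourhood each for labels 3 and 4. The sketch below shows one way to do that on made-up data; `toy_features`, `toy_kmeans` and `cluster_sizes` are hypothetical names, and `n_init=10` is set explicitly only to keep the behaviour the same across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up frequency rows: two neighbourhoods dominated by the first category
# and two dominated by the second
toy_features = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
])

toy_kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(toy_features)

# Number of neighbourhoods assigned to each cluster label
cluster_sizes = np.bincount(toy_kmeans.labels_, minlength=2)
print(cluster_sizes)
```

The same `np.bincount(kmeans.labels_)` call could be applied to the fitted model above; very small clusters may suggest trying a different `kclusters` value.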

In [58]:
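# add the cluster labels to the sorted-venues table
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

manhattan_merged = manhattan_data

# attach the cluster labels and top venues to each neighbourhood's coordinates
manhattan_merged = manhattan_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')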

manhattan_merged.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Manhattan,Marble Hill,40.876551,-73.91066,1,Gym,Sandwich Place,Yoga Studio,Department Store,Steakhouse,Shopping Mall,Clothing Store,Coffee Shop,Seafood Restaurant,Diner
1,Manhattan,Chinatown,40.715618,-73.994279,1,Bakery,Chinese Restaurant,Cocktail Bar,Hotpot Restaurant,American Restaurant,Spa,Dessert Shop,Salon / Barbershop,Ice Cream Shop,Boutique
2,Manhattan,Washington Heights,40.851903,-73.9369,0,Café,Bakery,Pizza Place,Grocery Store,Bank,Mobile Phone Shop,New American Restaurant,Deli / Bodega,Sandwich Place,Tapas Restaurant
3,Manhattan,Inwood,40.867684,-73.92121,0,Mexican Restaurant,Café,Restaurant,Bakery,Pizza Place,Lounge,Spanish Restaurant,Wine Bar,Caribbean Restaurant,Chinese Restaurant
4,Manhattan,Hamilton Heights,40.823604,-73.949688,0,Pizza Place,Coffee Shop,Café,Mexican Restaurant,Park,Deli / Bodega,Yoga Studio,Bakery,Liquor Store,Indian Restaurant
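
One thing to watch with this join: if a neighbourhood in `manhattan_data` returned no venues, it has no row in `neighborhoods_venues_sorted`, so its `Cluster Labels` value becomes NaN and the whole column turns into floats. The sketch below reproduces that on made-up data and shows one possible clean-up; the `toy_*` names are hypothetical, and dropping the unmatched rows is only one option.

```python
import pandas as pd

# Made-up data: "C" has coordinates but no venue results
toy_data = pd.DataFrame({
    "Neighborhood": ["A", "B", "C"],
    "Latitude": [40.7, 40.8, 40.9],
})
toy_sorted = pd.DataFrame({
    "Neighborhood": ["A", "B"],
    "Cluster Labels": [0, 1],
})

toy_merged = toy_data.join(toy_sorted.set_index("Neighborhood"), on="Neighborhood")

# The unmatched row gets NaN, which turns the label column into floats;
# drop it and restore integer labels so they can be used as list indices
toy_clean = toy_merged.dropna(subset=["Cluster Labels"]).copy()
toy_clean["Cluster Labels"] = toy_clean["Cluster Labels"].astype(int)
print(toy_clean)
```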


In [59]:
# create the map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set the colour scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add one marker per neighbourhood, coloured by its cluster label
for lat, lon, poi, cluster in zip(manhattan_merged['Latitude'], manhattan_merged['Longitude'], manhattan_merged['Neighborhood'], manhattan_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)

map_clusters
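
The map cell picks each marker colour with `rainbow[cluster-1]`, so label 0 wraps around to the last colour in the palette. The sketch below illustrates that indexing with plain colour names; the `palette` list is an assumption that only approximates five evenly spaced samples of matplotlib's rainbow colormap, and `marker_colour` is a hypothetical helper.

```python
# Approximate colour names for five evenly spaced samples of the
# "rainbow" colormap (an assumption made for illustration)
palette = ["purple", "blue", "green", "orange", "red"]

def marker_colour(cluster_label):
    """Mimic the rainbow[cluster - 1] lookup used in the map cell."""
    return palette[cluster_label - 1]

label_to_colour = {label: marker_colour(label) for label in range(len(palette))}
print(label_to_colour)
```

Under that assumption label 0 is drawn in red and label 1 in purple, which is the naming used for the cluster tables that follow.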

In [60]:
# Red cluster (label 0)
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 0, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Washington Heights,Café,Bakery,Pizza Place,Grocery Store,Bank,Mobile Phone Shop,New American Restaurant,Deli / Bodega,Sandwich Place,Tapas Restaurant
3,Inwood,Mexican Restaurant,Café,Restaurant,Bakery,Pizza Place,Lounge,Spanish Restaurant,Wine Bar,Caribbean Restaurant,Chinese Restaurant
4,Hamilton Heights,Pizza Place,Coffee Shop,Café,Mexican Restaurant,Park,Deli / Bodega,Yoga Studio,Bakery,Liquor Store,Indian Restaurant
7,East Harlem,Mexican Restaurant,Bakery,Thai Restaurant,Spa,Latin American Restaurant,Sandwich Place,Deli / Bodega,New American Restaurant,French Restaurant,Gas Station


In [61]:
# Purple cluster (label 1)
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 1, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Marble Hill,Gym,Sandwich Place,Yoga Studio,Department Store,Steakhouse,Shopping Mall,Clothing Store,Coffee Shop,Seafood Restaurant,Diner
1,Chinatown,Bakery,Chinese Restaurant,Cocktail Bar,Hotpot Restaurant,American Restaurant,Spa,Dessert Shop,Salon / Barbershop,Ice Cream Shop,Boutique
6,Central Harlem,Public Art,Gym / Fitness Center,Seafood Restaurant,Bar,Fried Chicken Joint,Art Gallery,French Restaurant,Chinese Restaurant,African Restaurant,American Restaurant
8,Upper East Side,Exhibit,Italian Restaurant,Gym / Fitness Center,Coffee Shop,Bakery,Juice Bar,French Restaurant,American Restaurant,Spa,Hotel
13,Lincoln Square,Plaza,Performing Arts Venue,Theater,Concert Hall,Café,Bakery,Indie Movie Theater,Gym / Fitness Center,Wine Shop,Gym
14,Clinton,Theater,American Restaurant,Italian Restaurant,Coffee Shop,Gym / Fitness Center,Gym,Hotel,Sandwich Place,Wine Shop,Spa
15,Midtown,Hotel,Coffee Shop,Steakhouse,Clothing Store,Theater,Sandwich Place,Bookstore,Bakery,Sporting Goods Shop,Indian Restaurant
16,Murray Hill,Coffee Shop,Sandwich Place,Hotel,American Restaurant,Japanese Restaurant,Gym / Fitness Center,Sushi Restaurant,Burger Joint,Gym,Pub
17,Chelsea,Coffee Shop,Bakery,French Restaurant,American Restaurant,Art Gallery,Nightclub,Italian Restaurant,Wine Shop,Seafood Restaurant,Ice Cream Shop
19,East Village,Bar,Mexican Restaurant,Pizza Place,Wine Bar,Speakeasy,Cocktail Bar,Italian Restaurant,Vegetarian / Vegan Restaurant,Korean Restaurant,Coffee Shop


In [62]:
# Blue cluster (label 2)
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 2, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Manhattanville,Coffee Shop,Chinese Restaurant,Mexican Restaurant,Seafood Restaurant,Food Truck,Deli / Bodega,Italian Restaurant,Bus Station,Café,Diner
9,Yorkville,Italian Restaurant,Gym,Coffee Shop,Bar,Deli / Bodega,Sushi Restaurant,Japanese Restaurant,Wine Shop,Mexican Restaurant,Park
10,Lenox Hill,Pizza Place,Italian Restaurant,Coffee Shop,Cocktail Bar,Sushi Restaurant,Café,Gym,Gym / Fitness Center,Burger Joint,Salad Place
11,Roosevelt Island,Park,Metro Station,Supermarket,Bridge,Bubble Tea Shop,Gym,Greek Restaurant,Noodle House,Soccer Field,Outdoors & Recreation
12,Upper West Side,Italian Restaurant,Bakery,Indian Restaurant,Bar,Sushi Restaurant,Mediterranean Restaurant,Coffee Shop,Café,Wine Bar,Breakfast Spot
18,Greenwich Village,Italian Restaurant,Clothing Store,Sushi Restaurant,Indian Restaurant,Boutique,Coffee Shop,Dessert Shop,Gourmet Shop,Sandwich Place,Bubble Tea Shop
21,Tribeca,Park,American Restaurant,Italian Restaurant,Wine Bar,Café,Spa,Bakery,Men's Store,Gym / Fitness Center,Cocktail Bar
23,Soho,Clothing Store,Italian Restaurant,Shoe Store,Bakery,Coffee Shop,Sporting Goods Shop,Women's Store,Salon / Barbershop,Art Gallery,Boutique
24,West Village,Italian Restaurant,American Restaurant,Cocktail Bar,New American Restaurant,Park,Ice Cream Shop,Wine Bar,Coffee Shop,Bakery,Boutique
26,Morningside Heights,Coffee Shop,American Restaurant,Bookstore,Park,Café,Burger Joint,Deli / Bodega,Food Truck,Mexican Restaurant,Tennis Court


In [63]:
# Green cluster (label 3)
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 3, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
37,Stuyvesant Town,Park,Bar,Coffee Shop,Gas Station,Fountain,Heliport,Baseball Field,Bistro,Farmers Market,Boat or Ferry


In [64]:
# Orange cluster (label 4)
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 4, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
33,Midtown South,Korean Restaurant,Hotel,Dessert Shop,Cosmetics Shop,American Restaurant,Coffee Shop,Cocktail Bar,Japanese Restaurant,Ramen Restaurant,Hotel Bar
