# Investment in Restaurant in Manhattan, NY

This is a capstone project for the IBM Data Science Professional Certificate. In it, I take the role of an investor looking for good locations to open a restaurant in Manhattan, New York City. New York City houses highly diverse groups of people from around the globe, which is why I want to invest here despite being a complete outsider. Manhattan is also the liveliest borough in the city. To decide on my investment, I will look at the distribution of restaurant types in Manhattan to choose a restaurant type, and I will run K-means clustering to segment the neighborhoods in Manhattan based on their most common venue categories.

The neighborhood coordinates come from https://geo.nyu.edu/catalog/nyu_2451_34572; I downloaded the dataset and load it from a local path on my machine. Folium is used to map the neighborhood coordinates and, later, to show the clusters produced by K-means. The Foursquare API supplies the venue data for each neighborhood, including the restaurant categories, which is the key input to my investment decision.

In [1]:
import numpy as np
import pandas as pd
pd.set_option('display.max_columns',None)
pd.set_option('display.max_rows',None)
import json
from geopy.geocoders import Nominatim
import requests
from pandas import json_normalize  # pandas.io.json.json_normalize is deprecated since pandas 1.0
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
import folium


In [2]:
# data was downloaded from the link https://geo.nyu.edu/catalog/nyu_2451_34572 

path = '/Users/gobinrana/Downloads/nyu-2451-34572-geojson.json' # path variable is created linking the dataset to my machine
with open(path) as json_data:
    newyork_data = json.load(json_data)
    
neighborhoods_data = newyork_data['features'] # extract a dataset with the key ['features']
neighborhoods_data[0]

{'type': 'Feature',
 'id': 'nyu_2451_34572.1',
 'geometry': {'type': 'Point',
  'coordinates': [-73.84720052054902, 40.89470517661]},
 'geometry_name': 'geom',
 'properties': {'name': 'Wakefield',
  'stacked': 1,
  'annoline1': 'Wakefield',
  'annoline2': None,
  'annoline3': None,
  'annoangle': 0.0,
  'borough': 'Bronx',
  'bbox': [-73.84720052054902,
   40.89470517661,
   -73.84720052054902,
   40.89470517661]}}

In [12]:
# Convert the GeoJSON features into a pandas dataframe.

column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude']  # dataframe columns

rows = []  # collect one dict per neighborhood, then build the dataframe once

for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']
    neighborhood_latlon = data['geometry']['coordinates']  # GeoJSON order is [longitude, latitude]
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

# DataFrame.append was removed in pandas 2.0, so the dataframe is built from the list of rows
neighborhoods = pd.DataFrame(rows, columns=column_names)

print('The dataframe on New York has {} boroughs and {} neighborhoods.'.format(
      len(neighborhoods['Borough'].unique()), neighborhoods.shape[0]))

The dataframe on New York has 5 boroughs and 306 neighborhoods.
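As an alternative to the loop above, the same four columns can be pulled out with `pd.json_normalize`. This is only a sketch on two toy features shaped like the entries of `newyork_data['features']`; the names `sample_features` and `sample_neighborhoods` are mine and are not used elsewhere in the notebook.

```python
import pandas as pd

# Two toy features with the same shape as the GeoJSON entries (only the keys used here)
sample_features = [
    {'geometry': {'type': 'Point', 'coordinates': [-73.84720052054902, 40.89470517661]},
     'properties': {'name': 'Wakefield', 'borough': 'Bronx'}},
    {'geometry': {'type': 'Point', 'coordinates': [-73.91065965862981, 40.87655077879964]},
     'properties': {'name': 'Marble Hill', 'borough': 'Manhattan'}},
]

# Nested dicts become dotted column names; the coordinates list stays a list
flat = pd.json_normalize(sample_features)

# GeoJSON stores points as [longitude, latitude]
flat['Longitude'] = flat['geometry.coordinates'].map(lambda xy: xy[0])
flat['Latitude'] = flat['geometry.coordinates'].map(lambda xy: xy[1])

sample_neighborhoods = flat.rename(
    columns={'properties.borough': 'Borough', 'properties.name': 'Neighborhood'}
)[['Borough', 'Neighborhood', 'Latitude', 'Longitude']]

print(sample_neighborhoods)
```

The loop is easier to read for a first pass; the flattened version may be more convenient if more properties are needed later.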


In [13]:
# Keep only the Manhattan rows from the dataframe 'neighborhoods' (which covers all five boroughs) and visualize them on a map.

manhattan_data = neighborhoods[neighborhoods['Borough']=='Manhattan'].reset_index(drop=True)
print(manhattan_data.head())

# Extracting the geographical coordinates of Manhattan
address = 'Manhattan, NY'
geolocator = Nominatim(user_agent = "ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print("The geographical coordinates of Manhattan are {}, {}\n".format(latitude, longitude))


# get the latitude and longitude of the first Manhattan neighborhood in the dataframe
neighborhood_latitude = manhattan_data.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = manhattan_data.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = manhattan_data.loc[0, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.\n'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))


# creating a map of Manhattan
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_manhattan)  # parse_html belongs to folium.Popup, not CircleMarker
    
map_manhattan



     Borough        Neighborhood   Latitude  Longitude
0  Manhattan         Marble Hill  40.876551 -73.910660
1  Manhattan           Chinatown  40.715618 -73.994279
2  Manhattan  Washington Heights  40.851903 -73.936900
3  Manhattan              Inwood  40.867684 -73.921210
4  Manhattan    Hamilton Heights  40.823604 -73.949688
The geographical coordinates of Manhattan are 40.7896239, -73.9598939

Latitude and longitude values of Marble Hill are 40.87655077879964, -73.91065965862981.



In [5]:
# Define FourSquare Credentials and Version
CLIENT_ID = '' # your Foursquare ID
CLIENT_SECRET = '' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: FDXDMA1KQEDDZLQYZCRPMLICL2CJWYCX5QRMMB2XMVDP44KS
CLIENT_SECRET:YJSYMICCRB0A2VTTX4EWWDOJCKZ1MD2TQ0OE3G5P4ETQFIIC


In [14]:
# create a GET request and URL to access the datasets using FourSquare.
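LIMIT = 500 # requested maximum number of venues; the responses in this notebook contain at most 100 venues each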
radius = 2000
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        neighborhood_latitude,
        neighborhood_longitude,
        radius,
        LIMIT
        )
url

'https://api.foursquare.com/v2/venues/explore?&client_id=FDXDMA1KQEDDZLQYZCRPMLICL2CJWYCX5QRMMB2XMVDP44KS&client_secret=YJSYMICCRB0A2VTTX4EWWDOJCKZ1MD2TQ0OE3G5P4ETQFIIC&v=20180605&ll=40.87655077879964,-73.91065965862981&radius=2000&limit=500'
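The URL above embeds the credentials directly in the string. A minimal sketch of an alternative, assuming the credentials are stored in environment variables (the names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `build_explore_url` are hypothetical), builds the query string with the standard library so that values are escaped consistently. No request is sent here.

```python
import os
from urllib.parse import urlencode, urlsplit, parse_qs

def build_explore_url(lat, lng, radius=2000, limit=500, version='20180605'):
    """Build a venues/explore URL with credentials read from the environment."""
    params = {
        # hypothetical environment variable names; empty string if unset
        'client_id': os.environ.get('FOURSQUARE_CLIENT_ID', ''),
        'client_secret': os.environ.get('FOURSQUARE_CLIENT_SECRET', ''),
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    # urlencode percent-encodes the comma in 'll'; servers decode it back
    return 'https://api.foursquare.com/v2/venues/explore?' + urlencode(params)

url = build_explore_url(40.87655077879964, -73.91065965862981)

# Parse the URL back to check that the parameters round-trip
query = parse_qs(urlsplit(url).query)
print(query['ll'], query['radius'], query['limit'])
```

Keeping the secret out of the notebook also keeps it out of any printed URL or saved output.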

In [15]:
# send the get requests and examine the results

results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5eeb0b65f7706a001b12af5f'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'},
    {'name': '$-$$$$', 'key': 'price'}]},
  'headerLocation': 'Marble Hill',
  'headerFullLocation': 'Marble Hill, New York',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 193,
  'suggestedBounds': {'ne': {'lat': 40.894550796799656,
    'lng': -73.88689838082279},
   'sw': {'lat': 40.85855076079962, 'lng': -73.93442093643684}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4baf59e8f964a520a6f93be3',
       'name': 'Bikram Yoga',
       'location': {'address': '5500 Broadway',
        'crossStreet': '230th Street',
        'lat': 40.876843690797934,
        'lng': -73.90620

In [23]:
# create a function that extracts the category of the venue

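def get_category_type(row):
    # the column is named 'categories' or 'venue.categories' depending on how the JSON was flattened
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']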

In [35]:
# clean the json data and structure it into pandas dataframe

venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()
print('{} venues were returned by FourSquare.'.format(nearby_venues.shape[0]))

100 venues were returned by FourSquare.




In [37]:
# Exploring the neighborhoods of Manhattan

def getNearbyVenues(names, latitudes, longitudes, radius=2000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()['response']['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
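One fragile spot in `getNearbyVenues` is `v['venue']['categories'][0]['name']`, which raises `IndexError` if a venue comes back with an empty category list. A minimal sketch of a more defensive extraction step is shown below; `extract_venue_rows` is a hypothetical helper, and the sample items are toy data (the second venue is invented) shaped like the `items` list in the response shown earlier.

```python
def extract_venue_rows(name, lat, lng, items):
    """Turn the 'items' list of an explore response into flat tuples.

    Venues with an empty 'categories' list get None as their category
    instead of raising IndexError.
    """
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows

# Toy items; only the keys read by the helper are included
sample_items = [
    {'venue': {'name': 'Bikram Yoga',
               'location': {'lat': 40.876844, 'lng': -73.906204},
               'categories': [{'name': 'Yoga Studio'}]}},
    {'venue': {'name': 'Uncategorized Place',  # invented venue with no category
               'location': {'lat': 40.877, 'lng': -73.909},
               'categories': []}},
]

rows = extract_venue_rows('Marble Hill', 40.876551, -73.91066, sample_items)
for row in rows:
    print(row)
```

Rows with a `None` category can then be dropped or kept before the one-hot encoding step, depending on how uncategorized venues should be treated.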


In [39]:
# Create a dataframe of venues for the neighborhoods in Manhattan, with each venue's category and geographical location

manhattan_venues = getNearbyVenues(names=manhattan_data['Neighborhood'],
                                   latitudes=manhattan_data['Latitude'],
                                   longitudes=manhattan_data['Longitude']
                                  )
print(manhattan_venues.shape)
manhattan_venues.head()

Marble Hill
Chinatown
Washington Heights
Inwood
Hamilton Heights
Manhattanville
Central Harlem
East Harlem
Upper East Side
Yorkville
Lenox Hill
Roosevelt Island
Upper West Side
Lincoln Square
Clinton
Midtown
Murray Hill
Chelsea
Greenwich Village
East Village
Lower East Side
Tribeca
Little Italy
Soho
West Village
Manhattan Valley
Morningside Heights
Gramercy
Battery Park City
Financial District
Carnegie Hill
Noho
Civic Center
Midtown South
Sutton Place
Turtle Bay
Tudor City
Stuyvesant Town
Flatiron
Hudson Yards
(4000, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Marble Hill,40.876551,-73.91066,Bikram Yoga,40.876844,-73.906204,Yoga Studio
1,Marble Hill,40.876551,-73.91066,Sam's Pizza,40.879435,-73.905859,Pizza Place
2,Marble Hill,40.876551,-73.91066,Tibbett Diner,40.880404,-73.908937,Diner
3,Marble Hill,40.876551,-73.91066,Arturo's,40.874412,-73.910271,Pizza Place
4,Marble Hill,40.876551,-73.91066,The Bronx Public,40.878377,-73.903481,Pub


In [40]:
manhattan_venues.groupby('Neighborhood').count()
print('There are {} unique categories.'.format(len(manhattan_venues['Venue Category'].unique())))

There are 253 unique categories.
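Since the investment question is about restaurants, it can help to look at the restaurant categories on their own. The sketch below works on a toy dataframe with the same `Venue Category` column as `manhattan_venues`; the last venue is invented for illustration. It assumes that matching the substring "Restaurant" is an acceptable filter, which undercounts food venues such as "Pizza Place" or "Diner".

```python
import pandas as pd

# Toy venues; 'Hypothetical Taqueria' is invented for illustration
sample_venues = pd.DataFrame({
    'Neighborhood': ['Marble Hill'] * 5,
    'Venue': ["Sam's Pizza", 'Tibbett Diner', "Arturo's", 'Bikram Yoga', 'Hypothetical Taqueria'],
    'Venue Category': ['Pizza Place', 'Diner', 'Pizza Place', 'Yoga Studio', 'Mexican Restaurant'],
})

# Plain substring match, no regular expression
is_restaurant = sample_venues['Venue Category'].str.contains('Restaurant', regex=False)

# Count venues per restaurant category
restaurant_counts = sample_venues.loc[is_restaurant, 'Venue Category'].value_counts()
print(restaurant_counts)
```

On the real data, the same two lines applied to `manhattan_venues` would give a count per restaurant category across Manhattan.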


In [41]:
# Analyze Each Neighborhood

# one hot encoding
manhattan_onehot = pd.get_dummies(manhattan_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
manhattan_onehot['Neighborhood'] = manhattan_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [manhattan_onehot.columns[-1]] + list(manhattan_onehot.columns[:-1])
manhattan_onehot = manhattan_onehot[fixed_columns]

manhattan_onehot.head()
manhattan_onehot.shape

(4000, 254)

In [42]:
# Group rows by neighborhood and the mean of the frequency of occurrence of each category

manhattan_grouped = manhattan_onehot.groupby('Neighborhood').mean().reset_index()
manhattan_grouped

Unnamed: 0,Neighborhood,Adult Boutique,African Restaurant,American Restaurant,Antique Shop,Arcade,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auditorium,Australian Restaurant,Austrian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Basketball Stadium,Beach,Beer Bar,Beer Garden,Beer Store,Bike Shop,Bike Trail,Bistro,Bookstore,Boutique,Boxing Gym,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Building,Burger Joint,Butcher,Café,Cambodian Restaurant,Camera Store,Candy Store,Caribbean Restaurant,Castle,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Circus,Climbing Gym,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Arts Building,College Theater,Comedy Club,Comfort Food Restaurant,Comic Shop,Community Center,Concert Hall,Cosmetics Shop,Cuban Restaurant,Cultural Center,Cupcake Shop,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dog Run,Donut Shop,Dumpling Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Field,Fish Market,Flower Shop,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Garden Center,Gastropub,General Entertainment,German Restaurant,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Harbor / Marina,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,Heliport,Historic Site,History Museum,Hot Dog Joint,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Israeli Restaurant,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Korean Restaurant,Latin American Restaurant,Laundry Service,Lebanese Restaurant,Library,Lighthouse,Liquor Store,Lounge,Market,Mediterranean Restaurant,Memorial Site,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Museum,Music Store,Music Venue,Nail Salon,Nature Preserve,New American Restaurant,Newsstand,Noodle House,North Indian Restaurant,Office,Opera House,Organic Grocery,Outdoor Sculpture,Outdoors & Recreation,Paper / Office Supplies Store,Park,Pedestrian Plaza,Performing Arts Venue,Peruvian Restaurant,Pet Café,Pet Store,Pharmacy,Photography Studio,Piano Bar,Pie Shop,Pier,Pilates Studio,Pizza Place,Planetarium,Playground,Plaza,Poke Place,Pool,Pub,Public Art,Ramen Restaurant,Recreation Center,Reservoir,Resort,Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Smoke Shop,Snack Place,Soccer Field,Soup Place,Southern / Soul Food Restaurant,Souvenir Shop,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,State / Provincial Park,Stationery Store,Steakhouse,Street Art,Supermarket,Sushi Restaurant,Szechuan Restaurant,TV Station,Taco Place,Tailor Shop,Tapas Restaurant,Tea Room,Tech Startup,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Toy / Game Store,Track,Trail,Train Station,Tram Station,Turkish Restaurant,Udon Restaurant,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Veterinarian,Video Game Store,Vietnamese Restaurant,Volleyball Court,Warehouse Store,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Battery Park City,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.03,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.01,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.11,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.02,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.02,0.0,0.02,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0
1,Carnegie Hill,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.05,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.07,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.03,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.04,0.0,0.0,0.02
2,Central Harlem,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.06,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.04
3,Chelsea,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.05,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.04,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.02,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.01,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03
4,Chinatown,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.05,0.01,0.02,0.0,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.01,0.01
5,Civic Center,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.02,0.01,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.01,0.01,0.01,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.03,0.02,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.03,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01
6,Clinton,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.01,0.0,0.01,0.01,0.02,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.03,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.04,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.19,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0
7,East Harlem,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.05,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.04,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.13,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.03,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.03,0.0,0.0,0.0
8,East Village,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.02,0.03,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.03,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.01,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.04,0.03,0.0,0.0,0.0
9,Financial District,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.08,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.03,0.03,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.03,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.09,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.02,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0
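To make the numbers in the table above easier to read, here is a small sketch of the same two steps (one-hot encoding, then a per-neighborhood mean) on a toy dataframe. The neighborhoods `A` and `B` and their venues are invented. The mean of a one-hot column is the share of that neighborhood's venues falling in the category, so each row sums to 1.

```python
import pandas as pd

# Toy venue list: four venues in neighborhood A, two in neighborhood B
toy = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Park', 'Park', 'Café', 'Hotel', 'Park', 'Bar'],
})

# One column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(toy['Venue Category'], dtype=float)
onehot.insert(0, 'Neighborhood', toy['Neighborhood'])

# Mean of the indicator columns = share of venues per category
grouped = onehot.groupby('Neighborhood').mean().reset_index()
print(grouped)
```

In the Manhattan table every neighborhood has 100 venues, which is why all frequencies are multiples of 0.01.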


In [43]:
# Print each neighborhood along with its top 10 most common venues

manhattan_grouped.shape
num_top_venues = 10

for hood in manhattan_grouped['Neighborhood']:
    print('------'+hood+'-------')
    temp = manhattan_grouped[manhattan_grouped['Neighborhood']==hood].T.reset_index()
    temp.columns = ['venue', 'freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq':2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')
    

------Battery Park City-------
                  venue  freq
0                  Park  0.11
1           Coffee Shop  0.08
2                 Hotel  0.04
3           Pizza Place  0.03
4    Italian Restaurant  0.03
5                 Plaza  0.03
6                   Gym  0.03
7  Gym / Fitness Center  0.03
8          Cocktail Bar  0.03
9         Memorial Site  0.03


------Carnegie Hill-------
                venue  freq
0         Coffee Shop  0.07
1  Italian Restaurant  0.06
2                Park  0.05
3             Exhibit  0.05
4              Bakery  0.05
5           Wine Shop  0.04
6          Art Museum  0.04
7                Café  0.03
8                 Bar  0.03
9                 Gym  0.03


------Central Harlem-------
                             venue  freq
0                      Coffee Shop  0.08
1                              Bar  0.06
2                     Cocktail Bar  0.05
3  Southern / Soul Food Restaurant  0.04
4                             Park  0.04
5                      Yog

                  venue  freq
0               Theater  0.08
1                 Hotel  0.05
2   American Restaurant  0.04
3          Gourmet Shop  0.04
4  Gym / Fitness Center  0.03
5          Burger Joint  0.03
6            Boxing Gym  0.03
7        Sandwich Place  0.03
8                 Plaza  0.03
9                   Gym  0.03


------Noho-------
             venue  freq
0         Wine Bar  0.04
1      Coffee Shop  0.04
2           Bakery  0.03
3  Thai Restaurant  0.03
4             Café  0.03
5   Ice Cream Shop  0.03
6      Pizza Place  0.02
7      Salad Place  0.02
8        Juice Bar  0.02
9     Gourmet Shop  0.02


------Roosevelt Island-------
                  venue  freq
0                  Park  0.09
1    Italian Restaurant  0.06
2  Gym / Fitness Center  0.04
3                   Gym  0.04
4      Sushi Restaurant  0.04
5       Thai Restaurant  0.03
6            Art Museum  0.03
7           Coffee Shop  0.03
8                Bakery  0.03
9          Cycle Studio  0.02


------Soho-

In [44]:
# write a function to sort the venues in descending order

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

# create a new dataframe displaying the top 10 venues for each neighborhood

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # positions after the 3rd use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = manhattan_grouped['Neighborhood']

for ind in np.arange(manhattan_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(manhattan_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Battery Park City,Park,Coffee Shop,Hotel,Pizza Place,Cocktail Bar,Memorial Site,Italian Restaurant,Plaza,Gym / Fitness Center,Gym
1,Carnegie Hill,Coffee Shop,Italian Restaurant,Park,Exhibit,Bakery,Wine Shop,Art Museum,Bar,Gym,Pizza Place
2,Central Harlem,Coffee Shop,Bar,Cocktail Bar,Pizza Place,Park,Yoga Studio,Southern / Soul Food Restaurant,Wine Bar,American Restaurant,French Restaurant
3,Chelsea,Park,Art Gallery,Coffee Shop,Gym / Fitness Center,Bakery,Yoga Studio,Hotel,Salad Place,New American Restaurant,Italian Restaurant
4,Chinatown,Hotel,Café,Sandwich Place,Spa,Pizza Place,Sushi Restaurant,Bakery,Italian Restaurant,Juice Bar,Breakfast Spot
5,Civic Center,Park,Coffee Shop,Hotel,Sandwich Place,Memorial Site,Bakery,Italian Restaurant,Plaza,Men's Store,Falafel Restaurant
6,Clinton,Theater,Coffee Shop,Art Gallery,Pizza Place,Park,Mediterranean Restaurant,Gym / Fitness Center,Gym,Concert Hall,Thai Restaurant
7,East Harlem,Park,Café,Coffee Shop,Gym,Pizza Place,Italian Restaurant,Plaza,Cocktail Bar,Fountain,Wine Shop
8,East Village,Wine Bar,Coffee Shop,Pizza Place,Bakery,Gourmet Shop,Wine Shop,Café,Juice Bar,Park,Ice Cream Shop
9,Financial District,Park,Coffee Shop,Plaza,Gym / Fitness Center,Memorial Site,Pizza Place,Gym,Hotel,Restaurant,Shopping Mall
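
The ranking step above can be sketched on a tiny, made-up frequency table. The neighborhood names `A` and `B`, the three venue categories, and the helper name `top_venues` are illustrative assumptions, not data or code from this project; the sketch only shows how sorting one row by frequency yields its top categories.

```python
import pandas as pd

def top_venues(row, num_top_venues):
    """Return the category names with the highest frequencies in one row."""
    # the first entry holds the neighborhood name, so skip it
    frequencies = row.iloc[1:].astype(float)
    return frequencies.sort_values(ascending=False).index.values[:num_top_venues]

# hypothetical mean venue frequencies for two made-up neighborhoods
toy_grouped = pd.DataFrame({
    'Neighborhood': ['A', 'B'],
    'Coffee Shop': [0.10, 0.02],
    'Park': [0.30, 0.05],
    'Theater': [0.01, 0.40],
})

top_a = list(top_venues(toy_grouped.iloc[0, :], 2))
top_b = list(top_venues(toy_grouped.iloc[1, :], 2))
print(top_a)  # ['Park', 'Coffee Shop']
print(top_b)  # ['Theater', 'Park']
```

Because the ranking depends only on relative frequencies within a row, two neighborhoods can share the same top venue even if one of them has far fewer venues overall.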


# Cluster Neighborhoods

In [45]:
# run k-means to cluster the neighborhoods into 5 clusters.

kclusters = 5

manhattan_grouped_clustering = manhattan_grouped.drop('Neighborhood', axis=1)  # keep only the numeric frequency columns

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0, n_init=10).fit(manhattan_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 2, 0, 3, 0, 1, 4, 3, 0], dtype=int32)
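
As a minimal sketch of what `labels_` contains, the snippet below fits K-means on four made-up frequency rows (an assumption for illustration, not the Manhattan data). It shows that there is one label per input row and that the label numbers are arbitrary identifiers, so only the grouping is meaningful, not the order of the numbers.

```python
import numpy as np
from sklearn.cluster import KMeans

# hypothetical venue-frequency rows: two theater-heavy and two park-heavy neighborhoods
toy_frequencies = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
])

toy_kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(toy_frequencies)
labels = toy_kmeans.labels_                # one label per row, in row order
sizes = np.bincount(labels, minlength=2)   # number of rows assigned to each label

print(labels)
print(sizes)
```

Counting the rows per label with `np.bincount` is a quick way to spot very small or very large clusters before interpreting them.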

In [46]:
# create a dataframe that includes cluster as well as the top 10 venues

# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

manhattan_merged = manhattan_data

# join neighborhoods_venues_sorted onto manhattan_data so each neighborhood has its latitude/longitude, cluster label and top venues
manhattan_merged = manhattan_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

manhattan_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Manhattan,Marble Hill,40.876551,-73.91066,4,Mexican Restaurant,Pizza Place,Park,Café,Diner,Deli / Bodega,Bar,Bakery,Spanish Restaurant,Latin American Restaurant
1,Manhattan,Chinatown,40.715618,-73.994279,3,Hotel,Café,Sandwich Place,Spa,Pizza Place,Sushi Restaurant,Bakery,Italian Restaurant,Juice Bar,Breakfast Spot
2,Manhattan,Washington Heights,40.851903,-73.9369,4,Park,Latin American Restaurant,Pizza Place,Café,Tapas Restaurant,Deli / Bodega,Bar,Bakery,Mexican Restaurant,Wine Shop
3,Manhattan,Inwood,40.867684,-73.92121,4,Pizza Place,Mexican Restaurant,Park,Restaurant,Latin American Restaurant,Wine Bar,Deli / Bodega,Café,Bar,Wine Shop
4,Manhattan,Hamilton Heights,40.823604,-73.949688,2,Coffee Shop,Park,Bar,Yoga Studio,Pizza Place,French Restaurant,Italian Restaurant,Cocktail Bar,Deli / Bodega,Ethiopian Restaurant
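
The join above keeps every row of the left-hand frame. The sketch below, built on made-up neighborhoods `A`, `B` and `C` (assumptions for illustration), shows what would happen if a neighborhood had no row in the sorted-venues table: its cluster label becomes `NaN`, which also turns the label column into floats. Whether that case occurs in the Manhattan data is not shown in this notebook.

```python
import pandas as pd

# hypothetical coordinates table with three neighborhoods
toy_coords = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C'],
    'Latitude': [40.71, 40.75, 40.80],
    'Longitude': [-74.00, -73.99, -73.95],
})

# hypothetical sorted-venues table that lacks neighborhood 'C'
toy_sorted = pd.DataFrame({
    'Cluster Labels': [0, 1],
    'Neighborhood': ['A', 'B'],
    '1st Most Common Venue': ['Park', 'Theater'],
})

# DataFrame.join is a left join by default, so 'C' is kept with missing values
toy_merged = toy_coords.join(toy_sorted.set_index('Neighborhood'), on='Neighborhood')

# rows that found no match, and a cleaned copy with integer labels restored
unmatched = toy_merged[toy_merged['Cluster Labels'].isna()]
toy_clean = toy_merged.dropna(subset=['Cluster Labels']).astype({'Cluster Labels': int})

print(unmatched['Neighborhood'].tolist())  # ['C']
print(toy_clean['Cluster Labels'].tolist())  # [0, 1]
```

Checking for unmatched rows before plotting avoids indexing a color list with a missing label.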


In [47]:
# Visualize the clusters

# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(manhattan_merged['Latitude'], manhattan_merged['Longitude'], manhattan_merged['Neighborhood'], manhattan_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
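
The marker colors can be inspected without drawing a map. This is a small sketch of the same color lookup, with the helper name `color_for` introduced here only for illustration. It shows that `rainbow[cluster-1]` still assigns a distinct color to each label, because Python's negative indexing sends label 0 to the last palette entry.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5

# one hex color per cluster, sampled evenly from the rainbow colormap
palette = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters))]

def color_for(cluster):
    """Mirror the lookup used for the map markers (index cluster - 1)."""
    return palette[cluster - 1]

# label 0 wraps around to the last color, labels 1..4 take entries 0..3
mapping = {label: color_for(label) for label in range(kclusters)}
print(mapping)
```

Printing the mapping alongside the map makes it easier to tell which color belongs to which cluster label.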

# Examine Clusters

### Cluster 1

In [48]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 0, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,Upper East Side,Art Museum,Bakery,Exhibit,Italian Restaurant,Thai Restaurant,Plaza,Bar,Park,Coffee Shop,Ice Cream Shop
9,Yorkville,Italian Restaurant,Coffee Shop,Gym,Ice Cream Shop,Gym / Fitness Center,Bakery,French Restaurant,American Restaurant,Pizza Place,Thai Restaurant
10,Lenox Hill,Bakery,Italian Restaurant,Park,Thai Restaurant,Art Museum,Bar,Gym / Fitness Center,Sushi Restaurant,Salon / Barbershop,Gym
11,Roosevelt Island,Park,Italian Restaurant,Sushi Restaurant,Gym,Gym / Fitness Center,Art Museum,Bakery,Thai Restaurant,Coffee Shop,Grocery Store
12,Upper West Side,Park,Bakery,Ice Cream Shop,Italian Restaurant,Garden,Coffee Shop,Gym,Sushi Restaurant,Mediterranean Restaurant,Thai Restaurant
17,Chelsea,Park,Art Gallery,Coffee Shop,Gym / Fitness Center,Bakery,Yoga Studio,Hotel,Salad Place,New American Restaurant,Italian Restaurant
21,Tribeca,Park,Italian Restaurant,Sushi Restaurant,Coffee Shop,Memorial Site,Bakery,Café,Seafood Restaurant,Sandwich Place,Salad Place
24,West Village,Coffee Shop,Park,Gym / Fitness Center,Bakery,Sushi Restaurant,Italian Restaurant,Yoga Studio,Gym,New American Restaurant,Pizza Place
28,Battery Park City,Park,Coffee Shop,Hotel,Pizza Place,Cocktail Bar,Memorial Site,Italian Restaurant,Plaza,Gym / Fitness Center,Gym
29,Financial District,Park,Coffee Shop,Plaza,Gym / Fitness Center,Memorial Site,Pizza Place,Gym,Hotel,Restaurant,Shopping Mall
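
The column selection used in these cells keeps the neighborhood name (position 1) and everything from position 5 onward, which skips the borough, coordinates and cluster label. A minimal sketch on a made-up frame follows; the helper name `cluster_table` and the toy rows are assumptions for illustration.

```python
import pandas as pd

# hypothetical merged frame with the same column order as manhattan_merged
toy_merged = pd.DataFrame({
    'Borough': ['Manhattan', 'Manhattan'],
    'Neighborhood': ['A', 'B'],
    'Latitude': [40.71, 40.75],
    'Longitude': [-74.00, -73.99],
    'Cluster Labels': [0, 1],
    '1st Most Common Venue': ['Park', 'Theater'],
    '2nd Most Common Venue': ['Coffee Shop', 'Hotel'],
})

def cluster_table(df, label):
    """Return the neighborhood and top-venue columns for one cluster label."""
    keep = [1] + list(range(5, df.shape[1]))  # position 1, then positions 5..end
    return df.loc[df['Cluster Labels'] == label, df.columns[keep]]

first = cluster_table(toy_merged, 0)
print(first)
```

Wrapping the filter in a function means each cluster section only differs in the label passed in, which reduces the risk of a copy-and-paste slip between cells.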


### Cluster 2

In [49]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 1, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
13,Lincoln Square,Theater,Park,Concert Hall,Gym,Jazz Club,Bakery,Spa,Performing Arts Venue,Hotel,Gym / Fitness Center
14,Clinton,Theater,Coffee Shop,Art Gallery,Pizza Place,Park,Mediterranean Restaurant,Gym / Fitness Center,Gym,Concert Hall,Thai Restaurant
15,Midtown,Theater,Gym,Hotel,Concert Hall,Plaza,Bakery,Museum,Boxing Gym,Boutique,Sandwich Place
16,Murray Hill,Theater,Hotel,American Restaurant,Gourmet Shop,Plaza,Park,Burger Joint,Sandwich Place,Boxing Gym,Gym / Fitness Center
33,Midtown South,Theater,Gym / Fitness Center,New American Restaurant,American Restaurant,Hotel,Coffee Shop,Gym,Mediterranean Restaurant,Miscellaneous Shop,Sushi Restaurant
39,Hudson Yards,Theater,Art Gallery,Park,Coffee Shop,Bakery,Gym / Fitness Center,Wine Bar,Pizza Place,Mediterranean Restaurant,Bar


### Cluster 3

In [50]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 2, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Hamilton Heights,Coffee Shop,Park,Bar,Yoga Studio,Pizza Place,French Restaurant,Italian Restaurant,Cocktail Bar,Deli / Bodega,Ethiopian Restaurant
5,Manhattanville,Park,Coffee Shop,Yoga Studio,Pizza Place,American Restaurant,Italian Restaurant,Cocktail Bar,Wine Shop,French Restaurant,Bar
6,Central Harlem,Coffee Shop,Bar,Cocktail Bar,Pizza Place,Park,Yoga Studio,Southern / Soul Food Restaurant,Wine Bar,American Restaurant,French Restaurant
26,Morningside Heights,Park,Coffee Shop,Italian Restaurant,Yoga Studio,American Restaurant,Pizza Place,Grocery Store,Seafood Restaurant,Bar,Burger Joint


### Cluster 4

In [51]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 3, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Chinatown,Hotel,Café,Sandwich Place,Spa,Pizza Place,Sushi Restaurant,Bakery,Italian Restaurant,Juice Bar,Breakfast Spot
18,Greenwich Village,Italian Restaurant,Pizza Place,New American Restaurant,Yoga Studio,Bookstore,Sushi Restaurant,Grocery Store,French Restaurant,Garden,Hotel
19,East Village,Wine Bar,Coffee Shop,Pizza Place,Bakery,Gourmet Shop,Wine Shop,Café,Juice Bar,Park,Ice Cream Shop
20,Lower East Side,Wine Bar,Coffee Shop,Ice Cream Shop,Pizza Place,Sandwich Place,Park,Asian Restaurant,Café,Juice Bar,Chinese Restaurant
22,Little Italy,Hotel,Sushi Restaurant,Italian Restaurant,Café,Sandwich Place,Wine Shop,Gym / Fitness Center,Pizza Place,Mediterranean Restaurant,Spa
23,Soho,Sushi Restaurant,Café,Sandwich Place,Pizza Place,Italian Restaurant,Salad Place,Park,Hotel,Indie Movie Theater,French Restaurant
27,Gramercy,Gym / Fitness Center,New American Restaurant,Gourmet Shop,Pizza Place,Gym,Ice Cream Shop,Japanese Restaurant,Mexican Restaurant,Park,Cycle Studio
31,Noho,Wine Bar,Coffee Shop,Café,Ice Cream Shop,Bakery,Thai Restaurant,Pizza Place,Hotel,Speakeasy,Sandwich Place
37,Stuyvesant Town,Cocktail Bar,Pizza Place,Wine Bar,Park,Coffee Shop,Bar,Bakery,Gourmet Shop,Ice Cream Shop,Wine Shop
38,Flatiron,Gym / Fitness Center,American Restaurant,Bakery,New American Restaurant,Gourmet Shop,Gym,Pizza Place,Park,Coffee Shop,Grocery Store


### Cluster 5

In [52]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 4, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]


Cluster 2 (cluster label 1) contains Neighborhoods with few or no restaurants among their most common venues compared with the other clusters. On that basis, the Neighborhoods in this cluster (Lincoln Square, Clinton, Midtown, Murray Hill, Midtown South and Hudson Yards) are the prime locations for the restaurant business.

Within this cluster, the most visited places are entertainment venues such as Theaters and Parks. This strengthens the belief that people spend longer stretches of time there with family, which in turn increases the chance that they visit a restaurant for a meal. The food venues that do appear in this cluster are mostly fast-food types, which supports a unique selling position as one of the few businesses serving hot food in the locality.
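
One rough way to check the "few restaurants" observation is to count how many of each Neighborhood's top 10 venues are restaurant categories. The sketch below copies four rows from the Cluster 2 table above and matches on the word "Restaurant" in the category name. That matching rule is an assumption for illustration, and it does not count fast-food categories such as Pizza Place, Sandwich Place or Burger Joint.

```python
import pandas as pd

# top 10 venues copied from the Cluster 2 table for four of its Neighborhoods
top10 = {
    'Lincoln Square': ['Theater', 'Park', 'Concert Hall', 'Gym', 'Jazz Club', 'Bakery',
                       'Spa', 'Performing Arts Venue', 'Hotel', 'Gym / Fitness Center'],
    'Midtown': ['Theater', 'Gym', 'Hotel', 'Concert Hall', 'Plaza', 'Bakery',
                'Museum', 'Boxing Gym', 'Boutique', 'Sandwich Place'],
    'Murray Hill': ['Theater', 'Hotel', 'American Restaurant', 'Gourmet Shop', 'Plaza', 'Park',
                    'Burger Joint', 'Sandwich Place', 'Boxing Gym', 'Gym / Fitness Center'],
    'Midtown South': ['Theater', 'Gym / Fitness Center', 'New American Restaurant',
                      'American Restaurant', 'Hotel', 'Coffee Shop', 'Gym',
                      'Mediterranean Restaurant', 'Miscellaneous Shop', 'Sushi Restaurant'],
}

# count categories whose name contains "Restaurant" (a simple, assumed matching rule)
counts = pd.Series({
    name: sum('Restaurant' in venue for venue in venues)
    for name, venues in top10.items()
})
print(counts)
```

A count like this varies within the cluster, so it is worth reading per Neighborhood rather than only at the cluster level before settling on a location.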