# A description.

## Background

New York City is the most densely populated major city in the United States. A global power city, it has been described as the cultural, financial, and media capital of the world, and it exerts a significant impact on commerce, entertainment, research, technology, education, politics, tourism, art, fashion, and sports. Manhattan, the city's central island, is home to Times Square, the Financial District with Wall Street and the New York Stock Exchange, the United Nations headquarters, Central Park, the Metropolitan Museum of Art, and many other famous landmarks and tourist attractions.

## Business Problem

In this project we use machine learning tools to help home buyers and investors make well-informed decisions. The business problem is: how can we support investors and business owners who are deciding whether to buy property in Manhattan, New York? To address it, I intend to cluster Manhattan neighborhoods by their venues and combine the clusters with the current average real estate price, so that buyers can see where an investment might make sense. We will recommend locations according to the amenities and essential facilities that surround them, e.g. elementary schools, high schools, hospitals, and grocery stores.


# Data

Data on Manhattan properties and the prices paid for them were extracted from the NYC Department of Finance (https://www1.nyc.gov/site/finance/taxes/property-annualized-sales-update.page). The dataset contains the following columns: BOROUGH, NEIGHBORHOOD, BUILDING CLASS CATEGORY, TAX CLASS AT PRESENT, BLOCK, LOT, EASE-MENT, BUILDING CLASS AT PRESENT, ADDRESS, APARTMENT NUMBER, ZIP CODE, RESIDENTIAL UNITS, COMMERCIAL UNITS, TOTAL UNITS, LAND SQUARE FEET, GROSS SQUARE FEET, YEAR BUILT, TAX CLASS AT TIME OF SALE, BUILDING CLASS AT TIME OF SALE, SALE PRICE, SALE DATE. For simplicity, we reduce the data in Excel to four columns: NEIGHBORHOOD, ADDRESS, PRICE, DATE.
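The same reduction could also be done in pandas rather than Excel. The sketch below is only an illustration under assumptions: the in-memory CSV stands in for the real export, the second sample row is hypothetical, and the column names `SALE PRICE` and `SALE DATE` are taken from the column list above.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for the Department of Finance export.
# The second row is hypothetical and only there to make the example non-trivial.
raw_csv = io.StringIO(
    "NEIGHBORHOOD;ADDRESS;ZIP CODE;SALE PRICE;SALE DATE\n"
    "ALPHABET CITY;189 EAST 7TH STREET;10009;4844809;22.05.2018\n"
    "EXAMPLE AREA;1 EXAMPLE STREET;10000;0;01.01.2018\n"
)
raw = pd.read_csv(raw_csv, sep=";")

# Keep the four columns used in this notebook and shorten two of the names.
reduced = raw[["NEIGHBORHOOD", "ADDRESS", "SALE PRICE", "SALE DATE"]].rename(
    columns={"SALE PRICE": "PRICE", "SALE DATE": "DATE"}
)
print(reduced)
```

Doing this step in code keeps the whole pipeline reproducible from the original download.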

To explore and recommend locations according to the presence of amenities and essential facilities, we will access venue data through the Foursquare API and arrange it in a dataframe for visualization. By merging the Department of Finance price data with the Foursquare data on the amenities and essential facilities surrounding the properties, we will be able to recommend promising real estate investments.

# Methodology

This section includes:
<ol>
  <li>Collect Inspection Data</li>
  <li>Explore and Understand Data</li>
  <li>Data preparation and preprocessing </li>
  <li>Modeling</li>
</ol>

#### 1. Collect Inspection Data

In [1]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
import json # library to handle JSON files
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import requests # library to handle requests
from pandas import json_normalize # transform a JSON file into a pandas dataframe (top-level import in pandas >= 1.0)
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
import folium # map rendering library
import urllib.request
print('Libraries imported.')

Libraries imported.


In [2]:
df = pd.read_csv('manhattan.csv', sep=';')

#### 2. Explore and Understand Data 

In [5]:
print(df.head())
print(df.shape)

    Neighborhood               Adress      Price        Date
0  ALPHABET CITY  189 EAST 7TH STREET  4844809.0  22.05.2018
1  ALPHABET CITY  189 EAST 7TH STREET        0.0  23.05.2018
2  ALPHABET CITY  526 EAST 5TH STREET  6100000.0  03.12.2018
3  ALPHABET CITY         113 AVENUE C        0.0  25.04.2018
4  ALPHABET CITY         166 AVENUE A        0.0  29.11.2018
(22924, 4)


#### 3. Data preparation and preprocessing

In [6]:
# Assign columns names.
df.columns = ['NEIGHBORHOOD', 'ADDRESS', 'PRICE', 'DATE']

In [7]:
# Formatting and cleaning the data.
df = df.drop(['ADDRESS', 'DATE'], axis=1)
df = df[df.PRICE != 0]
# Remove parenthesized suffixes; regex=True must be explicit in pandas >= 2.0
df['NEIGHBORHOOD'] = df['NEIGHBORHOOD'].str.replace(r"\(.*\)", "", regex=True)

# Merge sub-areas into one neighborhood; case=False is only valid together with regex=True
df["NEIGHBORHOOD"] = df["NEIGHBORHOOD"].str.replace("GREENWICH VILLAGE-WEST", "GREENWICH VILLAGE", case=False, regex=True)
df["NEIGHBORHOOD"] = df["NEIGHBORHOOD"].str.replace("GREENWICH VILLAGE-CENTRAL", "GREENWICH VILLAGE", case=False, regex=True)

df["NEIGHBORHOOD"] = df["NEIGHBORHOOD"].str.replace("WASHINGTON HEIGHTS LOWER", "WASHINGTON HEIGHTS", case=False, regex=True)
df["NEIGHBORHOOD"] = df["NEIGHBORHOOD"].str.replace("WASHINGTON HEIGHTS UPPER", "WASHINGTON HEIGHTS", case=False, regex=True)

df["NEIGHBORHOOD"] = df["NEIGHBORHOOD"].str.replace("MIDTOWN CBD", "MIDTOWN CENTRAL", case=False, regex=True)

In [9]:
# Count the number of sales records in each neighborhood
df['NEIGHBORHOOD'].value_counts()

UPPER EAST SIDE        2585
UPPER WEST SIDE        2047
GREENWICH VILLAGE       918
MIDTOWN EAST            855
MIDTOWN WEST            614
CHELSEA                 547
HARLEM-CENTRAL          511
MURRAY HILL             465
TRIBECA                 433
GRAMERCY                428
LOWER EAST SIDE         345
WASHINGTON HEIGHTS      334
FLATIRON                265
KIPS BAY                261
CLINTON                 223
FINANCIAL               221
SOHO                    194
MANHATTAN VALLEY        162
EAST VILLAGE            153
MIDTOWN CENTRAL         148
FASHION                 138
HARLEM-EAST             128
ALPHABET CITY           118
CIVIC CENTER            117
HARLEM-UPPER            114
CHINATOWN                91
SOUTHBRIDGE              84
INWOOD                   77
JAVITS CENTER            66
MORNINGSIDE HEIGHTS      53
LITTLE ITALY             49
HARLEM-WEST              27
ROOSEVELT ISLAND         23
Name: NEIGHBORHOOD, dtype: int64

In [11]:
# Grouping
df_nei = df.groupby(['NEIGHBORHOOD'])['PRICE'].mean().reset_index()
# Give meaningful names to the columns
df_nei.columns = ['NEIGHBORHOOD', 'AVG_PRICE']
df_nei['AVG_PRICE'] = df_nei['AVG_PRICE'].astype(int)

In [12]:
# Append "MANHATTAN" to each neighborhood name to improve the address search
df_nei['NEIGHBORHOOD'] = df_nei['NEIGHBORHOOD'] + ', MANHATTAN'

In [13]:
df_nei['AVG_PRICE'] = df_nei['AVG_PRICE'].astype(int)

In [14]:
# Our data in descending order
df_nei.sort_values('AVG_PRICE', ascending=False)

                  NEIGHBORHOOD  AVG_PRICE
21  MIDTOWN CENTRAL, MANHATTAN   19053401
6           FASHION, MANHATTAN   17918463
23     MIDTOWN WEST, MANHATTAN   10692167
8          FLATIRON, MANHATTAN    8055894
29          TRIBECA, MANHATTAN    7592438
18     LITTLE ITALY, MANHATTAN    5681737
14      HARLEM-WEST, MANHATTAN    5347516
16    JAVITS CENTER, MANHATTAN    5154428
1           CHELSEA, MANHATTAN    4944513
3      CIVIC CENTER, MANHATTAN    4908241


In [15]:
for index, item in df_nei.iterrows():
    print(f"index: {index}")
    print(f"item: {item}")
    print(f"item.NEIGHBORHOOD only: {item.NEIGHBORHOOD}")

index: 0
item: NEIGHBORHOOD    ALPHABET CITY, MANHATTAN
AVG_PRICE                        3456643
Name: 0, dtype: object
item.NEIGHBORHOOD only: ALPHABET CITY, MANHATTAN
index: 1
item: NEIGHBORHOOD    CHELSEA, MANHATTAN
AVG_PRICE                  4944513
Name: 1, dtype: object
item.NEIGHBORHOOD only: CHELSEA, MANHATTAN
index: 2
item: NEIGHBORHOOD    CHINATOWN, MANHATTAN
AVG_PRICE                    3398609
Name: 2, dtype: object
item.NEIGHBORHOOD only: CHINATOWN, MANHATTAN
index: 3
item: NEIGHBORHOOD    CIVIC CENTER, MANHATTAN
AVG_PRICE                       4908241
Name: 3, dtype: object
item.NEIGHBORHOOD only: CIVIC CENTER, MANHATTAN
index: 4
item: NEIGHBORHOOD    CLINTON, MANHATTAN
AVG_PRICE                  3220975
Name: 4, dtype: object
item.NEIGHBORHOOD only: CLINTON, MANHATTAN
index: 5
item: NEIGHBORHOOD    EAST VILLAGE, MANHATTAN
AVG_PRICE                       3277558
Name: 5, dtype: object
item.NEIGHBORHOOD only: EAST VILLAGE, MANHATTAN
index: 6
item: NEIGHBORHOOD    FASHION, MANHATTAN
AVG_PRICE                 

In [16]:
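geolocator = Nominatim(user_agent="manhattan_explorer")  # recent geopy versions need an explicit user_agent; this value is an arbitrary placeholder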

In [17]:
# Use the geolocator to find the latitude and longitude of each neighborhood
df_nei['location'] = df_nei['NEIGHBORHOOD'].apply(geolocator.geocode).apply(lambda x: (x.latitude, x.longitude))
df_nei[['latitude', 'longitude']] = df_nei['location'].apply(pd.Series)
df = df_nei.drop(columns=['location'])
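The `.apply` chain above fails with an `AttributeError` if the geocoder returns `None` for a name it cannot resolve. The sketch below shows one possible guard. `safe_coordinates`, `FakeLocation`, and `fake_geocode` are hypothetical names introduced only for this illustration; in the notebook the callable would be `geolocator.geocode`. The sample coordinates are the Alphabet City values that appear in the venue table later in this notebook.

```python
from typing import Callable, Optional, Tuple


def safe_coordinates(query: str, geocode: Callable) -> Tuple[Optional[float], Optional[float]]:
    """Return (latitude, longitude) for query, or (None, None) if nothing is found."""
    result = geocode(query)
    if result is None:
        return (None, None)
    return (result.latitude, result.longitude)


class FakeLocation:
    """Minimal stand-in for a geocoding result with latitude/longitude attributes."""

    def __init__(self, latitude: float, longitude: float):
        self.latitude = latitude
        self.longitude = longitude


def fake_geocode(query: str):
    """Offline stand-in for geolocator.geocode, so the example needs no network access."""
    known = {"ALPHABET CITY, MANHATTAN": FakeLocation(40.725102, -73.979583)}
    return known.get(query)


found = safe_coordinates("ALPHABET CITY, MANHATTAN", fake_geocode)
missing = safe_coordinates("NOT A REAL PLACE", fake_geocode)
print(found, missing)
```

Rows that come back as `(None, None)` can then be inspected or dropped before the map is drawn.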

In [18]:
# Manhattan location

address = 'Manhattan, New York'
geolocator = Nominatim(user_agent="manhattan_explorer")  # recent geopy versions need an explicit user_agent; this value is an arbitrary placeholder
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))

The geographical coordinates of Manhattan, New York are 40.7900869, -73.9598295.


In [19]:
manhattan_map = folium.Map(location=[latitude, longitude], zoom_start=11.4)

# Add one circle marker per neighborhood, labeled with its name and average price
for lat, lng, avg_price, neighborhood in zip(df['latitude'], df['longitude'], df['AVG_PRICE'], df['NEIGHBORHOOD']):
    label = '{}, {}'.format(neighborhood, avg_price)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#e072b9',
        fill_opacity=0.7,  # opacity must be a number between 0 and 1, not a color name
    ).add_to(manhattan_map)
manhattan_map

In [21]:
# Define Foursquare credentials and version

CLIENT_ID = 'xyz' # your Foursquare ID
CLIENT_SECRET = 'xyz' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 100
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: xyz
CLIENT_SECRET: xyz


With the data prepared, we can move on to modeling. We will analyze the neighborhoods in order to recommend areas where home buyers can make a real estate investment, based on the amenities and essential facilities that surround them.

#### 4. Modeling

We now use clustering to analyze the data. We choose the k-means clustering technique because it is fast and efficient in terms of computational cost, is flexible enough to be re-run as the Manhattan real estate market changes, and is expected to give accurate groupings.
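As a rough illustration of the clustering step, the sketch below runs scikit-learn's `KMeans` on a small, made-up venue-frequency matrix. The neighborhood names, the frequency values, and the choice of two clusters are all assumptions for the example and are not taken from the Manhattan data.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical venue-frequency matrix: one row per neighborhood,
# one column per venue category (each row sums to 1).
neighborhoods = ["A", "B", "C", "D"]
frequencies = np.array([
    [0.90, 0.10, 0.00],
    [0.85, 0.15, 0.00],
    [0.00, 0.20, 0.80],
    [0.05, 0.15, 0.80],
])

# n_clusters=2 is an arbitrary choice for this toy example.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(frequencies)
labels = kmeans.labels_

for name, label in zip(neighborhoods, labels):
    print(name, label)
```

Neighborhoods with similar venue profiles (A and B, C and D) end up in the same cluster; the numeric cluster labels themselves carry no meaning.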

In [22]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT=100):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
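The function above indexes `['categories'][0]` directly, which raises an `IndexError` for a venue that has no category. The sketch below shows one defensive way to flatten a response of the same shape. `extract_venues` and `sample_payload` are hypothetical names for this illustration; the payload only mimics the keys the function above reads, and the second venue is made up to show the empty-category case.

```python
def extract_venues(payload: dict) -> list:
    """Flatten an explore-style payload into (name, lat, lng, category) tuples."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories", [])
        # Fall back to None when a venue has no category instead of raising IndexError.
        category = categories[0]["name"] if categories else None
        rows.append((venue["name"], venue["location"]["lat"], venue["location"]["lng"], category))
    return rows


# Hand-written payload with the same nesting the function above expects.
sample_payload = {
    "response": {
        "groups": [
            {
                "items": [
                    {"venue": {"name": "Mace",
                               "location": {"lat": 40.725395, "lng": -73.978235},
                               "categories": [{"name": "Cocktail Bar"}]}},
                    {"venue": {"name": "Uncategorized Example",
                               "location": {"lat": 40.0, "lng": -74.0},
                               "categories": []}},
                ]
            }
        ]
    }
}

venues = extract_venues(sample_payload)
print(venues)
```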

In [23]:
# Run the above function on each location and create a new dataframe called location_venues and display it.
location_venues = getNearbyVenues(names=df['NEIGHBORHOOD'],
                                   latitudes=df['latitude'],
                                   longitudes=df['longitude']
                                  )

ALPHABET CITY, MANHATTAN
CHELSEA, MANHATTAN
CHINATOWN, MANHATTAN
CIVIC CENTER, MANHATTAN
CLINTON, MANHATTAN
EAST VILLAGE, MANHATTAN
FASHION, MANHATTAN
FINANCIAL, MANHATTAN
FLATIRON, MANHATTAN
GRAMERCY, MANHATTAN
GREENWICH VILLAGE, MANHATTAN
HARLEM-CENTRAL, MANHATTAN
HARLEM-EAST, MANHATTAN
HARLEM-UPPER, MANHATTAN
HARLEM-WEST, MANHATTAN
INWOOD, MANHATTAN
JAVITS CENTER, MANHATTAN
KIPS BAY, MANHATTAN
LITTLE ITALY, MANHATTAN
LOWER EAST SIDE, MANHATTAN
MANHATTAN VALLEY, MANHATTAN
MIDTOWN CENTRAL, MANHATTAN
MIDTOWN EAST, MANHATTAN
MIDTOWN WEST, MANHATTAN
MORNINGSIDE HEIGHTS, MANHATTAN
MURRAY HILL, MANHATTAN
ROOSEVELT ISLAND, MANHATTAN
SOHO, MANHATTAN
SOUTHBRIDGE, MANHATTAN
TRIBECA, MANHATTAN
UPPER EAST SIDE , MANHATTAN
UPPER WEST SIDE , MANHATTAN
WASHINGTON HEIGHTS, MANHATTAN


In [55]:
location_venues.head(10)

               Neighborhood  Neighborhood Latitude  Neighborhood Longitude                         Venue  Venue Latitude  Venue Longitude        Venue Category
0  ALPHABET CITY, MANHATTAN              40.725102              -73.979583    Sunny & Annie Gourmet Deli        40.72459         -73.9816         Deli / Bodega
1  ALPHABET CITY, MANHATTAN              40.725102              -73.979583        Alphabet City Beer Co.       40.723753       -73.979043              Beer Bar
2  ALPHABET CITY, MANHATTAN              40.725102              -73.979583              Bobwhite Counter       40.723715       -73.979121   Fried Chicken Joint
3  ALPHABET CITY, MANHATTAN              40.725102              -73.979583               Sake Bar Satsko       40.724647        -73.98019              Sake Bar
4  ALPHABET CITY, MANHATTAN              40.725102              -73.979583                   The Wayland       40.725264        -73.97804          Cocktail Bar
5  ALPHABET CITY, MANHATTAN              40.725102              -73.979583                          Lois       40.723849       -73.979033              Wine Bar
6  ALPHABET CITY, MANHATTAN              40.725102              -73.979583                          Mace       40.725395       -73.978235          Cocktail Bar
7  ALPHABET CITY, MANHATTAN              40.725102              -73.979583           CrossFit East River       40.725443       -73.978461  Gym / Fitness Center
8  ALPHABET CITY, MANHATTAN              40.725102              -73.979583                      Loverboy       40.724728       -73.978482          Cocktail Bar
9  ALPHABET CITY, MANHATTAN              40.725102              -73.979583  Tompkins Square Park Dog Run       40.726538       -73.981297               Dog Run


In [25]:
location_venues.groupby('Neighborhood').count()

Neighborhood              Neighborhood Latitude  Neighborhood Longitude  Venue  Venue Latitude  Venue Longitude  Venue Category
ALPHABET CITY, MANHATTAN                    100                     100    100             100              100             100
CHELSEA, MANHATTAN                          100                     100    100             100              100             100
CHINATOWN, MANHATTAN                        100                     100    100             100              100             100
CIVIC CENTER, MANHATTAN                     100                     100    100             100              100             100
CLINTON, MANHATTAN                          100                     100    100             100              100             100
EAST VILLAGE, MANHATTAN                     100                     100    100             100              100             100
FASHION, MANHATTAN                          100                     100    100             100              100             100
FINANCIAL, MANHATTAN                        100                     100    100             100              100             100
FLATIRON, MANHATTAN                         100                     100    100             100              100             100
GRAMERCY, MANHATTAN                         100                     100    100             100              100             100


In [26]:
print('There are {} unique categories.'.format(len(location_venues['Venue Category'].unique())))

There are 294 unique categories.


In [27]:
location_venues.shape

(2997, 7)

In [28]:
# one hot encoding
venues_onehot = pd.get_dummies(location_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhoods column back to dataframe
venues_onehot['Neighborhood'] = location_venues['Neighborhood'] 

# move neighborhoods column to the first column
fixed_columns = [venues_onehot.columns[-1]] + list(venues_onehot.columns[:-1])

#fixed_columns
venues_onehot = venues_onehot[fixed_columns]

venues_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,African Restaurant,American Restaurant,Amphitheater,Antique Shop,Arcade,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Austrian Restaurant,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Basketball Stadium,Beach,Beer Bar,Beer Store,Bike Rental / Bike Share,Bike Shop,Bistro,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Boxing Gym,Brazilian Restaurant,Breakfast Spot,Bridal Shop,Bridge,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Butcher,Café,Cajun / Creole Restaurant,Camera Store,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Churrascaria,Clothing Store,Cocktail Bar,Coffee Shop,College Cafeteria,College Theater,Comedy Club,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Coworking Space,Creperie,Cuban Restaurant,Cultural Center,Cupcake Shop,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Dog Run,Donut Shop,Drugstore,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Empanada Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Halal Restaurant,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jewish Restaurant,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Korean Restaurant,Latin American Restaurant,Laundry Service,Lebanese Restaurant,Lingerie Store,Liquor Store,Lounge,Luggage Store,Malay Restaurant,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Moving Target,Multiplex,Museum,Music School,Music Store,Music Venue,Nail Salon,New American Restaurant,Newsstand,Nightclub,Noodle House,North Indian Restaurant,Office,Optical Shop,Organic Grocery,Other Great Outdoors,Outdoor Sculpture,Outdoors & Recreation,Paella Restaurant,Paper / Office Supplies Store,Park,Peking Duck Restaurant,Performing Arts Venue,Perfume Shop,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Piano Bar,Pilates Studio,Pizza Place,Playground,Plaza,Poke Place,Pool,Pub,Public Art,Ramen Restaurant,Record Shop,Recreation Center,Rental Car Location,Residential Building (Apartment / Condo),Resort,Restaurant,Roof Deck,Russian Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,School,Sculpture Garden,Seafood Restaurant,Shabu-Shabu Restaurant,Shanghai Restaurant,Shipping Store,Shoe Repair,Shoe Store,Shop & Service,Ski Shop,Smoke Shop,Snack Place,Soba Restaurant,Soccer Field,Social Club,Soup Place,Southern / Soul Food Restaurant,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stationery Store,Steakhouse,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Swiss Restaurant,Synagogue,Szechuan Restaurant,TV Station,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tattoo Parlor,Tea Room,Tech Startup,Tennis Court,Thai Restaurant,Theater,Thrift / Vintage Store,Tour Provider,Toy / Game Store,Track,Trail,Turkish Restaurant,Udon Restaurant,Ukrainian Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Vietnamese Restaurant,Watch Shop,Waterfront,Weight Loss Center,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,"ALPHABET CITY, MANHATTAN",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"ALPHABET CITY, MANHATTAN",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,"ALPHABET CITY, MANHATTAN",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,"ALPHABET CITY, MANHATTAN",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,"ALPHABET CITY, MANHATTAN",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
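The wide table above is hard to read, so here is the same idea on a toy example: one-hot encode the venue categories, then average per neighborhood to obtain the share of each category. The neighborhoods and venues below are hypothetical and are not taken from the Foursquare results.

```python
import pandas as pd

# Hypothetical toy venue list: four venues in neighborhood A, two in B.
toy_venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Bar", "Bar", "Park", "Bakery", "Park", "Park"],
})

# One 0/1 column per category, with the neighborhood added back as the first column.
onehot = pd.get_dummies(toy_venues[["Venue Category"]], prefix="", prefix_sep="", dtype=int)
onehot.insert(0, "Neighborhood", toy_venues["Neighborhood"])

# The mean of a 0/1 column is the fraction of venues in that category.
frequencies = onehot.groupby("Neighborhood").mean()
most_common = frequencies.idxmax(axis=1)

print(frequencies)
print(most_common)
```

Each row of `frequencies` sums to 1, which is what makes neighborhoods with different venue counts comparable.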


In [29]:
manhattan_grouped = venues_onehot.groupby('Neighborhood').mean().reset_index()
manhattan_grouped

Unnamed: 0,Neighborhood,Accessories Store,African Restaurant,American Restaurant,Amphitheater,Antique Shop,Arcade,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Austrian Restaurant,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Basketball Stadium,Beach,Beer Bar,Beer Store,Bike Rental / Bike Share,Bike Shop,Bistro,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Boxing Gym,Brazilian Restaurant,Breakfast Spot,Bridal Shop,Bridge,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Butcher,Café,Cajun / Creole Restaurant,Camera Store,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Churrascaria,Clothing Store,Cocktail Bar,Coffee Shop,College Cafeteria,College Theater,Comedy Club,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Coworking Space,Creperie,Cuban Restaurant,Cultural Center,Cupcake Shop,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Dog Run,Donut Shop,Drugstore,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Empanada Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Halal Restaurant,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jewish Restaurant,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Korean Restaurant,Latin American Restaurant,Laundry Service,Lebanese Restaurant,Lingerie Store,Liquor Store,Lounge,Luggage Store,Malay Restaurant,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Moving Target,Multiplex,Museum,Music School,Music Store,Music Venue,Nail Salon,New American Restaurant,Newsstand,Nightclub,Noodle House,North Indian Restaurant,Office,Optical Shop,Organic Grocery,Other Great Outdoors,Outdoor Sculpture,Outdoors & Recreation,Paella Restaurant,Paper / Office Supplies Store,Park,Peking Duck Restaurant,Performing Arts Venue,Perfume Shop,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Piano Bar,Pilates Studio,Pizza Place,Playground,Plaza,Poke Place,Pool,Pub,Public Art,Ramen Restaurant,Record Shop,Recreation Center,Rental Car Location,Residential Building (Apartment / Condo),Resort,Restaurant,Roof Deck,Russian Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,School,Sculpture Garden,Seafood Restaurant,Shabu-Shabu Restaurant,Shanghai Restaurant,Shipping Store,Shoe Repair,Shoe Store,Shop & Service,Ski Shop,Smoke Shop,Snack Place,Soba Restaurant,Soccer Field,Social Club,Soup Place,Southern / Soul Food Restaurant,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stationery Store,Steakhouse,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Swiss Restaurant,Synagogue,Szechuan Restaurant,TV Station,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tattoo Parlor,Tea Room,Tech Startup,Tennis Court,Thai Restaurant,Theater,Thrift / Vintage Store,Tour Provider,Toy / Game Store,Track,Trail,Turkish Restaurant,Udon Restaurant,Ukrainian Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Vietnamese Restaurant,Watch Shop,Waterfront,Weight Loss Center,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,"ALPHABET CITY, MANHATTAN",0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.07,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.09,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.02,0.0,0.0,0.0
1,"CHELSEA, MANHATTAN",0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.26,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.03,0.01,0.0,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.01,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01
2,"CHINATOWN, MANHATTAN",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.07,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.14,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.03,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.03,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01
3,"CIVIC CENTER, MANHATTAN",0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.13,0.0,0.0,0.0,0.0,0.02,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.03,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.04,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02
4,"CLINTON, MANHATTAN",0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.05,0.0,0.03,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.06,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.05,0.04,0.0,0.0,0.0
5,"EAST VILLAGE, MANHATTAN",0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.02,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.01,0.0,0.0,0.02,0.01,0.06,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01
6,"FASHION, MANHATTAN",0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.02,0.0,0.02,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.06,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.04,0.07,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.02,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.05,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02
7,"FINANCIAL, MANHATTAN",0.01,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.03,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.04,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.03,0.01,0.01,0.0,0.02,0.0,0.04,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.02,0.0,0.03,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.01
8,"FLATIRON, MANHATTAN",0.01,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.01,0.06,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.04,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.03,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.01,0.0,0.01,0.02,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.02,0.03
9,"GRAMERCY, MANHATTAN",0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.02,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.01,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.02,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.05,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.05,0.0,0.0,0.01


In [31]:
manhattan_grouped.shape

(33, 295)

In [32]:
# What are the top 5 venues/facilities nearby profitable real estate investments?

num_top_venues = 5

for hood in manhattan_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = manhattan_grouped[manhattan_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----ALPHABET CITY, MANHATTAN----
                venue  freq
0        Cocktail Bar  0.09
1                 Bar  0.07
2         Coffee Shop  0.04
3              Garden  0.03
4  Italian Restaurant  0.03


----CHELSEA, MANHATTAN----
                     venue  freq
0              Art Gallery  0.26
1       Italian Restaurant  0.03
2  Health & Beauty Service  0.03
3                  Theater  0.03
4         Tapas Restaurant  0.03


----CHINATOWN, MANHATTAN----
                   venue  freq
0     Chinese Restaurant  0.14
1                 Bakery  0.07
2  Vietnamese Restaurant  0.05
3       Malay Restaurant  0.03
4         Ice Cream Shop  0.03


----CIVIC CENTER, MANHATTAN----
                   venue  freq
0     Chinese Restaurant  0.13
1         Sandwich Place  0.04
2  Vietnamese Restaurant  0.04
3     Dim Sum Restaurant  0.04
4            Coffee Shop  0.04


----CLINTON, MANHATTAN----
                venue  freq
0     Thai Restaurant  0.06
1             Theater  0.05
2  Italian Restaurant 

In [33]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [34]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

In [36]:
venues_sorted = pd.DataFrame(columns=columns)
venues_sorted['Neighborhood'] = manhattan_grouped['Neighborhood']

for ind in np.arange(manhattan_grouped.shape[0]):
    venues_sorted.iloc[ind, 1:] = return_most_common_venues(manhattan_grouped.iloc[ind, :], num_top_venues)

In [37]:
venues_sorted.shape

(33, 11)

In [38]:
venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"ALPHABET CITY, MANHATTAN",Cocktail Bar,Bar,Coffee Shop,Wine Bar,Italian Restaurant,Garden,Latin American Restaurant,Eastern European Restaurant,Nightclub,Dessert Shop
1,"CHELSEA, MANHATTAN",Art Gallery,Health & Beauty Service,Tapas Restaurant,Coffee Shop,Italian Restaurant,Theater,Ice Cream Shop,American Restaurant,Vegetarian / Vegan Restaurant,Bagel Shop
2,"CHINATOWN, MANHATTAN",Chinese Restaurant,Bakery,Vietnamese Restaurant,Dim Sum Restaurant,Italian Restaurant,Salon / Barbershop,Malay Restaurant,Ice Cream Shop,Optical Shop,Noodle House
3,"CIVIC CENTER, MANHATTAN",Chinese Restaurant,Coffee Shop,Sandwich Place,Vietnamese Restaurant,Dim Sum Restaurant,Bakery,Dessert Shop,Bubble Tea Shop,Optical Shop,Park
4,"CLINTON, MANHATTAN",Thai Restaurant,Italian Restaurant,Wine Bar,Theater,Mexican Restaurant,Pizza Place,Wine Shop,Bar,Ramen Restaurant,Japanese Restaurant


In [39]:
print(venues_sorted.shape)
print(manhattan_grouped.shape)

(33, 11)
(33, 295)


In [40]:
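manhattan_grouped = df  # from here on, work with the price/location table (NEIGHBORHOOD, AVG_PRICE, latitude, longitude)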

In [41]:
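kclusters = 5  # number of clusters

# cluster on the numeric columns only; the positional axis argument of drop() was removed in pandas 2.0
manhattan_grouped_clustering = manhattan_grouped.drop(columns='NEIGHBORHOOD')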

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(manhattan_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:50]

array([0, 4, 0, 4, 0, 0, 1, 3, 2, 3, 0, 3, 0, 3, 4, 3, 4, 3, 4, 3, 3, 1,
       0, 2, 3, 3, 3, 4, 0, 2, 0, 0, 3], dtype=int32)

In [42]:
manhattan_grouped_clustering=df
manhattan_grouped_clustering.head()

Unnamed: 0,NEIGHBORHOOD,AVG_PRICE,latitude,longitude
0,"ALPHABET CITY, MANHATTAN",3456643,40.725102,-73.979583
1,"CHELSEA, MANHATTAN",4944513,40.746491,-74.001528
2,"CHINATOWN, MANHATTAN",3398609,40.716491,-73.99625
3,"CIVIC CENTER, MANHATTAN",4908241,40.713679,-74.002404
4,"CLINTON, MANHATTAN",3220975,40.764423,-73.992392


In [43]:
print(manhattan_grouped_clustering.shape)
print(df.shape)

(33, 4)
(33, 4)


In [44]:
print(manhattan_grouped_clustering.dtypes)
print('-----------------------')
print(df.dtypes)

NEIGHBORHOOD     object
AVG_PRICE         int64
latitude        float64
longitude       float64
dtype: object
-----------------------
NEIGHBORHOOD     object
AVG_PRICE         int64
latitude        float64
longitude       float64
dtype: object


In [45]:
manhattan_grouped_clustering['Cluster Labels'] = kmeans.labels_

# join venues_sorted onto manhattan_grouped_clustering to add the most common venues for each neighborhood
manhattan_grouped_clustering = manhattan_grouped_clustering.join(venues_sorted.set_index('Neighborhood'), on='NEIGHBORHOOD')

manhattan_grouped_clustering.head(30) # check the last columns!

Unnamed: 0,NEIGHBORHOOD,AVG_PRICE,latitude,longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"ALPHABET CITY, MANHATTAN",3456643,40.725102,-73.979583,0,Cocktail Bar,Bar,Coffee Shop,Wine Bar,Italian Restaurant,Garden,Latin American Restaurant,Eastern European Restaurant,Nightclub,Dessert Shop
1,"CHELSEA, MANHATTAN",4944513,40.746491,-74.001528,4,Art Gallery,Health & Beauty Service,Tapas Restaurant,Coffee Shop,Italian Restaurant,Theater,Ice Cream Shop,American Restaurant,Vegetarian / Vegan Restaurant,Bagel Shop
2,"CHINATOWN, MANHATTAN",3398609,40.716491,-73.99625,0,Chinese Restaurant,Bakery,Vietnamese Restaurant,Dim Sum Restaurant,Italian Restaurant,Salon / Barbershop,Malay Restaurant,Ice Cream Shop,Optical Shop,Noodle House
3,"CIVIC CENTER, MANHATTAN",4908241,40.713679,-74.002404,4,Chinese Restaurant,Coffee Shop,Sandwich Place,Vietnamese Restaurant,Dim Sum Restaurant,Bakery,Dessert Shop,Bubble Tea Shop,Optical Shop,Park
4,"CLINTON, MANHATTAN",3220975,40.764423,-73.992392,0,Thai Restaurant,Italian Restaurant,Wine Bar,Theater,Mexican Restaurant,Pizza Place,Wine Shop,Bar,Ramen Restaurant,Japanese Restaurant
5,"EAST VILLAGE, MANHATTAN",3277558,40.729269,-73.987361,0,Japanese Restaurant,Chinese Restaurant,Coffee Shop,Ice Cream Shop,Seafood Restaurant,Sushi Restaurant,Jewelry Store,Cocktail Bar,Ramen Restaurant,Pizza Place
6,"FASHION, MANHATTAN",17918463,40.747261,-73.994551,1,Gym / Fitness Center,Coffee Shop,Pizza Place,Hotel,Gym,Flower Shop,Sandwich Place,Yoga Studio,French Restaurant,Food Truck
7,"FINANCIAL, MANHATTAN",2465916,40.707612,-74.009378,3,Steakhouse,American Restaurant,Hotel,Juice Bar,Coffee Shop,Pizza Place,Gym,Italian Restaurant,Café,Spa
8,"FLATIRON, MANHATTAN",8055894,40.741086,-73.98963,2,Gym,American Restaurant,Gym / Fitness Center,Italian Restaurant,Yoga Studio,Cycle Studio,Vegetarian / Vegan Restaurant,Japanese Restaurant,Mediterranean Restaurant,Mexican Restaurant
9,"GRAMERCY, MANHATTAN",1977192,40.735519,-73.984079,3,Spa,Wine Shop,American Restaurant,Italian Restaurant,Pizza Place,Bagel Shop,Hotel,Indie Theater,Coffee Shop,Bar


In [46]:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(manhattan_grouped_clustering['latitude'], manhattan_grouped_clustering['longitude'], manhattan_grouped_clustering['NEIGHBORHOOD'], manhattan_grouped_clustering['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [47]:
manhattan_grouped_clustering.loc[manhattan_grouped_clustering['Cluster Labels'] == 0, manhattan_grouped_clustering.columns[[1] + list(range(5, manhattan_grouped_clustering.shape[1]))]].head()


Unnamed: 0,AVG_PRICE,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,3456643,Cocktail Bar,Bar,Coffee Shop,Wine Bar,Italian Restaurant,Garden,Latin American Restaurant,Eastern European Restaurant,Nightclub,Dessert Shop
2,3398609,Chinese Restaurant,Bakery,Vietnamese Restaurant,Dim Sum Restaurant,Italian Restaurant,Salon / Barbershop,Malay Restaurant,Ice Cream Shop,Optical Shop,Noodle House
4,3220975,Thai Restaurant,Italian Restaurant,Wine Bar,Theater,Mexican Restaurant,Pizza Place,Wine Shop,Bar,Ramen Restaurant,Japanese Restaurant
5,3277558,Japanese Restaurant,Chinese Restaurant,Coffee Shop,Ice Cream Shop,Seafood Restaurant,Sushi Restaurant,Jewelry Store,Cocktail Bar,Ramen Restaurant,Pizza Place
10,3313486,Italian Restaurant,Coffee Shop,Cosmetics Shop,American Restaurant,Vegetarian / Vegan Restaurant,Yoga Studio,Sandwich Place,Cocktail Bar,Ice Cream Shop,Massage Studio


In [48]:
manhattan_grouped_clustering.loc[manhattan_grouped_clustering['Cluster Labels'] == 1, manhattan_grouped_clustering.columns[[1] + list(range(5, manhattan_grouped_clustering.shape[1]))]].head()


Unnamed: 0,AVG_PRICE,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,17918463,Gym / Fitness Center,Coffee Shop,Pizza Place,Hotel,Gym,Flower Shop,Sandwich Place,Yoga Studio,French Restaurant,Food Truck
21,19053401,Hotel,Italian Restaurant,Steakhouse,Lounge,American Restaurant,Theater,Concert Hall,French Restaurant,Clothing Store,Food Truck


In [49]:
manhattan_grouped_clustering.loc[manhattan_grouped_clustering['Cluster Labels'] == 2, manhattan_grouped_clustering.columns[[1] + list(range(5, manhattan_grouped_clustering.shape[1]))]].head()


Unnamed: 0,AVG_PRICE,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,8055894,Gym,American Restaurant,Gym / Fitness Center,Italian Restaurant,Yoga Studio,Cycle Studio,Vegetarian / Vegan Restaurant,Japanese Restaurant,Mediterranean Restaurant,Mexican Restaurant
23,10692167,Thai Restaurant,Italian Restaurant,Wine Bar,Theater,Mexican Restaurant,Pizza Place,Wine Shop,Bar,Ramen Restaurant,Japanese Restaurant
29,7592438,Italian Restaurant,Coffee Shop,Bakery,Cocktail Bar,Spa,Gym / Fitness Center,Hotel,Wine Shop,Nail Salon,Indian Restaurant


In [50]:
manhattan_grouped_clustering.loc[manhattan_grouped_clustering['Cluster Labels'] == 3, manhattan_grouped_clustering.columns[[1] + list(range(5, manhattan_grouped_clustering.shape[1]))]].head()


Unnamed: 0,AVG_PRICE,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,2465916,Steakhouse,American Restaurant,Hotel,Juice Bar,Coffee Shop,Pizza Place,Gym,Italian Restaurant,Café,Spa
9,1977192,Spa,Wine Shop,American Restaurant,Italian Restaurant,Pizza Place,Bagel Shop,Hotel,Indie Theater,Coffee Shop,Bar
11,2407765,Mobile Phone Shop,Clothing Store,Cosmetics Shop,Southern / Soul Food Restaurant,Burger Joint,Theater,African Restaurant,Pizza Place,Sushi Restaurant,French Restaurant
13,1875968,Mobile Phone Shop,Clothing Store,Cosmetics Shop,Southern / Soul Food Restaurant,Burger Joint,Theater,African Restaurant,Pizza Place,Sushi Restaurant,French Restaurant
15,860169,Café,Mexican Restaurant,Wine Shop,Wine Bar,American Restaurant,Chinese Restaurant,Bakery,Frozen Yogurt Shop,Park,Deli / Bodega


In [51]:
manhattan_grouped_clustering.loc[manhattan_grouped_clustering['Cluster Labels'] == 4, manhattan_grouped_clustering.columns[[1] + list(range(5, manhattan_grouped_clustering.shape[1]))]].head()


Unnamed: 0,AVG_PRICE,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,4944513,Art Gallery,Health & Beauty Service,Tapas Restaurant,Coffee Shop,Italian Restaurant,Theater,Ice Cream Shop,American Restaurant,Vegetarian / Vegan Restaurant,Bagel Shop
3,4908241,Chinese Restaurant,Coffee Shop,Sandwich Place,Vietnamese Restaurant,Dim Sum Restaurant,Bakery,Dessert Shop,Bubble Tea Shop,Optical Shop,Park
14,5347516,Mobile Phone Shop,Clothing Store,Cosmetics Shop,Southern / Soul Food Restaurant,Burger Joint,Theater,African Restaurant,Pizza Place,Sushi Restaurant,French Restaurant
16,5154428,American Restaurant,Boat or Ferry,Coffee Shop,Theater,Gym,Dog Run,Ice Cream Shop,Asian Restaurant,Burger Joint,Bus Station
18,5681737,Bakery,Café,Clothing Store,Sandwich Place,Salon / Barbershop,Mediterranean Restaurant,Boutique,Furniture / Home Store,Ice Cream Shop,Yoga Studio


"As New York’s population and economy continue to grow, every sector of the building industry—commercial, residential, healthcare, education, cultural and infrastructure-remains robust, offering opportunity for both the labor force and contractors,” says Carlo A. Scissura, the organization’s president and CEO. He points to the high costs of land and materials and regulations as the primary drivers of cost increases in 2018. He nonetheless adds, “While the cost of construction is high, the rewards for doing business in New York have never been greater.”

Manhattan has been described as the cultural, financial, media, and entertainment capital of the world, and the borough hosts the United Nations Headquarters. Anchored by Wall Street in the Financial District of Lower Manhattan, New York City has been called the most economically powerful city. Consequently, real estate there is expensive.

To solve this business problem, we clustered Manhattan neighborhoods so that, for each group, we can report the current average real estate price and the venues that surround it. The recommendations are based on the amenities and essential facilities near each neighborhood, e.g. hotels, schools, restaurants, and shops.

First, we gathered data on Manhattan property sales from the NYC Department of Finance (https://www1.nyc.gov/site/finance/taxes/property-annualized-sales-update.page). To explore and compare locations by the amenities and essential facilities around them, we then retrieved venue data through the Foursquare API and arranged it as a data frame for visualization. Merging the price data from the NYC Department of Finance with the venue data from Foursquare allowed us to recommend potentially profitable real estate investments.

Second, the Methodology section comprised four stages: 1. Collect Inspection Data; 2. Explore and Understand Data; 3. Data preparation and preprocessing; 4. Modeling. In the modeling stage, we used k-means clustering (k = 5) on each neighborhood's average price and coordinates. We chose it because it is fast and computationally cheap, and because it can simply be re-run as prices in the Manhattan real estate market change. In the rows shown above, the resulting clusters fall into distinct price ranges.
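
As a rough illustration of what the k-means step does, the sketch below implements the basic assign-and-update loop (Lloyd's algorithm) in plain Python. It is not scikit-learn's implementation, which the notebook actually uses. The function name `kmeans`, the choice of the first k points as starting centroids, and k = 3 are assumptions made for this example. The input is the AVG_PRICE column of the ten neighborhoods listed in the merged table above.

```python
# Minimal sketch of k-means (Lloyd's algorithm); illustration only,
# the notebook itself relies on scikit-learn's KMeans.

def kmeans(points, k, max_iter=100):
    """Cluster a list of equal-length tuples; returns (labels, centroids)."""
    # Assumption: seed the centroids with the first k points (deterministic).
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# AVG_PRICE of the ten neighborhoods shown in the merged table, as 1-D points.
prices = [3456643, 4944513, 3398609, 4908241, 3220975,
          3277558, 17918463, 2465916, 8055894, 1977192]
points = [(float(p),) for p in prices]

labels, centroids = kmeans(points, k=3)  # k=3 is an assumption for this small sample
for price, label in zip(prices, labels):
    print(label, price)
```

Because AVG_PRICE is in the millions while latitude and longitude differ by only fractions of a degree across Manhattan, the unscaled distance used in the notebook is dominated by price, so a price-only example gives a fair picture of how the neighborhoods end up grouped.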

Looking at the results for our five clusters, every cluster offers a broad range of facilities and amenities. Cluster 1, which has the highest average prices, may appeal to investors who value proximity to hotels, Italian restaurants, gyms, and concert halls, perhaps including Italian investors. Cluster 2 also has high average prices, but its neighborhoods are characterized more by restaurants. Clusters 0, 3, and 4 may suit buyers who are more interested in living in areas full of restaurants and shops.
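
The price comparison between clusters can be made concrete by averaging AVG_PRICE per cluster. The sketch below does this in plain Python for the ten neighborhoods whose rows and cluster labels are listed in the merged table above, so the numbers cover only that subset and not all 33 neighborhoods. The variable names are illustrative. With the notebook's DataFrame, a `groupby` on 'Cluster Labels' would give the same kind of summary.

```python
from collections import defaultdict
from statistics import mean

# (neighborhood, AVG_PRICE, cluster label) for the ten rows shown in the merged table.
rows = [
    ("ALPHABET CITY", 3456643, 0),
    ("CHELSEA", 4944513, 4),
    ("CHINATOWN", 3398609, 0),
    ("CIVIC CENTER", 4908241, 4),
    ("CLINTON", 3220975, 0),
    ("EAST VILLAGE", 3277558, 0),
    ("FASHION", 17918463, 1),
    ("FINANCIAL", 2465916, 3),
    ("FLATIRON", 8055894, 2),
    ("GRAMERCY", 1977192, 3),
]

# Group the prices by cluster label.
prices_by_cluster = defaultdict(list)
for _, price, cluster in rows:
    prices_by_cluster[cluster].append(price)

# For each cluster: (number of neighborhoods, mean AVG_PRICE).
summary = {c: (len(p), mean(p)) for c, p in prices_by_cluster.items()}

# Cluster labels ordered from most to least expensive.
ranked = [c for c, _ in sorted(summary.items(), key=lambda kv: kv[1][1], reverse=True)]

for cluster in ranked:
    n, avg = summary[cluster]
    print(f"Cluster {cluster}: {n} neighborhoods, mean AVG_PRICE {avg:,.0f}")
```

On this subset the ordering from most to least expensive is cluster 1, 2, 4, 0, 3, which matches the reading above that clusters 1 and 2 are the high-price groups.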

## Conclusion