## IBM Applied Data Science Capstone  
### Week 5 Final Report
### Recommendation for the best location to build a new housing complex in Hudson County, New Jersey

- Build a dataframe of the municipalities in Hudson County, New Jersey, by scraping the Wikipedia page
- Get the geographical coordinates of the municipalities
- Obtain venue data for the municipalities from the Foursquare API
- Explore and cluster the municipalities
- Select the best cluster in which to build the new housing complex

### Download all the dependencies

In [156]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
!pip install geocoder
import geocoder # to get coordinates

import requests # library to handle requests
from bs4 import BeautifulSoup # library to parse HTML and XML documents

from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import, pandas 1.0+)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!pip install folium
import folium # map rendering library

print("Libraries imported.")

Libraries imported.


### Scraping data from Wikipedia page into pandas dataframe

In [80]:
# use a GET request to fetch the HTML of the Wikipedia page
data = requests.get('https://en.wikipedia.org/wiki/Hudson_County,_New_Jersey').text

### Using BeautifulSoup package for web scraping

In [81]:
# parse data from the html into a beautifulsoup object
soup = BeautifulSoup(data, 'html.parser')

In [82]:
# create a list to store municipality data
municipalityList = []

In [83]:
# find the municipalities table and collect the first cell of each row
table = soup.find('table', attrs={'class':'wikitable sortable'})
tableBody = table.find('tbody')
rows = tableBody.find_all('tr')
for row in rows:
    cells = row('td')
    #print(cells)
    if(len(cells) > 0):
        municipalityList.append(cells[0].text.rstrip('\n'))
municipalityList = list(set(municipalityList))
municipalityList.remove('Hudson County')
#print (municipalityList)
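
Because `list(set(...))` returns the names in arbitrary order, the row order of the resulting dataframe can differ between runs. The following is a minimal sketch of an order-preserving alternative, assuming the cell texts have already been extracted as strings. `cleanMunicipalityNames` and its `exclude` parameter are hypothetical names introduced here for illustration, not part of the original notebook.

```python
def cleanMunicipalityNames(cellTexts, exclude=("Hudson County",)):
    """Strip whitespace, drop excluded labels, and de-duplicate in first-seen order."""
    stripped = (text.strip() for text in cellTexts)
    kept = [name for name in stripped if name and name not in exclude]
    # dict keys keep insertion order (Python 3.7+), so repeats are removed stably
    return list(dict.fromkeys(kept))


# Sample cell texts, shaped like cells[0].text in the loop above
sample = ["Bayonne\n", "Hoboken\n", "Bayonne\n", "Hudson County\n", "Kearny\n"]
print(cleanMunicipalityNames(sample))  # ['Bayonne', 'Hoboken', 'Kearny']
```

With this helper the county total row is filtered out by name, so a missing `'Hudson County'` entry would not raise the `ValueError` that `list.remove` does.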

In [84]:
# create a new DataFrame from the list
municipalityDf = pd.DataFrame({"Municipalities": municipalityList})

municipalityDf.head()

Unnamed: 0,Municipalities
0,Bayonne
1,Kearny
2,West New York
3,Hoboken
4,Jersey City


In [85]:
# print the number of rows of the dataframe
municipalityDf.shape

(12, 1)

### Get the geographical coordinates

In [86]:
# define a function to get coordinates of the municipalities
def getLatLong(municipality):
    # initialize your variable to None
    latLongCoord = None
    # loop until you get the coordinates
    while(latLongCoord is None):
        g = geocoder.arcgis('{}, Hudson County, New Jersey'.format(municipality))
        latLongCoord = g.latlng
    return latLongCoord
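
The `while` loop above never exits if the geocoding service keeps returning `None` (for example, when the network is down). A possible variant with a bounded number of attempts is sketched below. The names `getLatLongWithRetries`, `lookup`, `maxAttempts`, and `delaySeconds` are assumptions made for this sketch. The lookup is passed in as a callable so the example runs offline.

```python
import time


def getLatLongWithRetries(municipality, lookup, maxAttempts=5, delaySeconds=1.0):
    """Return [lat, lng] from `lookup`, or None after maxAttempts failed tries.

    `lookup` is any callable that takes a query string and returns
    [lat, lng] or None.
    """
    query = '{}, Hudson County, New Jersey'.format(municipality)
    for attempt in range(maxAttempts):
        latLongCoord = lookup(query)
        if latLongCoord is not None:
            return latLongCoord
        if attempt < maxAttempts - 1:
            time.sleep(delaySeconds)  # brief pause before the next attempt
    return None


# Offline demonstration: a stand-in lookup that fails twice, then succeeds
responses = iter([None, None, [40.73718, -74.03096]])
print(getLatLongWithRetries('Hoboken', lambda q: next(responses), delaySeconds=0))
```

In the notebook, the lookup would be the call already used above, for example `lambda q: geocoder.arcgis(q).latlng`. Callers then need to handle a `None` result explicitly.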

In [89]:
# call the function to get the coordinates, store in a new list using list comprehension
coordinates = [getLatLong(municipality) for municipality in municipalityDf["Municipalities"].tolist() ]

In [90]:
coordinates

[[40.66873000000004, -74.11748999999998],
 [40.76463000000007, -74.14825999999994],
 [40.78833000000003, -74.01525999999996],
 [40.73718000000008, -74.03095999999994],
 [40.71748000000008, -74.04384999999996],
 [40.79301000000004, -74.02037999999999],
 [40.75207000000006, -74.15956999999997],
 [40.77388000000008, -74.02469999999994],
 [40.79164000000003, -74.00403999999997],
 [40.78830000000005, -74.05496999999997],
 [40.77502000000004, -74.02027999999996],
 [40.74633000000006, -74.15766999999994]]

In [91]:
# create temporary dataframe to populate the coordinates into Latitude and Longitude
coordinatesDf = pd.DataFrame(coordinates, columns=['Latitude', 'Longitude'])

In [92]:
# merge the coordinates into the original dataframe
municipalityDf['Latitude'] = coordinatesDf['Latitude']
municipalityDf['Longitude'] = coordinatesDf['Longitude']

In [93]:
print(municipalityDf.shape)
municipalityDf

(12, 3)


Unnamed: 0,Municipalities,Latitude,Longitude
0,Bayonne,40.66873,-74.11749
1,Kearny,40.76463,-74.14826
2,West New York,40.78833,-74.01526
3,Hoboken,40.73718,-74.03096
4,Jersey City,40.71748,-74.04385
5,North Bergen,40.79301,-74.02038
6,East Newark,40.75207,-74.15957
7,Union City,40.77388,-74.0247
8,Guttenberg,40.79164,-74.00404
9,Secaucus,40.7883,-74.05497


In [94]:
# save the DataFrame as CSV file
municipalityDf.to_csv("municipalityDf.csv", index=False)

### Create a map of Hudson County with municipalities superimposed on top

In [96]:
# get the coordinates of Hudson County, New Jersey
address = 'Hudson County, New Jersey'

geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Hudson County, New Jersey are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Hudson County, New Jersey are 40.7381635, -74.0550731.


In [98]:
# create map of Hudson County, New Jersey using latitude and longitude values
mapHudsonC = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, municipality in zip(municipalityDf['Latitude'], municipalityDf['Longitude'], municipalityDf['Municipalities']):
    label = '{}'.format(municipality)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(mapHudsonC)  
    
mapHudsonC

In [99]:
# save the map as HTML file
mapHudsonC.save('mapHudsonC.html')

### Use the Foursquare API to explore the municipalities

In [247]:
# define Foursquare Credentials and Version
CLIENT_ID = '' # your Foursquare ID
CLIENT_SECRET = '' # your Foursquare Secret
VERSION = '20200703' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: 
CLIENT_SECRET:


### Get the top 100 venues within a radius of 5000 meters of each municipality

In [140]:
radius = 5000
LIMIT = 100

venues = []

for lat, lng, municipality in zip(municipalityDf['Latitude'], municipalityDf['Longitude'], municipalityDf['Municipalities']):
    # create the API request URL
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(CLIENT_ID, CLIENT_SECRET, VERSION, lat, lng, radius, LIMIT)    
    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']
    # return only relevant information for each nearby venue
    for venue in results:
        venues.append((
            municipality,
            lat, 
            lng, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))
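
The loop above indexes straight into the response, so a failed request or a venue without a category would raise a `KeyError` or `IndexError` partway through. A more defensive parser could look like the sketch below. It assumes only the field names the loop already uses (`response`, `groups`, `items`, `venue`, `location`, `categories`). `extractVenues` is a hypothetical helper, and the second sample item is made up to show the skip path.

```python
def extractVenues(payload, municipality, lat, lng):
    """Flatten one explore-style response into venue tuples, skipping incomplete items."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories", [])
        if not categories or "lat" not in location or "lng" not in location:
            continue  # skip venues without a category or coordinates
        rows.append((municipality, lat, lng, venue.get("name"),
                     location["lat"], location["lng"], categories[0]["name"]))
    return rows


# Sample payload shaped like the structure the loop above indexes into.
# The first venue reuses a Bayonne row from this report; the second is invented.
samplePayload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Pizza Masters",
               "location": {"lat": 40.66505, "lng": -74.117176},
               "categories": [{"name": "Pizza Place"}]}},
    {"venue": {"name": "Uncategorized Venue",
               "location": {"lat": 40.0, "lng": -74.0},
               "categories": []}},
]}]}}

print(extractVenues(samplePayload, "Bayonne", 40.66873, -74.11749))
```

Inside the loop, the call would become `venues.extend(extractVenues(requests.get(url).json(), municipality, lat, lng))`.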

In [141]:
# convert the venues list into a new DataFrame
venuesDf = pd.DataFrame(venues)

# define the column names
venuesDf.columns = ['Municipality', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venuesDf.shape)
venuesDf.head()
#venuesDf

(1200, 7)


Unnamed: 0,Municipality,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,Bayonne,40.66873,-74.11749,Pizza Masters,40.66505,-74.117176,Pizza Place
1,Bayonne,40.66873,-74.11749,Judicke's Bakery,40.673136,-74.110514,Bakery
2,Bayonne,40.66873,-74.11749,Blimpie,40.665244,-74.116982,Sandwich Place
3,Bayonne,40.66873,-74.11749,Hendrickson's Corner,40.670128,-74.113082,American Restaurant
4,Bayonne,40.66873,-74.11749,San Vito Ristorante & Pizzeria,40.660951,-74.120696,Italian Restaurant


### Number of venues returned for each municipality

In [142]:
venuesDf.groupby(["Municipality"]).count()

Municipality,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
Bayonne,100,100,100,100,100,100
East Newark,100,100,100,100,100,100
Guttenberg,100,100,100,100,100,100
Harrison,100,100,100,100,100,100
Hoboken,100,100,100,100,100,100
Jersey City,100,100,100,100,100,100
Kearny,100,100,100,100,100,100
North Bergen,100,100,100,100,100,100
Secaucus,100,100,100,100,100,100
Union City,100,100,100,100,100,100



### Unique categories for all venues

In [143]:
print('There are {} unique categories.'.format(len(venuesDf['VenueCategory'].unique())))

There are 171 unique categories.


In [150]:
# print out the first 50 categories
venuesDf['VenueCategory'].unique()[:50]

array(['Pizza Place', 'Bakery', 'Sandwich Place', 'American Restaurant',
       'Italian Restaurant', 'Bagel Shop', 'Mexican Restaurant', 'Bar',
       'Spanish Restaurant', 'Greek Restaurant', 'Café', 'Spa',
       'Candy Store', 'Deli / Bodega', 'Park', 'Asian Restaurant',
       'Japanese Restaurant', 'Thai Restaurant', 'Fast Food Restaurant',
       'Ice Cream Shop', 'Health & Beauty Service', 'Taco Place',
       'Irish Pub', 'Golf Course', 'Breakfast Spot', 'Warehouse Store',
       'Burger Joint', 'Mediterranean Restaurant', 'Cultural Center',
       'Gym', 'Arts & Crafts Store', 'Garden', 'Diner', 'Pet Store',
       'Museum', 'Sporting Goods Shop', 'Botanical Garden',
       'Concert Hall', 'Clothing Store', 'Monument / Landmark',
       'Tapas Restaurant', 'Coffee Shop', 'BBQ Joint', 'Dessert Shop',
       'Toy / Game Store', 'Department Store', 'Event Space',
       'Salon / Barbershop', 'Playground', 'Zoo'], dtype=object)

### Analyze each municipality

In [232]:
# one hot encoding
hudsonOnehot = pd.get_dummies(venuesDf[['VenueCategory']], prefix="", prefix_sep="")

# add the municipality column back to the dataframe
hudsonOnehot['Municipality'] = venuesDf['Municipality']

# move the municipality column to the first position
fixed_columns = list(hudsonOnehot.columns[-1:]) + list(hudsonOnehot.columns[:-1])
hudsonOnehot = hudsonOnehot[fixed_columns]

print(hudsonOnehot.shape)
hudsonOnehot.head()

(1200, 172)


Unnamed: 0,Municipality,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Shop,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bridge,Building,Burger Joint,Business Service,Café,Candy Store,Cheese Shop,Chinese Restaurant,Circus,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,Colombian Restaurant,Comedy Club,Comic Shop,Community Center,Concert Hall,Convenience Store,Cosmetics Shop,Cuban Restaurant,Cultural Center,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dog Run,Donut Shop,Empanada Restaurant,Event Space,Exhibit,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish Market,Flower Shop,Food & Drink Shop,Food Court,Food Truck,Football Stadium,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gastropub,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Health & Beauty Service,History Museum,Hockey Arena,Hot Dog Joint,Hotel,Ice Cream Shop,Indie Movie Theater,Indie Theater,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Karaoke Bar,Kids Store,Kitchen Supply Store,Korean Restaurant,Latin American Restaurant,Laundry Service,Lingerie Store,Liquor Store,Lounge,Market,Mediterranean Restaurant,Memorial Site,Men's Store,Mexican Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Movie Theater,Multiplex,Museum,Music Venue,New American Restaurant,North Indian Restaurant,Opera House,Park,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Pier,Pizza Place,Planetarium,Playground,Plaza,Portuguese Restaurant,Pub,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shopping Mall,Snack Place,Soccer Stadium,South American Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,State / Provincial Park,Steakhouse,Supermarket,Sushi Restaurant,Taco Place,Tapas Restaurant,Thai Restaurant,Theater,Theme Restaurant,Toy / Game Store,Track,Trail,Udon Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Volleyball Court,Warehouse Store,Waterfront,Wine Bar,Wine Shop,Wings Joint,Yoga Studio,Zoo
0,Bayonne,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Bayonne,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Bayonne,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Bayonne,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Bayonne,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


### Next, let's group rows by municipality and take the mean of the frequency of occurrence of each category

In [233]:
hudsonGrouped = hudsonOnehot.groupby(["Municipality"]).mean().reset_index()

print(hudsonGrouped.shape)
hudsonGrouped

(12, 172)


Unnamed: 0,Municipality,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Shop,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bridge,Building,Burger Joint,Business Service,Café,Candy Store,Cheese Shop,Chinese Restaurant,Circus,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,Colombian Restaurant,Comedy Club,Comic Shop,Community Center,Concert Hall,Convenience Store,Cosmetics Shop,Cuban Restaurant,Cultural Center,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dog Run,Donut Shop,Empanada Restaurant,Event Space,Exhibit,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish Market,Flower Shop,Food & Drink Shop,Food Court,Food Truck,Football Stadium,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gastropub,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Health & Beauty Service,History Museum,Hockey Arena,Hot Dog Joint,Hotel,Ice Cream Shop,Indie Movie Theater,Indie Theater,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Karaoke Bar,Kids Store,Kitchen Supply Store,Korean Restaurant,Latin American Restaurant,Laundry Service,Lingerie Store,Liquor Store,Lounge,Market,Mediterranean Restaurant,Memorial Site,Men's Store,Mexican Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Movie Theater,Multiplex,Museum,Music Venue,New American Restaurant,North Indian Restaurant,Opera House,Park,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Pier,Pizza Place,Planetarium,Playground,Plaza,Portuguese Restaurant,Pub,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shopping Mall,Snack Place,Soccer Stadium,South American Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,State / Provincial Park,Steakhouse,Supermarket,Sushi Restaurant,Taco Place,Tapas Restaurant,Thai Restaurant,Theater,Theme Restaurant,Toy / Game Store,Track,Trail,Udon Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Volleyball Court,Warehouse Store,Waterfront,Wine Bar,Wine Shop,Wings Joint,Yoga Studio,Zoo
0,Bayonne,0.03,0.0,0.0,0.01,0.01,0.0,0.01,0.04,0.01,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.03,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.01,0.03,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.01,0.0,0.1,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.05,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.04,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01
1,East Newark,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.08,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.01,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.01,0.03,0.01,0.02,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.06,0.0,0.0,0.0,0.06,0.01,0.02,0.0,0.0,0.02,0.0,0.0,0.02,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.03,0.01,0.02,0.0,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0
2,Guttenberg,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.05,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.05,0.0,0.01,0.03,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.02,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.1,0.03,0.0,0.0,0.0,0.01,0.02,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.05,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.02,0.0
3,Harrison,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.04,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.01,0.03,0.01,0.02,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.06,0.0,0.0,0.0,0.07,0.01,0.01,0.0,0.0,0.02,0.0,0.0,0.03,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.03,0.01,0.01,0.0,0.01,0.01,0.0,0.0,0.03,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0
4,Hoboken,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.05,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.04,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.16,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.01,0.0,0.01,0.0,0.01,0.0,0.03,0.02,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.03,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.04,0.0
5,Jersey City,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.03,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.03,0.02,0.0,0.0,0.0,0.01,0.01,0.03,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.03,0.01,0.02,0.01,0.01,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.14,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.03,0.0,0.01,0.01,0.02,0.01,0.03,0.01,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.03,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.02,0.0,0.0,0.01,0.0,0.02,0.0,0.01,0.0,0.01,0.0
6,Kearny,0.0,0.0,0.0,0.0,0.0,0.01,0.05,0.0,0.05,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.02,0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.0,0.01,0.02,0.01,0.03,0.0,0.0,0.01,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.02,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.01,0.0,0.04,0.0,0.0,0.0,0.07,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0
7,North Bergen,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.06,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.01,0.0,0.01,0.04,0.0,0.01,0.03,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.01,0.0,0.04,0.03,0.0,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.02,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.07,0.02,0.0,0.0,0.0,0.02,0.01,0.0,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.03,0.0,0.0,0.01,0.0
8,Secaucus,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.04,0.01,0.01,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.02,0.01,0.03,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.03,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.05,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.07,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.06,0.0,0.01,0.01,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.0,0.02,0.04,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0
9,Union City,0.01,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.01,0.01,0.01,0.05,0.0,0.01,0.03,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.04,0.02,0.0,0.01,0.0,0.0,0.03,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.09,0.03,0.0,0.0,0.0,0.01,0.03,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.02,0.15,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0
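
Each value in the grouped table above is the share of a municipality's venues that fall into a category, because the mean of a 0/1 indicator column is the fraction of rows where it equals 1. The toy example below illustrates this with made-up data. The `toy*` names are introduced only for this sketch, and `dtype=float` is passed so the indicators are numeric regardless of the pandas version.

```python
import pandas as pd

# Toy data (made up for illustration): five venues across two municipalities
toyVenues = pd.DataFrame({
    "Municipality": ["A", "A", "A", "A", "B"],
    "VenueCategory": ["Park", "Park", "Bakery", "Gym", "Bakery"],
})

# One indicator column per category, as in the one-hot step above
toyOnehot = pd.get_dummies(toyVenues[["VenueCategory"]], prefix="", prefix_sep="", dtype=float)
toyOnehot["Municipality"] = toyVenues["Municipality"]

# The group mean of a 0/1 column is that category's share of the group's venues
toyGrouped = toyOnehot.groupby("Municipality").mean().reset_index()
print(toyGrouped)
```

Municipality A has two parks among four venues, so its `Park` value is 0.5. Municipality B has a single bakery, so its `Bakery` value is 1.0.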


### Let's write a function to sort the venue categories in descending order of frequency

In [234]:
def returnMostCommonVenues(row, numTopVenues):
    rowCategories = row.iloc[1:]
    rowCategoriesSorted = rowCategories.sort_values(ascending=False)
    
    return rowCategoriesSorted.index.values[0:numTopVenues]

### Top 10 venues for each Municipality

In [235]:
numTopVenues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Municipality']
for ind in np.arange(numTopVenues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
municipalityVenuesSorted = pd.DataFrame(columns=columns)
municipalityVenuesSorted['Municipality'] = hudsonGrouped['Municipality']

for ind in np.arange(hudsonGrouped.shape[0]):
    municipalityVenuesSorted.iloc[ind, 1:] = returnMostCommonVenues(hudsonGrouped.iloc[ind, :], numTopVenues)

municipalityVenuesSorted

Unnamed: 0,Municipality,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bayonne,Italian Restaurant,Bar,Pizza Place,Spanish Restaurant,Bagel Shop,Ice Cream Shop,American Restaurant,Park,Diner,Burger Joint
1,East Newark,Bakery,Portuguese Restaurant,Pizza Place,BBQ Joint,Brazilian Restaurant,Bar,Lounge,Tapas Restaurant,Burger Joint,Hot Dog Joint
2,Guttenberg,Park,Theater,Bakery,Concert Hall,Plaza,Cuban Restaurant,Performing Arts Venue,Gym,Yoga Studio,Garden
3,Harrison,Portuguese Restaurant,Pizza Place,BBQ Joint,Brazilian Restaurant,Bakery,Bar,Park,Lounge,Tapas Restaurant,Donut Shop
4,Hoboken,Park,Bakery,Yoga Studio,Ice Cream Shop,Pizza Place,Sandwich Place,Seafood Restaurant,Sushi Restaurant,Cheese Shop,Taco Place
5,Jersey City,Park,Sushi Restaurant,Gym,Sandwich Place,Pizza Place,Memorial Site,Bakery,Ice Cream Shop,Plaza,Gym / Fitness Center
6,Kearny,Portuguese Restaurant,Italian Restaurant,BBQ Joint,Bakery,Pizza Place,Bar,Grocery Store,Park,Brazilian Restaurant,Lounge
7,North Bergen,Park,Bakery,Concert Hall,Gym,Gym / Fitness Center,Theater,Wine Bar,Mediterranean Restaurant,Cuban Restaurant,Spa
8,Secaucus,Italian Restaurant,Park,Grocery Store,Cuban Restaurant,Scenic Lookout,Bakery,Mexican Restaurant,Restaurant,Deli / Bodega,Café
9,Union City,Theater,Park,Concert Hall,Art Gallery,Bakery,Gym,Hotel,Performing Arts Venue,Cuban Restaurant,Pizza Place
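
The column labels above come from a three-item suffix list with a fallback to `th`. That is correct for the ten columns used here, but it would label a 21st column as "21th". If `numTopVenues` were ever raised that far, a small helper such as the hypothetical `ordinal` below could build the labels instead. It is a sketch, not part of the original notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


# Build the same column labels as the cell above
columns = ['Municipality'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)]
print(columns[1])   # 1st Most Common Venue
print(columns[10])  # 10th Most Common Venue
```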


### Run *k*-means to cluster the municipalities into 5 clusters

In [236]:
# set number of clusters
kClusters = 5

hudsonGroupedClustering = hudsonGrouped.drop(columns='Municipality')
# run k-means clustering
kmeans = KMeans(n_clusters=kClusters, random_state=0).fit(hudsonGroupedClustering)
# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([4, 2, 3, 2, 0, 0, 2, 3, 4, 1], dtype=int32)
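
To make the clustering step less of a black box, the sketch below shows the basic iteration behind k-means (Lloyd's algorithm) in plain NumPy. Each row is assigned to its nearest centroid, then each centroid moves to the mean of its rows. This is an illustration under simplifying assumptions (random initial centroids, a fixed iteration cap). It is not scikit-learn's implementation, which uses k-means++ initialization by default. `lloydKMeans` and the toy matrix are hypothetical.

```python
import numpy as np


def lloydKMeans(X, k, nIter=100, seed=0):
    """Plain Lloyd's iteration: assign rows to the nearest centroid, then recompute centroids."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # start from k distinct rows chosen at random
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(nIter):
        # squared Euclidean distance from every row to every centroid
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # move each centroid to the mean of its rows (keep it if the cluster is empty)
        newCentroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(newCentroids, centroids):
            break
        centroids = newCentroids
    return labels, centroids


# Toy frequency rows (made up): two municipalities dominated by the first
# category, two dominated by the second
toy = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
labels, centroids = lloydKMeans(toy, k=2)
print(labels)
```

The label numbers themselves are arbitrary. Only which rows share a label matters, which is also why the cluster numbers in this report carry no ranking.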

### Let's create a new dataframe that includes the cluster as well as the top 10 venues for each municipality

In [237]:
# add clustering labels
municipalityVenuesSorted.insert(1, 'Cluster Labels', kmeans.labels_)
municipalityMerged = municipalityDf

In [238]:
municipalityMerged

Unnamed: 0,Municipalities,Latitude,Longitude
0,Bayonne,40.66873,-74.11749
1,Kearny,40.76463,-74.14826
2,West New York,40.78833,-74.01526
3,Hoboken,40.73718,-74.03096
4,Jersey City,40.71748,-74.04385
5,North Bergen,40.79301,-74.02038
6,East Newark,40.75207,-74.15957
7,Union City,40.77388,-74.0247
8,Guttenberg,40.79164,-74.00404
9,Secaucus,40.7883,-74.05497


In [239]:
municipalityVenuesSorted

Unnamed: 0,Municipality,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bayonne,4,Italian Restaurant,Bar,Pizza Place,Spanish Restaurant,Bagel Shop,Ice Cream Shop,American Restaurant,Park,Diner,Burger Joint
1,East Newark,2,Bakery,Portuguese Restaurant,Pizza Place,BBQ Joint,Brazilian Restaurant,Bar,Lounge,Tapas Restaurant,Burger Joint,Hot Dog Joint
2,Guttenberg,3,Park,Theater,Bakery,Concert Hall,Plaza,Cuban Restaurant,Performing Arts Venue,Gym,Yoga Studio,Garden
3,Harrison,2,Portuguese Restaurant,Pizza Place,BBQ Joint,Brazilian Restaurant,Bakery,Bar,Park,Lounge,Tapas Restaurant,Donut Shop
4,Hoboken,0,Park,Bakery,Yoga Studio,Ice Cream Shop,Pizza Place,Sandwich Place,Seafood Restaurant,Sushi Restaurant,Cheese Shop,Taco Place
5,Jersey City,0,Park,Sushi Restaurant,Gym,Sandwich Place,Pizza Place,Memorial Site,Bakery,Ice Cream Shop,Plaza,Gym / Fitness Center
6,Kearny,2,Portuguese Restaurant,Italian Restaurant,BBQ Joint,Bakery,Pizza Place,Bar,Grocery Store,Park,Brazilian Restaurant,Lounge
7,North Bergen,3,Park,Bakery,Concert Hall,Gym,Gym / Fitness Center,Theater,Wine Bar,Mediterranean Restaurant,Cuban Restaurant,Spa
8,Secaucus,4,Italian Restaurant,Park,Grocery Store,Cuban Restaurant,Scenic Lookout,Bakery,Mexican Restaurant,Restaurant,Deli / Bodega,Café
9,Union City,1,Theater,Park,Concert Hall,Art Gallery,Bakery,Gym,Hotel,Performing Arts Venue,Cuban Restaurant,Pizza Place


In [240]:
# join municipalityVenuesSorted onto municipalityDf so each municipality has its coordinates, cluster label and top venues
municipalityMerged = municipalityMerged.join(municipalityVenuesSorted.set_index('Municipality'), on='Municipalities')

municipalityMerged.head()

Unnamed: 0,Municipalities,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bayonne,40.66873,-74.11749,4,Italian Restaurant,Bar,Pizza Place,Spanish Restaurant,Bagel Shop,Ice Cream Shop,American Restaurant,Park,Diner,Burger Joint
1,Kearny,40.76463,-74.14826,2,Portuguese Restaurant,Italian Restaurant,BBQ Joint,Bakery,Pizza Place,Bar,Grocery Store,Park,Brazilian Restaurant,Lounge
2,West New York,40.78833,-74.01526,3,Park,Theater,Bakery,Mediterranean Restaurant,Concert Hall,Gym,Performing Arts Venue,Jazz Club,Gym / Fitness Center,Cuban Restaurant
3,Hoboken,40.73718,-74.03096,0,Park,Bakery,Yoga Studio,Ice Cream Shop,Pizza Place,Sandwich Place,Seafood Restaurant,Sushi Restaurant,Cheese Shop,Taco Place
4,Jersey City,40.71748,-74.04385,0,Park,Sushi Restaurant,Gym,Sandwich Place,Pizza Place,Memorial Site,Bakery,Ice Cream Shop,Plaza,Gym / Fitness Center
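
`DataFrame.join` with `on=` performs a left join here: every row of the coordinates dataframe is kept, and a municipality with no match in the venue summary would get `NaN` in the joined columns. The toy example below (made-up values, hypothetical frame names) shows how to spot such rows before they reach the map, where a `NaN` cluster label cannot be used as a list index.

```python
import pandas as pd

# Toy frames (values made up): 'C' has coordinates but no venue summary
coordsDf = pd.DataFrame({"Municipalities": ["A", "B", "C"],
                         "Latitude": [40.1, 40.2, 40.3]})
summaryDf = pd.DataFrame({"Municipality": ["A", "B"],
                          "Cluster Labels": [0, 1]})

# Same pattern as the cell above: the key column has a different name on each side
merged = coordsDf.join(summaryDf.set_index("Municipality"), on="Municipalities")
print(merged)

# Unmatched rows carry NaN, which also turns the integer labels into floats
unmatched = merged[merged["Cluster Labels"].isna()]
print(unmatched["Municipalities"].tolist())  # ['C']
```

If any rows show up as unmatched, they can be dropped with `dropna(subset=['Cluster Labels'])` and the labels cast back with `astype(int)` before plotting.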


### Finally, let's visualize the resulting clusters

In [241]:
# create map
mapClusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kClusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add one marker per municipality, colored by its cluster label
# (cluster-1 is used as the index, so label 0 wraps around to the last color)
for lat, lon, poi, cluster in zip(municipalityMerged['Latitude'], municipalityMerged['Longitude'], municipalityMerged['Municipalities'], municipalityMerged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(mapClusters)

mapClusters
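
The color scheme in the cell above can be checked separately from the map. The sketch below builds one hex color per cluster label; `k_clusters = 5` is an assumption based on the labels 0 to 4 that appear in the tables, and the direct label-to-color dictionary is an illustrative alternative to the `cluster-1` indexing used in the cell.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

k_clusters = 5  # assumption: labels 0..4, as seen in the cluster tables

# Sample k evenly spaced points from the rainbow colormap (RGBA rows)
rgba = cm.rainbow(np.linspace(0, 1, k_clusters))

# Convert each RGBA row to a '#rrggbb' hex string that folium accepts
palette = [colors.rgb2hex(c) for c in rgba]

# Map each cluster label directly to its own color
color_for = {label: palette[label] for label in range(k_clusters)}
print(color_for)
```

Either indexing scheme gives every cluster a distinct color; the direct mapping only makes it easier to state which color belongs to which label in a legend.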

### Examine Clusters
Now examine each cluster to identify the venue categories that distinguish it from the others.

### Cluster 0

In [242]:
municipalityMerged.loc[municipalityMerged['Cluster Labels'] == 0]

Unnamed: 0,Municipalities,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Hoboken,40.73718,-74.03096,0,Park,Bakery,Yoga Studio,Ice Cream Shop,Pizza Place,Sandwich Place,Seafood Restaurant,Sushi Restaurant,Cheese Shop,Taco Place
4,Jersey City,40.71748,-74.04385,0,Park,Sushi Restaurant,Gym,Sandwich Place,Pizza Place,Memorial Site,Bakery,Ice Cream Shop,Plaza,Gym / Fitness Center


### Cluster 1

In [243]:
municipalityMerged.loc[municipalityMerged['Cluster Labels'] == 1]

Unnamed: 0,Municipalities,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,Union City,40.77388,-74.0247,1,Theater,Park,Concert Hall,Art Gallery,Bakery,Gym,Hotel,Performing Arts Venue,Cuban Restaurant,Pizza Place
10,Weehawken,40.77502,-74.02028,1,Theater,Park,Concert Hall,Bakery,Gym,Art Gallery,Hotel,Cuban Restaurant,Performing Arts Venue,Gym / Fitness Center


### Cluster 2

In [244]:
municipalityMerged.loc[municipalityMerged['Cluster Labels'] == 2]

Unnamed: 0,Municipalities,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Kearny,40.76463,-74.14826,2,Portuguese Restaurant,Italian Restaurant,BBQ Joint,Bakery,Pizza Place,Bar,Grocery Store,Park,Brazilian Restaurant,Lounge
6,East Newark,40.75207,-74.15957,2,Bakery,Portuguese Restaurant,Pizza Place,BBQ Joint,Brazilian Restaurant,Bar,Lounge,Tapas Restaurant,Burger Joint,Hot Dog Joint
11,Harrison,40.74633,-74.15767,2,Portuguese Restaurant,Pizza Place,BBQ Joint,Brazilian Restaurant,Bakery,Bar,Park,Lounge,Tapas Restaurant,Donut Shop


### Cluster 3

In [245]:
municipalityMerged.loc[municipalityMerged['Cluster Labels'] == 3]

Unnamed: 0,Municipalities,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,West New York,40.78833,-74.01526,3,Park,Theater,Bakery,Mediterranean Restaurant,Concert Hall,Gym,Performing Arts Venue,Jazz Club,Gym / Fitness Center,Cuban Restaurant
5,North Bergen,40.79301,-74.02038,3,Park,Bakery,Concert Hall,Gym,Gym / Fitness Center,Theater,Wine Bar,Mediterranean Restaurant,Cuban Restaurant,Spa
8,Guttenberg,40.79164,-74.00404,3,Park,Theater,Bakery,Concert Hall,Plaza,Cuban Restaurant,Performing Arts Venue,Gym,Yoga Studio,Garden


### Cluster 4

In [246]:
municipalityMerged.loc[municipalityMerged['Cluster Labels'] == 4]

Unnamed: 0,Municipalities,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bayonne,40.66873,-74.11749,4,Italian Restaurant,Bar,Pizza Place,Spanish Restaurant,Bagel Shop,Ice Cream Shop,American Restaurant,Park,Diner,Burger Joint
9,Secaucus,40.7883,-74.05497,4,Italian Restaurant,Park,Grocery Store,Cuban Restaurant,Scenic Lookout,Bakery,Mexican Restaurant,Restaurant,Deli / Bodega,Café
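
Reading the tables above by eye can be complemented with a count of how often each venue category appears among a cluster's most common venues. The following is a minimal sketch, assuming a dataframe shaped like `municipalityMerged`; the rows are copied from the Cluster 0 and Cluster 1 tables and truncated to three venue columns, and the names `df`, `long_df` and `counts` are illustrative.

```python
import pandas as pd

# Rows copied from the Cluster 0 and Cluster 1 tables, first three venue columns only
df = pd.DataFrame({
    "Municipalities": ["Hoboken", "Jersey City", "Union City", "Weehawken"],
    "Cluster Labels": [0, 0, 1, 1],
    "1st Most Common Venue": ["Park", "Park", "Theater", "Theater"],
    "2nd Most Common Venue": ["Bakery", "Sushi Restaurant", "Park", "Park"],
    "3rd Most Common Venue": ["Yoga Studio", "Gym", "Concert Hall", "Concert Hall"],
})

venue_cols = [c for c in df.columns if c.endswith("Most Common Venue")]

# Reshape to one row per (municipality, rank) pair
long_df = df.melt(
    id_vars=["Municipalities", "Cluster Labels"],
    value_vars=venue_cols,
    value_name="Category",
)

# Count how often each category occurs within each cluster
counts = (
    long_df.groupby(["Cluster Labels", "Category"])
    .size()
    .rename("Count")
    .reset_index()
    .sort_values(["Cluster Labels", "Count"], ascending=[True, False])
)
print(counts)
```

Categories with a count equal to the number of municipalities in a cluster are shared by all of its members, which makes them reasonable candidates for describing that cluster.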


### Observations from the project:
- Cluster 0 municipalities (Hoboken, Jersey City) have mostly parks, gyms and some specialty eateries.
- Cluster 1 municipalities (Union City, Weehawken) have mostly theaters, concert halls, art galleries and parks, with a few restaurants.
- Cluster 2 municipalities (Kearny, East Newark, Harrison) have mostly restaurants.
- Cluster 3 municipalities (West New York, North Bergen, Guttenberg) have mostly parks, theaters and gyms, with some restaurants.
- Cluster 4 municipalities (Bayonne, Secaucus) have mostly restaurants and parks, plus a grocery store and a scenic lookout.

Based on the clustering results, the Cluster 4 municipalities, Bayonne and Secaucus, appear to be the most suitable for a new housing complex. Their most common venues include restaurants, parks and a grocery store, which should be attractive features for prospective home buyers.