# IBM Data Science - Capstone Project

# Topic: Clustering Similar Areas in New York City and London

## Introduction
First, I process latitude and longitude values collected from the web. I then use the Foursquare API to explore areas in New York City and London. Finally, I apply the k-means clustering algorithm to areas of both cities at the same time in order to find clusters of similar areas across the two cities.

Import the libraries that need to be used:

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # installs geopy; comment this line out if it is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # installs folium; comment this line out if it is already installed
import folium # map rendering library

print('Libraries imported.')


# New York Data

In [2]:
# Download the New York data
!wget -q -O 'newyork_data.json' https://ibm.box.com/shared/static/fbpwbovar7lf8p5sgddm06cgipa2rxpe.json
print('Data downloaded!')

Data downloaded!


In [3]:
# open the downloaded file
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)

In [4]:
neighborhoods_data = newyork_data['features']

In [5]:
# define the dataframe columns
column_names = ['Borough', 'Neighbourhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
newyork = pd.DataFrame(columns=column_names)

In [6]:
# Extract the location information from each feature in the file
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighbourhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

# build the dataframe in one step (DataFrame.append was removed in pandas 2.0)
newyork = pd.DataFrame(rows, columns=column_names)

In [7]:
newyork.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585
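
To make the extraction step easier to follow, here is a minimal sketch on two hand-written features. The `sample_features` list and the `features_to_rows` helper are illustrative assumptions, not part of the downloaded file; only the nesting (`properties`, `geometry`, `coordinates`) and the values mirror the cells above.

```python
# Hypothetical, hand-written features that mimic the structure read by the loop above.
sample_features = [
    {
        "properties": {"borough": "Bronx", "name": "Wakefield"},
        "geometry": {"coordinates": [-73.847201, 40.894705]},  # [longitude, latitude]
    },
    {
        "properties": {"borough": "Bronx", "name": "Co-op City"},
        "geometry": {"coordinates": [-73.829939, 40.874294]},
    },
]


def features_to_rows(features):
    """Flatten GeoJSON-like features into a list of row dictionaries."""
    rows = []
    for feature in features:
        # coordinates are stored longitude first, so unpack in that order
        lon, lat = feature["geometry"]["coordinates"][:2]
        rows.append({
            "Borough": feature["properties"]["borough"],
            "Neighbourhood": feature["properties"]["name"],
            "Latitude": lat,
            "Longitude": lon,
        })
    return rows


rows = features_to_rows(sample_features)
for row in rows:
    print(row)
```

The point to watch is the coordinate order: the longitude comes first in the file, which is why the loop reads index 1 for the latitude.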


Get the geographical coordinates of New York City for the map.

In [8]:
address = 'New York City, NY'

# Nominatim expects an identifying user_agent; 'nyc_london_explorer' is a placeholder name
geolocator = Nominatim(user_agent='nyc_london_explorer')
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7308619, -73.9871558.


In [9]:
# create map of New York using latitude and longitude values
map_newyork = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighbourhood in zip(newyork['Latitude'], newyork['Longitude'], newyork['Borough'], newyork['Neighbourhood']):
    label = '{}, {}'.format(neighbourhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_newyork)  
    
map_newyork

The blue spots show the Neighbourhoods of New York City.

# London
___
### Importing London Data

Because I could not find a single file on the web that contains both the neighbourhood names and their latitude and longitude, I need to process and merge separate data sources to build a table with the same layout as the New York City table.

First, I get the postcode and name of all neighbourhoods in London.

In [10]:
import sys
import types
import pandas as pd
from ibm_botocore.client import Config
import ibm_boto3

def __iter__(self): return 0

# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share your notebook.
client_e75eea09c9df43cb8ab74c72adb84188 = ibm_boto3.client(service_name='s3',
    ibm_api_key_id='CcKaA3FljXQXCs3IhGuEa3GmvhJj4_oVQZiSg7-3G18y',
    ibm_auth_endpoint="https://iam.bluemix.net/oidc/token",
    config=Config(signature_version='oauth'),
    endpoint_url='https://s3-api.us-geo.objectstorage.service.networklayer.com')

body = client_e75eea09c9df43cb8ab74c72adb84188.get_object(Bucket='machinelearningassignment-donotdelete-pr-zaotljsxyi3zum',Key='postcode.csv')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )

df_data_1 = pd.read_csv(body)
df_data_1.head()

Unnamed: 0.1,Unnamed: 0,Neighbourhood,postcode
0,0,"Aldgate, Bethnal Green, City of London, Mile E...",E1
1,1,Leyton,E10
2,2,"Leyton, Leytonstone, Wanstead",E11
3,3,"East Ham, Manor Park, Wanstead",E12
4,4,"Plaistow, West Ham",E13


Second, I have a file containing the UK postcodes with their geographical coordinates.

In [11]:

body = client_e75eea09c9df43cb8ab74c72adb84188.get_object(Bucket='machinelearningassignment-donotdelete-pr-zaotljsxyi3zum',Key='london_postcode-outcodes.csv')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )

df_data_2 = pd.read_csv(body)
df_data_2.head()

Unnamed: 0,id,postcode,latitude,longitude
0,2,AB10,57.13514,-2.11731
1,3,AB11,57.13875,-2.09089
2,4,AB12,57.101,-2.1106
3,5,AB13,57.10801,-2.23776
4,6,AB14,57.10076,-2.27073


Now, I merge the two tables so that I have one table with both the geographical coordinates and the postcode.

In [12]:
london = pd.merge(df_data_1, df_data_2, on='postcode')

In [13]:
london = london[['postcode', 'Neighbourhood', 'latitude', 'longitude']]

In [14]:
london.head()

Unnamed: 0,postcode,Neighbourhood,latitude,longitude
0,E1,"Aldgate, Bethnal Green, City of London, Mile E...",51.51766,-0.05841
1,E10,Leyton,51.56814,-0.01153
2,E11,"Leyton, Leytonstone, Wanstead",51.56769,0.01443
3,E12,"East Ham, Manor Park, Wanstead",51.54992,0.05404
4,E13,"Plaistow, West Ham",51.527,0.02705
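
The merge above uses the default inner join, so postcodes that appear in only one table (for example the `AB` postcodes in the coordinates file) drop out silently. The following is a small sketch of that behaviour on toy data. The `ZZ99` row is made up for illustration, and the `how="left"` audit with `indicator=True` is an optional check I am assuming here, not something the notebook runs.

```python
import pandas as pd

# Toy stand-ins for the two tables; values are copied from the outputs above,
# except "ZZ99", a made-up postcode that has no coordinates.
names = pd.DataFrame({
    "postcode": ["E10", "E13", "ZZ99"],
    "Neighbourhood": ["Leyton", "Plaistow, West Ham", "Nowhere"],
})
coords = pd.DataFrame({
    "postcode": ["E10", "E13", "AB10"],
    "latitude": [51.56814, 51.527, 57.13514],
    "longitude": [-0.01153, 0.02705, -2.11731],
})

# Default inner join: only postcodes present in both tables survive.
inner = pd.merge(names, coords, on="postcode")
print(inner)

# Optional audit: a left join with indicator=True shows which rows found no match.
audit = pd.merge(names, coords, on="postcode", how="left", indicator=True)
unmatched = audit.loc[audit["_merge"] == "left_only", "postcode"].tolist()
print("Postcodes without coordinates:", unmatched)
```

Running an audit like this before the inner join is a cheap way to confirm that no London neighbourhood is lost because of a missing or misspelled postcode.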


Get the geographical coordinates of London for the map.

In [15]:
address = 'London, UK'

# Nominatim expects an identifying user_agent; 'nyc_london_explorer' is a placeholder name
geolocator = Nominatim(user_agent='nyc_london_explorer')
location = geolocator.geocode(address)
latitude1 = location.latitude
longitude1 = location.longitude
print('The geographical coordinates of London are {}, {}.'.format(latitude1, longitude1))

The geographical coordinates of London are 51.5073219, -0.1276474.


In [16]:
# create map of London using latitude and longitude values
map_london = folium.Map(location=[latitude1, longitude1], zoom_start=10)

# add markers to map
for lat, lng, postcode, neighbourhood in zip(london['latitude'], london['longitude'], london['postcode'], london['Neighbourhood']):
    label = '{}, {}'.format(neighbourhood, postcode)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_london)  
    
map_london

The blue spots show all the neighbourhoods under investigation.

___
## Combining the New York Neighbourhood Locations with the London Neighbourhood Locations

Change the name of the Borough column in the newyork table to borough_postcode.

In [17]:
newyork_concat = newyork # note: an alias, not a copy, so the rename below also changes newyork
newyork_concat.columns = ['borough_postcode', 'Neighbourhood', 'Latitude', 'Longitude']
newyork_concat.head()

Unnamed: 0,borough_postcode,Neighbourhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


Change the name of the postcode column in the london table to borough_postcode, so that the column names match those of the newyork table.

In [18]:
london_concat = london # note: an alias, not a copy, so the rename below also changes london
london_concat.columns = ['borough_postcode', 'Neighbourhood', 'Latitude', 'Longitude']
london_concat.head()

Unnamed: 0,borough_postcode,Neighbourhood,Latitude,Longitude
0,E1,"Aldgate, Bethnal Green, City of London, Mile E...",51.51766,-0.05841
1,E10,Leyton,51.56814,-0.01153
2,E11,"Leyton, Leytonstone, Wanstead",51.56769,0.01443
3,E12,"East Ham, Manor Park, Wanstead",51.54992,0.05404
4,E13,"Plaistow, West Ham",51.527,0.02705


Concatenate the two tables into one table.

In [19]:
nyc_london = pd.concat([newyork_concat, london_concat])
nyc_london.index = range(len(nyc_london))
nyc_london.head()

Unnamed: 0,borough_postcode,Neighbourhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


In [20]:
nyc_london.shape

(425, 4)
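
As a side note, the concatenation and re-indexing can be written in one call. This is a minimal sketch on two one-row tables whose values come from the outputs above; `ignore_index=True` is an alternative I am suggesting, not what the cell above uses.

```python
import pandas as pd

columns = ["borough_postcode", "Neighbourhood", "Latitude", "Longitude"]

# One-row stand-ins for the two city tables, using values shown above.
ny = pd.DataFrame([["Bronx", "Wakefield", 40.894705, -73.847201]], columns=columns)
ldn = pd.DataFrame([["E10", "Leyton", 51.56814, -0.01153]], columns=columns)

# ignore_index=True renumbers the rows 0..n-1, which has the same effect as
# assigning range(len(df)) to the index after a plain concat.
combined = pd.concat([ny, ldn], ignore_index=True)
print(combined)
```

Without the renumbering, both input tables would keep their own index starting at 0, and index labels would repeat in the combined table.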

Check whether any neighbourhood names are duplicated.

In [21]:
nyc_london[nyc_london.Neighbourhood.duplicated(keep=False)]

Unnamed: 0,borough_postcode,Neighbourhood,Latitude,Longitude
57,Brooklyn,Kensington,40.642382,-73.980421
115,Manhattan,Murray Hill,40.748303,-73.978332
116,Manhattan,Chelsea,40.744035,-74.003116
140,Queens,Sunnyside,40.740176,-73.926916
175,Queens,Bay Terrace,40.782843,-73.776802
180,Queens,Murray Hill,40.764126,-73.812763
220,Staten Island,Sunnyside,40.61276,-74.097126
235,Staten Island,Bay Terrace,40.553988,-74.139166
244,Staten Island,Chelsea,40.594726,-74.18956
421,W8,Kensington,51.50003,-0.19317


Some of the neighbourhood names are the same, so I make them unique by prepending the borough_postcode value to each neighbourhood name.

In [22]:
nyc_london['Neighbourhood'] = nyc_london['borough_postcode'] + nyc_london['Neighbourhood']

In [23]:
nyc_london.head()

Unnamed: 0,borough_postcode,Neighbourhood,Latitude,Longitude
0,Bronx,BronxWakefield,40.894705,-73.847201
1,Bronx,BronxCo-op City,40.874294,-73.829939
2,Bronx,BronxEastchester,40.887556,-73.827806
3,Bronx,BronxFieldston,40.895437,-73.905643
4,Bronx,BronxRiverdale,40.890834,-73.912585
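
The duplicate check and the renaming can be tried on a three-row table. The rows are taken from the outputs above; the `label` column with a `", "` separator is a hypothetical variant that keeps the prefix readable, whereas the notebook itself concatenates without a separator.

```python
import pandas as pd

areas = pd.DataFrame({
    "borough_postcode": ["Brooklyn", "W8", "Bronx"],
    "Neighbourhood": ["Kensington", "Kensington", "Wakefield"],
})

# keep=False flags every member of a duplicated group, not only the repeats.
dupes = areas[areas["Neighbourhood"].duplicated(keep=False)]
print(dupes)

# Hypothetical variant: prepend the prefix with a separator so labels stay readable.
areas["label"] = areas["borough_postcode"] + ", " + areas["Neighbourhood"]
print(areas["label"].tolist())
```

Either form removes the ambiguity between the two Kensingtons; the separator only affects how the labels read in later outputs and map popups.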


___

### Next, I am going to use the Foursquare API to explore the areas.

## Define Foursquare Credentials and Version

In [24]:
CLIENT_ID = 'ZAH5MFJZ5BCMQKRKMUNXMEOKZJZ0YZCPJEKHBJ3D1ME5N5RU' # your Foursquare ID
CLIENT_SECRET = '2CFCV0CLCXXS4Q1H4GFDDHNLXUTLY1CQ2DQKRM2PIFT1DX5O' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: ZAH5MFJZ5BCMQKRKMUNXMEOKZJZ0YZCPJEKHBJ3D1ME5N5RU
CLIENT_SECRET:2CFCV0CLCXXS4Q1H4GFDDHNLXUTLY1CQ2DQKRM2PIFT1DX5O
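
Since a shared notebook exposes whatever it prints, one option is to read the credentials from environment variables and print only a masked form. This is a sketch under assumptions: the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, the helpers `load_credentials` and `mask`, and the demo values are all hypothetical.

```python
import os


def load_credentials(env=os.environ):
    """Return (client_id, client_secret) or raise if either one is missing."""
    # FOURSQUARE_CLIENT_ID / FOURSQUARE_CLIENT_SECRET are hypothetical variable names
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first")
    return client_id, client_secret


def mask(value, visible=4):
    """Show only the first few characters so printed output does not leak the secret."""
    return value[:visible] + "*" * (len(value) - visible)


# Demo with made-up values; in a real session the default os.environ would be used.
demo_env = {
    "FOURSQUARE_CLIENT_ID": "demo-id-1234",
    "FOURSQUARE_CLIENT_SECRET": "demo-secret-5678",
}
client_id, client_secret = load_credentials(demo_env)
print("CLIENT_ID:", mask(client_id))
print("CLIENT_SECRET:", mask(client_secret))
```

With this pattern the notebook can be shared without editing the credentials cell first.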


## Explore Neighborhoods

#### Now, create a function that repeatedly finds the top 100 venues for every neighborhood in both New York and London.

In [25]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    """Return a dataframe with one row per venue found near each neighborhood.

    Relies on the globals CLIENT_ID, CLIENT_SECRET, VERSION and LIMIT,
    which must be defined before the function is called.
    """
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)  # progress indicator

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)

        # make the GET request and keep only the list of returned items
        results = requests.get(url).json()["response"]['groups'][0]['items']

        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # flatten the per-neighborhood lists into a single dataframe
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                  'Neighborhood Latitude',
                  'Neighborhood Longitude',
                  'Venue',
                  'Venue Latitude',
                  'Venue Longitude',
                  'Venue Category']

    return nearby_venues
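
The parsing part of the function can be exercised without calling the API. Below is a minimal sketch that reads a hand-written payload with the same nesting the function relies on (`response`, `groups`, `items`, `venue`). The `sample_payload` contents and the `extract_venues` helper are assumptions for illustration, and the empty `categories` list is a case I am guarding against, not one I have confirmed the API returns.

```python
# Hypothetical payload shaped like the fields the function above reads.
sample_payload = {
    "response": {"groups": [{"items": [
        {"venue": {"name": "Rite Aid",
                   "location": {"lat": 40.896521, "lng": -73.84468},
                   "categories": [{"name": "Pharmacy"}]}},
        {"venue": {"name": "Unnamed Spot",
                   "location": {"lat": 40.0, "lng": -73.0},
                   "categories": []}},
    ]}]}
}


def extract_venues(payload, name, lat, lng):
    """Turn one response payload into a list of venue tuples."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # fall back to None instead of raising IndexError on an empty list
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng,
                     venue["name"],
                     venue["location"]["lat"],
                     venue["location"]["lng"],
                     category))
    return rows


rows = extract_venues(sample_payload, "BronxWakefield", 40.894705, -73.847201)
for row in rows:
    print(row)
```

Separating the parsing from the request in this way also makes it possible to tolerate an empty or malformed response for one neighborhood without stopping the whole loop.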

#### Now write the code to run the above function on each neighborhood and create a new dataframe called nl_venues.


In [27]:
LIMIT = 100
nl_venues = getNearbyVenues(names=nyc_london['Neighbourhood'],
                                   latitudes=nyc_london['Latitude'],
                                   longitudes=nyc_london['Longitude']
                                  )

BronxWakefield
BronxCo-op City
BronxEastchester
BronxFieldston
BronxRiverdale
BronxKingsbridge
ManhattanMarble Hill
BronxWoodlawn
BronxNorwood
BronxWilliamsbridge
BronxBaychester
BronxPelham Parkway
BronxCity Island
BronxBedford Park
BronxUniversity Heights
BronxMorris Heights
BronxFordham
BronxEast Tremont
BronxWest Farms
BronxHigh  Bridge
BronxMelrose
BronxMott Haven
BronxPort Morris
BronxLongwood
BronxHunts Point
BronxMorrisania
BronxSoundview
BronxClason Point
BronxThrogs Neck
BronxCountry Club
BronxParkchester
BronxWestchester Square
BronxVan Nest
BronxMorris Park
BronxBelmont
BronxSpuyten Duyvil
BronxNorth Riverdale
BronxPelham Bay
BronxSchuylerville
BronxEdgewater Park
BronxCastle Hill
BronxOlinville
BronxPelham Gardens
BronxConcourse
BronxUnionport
BronxEdenwald
BrooklynBay Ridge
BrooklynBensonhurst
BrooklynSunset Park
BrooklynGreenpoint
BrooklynGravesend
BrooklynBrighton Beach
BrooklynSheepshead Bay
BrooklynManhattan Terrace
BrooklynFlatbush
BrooklynCrown Heights
BrooklynEast 

NW11Finchley, Golders Green, Hampstead Garden Suburb, Hendon
NW2Cricklewood, Dollis Hill, Hampstead, Hendon, Neasden, Willesden, Willesden Green
NW3Belsize Park, Brent Cross, Finchley, Hampstead, Hendon, St. Pancras, Swiss Cottage
NW4Brent Cross, Hendon
NW5Hampstead, Hendon, Kentish Town, St. Pancras
NW6Brondesbury Park, Kilburn, Paddington, Queens Park, South Hampstead, West Hampstead, Willesden
NW7Mill Hill
NW8Hampstead, St. John's Wood, St. Marylebone
NW9Colindale, Hendon, Kingsbury, The Hyde
SE1Bermondsey, Borough, Camberwell, Lambeth, Southwark, Waterloo, Woolwich
SE10Greenwich, Lewisham
SE11Kennington, Lambeth, Southwark
SE12Chislehurst, Grove Park, Lambeth, Lee, Lewisham, Woolwich
SE13Greenwich, Hither Green, Lewisham
SE14Camberwell, Deptford, New Cross, New Cross Gate
SE15Camberwell, Deptford, Nunhead, Peckham
SE16Bermondsey, Camberwell, Deptford, Lewisham, Rotherhithe, South Bermondsey, Surrey Docks, Woolwich
SE17Camberwell, Elephant & Castle, Lambeth, Southwark, Walworth
SE18

Check the size of the resulting dataframe

In [28]:
print(nl_venues.shape)
nl_venues.head()

(14393, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,BronxWakefield,40.894705,-73.847201,Lollipops Gelato,40.894123,-73.845892,Dessert Shop
1,BronxWakefield,40.894705,-73.847201,Rite Aid,40.896521,-73.84468,Pharmacy
2,BronxWakefield,40.894705,-73.847201,Carvel Ice Cream,40.890487,-73.848568,Ice Cream Shop
3,BronxWakefield,40.894705,-73.847201,Dunkin Donuts,40.890631,-73.849027,Donut Shop
4,BronxWakefield,40.894705,-73.847201,SUBWAY,40.890656,-73.849192,Sandwich Place


Check how many venues were returned for each neighborhood

In [29]:
nl_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
BronxAllerton,30,30,30,30,30,30
BronxBaychester,22,22,22,22,22,22
BronxBedford Park,37,37,37,37,37,37
BronxBelmont,97,97,97,97,97,97
BronxBronxdale,15,15,15,15,15,15
BronxCastle Hill,9,9,9,9,9,9
BronxCity Island,25,25,25,25,25,25
BronxClaremont Village,20,20,20,20,20,20
BronxClason Point,11,11,11,11,11,11
BronxCo-op City,15,15,15,15,15,15


#### Find out how many unique categories can be curated from all the returned venues

In [30]:
print('There are {} unique categories.'.format(len(nl_venues['Venue Category'].unique())))

There are 460 unique categories.


___
## Analyze Each Neighbourhood

In [31]:
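# one hot encoding
nl_onehot = pd.get_dummies(nl_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
# note: 'Neighborhood' is also one of the venue categories, so this assignment
# overwrites that dummy column in place instead of appending a new column
nl_onehot['Neighborhood'] = nl_venues['Neighborhood'] 

# move the last column to the front
# note: because of the overwrite above, the last column is 'Yoga Studio', not 'Neighborhood'
fixed_columns = [nl_onehot.columns[-1]] + list(nl_onehot.columns[:-1])
nl_onehot = nl_onehot[fixed_columns]

nl_onehot.head()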

Unnamed: 0,Yoga Studio,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,Airport Terminal,Airport Tram,American Restaurant,Animal Shelter,Antique Shop,Arcade,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Auditorium,Australian Restaurant,Austrian Restaurant,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Court,Bath House,Beach,Beach Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Belgian Restaurant,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bike Trail,Bistro,Board Shop,Boarding House,Boat or Ferry,Bookstore,Border Crossing,Botanical Garden,Boutique,Bowling Alley,Boxing Gym,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bridge,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cafeteria,Café,Cajun / Creole Restaurant,Cambodian Restaurant,Camera Store,Campground,Canal,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Caucasian Restaurant,Cemetery,Champagne Bar,Check Cashing Service,Cheese Shop,Chinese Restaurant,Chocolate Shop,Christmas Market,Church,Climbing Gym,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Academic Building,College Basketball Court,College Bookstore,College Cafeteria,College Quad,College Theater,Colombian Restaurant,Comedy Club,Comfort Food Restaurant,Comic Shop,Community Center,Concert Hall,Construction & Landscaping,Convenience Store,Convention Center,Cosmetics Shop,Coworking Space,Creperie,Cricket Ground,Cuban Restaurant,Cultural Center,Cupcake Shop,Cycle Studio,Czech Restaurant,Dance Studio,Daycare,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distillery,Dive Bar,Doctor's Office,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Eastern European 
Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Eye Doctor,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Film Studio,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Service,Food Stand,Food Truck,Forest,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General College & University,General Entertainment,German Restaurant,Gift Shop,Gluten-free Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grilled Meat Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Halal Restaurant,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,Heliport,Herbs & Spices Store,High School,Himalayan Restaurant,Historic Site,History Museum,Hobby Shop,Home Service,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotel Pool,Hotpot Restaurant,Hungarian Restaurant,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Indonesian Restaurant,Indoor Play Area,Insurance Office,Intersection,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jewish Restaurant,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Korean Restaurant,Kosher Restaurant,Lake,Laser Tag,Latin American Restaurant,Laundromat,Laundry Service,Lawyer,Lebanese Restaurant,Library,Light Rail Station,Lighthouse,Lingerie Store,Liquor Store,Locksmith,Lounge,Malay Restaurant,Market,Martial Arts Dojo,Massage Studio,Mattress Store,Medical Center,Mediterranean Restaurant,Memorial Site,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan 
Restaurant,Motel,Motorcycle Shop,Movie Theater,Moving Target,Multiplex,Museum,Music School,Music Store,Music Venue,Nail Salon,Nature Preserve,Neighborhood,New American Restaurant,Newsstand,Nightclub,Non-Profit,Noodle House,North Indian Restaurant,Observatory,Office,Okonomiyaki Restaurant,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Nightlife,Other Repair Shop,Outdoor Sculpture,Outdoor Supply Store,Outdoors & Recreation,Outlet Store,Paella Restaurant,Pakistani Restaurant,Paper / Office Supplies Store,Park,Pedestrian Plaza,Performing Arts Venue,Persian Restaurant,Peruvian Restaurant,Pet Café,Pet Service,Pet Store,Pharmacy,Photography Studio,Piano Bar,Pie Shop,Pier,Piercing Parlor,Pilates Studio,Pizza Place,Planetarium,Platform,Playground,Plaza,Poke Place,Polish Restaurant,Pool,Pool Hall,Portuguese Restaurant,Post Office,Print Shop,Pub,Public Art,Racetrack,Ramen Restaurant,Record Shop,Recording Studio,Recreation Center,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Resort,Rest Area,Restaurant,River,Road,Rock Climbing Spot,Rock Club,Roller Rink,Romanian Restaurant,Roof Deck,Russian Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,School,Science Museum,Sculpture Garden,Seafood Restaurant,Shaanxi Restaurant,Shabu-Shabu Restaurant,Shanghai Restaurant,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soba Restaurant,Soccer Field,Soccer Stadium,Social Club,Soup Place,South American Restaurant,South Indian Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Spiritual Center,Sporting Goods Shop,Sports Bar,Sports Club,Sri Lankan Restaurant,Stables,State / Provincial Park,Stationery Store,Steakhouse,Storage Facility,Street Art,Street Food Gathering,Strip Club,Supermarket,Supplement Shop,Surf Spot,Sushi Restaurant,Swiss Restaurant,Szechuan 
Restaurant,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tattoo Parlor,Tea Room,Tech Startup,Temple,Tennis Court,Tennis Stadium,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Theme Restaurant,Thrift / Vintage Store,Tibetan Restaurant,Tiki Bar,Toll Plaza,Tourist Information Center,Toy / Game Store,Track,Track Stadium,Trail,Train,Train Station,Tree,Turkish Restaurant,Udon Restaurant,University,Used Bookstore,Vape Store,Varenyky restaurant,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Volleyball Court,Warehouse Store,Waste Facility,Watch Shop,Waterfront,Weight Loss Center,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Xinjiang Restaurant
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,BronxWakefield,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,BronxWakefield,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,BronxWakefield,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,BronxWakefield,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,BronxWakefield,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


#### Examine the new dataframe size

In [32]:
nl_onehot.shape

(14393, 460)
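
The shape reports 460 columns even though 460 categories plus a neighborhood column would give 461. The reason is that one venue category is itself called `Neighborhood`, so the assignment replaces that dummy column rather than adding a new one. Here is a small sketch of the effect on toy data; the rename to `Neighborhood (category)` is a hypothetical workaround, not what the notebook does.

```python
import pandas as pd

# Toy venues table; the second venue has the category literally named "Neighborhood".
venues = pd.DataFrame({
    "Neighborhood": ["BronxWakefield", "BronxWakefield", "E10Leyton"],
    "Venue Category": ["Pharmacy", "Neighborhood", "Pub"],
})

onehot = pd.get_dummies(venues[["Venue Category"]], prefix="", prefix_sep="")
n_categories = onehot.shape[1]

# Pattern used above: the assignment overwrites the existing 'Neighborhood' dummy column.
overwritten = onehot.copy()
overwritten["Neighborhood"] = venues["Neighborhood"]
print(overwritten.shape)  # same column count as before the assignment

# Hypothetical workaround: rename the clashing category, then insert the label column first.
safe = onehot.rename(columns={"Neighborhood": "Neighborhood (category)"})
safe.insert(0, "Neighborhood", venues["Neighborhood"])
print(safe.columns.tolist())
```

In the workaround the category information is preserved and the label column is guaranteed to be first, at the cost of one renamed category.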

#### Next, group rows by neighborhood and take the mean of the frequency of occurrence of each category
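
To see what the grouping step produces, here is a minimal sketch on a toy one-hot table. The neighborhoods `A` and `B` and their venues are made up; the `groupby(...).mean()` call is the same one used in the cell that follows.

```python
import pandas as pd

# Toy one-hot table: neighborhood A has 4 venues, neighborhood B has 2.
onehot = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Pharmacy":     [1, 0, 0, 0, 1, 1],
    "Pub":          [0, 1, 1, 1, 0, 0],
})

# The mean of a 0/1 column within a group is the share of that category
# among the group's venues.
grouped = onehot.groupby("Neighborhood").mean().reset_index()
print(grouped)

# Each row's category shares add up to 1 because every venue has exactly one category.
row_sums = grouped[["Pharmacy", "Pub"]].sum(axis=1)
print(row_sums.tolist())
```

Using shares instead of raw counts puts a neighborhood with 100 venues and one with 10 venues on the same scale, which matters when both cities are clustered together.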

In [33]:
nl_grouped = nl_onehot.groupby('Neighborhood').mean().reset_index()
nl_grouped.head()

Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,Airport Terminal,Airport Tram,American Restaurant,Animal Shelter,Antique Shop,Arcade,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Auditorium,Australian Restaurant,Austrian Restaurant,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Court,Bath House,Beach,Beach Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Belgian Restaurant,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bike Trail,Bistro,Board Shop,Boarding House,Boat or Ferry,Bookstore,Border Crossing,Botanical Garden,Boutique,Bowling Alley,Boxing Gym,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bridge,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cafeteria,Café,Cajun / Creole Restaurant,Cambodian Restaurant,Camera Store,Campground,Canal,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Caucasian Restaurant,Cemetery,Champagne Bar,Check Cashing Service,Cheese Shop,Chinese Restaurant,Chocolate Shop,Christmas Market,Church,Climbing Gym,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Academic Building,College Basketball Court,College Bookstore,College Cafeteria,College Quad,College Theater,Colombian Restaurant,Comedy Club,Comfort Food Restaurant,Comic Shop,Community Center,Concert Hall,Construction & Landscaping,Convenience Store,Convention Center,Cosmetics Shop,Coworking Space,Creperie,Cricket Ground,Cuban Restaurant,Cultural Center,Cupcake Shop,Cycle Studio,Czech Restaurant,Dance Studio,Daycare,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distillery,Dive Bar,Doctor's Office,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Eastern European 
Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Eye Doctor,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Film Studio,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Service,Food Stand,Food Truck,Forest,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General College & University,General Entertainment,German Restaurant,Gift Shop,Gluten-free Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grilled Meat Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Halal Restaurant,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,Heliport,Herbs & Spices Store,High School,Himalayan Restaurant,Historic Site,History Museum,Hobby Shop,Home Service,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotel Pool,Hotpot Restaurant,Hungarian Restaurant,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Indonesian Restaurant,Indoor Play Area,Insurance Office,Intersection,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jewish Restaurant,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Korean Restaurant,Kosher Restaurant,Lake,Laser Tag,Latin American Restaurant,Laundromat,Laundry Service,Lawyer,Lebanese Restaurant,Library,Light Rail Station,Lighthouse,Lingerie Store,Liquor Store,Locksmith,Lounge,Malay Restaurant,Market,Martial Arts Dojo,Massage Studio,Mattress Store,Medical Center,Mediterranean Restaurant,Memorial Site,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan 
Restaurant,Motel,Motorcycle Shop,Movie Theater,Moving Target,Multiplex,Museum,Music School,Music Store,Music Venue,Nail Salon,Nature Preserve,New American Restaurant,Newsstand,Nightclub,Non-Profit,Noodle House,North Indian Restaurant,Observatory,Office,Okonomiyaki Restaurant,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Nightlife,Other Repair Shop,Outdoor Sculpture,Outdoor Supply Store,Outdoors & Recreation,Outlet Store,Paella Restaurant,Pakistani Restaurant,Paper / Office Supplies Store,Park,Pedestrian Plaza,Performing Arts Venue,Persian Restaurant,Peruvian Restaurant,Pet Café,Pet Service,Pet Store,Pharmacy,Photography Studio,Piano Bar,Pie Shop,Pier,Piercing Parlor,Pilates Studio,Pizza Place,Planetarium,Platform,Playground,Plaza,Poke Place,Polish Restaurant,Pool,Pool Hall,Portuguese Restaurant,Post Office,Print Shop,Pub,Public Art,Racetrack,Ramen Restaurant,Record Shop,Recording Studio,Recreation Center,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Resort,Rest Area,Restaurant,River,Road,Rock Climbing Spot,Rock Club,Roller Rink,Romanian Restaurant,Roof Deck,Russian Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,School,Science Museum,Sculpture Garden,Seafood Restaurant,Shaanxi Restaurant,Shabu-Shabu Restaurant,Shanghai Restaurant,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soba Restaurant,Soccer Field,Soccer Stadium,Social Club,Soup Place,South American Restaurant,South Indian Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Spiritual Center,Sporting Goods Shop,Sports Bar,Sports Club,Sri Lankan Restaurant,Stables,State / Provincial Park,Stationery Store,Steakhouse,Storage Facility,Street Art,Street Food Gathering,Strip Club,Supermarket,Supplement Shop,Surf Spot,Sushi Restaurant,Swiss Restaurant,Szechuan Restaurant,Taco 
Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tattoo Parlor,Tea Room,Tech Startup,Temple,Tennis Court,Tennis Stadium,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Theme Restaurant,Thrift / Vintage Store,Tibetan Restaurant,Tiki Bar,Toll Plaza,Tourist Information Center,Toy / Game Store,Track,Track Stadium,Trail,Train,Train Station,Tree,Turkish Restaurant,Udon Restaurant,University,Used Bookstore,Vape Store,Varenyky restaurant,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Volleyball Court,Warehouse Store,Waste Facility,Watch Shop,Waterfront,Weight Loss Center,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Xinjiang Restaurant
0,BronxAllerton,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.033333,0.0,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,BronxBaychester,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.045455,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,BronxBedford Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.135135,0.0,0.0,0.0,0.0,0.081081,0.027027,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.081081,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.081081,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.081081,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,BronxBelmont,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051546,0.020619,0.020619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.020619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.082474,0.010309,0.0,0.030928,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.020619,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.020619,0.0,0.0,0.0,0.020619,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.020619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.030928,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.195876,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020619,0.0,0.0,0.0,0.010309,0.010309,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.020619,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.010309,0.0,0.0,0.0,0.0,0.082474,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0
4,BronxBronxdale,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


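Confirm the size of the new table (one row per neighborhood; the columns are the neighborhood name plus one column per venue category).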

In [34]:
nl_grouped.shape

(424, 460)

Write a function that sorts a neighborhood's venue categories by frequency in descending order and returns the most common ones.

In [35]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
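
As a quick illustration of what the function returns, here is a minimal sketch on a single made-up row. The name `toy_row` and its values are assumptions for illustration only and are not part of the project data.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # Skip the first entry (the neighborhood name) and rank the rest.
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Hypothetical one-row example; the frequencies are invented.
toy_row = pd.Series(
    {"Neighborhood": "Toyville", "Bakery": 0.1, "Park": 0.6, "Pub": 0.3}
)
top_two = return_most_common_venues(toy_row, 2)
print(list(top_two))
```

The function returns the category names (the index labels), not the frequencies themselves.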

Now create the new dataframe and display the top 10 venues for each neighborhood.

In [36]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # ranks beyond the third take the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = nl_grouped['Neighborhood']

for ind in np.arange(nl_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(nl_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,BronxAllerton,Supermarket,Pizza Place,Spa,Deli / Bodega,Chinese Restaurant,Cosmetics Shop,Bus Station,Donut Shop,Grocery Store,Fast Food Restaurant
1,BronxBaychester,Electronics Store,American Restaurant,Sandwich Place,Discount Store,Mexican Restaurant,Mattress Store,Shopping Mall,Pet Store,Bank,Fast Food Restaurant
2,BronxBedford Park,Deli / Bodega,Mexican Restaurant,Fried Chicken Joint,Diner,Pizza Place,Spanish Restaurant,Chinese Restaurant,Supermarket,Pharmacy,Sandwich Place
3,BronxBelmont,Italian Restaurant,Deli / Bodega,Pizza Place,Bakery,Dessert Shop,Grocery Store,Donut Shop,Sandwich Place,Fish Market,Café
4,BronxBronxdale,Italian Restaurant,Performing Arts Venue,Mexican Restaurant,Pizza Place,Paper / Office Supplies Store,Eastern European Restaurant,Spanish Restaurant,Gym,School,Bank
5,BronxCastle Hill,Bank,Diner,Market,Pizza Place,Mobile Phone Shop,Latin American Restaurant,Pharmacy,Cosmetics Shop,Deli / Bodega,Food Truck
6,BronxCity Island,Harbor / Marina,Seafood Restaurant,Thrift / Vintage Store,Jewelry Store,Diner,Pizza Place,Park,Bank,Bar,Spanish Restaurant
7,BronxClaremont Village,Bus Station,Bakery,Park,Fried Chicken Joint,Food,Chinese Restaurant,Pharmacy,Caribbean Restaurant,Gym,Grocery Store
8,BronxClason Point,Park,Bus Stop,Grocery Store,Scenic Lookout,Pool,South American Restaurant,Boat or Ferry,Spa,Farmers Market,Film Studio
9,BronxCo-op City,Baseball Field,Bus Station,Restaurant,Discount Store,Mattress Store,Shopping Mall,Grocery Store,Pizza Place,Park,Chinese Restaurant
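
The column labels above could also be built without relying on an exception. The following is a sketch of one possible alternative, not the notebook's own code; it is only meant for small values of `num_top_venues` such as the 10 used here (it would produce "21th" for rank 21).

```python
num_top_venues = 10
indicators = ["st", "nd", "rd"]

columns = ["Neighborhood"]
for ind in range(num_top_venues):
    # Use st/nd/rd for the first three ranks and th for the rest.
    suffix = indicators[ind] if ind < len(indicators) else "th"
    columns.append("{}{} Most Common Venue".format(ind + 1, suffix))

print(columns[:5])
```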


___
## Clustering Neighbourhoods

Run k-means to cluster the neighborhoods into 10 clusters.

In [37]:
# set number of clusters
kclusters = 10

# drop the name column so that only the numeric venue frequencies are clustered
nl_grouped_clustering = nl_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(nl_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([2, 1, 2, 2, 1, 2, 1, 1, 6, 1], dtype=int32)
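
Before looking at individual clusters, it can help to check how many neighborhoods fall under each label. The sketch below uses only the ten labels printed above as a stand-in for the full `kmeans.labels_` array, so the counts are illustrative rather than the real cluster sizes.

```python
import numpy as np

# The ten labels printed above; the full array has 424 entries.
labels = np.array([2, 1, 2, 2, 1, 2, 1, 1, 6, 1])
kclusters = 10

# Count how many rows carry each label; minlength keeps empty labels visible.
sizes = np.bincount(labels, minlength=kclusters)
for cluster_id, size in enumerate(sizes):
    print("label {}: {} neighborhoods".format(cluster_id, size))
```

Applied to the full label array, the same call would show at a glance which clusters contain only a single neighborhood.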

In [38]:
np.size(kmeans.labels_)

424

Create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [39]:
nl_grouped_1 = nl_grouped[['Neighborhood']]
neighborhoods_1 = pd.merge(nl_grouped_1, nyc_london, left_on ='Neighborhood', right_on='Neighbourhood')
nl_merged = neighborhoods_1

# add clustering labels (assumes the merge kept one row per neighborhood, in the same order as nl_grouped)
nl_merged['Cluster Labels'] = kmeans.labels_

# join the sorted top-venue columns onto each neighborhood
nl_merged = nl_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

nl_merged.head() # check the last columns!

Unnamed: 0,Neighborhood,borough_postcode,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,BronxAllerton,Bronx,BronxAllerton,40.865788,-73.859319,2,Supermarket,Pizza Place,Spa,Deli / Bodega,Chinese Restaurant,Cosmetics Shop,Bus Station,Donut Shop,Grocery Store,Fast Food Restaurant
1,BronxBaychester,Bronx,BronxBaychester,40.866858,-73.835798,1,Electronics Store,American Restaurant,Sandwich Place,Discount Store,Mexican Restaurant,Mattress Store,Shopping Mall,Pet Store,Bank,Fast Food Restaurant
2,BronxBedford Park,Bronx,BronxBedford Park,40.870185,-73.885512,2,Deli / Bodega,Mexican Restaurant,Fried Chicken Joint,Diner,Pizza Place,Spanish Restaurant,Chinese Restaurant,Supermarket,Pharmacy,Sandwich Place
3,BronxBelmont,Bronx,BronxBelmont,40.857277,-73.888452,2,Italian Restaurant,Deli / Bodega,Pizza Place,Bakery,Dessert Shop,Grocery Store,Donut Shop,Sandwich Place,Fish Market,Café
4,BronxBronxdale,Bronx,BronxBronxdale,40.852723,-73.861726,1,Italian Restaurant,Performing Arts Venue,Mexican Restaurant,Pizza Place,Paper / Office Supplies Store,Eastern European Restaurant,Spanish Restaurant,Gym,School,Bank


In [40]:
# create map
map_clusters_ny = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one hex color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
# (labels are 0-based, so rainbow[cluster-1] gives label 0 the last color in the list)
for lat, lon, poi, cluster in zip(nl_merged['Latitude'], nl_merged['Longitude'], nl_merged['Neighborhood'], nl_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters_ny)
       
map_clusters_ny

In [41]:
# create map
map_clusters_london = folium.Map(location=[latitude1, longitude1], zoom_start=11)

# set color scheme for the clusters: one hex color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
# (labels are 0-based, so rainbow[cluster-1] gives label 0 the last color in the list)
for lat, lon, poi, cluster in zip(nl_merged['Latitude'], nl_merged['Longitude'], nl_merged['Neighborhood'], nl_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters_london)
       
map_clusters_london
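
The color lookup used in both maps can be inspected without rendering a map. This sketch rebuilds the palette the same way and prints which hex color each 0-based cluster label receives under the `rainbow[cluster-1]` indexing; it uses only matplotlib and NumPy and does not touch folium.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 10

# One evenly spaced color per cluster, converted to hex strings.
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(c) for c in colors_array]

for label in range(kclusters):
    # Label 0 maps to index -1, i.e. the last color in the list.
    print(label, rainbow[label - 1])
```

Every label still gets a distinct color, so the maps remain readable; the mapping is simply shifted by one position.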

## Further Investigation on the Clusters

#### Cluster 1

In [42]:
nl_merged.loc[nl_merged['Cluster Labels'] == 0, nl_merged.columns[[1] + list(range(5, nl_merged.shape[1]))]]

Unnamed: 0,borough_postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
94,Brooklyn,0,Bus Stop,Harbor / Marina,Playground,Beach,Café,Sandwich Place,Pizza Place,Ice Cream Shop,Food,Farmers Market
248,Queens,0,Beach,Dog Run,Food Truck,Café,Gym / Fitness Center,Bus Stop,Diner,Bus Station,Deli / Bodega,Building
269,Queens,0,Beach,Bar,Scenic Lookout,Fish Market,Exhibit,Eye Doctor,Factory,Falafel Restaurant,Farm,Farmers Market
330,SW15,0,Scenic Lookout,Bus Stop,Grocery Store,Park,Fish & Chips Shop,Exhibit,Eye Doctor,Factory,Falafel Restaurant,Farm
347,Staten Island,0,American Restaurant,Deli / Bodega,Coffee Shop,Bus Stop,Food Service,Grocery Store,Farm,Farmers Market,Fish & Chips Shop,Fast Food Restaurant
350,Staten Island,0,Recreation Center,Theme Park,Bus Stop,Discount Store,Xinjiang Restaurant,Fish & Chips Shop,Exhibit,Eye Doctor,Factory,Falafel Restaurant
363,Staten Island,0,Bus Stop,Intersection,Sandwich Place,Xinjiang Restaurant,Fish & Chips Shop,Event Space,Exhibit,Eye Doctor,Factory,Falafel Restaurant
366,Staten Island,0,Bus Stop,Bank,Italian Restaurant,Ice Cream Shop,Bagel Shop,Bakery,Vegetarian / Vegan Restaurant,Food,Food & Drink Shop,Grocery Store
369,Staten Island,0,Dog Run,American Restaurant,Bus Stop,Xinjiang Restaurant,Flea Market,Eye Doctor,Factory,Falafel Restaurant,Farm,Farmers Market
376,Staten Island,0,Bus Stop,Restaurant,Beach,Dessert Shop,Basketball Court,Food,Bookstore,Liquor Store,Chinese Restaurant,Medical Center


#### Cluster 2

In [43]:
nl_merged.loc[nl_merged['Cluster Labels'] == 1, nl_merged.columns[[1] + list(range(5, nl_merged.shape[1]))]]

Unnamed: 0,borough_postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Bronx,1,Electronics Store,American Restaurant,Sandwich Place,Discount Store,Mexican Restaurant,Mattress Store,Shopping Mall,Pet Store,Bank,Fast Food Restaurant
4,Bronx,1,Italian Restaurant,Performing Arts Venue,Mexican Restaurant,Pizza Place,Paper / Office Supplies Store,Eastern European Restaurant,Spanish Restaurant,Gym,School,Bank
6,Bronx,1,Harbor / Marina,Seafood Restaurant,Thrift / Vintage Store,Jewelry Store,Diner,Pizza Place,Park,Bank,Bar,Spanish Restaurant
7,Bronx,1,Bus Station,Bakery,Park,Fried Chicken Joint,Food,Chinese Restaurant,Pharmacy,Caribbean Restaurant,Gym,Grocery Store
9,Bronx,1,Baseball Field,Bus Station,Restaurant,Discount Store,Mattress Store,Shopping Mall,Grocery Store,Pizza Place,Park,Chinese Restaurant
12,Bronx,1,Playground,Spa,Fried Chicken Joint,Trail,Sandwich Place,Xinjiang Restaurant,Event Space,Exhibit,Eye Doctor,Factory
17,Bronx,1,River,Playground,Plaza,Film Studio,Event Space,Exhibit,Eye Doctor,Factory,Falafel Restaurant,Farm
18,Bronx,1,Shoe Store,Spanish Restaurant,Donut Shop,Mobile Phone Shop,Pizza Place,Gym / Fitness Center,Fast Food Restaurant,Chinese Restaurant,Bank,Pharmacy
20,Bronx,1,Food,Waste Facility,Bank,Gourmet Shop,Juice Bar,Café,BBQ Joint,Farmers Market,Spanish Restaurant,Pizza Place
21,Bronx,1,Pizza Place,Bar,Sandwich Place,Supermarket,Latin American Restaurant,Mexican Restaurant,Discount Store,Bakery,Pharmacy,Fast Food Restaurant


#### Cluster 3

In [44]:
nl_merged.loc[nl_merged['Cluster Labels'] == 2, nl_merged.columns[[1] + list(range(5, nl_merged.shape[1]))]]

Unnamed: 0,borough_postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bronx,2,Supermarket,Pizza Place,Spa,Deli / Bodega,Chinese Restaurant,Cosmetics Shop,Bus Station,Donut Shop,Grocery Store,Fast Food Restaurant
2,Bronx,2,Deli / Bodega,Mexican Restaurant,Fried Chicken Joint,Diner,Pizza Place,Spanish Restaurant,Chinese Restaurant,Supermarket,Pharmacy,Sandwich Place
3,Bronx,2,Italian Restaurant,Deli / Bodega,Pizza Place,Bakery,Dessert Shop,Grocery Store,Donut Shop,Sandwich Place,Fish Market,Café
5,Bronx,2,Bank,Diner,Market,Pizza Place,Mobile Phone Shop,Latin American Restaurant,Pharmacy,Cosmetics Shop,Deli / Bodega,Food Truck
10,Bronx,2,Bus Station,Pizza Place,Grocery Store,Italian Restaurant,Liquor Store,Deli / Bodega,Convenience Store,Sandwich Place,Donut Shop,Pharmacy
11,Bronx,2,Deli / Bodega,Sandwich Place,Pizza Place,Fast Food Restaurant,Pharmacy,Sporting Goods Shop,Bus Station,Convenience Store,Mexican Restaurant,Supplement Shop
13,Bronx,2,Pizza Place,Shoe Store,Bank,Lounge,Spanish Restaurant,Discount Store,Mobile Phone Shop,Fish & Chips Shop,Donut Shop,Cosmetics Shop
14,Bronx,2,Caribbean Restaurant,Bus Station,Diner,Deli / Bodega,Convenience Store,Pizza Place,Bus Stop,Fast Food Restaurant,Seafood Restaurant,Donut Shop
15,Bronx,2,Fish Market,Deli / Bodega,Pizza Place,Grocery Store,Supermarket,Fried Chicken Joint,Forest,Filipino Restaurant,Exhibit,Eye Doctor
16,Bronx,2,Italian Restaurant,Deli / Bodega,Pizza Place,Ice Cream Shop,Sports Bar,Food & Drink Shop,Bookstore,Park,Chinese Restaurant,Coffee Shop


#### Cluster 4

In [45]:
nl_merged.loc[nl_merged['Cluster Labels'] == 3, nl_merged.columns[[1] + list(range(5, nl_merged.shape[1]))]]

Unnamed: 0,borough_postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
124,E12,3,Restaurant,Turkish Restaurant,Gym / Fitness Center,Indian Restaurant,Xinjiang Restaurant,Film Studio,Event Space,Exhibit,Eye Doctor,Factory
128,E16,3,Hotel,Chinese Restaurant,Pub,Sandwich Place,Restaurant,Hotel Bar,Burger Joint,Tapas Restaurant,Gym,Bagel Shop
130,E18,3,Café,Grocery Store,Coffee Shop,Italian Restaurant,Supermarket,Bar,Pharmacy,Fast Food Restaurant,Pizza Place,Cocktail Bar
132,E2,3,Coffee Shop,Pub,Café,Cocktail Bar,Restaurant,Gym / Fitness Center,Italian Restaurant,Pizza Place,Park,Bar
138,E8,3,Pub,Restaurant,Café,Bakery,Gym / Fitness Center,Modern European Restaurant,Gastropub,Park,Pool,Train Station
141,EC2,3,Coffee Shop,Café,Hotel,Gym / Fitness Center,Sushi Restaurant,Bar,Boxing Gym,Yoga Studio,Movie Theater,Burrito Place
142,EC3,3,Coffee Shop,Restaurant,Hotel,Gym / Fitness Center,French Restaurant,Italian Restaurant,Sandwich Place,Cocktail Bar,Pub,Wine Bar
143,EC4,3,Coffee Shop,Italian Restaurant,Pub,Gym / Fitness Center,Vietnamese Restaurant,Sandwich Place,Wine Bar,Japanese Restaurant,Modern European Restaurant,Falafel Restaurant
158,Manhattan,3,Coffee Shop,Italian Restaurant,Gym / Fitness Center,Hotel,Theater,Restaurant,American Restaurant,Gym,Dog Run,Thai Restaurant
184,N10,3,Café,Pizza Place,Coffee Shop,Grocery Store,Italian Restaurant,Deli / Bodega,Pub,Bakery,English Restaurant,Gym / Fitness Center


#### Cluster 5

In [46]:
nl_merged.loc[nl_merged['Cluster Labels'] == 4, nl_merged.columns[[1] + list(range(5, nl_merged.shape[1]))]]

Unnamed: 0,borough_postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
214,NW7,4,Park,Xinjiang Restaurant,Ethiopian Restaurant,Exhibit,Eye Doctor,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
223,Queens,4,Playground,Park,Xinjiang Restaurant,Film Studio,Event Space,Exhibit,Eye Doctor,Factory,Falafel Restaurant,Farm
286,Queens,4,Park,Xinjiang Restaurant,Ethiopian Restaurant,Exhibit,Eye Doctor,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
300,SE12,4,Park,Laundromat,Xinjiang Restaurant,Fish Market,Exhibit,Eye Doctor,Factory,Falafel Restaurant,Farm,Farmers Market
400,Staten Island,4,Park,Xinjiang Restaurant,Ethiopian Restaurant,Exhibit,Eye Doctor,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant


#### Cluster 6

In [47]:
nl_merged.loc[nl_merged['Cluster Labels'] == 5, nl_merged.columns[[1] + list(range(5, nl_merged.shape[1]))]]

Unnamed: 0,borough_postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
385,Staten Island,5,Bar,Xinjiang Restaurant,Fish Market,Exhibit,Eye Doctor,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant


#### Cluster 7

In [48]:
nl_merged.loc[nl_merged['Cluster Labels'] == 6, nl_merged.columns[[1] + list(range(5, nl_merged.shape[1]))]]

Unnamed: 0,borough_postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,Bronx,6,Park,Bus Stop,Grocery Store,Scenic Lookout,Pool,South American Restaurant,Boat or Ferry,Spa,Farmers Market,Film Studio
122,E10,6,Convenience Store,Cricket Ground,Park,Fried Chicken Joint,Train Station,Farm,Coffee Shop,Grocery Store,Hotel,Exhibit
123,E11,6,Pub,Café,Fast Food Restaurant,Grocery Store,Coffee Shop,Pizza Place,Platform,Irish Pub,Sandwich Place,Fried Chicken Joint
125,E13,6,Pub,Café,Bus Station,Gym / Fitness Center,Gym,Fish Market,Eye Doctor,Factory,Falafel Restaurant,Farm
127,E15,6,Pub,Sandwich Place,Fast Food Restaurant,Hotel,Coffee Shop,Bookstore,Café,Platform,Bar,General Entertainment
129,E17,6,Coffee Shop,Pizza Place,Grocery Store,Pub,Restaurant,Café,Farmers Market,Bar,Bookstore,Optical Shop
131,E1,6,Pub,Indian Restaurant,Hotel,Coffee Shop,Grocery Store,Pakistani Restaurant,Gym / Fitness Center,Ice Cream Shop,Sandwich Place,Turkish Restaurant
133,E3,6,Pub,Hotel,Grocery Store,Café,Convenience Store,Gym,Park,Locksmith,Light Rail Station,Bar
135,E5,6,Grocery Store,Coffee Shop,Park,Breakfast Spot,Beer Bar,Turkish Restaurant,Train Station,Cocktail Bar,Café,Martial Arts Dojo
136,E6,6,Grocery Store,Café,Nature Preserve,Park,Pub,Bar,Food Truck,Food Stand,Eye Doctor,Factory


#### Cluster 8

In [49]:
nl_merged.loc[nl_merged['Cluster Labels'] == 7, nl_merged.columns[[1] + list(range(5, nl_merged.shape[1]))]]

Unnamed: 0,borough_postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
371,Staten Island,7,Italian Restaurant,Xinjiang Restaurant,Fish Market,Event Space,Exhibit,Eye Doctor,Factory,Falafel Restaurant,Farm,Farmers Market


#### Cluster 9

In [50]:
nl_merged.loc[nl_merged['Cluster Labels'] == 8, nl_merged.columns[[1] + list(range(5, nl_merged.shape[1]))]]

Unnamed: 0,borough_postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
99,Brooklyn,8,Lake,Pool,Xinjiang Restaurant,Fish & Chips Shop,Event Space,Exhibit,Eye Doctor,Factory,Falafel Restaurant,Farm
352,Staten Island,8,Baseball Field,Pool,Convenience Store,Bus Stop,Xinjiang Restaurant,Fish Market,Eye Doctor,Factory,Falafel Restaurant,Farm


#### Cluster 10

In [51]:
nl_merged.loc[nl_merged['Cluster Labels'] == 9, nl_merged.columns[[1] + list(range(5, nl_merged.shape[1]))]]

Unnamed: 0,borough_postcode,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
134,E4,9,American Restaurant,Xinjiang Restaurant,Flea Market,Exhibit,Eye Doctor,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
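
The cluster cells above all apply the same filter with a different label. As a minimal sketch, the same inspection can be done in one pass with `groupby`. The frame `toy` and the helper `rows_by_cluster` are hypothetical stand-ins for illustration; the real `nl_merged` has more rows and columns.

```python
import pandas as pd

# Hypothetical stand-in for nl_merged (the real frame has many more columns).
toy = pd.DataFrame({
    "Neighbourhood": ["A", "B", "C", "D"],
    "borough_postcode": ["Bronx", "E10", "Staten Island", "E4"],
    "Cluster Labels": [6, 6, 7, 9],
    "1st Most Common Venue": ["Park", "Convenience Store",
                              "Italian Restaurant", "American Restaurant"],
})

def rows_by_cluster(df, label_col="Cluster Labels"):
    """Map each 1-based cluster number to the rows carrying that label."""
    # groupby iterates over the labels in sorted order; label 0 becomes Cluster 1.
    return {int(label) + 1: group for label, group in df.groupby(label_col)}

clusters = rows_by_cluster(toy)
for number, rows in clusters.items():
    print("Cluster {}: {} area(s)".format(number, len(rows)))
```

Each value in `clusters` is an ordinary DataFrame, so it can be displayed or filtered exactly like the output of the individual cells.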


## Statistics of the clusters

Count how many neighbourhoods in each cluster come from New York and how many come from London.

In [52]:
nl_stat = pd.merge(nyc_london, nl_merged[['Neighbourhood','Cluster Labels']], how='left', on='Neighbourhood')

In [53]:
# Split the dataframe into 2 parts by row position: the first 306 rows are New York, the remaining rows are London
nl_stat_ny = nl_stat[nl_stat.index < 306]
nl_stat_london = nl_stat[nl_stat.index >= 306]
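
Splitting on a fixed row number only works while the New York rows come first and their count stays at 306. As a rough alternative, assuming a city label can be attached to every row, `pd.crosstab` produces the per-cluster counts for both cities in a single step. The frame `toy_stat`, the `City` column, and `n_new_york` are hypothetical names used only for this sketch.

```python
import pandas as pd

# Hypothetical stand-in for nl_stat: 3 New York rows followed by 2 London rows.
toy_stat = pd.DataFrame({
    "Neighbourhood": ["N1", "N2", "N3", "L1", "L2"],
    "Cluster Labels": [1, 1, 2, 3, 1],
})
n_new_york = 3  # plays the role of 306 in the notebook

# Tag each row with its city, based on its position in the combined frame.
toy_stat["City"] = ["New York" if i < n_new_york else "London"
                    for i in range(len(toy_stat))]

# Rows: cluster label. Columns: city. Values: number of neighbourhoods.
counts = pd.crosstab(toy_stat["Cluster Labels"], toy_stat["City"])
counts = counts[["New York", "London"]]  # fix the column order
print(counts)
```

Because the table is built from the labels themselves, clusters that are missing from one city show up as 0 without a separate `fillna` step.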

In [54]:
# Count the New York neighbourhoods in each cluster
count_ny = pd.DataFrame(nl_stat_ny['Cluster Labels'].value_counts())
count_ny.columns = ['New York']
# Count the London neighbourhoods in each cluster
count_london = pd.DataFrame(nl_stat_london['Cluster Labels'].value_counts())
count_london.columns = ['London']

In [55]:
# Combine the 2 tables, aligned on cluster label; a cluster missing from one city gets a count of 0
count_nl = pd.concat([count_ny, count_london], axis=1).fillna(0).sort_index()
# Rename labels 0-9 to 'Cluster 1' ... 'Cluster 10' (relies on the sorted index containing all ten labels)
count_nl.index = ['Cluster {}'.format(i) for i in range(1, 11)]
count_nl

Unnamed: 0,New York,London
Cluster 1,15.0,1.0
Cluster 2,189.0,11.0
Cluster 3,88.0,0.0
Cluster 4,6.0,46.0
Cluster 5,3.0,2.0
Cluster 6,1.0,0.0
Cluster 7,1.0,57.0
Cluster 8,1.0,0.0
Cluster 9,2.0,0.0
Cluster 10,0.0,1.0
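
New York contributes 306 neighbourhoods and London only 118, so raw counts are hard to compare directly. As a minimal sketch, the counts from the table above can be converted into proportions in two ways. The names `share_by_city` and `share_by_cluster` are introduced here for illustration only.

```python
import pandas as pd

# Counts copied from the table above.
count_nl = pd.DataFrame(
    {"New York": [15, 189, 88, 6, 3, 1, 1, 1, 2, 0],
     "London": [1, 11, 0, 46, 2, 0, 57, 0, 0, 1]},
    index=["Cluster {}".format(i) for i in range(1, 11)],
)

# Share of each city's neighbourhoods that falls in each cluster (each column sums to 1).
share_by_city = count_nl / count_nl.sum(axis=0)

# Share of each cluster's neighbourhoods that comes from each city (each row sums to 1).
share_by_cluster = count_nl.div(count_nl.sum(axis=1), axis=0)

print(share_by_city.round(3))
print(share_by_cluster.round(3))
```

Read by city, Cluster 2 holds 189 of 306 New York neighbourhoods (about 62%), while Cluster 7 holds 57 of 118 London areas (about 48%). Read by cluster, Cluster 7 is almost entirely London (57 of 58).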


Plot the counts for the 2 cities as a grouped bar chart.

In [56]:
import matplotlib.pyplot as plt

# Grouped bar chart: one pair of bars (New York, London) per cluster
count_nl.plot(kind='bar',
              width=0.8,
              figsize=(20, 8),
              color=['#5cb85c', '#5bc0de'],
              fontsize=18)

plt.title("Number of Neighbourhoods in Each Cluster in New York and London", fontsize=20)
plt.show()

[Figure: bar chart "Number of Neighbourhoods in Each Cluster in New York and London"]