# Coursera Capstone - Battle of the Neighborhoods:
# Opening an Irish pub in Brooklyn

# Introduction

This report considers which neighborhood in the Brooklyn borough of New York would be the best place to open an Irish pub. It uses the Foursquare API to analyze the locations of similar or competing venues, and to identify trending venues and neighborhoods that might indicate popular areas for a new nighttime venue.


# Contents

1- Introduction

2- About the Data Set

3- Data Collection and Understanding

4- Data Exploration 

5- Conclusion

# About The Data

There are two data sources that will be used in this report:

1. Data that will be acquired via a Foursquare developer account.
2. GeoJSON data describing the neighborhoods of New York, which can be plotted on a Folium map for visualisation.


# Import the required libraries

In [115]:
# import necessary libraries
!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values
import requests # library to handle requests
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 

import json # library to handle JSON files
    
# function for transforming a JSON structure into a pandas dataframe
from pandas.io.json import json_normalize

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')

Solving environment: done

# All requested packages already installed.

Solving environment: done

# All requested packages already installed.

Folium installed
Libraries imported.


# Prepare a dataframe containing location data of New York's neighborhoods

In [116]:
# Download a publicly available GeoJSON file of New York neighborhoods

!wget -q -O 'newyork_data.json' https://cocl.us/new_york_dataset
print('Data downloaded!')

Data downloaded!


In [117]:
with open('newyork_data.json') as json_data:
  newyork_data = json.load(json_data)

In [118]:
neighborhoods_data = newyork_data['features']

In [119]:
# Let's take a look at the first neighborhood to make sure the dataset loaded correctly
neighborhoods_data[0]

{'type': 'Feature',
 'id': 'nyu_2451_34572.1',
 'geometry': {'type': 'Point',
  'coordinates': [-73.84720052054902, 40.89470517661]},
 'geometry_name': 'geom',
 'properties': {'name': 'Wakefield',
  'stacked': 1,
  'annoline1': 'Wakefield',
  'annoline2': None,
  'annoline3': None,
  'annoangle': 0.0,
  'borough': 'Bronx',
  'bbox': [-73.84720052054902,
   40.89470517661,
   -73.84720052054902,
   40.89470517661]}}

In [120]:
# Transform the GeoJSON features into a Pandas dataframe

# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

In [121]:
# Check it worked and the headers are correct

neighborhoods

| | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|


In [122]:
# Fill the dataframe from the GeoJSON features

for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']
        
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]
    
    neighborhoods = neighborhoods.append({'Borough': borough,
                                          'Neighborhood': neighborhood_name,
                                          'Latitude': neighborhood_lat,
                                          'Longitude': neighborhood_lon}, ignore_index=True)
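
The loop above relies on `DataFrame.append`, which was deprecated in pandas 1.4 and removed in pandas 2.0. A minimal sketch of an alternative, assuming the same feature layout as the sample printed earlier: collect plain dictionaries first and build the dataframe once. `features_to_rows` is a hypothetical helper name, not part of the original notebook.

```python
def features_to_rows(features):
    """Collect one plain dict per GeoJSON feature (hypothetical helper)."""
    rows = []
    for feature in features:
        # GeoJSON stores point coordinates as [longitude, latitude]
        lon, lat = feature['geometry']['coordinates']
        rows.append({
            'Borough': feature['properties']['borough'],
            'Neighborhood': feature['properties']['name'],
            'Latitude': lat,
            'Longitude': lon,
        })
    return rows


# Sample feature abbreviated from the first entry in the dataset
sample = [{
    'geometry': {'type': 'Point',
                 'coordinates': [-73.84720052054902, 40.89470517661]},
    'properties': {'name': 'Wakefield', 'borough': 'Bronx'},
}]
rows = features_to_rows(sample)
print(rows[0]['Neighborhood'], rows[0]['Latitude'])
```

The resulting list can be passed to `pd.DataFrame(rows, columns=column_names)` in a single call, which avoids rebuilding the dataframe on every iteration.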

In [123]:
# Examine the dataframe

neighborhoods.head()

| | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|
| 0 | Bronx | Wakefield | 40.894705 | -73.847201 |
| 1 | Bronx | Co-op City | 40.874294 | -73.829939 |
| 2 | Bronx | Eastchester | 40.887556 | -73.827806 |
| 3 | Bronx | Fieldston | 40.895437 | -73.905643 |
| 4 | Bronx | Riverdale | 40.890834 | -73.912585 |


In [124]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]
    )
)

The dataframe has 5 boroughs and 306 neighborhoods.


# Get the latitude and longitude of the NY neighborhoods and display on a map

In [125]:
#Define geocoder

address = 'New York City, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7127281, -74.0060152.
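
geopy's `geocode` returns `None` when it finds no match, so reading `.latitude` directly can fail with an `AttributeError`. A minimal sketch of a guard, assuming any callable with the same behaviour as `geolocator.geocode`; `geocode_or_fail` and the stand-in geocoder below are hypothetical and exist only so the example runs offline.

```python
def geocode_or_fail(geocode, address):
    """Return (latitude, longitude) or raise if the geocoder finds nothing."""
    location = geocode(address)
    if location is None:
        raise ValueError('No geocoding result for {!r}'.format(address))
    return location.latitude, location.longitude


# Stand-in geocoder used only to demonstrate the helper offline
class FakeLocation:
    latitude = 40.7127281
    longitude = -74.0060152


def fake_geocode(address):
    return FakeLocation() if address == 'New York City, NY' else None


print(geocode_or_fail(fake_geocode, 'New York City, NY'))
```

In the notebook this would be called as `geocode_or_fail(geolocator.geocode, address)`.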


In [126]:
# create map of New York using latitude and longitude values
map_newyork = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_newyork)  
    
map_newyork

# Extract Brooklyn from the NY map

In [127]:
brooklyn_data = neighborhoods[neighborhoods['Borough'] == 'Brooklyn'].reset_index(drop=True)
brooklyn_data.head()

| | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|
| 0 | Brooklyn | Bay Ridge | 40.625801 | -74.030621 |
| 1 | Brooklyn | Bensonhurst | 40.611009 | -73.99518 |
| 2 | Brooklyn | Sunset Park | 40.645103 | -74.010316 |
| 3 | Brooklyn | Greenpoint | 40.730201 | -73.954241 |
| 4 | Brooklyn | Gravesend | 40.59526 | -73.973471 |


In [128]:
# Get the coordinates of Brooklyn

address = 'Brooklyn, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Brooklyn are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Brooklyn are 40.6501038, -73.9495823.


In [129]:
# create map of Brooklyn using latitude and longitude values and show the neighborhoods

map_brooklyn = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, label in zip(brooklyn_data['Latitude'], brooklyn_data['Longitude'], brooklyn_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_brooklyn)  
    
map_brooklyn

# Start exploring Brooklyn venues with Foursquare

In [130]:
CLIENT_ID = 'JVX5WEXUKCOFYWJW3RHZFDYVD5RPDIEH1QNA2AFRQVHDUQU2' # My Foursquare ID
CLIENT_SECRET = 'OCHC0TUFJICHWRVKTJMCCSC1ZRG45HZM4QC1DDHGL4TR4IBJ' # My Foursquare Secret
VERSION = '20180323' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: JVX5WEXUKCOFYWJW3RHZFDYVD5RPDIEH1QNA2AFRQVHDUQU2
CLIENT_SECRET:OCHC0TUFJICHWRVKTJMCCSC1ZRG45HZM4QC1DDHGL4TR4IBJ
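
Hardcoding and printing API credentials makes them visible to anyone who reads the notebook. A minimal sketch of one alternative, assuming the credentials are stored in environment variables; the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper name are assumptions for illustration, not a Foursquare convention.

```python
import os


def load_foursquare_credentials(environ=os.environ):
    """Read credentials from environment variables (names are assumptions)."""
    try:
        return environ['FOURSQUARE_CLIENT_ID'], environ['FOURSQUARE_CLIENT_SECRET']
    except KeyError as missing:
        raise RuntimeError('Missing environment variable: {}'.format(missing))


# Demonstration with a stand-in mapping instead of the real environment
demo_env = {'FOURSQUARE_CLIENT_ID': 'demo-id',
            'FOURSQUARE_CLIENT_SECRET': 'demo-secret'}
client_id, client_secret = load_foursquare_credentials(demo_env)
print('CLIENT_ID loaded:', bool(client_id))
```

Calling `load_foursquare_credentials()` with no argument reads from the real environment, and the confirmation message avoids echoing the secret itself.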


# Establish what Irish pubs are in Brooklyn 

For simplicity we will just search for venues with the word "pub" in the title. "Pub" is the usual Irish term for a bar and common in the United States for Irish venues.

The Foursquare API is structured so that we can search for venues matching a query and receive the results as JSON, using the URL structure below:

> `https://api.foursquare.com/v2/venues/`**search**`?client_id=`**CLIENT_ID**`&client_secret=`**CLIENT_SECRET**`&ll=`**LATITUDE**`,`**LONGITUDE**`&v=`**VERSION**`&query=`**QUERY**`&radius=`**RADIUS**`&limit=`**LIMIT**
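
The URL structure above can also be assembled from named parameters, which handles escaping and keeps each value labelled. A minimal sketch using only the standard library; `build_search_url` is a hypothetical helper and the credentials shown are placeholders.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = 'https://api.foursquare.com/v2/venues/search'


def build_search_url(client_id, client_secret, lat, lng, version,
                     query, radius, limit):
    """Assemble the search URL from named parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': '{},{}'.format(lat, lng),
        'v': version,
        'query': query,
        'radius': radius,
        'limit': limit,
    }
    # safe=',' keeps the comma in "lat,lng" readable instead of %2C
    return SEARCH_ENDPOINT + '?' + urlencode(params, safe=',')


# Placeholder credentials; coordinates are the Brooklyn values used in this report
demo_url = build_search_url('ID', 'SECRET', 40.6501038, -73.9495823,
                            '20180323', 'pub', 40000, 200)
print(demo_url)
```

With `requests`, the same dictionary could instead be passed as `requests.get(SEARCH_ENDPOINT, params=params)`, which builds the query string internally.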

In [131]:
search_query = 'pub'
near = 'Brooklyn, NY'
radius = 40000
limit = 200
print(search_query + ' .... OK!')

pub .... OK!


In [132]:
# create the api url
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, limit)
url

'https://api.foursquare.com/v2/venues/search?client_id=JVX5WEXUKCOFYWJW3RHZFDYVD5RPDIEH1QNA2AFRQVHDUQU2&client_secret=OCHC0TUFJICHWRVKTJMCCSC1ZRG45HZM4QC1DDHGL4TR4IBJ&ll=40.6501038,-73.9495823&v=20180323&query=pub&radius=40000&limit=200'

In [133]:
# get the data result in json
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e23658a1e152c001baa730a'},
 'response': {'venues': [{'id': '5667030a498ef28ad12dc4c5',
    'name': 'Bagel Pub',
    'contact': {},
    'location': {'address': '775 Franklin Ave',
     'crossStreet': 'at St. Johns Pl',
     'lat': 40.672343,
     'lng': -73.957283,
     'labeledLatLngs': [{'label': 'display',
       'lat': 40.672343,
       'lng': -73.957283}],
     'distance': 2559,
     'postalCode': '11238',
     'cc': 'US',
     'city': 'Brooklyn',
     'state': 'NY',
     'country': 'United States',
     'formattedAddress': ['775 Franklin Ave (at St. Johns Pl)',
      'Brooklyn, NY 11238',
      'United States']},
    'categories': [{'id': '4bf58dd8d48988d179941735',
      'name': 'Bagel Shop',
      'pluralName': 'Bagel Shops',
      'shortName': 'Bagels',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/bagels_',
       'suffix': '.png'},
      'primary': True}],
    'verified': True,
    'stats': {'tipCount': 0,
     'us

In [134]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

Unnamed: 0,beenHere.count,beenHere.lastCheckinExpiredAt,beenHere.marked,beenHere.unconfirmedCount,categories,delivery.id,delivery.provider.icon.name,delivery.provider.icon.prefix,delivery.provider.icon.sizes,delivery.provider.name,delivery.url,hasPerk,hereNow.count,hereNow.groups,hereNow.summary,id,location.address,location.cc,location.city,location.country,location.crossStreet,location.distance,location.formattedAddress,location.labeledLatLngs,location.lat,location.lng,location.postalCode,location.state,name,referralId,stats.checkinsCount,stats.tipCount,stats.usersCount,stats.visitsCount,venueChains,venuePage.id,verified
0,0,0,False,0,"[{'id': '4bf58dd8d48988d179941735', 'name': 'Bagel Shop', 'pluralName': 'Bagel Shops', 'shortName': 'Bagels', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/bagels_', 'suffix': '.png'}, 'primary': True}]",322607.0,/delivery_provider_seamless_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",seamless,https://www.seamless.com/menu/bagel-pub-crown-heights-775-franklin-ave-brooklyn/322607?affiliate=1131&utm_source=foursquare-affiliate-network&utm_medium=affiliate&utm_campaign=1131&utm_content=322607,False,1,"[{'type': 'others', 'name': 'Other people here', 'count': 1, 'items': []}]",One other person is here,5667030a498ef28ad12dc4c5,775 Franklin Ave,US,Brooklyn,United States,at St. Johns Pl,2559,"[775 Franklin Ave (at St. Johns Pl), Brooklyn, NY 11238, United States]","[{'label': 'display', 'lat': 40.672343, 'lng': -73.957283}]",40.672343,-73.957283,11238,NY,Bagel Pub,v-1579378088,0,0,0,0,[],151138815,True
1,0,0,False,0,"[{'id': '4bf58dd8d48988d179941735', 'name': 'Bagel Shop', 'pluralName': 'Bagel Shops', 'shortName': 'Bagels', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/bagels_', 'suffix': '.png'}, 'primary': True}]",267000.0,/delivery_provider_seamless_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",seamless,https://www.seamless.com/menu/bagel-pub-park-slope-287-9th-st-brooklyn/267000?affiliate=1131&utm_source=foursquare-affiliate-network&utm_medium=affiliate&utm_campaign=1131&utm_content=267000,False,0,[],Nobody here,4f900c3ee4b07368273ec967,287 9th St,US,Brooklyn,United States,btwn 4th & 5th Ave,3828,"[287 9th St (btwn 4th & 5th Ave), Brooklyn, NY 11215, United States]","[{'label': 'display', 'lat': 40.66952579241215, 'lng': -73.9869953122907}]",40.669526,-73.986995,11215,NY,Bagel Pub Park Slope,v-1579378088,0,0,0,0,[],151137917,True
2,0,0,False,0,"[{'id': '4bf58dd8d48988d148941735', 'name': 'Donut Shop', 'pluralName': 'Donut Shops', 'shortName': 'Donuts', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/donuts_', 'suffix': '.png'}, 'primary': True}]",1701138.0,/delivery_provider_seamless_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",seamless,https://www.seamless.com/menu/the-donut-pub-203-w-14th-st-new-york/1701138?affiliate=1131&utm_source=foursquare-affiliate-network&utm_medium=affiliate&utm_campaign=1131&utm_content=1701138,False,1,"[{'type': 'others', 'name': 'Other people here', 'count': 1, 'items': []}]",One other person is here,459b8681f964a5208c401fe3,203 W 14th St,US,New York,United States,btwn 7th & 8th Ave,10733,"[203 W 14th St (btwn 7th & 8th Ave), New York, NY 10011, United States]","[{'label': 'display', 'lat': 40.73867789322298, 'lng': -73.99983813539093}]",40.738678,-73.999838,10011,NY,The Donut Pub,v-1579378088,0,0,0,0,[],78017949,True
3,0,0,False,0,"[{'id': '4bf58dd8d48988d116941735', 'name': 'Bar', 'pluralName': 'Bars', 'shortName': 'Bar', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/nightlife/pub_', 'suffix': '.png'}, 'primary': True}]",,,,,,,False,1,"[{'type': 'others', 'name': 'Other people here', 'count': 1, 'items': []}]",One other person is here,4587e9aef964a520cf3f1fe3,76 4th Ave,US,Brooklyn,United States,btwn Bergen St & St. Marks Pl,4398,"[76 4th Ave (btwn Bergen St & St. Marks Pl), Brooklyn, NY 11217, United States]","[{'label': 'display', 'lat': 40.682261518582656, 'lng': -73.97984370290466}]",40.682262,-73.979844,11217,NY,Fourth Avenue Pub,v-1579378088,0,0,0,0,[],44560368,True
4,0,0,False,0,"[{'id': '4bf58dd8d48988d11b941735', 'name': 'Pub', 'pluralName': 'Pubs', 'shortName': 'Pub', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/nightlife/pub_', 'suffix': '.png'}, 'primary': True}]",,,,,,,False,0,[],Nobody here,49f125dcf964a52091691fe3,120 Cedar St,US,New York,United States,at Greenwich St.,8533,"[120 Cedar St (at Greenwich St.), New York, NY 10006, United States]","[{'label': 'display', 'lat': 40.70989378141622, 'lng': -74.01283563128297}]",40.709894,-74.012836,10006,NY,O'Hara's Restaurant & Pub,v-1579378088,0,0,0,0,[],93216281,True


In [135]:
dataframe.shape

(50, 37)
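
The request asked for `limit=200` but the dataframe has 50 rows, which suggests the endpoint caps the number of results per call. Before flattening a response it can also help to confirm that the request succeeded, using the `meta.code` field visible in the output above. A minimal sketch; `extract_venues` is a hypothetical helper, and the 429 code in the failing sample is only an illustrative error value.

```python
def extract_venues(payload):
    """Return the venue list from a search response, or raise on an error code."""
    code = payload.get('meta', {}).get('code')
    if code != 200:
        raise RuntimeError('Foursquare request failed with code {}'.format(code))
    return payload.get('response', {}).get('venues', [])


# Abbreviated stand-ins for a successful and a failed response
ok_payload = {'meta': {'code': 200},
              'response': {'venues': [{'name': 'Bagel Pub'}]}}
bad_payload = {'meta': {'code': 429}, 'response': {}}

print(len(extract_venues(ok_payload)))
```

In the notebook, `venues = extract_venues(results)` would replace the direct `results['response']['venues']` lookup.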

In [136]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns]

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered

Unnamed: 0,name,categories,address,cc,city,country,crossStreet,distance,formattedAddress,labeledLatLngs,lat,lng,postalCode,state,id
0,Bagel Pub,Bagel Shop,775 Franklin Ave,US,Brooklyn,United States,at St. Johns Pl,2559,"[775 Franklin Ave (at St. Johns Pl), Brooklyn, NY 11238, United States]","[{'label': 'display', 'lat': 40.672343, 'lng': -73.957283}]",40.672343,-73.957283,11238,NY,5667030a498ef28ad12dc4c5
1,Bagel Pub Park Slope,Bagel Shop,287 9th St,US,Brooklyn,United States,btwn 4th & 5th Ave,3828,"[287 9th St (btwn 4th & 5th Ave), Brooklyn, NY 11215, United States]","[{'label': 'display', 'lat': 40.66952579241215, 'lng': -73.9869953122907}]",40.669526,-73.986995,11215,NY,4f900c3ee4b07368273ec967
2,The Donut Pub,Donut Shop,203 W 14th St,US,New York,United States,btwn 7th & 8th Ave,10733,"[203 W 14th St (btwn 7th & 8th Ave), New York, NY 10011, United States]","[{'label': 'display', 'lat': 40.73867789322298, 'lng': -73.99983813539093}]",40.738678,-73.999838,10011,NY,459b8681f964a5208c401fe3
3,Fourth Avenue Pub,Bar,76 4th Ave,US,Brooklyn,United States,btwn Bergen St & St. Marks Pl,4398,"[76 4th Ave (btwn Bergen St & St. Marks Pl), Brooklyn, NY 11217, United States]","[{'label': 'display', 'lat': 40.682261518582656, 'lng': -73.97984370290466}]",40.682262,-73.979844,11217,NY,4587e9aef964a520cf3f1fe3
4,O'Hara's Restaurant & Pub,Pub,120 Cedar St,US,New York,United States,at Greenwich St.,8533,"[120 Cedar St (at Greenwich St.), New York, NY 10006, United States]","[{'label': 'display', 'lat': 40.70989378141622, 'lng': -74.01283563128297}]",40.709894,-74.012836,10006,NY,49f125dcf964a52091691fe3
5,The Irish American Pub,Pub,17 John St,US,New York,United States,btwn Broadway and Nassau St,8321,"[17 John St (btwn Broadway and Nassau St), New York, NY 10038, United States]","[{'label': 'display', 'lat': 40.7097984320412, 'lng': -74.00890728253643}]",40.709798,-74.008907,10038,NY,3fd66200f964a520c3e81ee3
6,Black Horse Pub,Bar,568 5th Ave,US,Brooklyn,United States,at 16th St,3793,"[568 5th Ave (at 16th St), Brooklyn, NY 11215, United States]","[{'label': 'display', 'lat': 40.66506312347628, 'lng': -73.98993841707401}]",40.665063,-73.989938,11215,NY,4ae12770f964a520858521e3
7,Joe's Pub,Music Venue,425 Lafayette St,US,New York,United States,btwn Astor Pl & E 4th St,9511,"[425 Lafayette St (btwn Astor Pl & E 4th St), New York, NY 10003, United States]","[{'label': 'display', 'lat': 40.729275386862724, 'lng': -73.99195905981068}]",40.729275,-73.991959,10003,NY,3fd66200f964a520f7e51ee3
8,Putnam's Pub & Cooker,Pub,419 Myrtle Ave,US,Brooklyn,United States,Clinton Ave.,5071,"[419 Myrtle Ave (Clinton Ave.), Brooklyn, NY 11205, United States]","[{'label': 'display', 'lat': 40.69320919801698, 'lng': -73.9690082938726}]",40.693209,-73.969008,11205,NY,510193528302126a99141178
9,McGee's Pub,Pub,240 W 55th St,US,New York,United States,btwn 8th Ave & Broadway,13097,"[240 W 55th St (btwn 8th Ave & Broadway), New York, NY 10019, United States]","[{'label': 'display', 'lat': 40.765010054507684, 'lng': -73.9829284943599}]",40.76501,-73.982928,10019,NY,3fd66200f964a52092e71ee3


In [137]:
dataframe_filtered.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50 entries, 0 to 49
Data columns (total 15 columns):
name                50 non-null object
categories          50 non-null object
address             49 non-null object
cc                  50 non-null object
city                50 non-null object
country             50 non-null object
crossStreet         49 non-null object
distance            50 non-null int64
formattedAddress    50 non-null object
labeledLatLngs      50 non-null object
lat                 50 non-null float64
lng                 50 non-null float64
postalCode          50 non-null object
state               50 non-null object
id                  50 non-null object
dtypes: float64(2), int64(1), object(12)
memory usage: 5.9+ KB


In [138]:
dataframe_filtered.describe()

| | distance | lat | lng |
|---|---|---|---|
| count | 50.0 | 50.0 | 50.0 |
| mean | 10282.86 | 40.732035 | -73.977169 |
| std | 3253.786843 | 0.033145 | 0.045049 |
| min | 509.0 | 40.643433 | -74.032235 |
| 25% | 9236.5 | 40.711357 | -73.998208 |
| 50% | 11456.0 | 40.746526 | -73.984096 |
| 75% | 12278.75 | 40.75613 | -73.977557 |
| max | 14370.0 | 40.777073 | -73.779664 |


### We now have a dataframe of venues with "pub" in their name in and around Brooklyn. Let's explore this further.

# Data Exploration and Visualization

Let's see if the name "pub" is usually associated with the Irish community in New York.

In [139]:
dataframe_filtered.name

0     Bagel Pub                         
1     Bagel Pub Park Slope              
2     The Donut Pub                     
3     Fourth Avenue Pub                 
4     O'Hara's Restaurant & Pub         
5     The Irish American Pub            
6     Black Horse Pub                   
7     Joe's Pub                         
8     Putnam's Pub & Cooker             
9     McGee's Pub                       
10    Playwright Celtic Pub             
11    Foley's NY Pub & Restaurant       
12    Connolly's Pub & Restaurant       
13    Nancy Whiskey Pub                 
14    The Bailey Pub & Brasserie        
15    Playwright Irish Pub              
16    Peculier Pub                      
17    P.J. Carney's Pub                 
18    Peter Dillon's Pub                
19    Connolly's Pub & Restaurant       
20    Tigin Irish Pub                   
21    Mulligan's Pub                    
22    O'Reilly's Irish Pub              
23    PJ Moran's Irish Pub & Restaurant 
24    Carragher’

Many of the names have Irish connotations, although not all of them do: the results also include bagel and donut shops and a music venue.
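
Because the search matches on the venue name, one way to narrow the results is to filter on the category column as well. A minimal sketch, assuming that "Pub", "Irish Pub" and "Bar" are the categories of interest (this set is an assumption for illustration); `keep_pub_like` is a hypothetical helper.

```python
# Categories treated as pub-like here are an assumption for illustration
PUB_LIKE_CATEGORIES = {'Pub', 'Irish Pub', 'Bar'}


def keep_pub_like(venues, categories=PUB_LIKE_CATEGORIES):
    """Keep venues whose primary category is in the given set."""
    return [v for v in venues if v['categories'] in categories]


# Name and category pairs copied from the filtered dataframe above
sample_venues = [
    {'name': 'Bagel Pub', 'categories': 'Bagel Shop'},
    {'name': 'The Donut Pub', 'categories': 'Donut Shop'},
    {'name': 'Fourth Avenue Pub', 'categories': 'Bar'},
    {'name': "O'Hara's Restaurant & Pub", 'categories': 'Pub'},
    {'name': "Joe's Pub", 'categories': 'Music Venue'},
]
pub_like = keep_pub_like(sample_venues)
print([v['name'] for v in pub_like])
```

On the dataframe itself, the equivalent filter would be `dataframe_filtered[dataframe_filtered['categories'].isin(PUB_LIKE_CATEGORIES)]`.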

In [140]:
# add the pubs as red circle markers to the map of Brooklyn and add the name of the pub as a label

for lat, lng, name in zip(dataframe_filtered['lat'], dataframe_filtered['lng'], dataframe_filtered['name']):
    label = '{}'.format(name)
    label = folium.Popup(label, parse_html=True)
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill = True,
        fill_color='red',
        fill_opacity=0.6
    ).add_to(map_brooklyn)
    
   
    
map_brooklyn

There are relatively few Irish pubs in Brooklyn. The Foursquare call, however, returned many in Manhattan, which is understandable.

Given how few there are, it might make sense to open an Irish pub in Brooklyn, but close to Manhattan where demand may be greater. Let's explore the Irish pub closest to Manhattan. From the map, that appears to be Putnam's Pub & Cooker in the Clinton Hill neighborhood.
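
Reading the closest venue off the map can be cross-checked numerically with a great-circle distance. A minimal sketch using the haversine formula; it assumes that the geocoded "New York City, NY" point from earlier is a reasonable stand-in for Manhattan, and a different reference point could change the ranking.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two points."""
    lat1, lng1, lat2, lng2 = map(radians, (lat1, lng1, lat2, lng2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))  # 6371 km: mean Earth radius


# Assumption: the geocoded "New York City, NY" point stands in for Manhattan
MANHATTAN_REF = (40.7127281, -74.0060152)

# Brooklyn venues and coordinates copied from the filtered dataframe
brooklyn_pubs = {
    'Fourth Avenue Pub': (40.682262, -73.979844),
    'Black Horse Pub': (40.665063, -73.989938),
    "Putnam's Pub & Cooker": (40.693209, -73.969008),
}

closest = min(brooklyn_pubs,
              key=lambda name: haversine_km(*MANHATTAN_REF, *brooklyn_pubs[name]))
print(closest)
```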

In [141]:
venue_id = '510193528302126a99141178' # ID of Putnams Pub
url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION)
url

'https://api.foursquare.com/v2/venues/510193528302126a99141178?client_id=JVX5WEXUKCOFYWJW3RHZFDYVD5RPDIEH1QNA2AFRQVHDUQU2&client_secret=OCHC0TUFJICHWRVKTJMCCSC1ZRG45HZM4QC1DDHGL4TR4IBJ&v=20180323'

In [142]:
result = requests.get(url).json()
print(result['response']['venue'].keys())
result['response']['venue']

dict_keys(['id', 'name', 'contact', 'location', 'canonicalUrl', 'categories', 'verified', 'stats', 'url', 'price', 'hasMenu', 'likes', 'dislike', 'ok', 'rating', 'ratingColor', 'ratingSignals', 'delivery', 'menu', 'allowMenuUrlEdit', 'beenHere', 'specials', 'photos', 'venuePage', 'reasons', 'page', 'hereNow', 'createdAt', 'tips', 'shortUrl', 'timeZone', 'listed', 'hours', 'popular', 'pageUpdates', 'inbox', 'venueChains', 'attributes', 'bestPhoto', 'colors'])


{'id': '510193528302126a99141178',
 'name': "Putnam's Pub & Cooker",
 'contact': {'phone': '3477992382',
  'formattedPhone': '(347) 799-2382',
  'twitter': 'putnamspub'},
 'location': {'address': '419 Myrtle Ave',
  'crossStreet': 'Clinton Ave.',
  'lat': 40.69320919801698,
  'lng': -73.9690082938726,
  'labeledLatLngs': [{'label': 'display',
    'lat': 40.69320919801698,
    'lng': -73.9690082938726}],
  'postalCode': '11205',
  'cc': 'US',
  'city': 'Brooklyn',
  'state': 'NY',
  'country': 'United States',
  'formattedAddress': ['419 Myrtle Ave (Clinton Ave.)',
   'Brooklyn, NY 11205',
   'United States']},
 'canonicalUrl': 'https://foursquare.com/v/putnams-pub--cooker/510193528302126a99141178',
 'categories': [{'id': '4bf58dd8d48988d11b941735',
   'name': 'Pub',
   'pluralName': 'Pubs',
   'shortName': 'Pub',
   'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/nightlife/pub_',
    'suffix': '.png'},
   'primary': True},
  {'id': '4bf58dd8d48988d116941735',
   'name': 'Bar

Let's see how Putnam's has been rated.

In [143]:
try:
    print(result['response']['venue']['rating'])
except KeyError:
    print('This venue has not been rated yet.')

8.4
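
The rating lookup can also be written without exception handling by chaining `dict.get`, which returns `None` when a key is absent. A minimal sketch; `get_rating` is a hypothetical helper and the unrated sample venue is invented for illustration.

```python
def get_rating(payload):
    """Return the venue rating, or None when the response has no rating."""
    return payload.get('response', {}).get('venue', {}).get('rating')


# Abbreviated stand-ins for a rated and an unrated venue response
rated = {'response': {'venue': {'name': "Putnam's Pub & Cooker", 'rating': 8.4}}}
unrated = {'response': {'venue': {'name': 'Hypothetical New Venue'}}}

for payload in (rated, unrated):
    rating = get_rating(payload)
    print(rating if rating is not None else 'This venue has not been rated yet.')
```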


Putnam's has a good rating. Let's try the second closest, Fourth Avenue Pub in Boerum Hill.

In [144]:
venue_id = '4587e9aef964a520cf3f1fe3' # ID of Fourth Avenue Pub
url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION)

result = requests.get(url).json()
try:
    print(result['response']['venue']['rating'])
except KeyError:
    print('This venue has not been rated yet.')

8.2


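A good rating also. On Foursquare, "tips" are short user reviews rather than gratuities, so the tip count gives a rough sense of how much engagement this venue attracts.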

In [145]:
# get the number of tips
result['response']['venue']['tips']['count']

101

Not bad. Let's investigate this further.

In [146]:
## Fourth Avenue Pub Tips - create the url
limit = 15 # maximum number of tips to request (the venue has 101 in total)
url = 'https://api.foursquare.com/v2/venues/{}/tips?client_id={}&client_secret={}&v={}&limit={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION, limit)

# get the url in json
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e2364cfaba297001cb8e33c'},
 'response': {'tips': {'count': 101,
   'items': [{'id': '51b78d4e498e156dd4f8eb1c',
     'createdAt': 1370983758,
     'text': 'Beer lovers belly up to the bar to munch on piping-hot popcorn and sample coast-to-coast suds. The outdoor space has plenty of picnic tables and chairs in the high-walled back garden.',
     'type': 'user',
     'url': 'http://www.timeout.com/newyork/bars/fourth-avenue-pub',
     'canonicalUrl': 'https://foursquare.com/item/51b78d4e498e156dd4f8eb1c',
     'lang': 'en',
     'likes': {'count': 2,
      'groups': [{'type': 'others',
        'count': 2,
        'items': [{'id': '488776',
          'firstName': 'Nate',
          'lastName': 'S',
          'photo': {'prefix': 'https://fastly.4sqi.net/img/user/',
           'suffix': '/488776-OTKOB0VWX2NUIHJE.jpg'}},
         {'id': '51461160',
          'firstName': 'Valih',
          'lastName': 'M',
          'photo': {'prefix': 'https://fastly.4sq

In [147]:
tips = results['response']['tips']['items']  # list of returned tip items

tips = tips[0]  # keep only the first tip to inspect its fields
tips.keys()

dict_keys(['id', 'createdAt', 'text', 'type', 'url', 'canonicalUrl', 'lang', 'likes', 'logView', 'agreeCount', 'disagreeCount', 'todo', 'user'])

In [148]:
pd.set_option('display.max_colwidth', -1)

tips_df = json_normalize(tips) # json normalize tips

# columns to keep
filtered_columns = ['text', 'agreeCount', 'disagreeCount', 'id', 'user.firstName', 'user.lastName', 'user.gender', 'user.id']
tips_filtered = tips_df.loc[:, filtered_columns]

# display tips
tips_filtered

| | text | agreeCount | disagreeCount | id | user.firstName | user.lastName | user.gender | user.id |
|---|---|---|---|---|---|---|---|---|
| 0 | Beer lovers belly up to the bar to munch on piping-hot popcorn and sample coast-to-coast suds. The outdoor space has plenty of picnic tables and chairs in the high-walled back garden. | 6 | 0 | 51b78d4e498e156dd4f8eb1c | Time Out New York | | | 742542 |
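
The cell above normalizes only the first tip, so the table has a single row. A minimal sketch of flattening every returned tip instead, assuming each item has the fields listed in `tips.keys()`; `tips_to_rows` is a hypothetical helper and the sample item is abbreviated, with its text shortened.

```python
def tips_to_rows(items):
    """Flatten tip items into dicts with a few selected fields."""
    rows = []
    for tip in items:
        user = tip.get('user', {})
        rows.append({
            'text': tip.get('text'),
            'agreeCount': tip.get('agreeCount'),
            'disagreeCount': tip.get('disagreeCount'),
            'user.firstName': user.get('firstName'),
        })
    return rows


# One item abbreviated from the response above; the text is shortened here
sample_items = [{
    'id': '51b78d4e498e156dd4f8eb1c',
    'text': 'Beer lovers belly up to the bar ...',
    'agreeCount': 6,
    'disagreeCount': 0,
    'user': {'id': '742542', 'firstName': 'Time Out New York'},
}]
tip_rows = tips_to_rows(sample_items)
print(tip_rows[0]['user.firstName'], tip_rows[0]['agreeCount'])
```

Passing the whole list, `json_normalize(results['response']['tips']['items'])`, would similarly give one row per returned tip.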


These appear to be lively spots with a regular clientele, and the two venues are relatively close to each other. If we are thinking about opening a pub in this area, we should consider what other venues are nearby.

In [149]:
# Latitude and longitude of Putnam's Pub & Cooker
latitude=40.693209
longitude=-73.969008

In [150]:
# define the url
LIMIT = 50
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, radius, LIMIT)
url

# get the url in json
results = requests.get(url).json()
'There are {} venues around Putnams Pub.'.format(len(results['response']['groups'][0]['items']))

'There are 50 venues around Putnams Pub.'

In [151]:
# get the relevant part of the JSON
items = results['response']['groups'][0]['items']
items[0]

{'reasons': {'count': 0,
  'items': [{'summary': 'This spot is popular',
    'type': 'general',
    'reasonName': 'globalInteractionReason'}]},
 'venue': {'id': '41196180f964a5200b0c1fe3',
  'name': 'Fort Greene Park',
  'contact': {},
  'location': {'address': 'Washington Park',
   'crossStreet': 'btwn Myrtle & DeKalb Ave',
   'lat': 40.69162107498688,
   'lng': -73.97564649581909,
   'labeledLatLngs': [{'label': 'display',
     'lat': 40.69162107498688,
     'lng': -73.97564649581909}],
   'distance': 587,
   'postalCode': '11217',
   'cc': 'US',
   'city': 'Brooklyn',
   'state': 'NY',
   'country': 'United States',
   'formattedAddress': ['Washington Park (btwn Myrtle & DeKalb Ave)',
    'Brooklyn, NY 11217',
    'United States']},
  'categories': [{'id': '4bf58dd8d48988d163941735',
    'name': 'Park',
    'pluralName': 'Parks',
    'shortName': 'Park',
    'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/parks_outdoors/park_',
     'suffix': '.png'},
    'primary': True}

In [152]:
# create a clean dataframe
dataframe = json_normalize(items) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories'] + [col for col in dataframe.columns if col.startswith('venue.location.')] + ['venue.id']
dataframe_filtered = dataframe.loc[:, filtered_columns]

# filter the category for each row
dataframe_filtered['venue.categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean columns
dataframe_filtered.columns = [col.split('.')[-1] for col in dataframe_filtered.columns]

dataframe_filtered.head(10)

Unnamed: 0,name,categories,address,cc,city,country,crossStreet,distance,formattedAddress,labeledLatLngs,lat,lng,neighborhood,postalCode,state,id
0,Fort Greene Park,Park,Washington Park,US,Brooklyn,United States,btwn Myrtle & DeKalb Ave,587,"[Washington Park (btwn Myrtle & DeKalb Ave), Brooklyn, NY 11217, United States]","[{'label': 'display', 'lat': 40.69162107498688, 'lng': -73.97564649581909}]",40.691621,-73.975646,,11217,NY,41196180f964a5200b0c1fe3
1,Evelina Restaurant,Italian Restaurant,211 Dekalb Ave,US,Brooklyn,United States,Adelphi,433,"[211 Dekalb Ave (Adelphi), Brooklyn, NY 11205, United States]","[{'label': 'display', 'lat': 40.689629, 'lng': -73.971018}]",40.689629,-73.971018,,11205,NY,5a00f3538496ca58fd9e2856
2,Mekelburg's,Gourmet Shop,293 Grand Ave,US,Brooklyn,United States,,841,"[293 Grand Ave, Brooklyn, NY 11238, United States]","[{'label': 'display', 'lat': 40.687570695832704, 'lng': -73.96237002609244}]",40.687571,-73.96237,Clinton Hill,11238,NY,557dae53498ed2645c3ebbc0
3,Greenlight Bookstore,Bookstore,686 Fulton St,US,Brooklyn,United States,at S Portland Ave,895,"[686 Fulton St (at S Portland Ave), Brooklyn, NY 11217, United States]","[{'label': 'display', 'lat': 40.68635635950454, 'lng': -73.97456031947704}]",40.686356,-73.97456,,11217,NY,4ad937aef964a520281921e3
4,BAM Harvey Theater,Theater,651 Fulton St,US,Brooklyn,United States,at Rockwell Pl,967,"[651 Fulton St (at Rockwell Pl), Brooklyn, NY 11217, United States]","[{'label': 'display', 'lat': 40.68858644784374, 'lng': -73.978708667925}]",40.688586,-73.978709,,11217,NY,49b6a827f964a5200e531fe3
5,BAM Rose Cinemas,Indie Movie Theater,30 Lafayette Ave,US,Brooklyn,United States,btwn Ashland Pl & St. Felix St,1044,"[30 Lafayette Ave (btwn Ashland Pl & St. Felix St), Brooklyn, NY 11217, United States]","[{'label': 'display', 'lat': 40.68633844980032, 'lng': -73.97743839568813}]",40.686338,-73.977438,,11217,NY,439ab2a9f964a520d12b1fe3
6,Dough,Donut Shop,305 Franklin Ave,US,Brooklyn,United States,at Lafayette Ave,1116,"[305 Franklin Ave (at Lafayette Ave), Brooklyn, NY 11238, United States]","[{'label': 'display', 'lat': 40.689042, 'lng': -73.956978}]",40.689042,-73.956978,,11238,NY,4cf8195ece1da1cd89ecc5e1
7,Alamo Drafthouse Cinema,Movie Theater,445 Albee Square West,US,Brooklyn,United States,Fulton St.,1262,"[445 Albee Square West (Fulton St.), Brooklyn, NY 11201, United States]","[{'label': 'display', 'lat': 40.69101558292192, 'lng': -73.98368571874677}]",40.691016,-73.983686,Downtown Brooklyn,11201,NY,5722dcad498e1dc59a10bca0
8,Ample Hills Creamery,Ice Cream Shop,623 Vanderbilt Ave,US,Brooklyn,United States,at St Marks Ave,1632,"[623 Vanderbilt Ave (at St Marks Ave), Brooklyn, NY 11238, United States]","[{'label': 'display', 'lat': 40.678546994225385, 'lng': -73.9684545270517}]",40.678547,-73.968455,,11238,NY,4dc16497d16455f8322791bb
9,Gotham Archery,Athletics & Sports,480 Baltic St,US,Brooklyn,United States,,1866,"[480 Baltic St, Brooklyn, NY 11217, United States]","[{'label': 'display', 'lat': 40.68250448326891, 'lng': -73.98603206203093}]",40.682504,-73.986032,Boerum Hill,11217,NY,5391fcfc498eae4bad6e344f


In [153]:
# visualize the items on the map
venues_map = folium.Map(location=[latitude, longitude], zoom_start=15) # generate map centred around Putnams Pub


# add Putnams Pub as a red circle mark
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    popup='Putnams Pub',
    fill=True,
    color='red',
    fill_color='red',
    fill_opacity=0.6
    ).add_to(venues_map)


# add popular spots to the map as green circle markers
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        fill=True,
        color='green',
        fill_color='green',
        fill_opacity=0.6
        ).add_to(venues_map)

# display map
venues_map

This definitely looks like a busy area with lots of venues. Let's see what is trending.

We want to look at the foot traffic around Putnams Pub and get the trending venues.

In [154]:
# define URL
url = 'https://api.foursquare.com/v2/venues/trending?client_id={}&client_secret={}&ll={},{}&v={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION)

# send GET request and get trending venues
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e236537211536001cd272db'},
 'response': {'venues': [{'id': '5722dcad498e1dc59a10bca0',
    'name': 'Alamo Drafthouse Cinema',
    'contact': {},
    'location': {'address': '445 Albee Square West',
     'crossStreet': 'Fulton St.',
     'lat': 40.69101558292192,
     'lng': -73.98368571874677,
     'labeledLatLngs': [{'label': 'display',
       'lat': 40.69101558292192,
       'lng': -73.98368571874677}],
     'distance': 1262,
     'postalCode': '11201',
     'cc': 'US',
     'neighborhood': 'Downtown Brooklyn',
     'city': 'Brooklyn',
     'state': 'NY',
     'country': 'United States',
     'formattedAddress': ['445 Albee Square West (Fulton St.)',
      'Brooklyn, NY 11201',
      'United States']},
    'categories': [{'id': '4bf58dd8d48988d17f941735',
      'name': 'Movie Theater',
      'pluralName': 'Movie Theaters',
      'shortName': 'Movie Theater',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/arts_entertainment/movie

In [155]:
# check if there are any venues trending at this time.
if len(results['response']['venues']) == 0:
    trending_venues_df = 'No trending venues are available at the moment!'
    
else:
    trending_venues = results['response']['venues']
    trending_venues_df = json_normalize(trending_venues)

    # filter columns
    columns_filtered = ['name', 'categories'] + ['location.distance', 'location.city', 'location.postalCode', 'location.state', 'location.country', 'location.lat', 'location.lng']
    trending_venues_df = trending_venues_df.loc[:, columns_filtered]

    # filter the category for each row
    trending_venues_df['categories'] = trending_venues_df.apply(get_category_type, axis=1)

In [159]:
# display trending venues
trending_venues_df

Unnamed: 0,name,categories,location.distance,location.city,location.postalCode,location.state,location.country,location.lat,location.lng
0,Alamo Drafthouse Cinema,Movie Theater,1262,Brooklyn,11201,NY,United States,40.691016,-73.983686


In [160]:
# let's visualize the trending venues
if len(results['response']['venues']) == 0:
    venues_map = 'Cannot generate visual as no trending venues are available at the moment!'

else:
    venues_map = folium.Map(location=[latitude, longitude], zoom_start=15) # generate map centred around Putnams

    # add Putnams as a red circle mark
    folium.features.CircleMarker(
        [latitude, longitude],
        radius=10,
        popup='Putnams Pub',
        fill=True,
        color='red',
        fill_color='red',
        fill_opacity=0.6
    ).add_to(venues_map)


    # add the trending venues as blue circle markers
    for lat, lng, label in zip(trending_venues_df['location.lat'], trending_venues_df['location.lng'], trending_venues_df['name']):
        folium.features.CircleMarker(
            [lat, lng],
            radius=5,
            popup=label,
            fill=True,
            color='blue',
            fill_color='blue',
            fill_opacity=0.6
        ).add_to(venues_map)
        
# display map
venues_map

From the Foursquare data we have established that the Boerum Hill and Clinton Hill neighborhoods of Brooklyn are the busiest for nightlife, and that Irish pubs already do a busy trade there. Opening in one of these areas would seem a good option. Other neighborhoods in Brooklyn have few Irish pubs, which suggests they would not be good locations. However, before we decide to invest we should also look at the other venues in those neighborhoods to see what type of nightlife and social venues they have.

# Venues in Brooklyn

In this section we use Foursquare to look at the venues in each Brooklyn neighborhood.

In [161]:
#This is a function to create the venues list.

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
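
The list comprehension in `getNearbyVenues` assumes that every response contains `groups[0]['items']` and that every venue has at least one category. As a hedged sketch (not the notebook's own code), the parsing step could be pulled into a helper that tolerates missing keys. The helper name `extract_venue_rows` and the sample payload are hypothetical; the field names mirror those used in the function above.

```python
def extract_venue_rows(payload, name, lat, lng):
    """Flatten one explore-style response into row tuples.

    Missing keys yield None (or an empty list) instead of raising.
    """
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []

    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories", [])
        # use the first category name when one is present
        category = categories[0]["name"] if categories else None
        rows.append((
            name,
            lat,
            lng,
            venue.get("name"),
            location.get("lat"),
            location.get("lng"),
            category,
        ))
    return rows


# hypothetical payload shaped like the responses parsed above
sample_payload = {
    "response": {
        "groups": [{
            "items": [
                {"venue": {
                    "name": "Bagel Boy",
                    "location": {"lat": 40.627896, "lng": -74.029335},
                    "categories": [{"name": "Bagel Shop"}],
                }},
                # a venue with no location and no categories
                {"venue": {"name": "Unnamed Spot"}},
            ]
        }]
    }
}

rows = extract_venue_rows(sample_payload, "Bay Ridge", 40.625801, -74.030621)
for row in rows:
    print(row)
```

With a helper like this, one neighborhood with an empty or malformed response would contribute no rows rather than stopping the whole loop.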

In [162]:
#Iterate through the Brooklyn dataframe to extract the venues list.

brooklyn_venues = getNearbyVenues(names=brooklyn_data['Neighborhood'],
                                   latitudes=brooklyn_data['Latitude'],
                                   longitudes=brooklyn_data['Longitude']
                                  )


Bay Ridge
Bensonhurst
Sunset Park
Greenpoint
Gravesend
Brighton Beach
Sheepshead Bay
Manhattan Terrace
Flatbush
Crown Heights
East Flatbush
Kensington
Windsor Terrace
Prospect Heights
Brownsville
Williamsburg
Bushwick
Bedford Stuyvesant
Brooklyn Heights
Cobble Hill
Carroll Gardens
Red Hook
Gowanus
Fort Greene
Park Slope
Cypress Hills
East New York
Starrett City
Canarsie
Flatlands
Mill Island
Manhattan Beach
Coney Island
Bath Beach
Borough Park
Dyker Heights
Gerritsen Beach
Marine Park
Clinton Hill
Sea Gate
Downtown
Boerum Hill
Prospect Lefferts Gardens
Ocean Hill
City Line
Bergen Beach
Midwood
Prospect Park South
Georgetown
East Williamsburg
North Side
South Side
Ocean Parkway
Fort Hamilton
Ditmas Park
Wingate
Rugby
Remsen Village
New Lots
Paerdegat Basin
Mill Basin
Fulton Ferry
Vinegar Hill
Weeksville
Broadway Junction
Dumbo
Homecrest
Highland Park
Madison
Erasmus


In [164]:
print(brooklyn_venues.shape)
brooklyn_venues.head()

(2114, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Bay Ridge,40.625801,-74.030621,Pilo Arts Day Spa and Salon,40.624748,-74.030591,Spa
1,Bay Ridge,40.625801,-74.030621,Bagel Boy,40.627896,-74.029335,Bagel Shop
2,Bay Ridge,40.625801,-74.030621,Cocoa Grinder,40.623967,-74.030863,Juice Bar
3,Bay Ridge,40.625801,-74.030621,Pegasus Cafe,40.623168,-74.031186,Breakfast Spot
4,Bay Ridge,40.625801,-74.030621,Ho' Brah Taco Joint,40.62296,-74.031371,Taco Place


We now have a dataframe of all venues in Brooklyn. Let's see how many venues there are in each neighborhood.

In [169]:
brooklyn_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Bath Beach,47,47,47,47,47,47
Bay Ridge,50,50,50,50,50,50
Bedford Stuyvesant,25,25,25,25,25,25
Bensonhurst,32,32,32,32,32,32
Bergen Beach,7,7,7,7,7,7
Boerum Hill,50,50,50,50,50,50
Borough Park,22,22,22,22,22,22
Brighton Beach,45,45,45,45,45,45
Broadway Junction,13,13,13,13,13,13
Brooklyn Heights,50,50,50,50,50,50


We can see that Boerum Hill and Clinton Hill both have 50 venues, whereas many other neighborhoods have far fewer. Note that Foursquare limits the results to 50 unless you have a paid account, so neighborhoods shown with 50 venues in all likelihood have more. For the purposes of this report, however, it is sufficient to see which neighborhoods have the greater number of venues to help us decide where to open our pub.
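
Because a count equal to the request limit may be truncated, it can help to flag those neighborhoods explicitly. The following is a minimal sketch on toy data; `LIMIT = 50` is an assumption matching the counts shown above, and the `toy_venues` frame and `hit_limit` column are hypothetical stand-ins.

```python
import pandas as pd

LIMIT = 50  # assumed to match the limit used in the explore request

# toy stand-in for brooklyn_venues: one row per returned venue
toy_venues = pd.DataFrame({
    "Neighborhood": ["Boerum Hill"] * 50 + ["Bergen Beach"] * 7,
    "Venue": [f"venue {i}" for i in range(57)],
})

# number of returned venues per neighborhood
counts = toy_venues.groupby("Neighborhood").size()

# a count at the limit is a lower bound, not an exact figure
summary = pd.DataFrame({"venues": counts, "hit_limit": counts >= LIMIT})
print(summary)
```

Neighborhoods with `hit_limit` set to `True` should be read as "at least 50 venues" when comparing areas.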

We now want to analyse the neighborhoods in more detail. We cannot explore every venue in Brooklyn individually, so we are going to use a k-means clustering algorithm to create segments of Brooklyn based on the number of venues, so that we can further establish which neighborhoods are the busiest.
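
As an illustration of the clustering idea, here is a minimal sketch that runs `KMeans` on a small, made-up table of per-neighborhood category frequencies. The neighborhood names, the two feature columns, and `n_clusters=2` are assumptions chosen for the example, not values from this report.

```python
import pandas as pd
from sklearn.cluster import KMeans

# hypothetical per-neighborhood category frequencies
grouped = pd.DataFrame({
    "Neighborhood": ["A", "B", "C", "D"],
    "Bar": [0.30, 0.28, 0.02, 0.00],
    "Pharmacy": [0.00, 0.02, 0.25, 0.30],
})

# cluster on the numeric columns only
features = grouped.drop(columns="Neighborhood")
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)

# attach the cluster label to each neighborhood
grouped["Cluster"] = kmeans.labels_
print(grouped[["Neighborhood", "Cluster"]])
```

Neighborhoods with a similar venue mix end up with the same label; the label numbers themselves carry no meaning.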

In [170]:
# one hot encoding
brooklyn_onehot = pd.get_dummies(brooklyn_venues[['Venue Category']], prefix="", prefix_sep="")

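# add neighborhood column back to dataframe
# (Foursquare also has a venue category named 'Neighborhood', so this assignment
# overwrites that dummy column in place instead of appending a new last column)
brooklyn_onehot['Neighborhood'] = brooklyn_venues['Neighborhood']

# move the last column to the first position; because of the overwrite above the
# last column is 'Yoga Studio', so 'Neighborhood' stays where it was in the output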
fixed_columns = [brooklyn_onehot.columns[-1]] + list(brooklyn_onehot.columns[:-1])
brooklyn_onehot = brooklyn_onehot[fixed_columns]

brooklyn_onehot.head()

Unnamed: 0,Yoga Studio,Accessories Store,African Restaurant,American Restaurant,Antique Shop,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Auto Dealership,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Court,Beach,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bistro,Board Shop,Boat or Ferry,Bookstore,Boutique,Boxing Gym,Breakfast Spot,Brewery,Bridal Shop,Bridge,Bubble Tea Shop,Buffet,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Café,Cajun / Creole Restaurant,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Caucasian Restaurant,Cha Chaan Teng,Cheese Shop,Child Care Service,Chinese Restaurant,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Cuban Restaurant,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distillery,Dive Bar,Dog Run,Donut Shop,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Service,Event Space,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gymnastics Gym,Harbor / Marina,Hardware Store,Health & Beauty Service,Health Food Store,History Museum,Hobby Shop,Hockey Field,Home Service,Hookah Bar,Hostel,Hotel,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Intersection,Israeli 
Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Korean Restaurant,Latin American Restaurant,Laundry Service,Lingerie Store,Liquor Store,Lounge,Market,Martial Arts Dojo,Massage Studio,Mattress Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Monument / Landmark,Motorcycle Shop,Movie Theater,Moving Target,Museum,Music Store,Music Venue,Nail Salon,Neighborhood,New American Restaurant,Nightclub,Non-Profit,Noodle House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Nightlife,Other Repair Shop,Outdoors & Recreation,Outlet Store,Pakistani Restaurant,Paper / Office Supplies Store,Park,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Pie Shop,Piercing Parlor,Pilates Studio,Pizza Place,Playground,Plaza,Poke Place,Polish Restaurant,Pool,Pub,Racetrack,Ramen Restaurant,Record Shop,Rental Car Location,Residential Building (Apartment / Condo),Restaurant,Rock Club,Roof Deck,Russian Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shabu-Shabu Restaurant,Shipping Store,Shoe Store,Shopping Mall,Skating Rink,Smoke Shop,Soccer Field,South American Restaurant,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Supermarket,Supplement Shop,Surf Spot,Sushi Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Tattoo Parlor,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Thrift / Vintage Store,Tibetan Restaurant,Tiki Bar,Toy / Game Store,Trail,Turkish Restaurant,Used Bookstore,Vape Store,Varenyky restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Bay Ridge,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Bay Ridge,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Bay Ridge,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Bay Ridge,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Bay Ridge,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [171]:
brooklyn_onehot.shape

(2114, 267)
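
In the one-hot output above, the `Neighborhood` values sit in the middle of the table and `Yoga Studio` is the first column. This happens because one of the venue categories is itself called `Neighborhood`. A hedged sketch of one way to avoid the clash, on toy data: rename the colliding dummy column first, then insert the neighborhood name at position 0. The replacement label `Neighborhood (category)` is a hypothetical choice.

```python
import pandas as pd

# toy stand-in for brooklyn_venues, including the colliding category
venues = pd.DataFrame({
    "Neighborhood": ["Bay Ridge", "Bay Ridge", "Boerum Hill"],
    "Venue Category": ["Spa", "Neighborhood", "Bar"],
})

# one dummy column per venue category
onehot = pd.get_dummies(venues["Venue Category"], dtype=int)

# keep the 'Neighborhood' category under a different, hypothetical label
onehot = onehot.rename(columns={"Neighborhood": "Neighborhood (category)"})

# insert the neighborhood name as the first column
onehot.insert(0, "Neighborhood", venues["Neighborhood"])
print(onehot.columns.tolist())
```

This keeps the category information that the in-place overwrite would discard and puts the neighborhood name first without reordering columns by position.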

In [172]:
brooklyn_grouped = brooklyn_onehot.groupby('Neighborhood').mean().reset_index()
brooklyn_grouped

Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,African Restaurant,American Restaurant,Antique Shop,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Auto Dealership,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Court,Beach,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bistro,Board Shop,Boat or Ferry,Bookstore,Boutique,Boxing Gym,Breakfast Spot,Brewery,Bridal Shop,Bridge,Bubble Tea Shop,Buffet,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Café,Cajun / Creole Restaurant,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Caucasian Restaurant,Cha Chaan Teng,Cheese Shop,Child Care Service,Chinese Restaurant,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Cuban Restaurant,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distillery,Dive Bar,Dog Run,Donut Shop,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Service,Event Space,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gymnastics Gym,Harbor / Marina,Hardware Store,Health & Beauty Service,Health Food Store,History Museum,Hobby Shop,Hockey Field,Home Service,Hookah Bar,Hostel,Hotel,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Intersection,Israeli 
Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Korean Restaurant,Latin American Restaurant,Laundry Service,Lingerie Store,Liquor Store,Lounge,Market,Martial Arts Dojo,Massage Studio,Mattress Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Monument / Landmark,Motorcycle Shop,Movie Theater,Moving Target,Museum,Music Store,Music Venue,Nail Salon,New American Restaurant,Nightclub,Non-Profit,Noodle House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Nightlife,Other Repair Shop,Outdoors & Recreation,Outlet Store,Pakistani Restaurant,Paper / Office Supplies Store,Park,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Pie Shop,Piercing Parlor,Pilates Studio,Pizza Place,Playground,Plaza,Poke Place,Polish Restaurant,Pool,Pub,Racetrack,Ramen Restaurant,Record Shop,Rental Car Location,Residential Building (Apartment / Condo),Restaurant,Rock Club,Roof Deck,Russian Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shabu-Shabu Restaurant,Shipping Store,Shoe Store,Shopping Mall,Skating Rink,Smoke Shop,Soccer Field,South American Restaurant,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Supermarket,Supplement Shop,Surf Spot,Sushi Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Tattoo Parlor,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Thrift / Vintage Store,Tibetan Restaurant,Tiki Bar,Toy / Game Store,Trail,Turkish Restaurant,Used Bookstore,Vape Store,Varenyky restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,Bath Beach,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.021277,0.021277,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.042553,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.06383,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.021277,0.021277,0.0,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.021277,0.0,0.021277,0.0,0.06383,0.0,0.0,0.0,0.06383,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.042553,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277
1,Bay Ridge,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.06,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0
2,Bedford Stuyvesant,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.04,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.04,0.0,0.0
3,Bensonhurst,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.09375,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.03125,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0625,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Bergen Beach,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Boerum Hill,0.04,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.06,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.06,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0
6,Borough Park,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.136364,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Brighton Beach,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.044444,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.022222,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.022222,0.0,0.044444,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Broadway Junction,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.153846,0.0,0.0,0.0,0.0,0.153846,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Brooklyn Heights,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.04,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.0,0.04,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.04,0.02,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.04,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.06,0.0,0.0,0.04,0.0,0.0,0.0,0.02,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0


We now have a dataset of one-hot encoded venue categories, averaged by neighborhood, so each value is the share of a neighborhood's venues that fall into a given category. From this we can start to see which neighborhoods in Brooklyn have the most venues and are therefore likely to be the busiest.
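To make the meaning of those values concrete, here is a minimal sketch of how per-neighborhood frequencies arise from one-hot encoding followed by a group-wise mean. The venue list is invented for illustration and the column name `Venue Category` is an assumption, not necessarily the name used earlier in this notebook.

```python
import pandas as pd

# Hypothetical venue list (one row per venue); not real Foursquare results.
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Bar", "Bar", "Coffee Shop", "Pizza Place", "Pool", "Pool"],
})

# One-hot encode the category column; dtype=float keeps the columns numeric.
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])

# The mean of each 0/1 column is the share of venues in that category.
grouped = onehot.groupby("Neighborhood").mean().reset_index()
print(grouped)
```

In this toy data, half of neighborhood A's venues are bars, so its `Bar` value is 0.5, while every venue in B is a pool, giving a `Pool` value of 1.0 (the same pattern seen for Mill Island).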

Let's see which types of venue are most common in each area. This lets us see which areas might be busiest for nightlife, as opposed to other types of venue.

In [173]:
num_top_venues = 5

for hood in brooklyn_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = brooklyn_grouped[brooklyn_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bath Beach----
                venue  freq
0  Pharmacy            0.06
1  Chinese Restaurant  0.06
2  Pizza Place         0.06
3  Donut Shop          0.04
4  Italian Restaurant  0.04


----Bay Ridge----
                venue  freq
0  Spa                 0.10
1  Italian Restaurant  0.08
2  Greek Restaurant    0.06
3  Hookah Bar          0.04
4  Chinese Restaurant  0.04


----Bedford Stuyvesant----
            venue  freq
0  Café            0.08
1  Coffee Shop     0.08
2  Pizza Place     0.08
3  Bar             0.08
4  Discount Store  0.04


----Bensonhurst----
                venue  freq
0  Chinese Restaurant  0.09
1  Italian Restaurant  0.06
2  Ice Cream Shop      0.06
3  Grocery Store       0.06
4  Bakery              0.06


----Bergen Beach----
                venue  freq
0  Harbor / Marina     0.29
1  Baseball Field      0.14
2  Athletics & Sports  0.14
3  Playground          0.14
4  Hockey Field        0.14


----Boerum Hill----
                    venue  freq
0  Bar           

                venue  freq
0  Deli / Bodega       0.2 
1  Athletics & Sports  0.1 
2  Pizza Place         0.1 
3  Chinese Restaurant  0.1 
4  Soccer Field        0.1 


----Midwood----
              venue  freq
0  Pizza Place       0.31
1  Ice Cream Shop    0.08
2  Video Game Store  0.08
3  Candy Store       0.08
4  Field             0.08


----Mill Basin----
                 venue  freq
0  Chinese Restaurant   0.11
1  Pizza Place          0.08
2  Japanese Restaurant  0.06
3  Grocery Store        0.06
4  Cosmetics Shop       0.06


----Mill Island----
               venue  freq
0  Pool               1.0 
1  Other Repair Shop  0.0 
2  Movie Theater      0.0 
3  Moving Target      0.0 
4  Museum             0.0 


----New Lots----
                 venue  freq
0  Fried Chicken Joint  0.12
1  Pharmacy             0.12
2  Pizza Place          0.12
3  Park                 0.06
4  Salon / Barbershop   0.06


----North Side----
                           venue  freq
0  Yoga Studio            

In [181]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
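Many categories share the same frequency within a neighborhood, and a plain `sort_values` does not promise a particular order among ties, which may be why the order of tied categories can differ between printouts. The following is a sketch of one possible variant that breaks ties alphabetically. The function name `top_venues_deterministic` is hypothetical, and the sample frequencies are the rounded Bath Beach values printed earlier.

```python
import pandas as pd

def top_venues_deterministic(freqs, num_top_venues):
    """Return the top categories, breaking frequency ties alphabetically."""
    table = freqs.rename_axis("venue").reset_index(name="freq")
    # Sort by frequency (descending), then by category name (ascending).
    table = table.sort_values(["freq", "venue"], ascending=[False, True])
    return table["venue"].head(num_top_venues).tolist()

# Rounded frequencies taken from the Bath Beach printout.
bath_beach = pd.Series({
    "Pharmacy": 0.06,
    "Chinese Restaurant": 0.06,
    "Pizza Place": 0.06,
    "Donut Shop": 0.04,
})
print(top_venues_deterministic(bath_beach, 3))
```

Whether alphabetical order is the right tie-break is a judgment call; the point is only that the ranking becomes reproducible.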

In [184]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = brooklyn_grouped['Neighborhood']

for ind in np.arange(brooklyn_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(brooklyn_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bath Beach,Pizza Place,Chinese Restaurant,Pharmacy,Donut Shop,Fast Food Restaurant,Bubble Tea Shop,Italian Restaurant,Sushi Restaurant,Women's Store,Coffee Shop
1,Bay Ridge,Spa,Italian Restaurant,Greek Restaurant,Hookah Bar,Grocery Store,American Restaurant,Pizza Place,Chinese Restaurant,Ice Cream Shop,Sandwich Place
2,Bedford Stuyvesant,Bar,Café,Pizza Place,Coffee Shop,Cocktail Bar,Bus Station,Boutique,Fried Chicken Joint,New American Restaurant,Juice Bar
3,Bensonhurst,Chinese Restaurant,Italian Restaurant,Donut Shop,Grocery Store,Bakery,Ice Cream Shop,Sushi Restaurant,Flower Shop,Shabu-Shabu Restaurant,Supermarket
4,Bergen Beach,Harbor / Marina,Playground,Donut Shop,Athletics & Sports,Baseball Field,Hockey Field,Filipino Restaurant,Field,Fast Food Restaurant,Fish & Chips Shop
5,Boerum Hill,Bar,Coffee Shop,Furniture / Home Store,Yoga Studio,Spa,Sandwich Place,French Restaurant,Music Venue,Burrito Place,Seafood Restaurant
6,Borough Park,Bank,Pizza Place,Pharmacy,Fast Food Restaurant,Deli / Bodega,Coffee Shop,Restaurant,Eastern European Restaurant,Optical Shop,Café
7,Brighton Beach,Eastern European Restaurant,Beach,Russian Restaurant,Restaurant,Gourmet Shop,Bank,Mobile Phone Shop,Sushi Restaurant,Pharmacy,Other Great Outdoors
8,Broadway Junction,Donut Shop,Diner,Caribbean Restaurant,Burger Joint,Fried Chicken Joint,Gas Station,Metro Station,Ice Cream Shop,Deli / Bodega,Hotel
9,Brooklyn Heights,Yoga Studio,Park,Gym,Ice Cream Shop,Pet Store,Italian Restaurant,Deli / Bodega,Mexican Restaurant,Bakery,Optical Shop
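The column labels in this table rely on a three-element suffix list with a fallback to "th", which is sufficient for ten columns. If more columns were ever needed, a small helper along these lines could generate the labels. The name `ordinal` is hypothetical and not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        # 11th, 12th and 13th are the usual exceptions.
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

num_top_venues = 10
columns = ["Neighborhood"] + [
    f"{ordinal(i)} Most Common Venue" for i in range(1, num_top_venues + 1)
]
print(columns)
```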


It is interesting that the most common venue type in Boerum Hill is "Bar", whereas in Clinton Hill it is restaurants.

Next we cluster the neighborhoods, so that we can see which groups of neighborhoods are similar in their mix of venues and which of them are busiest for nightlife.

# Run k-means to cluster the neighborhoods

In [178]:

kclusters = 4

# drop the label column so that only the numeric frequencies are clustered
brooklyn_grouped_clustering = brooklyn_grouped.drop(columns='Neighborhood')

kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(brooklyn_grouped_clustering)

kmeans.labels_[0:10] 

array([1, 2, 2, 1, 1, 2, 1, 1, 1, 2], dtype=int32)
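The number of clusters is fixed by hand here. One common sanity check, sketched below, is to compare the inertia and silhouette score for several values of k. This is an illustration on synthetic data standing in for `brooklyn_grouped_clustering`, not the method used in this report, and the range of k values is an arbitrary assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the frequency matrix: two well-separated groups.
rng = np.random.RandomState(0)
features = np.vstack([
    rng.normal(loc=0.0, scale=0.05, size=(20, 3)),
    rng.normal(loc=1.0, scale=0.05, size=(20, 3)),
])

scores = {}
for k in range(2, 6):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(features)
    # Lower inertia is tighter clusters; higher silhouette is better separation.
    scores[k] = (model.inertia_, silhouette_score(features, model.labels_))

for k, (inertia, silhouette) in scores.items():
    print(f"k={k}: inertia={inertia:.3f}, silhouette={silhouette:.3f}")

# Pick the k with the highest silhouette score.
best_k = max(scores, key=lambda k: scores[k][1])
print("best k by silhouette:", best_k)
```

On real venue data the scores are rarely this clear-cut, so they are best read alongside the cluster tables rather than as a definitive answer.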

In [185]:

neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

brooklyn_merged = brooklyn_data


brooklyn_merged = brooklyn_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

brooklyn_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Brooklyn,Bay Ridge,40.625801,-74.030621,2,Spa,Italian Restaurant,Greek Restaurant,Hookah Bar,Grocery Store,American Restaurant,Pizza Place,Chinese Restaurant,Ice Cream Shop,Sandwich Place
1,Brooklyn,Bensonhurst,40.611009,-73.99518,1,Chinese Restaurant,Italian Restaurant,Donut Shop,Grocery Store,Bakery,Ice Cream Shop,Sushi Restaurant,Flower Shop,Shabu-Shabu Restaurant,Supermarket
2,Brooklyn,Sunset Park,40.645103,-74.010316,1,Bank,Latin American Restaurant,Pizza Place,Bakery,Mexican Restaurant,Fried Chicken Joint,Gym,Mobile Phone Shop,Breakfast Spot,Supplement Shop
3,Brooklyn,Greenpoint,40.730201,-73.954241,2,Coffee Shop,Bar,Cocktail Bar,Pizza Place,Mexican Restaurant,Café,Boutique,Yoga Studio,Gymnastics Gym,Sushi Restaurant
4,Brooklyn,Gravesend,40.59526,-73.973471,1,Pizza Place,Italian Restaurant,Lounge,Bakery,Bus Station,Eastern European Restaurant,Baseball Field,Bar,Chinese Restaurant,Donut Shop
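`DataFrame.join` performs a left join by default, so any neighborhood that has no row in the clustered table would receive `NaN` in the new columns, and the `Cluster Labels` column would become floating point. A minimal sketch of checking for and handling that case follows. The two small frames are hypothetical stand-ins for `brooklyn_data` and `neighborhoods_venues_sorted`, and the "No Venues Found" row is invented.

```python
import pandas as pd

# Hypothetical miniature stand-ins for the notebook's dataframes.
neighborhoods = pd.DataFrame({
    "Neighborhood": ["Bay Ridge", "Bensonhurst", "No Venues Found"],
    "Latitude": [40.625801, 40.611009, 40.0],
    "Longitude": [-74.030621, -73.99518, -74.0],
})
clustered = pd.DataFrame({
    "Cluster Labels": [2, 1],
    "Neighborhood": ["Bay Ridge", "Bensonhurst"],
})

# Left join on the neighborhood name, as in the cell above.
merged = neighborhoods.join(clustered.set_index("Neighborhood"), on="Neighborhood")

# Count unmatched rows, then keep only rows that can be plotted.
missing = int(merged["Cluster Labels"].isna().sum())
plottable = merged.dropna(subset=["Cluster Labels"]).copy()
plottable["Cluster Labels"] = plottable["Cluster Labels"].astype(int)
print(missing, plottable["Cluster Labels"].tolist())
```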


These are our four clusters (labelled 0 to 3), so let's visualise them on a map.

In [188]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one color per cluster label
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add one circle marker per neighborhood, colored by its cluster label
for lat, lon, poi, cluster in zip(brooklyn_merged['Latitude'], brooklyn_merged['Longitude'], brooklyn_merged['Neighborhood'], brooklyn_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)],
        fill=True,
        fill_color=rainbow[int(cluster)],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

Now we can examine each cluster. Clusters 0 and 3 are clearly outliers containing very few neighborhoods, and no neighborhood is assigned to cluster 4 because k-means was run with four clusters. Our proposed locations for a pub were Boerum Hill and Clinton Hill, both of which are in cluster 2. Much of Brooklyn falls into cluster 1. Let's examine each cluster in more detail to establish its characteristics.
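A quick count of neighborhoods per cluster makes outlier clusters easy to spot before reading the tables. The sketch below uses only the first ten labels printed by `kmeans.labels_[0:10]` earlier, so the counts are illustrative rather than totals for the whole borough. The threshold of two neighborhoods for a "small" cluster is an arbitrary assumption.

```python
import pandas as pd

# First ten labels as printed by kmeans.labels_[0:10]; not the full label array.
labels = pd.Series([1, 2, 2, 1, 1, 2, 1, 1, 1, 2], name="Cluster Labels")

# Number of neighborhoods assigned to each cluster, ordered by label.
cluster_sizes = labels.value_counts().sort_index()
print(cluster_sizes)

# Flag clusters at or below an assumed size threshold as potential outliers.
small_clusters = cluster_sizes[cluster_sizes <= 2].index.tolist()
print("small clusters:", small_clusters)
```

In the notebook itself, the same count could be obtained with `brooklyn_merged['Cluster Labels'].value_counts()`.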

In [189]:
brooklyn_merged.loc[brooklyn_merged['Cluster Labels'] == 0, brooklyn_merged.columns[[1] + list(range(5, brooklyn_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
30,Mill Island,Pool,Women's Store,Farm,Electronics Store,Ethiopian Restaurant,Event Service,Event Space,Factory,Falafel Restaurant,Farmers Market


In [190]:
brooklyn_merged.loc[brooklyn_merged['Cluster Labels'] == 1, brooklyn_merged.columns[[1] + list(range(5, brooklyn_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Bensonhurst,Chinese Restaurant,Italian Restaurant,Donut Shop,Grocery Store,Bakery,Ice Cream Shop,Sushi Restaurant,Flower Shop,Shabu-Shabu Restaurant,Supermarket
2,Sunset Park,Bank,Latin American Restaurant,Pizza Place,Bakery,Mexican Restaurant,Fried Chicken Joint,Gym,Mobile Phone Shop,Breakfast Spot,Supplement Shop
4,Gravesend,Pizza Place,Italian Restaurant,Lounge,Bakery,Bus Station,Eastern European Restaurant,Baseball Field,Bar,Chinese Restaurant,Donut Shop
5,Brighton Beach,Eastern European Restaurant,Beach,Russian Restaurant,Restaurant,Gourmet Shop,Bank,Mobile Phone Shop,Sushi Restaurant,Pharmacy,Other Great Outdoors
7,Manhattan Terrace,Donut Shop,Pizza Place,Convenience Store,Ice Cream Shop,Japanese Restaurant,Organic Grocery,Steakhouse,Mobile Phone Shop,Jazz Club,Bank
8,Flatbush,Mexican Restaurant,Deli / Bodega,Coffee Shop,Chinese Restaurant,Caribbean Restaurant,Bagel Shop,Pizza Place,Pharmacy,Donut Shop,Sandwich Place
9,Crown Heights,Pizza Place,Café,Museum,Cosmetics Shop,Bookstore,Bakery,Bagel Shop,Moving Target,Supermarket,Salon / Barbershop
10,East Flatbush,Park,Wine Shop,Food & Drink Shop,Pharmacy,Hardware Store,Department Store,Fast Food Restaurant,Supermarket,Moving Target,Chinese Restaurant
11,Kensington,Thai Restaurant,Grocery Store,Pharmacy,Restaurant,Pizza Place,Ice Cream Shop,Supermarket,Racetrack,Nail Salon,Music Venue
14,Brownsville,Restaurant,Moving Target,Park,Playground,Chinese Restaurant,Pharmacy,Burger Joint,Fried Chicken Joint,Caribbean Restaurant,Farmers Market


In [191]:
brooklyn_merged.loc[brooklyn_merged['Cluster Labels'] == 2, brooklyn_merged.columns[[1] + list(range(5, brooklyn_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bay Ridge,Spa,Italian Restaurant,Greek Restaurant,Hookah Bar,Grocery Store,American Restaurant,Pizza Place,Chinese Restaurant,Ice Cream Shop,Sandwich Place
3,Greenpoint,Coffee Shop,Bar,Cocktail Bar,Pizza Place,Mexican Restaurant,Café,Boutique,Yoga Studio,Gymnastics Gym,Sushi Restaurant
6,Sheepshead Bay,Dessert Shop,Turkish Restaurant,Sandwich Place,Yoga Studio,Karaoke Bar,Pizza Place,Deli / Bodega,Park,Outlet Store,Diner
12,Windsor Terrace,Plaza,Park,Diner,Café,Grocery Store,Deli / Bodega,French Restaurant,Sushi Restaurant,Beer Store,Salad Place
13,Prospect Heights,Bar,Cocktail Bar,Thai Restaurant,Café,Coffee Shop,New American Restaurant,Pizza Place,Beer Bar,American Restaurant,Ice Cream Shop
15,Williamsburg,Bar,Coffee Shop,Pizza Place,Bagel Shop,Deli / Bodega,Breakfast Spot,Burger Joint,Café,Clothing Store,Playground
16,Bushwick,Bar,Mexican Restaurant,Coffee Shop,Discount Store,Bakery,Thrift / Vintage Store,Vegetarian / Vegan Restaurant,Pizza Place,Nightclub,Pakistani Restaurant
17,Bedford Stuyvesant,Bar,Café,Pizza Place,Coffee Shop,Cocktail Bar,Bus Station,Boutique,Fried Chicken Joint,New American Restaurant,Juice Bar
18,Brooklyn Heights,Yoga Studio,Park,Gym,Ice Cream Shop,Pet Store,Italian Restaurant,Deli / Bodega,Mexican Restaurant,Bakery,Optical Shop
19,Cobble Hill,Yoga Studio,Italian Restaurant,Cocktail Bar,Wine Shop,Middle Eastern Restaurant,Ice Cream Shop,Bar,Playground,Pilates Studio,Cosmetics Shop


In [192]:
brooklyn_merged.loc[brooklyn_merged['Cluster Labels'] == 3, brooklyn_merged.columns[[1] + list(range(5, brooklyn_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
28,Canarsie,Caribbean Restaurant,Gym,Bus Line,Asian Restaurant,Farm,Ethiopian Restaurant,Event Service,Event Space,Factory,Falafel Restaurant
59,Paerdegat Basin,Asian Restaurant,Food,Child Care Service,Bus Line,Gym,Harbor / Marina,Women's Store,Event Service,Event Space,Factory


In [193]:
brooklyn_merged.loc[brooklyn_merged['Cluster Labels'] == 4, brooklyn_merged.columns[[1] + list(range(5, brooklyn_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue


# Conclusion

The top-venue tables show many more bars in cluster 2 than in cluster 1. Venues in cluster 1 are mostly everyday amenities such as pizza places, pharmacies, banks and grocery stores, which suggests these areas are predominantly residential. Cluster 2, by contrast, has more nightlife venues (bars, cocktail bars and beer bars), which suggests these neighborhoods are busier for that type of entertainment and so are a good place for a pub. That is to be expected given the overall proximity of cluster 2 to Manhattan. In conclusion, the data supports placing our pub in cluster 2. Given that we have seen that Boerum Hill and Clinton Hill both already have successful and popular Irish pubs, and both have many nightlife venues, we believe it would be less risky to open in one of these areas.
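The comparison between clusters can also be expressed as a count. The sketch below tallies nightlife categories among the three most common venues for a handful of neighborhoods copied from the cluster 1 and cluster 2 tables. The `NIGHTLIFE` set is an assumed definition of a nightlife venue, and the sample covers only six neighborhoods, so the numbers are indicative only.

```python
import pandas as pd

# Top-three categories for a few neighborhoods, copied from the cluster tables.
sample = pd.DataFrame(
    [
        (1, "Bensonhurst", ["Chinese Restaurant", "Italian Restaurant", "Donut Shop"]),
        (1, "Sunset Park", ["Bank", "Latin American Restaurant", "Pizza Place"]),
        (1, "Gravesend", ["Pizza Place", "Italian Restaurant", "Lounge"]),
        (2, "Greenpoint", ["Coffee Shop", "Bar", "Cocktail Bar"]),
        (2, "Prospect Heights", ["Bar", "Cocktail Bar", "Thai Restaurant"]),
        (2, "Williamsburg", ["Bar", "Coffee Shop", "Pizza Place"]),
    ],
    columns=["Cluster Labels", "Neighborhood", "Top Venues"],
)

# Assumed definition of a nightlife category; adjust as needed.
NIGHTLIFE = {"Bar", "Cocktail Bar", "Beer Bar", "Pub", "Lounge", "Nightclub"}

# Count nightlife categories per neighborhood, then total them per cluster.
sample["Nightlife Count"] = sample["Top Venues"].apply(
    lambda venues: sum(venue in NIGHTLIFE for venue in venues)
)
per_cluster = sample.groupby("Cluster Labels")["Nightlife Count"].sum()
print(per_cluster)
```

For this small sample the tally is one nightlife category in cluster 1 against five in cluster 2, which is in line with the reading of the tables above.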