### Introduction/Business Problem 

This project analyzes the neighborhoods of Boston, MA and their potential venues of interest.

As someone who lives in Greater Boston and works in the downtown area, I am often asked for recommendations on things to do and areas to live in by friends and colleagues who either live in other US cities or are travelling to Boston for work or pleasure. A systematic analysis of neighborhoods and venues will not only benefit them but also help me, at a personal level, discover areas and places I have not yet explored.

To complete our analysis, we will need to explore the available datasets, explore the neighborhoods in Boston, analyze each neighborhood, cluster the neighborhoods, and finally examine the clusters.


### Data

To perform our analysis, we will need two types of data.

First, we will need location data. For this, we will use Foursquare location data, which provides the venues around a given location. We will fetch the top 100 venues within a radius of roughly 3 miles (5,000 m) of Boston's geographical coordinates.
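
As a quick check on the radius, the conversion between miles and meters can be sketched as follows. The notebook later passes a radius of 5,000 m to the API; the constant and helper names here are illustrative and not part of the original notebook.

```python
# Illustrative helpers (names are assumptions, not from the original notebook)
METERS_PER_MILE = 1609.344  # definition of the international mile in meters

def miles_to_meters(miles):
    """Convert a distance in miles to meters."""
    return miles * METERS_PER_MILE

def meters_to_miles(meters):
    """Convert a distance in meters to miles."""
    return meters / METERS_PER_MILE

print(round(miles_to_meters(3)))        # 4828
print(round(meters_to_miles(5000), 2))  # 3.11
```

A 5,000 m search radius is therefore slightly more than 3 miles.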

Second, we will need Boston’s neighborhood data. For this, we will use two sources and merge them to obtain the geographic coordinates of the various neighborhoods. For easy reference, I have downloaded this data into an Excel file, which is used in the analysis.

ZIP Code: http://archive.boston.com/news/local/articles/2007/04/15/sixfigurezipcodes_city/

Geo Data: https://public.opendatasoft.com/explore/embed/dataset/us-zip-code-latitude-and-longitude/table/?q=boston&refine.state=MA&location=11,42.36681,-71.18952&basemap=jawg.streets


### Methodology

Our methodology for this analysis has the following five components:
1. Explore the available dataset: In this section we will access, clean, and reshape the location and venue data from the sources mentioned above.
2. Explore neighborhoods in Boston: In this section we will look at the different neighborhoods of Boston.
3. Analyze each neighborhood: In this section we will start analyzing the neighborhood data.
4. Cluster neighborhoods: We will use k-means to cluster the neighborhoods.
5. Examine the clusters: Here, we will examine the cluster data to understand the top venues.


For our work, we will use several of the Python libraries available to us, including pandas, numpy, json, geopy, matplotlib, folium, and sklearn.

### 1. Explore Dataset

Import Libraries

In [17]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0; the pandas.io.json import path is deprecated)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


Boston Neighborhood Data
- Read and merge data
- Clean df

In [18]:
# Assign test file
testfile = 'bos_data.xlsx'

# Load Spreadsheet
ss = pd.ExcelFile(testfile)

# Load the sheet into dataframe
df1 = ss.parse('bos_neighborhoods')
df2 = ss.parse('bos_neighborhoods_loc')

# merge
df_m = pd.merge(df1,df2,on='Zip',how='left')
df = df_m.filter(['Community','Latitude','Longitude'],axis=1)

#selecting only values where latitude is not na
df = df[df['Latitude'].notna()]


#drop duplicates
df = df.drop_duplicates()

#sort df
df.sort_values(by=['Latitude'], inplace=True)

bos_nh = df.reset_index(drop=True)
bos_nh.head()

| | Community | Latitude | Longitude |
|---|---|---|---|
| 0 | Dorchester / Codman Square | 42.287 | -71.072 |
| 1 | Dorchester / Fields Corner | 42.296 | -71.055 |
| 2 | Roxbury / Grove Hall | 42.307 | -71.081 |
| 3 | Dorchester / Uphams Corner | 42.317 | -71.058 |
| 4 | Roxbury | 42.325 | -71.085 |
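
To make the merge-and-clean step above easier to follow, here is a minimal sketch on made-up data. The ZIP codes, names, and coordinates are invented for illustration; only the column names (`Zip`, `Community`, `Latitude`, `Longitude`) mirror the notebook.

```python
import pandas as pd

# Toy stand-ins for the two Excel sheets (all values are invented)
neighborhoods = pd.DataFrame({
    'Zip': ['00001', '00002', '00003'],
    'Community': ['Area A', 'Area B', 'Area C'],
})
locations = pd.DataFrame({
    'Zip': ['00001', '00002'],
    'Latitude': [42.30, 42.35],
    'Longitude': [-71.07, -71.05],
})

# A left join keeps every neighborhood; ZIP codes without a match get NaN coordinates
merged = pd.merge(neighborhoods, locations, on='Zip', how='left')

# Keep only the columns of interest and drop rows that have no latitude
cleaned = merged[['Community', 'Latitude', 'Longitude']]
cleaned = cleaned[cleaned['Latitude'].notna()].drop_duplicates()
cleaned = cleaned.sort_values(by='Latitude').reset_index(drop=True)

print(cleaned)
```

In this toy case `Area C` has no coordinates after the join, so it is dropped by the `notna()` filter, which is the same effect the notebook's filter has on unmatched ZIP codes.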


Let's visualize the Boston neighborhoods

In [19]:
address = 'Boston, MA, USA'

geolocator = Nominatim(user_agent="to_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Boston, MA, USA are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Boston, MA, USA are 42.3602534, -71.0582912.


In [20]:
# create map of Boston using latitude and longitude values
map_bos = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, label in zip(bos_nh['Latitude'], bos_nh['Longitude'],bos_nh['Community']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_bos)  
    
map_bos

Define Foursquare credentials and version

In [21]:
CLIENT_ID = 'X3HYRBWLA2CJMHHPHPZUORYXZW131NRNXV5KCPB3CGJAAUUP' # your Foursquare ID
CLIENT_SECRET = '320IAFEFFUQ223ZJLZ44IDVO4UJ4LK0TUV0BENU54MPGEFEB' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
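
Hard-coding API credentials in a notebook makes them easy to leak when the notebook is shared. One common alternative, sketched here, reads them from environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are hypothetical choices for this example, not something the original notebook defines.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are hypothetical; use whatever names you export.
    """
    try:
        return env['FOURSQUARE_CLIENT_ID'], env['FOURSQUARE_CLIENT_SECRET']
    except KeyError as missing:
        raise RuntimeError('Missing environment variable: {}'.format(missing)) from None

# Example with a plain dict standing in for os.environ
demo_env = {'FOURSQUARE_CLIENT_ID': 'demo-id', 'FOURSQUARE_CLIENT_SECRET': 'demo-secret'}
CLIENT_ID_DEMO, CLIENT_SECRET_DEMO = load_foursquare_credentials(demo_env)
print(CLIENT_ID_DEMO)  # demo-id
```

Calling the helper without arguments would read the real environment and raise a `RuntimeError` if either variable is missing.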

Get the latitude and longitude of Boston

In [22]:
address = 'Boston, MA, USA'

geolocator = Nominatim(user_agent="to_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Boston, MA, USA are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Boston, MA, USA are 42.3602534, -71.0582912.


Now, let's get the top 100 venues in the Rowes Wharf area, within a radius of roughly 3 miles (5,000 m) of the coordinates above

In [23]:
LIMIT = 100 # limit of number of venues returned by Foursquare API

radius = 5000 # define radius in meters, roughly a 3-mile radius
# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    latitude, 
    longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=X3HYRBWLA2CJMHHPHPZUORYXZW131NRNXV5KCPB3CGJAAUUP&client_secret=320IAFEFFUQ223ZJLZ44IDVO4UJ4LK0TUV0BENU54MPGEFEB&v=20180605&ll=42.3602534,-71.0582912&radius=5000&limit=100'

In [24]:
# Send the GET request and examine the results
results = requests.get(url).json()
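
The query string above is assembled by hand with `str.format`. As a hedged alternative, the same parameters can be kept in a dictionary and encoded with the standard library; `requests.get` also accepts such a dictionary through its `params` argument. The helper name `build_explore_url` and the placeholder credentials are assumptions for illustration, and no request is sent here.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL used in the notebook
EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build the explore URL from a parameter dictionary (hypothetical helper)."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    # safe=',' keeps the comma in the "lat,lng" pair unescaped
    return EXPLORE_ENDPOINT + '?' + urlencode(params, safe=',')

# Placeholder credentials; the coordinates and limits match the notebook
demo_url = build_explore_url('ID', 'SECRET', '20180605', 42.3602534, -71.0582912, 5000, 100)
print(demo_url)
```

Keeping the parameters in a dictionary avoids positional mistakes in a long format string and leaves escaping to the library.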

In [25]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
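
To see what the function returns, it can be tried on hand-written rows. The rows below are invented and only mimic the shape of a Foursquare item; the function body repeats the logic above so the snippet runs on its own.

```python
def get_category_type(row):
    """Return the name of the first category of a venue row, or None."""
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']

# Invented rows: dicts stand in for the dataframe rows used in the notebook
row_flat = {'venue.name': 'Example Park', 'venue.categories': [{'name': 'Park'}]}
row_plain = {'name': 'Example Cafe', 'categories': [{'name': 'Café'}, {'name': 'Bakery'}]}
row_empty = {'venue.name': 'Unlabelled Place', 'venue.categories': []}

print(get_category_type(row_flat))   # Park
print(get_category_type(row_plain))  # Café
print(get_category_type(row_empty))  # None
```

Only the first category is kept, so a venue with several categories is represented by whichever one the API lists first.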

In [26]:
# creating a data frame
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head(100)


| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | North End Park | Park | 42.362488 | -71.056477 |
| 1 | Boston Public Market | Market | 42.36195 | -71.057466 |
| 2 | Faneuil Hall Marketplace | Historic Site | 42.359978 | -71.05641 |
| 3 | Quincy Market | Historic Site | 42.360106 | -71.054881 |
| 4 | Saus Restaurant | Belgian Restaurant | 42.361076 | -71.057054 |
| 5 | The Rose Kennedy Greenway - Mothers Walk | Park | 42.36264 | -71.056407 |
| 6 | haley.henry | Restaurant | 42.357574 | -71.059495 |
| 7 | Sam LaGrassa's | Sandwich Place | 42.35687 | -71.05996 |
| 8 | Tatte Bakery & Cafe | Bakery | 42.358451 | -71.057981 |
| 9 | Boston Athenaeum | Library | 42.357481 | -71.061838 |


In [27]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

100 venues were returned by Foursquare.


In [28]:
nearby_venues.groupby('categories').count()

| categories | name | lat | lng |
|---|---|---|---|
| Aquarium | 2 | 2 | 2 |
| Asian Restaurant | 1 | 1 | 1 |
| Athletics & Sports | 1 | 1 | 1 |
| Bakery | 8 | 8 | 8 |
| Beer Garden | 1 | 1 | 1 |
| Belgian Restaurant | 1 | 1 | 1 |
| Breakfast Spot | 1 | 1 | 1 |
| Brewery | 1 | 1 | 1 |
| Café | 2 | 2 | 2 |
| Church | 1 | 1 | 1 |


### 2. Explore Neighborhood in Boston

#### Let's create a function to repeat the same process for all the neighborhoods in Boston

In [29]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
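
The list comprehension above indexes `v['venue']['categories'][0]` directly, which would raise an `IndexError` for a venue that has no category. A hedged sketch of a more defensive row builder follows; the helper name `items_to_rows` and the sample items are made up for illustration, and the item layout is assumed to match the response fields used above.

```python
def items_to_rows(name, lat, lng, items):
    """Turn Foursquare 'items' into flat tuples, tolerating missing categories."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Fall back to None when the venue carries no category
        category = categories[0]['name'] if categories else None
        rows.append((
            name,
            lat,
            lng,
            venue['name'],
            venue['location']['lat'],
            venue['location']['lng'],
            category,
        ))
    return rows

# Invented items that mimic the structure used in the notebook
sample_items = [
    {'venue': {'name': 'Example Park', 'location': {'lat': 42.29, 'lng': -71.07},
               'categories': [{'name': 'Park'}]}},
    {'venue': {'name': 'Unlabelled Place', 'location': {'lat': 42.28, 'lng': -71.07},
               'categories': []}},
]

rows = items_to_rows('Example Neighborhood', 42.287, -71.072, sample_items)
print(rows[0][-1])  # Park
print(rows[1][-1])  # None
```

Rows with a `None` category could then be dropped or labelled before the one-hot encoding step, depending on how such venues should be treated.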

#### Now run the above function on each neighborhood and create a new dataframe called *boston_venues*.

In [30]:
boston_venues = getNearbyVenues(names=bos_nh['Community'],
                                   latitudes=bos_nh['Latitude'],
                                   longitudes=bos_nh['Longitude']
                                  )

Dorchester / Codman Square
Dorchester / Fields Corner
Roxbury / Grove Hall
Dorchester / Uphams Corner
Roxbury
Roxbury Crossing
South Boston
Fenway / East Fens / Longwood
Prudential
Kenmore / Boston University
South Boston / Fort Point
Chinatown / Tufts-New England Medical Center
Downtown Boston
Back Bay
South End
Downtown Boston
Financial District / Wharves
Beacon Hill
Markets / Inner Harbor
West End / Back of the Hill
North End
North Brighton / Cambridge
Downtown Boston
East Boston


In [31]:
print(boston_venues.shape)
boston_venues.head()

(1304, 7)


| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 0 | Dorchester / Codman Square | 42.287 | -71.072 | Dorchester YMCA | 42.284671 | -71.071147 | Gym / Fitness Center |
| 1 | Dorchester / Codman Square | 42.287 | -71.072 | McDonald's | 42.290421 | -71.071777 | Fast Food Restaurant |
| 2 | Dorchester / Codman Square | 42.287 | -71.072 | Walgreens | 42.291119 | -71.07181 | Pharmacy |
| 3 | Dorchester / Codman Square | 42.287 | -71.072 | Codman Square Park | 42.289982 | -71.072634 | Park |
| 4 | Dorchester / Codman Square | 42.287 | -71.072 | Joy Luck Hot Pot Restaurant | 42.287097 | -71.070951 | Restaurant |


In [32]:
boston_venues.groupby('Neighborhood').count()

| Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|
| Back Bay | 100 | 100 | 100 | 100 | 100 | 100 |
| Beacon Hill | 84 | 84 | 84 | 84 | 84 | 84 |
| Chinatown / Tufts-New England Medical Center | 100 | 100 | 100 | 100 | 100 | 100 |
| Dorchester / Codman Square | 12 | 12 | 12 | 12 | 12 | 12 |
| Dorchester / Fields Corner | 13 | 13 | 13 | 13 | 13 | 13 |
| Dorchester / Uphams Corner | 15 | 15 | 15 | 15 | 15 | 15 |
| Downtown Boston | 181 | 181 | 181 | 181 | 181 | 181 |
| East Boston | 31 | 31 | 31 | 31 | 31 | 31 |
| Fenway / East Fens / Longwood | 34 | 34 | 34 | 34 | 34 | 34 |
| Financial District / Wharves | 80 | 80 | 80 | 80 | 80 | 80 |
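
Downtown Boston shows 181 venues even though each request is capped at `LIMIT = 100`. This is consistent with the name appearing three times in the neighborhood list printed above: `drop_duplicates()` only removes rows that are identical in every column, so rows that share a name but differ in coordinates each trigger a separate request. A minimal sketch on invented data shows how such repeated names can be spotted; whether to merge them is a modelling choice the notebook does not make.

```python
import pandas as pd

# Invented rows: the same name with two different coordinate pairs
toy_nh = pd.DataFrame({
    'Community': ['Area A', 'Area B', 'Area A', 'Area A'],
    'Latitude': [42.35, 42.30, 42.36, 42.35],
    'Longitude': [-71.06, -71.07, -71.05, -71.06],
})

# Full-row de-duplication removes only the exact repeat (the last row)
deduped = toy_nh.drop_duplicates().reset_index(drop=True)

# Count how many distinct coordinate pairs each name still has
rows_per_name = deduped.groupby('Community').size()
repeated_names = rows_per_name[rows_per_name > 1].index.tolist()

print(len(deduped))    # 3
print(repeated_names)  # ['Area A']
```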


#### Unique categories

In [33]:
print('There are {} unique categories.'.format(len(boston_venues['Venue Category'].unique())))

There are 209 unique categories.


### 3. Analyze Each Neighborhood

In [34]:
# one hot encoding
boston_onehot = pd.get_dummies(boston_venues[['Venue Category']], prefix="", prefix_sep="")

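# add neighborhood column back to dataframe
# note: one venue category is itself named 'Neighborhood', so this assignment
# overwrites that dummy column in place instead of appending a new column
boston_onehot['Neighborhood'] = boston_venues['Neighborhood'] 

# move the last column to the front; because of the overwrite above, the last column
# is 'Yoga Studio' and 'Neighborhood' keeps its position among the category columns
fixed_columns = [boston_onehot.columns[-1]] + list(boston_onehot.columns[:-1])
boston_onehot = boston_onehot[fixed_columns]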

boston_onehot.head()

Unnamed: 0,Yoga Studio,ATM,Accessories Store,African Restaurant,Airport,Airport Terminal,American Restaurant,Aquarium,Arepa Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auditorium,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Bed & Breakfast,Beer Bar,Beer Garden,Belgian Restaurant,Big Box Store,Bike Rental / Bike Share,Bistro,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Station,Bus Stop,Business Service,Cafeteria,Café,Caribbean Restaurant,Cemetery,Chinese Restaurant,Chocolate Shop,Church,Circus,Clothing Store,Cocktail Bar,Coffee Shop,College Hockey Rink,College Stadium,Comedy Club,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cupcake Shop,Cycle Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Doctor's Office,Dog Run,Donut Shop,Electronics Store,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish Market,Flower Shop,Food Court,Food Service,Food Truck,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Garden,Gastropub,Gay Bar,General Entertainment,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Hindu Temple,Historic Site,History Museum,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Insurance Office,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kids Store,Korean Restaurant,Lake,Latin American Restaurant,Library,Lingerie Store,Liquor Store,Lounge,Mac & Cheese Joint,Malay Restaurant,Market,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Monument / Landmark,Movie Theater,Museum,Music Venue,Nail Salon,Neighborhood,New American Restaurant,Nightclub,Noodle House,Opera House,Optical Shop,Other Great 
Outdoors,Other Repair Shop,Outdoor Sculpture,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Pilates Studio,Pizza Place,Playground,Plaza,Pool,Pool Hall,Post Office,Pub,Ramen Restaurant,Record Shop,Rental Car Location,Residential Building (Apartment / Condo),Restaurant,Roof Deck,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shipping Store,Shoe Store,Shopping Mall,Skating Rink,Ski Chalet,Snack Place,South American Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Steakhouse,Supermarket,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Thrift / Vintage Store,Toll Booth,Tour Provider,Tourist Information Center,Track,Trail,Udon Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Dorchester / Codman Square,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Dorchester / Codman Square,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Dorchester / Codman Square,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Dorchester / Codman Square,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Dorchester / Codman Square,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [35]:
boston_onehot.shape

(1304, 209)
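
One would expect 210 columns here (209 categories plus the neighborhood label), but the shape shows 209. The header of the one-hot table suggests why: one of the categories is itself named `Neighborhood`, so the assignment reuses that column rather than adding a new one. If keeping both is preferred, one option is to rename the clashing dummy column before adding the label column. The sketch below uses invented venues, and the replacement name `Neighborhood (category)` is an arbitrary choice.

```python
import pandas as pd

# Invented venues, including one whose category is literally 'Neighborhood'
toy_venues = pd.DataFrame({
    'Neighborhood': ['Area A', 'Area A', 'Area B'],
    'Venue Category': ['Park', 'Neighborhood', 'Bakery'],
})

# One-hot encode the categories
toy_onehot = pd.get_dummies(toy_venues[['Venue Category']], prefix="", prefix_sep="")

# Rename the clashing dummy column so it is not overwritten (arbitrary new name)
toy_onehot = toy_onehot.rename(columns={'Neighborhood': 'Neighborhood (category)'})

# Insert the neighborhood label as the first column
toy_onehot.insert(0, 'Neighborhood', toy_venues['Neighborhood'])

print(list(toy_onehot.columns))
```

With this variant the label column really is first and no category is lost, at the cost of one extra column compared with the output shown in the notebook.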

#### Next, let's group rows by neighborhood, taking the mean of the frequency of occurrence of each category

In [36]:
boston_grouped = boston_onehot.groupby('Neighborhood').mean().reset_index()
boston_grouped

Unnamed: 0,Neighborhood,Yoga Studio,ATM,Accessories Store,African Restaurant,Airport,Airport Terminal,American Restaurant,Aquarium,Arepa Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auditorium,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Bed & Breakfast,Beer Bar,Beer Garden,Belgian Restaurant,Big Box Store,Bike Rental / Bike Share,Bistro,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Station,Bus Stop,Business Service,Cafeteria,Café,Caribbean Restaurant,Cemetery,Chinese Restaurant,Chocolate Shop,Church,Circus,Clothing Store,Cocktail Bar,Coffee Shop,College Hockey Rink,College Stadium,Comedy Club,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cupcake Shop,Cycle Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Doctor's Office,Dog Run,Donut Shop,Electronics Store,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish Market,Flower Shop,Food Court,Food Service,Food Truck,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Garden,Gastropub,Gay Bar,General Entertainment,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Hindu Temple,Historic Site,History Museum,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Insurance Office,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kids Store,Korean Restaurant,Lake,Latin American Restaurant,Library,Lingerie Store,Liquor Store,Lounge,Mac & Cheese Joint,Malay Restaurant,Market,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Monument / Landmark,Movie Theater,Museum,Music Venue,Nail Salon,New American Restaurant,Nightclub,Noodle House,Opera House,Optical Shop,Other Great 
Outdoors,Other Repair Shop,Outdoor Sculpture,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Pilates Studio,Pizza Place,Playground,Plaza,Pool,Pool Hall,Post Office,Pub,Ramen Restaurant,Record Shop,Rental Car Location,Residential Building (Apartment / Condo),Restaurant,Roof Deck,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shipping Store,Shoe Store,Shopping Mall,Skating Rink,Ski Chalet,Snack Place,South American Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Steakhouse,Supermarket,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Thrift / Vintage Store,Toll Booth,Tour Provider,Tourist Information Center,Track,Trail,Udon Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,Back Bay,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.03,0.02,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.05,0.01,0.0,0.03,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.01,0.03,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.06,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.01
1,Beacon Hill,0.011905,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.02381,0.011905,0.0,0.0,0.0,0.011905,0.0,0.011905,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011905,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011905,0.02381,0.011905,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011905,0.0,0.0,0.011905,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.011905,0.0,0.011905,0.011905,0.0,0.02381,0.0,0.0,0.035714,0.02381,0.0,0.02381,0.011905,0.0,0.0,0.0,0.0,0.011905,0.0,0.035714,0.0,0.0,0.011905,0.011905,0.0,0.0,0.011905,0.011905,0.0,0.011905,0.0,0.0,0.0,0.0,0.0,0.011905,0.02381,0.0,0.0,0.011905,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.011905,0.0,0.0,0.011905,0.0,0.011905,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.011905,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.011905,0.0,0.02381,0.011905,0.011905,0.02381,0.0,0.011905,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.011905,0.0,0.0,0.0,0.011905,0.0,0.0,0.0,0.0,0.011905,0.0,0.0,0.0,0.0,0.0,0.0
2,Chinatown / Tufts-New England Medical Center,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.09,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.13,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.03,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.01,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.02,0.0,0.0,0.0
3,Dorchester / Codman Square,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Dorchester / Fields Corner,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.076923,0.076923,0.0,0.076923,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0
5,Dorchester / Uphams Corner,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.133333,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0
6,Downtown Boston,0.0,0.0,0.0,0.0,0.005525,0.01105,0.033149,0.0,0.0,0.005525,0.005525,0.0,0.033149,0.005525,0.0,0.005525,0.0,0.027624,0.0,0.022099,0.005525,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005525,0.0,0.0,0.0,0.005525,0.0,0.0,0.016575,0.0,0.005525,0.005525,0.0,0.0,0.016575,0.0,0.0,0.0,0.0,0.0,0.0,0.01105,0.01105,0.082873,0.0,0.0,0.0,0.0,0.0,0.0,0.01105,0.005525,0.0,0.0,0.005525,0.0,0.0,0.005525,0.005525,0.005525,0.0,0.0,0.016575,0.0,0.0,0.01105,0.005525,0.0,0.0,0.0,0.0,0.0,0.016575,0.016575,0.0,0.0,0.0,0.01105,0.0,0.0,0.0,0.0,0.005525,0.0,0.005525,0.01105,0.0,0.0,0.027624,0.005525,0.0,0.038674,0.005525,0.0,0.005525,0.005525,0.0,0.005525,0.0,0.027624,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005525,0.0,0.005525,0.005525,0.0,0.0,0.005525,0.0,0.0,0.005525,0.022099,0.0,0.005525,0.005525,0.0,0.01105,0.0,0.0,0.005525,0.0,0.005525,0.005525,0.0,0.0,0.0,0.0,0.022099,0.0,0.0,0.005525,0.0,0.0,0.0,0.0,0.01105,0.0,0.01105,0.0,0.0,0.0,0.005525,0.0,0.0,0.033149,0.0,0.027624,0.0,0.022099,0.0,0.038674,0.005525,0.005525,0.016575,0.005525,0.005525,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005525,0.005525,0.0,0.0,0.033149,0.0,0.01105,0.0,0.0,0.0,0.005525,0.005525,0.0,0.0,0.005525,0.0,0.0,0.0,0.0,0.0,0.01105,0.005525,0.005525,0.0,0.01105,0.005525,0.005525,0.0
7,East Boston,0.0,0.0,0.0,0.0,0.064516,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.064516,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.064516,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.032258,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.16129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.064516,0.0,0.0,0.064516,0.0,0.0,0.0,0.0,0.0,0.0,0.064516,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.032258,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.032258,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Fenway / East Fens / Longwood,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.029412,0.088235,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.029412,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.029412,0.029412,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.117647,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Financial District / Wharves,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0375,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0125,0.0,0.025,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0125,0.0,0.0625,0.0,0.0625,0.0,0.0,0.05,0.0,0.0,0.0,0.0125,0.0,0.0125,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0125,0.0,0.0125,0.0,0.0,0.0125,0.0,0.0375,0.0,0.025,0.0,0.0125,0.075,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0375,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0125,0.0,0.0,0.0,0.0,0.0125,0.0,0.0


In [37]:
boston_grouped.shape

(22, 209)
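
Each row of the one-hot table marks the category of one venue, so the per-neighborhood mean of a category column equals the share of that neighborhood's venues that fall into the category. A small sketch with invented data illustrates this:

```python
import pandas as pd

# Invented one-hot rows: four venues in Area A, two in Area B
toy_onehot = pd.DataFrame({
    'Neighborhood': ['Area A', 'Area A', 'Area A', 'Area A', 'Area B', 'Area B'],
    'Bakery': [1, 0, 0, 1, 0, 0],
    'Park':   [0, 1, 1, 0, 1, 1],
})

# The mean of a 0/1 column is the fraction of rows equal to 1
toy_grouped = toy_onehot.groupby('Neighborhood').mean().reset_index()
print(toy_grouped)
```

Here half of Area A's venues are bakeries and all of Area B's venues are parks, which is what the grouped means report.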

#### Let's print each neighborhood along with the top 5 most common venues

In [38]:
num_top_venues = 5

for hood in boston_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = boston_grouped[boston_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Back Bay----
                 venue  freq
0                  Spa  0.06
1  American Restaurant  0.05
2   Italian Restaurant  0.05
3          Coffee Shop  0.04
4                Hotel  0.04


----Beacon Hill----
                 venue  freq
0          Coffee Shop  0.07
1           Restaurant  0.05
2          Pizza Place  0.04
3  American Restaurant  0.04
4   Italian Restaurant  0.04


----Chinatown / Tufts-New England Medical Center----
                venue  freq
0  Chinese Restaurant  0.13
1    Asian Restaurant  0.09
2              Bakery  0.08
3    Sushi Restaurant  0.04
4             Theater  0.04


----Dorchester / Codman Square----
                  venue  freq
0                  Park  0.17
1         Deli / Bodega  0.17
2        Breakfast Spot  0.08
3  Gym / Fitness Center  0.08
4  Caribbean Restaurant  0.08


----Dorchester / Fields Corner----
                venue  freq
0          Playground  0.08
1  Chinese Restaurant  0.08
2          Donut Shop  0.08
3             Dog Run  0

#### Put the above information into a pandas DataFrame

In [39]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
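
To see what `return_most_common_venues` returns, here is a minimal sketch on a made-up row. The neighborhood name and the frequencies are hypothetical, not taken from the Foursquare data, and the values are distinct so that the ordering is unambiguous.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # Skip the first entry (the neighborhood name) and rank the venue frequencies
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Hypothetical row shaped like one row of boston_grouped
toy_row = pd.Series({'Neighborhood': 'Toy Hood', 'Bakery': 0.10, 'Park': 0.50, 'Spa': 0.30})

top_two = list(return_most_common_venues(toy_row, 2))
print(top_two)
```

The function returns venue category names only, so the frequencies themselves are dropped at this step.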

In [40]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = boston_grouped['Neighborhood']

for ind in np.arange(boston_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(boston_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Back Bay,Spa,American Restaurant,Italian Restaurant,Coffee Shop,Hotel,Gym,Steakhouse,Boutique,Seafood Restaurant,Jewelry Store
1,Beacon Hill,Coffee Shop,Restaurant,Pizza Place,American Restaurant,Italian Restaurant,Historic Site,History Museum,Mediterranean Restaurant,Falafel Restaurant,Hotel
2,Chinatown / Tufts-New England Medical Center,Chinese Restaurant,Asian Restaurant,Bakery,Sushi Restaurant,Theater,Bubble Tea Shop,Café,Performing Arts Venue,Coffee Shop,Hotel Bar
3,Dorchester / Codman Square,Deli / Bodega,Park,Farmers Market,Restaurant,Pharmacy,Fast Food Restaurant,Liquor Store,Gym / Fitness Center,Breakfast Spot,Caribbean Restaurant
4,Dorchester / Fields Corner,Playground,Liquor Store,Pharmacy,Pet Store,Chinese Restaurant,Park,Convenience Store,Sandwich Place,Dog Run,Donut Shop


### 4. Cluster Neighborhoods

#### Run k-means to cluster the neighborhoods

In [41]:
# set number of clusters
kclusters = 5

boston_grouped_clustering = boston_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(boston_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 3, 3, 2, 3, 3, 3, 0, 3, 3])
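
Because `kclusters` is chosen by hand, it can help to check how many neighborhoods land in each cluster. The following is a minimal sketch that uses only the ten labels printed above; the full `kmeans.labels_` array has 22 entries, so the real counts will differ.

```python
from collections import Counter

# First ten labels copied from the output above; in the notebook,
# pass kmeans.labels_ instead to count all neighborhoods.
first_ten_labels = [0, 3, 3, 2, 3, 3, 3, 0, 3, 3]

cluster_sizes = Counter(first_ten_labels)
for label, size in sorted(cluster_sizes.items()):
    print(f"cluster {label}: {size} neighborhoods")
```

A cluster that holds only one neighborhood is a hint that the chosen number of clusters, or the underlying venue data, deserves a second look.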

In [42]:
#Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

boston_merged = bos_nh

# join neighborhoods_venues_sorted onto bos_nh to add latitude/longitude for each neighborhood
boston_merged = boston_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Community')

boston_merged.head() # check the last columns!

Unnamed: 0,Community,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Dorchester / Codman Square,42.287,-71.072,2,Deli / Bodega,Park,Farmers Market,Restaurant,Pharmacy,Fast Food Restaurant,Liquor Store,Gym / Fitness Center,Breakfast Spot,Caribbean Restaurant
1,Dorchester / Fields Corner,42.296,-71.055,3,Playground,Liquor Store,Pharmacy,Pet Store,Chinese Restaurant,Park,Convenience Store,Sandwich Place,Dog Run,Donut Shop
2,Roxbury / Grove Hall,42.307,-71.081,4,Shopping Mall,Fast Food Restaurant,Men's Store,Supermarket,Caribbean Restaurant,Cosmetics Shop,Nightclub,Donut Shop,Southern / Soul Food Restaurant,Pharmacy
3,Dorchester / Uphams Corner,42.317,-71.058,3,Pizza Place,Pharmacy,Bar,Pub,Vietnamese Restaurant,Indian Restaurant,Chinese Restaurant,Caribbean Restaurant,Liquor Store,Pet Store
4,Roxbury,42.325,-71.085,1,American Restaurant,Park,Business Service,Bike Rental / Bike Share,Donut Shop,Convenience Store,Concert Hall,African Restaurant,Diner,Event Space


#### Let's visualize the resulting clusters

In [43]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(boston_merged['Latitude'], boston_merged['Longitude'], boston_merged['Community'], boston_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        # index colors directly by cluster label (0 to kclusters-1)
        color=rainbow[int(cluster)],
        fill=True,
        fill_color=rainbow[int(cluster)],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### 5. Examine Clusters

#### Cluster 1

In [44]:
boston_merged

Unnamed: 0,Community,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Dorchester / Codman Square,42.287,-71.072,2,Deli / Bodega,Park,Farmers Market,Restaurant,Pharmacy,Fast Food Restaurant,Liquor Store,Gym / Fitness Center,Breakfast Spot,Caribbean Restaurant
1,Dorchester / Fields Corner,42.296,-71.055,3,Playground,Liquor Store,Pharmacy,Pet Store,Chinese Restaurant,Park,Convenience Store,Sandwich Place,Dog Run,Donut Shop
2,Roxbury / Grove Hall,42.307,-71.081,4,Shopping Mall,Fast Food Restaurant,Men's Store,Supermarket,Caribbean Restaurant,Cosmetics Shop,Nightclub,Donut Shop,Southern / Soul Food Restaurant,Pharmacy
3,Dorchester / Uphams Corner,42.317,-71.058,3,Pizza Place,Pharmacy,Bar,Pub,Vietnamese Restaurant,Indian Restaurant,Chinese Restaurant,Caribbean Restaurant,Liquor Store,Pet Store
4,Roxbury,42.325,-71.085,1,American Restaurant,Park,Business Service,Bike Rental / Bike Share,Donut Shop,Convenience Store,Concert Hall,African Restaurant,Diner,Event Space
5,Roxbury Crossing,42.332,-71.097,3,Pizza Place,Gym,Donut Shop,Metro Station,Italian Restaurant,Sushi Restaurant,Liquor Store,Deli / Bodega,Furniture / Home Store,Burger Joint
6,South Boston,42.335,-71.046,3,Pizza Place,Coffee Shop,Bar,Sports Bar,Convenience Store,Italian Restaurant,Mexican Restaurant,Dive Bar,Dessert Shop,New American Restaurant
7,Fenway / East Fens / Longwood,42.343,-71.093,3,Pizza Place,Art Museum,Japanese Restaurant,Baseball Field,Garden,Middle Eastern Restaurant,Thai Restaurant,Café,Restaurant,Department Store
8,Prudential,42.347,-71.082,0,Clothing Store,Coffee Shop,Ice Cream Shop,Seafood Restaurant,Italian Restaurant,Hotel,Spa,American Restaurant,Department Store,Dessert Shop
9,Kenmore / Boston University,42.347,-71.102,3,Café,Sports Bar,Donut Shop,Lounge,Mexican Restaurant,Pizza Place,American Restaurant,Furniture / Home Store,Japanese Restaurant,Chinese Restaurant


In [45]:
boston_merged.loc[boston_merged['Cluster Labels'] == 0, boston_merged.columns[[0] + list(range(4, boston_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,Prudential,Clothing Store,Coffee Shop,Ice Cream Shop,Seafood Restaurant,Italian Restaurant,Hotel,Spa,American Restaurant,Department Store,Dessert Shop
13,Back Bay,Spa,American Restaurant,Italian Restaurant,Coffee Shop,Hotel,Gym,Steakhouse,Boutique,Seafood Restaurant,Jewelry Store
14,South End,Coffee Shop,Spa,Clothing Store,Pizza Place,Jewelry Store,Italian Restaurant,Boutique,Historic Site,Bakery,French Restaurant
18,Markets / Inner Harbor,Italian Restaurant,Seafood Restaurant,Bakery,Historic Site,Park,Sandwich Place,American Restaurant,Hotel,Coffee Shop,Aquarium
20,North End,Italian Restaurant,Pizza Place,Coffee Shop,Park,Seafood Restaurant,Café,Sandwich Place,Pub,Mexican Restaurant,Sports Bar
21,North Brighton / Cambridge,Japanese Restaurant,Park,American Restaurant,Pizza Place,Bus Station,Jazz Club,Farmers Market,Bookstore,Seafood Restaurant,Lingerie Store
23,East Boston,Italian Restaurant,Latin American Restaurant,Liquor Store,Airport,Metro Station,Brazilian Restaurant,Chinese Restaurant,Ice Cream Shop,Peruvian Restaurant,Snack Place


#### Cluster 2

In [46]:
boston_merged.loc[boston_merged['Cluster Labels'] == 1, boston_merged.columns[[0] + list(range(4, boston_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Roxbury,American Restaurant,Park,Business Service,Bike Rental / Bike Share,Donut Shop,Convenience Store,Concert Hall,African Restaurant,Diner,Event Space


#### Cluster 3

In [47]:
boston_merged.loc[boston_merged['Cluster Labels'] == 2, boston_merged.columns[[0] + list(range(4, boston_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Dorchester / Codman Square,Deli / Bodega,Park,Farmers Market,Restaurant,Pharmacy,Fast Food Restaurant,Liquor Store,Gym / Fitness Center,Breakfast Spot,Caribbean Restaurant


#### Cluster 4

In [48]:
boston_merged.loc[boston_merged['Cluster Labels'] == 3, boston_merged.columns[[0] + list(range(4, boston_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Dorchester / Fields Corner,Playground,Liquor Store,Pharmacy,Pet Store,Chinese Restaurant,Park,Convenience Store,Sandwich Place,Dog Run,Donut Shop
3,Dorchester / Uphams Corner,Pizza Place,Pharmacy,Bar,Pub,Vietnamese Restaurant,Indian Restaurant,Chinese Restaurant,Caribbean Restaurant,Liquor Store,Pet Store
5,Roxbury Crossing,Pizza Place,Gym,Donut Shop,Metro Station,Italian Restaurant,Sushi Restaurant,Liquor Store,Deli / Bodega,Furniture / Home Store,Burger Joint
6,South Boston,Pizza Place,Coffee Shop,Bar,Sports Bar,Convenience Store,Italian Restaurant,Mexican Restaurant,Dive Bar,Dessert Shop,New American Restaurant
7,Fenway / East Fens / Longwood,Pizza Place,Art Museum,Japanese Restaurant,Baseball Field,Garden,Middle Eastern Restaurant,Thai Restaurant,Café,Restaurant,Department Store
9,Kenmore / Boston University,Café,Sports Bar,Donut Shop,Lounge,Mexican Restaurant,Pizza Place,American Restaurant,Furniture / Home Store,Japanese Restaurant,Chinese Restaurant
10,South Boston / Fort Point,American Restaurant,Hotel,Sandwich Place,Coffee Shop,Restaurant,Italian Restaurant,Bar,Park,Bakery,Steakhouse
11,Chinatown / Tufts-New England Medical Center,Chinese Restaurant,Asian Restaurant,Bakery,Sushi Restaurant,Theater,Bubble Tea Shop,Café,Performing Arts Venue,Coffee Shop,Hotel Bar
12,Downtown Boston,Coffee Shop,Sandwich Place,Hotel,American Restaurant,Steakhouse,Asian Restaurant,Rental Car Location,Bakery,Italian Restaurant,Historic Site
15,Downtown Boston,Coffee Shop,Sandwich Place,Hotel,American Restaurant,Steakhouse,Asian Restaurant,Rental Car Location,Bakery,Italian Restaurant,Historic Site


#### Cluster 5

In [49]:
boston_merged.loc[boston_merged['Cluster Labels'] == 4, boston_merged.columns[[0] + list(range(4, boston_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Roxbury / Grove Hall,Shopping Mall,Fast Food Restaurant,Men's Store,Supermarket,Caribbean Restaurant,Cosmetics Shop,Nightclub,Donut Shop,Southern / Soul Food Restaurant,Pharmacy


### Results

By examining the clusters, some insights are clear:
1. Boston's North End is best for Italian food. This is common knowledge for anyone who lives in the area, and the data supports it: Italian Restaurant is the number one venue category there. The same cluster shows that Markets / Inner Harbor and East Boston also rank Italian Restaurant first.
2. The largest group of neighborhoods falls into cluster label 3 (shown above as Cluster 4), which includes areas around downtown Boston. It shows a variety of venues, and the results look plausible. For example, Chinatown's number one venue is Chinese Restaurant. Prudential, in cluster label 0, has Clothing Store as its number one venue.
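
These observations can be checked by tallying the first-ranked venue per cluster. The following is a minimal sketch that uses a few rows copied from the cluster tables above as a stand-in for `boston_merged`; the variable names `sample` and `top_venue_counts` are illustrative.

```python
import pandas as pd

# A few rows copied from the cluster tables above (stand-in for boston_merged)
sample = pd.DataFrame({
    'Community': ['North End', 'Markets / Inner Harbor', 'East Boston',
                  'Prudential', 'Chinatown / Tufts-New England Medical Center'],
    'Cluster Labels': [0, 0, 0, 0, 3],
    '1st Most Common Venue': ['Italian Restaurant', 'Italian Restaurant',
                              'Italian Restaurant', 'Clothing Store',
                              'Chinese Restaurant'],
})

# Count how often each venue category ranks first within each cluster
top_venue_counts = (
    sample.groupby('Cluster Labels')['1st Most Common Venue']
    .value_counts()
)
print(top_venue_counts)
```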


### Discussion

Even though the cluster information is helpful for understanding the top venues of Boston, some of the results are not very convincing. For example, Spa ranks first in Back Bay and second in South End, which doesn't sound right. The Roxbury neighborhoods are also split across different clusters, two of which contain only a single neighborhood. All of this indicates poor coverage in the location data and makes me question the Foursquare data. It is advisable to explore more data sources to gain confidence in the results.
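
Two quick checks can surface issues of this kind: repeated neighborhood names (Downtown Boston appears twice in the Cluster 4 table) and clusters with a single member. The following is a sketch that assumes a frame shaped like `boston_merged`; the rows are copied from the tables above and the variable names are illustrative.

```python
import pandas as pd

# Rows copied from the cluster tables above (stand-in for boston_merged)
sample = pd.DataFrame({
    'Community': ['Roxbury', 'Roxbury / Grove Hall', 'Roxbury Crossing',
                  'Downtown Boston', 'Downtown Boston'],
    'Cluster Labels': [1, 4, 3, 3, 3],
})

# Neighborhood names that appear more than once after the merge
duplicated_names = sample.loc[sample['Community'].duplicated(), 'Community'].tolist()

# Clusters that contain exactly one neighborhood in this sample
sizes = sample['Cluster Labels'].value_counts()
singleton_clusters = sorted(sizes[sizes == 1].index.tolist())

print(duplicated_names, singleton_clusters)
```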

### Conclusion

In conclusion, the location data and geo data give us quite a few good insights. We can certainly get an idea of which neighborhoods to recommend to our friends and colleagues based on their needs. However, data coverage does not seem very good in some areas, so it is recommended that the analysis be enhanced with better-quality data or by combining multiple data sources.