## Introduction

This is the Capstone Project for the IBM Data Science Professional course. In this notebook, I explore venues in Suffolk County, Massachusetts (Boston and the surrounding towns) in order to choose a location for a new fusion restaurant that combines two popular cuisine styles. I will convert addresses into their latitude and longitude values and use the Foursquare API to explore towns and neighborhoods in Suffolk County and the greater Boston area. I will use the **explore** function to get the most common restaurant categories in each neighborhood, and then use this feature to group the neighborhoods into clusters with the *k*-means clustering algorithm. Finally, I will use the Folium library to visualize the towns in Suffolk County and their emerging clusters.

First half: background and data collection for opening a fusion restaurant in the greater Boston area of Suffolk County. I will use location data to see where the top two restaurant styles in a neighborhood could be fused successfully.

Second half: data processing for the locations of Suffolk County and Boston neighborhoods, the popular venues in that area, and the top restaurants in the area.

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1. <a href="#item1">Download and Explore Dataset</a>

2. <a href="#item2">Explore Neighborhoods in Suffolk County</a>

3. <a href="#item3">Analyze Each Neighborhood</a>

4. <a href="#item4">Cluster Neighborhoods</a>

5. <a href="#item5">Examine Clusters</a>    
</font>
</div>

The first step is to download all the dependencies that I will need.

In [142]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # installs geopy; comment this line out if it is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (pandas.io.json.json_normalize is deprecated)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # installs folium; comment this line out if it is already installed
import folium # map rendering library

print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Libraries imported.


<a id='item1'></a>

## 1. Download and Explore Dataset

The link to the data for Massachusetts is: https://geo.nyu.edu/download/file/harvard-mgisgeonamx2-geojson.json
This data contains all of the towns and counties for the state.


In [149]:
!wget -q -O 'massachusetts_data.json' https://geo.nyu.edu/download/file/harvard-mgisgeonamx2-geojson.json
print('Data downloaded!')

Data downloaded!


#### Load and explore the data

Next, let's load the data.

In [150]:
with open('massachusetts_data.json') as json_data:
    massachusetts_data = json.load(json_data)

Let's take a quick look at the data.

In [151]:
massachusetts_data

{'type': 'FeatureCollection',
 'totalFeatures': 1835,
 'features': [{'type': 'Feature',
   'id': 'MGISGEONAMX2.1',
   'geometry': {'type': 'Point', 'coordinates': [-70.86436054, 42.84482233]},
   'geometry_name': 'the_geom',
   'properties': {'PLACES_': 3,
    'PLACES_ID': 1,
    'X': 251961.859,
    'Y': 955105.25,
    'OFFSETX': 0,
    'OFFSETY': 0,
    'HEIGHT': 100,
    'SYMBOL': 1,
    'LEVEL_': 1,
    'TEXT': 'S A L I S B U R Y',
    'NAME': 'SALISBURY',
    'FEATURE': 'PPL',
    'COUNTY': 25009,
    'COORD': '',
    'DATE_': 1978,
    'ELEVATION': 25,
    'SOURCE': 'USGS',
    'TILE_NAME': '146'}},
  {'type': 'Feature',
   'id': 'MGISGEONAMX2.2',
   'geometry': {'type': 'Point', 'coordinates': [-70.81461765, 42.84174158]},
   'geometry_name': 'the_geom',
   'properties': {'PLACES_': 2,
    'PLACES_ID': 2,
    'X': 256030.875,
    'Y': 954794.5,
    'OFFSETX': 0,
    'OFFSETY': 0,
    'HEIGHT': 76.2,
    'SYMBOL': 1,
    'LEVEL_': 1,
    'TEXT': 'SALISBURY BEACH',
    'NAME': 'SA

All the relevant data is in the *features* key, which is basically a list of the neighborhoods. So, let's define a new variable that includes this data.

In [152]:
neighborhoods_data = massachusetts_data['features']

Let's take a look at the first item in this list.

In [153]:
neighborhoods_data[0]

{'type': 'Feature',
 'id': 'MGISGEONAMX2.1',
 'geometry': {'type': 'Point', 'coordinates': [-70.86436054, 42.84482233]},
 'geometry_name': 'the_geom',
 'properties': {'PLACES_': 3,
  'PLACES_ID': 1,
  'X': 251961.859,
  'Y': 955105.25,
  'OFFSETX': 0,
  'OFFSETY': 0,
  'HEIGHT': 100,
  'SYMBOL': 1,
  'LEVEL_': 1,
  'TEXT': 'S A L I S B U R Y',
  'NAME': 'SALISBURY',
  'FEATURE': 'PPL',
  'COUNTY': 25009,
  'COORD': '',
  'DATE_': 1978,
  'ELEVATION': 25,
  'SOURCE': 'USGS',
  'TILE_NAME': '146'}}
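Note that GeoJSON stores point coordinates as `[longitude, latitude]`, the reverse of the latitude-first order used later for the map and the Foursquare requests. As a minimal sketch, the hypothetical helper `feature_to_row` below (not part of the original notebook) pulls out only the fields this analysis needs, using a trimmed copy of the feature shown above:

```python
def feature_to_row(feature):
    """Return the fields this notebook uses from one GeoJSON Feature."""
    # GeoJSON Point coordinates are ordered [longitude, latitude].
    lon, lat = feature['geometry']['coordinates']
    props = feature['properties']
    return {
        'COUNTY': props['COUNTY'],
        'Neighborhood': props['NAME'],
        'Latitude': lat,
        'Longitude': lon,
    }


# Trimmed copy of the first feature shown above.
sample_feature = {
    'type': 'Feature',
    'geometry': {'type': 'Point', 'coordinates': [-70.86436054, 42.84482233]},
    'properties': {'NAME': 'SALISBURY', 'COUNTY': 25009},
}

row = feature_to_row(sample_feature)
print(row)
```

Unpacking the pair by name makes it harder to swap latitude and longitude by accident than indexing with `[0]` and `[1]`.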

#### Next, I'll transform the data into a *pandas* dataframe

The next task is essentially transforming this data of nested Python dictionaries into a *pandas* dataframe. So I'll start by creating an empty dataframe.

In [154]:
# define the dataframe columns
column_names = ['COUNTY', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

Take a look at the empty dataframe to confirm that the columns are as intended.

In [155]:
neighborhoods

Unnamed: 0,COUNTY,Neighborhood,Latitude,Longitude


Then I'll loop through the data, collect one row per feature, and build the dataframe from those rows.

In [156]:
rows = []
for data in neighborhoods_data:
    county = data['properties']['COUNTY']
    neighborhood_name = data['properties']['NAME']

    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_lon, neighborhood_lat = data['geometry']['coordinates']

    rows.append({'COUNTY': county,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

# DataFrame.append was removed in pandas 2.0, so build the frame from the collected rows
neighborhoods = pd.DataFrame(rows, columns=column_names)

Quickly examine the resulting dataframe.

In [157]:
neighborhoods.head()

Unnamed: 0,COUNTY,Neighborhood,Latitude,Longitude
0,25009,SALISBURY,42.844822,-70.864361
1,25009,SALISBURY BEACH,42.841742,-70.814618
2,25009,BROWNS POINT,42.838659,-70.83387
3,25009,RINGS ISLAND,42.816168,-70.867222
4,25009,PLUM ISLAND,42.813622,-70.808103


In [158]:
neighborhoods

Unnamed: 0,COUNTY,Neighborhood,Latitude,Longitude
0,25009,SALISBURY,42.844822,-70.864361
1,25009,SALISBURY BEACH,42.841742,-70.814618
2,25009,BROWNS POINT,42.838659,-70.83387
3,25009,RINGS ISLAND,42.816168,-70.867222
4,25009,PLUM ISLAND,42.813622,-70.808103
5,25009,JOPPA,42.807909,-70.859419
6,0,NEWBURYPORT,42.806319,-70.871285
7,25009,PLUMBUSH,42.798062,-70.830288
8,25009,UPPER GREEN,42.794772,-70.85806
9,25009,PINE ISLAND,42.780207,-70.829422


In [160]:
print('The dataframe has {} unique COUNTY values and {} neighborhoods.'.format(
        len(neighborhoods['COUNTY'].unique()),
        neighborhoods.shape[0]
    )
)

The dataframe has 15 unique COUNTY values and 1835 neighborhoods.


#### Use geopy library to get the latitude and longitude values of Boston.

In [161]:
address = 'Boston, MA'

geolocator = Nominatim(user_agent="MA_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Boston are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Boston are 42.3602534, -71.0582912.
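One caveat: `geolocator.geocode` returns `None` when no match is found, in which case `location.latitude` would raise an `AttributeError`. The sketch below shows one way to guard against that. The helper `geocode_or_fail` and the stand-in `fake_geocode` are hypothetical names introduced here so the example runs without network access; in the notebook you would pass `geolocator.geocode` instead.

```python
from types import SimpleNamespace


def geocode_or_fail(geocode, address):
    """Return (latitude, longitude) for an address, or raise if nothing was found.

    `geocode` is any callable shaped like `geolocator.geocode`.
    """
    location = geocode(address)
    if location is None:
        raise ValueError('No geocoding result for {!r}'.format(address))
    return location.latitude, location.longitude


# Stand-in for geolocator.geocode so the sketch runs offline (assumption).
def fake_geocode(address):
    known = {'Boston, MA': SimpleNamespace(latitude=42.3602534, longitude=-71.0582912)}
    return known.get(address)


boston = geocode_or_fail(fake_geocode, 'Boston, MA')
print(boston)
```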


#### Create a map of Boston and Suffolk County with neighborhoods superimposed on top.

In [162]:
# create map of Boston using latitude and longitude values
map_boston = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, COUNTY, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['COUNTY'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, COUNTY)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_boston)  
    
map_boston

**Folium** is a visualization library. Zoom into the map above and click a circle marker to reveal the name of the town and its county code.

However, for illustration purposes, let's simplify the above map and segment and cluster only the neighborhoods in Suffolk County. So let's slice the original dataframe and create a new dataframe of the Suffolk County data.

In [163]:
BostonSuffolkCounty_data = neighborhoods[neighborhoods['COUNTY'] == 25025].reset_index(drop=True)
BostonSuffolkCounty_data.head()

Unnamed: 0,COUNTY,Neighborhood,Latitude,Longitude
0,25025,POINT OF PINES,42.437468,-70.965568
1,25025,BEACHMONT,42.395601,-70.990215
2,25025,REVERE,42.411107,-71.018667
3,25025,CHELSEA,42.39143,-71.03514
4,25025,ORIENT HEIGHTS,42.387261,-71.009795
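The `COUNTY` values appear to be five-digit county FIPS codes: the first two digits identify the state (25 is Massachusetts) and the last three the county (025 is Suffolk), which is why the filter above uses 25025. This is an interpretation of the data rather than something stated by the dataset itself. A small sketch, with `split_fips` as a hypothetical helper:

```python
def split_fips(code):
    """Split a county FIPS code into (state, county) parts, padding to five digits."""
    text = '{:05d}'.format(int(code))
    return text[:2], text[2:]


state_part, county_part = split_fips(25025)
print(state_part, county_part)

# A COUNTY value of 0 (as on the NEWBURYPORT row earlier) is not a valid code,
# so such rows never match the 25025 filter.
print(split_fips(0))
```

Rows whose county code is missing or zero are silently excluded by an equality filter, so it may be worth checking that no Suffolk County place was recorded that way.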


Let's get the geographical coordinates of Boston.

In [164]:
address = 'Boston, MA'

geolocator = Nominatim(user_agent="boston_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Boston are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Boston are 42.3602534, -71.0582912.


Visualize Boston and the surrounding towns in Suffolk County.

In [165]:
# create map of Boston using latitude and longitude values
map_Boston2 = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(BostonSuffolkCounty_data['Latitude'], BostonSuffolkCounty_data['Longitude'], BostonSuffolkCounty_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=12,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_Boston2)  
    
map_Boston2

Next, I am going to start utilizing the Foursquare API to explore the neighborhoods and segment them.

#### Define Foursquare Credentials and Version

In [166]:
CLIENT_ID = 'SU0HF32WZZREMJDFBRN0BHIYNVXFVTU3LR1Z1RTY1TAZYXHE' # your Foursquare ID
CLIENT_SECRET = '22S5GP5J0J0UC2XOV1WFU2EEYA1YJG5WZCM0SW3GPIB14SPQ' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: SU0HF32WZZREMJDFBRN0BHIYNVXFVTU3LR1Z1RTY1TAZYXHE
CLIENT_SECRET:22S5GP5J0J0UC2XOV1WFU2EEYA1YJG5WZCM0SW3GPIB14SPQ


#### Let's explore the first neighborhood in our dataframe.

Get the neighborhood's name.

In [167]:
BostonSuffolkCounty_data.loc[0, 'Neighborhood']

'POINT OF PINES'

Get the neighborhood's latitude and longitude values.

In [168]:
neighborhood_latitude = BostonSuffolkCounty_data.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = BostonSuffolkCounty_data.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = BostonSuffolkCounty_data.loc[0, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of POINT OF PINES are 42.43746753, -70.96556756.


#### Now, let's get the top 200 venues that are in Point of Pines within a radius of 1000 meters.

First, let's create the GET request URL and store it in **url**.

In [169]:
LIMIT = 200 # limit of number of venues returned by Foursquare API

radius = 1000 # define radius

# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url # display URL




'https://api.foursquare.com/v2/venues/explore?&client_id=SU0HF32WZZREMJDFBRN0BHIYNVXFVTU3LR1Z1RTY1TAZYXHE&client_secret=22S5GP5J0J0UC2XOV1WFU2EEYA1YJG5WZCM0SW3GPIB14SPQ&v=20180605&ll=42.43746753,-70.96556756&radius=1000&limit=200'
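As an alternative to formatting the query string by hand, the parameters can be kept in a dictionary and encoded with the standard library. The sketch below is only illustrative: `build_explore_url` is a hypothetical helper, and the credentials are placeholders rather than real values. `requests.get` also accepts such a dictionary directly through its `params` argument.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build a venues/explore URL from named parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),  # latitude first, then longitude
        'radius': radius,
        'limit': limit,
    }
    return 'https://api.foursquare.com/v2/venues/explore?' + urlencode(params)


# Placeholder credentials (assumption); substitute your own.
example_url = build_explore_url('YOUR_CLIENT_ID', 'YOUR_CLIENT_SECRET', '20180605',
                                42.43746753, -70.96556756, 1000, 200)

# Parse the URL back to confirm each parameter survived encoding.
query = parse_qs(urlparse(example_url).query)
print(query['ll'], query['radius'], query['limit'])
```

Keeping the parameters in a dictionary also makes it easier to avoid printing the secret along with the URL.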

Send the GET request and examine the results.

In [170]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5ef9004f8d58ab258e1fee53'},
 'response': {'headerLocation': 'Revere',
  'headerFullLocation': 'Revere',
  'headerLocationGranularity': 'city',
  'totalResults': 5,
  'suggestedBounds': {'ne': {'lat': 42.44646753900001,
    'lng': -70.9533954303484},
   'sw': {'lat': 42.42846752099999, 'lng': -70.9777396896516}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4fee7c47e4b01127cba03ba8',
       'name': 'Point of Pines Private Beach',
       'location': {'address': 'Rice Ave',
        'crossStreet': 'Fowler Ave',
        'lat': 42.437731268209006,
        'lng': -70.96879679484049,
        'labeledLatLngs': [{'label': 'display',
          'lat': 42.437731268209006,
          'lng': -70.96879679484049}],
        'distance': 266,
        '

Use the **get_category_type** function from the Foursquare lab.

In [171]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

Now I'll clean the json and structure it into a *pandas* dataframe.

In [172]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()


Unnamed: 0,name,categories,lat,lng
0,Point of Pines Private Beach,Beach,42.437731,-70.968797
1,Revere Beach-North,Beach,42.434256,-70.971749
2,Pest Arrest Of New England,Business Service,42.439712,-70.96596
3,Pine River Rock Beach,River,42.437419,-70.96916
4,Mirage,Restaurant,42.441175,-70.967157


Check how many venues Foursquare returned.

In [174]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

5 venues were returned by Foursquare.
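The truncated response above reports `'distance': 266` for Point of Pines Private Beach. As a sanity check on the 1000-meter radius, a great-circle (haversine) calculation between the neighborhood coordinates and that venue gives roughly the same figure. The function name `haversine_m` and the mean Earth radius of 6,371,000 m are assumptions of this sketch, not values taken from the Foursquare API.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, earth_radius_m=6371000.0):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


# Neighborhood center and the first venue from the response above.
distance_m = haversine_m(42.43746753, -70.96556756,
                         42.437731268209006, -70.96879679484049)
print(round(distance_m))
```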


<a id='item2'></a>

## 2. Explore Neighborhoods in Suffolk County

#### Let's create a function to repeat the same process for all the neighborhoods in Suffolk County

In [175]:
def getNearbyVenues(names, latitudes, longitudes, radius=1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
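`getNearbyVenues` assumes that every response contains a non-empty `groups` list and that every venue has at least one category; if either is missing, the indexing raises a `KeyError` or `IndexError`. A more defensive version of the parsing step might look like the sketch below. `extract_venue_rows` and the `fake_payload` dictionary are hypothetical and exist only to illustrate the idea without calling the API.

```python
def extract_venue_rows(name, lat, lng, payload):
    """Turn one explore-style response into row tuples, tolerating missing pieces."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories', [])
        # Use None when a venue has no category instead of raising IndexError.
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows


# Hand-made payload shaped like the response shown earlier (assumption).
fake_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Mirage', 'location': {'lat': 42.441175, 'lng': -70.967157},
               'categories': [{'name': 'Restaurant'}]}},
    {'venue': {'name': 'Unlabeled Spot', 'location': {'lat': 42.44, 'lng': -70.96},
               'categories': []}},
]}]}}

rows = extract_venue_rows('POINT OF PINES', 42.437468, -70.965568, fake_payload)
empty = extract_venue_rows('NOWHERE', 0.0, 0.0, {'response': {}})
print(rows)
print(empty)
```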

#### Now run this function for each town

In [176]:


BostonSuffolkCounty_venues = getNearbyVenues(names=BostonSuffolkCounty_data['Neighborhood'],
                                   latitudes=BostonSuffolkCounty_data['Latitude'],
                                   longitudes=BostonSuffolkCounty_data['Longitude']
                                  )



POINT OF PINES
BEACHMONT
REVERE
CHELSEA
ORIENT HEIGHTS
CHARLESTOWN
WINTHROP
FORT WARREN
BOSTON
FORT INDEPENDENCE
ROXBURY
NEWSTEAD MONTEGRADE
FOREST HILLS
DORCHESTER
ROSLINDALE
NEPONSET
ASHMONT
MATTAPAN
FAIRMOUNT
ALLSTON
FANEUIL
BRIGHTON
ABERDEEN
BELLEVUE
HIGHLAND
GERMANTOWN
READVILLE


Check the size of the resulting dataframe.

In [177]:
print(BostonSuffolkCounty_venues.shape)
BostonSuffolkCounty_venues.head()

(1123, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,POINT OF PINES,42.437468,-70.965568,Point of Pines Private Beach,42.437731,-70.968797,Beach
1,POINT OF PINES,42.437468,-70.965568,Revere Beach-North,42.434256,-70.971749,Beach
2,POINT OF PINES,42.437468,-70.965568,Pest Arrest Of New England,42.439712,-70.96596,Business Service
3,POINT OF PINES,42.437468,-70.965568,Pine River Rock Beach,42.437419,-70.96916,River
4,POINT OF PINES,42.437468,-70.965568,Mirage,42.441175,-70.967157,Restaurant


Let's check how many venues were returned for each neighborhood.

In [178]:
BostonSuffolkCounty_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
ABERDEEN,92,92,92,92,92,92
ALLSTON,100,100,100,100,100,100
ASHMONT,28,28,28,28,28,28
BEACHMONT,27,27,27,27,27,27
BELLEVUE,39,39,39,39,39,39
BOSTON,100,100,100,100,100,100
BRIGHTON,79,79,79,79,79,79
CHARLESTOWN,82,82,82,82,82,82
CHELSEA,51,51,51,51,51,51
DORCHESTER,18,18,18,18,18,18


#### Let's find out how many unique categories can be curated from all the returned venues

In [179]:
print('There are {} unique categories.'.format(len(BostonSuffolkCounty_venues['Venue Category'].unique())))

There are 214 unique categories.
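Since the goal is to compare restaurant styles, one possible next step is to keep only the categories that describe a cuisine. The sketch below filters on names ending in "Restaurant", using a handful of category names that appear in this notebook. This is a simplifying assumption: categories such as "Pizza Place" or "BBQ Joint" also serve food but would be missed by this rule.

```python
# A few category names taken from the venue data in this notebook.
categories = [
    'Beach', 'Business Service', 'River', 'Restaurant',
    'Italian Restaurant', 'Pizza Place', 'Sushi Restaurant', 'BBQ Joint',
]

# Keep only names that end with the word "Restaurant".
restaurant_categories = [c for c in categories if c.endswith('Restaurant')]
print(restaurant_categories)
```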


<a id='item3'></a>

## 3. Analyze Each Neighborhood

In [180]:
# one hot encoding
BostonSuffolkCounty_onehot = pd.get_dummies(BostonSuffolkCounty_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
BostonSuffolkCounty_onehot['Neighborhood'] = BostonSuffolkCounty_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [BostonSuffolkCounty_onehot.columns[-1]] + list(BostonSuffolkCounty_onehot.columns[:-1])
BostonSuffolkCounty_onehot = BostonSuffolkCounty_onehot[fixed_columns]

BostonSuffolkCounty_onehot.head()

Unnamed: 0,Neighborhood,ATM,Afghan Restaurant,African Restaurant,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Bath House,Beach,Beer Garden,Belgian Restaurant,Big Box Store,Board Shop,Boat or Ferry,Bookstore,Bowling Alley,Breakfast Spot,Brewery,Bubble Tea Shop,Buffet,Burger Joint,Burmese Restaurant,Burrito Place,Bus Station,Bus Stop,Business Service,Café,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Circus,Clothing Store,Coffee Shop,College Hockey Rink,College Stadium,Comedy Club,Comfort Food Restaurant,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cuban Restaurant,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dive Bar,Dog Run,Donut Shop,Drugstore,Dry Cleaner,Dumpling Restaurant,Electronics Store,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Football Stadium,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Gas Station,Gastropub,Gift Shop,Golf Course,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Historic Site,History Museum,Home Service,Hookah Bar,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Insurance Office,Island,Italian Restaurant,Japanese Restaurant,Jazz Club,Kids Store,Korean Restaurant,Lake,Latin American Restaurant,Laundromat,Lawyer,Library,Light Rail Station,Lighthouse,Liquor Store,Locksmith,Lounge,Market,Martial Arts Dojo,Mattress Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Monument / Landmark,Moroccan Restaurant,Movie Theater,Museum,Music Store,Music Venue,Nail Salon,National Park,Nature Preserve,New American Restaurant,Noodle House,Opera House,Optical Shop,Other Repair Shop,Outdoor Sculpture,Outlet Store,Paper / Office Supplies Store,Park,Performing Arts Venue,Pet Store,Pharmacy,Pier,Piercing Parlor,Pizza Place,Platform,Playground,Plaza,Pool,Post Office,Pub,Racetrack,Record Shop,Recording Studio,Rental Car Location,Restaurant,River,Rock Club,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shipping Store,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Ski Chalet,Smoke Shop,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Steakhouse,Storage Facility,Supermarket,Supplement Shop,Surf Spot,Sushi Restaurant,Taco Place,Tapas Restaurant,Tea Room,Tennis Court,Tex-Mex Restaurant,Thai Restaurant,Theater,Thrift / Vintage Store,Toll Plaza,Tourist Information Center,Track,Trail,Train,Train Station,Tunnel,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Weight Loss Center,Wine Shop,Winery,Wings Joint,Women's Store,Yoga Studio,Zoo Exhibit
0,POINT OF PINES,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,POINT OF PINES,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,POINT OF PINES,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,POINT OF PINES,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,POINT OF PINES,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


Examine the new dataframe size.

In [181]:
BostonSuffolkCounty_onehot.shape

(1123, 215)
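The shape is consistent with the earlier counts: 1123 rows, one per venue, and 215 columns, which are the 214 category indicators plus the `Neighborhood` column. To make the encoding concrete, here is a plain-Python sketch of what one-hot encoding does to the five Point of Pines venues. It mimics the idea behind `pd.get_dummies` rather than reproducing its implementation.

```python
# Categories of the five Point of Pines venues listed earlier.
venue_categories = ['Beach', 'Beach', 'Business Service', 'River', 'Restaurant']

# One column per distinct category, in sorted order.
columns = sorted(set(venue_categories))

# One row per venue: 1 in the column of its category, 0 elsewhere.
onehot_rows = [[1 if category == col else 0 for col in columns]
               for category in venue_categories]

print(columns)
for encoded in onehot_rows:
    print(encoded)
```

Each row contains exactly one 1, so the column sums equal the number of venues in each category.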

#### Group rows by neighborhood, taking the mean of the frequency of occurrence of each category
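Because each one-hot column holds only 0 or 1, its mean within a neighborhood is the share of that neighborhood's venues that fall in the category. ABERDEEN, for example, has 92 venues, so a category with a single venue shows up as 1/92 ≈ 0.01087 in the grouped table. The sketch below computes the same shares in plain Python for the Point of Pines venues; `category_frequencies` is a hypothetical helper, not the notebook's code.

```python
from collections import Counter, defaultdict


def category_frequencies(pairs):
    """Map each neighborhood to {category: share of its venues in that category}."""
    counts = defaultdict(Counter)
    for neighborhood, category in pairs:
        counts[neighborhood][category] += 1
    return {
        neighborhood: {cat: n / sum(counter.values()) for cat, n in counter.items()}
        for neighborhood, counter in counts.items()
    }


# (neighborhood, category) pairs for the five Point of Pines venues.
pairs = [
    ('POINT OF PINES', 'Beach'),
    ('POINT OF PINES', 'Beach'),
    ('POINT OF PINES', 'Business Service'),
    ('POINT OF PINES', 'River'),
    ('POINT OF PINES', 'Restaurant'),
]

freqs = category_frequencies(pairs)
print(freqs['POINT OF PINES'])
```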

In [182]:
BostonSuffolkCounty_grouped = BostonSuffolkCounty_onehot.groupby('Neighborhood').mean().reset_index()
BostonSuffolkCounty_grouped

Unnamed: 0,Neighborhood,ATM,Afghan Restaurant,African Restaurant,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Bath House,Beach,Beer Garden,Belgian Restaurant,Big Box Store,Board Shop,Boat or Ferry,Bookstore,Bowling Alley,Breakfast Spot,Brewery,Bubble Tea Shop,Buffet,Burger Joint,Burmese Restaurant,Burrito Place,Bus Station,Bus Stop,Business Service,Café,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Circus,Clothing Store,Coffee Shop,College Hockey Rink,College Stadium,Comedy Club,Comfort Food Restaurant,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cuban Restaurant,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dive Bar,Dog Run,Donut Shop,Drugstore,Dry Cleaner,Dumpling Restaurant,Electronics Store,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Football Stadium,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Gas Station,Gastropub,Gift Shop,Golf Course,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Historic Site,History Museum,Home Service,Hookah Bar,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Insurance Office,Island,Italian Restaurant,Japanese Restaurant,Jazz Club,Kids Store,Korean Restaurant,Lake,Latin American Restaurant,Laundromat,Lawyer,Library,Light Rail Station,Lighthouse,Liquor Store,Locksmith,Lounge,Market,Martial Arts Dojo,Mattress Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Monument / Landmark,Moroccan Restaurant,Movie Theater,Museum,Music Store,Music Venue,Nail Salon,National Park,Nature Preserve,New American Restaurant,Noodle House,Opera House,Optical Shop,Other Repair Shop,Outdoor Sculpture,Outlet Store,Paper / Office Supplies Store,Park,Performing Arts Venue,Pet Store,Pharmacy,Pier,Piercing Parlor,Pizza Place,Platform,Playground,Plaza,Pool,Post Office,Pub,Racetrack,Record Shop,Recording Studio,Rental Car Location,Restaurant,River,Rock Club,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shipping Store,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Ski Chalet,Smoke Shop,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Steakhouse,Storage Facility,Supermarket,Supplement Shop,Surf Spot,Sushi Restaurant,Taco Place,Tapas Restaurant,Tea Room,Tennis Court,Tex-Mex Restaurant,Thai Restaurant,Theater,Thrift / Vintage Store,Toll Plaza,Tourist Information Center,Track,Trail,Train,Train Station,Tunnel,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Weight Loss Center,Wine Shop,Winery,Wings Joint,Women's Store,Yoga Studio,Zoo Exhibit
0,ABERDEEN,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.01087,0.032609,0.0,0.0,0.065217,0.0,0.0,0.0,0.021739,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.01087,0.01087,0.0,0.01087,0.0,0.0,0.01087,0.0,0.032609,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.021739,0.021739,0.01087,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.01087,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.021739,0.032609,0.01087,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.021739,0.0,0.0,0.076087,0.0,0.01087,0.01087,0.0,0.0,0.021739,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.021739,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.032609,0.0,0.01087,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.0,0.0
1,ALLSTON,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.04,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.03,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.06,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.02,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.02,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.0,0.01,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0
2,ASHMONT,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.035714,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.035714,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.035714,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.035714,0.0,0.0,0.035714,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.035714,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,BEACHMONT,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.037037,0.0,0.037037,0.0,0.037037,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.074074,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.074074,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.074074,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,BELLEVUE,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.025641,0.0,0.0,0.025641,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.025641,0.0,0.0,0.025641,0.0,0.025641,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.025641,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,BOSTON,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.06,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.02,0.01,0.01,0.03,0.0,0.0,0.0,0.0,0.06,0.01,0.0,0.0,0.0,0.05,0.0,0.01,0.01,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.01,0.01,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.01,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.02,0.0,0.03,0.01,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0
6,BRIGHTON,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037975,0.025316,0.025316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.012658,0.025316,0.0,0.0,0.037975,0.0,0.0,0.0,0.050633,0.0,0.0,0.025316,0.0,0.0,0.0,0.0,0.0,0.050633,0.0,0.0,0.0,0.012658,0.0,0.012658,0.0,0.0,0.0,0.0,0.025316,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025316,0.037975,0.012658,0.012658,0.0,0.0,0.012658,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.012658,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.012658,0.025316,0.0,0.0,0.101266,0.0,0.012658,0.0,0.0,0.0,0.037975,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.025316,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.012658,0.0,0.037975,0.0,0.0,0.0,0.0,0.0,0.025316,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.012658,0.0,0.012658,0.0,0.0,0.0
7,CHARLESTOWN,0.0,0.0,0.0,0.02439,0.012195,0.0,0.0,0.012195,0.012195,0.0,0.0,0.0,0.0,0.012195,0.036585,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.04878,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.036585,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.036585,0.0,0.0,0.0,0.02439,0.02439,0.012195,0.012195,0.0,0.0,0.0,0.012195,0.02439,0.0,0.0,0.0,0.012195,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.02439,0.0,0.012195,0.0,0.012195,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.012195,0.012195,0.0,0.0,0.0,0.012195,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.060976,0.0,0.012195,0.012195,0.0,0.0,0.036585,0.0,0.0,0.012195,0.012195,0.0,0.02439,0.0,0.0,0.012195,0.0,0.0,0.012195,0.0,0.0,0.0,0.02439,0.0,0.0,0.012195,0.0,0.0,0.012195,0.012195,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.012195,0.0,0.012195,0.012195,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0
8,CHELSEA,0.019608,0.0,0.0,0.039216,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.039216,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.019608,0.0,0.0,0.019608,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.019608,0.0,0.0,0.078431,0.0,0.0,0.0,0.0,0.0,0.0,0.039216,0.0,0.039216,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.019608,0.0,0.0,0.0,0.039216,0.0,0.0,0.0,0.0,0.0,0.0,0.098039,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.039216,0.0,0.0,0.0,0.0,0.019608,0.019608,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,DORCHESTER,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.055556,0.0,0.0,0.222222,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


#### Confirm the new size

In [183]:
BostonSuffolkCounty_grouped.shape

(27, 215)

#### Let's print each neighborhood along with the top 5 most common venues

In [184]:
num_top_venues = 5

for hood in BostonSuffolkCounty_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = BostonSuffolkCounty_grouped[BostonSuffolkCounty_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----ABERDEEN----
               venue  freq
0        Pizza Place  0.08
1               Café  0.07
2        Coffee Shop  0.04
3  Convenience Store  0.04
4             Bakery  0.04


----ALLSTON----
               venue  freq
0        Coffee Shop  0.06
1  Korean Restaurant  0.05
2    Thai Restaurant  0.04
3             Bakery  0.04
4        Pizza Place  0.03


----ASHMONT----
                 venue  freq
0                 Park  0.07
1        Metro Station  0.07
2        Grocery Store  0.07
3  Rental Car Location  0.04
4       Hardware Store  0.04


----BEACHMONT----
            venue  freq
0    Liquor Store  0.11
1      Food Truck  0.07
2            Park  0.07
3  Sandwich Place  0.07
4      Donut Shop  0.04


----BELLEVUE----
                 venue  freq
0         Home Service  0.08
1                 Park  0.05
2  American Restaurant  0.05
3      Thai Restaurant  0.05
4         Liquor Store  0.03


----BOSTON----
           venue  freq
0  Historic Site  0.06
1    Coffee Shop  0.06
2     

#### Put that into a *pandas* dataframe

First, write a function that sorts the venues in descending order of frequency.

In [185]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
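As a quick check of the function above, here is a minimal sketch that applies it to a single made-up row. The neighborhood name and frequencies are assumptions for illustration only and do not come from the Foursquare data.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # skip the first entry (the neighborhood name), then rank by frequency
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# made-up row for illustration only; not taken from the Foursquare data
toy_row = pd.Series({
    'Neighborhood': 'TOYTOWN',
    'Bakery': 0.10,
    'Coffee Shop': 0.30,
    'Park': 0.20,
    'Pizza Place': 0.40,
})

top_two = list(return_most_common_venues(toy_row, 2))
print(top_two)
```

The function returns category names only, so the frequencies themselves are dropped at this step.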

Now create the new dataframe and display the top 10 venues for each neighborhood.

In [186]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # positions past the 3rd fall back to the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = BostonSuffolkCounty_grouped['Neighborhood']

for ind in np.arange(BostonSuffolkCounty_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(BostonSuffolkCounty_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,ABERDEEN,Pizza Place,Café,Bakery,Coffee Shop,Convenience Store,Bank,Mexican Restaurant,Donut Shop,Bus Station,Sushi Restaurant
1,ALLSTON,Coffee Shop,Korean Restaurant,Bakery,Thai Restaurant,Bubble Tea Shop,Rental Car Location,Chinese Restaurant,Pizza Place,Seafood Restaurant,Sushi Restaurant
2,ASHMONT,Grocery Store,Metro Station,Park,Farmers Market,Breakfast Spot,Mexican Restaurant,Pizza Place,Speakeasy,Caribbean Restaurant,Fast Food Restaurant
3,BEACHMONT,Liquor Store,Food Truck,Park,Sandwich Place,Gas Station,Mattress Store,Gym,Metro Station,Supermarket,Italian Restaurant
4,BELLEVUE,Home Service,Thai Restaurant,American Restaurant,Park,Mediterranean Restaurant,Gym,Grocery Store,Liquor Store,Locksmith,Convenience Store
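The column-naming loop in the cell above works for ten venues, but it would label a 21st column as "21th". A possible generalization is sketched below; `ordinal` is a hypothetical helper and is not part of the original notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    # 11, 12 and 13 (and 111, 112, ...) take 'th' even though they end in 1, 2, 3
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns[:4])
```

For `num_top_venues = 10` this yields the same column names as the original loop.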


<a id='item4'></a>

## 4. Cluster Neighborhoods

Run *k*-means to cluster the neighborhoods into 8 clusters.

In [187]:
# set number of clusters
kclusters = 8

# drop the name column so only the numeric venue frequencies are clustered
BostonSuffolkCounty_grouped_clustering = BostonSuffolkCounty_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(BostonSuffolkCounty_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([3, 3, 3, 3, 3, 3, 3, 3, 3, 0], dtype=int32)
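Nine of the first ten labels are the same, which hints that the clusters may be quite uneven in size. A short sketch of counting neighborhoods per cluster follows. It uses only the ten labels printed above; in the notebook one would presumably pass `kmeans.labels_` instead.

```python
import numpy as np

# the first ten labels printed above; the remaining labels are not shown in the output
labels = np.array([3, 3, 3, 3, 3, 3, 3, 3, 3, 0])

# count how many neighborhoods fall into each cluster label
cluster_ids, counts = np.unique(labels, return_counts=True)
sizes = {int(c): int(n) for c, n in zip(cluster_ids, counts)}
print(sizes)  # {0: 1, 3: 9}
```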

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [188]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

BostonSuffolkCounty_merged = BostonSuffolkCounty_data

# join the cluster labels and top venues onto BostonSuffolkCounty_data, which holds latitude/longitude for each neighborhood
BostonSuffolkCounty_merged = BostonSuffolkCounty_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

BostonSuffolkCounty_merged.head() # check the last columns

Unnamed: 0,COUNTY,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,25025,POINT OF PINES,42.437468,-70.965568,2,Beach,Restaurant,River,Business Service,Zoo Exhibit,Fast Food Restaurant,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Football Stadium
1,25025,BEACHMONT,42.395601,-70.990215,3,Liquor Store,Food Truck,Park,Sandwich Place,Gas Station,Mattress Store,Gym,Metro Station,Supermarket,Italian Restaurant
2,25025,REVERE,42.411107,-71.018667,3,Pharmacy,Pizza Place,Bank,Donut Shop,Shopping Mall,Chinese Restaurant,Greek Restaurant,Mexican Restaurant,Smoke Shop,Skating Rink
3,25025,CHELSEA,42.39143,-71.03514,3,Hotel,Donut Shop,Grocery Store,Mexican Restaurant,Food,Harbor / Marina,Fast Food Restaurant,Pizza Place,Bank,American Restaurant
4,25025,ORIENT HEIGHTS,42.387261,-71.009795,3,Sandwich Place,Harbor / Marina,Cosmetics Shop,Baseball Field,Skating Rink,Food Truck,Circus,Mexican Restaurant,Pharmacy,Coffee Shop
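The `join` call above is a left join on the `Neighborhood` key. The sketch below illustrates its behavior on toy data, including what happens when a neighborhood has no matching venues row. The frames, names, and values are made up for illustration.

```python
import pandas as pd

# toy stand-ins for BostonSuffolkCounty_data and neighborhoods_venues_sorted;
# the names and values are made up for illustration
data = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C'],
    'Latitude': [42.1, 42.2, 42.3],
    'Longitude': [-71.1, -71.2, -71.3],
})
venues = pd.DataFrame({
    'Cluster Labels': [0, 1],
    'Neighborhood': ['A', 'C'],
    '1st Most Common Venue': ['Park', 'Bakery'],
})

# left join: every row of `data` is kept, matched on the 'Neighborhood' key
merged = data.join(venues.set_index('Neighborhood'), on='Neighborhood')
print(merged)

# 'B' has no match, so its cluster label is NaN and the column becomes float
unmatched = merged[merged['Cluster Labels'].isna()]['Neighborhood'].tolist()
print(unmatched)

# drop unmatched rows and restore the integer dtype before using labels as indices
clean = merged.dropna(subset=['Cluster Labels']).copy()
clean['Cluster Labels'] = clean['Cluster Labels'].astype(int)
```

In the table above every neighborhood shown has an integer label, so this cleanup would only matter if some neighborhood returned no venues.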


Finally, let's visualize the resulting clusters.

In [203]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(BostonSuffolkCounty_merged['Latitude'], BostonSuffolkCounty_merged['Longitude'], BostonSuffolkCounty_merged['Neighborhood'], BostonSuffolkCounty_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=25,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
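The map cell indexes the palette with `rainbow[cluster-1]`, so label 0 wraps around to the last color. Each cluster still gets its own color, but the mapping is shifted by one. The sketch below shows one way to build a palette and index it by the label directly. It swaps in the standard-library `colorsys` module in place of the matplotlib colormap used above, and `cluster_colors` is a hypothetical helper.

```python
import colorsys

def cluster_colors(n_clusters):
    """Return n_clusters evenly spaced hex colors (hypothetical helper)."""
    palette = []
    for i in range(n_clusters):
        # spread hues evenly around the color wheel at full saturation and value
        r, g, b = colorsys.hsv_to_rgb(i / n_clusters, 1.0, 1.0)
        palette.append('#{:02x}{:02x}{:02x}'.format(
            int(round(r * 255)), int(round(g * 255)), int(round(b * 255))))
    return palette

kclusters = 8
palette = cluster_colors(kclusters)

# index by the label itself so cluster 0 gets palette[0] and cluster 7 gets palette[7]
color_for_label = {label: palette[label] for label in range(kclusters)}
print(color_for_label[0], color_for_label[7])
```

The resulting hex strings can be passed to the `color` and `fill_color` arguments of `folium.CircleMarker` in the same way as the `rainbow` entries.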

<a id='item5'></a>

## 5. Examine Clusters

Now examine each cluster and determine the venue categories that distinguish it from the others.

#### Cluster 1

In [192]:
BostonSuffolkCounty_merged.loc[BostonSuffolkCounty_merged['Cluster Labels'] == 0, BostonSuffolkCounty_merged.columns[[1] + list(range(5, BostonSuffolkCounty_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,ROXBURY,Pizza Place,Park,Donut Shop,Playground,Sandwich Place,Plaza,Italian Restaurant,Fast Food Restaurant,New American Restaurant,Metro Station
13,DORCHESTER,Pizza Place,Fried Chicken Joint,Fast Food Restaurant,Golf Course,Men's Store,Bank,Sandwich Place,Park,Vegetarian / Vegan Restaurant,Pharmacy
15,NEPONSET,Pizza Place,Park,Mobile Phone Shop,Donut Shop,Bowling Alley,Hotel,Pharmacy,Plaza,Candy Store,Gas Station


#### Cluster 2

In [193]:
BostonSuffolkCounty_merged.loc[BostonSuffolkCounty_merged['Cluster Labels'] == 1, BostonSuffolkCounty_merged.columns[[1] + list(range(5, BostonSuffolkCounty_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,FORT WARREN,Island,Historic Site,Seafood Restaurant,Park,Farmers Market,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Football Stadium,Food Truck


#### Cluster 3

In [194]:
BostonSuffolkCounty_merged.loc[BostonSuffolkCounty_merged['Cluster Labels'] == 2, BostonSuffolkCounty_merged.columns[[1] + list(range(5, BostonSuffolkCounty_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,POINT OF PINES,Beach,Restaurant,River,Business Service,Zoo Exhibit,Fast Food Restaurant,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Football Stadium


#### Cluster 4

In [195]:
BostonSuffolkCounty_merged.loc[BostonSuffolkCounty_merged['Cluster Labels'] == 3, BostonSuffolkCounty_merged.columns[[1] + list(range(5, BostonSuffolkCounty_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,BEACHMONT,Liquor Store,Food Truck,Park,Sandwich Place,Gas Station,Mattress Store,Gym,Metro Station,Supermarket,Italian Restaurant
2,REVERE,Pharmacy,Pizza Place,Bank,Donut Shop,Shopping Mall,Chinese Restaurant,Greek Restaurant,Mexican Restaurant,Smoke Shop,Skating Rink
3,CHELSEA,Hotel,Donut Shop,Grocery Store,Mexican Restaurant,Food,Harbor / Marina,Fast Food Restaurant,Pizza Place,Bank,American Restaurant
4,ORIENT HEIGHTS,Sandwich Place,Harbor / Marina,Cosmetics Shop,Baseball Field,Skating Rink,Food Truck,Circus,Mexican Restaurant,Pharmacy,Coffee Shop
5,CHARLESTOWN,Park,Café,Gastropub,Bar,Donut Shop,Pizza Place,Coffee Shop,Pub,Sandwich Place,Boat or Ferry
6,WINTHROP,Deli / Bodega,Park,Dance Studio,Bank,Pizza Place,Pharmacy,Restaurant,Construction & Landscaping,Chinese Restaurant,Mexican Restaurant
8,BOSTON,Historic Site,Coffee Shop,Hotel,Park,Pizza Place,Bakery,Italian Restaurant,Restaurant,Sandwich Place,Gym / Fitness Center
12,FOREST HILLS,Convenience Store,American Restaurant,Bar,Park,Pizza Place,Pub,Speakeasy,Bus Station,Scenic Lookout,Sandwich Place
14,ROSLINDALE,Pizza Place,American Restaurant,Donut Shop,Discount Store,Italian Restaurant,Park,Yoga Studio,Liquor Store,Grocery Store,Sandwich Place
16,ASHMONT,Grocery Store,Metro Station,Park,Farmers Market,Breakfast Spot,Mexican Restaurant,Pizza Place,Speakeasy,Caribbean Restaurant,Fast Food Restaurant


#### Cluster 5

In [196]:
BostonSuffolkCounty_merged.loc[BostonSuffolkCounty_merged['Cluster Labels'] == 4, BostonSuffolkCounty_merged.columns[[1] + list(range(5, BostonSuffolkCounty_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,NEWSTEAD MONTEGRADE,Park,Coffee Shop,Brewery,Plaza,Gym,Seafood Restaurant,Museum,Football Stadium,Food & Drink Shop,Shopping Mall
17,MATTAPAN,Park,Fast Food Restaurant,Caribbean Restaurant,Pharmacy,Bakery,Hardware Store,Light Rail Station,Metro Station,Convenience Store,Donut Shop


#### Cluster 6

In [198]:
BostonSuffolkCounty_merged.loc[BostonSuffolkCounty_merged['Cluster Labels'] == 5, BostonSuffolkCounty_merged.columns[[1] + list(range(5, BostonSuffolkCounty_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
25,GERMANTOWN,Home Service,Donut Shop,Pool,Food & Drink Shop,Grocery Store,Drugstore,Chinese Restaurant,Latin American Restaurant,Wings Joint,Fried Chicken Joint


#### Cluster 7

In [199]:
BostonSuffolkCounty_merged.loc[BostonSuffolkCounty_merged['Cluster Labels'] == 6, BostonSuffolkCounty_merged.columns[[1] + list(range(5, BostonSuffolkCounty_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
9,FORT INDEPENDENCE,Park,Harbor / Marina,Boat or Ferry,Pier,Playground,Trail,Hot Dog Joint,Historic Site,Lighthouse,Food


#### Cluster 8

In [200]:
BostonSuffolkCounty_merged.loc[BostonSuffolkCounty_merged['Cluster Labels'] == 7, BostonSuffolkCounty_merged.columns[[1] + list(range(5, BostonSuffolkCounty_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
26,READVILLE,Bakery,Gym,Pizza Place,Plaza,Construction & Landscaping,Baseball Field,Train Station,Donut Shop,Grocery Store,Dive Bar
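The eight cells above repeat the same selection with a different label each time. As a rough sketch, the same inspection can be done in one pass with `groupby`. The dataframe below is a made-up stand-in for `BostonSuffolkCounty_merged`.

```python
import pandas as pd

# toy stand-in for BostonSuffolkCounty_merged; rows are made up for illustration
merged = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C', 'D'],
    'Cluster Labels': [0, 0, 1, 0],
    '1st Most Common Venue': ['Pizza Place', 'Pizza Place', 'Island', 'Park'],
})

# one pass over the clusters instead of one cell per label
for label, group in merged.groupby('Cluster Labels'):
    print('Cluster {} ({} neighborhoods)'.format(label + 1, len(group)))
    print(group['1st Most Common Venue'].value_counts().head(3))

# the same summary as a single Series: most frequent top venue per cluster
top_by_cluster = (
    merged.groupby('Cluster Labels')['1st Most Common Venue']
    .agg(lambda s: s.value_counts().idxmax())
)
print(top_by_cluster)
```

Summarizing the most frequent top venue per cluster is only one possible way to characterize a cluster; the full top-10 tables above give more detail.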


### This is the end of my capstone project.

In conclusion, the data shows that Cluster 4 (label 3) contains the widest variety of venues. It includes Boston and many of the surrounding towns and neighborhoods. Within Cluster 4, the neighborhood of Allston has the widest variety of restaurants, which suggests that a fusion-style restaurant would have a good chance of success there, since many different tastes are already represented in the area. The two most common restaurant categories in Allston are currently Korean Restaurant and Thai Restaurant. Further down its top-10 list are Chinese Restaurant, Pizza Place, Seafood Restaurant, and Sushi Restaurant. To maximize the uniqueness of the proposed fusion restaurant, I would recommend combining two of the less common categories on that list, pizza and sushi. This would bring together two distinct styles and could increase popularity by drawing different groups of customers.