# Major attractions nearby and Data Science jobs (in MA).

#### Date: 21 March 2019
#### Author: Nasir Ahmad
#### Publication: This article is purely based on personal research for non commercial usages. The services used by third party are credited without any intention of copy right violation. 


## Introduction

Data science jobs are available around the globe yet the hunt for job is not easy when you have too many places to choose from. In this article I will focus on the types of locations where Data Science jobs are available in Massachusetts US. 
There are two approaches to get any job:
1.	Look for a new job around your current location. 
2.	Search with keyword and then go through each job description. 

Both of above methods fail to include one basic question: What kind of neighborhood is the company located in? Everyone knows the high-tech jobs are available in Silicon Valley, Seattle, New York, and Boston and so on. But what if you are not a big fan of living in a populous city, what if you just want a peaceful country side to live and code for living! 
This article answers the following question:

**“What are the major attractions nearby locations where Data Science jobs are being offered in Massachusetts?”**


## Data Section

1. Foursquare is a platform which provides information about places in a given neighborhood. In this article Foursquare‘s listing is used to get popular sites in a given area.  It provides developers with a good API which provides result in the form of JSON file, which developer can comprehend as per requirements. 

https://foursquare.com/city-guide

2. Adzuna (A) is a website which provides jobs listing for any given location in addition to other useful services related to jobs. This article uses Adzuna developer API for information related to Data Science jobs in Massachusetts. 

https://www.adzuna.com/

3. Massachusetts State is divided into 14 counties as shown in below image. 

https://upload.wikimedia.org/wikipedia/commons/b/b6/Massachusetts-counties-map.gif


In [1]:
import requests 
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import pandas as pd
import numpy as np

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

print('Libraries imported.')

Libraries imported.


In [2]:
url ='https://api.adzuna.com:443/v1/api/jobs/us/geodata?app_id=51c4103a&app_key=6bf4141ed2bb578e5c972eeb1a68e5b7&what=Data%20science&where=Massachusetts'

In [3]:
results = requests.get(url).json()
results

{'locations': [{'location': {'__CLASS__': 'Adzuna::API::Response::Location',
    'display_name': 'Middlesex County, Massachusetts',
    'area': ['US', 'Massachusetts', 'Middlesex County']},
   'count': 568,
   '__CLASS__': 'Adzuna::API::Response::LocationJobs'},
  {'location': {'__CLASS__': 'Adzuna::API::Response::Location',
    'display_name': 'Suffolk County, Massachusetts',
    'area': ['US', 'Massachusetts', 'Suffolk County']},
   'count': 442,
   '__CLASS__': 'Adzuna::API::Response::LocationJobs'},
  {'count': 60,
   'location': {'area': ['US', 'Massachusetts', 'Worcester County'],
    '__CLASS__': 'Adzuna::API::Response::Location',
    'display_name': 'Worcester County, Massachusetts'},
   '__CLASS__': 'Adzuna::API::Response::LocationJobs'},
  {'count': 47,
   'location': {'area': ['US', 'Massachusetts', 'Norfolk County'],
    'display_name': 'Norfolk County, Massachusetts',
    '__CLASS__': 'Adzuna::API::Response::Location'},
   '__CLASS__': 'Adzuna::API::Response::LocationJobs'

In [4]:
jobs_ma = results['locations']
#['display_name']
#venues    
data_jobs_ma = json_normalize(jobs_ma) # flatten JSON
#nearby_venues.columns

In [5]:
# filter columns
filtered_columns = ['count', 'location.area', 'location.display_name']
data_jobs_ma =data_jobs_ma.loc[:, filtered_columns]
data_jobs_ma.columns = [col.split(".")[-1] for col in data_jobs_ma.columns]
data_jobs_ma.rename({'count':'job_count','display_name':'Neighborhood'}, axis=1, inplace=True)
data_jobs_ma.drop(labels='area',axis=1)

Unnamed: 0,job_count,Neighborhood
0,568,"Middlesex County, Massachusetts"
1,442,"Suffolk County, Massachusetts"
2,60,"Worcester County, Massachusetts"
3,47,"Norfolk County, Massachusetts"
4,21,"Essex County, Massachusetts"
5,13,"Hampden County, Massachusetts"
6,9,"Hampshire County, Massachusetts"
7,6,"Bristol County, Massachusetts"
8,4,"Plymouth County, Massachusetts"
9,2,"Franklin County, Massachusetts"


In [6]:
#Add the coordinates in the dataframe 

In [7]:
#funtion to return the  latitude and longitude
def get_log_lat(address):
    try:
        
        geolocator = Nominatim(user_agent="ma_explorer")
        location = geolocator.geocode(address)
        latitude = location.latitude
        longitude = location.longitude
    except :
        latitude=-1
        longitude=-1
    return  latitude , longitude


In [8]:
#data_jobs_ma['latitude']  = data_jobs_ma.apply(get_log_lat,axis=1)
#nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)
data_jobs_ma['latitude']=0
data_jobs_ma['longitude']=0
i=0
for row in data_jobs_ma['Neighborhood']:
    #print(row)
    data_jobs_ma['latitude'].iloc[i],data_jobs_ma['longitude'].iloc[i]=get_log_lat(row)
    i+=1

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self._setitem_with_indexer(indexer, value)


In [9]:
data_jobs_ma

Unnamed: 0,job_count,area,Neighborhood,latitude,longitude
0,568,"[US, Massachusetts, Middlesex County]","Middlesex County, Massachusetts",42.485452,-71.396826
1,442,"[US, Massachusetts, Suffolk County]","Suffolk County, Massachusetts",42.354445,-70.978877
2,60,"[US, Massachusetts, Worcester County]","Worcester County, Massachusetts",42.365013,-71.958455
3,47,"[US, Massachusetts, Norfolk County]","Norfolk County, Massachusetts",42.153861,-71.182801
4,21,"[US, Massachusetts, Essex County]","Essex County, Massachusetts",42.629142,-70.866495
5,13,"[US, Massachusetts, Hampden County]","Hampden County, Massachusetts",42.172589,-72.629525
6,9,"[US, Massachusetts, Hampshire County]","Hampshire County, Massachusetts",42.369013,-72.713946
7,6,"[US, Massachusetts, Bristol County]","Bristol County, Massachusetts",41.742554,-71.085655
8,4,"[US, Massachusetts, Plymouth County]","Plymouth County, Massachusetts",41.942666,-70.761859
9,2,"[US, Massachusetts, Franklin County]","Franklin County, Massachusetts",42.518933,-72.56182


## 2. Explore Massachusetts using Foursquare API

#### Foursquare API provides data about sites in a given nearby place 

In [10]:
CLIENT_ID = 'BKONW20YSKTSNRHM2ECIEPYFYA14SSA4QELHWVVP25N5SJLZ' # your Foursquare ID
CLIENT_SECRET = '1VZQZHXLZ3TT1CNSJ2O113HROIHMIPU24KF3MDLE1J4OSSJI' # your Foursquare Secret
VERSION = '20180604'
radius= 22000
LIMIT = 100


In [11]:
#### it is to be noted that the radius is kept to 5000 meters which is approx 5 km 

In [32]:
def getNearbyVenues(names, latitudes, longitudes, radius=22000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        #url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

#### execute the above code for Neighborhood of MA

In [33]:
ma_venues = getNearbyVenues(names=data_jobs_ma['Neighborhood'],
                                   latitudes=data_jobs_ma['latitude'],
                                   longitudes=data_jobs_ma['longitude']
                                  )

Middlesex County, Massachusetts
Suffolk County, Massachusetts
Worcester County, Massachusetts
Norfolk County, Massachusetts
Essex County, Massachusetts
Hampden County, Massachusetts
Hampshire County, Massachusetts
Bristol County, Massachusetts
Plymouth County, Massachusetts
Franklin County, Massachusetts
Barnstable County, Massachusetts


In [34]:
print(ma_venues.shape)
ma_venues.head()

(1017, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Middlesex County, Massachusetts",42.485452,-71.396826,Trader Joe's,42.482846,-71.414835,Grocery Store
1,"Middlesex County, Massachusetts",42.485452,-71.396826,Colonial Spirits of Acton,42.478297,-71.411557,Liquor Store
2,"Middlesex County, Massachusetts",42.485452,-71.396826,Nashoba Brook Bakery,42.458521,-71.396562,Bakery
3,"Middlesex County, Massachusetts",42.485452,-71.396826,Reasons To Be Cheerful,42.457451,-71.395757,Ice Cream Shop
4,"Middlesex County, Massachusetts",42.485452,-71.396826,Woods Hill Table,42.456593,-71.393105,New American Restaurant


#### let's see the count of each Neighborhood

In [35]:
ma_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Barnstable County, Massachusetts",100,100,100,100,100,100
"Bristol County, Massachusetts",74,74,74,74,74,74
"Essex County, Massachusetts",100,100,100,100,100,100
"Franklin County, Massachusetts",100,100,100,100,100,100
"Hampden County, Massachusetts",100,100,100,100,100,100
"Hampshire County, Massachusetts",93,93,93,93,93,93
"Middlesex County, Massachusetts",100,100,100,100,100,100
"Norfolk County, Massachusetts",100,100,100,100,100,100
"Plymouth County, Massachusetts",100,100,100,100,100,100
"Suffolk County, Massachusetts",100,100,100,100,100,100


#### Analyze each neighborhood

In [36]:
# one hot encoding
ma_onehot = pd.get_dummies(ma_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
ma_onehot['Neighborhood'] = ma_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [ma_onehot.columns[-1]] + list(ma_onehot.columns[:-1])
ma_onehot = ma_onehot[fixed_columns]

ma_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,Airport,Airport Food Court,Airport Lounge,Airport Service,American Restaurant,Arcade,Art Gallery,Art Museum,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Warehouse Store,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store,Yoga Studio
0,"Middlesex County, Massachusetts",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,"Middlesex County, Massachusetts",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,"Middlesex County, Massachusetts",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,"Middlesex County, Massachusetts",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"Middlesex County, Massachusetts",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [37]:
#lets see the stats 

ma_onehot.shape

(1017, 188)

#### Next, let's group rows by neighborhood and by taking the mean of the frequency of occurrence of each category

In [38]:
ma_grouped = ma_onehot.groupby('Neighborhood').mean().reset_index()
ma_grouped.head()

Unnamed: 0,Neighborhood,Accessories Store,Airport,Airport Food Court,Airport Lounge,Airport Service,American Restaurant,Arcade,Art Gallery,Art Museum,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Warehouse Store,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store,Yoga Studio
0,"Barnstable County, Massachusetts",0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,...,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0
1,"Bristol County, Massachusetts",0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Essex County, Massachusetts",0.0,0.0,0.0,0.0,0.0,0.09,0.0,0.0,0.0,...,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01
3,"Franklin County, Massachusetts",0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,...,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0
4,"Hampden County, Massachusetts",0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.01,0.0,...,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.02,0.01,0.0


In [39]:
ma_grouped.shape

(11, 188)

In [40]:
##Let's print each neighborhood along with the top 5 most common venues
num_top_venues = 8

for hood in ma_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = ma_grouped[ma_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Barnstable County, Massachusetts----
                venue  freq
0               Beach  0.12
1  Seafood Restaurant  0.06
2      Ice Cream Shop  0.06
3         Pizza Place  0.05
4       Grocery Store  0.04
5         Coffee Shop  0.03
6                Café  0.03
7        Burger Joint  0.03


----Bristol County, Massachusetts----
                 venue  freq
0           Donut Shop  0.08
1       Breakfast Spot  0.05
2           Restaurant  0.05
3  American Restaurant  0.05
4   Seafood Restaurant  0.05
5       Ice Cream Shop  0.04
6               Bakery  0.04
7                  Bar  0.03


----Essex County, Massachusetts----
                 venue  freq
0  American Restaurant  0.09
1   Italian Restaurant  0.06
2   Chinese Restaurant  0.04
3          Coffee Shop  0.04
4                 Farm  0.04
5       Ice Cream Shop  0.04
6   Seafood Restaurant  0.04
7          Pizza Place  0.03


----Franklin County, Massachusetts----
                 venue  freq
0  American Restaurant  0.08
1       

In [55]:
##Let's put that into a pandas dataframe
##First, let's write a function to sort the venues in descending order.

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

#### Now let's create the new dataframe and display the top 10 venues for each neighborhood

In [56]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = ma_grouped['Neighborhood']

for ind in np.arange(ma_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(ma_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Barnstable County, Massachusetts",Beach,Seafood Restaurant,Ice Cream Shop,Pizza Place,Grocery Store,American Restaurant,Mexican Restaurant,Coffee Shop,Liquor Store,Café
1,"Bristol County, Massachusetts",Donut Shop,Seafood Restaurant,Restaurant,Breakfast Spot,American Restaurant,Ice Cream Shop,Bakery,Liquor Store,Pharmacy,Park
2,"Essex County, Massachusetts",American Restaurant,Italian Restaurant,Farm,Chinese Restaurant,Seafood Restaurant,Ice Cream Shop,Coffee Shop,Pizza Place,Golf Course,Restaurant
3,"Franklin County, Massachusetts",American Restaurant,Pizza Place,Sandwich Place,Convenience Store,Gift Shop,Brewery,Bar,Discount Store,Scenic Lookout,Café
4,"Hampden County, Massachusetts",American Restaurant,Clothing Store,Coffee Shop,Pizza Place,Pharmacy,Furniture / Home Store,Donut Shop,Department Store,Bakery,Cosmetics Shop


In [57]:
ma_grouped.head()

Unnamed: 0,Neighborhood,Accessories Store,Airport,Airport Food Court,Airport Lounge,Airport Service,American Restaurant,Arcade,Art Gallery,Art Museum,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Warehouse Store,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store,Yoga Studio
0,"Barnstable County, Massachusetts",0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,...,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0
1,"Bristol County, Massachusetts",0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Essex County, Massachusetts",0.0,0.0,0.0,0.0,0.0,0.09,0.0,0.0,0.0,...,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01
3,"Franklin County, Massachusetts",0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,...,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0
4,"Hampden County, Massachusetts",0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.01,0.0,...,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.02,0.01,0.0


## 3. Cluster Neighborhoods

Run *k*-means to cluster the neighborhood into 5 clusters.

In [58]:
# set number of clusters
kclusters = 5

ma_grouped_clustering = ma_grouped.drop('Neighborhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(ma_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([4, 1, 3, 0, 1, 3, 3, 1, 1, 1], dtype=int32)

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [59]:
data_jobs_ma.head()

Unnamed: 0,job_count,area,Neighborhood,latitude,longitude
0,568,"[US, Massachusetts, Middlesex County]","Middlesex County, Massachusetts",42.485452,-71.396826
1,442,"[US, Massachusetts, Suffolk County]","Suffolk County, Massachusetts",42.354445,-70.978877
2,60,"[US, Massachusetts, Worcester County]","Worcester County, Massachusetts",42.365013,-71.958455
3,47,"[US, Massachusetts, Norfolk County]","Norfolk County, Massachusetts",42.153861,-71.182801
4,21,"[US, Massachusetts, Essex County]","Essex County, Massachusetts",42.629142,-70.866495


In [60]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

ma_merged = data_jobs_ma

# merge  with toronto_data to add latitude/longitude for each neighborhood
ma_merged = ma_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

ma_merged.head()

Unnamed: 0,job_count,area,Neighborhood,latitude,longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,568,"[US, Massachusetts, Middlesex County]","Middlesex County, Massachusetts",42.485452,-71.396826,3,American Restaurant,Ice Cream Shop,Chinese Restaurant,Grocery Store,Coffee Shop,Café,Historic Site,History Museum,Sandwich Place,Asian Restaurant
1,442,"[US, Massachusetts, Suffolk County]","Suffolk County, Massachusetts",42.354445,-70.978877,1,Coffee Shop,Donut Shop,Seafood Restaurant,Airport Lounge,Pizza Place,Italian Restaurant,Electronics Store,Sandwich Place,Café,American Restaurant
2,60,"[US, Massachusetts, Worcester County]","Worcester County, Massachusetts",42.365013,-71.958455,2,American Restaurant,Donut Shop,Trail,Diner,Pharmacy,Pizza Place,Campground,Convenience Store,High School,Shipping Store
3,47,"[US, Massachusetts, Norfolk County]","Norfolk County, Massachusetts",42.153861,-71.182801,1,Coffee Shop,Chinese Restaurant,Breakfast Spot,Gym,Thai Restaurant,Donut Shop,Pizza Place,Japanese Restaurant,Mexican Restaurant,American Restaurant
4,21,"[US, Massachusetts, Essex County]","Essex County, Massachusetts",42.629142,-70.866495,3,American Restaurant,Italian Restaurant,Farm,Chinese Restaurant,Seafood Restaurant,Ice Cream Shop,Coffee Shop,Pizza Place,Golf Course,Restaurant


Finally, let's visualize the resulting clusters

In [61]:
# create map
latitude=42.40 
longitude=-71.38
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=8)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(ma_merged['latitude'], ma_merged['longitude'], ma_merged['Neighborhood'], ma_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=20,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [49]:
ma_merged

Unnamed: 0,job_count,area,Neighborhood,latitude,longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,568,"[US, Massachusetts, Middlesex County]","Middlesex County, Massachusetts",42.485452,-71.396826,0,American Restaurant,Ice Cream Shop,Chinese Restaurant,Grocery Store,Coffee Shop,Café,Historic Site,History Museum,Sandwich Place,Asian Restaurant
1,442,"[US, Massachusetts, Suffolk County]","Suffolk County, Massachusetts",42.354445,-70.978877,0,Coffee Shop,Donut Shop,Seafood Restaurant,Airport Lounge,Pizza Place,Italian Restaurant,Electronics Store,Sandwich Place,Café,American Restaurant
2,60,"[US, Massachusetts, Worcester County]","Worcester County, Massachusetts",42.365013,-71.958455,2,American Restaurant,Donut Shop,Trail,Diner,Pharmacy,Pizza Place,Campground,Convenience Store,High School,Shipping Store
3,47,"[US, Massachusetts, Norfolk County]","Norfolk County, Massachusetts",42.153861,-71.182801,0,Coffee Shop,Chinese Restaurant,Breakfast Spot,Gym,Thai Restaurant,Donut Shop,Pizza Place,Japanese Restaurant,Mexican Restaurant,American Restaurant
4,21,"[US, Massachusetts, Essex County]","Essex County, Massachusetts",42.629142,-70.866495,0,American Restaurant,Italian Restaurant,Farm,Chinese Restaurant,Seafood Restaurant,Ice Cream Shop,Coffee Shop,Pizza Place,Golf Course,Restaurant
5,13,"[US, Massachusetts, Hampden County]","Hampden County, Massachusetts",42.172589,-72.629525,0,American Restaurant,Clothing Store,Coffee Shop,Pizza Place,Pharmacy,Furniture / Home Store,Donut Shop,Department Store,Bakery,Cosmetics Shop
6,9,"[US, Massachusetts, Hampshire County]","Hampshire County, Massachusetts",42.369013,-72.713946,0,American Restaurant,Bar,Pizza Place,Vegetarian / Vegan Restaurant,Bakery,Liquor Store,Farm,Ice Cream Shop,Trail,Brewery
7,6,"[US, Massachusetts, Bristol County]","Bristol County, Massachusetts",41.742554,-71.085655,0,Donut Shop,Seafood Restaurant,Restaurant,Breakfast Spot,American Restaurant,Ice Cream Shop,Bakery,Liquor Store,Pharmacy,Park
8,4,"[US, Massachusetts, Plymouth County]","Plymouth County, Massachusetts",41.942666,-70.761859,0,Coffee Shop,Italian Restaurant,Seafood Restaurant,Sandwich Place,Convenience Store,American Restaurant,Pub,Pizza Place,Bar,Donut Shop
9,2,"[US, Massachusetts, Franklin County]","Franklin County, Massachusetts",42.518933,-72.56182,0,American Restaurant,Pizza Place,Sandwich Place,Convenience Store,Gift Shop,Brewery,Bar,Discount Store,Scenic Lookout,Café


<a id='item5'></a>

In [29]:
### Examine Cluster

In [30]:
ma_merged.loc[ma_merged['Cluster Labels'] == 0, ma_merged.columns[[2] + list(range(5, ma_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,"Essex County, Massachusetts",0,American Restaurant,Coffee Shop,Park,Pub,Seafood Restaurant,Bakery,Liquor Store,Café,Italian Restaurant,Ice Cream Shop
8,"Plymouth County, Massachusetts",0,Coffee Shop,American Restaurant,Seafood Restaurant,Ice Cream Shop,Pizza Place,Italian Restaurant,Beach,Golf Course,Steakhouse,Café
10,"Barnstable County, Massachusetts",0,Beach,Ice Cream Shop,Seafood Restaurant,American Restaurant,Pizza Place,Grocery Store,Coffee Shop,Café,Sushi Restaurant,Burger Joint


In [31]:
ma_merged

Unnamed: 0,job_count,area,Neighborhood,latitude,longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,568,"[US, Massachusetts, Middlesex County]","Middlesex County, Massachusetts",42.485452,-71.396826,1,Ice Cream Shop,Italian Restaurant,American Restaurant,Grocery Store,Coffee Shop,Pizza Place,Sandwich Place,Spa,Café,Bakery
1,442,"[US, Massachusetts, Suffolk County]","Suffolk County, Massachusetts",42.354445,-70.978877,2,Park,Italian Restaurant,Bakery,Seafood Restaurant,Pizza Place,Hotel,Gym,Historic Site,Brewery,French Restaurant
2,60,"[US, Massachusetts, Worcester County]","Worcester County, Massachusetts",42.365013,-71.958455,1,American Restaurant,Sandwich Place,Bakery,Pizza Place,Fast Food Restaurant,Breakfast Spot,Café,Bar,Diner,Mexican Restaurant
3,47,"[US, Massachusetts, Norfolk County]","Norfolk County, Massachusetts",42.153861,-71.182801,1,Bakery,Gym / Fitness Center,Breakfast Spot,Pizza Place,Park,Trail,Gym,Grocery Store,Farm,Italian Restaurant
4,21,"[US, Massachusetts, Essex County]","Essex County, Massachusetts",42.629142,-70.866495,0,American Restaurant,Coffee Shop,Park,Pub,Seafood Restaurant,Bakery,Liquor Store,Café,Italian Restaurant,Ice Cream Shop
5,13,"[US, Massachusetts, Hampden County]","Hampden County, Massachusetts",42.172589,-72.629525,1,American Restaurant,Bakery,Burger Joint,Breakfast Spot,Ice Cream Shop,Italian Restaurant,Coffee Shop,Indian Restaurant,Pizza Place,Sandwich Place
6,9,"[US, Massachusetts, Hampshire County]","Hampshire County, Massachusetts",42.369013,-72.713946,1,Brewery,Pizza Place,American Restaurant,Breakfast Spot,Coffee Shop,Ice Cream Shop,Grocery Store,Bar,Bakery,Bookstore
7,6,"[US, Massachusetts, Bristol County]","Bristol County, Massachusetts",41.742554,-71.085655,1,American Restaurant,Ice Cream Shop,Restaurant,Pizza Place,Bakery,Coffee Shop,Breakfast Spot,Grocery Store,Café,Seafood Restaurant
8,4,"[US, Massachusetts, Plymouth County]","Plymouth County, Massachusetts",41.942666,-70.761859,0,Coffee Shop,American Restaurant,Seafood Restaurant,Ice Cream Shop,Pizza Place,Italian Restaurant,Beach,Golf Course,Steakhouse,Café
9,2,"[US, Massachusetts, Franklin County]","Franklin County, Massachusetts",42.518933,-72.56182,1,American Restaurant,Coffee Shop,Grocery Store,Pizza Place,Café,Bar,Gift Shop,Chinese Restaurant,Brewery,Breakfast Spot
