# Capstone Project - The Battle of the Neighborhoods
### Applied Data Science Capstone by IBM/Coursera
### Christopher Lawrence

***

## Data Analysis for New Coffee Shop Locations in Anne Arundel County Maryland

## Overview



For the Capstone project, I chose a scenario in which a large, national coffee chain is searching for new locations in which to open new coffee shops.  For the area, I chose to explore the region in which I currently live, Anne Arundel County Maryland.  This region is located in the suburbs between Baltimore Maryland and Washington D.C.  This area has a large metropolitan population with many coffee shops throughout the area.  I will use the Foursquare data API to establish the current coffee shop environment in Anne Arundel County in conjunction with other data sets from the Maryland and Anne Arundel Maryland County open data set web portals.  

***

## Introduction and Background: Business Problem <a name="introduction"></a>

This section will present a brief description of the problem/question to be explored and a discussion of the background of the scenario presented in this Capstone.

As stated in the overview section, this Capstone will present the problem of searching for an 'ideal' location to expand a new coffee shop store somewhere in the county of Anne Arundel Maryland.  Anne Arundel Maryland is a large, suburban county in the state of Maryland located between the cities of Baltimore Maryland and Washington D.C.  The problem/scenario presented in this Capstone will center around defining the properties of an 'ideal' location given the Foursquare API dataset information for existing coffee shops throughout the Anne Arundel county area.  

After exploring the various attributes of the existing coffee shop data in the Foursquare database (in conjunction with a systematic partitioning of Anne Arundel county via county open data sets (discussed in the data section), the business problem of where to locate a new coffee shop will become further defined.  In this scenario, the company wishes to expand its current number of coffee shops in the Anne Arundel county area, but also wants to maximize its business opportunities by avoiding areas of the county that are saturated in existing coffee shops and other establishments that cut into the coffee shop business.



***

## Data Analysis

Import Libraries:

In [1]:
import pandas as pd
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


Get the datasets:

For now, Anne Arundel County will be partitioned by zip code.  So I will obtain the zip code dataset for Anne Arundel County from the official County open data web portal at https://opendata.aacounty.org/datasets/zip-codes

In [2]:
# importing the zip code .csv file

aa_zip_codes = pd.read_csv('https://opendata.arcgis.com/datasets/899d24a210094af38a6ebe83c53c760f_7.csv', header=0)

In [3]:
type(aa_zip_codes)

pandas.core.frame.DataFrame

In [4]:
aa_zip_codes.head()

Unnamed: 0,OBJECTID_1,OBJECTID,ZIP,PO_NAME,STATE,CITY_NAME,CITY_CODE,Shape_Leng,ShapeSTArea,ShapeSTLength
0,1,1,21060,Glen Burnie,MD,Glen Burnie,GBE,135020.538278,379277500.0,135093.770581
1,2,2,20779,Tracys Landing,MD,Tracys Landing,TL,80203.800969,214789200.0,80203.800969
2,3,3,20764,Shady Side,MD,Shady Side,SS,53026.041572,158010600.0,53026.041572
3,4,4,20714,North Beach,MD,North Beach,NB,23465.94029,28628090.0,23465.94029
4,5,5,21054,Gambrills,MD,Gambrills,GM,245374.096713,493026800.0,248669.482468


In [5]:
aa_zip_codes.shape

(49, 10)

In [6]:
aa_zip_codes.dtypes

OBJECTID_1         int64
OBJECTID           int64
ZIP                int64
PO_NAME           object
STATE             object
CITY_NAME         object
CITY_CODE         object
Shape_Leng       float64
ShapeSTArea      float64
ShapeSTLength    float64
dtype: object

There are 49 zip code areas in Anne Arundel County. The dataset contains extra column data that can be removed, keeping only OBJECTID, ZIP, and PO_NAME.  There are no missing values or other issues for the remaining columns.

In [7]:
aa_zip_codes = aa_zip_codes.drop(['OBJECTID_1','STATE','PO_NAME','CITY_CODE', 'Shape_Leng','ShapeSTArea','ShapeSTLength'], axis=1)

In [8]:
aa_zip_codes.head()

Unnamed: 0,OBJECTID,ZIP,CITY_NAME
0,1,21060,Glen Burnie
1,2,20779,Tracys Landing
2,3,20764,Shady Side
3,4,20714,North Beach
4,5,21054,Gambrills


I need to find the latitude and longitude for each of the 49 zip code areas in Anne Arundel County:

I needed another dataset that linked zip codes and their corresponding lat/long geocoordinates.  I utilized a gazetteer file from the Census.gov web portal:

In [9]:
gen_zip_codes = pd.read_csv('2019_Gaz_zcta_national.csv', header=0)

In [10]:
gen_zip_codes.head()

Unnamed: 0,GEOID,ALAND,AWATER,ALAND_SQMI,AWATER_SQMI,INTPTLAT,INTPTLONG
0,20701,3429311,6563,1.324,0.003,39.125563,-76.785436
1,20705,41126879,259327,15.879,0.1,39.049423,-76.900362
2,20706,26786677,128248,10.342,0.05,38.96588,-76.851092
3,20707,28854743,466154,11.141,0.18,39.09917,-76.879786
4,20708,36138186,784564,13.953,0.303,39.048173,-76.824036


In [11]:
gen_zip_codes.shape

(219, 7)

In [12]:
gen_zip_codes = gen_zip_codes.drop(['ALAND','AWATER','ALAND_SQMI','AWATER_SQMI'], axis=1)

In [13]:
gen_zip_codes.rename(columns = {'GEOID':'ZIP'}, inplace = True) 

In [14]:
gen_zip_codes.head()

Unnamed: 0,ZIP,INTPTLAT,INTPTLONG
0,20701,39.125563,-76.785436
1,20705,39.049423,-76.900362
2,20706,38.96588,-76.851092
3,20707,39.09917,-76.879786
4,20708,39.048173,-76.824036


In [15]:
gen_zip_codes.shape

(219, 3)

Now merge the two tables on zip code:

In [16]:
aa_zip_codes_2 = pd.merge(gen_zip_codes, aa_zip_codes, on='ZIP')
aa_zip_codes_2.head()

Unnamed: 0,ZIP,INTPTLAT,INTPTLONG,OBJECTID,CITY_NAME
0,20701,39.125563,-76.785436,9,Annapolis Junction
1,20701,39.125563,-76.785436,29,Fort Meade
2,20711,38.801059,-76.645107,12,Lothian
3,20714,38.722457,-76.532813,4,North Beach
4,20724,39.101077,-76.804003,6,Laurel


In [17]:
aa_zip_codes_2.shape

(49, 5)

In [18]:
aa_zip_codes_2.dtypes

ZIP            int64
INTPTLAT     float64
INTPTLONG    float64
OBJECTID       int64
CITY_NAME     object
dtype: object

Let's plot a generic map of Anne Arundel County Maryland:

In [19]:
# use geopy library to get lat/long of Anne Arundel County

address = 'Anne Arundel, MD'

geolocator = Nominatim(user_agent="aa_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Anne Arundel are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Anne Arundel are 38.9722583, -76.573454.


In [20]:
aa_zip_codes_2.columns

Index(['ZIP', 'INTPTLAT', 'INTPTLONG', 'OBJECTID', 'CITY_NAME'], dtype='object')

In [21]:
# create map of Anne Arundel using latitude and longitude values
map_aaCounty = folium.Map(location=[latitude, longitude], zoom_start=10)
map_aaCounty

In [22]:
# add markers to map
for lat, lng, city, zip_code in zip(aa_zip_codes_2['INTPTLAT'], aa_zip_codes_2['INTPTLONG'], aa_zip_codes_2['CITY_NAME'], aa_zip_codes_2['ZIP']):
    label = '{}, {}'.format(city, zip_code)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_aaCounty)  
    


In [23]:
map_aaCounty

Obtain coffee shop location data from Foursquare API database:

Define Foursquare credentials and version:

In [24]:
CLIENT_ID = 'SSVL0O4BQ4ERACH2R31MZCT1ZKL4YDIOHP4WJWKI0ET5WBHM' # your Foursquare ID
CLIENT_SECRET = 'Z52JPQVDEZH2TPNOKFU43JXMSRHVRCTVEVVIFYKG1XFEHGW2' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: SSVL0O4BQ4ERACH2R31MZCT1ZKL4YDIOHP4WJWKI0ET5WBHM
CLIENT_SECRET:Z52JPQVDEZH2TPNOKFU43JXMSRHVRCTVEVVIFYKG1XFEHGW2


In [25]:
# function adapted from Project in week 3 to find nearby venues of a lat/long area--changed to bring back coffee shops only.

LIMIT = 10000 # limit of number of venues returned by Foursquare API

def getNearbyCoffeeShops(names, latitudes, longitudes, radius=10000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

Run the function to get coffee shops for each zip code zone.

In [26]:
aa_venues = getNearbyCoffeeShops(names=aa_zip_codes_2['CITY_NAME'],
                                   latitudes=aa_zip_codes_2['INTPTLAT'],
                                   longitudes=aa_zip_codes_2['INTPTLONG']
                                  )

Annapolis Junction
Fort Meade
Lothian
North Beach
Laurel
Fort Meade
Churchton
Owings
Deale
Dunkirk
Fort Meade
Fort Meade
Friendship
Shady Side
Galesville
Harwood
West River
Tracys Landing
Fort Meade
Jessup
Arnold
Crownsville
Davidsonville
Edgewater
Gambrills
Gibson Island
Glen Burnie
Glen Burnie
Hanover
Harmans
Linthicum Heights
Millersville
Odenton
Crofton
Pasadena
Riva
Severn
Severna Park
Brooklyn
Curtis Bay
BWI Airport
Annapolis
Annapolis
Naval Academy
Annapolis
Annapolis
Annapolis
Annapolis
Annapolis


In [43]:
print(aa_venues.shape)
aa_venues

(4465, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Annapolis Junction,39.125563,-76.785436,Corridor Marketplace,39.098,-76.806654,Shopping Mall
1,Annapolis Junction,39.125563,-76.785436,Total Wine & More,39.096447,-76.808914,Wine Shop
2,Annapolis Junction,39.125563,-76.785436,Jailbreak Brewing Company,39.124002,-76.823293,Brewery
3,Annapolis Junction,39.125563,-76.785436,National Cryptologic Museum,39.114903,-76.774971,History Museum
4,Annapolis Junction,39.125563,-76.785436,Lima's Chicken,39.132402,-76.741959,Peruvian Restaurant
5,Annapolis Junction,39.125563,-76.785436,Chick-fil-A,39.097636,-76.807728,Fast Food Restaurant
6,Annapolis Junction,39.125563,-76.785436,Rita's Italian Ice & Frozen Custard,39.097973,-76.808761,Ice Cream Shop
7,Annapolis Junction,39.125563,-76.785436,sweetFrog,39.096902,-76.809316,Frozen Yogurt Shop
8,Annapolis Junction,39.125563,-76.785436,The Hotel At Arundel Preserve,39.15222,-76.743751,Hotel
9,Annapolis Junction,39.125563,-76.785436,Panera Bread,39.096325,-76.804594,Bakery


The Foursquare dataset returned data that contains two ways to say coffee shop, 'Coffee Shop' and 'Café', so I need to get both and merge into one dataframe.

Create a dataframe with the venue equal to 'Café'

In [71]:
aa_coffee_shops_1 = aa_venues[aa_venues['Venue Category'] == ('Café')]

In [72]:
aa_coffee_shops_1.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
36,Annapolis Junction,39.125563,-76.785436,more than java cafe',39.105322,-76.846451,Café
136,Fort Meade,39.125563,-76.785436,more than java cafe',39.105322,-76.846451,Café
374,Laurel,39.101077,-76.804003,more than java cafe',39.105322,-76.846451,Café
474,Fort Meade,39.101077,-76.804003,more than java cafe',39.105322,-76.846451,Café
840,Fort Meade,39.107783,-76.747196,more than java cafe',39.105322,-76.846451,Café


In [73]:
aa_coffee_shops_1.shape

(49, 7)

Now create a dataset with the venue equal to 'Coffee Shop'

In [74]:
aa_coffee_shops_2 = aa_venues[aa_venues['Venue Category'] == ('Coffee Shop')]

In [75]:
aa_coffee_shops_2.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
34,Annapolis Junction,39.125563,-76.785436,Starbucks,39.153211,-76.728363,Coffee Shop
38,Annapolis Junction,39.125563,-76.785436,Sip at C Street,39.10616,-76.845092,Coffee Shop
62,Annapolis Junction,39.125563,-76.785436,Starbucks,39.09787,-76.808211,Coffee Shop
70,Annapolis Junction,39.125563,-76.785436,Starbucks,39.156534,-76.724411,Coffee Shop
84,Annapolis Junction,39.125563,-76.785436,Starbucks,39.154524,-76.742366,Coffee Shop


In [76]:
aa_coffee_shops_2.shape

(135, 7)

Now concatenate the two coffee dataframes into one new complete coffee dataframe:

In [81]:
frames = [aa_coffee_shops_1, aa_coffee_shops_2]

In [82]:
aa_coffee_shops = pd.concat(frames)

In [83]:
aa_coffee_shops.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
36,Annapolis Junction,39.125563,-76.785436,more than java cafe',39.105322,-76.846451,Café
136,Fort Meade,39.125563,-76.785436,more than java cafe',39.105322,-76.846451,Café
374,Laurel,39.101077,-76.804003,more than java cafe',39.105322,-76.846451,Café
474,Fort Meade,39.101077,-76.804003,more than java cafe',39.105322,-76.846451,Café
840,Fort Meade,39.107783,-76.747196,more than java cafe',39.105322,-76.846451,Café


In [84]:
aa_coffee_shops.shape

(184, 7)

In [86]:
aa_coffee_shops.to_csv('aa_coffee_shops_final.csv')

Now map the coffee shops on the Anne Arundel County map to see the distribution:

In [87]:
# create map of Anne Arundel using latitude and longitude values
map_aaCounty = folium.Map(location=[latitude, longitude], zoom_start=10)

In [91]:
# add markers to map
for lat, lng, shop_name, neighborhood in zip(aa_coffee_shops['Venue Latitude'], aa_coffee_shops['Venue Longitude'], aa_coffee_shops['Venue'], aa_coffee_shops['Neighborhood']):
    label = '{}, {}'.format(shop_name, neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_aaCounty)  

In [92]:
map_aaCounty

Now, I'll try clustering the coffee shop locations into several discreet clusters/areas to see if a regional pattern of dense/available areas emerge.  I'll begin with 7 clusters.

In [93]:
aa_coffee_shops.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
36,Annapolis Junction,39.125563,-76.785436,more than java cafe',39.105322,-76.846451,Café
136,Fort Meade,39.125563,-76.785436,more than java cafe',39.105322,-76.846451,Café
374,Laurel,39.101077,-76.804003,more than java cafe',39.105322,-76.846451,Café
474,Fort Meade,39.101077,-76.804003,more than java cafe',39.105322,-76.846451,Café
840,Fort Meade,39.107783,-76.747196,more than java cafe',39.105322,-76.846451,Café


In [107]:
aa_coffee_shops.dtypes

Neighborhood               object
Neighborhood Latitude     float64
Neighborhood Longitude    float64
Venue                      object
Venue Latitude            float64
Venue Longitude           float64
Venue Category             object
Clus_Db                     int64
dtype: object

In [113]:
# set number of clusters
kclusters = 7

coffe_shop_clustering = aa_coffee_shops[['Venue Latitude', 'Venue Longitude']]

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(coffe_shop_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([5, 5, 5, 5, 5, 5, 6, 5, 5, 0])

Create a new dataframe that includes the cluster:

In [114]:
# add clustering labels
aa_coffee_shops.insert(0, 'Cluster Labels', kmeans.labels_)

In [115]:
aa_coffee_shops.head()

Unnamed: 0,Cluster Labels,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Clus_Db
36,5,Annapolis Junction,39.125563,-76.785436,more than java cafe',39.105322,-76.846451,Café,0
136,5,Fort Meade,39.125563,-76.785436,more than java cafe',39.105322,-76.846451,Café,0
374,5,Laurel,39.101077,-76.804003,more than java cafe',39.105322,-76.846451,Café,0
474,5,Fort Meade,39.101077,-76.804003,more than java cafe',39.105322,-76.846451,Café,0
840,5,Fort Meade,39.107783,-76.747196,more than java cafe',39.105322,-76.846451,Café,0


Map the resulting clusters:

In [118]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(aa_coffee_shops['Venue Latitude'], aa_coffee_shops['Venue Longitude'], aa_coffee_shops['Neighborhood'], aa_coffee_shops['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Now, I'm going to use the DBSCAN clustering method to see how this changes the clustering results:

In [119]:
from sklearn.cluster import DBSCAN
import sklearn.utils
from sklearn.preprocessing import StandardScaler
sklearn.utils.check_random_state(1000)
from sklearn.cluster import DBSCAN 
from sklearn.datasets.samples_generator import make_blobs 
from sklearn.preprocessing import StandardScaler 
import matplotlib.pyplot as plt 
%matplotlib inline

In [120]:
Clus_dataSet = aa_coffee_shops[['Venue Latitude','Venue Longitude']]
Clus_dataSet = np.nan_to_num(Clus_dataSet)
Clus_dataSet = StandardScaler().fit_transform(Clus_dataSet)

In [121]:
# Compute DBSCAN
db = DBSCAN(eps=0.15, min_samples=10).fit(Clus_dataSet)
core_samples_mask = np.zeros_like(db.labels_, dtype=bool)
core_samples_mask[db.core_sample_indices_] = True
labels = db.labels_
aa_coffee_shops["Clus_Db"]=labels

realClusterNum=len(set(labels)) - (1 if -1 in labels else 0)
clusterNum = len(set(labels)) 


# A sample of clusters
aa_coffee_shops[["Neighborhood","Clus_Db"]].head()

Unnamed: 0,Neighborhood,Clus_Db
36,Annapolis Junction,0
136,Fort Meade,0
374,Laurel,0
474,Fort Meade,0
840,Fort Meade,0


In [125]:
aa_coffee_shops.describe()

Unnamed: 0,Cluster Labels,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Clus_Db
count,184.0,184.0,184.0,184.0,184.0,184.0
mean,2.211957,39.035506,-76.612653,39.039704,-76.615762,1.141304
std,2.163702,0.109822,0.114995,0.113999,0.117816,1.905308
min,0.0,38.689075,-76.804003,38.716557,-76.846451,-1.0
25%,0.0,38.986634,-76.716408,38.978219,-76.725399,-1.0
50%,1.0,39.028343,-76.579016,39.038072,-76.598151,1.0
75%,4.0,39.120916,-76.531506,39.13783,-76.503006,3.0
max,6.0,39.226117,-76.428558,39.280709,-76.42451,5.0


In [123]:
aa_coffee_shops.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 184 entries, 36 to 4462
Data columns (total 9 columns):
Cluster Labels            184 non-null int32
Neighborhood              184 non-null object
Neighborhood Latitude     184 non-null float64
Neighborhood Longitude    184 non-null float64
Venue                     184 non-null object
Venue Latitude            184 non-null float64
Venue Longitude           184 non-null float64
Venue Category            184 non-null object
Clus_Db                   184 non-null int64
dtypes: float64(4), int32(1), int64(1), object(3)
memory usage: 13.7+ KB


Map the resulting clusters:

In [126]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(aa_coffee_shops['Venue Latitude'], aa_coffee_shops['Venue Longitude'], aa_coffee_shops['Neighborhood'], aa_coffee_shops['Clus_Db']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

***