# Clustering Toronto neighborhoods

## Part 1

### Create a Dataframe with Toronto's postal codes:

<p>From the following <a href="https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M">link</a>, get the table and transform it into a dataframe with the following characteristics:</p>

<br><li>The dataframe will consist of three columns: PostalCode, Borough, and Neighborhood </li> 

<br><li>Only process the cells that have an assigned borough. Ignore cells with a borough that is Not assigned.</li> 

<br><li>More than one neighborhood can exist in one postal code area. For example, in the table on the Wikipedia page, you will notice that M5A is listed twice and has two neighborhoods: Harbourfront and Regent Park. These two rows will be combined into one row with the neighborhoods separated with a comma as shown in row 11 in the above table.</li> 

<br><li>If a cell has a borough but a Not assigned neighborhood, then the neighborhood will be the same as the borough.</li> 

<br><li>Clean your Notebook and add Markdown cells to explain your work and any assumptions you are making.</li> 

<br><li>In the last cell of your notebook, use the .shape attribute to print the number of rows of your dataframe.</li> 
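The rules listed above could be applied to a raw table roughly as follows. This is a minimal sketch on a made-up three-row-per-rule frame, assuming a page layout in which one postal code can span several rows; the sample rows and the names `raw` and `df` are illustrative only and do not come from the Wikipedia page.

```python
import pandas as pd

# Illustrative rows only; the real table comes from the Wikipedia page.
raw = pd.DataFrame({
    'PostalCode':   ['M1A', 'M5A', 'M5A', 'M7A'],
    'Borough':      ['Not assigned', 'Downtown Toronto', 'Downtown Toronto', "Queen's Park"],
    'Neighborhood': ['Not assigned', 'Harbourfront', 'Regent Park', 'Not assigned'],
})

# Keep only the rows with an assigned borough.
df = raw[raw['Borough'] != 'Not assigned'].copy()

# A neighborhood that is "Not assigned" takes the borough name.
missing = df['Neighborhood'] == 'Not assigned'
df.loc[missing, 'Neighborhood'] = df.loc[missing, 'Borough']

# Combine neighborhoods that share a postal code into one comma-separated row.
df = (df.groupby(['PostalCode', 'Borough'], as_index=False)['Neighborhood']
        .agg(', '.join))

print(df)
print('Number of rows:', df.shape[0])
```

On the current page each postal code already occupies a single row, so the grouping step may turn out to be a no-op there.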

In [3]:
# Import libraries

import pandas as pd
import numpy as np
import requests
from bs4 import BeautifulSoup

### Getting the table from Wikipedia: Pandas option

In [4]:
url='https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'

Tablas=pd.read_html(url) # returns a list with every table found on the page
TorontoCodes=Tablas[0] # the first table is the one with the postal codes
NA=Tablas[0]
TorontoCodes


Unnamed: 0,Postal Code,Borough,Neighborhood
0,M1A,Not assigned,
1,M2A,Not assigned,
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
7,M8A,Not assigned,
8,M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
9,M1B,Scarborough,"Malvern, Rouge"


### Prepare the data frame

In [5]:

print('Number of rows at the beginning of the process:', TorontoCodes.shape[0])

# Delete the postal codes without an assigned borough
indexNA=TorontoCodes[TorontoCodes['Borough']=='Not assigned'].index # Get the indexes of the rows without a borough assigned
TorontoCodes.drop(index=indexNA, inplace=True) # Delete those rows
TorontoCodes.reset_index(drop=True, inplace=True) # Renumber the remaining rows from 0
print('Rows after deleting codes without a borough assigned: ', TorontoCodes.shape[0])
TorontoCodes


Number of rows at the beginning of the process: 180
Rows after deleting codes without a borough assigned:  103


Unnamed: 0,Postal Code,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
5,M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
6,M1B,Scarborough,"Malvern, Rouge"
7,M3B,North York,Don Mills
8,M4B,East York,"Parkview Hill, Woodbine Gardens"
9,M5B,Downtown Toronto,"Garden District, Ryerson"


In [6]:
# Detect which postal codes appear more than once

code_counts=TorontoCodes['Postal Code'].value_counts()
rep_codes=list(code_counts[code_counts > 1].index)

if len(rep_codes)==0:
    print("There are no repeated codes")
else:
    print('These codes appear more than once:', rep_codes)


There are no repeated codes


In [7]:
# Detect the neighborhoods without a name

x=TorontoCodes['Neighborhood'].isnull().sum()

if x == 0:
    print('There are no missing values in the neighborhoods')
else:
    Nohood=TorontoCodes[TorontoCodes['Neighborhood'].isnull()]
    print('These are the rows without a neighborhood:', Nohood)


There are no missing values in the neighborhoods


In [8]:
TorontoCodes

Unnamed: 0,Postal Code,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
5,M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
6,M1B,Scarborough,"Malvern, Rouge"
7,M3B,North York,Don Mills
8,M4B,East York,"Parkview Hill, Woodbine Gardens"
9,M5B,Downtown Toronto,"Garden District, Ryerson"


In [7]:
print('The final number of rows is: ', TorontoCodes.shape[0])

The final number of rows is:  103


## Part 2

Now that you have built a dataframe of the postal code of each neighborhood along with the borough name and neighborhood name, in order to utilize the Foursquare location data, we need to get the latitude and the longitude coordinates of each neighborhood.

In [9]:
!conda install -c conda-forge geocoder --yes

import geocoder

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - geocoder


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ca-certificates-2020.4.5.1 |       hecc5488_0         146 KB  conda-forge
    geocoder-1.38.1            |             py_1          53 KB  conda-forge
    certifi-2020.4.5.1         |   py36h9f0ad1d_0         151 KB  conda-forge
    openssl-1.1.1g             |       h516909a_0         2.1 MB  conda-forge
    python_abi-3.6             |          1_cp36m           4 KB  conda-forge
    ratelim-0.1.6              |             py_2           6 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         2.5 MB

The following NEW packages will be INSTALLED:

    geocoder:        1.38.1-py_1       conda-forge
    python_abi:    

ModuleNotFoundError: No module named 'geocoder'

### Getting the coordinates through geocoders

The following markdown cell contains the lines of code needed to get the coordinates using the geocoder package.


codes=pd.unique(TorontoCodes['Postal Code'])

for i, code in enumerate(codes):
    ll_location=None
    # keep asking until the geocoder returns coordinates
    # (this never ends if the service does not answer)
    while ll_location is None:
        location=geocoder.google('{}, Toronto, Ontario'.format(code))
        ll_location=location.latlng

    latitude=ll_location[0]
    longitude=ll_location[1]

    # .loc avoids chained assignment and creates the columns if they do not exist yet
    TorontoCodes.loc[i, 'Latitude']= latitude
    TorontoCodes.loc[i, 'Longitude']= longitude
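
The loop above keeps retrying for as long as the geocoder returns nothing, which may never end. A bounded retry could look like the sketch below; `lookup_with_retries`, `max_attempts`, `delay_seconds` and the stand-in `always_fails` are hypothetical names introduced here for illustration and are not part of the geocoder package.

```python
import time

def lookup_with_retries(lookup, query, max_attempts=5, delay_seconds=0.0):
    """Call lookup(query) until it returns a non-None value or attempts run out."""
    for _ in range(max_attempts):
        result = lookup(query)
        if result is not None:
            return result
        time.sleep(delay_seconds)  # wait before asking again
    return None  # the caller decides what to do when the service never answers

# Hypothetical stand-in for a geocoder that never answers.
def always_fails(query):
    return None

print(lookup_with_retries(always_fails, 'M5A, Toronto, Ontario', max_attempts=3))
```

With the geocoder package installed, the `lookup` argument could be something like `lambda q: geocoder.google(q).latlng`, mirroring the call used above.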

### Getting the coordinates through the csv file

Because the previous lines of code did not return an answer, we are going to use the csv file to get the latitude and longitude.

In [10]:
#Download and read the file with coordinates

!wget -q -O 'toronto_latlng.csv' http://cocl.us/Geospatial_data
toronto_latlng=pd.read_csv('toronto_latlng.csv')
toronto_latlng.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [11]:
# merge the dataframes on their shared 'Postal Code' column

Toronto=pd.merge(left=TorontoCodes, right=toronto_latlng, on='Postal Code')
Toronto.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494


In [12]:
Toronto.shape

(103, 5)
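
The row count staying at 103 suggests that every postal code found its coordinates, since the default inner merge would silently drop any code without a match. One way to check this explicitly is sketched below on two toy frames; `codes` and `coords` are illustrative stand-ins for `TorontoCodes` and `toronto_latlng`, and the missing `M5A` row is made up for the demonstration.

```python
import pandas as pd

# Toy frames standing in for TorontoCodes and toronto_latlng.
codes = pd.DataFrame({'Postal Code': ['M3A', 'M4A', 'M5A'],
                      'Borough': ['North York', 'North York', 'Downtown Toronto']})
coords = pd.DataFrame({'Postal Code': ['M3A', 'M4A'],
                       'Latitude': [43.753259, 43.725882],
                       'Longitude': [-79.329656, -79.315572]})

# how='left' keeps every postal code; indicator=True adds a '_merge' column
# that says whether a match was found; validate raises if a code is duplicated.
checked = pd.merge(codes, coords, on='Postal Code', how='left',
                   indicator=True, validate='one_to_one')

unmatched = checked.loc[checked['_merge'] == 'left_only', 'Postal Code'].tolist()
print('Postal codes without coordinates:', unmatched)
```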

## Part 3

### Topic

<p>Explore and cluster the neighborhoods in Toronto. You can decide to work with only boroughs that contain the word Toronto and then replicate the same analysis we did to the New York City data. It is up to you.</p>

<p>Just make sure:

<li>to add enough Markdown cells to explain what you decided to do and to report any observations you make.</li>
<li>to generate maps to visualize your neighborhoods and how they cluster together.</li></p>

In [13]:
# import libraries

# libraries to handle json data and flatten it into a pandas dataframe
import json 
from pandas.io.json import json_normalize # in pandas 1.0 and later this is also available as pd.json_normalize

### Defining Foursquare credentials and version

In [14]:
CLIENT_ID='CBLSKN0KJFETYLEQ4AR2MKESM0VRMT5WRO42MHVSIXJC3QAM'
CLIENT_SECRET='GZ3XEFTRE4PP3U2VU5O3VGZG24SK0IGBHEBJATES3JMAWDDR'
VERSION='20180605'
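
Hard-coding credentials in a shared notebook exposes them to every reader. A possible alternative, sketched here under the assumption that the keys are stored in environment variables, is to read them at run time; the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_credentials` are hypothetical choices made for this example.

```python
import os

def load_credentials(env=os.environ):
    """Read the Foursquare credentials from environment variables.

    FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET are hypothetical
    variable names chosen for this sketch.
    """
    client_id = env.get('FOURSQUARE_CLIENT_ID')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Foursquare credentials are not set')
    return client_id, client_secret

# Demonstration with a plain dict instead of the real environment.
demo_env = {'FOURSQUARE_CLIENT_ID': 'my-id', 'FOURSQUARE_CLIENT_SECRET': 'my-secret'}
print(load_credentials(demo_env))
```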

### Testing the process with one neighborhood

In [15]:
# Getting the name and coordinates of the test neighborhood (row 0)
test_name=Toronto.loc[0,'Neighborhood']
test_lat=Toronto.loc[0,'Latitude']
test_lng=Toronto.loc[0,'Longitude']

print('The test neighborhood is: ', test_name,'and its coordinates are: ', test_lat,' & ', test_lng)

The test neighborhood is:  Parkwoods and its coordinates are:  43.7532586  &  -79.3296565


In [16]:
# Creating the Get request URL

Limit=100
url='https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION,
    test_lat,
    test_lng,
    Limit)
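
As an alternative to formatting the query string by hand, the parameters can be kept in a dictionary and encoded with the standard library. The sketch below reuses the endpoint and parameter names from the request above; `build_explore_url` is a hypothetical helper, and the `'ID'` and `'SECRET'` values are placeholders.

```python
from urllib.parse import urlencode

def build_explore_url(client_id, client_secret, version, lat, lng, limit, radius=None):
    """Assemble the explore URL from named parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'limit': limit,
    }
    if radius is not None:
        params['radius'] = radius
    # safe=',' keeps the comma in "lat,lng" readable instead of percent-encoding it
    return 'https://api.foursquare.com/v2/venues/explore?' + urlencode(params, safe=',')

example_url = build_explore_url('ID', 'SECRET', '20180605', 43.7532586, -79.3296565, 100)
print(example_url)
```

`requests.get` also accepts such a dictionary directly through its `params` argument, which avoids building the string at all.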

In [17]:
# Sending the request
results= requests.get(url).json()

In [18]:
# Transforming the json file into a pandas dataframe

json_list_of_venues=results['response']['groups'][0]['items']
test_venues = json_normalize(json_list_of_venues)
test_venues.head(2)


Unnamed: 0,reasons.count,reasons.items,referralId,venue.categories,venue.id,venue.location.address,venue.location.cc,venue.location.city,venue.location.country,venue.location.crossStreet,...,venue.location.labeledLatLngs,venue.location.lat,venue.location.lng,venue.location.neighborhood,venue.location.postalCode,venue.location.state,venue.name,venue.photos.count,venue.photos.groups,venue.venuePage.id
0,0,"[{'summary': 'This spot is popular', 'type': '...",e-0-4b8991cbf964a520814232e3-0,"[{'id': '4bf58dd8d48988d144941735', 'name': 'C...",4b8991cbf964a520814232e3,81 Underhill drive,CA,Toronto,Canada,,...,"[{'label': 'display', 'lat': 43.75984035203157...",43.75984,-79.324719,Parkwoods - Donalda,M3A 1Z5,ON,Allwyn's Bakery,0,[],
1,0,"[{'summary': 'This spot is popular', 'type': '...",e-0-4bd4846a6798ef3bd0c5618d-1,"[{'id': '4bf58dd8d48988d1e6941735', 'name': 'G...",4bd4846a6798ef3bd0c5618d,12 Bushbury Dr,CA,Don Mills,Canada,,...,"[{'label': 'display', 'lat': 43.75281596740471...",43.752816,-79.342741,,M3A 2Z7,ON,Donalda Golf & Country Club,0,[],


In [19]:
# slicing the dataframe

test_venues = test_venues.loc[:,['venue.name', 'venue.categories', 'venue.location.lat','venue.location.lng']]
test_venues.head(2)

Unnamed: 0,venue.name,venue.categories,venue.location.lat,venue.location.lng
0,Allwyn's Bakery,"[{'id': '4bf58dd8d48988d144941735', 'name': 'C...",43.75984,-79.324719
1,Donalda Golf & Country Club,"[{'id': '4bf58dd8d48988d1e6941735', 'name': 'G...",43.752816,-79.342741


In [20]:
# changing the columns names

test_venues.columns = [col.split('.')[-1] for col in test_venues.columns]
test_venues.head(2)

Unnamed: 0,name,categories,lat,lng
0,Allwyn's Bakery,"[{'id': '4bf58dd8d48988d144941735', 'name': 'C...",43.75984,-79.324719
1,Donalda Golf & Country Club,"[{'id': '4bf58dd8d48988d1e6941735', 'name': 'G...",43.752816,-79.342741


In [21]:
# function that extracts the category of the venue

def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [22]:
# getting the category name

test_venues['categories']=test_venues.apply(get_category_type, axis=1)
test_venues.head(2)

Unnamed: 0,name,categories,lat,lng
0,Allwyn's Bakery,Caribbean Restaurant,43.75984,-79.324719
1,Donalda Golf & Country Club,Golf Course,43.752816,-79.342741


### Applying the process to all neighborhoods

In [26]:
# Defining the function that runs the process for every neighborhood

def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT=50):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
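
The list comprehension above indexes `categories[0]` directly, which would raise an `IndexError` for a venue that has no category. A more tolerant way to flatten one item is sketched below; `venue_row` is a hypothetical helper and `sample_item` is a made-up record that only mimics the fields used above.

```python
def venue_row(name, lat, lng, item):
    """Flatten one explore item into a row, tolerating a missing category."""
    venue = item['venue']
    categories = venue.get('categories') or []
    category = categories[0]['name'] if categories else None
    return (name, lat, lng,
            venue['name'],
            venue['location']['lat'],
            venue['location']['lng'],
            category)

# Made-up item shaped like the fields used above, with an empty category list.
sample_item = {'venue': {'name': 'Example Venue',
                         'location': {'lat': 43.75, 'lng': -79.33},
                         'categories': []}}
print(venue_row('Parkwoods', 43.753259, -79.329656, sample_item))
```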

In [28]:
# Using the function created above to get a dataframe with all the venues of the neighborhoods

Toronto_venues= getNearbyVenues(Toronto['Neighborhood'], 
                                Toronto['Latitude'], 
                                Toronto['Longitude'])

Parkwoods
Victoria Village
Regent Park, Harbourfront
Lawrence Manor, Lawrence Heights
Queen's Park, Ontario Provincial Government
Islington Avenue, Humber Valley Village
Malvern, Rouge
Don Mills
Parkview Hill, Woodbine Gardens
Garden District, Ryerson
Glencairn
West Deane Park, Princess Gardens, Martin Grove, Islington, Cloverdale
Rouge Hill, Port Union, Highland Creek
Don Mills
Woodbine Heights
St. James Town
Humewood-Cedarvale
Eringate, Bloordale Gardens, Old Burnhamthorpe, Markland Wood
Guildwood, Morningside, West Hill
The Beaches
Berczy Park
Caledonia-Fairbanks
Woburn
Leaside
Central Bay Street
Christie
Cedarbrae
Hillcrest Village
Bathurst Manor, Wilson Heights, Downsview North
Thorncliffe Park
Richmond, Adelaide, King
Dufferin, Dovercourt Village
Scarborough Village
Fairview, Henry Farm, Oriole
Northwood Park, York University
East Toronto, Broadview North (Old East York)
Harbourfront East, Union Station, Toronto Islands
Little Portugal, Trinity
Kennedy Park, Ionview, East Birchmo

In [69]:
# the number of venues we've got

Toronto_venues.shape

(1689, 7)

In [73]:
# A preview of the venues found for each neighborhood

Toronto_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.753259,-79.329656,Brookbanks Park,43.751976,-79.33214,Park
1,Parkwoods,43.753259,-79.329656,TTC stop #8380,43.752672,-79.326351,Bus Stop
2,Parkwoods,43.753259,-79.329656,Variety Store,43.751974,-79.333114,Food & Drink Shop
3,Victoria Village,43.725882,-79.315572,Victoria Village Arena,43.723481,-79.315635,Hockey Arena
4,Victoria Village,43.725882,-79.315572,Tim Hortons,43.725517,-79.313103,Coffee Shop


### Transforming the dataframe to be used in the clustering process

We are going to transform the dataframe into one that shows the frequency of each kind of venue in every neighborhood.

In [101]:
# one hot encoding
Toronto_onehot=pd.get_dummies(Toronto_venues[['Venue Category']])

# adding neighborhoods
Toronto_onehot['Neighborhood']=Toronto_venues['Neighborhood']

# Moving the neighborhood column to the front
lista=[Toronto_onehot.columns[-1]]+list(Toronto_onehot.columns[:-1])
Toronto_onehot = Toronto_onehot[lista]

# Getting the frequency: the mean of a one-hot column is the share of venues in that category
Toronto_fre=Toronto_onehot.groupby('Neighborhood').mean().reset_index()

# fixing the column names: drop the 'Venue Category_' prefix added by get_dummies

Toronto_fre.columns=[name.split("_")[-1] for name in Toronto_fre.columns]

Toronto_fre.head()

Unnamed: 0,Neighborhood,Accessories Store,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wings Joint,Women's Store,Yoga Studio
0,Agincourt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Alderwood, Long Branch",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Bathurst Manor, Wilson Heights, Downsview North",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Bayview Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Bedford Park, Lawrence Manor East",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
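
To make the meaning of these numbers concrete, the same per-neighborhood shares can be computed in plain Python. The sketch below uses only the five venue rows previewed earlier, so the resulting shares are illustrative and need not match the full table; the names `venues`, `counts` and `frequencies` are introduced here for the example.

```python
from collections import Counter, defaultdict

# (neighborhood, venue category) pairs taken from the five rows previewed earlier.
venues = [
    ('Parkwoods', 'Park'),
    ('Parkwoods', 'Bus Stop'),
    ('Parkwoods', 'Food & Drink Shop'),
    ('Victoria Village', 'Hockey Arena'),
    ('Victoria Village', 'Coffee Shop'),
]

# Count how often each category appears in each neighborhood.
counts = defaultdict(Counter)
for neighborhood, category in venues:
    counts[neighborhood][category] += 1

# Share of each category among the venues of a neighborhood.
frequencies = {
    hood: {cat: n / sum(cats.values()) for cat, n in cats.items()}
    for hood, cats in counts.items()
}
print(frequencies['Victoria Village'])
```

Each row of shares sums to 1, which is what taking the mean of the one-hot columns per neighborhood produces as well.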


### Clustering neighborhoods

In [86]:
# import libraries

from sklearn.cluster import KMeans

In [92]:
# set the number of clusters

ncluster=4

# get the values

Toronto_fre_clustering= Toronto_fre.drop(columns='Neighborhood')


# Run Kmeans

neigh= KMeans(init="k-means++", n_clusters=ncluster, n_init=12).fit(Toronto_fre_clustering)

# preview the labels

neigh.labels_[0:50]

array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 3, 2,
       2, 2, 2, 1, 2, 2], dtype=int32)
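
The preview suggests that most neighborhoods end up in a single cluster. A quick tally makes that visible; the sketch below copies the 50 labels printed above by hand, so it only describes this preview and not the full result.

```python
from collections import Counter

# The 50 labels printed above, copied by hand.
labels = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
          2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 3, 2,
          2, 2, 2, 1, 2, 2]

# Number of neighborhoods per cluster label.
sizes = Counter(labels)
for cluster in sorted(sizes):
    print('cluster', cluster, '->', sizes[cluster], 'neighborhoods')
```

On the full result, `pd.Series(neigh.labels_).value_counts()` would give the same kind of summary. Because KMeans starts from random centroids, passing a fixed `random_state` would be one way to keep the labels stable between runs.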

In [102]:
# Assigning the cluster label to each neighborhood 

Toronto_fre.insert(0, 'Cluster Label', neigh.labels_)

Toronto_fre.head()

Unnamed: 0,Cluster Label,Neighborhood,Accessories Store,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wings Joint,Women's Store,Yoga Studio
0,2,Agincourt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,2,"Alderwood, Long Branch",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,2,"Bathurst Manor, Wilson Heights, Downsview North",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,2,Bayview Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,2,"Bedford Park, Lawrence Manor East",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


### Plotting the Clusters

In [123]:
# Getting the columns we need
Toronto_cluster=Toronto_fre.iloc[:,[0,1]]
Toronto_cluster.head()

Unnamed: 0,Cluster Label,Neighborhood
0,2,Agincourt
1,2,"Alderwood, Long Branch"
2,2,"Bathurst Manor, Wilson Heights, Downsview North"
3,2,Bayview Village
4,2,"Bedford Park, Lawrence Manor East"


In [125]:
# Merging the dataframes on 'Neighborhood' to get all the information needed

Toronto_merged= pd.merge(left=Toronto_cluster, right=Toronto, on='Neighborhood', how='left')
Toronto_merged.head()

Unnamed: 0,Cluster Label,Neighborhood,Postal Code,Borough,Latitude,Longitude
0,2,Agincourt,M1S,Scarborough,43.7942,-79.262029
1,2,"Alderwood, Long Branch",M8W,Etobicoke,43.602414,-79.543484
2,2,"Bathurst Manor, Wilson Heights, Downsview North",M3H,North York,43.754328,-79.442259
3,2,Bayview Village,M2K,North York,43.786947,-79.385975
4,2,"Bedford Park, Lawrence Manor East",M5M,North York,43.733283,-79.41975


In [127]:
#Cleaning the Dataframe

Toronto_merged= Toronto_merged[['Neighborhood', 'Cluster Label', 'Latitude', 'Longitude']]
Toronto_merged.head()

Unnamed: 0,Neighborhood,Cluster Label,Latitude,Longitude
0,Agincourt,2,43.7942,-79.262029
1,"Alderwood, Long Branch",2,43.602414,-79.543484
2,"Bathurst Manor, Wilson Heights, Downsview North",2,43.754328,-79.442259
3,Bayview Village,2,43.786947,-79.385975
4,"Bedford Park, Lawrence Manor East",2,43.733283,-79.41975


In [129]:
# Import the library

!conda install -c conda-forge folium=0.5.0 --yes
import folium

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - folium=0.5.0


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    altair-4.1.0               |             py_1         614 KB  conda-forge
    openssl-1.1.1g             |       h516909a_0         2.1 MB  conda-forge
    branca-0.4.1               |             py_0          26 KB  conda-forge
    vincent-0.4.4              |             py_1          28 KB  conda-forge
    folium-0.5.0               |             py_0          45 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         2.8 MB

The following NEW packages will be INSTALLED:

    altair:          4.1.0-py_1        conda-forge
    branca:          0.4.1-py_0        conda-forge
    folium:          0.5.0-py_0        con

In [130]:
# Establish the latitude and longitude of the city

latitude=43.7001114
longitude=-79.4162979

In [133]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
rainbow = ['blue','red','yellow','green']

# add markers to the map
for lat, lon, poi, cluster in zip(Toronto_merged['Latitude'], Toronto_merged['Longitude'], Toronto_merged['Neighborhood'], Toronto_merged['Cluster Label']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],  # labels run from 0 to 3, so they index the list directly
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
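
KMeans labels run from 0 to `n_clusters - 1`, and a Python list accepts negative indexes without complaint, so an off-by-one in the color lookup goes unnoticed. A small explicit lookup, sketched below, fails loudly instead; `color_for_cluster` is a hypothetical helper and its default palette repeats the four colors used above.

```python
def color_for_cluster(cluster, palette=('blue', 'red', 'yellow', 'green')):
    """Return the marker color for a cluster label in the range 0..len(palette)-1."""
    if not 0 <= cluster < len(palette):
        raise ValueError('no color defined for cluster {}'.format(cluster))
    return palette[cluster]

# Show which color each of the four labels would get.
for label in range(4):
    print(label, color_for_cluster(label))
```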