# Clustering of neighborhoods in Taipei city by using Foursquare data

## Introduction/Business Problem

>Taipei, located in northern Taiwan, is the capital and a special municipality of Taiwan. The city proper is home to approximately 2.7 million people, which gives a population density of nearly 10,000 people per square kilometer.
Taiwan is my favorite country for vacations for a number of reasons: its subtropical climate, friendly and hospitable people, a wide variety of tourist attractions, well-connected public transport, and irresistible Taiwanese cuisine. Being the political, economic, educational and cultural center of Taiwan, Taipei attracts millions of overseas visitors each year, making it the 15th most visited city globally and the most visited city in the Chinese-speaking world.

>However, despite the large number of travel guides and recommendations, it is not easy for people who are new to the city to decide where to stay on their first visit. Depending on the purpose and duration of the trip, visitors have different preferences and requirements: some need easy access to major transportation hubs, some prefer a quiet place far from the city center, some want to be close to the commercial centers, and others prefer an area that offers the most authentic experience of local life. Since I have been to Taipei many times, my friends and relatives have asked me several times for advice on visiting the different parts of the city.

>To meet this need, I decided to create a map of Taipei that visualizes the characteristics of different clusters of neighborhoods, giving new visitors a quick overview of how the different kinds of areas are distributed across the city.

## Data

>Considering the problem stated above, I will use the following data to create the target map.

>- A list of the administrative districts of Taipei City with postal codes, scraped from [this Wiki page](https://en.wikipedia.org/wiki/Postal_codes_in_Taiwan). The data is in the second table on that page.
>- Location coordinates of Taipei City and of the administrative districts listed above, obtained with geopy.
>- Popular venues in each Taipei district, explored with the Foursquare API. I’ll be requesting up to 200 venues per district within a radius of 1000 m, which is a reasonable walking distance. The results will be analyzed to get the top venue categories for each district, which will then be used to cluster the districts.

In [1]:
import pandas as pd # library for data analysis
import requests # library to handle requests
from bs4 import BeautifulSoup # library to parse HTML documents

In [2]:
# get the data of districts and post codes from Wiki
# get the response in the form of html
wikiurl="https://en.wikipedia.org/wiki/Postal_codes_in_Taiwan#Classification%20of%20postal%20codes"
response=requests.get(wikiurl)

In [3]:
# parse data from the html into a beautifulsoup object
soup = BeautifulSoup(response.text, 'html.parser')
taiwantables=soup.find_all('table',{'class':"wikitable"})
taipeitable=taiwantables[1]

In [4]:
from io import StringIO
# convert the HTML table to a dataframe (read_html returns a list of dataframes)
taipei_df = pd.read_html(StringIO(str(taipeitable)))[0]
taipei_df.head()

Unnamed: 0_level_0,Code,Division name,Chinese
Unnamed: 0_level_1,Taipei City,Taipei City,Taipei City
0,100,Zhongzheng District,中正區
1,103,Datong District,大同區
2,104,Zhongshan District,中山區
3,105,Songshan District,松山區
4,106,Daan District,大安區


In [5]:
taipei_df.columns

MultiIndex([(         'Code', 'Taipei City'),
            ('Division name', 'Taipei City'),
            (      'Chinese', 'Taipei City')],
           )

In [6]:
# change multiIndex to flat index
taipei_df.columns = taipei_df.columns.get_level_values(0)

In [7]:
taipei_df

Unnamed: 0,Code,Division name,Chinese
0,100,Zhongzheng District,中正區
1,103,Datong District,大同區
2,104,Zhongshan District,中山區
3,105,Songshan District,松山區
4,106,Daan District,大安區
5,108,Wanhua District,萬華區
6,110,Xinyi District,信義區
7,111,Shilin District,士林區
8,112,Beitou District,北投區
9,114,Neihu District,內湖區


In [8]:
# remove unnecessary column and change column names to "Post code" and "Neighborhood"
taipei_df=taipei_df.drop(['Chinese'], axis=1).rename(columns={'Code':'Post code', 'Division name':'Neighborhood'})
taipei_df.head(5)

Unnamed: 0,Post code,Neighborhood
0,100,Zhongzheng District
1,103,Datong District
2,104,Zhongshan District
3,105,Songshan District
4,106,Daan District


## Get coordinates for Taipei City and its neighborhoods

In [9]:
!pip install geopy 



In [10]:
from geopy.geocoders import Nominatim
address = 'Taipei City, TW'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Taipei City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Taipei City are 25.0375198, 121.5636796.


In [11]:
address = 'Zhongzheng District, TW'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Zhongzheng District are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Zhongzheng District are 25.0323611, 121.518267.


In [12]:
# build one query string per district, e.g. 'Zhongzheng District, TW'
addresses = (taipei_df['Neighborhood'] + ', TW').tolist()
latitude = []
longitude = []

# reuse a single geolocator for every district
geolocator = Nominatim(user_agent="ny_explorer")
for address in addresses:
    location = geolocator.geocode(address)
    latitude.append(location.latitude)
    longitude.append(location.longitude)
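
The loop above assumes that every lookup succeeds. A minimal sketch of a more forgiving version, assuming the geocoder returns `None` when it finds nothing and that the service expects a pause between requests; `geocode_all`, `fake_geocode` and the "Nowhere District" address are hypothetical names used only for illustration, so the example runs without network access:

```python
import time
from collections import namedtuple


def geocode_all(addresses, geocode, delay_seconds=1.0):
    """Geocode each address with one shared callable, pausing between requests.

    `geocode` is any callable that returns an object with `.latitude` and
    `.longitude`, or None when nothing is found.
    """
    coordinates = {}
    for i, address in enumerate(addresses):
        if i > 0:
            time.sleep(delay_seconds)  # space out consecutive requests
        location = geocode(address)
        if location is None:
            coordinates[address] = (None, None)  # keep the entry, flag the miss
        else:
            coordinates[address] = (location.latitude, location.longitude)
    return coordinates


# Stand-in for a real geocoder (assumption: same return shape as geopy's).
FakeLocation = namedtuple("FakeLocation", ["latitude", "longitude"])


def fake_geocode(address):
    known = {"Zhongzheng District, TW": FakeLocation(25.0323611, 121.518267)}
    return known.get(address)


demo = geocode_all(
    ["Zhongzheng District, TW", "Nowhere District, TW"],
    fake_geocode,
    delay_seconds=0,
)
print(demo)
```

With geopy, `geolocator.geocode` could be passed in place of `fake_geocode`; districts that come back as `(None, None)` can then be inspected by hand before the coordinates are added to the dataframe.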

In [13]:
taipei_df['Latitude'] = latitude
taipei_df['Longitude'] = longitude
taipei_df

Unnamed: 0,Post code,Neighborhood,Latitude,Longitude
0,100,Zhongzheng District,25.032361,121.518267
1,103,Datong District,25.065986,121.515514
2,104,Zhongshan District,25.064361,121.533468
3,105,Songshan District,25.049885,121.577272
4,106,Daan District,25.026515,121.534395
5,108,Wanhua District,25.031933,121.499332
6,110,Xinyi District,25.033345,121.566896
7,111,Shilin District,25.094118,121.524788
8,112,Beitou District,25.131931,121.498593
9,114,Neihu District,25.069664,121.588998


### Define Foursquare Credentials and Version

In [14]:
CLIENT_ID = 'LAFC0ZZQTJUXNJ1M5STCDWVTLKTHT2VDUDESH4RXTTR0341P' 
CLIENT_SECRET = 'ATW50H3ZDDI3LPVJHIDCSAI0UHMCJMV5QEYKYQMX3BH453SR' 
VERSION = '20210123' 
LIMIT = 200 
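
Keys written directly into a notebook are visible to anyone who can read the file. A minimal sketch of one alternative, assuming the keys have been exported as environment variables; the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_credentials` are hypothetical and not part of the Foursquare API:

```python
import os


def load_credentials(env=os.environ):
    """Return (client_id, client_secret) read from a mapping of environment variables.

    The variable names are illustrative assumptions.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(
            "Set {} before running the notebook".format(missing.args[0])
        ) from None


# An explicit mapping is used here so that the example runs without real keys.
demo_env = {
    "FOURSQUARE_CLIENT_ID": "demo-id",
    "FOURSQUARE_CLIENT_SECRET": "demo-secret",
}
client_id, client_secret = load_credentials(demo_env)
print(client_id)
```

Calling `load_credentials()` with no argument reads from the real environment and fails early with a readable message if a key is missing.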

#### Explore the first neighborhood

In [15]:
taipei_df.loc[0, 'Neighborhood']

'Zhongzheng District'

#### Get the top 200 venues that are in Zhongzheng District within a radius of 1000 meters

In [16]:
zhongzheng_lat = taipei_df.loc[0, 'Latitude']
zhongzheng_long = taipei_df.loc[0, 'Longitude']
radius = 1000

url='https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, zhongzheng_lat, zhongzheng_long, VERSION, radius, LIMIT)
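
The long format string above works, but the order of its placeholders is easy to get wrong. A minimal sketch of the same URL assembled from named query parameters with the standard library; `build_explore_url` is a hypothetical helper and the credentials are dummy values:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, lat, lng, version, radius, limit):
    """Assemble the explore URL from the same fields used in the cell above."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": "{},{}".format(lat, lng),  # latitude,longitude as one parameter
        "v": version,
        "radius": radius,
        "limit": limit,
    }
    return "{}?{}".format(EXPLORE_ENDPOINT, urlencode(params))


demo_url = build_explore_url(
    "demo-id", "demo-secret", 25.032361, 121.518267, "20210123", 1000, 200
)
# Parse the query string back to confirm each value landed under the right name.
print(parse_qs(urlsplit(demo_url).query)["ll"])
```

`requests.get` also accepts a `params` dictionary and performs the same encoding, so the dictionary could be passed to it directly instead of building the string first.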


#### Send GET request and examine the results

In [17]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '600d1e70dfb92326e6dcd9ef'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Zhōngzhèng Qū',
  'headerFullLocation': 'Zhōngzhèng Qū, Taipei',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 121,
  'suggestedBounds': {'ne': {'lat': 25.041361109000007,
    'lng': 121.52818148278706},
   'sw': {'lat': 25.02336109099999, 'lng': 121.50835251721293}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4b8e4566f964a520891d33e3',
       'name': 'Kinfen Braised Pork Rice (金峰魯肉飯)',
       'location': {'address': '羅斯福路一段10號',
        'lat': 25.03219410314086,
        'lng': 121.51853364691742,
        'labeledLatLngs': [{'label': 'display',
  

#### Create a function that extracts the category of the venue

In [18]:
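def get_category_type(row):
    # the category list sits under 'categories' or 'venue.categories',
    # depending on how the JSON was flattened
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    # a venue may have no category at all
    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']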

In [19]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = pd.json_normalize(venues) 

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Kinfen Braised Pork Rice (金峰魯肉飯),Taiwanese Restaurant,25.032194,121.518534
1,National Theater (國家戲劇院),Theater,25.035197,121.518188
2,虎記商行,Café,25.031744,121.519284
3,樂田麵包屋 Gakuden Boulangerie,Bakery,25.032757,121.517534
4,Chiang Kai-Shek Memorial Hall (中正紀念堂),Monument / Landmark,25.034555,121.521835


In [20]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

100 venues were returned by Foursquare.


#### Create a function to repeat the same process for all the neighborhoods in Taipei

In [21]:
def getNearbyVenues(names, latitudes, longitudes, radius=1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
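
The function above indexes straight into the response, so a venue without categories or a response without groups would stop the whole run. A minimal sketch of a more defensive extraction step, assuming the response layout shown in the earlier output; `extract_venue_rows` is a hypothetical helper and the second venue in the demo payload is made up:

```python
def extract_venue_rows(name, lat, lng, payload):
    """Turn one explore response into flat rows, tolerating missing pieces."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # fall back to None when a venue has no category
        category = categories[0]["name"] if categories else None
        rows.append((
            name, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return rows


# Hand-written payload in the shape of the response shown earlier.
demo_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "National Theater",
               "location": {"lat": 25.035197, "lng": 121.518188},
               "categories": [{"name": "Theater"}]}},
    {"venue": {"name": "Uncategorized Place",
               "location": {"lat": 25.03, "lng": 121.52},
               "categories": []}},
]}]}}

rows = extract_venue_rows("Zhongzheng District", 25.032361, 121.518267, demo_payload)
print(rows[0][-1], rows[1][-1])
```

An empty or error response then contributes no rows instead of raising an exception in the middle of the loop over districts.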

In [22]:
taipei_venues = getNearbyVenues(names=taipei_df['Neighborhood'],
                                   latitudes=taipei_df['Latitude'],
                                   longitudes=taipei_df['Longitude']
                                  )

Zhongzheng District
Datong District
Zhongshan District
Songshan District
Daan District
Wanhua District
Xinyi District
Shilin District
Beitou District
Neihu District
Nangang District
Wenshan District


In [23]:
print(taipei_venues.shape)
taipei_venues.head()

(911, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Zhongzheng District,25.032361,121.518267,Kinfen Braised Pork Rice (金峰魯肉飯),25.032194,121.518534,Taiwanese Restaurant
1,Zhongzheng District,25.032361,121.518267,National Theater (國家戲劇院),25.035197,121.518188,Theater
2,Zhongzheng District,25.032361,121.518267,虎記商行,25.031744,121.519284,Café
3,Zhongzheng District,25.032361,121.518267,樂田麵包屋 Gakuden Boulangerie,25.032757,121.517534,Bakery
4,Zhongzheng District,25.032361,121.518267,Chiang Kai-Shek Memorial Hall (中正紀念堂),25.034555,121.521835,Monument / Landmark
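
Each row above pairs a district center with a venue location, so it is possible to check that a venue really lies within the 1000 m search radius. A minimal sketch using the haversine formula, assuming a spherical Earth with a mean radius of 6,371 km; `haversine_m` is a hypothetical helper and the coordinates are taken from the National Theater row:

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (an approximation)


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lng2 - lng1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# Zhongzheng District center versus the National Theater row
distance = haversine_m(25.032361, 121.518267, 25.035197, 121.518188)
print(round(distance), distance <= 1000)
```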


In [24]:
taipei_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Beitou District,62,62,62,62,62,62
Daan District,100,100,100,100,100,100
Datong District,100,100,100,100,100,100
Nangang District,52,52,52,52,52,52
Neihu District,29,29,29,29,29,29
Shilin District,96,96,96,96,96,96
Songshan District,64,64,64,64,64,64
Wanhua District,69,69,69,69,69,69
Wenshan District,39,39,39,39,39,39
Xinyi District,100,100,100,100,100,100


In [25]:
print('There are {} unique categories.'.format(len(taipei_venues['Venue Category'].unique())))

There are 159 unique categories.


In [26]:
# one hot encoding
taipei_onehot = pd.get_dummies(taipei_venues[['Venue Category']], prefix="", prefix_sep="")
taipei_onehot

Unnamed: 0,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,...,Theater,Theme Park,Tourist Information Center,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Xinjiang Restaurant
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,1,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
906,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
907,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
908,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
909,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [27]:
# add neighborhood column back to dataframe
taipei_onehot['Neighborhood'] = taipei_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [taipei_onehot.columns[-1]] + list(taipei_onehot.columns[:-1])
taipei_onehot = taipei_onehot[fixed_columns]

taipei_onehot.head()

Unnamed: 0,Neighborhood,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,...,Theater,Theme Park,Tourist Information Center,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Xinjiang Restaurant
0,Zhongzheng District,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Zhongzheng District,0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
2,Zhongzheng District,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Zhongzheng District,0,0,0,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
4,Zhongzheng District,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [28]:
taipei_onehot.shape

(911, 160)

In [29]:
taipei_grouped = taipei_onehot.groupby('Neighborhood').mean().reset_index()
taipei_grouped

Unnamed: 0,Neighborhood,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,...,Theater,Theme Park,Tourist Information Center,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Xinjiang Restaurant
0,Beitou District,0.0,0.0,0.0,0.0,0.0,0.080645,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0
1,Daan District,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,...,0.0,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.0,0.01
2,Datong District,0.01,0.01,0.01,0.0,0.0,0.03,0.0,0.01,0.01,...,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0
3,Nangang District,0.0,0.0,0.0,0.0,0.0,0.019231,0.019231,0.0,0.019231,...,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0
4,Neihu District,0.0,0.0,0.0,0.0,0.0,0.068966,0.034483,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0
5,Shilin District,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010417,...,0.0,0.010417,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0
6,Songshan District,0.0,0.0,0.0,0.0,0.015625,0.046875,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.015625,0.015625,0.0,0.0,0.0
7,Wanhua District,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.057971,...,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0
8,Wenshan District,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Xinyi District,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.01,...,0.0,0.0,0.01,0.02,0.01,0.0,0.02,0.0,0.0,0.0


In [30]:
taipei_grouped.shape

(12, 160)
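
Taking the mean of the one-hot columns per neighborhood gives, for each category, the share of that neighborhood's venues that fall into it. A minimal sketch of the same calculation in plain Python on a made-up venue list; `category_shares` and the neighborhoods "A" and "B" are hypothetical:

```python
from collections import Counter, defaultdict


def category_shares(venues):
    """Map each neighborhood to {category: share of its venues}."""
    counts = defaultdict(Counter)
    for neighborhood, category in venues:
        counts[neighborhood][category] += 1
    return {
        neighborhood: {
            category: n / sum(counter.values())
            for category, n in counter.items()
        }
        for neighborhood, counter in counts.items()
    }


# Made-up (neighborhood, category) pairs, one per venue.
demo_venues = [
    ("A", "Café"), ("A", "Café"), ("A", "Bakery"), ("A", "Hotel"),
    ("B", "Hotel"), ("B", "Park"),
]
shares = category_shares(demo_venues)
print(shares["A"]["Café"])  # two of the four venues in "A" are cafés
```

Because each row sums to 1, a neighborhood with 29 venues and one with 100 venues are compared on the same scale, although the shares of the smaller one are noisier.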

#### Create a function to sort the venues in descending order

In [31]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [32]:
import numpy as np
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = taipei_grouped['Neighborhood']

for ind in np.arange(taipei_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(taipei_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Beitou District,Hotel,Convenience Store,Asian Restaurant,Chinese Restaurant,Noodle House,Coffee Shop,Park,Dessert Shop,Café,Italian Restaurant
1,Daan District,Café,Taiwanese Restaurant,Noodle House,Chinese Restaurant,Tea Room,Ice Cream Shop,Bakery,Coffee Shop,Dim Sum Restaurant,Dessert Shop
2,Datong District,Taiwanese Restaurant,Convenience Store,Dessert Shop,Coffee Shop,Café,Noodle House,Chinese Restaurant,Hotel,Asian Restaurant,Hotpot Restaurant
3,Nangang District,Coffee Shop,Convenience Store,Japanese Restaurant,Thai Restaurant,Korean Restaurant,Hotel,Train Station,Hotpot Restaurant,Noodle House,Café
4,Neihu District,Convenience Store,Taiwanese Restaurant,Chinese Restaurant,Coffee Shop,Asian Restaurant,Hotpot Restaurant,Bus Station,Sporting Goods Shop,Golf Course,Gym


### Cluster the neighborhoods

#### Run k-means to cluster the neighborhoods into 4 clusters.

In [33]:
from sklearn.cluster import KMeans
# set number of clusters
kclusters = 4

taipei_grouped_clustering = taipei_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(taipei_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([3, 1, 1, 2, 0, 1, 0, 1, 2, 1], dtype=int32)
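
To make the clustering step less of a black box, here is a minimal sketch of the basic k-means loop (Lloyd's algorithm) in plain Python. It is not what scikit-learn runs internally, since `KMeans` uses k-means++ initialization by default along with other refinements; the function names and the four two-dimensional points are made up for illustration:

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=100, seed=0):
    """Assign each point to its nearest centroid, then recompute the centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]  # random starting points
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Two made-up "category share" profiles: café-heavy and hotel-heavy.
demo_points = [(0.50, 0.00), (0.45, 0.05), (0.00, 0.50), (0.05, 0.45)]
labels, centroids = kmeans(demo_points, k=2)
print(labels)
```

The label numbers themselves carry no meaning; only the grouping does, which is why the cluster numbers in this notebook should be read as arbitrary names.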

#### Create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [34]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

taipei_merged = taipei_df

In [35]:
# join the cluster labels and top venues onto taipei_df, which holds the latitude/longitude of each neighborhood
taipei_merged = taipei_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

taipei_merged.head()

Unnamed: 0,Post code,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,100,Zhongzheng District,25.032361,121.518267,1,Café,Convenience Store,Noodle House,Breakfast Spot,Coffee Shop,Japanese Restaurant,Dumpling Restaurant,History Museum,Hotpot Restaurant,Taiwanese Restaurant
1,103,Datong District,25.065986,121.515514,1,Taiwanese Restaurant,Convenience Store,Dessert Shop,Coffee Shop,Café,Noodle House,Chinese Restaurant,Hotel,Asian Restaurant,Hotpot Restaurant
2,104,Zhongshan District,25.064361,121.533468,3,Hotel,Taiwanese Restaurant,Chinese Restaurant,Convenience Store,Café,Japanese Restaurant,Hotpot Restaurant,Asian Restaurant,Coffee Shop,Seafood Restaurant
3,105,Songshan District,25.049885,121.577272,0,Convenience Store,Chinese Restaurant,Japanese Restaurant,Taiwanese Restaurant,Park,Asian Restaurant,Italian Restaurant,Hotel,Coffee Shop,Hotpot Restaurant
4,106,Daan District,25.026515,121.534395,1,Café,Taiwanese Restaurant,Noodle House,Chinese Restaurant,Tea Room,Ice Cream Shop,Bakery,Coffee Shop,Dim Sum Restaurant,Dessert Shop


### Create map to visualize the results

In [36]:
import matplotlib.cm as cm
import matplotlib.colors as colors
!pip install folium
import folium



In [37]:
# create map
map_clusters = folium.Map(location=[25.0375198, 121.5636796], zoom_start=11)

# set color scheme for the clusters: one rainbow color per cluster label
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(taipei_merged['Latitude'], taipei_merged['Longitude'], taipei_merged['Neighborhood'], taipei_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],  # labels are 0-based, so index the palette directly
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
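
The marker colors only need to be distinct for each cluster label. A minimal sketch of building such a palette with the standard library instead of matplotlib, assuming evenly spaced hues are good enough; `cluster_colors` is a hypothetical helper, and the saturation and brightness values are arbitrary choices:

```python
import colorsys


def cluster_colors(n_clusters):
    """Return one hex color per cluster label, spread evenly around the hue wheel."""
    palette = []
    for label in range(n_clusters):
        r, g, b = colorsys.hsv_to_rgb(label / n_clusters, 0.8, 0.9)
        palette.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette


palette = cluster_colors(4)
print(palette)  # palette[label] is the color for that 0-based cluster label
```

Indexing the palette with the 0-based label keeps the link between a cluster and its color explicit.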

### Examine the clusters

In [38]:
# Cluster 1 (Cluster Labels == 0)
taipei_merged.loc[taipei_merged['Cluster Labels'] == 0, taipei_merged.columns[[1] + list(range(5, taipei_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Songshan District,Convenience Store,Chinese Restaurant,Japanese Restaurant,Taiwanese Restaurant,Park,Asian Restaurant,Italian Restaurant,Hotel,Coffee Shop,Hotpot Restaurant
9,Neihu District,Convenience Store,Taiwanese Restaurant,Chinese Restaurant,Coffee Shop,Asian Restaurant,Hotpot Restaurant,Bus Station,Sporting Goods Shop,Golf Course,Gym


In [39]:
# Cluster 2 (Cluster Labels == 1)
taipei_merged.loc[taipei_merged['Cluster Labels'] == 1, taipei_merged.columns[[1] + list(range(5, taipei_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Zhongzheng District,Café,Convenience Store,Noodle House,Breakfast Spot,Coffee Shop,Japanese Restaurant,Dumpling Restaurant,History Museum,Hotpot Restaurant,Taiwanese Restaurant
1,Datong District,Taiwanese Restaurant,Convenience Store,Dessert Shop,Coffee Shop,Café,Noodle House,Chinese Restaurant,Hotel,Asian Restaurant,Hotpot Restaurant
4,Daan District,Café,Taiwanese Restaurant,Noodle House,Chinese Restaurant,Tea Room,Ice Cream Shop,Bakery,Coffee Shop,Dim Sum Restaurant,Dessert Shop
5,Wanhua District,Taiwanese Restaurant,Convenience Store,Night Market,Dessert Shop,Chinese Restaurant,Coffee Shop,Café,Bakery,Food Truck,Dumpling Restaurant
6,Xinyi District,Department Store,Hotel,Bar,Café,Chinese Restaurant,Coffee Shop,Gym / Fitness Center,Noodle House,Lounge,Japanese Restaurant
7,Shilin District,Convenience Store,Café,Noodle House,Chinese Restaurant,Fried Chicken Joint,Taiwanese Restaurant,Food Court,Breakfast Spot,Dessert Shop,Japanese Restaurant


In [40]:
# Cluster 3 (Cluster Labels == 2)
taipei_merged.loc[taipei_merged['Cluster Labels'] == 2, taipei_merged.columns[[1] + list(range(5, taipei_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,Nangang District,Coffee Shop,Convenience Store,Japanese Restaurant,Thai Restaurant,Korean Restaurant,Hotel,Train Station,Hotpot Restaurant,Noodle House,Café
11,Wenshan District,Coffee Shop,Convenience Store,Bus Station,Café,Japanese Restaurant,Chinese Restaurant,Fast Food Restaurant,Cupcake Shop,Cable Car,Sandwich Place


In [41]:
# Cluster 4 (Cluster Labels == 3)
taipei_merged.loc[taipei_merged['Cluster Labels'] == 3, taipei_merged.columns[[1] + list(range(5, taipei_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Zhongshan District,Hotel,Taiwanese Restaurant,Chinese Restaurant,Convenience Store,Café,Japanese Restaurant,Hotpot Restaurant,Asian Restaurant,Coffee Shop,Seafood Restaurant
8,Beitou District,Hotel,Convenience Store,Asian Restaurant,Chinese Restaurant,Noodle House,Coffee Shop,Park,Dessert Shop,Café,Italian Restaurant
