# Capstone Project - The Battle of Neighborhoods

## Table of Contents

<ol>
    <li><a href="#Introduction">Introduction</a></li>
    <li><a href="#Data">Data</a></li>
    <li><a href="#Methodology">Methodology</a></li>
    <li><a href="#Results">Results</a></li>
    <li><a href="#Discussion">Discussion</a></li>
    <li><a href="#Conclusion">Conclusion</a></li>
</ol>

## Introduction

Having suffered from a combination of political and social issues over the past few years, Hong Kong is no longer seen by many as a place with a bright future. An increasing number of Hong Kong people therefore plan to leave their once-beloved home for other places. One of the most popular destinations is Taipei City, the capital of Taiwan, which shares many similarities with Hong Kong (e.g. culture, language and standard of living) and has a well-established democracy.

Taipei City is a big city and has a lot to offer, so it can be daunting to pick where to settle. In an attempt to provide some useful insights, I am going to explore the neighbourhoods within the city, group them according to their characteristics, and make suggestions.

## Data

<ul>
    <li><b>Foursquare</b> - I will use the Foursquare API to explore neighborhoods in Taipei City. I will use the <b>explore</b> function to get the most common venue categories in each neighborhood, and then use this feature to group the neighborhoods into clusters</li>
    <li><b>Wikipedia</b> - <a href="https://en.wikipedia.org/wiki/Category:Districts_of_Taipei">List of neighbourhoods of Taipei</a></li>
</ul>


## Methodology

- Use the Nominatim geocoder (via the geopy library) to find the latitude and longitude of each neighbourhood of Taipei City
- Use the <b>explore</b> function provided by Foursquare to get the most common venue categories for each neighborhood
- Use k-means clustering to group the neighborhoods into clusters based on these venue-category features. k-means is a simple, fast algorithm for clustering unlabeled data.
- Use the Folium library to visualize the neighborhoods in Taipei City and their clusters
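
To make the clustering step more concrete, here is a minimal plain-Python sketch of the two alternating steps behind k-means (assign each point to its nearest centroid, then move each centroid to the mean of its points). It is only an illustration: the function name `toy_kmeans` and the two-dimensional frequency vectors are hypothetical, and scikit-learn's `KMeans`, which the notebook actually uses, differs in initialization and stopping criteria.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def toy_kmeans(points, k, iterations=20, seed=0):
    """Toy k-means: returns (labels, centroids) after a fixed number of iterations."""
    rng = random.Random(seed)
    # start from k distinct data points chosen at random
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # assignment step: each point takes the label of its nearest centroid
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # update step: each centroid moves to the mean of its assigned points
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = mean_point(members)
    return labels, centroids


# Hypothetical frequency vectors: (share of cafés, share of restaurants)
points = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels, centroids = toy_kmeans(points, k=2)
print(labels)
```

The first two points end up sharing one label and the last two the other; which group receives label 0 depends on the random starting centroids.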


In [49]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line to install geopy if it is not already available
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line to install folium if it is not already available
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


#### Retrieve the geographical coordinates

In [50]:
taipei_address = 'Taipei City, Taiwan'
geolocator = Nominatim(user_agent='taipei_explorer')
location = geolocator.geocode(taipei_address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Taipei City are {}, {}'.format(latitude, longitude))

# source: Wikipedia
neighborhoods = ['Beitou','Daan','Datong','Nangang','Neihu','Shilin','Songshan','Wanhua','Wenshan','Xinyi','Zhongshan','Zhongzheng']
latitudes = []
longitudes = []
# reuse the geolocator created above to look up each neighborhood
for neighborhood in neighborhoods:
    location = geolocator.geocode(neighborhood + ', Taipei City, Taiwan')
    latitudes.append(location.latitude)
    longitudes.append(location.longitude)

taipei_data = pd.DataFrame(data={'Neighborhood': neighborhoods, 'Latitude': latitudes, 'Longitude': longitudes})
taipei_data

The geographical coordinates of Taipei City are 25.0375198, 121.5636796


Unnamed: 0,Neighborhood,Latitude,Longitude
0,Beitou,25.132419,121.501379
1,Daan,25.026515,121.534395
2,Datong,25.065986,121.515514
3,Nangang,25.054578,121.6066
4,Neihu,25.069664,121.588998
5,Shilin,25.09184,121.524207
6,Songshan,25.049885,121.577272
7,Wanhua,25.031933,121.499332
8,Wenshan,24.989786,121.570458
9,Xinyi,25.033345,121.566896
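
The lookup loop above assumes every query resolves. geopy's `geocode` returns `None` when nothing matches, in which case `location.latitude` would raise an `AttributeError`. The sketch below shows one way to guard against that. The helper name `geocode_neighborhoods` and the stand-in `fake_geocode` are hypothetical and exist only so the example runs without network access.

```python
from collections import namedtuple


def geocode_neighborhoods(names, geocode, suffix=', Taipei City, Taiwan'):
    """Return (rows, missing): coordinates for resolved names, and the names that failed."""
    rows, missing = [], []
    for name in names:
        location = geocode(name + suffix)
        if location is None:  # no match found for this query
            missing.append(name)
            continue
        rows.append({'Neighborhood': name,
                     'Latitude': location.latitude,
                     'Longitude': location.longitude})
    return rows, missing


# Stand-in for a real geocoder so the sketch is self-contained (hypothetical)
FakeLocation = namedtuple('FakeLocation', ['latitude', 'longitude'])


def fake_geocode(query):
    known = {'Beitou, Taipei City, Taiwan': FakeLocation(25.132419, 121.501379)}
    return known.get(query)  # None for anything not in the table


rows, missing = geocode_neighborhoods(['Beitou', 'Nowhere'], fake_geocode)
print(rows)
print('Not found:', missing)
```

In the notebook, the same helper could be called as `geocode_neighborhoods(neighborhoods, geolocator.geocode)` and the resulting rows passed to `pd.DataFrame`.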


#### Create a map of Taipei with neighborhoods

In [51]:
# create map of Taipei using latitude and longitude values
map_taipei = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, neighborhood in zip(taipei_data['Latitude'], taipei_data['Longitude'], taipei_data['Neighborhood']):
    label = neighborhood
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_taipei)  
    
map_taipei

### Explore Neighborhoods in Taipei

#### Now, let's get up to 100 venues for each neighborhood within a radius of 500 meters.

In [71]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # maximum number of venues to request per call

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID
CLIENT_SECRET:YOUR_FOURSQUARE_CLIENT_SECRET


In [72]:
def get_category_type(row):
    # the category list lives under 'categories' or, in flattened results, 'venue.categories'
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [73]:
def getNearByVenues(names, latitudes, longitudes, radius=500):
    
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
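
One fragile spot in the function above is `v['venue']['categories'][0]['name']`, which raises an `IndexError` for any venue returned without a category. The sketch below flattens a response of the same shape (`response` → `groups[0]` → `items` → `venue`) while tolerating that case. The helper name `extract_venue_rows` and the sample payload are hypothetical, written only to mirror the fields the notebook reads.

```python
def extract_venue_rows(name, lat, lng, payload):
    """Flatten one explore-style response into tuples, using None for a missing category."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows


# Hand-written payload with the same nesting as the fields used above (hypothetical data)
sample_payload = {
    'response': {'groups': [{'items': [
        {'venue': {'name': 'Tea House A',
                   'location': {'lat': 25.1319, 'lng': 121.5020},
                   'categories': [{'name': 'Tea Room'}]}},
        {'venue': {'name': 'Unlabelled Place',
                   'location': {'lat': 25.1325, 'lng': 121.5022},
                   'categories': []}},
    ]}]}
}

rows = extract_venue_rows('Beitou', 25.132419, 121.501379, sample_payload)
for row in rows:
    print(row)
```

Rows with a `None` category can then be dropped or kept explicitly, rather than stopping the whole download.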

In [74]:
taipei_venues = getNearByVenues(taipei_data.Neighborhood, taipei_data.Latitude, taipei_data.Longitude)
taipei_venues.head()

Beitou
Daan
Datong
Nangang
Neihu
Shilin
Songshan
Wanhua
Wenshan
Xinyi
Zhongshan
Zhongzheng


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Beitou,25.132419,121.501379,蔡元益紅茶（總店）,25.131896,121.502012,Tea Room
1,Beitou,25.132419,121.501379,Beitou Market (北投市場 Beitou Market),25.132509,121.50218,Farmers Market
2,Beitou,25.132419,121.501379,阿馬非 Coffee. Pizza.pasta,25.13234,121.497882,Italian Restaurant
3,Beitou,25.132419,121.501379,拾米屋 SheMe House,25.136224,121.499005,Café
4,Beitou,25.132419,121.501379,傳統之最豆花堂,25.133637,121.498719,Dessert Shop
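
As a quick sanity check on the 500-meter radius, the distance between a neighborhood's search centre and a returned venue can be computed with the haversine formula. This is an independent approximation on a spherical Earth, not a description of how Foursquare measures the radius. The coordinates are taken from the first row of the table above.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (latitude, longitude) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# Beitou search centre and the first venue listed for it
distance = haversine_m(25.132419, 121.501379, 25.131896, 121.502012)
print(round(distance), 'm from the Beitou search centre')
print('within 500 m:', distance <= 500)
```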


In [75]:
taipei_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Beitou,23,23,23,23,23,23
Daan,19,19,19,19,19,19
Datong,19,19,19,19,19,19
Nangang,42,42,42,42,42,42
Neihu,4,4,4,4,4,4
Shilin,78,78,78,78,78,78
Songshan,27,27,27,27,27,27
Wanhua,25,25,25,25,25,25
Wenshan,25,25,25,25,25,25
Xinyi,56,56,56,56,56,56


In [76]:
print('There are {} unique categories.'.format(len(taipei_venues['Venue Category'].unique())))

There are 103 unique categories.


### Analyze each neighborhood

In [77]:
taipei_onehot = pd.get_dummies(taipei_venues[['Venue Category']], prefix="", prefix_sep="")
taipei_onehot['Neighborhood'] = taipei_venues['Neighborhood']

fixed_columns = [taipei_onehot.columns[-1]] + list(taipei_onehot.columns[:-1])
taipei_onehot = taipei_onehot[fixed_columns]

taipei_onehot.head()

Unnamed: 0,Neighborhood,American Restaurant,Asian Restaurant,BBQ Joint,Bakery,Bar,Basketball Court,Beer Bar,Bistro,Boarding House,Bookstore,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Building,Bus Station,Bus Stop,Café,Chinese Breakfast Place,Chinese Restaurant,Climbing Gym,Clothing Store,Coffee Shop,Convenience Store,Cosmetics Shop,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dumpling Restaurant,Electronics Store,Farmers Market,Fast Food Restaurant,Fish Market,Food Court,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Grocery Store,Gym / Fitness Center,Historic Site,History Museum,Hong Kong Restaurant,Hostel,Hot Spring,Hotel,Hotpot Restaurant,IT Services,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Juice Bar,Korean Restaurant,Latin American Restaurant,Leather Goods Store,Lounge,Massage Studio,Metro Station,Monument / Landmark,Motorcycle Shop,Movie Theater,Multiplex,Music Venue,Night Market,Nightclub,Noodle House,Park,Performing Arts Venue,Pizza Place,Platform,Playground,Plaza,Public Art,Ramen Restaurant,Restaurant,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Snack Place,Soup Place,Speakeasy,Sporting Goods Shop,Steakhouse,Street Food Gathering,Supermarket,Sushi Restaurant,Taiwanese Restaurant,Tea Room,Thai Restaurant,Theater,Tourist Information Center,Toy / Game Store,Train Station,Vegetarian / Vegan Restaurant,Vietnamese Restaurant
0,Beitou,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
1,Beitou,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Beitou,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Beitou,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Beitou,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [78]:
taipei_grouped = taipei_onehot.groupby('Neighborhood').mean().reset_index()
taipei_grouped

Unnamed: 0,Neighborhood,American Restaurant,Asian Restaurant,BBQ Joint,Bakery,Bar,Basketball Court,Beer Bar,Bistro,Boarding House,Bookstore,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Building,Bus Station,Bus Stop,Café,Chinese Breakfast Place,Chinese Restaurant,Climbing Gym,Clothing Store,Coffee Shop,Convenience Store,Cosmetics Shop,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dumpling Restaurant,Electronics Store,Farmers Market,Fast Food Restaurant,Fish Market,Food Court,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Grocery Store,Gym / Fitness Center,Historic Site,History Museum,Hong Kong Restaurant,Hostel,Hot Spring,Hotel,Hotpot Restaurant,IT Services,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Juice Bar,Korean Restaurant,Latin American Restaurant,Leather Goods Store,Lounge,Massage Studio,Metro Station,Monument / Landmark,Motorcycle Shop,Movie Theater,Multiplex,Music Venue,Night Market,Nightclub,Noodle House,Park,Performing Arts Venue,Pizza Place,Platform,Playground,Plaza,Public Art,Ramen Restaurant,Restaurant,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Snack Place,Soup Place,Speakeasy,Sporting Goods Shop,Steakhouse,Street Food Gathering,Supermarket,Sushi Restaurant,Taiwanese Restaurant,Tea Room,Thai Restaurant,Theater,Tourist Information Center,Toy / Game Store,Train Station,Vegetarian / Vegan Restaurant,Vietnamese Restaurant
0,Beitou,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.086957,0.0,0.086957,0.0,0.0,0.043478,0.043478,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.043478,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Daan,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.315789,0.0,0.052632,0.0,0.0,0.105263,0.157895,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.105263,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.105263,0.0,0.0,0.0,0.0,0.0,0.0,0.052632
2,Datong,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.105263,0.0,0.052632,0.0,0.0,0.105263,0.105263,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.210526,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Nangang,0.0,0.02381,0.02381,0.047619,0.0,0.0,0.0,0.0,0.0,0.02381,0.02381,0.02381,0.0,0.0,0.0,0.02381,0.0,0.047619,0.0,0.02381,0.0,0.02381,0.02381,0.02381,0.0,0.0,0.0,0.0,0.0,0.02381,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.02381,0.02381,0.0,0.0,0.0,0.02381,0.071429,0.0,0.02381,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.02381,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.02381,0.0,0.02381,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.02381,0.0,0.02381,0.02381,0.0,0.0,0.02381,0.047619,0.0,0.0
4,Neihu,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.25,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Shilin,0.0,0.025641,0.0,0.0,0.0,0.0,0.012821,0.0,0.0,0.012821,0.051282,0.0,0.0,0.051282,0.0,0.0,0.012821,0.089744,0.0,0.025641,0.012821,0.0,0.025641,0.025641,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.038462,0.0,0.012821,0.0,0.038462,0.012821,0.0,0.012821,0.0,0.0,0.012821,0.012821,0.0,0.025641,0.012821,0.0,0.051282,0.0,0.0,0.0,0.038462,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.012821,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.012821,0.012821,0.0,0.0,0.025641,0.012821,0.0,0.025641,0.038462,0.012821,0.0,0.0,0.064103,0.012821,0.012821,0.0,0.0,0.0,0.0,0.0,0.012821
6,Songshan,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.037037,0.0,0.037037,0.0,0.037037,0.074074,0.148148,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.037037,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.037037,0.037037,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.074074,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.074074,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0
7,Wanhua,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.08,0.04,0.12,0.04,0.0,0.08,0.08,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.04,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0
8,Wenshan,0.0,0.04,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.04,0.0,0.04,0.0,0.0,0.0,0.08,0.0,0.0,0.12,0.08,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.04,0.0,0.08,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Xinyi,0.017857,0.0,0.035714,0.017857,0.035714,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.017857,0.0,0.017857,0.017857,0.0,0.0,0.089286,0.017857,0.0,0.0,0.0,0.017857,0.053571,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.017857,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.035714,0.017857,0.0,0.0,0.0,0.0,0.0,0.017857,0.017857,0.0,0.0,0.017857,0.053571,0.0,0.0,0.0,0.017857,0.0,0.017857,0.0,0.0,0.017857,0.017857,0.017857,0.0,0.017857,0.0,0.0,0.035714,0.017857,0.017857,0.0,0.0,0.017857,0.017857,0.0,0.0,0.017857,0.0,0.0,0.0,0.017857,0.0,0.0,0.017857,0.0,0.035714,0.017857,0.017857,0.0,0.017857,0.035714,0.0,0.017857,0.0
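
Taking the mean of the one-hot columns per neighborhood is the same as computing each category's share of that neighborhood's venues. For example, Beitou has 23 venues, so a category that appears twice gets 2/23 ≈ 0.086957, and Neihu has only 4 venues in 4 different categories, so each gets 0.25. The sketch below reproduces that arithmetic in plain Python; the helper name `category_frequencies` is an assumption for illustration.

```python
from collections import Counter


def category_frequencies(categories):
    """Share of venues per category; equivalent to the mean of one-hot columns."""
    counts = Counter(categories)
    total = len(categories)
    return {category: count / total for category, count in counts.items()}


# The four categories with non-zero frequency in the Neihu row above
neihu = ['Basketball Court', 'Coffee Shop', 'Convenience Store', 'Department Store']
freqs = category_frequencies(neihu)
print(freqs)

# A category seen twice among 23 venues, as in the Beitou row
print(round(2 / 23, 6))
```

Because these are shares rather than counts, a neighborhood with very few venues (such as Neihu) gets large values from single venues, which is worth keeping in mind when reading the clusters.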


### Get the top 10 most common categories of each neighborhood

In [79]:
def return_most_common_venues(row, num_top_venues=10):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [80]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = taipei_grouped['Neighborhood']

for ind in np.arange(taipei_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(taipei_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Beitou,Italian Restaurant,Noodle House,Dessert Shop,Tea Room,Chinese Restaurant,Café,American Restaurant,Shop & Service,Food Truck,Farmers Market
1,Daan,Café,Convenience Store,Park,Coffee Shop,Tea Room,Vietnamese Restaurant,Bakery,Chinese Restaurant,Food Stand,Food Court
2,Datong,Taiwanese Restaurant,Coffee Shop,Convenience Store,Café,Supermarket,Hotel,Night Market,Chinese Restaurant,Fried Chicken Joint,Dessert Shop
3,Nangang,Japanese Restaurant,Steakhouse,Train Station,Café,Bakery,Shopping Mall,Lounge,Platform,Clothing Store,Coffee Shop
4,Neihu,Department Store,Convenience Store,Coffee Shop,Basketball Court,Food Truck,Diner,Discount Store,Dumpling Restaurant,Electronics Store,Farmers Market
5,Shilin,Café,Taiwanese Restaurant,Ice Cream Shop,Breakfast Spot,Bubble Tea Shop,Japanese Restaurant,Fried Chicken Joint,Food Court,Steakhouse,Dessert Shop
6,Songshan,Convenience Store,Coffee Shop,Seafood Restaurant,Taiwanese Restaurant,Clothing Store,Chinese Restaurant,Café,Food Stand,Night Market,Korean Restaurant
7,Wanhua,Chinese Restaurant,Coffee Shop,Dessert Shop,Convenience Store,Bakery,Taiwanese Restaurant,Café,Metro Station,Hotel,Climbing Gym
8,Wenshan,Coffee Shop,Japanese Restaurant,Convenience Store,Chinese Restaurant,Ice Cream Shop,Snack Place,Italian Restaurant,Juice Bar,Fast Food Restaurant,Farmers Market
9,Xinyi,Department Store,Electronics Store,Lounge,Bar,Hotel,Taiwanese Restaurant,BBQ Joint,Toy / Game Store,Plaza,Nightclub
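
Note that the ranking always fills ten slots, even when a neighborhood has fewer than ten categories. In the Neihu row only four categories have a non-zero frequency, so the entries from the 5th position onward (Food Truck, Diner, and so on) have frequency 0 and their order is just an artifact of sorting ties. A minimal sketch of a ranking that keeps only categories actually present is shown below; the helper name `top_nonzero_categories` and the toy frequencies are hypothetical.

```python
def top_nonzero_categories(frequencies, n=10):
    """Return up to n categories with frequency > 0, highest first, ties broken by name."""
    present = [(cat, freq) for cat, freq in frequencies.items() if freq > 0]
    present.sort(key=lambda pair: (-pair[1], pair[0]))
    return [cat for cat, _ in present[:n]]


# Toy frequencies shaped like the Neihu row: four categories present, the rest zero
toy_frequencies = {
    'Coffee Shop': 0.25,
    'Department Store': 0.25,
    'Basketball Court': 0.25,
    'Convenience Store': 0.25,
    'Diner': 0.0,
    'Food Truck': 0.0,
}

top = top_nonzero_categories(toy_frequencies)
print(top)
```

With this variant the result for such a neighborhood has four entries instead of ten, which makes it obvious how little data sits behind the row.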


#### Clustering neighborhoods

In [81]:
# set number of clusters
kclusters = 3

taipei_grouped_clustering = taipei_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(taipei_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 1, 0, 0, 2, 0, 0, 0, 0, 0])
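
Before interpreting the clusters it helps to check how many neighborhoods fall into each one. The sketch below counts cluster sizes in plain Python. The label list is copied from the `Cluster Labels` column shown further down in this notebook, in the order of the 12 neighborhoods; the helper names are assumptions for illustration.

```python
from collections import Counter


def cluster_sizes(labels):
    """Map each cluster label to the number of neighborhoods assigned to it."""
    return dict(Counter(labels))


def singleton_clusters(labels):
    """Labels of clusters that contain exactly one neighborhood."""
    return sorted(label for label, size in Counter(labels).items() if size == 1)


# Beitou, Daan, Datong, Nangang, Neihu, Shilin, Songshan, Wanhua,
# Wenshan, Xinyi, Zhongshan, Zhongzheng
labels = [0, 1, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0]

print(cluster_sizes(labels))
print(singleton_clusters(labels))
```

Two of the three clusters contain a single neighborhood each (Daan and Neihu), so any description of those groups rests on one data point.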

In [82]:
# add clustering labels
#neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
neighborhoods_venues_sorted['Cluster Labels'] = kmeans.labels_

taipei_merged = taipei_data

# merge neighborhoods_venues_sorted with taipei_data to add latitude/longitude for each neighborhood
taipei_merged = taipei_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

taipei_merged # check the last columns!

Unnamed: 0,Neighborhood,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels
0,Beitou,25.132419,121.501379,Italian Restaurant,Noodle House,Dessert Shop,Tea Room,Chinese Restaurant,Café,American Restaurant,Shop & Service,Food Truck,Farmers Market,0
1,Daan,25.026515,121.534395,Café,Convenience Store,Park,Coffee Shop,Tea Room,Vietnamese Restaurant,Bakery,Chinese Restaurant,Food Stand,Food Court,1
2,Datong,25.065986,121.515514,Taiwanese Restaurant,Coffee Shop,Convenience Store,Café,Supermarket,Hotel,Night Market,Chinese Restaurant,Fried Chicken Joint,Dessert Shop,0
3,Nangang,25.054578,121.6066,Japanese Restaurant,Steakhouse,Train Station,Café,Bakery,Shopping Mall,Lounge,Platform,Clothing Store,Coffee Shop,0
4,Neihu,25.069664,121.588998,Department Store,Convenience Store,Coffee Shop,Basketball Court,Food Truck,Diner,Discount Store,Dumpling Restaurant,Electronics Store,Farmers Market,2
5,Shilin,25.09184,121.524207,Café,Taiwanese Restaurant,Ice Cream Shop,Breakfast Spot,Bubble Tea Shop,Japanese Restaurant,Fried Chicken Joint,Food Court,Steakhouse,Dessert Shop,0
6,Songshan,25.049885,121.577272,Convenience Store,Coffee Shop,Seafood Restaurant,Taiwanese Restaurant,Clothing Store,Chinese Restaurant,Café,Food Stand,Night Market,Korean Restaurant,0
7,Wanhua,25.031933,121.499332,Chinese Restaurant,Coffee Shop,Dessert Shop,Convenience Store,Bakery,Taiwanese Restaurant,Café,Metro Station,Hotel,Climbing Gym,0
8,Wenshan,24.989786,121.570458,Coffee Shop,Japanese Restaurant,Convenience Store,Chinese Restaurant,Ice Cream Shop,Snack Place,Italian Restaurant,Juice Bar,Fast Food Restaurant,Farmers Market,0
9,Xinyi,25.033345,121.566896,Department Store,Electronics Store,Lounge,Bar,Hotel,Taiwanese Restaurant,BBQ Joint,Toy / Game Store,Plaza,Nightclub,0


### Create a map of neighborhoods and their clusters

In [83]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(taipei_merged['Latitude'], taipei_merged['Longitude'], taipei_merged['Neighborhood'], taipei_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

#### Examine the clusters

In [84]:
taipei_merged.loc[taipei_merged['Cluster Labels'] == 0, taipei_merged.columns[[0] + list(range(3, taipei_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels
0,Beitou,Italian Restaurant,Noodle House,Dessert Shop,Tea Room,Chinese Restaurant,Café,American Restaurant,Shop & Service,Food Truck,Farmers Market,0
2,Datong,Taiwanese Restaurant,Coffee Shop,Convenience Store,Café,Supermarket,Hotel,Night Market,Chinese Restaurant,Fried Chicken Joint,Dessert Shop,0
3,Nangang,Japanese Restaurant,Steakhouse,Train Station,Café,Bakery,Shopping Mall,Lounge,Platform,Clothing Store,Coffee Shop,0
5,Shilin,Café,Taiwanese Restaurant,Ice Cream Shop,Breakfast Spot,Bubble Tea Shop,Japanese Restaurant,Fried Chicken Joint,Food Court,Steakhouse,Dessert Shop,0
6,Songshan,Convenience Store,Coffee Shop,Seafood Restaurant,Taiwanese Restaurant,Clothing Store,Chinese Restaurant,Café,Food Stand,Night Market,Korean Restaurant,0
7,Wanhua,Chinese Restaurant,Coffee Shop,Dessert Shop,Convenience Store,Bakery,Taiwanese Restaurant,Café,Metro Station,Hotel,Climbing Gym,0
8,Wenshan,Coffee Shop,Japanese Restaurant,Convenience Store,Chinese Restaurant,Ice Cream Shop,Snack Place,Italian Restaurant,Juice Bar,Fast Food Restaurant,Farmers Market,0
9,Xinyi,Department Store,Electronics Store,Lounge,Bar,Hotel,Taiwanese Restaurant,BBQ Joint,Toy / Game Store,Plaza,Nightclub,0
10,Zhongshan,Hotel,Sushi Restaurant,Convenience Store,Seafood Restaurant,Fish Market,Massage Studio,Taiwanese Restaurant,Ice Cream Shop,Breakfast Spot,Hotpot Restaurant,0
11,Zhongzheng,Café,History Museum,Theater,Noodle House,Monument / Landmark,Japanese Restaurant,Bakery,Bar,Snack Place,Dumpling Restaurant,0


In [85]:
taipei_merged.loc[taipei_merged['Cluster Labels'] == 1, taipei_merged.columns[[0] + list(range(3, taipei_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels
1,Daan,Café,Convenience Store,Park,Coffee Shop,Tea Room,Vietnamese Restaurant,Bakery,Chinese Restaurant,Food Stand,Food Court,1


In [86]:
taipei_merged.loc[taipei_merged['Cluster Labels'] == 2, taipei_merged.columns[[0] + list(range(3, taipei_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels
4,Neihu,Department Store,Convenience Store,Coffee Shop,Basketball Court,Food Truck,Diner,Discount Store,Dumpling Restaurant,Electronics Store,Farmers Market,2


## Results

<p>There are three groups:
<ul>
    <li>Group 1 (cluster 0): Featuring different kinds of restaurants</li>
    <li>Group 2 (cluster 1): Featuring parks</li>
    <li>Group 3 (cluster 2): Featuring more diversified venues, ranging from convenience stores to basketball courts to electronics stores</li>
</ul>
</p>

## Discussion

People can now more easily choose the neighborhood that best fits their lifestyle. Group 1 is for those who love eating out. Group 2 is for those who love outdoor activities such as running and cycling. Group 3 is for those who have no strong preference for either of the previous two groups.

Apart from knowing the nearby venues, there are other important aspects to consider, such as rent or housing prices and accessibility.

## Conclusion

Moving to a new place can be daunting and full of uncertainties; however, knowing more about the place beforehand can make it a more comfortable and even exciting experience.