# Capstone Project - The Battle of Neighborhoods

<div style="text-align: right">Update date : 2020. 11. 7.</div>

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1.  <a href="#item1">Introduction</a>

2.  <a href="#item2">Data source and how to use</a>

3.  <a href="#item3">Methodology</a>

4.  <a href="#item4">Result</a>

5.  <a href="#item5">Discussion</a>  

6.  <a href="#item6">Conclusion</a>  
    </font>
    </div>

<div id="item1"></div>

## 1. Introduction

**Seoul** is the capital of South Korea and a metropolitan city of over 10 million people. Every year, many people visit this big city, and each of them takes home a pleasant travel record. Most tourists rely on guidebooks to explore Seoul, but a few days are not enough to understand and grasp a city this large.
    
Using data science, we group the districts of Seoul with machine learning, tying similar districts together to offer first-time visitors a new perspective for understanding the city.

First, using Wikipedia and a geocoder, we collect and visualize information on each district of Seoul. The Foursquare API then lets us explore the venues in each district. After arranging the results in a pandas dataframe through one-hot encoding and normalization, we divide Seoul into five zones with similar characteristics to give tourists a rough picture of each area.

<div id="item2"></div>

## 2. Data source and how to use

We collect the data in a similar way to the previous example, Segmenting and Clustering Neighborhoods in New York City.

* Districts of Seoul : Wikipedia, <https://en.wikipedia.org/wiki/List_of_districts_of_Seoul>
    * BeautifulSoup will be used to parse the district information from the HTML table.
* Folium, geopy (Nominatim geocoder)
    * These will be used to look up each district's coordinates and to visualize the districts on a map.

In [3]:
import requests # library to handle requests
from bs4 import BeautifulSoup  # import beautiful soup for html parsing

import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files
from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import, pandas >= 1.0)

#!conda install -c conda-forge geopy --yes  # uncomment if geopy library is not installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

#!conda install -c conda-forge folium=0.5.0 --yes   # uncomment if folium library is not installed
import folium # map rendering library

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

print('Libraries imported.')

Libraries imported.


In [4]:
url = 'https://en.wikipedia.org/wiki/List_of_districts_of_Seoul'

res = requests.get(url)
res.status_code

200

In [5]:
soup = BeautifulSoup(res.text,'lxml')
table = soup.findAll('table')[1]

In [6]:
colnames = ['Name', 'Population', 'Area', 'PopDensity']  # Create dataframe with column names
df = pd.DataFrame(columns = colnames)

df

Unnamed: 0,Name,Population,Area,PopDensity


In [7]:
# Add data from parsed texts into dataframe
for tr in table.find_all('tr'):
    row_data=[]
    for td in tr.find_all('td'):
        row_data.append(td.text.strip())
    if len(row_data)==4:
        df.loc[len(df)] = row_data

print(df.shape)
df.head()

(26, 4)


Unnamed: 0,Name,Population,Area,PopDensity
0,Dobong-gu (도봉구; 道峰區),355712,20.70 km²,17184/km²
1,Dongdaemun-gu (동대문구; 東大門區),376319,14.21 km²,26483/km²
2,Dongjak-gu (동작구; 銅雀區),419261,16.35 km²,25643/km²
3,Eunpyeong-gu (은평구; 恩平區),503243,29.70 km²,16944/km²
4,Gangbuk-gu (강북구; 江北區),338410,23.60 km²,14339/km²


In [8]:
df_new = pd.DataFrame()

# Split English district names out
df_new['Name_Eng'] = df.Name.str.split('(').str[0]
df_new['Name_Han'] = df.Name.str.split('(').str[1]

# Split Korean district names out
df_new['Name_Han'] = df_new.Name_Han.str.split(';').str[0]
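
The `str.split('(')` step above leaves the space that precedes the parenthesis attached to the English name, which is why the district names printed later carry a trailing blank. As a minimal sketch, assuming every entry follows the `English (Hangul; Hanja)` pattern seen in the table, a regular expression can return both names already trimmed. `NAME_PATTERN` and `split_district_name` are illustrative names, not part of the original notebook.

```python
import re

# Assumed layout: "English name (Hangul; Hanja)", as in the Name column above.
NAME_PATTERN = re.compile(r"^\s*(?P<eng>[^(]+?)\s*\(\s*(?P<kor>[^;)]+?)\s*[;)]")


def split_district_name(raw):
    """Return (english_name, korean_name) with surrounding whitespace removed."""
    match = NAME_PATTERN.match(raw)
    if match is None:
        # No parenthesised part found: keep the trimmed text, no Korean name.
        return raw.strip(), None
    return match.group("eng"), match.group("kor")


english, korean = split_district_name("Dobong-gu (도봉구; 道峰區)")
print(repr(english), repr(korean))
```

Applied to the dataframe, the same function could stand in for the two split steps, for example through `df['Name'].apply(split_district_name)`.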

In [9]:
df = pd.concat([df, df_new], axis=1, sort=False)
df = df.drop(['Name'], axis=1)
df = df.loc[:24]  # keep the 25 district rows (index 0-24) and drop the extra 26th row

df

Unnamed: 0,Population,Area,PopDensity,Name_Eng,Name_Han
0,355712,20.70 km²,17184/km²,Dobong-gu,도봉구
1,376319,14.21 km²,26483/km²,Dongdaemun-gu,동대문구
2,419261,16.35 km²,25643/km²,Dongjak-gu,동작구
3,503243,29.70 km²,16944/km²,Eunpyeong-gu,은평구
4,338410,23.60 km²,14339/km²,Gangbuk-gu,강북구
5,481332,24.59 km²,19574/km²,Gangdong-gu,강동구
6,583446,39.50 km²,14771/km²,Gangnam-gu,강남구
7,591653,41.43 km²,14281/km²,Gangseo-gu,강서구
8,258030,13.02 km²,19818/km²,Geumcheon-gu,금천구
9,457131,20.12 km²,22720/km²,Guro-gu,구로구


In [10]:
# move the two name columns to the front (avoid shadowing the built-in `filter`)
column_order = df.columns[-2:].tolist() + df.columns[:-2].tolist()
df = df[column_order]
df = df.rename(columns={'Name_Eng': 'district', 'Name_Han':'K_name'})

df.head()

Unnamed: 0,district,K_name,Population,Area,PopDensity
0,Dobong-gu,도봉구,355712,20.70 km²,17184/km²
1,Dongdaemun-gu,동대문구,376319,14.21 km²,26483/km²
2,Dongjak-gu,동작구,419261,16.35 km²,25643/km²
3,Eunpyeong-gu,은평구,503243,29.70 km²,16944/km²
4,Gangbuk-gu,강북구,338410,23.60 km²,14339/km²
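
The `Population`, `Area`, and `PopDensity` columns are still plain strings at this point, because they were copied straight from the HTML cells. The sketch below shows one way the unit suffixes could be stripped so the values become numbers. It assumes the formats seen in the table above (`20.70 km²`, `17184/km²`), and the helper names are hypothetical.

```python
def parse_area_km2(text):
    """Convert a string such as '20.70 km²' to a float number of km²."""
    return float(text.replace("km²", "").replace(",", "").strip())


def parse_density_per_km2(text):
    """Convert a string such as '17184/km²' to an integer (people per km²)."""
    return int(text.replace("/km²", "").replace(",", "").strip())


def parse_count(text):
    """Convert a population string such as '355712' (or '355,712') to an int."""
    return int(text.replace(",", "").strip())


print(parse_area_km2("20.70 km²"), parse_density_per_km2("17184/km²"), parse_count("355712"))
```

In the notebook, these could be applied column by column, for example `df['Area'].apply(parse_area_km2)`.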


In [11]:
geolocator = Nominatim(user_agent="Seoul_explorer")

In [12]:
df['Code'] = df['K_name'].apply(geolocator.geocode).apply(lambda x: (x.latitude, x.longitude))

df[['Lat', 'Lng']] = df['Code'].apply(pd.Series)
df.drop(['Code'], axis=1, inplace=True)

df.head()

Unnamed: 0,district,K_name,Population,Area,PopDensity,Lat,Lng
0,Dobong-gu,도봉구,355712,20.70 km²,17184/km²,37.6686,127.0466
1,Dongdaemun-gu,동대문구,376319,14.21 km²,26483/km²,37.5742,127.0395
2,Dongjak-gu,동작구,419261,16.35 km²,25643/km²,37.5121,126.9395
3,Eunpyeong-gu,은평구,503243,29.70 km²,16944/km²,37.6024,126.9293
4,Gangbuk-gu,강북구,338410,23.60 km²,14339/km²,37.6395,127.0255
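
The one-line geocoding call above assumes that every lookup succeeds. geopy's `geocode` returns `None` when nothing is found, and the public Nominatim service asks clients to keep to roughly one request per second. The following is a minimal sketch of a more defensive loop. `geocode_all` and its parameters are hypothetical, and the `geocode` argument is any callable that behaves like `geolocator.geocode`.

```python
import time


def geocode_all(names, geocode, delay_seconds=1.0, sleep=time.sleep):
    """Look up each name, pausing between requests and tolerating misses.

    Returns a dict mapping name -> (latitude, longitude), or None on a miss.
    """
    coords = {}
    for i, name in enumerate(names):
        if i:
            sleep(delay_seconds)  # pause between consecutive requests
        location = geocode(name)
        if location is None:
            coords[name] = None  # nothing found for this name
        else:
            coords[name] = (location.latitude, location.longitude)
    return coords


# Possible use in the notebook (performs network requests, so left commented out):
# coords = geocode_all(df['K_name'], geolocator.geocode)
```

Districts that come back as `None` can then be inspected or retried with a more specific query before the map is drawn.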


In [13]:
address = 'Seoul'

location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Seoul are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Seoul are 37.5666791, 126.9782914.


In [14]:
# create map of Seoul using latitude and longitude values
map_seoul = folium.Map(location=[latitude, longitude], zoom_start=11)

# add a marker for each district
for lat, lng, district, k_name in zip(df['Lat'], df['Lng'], df['district'], df['K_name']):
    label = '{}, {}'.format(k_name, district)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_seoul)

map_seoul

<div id="item3"></div>

## 3. Methodology

We have now extracted the Seoul district information from Wikipedia and built a clean dataframe with geographical coordinates. Combining this location information with the Foursquare API, we explore the venues in each district and sort them by Foursquare category. We then apply one-hot encoding and normalization to this dataframe so that districts can be compared by the share of each venue category. Finally, we divide Seoul into 5 areas with k-means clustering and examine how each area differs from the others.
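
To make the encoding and normalization step concrete: one-hot encoding the venue category and then averaging per district yields, for each district, the share of its venues that fall into each category. The sketch below computes the same shares by direct counting on a few made-up rows. The district names and venues are invented for illustration, and `category_frequencies` is a hypothetical helper.

```python
from collections import Counter, defaultdict

# Made-up (district, venue category) pairs standing in for the Foursquare results.
venues = [
    ("A-gu", "Café"),
    ("A-gu", "Café"),
    ("A-gu", "Bakery"),
    ("A-gu", "Park"),
    ("B-gu", "Bakery"),
    ("B-gu", "Park"),
]


def category_frequencies(pairs):
    """Return {district: {category: share of that district's venues}}."""
    counts = defaultdict(Counter)
    for district, category in pairs:
        counts[district][category] += 1
    categories = sorted({category for _, category in pairs})
    frequencies = {}
    for district, counter in counts.items():
        total = sum(counter.values())
        # Categories absent from a district get a share of 0.0.
        frequencies[district] = {c: counter[c] / total for c in categories}
    return frequencies


frequencies = category_frequencies(venues)
print(frequencies["A-gu"])
```

Each district's shares sum to 1, so a district with few venues and a district with many are placed on the same scale before clustering.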

In [16]:
CLIENT_ID = 'Removed after importing data' # your Foursquare ID
CLIENT_SECRET = 'Removed after importing data' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 100 # A default Foursquare API limit value

In [17]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
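
The list comprehension in the function above assumes that every returned item carries a name, a location, and at least one category. A more tolerant extraction step might look like the sketch below. The payload layout is taken from the function above, while `venue_rows` is a hypothetical helper and the sample items are made up.

```python
def venue_rows(name, lat, lng, items):
    """Flatten explore-style items into tuples, tolerating missing fields."""
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or []
        # Use the first category when present, otherwise None.
        category = categories[0].get("name") if categories else None
        rows.append((
            name,
            lat,
            lng,
            venue.get("name"),
            location.get("lat"),
            location.get("lng"),
            category,
        ))
    return rows


# Made-up items shaped like the entries of response -> groups[0] -> items.
sample_items = [
    {"venue": {"name": "Some Cafe", "location": {"lat": 37.66, "lng": 127.04},
               "categories": [{"name": "Café"}]}},
    {"venue": {"name": "Unlabelled Place", "location": {"lat": 37.67, "lng": 127.05},
               "categories": []}},
]
rows = venue_rows("Dobong-gu", 37.6686, 127.0466, sample_items)
print(rows)
```

Rows whose category is `None` can be dropped or labelled before the one-hot encoding step.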

In [18]:
# collect nearby venues for every district
seoul_venues = getNearbyVenues(names=df['district'],
                                   latitudes=df['Lat'],
                                   longitudes=df['Lng']
                                  )

print('................')
print('Process is done.')

Dobong-gu 
Dongdaemun-gu 
Dongjak-gu 
Eunpyeong-gu 
Gangbuk-gu 
Gangdong-gu 
Gangnam-gu 
Gangseo-gu 
Geumcheon-gu 
Guro-gu 
Gwanak-gu 
Gwangjin-gu 
Jongno-gu 
Jung-gu 
Jungnang-gu 
Mapo-gu 
Nowon-gu 
Seocho-gu 
Seodaemun-gu 
Seongbuk-gu 
Seongdong-gu 
Songpa-gu 
Yangcheon-gu 
Yeongdeungpo-gu 
Yongsan-gu 
................
Process is done.


In [20]:
print(seoul_venues.shape)
seoul_venues.head()

(705, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Dobong-gu,37.6686,127.0466,WAGEN COFFEE,37.666922,127.045057,Café
1,Dobong-gu,37.6686,127.0466,맥도날드 (McDonald's) (맥도날드),37.670196,127.043726,Fast Food Restaurant
2,Dobong-gu,37.6686,127.0466,Dunkin',37.668252,127.046433,Donut Shop
3,Dobong-gu,37.6686,127.0466,Baskin-Robbins,37.666314,127.046257,Ice Cream Shop
4,Dobong-gu,37.6686,127.0466,VIC Market (빅마켓),37.667676,127.045963,Big Box Store


In [21]:
seoul_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Dobong-gu,8,8,8,8,8,8
Dongdaemun-gu,20,20,20,20,20,20
Dongjak-gu,28,28,28,28,28,28
Eunpyeong-gu,9,9,9,9,9,9
Gangbuk-gu,17,17,17,17,17,17
Gangdong-gu,19,19,19,19,19,19
Gangnam-gu,30,30,30,30,30,30
Gangseo-gu,7,7,7,7,7,7
Geumcheon-gu,6,6,6,6,6,6
Guro-gu,8,8,8,8,8,8


In [22]:
print('There are {} unique categories.'.format(len(seoul_venues['Venue Category'].unique())))

There are 139 unique categories.


In [23]:
# one hot encoding
seoul_onehot = pd.get_dummies(seoul_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
seoul_onehot['Neighborhood'] = seoul_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [seoul_onehot.columns[-1]] + list(seoul_onehot.columns[:-1])
seoul_onehot = seoul_onehot[fixed_columns]

seoul_onehot.head()

Unnamed: 0,Neighborhood,African Restaurant,American Restaurant,Antique Shop,Aquarium,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Auto Workshop,BBQ Joint,Bagel Shop,Bakery,Bar,Bath House,Beer Bar,Beer Garden,Big Box Store,Bike Trail,Bistro,Bookstore,Bossam/Jokbal Restaurant,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Buffet,Bunsik Restaurant,Burger Joint,Bus Station,Bus Stop,Butcher,Café,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Chocolate Shop,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Convention Center,Cosmetics Shop,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Donut Shop,Dumpling Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant,Fish Market,Food Court,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Golf Driving Range,Grocery Store,Gukbap Restaurant,Gym,Gym / Fitness Center,Halal Restaurant,Health Food Store,Historic Site,History Museum,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Kebab Restaurant,Korean BBQ Restaurant,Korean Restaurant,Lounge,Market,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Multiplex,Museum,Nightclub,Noodle House,Optical Shop,Other Great Outdoors,Outlet Store,Paper / Office Supplies Store,Park,Performing Arts Venue,Photography Studio,Pizza Place,Plaza,Pub,Ramen Restaurant,Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Samgyetang Restaurant,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Soba Restaurant,Soccer Stadium,Spa,Spanish Restaurant,Sports Club,Steakhouse,Supermarket,Sushi Restaurant,Szechuan Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Trail,Turkish Restaurant,Udon Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Veterinarian,Vietnamese Restaurant,Village,Warehouse Store,Whisky Bar,Wine Bar,Wings Joint
0,Dobong-gu,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Dobong-gu,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Dobong-gu,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Dobong-gu,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Dobong-gu,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [24]:
seoul_onehot.shape

(705, 140)

In [25]:
seoul_grouped = seoul_onehot.groupby('Neighborhood').mean().reset_index()
seoul_grouped

Unnamed: 0,Neighborhood,African Restaurant,American Restaurant,Antique Shop,Aquarium,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Auto Workshop,BBQ Joint,Bagel Shop,Bakery,Bar,Bath House,Beer Bar,Beer Garden,Big Box Store,Bike Trail,Bistro,Bookstore,Bossam/Jokbal Restaurant,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Buffet,Bunsik Restaurant,Burger Joint,Bus Station,Bus Stop,Butcher,Café,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Chocolate Shop,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Convention Center,Cosmetics Shop,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Donut Shop,Dumpling Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant,Fish Market,Food Court,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Golf Driving Range,Grocery Store,Gukbap Restaurant,Gym,Gym / Fitness Center,Halal Restaurant,Health Food Store,Historic Site,History Museum,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Kebab Restaurant,Korean BBQ Restaurant,Korean Restaurant,Lounge,Market,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Multiplex,Museum,Nightclub,Noodle House,Optical Shop,Other Great Outdoors,Outlet Store,Paper / Office Supplies Store,Park,Performing Arts Venue,Photography Studio,Pizza Place,Plaza,Pub,Ramen Restaurant,Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Samgyetang Restaurant,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Soba Restaurant,Soccer Stadium,Spa,Spanish Restaurant,Sports Club,Steakhouse,Supermarket,Sushi Restaurant,Szechuan Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Trail,Turkish Restaurant,Udon Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Veterinarian,Vietnamese Restaurant,Village,Warehouse Store,Whisky Bar,Wine Bar,Wings Joint
0,Dobong-gu,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.125,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Dongdaemun-gu,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.45,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.05,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Dongjak-gu,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.107143,0.035714,0.0,0.0,0.071429,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.107143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Eunpyeong-gu,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.111111,0.0,0.111111,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Gangbuk-gu,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.176471,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.176471,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.117647,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Gangdong-gu,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.105263,0.0,0.052632,0.0,0.105263,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.105263,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.105263,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.157895,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Gangnam-gu,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.033333,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.033333,0.0,0.0,0.066667,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.033333,0.0
7,Gangseo-gu,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Geumcheon-gu,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Guro-gu,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [26]:
seoul_grouped.shape

(25, 140)

In [27]:
num_top_venues = 5

for hood in seoul_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = seoul_grouped[seoul_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Dobong-gu ----
                  venue  freq
0        Ice Cream Shop  0.12
1  Fast Food Restaurant  0.12
2         Big Box Store  0.12
3                  Café  0.12
4            Bath House  0.12


----Dongdaemun-gu ----
               venue  freq
0          BBQ Joint  0.45
1           Bus Stop  0.10
2  Korean Restaurant  0.10
3             Market  0.05
4  Electronics Store  0.05


----Dongjak-gu ----
                  venue  freq
0           Coffee Shop  0.14
1    Seafood Restaurant  0.14
2     Korean Restaurant  0.11
3  Fast Food Restaurant  0.11
4            Donut Shop  0.07


----Eunpyeong-gu ----
              venue  freq
0    Clothing Store  0.11
1    Ice Cream Shop  0.11
2      Concert Hall  0.11
3  Sushi Restaurant  0.11
4       Coffee Shop  0.11


----Gangbuk-gu ----
               venue  freq
0        Coffee Shop  0.18
1         Donut Shop  0.18
2  Korean Restaurant  0.12
3             Bakery  0.06
4       Dessert Shop  0.06


----Gangdong-gu ----
                  venue  

In [28]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [29]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # no 'st'/'nd'/'rd' suffix beyond the third column
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = seoul_grouped['Neighborhood']

for ind in np.arange(seoul_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(seoul_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Dobong-gu,Sushi Restaurant,Big Box Store,Ice Cream Shop,Café,Fast Food Restaurant,Donut Shop,Bath House,Bakery,Dessert Shop,Dumpling Restaurant
1,Dongdaemun-gu,BBQ Joint,Bus Stop,Korean Restaurant,Supermarket,Donut Shop,Metro Station,Ice Cream Shop,Butcher,Market,Electronics Store
2,Dongjak-gu,Coffee Shop,Seafood Restaurant,Fast Food Restaurant,Korean Restaurant,Fried Chicken Joint,Ice Cream Shop,Donut Shop,Arcade,Paper / Office Supplies Store,Japanese Restaurant
3,Eunpyeong-gu,Sushi Restaurant,Clothing Store,Café,Ice Cream Shop,Coffee Shop,Concert Hall,Fried Chicken Joint,Korean Restaurant,Bakery,Deli / Bodega
4,Gangbuk-gu,Coffee Shop,Donut Shop,Korean Restaurant,Dessert Shop,Ice Cream Shop,Café,Bus Stop,Fast Food Restaurant,Brewery,Bookstore
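
The `try`/`except` lookup above is sufficient for the ten columns used here, but it would label an eleventh-plus list incorrectly at 21, 22, and 23. As a small sketch, a general ordinal helper could build the same headers. `ordinal` is a hypothetical function name.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 10th through 20th (and 110th-120th, ...) always take 'th'
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


num_top_venues = 10
columns = ["Neighborhood"] + [
    f"{ordinal(i)} Most Common Venue" for i in range(1, num_top_venues + 1)
]
print(columns)
```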


In [30]:
# set number of clusters
kclusters = 5

seoul_grouped_clustering = seoul_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(seoul_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 3, 0, 2, 0, 4, 2, 4, 1, 0], dtype=int32)
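
For readers unfamiliar with what `KMeans` does with the frequency vectors, the basic idea (Lloyd's algorithm) alternates between assigning each row to its nearest centroid and moving each centroid to the mean of its rows. The sketch below is a plain-Python illustration on made-up two-category vectors. It is not scikit-learn's implementation, which adds refinements such as k-means++ initialisation, and the function name `kmeans` is an assumption for this example.

```python
import math


def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm on a list of equal-length numeric tuples."""
    # Naive initialisation: take the first k points as starting centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the result is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Made-up category-share vectors for four imaginary districts.
points = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

The cluster numbers themselves carry no meaning; only which districts share a label matters.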

In [31]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

seoul_merged = df

# join the sorted venue table onto the district dataframe to add latitude/longitude for each district
seoul_merged = seoul_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='district')

seoul_merged.head() # check the last columns!

Unnamed: 0,district,K_name,Population,Area,PopDensity,Lat,Lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Dobong-gu,도봉구,355712,20.70 km²,17184/km²,37.6686,127.0466,0,Sushi Restaurant,Big Box Store,Ice Cream Shop,Café,Fast Food Restaurant,Donut Shop,Bath House,Bakery,Dessert Shop,Dumpling Restaurant
1,Dongdaemun-gu,동대문구,376319,14.21 km²,26483/km²,37.5742,127.0395,3,BBQ Joint,Bus Stop,Korean Restaurant,Supermarket,Donut Shop,Metro Station,Ice Cream Shop,Butcher,Market,Electronics Store
2,Dongjak-gu,동작구,419261,16.35 km²,25643/km²,37.5121,126.9395,0,Coffee Shop,Seafood Restaurant,Fast Food Restaurant,Korean Restaurant,Fried Chicken Joint,Ice Cream Shop,Donut Shop,Arcade,Paper / Office Supplies Store,Japanese Restaurant
3,Eunpyeong-gu,은평구,503243,29.70 km²,16944/km²,37.6024,126.9293,2,Sushi Restaurant,Clothing Store,Café,Ice Cream Shop,Coffee Shop,Concert Hall,Fried Chicken Joint,Korean Restaurant,Bakery,Deli / Bodega
4,Gangbuk-gu,강북구,338410,23.60 km²,14339/km²,37.6395,127.0255,0,Coffee Shop,Donut Shop,Korean Restaurant,Dessert Shop,Ice Cream Shop,Café,Bus Stop,Fast Food Restaurant,Brewery,Bookstore


In [32]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(seoul_merged['Lat'], seoul_merged['Lng'], seoul_merged['district'], seoul_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

<div id="item4"></div>
    
## 4. Result

We have now divided Seoul into 5 clusters. Let's look at each cluster in more detail.

In [33]:
seoul_merged.loc[seoul_merged['Cluster Labels'] == 0, seoul_merged.columns[[0]+[1]+list(range(5, seoul_merged.shape[1]))]]

Unnamed: 0,district,K_name,Lat,Lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Dobong-gu,도봉구,37.6686,127.0466,0,Sushi Restaurant,Big Box Store,Ice Cream Shop,Café,Fast Food Restaurant,Donut Shop,Bath House,Bakery,Dessert Shop,Dumpling Restaurant
2,Dongjak-gu,동작구,37.5121,126.9395,0,Coffee Shop,Seafood Restaurant,Fast Food Restaurant,Korean Restaurant,Fried Chicken Joint,Ice Cream Shop,Donut Shop,Arcade,Paper / Office Supplies Store,Japanese Restaurant
4,Gangbuk-gu,강북구,37.6395,127.0255,0,Coffee Shop,Donut Shop,Korean Restaurant,Dessert Shop,Ice Cream Shop,Café,Bus Stop,Fast Food Restaurant,Brewery,Bookstore
9,Guro-gu,구로구,37.4952,126.8877,0,Fried Chicken Joint,Fast Food Restaurant,Ice Cream Shop,Coffee Shop,Asian Restaurant,Korean Restaurant,Bakery,Theater,Wings Joint,Donut Shop
14,Jungnang-gu,중랑구,37.6063,127.093,0,Fast Food Restaurant,Ice Cream Shop,Coffee Shop,Park,Japanese Restaurant,Bakery,Trail,Wings Joint,Deli / Bodega,Dim Sum Restaurant
22,Yangcheon-gu,양천구,37.5171,126.8663,0,Korean Restaurant,Convention Center,Park,Fast Food Restaurant,Tennis Court,Bakery,Donut Shop,Coffee Shop,Ice Cream Shop,Café


In [34]:
seoul_merged.loc[seoul_merged['Cluster Labels'] == 1, seoul_merged.columns[[0]+[1]+list(range(5, seoul_merged.shape[1]))]]

Unnamed: 0,district,K_name,Lat,Lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,Geumcheon-gu,금천구,37.4565,126.8954,1,Event Space,Chinese Restaurant,Metro Station,Bus Station,Grocery Store,Bakery,Deli / Bodega,Donut Shop,Dim Sum Restaurant,Dessert Shop


In [35]:
seoul_merged.loc[seoul_merged['Cluster Labels'] == 2, seoul_merged.columns[[0]+[1]+list(range(5, seoul_merged.shape[1]))]]

Unnamed: 0,district,K_name,Lat,Lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Eunpyeong-gu,은평구,37.6024,126.9293,2,Sushi Restaurant,Clothing Store,Café,Ice Cream Shop,Coffee Shop,Concert Hall,Fried Chicken Joint,Korean Restaurant,Bakery,Deli / Bodega
6,Gangnam-gu,강남구,37.5177,127.0473,2,Coffee Shop,Bakery,Dessert Shop,Modern European Restaurant,Udon Restaurant,Noodle House,Sake Bar,Korean BBQ Restaurant,Salon / Barbershop,Snack Place
12,Jongno-gu,종로구,37.58031,126.983079,2,Korean Restaurant,Café,Coffee Shop,Italian Restaurant,Bakery,Art Gallery,History Museum,Art Museum,Dessert Shop,Thai Restaurant
15,Mapo-gu,마포구,37.566571,126.901532,2,Supermarket,Asian Restaurant,Soccer Stadium,Multiplex,Farmers Market,Fast Food Restaurant,Food Court,Auto Workshop,BBQ Joint,Cosmetics Shop
21,Songpa-gu,송파구,37.5145,127.1058,2,Korean Restaurant,Bakery,Japanese Restaurant,Coffee Shop,Café,BBQ Joint,Dessert Shop,Lounge,Bookstore,Seafood Restaurant
23,Yeongdeungpo-gu,영등포구,37.5262,126.8959,2,BBQ Joint,Korean Restaurant,Café,Park,Concert Hall,Chinese Restaurant,Sushi Restaurant,Food Court,Bakery,Bagel Shop
24,Yongsan-gu,용산구,37.5323,126.99,2,Bar,Korean Restaurant,Pub,Coffee Shop,Café,Lounge,BBQ Joint,Dumpling Restaurant,Pizza Place,Nightclub


In [36]:
seoul_merged.loc[seoul_merged['Cluster Labels'] == 3, seoul_merged.columns[[0]+[1]+list(range(5, seoul_merged.shape[1]))]]

Unnamed: 0,district,K_name,Lat,Lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Dongdaemun-gu,동대문구,37.5742,127.0395,3,BBQ Joint,Bus Stop,Korean Restaurant,Supermarket,Donut Shop,Metro Station,Ice Cream Shop,Butcher,Market,Electronics Store


In [37]:
seoul_merged.loc[seoul_merged['Cluster Labels'] == 4, seoul_merged.columns[[0]+[1]+list(range(5, seoul_merged.shape[1]))]]

Unnamed: 0,district,K_name,Lat,Lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Gangdong-gu,강동구,37.53,127.1237,4,Korean Restaurant,Coffee Shop,Fast Food Restaurant,Asian Restaurant,Bakery,BBQ Joint,Donut Shop,Steakhouse,Park,Grocery Store
7,Gangseo-gu,강서구,37.5509,126.8497,4,Chinese Restaurant,Coffee Shop,Spa,Noodle House,Bakery,Korean Restaurant,Department Store,Donut Shop,Dim Sum Restaurant,Dessert Shop
10,Gwanak-gu,관악구,37.4782,126.9518,4,Japanese Restaurant,Korean Restaurant,Coffee Shop,Chinese Restaurant,Bakery,Thai Restaurant,Café,Vietnamese Restaurant,Burger Joint,Ice Cream Shop
11,Gwangjin-gu,광진구,37.5384,127.0828,4,Korean Restaurant,Coffee Shop,Bunsik Restaurant,Bakery,Ice Cream Shop,Gukbap Restaurant,Snack Place,Market,Donut Shop,Seafood Restaurant
13,Jung-gu,중구,37.563656,126.99751,4,Korean Restaurant,Coffee Shop,Hotel,Noodle House,Bakery,Market,Seafood Restaurant,Donut Shop,Bunsik Restaurant,Sandwich Place
16,Nowon-gu,노원구,37.654,127.0567,4,Fast Food Restaurant,Bus Stop,Japanese Restaurant,Multiplex,Steakhouse,Farmers Market,Snack Place,Korean Restaurant,Department Store,Donut Shop
17,Seocho-gu,서초구,37.4835,127.0322,4,Coffee Shop,Korean Restaurant,BBQ Joint,Seafood Restaurant,Sake Bar,Gym / Fitness Center,Bunsik Restaurant,Gym,Noodle House,Performing Arts Venue
18,Seodaemun-gu,서대문구,37.579075,126.936786,4,Korean Restaurant,Health Food Store,Gym,Coffee Shop,Bus Station,Science Museum,Other Great Outdoors,Wings Joint,Dessert Shop,Department Store
19,Seongbuk-gu,성북구,37.59,127.0165,4,Korean Restaurant,Coffee Shop,Japanese Restaurant,Gym / Fitness Center,Café,Noodle House,Seafood Restaurant,Sandwich Place,Burger Joint,Bus Stop
20,Seongdong-gu,성동구,37.5635,127.0365,4,Coffee Shop,Korean Restaurant,Seafood Restaurant,Japanese Restaurant,Vietnamese Restaurant,Middle Eastern Restaurant,Bubble Tea Shop,Gym,Ramen Restaurant,Plaza
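
The five cells above repeat the same filter with a different label each time. As a minimal sketch, the same per-cluster listing could be produced in one pass with `groupby`. The `demo` frame below is a hypothetical stand-in for `seoul_merged` (a few rows and only the first two venue columns, copied from the tables above), and `districts_by_cluster` is a helper name introduced here for illustration, not part of the notebook.

```python
import pandas as pd

# Hypothetical excerpt standing in for seoul_merged (values copied from the tables above).
demo = pd.DataFrame(
    [
        ("Gangbuk-gu", 0, "Coffee Shop", "Donut Shop"),
        ("Guro-gu", 0, "Fried Chicken Joint", "Fast Food Restaurant"),
        ("Geumcheon-gu", 1, "Event Space", "Chinese Restaurant"),
        ("Gangnam-gu", 2, "Coffee Shop", "Bakery"),
        ("Songpa-gu", 2, "Korean Restaurant", "Bakery"),
        ("Dongdaemun-gu", 3, "BBQ Joint", "Bus Stop"),
        ("Jung-gu", 4, "Korean Restaurant", "Coffee Shop"),
    ],
    columns=["district", "Cluster Labels",
             "1st Most Common Venue", "2nd Most Common Venue"],
)


def districts_by_cluster(df, label_col="Cluster Labels"):
    """Return a dict mapping each cluster label to its rows (label column dropped)."""
    return {
        int(label): group.drop(columns=label_col)
        for label, group in df.groupby(label_col)
    }


clusters = districts_by_cluster(demo)
for label, group in clusters.items():
    print(f"Cluster {label}: {len(group)} district(s)")
    print(group.to_string(index=False))
    print()
```

In the notebook itself, passing `seoul_merged` instead of `demo` should give one sub-frame per cluster, assuming the label column is named `Cluster Labels` as in the cells above.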


<div id="item5"></div>

## 5. Discussion

Using the most common venues listed above, we can characterize each cluster as follows:

1. Cluster #0 has 6 districts, most of them on the outskirts of Seoul. These districts are less commercial than the others and have more parks than the other clusters. As in New York or Toronto, the core of Seoul is more commercial than its outer areas.
1. Cluster #1 has only 1 district, Geumcheon-gu. It is an industrial rather than a commercial area, with many plants. Many Chinese people also live here, so Chinese restaurants are the 2nd most common venue.
1. Cluster #2 has 7 districts and is quite commercial. Unlike the other clusters, it has easy-to-find special venues such as a multiplex, a stadium, lounges, and concert halls, so tourists should be able to enjoy themselves and have a nice experience here.
1. Cluster #3 has only 1 district, Dongdaemun-gu. Unlike the other clusters, this district is devoted to tourism and shopping. It has many shops and restaurants, so it would be the best place for tourists.
1. Cluster #4 has 10 districts. These are residential areas, so they have many gyms. They are also quite commercial thanks to their geographical advantages, so they have many foreign restaurants as well as hotels and other special venues.
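
To check descriptions like these against the tables, one option is to tally how often each venue category appears among a cluster's top-ranked venues. The sketch below is illustrative only: `demo` is a hypothetical excerpt of `seoul_merged` (six rows and the top three venue columns, copied from the tables above), and `venue_counts` is a helper name introduced here, not something defined in the notebook.

```python
import pandas as pd

# Hypothetical excerpt standing in for seoul_merged (values copied from the tables above).
demo = pd.DataFrame(
    [
        ("Jungnang-gu", 0, "Fast Food Restaurant", "Ice Cream Shop", "Coffee Shop"),
        ("Yangcheon-gu", 0, "Korean Restaurant", "Convention Center", "Park"),
        ("Jongno-gu", 2, "Korean Restaurant", "Café", "Coffee Shop"),
        ("Yongsan-gu", 2, "Bar", "Korean Restaurant", "Pub"),
        ("Seocho-gu", 4, "Coffee Shop", "Korean Restaurant", "BBQ Joint"),
        ("Seongdong-gu", 4, "Coffee Shop", "Korean Restaurant", "Seafood Restaurant"),
    ],
    columns=["district", "Cluster Labels", "1st Most Common Venue",
             "2nd Most Common Venue", "3rd Most Common Venue"],
)


def venue_counts(df, label_col="Cluster Labels"):
    """Count how often each venue category appears in the ranked venue columns, per cluster."""
    venue_cols = [c for c in df.columns if c.endswith("Most Common Venue")]
    # Reshape to one row per (cluster, venue) pair, then count the pairs.
    long = df.melt(id_vars=[label_col], value_vars=venue_cols, value_name="venue")
    return long.groupby([label_col, "venue"]).size().unstack(fill_value=0)


counts = venue_counts(demo)
print(counts)
```

Rows of `counts` are cluster labels and columns are venue categories, so a lookup such as `counts.loc[0, "Park"]` shows how many of that cluster's ranked slots a category fills. On the full `seoul_merged` frame the same helper would cover all ten ranked columns, since it selects every column ending in `Most Common Venue`.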

<div id="item6"></div>

## 6. Conclusion

Using the Foursquare API and a simple machine learning technique, we divided Seoul into 5 clusters and identified the characteristics of each cluster. Based on this information, we can make a few recommendations for tourists.

For first-time visitors to Seoul or Korea, Cluster #2 or #3 would be the best choice. These clusters are commercial enough to offer plenty of excitement. Cluster #4 might be less exciting, but it suits those who want to know more about the real Seoul. I do not recommend Cluster #0 or #1, which are located on the outskirts of Seoul, for a first visit; however, they could be a wonderful experience for those who have already visited Seoul a few times and want to know more about the city.