# Applied Data Science Capstone Project - Battle of the Neighborhoods - *A Study of the Guangdong-Hong Kong-Macao Greater Bay Area*

## <font color=blue>Introduction</font>

The Central Government of the People's Republic of China promulgated the Outline Development Plan for the Guangdong-Hong Kong-Macao Greater Bay Area (GBA) in 2019. The plan sets out the strategic development of the GBA, with a view to developing an international first-class bay area ideal for living, working and travelling.  The GBA comprises the two Special Administrative Regions of Hong Kong and Macao, and the nine municipalities of Guangzhou, Shenzhen, Zhuhai, Foshan, Huizhou, Dongguan, Zhongshan, Jiangmen and Zhaoqing in Guangdong Province. The total area is around 56,000 km². At the end of 2018, the total population was over 71 million, GDP was USD 1,642.5 billion and GDP per capita was USD 23,342, presenting vast opportunities for future development.  In light of this development potential, it is important to understand and harness the comparative advantages of each city in order to maximise the synergy between the "9+2" cities within the GBA.  This study therefore examines the cities of the GBA to identify the characteristics of each and to group them into strategic clusters that could inform future collaboration between cities in taking forward the GBA development.

## <font color=blue>Data</font>

In addition to the Foursquare location data, this study will use demographic statistics obtained from the GBA thematic website maintained by the Government of the Hong Kong Special Administrative Region (https://www.bayarea.gov.hk/) and from the Hong Kong Trade Development Council (http://hong-kong-economy-research.hktdc.com/business-news/article/Guangdong-Hong-Kong-Macau-Bay-Area/Statistics-of-the-Guangdong-Hong-Kong-Macao-Greater-Bay-Area/bayarea/en/1/1X000000/1X0AE3Q1.htm), including industry structure, GDP, population, area and trade value.  Using Foursquare, the study will identify the key venues of each city in the GBA and their longitude and latitude values, specifically venues related to economic activities, to find the most common types of venue in each city.  Combining the venue data with the demographic data, the study will perform a clustering analysis that separates the 11 cities into 3 clusters, and identify the strategic strengths of each cluster through analysis of its characteristics.  The study will also use the folium map package to visualise the clustering result.


### <font color=green>*Obtaining Demographic Data of the GBA Cities*</font>

The Hong Kong Trade Development Council maintains a thematic website on statistics of the GBA.  The BeautifulSoup4 package is used here with the lxml parser to scrape the website for a summary table of the demographic statistics of the 11 GBA cities.

#### Installing General Libraries

In [1]:
import numpy as np
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
import json
#!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim 
import requests 
from pandas.io.json import json_normalize 
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
!conda install -c conda-forge folium=0.5.0 --yes 
import folium 
print('Libraries imported.')

#### Install Beautiful Soup Package

In [2]:
!pip install beautifulsoup4
!pip install lxml



#### Scraping Data from Webpage

In [3]:
from bs4 import BeautifulSoup as bs
import requests
source = requests.get('http://hong-kong-economy-research.hktdc.com/business-news/article/Guangdong-Hong-Kong-Macau-Bay-Area/Statistics-of-the-Guangdong-Hong-Kong-Macao-Greater-Bay-Area/bayarea/en/1/1X000000/1X0AE3Q1.htm').text
soup = bs(source,'lxml')
print(soup.prettify())

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
 <head>
  <title>
   Statistics of the Guangdong-Hong Kong-Macao Greater Bay Area  | HKTDC
  </title>
  <link href="http://hong-kong-economy-research.hktdc.com/business-news/article/Guangdong-Hong-Kong-Macau-Bay-Area/Statistics-of-the-Guangdong-Hong-Kong-Macao-Greater-Bay-Area/bayarea/en/1/1X000000/1X0AE3Q1.htm" rel="canonical"/>
  <meta content="en" name="_l"/>
  <meta content="" name="_cds"/>
  <meta content="" name="_ads"/>
  <meta content="Bay Area economic indicators,Bay Area statistical comparison" name="_kw"/>
  <meta content="C,HK" name="_rg"/>
  <meta content="CHN,HKG,MAC" name="_co"/>
  <meta content="5" name="_fls"/>
  <meta content="715" name="_sls"/>
  <meta content="Mainland China, Hong Kong " name="region"/>
  <meta content="The Guangdong-Hong Kong-Macao Greater Bay Area’s city cluster will extend across Hong Kong, Macao and nine Pearl River Delta (PRD) cities. H

In [4]:
table = soup.find("table", class_="fairDetailTable")
table.prettify()

'<table border="0" class="fairDetailTable" style="text-align: center;">\n <tbody>\n  <tr style="height: 8px; background-color: #ff8c00;">\n   <td style="height: 8px; text-align: center;" valign="middle">\n    <strong>\n     City\n    </strong>\n    <br/>\n   </td>\n   <td style="height: 8px; text-align: center;" valign="middle">\n    <strong>\n     Land Area\n     <br/>\n     (sq. km)\n    </strong>\n   </td>\n   <td style="height: 8px; text-align: center;" valign="middle">\n    <strong>\n     Population\n     <br/>\n     (mn)\n    </strong>\n   </td>\n   <td style="height: 8px; text-align: center;" valign="middle">\n    <strong>\n     GDP\n     <sup>\n      1\n     </sup>\n     <br/>\n     (US$ bn\n     <sup>\n      2\n     </sup>\n     )\n    </strong>\n   </td>\n   <td style="height: 8px; text-align: center;" valign="middle">\n    <strong>\n     Per-capita GDP\n     <br/>\n     (US$\n     <sup>\n      2\n     </sup>\n     )\n    </strong>\n   </td>\n   <td style="height: 8px; text-a

In [5]:
# Collect the <td> cells of every table row that contains any
rows = []
for tr in table.find_all('tr'):
    cells = tr.find_all('td')
    if cells:
        rows.append(cells)
# Keep the stripped text of the first eight columns:
# city, area, population, gdp, gdp_pc, tertiary, export, fdi.
# `list` shadows the built-in of the same name; later cells refer to it by this name.
list = []
for cells in rows:
    list.append([cell.text.rstrip() for cell in cells[:8]])
list

[['City',
  'Land Area(sq. km)',
  'Population(mn)',
  'GDP1(US$ bn2)',
  'Per-capita GDP(US$2)',
  'GDP share of tertiary industry(%)',
  'Export(US$ bn2)',
  'Utilised FDI(US$ bn2)'],
 ['Guangdong-Hong Kong-Macao Greater Bay Area',
  '56,904',
  '71.16',
  '1,641.97',
  '23,075',
  '66.1',
  '1,145.84',
  '132.695'],
 ['Hong Kong',
  '1,107',
  '7.48',
  '362.66',
  '48,673',
  '92.43',
  '530.44',
  '110.73'],
 ['Macao', '33', '0.67', '54.54', '82,609', '94.93', '1.51', '0.3753'],
 ['Guangzhou', '7,434', '14.9', '345.44', '23,497', '71.8', '84.74', '6.611'],
 ['Shenzhen', '1,997', '13.03', '366.03', '28,647', '58.8', '245.94', '8.203'],
 ['Foshan', '3,798', '7.91', '150.15', '18,992', '42.0', '53.30', '0.691'],
 ['Dongguan', '2,460', '8.39', '125.1', '14,951', '51.1', '120.22', '1.361'],
 ['Huizhou', '11,347', '4.83', '62.0', '12,908', '43.0', '33.38', '0.959'],
 ['Zhongshan', '1,784', '3.31', '54.9', '16,711', '49.3', '27.23', '0.527'],
 ['Jiangmen', '9,507', '4.6', '43.83', '9,570

In [6]:
header = list[0]
header

['City',
 'Land Area(sq. km)',
 'Population(mn)',
 'GDP1(US$ bn2)',
 'Per-capita GDP(US$2)',
 'GDP share of tertiary industry(%)',
 'Export(US$ bn2)',
 'Utilised FDI(US$ bn2)']
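
The header strings above, such as `'GDP1(US$ bn2)'`, carry stray digits because the footnote markers inside `<sup>` tags are concatenated into the cell text. The following is a minimal sketch of one way to drop them while collecting table rows. It deliberately swaps BeautifulSoup for the standard-library `html.parser` so that it runs without extra packages; the class name `TableRows` and the inline `sample_html` snippet are illustrative assumptions, not the notebook's code or the live page.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td> per <tr>, ignoring <sup> footnote markers."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished rows, each a list of cell strings
        self._row = None      # cells of the row currently being read
        self._cell = None     # text fragments of the cell currently being read
        self._sup_depth = 0   # > 0 while inside a <sup> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []
        elif tag == "sup":
            self._sup_depth += 1

    def handle_endtag(self, tag):
        if tag == "sup" and self._sup_depth:
            self._sup_depth -= 1
        elif tag == "td" and self._cell is not None:
            # Join fragments and collapse runs of whitespace
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._cell is not None and self._sup_depth == 0:
            self._cell.append(data)


# Hypothetical fragment shaped like the scraped table (two columns only)
sample_html = """
<table class="fairDetailTable"><tbody>
<tr><td><strong>City</strong></td><td><strong>GDP<sup>1</sup><br/>(US$ bn<sup>2</sup>)</strong></td></tr>
<tr><td>Hong Kong</td><td>362.66</td></tr>
<tr><td>Macao</td><td>54.54</td></tr>
</tbody></table>
"""

parser = TableRows()
parser.feed(sample_html)
header, *records = parser.rows
print(header)
print(records)
```

Splitting the first row off as `header` mirrors the `list[0]` step in the notebook; the remaining rows are the data records.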

In [7]:
data = pd.DataFrame(list[2:],columns=['City','Area','Population','GDP','GDP_pc','Tertiary','Export','FDI'])
data

Unnamed: 0,City,Area,Population,GDP,GDP_pc,Tertiary,Export,FDI
0,Hong Kong,1107,7.48,362.66,48673,92.43,530.44,110.73
1,Macao,33,0.67,54.54,82609,94.93,1.51,0.3753
2,Guangzhou,7434,14.9,345.44,23497,71.8,84.74,6.611
3,Shenzhen,1997,13.03,366.03,28647,58.8,245.94,8.203
4,Foshan,3798,7.91,150.15,18992,42.0,53.3,0.691
5,Dongguan,2460,8.39,125.1,14951,51.1,120.22,1.361
6,Huizhou,11347,4.83,62.0,12908,43.0,33.38,0.959
7,Zhongshan,1784,3.31,54.9,16711,49.3,27.23,0.527
8,Jiangmen,9507,4.6,43.83,9570,44.5,16.97,0.734
9,Zhuhai,1736,1.89,44.05,24100,49.1,28.52,2.391


In [8]:
data.drop('GDP',axis=1,inplace=True)
data

Unnamed: 0,City,Area,Population,GDP_pc,Tertiary,Export,FDI
0,Hong Kong,1107,7.48,48673,92.43,530.44,110.73
1,Macao,33,0.67,82609,94.93,1.51,0.3753
2,Guangzhou,7434,14.9,23497,71.8,84.74,6.611
3,Shenzhen,1997,13.03,28647,58.8,245.94,8.203
4,Foshan,3798,7.91,18992,42.0,53.3,0.691
5,Dongguan,2460,8.39,14951,51.1,120.22,1.361
6,Huizhou,11347,4.83,12908,43.0,33.38,0.959
7,Zhongshan,1784,3.31,16711,49.3,27.23,0.527
8,Jiangmen,9507,4.6,9570,44.5,16.97,0.734
9,Zhuhai,1736,1.89,24100,49.1,28.52,2.391
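
The scraped values are strings, and some (area and per-capita GDP) contain thousands separators, so they need converting before any numeric work. A minimal sketch of that conversion in plain Python is shown below; the helper name `to_number` is an assumption for illustration, and the sample row is the Hong Kong row from the scraped list above.

```python
def to_number(text):
    """Convert a scraped numeric string such as '1,107' or '7.48' to a float."""
    return float(text.replace(",", "").strip())


# Hong Kong row as scraped: city name followed by seven numeric strings
raw_row = ['Hong Kong', '1,107', '7.48', '362.66', '48,673', '92.43', '530.44', '110.73']

city, *values = raw_row
numeric = [to_number(v) for v in values]
print(city, numeric)
```

Applying one helper to every numeric column avoids having to remember which columns happen to contain commas.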


In [9]:
# Take an explicit copy so the assignments below modify an independent DataFrame
# rather than a slice of `data` (which triggers pandas' SettingWithCopyWarning)
data_fig = data.iloc[:,1:].copy()

In [10]:
# Area: strip thousands separators and convert to float
data_fig.iloc[:,0] = data_fig.iloc[:,0].str.replace(',','').astype(float)

In [11]:
# GDP_pc: strip thousands separators and convert to float
data_fig.iloc[:,2] = data_fig.iloc[:,2].str.replace(',','').astype(float)


In [12]:
from sklearn.preprocessing import StandardScaler
data_stan = StandardScaler().fit_transform(data_fig)
data_stan


array([[-0.85122548,  0.23884755,  1.07857003,  1.82128004,  2.82790771,
         3.15161778],
       [-1.08021148, -1.37015153,  2.71064964,  1.95267863, -0.68103182,
        -0.373433  ],
       [ 0.49774507,  1.99197136, -0.13221604,  0.73697886, -0.12888119,
        -0.17424657],
       [-0.66146987,  1.55014636,  0.11546223,  0.05370619,  0.94052508,
        -0.12339345],
       [-0.27748125,  0.34044367, -0.34887442, -0.82929235, -0.33745521,
        -0.36334862],
       [-0.5627543 ,  0.4538533 , -0.5432177 , -0.35100147,  0.10649434,
        -0.34194687],
       [ 1.33203017, -0.3872681 , -0.64147143, -0.77673291, -0.46960517,
        -0.35478792],
       [-0.70688329, -0.74639859, -0.45857425, -0.44560846, -0.51040447,
        -0.36858726],
       [ 0.93972642, -0.44161021, -0.80200543, -0.69789376, -0.57846966,
        -0.36197508],
       [-0.7171173 , -1.08190206, -0.10321604, -0.45612035, -0.50184657,
        -0.30904567],
       [ 2.08764131, -0.54793174, -0.87510659, -1.

In [13]:
data_stan = pd.DataFrame(data_stan,columns=['Area','Population','GDP_pc','Tertiary','Export','FDI'])
data_stan

Unnamed: 0,Area,Population,GDP_pc,Tertiary,Export,FDI
0,-0.851225,0.238848,1.07857,1.82128,2.827908,3.151618
1,-1.080211,-1.370152,2.71065,1.952679,-0.681032,-0.373433
2,0.497745,1.991971,-0.132216,0.736979,-0.128881,-0.174247
3,-0.66147,1.550146,0.115462,0.053706,0.940525,-0.123393
4,-0.277481,0.340444,-0.348874,-0.829292,-0.337455,-0.363349
5,-0.562754,0.453853,-0.543218,-0.351001,0.106494,-0.341947
6,1.33203,-0.387268,-0.641471,-0.776733,-0.469605,-0.354788
7,-0.706883,-0.746399,-0.458574,-0.445608,-0.510404,-0.368587
8,0.939726,-0.44161,-0.802005,-0.697894,-0.57847,-0.361975
9,-0.717117,-1.081902,-0.103216,-0.45612,-0.501847,-0.309046
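
`StandardScaler` rescales each column to zero mean and unit variance, dividing by the population standard deviation. The sketch below reproduces that arithmetic for a single column using only the standard library. It uses the population figures of the ten cities visible in the table above; because the notebook standardises all 11 cities and the Zhaoqing row is not shown here, the resulting z-scores are close to, but not identical with, the values in the `Population` column.

```python
from statistics import mean, pstdev


def standardise(values):
    """Return z-scores: (value - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


cities = ['Hong Kong', 'Macao', 'Guangzhou', 'Shenzhen', 'Foshan',
          'Dongguan', 'Huizhou', 'Zhongshan', 'Jiangmen', 'Zhuhai']
population_mn = [7.48, 0.67, 14.9, 13.03, 7.91, 8.39, 4.83, 3.31, 4.6, 1.89]

z_scores = standardise(population_mn)
for name, z in zip(cities, z_scores):
    print('{:<10} {: .3f}'.format(name, z))
```

Standardising matters here because the columns are on very different scales (area in thousands of sq. km, population in millions), and distance-based clustering would otherwise be dominated by the largest-valued column.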


In [14]:
data_city = pd.DataFrame(data.iloc[:,0],columns=['City'])
data_city

Unnamed: 0,City
0,Hong Kong
1,Macao
2,Guangzhou
3,Shenzhen
4,Foshan
5,Dongguan
6,Huizhou
7,Zhongshan
8,Jiangmen
9,Zhuhai


In [15]:
data_clean = data_city.join(data_stan)

In [16]:
data_clean

Unnamed: 0,City,Area,Population,GDP_pc,Tertiary,Export,FDI
0,Hong Kong,-0.851225,0.238848,1.07857,1.82128,2.827908,3.151618
1,Macao,-1.080211,-1.370152,2.71065,1.952679,-0.681032,-0.373433
2,Guangzhou,0.497745,1.991971,-0.132216,0.736979,-0.128881,-0.174247
3,Shenzhen,-0.66147,1.550146,0.115462,0.053706,0.940525,-0.123393
4,Foshan,-0.277481,0.340444,-0.348874,-0.829292,-0.337455,-0.363349
5,Dongguan,-0.562754,0.453853,-0.543218,-0.351001,0.106494,-0.341947
6,Huizhou,1.33203,-0.387268,-0.641471,-0.776733,-0.469605,-0.354788
7,Zhongshan,-0.706883,-0.746399,-0.458574,-0.445608,-0.510404,-0.368587
8,Jiangmen,0.939726,-0.44161,-0.802005,-0.697894,-0.57847,-0.361975
9,Zhuhai,-0.717117,-1.081902,-0.103216,-0.45612,-0.501847,-0.309046


#### Obtain Coordinates of the Cities

As the geocoder package can be very unreliable, the geographical coordinates of the GBA cities are compiled into a CSV file uploaded to GitHub at https://github.com/Huirricane/Coursera_Capstone/blob/master/GBA_Coordinates.csv.

In [17]:
data_coor = pd.DataFrame ({'Latitude':[22.302711, 22.198746, 23.12911, 22.543097, 23.021479, 23.020674, 23.091181, 22.52747, 22.580391, 22.270979, 23.047192],
                           'Longitude':[114.177216, 113.543877, 113.264381, 114.057861, 113.121437, 113.751801, 114.400681, 113.361526, 113.080009, 113.576675, 112.465091]})
data_coor

Unnamed: 0,Latitude,Longitude
0,22.302711,114.177216
1,22.198746,113.543877
2,23.12911,113.264381
3,22.543097,114.057861
4,23.021479,113.121437
5,23.020674,113.751801
6,23.091181,114.400681
7,22.52747,113.361526
8,22.580391,113.080009
9,22.270979,113.576675


In [18]:
df = data_city.join(data_coor)
df

Unnamed: 0,City,Latitude,Longitude
0,Hong Kong,22.302711,114.177216
1,Macao,22.198746,113.543877
2,Guangzhou,23.12911,113.264381
3,Shenzhen,22.543097,114.057861
4,Foshan,23.021479,113.121437
5,Dongguan,23.020674,113.751801
6,Huizhou,23.091181,114.400681
7,Zhongshan,22.52747,113.361526
8,Jiangmen,22.580391,113.080009
9,Zhuhai,22.270979,113.576675
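
The join above pairs cities with coordinates by row position, so it is only correct while both frames list the cities in the same order. A minimal sketch of a lookup keyed on city name, which does not depend on ordering, is given below. The CSV column names (`City`, `Latitude`, `Longitude`) are assumptions, since the layout of `GBA_Coordinates.csv` is not shown here; the three sample rows reuse coordinates from the table above.

```python
import csv
import io

# Inline stand-in for the coordinates file (assumed column names)
csv_text = """City,Latitude,Longitude
Hong Kong,22.302711,114.177216
Macao,22.198746,113.543877
Guangzhou,23.12911,113.264381
"""

# Build a {city: (latitude, longitude)} lookup
coords = {
    row["City"]: (float(row["Latitude"]), float(row["Longitude"]))
    for row in csv.DictReader(io.StringIO(csv_text))
}

# Cities in a deliberately different order still get the right coordinates
cities = ["Guangzhou", "Hong Kong", "Macao"]
matched = [(city,) + coords[city] for city in cities]
print(matched)
```

In pandas the same idea would be a merge on the `City` column rather than an index-based join.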


In [19]:
df_full = data_clean.join(data_coor)
df_full

Unnamed: 0,City,Area,Population,GDP_pc,Tertiary,Export,FDI,Latitude,Longitude
0,Hong Kong,-0.851225,0.238848,1.07857,1.82128,2.827908,3.151618,22.302711,114.177216
1,Macao,-1.080211,-1.370152,2.71065,1.952679,-0.681032,-0.373433,22.198746,113.543877
2,Guangzhou,0.497745,1.991971,-0.132216,0.736979,-0.128881,-0.174247,23.12911,113.264381
3,Shenzhen,-0.66147,1.550146,0.115462,0.053706,0.940525,-0.123393,22.543097,114.057861
4,Foshan,-0.277481,0.340444,-0.348874,-0.829292,-0.337455,-0.363349,23.021479,113.121437
5,Dongguan,-0.562754,0.453853,-0.543218,-0.351001,0.106494,-0.341947,23.020674,113.751801
6,Huizhou,1.33203,-0.387268,-0.641471,-0.776733,-0.469605,-0.354788,23.091181,114.400681
7,Zhongshan,-0.706883,-0.746399,-0.458574,-0.445608,-0.510404,-0.368587,22.52747,113.361526
8,Jiangmen,0.939726,-0.44161,-0.802005,-0.697894,-0.57847,-0.361975,22.580391,113.080009
9,Zhuhai,-0.717117,-1.081902,-0.103216,-0.45612,-0.501847,-0.309046,22.270979,113.576675


### <font color=green>*Obtaining Location Data with Foursquare*</font>


#### Create a Map of the GBA

In [20]:
address = 'Guangdong'
geolocator = Nominatim(user_agent='explorer')
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of the GBA are {}, {}'.format(latitude, longitude))

The geographical coordinates of the GBA are 23.1357694, 113.1982688


In [21]:
map = folium.Map(location=[latitude, longitude], zoom_start=8)
for lat, lng, city in zip(df['Latitude'], df['Longitude'], df['City']):
    label = '{}'.format(city)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map)  
map

#### Initiate Foursquare

In [22]:
# The code was removed by Watson Studio for sharing.
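
The hidden cell defines the `CLIENT_ID`, `CLIENT_SECRET` and `VERSION` values used by the requests below. One way to supply them without committing secrets to the notebook is to read them from environment variables, as in the sketch below. The variable names (`FOURSQUARE_CLIENT_ID` and so on), the helper `load_foursquare_credentials` and the default version date are all assumptions for illustration; the removed cell's actual contents are unknown.

```python
import os


def load_foursquare_credentials(env=None):
    """Return (client_id, client_secret, version) from environment-style settings.

    `env` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if env is None else env
    required = ("FOURSQUARE_CLIENT_ID", "FOURSQUARE_CLIENT_SECRET")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError("Missing settings: " + ", ".join(missing))
    # The v2 API expects a YYYYMMDD version date; this default is a placeholder
    version = env.get("FOURSQUARE_VERSION", "20180605")
    return env[required[0]], env[required[1]], version


# Demonstration with dummy values instead of real credentials
demo_env = {"FOURSQUARE_CLIENT_ID": "demo-id", "FOURSQUARE_CLIENT_SECRET": "demo-secret"}
CLIENT_ID, CLIENT_SECRET, VERSION = load_foursquare_credentials(demo_env)
print(CLIENT_ID, VERSION)
```

Calling `load_foursquare_credentials()` with no argument would read the real values from the process environment.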

In [23]:
city_latitude = df.loc[0, 'Latitude']
city_longitude = df.loc[0, 'Longitude']
city_name = df.loc[0, 'City']
print('Latitude and longitude values of {} are {}, {}.'.format(city_name, 
                                                               city_latitude, 
                                                               city_longitude))

Latitude and longitude values of Hong Kong are 22.302711, 114.177216.


In [24]:
LIMIT=100
radius=5000
url='https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID,
    CLIENT_SECRET,
    VERSION,
    city_latitude,
    city_longitude,
    radius,
    LIMIT)
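
As an alternative to formatting the query string by hand, the request URL can be assembled from a dictionary of parameters, which keeps each parameter labelled and handles escaping. The sketch below uses the standard library's `urlencode`; the function name `build_explore_url` and the dummy credential values are assumptions for illustration.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lng,
                      radius=5000, limit=100):
    """Build a venues/explore URL for a point, search radius (m) and result limit."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in the ll=lat,lng pair unescaped
    return EXPLORE_ENDPOINT + "?" + urlencode(params, safe=",")


# Dummy credentials; the coordinates are those of Hong Kong used above
url = build_explore_url("demo-id", "demo-secret", "20180605",
                        22.302711, 114.177216)
print(url)
```

With `requests`, the same dictionary could instead be passed through the `params` argument of `requests.get`.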

In [25]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e35938a618f43001b316914'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Hong Kong',
  'headerFullLocation': 'Hong Kong',
  'headerLocationGranularity': 'city',
  'totalResults': 237,
  'suggestedBounds': {'ne': {'lat': 22.347711045000043,
    'lng': 114.22576380010017},
   'sw': {'lat': 22.257710954999954, 'lng': 114.12866819989983}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4d26701677a2a1cde9bf4fb7',
       'name': 'Hotel ICON (唯港薈)',
       'location': {'address': '17 Science Museum Road, Tsim Sha Tsui East',
        'lat': 22.300801104582376,
        'lng': 114.17971994113918,
        'labeledLatLngs': [{'label': 'display',
       

In [26]:
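def get_category_type(row):
    # Raw venue records use 'categories'; rows flattened by json_normalize use 'venue.categories'
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']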
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [27]:
venues = results['response']['groups'][0]['items']
nearby_venues = json_normalize(venues) # flatten JSON
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]
nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Hotel ICON (唯港薈),Hotel,22.300801,114.17972
1,Hong Kong Museum of History (香港歷史博物館),History Museum,22.301474,114.177297
2,Din Tai Fung (鼎泰豐),Dumpling Restaurant,22.300782,114.172039
3,InterContinental Grand Stanford Hong Kong (海景嘉...,Hotel,22.299053,114.179393
4,Mira Place 2,Shopping Mall,22.300233,114.172398


In [28]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

100 venues were returned by Foursquare.


In [29]:
def getNearbyVenues(names, latitudes, longitudes, radius=10000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
        results = requests.get(url).json()["response"]['groups'][0]['items']
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['City', 
                  'City Latitude', 
                  'City Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
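
The expression `v['venue']['categories'][0]['name']` in the function above raises an `IndexError` if the API returns a venue with an empty category list. A minimal sketch of a more defensive lookup is shown below; the helper name `first_category_name` and the second sample item (a venue without categories) are hypothetical, while the first sample item follows the shape of the response printed earlier.

```python
def first_category_name(item, default=None):
    """Return the name of a venue's first category, or `default` if it has none."""
    categories = item.get("venue", {}).get("categories") or []
    return categories[0]["name"] if categories else default


sample_items = [
    {"venue": {"name": "Hotel ICON (唯港薈)", "categories": [{"name": "Hotel"}]}},
    {"venue": {"name": "Uncategorised place", "categories": []}},  # hypothetical
]

names = [first_category_name(item, default="Unknown") for item in sample_items]
print(names)
```

Substituting a placeholder such as `"Unknown"` keeps the venue in the table; whether to keep or drop such venues is a judgment call for the analysis.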

In [30]:
venues = getNearbyVenues(names=df['City'],
                         latitudes=df['Latitude'],
                         longitudes=df['Longitude'])

Hong Kong
Macao
Guangzhou
Shenzhen
Foshan
Dongguan
Huizhou
Zhongshan
Jiangmen
Zhuhai
Zhaoqing


In [31]:
print(venues.shape)
venues

(650, 7)


Unnamed: 0,City,City Latitude,City Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Hong Kong,22.302711,114.177216,Hotel ICON (唯港薈),22.300801,114.17972,Hotel
1,Hong Kong,22.302711,114.177216,Din Tai Fung (鼎泰豐),22.300782,114.172039,Dumpling Restaurant
2,Hong Kong,22.302711,114.177216,Hong Kong Museum of History (香港歷史博物館),22.301474,114.177297,History Museum
3,Hong Kong,22.302711,114.177216,Mira Place 2,22.300233,114.172398,Shopping Mall
4,Hong Kong,22.302711,114.177216,The Peninsula Hong Kong (香港半島酒店),22.295102,114.171854,Hotel
5,Hong Kong,22.302711,114.177216,InterContinental Grand Stanford Hong Kong (海景嘉...,22.299053,114.179393,Hotel
6,Hong Kong,22.302711,114.177216,InterContinental Hong Kong (香港洲際酒店),22.293497,114.173866,Hotel
7,Hong Kong,22.302711,114.177216,Broadway Cinematheque (百老匯電影中心),22.31061,114.16873,Multiplex
8,Hong Kong,22.302711,114.177216,Morton's The Steakhouse,22.294655,114.172873,Steakhouse
9,Hong Kong,22.302711,114.177216,Kowloon Shangri-La (九龍香格里拉大酒店),22.297371,114.176921,Hotel


In [32]:
venues.groupby('City').count()

Unnamed: 0_level_0,City Latitude,City Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
City,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Dongguan,60,60,60,60,60,60
Foshan,56,56,56,56,56,56
Guangzhou,100,100,100,100,100,100
Hong Kong,100,100,100,100,100,100
Huizhou,12,12,12,12,12,12
Jiangmen,11,11,11,11,11,11
Macao,100,100,100,100,100,100
Shenzhen,100,100,100,100,100,100
Zhaoqing,4,4,4,4,4,4
Zhongshan,27,27,27,27,27,27


In [33]:
print('There are {} unique categories.'.format(len(venues['Venue Category'].unique())))

There are 130 unique categories.


In [34]:
onehot = pd.get_dummies(venues[['Venue Category']], prefix="", prefix_sep="")
onehot['City'] = venues['City'] 
onehot.head()

Unnamed: 0,Airport,American Restaurant,Argentinian Restaurant,Art Gallery,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Beer Bar,Bookstore,Border Crossing,Buddhist Temple,Buffet,Burger Joint,Bus Station,Café,Cantonese Restaurant,Casino,Cha Chaan Teng,Chinese Restaurant,Church,Clothing Store,Club House,Cocktail Bar,Coffee Shop,Concert Hall,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Fast Food Restaurant,Fish & Chips Shop,French Restaurant,Furniture / Home Store,Garden,General Entertainment,German Restaurant,Gift Shop,Golf Course,Grocery Store,Gym / Fitness Center,Gym Pool,Hainan Restaurant,Halal Restaurant,Historic Site,History Museum,Hong Kong Restaurant,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Island,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Karaoke Bar,Korean Restaurant,Lake,Lounge,Macanese Restaurant,Market,Mexican Restaurant,Middle Eastern Restaurant,Monument / Landmark,Motel,Multiplex,Museum,Music Venue,Nature Preserve,Neighborhood,New American Restaurant,Nightclub,Non-Profit,Noodle House,Opera House,Outdoor Sculpture,Pakistani Restaurant,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Pier,Pizza Place,Plaza,Portuguese Restaurant,Ramen Restaurant,Record Shop,Resort,Rest Area,Restaurant,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Soup Place,South Indian Restaurant,Spa,Spanish Restaurant,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Taco Place,Taxi Stand,Temple,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Toy / Game Store,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Yoga Studio,Zoo,City
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Hong Kong
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Hong Kong
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Hong Kong
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Hong Kong
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Hong Kong


In [35]:
# Move the 'City' column from the last position to the first
col = onehot.columns.tolist()
col = col[-1:] + col[:-1]
onehot = onehot[col]

In [36]:
onehot.head()

Unnamed: 0,City,Airport,American Restaurant,Argentinian Restaurant,Art Gallery,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Beer Bar,Bookstore,Border Crossing,Buddhist Temple,Buffet,Burger Joint,Bus Station,Café,Cantonese Restaurant,Casino,Cha Chaan Teng,Chinese Restaurant,Church,Clothing Store,Club House,Cocktail Bar,Coffee Shop,Concert Hall,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Fast Food Restaurant,Fish & Chips Shop,French Restaurant,Furniture / Home Store,Garden,General Entertainment,German Restaurant,Gift Shop,Golf Course,Grocery Store,Gym / Fitness Center,Gym Pool,Hainan Restaurant,Halal Restaurant,Historic Site,History Museum,Hong Kong Restaurant,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Island,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Karaoke Bar,Korean Restaurant,Lake,Lounge,Macanese Restaurant,Market,Mexican Restaurant,Middle Eastern Restaurant,Monument / Landmark,Motel,Multiplex,Museum,Music Venue,Nature Preserve,Neighborhood,New American Restaurant,Nightclub,Non-Profit,Noodle House,Opera House,Outdoor Sculpture,Pakistani Restaurant,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Pier,Pizza Place,Plaza,Portuguese Restaurant,Ramen Restaurant,Record Shop,Resort,Rest Area,Restaurant,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Soup Place,South Indian Restaurant,Spa,Spanish Restaurant,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Taco Place,Taxi Stand,Temple,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Toy / Game Store,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Yoga Studio,Zoo
0,Hong Kong,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Hong Kong,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Hong Kong,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Hong Kong,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Hong Kong,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [37]:
onehot.shape

(650, 131)
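
Each row of the one-hot table marks a single venue's category with a 1, so averaging those rows per city gives the share of that city's venues falling in each category. The sketch below shows the same computation in plain Python on a handful of made-up venue rows; the city names, categories and the helper `category_frequencies` are hypothetical and are not taken from the Foursquare results.

```python
from collections import Counter, defaultdict

# Hypothetical (city, venue category) pairs
venue_rows = [
    ("City A", "Hotel"),
    ("City A", "Hotel"),
    ("City A", "Coffee Shop"),
    ("City A", "Park"),
    ("City B", "Coffee Shop"),
    ("City B", "Shopping Mall"),
]


def category_frequencies(rows):
    """Return {city: {category: share of that city's venues}}."""
    counts = defaultdict(Counter)
    for city, category in rows:
        counts[city][category] += 1
    return {
        city: {cat: n / sum(counter.values()) for cat, n in counter.items()}
        for city, counter in counts.items()
    }


freq = category_frequencies(venue_rows)
print(freq["City A"])
print(freq["City B"])
```

Because the shares are relative to each city's own venue count, a city with only a few returned venues can show large frequencies from one or two venues, which is worth keeping in mind when comparing cities.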

In [38]:
grouped = onehot.groupby('City').mean().reset_index()
grouped

Unnamed: 0,City,Airport,American Restaurant,Argentinian Restaurant,Art Gallery,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Beer Bar,Bookstore,Border Crossing,Buddhist Temple,Buffet,Burger Joint,Bus Station,Café,Cantonese Restaurant,Casino,Cha Chaan Teng,Chinese Restaurant,Church,Clothing Store,Club House,Cocktail Bar,Coffee Shop,Concert Hall,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Fast Food Restaurant,Fish & Chips Shop,French Restaurant,Furniture / Home Store,Garden,General Entertainment,German Restaurant,Gift Shop,Golf Course,Grocery Store,Gym / Fitness Center,Gym Pool,Hainan Restaurant,Halal Restaurant,Historic Site,History Museum,Hong Kong Restaurant,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Island,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Karaoke Bar,Korean Restaurant,Lake,Lounge,Macanese Restaurant,Market,Mexican Restaurant,Middle Eastern Restaurant,Monument / Landmark,Motel,Multiplex,Museum,Music Venue,Nature Preserve,Neighborhood,New American Restaurant,Nightclub,Non-Profit,Noodle House,Opera House,Outdoor Sculpture,Pakistani Restaurant,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Pier,Pizza Place,Plaza,Portuguese Restaurant,Ramen Restaurant,Record Shop,Resort,Rest Area,Restaurant,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Soup Place,South Indian Restaurant,Spa,Spanish Restaurant,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Taco Place,Taxi Stand,Temple,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Toy / Game Store,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Yoga Studio,Zoo
0,Dongguan,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.216667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.116667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.016667,0.0,0.1,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.05,0.0,0.016667,0.033333,0.0,0.0,0.016667,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.016667,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Foshan,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.214286,0.0,0.0,0.0,0.017857,0.017857,0.0,0.0,0.0,0.125,0.0,0.0,0.089286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.017857,0.017857,0.0,0.0,0.0,0.142857,0.017857,0.0,0.017857,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.017857,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.107143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Guangzhou,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.03,0.0,0.0,0.03,0.0,0.01,0.0,0.02,0.06,0.01,0.01,0.02,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.01,0.18,0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.06,0.0,0.01,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.01,0.01,0.01,0.06,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.0
3,Hong Kong,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.02,0.0,0.01,0.04,0.0,0.03,0.01,0.01,0.03,0.0,0.01,0.02,0.02,0.0,0.04,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.21,0.02,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.03,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.04,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.03,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.01,0.01
4,Huizhou,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Jiangmen,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.272727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Macao,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.08,0.02,0.01,0.0,0.06,0.03,0.0,0.0,0.02,0.06,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.15,0.02,0.01,0.0,0.01,0.0,0.0,0.03,0.0,0.01,0.0,0.01,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.03,0.07,0.0,0.0,0.05,0.0,0.01,0.0,0.02,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0
7,Shenzhen,0.0,0.01,0.0,0.01,0.01,0.01,0.01,0.01,0.03,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.03,0.02,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.09,0.01,0.01,0.0,0.01,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.15,0.01,0.02,0.01,0.0,0.01,0.0,0.01,0.01,0.03,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.01,0.01,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.07,0.0,0.0,0.0,0.02,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Zhaoqing,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.75,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Zhongshan,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.185185,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [39]:
grouped.shape

(11, 131)

In [40]:
num_top_venues = 5
for city in grouped['City']:
    print("----"+city+"----")
    temp = grouped[grouped['City'] == city].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Dongguan----
                  venue  freq
0           Coffee Shop  0.22
1  Fast Food Restaurant  0.12
2    Italian Restaurant  0.10
3           Pizza Place  0.07
4         Shopping Mall  0.07


----Foshan----
                    venue  freq
0             Coffee Shop  0.21
1                   Hotel  0.14
2    Fast Food Restaurant  0.12
3           Shopping Mall  0.11
4  Furniture / Home Store  0.09


----Guangzhou----
                venue  freq
0               Hotel  0.18
1         Coffee Shop  0.06
2                Park  0.06
3       Shopping Mall  0.06
4  Turkish Restaurant  0.04


----Hong Kong----
                 venue  freq
0                Hotel  0.21
1  Dumpling Restaurant  0.04
2   Chinese Restaurant  0.04
3                 Park  0.04
4                 Café  0.03


----Huizhou----
           venue  freq
0          Hotel  0.33
1  Shopping Mall  0.17
2    Coffee Shop  0.17
3           Lake  0.08
4  Train Station  0.08


----Jiangmen----
                  venue  freq
0      

In [41]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

In [42]:
num_top_venues = 10
indicators = ['st', 'nd', 'rd']
# Build the column headers: 1st, 2nd, 3rd, then 4th ... 10th
columns = ['City']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # No special suffix beyond 'rd', so fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))
venues_sorted = pd.DataFrame(columns=columns)
venues_sorted['City'] = grouped['City']
# Fill each city's row with its most frequent venue categories
for ind in np.arange(grouped.shape[0]):
    venues_sorted.iloc[ind, 1:] = return_most_common_venues(grouped.iloc[ind, :], num_top_venues)
venues_sorted.head()

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Dongguan,Coffee Shop,Fast Food Restaurant,Italian Restaurant,Pizza Place,Hotel,Shopping Mall,Resort,Bar,Sandwich Place,Thai Restaurant
1,Foshan,Coffee Shop,Hotel,Fast Food Restaurant,Shopping Mall,Furniture / Home Store,Pizza Place,Restaurant,Dim Sum Restaurant,Italian Restaurant,Diner
2,Guangzhou,Hotel,Park,Coffee Shop,Shopping Mall,Turkish Restaurant,Middle Eastern Restaurant,Chinese Restaurant,Cantonese Restaurant,Café,Electronics Store
3,Hong Kong,Hotel,Dumpling Restaurant,Park,Chinese Restaurant,Clothing Store,Electronics Store,Japanese Restaurant,Café,Shopping Mall,Coffee Shop
4,Huizhou,Hotel,Coffee Shop,Shopping Mall,Lake,Japanese Restaurant,Fast Food Restaurant,Train Station,Grocery Store,Golf Course,Gift Shop
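The ranking step can be illustrated on a tiny made-up table. The sketch below is not the notebook's data: the cities `A` and `B`, the three venue columns and the helper name `top_venues` are all hypothetical, chosen only to show how sorting one row of frequencies yields the "most common venue" columns.

```python
import pandas as pd

# Hypothetical toy frequencies (not taken from the study's data)
toy = pd.DataFrame({
    'City':  ['A', 'B'],
    'Hotel': [0.5, 0.1],
    'Park':  [0.2, 0.6],
    'Café':  [0.3, 0.3],
})

def top_venues(row, n):
    """Return the n venue categories with the highest frequency in one row."""
    freqs = row.drop('City').astype(float)
    return freqs.sort_values(ascending=False).index[:n].tolist()

ranked = {row['City']: top_venues(row, 2) for _, row in toy.iterrows()}
print(ranked)
```

When several categories share the same frequency (Dongguan's Pizza Place, Hotel and Shopping Mall all sit at 0.066667), their relative order depends on the sort algorithm, which may be why the printed top-5 list and the table above rank them differently. Passing `kind='stable'` to `sort_values` would keep ties in column order.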


## <font color=blue>Methodology</font>

To compare the cities in the GBA and identify their strategic strengths, the study applies k-means clustering to group the 11 cities into 5 clusters. The clustering input combines the standardised demographic indicators with each city's venue-frequency profile.
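The study fixes the number of clusters in advance. One common way to sanity-check such a choice is to compare silhouette scores across candidate values of k. The sketch below uses 11 made-up points in three tight groups, not the study's data, and the names `X`, `scores` and `best_k` are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Hypothetical toy data: 11 points arranged in three well-separated groups
X = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1],
    [10.0, 0.0], [10.1, 0.0], [10.0, 0.1],
])

scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    # Silhouette is close to 1 when clusters are compact and far apart
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

With only 11 cities, any such score rests on very few points per cluster, so it is better read as a rough guide than as a definitive answer.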

In [43]:
final = pd.merge(data_clean,grouped,on='City')
final

Unnamed: 0,City,Area,Population,GDP_pc,Tertiary,Export,FDI,Airport,American Restaurant,Argentinian Restaurant,Art Gallery,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Beer Bar,Bookstore,Border Crossing,Buddhist Temple,Buffet,Burger Joint,Bus Station,Café,Cantonese Restaurant,Casino,Cha Chaan Teng,Chinese Restaurant,Church,Clothing Store,Club House,Cocktail Bar,Coffee Shop,Concert Hall,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Fast Food Restaurant,Fish & Chips Shop,French Restaurant,Furniture / Home Store,Garden,General Entertainment,German Restaurant,Gift Shop,Golf Course,Grocery Store,Gym / Fitness Center,Gym Pool,Hainan Restaurant,Halal Restaurant,Historic Site,History Museum,Hong Kong Restaurant,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Island,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Karaoke Bar,Korean Restaurant,Lake,Lounge,Macanese Restaurant,Market,Mexican Restaurant,Middle Eastern Restaurant,Monument / Landmark,Motel,Multiplex,Museum,Music Venue,Nature Preserve,Neighborhood,New American Restaurant,Nightclub,Non-Profit,Noodle House,Opera House,Outdoor Sculpture,Pakistani Restaurant,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Pier,Pizza Place,Plaza,Portuguese Restaurant,Ramen Restaurant,Record Shop,Resort,Rest Area,Restaurant,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Soup Place,South Indian Restaurant,Spa,Spanish Restaurant,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Taco Place,Taxi Stand,Temple,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Toy / Game Store,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Yoga Studio,Zoo
0,Hong Kong,-0.851225,0.238848,1.07857,1.82128,2.827908,3.151618,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.02,0.0,0.01,0.04,0.0,0.03,0.01,0.01,0.03,0.0,0.01,0.02,0.02,0.0,0.04,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.21,0.02,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.03,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.04,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.03,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.01,0.01
1,Macao,-1.080211,-1.370152,2.71065,1.952679,-0.681032,-0.373433,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.08,0.02,0.01,0.0,0.06,0.03,0.0,0.0,0.02,0.06,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.15,0.02,0.01,0.0,0.01,0.0,0.0,0.03,0.0,0.01,0.0,0.01,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.03,0.07,0.0,0.0,0.05,0.0,0.01,0.0,0.02,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0
2,Guangzhou,0.497745,1.991971,-0.132216,0.736979,-0.128881,-0.174247,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.03,0.0,0.0,0.03,0.0,0.01,0.0,0.02,0.06,0.01,0.01,0.02,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.01,0.18,0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.06,0.0,0.01,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.01,0.01,0.01,0.06,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.0
3,Shenzhen,-0.66147,1.550146,0.115462,0.053706,0.940525,-0.123393,0.0,0.01,0.0,0.01,0.01,0.01,0.01,0.01,0.03,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.03,0.02,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.09,0.01,0.01,0.0,0.01,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.15,0.01,0.02,0.01,0.0,0.01,0.0,0.01,0.01,0.03,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.01,0.01,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.07,0.0,0.0,0.0,0.02,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Foshan,-0.277481,0.340444,-0.348874,-0.829292,-0.337455,-0.363349,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.214286,0.0,0.0,0.0,0.017857,0.017857,0.0,0.0,0.0,0.125,0.0,0.0,0.089286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.017857,0.017857,0.0,0.0,0.0,0.142857,0.017857,0.0,0.017857,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.017857,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.107143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Dongguan,-0.562754,0.453853,-0.543218,-0.351001,0.106494,-0.341947,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.216667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.116667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.016667,0.0,0.1,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.05,0.0,0.016667,0.033333,0.0,0.0,0.016667,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.016667,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Huizhou,1.33203,-0.387268,-0.641471,-0.776733,-0.469605,-0.354788,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Zhongshan,-0.706883,-0.746399,-0.458574,-0.445608,-0.510404,-0.368587,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.185185,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Jiangmen,0.939726,-0.44161,-0.802005,-0.697894,-0.57847,-0.361975,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.272727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Zhuhai,-0.717117,-1.081902,-0.103216,-0.45612,-0.501847,-0.309046,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.025,0.0,0.0125,0.0,0.0125,0.0625,0.0125,0.0125,0.0,0.075,0.0375,0.0,0.0,0.0125,0.125,0.0,0.0125,0.025,0.0,0.0,0.0,0.0125,0.0,0.0375,0.0,0.0125,0.0,0.0125,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.025,0.0125,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0125,0.0,0.0,0.0375,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0125,0.0125,0.0,0.0125,0.0125,0.0,0.0375,0.05,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0375,0.0125,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0


In [44]:
from sklearn.cluster import KMeans
kclusters = 5
# Drop the city label so that only numeric features are clustered
clustering = final.drop(columns='City')
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(clustering)
kmeans.labels_[0:11]

array([1, 2, 3, 3, 0, 0, 4, 0, 4, 0, 4], dtype=int32)

In [45]:
# Robustness check: cluster on the venue frequencies alone
robust = grouped.drop(columns='City')
kmeans_r = KMeans(n_clusters=kclusters, random_state=0).fit(robust)
kmeans_r.labels_[0:11]

array([3, 3, 2, 2, 0, 0, 4, 2, 1, 3, 4], dtype=int32)

In [46]:
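from sklearn.preprocessing import StandardScaler

# Rebuild the merged table from data_city and data_fig, then standardise
# every column (demographic indicators and venue frequencies alike)
data_non_r = data_city.join(data_fig)
alt_r = pd.merge(data_non_r, grouped, on='City')
alt_cluster = alt_r.drop(columns='City')
head_r = alt_cluster.columns.tolist()
alt_stan = StandardScaler().fit_transform(alt_cluster)
alt_stan = pd.DataFrame(alt_stan, columns=head_r)
alt_stan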



Unnamed: 0,Area,Population,GDP_pc,Tertiary,Export,FDI,Airport,American Restaurant,Argentinian Restaurant,Art Gallery,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Beer Bar,Bookstore,Border Crossing,Buddhist Temple,Buffet,Burger Joint,Bus Station,Café,Cantonese Restaurant,Casino,Cha Chaan Teng,Chinese Restaurant,Church,Clothing Store,Club House,Cocktail Bar,Coffee Shop,Concert Hall,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Fast Food Restaurant,Fish & Chips Shop,French Restaurant,Furniture / Home Store,Garden,General Entertainment,German Restaurant,Gift Shop,Golf Course,Grocery Store,Gym / Fitness Center,Gym Pool,Hainan Restaurant,Halal Restaurant,Historic Site,History Museum,Hong Kong Restaurant,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Island,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Karaoke Bar,Korean Restaurant,Lake,Lounge,Macanese Restaurant,Market,Mexican Restaurant,Middle Eastern Restaurant,Monument / Landmark,Motel,Multiplex,Museum,Music Venue,Nature Preserve,Neighborhood,New American Restaurant,Nightclub,Non-Profit,Noodle House,Opera House,Outdoor Sculpture,Pakistani Restaurant,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Pier,Pizza Place,Plaza,Portuguese Restaurant,Ramen Restaurant,Record Shop,Resort,Rest Area,Restaurant,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Soup Place,South Indian Restaurant,Spa,Spanish Restaurant,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Taco Place,Taxi Stand,Temple,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Toy / Game Store,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Yoga Studio,Zoo
0,-0.851225,0.238848,1.07857,1.82128,2.827908,3.151618,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,2.12132,-0.471405,0.616267,-0.618863,2.12132,0.989949,-0.316228,-0.316228,-0.456773,-0.316228,-0.465633,0.362254,0.018841,-0.467888,3.162278,0.708795,-0.467888,0.543841,3.162278,0.434099,-1.224943,-0.471405,1.190919,1.190919,1.97205,-0.316228,3.064129,-0.316228,1.873829,-0.924872,-0.316228,0.917497,-0.353391,-0.316228,-0.316228,-0.316228,3.162278,-0.496776,-0.316228,2.12132,-0.316228,-0.316228,-0.316228,-0.595383,0.478067,-0.316228,-0.316228,-0.316228,-0.078358,1.621566,-0.442326,0.634183,-0.467888,-0.454369,-0.316228,-0.017409,-0.316228,0.62238,-0.454369,-0.316228,-0.355887,1.119865,-0.316228,3.162278,-0.316228,-0.412568,2.12132,-0.434728,0.338646,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,3.162278,-0.316228,-0.316228,-0.316228,3.162278,-0.009288,-0.467888,-0.316228,-0.467888,-0.316228,-0.627038,0.085523,-0.4636,3.162278,3.162278,-0.68794,-0.316228,-0.725955,-0.316228,1.179536,-0.316228,0.616267,-0.316228,-0.7829,2.12132,-0.316228,3.162278,-0.659566,-0.316228,-0.316228,0.99558,1.448535,1.419905,-0.316228,-0.316228,-0.316228,0.258322,-0.316228,-0.316228,-0.316228,3.162278,-0.470861,-0.392837,2.12132,-0.316228,3.162278,3.162278,-0.467888,3.162278,3.162278
1,-1.080211,-1.370152,2.71065,1.952679,-0.681032,-0.373433,-0.316228,-0.316228,3.162278,-0.316228,-0.316228,-0.471405,-0.471405,0.616267,0.012295,-0.471405,-0.565685,-0.316228,-0.316228,2.635231,-0.316228,-0.465633,2.215645,0.018841,1.819563,-0.316228,1.469454,1.819563,-0.563615,-0.316228,1.707457,-0.835007,-0.471405,-0.749838,1.190919,-0.703683,-0.316228,-0.392837,-0.316228,-0.599625,-0.924872,-0.316228,2.412678,0.038136,-0.316228,-0.316228,-0.316228,-0.316228,-0.496776,-0.316228,-0.471405,-0.316228,-0.316228,-0.316228,2.10135,0.478067,-0.316228,-0.316228,-0.316228,-0.408399,1.621566,1.179536,-0.703683,1.819563,-0.454369,-0.316228,0.340061,-0.316228,-0.199162,-0.454369,3.162278,-0.355887,1.119865,3.162278,-0.316228,-0.316228,-0.412568,-0.471405,-0.434728,-0.528405,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.445676,1.819563,-0.316228,1.819563,-0.316228,-0.627038,1.590732,2.511169,-0.316228,-0.316228,2.007805,-0.316228,0.511651,-0.316228,2.801397,-0.316228,0.616267,-0.316228,-1.128478,-0.471405,-0.316228,-0.316228,0.174013,3.162278,-0.316228,0.99558,-0.60745,-0.454369,-0.316228,-0.316228,3.162278,-0.727998,3.162278,-0.316228,-0.316228,-0.316228,-0.470861,-0.392837,2.12132,-0.316228,-0.316228,-0.316228,1.819563,-0.316228,-0.316228
2,0.497745,1.991971,-0.132216,0.736979,-0.128881,-0.174247,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.471405,2.12132,0.616267,-0.618863,2.12132,0.989949,-0.316228,-0.316228,-0.456773,-0.316228,-0.465633,-0.008425,0.343225,-0.467888,-0.316228,0.328466,-0.467888,-0.194463,-0.316228,1.707457,-0.835007,2.12132,1.190919,1.190919,0.634183,-0.316228,-0.392837,-0.316228,1.049344,-0.924872,3.162278,-0.577684,-0.353391,-0.316228,-0.316228,3.162278,-0.316228,-0.496776,-0.316228,2.12132,3.162278,-0.316228,-0.316228,-0.595383,1.804593,-0.316228,3.162278,3.162278,-0.243378,-0.723234,-0.442326,1.97205,-0.467888,-0.454369,3.162278,-0.374879,-0.316228,-0.199162,-0.454369,-0.316228,0.063551,0.153707,-0.316228,-0.316228,-0.316228,2.991122,2.12132,-0.434728,-0.528405,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,3.162278,-0.316228,-0.316228,3.162278,-0.316228,-0.316228,0.281637,-0.467888,3.162278,-0.467888,-0.316228,-0.165419,0.838127,-0.4636,-0.316228,-0.316228,-0.148791,-0.316228,1.749256,-0.316228,-0.442326,3.162278,0.616267,3.162278,-0.264533,2.12132,3.162278,-0.316228,1.007593,-0.316228,-0.316228,-0.657458,-0.60745,-0.454369,-0.316228,-0.316228,-0.316228,0.258322,-0.316228,-0.316228,3.162278,-0.316228,-0.470861,3.064129,-0.471405,3.162278,-0.316228,-0.316228,-0.467888,-0.316228,-0.316228
3,-0.66147,1.550146,0.115462,0.053706,0.940525,-0.123393,-0.316228,3.162278,-0.316228,3.162278,3.162278,2.12132,2.12132,0.616267,1.274612,-0.471405,2.545584,-0.316228,-0.316228,-0.456773,3.162278,-0.465633,0.362254,0.018841,-0.467888,-0.316228,0.328466,-0.467888,-0.563615,-0.316228,0.434099,-0.44507,2.12132,1.190919,-0.749838,0.634183,-0.316228,0.471405,-0.316228,1.873829,-0.924872,-0.316228,-0.577684,-0.353391,-0.316228,3.162278,-0.316228,-0.316228,0.421062,-0.316228,-0.471405,-0.316228,3.162278,-0.316228,-0.595383,-0.84846,3.162278,-0.316228,-0.316228,-0.408399,0.449166,2.801397,0.634183,-0.467888,1.419905,-0.316228,-0.374879,3.162278,0.62238,1.419905,-0.316228,-0.355887,2.086024,-0.316228,-0.316228,-0.316228,0.721995,-0.471405,-0.434728,1.205697,3.162278,3.162278,3.162278,-0.316228,3.162278,-0.316228,-0.316228,3.162278,-0.316228,-0.316228,-0.316228,-0.009288,-0.467888,-0.316228,-0.467888,-0.316228,-0.627038,-0.667081,-0.4636,-0.316228,-0.316228,-0.68794,-0.316228,-0.725955,-0.316228,-0.442326,-0.316228,0.616267,-0.316228,-0.091744,-0.471405,-0.316228,-0.316228,1.007593,-0.316228,3.162278,0.169061,1.448535,-0.454369,-0.316228,-0.316228,-0.316228,0.258322,-0.316228,3.162278,-0.316228,-0.316228,-0.470861,0.471405,-0.471405,-0.316228,-0.316228,-0.316228,-0.467888,-0.316228,-0.316228
4,-0.277481,0.340444,-0.348874,-0.829292,-0.337455,-0.363349,3.162278,-0.316228,-0.316228,-0.316228,-0.316228,-0.471405,-0.471405,-1.043881,0.508205,-0.471405,-0.565685,-0.316228,3.162278,-0.456773,-0.316228,-0.465633,-0.749781,-0.629928,-0.467888,-0.316228,-0.812522,-0.467888,-0.563615,-0.316228,-0.839259,1.170381,-0.471405,-0.749838,-0.749838,1.685364,3.162278,-0.392837,-0.316228,-0.599625,0.818367,-0.316228,-0.577684,3.142384,-0.316228,-0.316228,-0.316228,-0.316228,-0.496776,-0.316228,-0.471405,-0.316228,-0.316228,3.162278,1.009816,1.520338,-0.316228,-0.316228,-0.316228,-0.44769,1.370338,-0.442326,1.685364,-0.467888,-0.454369,-0.316228,-0.09401,-0.316228,-0.609932,-0.454369,-0.316228,-0.355887,-0.812451,-0.316228,-0.316228,-0.316228,-0.412568,-0.471405,-0.434728,-0.528405,-0.316228,-0.316228,-0.316228,3.162278,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.331384,-0.467888,-0.316228,-0.467888,-0.316228,1.021602,-0.667081,-0.4636,-0.316228,-0.316228,0.274826,-0.316228,1.484055,-0.316228,-0.442326,-0.316228,-1.043881,-0.316228,0.550044,-0.471405,-0.316228,-0.316228,-0.659566,-0.316228,-0.316228,-0.657458,-0.60745,-0.454369,-0.316228,-0.316228,-0.316228,1.033287,-0.316228,-0.316228,-0.316228,-0.316228,-0.470861,-0.392837,-0.471405,-0.316228,-0.316228,-0.316228,-0.467888,-0.316228,-0.316228
5,-0.562754,0.453853,-0.543218,-0.351001,0.106494,-0.341947,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.471405,-0.471405,1.723032,2.536928,-0.471405,-0.565685,-0.316228,-0.316228,-0.456773,-0.316228,2.461203,-0.749781,-0.629928,-0.467888,-0.316228,-0.812522,-0.467888,-0.563615,-0.316228,-0.839259,1.201328,-0.471405,-0.749838,-0.749838,-0.703683,-0.316228,-0.392837,-0.316228,-0.599625,0.702151,-0.316228,-0.577684,-0.353391,-0.316228,-0.316228,-0.316228,-0.316228,-0.496776,3.162278,-0.471405,-0.316228,-0.316228,-0.316228,-0.595383,-0.84846,-0.316228,-0.316228,-0.316228,-0.86679,-0.723234,-0.442326,-0.703683,-0.467888,2.669421,-0.316228,2.842349,-0.316228,-0.609932,2.669421,-0.316228,-0.355887,-0.812451,-0.316228,-0.316228,3.162278,-0.412568,-0.471405,1.049344,-0.528405,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.348701,-0.467888,-0.316228,-0.467888,-0.316228,2.450423,-0.667081,-0.4636,-0.316228,-0.316228,2.007805,-0.316228,1.336721,3.162278,-0.442326,-0.316228,1.723032,-0.316228,-0.14934,-0.471405,-0.316228,-0.316228,-0.659566,-0.316228,-0.316228,-0.657458,-0.60745,2.669421,3.162278,-0.316228,-0.316228,2.559735,-0.316228,-0.316228,-0.316228,-0.316228,-0.470861,-0.392837,-0.471405,-0.316228,-0.316228,-0.316228,-0.467888,-0.316228,-0.316228
6,1.33203,-0.387268,-0.641471,-0.776733,-0.469605,-0.354788,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.471405,-0.471405,-1.043881,-0.618863,-0.471405,-0.565685,-0.316228,-0.316228,-0.456773,-0.316228,-0.465633,-0.749781,-0.629928,-0.467888,-0.316228,-0.812522,-0.467888,-0.563615,-0.316228,-0.839259,0.551434,-0.471405,-0.749838,-0.749838,-0.703683,-0.316228,-0.392837,-0.316228,-0.599625,0.237287,-0.316228,-0.577684,-0.353391,-0.316228,-0.316228,-0.316228,-0.316228,-0.496776,-0.316228,-0.471405,-0.316228,-0.316228,-0.316228,-0.595383,-0.84846,-0.316228,-0.316228,-0.316228,0.600061,-0.723234,-0.442326,-0.703683,-0.467888,-0.454369,-0.316228,-0.732349,-0.316228,2.813157,-0.454369,-0.316228,3.139436,-0.812451,-0.316228,-0.316228,-0.316228,-0.412568,-0.471405,-0.434728,-0.528405,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.591138,-0.467888,-0.316228,-0.467888,-0.316228,-0.627038,-0.667081,-0.4636,-0.316228,-0.316228,-0.68794,-0.316228,-0.725955,-0.316228,-0.442326,-0.316228,-1.043881,-0.316228,1.57855,-0.471405,-0.316228,-0.316228,-0.659566,-0.316228,-0.316228,-0.657458,-0.60745,-0.454369,-0.316228,-0.316228,-0.316228,-0.727998,-0.316228,-0.316228,-0.316228,-0.316228,2.006277,-0.392837,-0.471405,-0.316228,-0.316228,-0.316228,-0.467888,-0.316228,-0.316228
7,-0.706883,-0.746399,-0.458574,-0.445608,-0.510404,-0.368587,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.471405,-0.471405,-1.043881,-0.618863,-0.471405,-0.565685,-0.316228,-0.316228,-0.456773,-0.316228,-0.465633,-0.749781,2.974342,-0.467888,-0.316228,-0.812522,-0.467888,0.803614,-0.316228,-0.839259,1.273538,-0.471405,-0.749838,-0.749838,-0.703683,-0.316228,-0.392837,-0.316228,-0.599625,1.657704,-0.316228,-0.577684,-0.353391,-0.316228,-0.316228,-0.316228,-0.316228,2.902626,-0.316228,-0.471405,-0.316228,-0.316228,-0.316228,-0.595383,-0.84846,-0.316228,-0.316228,-0.316228,-0.622315,-0.723234,-0.442326,-0.703683,-0.467888,-0.454369,-0.316228,-0.732349,-0.316228,-0.609932,-0.454369,-0.316228,-0.355887,-0.812451,-0.316228,-0.316228,-0.316228,-0.412568,-0.471405,2.863211,2.682894,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.591138,-0.467888,-0.316228,-0.467888,-0.316228,1.082662,-0.667081,-0.4636,-0.316228,-0.316228,-0.68794,3.162278,-0.725955,-0.316228,-0.442326,-0.316228,-1.043881,-0.316228,0.618611,-0.471405,-0.316228,-0.316228,2.427765,-0.316228,-0.316228,-0.657458,-0.60745,-0.454369,-0.316228,-0.316228,-0.316228,-0.727998,-0.316228,-0.316228,-0.316228,-0.316228,-0.470861,-0.392837,-0.471405,-0.316228,-0.316228,-0.316228,-0.467888,-0.316228,-0.316228
8,0.939726,-0.44161,-0.802005,-0.697894,-0.57847,-0.361975,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.471405,-0.471405,-1.043881,-0.618863,-0.471405,-0.565685,-0.316228,-0.316228,-0.456773,-0.316228,-0.465633,-0.749781,-0.629928,-0.467888,-0.316228,-0.812522,-0.467888,2.79231,-0.316228,-0.839259,0.748371,-0.471405,-0.749838,-0.749838,-0.703683,-0.316228,-0.392837,-0.316228,-0.599625,1.610748,-0.316228,-0.577684,-0.353391,-0.316228,-0.316228,-0.316228,-0.316228,-0.496776,-0.316228,-0.471405,-0.316228,-0.316228,-0.316228,-0.595383,-0.84846,-0.316228,-0.316228,-0.316228,0.266686,-0.723234,-0.442326,-0.703683,-0.467888,-0.454369,-0.316228,-0.732349,-0.316228,-0.609932,-0.454369,-0.316228,-0.355887,-0.812451,-0.316228,-0.316228,-0.316228,-0.412568,-0.471405,-0.434728,-0.528405,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.591138,-0.467888,-0.316228,-0.467888,-0.316228,-0.627038,-0.667081,-0.4636,-0.316228,-0.316228,-0.68794,-0.316228,-0.725955,-0.316228,-0.442326,-0.316228,-1.043881,-0.316228,1.840351,-0.471405,-0.316228,-0.316228,-0.659566,-0.316228,-0.316228,-0.657458,-0.60745,-0.454369,-0.316228,-0.316228,-0.316228,-0.727998,-0.316228,-0.316228,-0.316228,-0.316228,2.231471,-0.392837,-0.471405,-0.316228,-0.316228,-0.316228,-0.467888,-0.316228,-0.316228
9,-0.717117,-1.081902,-0.103216,-0.45612,-0.501847,-0.309046,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.471405,-0.471405,1.031304,-0.618863,-0.471405,-0.565685,3.162278,-0.316228,1.47573,-0.316228,1.729494,1.566958,-0.224448,2.391426,-0.316228,2.039948,2.391426,-0.563615,-0.316228,0.752439,0.009855,-0.471405,1.676108,1.676108,-0.703683,-0.316228,-0.392837,3.162278,-0.599625,-0.4019,-0.316228,1.291293,-0.353391,3.162278,-0.316228,-0.316228,-0.316228,0.650522,-0.316228,-0.471405,-0.316228,-0.316228,-0.316228,1.651895,0.809698,-0.316228,-0.316228,-0.316228,-0.683433,-0.723234,-0.442326,-0.703683,2.391426,-0.454369,-0.316228,0.608163,-0.316228,-0.609932,-0.454369,-0.316228,-0.355887,0.395247,-0.316228,-0.316228,-0.316228,-0.412568,-0.471405,-0.434728,-0.528405,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,-0.316228,3.162278,-0.316228,-0.40931,2.391426,-0.316228,2.391426,3.162278,-0.627038,2.155185,1.661235,-0.316228,-0.316228,-0.014004,-0.316228,-0.725955,-0.316228,-0.442326,-0.316228,1.031304,-0.316228,-0.869294,-0.471405,-0.316228,-0.316228,-0.659566,-0.316228,-0.316228,2.441987,1.962531,-0.454369,-0.316228,3.162278,-0.316228,-0.727998,-0.316228,-0.316228,-0.316228,-0.316228,-0.470861,-0.392837,-0.471405,-0.316228,-0.316228,-0.316228,2.391426,-0.316228,-0.316228
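The recurring values 3.162278 and -0.316228 in the standardised table follow directly from the z-score formula. `StandardScaler` divides by the population standard deviation (`ddof=0`), so a venue category that appears in only one of 11 cities produces the same pair of z-scores no matter how small the original frequency is. The sketch below reproduces this with plain NumPy; the column `col` is a made-up example, not a column from the study's data.

```python
import numpy as np

n_cities = 11
col = np.zeros(n_cities)
col[0] = 0.01  # hypothetical venue category present in a single city

# np.std defaults to ddof=0, the same convention StandardScaler uses
z = (col - col.mean()) / col.std()
print(z.round(6))
```

The single non-zero entry maps to sqrt(10) ≈ 3.162278 and every zero maps to -1/sqrt(10) ≈ -0.316228. After standardisation, a category found once in one city therefore carries as much weight in the distance calculation as a dominant category such as Hotel, which is worth keeping in mind when interpreting the clusters.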


In [47]:
kmeans_rs = KMeans(n_clusters=kclusters, random_state=0).fit(alt_stan)
kmeans_rs.labels_[0:11] 

array([2, 0, 3, 4, 1, 1, 1, 1, 1, 0, 1], dtype=int32)
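
A quick sanity check on a k-means run is to count how many cities land in each cluster. The sketch below is a minimal illustration that uses only the labels printed above. The variable names `labels_rs` and `sizes` are introduced here for illustration and are not part of the notebook.

```python
from collections import Counter

# Labels printed by the random_state=0 run above, one entry per city.
labels_rs = [2, 0, 3, 4, 1, 1, 1, 1, 1, 0, 1]

# Count the members of each cluster.
sizes = Counter(labels_rs)
for label in sorted(sizes):
    print(f"cluster {label}: {sizes[label]} cities")

# Clusters that contain a single city.
singletons = sorted(label for label, n in sizes.items() if n == 1)
print("singleton clusters:", singletons)
```

With 11 cities and 5 clusters, small or single-member clusters are to be expected, so cluster-level conclusions are best read alongside the per-city figures.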

In [48]:
data_non = data_city.join(data_fig)
alt = pd.merge(data_non,grouped,on='City')
data_final = alt.join(data_coor)

In [49]:
data_final.insert(0, 'Cluster Labels', kmeans.labels_)
data_final

Unnamed: 0,Cluster Labels,City,Area,Population,GDP_pc,Tertiary,Export,FDI,Airport,American Restaurant,Argentinian Restaurant,Art Gallery,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Beer Bar,Bookstore,Border Crossing,Buddhist Temple,Buffet,Burger Joint,Bus Station,Café,Cantonese Restaurant,Casino,Cha Chaan Teng,Chinese Restaurant,Church,Clothing Store,Club House,Cocktail Bar,Coffee Shop,Concert Hall,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Fast Food Restaurant,Fish & Chips Shop,French Restaurant,Furniture / Home Store,Garden,General Entertainment,German Restaurant,Gift Shop,Golf Course,Grocery Store,Gym / Fitness Center,Gym Pool,Hainan Restaurant,Halal Restaurant,Historic Site,History Museum,Hong Kong Restaurant,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Island,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Karaoke Bar,Korean Restaurant,Lake,Lounge,Macanese Restaurant,Market,Mexican Restaurant,Middle Eastern Restaurant,Monument / Landmark,Motel,Multiplex,Museum,Music Venue,Nature Preserve,Neighborhood,New American Restaurant,Nightclub,Non-Profit,Noodle House,Opera House,Outdoor Sculpture,Pakistani Restaurant,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Pier,Pizza Place,Plaza,Portuguese Restaurant,Ramen Restaurant,Record Shop,Resort,Rest Area,Restaurant,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Soup Place,South Indian Restaurant,Spa,Spanish Restaurant,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Taco Place,Taxi Stand,Temple,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Toy / Game Store,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Yoga Studio,Zoo,Latitude,Longitude
0,1,Hong Kong,1107.0,7.48,48673.0,92.43,530.44,110.73,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.02,0.0,0.01,0.04,0.0,0.03,0.01,0.01,0.03,0.0,0.01,0.02,0.02,0.0,0.04,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.21,0.02,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.03,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.04,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.03,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.01,0.01,22.302711,114.177216
1,2,Macao,33.0,0.67,82609.0,94.93,1.51,0.3753,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.08,0.02,0.01,0.0,0.06,0.03,0.0,0.0,0.02,0.06,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.15,0.02,0.01,0.0,0.01,0.0,0.0,0.03,0.0,0.01,0.0,0.01,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.03,0.07,0.0,0.0,0.05,0.0,0.01,0.0,0.02,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,22.198746,113.543877
2,3,Guangzhou,7434.0,14.9,23497.0,71.8,84.74,6.611,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.03,0.0,0.0,0.03,0.0,0.01,0.0,0.02,0.06,0.01,0.01,0.02,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.01,0.18,0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.06,0.0,0.01,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.01,0.01,0.01,0.06,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.0,23.12911,113.264381
3,3,Shenzhen,1997.0,13.03,28647.0,58.8,245.94,8.203,0.0,0.01,0.0,0.01,0.01,0.01,0.01,0.01,0.03,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.03,0.02,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.09,0.01,0.01,0.0,0.01,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.15,0.01,0.02,0.01,0.0,0.01,0.0,0.01,0.01,0.03,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.01,0.01,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.07,0.0,0.0,0.0,0.02,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,22.543097,114.057861
4,0,Foshan,3798.0,7.91,18992.0,42.0,53.3,0.691,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.214286,0.0,0.0,0.0,0.017857,0.017857,0.0,0.0,0.0,0.125,0.0,0.0,0.089286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.017857,0.017857,0.0,0.0,0.0,0.142857,0.017857,0.0,0.017857,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.017857,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.107143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,23.021479,113.121437
5,0,Dongguan,2460.0,8.39,14951.0,51.1,120.22,1.361,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.216667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.116667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.016667,0.0,0.1,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.05,0.0,0.016667,0.033333,0.0,0.0,0.016667,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.016667,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,23.020674,113.751801
6,4,Huizhou,11347.0,4.83,12908.0,43.0,33.38,0.959,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,23.091181,114.400681
7,0,Zhongshan,1784.0,3.31,16711.0,49.3,27.23,0.527,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.185185,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,22.52747,113.361526
8,4,Jiangmen,9507.0,4.6,9570.0,44.5,16.97,0.734,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.272727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,22.580391,113.080009
9,0,Zhuhai,1736.0,1.89,24100.0,49.1,28.52,2.391,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.025,0.0,0.0125,0.0,0.0125,0.0625,0.0125,0.0125,0.0,0.075,0.0375,0.0,0.0,0.0125,0.125,0.0,0.0125,0.025,0.0,0.0,0.0,0.0125,0.0,0.0375,0.0,0.0125,0.0,0.0125,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.025,0.0125,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0125,0.0,0.0,0.0375,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0125,0.0125,0.0,0.0125,0.0125,0.0,0.0375,0.05,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0375,0.0125,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,22.270979,113.576675


In [52]:
#venues_sorted.drop('Cluster Labels',axis=1,inplace=True)
data_final['Cluster Labels'] = data_final['Cluster Labels'].astype(int)
merged = data_final.join(venues_sorted.set_index('City'), on='City')
merged

Unnamed: 0,Cluster Labels,City,Area,Population,GDP_pc,Tertiary,Export,FDI,Airport,American Restaurant,Argentinian Restaurant,Art Gallery,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Beer Bar,Bookstore,Border Crossing,Buddhist Temple,Buffet,Burger Joint,Bus Station,Café,Cantonese Restaurant,Casino,Cha Chaan Teng,Chinese Restaurant,Church,Clothing Store,Club House,Cocktail Bar,Coffee Shop,Concert Hall,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Fast Food Restaurant,Fish & Chips Shop,French Restaurant,Furniture / Home Store,Garden,General Entertainment,German Restaurant,Gift Shop,Golf Course,Grocery Store,Gym / Fitness Center,Gym Pool,Hainan Restaurant,Halal Restaurant,Historic Site,History Museum,Hong Kong Restaurant,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Indian Restaurant,Indonesian Restaurant,Irish Pub,Island,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Karaoke Bar,Korean Restaurant,Lake,Lounge,Macanese Restaurant,Market,Mexican Restaurant,Middle Eastern Restaurant,Monument / Landmark,Motel,Multiplex,Museum,Music Venue,Nature Preserve,Neighborhood,New American Restaurant,Nightclub,Non-Profit,Noodle House,Opera House,Outdoor Sculpture,Pakistani Restaurant,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Pier,Pizza Place,Plaza,Portuguese Restaurant,Ramen Restaurant,Record Shop,Resort,Rest Area,Restaurant,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Soup Place,South Indian Restaurant,Spa,Spanish Restaurant,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Taco Place,Taxi Stand,Temple,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Toy / Game Store,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Yoga Studio,Zoo,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,1,Hong Kong,1107.0,7.48,48673.0,92.43,530.44,110.73,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.02,0.0,0.01,0.04,0.0,0.03,0.01,0.01,0.03,0.0,0.01,0.02,0.02,0.0,0.04,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.21,0.02,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.03,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.04,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.03,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.01,0.01,22.302711,114.177216,Hotel,Dumpling Restaurant,Park,Chinese Restaurant,Clothing Store,Electronics Store,Japanese Restaurant,Café,Shopping Mall,Coffee Shop
1,2,Macao,33.0,0.67,82609.0,94.93,1.51,0.3753,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.08,0.02,0.01,0.0,0.06,0.03,0.0,0.0,0.02,0.06,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.15,0.02,0.01,0.0,0.01,0.0,0.0,0.03,0.0,0.01,0.0,0.01,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.03,0.07,0.0,0.0,0.05,0.0,0.01,0.0,0.02,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,22.198746,113.543877,Hotel,Café,Portuguese Restaurant,Coffee Shop,Chinese Restaurant,Resort,Historic Site,Italian Restaurant,Church,Plaza
2,3,Guangzhou,7434.0,14.9,23497.0,71.8,84.74,6.611,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.03,0.0,0.0,0.03,0.0,0.01,0.0,0.02,0.06,0.01,0.01,0.02,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.01,0.18,0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.06,0.0,0.01,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.01,0.01,0.01,0.06,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.0,23.12911,113.264381,Hotel,Park,Coffee Shop,Shopping Mall,Turkish Restaurant,Middle Eastern Restaurant,Chinese Restaurant,Cantonese Restaurant,Café,Electronics Store
3,3,Shenzhen,1997.0,13.03,28647.0,58.8,245.94,8.203,0.0,0.01,0.0,0.01,0.01,0.01,0.01,0.01,0.03,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.03,0.02,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.09,0.01,0.01,0.0,0.01,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.15,0.01,0.02,0.01,0.0,0.01,0.0,0.01,0.01,0.03,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.01,0.01,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.07,0.0,0.0,0.0,0.02,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,22.543097,114.057861,Hotel,Coffee Shop,Shopping Mall,Park,Chinese Restaurant,Café,Lounge,Electronics Store,Bar,Japanese Restaurant
4,0,Foshan,3798.0,7.91,18992.0,42.0,53.3,0.691,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.214286,0.0,0.0,0.0,0.017857,0.017857,0.0,0.0,0.0,0.125,0.0,0.0,0.089286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.017857,0.017857,0.0,0.0,0.0,0.142857,0.017857,0.0,0.017857,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.017857,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.107143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,23.021479,113.121437,Coffee Shop,Hotel,Fast Food Restaurant,Shopping Mall,Furniture / Home Store,Pizza Place,Restaurant,Dim Sum Restaurant,Italian Restaurant,Diner
5,0,Dongguan,2460.0,8.39,14951.0,51.1,120.22,1.361,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.216667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.116667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.016667,0.0,0.1,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.05,0.0,0.016667,0.033333,0.0,0.0,0.016667,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.016667,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,23.020674,113.751801,Coffee Shop,Fast Food Restaurant,Italian Restaurant,Pizza Place,Hotel,Shopping Mall,Resort,Bar,Sandwich Place,Thai Restaurant
6,4,Huizhou,11347.0,4.83,12908.0,43.0,33.38,0.959,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,23.091181,114.400681,Hotel,Coffee Shop,Shopping Mall,Lake,Japanese Restaurant,Fast Food Restaurant,Train Station,Grocery Store,Golf Course,Gift Shop
7,0,Zhongshan,1784.0,3.31,16711.0,49.3,27.23,0.527,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.185185,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,22.52747,113.361526,Coffee Shop,Fast Food Restaurant,Hotel,Shopping Mall,Cantonese Restaurant,Golf Course,Pizza Place,Clothing Store,Rest Area,Motel
8,4,Jiangmen,9507.0,4.6,9570.0,44.5,16.97,0.734,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.272727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,22.580391,113.080009,Hotel,Fast Food Restaurant,Coffee Shop,Shopping Mall,Clothing Store,Train Station,Zoo,Grocery Store,Golf Course,Gift Shop
9,0,Zhuhai,1736.0,1.89,24100.0,49.1,28.52,2.391,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.025,0.0,0.0125,0.0,0.0125,0.0625,0.0125,0.0125,0.0,0.075,0.0375,0.0,0.0,0.0125,0.125,0.0,0.0125,0.025,0.0,0.0,0.0,0.0125,0.0,0.0375,0.0,0.0125,0.0,0.0125,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.025,0.0125,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0125,0.0,0.0,0.0375,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0125,0.0125,0.0,0.0125,0.0125,0.0,0.0375,0.05,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0375,0.0125,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,22.270979,113.576675,Coffee Shop,Hotel,Chinese Restaurant,Café,Portuguese Restaurant,Church,Italian Restaurant,Steakhouse,Fast Food Restaurant,Plaza


In [62]:
# Base map; latitude and longitude (the map centre) are defined outside this cell
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=9)

# One hex colour per cluster, sampled evenly from the rainbow colormap
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# One circle marker per city, coloured by its cluster label
for lat, lon, poi, cluster in zip(merged['Latitude'], merged['Longitude'], merged['City'], merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ', Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],  # label 0 wraps around to the last colour
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)

map_clusters

In [54]:
map_clusters.save('GBA_cluster.html')
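
The map cell above picks each marker colour with `rainbow[cluster-1]`. Because the cluster labels start at 0, label 0 resolves to index -1, which Python reads as the last entry of the list. The sketch below uses a placeholder palette (the names `c0` to `c4` are hypothetical, not the hex values matplotlib produces) to show that the mapping still gives one distinct colour per cluster, only rotated by one position relative to direct indexing.

```python
kclusters = 5  # number of clusters shown in the result tables

# Placeholder palette; the notebook derives hex colours from a colormap instead.
palette = ["c0", "c1", "c2", "c3", "c4"]

# Same indexing as the map cell: label k -> palette[k - 1].
shifted = {k: palette[k - 1] for k in range(kclusters)}

# Direct indexing, for comparison: label k -> palette[k].
direct = {k: palette[k] for k in range(kclusters)}

for k in range(kclusters):
    print(f"label {k}: shifted={shifted[k]}, direct={direct[k]}")
```

Either indexing gives a usable map. Direct indexing would simply make the colour order follow the label order.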

## <font color=blue>Results</font>

### Cluster 1 (label 0)

In [55]:
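# Drop the per-venue frequency columns, keeping the demographic and top-10 venue columns
interim = grouped.drop(columns='City')
delete = interim.columns.tolist()
result = merged.drop(columns=delete)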

In [56]:
result

Unnamed: 0,Cluster Labels,City,Area,Population,GDP_pc,Tertiary,Export,FDI,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,1,Hong Kong,1107.0,7.48,48673.0,92.43,530.44,110.73,22.302711,114.177216,Hotel,Dumpling Restaurant,Park,Chinese Restaurant,Clothing Store,Electronics Store,Japanese Restaurant,Café,Shopping Mall,Coffee Shop
1,2,Macao,33.0,0.67,82609.0,94.93,1.51,0.3753,22.198746,113.543877,Hotel,Café,Portuguese Restaurant,Coffee Shop,Chinese Restaurant,Resort,Historic Site,Italian Restaurant,Church,Plaza
2,3,Guangzhou,7434.0,14.9,23497.0,71.8,84.74,6.611,23.12911,113.264381,Hotel,Park,Coffee Shop,Shopping Mall,Turkish Restaurant,Middle Eastern Restaurant,Chinese Restaurant,Cantonese Restaurant,Café,Electronics Store
3,3,Shenzhen,1997.0,13.03,28647.0,58.8,245.94,8.203,22.543097,114.057861,Hotel,Coffee Shop,Shopping Mall,Park,Chinese Restaurant,Café,Lounge,Electronics Store,Bar,Japanese Restaurant
4,0,Foshan,3798.0,7.91,18992.0,42.0,53.3,0.691,23.021479,113.121437,Coffee Shop,Hotel,Fast Food Restaurant,Shopping Mall,Furniture / Home Store,Pizza Place,Restaurant,Dim Sum Restaurant,Italian Restaurant,Diner
5,0,Dongguan,2460.0,8.39,14951.0,51.1,120.22,1.361,23.020674,113.751801,Coffee Shop,Fast Food Restaurant,Italian Restaurant,Pizza Place,Hotel,Shopping Mall,Resort,Bar,Sandwich Place,Thai Restaurant
6,4,Huizhou,11347.0,4.83,12908.0,43.0,33.38,0.959,23.091181,114.400681,Hotel,Coffee Shop,Shopping Mall,Lake,Japanese Restaurant,Fast Food Restaurant,Train Station,Grocery Store,Golf Course,Gift Shop
7,0,Zhongshan,1784.0,3.31,16711.0,49.3,27.23,0.527,22.52747,113.361526,Coffee Shop,Fast Food Restaurant,Hotel,Shopping Mall,Cantonese Restaurant,Golf Course,Pizza Place,Clothing Store,Rest Area,Motel
8,4,Jiangmen,9507.0,4.6,9570.0,44.5,16.97,0.734,22.580391,113.080009,Hotel,Fast Food Restaurant,Coffee Shop,Shopping Mall,Clothing Store,Train Station,Zoo,Grocery Store,Golf Course,Gift Shop
9,0,Zhuhai,1736.0,1.89,24100.0,49.1,28.52,2.391,22.270979,113.576675,Coffee Shop,Hotel,Chinese Restaurant,Café,Portuguese Restaurant,Church,Italian Restaurant,Steakhouse,Fast Food Restaurant,Plaza


In [57]:
result.loc[result['Cluster Labels'] == 0]

Unnamed: 0,Cluster Labels,City,Area,Population,GDP_pc,Tertiary,Export,FDI,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,0,Foshan,3798.0,7.91,18992.0,42.0,53.3,0.691,23.021479,113.121437,Coffee Shop,Hotel,Fast Food Restaurant,Shopping Mall,Furniture / Home Store,Pizza Place,Restaurant,Dim Sum Restaurant,Italian Restaurant,Diner
5,0,Dongguan,2460.0,8.39,14951.0,51.1,120.22,1.361,23.020674,113.751801,Coffee Shop,Fast Food Restaurant,Italian Restaurant,Pizza Place,Hotel,Shopping Mall,Resort,Bar,Sandwich Place,Thai Restaurant
7,0,Zhongshan,1784.0,3.31,16711.0,49.3,27.23,0.527,22.52747,113.361526,Coffee Shop,Fast Food Restaurant,Hotel,Shopping Mall,Cantonese Restaurant,Golf Course,Pizza Place,Clothing Store,Rest Area,Motel
9,0,Zhuhai,1736.0,1.89,24100.0,49.1,28.52,2.391,22.270979,113.576675,Coffee Shop,Hotel,Chinese Restaurant,Café,Portuguese Restaurant,Church,Italian Restaurant,Steakhouse,Fast Food Restaurant,Plaza


### Cluster 2 (label 1)

In [58]:
result.loc[result['Cluster Labels'] == 1]

Unnamed: 0,Cluster Labels,City,Area,Population,GDP_pc,Tertiary,Export,FDI,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,1,Hong Kong,1107.0,7.48,48673.0,92.43,530.44,110.73,22.302711,114.177216,Hotel,Dumpling Restaurant,Park,Chinese Restaurant,Clothing Store,Electronics Store,Japanese Restaurant,Café,Shopping Mall,Coffee Shop


### Cluster 3 (label 2)

In [59]:
result.loc[result['Cluster Labels'] == 2]

Unnamed: 0,Cluster Labels,City,Area,Population,GDP_pc,Tertiary,Export,FDI,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,2,Macao,33.0,0.67,82609.0,94.93,1.51,0.3753,22.198746,113.543877,Hotel,Café,Portuguese Restaurant,Coffee Shop,Chinese Restaurant,Resort,Historic Site,Italian Restaurant,Church,Plaza


### Cluster 4 (label 3)

In [60]:
result.loc[result['Cluster Labels'] == 3]

Unnamed: 0,Cluster Labels,City,Area,Population,GDP_pc,Tertiary,Export,FDI,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,3,Guangzhou,7434.0,14.9,23497.0,71.8,84.74,6.611,23.12911,113.264381,Hotel,Park,Coffee Shop,Shopping Mall,Turkish Restaurant,Middle Eastern Restaurant,Chinese Restaurant,Cantonese Restaurant,Café,Electronics Store
3,3,Shenzhen,1997.0,13.03,28647.0,58.8,245.94,8.203,22.543097,114.057861,Hotel,Coffee Shop,Shopping Mall,Park,Chinese Restaurant,Café,Lounge,Electronics Store,Bar,Japanese Restaurant


### Cluster 5 (label 4)

In [61]:
result.loc[result['Cluster Labels'] == 4]

Unnamed: 0,Cluster Labels,City,Area,Population,GDP_pc,Tertiary,Export,FDI,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,4,Huizhou,11347.0,4.83,12908.0,43.0,33.38,0.959,23.091181,114.400681,Hotel,Coffee Shop,Shopping Mall,Lake,Japanese Restaurant,Fast Food Restaurant,Train Station,Grocery Store,Golf Course,Gift Shop
8,4,Jiangmen,9507.0,4.6,9570.0,44.5,16.97,0.734,22.580391,113.080009,Hotel,Fast Food Restaurant,Coffee Shop,Shopping Mall,Clothing Store,Train Station,Zoo,Grocery Store,Golf Course,Gift Shop
10,4,Zhaoqing,14891.0,4.15,8050.0,38.6,3.59,0.143,23.047192,112.465091,Hotel,Park,Zoo,Hainan Restaurant,Fast Food Restaurant,Fish & Chips Shop,French Restaurant,Furniture / Home Store,Garden,General Entertainment
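
To compare the clusters at a glance, the per-city figures can be averaged within each cluster. The sketch below is a minimal illustration using only the standard library. The `GDP_pc` and `Tertiary` values are copied from the per-cluster tables above, while the names `rows` and `summary` are introduced here and are not part of the notebook.

```python
from collections import defaultdict
from statistics import mean

# (cluster label, city, GDP_pc, Tertiary), copied from the tables above.
rows = [
    (0, "Foshan", 18992.0, 42.0),
    (0, "Dongguan", 14951.0, 51.1),
    (0, "Zhongshan", 16711.0, 49.3),
    (0, "Zhuhai", 24100.0, 49.1),
    (1, "Hong Kong", 48673.0, 92.43),
    (2, "Macao", 82609.0, 94.93),
    (3, "Guangzhou", 23497.0, 71.8),
    (3, "Shenzhen", 28647.0, 58.8),
    (4, "Huizhou", 12908.0, 43.0),
    (4, "Jiangmen", 9570.0, 44.5),
    (4, "Zhaoqing", 8050.0, 38.6),
]

# Group the rows by cluster label.
by_cluster = defaultdict(list)
for label, city, gdp_pc, tertiary in rows:
    by_cluster[label].append((city, gdp_pc, tertiary))

# Average each indicator within a cluster.
summary = {}
for label, members in sorted(by_cluster.items()):
    summary[label] = {
        "cities": [m[0] for m in members],
        "GDP_pc": mean(m[1] for m in members),
        "Tertiary": mean(m[2] for m in members),
    }
    print(label, summary[label])
```

In the notebook itself, a pandas `groupby` on the `Cluster Labels` column of `result` would give the same averages. The plain-Python version is used here only to keep the sketch free of dependencies.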
