# Exploring the Kuala Lumpur Neighbourhoods: Data-Science in Real Life

# Introduction


Kuala Lumpur is a major city in Malaysia. It is a center of attention for housing, employment, tourism, education, shopping, and sports. It is also well known throughout Malaysia and is a top choice for both local and foreign communities.

Kuala Lumpur is the national capital of Malaysia as well as its largest city. The only global city in Malaysia, it covers an area of 243 km² (94 sq mi) and has an estimated population of 1.73 million as of 2016. Greater Kuala Lumpur, also known as the Klang Valley, is an urban agglomeration of 7.25 million people as of 2017. It is among the fastest-growing metropolitan regions in South-East Asia, in both population and economic development. (Source: https://en.wikipedia.org/wiki/Kuala_Lumpur)


# Objective

In this project, we study area classification in detail using Foursquare data together with segmentation and clustering. The aim is to segment the areas of Kuala Lumpur based on the most common venues recorded on Foursquare. Through segmentation and clustering, we hope to determine whether each area inside the city is residential, a tourist area, or something else.


# Data

The data was acquired from Wikipedia pages and restructured into a CSV file for easier reading and manipulation. The data is uploaded to my GitHub for reference:
https://github.com/nabilalqadhi/Coursera_Capstone/blob/master/KL_disrict.csv
Another aspect to consider for this project is the Foursquare data. The results can only be as good as the data provided: although we use Foursquare data for segmentation and clustering, the amount and accuracy of the data captured cannot guarantee a correct real-world classification.


To start, let's get the data and take a look at it. I've already downloaded it, so let's read it from the local drive and load it into a dataframe:



In [3]:
#import the required library
import numpy as np
import pandas as pd

#read the csv file containing the KL data
df_kl = pd.read_csv("/resources/labs/DP0701EN/KL_disrict.csv")
df_kl.head()

Unnamed: 0,Postcode,District,Area
0,52100,Kepong,Jinjang
1,52100,Kepong,Taman Bukit Maluri
2,51200,Segambut,Bandar Menjalara
3,51200,Segambut,Bukit Kiara
4,51200,Segambut,Bukit Tunku


In [4]:
#examine data
print('Kuala Lumpur dataframe has {} district and {} areas.'.format(
        len(df_kl['District'].unique()),
        df_kl.shape[0]
    )
)

#group the data to find the districts with the highest number of areas
df_kl.groupby('District').count()

Kuala Lumpur dataframe has 11 district and 66 areas.


Unnamed: 0_level_0,Postcode,Area
District,Unnamed: 1_level_1,Unnamed: 2_level_1
Bandar Tun Razak,6,6
Batu,2,2
Bukit Bintang,11,11
Cheras,9,9
Kepong,2,2
Lembah Pantai,6,6
Segambut,11,11
Seputeh,8,8
Setiawangsa,3,3
Titiwangsa,5,5


In [8]:
#now, using geocoder and the Google API, we get the latitude and longitude of each area
import geocoder
GOOGLE_API_KEY = 'YOUR_GOOGLE_API_KEY' # placeholder: keep real keys out of shared notebooks

#function to get latitude and longitude of an area name
def get_latlng(area_name):
    lat_lng_coords = None
    # keep querying until the geocoder returns a result
    while lat_lng_coords is None:
        g = geocoder.google('{}, Malaysia'.format(area_name), key=GOOGLE_API_KEY)
        lat_lng_coords = g.latlng
    return lat_lng_coords

#put new columns of latitude and longitude into the dataframe
area_names = df_kl['Area']
coords = [get_latlng(area_name) for area_name in area_names.tolist()]

df_kl_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])
df_kl['Latitude'] = df_kl_coords['Latitude']
df_kl['Longitude'] = df_kl_coords['Longitude']
df_kl.head(10)

Unnamed: 0,Postcode,District,Area,Latitude,Longitude
0,52100,Kepong,Jinjang,3.211033,101.642303
1,52100,Kepong,Taman Bukit Maluri,3.201923,101.632259
2,51200,Segambut,Bandar Menjalara,3.193871,101.63088
3,51200,Segambut,Bukit Kiara,3.142163,101.644358
4,51200,Segambut,Bukit Tunku,3.166521,101.682767
5,51200,Segambut,Damansara,3.142145,101.649912
6,51200,Segambut,Damansara Town Centre,3.146779,101.662265
7,51200,Segambut,Jalan Duta,3.167529,101.670687
8,51200,Segambut,Kampung Kasipillay,3.174557,101.684333
9,51200,Segambut,Kampung Sungai Penchala,3.162039,101.624515
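
The `get_latlng` loop above keeps querying until the geocoder returns a result, so it would never finish for a name that cannot be resolved. A minimal sketch of a bounded-retry variant follows. The helper `geocode_with_retries`, its parameters, and the stand-in `fake_lookup` are illustrative assumptions and not part of the original notebook; in practice the lookup callable would wrap `geocoder.google(...).latlng`.

```python
import time

def geocode_with_retries(lookup, query, max_tries=3, delay_s=0.0):
    """Call lookup(query) up to max_tries times; return [lat, lng] or None.

    `lookup` is any callable returning [lat, lng] or None (a hypothetical
    interface; in the notebook it would wrap geocoder.google(...).latlng).
    """
    for attempt in range(max_tries):
        coords = lookup(query)
        if coords is not None:
            return coords
        if attempt < max_tries - 1:
            time.sleep(delay_s)  # brief pause before the next attempt
    return None  # give up instead of looping forever

# Stand-in lookup for demonstration: fails on the first call, then
# succeeds for one known name and keeps failing for anything else.
_calls = {"n": 0}

def fake_lookup(query):
    _calls["n"] += 1
    if query.startswith("Jinjang") and _calls["n"] >= 2:
        return [3.211033, 101.642303]  # Jinjang coordinates from the table above
    return None

found = geocode_with_retries(fake_lookup, "Jinjang, Malaysia")
missing = geocode_with_retries(fake_lookup, "No Such Place, Malaysia")
print(found, missing)
```

Areas that come back as `None` can then be inspected or dropped before the coordinates are added to the dataframe.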


In [9]:
from geopy.geocoders import Nominatim
import folium

address = 'Kuala Lumpur, Malaysia'
geolocator = Nominatim(user_agent="kl_explorer") # recent geopy versions require a user_agent; the name is arbitrary
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

# create map of Kuala Lumpur using latitude and longitude values
map_kl = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(df_kl['Latitude'], df_kl['Longitude'], df_kl['District'], df_kl['Area']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_kl)  
    
map_kl

# Methodology
In this project, I use the basic methodology taught in the Week 3 lab.

Above, we converted the area names into their equivalent latitude and longitude values.
Next, we use the Foursquare API to explore neighborhoods in Kuala Lumpur.

After that, we use the explore function to get the most common venue categories in each neighborhood, and then use this feature to group the neighborhoods into clusters.

The k-means clustering algorithm is used to complete this task, and the Folium library is used to visualize the neighborhoods in Kuala Lumpur.
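
To make the clustering step concrete, here is a minimal sketch of the k-means idea (alternating assignment and centroid update) in plain Python. It is not scikit-learn's implementation, which is what this notebook actually uses; the toy two-dimensional frequency vectors and the naive "first k points" initialisation are assumptions chosen so that the run is deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def simple_kmeans(points, k, max_iter=100):
    """Return (labels, centroids) for a list of numeric tuples."""
    centroids = [list(p) for p in points[:k]]  # naive deterministic init
    labels = None
    for _ in range(max_iter):
        # assignment step: each point goes to its nearest centroid
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # update step: each centroid becomes the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy input: (hotel share, restaurant share) for four made-up areas.
toy_points = [(0.30, 0.10), (0.05, 0.40), (0.27, 0.12), (0.04, 0.45)]
toy_labels, toy_centroids = simple_kmeans(toy_points, k=2)
print(toy_labels)
```

The hotel-heavy and restaurant-heavy toy areas end up in separate groups, which is the behaviour the notebook relies on when it clusters the per-area venue frequencies.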


In [10]:
#slice the original dataframe and create a new dataframe of the Bukit Bintang
bbintang = df_kl[df_kl['District'] == 'Bukit Bintang'].reset_index(drop=True)

#get the geographical coordinates of Bukit Bintang, Kuala Lumpur
address = 'Bukit Bintang, Kuala Lumpur'
geolocator = Nominatim(user_agent="kl_explorer") # recent geopy versions require a user_agent; the name is arbitrary
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

# create map of Bukit Bintang using latitude and longitude values
map_bintang = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(bbintang['Latitude'], bbintang['Longitude'], bbintang['Area']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_bintang)  
    
map_bintang

Next, we use the Foursquare API to get venues in the area surrounding Bukit Bintang, Kuala Lumpur.



In [11]:
import requests # library to handle requests

#Define Foursquare Credentials and Version
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180604'

#explore the first neighborhood in our dataframe
#Get the neighborhood's latitude and longitude values.
neighborhood_latitude = bbintang.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = bbintang.loc[0, 'Longitude'] # neighborhood longitude value
neighborhood_name = bbintang.loc[0, 'Area'] # neighborhood name

#get the top 100 venues that are in Bukit Bintang within a radius of 500 meters
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 500 # define radius
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)

#Send the GET request and examine the results
results = requests.get(url).json()

#borrow the get_category_type function from the Foursquare lab.
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

#clean the json and structure it into a pandas dataframe
venues = results['response']['groups'][0]['items']    
nearby_venues = pd.json_normalize(venues) # flatten JSON (pd.json_normalize replaces the old pandas.io.json import)

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]
print('{} venues were returned by Foursquare for Bukit Bintang, Kuala Lumpur.'.format(nearby_venues.shape[0]))
nearby_venues.head()

60 venues were returned by Foursquare for Bukit Bintang, Kuala Lumpur.


Unnamed: 0,name,categories,lat,lng
0,Hilton Kuala Lumpur,Hotel,3.135405,101.68569
1,Aloft Kuala Lumpur Sentral,Hotel,3.132767,101.686094
2,Hilton Executive Lounge,Hotel Bar,3.135923,101.685782
3,Family Mart,Convenience Store,3.13296,101.68748
4,Le Meridien Club Lounge,Hotel Bar,3.136157,101.686242
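
The request URL above is assembled by string formatting. A sketch of the same query built with the standard library's `urllib.parse.urlencode`, which also takes care of escaping, is shown here. The helper name `build_explore_url` and the placeholder credentials are assumptions; the endpoint and parameter names are the ones used in the cell above.

```python
from urllib.parse import urlencode, urlparse, parse_qs

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"

def build_explore_url(client_id, client_secret, version, lat, lng,
                      radius=500, limit=100):
    """Build the explore URL with the same parameters the notebook uses."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),  # latitude,longitude pair
        "radius": radius,                # search radius in meters
        "limit": limit,                  # maximum number of venues returned
    }
    return EXPLORE_ENDPOINT + "?" + urlencode(params)

# KL Sentral coordinates from the dataframe; credentials are placeholders.
url = build_explore_url("YOUR_ID", "YOUR_SECRET", "20180604",
                        3.134339, 101.686337)

# Parse the URL back to confirm what was encoded.
query = parse_qs(urlparse(url).query)
print(query["ll"], query["radius"], query["limit"])
```

With `requests`, the same dictionary can instead be passed as `requests.get(EXPLORE_ENDPOINT, params=params)`, which avoids building the string by hand.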


In [12]:
#function to repeat the same process for all areas
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Area', 
                  'Area Latitude', 
                  'Area Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

#run the above function on each neighborhood and create a new dataframe
bintang_venues = getNearbyVenues(names=bbintang['Area'],
                                   latitudes=bbintang['Latitude'],
                                   longitudes=bbintang['Longitude']
                                  )

#check the size of the resulting dataframe
print(bintang_venues.shape)
bintang_venues.head()

KL Sentral
Bukit Nanas
Bukit Petaling
Chow Kit
Dang Wangi
Kampung Baru
KL City Centre
Medan Tuanku
Pudu
Salak South
Tun Razak Exchange
(602, 7)


Unnamed: 0,Area,Area Latitude,Area Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,KL Sentral,3.134339,101.686337,Hilton Kuala Lumpur,3.135405,101.68569,Hotel
1,KL Sentral,3.134339,101.686337,Aloft Kuala Lumpur Sentral,3.132767,101.686094,Hotel
2,KL Sentral,3.134339,101.686337,Hilton Executive Lounge,3.135923,101.685782,Hotel Bar
3,KL Sentral,3.134339,101.686337,Family Mart,3.13296,101.68748,Convenience Store
4,KL Sentral,3.134339,101.686337,Le Meridien Club Lounge,3.136157,101.686242,Hotel Bar
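
`getNearbyVenues` indexes `v['venue']['categories'][0]` directly, which raises an `IndexError` for a venue that has no category. A sketch of a more defensive extraction step follows, run on a hand-made payload that mirrors only the fields the cell above reads. The helper `extract_venues` and the second mock venue are hypothetical and are not Foursquare output.

```python
def extract_venues(payload, area, area_lat, area_lng):
    """Flatten an explore-style payload into row tuples, tolerating gaps."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories", [])
        # use None when a venue carries no category instead of failing
        category = categories[0]["name"] if categories else None
        rows.append((area, area_lat, area_lng, venue["name"],
                     venue["location"]["lat"], venue["location"]["lng"],
                     category))
    return rows

# Hand-made payload with the same nesting the notebook code relies on.
mock_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Hilton Kuala Lumpur",
               "location": {"lat": 3.135405, "lng": 101.68569},
               "categories": [{"name": "Hotel"}]}},
    {"venue": {"name": "Uncategorised Venue",  # hypothetical entry
               "location": {"lat": 3.1340, "lng": 101.6860},
               "categories": []}},
]}]}}

rows = extract_venues(mock_payload, "KL Sentral", 3.134339, 101.686337)
empty = extract_venues({"response": {}}, "Nowhere", 0.0, 0.0)
print(rows[0][6], rows[1][6], empty)
```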


In [13]:
#check how many venues were returned for each area
print('There are {} uniques categories in Kuala Lumpur.'.format(len(bintang_venues['Venue Category'].unique())))
bintang_venues.groupby('Area').count()

There are 146 uniques categories in Kuala Lumpur.


Unnamed: 0_level_0,Area Latitude,Area Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Area,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Bukit Nanas,43,43,43,43,43,43
Bukit Petaling,16,16,16,16,16,16
Chow Kit,97,97,97,97,97,97
Dang Wangi,33,33,33,33,33,33
KL City Centre,100,100,100,100,100,100
KL Sentral,60,60,60,60,60,60
Kampung Baru,71,71,71,71,71,71
Medan Tuanku,72,72,72,72,72,72
Pudu,34,34,34,34,34,34
Salak South,28,28,28,28,28,28
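
One thing the counts above suggest: KL City Centre returned exactly 100 venues, which equals `LIMIT`, so its list may have been cut off by the request cap rather than by the 500 m radius. A small sketch that flags such areas is given here. The dictionary simply re-types the counts from the table above, and the helper name `areas_at_limit` is an assumption.

```python
LIMIT = 100  # same cap as in the request above

# Venue counts per area, copied from the groupby table above.
venue_counts = {
    "Bukit Nanas": 43, "Bukit Petaling": 16, "Chow Kit": 97,
    "Dang Wangi": 33, "KL City Centre": 100, "KL Sentral": 60,
    "Kampung Baru": 71, "Medan Tuanku": 72, "Pudu": 34, "Salak South": 28,
}

def areas_at_limit(counts, limit):
    """Return the areas whose venue count reached the request limit."""
    return sorted(area for area, n in counts.items() if n >= limit)

capped = areas_at_limit(venue_counts, LIMIT)
print(capped)
```

For a flagged area, the category frequencies computed later describe only the venues that were returned, which is worth keeping in mind when comparing it with sparser areas.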


### Analyze Kuala Lumpur


In [14]:
# one hot encoding
bintang_onehot = pd.get_dummies(bintang_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
bintang_onehot['Area'] = bintang_venues['Area'] 

# move neighborhood column to the first column
fixed_columns = [bintang_onehot.columns[-1]] + list(bintang_onehot.columns[:-1])
bintang_onehot = bintang_onehot[fixed_columns]

#examine the new dataframe size after one hot encoding
print('{} rows were returned after one hot encoding.'.format(bintang_onehot.shape[0]))

#group rows by neighborhood and by taking the mean of the frequency of occurrence of each category
bintang_grouped = bintang_onehot.groupby('Area').mean().reset_index()

#examine the new dataframe size after grouping
print('{} rows were returned after grouping.'.format(bintang_grouped.shape[0]))

602 rows were returned after one hot encoding.
11 rows were returned after grouping.
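
One-hot encoding followed by a group-wise mean amounts to computing, for each area, the share of its venues that fall in each category. A plain-Python sketch of that equivalence follows, using the five KL Sentral rows shown earlier as toy input; `category_shares` is a hypothetical helper and not part of the notebook.

```python
from collections import Counter, defaultdict

def category_shares(rows):
    """Map each area to {category: fraction of that area's venues}."""
    per_area = defaultdict(Counter)
    for area, category in rows:
        per_area[area][category] += 1
    return {
        area: {cat: n / sum(counts.values()) for cat, n in counts.items()}
        for area, counts in per_area.items()
    }

# (Area, Venue Category) pairs taken from the first five rows shown earlier.
toy_rows = [
    ("KL Sentral", "Hotel"),
    ("KL Sentral", "Hotel"),
    ("KL Sentral", "Hotel Bar"),
    ("KL Sentral", "Convenience Store"),
    ("KL Sentral", "Hotel Bar"),
]

shares = category_shares(toy_rows)
print(shares["KL Sentral"])
```

Each area's shares sum to 1, so areas with very different venue counts are compared on the same scale before clustering.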


In [15]:
#print each neighborhood along with the top 5 most common venues
num_top_venues = 5

for hood in bintang_grouped['Area']:
    print("----"+hood+"----")
    temp = bintang_grouped[bintang_grouped['Area'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bukit Nanas----
                venue  freq
0   Indian Restaurant  0.14
1                Café  0.09
2    Malay Restaurant  0.07
3         Coffee Shop  0.05
4  Chinese Restaurant  0.02


----Bukit Petaling----
                venue  freq
0    Malay Restaurant  0.25
1      Breakfast Spot  0.06
2  Falafel Restaurant  0.06
3          Food Court  0.06
4              Museum  0.06


----Chow Kit----
                venue  freq
0    Malay Restaurant  0.07
1               Hotel  0.05
2    Asian Restaurant  0.05
3         Coffee Shop  0.04
4  Chinese Restaurant  0.04


----Dang Wangi----
              venue  freq
0             Hotel  0.27
1        Soup Place  0.06
2               Spa  0.06
3  Malay Restaurant  0.06
4               Bar  0.06


----KL City Centre----
                venue  freq
0   Indian Restaurant  0.09
1               Hotel  0.06
2    Asian Restaurant  0.06
3         Coffee Shop  0.06
4  Chinese Restaurant  0.06


----KL Sentral----
                 venue  freq
0           

In [16]:
#put into a pandas dataframe

#write a function to sort the venues in descending order
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

#create the new dataframe and display the top 8 venues for each neighborhood
num_top_venues = 8

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Area']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
areas_venues_sorted = pd.DataFrame(columns=columns)
areas_venues_sorted['Area'] = bintang_grouped['Area']

for ind in np.arange(bintang_grouped.shape[0]):
    areas_venues_sorted.iloc[ind, 1:] = return_most_common_venues(bintang_grouped.iloc[ind, :], num_top_venues)

areas_venues_sorted.head()

Unnamed: 0,Area,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue
0,Bukit Nanas,Indian Restaurant,Café,Malay Restaurant,Coffee Shop,Zoo,General Travel,Road,Noodle House
1,Bukit Petaling,Malay Restaurant,Breakfast Spot,Outlet Store,Park,Art Gallery,Travel & Transport,Convenience Store,Asian Restaurant
2,Chow Kit,Malay Restaurant,Hotel,Asian Restaurant,Chinese Restaurant,Coffee Shop,Food Court,Bakery,Soup Place
3,Dang Wangi,Hotel,Malay Restaurant,Soup Place,Spa,Bar,Multiplex,Scenic Lookout,Lounge
4,KL City Centre,Indian Restaurant,Chinese Restaurant,Coffee Shop,Hotel,Asian Restaurant,Café,Food Truck,Restaurant
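
The column labels above come from a three-element `indicators` list with a fallback to "th". That is fine for eight columns, but it would produce "21th" for larger values. A sketch of a general ordinal helper is shown here; the function name `ordinal` is an assumption.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)

# Rebuild the eight column names used in the notebook.
columns = ["Area"] + [
    "{} Most Common Venue".format(ordinal(i)) for i in range(1, 9)
]
print(columns[1], columns[-1], ordinal(21))
```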


## K-means Clustering

In [17]:
from sklearn.cluster import KMeans

# set number of clusters
kclusters = 3

# drop the area name so that only the category frequencies are clustered
bintang_grouped_clustering = bintang_grouped.drop(columns='Area')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(bintang_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

#create a new dataframe that includes the cluster as well as the top 8 venues for each area
bintang_merged = bbintang

# add clustering labels
# note: kmeans.labels_ follows the row order of bintang_grouped (areas sorted by name)
# and is assigned here by position to the rows of bbintang
bintang_merged['Cluster Labels'] = kmeans.labels_

# join areas_venues_sorted onto the Bukit Bintang data to add the most common venues for each area
bintang_merged = bintang_merged.join(areas_venues_sorted.set_index('Area'), on='Area')

bintang_merged.head()

Unnamed: 0,Postcode,District,Area,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue
0,50200,Bukit Bintang,KL Sentral,3.134339,101.686337,0,Hotel,Indian Restaurant,Coffee Shop,Ice Cream Shop,Hotel Bar,Steakhouse,Snack Place,Japanese Restaurant
1,50200,Bukit Bintang,Bukit Nanas,3.15,101.7,2,Indian Restaurant,Café,Malay Restaurant,Coffee Shop,Zoo,General Travel,Road,Noodle House
2,50200,Bukit Bintang,Bukit Petaling,3.131057,101.698382,0,Malay Restaurant,Breakfast Spot,Outlet Store,Park,Art Gallery,Travel & Transport,Convenience Store,Asian Restaurant
3,50200,Bukit Bintang,Chow Kit,3.159971,101.696953,0,Malay Restaurant,Hotel,Asian Restaurant,Chinese Restaurant,Coffee Shop,Food Court,Bakery,Soup Place
4,50200,Bukit Bintang,Dang Wangi,3.156222,101.702956,0,Hotel,Malay Restaurant,Soup Place,Spa,Bar,Multiplex,Scenic Lookout,Lounge
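
Note that `groupby('Area')` sorts the areas by name, so `kmeans.labels_` follows that sorted order, whereas `bbintang` keeps the order of the CSV file (for example, KL Sentral is row 0 in `bbintang` but comes after KL City Centre in the grouped frame). Assigning labels by position can therefore pair a label with a different area than the one it was computed for. A sketch of attaching labels by area name instead is given here; the label values are made up for illustration and are not the notebook's results.

```python
# Row order of the first five areas in bbintang (as in the CSV file).
csv_order = ["KL Sentral", "Bukit Nanas", "Bukit Petaling",
             "Chow Kit", "Dang Wangi"]

# Row order after groupby('Area'), which sorts by name.
grouped_order = sorted(csv_order)

# Hypothetical cluster labels, one per row of the grouped frame.
labels_in_grouped_order = [0, 1, 0, 2, 1]

# Attach each label to its area name, then look it up in CSV order.
label_by_area = dict(zip(grouped_order, labels_in_grouped_order))
aligned = [label_by_area[area] for area in csv_order]

print(grouped_order)
print(aligned)
```

With pandas, the same idea is to add the labels to `areas_venues_sorted` (which shares the row order of `bintang_grouped`) before joining on `Area`, so that the join carries each label to the right row.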


In [18]:
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

#Finally, let's visualize the resulting clusters
# create map centered on KL Sentral (3.1343385, 101.6863371)
bb_clusters = folium.Map(location=[3.1343385, 101.6863371], zoom_start=13)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(bintang_merged['Latitude'], bintang_merged['Longitude'], bintang_merged['Area'], bintang_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(bb_clusters)
       
bb_clusters
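
The marker colours above are looked up with `rainbow[cluster-1]`, so label 0 wraps around to the last colour in the list. A sketch of a direct label-to-colour mapping using only the standard library follows. `cluster_palette` is a hypothetical helper, and the evenly spaced hues are an assumption; they are not the `cm.rainbow` colours used in the notebook.

```python
import colorsys

def cluster_palette(n_clusters, saturation=0.8, value=0.9):
    """Return one hex colour per cluster label, evenly spaced in hue."""
    palette = []
    for i in range(n_clusters):
        r, g, b = colorsys.hsv_to_rgb(i / n_clusters, saturation, value)
        palette.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette

palette = cluster_palette(3)

# Index directly by label: label 0 gets palette[0], and so on.
for label in range(3):
    print(label, palette[label])
```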

### Results


### Cluster 1 


In [19]:
bintang_merged.loc[bintang_merged['Cluster Labels'] == 0, bintang_merged.columns[[2] + list(range(5, bintang_merged.shape[1]))]]

Unnamed: 0,Area,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue
0,KL Sentral,0,Hotel,Indian Restaurant,Coffee Shop,Ice Cream Shop,Hotel Bar,Steakhouse,Snack Place,Japanese Restaurant
2,Bukit Petaling,0,Malay Restaurant,Breakfast Spot,Outlet Store,Park,Art Gallery,Travel & Transport,Convenience Store,Asian Restaurant
3,Chow Kit,0,Malay Restaurant,Hotel,Asian Restaurant,Chinese Restaurant,Coffee Shop,Food Court,Bakery,Soup Place
4,Dang Wangi,0,Hotel,Malay Restaurant,Soup Place,Spa,Bar,Multiplex,Scenic Lookout,Lounge
5,Kampung Baru,0,Malay Restaurant,Thai Restaurant,Asian Restaurant,Indonesian Restaurant,Hotel,Diner,Soccer Field,Breakfast Spot
7,Medan Tuanku,0,Malay Restaurant,Asian Restaurant,Hotel,Coffee Shop,Food Court,Bakery,Chinese Restaurant,Soup Place
10,Tun Razak Exchange,0,Nightclub,Bar,Middle Eastern Restaurant,Candy Store,Bakery,Lounge,Wine Bar,Chinese Restaurant


### Cluster 2


In [20]:
bintang_merged.loc[bintang_merged['Cluster Labels'] == 1, bintang_merged.columns[[2] + list(range(5, bintang_merged.shape[1]))]]

Unnamed: 0,Area,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue
8,Pudu,1,Chinese Restaurant,Asian Restaurant,Breakfast Spot,Noodle House,Hong Kong Restaurant,Food Truck,Pet Store,Dessert Shop
9,Salak South,1,Chinese Restaurant,Indian Restaurant,Soccer Field,South Indian Restaurant,Food Truck,Night Market,Supermarket,Flea Market


### Cluster 3 


In [21]:
bintang_merged.loc[bintang_merged['Cluster Labels'] == 2, bintang_merged.columns[[2] + list(range(5, bintang_merged.shape[1]))]]

Unnamed: 0,Area,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue
1,Bukit Nanas,2,Indian Restaurant,Café,Malay Restaurant,Coffee Shop,Zoo,General Travel,Road,Noodle House
6,KL City Centre,2,Indian Restaurant,Chinese Restaurant,Coffee Shop,Hotel,Asian Restaurant,Café,Food Truck,Restaurant


### Discussion

Based on the clusters above, we believe that the classification of each cluster could be done better with a calculation over the (most common) venue categories. Looking at each cluster, we cannot clearly determine what it represents using only the Foursquare most-common-venue data.

However, for the sake of this project, we assume the clusters are as follows:

Cluster 1: Tourism

Cluster 2: Residential

Cluster 3: Mixed
        
We believe that the classification we propose is an encouraging step towards a quantitative and systematic comparison of different cities. Further studies are needed to relate the acquired data to more meaningful and objective results.

# Conclusion



Using the Foursquare API, we can capture data on common places all around the world. With it, we return to our main objective, which is to determine whether each area inside the city is residential, a tourist area, or something else.

In conclusion, Kuala Lumpur is the center of attraction among Malaysians. As for the classification based on common venues, we again need a more systematic or quantitative way to identify and declare it: comparisons can be made, but there is as yet no method or quantitative data to determine the classification. We hope that in the future such a method can be established and explored for reference.
