<h1 align=center><font size = 5>Segmenting and Clustering Neighborhoods in South Jakarta</font></h1>

## 1. INTRODUCTION

### 1.1. Background

Over the past decade, coffee-drinking habits have reshaped urban lifestyles. Coffee was once seen as a drink for older men; today, women and men of all ages drink it regularly. People are also no longer content with just the drink: many look for a place to enjoy it. The coffee shop has become a popular hangout where customers can use the internet while sampling a variety of brewed coffees.
This coffee-drinking trend is a big business opportunity, and businesses have started opening places that serve specialty coffee. Indonesians are not yet daily coffee drinkers in the way that people in Melbourne are, and the coffee shop industry is still relatively new. Even so, in big cities like Jakarta a coffee shop has the opportunity to earn a gross profit of Rp 100 million to Rp 1 billion. However, getting into the business is not as easy as one might imagine.
If you already have the capital to open a coffee shop, you also need the courage to start, a strategy, and an understanding of the market. If you have long loved coffee and enjoy drinking it, you can start the business with the right passion. In this project I therefore apply what I learned on Coursera to a relevant question: how to determine which areas are suitable for opening a coffee shop.

### 1.2. Problem
Finding data on the neighborhoods and postcodes of South Jakarta is the first challenge to resolve. The
rental price of a venue, which affects the choice of the exact location for a coffee shop, is another
problem that must be addressed.

### 1.3. Interest
I believe this is a relevant challenge with a valid question for anyone who wants to open a coffee shop and choose the right location. The same methodology can be adapted to other requirements as needed, and it applies to anyone interested in starting or finding a new business in any city. Finally, it also serves as a good practical exercise for developing data science skills.

## 2. Data Acquisition and Cleaning

### 2.1. Data Acquisition
1. The data for this project combines two sources. The first is scraped from a Wikipedia page that lists the neighborhoods of South Jakarta: https://id.wikipedia.org/wiki/Daftar_kecamatan_dan_kelurahan_di_Kota_Administrasi_Jakarta_Selatan. The page also contains additional information about the sub-districts. The following columns are extracted from it:
    * Kelurahan : Name of the urban village
    * Kecamatan : Name of the sub-district
    * Kota / Provinsi : Name of the city / province



2. The second source is a list of latitudes and longitudes from the website longlat.net, combined with a list of South Jakarta postcodes from Wikipedia. It provides the following columns:

    * Kelurahan : Name of the urban village.
    * PostCode : Postcode of the area.
    * Latitude : Latitude of the urban village.
    * Longitude : Longitude of the urban village.

#### Import necessary libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

from bs4 import BeautifulSoup

import wikipedia as wp

import warnings

print('Libraries imported.')

Libraries imported.


### 2.2. Download and Explore Dataset 

In [2]:
import requests
website_url = requests.get('https://id.wikipedia.org/wiki/Daftar_kecamatan_dan_kelurahan_di_Kota_Administrasi_Jakarta_Selatan').text

In [3]:
from bs4 import BeautifulSoup
soup = BeautifulSoup(website_url,'lxml')
print(soup.prettify())

<!DOCTYPE html>
<html class="client-nojs" dir="ltr" lang="id">
 <head>
  <meta charset="utf-8"/>
  <title>
   Daftar kecamatan dan kelurahan di Kota Administrasi Jakarta Selatan - Wikipedia bahasa Indonesia, ensiklopedia bebas
  </title>
  <script>
   document.documentElement.className="client-js";RLCONF={"wgBreakFrames":!1,"wgSeparatorTransformTable":[",\t.",".\t,"],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","Januari","Februari","Maret","April","Mei","Juni","Juli","Agustus","September","Oktober","November","Desember"],"wgMonthNamesShort":["","Jan","Feb","Mar","Apr","Mei","Jun","Jul","Ags","Sep","Okt","Nov","Des"],"wgRequestId":"Xl0J4ApAMNQABAlRrlcAAABX","wgCSPNonce":!1,"wgCanonicalNamespace":"","wgCanonicalSpecialPageName":!1,"wgNamespaceNumber":0,"wgPageName":"Daftar_kecamatan_dan_kelurahan_di_Kota_Administrasi_Jakarta_Selatan","wgTitle":"Daftar kecamatan dan kelurahan di Kota Administrasi Jakarta Selatan","wgCurRevisionId":16412642,"wgRevisionId":

In [4]:
My_table = soup.find('table',{'class':'wikitable sortable'})
My_table

<table class="wikitable sortable" width="100%">
<tbody><tr style="background-color: #ccc;" valign="top">
<th>Kode <br/>Kemendagri</th>
<th>Kecamatan</th>
<th>Jumlah <br/>Kelurahan</th>
<th>Daftar <br/>Kelurahan
</th></tr>
<tr valign="top">
<td>31.74.06
</td>
<td><a href="/wiki/Cilandak,_Jakarta_Selatan" title="Cilandak, Jakarta Selatan">Cilandak</a></td>
<td align="center">5</td>
<td><div class="hlist">
<ul><li><a href="/wiki/Cilandak_Barat,_Cilandak,_Jakarta_Selatan" title="Cilandak Barat, Cilandak, Jakarta Selatan">Cilandak Barat</a></li>
<li><a href="/wiki/Cipete_Selatan,_Cilandak,_Jakarta_Selatan" title="Cipete Selatan, Cilandak, Jakarta Selatan">Cipete Selatan</a></li>
<li><a href="/wiki/Gandaria_Selatan,_Cilandak,_Jakarta_Selatan" title="Gandaria Selatan, Cilandak, Jakarta Selatan">Gandaria Selatan</a></li>
<li><a href="/wiki/Lebak_Bulus,_Cilandak,_Jakarta_Selatan" title="Lebak Bulus, Cilandak, Jakarta Selatan">Lebak Bulus</a></li>
<li><a href="/wiki/Pondok_Labu,_Cilandak,_Jakart

In [5]:
links = My_table.findAll('a')
links

[<a href="/wiki/Cilandak,_Jakarta_Selatan" title="Cilandak, Jakarta Selatan">Cilandak</a>,
 <a href="/wiki/Cilandak_Barat,_Cilandak,_Jakarta_Selatan" title="Cilandak Barat, Cilandak, Jakarta Selatan">Cilandak Barat</a>,
 <a href="/wiki/Cipete_Selatan,_Cilandak,_Jakarta_Selatan" title="Cipete Selatan, Cilandak, Jakarta Selatan">Cipete Selatan</a>,
 <a href="/wiki/Gandaria_Selatan,_Cilandak,_Jakarta_Selatan" title="Gandaria Selatan, Cilandak, Jakarta Selatan">Gandaria Selatan</a>,
 <a href="/wiki/Lebak_Bulus,_Cilandak,_Jakarta_Selatan" title="Lebak Bulus, Cilandak, Jakarta Selatan">Lebak Bulus</a>,
 <a href="/wiki/Pondok_Labu,_Cilandak,_Jakarta_Selatan" title="Pondok Labu, Cilandak, Jakarta Selatan">Pondok Labu</a>,
 <a href="/wiki/Jagakarsa,_Jakarta_Selatan" title="Jagakarsa, Jakarta Selatan">Jagakarsa</a>,
 <a href="/wiki/Ciganjur,_Jagakarsa,_Jakarta_Selatan" title="Ciganjur, Jagakarsa, Jakarta Selatan">Ciganjur</a>,
 <a href="/wiki/Cipedak,_Jagakarsa,_Jakarta_Selatan" title="Cipedak, 

In [6]:
Provinsi = []
for link in links:
    Provinsi.append(link.get('title'))
    
print(Provinsi)

['Cilandak, Jakarta Selatan', 'Cilandak Barat, Cilandak, Jakarta Selatan', 'Cipete Selatan, Cilandak, Jakarta Selatan', 'Gandaria Selatan, Cilandak, Jakarta Selatan', 'Lebak Bulus, Cilandak, Jakarta Selatan', 'Pondok Labu, Cilandak, Jakarta Selatan', 'Jagakarsa, Jakarta Selatan', 'Ciganjur, Jagakarsa, Jakarta Selatan', 'Cipedak, Jagakarsa, Jakarta Selatan', 'Jagakarsa, Jagakarsa, Jakarta Selatan', 'Lenteng Agung, Jagakarsa, Jakarta Selatan', 'Srengseng Sawah, Jagakarsa, Jakarta Selatan', 'Tanjung Barat, Jagakarsa, Jakarta Selatan', 'Kebayoran Baru, Jakarta Selatan', 'Cipete Utara, Kebayoran Baru, Jakarta Selatan', 'Gandaria Utara, Kebayoran Baru, Jakarta Selatan', 'Gunung, Kebayoran Baru, Jakarta Selatan', 'Kramat Pela, Kebayoran Baru, Jakarta Selatan', 'Melawai, Kebayoran Baru, Jakarta Selatan', 'Petogogan, Kebayoran Baru, Jakarta Selatan', 'Pulo, Kebayoran Baru, Jakarta Selatan', 'Rawa Barat, Kebayoran Baru, Jakarta Selatan', 'Selong, Kebayoran Baru, Jakarta Selatan', 'Senayan, Kebay

In [7]:
df = pd.DataFrame()
df['Provinsi'] = Provinsi
df = df.Provinsi.str.split(',', expand=True).rename(columns={0: "Kelurahan", 1: "Kecamatan", 2: "Kota"}, errors="raise").dropna()
df.head()

Unnamed: 0,Kelurahan,Kecamatan,Kota
1,Cilandak Barat,Cilandak,Jakarta Selatan
2,Cipete Selatan,Cilandak,Jakarta Selatan
3,Gandaria Selatan,Cilandak,Jakarta Selatan
4,Lebak Bulus,Cilandak,Jakarta Selatan
5,Pondok Labu,Cilandak,Jakarta Selatan
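
The comma split above leaves a leading space in the second and third columns (for example `' Cilandak'`), which can silently break later joins or filters on those columns. A minimal sketch of a stricter parser in plain Python follows; the helper name `split_title` is hypothetical and not part of the original notebook.

```python
def split_title(title):
    """Split a Wikipedia link title into (kelurahan, kecamatan, kota).

    Returns None for titles that do not have exactly three parts,
    such as the sub-district links ('Cilandak, Jakarta Selatan').
    """
    parts = [part.strip() for part in title.split(",")]
    if len(parts) != 3:
        return None
    return tuple(parts)


titles = [
    "Cilandak, Jakarta Selatan",
    "Cilandak Barat, Cilandak, Jakarta Selatan",
    "Cipete Selatan, Cilandak, Jakarta Selatan",
]
# Keep only the three-part titles, with whitespace already stripped
rows = [row for row in map(split_title, titles) if row is not None]
print(rows)
```

The resulting tuples could be passed to `pd.DataFrame(rows, columns=['Kelurahan', 'Kecamatan', 'Kota'])` to obtain the same table without the stray spaces.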


In [8]:
East_jakarta_geo = pd.read_csv('East_jakarta.csv')
East_jakarta_geo.head()

Unnamed: 0,Kelurahan,PostCode,Latitude,Longitude
0,Cilandak Barat,12430,-6.288289,106.796765
1,Cipete Selatan,12410,-6.271827,106.804876
2,Gandaria Selatan,12420,-6.272557,106.794845
3,Lebak Bulus,12440,-6.301837,106.779642
4,Pondok Labu,12450,-6.308832,106.797495


In [9]:
East_jakarta_merged = pd.merge(df, East_jakarta_geo, on='Kelurahan')
East_jakarta_merged.head()

Unnamed: 0,Kelurahan,Kecamatan,Kota,PostCode,Latitude,Longitude
0,Cilandak Barat,Cilandak,Jakarta Selatan,12430,-6.288289,106.796765
1,Cipete Selatan,Cilandak,Jakarta Selatan,12410,-6.271827,106.804876
2,Gandaria Selatan,Cilandak,Jakarta Selatan,12420,-6.272557,106.794845
3,Lebak Bulus,Cilandak,Jakarta Selatan,12440,-6.301837,106.779642
4,Pondok Labu,Cilandak,Jakarta Selatan,12450,-6.308832,106.797495
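
`pd.merge` performs an inner join by default, so any kelurahan whose name is spelled differently in the two sources drops out without warning. A small pre-merge check, sketched in plain Python, can list such names; the sample spellings and the `normalize` helper are illustrative assumptions.

```python
def normalize(name):
    """Lower-case and collapse whitespace so near-identical spellings compare equal."""
    return " ".join(name.split()).casefold()


def unmatched_names(left_names, right_names):
    """Return names from left_names that have no counterpart in right_names."""
    right = {normalize(name) for name in right_names}
    return sorted(name for name in left_names if normalize(name) not in right)


wiki_names = ["Cilandak Barat", "Cipete Selatan", "Lebak Bulus"]
geo_names = ["cilandak barat", "Cipete  Selatan"]  # hypothetical CSV spellings

# Names that an exact-match inner join on the CSV would lose entirely
print(unmatched_names(wiki_names, geo_names))
```

In the notebook, the two lists would come from `df['Kelurahan']` and `East_jakarta_geo['Kelurahan']`.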


In [10]:
East_jakarta_data=East_jakarta_merged[['PostCode','Kecamatan','Kelurahan','Latitude','Longitude']]
East_jakarta_data

Unnamed: 0,PostCode,Kecamatan,Kelurahan,Latitude,Longitude
0,12430,Cilandak,Cilandak Barat,-6.288289,106.796765
1,12410,Cilandak,Cipete Selatan,-6.271827,106.804876
2,12420,Cilandak,Gandaria Selatan,-6.272557,106.794845
3,12440,Cilandak,Lebak Bulus,-6.301837,106.779642
4,12450,Cilandak,Pondok Labu,-6.308832,106.797495
5,12620,Jagakarsa,Ciganjur,-6.335664,106.807269
6,12630,Jagakarsa,Cipedak,-6.351934,106.801658
7,12620,Jagakarsa,Jagakarsa,-6.324892,106.819768
8,12610,Jagakarsa,Lenteng Agung,-6.323122,106.836247
9,12640,Jagakarsa,Srengseng Sawah,-6.344935,106.826291


### Define Foursquare Credentials and Version

In [11]:
CLIENT_ID = 'ZCDBYWTWVFRJTKKW3NYERHNPNA3TA2J5A0L4TJGWDMYSD5ER' # your Foursquare ID
CLIENT_SECRET = 'Q5M4N2IPD5Z3RPUDKU1CQWNVU1B5YV01CNOXNNXPPTZO5YO5' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: ZCDBYWTWVFRJTKKW3NYERHNPNA3TA2J5A0L4TJGWDMYSD5ER
CLIENT_SECRET:Q5M4N2IPD5Z3RPUDKU1CQWNVU1B5YV01CNOXNNXPPTZO5YO5
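
Hard-coding and printing API credentials makes them easy to leak when a notebook is shared. One common alternative is to read them from environment variables. The sketch below assumes the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, which are placeholders chosen for this example rather than anything Foursquare prescribes; `load_credentials` and `mask` are likewise hypothetical helpers.

```python
import os


def load_credentials(environ=os.environ):
    """Read the Foursquare credentials from environment variables.

    The variable names are assumptions made for this sketch.
    Raises RuntimeError if either one is missing or empty.
    """
    client_id = environ.get("FOURSQUARE_CLIENT_ID", "")
    client_secret = environ.get("FOURSQUARE_CLIENT_SECRET", "")
    if not client_id or not client_secret:
        raise RuntimeError("Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first.")
    return client_id, client_secret


def mask(secret, visible=4):
    """Show only the first few characters of a secret when logging it."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)


# Example with an explicit mapping instead of the real environment
demo_env = {"FOURSQUARE_CLIENT_ID": "demo-id", "FOURSQUARE_CLIENT_SECRET": "demo-secret"}
client_id, client_secret = load_credentials(demo_env)
print("CLIENT_ID:", mask(client_id))
```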


#### Let's explore the first neighborhood in our dataframe.

In [12]:
East_jakarta_data.loc[0, 'Kelurahan']

'Cilandak Barat'

Get the neighborhood's latitude and longitude values.

In [13]:
neighborhood_latitude = East_jakarta_data.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = East_jakarta_data.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = East_jakarta_data.loc[0, 'Kelurahan'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Cilandak Barat are -6.288289, 106.796765.


### Now, let's get the top 100 venues within a radius of 500 meters.

First, let's create the GET request URL and store it in `url`.

In [14]:
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 500 # define radius


# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=ZCDBYWTWVFRJTKKW3NYERHNPNA3TA2J5A0L4TJGWDMYSD5ER&client_secret=Q5M4N2IPD5Z3RPUDKU1CQWNVU1B5YV01CNOXNNXPPTZO5YO5&v=20180605&ll=-6.288289,106.796765&radius=500&limit=100'
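
String formatting works here, but it does not escape special characters and makes the individual parameters hard to read. The same kind of URL can be built with the standard library's `urlencode`, as sketched below. The function name `build_explore_url` and the placeholder credentials are hypothetical; the endpoint and parameter names are the ones used in the cell above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    """Build the venues/explore request URL from named parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    # urlencode percent-escapes each value and joins the pairs with '&'
    return EXPLORE_ENDPOINT + "?" + urlencode(params)


url = build_explore_url("MY_ID", "MY_SECRET", "20180605", -6.288289, 106.796765)

# Parse the query string back to confirm the parameters survived the round trip
query = parse_qs(urlsplit(url).query)
print(query["ll"], query["radius"], query["limit"])
```

With `requests`, passing the same dictionary as `requests.get(EXPLORE_ENDPOINT, params=params)` performs the encoding automatically.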

Send the GET request and examine the results.

In [15]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e5d32f760ba08001b1c43c8'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Cilandak',
  'headerFullLocation': 'Cilandak, Jakarta',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 35,
  'suggestedBounds': {'ne': {'lat': -6.283788995499996,
    'lng': 106.80128379039412},
   'sw': {'lat': -6.292789004500004, 'lng': 106.79224620960586}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4bf7cbde5ec320a1ce8387d3',
       'name': 'Total Buah Segar',
       'location': {'address': 'Jalan RS. Fatmawati No. 52F',
        'lat': -6.287626009704267,
        'lng': 106.79541365563003,
        'labeledLatLngs': [{'label': 'display',
        

The **get_category_type** function below comes from the Foursquare lab.

In [16]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [17]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()


Unnamed: 0,name,categories,lat,lng
0,Total Buah Segar,Farmers Market,-6.287626,106.795414
1,Roti Bakar Wiwied,Sandwich Place,-6.289398,106.795378
2,Maxima Fitness,Gym,-6.287644,106.795485
3,Mars Kitchen,Café,-6.287163,106.795424
4,TOUS les JOURS,Bakery,-6.291655,106.79973


In [18]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

35 venues were returned by Foursquare.


## 3. Explore Neighborhoods in South Jakarta

In [19]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Kelurahan', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
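
`getNearbyVenues` indexes `v['venue']['categories'][0]` directly, which raises `IndexError` for any venue that has no category. A defensive variant of the row-extraction step is sketched below on a hand-written response fragment. The helper `venue_rows` and the sample items are assumptions for illustration, shaped like the `items` list shown earlier.

```python
def venue_rows(name, lat, lng, items):
    """Turn Foursquare 'items' into flat tuples, tolerating missing categories."""
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or []
        # Fall back to None instead of failing when the category list is empty
        category = categories[0]["name"] if categories else None
        rows.append((
            name,
            lat,
            lng,
            venue.get("name"),
            location.get("lat"),
            location.get("lng"),
            category,
        ))
    return rows


sample_items = [
    {"venue": {"name": "Mars Kitchen",
               "location": {"lat": -6.287163, "lng": 106.795424},
               "categories": [{"name": "Café"}]}},
    {"venue": {"name": "Unnamed stall",  # hypothetical venue without a category
               "location": {"lat": -6.2880, "lng": 106.7960},
               "categories": []}},
]

for row in venue_rows("Cilandak Barat", -6.288289, 106.796765, sample_items):
    print(row)
```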

Now run `getNearbyVenues` on each neighborhood and store the result in a new dataframe called `East_jakarta_venues`.

In [20]:
East_jakarta_venues = getNearbyVenues(names=East_jakarta_data['Kelurahan'],
                                   latitudes=East_jakarta_data['Latitude'],
                                   longitudes=East_jakarta_data['Longitude']
                                  )

Cilandak Barat
Cipete Selatan
Gandaria Selatan
Lebak Bulus
Pondok Labu
Ciganjur
Cipedak
Jagakarsa
Lenteng Agung
Srengseng Sawah
Tanjung Barat
Cipete Utara
Gandaria Utara
Gunung
Kramat Pela
Melawai
Petogogan
Pulo
Rawa Barat
Selong
Senayan
Cipulir
Grogol Selatan
Grogol Utara
Kebayoran Lama Selatan
Kebayoran Lama Utara
Pondok Pinang
Bangka
Kuningan Barat
Mampang Prapatan
Pela Mampang
Tegal Parang
Cikoko
Duren Tiga
Kalibata
Pancoran
Pengadegan
Rawajati
Cilandak Timur
Jati Padang
Kebagusan
Pasar Minggu
Pejaten Barat
Pejaten Timur
Ragunan
Bintaro
Pesanggrahan
Petukangan Selatan
Petukangan Utara
Guntur
Karet Kuningan
Karet Semanggi
Karet
Kuningan Timur
Menteng Atas
Pasar Manggis
Setiabudi
Bukit Duri
Kebon Baru
Manggarai Selatan
Manggarai
Menteng Dalam
Tebet Barat
Tebet Timur


In [21]:
print(East_jakarta_venues.shape)
East_jakarta_venues.head()

(1097, 7)


Unnamed: 0,Kelurahan,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Cilandak Barat,-6.288289,106.796765,Total Buah Segar,-6.287626,106.795414,Farmers Market
1,Cilandak Barat,-6.288289,106.796765,Roti Bakar Wiwied,-6.289398,106.795378,Sandwich Place
2,Cilandak Barat,-6.288289,106.796765,Maxima Fitness,-6.287644,106.795485,Gym
3,Cilandak Barat,-6.288289,106.796765,Mars Kitchen,-6.287163,106.795424,Café
4,Cilandak Barat,-6.288289,106.796765,TOUS les JOURS,-6.291655,106.79973,Bakery


Let's check how many venues were returned for each neighborhood

In [22]:
East_jakarta_venues.groupby('Kelurahan').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Kelurahan,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Bangka,11,11,11,11,11,11
Bintaro,11,11,11,11,11,11
Bukit Duri,2,2,2,2,2,2
Ciganjur,4,4,4,4,4,4
Cikoko,11,11,11,11,11,11
Cilandak Barat,35,35,35,35,35,35
Cilandak Timur,9,9,9,9,9,9
Cipedak,7,7,7,7,7,7
Cipete Selatan,8,8,8,8,8,8
Cipete Utara,5,5,5,5,5,5


#### Let's find out how many unique categories can be curated from all the returned venues

In [23]:
print('There are {} unique categories.'.format(len(East_jakarta_venues['Venue Category'].unique())))

There are 177 unique categories.


## 4. Analyze Each Neighborhood

In [24]:
# one hot encoding of the venue categories
East_jakarta_onehot = pd.get_dummies(East_jakarta_venues[['Venue Category']], prefix="", prefix_sep="")
# add the neighborhood column back as the first column
East_jakarta_onehot.insert(loc=0, column='Kelurahan', value=East_jakarta_venues['Kelurahan'])
East_jakarta_onehot.shape

(1097, 178)

#### Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category

In [25]:
East_jakarta_grouped = East_jakarta_onehot.groupby('Kelurahan').mean().reset_index()
East_jakarta_grouped.head()

Unnamed: 0,Kelurahan,Acehnese Restaurant,Airport,American Restaurant,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Auto Workshop,Automotive Shop,BBQ Joint,Bakery,Balinese Restaurant,Bar,Basketball Court,Bed & Breakfast,Beer Bar,Beer Garden,Betawinese Restaurant,Bistro,Bookstore,Boutique,Breakfast Spot,Bridal Shop,Bubble Tea Shop,Buffet,Building,Burger Joint,Butcher,Café,Campground,Car Wash,Chinese Restaurant,Clothing Store,Coffee Shop,College Residence Hall,Comfort Food Restaurant,Concert Hall,Convenience Store,Cultural Center,Cupcake Shop,Dance Studio,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dog Run,Donut Shop,Dumpling Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market,Food,Food & Drink Shop,Food Court,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Fruit & Vegetable Store,Garden,Gas Station,Gastropub,General College & University,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Halal Restaurant,Hardware Store,High School,History Museum,Hobby Shop,Hospital,Hot Dog Joint,Hotel,Hotel Bar,Housing Development,Ice Cream Shop,Indian Restaurant,Indonesian Meatball Place,Indonesian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Javanese Restaurant,Juice Bar,Karaoke Bar,Kebab Restaurant,Korean Restaurant,Lake,Lighthouse,Lounge,Manadonese Restaurant,Market,Martial Arts Dojo,Massage Studio,Medical Center,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Motorcycle Shop,Movie Theater,Multiplex,Music School,Music Store,Music Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Padangnese Restaurant,Paper / Office Supplies Store,Park,Performing Arts Venue,Pet Service,Pet Store,Pharmacy,Pizza Place,Playground,Pool,Pool Hall,Pub,Ramen Restaurant,Record Shop,Residential Building (Apartment / Condo),Restaurant,Rock Club,Salad Place,Salon / Barbershop,Sandwich Place,Satay Restaurant,School,Sculpture Garden,Seafood Restaurant,Shabu-Shabu Restaurant,Shopping Mall,Skate Park,Ski Area,Snack Place,Soccer Field,Soup Place,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Steakhouse,Student Center,Sundanese Restaurant,Supermarket,Sushi Restaurant,Taco Place,Tailor Shop,Tea Room,Tech Startup,Thai Restaurant,Theme Restaurant,Track,Train Station,Travel Agency,Turkish Restaurant,Vape Store,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wings Joint,Women's Store
0,Bangka,0.0,0.0,0.0,0.0,0.090909,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.272727,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Bintaro,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Bukit Duri,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Ciganjur,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Cikoko,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.272727,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
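
Each value in this table is simply the share of a neighborhood's venues that fall into a category. Bukit Duri, for instance, has two venues in two different categories, so each category gets 0.5. The arithmetic can be sketched without pandas; the venue list below is reconstructed from the counts and frequencies shown above, and `category_frequencies` is a hypothetical helper.

```python
from collections import Counter


def category_frequencies(categories):
    """Map each category to its share of the neighborhood's venues."""
    total = len(categories)
    if total == 0:
        return {}
    return {category: count / total for category, count in Counter(categories).items()}


bukit_duri = ["Indonesian Restaurant", "Asian Restaurant"]
print(category_frequencies(bukit_duri))

# Bangka has 11 venues and a Coffee Shop frequency of 0.272727, i.e. 3 of 11
print(round(3 / 11, 6))
```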


#### Let's print each neighborhood along with the top 5 most common venues

In [26]:
num_top_venues = 5

for hood in East_jakarta_grouped['Kelurahan']:
    print("----"+hood+"----")
    temp = East_jakarta_grouped[East_jakarta_grouped['Kelurahan'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bangka----
                       venue  freq
0                Coffee Shop  0.27
1                 Restaurant  0.18
2          Convenience Store  0.09
3                     Bistro  0.09
4  Middle Eastern Restaurant  0.09


----Bintaro----
                   venue  freq
0             Food Truck  0.18
1  Indonesian Restaurant  0.18
2             Restaurant  0.09
3           Burger Joint  0.09
4       Asian Restaurant  0.09


----Bukit Duri----
                   venue  freq
0  Indonesian Restaurant   0.5
1       Asian Restaurant   0.5
2    Acehnese Restaurant   0.0
3                   Park   0.0
4            Music Venue   0.0


----Ciganjur----
                  venue  freq
0           Art Gallery  0.25
1  Fast Food Restaurant  0.25
2     Convenience Store  0.25
3              Ski Area  0.25
4   Acehnese Restaurant  0.00


----Cikoko----
                venue  freq
0         Coffee Shop  0.27
1   Convenience Store  0.09
2         Gas Station  0.09
3    Asian Restaurant  0.09
4  Chine

         venue  freq
0  Flea Market  0.25
1       Arcade  0.25
2   Restaurant  0.25
3   Food Truck  0.25
4         Park  0.00


----Petogogan----
                   venue  freq
0                 Bakery  0.12
1            Coffee Shop  0.09
2                   Café  0.09
3       Asian Restaurant  0.06
4  Indonesian Restaurant  0.06


----Petukangan Selatan----
                           venue  freq
0      Indonesian Meatball Place  0.33
1                   Soccer Field  0.33
2                   Noodle House  0.33
3            Acehnese Restaurant  0.00
4  Paper / Office Supplies Store  0.00


----Petukangan Utara----
                 venue  freq
0           Food Court   1.0
1  Acehnese Restaurant   0.0
2     Ramen Restaurant   0.0
3          Music Store   0.0
4          Music Venue   0.0


----Pondok Labu----
                  venue  freq
0         Grocery Store  0.43
1   Javanese Restaurant  0.14
2  Fast Food Restaurant  0.14
3                  Food  0.14
4            Restaurant  0.14




#### Let's put that into a *pandas* dataframe

In [27]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [28]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Kelurahan']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Kelurahan'] = East_jakarta_grouped['Kelurahan']

for ind in np.arange(East_jakarta_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(East_jakarta_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Kelurahan,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bangka,Coffee Shop,Restaurant,Pet Service,Art Gallery,Arts & Crafts Store,Middle Eastern Restaurant,Bistro,Convenience Store,Women's Store,Food
1,Bintaro,Indonesian Restaurant,Food Truck,Bakery,Restaurant,Clothing Store,Burger Joint,Motorcycle Shop,Pizza Place,Asian Restaurant,Hospital
2,Bukit Duri,Indonesian Restaurant,Asian Restaurant,Fish & Chips Shop,Fruit & Vegetable Store,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Court,Food & Drink Shop
3,Ciganjur,Fast Food Restaurant,Art Gallery,Convenience Store,Ski Area,Flea Market,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Court
4,Cikoko,Coffee Shop,Travel Agency,Asian Restaurant,Chinese Restaurant,Hardware Store,Train Station,Supermarket,Convenience Store,Gas Station,Food Stand
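
Note that rows with few venues are padded with categories whose frequency is zero: Bukit Duri has only two venues, yet its 3rd to 10th "most common" entries are filled with zero-frequency categories in whatever order the sort leaves them. One way to avoid this, sketched here as an alternative rather than the notebook's method, is to keep only categories with a non-zero frequency and break ties alphabetically. The helper `top_venues` is hypothetical.

```python
def top_venues(frequencies, num_top_venues=10):
    """Return up to num_top_venues categories with non-zero frequency.

    Sorted by descending frequency, then by name so ties are deterministic.
    """
    present = [(cat, freq) for cat, freq in frequencies.items() if freq > 0]
    present.sort(key=lambda pair: (-pair[1], pair[0]))
    return [cat for cat, _ in present[:num_top_venues]]


# A few of Bukit Duri's columns from the grouped table
bukit_duri = {
    "Acehnese Restaurant": 0.0,
    "Asian Restaurant": 0.5,
    "Fish & Chips Shop": 0.0,
    "Indonesian Restaurant": 0.5,
}
print(top_venues(bukit_duri))
```

The trade-off is that the result no longer has a fixed length, so it fits a list column or a long-format table better than ten fixed columns.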


## 5. Cluster Neighborhoods

Run *k*-means to cluster the neighborhoods into 5 clusters.

In [29]:

# set number of clusters
kclusters = 5

East_jakarta_grouped_clustering = East_jakarta_grouped.drop(columns='Kelurahan')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(East_jakarta_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
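
The first ten labels are all 0, which may mean that one cluster absorbs most neighborhoods. Counting the members of each cluster makes that easy to check. The label list below is made up for illustration; in the notebook you would pass `kmeans.labels_` instead. `cluster_sizes` is a hypothetical helper.

```python
from collections import Counter


def cluster_sizes(labels):
    """Return {cluster_label: number_of_members}, ordered by label."""
    counts = Counter(int(label) for label in labels)
    return dict(sorted(counts.items()))


example_labels = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 1, 0, 2, 0, 4]  # hypothetical
sizes = cluster_sizes(example_labels)
print(sizes)
```

If one cluster holds nearly every neighborhood and the others hold one each, the remaining clusters are effectively outliers, and it may be worth trying a different number of clusters or revisiting the sparse neighborhoods.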

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [30]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

East_jakarta_merged = East_jakarta_data

# join neighborhoods_venues_sorted onto East_jakarta_data to add the cluster label and top venues for each neighborhood
East_jakarta_merged = East_jakarta_merged.join(neighborhoods_venues_sorted.set_index('Kelurahan'), on='Kelurahan')

East_jakarta_merged.head()

Unnamed: 0,PostCode,Kecamatan,Kelurahan,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,12430,Cilandak,Cilandak Barat,-6.288289,106.796765,0,Bookstore,Fast Food Restaurant,Indonesian Restaurant,Pizza Place,Café,Restaurant,Student Center,Coffee Shop,Clothing Store,Massage Studio
1,12410,Cilandak,Cipete Selatan,-6.271827,106.804876,0,Indonesian Restaurant,Auto Workshop,Coffee Shop,Food Court,Café,Music Venue,Spa,Dessert Shop,Hospital,Indian Restaurant
2,12420,Cilandak,Gandaria Selatan,-6.272557,106.794845,0,Food Truck,Indonesian Restaurant,Café,American Restaurant,Fried Chicken Joint,Karaoke Bar,Motorcycle Shop,Food & Drink Shop,Garden,Fruit & Vegetable Store
3,12440,Cilandak,Lebak Bulus,-6.301837,106.779642,0,Food Truck,Soup Place,Indonesian Restaurant,Café,Asian Restaurant,Bubble Tea Shop,Shabu-Shabu Restaurant,Chinese Restaurant,Supermarket,Sushi Restaurant
4,12450,Cilandak,Pondok Labu,-6.308832,106.797495,0,Grocery Store,Javanese Restaurant,Restaurant,Food,Fast Food Restaurant,Women's Store,Fish & Chips Shop,French Restaurant,Food Truck,Food Stand
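
The join matches rows on `Kelurahan`. A neighborhood with no row in `neighborhoods_venues_sorted` (for example, one for which no venues were returned) would get NaN in `Cluster Labels`, which also turns the column into floats. The sketch below shows one way to handle that case. The frames `toy_data` and `toy_sorted`, the neighborhood names, and the coordinates are all hypothetical.

```python
import pandas as pd

# Hypothetical location table; 'Kelurahan C' has no venue data
toy_data = pd.DataFrame({
    'Kelurahan': ['Kelurahan A', 'Kelurahan B', 'Kelurahan C'],
    'Latitude': [-6.26, -6.27, -6.28],
    'Longitude': [106.81, 106.76, 106.80],
})
# Hypothetical clustered venue table
toy_sorted = pd.DataFrame({
    'Cluster Labels': [0, 1],
    'Kelurahan': ['Kelurahan A', 'Kelurahan B'],
    '1st Most Common Venue': ['Coffee Shop', 'Indonesian Restaurant'],
})

# Left join on the neighborhood name, as in the notebook cell
toy_merged = toy_data.join(toy_sorted.set_index('Kelurahan'), on='Kelurahan')

# Rows without venue data have NaN labels; drop them and restore integer labels
toy_merged = toy_merged.dropna(subset=['Cluster Labels'])
toy_merged['Cluster Labels'] = toy_merged['Cluster Labels'].astype(int)
print(toy_merged)
```

Integer labels matter later, because the map cell uses the label as a list index.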


In [31]:
neighborhoods_venues_sorted.head()

Unnamed: 0,Cluster Labels,Kelurahan,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,0,Bangka,Coffee Shop,Restaurant,Pet Service,Art Gallery,Arts & Crafts Store,Middle Eastern Restaurant,Bistro,Convenience Store,Women's Store,Food
1,0,Bintaro,Indonesian Restaurant,Food Truck,Bakery,Restaurant,Clothing Store,Burger Joint,Motorcycle Shop,Pizza Place,Asian Restaurant,Hospital
2,0,Bukit Duri,Indonesian Restaurant,Asian Restaurant,Fish & Chips Shop,Fruit & Vegetable Store,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Court,Food & Drink Shop
3,0,Ciganjur,Fast Food Restaurant,Art Gallery,Convenience Store,Ski Area,Flea Market,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Court
4,0,Cikoko,Coffee Shop,Travel Agency,Asian Restaurant,Chinese Restaurant,Hardware Store,Train Station,Supermarket,Convenience Store,Gas Station,Food Stand


Finally, let's visualize the resulting clusters on a map.

In [32]:
latitude = -6.260838
longitude = 106.820788
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(East_jakarta_merged['Latitude'], East_jakarta_merged['Longitude'], East_jakarta_merged['Kelurahan'], East_jakarta_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
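
In the cell above, the `ys` list is used only for its length, and `rainbow[cluster-1]` shifts every label by one position, so cluster 0 takes the last color. A simpler sketch of the color lookup is given here, assuming five clusters; `n_clusters`, `palette` and `color_for` are illustrative names that do not appear in the notebook.

```python
import numpy as np
from matplotlib import cm, colors

n_clusters = 5  # same value as kclusters in the notebook

# One evenly spaced rainbow color per cluster label
palette = [colors.to_hex(c) for c in cm.rainbow(np.linspace(0, 1, n_clusters))]

def color_for(cluster_label):
    """Return the hex color for an integer-like cluster label."""
    return palette[int(cluster_label)]

print(color_for(0), color_for(4))
```

The `int()` conversion also lets the lookup accept labels stored as floats, which can happen after a join.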

## 6. Examine Clusters
Neighborhoods that have Coffee Shop or Café among their 10 most common venue categories are eliminated, to avoid competition with existing coffee shops.

In [33]:
One = neighborhoods_venues_sorted[neighborhoods_venues_sorted["1st Most Common Venue"].apply(lambda x:x not in ['Coffee Shop','Café'])]
two = One[One["2nd Most Common Venue"].apply(lambda x:x not in ['Coffee Shop','Café'])]
three = two[two["3rd Most Common Venue"].apply(lambda x:x not in ['Coffee Shop','Café'])]
four = three[three["4th Most Common Venue"].apply(lambda x:x not in ['Coffee Shop','Café'])]
five = four[four["5th Most Common Venue"].apply(lambda x:x not in ['Coffee Shop','Café'])]
six = five[five["6th Most Common Venue"].apply(lambda x:x not in ['Coffee Shop','Café'])]
seven = six[six["7th Most Common Venue"].apply(lambda x:x not in ['Coffee Shop','Café'])]
eight = seven[seven["8th Most Common Venue"].apply(lambda x:x not in ['Coffee Shop','Café'])]
nine = eight[eight["9th Most Common Venue"].apply(lambda x:x not in ['Coffee Shop','Café'])]
Location_Recomendation = nine[nine["10th Most Common Venue"].apply(lambda x:x not in ['Coffee Shop','Café'])]
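
The ten chained filters can also be written as a single vectorized test across all ranked venue columns. The sketch below assumes a miniature, made-up version of `neighborhoods_venues_sorted` with only three ranks; `toy_sorted`, `excluded`, `venue_columns` and `toy_recommendation` are hypothetical names.

```python
import pandas as pd

# Hypothetical miniature of neighborhoods_venues_sorted (three ranks only)
toy_sorted = pd.DataFrame({
    'Cluster Labels': [0, 0, 1],
    'Kelurahan': ['A', 'B', 'C'],
    '1st Most Common Venue': ['Coffee Shop', 'Bakery', 'Noodle House'],
    '2nd Most Common Venue': ['Restaurant', 'Food Truck', 'Café'],
    '3rd Most Common Venue': ['Pet Service', 'Hospital', 'Dessert Shop'],
})

excluded = ['Coffee Shop', 'Café']
venue_columns = [c for c in toy_sorted.columns if c.endswith('Most Common Venue')]

# True for rows where any ranked venue is a coffee shop or café
has_coffee = toy_sorted[venue_columns].isin(excluded).any(axis=1)
toy_recommendation = toy_sorted[~has_coffee]
print(toy_recommendation['Kelurahan'].tolist())  # → ['B']
```

This form keeps working if the number of ranked columns changes, since the columns are selected by name suffix rather than listed one by one.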

### Cluster 1

In [34]:
Location_Recomendation.loc[Location_Recomendation['Cluster Labels'] == 0]

Unnamed: 0,Cluster Labels,Kelurahan,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,0,Bintaro,Indonesian Restaurant,Food Truck,Bakery,Restaurant,Clothing Store,Burger Joint,Motorcycle Shop,Pizza Place,Asian Restaurant,Hospital
2,0,Bukit Duri,Indonesian Restaurant,Asian Restaurant,Fish & Chips Shop,Fruit & Vegetable Store,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Court,Food & Drink Shop
3,0,Ciganjur,Fast Food Restaurant,Art Gallery,Convenience Store,Ski Area,Flea Market,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Court
7,0,Cipedak,Grocery Store,Department Store,Pharmacy,Burger Joint,Asian Restaurant,Farmers Market,Food,Fried Chicken Joint,French Restaurant,Food Truck
9,0,Cipete Utara,Dessert Shop,Playground,Food Truck,Snack Place,Bakery,Flea Market,Fried Chicken Joint,French Restaurant,Food Stand,Food Court
10,0,Cipulir,Asian Restaurant,Indonesian Restaurant,Hotel,Restaurant,Hobby Shop,Sporting Goods Shop,Department Store,Fried Chicken Joint,Food Truck,Food Stand
14,0,Grogol Selatan,Indonesian Meatball Place,Soup Place,Garden,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Court,Food & Drink Shop,Food
19,0,Jati Padang,Indonesian Restaurant,Asian Restaurant,Music Store,Food Truck,High School,Fast Food Restaurant,Convenience Store,Breakfast Spot,Japanese Restaurant,Event Space
26,0,Kebayoran Lama Utara,Donut Shop,Art Gallery,Golf Course,Music Venue,Tailor Shop,Women's Store,Food,Fried Chicken Joint,French Restaurant,Food Truck
27,0,Kebon Baru,Indonesian Meatball Place,Housing Development,Noodle House,Grocery Store,Indonesian Restaurant,Hotel Bar,Hotel,Food Truck,Food Stand,Food Court


### Cluster 2

In [35]:
Location_Recomendation.loc[Location_Recomendation['Cluster Labels'] == 1]

Unnamed: 0,Cluster Labels,Kelurahan,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
15,1,Grogol Utara,Noodle House,Women's Store,Dessert Shop,Fruit & Vegetable Store,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Court,Food & Drink Shop


### Cluster 3

In [36]:
Location_Recomendation.loc[Location_Recomendation['Cluster Labels'] == 2]

Unnamed: 0,Cluster Labels,Kelurahan,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
45,2,Pengadegan,BBQ Joint,Food Truck,Restaurant,Car Wash,Women's Store,Fried Chicken Joint,French Restaurant,Food Stand,Food Court,Food & Drink Shop
46,2,Pesanggrahan,Flea Market,Arcade,Restaurant,Food Truck,Women's Store,Fruit & Vegetable Store,Fried Chicken Joint,French Restaurant,Food Stand,Food Court


### Cluster 4

In [37]:
Location_Recomendation.loc[Location_Recomendation['Cluster Labels'] == 3]

Unnamed: 0,Cluster Labels,Kelurahan,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
49,3,Petukangan Utara,Food Court,Women's Store,Indonesian Restaurant,Fruit & Vegetable Store,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food & Drink Shop,Food


### Cluster 5

In [38]:
Location_Recomendation.loc[Location_Recomendation['Cluster Labels'] == 4]

Unnamed: 0,Cluster Labels,Kelurahan,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
18,4,Jagakarsa,Pharmacy,Acehnese Restaurant,Convenience Store,Fish & Chips Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Court,Food & Drink Shop
53,4,Ragunan,Pharmacy,Campground,Women's Store,Flea Market,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Court,Food & Drink Shop


## 7. Results and Discussion
The purpose of this project is to help prospective or existing coffee shop owners choose an area for a new shop by comparing how prominent coffee shops already are in each area. For a first coffee shop, the most suitable areas are in clusters 3, 4 or 5. Venue categories that may also have coffee on the menu, such as Food Truck, Food Stand and Food Court, are still few there and mostly rank lower than the 5th Most Common Venue, so indirect competition is limited. Clusters 1 and 2 can be considered for opening additional branches or franchises.

## 8. Conclusion
This project gives a better understanding of the environment when choosing the most suitable place to open a coffee shop. Future work includes considering other factors, such as the cost of renting a place, the price of land for a new coffee shop, and the occupations and salaries of people in the area, in order to determine more accurately the price at which coffee should be sold.