<a href="https://colab.research.google.com/github/somau24/Coursera_Capstone/blob/main/The%20Battle%20of%20Neighborhoods.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Capstone Project - The Battle of the Neighborhoods (Week 2)**

### Applied Data Science Capstone by IBM/Coursera

## **Introduction: Business Problem**

In this project we identify the most common venue categories in each district and then use these category profiles to group the districts into clusters.
This report is aimed at stakeholders interested in opening a venue in **Ubud, Bali, Indonesia.**


## **Data**

Ubud has a total of 8 districts. To segment and explore them, we first need a dataset that lists the districts along with their coordinates.
Let's start with an empty dataframe.

In [None]:
import pandas as pd

column_names = ['Village', 'Latitude', 'Longitude']
ubud_region=pd.DataFrame(columns=column_names)
ubud_region

Unnamed: 0,Village,Latitude,Longitude


Because a ready-made dataset of Ubud's districts is difficult to find, we enter the district names manually and look up their latitude and longitude with the geopy library.

In [None]:
from geopy.geocoders import Nominatim
village = ['Kedewatan, Ubud', 'Petulu, Ubud', 'Lodtunduh, Ubud', 'Mas, Ubud', 'Peliatan, Ubud', 'Sayan, Ubud', 'Singakerta, Ubud', 'Ubud']
latitudes = []
longitudes = []

geolocator = Nominatim(user_agent="ubud_explorer")
for vil in village:
  location = geolocator.geocode(vil)
  latitude = location.latitude
  longitude = location.longitude
  latitudes.append(latitude)
  longitudes.append(longitude)

# build the dataframe in one step (DataFrame.append was removed in pandas 2.0)
ubud_region = pd.DataFrame({'Village': village,
                            'Latitude': latitudes,
                            'Longitude': longitudes})
ubud_region

Unnamed: 0,Village,Latitude,Longitude
0,"Kedewatan, Ubud",-8.484786,115.245827
1,"Petulu, Ubud",-8.477163,115.276407
2,"Lodtunduh, Ubud",-8.552164,115.26083
3,"Mas, Ubud",-8.545034,115.273189
4,"Peliatan, Ubud",-8.518865,115.269199
5,"Sayan, Ubud",-8.512834,115.240697
6,"Singakerta, Ubud",-8.5267,115.244694
7,Ubud,-8.506898,115.262293
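
Nominatim's `geocode` returns `None` when it finds no match, in which case reading `location.latitude` in the loop above would raise an `AttributeError`. The sketch below shows one possible guard. The helper `geocode_places` is hypothetical and accepts any geocoding callable; a small offline stub stands in for `geolocator.geocode` so the example runs without network access.

```python
from collections import namedtuple

# Stand-in for the object geopy returns; only the two attributes used here.
Location = namedtuple("Location", ["latitude", "longitude"])

def geocode_places(names, geocode):
    """Geocode each name, collecting misses instead of failing on them."""
    rows, missing = [], []
    for name in names:
        location = geocode(name)
        if location is None:          # no match for this query
            missing.append(name)
            continue
        rows.append({"Village": name,
                     "Latitude": location.latitude,
                     "Longitude": location.longitude})
    return rows, missing

# Offline stub with one known place (coordinates taken from the table above);
# "Nowhere, Ubud" is a made-up name used to trigger the miss branch.
known = {"Mas, Ubud": Location(-8.545034, 115.273189)}
rows, missing = geocode_places(["Mas, Ubud", "Nowhere, Ubud"], known.get)
print(rows)
print(missing)
```

In the notebook, `geolocator.geocode` could be passed as the `geocode` argument, and `pd.DataFrame(rows)` would give the same three columns as `ubud_region`.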


Create a map of Ubud from the latitude and longitude values and mark the center of each district.

In [None]:
import folium
# latitude/longitude still hold the last geocoded entry ('Ubud'), so the map is centered there
map_ubud = folium.Map(location=[latitude, longitude], zoom_start=15)

#add marker to map
for lat,lng,lab in zip(ubud_region['Latitude'],ubud_region['Longitude'],ubud_region['Village']):
  label = folium.Popup(lab,parse_html=True)
  folium.CircleMarker(
      [lat,lng],
      radius=50,
      popup=label,
      color='blue',
      fill=True,
      fill_color='#3186cc',
      fill_opacity=0.7,
      parse_html=False).add_to(map_ubud)
map_ubud

## **Foursquare**

Define Foursquare Credentials and Version

In [None]:
CLIENT_ID='SABSE4FI5VIFEJXGFMPNAKOOQOQU1CLTVHER2XQJ3PKCGNVO'
CLIENT_SECRET='SSOGVVKX2D0ZBX0O0DEYCPFTDLT2B4ODBWJJBGYZGBJGKSYHB'
VERSION='20180605'
LIMIT=500

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: SABSE4FI5VIFEJXGFMPNAKOOQOQU1CLTVHER2XQJ3PKCGNVO
CLIENT_SECRET:SSOGVVKX2D0ZBX0O0DEYCPFTDLT2B4ODBWJJBGYZGBJGKSYHB
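
Hard-coding credentials in a shared notebook exposes them to every reader. One common alternative is to read them from environment variables. The sketch below assumes hypothetical variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and hypothetical helpers `load_credentials` and `mask`; none of these come from the original notebook.

```python
import os

def load_credentials(env=os.environ):
    """Read the Foursquare credentials from an environment-like mapping."""
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None

def mask(secret, visible=4):
    """Show only the first few characters when printing a secret."""
    return secret[:visible] + "*" * (len(secret) - visible)

# Demo with a plain dict standing in for os.environ (placeholder values).
demo_env = {"FOURSQUARE_CLIENT_ID": "demo-id-123",
            "FOURSQUARE_CLIENT_SECRET": "demo-secret-456"}
client_id, client_secret = load_credentials(demo_env)
print("CLIENT_ID: " + mask(client_id))
print("CLIENT_SECRET: " + mask(client_secret))
```

Calling `load_credentials()` with no argument reads from the real environment, so the secret never has to appear in the notebook or its output.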


Let's create a function that gets the top venues within a radius of 1 kilometer of each district center in Ubud.

In [None]:
def getNearbyVenues(names, latitudes, longitudes, radius=1000):

  venues_list=[]
  for name, lat, lng in zip(names, latitudes, longitudes):
    print(name)

    #create the API request URL
    url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
    
    #make the GET request
    results = requests.get(url).json()['response']['groups'][0]['items']

    #return only relevant information for each nearby venue
    venues_list.append([(
                       name,
                       lat,
                       lng,
                       v['venue']['name'],
                       v['venue']['location']['lat'],
                       v['venue']['location']['lng'],
                       v['venue']['categories'][0]['name']) for v in results])
  
  # flatten the per-village lists into one table
  nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
  nearby_venues.columns=['Village',
                         'Village Latitude',
                         'Village Longitude',
                         'Venue',
                         'Venue Latitude',
                         'Venue Longitude',
                         'Venue Category']

  return nearby_venues
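
The list comprehension in `getNearbyVenues` assumes that every venue carries at least one category and that the response always contains a `groups` entry. Below is a sketch of a more defensive parser, assuming the same JSON layout that the function above indexes into (`response`, then `groups[0]`, then `items`, then `venue`). The helper name `parse_explore_items` and the `'Uncategorized'` fallback are illustrative choices, not part of the original notebook.

```python
def parse_explore_items(payload, name, lat, lng):
    """Turn one explore-style JSON payload into rows for a single village."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else "Uncategorized"
        rows.append((name, lat, lng,
                     venue["name"],
                     venue["location"]["lat"],
                     venue["location"]["lng"],
                     category))
    return rows

# Hand-written payload in the assumed layout. The second venue is made up
# and has no category, to exercise the fallback.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Mandapa Spa",
               "location": {"lat": -8.485301, "lng": 115.243481},
               "categories": [{"name": "Spa"}]}},
    {"venue": {"name": "Unnamed Place",
               "location": {"lat": -8.4850, "lng": 115.2440},
               "categories": []}},
]}]}}

rows = parse_explore_items(sample, "Kedewatan, Ubud", -8.484786, 115.245827)
for row in rows:
    print(row)

# A response without any groups yields an empty list instead of an error.
print(parse_explore_items({"response": {}}, "Kedewatan, Ubud", -8.484786, 115.245827))
```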

In [None]:
import requests
#create a new dataframe called ubud_venues
ubud_venues = getNearbyVenues(names=ubud_region['Village'],
                                   latitudes=ubud_region['Latitude'],
                                   longitudes=ubud_region['Longitude']
                                  )

Kedewatan, Ubud
Petulu, Ubud
Lodtunduh, Ubud
Mas, Ubud
Peliatan, Ubud
Sayan, Ubud
Singakerta, Ubud
Ubud


In [None]:
#check the size of the resulting dataframe
print(ubud_venues.shape)
ubud_venues.head()

(260, 7)


Unnamed: 0,Village,Village Latitude,Village Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Kedewatan, Ubud",-8.484786,115.245827,"Mandapa, a Ritz-Carlton Reserve",-8.485558,115.243921,Resort
1,"Kedewatan, Ubud",-8.484786,115.245827,Nasi Ayam Kedewatan Ibu Mangku,-8.483584,115.246516,Indonesian Restaurant
2,"Kedewatan, Ubud",-8.484786,115.245827,Mandapa Spa,-8.485301,115.243481,Spa
3,"Kedewatan, Ubud",-8.484786,115.245827,Kubu,-8.487002,115.24242,Modern European Restaurant
4,"Kedewatan, Ubud",-8.484786,115.245827,Sawah Terrace,-8.486121,115.243188,Theme Restaurant


In [None]:
#check how many venues were returned for each district
ubud_venues.groupby('Village').count()

Village,Village Latitude,Village Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"Kedewatan, Ubud",33,33,33,33,33,33
"Lodtunduh, Ubud",6,6,6,6,6,6
"Mas, Ubud",14,14,14,14,14,14
"Peliatan, Ubud",82,82,82,82,82,82
"Petulu, Ubud",10,10,10,10,10,10
"Sayan, Ubud",10,10,10,10,10,10
"Singakerta, Ubud",5,5,5,5,5,5
Ubud,100,100,100,100,100,100


In [None]:
#check how many unique categories can be curated from all the returned venues
print('There are {} unique categories'.format(len(ubud_venues['Venue Category'].unique())))

There are 71 unique categories


## **Methodology**

In this project we limit our analysis to the area within ~1 km of each district center.

In the first step we collected the required data: the location and category of the venues Foursquare returns within 1 km of each district center in Ubud.

The second step is an exploration of the **top 10 venue categories** in each district. This exploration can serve as a reference for stakeholders who are interested in opening a venue in Ubud.

In the third and final step we group the districts into clusters.
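
Because venues are gathered within a fixed radius of each district center, it can help to check how far apart two coordinates actually are. The sketch below uses the haversine formula with a mean Earth radius of 6,371 km (an assumption of this sketch; the notebook itself relies on Foursquare's `radius` parameter) and coordinates taken from the tables above. The function name `haversine_m` is illustrative.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (sketch assumption)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

kedewatan = (-8.484786, 115.245827)   # district center
mandapa = (-8.485558, 115.243921)     # first venue listed for Kedewatan
peliatan = (-8.518865, 115.269199)    # district center
ubud = (-8.506898, 115.262293)        # district center

venue_distance = haversine_m(*kedewatan, *mandapa)
center_distance = haversine_m(*peliatan, *ubud)

print(f"Venue to center: {venue_distance:.0f} m, inside 1 km: {venue_distance <= 1000}")
# Two 1 km circles overlap when their centers are less than 2 km apart.
print(f"Peliatan to Ubud centers: {center_distance:.0f} m, overlap: {center_distance < 2000}")
```

When two search circles overlap, the same venue can be returned for both districts, which is worth keeping in mind when comparing their venue profiles.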

## Analyze Each District

In [None]:
# one hot encoding
ubud_onehot = pd.get_dummies(ubud_venues[['Venue Category']], prefix="", prefix_sep="")

# add village column back to dataframe
ubud_onehot['Village'] = ubud_venues['Village'] 

# move village column to the first column
fixed_columns = [ubud_onehot.columns[-1]] + list(ubud_onehot.columns[:-1])
ubud_onehot = ubud_onehot[fixed_columns]

ubud_onehot.head()

Unnamed: 0,Village,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bakery,Balinese Restaurant,Bar,Beach Bar,Bed & Breakfast,Bistro,Board Shop,Burger Joint,Café,Castle,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Cosmetics Shop,Creperie,Cupcake Shop,Dessert Shop,Diner,Electronics Store,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Garden Center,Gift Shop,Greek Restaurant,Grocery Store,Health Food Store,History Museum,Hostel,Hotel,Ice Cream Shop,Indian Restaurant,Indonesian Restaurant,Internet Cafe,Italian Restaurant,Juice Bar,Massage Studio,Mexican Restaurant,Modern European Restaurant,Motel,Music Venue,New American Restaurant,Peruvian Restaurant,Pizza Place,Playground,Rafting,Resort,Restaurant,River,Seafood Restaurant,Spa,Supermarket,Taco Place,Tapas Restaurant,Thai Restaurant,Theme Restaurant,Tourist Information Center,Trade School,Trail,Vegetarian / Vegan Restaurant,Yoga Studio
0,"Kedewatan, Ubud",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"Kedewatan, Ubud",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,"Kedewatan, Ubud",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
3,"Kedewatan, Ubud",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,"Kedewatan, Ubud",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0


In [None]:
#the new dataframe size
ubud_onehot.shape

(260, 72)

In [None]:
#group rows by district and take the mean of each category column (its frequency of occurrence)
ubud_groupped=ubud_onehot.groupby('Village').mean().reset_index()
ubud_groupped

Unnamed: 0,Village,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bakery,Balinese Restaurant,Bar,Beach Bar,Bed & Breakfast,Bistro,Board Shop,Burger Joint,Café,Castle,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Cosmetics Shop,Creperie,Cupcake Shop,Dessert Shop,Diner,Electronics Store,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Garden Center,Gift Shop,Greek Restaurant,Grocery Store,Health Food Store,History Museum,Hostel,Hotel,Ice Cream Shop,Indian Restaurant,Indonesian Restaurant,Internet Cafe,Italian Restaurant,Juice Bar,Massage Studio,Mexican Restaurant,Modern European Restaurant,Motel,Music Venue,New American Restaurant,Peruvian Restaurant,Pizza Place,Playground,Rafting,Resort,Restaurant,River,Seafood Restaurant,Spa,Supermarket,Taco Place,Tapas Restaurant,Thai Restaurant,Theme Restaurant,Tourist Information Center,Trade School,Trail,Vegetarian / Vegan Restaurant,Yoga Studio
0,"Kedewatan, Ubud",0.030303,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.030303,0.0,0.0,0.060606,0.0,0.030303,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.060606,0.030303,0.0,0.0,0.0,0.0,0.0,0.121212,0.212121,0.030303,0.030303,0.0,0.060606,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0
1,"Lodtunduh, Ubud",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Mas, Ubud",0.0,0.0,0.214286,0.0,0.071429,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.142857,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Peliatan, Ubud",0.0,0.012195,0.02439,0.0,0.0,0.060976,0.012195,0.02439,0.02439,0.0,0.0,0.012195,0.012195,0.0,0.0,0.097561,0.012195,0.012195,0.0,0.0,0.036585,0.0,0.0,0.012195,0.0,0.0,0.0,0.012195,0.0,0.012195,0.0,0.012195,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.085366,0.0,0.02439,0.109756,0.0,0.0,0.0,0.012195,0.012195,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.097561,0.04878,0.0,0.012195,0.04878,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.097561,0.012195
4,"Petulu, Ubud",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0
5,"Sayan, Ubud",0.0,0.0,0.0,0.0,0.1,0.2,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0
6,"Singakerta, Ubud",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Ubud,0.0,0.0,0.0,0.01,0.0,0.02,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.0,0.01,0.08,0.01,0.0,0.01,0.02,0.05,0.01,0.01,0.0,0.01,0.01,0.01,0.0,0.01,0.0,0.0,0.02,0.0,0.01,0.0,0.01,0.01,0.0,0.02,0.1,0.01,0.0,0.06,0.0,0.03,0.01,0.0,0.02,0.01,0.0,0.01,0.01,0.01,0.01,0.0,0.0,0.11,0.03,0.0,0.0,0.03,0.0,0.01,0.01,0.01,0.0,0.01,0.01,0.01,0.05,0.02


In [None]:
#the size
ubud_groupped.shape

(8, 72)
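
Taking the mean of the one-hot columns per village gives the share of that village's venues that fall in each category. Here is a plain-Python sketch of the same calculation on a small made-up sample (the venue list is illustrative, not taken from the Foursquare results):

```python
from collections import Counter, defaultdict

# Illustrative (village, category) pairs; not real query results.
venues = [
    ("A", "Resort"), ("A", "Resort"), ("A", "Spa"), ("A", "Hotel"),
    ("B", "Coffee Shop"), ("B", "Hotel"),
]

# Count categories separately for each village.
counts = defaultdict(Counter)
for village, category in venues:
    counts[village][category] += 1

# Share of each category within its village, equivalent to the mean of a 0/1 column.
frequencies = {
    village: {cat: n / sum(cat_counts.values()) for cat, n in cat_counts.items()}
    for village, cat_counts in counts.items()
}
print(frequencies["A"])
print(frequencies["B"])
```

Read this way, the 0.212121 for Resort in Kedewatan corresponds to 7 of its 33 venues (7 / 33 ≈ 0.2121).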

In [None]:
#print each district along with the top 5 most common venues
num_top_venues = 5

for hood in ubud_groupped['Village']:
    print("----"+hood+"----")
    temp = ubud_groupped[ubud_groupped['Village'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Kedewatan, Ubud----
                        venue  freq
0                      Resort  0.21
1                     Rafting  0.12
2       Indonesian Restaurant  0.09
3                       Hotel  0.09
4  Modern European Restaurant  0.06


----Lodtunduh, Ubud----
                   venue  freq
0            Coffee Shop  0.33
1                  Hotel  0.17
2                 Resort  0.17
3  Indonesian Restaurant  0.17
4             Restaurant  0.17


----Mas, Ubud----
                   venue  freq
0            Art Gallery  0.21
1                 Resort  0.14
2             Restaurant  0.14
3  Indonesian Restaurant  0.14
4         History Museum  0.07


----Peliatan, Ubud----
                           venue  freq
0          Indonesian Restaurant  0.11
1                         Resort  0.10
2  Vegetarian / Vegan Restaurant  0.10
3                           Café  0.10
4                          Hotel  0.09


----Petulu, Ubud----
                   venue  freq
0                  Hotel   0.

In [None]:
#function to sort the venues in descending order
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [None]:
#create the new dataframe and display the top 10 venues for each district
import numpy as np
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Village']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # ranks beyond 3rd use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Village'] = ubud_groupped['Village']

for ind in np.arange(ubud_groupped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(ubud_groupped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Village,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Kedewatan, Ubud",Resort,Rafting,Hotel,Indonesian Restaurant,Modern European Restaurant,Bed & Breakfast,Spa,American Restaurant,Restaurant,Diner
1,"Lodtunduh, Ubud",Coffee Shop,Indonesian Restaurant,Resort,Hotel,Restaurant,Cupcake Shop,Cocktail Bar,Comfort Food Restaurant,Cosmetics Shop,Creperie
2,"Mas, Ubud",Art Gallery,Restaurant,Indonesian Restaurant,Resort,Food Truck,Arts & Crafts Store,Asian Restaurant,History Museum,Internet Cafe,Yoga Studio
3,"Peliatan, Ubud",Indonesian Restaurant,Café,Resort,Vegetarian / Vegan Restaurant,Hotel,Asian Restaurant,Restaurant,Spa,Coffee Shop,Bakery
4,"Petulu, Ubud",Indonesian Restaurant,Hotel,Resort,Balinese Restaurant,Trail,Garden Center,Playground,Coffee Shop,Comfort Food Restaurant,Cosmetics Shop
5,"Sayan, Ubud",Asian Restaurant,Hotel,Resort,Balinese Restaurant,Indonesian Restaurant,Arts & Crafts Store,Thai Restaurant,Coffee Shop,Comfort Food Restaurant,Cosmetics Shop
6,"Singakerta, Ubud",Supermarket,Indonesian Restaurant,Coffee Shop,Restaurant,Cocktail Bar,Comfort Food Restaurant,Cosmetics Shop,Creperie,Cupcake Shop,Yoga Studio
7,Ubud,Resort,Hotel,Café,Indonesian Restaurant,Vegetarian / Vegan Restaurant,Coffee Shop,Restaurant,Italian Restaurant,Spa,Hostel
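
Villages with few venues have fewer than ten distinct categories, so the remaining ranks are filled with categories whose frequency is zero. Lodtunduh, for example, has only five categories with a nonzero frequency in the grouped table, so its 6th to 10th entries do not correspond to any venue. The sketch below shows one way to keep only categories that actually occur. `top_categories` is a hypothetical helper, and the dictionary holds the Lodtunduh frequencies from the grouped table plus three of its zero-frequency categories.

```python
def top_categories(freqs, n=10):
    """Return up to n (category, frequency) pairs, skipping zero-frequency ones."""
    present = [(cat, f) for cat, f in freqs.items() if f > 0]
    # Sort by frequency (descending), then by name so ties are ordered predictably.
    present.sort(key=lambda pair: (-pair[1], pair[0]))
    return present[:n]

lodtunduh = {
    "Coffee Shop": 0.333333, "Hotel": 0.166667, "Indonesian Restaurant": 0.166667,
    "Resort": 0.166667, "Restaurant": 0.166667,
    "Cupcake Shop": 0.0, "Cocktail Bar": 0.0, "Creperie": 0.0,
}

for rank, (category, freq) in enumerate(top_categories(lodtunduh), start=1):
    print(rank, category, freq)
```

With this filter, a village's list simply ends once its observed categories run out, rather than being padded to ten entries.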


## Cluster Districts

In [None]:
#cluster the districts into 4 clusters
from sklearn.cluster import KMeans
# set number of clusters
kclusters = 4

# drop the label column by name (the positional axis argument was removed in pandas 2.0)
ubud_grouped_clustering = ubud_groupped.drop(columns='Village')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(ubud_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 0, 2, 1, 1, 1, 3, 1], dtype=int32)
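
With only eight villages and four clusters, it is worth checking how many villages land in each cluster. A small sketch in plain Python, using the label array printed above:

```python
from collections import Counter

labels = [1, 0, 2, 1, 1, 1, 3, 1]   # copied from kmeans.labels_ above

# Number of villages per cluster label.
sizes = Counter(labels)
# Labels that were assigned to exactly one village.
singletons = sorted(label for label, size in sizes.items() if size == 1)

for label in sorted(sizes):
    print(f"label {label}: {sizes[label]} village(s)")
print("single-village clusters:", singletons)
```

Five villages share label 1 and each of the other three labels holds a single village, so most of the grouping is carried by one cluster.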

Create a new dataframe that includes the cluster label as well as the top 10 venues for each district.

In [None]:
#neighborhoods_venues_sorted.drop('Cluster Labels', axis=1, inplace=True)

In [None]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

ubud_merged = ubud_region

# join the sorted venues onto ubud_region to add latitude/longitude for each village
ubud_merged = ubud_merged.join(neighborhoods_venues_sorted.set_index('Village'), on='Village')

ubud_merged.head() # check the last columns!

Unnamed: 0,Village,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Kedewatan, Ubud",-8.484786,115.245827,1,Resort,Rafting,Hotel,Indonesian Restaurant,Modern European Restaurant,Bed & Breakfast,Spa,American Restaurant,Restaurant,Diner
1,"Petulu, Ubud",-8.477163,115.276407,1,Indonesian Restaurant,Hotel,Resort,Balinese Restaurant,Trail,Garden Center,Playground,Coffee Shop,Comfort Food Restaurant,Cosmetics Shop
2,"Lodtunduh, Ubud",-8.552164,115.26083,0,Coffee Shop,Indonesian Restaurant,Resort,Hotel,Restaurant,Cupcake Shop,Cocktail Bar,Comfort Food Restaurant,Cosmetics Shop,Creperie
3,"Mas, Ubud",-8.545034,115.273189,2,Art Gallery,Restaurant,Indonesian Restaurant,Resort,Food Truck,Arts & Crafts Store,Asian Restaurant,History Museum,Internet Cafe,Yoga Studio
4,"Peliatan, Ubud",-8.518865,115.269199,1,Indonesian Restaurant,Café,Resort,Vegetarian / Vegan Restaurant,Hotel,Asian Restaurant,Restaurant,Spa,Coffee Shop,Bakery


### Examine each Cluster

Cluster 1

In [None]:
# show the village name (column 0) and all ten venue columns (column 4 onward)
ubud_merged.loc[ubud_merged['Cluster Labels'] == 0, ubud_merged.columns[[0] + list(range(4, ubud_merged.shape[1]))]]

Unnamed: 0,Village,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,"Lodtunduh, Ubud",Coffee Shop,Indonesian Restaurant,Resort,Hotel,Restaurant,Cupcake Shop,Cocktail Bar,Comfort Food Restaurant,Cosmetics Shop,Creperie


Cluster 2

In [None]:
# show the village name (column 0) and all ten venue columns (column 4 onward)
ubud_merged.loc[ubud_merged['Cluster Labels'] == 1, ubud_merged.columns[[0] + list(range(4, ubud_merged.shape[1]))]]

Unnamed: 0,Village,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Kedewatan, Ubud",Resort,Rafting,Hotel,Indonesian Restaurant,Modern European Restaurant,Bed & Breakfast,Spa,American Restaurant,Restaurant,Diner
1,"Petulu, Ubud",Indonesian Restaurant,Hotel,Resort,Balinese Restaurant,Trail,Garden Center,Playground,Coffee Shop,Comfort Food Restaurant,Cosmetics Shop
4,"Peliatan, Ubud",Indonesian Restaurant,Café,Resort,Vegetarian / Vegan Restaurant,Hotel,Asian Restaurant,Restaurant,Spa,Coffee Shop,Bakery
5,"Sayan, Ubud",Asian Restaurant,Hotel,Resort,Balinese Restaurant,Indonesian Restaurant,Arts & Crafts Store,Thai Restaurant,Coffee Shop,Comfort Food Restaurant,Cosmetics Shop
7,Ubud,Resort,Hotel,Café,Indonesian Restaurant,Vegetarian / Vegan Restaurant,Coffee Shop,Restaurant,Italian Restaurant,Spa,Hostel


Cluster 3

In [None]:
# show the village name (column 0) and all ten venue columns (column 4 onward)
ubud_merged.loc[ubud_merged['Cluster Labels'] == 2, ubud_merged.columns[[0] + list(range(4, ubud_merged.shape[1]))]]

Unnamed: 0,Village,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,"Mas, Ubud",Art Gallery,Restaurant,Indonesian Restaurant,Resort,Food Truck,Arts & Crafts Store,Asian Restaurant,History Museum,Internet Cafe,Yoga Studio


Cluster 4

In [None]:
# show the village name (column 0) and all ten venue columns (column 4 onward)
ubud_merged.loc[ubud_merged['Cluster Labels'] == 3, ubud_merged.columns[[0] + list(range(4, ubud_merged.shape[1]))]]

Unnamed: 0,Village,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,"Singakerta, Ubud",Supermarket,Indonesian Restaurant,Coffee Shop,Restaurant,Cocktail Bar,Comfort Food Restaurant,Cosmetics Shop,Creperie,Cupcake Shop,Yoga Studio


Visualize the resulting clusters

In [None]:
import matplotlib.cm as cm
import matplotlib.colors as colors
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(ubud_merged['Latitude'], ubud_merged['Longitude'], ubud_merged['Village'], ubud_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=50,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

## **Results and Discussion**

As we know, Ubud is a favorite destination for many visitors thanks to its cool highland location, slow-paced village lifestyle, and laid-back atmosphere. It is no surprise that the top 10 venue categories in each Ubud district are tourism-related, such as Hotel, Resort, Rafting, Art Gallery, and Spa. Opening a new venue in one of these categories is a reasonable idea. However, my suggestion is to consider categories that are not in the top 10. Tourism venues in Ubud are mushrooming in almost every corner, so competition for customers keeps getting harder. A venue type that has many enthusiasts but is still scarce around Ubud may face less competition.

The purpose of this analysis was only to provide information on the top 10 venue categories in each district of Ubud. It is entirely possible that there is a very good reason why other kinds of venues are scarce in any of these areas, reasons that would make them unsuitable for a new venue regardless of the lack of competition. The top 10 venue categories in each district should therefore be treated only as a starting point for a more detailed analysis, one that weighs nearby competition together with all other relevant conditions.

## **Conclusion**

The purpose of this project was to identify the most common venue categories in each district in order to aid stakeholders in the search for a location for a new venue. By retrieving the top venues within a radius of ~1 kilometer from Foursquare, we collected the required data (venue category and venue name). We then explored the top 10 venue categories in each district. This exploration can serve as a reference for stakeholders who are interested in opening a venue in Ubud.

The final decision on the optimal venue location and venue category will be made by stakeholders based on the specific characteristics of neighborhoods and locations in every district, taking into consideration additional factors such as the attractiveness of each location (proximity to a park or water), noise levels and proximity to major roads, real estate availability, prices, and the social and economic dynamics of every neighborhood.