# Capstone Project - Identify new location for Indian Restaurant in Kuala Lumpur, Malaysia.

## Table of contents
* [Introduction: Business Problem Section](#introduction)
* [Data Collection](#data)
* [Analysis](#analysis)
* [Observation](#observation)

## Introduction: Business Problem Section <a name="introduction"></a>

In this Project, I am focusing on identifying the best places to open new Indian Restaurant in **Kuala Lumpur, Malaysia**.
We will be identifying existing Indian Restaurant in neighbourhoods of Kuala Lumpur and identify new locations where there exists opportunities to open new Indian Restaurant.

Thus report will be helpful to clients who are planning to open a new Indian Restaurant in neighbourhoods of **Kuala Lumpur, Malaysia**.

## Data Collection <a name="data"></a>

The Factors we will be focusing on to complete our Business requirement are as follows:
* Identifying the neighbourhoods of Kuala Lumpur.
* Finding existing Indian Restaurant in the neighbourhoods of Kuala Lumpur.
* Exploring new neighbourhoods, where there is opportunity to open new Indian Restaurant.

The Data sources used to generate above points are as below:
* Using **Wikipedia Page** to get data about the neighbourhoods.
* Obtain Coordinates of neighbourhoods.
* Using **Foursquare API** to get the venues data of the neighbourhood.
* Find the best place to open the Indian Restaurant by exploring the clusters.


### Import and install libraries:

In [43]:
!pip install geopy
!pip install geocoder
!pip install folium



distributed 1.21.8 requires msgpack, which is not installed.
You are using pip version 10.0.1, however version 19.2.3 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.




distributed 1.21.8 requires msgpack, which is not installed.
You are using pip version 10.0.1, however version 19.2.3 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.


Collecting folium
  Downloading https://files.pythonhosted.org/packages/72/ff/004bfe344150a064e558cb2aedeaa02ecbf75e60e148a55a9198f0c41765/folium-0.10.0-py2.py3-none-any.whl (91kB)
Collecting branca>=0.3.0 (from folium)
  Downloading https://files.pythonhosted.org/packages/63/36/1c93318e9653f4e414a2e0c3b98fc898b4970e939afeedeee6075dd3b703/branca-0.3.1-py3-none-any.whl
Installing collected packages: branca, folium
Successfully installed branca-0.3.1 folium-0.10.0


distributed 1.21.8 requires msgpack, which is not installed.
You are using pip version 10.0.1, however version 19.2.3 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.


In [44]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analsysis
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)

import json # library to handle JSON files

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import geocoder # to get coordinates

import requests # library to handle requests
from bs4 import BeautifulSoup # library to parse HTML and XML documents

from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

print("Libraries imported.")


Libraries imported.


###  Get Neighbourhood Data of Kuala Lumpur

In [96]:
# GET request
data = requests.get("https://en.wikipedia.org/wiki/Category:Suburbs_in_Kuala_Lumpur").text
# Using beautifulsoup object parse data from HTML
soup = BeautifulSoup(data, 'html.parser')


In [98]:
neighbourhood_List = []
# Get data into list 
for area in soup.find_all("div", class_="mw-category")[0].findAll("li"):
    neighbourhood_List.append(area.text)

In [100]:
# Create DataFrame
DataFrame_KF = pd.DataFrame({"Neighbourhood": neighbourhood_List})
DataFrame_KF.head()

Unnamed: 0,Neighbourhood
0,Alam Damai
1,"Ampang, Kuala Lumpur"
2,Bandar Menjalara
3,Bandar Sri Permaisuri
4,Bandar Tasik Selatan


### Geographical coordinates of Neighbourhood:

In [101]:
# define a function to get coordinates
def getLatLng(neigh):
    # initialize your variable to None
    cordLatLng = None
    # loop until you get the coordinates
    while(cordLatLng is None):
        g = geocoder.arcgis('{}, Kuala Lumpur, Malaysia'.format(neigh))
        cordLatLng = g.latlng
    return cordLatLng

In [107]:
# call the function to get the coordinates, store in a new list using list comprehension
cordinates = [ getLatLng(neighborhood) for neigh in DataFrame_KF["Neighbourhood"].tolist() ]

In [109]:
# create temporary dataframe to populate the coordinates into Latitude and Longitude
df_coords = pd.DataFrame(cordinates, columns=['Latitude', 'Longitude'])

In [110]:
# merge the coordinates into the original dataframe
DataFrame_KF['Latitude'] = df_coords['Latitude']
DataFrame_KF['Longitude'] = df_coords['Longitude']

In [112]:
# check the neighborhoods and the coordinates
print(DataFrame_KF.shape)
DataFrame_KF.head()

(70, 3)


Unnamed: 0,Neighbourhood,Latitude,Longitude
0,Alam Damai,3.05769,101.74388
1,"Ampang, Kuala Lumpur",3.05769,101.74388
2,Bandar Menjalara,3.05769,101.74388
3,Bandar Sri Permaisuri,3.05769,101.74388
4,Bandar Tasik Selatan,3.05769,101.74388


In [113]:
# Save the DataFrame as CSV file
DataFrame_KF.to_csv("DataFrame_KF.csv", index=False)

### 4. Create a map of Kuala Lumpur with neighborhoods superimposed on top

In [114]:
# get the coordinates of Kuala Lumpur
address = 'Kuala Lumpur, Malaysia'

geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Kuala Lumpur, Malaysiae {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Kuala Lumpur, Malaysiae 3.1516636, 101.6943028.


In [135]:
# create map of Toronto using latitude and longitude values
map_kl = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, neighbourhood in zip(DataFrame_KF['Latitude'], DataFrame_KF['Longitude'], DataFrame_KF['Neighbourhood']):
    label = '{}'.format(neighbourhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_kl)  
    
map_kl

In [136]:
# save the map as HTML file
map_kl.save('map_kl.html')

### Foursquare API to explore the neighbourhoods.

In [137]:
# define Foursquare Credentials and Version
CLIENT_ID = '4YUI5WA4FLEFU01SXP0IWB3QPMQSRUNH4KHW2LC3XVFAIJHS' # your Foursquare ID
CLIENT_SECRET = '33RA54YANSMMY0YU0RKC41DR41TLSMUSJCRTUSXCDKDU1EPP' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Credentials:
CLIENT_ID: 4YUI5WA4FLEFU01SXP0IWB3QPMQSRUNH4KHW2LC3XVFAIJHS
CLIENT_SECRET:33RA54YANSMMY0YU0RKC41DR41TLSMUSJCRTUSXCDKDU1EPP


In [164]:
radius = 2000
LIMIT = 150

venues = []

for lat, long, neighbourhood in zip(DataFrame_KF['Latitude'], DataFrame_KF['Longitude'], DataFrame_KF['Neighbourhood']):
    
    # create the API request URL
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        LIMIT)
    
    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
     # return only relevant information for each nearby venue
    for venue in results:
        venues.append((
            neighbourhood,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))

In [165]:
# convert the venues list into a new DataFrame
venues_df = pd.DataFrame(venues)

# define the column names
venues_df.columns = ['Neighbourhood', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venues_df.shape)
venues_df.head()

(7000, 7)


Unnamed: 0,Neighbourhood,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,Alam Damai,3.05769,101.74388,Pengedar Shaklee Kuala Lumpur,3.061235,101.740696,Supplement Shop
1,Alam Damai,3.05769,101.74388,Machi Noodle 妈子面,3.057695,101.746635,Noodle House
2,Alam Damai,3.05769,101.74388,Minang Tomyam,3.057185,101.749812,Seafood Restaurant
3,Alam Damai,3.05769,101.74388,628火焰鑫茶室,3.058442,101.747947,Chinese Restaurant
4,Alam Damai,3.05769,101.74388,Restoran Ikbal,3.061134,101.75022,Restaurant


In [175]:
venues_df.groupby(["Neighbourhood"]).count()

Unnamed: 0_level_0,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
Neighbourhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Alam Damai,100,100,100,100,100,100
"Ampang, Kuala Lumpur",100,100,100,100,100,100
Bandar Menjalara,100,100,100,100,100,100
Bandar Sri Permaisuri,100,100,100,100,100,100
Bandar Tasik Selatan,100,100,100,100,100,100
Bandar Tun Razak,100,100,100,100,100,100
Bangsar,100,100,100,100,100,100
Bangsar Park,100,100,100,100,100,100
Bangsar South,100,100,100,100,100,100
Batu 11 Cheras,100,100,100,100,100,100


### Find the unique categories from the returned venues

In [176]:
print('There are {} uniques categories.'.format(len(venues_df['VenueCategory'].unique())))

There are 43 uniques categories.


In [177]:
# print out the list of categories
venues_df['VenueCategory'].unique()

array(['Supplement Shop', 'Noodle House', 'Seafood Restaurant',
       'Chinese Restaurant', 'Restaurant',
       'Vegetarian / Vegan Restaurant', 'Breakfast Spot', 'Food Court',
       'Asian Restaurant', 'Park', 'Other Great Outdoors',
       'Dim Sum Restaurant', 'Indian Restaurant', 'Snack Place', 'Spa',
       'Food Truck', 'Bubble Tea Shop', 'Convenience Store',
       'Chinese Breakfast Place', 'Japanese Restaurant', 'Pet Store',
       'Outlet Store', 'Dessert Shop', 'Café', 'Farmers Market',
       'Malay Restaurant', 'Cantonese Restaurant', 'Gym / Fitness Center',
       'Fast Food Restaurant', 'Steakhouse', 'Badminton Court', 'Bakery',
       'Hakka Restaurant', 'Athletics & Sports',
       'Middle Eastern Restaurant', 'Mamak Restaurant', 'Winery',
       'Burger Joint', 'College Bookstore', 'Grocery Store',
       'Halal Restaurant', 'Diner', 'Mexican Restaurant'], dtype=object)

In [178]:
# check if the results contain "Shopping Mall"
"Indian Restaurant" in venues_df['VenueCategory'].unique()

True

## Analysis <a name="analysis"></a>

### Analyse the Neighbourhoods.

In [179]:
# one hot encoding
kl_onehot = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
kl_onehot['Neighbourhoods'] = venues_df['Neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [kl_onehot.columns[-1]] + list(kl_onehot.columns[:-1])
kl_onehot = kl_onehot[fixed_columns]

print(kl_onehot.shape)
kl_onehot.head()

(7000, 44)


Unnamed: 0,Neighbourhoods,Asian Restaurant,Athletics & Sports,Badminton Court,Bakery,Breakfast Spot,Bubble Tea Shop,Burger Joint,Café,Cantonese Restaurant,Chinese Breakfast Place,Chinese Restaurant,College Bookstore,Convenience Store,Dessert Shop,Dim Sum Restaurant,Diner,Farmers Market,Fast Food Restaurant,Food Court,Food Truck,Grocery Store,Gym / Fitness Center,Hakka Restaurant,Halal Restaurant,Indian Restaurant,Japanese Restaurant,Malay Restaurant,Mamak Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Noodle House,Other Great Outdoors,Outlet Store,Park,Pet Store,Restaurant,Seafood Restaurant,Snack Place,Spa,Steakhouse,Supplement Shop,Vegetarian / Vegan Restaurant,Winery
0,Alam Damai,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
1,Alam Damai,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0
2,Alam Damai,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
3,Alam Damai,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Alam Damai,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0


### Group rows by neighborhood and take the mean of the frequency of occurrence of each category.

In [180]:
kl_grouped = kl_onehot.groupby(["Neighbourhoods"]).mean().reset_index()

print(kl_grouped.shape)
kl_grouped

(70, 44)


Unnamed: 0,Neighbourhoods,Asian Restaurant,Athletics & Sports,Badminton Court,Bakery,Breakfast Spot,Bubble Tea Shop,Burger Joint,Café,Cantonese Restaurant,Chinese Breakfast Place,Chinese Restaurant,College Bookstore,Convenience Store,Dessert Shop,Dim Sum Restaurant,Diner,Farmers Market,Fast Food Restaurant,Food Court,Food Truck,Grocery Store,Gym / Fitness Center,Hakka Restaurant,Halal Restaurant,Indian Restaurant,Japanese Restaurant,Malay Restaurant,Mamak Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Noodle House,Other Great Outdoors,Outlet Store,Park,Pet Store,Restaurant,Seafood Restaurant,Snack Place,Spa,Steakhouse,Supplement Shop,Vegetarian / Vegan Restaurant,Winery
0,Alam Damai,0.06,0.01,0.01,0.01,0.02,0.01,0.01,0.05,0.02,0.01,0.18,0.01,0.06,0.01,0.02,0.01,0.01,0.01,0.02,0.03,0.01,0.01,0.01,0.01,0.03,0.01,0.11,0.01,0.01,0.01,0.05,0.01,0.01,0.01,0.01,0.03,0.02,0.01,0.01,0.02,0.01,0.01,0.01
1,"Ampang, Kuala Lumpur",0.06,0.01,0.01,0.01,0.02,0.01,0.01,0.05,0.02,0.01,0.18,0.01,0.06,0.01,0.02,0.01,0.01,0.01,0.02,0.03,0.01,0.01,0.01,0.01,0.03,0.01,0.11,0.01,0.01,0.01,0.05,0.01,0.01,0.01,0.01,0.03,0.02,0.01,0.01,0.02,0.01,0.01,0.01
2,Bandar Menjalara,0.06,0.01,0.01,0.01,0.02,0.01,0.01,0.05,0.02,0.01,0.18,0.01,0.06,0.01,0.02,0.01,0.01,0.01,0.02,0.03,0.01,0.01,0.01,0.01,0.03,0.01,0.11,0.01,0.01,0.01,0.05,0.01,0.01,0.01,0.01,0.03,0.02,0.01,0.01,0.02,0.01,0.01,0.01
3,Bandar Sri Permaisuri,0.06,0.01,0.01,0.01,0.02,0.01,0.01,0.05,0.02,0.01,0.18,0.01,0.06,0.01,0.02,0.01,0.01,0.01,0.02,0.03,0.01,0.01,0.01,0.01,0.03,0.01,0.11,0.01,0.01,0.01,0.05,0.01,0.01,0.01,0.01,0.03,0.02,0.01,0.01,0.02,0.01,0.01,0.01
4,Bandar Tasik Selatan,0.06,0.01,0.01,0.01,0.02,0.01,0.01,0.05,0.02,0.01,0.18,0.01,0.06,0.01,0.02,0.01,0.01,0.01,0.02,0.03,0.01,0.01,0.01,0.01,0.03,0.01,0.11,0.01,0.01,0.01,0.05,0.01,0.01,0.01,0.01,0.03,0.02,0.01,0.01,0.02,0.01,0.01,0.01
5,Bandar Tun Razak,0.06,0.01,0.01,0.01,0.02,0.01,0.01,0.05,0.02,0.01,0.18,0.01,0.06,0.01,0.02,0.01,0.01,0.01,0.02,0.03,0.01,0.01,0.01,0.01,0.03,0.01,0.11,0.01,0.01,0.01,0.05,0.01,0.01,0.01,0.01,0.03,0.02,0.01,0.01,0.02,0.01,0.01,0.01
6,Bangsar,0.06,0.01,0.01,0.01,0.02,0.01,0.01,0.05,0.02,0.01,0.18,0.01,0.06,0.01,0.02,0.01,0.01,0.01,0.02,0.03,0.01,0.01,0.01,0.01,0.03,0.01,0.11,0.01,0.01,0.01,0.05,0.01,0.01,0.01,0.01,0.03,0.02,0.01,0.01,0.02,0.01,0.01,0.01
7,Bangsar Park,0.06,0.01,0.01,0.01,0.02,0.01,0.01,0.05,0.02,0.01,0.18,0.01,0.06,0.01,0.02,0.01,0.01,0.01,0.02,0.03,0.01,0.01,0.01,0.01,0.03,0.01,0.11,0.01,0.01,0.01,0.05,0.01,0.01,0.01,0.01,0.03,0.02,0.01,0.01,0.02,0.01,0.01,0.01
8,Bangsar South,0.06,0.01,0.01,0.01,0.02,0.01,0.01,0.05,0.02,0.01,0.18,0.01,0.06,0.01,0.02,0.01,0.01,0.01,0.02,0.03,0.01,0.01,0.01,0.01,0.03,0.01,0.11,0.01,0.01,0.01,0.05,0.01,0.01,0.01,0.01,0.03,0.02,0.01,0.01,0.02,0.01,0.01,0.01
9,Batu 11 Cheras,0.06,0.01,0.01,0.01,0.02,0.01,0.01,0.05,0.02,0.01,0.18,0.01,0.06,0.01,0.02,0.01,0.01,0.01,0.02,0.03,0.01,0.01,0.01,0.01,0.03,0.01,0.11,0.01,0.01,0.01,0.05,0.01,0.01,0.01,0.01,0.03,0.02,0.01,0.01,0.02,0.01,0.01,0.01


In [200]:
len(kl_grouped[kl_grouped["Indian Restaurant"] > 0])

70

### Create a new DataFrame for Indian Restaurant data only

In [201]:
kl_rest = kl_grouped[["Neighbourhoods","Indian Restaurant"]]

In [202]:
kl_rest.head()

Unnamed: 0,Neighbourhoods,Indian Restaurant
0,Alam Damai,0.03
1,"Ampang, Kuala Lumpur",0.03
2,Bandar Menjalara,0.03
3,Bandar Sri Permaisuri,0.03
4,Bandar Tasik Selatan,0.03


### 7. Cluster Neighborhoods
Run k-means to cluster the neighborhoods in Kuala Lumpur into 3 clusters.

In [203]:
# set number of clusters
kclusters = 3

kl_clustering = kl_rest.drop(["Neighbourhoods"], 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(kl_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [204]:
# create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.
kl_merged = kl_rest.copy()

# add clustering labels
kl_merged["Cluster Labels"] = kmeans.labels_

In [205]:
kl_merged.rename(columns={"Neighbourhoods": "Neighbourhood"}, inplace=True)
kl_merged.head()

Unnamed: 0,Neighbourhood,Indian Restaurant,Cluster Labels
0,Alam Damai,0.03,0
1,"Ampang, Kuala Lumpur",0.03,0
2,Bandar Menjalara,0.03,0
3,Bandar Sri Permaisuri,0.03,0
4,Bandar Tasik Selatan,0.03,0


In [206]:
# merge toronto_grouped with toronto_data to add latitude/longitude for each neighborhood
kl_merged = kl_merged.join(DataFrame_KF.set_index("Neighbourhood"), on="Neighbourhood")

print(kl_merged.shape)
kl_merged.head() # check the last columns!

(70, 5)


Unnamed: 0,Neighbourhood,Indian Restaurant,Cluster Labels,Latitude,Longitude
0,Alam Damai,0.03,0,3.05769,101.74388
1,"Ampang, Kuala Lumpur",0.03,0,3.05769,101.74388
2,Bandar Menjalara,0.03,0,3.05769,101.74388
3,Bandar Sri Permaisuri,0.03,0,3.05769,101.74388
4,Bandar Tasik Selatan,0.03,0,3.05769,101.74388


In [207]:
# sort the results by Cluster Labels
print(kl_merged.shape)
kl_merged.sort_values(["Cluster Labels"], inplace=True)
kl_merged

(70, 5)


Unnamed: 0,Neighbourhood,Indian Restaurant,Cluster Labels,Latitude,Longitude
0,Alam Damai,0.03,0,3.05769,101.74388
37,Medan Tuanku,0.03,0,3.05769,101.74388
38,Miharja,0.03,0,3.05769,101.74388
39,Mont Kiara,0.03,0,3.05769,101.74388
40,Pantai Dalam,0.03,0,3.05769,101.74388
41,"Pudu, Kuala Lumpur",0.03,0,3.05769,101.74388
42,Putrajaya,0.03,0,3.05769,101.74388
43,Salak South,0.03,0,3.05769,101.74388
44,Segambut,0.03,0,3.05769,101.74388
45,Semarak,0.03,0,3.05769,101.74388


### Finally, let's visualize the resulting clusters

In [208]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(kl_merged['Latitude'], kl_merged['Longitude'], kl_merged['Neighbourhood'], kl_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [209]:
# save the map as HTML file
map_clusters.save('map_clusters.html')

### Examine Clusters
#### Cluster 0

In [210]:
kl_merged.loc[kl_merged['Cluster Labels'] == 0]

Unnamed: 0,Neighbourhood,Indian Restaurant,Cluster Labels,Latitude,Longitude
0,Alam Damai,0.03,0,3.05769,101.74388
37,Medan Tuanku,0.03,0,3.05769,101.74388
38,Miharja,0.03,0,3.05769,101.74388
39,Mont Kiara,0.03,0,3.05769,101.74388
40,Pantai Dalam,0.03,0,3.05769,101.74388
41,"Pudu, Kuala Lumpur",0.03,0,3.05769,101.74388
42,Putrajaya,0.03,0,3.05769,101.74388
43,Salak South,0.03,0,3.05769,101.74388
44,Segambut,0.03,0,3.05769,101.74388
45,Semarak,0.03,0,3.05769,101.74388


#### Cluster 1

In [211]:
kl_merged.loc[kl_merged['Cluster Labels'] == 1]

Unnamed: 0,Neighbourhood,Indian Restaurant,Cluster Labels,Latitude,Longitude


#### Cluster 2

In [212]:
kl_merged.loc[kl_merged['Cluster Labels'] == 2]

Unnamed: 0,Neighbourhood,Indian Restaurant,Cluster Labels,Latitude,Longitude


### Observations: <a name="observation"></a>

From our analysis, we can conclude that most of the **Indian Restaurants** are present in Cluster 0. There are no **Indian Restaurants** in cluster 1 and cluster 2.
As there are number of Indian Restaurants already in existense in cluster 0, it provides an intense competition to open new restaurant in that neighbourhood and hence will be a bad idea to open a new Restaurant over there.

New clients who are planning to open an **Indian Restaurant** can look for opportunities in cluster 1 and cluster 2, as there is almost no competition and would provide a great opportunity for business.Thus we conclude our analysis. 