# Coursera: IBM Data Science Professional Certificate: Capstone Project
### Final Report: **_Hotel Apartment Business in Kochi, Kerala - India_**


## BACKGROUND:
Kochi (also known as Cochin) is a beautiful city on the western coast of India facing the Arabian Sea, and is rightfully called the ‘Queen of the Arabian Sea’. It has a rich network of backwaters, is the cultural and heritage center of Kerala, and is the state's commercial capital. The city holds enormous potential for the hospitality industry: it has several IT parks, industries, and many export companies, along with 100% literacy, a highly developed social structure, and well-laid-out communication and transport infrastructure. These and a few other factors provide considerable scope for the growth of the hospitality industry in Kochi.


## BUSINESS PROBLEM:
My client wants to tap the hospitality business potential of Kochi and is ready to open a hotel apartment in the region.
The objective of this project is to help the client select the best location in Kochi for a new serviced hotel apartment business. Choosing the right location is key to the success of the business.
Several factors need to be considered when selecting the right location:
1.	Accessibility and transportation facilities
2.	Density of other hotels and hotel apartments
3.	Industries and commercial establishments around the region

This project explores patterns across the suburbs of Kochi using data science methods, grouping them into clusters in order to determine the best location to open a hotel apartment.

## Methodology and Code

The steps below walk through the code I used to address the business problem described above.

## Step 1: Create a data frame with the neighborhoods of Kochi

In [1]:
#import libraries
import pandas as pd
import numpy as np
import requests
from bs4 import BeautifulSoup

In [2]:
# extract data from html (the variable holds the page's HTML text, not a URL)
wiki_html = requests.get('https://en.wikipedia.org/wiki/Category:Suburbs_of_Kochi').text
soup = BeautifulSoup(wiki_html, 'html.parser')
print(soup.prettify())

<!DOCTYPE html>
<html class="client-nojs" dir="ltr" lang="en">
 <head>
  <meta charset="utf-8"/>
  <title>
   Category:Suburbs of Kochi - Wikipedia
  </title>
  (output truncated)

In [3]:
neighborhood = []

# append data into the list
for row in soup.find_all("div", class_="mw-category")[0].find_all("li"):
    neighborhood.append(row.text)

In [4]:


# DataFrame from the list
kochi_df = pd.DataFrame({"Neighborhood": neighborhood})
kochi_df.head()

Unnamed: 0,Neighborhood
0,Alangad
1,Angamaly
2,Aroor
3,Chellanam
4,Chendamangalam


In [5]:
kochi_df.shape

(44, 1)
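Some category entries carry a disambiguating suffix, such as "Chengamanad, Ernakulam district" in the output further down. The sketch below shows one possible way to trim such suffixes before the names are used in a geocoding query. The helper name `clean_neighborhood_name` is hypothetical, and whether trimming actually improves the geocoding results is an assumption that would need to be checked.

```python
def clean_neighborhood_name(name):
    """Return the part of a category entry before the first comma, trimmed."""
    return name.split(",")[0].strip()


# Names taken from the neighborhood list in this report.
names = ["Alangad", "Chengamanad, Ernakulam district", "Kumbalam, Ernakulam"]
cleaned = [clean_neighborhood_name(n) for n in names]
print(cleaned)
```

The original name can still be kept in a separate column so the link back to the Wikipedia entry is not lost.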

## Step 2: Add position data

In [10]:
#!pip install geocoder

In [11]:
#import libraries
from geopy.geocoders import Nominatim
import geocoder 

In [12]:
# Function to get lat and long values
def get_latlng(neighborhood, max_retries=5):
    # retry a bounded number of times so a failed lookup cannot loop forever
    # (max_retries is an added safeguard; the original looped until success)
    for _ in range(max_retries):
        g = geocoder.arcgis('{}, Kochi, India'.format(neighborhood))
        if g.latlng is not None:
            return g.latlng
    # no result: return placeholders so the row still has two columns
    return [None, None]

In [13]:
# call the function to get the coordinates, store in a new list using list comprehension
lat_long = [ get_latlng(neighborhood) for neighborhood in kochi_df["Neighborhood"].tolist() ]

In [14]:
lat_long

[[10.847500000000025, 76.43609000000004],
 [10.20366000000007, 76.38268000000005],
 [9.936010000000067, 76.26142000000004],
 [9.835260000000062, 76.27029000000005],
 [10.022316296699168, 76.3393582975359],
 [10.15354000000002, 76.34068000000008],
 [10.060850000000073, 76.28901000000008],
 [9.96118000000007, 76.30659000000009],
 [10.081320000000062, 76.34155000000004],
 [9.939097369770508, 76.37503598454573],
 [10.128150000000062, 76.37217000000004],
 [10.086410000000058, 76.38181000000003],
 [9.957580000000064, 76.24239000000006],
 [9.966870000000029, 76.35720000000003],
 [10.06352000000004, 76.24660000000006],
 [9.988452830057541, 76.30342643172278],
 [9.947600000000023, 76.26079000000004],
 [10.107690000000048, 76.26171000000005],
 [10.055610000000058, 76.27164000000005],
 [10.103150000000028, 76.24615000000006],
 [9.90220000000005, 76.31064000000003],
 [9.940510000000074, 76.32395000000008],
 [9.952060000000074, 76.25080000000008],
 [9.999140000000068, 76.26241000000005],
 [9.930700

In [15]:
# dataframe to populate the coordinates into Latitude and Longitude
df_lat_long = pd.DataFrame(lat_long, columns=['Latitude', 'Longitude'])

In [16]:
# merge df_lat_long into kochi_df
kochi_df['Latitude'] = df_lat_long['Latitude']
kochi_df['Longitude'] = df_lat_long['Longitude']

In [17]:
kochi_df.head(50)

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Alangad,10.8475,76.43609
1,Angamaly,10.20366,76.38268
2,Aroor,9.93601,76.26142
3,Chellanam,9.83526,76.27029
4,Chendamangalam,10.022316,76.339358
5,"Chengamanad, Ernakulam district",10.15354,76.34068
6,Cheranallur,10.06085,76.28901
7,Chilavannoor,9.96118,76.30659
8,Choornikkara,10.08132,76.34155
9,Chottanikkara,9.939097,76.375036


In [18]:
kochi_df.shape

(44, 3)
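Geocoded coordinates are worth a quick sanity check, because a name can resolve to a different place with the same name. The first row (Alangad, latitude 10.8475) lies noticeably further north than the others. The sketch below computes the great-circle distance from the Kochi coordinates printed in Step 3 and flags points beyond a threshold. The 50 km threshold and the helper name `haversine_km` are assumptions for illustration, not part of the original analysis.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


# Reference point: the Kochi coordinates printed in Step 3 of this report.
KOCHI = (9.9340738, 76.2606304)
MAX_KM = 50  # assumed threshold, not from the original analysis

# A few rows copied from the table above.
coords = {
    "Alangad": (10.8475, 76.43609),
    "Angamaly": (10.20366, 76.38268),
    "Aroor": (9.93601, 76.26142),
}

distances = {name: haversine_km(*KOCHI, lat, lon)
             for name, (lat, lon) in coords.items()}
suspect = [name for name, d in distances.items() if d > MAX_KM]
print(suspect)
```

Any flagged neighborhood can then be looked up manually or geocoded again with a more specific query before it is used in the clustering.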

## Step 3: Create a map

In [25]:
from geopy.geocoders import Nominatim
import matplotlib.cm as cm
import matplotlib.colors as colors

#!conda install -c conda-forge folium=0.5.0
import folium

from sklearn.cluster import KMeans

In [26]:
# get the coordinates of Kochi
address = 'Kochi, India'

geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Kochi, India are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Kochi, India are 9.9340738, 76.2606304.


In [27]:
# create map of Kochi using latitude and longitude values
map_kochi = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, neighborhood in zip(kochi_df['Latitude'], kochi_df['Longitude'], kochi_df['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_kochi)  
    
map_kochi

In [28]:
# save the map 
map_kochi.save('map_kochi.html')

## Step 4: Explore Kochi venues

In [29]:
# Define Foursquare credentials (replace the placeholders with your own; never commit real keys)
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID'
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET'
VERSION = '20200528'
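Rather than writing credentials into the notebook, they can be read from environment variables so that nothing sensitive ends up in version control. This is a minimal sketch: the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are hypothetical and should be adapted to your own setup.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read the client ID and secret from environment variables.

    The variable names are hypothetical; use whatever names your setup defines.
    """
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Foursquare credentials are not set in the environment")
    return client_id, client_secret
```

In the notebook this would be used as `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` after setting both variables in the shell that launches Jupyter.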

In [70]:
radius=10000
LIMIT=100

venues_list = []

for lat, long, neighborhood in zip(kochi_df['Latitude'], kochi_df['Longitude'], kochi_df['Neighborhood']): 
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            long, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
    
        # return only relevant information for each nearby venue
        for venue in results:
            venues_list.append((
            neighborhood,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))
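
The loop above indexes straight into the response (`["response"]['groups'][0]['items']` and `['categories'][0]`), so one failed request or one venue without a category stops the whole run. The sketch below shows one defensive way to flatten a response. The helper name `extract_venues` is hypothetical, and the sample payload is hand-written in the same shape the loop expects; its second entry is made up to exercise the skip path.

```python
def extract_venues(payload, neighborhood, lat, lng):
    """Flatten one explore response into rows, skipping malformed entries."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        categories = venue.get("categories", [])
        location = venue.get("location", {})
        if not categories or "lat" not in location or "lng" not in location:
            continue  # skip venues without a category or coordinates
        rows.append((neighborhood, lat, lng, venue.get("name"),
                     location["lat"], location["lng"], categories[0]["name"]))
    return rows


# Hand-written payload; the second venue is invented to show the skip path.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Hotel Flora International",
               "location": {"lat": 10.161317, "lng": 76.390701},
               "categories": [{"name": "Hotel"}]}},
    {"venue": {"name": "Uncategorised place",
               "location": {"lat": 10.0, "lng": 76.3},
               "categories": []}},
]}]}}

rows = extract_venues(sample, "Angamaly", 10.20366, 76.38268)
print(rows)
```

In the loop, `venues_list.extend(extract_venues(requests.get(url).json(), neighborhood, lat, long))` would replace the direct indexing.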


In [71]:
# convert the venues list into a new DataFrame
kochi_venues = pd.DataFrame(venues_list)

kochi_venues.columns = ['Neighborhood', 'Latitude', 'Longitude', 'Venue Name', 'Venue Latitude', 'Venue Longitude', 'Venue Category']



In [72]:
kochi_venues.shape


(3315, 7)

In [73]:
kochi_venues.head(10)

Unnamed: 0,Neighborhood,Latitude,Longitude,Venue Name,Venue Latitude,Venue Longitude,Venue Category
0,Alangad,10.8475,76.43609,Azhiyannur,10.870042,76.471718,Bus Stop
1,Alangad,10.8475,76.43609,Hotel Ambujam,10.781332,76.47776,Indian Restaurant
2,Alangad,10.8475,76.43609,"Libas Collections, Pathiripala",10.779531,76.474815,Clothing Store
3,Alangad,10.8475,76.43609,Pathiripala,10.783413,76.488584,General Travel
4,Angamaly,10.20366,76.38268,Terminal 3 - Cochin International Airport,10.157032,76.39304,Airport Service
5,Angamaly,10.20366,76.38268,Carnival Cinemas,10.195147,76.386157,Multiplex
6,Angamaly,10.20366,76.38268,Hotel Flora International,10.161317,76.390701,Hotel
7,Angamaly,10.20366,76.38268,Indian Coffee House,10.176288,76.375646,Indian Restaurant
8,Angamaly,10.20366,76.38268,Carnival Cinemas Multiplex,10.195266,76.386193,Multiplex
9,Angamaly,10.20366,76.38268,Earth Lounge,10.155537,76.392409,Airport Lounge


In [74]:
#How many venues returned for each Neighborhood
kochi_venues.groupby('Neighborhood').count()

Neighborhood,Latitude,Longitude,Venue Name,Venue Latitude,Venue Longitude,Venue Category
Alangad,4,4,4,4,4,4
Angamaly,32,32,32,32,32,32
Aroor,100,100,100,100,100,100
Chellanam,10,10,10,10,10,10
Chendamangalam,100,100,100,100,100,100
"Chengamanad, Ernakulam district",32,32,32,32,32,32
Cheranallur,61,61,61,61,61,61
Chilavannoor,100,100,100,100,100,100
Choornikkara,56,56,56,56,56,56
Chottanikkara,82,82,82,82,82,82


In [75]:
# print the list of categories
kochi_venues['Venue Category'].unique()[:5000]

array(['Bus Stop', 'Indian Restaurant', 'Clothing Store',
       'General Travel', 'Airport Service', 'Multiplex', 'Hotel',
       'Airport Lounge', 'Train Station', 'Café', 'Kerala Restaurant',
       'Airport', 'Airport Food Court', 'Food', 'Fast Food Restaurant',
       'Asian Restaurant', 'Pizza Place', 'Coffee Shop', 'Food Truck',
       'Airport Terminal', 'Duty-free Shop', 'Golf Course', 'Bakery',
       'Juice Bar', 'Food & Drink Shop', 'South Indian Restaurant',
       'Fried Chicken Joint', 'Performing Arts Venue', 'Art Gallery',
       'Bar', 'Gastropub', 'Seafood Restaurant', 'Thai Restaurant',
       'Ice Cream Shop', 'Indie Movie Theater', 'Beach',
       'Chinese Restaurant', 'French Restaurant', 'Tea Room', 'Stadium',
       'Hotel Bar', 'Playground', 'Donut Shop',
       'Cajun / Creole Restaurant', 'Athletics & Sports', 'BBQ Joint',
       'Mediterranean Restaurant', 'Restaurant', 'Sandwich Place',
       'Garden', 'Middle Eastern Restaurant', 'Department Store',
    

## Step 5: Analyze Kochi Neighborhoods

In [76]:
# check if the results contain "Hotel Apartments"
kochi_venues[(kochi_venues['Venue Category'].str.contains('Hotel', regex=False)) |
                 (kochi_venues['Venue Category'].str.contains('Hotel Apartment', regex=False)) |
                 (kochi_venues['Venue Category'].str.contains('Homestay', regex=False)) |
                 (kochi_venues['Venue Category'].str.contains('Service Apartment', regex=False)) |
                 (kochi_venues['Venue Category'].str.contains('Guesthouse', regex=False))].count()

Neighborhood       315
Latitude           315
Longitude          315
Venue Name         315
Venue Latitude     315
Venue Longitude    315
Venue Category     315
dtype: int64

In [77]:
# Number of hotels in each location
kochi_hotels = kochi_venues[(kochi_venues['Venue Category'].str.contains('Hotel', regex=False)) |
                 (kochi_venues['Venue Category'].str.contains('Hotel Apartment', regex=False)) |
                 (kochi_venues['Venue Category'].str.contains('Homestay', regex=False)) |
                 (kochi_venues['Venue Category'].str.contains('Service Apartment', regex=False)) |
                 (kochi_venues['Venue Category'].str.contains('Guesthouse', regex=False))].groupby(['Neighborhood']).count()
kochi_hotels.drop(['Latitude', 'Longitude', 'Venue Longitude', 'Venue Name', 'Venue Latitude'], axis = 1, inplace = True)
kochi_hotels.rename(columns = {'Venue Category':'Number of Hotels'}, inplace=True)

kochi_hotels.head(50)

Neighborhood,Number of Hotels
Angamaly,4
Aroor,13
Chellanam,1
Chendamangalam,5
"Chengamanad, Ernakulam district",4
Cheranallur,3
Chilavannoor,10
Choornikkara,4
Chottanikkara,5
Chowwara,4


In [83]:
# Locations with transportation points

kochi_bus = kochi_venues[(kochi_venues['Venue Category'].str.contains('Bus Station', regex=False)) |
                         (kochi_venues['Venue Category'].str.contains('Metro Station', regex=False)) |
                 (kochi_venues['Venue Category'].str.contains('Train Station', regex=False)) |
                 (kochi_venues['Venue Category'].str.contains('Bus Stop', regex=False))].groupby(['Neighborhood']).count()
kochi_bus.drop(['Latitude', 'Longitude', 'Venue Longitude', 'Venue Name', 'Venue Latitude'], axis = 1, inplace = True)
kochi_bus.rename(columns = {'Venue Category':'Number of Transporation points'}, inplace=True)

kochi_bus.head(20)

Neighborhood,Number of Transporation points
Alangad,1
Angamaly,1
Chellanam,2
"Chengamanad, Ernakulam district",2
Chottanikkara,3
Chowwara,3
Nedumbassery,1
Thiruvankulam,3
Thrippunithura,1


In [84]:
# Locations with Restaurant
kochi_rest = kochi_venues[(kochi_venues['Venue Category'].str.contains('Restaurant', regex=False))].groupby(['Neighborhood']).count()
kochi_rest.drop(['Latitude', 'Longitude', 'Venue Longitude', 'Venue Name', 'Venue Latitude'], axis = 1, inplace = True)
kochi_rest.rename(columns = {'Venue Category':'Number of Restaurants'}, inplace=True)

kochi_rest.head(20)

Neighborhood,Number of Restaurants
Alangad,1
Angamaly,8
Aroor,30
Chellanam,2
Chendamangalam,35
"Chengamanad, Ernakulam district",6
Cheranallur,19
Chilavannoor,32
Choornikkara,17
Chottanikkara,30


In [85]:
# Locations with Shopping Malls
kochi_malls = kochi_venues[(kochi_venues['Venue Category'].str.contains('Mall', regex=False))].groupby(['Neighborhood']).count()
kochi_malls.drop(['Latitude', 'Longitude', 'Venue Longitude', 'Venue Name', 'Venue Latitude'], axis = 1, inplace = True)
kochi_malls.rename(columns = {'Venue Category':'Number of Shopping Malls'}, inplace=True)

kochi_malls.head(20)

Neighborhood,Number of Shopping Malls
Aroor,1
Chellanam,1
Chendamangalam,2
Cheranallur,2
Chilavannoor,2
Choornikkara,2
Fort Kochi,1
Irumpanam,2
Kadamakkudy,3
Karanakodam,2


In [86]:
# Locations with Cinemas
kochi_cinema = kochi_venues[(kochi_venues['Venue Category'].str.contains('Multiplex', regex=False))].groupby(['Neighborhood']).count()
kochi_cinema.drop(['Latitude', 'Longitude', 'Venue Longitude', 'Venue Name', 'Venue Latitude'], axis = 1, inplace = True)
kochi_cinema.rename(columns = {'Venue Category':'Number of Cinemas'}, inplace=True)

kochi_cinema.head(20)

Neighborhood,Number of Cinemas
Angamaly,2
Chendamangalam,1
"Chengamanad, Ernakulam district",2
Cheranallur,1
Chilavannoor,1
Choornikkara,1
Chowwara,2
Irumpanam,1
Kadamakkudy,1
Karanakodam,1
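
The five counting cells above repeat the same filter, group, drop, and rename pattern. A small helper can express that pattern once. This is a sketch rather than the code used for the results in this report: the name `count_by_category` is hypothetical, and it uses `size()` instead of `count()`, which gives the same numbers as long as no values are missing. The sample rows are copied from the `kochi_venues` output shown earlier.

```python
import pandas as pd


def count_by_category(venues, keywords, label):
    """Count venues per neighborhood whose category contains any keyword."""
    mask = pd.Series(False, index=venues.index)
    for keyword in keywords:
        mask |= venues["Venue Category"].str.contains(keyword, regex=False)
    counts = venues[mask].groupby("Neighborhood").size()
    return counts.rename(label).to_frame()


# Rows copied from the kochi_venues output shown earlier.
sample = pd.DataFrame({
    "Neighborhood": ["Angamaly", "Angamaly", "Angamaly", "Alangad"],
    "Venue Category": ["Multiplex", "Hotel", "Multiplex", "Bus Stop"],
})

cinemas = count_by_category(sample, ["Multiplex"], "Number of Cinemas")
transport = count_by_category(sample, ["Bus Stop", "Train Station"],
                              "Number of Transporation points")
print(cinemas)
print(transport)
```

Each returned frame is indexed by neighborhood, so it can be joined onto `kochi_df` with `join(..., on='Neighborhood')` in the same way as the frames built above.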


In [102]:
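# Merge the venue counts into the main dataframe
# (drop count columns left over from an earlier run; errors='ignore' keeps the first run from failing)
kochi_df = kochi_df.drop(columns=['Number of Hotels','Number of Transporation points','Number of Restaurants','Number of Shopping Malls','Number of Cinemas'], errors='ignore')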

kochi_df = kochi_df.join(kochi_hotels, on='Neighborhood')
kochi_df = kochi_df.join(kochi_bus, on='Neighborhood')
kochi_df = kochi_df.join(kochi_rest, on='Neighborhood')
kochi_df = kochi_df.join(kochi_malls, on='Neighborhood')
kochi_df = kochi_df.join(kochi_cinema, on='Neighborhood')

kochi_df.head()

Unnamed: 0,Neighborhood,Latitude,Longitude,Number of Hotels,Number of Transporation points,Number of Restaurants,Number of Shopping Malls,Number of Cinemas
0,Alangad,10.8475,76.43609,,1.0,1,,
1,Angamaly,10.20366,76.38268,4.0,1.0,8,,2.0
2,Aroor,9.93601,76.26142,13.0,,30,1.0,
3,Chellanam,9.83526,76.27029,1.0,2.0,2,1.0,
4,Chendamangalam,10.022316,76.339358,5.0,,35,2.0,1.0


In [103]:
# Replace NaN with 0
kochi_df = kochi_df.fillna(0)
kochi_df.head()

Unnamed: 0,Neighborhood,Latitude,Longitude,Number of Hotels,Number of Transporation points,Number of Restaurants,Number of Shopping Malls,Number of Cinemas
0,Alangad,10.8475,76.43609,0.0,1.0,1,0.0,0.0
1,Angamaly,10.20366,76.38268,4.0,1.0,8,0.0,2.0
2,Aroor,9.93601,76.26142,13.0,0.0,30,1.0,0.0
3,Chellanam,9.83526,76.27029,1.0,2.0,2,1.0,0.0
4,Chendamangalam,10.022316,76.339358,5.0,0.0,35,2.0,1.0


## Step 6: Cluster Neighborhoods
Run k-means to cluster the neighborhoods in Kochi into 4 clusters.

In [104]:
# set number of clusters
kclusters = 4

# keep only the numeric features (axis must be passed by keyword in recent pandas)
kochi_cluster = kochi_df.drop(["Neighborhood"], axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(kochi_cluster)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 0, 1, 3, 1, 2, 0, 2, 3], dtype=int32)
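
k-means groups points by Euclidean distance, so columns with a larger numeric range (restaurant counts run from 1 to 35) weigh more than columns with a small range (latitude and longitude vary by well under one degree). One common remedy is to standardize the features first. The sketch below shows this with scikit-learn's `StandardScaler` on a few rows copied from the `kochi_df` output above. It is an optional variation, not the procedure that produced the clusters in this report, and it uses 2 clusters only because the sample has five rows.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Rows copied from the kochi_df output above (a subset of the features).
features = pd.DataFrame({
    "Latitude": [10.8475, 10.20366, 9.93601, 9.83526, 10.022316],
    "Longitude": [76.43609, 76.38268, 76.26142, 76.27029, 76.339358],
    "Number of Hotels": [0, 4, 13, 1, 5],
    "Number of Restaurants": [1, 8, 30, 2, 35],
})

# Rescale each column to zero mean and unit variance.
scaled = StandardScaler().fit_transform(features)

# k=2 only because this sample has five rows; the report uses 4 clusters.
kmeans_scaled = KMeans(n_clusters=2, random_state=0, n_init=10).fit(scaled)
print(kmeans_scaled.labels_)
```

On the full data the same two steps would be applied to `kochi_cluster` with `n_clusters=kclusters`. The resulting labels may differ from the ones analyzed below.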

In [105]:
# create a new dataframe that includes the cluster 
kochi_merged = kochi_df.copy()

# add clustering labels
kochi_merged["Cluster Labels"] = kmeans.labels_

In [106]:
kochi_merged.head()

Unnamed: 0,Neighborhood,Latitude,Longitude,Number of Hotels,Number of Transporation points,Number of Restaurants,Number of Shopping Malls,Number of Cinemas,Cluster Labels
0,Alangad,10.8475,76.43609,0.0,1.0,1,0.0,0.0,1
1,Angamaly,10.20366,76.38268,4.0,1.0,8,0.0,2.0,1
2,Aroor,9.93601,76.26142,13.0,0.0,30,1.0,0.0,0
3,Chellanam,9.83526,76.27029,1.0,2.0,2,1.0,0.0,1
4,Chendamangalam,10.022316,76.339358,5.0,0.0,35,2.0,1.0,3


<b>Create a map of the clusters. Each circle marker is colored by its cluster label.</b>

In [107]:
kochi_cluster.head()

Unnamed: 0,Latitude,Longitude,Number of Hotels,Number of Transporation points,Number of Restaurants,Number of Shopping Malls,Number of Cinemas
0,10.8475,76.43609,0.0,1.0,1,0.0,0.0
1,10.20366,76.38268,4.0,1.0,8,0.0,2.0
2,9.93601,76.26142,13.0,0.0,30,1.0,0.0
3,9.83526,76.27029,1.0,2.0,2,1.0,0.0
4,10.022316,76.339358,5.0,0.0,35,2.0,1.0


In [109]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(kochi_merged['Latitude'], kochi_merged['Longitude'], kochi_merged['Neighborhood'], kochi_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
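
The map code above draws every marker with a fixed `radius=5`. If the marker size should reflect the number of hotels in a neighborhood, one option is to derive the radius from that column. The helper below is a sketch: the name `marker_radius` and the `base`, `scale`, and `max_radius` values are arbitrary assumptions chosen for readability on the map.

```python
def marker_radius(n_hotels, base=3.0, scale=0.8, max_radius=15.0):
    """Map a hotel count to a circle radius in pixels, capped at max_radius."""
    return min(base + scale * float(n_hotels), max_radius)


# Hotel counts taken from the kochi_merged rows above.
for name, hotels in [("Alangad", 0.0), ("Angamaly", 4.0), ("Aroor", 13.0)]:
    print(name, marker_radius(hotels))
```

To use it, add `kochi_merged['Number of Hotels']` to the `zip(...)` in the loop and pass `radius=marker_radius(hotels)` to `folium.CircleMarker` instead of the constant.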

## Step 7: Analyze each Cluster

<b>Cluster 0</b>

In [110]:
kochi_merged.loc[kochi_merged['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Latitude,Longitude,Number of Hotels,Number of Transporation points,Number of Restaurants,Number of Shopping Malls,Number of Cinemas,Cluster Labels
2,Aroor,9.93601,76.26142,13.0,0.0,30,1.0,0.0,0
7,Chilavannoor,9.96118,76.30659,10.0,0.0,32,2.0,1.0,0
12,Fort Kochi,9.95758,76.24239,13.0,0.0,31,1.0,0.0,0
13,Irumpanam,9.96687,76.3572,8.0,0.0,34,2.0,1.0,0
15,Karanakodam,9.988453,76.303426,10.0,0.0,28,2.0,1.0,0
16,Kochangadi,9.9476,76.26079,13.0,0.0,30,1.0,0.0,0
20,"Kumbalam, Ernakulam",9.9022,76.31064,11.0,0.0,29,0.0,0.0,0
21,Maradu,9.94051,76.32395,10.0,0.0,32,2.0,1.0,0
22,Mattancherry,9.95206,76.2508,12.0,0.0,30,1.0,0.0,0
23,Mulavukad,9.99914,76.26241,10.0,0.0,31,2.0,1.0,0


<b>Cluster 1</b>

In [111]:
kochi_merged.loc[kochi_merged['Cluster Labels'] == 1]

Unnamed: 0,Neighborhood,Latitude,Longitude,Number of Hotels,Number of Transporation points,Number of Restaurants,Number of Shopping Malls,Number of Cinemas,Cluster Labels
0,Alangad,10.8475,76.43609,0.0,1.0,1,0.0,0.0,1
1,Angamaly,10.20366,76.38268,4.0,1.0,8,0.0,2.0,1
3,Chellanam,9.83526,76.27029,1.0,2.0,2,1.0,0.0,1
5,"Chengamanad, Ernakulam district",10.15354,76.34068,4.0,2.0,6,0.0,2.0,1
10,Chowwara,10.12815,76.37217,4.0,3.0,7,0.0,2.0,1
17,Koonammavu,10.10769,76.26171,1.0,0.0,4,0.0,0.0,1
19,Kottuvally,10.10315,76.24615,1.0,0.0,2,0.0,0.0,1
25,Nedumbassery,10.15669,76.3778,4.0,1.0,10,0.0,2.0,1
36,Twenty20 Kizhakkambalam,10.04626,76.40411,0.0,0.0,7,0.0,0.0,1
42,Vypin,10.07538,76.20702,1.0,0.0,1,0.0,0.0,1


<b>Cluster 2</b>

In [112]:
kochi_merged.loc[kochi_merged['Cluster Labels'] == 2]

Unnamed: 0,Neighborhood,Latitude,Longitude,Number of Hotels,Number of Transporation points,Number of Restaurants,Number of Shopping Malls,Number of Cinemas,Cluster Labels
6,Cheranallur,10.06085,76.28901,3.0,0.0,19,2.0,1.0,2
8,Choornikkara,10.08132,76.34155,4.0,0.0,17,2.0,1.0,2
18,Kothad,10.05561,76.27164,7.0,0.0,22,2.0,1.0,2
29,Pathalam,10.07817,76.31857,2.0,0.0,17,2.0,1.0,2


<b>Cluster 3</b>

In [113]:
kochi_merged.loc[kochi_merged['Cluster Labels'] == 3]

Unnamed: 0,Neighborhood,Latitude,Longitude,Number of Hotels,Number of Transporation points,Number of Restaurants,Number of Shopping Malls,Number of Cinemas,Cluster Labels
4,Chendamangalam,10.022316,76.339358,5.0,0.0,35,2.0,1.0,3
9,Chottanikkara,9.939097,76.375036,5.0,3.0,30,0.0,0.0,3
11,Edathala,10.08641,76.38181,6.0,0.0,25,0.0,0.0,3
14,Kadamakkudy,10.06352,76.2466,4.0,0.0,27,3.0,1.0,3
32,Thiruvankulam,9.94635,76.36746,6.0,3.0,27,1.0,0.0,3
34,Thrikkakkara South,10.03324,76.32519,5.0,0.0,35,2.0,1.0,3
39,Varappuzha,10.08261,76.27041,2.0,0.0,28,3.0,1.0,3
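
Reading four separate tables makes the clusters hard to compare. A per-cluster average of each count column gives a compact summary. The sketch below uses two rows per cluster, copied from the tables above, so its numbers illustrate the method only and are not the full-data averages.

```python
import pandas as pd

# Two rows per cluster, copied from the cluster tables above.
sample = pd.DataFrame({
    "Neighborhood": ["Aroor", "Chilavannoor", "Alangad", "Angamaly",
                     "Cheranallur", "Kothad", "Chendamangalam", "Kadamakkudy"],
    "Number of Hotels": [13, 10, 0, 4, 3, 7, 5, 4],
    "Number of Restaurants": [30, 32, 1, 8, 19, 22, 35, 27],
    "Number of Shopping Malls": [1, 2, 0, 0, 2, 2, 2, 3],
    "Cluster Labels": [0, 0, 1, 1, 2, 2, 3, 3],
})

# Average each count column within every cluster.
summary = (
    sample.drop(columns="Neighborhood")
          .groupby("Cluster Labels")
          .mean()
)
print(summary)
```

On the full data, `kochi_merged.groupby('Cluster Labels').mean(numeric_only=True)` would produce the same kind of table for all 44 neighborhoods.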


## Observations and Results


During the analysis, four clusters were defined.

1. Cluster 0 has the most hotels, and its neighborhoods are the closest to the city center.
2. Cluster 0 also has a large number of shopping malls, cinemas, and restaurants. This suggests that hotel locations go hand in hand with facilities such as restaurants and malls.
3. Cluster 1 has very few hotels, which means less competition, but it also has very few other facilities such as restaurants and malls.
4. Cluster 2 has few hotels, which means less competition, and its neighborhoods have a considerable number of restaurants and malls.
5. Cluster 3 also has few hotels, which means less competition, and its neighborhoods have a large number of restaurants and malls.


## Conclusion

To conclude, a basic data analysis was performed to identify the most suitable locations for opening a hotel business in the city of Kochi. During the analysis, several important features of the neighborhoods were explored and visualized.

Based on the analysis, the following clusters and neighborhoods are recommended:
1. Cluster 3 looks promising, with little competition and customer-attracting facilities such as restaurants, malls, and cinemas. Consider locations like Kadamakkudy and Thiruvankulam.
2. If the client is ready to accept the competition and provide good-quality service, the city center (Cluster 0), where most of the hotels are located, is also worth considering. Consider locations like Irumpanam and Vaduthala.