<h1>Capstone Project - Battle of Neighbourhoods </h1>

<h3>Introduction to Business Problem</h3>

<h4>Opening a new Italian Restaurant in Hyderabad, India</h4>

Hyderabad is a cosmopolitan city spread across roughly 7,250 km² and is the fourth most populous city in India. The city is known for its rich heritage and is home to restaurants serving many cuisines. The objective of this project is to identify the best location for opening an Italian restaurant in Hyderabad, India, with minimum competition.


<h3>Data</h3>

The data sources for the project are: <br/>

Wikimedia Commons: information about areas/localities in Hyderabad <br/>
Geocoder (ArcGIS provider): coordinates of the different locations <br/>
Foursquare API: list of venues and their details around a given location <br/>

<h3>Methodology</h3>

<ol>
    <li>Get the latitude/longitude coordinates of Hyderabad.</li>
    <li>Get the list of localities in Hyderabad and their coordinates.</li>
    <li>Explore the venues and venue categories in the target localities.</li>
    <li>Cluster the localities in Hyderabad.</li>
    <li>Analyze the clusters formed.</li>
    <li>Collect information about the Italian restaurants already present in each cluster.</li>
    <li>Compare the clusters and recommend the cluster with minimum competition.</li>
</ol>

<h3>1. Importing required libraries</h3>

In [1]:
import numpy as np
import pandas as pd
from geopy.geocoders import Nominatim
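# Install geocoder on the fly only if the import fails
try:
    import geocoder
except ImportError:
    !pip install geocoder
    import geocoder
import requests
from bs4 import BeautifulSoup
# Same fallback for folium
try:
    import folium
except ImportError:
    !pip install folium
    import folium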
from sklearn.cluster import KMeans
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics
import matplotlib as mpl
import matplotlib.pyplot as plt

print("Libraries Installed")


<h3>2. Hyderabad Location Details</h3>

In [2]:
g = geocoder.arcgis('Hyderabad, India')
h_lat = g.latlng[0]
h_lng = g.latlng[1]
print("Hyderabad Latitude & Longitude are {} and {}".format(h_lat, h_lng))

Hyderabad Latitude & Longitude are 17.394870000000026 and 78.47076000000004


<h3>3. List of Localities in Hyderabad from Wikimedia</h3>

In [3]:
loca = requests.get("https://commons.wikimedia.org/wiki/Category:Suburbs_of_Hyderabad,_India").text

In [5]:
soup = BeautifulSoup(loca, 'html.parser')

In [6]:
localist = []

In [7]:
for i in soup.find_all('div', class_='mw-category')[0].find_all('a'):
    localist.append(i.text)

#Creating a dataframe from the list
loca_df = pd.DataFrame({"Locality": localist})
loca_df.head()

Unnamed: 0,Locality
0,Abids
1,Alwal
2,"Ameerpet, Hyderabad"
3,"Bandlaguda, Rangareddy"
4,Banjara Hills


In [8]:
loca_df.shape

(54, 1)
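
Some of the scraped names carry a disambiguation suffix (for example "Ameerpet, Hyderabad" or "Bandlaguda, Rangareddy"). The geocoding step in this notebook formats its query as '{}, Hyderabad, India', so such names produce queries like "Ameerpet, Hyderabad, Hyderabad, India". Whether this affects the ArcGIS result has not been verified here. The sketch below shows one possible clean-up, assuming that everything after the first comma is a suffix; the helper `clean_locality` and the column "Locality Clean" are hypothetical names, not part of the original notebook.

```python
import pandas as pd

# Sample of the scraped names shown in the output above
sample_df = pd.DataFrame({"Locality": ["Abids", "Alwal", "Ameerpet, Hyderabad",
                                       "Bandlaguda, Rangareddy", "Banjara Hills"]})

def clean_locality(name):
    """Keep only the text before the first comma and trim whitespace."""
    return name.split(",")[0].strip()

# "Locality Clean" is a hypothetical helper column used only for this illustration
sample_df["Locality Clean"] = sample_df["Locality"].apply(clean_locality)
print(sample_df)
```

Keeping the original column alongside the cleaned one makes it easy to trace each row back to its Wikimedia title.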

<h3>4. To get Co-ordinates of Localities</h3>

In [9]:
def get_location(localities):
    g = geocoder.arcgis('{}, Hyderabad, India'.format(localities))
    get_latlng = g.latlng
    return get_latlng

In [10]:
co_ordinates = []
for i in loca_df["Locality"].tolist():
    co_ordinates.append(get_location(i))
print(co_ordinates)

[[17.389800000000037, 78.47658000000007], [17.535430000000076, 78.54427000000004], [17.435350000000028, 78.44861000000003], [17.299820000000068, 78.46495000000004], [17.415350000000046, 78.43435000000005], [17.40211000000005, 78.47770000000008], [17.447290000000066, 78.45396000000005], [17.40954000000005, 78.57896000000005], [17.51911000000007, 78.50153000000006], [17.394870000000026, 78.47076000000004], [17.40301000000005, 78.49792000000008], [17.366180000000043, 78.48736000000008], [17.368570000000034, 78.53515000000004], [17.409950000000038, 78.48229000000003], [17.45333000000005, 78.43034000000006], [17.43181000000004, 78.38636000000008], [17.522760000000062, 78.43862000000007], [17.463205559882393, 78.62119384031365], [17.389410000000055, 78.40406000000007], [17.32707000000005, 78.60533000000004], [17.429230000000075, 78.37495000000007], [17.399230000000045, 78.48073000000005], [17.36838000000006, 78.39999000000006], [17.42865000000006, 78.39762000000007], [17.386880000000076, 78.

In [11]:
co_ordinates[:5]

[[17.389800000000037, 78.47658000000007],
 [17.535430000000076, 78.54427000000004],
 [17.435350000000028, 78.44861000000003],
 [17.299820000000068, 78.46495000000004],
 [17.415350000000046, 78.43435000000005]]

In [12]:
co_ordinates_df = pd.DataFrame(co_ordinates, columns=['Latitudes', 'Longitudes'])

In [13]:
loca_df["Latitudes"] = co_ordinates_df["Latitudes"]
loca_df["Longitudes"] = co_ordinates_df["Longitudes"]

In [14]:
print("The shape of loca_df is {}".format(loca_df.shape))
loca_df.head()

The shape of loca_df is (54, 3)


Unnamed: 0,Locality,Latitudes,Longitudes
0,Abids,17.3898,78.47658
1,Alwal,17.53543,78.54427
2,"Ameerpet, Hyderabad",17.43535,78.44861
3,"Bandlaguda, Rangareddy",17.29982,78.46495
4,Banjara Hills,17.41535,78.43435
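
Geocoders occasionally return a point far from the intended place, so a quick sanity check on the coordinates can be useful before querying venues. The sketch below computes the great-circle (haversine) distance from the city coordinates obtained in section 2 to two of the localities above. The 40 km cut-off `MAX_KM` is a hypothetical threshold chosen for illustration, not a value used in the notebook.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

centre = (17.39487, 78.47076)   # city coordinates printed in section 2 (rounded)
abids = (17.3898, 78.47658)     # from the table above
alwal = (17.53543, 78.54427)    # from the table above
MAX_KM = 40                     # hypothetical cut-off for flagging suspicious points

d_abids = haversine_km(*centre, *abids)
d_alwal = haversine_km(*centre, *alwal)
for name, dist in [("Abids", d_abids), ("Alwal", d_alwal)]:
    flag = "check" if dist > MAX_KM else "ok"
    print(f"{name}: {dist:.1f} km from the centre ({flag})")
```

Applied to the full dataframe, the same function could flag rows that need a manual look before the venue search.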


<h3>5. Plotting the Localities on map</h3>

In [16]:
hyd_map = folium.Map(location=[h_lat, h_lng],zoom_start=11)

folium.Marker([h_lat, h_lng], popup='<i>Hyderabad</i>', color='red', tooltip="Click to see").add_to(hyd_map)

for latitude,longitude,name in zip(loca_df["Latitudes"], loca_df["Longitudes"], loca_df["Locality"]):
    folium.CircleMarker(
        [latitude, longitude],
        radius=6,
        color='blue',
        popup=name,
        fill=True,
        fill_color='#3186ff'
    ).add_to(hyd_map)

hyd_map

<h3>6. Using Foursquare API to explore the localities</h3>

In [21]:
# Foursquare credentials: replace the placeholders with your own keys and keep real keys out of shared notebooks
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID'
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET'
VERSION = '20180605' 
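
Rather than typing the keys into the notebook, they can be read from environment variables. The following is a minimal sketch of that approach; the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are assumptions made for this example, not names defined by Foursquare or by the original notebook.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from a mapping, or raise if one is missing."""
    # The two variable names below are hypothetical; use whatever your setup defines
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Set the environment variable {missing} before running") from None

# Example with an explicit mapping instead of the real environment
demo_id, demo_secret = load_foursquare_credentials(
    {"FOURSQUARE_CLIENT_ID": "demo-id", "FOURSQUARE_CLIENT_SECRET": "demo-secret"}
)
print(demo_id, demo_secret)
```

In the notebook, `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` would then replace the two literal assignments.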

In [22]:
#Getting the top 100 venues in each locality
radius = 2000
LIMIT = 100

venues = []

for lat, lng, locality in zip(loca_df["Latitudes"], loca_df["Longitudes"], loca_df["Locality"]):
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, lat, lng, VERSION, radius, LIMIT)
    results = requests.get(url).json()['response']['groups'][0]['items']

    for venue in results:
        venues.append((locality, lat, lng, venue['venue']['name'], venue['venue']['location']['lat'], venue['venue']['location']['lng'], venue['venue']['categories'][0]['name'], venue['venue']['id']))

In [23]:
venues[0]

('Abids',
 17.389800000000037,
 78.47658000000007,
 'Pragati',
 17.38808781386729,
 78.48113363131787,
 'South Indian Restaurant',
 '4dc4a65e18506de4adc5d5e5')
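
The request loop above indexes straight into `['response']['groups'][0]['items']` and `['categories'][0]`, so a failed request or a venue without a category would raise an exception. The sketch below shows one defensive way to flatten a single response into the same tuple layout. The helper `extract_venues` and the fallback label "Uncategorized" are hypothetical, and the payload is hand-written to mimic the structure the loop relies on; its second venue is made up.

```python
def extract_venues(payload, locality, lat, lng):
    """Flatten one explore response into tuples shaped like the rows built in the loop."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # "Uncategorized" is a hypothetical fallback for venues without a category
        category = categories[0]["name"] if categories else "Uncategorized"
        rows.append((locality, lat, lng, venue["name"],
                     venue["location"]["lat"], venue["location"]["lng"],
                     category, venue["id"]))
    return rows

# Hand-written payload; the first venue matches the row shown above, the second is invented
fake_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Pragati", "id": "4dc4a65e18506de4adc5d5e5",
               "location": {"lat": 17.38808781386729, "lng": 78.48113363131787},
               "categories": [{"name": "South Indian Restaurant"}]}},
    {"venue": {"name": "Example Venue", "id": "example-id",
               "location": {"lat": 17.39, "lng": 78.48}, "categories": []}},
]}]}}

rows = extract_venues(fake_payload, "Abids", 17.3898, 78.47658)
empty = extract_venues({"meta": {"code": 400}}, "Abids", 17.3898, 78.47658)
print(rows)
print(empty)
```

A response without the expected keys yields an empty list instead of stopping the whole loop.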

In [24]:
venues_df = pd.DataFrame(venues)
venues_df.columns = ['Locality', 'Latitude', 'Longitude', 'Venue name', 'Venue Lat', 'Venue Lng', 'Venue Category', 'Venue ID']
venues_df.head()

Unnamed: 0,Locality,Latitude,Longitude,Venue name,Venue Lat,Venue Lng,Venue Category,Venue ID
0,Abids,17.3898,78.47658,Pragati,17.388088,78.481134,South Indian Restaurant,4dc4a65e18506de4adc5d5e5
1,Abids,17.3898,78.47658,Mayur Pan Shop,17.388894,78.480578,Juice Bar,4cdd08d4fc973704fe47d905
2,Abids,17.3898,78.47658,Santosh Dhaba,17.388485,78.479509,Indian Restaurant,4d3d4eca14aa8cfaa6d6b15e
3,Abids,17.3898,78.47658,Taj Mahal Hotel,17.391942,78.476915,Hotel,4be06fbf4c55b651426feab7
4,Abids,17.3898,78.47658,Karachi Bakery,17.383454,78.475075,Bakery,4bffe44ec30a2d7fbc9a111d


In [25]:
venues_df.shape

(2289, 8)

In [27]:
res_df = pd.DataFrame({'Venue Category': venues_df['Venue Category'], 'Strength': venues_df['Venue Category']})
res_df = res_df.groupby(['Venue Category']).count()
res_df = res_df.sort_values(['Strength'], ascending=False)
print(res_df.head())

                      Strength
Venue Category                
Indian Restaurant          319
Café                       119
Fast Food Restaurant       104
Hotel                      104
Coffee Shop                 83


In [28]:
res_df.shape

(150, 1)
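
The category counts can also be produced with `value_counts`, which counts and sorts in one step. The sketch below uses a small toy column in place of `venues_df['Venue Category']`; the toy values are illustrative only.

```python
import pandas as pd

# Toy stand-in for venues_df['Venue Category']; the values are illustrative only
toy = pd.DataFrame({"Venue Category": ["Indian Restaurant", "Café", "Indian Restaurant",
                                        "Hotel", "Indian Restaurant", "Café"]})

counts = (toy["Venue Category"]
          .value_counts()                 # counts, sorted in descending order by default
          .rename("Strength")             # name the resulting column
          .rename_axis("Venue Category")  # name the index like the groupby result
          .to_frame())
print(counts)
```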

In [29]:
# Top 50 categories as a two-column dataframe (Venue Category, Strength)
dff = res_df.head(50).reset_index()
dff.head()

Unnamed: 0,Venue Category,Strength
0,Indian Restaurant,319
1,Café,119
2,Fast Food Restaurant,104
3,Hotel,104
4,Coffee Shop,83


In [30]:
#List of the 50 most common venue categories in Hyderabad
cat_res_list = res_df.index[0:50]
cat_res_list

Index(['Indian Restaurant', 'Café', 'Fast Food Restaurant', 'Hotel',
       'Coffee Shop', 'Bakery', 'Pizza Place', 'Multiplex', 'Ice Cream Shop',
       'Chinese Restaurant', 'Restaurant', 'Vegetarian / Vegan Restaurant',
       'Department Store', 'Sandwich Place', 'Dessert Shop', 'Lounge',
       'Breakfast Spot', 'Clothing Store', 'Asian Restaurant', 'Movie Theater',
       'South Indian Restaurant', 'Snack Place', 'Shopping Mall', 'Hookah Bar',
       'Italian Restaurant', 'Juice Bar', 'Bookstore', 'Diner',
       'Hyderabadi Restaurant', 'Stadium', 'Park', 'Hotel Bar',
       'Middle Eastern Restaurant', 'Food Court', 'Gym', 'Bar', 'Pub',
       'Train Station', 'BBQ Joint', 'Bus Station', 'Shoe Store',
       'Indie Movie Theater', 'Chaat Place', 'Performing Arts Venue',
       'Farmers Market', 'Burger Joint', 'Garden', 'Science Museum',
       'Smoke Shop', 'Convenience Store'],
      dtype='object', name='Venue Category')

In [31]:
venue_final = venues_df[venues_df['Venue Category'].isin(['Indian Restaurant', 'Café', 'Ice Cream Shop', 'Fast Food Restaurant',
       'Pizza Place', 'Coffee Shop', 'Hotel', 'Chinese Restaurant', 'Lounge',
       'Italian Restaurant', 'Bakery', 'Pub', 'Restaurant',
       'Asian Restaurant', 'Breakfast Spot', 'Bar', 'Brewery', 'Burger Joint',
       'Shopping Mall', 'Sandwich Place', 'Vegetarian / Vegan Restaurant',
       'BBQ Joint', 'Snack Place', 'Park', 'Juice Bar',
       'South Indian Restaurant', 'Tea Room',
       'Middle Eastern Restaurant', 'Dessert Shop', 'Donut Shop', 'Bookstore',
       'Multiplex', 'Cocktail Bar',
       'Seafood Restaurant', 'Mexican Restaurant', 'French Restaurant',
       'Andhra Restaurant', 'Korean Restaurant', 'Cupcake Shop',
       'Karnataka Restaurant', 'Steakhouse', 'Boutique', 'Liquor Store',
       'Arcade', 'Deli / Bodega', 'Bus Station'])]

<h3>7. Analyzing the Localities according to the venues</h3>

In [32]:
hyd_onehot = pd.get_dummies(venues_df[['Venue Category']], prefix="", prefix_sep="")

hyd_onehot['Locality'] = venues_df['Locality']

hyd_onehot = hyd_onehot[ [ 'Locality' ] + [ col for col in hyd_onehot.columns if col!='Locality' ] ]
hyd_onehot.head()

Unnamed: 0,Locality,ATM,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Lounge,Airport Service,American Restaurant,Arcade,...,Supermarket,Taxi Stand,Tea Room,Tech Startup,Temple,Tex-Mex Restaurant,Thai Restaurant,Train Station,Vegetarian / Vegan Restaurant,Wings Joint
0,Abids,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Abids,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Abids,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Abids,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Abids,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


<h4>Grouping the categories</h4>

In [33]:
hyd_grouped = hyd_onehot.groupby(['Locality']).mean().reset_index()
print(hyd_grouped.shape)
hyd_grouped.head()

(52, 151)


Unnamed: 0,Locality,ATM,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Lounge,Airport Service,American Restaurant,Arcade,...,Supermarket,Taxi Stand,Tea Room,Tech Startup,Temple,Tex-Mex Restaurant,Thai Restaurant,Train Station,Vegetarian / Vegan Restaurant,Wings Joint
0,Abids,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.012195,0.0
1,Alwal,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Ameerpet, Hyderabad",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.05,0.0
3,"Bandlaguda, Rangareddy",0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Banjara Hills,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,...,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0
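
Because the one-hot columns contain only 0 and 1, the per-locality mean is the fraction of that locality's returned venues that fall in each category. For example, the value 0.012195 for Abids equals 1/82, which means one venue out of the 82 returned for that locality. The toy example below illustrates the difference between `mean()` (share) and `sum()` (count); the localities "A" and "B" and their venues are made up.

```python
import pandas as pd

# Toy data: locality "A" has 4 venues, one of them Italian; "B" has 2 venues, none Italian
toy = pd.DataFrame({
    "Locality": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Italian Restaurant", "Café", "Hotel", "Café", "Café", "Bakery"],
})
onehot = pd.get_dummies(toy[["Venue Category"]], prefix="", prefix_sep="", dtype=int)
onehot["Locality"] = toy["Locality"]

share = onehot.groupby("Locality").mean()  # fraction of venues per category
count = onehot.groupby("Locality").sum()   # absolute number of venues per category
print(share["Italian Restaurant"])
print(count["Italian Restaurant"])
```

Since the share depends on how many venues were returned for a locality, two localities with the same share can have different absolute numbers of Italian restaurants.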


In [34]:
#numbers of localities having Italian Restaurants
len(hyd_grouped[hyd_grouped['Italian Restaurant'] > 0])

8

In [37]:
hyd_italian = hyd_grouped[['Locality', 'Italian Restaurant']]
hyd_italian.head()

Unnamed: 0,Locality,Italian Restaurant
0,Abids,0.0
1,Alwal,0.0
2,"Ameerpet, Hyderabad",0.0
3,"Bandlaguda, Rangareddy",0.0
4,Banjara Hills,0.04


In [38]:
hyd_map = folium.Map(location=[h_lat, h_lng],zoom_start=11)


folium.Marker([h_lat, h_lng], popup='<i>Hyderabad</i>', color='red', tooltip="Click to see").add_to(hyd_map)


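# Merge on Locality so every circle is sized by its own locality's share.
# hyd_italian has 52 rows while loca_df has 54, so zipping them position by
# position can pair a locality with another locality's value.
italian_map_df = hyd_italian.merge(loca_df, on='Locality')

for latitude,longitude,name,strength in zip(italian_map_df["Latitudes"], italian_map_df["Longitudes"], italian_map_df["Locality"], italian_map_df["Italian Restaurant"]):
    folium.CircleMarker(
        [latitude, longitude],
        radius=strength*300,
        color='green',
        popup=name,
        fill=True,
        fill_color='#3186ff'
    ).add_to(hyd_map)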

hyd_map

<h3>8. Clustering The Localities</h3>

The K-means clustering algorithm is used to group the localities in Hyderabad based on the share of Italian restaurants among the venues returned for each locality.

In [55]:
cluster = 5

#Dataframe for clustering
hyd_clustering = hyd_italian.drop(columns=['Locality'])

#run K-means clustering
k_means = KMeans(init="k-means++", n_clusters=cluster, n_init=12).fit(hyd_clustering)

#getting the labels for the first 10 localities
print(k_means.labels_[0:10])

[0 0 0 0 4 0 0 0 0 0]
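
The notebook fixes the number of clusters at 5 without showing how that value was chosen. One common heuristic is the elbow method, which compares the within-cluster sum of squares (inertia) for several values of k. The sketch below is an illustration only: the `shares` array is rebuilt from the seven non-zero values listed in the cluster tables of section 9, with the remaining 45 localities approximated as 0.0, and `random_state=0` is added for repeatability. Both are assumptions rather than part of the original run.

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative shares: the seven non-zero values from the cluster tables,
# with the remaining 45 localities approximated as 0.0 (an assumption)
shares = np.array([0.0] * 45 + [0.02, 0.02, 0.024691, 0.03, 0.03, 0.04, 0.06]).reshape(-1, 1)

inertias = []
for k in range(1, 6):
    model = KMeans(n_clusters=k, init="k-means++", n_init=12, random_state=0).fit(shares)
    inertias.append(model.inertia_)  # within-cluster sum of squared distances

for k, inertia in zip(range(1, 6), inertias):
    print(f"k={k}: inertia={inertia:.6f}")
```

The value of k after which the inertia stops dropping sharply is a reasonable candidate; with so few distinct values in the data, a smaller k than 5 may already separate the groups.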


In [56]:
hyd_labels = hyd_italian.copy()
hyd_labels["Cluster Label"] = k_means.labels_

hyd_labels.head()

Unnamed: 0,Locality,Italian Restaurant,Cluster Label
0,Abids,0.0,0
1,Alwal,0.0,0
2,"Ameerpet, Hyderabad",0.0,0
3,"Bandlaguda, Rangareddy",0.0,0
4,Banjara Hills,0.04,4
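
K-means assigns label numbers arbitrarily, so which label means "highest share" can change between runs. One way to make the labels meaningful is to renumber them by ascending cluster centre, so that 0 is always the lowest share. The sketch below shows this on a small array of illustrative values taken from this notebook's outputs; the names `order`, `remap` and `ordered_labels` are hypothetical, and `random_state=0` is an added assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative shares in ascending order (values taken from the notebook's outputs)
shares = np.array([0.0, 0.0, 0.0, 0.02, 0.024691, 0.03, 0.03, 0.04, 0.06]).reshape(-1, 1)
km = KMeans(n_clusters=5, init="k-means++", n_init=12, random_state=0).fit(shares)

# Rank the centroids from lowest to highest share, then map old label -> rank
order = np.argsort(km.cluster_centers_.ravel())
remap = np.empty_like(order)
remap[order] = np.arange(len(order))
ordered_labels = remap[km.labels_]

print(km.labels_)       # arbitrary numbering
print(ordered_labels)   # 0 = lowest share, 4 = highest share
```

With ordered labels, descriptions such as "low", "medium" and "highest" density can be tied to fixed label numbers.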


In [57]:
hyd_labels = hyd_labels.join(loca_df.set_index('Locality'), on='Locality')
hyd_labels.head()

Unnamed: 0,Locality,Italian Restaurant,Cluster Label,Latitudes,Longitudes
0,Abids,0.0,0,17.3898,78.47658
1,Alwal,0.0,0,17.53543,78.54427
2,"Ameerpet, Hyderabad",0.0,0,17.43535,78.44861
3,"Bandlaguda, Rangareddy",0.0,0,17.29982,78.46495
4,Banjara Hills,0.04,4,17.41535,78.43435


In [58]:
hyd_labels.sort_values(["Cluster Label"], inplace=True)
hyd_labels.head()

Unnamed: 0,Locality,Italian Restaurant,Cluster Label,Latitudes,Longitudes
0,Abids,0.0,0,17.3898,78.47658
27,L. B. Nagar,0.0,0,17.51265,78.44129
29,Malakpet,0.0,0,17.37493,78.51567
30,Malkajgiri,0.0,0,17.44737,78.5352
31,Manikonda,0.0,0,17.40139,78.39163


In [59]:
#Cleaning the dataframe for mapping the localities according to their cluster labels
hyd_only_labels = hyd_labels.drop(columns=['Italian Restaurant','Latitudes','Longitudes'])
hyd_only_labels.head()

Unnamed: 0,Locality,Cluster Label
0,Abids,0
27,L. B. Nagar,0
29,Malakpet,0
30,Malkajgiri,0
31,Manikonda,0


In [61]:
#Plot the cluster on map
cluster_map = folium.Map(location=[h_lat, h_lng],zoom_start=11)

folium.Marker([h_lat, h_lng], popup='<i>Hyderabad</i>', color='red', tooltip="Click to see").add_to(cluster_map)

#Getting the colors for the clusters
col = ['red', 'green', 'blue','orange','yellow']

#markers for localities
for latitude,longitude,name,clus in zip(hyd_labels["Latitudes"], hyd_labels["Longitudes"], hyd_labels["Locality"], hyd_labels["Cluster Label"]):
    label = folium.Popup(name + ' - Cluster ' + str(clus+1))
    folium.CircleMarker(
        [latitude, longitude],
        radius=6,
        color=col[clus],
        popup=label,
        fill=True,
        fill_color=col[clus],
        fill_opacity=0.3
    ).add_to(cluster_map)
       
cluster_map

<h3>9. Analyzing The Cluster</h3>

In [68]:
#Cluster 1
#Dataframe containing localities with cluster label 0, which corresponds to localities with virtually no Italian restaurants
cluster_1 = hyd_labels[hyd_labels['Cluster Label'] == 0]
print("There are {} localities in cluster-1".format(cluster_1.shape[0]))
mean_presence_1 = cluster_1['Italian Restaurant'].mean()
print("The mean occurrence of Italian restaurant in cluster-1 is {0:.2f}".format(mean_presence_1))
cluster_1.head()

There are 45 localities in cluster-1
The mean occurrence of Italian restaurant in cluster-1 is 0.00


Unnamed: 0,Locality,Italian Restaurant,Cluster Label,Latitudes,Longitudes
0,Abids,0.0,0,17.3898,78.47658
27,L. B. Nagar,0.0,0,17.51265,78.44129
29,Malakpet,0.0,0,17.37493,78.51567
30,Malkajgiri,0.0,0,17.44737,78.5352
31,Manikonda,0.0,0,17.40139,78.39163


In [69]:
#Cluster 2
#Dataframe containing localities with cluster label 1, which corresponds to localities with a moderate density of Italian restaurants
cluster_2 = hyd_labels[hyd_labels['Cluster Label'] == 1]
print("There are {} localities in cluster-2".format(cluster_2.shape[0]))
mean_presence_2 = cluster_2['Italian Restaurant'].mean()
print("The mean occurrence of Italian restaurant in cluster-2 is {0:.2f}".format(mean_presence_2))
cluster_2.head()

There are 2 localities in cluster-2
The mean occurrence of Italian restaurant in cluster-2 is 0.03


Unnamed: 0,Locality,Italian Restaurant,Cluster Label,Latitudes,Longitudes
28,Madhapur,0.03,1,17.45694,78.39013
22,Jubilee Hills,0.03,1,17.42865,78.39762


In [70]:
#Cluster 3
#Dataframe containing localities with cluster label 2, which corresponds to localities with the highest density of Italian restaurants
cluster_3 = hyd_labels[hyd_labels['Cluster Label'] == 2]
print("There are {} localities in cluster-3".format(cluster_3.shape[0]))
mean_presence_3 = cluster_3['Italian Restaurant'].mean()
print("The mean occurrence of Italian restaurant in cluster-3 is {0:.2f}".format(mean_presence_3))
cluster_3.head()

There are 1 localities in cluster-3
The mean occurrence of Italian restaurant in cluster-3 is 0.06


Unnamed: 0,Locality,Italian Restaurant,Cluster Label,Latitudes,Longitudes
15,Gachibowli,0.06,2,17.43181,78.38636


In [74]:
#Cluster 4
#Dataframe containing localities with cluster label 3, which corresponds to localities with a low density of Italian restaurants
cluster_4 = hyd_labels[hyd_labels['Cluster Label'] == 3]
print("There are {} localities in cluster-4".format(cluster_4.shape[0]))
mean_presence_4 = cluster_4['Italian Restaurant'].mean()
print("The mean occurrence of Italian restaurant in cluster-4 is {0:.2f}".format(mean_presence_4))
cluster_4.head()

There are 3 localities in cluster-4
The mean occurrence of Italian restaurant in cluster-4 is 0.02


Unnamed: 0,Locality,Italian Restaurant,Cluster Label,Latitudes,Longitudes
24,Khairtabad,0.02,3,17.40592,78.45856
32,Masab Tank,0.02,3,17.40093,78.45362
19,HITEC City,0.024691,3,17.42923,78.37495


In [86]:
#Cluster 5
#Dataframe containing localities with cluster label 4, which corresponds to localities with a medium density of Italian restaurants
cluster_5 = hyd_labels[hyd_labels['Cluster Label'] == 4]
print("There are {} localities in cluster-5".format(cluster_5.shape[0]))
mean_presence_5 = cluster_5['Italian Restaurant'].mean()
print("The mean occurrence of Italian restaurant in cluster-5 is {0:.2f}".format(mean_presence_5))
cluster_5.head()

There are 1 localities in cluster-5
The mean occurrence of Italian restaurant in cluster-5 is 0.04


Unnamed: 0,Locality,Italian Restaurant,Cluster Label,Latitudes,Longitudes
4,Banjara Hills,0.04,4,17.41535,78.43435
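
The five per-cluster cells above repeat the same steps, so the same summary could be obtained with a single `groupby`. The sketch below rebuilds a small sample from the cluster tables (only two of the cluster-1 rows are included, so its counts differ from the full data); the names `sample_labels`, `localities` and `mean_share` are hypothetical.

```python
import pandas as pd

# Rows copied from the cluster tables above (only two of the cluster-1 rows are included)
rows = [
    ("Abids", 0.0, 0), ("Malakpet", 0.0, 0),
    ("Madhapur", 0.03, 1), ("Jubilee Hills", 0.03, 1),
    ("Gachibowli", 0.06, 2),
    ("Khairtabad", 0.02, 3), ("Masab Tank", 0.02, 3), ("HITEC City", 0.024691, 3),
    ("Banjara Hills", 0.04, 4),
]
sample_labels = pd.DataFrame(rows, columns=["Locality", "Italian Restaurant", "Cluster Label"])

# One row per cluster: number of localities and mean share, lowest share first
summary = (sample_labels.groupby("Cluster Label")["Italian Restaurant"]
           .agg(localities="count", mean_share="mean")
           .sort_values("mean_share"))
print(summary)
```

In the notebook itself, replacing `sample_labels` with `hyd_labels` would give the counts and means for all 52 localities at once.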


# Results & Observations

<ul>
    <li>Analysing the clusters, we can infer that cluster-1 (shown in red) has virtually no Italian restaurants, while cluster-3 (shown in blue) has the highest share of them.</li>
    <li>A moderate share of Italian restaurants is found in cluster-2 and cluster-5 (shown in green and yellow respectively), and a low share in cluster-4 (shown in orange).</li>
    <li>The analysis shows that most of the Italian restaurants are spread across the central to north-western region of the city. There are hardly any Italian restaurants in the eastern part of the city, and none in the outer parts.</li>
</ul>
 
        

# Conclusion

<ul>
    <li>Cluster-1 and cluster-4 are the best options for outside investors looking to open an Italian restaurant in Hyderabad.</li>
    <li>Because cluster-1 has virtually no Italian restaurants, a new restaurant there would need to spend more on marketing. Cluster-4 already has a few Italian restaurants but very low competition, so a new restaurant there could attract customers who already visit other Italian restaurants as well as new ones.</li>
</ul>