# The Battle of Neighborhoods 2: Boroughs of Bangalore City

## Introduction

In this section of the capstone project, we use the Foursquare API to explore neighborhoods in Bangalore City. We call the `explore` endpoint to retrieve the venues around each neighborhood, count the restaurant categories found there, and use those counts as features to group the neighborhoods into clusters with the k-means algorithm. Finally, we use the Folium library to visualize the neighborhoods in Bangalore City and the resulting clusters.

## 1. Download and Explore Dataset

In [1]:
import numpy as np
import pandas as pd
import requests
import types
url='https://raw.githubusercontent.com/ravindrasinghchouhan/Coursera_Capstone/master/Bangalore_pincode_geolocation.csv'
df = pd.read_csv(url)
df.head(5)

Unnamed: 0,Pincode,Location,Latitude,Longitude
0,560001,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",12.987485,77.604906
1,560002,"Bangalore Fort, Bangalore City, Bangalore Corp...",12.971599,77.594563
2,560003,"Vyalikaval Extn, Malleswaram, Palace Guttahall...",13.00835,77.56145
3,560004,"Pasmpamahakavi Road, Basavanagudi, Shankarpura...",12.9454,77.5776
4,560005,"Jeevanahalli, Fraser Town",13.0713,77.5905


In [2]:
df.shape

(99, 4)

## Use the geopy library to get the latitude and longitude of Bangalore City

In [3]:
from geopy.geocoders import Nominatim
from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import, pandas 1.0+)

In [4]:
import folium
geolocator = Nominatim(user_agent="coursera")
address = 'Bangalore'
location = geolocator.geocode(address)
if location is None:
    # geocode() returns None when the address cannot be resolved
    raise ValueError('Cannot find: {}'.format(address))
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))

my_map = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(df['Latitude'], df['Longitude'], df['Pincode']):
    label = folium.Popup(str(label), parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,  # without fill=True the fill color is not drawn
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(my_map)

my_map

The geographical coordinates of Bangalore are 12.9791198, 77.5912997.
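
When the geocoder finds no match it returns `None`, and a network failure raises an exception; either way the map still needs a center. The following is a minimal sketch of a guard with a fallback. `resolve_center` is a hypothetical helper, the fallback is simply the pair of coordinates printed above, and the `geocode` argument stands in for `geolocator.geocode`.

```python
from typing import Callable, Optional, Tuple

# Fallback copied from the coordinates printed above (an assumption:
# any reasonable city-center point would do).
FALLBACK_CENTER = (12.9791198, 77.5912997)

def resolve_center(geocode: Callable[[str], Optional[object]],
                   address: str,
                   fallback: Tuple[float, float] = FALLBACK_CENTER) -> Tuple[float, float]:
    """Return (latitude, longitude) for address, or fallback if the lookup fails."""
    try:
        location = geocode(address)
    except Exception:       # network errors, timeouts, rate limits
        return fallback
    if location is None:    # no match for the address
        return fallback
    return (location.latitude, location.longitude)

# Stand-in geocoder result so the sketch runs without network access.
class _FakeLocation:
    latitude, longitude = 12.5, 77.5

found = resolve_center(lambda _: _FakeLocation(), "Bangalore")
missing = resolve_center(lambda _: None, "Nowhere")
```

In the notebook, the call would look like `resolve_center(geolocator.geocode, 'Bangalore')`.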


In [5]:
CLIENT_ID = '5KEGUOOIP5OMLYYZGQMBLUXOAZSMTMBXYXFBIZAANXJHQ2UB' # your Foursquare ID
CLIENT_SECRET = 'IBQ5DAZY2FXGKYJLQDX40PZKCW0WFC5NUZR2R45V5CJ14C1G' # your Foursquare Secret
VERSION = '20180604'

In [6]:
df['Pincode'] = df['Pincode'].astype(str)

In [7]:
df.loc[df['Pincode'] == '560002']

Unnamed: 0,Pincode,Location,Latitude,Longitude
1,560002,"Bangalore Fort, Bangalore City, Bangalore Corp...",12.971599,77.594563


In [8]:
df.set_index('Pincode', inplace = True) 
neighborhood_latitude = df.loc['560002']['Latitude']
neighborhood_longitude = df.loc['560002']['Longitude']

## 2. Explore Neighborhoods in Bangalore City

### Extract venue data for each neighborhood in Bangalore City

In [9]:
LIMIT = 200 # limit of number of venues returned by Foursquare API
radius = 1000 # define radius
# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=5KEGUOOIP5OMLYYZGQMBLUXOAZSMTMBXYXFBIZAANXJHQ2UB&client_secret=IBQ5DAZY2FXGKYJLQDX40PZKCW0WFC5NUZR2R45V5CJ14C1G&v=20180604&ll=12.9715987,77.5945627&radius=1000&limit=200'
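
The request URL above is assembled by string formatting, which leaves a stray `&` after the `?` and does no escaping. The following is a minimal sketch of the same URL built with the standard library's `urlencode`. The helper name `build_explore_url` and the placeholder credentials are assumptions for illustration.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"

def build_explore_url(client_id, client_secret, version, lat, lng,
                      radius=1000, limit=200):
    """Build the venues/explore URL from named query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),  # latitude,longitude of the search center
        "radius": radius,                 # search radius in meters
        "limit": limit,                   # maximum number of venues requested
    }
    # safe="," keeps the comma in `ll` readable instead of encoding it as %2C
    return EXPLORE_ENDPOINT + "?" + urlencode(params, safe=",")

# Placeholder credentials (hypothetical); real ones are better read from
# environment variables than written into the notebook.
example_url = build_explore_url("CLIENT_ID", "CLIENT_SECRET", "20180604",
                                12.9715987, 77.5945627)
```

With `requests`, the same dictionary can also be passed directly as `requests.get(EXPLORE_ENDPOINT, params=params)`.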

In [10]:
results = requests.get(url).json()

In [11]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        # the flattened JSON uses the prefixed column name
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [12]:
venues = results['response']['groups'][0]['items']

In [13]:
nearby_venues = json_normalize(venues) # flatten JSON
nearby_venues

Unnamed: 0,referralId,reasons.count,reasons.items,venue.id,venue.name,venue.location.address,venue.location.lat,venue.location.lng,venue.location.labeledLatLngs,venue.location.distance,...,venue.location.city,venue.location.state,venue.location.country,venue.location.formattedAddress,venue.categories,venue.photos.count,venue.photos.groups,venue.venuePage.id,venue.location.crossStreet,venue.location.neighborhood
0,e-0-51d1245e498ef93fd0e713bb-0,0,"[{'summary': 'This spot is popular', 'type': '...",51d1245e498ef93fd0e713bb,JW Marriott Hotel Bengaluru,24/1 Vittal Mallya Road,12.972362,77.595051,"[{'label': 'display', 'lat': 12.97236177249022...",100,...,Bangalore,Karnātaka,India,"[24/1 Vittal Mallya Road, Bangalore 560001, Ka...","[{'id': '4bf58dd8d48988d1fa931735', 'name': 'H...",0,[],131922146,,
1,e-0-4bcd805cfb84c9b61512223e-1,0,"[{'summary': 'This spot is popular', 'type': '...",4bcd805cfb84c9b61512223e,UB City,at Vittal Mallya Rd,12.971709,77.595905,"[{'label': 'display', 'lat': 12.97170898069531...",146,...,Bangalore,Karnātaka,India,"[at Vittal Mallya Rd, Bangalore 560001, Karnāt...","[{'id': '4bf58dd8d48988d1fd941735', 'name': 'S...",0,[],,,
2,e-0-4bc1cd90b492d13a4e74a660-2,0,"[{'summary': 'This spot is popular', 'type': '...",4bc1cd90b492d13a4e74a660,Toscano,UB City Level 2 Concorde Block,12.971980,77.596066,"[{'label': 'display', 'lat': 12.97198038085137...",168,...,Bangalore,Karnātaka,India,[UB City Level 2 Concorde Block (24 Vittal Mal...,"[{'id': '4bf58dd8d48988d110941735', 'name': 'I...",0,[],,24 Vittal Mallya Road,
3,e-0-523de40611d2996a150886fc-3,0,"[{'summary': 'This spot is popular', 'type': '...",523de40611d2996a150886fc,J W Kitchen,Near U B City,12.972410,77.594592,"[{'label': 'display', 'lat': 12.97241038412729...",90,...,Bangalore,Karnātaka,India,"[Near U B City (Vittal Mallya Road), Bangalore...","[{'id': '4bf58dd8d48988d142941735', 'name': 'A...",0,[],,Vittal Mallya Road,
4,e-0-4b895510f964a520442c32e3-4,0,"[{'summary': 'This spot is popular', 'type': '...",4b895510f964a520442c32e3,Shiro,"3rd Flr., UB City, Vittal Mallya Rd.",12.971900,77.596236,"[{'label': 'display', 'lat': 12.97189955907753...",184,...,Bangalore,Karnātaka,India,"[3rd Flr., UB City, Vittal Mallya Rd. (Lavelle...","[{'id': '4bf58dd8d48988d111941735', 'name': 'J...",0,[],,Lavelle Rd.,Richmond
5,e-0-4baef172f964a5202ce33be3-5,0,"[{'summary': 'This spot is popular', 'type': '...",4baef172f964a5202ce33be3,Café Noir,"2nd Floor, UB City, Vittal Mallya Road, Near L...",12.971995,77.596001,"[{'label': 'display', 'lat': 12.97199474634367...",162,...,Bangalore,Karnātaka,India,"[2nd Floor, UB City, Vittal Mallya Road, Near ...","[{'id': '4bf58dd8d48988d10c941735', 'name': 'F...",0,[],,at Vittal Mallya Rd,
6,e-0-4d23471ed7b0b1f7d0552c9f-6,0,"[{'summary': 'This spot is popular', 'type': '...",4d23471ed7b0b1f7d0552c9f,Skyye,"Uber Level, 16th Flr., UB City, Vittal Mallya Rd.",12.971646,77.596242,"[{'label': 'display', 'lat': 12.97164563919374...",182,...,Bangalore,Karnātaka,India,"[Uber Level, 16th Flr., UB City, Vittal Mallya...","[{'id': '4bf58dd8d48988d121941735', 'name': 'L...",0,[],,Lavelle Rd.,Richmo
7,e-0-528f734f11d24f6d2c578d31-7,0,"[{'summary': 'This spot is popular', 'type': '...",528f734f11d24f6d2c578d31,Spice Terrace,JW Mariot,12.972254,77.595200,"[{'label': 'display', 'lat': 12.97225441231577...",100,...,Bangalore,Karnātaka,India,"[JW Mariot, Bangalore, Karnātaka, India]","[{'id': '4bf58dd8d48988d121941735', 'name': 'L...",0,[],,,
8,e-0-4c8b8c31a92fa093fe789bbf-8,0,"[{'summary': 'This spot is popular', 'type': '...",4c8b8c31a92fa093fe789bbf,Bliss Luxe Chocolate Lounge,UB City,12.971525,77.596201,"[{'label': 'display', 'lat': 12.97152504853497...",177,...,Bangalore,Karnātaka,India,"[UB City (Vittal Mallya Road), Bangalore, Karn...","[{'id': '4bf58dd8d48988d1bc941735', 'name': 'C...",0,[],,Vittal Mallya Road,
9,e-0-5332e12b11d215ddd88aa74f-9,0,"[{'summary': 'This spot is popular', 'type': '...",5332e12b11d215ddd88aa74f,JW Marriott Executive Lounge,,12.972120,77.594804,"[{'label': 'display', 'lat': 12.97211968956926...",63,...,,,India,[India],"[{'id': '4bf58dd8d48988d121941735', 'name': 'L...",0,[],,,


In [14]:
# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,JW Marriott Hotel Bengaluru,Hotel,12.972362,77.595051
1,UB City,Shopping Mall,12.971709,77.595905
2,Toscano,Italian Restaurant,12.97198,77.596066
3,J W Kitchen,Asian Restaurant,12.97241,77.594592
4,Shiro,Japanese Restaurant,12.9719,77.596236


In [15]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

100 venues were returned by Foursquare.


### Repeat the process for every neighborhood and collect the results in a new dataframe called venues

In [16]:
def getNearbyVenues(names, latitudes, longitudes, radius=1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
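
`getNearbyVenues` indexes straight into the response, so a response without `groups`, or a venue with an empty `categories` list, would raise an exception partway through the loop. The following is a minimal sketch of a more defensive row extractor. `extract_venue_rows` is a hypothetical helper, and the sample payload only mimics the fields used above.

```python
def extract_venue_rows(name, lat, lng, payload):
    """Turn one explore response (already parsed JSON) into flat tuples."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories", [])
        # Venues without a category are kept, with None as the category.
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng, venue.get("name"),
                     location.get("lat"), location.get("lng"), category))
    return rows

# Hand-written sample payload; the second venue is made up to show the
# empty-categories case.
sample_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Toscano",
               "location": {"lat": 12.97198, "lng": 77.596066},
               "categories": [{"name": "Italian Restaurant"}]}},
    {"venue": {"name": "Unlabelled Place",
               "location": {"lat": 12.97, "lng": 77.59},
               "categories": []}},
]}]}}

rows = extract_venue_rows("Bangalore Fort", 12.971599, 77.594563, sample_payload)
empty = extract_venue_rows("Nowhere", 0.0, 0.0, {"response": {}})
```

Rows with a `None` category can then be dropped or relabelled before one-hot encoding, depending on how they should count.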

In [17]:
venues = getNearbyVenues(names=df['Location'],latitudes=df['Latitude'],longitudes=df['Longitude'])

Bangalore Bazaar, Legislators Home, Dr. ambedkar veedhi, Cubban Road, Mahatma Gandhi road, Vidhana Soudha, Narayan Pillai street, Rajbhavan, Highcourt, Brigade Road, Bangalore.
Bangalore Fort, Bangalore City, Bangalore Corporation building, Basavaraja Market, Narasimjharaja Road, New Tharaggupet, Cahmrajendrapet, Sri Jayachamarajendra road, Avenue Road
Vyalikaval Extn, Malleswaram, Palace Guttahalli, Venkatarangapura, Aranya Bhavan, Swimming Pool extn
Pasmpamahakavi Road, Basavanagudi, Shankarpura, Lalbagh West, Visveswarapuram, Mavalli
Jeevanahalli, Fraser Town
J.C.nagar, Training Command iaf
Air Force hospital, Agram
Hulsur Bazaar, H.A.l ii stage, Someswarapura
K. g. road, Subhashnagar, Bangalore Dist offices bldg
Rajajinagar Ivth block, Bhashyam Circle, Rajajinagar, Industrial Estate, Rajajinagar I block
Madhavan Park, Jayangar Iii block
Science Institute
Jalahalli Village, Govindapalya, H M t, Jalahalli
Jalahalli East
Kamagondanahalli, Jalahalli West
Doorvaninagar, Krishnarajapuram

In [18]:
print(venues.shape)
venues.head()

(2878, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",12.987485,77.604906,Samarkand,12.980616,77.604668,Afghan Restaurant
1,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",12.987485,77.604906,Peppa Zzing,12.9797,77.605907,Burger Joint
2,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",12.987485,77.604906,Shiv Sagar,12.981879,77.608322,Indian Restaurant
3,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",12.987485,77.604906,Krispy Kreme,12.98263,77.607027,Donut Shop
4,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",12.987485,77.604906,Mysore Saree Udyog,12.981433,77.610214,Women's Store


In [19]:
print('There are {} unique categories.'.format(len(venues['Venue Category'].unique())))

There are 209 unique categories.


In [20]:
def Venues_Map(Borough_name, Borough_neighborhoods):
    # center the map on the geocoded location of Borough_name
    geolocator = Nominatim(user_agent="coursera")
    address = Borough_name
    location = geolocator.geocode(address)
    if location is None:
        # geocode() returns None when the address cannot be resolved
        raise ValueError('Cannot find: {}'.format(address))
    latitude = location.latitude
    longitude = location.longitude
    print('The geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))

    my_map = folium.Map(location=[latitude, longitude], zoom_start=11)
    # add markers to map
    for lat, lng, venue, category in zip(Borough_neighborhoods['Venue Latitude'], Borough_neighborhoods['Venue Longitude'], Borough_neighborhoods['Venue'], Borough_neighborhoods['Venue Category']):
        label = folium.Popup(str(venue) + ' category ' + str(category), parse_html=True)
        folium.CircleMarker(
            [lat, lng],
            radius=5,
            popup=label,  # attach the popup built above to the marker
            color='blue',
            fill=True,
            fill_color='#3186cc',
            fill_opacity=0.7).add_to(my_map)

    return my_map

In [21]:
Venues_Map('Bangalore', venues)

The geographical coordinates of Bangalore are 12.9791198, 77.5912997.


In [22]:
venues.groupby('Venue Category')['Venue'].count().sort_values(ascending=False)

Venue Category
Indian Restaurant        474
Café                     184
Ice Cream Shop            89
Hotel                     88
Chinese Restaurant        85
                        ... 
Field                      1
Bridge                     1
Pakistani Restaurant       1
Performing Arts Venue      1
Paintball Field            1
Name: Venue, Length: 209, dtype: int64

In [23]:
venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
A F station yelahanka,2,2,2,2,2,2
"Adugodi, Hosur Road",43,43,43,43,43,43
"Air Force hospital, Agram",69,69,69,69,69,69
"Amruthahalli, Kodigehalli, Sahakaranagar P.o",3,3,3,3,3,3
"Anekalbazar, Vanakanahalli, Marsur, Anekal, Hennagara, Thammanayakanahalli, Sidihoskote, Bestamaranahalli, Indalavadi, Hulimangala, Samandur, Jigani, Harogadde",3,3,3,3,3,3
"Arabic College, Ramakrishna Hegde nagar, Devarjeevanahalli, Venkateshapura, Nagavara",14,14,14,14,14,14
"Austin Town, Viveknagar",8,8,8,8,8,8
"Avani Sringeri mutt, Mahalakshmipuram Layout, Basaveswaranagar Ii stage",55,55,55,55,55,55
"Banashankari, Ashoknagar, State Bank of mysore colony, Dasarahalli",100,100,100,100,100,100
"Bangalore Air port, Vimapura, Nal",2,2,2,2,2,2


## 3. Analyze Each Neighborhood

In [24]:
# one hot encoding
onehot = pd.get_dummies(venues[['Venue Category']], prefix="", prefix_sep="")

#column lists before adding neighborhood
column_names = ['Neighborhood'] + list(onehot.columns)

# add neighborhood column back to dataframe
onehot['Neighborhood'] = venues['Neighborhood'] 

# move neighborhood column to the first column
onehot = onehot[column_names]

onehot.head()

Unnamed: 0,Neighborhood,ATM,Accessories Store,Afghan Restaurant,Airport,American Restaurant,Andhra Restaurant,Arcade,Art Gallery,Arts & Crafts Store,...,Track Stadium,Trail,Train Station,Turkish Restaurant,Udupi Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Women's Store,Yoga Studio
0,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",0,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,1,0


In [25]:
onehot.shape

(2878, 210)

In [26]:
restaurant_List = []
search = 'Restaurant'
for i in onehot.columns :
    if search in i:
        restaurant_List.append(i)

In [27]:
restaurant_List

['Afghan Restaurant',
 'American Restaurant',
 'Andhra Restaurant',
 'Asian Restaurant',
 'Bengali Restaurant',
 'Cantonese Restaurant',
 'Chinese Restaurant',
 'Comfort Food Restaurant',
 'Dim Sum Restaurant',
 'Eastern European Restaurant',
 'Fast Food Restaurant',
 'French Restaurant',
 'German Restaurant',
 'Halal Restaurant',
 'Hyderabadi Restaurant',
 'Indian Restaurant',
 'Italian Restaurant',
 'Japanese Restaurant',
 'Karnataka Restaurant',
 'Kerala Restaurant',
 'Korean Restaurant',
 'Mediterranean Restaurant',
 'Mexican Restaurant',
 'Middle Eastern Restaurant',
 'Modern European Restaurant',
 'Multicuisine Indian Restaurant',
 'North Indian Restaurant',
 'Paella Restaurant',
 'Pakistani Restaurant',
 'Parsi Restaurant',
 'Punjabi Restaurant',
 'Rajasthani Restaurant',
 'Restaurant',
 'Seafood Restaurant',
 'South Indian Restaurant',
 'Sushi Restaurant',
 'Szechuan Restaurant',
 'Tex-Mex Restaurant',
 'Thai Restaurant',
 'Tibetan Restaurant',
 'Turkish Restaurant',
 'Udupi Re

In [28]:
col_name = ['Neighborhood'] + restaurant_List
# keep the Neighborhood column: it is needed for the groupby below
BM_restaurant = onehot[col_name]
BM_restaurant

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Bengali Restaurant,Cantonese Restaurant,Chinese Restaurant,Comfort Food Restaurant,Dim Sum Restaurant,...,South Indian Restaurant,Sushi Restaurant,Szechuan Restaurant,Tex-Mex Restaurant,Thai Restaurant,Tibetan Restaurant,Turkish Restaurant,Udupi Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant
0,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
6,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
7,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
8,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
9,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [29]:
BM_restaurant_grouped = BM_restaurant.groupby('Neighborhood').sum().reset_index()

In [30]:
# sum only the numeric restaurant columns, not the Neighborhood strings
BM_restaurant_grouped['Total'] = BM_restaurant_grouped.sum(axis=1, numeric_only=True)
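
The one-hot encoding followed by `groupby('Neighborhood').sum()` amounts to counting, for each neighborhood, how many venues fall in each restaurant category. The following is a dependency-free sketch of that computation on a few hypothetical rows. The row values are made up for illustration, and the notebook itself does this with pandas.

```python
from collections import Counter, defaultdict

# (neighborhood, venue category) pairs; hypothetical sample rows
venue_rows = [
    ("Jeevanahalli, Fraser Town", "Indian Restaurant"),
    ("Jeevanahalli, Fraser Town", "Café"),
    ("Jeevanahalli, Fraser Town", "Indian Restaurant"),
    ("Science Institute", "Chinese Restaurant"),
    ("Jalahalli East", "ATM"),
]

def restaurant_counts(rows, keyword="Restaurant"):
    """Count venues per neighborhood whose category name contains keyword."""
    counts = defaultdict(Counter)
    for neighborhood, category in rows:
        counts[neighborhood]  # neighborhoods with no restaurants still appear
        if keyword in category:
            counts[neighborhood][category] += 1
    return counts

counts = restaurant_counts(venue_rows)
# Equivalent of the 'Total' column: restaurants per neighborhood
totals = {name: sum(c.values()) for name, c in counts.items()}
```

As in the notebook, a category counts as a restaurant only if its name contains the word "Restaurant", so cafés and similar venues are excluded.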

## 4. Cluster Neighborhoods and Examine Clusters

First, let's determine the optimal value of k for our dataset using the silhouette coefficient method.

In [31]:
from sklearn.cluster import KMeans
from sklearn import metrics
BM_grouped_clustering = BM_restaurant_grouped.drop(columns='Neighborhood')

for n_cluster in range(2, 10):
    kmeans = KMeans(n_clusters=n_cluster).fit(BM_grouped_clustering)
    label = kmeans.labels_
    sil_coeff = metrics.silhouette_score(BM_grouped_clustering, label, metric='euclidean')
    print("For n_clusters={}, The Silhouette Coefficient is {}".format(n_cluster, sil_coeff))

For n_clusters=2, The Silhouette Coefficient is 0.6916122496033011
For n_clusters=3, The Silhouette Coefficient is 0.6079002877292529
For n_clusters=4, The Silhouette Coefficient is 0.5413365695081502
For n_clusters=5, The Silhouette Coefficient is 0.5477387219943588
For n_clusters=6, The Silhouette Coefficient is 0.4938072704982384
For n_clusters=7, The Silhouette Coefficient is 0.5037338488725128
For n_clusters=8, The Silhouette Coefficient is 0.471815898912413
For n_clusters=9, The Silhouette Coefficient is 0.4776315260996387
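
For reference, the silhouette coefficient of a single point is `s = (b - a) / max(a, b)`, where `a` is its mean distance to the other points in its own cluster and `b` is its mean distance to the points of the nearest other cluster. `silhouette_score` reports the mean of `s` over all points. The following pure-Python sketch works through the calculation on hypothetical one-dimensional data, not the notebook's features.

```python
def silhouette_mean(points, labels):
    """Mean silhouette coefficient for 1-D points with given cluster labels."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # distances to the other members of the same cluster
        same = [abs(p - q) for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        if not same:            # singleton cluster: score taken as 0
            scores.append(0.0)
            continue
        a = sum(same) / len(same)
        # smallest mean distance to any other cluster
        b = min(
            sum(abs(p - q) for q, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in set(labels) if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

totals = [0, 1, 2, 30, 32]                            # hypothetical values
good = silhouette_mean(totals, [0, 0, 0, 1, 1])      # well-separated grouping
bad = silhouette_mean(totals, [0, 1, 0, 1, 0])       # interleaved grouping
```

A score close to 1 means points sit much closer to their own cluster than to the next one, while a negative score suggests many points are assigned to the wrong cluster.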


As the output shows, n_clusters=2 gives the highest Silhouette Coefficient, so we use 2 as the number of clusters. The exact value varies slightly between runs (another run reported 0.678403355340968 for n_clusters=2) because the loop above does not set `random_state`.

#### Run k-means to cluster the neighborhoods into 2 clusters

In [33]:
# set number of clusters
kclusters = 2

BM_grouped_clustering = BM_restaurant_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(BM_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
       0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,
       0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0,
       1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0,
       0, 1, 0, 0, 0, 1, 0, 1])
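
k-means alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The following is a minimal one-dimensional sketch of that loop (Lloyd's algorithm) on hypothetical restaurant totals. It is not scikit-learn's implementation, which uses a smarter initialization and works in many dimensions.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Lloyd's algorithm on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # assignment step: index of the nearest centroid for each value
        labels = [min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
                  for v in values]
        # update step: mean of each cluster (empty clusters keep their centroid)
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [v for v, lab in zip(values, labels) if lab == k]
            new_centroids.append(sum(members) / len(members) if members else old)
        if new_centroids == centroids:   # converged: centroids stopped moving
            break
        centroids = new_centroids
    return labels, centroids

totals = [0, 1, 3, 6, 10, 21, 25, 32, 37, 38]   # hypothetical values
labels, centers = kmeans_1d(totals, centroids=[0.0, 38.0])
```

On this sample the low totals end up in one cluster and the high totals in the other, which mirrors the split between sparse and dense neighborhoods that the notebook is looking for.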

In [34]:
BM_results = pd.DataFrame(kmeans.cluster_centers_)
BM_results.columns = BM_grouped_clustering.columns
BM_results.index = ['cluster0','cluster1']
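# note: this row sum also includes the existing 'Total' column,
# so 'Total Sum' comes out as twice 'Total'
BM_results['Total Sum'] = BM_results.sum(axis = 1)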
BM_results

Unnamed: 0,Afghan Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Bengali Restaurant,Cantonese Restaurant,Chinese Restaurant,Comfort Food Restaurant,Dim Sum Restaurant,Eastern European Restaurant,...,Szechuan Restaurant,Tex-Mex Restaurant,Thai Restaurant,Tibetan Restaurant,Turkish Restaurant,Udupi Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Total,Total Sum
cluster0,1.387779e-17,0.042857,0.042857,0.2,0.014286,0.0,0.185714,1.387779e-17,6.938894e-18,1.387779e-17,...,0.01428571,6.938894e-18,0.057143,0.01428571,0.01428571,6.938894e-18,0.242857,2.775558e-17,4.0,8.0
cluster1,0.07692308,0.230769,0.423077,0.923077,0.115385,0.115385,2.769231,0.07692308,0.03846154,0.07692308,...,1.734723e-18,0.03846154,0.269231,1.734723e-18,1.734723e-18,0.03846154,0.769231,0.1538462,28.423077,56.846154


In [35]:
BM_results_merged = pd.DataFrame(BM_restaurant_grouped['Neighborhood'])

BM_results_merged['Total'] = BM_restaurant_grouped['Total']
BM_results_merged = BM_results_merged.assign(Cluster_Labels = kmeans.labels_)

In [36]:
print(BM_results_merged.shape)
BM_results_merged

(96, 3)


Unnamed: 0,Neighborhood,Total,Cluster_Labels
0,A F station yelahanka,0,0
1,"Adugodi, Hosur Road",10,0
2,"Air Force hospital, Agram",21,1
3,"Amruthahalli, Kodigehalli, Sahakaranagar P.o",0,0
4,"Anekalbazar, Vanakanahalli, Marsur, Anekal, He...",1,0
5,"Arabic College, Ramakrishna Hegde nagar, Devar...",3,0
6,"Austin Town, Viveknagar",1,0
7,"Avani Sringeri mutt, Mahalakshmipuram Layout, ...",15,0
8,"Banashankari, Ashoknagar, State Bank of mysore...",32,1
9,"Bangalore Air port, Vimapura, Nal",0,0


In [37]:
df = df.rename(columns={'Location': 'Neighborhood'})

In [38]:
BM_merged = df

BM_merged = BM_merged.join(BM_results_merged.set_index('Neighborhood'), on='Neighborhood')


BM_merged.dropna(subset=['Total'], how='all', inplace = True)
BM_merged['Cluster_Labels'] =BM_merged['Cluster_Labels'].astype(int)
print(BM_merged.shape)
BM_merged.head(10) # check the last columns!

(96, 5)


Pincode,Neighborhood,Latitude,Longitude,Total,Cluster_Labels
560001,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",12.987485,77.604906,17.0,1
560002,"Bangalore Fort, Bangalore City, Bangalore Corp...",12.971599,77.594563,37.0,1
560003,"Vyalikaval Extn, Malleswaram, Palace Guttahall...",13.00835,77.56145,19.0,1
560004,"Pasmpamahakavi Road, Basavanagudi, Shankarpura...",12.9454,77.5776,25.0,1
560005,"Jeevanahalli, Fraser Town",13.0713,77.5905,6.0,0
560006,"J.C.nagar, Training Command iaf",12.987639,77.637862,13.0,0
560007,"Air Force hospital, Agram",12.958,77.639,21.0,1
560008,"Hulsur Bazaar, H.A.l ii stage, Someswarapura",12.97375,77.62499,17.0,0
560009,"K. g. road, Subhashnagar, Bangalore Dist offic...",12.971626,77.594536,38.0,1
560010,"Rajajinagar Ivth block, Bhashyam Circle, Rajaj...",13.0178,77.55175,17.0,0
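
In the table above, pincodes 560002 and 560009 have almost the same coordinates, so a 1000 m search around each is likely to return largely the same venues. The following is a minimal sketch for flagging such pairs with the haversine great-circle distance. The function names and the 100 m threshold are illustrative assumptions.

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (lat, lng) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))

def near_duplicates(points, threshold_m=100.0):
    """Return pairs of keys whose coordinates lie within threshold_m."""
    return [(a, b) for (a, pa), (b, pb) in combinations(points.items(), 2)
            if haversine_m(*pa, *pb) <= threshold_m]

# Coordinates copied from the table above.
points = {
    "560001": (12.987485, 77.604906),
    "560002": (12.971599, 77.594563),
    "560009": (12.971626, 77.594536),
}
pairs = near_duplicates(points)
```

Pairs flagged this way could be merged or given corrected coordinates before clustering, so the same venues are not counted under two neighborhoods.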


## Finally, let's visualize the resulting clusters

In [39]:
import matplotlib.cm as cm
import matplotlib.colors as colors
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)
# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(BM_merged['Latitude'], BM_merged['Longitude'], BM_merged['Neighborhood'], BM_merged['Cluster_Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],  # labels start at 0, so index the palette directly
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
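
Each marker's color is looked up from a per-cluster palette. The following is a dependency-free sketch of building such a palette and indexing it by cluster label. `cluster_palette` is a hypothetical helper that uses `colorsys` rather than matplotlib's `rainbow` colormap, so its colors differ from those on the map above.

```python
import colorsys

def cluster_palette(n_clusters):
    """Return n_clusters evenly spaced hues as '#rrggbb' strings."""
    palette = []
    for k in range(n_clusters):
        r, g, b = colorsys.hsv_to_rgb(k / n_clusters, 0.9, 0.9)
        palette.append('#{:02x}{:02x}{:02x}'.format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette

palette = cluster_palette(2)
cluster_labels = [0, 0, 1, 0]          # hypothetical k-means labels
marker_colors = [palette[label] for label in cluster_labels]
```

Because k-means labels start at 0, indexing the palette with the label itself keeps the color order aligned with the cluster numbering.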

## List Neighborhoods of Interest in Bangalore City

## Cluster 1: Saturated Markets

In [41]:
BM_merged[BM_merged['Cluster_Labels'] == 1].reset_index(drop=True)

Unnamed: 0,Neighborhood,Latitude,Longitude,Total,Cluster_Labels
0,"Bangalore Bazaar, Legislators Home, Dr. ambedk...",12.987485,77.604906,17.0,1
1,"Bangalore Fort, Bangalore City, Bangalore Corp...",12.971599,77.594563,37.0,1
2,"Vyalikaval Extn, Malleswaram, Palace Guttahall...",13.00835,77.56145,19.0,1
3,"Pasmpamahakavi Road, Basavanagudi, Shankarpura...",12.9454,77.5776,25.0,1
4,"Air Force hospital, Agram",12.958,77.639,21.0,1
5,"K. g. road, Subhashnagar, Bangalore Dist offic...",12.971626,77.594536,38.0,1
6,"Madhavan Park, Jayangar Iii block",12.930158,77.587714,37.0,1
7,"Seshadripuram, K.P.west",12.990409,77.57769,26.0,1
8,"Museum Road, Bangalore Sub fgn post, Cmp Centr...",12.97018,77.61189,31.0,1
9,Tyagrajnagar,12.971611,77.594551,37.0,1


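## Cluster 0: Untapped Markets

The cell below does not filter on the cluster label. It lists the neighborhoods where Foursquare returned no restaurants at all (`Total == 0`), which fall in Cluster 0 in the rows shown.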

In [42]:
BM_merged[BM_merged['Total'] == 0].reset_index(drop=True)

Unnamed: 0,Neighborhood,Latitude,Longitude,Total,Cluster_Labels
0,"Bangalore Air port, Vimapura, Nal",12.945062,77.665135,0.0,0
1,Magadi Road,12.96988,77.55731,0.0,0
2,Nayandahalli,12.94258,77.523234,0.0,0
3,Benson Town,13.00759,77.60352,0.0,0
4,"Thambuchetty Palya, Bidrahalli, Virgonagar, Mu...",13.04948,77.74613,0.0,0
5,"Peenya I stage, Peenya Ii stage, Peenya Small ...",13.006288,77.497442,0.0,0
6,Rv Niketan,12.84346,77.43727,0.0,0
7,A F station yelahanka,13.1122,77.6261,0.0,0
8,"Amruthahalli, Kodigehalli, Sahakaranagar P.o",13.057768,77.575519,0.0,0
9,Bellandur,12.933713,77.662194,0.0,0
