# Coffee Clusters

<h2> Opening a coffee shop in New Delhi
<h4> Submission for Week 5 

Brief Overview of the Methodology employed in this Jupyter Notebook :
<ol>
<li> Build a dataframe of neighborhoods in New Delhi, India by web scraping the data from Wikipedia page. 
<li> Get the geographical coordinates of the neighborhoods in New Delhi. 
<li> Obtain the venue data for the neighborhoods from Foursquare API
<li> Explore and cluster the neighborhoods in Delhi. 
<li> Select the best cluster to open a new shopping mall

# Part 1

<h3> Importing Required Libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner
print("Numpy imported.")

import pandas as pd # library for data analsysis
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)
print("Panda imported.")

from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe
print("JSON normalized in pandas imported.")

import json # library to handle JSON files
print("JSON library imported.")

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
print("Nomaintim imported.")

!conda install -c conda-forge geocoder --yes 
import geocoder # to get coordinates
print("Geocoder imported.")

import requests # library to handle requests
print("Requests library imported.")

from bs4 import BeautifulSoup # library to parse HTML and XML documents
print("BeautifulSoup imported.")

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
print("Matplotlib modules imported.")

# import k-means from clustering stage
from sklearn.cluster import KMeans
print("KMeans imported.")

!conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library
print("Folium imported.")


Numpy imported.
Panda imported.
JSON normalized in pandas imported.
JSON library imported.
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Nomaintim imported.
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Geocoder imported.
Requests library imported.
BeautifulSoup imported.
Matplotlib modules imported.
KMeans imported.
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Folium imported.


<h3> Scraping data from a Wikipedia page into a DataFrame

I have used the Wikipedia page on neighborhood of Delhi that can be found at https://en.wikipedia.org/wiki/Category:Neighbourhoods_in_Delhi

In [2]:
# send the GET request
data = requests.get("https://en.wikipedia.org/wiki/Category:Neighbourhoods_in_Delhi").text

In [3]:
# parse data from the html into a beautifulsoup object
soup = BeautifulSoup(data, 'html.parser')

In [4]:
# create a list to store neighborhood data
nbList = []

# populate the neighborhood list
for row in soup.find_all("div", class_="mw-category")[0].findAll("li"):
    nbList.append(row.text)

In [5]:
# create a new DataFrame from the list
Ndelhi = pd.DataFrame({"Neighborhoods in Delhi": nbList})

Ndelhi.head()

Unnamed: 0,Neighborhoods in Delhi
0,Neighbourhoods of Delhi
1,Ashok Nagar (Delhi)
2,Ashok Vihar
3,Ashram Chowk
4,Babarpur


In [6]:
# remove the first row as it's redundant 
Ndelhi = Ndelhi.iloc[1:]
Ndelhi.head()

Unnamed: 0,Neighborhoods in Delhi
1,Ashok Nagar (Delhi)
2,Ashok Vihar
3,Ashram Chowk
4,Babarpur
5,"Badarpur, Delhi"


In [7]:
# print the number of rows of the dataframe
Ndelhi.shape

(138, 1)

<h3> Getting the geographical coordinates

In [8]:
# define a function to get coordinates
def get_latlng(neighborhood):
    # initialize your variable to None
    lat_lng_coords = None
    # loop until you get the coordinates
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, New Delhi, India'.format(neighborhood))
        lat_lng_coords = g.latlng
    return lat_lng_coords

In [11]:
# call the function to get the coordinates, store in a new list using list comprehension
coords = [] # to store coordinates of each neighborhood
coords = [ get_latlng(neighborhood) for neighborhood in Ndelhi["Neighborhoods in Delhi"].tolist() ]

In [12]:
coords[0:5] # sanity check 

[[28.692230000000052, 77.30124000000006],
 [28.69037000000003, 77.17609000000004],
 [28.710598435255907, 77.32696519316737],
 [28.50738000000007, 77.30346000000003],
 [28.50738000000007, 77.30346000000003]]

In [13]:
# create temporary dataframe to populate the coordinates into Latitude and Longitude
coordinates = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])

# merge the coordinates into the original dataframe
Ndelhi['Latitude'] = coordinates['Latitude']
Ndelhi['Longitude'] = coordinates['Longitude']

In [14]:
# check the neighborhoods and the coordinates
print(Ndelhi.shape)
Ndelhi.head()

(138, 3)


Unnamed: 0,Neighborhoods in Delhi,Latitude,Longitude
1,Ashok Nagar (Delhi),28.69037,77.17609
2,Ashok Vihar,28.710598,77.326965
3,Ashram Chowk,28.50738,77.30346
4,Babarpur,28.50738,77.30346
5,"Badarpur, Delhi",28.65223,77.129411


In [15]:
# remove all the values which may have a NaN value
Ndelhi.dropna(inplace = True)

In [16]:
# save the DataFrame as CSV file
Ndelhi.to_csv("NewDelhi.csv", index=False)

<h3> Creating a map of New Delhi with neighborhoods superimposed on top

In [17]:
# get the coordinates of Kuala Lumpur
address = 'New Delhi, India'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of New Delhi, India : {}, {}.'.format(latitude, longitude))

The geograpical coordinate of New Delhi, India : 28.6138954, 77.2090057.


In [21]:
# create map of New Delhi using latitude and longitude values
map_nd = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, neighborhood in zip(Ndelhi['Latitude'], Ndelhi['Longitude'], Ndelhi['Neighborhoods in Delhi']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='coral',
        fill=True,
        fill_color='red',
        fill_opacity=0.7).add_to(map_nd)  
    
map_nd

In [19]:
# save the map as HTML file
map_nd.save('map_nd.html')

<h3> Using the Foursquare API to explore the neighborhoods

In [23]:
# define Foursquare Credentials and Version
CLIENT_ID = 'ZS0CGYOENKXU4Q1KTTMUTMHY4W5BPMQCDF4B5CXZGKR5KCGU' # my Foursquare ID
CLIENT_SECRET = 'SGMD3FGJDTSGBOK2ZXY3RY1Z32BMESZW1NM20RQXWWXFFRWT' # my Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: ZS0CGYOENKXU4Q1KTTMUTMHY4W5BPMQCDF4B5CXZGKR5KCGU
CLIENT_SECRET:SGMD3FGJDTSGBOK2ZXY3RY1Z32BMESZW1NM20RQXWWXFFRWT


In [24]:
radius = 2000
LIMIT = 100

venues = []

for lat, long, neighborhood in zip(Ndelhi['Latitude'], Ndelhi['Longitude'], Ndelhi['Neighborhoods in Delhi']):
    
    # create the API request URL
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        LIMIT)
    
    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    # return only relevant information for each nearby venue
    for venue in results:
        venues.append((
            neighborhood,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))

In [25]:
# convert the venues list into a new DataFrame
venues_df = pd.DataFrame(venues)

# define the column names
venues_df.columns = ['Neighborhoods in Delhi', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venues_df.shape)
venues_df.head()

(5969, 7)


Unnamed: 0,Neighborhoods in Delhi,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,Ashok Nagar (Delhi),28.69037,77.17609,Major Dhyan Chand Sports Complex,28.684029,77.167487,Athletics & Sports
1,Ashok Nagar (Delhi),28.69037,77.17609,Bellagio,28.696361,77.180021,Asian Restaurant
2,Ashok Nagar (Delhi),28.69037,77.17609,Subway,28.696321,77.179983,Sandwich Place
3,Ashok Nagar (Delhi),28.69037,77.17609,Rahul Egg Corner,28.68824,77.168599,Snack Place
4,Ashok Nagar (Delhi),28.69037,77.17609,Subway.,28.695571,77.171964,Sandwich Place


In [26]:
# group venues by count
venues_df.groupby(["Neighborhoods in Delhi"]).count()

Unnamed: 0_level_0,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
Neighborhoods in Delhi,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Ashok Nagar (Delhi),22,22,22,22,22,22
Ashok Vihar,3,3,3,3,3,3
Ashram Chowk,4,4,4,4,4,4
Babarpur,4,4,4,4,4,4
"Badarpur, Delhi",53,53,53,53,53,53
Bali Nagar,1,1,1,1,1,1
Bawana,97,97,97,97,97,97
Ber Sarai,8,8,8,8,8,8
Bhajanpura,73,73,73,73,73,73
Chanakyapuri,52,52,52,52,52,52


In [27]:
print('There are {} uniques categories.'.format(len(venues_df['VenueCategory'].unique())))

There are 211 uniques categories.


In [28]:
# print out the list of categories
venues_df['VenueCategory'].unique()[:50]

array(['Athletics & Sports', 'Asian Restaurant', 'Sandwich Place',
       'Snack Place', 'Pizza Place', 'Indian Restaurant',
       'South Indian Restaurant', 'Department Store',
       'Fast Food Restaurant', 'Coffee Shop', 'Market', 'Dessert Shop',
       'Basketball Court', 'Train Station', 'Light Rail Station', 'ATM',
       'Tourist Information Center', 'Indian Sweet Shop', 'Café',
       'American Restaurant', 'Donut Shop', 'Bakery', 'Diner',
       'Hookah Bar', 'BBQ Joint', 'Hotel', 'Sports Bar', 'Pub',
       'Garden Center', 'Multiplex', 'Shopping Mall',
       'Furniture / Home Store', 'Fried Chicken Joint', 'Bar',
       'Restaurant', 'Cosmetics Shop', 'Gym', 'Electronics Store',
       'Clothing Store', 'Jewelry Store', 'Garden', 'Park',
       'Salon / Barbershop', 'Playground', 'Art Gallery',
       'Mediterranean Restaurant', 'Tea Room', 'Tibetan Restaurant',
       'Lounge', 'Ice Cream Shop'], dtype=object)

In [29]:
# check if the results contain "Shopping Mall"
"Coffee Shop" in venues_df['VenueCategory'].unique()

True

<h3> Analyzing each area

In [30]:
# one hot encoding
nd_onehot = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
nd_onehot['Neighborhoods'] = venues_df['Neighborhoods in Delhi'] 

# move neighborhood column to the first column
fixed_columns = [nd_onehot.columns[-1]] + list(nd_onehot.columns[:-1])
nd_onehot = nd_onehot[fixed_columns]

print(nd_onehot.shape)
nd_onehot.head()

(5969, 212)


Unnamed: 0,Neighborhoods,ATM,Accessories Store,Afghan Restaurant,Airport Food Court,American Restaurant,Antique Shop,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Auto Workshop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Bed & Breakfast,Beer Bar,Beer Garden,Bengali Restaurant,Big Box Store,Bike Shop,Bistro,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Buffet,Burger Joint,Burmese Restaurant,Bus Station,Business Service,Cafeteria,Café,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,College Cafeteria,Comfort Food Restaurant,Concert Hall,Convenience Store,Cosmetics Shop,Cricket Ground,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dog Run,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,English Restaurant,Event Space,Fabric Shop,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Flea Market,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Garden Center,Gastropub,Gift Shop,Golf Course,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Hardware Store,Health & Beauty Service,High School,Hindu Temple,Historic Site,History Museum,Hobby Shop,Hockey Arena,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotel Pool,IT Services,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indian Sweet Shop,Indie Movie Theater,Irani Cafe,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Karnataka Restaurant,Korean Restaurant,Lake,Leather Goods Store,Light Rail Station,Liquor Store,Lounge,Market,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Mosque,Motel,Motorcycle Shop,Movie Theater,Moving Target,Mughlai Restaurant,Multicuisine Indian Restaurant,Multiplex,Museum,Music Store,Music Venue,Neighborhood,Nightclub,Nightlife Spot,North Indian Restaurant,Northeast Indian Restaurant,Other Great Outdoors,Other Nightlife,Paper / Office Supplies Store,Park,Performing Arts Venue,Pizza Place,Planetarium,Platform,Playground,Plaza,Pool,Portuguese Restaurant,Pub,Public Art,Punjabi Restaurant,Racetrack,Record Shop,Restaurant,River,Road,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Sculpture Garden,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,Soup Place,South Indian Restaurant,Spa,Spanish Restaurant,Speakeasy,Spiritual Center,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Sushi Restaurant,Tapas Restaurant,Tea Room,Temple,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Park,Tibetan Restaurant,Tourist Information Center,Toy / Game Store,Track,Track Stadium,Trail,Train Station,Turkish Restaurant,University,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Weight Loss Center,Women's Store,Yoga Studio,Zoo
0,Ashok Nagar (Delhi),0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Ashok Nagar (Delhi),0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Ashok Nagar (Delhi),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Ashok Nagar (Delhi),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Ashok Nagar (Delhi),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [31]:
# reset the index and add the Neighborhoods column again
nd_grouped = nd_onehot.groupby(["Neighborhoods"]).mean().reset_index()

print(nd_grouped.shape)
nd_grouped

(136, 212)


Unnamed: 0,Neighborhoods,ATM,Accessories Store,Afghan Restaurant,Airport Food Court,American Restaurant,Antique Shop,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Auto Workshop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Bed & Breakfast,Beer Bar,Beer Garden,Bengali Restaurant,Big Box Store,Bike Shop,Bistro,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Buffet,Burger Joint,Burmese Restaurant,Bus Station,Business Service,Cafeteria,Café,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,College Cafeteria,Comfort Food Restaurant,Concert Hall,Convenience Store,Cosmetics Shop,Cricket Ground,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dog Run,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,English Restaurant,Event Space,Fabric Shop,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Flea Market,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Garden Center,Gastropub,Gift Shop,Golf Course,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Hardware Store,Health & Beauty Service,High School,Hindu Temple,Historic Site,History Museum,Hobby Shop,Hockey Arena,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotel Pool,IT Services,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indian Sweet Shop,Indie Movie Theater,Irani Cafe,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Karnataka Restaurant,Korean Restaurant,Lake,Leather Goods Store,Light Rail Station,Liquor Store,Lounge,Market,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Mosque,Motel,Motorcycle Shop,Movie Theater,Moving Target,Mughlai Restaurant,Multicuisine Indian Restaurant,Multiplex,Museum,Music Store,Music Venue,Neighborhood,Nightclub,Nightlife Spot,North Indian Restaurant,Northeast Indian Restaurant,Other Great Outdoors,Other Nightlife,Paper / Office Supplies Store,Park,Performing Arts Venue,Pizza Place,Planetarium,Platform,Playground,Plaza,Pool,Portuguese Restaurant,Pub,Public Art,Punjabi Restaurant,Racetrack,Record Shop,Restaurant,River,Road,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Sculpture Garden,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,Soup Place,South Indian Restaurant,Spa,Spanish Restaurant,Speakeasy,Spiritual Center,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Sushi Restaurant,Tapas Restaurant,Tea Room,Temple,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Park,Tibetan Restaurant,Tourist Information Center,Toy / Game Store,Track,Track Stadium,Trail,Train Station,Turkish Restaurant,University,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Weight Loss Center,Women's Store,Yoga Studio,Zoo
0,Ashok Nagar (Delhi),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.136364,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Ashok Vihar,0.666667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Ashram Chowk,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Babarpur,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Badarpur, Delhi",0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037736,0.0,0.018868,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.113208,0.0,0.037736,0.0,0.018868,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.037736,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.113208,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.037736,0.018868,0.018868,0.0,0.0,0.0,0.0,0.0,0.037736,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037736,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.075472,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.037736,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.037736,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Bali Nagar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Bawana,0.0,0.0,0.0,0.0,0.020619,0.0,0.0,0.020619,0.0,0.0,0.020619,0.0,0.0,0.0,0.0,0.010309,0.030928,0.0,0.030928,0.0,0.0,0.010309,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.103093,0.020619,0.0,0.010309,0.061856,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041237,0.0,0.0,0.010309,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.020619,0.0,0.0,0.0,0.010309,0.0,0.123711,0.0,0.0,0.0,0.020619,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020619,0.041237,0.020619,0.0,0.0,0.010309,0.010309,0.0,0.0,0.010309,0.0,0.0,0.010309,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.020619,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.010309,0.072165,0.0,0.0,0.0,0.0,0.020619,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.010309,0.0,0.0,0.010309,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Ber Sarai,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0
8,Bhajanpura,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027397,0.0,0.0,0.0,0.013699,0.0,0.0,0.0,0.013699,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013699,0.0,0.0,0.0,0.013699,0.0,0.0,0.0,0.013699,0.0,0.0,0.082192,0.041096,0.0,0.013699,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013699,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027397,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013699,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013699,0.0,0.0,0.0,0.0,0.0,0.068493,0.013699,0.0,0.0,0.0,0.0,0.178082,0.0,0.0,0.0,0.027397,0.013699,0.0,0.0,0.0,0.0,0.013699,0.0,0.0,0.0,0.0,0.0,0.013699,0.0,0.013699,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013699,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013699,0.027397,0.0,0.0,0.0,0.041096,0.0,0.013699,0.027397,0.0,0.0,0.0,0.027397,0.0,0.0,0.013699,0.0,0.0,0.013699,0.013699,0.0,0.013699,0.0,0.0,0.0,0.0,0.013699,0.0,0.0,0.013699,0.0,0.0,0.0,0.013699,0.013699,0.0,0.0,0.013699,0.013699,0.013699,0.0,0.0,0.0,0.0,0.013699,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013699,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013699,0.0,0.0,0.0,0.0
9,Chanakyapuri,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.057692,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.038462,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.096154,0.0,0.0,0.0,0.0,0.0,0.211538,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.019231,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.019231,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.038462,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [32]:
# find number of neighborhoods with atleast one coffee shop
len(nd_grouped[nd_grouped["Coffee Shop"] > 0])

89

In [33]:
# drop all the columns except the neighborhood and coffee shop
nd_grocery = nd_grouped[["Neighborhoods","Coffee Shop"]]

In [34]:
nd_grocery.head() #sanity check

Unnamed: 0,Neighborhoods,Coffee Shop
0,Ashok Nagar (Delhi),0.045455
1,Ashok Vihar,0.0
2,Ashram Chowk,0.0
3,Babarpur,0.0
4,"Badarpur, Delhi",0.018868


<h3> K-Means Clustering

In [35]:
# set number of clusters
kclusters = 5

nd_clustering = nd_grocery.drop(["Neighborhoods"], 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(nd_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([4, 0, 0, 0, 4, 0, 2, 0, 0, 4])

In [36]:
# create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.
nd_merged = nd_grocery.copy()

# add clustering labels
nd_merged["Cluster Labels"] = kmeans.labels_
nd_merged.head()

Unnamed: 0,Neighborhoods,Coffee Shop,Cluster Labels
0,Ashok Nagar (Delhi),0.045455,4
1,Ashok Vihar,0.0,0
2,Ashram Chowk,0.0,0
3,Babarpur,0.0,0
4,"Badarpur, Delhi",0.018868,4


In [37]:
# rename the Neighborhoods column to Neighborhood
nd_merged.rename(columns={"Neighborhoods": "Neighborhood"}, inplace=True)
nd_merged.head()

Unnamed: 0,Neighborhood,Coffee Shop,Cluster Labels
0,Ashok Nagar (Delhi),0.045455,4
1,Ashok Vihar,0.0,0
2,Ashram Chowk,0.0,0
3,Babarpur,0.0,0
4,"Badarpur, Delhi",0.018868,4


In [38]:
# merge toronto_grouped with the new delhi data to add latitude/longitude for each neighborhood
nd_merged = nd_merged.join(Ndelhi.set_index("Neighborhoods in Delhi"), on="Neighborhood")

print(nd_merged.shape)
nd_merged.head() 

(136, 5)


Unnamed: 0,Neighborhood,Coffee Shop,Cluster Labels,Latitude,Longitude
0,Ashok Nagar (Delhi),0.045455,4,28.69037,77.17609
1,Ashok Vihar,0.0,0,28.710598,77.326965
2,Ashram Chowk,0.0,0,28.50738,77.30346
3,Babarpur,0.0,0,28.50738,77.30346
4,"Badarpur, Delhi",0.018868,4,28.65223,77.129411


In [39]:
# sort the results by Cluster Labels
print(nd_merged.shape)
nd_merged.sort_values(["Cluster Labels"], inplace=True)
nd_merged

(136, 5)


Unnamed: 0,Neighborhood,Coffee Shop,Cluster Labels,Latitude,Longitude
135,Yamuna Pushta,0.0,0,28.70059,77.27212
101,Rama Krishna Puram,0.0,0,28.68584,77.13188
87,Okhla,0.016667,0,28.65434,77.23258
86,Nizamuddin West,0.0,0,28.53247,77.27839
33,"Green Park, Delhi",0.0,0,28.62043,77.04941
84,Nigambodh Ghat,0.0,0,28.60124,77.264521
83,New Moti Bagh,0.0,0,28.66471,77.23633
36,Hauz Khas,0.0,0,28.65107,77.30669
81,New Delhi,0.0,0,28.57812,77.26999
78,Narela,0.0,0,28.67369,77.28326


In [40]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(nd_merged['Latitude'], nd_merged['Longitude'], nd_merged['Neighborhood'], nd_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [41]:
# save the map as HTML file
map_clusters.save('map_clusters.html')

<h3> Examining the results

Cluster 1

In [42]:
nd_merged.loc[nd_merged['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Coffee Shop,Cluster Labels,Latitude,Longitude
135,Yamuna Pushta,0.0,0,28.70059,77.27212
101,Rama Krishna Puram,0.0,0,28.68584,77.13188
87,Okhla,0.016667,0,28.65434,77.23258
86,Nizamuddin West,0.0,0,28.53247,77.27839
33,"Green Park, Delhi",0.0,0,28.62043,77.04941
84,Nigambodh Ghat,0.0,0,28.60124,77.264521
83,New Moti Bagh,0.0,0,28.66471,77.23633
36,Hauz Khas,0.0,0,28.65107,77.30669
81,New Delhi,0.0,0,28.57812,77.26999
78,Narela,0.0,0,28.67369,77.28326


Cluster 2

In [43]:
nd_merged.loc[nd_merged['Cluster Labels'] == 1]

Unnamed: 0,Neighborhood,Coffee Shop,Cluster Labels,Latitude,Longitude
60,Mahipalpur,0.153846,1,28.69893,77.22715
58,Madanpur Khadar JJ Colony,0.108108,1,28.57223,77.26357
70,Moti Nagar (New Delhi),0.142857,1,28.71053,77.2144
92,Pandav Nagar,0.111111,1,28.66933,77.09173
117,"Shakti Nagar, Delhi",0.125,1,28.71423,77.15744
115,Shahdara district,0.11,1,28.54854,77.21393
65,Mayur Vihar Phase - 3,0.166667,1,28.66121,77.0869
106,Safdarjung (Delhi),0.166667,1,28.60588,77.09552
67,Mehrauli,0.16,1,28.70501,77.1895
19,Delhi Cantonment,0.111111,1,28.70037,77.20493


Cluster 3

In [44]:
nd_merged.loc[nd_merged['Cluster Labels'] == 2]

Unnamed: 0,Neighborhood,Coffee Shop,Cluster Labels,Latitude,Longitude
111,Sarita Vihar,0.057143,2,28.5756,77.19364
32,Greater Kailash,0.1,2,28.55897,77.20462
10,Chandni Chowk,0.054054,2,28.67671,77.21767
90,Palika Bazaar,0.07,2,28.546774,77.244757
91,Pamposh Enclave,0.095238,2,28.61458,77.27574
30,"Golf Links, New Delhi",0.060606,2,28.53508,77.26512
94,Patel Nagar,0.081081,2,28.6959,77.13725
98,"Rajendra Nagar, Delhi",0.065217,2,28.64546,77.17776
16,Dashrath Puri,0.08,2,28.56059,77.24678
100,Rajouri Garden,0.065789,2,28.56553,77.17719


Cluster 4

In [45]:
nd_merged.loc[nd_merged['Cluster Labels'] == 3]

Unnamed: 0,Neighborhood,Coffee Shop,Cluster Labels,Latitude,Longitude
88,Old Delhi,0.25,3,28.59106,77.09117
104,Roop Nagar,0.222222,3,28.59028,77.12014


Cluster 5

In [46]:
nd_merged.loc[nd_merged['Cluster Labels'] == 4]

Unnamed: 0,Neighborhood,Coffee Shop,Cluster Labels,Latitude,Longitude
4,"Badarpur, Delhi",0.018868,4,28.65223,77.129411
133,Vivek Vihar subdivision,0.037037,4,28.64783,77.16449
124,South Extension,0.035714,4,28.64561,77.16682
9,Chanakyapuri,0.019231,4,28.65627,77.23232
127,Tilak Nagar (Delhi),0.02,4,28.666,77.2152
13,"Dabri, New Delhi",0.018868,4,28.654598,77.233397
118,Shalimar Bagh (Delhi Assembly constituency),0.037037,4,28.63847,77.28912
119,Shankar Vihar,0.037037,4,28.57756,77.16811
11,"Civil Lines, Delhi",0.03,4,28.63394,77.21968
17,Dayanand Colony,0.052083,4,28.57298,77.23357


Finding number of neighborhoods in each cluster

In [47]:
nd_merged['Cluster Labels'].value_counts()

0    49
2    37
4    34
1    14
3     2
Name: Cluster Labels, dtype: int64

<h3> Observation

There are :
1. Maximum number of neighborhoods in the first cluster which don't have any coffee shops. 
2. There is also a high number of neighborhoods which have too many coffee shops.
Hence, there is great disparity in the distribution of coffee shops across neighborhoods. 
This also shows that any potential investors would be better suited investing in clusters 0, 1 or 3. 