# IBM Applied Data Science Capstone Course by Coursera
### Week 5 Final Report
**_Opening a New Indian Restaurant in Delhi, India_**
- Build a dataframe of neighborhoods in Delhi, India, by scraping the data from a Wikipedia page
- Get the geographical coordinates of the neighborhoods
- Obtain the venue data for the neighborhoods from the Foursquare API
- Explore and cluster the neighborhoods
- Select the best cluster in which to open a new Indian restaurant
### 1. Import libraries

In [7]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)

import json # library to handle JSON files

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import geocoder # to get coordinates

import requests # library to handle requests
from bs4 import BeautifulSoup # library to parse HTML and XML documents

from pandas import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

print("Libraries imported.")

Libraries imported.


### 2. Scrape data from Wikipedia page into a DataFrame

In [8]:
# send the GET request
data = requests.get("https://en.wikipedia.org/wiki/Category:Neighbourhoods_in_Delhi").text

In [9]:
# parse data from the html into a beautifulsoup object
soup = BeautifulSoup(data, 'html.parser')

In [10]:
# create a list to store neighborhood data
neighborhoodList = []
# append the data into the list
for row in soup.find_all("div", class_="mw-category")[0].find_all("li"):
    neighborhoodList.append(row.text)

In [12]:
# create a new DataFrame from the list
dl_df = pd.DataFrame({"Neighborhood": neighborhoodList})

dl_df.head()

Unnamed: 0,Neighborhood
0,Neighbourhoods of Delhi
1,Ashok Nagar (Delhi)
2,Ashok Vihar
3,Ashram Chowk
4,Babarpur


In [13]:
# print the number of rows of the dataframe
dl_df.shape

(139, 1)
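The scraped list still contains Wikipedia artifacts: the first entry, "Neighbourhoods of Delhi", is a category link rather than a neighborhood, and several names carry disambiguation suffixes such as "(Delhi)" or ", New Delhi". A minimal cleanup sketch follows. Both helper names are hypothetical and are not part of the original notebook, and the suffix rules are an assumption based only on the names visible above.

```python
import re

def is_category_link(raw_name):
    """Return True for entries like 'Neighbourhoods of Delhi' (a category, not a place)."""
    return raw_name.strip().lower().startswith("neighbourhoods of")

def clean_neighborhood_name(raw_name):
    """Strip trailing disambiguation suffixes such as '(Delhi)' or ', New Delhi'."""
    # Drop a trailing parenthetical, e.g. "Ashok Nagar (Delhi)" -> "Ashok Nagar"
    name = re.sub(r"\s*\([^)]*\)\s*$", "", raw_name)
    # Drop a trailing ", Delhi" or ", New Delhi"
    name = re.sub(r",\s*(New\s+)?Delhi\s*$", "", name)
    return name.strip()

# Names taken from the scraped output above
raw_names = ["Neighbourhoods of Delhi", "Ashok Nagar (Delhi)", "Badarpur, Delhi", "Babarpur"]
cleaned = [clean_neighborhood_name(n) for n in raw_names if not is_category_link(n)]
print(cleaned)
```

Whether cleaned names geocode more accurately than the raw ones would need to be checked against the geocoder's results.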

### 3. Get the geographical coordinates

In [14]:
# define a function to get coordinates
def get_latlng(neighborhood):
    # initialize your variable to None
    lat_lng_coords = None
    # loop until you get the coordinates
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, Delhi, India'.format(neighborhood))
        lat_lng_coords = g.latlng
    return lat_lng_coords
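The `while` loop above retries forever when the geocoding service keeps failing. One possible variant, sketched here under assumptions, caps the number of attempts and pauses between them. The function name, the `max_attempts` and `delay_seconds` parameters, and the injected `geocode` callable are all hypothetical; injecting the callable lets the retry logic be exercised without network access.

```python
import time

def get_latlng_with_retries(neighborhood, geocode, max_attempts=5, delay_seconds=1.0):
    """Return [lat, lng] for a neighborhood, or None after max_attempts failures.

    `geocode` is any callable that takes a query string and returns
    a [lat, lng] pair or None.
    """
    query = "{}, Delhi, India".format(neighborhood)
    for attempt in range(max_attempts):
        coords = geocode(query)
        if coords is not None:
            return coords
        # Pause before the next attempt, but not after the final one
        if attempt < max_attempts - 1:
            time.sleep(delay_seconds)
    return None
```

With the `geocoder` package used in this notebook, the callable could be `lambda q: geocoder.arcgis(q).latlng`. Callers would then need to handle a `None` result, for example by dropping that neighborhood before building the coordinates dataframe.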

In [17]:
coords = [ get_latlng(neighborhood) for neighborhood in dl_df["Neighborhood"].tolist() ]

Status code Unknown from https://geocode.arcgis.com/arcgis/rest/services/World/GeocodeServer/find: ERROR - HTTPSConnectionPool(host='geocode.arcgis.com', port=443): Read timed out. (read timeout=5.0)
Status code Unknown from https://geocode.arcgis.com/arcgis/rest/services/World/GeocodeServer/find: ERROR - HTTPSConnectionPool(host='geocode.arcgis.com', port=443): Read timed out. (read timeout=5.0)
Status code Unknown from https://geocode.arcgis.com/arcgis/rest/services/World/GeocodeServer/find: ERROR - HTTPSConnectionPool(host='geocode.arcgis.com', port=443): Read timed out. (read timeout=5.0)
Status code Unknown from https://geocode.arcgis.com/arcgis/rest/services/World/GeocodeServer/find: ERROR - HTTPSConnectionPool(host='geocode.arcgis.com', port=443): Read timed out. (read timeout=5.0)
Status code Unknown from https://geocode.arcgis.com/arcgis/rest/services/World/GeocodeServer/find: ERROR - HTTPSConnectionPool(host='geocode.arcgis.com', port=443): Read timed out. (read timeout=5.0)


KeyboardInterrupt: 

In [18]:
coords

[[28.523450000000025, 77.26178000000004],
 [28.692230000000052, 77.30124000000006],
 [28.69037000000003, 77.17609000000004],
 [28.710597501792023, 77.32696517369723],
 [28.50738000000007, 77.30346000000003],
 [28.50738000000007, 77.30346000000003],
 [28.652234222889238, 77.12939224396462],
 [28.800590000000057, 77.03473000000008],
 [28.549540000000036, 77.18167000000005],
 [28.699880000000064, 77.25906000000003],
 [28.595060000000046, 77.18573000000004],
 [28.656270000000063, 77.23232000000007],
 [28.67671000000007, 77.21767000000006],
 [28.633940000000052, 77.21968000000004],
 [28.60761000000008, 77.08714000000003],
 [28.65457890544559, 77.23339989939495],
 [28.62832000000003, 77.24727000000007],
 [28.605920000000026, 77.08529000000004],
 [28.560590000000047, 77.24678000000006],
 [28.57298000000003, 77.23357000000004],
 [28.591510000000028, 77.12945000000008],
 [28.699110000000076, 77.19105000000008],
 [28.594857177133914, 77.16729160908383],
 [28.684700000000078, 77.32774000000006],


In [19]:
df_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])

In [20]:
dl_df['Latitude'] = df_coords['Latitude']
dl_df['Longitude'] = df_coords['Longitude']

In [21]:
# check the neighborhoods and the coordinates
print(dl_df.shape)
dl_df

(139, 3)


Unnamed: 0,Neighborhood,Latitude,Longitude
0,Neighbourhoods of Delhi,28.52345,77.26178
1,Ashok Nagar (Delhi),28.69223,77.30124
2,Ashok Vihar,28.69037,77.17609
3,Ashram Chowk,28.710598,77.326965
4,Babarpur,28.50738,77.30346
5,"Badarpur, Delhi",28.50738,77.30346
6,Bali Nagar,28.652234,77.129392
7,Bawana,28.80059,77.03473
8,Ber Sarai,28.54954,77.18167
9,Bhajanpura,28.69988,77.25906


### 4. Create a map of Delhi with neighborhoods superimposed on top

In [24]:
# get the coordinates of Delhi
address = 'Delhi, India'

geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Delhi, India are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Delhi, India are 28.6517178, 77.2219388.


In [26]:
# create map of Delhi using latitude and longitude values
map_dl = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, neighborhood in zip(dl_df['Latitude'], dl_df['Longitude'], dl_df['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_dl)  
    
map_dl

### 5. Use the Foursquare API to explore the neighborhoods

In [38]:
# define Foursquare Credentials and Version
CLIENT_ID = 'QEF4SFROUUVOQKNPAFNNRDUW4ACAWSYYG312LM3BKEDDKPIZ' # your Foursquare ID
CLIENT_SECRET = 'CCBFW1WDVTUYDLPFX3SR0QBB5R2UKRYOZ3F1JDJ3PLQMPDRM' # your Foursquare Secret
VERSION = '20180604' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: QEF4SFROUUVOQKNPAFNNRDUW4ACAWSYYG312LM3BKEDDKPIZ
CLIENT_SECRET:CCBFW1WDVTUYDLPFX3SR0QBB5R2UKRYOZ3F1JDJ3PLQMPDRM
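Hard-coding and printing API credentials in a notebook exposes them to anyone who reads the file. One common alternative, shown here as a sketch rather than as part of the original workflow, is to read them from environment variables and mask them when printing. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are hypothetical.

```python
import os

def load_foursquare_credentials(environ=None):
    """Read the client ID and secret from environment variables (names are assumptions)."""
    environ = os.environ if environ is None else environ
    client_id = environ.get("FOURSQUARE_CLIENT_ID")
    client_secret = environ.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Foursquare credentials are not set in the environment")
    return client_id, client_secret

def mask_secret(secret, visible=4):
    """Show only the first `visible` characters so logs do not leak the full value."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return secret[:visible] + "*" * (len(secret) - visible)
```

A cell could then call `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` and print `mask_secret(CLIENT_SECRET)` instead of the raw value.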


In [58]:
# get up to 100 venues within a radius of 500 meters of each neighborhood
def getNearbyVenues(names, latitudes, longitudes):
    radius=500
    LIMIT=100
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
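The function above indexes straight into `["response"]['groups'][0]['items']` and `['categories'][0]`, so a response with no groups, or a venue with no category, raises an exception and stops the whole run. A more defensive parsing step might look like the sketch below. The helper name `extract_venues` is hypothetical, and the field layout is assumed to match the one the function above already relies on.

```python
def extract_venues(payload):
    """Return (name, lat, lng, category) tuples from an explore-style response dict.

    Missing groups yield an empty list; a venue without categories gets None
    as its category instead of raising IndexError.
    """
    groups = payload.get("response", {}).get("groups", [])
    if not groups:
        return []
    rows = []
    for item in groups[0].get("items", []):
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories", [])
        category = categories[0]["name"] if categories else None
        rows.append((venue.get("name"), location.get("lat"), location.get("lng"), category))
    return rows

# A hand-built payload using one venue from the output shown in this report
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Khana Khazana",
               "location": {"lat": 28.526114, "lng": 77.259298},
               "categories": [{"name": "Restaurant"}]}},
    {"venue": {"name": "Uncategorized place",  # made-up entry with no category
               "location": {"lat": 28.5, "lng": 77.2},
               "categories": []}},
]}]}}
print(extract_venues(sample))
```

Inside `getNearbyVenues`, the list comprehension could iterate over `extract_venues(requests.get(url).json())` and prepend the neighborhood name and coordinates to each tuple.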

In [61]:
dl_venues = getNearbyVenues(names=dl_df['Neighborhood'],latitudes=dl_df['Latitude'],longitudes=dl_df['Longitude'])

Neighbourhoods of Delhi
Ashok Nagar (Delhi)
Ashok Vihar
Ashram Chowk
Babarpur
Badarpur, Delhi
Bali Nagar
Bawana
Ber Sarai
Bhajanpura
Chanakyapuri
Chandni Chowk
Civil Lines, Delhi
Connaught Place, New Delhi
Dabri, New Delhi
Dariba Kalan
Daryaganj
Dashrath Puri
Dayanand Colony
Defence Colony
Delhi Cantonment
Derawal Nagar
Dhaula Kuan
Dilshad Colony
Dilshad Garden
Dwarka, Delhi
East Patel Nagar
Gadaipur, Mehrauli, New Delhi
Geetanjali Enclave
Ghitorni
Gole Market
Golf Links, New Delhi
Govindpuri
Greater Kailash
Green Park, Delhi
Gulabi Bagh
Gulmohar Park
Hauz Khas
Inder Puri
Janakpuri
Jangpura
Jia Sarai
Kabir Nagar, New Delhi
Kailash Colony
Kamla Nagar, New Delhi
Kapasheda Border, Delhi
Karol Bagh
Keshav Puram
Khaira, Delhi
Khanpur, Delhi
Khari Baoli
Kingsway Camp
Kirti Nagar
Kotla Mubarakpur Complex
Krishna Nagar, Delhi
Lajpat Nagar
Laxmi Nagar (Delhi)
Laxmibai Nagar
Lodhi Colony
Lutyens' Delhi
Madanpur Khadar JJ Colony
Maharani Bagh
Mahipalpur
Majnu-ka-tilla
Malviya Nagar (Delhi)
Mayapu

In [64]:
dl_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Neighbourhoods of Delhi,28.52345,77.26178,Khana Khazana,28.526114,77.259298,Restaurant
1,Neighbourhoods of Delhi,28.52345,77.26178,kalakhatta wala @ gk2,28.523928,77.260645,Ice Cream Shop
2,Neighbourhoods of Delhi,28.52345,77.26178,Axis Bank ATM,28.52315,77.26094,ATM
3,Neighbourhoods of Delhi,28.52345,77.26178,Axis Bank ATM,28.52353,77.25733,ATM
4,Ashok Nagar (Delhi),28.69223,77.30124,Axis Bank ATM,28.69647,77.29991,ATM


In [66]:
print('There are {} unique categories.'.format(len(dl_venues['Venue Category'].unique())))

There are 163 unique categories.


### 6. Analyze Each Neighborhood

In [68]:
# one hot encoding
dl_onehot = pd.get_dummies(dl_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
dl_onehot['Neighborhoods'] = dl_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [dl_onehot.columns[-1]] + list(dl_onehot.columns[:-1])
dl_onehot = dl_onehot[fixed_columns]

print(dl_onehot.shape)
dl_onehot.head()

(1108, 164)


Unnamed: 0,Neighborhoods,ATM,Afghan Restaurant,Airport Food Court,Airport Lounge,Airport Terminal,American Restaurant,Arcade,Art Gallery,Asian Restaurant,Athletics & Sports,Australian Restaurant,BBQ Joint,Bakery,Bank,Bar,Baseball Field,Bed & Breakfast,Beer Garden,Bengali Restaurant,Bike Shop,Bistro,Bookstore,Boutique,Breakfast Spot,Brewery,Burger Joint,Burmese Restaurant,Bus Station,Business Service,Café,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,College Gym,Comfort Food Restaurant,Convenience Store,Cosmetics Shop,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,English Restaurant,Event Space,Fabric Shop,Falafel Restaurant,Farm,Fast Food Restaurant,Fishing Store,Flea Market,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fruit & Vegetable Store,Furniture / Home Store,Garden,Gastropub,Gift Shop,Golf Course,Grocery Store,Gym,Gym / Fitness Center,Health & Beauty Service,Hindu Temple,Historic Site,History Museum,Hookah Bar,Hot Dog Joint,Hotel,IT Services,Ice Cream Shop,Indian Restaurant,Indian Sweet Shop,Indie Movie Theater,Irani Cafe,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Karaoke Bar,Kebab Restaurant,Korean Restaurant,Lawyer,Light Rail Station,Lighting Store,Lounge,Market,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Mosque,Motel,Mughlai Restaurant,Multicuisine Indian Restaurant,Multiplex,Museum,Neighborhood,Night Market,Nightclub,North Indian Restaurant,Nudist Beach,Other Nightlife,Paper / Office Supplies Store,Park,Pedestrian Plaza,Performing Arts Venue,Pet Store,Pharmacy,Photography Studio,Pizza Place,Playground,Plaza,Pool,Portuguese Restaurant,Pub,Recording Studio,Resort,Rest Area,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Sculpture Garden,Shoe Store,Shopping Mall,Shopping Plaza,Smoke Shop,Snack Place,South Indian Restaurant,Spa,Stadium,Steakhouse,Tea Room,Temple,Tex-Mex Restaurant,Thai Restaurant,Theater,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Travel Agency,University,Vietnamese Restaurant,Water Park,Wine Bar,Women's Store,Yoga Studio
0,Neighbourhoods of Delhi,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Neighbourhoods of Delhi,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Neighbourhoods of Delhi,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Neighbourhoods of Delhi,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Ashok Nagar (Delhi),1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [92]:
dl_onehot.describe()

Unnamed: 0,ATM,Afghan Restaurant,Airport Food Court,Airport Lounge,Airport Terminal,American Restaurant,Arcade,Art Gallery,Asian Restaurant,Athletics & Sports,Australian Restaurant,BBQ Joint,Bakery,Bank,Bar,Baseball Field,Bed & Breakfast,Beer Garden,Bengali Restaurant,Bike Shop,Bistro,Bookstore,Boutique,Breakfast Spot,Brewery,Burger Joint,Burmese Restaurant,Bus Station,Business Service,Café,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,College Gym,Comfort Food Restaurant,Convenience Store,Cosmetics Shop,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,English Restaurant,Event Space,Fabric Shop,Falafel Restaurant,Farm,Fast Food Restaurant,Fishing Store,Flea Market,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fruit & Vegetable Store,Furniture / Home Store,Garden,Gastropub,Gift Shop,Golf Course,Grocery Store,Gym,Gym / Fitness Center,Health & Beauty Service,Hindu Temple,Historic Site,History Museum,Hookah Bar,Hot Dog Joint,Hotel,IT Services,Ice Cream Shop,Indian Restaurant,Indian Sweet Shop,Indie Movie Theater,Irani Cafe,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Karaoke Bar,Kebab Restaurant,Korean Restaurant,Lawyer,Light Rail Station,Lighting Store,Lounge,Market,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Mosque,Motel,Mughlai Restaurant,Multicuisine Indian Restaurant,Multiplex,Museum,Neighborhood,Night Market,Nightclub,North Indian Restaurant,Nudist Beach,Other Nightlife,Paper / Office Supplies Store,Park,Pedestrian Plaza,Performing Arts Venue,Pet Store,Pharmacy,Photography Studio,Pizza Place,Playground,Plaza,Pool,Portuguese Restaurant,Pub,Recording Studio,Resort,Rest Area,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Sculpture Garden,Shoe Store,Shopping Mall,Shopping Plaza,Smoke Shop,Snack Place,South Indian Restaurant,Spa,Stadium,Steakhouse,Tea Room,Temple,Tex-Mex Restaurant,Thai Restaurant,Theater,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Travel Agency,University,Vietnamese Restaurant,Water Park,Wine Bar,Women's Store,Yoga Studio
count,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0,1108.0
mean,0.027076,0.000903,0.001805,0.000903,0.001805,0.002708,0.00361,0.000903,0.013538,0.001805,0.001805,0.009928,0.016245,0.004513,0.036101,0.000903,0.001805,0.001805,0.000903,0.000903,0.006318,0.001805,0.000903,0.006318,0.001805,0.004513,0.000903,0.000903,0.001805,0.065884,0.000903,0.037004,0.015343,0.002708,0.037906,0.000903,0.000903,0.005415,0.002708,0.000903,0.000903,0.005415,0.00722,0.018953,0.005415,0.015343,0.001805,0.002708,0.005415,0.000903,0.000903,0.000903,0.001805,0.001805,0.044224,0.000903,0.009025,0.000903,0.006318,0.001805,0.00722,0.00361,0.000903,0.00361,0.001805,0.001805,0.000903,0.002708,0.002708,0.009928,0.005415,0.000903,0.000903,0.001805,0.002708,0.002708,0.001805,0.049639,0.00361,0.009928,0.112816,0.000903,0.00361,0.000903,0.012635,0.00361,0.001805,0.002708,0.001805,0.000903,0.000903,0.000903,0.00722,0.000903,0.015343,0.021661,0.001805,0.000903,0.008123,0.000903,0.000903,0.002708,0.002708,0.000903,0.002708,0.001805,0.000903,0.000903,0.00361,0.005415,0.00361,0.000903,0.000903,0.004513,0.005415,0.000903,0.000903,0.001805,0.009928,0.000903,0.000903,0.000903,0.00361,0.000903,0.027076,0.002708,0.004513,0.001805,0.00361,0.009928,0.000903,0.000903,0.000903,0.024368,0.001805,0.002708,0.009928,0.000903,0.002708,0.006318,0.000903,0.002708,0.013538,0.006318,0.012635,0.000903,0.001805,0.006318,0.000903,0.000903,0.00361,0.00361,0.000903,0.000903,0.000903,0.002708,0.000903,0.000903,0.000903,0.000903,0.001805,0.000903,0.000903
std,0.162378,0.030042,0.042467,0.030042,0.042467,0.051987,0.060003,0.030042,0.115614,0.042467,0.042467,0.099187,0.126475,0.067055,0.186626,0.030042,0.042467,0.042467,0.030042,0.030042,0.079268,0.042467,0.030042,0.079268,0.042467,0.067055,0.030042,0.030042,0.042467,0.248192,0.030042,0.188856,0.122968,0.051987,0.191056,0.030042,0.030042,0.073421,0.051987,0.030042,0.030042,0.073421,0.084703,0.136421,0.073421,0.122968,0.042467,0.051987,0.073421,0.030042,0.030042,0.030042,0.042467,0.042467,0.205685,0.030042,0.094614,0.030042,0.079268,0.042467,0.084703,0.060003,0.030042,0.060003,0.042467,0.042467,0.030042,0.051987,0.051987,0.099187,0.073421,0.030042,0.030042,0.042467,0.051987,0.051987,0.042467,0.217296,0.060003,0.099187,0.31651,0.030042,0.060003,0.030042,0.111745,0.060003,0.042467,0.051987,0.042467,0.030042,0.030042,0.030042,0.084703,0.030042,0.122968,0.145639,0.042467,0.030042,0.0898,0.030042,0.030042,0.051987,0.051987,0.030042,0.051987,0.042467,0.030042,0.030042,0.060003,0.073421,0.060003,0.030042,0.030042,0.067055,0.073421,0.030042,0.030042,0.042467,0.099187,0.030042,0.030042,0.030042,0.060003,0.030042,0.162378,0.051987,0.067055,0.042467,0.060003,0.099187,0.030042,0.030042,0.030042,0.154259,0.042467,0.051987,0.099187,0.030042,0.051987,0.079268,0.030042,0.051987,0.115614,0.079268,0.111745,0.030042,0.042467,0.079268,0.030042,0.030042,0.060003,0.060003,0.030042,0.030042,0.030042,0.051987,0.030042,0.030042,0.030042,0.030042,0.042467,0.030042,0.030042
min,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
50%,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
75%,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
max,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


In [99]:
dl_onehot['Indian Restaurant'].value_counts()

0    983
1    125
Name: Indian Restaurant, dtype: int64

In [113]:
# next, group the rows by neighborhood and take the mean of each category column (its frequency of occurrence)
dl_grouped = dl_onehot.groupby(["Neighborhoods"]).mean().reset_index()

print(dl_grouped.shape)
dl_grouped

(129, 164)


Unnamed: 0,Neighborhoods,ATM,Afghan Restaurant,Airport Food Court,Airport Lounge,Airport Terminal,American Restaurant,Arcade,Art Gallery,Asian Restaurant,Athletics & Sports,Australian Restaurant,BBQ Joint,Bakery,Bank,Bar,Baseball Field,Bed & Breakfast,Beer Garden,Bengali Restaurant,Bike Shop,Bistro,Bookstore,Boutique,Breakfast Spot,Brewery,Burger Joint,Burmese Restaurant,Bus Station,Business Service,Café,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,College Gym,Comfort Food Restaurant,Convenience Store,Cosmetics Shop,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,English Restaurant,Event Space,Fabric Shop,Falafel Restaurant,Farm,Fast Food Restaurant,Fishing Store,Flea Market,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fruit & Vegetable Store,Furniture / Home Store,Garden,Gastropub,Gift Shop,Golf Course,Grocery Store,Gym,Gym / Fitness Center,Health & Beauty Service,Hindu Temple,Historic Site,History Museum,Hookah Bar,Hot Dog Joint,Hotel,IT Services,Ice Cream Shop,Indian Restaurant,Indian Sweet Shop,Indie Movie Theater,Irani Cafe,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Karaoke Bar,Kebab Restaurant,Korean Restaurant,Lawyer,Light Rail Station,Lighting Store,Lounge,Market,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Mosque,Motel,Mughlai Restaurant,Multicuisine Indian Restaurant,Multiplex,Museum,Neighborhood,Night Market,Nightclub,North Indian Restaurant,Nudist Beach,Other Nightlife,Paper / Office Supplies Store,Park,Pedestrian Plaza,Performing Arts Venue,Pet Store,Pharmacy,Photography Studio,Pizza Place,Playground,Plaza,Pool,Portuguese Restaurant,Pub,Recording Studio,Resort,Rest Area,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Sculpture Garden,Shoe Store,Shopping Mall,Shopping Plaza,Smoke Shop,Snack Place,South Indian Restaurant,Spa,Stadium,Steakhouse,Tea Room,Temple,Tex-Mex Restaurant,Thai Restaurant,Theater,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Travel Agency,University,Vietnamese Restaurant,Water Park,Wine Bar,Women's Store,Yoga Studio
0,Ashok Nagar (Delhi),1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Ashok Vihar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Babarpur,0.666667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Badarpur, Delhi",0.666667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Bali Nagar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Ber Sarai,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Chanakyapuri,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Chandni Chowk,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.153846,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.307692,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.153846,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.153846,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,"Civil Lines, Delhi",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,"Connaught Place, New Delhi",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.028571,0.014286,0.0,0.085714,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.085714,0.0,0.071429,0.028571,0.0,0.057143,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.028571,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.057143,0.0,0.0,0.0,0.014286,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.014286,0.1,0.0,0.014286,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.014286,0.0,0.014286,0.0,0.0,0.0,0.042857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.014286,0.042857,0.0,0.0,0.0,0.014286,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [100]:
len(dl_grouped[dl_grouped["Indian Restaurant"] > 0])

56

In [101]:
dl_rest = dl_grouped[["Neighborhoods","Indian Restaurant"]]

In [102]:
dl_rest.head()

Unnamed: 0,Neighborhoods,Indian Restaurant
0,Ashok Nagar (Delhi),0.0
1,Ashok Vihar,0.0
2,Babarpur,0.0
3,"Badarpur, Delhi",0.0
4,Bali Nagar,0.25


### 7. Cluster Neighborhoods
Run k-means to cluster the neighborhoods in Delhi into 3 clusters based on their Indian Restaurant frequency.

In [103]:
# set number of clusters
kclusters = 3

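# keep only the numeric feature; pass the axis by keyword (positional axis is rejected by recent pandas)
dl_clustering = dl_rest.drop(columns=["Neighborhoods"])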

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(dl_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 1, 1, 0, 1, 1, 2, 2, 0])
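The label numbers that k-means assigns are arbitrary, so it helps to check which label corresponds to a low, moderate, or high frequency before interpreting the clusters. The following is a minimal sketch on made-up frequencies (the `freq` values are hypothetical and stand in for the `Indian Restaurant` column); it reads the fitted `cluster_centers_` and ranks the labels by mean frequency.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical frequencies standing in for the "Indian Restaurant" column
freq = np.array(
    [0.0, 0.0, 0.0, 0.0, 0.10, 0.15, 0.20, 0.45, 0.50, 0.55]
).reshape(-1, 1)

# Same settings as the notebook, with n_init set explicitly
km = KMeans(n_clusters=3, random_state=0, n_init=10).fit(freq)

# One centroid per cluster: the mean frequency of that cluster's members
centers = km.cluster_centers_.ravel()

# Rank the labels from lowest to highest mean frequency
order = np.argsort(centers)
for rank, label in enumerate(order):
    print(f"label {label}: mean frequency {centers[label]:.3f} (rank {rank})")
```

Applied to the fitted `kmeans` object above, the same `cluster_centers_` lookup would show which of the labels 0, 1, and 2 holds the neighborhoods with few or no Indian restaurants.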

In [104]:
# create a new dataframe that holds the Indian Restaurant frequency and the cluster label for each neighborhood
dl_merged = dl_rest.copy()

# add clustering labels
dl_merged["Cluster Labels"] = kmeans.labels_

In [105]:
dl_merged.rename(columns={"Neighborhoods": "Neighborhood"}, inplace=True)
dl_merged.head()

Unnamed: 0,Neighborhood,Indian Restaurant,Cluster Labels
0,Ashok Nagar (Delhi),0.0,1
1,Ashok Vihar,0.0,1
2,Babarpur,0.0,1
3,"Badarpur, Delhi",0.0,1
4,Bali Nagar,0.25,0


In [106]:
# join dl_merged with dl_df to add latitude/longitude for each neighborhood
dl_merged = dl_merged.join(dl_df.set_index("Neighborhood"), on="Neighborhood")

print(dl_merged.shape)
dl_merged.head() # check the last columns!

(129, 5)


Unnamed: 0,Neighborhood,Indian Restaurant,Cluster Labels,Latitude,Longitude
0,Ashok Nagar (Delhi),0.0,1,28.69223,77.30124
1,Ashok Vihar,0.0,1,28.69037,77.17609
2,Babarpur,0.0,1,28.50738,77.30346
3,"Badarpur, Delhi",0.0,1,28.50738,77.30346
4,Bali Nagar,0.25,0,28.652234,77.129392
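
Because `join` performs a left join, a neighborhood with no match in the coordinate table would silently receive `NaN` coordinates, and a duplicated key would multiply rows. The sketch below shows one way to check both conditions; the two small frames and their names (`clusters`, `coords`) are hypothetical stand-ins for `dl_merged` and `dl_df`.

```python
import pandas as pd

# Hypothetical stand-ins for dl_merged (cluster results) and dl_df (coordinates)
clusters = pd.DataFrame({
    "Neighborhood": ["Alpha", "Beta", "Gamma"],
    "Cluster Labels": [1, 0, 2],
})
coords = pd.DataFrame({
    "Neighborhood": ["Alpha", "Beta"],
    "Latitude": [28.60, 28.65],
    "Longitude": [77.20, 77.25],
})

# The lookup key should be unique, otherwise the join duplicates rows
assert coords["Neighborhood"].is_unique

merged = clusters.join(coords.set_index("Neighborhood"), on="Neighborhood")

# A left join keeps every row of `clusters`; unmatched rows get NaN coordinates
missing = merged[merged["Latitude"].isna()]
print(merged.shape)
print(missing["Neighborhood"].tolist())
```

Rows listed in `missing` would need coordinates filled in or would have to be dropped before they are plotted on a map.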


In [107]:
# sort the results by Cluster Labels
print(dl_merged.shape)
dl_merged.sort_values(["Cluster Labels"], inplace=True)
dl_merged

(129, 5)


Unnamed: 0,Neighborhood,Indian Restaurant,Cluster Labels,Latitude,Longitude
101,Saket District Centre,0.106667,0,28.52813,77.21905
28,"Green Park, Delhi",0.148148,0,28.55897,77.20462
31,Hauz Khas,0.166667,0,28.55109,77.20399
32,Inder Puri,0.2,0,28.62803,77.14504
100,Saket (Delhi),0.153846,0,28.52407,77.20677
37,"Kamla Nagar, New Delhi",0.1,0,28.68376,77.20163
39,Karol Bagh,0.176471,0,28.65045,77.18873
95,"Rani Bagh, Delhi",0.25,0,28.68584,77.13188
42,Khari Baoli,0.125,0,28.65726,77.22284
93,Rajouri Garden,0.176471,0,28.64562,77.12209


In [108]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(dl_merged['Latitude'], dl_merged['Longitude'], dl_merged['Neighborhood'], dl_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Examine Clusters

#### Cluster 0

In [109]:
dl_merged.loc[dl_merged['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Indian Restaurant,Cluster Labels,Latitude,Longitude
101,Saket District Centre,0.106667,0,28.52813,77.21905
28,"Green Park, Delhi",0.148148,0,28.55897,77.20462
31,Hauz Khas,0.166667,0,28.55109,77.20399
32,Inder Puri,0.2,0,28.62803,77.14504
100,Saket (Delhi),0.153846,0,28.52407,77.20677
37,"Kamla Nagar, New Delhi",0.1,0,28.68376,77.20163
39,Karol Bagh,0.176471,0,28.65045,77.18873
95,"Rani Bagh, Delhi",0.25,0,28.68584,77.13188
42,Khari Baoli,0.125,0,28.65726,77.22284
93,Rajouri Garden,0.176471,0,28.64562,77.12209


#### Cluster 1

In [110]:
dl_merged.loc[dl_merged['Cluster Labels'] == 1]

Unnamed: 0,Neighborhood,Indian Restaurant,Cluster Labels,Latitude,Longitude
75,New Friends Colony,0.0,1,28.57812,77.26999
76,New Moti Bagh,0.0,1,28.580997,77.181823
85,Pandav Nagar,0.0,1,28.61458,77.27574
123,"Vasant Vihar, Delhi",0.0,1,28.56494,77.16131
80,Okhla,0.0,1,28.53247,77.27839
72,Neighbourhoods of Delhi,0.0,1,28.52345,77.26178
71,Naveen Shahdara,0.0,1,28.67369,77.28326
82,Palam,0.0,1,28.59106,77.09117
120,Tis Hazari,0.0,1,28.666,77.2152
84,Pamposh Enclave,0.0,1,28.546776,77.244759


#### Cluster 2

In [111]:
dl_merged.loc[dl_merged['Cluster Labels'] == 2]

Unnamed: 0,Neighborhood,Indian Restaurant,Cluster Labels,Latitude,Longitude
121,Urdu Bazaar,0.428571,2,28.648881,77.238692
7,Chandni Chowk,0.307692,2,28.65627,77.23232
8,"Civil Lines, Delhi",0.333333,2,28.67671,77.21767
43,Kingsway Camp,0.333333,2,28.71169,77.20197
15,Derawal Nagar,0.333333,2,28.69911,77.19105
57,Mayur Vihar,0.5,2,28.612795,77.288501
18,Dilshad Garden,0.375,2,28.67904,77.31476
94,Rama Krishna Puram,0.5,2,28.56553,77.17719
105,Seelampur subdivision,0.666667,2,28.68944,77.29381
104,Sarojini Nagar,0.307692,2,28.5756,77.19364


#### Observations:
Indian restaurants are concentrated mainly in the central area of Delhi. Cluster 2 has the highest frequency of Indian restaurants (about 0.31 to 0.67 of the venues in the rows displayed above) and cluster 0 a moderate frequency (0.10 to 0.25), while the neighborhoods in cluster 1 have very few or none. Cluster 1 therefore represents the best opportunity for opening a new Indian restaurant, since there is little to no competition from existing ones. Indian restaurants in cluster 2, by contrast, are likely to face intense competition because of oversupply and high concentration. Seen another way, the oversupply occurs mostly in the central area of the city, while the suburbs still have very few Indian restaurants. This project therefore recommends that property developers use these findings to open new Indian restaurants in cluster 1 neighborhoods, where there is little to no competition.
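
The comparison between clusters can be checked with a per-cluster summary of the `Indian Restaurant` column. The sketch below uses only six rows copied from the tables displayed above (two per cluster), not the full 129-row `dl_merged` frame, so the numbers it prints are illustrative; the names `sample` and `summary` are introduced here for the example.

```python
import pandas as pd

# Six rows copied from the displayed output above (two per cluster)
sample = pd.DataFrame({
    "Neighborhood": ["Bali Nagar", "Hauz Khas", "Okhla", "Palam",
                     "Chandni Chowk", "Mayur Vihar"],
    "Indian Restaurant": [0.25, 0.166667, 0.0, 0.0, 0.307692, 0.5],
    "Cluster Labels": [0, 0, 1, 1, 2, 2],
})

# Count, range, and mean of the frequency within each cluster
summary = (
    sample.groupby("Cluster Labels")["Indian Restaurant"]
    .agg(["count", "min", "mean", "max"])
)
print(summary)
```

Running the same `groupby` on the full `dl_merged` frame would give the size and frequency range of each cluster, which is the evidence behind the recommendation for cluster 1.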