## Part 02 - Exploring and Clustering Delhi Neighborhoods

#### Importing the essential libraries

In [1]:
import numpy as np
import pandas as pd
from geopy.geocoders import Nominatim
import folium
import requests
from pandas import json_normalize  # pandas.io.json.json_normalize is deprecated since pandas 1.0
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans

#### Importing the concatenated dataframe for further processing

In [2]:
df_final = pd.read_csv("df_final.csv")
df_delhi = df_final[df_final['region'].str.contains("Delhi")].reset_index(drop=True)
print(df_delhi.shape)
df_delhi.head()

(95, 5)


Unnamed: 0,pincode,region,neighborhood,Latitude,Longitude
0,110001,Delhi,"Baroda House , Bengali Market , Bhagat Singh M...",28.630485,77.215051
1,110002,Delhi,"A.G.C.R. , Ajmeri Gate Extn. , Darya Ganj , Ga...",28.64625,77.265751
2,110003,Delhi,"Delhi High Court Extension Counter , Delhi Hig...",28.596234,77.223611
3,110004,Delhi,Rashtrapati Bhawan,28.614348,77.19943
4,110005,Delhi,"Anand Parbat Indl. Area , Anand Parbat , Bank ...",28.657456,77.191789


#### Use geopy library to get the latitude and longitude values of Delhi, India.

In [3]:
address = 'Delhi, India'

geolocator = Nominatim(user_agent="delhi_explorer")  # geopy expects a user_agent; this name is arbitrary
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Delhi city are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Delhi city are 28.6273928, 77.1716954.




#### Create a map of Delhi with neighborhoods superimposed on top.

In [4]:
map_delhi = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, label in zip(df_delhi['Latitude'], df_delhi['Longitude'], df_delhi['neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_delhi)
    
map_delhi

#### Utilizing the Foursquare API to explore and segment neighborhoods

In [5]:
CLIENT_ID = 'NUTE4SSXCN4MMYEDX2XSN2PFXJMWCV4PAZSSYHYGUURYNHPL' # your Foursquare ID
CLIENT_SECRET = 'TMVFNU45RMVOURKBQ5DYQOZRAQIMPM3SQNQDLD0HDH10VZYA' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: NUTE4SSXCN4MMYEDX2XSN2PFXJMWCV4PAZSSYHYGUURYNHPL
CLIENT_SECRET:TMVFNU45RMVOURKBQ5DYQOZRAQIMPM3SQNQDLD0HDH10VZYA


### 2.1 Exploring Neighborhoods in Delhi

#### Exploring the first neighborhood in the dataset

In [6]:
df_delhi.loc[0, 'neighborhood']

'Baroda House , Bengali Market , Bhagat Singh Market , Connaught Place , Constitution House , Election Commission , Janpath , Krishi Bhawan , Lady Harding Medical College , North Avenue , Parliament House , Patiala House , Pragati Maidan Camp , Pragati Maidan , Rail Bhawan , Sansad Marg , Sansadiya Soudh , Secretariat North , Shastri Bhawan , Supreme Court , New Delhi G.P.O. '

In [7]:
neighborhood_latitude = df_delhi.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = df_delhi.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = df_delhi.loc[0, 'neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are \n{}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Baroda House , Bengali Market , Bhagat Singh Market , Connaught Place , Constitution House , Election Commission , Janpath , Krishi Bhawan , Lady Harding Medical College , North Avenue , Parliament House , Patiala House , Pragati Maidan Camp , Pragati Maidan , Rail Bhawan , Sansad Marg , Sansadiya Soudh , Secretariat North , Shastri Bhawan , Supreme Court , New Delhi G.P.O.  are 
28.6304847, 77.2150513.


In [8]:
LIMIT = 100
radius = 500
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url

'https://api.foursquare.com/v2/venues/explore?&client_id=NUTE4SSXCN4MMYEDX2XSN2PFXJMWCV4PAZSSYHYGUURYNHPL&client_secret=TMVFNU45RMVOURKBQ5DYQOZRAQIMPM3SQNQDLD0HDH10VZYA&v=20180605&ll=28.6304847,77.2150513&radius=500&limit=100'
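The URL above embeds the credentials through string formatting. As a minimal sketch, the same query string can be assembled with the standard library's `urlencode`, with the credentials read from environment variables instead of being written into the notebook. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `build_explore_url` are assumptions made for this illustration, not part of the original notebook.

```python
import os
from urllib.parse import urlencode

# Same endpoint as the URL printed above.
EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, radius=500, limit=100,
                      version="20180605", env=os.environ):
    """Build the explore URL for one coordinate pair.

    Credentials come from `env`; the two key names below are
    hypothetical and chosen only for this sketch.
    """
    params = {
        "client_id": env["FOURSQUARE_CLIENT_ID"],
        "client_secret": env["FOURSQUARE_CLIENT_SECRET"],
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in "lat,lng" unescaped, as in the URL above.
    return EXPLORE_ENDPOINT + "?" + urlencode(params, safe=",")
```

Calling `build_explore_url(neighborhood_latitude, neighborhood_longitude)` should then yield a URL with the same parameters as the one shown above, provided both environment variables are set.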

In [9]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e2694f9be61c9001beae7d8'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'N.D. Charge 4',
  'headerFullLocation': 'N.D. Charge 4, Delhi',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 42,
  'suggestedBounds': {'ne': {'lat': 28.634984704500006,
    'lng': 77.22016860474847},
   'sw': {'lat': 28.625984695499994, 'lng': 77.20993399525153}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4d655003823ca35d7f1cfe88',
       'name': 'Jain Chawal Wale',
       'location': {'address': 'Shivaji Stadium',
        'crossStreet': 'Connaught Place',
        'lat': 28.631328358375647,
        'lng': 77.21617205630533,
        'labeledLatLn

In [10]:
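def get_category_type(row):
    """Return the name of the first category listed for a venue, or None."""
    try:
        categories_list = row['categories']
    except KeyError:
        # flattened explore results use the 'venue.' prefix
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']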

In [11]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Jain Chawal Wale,Food Truck,28.631328,77.216172
1,HOTEL SARAVANA BHAVAN,South Indian Restaurant,28.632319,77.216445
2,Aqua,Lounge,28.628809,77.215636
3,Pind Balluchi,North Indian Restaurant,28.630318,77.2176
4,Starbucks,Coffee Shop,28.632011,77.217731


In [12]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

42 venues were returned by Foursquare.


In [13]:
# The following function retrieves the venues near each given name/coordinate pair and stores them in a dataframe.

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
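The list comprehension in `getNearbyVenues` indexes `categories[0]` directly, which would raise an `IndexError` for any venue returned without a category. The following is a hedged sketch of a more defensive extraction step. The helper name `extract_venue_rows` is hypothetical, and it assumes the `items` list has the same shape as the response printed earlier.

```python
def extract_venue_rows(name, lat, lng, items):
    """Flatten the 'items' list of one explore response into tuples.

    Missing fields become None instead of raising, so one incomplete
    venue does not abort the whole loop over neighborhoods.
    """
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else None
        rows.append((
            name,
            lat,
            lng,
            venue.get("name"),
            location.get("lat"),
            location.get("lng"),
            category,
        ))
    return rows
```

Inside the loop, `venues_list.append(extract_venue_rows(name, lat, lng, results))` could stand in for the inline comprehension. Rows with a `None` category would then need to be dropped or labelled before one-hot encoding.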

#### Retrieve all venues given the Addresses

In [15]:
delhi_neighborhoods = df_delhi
delhi_venues = getNearbyVenues(names=delhi_neighborhoods['neighborhood'],
                                   latitudes=delhi_neighborhoods['Latitude'],
                                   longitudes=delhi_neighborhoods['Longitude']
                                  )

Baroda House , Bengali Market , Bhagat Singh Market , Connaught Place , Constitution House , Election Commission , Janpath , Krishi Bhawan , Lady Harding Medical College , North Avenue , Parliament House , Patiala House , Pragati Maidan Camp , Pragati Maidan , Rail Bhawan , Sansad Marg , Sansadiya Soudh , Secretariat North , Shastri Bhawan , Supreme Court , New Delhi G.P.O. 
A.G.C.R. , Ajmeri Gate Extn. , Darya Ganj , Gandhi Smarak Nidhi , I.P.Estate , Indraprastha , Minto Road 
Delhi High Court Extension Counter , Delhi High Court , Pandara Road , Aliganj  (South Delhi), C G O Complex , Golf Links , Kasturba Nagar  (South Delhi), Lodi Road , Pragati Vihar , Safdarjung Air Port 
Rashtrapati Bhawan 
Anand Parbat Indl. Area , Anand Parbat , Bank Street  (Central Delhi), Desh Bandhu Gupta Road , Guru Gobind Singh Marg , Karol Bagh , Master Prithvi Nath Marg , Sat Nagar 
Delhi G.P.O. , Baratooti , Chandni Chowk , Chawri Bazar , Dareeba , Delhi Sadar Bazar , S.T. Road , Hauz Qazi , Jama Mas

#### Check size of resulting dataframe

In [16]:
print(delhi_venues.shape)
delhi_venues.head()

(414, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Baroda House , Bengali Market , Bhagat Singh M...",28.630485,77.215051,Jain Chawal Wale,28.631328,77.216172,Food Truck
1,"Baroda House , Bengali Market , Bhagat Singh M...",28.630485,77.215051,HOTEL SARAVANA BHAVAN,28.632319,77.216445,South Indian Restaurant
2,"Baroda House , Bengali Market , Bhagat Singh M...",28.630485,77.215051,Aqua,28.628809,77.215636,Lounge
3,"Baroda House , Bengali Market , Bhagat Singh M...",28.630485,77.215051,Pind Balluchi,28.630318,77.2176,North Indian Restaurant
4,"Baroda House , Bengali Market , Bhagat Singh M...",28.630485,77.215051,Starbucks,28.632011,77.217731,Coffee Shop


#### Count of venues returned for each neighborhood

In [17]:
delhi_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"505 A B Workshop , A F Palam , Aps Colony , Bazar Road , C.V.D. , COD (South West Delhi), Delhi Cantt , Dhaula Kuan , Kirby Place , Pinto Park , R R Hospital , Signal Enclave , Station Road (South West Delhi), Subroto Park",4,4,4,4,4,4
"A.K.Market , Multani Dhanda , Pahar Ganj , Swami Ram Tirth Nagar",4,4,4,4,4,4
"Abul Fazal Enclave-I , Jamia Nagar , New Friends Colony , Sukhdev Vihar , Zakir Nagar",3,3,3,3,3,3
"Adrash Nagar , Bhalaswa , Jahangir Puri A Block , Jahangir Puri D Block , Jahangir Puri H Block , N.S.Mandi",4,4,4,4,4,4
"Air Force Station Tugalkabad , BSF Camp Tigri , Dakshinpuri Phase-I , Dakshinpuri Phase-II , Dakshinpuri Phase-III , Deoli , Dr. Ambedkar Nagar (South Delhi), Hamdard Nagar , Khanpur (South Delhi), Pushpa Bhawan , Talimabad",5,5,5,5,5,5
"Alaknanda , Chittranjan Park , Kalkaji , Nehru Place",26,26,26,26,26,26
"Amar Colony , Defence Colony (South Delhi), Krishna Market , Lajpat Nagar (South Delhi)",5,5,5,5,5,5
"Amberhai , District Court Complex Dwarka , Dwarka Sec-6",1,1,1,1,1,1
"Anand Niketan , Chanakya Puri , Malcha Marg , Moti Bagh , Nanak Pura , South Delhi Campus",5,5,5,5,5,5
"Anand Parbat Indl. Area , Anand Parbat , Bank Street (Central Delhi), Desh Bandhu Gupta Road , Guru Gobind Singh Marg , Karol Bagh , Master Prithvi Nath Marg , Sat Nagar",5,5,5,5,5,5


### 2.2 Analyze each neighborhood

#### How many unique categories can be found among all the returned venues?

In [18]:
print('There are {} unique categories.'.format(len(delhi_venues['Venue Category'].unique())))

There are 119 unique categories.


In [19]:
# one hot encoding
delhi_onehot = pd.get_dummies(delhi_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
delhi_onehot['Neighborhood'] = delhi_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [delhi_onehot.columns[-1]] + list(delhi_onehot.columns[:-1])
delhi_onehot = delhi_onehot[fixed_columns]

delhi_onehot.head()

Unnamed: 0,Neighborhood,ATM,Airport Food Court,American Restaurant,Arcade,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bank,...,Sporting Goods Shop,Stadium,Temple,Theater,Tibetan Restaurant,Toy / Game Store,Train Station,Vegetarian / Vegan Restaurant,Water Park,Yoga Studio
0,"Baroda House , Bengali Market , Bhagat Singh M...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,"Baroda House , Bengali Market , Bhagat Singh M...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,"Baroda House , Bengali Market , Bhagat Singh M...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,"Baroda House , Bengali Market , Bhagat Singh M...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"Baroda House , Bengali Market , Bhagat Singh M...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


#### And let's find out the dataframe size

In [20]:
delhi_onehot.shape

(414, 120)

#### Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category.

In [21]:
delhi_grouped = delhi_onehot.groupby('Neighborhood').mean().reset_index()
delhi_grouped

Unnamed: 0,Neighborhood,ATM,Airport Food Court,American Restaurant,Arcade,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bank,...,Sporting Goods Shop,Stadium,Temple,Theater,Tibetan Restaurant,Toy / Game Store,Train Station,Vegetarian / Vegan Restaurant,Water Park,Yoga Studio
0,"505 A B Workshop , A F Palam , Aps Colony , Ba...",0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,0.0,...,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.00
1,"A.K.Market , Multani Dhanda , Pahar Ganj , Swa...",0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,0.0,...,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.00
2,"Abul Fazal Enclave-I , Jamia Nagar , New Frien...",0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,0.0,...,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.00
3,"Adrash Nagar , Bhalaswa , Jahangir Puri A Bloc...",0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,0.0,...,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.00
4,"Air Force Station Tugalkabad , BSF Camp Tigri ...",0.000000,0.000000,0.0,0.200000,0.000000,0.0,0.000000,0.000000,0.0,...,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.00
5,"Alaknanda , Chittranjan Park , Kalkaji , Nehru...",0.000000,0.000000,0.0,0.038462,0.000000,0.0,0.000000,0.000000,0.0,...,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.00
6,"Amar Colony , Defence Colony (South Delhi), K...",0.000000,0.000000,0.0,0.000000,0.000000,0.2,0.000000,0.000000,0.0,...,0.0,0.0,0.000000,0.2,0.000000,0.0,0.000000,0.000000,0.0,0.00
7,"Amberhai , District Court Complex Dwarka , Dwa...",0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,0.0,...,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.00
8,"Anand Niketan , Chanakya Puri , Malcha Marg , ...",0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,0.0,...,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.00
9,"Anand Parbat Indl. Area , Anand Parbat , Bank ...",0.000000,0.000000,0.0,0.000000,0.200000,0.0,0.000000,0.000000,0.0,...,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.00


In [22]:
delhi_grouped.shape

(78, 120)
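Taking the mean of one-hot columns within a group gives, for each category, the share of that neighborhood's venues that fall in the category. The sketch below reproduces that arithmetic in plain Python on toy data. The neighborhood names `"A"` and `"B"` and the helper `category_frequencies` are made up for illustration and are not part of the dataset.

```python
from collections import Counter, defaultdict


def category_frequencies(venues):
    """Map each neighborhood to {category: share of its venues}.

    `venues` is an iterable of (neighborhood, category) pairs, i.e. the
    two columns that the one-hot encoding and groupby operate on.
    """
    counts = defaultdict(Counter)
    for neighborhood, category in venues:
        counts[neighborhood][category] += 1

    frequencies = {}
    for neighborhood, counter in counts.items():
        total = sum(counter.values())
        frequencies[neighborhood] = {
            category: n / total for category, n in counter.items()
        }
    return frequencies


# Toy input: four venues in "A", one in "B".
toy_venues = [
    ("A", "Indian Restaurant"),
    ("A", "Indian Restaurant"),
    ("A", "Farmers Market"),
    ("A", "Convenience Store"),
    ("B", "ATM"),
]
toy_frequencies = category_frequencies(toy_venues)
```

Here `toy_frequencies["A"]["Indian Restaurant"]` is 0.5 because two of the four venues in `"A"` are Indian restaurants. Categories absent from a neighborhood simply do not appear in its dictionary, whereas the dataframe version stores them as 0.0.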

#### Let's print each neighborhood along with its top 5 most common venues

In [23]:
num_top_venues = 5

for hood in delhi_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = delhi_grouped[delhi_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----505 A B Workshop , A F Palam , Aps Colony , Bazar Road , C.V.D. , COD  (South West Delhi), Delhi Cantt , Dhaula Kuan , Kirby Place , Pinto Park , R R Hospital , Signal Enclave , Station Road  (South West Delhi), Subroto Park ----
               venue  freq
0  Indian Restaurant  0.50
1     Farmers Market  0.25
2  Convenience Store  0.25
3                ATM  0.00
4             Mosque  0.00


----A.K.Market , Multani Dhanda , Pahar Ganj , Swami Ram Tirth Nagar ----
                venue  freq
0         High School  0.25
1     Bed & Breakfast  0.25
2  Light Rail Station  0.25
3     Motorcycle Shop  0.25
4                 ATM  0.00


----Abul Fazal Enclave-I , Jamia Nagar , New Friends Colony , Sukhdev Vihar , Zakir Nagar ----
                     venue  freq
0       Italian Restaurant  0.33
1  Health & Beauty Service  0.33
2                      Gym  0.33
3                      ATM  0.00
4      Monument / Landmark  0.00


----Adrash Nagar , Bhalaswa , Jahangir Puri A Block , Jahangir 

#### Let's put that into a pandas dataframe. First, let's write a function to sort the venues in descending order.

In [24]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [25]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = delhi_grouped['Neighborhood']

for ind in np.arange(delhi_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(delhi_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()
neighborhoods_venues_sorted.shape

(78, 11)
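The column labels above rely on a three-element suffix list plus an exception for everything else. That works for the ten columns used here, but it would label a 21st column "21th". As a small sketch, a helper can compute the suffix directly. The name `ordinal` is hypothetical and not part of the original notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        # 11th, 12th, 13th and the rest of the teens always take 'th'.
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)


# Rebuild the same column names as in the cell above.
columns = ["Neighborhood"] + [
    "{} Most Common Venue".format(ordinal(i)) for i in range(1, 11)
]
```

For the ten columns in this notebook the resulting names match the ones generated by the original loop.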

### 2.3 Clustering Neighborhoods

Run k-means to cluster the neighborhoods into 5 clusters.

In [26]:
# set number of clusters
kclusters = 5

delhi_grouped_clustering = delhi_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(delhi_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([1, 2, 2, 1, 4, 2, 4, 2, 4, 2], dtype=int32)
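To make the clustering step less of a black box, here is a minimal pure-Python sketch of Lloyd's algorithm, the iteration that k-means is built on. It is only an illustration under simplifying assumptions (random initial centroids, a single run) and is not scikit-learn's implementation, which uses a smarter initialization and other refinements. The function names are made up for this sketch.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def simple_kmeans(points, k, iterations=100, seed=0):
    """Minimal Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Start from k distinct input points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)

    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    return labels, centroids


# Toy data: two well-separated groups of 2-D points.
toy_points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
toy_labels, toy_centroids = simple_kmeans(toy_points, k=2)
```

In the notebook, each row of `delhi_grouped_clustering` plays the role of one point, with one coordinate per venue category.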

In [27]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

delhi_merged = delhi_neighborhoods

# join the sorted-venue table onto the neighborhood data to add latitude/longitude for each neighborhood
delhi_merged = delhi_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='neighborhood')
delhi_merged.head() # check the last columns!

Unnamed: 0,pincode,region,neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,110001,Delhi,"Baroda House , Bengali Market , Bhagat Singh M...",28.630485,77.215051,2.0,Café,Coffee Shop,Bar,Chinese Restaurant,Hotel,Fast Food Restaurant,Pub,Italian Restaurant,Bistro,Indian Restaurant
1,110002,Delhi,"A.G.C.R. , Ajmeri Gate Extn. , Darya Ganj , Ga...",28.64625,77.265751,,,,,,,,,,,
2,110003,Delhi,"Delhi High Court Extension Counter , Delhi Hig...",28.596234,77.223611,2.0,Café,Furniture / Home Store,Dessert Shop,Pub,Bar,Sandwich Place,Toy / Game Store,Eastern European Restaurant,Event Space,Farmers Market
3,110004,Delhi,Rashtrapati Bhawan,28.614348,77.19943,4.0,Museum,Garden,Yoga Studio,Frozen Yogurt Shop,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant
4,110005,Delhi,"Anand Parbat Indl. Area , Anand Parbat , Bank ...",28.657456,77.191789,2.0,Dessert Shop,Snack Place,Asian Restaurant,Pizza Place,Coffee Shop,Fast Food Restaurant,Food Court,Food & Drink Shop,Food,Flea Market


#### Finding out the unique values in cluster labels

Some addresses have NaN values, which indicates that no venues were returned within the 500 m radius. These addresses do not belong to any cluster.

In [28]:
delhi_merged['Cluster Labels'].unique().tolist()

[2.0, nan, 4.0, 1.0, 3.0, 0.0]

#### Dropping the rows that have NaN values in their Cluster Labels

In [29]:
delhi_merged = delhi_merged.dropna()
delhi_merged = delhi_merged.reset_index()
delhi_merged

Unnamed: 0,index,pincode,region,neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,0,110001,Delhi,"Baroda House , Bengali Market , Bhagat Singh M...",28.630485,77.215051,2.0,Café,Coffee Shop,Bar,Chinese Restaurant,Hotel,Fast Food Restaurant,Pub,Italian Restaurant,Bistro,Indian Restaurant
1,2,110003,Delhi,"Delhi High Court Extension Counter , Delhi Hig...",28.596234,77.223611,2.0,Café,Furniture / Home Store,Dessert Shop,Pub,Bar,Sandwich Place,Toy / Game Store,Eastern European Restaurant,Event Space,Farmers Market
2,3,110004,Delhi,Rashtrapati Bhawan,28.614348,77.199430,4.0,Museum,Garden,Yoga Studio,Frozen Yogurt Shop,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant
3,4,110005,Delhi,"Anand Parbat Indl. Area , Anand Parbat , Bank ...",28.657456,77.191789,2.0,Dessert Shop,Snack Place,Asian Restaurant,Pizza Place,Coffee Shop,Fast Food Restaurant,Food Court,Food & Drink Shop,Food,Flea Market
4,5,110006,Delhi,"Delhi G.P.O. , Baratooti , Chandni Chowk , Cha...",28.651309,77.228772,2.0,Paper / Office Supplies Store,Hardware Store,Fast Food Restaurant,Mosque,Snack Place,Light Rail Station,Hotel,Health & Beauty Service,Food Court,Donut Shop
5,6,110007,Delhi,"C.C.I. , Delhi University , Gulabi Bagh , Jawa...",28.677368,77.202799,2.0,Park,Indian Restaurant,Pizza Place,Coffee Shop,Flea Market,Bank,Breakfast Spot,Donut Shop,Food Court,Eastern European Restaurant
6,7,110008,Delhi,"Dada Ghosh Bhawan , Patel Nagar East , Patel N...",28.648775,77.164165,4.0,Hotel,Dance Studio,Indian Restaurant,Juice Bar,Furniture / Home Store,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant
7,8,110009,Delhi,"Dr.Mukerjee Nagar , G.T.B.Nagar , Gujranwala C...",28.705296,77.182184,4.0,Metro Station,Historic Site,Indian Restaurant,Convenience Store,Bus Station,Frozen Yogurt Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market
8,9,110010,Delhi,"505 A B Workshop , A F Palam , Aps Colony , Ba...",28.596205,77.124081,1.0,Indian Restaurant,Convenience Store,Farmers Market,Yoga Studio,Furniture / Home Store,Eastern European Restaurant,Electronics Store,Event Space,Fast Food Restaurant,Flea Market
9,10,110011,Delhi,"Nirman Bhawan , South Avenue , Udyog Bhawan",28.611286,77.215644,2.0,Park,Boutique,Government Building,Rest Area,History Museum,Asian Restaurant,Furniture / Home Store,Eastern European Restaurant,Electronics Store,Event Space


#### Finally, let's visualize the clusters

In [30]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(delhi_merged['Latitude'], delhi_merged['Longitude'], delhi_merged['neighborhood'], delhi_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
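The marker colors above are looked up with `rainbow[int(cluster)-1]`, so label 0 wraps around to the last color. As an alternative sketch, a palette can be built with the standard library's `colorsys` and indexed directly by the cluster label. The helper names `cluster_palette` and `color_for` are hypothetical, and the hues will differ from matplotlib's rainbow colormap.

```python
import colorsys


def cluster_palette(n_clusters):
    """Return one hex color per cluster label, evenly spaced in hue."""
    palette = []
    for label in range(n_clusters):
        r, g, b = colorsys.hsv_to_rgb(label / n_clusters, 0.9, 0.9)
        palette.append("#{:02x}{:02x}{:02x}".format(
            int(r * 255), int(g * 255), int(b * 255)))
    return palette


palette = cluster_palette(5)


def color_for(cluster):
    """Look up the color for a label; labels arrive as floats such as 2.0."""
    return palette[int(cluster)]
```

Passing `color=color_for(cluster)` and `fill_color=color_for(cluster)` to `folium.CircleMarker` would then give every cluster its own color without the offset.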

#### Cluster 1

In [31]:
delhi_merged.loc[delhi_merged['Cluster Labels'] == 0, delhi_merged.columns[[1] + list(range(5, delhi_merged.shape[1]))]]

Unnamed: 0,pincode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
34,110043,77.087679,0.0,ATM,Furniture / Home Store,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant,Flea Market,Food
41,110051,77.266656,0.0,ATM,Furniture / Home Store,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant,Flea Market,Food
57,110070,76.951801,0.0,ATM,Furniture / Home Store,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant,Flea Market,Food
63,110077,77.240304,0.0,ATM,Athletics & Sports,Furniture / Home Store,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant,Flea Market
73,110092,77.269171,0.0,ATM,Furniture / Home Store,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant,Flea Market,Food


#### Cluster 2

In [32]:
delhi_merged.loc[delhi_merged['Cluster Labels'] == 1, delhi_merged.columns[[1] + list(range(5, delhi_merged.shape[1]))]]

Unnamed: 0,pincode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,110010,77.124081,1.0,Indian Restaurant,Convenience Store,Farmers Market,Yoga Studio,Furniture / Home Store,Eastern European Restaurant,Electronics Store,Event Space,Fast Food Restaurant,Flea Market
10,110012,77.173526,1.0,Indian Restaurant,Yoga Studio,Furniture / Home Store,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant,Flea Market
13,110015,77.167428,1.0,Indian Restaurant,Train Station,Business Service,Yoga Studio,Frozen Yogurt Shop,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market
28,110033,77.132133,1.0,Indian Restaurant,Department Store,Pizza Place,Frozen Yogurt Shop,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant
60,110073,77.061837,1.0,Indian Restaurant,Coffee Shop,Fast Food Restaurant,Yoga Studio,Furniture / Home Store,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Flea Market
70,110089,77.304028,1.0,Indian Restaurant,Gym,Yoga Studio,Furniture / Home Store,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant


#### Cluster 3

In [33]:
delhi_merged.loc[delhi_merged['Cluster Labels'] == 2, delhi_merged.columns[[1] + list(range(5, delhi_merged.shape[1]))]]

Unnamed: 0,pincode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,110001,77.215051,2.0,Café,Coffee Shop,Bar,Chinese Restaurant,Hotel,Fast Food Restaurant,Pub,Italian Restaurant,Bistro,Indian Restaurant
1,110003,77.223611,2.0,Café,Furniture / Home Store,Dessert Shop,Pub,Bar,Sandwich Place,Toy / Game Store,Eastern European Restaurant,Event Space,Farmers Market
3,110005,77.191789,2.0,Dessert Shop,Snack Place,Asian Restaurant,Pizza Place,Coffee Shop,Fast Food Restaurant,Food Court,Food & Drink Shop,Food,Flea Market
4,110006,77.228772,2.0,Paper / Office Supplies Store,Hardware Store,Fast Food Restaurant,Mosque,Snack Place,Light Rail Station,Hotel,Health & Beauty Service,Food Court,Donut Shop
5,110007,77.202799,2.0,Park,Indian Restaurant,Pizza Place,Coffee Shop,Flea Market,Bank,Breakfast Spot,Donut Shop,Food Court,Eastern European Restaurant
9,110011,77.215644,2.0,Park,Boutique,Government Building,Rest Area,History Museum,Asian Restaurant,Furniture / Home Store,Eastern European Restaurant,Electronics Store,Event Space
12,110014,77.251941,2.0,Shopping Mall,Hyderabadi Restaurant,Flea Market,Sandwich Place,Yoga Studio,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market
14,110016,77.206239,2.0,Metro Station,Nightclub,Event Space,Coffee Shop,Chinese Restaurant,Lounge,Yoga Studio,Electronics Store,Farmers Market,Fast Food Restaurant
15,110017,77.213019,2.0,Coffee Shop,Hotel,Donut Shop,Pizza Place,Café,Miscellaneous Shop,Burger Joint,Sandwich Place,Diner,Asian Restaurant
16,110018,77.091635,2.0,Photography Studio,Electronics Store,Print Shop,Fast Food Restaurant,Yoga Studio,Food Truck,Diner,Donut Shop,Eastern European Restaurant,Event Space


#### Cluster 4

In [34]:
delhi_merged.loc[delhi_merged['Cluster Labels'] == 3, delhi_merged.columns[[1] + list(range(5, delhi_merged.shape[1]))]]

Unnamed: 0,pincode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
31,110040,77.135756,3.0,Platform,Business Service,Yoga Studio,Frozen Yogurt Shop,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant
65,110082,77.203594,3.0,Platform,Yoga Studio,Dessert Shop,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant,Flea Market
67,110085,77.080612,3.0,Business Service,Yoga Studio,Dessert Shop,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant,Flea Market


#### Cluster 5

In [35]:
delhi_merged.loc[delhi_merged['Cluster Labels'] == 4, delhi_merged.columns[[1] + list(range(5, delhi_merged.shape[1]))]]

Unnamed: 0,pincode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,110004,77.19943,4.0,Museum,Garden,Yoga Studio,Frozen Yogurt Shop,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant
6,110008,77.164165,4.0,Hotel,Dance Studio,Indian Restaurant,Juice Bar,Furniture / Home Store,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market,Fast Food Restaurant
7,110009,77.182184,4.0,Metro Station,Historic Site,Indian Restaurant,Convenience Store,Bus Station,Frozen Yogurt Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market
11,110013,77.241864,4.0,Indian Restaurant,Hotel Bar,BBQ Joint,Monument / Landmark,Garden,Mughlai Restaurant,Pizza Place,Historic Site,Hotel,Restaurant
18,110021,77.169614,4.0,Park,Department Store,Indian Restaurant,Café,Bus Stop,Frozen Yogurt Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market
19,110022,77.173342,4.0,Gift Shop,IT Services,Snack Place,Indian Restaurant,Yoga Studio,Flea Market,Food Court,Food & Drink Shop,Food,Farmers Market
20,110023,77.207077,4.0,Market,Northeast Indian Restaurant,Train Station,Indian Restaurant,Restaurant,Flea Market,Food Truck,Food Court,Food & Drink Shop,Food
21,110024,77.283276,4.0,Indian Restaurant,Business Service,Theater,Athletics & Sports,Metro Station,Flea Market,Food Truck,Food Court,Food & Drink Shop,Food
23,110026,77.118783,4.0,Hotel Bar,Indian Restaurant,Movie Theater,Gourmet Shop,Frozen Yogurt Shop,Donut Shop,Eastern European Restaurant,Electronics Store,Event Space,Farmers Market
24,110027,77.138256,4.0,Hotel,Arcade,Multiplex,Indian Restaurant,Sandwich Place,Fast Food Restaurant,Café,Food,Food Truck,Food Court


## Results and Discussion

In this project, we loaded datasets for two of India's prime metro cities and analyzed their neighborhoods based on the most popular venues in each. We clustered the neighborhoods by their most common venue categories. Our intention was to understand how life differs between these metros, which can offer decision points for anyone considering settling in either city and give them a peek at the experiences and facilities they can expect.

Given our cluster information for both Mumbai and Delhi, we see that Mumbai and its neighborhoods are a great place for a foodie. There are a lot of restaurants, cafes, bars, etc. in Mumbai neighborhoods. Also, due to Mumbai's proximity to the seashore, its neighborhoods offer harbors, seafood, and boat and ferry rides. Life in Delhi neighborhoods looks quite different. Delhi neighborhoods are good for those who like arts and crafts, museums, water parks, and pizza places. There are comparatively few foreign-cuisine restaurants in Delhi. Mumbai, on the other hand, is great for international visitors, expats, etc., because of the variety of food outlets it has. Delhi is inland, and its neighborhoods are close to water parks, museums, and arts and crafts stores.

## Conclusion

Thus, with this project, we have analyzed the kind of life each of these big metro cities has to offer, based on the popular venues in their neighborhoods.

#### **Mumbai would be the choice if you are a foodie!**

Another important aspect the study reveals is that Mumbai offers far more venue categories than Delhi. This means that Delhi is more restrictive in terms of variety and convenience. With the data we have studied, Mumbai wins this battle of the metros!