# Coursera IBM Applied Data Science Capstone Project

## PROJECT - Locations for a new Shopping Mall in Delhi NCR 

__Data Sources__
- Pincodes - India Post postal data from https://www.indiapost.gov.in/vas/pages/findpincode.aspx, as a CSV file
- Foursquare API - to get venue data for each location
- geocoder - to get the latitude and longitude for each pincode
- Folium - to plot the geographic data
- sklearn - to cluster the data

__1. Import Libraries__

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)

import json # library to handle JSON files

!conda install -c conda-forge geocoder --yes
print("GeoCoder Installation Done!")
import geocoder # import geocoder
print("Geo Coder imported!")

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests

from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library
print("Libraries imported.")

Collecting package metadata: done
Solving environment: done

## Package Plan ##

  environment location: /home/jupyterlab/conda

  added / updated specs:
    - geocoder


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ca-certificates-2019.3.9   |       hecc5488_0         146 KB  conda-forge
    certifi-2019.3.9           |           py36_0         149 KB  conda-forge
    conda-4.6.8                |           py36_0         876 KB  conda-forge
    geocoder-1.38.1            |             py_0          52 KB  conda-forge
    openssl-1.1.1b             |       h14c3975_1         4.0 MB  conda-forge
    orderedset-2.0             |           py36_0         231 KB  conda-forge
    ratelim-0.1.6              |           py36_0           5 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         5.4 MB

The foll

__2. Read the Postal CSV Data__

In [2]:
all_india_pincode_df = df = pd.read_csv('All_India_pincode_data_26022018.csv', encoding = 'windows-1252', dtype=object)
print("Shape of all the Postal Code Data is ",all_india_pincode_df.shape)
all_india_pincode_df.head(5)

Shape of all the Postal Code Data is  (155570, 10)


Unnamed: 0,officename,pincode,officetype,Deliverystatus,divisionname,regionname,circlename,taluk,districtname,statename
0,Chakragaon S.O,744112,S.O,Delivery,A - N Islands,Calcutta HQ,West Bengal,Portblair,South Andaman,ANDAMAN & NICOBAR ISLANDS
1,Chatham S.O,744102,S.O,Non-Delivery,A - N Islands,Calcutta HQ,West Bengal,Portblair,South Andaman,ANDAMAN & NICOBAR ISLANDS
2,Delanipur S.O,744102,S.O,Non-Delivery,A - N Islands,Calcutta HQ,West Bengal,Portblair,South Andaman,ANDAMAN & NICOBAR ISLANDS
3,Marine Jetty S.O,744101,S.O,Non-Delivery,A - N Islands,Calcutta HQ,West Bengal,Portblair,South Andaman,ANDAMAN & NICOBAR ISLANDS
4,Minnie Bay S.O,744103,S.O,Non-Delivery,A - N Islands,Calcutta HQ,West Bengal,Portblair,South Andaman,ANDAMAN & NICOBAR ISLANDS


__3. Filtering for cities that are part of Delhi NCR - Delhi, Gurgaon, Faridabad, Noida (Gautam Buddha Nagar), Ghaziabad__

In [3]:
# Keeping only the rows for cities that belong to Delhi NCR
ncr_cities = ['Delhi','Faridabad','Gautam Buddha Nagar','Ghaziabad','Gurgaon']
ncr_pstl_data_df = all_india_pincode_df[(all_india_pincode_df.regionname=='Delhi') | (all_india_pincode_df['districtname'].isin(ncr_cities))]
print("Shape of Dataframe is - ",ncr_pstl_data_df.shape)
ncr_pstl_data_df.head()

Shape of Dataframe is -  (1197, 10)


Unnamed: 0,officename,pincode,officetype,Deliverystatus,divisionname,regionname,circlename,taluk,districtname,statename
27319,IP Extension S.O,110092,S.O,Non-Delivery,Delhi East,Delhi,Delhi,,East Delhi,DELHI
27320,Rohini Sector-7 S.O,110085,S.O,Delivery,Delhi North,Delhi,Delhi,,North West Delhi,DELHI
27321,R K Puram Sector - 6 Postal SB S.O,110022,S.O,Non-Delivery,New Delhi South West,Delhi,Delhi,,South West Delhi,DELHI
27322,Abul Fazal Enclave-I S.O,110025,S.O,Non-Delivery,New Delhi South,Delhi,Delhi,,South Delhi,DELHI
27323,Jaitpur S.O (South Delhi),110044,S.O,Non-Delivery,New Delhi South,Delhi,Delhi,,South Delhi,DELHI


In [4]:
# Checking if we did get all the cities required
ncr_pstl_data_df.districtname.unique()

array(['East Delhi', 'North West Delhi', 'South West Delhi',
       'South Delhi', 'North East Delhi', 'North Delhi', 'West Delhi',
       'Central Delhi', 'New Delhi', 'Faridabad', 'Gurgaon', 'Ghaziabad',
       'Gautam Buddha Nagar'], dtype=object)

In [5]:
# Creating a single city column instead of multiple district values.
# Work on an explicit copy so the assignment does not trigger SettingWithCopyWarning.
ncr_pstl_data_df = ncr_pstl_data_df.copy()
ncr_pstl_data_df['ncr_city'] = np.where((ncr_pstl_data_df.regionname=='Delhi'), 'Delhi',ncr_pstl_data_df.districtname)
ncr_pstl_data_df.head()
  


Unnamed: 0,officename,pincode,officetype,Deliverystatus,divisionname,regionname,circlename,taluk,districtname,statename,ncr_city
27319,IP Extension S.O,110092,S.O,Non-Delivery,Delhi East,Delhi,Delhi,,East Delhi,DELHI,Delhi
27320,Rohini Sector-7 S.O,110085,S.O,Delivery,Delhi North,Delhi,Delhi,,North West Delhi,DELHI,Delhi
27321,R K Puram Sector - 6 Postal SB S.O,110022,S.O,Non-Delivery,New Delhi South West,Delhi,Delhi,,South West Delhi,DELHI,Delhi
27322,Abul Fazal Enclave-I S.O,110025,S.O,Non-Delivery,New Delhi South,Delhi,Delhi,,South Delhi,DELHI,Delhi
27323,Jaitpur S.O (South Delhi),110044,S.O,Non-Delivery,New Delhi South,Delhi,Delhi,,South Delhi,DELHI,Delhi


In [6]:
# Renaming columns for convenience
final_ncr_pstl_data_df = ncr_pstl_data_df[['pincode','ncr_city']]
final_ncr_pstl_data_df = final_ncr_pstl_data_df.rename(columns = {"pincode": "PostalCode",}).drop_duplicates().reset_index(drop=True)
print("\n The Final Shape of the dataframe is  - ",final_ncr_pstl_data_df.shape)


 The Final Shape of the dataframe is  -  (212, 2)


In [7]:
final_ncr_pstl_data_df.head()

Unnamed: 0,PostalCode,ncr_city
0,110092,Delhi
1,110085,Delhi
2,110022,Delhi
3,110025,Delhi
4,110044,Delhi


### Delhi NCR Postal data created with 212 Pin Codes

__4. Creating a function to get the Lat Long data from the Postal Code__

In [8]:
def get_geocoder(postal_code_from_df):
    # initialize your variable to None
    lat_lng_coords = None
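    # loop until the geocoder returns coordinates
    while lat_lng_coords is None:
        g = geocoder.arcgis('{}, NCR, India'.format(postal_code_from_df.strip()))
        lat_lng_coords = g.latlng
    # unpack only after a successful lookup; indexing inside the loop would
    # raise a TypeError whenever g.latlng comes back as None
    latitude, longitude = lat_lng_coords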
    return latitude,longitude
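
A lookup that keeps returning no coordinates leaves `get_geocoder` without a clean way to give up. One possible variant, sketched below, caps the number of attempts and returns `(None, None)` on failure. The names `geocode_with_retry`, `lookup`, `max_attempts` and `delay_seconds` are assumptions for this sketch; in the notebook's setting `lookup` could be something like `lambda q: geocoder.arcgis(q).latlng`. A stand-in lookup is used here so the example runs without network access.

```python
import time


def geocode_with_retry(query, lookup, max_attempts=5, delay_seconds=0.0):
    """Call lookup(query) until it returns coordinates or attempts run out.

    lookup is a hypothetical callable that returns [lat, lng] or None.
    """
    for _ in range(max_attempts):
        coords = lookup(query)
        if coords is not None:
            return coords[0], coords[1]
        # Pause briefly before asking the service again
        time.sleep(delay_seconds)
    # Report failure explicitly instead of looping forever
    return None, None


# Stand-in lookup that fails twice before succeeding (no network needed)
calls = {"count": 0}


def flaky_lookup(query):
    calls["count"] += 1
    return None if calls["count"] < 3 else [28.6, 77.2]


print(geocode_with_retry("110001, NCR, India", flaky_lookup))
```

Rows that come back as `(None, None)` can then be inspected or dropped before mapping, rather than stalling the whole `apply` call.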

Adding the Latitude and Longitude columns to the Pandas DataFrame

In [9]:
final_ncr_pstl_data_df['Latitude'], final_ncr_pstl_data_df['Longitude'] = zip(*final_ncr_pstl_data_df['PostalCode'].apply(get_geocoder))
final_ncr_pstl_data_df.sort_values(by='PostalCode',inplace=True)
final_ncr_pstl_data_df.head()

Unnamed: 0,PostalCode,ncr_city,Latitude,Longitude
15,110001,Delhi,28.623203,77.222803
72,110002,Delhi,28.636728,77.2476
43,110003,Delhi,28.587729,77.226215
17,110004,Delhi,23.37938,79.443327
94,110005,Delhi,28.654413,77.191401


In [10]:
print("Shape of the dataframe is - ",final_ncr_pstl_data_df.shape)

Shape of the dataframe is -  (212, 4)
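
In the geocoded output shown earlier, pincode 110004 resolves to about (23.38, 79.44), several degrees away from the other Delhi rows, which sit near (28.6, 77.2). A simple bounding-box check is one way to flag such rows. The sketch below uses three rows copied from that output; the latitude and longitude ranges are illustrative assumptions, not an official NCR boundary.

```python
import pandas as pd

# Illustrative bounds only (assumed, not an official NCR boundary)
LAT_RANGE = (27.5, 29.5)
LNG_RANGE = (76.0, 78.5)

# Three rows copied from the geocoded output above
sample = pd.DataFrame({
    "PostalCode": ["110001", "110003", "110004"],
    "Latitude": [28.623203, 28.587729, 23.37938],
    "Longitude": [77.222803, 77.226215, 79.443327],
})

# True where both coordinates fall inside the assumed box
in_box = sample["Latitude"].between(*LAT_RANGE) & sample["Longitude"].between(*LNG_RANGE)

# Rows outside the box are candidates for re-geocoding or removal
suspect = sample[~in_box]
print(suspect)
```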


## Map of Delhi NCR

__5. Using the Nominatim geolocator to get the coordinates of Delhi NCR, which are needed to centre the map in the next step__

In [11]:
address = 'Delhi, NCR'

geolocator = Nominatim(user_agent="delhi_ncr")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Delhi, NCR are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Delhi, NCR are 28.6882438, 77.1212148.


Using Folium to draw the map

In [1]:
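def inline_map(m):
    from folium import Map
    from IPython.display import HTML, IFrame
    if isinstance(m, Map):
        # render the map to an HTML string (older folium used m._build_map() and m.HTML)
        srcdoc = m.get_root().render().replace('"', '&quot;')
        embed = HTML('<iframe srcdoc="{srcdoc}" '
                     'style="width: 100%; height: 500px; '
                     'border: none"></iframe>'.format(srcdoc=srcdoc))
    elif isinstance(m, str):
        # m is the path to a saved HTML map
        embed = IFrame(m, width=1200, height=600)
    else:
        raise TypeError("m must be a folium.Map or a path to an HTML file")
    return embed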

In [13]:
map_ncr = folium.Map(location=[latitude, longitude], zoom_start=9.8)

for lat, long, post, neigh in zip(final_ncr_pstl_data_df['Latitude'], final_ncr_pstl_data_df['Longitude'], final_ncr_pstl_data_df['PostalCode'], final_ncr_pstl_data_df['ncr_city']):
    label = "{} - {}".format(post, neigh)
    popup = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, long],
        radius=5,
        popup=popup,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_ncr)
map_ncr.save("Delhi_NCR.html")

In [2]:
inline_map("Delhi_NCR.html")

__6. Using Foursquare API to get the venues in all the Postal Codes__

In [14]:
CLIENT_ID = 'PDUBQ0JA2ZYZG1VB00LUWNZDAYNXQABP0EJAWOOHQLGQF02I' # your Foursquare ID
CLIENT_SECRET = 'N3AI2BJBGGQUCO0XX444TBT3II4ITQOAU3YIFZMACVHEWRYX' # your Foursquare Secret
VERSION = '20180604'
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: PDUBQ0JA2ZYZG1VB00LUWNZDAYNXQABP0EJAWOOHQLGQF02I
CLIENT_SECRET:N3AI2BJBGGQUCO0XX444TBT3II4ITQOAU3YIFZMACVHEWRYX
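
Hard-coding and printing API credentials exposes them to anyone who reads the notebook. A common alternative is to read them from environment variables and mask them when printing. The sketch below is one way to do that; the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, and the helpers `load_credentials` and `mask`, are assumptions for illustration. A stand-in dictionary replaces the real environment so the example runs anywhere.

```python
import os


def load_credentials(env=os.environ):
    """Return (client_id, client_secret) from a mapping, or raise if one is missing."""
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError("Missing environment variable: {}".format(missing)) from None


def mask(secret, visible=4):
    """Show only the first few characters when printing a secret."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)


# Demonstration with a stand-in mapping instead of the real environment
demo_env = {
    "FOURSQUARE_CLIENT_ID": "EXAMPLEID123",
    "FOURSQUARE_CLIENT_SECRET": "EXAMPLESECRET456",
}
client_id, client_secret = load_credentials(demo_env)
print("CLIENT_ID:", mask(client_id))
print("CLIENT_SECRET:", mask(client_secret))
```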


### All recommended places in each Postal Code in Delhi NCR

In [15]:
radius = 2000
LIMIT = 100

recommends = []
for lat, long, post, neighborhoods in zip(final_ncr_pstl_data_df['Latitude'], final_ncr_pstl_data_df['Longitude'], final_ncr_pstl_data_df['PostalCode'], final_ncr_pstl_data_df['ncr_city']):
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        LIMIT)
    
    postal_data = requests.get(url).json()["response"]['groups'][0]['items']
    for recommend_post in postal_data:
        recommends.append((
            post, 
            neighborhoods,
            lat, 
            long, 
            recommend_post['venue']['name'], 
            recommend_post['venue']['location']['lat'], 
            recommend_post['venue']['location']['lng'],  
            recommend_post['venue']['categories'][0]['name']))
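
The loop above indexes straight into `["response"]['groups'][0]['items']` and `['categories'][0]`, so a response without groups or a venue without categories would raise an exception and stop the whole run. A more defensive extraction step could look like the sketch below. `extract_venues` is a hypothetical helper, and the payload is hand-written to mirror only the keys the notebook reads; what the API returns on errors is not shown in the source.

```python
def extract_venues(payload):
    """Return (name, lat, lng, category) tuples from an explore-style payload.

    Missing keys or empty lists produce fewer rows or None fields instead of raising.
    """
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories", [])
        # Fall back to None when a venue has no category listed
        category = categories[0].get("name") if categories else None
        rows.append((venue.get("name"), location.get("lat"), location.get("lng"), category))
    return rows


# Hand-written payload mimicking the shape indexed in the loop above
fake_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "The Imperial",
               "location": {"lat": 28.625548, "lng": 77.218664},
               "categories": [{"name": "Hotel"}]}},
    {"venue": {"name": "Unnamed", "location": {}, "categories": []}},
]}]}}

print(extract_venues(fake_payload))
# A payload without a "response" key yields an empty list
print(extract_venues({"meta": {"code": 429}}))
```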

In [56]:
ncr_recommends = pd.DataFrame(recommends)
ncr_recommends.columns = ['PostalCode', 'ncr_city', 'Postal_Lat', 'Postal_Long', 'Venue_Name', 'Venue_Lat', 'Venue_Long', 'Venue_Category']
print("Shape of all the recommendations of all the Postal Codes in Delhi NCR - ", ncr_recommends.shape)
ncr_recommends.head()

Shape of all the recommendations of all the Postal Codes in Delhi NCR -  (4290, 8)


Unnamed: 0,PostalCode,ncr_city,Postal_Lat,Postal_Long,Venue_Name,Venue_Lat,Venue_Long,Venue_Category
0,110001,Delhi,28.623203,77.222803,The Imperial,28.625548,77.218664,Hotel
1,110001,Delhi,28.623203,77.222803,Tamra,28.620543,77.218174,Restaurant
2,110001,Delhi,28.623203,77.222803,HOTEL SARAVANA BHAVAN,28.627041,77.219514,South Indian Restaurant
3,110001,Delhi,28.623203,77.222803,"le meridian ,new delhi",28.619001,77.21771,Hotel Bar
4,110001,Delhi,28.623203,77.222803,Shangri-La's - Eros Hotel,28.620909,77.217537,Hotel


In [57]:
# Dropping duplicated rows. Note: keep=False removes every copy of a duplicated row, not only the repeats
ncr_recommends.drop_duplicates(keep=False, inplace=True)
ncr_recommends.shape

(4286, 8)
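
The `keep` argument of `drop_duplicates` decides what happens to rows that appear more than once. With `keep=False` all copies are removed, while `keep="first"` retains one of them. The toy frame below (made-up rows, hypothetical names) shows the difference.

```python
import pandas as pd

# Made-up rows: the "Tamra" row appears twice
toy = pd.DataFrame({
    "PostalCode": ["110001", "110001", "110002"],
    "Venue_Name": ["Tamra", "Tamra", "Bakery X"],
})

dropped_all = toy.drop_duplicates(keep=False)    # removes both "Tamra" rows
kept_first = toy.drop_duplicates(keep="first")   # keeps one "Tamra" row

print(len(dropped_all), len(kept_first))
```

Which behaviour is wanted depends on whether a repeated venue should still count once for its postal code.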

In [58]:
ncr_recommends.groupby(['PostalCode', 'ncr_city'])['Venue_Name'].count()

PostalCode  ncr_city           
110001      Delhi                  100
110002      Delhi                   33
110003      Delhi                  100
110005      Delhi                   35
110006      Delhi                   71
110007      Delhi                   25
110008      Delhi                   20
110009      Delhi                   27
110010      Delhi                    7
110011      Delhi                   97
110012      Delhi                   22
110013      Delhi                   19
110014      Delhi                   41
110015      Delhi                   42
110016      Delhi                  100
110017      Delhi                  100
110018      Delhi                   45
110019      Delhi                   84
110020      Delhi                   21
110021      Delhi                   78
110022      Delhi                   67
110023      Delhi                   52
110024      Delhi                   93
110025      Delhi                   38
110026      Delhi               

Some pin codes appear under two different city names (Ghaziabad and Gautam Buddha Nagar). We resolve this by:
- renaming Gautam Buddha Nagar to Noida, and
- dropping the rows for the repeated postal codes (201301 and 201303) that are listed under Ghaziabad

In [59]:
ncr_recommends['ncr_city'] = np.where((ncr_recommends.ncr_city=='Gautam Buddha Nagar'), "Noida",ncr_recommends.ncr_city)
repeat_postal_codes = ['201301','201303']
ncr_recommends = ncr_recommends.drop(ncr_recommends.index[(ncr_recommends.PostalCode.isin(repeat_postal_codes)) & (ncr_recommends.ncr_city=="Ghaziabad")])
ncr_recommends.drop_duplicates(inplace = True)
print("Updated shape of ncr_recommends is ",ncr_recommends.shape)
#Getting updated counts
ncr_recommends.groupby(['PostalCode', 'ncr_city'])['Venue_Name'].count()

Updated shape of ncr_recommends is  (4219, 8)


PostalCode  ncr_city 
110001      Delhi        100
110002      Delhi         33
110003      Delhi        100
110005      Delhi         35
110006      Delhi         71
110007      Delhi         25
110008      Delhi         20
110009      Delhi         27
110010      Delhi          7
110011      Delhi         97
110012      Delhi         22
110013      Delhi         19
110014      Delhi         41
110015      Delhi         42
110016      Delhi        100
110017      Delhi        100
110018      Delhi         45
110019      Delhi         84
110020      Delhi         21
110021      Delhi         78
110022      Delhi         67
110023      Delhi         52
110024      Delhi         93
110025      Delhi         38
110026      Delhi         31
110027      Delhi         54
110028      Delhi         12
110029      Delhi        100
110030      Delhi          9
110031      Delhi          4
110032      Delhi         15
110033      Delhi          8
110034      Delhi         33
110035      Delhi    

__As we can see above, there are no further duplications on pin code__
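
A long printed listing is easy to misread, so the same check can also be made programmatically by counting how many distinct city names each postal code maps to. The sketch below uses a made-up frame; `toy`, `cities_per_pin` and `conflicts` are hypothetical names.

```python
import pandas as pd

# Made-up rows: 201301 is listed under two cities, 110001 under one
toy = pd.DataFrame({
    "PostalCode": ["201301", "201301", "110001", "110001"],
    "ncr_city": ["Noida", "Ghaziabad", "Delhi", "Delhi"],
})

# Number of distinct city names per postal code
cities_per_pin = toy.groupby("PostalCode")["ncr_city"].nunique()

# Postal codes that still map to more than one city
conflicts = cities_per_pin[cities_per_pin > 1].index.tolist()
print(conflicts)
```

An empty list from the same check on `ncr_recommends` would confirm that every postal code belongs to exactly one city.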

In [60]:
print("Number of unique venue categories: ",len(ncr_recommends['Venue_Category'].unique()))
ncr_recommends['Venue_Category'].unique()

Number of unique venue categories:  243


array(['Hotel', 'Restaurant', 'South Indian Restaurant', 'Hotel Bar',
       'Asian Restaurant', 'Café', 'Spa', 'Plaza', 'Indian Restaurant',
       'Gastropub', 'Ice Cream Shop', 'Bakery', 'Clothing Store',
       'Coffee Shop', 'Monument / Landmark', 'North Indian Restaurant',
       'Bistro', 'Flea Market', 'Mediterranean Restaurant',
       'Molecular Gastronomy Restaurant', 'Theater', 'Italian Restaurant',
       'Sculpture Garden', 'Beer Garden', 'Portuguese Restaurant',
       'BBQ Joint', 'Bar', 'Lounge', 'History Museum', 'Arcade',
       'Art Gallery', 'Spiritual Center', 'Chinese Restaurant',
       'Donut Shop', 'Food & Drink Shop', 'Tea Room', 'Music Venue',
       'Deli / Bodega', 'Fast Food Restaurant', 'Concert Hall',
       'Performing Arts Venue', 'Historic Site', 'Snack Place',
       'Cocktail Bar', 'Hockey Arena', 'Pub', 'Japanese Restaurant',
       'Miscellaneous Shop', 'Light Rail Station', 'Cricket Ground',
       'Stadium', 'Hostel', 'Road', 'Movie Theater', '

__7. Analyze venues in each area__

In [61]:
# One-hot encoding the venue categories so that each category becomes its own column
ncr_recommends = ncr_recommends.drop(['Postal_Lat','Postal_Long','Venue_Lat','Venue_Long'],axis=1)
ncr_recommends_df = pd.get_dummies(ncr_recommends, columns=['Venue_Category'],prefix = "", prefix_sep = "")
ncr_recommends_df.head(40)

Unnamed: 0,PostalCode,ncr_city,Venue_Name,ATM,Accessories Store,Airport,Airport Food Court,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Bathing Area,Bed & Breakfast,Beer Garden,Bengali Restaurant,Bike Shop,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Buffet,Burger Joint,Burmese Restaurant,Burrito Place,Bus Station,Business Center,Business Service,Cafeteria,Café,Campground,Candy Store,Cantonese Restaurant,Castle,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Concert Hall,Convenience Store,Cosmetics Shop,Cricket Ground,Dairy Store,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dog Run,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,English Restaurant,Event Space,Fabric Shop,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Service,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gastropub,General Entertainment,Gift Shop,Golf Course,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Hardware Store,Hindu Temple,Historic Site,History Museum,Hobby Shop,Hockey Arena,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indian Sweet Shop,Indie Movie Theater,Irani Cafe,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Karnataka Restaurant,Korean Restaurant,Lake,Light Rail Station,Lounge,Market,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Mosque,Motel,Motorcycle Shop,Movie Theater,Moving Target,Mughlai Restaurant,Multicuisine Indian Restaurant,Multiplex,Museum,Music Venue,Neighborhood,New American Restaurant,Nightclub,Nightlife Spot,North Indian Restaurant,Northeast Indian Restaurant,Office,Optical Shop,Other Great Outdoors,Other Nightlife,Outdoors & Recreation,Paper / Office Supplies Store,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Planetarium,Platform,Playground,Plaza,Pool,Portuguese Restaurant,Print Shop,Pub,Public Bathroom,Punjabi Restaurant,Racetrack,Recreation Center,Rental Car Location,Residential Building (Apartment / Condo),Resort,Restaurant,River,Road,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skating Rink,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,Soup Place,South Indian Restaurant,Spa,Speakeasy,Spiritual Center,Sports Bar,Sports Club,Stadium,Steakhouse,Supermarket,Supplement Shop,Sushi Restaurant,Tapas Restaurant,Tea Room,Temple,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Park,Tibetan Restaurant,Toll Booth,Toll Plaza,Tourist Information Center,Toy / Game Store,Track,Trail,Train Station,Turkish Restaurant,Udupi Restaurant,University,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Volleyball Court,Whisky Bar,Wine Bar,Wings Joint,Women's Store,Yoga Studio
0,110001,Delhi,The Imperial,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,110001,Delhi,Tamra,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,110001,Delhi,HOTEL SARAVANA BHAVAN,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,110001,Delhi,"le meridian ,new delhi",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,110001,Delhi,Shangri-La's - Eros Hotel,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,110001,Delhi,The Spice Route,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,110001,Delhi,"The Square, Cafe Coffee Day",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,110001,Delhi,Le Méridien,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,110001,Delhi,Spa At Shangri-La,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
9,110001,Delhi,Connaught Place | कनॉट प्लेस (Connaught Place),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [64]:
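# Venue_Name is text, so drop it before averaging the one-hot columns per postal code
ncr_venues_freq = ncr_recommends_df.drop(columns=['Venue_Name']).groupby(['PostalCode','ncr_city']).mean().reset_index()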
ncr_venues_freq.head(40)

Unnamed: 0,PostalCode,ncr_city,ATM,Accessories Store,Airport,Airport Food Court,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Bathing Area,Bed & Breakfast,Beer Garden,Bengali Restaurant,Bike Shop,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Buffet,Burger Joint,Burmese Restaurant,Burrito Place,Bus Station,Business Center,Business Service,Cafeteria,Café,Campground,Candy Store,Cantonese Restaurant,Castle,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Concert Hall,Convenience Store,Cosmetics Shop,Cricket Ground,Dairy Store,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dog Run,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,English Restaurant,Event Space,Fabric Shop,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Service,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gastropub,General Entertainment,Gift Shop,Golf Course,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Hardware Store,Hindu Temple,Historic Site,History Museum,Hobby Shop,Hockey Arena,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indian Sweet Shop,Indie Movie Theater,Irani Cafe,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Karnataka Restaurant,Korean Restaurant,Lake,Light Rail Station,Lounge,Market,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Mosque,Motel,Motorcycle Shop,Movie Theater,Moving Target,Mughlai Restaurant,Multicuisine Indian Restaurant,Multiplex,Museum,Music Venue,Neighborhood,New American Restaurant,Nightclub,Nightlife Spot,North Indian Restaurant,Northeast Indian Restaurant,Office,Optical Shop,Other Great Outdoors,Other Nightlife,Outdoors & Recreation,Paper / Office Supplies Store,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Planetarium,Platform,Playground,Plaza,Pool,Portuguese Restaurant,Print Shop,Pub,Public Bathroom,Punjabi Restaurant,Racetrack,Recreation Center,Rental Car Location,Residential Building (Apartment / Condo),Resort,Restaurant,River,Road,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skating Rink,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,Soup Place,South Indian Restaurant,Spa,Speakeasy,Spiritual Center,Sports Bar,Sports Club,Stadium,Steakhouse,Supermarket,Supplement Shop,Sushi Restaurant,Tapas Restaurant,Tea Room,Temple,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Park,Tibetan Restaurant,Toll Booth,Toll Plaza,Tourist Information Center,Toy / Game Store,Track,Trail,Train Station,Turkish Restaurant,Udupi Restaurant,University,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Volleyball Court,Whisky Bar,Wine Bar,Wings Joint,Women's Store,Yoga Studio
0,110001,Delhi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.03,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.02,0.0,0.05,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.07,0.0,0.0,0.0,0.0,0.0,0.05,0.01,0.01,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.09,0.01,0.0,0.02,0.0,0.15,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,110002,Delhi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.030303,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.060606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.060606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.030303,0.0,0.090909,0.030303,0.0,0.0,0.0,0.212121,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,110003,Delhi,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.02,0.01,0.0,0.03,0.0,0.03,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.01,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.05,0.03,0.0,0.0,0.0,0.09,0.0,0.0,0.01,0.05,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.02,0.0,0.02,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,110005,Delhi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.028571,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.085714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.171429,0.0,0.0,0.0,0.057143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.085714,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.057143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.057143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,110006,Delhi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.042254,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.084507,0.028169,0.0,0.0,0.014085,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.0,0.014085,0.0,0.0,0.0,0.0,0.028169,0.0,0.183099,0.0,0.0,0.0,0.014085,0.126761,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.014085,0.0,0.0,0.028169,0.0,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.014085,0.0,0.014085,0.014085,0.0,0.014085,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.042254,0.0,0.028169,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.042254,0.0,0.014085,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.028169,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.014085,0.0,0.0,0.0,0.0,0.0,0.0,0.028169,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,110007,Delhi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.08,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.12,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.12,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,110008,Delhi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.15,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,110009,Delhi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.074074,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.074074,0.0,0.0,0.0,0.0,0.0,0.074074,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.074074,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.074074,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.037037,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.074074,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,110010,Delhi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,110011,Delhi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030928,0.0,0.0,0.010309,0.0,0.0,0.0,0.030928,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.072165,0.0,0.0,0.0,0.0,0.0,0.020619,0.0,0.010309,0.020619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.010309,0.020619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.051546,0.0,0.0,0.0,0.0,0.0,0.072165,0.020619,0.0,0.020619,0.0,0.123711,0.0,0.0,0.010309,0.030928,0.020619,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.030928,0.010309,0.030928,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.010309,0.0,0.0,0.010309,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.010309,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.010309,0.0,0.010309,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.041237,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.030928,0.010309,0.0,0.0,0.0,0.0,0.010309,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


__8. Top 10 Venues in Each of the Areas__

In [78]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
areaColumns = ['PostalCode', 'ncr_city']
freqColumns = []
for ind in np.arange(num_top_venues):
    try:
        freqColumns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the 3rd all take the 'th' suffix
        freqColumns.append('{}th Most Common Venue'.format(ind+1))
columns = areaColumns+freqColumns

# Create a new dataframe
all_ncr_venues = pd.DataFrame(columns=columns)
all_ncr_venues['PostalCode'] = ncr_venues_freq['PostalCode']
all_ncr_venues['ncr_city'] = ncr_venues_freq['ncr_city']
for ind in np.arange(ncr_venues_freq.shape[0]):
    row_categories = ncr_venues_freq.iloc[ind, :].iloc[2:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    all_ncr_venues.iloc[ind, 2:] = row_categories_sorted.index.values[0:num_top_venues]

all_ncr_venues.sort_values(freqColumns, inplace=True)
all_ncr_venues.sort_values(['PostalCode'],inplace=True)
all_ncr_venues

Unnamed: 0,PostalCode,ncr_city,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,110001,Delhi,Indian Restaurant,Hotel,Café,Bar,Chinese Restaurant,Lounge,Italian Restaurant,Art Gallery,Coffee Shop,Restaurant
1,110002,Delhi,Indian Restaurant,Hotel,Theater,Café,Bakery,History Museum,Cricket Ground,Performing Arts Venue,Road,Flea Market
2,110003,Delhi,Indian Restaurant,Café,Italian Restaurant,Restaurant,Hotel,Chinese Restaurant,Coffee Shop,Bar,BBQ Joint,Sandwich Place
3,110005,Delhi,Fast Food Restaurant,Indian Restaurant,Coffee Shop,Hotel,Snack Place,Food & Drink Shop,Light Rail Station,Restaurant,Sandwich Place,Bar
4,110006,Delhi,Hotel,Indian Restaurant,Fast Food Restaurant,Pizza Place,Restaurant,Dessert Shop,Train Station,Platform,Flea Market,Snack Place
5,110007,Delhi,Pizza Place,Fast Food Restaurant,Indian Restaurant,Donut Shop,Chinese Restaurant,Breakfast Spot,Coffee Shop,Miscellaneous Shop,Bakery,Sandwich Place
6,110008,Delhi,Pizza Place,Fast Food Restaurant,Indian Restaurant,Bakery,Café,Hotel,Arcade,Gym / Fitness Center,Coffee Shop,Bar
7,110009,Delhi,Coffee Shop,Pizza Place,Fast Food Restaurant,Café,Chinese Restaurant,Bakery,Indian Restaurant,Snack Place,Food Truck,Men's Store
8,110010,Delhi,Cafeteria,Shopping Mall,Plaza,Convenience Store,Historic Site,Coffee Shop,Café,Flower Shop,Farmers Market,Fast Food Restaurant
9,110011,Delhi,Indian Restaurant,Hotel,Café,History Museum,Restaurant,Asian Restaurant,Lounge,Mediterranean Restaurant,Smoke Shop,Bar
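
The row-by-row ranking used above can also be wrapped in a small reusable helper. The following is a minimal sketch rather than the notebook's own code: the function name `top_categories`, the `Top N` column labels and the toy frequency values are assumptions made for illustration. Only the `PostalCode` and `ncr_city` column names come from the notebook.

```python
import pandas as pd

def top_categories(freq_df, id_cols, n):
    """Return one row per area with its n most frequent venue categories."""
    records = []
    for _, row in freq_df.iterrows():
        # Keep only the category columns and rank them by frequency.
        ranked = (
            row.drop(id_cols)
            .astype(float)
            .sort_values(ascending=False, kind="stable")
        )
        record = {col: row[col] for col in id_cols}
        for rank, category in enumerate(ranked.index[:n], start=1):
            record["Top {}".format(rank)] = category
        records.append(record)
    return pd.DataFrame(records)

# Toy frequency table (hypothetical values, not taken from the Foursquare data).
toy = pd.DataFrame({
    "PostalCode": [110001, 110010],
    "ncr_city": ["Delhi", "Delhi"],
    "Hotel": [0.3, 0.1],
    "Bar": [0.2, 0.0],
    "Shopping Mall": [0.0, 0.4],
})

top2 = top_categories(toy, ["PostalCode", "ncr_city"], n=2)
print(top2)
```

The stable sort keeps tied categories in their original column order, so repeated runs give the same ranking.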


__9. Clustering areas__
Use the KMeans algorithm to group the Delhi NCR postal-code areas into 10 clusters, based on their venue-category frequencies.

In [83]:
# set number of clusters
kclusters = 10

ncr_venues_freq_clustering = ncr_venues_freq.drop(['PostalCode', 'ncr_city'], axis=1)

kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(ncr_venues_freq_clustering)

#No data for a particular postal code so removed it
ncr_clustered_df = all_ncr_venues[['PostalCode','ncr_city']].copy()
ncr_clustered_df['Cluster'] = kmeans.labels_

ncr_clustered_df = ncr_clustered_df.join(all_ncr_venues.drop(['ncr_city'], axis=1).set_index('PostalCode'), on='PostalCode')
ncr_clustered_df.sort_values(['Cluster'] + freqColumns, inplace=True)
final_ncr_pstl_data_df = final_ncr_pstl_data_df.drop(final_ncr_pstl_data_df.index[(final_ncr_pstl_data_df.PostalCode.isin(repeat_postal_codes)) & (final_ncr_pstl_data_df.ncr_city=="Ghaziabad")])
final_ncr_pstl_data_df.drop_duplicates(inplace=True)
ncr_clustered_df = pd.merge(ncr_clustered_df, final_ncr_pstl_data_df[['PostalCode','Latitude','Longitude']], on='PostalCode', how='left')
ncr_clustered_df

Unnamed: 0,PostalCode,ncr_city,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Latitude,Longitude
0,121005,Faridabad,0,ATM,Accessories Store,Farm,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Service,Food Court,28.361354,77.296577
1,110071,Delhi,0,ATM,Dance Studio,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Service,Food Court,Food & Drink Shop,28.558817,77.001835
2,122505,Gurgaon,0,ATM,Dance Studio,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Service,Food Court,Food & Drink Shop,28.453836,76.921988
3,201013,Ghaziabad,0,ATM,Dance Studio,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Service,Food Court,Food & Drink Shop,28.695718,77.505094
4,201102,Ghaziabad,0,ATM,Dance Studio,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Service,Food Court,Food & Drink Shop,28.76885,77.319183
5,201201,Ghaziabad,0,ATM,Dance Studio,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Service,Food Court,Food & Drink Shop,28.846978,77.649733
6,201206,Ghaziabad,0,ATM,Dance Studio,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Service,Food Court,Food & Drink Shop,28.80137,77.469718
7,203202,Noida,0,ATM,Dance Studio,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Service,Food Court,Food & Drink Shop,28.40238,77.637002
8,203207,Noida,0,ATM,Dance Studio,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Service,Food Court,Food & Drink Shop,28.532525,77.638734
9,245101,Ghaziabad,0,ATM,Dance Studio,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Stand,Food Service,Food Court,Food & Drink Shop,28.696908,77.798955
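
One detail of the cell above: `kmeans.labels_` is a plain NumPy array in the row order of `ncr_venues_freq`, and assigning an array to a DataFrame column is positional. That works as long as `all_ncr_venues` is in the same row order. If the two tables ever ended up in a different order (for example after a sort), the labels would attach to the wrong postal codes. The sketch below shows one way to guard against that by aligning on the index instead. The toy data and the variable names `freq`, `labels` and `venues` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical frequency table; rows are deliberately not in PostalCode order.
freq = pd.DataFrame(
    {"PostalCode": [110003, 110001, 110002], "Shopping Mall": [0.5, 0.0, 0.1]}
)

# Stand-in for kmeans.labels_, which follows the row order of `freq`.
labels = np.array([2, 0, 1])

# A copy that has been re-sorted, so its row order no longer matches `freq`.
venues = freq[["PostalCode"]].sort_values("PostalCode")

# Wrapping the labels in a Series keyed by freq's index makes pandas
# align on index labels instead of on row position.
venues["Cluster"] = pd.Series(labels, index=freq.index)

aligned = {int(code): int(c) for code, c in zip(venues["PostalCode"], venues["Cluster"])}
print(aligned)
```

Assigning `labels` directly to the sorted copy would have paired 110001 with label 2, which belongs to 110003.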


### Mapping Clusters

In [84]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, post, poi, cluster in zip(ncr_clustered_df['Latitude'], ncr_clustered_df['Longitude'], ncr_clustered_df['PostalCode'], ncr_clustered_df['ncr_city'], ncr_clustered_df['Cluster']):
    label = folium.Popup('{} - {} - Cluster {}'.format(post, poi, cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)

map_clusters.save("Delhi_NCR_clusters.html")


In [3]:
inline_map("Delhi_NCR_clusters.html")

### All 10 clusters are plotted above. The map can be reused for other purposes, such as finding areas with similar venue profiles.

__10. Now working on the locations for the new Shopping Mall__

In [85]:
ncr_malls = ncr_venues_freq[["PostalCode","ncr_city","Shopping Mall"]]
ncr_malls.head()

Unnamed: 0,PostalCode,ncr_city,Shopping Mall
0,110001,Delhi,0.0
1,110002,Delhi,0.0
2,110003,Delhi,0.0
3,110005,Delhi,0.0
4,110006,Delhi,0.0


__11. Running K Means for Shopping Mall Data Only__

In [93]:
# set number of clusters
kclusters = 5

ncr_clustering = ncr_malls.drop(["PostalCode","ncr_city"], axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(ncr_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_
ncr_malls_merged = ncr_malls.copy()

# add clustering labels
ncr_malls_merged["Cluster"] = kmeans.labels_
ncr_malls_merged.head()

Unnamed: 0,PostalCode,ncr_city,Shopping Mall,Cluster
0,110001,Delhi,0.0,0
1,110002,Delhi,0.0,0
2,110003,Delhi,0.0,0
3,110005,Delhi,0.0,0
4,110006,Delhi,0.0,0
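
K-means assigns label numbers arbitrarily, so a higher cluster number does not by itself mean a higher mall share. When clustering a single column such as `Shopping Mall`, one option is to renumber the labels by centroid so that they run from lowest to highest share. The sketch below is an illustration under stated assumptions, not part of the original analysis: it uses `k = 3` instead of 5 to stay short, sets `n_init=10` explicitly, and takes its sample values from mall shares of the kind shown in this notebook's cluster tables.

```python
import numpy as np
from sklearn.cluster import KMeans

# Sample mall shares (one column, as in ncr_clustering).
mall_share = np.array([0.0, 0.0, 0.0, 0.12, 0.142857, 0.166667, 0.5]).reshape(-1, 1)

k = 3  # smaller than the notebook's 5, to keep the example short
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(mall_share)

# Rank the centroids from lowest to highest mall share.
order = np.argsort(kmeans.cluster_centers_.ravel())
rank = np.empty(k, dtype=int)
rank[order] = np.arange(k)

# Map each raw label to the rank of its centroid.
ordered_labels = rank[kmeans.labels_]
print(ordered_labels)
```

After this renumbering, label 0 always denotes the group with the lowest mall share and label `k - 1` the highest, which makes the per-cluster tables easier to read.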


In [94]:
ncr_malls_clustered_df = pd.merge(ncr_malls_merged, final_ncr_pstl_data_df[['PostalCode','Latitude','Longitude']], on='PostalCode', how='left')
ncr_malls_clustered_df.head()

Unnamed: 0,PostalCode,ncr_city,Shopping Mall,Cluster,Latitude,Longitude
0,110001,Delhi,0.0,0,28.623203,77.222803
1,110002,Delhi,0.0,0,28.636728,77.2476
2,110003,Delhi,0.0,0,28.587729,77.226215
3,110005,Delhi,0.0,0,28.654413,77.191401
4,110006,Delhi,0.0,0,28.656,77.225032


In [95]:
#Creating Map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, post, poi, cluster in zip(ncr_malls_clustered_df['Latitude'], ncr_malls_clustered_df['Longitude'], ncr_malls_clustered_df['PostalCode'], ncr_malls_clustered_df['ncr_city'], ncr_malls_clustered_df['Cluster']):
    label = folium.Popup('{} - {} - Cluster {}'.format(post, poi, cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)

map_clusters.save("Delhi_Malls_NCR_clusters.html")

In [4]:
inline_map("Delhi_Malls_NCR_clusters.html")

__All 5 Clusters Created above__

__12. Examining Clusters__

In [96]:
#Cluster 0
ncr_malls_clustered_df.loc[ncr_malls_clustered_df['Cluster']==0]

Unnamed: 0,PostalCode,ncr_city,Shopping Mall,Cluster,Latitude,Longitude
0,110001,Delhi,0.0,0,28.623203,77.222803
1,110002,Delhi,0.0,0,28.636728,77.2476
2,110003,Delhi,0.0,0,28.587729,77.226215
3,110005,Delhi,0.0,0,28.654413,77.191401
4,110006,Delhi,0.0,0,28.656,77.225032
5,110007,Delhi,0.0,0,28.67467,77.199005
6,110008,Delhi,0.0,0,28.652968,77.167513
7,110009,Delhi,0.0,0,28.70853,77.202935
9,110011,Delhi,0.0,0,28.61053,77.21022
10,110012,Delhi,0.0,0,28.634845,77.157996


In [97]:
#Cluster 1
ncr_malls_clustered_df.loc[ncr_malls_clustered_df['Cluster']==1]

Unnamed: 0,PostalCode,ncr_city,Shopping Mall,Cluster,Latitude,Longitude
8,110010,Delhi,0.142857,1,28.605315,77.137845
72,110077,Delhi,0.166667,1,28.562683,77.056204
79,110085,Delhi,0.12,1,28.716689,77.1173
87,110094,Delhi,0.166667,1,28.724898,77.263867
94,121006,Faridabad,0.2,1,28.364545,77.325293
96,121008,Faridabad,0.142857,1,28.433415,77.317932
136,201010,Ghaziabad,0.117647,1,28.660058,77.341374
138,201012,Ghaziabad,0.125,1,28.658318,77.365918
143,201017,Ghaziabad,0.222222,1,28.704503,77.433441


In [98]:
#Cluster 2
ncr_malls_clustered_df.loc[ncr_malls_clustered_df['Cluster']==2]

Unnamed: 0,PostalCode,ncr_city,Shopping Mall,Cluster,Latitude,Longitude
103,121107,Faridabad,0.5,2,27.986715,77.492483


In [99]:
#Cluster 3
ncr_malls_clustered_df.loc[ncr_malls_clustered_df['Cluster']==3]

Unnamed: 0,PostalCode,ncr_city,Shopping Mall,Cluster,Latitude,Longitude
13,110015,Delhi,0.071429,3,28.651296,77.140132
15,110017,Delhi,0.04,3,28.533665,77.214255
16,110018,Delhi,0.044444,3,28.643165,77.087168
24,110026,Delhi,0.032258,3,28.668139,77.134761
25,110027,Delhi,0.074074,3,28.647223,77.117525
30,110032,Delhi,0.066667,3,28.67515,77.288309
32,110034,Delhi,0.060606,3,28.694953,77.131419
33,110035,Delhi,0.060606,3,28.682685,77.154014
36,110038,Delhi,0.035088,3,28.511655,77.106457
48,110051,Delhi,0.038462,3,28.653613,77.285643


In [100]:
#Cluster 4
ncr_malls_clustered_df.loc[ncr_malls_clustered_df['Cluster']==4]

Unnamed: 0,PostalCode,ncr_city,Shopping Mall,Cluster,Latitude,Longitude
90,121001,Faridabad,0.333333,4,28.403587,77.285945
91,121002,Faridabad,0.333333,4,28.425802,77.37375
133,201007,Ghaziabad,0.333333,4,28.682691,77.38787


## Findings
- Cluster 0 - Locations which have no shopping malls in the vicinity
- Cluster 1 - Locations which have few shopping malls in the vicinity
- Cluster 2 - Locations which have shopping malls in the vicinity
- Cluster 3 - Locations which have a good number of shopping malls in the vicinity
- Cluster 4 - Locations which have abundant shopping malls in the vicinity
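
One way to check these descriptions against the data is to summarise the `Shopping Mall` column per cluster. The sketch below is a minimal example, assuming the column holds the share of an area's venues that are shopping malls. The sample rows are copied from the cluster tables displayed above, which may be truncated, so the numbers are illustrative only. In the notebook the same `groupby` call would run on `ncr_malls_clustered_df`.

```python
import pandas as pd

# A few (Cluster, Shopping Mall) pairs copied from the cluster tables above.
sample = pd.DataFrame({
    "Cluster": [0, 0, 1, 1, 1, 2, 3, 3, 3, 4, 4],
    "Shopping Mall": [0.0, 0.0, 0.142857, 0.12, 0.222222, 0.5,
                      0.071429, 0.04, 0.032258, 0.333333, 0.333333],
})

# Count, minimum, mean and maximum mall share for each cluster,
# listed from the lowest mean share to the highest.
summary = (
    sample.groupby("Cluster")["Shopping Mall"]
    .agg(["count", "min", "mean", "max"])
    .sort_values("mean")
)
print(summary)
```

Because k-means label numbers are arbitrary, the numeric order of the labels need not match the order of the means, so it is worth reading the summary before interpreting each cluster.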

## Conclusion
- Cluster 4 and Cluster 3 already have many shopping malls in their vicinities
- Cluster 0 has no shopping malls
- I would suggest that the builder build near Clusters 1 and 2. Malls built near Cluster 0 areas risk becoming dead malls, because people are unlikely to travel to a location for a single mall; they prefer areas where they can visit multiple places