**_Opening a New Shopping Mall in Kanpur, India_**
- Build a dataframe of neighborhoods in Kanpur, India by scraping the data from a Wikipedia page
- Get the geographical coordinates of the neighborhoods
- Obtain the venue data for the neighborhoods from the Foursquare API
- Explore and cluster the neighborhoods
- Select the best cluster to open a new shopping mall
***
### 1. Import libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)

import json # library to handle JSON files

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import geocoder # to get coordinates

import requests # library to handle requests
from bs4 import BeautifulSoup # library to parse HTML and XML documents

from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import, pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

print("Libraries imported.")

Libraries imported.


### 2. Scrape data from Wikipedia page into a DataFrame

In [2]:
# send the GET request
data = requests.get("https://en.wikipedia.org/wiki/Category:Neighbourhoods_in_Kanpur").text

In [3]:
# parse data from the html into a beautifulsoup object
soup = BeautifulSoup(data, 'html.parser')

In [4]:
# create a list to store neighborhood data
neighborhoodList = []

In [5]:
# append the data into the list
for row in soup.find_all("div", class_="mw-category")[0].find_all("li"):
    neighborhoodList.append(row.text)

In [6]:
# create a new DataFrame from the list
kanpur_df = pd.DataFrame({"Neighborhood": neighborhoodList})

kanpur_df.head()

Unnamed: 0,Neighborhood
0,"Barra, Kanpur"
1,Bhitargaon
2,"Birhana Road, Kanpur"
3,Bithoor
4,"Chaman Ganj, Kanpur"


In [7]:
# print the number of rows of the dataframe
kanpur_df.shape

(21, 1)
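
Several scraped titles carry a disambiguating suffix (for example "Barra, Kanpur" or "Kalyanpur, Uttar Pradesh"), and the geocoding step in the next section appends ", Kanpur, India" to every name, so the query for Barra becomes "Barra, Kanpur, Kanpur, India". Whether this changes the geocoder's answer is not tested here. If a cleaner query is wanted, one option is to strip the suffix first. The helper name `clean_neighborhood` is hypothetical and not part of the notebook:

```python
def clean_neighborhood(title):
    """Drop anything after the first comma, e.g. 'Barra, Kanpur' -> 'Barra'."""
    return title.split(",")[0].strip()

# three titles taken from the scraped list above
titles = ["Barra, Kanpur", "Bhitargaon", "Kalyanpur, Uttar Pradesh"]

# build the geocoding query from the cleaned name only
queries = ["{}, Kanpur, India".format(clean_neighborhood(t)) for t in titles]
print(queries)
```

If this is applied, it is probably safer to keep the original `Neighborhood` column unchanged and store the cleaned name separately, since later cells join on `Neighborhood`.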

### 3. Get the geographical coordinates

In [8]:
# define a function to get coordinates
def get_latlng(neighborhood):
    # initialize your variable to None
    lat_lng_coords = None
    # loop until you get the coordinates
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, Kanpur, India'.format(neighborhood))
        lat_lng_coords = g.latlng
    return lat_lng_coords
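
The `while` loop above never exits if the geocoder keeps returning `None`, for example when the service is unreachable. A minimal sketch of a bounded alternative follows. The function name `geocode_with_retry`, its parameters, and the stand-in lookup are illustrative assumptions; in the notebook the lookup would be something like `lambda q: geocoder.arcgis(q).latlng`.

```python
import time

def geocode_with_retry(lookup, query, max_attempts=3, delay_seconds=1.0):
    """Call lookup(query) until it returns coordinates or the attempts run out."""
    for attempt in range(max_attempts):
        coords = lookup(query)
        if coords is not None:
            return coords
        # wait before the next try, but not after the last one
        if attempt < max_attempts - 1:
            time.sleep(delay_seconds)
    return None  # caller decides how to handle a neighborhood with no coordinates

# stand-in lookup for illustration only: fails twice, then returns the Barra coordinates
responses = iter([None, None, [26.43651158185418, 80.28410974063011]])
coords = geocode_with_retry(lambda q: next(responses), "Barra, Kanpur, India", delay_seconds=0)
print(coords)
```

Returning `None` instead of looping forever means the caller has to drop or inspect rows without coordinates before building the DataFrame.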

In [9]:
# call the function to get the coordinates, store in a new list using list comprehension
coords = [ get_latlng(neighborhood) for neighborhood in kanpur_df["Neighborhood"].tolist() ]

In [10]:
coords

[[26.43651158185418, 80.28410974063011],
 [26.21157000000005, 80.27301000000006],
 [26.42865834210118, 80.38628290132293],
 [26.51967146434187, 80.25879411608547],
 [26.467178655325988, 80.33499630794532],
 [26.61737000000005, 80.19634000000008],
 [26.449520000000064, 80.30274000000009],
 [26.418900000000065, 80.40744000000007],
 [26.475430000000074, 80.28700000000003],
 [26.490680000000054, 80.25215000000009],
 [26.453790000000026, 80.36283000000003],
 [26.466097488777894, 80.34706107715456],
 [26.46893170923684, 80.35977566020455],
 [26.48290000000003, 80.32901000000004],
 [26.468860884053807, 80.34833044202689],
 [26.530830000000037, 80.23555000000005],
 [26.38838000000004, 80.32378000000006],
 [26.45791440974284, 80.21657990630568],
 [26.447290000000066, 80.28482000000008],
 [26.477770000000078, 80.27634000000006],
 [26.485630078030223, 80.32398814090162]]

In [11]:
# create temporary dataframe to populate the coordinates into Latitude and Longitude
df_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])

In [12]:
# merge the coordinates into the original dataframe
kanpur_df['Latitude'] = df_coords['Latitude']
kanpur_df['Longitude'] = df_coords['Longitude']

In [13]:
# check the neighborhoods and the coordinates
print(kanpur_df.shape)
kanpur_df

(21, 3)


Unnamed: 0,Neighborhood,Latitude,Longitude
0,"Barra, Kanpur",26.436512,80.28411
1,Bhitargaon,26.21157,80.27301
2,"Birhana Road, Kanpur",26.428658,80.386283
3,Bithoor,26.519671,80.258794
4,"Chaman Ganj, Kanpur",26.467179,80.334996
5,Chobepur,26.61737,80.19634
6,Govind Nagar,26.44952,80.30274
7,Jajmau,26.4189,80.40744
8,Kakadeo,26.47543,80.287
9,"Kalyanpur, Uttar Pradesh",26.49068,80.25215


In [14]:
# save the DataFrame as CSV file
kanpur_df.to_csv("kanpur_df.csv", index=False)

### 4. Create a map of Kanpur with neighborhoods superimposed on top

In [15]:
# get the coordinates of Kanpur
address = 'Kanpur, India'

geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Kanpur, India are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Kanpur, India are 26.4609135, 80.3217588.


In [16]:
# create map of Kanpur using latitude and longitude values
map_kanpur = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, neighborhood in zip(kanpur_df['Latitude'], kanpur_df['Longitude'], kanpur_df['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_kanpur)  
    
map_kanpur

In [17]:
# save the map as HTML file
map_kanpur.save('map_kanpur.html')

### 5. Use the Foursquare API to explore the neighborhoods

In [18]:
# define Foursquare credentials and version
# (placeholders: substitute your own values and keep real keys out of shared notebooks)
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID
CLIENT_SECRET: YOUR_FOURSQUARE_CLIENT_SECRET


**Now, let's get the top 100 venues that are within a radius of 2000 meters.**

In [19]:
radius = 2000
LIMIT = 100

venues = []

for lat, long, neighborhood in zip(kanpur_df['Latitude'], kanpur_df['Longitude'], kanpur_df['Neighborhood']):
    
    # create the API request URL
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        LIMIT)
    
    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    # return only relevant information for each nearby venue
    for venue in results:
        venues.append((
            neighborhood,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))
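
The cell above indexes straight into the JSON response, so a response without `groups`, or a venue without categories, raises an exception and stops the loop. The sketch below shows one way to build the same request URL with encoded parameters and to flatten a response defensively. The helper names `build_explore_url` and `extract_venues` are hypothetical, and the sample payload only mimics the fields the notebook reads; it is not a full Foursquare response.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"  # same endpoint as above

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build the explore URL with URL-encoded query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    return EXPLORE_ENDPOINT + "?" + urlencode(params)

def extract_venues(payload, neighborhood, lat, lng):
    """Flatten one response into rows, tolerating missing groups or categories."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else None
        rows.append((neighborhood, lat, lng, venue["name"],
                     venue["location"]["lat"], venue["location"]["lng"], category))
    return rows

# placeholder credentials, not real keys
url = build_explore_url("ID", "SECRET", "20180605", 26.4, 80.2, 2000, 100)

# sample payload shaped like the fields used above (first Barra venue from the output)
sample = {"response": {"groups": [{"items": [{"venue": {
    "name": "Donalds ice cream parlour",
    "location": {"lat": 26.436635, "lng": 80.283463},
    "categories": [{"name": "Breakfast Spot"}],
}}]}]}}
rows = extract_venues(sample, "Barra, Kanpur", 26.436512, 80.28411)
print(url)
print(rows)
```

Rows whose category is `None` would need to be dropped or labeled before the one-hot encoding step.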

In [20]:
# convert the venues list into a new DataFrame
venues_df = pd.DataFrame(venues)

# define the column names
venues_df.columns = ['Neighborhood', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venues_df.shape)
venues_df.head()

(162, 7)


Unnamed: 0,Neighborhood,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,"Barra, Kanpur",26.436512,80.28411,Donalds ice cream parlour,26.436635,80.283463,Breakfast Spot
1,"Barra, Kanpur",26.436512,80.28411,Harjeet Laminators,26.443938,80.284948,Business Service
2,"Barra, Kanpur",26.436512,80.28411,HDFC Bank,26.429111,80.274624,ATM
3,"Barra, Kanpur",26.436512,80.28411,Shri Bhojaanalaya,26.452703,80.277802,Diner
4,"Birhana Road, Kanpur",26.428658,80.386283,Sigma,26.428899,80.389984,Bar


**Let's check how many venues were returned for each neighborhood**

In [21]:
venues_df.groupby(["Neighborhood"]).count()

Unnamed: 0_level_0,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Barra, Kanpur",4,4,4,4,4,4
"Birhana Road, Kanpur",8,8,8,8,8,8
Bithoor,4,4,4,4,4,4
"Chaman Ganj, Kanpur",16,16,16,16,16,16
Govind Nagar,4,4,4,4,4,4
Jajmau,4,4,4,4,4,4
Kakadeo,4,4,4,4,4,4
"Kalyanpur, Uttar Pradesh",4,4,4,4,4,4
Kanpur Cantonment,6,6,6,6,6,6
"Latouche Road, Kanpur",21,21,21,21,21,21


**Let's find out how many unique categories can be curated from all the returned venues**

In [22]:
print('There are {} unique categories.'.format(len(venues_df['VenueCategory'].unique())))

There are 41 unique categories.


In [23]:
# print out the list of categories
venues_df['VenueCategory'].unique()[:50]

array(['Breakfast Spot', 'Business Service', 'ATM', 'Diner', 'Bar', 'Spa',
       'Bowling Alley', 'Shopping Mall', 'Bookstore', 'Hotel',
       'Indian Restaurant', 'Golf Course', 'Convenience Store',
       'Food Court', 'Cricket Ground', 'Dessert Shop', 'Café',
       'Fast Food Restaurant', 'Clothing Store', 'Pizza Place',
       'Multiplex', 'Coffee Shop', 'Market', 'Bus Station',
       'Jewelry Store', 'Train Station', 'Airport', 'Snack Place',
       'Bakery', 'Pool', 'Garden Center', 'Cosmetics Shop', "Men's Store",
       'Chinese Restaurant', 'Camera Store', 'Furniture / Home Store',
       'Accessories Store', 'Park', 'Tea Room', 'Gym', 'Ice Cream Shop'],
      dtype=object)

In [25]:
# check if the results contain "Shopping Mall"
"Shopping Mall" in venues_df['VenueCategory'].unique()

True

### 6. Analyze Each Neighborhood

In [26]:
# one hot encoding
kanpur_onehot = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
kanpur_onehot['Neighborhoods'] = venues_df['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [kanpur_onehot.columns[-1]] + list(kanpur_onehot.columns[:-1])
kanpur_onehot = kanpur_onehot[fixed_columns]

print(kanpur_onehot.shape)
kanpur_onehot.head()

(162, 42)


Unnamed: 0,Neighborhoods,ATM,Accessories Store,Airport,Bakery,Bar,Bookstore,Bowling Alley,Breakfast Spot,Bus Station,Business Service,Café,Camera Store,Chinese Restaurant,Clothing Store,Coffee Shop,Convenience Store,Cosmetics Shop,Cricket Ground,Dessert Shop,Diner,Fast Food Restaurant,Food Court,Furniture / Home Store,Garden Center,Golf Course,Gym,Hotel,Ice Cream Shop,Indian Restaurant,Jewelry Store,Market,Men's Store,Multiplex,Park,Pizza Place,Pool,Shopping Mall,Snack Place,Spa,Tea Room,Train Station
0,"Barra, Kanpur",0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"Barra, Kanpur",0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,"Barra, Kanpur",1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,"Barra, Kanpur",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,"Birhana Road, Kanpur",0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


**Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category**

In [27]:
kanpur_grouped = kanpur_onehot.groupby(["Neighborhoods"]).mean().reset_index()

print(kanpur_grouped.shape)
kanpur_grouped

(17, 42)


Unnamed: 0,Neighborhoods,ATM,Accessories Store,Airport,Bakery,Bar,Bookstore,Bowling Alley,Breakfast Spot,Bus Station,Business Service,Café,Camera Store,Chinese Restaurant,Clothing Store,Coffee Shop,Convenience Store,Cosmetics Shop,Cricket Ground,Dessert Shop,Diner,Fast Food Restaurant,Food Court,Furniture / Home Store,Garden Center,Golf Course,Gym,Hotel,Ice Cream Shop,Indian Restaurant,Jewelry Store,Market,Men's Store,Multiplex,Park,Pizza Place,Pool,Shopping Mall,Snack Place,Spa,Tea Room,Train Station
0,"Barra, Kanpur",0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Birhana Road, Kanpur",0.0,0.0,0.0,0.0,0.125,0.125,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.125,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.125,0.0,0.0
2,Bithoor,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Chaman Ganj, Kanpur",0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0625,0.0,0.0625,0.0,0.0,0.0625,0.0625,0.0,0.0,0.0625,0.0625,0.0625,0.1875,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0625,0.0,0.0625,0.0,0.0625,0.0,0.0625,0.0,0.0,0.0,0.0
4,Govind Nagar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25
5,Jajmau,0.0,0.0,0.25,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0
6,Kakadeo,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.25,0.0,0.25,0.0,0.0,0.0,0.0
7,"Kalyanpur, Uttar Pradesh",0.25,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0
8,Kanpur Cantonment,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.166667,0.0,0.0,0.166667,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.166667
9,"Latouche Road, Kanpur",0.0,0.0,0.0,0.047619,0.0,0.0,0.047619,0.0,0.047619,0.0,0.095238,0.0,0.0,0.047619,0.0,0.0,0.0,0.047619,0.047619,0.047619,0.190476,0.0,0.0,0.047619,0.0,0.0,0.047619,0.0,0.0,0.047619,0.047619,0.0,0.047619,0.0,0.047619,0.0,0.047619,0.0,0.0,0.0,0.047619
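
Because each venue row is one-hot encoded, the mean of a category column within a neighborhood equals that category's venue count divided by the neighborhood's total venue count. The plain-Python sketch below reproduces that arithmetic for the four Barra venues listed in `venues_df.head()`. The variable names are illustrative and not part of the notebook.

```python
from collections import Counter, defaultdict

# (neighborhood, category) pairs taken from the first rows of venues_df
venue_rows = [
    ("Barra, Kanpur", "Breakfast Spot"),
    ("Barra, Kanpur", "Business Service"),
    ("Barra, Kanpur", "ATM"),
    ("Barra, Kanpur", "Diner"),
]

# count venues per category within each neighborhood
counts = defaultdict(Counter)
for neighborhood, category in venue_rows:
    counts[neighborhood][category] += 1

# frequency = category count / total venues in that neighborhood
frequencies = {
    neighborhood: {cat: n / sum(cat_counts.values()) for cat, n in cat_counts.items()}
    for neighborhood, cat_counts in counts.items()
}
print(frequencies["Barra, Kanpur"])
```

Each of the four categories comes out at 0.25, matching the Barra row in the grouped table. Categories that never occur in a neighborhood are simply absent here, whereas the DataFrame stores them as 0.0.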


In [28]:
len(kanpur_grouped[kanpur_grouped["Shopping Mall"] > 0])

9

**Create a new DataFrame for Shopping Mall data only**

In [29]:
kanpur_mall = kanpur_grouped[["Neighborhoods","Shopping Mall"]]

In [30]:
kanpur_mall.head()

Unnamed: 0,Neighborhoods,Shopping Mall
0,"Barra, Kanpur",0.0
1,"Birhana Road, Kanpur",0.125
2,Bithoor,0.0
3,"Chaman Ganj, Kanpur",0.0625
4,Govind Nagar,0.0


### 7. Cluster Neighborhoods
Run k-means to cluster the neighborhoods in Kanpur into 3 clusters.

In [32]:
# set number of clusters
kclusters = 3

kanpur_clustering = kanpur_mall.drop(columns=["Neighborhoods"])

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(kanpur_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 2, 0, 2, 0, 1, 1, 0, 0, 2])
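
Since the only feature is the Shopping Mall frequency, the clustering is one-dimensional and the k-means iteration can be followed by hand. The sketch below is a plain-Python version of the assign-then-update loop (Lloyd's algorithm), not scikit-learn's implementation. The starting centers are hand-picked assumptions, whereas `KMeans` chooses its own initialization, and the 17 frequencies are copied from the cluster tables in section 8.

```python
def kmeans_1d(values, centers, max_iter=100):
    """Lloyd's algorithm on scalars: assign to nearest center, then recompute means."""
    centers = list(centers)
    labels = []
    for _ in range(max_iter):
        # assignment step: index of the closest center for every value
        labels = [min(range(len(centers)), key=lambda j: abs(v - centers[j])) for v in values]
        # update step: each center moves to the mean of its members
        new_centers = []
        for j in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:  # nothing moved, so the assignment is stable
            break
        centers = new_centers
    return labels, centers

# Shopping Mall frequencies of the 17 neighborhoods (from the cluster tables)
mall_freq = [0.0] * 8 + [0.25, 0.25] + [0.05, 0.047619, 0.052632, 0.05, 0.0625, 0.125, 0.083333]

# hand-picked starting centers (an assumption made for this sketch)
labels, centers = kmeans_1d(mall_freq, centers=[0.0, 0.0625, 0.25])
print(labels)
print(centers)
```

The grouping matches the notebook's result (eight neighborhoods at 0.0, seven in the middle, two at 0.25), but the label numbers differ: cluster ids are arbitrary, and other starting centers can converge to a different grouping.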

In [33]:
# create a new dataframe that includes the cluster label as well as the Shopping Mall frequency for each neighborhood
kanpur_merged = kanpur_mall.copy()

# add clustering labels
kanpur_merged["Cluster Labels"] = kmeans.labels_

In [34]:
kanpur_merged.rename(columns={"Neighborhoods": "Neighborhood"}, inplace=True)
kanpur_merged.head()

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels
0,"Barra, Kanpur",0.0,0
1,"Birhana Road, Kanpur",0.125,2
2,Bithoor,0.0,0
3,"Chaman Ganj, Kanpur",0.0625,2
4,Govind Nagar,0.0,0


In [35]:
# merge kanpur_merged with kanpur_df to add latitude/longitude for each neighborhood
kanpur_merged = kanpur_merged.join(kanpur_df.set_index("Neighborhood"), on="Neighborhood")

print(kanpur_merged.shape)
kanpur_merged.head() # check the last columns!

(17, 5)


Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
0,"Barra, Kanpur",0.0,0,26.436512,80.28411
1,"Birhana Road, Kanpur",0.125,2,26.428658,80.386283
2,Bithoor,0.0,0,26.519671,80.258794
3,"Chaman Ganj, Kanpur",0.0625,2,26.467179,80.334996
4,Govind Nagar,0.0,0,26.44952,80.30274


In [36]:
# sort the results by Cluster Labels
print(kanpur_merged.shape)
kanpur_merged.sort_values(["Cluster Labels"], inplace=True)
kanpur_merged

(17, 5)


Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
0,"Barra, Kanpur",0.0,0,26.436512,80.28411
14,Rawatpur,0.0,0,26.47777,80.27634
13,Ratan Lal Nagar,0.0,0,26.44729,80.28482
12,Padri Lalpur,0.0,0,26.38838,80.32378
7,"Kalyanpur, Uttar Pradesh",0.0,0,26.49068,80.25215
8,Kanpur Cantonment,0.0,0,26.45379,80.36283
4,Govind Nagar,0.0,0,26.44952,80.30274
2,Bithoor,0.0,0,26.519671,80.258794
6,Kakadeo,0.25,1,26.47543,80.287
5,Jajmau,0.25,1,26.4189,80.40744


**Finally, let's visualize the resulting clusters**

In [38]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(kanpur_merged['Latitude'], kanpur_merged['Longitude'], kanpur_merged['Neighborhood'], kanpur_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [39]:
# save the map as HTML file
map_clusters.save('map_clusters.html')

### 8. Examine Clusters

#### Cluster 0

In [40]:
kanpur_merged.loc[kanpur_merged['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
0,"Barra, Kanpur",0.0,0,26.436512,80.28411
14,Rawatpur,0.0,0,26.47777,80.27634
13,Ratan Lal Nagar,0.0,0,26.44729,80.28482
12,Padri Lalpur,0.0,0,26.38838,80.32378
7,"Kalyanpur, Uttar Pradesh",0.0,0,26.49068,80.25215
8,Kanpur Cantonment,0.0,0,26.45379,80.36283
4,Govind Nagar,0.0,0,26.44952,80.30274
2,Bithoor,0.0,0,26.519671,80.258794


#### Cluster 1

In [41]:
kanpur_merged.loc[kanpur_merged['Cluster Labels'] == 1]

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
6,Kakadeo,0.25,1,26.47543,80.287
5,Jajmau,0.25,1,26.4189,80.40744


#### Cluster 2

In [42]:
kanpur_merged.loc[kanpur_merged['Cluster Labels'] == 2]

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
15,"The Mall, Kanpur",0.05,2,26.468932,80.359776
9,"Latouche Road, Kanpur",0.047619,2,26.466097,80.347061
10,McRobertganj,0.052632,2,26.4829,80.32901
11,"Meston Road, Kanpur",0.05,2,26.468861,80.34833
3,"Chaman Ganj, Kanpur",0.0625,2,26.467179,80.334996
1,"Birhana Road, Kanpur",0.125,2,26.428658,80.386283
16,"VIP Road, Kanpur",0.083333,2,26.48563,80.323988
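
A quick way to compare the three clusters is to average the Shopping Mall frequency within each one. The sketch below does this in plain Python with the values copied from the three tables above. The variable names are illustrative, and picking the cluster with the lowest mean is only one possible selection rule.

```python
from statistics import mean

# Shopping Mall frequencies per cluster label, copied from the tables above
mall_freq_by_cluster = {
    0: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    1: [0.25, 0.25],
    2: [0.05, 0.047619, 0.052632, 0.05, 0.0625, 0.125, 0.083333],
}

# mean frequency per cluster
cluster_means = {label: mean(freqs) for label, freqs in mall_freq_by_cluster.items()}

# cluster with the lowest share of shopping malls among its venues
candidate = min(cluster_means, key=cluster_means.get)

for label, value in sorted(cluster_means.items()):
    print("Cluster {}: {} neighborhoods, mean frequency {:.4f}".format(
        label, len(mall_freq_by_cluster[label]), value))
print("Lowest mall frequency: cluster", candidate)
```

Note that these are shares of the venues returned for each neighborhood, not counts of malls, so a neighborhood with very few venues can show a high frequency from a single mall.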


#### Observations:
Shopping malls account for the largest share of venues in cluster 1 (0.25), a moderate share in cluster 2 (roughly 0.05 to 0.125), and none in cluster 0. Cluster 0 therefore offers builders a good opportunity to open a new shopping mall: there is no existing competition there, which could benefit both vendors and the people living nearby.