# IBM Applied Data Science Capstone Course by Coursera
## **_Opening a New Shopping Mall in Mumbai, India_**
- Build a dataframe of neighborhoods in Mumbai, India by scraping the data from a Wikipedia page
- Get the geographical coordinates of the neighborhoods
- Obtain the venue data for the neighborhoods from the Foursquare API
- Explore and cluster the neighborhoods
- Select the best cluster to open a new shopping mall
***
### **1. Import libraries**

In [161]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)

import json # library to handle JSON files
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import geocoder # to get coordinates

import requests # library to handle requests
from bs4 import BeautifulSoup # library to parse HTML and XML documents

from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

print("Libraries imported.")

Libraries imported.


### **2. Scrape data from Wikipedia page into a DataFrame**

In [0]:
# send the GET request
data = requests.get("https://en.wikipedia.org/wiki/Category:Suburbs_of_Mumbai").text

In [0]:
# parse data from the html into a beautifulsoup object
soup = BeautifulSoup(data, 'html.parser')

In [0]:
# create a list to store neighborhood data
neighborhoodList = []

In [0]:
# append the data into the list
for row in soup.find_all("div", class_="mw-category")[0].find_all("li"):
    neighborhoodList.append(row.text)

In [166]:
# create a new DataFrame from the list
mum_df = pd.DataFrame({"Neighborhood": neighborhoodList})

mum_df.head()

Unnamed: 0,Neighborhood
0,Andheri
1,Anushakti Nagar
2,Baiganwadi
3,Bandra
4,Bhandup


In [167]:
# print the number of rows of the dataframe
mum_df.shape

(42, 1)
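
The scraping step above can be sketched offline as follows. This is a minimal illustration, not the notebook's exact method: `SAMPLE_HTML` is a hand-written stand-in for the category page, and `extract_neighborhoods` is a hypothetical helper that also guards against a missing `mw-category` block and skips repeated names.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for the category page (the real page has more markup).
SAMPLE_HTML = """
<div class="mw-category">
  <ul>
    <li><a href="/wiki/Andheri">Andheri</a></li>
    <li><a href="/wiki/Bandra">Bandra</a></li>
    <li><a href="/wiki/Andheri">Andheri</a></li>
  </ul>
</div>
"""

def extract_neighborhoods(html):
    """Return list-item texts from the first mw-category block, without repeats."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_="mw-category")
    if container is None:
        # Page layout changed or the request failed: return nothing instead of raising.
        return []
    names = []
    for item in container.find_all("li"):
        name = item.get_text(strip=True)
        if name and name not in names:
            names.append(name)
    return names

print(extract_neighborhoods(SAMPLE_HTML))  # ['Andheri', 'Bandra']
```

Returning an empty list when the container is missing makes a layout change visible as an empty DataFrame rather than an `IndexError`.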

### **3. Get the geographical coordinates**

In [0]:
# define a function to get coordinates
def get_latlng(neighborhood):
    # initialize your variable to None
    lat_lng_coords = None
    # loop until you get the coordinates
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, Mumbai, India'.format(neighborhood))
        lat_lng_coords = g.latlng
    return lat_lng_coords
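
The `while` loop above never exits if the geocoder keeps returning `None`. A minimal sketch of a bounded variant follows; `get_latlng_bounded`, `max_tries` and `pause_seconds` are hypothetical names, and the geocoding call is passed in as a function so the example runs without network access.

```python
import time

def get_latlng_bounded(neighborhood, geocode, max_tries=5, pause_seconds=0.0):
    """Return [lat, lng] from `geocode`, or None after `max_tries` failed attempts."""
    query = "{}, Mumbai, India".format(neighborhood)
    for _ in range(max_tries):
        coords = geocode(query)
        if coords is not None:
            return coords
        time.sleep(pause_seconds)  # optional pause between attempts
    return None

# Stub geocoder for illustration only: fails twice, then answers.
calls = {"n": 0}

def flaky_geocode(query):
    calls["n"] += 1
    return None if calls["n"] < 3 else [19.05437, 72.84017]

print(get_latlng_bounded("Bandra", flaky_geocode))  # [19.05437, 72.84017]
```

With the library used in this notebook, the injected function could be `lambda q: geocoder.arcgis(q).latlng`. Rows that come back as `None` would then need to be handled before building the coordinates DataFrame.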

In [0]:
# call the function to get the coordinates, store in a new list using list comprehension
coords = [ get_latlng(neighborhood) for neighborhood in mum_df["Neighborhood"].tolist() ]

In [170]:
coords

[[19.118459378296492, 72.84176321065843],
 [19.042830000000038, 72.92734000000007],
 [19.062940000000026, 72.92663000000005],
 [19.054370000000063, 72.84017000000006],
 [19.145560000000046, 72.94856000000004],
 [19.229360000000042, 72.85751000000005],
 [19.208660000000066, 72.82612000000006],
 [19.06218000000007, 72.90241000000003],
 [19.250030000000038, 72.85907000000003],
 [19.224690000000066, 72.86605000000003],
 [19.212750000000028, 73.08324000000005],
 [19.00534722389655, 72.85580272012932],
 [19.08652321008152, 72.90900774216628],
 [19.164550000000077, 72.84946000000008],
 [18.959290000000067, 72.83108000000004],
 [19.137920000000065, 72.84941000000003],
 [19.014920000000075, 72.84522000000004],
 [18.953937419095155, 72.82036732944775],
 [19.21198153260436, 72.83757275783374],
 [19.131380000000036, 72.93568000000005],
 [19.127580000000023, 72.82539000000008],
 [19.064980000000048, 72.88069000000007],
 [19.21198153260436, 72.83757275783374],
 [19.048530000000028, 72.93222000000003

In [0]:
# create temporary dataframe to populate the coordinates into Latitude and Longitude
df_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])

In [0]:
# merge the coordinates into the original dataframe
mum_df['Latitude'] = df_coords['Latitude']
mum_df['Longitude'] = df_coords['Longitude']

In [173]:
# check the neighborhoods and the coordinates
print(mum_df.shape)
mum_df

(42, 3)


Unnamed: 0,Neighborhood,Latitude,Longitude
0,Andheri,19.118459,72.841763
1,Anushakti Nagar,19.04283,72.92734
2,Baiganwadi,19.06294,72.92663
3,Bandra,19.05437,72.84017
4,Bhandup,19.14556,72.94856
5,Borivali,19.22936,72.85751
6,Charkop,19.20866,72.82612
7,Chembur,19.06218,72.90241
8,Dahisar,19.25003,72.85907
9,Devipada,19.22469,72.86605


In [0]:
# save the DataFrame as CSV file
mum_df.to_csv("mum_df.csv", index=False)
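
A geocoder can return a match far from the intended city. The Uttan row in the Cluster 2 table of this notebook, for example, has latitude 26.87 and longitude 80.94, well outside the range of every other row. A simple bounding-box check can flag such rows before clustering. The sketch below uses assumed, approximate bounds for the Mumbai area; `LAT_RANGE`, `LNG_RANGE` and `outside_box` are illustrative names, not part of the original analysis.

```python
# Assumed rough bounding box for the Mumbai metropolitan area (illustrative values).
LAT_RANGE = (18.8, 19.5)
LNG_RANGE = (72.7, 73.2)

def outside_box(rows, lat_range=LAT_RANGE, lng_range=LNG_RANGE):
    """Return the names of rows whose (lat, lng) fall outside the box."""
    flagged = []
    for name, lat, lng in rows:
        in_lat = lat_range[0] <= lat <= lat_range[1]
        in_lng = lng_range[0] <= lng <= lng_range[1]
        if not (in_lat and in_lng):
            flagged.append(name)
    return flagged

# Coordinates copied from this notebook's output.
rows = [
    ("Andheri", 19.118459, 72.841763),
    ("Bhandup", 19.14556, 72.94856),
    ("Uttan", 26.86634, 80.93884),
]
print(outside_box(rows))  # ['Uttan']
```

On the full DataFrame, the same rows could be produced with `mum_df[['Neighborhood', 'Latitude', 'Longitude']].itertuples(index=False)`.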

### **4. Create a map of Mumbai with neighborhoods superimposed on top**

In [175]:
# get the coordinates of Mumbai
address = 'Mumbai, India'

geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Mumbai, India are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Mumbai, India are 18.9387711, 72.8353355.


In [176]:
# create map of Mumbai using latitude and longitude values
map_mum = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, neighborhood in zip(mum_df['Latitude'], mum_df['Longitude'], mum_df['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_mum)  
    
map_mum

In [0]:
# save the map as HTML file
map_mum.save('map_mum.html')

### **5. Use the Foursquare API to explore the neighborhoods**

In [178]:
# define Foursquare Credentials and Version
CLIENT_ID = 'your Foursquare ID' # your Foursquare ID
CLIENT_SECRET = 'your Foursquare Secret' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: your Foursquare ID
CLIENT_SECRET: your Foursquare Secret


**Now, let's get the top 100 venues that are within a radius of 2000 meters.**

In [0]:
radius = 2000
LIMIT = 100

venues = []

for lat, long, neighborhood in zip(mum_df['Latitude'], mum_df['Longitude'], mum_df['Neighborhood']):
    
    # create the API request URL
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        LIMIT)
    
    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    # return only relevant information for each nearby venue
    for venue in results:
        venues.append((
            neighborhood,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))
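
The loop above indexes straight into the response, so a venue without categories or a response without groups raises an exception and stops the whole run. A minimal sketch of a more tolerant flattening step follows. `venue_rows` is a hypothetical helper, and the sample payload is hand-written to mimic the nesting the loop above relies on; it is not a recorded API response.

```python
def venue_rows(payload, neighborhood, lat, lng):
    """Flatten an explore-style response into tuples, tolerating missing fields."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or [{}]  # empty list -> placeholder dict
        rows.append((
            neighborhood, lat, lng,
            venue.get("name"),
            location.get("lat"),
            location.get("lng"),
            categories[0].get("name"),
        ))
    return rows

# Hand-written payload: one complete venue, one with no categories.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Merwans Cake shop",
               "location": {"lat": 19.1193, "lng": 72.845418},
               "categories": [{"name": "Bakery"}]}},
    {"venue": {"name": "Example Venue",
               "location": {"lat": 19.12, "lng": 72.84},
               "categories": []}},
]}]}}

for row in venue_rows(sample, "Andheri", 19.118459, 72.841763):
    print(row)
```

Venues with a missing category come back with `None` in the last position and can be dropped or relabeled before one-hot encoding.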

In [180]:
# convert the venues list into a new DataFrame
venues_df = pd.DataFrame(venues)

# define the column names
venues_df.columns = ['Neighborhood', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venues_df.shape)
venues_df.head()

(2687, 7)


Unnamed: 0,Neighborhood,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,Andheri,19.118459,72.841763,Merwans Cake shop,19.1193,72.845418,Bakery
1,Andheri,19.118459,72.841763,Radha Krishna Veg Restaurant,19.11513,72.84306,Indian Restaurant
2,Andheri,19.118459,72.841763,Naturals,19.111204,72.837255,Ice Cream Shop
3,Andheri,19.118459,72.841763,Narayan Sandwich,19.121398,72.85027,Sandwich Place
4,Andheri,19.118459,72.841763,Temptations,19.113767,72.841337,Ice Cream Shop


**Let's check how many venues were returned for each neighborhood**

In [181]:
venues_df.groupby(["Neighborhood"]).count()

Unnamed: 0_level_0,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Andheri,100,100,100,100,100,100
Anushakti Nagar,16,16,16,16,16,16
Baiganwadi,9,9,9,9,9,9
Bandra,100,100,100,100,100,100
Bhandup,25,25,25,25,25,25
Borivali,97,97,97,97,97,97
Charkop,59,59,59,59,59,59
Chembur,100,100,100,100,100,100
Dahisar,67,67,67,67,67,67
Devipada,88,88,88,88,88,88


**Let's find out how many unique categories can be curated from all the returned venues**

In [182]:
print('There are {} unique categories.'.format(len(venues_df['VenueCategory'].unique())))

There are 191 unique categories.


In [183]:
# print out the first 50 categories
venues_df['VenueCategory'].unique()[:50]

array(['Bakery', 'Indian Restaurant', 'Ice Cream Shop', 'Sandwich Place',
       'Coffee Shop', 'Falafel Restaurant', 'Pub', 'Juice Bar',
       'Pizza Place', 'Multiplex', 'Fast Food Restaurant',
       'Seafood Restaurant', 'Snack Place', 'Breakfast Spot', 'Café',
       'Chinese Restaurant', 'Maharashtrian Restaurant',
       'American Restaurant', 'Cocktail Bar', 'Bar', 'Diner',
       'Gym / Fitness Center', 'BBQ Joint', 'Department Store', 'Spa',
       "Women's Store", 'Asian Restaurant', 'Lounge', 'Liquor Store',
       'Residential Building (Apartment / Condo)', 'Electronics Store',
       'Food Truck', 'Smoke Shop', 'Vegetarian / Vegan Restaurant',
       'Athletics & Sports', 'Fish Market', 'Park', 'Martial Arts Dojo',
       'Tea Room', 'Burger Joint', 'Food', 'Plaza', 'Platform',
       'Paper / Office Supplies Store', 'Food & Drink Shop',
       'Music Venue', 'Fried Chicken Joint', 'Gym', 'Dessert Shop',
       'Sports Club'], dtype=object)

In [184]:
# check if the results contain "Shopping Mall"
"Shopping Mall" in venues_df['VenueCategory'].unique()

True

### **6. Analyze Each Neighborhood**



In [185]:
# one hot encoding
mum_onehot = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
mum_onehot['Neighborhoods'] = venues_df['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [mum_onehot.columns[-1]] + list(mum_onehot.columns[:-1])
mum_onehot = mum_onehot[fixed_columns]

print(mum_onehot.shape)
mum_onehot.head()

(2687, 192)


Unnamed: 0,Neighborhoods,Accessories Store,Afghan Restaurant,American Restaurant,Antique Shop,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Beach,Bed & Breakfast,Beer Garden,Bengali Restaurant,Big Box Store,Bistro,Bookstore,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Building,Burger Joint,Burrito Place,Bus Station,Cafeteria,Café,Cheese Shop,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,College Auditorium,College Gym,Comedy Club,Comfort Food Restaurant,Concert Hall,Convenience Store,Cosmetics Shop,Coworking Space,Cricket Ground,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dhaba,Dim Sum Restaurant,Diner,Donut Shop,Electronics Store,Event Space,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gastropub,General College & University,General Entertainment,German Restaurant,Gluten-free Restaurant,Goan Restaurant,Golf Course,Gourmet Shop,Grocery Store,Gujarati Restaurant,Gym,Gym / Fitness Center,Gym Pool,Halal Restaurant,Harbor / Marina,Hawaiian Restaurant,Historic Site,History Museum,Hockey Arena,Hookah Bar,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Irani Cafe,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Korean Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Maharashtrian Restaurant,Market,Martial Arts Dojo,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Motorcycle Shop,Mountain,Movie Theater,Mughlai Restaurant,Multicuisine Indian Restaurant,Multiplex,Music Store,Music 
Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Outdoors & Recreation,Paper / Office Supplies Store,Park,Performing Arts Venue,Pharmacy,Pizza Place,Platform,Playground,Plaza,Pool,Pub,Punjabi Restaurant,Recreation Center,Residential Building (Apartment / Condo),Resort,Restaurant,Roof Deck,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Smoke Shop,Snack Place,Soccer Field,South American Restaurant,South Indian Restaurant,Spa,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tea Room,Thai Restaurant,Theater,Theme Park,Toy / Game Store,Track,Trail,Train Station,Vegetarian / Vegan Restaurant,Wine Bar,Wine Shop,Women's Store
0,Andheri,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Andheri,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Andheri,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Andheri,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Andheri,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


**Next, let's group rows by neighborhood, taking the mean of the frequency of occurrence of each category**

In [186]:
mum_grouped = mum_onehot.groupby(["Neighborhoods"]).mean().reset_index()

print(mum_grouped.shape)
mum_grouped

(41, 192)


Unnamed: 0,Neighborhoods,Accessories Store,Afghan Restaurant,American Restaurant,Antique Shop,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Beach,Bed & Breakfast,Beer Garden,Bengali Restaurant,Big Box Store,Bistro,Bookstore,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Building,Burger Joint,Burrito Place,Bus Station,Cafeteria,Café,Cheese Shop,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,College Auditorium,College Gym,Comedy Club,Comfort Food Restaurant,Concert Hall,Convenience Store,Cosmetics Shop,Coworking Space,Cricket Ground,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dhaba,Dim Sum Restaurant,Diner,Donut Shop,Electronics Store,Event Space,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gastropub,General College & University,General Entertainment,German Restaurant,Gluten-free Restaurant,Goan Restaurant,Golf Course,Gourmet Shop,Grocery Store,Gujarati Restaurant,Gym,Gym / Fitness Center,Gym Pool,Halal Restaurant,Harbor / Marina,Hawaiian Restaurant,Historic Site,History Museum,Hockey Arena,Hookah Bar,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Irani Cafe,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Korean Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Maharashtrian Restaurant,Market,Martial Arts Dojo,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Motorcycle Shop,Mountain,Movie Theater,Mughlai Restaurant,Multicuisine Indian Restaurant,Multiplex,Music Store,Music 
Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Outdoors & Recreation,Paper / Office Supplies Store,Park,Performing Arts Venue,Pharmacy,Pizza Place,Platform,Playground,Plaza,Pool,Pub,Punjabi Restaurant,Recreation Center,Residential Building (Apartment / Condo),Resort,Restaurant,Roof Deck,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Smoke Shop,Snack Place,Soccer Field,South American Restaurant,South Indian Restaurant,Spa,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tea Room,Thai Restaurant,Theater,Theme Park,Toy / Game Store,Track,Trail,Train Station,Vegetarian / Vegan Restaurant,Wine Bar,Wine Shop,Women's Store
0,Andheri,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.01,0.01,0.0,0.02,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.04,0.0,0.05,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.15,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.04,0.0,0.0,0.0,0.01,0.05,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01
1,Anushakti Nagar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.125,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0625,0.0625,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0625,0.0,0.0,0.0,0.0625,0.0,0.0625,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Baiganwadi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.111111,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.111111,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Bandra,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.03,0.0,0.01,0.01,0.06,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.09,0.0,0.02,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.05,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.1,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.02,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Bhandup,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.04,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.16,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.12,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0
5,Borivali,0.0,0.0,0.0,0.0,0.020619,0.0,0.0,0.010309,0.0,0.0,0.0,0.010309,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.0,0.082474,0.0,0.051546,0.030928,0.0,0.020619,0.0,0.0,0.0,0.0,0.0,0.020619,0.0,0.0,0.0,0.0,0.0,0.0,0.020619,0.0,0.010309,0.0,0.0,0.010309,0.0,0.020619,0.0,0.0,0.0,0.0,0.0,0.051546,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030928,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030928,0.020619,0.0,0.0,0.0,0.0,0.020619,0.0,0.0,0.0,0.0,0.010309,0.061856,0.103093,0.010309,0.0,0.0,0.010309,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.041237,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.0,0.010309,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.010309,0.0,0.0,0.020619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051546,0.0,0.0,0.0,0.041237,0.010309,0.0,0.010309,0.0,0.0,0.010309,0.0,0.030928,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010309,0.0,0.010309,0.0,0.0,0.0,0.0,0.010309,0.020619,0.0,0.0,0.0
6,Charkop,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033898,0.016949,0.033898,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.033898,0.0,0.050847,0.0,0.0,0.050847,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.016949,0.016949,0.0,0.0,0.0,0.0,0.0,0.135593,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.016949,0.0,0.016949,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.033898,0.033898,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.033898,0.050847,0.0,0.0,0.0,0.033898,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.050847,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.016949,0.050847,0.0,0.0,0.016949,0.0,0.016949,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0
7,Chembur,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.02,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.04,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.03,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.2,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.05,0.0,0.01,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.02,0.0,0.01,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0
8,Dahisar,0.0,0.0,0.0,0.0,0.014925,0.0,0.0,0.0,0.014925,0.029851,0.0,0.0,0.0,0.029851,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044776,0.0,0.044776,0.0,0.044776,0.014925,0.0,0.029851,0.0,0.0,0.0,0.0,0.0,0.014925,0.0,0.0,0.0,0.0,0.0,0.0,0.029851,0.0,0.029851,0.0,0.0,0.014925,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044776,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029851,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044776,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014925,0.059701,0.089552,0.014925,0.0,0.0,0.014925,0.0,0.0,0.044776,0.0,0.0,0.0,0.014925,0.029851,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014925,0.0,0.0,0.0,0.0,0.029851,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044776,0.0,0.0,0.0,0.029851,0.0,0.0,0.029851,0.0,0.0,0.0,0.0,0.014925,0.029851,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014925,0.0,0.014925,0.0,0.0,0.0,0.0,0.014925,0.0,0.0,0.0,0.0
9,Devipada,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011364,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011364,0.0,0.0,0.0,0.068182,0.0,0.034091,0.034091,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.034091,0.0,0.0,0.0,0.0,0.011364,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.056818,0.0,0.0,0.0,0.011364,0.0,0.0,0.0,0.011364,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.011364,0.0,0.0,0.0,0.0,0.011364,0.0,0.0,0.0,0.0,0.0,0.090909,0.113636,0.022727,0.0,0.0,0.0,0.0,0.0,0.011364,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011364,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.011364,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.011364,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011364,0.0,0.022727,0.0,0.0,0.0,0.045455,0.011364,0.0,0.022727,0.0,0.0,0.0,0.0,0.034091,0.0,0.0,0.0,0.0,0.0,0.011364,0.0,0.011364,0.0,0.0,0.0,0.0,0.0,0.011364,0.0,0.0,0.0,0.0,0.0,0.011364,0.0,0.0,0.0
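
Each value in the grouped table is the share of a neighborhood's returned venues that fall in a category, so every row sums to 1. Bhandup's 0.04 for Shopping Mall, with 25 returned venues, corresponds to a single mall. The toy example below repeats the one-hot and group-mean steps on made-up data (neighborhoods "A" and "B" are placeholders) so the arithmetic is easy to follow.

```python
import pandas as pd

# Made-up venue list: four venues in A, two in B.
toy = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "VenueCategory": ["Bakery", "Bakery", "Park", "Shopping Mall", "Park", "Park"],
})

# One-hot encode the category column (float dtype so the mean is a plain fraction).
onehot = pd.get_dummies(toy[["VenueCategory"]], prefix="", prefix_sep="", dtype=float)
onehot.insert(0, "Neighborhoods", toy["Neighborhood"])

# Mean of the 0/1 indicators per neighborhood = frequency of each category.
grouped = onehot.groupby("Neighborhoods").mean().reset_index()
print(grouped)
```

Neighborhood A ends up with 0.5 for Bakery and 0.25 each for Park and Shopping Mall, while B has 1.0 for Park.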


In [187]:
len(mum_grouped[mum_grouped["Shopping Mall"] > 0])

17

**Create a new DataFrame for Shopping Mall data only**

In [0]:
mum_mall = mum_grouped[["Neighborhoods","Shopping Mall"]]

In [189]:
mum_mall.head()

Unnamed: 0,Neighborhoods,Shopping Mall
0,Andheri,0.0
1,Anushakti Nagar,0.0
2,Baiganwadi,0.0
3,Bandra,0.0
4,Bhandup,0.04


### **7. Cluster Neighborhoods**
#### Run k-means to cluster the neighborhoods in Mumbai into 3 clusters.

In [190]:
# set number of clusters
clusters = 3

mum_clustering = mum_mall.drop(columns=["Neighborhoods"])

# run k-means clustering
kmeans = KMeans(n_clusters=clusters, random_state=0).fit(mum_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 0, 0, 2, 1, 0, 1, 0, 0], dtype=int32)
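
The integer ids k-means assigns are arbitrary: a different `random_state` could give the mall-free group a label other than 0. One way to make labels comparable across runs is to renumber them by centroid value. The sketch below assumes a one-dimensional feature, as in this notebook; `rank_labels` is a hypothetical helper and the centroid values are invented for illustration.

```python
def rank_labels(labels, centers):
    """Map arbitrary cluster ids to ranks: 0 = smallest centroid, 1 = next, and so on."""
    order = sorted(range(len(centers)), key=lambda i: centers[i])
    rank_of = {cluster_id: rank for rank, cluster_id in enumerate(order)}
    return [rank_of[label] for label in labels]

# Invented example: cluster 1 has the lowest centroid, cluster 0 the highest.
labels = [1, 1, 0, 2]
centers = [0.035, 0.0, 0.015]
print(rank_labels(labels, centers))  # [0, 0, 2, 1]
```

With the fitted model above, the inputs would be `kmeans.labels_` and `kmeans.cluster_centers_.ravel()`.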

In [0]:
# create a new dataframe that includes the cluster label as well as the shopping mall frequency for each neighborhood
mum_merged = mum_mall.copy()

# add clustering labels
mum_merged["Cluster Labels"] = kmeans.labels_

In [192]:
mum_merged.rename(columns={"Neighborhoods": "Neighborhood"}, inplace=True)
mum_merged.head()

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels
0,Andheri,0.0,0
1,Anushakti Nagar,0.0,0
2,Baiganwadi,0.0,0
3,Bandra,0.0,0
4,Bhandup,0.04,2


In [193]:
# merge mum_merged with mum_df to add latitude/longitude for each neighborhood
mum_merged = mum_merged.join(mum_df.set_index("Neighborhood"), on="Neighborhood")

print(mum_merged.shape)
mum_merged.head() # check the last columns!

(41, 5)


Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
0,Andheri,0.0,0,19.118459,72.841763
1,Anushakti Nagar,0.0,0,19.04283,72.92734
2,Baiganwadi,0.0,0,19.06294,72.92663
3,Bandra,0.0,0,19.05437,72.84017
4,Bhandup,0.04,2,19.14556,72.94856


In [194]:
# sort the results by Cluster Labels
print(mum_merged.shape)
mum_merged.sort_values(["Cluster Labels"], inplace=True)
mum_merged

(41, 5)


Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
0,Andheri,0.0,0,19.118459,72.841763
34,Tilak Nagar (Mumbai),0.0,0,18.99616,72.85279
31,"Sion, Mumbai",0.0,0,19.04359,72.86412
30,Shil Phata,0.0,0,19.14658,73.04005
27,Mumbra,0.0,0,19.19054,73.02266
25,Mira Road,0.0,0,19.280032,72.867932
24,"Matharpacady, Mumbai",0.0,0,19.04492,72.867205
23,Mankhurd,0.0,0,19.04853,72.93222
22,Mahavir Nagar (Kandivali),0.0,0,19.211982,72.837573
21,Kurla,0.0,0,19.06498,72.88069


**Finally, let's visualize the resulting clusters**

In [195]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(clusters)
ys = [i+x+(i*x)**2 for i in range(clusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(mum_merged['Latitude'], mum_merged['Longitude'], mum_merged['Neighborhood'], mum_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [0]:
# save the map as HTML file
map_clusters.save('map_clusters.html')

### **8. Examine Clusters**

#### **Cluster 0**

In [197]:
mum_merged.loc[mum_merged['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
0,Andheri,0.0,0,19.118459,72.841763
34,Tilak Nagar (Mumbai),0.0,0,18.99616,72.85279
31,"Sion, Mumbai",0.0,0,19.04359,72.86412
30,Shil Phata,0.0,0,19.14658,73.04005
27,Mumbra,0.0,0,19.19054,73.02266
25,Mira Road,0.0,0,19.280032,72.867932
24,"Matharpacady, Mumbai",0.0,0,19.04492,72.867205
23,Mankhurd,0.0,0,19.04853,72.93222
22,Mahavir Nagar (Kandivali),0.0,0,19.211982,72.837573
21,Kurla,0.0,0,19.06498,72.88069


#### **Cluster 1**

In [198]:
mum_merged.loc[mum_merged['Cluster Labels'] == 1]

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
13,Goregaon,0.02,1,19.16455,72.84946
38,Wadala,0.021277,1,19.0172,72.85817
33,Thakur village,0.017544,1,19.2102,72.87541
32,"Sonapur, Bhandup",0.023256,1,19.16394,72.93544
29,Seven Bungalows,0.01,1,19.131342,72.816342
40,Worli,0.02,1,19.00744,72.81688
5,Borivali,0.010309,1,19.22936,72.85751
7,Chembur,0.01,1,19.06218,72.90241
39,Western Suburbs (Mumbai),0.013158,1,19.19701,72.82768
12,Ghatkopar,0.012346,1,19.086523,72.909008


#### **Cluster 2**

In [199]:
mum_merged.loc[mum_merged['Cluster Labels'] == 2]

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
26,Mulund,0.038961,2,19.17183,72.95565
19,Kanjurmarg,0.030303,2,19.13138,72.93568
35,Uttan,0.043478,2,26.86634,80.93884
36,Vashi,0.03,2,19.08465,72.90481
37,Vikhroli,0.03,2,19.11109,72.92781
4,Bhandup,0.04,2,19.14556,72.94856


### **Observations:**
Most of the shopping malls are concentrated in the central area of Mumbai: cluster 2 has the highest concentration and cluster 1 a moderate one, while the neighborhoods in cluster 0 have no shopping malls among their returned venues. Cluster 0 therefore represents high-potential areas for new shopping malls, since there is little to no competition from existing ones. Malls in cluster 2, by contrast, are likely suffering from intense competition due to oversupply. Seen another way, the oversupply of shopping malls has mostly occurred in the central area of the city, while the suburbs still have very few.

This project therefore recommends that property developers capitalize on these findings by opening new shopping malls in cluster 0 neighborhoods, where there is no competition. Developers with a unique selling proposition that sets them apart can also consider cluster 1 neighborhoods, where competition is moderate. Lastly, developers are advised to avoid cluster 2 neighborhoods, which already have a high concentration of shopping malls and intense competition.
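
The comparison between clusters can be checked with a per-cluster summary. The sketch below uses only six rows copied from the tables above, so its counts are not those of the full dataset; on the notebook's data the same `groupby` would be applied to `mum_merged`.

```python
import pandas as pd

# Six rows copied from the cluster tables above (a subset, for illustration).
subset = pd.DataFrame({
    "Neighborhood": ["Andheri", "Kurla", "Chembur", "Worli", "Bhandup", "Vashi"],
    "Shopping Mall": [0.0, 0.0, 0.01, 0.02, 0.04, 0.03],
    "Cluster Labels": [0, 0, 1, 1, 2, 2],
})

# Number of neighborhoods and range of mall frequency in each cluster.
summary = subset.groupby("Cluster Labels")["Shopping Mall"].agg(["count", "min", "max"])
print(summary)
```

A cluster whose maximum is 0.0 contains no neighborhood with a mall among its returned venues, which is the property the recommendation for cluster 0 rests on.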