<b><h1>Construction of Shopping Malls in Mumbai, India<h1><b>


<b><h1>Introduction</h1></b>

*Shopping is a great way for tourists as well as locals to enjoy,relax and treat themselves to something new and exciting. This activity has become extremely popular in recent times. Although life was brought to a standstill due to the Coronavirus pandemic,in a lot of countries, including India, shops have begun opening up, resulting in the return of this trend. Property developers are also taking advantage of this to tap into the enormous market out there. As a result, Mumbai, the commercial capital of India, has a ton of shopping centres, and numerous ones are being currently constructed. These malls allow developers to have a consistent income.However, not all malls are successful; their locations also have to be taken into consideration.This factor is what truly determines a mall's success or failure.*

<b><h1>Business Problem</h1></b>

*The objective of this project is to analyze and select the best locations in the city of Mumbai, India, to open a new shopping mall. This project is mainly focused on geospatial analysis of Mumbai to understand which would be the best place to open a new mall. Using data science methodology and machine learning techniques like clustering, this project aims to provide solutions to answer the business question: In Mumbai, if a property developer is looking to open a new shopping mall, where would you recommend that they open it?*

<b><h1>Data</h1></b>

####Requirements

To solve this problem, we need geographical location data for Mumbai.We need to obtain the following data:
<ul>
<li>List of neighbourhoods in Mumbai</li>
<li>Latitude and Longitude data of these neighbourhoods</li>
<li>Venue data (related to shopping malls)</li> 

####<b>Sources</b>

We scrape data from this wikipedia page to derive our solution:
https://en.wikipedia.org/wiki/Category:Neighbourhoods_in_Mumbai

This page has all the neighbouroods in Mumbai, making it an important and effective source.

<b>Foursquare API data</b>

We will need data about different venues in different neighbourhoods. In order to gain that information we will use "Foursquare" locational information. Foursquare is a location data provider with information about all manner of venues and events within an area of interest. Such information includes venue names, locations, menus and even photos. As such, the foursquare location platform will be used as the sole data source since all the stated required information can be obtained through the API.

After finding the list of neighbourhoods, we then connect to the Foursquare API to gather information about venues inside each and every neighbourhood.

<i><h3>Finally, we have all the data required to build our model. Using this, our stakeholders can make important decisions</h3></i>

<h1><b>Methodology</b></h1>

In [2]:
import pandas as pd
import requests
from bs4 import BeautifulSoup

In [3]:
url=requests.get('https://en.wikipedia.org/wiki/Category:Neighbourhoods_in_Mumbai')
url

<Response [200]>

In [4]:
soup = BeautifulSoup(url.text, 'html.parser')

In [5]:
neighborhoodList = []
for row in soup.find_all("div", class_="mw-category")[0].findAll("li"):
  neighborhoodList.append(row.text)
df = pd.DataFrame({"Neighborhood": neighborhoodList})
df.head()  

Unnamed: 0,Neighborhood
0,List of neighbourhoods in Mumbai
1,Aarey Forest
2,Agripada
3,Altamount Road
4,"Amboli, Mumbai"


In [6]:

df.drop(index=0, inplace=True , axis=0)

In [7]:
df.head()

Unnamed: 0,Neighborhood
1,Aarey Forest
2,Agripada
3,Altamount Road
4,"Amboli, Mumbai"
5,Amrut Nagar


In [8]:
df=df.reset_index()
df.head()

Unnamed: 0,index,Neighborhood
0,1,Aarey Forest
1,2,Agripada
2,3,Altamount Road
3,4,"Amboli, Mumbai"
4,5,Amrut Nagar


In [9]:
df.drop(columns=['index'], axis=1,inplace=True)
df.head()

Unnamed: 0,Neighborhood
0,Aarey Forest
1,Agripada
2,Altamount Road
3,"Amboli, Mumbai"
4,Amrut Nagar


In [10]:
!pip install geocoder


Collecting geocoder
[?25l  Downloading https://files.pythonhosted.org/packages/4f/6b/13166c909ad2f2d76b929a4227c952630ebaf0d729f6317eb09cbceccbab/geocoder-1.38.1-py2.py3-none-any.whl (98kB)
[K     |███▎                            | 10kB 16.9MB/s eta 0:00:01[K     |██████▋                         | 20kB 18.9MB/s eta 0:00:01[K     |██████████                      | 30kB 15.0MB/s eta 0:00:01[K     |█████████████▎                  | 40kB 13.9MB/s eta 0:00:01[K     |████████████████▋               | 51kB 11.7MB/s eta 0:00:01[K     |████████████████████            | 61kB 11.8MB/s eta 0:00:01[K     |███████████████████████▎        | 71kB 11.5MB/s eta 0:00:01[K     |██████████████████████████▋     | 81kB 12.5MB/s eta 0:00:01[K     |██████████████████████████████  | 92kB 11.7MB/s eta 0:00:01[K     |████████████████████████████████| 102kB 6.4MB/s 
Collecting ratelim
  Downloading https://files.pythonhosted.org/packages/f2/98/7e6d147fd16a10a5f821db6e25f192265d6ecca3d82957a4fd

In [11]:
import geocoder


In [12]:
def get_coor(neighborhood):
    lat_lng_coords = None
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, Mumbai,India'.format(neighborhood))
        lat_lng_coords = g.latlng
    return lat_lng_coords

coords = [ get_coor(neighborhood) for neighborhood in df["Neighborhood"].tolist()]

In [13]:
len(coords)

134

In [14]:
len(df.Neighborhood)

134

In [16]:
df_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])
df['Latitude'] = df_coords['Latitude']
df['Longitude'] = df_coords['Longitude']
df

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Aarey Forest,19.138660,72.884250
1,Agripada,18.976280,72.826150
2,Altamount Road,18.964334,72.807842
3,"Amboli, Mumbai",18.940170,72.834830
4,Amrut Nagar,19.145160,72.846740
...,...,...,...
129,Walkeshwar,18.950120,72.799800
130,Wellington Pier (Bombay),18.943268,72.826687
131,Western Suburbs (Mumbai),19.197010,72.827680
132,Yashodham,19.088200,72.833370


In [17]:
df.shape

(134, 3)

In [18]:
!pip install geopy



In [19]:
!pip install geolocator

Collecting geolocator
  Downloading https://files.pythonhosted.org/packages/93/d2/230ed3be8afdda66e35611c617b15709b99522481c1b51744491b4208bbe/geolocator-0.1.1.zip
Building wheels for collected packages: geolocator
  Building wheel for geolocator (setup.py) ... [?25l[?25hdone
  Created wheel for geolocator: filename=geolocator-0.1.1-cp36-none-any.whl size=12167 sha256=3a3e801f97350d886611dcaaef40689e1d5dd6b3bd8a5c8aa057c9c6d1c629d9
  Stored in directory: /root/.cache/pip/wheels/fd/0f/35/478bc0f7f5c5acf6e5345db12383a79daeb524d75691691e15
Successfully built geolocator
Installing collected packages: geolocator
Successfully installed geolocator-0.1.1


In [20]:
from geopy.geocoders import Nominatim
geolocator=Nominatim()
address = 'Mumbai, India'
geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Mumbai, India {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Mumbai, India 19.0759899, 72.8773928.




In [21]:
import folium

In [22]:
map_kl = folium.Map(location=[latitude, longitude], zoom_start=15)
for lat, lng, neighborhood in zip(df['Latitude'],  df['Longitude'], df['Neighborhood']):
 label = '{}'.format(neighborhood)
 label = folium.Popup(label, parse_html=True)
 folium.CircleMarker([lat, lng],radius=5,popup=label,color='blue',fill=True,fill_color='#3186cc',fill_opacity=0.7).add_to(map_kl)
map_kl


In [28]:
CLIENT_ID = '4PTCUFNXSY0MM0EE3YILFHZKFWTZILFCR0J2IYMWCWZXSFZL' 
CLIENT_SECRET = '5ZUW5RJQTU3BYGHH4MDU1AIUUHKZQR22RO23MNBZKZVYVIWI'
VERSION = '20180605'
radius = 2000
LIMIT = 100
venues_list=[]
for name, lat, lng in zip(df['Neighborhood'],df['Latitude'], df['Longitude']):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        for venue in results:
          venues_list.append((name,lat,lng,venue['venue']['name'],
          venue['venue']['location']['lat'],venue['venue']['location']    ['lng'],venue['venue']['categories'][0]['name']))

Aarey Forest
Agripada
Altamount Road
Amboli, Mumbai
Amrut Nagar
Antop Hill
Anushakti Nagar
Asalfa
Badhwar Park
Baiganwadi
Ballard Estate
Bandra
Bandra Kurla Complex
Bangur Nagar
Bhuleshwar
Bori Bunder
Breach Candy
Byculla
C.G.S. colony
Cavel
Chandanwadi, Mumbai
Chandivali
Chinchpokli
Chira Bazaar
Chor Bazaar, Mumbai
Churchgate
Colaba
Cotton Green
Cuffe Parade
Cumbala Hill
Currey Road railway station
D.N. Nagar
Dadar
Dadar Parsi Colony
Dagdi Chawl
Dava Bazaar
Dedh galli
Deonar
Dharavi
Dhobitalao
Dindoshi
Dongri
Fanas Wadi
Ferry Wharf
Fort (Mumbai precinct)
Four Bungalows
Gavangaon
Ghodapdeo
Girgaon
Gokuldham
Gopalrao Deshmukh Marg
Gorai
Gowalia Tank
Guru Tegh Bahadur Nagar
Hindu Colony
Hiranandani Gardens, Mumbai
I.C. Colony
Irla
Jagruti Nagar
JB Nagar
Kajuwadi
Kala Ghoda
Kalbadevi
Kamathipura
Kannamwar Nagar
Kemps Corner
Khar Danda
Kherwadi
Khotachiwadi
Koliwada
Koombarwara
Kopar Road
Lalbaug
Lallubhai Compound
Land's End, Bandra
Lion Gate (Mumbai)
Lohar Chawl
Lokhandwala Complex
Madh 

In [29]:
venues_df = pd.DataFrame(venues_list)


In [30]:
venues_df

Unnamed: 0,0,1,2,3,4,5,6
0,Aarey Forest,19.13866,72.88425,Powai,19.124896,72.893503,Garden
1,Aarey Forest,19.13866,72.88425,Skky,19.135841,72.898494,Restaurant
2,Aarey Forest,19.13866,72.88425,Fratelli Fresh Italian Restaurant,19.134830,72.901951,Italian Restaurant
3,Aarey Forest,19.13866,72.88425,Renaissance Hotel Executive Lounge,19.134863,72.901883,Lounge
4,Aarey Forest,19.13866,72.88425,The Residence Hotel And Convention Center @ Powai,19.134862,72.898510,Hotel
...,...,...,...,...,...,...,...
3787,Zaveri Bazaar,18.95028,72.83157,Tawakkal sweets,18.960053,72.830936,Dessert Shop
3788,Zaveri Bazaar,18.95028,72.83157,Town House Cafe,18.938550,72.833464,Bar
3789,Zaveri Bazaar,18.95028,72.83157,Jaffar Bhai Delhi Darbar,18.950000,72.834625,Indian Restaurant
3790,Zaveri Bazaar,18.95028,72.83157,Delhi Darbar,18.961581,72.823527,Indian Restaurant


In [31]:
venues_df.columns = ['Neighborhood', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']
print(venues_df.shape)
venues_df

(3792, 7)


Unnamed: 0,Neighborhood,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,Aarey Forest,19.13866,72.88425,Powai,19.124896,72.893503,Garden
1,Aarey Forest,19.13866,72.88425,Skky,19.135841,72.898494,Restaurant
2,Aarey Forest,19.13866,72.88425,Fratelli Fresh Italian Restaurant,19.134830,72.901951,Italian Restaurant
3,Aarey Forest,19.13866,72.88425,Renaissance Hotel Executive Lounge,19.134863,72.901883,Lounge
4,Aarey Forest,19.13866,72.88425,The Residence Hotel And Convention Center @ Powai,19.134862,72.898510,Hotel
...,...,...,...,...,...,...,...
3787,Zaveri Bazaar,18.95028,72.83157,Tawakkal sweets,18.960053,72.830936,Dessert Shop
3788,Zaveri Bazaar,18.95028,72.83157,Town House Cafe,18.938550,72.833464,Bar
3789,Zaveri Bazaar,18.95028,72.83157,Jaffar Bhai Delhi Darbar,18.950000,72.834625,Indian Restaurant
3790,Zaveri Bazaar,18.95028,72.83157,Delhi Darbar,18.961581,72.823527,Indian Restaurant


In [32]:
df['Neighborhood'].unique()

array(['Aarey Forest', 'Agripada', 'Altamount Road', 'Amboli, Mumbai',
       'Amrut Nagar', 'Antop Hill', 'Anushakti Nagar', 'Asalfa',
       'Badhwar Park', 'Baiganwadi', 'Ballard Estate', 'Bandra',
       'Bandra Kurla Complex', 'Bangur Nagar', 'Bhuleshwar',
       'Bori Bunder', 'Breach Candy', 'Byculla', 'C.G.S. colony', 'Cavel',
       'Chandanwadi, Mumbai', 'Chandivali', 'Chinchpokli', 'Chira Bazaar',
       'Chor Bazaar, Mumbai', 'Churchgate', 'Colaba', 'Cotton Green',
       'Cuffe Parade', 'Cumbala Hill', 'Currey Road railway station',
       'D.N. Nagar', 'Dadar', 'Dadar Parsi Colony', 'Dagdi Chawl',
       'Dava Bazaar', 'Dedh galli', 'Deonar', 'Dharavi', 'Dhobitalao',
       'Dindoshi', 'Dongri', 'Fanas Wadi', 'Ferry Wharf',
       'Fort (Mumbai precinct)', 'Four Bungalows', 'Gavangaon',
       'Ghodapdeo', 'Girgaon', 'Gokuldham', 'Gopalrao Deshmukh Marg',
       'Gorai', 'Gowalia Tank', 'Guru Tegh Bahadur Nagar', 'Hindu Colony',
       'Hiranandani Gardens, Mumbai', 'I.C.

In [33]:
from sklearn.preprocessing import OneHotEncoder

In [36]:
df=venues_df

In [38]:
df

Unnamed: 0,Neighborhood,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,Aarey Forest,19.13866,72.88425,Powai,19.124896,72.893503,Garden
1,Aarey Forest,19.13866,72.88425,Skky,19.135841,72.898494,Restaurant
2,Aarey Forest,19.13866,72.88425,Fratelli Fresh Italian Restaurant,19.134830,72.901951,Italian Restaurant
3,Aarey Forest,19.13866,72.88425,Renaissance Hotel Executive Lounge,19.134863,72.901883,Lounge
4,Aarey Forest,19.13866,72.88425,The Residence Hotel And Convention Center @ Powai,19.134862,72.898510,Hotel
...,...,...,...,...,...,...,...
3787,Zaveri Bazaar,18.95028,72.83157,Tawakkal sweets,18.960053,72.830936,Dessert Shop
3788,Zaveri Bazaar,18.95028,72.83157,Town House Cafe,18.938550,72.833464,Bar
3789,Zaveri Bazaar,18.95028,72.83157,Jaffar Bhai Delhi Darbar,18.950000,72.834625,Indian Restaurant
3790,Zaveri Bazaar,18.95028,72.83157,Delhi Darbar,18.961581,72.823527,Indian Restaurant


In [39]:
len(df['VenueCategory'])

3792

In [40]:
venues_df.groupby(["Neighborhood"]).count

<bound method DataFrameGroupBy.count of <pandas.core.groupby.generic.DataFrameGroupBy object at 0x7f3fe09fc978>>

In [41]:
print('There are {} unique categories.'.format(len(venues_df['VenueCategory'].unique())))

There are 170 unique categories.


In [43]:
venues_df['VenueCategory'].unique()

array(['Garden', 'Restaurant', 'Italian Restaurant', 'Lounge', 'Hotel',
       'Gym', 'Indian Restaurant', 'Asian Restaurant', 'Ice Cream Shop',
       'Chinese Restaurant', 'Farm', 'Pizza Place', 'Dance Studio',
       'Coffee Shop', 'Gym / Fitness Center', 'Café', 'Garden Center',
       'Cupcake Shop', 'Tea Room', 'Bed & Breakfast', 'Nightclub',
       'Bakery', 'Club House', 'Golf Course', 'Middle Eastern Restaurant',
       'Music Venue', 'Scenic Lookout', 'History Museum', 'Bar',
       'Juice Bar', 'Spa', 'Deli / Bodega', 'Brewery', 'Bookstore',
       'Japanese Restaurant', 'Sandwich Place', 'Fast Food Restaurant',
       'Donut Shop', 'Mexican Restaurant', 'Park', "Men's Store",
       'Snack Place', 'Dessert Shop', 'Other Great Outdoors',
       'Salon / Barbershop', 'Beach', 'Cricket Ground',
       'Parsi Restaurant', 'Seafood Restaurant', 'Clothing Store',
       'Market', 'Monument / Landmark', 'Cheese Shop', 'Art Gallery',
       'Mughlai Restaurant', 'Pub', 'American Re

In [44]:
kl_onehot = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")

In [45]:
kl_onehot

Unnamed: 0,Airport Service,American Restaurant,Arcade,Art Gallery,Asian Restaurant,Athletics & Sports,Australian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,Beach,Bed & Breakfast,Beer Garden,Bengali Restaurant,Big Box Store,Bistro,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Building,Burger Joint,Burrito Place,Bus Station,Café,Chaat Place,Cheese Shop,Chinese Restaurant,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Academic Building,College Auditorium,College Gym,Comedy Club,Comfort Food Restaurant,...,Pharmacy,Pier,Pizza Place,Platform,Playground,Plaza,Pool,Pub,Punjabi Restaurant,Recreation Center,Resort,Restaurant,Roof Deck,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shopping Mall,Smoke Shop,Snack Place,South Indian Restaurant,Spa,Sports Bar,Sports Club,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tea Room,Thai Restaurant,Theater,Toy / Game Store,Track,Train Station,Tunnel,Vegetarian / Vegan Restaurant,Water Park,Women's Store
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3787,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3788,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3789,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3790,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [46]:
kl_onehot['Neighborhoods'] = venues_df['Neighborhood']

In [47]:
kl_onehot

Unnamed: 0,Airport Service,American Restaurant,Arcade,Art Gallery,Asian Restaurant,Athletics & Sports,Australian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,Beach,Bed & Breakfast,Beer Garden,Bengali Restaurant,Big Box Store,Bistro,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Building,Burger Joint,Burrito Place,Bus Station,Café,Chaat Place,Cheese Shop,Chinese Restaurant,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Academic Building,College Auditorium,College Gym,Comedy Club,Comfort Food Restaurant,...,Pier,Pizza Place,Platform,Playground,Plaza,Pool,Pub,Punjabi Restaurant,Recreation Center,Resort,Restaurant,Roof Deck,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shopping Mall,Smoke Shop,Snack Place,South Indian Restaurant,Spa,Sports Bar,Sports Club,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tea Room,Thai Restaurant,Theater,Toy / Game Store,Track,Train Station,Tunnel,Vegetarian / Vegan Restaurant,Water Park,Women's Store,Neighborhoods
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Aarey Forest
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Aarey Forest
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Aarey Forest
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Aarey Forest
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Aarey Forest
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3787,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Zaveri Bazaar
3788,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Zaveri Bazaar
3789,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Zaveri Bazaar
3790,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Zaveri Bazaar


In [48]:
fixed_columns = [kl_onehot.columns[-1]] + list(kl_onehot.columns[:-1])
kl_onehot = kl_onehot[fixed_columns]

In [49]:
kl_onehot

Unnamed: 0,Neighborhoods,Airport Service,American Restaurant,Arcade,Art Gallery,Asian Restaurant,Athletics & Sports,Australian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,Beach,Bed & Breakfast,Beer Garden,Bengali Restaurant,Big Box Store,Bistro,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Building,Burger Joint,Burrito Place,Bus Station,Café,Chaat Place,Cheese Shop,Chinese Restaurant,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Academic Building,College Auditorium,College Gym,Comedy Club,...,Pharmacy,Pier,Pizza Place,Platform,Playground,Plaza,Pool,Pub,Punjabi Restaurant,Recreation Center,Resort,Restaurant,Roof Deck,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shopping Mall,Smoke Shop,Snack Place,South Indian Restaurant,Spa,Sports Bar,Sports Club,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tea Room,Thai Restaurant,Theater,Toy / Game Store,Track,Train Station,Tunnel,Vegetarian / Vegan Restaurant,Water Park,Women's Store
0,Aarey Forest,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Aarey Forest,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Aarey Forest,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Aarey Forest,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Aarey Forest,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3787,Zaveri Bazaar,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3788,Zaveri Bazaar,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3789,Zaveri Bazaar,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3790,Zaveri Bazaar,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [50]:
kl_grouped=kl_onehot.groupby(["Neighborhoods"]).sum().reset_index()
print(kl_grouped.shape)
kl_grouped

(134, 171)


Unnamed: 0,Neighborhoods,Airport Service,American Restaurant,Arcade,Art Gallery,Asian Restaurant,Athletics & Sports,Australian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,Beach,Bed & Breakfast,Beer Garden,Bengali Restaurant,Big Box Store,Bistro,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Building,Burger Joint,Burrito Place,Bus Station,Café,Chaat Place,Cheese Shop,Chinese Restaurant,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Academic Building,College Auditorium,College Gym,Comedy Club,...,Pharmacy,Pier,Pizza Place,Platform,Playground,Plaza,Pool,Pub,Punjabi Restaurant,Recreation Center,Resort,Restaurant,Roof Deck,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shopping Mall,Smoke Shop,Snack Place,South Indian Restaurant,Spa,Sports Bar,Sports Club,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tea Room,Thai Restaurant,Theater,Toy / Game Store,Track,Train Station,Tunnel,Vegetarian / Vegan Restaurant,Water Park,Women's Store
0,Aarey Forest,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,2,0,0,0,2,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
1,Agripada,0,0,0,0,1,0,0,0,0,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,1,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Altamount Road,0,0,0,0,0,0,0,0,0,3,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0,0,2,0,0,1,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,"Amboli, Mumbai",0,0,0,1,1,0,0,0,0,1,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,1,1,1,0,0,1,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Amrut Nagar,0,1,0,0,0,0,0,1,0,1,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,2,0,0,0,1,0,0,0,0,...,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
129,Walkeshwar,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
130,Wellington Pier (Bombay),0,0,0,0,0,0,0,0,0,3,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,1,1,0,0,0,1,1,0,0,0,...,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
131,Western Suburbs (Mumbai),0,1,0,0,0,0,0,0,0,0,2,0,1,0,0,0,0,0,0,0,0,1,0,0,1,1,0,0,0,0,1,2,0,0,3,0,0,0,0,...,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
132,Yashodham,0,0,0,0,0,0,1,0,0,0,2,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,1,1,1,2,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,1


In [51]:
len((kl_grouped[kl_grouped["Shopping Mall"] > 0]))

18

In [52]:
kl_mall = kl_grouped[["Neighborhoods","Shopping Mall"]]

In [53]:
kl_mall

Unnamed: 0,Neighborhoods,Shopping Mall
0,Aarey Forest,0
1,Agripada,0
2,Altamount Road,0
3,"Amboli, Mumbai",0
4,Amrut Nagar,0
...,...,...
129,Walkeshwar,0
130,Wellington Pier (Bombay),0
131,Western Suburbs (Mumbai),1
132,Yashodham,0


In [66]:
len(kl_mall['Shopping Mall'])
c=0
d=0

In [68]:
for v in kl_mall['Shopping Mall']:
  if(v==1):
    c+=1
  if(v==2):  
    d+=1

In [69]:
c

11

In [70]:
d

7

In [73]:
from sklearn.cluster import KMeans
kclusters = 3
kl_clustering = kl_mall.drop(["Neighborhoods"], 1)
kmeans = KMeans(n_clusters=kclusters,random_state=0).fit(kl_clustering)
kmeans

KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300,
       n_clusters=3, n_init=10, n_jobs=None, precompute_distances='auto',
       random_state=0, tol=0.0001, verbose=0)

In [74]:
kl_merged = kl_mall.copy()

In [75]:
kl_merged["Cluster Labels"] = kmeans.labels_
kl_merged.rename(columns={"Neighborhoods": "Neighborhood"}, inplace=True)
kl_merged.head(10)

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels
0,Aarey Forest,0,0
1,Agripada,0,0
2,Altamount Road,0,0
3,"Amboli, Mumbai",0,0
4,Amrut Nagar,0,0
5,Antop Hill,0,0
6,Anushakti Nagar,0,0
7,Asalfa,1,1
8,Badhwar Park,0,0
9,Baiganwadi,0,0


In [77]:
kl_merged['Latitude'] = df['Latitude']
kl_merged['Longitude'] = df['Longitude']

In [78]:
kl_merged

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
0,Aarey Forest,0,0,19.13866,72.88425
1,Agripada,0,0,19.13866,72.88425
2,Altamount Road,0,0,19.13866,72.88425
3,"Amboli, Mumbai",0,0,19.13866,72.88425
4,Amrut Nagar,0,0,19.13866,72.88425
...,...,...,...,...,...
129,Walkeshwar,0,0,19.14516,72.84674
130,Wellington Pier (Bombay),0,0,19.14516,72.84674
131,Western Suburbs (Mumbai),1,1,19.14516,72.84674
132,Yashodham,0,0,19.14516,72.84674


In [79]:
kl_merged.sort_values(["Cluster Labels"], inplace=True)
kl_merged

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
0,Aarey Forest,0,0,19.13866,72.88425
93,Mendham's Point,0,0,18.94017,72.83483
92,Mazagaon,0,0,18.94017,72.83483
91,"Matunga Road, Mumbai",0,0,18.94017,72.83483
90,Marol,0,0,18.94017,72.83483
...,...,...,...,...,...
13,Bangur Nagar,2,2,19.13866,72.88425
31,Currey Road railway station,2,2,18.97628,72.82615
18,C.G.S. colony,2,2,19.13866,72.88425
105,Parel,2,2,18.94017,72.83483


In [82]:
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]
markers_colors = []
for lat, lon, poi, cluster in zip(kl_merged['Latitude'], kl_merged['Longitude'], kl_merged['Neighborhood'], kl_merged['Cluster Labels']):
  label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
  folium.CircleMarker([lat,lon],radius=5,popup=label,color=rainbow[cluster-1],fill=True,fill_color=rainbow[cluster-1],fill_opacity=0.7).add_to(map_clusters)
map_clusters  

In [84]:
len(kl_merged.loc[kl_merged['Cluster Labels'] == 0])


116

In [85]:
len(kl_merged.loc[kl_merged['Cluster Labels'] == 1])


11

In [86]:

len(kl_merged.loc[kl_merged['Cluster Labels'] == 2])

7

<b><h1>Results</h1></b><br>
There are 116 places in cluster 0 which is the highest among the 3 clusters, and cluster 0 contains all the places which do not have a shopping mall. Cluster 1 contains 11 places and all of them contain exactly 1 shopping mall, while cluster 2 contains 7 places where all the places contain 2 or more shopping malls.
The results from the K-means clustering show that we can categorize the neighbourhoods into 3 clusters based on the frequency of occurrence for “Shopping Mall”:
• Cluster 0: Neighbourhoods with very less number of shopping malls
• Cluster 1: Neighbourhoods with a moderate concentration of shopping malls
• Cluster 2: Neighbourhoods with a high concentration of shopping malls


<h1><b>Discussion</b></h1>
<br>The results clearly indicate that 116 places in Mumbai are suitab1le for the construction of shopping malls, as they have no competition in the area.Construction companies and other developers can channel this information to make huge profits 

<b><h1>Conclusion</h1></b><br>
A good number of shopping malls are present in Mumbai. Cluster 0 has a very low number of malls. This represents a great opportunity and high potential areas to open new shopping malls, as there is very little to no competition from existing malls. Meanwhile, shopping malls in cluster 2 are likely suffering from intense competition because of oversupply and a high concentration of shopping malls. Therefore, this project recommends property developers to capitalize on these findings to open new shopping malls in neighbourhoods in cluster 0 with little to no competition. Property developers with unique selling propositions to stand out from the competition can also open new shopping malls in neighbourhoods in cluster 1 with moderate competition. Lastly, property developers are advised to avoid neighbourhoods in cluster 2 which already have a high concentration of shopping malls and suffering from intense competition.
In this project, we only consider one factor i.e. frequency of occurrence of shopping malls, there are other factors such as population and income of residents that could influence the location decision of a new shopping mall.
But for setting up a shopping mall we need to consider other factors such as the cost of rent, the surroundings around the shopping mall, the kind of people in the locality-if it's a luxurious area many people prefer going out, their lifestyle will be different from others and therefore spend a lot. If we decide a place where the competition is less, then we need to consider the people living in that locality as well. If the people in that area spend a lot and love going out then it’ll be a success. If the people staying near the mall don't prefer going out, then the construction of a mall serves no purpose