# Introduction: Business Problem

### Mumbai is the financial, commercial and entertainment capital of India and contributes 6.16% of India's GDP. According to the United Nations, Mumbai is the second-most populous city in India after Delhi. Investors and entrepreneurs from all over the world want to invest in Mumbai because of its business opportunities and large labor force.
### Because of its large population, the food and beverage industry can prosper with a moderate investment and large profit margins. This report therefore offers a perspective to entrepreneurs and investors from all over the world who are looking to enter the food and beverage industry in Mumbai. The analysis groups the areas of Mumbai into clusters and indicates where opening a beverage shop or a restaurant is likely to be profitable for the investor.

# Data

### The list of neighborhoods in Mumbai is obtained from <a>https://en.wikipedia.org/wiki/List_of_neighbourhoods_in_Mumbai</a>.
### The venues in each neighborhood are obtained by querying the Foursquare API. The resulting dataframe will contain the top 10 venue categories of each neighborhood and will provide adequate information to decide where to open beverage shops or restaurants. The K-means clustering algorithm will be used to group the areas (districts) of Mumbai by the similarity of their venues and to find patterns between the different districts and clusters.
### Thus, the data will be used to meet the following objectives:
<ul>
    <li> Neighborhoods and Areas of Mumbai </li>
    <li> Trending Venues of the Areas </li>
    <li> Categorizing the venues </li>
    <li> Cluster different Areas of Mumbai using K-means algorithm </li>
</ul>

## Data Gathering

#### Import the required libraries

In [1]:
import pandas as pd
import numpy as np
import requests
from bs4 import BeautifulSoup

#### Fetch the page with requests and parse it with BeautifulSoup

In [2]:
url='https://en.wikipedia.org/wiki/List_of_neighbourhoods_in_Mumbai'
html=requests.get(url)
soup=BeautifulSoup(html.content, 'html.parser')

#### Collect the text of every table cell using find_all of BeautifulSoup

In [3]:
table=[]
for i in soup.table.tbody.find_all('td'):
    table.append(i.text.strip())
table[0:12]

['Amboli',
 'Andheri,Western Suburbs',
 '19.1293',
 '72.8434',
 'Chakala, Andheri',
 'Western Suburbs',
 '19.111388',
 '72.860833',
 'D.N. Nagar',
 'Andheri,Western Suburbs',
 '19.124085',
 '72.831373']
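
The flat list above repeats in groups of four cells (neighborhood, borough, latitude, longitude). As a minimal sketch, assuming that layout holds for every row, the cells can be regrouped into rows and checked before a dataframe is built from them. The helper name `chunk_cells` is illustrative and not part of the original notebook.

```python
def chunk_cells(cells, width=4):
    """Group a flat list of table cells into rows of `width` cells each."""
    if len(cells) % width != 0:
        # A leftover cell means some row had a missing or extra <td>
        raise ValueError('expected a multiple of {} cells, got {}'.format(width, len(cells)))
    return [cells[i:i + width] for i in range(0, len(cells), width)]

# First two rows of the scraped output shown above
sample = ['Amboli', 'Andheri,Western Suburbs', '19.1293', '72.8434',
          'Chakala, Andheri', 'Western Suburbs', '19.111388', '72.860833']
rows = chunk_cells(sample)
print(rows[1])
```

If a row on the page ever had a different number of cells, the stride-based slicing would silently shift every later value into the wrong column, whereas this check fails loudly.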

#### Get the required Dataframe

In [4]:
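# The flat cell list repeats in groups of four: neighborhood, borough, latitude, longitude
data={'Neighborhood':table[0::4], 'Borough':table[1::4], 'Latitude':table[2::4], 'Longitude':table[3::4]}  # named 'data' to avoid shadowing the built-in dict
df=pd.DataFrame(data)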
df.head(10)

| | Neighborhood | Borough | Latitude | Longitude |
|---|---|---|---|---|
| 0 | Amboli | Andheri,Western Suburbs | 19.1293 | 72.8434 |
| 1 | Chakala, Andheri | Western Suburbs | 19.111388 | 72.860833 |
| 2 | D.N. Nagar | Andheri,Western Suburbs | 19.124085 | 72.831373 |
| 3 | Four Bungalows | Andheri,Western Suburbs | 19.124714 | 72.82721 |
| 4 | Lokhandwala | Andheri,Western Suburbs | 19.130815 | 72.82927 |
| 5 | Marol | Andheri,Western Suburbs | 19.119219 | 72.882743 |
| 6 | Sahar | Andheri,Western Suburbs | 19.098889 | 72.867222 |
| 7 | Seven Bungalows | Andheri,Western Suburbs | 19.129052 | 72.817018 |
| 8 | Versova | Andheri,Western Suburbs | 19.12 | 72.82 |
| 9 | Mira Road | Mira-Bhayandar,Western Suburbs | 19.284167 | 72.871111 |


In [5]:
df.shape

(93, 4)

#### Convert the latitude and longitudes from 'object' type to 'float' type

In [13]:
df['Latitude']=df['Latitude'].astype("float")
df['Longitude']=df['Longitude'].astype("float")
df.dtypes

Neighborhood     object
Borough          object
Latitude        float64
Longitude       float64
dtype: object
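
`astype("float")` raises an error as soon as one cell is not a clean number. A hedged alternative, assuming some scraped cells could contain stray text, is `pd.to_numeric` with `errors='coerce'`, which turns unparseable values into NaN so that they can be inspected. The malformed value below is made up for illustration.

```python
import pandas as pd

# Toy frame with one deliberately malformed latitude (hypothetical value)
coords = pd.DataFrame({'Latitude': ['19.1293', '19.111388', 'n/a'],
                       'Longitude': ['72.8434', '72.860833', '72.831373']})

for col in ['Latitude', 'Longitude']:
    # Unparseable strings become NaN instead of raising ValueError
    coords[col] = pd.to_numeric(coords[col], errors='coerce')

# Rows where at least one coordinate could not be parsed
bad_rows = coords[coords[['Latitude', 'Longitude']].isna().any(axis=1)]
print(coords.dtypes)
print(bad_rows)
```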

In [6]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

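import json  # library to handle JSON files
from pandas.io.json import json_normalize  # flattens JSON into a dataframe (available as pandas.json_normalize from pandas 1.0 onward)

# Matplotlib modules used to generate the cluster colors
import matplotlib.cm as cm
import matplotlib.colors as colors

# K-means implementation from scikit-learn
from sklearn.cluster import KMeans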

#### Get the Folium and Geopy libraries for visualization

In [7]:
!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim 
!conda install -c conda-forge folium=0.5.0 --yes
import folium 

print('Libraries imported.')


#### Get the location (latitude, longitude) of Mumbai

In [8]:
address='Mumbai, MH'
geolocator=Nominatim(user_agent='mu_explorer')
location=geolocator.geocode(address)
Latitude=location.latitude
Longitude=location.longitude
print('The geographical coordinates of Mumbai are {},{}'.format(Latitude, Longitude))

The geographical coordinates of Mumbai are 18.9387711,72.8353355


#### Get the map of Mumbai (marked with neighborhoods) using the Folium library

In [37]:
map_mumbai=folium.Map(location=[Latitude, Longitude], zoom_start=11)
for lat, lng, label in zip(df['Latitude'], df['Longitude'], df['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker( [lat, lng], radius=5, popup=label, color='blue', fill=True, fill_color='#3186cc', fill_opacity=0.7, parse_html=False).add_to(map_mumbai)  
    
map_mumbai

#### Foursquare API credentials and call

In [15]:
# Sensitive Cell (Hidden): real credentials are replaced by placeholders
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Client secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_CLIENT_ID
CLIENT_SECRET:YOUR_CLIENT_SECRET
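
Credentials written into a notebook cell are visible to anyone who opens it. One common alternative, sketched here under the assumption that the keys are stored in environment variables, is to read them at run time. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper name are hypothetical.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read the client ID and secret from a mapping of environment variables."""
    client_id = env.get('FOURSQUARE_CLIENT_ID')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first')
    return client_id, client_secret

# Demonstration with a stand-in mapping instead of the real environment
demo_env = {'FOURSQUARE_CLIENT_ID': 'demo-id', 'FOURSQUARE_CLIENT_SECRET': 'demo-secret'}
client_id, client_secret = load_foursquare_credentials(demo_env)
print('Credentials loaded:', bool(client_id and client_secret))
```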


#### Get up to 100 nearby venues (within a 500 m radius) for every neighborhood of Mumbai

In [21]:
LIMIT=100
def GetNearbyVenues(names, latitude, longitude, radius=500):
    venues_list=[]
    for name, lat, lng in zip(names, latitude, longitude):
        print(name)
        # Build the Foursquare "explore" request for this neighborhood
        url='https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, CLIENT_SECRET, VERSION, lat, lng, radius, LIMIT)
        results=requests.get(url).json()['response']['groups'][0]['items']
        # One tuple per venue: neighborhood details plus venue name, coordinates and category
        venues_list.append([(name, lat, lng, v['venue']['name'], v['venue']['location']['lat'], v['venue']['location']['lng'],
                             v['venue']['categories'][0]['name']) for v in results])
    # Flatten the per-neighborhood lists into a single dataframe
    nearby_venues=pd.DataFrame([item for venue_group in venues_list for item in venue_group])
    nearby_venues.columns=['Neighborhood', 'Neighborhood Latitude', 'Neighborhood Longitude', 'Venue', 'Venue Latitude', 'Venue Longitude', 'Venue Category']
    return nearby_venues
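
The function above indexes straight into the JSON response, so a venue without a category or a response without groups raises an exception. The sketch below shows one way to build the same query string with `urllib.parse.urlencode` and to extract venues defensively. The helper names and the sample payload are hypothetical; the payload only mimics the nesting that `GetNearbyVenues` reads (`response`, `groups`, `items`, `venue`).

```python
from urllib.parse import urlencode

def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    """Assemble the explore URL from a parameter dict instead of a format string."""
    params = {'client_id': client_id, 'client_secret': client_secret, 'v': version,
              'll': '{},{}'.format(lat, lng), 'radius': radius, 'limit': limit}
    return 'https://api.foursquare.com/v2/venues/explore?' + urlencode(params)

def extract_venues(payload):
    """Return (name, lat, lng, category) tuples, skipping incomplete entries."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    venues = []
    for item in items:
        venue = item.get('venue', {})
        categories = venue.get('categories') or []
        location = venue.get('location', {})
        if not categories or 'lat' not in location or 'lng' not in location:
            continue  # skip venues that lack a category or coordinates
        venues.append((venue.get('name'), location['lat'], location['lng'], categories[0]['name']))
    return venues

# Made-up payload: one complete venue and one without a category
sample_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Toy Cafe', 'location': {'lat': 19.1, 'lng': 72.8},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'No Category', 'location': {'lat': 19.2, 'lng': 72.9},
               'categories': []}},
]}]}}

print(build_explore_url('ID', 'SECRET', '20180605', 19.1293, 72.8434))
print(extract_venues(sample_payload))
```

`urlencode` percent-encodes the comma in the `ll` parameter, which is a standard URL encoding of the same value.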

In [22]:
mumbai_venues=GetNearbyVenues(names=df['Neighborhood'], latitude=df['Latitude'], longitude=df['Longitude'])

Amboli
Chakala, Andheri
D.N. Nagar
Four Bungalows
Lokhandwala
Marol
Sahar
Seven Bungalows
Versova
Mira Road
Bhayandar
Uttan
Bandstand Promenade
Kherwadi
Pali Hill
I.C. Colony
Gorai
Dahisa
Aarey Milk Colony
Bangur Nagar
Jogeshwari West
Juhu
Charkop
Poisar
Mahavir Nagar
Thakur village
Pali Naka
Khar Danda
Dindoshi
Sunder Nagar
Kalina
Naigaon
Nalasopara
Virar
Irla
Vile Parle
Bhandup
Amrut Nagar
Asalfa
Pant Nagar
Kanjurmarg
Nehru Nagar
Nahur
Chandivali
Hiranandani Gardens
Indian Institute of Technology Bombay campus
Vidyavihar
Vikhroli
Chembur
Deonar
Mankhurd
Mahul
Agripada
Altamount Road
Bhuleshwar
Breach Candy
Carmichael Road
Cavel
Churchgate
Cotton Green
Cuffe Parade
Cumbala Hill
Currey Road
Dhobitalao
Dongri
Kala Ghoda
Kemps Corner
Lower Parel
Mahalaxmi
Mahim
Malabar Hill
Marine Drive
Marine Lines
Mumbai Central
Nariman Point
Prabhadevi
Sion
Walkeshwar
Worli
C.G.S. colony
Dagdi Chawl
Navy Nagar
Hindu colony
Ballard Estate
Chira Bazaar
Fanas Wadi
Chor Bazaar
Matunga
Parel
Gowalia Tank
D

#### Get the shape and count of the Dataframe mumbai_venues

In [23]:
print(mumbai_venues.shape)
mumbai_venues.groupby('Neighborhood').count()

(1343, 7)


| Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|
| Agripada | 4 | 4 | 4 | 4 | 4 | 4 |
| Altamount Road | 8 | 8 | 8 | 8 | 8 | 8 |
| Amboli | 7 | 7 | 7 | 7 | 7 | 7 |
| Amrut Nagar | 37 | 37 | 37 | 37 | 37 | 37 |
| Asalfa | 5 | 5 | 5 | 5 | 5 | 5 |
| Ballard Estate | 6 | 6 | 6 | 6 | 6 | 6 |
| Bandstand Promenade | 15 | 15 | 15 | 15 | 15 | 15 |
| Bangur Nagar | 4 | 4 | 4 | 4 | 4 | 4 |
| Bhandup | 10 | 10 | 10 | 10 | 10 | 10 |
| Bhayandar | 1 | 1 | 1 | 1 | 1 | 1 |
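
Several neighborhoods return very few venues (Bhayandar has one, Agripada and Bangur Nagar have four each), and frequencies computed from so few venues are coarse. A minimal sketch of how such neighborhoods could be flagged, using the counts from the table above and an arbitrary, assumed threshold of five venues:

```python
import pandas as pd

# Venue counts copied from the first rows of the table above
counts = pd.Series({'Agripada': 4, 'Altamount Road': 8, 'Amboli': 7, 'Amrut Nagar': 37,
                    'Asalfa': 5, 'Ballard Estate': 6, 'Bandstand Promenade': 15,
                    'Bangur Nagar': 4, 'Bhandup': 10, 'Bhayandar': 1}, name='venues')

MIN_VENUES = 5  # assumed cut-off, not a value from the original analysis
sparse = counts[counts < MIN_VENUES].sort_values()
print(sparse)
```

On the full dataframe, `mumbai_venues.groupby('Neighborhood').size()` would produce the counts directly.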


# Methodology

### Our objective is to find the most suitable areas in Mumbai to open a restaurant or a beverage shop. We will use the K-means clustering algorithm to achieve this objective.
### We will apply one-hot encoding to the venues dataframe and then group it by neighborhood. One-hot encoding turns each venue category into a column, and taking the mean of each column per neighborhood gives the relative frequency of every venue type in that neighborhood.
### The K-means clustering algorithm will cluster the neighborhoods based on the venue frequencies in the encoded dataframe and provide a cluster label for similar neighborhoods. We will then examine the clusters one by one to determine their content and provide the appropriate recommendation.
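
To make the encoding step concrete, here is a minimal sketch on toy data (the neighborhoods `A` and `B` and their venues are invented). Each venue becomes a row of 0/1 indicators, and the per-neighborhood mean of those indicators is the share of venues in each category.

```python
import pandas as pd

# Toy venue list: neighborhood A has four venues, neighborhood B has two
venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Gym', 'Bakery', 'Gym', 'Park'],
})

# One indicator column per category, one row per venue
onehot = pd.get_dummies(venues['Venue Category'], dtype=float)
# insert() raises if a venue category is itself called 'Neighborhood'
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

# Mean of the indicators = relative frequency of each category per neighborhood
freq = onehot.groupby('Neighborhood').mean()
print(freq)
```

Every row of `freq` sums to 1, so neighborhoods with different numbers of venues become comparable before clustering.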

## Data Analysis

#### Analyze each neighborhood through One-hot encoding

In [24]:
mumbai_onehot=pd.get_dummies(mumbai_venues[['Venue Category']], prefix='', prefix_sep='')
mumbai_onehot['Neighborhood']=mumbai_venues['Neighborhood']
fixed_columns=[mumbai_onehot.columns[-1]]+ list(mumbai_onehot.columns[:-1])
mumbai_onehot=mumbai_onehot[fixed_columns]
mumbai_onehot.head()

Unnamed: 0,Yoga Studio,ATM,Advertising Agency,Afghan Restaurant,American Restaurant,Amphitheater,Antique Shop,Aquarium,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Beach,Bed & Breakfast,Beer Bar,Beer Garden,Bengali Restaurant,Big Box Store,Bistro,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bridal Shop,Buffet,Burger Joint,Bus Station,Café,Chaat Place,Cheese Shop,Chinese Restaurant,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Auditorium,Comedy Club,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Cricket Ground,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dhaba,Dim Sum Restaurant,Diner,Dog Run,Donut Shop,Electronics Store,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Fish Market,Flower Shop,Food,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gastropub,General Entertainment,German Restaurant,Gift Shop,Goan Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,History Museum,Hookah Bar,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Karaoke Bar,Lake,Lighthouse,Liquor Store,Lounge,Maharashtrian Restaurant,Market,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Monument / Landmark,Movie Theater,Moving Target,Mughlai Restaurant,Multiplex,Music Store,Music Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,North Indian Restaurant,Office,Other Great Outdoors,Paper / Office Supplies Store,Park,Parsi Restaurant,Performing Arts Venue,Pharmacy,Photography Studio,Pizza Place,Platform,Playground,Plaza,Pool,Portuguese Restaurant,Pub,Recreation Center,Residential Building (Apartment / Condo),Resort,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shipping Store,Shop & Service,Shopping Mall,Skating Rink,Smoke Shop,Snack Place,Soccer Field,South Indian Restaurant,Spa,Sporting Goods Shop,Sports Bar,Stables,Stadium,Steakhouse,Tea Room,Tex-Mex Restaurant,Theater,Tourist Information Center,Trail,Train Station,Vegetarian / Vegan Restaurant,Whisky Bar,Wine Bar,Wine Shop,Women's Store
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Amboli,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Amboli,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Amboli,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Amboli,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Amboli,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


#### Group the rows by Neighborhood, taking the mean frequency of each category

In [25]:
mumbai_grouped=mumbai_onehot.groupby('Neighborhood').mean().reset_index()
mumbai_grouped.head()

Unnamed: 0,Neighborhood,Yoga Studio,ATM,Advertising Agency,Afghan Restaurant,American Restaurant,Amphitheater,Antique Shop,Aquarium,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Beach,Bed & Breakfast,Beer Bar,Beer Garden,Bengali Restaurant,Big Box Store,Bistro,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bridal Shop,Buffet,Burger Joint,Bus Station,Café,Chaat Place,Cheese Shop,Chinese Restaurant,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Auditorium,Comedy Club,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Cricket Ground,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dhaba,Dim Sum Restaurant,Diner,Dog Run,Donut Shop,Electronics Store,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Fish Market,Flower Shop,Food,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gastropub,General Entertainment,German Restaurant,Gift Shop,Goan Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,History Museum,Hookah Bar,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Karaoke Bar,Lake,Lighthouse,Liquor Store,Lounge,Maharashtrian Restaurant,Market,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Monument / Landmark,Movie Theater,Moving Target,Mughlai Restaurant,Multiplex,Music Store,Music Venue,New American Restaurant,Nightclub,Noodle House,North Indian Restaurant,Office,Other Great Outdoors,Paper / Office Supplies Store,Park,Parsi Restaurant,Performing Arts Venue,Pharmacy,Photography Studio,Pizza Place,Platform,Playground,Plaza,Pool,Portuguese Restaurant,Pub,Recreation Center,Residential Building (Apartment / Condo),Resort,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shipping Store,Shop & Service,Shopping Mall,Skating Rink,Smoke Shop,Snack Place,Soccer Field,South Indian Restaurant,Spa,Sporting Goods Shop,Sports Bar,Stables,Stadium,Steakhouse,Tea Room,Tex-Mex Restaurant,Theater,Tourist Information Center,Trail,Train Station,Vegetarian / Vegan Restaurant,Whisky Bar,Wine Bar,Wine Shop,Women's Store
0,Agripada,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Altamount Road,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Amboli,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Amrut Nagar,0.0,0.0,0.0,0.027027,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.027027,0.0,0.027027,0.0,0.0,0.0,0.0,0.108108,0.0,0.0,0.027027,0.027027,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.027027,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.162162,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Asalfa,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


#### Get the top 5 most common and popular venues of each Neighborhood

In [26]:
num_top_venues = 5
for hood in mumbai_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = mumbai_grouped[mumbai_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Agripada----
               venue  freq
0                Gym  0.25
1        Coffee Shop  0.25
2  Indian Restaurant  0.25
3             Bakery  0.25
4        Yoga Studio  0.00


----Altamount Road----
               venue  freq
0               Café  0.25
1             Bakery  0.12
2        Pizza Place  0.12
3  Indian Restaurant  0.12
4     Sandwich Place  0.12


----Amboli----
                  venue  freq
0    Chinese Restaurant  0.14
1                  Park  0.14
2  Fast Food Restaurant  0.14
3                   Gym  0.14
4     Indian Restaurant  0.14


----Amrut Nagar----
                  venue  freq
0     Indian Restaurant  0.16
1                  Café  0.11
2     Electronics Store  0.05
3      Asian Restaurant  0.05
4  Fast Food Restaurant  0.05


----Asalfa----
         venue  freq
0  Men's Store   0.2
1       Hostel   0.2
2         Park   0.2
3   Playground   0.2
4   Food Truck   0.2


----Ballard Estate----
               venue  freq
0    Harbor / Marina  0.33
1  Convenienc

#### Sort the venues in descending order

In [27]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]


#### Get the Dataframe of Top 10 venues of each Neighborhood

In [48]:
num_top_venues = 10
indicators = ['st', 'nd', 'rd']
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # positions past the 3rd have no entry in indicators and take the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))


neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = mumbai_grouped['Neighborhood']

for ind in np.arange(mumbai_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(mumbai_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

| | Neighborhood | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Agripada | Gym | Coffee Shop | Indian Restaurant | Bakery | Dim Sum Restaurant | Farmers Market | Falafel Restaurant | Event Space | Electronics Store | Donut Shop |
| 1 | Altamount Road | Café | Pizza Place | Sandwich Place | Bakery | Indian Restaurant | Theater | Coffee Shop | Dance Studio | Creperie | Cosmetics Shop |
| 2 | Amboli | Park | Fast Food Restaurant | Gym | Coffee Shop | Sandwich Place | Chinese Restaurant | Indian Restaurant | Dessert Shop | Electronics Store | Donut Shop |
| 3 | Amrut Nagar | Indian Restaurant | Café | Electronics Store | Fast Food Restaurant | Restaurant | Asian Restaurant | Falafel Restaurant | Bookstore | Bowling Alley | Brewery |
| 4 | Asalfa | Park | Men's Store | Hostel | Playground | Food Truck | Dhaba | Event Space | Electronics Store | Donut Shop | Dog Run |
| 5 | Ballard Estate | Harbor / Marina | Convenience Store | Hotel | Indian Restaurant | Grocery Store | Dhaba | Falafel Restaurant | Event Space | Electronics Store | Donut Shop |
| 6 | Bandstand Promenade | Scenic Lookout | Gym | Indian Restaurant | Fast Food Restaurant | Lounge | Beach | Café | Food Truck | Chinese Restaurant | Italian Restaurant |
| 7 | Bangur Nagar | Park | Food Truck | Smoke Shop | Multiplex | Cosmetics Shop | Convenience Store | Event Space | Concert Hall | Electronics Store | Donut Shop |
| 8 | Bhandup | Indian Restaurant | Multiplex | Shopping Mall | Café | Fried Chicken Joint | Sports Bar | Arcade | Pizza Place | Big Box Store | Vegetarian / Vegan Restaurant |
| 9 | Bhayandar | Shipping Store | Women's Store | Dhaba | Falafel Restaurant | Event Space | Electronics Store | Donut Shop | Dog Run | Diner | Dim Sum Restaurant |
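
Neighborhoods with few venues still receive ten entries in this table, because categories with zero frequency fill the remaining slots (Bhayandar, with a single venue, is followed by nine such fillers). A hedged sketch of a variant that keeps only the categories that actually occur; the function name and the toy row are illustrative.

```python
import pandas as pd

def top_venues_nonzero(row, num_top_venues):
    """Return up to num_top_venues categories with frequency > 0, most frequent first."""
    freqs = row.iloc[1:].astype(float)  # skip the leading 'Neighborhood' entry
    # A stable sort keeps tied categories in their original column order
    freqs = freqs[freqs > 0].sort_values(ascending=False, kind='mergesort')
    return list(freqs.index[:num_top_venues])

# Toy row shaped like one row of mumbai_grouped
row = pd.Series({'Neighborhood': 'Toy Town', 'Bakery': 0.0, 'Coffee Shop': 0.5,
                 'Gym': 0.3, 'Park': 0.2, 'Zoo': 0.0})
print(top_venues_nonzero(row, 10))
```

The result can be shorter than `num_top_venues`, so a caller filling a fixed-width table would need to pad it, for example with empty strings.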


### Machine Learning algorithm
### K-means Clustering

#### Use K-means clustering method to cluster the Neighborhood

In [64]:
kclusters=4
mumbai_clustering=mumbai_grouped.drop('Neighborhood', axis=1)  # keep only the numeric frequency columns
kmeans=KMeans(n_clusters=kclusters, random_state=0).fit(mumbai_clustering)
kmeans.labels_[0:10]

array([2, 1, 1, 1, 1, 2, 1, 1, 1, 3], dtype=int32)
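
The value `kclusters=4` is fixed by hand. One common heuristic for choosing it is the elbow method: fit K-means for several values of k and look for the point where the inertia (within-cluster sum of squared distances) stops dropping sharply. The sketch below runs on synthetic data, not on `mumbai_clustering`, so its numbers say nothing about Mumbai.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.RandomState(0)
# Synthetic stand-in: three well-separated blobs of 30 points each
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.randn(30, 2) for c in centers])

inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, random_state=0, n_init=10).fit(X)
    inertias[k] = model.inertia_  # within-cluster sum of squared distances

for k, value in inertias.items():
    print(k, round(value, 2))
```

On this toy data the inertia falls steeply up to k=3 and only marginally afterwards, which is the "elbow" one would look for on the real frequency matrix.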

#### Get a new dataframe which includes the cluster labels and Top 10 venues of each Neighborhood

In [68]:
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
mumbai_df=df
mumbai_df=mumbai_df.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
mumbai_df

| | Neighborhood | Borough | Latitude | Longitude | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Amboli | Andheri,Western Suburbs | 19.1293 | 72.8434 | 1.0 | Park | Fast Food Restaurant | Gym | Coffee Shop | Sandwich Place | Chinese Restaurant | Indian Restaurant | Dessert Shop | Electronics Store | Donut Shop |
| 1 | Chakala, Andheri | Western Suburbs | 19.111388 | 72.860833 | 1.0 | Café | Fast Food Restaurant | Restaurant | Hotel | Multiplex | Indian Restaurant | Falafel Restaurant | Salon / Barbershop | Diner | Asian Restaurant |
| 2 | D.N. Nagar | Andheri,Western Suburbs | 19.124085 | 72.831373 | 2.0 | Gym / Fitness Center | Indian Restaurant | Cocktail Bar | Pizza Place | Snack Place | Lounge | German Restaurant | Department Store | Electronics Store | Donut Shop |
| 3 | Four Bungalows | Andheri,Western Suburbs | 19.124714 | 72.82721 | 1.0 | Women's Store | Bar | Gym | Ice Cream Shop | Juice Bar | Fish Market | Market | Electronics Store | Pizza Place | Residential Building (Apartment / Condo) |
| 4 | Lokhandwala | Andheri,Western Suburbs | 19.130815 | 72.82927 | 1.0 | Pub | Lounge | Women's Store | Indian Restaurant | Coffee Shop | Cocktail Bar | Pizza Place | Department Store | Residential Building (Apartment / Condo) | Market |
| 5 | Marol | Andheri,Western Suburbs | 19.119219 | 72.882743 | 2.0 | Indian Restaurant | Snack Place | Coffee Shop | Ice Cream Shop | Diner | Restaurant | Food | Bakery | Asian Restaurant | Convenience Store |
| 6 | Sahar | Andheri,Western Suburbs | 19.098889 | 72.867222 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 7 | Seven Bungalows | Andheri,Western Suburbs | 19.129052 | 72.817018 | 1.0 | Café | Indian Restaurant | Pub | Ice Cream Shop | Chinese Restaurant | Bar | Seafood Restaurant | Restaurant | Coffee Shop | Creperie |
| 8 | Versova | Andheri,Western Suburbs | 19.12 | 72.82 | 1.0 | Fast Food Restaurant | Bar | Women's Store | Fish & Chips Shop | Farmers Market | Falafel Restaurant | Event Space | Electronics Store | Donut Shop | Dog Run |
| 9 | Mira Road | Mira-Bhayandar,Western Suburbs | 19.284167 | 72.871111 | 1.0 | Café | Vegetarian / Vegan Restaurant | Pizza Place | Chinese Restaurant | Dessert Shop | Falafel Restaurant | Event Space | Electronics Store | Donut Shop | Dog Run |
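
Sahar has no cluster label in the joined table, which is what a left join produces when a neighborhood is missing from `neighborhoods_venues_sorted`, presumably because no venues were returned for it. A minimal sketch with toy data showing how `merge` with `indicator=True` lists such rows, and how an inner merge avoids the NaN values (and the resulting float labels) altogether:

```python
import pandas as pd

# Toy stand-ins for df and neighborhoods_venues_sorted (values are illustrative)
areas = pd.DataFrame({'Neighborhood': ['Amboli', 'Sahar', 'Versova']})
labels = pd.DataFrame({'Neighborhood': ['Amboli', 'Versova'], 'Cluster Labels': [1, 1]})

# Left merge keeps every area; the _merge column marks rows without a match
checked = areas.merge(labels, on='Neighborhood', how='left', indicator=True)
unmatched = checked.loc[checked['_merge'] == 'left_only', 'Neighborhood'].tolist()
print(unmatched)

# Inner merge keeps only matched areas, so the labels stay integers
matched = areas.merge(labels, on='Neighborhood', how='inner')
print(matched.dtypes['Cluster Labels'])
```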


#### Drop the rows with 'NaN' values in Cluster Labels

In [76]:
mumbai_df.dropna(subset=['Cluster Labels'], axis=0, inplace=True)
mumbai_df=mumbai_df.reset_index(drop=True)
mumbai_df

| | Neighborhood | Borough | Latitude | Longitude | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Amboli | Andheri,Western Suburbs | 19.1293 | 72.8434 | 1.0 | Park | Fast Food Restaurant | Gym | Coffee Shop | Sandwich Place | Chinese Restaurant | Indian Restaurant | Dessert Shop | Electronics Store | Donut Shop |
| 1 | Chakala, Andheri | Western Suburbs | 19.111388 | 72.860833 | 1.0 | Café | Fast Food Restaurant | Restaurant | Hotel | Multiplex | Indian Restaurant | Falafel Restaurant | Salon / Barbershop | Diner | Asian Restaurant |
| 2 | D.N. Nagar | Andheri,Western Suburbs | 19.124085 | 72.831373 | 2.0 | Gym / Fitness Center | Indian Restaurant | Cocktail Bar | Pizza Place | Snack Place | Lounge | German Restaurant | Department Store | Electronics Store | Donut Shop |
| 3 | Four Bungalows | Andheri,Western Suburbs | 19.124714 | 72.82721 | 1.0 | Women's Store | Bar | Gym | Ice Cream Shop | Juice Bar | Fish Market | Market | Electronics Store | Pizza Place | Residential Building (Apartment / Condo) |
| 4 | Lokhandwala | Andheri,Western Suburbs | 19.130815 | 72.82927 | 1.0 | Pub | Lounge | Women's Store | Indian Restaurant | Coffee Shop | Cocktail Bar | Pizza Place | Department Store | Residential Building (Apartment / Condo) | Market |
| 5 | Marol | Andheri,Western Suburbs | 19.119219 | 72.882743 | 2.0 | Indian Restaurant | Snack Place | Coffee Shop | Ice Cream Shop | Diner | Restaurant | Food | Bakery | Asian Restaurant | Convenience Store |
| 6 | Seven Bungalows | Andheri,Western Suburbs | 19.129052 | 72.817018 | 1.0 | Café | Indian Restaurant | Pub | Ice Cream Shop | Chinese Restaurant | Bar | Seafood Restaurant | Restaurant | Coffee Shop | Creperie |
| 7 | Versova | Andheri,Western Suburbs | 19.12 | 72.82 | 1.0 | Fast Food Restaurant | Bar | Women's Store | Fish & Chips Shop | Farmers Market | Falafel Restaurant | Event Space | Electronics Store | Donut Shop | Dog Run |
| 8 | Mira Road | Mira-Bhayandar,Western Suburbs | 19.284167 | 72.871111 | 1.0 | Café | Vegetarian / Vegan Restaurant | Pizza Place | Chinese Restaurant | Dessert Shop | Falafel Restaurant | Event Space | Electronics Store | Donut Shop | Dog Run |
| 9 | Bhayandar | Mira-Bhayandar,Western Suburbs | 19.29 | 72.85 | 3.0 | Shipping Store | Women's Store | Dhaba | Falafel Restaurant | Event Space | Electronics Store | Donut Shop | Dog Run | Diner | Dim Sum Restaurant |


#### Convert the Cluster Labels from 'float' type to 'integer' type

In [77]:
mumbai_df['Cluster Labels']=mumbai_df['Cluster Labels'].astype("int")
mumbai_df.dtypes

Neighborhood               object
Borough                    object
Latitude                  float64
Longitude                 float64
Cluster Labels              int64
1st Most Common Venue      object
2nd Most Common Venue      object
3rd Most Common Venue      object
4th Most Common Venue      object
5th Most Common Venue      object
6th Most Common Venue      object
7th Most Common Venue      object
8th Most Common Venue      object
9th Most Common Venue      object
10th Most Common Venue     object
dtype: object

### Data Visualization

#### Cluster Visualization using Folium

In [78]:
map_mumbai2=folium.Map(location=[Latitude, Longitude], zoom_start=11)
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]
markers_colors = []
for lat, lon, poi, cluster in zip(mumbai_df['Latitude'], mumbai_df['Longitude'], mumbai_df['Neighborhood'], mumbai_df['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker( [lat, lon], radius=5, popup=label, color=rainbow[cluster-1], fill=True, fill_color=rainbow[cluster-1], fill_opacity=0.7).add_to(map_mumbai2)

map_mumbai2
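
In the cell above only the length of `ys` is used, and `rainbow[cluster-1]` relies on negative indexing to color cluster 0. A simpler sketch builds one hex color per cluster and indexes it directly by label; it assigns the same set of colors in a different order. The variable name `palette` is illustrative.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 4  # same number of clusters as in the notebook

# One evenly spaced rainbow color per cluster, converted to hex strings for Folium
palette = [colors.rgb2hex(rgba) for rgba in cm.rainbow(np.linspace(0, 1, kclusters))]

for cluster in range(kclusters):
    # palette[cluster] would be passed as color= and fill_color= to folium.CircleMarker
    print(cluster, palette[cluster])
```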

# Results Section

### Here we see the results of our data analysis, the K-means clustering of the neighborhoods, and the data visualization.

#### Cluster 1
#### Looking at Cluster 1 (cluster label 0), we can see an abundance of restaurants of different cuisines (Indian, Dhaba, Seafood, Falafel) as well as donut shops, but a lack of beverage shops such as coffee shops or cafés.

In [79]:
mumbai_df.loc[mumbai_df['Cluster Labels'] == 0, mumbai_df.columns[[1] + list(range(5, mumbai_df.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,"Mira-Bhayandar,Western Suburbs",Beach,Playground,Indian Restaurant,Resort,Bus Station,Women's Store,Dhaba,Event Space,Electronics Store,Donut Shop
15,"Borivali (West),Western Suburbs",Resort,Seafood Restaurant,Aquarium,Indian Restaurant,Dhaba,Falafel Restaurant,Event Space,Electronics Store,Donut Shop,Dog Run
58,South Mumbai,Beach,Playground,Indian Restaurant,Resort,Bus Station,Women's Store,Dhaba,Event Space,Electronics Store,Donut Shop
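
The observation about Cluster 1 can be checked by tallying how many of its top-10 entries are food venues versus beverage venues. The sketch below uses the three rows from the table above. The keyword groups `FOOD_WORDS` and `BEVERAGE_WORDS` and the helper `venue_group` are assumptions made for illustration, not part of the original analysis.

```python
from collections import Counter

# Top-10 venue categories for the three Cluster 1 rows, copied from the table above.
cluster1_rows = [
    ["Beach", "Playground", "Indian Restaurant", "Resort", "Bus Station",
     "Women's Store", "Dhaba", "Event Space", "Electronics Store", "Donut Shop"],
    ["Resort", "Seafood Restaurant", "Aquarium", "Indian Restaurant", "Dhaba",
     "Falafel Restaurant", "Event Space", "Electronics Store", "Donut Shop", "Dog Run"],
    ["Beach", "Playground", "Indian Restaurant", "Resort", "Bus Station",
     "Women's Store", "Dhaba", "Event Space", "Electronics Store", "Donut Shop"],
]

# Assumed keyword groups (hypothetical); adjust them to the categories of interest.
FOOD_WORDS = {"restaurant", "dhaba", "diner"}
BEVERAGE_WORDS = {"coffee", "café", "cafe", "bar", "pub", "lounge", "juice", "tea"}


def venue_group(category):
    """Classify a venue category by whole-word match, so 'Barbershop' is not a 'bar'."""
    words = set(category.lower().split())
    if words & BEVERAGE_WORDS:
        return "beverage"
    if words & FOOD_WORDS:
        return "food"
    return "other"


counts = Counter(venue_group(c) for row in cluster1_rows for c in row)
print(counts["food"], counts["beverage"], counts["other"])  # prints: 8 0 22
```

With these keyword groups, 8 of the 30 entries are food venues and none are beverage venues, which is consistent with the description of Cluster 1.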


#### Cluster 2
#### This is the most extensive cluster, spread across the entire map. While the Western Suburbs have plenty of restaurants of many cuisines and beverage shops of different kinds, the Eastern Suburbs and the Harbour Suburbs appear to lack good Chinese restaurants.

In [80]:
mumbai_df.loc[mumbai_df['Cluster Labels'] == 1, mumbai_df.columns[[1] + list(range(5, mumbai_df.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Andheri,Western Suburbs",Park,Fast Food Restaurant,Gym,Coffee Shop,Sandwich Place,Chinese Restaurant,Indian Restaurant,Dessert Shop,Electronics Store,Donut Shop
1,Western Suburbs,Café,Fast Food Restaurant,Restaurant,Hotel,Multiplex,Indian Restaurant,Falafel Restaurant,Salon / Barbershop,Diner,Asian Restaurant
3,"Andheri,Western Suburbs",Women's Store,Bar,Gym,Ice Cream Shop,Juice Bar,Fish Market,Market,Electronics Store,Pizza Place,Residential Building (Apartment / Condo)
4,"Andheri,Western Suburbs",Pub,Lounge,Women's Store,Indian Restaurant,Coffee Shop,Cocktail Bar,Pizza Place,Department Store,Residential Building (Apartment / Condo),Market
6,"Andheri,Western Suburbs",Café,Indian Restaurant,Pub,Ice Cream Shop,Chinese Restaurant,Bar,Seafood Restaurant,Restaurant,Coffee Shop,Creperie
7,"Andheri,Western Suburbs",Fast Food Restaurant,Bar,Women's Store,Fish & Chips Shop,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Donut Shop,Dog Run
8,"Mira-Bhayandar,Western Suburbs",Café,Vegetarian / Vegan Restaurant,Pizza Place,Chinese Restaurant,Dessert Shop,Falafel Restaurant,Event Space,Electronics Store,Donut Shop,Dog Run
11,"Bandra,Western Suburbs",Scenic Lookout,Gym,Indian Restaurant,Fast Food Restaurant,Lounge,Beach,Café,Food Truck,Chinese Restaurant,Italian Restaurant
12,"Bandra,Western Suburbs",Indian Restaurant,Café,Bar,Bakery,Chinese Restaurant,Gourmet Shop,Snack Place,Pizza Place,Women's Store,Bookstore
13,"Bandra,Western Suburbs",Italian Restaurant,Fast Food Restaurant,Dessert Shop,Cupcake Shop,Ice Cream Shop,Bakery,Middle Eastern Restaurant,BBQ Joint,Coffee Shop,Donut Shop
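
Whether an area lacks a given category, such as Chinese restaurants, can be checked by testing if that category appears anywhere in a neighborhood's top-10 columns. The sketch below uses three rows copied from the table above. The frame `demo` is a hypothetical stand-in for the corresponding slice of `mumbai_df`.

```python
import pandas as pd

ordinals = ["1st", "2nd", "3rd", "4th", "5th", "6th", "7th", "8th", "9th", "10th"]
venue_cols = [f"{o} Most Common Venue" for o in ordinals]

# Rows 0, 1 and 8 of the Cluster 2 table, keyed by their original index.
rows = {
    0: ["Park", "Fast Food Restaurant", "Gym", "Coffee Shop", "Sandwich Place",
        "Chinese Restaurant", "Indian Restaurant", "Dessert Shop",
        "Electronics Store", "Donut Shop"],
    1: ["Café", "Fast Food Restaurant", "Restaurant", "Hotel", "Multiplex",
        "Indian Restaurant", "Falafel Restaurant", "Salon / Barbershop",
        "Diner", "Asian Restaurant"],
    8: ["Café", "Vegetarian / Vegan Restaurant", "Pizza Place", "Chinese Restaurant",
        "Dessert Shop", "Falafel Restaurant", "Event Space", "Electronics Store",
        "Donut Shop", "Dog Run"],
}
demo = pd.DataFrame.from_dict(rows, orient="index", columns=venue_cols)
demo.insert(0, "Borough", ["Andheri,Western Suburbs", "Western Suburbs",
                           "Mira-Bhayandar,Western Suburbs"])

# True for each neighborhood whose top-10 list contains the exact category.
has_chinese = demo[venue_cols].eq("Chinese Restaurant").any(axis=1)
print(demo.loc[has_chinese, "Borough"])
```

The same mask could be grouped by borough on the full frame to compare suburbs. Exact matching is used here, so related categories such as "Asian Restaurant" are not counted.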


#### Cluster 3
#### While the South Mumbai areas have many restaurants, bars and beverage shops, Andheri (in the Western Suburbs) has a shortage of coffee shops and cafés.

In [81]:
mumbai_df.loc[mumbai_df['Cluster Labels'] == 2, mumbai_df.columns[[1] + list(range(5, mumbai_df.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,"Andheri,Western Suburbs",Gym / Fitness Center,Indian Restaurant,Cocktail Bar,Pizza Place,Snack Place,Lounge,German Restaurant,Department Store,Electronics Store,Donut Shop
5,"Andheri,Western Suburbs",Indian Restaurant,Snack Place,Coffee Shop,Ice Cream Shop,Diner,Restaurant,Food,Bakery,Asian Restaurant,Convenience Store
25,"Khar,Western Suburbs",Indian Restaurant,Bar,Lounge,Pub,Fast Food Restaurant,Dessert Shop,Beer Garden,Bengali Restaurant,Restaurant,Hotel
27,"Malad,Western Suburbs",Indian Restaurant,Gym / Fitness Center,Coffee Shop,Chinese Restaurant,Hotel,Dessert Shop,Café,Bus Station,Diner,Lounge
28,"Sanctacruz,Western Suburbs",Indian Restaurant,Women's Store,Dance Studio,Clothing Store,Chinese Restaurant,Moving Target,Platform,Middle Eastern Restaurant,Lounge,Sandwich Place
36,"Ghatkopar,Eastern Suburbs",Indian Restaurant,Ice Cream Shop,Pizza Place,Snack Place,Bank,Bakery,Farmers Market,Multiplex,Arcade,Coffee Shop
38,"Mulund,Eastern Suburbs",Indian Restaurant,Restaurant,Ice Cream Shop,Bus Station,Women's Store,Dhaba,Falafel Restaurant,Event Space,Electronics Store,Donut Shop
41,"Powai,Eastern Suburbs",Indian Restaurant,Concert Hall,Event Space,Coffee Shop,Diner,Bakery,Café,Dhaba,Falafel Restaurant,Electronics Store
46,South Mumbai,Gym,Coffee Shop,Indian Restaurant,Bakery,Dim Sum Restaurant,Farmers Market,Falafel Restaurant,Event Space,Electronics Store,Donut Shop
48,South Mumbai,Indian Restaurant,Market,Fast Food Restaurant,American Restaurant,Restaurant,Food,Cheese Shop,Snack Place,Ice Cream Shop,Jewelry Store


#### Cluster 4
#### This cluster contains a single neighborhood, which has many good restaurants nearby but not enough beverage shops.

In [83]:
mumbai_df.loc[mumbai_df['Cluster Labels'] == 3, mumbai_df.columns[[1] + list(range(5, mumbai_df.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
9,"Mira-Bhayandar,Western Suburbs",Shipping Store,Women's Store,Dhaba,Falafel Restaurant,Event Space,Electronics Store,Donut Shop,Dog Run,Diner,Dim Sum Restaurant


# Discussion Section

### In Cluster 1 we observe a clear shortage of beverage shops. Given that bars, pubs, coffee shops and cafés are among the most popular venues in the other clusters, it would be viable for an investor or entrepreneur to open a beverage shop of their choice here.
### Cluster 2 is the most extensive cluster and covers much of the map of Mumbai, especially central and southern Mumbai. The Western Suburbs already have many restaurants and beverage shops, so opening either there would mean facing stiff competition. The Harbour Suburbs lack Chinese restaurants, which could therefore be a good investment with much less competition. The same can be said of the Eastern Suburbs, but because foot traffic there is lower, the profit margin might not be substantial, and a new venue would face some competition from the few Chinese restaurants already there.
### Cluster 3 is concentrated in the south of Mumbai and extends a little towards the western areas. Although South Mumbai is not lacking in restaurants and coffee shops, it is the commercial side of the city, so foot traffic is high and it can still be a good market for a restaurant or beverage shop, depending on the exact location the investor chooses. The Andheri area in the Western Suburbs stands out in this cluster: coffee shops and cafés are scarce there, which offers investors a good, less competitive market.
### Cluster 4 contains only one borough. Looking at its popular venues, we can reasonably suggest that the area is short of beverage shops, so options such as a café, coffee shop, restobar or pub could all succeed with little difficulty.

# Conclusion

### Investors and entrepreneurs may find the South Mumbai region in Cluster 3 and the Harbour Suburbs in Cluster 2 to be the most financially and economically viable, because foot traffic in these areas is high and many commercial sites and offices are located there. We therefore conclude that investing in a restaurant or a beverage shop in certain parts of Mumbai can be profitable.

#### Note: Due to the COVID-19 pandemic, the FourSquare calls that return the most popular and nearby venues for the neighborhoods in each borough of Mumbai may not be 100% accurate. The results depend on the time at which a call is made and on foot traffic, which may be lower than usual because of the pandemic situation in Mumbai and around the world. I nevertheless believe the results are broadly accurate.