# Best Neighbourhood in Mumbai

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

In this project we will try to find the best neighbourhoods in Mumbai, India. Because Mumbai's road traffic is very heavy, we look for areas that have good local train connectivity and good amenities such as restaurants, gyms and nightclubs.

This report is aimed at stakeholders interested in buying a new home in an area where they can save travelling time, and at real estate agents and builders looking for the best places to build new housing societies.

We will use our data science powers to generate a shortlist of the most promising neighbourhoods based on these criteria. The advantages of each area will then be clearly expressed so that stakeholders can choose the best possible final location.

## Data <a name="data"></a>

Based on the definition of our problem, the factors that will influence our decision are:

* major train stations in Mumbai, i.e. stations served by several lines where fast trains halt
* the neighbourhoods within ~5 km of each major station
* the number of restaurants, gyms, nightclubs etc. in each neighbourhood

The following data sources will be needed to extract/generate the required information:

* the **Wikipedia page** listing Mumbai Suburban Railway stations
* the coordinates of each station, obtained using **Google Maps API geocoding**
* centers of candidate areas, generated algorithmically, with approximate addresses of those centers obtained using **Google Maps API reverse geocoding**
* the number of restaurants, gyms, nightclubs etc. in every neighbourhood, obtained using the **Foursquare API**

### Train stations

In [1]:
from bs4 import BeautifulSoup
import pandas as pd
import numpy as np
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors
import re

In [2]:
!wget -q -O 'List_of_Mumbai_stations.html' https://en.wikipedia.org/wiki/List_of_Mumbai_Suburban_Railway_stations
print('Data downloaded!')    

Data downloaded!


Find all stations in Mumbai and the train lines running through them.

In [3]:
rows = []

with open("List_of_Mumbai_stations.html", encoding="utf-8") as fp:
    soup = BeautifulSoup(fp, 'html.parser')
table_str = soup.find('div', class_='mw-content-ltr')

for i in table_str.find_all('tr'):
    index_cell = i.find('td', style="text-align:center;")
    if index_cell:
        # first value is the row index, followed by the text of every link in the row
        values = [index_cell.text]
        for j in i.find_all('a'):
            values.append(j.text)
        value_list = [values[0], values[1], ','.join(values[2:])]
        # the row's image marks the fast-train halt: no image or a "☒" means no halt
        k = i.find('img')
        value_list.append(k is not None and 'alt="☒"' not in str(k))
        rows.append(value_list)

# DataFrame.append was removed in pandas 2.0, so build the frame from the collected rows
wiki_df = pd.DataFrame(rows, columns=['index', 'station', 'train lines', 'Fast train stop'])
wiki_df.set_index('index', drop=True, inplace=True)
wiki_df

index,station,train lines,Fast train stop
01,Airoli,Trans-Harbour Line,False
02,Ambarnath,Central Line,True
03,Ambivli,Central Line,False
04,Andheri,"Western Line,Harbour Line,Line 1 (Mumbai Metro)",True
05,Asangaon,Central Line,True
06,Atgaon,Central Line,False
07,Badlapur,Central Line,True
08,Baman Dongari,Nerul–Uran line,False
09,Bandra,"Western Line,Harbour Line",True
10,Bhandup,Central Line,True


As decided by the stakeholders, we will select stations that have two or more train lines running through them and where fast trains also halt.

In [4]:
# Stations where fast trains stop
station_df = wiki_df.loc[wiki_df['Fast train stop'] == True]
# Stations with 2 or more train lines
station_df = station_df[station_df['train lines'].str.contains(',')]
station_df

index,station,train lines,Fast train stop
4,Andheri,"Western Line,Harbour Line,Line 1 (Mumbai Metro)",True
9,Bandra,"Western Line,Harbour Line",True
15,Borivali,"Western Line,Harbour Line",True
20,Chhatrapati Shivaji Maharaj Terminus,"Central Line,Harbour Line",True
26,Dadar Central,"Dadar Western,Central Line,Western Line",True
30,Diva,"Central Line,Vasai Road,Panvel",True
38,Ghatkopar,"Central Line,Line 1 (Mumbai Metro)",True
39,Goregaon,"Western Line,Harbour Line",True
67,Kurla,"Central Line,Harbour Line",True
109,Thane,"Central Line,Trans-Harbour Line",True


These are the stations with two or more train lines where fast trains halt.
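The selection rule can also be written without pandas. The following is a minimal sketch using a few rows copied from the table above; the helper name `is_major` and its `min_lines` parameter are illustrative assumptions, not part of the notebook.

```python
# A few rows copied from the station table above: (station, train lines, fast train stop)
stations = [
    ('Airoli', 'Trans-Harbour Line', False),
    ('Andheri', 'Western Line,Harbour Line,Line 1 (Mumbai Metro)', True),
    ('Badlapur', 'Central Line', True),
    ('Bandra', 'Western Line,Harbour Line', True),
]

def is_major(lines, fast_stop, min_lines=2):
    """Hypothetical helper: fast-train halt served by at least `min_lines` lines."""
    return fast_stop and len(lines.split(',')) >= min_lines

major = [name for name, lines, fast in stations if is_major(lines, fast)]
print(major)  # ['Andheri', 'Bandra']
```

Counting the comma-separated entries makes the threshold explicit, whereas the notebook's `str.contains(',')` check only distinguishes one line from more than one.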

### Neighborhood Candidates

In [5]:
# The code was removed by Watson Studio for sharing.
google_api_key = "YOUR_GOOGLE_MAPS_API_KEY"  # placeholder; never publish a real key

In [6]:
import requests

def get_coordinates(api_key, address, verbose=False):
    """Geocode an address; return [lat, lon], or [None, None] if the lookup fails."""
    try:
        url = 'https://maps.googleapis.com/maps/api/geocode/json'
        # let requests build and percent-encode the query string
        response = requests.get(url, params={'key': api_key, 'address': address}).json()
        if verbose:
            print('Google Maps API JSON result =>', response)
        results = response['results']
        geographical_data = results[0]['geometry']['location'] # get geographical coordinates
        lat = geographical_data['lat']
        lon = geographical_data['lng']
        return [lat, lon]
    except Exception:
        return [None, None]
mumbai_coordinate = get_coordinates(google_api_key, 'Mumbai, India') 

dict_coordinate = {}    
for i in station_df['station']:
    address = '%s, Mumbai, India' % (i)
    address_val = get_coordinates(google_api_key, address)
    dict_coordinate.update({address:address_val})
print(dict_coordinate)   

{'Andheri, Mumbai, India': [19.113645, 72.8697339], 'Bandra, Mumbai, India': [19.0595596, 72.8295287], 'Borivali, Mumbai, India': [19.2307329, 72.856673], ' Chhatrapati Shivaji Maharaj Terminus, Mumbai, India': [18.9398446, 72.8354475], 'Dadar Central, Mumbai, India': [19.018769, 72.84317659999999], 'Diva, Mumbai, India': [19.1869395, 73.0477775], 'Ghatkopar, Mumbai, India': [19.0790239, 72.9080122], 'Goregaon, Mumbai, India': [19.1662566, 72.8525696], 'Kurla, Mumbai, India': [19.0726295, 72.8844721], 'Thane, Mumbai, India': [19.2183307, 72.9780897]}
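Station addresses contain spaces and commas, so it can help to see what a properly encoded request looks like. Below is a minimal sketch, using only the standard library, of how such a geocoding URL could be assembled; `build_geocode_url` and the dummy key are illustrative assumptions, and no request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

GEOCODE_ENDPOINT = 'https://maps.googleapis.com/maps/api/geocode/json'

def build_geocode_url(api_key, address):
    """Hypothetical helper: return the request URL with a percent-encoded query string."""
    query = urlencode({'key': api_key, 'address': address})
    return '{}?{}'.format(GEOCODE_ENDPOINT, query)

url = build_geocode_url('DUMMY_KEY', 'Dadar Central, Mumbai, India')
print(url)

# Decoding the query string gives back the original address unchanged.
decoded_address = parse_qs(urlsplit(url).query)['address'][0]
print(decoded_address)
```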


Now let's create a grid of candidate areas within ~5 km of each station.
To calculate distances accurately we need to build our grid of locations in a 2D Cartesian coordinate system, which lets us measure distances in meters (not in latitude/longitude degrees). We then project those coordinates back to latitude/longitude degrees to show them on a Folium map. So let's create functions to convert between the WGS84 spherical coordinate system (latitude/longitude degrees) and the UTM Cartesian coordinate system (X/Y coordinates in meters).
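To illustrate why a metric coordinate system is needed, here is a minimal sketch that converts degrees to meters with a local flat-Earth (equirectangular) approximation around a reference point. This is a simpler stand-in for the UTM projection used in the notebook, not the same method; the function names and the mean Earth radius are assumptions made for the example.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an approximation

def lonlat_to_local_xy(lon, lat, ref_lon, ref_lat):
    """Meters east/north of the reference point (valid for short distances only)."""
    x = math.radians(lon - ref_lon) * EARTH_RADIUS_M * math.cos(math.radians(ref_lat))
    y = math.radians(lat - ref_lat) * EARTH_RADIUS_M
    return x, y

def local_xy_to_lonlat(x, y, ref_lon, ref_lat):
    """Inverse of lonlat_to_local_xy."""
    lon = ref_lon + math.degrees(x / (EARTH_RADIUS_M * math.cos(math.radians(ref_lat))))
    lat = ref_lat + math.degrees(y / EARTH_RADIUS_M)
    return lon, lat

# (lon, lat) of Andheri and Bandra, taken from the geocoding output above
andheri = (72.8697339, 19.113645)
bandra = (72.8295287, 19.0595596)

x, y = lonlat_to_local_xy(bandra[0], bandra[1], andheri[0], andheri[1])
distance_m = math.hypot(x, y)
print(round(distance_m), 'm between Andheri and Bandra (approximate)')
```

One degree of longitude is shorter than one degree of latitude at Mumbai's latitude, which is why the east-west offset is scaled by the cosine of the latitude before distances are computed.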

In [7]:
#!pip install shapely
#import shapely.geometry

!pip install pyproj
import pyproj

import math

def lonlat_to_xy(lon, lat):
    # NOTE: zone=33 is kept to match the outputs shown below. Mumbai itself lies in
    # UTM zone 43, where the projection distortion would be much smaller.
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    xy = pyproj.transform(proj_latlon, proj_xy, lon, lat)
    return xy[0], xy[1]

def xy_to_lonlat(x, y):
    # Inverse projection; must use the same zone as lonlat_to_xy
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    return lonlat[0], lonlat[1]

def calc_xy_distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)




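Let's create a staggered grid of cells around each station: we offset every other row by half a step and adjust the vertical row spacing. In a true hexagonal grid the row-spacing factor is √3/2, which makes every cell center equally distant from all its neighbors. The code below uses `k = math.sqrt(1) / 2`, i.e. 0.5, so the candidates actually lie on a square lattice rotated by 45°, at roughly 3,536 m or 5,000 m from the station.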

In [8]:
def cal_lon_lag(longitude, latitude):
    center_x, center_y = lonlat_to_xy(longitude, latitude)  # station in Cartesian coordinates
    k = math.sqrt(1) / 2  # row-spacing factor (0.5); a true hexagonal grid would use math.sqrt(3) / 2
    
    x_min = center_x - 5000
    x_step = 5000
    y_min = center_y - 5000 - (int(11/k)*k*5000 - 10000)/2
    y_step = 5000 * k 

    for i in range(0, int(11/k)):
        y = y_min + i * y_step
        
        x_offset = 2500 if i%2==0 else 0
        
        for j in range(0, 11):
            x = x_min + j * x_step + x_offset
           
            distance_from_center = calc_xy_distance(center_x, center_y, x, y)
            
            if (distance_from_center <= 5001):
                lon, lat = xy_to_lonlat(x, y)
                if (format(lon, '.4f') == format(longitude, '.4f')) & \
                    (format(lat, '.4f') == format(latitude, '.4f')):
                    pass;
                else:
                    latitudes.append(lat)
                    longitudes.append(lon)
                    distances_from_center.append(distance_from_center)
                    xs.append(x)
                    ys.append(y)

xs = []
ys = []
distances_from_center = []
latitudes = []
longitudes = []  
for k,v in dict_coordinate.items():
    cal_lon_lag(v[1], v[0])
print(len(longitudes), 'candidate neighborhood centers generated.')    

80 candidate neighborhood centers generated.
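For comparison, here is a minimal sketch of a true hexagonal grid, where rows are spaced by √3/2 times the cell spacing. The function `hex_grid` and the spacing of 2,500 m are illustrative assumptions and differ from the notebook's grid; the sketch works in plain Cartesian meters and needs no projection library.

```python
import math

def hex_grid(center_x, center_y, spacing, radius):
    """Hexagonal-lattice points within `radius` of the center (center itself excluded)."""
    row_step = spacing * math.sqrt(3) / 2   # vertical distance between rows
    n_rows = int(radius / row_step) + 1
    n_cols = int(radius / spacing) + 1
    points = []
    for i in range(-n_rows, n_rows + 1):
        y = center_y + i * row_step
        x_offset = spacing / 2 if i % 2 else 0.0   # shift odd rows by half a step
        for j in range(-n_cols - 1, n_cols + 1):
            x = center_x + j * spacing + x_offset
            d = math.hypot(x - center_x, y - center_y)
            if 0 < d <= radius + 1e-6:             # small tolerance for rounding
                points.append((x, y))
    return points

pts = hex_grid(0.0, 0.0, spacing=2500.0, radius=5000.0)
nearest = min(math.hypot(x, y) for x, y in pts)
print(len(pts), 'points; nearest is', round(nearest), 'm from the center')
```

With this spacing each point has six neighbors at exactly the same distance, which is the property the notebook's description refers to.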


Let's visualize the data we have so far: station locations and candidate neighbourhood centers:

In [9]:
#!pip install folium
import folium

In [10]:

map_mumbai = folium.Map(location=mumbai_coordinate, zoom_start=13)
for i in range(0, len(latitudes)):
    folium.Marker([latitudes[i], longitudes[i]], popup="mumbai").add_to(map_mumbai)
for k,v  in dict_coordinate.items():
    folium.Marker(v, popup="mumbai").add_to(map_mumbai)
for lat, lon in zip(latitudes, longitudes):
    folium.Circle([lat, lon], radius=2500, color='blue', fill=False).add_to(map_mumbai)
map_mumbai

OK, we now have the coordinates of the centers of the neighbourhoods/areas to be evaluated, regularly spaced around each station and within ~5 km of it.

Let's now use Google Maps API to get approximate addresses of those locations.

In [11]:
def get_address(api_key, latitude, longitude, verbose=False):
    """Reverse-geocode a coordinate; return the formatted address, or None on failure."""
    try:
        url = 'https://maps.googleapis.com/maps/api/geocode/json'
        params = {'key': api_key, 'latlng': '{},{}'.format(latitude, longitude)}
        response = requests.get(url, params=params).json()
        if verbose:
            print('Google Maps API JSON result =>', response)
        results = response['results']
        address = results[0]['formatted_address']
        return address
    except Exception:
        return None
    
    

In [12]:
print('Obtaining location addresses: ', end='')
addresses = []
for lat, lon in zip(latitudes, longitudes):
    address = get_address(google_api_key, lat, lon)
    if address is None:
        address = 'NO ADDRESS'
    address = address.replace(', Mumbai', '') # We don't need the city part of the address
    addresses.append(address)
    print(' .', end='')
print(' done.')

Obtaining location addresses:  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . done.


In [13]:
addresses

['Vile Parle East, Vile Parle, Maharashtra, India',
 'C 107,Neeta co. Op. SoC. Tajpalschemerd. No5, Vile Parle East, Maharashtra 400057, India',
 'Navpada, Chhatrapati Shivaji International Airport Area, Vile Parle, Maharashtra 400099, India',
 '703, Bhardawadi Rd, ICICI Colony, Amboli, Andheri West, Maharashtra 400058, India',
 'Unnamed Road, Pereira Wadi, Asalpha, Maharashtra 400072, India',
 '400060, Sarvodaya Nagar, MHADA Colony, Indira Nagar, Jogeshwari East, Maharashtra 400093, India',
 'D-83, Muranjan Wadi, Marol, Andheri East, Maharashtra 400072, India',
 'Maroshi Naka, Unit 25, Aarey Colony, Goregaon, Maharashtra 400065, India',
 'Centre Cable of Bandra Worli Sea Link, Bandra - Worli Sea Link, 400050, India',
 'India',
 '2, HMA Faqir St, The Mahim Makarand CHS, Mahim West, Mahim, Maharashtra 400016, India',
 'India',
 '184/B, Sant Rohidas Marg, Dharavi koliwada, Dharavi Village, Dharavi, Maharashtra 400017, India',
 '446, Govind Patil Rd, Patil Pada, Danda Village, Khar Danda,

Looking good. Let's now place all this into a Pandas dataframe.

In [41]:
import pandas as pd


df_locations = pd.DataFrame({'Address': addresses,
                             'Latitude': latitudes, 
                             'Longitude': longitudes,
                             'X': xs,
                             'Y': ys,
                             'Distance from center': distances_from_center})
df_locations.drop(df_locations.loc[df_locations['Address'] == 'India'].index, inplace = True)
mumbai_loc = df_locations['Address'].tolist()
val = [location for location in mumbai_loc if "Mumbai, Maharashtra" in location]
df_locations.drop(df_locations.loc[df_locations['Address'].isin(val)].index, inplace = True)
df_locations.shape
        
        


(73, 6)

...and let's now save/persist this data into local file.

In [42]:
df_locations.to_pickle('./locations.pkl')

### Foursquare
Now that we have our location candidates, let's use the Foursquare API to get information on the venues in each neighbourhood.
We will look for neighbourhoods that have good access to restaurants, snack shops, nightlife, gyms and shopping.


In [43]:
# The code was removed by Watson Studio for sharing.
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # placeholder; never publish real credentials
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # placeholder
VERSION = '20180605' # Foursquare API version
LIMIT=1000

In [44]:
def getNearbyVenues(names, latitudes, longitudes, radius=2000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
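The list comprehension in `getNearbyVenues` indexes `categories[0]` directly, which fails for a venue that has no category. The following is a minimal sketch of a more defensive way to flatten the response items; `extract_venue_rows` and the sample items are hypothetical stand-ins that only mirror the keys used above.

```python
def extract_venue_rows(name, lat, lng, items):
    """Hypothetical helper: flatten explore-style items, skipping venues without a category."""
    rows = []
    for item in items:
        venue = item.get('venue', {})
        categories = venue.get('categories') or []
        location = venue.get('location', {})
        if not categories:
            continue  # nothing to classify this venue by
        rows.append((name, lat, lng, venue.get('name'),
                     location.get('lat'), location.get('lng'),
                     categories[0]['name']))
    return rows

# Hand-written stand-in for the 'items' list of a real API response
sample_items = [
    {'venue': {'name': 'Cafe A', 'location': {'lat': 19.09, 'lng': 72.85},
               'categories': [{'name': 'Café'}]}},
    {'venue': {'name': 'Unlabelled', 'location': {'lat': 19.08, 'lng': 72.86},
               'categories': []}},
]

rows = extract_venue_rows('Vile Parle East', 19.089736, 72.85649, sample_items)
print(rows)
```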

Get the amenities for all the neighbourhoods

In [45]:
mumbai_venues = getNearbyVenues(names=df_locations['Address'],
                                   latitudes=df_locations['Latitude'],
                                   longitudes=df_locations['Longitude']
                                  )


In [46]:
print(mumbai_venues.shape)
mumbai_venues.head(10)

(4636, 7)


,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Vile Parle East, Vile Parle, Maharashtra, India",19.089736,72.85649,Starbucks Coffee: A Tata Alliance,19.092329,72.85605,Coffee Shop
1,"Vile Parle East, Vile Parle, Maharashtra, India",19.089736,72.85649,Taj Santacruz,19.092823,72.854601,Hotel
2,"Vile Parle East, Vile Parle, Maharashtra, India",19.089736,72.85649,Sahara Star,19.09562,72.853938,Hotel
3,"Vile Parle East, Vile Parle, Maharashtra, India",19.089736,72.85649,Tarmac,19.092039,72.859045,Airport
4,"Vile Parle East, Vile Parle, Maharashtra, India",19.089736,72.85649,Mad Over Donuts,19.092992,72.856178,Donut Shop
5,"Vile Parle East, Vile Parle, Maharashtra, India",19.089736,72.85649,Natural's Ice Cream,19.07756,72.863035,Ice Cream Shop
6,"Vile Parle East, Vile Parle, Maharashtra, India",19.089736,72.85649,Grand Hyatt,19.076832,72.85127,Hotel
7,"Vile Parle East, Vile Parle, Maharashtra, India",19.089736,72.85649,The Good Food Co.,19.098453,72.845769,Café
8,"Vile Parle East, Vile Parle, Maharashtra, India",19.089736,72.85649,Chhatrapati Shivaji International Airport,19.090509,72.865148,Airport
9,"Vile Parle East, Vile Parle, Maharashtra, India",19.089736,72.85649,Celini,19.076831,72.851734,Italian Restaurant


See how many venues were returned for each neighbourhood

In [47]:
mumbai_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"1, Narayan Nagar, Mumbra, Thane, Maharashtra 400612, India",4,4,4,4,4,4
"1, Pokharan Rd Number 1, Shastri Nagar, Vartak Nagar, Thane West, Thane, Maharashtra 400606, India",68,68,68,68,68,68
"10, Bimbisar Nagar Rd, Bimbisar Nagar, Goregaon, Maharashtra 400065, India",32,32,32,32,32,32
"12, Sakharam Lanjekar Road, Parmanand Wadi, Parel, Maharashtra 400012, India",49,49,49,49,49,49
"184/B, Sant Rohidas Marg, Dharavi koliwada, Dharavi Village, Dharavi, Maharashtra 400017, India",91,91,91,91,91,91
"1st Floor, St. Lawrence High School 90 Feet Road, Opposite Kanakia Sanskruti Thakur Complex, Asha Nagar, Kandivali East, Maharashtra 400101, India",90,90,90,90,90,90
"2, HMA Faqir St, The Mahim Makarand CHS, Mahim West, Mahim, Maharashtra 400016, India",100,100,100,100,100,100
"209, Dinshaw Vacha Rd, Churchgate, Maharashtra 400020, India",100,100,100,100,100,100
"25, Prakash Thorat Marg, Gaikwad nagar, Budha Nagar, Chedda Nagar, Maharashtra 400071, India",83,83,83,83,83,83
"329, Subhash Nagar, Mohili, Asalpha, Maharashtra 400084, India",85,85,85,85,85,85


There are multiple categories for each amenity, so let's merge them for convenience

In [48]:
print('There are {} uniques categories.'.format(len(mumbai_venues['Venue Category'].unique())))
temp = mumbai_venues['Venue Category'].unique()
resto = 'Restaurant'
gym = 'Gym'
snacks = ['Coffee Shop', 'Café', 'Ice Cream Shop', 'Dessert Shop', 'Bakery', 'Salad Place', 'Cupcake Shop', 'Snack Place', 
          'Pizza Place', 'Bagel Shop', 'Irani Cafe', 'Donut Shop', 'Food & Drink Shop', 'Food Court', 'Bed & Breakfast', 
          'Tea Room', 'Fried Chicken Joint', 'Frozen Yogurt Shop', 'Food', 'Burger Joint', 'Burrito Place', 'Chaat Place', 
          'Noodle House', 'Sandwich Place', 'Breakfast Spot', 'Deli / Bodega', 'Juice Bar', 'Cafeteria', 'Bistro']
Night_life = ['Hotel', 'Bar', 'Lounge', 'Cocktail Bar', 'Beer Garden', 'BBQ Joint', 'Diner', 'Hotel Bar', 'Whisky Bar',
              'Beer Bar', 'Dhaba', 'Buffet', 'Wine Bar', 'Nightclub', 'Gastropub', 'Steakhouse', 'Hookah Bar', 'Pub']
Shopping = ["Women's Store", 'Toy / Game Store', 'Clothing Store', 'Grocery Store', 
            'Movie Theater', 'Convenience Store', 'Food Truck', 'Shopping Mall', 'Camera Store',
            'Electronics Store', 'Department Store', 'Gift Shop', 'Flea Market', 'Market', 'Jewelry Store',
            'Cheese Shop', "Men's Store", 'Flower Shop',  'Supermarket', 'Sporting Goods Shop']

res = []
exp = []
snacks_type = []
night_life_type = []
shopping_type =[]
gym_type = []
for i in temp:
    if(resto in i):
        res.append(i)
    elif(i in snacks):
        snacks_type.append(i)
    elif(i in Night_life):
        night_life_type.append(i)
    elif(i in Shopping):
        shopping_type.append(i)
    elif(gym in i):
        gym_type.append(i)
    else:
        exp.append(i)
        

There are 222 uniques categories.
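The if/elif chain above can also be expressed as a single mapping function, which makes the precedence of the rules easy to test. This is a minimal sketch with heavily abbreviated category sets; `bucket_for` and the sample list are illustrative assumptions, while the bucket names match the columns created later in the notebook.

```python
from collections import Counter

# Abbreviated versions of the lists above, for illustration only
SNACKS = {'Coffee Shop', 'Café', 'Bakery'}
NIGHT_LIFE = {'Bar', 'Lounge', 'Nightclub', 'Pub'}
SHOPPING = {'Clothing Store', 'Shopping Mall', 'Market'}

def bucket_for(category):
    """Map a raw venue category to its merged bucket, or return it unchanged."""
    if 'Restaurant' in category:
        return 'All Restaurant'
    if category in SNACKS:
        return 'Snacks'
    if category in NIGHT_LIFE:
        return 'Night life'
    if category in SHOPPING:
        return 'Shopping'
    if 'Gym' in category:
        return 'All_Gym'
    return category

# Hypothetical sample of raw categories
sample = ['Indian Restaurant', 'Café', 'Pub', 'Italian Restaurant',
          'Gym / Fitness Center', 'Park']
counts = Counter(bucket_for(c) for c in sample)
print(counts)
```

The rules are checked in the same order as in the notebook: the substring test for restaurants comes first and the substring test for gyms comes last.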


Merge the columns by category

In [49]:
def addition(columns):
    # row-wise sum of the given one-hot columns: 1 where the venue falls in any of them
    return mumbai_onehot[columns].sum(axis=1)

# one hot encoding
mumbai_onehot = pd.get_dummies(mumbai_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
mumbai_onehot['Neighborhood'] = mumbai_venues['Neighborhood']
# move neighborhood column to the first column, looked up by name rather than a hard-coded position
fixed_columns = ['Neighborhood'] + [c for c in mumbai_onehot.columns if c != 'Neighborhood']
mumbai_onehot = mumbai_onehot[fixed_columns]
print(mumbai_onehot.shape)
mumbai_onehot['All Restaurant'] = addition(res)
mumbai_onehot['Snacks'] = addition(snacks_type)
mumbai_onehot['Night life'] = addition(night_life_type)
mumbai_onehot['Shopping'] = addition(shopping_type)
mumbai_onehot['All_Gym'] = addition(gym_type)
mumbai_onehot.drop(res + snacks_type + night_life_type + shopping_type + gym_type, axis = 1, inplace = True)
print(mumbai_onehot.shape)

(4636, 222)
(4636, 113)


In [50]:
mumbai_grouped = mumbai_onehot.groupby('Neighborhood').mean().reset_index()
mumbai_grouped

,Neighborhood,ATM,Accessories Store,Airport,Airport Lounge,Airport Service,Airport Terminal,Antique Shop,Arcade,Art Gallery,...,Track,Trail,Train,Train Station,Water Park,All Restaurant,Snacks,Night life,Shopping,All_Gym
0,"1, Narayan Nagar, Mumbra, Thane, Maharashtra 4...",0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.00,0.000000,0.000000,...,0.000000,0.000000,0.000000,0.250000,0.000000,0.500000,0.250000,0.000000,0.000000,0.000000
1,"1, Pokharan Rd Number 1, Shastri Nagar, Vartak...",0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.00,0.000000,0.000000,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.397059,0.235294,0.132353,0.132353,0.000000
2,"10, Bimbisar Nagar Rd, Bimbisar Nagar, Goregao...",0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.00,0.000000,0.000000,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.343750,0.281250,0.031250,0.031250,0.000000
3,"12, Sakharam Lanjekar Road, Parmanand Wadi, Pa...",0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.00,0.000000,0.000000,...,0.000000,0.000000,0.000000,0.040816,0.000000,0.306122,0.408163,0.081633,0.061224,0.020408
4,"184/B, Sant Rohidas Marg, Dharavi koliwada, Dh...",0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.00,0.010989,0.000000,...,0.000000,0.000000,0.010989,0.032967,0.000000,0.483516,0.219780,0.065934,0.043956,0.021978
5,"1st Floor, St. Lawrence High School 90 Feet Ro...",0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.00,0.000000,0.000000,...,0.000000,0.000000,0.000000,0.011111,0.000000,0.266667,0.300000,0.055556,0.155556,0.033333
6,"2, HMA Faqir St, The Mahim Makarand CHS, Mahim...",0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.00,0.010000,0.000000,...,0.010000,0.000000,0.000000,0.000000,0.000000,0.330000,0.360000,0.110000,0.020000,0.010000
7,"209, Dinshaw Vacha Rd, Churchgate, Maharashtra...",0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.00,0.000000,0.010000,...,0.000000,0.000000,0.000000,0.010000,0.000000,0.330000,0.260000,0.110000,0.050000,0.010000
8,"25, Prakash Thorat Marg, Gaikwad nagar, Budha ...",0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.00,0.000000,0.000000,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.409639,0.265060,0.072289,0.072289,0.048193
9,"329, Subhash Nagar, Mohili, Asalpha, Maharasht...",0.00,0.000000,0.000000,0.000000,0.00,0.000000,0.00,0.000000,0.000000,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.388235,0.200000,0.152941,0.105882,0.011765


In [51]:
mumbai_grouped.shape

(72, 113)

Let's see the top 5 amenities for each neighbourhood

In [52]:
num_top_venues = 5

for hood in mumbai_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = mumbai_grouped[mumbai_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----1, Narayan Nagar, Mumbra, Thane, Maharashtra 400612, India----
                   venue  freq
0         All Restaurant  0.50
1                 Snacks  0.25
2          Train Station  0.25
3                    ATM  0.00
4  Outdoors & Recreation  0.00


----1, Pokharan Rd Number 1, Shastri Nagar, Vartak Nagar, Thane West, Thane, Maharashtra 400606, India----
            venue  freq
0  All Restaurant  0.40
1          Snacks  0.24
2        Shopping  0.13
3      Night life  0.13
4       Multiplex  0.03


----10, Bimbisar Nagar Rd, Bimbisar Nagar, Goregaon, Maharashtra 400065, India----
                 venue  freq
0       All Restaurant  0.34
1               Snacks  0.28
2        Design Studio  0.03
3  Indie Movie Theater  0.03
4           Smoke Shop  0.03


----12, Sakharam Lanjekar Road, Parmanand Wadi, Parel, Maharashtra 400012, India----
            venue  freq
0          Snacks  0.41
1  All Restaurant  0.31
2      Night life  0.08
3        Shopping  0.06
4           Plaza  0.04


--

                        venue  freq
0                    Shopping  0.50
1              All Restaurant  0.25
2  Tourist Information Center  0.25
3                         ATM  0.00
4       Outdoors & Recreation  0.00


----Ghartan Pada / Chinchani Number 1, Diamond Industrial Estate, Dahisar East, Maharashtra 400068, India----
                          venue  freq
0                All Restaurant  0.30
1                        Snacks  0.23
2                    Night life  0.17
3                      Shopping  0.13
4  General College & University  0.03


----Government Colony Building No 3, JL Shirshekar Marg, Government Colony, Bandra East, Maharashtra 400051, India----
            venue  freq
0  All Restaurant  0.36
1          Snacks  0.29
2      Night life  0.22
3         All_Gym  0.04
4        Shopping  0.02


----Hashimpremji, Shop No-2, Navpada, Vile Parle East, Vile Parle, Maharashtra 400099, India----
             venue  freq
0   All Restaurant  0.38
1       Night life  0.24
2    

            venue  freq
0  All Restaurant  0.32
1          Snacks  0.31
2      Night life  0.14
3        Shopping  0.09
4         Airport  0.02


----garden grove, b-1, flat no. 1301, b-wing, chikuwadi, borivali west, Maharashtra 400092, India----
            venue  freq
0  All Restaurant  0.37
1          Snacks  0.32
2        Shopping  0.07
3         All_Gym  0.05
4      Night life  0.02


----pollution response team Near mazagon Dock, Maharashtra, India----
              venue  freq
0            Snacks  0.41
1    All Restaurant  0.34
2        Night life  0.10
3  Community Center  0.03
4       Music Venue  0.03




Now we will find the most common venues

In [53]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [54]:
num_top_venues = 15

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = mumbai_grouped['Neighborhood']

for ind in np.arange(mumbai_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(mumbai_grouped.iloc[ind, :], num_top_venues)

print(len(neighborhoods_venues_sorted))
neighborhoods_venues_sorted.head(10)



72


,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
0,"1, Narayan Nagar, Mumbra, Thane, Maharashtra 4...",All Restaurant,Snacks,Train Station,All_Gym,Farm,Community Center,Convention Center,Cosmetics Shop,Coworking Space,Creperie,Cricket Ground,Dance Studio,Design Studio,Duty-free Shop,Event Space
1,"1, Pokharan Rd Number 1, Shastri Nagar, Vartak...",All Restaurant,Snacks,Night life,Shopping,Sports Bar,Multiplex,Brewery,Scenic Lookout,Soccer Stadium,Factory,Cricket Ground,Coworking Space,Dance Studio,Cosmetics Shop,Convention Center
2,"10, Bimbisar Nagar Rd, Bimbisar Nagar, Goregao...",All Restaurant,Snacks,Smoke Shop,Plaza,Night life,Indie Movie Theater,Farm,Shopping,Boutique,Bus Station,Dance Studio,Design Studio,Athletics & Sports,Multiplex,Event Space
3,"12, Sakharam Lanjekar Road, Parmanand Wadi, Pa...",Snacks,All Restaurant,Night life,Shopping,Train Station,Plaza,All_Gym,Roof Deck,Recreation Center,Airport Service,Farm,Convention Center,Cosmetics Shop,Coworking Space,Creperie
4,"184/B, Sant Rohidas Marg, Dharavi koliwada, Dh...",All Restaurant,Snacks,Night life,Shopping,Train Station,All_Gym,Garden,Arcade,Brewery,Multiplex,Bus Station,Miscellaneous Shop,Sports Club,Park,Indie Movie Theater
5,"1st Floor, St. Lawrence High School 90 Feet Ro...",Snacks,All Restaurant,Shopping,Night life,All_Gym,Park,Stadium,Multiplex,Smoke Shop,Pharmacy,Residential Building (Apartment / Condo),Sports Bar,Train Station,Bookstore,General Entertainment
6,"2, HMA Faqir St, The Mahim Makarand CHS, Mahim...",Snacks,All Restaurant,Night life,Performing Arts Venue,Gourmet Shop,Shopping,Scenic Lookout,Beach,Spa,Playground,College Auditorium,Park,Boutique,All_Gym,Track
7,"209, Dinshaw Vacha Rd, Churchgate, Maharashtra...",All Restaurant,Snacks,Night life,Shopping,Cricket Ground,Bookstore,Theater,Scenic Lookout,Boutique,College Academic Building,Performing Arts Venue,Park,All_Gym,Music Store,Beach
8,"25, Prakash Thorat Marg, Gaikwad nagar, Budha ...",All Restaurant,Snacks,Night life,Shopping,All_Gym,Park,Garden,Spa,Playground,Performing Arts Venue,Sculpture Garden,Plaza,Golf Course,General Entertainment,Fish Market
9,"329, Subhash Nagar, Mohili, Asalpha, Maharasht...",All Restaurant,Snacks,Night life,Shopping,Multiplex,Brewery,Bowling Alley,All_Gym,Bus Station,Shoe Store,Sports Bar,Bank,Platform,Theme Park,Design Studio


## Methodology <a name="methodology"></a>
We will use k-means clustering to approach our problem: we group the candidate neighbourhoods into 20 clusters and then analyse which cluster best fits our requirements.

Now cluster the neighbourhoods using k-means

In [55]:
# set number of clusters
kclusters = 20

mumbai_grouped_clustering = mumbai_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(mumbai_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:5] 

array([16, 12,  1,  1, 12], dtype=int32)
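To make the clustering step less of a black box, here is a minimal pure-Python sketch of the basic k-means loop (Lloyd's algorithm). It is not scikit-learn's implementation, which uses a different initialisation by default; the function `kmeans` and the six feature vectors (restaurant, snacks and night-life shares) are illustrative assumptions, not data from the notebook.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # start from k distinct input points
    labels = None
    for _ in range(max_iter):
        # assignment step: each point joins its nearest centroid
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break   # assignments stopped changing, so the centroids are final
        labels = new_labels
        # update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

# Hypothetical (restaurant, snacks, night life) shares for six neighbourhoods
features = [
    (0.50, 0.25, 0.00), (0.48, 0.22, 0.07), (0.41, 0.27, 0.07),   # food-heavy
    (0.10, 0.10, 0.60), (0.12, 0.08, 0.55), (0.08, 0.12, 0.62),   # nightlife-heavy
]
labels, centroids = kmeans(features, k=2)
print(labels)
```

The label numbers themselves carry no meaning; only the grouping does, which is why each cluster still has to be inspected afterwards.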

Add these cluster numbers to our dataset

In [56]:
# add clustering labels

neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

mumbai_merged = df_locations
mumbai_merged

mumbai_merged = mumbai_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Address')

mumbai_merged.head() # check the last columns!

,Address,Latitude,Longitude,X,Y,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,...,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
0,"Vile Parle East, Vile Parle, Maharashtra, India",19.089736,72.85649,7515072.0,3673569.0,5000.0,9,All Restaurant,Snacks,Night life,...,Train Station,Airport,All_Gym,Airport Lounge,Duty-free Shop,Theater,Outdoors & Recreation,Martial Arts Dojo,Arts & Crafts Store,Dance Studio
1,"C 107,Neeta co. Op. SoC. Tajpalschemerd. No5, ...",19.107987,72.850532,7512572.0,3676069.0,3535.533906,1,Snacks,All Restaurant,Night life,...,Duty-free Shop,Martial Arts Dojo,Multiplex,All_Gym,Design Studio,Factory,Event Space,Creperie,Dance Studio,Cricket Ground
2,"Navpada, Chhatrapati Shivaji International Air...",19.095397,72.875682,7517572.0,3676069.0,3535.533906,15,All Restaurant,Night life,Snacks,...,Duty-free Shop,Airport Service,Light Rail Station,Resort,Brewery,Spa,Theme Park,Pool,Airport Lounge,Bowling Alley
3,"703, Bhardawadi Rd, ICICI Colony, Amboli, Andh...",19.126246,72.844576,7510072.0,3678569.0,5000.0,9,All Restaurant,Snacks,Night life,...,Multiplex,Pharmacy,Brewery,Sports Club,Residential Building (Apartment / Condo),Accessories Store,Boutique,Event Space,Creperie,Cricket Ground
4,"Unnamed Road, Pereira Wadi, Asalpha, Maharasht...",19.101049,72.894872,7520072.0,3678569.0,5000.0,9,All Restaurant,Snacks,Night life,...,Bank,Pool,Resort,Brewery,Bowling Alley,Office,Shoe Store,Soccer Field,Multiplex,Accessories Store
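Two details of this step are easy to miss. `DataFrame.insert` raises a `ValueError` if the column already exists, so the cell fails when re-run without recreating `neighborhoods_venues_sorted`. The join is also a left join, so any location without venue data receives `NaN` and the label column becomes floating point. The sketch below shows the second effect on toy data; the frames `locations` and `venues_sorted` are hypothetical stand-ins for `df_locations` and `neighborhoods_venues_sorted`.

```python
import pandas as pd

# Hypothetical stand-ins for df_locations and neighborhoods_venues_sorted.
locations = pd.DataFrame({
    "Address": ["A Road", "B Road", "C Road"],
    "Latitude": [19.08, 19.10, 19.12],
})
venues_sorted = pd.DataFrame({
    "Cluster Labels": [1, 0],
    "Neighborhood": ["A Road", "B Road"],
    "1st Most Common Venue": ["Snacks", "All Restaurant"],
})

# Left join on the address: "C Road" has no venue row, so it gets NaN.
merged = locations.join(venues_sorted.set_index("Neighborhood"), on="Address")

# Drop unmatched rows and restore integer labels so they can index a colour list.
unmatched = merged["Cluster Labels"].isna()
merged = merged.loc[~unmatched].copy()
merged["Cluster Labels"] = merged["Cluster Labels"].astype(int)
```

Whether to drop unmatched locations or keep them under a separate "no data" label is a design choice that depends on how the map should treat areas without venues.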


Visualise the clusters

In [57]:
map_clusters = folium.Map(location=mumbai_coordinate, zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]
print(rainbow)
for lat, lon, poi, cluster in zip(mumbai_merged['Latitude'], mumbai_merged['Longitude'], mumbai_merged['Address'], mumbai_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
    
map_clusters

['#8000ff', '#6629fe', '#4c50fc', '#3079f7', '#169bf2', '#07bbea', '#20d5e1', '#3dead5', '#56f7ca', '#72febb', '#8cfead', '#a8f79c', '#c2ea8c', '#ded579', '#f9bb66', '#ff9b52', '#ff793e', '#ff5029', '#ff2914', '#ff0000']


## Analysis <a name="analysis"></a>
Now we will evaluate each cluster and see which ones best fit our requirements.

In [58]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 0, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
18,"1st Floor, St. Lawrence High School 90 Feet Ro...",3535.533906,0,Snacks,All Restaurant,Shopping,Night life,All_Gym,Park,Stadium,Multiplex,Smoke Shop,Pharmacy,Residential Building (Apartment / Condo),Sports Bar,Train Station,Bookstore,General Entertainment
20,"National Park - BMC Colony Rd, Maharashtra 400...",5000.0,0,Snacks,All Restaurant,Shopping,Night life,All_Gym,Stadium,Indie Movie Theater,General Entertainment,Lighthouse,Park,Residential Building (Apartment / Condo),Sports Bar,Historic Site,Arts & Crafts Store,Art Gallery
61,"Rolex Apartment, Underai Rd, Malad, Navy Colon...",3535.533906,0,Snacks,All Restaurant,Shopping,Night life,Multiplex,All_Gym,Bookstore,Cosmetics Shop,Brewery,Smoke Shop,Sports Bar,Gaming Cafe,Furniture / Home Store,Convention Center,Gourmet Shop
63,"Unnamed Road, Malad, Shivaji Nagar, Kurar Vill...",5000.0,0,Snacks,All Restaurant,Shopping,All_Gym,Multiplex,Night life,Bus Station,Brewery,Bookstore,Train,Playground,Event Space,Creperie,Cosmetics Shop,Cricket Ground


In [59]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 1, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
1,"C 107,Neeta co. Op. SoC. Tajpalschemerd. No5, ...",3535.533906,1,Snacks,All Restaurant,Night life,Shopping,Airport,Duty-free Shop,Martial Arts Dojo,Multiplex,All_Gym,Design Studio,Factory,Event Space,Creperie,Dance Studio,Cricket Ground
10,"2, HMA Faqir St, The Mahim Makarand CHS, Mahim...",3535.533906,1,Snacks,All Restaurant,Night life,Performing Arts Venue,Gourmet Shop,Shopping,Scenic Lookout,Beach,Spa,Playground,College Auditorium,Park,Boutique,All_Gym,Track
13,"446, Govind Patil Rd, Patil Pada, Danda Villag...",3535.533906,1,Snacks,All Restaurant,Night life,Shopping,All_Gym,Arcade,Scenic Lookout,Furniture / Home Store,Farmers Market,General Entertainment,Event Space,Salon / Barbershop,Design Studio,Dance Studio,Creperie
17,"garden grove, b-1, flat no. 1301, b-wing, chik...",3535.533906,1,All Restaurant,Snacks,Shopping,All_Gym,Night life,Park,Sports Bar,Historic Site,Miscellaneous Shop,Multiplex,Pharmacy,Playground,Bridge,Boutique,Shop & Service
21,"A Hill Crest, Holy cross Road I.C.Colony, near...",3535.533906,1,Snacks,All Restaurant,Night life,All_Gym,Shopping,Train Station,Arcade,Optical Shop,Outdoors & Recreation,Paper / Office Supplies Store,Soccer Field,Sports Bar,Bus Station,Theater,Art Gallery
29,"Sardar Vallabhbhai Patel Rd, Ajmer, Null Bazar...",3535.533906,1,All Restaurant,Snacks,Night life,Shopping,Indie Movie Theater,Harbor / Marina,Garden,Bridal Shop,Music Venue,Music Store,Multiplex,Beach,Arcade,Opera House,Antique Shop
31,"pollution response team Near mazagon Dock, Mah...",5000.0,1,Snacks,All Restaurant,Night life,Fish Market,Garden,Music Venue,Community Center,Gourmet Shop,Factory,Convention Center,Cosmetics Shop,Coworking Space,Creperie,Cricket Ground,Dance Studio
33,"515/A, Shankar Ghanekar Rd, Three View Co-Oper...",3535.533906,1,Snacks,All Restaurant,Night life,Shopping,Scenic Lookout,All_Gym,Theater,Music Venue,Office,Brewery,Athletics & Sports,Stadium,Recreation Center,Art Gallery,Arcade
34,"12, Sakharam Lanjekar Road, Parmanand Wadi, Pa...",3535.533906,1,Snacks,All Restaurant,Night life,Shopping,Train Station,Plaza,All_Gym,Roof Deck,Recreation Center,Airport Service,Farm,Convention Center,Cosmetics Shop,Coworking Space,Creperie
37,"Room No. 11, 4, B.M.C Bldg., 2nd Floor Opp. Ch...",3535.533906,1,All Restaurant,Snacks,Night life,Shopping,All_Gym,Beach,Theater,Performing Arts Venue,Boat or Ferry,Arcade,Playground,Track,Garden,Event Space,Convention Center


In [60]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 2, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
5,"400060, Sarvodaya Nagar, MHADA Colony, Indira ...",3535.533906,2,All Restaurant,Snacks,Night life,All_Gym,Garden,Dance Studio,Boutique,Music Venue,Athletics & Sports,Public Art,General College & University,Gaming Cafe,Community Center,Convention Center,Cosmetics Shop
7,"Maroshi Naka, Unit 25, Aarey Colony, Goregaon,...",5000.0,2,All Restaurant,Snacks,Night life,All_Gym,Resort,Golf Course,Dance Studio,Garden,Creperie,Duty-free Shop,Design Studio,Cricket Ground,Cosmetics Shop,Coworking Space,Factory
19,"Eskay Rd, Eksar Village, Eksar, Borivali West,...",5000.0,2,All Restaurant,Snacks,All_Gym,Night life,Theme Park,Multiplex,Playground,Sports Bar,Arcade,Bridge,Pool Hall,Water Park,Duty-free Shop,Cosmetics Shop,Coworking Space
64,"B/23, Highway Apartment., 2nd floor., Eastern ...",5000.0,2,All Restaurant,Snacks,Night life,All_Gym,Train Station,Garden,Park,Smoke Shop,Soccer Field,Bookstore,Brewery,Performing Arts Venue,Shopping,Train,Pool


In [61]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 3, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
46,"Gavdevi Road, Betwade Gaon, Thane, Maharashtra...",3535.533906,3,Shopping,All Restaurant,Tourist Information Center,Farmers Market,Convention Center,Cosmetics Shop,Coworking Space,Creperie,Cricket Ground,Dance Studio,Design Studio,Duty-free Shop,Event Space,Factory,Farm


In [62]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 4, 
        mumbai_merged.columns[[1] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Latitude,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
45,19.205171,3535.533906,4,Train Station,Shopping,Harbor / Marina,All Restaurant,Smoke Shop,Tourist Information Center,All_Gym,Cricket Ground,Duty-free Shop,Design Studio,Dance Studio,Coworking Space,Creperie,Factory,Cosmetics Shop
47,19.21072,5000.0,4,Train Station,Shopping,Harbor / Marina,All Restaurant,Smoke Shop,Tourist Information Center,All_Gym,Cricket Ground,Duty-free Shop,Design Studio,Dance Studio,Coworking Space,Creperie,Factory,Cosmetics Shop


In [63]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 5, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
36,"Unnamed Road, Pratiksha Nagar, BPCL Terminal, ...",5000.0,5,Snacks,Shopping,All_Gym,Train Station,Smoke Shop,Trail,Convention Center,Cosmetics Shop,Factory,Coworking Space,Creperie,Cricket Ground,Dance Studio,Design Studio,Duty-free Shop
76,"B5-WING, Kharegaon, Kalwa, Thane, Maharashtra ...",5000.0,5,Snacks,Shopping,Big Box Store,Moving Target,Train Station,Tennis Court,Multiplex,Farm,Factory,Event Space,Duty-free Shop,All_Gym,Design Studio,Farmers Market,Cricket Ground


In [64]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 6, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
8,"Centre Cable of Bandra Worli Sea Link, Bandra ...",5000.0,6,Snacks,Night life,All Restaurant,Scenic Lookout,All_Gym,Boutique,Stadium,Beach,Shopping,Coworking Space,Factory,Cricket Ground,Dance Studio,Design Studio,Cosmetics Shop
79,"LE ROUGE TRENDZ PVT.LTD.Ground Floor,Rounak Bu...",5000.0,6,Snacks,Night life,All_Gym,All Restaurant,Shopping,Bus Station,Cricket Ground,Event Space,Duty-free Shop,Design Studio,Dance Studio,Coworking Space,Creperie,Farm,Cosmetics Shop


In [65]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 7, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
44,"Casa Bella Magnifica Bus Stop, Casa Bella Main...",5000.0,7,All_Gym,All Restaurant,Multiplex,Shopping,Airport,Convention Center,Coworking Space,Creperie,Cricket Ground,Dance Studio,Design Studio,Duty-free Shop,Event Space,Factory,Farm


In [66]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 8, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
42,"Unnamed Road, Domkhar Gaon, Thane, Maharashtra...",3535.533906,8,All Restaurant,All_Gym,Bus Station,Farmers Market,Convention Center,Cosmetics Shop,Coworking Space,Creperie,Cricket Ground,Dance Studio,Design Studio,Duty-free Shop,Event Space,Factory,Farm


In [67]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 9, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
0,"Vile Parle East, Vile Parle, Maharashtra, India",5000.0,9,All Restaurant,Snacks,Night life,Shopping,Smoke Shop,Train Station,Airport,All_Gym,Airport Lounge,Duty-free Shop,Theater,Outdoors & Recreation,Martial Arts Dojo,Arts & Crafts Store,Dance Studio
3,"703, Bhardawadi Rd, ICICI Colony, Amboli, Andh...",5000.0,9,All Restaurant,Snacks,Night life,Shopping,All_Gym,Multiplex,Pharmacy,Brewery,Sports Club,Residential Building (Apartment / Condo),Accessories Store,Boutique,Event Space,Creperie,Cricket Ground
4,"Unnamed Road, Pereira Wadi, Asalpha, Maharasht...",5000.0,9,All Restaurant,Snacks,Night life,Shopping,Light Rail Station,Bank,Pool,Resort,Brewery,Bowling Alley,Office,Shoe Store,Soccer Field,Multiplex,Accessories Store
14,"Government Colony Building No 3, JL Shirshekar...",3535.533906,9,All Restaurant,Snacks,Night life,All_Gym,Shopping,Park,Arcade,Brewery,Bookstore,Sports Club,Indie Movie Theater,Antique Shop,Art Gallery,Creperie,Cricket Ground
15,"Mangal Prabha, Prabhat Colony Rd, Sen Nagar, S...",5000.0,9,Snacks,Night life,All Restaurant,Shopping,All_Gym,Arts & Crafts Store,Duty-free Shop,Outdoors & Recreation,Sports Bar,Airport,Golf Course,Factory,Cosmetics Shop,Coworking Space,Creperie
23,"Ghartan Pada / Chinchani Number 1, Diamond Ind...",5000.0,9,All Restaurant,Snacks,Night life,Shopping,Bus Station,Outdoors & Recreation,Train Station,General College & University,Road,Creperie,Design Studio,Dance Studio,Cricket Ground,All_Gym,Coworking Space
24,"Cuffe Parade, Cuffe Parade, Maharashtra 400005...",5000.0,9,All Restaurant,Snacks,Night life,Shopping,Spa,Theater,Art Gallery,Performing Arts Venue,Cricket Ground,Boutique,Scenic Lookout,Monument / Landmark,Stadium,All_Gym,History Museum
25,"209, Dinshaw Vacha Rd, Churchgate, Maharashtra...",3535.533906,9,All Restaurant,Snacks,Night life,Shopping,Cricket Ground,Bookstore,Theater,Scenic Lookout,Boutique,College Academic Building,Performing Arts Venue,Park,All_Gym,Music Store,Beach
26,"Unnamed Road, Fort, Maharashtra 400001, India",3535.533906,9,All Restaurant,Snacks,Night life,Shopping,Bookstore,Monument / Landmark,All_Gym,Tennis Court,Park,Other Great Outdoors,Boutique,Field,Stadium,Spa,Art Gallery
32,"Lower Parel, Lower Parel Bridge, Lower Parel, ...",5000.0,9,All Restaurant,Snacks,Night life,Shopping,Multiplex,Spa,All_Gym,Comedy Club,Roof Deck,Recreation Center,Plaza,Music Venue,Planetarium,Cosmetics Shop,Office


In [68]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 10, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
43,"Anant Niwas Rana Nagar No. 1, Reti Bunder Road...",5000.0,10,Snacks,Big Box Store,All Restaurant,Multiplex,Bookstore,All_Gym,Cosmetics Shop,Coworking Space,Creperie,Cricket Ground,Dance Studio,Design Studio,Duty-free Shop,Event Space,Factory


In [69]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 11, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
52,"Kachra Depot Rd, Govandi Slums, Nirankar Nagar...",5000.0,11,Snacks,All Restaurant,All_Gym,Farmers Market,Convention Center,Cosmetics Shop,Coworking Space,Creperie,Cricket Ground,Dance Studio,Design Studio,Duty-free Shop,Event Space,Factory,Farm


In [70]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 12, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
12,"184/B, Sant Rohidas Marg, Dharavi koliwada, Dh...",5000.0,12,All Restaurant,Snacks,Night life,Shopping,Train Station,All_Gym,Garden,Arcade,Brewery,Multiplex,Bus Station,Miscellaneous Shop,Sports Club,Park,Indie Movie Theater
16,"Lucky Towers, MG Cross Rd Number 4, Amrut Naga...",5000.0,12,All Restaurant,Snacks,Shopping,Night life,Multiplex,Train Station,Miscellaneous Shop,Cosmetics Shop,Bus Station,Park,Gaming Cafe,Art Gallery,All_Gym,Liquor Store,Garden
38,Room no 645 rajive gandhi nagar Near New Trans...,3535.533906,12,All Restaurant,Snacks,Shopping,All_Gym,Night life,Train Station,Miscellaneous Shop,Baseball Field,Convention Center,Farmers Market,Gaming Cafe,Garden,Indie Movie Theater,Light Rail Station,Stadium
39,Shop Number.21 A Wing Kamaraj CHS Near JPR Rad...,5000.0,12,All Restaurant,Snacks,Shopping,Train Station,Night life,All_Gym,Indie Movie Theater,Train,Bookstore,Garden,Arcade,Miscellaneous Shop,Sports Club,Gaming Cafe,Duty-free Shop
48,"428, KN Gaikwad Marg, Ganesh Nagar, Postal Col...",5000.0,12,All Restaurant,Snacks,Shopping,Night life,All_Gym,Garden,Performing Arts Venue,Light Rail Station,Playground,Spa,Golf Course,Plaza,Pool,Arts & Crafts Store,Smoke Shop
49,"Premier Colony, Kurla West, Maharashtra 400070...",3535.533906,12,All Restaurant,Snacks,Night life,Shopping,All_Gym,Theme Park,Performing Arts Venue,Bowling Alley,Sculpture Garden,Shoe Store,Playground,Garden,Furniture / Home Store,Community Center,Convention Center
50,"PL Lokhande Marg, ACC Nagar, Chedda Nagar, Mah...",3535.533906,12,All Restaurant,Snacks,Night life,Shopping,All_Gym,Park,Garden,Spa,Performing Arts Venue,Sculpture Garden,Plaza,Furniture / Home Store,Cricket Ground,Community Center,Convention Center
66,"Shiv Shakti Shopping Complex, Deepak Kumar Mar...",3535.533906,12,All Restaurant,Snacks,Shopping,Night life,All_Gym,Soccer Field,Garden,Performing Arts Venue,Light Rail Station,General Entertainment,Pool,Shoe Store,Arts & Crafts Store,Smoke Shop,Spa
68,"25, Prakash Thorat Marg, Gaikwad nagar, Budha ...",5000.0,12,All Restaurant,Snacks,Night life,Shopping,All_Gym,Park,Garden,Spa,Playground,Performing Arts Venue,Sculpture Garden,Plaza,Golf Course,General Entertainment,Fish Market
70,"Shop No. 1, Ground Floor, Vallabh Vihar, Opp. ...",3535.533906,12,All Restaurant,Snacks,Night life,Shopping,All_Gym,Playground,Sculpture Garden,Multiplex,Bowling Alley,Theme Park,Dance Studio,Design Studio,Coworking Space,Cricket Ground,Creperie


In [71]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 13, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
30,"New Dock Rd, Victoria Docks Port Trust, Mazgao...",3535.533906,13,All Restaurant,Harbor / Marina,Night life,Shopping,Plaza,Snacks,Creperie,Duty-free Shop,Design Studio,Dance Studio,Cricket Ground,All_Gym,Coworking Space,Cosmetics Shop,Factory


In [72]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 14, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
60,"400065, Workers Colony, Aarey Colony, Goregaon...",5000.0,14,All Restaurant,Snacks,All_Gym,Resort,Golf Course,Monument / Landmark,Farm,Event Space,Dance Studio,Lake,Arcade,Art Gallery,Creperie,Cricket Ground,Airport Terminal


In [73]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 15, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
2,"Navpada, Chhatrapati Shivaji International Air...",3535.533906,15,All Restaurant,Night life,Snacks,Shopping,Airport,Duty-free Shop,Airport Service,Light Rail Station,Resort,Brewery,Spa,Theme Park,Pool,Airport Lounge,Bowling Alley
6,"D-83, Muranjan Wadi, Marol, Andheri East, Maha...",3535.533906,15,All Restaurant,Night life,Snacks,Shopping,Light Rail Station,Resort,Office,Scenic Lookout,Multiplex,Brewery,Garden,Train Station,Duty-free Shop,Creperie,Cricket Ground
51,"Andheri - Kurla Rd, Jarimari, Saki Naka, Mahar...",5000.0,15,All Restaurant,Night life,Snacks,Shopping,Airport Service,Light Rail Station,Music Store,Multiplex,Duty-free Shop,Spa,Resort,Theme Park,Bowling Alley,Pool,Airport Lounge
69,"Hashimpremji, Shop No-2, Navpada, Vile Parle E...",3535.533906,15,All Restaurant,Night life,Snacks,Shopping,Airport Service,Light Rail Station,Resort,Duty-free Shop,Bowling Alley,Brewery,Spa,Theme Park,Pool,Airport Lounge,Airport


In [74]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 16, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
41,"1, Narayan Nagar, Mumbra, Thane, Maharashtra 4...",3535.533906,16,All Restaurant,Snacks,Train Station,All_Gym,Farm,Community Center,Convention Center,Cosmetics Shop,Coworking Space,Creperie,Cricket Ground,Dance Studio,Design Studio,Duty-free Shop,Event Space


In [75]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 17, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
78,"Dadlani Park, Raheja Complex, Majiwada, Thane,...",3535.533906,17,Shopping,Snacks,Multiplex,All_Gym,All Restaurant,Dance Studio,Factory,Event Space,Duty-free Shop,Design Studio,Creperie,Cricket Ground,Farmers Market,Coworking Space,Cosmetics Shop


In [76]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 18, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
40,"802, B3, MM Valley, Nr TMC Stadium, Jilani Par...",5000.0,18,All Restaurant,ATM,Snacks,Farmers Market,Convention Center,Cosmetics Shop,Coworking Space,Creperie,Cricket Ground,Dance Studio,Design Studio,Duty-free Shop,Event Space,Factory,Farm


In [77]:
mumbai_merged.loc[mumbai_merged['Cluster Labels'] == 19, 
        mumbai_merged.columns[[0] + list(range(5, mumbai_merged.shape[1]))]]

Unnamed: 0,Address,Distance from center,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue
22,"Rushi Van Borivali, Rushi Van, Next to Kajupad...",3535.533906,19,Snacks,Night life,All Restaurant,All_Gym,Shopping,Scenic Lookout,Lighthouse,Historic Site,Train Station,Indie Movie Theater,Convention Center,Factory,Cosmetics Shop,Coworking Space,Creperie
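
The per-cluster cells above all repeat the same selection with a different label. As a hedged sketch, the selection can be wrapped in a small helper and combined with a count of neighbourhoods per cluster, which makes single-member clusters easy to spot. The helper `cluster_view` and the `toy` frame are hypothetical; the column layout (address first, venue columns from position 5 onward) is assumed to match `mumbai_merged`.

```python
import pandas as pd


def cluster_view(merged, label, first_kept_col=5):
    """Rows of one cluster: the first column plus every column from `first_kept_col` on."""
    keep = [0] + list(range(first_kept_col, merged.shape[1]))
    return merged.loc[merged["Cluster Labels"] == label, merged.columns[keep]]


# Toy frame with the same leading column layout as mumbai_merged.
toy = pd.DataFrame({
    "Address": ["A", "B", "C"],
    "Latitude": [19.0, 19.1, 19.2],
    "Longitude": [72.8, 72.9, 73.0],
    "X": [1.0, 2.0, 3.0],
    "Y": [1.0, 2.0, 3.0],
    "Distance from center": [5000.0, 3535.5, 5000.0],
    "Cluster Labels": [0, 1, 0],
    "1st Most Common Venue": ["Snacks", "All Restaurant", "Snacks"],
})

# Number of neighbourhoods in each cluster, ordered by label.
cluster_sizes = toy["Cluster Labels"].value_counts().sort_index()

# One table per cluster, keyed by cluster label.
views = {label: cluster_view(toy, label) for label in cluster_sizes.index}
```

In a notebook, looping over `views` and calling `display` on each table would show the same per-cluster tables without copying the cell for every label.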


Based on discussions with stakeholders, the basic needs are food, shopping and a gym. The neighbourhoods in clusters 0, 12 and 19 are therefore considered the best candidates.

In [78]:
# addresses of all neighbourhoods in the selected clusters
mumbai_merged.loc[mumbai_merged['Cluster Labels'].isin([0, 12, 19]), 'Address']


12    184/B, Sant Rohidas Marg, Dharavi koliwada, Dh...
16    Lucky Towers, MG Cross Rd Number 4, Amrut Naga...
18    1st Floor, St. Lawrence High School 90 Feet Ro...
20    National Park - BMC Colony Rd, Maharashtra 400...
22    Rushi Van Borivali, Rushi Van, Next to Kajupad...
38    Room no 645 rajive gandhi nagar Near New Trans...
39    Shop Number.21 A Wing Kamaraj CHS Near JPR Rad...
48    428, KN Gaikwad Marg, Ganesh Nagar, Postal Col...
49    Premier Colony, Kurla West, Maharashtra 400070...
50    PL Lokhande Marg, ACC Nagar, Chedda Nagar, Mah...
61    Rolex Apartment, Underai Rd, Malad, Navy Colon...
63    Unnamed Road, Malad, Shivaji Nagar, Kurar Vill...
66    Shiv Shakti Shopping Complex, Deepak Kumar Mar...
68    25, Prakash Thorat Marg, Gaikwad nagar, Budha ...
70    Shop No. 1, Ground Floor, Vallabh Vihar, Opp. ...
72    Shri Ganesh Rahivasi Seva Sangh, Khale Compoun...
73    1, Pokharan Rd Number 1, Shastri Nagar, Vartak...
Name: Address, dtype: object
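
The selection above was made by reading the cluster tables. One hypothetical way to make such a rule explicit is to keep the clusters in which every neighbourhood lists all required categories among its top venues. This is a sketch under assumptions and not the method used for the result above, so it will not necessarily select the same clusters. The function `clusters_meeting_needs`, the `REQUIRED` set and the thresholds are illustrative, and the venue columns are assumed to appear in rank order.

```python
import pandas as pd

# Assumed encoding of "food, shopping and a gym" as venue categories.
REQUIRED = {"All Restaurant", "Shopping", "All_Gym"}


def clusters_meeting_needs(merged, required=REQUIRED, top_n=5, min_share=1.0):
    """Labels of clusters where at least `min_share` of the rows have every
    required category among their first `top_n` venue columns."""
    venue_cols = [c for c in merged.columns if c.endswith("Most Common Venue")][:top_n]
    has_all = merged[venue_cols].apply(lambda row: required.issubset(set(row)), axis=1)
    share = has_all.groupby(merged["Cluster Labels"]).mean()
    return sorted(share[share >= min_share].index.tolist())


# Toy data: cluster 0 covers all three needs, cluster 1 lacks a gym.
toy = pd.DataFrame({
    "Address": ["A", "B", "C"],
    "Cluster Labels": [0, 0, 1],
    "1st Most Common Venue": ["Snacks", "All Restaurant", "Train Station"],
    "2nd Most Common Venue": ["All Restaurant", "Shopping", "Shopping"],
    "3rd Most Common Venue": ["Shopping", "All_Gym", "Harbor / Marina"],
    "4th Most Common Venue": ["All_Gym", "Night life", "All Restaurant"],
})

selected = clusters_meeting_needs(toy, top_n=4)
```

Lowering `min_share` or raising `top_n` relaxes the rule, so the thresholds should be agreed with stakeholders rather than tuned to reach a preferred answer.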

## Results and Discussion <a name="results"></a>

Our analysis identifies several neighbourhoods that are good places to live: they have good access to local trains and have entertainment and other amenities nearby. To find them, we first listed the train stations in Mumbai, focusing on those connected to the two major train lines. We then defined candidate areas within 5 km of each station, since these have good access to that station.

Foursquare was used to find the amenities in each candidate neighbourhood. The results were grouped into 20 clusters, and we selected the clusters that offer all the amenities discussed with the stakeholders. The resulting locations are the best candidate areas for living in Mumbai.

## Conclusion <a name="conclusion"></a>

The purpose of this project was to find the best neighbourhoods in Mumbai. The major criterion is low travelling time, given the heavy traffic in Mumbai. This includes travel time to the local station as well as to gyms, restaurants, shopping etc. Places near train stations are therefore considered the best candidates. The candidates are then narrowed down further using Foursquare data on the amenities near each neighbourhood.

The final decision will be made by stakeholders based on their specific requirements for the neighbourhoods and locations in every recommended zone, taking into consideration additional factors such as price, area, office location, noise levels, and the social and economic dynamics of each neighbourhood.