# Capstone Project on Neighborhood Battle Between Mumbai And Pune

## Introduction to the project

#### When people consider relocating to a new city or country for work, starting a new life, or choosing a holiday destination, they tend to research the area first. This research covers population, average house prices, school ratings, crime rates, weather conditions, recreational facilities, tourist attractions, carnivals, sports events, and so on.
#### Given this, a search tool that lets users enter cities and returns the neighbourhoods that best suit their lifestyle or living conditions would be useful.
#### In this project, we classify areas using Foursquare data together with machine-learning segmentation and clustering, and segment the areas of two cities based on the most common venues captured from Foursquare.
#### The aim of the project is to use segmentation and clustering to analyse:
#### 1. The similarities and dissimilarities between the two cities in the user's search criteria.
#### 2. Which city is best suited to the user's lifestyle.
#### This project is a recommendation system that uses the cities of Pune and Mumbai in India as the search criteria.


In [1]:
# Import main libraries
import numpy as np #library to use vectors
import pandas as pd #library for analysing data
from bs4 import BeautifulSoup
import requests #library to handle requests
import json #library to use json files
import xml

!pip install geocoder
!pip install geopy

from geopy.geocoders import Nominatim
from pandas.io.json import json_normalize
import matplotlib.cm as cm
 
import matplotlib.colors as colors
from sklearn.cluster import KMeans
!conda install -c conda-forge folium=0.5.0 --yes
import folium

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)


### Downloading the postal codes of Pune City in table format

In [2]:
table = pd.read_html('https://www.mapsofindia.com/pincode/india/maharashtra/pune/', header = 1)

# Take the first table returned for the page
Pune_df = table[0]
Pune_df.head(10)

Unnamed: 0,Location,Pincode,State,District
0,A.R. shala,411004,Maharashtra,Pune
1,Afmc,411040,Maharashtra,Pune
2,Adhale Bk,410506,Maharashtra,Pune
3,Adivare,410509,Maharashtra,Pune
4,Agoti,413132,Maharashtra,Pune
5,Airport,411032,Maharashtra,Pune
6,Akurdi,411035,Maharashtra,Pune
7,Ala,412411,Maharashtra,Pune
8,Alande,412205,Maharashtra,Pune
9,Alandi Chorachi,412201,Maharashtra,Pune


### Index reset

In [3]:
Pune_df = Pune_df.reset_index(drop=True)
Pune_df

Unnamed: 0,Location,Pincode,State,District
0,A.R. shala,411004,Maharashtra,Pune
1,Afmc,411040,Maharashtra,Pune
2,Adhale Bk,410506,Maharashtra,Pune
3,Adivare,410509,Maharashtra,Pune
4,Agoti,413132,Maharashtra,Pune
5,Airport,411032,Maharashtra,Pune
6,Akurdi,411035,Maharashtra,Pune
7,Ala,412411,Maharashtra,Pune
8,Alande,412205,Maharashtra,Pune
9,Alandi Chorachi,412201,Maharashtra,Pune


### Using the geocoder library to find the latitude and longitude of all Pune locations

In [4]:
import geocoder

SSK_API_KEY='AIzaSyClDcKdhBhhALFXOk1K6IA729msSRZ0tsQ'
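# Get latitude and longitude for a pincode, retrying a bounded number of times
# (max_tries=5 is an arbitrary cap)
def get_latlng(Pincode, max_tries=5):
    for _ in range(max_tries):
        g = geocoder.google('{}, India'.format(Pincode), key=SSK_API_KEY)
        if g.latlng is not None:
            return g.latlng
    # fail loudly instead of looping forever on a pincode that never resolves
    raise RuntimeError('No coordinates returned for pincode {}'.format(Pincode))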

In [5]:
postal_codes_Pune= Pune_df['Pincode']
coords = [ get_latlng(Pincode) for Pincode in postal_codes_Pune]
df_Pune_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])
Pune_df['Latitude'] = df_Pune_coords['Latitude']
Pune_df['Longitude'] = df_Pune_coords['Longitude']
Pune_df

Unnamed: 0,Location,Pincode,State,District,Latitude,Longitude
0,A.R. shala,411004,Maharashtra,Pune,18.515729,73.834868
1,Afmc,411040,Maharashtra,Pune,18.492095,73.900178
2,Adhale Bk,410506,Maharashtra,Pune,18.685787,73.665394
3,Adivare,410509,Maharashtra,Pune,19.136358,73.677166
4,Agoti,413132,Maharashtra,Pune,18.178437,74.897493
5,Airport,411032,Maharashtra,Pune,18.593099,73.921781
6,Akurdi,411035,Maharashtra,Pune,18.652747,73.780145
7,Ala,412411,Maharashtra,Pune,19.167885,74.159228
8,Alande,412205,Maharashtra,Pune,18.239245,73.959494
9,Alandi Chorachi,412201,Maharashtra,Pune,18.465191,74.041763


In [6]:
print('Pune has {} unique pincodes.'.format(
        len(Pune_df['Pincode'].unique())
    )
)

Pune has 136 unique pincodes.
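Because the count above is taken over unique pincodes, locations that share a pincode receive identical coordinates, and one geocoder call per distinct pincode is enough. A minimal sketch of that idea follows. It assumes a hypothetical `lookup` callable in place of the real geocoder call; `geocode_unique`, `fake_lookup`, and the small `_KNOWN` table are illustrative names, with coordinates copied from the table above.

```python
def geocode_unique(pincodes, lookup):
    """Call `lookup` once per distinct pincode and reuse the result."""
    cache = {}
    coords = []
    for pin in pincodes:
        if pin not in cache:
            cache[pin] = lookup(pin)
        coords.append(cache[pin])
    return coords


# Hypothetical stand-in for the real geocoder; values copied from the table above.
_KNOWN = {411004: (18.515729, 73.834868), 412411: (19.167885, 74.159228)}
calls = []


def fake_lookup(pin):
    calls.append(pin)  # record every lookup that is actually made
    return _KNOWN[pin]


result = geocode_unique([411004, 412411, 412411], fake_lookup)
print(result)
print(len(calls), 'lookups for', len(result), 'rows')
```

With the real geocoder, `lookup` could be the notebook's `get_latlng`, which would reduce the number of API calls to the number of distinct pincodes.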


In [7]:
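address = 'Pune, India'

# recent geopy releases require an explicit user_agent; the name here is arbitrary
geolocator = Nominatim(user_agent='pune_explorer')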
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Pune, India are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Pune, India are 18.5203062, 73.8543185.


### Mapping Pune and its locations from the location dataset

In [8]:
# create map of Pune using latitude and longitude values
Pune_map = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(Pune_df['Latitude'],Pune_df['Longitude'], Pune_df['Location']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(Pune_map)  
    
Pune_map

### Fetching nearby venues for Pune from the Foursquare API, using the client ID and client secret

In [9]:
CLIENT_ID  = 'OFPXVQCD2Z1E5M4N4RCF3IG0BAN2OPDJHMCQHFTEPUVGTXZN' # your Foursquare ID
CLIENT_SECRET = 'ECZH0LNZZMD2ROGSLSPJFWA5TWYXNKUGIMGE45PZX5OBFYBF' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('Client_ID: ' + CLIENT_ID )
print('Client_Secret:' + CLIENT_SECRET)

Your credentials:
Client_ID: OFPXVQCD2Z1E5M4N4RCF3IG0BAN2OPDJHMCQHFTEPUVGTXZN
Client_Secret:ECZH0LNZZMD2ROGSLSPJFWA5TWYXNKUGIMGE45PZX5OBFYBF


In [10]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [11]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            100)
            
        # make the GET request
        results = requests.get(url).json()['response']['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)    
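The list comprehension in `getNearbyVenues` indexes `categories[0]` directly, which would raise an `IndexError` for any venue returned without a category. The sketch below shows one way to flatten the same fields while tolerating that case. The helper name `venue_rows` and the `sample_items` list are assumptions for illustration: the sample is hand-written to match the fields the function reads and is not real API output.

```python
def venue_rows(name, lat, lng, items):
    """Flatten Foursquare-style `items` into tuples, tolerating missing categories."""
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        categories = venue.get('categories') or []
        # use None when a venue carries no category instead of raising IndexError
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue.get('name'),
                     location.get('lat'), location.get('lng'), category))
    return rows


# Hand-written sample shaped like the fields read above (not real API output).
sample_items = [
    {'venue': {'name': 'Le Plaisir',
               'location': {'lat': 18.514205, 'lng': 73.838551},
               'categories': [{'name': 'Bistro'}]}},
    {'venue': {'name': 'Uncategorised venue',
               'location': {'lat': 18.5150, 'lng': 73.8340},
               'categories': []}},
]

rows = venue_rows('A.R. shala', 18.515729, 73.834868, sample_items)
for row in rows:
    print(row)
```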

In [12]:
Pune_venues = getNearbyVenues(names=Pune_df['Location'],
                              latitudes=Pune_df['Latitude'],
                              longitudes=Pune_df['Longitude']
                             )
Pune_venues.head(10)

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,A.R. shala,18.515729,73.834868,Le Plaisir,18.514205,73.838551,Bistro
1,A.R. shala,18.515729,73.834868,Casa Lolo,18.51693,73.83394,Deli / Bodega
2,A.R. shala,18.515729,73.834868,Panchvati Gaurav,18.517879,73.838623,Indian Restaurant
3,A.R. shala,18.515729,73.834868,Café Peterdonuts,18.514394,73.833497,Bagel Shop
4,A.R. shala,18.515729,73.834868,Kamala Nehru Park,18.51705,73.834099,Garden
5,A.R. shala,18.515729,73.834868,Supreme Pav Bhaji and Pizza (Canal Road Branch),18.513946,73.830783,Pizza Place
6,A.R. shala,18.515729,73.834868,Curry On The Roof,18.514348,73.833383,Restaurant
7,A.R. shala,18.515729,73.834868,Sachin Tea Stall,18.517036,73.834058,Snack Place
8,A.R. shala,18.515729,73.834868,Ice Cream Magic,18.51751,73.834232,Ice Cream Shop
9,A.R. shala,18.515729,73.834868,Four Fountain Spa,18.516903,73.836809,Spa
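The `radius=500` default can be sanity-checked by computing the distance between a neighborhood's coordinates and a returned venue. The sketch below uses the haversine formula with a mean Earth radius of 6,371 km. It assumes the API radius behaves like a great-circle distance in metres, which is not confirmed here; `haversine_m` is an illustrative helper name.

```python
import math


def haversine_m(lat1, lng1, lat2, lng2, radius_m=6371000.0):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


# A.R. shala coordinates and the first venue (Le Plaisir) from the table above.
d = haversine_m(18.515729, 73.834868, 18.514205, 73.838551)
print('Distance in metres:', round(d))
print('Within 500 m:', d <= 500)
```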


In [13]:
print(Pune_venues.shape)

(1167, 7)


In [14]:
#Venues per Neighborhood
Pune_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
A.R. shala,15,15,15,15,15,15
Afmc,16,16,16,16,16,16
Airport,1,1,1,1,1,1
Akurdi,5,5,5,5,5,5
Ambale,1,1,1,1,1,1
Ambarvet,5,5,5,5,5,5
Ambavane,7,7,7,7,7,7
Ambegaon Bk,1,1,1,1,1,1
Ambi,1,1,1,1,1,1
Ammunition Factory khadki,5,5,5,5,5,5


In [15]:
print('There are {} distinct venues in {} categories.'.format(
    len(Pune_venues['Venue'].unique()),len(Pune_venues['Venue Category'].unique())))

There are 243 distinct venues in 82 categories.


In [16]:
# one hot encoding
Pune_onehot = pd.get_dummies(Pune_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Pune_onehot['Neighborhood'] = Pune_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [Pune_onehot.columns[-1]] + list(Pune_onehot.columns[:-1])
Pune_onehot = Pune_onehot[fixed_columns]

Pune_onehot.head()

Unnamed: 0,Neighborhood,ATM,American Restaurant,Arcade,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Bistro,Bookstore,Breakfast Spot,Buffet,Burger Joint,Burrito Place,Business Service,Café,Chinese Restaurant,Clothing Store,Coffee Shop,Concert Hall,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Donut Shop,Electronics Store,Farm,Farmers Market,Fast Food Restaurant,Flea Market,Food & Drink Shop,Food Court,Food Truck,Furniture / Home Store,Garden,Gastropub,Gift Shop,Golf Course,Gym,Gym / Fitness Center,Health & Beauty Service,Hospital,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Jewelry Store,Juice Bar,Lake,Liquor Store,Lounge,Motel,Motorcycle Shop,Mountain,Movie Theater,Multiplex,Nightclub,North Indian Restaurant,Park,Pharmacy,Pizza Place,Resort,Restaurant,Sandwich Place,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Snack Place,Spa,Sporting Goods Shop,Tea Room,Track,Train Station,Vegetarian / Vegan Restaurant,Wine Shop
0,A.R. shala,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,A.R. shala,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,A.R. shala,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,A.R. shala,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,A.R. shala,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [17]:
Pune_onehot.shape

(1167, 83)

In [18]:
Pune_grouped = Pune_onehot.groupby('Neighborhood').mean().reset_index()
Pune_grouped

Unnamed: 0,Neighborhood,ATM,American Restaurant,Arcade,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Bistro,Bookstore,Breakfast Spot,Buffet,Burger Joint,Burrito Place,Business Service,Café,Chinese Restaurant,Clothing Store,Coffee Shop,Concert Hall,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Donut Shop,Electronics Store,Farm,Farmers Market,Fast Food Restaurant,Flea Market,Food & Drink Shop,Food Court,Food Truck,Furniture / Home Store,Garden,Gastropub,Gift Shop,Golf Course,Gym,Gym / Fitness Center,Health & Beauty Service,Hospital,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Jewelry Store,Juice Bar,Lake,Liquor Store,Lounge,Motel,Motorcycle Shop,Mountain,Movie Theater,Multiplex,Nightclub,North Indian Restaurant,Park,Pharmacy,Pizza Place,Resort,Restaurant,Sandwich Place,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Snack Place,Spa,Sporting Goods Shop,Tea Room,Track,Train Station,Vegetarian / Vegan Restaurant,Wine Shop
0,A.R. shala,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.066667,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.066667,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.066667,0.066667,0.0,0.0,0.066667,0.0,0.0,0.0
1,Afmc,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0625,0.0,0.125,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.1875,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1875,0.0625,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Airport,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Akurdi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.2,0.0,0.0,0.0,0.0
4,Ambale,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Ambarvet,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Ambavane,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.142857,0.0,0.142857,0.0,0.0,0.142857,0.0,0.0,0.142857,0.0
7,Ambegaon Bk,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Ambi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Ammunition Factory khadki,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0
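The two steps above (one-hot encoding the venue category, then averaging per neighborhood) turn raw venue rows into per-neighborhood category frequencies. A toy example makes the arithmetic visible. The neighborhood names `A` and `B` and their venues are made up for illustration.

```python
import pandas as pd

# Made-up venue rows: neighborhood A has three venues, B has one.
venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'B'],
    'Venue Category': ['Hotel', 'Hotel', 'Bank', 'Lake'],
})

# One indicator column per category, one row per venue.
onehot = pd.get_dummies(venues['Venue Category'], dtype=float)
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

# The mean of the indicators is the share of each category in a neighborhood.
grouped = onehot.groupby('Neighborhood').mean().reset_index()
print(grouped)
```

Each row of `grouped` sums to 1, so a neighborhood with a single venue (like `B` here, or Airport in the table above) shows a frequency of 1.0 for that one category.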


In [19]:
num_top_venues = 10

for hood in Pune_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = Pune_grouped[Pune_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----A.R. shala----
               venue  freq
0              Hotel  0.13
1  Indian Restaurant  0.07
2               Bank  0.07
3         Food Truck  0.07
4             Garden  0.07
5             Bistro  0.07
6        Snack Place  0.07
7        Pizza Place  0.07
8                Spa  0.07
9     Ice Cream Shop  0.07


----Afmc----
                  venue  freq
0  Fast Food Restaurant  0.19
1        Ice Cream Shop  0.19
2           Coffee Shop  0.12
3        Sandwich Place  0.06
4    Chinese Restaurant  0.06
5                  Café  0.06
6     Electronics Store  0.06
7                Lounge  0.06
8          Dance Studio  0.06
9     Indian Restaurant  0.06


----Airport----
               venue  freq
0  Indian Restaurant   1.0
1                ATM   0.0
2      Jewelry Store   0.0
3           Mountain   0.0
4    Motorcycle Shop   0.0
5              Motel   0.0
6             Lounge   0.0
7       Liquor Store   0.0
8               Lake   0.0
9          Juice Bar   0.0


----Akurdi----
       

### All venues in Pune and its nearby areas are found and then arranged by geographical location

In [20]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
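Because `return_most_common_venues` always returns `num_top_venues` names, a neighborhood with few venues is padded with categories whose frequency is zero (the Airport listing above has a single non-zero category). The variant below is a sketch of one alternative that keeps only categories that actually occur. `top_nonzero_venues` is an illustrative name, not part of the original notebook, and the sample row is a shortened version of the Airport row.

```python
import pandas as pd


def top_nonzero_venues(row, num_top_venues):
    """Return up to `num_top_venues` category names, skipping zero frequencies."""
    freqs = row.iloc[1:].astype(float)          # drop the 'Neighborhood' entry
    freqs = freqs[freqs > 0].sort_values(ascending=False)
    return list(freqs.index[:num_top_venues])


# Shortened illustration of the Airport row: only one category occurs.
row = pd.Series({'Neighborhood': 'Airport', 'ATM': 0.0,
                 'Indian Restaurant': 1.0, 'Lake': 0.0})
top = top_nonzero_venues(row, 10)
print(top)
```

A list of variable length does not fit directly into the fixed ten-column table built in the next cell, so using this variant would also mean padding the remainder with a placeholder such as `None`.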

In [21]:
# **Build ten top venues dataset.**
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Pune_grouped['Neighborhood']

for ind in np.arange(Pune_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Pune_grouped.iloc[ind, :], num_top_venues)

print(neighborhoods_venues_sorted.shape)
neighborhoods_venues_sorted

(201, 11)


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,A.R. shala,Hotel,Food Truck,Bagel Shop,Deli / Bodega,Indian Restaurant,Ice Cream Shop,Bistro,Pizza Place,Bank,Restaurant
1,Afmc,Fast Food Restaurant,Ice Cream Shop,Coffee Shop,Chinese Restaurant,American Restaurant,Lounge,Café,Dance Studio,Electronics Store,Sandwich Place
2,Airport,Indian Restaurant,Clothing Store,Concert Hall,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Donut Shop,Electronics Store
3,Akurdi,Motel,Restaurant,Tea Room,Snack Place,Hotel,Clothing Store,Coffee Shop,Concert Hall,Dance Studio,Deli / Bodega
4,Ambale,Lake,Wine Shop,Clothing Store,Concert Hall,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Donut Shop
5,Ambarvet,Asian Restaurant,Fast Food Restaurant,Bank,Shoe Store,Donut Shop,Wine Shop,Electronics Store,Concert Hall,Dance Studio,Deli / Bodega
6,Ambavane,Vegetarian / Vegan Restaurant,Tea Room,Juice Bar,Snack Place,Fast Food Restaurant,Shop & Service,Restaurant,Wine Shop,Diner,Coffee Shop
7,Ambegaon Bk,Business Service,Wine Shop,Electronics Store,Concert Hall,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Donut Shop
8,Ambi,Lake,Wine Shop,Clothing Store,Concert Hall,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Donut Shop
9,Ammunition Factory khadki,Department Store,Farmers Market,Tea Room,Bakery,Electronics Store,Concert Hall,Dance Studio,Deli / Bodega,Dessert Shop,Diner
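The column labels above are built from a three-element suffix list with a fallback to `th`, which is correct for ranks 1 to 10. If `num_top_venues` were ever raised past 20, labels such as `21th` would appear. A small helper, sketched here under the hypothetical name `ordinal`, handles the general case.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


columns = ['Neighborhood'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)]
print(columns)
```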


In [22]:
neighborhoods_venues_sorted.iloc[11,]

Neighborhood                           Andgaon
1st Most Common Venue         Asian Restaurant
2nd Most Common Venue     Fast Food Restaurant
3rd Most Common Venue                     Bank
4th Most Common Venue               Shoe Store
5th Most Common Venue               Donut Shop
6th Most Common Venue                Wine Shop
7th Most Common Venue        Electronics Store
8th Most Common Venue             Concert Hall
9th Most Common Venue             Dance Studio
10th Most Common Venue           Deli / Bodega
Name: 11, dtype: object

In [23]:
cols = Pune_df.columns.tolist()
n = int(cols.index('Location'))
cols = [cols[n]] + cols[:n] + cols[n+1:]
Pune_df = Pune_df[cols]
Pune_df

Unnamed: 0,Location,Pincode,State,District,Latitude,Longitude
0,A.R. shala,411004,Maharashtra,Pune,18.515729,73.834868
1,Afmc,411040,Maharashtra,Pune,18.492095,73.900178
2,Adhale Bk,410506,Maharashtra,Pune,18.685787,73.665394
3,Adivare,410509,Maharashtra,Pune,19.136358,73.677166
4,Agoti,413132,Maharashtra,Pune,18.178437,74.897493
5,Airport,411032,Maharashtra,Pune,18.593099,73.921781
6,Akurdi,411035,Maharashtra,Pune,18.652747,73.780145
7,Ala,412411,Maharashtra,Pune,19.167885,74.159228
8,Alande,412205,Maharashtra,Pune,18.239245,73.959494
9,Alandi Chorachi,412201,Maharashtra,Pune,18.465191,74.041763


In [24]:
Pune_df.shape

(762, 6)

In [25]:
# import k-means from clustering stage
from sklearn.cluster import KMeans

# set number of clusters
kclusters = 5

# keep only the numeric columns (Pincode, Latitude, Longitude) for clustering
Pune_grouped_clusters = Pune_df.drop(columns=['Location', 'State', 'District'])

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=2).fit(Pune_grouped_clusters)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([4, 4, 1, 1, 3, 4, 4, 0, 0, 0, 0, 0, 0, 0, 1, 4, 0, 1, 4, 0, 1, 1,
       0, 1, 0, 4, 0, 4, 0, 4, 0, 0, 1, 3, 0, 1, 0, 4, 0, 0, 1, 0, 0, 4,
       4, 4, 0, 0, 0, 0, 1, 4, 1, 4, 4, 3, 3, 3, 3, 0, 4, 3, 0, 2, 1, 0,
       3, 0, 0, 0, 0, 0, 3, 1, 4, 3, 3, 3, 1, 0, 0, 3, 0, 0, 0, 1, 0, 4,
       4, 1, 0, 4, 4, 4, 0, 1, 4, 3, 0, 3, 0, 0, 0, 2, 4, 0, 4, 4, 1, 0,
       1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 4, 4, 0, 4, 0, 3, 0, 4, 4, 2,
       2, 0, 1, 0, 4, 0, 0, 0, 0, 0, 2, 1, 3, 1, 1, 0, 0, 4, 0, 4, 4, 3,
       0, 4, 1, 0, 0, 1, 4, 3, 4, 4, 4, 0, 4, 4, 0, 1, 0, 0, 0, 4, 1, 0,
       0, 0, 4, 4, 4, 1, 3, 0, 2, 1, 0, 1, 4, 0, 2, 4, 1, 0, 3, 0, 0, 0,
       4, 4, 4, 4, 0, 0, 0, 0, 3, 2, 0, 0, 1, 0, 0, 4, 4, 0, 3, 3, 0, 1,
       4, 0, 1, 3, 0, 4, 1, 1, 0, 0, 0, 0, 3, 0, 1, 1, 0, 0, 0, 0, 1, 3,
       3, 3, 0, 4, 3, 1, 0, 3, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 4, 0, 0, 0,
       0, 0, 0, 1, 3, 0, 0, 0, 0, 3, 1, 0, 1, 4, 1, 4, 4, 0, 0, 0, 3, 3,
       4, 3, 0, 3, 0, 0, 0, 0, 0, 0, 3, 4, 4, 4, 4,
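The features passed to k-means here are `Pincode`, `Latitude` and `Longitude`. Pincodes are six-digit numbers while the coordinates differ by about a degree, so unscaled Euclidean distances are driven almost entirely by the pincode. The sketch below measures that imbalance on four rows copied from the `Pune_df` table and shows the effect of standardizing each column. Whether to scale, or to cluster on the venue frequencies instead, is a modelling choice this sketch does not settle.

```python
import numpy as np

# Pincode, Latitude, Longitude for four rows copied from the Pune_df table.
X = np.array([
    [411004, 18.515729, 73.834868],
    [411040, 18.492095, 73.900178],
    [410506, 18.685787, 73.665394],
    [413132, 18.178437, 74.897493],
])

# Share of total variance carried by each column before scaling.
raw_share = X.var(axis=0) / X.var(axis=0).sum()

# Standardize each column to zero mean and unit variance.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
scaled_share = X_scaled.var(axis=0) / X_scaled.var(axis=0).sum()

print('variance share before scaling:', raw_share)
print('variance share after scaling: ', scaled_share)
```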

In [26]:
Pune_merged = Pune_df

# add clustering labels
Pune_merged['Cluster Labels'] = kmeans.labels_

# join the top-venue table onto Pune_df so each location keeps its latitude/longitude
Pune_merged = Pune_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Location')

print(Pune_merged.shape)
Pune_merged.head() # check the last columns!

(762, 17)


Unnamed: 0,Location,Pincode,State,District,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,A.R. shala,411004,Maharashtra,Pune,18.515729,73.834868,4,Hotel,Food Truck,Bagel Shop,Deli / Bodega,Indian Restaurant,Ice Cream Shop,Bistro,Pizza Place,Bank,Restaurant
1,Afmc,411040,Maharashtra,Pune,18.492095,73.900178,4,Fast Food Restaurant,Ice Cream Shop,Coffee Shop,Chinese Restaurant,American Restaurant,Lounge,Café,Dance Studio,Electronics Store,Sandwich Place
2,Adhale Bk,410506,Maharashtra,Pune,18.685787,73.665394,1,,,,,,,,,,
3,Adivare,410509,Maharashtra,Pune,19.136358,73.677166,1,,,,,,,,,,
4,Agoti,413132,Maharashtra,Pune,18.178437,74.897493,3,,,,,,,,,,
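The empty venue columns for rows 2 to 4 above come from the join itself: `DataFrame.join` is a left join by default, so every location is kept and those with no venue data get `NaN`. The toy example below reproduces the effect and counts the affected rows. The two small frames are made up for illustration and reuse two location names from the output above.

```python
import pandas as pd

# Two locations, only one of which has venue data.
locations = pd.DataFrame({'Location': ['A.R. shala', 'Adhale Bk'],
                          'Cluster Labels': [4, 1]})
top = pd.DataFrame({'Neighborhood': ['A.R. shala'],
                    '1st Most Common Venue': ['Hotel']})

# Left join: every location is kept; unmatched ones get NaN venue columns.
merged = locations.join(top.set_index('Neighborhood'), on='Location')
missing = merged['1st Most Common Venue'].isna()

print(merged)
print(int(missing.sum()), 'location(s) without venue data')

# Optionally keep only the locations that have venue information.
with_venues = merged[~missing]
```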


## Finally, let's visualize the resulting clusters
### Build cluster dataset and plot the map

In [27]:
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

In [28]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]
print(rainbow)
# add markers to the map
markers_colors = []
for lat, lon, nei , cluster in zip(Pune_merged['Latitude'],Pune_merged['Longitude'], Pune_merged['Location'], Pune_merged['Cluster Labels']):
    label = folium.Popup(str(nei) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

['#8000ff', '#00b5eb', '#80ffb4', '#ffb360', '#ff0000']


## Cluster 1

In [29]:
Pune_cluster_0 = Pune_merged.loc[Pune_merged['Cluster Labels'] == 0, Pune_merged.columns[[1] + [0] + list(range(4, Pune_merged.shape[1]))]]
Pune_cluster_0.shape

(359, 15)

In [30]:
Pune_cluster_0

Unnamed: 0,Pincode,Location,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,412411,Ala,19.167885,74.159228,0,,,,,,,,,,
8,412205,Alande,18.239245,73.959494,0,,,,,,,,,,
9,412201,Alandi Chorachi,18.465191,74.041763,0,,,,,,,,,,
10,412105,Alandi Devachi,18.660455,73.900709,0,,,,,,,,,,
11,412211,Alegaon,18.598027,74.382206,0,,,,,,,,,,
12,412411,Alephata,19.167885,74.159228,0,,,,,,,,,,
13,412206,Ambade,18.08503,73.77132,0,,,,,,,,,,
16,412206,Ambavade,18.08503,73.77132,0,,,,,,,,,,
19,412206,Ambeghar,18.08503,73.77132,0,,,,,,,,,,
22,412211,Amble,18.598027,74.382206,0,,,,,,,,,,


## Cluster 2

In [31]:
Pune_cluster_1 = Pune_merged.loc[Pune_merged['Cluster Labels'] == 1, Pune_merged.columns[[1] + [0] + list(range(4, Pune_merged.shape[1]))]]
Pune_cluster_1.shape

(143, 15)

In [32]:
Pune_cluster_1

Unnamed: 0,Pincode,Location,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,410506,Adhale Bk,18.685787,73.665394,1,,,,,,,,,,
3,410509,Adivare,19.136358,73.677166,1,,,,,,,,,,
14,410507,Ambale,18.77167,73.724249,1,Lake,Wine Shop,Clothing Store,Concert Hall,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Donut Shop
17,410401,Ambavane,18.733734,73.429807,1,Vegetarian / Vegan Restaurant,Tea Room,Juice Bar,Snack Place,Fast Food Restaurant,Shop & Service,Restaurant,Wine Shop,Diner,Coffee Shop
20,410501,Ambethan,18.750621,73.87719,1,,,,,,,,,,
21,410507,Ambi,18.77167,73.724249,1,Lake,Wine Shop,Clothing Store,Concert Hall,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Donut Shop
23,410505,Amboli,18.968552,73.606522,1,,,,,,,,,,
32,410502,Anjanavale,19.24144,73.81838,1,,,,,,,,,,
35,410502,Aptale,19.24144,73.81838,1,,,,,,,,,,
40,410509,Asane,19.136358,73.677166,1,,,,,,,,,,


## Cluster 3

In [33]:
Pune_cluster_2 = Pune_merged.loc[Pune_merged['Cluster Labels'] == 2,Pune_merged.columns[[1] + [0] + list(range(4, Pune_merged.shape[1]))]]

Pune_cluster_2.shape


(13, 15)

In [34]:
Pune_cluster_2

Unnamed: 0,Pincode,Location,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
63,413801,Beam Wirelee station,18.390491,74.663464,2,,,,,,,,,,
103,413801,Boribyal,18.390491,74.663464,2,,,,,,,,,,
131,413801,Daund,18.390491,74.663464,2,,,,,,,,,,
132,413801,Daund Bazar,18.390491,74.663464,2,,,,,,,,,,
142,413801,Deulgaonraja,18.390491,74.663464,2,,,,,,,,,,
184,413801,Girim,18.390491,74.663464,2,,,,,,,,,,
190,413801,Gopalwadi,18.390491,74.663464,2,,,,,,,,,,
207,413801,Hingniberdi,18.390491,74.663464,2,,,,,,,,,,
357,413802,Kurkumbh,18.311575,74.546328,2,,,,,,,,,,
421,413802,Midc Kurkumbh,18.311575,74.546328,2,,,,,,,,,,


## Cluster 4

In [35]:
Pune_cluster_3 = Pune_merged.loc[Pune_merged['Cluster Labels'] == 3,Pune_merged.columns[[1] + [0] + list(range(4,Pune_merged.shape[1]))]]
Pune_cluster_3.shape

(91, 15)

In [36]:
Pune_cluster_3

Unnamed: 0,Pincode,Location,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,413132,Agoti,18.178437,74.897493,3,,,,,,,,,,
33,413114,Anthurne,18.044869,74.81562,3,,,,,,,,,,
55,413102,Baramati,18.15194,74.569762,3,Movie Theater,Ice Cream Shop,Multiplex,Diner,Clothing Store,Coffee Shop,Concert Hall,Dance Studio,Deli / Bodega,Department Store
56,413133,Baramati Midc,18.210592,74.581477,3,,,,,,,,,,
57,413102,Baramati West,18.15194,74.569762,3,Movie Theater,Ice Cream Shop,Multiplex,Diner,Clothing Store,Coffee Shop,Concert Hall,Dance Studio,Deli / Bodega,Department Store
58,413102,Barhanpur,18.15194,74.569762,3,Movie Theater,Ice Cream Shop,Multiplex,Diner,Clothing Store,Coffee Shop,Concert Hall,Dance Studio,Deli / Bodega,Department Store
61,413103,Bawada,17.954871,74.991011,3,,,,,,,,,,
66,413104,Belawadi,18.167157,74.757115,3,,,,,,,,,,
72,413103,Bhandgaon,17.954871,74.991011,3,,,,,,,,,,
75,413104,Bhavaninagar,18.167157,74.757115,3,,,,,,,,,,


## Cluster 5

In [37]:
Pune_cluster_4 = Pune_merged.loc[Pune_merged['Cluster Labels'] == 4, Pune_merged.columns[[1] + [0] + list(range(4, Pune_merged.shape[1]))]]

Pune_cluster_4.shape


(156, 15)

In [38]:
Pune_cluster_4

Unnamed: 0,Pincode,Location,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,411004,A.R. shala,18.515729,73.834868,4,Hotel,Food Truck,Bagel Shop,Deli / Bodega,Indian Restaurant,Ice Cream Shop,Bistro,Pizza Place,Bank,Restaurant
1,411040,Afmc,18.492095,73.900178,4,Fast Food Restaurant,Ice Cream Shop,Coffee Shop,Chinese Restaurant,American Restaurant,Lounge,Café,Dance Studio,Electronics Store,Sandwich Place
5,411032,Airport,18.593099,73.921781,4,Indian Restaurant,Clothing Store,Concert Hall,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Donut Shop,Electronics Store
6,411035,Akurdi,18.652747,73.780145,4,Motel,Restaurant,Tea Room,Snack Place,Hotel,Clothing Store,Coffee Shop,Concert Hall,Dance Studio,Deli / Bodega
15,411042,Ambarvet,18.503077,73.866074,4,Asian Restaurant,Fast Food Restaurant,Bank,Shoe Store,Donut Shop,Wine Shop,Electronics Store,Concert Hall,Dance Studio,Deli / Bodega
18,411046,Ambegaon Bk,18.402869,73.853668,4,Business Service,Wine Shop,Electronics Store,Concert Hall,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Donut Shop
25,411003,Ammunition Factory khadki,18.563874,73.851531,4,Department Store,Farmers Market,Tea Room,Bakery,Electronics Store,Concert Hall,Dance Studio,Deli / Bodega,Dessert Shop,Diner
27,411051,Anandnagar,18.477797,73.821321,4,Fast Food Restaurant,Indian Restaurant,Gym,Snack Place,Pizza Place,Wine Shop,Concert Hall,Dance Studio,Deli / Bodega,Department Store
29,411042,Andgaon,18.503077,73.866074,4,Asian Restaurant,Fast Food Restaurant,Bank,Shoe Store,Donut Shop,Wine Shop,Electronics Store,Concert Hall,Dance Studio,Deli / Bodega
37,411021,Armament,18.528229,73.777808,4,Garden,Indian Restaurant,Lake,Fast Food Restaurant,Concert Hall,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner


# Exploring Mumbai

In [39]:
# Obtain location and pincode information for Mumbai, India from mapsofindia.com
table = pd.read_html('https://www.mapsofindia.com/pincode/india/maharashtra/mumbai/', header = 1)

# Take the first table returned by read_html
Mumbai_df = table[0]
Mumbai_df.head(10)

Unnamed: 0,Location,Pincode,State,District
0,A I staff colony,400029,Maharashtra,Mumbai
1,Aareymilk Colony,400065,Maharashtra,Mumbai
2,Agripada,400011,Maharashtra,Mumbai
3,Airport,400099,Maharashtra,Mumbai
4,Ambewadi,400004,Maharashtra,Mumbai
5,Andheri,400053,Maharashtra,Mumbai
6,Andheri East,400069,Maharashtra,Mumbai
7,Andheri Railway station,400058,Maharashtra,Mumbai
8,Antop Hill,400037,Maharashtra,Mumbai
9,Asvini,400005,Maharashtra,Mumbai


In [40]:
Mumbai_df = Mumbai_df.reset_index(drop=True)
Mumbai_df

Unnamed: 0,Location,Pincode,State,District
0,A I staff colony,400029,Maharashtra,Mumbai
1,Aareymilk Colony,400065,Maharashtra,Mumbai
2,Agripada,400011,Maharashtra,Mumbai
3,Airport,400099,Maharashtra,Mumbai
4,Ambewadi,400004,Maharashtra,Mumbai
5,Andheri,400053,Maharashtra,Mumbai
6,Andheri East,400069,Maharashtra,Mumbai
7,Andheri Railway station,400058,Maharashtra,Mumbai
8,Antop Hill,400037,Maharashtra,Mumbai
9,Asvini,400005,Maharashtra,Mumbai


In [41]:
df_Mumbai = Mumbai_df.drop_duplicates(['Pincode'])

In [42]:
df_Mumbai = df_Mumbai.reset_index(drop=True)
df_Mumbai

Unnamed: 0,Location,Pincode,State,District
0,A I staff colony,400029,Maharashtra,Mumbai
1,Aareymilk Colony,400065,Maharashtra,Mumbai
2,Agripada,400011,Maharashtra,Mumbai
3,Airport,400099,Maharashtra,Mumbai
4,Ambewadi,400004,Maharashtra,Mumbai
5,Andheri,400053,Maharashtra,Mumbai
6,Andheri East,400069,Maharashtra,Mumbai
7,Andheri Railway station,400058,Maharashtra,Mumbai
8,Antop Hill,400037,Maharashtra,Mumbai
9,Asvini,400005,Maharashtra,Mumbai
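
Because `drop_duplicates(['Pincode'])` keeps only the first row for each pincode, localities that share a pincode collapse into a single representative row. The following is a minimal sketch on invented rows (the locality names and the shared pincode are hypothetical) that shows the effect, plus one possible way to keep all names per pincode:

```python
import pandas as pd

# Toy data (hypothetical): two localities share pincode 400001
toy = pd.DataFrame({
    'Location': ['Locality A', 'Locality B', 'Locality C'],
    'Pincode': [400001, 400001, 400002],
})

# keep='first' is the default: only the first locality per pincode survives
deduped = toy.drop_duplicates(['Pincode']).reset_index(drop=True)
print(deduped)

# Alternative: keep every locality name by collecting them per pincode
names_per_pincode = toy.groupby('Pincode')['Location'].agg(list)
print(names_per_pincode)
```

This matters later in the notebook: any table joined back on `Location` will only find coordinates for the first locality of each pincode.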


In [43]:
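import os
import geocoder

# Read the Google Geocoding API key from the environment rather than hard-coding it
SSK_API_KEY = os.environ.get('GOOGLE_API_KEY')

# Get latitude and longitude for a pincode.
# Note: this loop retries until the geocoder returns a result, so it never ends if a pincode cannot be resolved.
def get_latlng(Pincode):
    lat_lng_coords = None
    while lat_lng_coords is None:
        g = geocoder.google('{}, India'.format(Pincode), key=SSK_API_KEY)
        lat_lng_coords = g.latlng
    return lat_lng_coords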
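
The retry loop in `get_latlng` has no upper bound. A minimal sketch of a bounded variant is given below. The function name `get_latlng_bounded`, its parameters, and the injected `geocode` callable are assumptions made for illustration, not part of the notebook; the stand-in geocoder avoids any network call.

```python
import time

def get_latlng_bounded(pincode, geocode, max_attempts=3, delay_seconds=0.0):
    """Return [lat, lng] for a pincode, or None after max_attempts failures.

    `geocode` is any callable that takes a query string and returns
    [lat, lng] or None (a hypothetical interface chosen for this sketch).
    """
    query = '{}, India'.format(pincode)
    for _ in range(max_attempts):
        coords = geocode(query)
        if coords is not None:
            return coords
        # wait before the next attempt
        time.sleep(delay_seconds)
    return None

# Stand-in geocoder for demonstration: fails once, then succeeds
calls = {'n': 0}

def fake_geocode(query):
    calls['n'] += 1
    return None if calls['n'] < 2 else [19.07967, 72.867897]

result = get_latlng_bounded(400029, fake_geocode)
print(result)
```

With the real library, the callable could be something like `lambda q: geocoder.google(q, key=SSK_API_KEY).latlng`, and rows that come back as `None` can be dropped or inspected afterwards.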

In [44]:
postal_codes_Mumbai = df_Mumbai['Pincode']
coords = [ get_latlng(Pincode) for Pincode in postal_codes_Mumbai]
df_Mumbai_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])
df_Mumbai['Latitude'] = df_Mumbai_coords['Latitude']
df_Mumbai['Longitude'] = df_Mumbai_coords['Longitude']
df_Mumbai

Unnamed: 0,Location,Pincode,State,District,Latitude,Longitude
0,A I staff colony,400029,Maharashtra,Mumbai,19.07967,72.867897
1,Aareymilk Colony,400065,Maharashtra,Mumbai,19.155576,72.884959
2,Agripada,400011,Maharashtra,Mumbai,18.981045,72.826758
3,Airport,400099,Maharashtra,Mumbai,19.092927,72.865377
4,Ambewadi,400004,Maharashtra,Mumbai,18.957988,72.82144
5,Andheri,400053,Maharashtra,Mumbai,19.112105,72.861073
6,Andheri East,400069,Maharashtra,Mumbai,19.117753,72.873702
7,Andheri Railway station,400058,Maharashtra,Mumbai,19.122735,72.830201
8,Antop Hill,400037,Maharashtra,Mumbai,19.025748,72.869694
9,Asvini,400005,Maharashtra,Mumbai,18.910369,72.819758


In [45]:
# Find the geographical coordinates of Mumbai, India

address = 'Mumbai,India'

# recent geopy releases require an explicit user_agent; the name used here is arbitrary
geolocator = Nominatim(user_agent='mumbai_explorer')
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Mumbai, India are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Mumbai, India are 18.9387711, 72.8353355.


In [46]:
# create map of Mumbai using latitude and longitude values
Mumbai_map = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(df_Mumbai['Latitude'], df_Mumbai['Longitude'], df_Mumbai['Location']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(Mumbai_map)  
    
Mumbai_map


In [47]:
Mumbai_venues = getNearbyVenues(names=df_Mumbai['Pincode'],
                                   latitudes=df_Mumbai['Latitude'],
                                   longitudes=df_Mumbai['Longitude']
                                  )
Mumbai_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,400029,19.07967,72.867897,King Chilly,19.078382,72.86635,Chinese Restaurant
1,400029,19.07967,72.867897,The Camp,19.077917,72.865643,Asian Restaurant
2,400029,19.07967,72.867897,Smokin JOES Pizza,19.076598,72.867765,Pizza Place
3,400029,19.07967,72.867897,S l electronics,19.077582,72.864725,Electronics Store
4,400065,19.155576,72.884959,Aarey Milk Colony,19.155678,72.8844,Farm


In [48]:
print(Mumbai_venues.shape)

(957, 7)


In [49]:
#Venues per Neighborhood
Mumbai_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
400001,19,19,19,19,19,19
400002,5,5,5,5,5,5
400003,18,18,18,18,18,18
400004,7,7,7,7,7,7
400005,12,12,12,12,12,12
400006,4,4,4,4,4,4
400007,26,26,26,26,26,26
400008,10,10,10,10,10,10
400009,5,5,5,5,5,5
400010,1,1,1,1,1,1


In [50]:
# Find No of distinct venues & categories.
print('There are {} distinct venues in {} categories.'.format(
    len(Mumbai_venues['Venue'].unique()),len(Mumbai_venues['Venue Category'].unique())))

There are 833 distinct venues in 162 categories.


In [51]:
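# one hot encoding
Mumbai_onehot = pd.get_dummies(Mumbai_venues[['Venue Category']], prefix="", prefix_sep="")

# add the neighborhood (pincode) column back to the dataframe
# note: 'Neighborhood' is also one of the venue categories in the output, so this
# assignment overwrites that dummy column instead of appending a new last column
Mumbai_onehot['Neighborhood'] = Mumbai_venues['Neighborhood'] 

# move the last column to the front (intended to be 'Neighborhood'; in the output it is 'Zoo')
fixed_columns = [Mumbai_onehot.columns[-1]] + list(Mumbai_onehot.columns[:-1])
Mumbai_onehot = Mumbai_onehot[fixed_columns]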

Mumbai_onehot.head(10)

Unnamed: 0,Zoo,Afghan Restaurant,Airport,Airport Lounge,Airport Terminal,American Restaurant,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Beach,Bed & Breakfast,Beer Garden,Bengali Restaurant,Bike Rental / Bike Share,Bistro,Boat or Ferry,Bookstore,Boutique,Breakfast Spot,Brewery,Building,Burger Joint,Bus Station,Café,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,College Academic Building,College Auditorium,College Gym,Comfort Food Restaurant,Concert Hall,Convenience Store,Convention Center,Cosmetics Shop,Cricket Ground,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Diner,Donut Shop,Electronics Store,Farm,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Flea Market,Flower Shop,Food,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Gaming Cafe,Garden,Gastropub,General Entertainment,German Restaurant,Gift Shop,Golf Course,Gourmet Shop,Government Building,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Historic Site,History Museum,Hockey Arena,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indian Sweet Shop,Indie Movie Theater,Indoor Play Area,Intersection,Irani Cafe,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Light Rail Station,Liquor Store,Lounge,Market,Mediterranean Restaurant,Men's Store,Middle Eastern Restaurant,Miscellaneous Shop,Monument / Landmark,Motel,Movie Theater,Mughlai Restaurant,Multicuisine Indian Restaurant,Multiplex,Music Venue,Neighborhood,Nightclub,Noodle House,Office,Other Great Outdoors,Outdoors & Recreation,Park,Pedestrian Plaza,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Platform,Playground,Plaza,Pool,Pub,Racetrack,Recording Studio,Recreation Center,Resort,Rest Area,Restaurant,Salon / Barbershop,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shopping Mall,Smoke Shop,Snack Place,Soccer Field,South Indian Restaurant,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Tea Room,Tennis Court,Thai Restaurant,Theater,Train Station,Travel & Transport,Vegetarian / Vegan Restaurant,Whisky Bar,Wine Bar,Women's Store,Yoga Studio
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,400029,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,400029,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,400029,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,400029,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,400065,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,400065,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,400065,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,400065,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,400065,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,400065,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [52]:
Mumbai_onehot.shape

(957, 162)
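
The shape `(957, 162)` matches the 162 venue categories exactly, which suggests the pincode column replaced a dummy column of the same name rather than being added. A minimal sketch of one way to avoid such a clash follows, using invented rows; the column name `Pincode` is an illustrative choice, not the notebook's.

```python
import pandas as pd

# Toy venues table (hypothetical rows); one venue has the category 'Neighborhood'
venues = pd.DataFrame({
    'Neighborhood': [400001, 400001, 400002],
    'Venue Category': ['Bakery', 'Neighborhood', 'Zoo'],
})

# One-hot encode the venue categories
onehot = pd.get_dummies(venues[['Venue Category']], prefix='', prefix_sep='')

# Insert the identifier as the first column under a name that cannot
# clash with a category ('Pincode' is an assumption for this sketch)
onehot.insert(0, 'Pincode', venues['Neighborhood'])

print(onehot.columns.tolist())
print(onehot.shape)
```

`DataFrame.insert` raises an error if the column name already exists, so a clash is reported instead of silently overwriting a category.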

In [53]:
Mumbai_grouped = Mumbai_onehot.groupby('Neighborhood').mean().reset_index()
Mumbai_grouped

Unnamed: 0,Neighborhood,Zoo,Afghan Restaurant,Airport,Airport Lounge,Airport Terminal,American Restaurant,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Beach,Bed & Breakfast,Beer Garden,Bengali Restaurant,Bike Rental / Bike Share,Bistro,Boat or Ferry,Bookstore,Boutique,Breakfast Spot,Brewery,Building,Burger Joint,Bus Station,Café,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,College Academic Building,College Auditorium,College Gym,Comfort Food Restaurant,Concert Hall,Convenience Store,Convention Center,Cosmetics Shop,Cricket Ground,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Diner,Donut Shop,Electronics Store,Farm,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Flea Market,Flower Shop,Food,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Gaming Cafe,Garden,Gastropub,General Entertainment,German Restaurant,Gift Shop,Golf Course,Gourmet Shop,Government Building,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Historic Site,History Museum,Hockey Arena,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indian Sweet Shop,Indie Movie Theater,Indoor Play Area,Intersection,Irani Cafe,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Light Rail Station,Liquor Store,Lounge,Market,Mediterranean Restaurant,Men's Store,Middle Eastern Restaurant,Miscellaneous Shop,Monument / Landmark,Motel,Movie Theater,Mughlai Restaurant,Multicuisine Indian Restaurant,Multiplex,Music Venue,Nightclub,Noodle House,Office,Other Great Outdoors,Outdoors & Recreation,Park,Pedestrian Plaza,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Platform,Playground,Plaza,Pool,Pub,Racetrack,Recording Studio,Recreation Center,Resort,Rest Area,Restaurant,Salon / Barbershop,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shopping Mall,Smoke Shop,Snack Place,Soccer Field,South Indian Restaurant,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Tea Room,Tennis Court,Thai Restaurant,Theater,Train Station,Travel & Transport,Vegetarian / Vegan Restaurant,Whisky Bar,Wine Bar,Women's Store,Yoga Studio
0,400001,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.105263,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.052632,0.0,0.0,0.210526,0.0,0.0,0.0,0.0,0.105263,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0
1,400002,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0
2,400003,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.055556,0.0,0.055556,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,400004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0
4,400005,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.083333,0.083333,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,400006,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,400007,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.038462,0.0,0.0,0.0,0.038462,0.0,0.0,0.076923,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.038462,0.0,0.0,0.0,0.0,0.076923,0.0,0.038462,0.076923,0.0,0.0,0.0,0.0,0.0,0.038462,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0
7,400008,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.1,0.0,0.1,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,400009,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,400010,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [54]:
num_top_venues = 10
for hood in Mumbai_grouped['Neighborhood']:
    #print("----"+hood+"----")
    temp = Mumbai_grouped[Mumbai_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

                  venue  freq
0     Indian Restaurant  0.21
1            Irani Cafe  0.11
2                   Bar  0.11
3                 Hotel  0.05
4        Sandwich Place  0.05
5            Food Truck  0.05
6                  Café  0.05
7    Chinese Restaurant  0.05
8                Lounge  0.05
9  Fast Food Restaurant  0.05


                 venue  freq
0    Indian Restaurant   0.4
1        Movie Theater   0.2
2        Train Station   0.2
3                 Café   0.2
4                  Zoo   0.0
5            Multiplex   0.0
6   Miscellaneous Shop   0.0
7  Monument / Landmark   0.0
8                Motel   0.0
9   Mughlai Restaurant   0.0


               venue  freq
0  Indian Restaurant  0.33
1          Juice Bar  0.11
2         Smoke Shop  0.06
3       Dessert Shop  0.06
4  Indian Sweet Shop  0.06
5     Sandwich Place  0.06
6  Convenience Store  0.06
7               Café  0.06
8          BBQ Joint  0.06
9         Restaurant  0.06


                           venue  freq
0        

In [55]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
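
To make the ranking step concrete, here is a small sketch that applies the same function to a single invented row shaped like one row of `Mumbai_grouped`. The frequencies are hypothetical and chosen without ties so the order is unambiguous.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # Skip the first entry (the neighborhood identifier), then rank by frequency
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Toy row (hypothetical frequencies)
row = pd.Series({
    'Neighborhood': 400001,
    'Bar': 0.11,
    'Indian Restaurant': 0.21,
    'Zoo': 0.0,
    'Hotel': 0.05,
})

top3 = return_most_common_venues(row, 3)
print(list(top3))
```

When a neighborhood has fewer distinct categories than `num_top_venues`, the remaining slots are filled with categories whose frequency is 0.0, so the lower-ranked "most common" venues for sparse pincodes should not be read as real venues.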

In [56]:
# Build the dataset of the ten most common venues per neighborhood
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the third use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Mumbai_grouped['Neighborhood']

for ind in np.arange(Mumbai_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Mumbai_grouped.iloc[ind, :], num_top_venues)

print(neighborhoods_venues_sorted.shape)    
neighborhoods_venues_sorted

(64, 11)


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,400001,Indian Restaurant,Irani Cafe,Bar,Hotel,Sandwich Place,Fast Food Restaurant,Seafood Restaurant,Food Truck,Café,Chinese Restaurant
1,400002,Indian Restaurant,Train Station,Movie Theater,Café,Yoga Studio,Diner,Field,Fast Food Restaurant,Farmers Market,Farm
2,400003,Indian Restaurant,Juice Bar,Convenience Store,Chinese Restaurant,Restaurant,Rest Area,Smoke Shop,Café,BBQ Joint,Indian Sweet Shop
3,400004,Indian Restaurant,Snack Place,Vegetarian / Vegan Restaurant,Fast Food Restaurant,Coffee Shop,Electronics Store,Yoga Studio,Diner,Field,Farmers Market
4,400005,Pizza Place,Italian Restaurant,Snack Place,Bar,Spa,Chinese Restaurant,Ice Cream Shop,Hotel,Indian Restaurant,Thai Restaurant
5,400006,Gym,Dessert Shop,Park,Restaurant,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,Donut Shop,Diner
6,400007,Restaurant,Fast Food Restaurant,Electronics Store,Chinese Restaurant,Bakery,Lounge,Salon / Barbershop,Bookstore,Farmers Market,Bus Station
7,400008,Donut Shop,Fast Food Restaurant,Cupcake Shop,Pizza Place,Middle Eastern Restaurant,Music Venue,Arts & Crafts Store,Indian Restaurant,Dessert Shop,Bakery
8,400009,Indian Restaurant,Harbor / Marina,Sandwich Place,Furniture / Home Store,Diner,Field,Fast Food Restaurant,Farmers Market,Farm,Electronics Store
9,400010,Government Building,Yoga Studio,Dessert Shop,Field,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,Donut Shop,Diner


In [57]:
neighborhoods_venues_sorted.iloc[11,]

Neighborhood                            400012
1st Most Common Venue        Indian Restaurant
2nd Most Common Venue     Gym / Fitness Center
3rd Most Common Venue       Chinese Restaurant
4th Most Common Venue                  Stadium
5th Most Common Venue         Recording Studio
6th Most Common Venue              Bus Station
7th Most Common Venue                      Bar
8th Most Common Venue               Playground
9th Most Common Venue      Sporting Goods Shop
10th Most Common Venue              Restaurant
Name: 11, dtype: object

In [79]:
cols = df_Mumbai.columns.tolist()
n = int(cols.index('Location'))
cols = [cols[n]] + cols[:n] + cols[n+1:]
df_Mumbai = df_Mumbai[cols]
df_Mumbai.shape

(63, 6)

## Clustering Mumbai

In [64]:
# import k-means for the clustering stage
from sklearn.cluster import KMeans

# set number of clusters
kclusters = 5

# drop the text columns; note that only 'Pincode' remains afterwards, so k-means here
# groups the rows of Mumbai_df by pincode value rather than by venue frequencies
Mumbai_grouped_clustering = Mumbai_df.drop('Location', axis=1)
Mumbai_grouped_cluster = Mumbai_grouped_clustering.drop('State', axis=1)
Mumbai_grouped_clusters = Mumbai_grouped_cluster.drop('District', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=2).fit(Mumbai_grouped_clusters)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([0, 1, 2, 3, 4, 1, 1, 1, 0, 4, 1, 4, 1, 4, 1, 1, 3, 4, 2, 4, 0, 0,
       3, 1, 3, 2, 2, 0, 3, 2, 1, 4, 4, 4, 2, 2, 4, 0, 0, 0, 2, 2, 1, 1,
       1, 1, 2, 2, 2, 2, 0, 4, 4, 0, 1, 1, 1, 1, 0, 4, 1, 2, 2, 0, 1, 0,
       4, 1, 3, 3, 1, 4, 2, 1, 3, 1, 0, 4, 4, 3, 1, 2, 1, 1, 1, 3, 1, 0,
       0, 2, 1, 4, 4, 1, 4, 1, 2, 2, 2, 4, 1, 3, 1, 3, 4, 0, 2, 1, 1, 4,
       2, 2, 2, 2, 2, 3, 4, 4, 4, 1, 1, 2, 2, 0, 2, 4, 4, 4, 1, 3, 2, 2,
       4, 0, 4, 0, 1, 4, 0, 3, 0, 1, 0, 4, 4, 1, 3, 3, 1, 0, 1, 1, 0, 3,
       2, 1, 0, 4, 4, 4, 0, 4, 4, 4, 0, 0, 2, 1, 4, 1, 1, 3, 1, 1, 1, 0,
       0, 2, 0, 2, 0, 0], dtype=int32)
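
The 182 labels above correspond to the rows of `Mumbai_df`, whose only numeric column is `Pincode`. As an alternative, and not what the notebook ran, the sketch below clusters on venue-frequency features instead, which is the role a table like `Mumbai_grouped` would play. The toy frame and its values are invented, and `n_clusters=2` is chosen only to fit the toy data.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Toy stand-in for a grouped venue-frequency table (hypothetical values)
grouped = pd.DataFrame({
    'Neighborhood': [400001, 400002, 400003, 400004],
    'Indian Restaurant': [0.9, 0.8, 0.1, 0.0],
    'Gym': [0.1, 0.2, 0.9, 1.0],
})

# Cluster on the frequency columns only, not on the identifier
features = grouped.drop(columns=['Neighborhood'])
kmeans = KMeans(n_clusters=2, random_state=2, n_init=10).fit(features)

# Attach one label per neighborhood
grouped['Cluster Labels'] = kmeans.labels_
print(grouped[['Neighborhood', 'Cluster Labels']])
```

With this layout there is one label per pincode, so the labels can be joined to the locations table on `Pincode` instead of being assigned row by row.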

In [122]:
df_Mumbai=df_Mumbai.drop(columns=['Pincode','State','District'])
df_Mumbai.head()

Unnamed: 0,Location,Latitude,Longitude
0,A I staff colony,19.07967,72.867897
1,Aareymilk Colony,19.155576,72.884959
2,Agripada,18.981045,72.826758
3,Airport,19.092927,72.865377
4,Ambewadi,18.957988,72.82144


In [124]:
Mumbai_merged = Mumbai_df

# add clustering labels
Mumbai_merged['Cluster Labels'] = kmeans.labels_

# join the top-venue table on pincode; latitude/longitude are added by the second join below
Mumbai_merged = Mumbai_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Pincode')

print(Mumbai_merged.shape)
Mumbai_merged.head() # check the last columns!
Mumbai_merged = Mumbai_merged.join(df_Mumbai.set_index('Location'), on='Location')
Mumbai_merged.head()

(182, 15)


Unnamed: 0,Location,Pincode,State,District,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Latitude,Longitude
0,A I staff colony,400029,Maharashtra,Mumbai,0,Asian Restaurant,Chinese Restaurant,Pizza Place,Electronics Store,Field,Fast Food Restaurant,Farmers Market,Farm,Donut Shop,Diner,19.07967,72.867897
1,Aareymilk Colony,400065,Maharashtra,Mumbai,1,Café,Gym / Fitness Center,Farm,Golf Course,Hotel,Restaurant,Resort,Yoga Studio,Farmers Market,Electronics Store,19.155576,72.884959
2,Agripada,400011,Maharashtra,Mumbai,2,Indian Restaurant,Gym,Racetrack,Restaurant,Coffee Shop,Bakery,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,18.981045,72.826758
3,Airport,400099,Maharashtra,Mumbai,3,Airport,Airport Lounge,Bakery,Coffee Shop,Yoga Studio,Fish & Chips Shop,Field,Fast Food Restaurant,Farmers Market,Farm,19.092927,72.865377
4,Ambewadi,400004,Maharashtra,Mumbai,4,Indian Restaurant,Snack Place,Vegetarian / Vegan Restaurant,Fast Food Restaurant,Coffee Shop,Electronics Store,Yoga Studio,Diner,Field,Farmers Market,18.957988,72.82144


In [140]:

#Mumbai_merged.dropna()
#Mumbai_merged.drop(Mumbai_merged[Mumbai_merged.Location == 'nan'].index, inplace=True)
#Mumbai_merged.dropna(how='all')
Mumbai_merged.dropna(inplace=True)


## Cluster Visualization

In [141]:
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

In [144]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one hex color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]
print(rainbow)

# add one circle marker per location, colored by its cluster label
for lat, lon, pincode, cluster in zip(Mumbai_merged['Latitude'], Mumbai_merged['Longitude'], Mumbai_merged['Pincode'], Mumbai_merged['Cluster Labels']):
    label = folium.Popup(str(pincode) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster) - 1],  # label 0 wraps to the last palette entry
        fill=True,
        fill_color=rainbow[int(cluster) - 1],
        fill_opacity=0.7).add_to(map_clusters)

map_clusters

['#8000ff', '#00b5eb', '#80ffb4', '#ffb360', '#ff0000']
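
The markers look up their color with `cluster - 1`. Because the cluster labels in the table start at 0, label 0 wraps around to the last palette entry through Python's negative indexing. Every cluster still gets a distinct color, but the order is shifted by one. A minimal sketch using the palette printed above compares the shifted lookup with direct 0-based indexing. The helper name `color_for` is an assumption, not part of the notebook.

```python
# Palette printed by the notebook cell above (five clusters).
rainbow = ['#8000ff', '#00b5eb', '#80ffb4', '#ffb360', '#ff0000']


def color_for(cluster, palette=rainbow):
    """Hypothetical helper: map a 0-based cluster label to a hex color."""
    return palette[int(cluster)]


labels = [0, 1, 2, 3, 4]

# Lookup as written in the notebook: label 0 becomes index -1 (the last color).
shifted = [rainbow[c - 1] for c in labels]

# Direct lookup: label i gets palette entry i.
direct = [color_for(c) for c in labels]

for lab, s, d in zip(labels, shifted, direct):
    print(lab, s, d)
```

Either mapping works for telling clusters apart. Direct indexing has the advantage that a map legend built from `rainbow` lists the colors in label order.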


## Clustering of Mumbai

### Mumbai Cluster 1

In [145]:
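# Keep Pincode, Location, and every column from 'Cluster Labels' onward (State and District are dropped)
Mumbai_cluster_0 = Mumbai_merged.loc[Mumbai_merged['Cluster Labels'] == 0, Mumbai_merged.columns[[1] + [0] + list(range(4, Mumbai_merged.shape[1]))]]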

Mumbai_cluster_0.shape

(10, 15)
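
The column selector `[1] + [0] + list(range(4, n))` is positional. A minimal sketch of what it picks follows, assuming the `Unnamed: 0` field in the exported tables is the DataFrame index, so `Location` sits at position 0. The ten venue columns are left out here for brevity.

```python
import pandas as pd

# Column names in the order shown in the merged table (venue columns omitted).
columns = pd.Index([
    "Location", "Pincode", "State", "District",
    "Cluster Labels", "Latitude", "Longitude",
])

# Position 1 first, then position 0, then everything from position 4 onward.
positions = [1] + [0] + list(range(4, len(columns)))
selected = list(columns[positions])
print(selected)
```

Positions 2 and 3 (`State` and `District`) are skipped, which accounts for the 15 columns reported by `.shape`: the 17 columns of the merged table minus those two.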

In [146]:
Mumbai_cluster_0

Unnamed: 0,Pincode,Location,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Latitude,Longitude
0,400029,A I staff colony,0,Asian Restaurant,Chinese Restaurant,Pizza Place,Electronics Store,Field,Fast Food Restaurant,Farmers Market,Farm,Donut Shop,Diner,19.07967,72.867897
8,400037,Antop Hill,0,Historic Site,Tea Room,Smoke Shop,Convenience Store,Gourmet Shop,Fish & Chips Shop,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,19.025748,72.869694
20,400028,Bhawani Shankar,0,Indian Restaurant,Ice Cream Shop,Café,Movie Theater,Chinese Restaurant,Bar,Bakery,Breakfast Spot,Italian Restaurant,Plaza,19.021722,72.837479
27,400030,Century Mill,0,Bakery,Chinese Restaurant,Café,Coffee Shop,Flea Market,Field,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,19.01217,72.821863
37,400033,Cotton Exchange,0,Department Store,Whisky Bar,Train Station,Dessert Shop,Field,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,Donut Shop,18.985633,72.844163
38,400026,Cumballa Hill,0,Racetrack,Sandwich Place,Pizza Place,Bakery,Café,Stadium,Department Store,Hotel,Men's Store,Food Truck,18.970642,72.809483
63,400034,Hajiali,0,Chinese Restaurant,Fast Food Restaurant,Bengali Restaurant,Gym,Sandwich Place,Vegetarian / Vegan Restaurant,Juice Bar,Italian Restaurant,Electronics Store,Food Court,18.972416,72.814818
65,400032,High Court bulding,0,Indian Restaurant,Café,Fast Food Restaurant,Coffee Shop,Bookstore,Nightclub,Hotel,Chinese Restaurant,Clothing Store,Asian Restaurant,18.929005,72.830131
87,400031,Kidwai Nagar,0,Convention Center,Movie Theater,Gym,Cupcake Shop,Bus Station,Café,Bakery,Field,Fast Food Restaurant,Farmers Market,19.013289,72.854898
123,400025,New Prabhadevi road,0,Indian Restaurant,Café,Seafood Restaurant,Pizza Place,Sandwich Place,Convenience Store,Electronics Store,Smoke Shop,Bakery,Chinese Restaurant,19.015831,72.829392


### Mumbai Cluster 2

In [148]:
Mumbai_cluster_1 = Mumbai_merged.loc[Mumbai_merged['Cluster Labels'] == 1, Mumbai_merged.columns[[1] + [0] + list(range(4,Mumbai_merged.shape[1]))]]

Mumbai_cluster_1.shape

(20, 15)

In [149]:
Mumbai_cluster_1

Unnamed: 0,Pincode,Location,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Latitude,Longitude
1,400065,Aareymilk Colony,1,Café,Gym / Fitness Center,Farm,Golf Course,Hotel,Restaurant,Resort,Yoga Studio,Farmers Market,Electronics Store,19.155576,72.884959
5,400053,Andheri,1,Multiplex,Fast Food Restaurant,Restaurant,Cocktail Bar,Café,Indian Restaurant,Asian Restaurant,Lounge,Hotel,Pizza Place,19.112105,72.861073
6,400069,Andheri East,1,Hotel,Diner,Asian Restaurant,Restaurant,Dessert Shop,Field,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,19.117753,72.873702
7,400058,Andheri Railway station,1,Electronics Store,Gym / Fitness Center,Convenience Store,Smoke Shop,Gym,Snack Place,Pizza Place,Chinese Restaurant,Sandwich Place,Bar,19.122735,72.830201
12,400051,B.N. bhavan,1,Gym,Café,Pizza Place,Food Court,Restaurant,Farmers Market,Farm,Electronics Store,Donut Shop,Diner,19.05961,72.855298
14,400050,Bandra West,1,Café,Indian Restaurant,Bar,Chinese Restaurant,Bakery,Asian Restaurant,Pizza Place,Gourmet Shop,Snack Place,Burger Joint,19.05517,72.829952
23,400066,Borivali East,1,Indian Restaurant,Juice Bar,Asian Restaurant,Scenic Lookout,Chinese Restaurant,Indie Movie Theater,Bakery,Park,Gym / Fitness Center,Sandwich Place,19.231345,72.86353
30,400067,Charkop,1,Gym / Fitness Center,Coffee Shop,Pool,Bike Rental / Bike Share,Cupcake Shop,Donut Shop,Fish & Chips Shop,Field,Fast Food Restaurant,Farmers Market,19.207237,72.834822
44,400052,Danda,1,Indian Restaurant,Lounge,Coffee Shop,Asian Restaurant,Italian Restaurant,Bar,Comfort Food Restaurant,Donut Shop,Clothing Store,Chinese Restaurant,19.071708,72.834104
54,400062,Goregaon,1,Gym / Fitness Center,Ice Cream Shop,Multiplex,Fast Food Restaurant,Sandwich Place,Italian Restaurant,Gym,German Restaurant,Design Studio,Farmers Market,19.15934,72.842686


### Mumbai Cluster 3

In [150]:
Mumbai_cluster_2 = Mumbai_merged.loc[Mumbai_merged['Cluster Labels'] == 2, Mumbai_merged.columns[[1] + [0] + list(range(4,Mumbai_merged.shape[1]))]]

Mumbai_cluster_2.shape

(10, 15)

In [151]:
Mumbai_cluster_2

Unnamed: 0,Pincode,Location,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Latitude,Longitude
2,400011,Agripada,2,Indian Restaurant,Gym,Racetrack,Restaurant,Coffee Shop,Bakery,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,18.981045,72.826758
18,400012,Best Staff colony,2,Indian Restaurant,Gym / Fitness Center,Chinese Restaurant,Stadium,Recording Studio,Bus Station,Bar,Playground,Sporting Goods Shop,Restaurant,18.999541,72.839709
25,400013,C G s colony,2,Café,Indian Restaurant,Clothing Store,Pizza Place,Shopping Mall,Dessert Shop,Asian Restaurant,Cosmetics Shop,Seafood Restaurant,Indoor Play Area,18.998173,72.827469
26,400020,Central Building,2,Fast Food Restaurant,Ice Cream Shop,Cricket Ground,Café,Indian Restaurant,Movie Theater,Train Station,Bakery,Hotel,Restaurant,18.935118,72.826438
40,400014,Dadar,2,Indian Restaurant,Coffee Shop,Restaurant,Snack Place,Farmers Market,Café,Hotel,Juice Bar,Vegetarian / Vegan Restaurant,Lounge,19.014796,72.845453
47,400017,Dharavi,2,Tea Room,Asian Restaurant,Dance Studio,Fish & Chips Shop,Field,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,Donut Shop,19.045592,72.853706
49,400010,Dockyard Road,2,Government Building,Yoga Studio,Dessert Shop,Field,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,Donut Shop,Diner,18.970188,72.845962
81,400016,Kapad Bazar,2,Indian Restaurant,Café,Breakfast Spot,Intersection,Fast Food Restaurant,Pizza Place,Hotel Bar,Coffee Shop,Dessert Shop,Southern / Soul Food Restaurant,19.038946,72.842055
110,400019,Matunga Railway workshop,2,Indian Restaurant,Ice Cream Shop,Café,Snack Place,Bar,Fast Food Restaurant,South Indian Restaurant,Juice Bar,Jewelry Store,Market,19.026901,72.855377
122,400021,Nariman Point,2,Theater,Italian Restaurant,Indian Restaurant,Hotel,Chaat Place,Mediterranean Restaurant,Café,Restaurant,Pizza Place,Dessert Shop,18.925573,72.824222


### Mumbai Cluster 4

In [152]:
Mumbai_cluster_3 = Mumbai_merged.loc[Mumbai_merged['Cluster Labels'] == 3, Mumbai_merged.columns[[1] + [0] + list(range(4,Mumbai_merged.shape[1]))]]

Mumbai_cluster_3.shape

(10, 15)

In [153]:

Mumbai_cluster_3

Unnamed: 0,Pincode,Location,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Latitude,Longitude
3,400099,Airport,3,Airport,Airport Lounge,Bakery,Coffee Shop,Yoga Studio,Fish & Chips Shop,Field,Fast Food Restaurant,Farmers Market,Farm,19.092927,72.865377
16,400090,Bangur Nagar,3,Smoke Shop,Food Truck,Soccer Field,Seafood Restaurant,Cricket Ground,Cupcake Shop,Fish & Chips Shop,Convention Center,Field,Fast Food Restaurant,19.162944,72.835301
22,400091,Borivali,3,Gym,Ice Cream Shop,Restaurant,Fast Food Restaurant,Food,Liquor Store,Gift Shop,Dessert Shop,Farmers Market,Grocery Store,19.234037,72.839949
28,400093,Chakala Midc,3,Ice Cream Shop,Yoga Studio,Sports Bar,Hotel,Indian Restaurant,Market,Coffee Shop,Chinese Restaurant,Restaurant,Scenic Lookout,19.128274,72.867709
68,400095,Ins Hamla,3,Chinese Restaurant,Market,Yoga Studio,Diner,Fish & Chips Shop,Field,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,19.188802,72.819704
74,400102,Jogeshwari West,3,Indian Restaurant,Chinese Restaurant,Pub,Brewery,Multiplex,Sports Bar,Building,Italian Restaurant,Seafood Restaurant,Shopping Mall,19.141896,72.834424
79,400101,Kandivali East,3,Seafood Restaurant,Pizza Place,Restaurant,Food Truck,Indian Restaurant,Dessert Shop,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,19.205859,72.86612
101,400097,Malad East,3,Coffee Shop,Indian Restaurant,Yoga Studio,Diner,Fish & Chips Shop,Field,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,19.183863,72.858432
103,400103,Mandapeshwar,3,Snack Place,Pizza Place,Burger Joint,Chinese Restaurant,Lounge,Yoga Studio,Donut Shop,Field,Fast Food Restaurant,Farmers Market,19.245015,72.845409
153,400096,Seepz,3,Restaurant,Vegetarian / Vegan Restaurant,Airport Terminal,Pizza Place,Bus Station,Yoga Studio,Diner,Fast Food Restaurant,Farmers Market,Farm,19.126657,72.876655


### Mumbai Cluster 5

In [154]:
Mumbai_cluster_4 = Mumbai_merged.loc[Mumbai_merged['Cluster Labels'] == 4, Mumbai_merged.columns[[1] + [0] + list(range(4,Mumbai_merged.shape[1]))]]

Mumbai_cluster_4.shape

(9, 15)

In [156]:

Mumbai_cluster_4

Unnamed: 0,Pincode,Location,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Latitude,Longitude
4,400004,Ambewadi,4,Indian Restaurant,Snack Place,Vegetarian / Vegan Restaurant,Fast Food Restaurant,Coffee Shop,Electronics Store,Yoga Studio,Diner,Field,Farmers Market,18.957988,72.82144
9,400005,Asvini,4,Pizza Place,Italian Restaurant,Snack Place,Bar,Spa,Chinese Restaurant,Ice Cream Shop,Hotel,Indian Restaurant,Thai Restaurant,18.910369,72.819758
11,400003,B P t colony,4,Indian Restaurant,Juice Bar,Convenience Store,Chinese Restaurant,Restaurant,Rest Area,Smoke Shop,Café,BBQ Joint,Indian Sweet Shop,18.953193,72.835301
17,400001,Bazargate,4,Indian Restaurant,Irani Cafe,Bar,Hotel,Sandwich Place,Fast Food Restaurant,Seafood Restaurant,Food Truck,Café,Chinese Restaurant,18.938535,72.836334
19,400007,Bharat Nagar,4,Restaurant,Fast Food Restaurant,Electronics Store,Chinese Restaurant,Bakery,Lounge,Salon / Barbershop,Bookstore,Farmers Market,Bus Station,18.961841,72.813383
33,400009,Chinchbunder,4,Indian Restaurant,Harbor / Marina,Sandwich Place,Furniture / Home Store,Diner,Field,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,18.958296,72.838941
51,400008,Falkland Road,4,Donut Shop,Fast Food Restaurant,Cupcake Shop,Pizza Place,Middle Eastern Restaurant,Music Venue,Arts & Crafts Store,Indian Restaurant,Dessert Shop,Bakery,18.96714,72.828657
77,400002,Kalbadevi,4,Indian Restaurant,Train Station,Movie Theater,Café,Yoga Studio,Diner,Field,Fast Food Restaurant,Farmers Market,Farm,18.948367,72.825936
99,400006,Malabar Hill,4,Gym,Dessert Shop,Park,Restaurant,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,Donut Shop,Diner,18.954086,72.800448
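
The five `.shape` outputs above can be gathered into one summary instead of being read off cell by cell. This minimal sketch rebuilds a label column from the reported row counts as a stand-in for `Mumbai_merged['Cluster Labels']`, then counts the clusters with `value_counts`.

```python
import pandas as pd

# Row counts reported by the .shape outputs above, keyed by cluster label.
reported = {0: 10, 1: 20, 2: 10, 3: 10, 4: 9}

# Stand-in for Mumbai_merged['Cluster Labels'], built from those counts.
labels = pd.Series(
    [lab for lab, n in reported.items() for _ in range(n)],
    name="Cluster Labels",
)

# Cluster sizes in label order, plus each cluster's share of all locations.
sizes = labels.value_counts().sort_index()
share = (sizes / sizes.sum()).round(2)
print(sizes.to_dict())
print(share.to_dict())
```

On the real frame, the same two lines applied to `Mumbai_merged['Cluster Labels']` would give the sizes directly.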
