# Mumbai Bar Hopper #

## Introduction ##

Mumbai, India, has an ever-growing nightlife, with bars and restaurants opening at a rapid pace. It has become almost impossible to know which places are trending or highly rated, and which have deteriorated over time or through overcrowding.

I would like to leverage Foursquare location data, along with geospatial information about Mumbai neighborhoods, to map the bars and pubs around each location. The target stakeholders are:

1. Individuals looking for bars and pubs near them, including the ones that are currently trending.

2. Individuals looking for the neighborhood with the most bars for bar hopping.

3. Business owners looking for the right neighborhood in which to open a new bar or pub.

## Data ##

To give these stakeholders the information they need, I will scrape Mumbai's location data by zip code (using BeautifulSoup) and geocode each location to a latitude and longitude. This geospatial data will be passed to my Foursquare calls, which search for all venues within a given radius. The venues will be sorted by rating (Foursquare likes) and zip code frequency. I will also cluster the data so that individuals can see the frequency of bars and pubs by neighborhood, and by rating within neighborhoods.

Mumbai's location data by zip code is available at the following link:

https://www.mapsofindia.com/pincode/india/maharashtra/mumbai/

The following data points will be extracted from the Foursquare API for each venue by zip code:

1. Venue Name

2. Unique ID

3. Geospatial Location

4. Number of Likes

5. Trending Status

## Methodology

### Data Cleaning

In [102]:
# Import Appropriate Libraries
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
from pandas import DataFrame

import numpy as np
from bs4 import BeautifulSoup
import requests
import io
import json # library to handle JSON files
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe
import hmac

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library

!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert address into latitude and longitude values

print('Libraries Imported.')

Solving environment: done

# All requested packages already installed.

Solving environment: done

# All requested packages already installed.

Libraries Imported.


In [103]:
#Get Zipcode Data of Neighbourhoods from the source URL
url='https://www.mapsofindia.com/pincode/india/maharashtra/mumbai/'
html = requests.get(url).content
df_list = pd.read_html(html)
df = df_list[-1]
print(df)

               Pincode Details Pincode Details.1 Pincode Details.2  \
0                     Location           Pincode             State   
1             A I staff colony            400029       Maharashtra   
2             Aareymilk Colony            400065       Maharashtra   
3                     Agripada            400011       Maharashtra   
4                      Airport            400099       Maharashtra   
5                     Ambewadi            400004       Maharashtra   
6                      Andheri            400053       Maharashtra   
7                 Andheri East            400069       Maharashtra   
8      Andheri Railway station            400058       Maharashtra   
9                   Antop Hill            400037       Maharashtra   
10                      Asvini            400005       Maharashtra   
11                  Azad Nagar            400053       Maharashtra   
12                B P t colony            400003       Maharashtra   
13                 B

In [104]:
# Replacing the Current Header with the first row
new_header = df.iloc[0] 
df = df[1:] 
df.columns = new_header
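
The header promotion above can be wrapped in a small helper. The following is a minimal sketch on a hand-made stand-in for the scraped table; the helper name `promote_first_row` and the fourth placeholder column name are assumptions for illustration. Unlike the cell above, it also resets the index so rows start at 0.

```python
import pandas as pd

# Toy stand-in for the scraped table: the real header arrives as row 0.
# The placeholder column names mimic the printed output; the fourth is assumed.
raw = pd.DataFrame(
    [
        ["Location", "Pincode", "State", "District"],
        ["Agripada", "400011", "Maharashtra", "Mumbai"],
        ["Andheri", "400053", "Maharashtra", "Mumbai"],
    ],
    columns=["Pincode Details", "Pincode Details.1",
             "Pincode Details.2", "Pincode Details.3"],
)


def promote_first_row(frame):
    """Return a copy whose column names are taken from the first data row."""
    promoted = frame.iloc[1:].copy()
    promoted.columns = frame.iloc[0].tolist()
    return promoted.reset_index(drop=True)


clean = promote_first_row(raw)
print(clean.columns.tolist())
```

Working on a copy keeps later column assignments from touching the original scraped frame.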

In [105]:
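# Rearranging the columns, ready for Geopy
# .copy() makes this an independent frame, so adding columns later does not raise SettingWithCopyWarning
mumbaizip_df = df[['Location','District','State','Pincode']].copy()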
mumbaizip_df

Unnamed: 0,Location,District,State,Pincode
1,A I staff colony,Mumbai,Maharashtra,400029
2,Aareymilk Colony,Mumbai,Maharashtra,400065
3,Agripada,Mumbai,Maharashtra,400011
4,Airport,Mumbai,Maharashtra,400099
5,Ambewadi,Mumbai,Maharashtra,400004
6,Andheri,Mumbai,Maharashtra,400053
7,Andheri East,Mumbai,Maharashtra,400069
8,Andheri Railway station,Mumbai,Maharashtra,400058
9,Antop Hill,Mumbai,Maharashtra,400037
10,Asvini,Mumbai,Maharashtra,400005


In [106]:
mumbaizip_df.shape

(182, 4)

In [107]:
#Combining the four columns, separated by commas, into a single address string for geocoding
df = DataFrame(mumbaizip_df, columns= ['Location', 'District','State','Pincode']) 

df1 = mumbaizip_df['Location'].map(str) + ', ' + mumbaizip_df['District'].map(str) + ', ' + mumbaizip_df['State'].map(str) + ', ' + mumbaizip_df['Pincode'].map(str)

#print (df1)

df2 = df1.to_frame()
df2.columns = ['Address']
mumbaizip_df['Address'] = df2['Address']
df2

Unnamed: 0,Address
1,"A I staff colony, Mumbai, Maharashtra, 400029"
2,"Aareymilk Colony, Mumbai, Maharashtra, 400065"
3,"Agripada, Mumbai, Maharashtra, 400011"
4,"Airport, Mumbai, Maharashtra, 400099"
5,"Ambewadi, Mumbai, Maharashtra, 400004"
6,"Andheri, Mumbai, Maharashtra, 400053"
7,"Andheri East, Mumbai, Maharashtra, 400069"
8,"Andheri Railway station, Mumbai, Maharashtra, ..."
9,"Antop Hill, Mumbai, Maharashtra, 400037"
10,"Asvini, Mumbai, Maharashtra, 400005"


In [108]:
geolocator = Nominatim(user_agent='Mumbai Bar Hopper')

In [109]:
from geopy.extra.rate_limiter import RateLimiter

In [110]:
#Getting Coordinates for each neighbourhood in Mumbai
df3 = df2

geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)

df3['Location1'] = df3['Address'].apply(geocode)
df3['Coordinates_Unsorted'] = df3['Location1'].apply(lambda loc: tuple(loc.point) if loc else None)

RateLimiter caught an error, retrying (0/2 tries). Called with (*('S V marg, Mumbai, Maharashtra, 400007',), **{}).
Traceback (most recent call last):
  File "/opt/conda/envs/Python36/lib/python3.6/urllib/request.py", line 1318, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/opt/conda/envs/Python36/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/opt/conda/envs/Python36/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/opt/conda/envs/Python36/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/opt/conda/envs/Python36/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/opt/conda/envs/Python36/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/opt/conda/envs/Python36/li
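
The traceback above shows that individual lookups can fail on network errors. As a minimal sketch (not the notebook's method), the loop below geocodes addresses one at a time, waits between calls, and records `None` when a lookup fails or returns nothing. The function `geocode_all` and the stub geocoder are hypothetical; in the notebook the `geocode` argument would be something like `geolocator.geocode`, whose results expose `latitude` and `longitude`.

```python
import time


def geocode_all(addresses, geocode, delay_seconds=1.0, sleep=time.sleep):
    """Geocode each address; map it to (lat, lon), or None on failure."""
    results = {}
    for i, address in enumerate(addresses):
        if i:
            sleep(delay_seconds)  # stay polite to the geocoding service
        try:
            location = geocode(address)
        except Exception:
            # Broad catch for the sketch; a narrower exception class
            # would be preferable in practice.
            location = None
        results[address] = (
            (location.latitude, location.longitude) if location else None
        )
    return results


# Stub geocoder so the sketch runs without network access (assumption).
class FakeLocation:
    def __init__(self, latitude, longitude):
        self.latitude = latitude
        self.longitude = longitude


def fake_geocode(address):
    if "Agripada" in address:
        return FakeLocation(18.9753024, 72.8248975)
    if "S V marg" in address:
        raise TimeoutError("simulated network failure")
    return None  # address not found


coords = geocode_all(
    [
        "Agripada, Mumbai, Maharashtra, 400011",
        "S V marg, Mumbai, Maharashtra, 400007",
        "A I staff colony, Mumbai, Maharashtra, 400029",
    ],
    fake_geocode,
    delay_seconds=0,
    sleep=lambda seconds: None,
)
print(coords)
```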

In [111]:
df4 = df3
df4

Unnamed: 0,Address,Location1,Coordinates_Unsorted
1,"A I staff colony, Mumbai, Maharashtra, 400029",,
2,"Aareymilk Colony, Mumbai, Maharashtra, 400065",,
3,"Agripada, Mumbai, Maharashtra, 400011","(Agripada, Zone 1, Mumbai, Mumbai City, Mahara...","(18.9753024, 72.8248975, 0.0)"
4,"Airport, Mumbai, Maharashtra, 400099","(New Airport Colony, K/E Ward, Zone 3, Mumbai,...","(19.1051366, 72.8558026258599, 0.0)"
5,"Ambewadi, Mumbai, Maharashtra, 400004","(Ambewadi, Zone 4, Mumbai, Mumbai Suburban, Ma...","(19.1867764, 72.8593129, 0.0)"
6,"Andheri, Mumbai, Maharashtra, 400053","(Andheri, Madhavdas Amarshi Marg, K/W Ward, Zo...","(19.1196976, 72.8464205, 0.0)"
7,"Andheri East, Mumbai, Maharashtra, 400069","(Andheri East, Zone 3, Mumbai, Mumbai Suburban...","(19.1158835, 72.854202, 0.0)"
8,"Andheri Railway station, Mumbai, Maharashtra, ...","(Andheri, Madhavdas Amarshi Marg, K/W Ward, Zo...","(19.1196976, 72.8464205, 0.0)"
9,"Antop Hill, Mumbai, Maharashtra, 400037","(Antop Hill, Mumbai, Mumbai City, Maharashtra,...","(19.0207608, 72.8652556, 0.0)"
10,"Asvini, Mumbai, Maharashtra, 400005","(INHS Asvini, Nanabhai Moos Marg, Dhobi Ghat, ...","(18.900976, 72.8159707, 0.0)"


In [112]:
df4.shape

(182, 3)

In [113]:
#Cleaning the DataFrame
df4['Location1'] = df4.Location1.astype(str)
df4.drop_duplicates(subset ="Address", keep = False, inplace = True)
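# keep only the rows that were geocoded; .copy() avoids SettingWithCopyWarning on later assignments
df4 = df4[~df4['Location1'].isin(["None"])].copy()
# dropna() returns a new DataFrame, so the result has to be assigned
df4 = df4.dropna()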
df4.reset_index(drop=True, inplace=True)
df4

Unnamed: 0,Address,Location1,Coordinates_Unsorted
0,"Agripada, Mumbai, Maharashtra, 400011","Agripada, Zone 1, Mumbai, Mumbai City, Maharas...","(18.9753024, 72.8248975, 0.0)"
1,"Airport, Mumbai, Maharashtra, 400099","New Airport Colony, K/E Ward, Zone 3, Mumbai, ...","(19.1051366, 72.8558026258599, 0.0)"
2,"Ambewadi, Mumbai, Maharashtra, 400004","Ambewadi, Zone 4, Mumbai, Mumbai Suburban, Mah...","(19.1867764, 72.8593129, 0.0)"
3,"Andheri, Mumbai, Maharashtra, 400053","Andheri, Madhavdas Amarshi Marg, K/W Ward, Zon...","(19.1196976, 72.8464205, 0.0)"
4,"Andheri East, Mumbai, Maharashtra, 400069","Andheri East, Zone 3, Mumbai, Mumbai Suburban,...","(19.1158835, 72.854202, 0.0)"
5,"Andheri Railway station, Mumbai, Maharashtra, ...","Andheri, Madhavdas Amarshi Marg, K/W Ward, Zon...","(19.1196976, 72.8464205, 0.0)"
6,"Antop Hill, Mumbai, Maharashtra, 400037","Antop Hill, Mumbai, Mumbai City, Maharashtra, ...","(19.0207608, 72.8652556, 0.0)"
7,"Asvini, Mumbai, Maharashtra, 400005","INHS Asvini, Nanabhai Moos Marg, Dhobi Ghat, A...","(18.900976, 72.8159707, 0.0)"
8,"Azad Nagar, Mumbai, Maharashtra, 400053","Azad Nagar, K/W Ward, Zone 3, Mumbai, Mumbai S...","(19.1283153, 72.8400381, 0.0)"
9,"B P t colony, Mumbai, Maharashtra, 400003","P&T Colony, Zone 3, Mumbai, Mumbai Suburban, M...","(19.101937, 72.8615987, 0.0)"


In [114]:
df4['Coordinates_Unsorted'] = df4.Coordinates_Unsorted.astype(str)

lat = []
lon = []

# For each row in the column,
for row in df4['Coordinates_Unsorted']:
    try:
        # Split the row by comma once: the first part is the latitude
        # (with a leading parenthesis), the second part is the longitude
        parts = row.split(',')
        lat_part, lon_part = parts[0], parts[1]
    except IndexError:
        # no comma in the row: record a missing value for both
        lat_part, lon_part = np.nan, np.nan
    # append exactly one value per row so lat and lon stay the same length
    lat.append(lat_part)
    lon.append(lon_part)

# Create two new columns from lat and lon
df4['Latitude'] = lat
df4['Latitude'] = df4['Latitude'].str[1:]
df4['Longitude'] = lon

df4['Latitude'] = df4.Latitude.astype(float)
df4['Longitude'] = df4.Longitude.astype(float)

df4

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  if __name__ == '__main__':

Unnamed: 0,Address,Location1,Coordinates_Unsorted,Latitude,Longitude
0,"Agripada, Mumbai, Maharashtra, 400011","Agripada, Zone 1, Mumbai, Mumbai City, Maharas...","(18.9753024, 72.8248975, 0.0)",18.975302,72.824898
1,"Airport, Mumbai, Maharashtra, 400099","New Airport Colony, K/E Ward, Zone 3, Mumbai, ...","(19.1051366, 72.8558026258599, 0.0)",19.105137,72.855803
2,"Ambewadi, Mumbai, Maharashtra, 400004","Ambewadi, Zone 4, Mumbai, Mumbai Suburban, Mah...","(19.1867764, 72.8593129, 0.0)",19.186776,72.859313
3,"Andheri, Mumbai, Maharashtra, 400053","Andheri, Madhavdas Amarshi Marg, K/W Ward, Zon...","(19.1196976, 72.8464205, 0.0)",19.119698,72.84642
4,"Andheri East, Mumbai, Maharashtra, 400069","Andheri East, Zone 3, Mumbai, Mumbai Suburban,...","(19.1158835, 72.854202, 0.0)",19.115883,72.854202
5,"Andheri Railway station, Mumbai, Maharashtra, ...","Andheri, Madhavdas Amarshi Marg, K/W Ward, Zon...","(19.1196976, 72.8464205, 0.0)",19.119698,72.84642
6,"Antop Hill, Mumbai, Maharashtra, 400037","Antop Hill, Mumbai, Mumbai City, Maharashtra, ...","(19.0207608, 72.8652556, 0.0)",19.020761,72.865256
7,"Asvini, Mumbai, Maharashtra, 400005","INHS Asvini, Nanabhai Moos Marg, Dhobi Ghat, A...","(18.900976, 72.8159707, 0.0)",18.900976,72.815971
8,"Azad Nagar, Mumbai, Maharashtra, 400053","Azad Nagar, K/W Ward, Zone 3, Mumbai, Mumbai S...","(19.1283153, 72.8400381, 0.0)",19.128315,72.840038
9,"B P t colony, Mumbai, Maharashtra, 400003","P&T Colony, Zone 3, Mumbai, Mumbai Suburban, M...","(19.101937, 72.8615987, 0.0)",19.101937,72.861599
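
The string round-trip above (cast the tuple to text, split on commas, strip the parenthesis) can be avoided by reading the tuple elements directly. A minimal sketch, assuming the column holds `(lat, lon, altitude)` tuples or `None` as in the earlier output; the helper `split_point` is a name introduced here for illustration.

```python
import math

import pandas as pd

frame = pd.DataFrame({
    "Address": [
        "Agripada, Mumbai, Maharashtra, 400011",
        "A I staff colony, Mumbai, Maharashtra, 400029",
        "Airport, Mumbai, Maharashtra, 400099",
    ],
    "Coordinates_Unsorted": [
        (18.9753024, 72.8248975, 0.0),
        None,  # address that could not be geocoded
        (19.1051366, 72.8558026258599, 0.0),
    ],
})


def split_point(point):
    """Return (lat, lon) as floats, or a pair of NaN for missing points."""
    if not isinstance(point, tuple):
        return (math.nan, math.nan)
    return (float(point[0]), float(point[1]))


pairs = frame["Coordinates_Unsorted"].map(split_point)
frame["Latitude"] = pairs.map(lambda pair: pair[0])
frame["Longitude"] = pairs.map(lambda pair: pair[1])
print(frame[["Address", "Latitude", "Longitude"]])
```

Because no text parsing is involved, the values keep their full precision and the columns are numeric from the start.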


In [115]:
df4.shape

(120, 5)

In [116]:
#Merging DataFrames
mumbaizip_df = pd.merge(mumbaizip_df, df4, on = 'Address', how = 'inner')
mumbaizip_df

Unnamed: 0,Location,District,State,Pincode,Address,Location1,Coordinates_Unsorted,Latitude,Longitude
0,Agripada,Mumbai,Maharashtra,400011,"Agripada, Mumbai, Maharashtra, 400011","Agripada, Zone 1, Mumbai, Mumbai City, Maharas...","(18.9753024, 72.8248975, 0.0)",18.975302,72.824898
1,Airport,Mumbai,Maharashtra,400099,"Airport, Mumbai, Maharashtra, 400099","New Airport Colony, K/E Ward, Zone 3, Mumbai, ...","(19.1051366, 72.8558026258599, 0.0)",19.105137,72.855803
2,Ambewadi,Mumbai,Maharashtra,400004,"Ambewadi, Mumbai, Maharashtra, 400004","Ambewadi, Zone 4, Mumbai, Mumbai Suburban, Mah...","(19.1867764, 72.8593129, 0.0)",19.186776,72.859313
3,Andheri,Mumbai,Maharashtra,400053,"Andheri, Mumbai, Maharashtra, 400053","Andheri, Madhavdas Amarshi Marg, K/W Ward, Zon...","(19.1196976, 72.8464205, 0.0)",19.119698,72.84642
4,Andheri East,Mumbai,Maharashtra,400069,"Andheri East, Mumbai, Maharashtra, 400069","Andheri East, Zone 3, Mumbai, Mumbai Suburban,...","(19.1158835, 72.854202, 0.0)",19.115883,72.854202
5,Andheri Railway station,Mumbai,Maharashtra,400058,"Andheri Railway station, Mumbai, Maharashtra, ...","Andheri, Madhavdas Amarshi Marg, K/W Ward, Zon...","(19.1196976, 72.8464205, 0.0)",19.119698,72.84642
6,Antop Hill,Mumbai,Maharashtra,400037,"Antop Hill, Mumbai, Maharashtra, 400037","Antop Hill, Mumbai, Mumbai City, Maharashtra, ...","(19.0207608, 72.8652556, 0.0)",19.020761,72.865256
7,Asvini,Mumbai,Maharashtra,400005,"Asvini, Mumbai, Maharashtra, 400005","INHS Asvini, Nanabhai Moos Marg, Dhobi Ghat, A...","(18.900976, 72.8159707, 0.0)",18.900976,72.815971
8,Azad Nagar,Mumbai,Maharashtra,400053,"Azad Nagar, Mumbai, Maharashtra, 400053","Azad Nagar, K/W Ward, Zone 3, Mumbai, Mumbai S...","(19.1283153, 72.8400381, 0.0)",19.128315,72.840038
9,B P t colony,Mumbai,Maharashtra,400003,"B P t colony, Mumbai, Maharashtra, 400003","P&T Colony, Zone 3, Mumbai, Mumbai Suburban, M...","(19.101937, 72.8615987, 0.0)",19.101937,72.861599
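
The inner merge silently drops every neighbourhood that has no geocoded match (the frame shrinks from 182 rows to the matched subset). As a hedged sketch on toy frames, a left merge with `indicator=True` makes the dropped rows visible before filtering, and `validate="many_to_one"` checks that each address appears at most once on the geocoded side. The frame names `zips` and `geo` are assumptions.

```python
import pandas as pd

zips = pd.DataFrame({
    "Address": [
        "Agripada, Mumbai, Maharashtra, 400011",
        "A I staff colony, Mumbai, Maharashtra, 400029",
    ],
    "Pincode": ["400011", "400029"],
})
geo = pd.DataFrame({
    "Address": ["Agripada, Mumbai, Maharashtra, 400011"],
    "Latitude": [18.975302],
    "Longitude": [72.824898],
})

# Left merge keeps every neighbourhood; _merge records where each row matched.
checked = pd.merge(zips, geo, on="Address", how="left",
                   validate="many_to_one", indicator=True)

# Addresses that would vanish under how="inner".
dropped = checked.loc[checked["_merge"] == "left_only", "Address"].tolist()

# Equivalent of the inner merge, once the dropped rows have been inspected.
kept = checked[checked["_merge"] == "both"].drop(columns="_merge")
print(dropped)
```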


In [117]:
mumbaizip_df = mumbaizip_df[['Location', 'District', 'State', 'Pincode', 'Latitude', 'Longitude']]
mumbaizip_df

Unnamed: 0,Location,District,State,Pincode,Latitude,Longitude
0,Agripada,Mumbai,Maharashtra,400011,18.975302,72.824898
1,Airport,Mumbai,Maharashtra,400099,19.105137,72.855803
2,Ambewadi,Mumbai,Maharashtra,400004,19.186776,72.859313
3,Andheri,Mumbai,Maharashtra,400053,19.119698,72.84642
4,Andheri East,Mumbai,Maharashtra,400069,19.115883,72.854202
5,Andheri Railway station,Mumbai,Maharashtra,400058,19.119698,72.84642
6,Antop Hill,Mumbai,Maharashtra,400037,19.020761,72.865256
7,Asvini,Mumbai,Maharashtra,400005,18.900976,72.815971
8,Azad Nagar,Mumbai,Maharashtra,400053,19.128315,72.840038
9,B P t colony,Mumbai,Maharashtra,400003,19.101937,72.861599


## Data Analysis

In [118]:
address = 'Mumbai, Maharashtra, India'

geolocator = Nominatim(user_agent="Mumbai Bar Hopper")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Mumbai are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Mumbai are 18.9387711, 72.8353355.


In [119]:
# create map of Mumbai using latitude and longitude values
map_mum = folium.Map(location=[latitude, longitude], zoom_start=10.5)

# add markers to map
for Lat, Lng, District, Neighborhood in zip(mumbaizip_df['Latitude'], mumbaizip_df['Longitude'], mumbaizip_df['District'], mumbaizip_df['Location']):
    label = '{}, {}'.format(Neighborhood, District)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [Lat, Lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_mum)  
    
map_mum

In [120]:
mumbaizip_df.to_pickle('./mumbaizip_df.pkl')    

In [121]:
#@hidden_cell
# credentials redacted: never publish real API keys in a shared notebook
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
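
One way to keep credentials out of the notebook is to read them from environment variables. This is a minimal sketch; the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are assumptions, not part of the Foursquare API.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read the client ID and secret from a mapping such as os.environ."""
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first."
        )
    return client_id, client_secret


# Demonstration with an explicit mapping instead of the real environment.
demo_env = {
    "FOURSQUARE_CLIENT_ID": "demo-id",
    "FOURSQUARE_CLIENT_SECRET": "demo-secret",
}
demo_id, demo_secret = load_foursquare_credentials(demo_env)
print(demo_id)
```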

In [122]:
search_query = 'bar'
categoryId = '4bf58dd8d48988d116941735,52e81612bcbc57f1066b7a0d,56aa371ce4b08b9a8d57356c,4bf58dd8d48988d117941735,4bf58dd8d48988d11e941735,4bf58dd8d48988d118941735,4bf58dd8d48988d1d8941735,4bf58dd8d48988d119941735,4bf58dd8d48988d1d5941735,4bf58dd8d48988d120941735,4bf58dd8d48988d11b941735,4bf58dd8d48988d11c941735,4bf58dd8d48988d11d941735,56aa371be4b08b9a8d57354d,4bf58dd8d48988d122941735,4bf58dd8d48988d123941735,50327c8591d4c4b30a586d5d,4bf58dd8d48988d121941735,4bf58dd8d48988d11f941735'
LIMIT = '100'

In [123]:
#Function to get Nearby Venues(Bars) for all Neighbourhoods
def getNearbyVenues(names, lat1, long1, radius=200):
    
    venues_list=[]
    for name, lat, lng in zip(names, lat1, long1):

        # create the API request URL
        url1 = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}&categoryId={}'.format(CLIENT_ID, CLIENT_SECRET, lat, lng, VERSION, search_query, radius, LIMIT, categoryId)


        # make the GET request
        results = requests.get(url1).json()["response"]["venues"]

        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['name'], 
            v['location']['lat'], 
            v['location']['lng']) for v in results])

    # build the DataFrame once, after every neighbourhood has been queried
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])

    return nearby_venues

In [124]:
#Create a dataframe of all the venues from each neighbourhood
mum_venues_df = getNearbyVenues(names=mumbaizip_df['Location'],
                                   lat1=mumbaizip_df['Latitude'],
                                   long1=mumbaizip_df['Longitude']
                                   )

In [125]:
print(mum_venues_df.shape)

(63, 6)
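
The request and parsing steps inside `getNearbyVenues` can be separated so the parsing is testable without calling the API. The sketch below is an assumption-laden illustration: `build_search_params` and `flatten_venues` are hypothetical helpers, the sample payload is hand-written to mirror the `response` / `venues` structure the function reads, and the error-shaped payload is made up. The parameter dictionary could be passed to `requests.get(url, params=...)` instead of formatting the URL by hand.

```python
def build_search_params(lat, lng, client_id, client_secret, version,
                        query, radius, limit, category_id):
    """Collect the query-string fields used by the venue search call."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": "{},{}".format(lat, lng),
        "v": version,
        "query": query,
        "radius": radius,
        "limit": limit,
        "categoryId": category_id,
    }


def flatten_venues(neighbourhood, lat, lng, payload):
    """Turn one JSON payload into rows; tolerate a missing venue list."""
    venues = payload.get("response", {}).get("venues", [])
    return [
        (neighbourhood, lat, lng,
         v["name"], v["location"]["lat"], v["location"]["lng"])
        for v in venues
    ]


# Hand-written sample payload (assumption), shaped like the fields read above.
sample_payload = {
    "response": {
        "venues": [
            {"name": "Delite Bar",
             "location": {"lat": 19.116728, "lng": 72.855364}},
        ]
    }
}
rows = flatten_venues("Andheri East", 19.115883, 72.854202, sample_payload)

# A payload without a venue list yields no rows instead of a KeyError.
empty = flatten_venues("Agripada", 18.975302, 72.824898, {"meta": {}})

params = build_search_params(19.115883, 72.854202, "demo-id", "demo-secret",
                             "20180605", "bar", 200, "100", "demo-category")
print(rows, empty, params["ll"])
```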


In [126]:
mum_venues_df.columns=["Location","Latitude","Longitude","Bar_Name","Bar_Latitude","Bar_Longitude"]
mum_venues_df

Unnamed: 0,Location,Latitude,Longitude,Bar_Name,Bar_Latitude,Bar_Longitude
0,Andheri East,19.115883,72.854202,Kalinga Bar and Restaurant,19.117131,72.854494
1,Andheri East,19.115883,72.854202,Delite Bar,19.116728,72.855364
2,Azad Nagar,19.128315,72.840038,Kamal Chhaya Bar,19.128245,72.83761
3,B.N. bhavan,18.937132,72.832556,Central Bar,18.93865,72.833919
4,Bangur Nagar,19.168814,72.833678,B99 Karaoke Bar,19.170534,72.8339
5,Central Building,18.935439,72.826718,Czar Bar,18.935381,72.825258
6,Charni Road,18.959867,72.819531,New Yazdani Bar Aani Restaurant,18.95899,72.819954
7,Charni Road,18.959867,72.819531,Krishna Bar & Restaurant,18.96155,72.819063
8,Churchgate,18.935957,72.82734,Czar Bar,18.935381,72.825258
9,Cumballa Hill,18.969424,72.806851,Bar Bar Hookah,18.968373,72.805223


In [127]:
# create map of Mumbai Bars using latitude and longitude values
map_mumbars = folium.Map(location=[latitude, longitude], zoom_start=10.5)

# add markers to map
for Lat, Lng, Bar_Name, Loc in zip(mum_venues_df['Bar_Latitude'], mum_venues_df['Bar_Longitude'], mum_venues_df['Bar_Name'], mum_venues_df['Location']):
    label = '{},{}'.format(Loc,Bar_Name)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [Lat, Lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_mumbars)  
    
map_mumbars

In [128]:
#Count the unique bar names in the mum_venues_df dataframe
print('There are {} unique bars.'.format(len(mum_venues_df['Bar_Name'].unique())))

There are 51 unique bars.


In [129]:
# one hot encoding
mum_onehot = pd.get_dummies(mum_venues_df[['Bar_Name']], prefix="", prefix_sep="")

# add neighbourhood column back to dataframe
mum_onehot['Location'] = mum_venues_df['Location'] 

# move neighbourhood column to the first column
fixed_columns = [mum_onehot.columns[-1]] + list(mum_onehot.columns[:-1])
mum_onehot = mum_onehot[fixed_columns]

In [130]:
mum_onehot

Unnamed: 0,Location,20th Century Stores & Bar,B99 Karaoke Bar,Bar Bar Hookah,Baroke,Barook,Behram Bar,Bottle bar,Capitol Bar,Central Bar,Chicken Centre Resto Bar,Czar Bar,Deccan Bar,Delite Bar,Float bar,GK Bar,Gopal Krishna Bar & Restaurant,Guru Bar,Jay Prakash Restaurant & Bar,Kalinga Bar and Restaurant,Kamal Chhaya Bar,Krishna Bar & Restaurant,Lounge Bar & Cigar Divan,National Bar,New Yazdani Bar Aani Restaurant,Nishant Bar & Restaurant,Orlem Bar & Restaurant,Panchmukhi Restaurant And Bar,Rajdarbar Lounge Bar,Red Light Lounge Bar,Revival Terrace Bar,Ruby's Bar & Grill,Sagar Bar,Sai Prasad Family Restaurant & Bar,Sandhya Bar,Sanman Sports Bar & Restro,SatkAr Beer Bar,Satkar Restaurent and Bar,THC Bar And Grill,Taj Mahal Harbour Bar,The Bar Stock Exchange,The Bar Terminal,The Big Bang Bar And Cafe Oshiwara,The Studs Sports Bar & Grill,The Wine Bar At Kala Ghoda Café,U Turn Sports Bar,"U Turn, UFO Restaurant & BAR",Utsav Bar & Restaurant,lakshmi bar,new Golden Gate bar,"stax bar, hyatt regency",willingdon sports club bar
0,Andheri East,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Andheri East,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Azad Nagar,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,B.N. bhavan,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Bangur Nagar,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,Central Building,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,Charni Road,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,Charni Road,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,Churchgate,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
9,Cumballa Hill,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [131]:
mum_onehot.shape

(63, 52)

In [132]:
#Grouping by the Neighbourhood(Location)
mum_grouped = mum_onehot.groupby('Location').mean().reset_index()
mum_grouped

Unnamed: 0,Location,20th Century Stores & Bar,B99 Karaoke Bar,Bar Bar Hookah,Baroke,Barook,Behram Bar,Bottle bar,Capitol Bar,Central Bar,Chicken Centre Resto Bar,Czar Bar,Deccan Bar,Delite Bar,Float bar,GK Bar,Gopal Krishna Bar & Restaurant,Guru Bar,Jay Prakash Restaurant & Bar,Kalinga Bar and Restaurant,Kamal Chhaya Bar,Krishna Bar & Restaurant,Lounge Bar & Cigar Divan,National Bar,New Yazdani Bar Aani Restaurant,Nishant Bar & Restaurant,Orlem Bar & Restaurant,Panchmukhi Restaurant And Bar,Rajdarbar Lounge Bar,Red Light Lounge Bar,Revival Terrace Bar,Ruby's Bar & Grill,Sagar Bar,Sai Prasad Family Restaurant & Bar,Sandhya Bar,Sanman Sports Bar & Restro,SatkAr Beer Bar,Satkar Restaurent and Bar,THC Bar And Grill,Taj Mahal Harbour Bar,The Bar Stock Exchange,The Bar Terminal,The Big Bang Bar And Cafe Oshiwara,The Studs Sports Bar & Grill,The Wine Bar At Kala Ghoda Café,U Turn Sports Bar,"U Turn, UFO Restaurant & BAR",Utsav Bar & Restaurant,lakshmi bar,new Golden Gate bar,"stax bar, hyatt regency",willingdon sports club bar
0,Andheri East,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Azad Nagar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,B.N. bhavan,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Bangur Nagar,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Central Building,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Charni Road,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Churchgate,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Cumballa Hill,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Dadar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Daulat Nagar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0


In [133]:
mum_grouped.shape

(38, 52)
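
To make the grouped values easier to read, here is a minimal sketch of the same one-hot and group-mean steps on three rows taken from the venue table above. It also counts distinct bars per neighbourhood, which is an addition for illustration (the notebook does not compute `bar_counts`) aimed at the bar-hopping question from the introduction.

```python
import pandas as pd

venues = pd.DataFrame({
    "Location": ["Andheri East", "Andheri East", "Azad Nagar"],
    "Bar_Name": ["Kalinga Bar and Restaurant", "Delite Bar",
                 "Kamal Chhaya Bar"],
})

# One column per bar name, with 1 on the row where that bar appears.
onehot = pd.get_dummies(venues["Bar_Name"], dtype=int)
onehot.insert(0, "Location", venues["Location"])

# Mean per neighbourhood: each bar's share of that neighbourhood's rows.
grouped = onehot.groupby("Location").mean().reset_index()

# Distinct bars per neighbourhood, largest first.
bar_counts = (venues.groupby("Location")["Bar_Name"]
              .nunique()
              .sort_values(ascending=False))
print(grouped)
print(bar_counts)
```

A neighbourhood with two bars gets 0.5 for each, and one with a single bar gets 1.0, which matches the pattern in the grouped output above.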

In [134]:
#Getting the Top 5 Venues(Bars) for each Neighbourhood(Location).
num_top_venues = 5

for loc in mum_grouped['Location']:
    print("----"+loc+"----")
    temp = mum_grouped[mum_grouped['Location'] == loc].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Andheri East----
                        venue  freq
0  Kalinga Bar and Restaurant   0.5
1                  Delite Bar   0.5
2   20th Century Stores & Bar   0.0
3       Taj Mahal Harbour Bar   0.0
4         Revival Terrace Bar   0.0


----Azad Nagar----
                           venue  freq
0               Kamal Chhaya Bar   1.0
1      20th Century Stores & Bar   0.0
2  Panchmukhi Restaurant And Bar   0.0
3           Red Light Lounge Bar   0.0
4            Revival Terrace Bar   0.0


----B.N. bhavan----
                       venue  freq
0                Central Bar   1.0
1  20th Century Stores & Bar   0.0
2      Taj Mahal Harbour Bar   0.0
3       Red Light Lounge Bar   0.0
4        Revival Terrace Bar   0.0


----Bangur Nagar----
                           venue  freq
0                B99 Karaoke Bar   1.0
1  Panchmukhi Restaurant And Bar   0.0
2           Rajdarbar Lounge Bar   0.0
3           Red Light Lounge Bar   0.0
4            Revival Terrace Bar   0.0


----Central Build

In [135]:
#Function to return the most Common Venues(Bars)
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
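
Because most frequencies in a row are 0, sorting and taking the top N fills the remaining slots with bars that are not in the neighbourhood at all. A hedged variant, sketched below with the hypothetical name `present_venues`, keeps only bars whose frequency is above zero, so the returned list may be shorter than `num_top_venues`.

```python
import pandas as pd


def present_venues(row, num_top_venues):
    """Return up to num_top_venues bar names with a frequency above zero."""
    freqs = row.iloc[1:].astype(float)  # skip the leading 'Location' entry
    present = freqs[freqs > 0].sort_values(ascending=False)
    return present.index.tolist()[:num_top_venues]


# Toy row shaped like one line of mum_grouped.
row = pd.Series({
    "Location": "Andheri East",
    "Czar Bar": 0.0,
    "Delite Bar": 0.5,
    "Kalinga Bar and Restaurant": 0.5,
})
top = present_venues(row, 10)
print(top)
```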

In [136]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Location']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
mum_neighbourhoods_venues_sorted = pd.DataFrame(columns=columns)
mum_neighbourhoods_venues_sorted['Location'] = mum_grouped['Location']

for ind in np.arange(mum_grouped.shape[0]):
    mum_neighbourhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(mum_grouped.iloc[ind, :], num_top_venues)

mum_neighbourhoods_venues_sorted

Unnamed: 0,Location,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Andheri East,Delite Bar,Kalinga Bar and Restaurant,willingdon sports club bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Jay Prakash Restaurant & Bar,Guru Bar,Gopal Krishna Bar & Restaurant
1,Azad Nagar,Kamal Chhaya Bar,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar,Gopal Krishna Bar & Restaurant
2,B.N. bhavan,Central Bar,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar
3,Bangur Nagar,B99 Karaoke Bar,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar
4,Central Building,Czar Bar,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar
5,Charni Road,New Yazdani Bar Aani Restaurant,Krishna Bar & Restaurant,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar,Gopal Krishna Bar & Restaurant
6,Churchgate,Czar Bar,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar
7,Cumballa Hill,Bar Bar Hookah,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar
8,Dadar,Gopal Krishna Bar & Restaurant,GK Bar,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar
9,Daulat Nagar,Utsav Bar & Restaurant,Nishant Bar & Restaurant,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar


#### Cluster Analysis

In [137]:
# set number of clusters
kclusters = 5

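# drop the neighbourhood name so only the numeric features are clustered
# (keyword form; the positional axis argument was removed in pandas 2.0)
mum_grouped_clustering = mum_grouped.drop(columns='Location')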

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(mum_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 1, 1, 2, 1, 2, 1, 1, 1], dtype=int32)

In [138]:
# add clustering labels
mum_neighbourhoods_venues_sorted.insert(0, 'Cluster_Labels', kmeans.labels_)

mum_merged = mumbaizip_df

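# join the sorted venue table onto mumbaizip_df so each neighbourhood keeps its latitude/longitude
mum_merged = mum_merged.join(mum_neighbourhoods_venues_sorted.set_index('Location'), on='Location')

# neighbourhoods without a row in the venue table come back as NaN; fill them with 0
# (note: this also sets their Cluster_Labels to 0, the same value as k-means cluster 0)
mum_merged = mum_merged.fillna(0)
#mum_merged['Cluster_Labels'] = mum_merged['Cluster_Labels'].astype(int)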

mum_merged.head() # check the last columns!

Unnamed: 0,Location,District,State,Pincode,Latitude,Longitude,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agripada,Mumbai,Maharashtra,400011,18.975302,72.824898,0.0,0,0,0,0,0,0,0,0,0,0
1,Airport,Mumbai,Maharashtra,400099,19.105137,72.855803,0.0,0,0,0,0,0,0,0,0,0,0
2,Ambewadi,Mumbai,Maharashtra,400004,19.186776,72.859313,0.0,0,0,0,0,0,0,0,0,0,0
3,Andheri,Mumbai,Maharashtra,400053,19.119698,72.84642,0.0,0,0,0,0,0,0,0,0,0,0
4,Andheri East,Mumbai,Maharashtra,400069,19.115883,72.854202,1.0,Delite Bar,Kalinga Bar and Restaurant,willingdon sports club bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Jay Prakash Restaurant & Bar,Guru Bar,Gopal Krishna Bar & Restaurant


In [139]:
mum_merged['Cluster_Labels'] = mum_merged['Cluster_Labels'].astype(int)
mum_merged.head()

Unnamed: 0,Location,District,State,Pincode,Latitude,Longitude,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agripada,Mumbai,Maharashtra,400011,18.975302,72.824898,0,0,0,0,0,0,0,0,0,0,0
1,Airport,Mumbai,Maharashtra,400099,19.105137,72.855803,0,0,0,0,0,0,0,0,0,0,0
2,Ambewadi,Mumbai,Maharashtra,400004,19.186776,72.859313,0,0,0,0,0,0,0,0,0,0,0
3,Andheri,Mumbai,Maharashtra,400053,19.119698,72.84642,0,0,0,0,0,0,0,0,0,0,0
4,Andheri East,Mumbai,Maharashtra,400069,19.115883,72.854202,1,Delite Bar,Kalinga Bar and Restaurant,willingdon sports club bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Jay Prakash Restaurant & Bar,Guru Bar,Gopal Krishna Bar & Restaurant
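
One caveat of the merge above: `fillna(0)` gives every neighbourhood without venue data a `Cluster_Labels` value of 0, which is also a valid k-means label. The sketch below shows one possible way to keep the two apart. It uses small toy frames standing in for `mumbaizip_df` and `mum_neighbourhoods_venues_sorted`, and the sentinel `NO_DATA = -1` and the filler text are my own assumptions, not part of the notebook.

```python
import pandas as pd

NO_DATA = -1  # hypothetical sentinel; k-means never produces a negative label

# Toy stand-in for mumbaizip_df (only two columns kept for brevity)
zip_df = pd.DataFrame({
    'Location': ['Agripada', 'Andheri East', 'Churchgate'],
    'Longitude': [72.824898, 72.854202, 72.82734],
})

# Toy stand-in for mum_neighbourhoods_venues_sorted (Agripada has no row)
venues_sorted = pd.DataFrame({
    'Location': ['Andheri East', 'Churchgate'],
    'Cluster_Labels': [1, 2],
    '1st Most Common Venue': ['Delite Bar', 'Czar Bar'],
})

merged = zip_df.join(venues_sorted.set_index('Location'), on='Location')

# Fill the label and the venue columns separately so "no data" stays distinguishable
merged['Cluster_Labels'] = merged['Cluster_Labels'].fillna(NO_DATA).astype(int)
venue_cols = [c for c in merged.columns if c.endswith('Most Common Venue')]
merged[venue_cols] = merged[venue_cols].fillna('No bars found')

print(merged)
```

With a sentinel like this, filtering on `Cluster_Labels == 0` would return only neighbourhoods that k-means actually assigned to cluster 0.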


In [140]:
# create map
mum_map_clusters = folium.Map(location=[latitude, longitude], zoom_start=10.5)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(mum_merged['Latitude'], mum_merged['Longitude'], mum_merged['Location'], mum_merged['Cluster_Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(mum_map_clusters)
       
mum_map_clusters

#### Examining Clusters

#### Cluster 1

In [141]:
mum_merged.loc[mum_merged['Cluster_Labels'] == 0, mum_merged.columns[[0] + list(range(5, mum_merged.shape[1]))]]

Unnamed: 0,Location,Longitude,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agripada,72.824898,0,0,0,0,0,0,0,0,0,0,0
1,Airport,72.855803,0,0,0,0,0,0,0,0,0,0,0
2,Ambewadi,72.859313,0,0,0,0,0,0,0,0,0,0,0
3,Andheri,72.84642,0,0,0,0,0,0,0,0,0,0,0
5,Andheri Railway station,72.84642,0,0,0,0,0,0,0,0,0,0,0
6,Antop Hill,72.865256,0,0,0,0,0,0,0,0,0,0,0
7,Asvini,72.815971,0,0,0,0,0,0,0,0,0,0,0
9,B P t colony,72.861599,0,0,0,0,0,0,0,0,0,0,0
11,Bandra West,72.830267,0,0,0,0,0,0,0,0,0,0,0
12,Bandra(east),72.849811,0,0,0,0,0,0,0,0,0,0,0


#### Cluster 2

In [142]:
mum_merged.loc[mum_merged['Cluster_Labels'] == 1, mum_merged.columns[[0] + list(range(5, mum_merged.shape[1]))]]

Unnamed: 0,Location,Longitude,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Andheri East,72.854202,1,Delite Bar,Kalinga Bar and Restaurant,willingdon sports club bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Jay Prakash Restaurant & Bar,Guru Bar,Gopal Krishna Bar & Restaurant
8,Azad Nagar,72.840038,1,Kamal Chhaya Bar,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar,Gopal Krishna Bar & Restaurant
10,B.N. bhavan,72.832556,1,Central Bar,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar
13,Bangur Nagar,72.833678,1,B99 Karaoke Bar,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar
24,Charni Road,72.819531,1,New Yazdani Bar Aani Restaurant,Krishna Bar & Restaurant,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar,Gopal Krishna Bar & Restaurant
30,Cumballa Hill,72.806851,1,Bar Bar Hookah,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar
31,Dadar,72.842876,1,Gopal Krishna Bar & Restaurant,GK Bar,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar
35,Daulat Nagar,72.860493,1,Utsav Bar & Restaurant,Nishant Bar & Restaurant,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar
44,Goregaon,72.850018,1,Jay Prakash Restaurant & Bar,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Guru Bar,Gopal Krishna Bar & Restaurant
47,Gowalia Tank,72.810098,1,Chicken Centre Resto Bar,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar


#### Cluster 3

In [143]:
mum_merged.loc[mum_merged['Cluster_Labels'] == 2, mum_merged.columns[[0] + list(range(5, mum_merged.shape[1]))]]

Unnamed: 0,Location,Longitude,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
19,Central Building,72.826718,2,Czar Bar,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar
27,Churchgate,72.82734,2,Czar Bar,willingdon sports club bar,Delite Bar,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar


#### Cluster 4

In [144]:
mum_merged.loc[mum_merged['Cluster_Labels'] == 3, mum_merged.columns[[0] + list(range(5, mum_merged.shape[1]))]]

Unnamed: 0,Location,Longitude,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
45,Goregaon East,72.855255,3,Sanman Sports Bar & Restro,willingdon sports club bar,Nishant Bar & Restaurant,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar


#### Cluster 5

In [145]:
mum_merged.loc[mum_merged['Cluster_Labels'] == 4, mum_merged.columns[[0] + list(range(5, mum_merged.shape[1]))]]

Unnamed: 0,Location,Longitude,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
80,Marol Naka,72.878492,4,Rajdarbar Lounge Bar,willingdon sports club bar,Nishant Bar & Restaurant,National Bar,Lounge Bar & Cigar Divan,Krishna Bar & Restaurant,Kamal Chhaya Bar,Kalinga Bar and Restaurant,Jay Prakash Restaurant & Bar,Guru Bar
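
The per-cluster tables above are easier to compare alongside a count of members per label. A small sketch, using only the ten labels printed by `kmeans.labels_[0:10]` earlier (so these counts describe that sample, not the full data set):

```python
import numpy as np
import pandas as pd

# First ten labels as printed by kmeans.labels_[0:10] earlier in the notebook
labels = np.array([1, 1, 1, 1, 2, 1, 2, 1, 1, 1])

# Number of rows assigned to each cluster label, ordered by label
cluster_sizes = pd.Series(labels, name='Cluster_Labels').value_counts().sort_index()
print(cluster_sizes)
```

On the real data, the same call on `mum_merged['Cluster_Labels']` would show how many neighbourhoods fall into each cluster.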


## Results

Our cluster analysis of the Mumbai locational data, combined with the venue (bar) information from the Foursquare API, indicates that a large number of bars are located centrally in Mumbai. As one moves away from the center of Mumbai, the number of bars seems to decline, and it seems to decline more rapidly toward the north. This suggests that an individual who wants the highest density of bars nearby in order to bar hop should go to the center of Mumbai. A stakeholder who wants to invest in opening a bar, on the other hand, should look away from the center in order to avoid competition from a high density of existing bars.
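
The distance trend described here could be checked numerically rather than read off the map. The sketch below is one way to do that under stated assumptions: it uses the five neighbourhood coordinates shown in the `mum_merged.head()` output, treats a neighbourhood as having bar data when its venue columns were filled, and takes (19.0760, 72.8777) as the city center. That center, the helper `haversine_km`, and the distance bands are my own illustrative choices; the notebook's `latitude` and `longitude` variables should be used instead.

```python
import numpy as np
import pandas as pd

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between points given in degrees."""
    earth_radius_km = 6371.0
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * earth_radius_km * np.arcsin(np.sqrt(a))

# Assumed center of Mumbai (hypothetical; not taken from the notebook)
CENTER_LAT, CENTER_LON = 19.0760, 72.8777

# First five neighbourhoods from the mum_merged.head() output
df = pd.DataFrame({
    'Location': ['Agripada', 'Airport', 'Ambewadi', 'Andheri', 'Andheri East'],
    'Latitude': [18.975302, 19.105137, 19.186776, 19.119698, 19.115883],
    'Longitude': [72.824898, 72.855803, 72.859313, 72.84642, 72.854202],
    'has_bar_data': [False, False, False, False, True],
})

df['distance_km'] = haversine_km(df['Latitude'], df['Longitude'], CENTER_LAT, CENTER_LON)

# Illustrative distance bands; share of neighbourhoods with bar data in each band
df['band'] = pd.cut(df['distance_km'], bins=[0, 5, 10, 20])
print(df.groupby('band', observed=True)['has_bar_data'].mean())
```

Five rows are far too few to confirm or refute the trend; the point is only the shape of the calculation, which could be run on the full `mum_merged` table.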

Inspection of the mum_map_clusters map shows that the Foursquare API places a bar off the coast of Mumbai, in the sea. This indicates that the Foursquare location data for Mumbai may not be entirely accurate.
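
A rough way to catch such misplaced points automatically is to flag any venue that falls outside the envelope of the queried neighbourhood coordinates. The sketch below assumes a padding of 0.02 degrees and uses two made-up venue rows purely to exercise the check; neither the margin nor the venue coordinates come from the notebook or from Foursquare.

```python
import pandas as pd

MARGIN_DEG = 0.02  # hypothetical padding around the neighbourhood envelope

# Neighbourhood coordinates from the mum_merged.head() output
hoods = pd.DataFrame({
    'Latitude': [18.975302, 19.105137, 19.186776, 19.119698, 19.115883],
    'Longitude': [72.824898, 72.855803, 72.859313, 72.84642, 72.854202],
})

# Made-up venue coordinates, for illustration only
venues = pd.DataFrame({
    'Venue': ['Inside example', 'Offshore example'],
    'Latitude': [19.10, 19.05],
    'Longitude': [72.85, 72.70],
})

lat_ok = venues['Latitude'].between(hoods['Latitude'].min() - MARGIN_DEG,
                                    hoods['Latitude'].max() + MARGIN_DEG)
lon_ok = venues['Longitude'].between(hoods['Longitude'].min() - MARGIN_DEG,
                                     hoods['Longitude'].max() + MARGIN_DEG)

# A venue is suspicious when either coordinate leaves the padded envelope
venues['outside_envelope'] = ~(lat_ok & lon_ok)
print(venues)
```

A rectangular envelope is a crude test, since it cannot follow the coastline, but it is enough to surface points that lie well out to sea for manual review.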

## Discussion

Whilst collecting data, I ran into a lack of readily available, parseable data. For example, a data set of property values for each neighbourhood by zip code would have been highly useful for a stakeholder looking to invest in a new bar. Data on population density by zip code would likewise have helped both stakeholders looking to open a new bar and individuals looking to bar hop.
It is also worth noting that, since the Foursquare API placed a bar off the coast of Mumbai, in the water, its data on Mumbai may not be completely accurate.

## Conclusion

Our cluster analysis of the Mumbai locational data, combined with the venue (bar) information from the Foursquare API, leads us to conclude that a large number of bars are located centrally in Mumbai. As one moves away from the center of Mumbai, the number of bars seems to decline.
Therefore, for a bar hopping night, an individual may choose to head towards the center of Mumbai, whilst a stakeholder looking to open a new bar should look towards the edges of Mumbai in order to avoid a high density of competition.