# Battle of Neighbourhoods.

### Table of contents:

1. Introduction
2. Data
3. Methodology
4. Analysis
5. Results and Discussion
6. Conclusion

### 1.	Introduction

    Opening a restaurant is a difficult undertaking. But the work is more manageable if broken down into different sub-tasks. There are 
    many   tasks which needs to be taken care of while opening a restaurant. Some of them are the following (assuming that the person 
    already has the required funds):
        1.	Choose a restaurant concept
        2.	Form the menu items
        3.	Write the restaurant business plan
        4.	Choose a location 
        5.	Permits and Licenses etc.
    
    This assignment focuses on point 4 – Choose a location, to open a restaurant in the city of Bangalore (Bengaluru), Karnataka, India.
    When choosing a location for a new restaurant, the following features are among the most important:
        1.	Visibility and accessibility
        2.	The demographics
        3.	Labour costs and minimum wage
        4.	Competition in the area etc.
    
    So, what is Location Analysis? Location analysis is a technique for finding the best location to open a new restaurant. We need to 
    analyse potential locations and their access to customers, their location in relation to equipment and food suppliers and other 
    important factors.

    So, choosing a good location for your business might be the single most effective thing to do, to get success. Having a good menu 
    and professional staff is important to restaurant success but having a good location can give the necessary push for success in 
    the restaurant business.

    In this assignment we will try to find the optimal location to open a new restaurant in the city of Bangalore. Since there are lots 
    of restaurants in Bangalore, we will try to detect locations that are not already crowded with restaurants. We would prefer locations 
    which are close to the city centre as possible.


### 2.	Data
    Data has been collected from the following sources:

        1.Dataset with various neighbourhoods and its Pin codes are collected from the link https://finkode.com/ka/bangalore.html. 
          This dataset has the neighbourhoods with the pin codes and the district name.
        2.geopy.geocoders is used to collect the geo coordinates for various neighbourhoods in Bangalore city.
        3.Foursquare API is used for collecting various venue details in the neighbourhoods of Bangalore city.
        
    Based on the definition of our problem, factors that will influence our decisions are the following:
        1.Number of existing restaurants in the neighbourhood.
        2.Distance between the restaurants.
        3.Distance of neighbourhood from the city centre.
        4.We will even look into various zones in Bangalore city and try to figure out the best place to open a restaurant in a zone as well.


In [346]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

Reading Bengaluru city neighbourhoods.

In [349]:
df_postalcodes = pd.read_html("https://finkode.com/ka/bangalore.html")
df_postalcodes[0].head()

Unnamed: 0,Post Office,District,Pincode
0,A F Station Yelahanka S.O,Bangalore,560063
1,Adugodi S.O,Bangalore,560030
2,Agara B.O,Bangalore,560034
3,Agram S.O,Bangalore,560007
4,Amruthahalli B.O,Bangalore,560092


In [350]:
df_postalcodes[0].shape

(270, 3)

In [351]:
#renaming column 'Post Office to PostOffice'.
df_postalcodes[0].rename(columns={'Post Office':'PostOffice'}, inplace=True) 

In [352]:
df_postalcodes[0].head()

Unnamed: 0,PostOffice,District,Pincode
0,A F Station Yelahanka S.O,Bangalore,560063
1,Adugodi S.O,Bangalore,560030
2,Agara B.O,Bangalore,560034
3,Agram S.O,Bangalore,560007
4,Amruthahalli B.O,Bangalore,560092


In [353]:
#removing acronyms S.O, B.O from neighbourhood names.
df_postalcodes[0]['PostOffice'] = df_postalcodes[0]['PostOffice'].str[0:-4] 

In [355]:
# merging Neighbourhoods with same Postcode.
df_postalcodes[0] = df_postalcodes[0].groupby('Pincode').agg({'PostOffice':'first'}).reset_index()
df_postalcodes[0]

Unnamed: 0,Pincode,PostOffice
0,560001,Bangalore Bazaar
1,560002,Bangalore City
2,560003,Malleswaram
3,560004,Basavanagudi
4,560005,Fraser Town
5,560006,J.C.Nagar
6,560007,Agram
7,560008,H.A.L II Stage
8,560009,Bangalore Dist Offices Bldg
9,560010,Industrial Estate S.O (Bangal


In [356]:
df_postalcodes[0].shape

(104, 2)

In [357]:
#renaming few neighbourhoods with proper names, which can give venue data.
df_postalcodes[0].iloc[0,1] = 'M.G Road'
df_postalcodes[0].iloc[8,1] = 'K.G Road'
df_postalcodes[0].iloc[9,1] = 'Rajajinagar'
df_postalcodes[0].iloc[11,1] = 'IISC'
df_postalcodes[0].iloc[17,1] = 'Chamrajpet'
df_postalcodes[0].iloc[23,1] = 'Anandnagar'
df_postalcodes[0].iloc[35,1] = 'Indiranagar'
df_postalcodes[0].iloc[41,1] = 'Nagawara'
df_postalcodes[0].iloc[46,1] = 'Ashoknagar'
df_postalcodes[0].iloc[58,1] = 'Yelahanka'
df_postalcodes[0].iloc[102,1] = 'Bagalur'
df_postalcodes[0].iloc[7,1] = 'H.A.L'
df_postalcodes[0].iloc[10,1] = 'Jayanagar 3rd Block'
df_postalcodes[0].iloc[15,1] = 'Ramamurthy Nagar'
df_postalcodes[0].iloc[6,1] = 'Air Force Hospital'
df_postalcodes[0].iloc[61,1] = 'Whitefield'
df_postalcodes[0].iloc[31,1] = 'Kormangala'
df_postalcodes[0].iloc[51,1] = 'Bangalore University'

In [358]:
df_postalcodes[0].shape

(104, 2)

In [359]:
df_postalcodes[0].head()

Unnamed: 0,Pincode,PostOffice
0,560001,M.G Road
1,560002,Bangalore City
2,560003,Malleswaram
3,560004,Basavanagudi
4,560005,Fraser Town


DataFrame to hold bangalore city neighbourhood data.
Postcode of neighbourhood, neighbourhood name, neighbourhood latitude, neighbourhood longitude, neighbourhood distance from the city center.

In [360]:
cols = ['Postcode','Neighbourhood','Latitude','Longitude','Distance']
df_bengaluru = pd.DataFrame(columns=cols)

In [362]:
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import geopy.distance #calculate distance between corrdinates

Lets take city centre for Bangalore as neighbourhood with pincode 560001.
we will calculate the distance from coordinates of pincode 560001 to other neighbourhood coordinate.

To calculate the distance between the coordinates, we are using geopy.distance.

df_bengaluru will contain the neighbourhood details of bangalore city with latitude, longitude and distance from city center details.
we are inserting only those neighbourhoods into the dataframe for which location 'is not None'.

In [363]:
center_postcode = '560001'
center_neighbourhood = 'M.G Road'
center_geolocator = Nominatim(user_agent=center_postcode)
center_location = center_geolocator.geocode(center_neighbourhood + ',Bangalore')
center_lat = center_location.latitude
center_long = center_location.longitude
print(center_lat)
print(center_long)

#Pincode Details	Pincode Details.1
for _index,_row in df_postalcodes[0].iterrows(): 
    postcode = _row['Pincode']
    neighbourhood = _row['PostOffice']
    print(neighbourhood)
    
    geolocator = Nominatim(user_agent=postcode)
    location = geolocator.geocode(neighbourhood + ',Bangalore')
    
    #insert only those neighbourhoods for which location data is available.
    if(location is not None):
        lat = location.latitude
        long = location.longitude
    
        coords1 = (center_lat,center_long)
        coords2 = (lat,long)
        distance = geopy.distance.distance(coords1, coords2).km

        df_bengaluru = df_bengaluru.append({'Postcode' : postcode,'Neighbourhood': neighbourhood,
                                          'Latitude':lat ,'Longitude':long, 'Distance':distance}, ignore_index=True)
    
df_bengaluru

12.9730154
77.6166123
M.G Road
Bangalore City
Malleswaram
Basavanagudi
Fraser Town
J.C.Nagar
Air Force Hospital
H.A.L
K.G Road
Rajajinagar
Jayanagar 3rd Block
IISC
Jalahalli
Jalahalli East
Jalahalli West
Ramamurthy Nagar
NAL
Chamrajpet
Gaviopuram Extension
Seshadripuram
Gayathrinagar
Yeshwanthpur Bazar
Magadi Road
Anandnagar
CMP Centre And School
Deepanjalinagar
Sampangiramnagar
Dharmaram College
Adugodi
Kanakanagar
Malkand Lines
Kormangala
Carmelaram
Devasandra
Doddanekkundi
Indiranagar
Nayandahalli
Chandra Lay Out
Jayanagar
Sivan Chetty Gardens
Banaswadi
Nagawara
Benson Town
Austin Town
Hoodi
Bhattarahalli
Ashoknagar
H.K.P. Road
Chickpet
Mathikere
Malleswaram West
Bangalore University
Peenya Dasarahalli
Laggere
Rv Niketan
Chudenapura
Chikkalasandra
Doddakallasandra
Yelahanka
Attur
G.K.V.K.
Whitefield
Devanagundi
Begur
B Sk II Stage
Domlur
Nagarbhavi
Bagalgunte
Kumbalagodu
Jeevabhimanagar
Bannerghatta Road
Dr. Shivarama Karanth Nagar
J P Nagar
Basaveshwaranagar
Sadashivanagar
Bolare
B

Unnamed: 0,Postcode,Neighbourhood,Latitude,Longitude,Distance
0,560001,M.G Road,12.973015,77.616612,0.0
1,560002,Bangalore City,12.97912,77.5913,2.82811
2,560003,Malleswaram,13.016341,77.558664,7.905429
3,560004,Basavanagudi,12.941726,77.575502,5.646159
4,560006,J.C.Nagar,15.349723,75.137521,375.187845
5,560007,Air Force Hospital,12.964027,77.6275,1.544116
6,560009,K.G Road,12.89122,77.622812,9.07405
7,560010,Rajajinagar,12.988234,77.554883,6.905588
8,560011,Jayanagar 3rd Block,12.934705,77.582714,5.611704
9,560012,IISC,13.022235,77.567183,7.642283


In [364]:
import folium # map rendering library
import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

In [365]:
#take bangalore city center.
address = 'Bangalore'

geolocator = Nominatim(user_agent="560001")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinate of Bangalore City are {}, {}.'.format(latitude, longitude))

The geographical coordinate of Bangalore City are 12.9791198, 77.5912997.


In [758]:
# create map of Central Bangalore using latitude and longitude values
map_bengaluru = folium.Map(location=[latitude, longitude], zoom_start=10)
map_bengaluru

In [759]:
# add markers to map
for lat, lng, neighbourhood in zip(df_bengaluru['Latitude'], df_bengaluru['Longitude'], df_bengaluru['Neighbourhood']):
    label = '{}'.format(neighbourhood)
    label = folium.Popup(neighbourhood, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_bengaluru)  
    
map_bengaluru

Foursquare data pulling for the neighbourhoods in Bangalore

In [368]:
CLIENT_ID = 'IL5NROJRYUGBTMVWG4WXDRWSKOOAP13BN2TETE2DLTGOLWZS' # your Foursquare ID
CLIENT_SECRET = 'CDHTH35NHVMAJP1WERJTXNICS5K0IYSAGPKDKZOWBRRQW0VV' # your Foursquare Secret
VERSION = '20191201' # Foursquare API version

print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: IL5NROJRYUGBTMVWG4WXDRWSKOOAP13BN2TETE2DLTGOLWZS
CLIENT_SECRET:CDHTH35NHVMAJP1WERJTXNICS5K0IYSAGPKDKZOWBRRQW0VV


In [369]:
# type your answer here
LIMIT = 150 # limit of number of venues returned by Foursquare API
radius = 1000 # define radius

# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    latitude, 
    longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=IL5NROJRYUGBTMVWG4WXDRWSKOOAP13BN2TETE2DLTGOLWZS&client_secret=CDHTH35NHVMAJP1WERJTXNICS5K0IYSAGPKDKZOWBRRQW0VV&v=20191201&ll=12.9791198,77.5912997&radius=1000&limit=150'

In [370]:
results = requests.get(url).json()
#results

df_venues will get all the venue data in the city of bangalore with coordinates and other details.

In [371]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'],
             v['venue']['id'],
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue_ID', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [372]:
df_venues = getNearbyVenues(names=df_bengaluru['Neighbourhood'],
                                   latitudes=df_bengaluru['Latitude'],
                                   longitudes=df_bengaluru['Longitude']
                                  )

df_venues.head()

M.G Road
Bangalore City
Malleswaram
Basavanagudi
J.C.Nagar
Air Force Hospital
K.G Road
Rajajinagar
Jayanagar 3rd Block
IISC
Jalahalli
Jalahalli East
Jalahalli West
Ramamurthy Nagar
NAL
Chamrajpet
Seshadripuram
Magadi Road
Anandnagar
CMP Centre And School
Deepanjalinagar
Dharmaram College
Adugodi
Kanakanagar
Kormangala
Carmelaram
Devasandra
Doddanekkundi
Indiranagar
Nayandahalli
Jayanagar
Banaswadi
Nagawara
Benson Town
Austin Town
Hoodi
Bhattarahalli
Ashoknagar
Chickpet
Mathikere
Malleswaram West
Bangalore University
Peenya Dasarahalli
Laggere
Chikkalasandra
Doddakallasandra
Yelahanka
Attur
Whitefield
Begur
Domlur
Nagarbhavi
Bagalgunte
Kumbalagodu
Bannerghatta Road
J P Nagar
Basaveshwaranagar
Sadashivanagar
Bolare
Bannerghatta
Kacharakanahalli
Banashankari III Stage
Gunjur
Tarabanahalli
Amruthahalli
C.V.Raman Nagar
Koramangala VI Bk
Chikkabettahalli
Bommasandra Industrial Estate
Electronics City
HSR Layout
Bellandur
Haragadde
Bangalore International Airport
Anekal
Attibele
Dommasandra
C

Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue_ID,Venue Latitude,Venue Longitude,Venue Category
0,M.G Road,12.973015,77.616612,The Oberoi,4b530849f964a520108d27e3,12.973457,77.618289,Hotel
1,M.G Road,12.973015,77.616612,Vivanta by Taj,4cbc9d194352a1cd453f9ff5,12.973365,77.619951,Hotel
2,M.G Road,12.973015,77.616612,Teppan,4faff6ade4b0c0e08eb8d691,12.975727,77.616879,Japanese Restaurant
3,M.G Road,12.973015,77.616612,Foodhall,4fabd03ae4b050ddfe0aa376,12.973486,77.620117,Department Store
4,M.G Road,12.973015,77.616612,Benjarong,4bcaeabafb84c9b639e91d3e,12.975615,77.616916,Thai Restaurant


In [373]:
df_venues.shape

(732, 8)

In [374]:
df_venues.groupby('Neighbourhood').count()

Unnamed: 0_level_0,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue_ID,Venue Latitude,Venue Longitude,Venue Category
Neighbourhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
Adugodi,5,5,5,5,5,5,5
Air Force Hospital,2,2,2,2,2,2,2
Amruthahalli,3,3,3,3,3,3,3
Anandnagar,4,4,4,4,4,4,4
Anekal,2,2,2,2,2,2,2
Ashoknagar,4,4,4,4,4,4,4
Austin Town,3,3,3,3,3,3,3
Bagalgunte,5,5,5,5,5,5,5
Bagalur,1,1,1,1,1,1,1
Banashankari III Stage,7,7,7,7,7,7,7


In [375]:
df_venues.groupby('Venue Category').count()

Unnamed: 0_level_0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue_ID,Venue Latitude,Venue Longitude
Venue Category,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ATM,4,4,4,4,4,4,4
Accessories Store,1,1,1,1,1,1,1
Afghan Restaurant,1,1,1,1,1,1,1
Airport,1,1,1,1,1,1,1
Airport Lounge,1,1,1,1,1,1,1
American Restaurant,2,2,2,2,2,2,2
Andhra Restaurant,3,3,3,3,3,3,3
Arts & Crafts Store,2,2,2,2,2,2,2
Arts & Entertainment,1,1,1,1,1,1,1
Asian Restaurant,8,8,8,8,8,8,8


We will take only restaurant data.

In [376]:
df_venues = df_venues[df_venues['Venue Category'].str.contains('Restaurant')]

In [377]:
df_venues.head()

Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue_ID,Venue Latitude,Venue Longitude,Venue Category
2,M.G Road,12.973015,77.616612,Teppan,4faff6ade4b0c0e08eb8d691,12.975727,77.616879,Japanese Restaurant
4,M.G Road,12.973015,77.616612,Benjarong,4bcaeabafb84c9b639e91d3e,12.975615,77.616916,Thai Restaurant
5,M.G Road,12.973015,77.616612,Yauatcha,523d691011d20523c79c04ca,12.973318,77.620072,Asian Restaurant
13,M.G Road,12.973015,77.616612,Cafe Mozaic,4ba270e9f964a52043f937e3,12.973642,77.619762,Restaurant
14,M.G Road,12.973015,77.616612,Le Jardin,4f484abfe4b0d63740f04e9a,12.973562,77.617864,French Restaurant


In [760]:
df_venues.shape

(273, 8)

### 3. Methodology

In this assignment we will focus on to find the best location to open a restaurant in the city of Bangalore.
We will try to do a Location Analysis of the neighbourhoods of Bangalore.
Since there are lots of restaurants in Bangalore, we will try to detect locations that are not already crowded with restaurants. We would prefer locations which are close to the city centre as possible.

Based on the definition of our problem, factors that will influence our decisions are the following:
1.	Number of existing restaurants in the neighbourhood or Distance between the restaurants or Density of restaurants in a neighbourhood.
2.	Distance of neighbourhood from the city centre.
3.	We will even investigate various zones in Bangalore city and try to figure out the best place to open a restaurant in a zone as well.

We will give a score to the neighbourhood according to the above parameters and the the neighbourhood with the best score will be the best place to open a restaurant in the city of Bangalore with the given data.

We will even have a cluster map of the neighbourhoods with restaurants.

### 4. Analysis

In [378]:
df_restaurants_count = df_venues.groupby('Neighbourhood')['Venue Category'].agg('count') #shows number of restaurants in the neighbourhood.
df_restaurants_count.sort_values(ascending=False)

Neighbourhood
Kormangala                         50
Jayanagar                          27
M.G Road                           18
Jayanagar 3rd Block                17
HSR Layout                         13
J P Nagar                          12
Indiranagar                        11
NAL                                11
Seshadripuram                       9
Bannerghatta Road                   7
Basaveshwaranagar                   7
Domlur                              6
Chikkalasandra                      5
Dharmaram College                   5
Banaswadi                           5
Basavanagudi                        5
Sadashivanagar                      5
Chamrajpet                          4
Bangalore International Airport     4
Hoodi                               3
Electronics City                    3
C.V.Raman Nagar                     3
Doddakallasandra                    2
Chickpet                            2
Bangalore City                      2
Bellandur                           

In [379]:
print('There are {} uniques categories.'.format(len(df_venues['Venue Category'].unique()))) #checking unique categories.

There are 33 uniques categories.


In [380]:
print('There are {} neighbourhoods.'.format(len(df_venues['Neighbourhood'].unique()))) #checking the number of neighbourhoods.

There are 50 neighbourhoods.


In [381]:
# one hot encoding
df_onehot = pd.get_dummies(df_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
df_onehot['Neighbourhood'] = df_venues['Neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [df_onehot.columns[-1]] + list(df_onehot.columns[:-1])
df_onehot = df_onehot[fixed_columns]

df_onehot.head()

Unnamed: 0,Neighbourhood,Afghan Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant,Eastern European Restaurant,Fast Food Restaurant,French Restaurant,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Karnataka Restaurant,Kerala Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,North Indian Restaurant,Pakistani Restaurant,Persian Restaurant,Punjabi Restaurant,Rajasthani Restaurant,Restaurant,Seafood Restaurant,South Indian Restaurant,Tex-Mex Restaurant,Thai Restaurant,Tibetan Restaurant,Udupi Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant
2,M.G Road,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,M.G Road,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0
5,M.G Road,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
13,M.G Road,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0
14,M.G Road,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [382]:
df_onehot.shape

(273, 34)

In [383]:
df_grouped = df_onehot.groupby('Neighbourhood').mean().reset_index()
df_grouped.head()

Unnamed: 0,Neighbourhood,Afghan Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant,Eastern European Restaurant,Fast Food Restaurant,French Restaurant,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Karnataka Restaurant,Kerala Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,North Indian Restaurant,Pakistani Restaurant,Persian Restaurant,Punjabi Restaurant,Rajasthani Restaurant,Restaurant,Seafood Restaurant,South Indian Restaurant,Tex-Mex Restaurant,Thai Restaurant,Tibetan Restaurant,Udupi Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant
0,Amruthahalli,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Anandnagar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Ashoknagar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Austin Town,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Bagalgunte,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [384]:
df_grouped.shape

(50, 34)

In [385]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [719]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
df_neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
df_neighborhoods_venues_sorted['Neighbourhood'] = df_grouped['Neighbourhood']

for ind in np.arange(df_grouped.shape[0]):
    df_neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(df_grouped.iloc[ind, :], num_top_venues)

df_neighborhoods_venues_sorted.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Amruthahalli,Indian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant,Eastern European Restaurant
1,Anandnagar,Indian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant,Eastern European Restaurant
2,Ashoknagar,Restaurant,Indian Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
3,Austin Town,Indian Restaurant,Italian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant
4,Bagalgunte,Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,Indian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant,Mediterranean Restaurant


In [720]:
# import k-means from clustering stage
from sklearn.cluster import KMeans

In [721]:
# set number of clusters
kclusters = 5

df_grouped_clustering = df_grouped.drop('Neighbourhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(df_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 0, 0, 3, 1, 4, 0, 2, 2])

In [722]:
df_merged = pd.DataFrame()

In [723]:
# add clustering labels
df_neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

df_merged = df_bengaluru

# merge df_grouped with df_bengaluru to add latitude/longitude for each neighborhood
df_merged = df_merged.join(df_neighborhoods_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')

df_merged.head() # check the last columns!

Unnamed: 0,Postcode,Neighbourhood,Latitude,Longitude,Distance,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,560001,M.G Road,12.973015,77.616612,0.0,2.0,Restaurant,Chinese Restaurant,Indian Restaurant,Thai Restaurant,Asian Restaurant,Japanese Restaurant,Mediterranean Restaurant,Modern European Restaurant,French Restaurant,Kerala Restaurant
1,560002,Bangalore City,12.97912,77.5913,2.82811,0.0,Restaurant,Indian Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
2,560003,Malleswaram,13.016341,77.558664,7.905429,,,,,,,,,,,
3,560004,Basavanagudi,12.941726,77.575502,5.646159,0.0,Indian Restaurant,Restaurant,Mediterranean Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant
4,560006,J.C.Nagar,15.349723,75.137521,375.187845,4.0,Vegetarian / Vegan Restaurant,Indian Restaurant,Fast Food Restaurant,Kerala Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Vietnamese Restaurant,Mediterranean Restaurant


In [724]:
df_merged.shape

(79, 16)

In [725]:
df_merged = df_merged.dropna() #dropping neighbourhoods which has NaN values.

In [726]:
df_merged.shape

(50, 16)

In [727]:
_merged.head()

Unnamed: 0,postcode,neighbourhood,latitude,longitude,distance,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,560001,M.G Road,12.973015,77.616612,0.0,1.0,Hotel,Clothing Store,Restaurant,Café,Indian Restaurant,Chinese Restaurant,Bar,Thai Restaurant,Asian Restaurant,Steakhouse
1,560002,Bangalore City,12.97912,77.5913,2.82811,0.0,Park,Restaurant,Indian Restaurant,Capitol Building,Donut Shop,Diner,Dhaba,Dessert Shop,Department Store,Deli / Bodega
2,560003,Malleswaram,13.016341,77.558664,7.905429,1.0,Coffee Shop,Department Store,Bar,Juice Bar,Convenience Store,Coworking Space,Creperie,Cupcake Shop,Deli / Bodega,Fast Food Restaurant
3,560004,Basavanagudi,12.941726,77.575502,5.646159,0.0,Indian Restaurant,Café,Tea Room,Hookah Bar,Snack Place,Metro Station,Road,Restaurant,Mediterranean Restaurant,Diner
4,560006,J.C.Nagar,15.349723,75.137521,375.187845,0.0,Food Court,Hotel,Vegetarian / Vegan Restaurant,Indian Restaurant,Bed & Breakfast,Bus Station,Café,Dhaba,Eastern European Restaurant,Donut Shop


In [757]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(df_merged['Latitude'], df_merged['Longitude'], df_merged['Neighbourhood'], df_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(int(cluster)), parse_html=True)
    folium.CircleMarker(
      [lat, lon],
      radius=5,
        popup=label,
         color=rainbow[int(cluster)-1],
          fill=True,
          fill_color=rainbow[int(cluster)-1],
            fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

We will merge the counts of restaurants in a neighbourhood with df_merged dataframe which contains our final dataset which was used to
show the map clusters above.

Counts of restaurants in a neighbourhood is in df_restaurants_count.

In [729]:
# merge df_merged with df_restaurants_count to add latitude/longitude for each neighborhood
df_merged = df_merged.join(df_restaurants_count, on='Neighbourhood')

In [730]:
df_merged.shape

(50, 17)

In [731]:
df_merged.sort_values(ascending=False, by='Venue Category').head()

Unnamed: 0,Postcode,Neighbourhood,Latitude,Longitude,Distance,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Venue Category
24,560034,Kormangala,12.935468,77.61721,4.154379,0.0,Indian Restaurant,Chinese Restaurant,Restaurant,Italian Restaurant,Fast Food Restaurant,Mexican Restaurant,Asian Restaurant,Kerala Restaurant,Eastern European Restaurant,Persian Restaurant,50
30,560041,Jayanagar,12.929273,77.582423,6.097561,0.0,Indian Restaurant,Chinese Restaurant,Restaurant,Fast Food Restaurant,Mexican Restaurant,Andhra Restaurant,Chettinad Restaurant,Asian Restaurant,Comfort Food Restaurant,Mediterranean Restaurant,27
0,560001,M.G Road,12.973015,77.616612,0.0,2.0,Restaurant,Chinese Restaurant,Indian Restaurant,Thai Restaurant,Asian Restaurant,Japanese Restaurant,Mediterranean Restaurant,Modern European Restaurant,French Restaurant,Kerala Restaurant,18
8,560011,Jayanagar 3rd Block,12.934705,77.582714,5.611704,0.0,Indian Restaurant,Chinese Restaurant,Restaurant,Andhra Restaurant,Asian Restaurant,Fast Food Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,17
70,560102,HSR Layout,12.911623,77.638862,7.208249,0.0,Indian Restaurant,Fast Food Restaurant,Mediterranean Restaurant,Punjabi Restaurant,Chettinad Restaurant,North Indian Restaurant,Seafood Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,13


In [732]:
df_merged.columns

Index(['Postcode', 'Neighbourhood', 'Latitude', 'Longitude', 'Distance',
       'Cluster Labels', '1st Most Common Venue', '2nd Most Common Venue',
       '3rd Most Common Venue', '4th Most Common Venue',
       '5th Most Common Venue', '6th Most Common Venue',
       '7th Most Common Venue', '8th Most Common Venue',
       '9th Most Common Venue', '10th Most Common Venue', 'Venue Category'],
      dtype='object')

Rearanging the columns order in th df_merged

In [733]:
df_merged = df_merged[['Postcode', 'Neighbourhood', 'Latitude', 'Longitude', 'Distance', 'Venue Category',
       'Cluster Labels', '1st Most Common Venue', '2nd Most Common Venue',
       '3rd Most Common Venue', '4th Most Common Venue',
       '5th Most Common Venue', '6th Most Common Venue',
       '7th Most Common Venue', '8th Most Common Venue',
       '9th Most Common Venue', '10th Most Common Venue']]

In [734]:
df_merged.shape

(50, 17)

In [735]:
df_merged.sort_values(ascending=False, by='Venue Category').head()

Unnamed: 0,Postcode,Neighbourhood,Latitude,Longitude,Distance,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
24,560034,Kormangala,12.935468,77.61721,4.154379,50,0.0,Indian Restaurant,Chinese Restaurant,Restaurant,Italian Restaurant,Fast Food Restaurant,Mexican Restaurant,Asian Restaurant,Kerala Restaurant,Eastern European Restaurant,Persian Restaurant
30,560041,Jayanagar,12.929273,77.582423,6.097561,27,0.0,Indian Restaurant,Chinese Restaurant,Restaurant,Fast Food Restaurant,Mexican Restaurant,Andhra Restaurant,Chettinad Restaurant,Asian Restaurant,Comfort Food Restaurant,Mediterranean Restaurant
0,560001,M.G Road,12.973015,77.616612,0.0,18,2.0,Restaurant,Chinese Restaurant,Indian Restaurant,Thai Restaurant,Asian Restaurant,Japanese Restaurant,Mediterranean Restaurant,Modern European Restaurant,French Restaurant,Kerala Restaurant
8,560011,Jayanagar 3rd Block,12.934705,77.582714,5.611704,17,0.0,Indian Restaurant,Chinese Restaurant,Restaurant,Andhra Restaurant,Asian Restaurant,Fast Food Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant
70,560102,HSR Layout,12.911623,77.638862,7.208249,13,0.0,Indian Restaurant,Fast Food Restaurant,Mediterranean Restaurant,Punjabi Restaurant,Chettinad Restaurant,North Indian Restaurant,Seafood Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant


Now lets calculate the Score for the neighbourhoods depending on the parameter which we discussed earlier above and insert into our df_merged.

the Score will total of 10 points.
the Score is divided equally into two parameters : 
    1. Distance of neighbourhood from the city center. (maximum of 5 points and minimum of 1 point).
    2. Number of restaurants in the neighbourhood. (maximum of 5 points and minimum of 1 point).
    
calculation of Distance of neighbourhood from the city center.
    Note: distance in df_merged is in KM.
        a. distance 0 to 5 km -> 5points
        b. distance 6 to 10 km -> 4points
        c. distance 11 to 18 km -> 3points
        d. distance 19 to 25 km -> 2points
        e. distance 26 and more than 26 km -> 1point

calculation of Number of existing restaurants in the neighbourhood.
        a. no. of restaurants 0 to 10 -> 5points
        b. no. of restaurants 11 to 15 -> 4points
        c. no. of restaurants 16 to 25 -> 3points
        d. no. of restaurants 26 to 30 -> 2points
        e. no. of restaurants 31 and more than 31 -> 1point

so we will take points from both the parameters and give a final score out of 10 to the neighbourhood.
The neighbourhood with the highest score will be the best place to open a new restaurant in the city of Bangalore with the given data
and with the above parameters.

We will even merge the zonal data with our final dataset and check the results for each zones available in Bangalore.


Get Bangalore Zonal data - The Neighbourhoods in Bangalore are divided into different zones.

In [736]:
# reading zonal data from a local file, created from zonal maps of Bangalore.
df_zonal_data = pd.read_excel('Zones.xlsx')

In [737]:
df_zonal_data['Neighbourhood'] = df_zonal_data['Neighbourhood'].str.strip() # removing spaces

In [762]:
df_zonal_data.head(10)

Unnamed: 0,Neighbourhood,Zone
0,Amruthahalli,Northern
1,Anandnagar,Northern
2,Ashoknagar,Central
3,Austin Town,Central
4,Bagalgunte,Northern
5,Banashankari III Stage,Western
6,Banaswadi,Northern
7,Bangalore City,Central
8,Bangalore International Airport,Outskirt
9,Bangalore University,Western


In [739]:
# adding a new column 'Score' to df_merged and assigning value 0 to it.
df_merged['Score'] = 0
df_merged['Zone'] = ''

In [740]:
df_merged.head()

Unnamed: 0,Postcode,Neighbourhood,Latitude,Longitude,Distance,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Score,Zone
0,560001,M.G Road,12.973015,77.616612,0.0,18,2.0,Restaurant,Chinese Restaurant,Indian Restaurant,Thai Restaurant,Asian Restaurant,Japanese Restaurant,Mediterranean Restaurant,Modern European Restaurant,French Restaurant,Kerala Restaurant,0,
1,560002,Bangalore City,12.97912,77.5913,2.82811,2,0.0,Restaurant,Indian Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,0,
3,560004,Basavanagudi,12.941726,77.575502,5.646159,5,0.0,Indian Restaurant,Restaurant,Mediterranean Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant,0,
4,560006,J.C.Nagar,15.349723,75.137521,375.187845,2,4.0,Vegetarian / Vegan Restaurant,Indian Restaurant,Fast Food Restaurant,Kerala Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,0,
6,560009,K.G Road,12.89122,77.622812,9.07405,1,1.0,Indian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant,Eastern European Restaurant,0,


In [741]:
# rearanging columns order.
df_merged = df_merged[['Postcode', 'Neighbourhood', 'Latitude', 'Longitude', 'Distance', 'Venue Category','Score','Zone',
       'Cluster Labels', '1st Most Common Venue', '2nd Most Common Venue',
       '3rd Most Common Venue', '4th Most Common Venue',
       '5th Most Common Venue', '6th Most Common Venue',
       '7th Most Common Venue', '8th Most Common Venue',
       '9th Most Common Venue', '10th Most Common Venue']]

In [742]:
df_merged.reset_index(inplace=True)

In [743]:
df_merged.head()

Unnamed: 0,index,Postcode,Neighbourhood,Latitude,Longitude,Distance,Venue Category,Score,Zone,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,0,560001,M.G Road,12.973015,77.616612,0.0,18,0,,2.0,Restaurant,Chinese Restaurant,Indian Restaurant,Thai Restaurant,Asian Restaurant,Japanese Restaurant,Mediterranean Restaurant,Modern European Restaurant,French Restaurant,Kerala Restaurant
1,1,560002,Bangalore City,12.97912,77.5913,2.82811,2,0,,0.0,Restaurant,Indian Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
2,3,560004,Basavanagudi,12.941726,77.575502,5.646159,5,0,,0.0,Indian Restaurant,Restaurant,Mediterranean Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant
3,4,560006,J.C.Nagar,15.349723,75.137521,375.187845,2,0,,4.0,Vegetarian / Vegan Restaurant,Indian Restaurant,Fast Food Restaurant,Kerala Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
4,6,560009,K.G Road,12.89122,77.622812,9.07405,1,0,,1.0,Indian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant,Eastern European Restaurant


In [744]:
for index, row in df_merged.iterrows():
   # print(index)
    distance_score = 0
    no_restaurants_score = 0
    #columns to be used -> Distance and Venue Category
    
    if(row['Distance'] >= 0 and row['Distance'] <= 5):
        distance_score = 5
    elif (row['Distance'] >= 6 and row['Distance'] <= 10):
        distance_score = 4
    elif (row['Distance'] >= 11 and row['Distance'] <= 18):
        distance_score = 3
    elif (row['Distance'] >= 19 and row['Distance'] <= 25):
        distance_score = 2
    elif (row['Distance'] >= 26):
        distance_score = 1
      
    if(row['Venue Category'] >= 0 and row['Venue Category'] <= 10):
        no_restaurants_score = 5
    elif (row['Venue Category'] >= 11 and row['Venue Category'] <= 15):
        no_restaurants_score = 4
    elif (row['Venue Category'] >= 16 and row['Venue Category'] <= 25):
        no_restaurants_score = 3
    elif (row['Venue Category'] >= 26 and row['Venue Category'] <= 30):
        no_restaurants_score = 2
    elif (row['Venue Category'] >= 31):
        no_restaurants_score = 1
        
    df_merged.ix[index,'Score'] = distance_score + no_restaurants_score
    
    # add Zones
    _zone = df_zonal_data[df_zonal_data['Neighbourhood'] == row['Neighbourhood']]
    if(len(_zone['Zone']) > 0):
        df_merged.ix[index,'Zone'] = _zone['Zone'].values[0]
    
df_merged

.ix is deprecated. Please use
.loc for label based indexing or
.iloc for positional indexing

See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#ix-indexer-is-deprecated
.ix is deprecated. Please use
.loc for label based indexing or
.iloc for positional indexing

See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#ix-indexer-is-deprecated


Unnamed: 0,index,Postcode,Neighbourhood,Latitude,Longitude,Distance,Venue Category,Score,Zone,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,0,560001,M.G Road,12.973015,77.616612,0.0,18,8,Central,2.0,Restaurant,Chinese Restaurant,Indian Restaurant,Thai Restaurant,Asian Restaurant,Japanese Restaurant,Mediterranean Restaurant,Modern European Restaurant,French Restaurant,Kerala Restaurant
1,1,560002,Bangalore City,12.97912,77.5913,2.82811,2,10,Central,0.0,Restaurant,Indian Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
2,3,560004,Basavanagudi,12.941726,77.575502,5.646159,5,5,Southern,0.0,Indian Restaurant,Restaurant,Mediterranean Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant
3,4,560006,J.C.Nagar,15.349723,75.137521,375.187845,2,6,Central,4.0,Vegetarian / Vegan Restaurant,Indian Restaurant,Fast Food Restaurant,Kerala Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
4,6,560009,K.G Road,12.89122,77.622812,9.07405,1,9,Central,1.0,Indian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant,Eastern European Restaurant
5,7,560010,Rajajinagar,12.988234,77.554883,6.905588,1,9,Western,1.0,Indian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant,Eastern European Restaurant
6,8,560011,Jayanagar 3rd Block,12.934705,77.582714,5.611704,17,3,Southern,0.0,Indian Restaurant,Chinese Restaurant,Restaurant,Andhra Restaurant,Asian Restaurant,Fast Food Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant
7,10,560013,Jalahalli,13.046453,77.54838,10.990634,2,5,Northern,4.0,Vegetarian / Vegan Restaurant,Indian Restaurant,Fast Food Restaurant,Kerala Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
8,11,560014,Jalahalli East,13.046453,77.54838,10.990634,2,5,Northern,4.0,Vegetarian / Vegan Restaurant,Indian Restaurant,Fast Food Restaurant,Kerala Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
9,14,560017,NAL,12.966722,77.653786,4.092918,11,9,Eastern,0.0,Indian Restaurant,Chinese Restaurant,Kerala Restaurant,South Indian Restaurant,Fast Food Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Vietnamese Restaurant


In [745]:
df_merged.sort_values(ascending=False, by='Score').head(10)

Unnamed: 0,index,Postcode,Neighbourhood,Latitude,Longitude,Distance,Venue Category,Score,Zone,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
25,37,560050,Ashoknagar,12.97912,77.5913,2.82811,2,10,Central,0.0,Restaurant,Indian Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
21,33,560046,Benson Town,12.997803,77.604175,3.056261,1,10,Central,2.0,Pakistani Restaurant,Vietnamese Restaurant,Fast Food Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,Indian Restaurant,French Restaurant,Eastern European Restaurant,Mediterranean Restaurant
26,38,560053,Chickpet,12.968003,77.578642,4.156842,2,10,Central,2.0,Middle Eastern Restaurant,South Indian Restaurant,Vietnamese Restaurant,Eastern European Restaurant,Japanese Restaurant,Italian Restaurant,Indian Restaurant,French Restaurant,Fast Food Restaurant,Comfort Food Restaurant
1,1,560002,Bangalore City,12.97912,77.5913,2.82811,2,10,Central,0.0,Restaurant,Indian Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
34,50,560071,Domlur,12.962467,77.638196,2.616447,6,10,Central,0.0,Indian Restaurant,Vietnamese Restaurant,Rajasthani Restaurant,Italian Restaurant,Chinese Restaurant,Karnataka Restaurant,Japanese Restaurant,French Restaurant,Fast Food Restaurant,Eastern European Restaurant
22,34,560047,Austin Town,12.961274,77.615294,1.306855,2,10,Central,0.0,Indian Restaurant,Italian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant
14,21,560029,Dharmaram College,12.933841,77.604203,4.538165,5,10,Southern,0.0,Indian Restaurant,Kerala Restaurant,Middle Eastern Restaurant,Fast Food Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
48,71,560103,Bellandur,12.97912,77.5913,2.82811,2,10,Eastern,0.0,Restaurant,Indian Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
37,54,560076,Bannerghatta Road,12.910639,77.600046,7.130945,7,9,Southern,0.0,Indian Restaurant,Udupi Restaurant,Fast Food Restaurant,Punjabi Restaurant,Vietnamese Restaurant,Eastern European Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Comfort Food Restaurant
5,7,560010,Rajajinagar,12.988234,77.554883,6.905588,1,9,Western,1.0,Indian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant,Eastern European Restaurant


In [746]:
df_top10 = df_merged.sort_values(ascending=False, by='Score').head(10)

So from the results above we can see that 8 neighbourhoods have got full 10 points.
The neighbourhoods with highest points are the best neighbourhoods to open a new restaurants. So there are atleast 8 neighbourhoods in 
Bangalore city where a new restaurant can be opened.

In [747]:
df_top10['Neighbourhood'] # top 10 neighbourhoods to open a new restaurant.

25           Ashoknagar
21          Benson Town
26             Chickpet
1        Bangalore City
34               Domlur
22          Austin Town
14    Dharmaram College
48            Bellandur
37    Bannerghatta Road
5           Rajajinagar
Name: Neighbourhood, dtype: object

Below are the top 10 neighbourhoods where not to open a new restaurant. 

In [748]:
df_bottom10 = df_merged.sort_values(ascending=False, by='Score').tail(10)

In [749]:
df_bottom10['Neighbourhood'] # top 10 neighbourhoods not to open a new restaurant.

2            Basavanagudi
7               Jalahalli
44        C.V.Raman Nagar
30         Chikkalasandra
8          Jalahalli East
11          Seshadripuram
18           Nayandahalli
20              Banaswadi
40         Sadashivanagar
6     Jayanagar 3rd Block
Name: Neighbourhood, dtype: object

We can even show the various zones in Bangalore and can show the best neighbourhoods to open a new restaurant in these zones.

In [751]:
df_zones = df_merged.groupby('Zone').count()
df_zones

Unnamed: 0_level_0,index,Postcode,Neighbourhood,Latitude,Longitude,Distance,Venue Category,Score,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Zone,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1
Central,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13
Eastern,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8
Northern,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10
Outskirt,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2
Southern,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11
Western,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6


In [752]:
df_zones_central = df_merged[df_merged['Zone'] == 'Central']
df_zones_central.sort_values(ascending=False, by='Score').head(3)

Unnamed: 0,index,Postcode,Neighbourhood,Latitude,Longitude,Distance,Venue Category,Score,Zone,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,1,560002,Bangalore City,12.97912,77.5913,2.82811,2,10,Central,0.0,Restaurant,Indian Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
21,33,560046,Benson Town,12.997803,77.604175,3.056261,1,10,Central,2.0,Pakistani Restaurant,Vietnamese Restaurant,Fast Food Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,Indian Restaurant,French Restaurant,Eastern European Restaurant,Mediterranean Restaurant
22,34,560047,Austin Town,12.961274,77.615294,1.306855,2,10,Central,0.0,Indian Restaurant,Italian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant


In [753]:
df_zones_central = df_merged[df_merged['Zone'] == 'Northern']
df_zones_central.sort_values(ascending=False, by='Score').head(3)

Unnamed: 0,index,Postcode,Neighbourhood,Latitude,Longitude,Distance,Venue Category,Score,Zone,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
13,18,560024,Anandnagar,13.033377,77.589523,7.295879,1,9,Northern,1.0,Indian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant,Eastern European Restaurant
27,39,560054,Mathikere,13.032888,77.557374,9.22891,2,9,Northern,0.0,American Restaurant,Indian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant,Eastern European Restaurant
29,42,560057,Peenya Dasarahalli,13.033019,77.533201,11.222554,2,8,Northern,0.0,Fast Food Restaurant,Indian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant


In [754]:
df_zones_central = df_merged[df_merged['Zone'] == 'Eastern']
df_zones_central.sort_values(ascending=False, by='Score').head(3)

Unnamed: 0,index,Postcode,Neighbourhood,Latitude,Longitude,Distance,Venue Category,Score,Zone,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
48,71,560103,Bellandur,12.97912,77.5913,2.82811,2,10,Eastern,0.0,Restaurant,Indian Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
9,14,560017,NAL,12.966722,77.653786,4.092918,11,9,Eastern,0.0,Indian Restaurant,Chinese Restaurant,Kerala Restaurant,South Indian Restaurant,Fast Food Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Vietnamese Restaurant
16,27,560037,Doddanekkundi,12.976395,77.694118,8.41729,2,9,Eastern,0.0,Mediterranean Restaurant,Indian Restaurant,Vietnamese Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant,Eastern European Restaurant


In [755]:
df_zones_central = df_merged[df_merged['Zone'] == 'Southern']
df_zones_central.sort_values(ascending=False, by='Score').head(3)

Unnamed: 0,index,Postcode,Neighbourhood,Latitude,Longitude,Distance,Venue Category,Score,Zone,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,21,560029,Dharmaram College,12.933841,77.604203,4.538165,5,10,Southern,0.0,Indian Restaurant,Kerala Restaurant,Middle Eastern Restaurant,Fast Food Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Vietnamese Restaurant,Mediterranean Restaurant
37,54,560076,Bannerghatta Road,12.910639,77.600046,7.130945,7,9,Southern,0.0,Indian Restaurant,Udupi Restaurant,Fast Food Restaurant,Punjabi Restaurant,Vietnamese Restaurant,Eastern European Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Comfort Food Restaurant
41,59,560083,Bannerghatta,12.895583,77.602622,8.699841,2,9,Southern,0.0,Indian Restaurant,South Indian Restaurant,Vietnamese Restaurant,Eastern European Restaurant,Karnataka Restaurant,Japanese Restaurant,Italian Restaurant,French Restaurant,Fast Food Restaurant,Comfort Food Restaurant


In [756]:
df_zones_central = df_merged[df_merged['Zone'] == 'Western']
df_zones_central.sort_values(ascending=False, by='Score').head(3)

Unnamed: 0,index,Postcode,Neighbourhood,Latitude,Longitude,Distance,Venue Category,Score,Zone,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,7,560010,Rajajinagar,12.988234,77.554883,6.905588,1,9,Western,1.0,Indian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant,Eastern European Restaurant
39,56,560079,Basaveshwaranagar,12.986475,77.538571,8.596926,7,9,Western,0.0,Indian Restaurant,Asian Restaurant,Fast Food Restaurant,Karnataka Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Chettinad Restaurant,Chinese Restaurant
42,61,560085,Banashankari III Stage,12.932708,77.546254,8.841212,1,9,Western,1.0,Indian Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,American Restaurant,Andhra Restaurant,Asian Restaurant,Chettinad Restaurant,Chinese Restaurant,Comfort Food Restaurant,Eastern European Restaurant


### 5. Results and Discussion

Our results show that there are many neighbourhoods in the Bangalore city where a person can open a new restaurant. Out of all the 100+ neighbourhoods that we took in the beginning, we only took those neighbourhoods which had the Venue Categories which contains any kind of restaurants. And in that set of neighbourhoods we found the best neighbourhoods to open a new restaurant based on the score calculated from the 2 parameters: distance of the neighbourhood from the city centre and no. of restaurants already present in the neighbourhood.
We have done a proper Exploratory Data Analysis (EDA) of the data we got for our Bangalore city neighbourhoods. We pulled the various neighbourhoods of Bangalore and correct few names of the neighbourhoods so that we can get good venue details of them. After that we got the coordinates details of the neighbourhoods and distance of the neighbourhoods from the city centre using geopy and tried to plot a map of the city of Bangalore with the neighbourhoods.

Then we pulled the venue details of the neighbourhoods and took out only those neighbourhoods which contained any kind of restaurants in their venue categories. On this dataset we did grouping of various types to analyse the neighbourhood and created onehot encoding for the venue categories. In the end tried to do clustering of the various neighbourhoods with this data.

And finally, we added the no. of restaurants in the neighbourhood in the dataset and calculated the score of the neighbourhoods by using the 2 parameters: distance of the neighbourhood from city centre and no. of restaurants already present in the neighbourhood. And the neighbourhoods which got the highest scores are our required neighbourhoods to open a new restaurant in the city of Bangalore.

top 10 neighbourhoods to open a new restaurant in Bangalore: 
Ashoknagar 
Benson Town
Chickpet
Bangalore City 
Domlur 
Austin Town 
Dharmaram College 
Bellandur 
Bannerghatta Road 
Rajajinagar

top neighbourhoods in the zones of Bangalore: 
Western -> Rajajinagar 
Southern -> Dharmaram College 
Eastern -> Bellandur 
Northern -> Anandnagar 
Central -> Bangalore City

So, this our final result that we got for the city of Bangalore with neighbourhoods’ details to open a new restaurant. This result is all based on the datasets that we have pulled.


### 6. Conclusion

In this assignment we focused on to find the best location to open a restaurant in the city of Bangalore. We will did a Location Analysis of the neighbouhoods of Bangalore. Since there are lots of restaurants in Bangalore, we tried to detect those locations that are not already crowded with restaurants. We preferred locations which are close to the city centre as well. 
We have taken only 2 parameters to check whether a neighbourhood is good or not good to open a new restaurant. We can even have other parameters like demography of the neighbourhood like population, literacy rate, sex ratio etc., traffic, labour costs in the neighbourhood etc. So like this we can take the best parameters we need for our data analysis and do a Location Analysis on a Neighbourhood and we can find which neighbourhoods are best for opening a new restaurant. This same methodology can be used for any kind of business with respective parameters to be used. We can even do a Business Analysis. With the power of Data Analysis and ML/AI we can always find great insights of things which we dont see from outside. This is a very powerful tool it works like Magic!