# Capstone Project - Most Similar Neighborhood

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

In this project we will try to find the Bangalore neighborhoods that are most similar to a given location. Specifically, this report is aimed at people who are planning to relocate to Bangalore and want to know which parts of the city resemble the neighborhood they currently live in.

Bangalore is home to a large number of IT companies, so employees from all over India relocate here every month. Bangalore is in the southern part of India, and its neighborhoods (especially the food and restaurants) are quite different from those in the north. People who are relocating will therefore tend to prefer a neighborhood similar to the one they live in now.

We will use our data science powers to generate a short list of the most promising neighborhoods based on this criterion. The advantages of each area will then be clearly laid out so that the person can choose the best possible final location.

## Data <a name="data"></a>

We decided to define our neighborhoods as the locations (all S.O and B.O post offices) grouped around the pincodes of Bangalore.

The following data sources are needed to extract or generate the required information:
* the list of Bangalore pincodes and their post offices will be loaded from https://finkode.com/ka/bangalore.html, and the coordinates of each pincode will be obtained using **geopy**
* the number of venues in each neighborhood, together with their type and location, will be obtained using the **Foursquare API**
* the coordinates of the center of Bangalore will be obtained using **geopy**
* details of the person's current neighborhood will be supplied by the user, and its coordinates will be obtained using **geopy**

In [1]:
import pandas as pd
!pip install lxml
!pip install geopy
from geopy.geocoders import Nominatim
import matplotlib.cm as cm
import matplotlib.colors as colors
import folium 
import requests
from pandas.io.json import json_normalize 



### Loading data from https://finkode.com/ka/bangalore.html

In [2]:
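html = 'https://finkode.com/ka/bangalore.html'
try:
    # read_html returns a list of tables; the pincode table is the first one
    bangalore_data = pd.read_html(html)[0]
except (ValueError, IndexError):
    # read_html raises ValueError when the page contains no tables
    print("Table not found")
    raise
bangalore_data.rename(columns={"Post Office": "PostOffice"}, inplace=True)
bangalore_data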

Unnamed: 0,PostOffice,District,Pincode
0,A F Station Yelahanka S.O,Bangalore,560063
1,Adugodi S.O,Bangalore,560030
2,Agara B.O,Bangalore,560034
3,Agram S.O,Bangalore,560007
4,Amruthahalli B.O,Bangalore,560092
...,...,...,...
265,Yelahanka S.O,Bangalore,560064
266,Yelahanka Satellite Town S.O,Bangalore,560064
267,Yemalur B.O,Bangalore,560037
268,Yeshwanthpur Bazar S.O,Bangalore,560022


In [3]:
len(bangalore_data.Pincode.unique())


104

In [4]:
bangalore_data=bangalore_data.groupby('Pincode')['PostOffice'].apply(','.join).reset_index()


Unnamed: 0,Pincode,PostOffice
0,560001,"Bangalore Bazaar S.O,Bangalore G.P.O.,Cubban R..."
1,560002,"Bangalore City S.O,Bangalore Corporation Build..."
2,560003,"Malleswaram S.O,Palace Guttahalli S.O,Swimming..."
3,560004,"Basavanagudi H.O,Mavalli S.O,Pampamahakavi Roa..."
4,560005,Fraser Town S.O
...,...,...
99,562120,Chamarajasagara B.O
100,562125,"Dommasandra B.O,Handenahalli B.O,Kugur B.O,Sar..."
101,562130,"Chikkanahalli B.O,Chunchanakuppe B.O,Kadabager..."
102,562149,"Bagalur S.O (Bangalore),Bandikodigehalli B.O,D..."


In [5]:
print(len(bangalore_data.Pincode.unique()))
bangalore_data['Latitude']=0
bangalore_data['Longitude']=0
print(bangalore_data.head())


104
   Pincode                                         PostOffice  Latitude  \
0   560001  Bangalore Bazaar S.O,Bangalore G.P.O.,Cubban R...         0   
1   560002  Bangalore City S.O,Bangalore Corporation Build...         0   
2   560003  Malleswaram S.O,Palace Guttahalli S.O,Swimming...         0   
3   560004  Basavanagudi H.O,Mavalli S.O,Pampamahakavi Roa...         0   
4   560005                                    Fraser Town S.O         0   

   Longitude  
0          0  
1          0  
2          0  
3          0  
4          0  


### Adding latitude & longitude of pincodes

In [6]:
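def lat_lng(bangalore_data):
    # build the geocoder once instead of once per row
    geolocator = Nominatim(user_agent="ny_explorer")
    for i in range(len(bangalore_data)):
        # column 0 holds the pincode; pass it to the geocoder as a string query
        location = geolocator.geocode(str(bangalore_data.iloc[i, 0]))
        # rows that cannot be geocoded keep their initial 0 coordinates
        if location is not None:
            bangalore_data.iloc[i, 2] = location.latitude
            bangalore_data.iloc[i, 3] = location.longitude
    return bangalore_data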
    

In [7]:
fin_bangalore_data=lat_lng(bangalore_data)

In [8]:
fin_bangalore_data

Unnamed: 0,Pincode,PostOffice,Latitude,Longitude
0,560001,"Bangalore Bazaar S.O,Bangalore G.P.O.,Cubban R...",-33.038136,137.575919
1,560002,"Bangalore City S.O,Bangalore Corporation Build...",12.958625,77.577567
2,560003,"Malleswaram S.O,Palace Guttahalli S.O,Swimming...",13.000240,77.565249
3,560004,"Basavanagudi H.O,Mavalli S.O,Pampamahakavi Roa...",12.944829,77.567095
4,560005,Fraser Town S.O,13.428852,77.627203
...,...,...,...,...
99,562120,Chamarajasagara B.O,12.922515,77.215788
100,562125,"Dommasandra B.O,Handenahalli B.O,Kugur B.O,Sar...",12.887615,77.739003
101,562130,"Chikkanahalli B.O,Chunchanakuppe B.O,Kadabager...",13.855864,75.960947
102,562149,"Bagalur S.O (Bangalore),Bandikodigehalli B.O,D...",13.116054,77.667895


In [182]:
clustering_data=fin_bangalore_data[fin_bangalore_data.Latitude>0]

In [183]:
clustering_data.shape

(101, 4)

In [185]:
clustering_data.head()

Unnamed: 0,Pincode,PostOffice,Latitude,Longitude
1,560002,"Bangalore City S.O,Bangalore Corporation Build...",12.958625,77.577567
2,560003,"Malleswaram S.O,Palace Guttahalli S.O,Swimming...",13.00024,77.565249
3,560004,"Basavanagudi H.O,Mavalli S.O,Pampamahakavi Roa...",12.944829,77.567095
4,560005,Fraser Town S.O,13.428852,77.627203
5,560006,"J.C.Nagar S.O,Training Command IAF S.O",13.006087,77.593151
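
The `Latitude > 0` filter above drops rows that were not geocoded or that landed in the southern hemisphere, but it still keeps rows such as pincode 562130 (13.855864, 75.960947), which lies more than a degree of longitude west of the city center. A tighter filter could keep only points inside a box around the center. The following is a minimal sketch on three rows copied from the output above; the half-degree bound, the function name `within_city_bounds`, and the constant names are assumptions made for illustration, and the center coordinates are the ones geopy reports for 'Bangalore' later in this notebook.

```python
import pandas as pd

# Hypothetical bounds: half a degree (roughly 55 km) either side of the
# city-center coordinates; tune these to your own definition of "Bangalore".
CENTER_LAT, CENTER_LNG = 12.9791198, 77.5912997
MAX_OFFSET_DEG = 0.5


def within_city_bounds(df, lat_col="Latitude", lng_col="Longitude"):
    """Return a copy of df keeping only rows geocoded near the city center."""
    near_lat = (df[lat_col] - CENTER_LAT).abs() <= MAX_OFFSET_DEG
    near_lng = (df[lng_col] - CENTER_LNG).abs() <= MAX_OFFSET_DEG
    return df[near_lat & near_lng].copy()


# Three rows taken from the geocoding output above
sample = pd.DataFrame({
    "Pincode": [560001, 560002, 562130],
    "Latitude": [-33.038136, 12.958625, 13.855864],
    "Longitude": [137.575919, 77.577567, 75.960947],
})
filtered = within_city_bounds(sample)
print(filtered["Pincode"].tolist())
```

Returning a copy also means later in-place changes to the filtered frame do not touch the original one.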


In [186]:
address = 'Bangalore'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Bangalore City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Bangalore City are 12.9791198, 77.5912997.


### Plotting the pincodes of Bangalore on a map

In [187]:
 
# create map of Bangalore using latitude and longitude values
map_to = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(clustering_data['Latitude'], clustering_data['Longitude'], clustering_data['PostOffice']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_to)  
    
map_to

### Foursquare API for finding nearby venues

In [188]:
import os

# Read the credentials from the environment rather than hard-coding them
# (the environment variable names are an arbitrary choice).
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret
VERSION = '20180605'  # Foursquare API version
# rename() returns a new DataFrame, which avoids the SettingWithCopyWarning
neighborhoods_bangalore = clustering_data.rename(columns={"PostOffice": "Neighborhood"})


In [189]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT=100):
    """Return one row per venue that Foursquare reports within `radius` metres of each neighborhood."""
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
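
`getNearbyVenues` indexes `categories[0]` directly, which would raise an `IndexError` if a venue were returned with an empty category list. Whether the API actually does that is an assumption here, but a more defensive parser is cheap to write. The sketch below flattens the `items` list of one response into row tuples and falls back to `None` for a missing category. The helper name `parse_explore_items` is hypothetical, the JSON layout is the one the function above already reads, and the sample items are hand-written rather than real API output.

```python
def parse_explore_items(name, lat, lng, items):
    """Flatten the 'items' list of one explore response into row tuples.

    Assumes the JSON layout that getNearbyVenues reads:
    item['venue'] -> name, location.lat, location.lng, categories[0].name
    """
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or []
        # Fall back to None instead of raising IndexError on an empty list
        category = categories[0].get("name") if categories else None
        rows.append((name, lat, lng, venue.get("name"),
                     location.get("lat"), location.get("lng"), category))
    return rows


# Hand-written items in the assumed layout (not real API output)
fake_items = [
    {"venue": {"name": "Venue A", "location": {"lat": 12.95, "lng": 77.57},
               "categories": [{"name": "Indian Restaurant"}]}},
    {"venue": {"name": "Venue B", "location": {"lat": 12.96, "lng": 77.58},
               "categories": []}},
]
rows = parse_explore_items("Example S.O", 12.958625, 77.577567, fake_items)
for row in rows:
    print(row)
```

Rows with a `None` category can then be dropped or relabeled before one-hot encoding, depending on how much data you can afford to lose.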

In [190]:
bangalore_venues = getNearbyVenues(names=neighborhoods_bangalore['Neighborhood'],
                                   latitudes=neighborhoods_bangalore['Latitude'],
                                   longitudes=neighborhoods_bangalore['Longitude']
                                  )

print(bangalore_venues.shape)
bangalore_venues

(862, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Bangalore City S.O,Bangalore Corporation Build...",12.958625,77.577567,NMH Tiffin House,12.954300,77.578806,Indian Restaurant
1,"Bangalore City S.O,Bangalore Corporation Build...",12.958625,77.577567,Mayuri Restaurant,12.956120,77.580845,Indian Restaurant
2,"Bangalore City S.O,Bangalore Corporation Build...",12.958625,77.577567,Hotel Nandhini,12.955307,77.579498,Indian Restaurant
3,"Bangalore City S.O,Bangalore Corporation Build...",12.958625,77.577567,Bangalore Fort,12.962529,77.575816,Historic Site
4,"Malleswaram S.O,Palace Guttahalli S.O,Swimming...",13.000240,77.565249,Raghavendra Stores,13.000799,77.563924,Breakfast Spot
...,...,...,...,...,...,...,...
857,Bangalore International Airport S.O,13.199828,77.703860,Bangalore Run Way,13.201992,77.701385,Moving Target
858,Bangalore International Airport S.O,13.199828,77.703860,Windmills,13.197953,77.707560,Beer Bar
859,Bangalore International Airport S.O,13.199828,77.703860,Thai Airways check-in,13.198118,77.707842,Airport Service
860,"Attibele S.O,Bidaraguppe B.O,Mayasandra B.O,Ne...",12.777538,77.765040,Shree Ranga Vilas,12.779921,77.768066,South Indian Restaurant


We have now collected the pincode data: the latitude and longitude of each pincode, the venues near it, and the coordinates of those venues.
This concludes the data gathering phase. We are now ready to analyze this data and report on the most similar neighborhoods.

## Methodology <a name="methodology"></a>

In this project we direct our efforts at detecting areas of Bangalore whose neighborhood is similar to the user's current neighborhood.

In the first step we collected the required **data**: the pincodes, the areas within each pincode (which together form a neighborhood), and the venues near each pincode's coordinates, up to 100 within 500 m (according to Foursquare categorization).

The second step of our analysis is the calculation and exploration of the 'neighborhood' profile across the different areas of Bangalore: we find the most frequent venue categories per pincode, one-hot encode the data, and then apply k-means clustering on top of it with k=5. That means we divide the whole Bangalore area into 5 clusters.

In the third and final step we focus on finding the areas most similar to the user's input: we take the user's current place and pincode as input and then identify the areas of Bangalore that are most similar to the one specified by the user.
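
To make the third step concrete, here is a minimal sketch of one way the matching could work. It is an illustration under stated assumptions, not the method used in this notebook. It builds per-neighborhood category frequencies from a toy venue table, as in step two, and ranks neighborhoods by Euclidean distance to a user-supplied frequency profile. Euclidean distance is chosen only because k-means uses the same metric. The function name `most_similar`, the toy data, and the user profile are all hypothetical.

```python
import numpy as np
import pandas as pd

# Toy venue table; neighborhood and category names are illustrative only
toy_venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "B", "B", "C", "C"],
    "Venue Category": ["Indian Restaurant", "Café",
                       "Indian Restaurant", "Indian Restaurant",
                       "Pub", "Café"],
})

# One-hot encode the categories, then average per neighborhood to get frequencies
onehot = pd.get_dummies(toy_venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighborhood", toy_venues["Neighborhood"])
profiles = onehot.groupby("Neighborhood").mean()


def most_similar(profiles, user_profile, top_n=2):
    """Rank neighborhoods by Euclidean distance to the user's profile."""
    # Align the user's categories with the table's columns; missing ones become 0
    user_vector = user_profile.reindex(profiles.columns, fill_value=0.0)
    distances = np.sqrt(((profiles - user_vector) ** 2).sum(axis=1))
    return distances.sort_values().head(top_n)


# Hypothetical user neighborhood: mostly Indian restaurants, some cafés
user = pd.Series({"Indian Restaurant": 0.6, "Café": 0.4})
ranking = most_similar(profiles, user)
print(ranking)
```

In the real pipeline the user's profile would come from the same Foursquare lookup applied to the user's own coordinates, so that both sides are measured the same way.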

## Analysis <a name="analysis"></a>

Let's perform some basic exploratory data analysis and derive some additional info from our raw data. First let's count the **number of venues in every neighborhood**:

In [191]:
bangalore_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"A F Station Yelahanka S.O,BSF Campus Yelahanka S.O",1,1,1,1,1,1
Adugodi S.O,4,4,4,4,4,4
Agram S.O,4,4,4,4,4,4
"Amruthahalli B.O,Byatarayanapura B.O,Kodigehalli B.O,Sahakaranagar P.O S.O",12,12,12,12,12,12
"Anandnagar S.O (Bangalore),H.A. Farm S.O,Hebbal Kempapura S.O",7,7,7,7,7,7
...,...,...,...,...,...,...
Science Institute S.O,1,1,1,1,1,1
Seshadripuram S.O,39,39,39,39,39,39
Sivan Chetty Gardens S.O,13,13,13,13,13,13
Tarabanahalli B.O,2,2,2,2,2,2
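
`groupby(...).count()` repeats the same count in every column, as the output above shows. If only the number of venues per neighborhood is needed, `groupby(...).size()` returns a single series. A small sketch on a toy stand-in for `bangalore_venues` (the rows and the `Venue Count` label are made up for illustration):

```python
import pandas as pd

# Toy stand-in for bangalore_venues with the two relevant columns
toy_venues = pd.DataFrame({
    "Neighborhood": ["Adugodi S.O", "Adugodi S.O", "Agram S.O"],
    "Venue": ["Venue A", "Venue B", "Venue C"],
})

# One count per neighborhood, largest first
venue_counts = (
    toy_venues.groupby("Neighborhood")
    .size()
    .rename("Venue Count")
    .sort_values(ascending=False)
)
print(venue_counts)
```

Sorting the counts makes it easy to spot neighborhoods with only one or two venues, whose frequency profiles are dominated by a single category.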


Let's look at the unique venue categories

In [192]:
print('There are {} unique categories.'.format(len(bangalore_venues['Venue Category'].unique())))

There are 165 unique categories.


### Perform one hot encoding on the data

In [193]:
#one hot encoding
bangalore_onehot = pd.get_dummies(bangalore_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
bangalore_onehot['Neighborhood'] = bangalore_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [bangalore_onehot.columns[-1]] + list(bangalore_onehot.columns[:-1])
bangalore_onehot = bangalore_onehot[fixed_columns]

bangalore_onehot.head()

Unnamed: 0,Neighborhood,ATM,Accessories Store,Afghan Restaurant,Airport Service,American Restaurant,Andhra Restaurant,Arcade,Art Gallery,Arts & Crafts Store,...,Toy / Game Store,Track,Trail,Train Station,Udupi Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfront,Wine Shop,Women's Store
0,"Bangalore City S.O,Bangalore Corporation Build...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,"Bangalore City S.O,Bangalore Corporation Build...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,"Bangalore City S.O,Bangalore Corporation Build...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,"Bangalore City S.O,Bangalore Corporation Build...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"Malleswaram S.O,Palace Guttahalli S.O,Swimming...",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [194]:
bangalore_grouped = bangalore_onehot.groupby('Neighborhood').mean().reset_index()
bangalore_grouped

Unnamed: 0,Neighborhood,ATM,Accessories Store,Afghan Restaurant,Airport Service,American Restaurant,Andhra Restaurant,Arcade,Art Gallery,Arts & Crafts Store,...,Toy / Game Store,Track,Trail,Train Station,Udupi Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfront,Wine Shop,Women's Store
0,"A F Station Yelahanka S.O,BSF Campus Yelahanka...",0.0,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000
1,Adugodi S.O,0.0,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000
2,Agram S.O,0.0,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000
3,"Amruthahalli B.O,Byatarayanapura B.O,Kodigehal...",0.0,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000
4,"Anandnagar S.O (Bangalore),H.A. Farm S.O,Hebba...",0.0,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
83,Science Institute S.O,0.0,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000
84,Seshadripuram S.O,0.0,0.0,0.0,0.0,0.025641,0.0,0.051282,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000
85,Sivan Chetty Gardens S.O,0.0,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923
86,Tarabanahalli B.O,0.0,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000


In [195]:
bangalore_grouped.shape

(88, 166)

Let's check out the top 5 venues in each neighborhood along with their average frequency

In [196]:
num_top_venues = 5

for hood in bangalore_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = bangalore_grouped[bangalore_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----A F Station Yelahanka S.O,BSF Campus Yelahanka S.O----
             venue  freq
0        Juice Bar   1.0
1              ATM   0.0
2             Park   0.0
3  Motorcycle Shop   0.0
4    Movie Theater   0.0


----Adugodi S.O----
         venue  freq
0   Playground  0.25
1       Bakery  0.25
2  Pizza Place  0.25
3         Café  0.25
4          ATM  0.00


----Agram S.O----
            venue  freq
0          Casino  0.25
1       Juice Bar  0.25
2     Pizza Place  0.25
3  Breakfast Spot  0.25
4             ATM  0.00


----Amruthahalli B.O,Byatarayanapura B.O,Kodigehalli B.O,Sahakaranagar P.O S.O----
               venue  freq
0  Indian Restaurant  0.17
1  Indian Sweet Shop  0.08
2       Liquor Store  0.08
3             Resort  0.08
4     Sandwich Place  0.08


----Anandnagar S.O (Bangalore),H.A. Farm S.O,Hebbal Kempapura S.O----
               venue  freq
0  Indian Restaurant  0.29
1             Market  0.14
2        Coffee Shop  0.14
3        Pizza Place  0.14
4   Department Store  0.1

Let's transform the data into a better format, in which we can easily see the top venue categories of each neighborhood (the top 50 here).

In [197]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [198]:
import numpy as np
num_top_venues = 50

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # past 'rd': fall back to the generic 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = bangalore_grouped['Neighborhood']

for ind in np.arange(bangalore_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(bangalore_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
0,"A F Station Yelahanka S.O,BSF Campus Yelahanka...",Juice Bar,Women's Store,Diner,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market,Event Space,Electronics Store,...,Clothing Store,Chinese Restaurant,Ice Cream Shop,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,Athletics & Sports
1,Adugodi S.O,Playground,Pizza Place,Bakery,Café,Diner,Fast Food Restaurant,Farmers Market,Event Space,Electronics Store,...,Food & Drink Shop,Flea Market,Chinese Restaurant,Chocolate Shop,Hotel Bar,Bar,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage
2,Agram S.O,Casino,Pizza Place,Juice Bar,Breakfast Spot,Women's Store,Diner,Fast Food Restaurant,Farmers Market,Event Space,...,Food & Drink Shop,Cocktail Bar,Chinese Restaurant,Chocolate Shop,Hotel Bar,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage
3,"Amruthahalli B.O,Byatarayanapura B.O,Kodigehal...",Indian Restaurant,Resort,Badminton Court,Sandwich Place,Café,Italian Restaurant,Liquor Store,Indian Sweet Shop,Snack Place,...,Food Court,Food & Drink Shop,Clothing Store,Flea Market,Cocktail Bar,Chaat Place,Chocolate Shop,Chinese Restaurant,Bar,Bakery
4,"Anandnagar S.O (Bangalore),H.A. Farm S.O,Hebba...",Indian Restaurant,Department Store,Coffee Shop,Pizza Place,Park,Market,Diner,Fast Food Restaurant,Farmers Market,...,Food & Drink Shop,Clothing Store,Chocolate Shop,Chinese Restaurant,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage
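
The column labels above fall back to 'th' for every position past the third, which is why labels such as '41th' and '42th' appear. If correct English ordinals matter for presentation, a small helper like the following sketch could generate the labels instead. The function name `ordinal` is hypothetical, and the tables in this notebook were produced with the original labels.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    # 11, 12 and 13 (and 111, 112, ...) take 'th' despite ending in 1, 2 or 3
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)


num_top_venues = 50
columns = ["Neighborhood"] + [
    "{} Most Common Venue".format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns[1], columns[11], columns[41])
```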


### Clustering the neighborhoods (k-means clustering)

In [199]:
from sklearn.cluster import KMeans
# set number of clusters
kclusters = 5

bangalore_grouped_clustering = bangalore_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(bangalore_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]


array([1, 1, 1, 2, 2, 2, 1, 1, 1, 0], dtype=int32)
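
The notebook fixes k=5 without showing how that value was chosen. One common heuristic, offered here only as a sketch, is to fit k-means for several values of k and look for the point where the inertia (the within-cluster sum of squares) stops dropping sharply. The example runs on synthetic points with three obvious groups rather than on the Bangalore data, so it says nothing about whether 5 is a good choice for this dataset.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the frequency table: three well-separated groups
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
points = np.vstack([c + 0.3 * rng.standard_normal((30, 2)) for c in centers])

inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(points)
    # inertia_ is the sum of squared distances to the nearest centroid
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(k, round(value, 2))
```

On the real data the same loop would take `bangalore_grouped_clustering` in place of `points`; the curve is usually much less clear-cut there because the features are sparse frequencies.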

In [200]:
# add clustering labels

neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
bangalore_merged = neighborhoods_bangalore


bangalore_merged = bangalore_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
bangalore_merged.head() # check the last columns!

Unnamed: 0,Pincode,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
1,560002,"Bangalore City S.O,Bangalore Corporation Build...",12.958625,77.577567,2.0,Indian Restaurant,Historic Site,Women's Store,Diner,Financial or Legal Service,...,Clothing Store,Chinese Restaurant,Ice Cream Shop,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,Athletics & Sports
2,560003,"Malleswaram S.O,Palace Guttahalli S.O,Swimming...",13.00024,77.565249,1.0,Pharmacy,Department Store,Snack Place,Bakery,Breakfast Spot,...,Football Stadium,Food Court,Food & Drink Shop,Food,Chocolate Shop,Flea Market,Casino,Chinese Restaurant,Arts & Entertainment,Badminton Court
3,560004,"Basavanagudi H.O,Mavalli S.O,Pampamahakavi Roa...",12.944829,77.567095,2.0,Indian Restaurant,Fast Food Restaurant,Pizza Place,Art Gallery,Farmers Market,...,Food Truck,Food Court,Cocktail Bar,Food & Drink Shop,Chinese Restaurant,Clothing Store,Chocolate Shop,Beer Bar,Bed & Breakfast,Bar
4,560005,Fraser Town S.O,13.428852,77.627203,,,,,,,...,,,,,,,,,,
5,560006,"J.C.Nagar S.O,Training Command IAF S.O",13.006087,77.593151,1.0,Boat or Ferry,Bowling Alley,Café,Auto Garage,Women's Store,...,Food Truck,Food Court,Cocktail Bar,Chinese Restaurant,Chocolate Shop,Arts & Entertainment,Bakery,Badminton Court,BBQ Joint,Auto Workshop


In [201]:
bangalore_merged = bangalore_merged.dropna()
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(bangalore_merged['Latitude'], bangalore_merged['Longitude'], bangalore_merged['Neighborhood'], bangalore_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
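
Before going through the clusters one by one, it can help to see how many neighborhoods fall in each cluster and which category most often ranks first. The following is a minimal sketch on a toy stand-in for `bangalore_merged`; the rows are made up, and only the two column names are taken from the notebook.

```python
import pandas as pd

# Toy stand-in for bangalore_merged with the two columns used here
toy_merged = pd.DataFrame({
    "Cluster Labels": [2.0, 1.0, 2.0, 1.0, 0.0, 2.0],
    "1st Most Common Venue": ["Indian Restaurant", "Pharmacy", "Indian Restaurant",
                              "Casino", "Bus Station", "Fast Food Restaurant"],
})

summary = toy_merged.groupby("Cluster Labels").agg(
    neighborhoods=("1st Most Common Venue", "size"),
    # mode() returns the most frequent values in sorted order; ties take the first
    top_first_venue=("1st Most Common Venue", lambda s: s.mode().iloc[0]),
)
print(summary)
```

A very uneven split, with one cluster holding most neighborhoods, would suggest that the clustering separates a few outliers rather than distinct neighborhood types.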

Let's have a look at the 5 clusters formed by k-means

In [202]:
bangalore_merged.loc[bangalore_merged['Cluster Labels'] == 0, bangalore_merged.columns[[1] + list(range(5, bangalore_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
29,"Kanakanagar S.O,P&T Col. Kavalbyrasandra S.O,R...",Fast Food Restaurant,Resort,Supermarket,Park,Bus Station,Women's Store,Diner,Farmers Market,Event Space,...,Food Court,Food & Drink Shop,Clothing Store,Chaat Place,Chinese Restaurant,Hotel Bar,Bakery,Badminton Court,BBQ Joint,Auto Workshop
43,"Austin Town S.O,Viveknagar S.O (Bangalore)",Indie Movie Theater,Historic Site,Football Stadium,Bus Station,Women's Store,Diner,Fast Food Restaurant,Farmers Market,Event Space,...,Food & Drink Shop,Clothing Store,Chinese Restaurant,Ice Cream Shop,Chaat Place,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage
55,Chudenapura B.O,Bus Station,Restaurant,Women's Store,Diner,Financial or Legal Service,Fast Food Restaurant,Farmers Market,Event Space,Electronics Store,...,Food Court,Clothing Store,Chaat Place,Chinese Restaurant,Ice Cream Shop,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage
71,"Dr. Shivarama Karanth Nagar S.O,Kothanur S.O",ATM,Bookstore,Bus Station,Wine Shop,Airport Service,Eastern European Restaurant,Flea Market,Fish & Chips Shop,Financial or Legal Service,...,Fried Chicken Joint,Football Stadium,Coffee Shop,Clothing Store,American Restaurant,Auto Garage,Beer Bar,Bed & Breakfast,Bar,Bakery
75,Bolare B.O,Boarding House,Indie Movie Theater,Restaurant,Bus Station,Tennis Court,Women's Store,Diner,Fast Food Restaurant,Farmers Market,...,Food Truck,Food Court,Food & Drink Shop,Clothing Store,Chaat Place,Chinese Restaurant,Arts & Entertainment,Badminton Court,BBQ Joint,Auto Workshop


In [203]:
bangalore_merged.loc[bangalore_merged['Cluster Labels'] == 1, bangalore_merged.columns[[1] + list(range(5, bangalore_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
2,"Malleswaram S.O,Palace Guttahalli S.O,Swimming...",Pharmacy,Department Store,Snack Place,Bakery,Breakfast Spot,Food Truck,Light Rail Station,Juice Bar,Train Station,...,Football Stadium,Food Court,Food & Drink Shop,Food,Chocolate Shop,Flea Market,Casino,Chinese Restaurant,Arts & Entertainment,Badminton Court
5,"J.C.Nagar S.O,Training Command IAF S.O",Boat or Ferry,Bowling Alley,Café,Auto Garage,Women's Store,Eastern European Restaurant,Financial or Legal Service,Fast Food Restaurant,Farmers Market,...,Food Truck,Food Court,Cocktail Bar,Chinese Restaurant,Chocolate Shop,Arts & Entertainment,Bakery,Badminton Court,BBQ Joint,Auto Workshop
6,Agram S.O,Casino,Pizza Place,Juice Bar,Breakfast Spot,Women's Store,Diner,Fast Food Restaurant,Farmers Market,Event Space,...,Food & Drink Shop,Cocktail Bar,Chinese Restaurant,Chocolate Shop,Hotel Bar,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage
7,"H.A.L II Stage H.O,Hulsur Bazaar S.O",Indian Restaurant,Pub,Italian Restaurant,Mexican Restaurant,Restaurant,Vietnamese Restaurant,Hotel,Café,Tea Room,...,Event Space,Farmers Market,Arts & Entertainment,Arts & Crafts Store,Financial or Legal Service,Gourmet Shop,Fish & Chips Shop,Flea Market,Art Gallery,American Restaurant
11,Science Institute S.O,Bookstore,Women's Store,Flea Market,Financial or Legal Service,Fast Food Restaurant,Farmers Market,Event Space,Electronics Store,Eastern European Restaurant,...,Clothing Store,Chinese Restaurant,Ice Cream Shop,Chaat Place,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,Athletics & Sports
14,Jalahalli West S.O,Fast Food Restaurant,BBQ Joint,Women's Store,Donut Shop,Fish & Chips Shop,Financial or Legal Service,Farmers Market,Event Space,Electronics Store,...,Clothing Store,Chinese Restaurant,Ice Cream Shop,Chaat Place,Bar,Bakery,Badminton Court,Auto Workshop,Auto Garage,Athletics & Sports
15,"Doorvaninagar S.O,Krishnarajapuram R S S.O,Ram...",Stadium,Platform,Dessert Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market,Event Space,Electronics Store,Eastern European Restaurant,...,Food & Drink Shop,Chocolate Shop,Chaat Place,Hotel Bar,Casino,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage
18,"Gaviopuram Extension S.O,Narasimharaja Colony S.O",Pizza Place,Fast Food Restaurant,Coffee Shop,Art Gallery,Sandwich Place,Juice Bar,Theater,Athletics & Sports,Women's Store,...,Food Truck,Food Court,Food & Drink Shop,Clothing Store,Chaat Place,Chinese Restaurant,Asian Restaurant,Bar,Bakery,Badminton Court
19,Seshadripuram S.O,Indian Restaurant,Clothing Store,Hotel,Fast Food Restaurant,Juice Bar,Chinese Restaurant,Donut Shop,Coffee Shop,Arcade,...,Grocery Store,Food Truck,Food Court,Food,Convenience Store,Fish & Chips Shop,Financial or Legal Service,Flea Market,Women's Store,Historic Site
22,Magadi Road S.O,Tennis Stadium,Karnataka Restaurant,Department Store,Flea Market,Women's Store,Financial or Legal Service,Fast Food Restaurant,Farmers Market,Event Space,...,Food Court,Chocolate Shop,Chaat Place,Hotel Bar,Casino,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage


In [204]:
bangalore_merged.loc[bangalore_merged['Cluster Labels'] == 2, bangalore_merged.columns[[1] + list(range(5, bangalore_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
1,"Bangalore City S.O,Bangalore Corporation Build...",Indian Restaurant,Historic Site,Women's Store,Diner,Financial or Legal Service,Fast Food Restaurant,Farmers Market,Event Space,Electronics Store,...,Clothing Store,Chinese Restaurant,Ice Cream Shop,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,Athletics & Sports
3,"Basavanagudi H.O,Mavalli S.O,Pampamahakavi Roa...",Indian Restaurant,Fast Food Restaurant,Pizza Place,Art Gallery,Farmers Market,Sandwich Place,Athletics & Sports,Asian Restaurant,Juice Bar,...,Food Truck,Food Court,Cocktail Bar,Food & Drink Shop,Chinese Restaurant,Clothing Store,Chocolate Shop,Beer Bar,Bed & Breakfast,Bar
8,"Bangalore Dist Offices Bldg S.O,K. G. Road S.O",Indian Restaurant,Hotel,Bed & Breakfast,Dessert Shop,Seafood Restaurant,Shopping Mall,Bookstore,Flea Market,Grocery Store,...,Food,Chocolate Shop,Fish & Chips Shop,Clothing Store,Women's Store,Chinese Restaurant,Arts & Entertainment,Badminton Court,BBQ Joint,Auto Workshop
9,"Industrial Estate S.O (Bangalore),Rajajinagar ...",Indian Restaurant,Bakery,Pharmacy,Café,Snack Place,Women's Store,Donut Shop,Fast Food Restaurant,Farmers Market,...,Food Court,Food & Drink Shop,Cocktail Bar,Chinese Restaurant,Chocolate Shop,Hotel Bar,Bar,Badminton Court,BBQ Joint,Auto Workshop
12,Jalahalli H.O,Indian Restaurant,Fast Food Restaurant,Shopping Mall,Plaza,Vegetarian / Vegan Restaurant,Dessert Shop,Farmers Market,Event Space,Electronics Store,...,Food Court,Food & Drink Shop,Chinese Restaurant,Chaat Place,Hotel,Casino,Bakery,Badminton Court,BBQ Joint,Auto Workshop
16,"NAL S.O,Vimanapura S.O",Indian Restaurant,Restaurant,Food Truck,Café,Korean Restaurant,Women's Store,Diner,Fast Food Restaurant,Farmers Market,...,Food Court,Food & Drink Shop,Clothing Store,Chinese Restaurant,Hotel Bar,Chaat Place,Bakery,Badminton Court,BBQ Joint,Auto Workshop
17,Chamrajpet S.O (Bangalore),General Entertainment,Indian Restaurant,Fast Food Restaurant,Park,Dessert Shop,Farmers Market,Event Space,Electronics Store,Eastern European Restaurant,...,Clothing Store,Chinese Restaurant,Hotel Bar,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,Athletics & Sports
20,"Gayathrinagar S.O,Srirampuram S.O",Indian Restaurant,Bakery,Fast Food Restaurant,Café,Women's Store,Donut Shop,Fish & Chips Shop,Financial or Legal Service,Farmers Market,...,Food Truck,Cocktail Bar,Chinese Restaurant,Chocolate Shop,Ice Cream Shop,Bar,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage
21,"Yeshwanthpur Bazar S.O,Yeswanthpura S.O",Fast Food Restaurant,Miscellaneous Shop,Shopping Mall,Indian Restaurant,Multiplex,Farmers Market,Event Space,Electronics Store,Eastern European Restaurant,...,Food,Clothing Store,Chinese Restaurant,Hotel,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage
23,"Anandnagar S.O (Bangalore),H.A. Farm S.O,Hebba...",Indian Restaurant,Department Store,Coffee Shop,Pizza Place,Park,Market,Diner,Fast Food Restaurant,Farmers Market,...,Food & Drink Shop,Clothing Store,Chocolate Shop,Chinese Restaurant,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage


In [205]:
bangalore_merged.loc[bangalore_merged['Cluster Labels'] == 3, bangalore_merged.columns[[1] + list(range(5, bangalore_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
13,Jalahalli East S.O,ATM,Flea Market,Financial or Legal Service,Fast Food Restaurant,Farmers Market,Event Space,Electronics Store,Eastern European Restaurant,Donut Shop,...,Chocolate Shop,Chinese Restaurant,Chaat Place,Casino,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,Athletics & Sports
82,"Bapagrama B.O,Herohalli B.O,Herohalli S.O,Visw...",ATM,Astrologer,Donut Shop,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market,Event Space,Electronics Store,...,Clothing Store,Chinese Restaurant,Ice Cream Shop,Chaat Place,Bar,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage


In [206]:
bangalore_merged.loc[bangalore_merged['Cluster Labels'] == 4, bangalore_merged.columns[[1] + list(range(5, bangalore_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
90,"Bommasandra Industrial Estate S.O,Chandapura B...",Bakery,Women's Store,Donut Shop,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market,Event Space,Electronics Store,...,Clothing Store,Chinese Restaurant,Ice Cream Shop,Chaat Place,Bar,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,Athletics & Sports


Our Bangalore neighborhood model is now ready. Next, we take a demo user input, apply the same transformations to it, and predict which cluster it belongs to. The pincodes/neighborhoods in that cluster are then suggested as similar neighborhoods in Bangalore.

## Suggesting Neighborhood

In [286]:
# current_place='Chandni Chowk'
# current_pincode=110006
# current_city='Delhi'

current_place='Sarojini Nagar'
current_pincode=110023
current_city='Delhi'



### Loading test data in the proper format

In [287]:
def create_test_data(current_place,current_pincode,current_city):
    # note: current_city is accepted but not used in this function
    predict_neighborhood=pd.DataFrame([[current_place, current_pincode]],
                                      columns=["Neighborhood","Pincode"])
    predict_neighborhood['Latitude']=0
    predict_neighborhood['Longitude']=0
    # fill in the coordinates with the lat_lng helper
    predict_neighborhood=lat_lng(predict_neighborhood)
    print(predict_neighborhood.head())
    return predict_neighborhood


In [288]:
predict_neighborhood=create_test_data(current_place,current_pincode,current_city)

     Neighborhood  Pincode   Latitude  Longitude
0  Sarojini Nagar   110023  28.574157   77.19537


### Getting nearby venues of test data

In [289]:
predict_venues = getNearbyVenues(names=predict_neighborhood['Neighborhood'],
                                   latitudes=predict_neighborhood['Latitude'],
                                   longitudes=predict_neighborhood['Longitude']
                                  )

print(predict_venues.shape)
predict_venues.head()

(12, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Sarojini Nagar,28.574157,77.19537,Domino's Pizza,28.576,77.195,Pizza Place
1,Sarojini Nagar,28.574157,77.19537,Sarojini Nagar Market,28.577802,77.196347,Market
2,Sarojini Nagar,28.574157,77.19537,McDonald's,28.576515,77.1961,Fast Food Restaurant
3,Sarojini Nagar,28.574157,77.19537,Rdbd Rang de Basanti Dhaba,28.576552,77.195252,Indian Restaurant
4,Sarojini Nagar,28.574157,77.19537,Haldiram's,28.576374,77.195266,Indian Restaurant


### Transforming the data using one hot encoding

In [290]:
#one hot encoding
predict_onehot = pd.get_dummies(predict_venues[['Venue Category']], prefix="", prefix_sep="")
# add neighborhood column back to dataframe
predict_onehot['Neighborhood'] = predict_venues['Neighborhood'] 
# move neighborhood column to the first column
fixed_columns = [predict_onehot.columns[-1]] + list(predict_onehot.columns[:-1])
predict_onehot = predict_onehot[fixed_columns]
predict_onehot.head()

Unnamed: 0,Neighborhood,Department Store,Dessert Shop,Fast Food Restaurant,Indian Restaurant,Market,Pizza Place,Shopping Mall,Women's Store
0,Sarojini Nagar,0,0,0,0,0,1,0,0
1,Sarojini Nagar,0,0,0,0,1,0,0,0
2,Sarojini Nagar,0,0,1,0,0,0,0,0
3,Sarojini Nagar,0,0,0,1,0,0,0,0
4,Sarojini Nagar,0,0,0,1,0,0,0,0


#### Ensuring labels are same for test & model data

In [291]:
columns=bangalore_onehot.columns
def add_missing_dummy_columns( d, columns ):
    missing_cols = set( columns ) - set( d.columns )
    for c in missing_cols:
        d[c] = 0
        
def fix_columns( d, columns ):  
    add_missing_dummy_columns( d, columns )
    # make sure we have all the columns we need
    assert( set( columns ) - set( d.columns ) == set())
    extra_cols = set( d.columns ) - set( columns )
    if extra_cols:
        print ("extra columns:", extra_cols)
    d = d[ columns ]
    return d

In [292]:

predict_onehot=fix_columns(predict_onehot,columns)
predict_onehot.head()


Unnamed: 0,Neighborhood,ATM,Accessories Store,Afghan Restaurant,Airport Service,American Restaurant,Andhra Restaurant,Arcade,Art Gallery,Arts & Crafts Store,...,Toy / Game Store,Track,Trail,Train Station,Udupi Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Waterfront,Wine Shop,Women's Store
0,Sarojini Nagar,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Sarojini Nagar,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Sarojini Nagar,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Sarojini Nagar,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Sarojini Nagar,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
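
The column alignment above can also be written with pandas' built-in `DataFrame.reindex`. The following is only a minimal sketch on made-up data: `train_columns` and `new_onehot` are hypothetical stand-ins for `bangalore_onehot.columns` and `predict_onehot`, and `Tea Room` is an invented category used to show what happens to a column the model never saw.

```python
import pandas as pd

# Hypothetical training columns (stand-in for bangalore_onehot.columns).
train_columns = ["Neighborhood", "ATM", "Bakery", "Pizza Place"]

# Hypothetical one-hot frame for the new location (stand-in for predict_onehot).
new_onehot = pd.DataFrame({
    "Neighborhood": ["Sarojini Nagar", "Sarojini Nagar"],
    "Pizza Place": [1, 0],
    "Tea Room": [0, 1],  # category that is absent from the training columns
})

# reindex keeps the training column order, fills missing columns with 0
# and drops columns that are not in train_columns.
aligned = new_onehot.reindex(columns=train_columns, fill_value=0)
print(aligned)
```

Unlike `fix_columns`, this variant drops unknown categories silently, so it may be worth logging them separately if that information matters.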


### Predicting the cluster label using kmeans model

In [293]:
predict_grouped = predict_onehot.groupby('Neighborhood').mean().reset_index()
predict_grouped = predict_grouped.drop(columns='Neighborhood')  # keyword form; a positional axis is rejected by recent pandas versions
cluster_group=kmeans.predict(predict_grouped)
cluster_group
print(cluster_group[0])

2
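
To make the prediction step more concrete, here is a small pure-Python sketch of what happens conceptually: the venues of a location are turned into a vector of category frequencies (the effect of one-hot encoding followed by `groupby(...).mean()`), and the location is assigned to the cluster whose centroid is closest in Euclidean distance, which is how a fitted k-means model labels new points. The category list, the centroids and the venue list below are invented for illustration and are not taken from the trained model.

```python
from collections import Counter
import math

def frequency_vector(venue_categories, all_categories):
    """Share of venues in each category, in the fixed order of all_categories."""
    counts = Counter(venue_categories)
    total = len(venue_categories)
    return [counts[c] / total for c in all_categories]

def nearest_centroid(vector, centroids):
    """Index of the centroid with the smallest Euclidean distance to vector."""
    distances = [math.dist(vector, c) for c in centroids]
    return distances.index(min(distances))

# Hypothetical three-category feature space and two made-up centroids.
all_categories = ["Bakery", "Indian Restaurant", "Pizza Place"]
centroids = [
    [0.8, 0.1, 0.1],  # cluster 0: bakery-heavy
    [0.1, 0.6, 0.3],  # cluster 1: restaurant-heavy
]

# Hypothetical venues found near the user's location.
user_venues = ["Indian Restaurant", "Indian Restaurant", "Pizza Place", "Bakery"]
user_vector = frequency_vector(user_venues, all_categories)
label = nearest_centroid(user_vector, centroids)
print(user_vector, label)
```

In this toy case the user's vector is `[0.25, 0.5, 0.25]`, which lies closer to the restaurant-heavy centroid, so the label is 1.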


### Plotting the clusters on map

In [294]:
destination_location=bangalore_merged.loc[bangalore_merged['Cluster Labels'] ==cluster_group[0]]
destination_location

Unnamed: 0,Pincode,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
1,560002,"Bangalore City S.O,Bangalore Corporation Build...",12.958625,77.577567,2.0,Indian Restaurant,Historic Site,Women's Store,Diner,Financial or Legal Service,...,Clothing Store,Chinese Restaurant,Ice Cream Shop,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,Athletics & Sports
3,560004,"Basavanagudi H.O,Mavalli S.O,Pampamahakavi Roa...",12.944829,77.567095,2.0,Indian Restaurant,Fast Food Restaurant,Pizza Place,Art Gallery,Farmers Market,...,Food Truck,Food Court,Cocktail Bar,Food & Drink Shop,Chinese Restaurant,Clothing Store,Chocolate Shop,Beer Bar,Bed & Breakfast,Bar
8,560009,"Bangalore Dist Offices Bldg S.O,K. G. Road S.O",12.979649,77.577441,2.0,Indian Restaurant,Hotel,Bed & Breakfast,Dessert Shop,Seafood Restaurant,...,Food,Chocolate Shop,Fish & Chips Shop,Clothing Store,Women's Store,Chinese Restaurant,Arts & Entertainment,Badminton Court,BBQ Joint,Auto Workshop
9,560010,"Industrial Estate S.O (Bangalore),Rajajinagar ...",12.991038,77.549685,2.0,Indian Restaurant,Bakery,Pharmacy,Café,Snack Place,...,Food Court,Food & Drink Shop,Cocktail Bar,Chinese Restaurant,Chocolate Shop,Hotel Bar,Bar,Badminton Court,BBQ Joint,Auto Workshop
12,560013,Jalahalli H.O,13.052544,77.549532,2.0,Indian Restaurant,Fast Food Restaurant,Shopping Mall,Plaza,Vegetarian / Vegan Restaurant,...,Food Court,Food & Drink Shop,Chinese Restaurant,Chaat Place,Hotel,Casino,Bakery,Badminton Court,BBQ Joint,Auto Workshop
16,560017,"NAL S.O,Vimanapura S.O",12.953824,77.656158,2.0,Indian Restaurant,Restaurant,Food Truck,Café,Korean Restaurant,...,Food Court,Food & Drink Shop,Clothing Store,Chinese Restaurant,Hotel Bar,Chaat Place,Bakery,Badminton Court,BBQ Joint,Auto Workshop
17,560018,Chamrajpet S.O (Bangalore),12.960327,77.57043,2.0,General Entertainment,Indian Restaurant,Fast Food Restaurant,Park,Dessert Shop,...,Clothing Store,Chinese Restaurant,Hotel Bar,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,Athletics & Sports
20,560021,"Gayathrinagar S.O,Srirampuram S.O",12.99384,77.555621,2.0,Indian Restaurant,Bakery,Fast Food Restaurant,Café,Women's Store,...,Food Truck,Cocktail Bar,Chinese Restaurant,Chocolate Shop,Ice Cream Shop,Bar,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage
21,560022,"Yeshwanthpur Bazar S.O,Yeswanthpura S.O",13.024783,77.545966,2.0,Fast Food Restaurant,Miscellaneous Shop,Shopping Mall,Indian Restaurant,Multiplex,...,Food,Clothing Store,Chinese Restaurant,Hotel,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage
23,560024,"Anandnagar S.O (Bangalore),H.A. Farm S.O,Hebba...",13.036578,77.59634,2.0,Indian Restaurant,Department Store,Coffee Shop,Pizza Place,Park,...,Food & Drink Shop,Clothing Store,Chocolate Shop,Chinese Restaurant,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage


In [295]:
import matplotlib.cm as cm
import matplotlib.colors as colors
import folium 

# create map of Bangalore using latitude and longitude values
map_to = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(destination_location['Latitude'], destination_location['Longitude'], destination_location['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_to)  
    
map_to

According to our model, if the user stays at Sarojini Nagar, Delhi, the similar neighborhoods belong to cluster 2. The user can therefore choose any of the places above, which are also shown on the map. The areas where the person can stay are:

In [296]:
result=bangalore_merged.loc[bangalore_merged['Cluster Labels'] == cluster_group[0], bangalore_merged.columns[[1] + list(range(5, bangalore_merged.shape[1]))]]
result

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
1,"Bangalore City S.O,Bangalore Corporation Build...",Indian Restaurant,Historic Site,Women's Store,Diner,Financial or Legal Service,Fast Food Restaurant,Farmers Market,Event Space,Electronics Store,...,Clothing Store,Chinese Restaurant,Ice Cream Shop,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,Athletics & Sports
3,"Basavanagudi H.O,Mavalli S.O,Pampamahakavi Roa...",Indian Restaurant,Fast Food Restaurant,Pizza Place,Art Gallery,Farmers Market,Sandwich Place,Athletics & Sports,Asian Restaurant,Juice Bar,...,Food Truck,Food Court,Cocktail Bar,Food & Drink Shop,Chinese Restaurant,Clothing Store,Chocolate Shop,Beer Bar,Bed & Breakfast,Bar
8,"Bangalore Dist Offices Bldg S.O,K. G. Road S.O",Indian Restaurant,Hotel,Bed & Breakfast,Dessert Shop,Seafood Restaurant,Shopping Mall,Bookstore,Flea Market,Grocery Store,...,Food,Chocolate Shop,Fish & Chips Shop,Clothing Store,Women's Store,Chinese Restaurant,Arts & Entertainment,Badminton Court,BBQ Joint,Auto Workshop
9,"Industrial Estate S.O (Bangalore),Rajajinagar ...",Indian Restaurant,Bakery,Pharmacy,Café,Snack Place,Women's Store,Donut Shop,Fast Food Restaurant,Farmers Market,...,Food Court,Food & Drink Shop,Cocktail Bar,Chinese Restaurant,Chocolate Shop,Hotel Bar,Bar,Badminton Court,BBQ Joint,Auto Workshop
12,Jalahalli H.O,Indian Restaurant,Fast Food Restaurant,Shopping Mall,Plaza,Vegetarian / Vegan Restaurant,Dessert Shop,Farmers Market,Event Space,Electronics Store,...,Food Court,Food & Drink Shop,Chinese Restaurant,Chaat Place,Hotel,Casino,Bakery,Badminton Court,BBQ Joint,Auto Workshop
16,"NAL S.O,Vimanapura S.O",Indian Restaurant,Restaurant,Food Truck,Café,Korean Restaurant,Women's Store,Diner,Fast Food Restaurant,Farmers Market,...,Food Court,Food & Drink Shop,Clothing Store,Chinese Restaurant,Hotel Bar,Chaat Place,Bakery,Badminton Court,BBQ Joint,Auto Workshop
17,Chamrajpet S.O (Bangalore),General Entertainment,Indian Restaurant,Fast Food Restaurant,Park,Dessert Shop,Farmers Market,Event Space,Electronics Store,Eastern European Restaurant,...,Clothing Store,Chinese Restaurant,Hotel Bar,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,Athletics & Sports
20,"Gayathrinagar S.O,Srirampuram S.O",Indian Restaurant,Bakery,Fast Food Restaurant,Café,Women's Store,Donut Shop,Fish & Chips Shop,Financial or Legal Service,Farmers Market,...,Food Truck,Cocktail Bar,Chinese Restaurant,Chocolate Shop,Ice Cream Shop,Bar,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage
21,"Yeshwanthpur Bazar S.O,Yeswanthpura S.O",Fast Food Restaurant,Miscellaneous Shop,Shopping Mall,Indian Restaurant,Multiplex,Farmers Market,Event Space,Electronics Store,Eastern European Restaurant,...,Food,Clothing Store,Chinese Restaurant,Hotel,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage
23,"Anandnagar S.O (Bangalore),H.A. Farm S.O,Hebba...",Indian Restaurant,Department Store,Coffee Shop,Pizza Place,Park,Market,Diner,Fast Food Restaurant,Farmers Market,...,Food & Drink Shop,Clothing Store,Chocolate Shop,Chinese Restaurant,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage


Just for comparison, the venues near the user's location are:

In [297]:
predict_venues

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Sarojini Nagar,28.574157,77.19537,Domino's Pizza,28.576,77.195,Pizza Place
1,Sarojini Nagar,28.574157,77.19537,Sarojini Nagar Market,28.577802,77.196347,Market
2,Sarojini Nagar,28.574157,77.19537,McDonald's,28.576515,77.1961,Fast Food Restaurant
3,Sarojini Nagar,28.574157,77.19537,Rdbd Rang de Basanti Dhaba,28.576552,77.195252,Indian Restaurant
4,Sarojini Nagar,28.574157,77.19537,Haldiram's,28.576374,77.195266,Indian Restaurant
5,Sarojini Nagar,28.574157,77.19537,McDonald's,28.57596,77.195201,Fast Food Restaurant
6,Sarojini Nagar,28.574157,77.19537,South Square Mall,28.576386,77.195325,Shopping Mall
7,Sarojini Nagar,28.574157,77.19537,Babu Market,28.577409,77.195647,Women's Store
8,Sarojini Nagar,28.574157,77.19537,South Square Multilevel Parking,28.577434,77.195623,Department Store
9,Sarojini Nagar,28.574157,77.19537,Khushi Sarees,28.577427,77.195627,Women's Store


In [298]:
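venue_category=predict_venues['Venue Category'].tolist()
venue_count=list()

# first pass: share of each area's 10 most common venue categories
# that also occur near the user's location
for i in range(len(result)):
    score=0
    for j in range(1,11):   # columns 1-10 hold the 1st-10th most common venues
        if result.iloc[i,j] in venue_category:
            score=score+1
    score=(score/10)
    venue_count.append(score)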
   
    

In [299]:
venue_category=predict_venues['Venue Category'].tolist()
venue_count=list()
size = len(venue_category)

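# similarity score: share of the user's venues whose category appears
# anywhere among the area's most common venues
for i in range(len(result)):
    score=0
    for venue in venue_category:
        for j in range(result.shape[1]):
            if result.iloc[i,j] == venue :
                score=score+1
                break
    score=(score/size)
    venue_count.append(score)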

In [300]:
print(venue_category)

['Pizza Place', 'Market', 'Fast Food Restaurant', 'Indian Restaurant', 'Indian Restaurant', 'Fast Food Restaurant', 'Shopping Mall', "Women's Store", 'Department Store', "Women's Store", 'Dessert Shop', 'Fast Food Restaurant']
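
The similarity score computed above is the fraction of the user's venues (duplicates included) whose category also appears in an area's list of most common venues. A compact way to express the same idea is sketched below with a set lookup instead of the nested loops. The function name `similarity_score` and the sample lists are hypothetical and only serve as an illustration.

```python
def similarity_score(user_categories, neighborhood_categories):
    """Fraction of the user's venues whose category also appears in the
    neighborhood's list of common venue categories."""
    if not user_categories:
        return 0.0
    known = set(neighborhood_categories)
    matches = sum(1 for category in user_categories if category in known)
    return matches / len(user_categories)

# Hypothetical inputs: four venues near the user, three categories in the area.
user = ["Pizza Place", "Market", "Indian Restaurant", "Indian Restaurant"]
area = ["Indian Restaurant", "Bakery", "Pizza Place"]

score = similarity_score(user, area)
print(score)  # 0.75: three of the four user venues have a matching category
```

Because every neighborhood row lists its top 50 categories, including ones that occur rarely or not at all, a score built this way tends to be high for most rows and is best read as a rough ranking aid.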


In [301]:
result['accuracy']=venue_count
result

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue,accuracy
1,"Bangalore City S.O,Bangalore Corporation Build...",Indian Restaurant,Historic Site,Women's Store,Diner,Financial or Legal Service,Fast Food Restaurant,Farmers Market,Event Space,Electronics Store,...,Chinese Restaurant,Ice Cream Shop,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,Athletics & Sports,0.75
3,"Basavanagudi H.O,Mavalli S.O,Pampamahakavi Roa...",Indian Restaurant,Fast Food Restaurant,Pizza Place,Art Gallery,Farmers Market,Sandwich Place,Athletics & Sports,Asian Restaurant,Juice Bar,...,Food Court,Cocktail Bar,Food & Drink Shop,Chinese Restaurant,Clothing Store,Chocolate Shop,Beer Bar,Bed & Breakfast,Bar,0.833333
8,"Bangalore Dist Offices Bldg S.O,K. G. Road S.O",Indian Restaurant,Hotel,Bed & Breakfast,Dessert Shop,Seafood Restaurant,Shopping Mall,Bookstore,Flea Market,Grocery Store,...,Chocolate Shop,Fish & Chips Shop,Clothing Store,Women's Store,Chinese Restaurant,Arts & Entertainment,Badminton Court,BBQ Joint,Auto Workshop,0.833333
9,"Industrial Estate S.O (Bangalore),Rajajinagar ...",Indian Restaurant,Bakery,Pharmacy,Café,Snack Place,Women's Store,Donut Shop,Fast Food Restaurant,Farmers Market,...,Food & Drink Shop,Cocktail Bar,Chinese Restaurant,Chocolate Shop,Hotel Bar,Bar,Badminton Court,BBQ Joint,Auto Workshop,0.75
12,Jalahalli H.O,Indian Restaurant,Fast Food Restaurant,Shopping Mall,Plaza,Vegetarian / Vegan Restaurant,Dessert Shop,Farmers Market,Event Space,Electronics Store,...,Food & Drink Shop,Chinese Restaurant,Chaat Place,Hotel,Casino,Bakery,Badminton Court,BBQ Joint,Auto Workshop,0.833333
16,"NAL S.O,Vimanapura S.O",Indian Restaurant,Restaurant,Food Truck,Café,Korean Restaurant,Women's Store,Diner,Fast Food Restaurant,Farmers Market,...,Food & Drink Shop,Clothing Store,Chinese Restaurant,Hotel Bar,Chaat Place,Bakery,Badminton Court,BBQ Joint,Auto Workshop,0.75
17,Chamrajpet S.O (Bangalore),General Entertainment,Indian Restaurant,Fast Food Restaurant,Park,Dessert Shop,Farmers Market,Event Space,Electronics Store,Eastern European Restaurant,...,Chinese Restaurant,Hotel Bar,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,Athletics & Sports,0.75
20,"Gayathrinagar S.O,Srirampuram S.O",Indian Restaurant,Bakery,Fast Food Restaurant,Café,Women's Store,Donut Shop,Fish & Chips Shop,Financial or Legal Service,Farmers Market,...,Cocktail Bar,Chinese Restaurant,Chocolate Shop,Ice Cream Shop,Bar,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,0.75
21,"Yeshwanthpur Bazar S.O,Yeswanthpura S.O",Fast Food Restaurant,Miscellaneous Shop,Shopping Mall,Indian Restaurant,Multiplex,Farmers Market,Event Space,Electronics Store,Eastern European Restaurant,...,Clothing Store,Chinese Restaurant,Hotel,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,0.833333
23,"Anandnagar S.O (Bangalore),H.A. Farm S.O,Hebba...",Indian Restaurant,Department Store,Coffee Shop,Pizza Place,Park,Market,Diner,Fast Food Restaurant,Farmers Market,...,Clothing Store,Chocolate Shop,Chinese Restaurant,Bed & Breakfast,Bakery,Badminton Court,BBQ Joint,Auto Workshop,Auto Garage,0.916667


In [305]:
result.shape

(40, 52)

In [302]:
acc=result['accuracy'].max()
print(acc)
highlight=result[result['accuracy']>=acc]
highlight.reset_index(inplace=True) # Resets the index, makes factor a column
highlight.iloc[0,1]
ht=fin_bangalore_data[fin_bangalore_data['PostOffice']==highlight.iloc[0,1]]
print(ht)
lt=ht.iloc[0,2]
ln=ht.iloc[0,3]

0.9166666666666666
    Pincode                                         PostOffice   Latitude  \
23   560024  Anandnagar S.O (Bangalore),H.A. Farm S.O,Hebba...  13.036578   

    Longitude  
23   77.59634  


In [303]:
print(lt)
print(ln)

13.0365775420601
77.59633976975934


In [304]:
import matplotlib.cm as cm
import matplotlib.colors as colors
import folium 

# create map of Bangalore using latitude and longitude values
map_to = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(destination_location['Latitude'], destination_location['Longitude'], destination_location['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_to)  
# mark the best-matching area once, after the loop
folium.Marker( [lt,ln] ).add_to( map_to )
   

map_to

Great! We have found the most similar neighborhood: Anandnagar S.O (Bangalore), H.A. Farm S.O, Hebba... (pincode 560024), with a similarity score of about 92% (0.917).

## Results and Discussion <a name="results"></a>

In our analysis we used a demo input: a person living near Sarojini Nagar, Delhi (pincode 110023) who is willing to relocate to Bangalore. We retrieved all venues within a 500 m radius of Sarojini Nagar using the Foursquare API. The same procedure works with any user-provided input.

We took all the pincodes of Bangalore as data points and found the nearby 50 venues within 500 m of each pincode.
These candidate locations were then clustered into 5 groups to create zones of interest that contain similar kinds of neighborhood. Every pincode is labelled with its cluster group. For our demo case, we found the zone (cluster group 2) whose neighborhoods are similar to Sarojini Nagar, Delhi. After identifying the cluster, we computed a similarity score (stored in the `accuracy` column) for each area (pincode) in that cluster, based on the share of venues at the user's location whose category also appears in the predicted area.
On the basis of these similarity scores we highlighted the area with the maximum similarity.

The result of all this is 40 areas predicted to be similar to the user's location, each with its similarity score. We found that Anandnagar S.O (Bangalore), H.A. Farm S.O, Hebbal (560024) is the most similar to Sarojini Nagar, with a similarity of about 92%. This, of course, does not imply that those areas are actually optimal locations for a relocation! The purpose of this analysis was only to provide information on Bangalore areas similar to the user's current neighborhood; other factors can also be included, such as commuting facilities, distance from the office, and noise levels.

## Conclusion <a name="conclusion"></a>

The purpose of this project was to identify the Bangalore areas that are similar to the user's current neighborhood in another city, so that the information can be used for relocation. Using the Foursquare API, we clustered the Bangalore areas on the basis of nearby venues (similar neighborhoods) and then recommended the similar areas to the user along with a similarity score.

The final decision on relocation will be made by users based on the specific characteristics of neighborhoods and locations in every recommended zone, taking into consideration additional factors such as the attractiveness of each location (proximity to a park or water), noise levels / proximity to major roads, commuting facilities, etc.