# Choosing the most convenient Neighbourhood of Bengaluru

## Introduction/Business Problem

One of the major problems people face when moving to a new city is choosing a neighbourhood to live in. A few major factors drive that choice, such as comfort, convenience, and safety.

In this notebook we analyse the neighbourhoods of Bengaluru, Karnataka, India, to find the best neighbourhood in the city based on convenience for the people living there.

## Data Description

For this report we will use neighbourhood data fetched from the Foursquare location APIs. To get the boroughs, pincodes, and neighbourhoods, we will scrape a webpage (https://www.indiatvnews.com/pincode/karnataka/bangalore) and build our own dataset.

In [105]:
# installing dependencies
!pip install beautifulsoup4
!pip install geopy
!pip install folium

# importing dependencies
import requests
from bs4 import BeautifulSoup
import pandas as pd
import folium



Scraping the required data from the webpage

In [106]:
page = requests.get("https://www.indiatvnews.com/pincode/karnataka/bangalore")
soup = BeautifulSoup(page.content, 'html.parser')
table = soup.find('table', class_='alt')
table_rows = table.find_all('tr')

Extracting the scraped table rows into a list

In [107]:
# this list will hold the table data
temp = []

# adding one sub-list per table row
for tr in table_rows:
    td = tr.find_all('td')
    row = [d.text.strip() for d in td]

    # skip empty or single-cell rows and rows whose borough (second column) is "NA"
    if len(row) > 1 and row[1] != "NA":
        temp.append(row)

Converting the list of rows into a dataframe

In [108]:
# creating dataframe out of mentioned array
df = pd.DataFrame(data=temp, columns=['Neighbourhood', 'Borough', 'District', 'State', 'Pincode'])
df = df.drop(['District', 'State'], axis=1)
df = df.iloc[1:]
print(df.shape)
print(df)

(259, 3)
            Neighbourhood          Borough Pincode
1                   Agram  Bangalore South  560007
2      Air Force Hospital  Bangalore North  560007
3            Amruthahalli  Bangalore North  560092
4    Anandnagar Bangalore  Bangalore North  560024
5          Arabic College  Bangalore North  560045
..                    ...              ...     ...
255  Tavarekere Bangalore   Bangaloresouth  562130
256   Thammanayakanahalli           Anekal  562106
257         Vanakanahalli           Anekal  562106
258           Vidyanagara         Bg North  562157
259         Yadavanahalli           Anekal  562107

[259 rows x 3 columns]


Some borough names are misspelled, so we correct them in the next step. We then keep only two boroughs to work on, Bangalore South and Bangalore North; the other boroughs lie on the outskirts of the district.

In [109]:
df['Borough'] = df['Borough'].replace(['Bangalore North', 'Bangalore north', 'Banglorenorth', 'Bg North', 'Bgnorth'], 'Bangalore North')

df['Borough'] = df['Borough'].replace(['Bangalore South', 'Bangaloresouth', 'Bg South', 'Bgsouth', 'Nla & Bgsouth'], 'Bangalore South')

df['Borough'] = df['Borough'].replace(['Bangalore', 'Banglore'], 'Bangalore')

df = df[df['Borough'].isin(["Bangalore South", "Bangalore North"])]

print("Let's drop rows with a duplicate pincode and keep only the first one")

print("Shape of dataframe before removing duplicates")
print(df.shape)

print("Shape of dataframe after removing duplicates")
df = df.drop_duplicates(subset="Pincode")
print(df.shape)

# setting pincode as index
df = df.set_index('Pincode')

print("The dataframe shown below will be used for further research")
print(df.head())

# adding placeholder columns for latitude and longitude
df["Latitude"] = "null"
df["Longitude"] = "null"

# saving this dataframe to csv file
df.to_csv("without_lat_lng_bangalore_neighbourhood.csv", sep='\t', encoding='utf-8')

Let's drop rows with a duplicate pincode and keep only the first one
Shape of dataframe before removing duplicates
(220, 3)
Shape of dataframe after removing duplicates
(100, 3)
The dataframe shown below will be used for further research
                Neighbourhood          Borough
Pincode                                       
560007                  Agram  Bangalore South
560092           Amruthahalli  Bangalore North
560024   Anandnagar Bangalore  Bangalore North
560045         Arabic College  Bangalore North
560064                  Attur  Bangalore North
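
The spelling clean-up above lists every variant by hand. As a hedged alternative, the sketch below normalises a borough label by lower-casing it, stripping everything except letters, and checking whether it ends in "north" or "south". The function name `normalise_borough` is hypothetical and not part of the original notebook; the sample spellings are taken from the replacement lists in the cell above. Labels that match neither rule are returned unchanged.

```python
import re

def normalise_borough(name):
    """Map a raw borough label to a canonical name, or return it unchanged."""
    # keep letters only, so "Bg North" and "Bgnorth" reduce to the same key
    key = re.sub(r'[^a-z]', '', name.lower())
    if key.endswith('north'):
        return 'Bangalore North'
    if key.endswith('south'):
        return 'Bangalore South'
    return name

# a few of the raw spellings seen in the scraped table
for raw in ['Bg North', 'Banglorenorth', 'Nla & Bgsouth', 'Anekal']:
    print(raw, '->', normalise_borough(raw))
```

On the dataframe this could be applied with `df['Borough'] = df['Borough'].map(normalise_borough)`, assuming the column holds plain strings.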


Now we will fetch the latitude and longitude of each neighbourhood in the above dataframe.

In [110]:
import requests

def fetchLatLng(postal_code, Neighbourhood):
    """Geocode one neighbourhood and store its coordinates in the global df."""
    api_key = "add your own api key"
    address = '{}, {}, Bangalore, Karnataka, India'.format(postal_code, Neighbourhood)
    print(address)
    geocode_url = "https://maps.googleapis.com/maps/api/geocode/json"

    # pass the address and key as params so requests URL-encodes them
    results = requests.get(geocode_url, params={"address": address, "key": api_key})
    results = results.json()

    # skip this row if the API returned no match
    if not results.get('results'):
        print('No result for {}'.format(address))
        return

    location = results['results'][0]['geometry']['location']
    latitude = location['lat']
    longitude = location['lng']

    df.loc[postal_code, 'Latitude'] = latitude
    df.loc[postal_code, 'Longitude'] = longitude
    print('Latitude: {} & longitude: {}'.format(latitude, longitude))
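
Geocoding requests can fail transiently or return no match, so a bounded retry is one hedged way to make the lookup more robust. The sketch below is an assumption, not the notebook's method: `geocode_with_retry`, `fetch`, `max_attempts`, and `delay` are hypothetical names, and `fetch` stands in for the HTTP call so the example runs without an API key. The sample coordinates are the ones printed for Agram in this notebook.

```python
import time

def geocode_with_retry(address, fetch, max_attempts=3, delay=1.0):
    """Call fetch(address) until it yields a result or attempts run out.

    fetch is assumed to return the parsed JSON dict of a geocoding
    response, i.e. a dict with a 'results' list.
    """
    for attempt in range(1, max_attempts + 1):
        response = fetch(address)
        results = response.get('results', [])
        if results:
            location = results[0]['geometry']['location']
            return location['lat'], location['lng']
        if attempt < max_attempts:
            time.sleep(delay * attempt)  # wait a little longer each time
    return None  # the caller decides how to handle a missing location


# stand-in for the real HTTP request: empty once, then a match
calls = {'n': 0}

def fake_fetch(address):
    calls['n'] += 1
    if calls['n'] < 2:
        return {'results': []}
    return {'results': [{'geometry': {'location': {'lat': 12.9579166, 'lng': 77.6309117}}}]}

print(geocode_with_retry('560007, Agram, Bangalore, Karnataka, India', fake_fetch, delay=0))
```

Returning `None` after the last attempt, rather than looping without a limit, keeps a single bad address from stalling the whole run or running up API charges.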
    

In [112]:
# uncomment this section to geocode with the Google API
# I have already saved the data in a CSV file, so I will load it from there in the next section

# for index, row in df.iterrows():
#     fetchLatLng(index, row['Neighbourhood'])

# print(df.head())

# saving this dataframe to csv file
# df.to_csv("bangalore_neighbourhood.csv")

560007, Agram, Bangalore, Karnataka, India
Latitude: 12.9579166 & longitude: 77.6309117
560092, Amruthahalli, Bangalore, Karnataka, India
Latitude: 13.0658792 & longitude: 77.6042056
560024, Anandnagar Bangalore, Bangalore, Karnataka, India
Latitude: 13.031328 & longitude: 77.5913132
560045, Arabic College, Bangalore, Karnataka, India
Latitude: 13.0303746 & longitude: 77.6211313
560064, Attur, Bangalore, Karnataka, India
Latitude: 13.1069615 & longitude: 77.5662992
560047, Austin Town, Bangalore, Karnataka, India
Latitude: 12.9587681 & longitude: 77.6159946
560043, Banaswadi, Bangalore, Karnataka, India
Latitude: 13.0103761 & longitude: 77.6481944
560001, Bangalore Bazaar, Bangalore, Karnataka, India
Latitude: 12.9287983 & longitude: 77.67638149999999
560103, Bellandur, Bangalore, Karnataka, India
Latitude: 12.9298689 & longitude: 77.6848366
560046, Benson Town, Bangalore, Karnataka, India
Latitude: 13.0011645 & longitude: 77.5995482
560049, Bhattarahalli, Bangalore, Karnataka, India
...

## From here on we load the data from a previously saved CSV file, to avoid being billed by Google for extensive use of the Maps API

In [123]:
df = pd.read_csv('./bangalore_neighbourhood_original.csv', sep="\t")
df 

Unnamed: 0,Pincode,Neighbourhood,Borough,Latitude,Longitude
0,560007,Agram,Bangalore South,12.957917,77.630912
1,560092,Amruthahalli,Bangalore North,13.065879,77.604206
2,560024,Anandnagar Bangalore,Bangalore North,13.031328,77.591313
3,560045,Arabic College,Bangalore North,13.030375,77.621131
4,560064,Attur,Bangalore North,13.106962,77.566299
...,...,...,...,...,...
95,560022,Yeshwanthpur Bazar,Bangalore North,13.020528,77.554645
96,562149,Bagalur Bangalore,Bangalore North,13.151285,77.668837
97,562157,Bettahalsur,Bangalore North,13.162341,77.609935
98,562130,Chikkanahalli,Bangalore South,12.881289,77.354940
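
Loading previously saved coordinates instead of calling the geocoding API again amounts to a simple file cache. The sketch below illustrates that idea under stated assumptions: `load_or_geocode`, `cache_path`, and `geocode` are hypothetical names, the cache is a tab-separated file with `Pincode`, `Latitude`, and `Longitude` columns, and the geocoder is a stand-in so the example runs offline.

```python
import csv
import os
import tempfile

def load_or_geocode(rows, cache_path, geocode):
    """Return {pincode: (lat, lng)}, calling geocode only for uncached pincodes."""
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path, newline='', encoding='utf-8') as f:
            for rec in csv.DictReader(f, delimiter='\t'):
                cache[rec['Pincode']] = (float(rec['Latitude']), float(rec['Longitude']))

    for pincode, neighbourhood in rows:
        if pincode not in cache:
            cache[pincode] = geocode(pincode, neighbourhood)

    # rewrite the cache file so the next run needs no API calls
    with open(cache_path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f, delimiter='\t')
        writer.writerow(['Pincode', 'Latitude', 'Longitude'])
        for pincode, (lat, lng) in cache.items():
            writer.writerow([pincode, lat, lng])
    return cache


# demonstration with a stand-in geocoder that records its calls
api_calls = []

def fake_geocode(pincode, neighbourhood):
    api_calls.append(pincode)
    return (12.957917, 77.630912)

path = os.path.join(tempfile.mkdtemp(), 'coords_cache.tsv')
rows = [('560007', 'Agram')]
first = load_or_geocode(rows, path, fake_geocode)
second = load_or_geocode(rows, path, fake_geocode)  # served from the file
print(len(api_calls))
```

With a cache like this, only pincodes missing from the file would trigger a billable request on later runs.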


In [126]:
# approximate coordinates of the centre of Bengaluru
bangalore_latitude = 12.9715
bangalore_longitude = 77.5945

map_bangalore = folium.Map(location=[bangalore_latitude, bangalore_longitude], zoom_start=12)

# one circle marker per neighbourhood, labelled with its name
for lat, lng, neighbourhood in zip(df['Latitude'], df['Longitude'], df['Neighbourhood']):
    label = folium.Popup('{}'.format(neighbourhood), parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=2,
        popup=label,
        color='green',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7
    ).add_to(map_bangalore)

map_bangalore