# Choosing the most convenient Neighbourhood of Bengaluru

## Introduction/Business Problem

One of the major problems people face when moving to a new city is choosing a neighbourhood to live in. Several major factors influence that choice, such as comfort, convenience, and safety.

In this notebook we analyse the neighbourhoods of Bengaluru, Karnataka, India, to identify the best neighbourhoods in the city in terms of convenience for residents.

In [1]:
# installing dependencies
!pip install beautifulsoup4
!pip install geocoder
!pip install folium

# importing dependencies
import requests
from bs4 import BeautifulSoup
import pandas as pd
import folium



Scraping the required data from the webpage

In [145]:
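# fetching the page that lists the pincodes for Bangalore
page = requests.get("https://www.indiatvnews.com/pincode/karnataka/bangalore")
page.raise_for_status()  # stop early if the request failed
soup = BeautifulSoup(page.content, 'html.parser')
# the pincode data sits in the table with class "alt"
table = soup.find('table', class_='alt')
table_rows = table.find_all('tr')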

Collecting the scraped table rows into a list

In [146]:
# this array will hold the table data
temp = []

# adding one sub-list of cell texts per table row
for tr in table_rows:
    td = tr.find_all('td')
    row = [d.text.strip() for d in td]

    # skipping empty rows and rows whose borough is "NA"
    if len(row) > 1 and row[1] != "NA":
        temp.append(row)
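The row filter can be tried offline before running it against the live page. The following is a minimal sketch, assuming a small hand-written HTML table; `SAMPLE_HTML`, its header names, the "Example Office" row, and the helper `extract_rows` are illustrative and not part of the original notebook, and the real page markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the scraped page; the real markup may differ.
SAMPLE_HTML = """
<table class="alt">
  <tr><th>Office</th><th>Taluk</th><th>District</th><th>State</th><th>Pincode</th></tr>
  <tr><td>Agram</td><td>Bangalore South</td><td>Bangalore</td><td>Karnataka</td><td>560007</td></tr>
  <tr><td>Example Office</td><td>NA</td><td>Bangalore</td><td>Karnataka</td><td>560000</td></tr>
  <tr><td> Amruthahalli </td><td>Bangalore North</td><td>Bangalore</td><td>Karnataka</td><td>560092</td></tr>
</table>
"""


def extract_rows(html):
    """Return the cell texts of each data row whose second column is not "NA"."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="alt")
    if table is None:
        # No matching table, e.g. the page layout changed.
        return []
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.text.strip() for td in tr.find_all("td")]
        # Rows made only of <th> cells produce an empty list and are skipped.
        if len(cells) > 1 and cells[1] != "NA":
            rows.append(cells)
    return rows


rows = extract_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

Returning an empty list when the table is missing keeps the later dataframe step from failing with an unclear `AttributeError` on `None`.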

Converting the list of rows into a dataframe

In [147]:
# creating dataframe out of mentioned array
df = pd.DataFrame(data=temp, columns=['Neighbourhood', 'Borough', 'District', 'State', 'Pincode'])
df = df.drop(['District', 'State'], axis=1)
df = df.iloc[1:]
print(df.shape)
print(df)

(259, 3)
            Neighbourhood          Borough Pincode
1                   Agram  Bangalore South  560007
2      Air Force Hospital  Bangalore North  560007
3            Amruthahalli  Bangalore North  560092
4    Anandnagar Bangalore  Bangalore North  560024
5          Arabic College  Bangalore North  560045
..                    ...              ...     ...
255  Tavarekere Bangalore   Bangaloresouth  562130
256   Thammanayakanahalli           Anekal  562106
257         Vanakanahalli           Anekal  562106
258           Vidyanagara         Bg North  562157
259         Yadavanahalli           Anekal  562107

[259 rows x 3 columns]


Some borough names are misspelled, so the next step corrects them. We then keep only two boroughs, Bangalore South and Bangalore North; the other boroughs lie on the outskirts of the district.

In [148]:
df['Borough'] = df['Borough'].replace(['Bangalore North', 'Bangalore north', 'Banglorenorth', 'Bg North', 'Bgnorth'], 'Bangalore North')

df['Borough'] = df['Borough'].replace(['Bangalore South', 'Bangaloresouth', 'Bg South', 'Bgsouth', 'Nla & Bgsouth'], 'Bangalore South')

df['Borough'] = df['Borough'].replace(['Bangalore', 'Banglore'], 'Bangalore')

df = df[df['Borough'].isin(["Bangalore South", "Bangalore North"])]
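The same clean-up can also be written with a single lookup table, which makes it easier to add a new misspelling later. This is only a sketch on a four-row sample taken from the printed output above; the names `VARIANTS`, `CANONICAL`, `sample`, and `kept` are illustrative.

```python
import pandas as pd

# Spelling variants from the replace() calls above, keyed by the canonical name.
VARIANTS = {
    "Bangalore North": ["Bangalore north", "Banglorenorth", "Bg North", "Bgnorth"],
    "Bangalore South": ["Bangaloresouth", "Bg South", "Bgsouth", "Nla & Bgsouth"],
}

# Invert to {variant: canonical} so one replace() call handles every variant.
CANONICAL = {
    variant: name for name, variants in VARIANTS.items() for variant in variants
}

# Four rows copied from the scraped output shown earlier.
sample = pd.DataFrame({
    "Neighbourhood": ["Agram", "Tavarekere Bangalore", "Vidyanagara", "Yadavanahalli"],
    "Borough": ["Bangalore South", "Bangaloresouth", "Bg North", "Anekal"],
})

# Normalise the spellings, then keep only the two boroughs of interest.
sample["Borough"] = sample["Borough"].replace(CANONICAL)
kept = sample[sample["Borough"].isin(list(VARIANTS))]
print(kept)
```

Values that are not in the mapping, such as "Anekal", pass through `replace` unchanged and are then dropped by the `isin` filter.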

We also noticed that several neighbourhoods share a pincode. These are merged into a single row per pincode, with the neighbourhood names separated by commas.

In [149]:
# grouping neighbourhood with common pincode
tempNeighbourhoodDf = df.groupby('Pincode')['Neighbourhood'].apply(','.join).reset_index().set_index('Pincode')
print('Shape of grouped neighbourhood', tempNeighbourhoodDf.shape)
print(tempNeighbourhoodDf)

Shape of grouped neighbourhood (100, 1)
                                             Neighbourhood
Pincode                                                   
560001   Bangalore Bazaar,Dr. Ambedkar Veedhi,HighCourt...
560002       Bangalore Corporation Building,Bangalore City
560003   Malleswaram,Palace Guttahalli,Swimming Pool Ex...
560004             Basavanagudi,Mavalli,Pampamahakavi Road
560005                                         Fraser Town
...                                                    ...
560109                                     Thalaghattapura
562130   Chikkanahalli,Chunchanakuppe,Kadabagere,Tavare...
562149           Bagalur Bangalore,Bandikodigehalli,Kannur
562157   Bettahalsur,Chikkajala,Doddajala,Hunasamaranah...
562162                        Dasanapura,Madanayakanahalli

[100 rows x 1 columns]
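To see what the grouping does on a small scale, here is a sketch using four neighbourhoods from the output above. Pincodes are kept as strings, as they are when scraped as text; the names `sample`, `grouped`, and `restored` are illustrative.

```python
import pandas as pd

# Three neighbourhoods share pincode 560004; Fraser Town has its own pincode.
sample = pd.DataFrame({
    "Neighbourhood": ["Basavanagudi", "Mavalli", "Fraser Town", "Pampamahakavi Road"],
    "Pincode": ["560004", "560004", "560005", "560004"],
})

# Within each pincode group, join the neighbourhood names into one string.
grouped = sample.groupby("Pincode")["Neighbourhood"].apply(",".join)
print(grouped)

# The merge is reversible as long as no name itself contains a comma.
restored = grouped.str.split(",").explode()
print(restored)
```

The joined string preserves the row order within each group, so the names appear in the order they were scraped.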


In [150]:
# keeping one borough per pincode (the first one encountered)
tempBoroughDf = df.drop('Neighbourhood', axis=1).drop_duplicates(subset='Pincode', keep='first').set_index('Pincode')
print(tempBoroughDf.shape)
tempBoroughDf

(100, 1)


                 Borough
Pincode                 
560007   Bangalore South
560092   Bangalore North
560024   Bangalore North
560045   Bangalore North
560064   Bangalore North
...                  ...
560022   Bangalore North
562149   Bangalore North
562157   Bangalore North
562130   Bangalore South
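Keeping only the first borough per pincode matters because a pincode can be listed under more than one borough: in the scraped output shown earlier, 560007 appears under both Bangalore South (Agram) and Bangalore North (Air Force Hospital). The sketch below, built from three rows of that output, shows one way to find such pincodes; the names `sample`, `borough_counts`, `ambiguous`, and `first_borough` are illustrative.

```python
import pandas as pd

# First three rows of the scraped table shown earlier.
sample = pd.DataFrame({
    "Neighbourhood": ["Agram", "Air Force Hospital", "Amruthahalli"],
    "Borough": ["Bangalore South", "Bangalore North", "Bangalore North"],
    "Pincode": ["560007", "560007", "560092"],
})

# Count the distinct boroughs recorded for each pincode.
borough_counts = sample.groupby("Pincode")["Borough"].nunique()
ambiguous = borough_counts[borough_counts > 1].index.tolist()
print("Pincodes listed under more than one borough:", ambiguous)

# keep="first" retains the borough of the first row seen for each pincode.
first_borough = (
    sample.drop(columns="Neighbourhood")
    .drop_duplicates(subset="Pincode", keep="first")
    .set_index("Pincode")
)
print(first_borough)
```

For ambiguous pincodes the borough that survives depends only on row order, so it is worth reviewing them if borough-level results are reported later.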


Merging the two dataframes, then printing the shape and the first rows to check the result

In [156]:
# merging both tempNeighbourhoodDf & tempBoroughDf
df = tempNeighbourhoodDf.join(tempBoroughDf)
print(df.shape)
print(df.head())

(100, 2)
                                             Neighbourhood          Borough
Pincode                                                                    
560001   Bangalore Bazaar,Dr. Ambedkar Veedhi,HighCourt...  Bangalore North
560002       Bangalore Corporation Building,Bangalore City  Bangalore South
560003   Malleswaram,Palace Guttahalli,Swimming Pool Ex...  Bangalore North
560004             Basavanagudi,Mavalli,Pampamahakavi Road  Bangalore South
560005                                         Fraser Town  Bangalore North
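As an alternative to building two temporary dataframes and joining them, the same result can be sketched in one pass with pandas named aggregation. This is an illustrative variant, not the notebook's method; it reuses three rows from the scraped output, and the names `sample` and `combined` are hypothetical.

```python
import pandas as pd

# Three rows copied from the scraped output shown earlier.
sample = pd.DataFrame({
    "Neighbourhood": ["Agram", "Air Force Hospital", "Amruthahalli"],
    "Borough": ["Bangalore South", "Bangalore North", "Bangalore North"],
    "Pincode": ["560007", "560007", "560092"],
})

# One groupby: join the neighbourhood names and keep the first borough per pincode.
combined = sample.groupby("Pincode").agg(
    Neighbourhood=("Neighbourhood", ",".join),
    Borough=("Borough", "first"),
)
print(combined)
```

The two-step version in the notebook makes each intermediate result easy to inspect, while the single `agg` call avoids the join and keeps both columns aligned by construction.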


We have now successfully created the dataframe needed to analyse the neighbourhoods.