# Analyzing Neighborhoods in Bengaluru, India, to Open a Shopping Mall

## Introduction

Bengaluru, adjudged the most livable city in India, is the capital of the Indian state of Karnataka. It is known for its pleasant climate throughout the year and hosts numerous prestigious institutions and a large number of tech parks.

As the third most populous city in the country, Bengaluru offers property developers considerable opportunity to build more malls. This project aims to produce recommendations for stakeholders based on an analysis of the city's neighborhoods.

### Importing required libraries

In [1]:
import pandas as pd
from bs4 import BeautifulSoup
import requests
import numpy as np
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import geocoder

In [32]:
url = "https://en.wikipedia.org/wiki/List_of_neighbourhoods_in_Bangalore"
html_data = requests.get(url).text

In [33]:
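from io import StringIO
# wrap the HTML string in StringIO; passing literal HTML to read_html is deprecated since pandas 2.1
temp_data = pd.read_html(StringIO(html_data))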

In [34]:
# combine the first eight tables on the page into a single DataFrame
blr_data = pd.concat(temp_data[:8], ignore_index=True)
blr_data

| | Name | Image | Summary |
|---|---|---|---|
| 0 | Cantonment area | | The Cantonment area in Bangalore was used as a... |
| 1 | Domlur | | Formerly part of the Cantonment area, Domlur h... |
| 2 | Indiranagar | | Indiranagar is a sought-after residential and ... |
| 3 | Rajajinagar | | Established in 1949 on the birthday of C. Raja... |
| 4 | Malleswaram | | |
| ... | ... | ... | ... |
| 60 | Nandini Layout | | |
| 61 | Nayandahalli | | Nayandahalli is a transport junction in the we... |
| 62 | Rajajinagar | | |
| 63 | Rajarajeshwari Nagar | | Located in the south-western part of the city ... |
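
Concatenating the first eight tables by position depends on the page layout staying the same. A more defensive variant, sketched below, keeps only the tables that actually contain a `Name` column. The helper name `combine_named_tables` is hypothetical, and the sample frames are made-up stand-ins for the list returned by `pd.read_html`.

```python
import pandas as pd

def combine_named_tables(tables, column="Name"):
    """Concatenate only the tables that contain `column` (hypothetical helper)."""
    wanted = [t for t in tables if column in t.columns]
    if not wanted:
        raise ValueError(f"no table has a {column!r} column")
    return pd.concat(wanted, ignore_index=True)

# Stand-in data; in the notebook this list would be temp_data.
sample_tables = [
    pd.DataFrame({"Name": ["Domlur", "Indiranagar"], "Summary": ["a", "b"]}),
    pd.DataFrame({"Key": ["x"], "Value": ["y"]}),  # unrelated table, skipped
    pd.DataFrame({"Name": ["Rajajinagar"], "Summary": ["c"]}),
]
combined = combine_named_tables(sample_tables)
print(combined["Name"].tolist())
```

In the notebook, `combine_named_tables(temp_data)` could take the place of the positional concatenation, assuming every neighborhood table on the page carries a `Name` column.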


In [35]:
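# keep only the neighborhood names and give the column a clearer label
blr_data.drop(['Image', 'Summary'], axis=1, inplace=True)
blr_data.rename(columns={'Name': 'Neighborhood'}, inplace=True)
# replace the first entry with a more specific place name
blr_data.at[0, 'Neighborhood'] = "Bangalore Cantonment"
blr_data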

| | Neighborhood |
|---|---|
| 0 | Bangalore Cantonment |
| 1 | Domlur |
| 2 | Indiranagar |
| 3 | Rajajinagar |
| 4 | Malleswaram |
| ... | ... |
| 60 | Nandini Layout |
| 61 | Nayandahalli |
| 62 | Rajajinagar |
| 63 | Rajarajeshwari Nagar |
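
The listing above shows `Rajajinagar` at both index 3 and index 62, so the combined table contains at least one repeated name. One possible way to handle this, shown here on a small made-up frame and not necessarily what the original analysis does, is to drop repeated names before geocoding.

```python
import pandas as pd

# Small stand-in for blr_data; the real frame has 64 rows.
names = pd.DataFrame(
    {"Neighborhood": ["Domlur", "Rajajinagar", "Malleswaram", "Rajajinagar"]}
)

# Keep the first occurrence of each name and renumber the rows.
unique_names = names.drop_duplicates(subset="Neighborhood").reset_index(drop=True)
print(unique_names["Neighborhood"].tolist())
```

Applied to the notebook, the same two calls on `blr_data` would avoid geocoding the same neighborhood twice.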


In [36]:
# define a function to get coordinates
def get_latlng(neighborhood):
    # initialize your variable to None
    lat_lng_coords = None
    # loop until you get the coordinates
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, Bangalore, India'.format(neighborhood))
        lat_lng_coords = g.latlng
    return lat_lng_coords
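
The `while` loop in `get_latlng` never exits if the geocoder keeps returning `None` for a name. A bounded variant could look like the sketch below. The name `get_latlng_bounded`, the `max_attempts` and `delay` parameters, and the injected `geocode` callable are all assumptions made for illustration; the fake geocoder lets the example run without network access.

```python
import time

def get_latlng_bounded(neighborhood, geocode, max_attempts=3, delay=0.0):
    """Return [lat, lng], or None after max_attempts tries (hypothetical helper).

    `geocode` is any callable that takes a query string and returns
    [lat, lng] or None, for example: lambda q: geocoder.arcgis(q).latlng
    """
    query = '{}, Bangalore, India'.format(neighborhood)
    for attempt in range(max_attempts):
        coords = geocode(query)
        if coords is not None:
            return coords
        if attempt < max_attempts - 1:
            time.sleep(delay)  # pause before the next attempt
    return None

# Offline demonstration: a fake geocoder that fails twice, then succeeds.
calls = []

def flaky_geocode(query):
    calls.append(query)
    return [12.94329, 77.65602] if len(calls) >= 3 else None

result = get_latlng_bounded("Domlur", flaky_geocode)
missing = get_latlng_bounded("Nowhere", lambda q: None, max_attempts=2)
print(result, missing)
```

Because this version can return `None`, any code that builds a DataFrame from the results would need to handle missing coordinates.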

In [37]:
coords = [ get_latlng(neighborhood) for neighborhood in blr_data["Neighborhood"].tolist() ]
coords

[[12.975660000000062, 77.60542000000004],
 [12.943290000000047, 77.65602000000007],
 [13.030060000000049, 77.49526000000003],
 [13.005440000000021, 77.55693000000008],
 [13.00632005596653, 77.56839983128529],
 [12.966180000000065, 77.58690000000007],
 [13.014830000000075, 77.57771000000008],
 [12.993550000000027, 77.57988000000006],
 [12.987180000000023, 77.60398000000004],
 [12.989080000000058, 77.62795000000006],
 [12.99105000000003, 77.58855000000005],
 [12.927340000000072, 77.67169000000007],
 [12.978999697242791, 77.65613184800841],
 [12.99201000000005, 77.71506000000005],
 [13.000390000000039, 77.68368000000004],
 [12.994090000000028, 77.66633000000007],
 [12.954660000000047, 77.70752000000005],
 [12.943490000000054, 77.74701000000005],
 [12.975230000000067, 77.75238000000007],
 [13.019526511351998, 77.65502797845224],
 [13.026410000000055, 77.62437000000006],
 [13.038700000000063, 77.66192000000007],
 [12.968020000000024, 77.52114000000006],
 [13.014260000000036, 77.636740000000

In [38]:
df_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])
blr_data['Latitude'] = df_coords['Latitude']
blr_data['Longitude'] = df_coords['Longitude']
blr_data

| | Neighborhood | Latitude | Longitude |
|---|---|---|---|
| 0 | Bangalore Cantonment | 12.97566 | 77.60542 |
| 1 | Domlur | 12.94329 | 77.65602 |
| 2 | Indiranagar | 13.03006 | 77.49526 |
| 3 | Rajajinagar | 13.00544 | 77.55693 |
| 4 | Malleswaram | 13.00632 | 77.56840 |
| ... | ... | ... | ... |
| 60 | Nandini Layout | 13.01481 | 77.53891 |
| 61 | Nayandahalli | 12.94205 | 77.52100 |
| 62 | Rajajinagar | 13.00544 | 77.55693 |
| 63 | Rajarajeshwari Nagar | 12.93178 | 77.52668 |
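
Geocoders occasionally resolve a name to a place far from the intended city, so a quick sanity check on the coordinates can be useful before any further analysis. The sketch below flags rows that fall outside a rough bounding box. The box limits and the helper name `outside_bbox` are assumptions chosen for illustration, not an official city boundary, and the third sample row is invented.

```python
# Rough bounding box around Bengaluru; these limits are an assumption
# chosen for illustration, not an official city boundary.
LAT_RANGE = (12.7, 13.2)
LNG_RANGE = (77.3, 77.9)

def outside_bbox(rows, lat_range=LAT_RANGE, lng_range=LNG_RANGE):
    """Return names whose coordinates fall outside the box (hypothetical helper)."""
    flagged = []
    for name, lat, lng in rows:
        inside = (lat_range[0] <= lat <= lat_range[1]
                  and lng_range[0] <= lng <= lng_range[1])
        if not inside:
            flagged.append(name)
    return flagged

sample_rows = [
    ("Bangalore Cantonment", 12.97566, 77.60542),  # from the table above
    ("Domlur", 12.94329, 77.65602),                # from the table above
    ("Made-up outlier", 28.61390, 77.20900),       # invented row, far from the city
]
flagged = outside_bbox(sample_rows)
print(flagged)
```

With the notebook's data, the rows could be supplied as `blr_data[['Neighborhood', 'Latitude', 'Longitude']].itertuples(index=False)`. A check like this only catches points far from the city; it cannot detect a neighborhood placed in the wrong part of Bengaluru.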
