# Coursera Peer-Graded Assignment - Capstone Project

## Introduction

Bangalore, officially Bengaluru, is the capital of the Indian state of Karnataka. It has a population of over ten million, making it a megacity and the third-most populous city and fifth-most populous urban agglomeration in India. 

Bengaluru is widely regarded as the "Silicon Valley of India" (or "IT capital of India") because of its role as the nation's leading information technology (IT) exporter. Indian technological organisations such as ISRO, Infosys, Wipro and HAL are headquartered in the city. A demographically diverse city, Bangalore is the second fastest-growing major metropolis in India. Recent estimates of the metro economy of its urban area have ranked Bangalore either the fourth- or fifth-most productive metro area of India. It is home to many educational and research institutions, such as the Indian Institute of Science (IISc), Indian Institute of Management Bangalore (IIMB), International Institute of Information Technology, Bangalore (IIITB), National Institute of Fashion Technology, Bangalore, National Institute of Design, Bangalore (NID R&D Campus), National Law School of India University (NLSIU) and National Institute of Mental Health and Neurosciences (NIMHANS). Numerous state-owned aerospace and defence organisations, such as Bharat Electronics, Hindustan Aeronautics and National Aerospace Laboratories, are located in the city. The city also houses the Kannada film industry.

## Business Problem

With rapid development in the city, the population is expected to increase in the coming years. Bangalore's 2020 population is estimated at 12,326,532. The aim of this project is to find suitable places in the city to open a merchandise shop, so that residents' demand for a good shopping experience can be met.

## Data

The data for this project is taken from a Wikipedia page that lists the neighborhoods of Bangalore, grouped by region.

A link to the webpage is provided here: "https://en.wikipedia.org/wiki/List_of_neighbourhoods_in_Bangalore"

### 1. Importing required libraries

We start by importing the required libraries.

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON into a pandas dataframe (pandas >= 1.0; older releases expose it as pandas.io.json.json_normalize)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means for the clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes
#import folium # map rendering library

print('Libraries imported.')

### 2. Download and explore the Dataset

In [2]:
url = "https://en.wikipedia.org/wiki/List_of_neighbourhoods_in_Bangalore"
html = requests.get(url).content # download the raw HTML of the page
df = pd.read_html(html) # returns a list of dataframes, one per HTML table
print(df)

[                 Name  Image  \
0     Cantonment area    NaN   
1              Domlur    NaN   
2         Indiranagar    NaN   
3   Jeevanbheemanagar    NaN   
4         Malleswaram    NaN   
5           Pete area    NaN   
6      Sadashivanagar    NaN   
7       Seshadripuram    NaN   
8        Shivajinagar    NaN   
9              Ulsoor    NaN   
10      Vasanth Nagar    NaN   

                                              Summary  
0   The Cantonment area in Bangalore was used as a...  
1   Formerly part of the Cantonment area, Domlur h...  
2   Indiranagar is a sought-after residential and ...  
3                                                 NaN  
4                                                 NaN  
5   Established by Kempe Gowda I at the time of cr...  
6   Sadashivanagar is an upscale neighbourhood in ...  
7   Seshadripuram was established in 1892 to reduc...  
8   Shivajinagar is one of the older areas of the ...  
9   Ulsoor (or Halasuru) is one of the oldest area... 
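
The printed result starts with `[` because `pd.read_html` returns a list of dataframes rather than a single dataframe. The following is a minimal sketch of that behavior on a small inline page. `SAMPLE_HTML` and its rows are made-up stand-ins for the Wikipedia article, not data from it, and the sketch assumes an HTML parser backend such as lxml is installed, as the call in the cell above also requires.

```python
from io import StringIO

import pandas as pd

# Hypothetical two-table page standing in for the Wikipedia article.
SAMPLE_HTML = """
<html><body>
<table>
  <tr><th>Name</th><th>Summary</th></tr>
  <tr><td>Domlur</td><td>First toy row</td></tr>
  <tr><td>Ulsoor</td><td>Second toy row</td></tr>
</table>
<table>
  <tr><th>Name</th><th>Summary</th></tr>
  <tr><td>Hoodi</td><td>Third toy row</td></tr>
</table>
</body></html>
"""

# read_html returns a list with one DataFrame per <table> element.
tables = pd.read_html(StringIO(SAMPLE_HTML))

print(type(tables).__name__, len(tables))
for position, table in enumerate(tables):
    # Each list entry is an ordinary DataFrame that can be indexed by position.
    print(position, table.shape, list(table.columns))
```

Inspecting the length of the list and the columns of each entry in this way is a quick check before relying on fixed positions in the list.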

As the output shows, the page contains eight tables, one per region. In the next step, we merge them into a single dataframe.

In [3]:
bangalore_data = pd.concat(df[:8], ignore_index=True) # merge the eight region tables
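
To illustrate what the merge does, here is a small sketch on toy data. The frames `central`, `eastern` and `unrelated` are hypothetical stand-ins for the tables returned by `pd.read_html`. Filtering on the "Name" column is an assumption added here as one possible safeguard if the page ever yields extra tables; it is not part of the original notebook.

```python
import pandas as pd

# Toy stand-ins for the per-region tables (hypothetical data).
central = pd.DataFrame({"Name": ["Domlur", "Ulsoor"], "Summary": ["a", "b"]})
eastern = pd.DataFrame({"Name": ["Hoodi", "Varthur"], "Summary": ["c", "d"]})
unrelated = pd.DataFrame({"Key": ["x"], "Value": ["y"]})  # e.g. a navigation box

tables = [central, eastern, unrelated]

# Keep only the tables that actually carry a "Name" column.
region_tables = [t for t in tables if "Name" in t.columns]

# Stack the tables vertically; ignore_index=True renumbers the rows from 0.
merged = pd.concat(region_tables, ignore_index=True)
print(merged)

# Without ignore_index, each table keeps its own row labels, so labels repeat.
without_reset = pd.concat(region_tables)
print(list(without_reset.index))
```

With `ignore_index=True` every row of the merged dataframe has a unique label, which keeps later label-based lookups unambiguous.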

We only need the neighborhood names, so we drop the "Image" and "Summary" columns.

In [4]:
bangalore_data = bangalore_data.drop(columns=["Image", "Summary"])
print(bangalore_data)

                    Name
0        Cantonment area
1                 Domlur
2            Indiranagar
3      Jeevanbheemanagar
4            Malleswaram
5              Pete area
6         Sadashivanagar
7          Seshadripuram
8           Shivajinagar
9                 Ulsoor
10         Vasanth Nagar
11             Bellandur
12        CV Raman Nagar
13                 Hoodi
14      Krishnarajapuram
15          Mahadevapura
16          Marathahalli
17               Varthur
18            Whitefield
19             Banaswadi
20            HBR Layout
21              Horamavu
22          Kalyan Nagar
23          Kammanahalli
24        Lingarajapuram
25      Ramamurthy Nagar
26                Hebbal
27             Jalahalli
28             Mathikere
29                Peenya
30           R. T. Nagar
31        Vidyaranyapura
32             Yelahanka
33          Yeshwanthpur
34          Bommanahalli
35           Bommasandra
36            BTM Layout
37       Electronic City
38            HSR Layout
