
### This notebook will be used for my final capstone project for the Coursera Data Science Professional Certificate from IBM.
## 1. Introduction/Business Problem
1.1. Background
Bengaluru (formerly and still commonly known as Bangalore) is one of the fastest-growing cities in the world. Many people move to Bengaluru to pursue career opportunities, a grand city life, and the fantastic weather. People often call it the 'Silicon Valley of India' because it is a hub for Information Technology, Artificial Intelligence, and Data Science. As infrastructure and population continue to grow, efficient governance has become a problem. There is an ongoing debate over whether splitting the Bruhat Bengaluru Mahanagara Palike (BBMP), the city's municipal body, would promote smoother administration.

1.2. Problem
Bengaluru is very diverse, containing many cultures and spanning both urban and rural territory. Splitting it up so that similar policies and projects can be implemented together efficiently is hard to do by intuition alone: within a small radius, one can find both small fish markets and some of the most modern malls. A data-science-based approach may therefore prove useful.

1.3. Interest
Policy-makers in Bengaluru are the intended audience for this analysis. Grouping similar neighborhoods would aid smoother administration, since policies can be geared toward problems that similar neighborhoods are likely to share.


In [2]:
import pandas as pd
import numpy as np
print('Hello Capstone Project Course!')

Hello Capstone Project Course!


## 2. Data

2.1. Source
Kaggle is a community-based platform for data scientists and machine learning enthusiasts. It is owned by Google and is a convenient place for people to share data and projects. The data for Bengaluru neighborhoods was obtained from Kaggle; the original source is the Indian government's open data portal, 'data.gov.in'.

2.2. Data Cleaning
The data obtained contain some obviously incorrect outliers: places in Bengaluru cannot span such a wide range of latitudes and longitudes. These outliers were identified and their coordinates replaced with values found online. In addition, duplicate neighborhoods and unneeded columns were removed to obtain a clean dataset.

2.3. Data Usage
The data contains latitudes and longitudes that can be passed to the Foursquare API to retrieve nearby venues. With the venue data, similar neighborhoods can be clustered together. This helps answer the question of how to split Bengaluru into groups of similar neighborhoods for effective governance.

In [3]:
# The code was removed by Watson Studio for sharing.

Unnamed: 0.1,Unnamed: 0,Neighborhood,Latitude,Longitude
0,0,Agram,45.813177,15.977048
1,1,Amruthahalli,13.066513,77.596624
2,2,Attur,11.663711,78.533551
3,3,Banaswadi,13.014162,77.651854
4,4,Bellandur,58.235358,26.683116


#### I saved the table as a CSV and uploaded it to Watson Studio, then used my credentials to load it into a dataframe.
#### Because the code above contains my credentials, it is hidden from view.

In [4]:
df_data_1.shape
df_data_1.dtypes

Unnamed: 0        int64
Neighborhood     object
Latitude        float64
Longitude       float64
dtype: object

In [7]:
from scipy import stats

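def find_outliers(df, threshold=3):
    """Keep only the rows flagged as outliers. Note: df is modified in place."""
    # True for rows whose numeric columns all lie within `threshold` standard deviations of the mean
    within = df.select_dtypes(include=[np.number]).apply(lambda x: np.abs(stats.zscore(x)) < threshold).all(axis=1)
    # drop the well-behaved rows so that only the suspected outliers remain
    df.drop(df.index[within], inplace=True)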


#### The code above identifies outliers by computing the z-score of each data point. A row is treated as an outlier if any of its values lies three or more standard deviations from the mean. All other rows are dropped, so only the outliers remain.
#### These data points were likely entered incorrectly and will therefore be replaced manually in an updated CSV.
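
The manual replacement itself is not shown in this notebook, so the following is only a minimal sketch of how such corrections could be applied programmatically. It works on plain Python records rather than a dataframe, and the names `corrections` and `apply_corrections` are hypothetical. The coordinates are the ones that appear for these neighborhoods elsewhere in this notebook.

```python
# Corrected (latitude, longitude) pairs keyed by neighborhood name.
# Hypothetical lookup table; the notebook's real fix was done by hand in the CSV.
corrections = {
    "Agram": (12.958, 77.6308),
    "Bellandur": (12.9304, 77.6784),
    "Fraser Town": (12.997, 77.6144),
}

def apply_corrections(rows, corrections):
    """Return new rows with coordinates replaced wherever a correction exists."""
    fixed = []
    for row in rows:
        # fall back to the row's own coordinates when no correction is listed
        lat, lon = corrections.get(row["Neighborhood"], (row["Latitude"], row["Longitude"]))
        fixed.append({**row, "Latitude": lat, "Longitude": lon})
    return fixed

rows = [
    {"Neighborhood": "Agram", "Latitude": 45.813177, "Longitude": 15.977048},
    {"Neighborhood": "Amruthahalli", "Latitude": 13.066513, "Longitude": 77.596624},
]
cleaned = apply_corrections(rows, corrections)
```

With a pandas dataframe, the same idea could be expressed as an assignment through `df.loc` with a boolean mask on the `Neighborhood` column. Keeping the corrections in one lookup table makes the cleaning step repeatable and easy to review.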

In [8]:
find_outliers(df_data_1)
df_data_1

Unnamed: 0.1,Unnamed: 0,Neighborhood,Latitude,Longitude
0,0,Agram,45.813177,15.977048
4,4,Bellandur,58.235358,26.683116
14,14,Fraser Town,42.245363,-74.964607
15,15,Gunjur,13.17602,-16.759896
16,16,HighCourt,53.783508,-0.387187
32,32,Museum Road,-35.081322,137.704268
33,33,NAL,58.318956,12.266015
43,43,Whitefield,44.373441,-71.61026
51,51,Begur,52.480709,13.451829
84,84,Shanthinagar,58.235358,26.683116


In [18]:
# The code was removed by Watson Studio for sharing.

In [20]:
df_data_2.tail()


Unnamed: 0.1,Unnamed: 0,Neighborhood,Latitude,Longitude
342,347,Virupakshipura,13.024075,76.469658
343,348,Vishwanathapura,13.273529,77.649099
344,349,Yadamaranahalli,12.427249,77.379083
345,350,Yadavanahalli,12.789855,77.751454
346,351,Yeliyur,12.509896,76.828661


In [21]:
# dropping first column
del df_data_2['Unnamed: 0']
df_data_2.shape

(347, 3)

In [23]:
# resetting index
df = df_data_2.reset_index(drop=True)
df

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Agram,12.958000,77.630800
1,Amruthahalli,13.066513,77.596624
2,Attur,11.663711,78.533551
3,Banaswadi,13.014162,77.651854
4,Bellandur,12.930400,77.678400
...,...,...,...
342,Virupakshipura,13.024075,76.469658
343,Vishwanathapura,13.273529,77.649099
344,Yadamaranahalli,12.427249,77.379083
345,Yadavanahalli,12.789855,77.751454


## 3. Methodology

3.1. Exploratory Data Analysis
Bengaluru is a growing city whose international airport lies outside the city center. The airport used to be in the city, but demand for a much larger international airport forced the government's hand. Given the data and the history of the city, one would expect a large number of neighborhoods in the center, with the concentration of neighborhoods falling off toward the outskirts. Furthermore, with the airport located far from the city center, a cluster of neighborhoods is expected to have accumulated near it. The data back both expectations: many neighborhood locations (latitude and longitude) are clustered around the city center, and there are more neighborhoods near the airport than at other locations equally distant from the center.


3.2. Inferential statistical testing
The data clearly contain some outliers that appear to have been wrongly retrieved or mis-entered. No location in Bengaluru can have a negative longitude when the city center's longitude is in the positive seventies. To weed out the outliers, z-scores were used. The z-score measures how many standard deviations an observation lies from the overall mean. For a normal distribution, about 99.7% of the data lie within three standard deviations of the mean, so values outside this range are treated as outliers. Using Python's SciPy library, the z-scores of the locations were computed. Once the outliers were flagged, they were replaced manually with correct latitudes and longitudes found online. This created an updated dataset, which was verified at the end with a map showing the locations in the correct places.
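
To make the z-score rule concrete, here is a small standard-library sketch on toy numbers. It is an illustration rather than the notebook's SciPy-based code: the function name `zscore_outliers` is hypothetical, and the sample consists of 19 made-up longitudes near 77.6 plus one negative longitude of the kind seen in the raw data.

```python
from statistics import fmean, pstdev

def zscore_outliers(values, threshold=3.0):
    """Return the positions of values whose absolute z-score exceeds the threshold."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (ddof=0), as scipy.stats.zscore uses by default
    # assumes the values are not all identical, otherwise std would be zero
    return [i for i, v in enumerate(values) if abs((v - mean) / std) > threshold]

# 19 plausible longitudes for Bengaluru (toy values) and one clearly wrong entry
longitudes = [77.60 + 0.01 * i for i in range(19)] + [-74.964607]
flagged = zscore_outliers(longitudes)
```

One caveat worth keeping in mind: a single extreme value among n observations can reach a z-score of at most about the square root of n - 1, so on very small samples the three-standard-deviation rule may flag nothing at all.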

3.3. Data Science and Machine Learning Techniques
First, Foursquare data was used to collect venues near each neighborhood. Foursquare offers a developer-friendly API for retrieving location data. Using each latitude and longitude, calls were made to the API to retrieve the nearby venues, which form the basis of the analysis. The venue categories are categorical variables that can be used to identify similar neighborhoods. For a numerical algorithm to be used, the categorical variables need to be converted to numerical values, which can be done with dummy variables: each venue category gets its own column, recorded as one if a venue belongs to that category and zero otherwise. Averaging these columns per neighborhood then gives the frequency of each category in that neighborhood.
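
The dummy-variable step can be sketched on a handful of made-up records. This is a standard-library illustration of the idea only; the neighborhood names "Hood A" and "Hood B" and their venues are hypothetical and are not Foursquare output.

```python
from collections import defaultdict

# Hypothetical (neighborhood, venue category) pairs
venues = [
    ("Hood A", "Bakery"), ("Hood A", "Bakery"), ("Hood A", "Hotel"), ("Hood A", "Pub"),
    ("Hood B", "Train Station"), ("Hood B", "ATM"),
]

# One column per distinct category, in a fixed order
categories = sorted({category for _, category in venues})

# One dummy row per venue: 1 in the column of its category, 0 elsewhere
dummy_rows = [(hood, [1 if category == c else 0 for c in categories])
              for hood, category in venues]

# Average the dummy rows per neighborhood to get category frequencies
grouped = defaultdict(list)
for hood, row in dummy_rows:
    grouped[hood].append(row)
frequencies = {hood: [sum(column) / len(rows) for column in zip(*rows)]
               for hood, rows in grouped.items()}
```

Each neighborhood ends up as one numeric vector whose entries sum to one, which is the kind of input a distance-based algorithm such as k-means can work with.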

K-means clustering was the primary technique used to analyze the data. It is an unsupervised machine learning method that groups similar data points by minimizing the distance between each point and the centroid of its cluster (the within-cluster variance), which in turn keeps the clusters well separated from each other. Due to the large number of neighborhoods and the diversity of Bengaluru, the number of clusters chosen was 20. This provides sufficient granularity to answer the questions posed in the introduction.
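
For intuition about what k-means does, the following is a bare-bones sketch of the iteration (often called Lloyd's algorithm) on six toy points. It is not scikit-learn's implementation, which the notebook actually uses; the function names and the fixed starting centroids are assumptions made for illustration.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))

def kmeans(points, centroids, iterations=10):
    """Run a fixed number of assign/update rounds from the given starting centroids."""
    for _ in range(iterations):
        # assignment step: each point joins its nearest centroid
        labels = [nearest(p, centroids) for p in points]
        # update step: each centroid moves to the mean of its members
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # an empty cluster simply keeps its previous centroid
            updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)) if members else old)
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    # the quantity k-means tries to make small: total squared distance to the assigned centroid
    inertia = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, inertia

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids, inertia = kmeans(points, centroids=[(0, 0), (10, 10)])
```

Comparing the returned `inertia` across different numbers of clusters is one common way to sanity-check a choice such as 20, although that comparison is not part of this notebook.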

Finally, Folium is used to draw a map with color-coded clusters, which shows the clustering of Bengaluru neighborhoods visually. Tables of the ten most common venues per neighborhood are then used to examine what the neighborhoods in a cluster have in common. From this, conclusions can be drawn on how to split up Bengaluru in order to govern it effectively.


In [25]:
from sklearn.cluster import KMeans

In [29]:
def getNearbyVenues(names, latitudes, longitudes, radius=2500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, VERSION, lat, lng, radius, LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [30]:
import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!pip install folium
import folium # map rendering library

print('Libraries imported.')
df_blr_venues = getNearbyVenues(names=df['Neighborhood'],latitudes=df['Latitude'],longitudes=df['Longitude'])

Libraries imported.
Agram
Amruthahalli
Attur
Banaswadi
Bellandur
Bhattarahalli
Bidrahalli
Byatarayanapura
Devanagundi
Devasandra
Doddagubbi
Doddanekkundi
Domlur
EPIP
Fraser Town
Gunjur
HighCourt
Hoodi
Horamavu
Indiranagar S.O (Bangalore)
Jakkur
Kadugodi
Kalkunte
Kannamangala
Kodigehalli
Kothanur
Krishnarajapuram
Kundalahalli
Lingarajapuram
Mahadevapura
Medimallasandra
Mundur
Museum Road
NAL
Panathur
Rajanakunte
Sadashivanagar
Samethanahalli
Singanayakanahalli
Vasanthanagar
Venkateshapura
Vimanapura
Virgonagar
Whitefield
Yelahanka
Adugodi
Agara
Anjanapura
Banashankari
Bannerghatta
Basavanagudi H.O
Begur
Bolare
Bommanahalli S.O (Bangalore)
Chandapura
Chickpet
Chikkalasandra
Deepanjalinagar
Doddakallasandra
Girinagar S.O (Bangalore)
Gottigere
Haragadde
Hennagara
Hulimangala
Hulimavu
Huskur
Jayanagar H.O
Jigani
Kalkere
Kallubalu
Kathriguppe
Kengeri
Konanakunte
Koramangala
Kumbalagodu
Madivala
Mallathahalli
Mavalli
Nayandahalli
Ragihalli
Ramohalli
Sakalavara
Shanthinagar
Singasandra
Subrama

In [31]:
df_blr_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Achitnagar,4,4,4,4,4,4
Adugodi,79,79,79,79,79,79
Agram,100,100,100,100,100,100
Akkur,2,2,2,2,2,2
Alahalli,2,2,2,2,2,2
...,...,...,...,...,...,...
Whitefield,74,74,74,74,74,74
Yadavanahalli,5,5,5,5,5,5
Yelachenahalli,93,93,93,93,93,93
Yelahanka,31,31,31,31,31,31


In [33]:
print('There are {} unique categories.'.format(len(df_blr_venues['Venue Category'].unique())))

There are 252 unique categories.


#### Multiple radii were tested, and even with a large radius, not all neighborhoods returned a venue.
#### This could be because some of the areas are in rural Bengaluru, which may not have venues in the typical urban sense.
#### From here on, only the neighborhoods that appear in the table above are clustered.
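
To see which neighborhoods came back empty, the full list of names can be compared against the names that appear in the venue results. The sketch below uses plain Python and made-up venue rows; the function name `neighborhoods_without_venues` is hypothetical.

```python
def neighborhoods_without_venues(all_names, venue_rows):
    """Return the neighborhoods, in their original order, that have no venue rows."""
    with_venues = {row[0] for row in venue_rows}  # first field is the neighborhood name
    return [name for name in all_names if name not in with_venues]

# Illustrative inputs: four neighborhood names and toy (neighborhood, category) rows
all_names = ["Agram", "Amruthahalli", "Attur", "Banaswadi"]
venue_rows = [("Agram", "Hotel"), ("Amruthahalli", "Garden"), ("Banaswadi", "Steakhouse")]
missing = neighborhoods_without_venues(all_names, venue_rows)
```

In the notebook itself, the same comparison could be made between `df['Neighborhood']` and `df_blr_venues['Neighborhood']`, which gives a quick count of how many neighborhoods are left out of the clustering.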

In [34]:

# one hot encoding
onehot = pd.get_dummies(df_blr_venues[['Venue Category']], prefix="", prefix_sep="")

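# add neighborhood column back to dataframe
# (caution: if a venue category is itself named 'Neighborhood', this assignment overwrites that dummy column)
onehot['Neighborhood'] = df_blr_venues['Neighborhood']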


In [35]:
onehot.shape

(4677, 252)

In [36]:
t_grouped = onehot.groupby('Neighborhood').mean().reset_index()
t_grouped

Unnamed: 0,Neighborhood,ATM,Accessories Store,Afghan Restaurant,Airport,Airport Service,Airport Terminal,American Restaurant,Andhra Restaurant,Arcade,...,Turkish Coffeehouse,Udupi Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Vineyard,Warehouse Store,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,Achitnagar,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.0,...,0.0,0.00,0.000000,0.00,0.0,0.0,0.0,0.0,0.0,0.000000
1,Adugodi,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.0,...,0.0,0.00,0.000000,0.00,0.0,0.0,0.0,0.0,0.0,0.012658
2,Agram,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.0,...,0.0,0.01,0.000000,0.01,0.0,0.0,0.0,0.0,0.0,0.000000
3,Akkur,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.0,...,0.0,0.00,0.000000,0.00,0.0,0.0,0.0,0.0,0.0,0.000000
4,Alahalli,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.0,...,0.0,0.00,0.000000,0.00,0.0,0.0,0.0,0.0,0.0,0.000000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
208,Whitefield,0.0,0.0,0.0,0.000000,0.0,0.0,0.027027,0.000000,0.0,...,0.0,0.00,0.013514,0.00,0.0,0.0,0.0,0.0,0.0,0.000000
209,Yadavanahalli,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.0,...,0.0,0.00,0.000000,0.00,0.0,0.0,0.0,0.0,0.0,0.000000
210,Yelachenahalli,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.0,...,0.0,0.00,0.021505,0.00,0.0,0.0,0.0,0.0,0.0,0.000000
211,Yelahanka,0.0,0.0,0.0,0.032258,0.0,0.0,0.032258,0.032258,0.0,...,0.0,0.00,0.032258,0.00,0.0,0.0,0.0,0.0,0.0,0.000000


In [37]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [111]:
import numpy as np
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # beyond 1st/2nd/3rd, fall back to the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = t_grouped['Neighborhood']

for ind in np.arange(t_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(t_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Achitnagar,Recreation Center,Bakery,Asian Restaurant,Restaurant,Dry Cleaner,Eastern European Restaurant,Electronics Store,Event Service,Event Space,Yoga Studio
1,Adugodi,Indian Restaurant,Café,Pizza Place,Coffee Shop,Gym,Lounge,Chinese Restaurant,Clothing Store,Tea Room,Dessert Shop
2,Agram,Indian Restaurant,Hotel,Restaurant,Ice Cream Shop,Asian Restaurant,Pub,Clothing Store,Bar,Pizza Place,Café
3,Akkur,Fast Food Restaurant,Bus Station,Yoga Studio,Duty-free Shop,Flea Market,Fishing Spot,Financial or Legal Service,Field,Farmers Market,Farm
4,Alahalli,Food & Drink Shop,Indie Movie Theater,Duty-free Shop,Food,Flea Market,Fishing Spot,Financial or Legal Service,Field,Fast Food Restaurant,Farmers Market


In [112]:
kclusters = 20

# drop the name column so that only the venue-frequency features are clustered
x2 = t_grouped.drop(columns='Neighborhood')

kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(x2)

# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)


In [113]:
# attach cluster labels and top venues to the coordinates;
# neighborhoods without venue data get NaN values and are dropped
merged = df.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood').dropna()
merged.head()

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agram,12.958,77.6308,13.0,Indian Restaurant,Hotel,Restaurant,Ice Cream Shop,Asian Restaurant,Pub,Clothing Store,Bar,Pizza Place,Café
1,Amruthahalli,13.066513,77.596624,1.0,Indian Restaurant,Café,Ice Cream Shop,Fast Food Restaurant,Garden,Resort,Pizza Place,Lake,Light Rail Station,South Indian Restaurant
3,Banaswadi,13.014162,77.651854,1.0,Indian Restaurant,Café,Ice Cream Shop,Department Store,Pizza Place,Korean Restaurant,BBQ Joint,South Indian Restaurant,Kerala Restaurant,Steakhouse
4,Bellandur,12.9304,77.6784,13.0,Café,Indian Restaurant,Pizza Place,Coffee Shop,Fast Food Restaurant,Hotel,Ice Cream Shop,Lounge,Gym,Sports Bar
5,Bhattarahalli,13.0258,77.714279,13.0,Hotel,Café,Bar,Indian Restaurant,Korean Restaurant,Convenience Store,Construction & Landscaping,Pizza Place,Vegetarian / Vegan Restaurant,Forest


In [114]:
map_clusters = folium.Map(location=[df_blr_venues['Neighborhood Latitude'][0], df_blr_venues['Neighborhood Longitude'][0]], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(merged['Latitude'], merged['Longitude'], merged['Neighborhood'], merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters


In [116]:
merged.loc[merged['Neighborhood'].isin(['Fraser Town','Chandapura','Korati','Doddajala','Malur','Marasandra'])]

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,Fraser Town,12.997,77.6144,1.0,Indian Restaurant,Café,Tea Room,Middle Eastern Restaurant,Bakery,Pub,Ice Cream Shop,Shopping Mall,Coffee Shop,Fast Food Restaurant
54,Chandapura,12.8017,77.7116,1.0,Indian Restaurant,Asian Restaurant,Coffee Shop,Train Station,Yoga Studio,Farm,Electronics Store,Event Service,Event Space,Falafel Restaurant
165,Doddajala,13.176735,77.65205,13.0,Fast Food Restaurant,Hotel,Airport Terminal,Bike Shop,Farm,Toll Booth,Food Truck,Airport,Event Service,Food & Drink Shop
241,Korati,12.9716,77.5946,13.0,Indian Restaurant,Hotel,Lounge,Brewery,Pub,Café,Ice Cream Shop,Italian Restaurant,Sushi Restaurant,Japanese Restaurant
264,Malur,13.006034,77.938284,6.0,Train Station,ATM,Men's Store,Duty-free Shop,Flea Market,Fishing Spot,Financial or Legal Service,Field,Fast Food Restaurant,Farmers Market
274,Marasandra,12.980402,77.873983,6.0,Train Station,Yoga Studio,Duty-free Shop,Food,Flea Market,Fishing Spot,Financial or Legal Service,Field,Fast Food Restaurant,Farmers Market


## 4. Results
The table above highlights three key clusters: 1, 6, and 13. Cluster 1 appears to consist of hubs for long-term residents. People who have lived in Bengaluru for a long time tend to be in these hubs, which may be why Indian restaurants are the most common venue. These neighborhoods also contain many local venue types such as shopping malls, bakeries, and yoga studios. Cluster 13 comprises the travel-heavy neighborhoods. The neighborhoods surrounding the airport fall into this category, and the ones closer to the city are heavy on hotels, lounges, and international cuisine restaurants. Finally, cluster 6 has the train station as its primary venue. These neighborhoods are likely home to daily wage workers who ply their trade in flea markets, fishing spots, and fields.
The map, which renders only in the nbviewer version of this notebook, shows the clustering of neighborhoods across Bengaluru. As expected, many neighborhoods are in the city center, and the number of neighborhoods drops as one moves further from it. One notable exception is the area near the airport. Two primary clusters formed, shown in purple and green, and they seem to be spread across both the city center and the outskirts. Other clusters, shown in different shades of orange and red, seem to be separated and confined to the outskirts. Finally, there appears to be significant clustering of neighborhoods around the highways, highlighting their importance to any level of government.

## 5. Discussion

The primary focus of the government should be to direct funding accurately to the locations that need it, and the clusters provide a strong framework for doing so. Splitting Bengaluru into smaller sections purely by location would result in severe inefficiencies, as neighborhoods near each other aren't necessarily similar. Splitting by type allows the government to focus limited resources on the projects that can have the most impact. Furthermore, places that need the same kind of help can be assisted simultaneously.

Cluster 1 would benefit most from housing projects. These are places with shopping malls, bakeries, and yoga studios for people who have settled down in Bengaluru. As the population grows, housing prices will likely spike, so a proactive approach could aid future generations. Cluster 13 would benefit from infrastructure rebuilding. This cluster serves people who live in Bengaluru intermittently, and one would expect hotels to be the centerpiece of its neighborhoods. Connecting the metro from these neighborhoods to the airport could therefore be enormously beneficial. Cluster 6 would benefit from cheap transportation options, as its residents are daily wage earners working in flea markets and in fishing. More buses and subsidized ticket prices would improve the standard of living considerably. Agricultural governance is also key in these areas.

Finally, the map shows that neighborhoods grow out of Bengaluru along its highways. Petrol bunks would be necessary investments in such locations.

## 6. Conclusion

To govern Bengaluru effectively, the spread of its neighborhoods must be understood. First, data on Bengaluru neighborhoods were cleaned and outliers were corrected. Second, using calls to Foursquare, the most common venues surrounding each neighborhood were obtained. Third, using k-means clustering, the neighborhoods were divided into clusters. This provided insight into how nearby neighborhoods may need different approaches, which could help the government allocate funding in the most impactful manner. Furthermore, this kind of data could help in determining where the city's metro should be connected to the airport. Although intuition may suggest placing the connection right at the High Court, the data suggest placing it nearer to Domlur for smoother travel. A data-driven approach of this kind can support targeted investment in neighborhoods rather than a one-size-fits-all approach. Governing a city like Bengaluru is a difficult task, and data science may prove useful in achieving a successful outcome.

