### Contents in Notebook

    1.Installing & Importing Python libraries and dependencies
    2.Scraping data from webpage into a DataFrame
        a. Data Preprocessing
        b. Output as csv file ( .csv)
    3.Making a geo map of Bangalore
        a. Obtaining geographical coordinates for the pincodes.
        b. Making a map of the different pincodes.
    4.Define Foursquare Credentials and Version
        a.Top 100 venues that are within a radius of 500 meters for each post office
        b.Data Preprocessing
        c.Output as Prediction file ( .csv)
    5.Feature Engineering for the selected Business Problem
       a.Simplification
       b.Feature Selection
       c.Handling Categorical Data (One Hot Encoding)
    6.Clustering And Exploratory Visualization
    7.Examine Clusters
    8.Observations

### 1. Installing & importing Python libraries and dependencies

In [1]:
!pip install geocoder
!pip install folium

Collecting geocoder
  Downloading geocoder-1.38.1-py2.py3-none-any.whl (98 kB)
[K     |████████████████████████████████| 98 kB 9.9 MB/s  eta 0:00:01
Collecting ratelim
  Downloading ratelim-0.1.6-py2.py3-none-any.whl (4.0 kB)
Installing collected packages: ratelim, geocoder
Successfully installed geocoder-1.38.1 ratelim-0.1.6
Collecting folium
  Downloading folium-0.12.1-py2.py3-none-any.whl (94 kB)
[K     |████████████████████████████████| 94 kB 7.0 MB/s  eta 0:00:01
Collecting branca>=0.3.0
  Downloading branca-0.4.2-py3-none-any.whl (24 kB)
Installing collected packages: branca, folium
Successfully installed branca-0.4.2 folium-0.12.1


In [2]:
import pandas as pd
import requests
import numpy as np
import geocoder
import folium
import requests 
import matplotlib.cm as cm
import matplotlib.colors as colors
import json
import xml
import matplotlib.pyplot as plt
%matplotlib inline
import warnings
warnings.filterwarnings("ignore")

from pandas.io.json import json_normalize 
from sklearn.cluster import KMeans
from geopy.geocoders import Nominatim 
from bs4 import BeautifulSoup

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

print("All Required Libraries Imported!")

All Required Libraries Imported!


### 2.Scraping data from webpage into a Pandas DataFrame

#### a.Using pandas.read(url) to scrape the list of postal codes. 

Link: https://www.onlinebangalore.com/guide/pincodes/pincode.html

In [3]:
url = "https://www.mapsofindia.com/pincode/india/karnataka/bangalore/"

In [4]:
bng=pd.read_html(url)
bng=bng[0] #Setting the first table as bng.
new_header = bng.iloc[0] #grab the first row for the header
bng = bng[1:] #take the data less the header row
bng.columns = new_header #set the header row as the df header
bng #taking a look at the table

Unnamed: 0,Location,Pincode,State,District
1,A F station yelahanka,560063,Karnataka,Bangalore
2,Adugodi,560030,Karnataka,Bangalore
3,Agara,560034,Karnataka,Bangalore
4,Agram,560007,Karnataka,Bangalore
5,Air Force hospital,560007,Karnataka,Bangalore
6,Amruthahalli,560092,Karnataka,Bangalore
7,Anandnagar,560024,Karnataka,Bangalore
8,Anekal,562106,Karnataka,Bangalore
9,Anekalbazar,562106,Karnataka,Bangalore
10,Arabic College,560045,Karnataka,Bangalore


In [5]:
#bng.groupby('Pincode')

#bng=bng.groupby(['Location','Pincode'])

In [6]:
bng.dtypes

0
Location    object
Pincode     object
State       object
District    object
dtype: object

In [7]:
bng.columns

Index(['Location', 'Pincode', 'State', 'District'], dtype='object', name=0)

In [8]:
bng

Unnamed: 0,Location,Pincode,State,District
1,A F station yelahanka,560063,Karnataka,Bangalore
2,Adugodi,560030,Karnataka,Bangalore
3,Agara,560034,Karnataka,Bangalore
4,Agram,560007,Karnataka,Bangalore
5,Air Force hospital,560007,Karnataka,Bangalore
6,Amruthahalli,560092,Karnataka,Bangalore
7,Anandnagar,560024,Karnataka,Bangalore
8,Anekal,562106,Karnataka,Bangalore
9,Anekalbazar,562106,Karnataka,Bangalore
10,Arabic College,560045,Karnataka,Bangalore


#### b. Output as csv file ( .csv)

In [9]:
bng.to_csv('bngpostcodes.csv')

### 3. Making a map of Bangalore

#### a. Getting Latlong coordinates

In [10]:
bng["Latitude"] = ""
bng["Longitude"] = ""
bng.shape

(294, 6)

In [11]:
def get_latlong(pin_code):
    lat_long_coords = None
    while(lat_long_coords is None):
        g = geocoder.arcgis('{}, Bangalore,Karnataka'.format(pin_code))
        lat_long_coords = g.latlng
    return lat_long_coords
    
get_latlong('560063')

[13.129065000000026, 77.61235192300006]

In [12]:
# Retrieving Postal Code Co-ordinates
pin_codes = bng['Pincode']    
coords = [ get_latlong(pin_code) for pin_code in pin_codes.tolist() ]

In [13]:
# Adding Columns Latitude & Longitude
df_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])
bng['Latitude'] = df_coords['Latitude']
bng['Longitude'] = df_coords['Longitude']

In [14]:
bng=bng.dropna(subset=['Longitude'])

bng=bng.dropna(subset=['Latitude'])

In [15]:
bng[bng.Pincode == '560024']

Unnamed: 0,Location,Pincode,State,District,Latitude,Longitude
7,Anandnagar,560024,Karnataka,Bangalore,12.713686,77.683715
96,H.A. farm,560024,Karnataka,Bangalore,12.97303,77.627446
103,Hebbal Kempapura,560024,Karnataka,Bangalore,12.713686,77.683715


In [16]:
# save the DataFrame as CSV file
bng.to_csv("bng_checkpoint.csv", index=False)

#### b. Making the map

In [17]:
# get the coordinates of Bangalore
address = 'Bangalore, Karnataka, India'

geolocator = Nominatim(user_agent="http")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Bangalore, India {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Bangalore, India 12.9791198, 77.5912997.


In [18]:
# create map of Bangalore using latitude and longitude values
bng_map = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng in zip(bng['Latitude'], bng['Longitude']):
    label = '{}'.format('Location')
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='teal',
        fill=True,
        fill_color='#F4C430',
        fill_opacity=0.7).add_to(bng_map)  
    
bng_map

In [19]:
# save the map as HTML file
bng_map.save('bng_map.html')

### 4. Defining Foursquare Credentials and Version

Credentials input below

In [20]:
# The code was removed by Watson Studio for sharing.

Your credentials:
CLIENT_ID: Hidden
CLIENT_SECRET: Hidden


#### Top 100 venues that are within a radius of 500 meters.

In [21]:
radius = 500
LIMIT = 100

venues = []

for lat, long, loc in zip(bng['Latitude'], bng['Longitude'], bng['Location']):
    
    # create the API request URL
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        LIMIT)
    
    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    # return only relevant information for each nearby venue
    for venue in results:
        venues.append((
            loc,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))

In [22]:
# convert the venues list into a new DataFrame
venues_df = pd.DataFrame(venues)

# define the column names
venues_df.columns = ['Location', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venues_df.shape)
venues_df.head()

(2620, 7)


Unnamed: 0,Location,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,A F station yelahanka,12.943915,77.60671,audugodi,12.942543,77.607353,Bus Station
1,A F station yelahanka,12.943915,77.60671,Bharathi Refreshments(South Indian Food) - Adu...,12.943388,77.60784,Fast Food Restaurant
2,A F station yelahanka,12.943915,77.60671,Stoneart,12.941271,77.608701,Design Studio
3,A F station yelahanka,12.943915,77.60671,adigas,12.940589,77.60878,Indian Restaurant
4,A F station yelahanka,12.943915,77.60671,Salt N Pepper,12.944671,77.602664,Restaurant


### 5.Feature Engineering for the selected Business Problem

#### a.Simplification

In [23]:
venues_df.groupby(["Location"]).count()

Unnamed: 0_level_0,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
Location,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
A F station yelahanka,6,6,6,6,6,6
Adugodi,20,20,20,20,20,20
Air Force hospital,11,11,11,11,11,11
Amruthahalli,12,12,12,12,12,12
Anekalbazar,2,2,2,2,2,2
Arabic College,14,14,14,14,14,14
Aranya Bhavan,1,1,1,1,1,1
Attur,4,4,4,4,4,4
Austin Town,4,4,4,4,4,4
Avalahalli,9,9,9,9,9,9


In [24]:
print('There are {} uniques categories.'.format(len(venues_df['VenueCategory'].unique())))

There are 156 uniques categories.


In [25]:
# print out the list of categories
venues_df['VenueCategory'].unique()[:50]

array(['Bus Station', 'Fast Food Restaurant', 'Design Studio',
       'Indian Restaurant', 'Restaurant', 'Furniture / Home Store',
       'Ice Cream Shop', 'Burger Joint', 'Bar', 'Italian Restaurant',
       'Bakery', 'Grocery Store', 'Snack Place', 'Beer Garden', 'Café',
       'Brewery', 'Gastropub', 'Department Store',
       'Hyderabadi Restaurant', 'Bubble Tea Shop', 'Pizza Place',
       'Chinese Restaurant', 'Electronics Store', 'Liquor Store',
       'Food Court', 'Pharmacy', 'Falafel Restaurant', 'ATM',
       'South Indian Restaurant', 'Donut Shop', 'Coffee Shop',
       'Flea Market', 'Vegetarian / Vegan Restaurant',
       'American Restaurant', 'Camera Store', 'Football Stadium',
       'Indie Movie Theater', 'Historic Site', 'Food Truck', 'Bus Line',
       'Park', 'Breakfast Spot', 'Asian Restaurant', 'Miscellaneous Shop',
       'Plaza', 'Diner', 'Jewelry Store', 'Arts & Crafts Store',
       'Seafood Restaurant', 'Hobby Shop'], dtype=object)

In [26]:
# check if the results contain "Shopping Mall"
"Coffee Shop" in venues_df['VenueCategory'].unique()

True

In [27]:
venues_df.to_csv('venues.csv')

#### Analyzing Each Neighborhood using One-Hot encoding

In [28]:
# one hot encoding
bng_onehot = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
bng_onehot['Location'] = venues_df['Location'] 

# move neighborhood column to the first column
fixed_columns = [bng_onehot.columns[-1]] + list(bng_onehot.columns[:-1])
bng_onehot =bng_onehot[fixed_columns]

print(bng_onehot.shape)
bng_onehot.head()

(2620, 157)


Unnamed: 0,Location,ATM,Accessories Store,Afghan Restaurant,American Restaurant,Andhra Restaurant,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Bed & Breakfast,Beer Garden,Big Box Store,Bike Shop,Bistro,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Bus Line,Bus Station,Bus Stop,Business Service,Cafeteria,Café,Camera Store,Candy Store,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cosmetics Shop,Cricket Ground,Cupcake Shop,Department Store,Design Studio,Dessert Shop,Diner,Donut Shop,Dumpling Restaurant,Electronics Store,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Financial or Legal Service,Fish & Chips Shop,Flea Market,Food,Food & Drink Shop,Food Court,Food Stand,Food Truck,Football Stadium,Furniture / Home Store,Gastropub,Golf Course,Grocery Store,Gym,Gym Pool,Historic Site,Hobby Shop,Hookah Bar,Hotel,Hotel Bar,Hyderabadi Restaurant,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indian Sweet Shop,Indie Movie Theater,Intersection,Irish Pub,Italian Restaurant,Jewelry Store,Juice Bar,Karnataka Restaurant,Kerala Restaurant,Korean Restaurant,Lake,Light Rail Station,Liquor Store,Lounge,Market,Mattress Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Monument / Landmark,Motorcycle Shop,Movie Theater,Moving Target,Multiplex,Music Store,Nightclub,North Indian Restaurant,Paper / Office Supplies Store,Park,Pharmacy,Pizza Place,Platform,Playground,Plaza,Pub,Punjabi Restaurant,Rajasthani Restaurant,Recreation Center,Rest Area,Restaurant,Rock Climbing Spot,Salad Place,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,South Indian Restaurant,Sporting Goods Shop,Stadium,Steakhouse,Supermarket,Tea Room,Tennis Court,Thai Restaurant,Theater,Tibetan Restaurant,Toy / Game Store,Trail,Udupi Restaurant,Vegetarian / Vegan Restaurant,Wine Bar,Wine Shop,Women's Store
0,A F station yelahanka,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,A F station yelahanka,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,A F station yelahanka,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,A F station yelahanka,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,A F station yelahanka,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [29]:
bng_onehot.describe()

Unnamed: 0,ATM,Accessories Store,Afghan Restaurant,American Restaurant,Andhra Restaurant,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Bed & Breakfast,Beer Garden,Big Box Store,Bike Shop,Bistro,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Bus Line,Bus Station,Bus Stop,Business Service,Cafeteria,Café,Camera Store,Candy Store,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cosmetics Shop,Cricket Ground,Cupcake Shop,Department Store,Design Studio,Dessert Shop,Diner,Donut Shop,Dumpling Restaurant,Electronics Store,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Financial or Legal Service,Fish & Chips Shop,Flea Market,Food,Food & Drink Shop,Food Court,Food Stand,Food Truck,Football Stadium,Furniture / Home Store,Gastropub,Golf Course,Grocery Store,Gym,Gym Pool,Historic Site,Hobby Shop,Hookah Bar,Hotel,Hotel Bar,Hyderabadi Restaurant,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indian Sweet Shop,Indie Movie Theater,Intersection,Irish Pub,Italian Restaurant,Jewelry Store,Juice Bar,Karnataka Restaurant,Kerala Restaurant,Korean Restaurant,Lake,Light Rail Station,Liquor Store,Lounge,Market,Mattress Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Monument / Landmark,Motorcycle Shop,Movie Theater,Moving Target,Multiplex,Music Store,Nightclub,North Indian Restaurant,Paper / Office Supplies Store,Park,Pharmacy,Pizza Place,Platform,Playground,Plaza,Pub,Punjabi Restaurant,Rajasthani Restaurant,Recreation Center,Rest Area,Restaurant,Rock Climbing Spot,Salad Place,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,South Indian Restaurant,Sporting Goods Shop,Stadium,Steakhouse,Supermarket,Tea Room,Tennis Court,Thai Restaurant,Theater,Tibetan Restaurant,Toy / Game Store,Trail,Udupi Restaurant,Vegetarian / Vegan Restaurant,Wine Bar,Wine Shop,Women's Store
count,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0,2620.0
mean,0.005725,0.001145,0.004198,0.008397,0.003435,0.000763,0.006107,0.008015,0.010305,0.00229,0.001145,0.005725,0.000382,0.026718,0.001527,0.016031,0.000382,0.002672,0.00229,0.001145,0.00229,0.003053,0.001145,0.00687,0.00229,0.000763,0.004198,0.00916,0.001145,0.008397,0.001145,0.016412,0.001145,0.000382,0.000382,0.05687,0.00229,0.004198,0.000763,0.017939,0.025573,0.001527,0.017176,0.000382,0.001145,0.00229,0.001527,0.000763,0.017939,0.000763,0.011069,0.010687,0.012595,0.001527,0.008779,0.000763,0.001908,0.00458,0.04084,0.004198,0.000382,0.004198,0.000382,0.000763,0.002672,0.000763,0.005725,0.000763,0.005344,0.003435,0.000382,0.003053,0.007634,0.001145,0.004198,0.000763,0.00229,0.020992,0.000382,0.001908,0.034351,0.001527,0.174427,0.002672,0.006489,0.002672,0.000382,0.004198,0.011832,0.009924,0.002672,0.003053,0.00458,0.000763,0.003817,0.001527,0.014885,0.00458,0.001145,0.005344,0.009542,0.003053,0.001527,0.001527,0.008015,0.004198,0.000382,0.00229,0.003435,0.001145,0.001145,0.005725,0.000382,0.001145,0.003435,0.00916,0.004198,0.022519,0.000763,0.00229,0.008015,0.014122,0.000763,0.000382,0.001527,0.000763,0.012977,0.000763,0.001908,0.005725,0.001527,0.003817,0.004198,0.000382,0.006107,0.003817,0.009542,0.000382,0.001527,0.015267,0.001145,0.000763,0.00458,0.001145,0.003053,0.001908,0.001527,0.001145,0.001527,0.001527,0.001527,0.001908,0.010687,0.001145,0.000382,0.013359
std,0.075462,0.033825,0.064672,0.091267,0.05852,0.027624,0.077922,0.089186,0.10101,0.047809,0.033825,0.075462,0.019537,0.161288,0.039051,0.125617,0.019537,0.05163,0.047809,0.033825,0.047809,0.055184,0.033825,0.082617,0.047809,0.027624,0.064672,0.095288,0.033825,0.091267,0.033825,0.127079,0.033825,0.019537,0.019537,0.231639,0.047809,0.064672,0.027624,0.132755,0.157886,0.039051,0.12995,0.019537,0.033825,0.047809,0.039051,0.027624,0.132755,0.027624,0.104644,0.102844,0.111542,0.039051,0.0933,0.027624,0.043652,0.067535,0.197956,0.064672,0.019537,0.064672,0.019537,0.027624,0.05163,0.027624,0.075462,0.027624,0.072918,0.05852,0.019537,0.055184,0.087053,0.033825,0.064672,0.027624,0.047809,0.143386,0.019537,0.043652,0.182164,0.039051,0.379549,0.05163,0.080305,0.05163,0.019537,0.064672,0.10815,0.099141,0.05163,0.055184,0.067535,0.027624,0.061674,0.039051,0.121118,0.067535,0.033825,0.072918,0.097234,0.055184,0.039051,0.039051,0.089186,0.064672,0.019537,0.047809,0.05852,0.033825,0.033825,0.075462,0.019537,0.033825,0.05852,0.095288,0.064672,0.148393,0.027624,0.047809,0.089186,0.118017,0.027624,0.019537,0.039051,0.027624,0.113197,0.027624,0.043652,0.075462,0.039051,0.061674,0.064672,0.019537,0.077922,0.061674,0.097234,0.019537,0.039051,0.122637,0.033825,0.027624,0.067535,0.033825,0.055184,0.043652,0.039051,0.033825,0.039051,0.039051,0.039051,0.043652,0.102844,0.033825,0.019537,0.114827
min,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
50%,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
75%,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
max,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


In [30]:
bng_grouped = bng_onehot.groupby(["Location"]).mean().reset_index()

print(bng_grouped.shape)
bng_grouped

(230, 157)


Unnamed: 0,Location,ATM,Accessories Store,Afghan Restaurant,American Restaurant,Andhra Restaurant,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Bed & Breakfast,Beer Garden,Big Box Store,Bike Shop,Bistro,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Bus Line,Bus Station,Bus Stop,Business Service,Cafeteria,Café,Camera Store,Candy Store,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cosmetics Shop,Cricket Ground,Cupcake Shop,Department Store,Design Studio,Dessert Shop,Diner,Donut Shop,Dumpling Restaurant,Electronics Store,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Financial or Legal Service,Fish & Chips Shop,Flea Market,Food,Food & Drink Shop,Food Court,Food Stand,Food Truck,Football Stadium,Furniture / Home Store,Gastropub,Golf Course,Grocery Store,Gym,Gym Pool,Historic Site,Hobby Shop,Hookah Bar,Hotel,Hotel Bar,Hyderabadi Restaurant,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indian Sweet Shop,Indie Movie Theater,Intersection,Irish Pub,Italian Restaurant,Jewelry Store,Juice Bar,Karnataka Restaurant,Kerala Restaurant,Korean Restaurant,Lake,Light Rail Station,Liquor Store,Lounge,Market,Mattress Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Monument / Landmark,Motorcycle Shop,Movie Theater,Moving Target,Multiplex,Music Store,Nightclub,North Indian Restaurant,Paper / Office Supplies Store,Park,Pharmacy,Pizza Place,Platform,Playground,Plaza,Pub,Punjabi Restaurant,Rajasthani Restaurant,Recreation Center,Rest Area,Restaurant,Rock Climbing Spot,Salad Place,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,South Indian Restaurant,Sporting Goods Shop,Stadium,Steakhouse,Supermarket,Tea Room,Tennis Court,Thai Restaurant,Theater,Tibetan Restaurant,Toy / Game Store,Trail,Udupi Restaurant,Vegetarian / Vegan Restaurant,Wine Bar,Wine Shop,Women's Store
0,A F station yelahanka,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Adugodi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.05,0.0,0.25,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Air Force hospital,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.272727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Amruthahalli,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Anekalbazar,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Arabic College,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.071429,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0
6,Aranya Bhavan,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Attur,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Austin Town,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Avalahalli,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [31]:
len(bng_grouped[bng_grouped["Coffee Shop"] > 0])

31

Create a new DataFrame for Shopping Mall data only

In [32]:
bng_mall = bng_grouped[["Location","Coffee Shop"]]

In [33]:
bng_mall.head()

Unnamed: 0,Location,Coffee Shop
0,A F station yelahanka,0.0
1,Adugodi,0.0
2,Air Force hospital,0.0
3,Amruthahalli,0.0
4,Anekalbazar,0.0


7. Cluster Neighborhoods

Run k-means to cluster the neighborhoods in Kuala Lumpur into 3 clusters.

In [34]:
# set number of clusters
bngclusters = 3

bng_clustering = bng_mall.drop(["Location"], 1)

# run k-means clustering
kmeans = KMeans(n_clusters=bngclusters, random_state=0).fit(bng_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 0, 0, 0, 0, 2, 0, 0, 0, 0], dtype=int32)

In [35]:
# create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.
bng_merged = bng_mall.copy()

# add clustering labels
bng_merged["Cluster Labels"] = kmeans.labels_

In [36]:
bng_merged.rename(columns={"Location": "Location"}, inplace=True)
bng_merged.head()

Unnamed: 0,Location,Coffee Shop,Cluster Labels
0,A F station yelahanka,0.0,0
1,Adugodi,0.0,0
2,Air Force hospital,0.0,0
3,Amruthahalli,0.0,0
4,Anekalbazar,0.0,0


In [38]:
# merge Bangalore_grouped with Bangalore_data to add latitude/longitude for each neighborhood
bng_merged = bng_merged.join(bng.set_index("Location"), on="Location")

print(bng_merged.shape)
bng_merged.head() # check the last columns!

(230, 8)


Unnamed: 0,Location,Coffee Shop,Cluster Labels,Pincode,State,District,Latitude,Longitude
0,A F station yelahanka,0.0,0,560063,Karnataka,Bangalore,12.943915,77.60671
1,Adugodi,0.0,0,560030,Karnataka,Bangalore,12.931355,77.633979
2,Air Force hospital,0.0,0,560007,Karnataka,Bangalore,13.06344,77.593097
3,Amruthahalli,0.0,0,560092,Karnataka,Bangalore,13.04788,77.595907
4,Anekalbazar,0.0,0,562106,Karnataka,Bangalore,13.012302,77.611605


In [39]:
# sort the results by Cluster Labels
print(bng_merged.shape)
bng_merged.sort_values(["Cluster Labels"], inplace=True)
bng_merged

(230, 8)


Unnamed: 0,Location,Coffee Shop,Cluster Labels,Pincode,State,District,Latitude,Longitude
0,A F station yelahanka,0.0,0,560063,Karnataka,Bangalore,12.943915,77.60671
142,Marsur,0.0,0,562106,Karnataka,Bangalore,12.99911,77.63658
143,Maruthi Sevanagar,0.0,0,560033,Karnataka,Bangalore,13.033115,77.561116
144,Mathikere,0.0,0,560054,Karnataka,Bangalore,12.945664,77.575075
145,Medimallasandra,0.0,0,560067,Karnataka,Bangalore,12.88507,77.604894
146,Mico Layout,0.0,0,560076,Karnataka,Bangalore,13.006596,77.56235
147,Milk Colony,0.0,0,560055,Karnataka,Bangalore,12.88507,77.604894
148,Mount St joseph,0.0,0,560076,Karnataka,Bangalore,13.033115,77.561116
149,Mundur,0.026667,0,560049,Karnataka,Bangalore,12.97078,77.610376
150,Muthanallur,0.0,0,560099,Karnataka,Bangalore,12.938065,77.744277


Finally, let's visualize the resulting clusters

In [40]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(bngclusters)
ys = [i+x+(i*x)**2 for i in range(bngclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(bng_merged['Latitude'], bng_merged['Longitude'], bng_merged['Location'], bng_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [None]:
# save the map as HTML file
map_clusters.save('map_clusters.html')

### 10. Examining Clusters

Cluster 0

In [41]:
bng_merged.loc[bng_merged['Cluster Labels'] == 0]

Unnamed: 0,Location,Coffee Shop,Cluster Labels,Pincode,State,District,Latitude,Longitude
0,A F station yelahanka,0.0,0,560063,Karnataka,Bangalore,12.943915,77.60671
142,Marsur,0.0,0,562106,Karnataka,Bangalore,12.99911,77.63658
143,Maruthi Sevanagar,0.0,0,560033,Karnataka,Bangalore,13.033115,77.561116
144,Mathikere,0.0,0,560054,Karnataka,Bangalore,12.945664,77.575075
145,Medimallasandra,0.0,0,560067,Karnataka,Bangalore,12.88507,77.604894
146,Mico Layout,0.0,0,560076,Karnataka,Bangalore,13.006596,77.56235
147,Milk Colony,0.0,0,560055,Karnataka,Bangalore,12.88507,77.604894
148,Mount St joseph,0.0,0,560076,Karnataka,Bangalore,13.033115,77.561116
149,Mundur,0.026667,0,560049,Karnataka,Bangalore,12.97078,77.610376
150,Muthanallur,0.0,0,560099,Karnataka,Bangalore,12.938065,77.744277


Cluster 1

In [42]:
bng_merged.loc[bng_merged['Cluster Labels'] == 1]

Unnamed: 0,Location,Coffee Shop,Cluster Labels,Pincode,State,District,Latitude,Longitude
191,Singanayakanahalli,0.2,1,560064,Karnataka,Bangalore,12.98341,77.622785
220,Viswaneedam,0.166667,1,560091,Karnataka,Bangalore,12.966475,77.56571
215,Vijayanagar,0.2,1,560040,Karnataka,Bangalore,12.967861,77.53687
214,Vidyaranyapura,0.2,1,560097,Karnataka,Bangalore,12.967861,77.53687
40,Cahmrajendrapet,0.166667,1,560002,Karnataka,Bangalore,12.966475,77.56571
42,Chamrajpet Bazar,0.2,1,560018,Karnataka,Bangalore,12.967861,77.53687
41,Chamrajpet,0.166667,1,560018,Karnataka,Bangalore,12.966475,77.56571


Cluster 2

In [43]:
bng_merged.loc[bng_merged['Cluster Labels'] == 2]

Unnamed: 0,Location,Coffee Shop,Cluster Labels,Pincode,State,District,Latitude,Longitude
222,Viveknagar,0.142857,2,560047,Karnataka,Bangalore,13.003656,77.569745
5,Arabic College,0.142857,2,560045,Karnataka,Bangalore,13.003656,77.569745
165,Padmanabhnagar,0.142857,2,560070,Karnataka,Bangalore,13.003656,77.569745
115,Kanteeravanagar,0.125,2,560096,Karnataka,Bangalore,12.93168,77.542804
34,Bnagalore Viswavidalaya,0.142857,2,560056,Karnataka,Bangalore,12.89295,77.641675
14,Banashankari,0.125,2,560050,Karnataka,Bangalore,12.93168,77.542804
79,Highcourt,0.142857,2,560001,Karnataka,Bangalore,12.89295,77.641675
92,Industrial Estate,0.125,2,560010,Karnataka,Bangalore,12.93168,77.542804
200,Subramanyapura,0.142857,2,560061,Karnataka,Bangalore,13.003656,77.569745
132,Madhavan Park,0.142857,2,560011,Karnataka,Bangalore,12.89295,77.641675


### 11. Observations:

Most of the Coffee Shops are concentrated in the a few parts of Bangalore city, with the highest number in cluster 1 and a moderate number in cluster 2.

On the other hand, cluster 0 has very few to no coffee shops in the neighborhoods. 

This represents a great opportunity and high potential areas to open new Coffee Shops as there is very little to no competition from existing coffee shops. 

Meanwhile, coffee shops in cluster 1 are likely suffering from intense competition due to oversupply and high concentration of coffee Shops. 

From another perspective, this also shows that the oversupply of coffee shops mostly happened in only a few parts of the city, and most of the city is open for business.

Therefore, this project recommends entrepreneurs to capitalize on these findings to open new coffee shops in neighborhoods in cluster 0 with little to no competition.

Entrepreneurs with unique selling propositions can stand out from the competition and open new coffee shops in neighborhoods in cluster 2 with moderate competition.

Lastly, entrepreneurs are advised to avoid neighborhoods in cluster 1 which already have a high concentration of coffee shops and are likely suffering from intense competition.

Project by Advait Sawant (10/02/2021)