# <h1><b>Battle of Neighborhoods - A Coursera Capstone project (Week 2) </b></h1>

<!-- ## Table of contents
* [Introduction](#introduction)
* [Data](#data)
* [Data_Sources](#data sources)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion) -->

## **Introduction** <a name="introduction"></a>

Arizona and Hawaii, two states of the United States of America, both host state-of-the-art telescopes and observatories. As a grad student who wants to pursue a PhD in astronomy, I find these giant visible-light telescopes fascinating. For this capstone project, I'll explore the neighborhoods of Hawaii and Arizona and cluster them to show the similarities and dissimilarities between the two places.

Hopefully, **students, faculty members and job seekers who want to work at the observatories in either place will benefit from this project, as it can help them decide whether to live near the observatory or commute from farther away, based on their preferences.**

## <b>Data</b><a name="data"></a>

### Description of the data that will be used in this project-

 - **List of neighborhoods in Hawaii with their latitudes and longitudes**. The types of venues have also been extracted so that segmentation and clustering are easier.
 - **List of neighborhoods in Arizona with their latitudes and longitudes** and similarly the types of venues have also been extracted.
 - **Location data from the Foursquare API to segment and cluster the neighborhoods.**

## **Data Sources** <a name="data sources"></a>

In this project, **Keck Observatory** in **Hawaii** and **Steward Observatory** at the University of Arizona in **Arizona** will be used as the centers, around which the neighborhoods will be explored.

- **Geodata for Hawaii, extracted from a GeoJSON file from the NYU Spatial Data Repository by loading the .json file.**
- **Geodata for Arizona, extracted from a GeoJSON file from the NYU Spatial Data Repository by loading the .json file.**

When downloading a GeoJSON file, it is important to check that it contains "Point"/"MultiPoint" features. With "Polygon" features, extracting the names of the neighborhoods and the types of venues becomes much more complicated.
 - **Geocode information from Geopy.** 
 - **Location data from the Foursquare API to segment and cluster the neighborhoods. For this part, the CLIENT_ID and the CLIENT_SECRET are required. A RADIUS must also be specified: the distance around the given location within which the neighborhood is explored.**
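The geometry-type check mentioned above can be sketched with the standard library alone. The helper name `geometry_type_counts` and the tiny sample below are hypothetical, written only for illustration; the real files are far larger.

```python
import json
from collections import Counter

def geometry_type_counts(feature_collection):
    """Count the GeoJSON geometry types found in a FeatureCollection dict."""
    return Counter(
        (feature.get("geometry") or {}).get("type", "None")
        for feature in feature_collection.get("features", [])
    )

# Hypothetical three-feature sample shaped like a GeoJSON FeatureCollection.
sample = json.loads("""
{"type": "FeatureCollection",
 "features": [
   {"type": "Feature",
    "geometry": {"type": "MultiPoint", "coordinates": [[-156.0, 19.6]]},
    "properties": {"NAME": "A"}},
   {"type": "Feature",
    "geometry": {"type": "MultiPoint", "coordinates": [[-156.1, 19.7]]},
    "properties": {"NAME": "B"}},
   {"type": "Feature",
    "geometry": {"type": "Polygon",
                 "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]},
    "properties": {"NAME": "C"}}
 ]}
""")

counts = geometry_type_counts(sample)
# True only when every feature is a Point or MultiPoint.
only_points = set(counts) <= {"Point", "MultiPoint"}
print(counts, only_points)
```

Running this on a downloaded file before building the dataframe shows early whether the simple "first coordinate pair" extraction used in this notebook applies.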

Now we'll gather the data required to explore the neighborhoods around the mentioned centers.

## **Methodology**

**As the 1st step**, we'll import the libraries needed for the analysis. Then we will load the GeoJSON (.json) file for Hawaii and build a dataframe from it with the required features, i.e. "Neighborhood", "Latitude" and "Longitude". After cleaning the dataframe, we create a map of Hawaii with Folium and superimpose the neighborhoods on it.

We then do the same for Arizona's Steward Observatory. We load the GeoJSON file and build a dataframe from it with the required features, i.e. "Neighborhood", "Latitude" and "Longitude". Then we create a map of Arizona with Folium and superimpose the markers on it.

**In the 2nd step**, we explore the neighborhoods around the mentioned centers in both Hawaii and Arizona using the Foursquare API's *venues/explore* endpoint. This requires our Foursquare CLIENT_ID and CLIENT_SECRET. We fetch the JSON response with the **requests** library, explore the neighborhoods for venues, and get the category of every venue by defining **get_category_type**. After that, we find the most common venues around each center with a dedicated function.

**In the 3rd step**, we will segment and cluster the neighborhoods around both centers in Hawaii and Arizona. The clusters will be shown on the maps of Hawaii and Arizona. For this step, we will use **K-means clustering** from the *scikit-learn* package.


**After clustering, we will analyze the neighborhoods of each place and then suggest a solution to the problem stated in the introduction.**



#### **Importing required libraries-**

In [87]:
import  numpy as np

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON into a pandas dataframe (pandas >= 1.0; the pandas.io.json import path is deprecated)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

## **Analysis**

As we need to explore the neighborhoods around Hawaii's Keck Observatory (on Mauna Kea) and Arizona's Steward Observatory, we need the geodata of these places as GeoJSON files. These files will be loaded from **Hawaii.json** for Hawaii and **Arizona.json** for Arizona.

#### **Exploring Hawaii's data-**

First, we will be exploring the neighborhoods around Hawaii's Keck observatory.

To do that, we need to load the **Hawaii.json** file and import it as **hawaii_data**.

In [88]:
with open('Hawaii.json') as json_data:
    hawaii_data = json.load(json_data)

Now, we'll take a look at **hawaii_data**.

In [5]:
hawaii_data

{'type': 'FeatureCollection',
 'totalFeatures': 936,
 'features': [{'type': 'Feature',
   'id': 'TG00HILPT.1',
   'geometry': {'type': 'MultiPoint',
    'coordinates': [[-156.036222, 19.782857]]},
   'geometry_name': 'the_geom',
   'properties': {'GIST_ID': 1, 'CFCC': 'D82', 'NAME': ''}},
  {'type': 'Feature',
   'id': 'TG00HILPT.2',
   'geometry': {'type': 'MultiPoint',
    'coordinates': [[-156.003466, 19.64056]]},
   'geometry_name': 'the_geom',
   'properties': {'GIST_ID': 2,
    'CFCC': 'D71',
    'NAME': 'Kukailimoku Point Lighthouse'}},
  {'type': 'Feature',
   'id': 'TG00HILPT.3',
   'geometry': {'type': 'MultiPoint',
    'coordinates': [[-156.011014, 19.64701401]]},
   'geometry_name': 'the_geom',
   'properties': {'GIST_ID': 3, 'CFCC': 'D51', 'NAME': 'Old Kona Airport'}},
  {'type': 'Feature',
   'id': 'TG00HILPT.4',
   'geometry': {'type': 'MultiPoint',
    'coordinates': [[-156.043785, 19.73810299]]},
   'geometry_name': 'the_geom',
   'properties': {'GIST_ID': 4,
    'CFCC

After loading the data, we can see that all the relevant data is under the *features* key, which is essentially a list of the neighborhoods. So, we define a new variable that holds this data.

In [6]:
neighborhoods_data = hawaii_data['features']

As we have the data of neighborhoods stored as *neighborhoods_data*, we will have a look at the data.

In [8]:
neighborhoods_data

[{'type': 'Feature',
  'id': 'TG00HILPT.1',
  'geometry': {'type': 'MultiPoint',
   'coordinates': [[-156.036222, 19.782857]]},
  'geometry_name': 'the_geom',
  'properties': {'GIST_ID': 1, 'CFCC': 'D82', 'NAME': ''}},
 {'type': 'Feature',
  'id': 'TG00HILPT.2',
  'geometry': {'type': 'MultiPoint', 'coordinates': [[-156.003466, 19.64056]]},
  'geometry_name': 'the_geom',
  'properties': {'GIST_ID': 2,
   'CFCC': 'D71',
   'NAME': 'Kukailimoku Point Lighthouse'}},
 {'type': 'Feature',
  'id': 'TG00HILPT.3',
  'geometry': {'type': 'MultiPoint',
   'coordinates': [[-156.011014, 19.64701401]]},
  'geometry_name': 'the_geom',
  'properties': {'GIST_ID': 3, 'CFCC': 'D51', 'NAME': 'Old Kona Airport'}},
 {'type': 'Feature',
  'id': 'TG00HILPT.4',
  'geometry': {'type': 'MultiPoint',
   'coordinates': [[-156.043785, 19.73810299]]},
  'geometry_name': 'the_geom',
  'properties': {'GIST_ID': 4,
   'CFCC': 'D85',
   'NAME': 'Ellison S Onizuka Space Center'}},
 {'type': 'Feature',
  'id': 'TG00HILP

To inspect the data, we first need to know what it looks like and what its features are.

In [89]:
neighborhoods_data[1]

{'type': 'Feature',
 'id': 'TG00HILPT.2',
 'geometry': {'type': 'MultiPoint', 'coordinates': [[-156.003466, 19.64056]]},
 'geometry_name': 'the_geom',
 'properties': {'GIST_ID': 2,
  'CFCC': 'D71',
  'NAME': 'Kukailimoku Point Lighthouse'}}

So, we see that each entry contains the name of the neighborhood and its coordinates (stored as longitude, then latitude).

We need to convert the features data into a dataframe so that we can do exploratory analysis on it.

We have all the essential features such as **"Neighborhood", "Latitude" and "Longitude"**. So we create an empty dataframe and then include all the features as column names and then include the data from **neighborhoods_data**.

In [90]:
column_names = ['Neighborhood', 'Latitude', 'Longitude'] 

neighborhoods = pd.DataFrame(columns=column_names) # An empty dataframe created.
neighborhoods

Unnamed: 0,Neighborhood,Latitude,Longitude


In [91]:
for data in neighborhoods_data:
    neighborhood_name = data['properties']['NAME']
        
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[0][1]
    neighborhood_lon = neighborhood_latlon[0][0]
    
    neighborhoods = neighborhoods.append({'Neighborhood': neighborhood_name,
                                          'Latitude': neighborhood_lat,
                                          'Longitude': neighborhood_lon}, ignore_index=True)
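`DataFrame.append` was removed in pandas 2.0, so the loop above only runs on older pandas versions. A minimal sketch of an alternative, assuming the same feature layout as **neighborhoods_data**, collects the rows in a list and builds the dataframe once. The function name `features_to_dataframe` is hypothetical, and the two sample features are copied from the output shown earlier.

```python
import pandas as pd

def features_to_dataframe(features):
    """Build a Neighborhood/Latitude/Longitude dataframe from GeoJSON features."""
    rows = []
    for feature in features:
        # MultiPoint geometry: take the first point; GeoJSON order is [lon, lat].
        first_point = feature["geometry"]["coordinates"][0]
        rows.append({
            "Neighborhood": feature["properties"]["NAME"],
            "Latitude": first_point[1],
            "Longitude": first_point[0],
        })
    return pd.DataFrame(rows, columns=["Neighborhood", "Latitude", "Longitude"])

# Two features copied from the hawaii_data output above.
sample_features = [
    {"type": "Feature", "id": "TG00HILPT.1",
     "geometry": {"type": "MultiPoint", "coordinates": [[-156.036222, 19.782857]]},
     "properties": {"GIST_ID": 1, "CFCC": "D82", "NAME": ""}},
    {"type": "Feature", "id": "TG00HILPT.2",
     "geometry": {"type": "MultiPoint", "coordinates": [[-156.003466, 19.64056]]},
     "properties": {"GIST_ID": 2, "CFCC": "D71",
                    "NAME": "Kukailimoku Point Lighthouse"}},
]

sample_df = features_to_dataframe(sample_features)
print(sample_df)
```

Building the frame once from a list also avoids copying the whole dataframe on every iteration, which matters little for 936 rows but adds up on larger files.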

Now we have filled the dataframe with the required features. 

A look at the dataframe-

In [92]:
neighborhoods.head(20)

Unnamed: 0,Neighborhood,Latitude,Longitude
0,,19.782857,-156.036222
1,Kukailimoku Point Lighthouse,19.64056,-156.003466
2,Old Kona Airport,19.647014,-156.011014
3,Ellison S Onizuka Space Center,19.738103,-156.043785
4,Keahole Point Lighthouse,19.730984,-156.063283
5,Honokohau Small Boat Harbor,19.673002,-156.025103
6,Kailua Airport,19.647014,-156.011014
7,,19.429102,-154.88329
8,,19.406082,-154.91839
9,,19.497396,-154.945666


The dataframe looks correct, but many rows have an empty name and only coordinates. We'll have to clean up the dataframe.

Let's see the shape of the dataframe.

In [93]:
s = neighborhoods.shape
s

(936, 3)

In [None]:
# Keep only rows with a non-empty name. A boolean mask avoids dropping rows
# while iterating, which shifts positions and can skip consecutive empty rows.
neighborhoods = neighborhoods[neighborhoods['Neighborhood'] != ""].reset_index(drop=True)

In [109]:
neighborhoods

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Kukailimoku Point Lighthouse,19.64056,-156.003466
1,Old Kona Airport,19.647014,-156.011014
2,Ellison S Onizuka Space Center,19.738103,-156.043785
3,Keahole Point Lighthouse,19.730984,-156.063283
4,Honokohau Small Boat Harbor,19.673002,-156.025103
5,Kailua Airport,19.647014,-156.011014
6,Water Tower,19.676486,-156.006658
7,Kealakekua Bay Park,19.478702,-155.921985
8,Kalahiki Cemetery,19.378003,-155.878344
9,Hookena School,19.390148,-155.881706


In [112]:
s1 = neighborhoods.shape

In [116]:
print('The dataframe has {} neighborhoods.'.format(s1[0]))

The dataframe has 515 neighborhoods.


As we can see after cleaning the dataframe, there are 515 neighborhoods.

Now we create a map of Hawaii with the help of **Folium.** The *Nominatim* will use "hawaii_explorer" as *user_agent*.

In [117]:
address = 'Hawaii, USA'

geolocator = Nominatim(user_agent="hawaii_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Hawaii are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Hawaii are 21.2160437, -157.975203.


Now we superimpose the neighborhoods on top of the map of Hawaii we created.

In [120]:
map_hawaii= folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, label in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='green',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_hawaii)  
    
map_hawaii

We'll be exploring around the Keck Observatory, so let's get its coordinates from the **neighborhoods** dataframe.

In [126]:
neighborhoods[neighborhoods["Neighborhood"]=="Keck Observatory"]

Unnamed: 0,Neighborhood,Latitude,Longitude
239,Keck Observatory,19.829536,-155.474938


Getting the coordinates of the **Keck observatory**-

In [129]:
Latitude = neighborhoods[neighborhoods["Neighborhood"]=="Keck Observatory"].iloc[0,1]
Longitude = neighborhoods[neighborhoods["Neighborhood"]=="Keck Observatory"].iloc[0,2]

In [131]:
neighborhood_name = "Keck Observatory" # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               Latitude, 
                                                               Longitude))

Latitude and longitude values of Keck Observatory are 19.829536, -155.474938.


We will explore around this neighborhood using the Foursquare API.

For doing this, we will be needing **CLIENT_ID** and **CLIENT_SECRET** from the Foursquare developer console.

In [170]:
CLIENT_ID = 'HXVAUOWVYH4BQ3FBLW4S2FXKHH0UMJISPKP0VRM0D3KPFSRB' # Foursquare ID
CLIENT_SECRET = "2P4EN1UVGEK5OBMCOYRUFI0CQ1MKP2A1HKPAQ0KASSA2D4XW" #  Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Credentials:
CLIENT_ID: HXVAUOWVYH4BQ3FBLW4S2FXKHH0UMJISPKP0VRM0D3KPFSRB
CLIENT_SECRET:2P4EN1UVGEK5OBMCOYRUFI0CQ1MKP2A1HKPAQ0KASSA2D4XW
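Since a notebook like this is often shared, one option is to keep the credentials out of the cells and read them from environment variables instead. The sketch below is only an illustration: the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are assumptions, not something Foursquare prescribes.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read the Foursquare credentials from a mapping (the environment by default)."""
    # Hypothetical variable names, chosen for this sketch.
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first."
        )
    return client_id, client_secret

# Demonstration with a stand-in mapping rather than the real environment.
demo_env = {
    "FOURSQUARE_CLIENT_ID": "demo-id",
    "FOURSQUARE_CLIENT_SECRET": "demo-secret",
}
demo_id, demo_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID loaded:", bool(demo_id))
```

In the notebook itself, `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` would then replace the hard-coded assignments.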


For the next step, **venues/explore** on the Foursquare API will be used to explore around the neighborhood.

In [175]:
LIMIT = 100
radius = 7000 # search radius in meters around the center
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    Latitude,
    Longitude,
    radius, 
    LIMIT)
url

'https://api.foursquare.com/v2/venues/explore?&client_id=HXVAUOWVYH4BQ3FBLW4S2FXKHH0UMJISPKP0VRM0D3KPFSRB&client_secret=2P4EN1UVGEK5OBMCOYRUFI0CQ1MKP2A1HKPAQ0KASSA2D4XW&v=20180605&ll=19.829536,-155.474938&radius=7000&limit=100'

Note that we had to use a very large RADIUS to find popular venues around the Keck Observatory, because observatories are generally built in remote places. A remote site helps collect more light from astronomical bodies and suffers less from light pollution, which interferes with observations. The observatory also sits on a mountaintop, where the atmosphere is thinner and scatters less light.

With the **requests** library, we fetch the venues around the center.

In [176]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e56e53995feaf001bacabf9'},
 'response': {'headerLocation': 'Current map view',
  'headerFullLocation': 'Current map view',
  'headerLocationGranularity': 'unknown',
  'totalResults': 7,
  'suggestedBounds': {'ne': {'lat': 19.892536063000065,
    'lng': -155.408092000248},
   'sw': {'lat': 19.766535936999936, 'lng': -155.54178399975203}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4b89ecebf964a520ab5632e3',
       'name': 'Mauna Kea Observatory Complex',
       'location': {'address': 'John A. Burns Way',
        'lat': 19.822871495182433,
        'lng': -155.46963214874268,
        'labeledLatLngs': [{'label': 'display',
          'lat': 19.822871495182433,
          'lng': -155.46963214874268}],
        'distance': 926,
       

In [177]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [179]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues



Unnamed: 0,name,categories,lat,lng
0,Mauna Kea Observatory Complex,Mountain,19.822871,-155.469632
1,Coffee Break,Breakfast Spot,19.821434,-155.48448
2,Waiau (lake),Lake,19.811426,-155.4777
3,D's Home Repair Service,Home Service,19.784796,-155.493622
4,Rover's Fitness Club,Gym / Fitness Center,19.867388,-155.425321
5,Monstera Noodles | Sushi,Asian Restaurant,19.815004,-155.537851
6,Mauna Kea Ice Age Natural Area Preserve,Nature Preserve,19.769506,-155.456116


In [180]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

7 venues were returned by Foursquare.
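As a sanity check on the 7000 m radius, the great-circle distance between the Keck Observatory and a returned venue can be estimated with the haversine formula. This is a sketch under the assumption of a spherical Earth with a mean radius of 6371 km. The coordinates are copied from the dataframe and the Foursquare response shown above; the function name `haversine_m` is hypothetical.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))

keck = (19.829536, -155.474938)  # from the neighborhoods dataframe
observatory_complex = (19.822871495182433, -155.46963214874268)  # from the response

distance_m = haversine_m(*keck, *observatory_complex)
within_radius = distance_m <= 7000
print(round(distance_m), within_radius)
```

The result should land close to the `'distance': 926` value that Foursquare reports for the Mauna Kea Observatory Complex, which suggests the API measures distance in meters from the `ll` point.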


Now we define a function that extracts nearby venues from the Foursquare API.

In [181]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

We will now get the nearby venues for every neighborhood in the dataframe.

In [182]:
hawaii_venues = getNearbyVenues(names=neighborhoods['Neighborhood'],
                                   latitudes=neighborhoods['Latitude'],
                                   longitudes=neighborhoods['Longitude']
                                  )

Kukailimoku Point Lighthouse
Old Kona Airport
Ellison S Onizuka Space Center
Keahole Point Lighthouse
Honokohau Small Boat Harbor
Kailua Airport
Water Tower
Kealakekua Bay Park
Kalahiki Cemetery
Hookena School
Honaunau School
Alae School
Seamountain Ninole Golf Course
Kaalaiki Landing Strip
Upper Paauau Landing Strip
Kaalaiki Landing Strip
Naalehu Park
Waiohinu Park
Whittington Beach County Park
Mookini Heiau
Kahola Historical Sites State
Upolu Airport
Kauhola Point Lighthouse
Kohala Hospital
Halaula School
Wainaia Cemetery
Waimea School
Parker School
Laupahoehoe School
Hakalau School
Waikaumalo Park
Mauna Kea Science Reserve
Pepeekeo Airstrip
Andrade Camp
Waiakea Middle School
Wainaku Camp
Lehia Park
Kealoha Beach Park
Post Office
Holualoa Elementary School
Holualoa Library
Konawaena High School
Kona Hospital
Higashihara Park
Mauna Lani Point Resort
Mauna Loa Observatory
Kulani Correctional Facility
Mauna Loa Boys School
Keaau School
Main Post Office
Dolphin Bay
Maui's Canoe (Rock)
Fe

KeyError: 'groups'
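The `KeyError: 'groups'` above comes from indexing `["response"]['groups']` on a reply that has no `groups` entry. A likely cause, though this is an assumption, is an error reply such as a quota or rate-limit response. A minimal sketch of a more tolerant parser follows. The helper name `extract_venue_rows` and both sample payloads are hypothetical, and the error payload only imitates the general shape of a failed reply.

```python
def extract_venue_rows(payload, name, lat, lng):
    """Return venue tuples from an explore-style payload, or [] if none are present."""
    groups = payload.get("response", {}).get("groups") or []
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # Some venues may carry no category; record None instead of failing.
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng,
                     venue["name"],
                     venue["location"]["lat"],
                     venue["location"]["lng"],
                     category))
    return rows

# Hypothetical payloads for illustration.
ok_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Coffee Break",
               "location": {"lat": 19.821434, "lng": -155.48448},
               "categories": [{"name": "Breakfast Spot"}]}},
    {"venue": {"name": "Unlabelled Place",
               "location": {"lat": 19.8, "lng": -155.5},
               "categories": []}},
]}]}}
error_payload = {"meta": {"code": 429}, "response": {}}

ok_rows = extract_venue_rows(ok_payload, "Keck Observatory", 19.829536, -155.474938)
error_rows = extract_venue_rows(error_payload, "Keck Observatory", 19.829536, -155.474938)
print(len(ok_rows), len(error_rows))
```

Inside `getNearbyVenues`, a helper like this would let the loop continue past a failed request, at the cost of silently missing venues for that neighborhood, so logging the skipped names would be worth adding.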

In [183]:
print(hawaii_venues.shape)

(5705, 7)


Now let's take a look at the venues dataframe.

In [184]:
hawaii_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Kukailimoku Point Lighthouse,19.64056,-156.003466,"Umeke's Fish Market, Bar, & Grill",19.642319,-156.000651,Hawaiian Restaurant
1,Kukailimoku Point Lighthouse,19.64056,-156.003466,Orchid Thai Cuisine,19.642815,-156.000835,Thai Restaurant
2,Kukailimoku Point Lighthouse,19.64056,-156.003466,El Maguey,19.642389,-156.000248,Mexican Restaurant
3,Kukailimoku Point Lighthouse,19.64056,-156.003466,Keiki Beach,19.63898,-156.00254,Beach
4,Kukailimoku Point Lighthouse,19.64056,-156.003466,Irie Hawaii,19.641475,-156.000211,Smoke Shop


In [185]:
hawaii_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
(Pvt) Park,20,20,20,20,20,20
11,1,1,1,1,1,1
12,1,1,1,1,1,1
13,2,2,2,2,2,2
A&B Sugar Museum,15,15,15,15,15,15
Afook-Chinen Civic Auditorium,10,10,10,10,10,10
Ahu'ena Heiau,27,27,27,27,27,27
Ahukini Rec Pier State Park,2,2,2,2,2,2
Aiea Heights Rest Home,2,2,2,2,2,2
Aiea High School,1,1,1,1,1,1


In [186]:
print('There are {} unique categories.'.format(len(hawaii_venues['Venue Category'].unique())))

There are 282 unique categories.


So, if we consider a very large search area around the observatory, the surroundings are very diverse and there is a lot to do.

In [187]:
# one hot encoding
hawaii_onehot = pd.get_dummies(hawaii_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
hawaii_onehot['Neighborhood'] = hawaii_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [hawaii_onehot.columns[-1]] + list(hawaii_onehot.columns[:-1])
hawaii_onehot = hawaii_onehot[fixed_columns]

hawaii_onehot.head(20)

Unnamed: 0,Neighborhood,Accessories Store,Airport,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bar,Baseball Field,Baseball Stadium,Basketball Court,Bay,Beach,Beach Bar,Bed & Breakfast,Beer Bar,Beer Garden,Big Box Store,Board Shop,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Station,Bus Stop,Business Service,Café,Cajun / Creole Restaurant,Campground,Canal,Candy Store,Carpet Store,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,College Basketball Court,College Library,College Rec Center,Comedy Club,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Credit Union,Creperie,Cruise,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Dive Shop,Dive Spot,Doctor's Office,Dog Run,Donut Shop,Drugstore,Electronics Store,Fabric Shop,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Forest,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Gluten-free Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Historic Site,History Museum,Hobby Shop,Home Service,Hostel,Hot Dog Joint,Hot Spring,Hotel,Hotel Bar,Hotel Pool,Hotpot Restaurant,Hunting Supply,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Karaoke Bar,Kids 
Store,Korean Restaurant,Lake,Latin American Restaurant,Laundry Service,Lawyer,Lighthouse,Lingerie Store,Liquor Store,Lounge,Marijuana Dispensary,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Memorial Site,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Military Base,Mini Golf,Mobile Phone Shop,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,National Park,Nature Preserve,New American Restaurant,Nightclub,Noodle House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Repair Shop,Outdoor Event Space,Outdoor Sculpture,Outdoor Supply Store,Outlet Store,Paper / Office Supplies Store,Park,Performing Arts Venue,Pet Store,Pharmacy,Photography Lab,Pier,Pizza Place,Playground,Plaza,Poke Place,Pool,Print Shop,Pub,Racetrack,Ramen Restaurant,Record Shop,Recreation Center,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Resort,Rest Area,Restaurant,Rock Climbing Spot,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,School,Science Museum,Sculpture Garden,Seafood Restaurant,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,South Indian Restaurant,Souvenir Shop,Spa,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,State / Provincial Park,Stationery Store,Steakhouse,Storage Facility,Supermarket,Supplement Shop,Surf Spot,Sushi Restaurant,Taco Place,Tattoo Parlor,Tea Room,Temple,Tennis Court,Tennis Stadium,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Park,Thrift / Vintage Store,Tour Provider,Tourist Information Center,Toy / Game Store,Track,Trail,Tram Station,Travel & Transport,Tree,Udon Restaurant,Vacation Rental,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Waste Facility,Waterfall,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo
0,Kukailimoku Point Lighthouse,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Kukailimoku Point Lighthouse,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Kukailimoku Point Lighthouse,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Kukailimoku Point Lighthouse,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Kukailimoku Point Lighthouse,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,Kukailimoku Point Lighthouse,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,Kukailimoku Point Lighthouse,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,Kukailimoku Point Lighthouse,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0
8,Kukailimoku Point Lighthouse,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
9,Old Kona Airport,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [188]:
hawaii_onehot.shape

(5705, 283)

In [191]:
hawaii_grouped = hawaii_onehot.groupby('Neighborhood').mean().reset_index()
hawaii_grouped.head()

Unnamed: 0,Neighborhood,Accessories Store,Airport,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bar,Baseball Field,Baseball Stadium,Basketball Court,Bay,Beach,Beach Bar,Bed & Breakfast,Beer Bar,Beer Garden,Big Box Store,Board Shop,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Station,Bus Stop,Business Service,Café,Cajun / Creole Restaurant,Campground,Canal,Candy Store,Carpet Store,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,College Basketball Court,College Library,College Rec Center,Comedy Club,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Credit Union,Creperie,Cruise,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Dive Shop,Dive Spot,Doctor's Office,Dog Run,Donut Shop,Drugstore,Electronics Store,Fabric Shop,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Forest,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Gluten-free Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Historic Site,History Museum,Hobby Shop,Home Service,Hostel,Hot Dog Joint,Hot Spring,Hotel,Hotel Bar,Hotel Pool,Hotpot Restaurant,Hunting Supply,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Karaoke Bar,Kids Store,Korean Restaurant,Lake,Latin American Restaurant,Laundry Service,Lawyer,Lighthouse,Lingerie Store,Liquor Store,Lounge,Marijuana Dispensary,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Memorial Site,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Military Base,Mini Golf,Mobile Phone Shop,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,National Park,Nature Preserve,New American Restaurant,Nightclub,Noodle House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Repair Shop,Outdoor Event Space,Outdoor Sculpture,Outdoor Supply Store,Outlet Store,Paper / Office Supplies Store,Park,Performing Arts Venue,Pet Store,Pharmacy,Photography Lab,Pier,Pizza Place,Playground,Plaza,Poke Place,Pool,Print Shop,Pub,Racetrack,Ramen Restaurant,Record Shop,Recreation Center,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Resort,Rest Area,Restaurant,Rock Climbing Spot,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,School,Science Museum,Sculpture Garden,Seafood Restaurant,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,South Indian Restaurant,Souvenir Shop,Spa,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,State / Provincial Park,Stationery Store,Steakhouse,Storage Facility,Supermarket,Supplement Shop,Surf Spot,Sushi Restaurant,Taco Place,Tattoo Parlor,Tea Room,Temple,Tennis Court,Tennis Stadium,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Park,Thrift / Vintage Store,Tour Provider,Tourist Information Center,Toy / Game Store,Track,Trail,Tram Station,Travel & Transport,Tree,Udon Restaurant,Vacation Rental,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Waste Facility,Waterfall,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo
0,(Pvt) Park,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.15,0.2,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,11,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,12,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,13,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,A&B Sugar Museum,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.066667,0.133333,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [192]:
hawaii_grouped.shape

(394, 283)
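
The drop from 5705 rows to 394 rows comes from averaging the one-hot venue rows per neighborhood: each grouped value is the share of that neighborhood's venues falling in a category. A minimal sketch of the same step on hypothetical toy data (the names `toy_onehot` and `toy_grouped` and the three categories are made up for illustration):

```python
import pandas as pd

# Hypothetical toy data: one row per venue, one 0/1 column per venue category.
toy_onehot = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B"],
    "Beach": [1, 1, 0, 0, 0],
    "Café":  [0, 0, 1, 0, 0],
    "Hotel": [0, 0, 0, 1, 1],
})

# The mean of 0/1 indicators per neighborhood is the category frequency.
toy_grouped = toy_onehot.groupby("Neighborhood").mean().reset_index()
print(toy_grouped)

# Each neighborhood's frequencies add up to 1.
print(toy_grouped.drop(columns="Neighborhood").sum(axis=1))
```

Neighborhood `A` has four venues, two of them beaches, so its `Beach` frequency is 0.5; neighborhood `B` has a single hotel, so `Hotel` is 1.0 and everything else is 0.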

In [193]:
num_top_venues = 10
for hood in hawaii_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = hawaii_grouped[hawaii_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----(Pvt) Park----
               venue  freq
0     History Museum  0.20
1      Historic Site  0.15
2        Bus Station  0.10
3      Grocery Store  0.05
4        Video Store  0.05
5    Harbor / Marina  0.05
6               Park  0.05
7              Hotel  0.05
8  Convenience Store  0.05
9    Automotive Shop  0.05


----11----
                           venue  freq
0                Bed & Breakfast   1.0
1              Accessories Store   0.0
2                  National Park   0.0
3          Performing Arts Venue   0.0
4                           Park   0.0
5  Paper / Office Supplies Store   0.0
6                   Outlet Store   0.0
7           Outdoor Supply Store   0.0
8              Outdoor Sculpture   0.0
9            Outdoor Event Space   0.0


----12----
                           venue  freq
0                Bed & Breakfast   1.0
1              Accessories Store   0.0
2                  National Park   0.0
3          Performing Arts Venue   0.0
4                           Park  

In [194]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [209]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # ranks beyond 3rd fall back to the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = hawaii_grouped['Neighborhood']

for ind in np.arange(hawaii_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(hawaii_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,(Pvt) Park,History Museum,Historic Site,Bus Station,Electronics Store,Park,Grocery Store,Fast Food Restaurant,Automotive Shop,Convenience Store,Hotel
1,11,Bed & Breakfast,Zoo,General Entertainment,Food,Food & Drink Shop,Food Court,Food Truck,Forest,French Restaurant,Fried Chicken Joint
2,12,Bed & Breakfast,Zoo,General Entertainment,Food,Food & Drink Shop,Food Court,Food Truck,Forest,French Restaurant,Fried Chicken Joint
3,13,BBQ Joint,Hostel,Zoo,Forest,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant
4,A&B Sugar Museum,Coffee Shop,Salon / Barbershop,Diner,Salad Place,Hardware Store,Bakery,Mobile Phone Shop,Big Box Store,History Museum,Sandwich Place
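
Neighborhoods such as `11` and `12` have a single venue category (Bed & Breakfast with frequency 1.0), so the remaining nine "most common" slots are filled from categories whose frequency is 0.0, and their order is just how the sort happens to break ties. One possible way to make this visible is to skip zero-frequency categories. The sketch below is an assumption, not the method used in this notebook; `top_venues_nonzero` and `toy_row` are hypothetical names.

```python
import numpy as np
import pandas as pd

def top_venues_nonzero(row, num_top_venues):
    """Hypothetical variant of return_most_common_venues that ignores
    categories with zero frequency and pads the result with NaN."""
    freqs = row.iloc[1:].astype(float)          # drop the 'Neighborhood' entry
    freqs = freqs[freqs > 0].sort_values(ascending=False, kind="stable")
    top = list(freqs.index[:num_top_venues])
    return top + [np.nan] * (num_top_venues - len(top))

# Toy row shaped like one row of hawaii_grouped (only three categories kept).
toy_row = pd.Series({
    "Neighborhood": "11",
    "Accessories Store": 0.0,
    "Bed & Breakfast": 1.0,
    "Zoo": 0.0,
})

result = top_venues_nonzero(toy_row, 3)
print(result)
```

With this variant only Bed & Breakfast is reported for the toy row and the other slots stay empty, instead of listing categories that do not occur in the neighborhood.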


In [211]:
kclusters = 9

# keep only the numeric frequency columns for clustering
hawaii_grouped_clustering = hawaii_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(hawaii_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 4, 4, 0, 2, 2, 2, 1, 2, 1], dtype=int32)
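
A quick way to see how evenly the neighborhoods are spread over the clusters is to count the labels. The sketch below uses only the ten labels printed above as a stand-in; on the full data, `kmeans.labels_` would be passed instead. The variable names `labels` and `cluster_sizes` are illustrative.

```python
import numpy as np

kclusters = 9

# First ten labels as printed above; use kmeans.labels_ for the full data.
labels = np.array([2, 4, 4, 0, 2, 2, 2, 1, 2, 1])

# minlength keeps clusters that received no point in the count.
cluster_sizes = np.bincount(labels, minlength=kclusters)

for cluster, size in enumerate(cluster_sizes):
    print(f"Cluster {cluster}: {size} neighborhoods")
```

In this ten-label sample, cluster 2 already holds half of the neighborhoods, which is consistent with how often label 2.0 shows up in the merged table further down.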

In [224]:
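# add clustering labels; the guard lets this cell be re-run, because
# DataFrame.insert raises a ValueError if the column already exists
if 'Cluster Labels' not in neighborhoods_venues_sorted.columns:
    neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

hawaii_merged = neighborhoods

# merge the sorted venues with the Hawaii neighborhoods to add latitude/longitude for each neighborhood
hawaii_merged = hawaii_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

hawaii_merged # check the last columns!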

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Kukailimoku Point Lighthouse,19.64056,-156.003466,2.0,Hawaiian Restaurant,Yoga Studio,Beach,Thai Restaurant,Mexican Restaurant,Noodle House,Korean Restaurant,Fried Chicken Joint,Smoke Shop,Food Truck
1,Old Kona Airport,19.647014,-156.011014,6.0,Beach,Garden,Zoo,French Restaurant,Food,Food & Drink Shop,Food Court,Food Truck,Forest,Fried Chicken Joint
2,Ellison S Onizuka Space Center,19.738103,-156.043785,2.0,Airport Service,Airport,Airport Terminal,Gift Shop,Rest Area,Science Museum,Café,Zoo,French Restaurant,Food
3,Keahole Point Lighthouse,19.730984,-156.063283,2.0,Lighthouse,Zoo,Fish Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Forest,French Restaurant
4,Honokohau Small Boat Harbor,19.673002,-156.025103,1.0,Boat or Ferry,Harbor / Marina,Athletics & Sports,Resort,Beach,Home Service,Tour Provider,Zoo,Food,Food & Drink Shop
5,Kailua Airport,19.647014,-156.011014,6.0,Beach,Garden,Zoo,French Restaurant,Food,Food & Drink Shop,Food Court,Food Truck,Forest,Fried Chicken Joint
6,Water Tower,19.676486,-156.006658,,,,,,,,,,,
7,Kealakekua Bay Park,19.478702,-155.921985,2.0,Park,Memorial Site,Travel & Transport,Zoo,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Forest
8,Kalahiki Cemetery,19.378003,-155.878344,,,,,,,,,,,
9,Hookena School,19.390148,-155.881706,,,,,,,,,,,


In [231]:
# drop neighborhoods for which no venues (and hence no cluster label) were found
hawaii_merged.dropna(axis=0, inplace=True)
hawaii_merged.head(130)

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Kukailimoku Point Lighthouse,19.64056,-156.003466,2.0,Hawaiian Restaurant,Yoga Studio,Beach,Thai Restaurant,Mexican Restaurant,Noodle House,Korean Restaurant,Fried Chicken Joint,Smoke Shop,Food Truck
1,Old Kona Airport,19.647014,-156.011014,6.0,Beach,Garden,Zoo,French Restaurant,Food,Food & Drink Shop,Food Court,Food Truck,Forest,Fried Chicken Joint
2,Ellison S Onizuka Space Center,19.738103,-156.043785,2.0,Airport Service,Airport,Airport Terminal,Gift Shop,Rest Area,Science Museum,Café,Zoo,French Restaurant,Food
3,Keahole Point Lighthouse,19.730984,-156.063283,2.0,Lighthouse,Zoo,Fish Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Forest,French Restaurant
4,Honokohau Small Boat Harbor,19.673002,-156.025103,1.0,Boat or Ferry,Harbor / Marina,Athletics & Sports,Resort,Beach,Home Service,Tour Provider,Zoo,Food,Food & Drink Shop
5,Kailua Airport,19.647014,-156.011014,6.0,Beach,Garden,Zoo,French Restaurant,Food,Food & Drink Shop,Food Court,Food Truck,Forest,Fried Chicken Joint
7,Kealakekua Bay Park,19.478702,-155.921985,2.0,Park,Memorial Site,Travel & Transport,Zoo,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Forest
10,Honaunau School,19.4531,-155.881097,2.0,Coffee Shop,Hawaiian Restaurant,Café,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Forest,Zoo
12,Seamountain Ninole Golf Course,19.137274,-155.511145,2.0,Resort,Beach,Golf Course,Hotel Pool,Forest,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck
16,Naalehu Park,19.063949,-155.585591,2.0,Food Truck,Supermarket,Farmers Market,Café,Food & Drink Shop,Mexican Restaurant,Coffee Shop,Bakery,Restaurant,Donut Shop
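
The cluster labels show up as `2.0`, `6.0`, and so on because the join leaves `NaN` for neighborhoods without venues (for example Water Tower), which forces the column to a float type. After the `NaN` rows are dropped, the labels can be cast back to integers. A minimal sketch on two rows taken from the table above; the `*_toy` frames and `cleaned` are hypothetical names:

```python
import pandas as pd

# Two neighborhoods from the table above; only one of them has venues.
neighborhoods_toy = pd.DataFrame({
    "Neighborhood": ["Old Kona Airport", "Water Tower"],
    "Latitude": [19.647014, 19.676486],
    "Longitude": [-156.011014, -156.006658],
})
sorted_toy = pd.DataFrame({
    "Neighborhood": ["Old Kona Airport"],
    "Cluster Labels": [6],
    "1st Most Common Venue": ["Beach"],
})

merged_toy = neighborhoods_toy.join(sorted_toy.set_index("Neighborhood"), on="Neighborhood")
print(merged_toy["Cluster Labels"].dtype)   # float, because of the NaN row

# Drop rows without venues, then restore integer labels.
cleaned = merged_toy.dropna(axis=0).copy()
cleaned["Cluster Labels"] = cleaned["Cluster Labels"].astype(int)
print(cleaned)
```

Integer labels make later indexing, such as picking a color per cluster, less error-prone than working with floats.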


In [235]:
#create map
map_clusters = folium.Map(location=[Latitude, Longitude], zoom_start=10)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(hawaii_merged['Latitude'], hawaii_merged['Longitude'], hawaii_merged['Neighborhood'], hawaii_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)],  # labels run from 0 to kclusters-1, so index directly
        fill=True,
        fill_color=rainbow[int(cluster)],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters