<H1> Coursera Capstone Project

This notebook will apply the skills and tools that I have learned during the IBM Coursera Data Science Certification. It will demonstrate my understanding of these tools and ability to use them to creatively solve complex problems.

To clarify the requirements of the project, I will restate the rubric here:

For Week 1:
<ul>
    <li>A description of the problem and a discussion of the background. (15 marks)</li>
    <li>A description of the data and how it will be used to solve the problem. (15 marks)</li>
    </ul>

For Week 2:
<ul>
    <li>A link to your Notebook on your Github repository, showing your code. (15 marks)</li>
    <li>A full report consisting of all of the following components (15 marks):
        <ul>
            <li>Introduction where you discuss the business problem and who would be interested in this project.</li>
            <li>Data where you describe the data that will be used to solve the problem and the source of the data.</li>
            <li>Methodology section which represents the main component of the report where you discuss and describe any exploratory data analysis that you did, any inferential statistical testing that you performed, if any, and what machine learnings were used and why.</li>
            <li>Results section where you discuss the results.</li>
            <li>Discussion section where you discuss any observations you noted and any recommendations you can make based on the results. </li>
            <li>Conclusion section where you conclude the report.</li>
        </ul>
    <li>Your choice of a presentation or blogpost. (10 marks)</li>
    </ul>

<H2>Introduction

<H4>The Problem</H4> 
This notebook will investigate the viability of opening a 24/7 diner in Philadelphia, Pennsylvania.
<p></p>
As a Philadelphia native, I love the city. But I am always struck by how many times people remark to me how early the city shuts down compared to its neighbor, New York. There are very few late night eateries, particularly near the University City district, where many students live and have expressed interest in late night options for after party hangouts or study fuel.

<H2>Methodology

<H4>DATA: This Notebook as a Solution</H4>
In this notebook, I will use data scraped from Wikipedia to identify neighborhoods in Philadelphia, PA. After which, I will create a dataframe and map the coordinates of the neighborhood. I will use the latitude and longitude to get venue data from Four Square to confirm or reject my hypothesis that University City is an viable neighborhood to open a 24/7 diner.

<H4> Importing Libraries and Begin Scraping Data

In [1]:
#importing libraries
import numpy as np #for vectorized data

import pandas as pd #for data analysis
pd.set_option("display.max_columns", None)
pd.set_option("display.max_columns", None)

import json #for json files

!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim #convert addresses into Lat and Long
import geocoder

import requests #for requests
import lxml

from bs4 import BeautifulSoup #library to parse HTML and XML

from pandas.io.json import json_normalize #transform JSON file into pandas dataframe

#matplotlib
import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans

import folium #rendering maps

print('Libraries successfully imported. Begin scraping')

Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: C:\Users\alexs\Anaconda

  added / updated specs:
    - geopy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    certifi-2019.9.11          |           py37_0         147 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         147 KB

The following packages will be UPDATED:

  certifi                                  2019.6.16-py37_1 --> 2019.9.11-py37_0



Downloading and Extracting Packages

certifi-2019.9.11    | 147 KB    |            |   0% 
certifi-2019.9.11    | 147 KB    | #          |  11% 
certifi-2019.9.11    | 147 KB    | ########## | 100% 
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...

<h4>Scraping Site

In [2]:
PHL = requests.get('https://en.wikipedia.org/wiki/Category:Neighborhoods_in_Philadelphia')

In [3]:
soup = BeautifulSoup(PHL.text, 'html.parser')

In [4]:
philadist=[]

In [5]:
#append the data into the dataframe
for row in soup.find_all("div", class_="mw-category")[0].findAll("a"):
    philadist.append(row.text)
philadf = pd.DataFrame({"Neighborhood":philadist})
philadf.head()           

Unnamed: 0,Neighborhood
0,"Center City, Philadelphia"
1,North Philadelphia
2,Northeast Philadelphia
3,Northwest Philadelphia
4,South Philadelphia


In [6]:
#find the shape of the dataframe
philadf.shape

(49, 1)

At this stage, I'd like to point out that I'll be dividing Philadelphia into its districts rather than its neighborhoods, as it is a smaller city, many of the neighborhoods are quite small and to consider the business problem that we are trying to solve (the opening up of a 24/7 diner), it is sufficient enough to consider districts. Many people eat outside of their neighborhoods while remaining inside their district. Therefore we will base our analysis on district data.

<H3>Getting the Coordinates

In [7]:
# define a function to get coordinates
def get_latlng(Neighborhood):
    # initialize your variable to None
    lat_lng_coords = None
    # loop until you get the coordinates
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, Philadelphia, Pennsylvania'.format(Neighborhood))
        lat_lng_coords = g.latlng
    return lat_lng_coords

In [8]:
# call the function to get the coordinates, store in a new list using list comprehension
coords = [ get_latlng(Neighborhood) for Neighborhood in philadf["Neighborhood"].tolist() ]

In [9]:
coords

[[39.95130000000006, -75.15473999999995],
 [39.981750000000034, -75.13382999999999],
 [40.09280000000007, -74.98702999999995],
 [40.09280000000007, -74.98702999999995],
 [39.96411002943043, -75.16105003116108],
 [39.91004000000004, -75.18636999999995],
 [40.000150000000076, -75.07010999999994],
 [40.000150000000076, -75.07010999999994],
 [39.958110786797675, -75.15022810571087],
 [40.076940000000036, -75.20800999999994],
 [39.955970000000036, -75.15815999999995],
 [40.014500000000055, -75.19217999999995],
 [39.967250000000035, -75.17046999999997],
 [40.08096000000006, -75.08028999999993],
 [40.02834000000007, -75.08534999999995],
 [40.029540000000054, -75.17510999999996],
 [40.04218000000003, -75.02884999999998],
 [40.01281000000006, -75.14256999999998],
 [39.98625000000004, -75.13189999999997],
 [39.95879000000008, -75.17178999999999],
 [40.02899000000008, -75.15183999999994],
 [39.96415799999999, -75.1988025],
 [39.960467386002854, -75.22934957052628],
 [40.065810000000056, -75.18505

In [10]:
#Dataframe of coordinates
df_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])

In [11]:
#merge dataframes
philadf['Latitude'] = df_coords['Latitude']
philadf['Longitude'] = df_coords['Longitude']

In [12]:
# confirm the new dataframe
print(philadf.shape)
philadf

(49, 3)


Unnamed: 0,Neighborhood,Latitude,Longitude
0,"Center City, Philadelphia",39.9513,-75.15474
1,North Philadelphia,39.98175,-75.13383
2,Northeast Philadelphia,40.0928,-74.98703
3,Northwest Philadelphia,40.0928,-74.98703
4,South Philadelphia,39.96411,-75.16105
5,Southwest Philadelphia,39.91004,-75.18637
6,"Bridesburg-Kensington-Richmond, Philadelphia",40.00015,-75.07011
7,"Bridesburg, Philadelphia",40.00015,-75.07011
8,"Callowhill, Philadelphia",39.958111,-75.150228
9,"Chestnut Hill, Philadelphia",40.07694,-75.20801


In [13]:
# save dataframe as csv
philadf.to_csv("philadf.csv", index=False)

<H3> Creating Map of Philadelphia Neighborhood

In [14]:
address = "Philadelphia Pennsylvania"

geolocator = Nominatim(user_agent="Jupyter")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Philadelphia, Pennsylvania, USA are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Philadelphia, Pennsylvania, USA are 39.9527237, -75.1635262.


In [15]:
#create map
mapphila = folium.Map(location = [latitude, longitude], zoom_start = 10)

#add Districts
for lat, lng, Neighborhood in zip(philadf['Latitude'], philadf['Longitude'], philadf['Neighborhood']):
    label = '{}'.format('Neighborhood')
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color = 'blue',
        fill = True,
        fill_color = '#3186cc',
        fill_opacity = 0.7,
        partse_html=False).add_to(mapphila)

mapphila

In [16]:
#Map as HTML
mapphila.save('mapphila.html')

<H3>Explore Districts using Foursquare API

In [17]:
#Foursquare Credentials
CLIENT_ID = 'GCLKXCIUP2HPNJKHP03BLDMFU2VHZDDMIHZP4RIPJJSNAYGB'
CLIENT_SECRET = 'GHSMJAUI1YLPIBHENISF5HI2IFVOC0OS5L1UM4OA2TO3OZJB'
VERSION = '20191018'

print('My Credentials:')
print('CLIENT_ID:', CLIENT_ID)
print('CLIENT_SECRET:', CLIENT_SECRET)

My Credentials:
CLIENT_ID: GCLKXCIUP2HPNJKHP03BLDMFU2VHZDDMIHZP4RIPJJSNAYGB
CLIENT_SECRET: GHSMJAUI1YLPIBHENISF5HI2IFVOC0OS5L1UM4OA2TO3OZJB


<H4>Foursquare: 100 Top Venues within 2000 Meters

In [19]:
radius = 2000
LIMIT = 200

venues = []
for lat, lng, Neighborhood in zip(philadf['Latitude'], philadf['Longitude'], philadf['Neighborhood']):
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)
    results = requests.get(url).json()['response']['groups'][0]['items']
    #get information for venues
    for venue in results: 
         venues.append((
        Neighborhood,
        lat,
        lng,
        venue['venue']['name'],
        venue['venue']['location']['lat'],
        venue['venue']['location']['lng'],
        venue['venue']['categories'][0]['name']))

In [20]:
# convert the venues list into a new DataFrame
venues_df = pd.DataFrame(venues)
venues_df

Unnamed: 0,0,1,2,3,4,5,6
0,"Center City, Philadelphia",39.95130,-75.15474,Di Bruno Bros.,39.949148,-75.155587,Gourmet Shop
1,"Center City, Philadelphia",39.95130,-75.15474,Morimoto,39.949836,-75.153241,Japanese Restaurant
2,"Center City, Philadelphia",39.95130,-75.15474,MOM's Organic Market,39.950918,-75.158815,Organic Grocery
3,"Center City, Philadelphia",39.95130,-75.15474,Walnut Street Theatre,39.948553,-75.155504,Theater
4,"Center City, Philadelphia",39.95130,-75.15474,Oishiipoke,39.953254,-75.156392,Poke Place
5,"Center City, Philadelphia",39.95130,-75.15474,Reading Terminal Market,39.953341,-75.159306,Market
6,"Center City, Philadelphia",39.95130,-75.15474,La Colombe Torrefaction,39.950563,-75.150758,Coffee Shop
7,"Center City, Philadelphia",39.95130,-75.15474,Independence National Historical Park,39.950666,-75.149787,National Park
8,"Center City, Philadelphia",39.95130,-75.15474,Steven Singer Jewelers,39.948151,-75.154230,Jewelry Store
9,"Center City, Philadelphia",39.95130,-75.15474,Fat Salmon,39.947995,-75.153453,Sushi Restaurant


In [21]:
# define the column names
venues_df.columns = ['Neighborhood', 'NeighborhoodLatitude', 'NeighborhoodLongitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venues_df.shape)
venues_df.head()

(4549, 7)


Unnamed: 0,Neighborhood,NeighborhoodLatitude,NeighborhoodLongitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,"Center City, Philadelphia",39.9513,-75.15474,Di Bruno Bros.,39.949148,-75.155587,Gourmet Shop
1,"Center City, Philadelphia",39.9513,-75.15474,Morimoto,39.949836,-75.153241,Japanese Restaurant
2,"Center City, Philadelphia",39.9513,-75.15474,MOM's Organic Market,39.950918,-75.158815,Organic Grocery
3,"Center City, Philadelphia",39.9513,-75.15474,Walnut Street Theatre,39.948553,-75.155504,Theater
4,"Center City, Philadelphia",39.9513,-75.15474,Oishiipoke,39.953254,-75.156392,Poke Place


The number of venues for each neighborhood:

In [22]:
venues_df.groupby(['Neighborhood']).count()

Unnamed: 0_level_0,NeighborhoodLatitude,NeighborhoodLongitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Bridesburg, Philadelphia",47,47,47,47,47,47
"Bridesburg-Kensington-Richmond, Philadelphia",47,47,47,47,47,47
"Callowhill, Philadelphia",100,100,100,100,100,100
"Center City, Philadelphia",100,100,100,100,100,100
"Chestnut Hill, Philadelphia",83,83,83,83,83,83
"Chinatown, Philadelphia",100,100,100,100,100,100
"East Falls, Philadelphia",100,100,100,100,100,100
"Fairmount, Philadelphia",100,100,100,100,100,100
"Fox Chase, Philadelphia",74,74,74,74,74,74
"Frankford, Philadelphia",92,92,92,92,92,92


In [23]:
print('There are {} uniques categories.'.format(len(venues_df['VenueCategory'].unique())))

There are 286 uniques categories.


In [24]:
# print out the list of categories
venues_df['VenueCategory'].unique()[:50]

array(['Gourmet Shop', 'Japanese Restaurant', 'Organic Grocery',
       'Theater', 'Poke Place', 'Market', 'Coffee Shop', 'National Park',
       'Jewelry Store', 'Sushi Restaurant', 'Donut Shop', 'Snack Place',
       'Beer Garden', 'Ramen Restaurant', 'New American Restaurant',
       'Bagel Shop', 'Shanghai Restaurant', 'Brewery', 'Bakery', 'Park',
       'Salad Place', 'Hot Dog Joint', 'Sandwich Place', 'Historic Site',
       'Ice Cream Shop', 'Hotel', 'Deli / Bodega', 'Indian Restaurant',
       'Sculpture Garden', 'Burger Joint', 'Mexican Restaurant',
       'History Museum', 'Bar', 'Mediterranean Restaurant',
       'Noodle House', 'Gastropub', 'Wine Bar',
       'Vegetarian / Vegan Restaurant', 'Asian Restaurant',
       'Optical Shop', 'Indie Movie Theater', 'Churrascaria',
       'Comfort Food Restaurant', 'Pizza Place', 'Breakfast Spot',
       'Italian Restaurant', 'Spa', 'Concert Hall', 'Pub',
       'American Restaurant'], dtype=object)

In [25]:
# check if the results contain "Diner"
"Neighborhood" in venues_df['VenueCategory'].unique()

False

<H3>Analyze Neighborhoods

In [26]:
# one hot encoding
venues_type_onehot = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")

# add the neighborhood column
venues_type_onehot['Neighborhood'] = venues_df['Neighborhood']
fix_columns = list(venues_type_onehot.columns[-1:]) + list(venues_type_onehot.columns[:-1])
venues_type_onehot = venues_type_onehot[fix_columns]

print(venues_type_onehot.shape)
venues_type_onehot.head()

(4549, 287)


Unnamed: 0,Neighborhood,ATM,Accessories Store,African Restaurant,Airport Service,American Restaurant,Antique Shop,Aquarium,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Austrian Restaurant,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Court,Beer Bar,Beer Garden,Beer Store,Big Box Store,Board Shop,Boarding House,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Station,Bus Stop,Business Service,Butcher,Cafeteria,Café,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Churrascaria,Clothing Store,Cocktail Bar,Coffee Shop,College Bookstore,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Costume Shop,Creperie,Cuban Restaurant,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Disc Golf,Discount Store,Dive Bar,Doctor's Office,Dog Run,Donut Shop,Drugstore,Dry Cleaner,Eastern European Restaurant,English Restaurant,Entertainment Service,Ethiopian Restaurant,Event Service,Event Space,Fabric Shop,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Football Stadium,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gas Station,Gastropub,General Entertainment,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Halal Restaurant,Harbor / Marina,Hardware Store,Health & Beauty Service,Health Food Store,High School,Historic Site,History Museum,Hobby Shop,Hockey Arena,Home Service,Hookah Bar,Hospital,Hostel,Hot Dog Joint,Hotel,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Insurance Office,Intersection,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jewelry Store,Jewish Restaurant,Juice Bar,Karaoke Bar,Kids Store,Kitchen Supply Store,Korean Restaurant,Lake,Latin American Restaurant,Laundromat,Lebanese Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Locksmith,Lounge,Market,Martial Arts Dojo,Massage Studio,Mattress Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Movie Theater,Museum,Music Venue,Nail Salon,National Park,New American Restaurant,Nightclub,Nightlife Spot,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Nightlife,Outdoor Sculpture,Paella Restaurant,Paper / Office Supplies Store,Park,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Photography Studio,Pier,Pizza Place,Platform,Playground,Plaza,Poke Place,Pool,Portuguese Restaurant,Print Shop,Pub,Public Art,Radio Station,Ramen Restaurant,Record Shop,Recreation Center,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Restaurant,Rock Climbing Spot,Rock Club,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shanghai Restaurant,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Smoke Shop,Snack Place,Soccer Field,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Squash Court,Stadium,Steakhouse,Storage Facility,Street Art,Supermarket,Supplement Shop,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Thrift / Vintage Store,Toll Booth,Tourist Information Center,Toy / Game Store,Trail,Train,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,"Center City, Philadelphia",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"Center City, Philadelphia",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,"Center City, Philadelphia",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,"Center City, Philadelphia",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,"Center City, Philadelphia",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


<H4>Now, I will group the rows by neighborhood and get the mean frequency.

In [27]:
# get the occurrence of each venue type in each neighborhood
philagrouped = venues_type_onehot.groupby(['Neighborhood']).mean().reset_index()

print(philagrouped.shape)
philagrouped

(49, 287)


Unnamed: 0,Neighborhood,ATM,Accessories Store,African Restaurant,Airport Service,American Restaurant,Antique Shop,Aquarium,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Austrian Restaurant,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Court,Beer Bar,Beer Garden,Beer Store,Big Box Store,Board Shop,Boarding House,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Station,Bus Stop,Business Service,Butcher,Cafeteria,Café,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Churrascaria,Clothing Store,Cocktail Bar,Coffee Shop,College Bookstore,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Costume Shop,Creperie,Cuban Restaurant,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Disc Golf,Discount Store,Dive Bar,Doctor's Office,Dog Run,Donut Shop,Drugstore,Dry Cleaner,Eastern European Restaurant,English Restaurant,Entertainment Service,Ethiopian Restaurant,Event Service,Event Space,Fabric Shop,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Football Stadium,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gas Station,Gastropub,General Entertainment,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Halal Restaurant,Harbor / Marina,Hardware Store,Health & Beauty Service,Health Food Store,High School,Historic Site,History Museum,Hobby Shop,Hockey Arena,Home Service,Hookah Bar,Hospital,Hostel,Hot Dog Joint,Hotel,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Insurance Office,Intersection,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jewelry Store,Jewish Restaurant,Juice Bar,Karaoke Bar,Kids Store,Kitchen Supply Store,Korean Restaurant,Lake,Latin American Restaurant,Laundromat,Lebanese Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Locksmith,Lounge,Market,Martial Arts Dojo,Massage Studio,Mattress Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Movie Theater,Museum,Music Venue,Nail Salon,National Park,New American Restaurant,Nightclub,Nightlife Spot,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Nightlife,Outdoor Sculpture,Paella Restaurant,Paper / Office Supplies Store,Park,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Photography Studio,Pier,Pizza Place,Platform,Playground,Plaza,Poke Place,Pool,Portuguese Restaurant,Print Shop,Pub,Public Art,Radio Station,Ramen Restaurant,Record Shop,Recreation Center,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Restaurant,Rock Climbing Spot,Rock Club,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shanghai Restaurant,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Smoke Shop,Snack Place,Soccer Field,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Squash Court,Stadium,Steakhouse,Storage Facility,Street Art,Supermarket,Supplement Shop,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Thrift / Vintage Store,Toll Booth,Tourist Information Center,Toy / Game Store,Trail,Train,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,"Bridesburg, Philadelphia",0.0,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.021277,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.085106,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.06383,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.021277,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.042553,0.0,0.021277,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.06383,0.0,0.0,0.06383,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.106383,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.021277,0.0,0.021277,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.0
1,"Bridesburg-Kensington-Richmond, Philadelphia",0.0,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.021277,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.085106,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.06383,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.021277,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.042553,0.0,0.021277,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.06383,0.0,0.0,0.06383,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.106383,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.021277,0.0,0.021277,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.0
2,"Callowhill, Philadelphia",0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.02,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.03,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.02,0.02,0.02,0.0,0.0,0.01,0.02,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0
3,"Center City, Philadelphia",0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.05,0.0,0.01,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.03,0.0,0.02,0.03,0.02,0.0,0.0,0.01,0.03,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.02,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0
4,"Chestnut Hill, Philadelphia",0.0,0.0,0.0,0.0,0.048193,0.0,0.0,0.0,0.012048,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.048193,0.060241,0.012048,0.012048,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.024096,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.036145,0.0,0.0,0.0,0.012048,0.0,0.024096,0.0,0.0,0.0,0.0,0.012048,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.024096,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.012048,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.024096,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.036145,0.012048,0.0,0.012048,0.012048,0.0,0.012048,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.024096,0.0,0.0,0.0,0.0,0.012048,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.024096,0.0,0.0,0.012048,0.036145,0.0,0.0,0.048193,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.024096,0.0,0.0,0.0,0.012048,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.012048,0.012048,0.012048,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.012048,0.0,0.012048,0.0,0.0,0.0,0.0,0.012048,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.012048,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,"Chinatown, Philadelphia",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.05,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.04,0.0,0.01,0.03,0.0,0.0,0.0,0.01,0.04,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.04,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0
6,"East Falls, Philadelphia",0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.03,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.03,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.01,0.0,0.0,0.02,0.0,0.0,0.07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.05,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0
7,"Fairmount, Philadelphia",0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.02,0.02,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.06,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.03,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.03,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.05,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.04,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0
8,"Fox Chase, Philadelphia",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.027027,0.027027,0.067568,0.0,0.0,0.0,0.0,0.013514,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.067568,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.013514,0.054054,0.0,0.013514,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.013514,0.0,0.0,0.0,0.0,0.027027,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.013514,0.0,0.0,0.054054,0.0,0.0,0.081081,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0
9,"Frankford, Philadelphia",0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.01087,0.043478,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.021739,0.01087,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.032609,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.043478,0.01087,0.0,0.0,0.0,0.0,0.0,0.021739,0.01087,0.0,0.0,0.0,0.0,0.054348,0.0,0.0,0.0,0.065217,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.0,0.01087,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032609,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.01087,0.01087,0.065217,0.0,0.0,0.065217,0.01087,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.0,0.01087,0.0,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,0.021739,0.01087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [28]:
len(philagrouped[philagrouped["Diner"] > 0])

19

<H4>New DataFrame for Philadelphian Diners

In [29]:
#Creating Diner dataframe
phila_diner = philagrouped[["Neighborhood", "Diner"]]
phila_diner.head()

Unnamed: 0,Neighborhood,Diner
0,"Bridesburg, Philadelphia",0.0
1,"Bridesburg-Kensington-Richmond, Philadelphia",0.0
2,"Callowhill, Philadelphia",0.01
3,"Center City, Philadelphia",0.0
4,"Chestnut Hill, Philadelphia",0.012048


<H3>Clustering Neighborhoods

In [30]:
# set number of clusters
phclusters = 4

ph_clustering = phila_diner.drop(["Neighborhood"], 1)

# run k-means clustering
kmeans = KMeans(n_clusters=phclusters, random_state=0).fit(ph_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([1, 1, 0, 1, 0, 1, 1, 1, 1, 1])

In [31]:
# create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.
philamerged = phila_diner.copy()

# add clustering labels
philamerged["Cluster Labels"] = kmeans.labels_

In [32]:
philamerged.rename(columns={"Neighborhoods": "Neighborhood"}, inplace = True)
philamerged.head()

Unnamed: 0,Neighborhood,Diner,Cluster Labels
0,"Bridesburg, Philadelphia",0.0,1
1,"Bridesburg-Kensington-Richmond, Philadelphia",0.0,1
2,"Callowhill, Philadelphia",0.01,0
3,"Center City, Philadelphia",0.0,1
4,"Chestnut Hill, Philadelphia",0.012048,0


In [33]:
philamerged = philamerged.join(philadf.set_index("Neighborhood"), on="Neighborhood")

print(philamerged.shape)
philamerged.head()

(49, 5)


Unnamed: 0,Neighborhood,Diner,Cluster Labels,Latitude,Longitude
0,"Bridesburg, Philadelphia",0.0,1,40.00015,-75.07011
1,"Bridesburg-Kensington-Richmond, Philadelphia",0.0,1,40.00015,-75.07011
2,"Callowhill, Philadelphia",0.01,0,39.958111,-75.150228
3,"Center City, Philadelphia",0.0,1,39.9513,-75.15474
4,"Chestnut Hill, Philadelphia",0.012048,0,40.07694,-75.20801


In [34]:
# sort the results by Cluster Labels
print(philamerged.shape)
philamerged.sort_values(["Cluster Labels"], inplace=True)
philamerged

(49, 5)


Unnamed: 0,Neighborhood,Diner,Cluster Labels,Latitude,Longitude
48,"Wister, Philadelphia",0.01,0,40.03505,-75.15972
25,"Old City, Philadelphia",0.01,0,39.95009,-75.14507
23,"Northern Liberties, Philadelphia",0.01,0,39.96596,-75.1415
21,North Philadelphia,0.01,0,39.98175,-75.13383
20,"North Central, Philadelphia",0.01,0,39.964824,-75.156266
34,"Roxborough, Philadelphia",0.01,0,40.03798,-75.22306
17,"Market East, Philadelphia",0.01,0,39.960467,-75.22935
35,"Society Hill, Philadelphia",0.01,0,39.94437,-75.1477
36,South Philadelphia,0.01,0,39.96411,-75.16105
38,Southwest Philadelphia,0.01,0,39.91004,-75.18637


<H4>Visualizing the clusters.

In [35]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(phclusters)
ys = [i+x+(i*x)**2 for i in range(phclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(philamerged['Latitude'], philamerged['Longitude'], philamerged['Neighborhood'], philamerged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [36]:
#Map as HTML file
map_clusters.save('map_clusters.html')

In [37]:
philamerged.loc[philamerged['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Diner,Cluster Labels,Latitude,Longitude
48,"Wister, Philadelphia",0.01,0,40.03505,-75.15972
25,"Old City, Philadelphia",0.01,0,39.95009,-75.14507
23,"Northern Liberties, Philadelphia",0.01,0,39.96596,-75.1415
21,North Philadelphia,0.01,0,39.98175,-75.13383
20,"North Central, Philadelphia",0.01,0,39.964824,-75.156266
34,"Roxborough, Philadelphia",0.01,0,40.03798,-75.22306
17,"Market East, Philadelphia",0.01,0,39.960467,-75.22935
35,"Society Hill, Philadelphia",0.01,0,39.94437,-75.1477
36,South Philadelphia,0.01,0,39.96411,-75.16105
38,Southwest Philadelphia,0.01,0,39.91004,-75.18637


In [38]:
philamerged.loc[philamerged['Cluster Labels'] == 1]

Unnamed: 0,Neighborhood,Diner,Cluster Labels,Latitude,Longitude
39,"Spring Garden, Philadelphia",0.0,1,39.9655,-75.16974
46,"West Oak Lane, Philadelphia",0.0,1,40.05856,-75.1493
31,"Port Richmond, Philadelphia",0.0,1,39.98023,-75.09901
32,"Powelton Village, Philadelphia",0.0,1,39.96135,-75.19188
45,"Washington Square West, Philadelphia",0.0,1,39.94548,-75.15722
33,"Rittenhouse Square, Philadelphia",0.0,1,39.94711,-75.16943
42,"Templetown, Philadelphia",0.0,1,40.089303,-74.97822
41,"Strawberry Mansion, Philadelphia",0.0,1,39.994582,-75.192176
37,"Southwest Center City, Philadelphia",0.0,1,39.94095,-75.17962
30,"Poplar, Philadelphia",0.0,1,39.972069,-75.213304


In [39]:
philamerged.loc[philamerged['Cluster Labels'] == 2]

Unnamed: 0,Neighborhood,Diner,Cluster Labels,Latitude,Longitude
47,West Philadelphia,0.049383,2,40.053132,-75.028511


In [40]:
philamerged.loc[philamerged['Cluster Labels'] == 3]

Unnamed: 0,Neighborhood,Diner,Cluster Labels,Latitude,Longitude
11,"Holmesburg, Philadelphia",0.032967,3,40.04218,-75.02885
26,"Olney-Oak Lane, Philadelphia",0.025,3,40.054396,-75.011731


<H3>Conclusions

From this analysis, we can see that if we are looking to go to a diner in Philadelphia, there aren't any in Cluster 1, and very few in Cluster 0, 2, or 3. In terms of opportunity, this suggests that despite a very diverse and dense pool of eateries in Philadelphia, there is a definite opening in the market for diners.

It is important to note that by Diner, I am referring to an informal, inexpensive restaraunt that is open 24/7. This is an important distinction because as mentioned in the introduction, there are very few all night options other than fast food. And while a diner may be casual, it also allows for eat-in and full service experiences.

My original hypothesis suggested that University City (in Cluster 1) would be the best place for a diner due to the dense population of college students who would appreciate a diner for late night study fuel, a meeting place for group projects, and a midnight munchies spot for after parties or the morning after. After taking a look at the numbers, I can confirm that there is no competition in University City, and that it would be an ideal spot (in terms of local competition) to start a diner. And just as a final note, although I do believe that University City would be a good place to start a diner, the advantage of not having competition is not singular to this neighborhood in Philadelphia, so it could be recommended to start a diner in any of Cluster 1 or even 0, 2, or 3, facing the very scant competition in those latter areas.