## Segmenting and Clustering Neighborhoods in Philadelphia

## Introduction

This notebook explores neighborhoods in Philadelphia. After converting addresses (here, zip codes) into their equivalent latitude and longitude values, I used the Foursquare API's **explore** function to get the most common venue categories in each neighborhood, and then used those categories as features to group the neighborhoods into clusters with the *k*-means clustering algorithm. Finally, I used the Folium library to visualize the neighborhoods in Philadelphia and their emerging clusters.

In [5]:
# library to handle data in a vectorized manner
import numpy as np 

# library for data analysis
import pandas as pd 
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# library to handle JSON files
import json 

# install geopy; comment this line out if the package is already installed
!conda install -c conda-forge geopy --yes

# convert an address into latitude and longitude values
from geopy.geocoders import Nominatim

# library to handle requests
import requests

# transform JSON file into a pandas dataframe
# (pandas.io.json.json_normalize is deprecated since pandas 1.0)
from pandas import json_normalize

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

# install folium; comment this line out if the package is already installed
!conda install -c conda-forge folium=0.5.0 --yes 

# map rendering library
import folium 

print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Libraries imported.


In [6]:
dataframe = pd.read_csv('us-zip-code-latitude-and-longitude.csv', sep=';')

In [7]:
dataframe.head()

Unnamed: 0,Zip,City,State,Latitude,Longitude,Timezone,Daylight savings time flag,geopoint
0,19173,Philadelphia,PA,40.001811,-75.11787,-5,1,"40.001811,-75.11787"
1,19134,Philadelphia,PA,39.991712,-75.11116,-5,1,"39.991712,-75.11116"
2,19115,Philadelphia,PA,40.09261,-75.04118,-5,1,"40.09261,-75.04118"
3,19192,Philadelphia,PA,39.951112,-75.167622,-5,1,"39.951112,-75.167622"
4,19155,Philadelphia,PA,40.001811,-75.11787,-5,1,"40.001811,-75.11787"


In [8]:
dataframe = dataframe.sort_values(by=['Zip'], ignore_index=True)

In [9]:
dataframe.head()

Unnamed: 0,Zip,City,State,Latitude,Longitude,Timezone,Daylight savings time flag,geopoint
0,17959,New Philadelphia,PA,40.731739,-76.1278,-5,1,"40.731739,-76.1278"
1,19019,Philadelphia,PA,40.001811,-75.11787,-5,1,"40.001811,-75.11787"
2,19092,Philadelphia,PA,40.001811,-75.11787,-5,1,"40.001811,-75.11787"
3,19093,Philadelphia,PA,40.001811,-75.11787,-5,1,"40.001811,-75.11787"
4,19099,Philadelphia,PA,40.001811,-75.11787,-5,1,"40.001811,-75.11787"


In [10]:
with open('us-zip-code-latitude-and-longitude.json') as json_data:
    geo_data = json.load(json_data)

In [11]:
geo_data[0]

{'datasetid': 'us-zip-code-latitude-and-longitude',
 'recordid': 'b3b6cbba591fc334ae3f8bf56261ea1deb50a73b',
 'fields': {'city': 'Philadelphia',
  'zip': '19173',
  'dst': 1,
  'geopoint': [40.001811, -75.11787],
  'longitude': -75.11787,
  'state': 'PA',
  'latitude': 40.001811,
  'timezone': -5},
 'geometry': {'type': 'Point', 'coordinates': [-75.11787, 40.001811]},
 'record_timestamp': '2018-02-09T11:33:38.603-05:00'}

In [12]:
column_names = ['ZipCode', 'Latitude', 'Longitude'] 

coords3 = pd.DataFrame(columns=column_names)

In [13]:
coords3.head()

Unnamed: 0,ZipCode,Latitude,Longitude


In [14]:
# DataFrame.append was removed in pandas 2.0, so collect the rows in a list first
rows = []
for data in geo_data:
    # GeoJSON coordinates are ordered [longitude, latitude]
    zipcode_lon, zipcode_lat = data['geometry']['coordinates']
    rows.append({'ZipCode': data['fields']['zip'],
                 'Latitude': zipcode_lat,
                 'Longitude': zipcode_lon})

coords3 = pd.DataFrame(rows, columns=column_names)

In [15]:
coords3 = coords3.sort_values(by=['ZipCode'], ignore_index=True)

In [16]:
coords3['ZipCode'] = coords3['ZipCode'].astype(int)

In [17]:
coords3.dtypes

ZipCode        int64
Latitude     float64
Longitude    float64
dtype: object

In [18]:
coords3.head()

Unnamed: 0,ZipCode,Latitude,Longitude
0,17959,40.731739,-76.1278
1,19019,40.001811,-75.11787
2,19092,40.001811,-75.11787
3,19093,40.001811,-75.11787
4,19099,40.001811,-75.11787


In [19]:
neighborhoods = pd.read_excel('zipcodes2.xlsx')

In [20]:
neighborhoods.dtypes

ZipCode          int64
Section         object
Neighborhood    object
dtype: object

In [21]:
neighborhoods.head()

Unnamed: 0,ZipCode,Section,Neighborhood
0,19102,Center City,"Rittenhouse Square, Penn Center"
1,19103,Center City,"Avenue of the Arts, Fitler Square, French Quar..."
2,19104,West,"30th Street Station, Belmont Village, Haverfor..."
3,19106,Center City,"Elfreth's Alley, Franklin Square , Old City, P..."
4,19107,Center City,"Avenue of the Arts, Callowhill, Chinatown, Jew..."


In [22]:
# merge on the shared ZipCode column (pd.merge performs an inner join by default)
neighborhoods_all = pd.merge(neighborhoods, coords3, on='ZipCode')
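
Two details of this step are easy to miss: `set_index` returns a new dataframe rather than modifying the original, and an inner join silently drops zip codes that have no coordinates. The sketch below illustrates both on toy data; the `toy_*` frames and the zip code 19999 are made up for illustration and are not part of the notebook's data.

```python
import pandas as pd

# Toy stand-ins for the notebook's neighborhoods and coords3 dataframes.
toy_neighborhoods = pd.DataFrame({
    "ZipCode": [19102, 19103, 19999],
    "Section": ["Center City", "Center City", "Hypothetical"],
})
toy_coords = pd.DataFrame({
    "ZipCode": [19102, 19103],
    "Latitude": [39.952962, 39.952162],
    "Longitude": [-75.16558, -75.17406],
})

# set_index returns a new dataframe; the original keeps its default index
# unless the result is assigned back.
toy_coords.set_index("ZipCode")
print(toy_coords.index.name)  # None

# A left join with indicator=True exposes zip codes that have no coordinates.
checked = pd.merge(toy_neighborhoods, toy_coords, on="ZipCode", how="left", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "ZipCode"].tolist()
print(unmatched)
```

Running a check like this before the inner join makes it clear whether any rows of `neighborhoods` would be lost.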

In [23]:
neighborhoods_all.dtypes

ZipCode           int64
Section          object
Neighborhood     object
Latitude        float64
Longitude       float64
dtype: object

In [24]:
neighborhoods_all.head()

Unnamed: 0,ZipCode,Section,Neighborhood,Latitude,Longitude
0,19102,Center City,"Rittenhouse Square, Penn Center",39.952962,-75.16558
1,19103,Center City,"Avenue of the Arts, Fitler Square, French Quar...",39.952162,-75.17406
2,19104,West,"30th Street Station, Belmont Village, Haverfor...",39.961612,-75.19957
3,19106,Center City,"Elfreth's Alley, Franklin Square , Old City, P...",39.951062,-75.14589
4,19107,Center City,"Avenue of the Arts, Callowhill, Chinatown, Jew...",39.952112,-75.15853


In [25]:
print('The dataframe has {} sections and {} neighborhoods.'.format(
        len(neighborhoods_all['Section'].unique()),
        neighborhoods_all.shape[0]
    )
)

The dataframe has 7 sections and 47 neighborhoods.


In [26]:
sections = neighborhoods_all['Section'].unique().tolist()

#### Simplifying the map by restricting the dataframe to neighborhoods in the Center City, North, South and West sections of Philadelphia

In [27]:
philly_neighborhoods = neighborhoods_all.loc[neighborhoods_all['Section'].isin(['Center City', 'North', 'South', 'West'])]

In [28]:
philly_neighborhoods.head()

Unnamed: 0,ZipCode,Section,Neighborhood,Latitude,Longitude
0,19102,Center City,"Rittenhouse Square, Penn Center",39.952962,-75.16558
1,19103,Center City,"Avenue of the Arts, Fitler Square, French Quar...",39.952162,-75.17406
2,19104,West,"30th Street Station, Belmont Village, Haverfor...",39.961612,-75.19957
3,19106,Center City,"Elfreth's Alley, Franklin Square , Old City, P...",39.951062,-75.14589
4,19107,Center City,"Avenue of the Arts, Callowhill, Chinatown, Jew...",39.952112,-75.15853


#### Using the geopy library to get the latitude and longitude values of Philadelphia

In [29]:
address = 'Philadelphia, PA'

geolocator = Nominatim(user_agent="philly_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Philadelphia are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Philadelphia are 39.9527237, -75.1635262.
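
geopy's `geocode` returns `None` when it finds no match, in which case `location.latitude` raises an `AttributeError`. The following is a minimal sketch of a guard. `resolve_coordinates`, `FakeLocation` and the two stand-in geocoders are hypothetical names introduced here so that the example runs without network access.

```python
from collections import namedtuple

# Stand-in for the object returned by a geocoder (hypothetical, for illustration only).
FakeLocation = namedtuple("FakeLocation", ["latitude", "longitude"])


def resolve_coordinates(geocode, address):
    """Return (latitude, longitude) for address, or raise if nothing was found.

    `geocode` is any callable with the shape of geolocator.geocode.
    """
    location = geocode(address)
    if location is None:
        raise ValueError("No geocoding result for {!r}".format(address))
    return location.latitude, location.longitude


# Offline stand-ins; in the notebook you would pass geolocator.geocode instead.
def found(address):
    return FakeLocation(39.9527237, -75.1635262)


def not_found(address):
    return None


coords = resolve_coordinates(found, "Philadelphia, PA")
print(coords)

try:
    resolve_coordinates(not_found, "Nowhere, ZZ")
    missing_handled = False
except ValueError:
    missing_handled = True
```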


#### Visualizing a map of Philadelphia with neighborhoods superimposed on top

In [30]:
# create map of Philadelphia using latitude and longitude values
map_philly = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, section, neighborhood in zip(philly_neighborhoods['Latitude'], 
                                           philly_neighborhoods['Longitude'], 
                                           philly_neighborhoods['Section'], 
                                           philly_neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, section)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_philly)  
    
map_philly

#### Using Foursquare to extract venue data

##### Defining Foursquare credentials and version

In [31]:
CLIENT_ID = '2PPN5LI1MD2CSFPZX4AIJTYXXWBLHZNMKOTOV3G2PVFXFIFH' 
CLIENT_SECRET = 'KWCAELEQ50P5QJLR5KY23XPPIGRSC3IGTUDWTZAOTMN5WMQA'
VERSION = '20180604'
LIMIT = 30
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: 2PPN5LI1MD2CSFPZX4AIJTYXXWBLHZNMKOTOV3G2PVFXFIFH
CLIENT_SECRET:KWCAELEQ50P5QJLR5KY23XPPIGRSC3IGTUDWTZAOTMN5WMQA
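
Hard-coding and printing API credentials makes them easy to leak when a notebook is shared. One possible alternative, sketched below, reads them from environment variables and masks them when printing. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helpers `load_credentials` and `mask` are assumptions made for this example, not something Foursquare prescribes.

```python
import os


def load_credentials(env=os.environ):
    """Read Foursquare credentials from environment variables.

    The variable names are assumptions made for this sketch.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError("Missing environment variable: {}".format(missing)) from None


def mask(secret, visible=4):
    """Show only the first few characters of a secret when logging."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)


# Demonstration with a plain dict standing in for os.environ.
demo_env = {
    "FOURSQUARE_CLIENT_ID": "demo-id-1234",
    "FOURSQUARE_CLIENT_SECRET": "demo-secret-5678",
}
client_id, client_secret = load_credentials(demo_env)
print("CLIENT_ID: " + mask(client_id))
```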


#### Exploring the first neighborhood in the dataframe

Name of the first neighborhood:

In [32]:
philly_neighborhoods.loc[0, 'Neighborhood']

'Rittenhouse Square, Penn Center'

Getting the neighborhood's latitude and longitude values.

In [33]:
neighborhood_latitude = philly_neighborhoods.loc[0, 'Latitude']
neighborhood_longitude = philly_neighborhoods.loc[0, 'Longitude']

neighborhood_name = philly_neighborhoods.loc[0, 'Neighborhood']

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Rittenhouse Square, Penn Center are 39.952962, -75.16558.


#### Getting top 100 venues that are in 'Rittenhouse Square, Penn Center' within a radius of 500 meters

In [34]:
LIMIT = 100 

radius = 500 

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url

'https://api.foursquare.com/v2/venues/explore?&client_id=2PPN5LI1MD2CSFPZX4AIJTYXXWBLHZNMKOTOV3G2PVFXFIFH&client_secret=KWCAELEQ50P5QJLR5KY23XPPIGRSC3IGTUDWTZAOTMN5WMQA&v=20180604&ll=39.952962,-75.16558&radius=500&limit=100'
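
The format string above leaves a stray `&` after the `?` and does no escaping of parameter values. As a hedged alternative, the query string can be built from a dictionary with the standard library. `build_explore_url` and the `DEMO_*` values are hypothetical and only illustrate the idea.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build the explore URL from a parameter dict instead of string formatting."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in the ll value readable instead of encoding it as %2C
    return EXPLORE_ENDPOINT + "?" + urlencode(params, safe=",")


demo_url = build_explore_url("DEMO_ID", "DEMO_SECRET", "20180604", 39.952962, -75.16558, 500, 100)
print(demo_url)
```

With `requests`, the same dictionary can instead be passed directly as `requests.get(EXPLORE_ENDPOINT, params=params)`, which performs the encoding itself.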

#### Sending the GET request and examining the results

In [None]:
results = requests.get(url).json()
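
The later cells index straight into `results['response']['groups'][0]['items']`, which fails with a `KeyError` or `IndexError` if the request was rejected or returned nothing. A defensive sketch follows. The `meta`/`code` check is an assumption about the response envelope, `extract_items` is a hypothetical helper, and the payloads are hand-written rather than real API responses.

```python
def extract_items(payload):
    """Return the venue items from an explore response, or an empty list.

    The 'meta'/'code' check is an assumption about the response envelope;
    the response -> groups -> items path matches the notebook's own indexing.
    """
    code = payload.get("meta", {}).get("code")
    if code is not None and code != 200:
        raise RuntimeError("Foursquare request failed with code {}".format(code))
    groups = payload.get("response", {}).get("groups", [])
    if not groups:
        return []
    return groups[0].get("items", [])


# Hand-written payloads for illustration; these are not real API responses.
ok_payload = {
    "meta": {"code": 200},
    "response": {"groups": [{"items": [{"venue": {"name": "Example Park"}}]}]},
}
empty_payload = {"meta": {"code": 200}, "response": {}}

items = extract_items(ok_payload)
print(len(items), "item(s)")
print(extract_items(empty_payload))
```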

#### Extracting Venue Category

In [36]:
def get_category_type(row):
    # the category list lives under 'categories' or, after flattening, 'venue.categories'
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

#### Clean the json and structure it into a pandas dataframe

In [37]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()


Unnamed: 0,name,categories,lat,lng
0,Dilworth Park,Park,39.952772,-75.164723
1,La Colombe Coffee Roasters,Coffee Shop,39.951659,-75.165238
2,One Liberty Observation Deck,Scenic Lookout,39.95274,-75.168068
3,City Hall Courtyard,Plaza,39.952484,-75.163592
4,JFK Plaza / Love Park,Plaza,39.954123,-75.165303


#### Number of venues returned by Foursquare

In [38]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

100 venues were returned by Foursquare.


### Exploring Neighborhoods in Philadelphia

#### Creating a function to repeat the same process for all the neighborhoods in the selected sections of Philadelphia

In [39]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
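
The list comprehension in this function reads `v['venue']['categories'][0]['name']`, which raises an `IndexError` for any venue whose category list is empty. A minimal sketch of a more tolerant flattening step is shown below. `flatten_items`, the `"Uncategorized"` placeholder and the sample items are hypothetical and only illustrate the guard.

```python
def flatten_items(name, lat, lng, items, default_category="Uncategorized"):
    """Turn explore items into row tuples, tolerating venues with no category."""
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else default_category
        rows.append((
            name, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return rows


# Hand-written items for illustration; the second venue has no category.
sample_items = [
    {"venue": {"name": "Example Cafe", "location": {"lat": 39.95, "lng": -75.16},
               "categories": [{"name": "Café"}]}},
    {"venue": {"name": "Unlabelled Spot", "location": {"lat": 39.96, "lng": -75.17},
               "categories": []}},
]

rows = flatten_items("Demo Neighborhood", 39.952962, -75.16558, sample_items)
for row in rows:
    print(row)
```

Each tuple has the same seven fields, in the same order, as the columns assigned to `nearby_venues`.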

#### Code below runs the above function on each neighborhood and creates a new dataframe with all venues in selected Philadelphia neighborhoods (philadelphia_venues)

In [40]:
philadelphia_venues = getNearbyVenues(names=philly_neighborhoods['Neighborhood'],
                                   latitudes=philly_neighborhoods['Latitude'],
                                   longitudes=philly_neighborhoods['Longitude']
                                  )

Rittenhouse Square, Penn Center
Avenue of the Arts, Fitler Square, French Quarter, Logan Square, Penn Center
30th Street Station, Belmont Village, Haverford North, Mantua, Parkside, Powelton Village, Saunders Park, Spruce Hill, University City, Woodland Terrace
Elfreth's Alley, Franklin Square , Old City, Penn's Landing, Society Hill
Avenue of the Arts, Callowhill, Chinatown, Jewelers Row, Midtown Village , Washington Square West
Navy Yard
East Oak Lane, Feltonville, Koreatown, Olney
Brewerytown, Cecil B. Moore, Ludlow, Poplar, Sharswood
Hartranft, Olde Kensington, West Kensington, Yorktown
Northern Liberties, Callowhill
East Oak Lane, Oak Lane
Art Museum, Fairmount, Francisville, Spring Garden, Staton
Cathedral Park, Haddington, Mill Creek, Wynnefield, Wynnefield Heights
Allegheny West, Glenwood, South Lehigh, Strawberry Mansion
Fairhill, Glenwood
Walnut Hill, Belfield, Ogontz
Cedar Park, Cobbs Creek, Dunlap, Garden Court, Haddington, Mill Creek, Spruce Hill, University City
Feltonvil

#### Checking the size of resulting dataframe

In [41]:
print(philadelphia_venues.shape)
philadelphia_venues.head()

(789, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Rittenhouse Square, Penn Center",39.952962,-75.16558,Dilworth Park,39.952772,-75.164723,Park
1,"Rittenhouse Square, Penn Center",39.952962,-75.16558,La Colombe Coffee Roasters,39.951659,-75.165238,Coffee Shop
2,"Rittenhouse Square, Penn Center",39.952962,-75.16558,One Liberty Observation Deck,39.95274,-75.168068,Scenic Lookout
3,"Rittenhouse Square, Penn Center",39.952962,-75.16558,City Hall Courtyard,39.952484,-75.163592,Plaza
4,"Rittenhouse Square, Penn Center",39.952962,-75.16558,JFK Plaza / Love Park,39.954123,-75.165303,Plaza


#### Checking how many venues were returned for each neighborhood

In [42]:
philadelphia_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"30th Street Station, Belmont Village, Haverford North, Mantua, Parkside, Powelton Village, Saunders Park, Spruce Hill, University City, Woodland Terrace",15,15,15,15,15,15
"Allegheny West, Glenwood, South Lehigh, Strawberry Mansion",11,11,11,11,11,11
"Angora, Bartram Village, Cedar Park, Cobbs Creek, Garden Court, Spruce Hill, Squirrel Hill, University City",8,8,8,8,8,8
"Art Museum, Fairmount, Francisville, Spring Garden, Staton",22,22,22,22,22,22
"Avenue of Technology, Carroll Park, Haddington, Overbrook, Overbrook Farms, Overbrook Park",6,6,6,6,6,6
"Avenue of the Arts, Callowhill, Chinatown, Jewelers Row, Midtown Village , Washington Square West",100,100,100,100,100,100
"Avenue of the Arts, Fitler Square, French Quarter, Logan Square, Penn Center",100,100,100,100,100,100
"Belfield, East Oak Lane, Fern Rock, Logan, Ogontz",5,5,5,5,5,5
"Bella Vista , Dickinson , Fabric Row, Hawthorne, Italian Market, Little Saigon, Pennsport, Queen Village, South Street, Southwark, Wharton",77,77,77,77,77,77
"Brewerytown, Cecil B. Moore, Ludlow, Poplar, Sharswood",12,12,12,12,12,12


#### Counting the unique categories among all the returned venues

In [43]:
print('There are {} unique categories.'.format(len(philadelphia_venues['Venue Category'].unique())))

There are 189 unique categories.


In [44]:
# print the first 100 categories
philadelphia_venues['Venue Category'].unique()[:100]

array(['Park', 'Coffee Shop', 'Scenic Lookout', 'Plaza', 'Hotel',
       'Salad Place', 'Steakhouse', 'Clothing Store', 'Skating Rink',
       'Movie Theater', 'Miscellaneous Shop', 'American Restaurant',
       'Chocolate Shop', 'Seafood Restaurant', 'Taco Place',
       'Dessert Shop', 'New American Restaurant', 'Yoga Studio',
       'Pizza Place', 'Concert Hall', 'Grocery Store',
       'French Restaurant', 'Public Art', 'Lingerie Store',
       'Breakfast Spot', 'Restaurant', 'Churrascaria', 'Burger Joint',
       'Israeli Restaurant', 'Sporting Goods Shop', 'Donut Shop',
       'Italian Restaurant', 'Art Museum', 'Arts & Crafts Store',
       'Vegetarian / Vegan Restaurant', 'Food Service',
       'Chinese Restaurant', 'Spa', 'Theme Park', 'Cosmetics Shop',
       'Latin American Restaurant', 'Sandwich Place', 'Pharmacy',
       'Gourmet Shop', 'Smoke Shop', 'Café', 'Pub', 'Burrito Place',
       'Bakery', 'Mediterranean Restaurant', 'General Entertainment',
       'Deli / Bodega'

#### Analyzing Each Neighborhood

In [45]:
# one hot encoding
philadelphia_onehot = pd.get_dummies(philadelphia_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
philadelphia_onehot['Neighborhood'] = philadelphia_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [philadelphia_onehot.columns[-1]] + list(philadelphia_onehot.columns[:-1])
philadelphia_onehot = philadelphia_onehot[fixed_columns]

philadelphia_onehot.head()

Unnamed: 0,Neighborhood,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Beer Bar,Beer Garden,Big Box Store,Bistro,Bookstore,Boutique,Boxing Gym,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Station,Business Service,Butcher,Café,Cajun / Creole Restaurant,Cambodian Restaurant,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Churrascaria,Clothing Store,Cocktail Bar,Coffee Shop,Colombian Restaurant,Comedy Club,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Costume Shop,Cuban Restaurant,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dive Bar,Dog Run,Donut Shop,Drugstore,English Restaurant,Ethiopian Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Flower Shop,Food,Food Service,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Gas Station,Gastropub,General Entertainment,Gourmet Shop,Government Building,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Hawaiian Restaurant,Health & Beauty Service,Historic Site,History Museum,Home Service,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jewelry Store,Jewish Restaurant,Juice Bar,Karaoke Bar,Kids Store,Korean Restaurant,Latin American Restaurant,Lingerie Store,Liquor Store,Lounge,Mac & Cheese Joint,Malay Restaurant,Market,Massage Studio,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Movie Theater,Museum,Music Venue,Nail Salon,National Park,New American Restaurant,Noodle House,Optical Shop,Organic Grocery,Park,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Photography Studio,Piano Bar,Pizza Place,Platform,Playground,Plaza,Poke Place,Pool Hall,Pub,Public Art,Rental Car Location,Restaurant,Rock Club,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shanghai Restaurant,Shoe Store,Shopping Mall,Shopping Plaza,Skating Rink,Smoke Shop,Snack Place,Southern / Soul Food Restaurant,Spa,Speakeasy,Sporting Goods Shop,Steakhouse,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Park,Thrift / Vintage Store,Tourist Information Center,Toy / Game Store,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,"Rittenhouse Square, Penn Center",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"Rittenhouse Square, Penn Center",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,"Rittenhouse Square, Penn Center",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,"Rittenhouse Square, Penn Center",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,"Rittenhouse Square, Penn Center",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


Examining the new dataframe size.

In [46]:
philadelphia_onehot.shape

(789, 190)

#### Grouping rows by neighborhood and taking the mean of the frequency of occurrence of each category

In [47]:
philadelphia_grouped = philadelphia_onehot.groupby('Neighborhood').mean().reset_index()
philadelphia_grouped.head()

Unnamed: 0,Neighborhood,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Beer Bar,Beer Garden,Big Box Store,Bistro,Bookstore,Boutique,Boxing Gym,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Station,Business Service,Butcher,Café,Cajun / Creole Restaurant,Cambodian Restaurant,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Churrascaria,Clothing Store,Cocktail Bar,Coffee Shop,Colombian Restaurant,Comedy Club,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Costume Shop,Cuban Restaurant,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dive Bar,Dog Run,Donut Shop,Drugstore,English Restaurant,Ethiopian Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Flower Shop,Food,Food Service,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Gas Station,Gastropub,General Entertainment,Gourmet Shop,Government Building,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Hawaiian Restaurant,Health & Beauty Service,Historic Site,History Museum,Home Service,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jewelry Store,Jewish Restaurant,Juice Bar,Karaoke Bar,Kids Store,Korean Restaurant,Latin American Restaurant,Lingerie Store,Liquor Store,Lounge,Mac & Cheese Joint,Malay Restaurant,Market,Massage Studio,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Movie Theater,Museum,Music Venue,Nail Salon,National Park,New American Restaurant,Noodle House,Optical Shop,Organic Grocery,Park,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Photography Studio,Piano Bar,Pizza Place,Platform,Playground,Plaza,Poke Place,Pool Hall,Pub,Public Art,Rental Car Location,Restaurant,Rock Club,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shanghai Restaurant,Shoe Store,Shopping Mall,Shopping Plaza,Skating Rink,Smoke Shop,Snack Place,Southern / Soul Food Restaurant,Spa,Speakeasy,Sporting Goods Shop,Steakhouse,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Park,Thrift / Vintage Store,Tourist Information Center,Toy / Game Store,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,"30th Street Station, Belmont Village, Haverfor...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.066667,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.066667,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Allegheny West, Glenwood, South Lehigh, Strawb...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.272727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Angora, Bartram Village, Cedar Park, Cobbs Cre...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Art Museum, Fairmount, Francisville, Spring Ga...",0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.045455
4,"Avenue of Technology, Carroll Park, Haddington...",0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.166667,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
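
Because every one-hot column holds only 0s and 1s, its mean within a neighborhood is simply the share of that neighborhood's venues that fall in the category. The sketch below cross-checks that interpretation without pandas, using made-up `(neighborhood, category)` pairs. The toy data and variable names are assumptions for illustration only.

```python
from collections import Counter, defaultdict

# Toy (neighborhood, category) pairs standing in for philadelphia_venues rows.
toy_venues = [
    ("A", "Pizza Place"), ("A", "Pizza Place"), ("A", "Coffee Shop"), ("A", "Park"),
    ("B", "Grocery Store"), ("B", "Park"),
]

counts = defaultdict(Counter)
for neighborhood, category in toy_venues:
    counts[neighborhood][category] += 1

# Dividing each count by the neighborhood's venue total gives the same number
# that the mean of a 0/1 one-hot column gives.
frequencies = {
    neighborhood: {cat: n / sum(counter.values()) for cat, n in counter.items()}
    for neighborhood, counter in counts.items()
}

for neighborhood, freq in frequencies.items():
    print(neighborhood, freq)
```

One consequence worth keeping in mind: a neighborhood with only a handful of venues gets large frequencies from single venues, so its profile is noisier than that of a neighborhood with 100 venues.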


#### Confirming new dataframe size

In [48]:
philadelphia_grouped.shape

(25, 190)

#### Printing each neighborhood along with the top 5 most common venues

In [49]:
num_top_venues = 5

for hood in philadelphia_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = philadelphia_grouped[philadelphia_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----30th Street Station, Belmont Village, Haverford North, Mantua, Parkside, Powelton Village, Saunders Park, Spruce Hill, University City, Woodland Terrace----
               venue  freq
0        Pizza Place  0.13
1  Mobile Phone Shop  0.07
2         Hookah Bar  0.07
3   Greek Restaurant  0.07
4        Coffee Shop  0.07


----Allegheny West, Glenwood, South Lehigh, Strawberry Mansion----
                  venue  freq
0         Grocery Store  0.27
1      Video Game Store  0.09
2             Drugstore  0.09
3  Fast Food Restaurant  0.09
4          Liquor Store  0.09


----Angora, Bartram Village, Cedar Park, Cobbs Creek, Garden Court, Spruce Hill, Squirrel Hill, University City----
                venue  freq
0        Intersection  0.25
1            Pharmacy  0.12
2         Supermarket  0.12
3  Chinese Restaurant  0.12
4              Bakery  0.12


----Art Museum, Fairmount, Francisville, Spring Garden, Staton----
                   venue  freq
0    American Restaurant  0.09
1     Itali

#### Putting that into a *pandas* dataframe.

Writing a function that sorts one neighborhood's venue categories by frequency in descending order and returns the names of the top ones.

In [50]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
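
To see what the function returns, here is a minimal sketch that applies it to one made-up row. The neighborhood name and the frequencies are hypothetical and are not taken from the Philadelphia data.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # skip the first entry (the neighborhood name), then rank the rest
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# made-up frequencies for one hypothetical neighborhood
toy_row = pd.Series(
    {'Neighborhood': 'Toy Hood', 'Bakery': 0.1, 'Bar': 0.5, 'Park': 0.0, 'Pizza Place': 0.4}
)

top3 = list(return_most_common_venues(toy_row, 3))
print(top3)
```

The toy row has no ties. In the real data many categories share the same frequency, and the relative order of tied categories is not guaranteed, which is probably why the top-5 printout and the top-10 table can list tied venues in a different order.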

Creating the new dataframe and displaying the top 10 venues for each neighborhood.

In [51]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        # 1st, 2nd, 3rd take their own suffix
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # every later position falls back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
philadelphia_venues_sorted = pd.DataFrame(columns=columns)
philadelphia_venues_sorted['Neighborhood'] = philadelphia_grouped['Neighborhood']

for ind in np.arange(philadelphia_grouped.shape[0]):
    philadelphia_venues_sorted.iloc[ind, 1:] = return_most_common_venues(philadelphia_grouped.iloc[ind, :], num_top_venues)

philadelphia_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"30th Street Station, Belmont Village, Haverfor...",Pizza Place,Cosmetics Shop,Bubble Tea Shop,Greek Restaurant,Mobile Phone Shop,Deli / Bodega,Photography Studio,Piano Bar,Coffee Shop,Chinese Restaurant
1,"Allegheny West, Glenwood, South Lehigh, Strawb...",Grocery Store,Drugstore,Bakery,Breakfast Spot,Fast Food Restaurant,Boutique,Liquor Store,Business Service,Video Game Store,Discount Store
2,"Angora, Bartram Village, Cedar Park, Cobbs Cre...",Intersection,Southern / Soul Food Restaurant,Pharmacy,Supermarket,Bakery,Discount Store,Chinese Restaurant,Drugstore,Filipino Restaurant,Field
3,"Art Museum, Fairmount, Francisville, Spring Ga...",American Restaurant,Bar,Italian Restaurant,Home Service,Playground,Pet Store,Monument / Landmark,Mexican Restaurant,Intersection,Greek Restaurant
4,"Avenue of Technology, Carroll Park, Haddington...",American Restaurant,Pizza Place,Playground,Pharmacy,Seafood Restaurant,Cosmetics Shop,Hawaiian Restaurant,Diner,Ethiopian Restaurant,English Restaurant
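
The column labels above rely on a list lookup that falls back to 'th' once the list runs out, which is fine for 10 columns but would produce labels such as "21th" for longer lists. A small sketch of a more general suffix helper follows; `ordinal` is a hypothetical name introduced here, not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    # numbers ending in 11, 12 and 13 take 'th' even though they end in 1, 2, 3
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

# rebuild the same column list as the cell above
num_top_venues = 10
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```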


#### Clustering Neighborhoods.

In [52]:
# set number of clusters
kclusters = 5
# k-means needs numeric features only, so drop the neighborhood name column
philadelphia_grouped_clustering = philadelphia_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(philadelphia_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 0, 3, 1, 1, 1, 1, 1, 1, 4], dtype=int32)
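
A quick way to judge how evenly k-means split the rows is to count how often each label occurs. The sketch below uses only the ten labels printed above, so it illustrates the idea rather than the full result; on the real array the same count could be taken over all of `kmeans.labels_`.

```python
from collections import Counter

# the first ten labels exactly as printed by kmeans.labels_[0:10]
first_ten_labels = [1, 0, 3, 1, 1, 1, 1, 1, 1, 4]

label_counts = Counter(first_ten_labels)
for label, count in sorted(label_counts.items()):
    print('label {}: {} row(s)'.format(label, count))
```

Label 1 dominates even in this small slice, and label 2 does not appear in it at all.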

In [53]:
# add clustering labels
philadelphia_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

philadelphia_merged = philly_neighborhoods

# join philadelphia_venues_sorted onto philly_neighborhoods so each neighborhood keeps its latitude/longitude and gains its cluster label and top venues
philadelphia_merged = philadelphia_merged.join(philadelphia_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

philadelphia_merged.head() # check the last columns!

Unnamed: 0,ZipCode,Section,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,19102,Center City,"Rittenhouse Square, Penn Center",39.952962,-75.16558,1,Hotel,Coffee Shop,Bakery,Yoga Studio,Seafood Restaurant,Cosmetics Shop,Spa,Italian Restaurant,Pub,Scenic Lookout
1,19103,Center City,"Avenue of the Arts, Fitler Square, French Quar...",39.952162,-75.17406,1,American Restaurant,Sushi Restaurant,Deli / Bodega,Seafood Restaurant,New American Restaurant,Italian Restaurant,Hotel,Coffee Shop,Clothing Store,Bar
2,19104,West,"30th Street Station, Belmont Village, Haverfor...",39.961612,-75.19957,1,Pizza Place,Cosmetics Shop,Bubble Tea Shop,Greek Restaurant,Mobile Phone Shop,Deli / Bodega,Photography Studio,Piano Bar,Coffee Shop,Chinese Restaurant
3,19106,Center City,"Elfreth's Alley, Franklin Square , Old City, P...",39.951062,-75.14589,1,History Museum,Historic Site,Coffee Shop,New American Restaurant,Italian Restaurant,Hotel,American Restaurant,Café,Bar,Art Gallery
4,19107,Center City,"Avenue of the Arts, Callowhill, Chinatown, Jew...",39.952112,-75.15853,1,Bakery,Hotel,Sandwich Place,Burger Joint,Chinese Restaurant,Ice Cream Shop,Convenience Store,Mediterranean Restaurant,Pub,Hot Dog Joint
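
The `join` above is a left join, so any neighborhood without a matching row in the sorted-venues frame would get `NaN` in `Cluster Labels`, and the column would turn into floats. The sketch below shows that behaviour and one possible cleanup on made-up frames; the frame names and values are hypothetical stand-ins, and whether the Philadelphia data actually contains such rows is not shown in this section.

```python
import pandas as pd

# hypothetical stand-ins for philly_neighborhoods and philadelphia_venues_sorted
neighborhoods = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C'],
    'Latitude': [39.95, 39.96, 39.97],
})
venues_sorted = pd.DataFrame({
    'Cluster Labels': [1, 0],
    'Neighborhood': ['A', 'B'],
    '1st Most Common Venue': ['Hotel', 'Grocery Store'],
})

merged = neighborhoods.join(venues_sorted.set_index('Neighborhood'), on='Neighborhood')
# 'C' has no match, so its label is NaN and the column is float
print(merged['Cluster Labels'].tolist())

# drop unmatched rows and restore integer labels before using them as list indices
cleaned = merged.dropna(subset=['Cluster Labels']).copy()
cleaned['Cluster Labels'] = cleaned['Cluster Labels'].astype(int)
print(cleaned['Cluster Labels'].tolist())
```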


In [54]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(philadelphia_merged['Latitude'], 
                                  philadelphia_merged['Longitude'], 
                                  philadelphia_merged['Neighborhood'], 
                                  philadelphia_merged['Cluster Labels']):
    
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
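
One detail of the map cell is easy to miss: the markers are colored with `rainbow[cluster-1]`, and because k-means labels start at 0, label 0 is looked up at index -1, which in Python means the last entry of the list. Each cluster still gets its own color, only shifted by one position. The sketch below demonstrates the shift with placeholder strings in place of the real hex colors; `palette` is a hypothetical stand-in for `rainbow`.

```python
kclusters = 5

# placeholder names standing in for the hex strings in `rainbow`
palette = ['color_{}'.format(i) for i in range(kclusters)]

# index used in the map cell: rainbow[cluster-1]
shifted = {label: palette[label - 1] for label in range(kclusters)}

# direct alternative: rainbow[cluster]
direct = {label: palette[label] for label in range(kclusters)}

print(shifted)
print(direct)
```

Either indexing works for the map. The direct form may be easier to reason about when matching marker colors to label numbers.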

#### Cluster 1 (label 0)

In [55]:
philadelphia_merged.loc[philadelphia_merged['Cluster Labels'] == 0, philadelphia_merged.columns[[1] + list(range(5, philadelphia_merged.shape[1]))]]

Unnamed: 0,Section,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
24,North,0,Grocery Store,Drugstore,Bakery,Breakfast Spot,Fast Food Restaurant,Boutique,Liquor Store,Business Service,Video Game Store,Discount Store
25,North,0,Fast Food Restaurant,Grocery Store,Pizza Place,Pharmacy,Park,Discount Store,Donut Shop,Shopping Mall,Men's Store,Fried Chicken Joint
31,West,0,Platform,Food,Pharmacy,Pizza Place,Fast Food Restaurant,Breakfast Spot,Grocery Store,Discount Store,Lounge,Caribbean Restaurant


#### Cluster 2 (label 1)

In [56]:
philadelphia_merged.loc[philadelphia_merged['Cluster Labels'] == 1, philadelphia_merged.columns[[1] + list(range(5, philadelphia_merged.shape[1]))]]

Unnamed: 0,Section,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Center City,1,Hotel,Coffee Shop,Bakery,Yoga Studio,Seafood Restaurant,Cosmetics Shop,Spa,Italian Restaurant,Pub,Scenic Lookout
1,Center City,1,American Restaurant,Sushi Restaurant,Deli / Bodega,Seafood Restaurant,New American Restaurant,Italian Restaurant,Hotel,Coffee Shop,Clothing Store,Bar
2,West,1,Pizza Place,Cosmetics Shop,Bubble Tea Shop,Greek Restaurant,Mobile Phone Shop,Deli / Bodega,Photography Studio,Piano Bar,Coffee Shop,Chinese Restaurant
3,Center City,1,History Museum,Historic Site,Coffee Shop,New American Restaurant,Italian Restaurant,Hotel,American Restaurant,Café,Bar,Art Gallery
4,Center City,1,Bakery,Hotel,Sandwich Place,Burger Joint,Chinese Restaurant,Ice Cream Shop,Convenience Store,Mediterranean Restaurant,Pub,Hot Dog Joint
12,North,1,Seafood Restaurant,Donut Shop,Shoe Store,Bar,Shopping Plaza,Korean Restaurant,Kids Store,Sporting Goods Shop,Karaoke Bar,Supplement Shop
14,North,1,Restaurant,Grocery Store,Café,Park,Arts & Entertainment,Sandwich Place,Athletics & Sports,Brewery,Chinese Restaurant,Colombian Restaurant
15,North,1,Coffee Shop,Diner,Lounge,Donut Shop,Pharmacy,New American Restaurant,Brewery,Café,Sandwich Place,Restaurant
18,North,1,Convenience Store,Pizza Place,Cosmetics Shop,Pharmacy,Intersection,Japanese Restaurant,Donut Shop,Korean Restaurant,Dog Run,Field
22,North,1,American Restaurant,Bar,Italian Restaurant,Home Service,Playground,Pet Store,Monument / Landmark,Mexican Restaurant,Intersection,Greek Restaurant


#### Cluster 3 (label 2)

In [57]:
philadelphia_merged.loc[philadelphia_merged['Cluster Labels'] == 2, philadelphia_merged.columns[[1] + list(range(5, philadelphia_merged.shape[1]))]]

Unnamed: 0,Section,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,South,2,Food Truck,Donut Shop,Flower Shop,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Ethiopian Restaurant,English Restaurant


#### Cluster 4 (label 3)

In [58]:
philadelphia_merged.loc[philadelphia_merged['Cluster Labels'] == 3, philadelphia_merged.columns[[1] + list(range(5, philadelphia_merged.shape[1]))]]

Unnamed: 0,Section,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
32,North,3,Intersection,Pizza Place,Bar,Rental Car Location,Park,Basketball Court,Pharmacy,Deli / Bodega,Donut Shop,Field
35,West,3,Intersection,Southern / Soul Food Restaurant,Pharmacy,Supermarket,Bakery,Discount Store,Chinese Restaurant,Drugstore,Filipino Restaurant,Field


#### Cluster 5 (label 4)

In [59]:
philadelphia_merged.loc[philadelphia_merged['Cluster Labels'] == 4, philadelphia_merged.columns[[1] + list(range(5, philadelphia_merged.shape[1]))]]

Unnamed: 0,Section,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
13,North,4,Restaurant,Intersection,Deli / Bodega,Food,Park,Southern / Soul Food Restaurant,Art Gallery,Pizza Place,Playground,Cycle Studio
30,West,4,Caribbean Restaurant,Intersection,Deli / Bodega,Restaurant,Gas Station,Food,Yoga Studio,Filipino Restaurant,Field,Fast Food Restaurant


In [60]:
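# k-means labels run from 0 to kclusters-1 (0 to 4 here), so no row has label 5 and this selection is empty
philadelphia_merged.loc[philadelphia_merged['Cluster Labels'] == 5, philadelphia_merged.columns[[1] + list(range(5, philadelphia_merged.shape[1]))]]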

Unnamed: 0,Section,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
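
The per-cluster cells above repeat the same selection with a different label each time. A small helper plus a loop over `range(kclusters)` could cover every label once and avoid asking for one that does not exist. The sketch below runs on a made-up frame; `cluster_rows` and the toy data are hypothetical and only illustrate the pattern.

```python
import pandas as pd

def cluster_rows(df, label, label_col='Cluster Labels'):
    """Return the rows of df whose cluster label equals `label`."""
    return df.loc[df[label_col] == label]

# toy frame with made-up rows; labels follow the 0-based k-means convention
toy = pd.DataFrame({
    'Section': ['North', 'West', 'South'],
    'Cluster Labels': [0, 0, 2],
})

kclusters = 3
for label in range(kclusters):
    rows = cluster_rows(toy, label)
    print('Cluster {} (label {}): {} row(s)'.format(label + 1, label, len(rows)))

# a label equal to kclusters is out of range, so the selection is empty
print(cluster_rows(toy, kclusters).empty)
```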
