# Coursera Capstone - Battle of the Neighborhoods (Week 2)

### Introduction
In this project, I try to find out what kinds of neighborhoods wealthy people tend to live in. For the purposes of the project, each neighborhood is characterized by the venues in it. For example, a neighborhood with many bars and restaurants might be considered a trendy area where people go out often. The project focuses on Seattle neighborhoods and is aimed at stakeholders interested in opening a business that caters to wealthy customers.


### Data
To complete this project, I will need a few different types of data:
* Data on where each neighborhood is located within Seattle.
* Data on what venues are in each neighborhood.
* Data on the income level of each neighborhood.

In order to obtain these data, I'll use the following sources:
* Seattle.gov public data for the name and median household income of each neighborhood.
* The geopy package (Nominatim geocoder) for the latitude and longitude of each neighborhood.
* The Foursquare API for a list of venues in each neighborhood.


In [168]:
#needed packages
import pandas as pd
from sklearn.cluster import KMeans
import folium
import json
import requests
from pandas import json_normalize #pandas.io.json.json_normalize is deprecated since pandas 1.0
import matplotlib.cm as cm
import matplotlib.colors as colors
import numpy as np
from geopy.geocoders import Nominatim

pd.options.display.max_rows = 4000

### Data Prep

In [169]:
#read in the neighborhoods data
seattle_data = pd.read_csv('Seattle_Neighborhoods.csv', header = 0)
#just need these two columns
seattle_data = seattle_data[['GEN_ALIAS', 'MEDIAN_HH_INC_PAST_12MO_DOLLAR']]
#give the columns better names
seattle_data.columns = ['Neighborhood', 'Med_Income']

seattle_data.drop(22, inplace=True)
seattle_data = seattle_data.reset_index(drop=True)
seattle_data

Unnamed: 0,Neighborhood,Med_Income
0,Ballard,79162
1,North Beach/Blue Ridge,94804
2,Montlake/Portage Bay,132573
3,Interbay,74679
4,North Capitol Hill,96220
5,Capitol Hill,58476
6,Wedgwood/View Ridge,114723
7,Whittier Heights,100023
8,North Delridge,75000
9,Broadview/Bitter Lake,77688


In [170]:
#put columns for the lat/long data
seattle_data['Latitude'] = ""
seattle_data['Longitude'] = ""

#do some data cleaning so that the geolocator can find the lat/long data
seattle_data.loc[13, 'Neighborhood'] = 'South Beacon Hill'
seattle_data.loc[15, 'Neighborhood'] = 'Madrona'
seattle_data.loc[23, 'Neighborhood'] = 'North Beacon Hill'
seattle_data.loc[27, 'Neighborhood'] = 'Ravenna'
seattle_data.loc[33, 'Neighborhood'] = 'Fauntleroy'
seattle_data.loc[38, 'Neighborhood'] = 'Duwamish'
seattle_data.loc[41, 'Neighborhood'] = 'Pike Place'
seattle_data.loc[48, 'Neighborhood'] = 'Squire Park'
seattle_data.loc[51, 'Neighborhood'] = 'Cedar Park'

seattle_data

Unnamed: 0,Neighborhood,Med_Income,Latitude,Longitude
0,Ballard,79162,,
1,North Beach/Blue Ridge,94804,,
2,Montlake/Portage Bay,132573,,
3,Interbay,74679,,
4,North Capitol Hill,96220,,
5,Capitol Hill,58476,,
6,Wedgwood/View Ridge,114723,,
7,Whittier Heights,100023,,
8,North Delridge,75000,,
9,Broadview/Bitter Lake,77688,,


In [171]:
#add latitudes and longitudes to the dataframe
geolocator = Nominatim(user_agent = "seattle_explorer", timeout=3)

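#geocode each neighborhood; report and skip any that the geolocator cannot find
for i, hood in seattle_data['Neighborhood'].items():
    address = hood + ', Seattle'
    location = geolocator.geocode(address)
    if location is None:
        print('No result for ' + address)
        continue
    seattle_data.loc[i, 'Latitude'] = location.latitude
    seattle_data.loc[i, 'Longitude'] = location.longitude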

seattle_data.head()

Unnamed: 0,Neighborhood,Med_Income,Latitude,Longitude
0,Ballard,79162,47.6765,-122.386
1,North Beach/Blue Ridge,94804,47.7003,-122.396
2,Montlake/Portage Bay,132573,47.6303,-122.314
3,Interbay,74679,47.6407,-122.376
4,North Capitol Hill,96220,47.6238,-122.318
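
Geocoding can fail quietly: `geocode` returns `None` when Nominatim finds no match, and the public Nominatim service expects requests to be spaced out (roughly one per second). A minimal sketch of a more defensive lookup follows. The helper name `geocode_neighborhoods` and the stub geocoder in the usage lines are assumptions for illustration only; in the notebook, `geolocator.geocode` (optionally wrapped in geopy's `RateLimiter`) would take the stub's place.

```python
import numpy as np
import pandas as pd

def geocode_neighborhoods(names, geocode, city='Seattle'):
    """Return one row per name; coordinates are NaN where nothing was found."""
    rows = []
    for name in names:
        location = geocode('{}, {}'.format(name, city))
        if location is None:
            # no match: keep the row so the gap is visible later
            rows.append((name, np.nan, np.nan))
        else:
            rows.append((name, location.latitude, location.longitude))
    return pd.DataFrame(rows, columns=['Neighborhood', 'Latitude', 'Longitude'])

# Hypothetical stand-ins for a geopy location and geolocator.geocode
class FakeLocation:
    def __init__(self, latitude, longitude):
        self.latitude = latitude
        self.longitude = longitude

def fake_geocode(address):
    known = {'Ballard, Seattle': FakeLocation(47.6765, -122.386)}
    return known.get(address)  # None for anything not in the dict

coords = geocode_neighborhoods(['Ballard', 'Not A Real Place'], fake_geocode)
print(coords)
```

Building the coordinates as a separate dataframe also keeps the latitude and longitude columns numeric, instead of starting them out as empty strings.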


In [172]:
#make a map of all the neighborhoods
sea_lat = 47.6062
sea_long = -122.3321

map_seattle = folium.Map(location = [sea_lat, sea_long], zoom_start = 10)

for lat, long, hood in zip(seattle_data['Latitude'], seattle_data['Longitude'], seattle_data['Neighborhood']):
    label = hood
    label = folium.Popup(label, parse_html = True)
    folium.CircleMarker(
        [lat, long],
        radius = 4,
        popup = label,
        color = 'blue',
        fill = True,
        fill_color = '#3186cc',
        fill_opacity = 0.7,
        parse_html = False).add_to(map_seattle)

map_seattle

### Getting the venues data from Foursquare

In [173]:
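#start creating profiles for each neighborhood
#read the Foursquare credentials from environment variables instead of hard-coding them
#(the variable names below are an assumption; use whatever names your environment defines)
import os
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] #Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] #Foursquare Secret
VERSION = '20180605' #Foursquare API version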

#function to get venues near a given neighborhood (lat and long)
def getNearbyVenues(names, latitudes, longitudes, limit=100, radius=1000):
    
    venues_list = []
    for name, lat, long in zip(names, latitudes, longitudes):
        
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            long, 
            radius, 
            limit)
        
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            long,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # build the dataframe once, after every neighborhood has been queried
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                  'Neighborhood Latitude',
                  'Neighborhood Longitude',
                  'Venue',
                  'Venue Latitude',
                  'Venue Longitude',
                  'Venue Category']

    return nearby_venues

In [174]:
#dataframe for all the venues grouped by neighborhood
seattle_venues = getNearbyVenues(names = seattle_data['Neighborhood'],
                                 latitudes = seattle_data['Latitude'],
                                 longitudes = seattle_data['Longitude'])
seattle_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Ballard,47.676507,-122.386223,Copine,47.675741,-122.387404,French Restaurant
1,Ballard,47.676507,-122.386223,Baleen,47.675971,-122.382039,Jewelry Store
2,Ballard,47.676507,-122.386223,Olaf's,47.674712,-122.387815,Bar
3,Ballard,47.676507,-122.386223,Cafe Besalu,47.671971,-122.387755,Bakery
4,Ballard,47.676507,-122.386223,Mabel Coffee,47.67965,-122.387924,Coffee Shop
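
The list comprehension in `getNearbyVenues` assumes that every response contains a `groups` entry and that every venue has at least one category. The sketch below pulls the parsing into its own function so it can be checked without calling the API. The function name `parse_explore_items` and the sample payload are made up for illustration; the payload only mimics the nesting that the code above indexes into.

```python
def parse_explore_items(payload, name, lat, long):
    """Flatten one explore-style response into venue rows for a neighborhood."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # fall back to a placeholder instead of raising IndexError
        category = categories[0]['name'] if categories else 'Unknown'
        rows.append((name, lat, long,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows

# Invented payload with the same shape the notebook code expects
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Copine',
               'location': {'lat': 47.675741, 'lng': -122.387404},
               'categories': [{'name': 'French Restaurant'}]}},
    {'venue': {'name': 'Uncategorized Spot',
               'location': {'lat': 47.676, 'lng': -122.386},
               'categories': []}},
]}]}}

rows = parse_explore_items(sample, 'Ballard', 47.676507, -122.386223)
empty = parse_explore_items({'response': {}}, 'Ballard', 47.676507, -122.386223)
print(rows)
print(empty)
```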


In [175]:
#check how many venues were found for each neighborhood
venues_count = seattle_venues.groupby('Neighborhood').count().reset_index()
venues_count

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Alki/Admiral,39,39,39,39,39,39
1,Arbor Heights,6,6,6,6,6,6
2,Ballard,100,100,100,100,100,100
3,Beacon Hill,53,53,53,53,53,53
4,Belltown,100,100,100,100,100,100
5,Broadview/Bitter Lake,15,15,15,15,15,15
6,Capitol Hill,100,100,100,100,100,100
7,Cascade/Eastlake,98,98,98,98,98,98
8,Cedar Park,42,42,42,42,42,42
9,Columbia City,76,76,76,76,76,76


In [176]:
#get rid of neighborhoods that don't have enough venues (fewer than 30)
venues_count = venues_count[venues_count['Venue'] >= 30]
venues_count = venues_count.reset_index(drop=True)
seattle_venues = seattle_venues[seattle_venues['Neighborhood'].isin(venues_count['Neighborhood'])]
seattle_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Ballard,47.676507,-122.386223,Copine,47.675741,-122.387404,French Restaurant
1,Ballard,47.676507,-122.386223,Baleen,47.675971,-122.382039,Jewelry Store
2,Ballard,47.676507,-122.386223,Olaf's,47.674712,-122.387815,Bar
3,Ballard,47.676507,-122.386223,Cafe Besalu,47.671971,-122.387755,Bakery
4,Ballard,47.676507,-122.386223,Mabel Coffee,47.67965,-122.387924,Coffee Shop


In [177]:
#get dummy variables for venue categories for the clustering algorithm to work
seattle_onehot = pd.get_dummies(seattle_venues[['Venue Category']], prefix="", prefix_sep="")

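#the venue categories include one that is itself named 'Neighborhood'; drop it (if present)
#so it doesn't clash with the neighborhood-name column inserted next
seattle_onehot = seattle_onehot.drop(columns=['Neighborhood'], errors='ignore')
#put the neighborhood names in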
seattle_onehot.insert(0, 'Neighborhood', seattle_venues['Neighborhood'])
seattle_onehot.head()

Unnamed: 0,Neighborhood,ATM,Accessories Store,Adult Boutique,Advertising Agency,African Restaurant,Airport,Alternative Healer,American Restaurant,Antique Shop,...,Volleyball Court,Warehouse Store,Waterfront,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,Ballard,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Ballard,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Ballard,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Ballard,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Ballard,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


### Create a dataframe for the clustering algorithm

In [256]:
#take the mean frequency of each venue category in each neighborhood to create a profile
seattle_grouped = seattle_onehot.groupby('Neighborhood').mean().reset_index()
seattle_grouped

Unnamed: 0,Neighborhood,ATM,Accessories Store,Adult Boutique,Advertising Agency,African Restaurant,Airport,Alternative Healer,American Restaurant,Antique Shop,...,Volleyball Court,Warehouse Store,Waterfront,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,Alki/Admiral,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Ballard,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0
2,Beacon Hill,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.018868,0.0,0.0,0.018868,0.0,0.0,0.018868,0.0,0.0
3,Belltown,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0
4,Capitol Hill,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,...,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0
5,Cascade/Eastlake,0.020408,0.0,0.0,0.0,0.0,0.0,0.0,0.010204,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.020408,0.010204,0.0,0.0
6,Cedar Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Columbia City,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.013158,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013158,0.0,0.0
8,Duwamish,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.04,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,First Hill,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,...,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0
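
To make the profile step concrete, here is a toy version of the same one-hot and group-by-mean pipeline. The two neighborhoods and their venues are invented; only the pandas calls mirror the notebook. Each row of the result is the share of a neighborhood's venues that falls in each category, so every row sums to 1.

```python
import pandas as pd

# Invented venue list: neighborhood 'A' has four venues, 'B' has two
toy_venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Bar', 'Park', 'Park', 'Park'],
})

# one indicator column per category (cast to float so the mean is unambiguous)
toy_onehot = pd.get_dummies(toy_venues[['Venue Category']], prefix="", prefix_sep="").astype(float)
toy_onehot.insert(0, 'Neighborhood', toy_venues['Neighborhood'])

# mean of the indicators = relative frequency of each category per neighborhood
toy_grouped = toy_onehot.groupby('Neighborhood').mean().reset_index()
print(toy_grouped)
```

Because the profiles are relative frequencies, a neighborhood with 30 venues and one with 100 are compared on the same scale.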


In [257]:
#before clustering, create a dataframe that shows the top 10 venues from each neighborhood.
#this helps with interpreting each cluster
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

#create the dataframe
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = seattle_grouped['Neighborhood']

for ind in np.arange(seattle_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(seattle_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Alki/Admiral,Coffee Shop,Food Truck,Pub,Mexican Restaurant,Beach,Trail,Australian Restaurant,Italian Restaurant,Greek Restaurant,Theater
1,Ballard,Coffee Shop,Ice Cream Shop,Bakery,Thai Restaurant,Burger Joint,Bar,Mexican Restaurant,Vietnamese Restaurant,Pizza Place,Cocktail Bar
2,Beacon Hill,Coffee Shop,Mexican Restaurant,Pizza Place,Pharmacy,Bakery,Café,Sandwich Place,Fast Food Restaurant,Video Store,Playground
3,Belltown,Hotel,Bakery,Coffee Shop,Breakfast Spot,Cocktail Bar,Donut Shop,Deli / Bodega,Sushi Restaurant,Pizza Place,Bar
4,Capitol Hill,Coffee Shop,Bar,Cocktail Bar,Bakery,American Restaurant,Thai Restaurant,Italian Restaurant,Park,Thrift / Vintage Store,Indian Restaurant
5,Cascade/Eastlake,Coffee Shop,Pizza Place,Salon / Barbershop,Café,Grocery Store,Burger Joint,Clothing Store,Furniture / Home Store,ATM,Toy / Game Store
6,Cedar Park,Trail,Pharmacy,Vietnamese Restaurant,Pub,Bus Stop,Beer Bar,Fried Chicken Joint,Beer Store,Food Truck,Dance Studio
7,Columbia City,Park,Pizza Place,Convenience Store,Ice Cream Shop,Pub,Bar,Mexican Restaurant,Thai Restaurant,Gym,Ethiopian Restaurant
8,Duwamish,Coffee Shop,Food Truck,Mexican Restaurant,Café,American Restaurant,Gym / Fitness Center,Hotel,Bar,Office,Italian Restaurant
9,First Hill,Coffee Shop,Café,Sandwich Place,Ice Cream Shop,Hotel,Pizza Place,New American Restaurant,Bar,Theater,Juice Bar


In [258]:
#run the clustering algorithm (k-means)
#number of clusters
kclusters = 4

#cluster it
seattle_grouped_clustering = seattle_grouped.drop(columns='Neighborhood')
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(seattle_grouped_clustering)

In [259]:
#add cluster labels to dataframe
neighborhoods_venues_sorted.insert(1, 'Cluster Labels', kmeans.labels_)
neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Alki/Admiral,3,Coffee Shop,Food Truck,Pub,Mexican Restaurant,Beach,Trail,Australian Restaurant,Italian Restaurant,Greek Restaurant,Theater
1,Ballard,1,Coffee Shop,Ice Cream Shop,Bakery,Thai Restaurant,Burger Joint,Bar,Mexican Restaurant,Vietnamese Restaurant,Pizza Place,Cocktail Bar
2,Beacon Hill,0,Coffee Shop,Mexican Restaurant,Pizza Place,Pharmacy,Bakery,Café,Sandwich Place,Fast Food Restaurant,Video Store,Playground
3,Belltown,1,Hotel,Bakery,Coffee Shop,Breakfast Spot,Cocktail Bar,Donut Shop,Deli / Bodega,Sushi Restaurant,Pizza Place,Bar
4,Capitol Hill,1,Coffee Shop,Bar,Cocktail Bar,Bakery,American Restaurant,Thai Restaurant,Italian Restaurant,Park,Thrift / Vintage Store,Indian Restaurant
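
The notebook fixes `kclusters = 4` without showing how that value was chosen. One common sanity check is to compare silhouette scores across several values of k. The sketch below does this on synthetic data (three well-separated blobs generated with NumPy), not on the Seattle profiles, and `best_k_by_silhouette` is a hypothetical helper name.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def best_k_by_silhouette(X, k_values, random_state=0):
    """Fit k-means for each k and return the k with the highest silhouette score."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores

# Synthetic data: 20 points around each of three distant centers
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.3, size=(20, 2)) for c in centers])

best_k, scores = best_k_by_silhouette(X, range(2, 6))
print(best_k, scores)
```

On the real data, `X` would be `seattle_grouped_clustering`. With only a few dozen neighborhoods and several hundred sparse category columns, the scores may be close together, so this is better treated as a rough guide than as a definitive answer.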


In [260]:
#add latitude and longitude
seattle_merged = seattle_data[seattle_data['Neighborhood'].isin(venues_count['Neighborhood'])]
seattle_merged = seattle_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
seattle_merged

Unnamed: 0,Neighborhood,Med_Income,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Ballard,79162,47.6765,-122.386,1,Coffee Shop,Ice Cream Shop,Bakery,Thai Restaurant,Burger Joint,Bar,Mexican Restaurant,Vietnamese Restaurant,Pizza Place,Cocktail Bar
2,Montlake/Portage Bay,132573,47.6303,-122.314,1,Coffee Shop,Bakery,Thai Restaurant,Park,Italian Restaurant,Garden,Furniture / Home Store,Scenic Lookout,Bar,Cocktail Bar
3,Interbay,74679,47.6407,-122.376,0,Bus Stop,Sandwich Place,Bakery,Grocery Store,Burger Joint,Golf Course,Coffee Shop,Playground,ATM,Pharmacy
4,North Capitol Hill,96220,47.6238,-122.318,1,Coffee Shop,Bar,Cocktail Bar,Bakery,American Restaurant,Thai Restaurant,Italian Restaurant,Park,Thrift / Vintage Store,Indian Restaurant
5,Capitol Hill,58476,47.6238,-122.318,1,Coffee Shop,Bar,Cocktail Bar,Bakery,American Restaurant,Thai Restaurant,Italian Restaurant,Park,Thrift / Vintage Store,Indian Restaurant
7,Whittier Heights,100023,47.6833,-122.371,3,Deli / Bodega,Food Truck,Park,Pizza Place,Coffee Shop,Bar,Cocktail Bar,Bus Station,Gas Station,Pub
10,Roxhill/Westwood,67722,47.5192,-122.369,3,Convenience Store,Cosmetics Shop,Coffee Shop,Thai Restaurant,Spa,Supermarket,Bakery,Bank,Shopping Mall,Latin American Restaurant
12,Rainier Beach,67933,47.5231,-122.27,0,Light Rail Station,Mexican Restaurant,Coffee Shop,Grocery Store,Fast Food Restaurant,Convenience Store,Ethiopian Restaurant,Garden,Marijuana Dispensary,Fried Chicken Joint
13,South Beacon Hill,51655,47.5776,-122.31,0,Coffee Shop,Mexican Restaurant,Sandwich Place,Pharmacy,Pizza Place,Pub,Video Store,Convenience Store,Fast Food Restaurant,Café
14,Northgate/Maple Leaf,72493,47.7023,-122.328,3,Cosmetics Shop,Sandwich Place,Indian Restaurant,Bus Station,Shoe Store,Burger Joint,Clothing Store,Coffee Shop,Bakery,Kids Store


### Analyze the clusters

In [261]:
#visualize the clusters on the map
map_clusters = folium.Map(location=[sea_lat, sea_long], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(seattle_merged['Latitude'], seattle_merged['Longitude'], seattle_merged['Neighborhood'], seattle_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [262]:
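#examine the clusters and evaluate
#note: the column selection puts Latitude first and then lists every column, so Latitude appears twice in the output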
seattle_merged.loc[seattle_merged['Cluster Labels'] == 0, seattle_merged.columns[[2] + list(range(0, seattle_merged.shape[1]))]]

Unnamed: 0,Latitude,Neighborhood,Med_Income,Latitude.1,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,47.6407,Interbay,74679,47.6407,-122.376,0,Bus Stop,Sandwich Place,Bakery,Grocery Store,Burger Joint,Golf Course,Coffee Shop,Playground,ATM,Pharmacy
12,47.5231,Rainier Beach,67933,47.5231,-122.27,0,Light Rail Station,Mexican Restaurant,Coffee Shop,Grocery Store,Fast Food Restaurant,Convenience Store,Ethiopian Restaurant,Garden,Marijuana Dispensary,Fried Chicken Joint
13,47.5776,South Beacon Hill,51655,47.5776,-122.31,0,Coffee Shop,Mexican Restaurant,Sandwich Place,Pharmacy,Pizza Place,Pub,Video Store,Convenience Store,Fast Food Restaurant,Café
18,47.5385,High Point,52121,47.5385,-122.377,0,Park,Playground,Pizza Place,Coffee Shop,Field,Beer Store,Massage Studio,Food Truck,Sporting Goods Shop,Brewery
20,47.7299,Olympic Hills/Victory Heights,66298,47.7299,-122.299,0,Gas Station,Gym / Fitness Center,Supermarket,Coffee Shop,Vietnamese Restaurant,Bus Station,Japanese Restaurant,Beer Store,Martial Arts Dojo,Fried Chicken Joint
23,47.5793,North Beacon Hill,67657,47.5793,-122.312,0,Coffee Shop,Mexican Restaurant,Pizza Place,Pharmacy,Bakery,Café,Sandwich Place,Fast Food Restaurant,Video Store,Playground
24,47.5915,Judkins Park,47428,47.5915,-122.304,0,Park,Coffee Shop,Pizza Place,Gym,Vietnamese Restaurant,Café,Residential Building (Apartment / Condo),Thrift / Vintage Store,Gym / Fitness Center,Bakery
27,47.6757,Ravenna,92464,47.6757,-122.298,0,Grocery Store,Pizza Place,Coffee Shop,Café,Thai Restaurant,Chinese Restaurant,Yoga Studio,Brewery,Mediterranean Restaurant,Seafood Restaurant
39,47.5786,Mt. Baker/North Rainier,100714,47.5786,-122.29,0,Park,Pizza Place,Business Service,Coffee Shop,Fast Food Restaurant,Sandwich Place,Pharmacy,Thai Restaurant,Vietnamese Restaurant,Video Store
44,47.5793,Beacon Hill,67157,47.5793,-122.312,0,Coffee Shop,Mexican Restaurant,Pizza Place,Pharmacy,Bakery,Café,Sandwich Place,Fast Food Restaurant,Video Store,Playground


In [263]:
seattle_merged.loc[seattle_merged['Cluster Labels'] == 1, seattle_merged.columns[[2] + list(range(0, seattle_merged.shape[1]))]]

Unnamed: 0,Latitude,Neighborhood,Med_Income,Latitude.1,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,47.6765,Ballard,79162,47.6765,-122.386,1,Coffee Shop,Ice Cream Shop,Bakery,Thai Restaurant,Burger Joint,Bar,Mexican Restaurant,Vietnamese Restaurant,Pizza Place,Cocktail Bar
2,47.6303,Montlake/Portage Bay,132573,47.6303,-122.314,1,Coffee Shop,Bakery,Thai Restaurant,Park,Italian Restaurant,Garden,Furniture / Home Store,Scenic Lookout,Bar,Cocktail Bar
4,47.6238,North Capitol Hill,96220,47.6238,-122.318,1,Coffee Shop,Bar,Cocktail Bar,Bakery,American Restaurant,Thai Restaurant,Italian Restaurant,Park,Thrift / Vintage Store,Indian Restaurant
5,47.6238,Capitol Hill,58476,47.6238,-122.318,1,Coffee Shop,Bar,Cocktail Bar,Bakery,American Restaurant,Thai Restaurant,Italian Restaurant,Park,Thrift / Vintage Store,Indian Restaurant
15,47.6128,Madrona,121549,47.6128,-122.291,1,Ethiopian Restaurant,Park,Coffee Shop,Playground,Bar,Gift Shop,Bakery,Taco Place,Asian Restaurant,Thai Restaurant
17,47.5489,Georgetown,58611,47.5489,-122.33,1,Coffee Shop,Brewery,Bar,Café,Pizza Place,Sandwich Place,Lounge,Sushi Restaurant,Dessert Shop,Mexican Restaurant
19,47.6613,University District,29215,47.6613,-122.313,1,Coffee Shop,Indian Restaurant,Vietnamese Restaurant,Korean Restaurant,Bubble Tea Shop,Chinese Restaurant,Thai Restaurant,Mediterranean Restaurant,Bookstore,Grocery Store
21,47.6359,Madison Park,134811,47.6359,-122.28,1,Bar,Park,Bank,American Restaurant,Playground,Café,Kitchen Supply Store,Soccer Field,Lake,Beach
22,47.5649,West Seattle Junction/Genesee Hill,99203,47.5649,-122.385,1,Pizza Place,Coffee Shop,Asian Restaurant,Golf Course,Burger Joint,Mexican Restaurant,Sporting Goods Shop,Furniture / Home Store,Cosmetics Shop,Bakery
28,47.6132,Belltown,76693,47.6132,-122.345,1,Hotel,Bakery,Coffee Shop,Breakfast Spot,Cocktail Bar,Donut Shop,Deli / Bodega,Sushi Restaurant,Pizza Place,Bar


In [264]:
seattle_merged.loc[seattle_merged['Cluster Labels'] == 2, seattle_merged.columns[[2] + list(range(0, seattle_merged.shape[1]))]]

Unnamed: 0,Latitude,Neighborhood,Med_Income,Latitude.1,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
47,47.669,Greenwood/Phinney Ridge,92464,47.669,-122.361,2,Zoo Exhibit,Brewery,Park,Pub,Chinese Restaurant,Thrift / Vintage Store,Burger Joint,Café,Sandwich Place,Gastropub


In [265]:
seattle_merged.loc[seattle_merged['Cluster Labels'] == 3, seattle_merged.columns[[2] + list(range(0, seattle_merged.shape[1]))]]

Unnamed: 0,Latitude,Neighborhood,Med_Income,Latitude.1,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,47.6833,Whittier Heights,100023,47.6833,-122.371,3,Deli / Bodega,Food Truck,Park,Pizza Place,Coffee Shop,Bar,Cocktail Bar,Bus Station,Gas Station,Pub
10,47.5192,Roxhill/Westwood,67722,47.5192,-122.369,3,Convenience Store,Cosmetics Shop,Coffee Shop,Thai Restaurant,Spa,Supermarket,Bakery,Bank,Shopping Mall,Latin American Restaurant
14,47.7023,Northgate/Maple Leaf,72493,47.7023,-122.328,3,Cosmetics Shop,Sandwich Place,Indian Restaurant,Bus Station,Shoe Store,Burger Joint,Clothing Store,Coffee Shop,Bakery,Kids Store
16,47.6802,Green Lake,105587,47.6802,-122.324,3,Coffee Shop,Burger Joint,Spa,Vegetarian / Vegan Restaurant,Gym,Frozen Yogurt Shop,Mexican Restaurant,Grocery Store,Rental Car Location,Gym / Fitness Center
26,47.5512,Seward Park,86348,47.5512,-122.266,3,Park,Playground,Pub,Pet Store,Convenience Store,Trail,Burger Joint,Beach,Boat or Ferry,Bookstore
29,47.5866,Alki/Admiral,99854,47.5866,-122.398,3,Coffee Shop,Food Truck,Pub,Mexican Restaurant,Beach,Trail,Australian Restaurant,Italian Restaurant,Greek Restaurant,Theater
34,47.6955,Licton Springs,80012,47.6955,-122.338,3,Bus Station,Chinese Restaurant,Coffee Shop,Inn,Sandwich Place,Thai Restaurant,Karaoke Bar,Bakery,Basketball Stadium,Fabric Shop
40,47.5579,Columbia City,58972,47.5579,-122.285,3,Park,Pizza Place,Convenience Store,Ice Cream Shop,Pub,Bar,Mexican Restaurant,Thai Restaurant,Gym,Ethiopian Restaurant


In [268]:
#find average median income for each cluster
#(select Med_Income explicitly so the non-numeric venue columns are not averaged)
mean_incomes = seattle_merged.groupby('Cluster Labels')['Med_Income'].mean().reset_index()
mean_incomes

Unnamed: 0,Cluster Labels,Med_Income
0,0,71344.818182
1,1,83876.25
2,2,92464.0
3,3,83876.375
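
A mean on its own hides how many neighborhoods sit in each cluster; cluster 2, for instance, contains a single neighborhood. A small sketch of a fuller summary follows, using the `Med_Income` values listed above for clusters 2 and 3 (the dataframe name `sample` is just for illustration).

```python
import pandas as pd

# Med_Income values copied from the cluster 2 and cluster 3 tables above
sample = pd.DataFrame({
    'Cluster Labels': [2] + [3] * 8,
    'Med_Income': [92464,
                   100023, 67722, 72493, 105587, 86348, 99854, 80012, 58972],
})

# count, mean and median per cluster
summary = sample.groupby('Cluster Labels')['Med_Income'].agg(['count', 'mean', 'median'])
print(summary)
```

Reporting the count next to the mean makes it clear that the cluster 2 figure describes one neighborhood rather than a group.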
