# Capstone Project - Housing Amenities and Prices in Seattle Neighborhoods

## Introduction  
Over the past few years, the growing number of workers in Seattle's high-tech industries has driven a sharp rise in housing demand. There are sixty neighborhoods in Seattle, delineated by zip code. These neighborhoods vary widely in both housing prices and types of venues. Workers moving to Seattle usually weigh both the affordability of a home and the amenities around it. I create a map that shows both kinds of information to help city dwellers pick the neighborhood that suits their preferences.


## Data  

I use the following data in my analysis:  
1. the zip code list of Seattle: https://www.zip-codes.com/city/wa-seattle.asp  
2. coordinates of Seattle neighborhoods: Opencage Geocoder API
3. numbers and types of venues in each neighborhood: Foursquare API
4. boundaries of zip code area: ArcGIS Hub
5. average housing prices: www.zillow.com


Note:
* To avoid downloading the Seattle venue data and the neighborhood coordinate data multiple times, I collect them with code kept separate from the main notebook.
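
The download-once idea in the note can be sketched as follows. This is a minimal illustration, not the collection script used for this project: `load_or_fetch` and `fake_fetch` are hypothetical names, and the stand-in fetch function returns made-up rows instead of calling the Foursquare or OpenCage APIs.

```python
import os
import tempfile

import pandas as pd


def load_or_fetch(csv_path, fetch_fn):
    """Return the cached CSV if it exists; otherwise fetch once and cache it.

    `fetch_fn` is a hypothetical zero-argument callable returning a DataFrame.
    """
    if os.path.exists(csv_path):
        return pd.read_csv(csv_path)
    df = fetch_fn()
    df.to_csv(csv_path, index=False)  # index=False avoids an extra index column on reload
    return df


calls = []


def fake_fetch():
    # Stand-in for a real API call; the rows here are illustrative only.
    calls.append(1)
    return pd.DataFrame({"zipcode": [98101, 98102], "Venue": ["A", "B"]})


cache_path = os.path.join(tempfile.mkdtemp(), "venues_cache.csv")
first = load_or_fetch(cache_path, fake_fetch)   # fetches and writes the cache
second = load_or_fetch(cache_path, fake_fetch)  # reads the cache, no second fetch
```

Writing the cache with `index=False` also avoids the `Unnamed: 0` column that otherwise appears when a CSV saved with its index is read back.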
    

#### Python packages

In [3]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler,PolynomialFeatures
import json # library to handle JSON files
!conda install -c conda-forge geopy --yes # comment out this line if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
# import k-means from clustering stage
from sklearn.cluster import KMeans
!conda install -c conda-forge folium=0.5.0 --yes # comment out this line if folium is already installed
import folium # map rendering library
!conda install -c conda-forge geocoder --yes
!conda install -c conda-forge xlrd --yes
from pandas import ExcelWriter
from pandas import ExcelFile
print('Libraries imported.')

Solving environment: done


  current version: 4.5.11
  latest version: 4.8.1

Please update conda by running

    $ conda update -n base -c defaults conda



# All requested packages already installed.

Libraries imported.


#### Prepare the Seattle venues dataset: what amenities does each neighborhood have?

Venue information is mainly stored in the **seattle_venues.csv** dataset. However, the OpenCage API returns wrong coordinates for five zip codes. I looked up the coordinates of those neighborhoods manually and saved their venue information in the **seattle_venues_sup.csv** dataset.

In [5]:
seattle_venues=pd.read_csv('seattle_venues.csv')
seattle_venues['zipcode']=seattle_venues['zipcode'].astype('str')
seattle_venues.drop(['Unnamed: 0'], axis =1,inplace=True)
seattle_venues.head()

Unnamed: 0,zipcode,lat,long,Venue,Venue Latitude,Venue Longitude,Venue Category
0,98101,47.61076,-122.336181,Monorail Espresso,47.610828,-122.335048,Coffee Shop
1,98101,47.61076,-122.336181,Din Tai Fung Dumpling House,47.612671,-122.335073,Dumpling Restaurant
2,98101,47.61076,-122.336181,The 5th Avenue Theatre,47.608996,-122.334162,Theater
3,98101,47.61076,-122.336181,Victrola Coffee Roasters,47.610895,-122.338952,Coffee Shop
4,98101,47.61076,-122.336181,Veggie Grill,47.610058,-122.336497,Vegetarian / Vegan Restaurant


In [6]:
seattle_venues.shape

(4823, 7)

In [7]:
correct_coordinates=pd.read_csv('corrected_cor.csv')
correct_coordinates['zipcode']=correct_coordinates['zipcode'].astype('str')
need_to_modify_zip=correct_coordinates['zipcode'].values.tolist()


In [8]:
# drop the rows for zip codes whose coordinates were wrong
seattle_venues=seattle_venues[~seattle_venues.zipcode.isin(need_to_modify_zip)]
seattle_venues['zipcode'].nunique() # count the remaining zip codes

55

In [9]:
seattle_venues_sup=pd.read_csv('seattle_venues_sup.csv')
seattle_venues_sup['zipcode']=seattle_venues_sup['zipcode'].astype('str')
seattle_venues_sup.drop(['Unnamed: 0'], axis =1,inplace=True)
seattle_venues_sup.head()

Unnamed: 0,zipcode,lat,long,Venue,Venue Latitude,Venue Longitude,Venue Category
0,98111,47.6099,-122.34,Ellenos Real Greek Yogurt,47.608848,-122.340476,Frozen Yogurt Shop
1,98111,47.6099,-122.34,Pike Place Market,47.609467,-122.341465,Market
2,98111,47.6099,-122.34,Pike Place Fish Market,47.608813,-122.340371,Fish Market
3,98111,47.6099,-122.34,Beecher's Handmade Cheese,47.60957,-122.341851,Cheese Shop
4,98111,47.6099,-122.34,World Spice Merchants,47.608653,-122.341405,Herbs & Spices Store


Now let's merge the two datasets together.

In [10]:
seattle_venues=pd.concat([seattle_venues, seattle_venues_sup]) # DataFrame.append was removed in pandas 2.0
seattle_venues['zipcode'].nunique()

60
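
The drop-and-replace step above can be reproduced on a toy table. This is a small sketch under stated assumptions: the frames `main` and `patch` and their values are made up for illustration and do not come from the project's CSV files.

```python
import pandas as pd

# Toy stand-ins: `main` has one zip code (98111) with bad rows, `patch` holds the fixes.
main = pd.DataFrame({
    "zipcode": ["98101", "98101", "98111", "98102"],
    "Venue": ["A", "B", "wrong", "C"],
})
patch = pd.DataFrame({
    "zipcode": ["98111", "98111"],
    "Venue": ["D", "E"],
})

# 1. Collect the zip codes that need replacing.
bad_zips = patch["zipcode"].unique().tolist()

# 2. Drop every row of the main table that belongs to one of them.
kept = main[~main["zipcode"].isin(bad_zips)]

# 3. Stack the corrected rows underneath the rows that were kept.
combined = pd.concat([kept, patch], ignore_index=True)
```

Here `ignore_index=True` gives the combined table a fresh 0..n-1 index; the notebook cell keeps the original row labels instead, so its index contains repeated values.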

In [11]:
seattle_venues.head()

Unnamed: 0,zipcode,lat,long,Venue,Venue Latitude,Venue Longitude,Venue Category
0,98101,47.61076,-122.336181,Monorail Espresso,47.610828,-122.335048,Coffee Shop
1,98101,47.61076,-122.336181,Din Tai Fung Dumpling House,47.612671,-122.335073,Dumpling Restaurant
2,98101,47.61076,-122.336181,The 5th Avenue Theatre,47.608996,-122.334162,Theater
3,98101,47.61076,-122.336181,Victrola Coffee Roasters,47.610895,-122.338952,Coffee Shop
4,98101,47.61076,-122.336181,Veggie Grill,47.610058,-122.336497,Vegetarian / Vegan Restaurant


In [12]:
sale_venues=seattle_venues

In [13]:
sale_venues.head()

Unnamed: 0,zipcode,lat,long,Venue,Venue Latitude,Venue Longitude,Venue Category
0,98101,47.61076,-122.336181,Monorail Espresso,47.610828,-122.335048,Coffee Shop
1,98101,47.61076,-122.336181,Din Tai Fung Dumpling House,47.612671,-122.335073,Dumpling Restaurant
2,98101,47.61076,-122.336181,The 5th Avenue Theatre,47.608996,-122.334162,Theater
3,98101,47.61076,-122.336181,Victrola Coffee Roasters,47.610895,-122.338952,Coffee Shop
4,98101,47.61076,-122.336181,Veggie Grill,47.610058,-122.336497,Vegetarian / Vegan Restaurant


How many venue categories are there?

In [14]:
sale_venues['Venue Category'].nunique()

322

In [15]:
sale_venues['Venue Category'].value_counts()

Coffee Shop                                 382
Pizza Place                                 149
Hotel                                       141
Sandwich Place                              133
Bakery                                      123
Park                                        122
Bar                                         104
Cocktail Bar                                103
Ice Cream Shop                               89
Brewery                                      88
Mexican Restaurant                           83
Breakfast Spot                               78
Italian Restaurant                           76
Burger Joint                                 74
Vietnamese Restaurant                        70
Café                                         70
Grocery Store                                69
American Restaurant                          65
Sushi Restaurant                             61
Thai Restaurant                              60
Seafood Restaurant                      

## Cluster the neighborhoods

In [16]:
# one hot encoding
sale_onehot = pd.get_dummies(sale_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood and coordinate columns back to dataframe
sale_onehot['zipcode'] = sale_venues['zipcode'] 
sale_onehot['Latitude']=sale_venues['lat']
sale_onehot['Longitude']=sale_venues['long']
# move the three columns to the front of the dataframe
cols_to_move = ['zipcode', 'Latitude', 'Longitude']
sale_onehot = sale_onehot[cols_to_move + [col for col in sale_onehot.columns if col not in cols_to_move]]

sale_onehot.head()

Unnamed: 0,zipcode,Latitude,Longitude,ATM,Accessories Store,Adult Boutique,African Restaurant,Airport,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Animal Shelter,Antique Shop,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Auto Dealership,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Shop,Bike Trail,Bistro,Board Shop,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Café,Cajun / Creole Restaurant,Camera Store,Campground,Canal,Candy Store,Caribbean Restaurant,Casino,Cheese Shop,Chinese Restaurant,Chiropractor,Chocolate Shop,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Science Building,College Theater,Comedy Club,Comfort Food Restaurant,Comic Shop,Community College,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Costume Shop,Credit Union,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Disc Golf,Discount Store,Dive Bar,Dive Shop,Dog Run,Donut Shop,Dumpling Restaurant,Duty-free Shop,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Exhibit,Eye Doctor,Fabric Shop,Fair,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Service,Food Stand,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,German Restaurant,Gift Shop,Golf Course,Golf Driving Range,Gourmet Shop,Greek 
Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Herbs & Spices Store,High School,History Museum,Hobby Shop,Home Service,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Inn,Insurance Office,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Kitchen Supply Store,Knitting Store,Korean Restaurant,Lake,Latin American Restaurant,Laundry Service,Lawyer,Library,Lingerie Store,Liquor Store,Lounge,Luggage Store,Malay Restaurant,Marijuana Dispensary,Market,Martial Arts Dojo,Massage Studio,Mattress Store,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Miscellaneous Shop,Mobile Phone Shop,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Movie Theater,Museum,Music Store,Music Venue,Nail Salon,Nature Preserve,Neighborhood,New American Restaurant,Newsstand,Night Market,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Nightlife,Outdoor Sculpture,Paper / Office Supplies Store,Park,Pawn Shop,Performing Arts Venue,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Pie Shop,Pier,Pilates Studio,Pizza Place,Playground,Plaza,Poke Place,Police Station,Polish Restaurant,Pool,Post Office,Poutine Place,Pub,Public Art,Radio Station,Ramen Restaurant,Record Shop,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Resort,Restaurant,Rock Club,Roller Rink,Russian Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shipping Store,Shoe Store,Shopping Mall,Shopping Plaza,Skate Park,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Social Club,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods 
Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Storage Facility,Student Center,Supermarket,Supplement Shop,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tanning Salon,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Thrift / Vintage Store,Toy / Game Store,Track,Trail,Train Station,Turkish Restaurant,Udon Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,98101,47.61076,-122.336181,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,98101,47.61076,-122.336181,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,98101,47.61076,-122.336181,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,98101,47.61076,-122.336181,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,98101,47.61076,-122.336181,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [17]:
sale_onehot.shape

(5308, 325)

In [18]:
neighborhood_venues = sale_onehot.groupby('zipcode').mean().reset_index()
venue_grouped=neighborhood_venues.drop(['Latitude','Longitude'],axis=1)
venue_grouped.head()

Unnamed: 0,zipcode,ATM,Accessories Store,Adult Boutique,African Restaurant,Airport,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Animal Shelter,Antique Shop,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Auto Dealership,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Shop,Bike Trail,Bistro,Board Shop,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Café,Cajun / Creole Restaurant,Camera Store,Campground,Canal,Candy Store,Caribbean Restaurant,Casino,Cheese Shop,Chinese Restaurant,Chiropractor,Chocolate Shop,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Science Building,College Theater,Comedy Club,Comfort Food Restaurant,Comic Shop,Community College,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Costume Shop,Credit Union,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Disc Golf,Discount Store,Dive Bar,Dive Shop,Dog Run,Donut Shop,Dumpling Restaurant,Duty-free Shop,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Exhibit,Eye Doctor,Fabric Shop,Fair,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Service,Food Stand,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,German Restaurant,Gift Shop,Golf Course,Golf Driving Range,Gourmet Shop,Greek Restaurant,Grocery 
Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Herbs & Spices Store,High School,History Museum,Hobby Shop,Home Service,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Inn,Insurance Office,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Kitchen Supply Store,Knitting Store,Korean Restaurant,Lake,Latin American Restaurant,Laundry Service,Lawyer,Library,Lingerie Store,Liquor Store,Lounge,Luggage Store,Malay Restaurant,Marijuana Dispensary,Market,Martial Arts Dojo,Massage Studio,Mattress Store,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Miscellaneous Shop,Mobile Phone Shop,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Movie Theater,Museum,Music Store,Music Venue,Nail Salon,Nature Preserve,Neighborhood,New American Restaurant,Newsstand,Night Market,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Nightlife,Outdoor Sculpture,Paper / Office Supplies Store,Park,Pawn Shop,Performing Arts Venue,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Pie Shop,Pier,Pilates Studio,Pizza Place,Playground,Plaza,Poke Place,Police Station,Polish Restaurant,Pool,Post Office,Poutine Place,Pub,Public Art,Radio Station,Ramen Restaurant,Record Shop,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Resort,Restaurant,Rock Club,Roller Rink,Russian Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shipping Store,Shoe Store,Shopping Mall,Shopping Plaza,Skate Park,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Social Club,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports 
Bar,Stadium,Stationery Store,Steakhouse,Storage Facility,Student Center,Supermarket,Supplement Shop,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tanning Salon,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Thrift / Vintage Store,Toy / Game Store,Track,Trail,Train Station,Turkish Restaurant,Udon Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,98101,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.03,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.02,0.07,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.1,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.01,0.01,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.02,0.01,0.0,0.0,0.0,0.02,0.0,0.0
1,98102,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.04,0.07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.01,0.01,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.01,0.01,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.03,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0
2,98103,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.05,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.06
3,98104,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.06,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.05,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.06,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0
4,98105,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.03,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.06,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.04,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0
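
The values in the grouped table are shares: averaging one-hot columns within a zip code gives the fraction of that zip code's venues in each category. A toy sketch of the same two steps follows; the frame `toy` and its rows are invented for illustration, and `dtype=float` is an added choice so the dummy columns are numeric regardless of pandas version.

```python
import pandas as pd

# Made-up venues: four in 98101, two in 98102.
toy = pd.DataFrame({
    "zipcode": ["98101", "98101", "98101", "98101", "98102", "98102"],
    "Venue Category": ["Coffee Shop", "Coffee Shop", "Hotel", "Bakery",
                       "Coffee Shop", "Park"],
})

# One column per category, 1.0 where the venue belongs to it.
onehot = pd.get_dummies(toy[["Venue Category"]], prefix="", prefix_sep="", dtype=float)
onehot.insert(0, "zipcode", toy["zipcode"])

# The mean of a 0/1 column within a group is the share of venues in that category.
freq = onehot.groupby("zipcode").mean()
print(freq)
```

Each row of `freq` sums to 1, which is why the clustering later compares neighborhoods by venue mix rather than by venue count.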


In [19]:
num_top_venues = 5

for hood in venue_grouped['zipcode']:
    print("----"+hood+"----")
    temp = venue_grouped[venue_grouped['zipcode'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----98101----
            venue  freq
0           Hotel  0.10
1     Coffee Shop  0.07
2          Bakery  0.04
3  Breakfast Spot  0.03
4             Spa  0.02


----98102----
            venue  freq
0     Coffee Shop  0.07
1  Ice Cream Shop  0.04
2            Café  0.04
3    Cocktail Bar  0.04
4     Pizza Place  0.03


----98103----
            venue  freq
0     Coffee Shop  0.07
1     Zoo Exhibit  0.06
2             Bar  0.05
3     Pizza Place  0.05
4  Ice Cream Shop  0.05


----98104----
                   venue  freq
0            Coffee Shop  0.08
1           Cocktail Bar  0.06
2  Vietnamese Restaurant  0.06
3                  Hotel  0.05
4     Seafood Restaurant  0.04


----98105----
             venue  freq
0      Coffee Shop  0.06
1   Ice Cream Shop  0.04
2       Restaurant  0.04
3  Thai Restaurant  0.03
4              Pub  0.03


----98106----
           venue  freq
0    Coffee Shop  0.09
1    Pizza Place  0.06
2  Grocery Store  0.05
3   Burger Joint  0.03
4       Pharmacy  0.03


In [20]:
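def return_most_common_venues(row, num_top_venues):
    # skip the first entry (zipcode) and keep only the category frequencies
    row_categories = row.iloc[1:]
    # sort the categories from most to least frequent
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]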
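
As a quick check of what this helper returns, the sketch below repeats the function and applies it to a single made-up row shaped like a row of `venue_grouped` (zip code first, then category frequencies). The frequencies in `row` are illustrative, not taken from the data.

```python
import pandas as pd


def return_most_common_venues(row, num_top_venues):
    # Skip the first entry (zipcode) and sort the remaining frequencies.
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]


# Hypothetical row: one zip code and four category frequencies.
row = pd.Series({
    "zipcode": "98101",
    "Bakery": 0.04,
    "Coffee Shop": 0.07,
    "Hotel": 0.10,
    "Park": 0.0,
})

top3 = return_most_common_venues(row, 3)
print(list(top3))
```

The function returns category names, not frequencies. When several categories share the same frequency, their relative order is decided by the sort and should not be read as a ranking.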

In [27]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['zipcode','Latitude','Longitude']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError: # ind >= 3, so fall back to the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted[['zipcode','Latitude','Longitude']] = neighborhood_venues[['zipcode','Latitude','Longitude']]

for ind in np.arange(venue_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 3:] = return_most_common_venues(venue_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,zipcode,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,98101,47.61076,-122.336181,Hotel,Coffee Shop,Bakery,Breakfast Spot,Sushi Restaurant,Dumpling Restaurant,Theater,Market,French Restaurant,Rock Club
1,98102,47.621611,-122.321227,Coffee Shop,Cocktail Bar,Ice Cream Shop,Café,Bar,Yoga Studio,Pizza Place,Mexican Restaurant,Sandwich Place,Italian Restaurant
2,98103,47.673749,-122.343934,Coffee Shop,Zoo Exhibit,Bar,Ice Cream Shop,Pizza Place,Pub,Park,Burger Joint,Café,Japanese Restaurant
3,98104,47.600708,-122.331334,Coffee Shop,Cocktail Bar,Vietnamese Restaurant,Hotel,Seafood Restaurant,Breakfast Spot,Sushi Restaurant,Italian Restaurant,Bookstore,Concert Hall
4,98105,47.662934,-122.320552,Coffee Shop,Ice Cream Shop,Restaurant,Grocery Store,Bar,Bubble Tea Shop,Pub,Thai Restaurant,Pet Store,Korean Restaurant
5,98106,47.516871,-122.35483,Coffee Shop,Pizza Place,Grocery Store,Burger Joint,Bank,Fried Chicken Joint,Pharmacy,Convenience Store,Playground,Fast Food Restaurant
6,98107,47.664346,-122.38136,Brewery,Bar,Mexican Restaurant,Coffee Shop,Cocktail Bar,Ice Cream Shop,New American Restaurant,Clothing Store,Sushi Restaurant,Sandwich Place
7,98108,47.567587,-122.322364,Brewery,Coffee Shop,Pizza Place,Food Truck,BBQ Joint,Bar,Pub,Taco Place,Grocery Store,Café
8,98109,47.633123,-122.348679,Coffee Shop,Museum,Mexican Restaurant,Restaurant,Camera Store,Italian Restaurant,Gym / Fitness Center,Park,Pizza Place,Bakery
9,98110,47.636977,-122.524881,Coffee Shop,Pizza Place,Bakery,Ice Cream Shop,Park,Pharmacy,Trail,American Restaurant,Diner,Wine Bar


In [28]:
neighborhoods_venues_sorted.shape

(60, 13)

In [29]:
# set number of clusters
kclusters = 4

venue_grouped_clustering = venue_grouped.drop(columns='zipcode')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(venue_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 2, 2, 0, 2, 2, 2, 1, 2, 2], dtype=int32)

In [30]:
values, counts = np.unique(kmeans.labels_, return_counts=True)
print(values,counts)

[0 1 2 3] [17  9 33  1]


In [31]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

In [32]:
neighborhoods_venues_sorted.head()

Unnamed: 0,Cluster Labels,zipcode,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,0,98101,47.61076,-122.336181,Hotel,Coffee Shop,Bakery,Breakfast Spot,Sushi Restaurant,Dumpling Restaurant,Theater,Market,French Restaurant,Rock Club
1,2,98102,47.621611,-122.321227,Coffee Shop,Cocktail Bar,Ice Cream Shop,Café,Bar,Yoga Studio,Pizza Place,Mexican Restaurant,Sandwich Place,Italian Restaurant
2,2,98103,47.673749,-122.343934,Coffee Shop,Zoo Exhibit,Bar,Ice Cream Shop,Pizza Place,Pub,Park,Burger Joint,Café,Japanese Restaurant
3,0,98104,47.600708,-122.331334,Coffee Shop,Cocktail Bar,Vietnamese Restaurant,Hotel,Seafood Restaurant,Breakfast Spot,Sushi Restaurant,Italian Restaurant,Bookstore,Concert Hall
4,2,98105,47.662934,-122.320552,Coffee Shop,Ice Cream Shop,Restaurant,Grocery Store,Bar,Bubble Tea Shop,Pub,Thai Restaurant,Pet Store,Korean Restaurant


In [189]:
neighborhoods_venues_sorted.shape

(60, 14)

## How to interpret the clusters?

Each centroid is the mean venue-frequency vector of the neighborhoods in its cluster, so I use its largest entries to identify the most common venue categories in that cluster.

In [33]:
centroids=pd.DataFrame(data=kmeans.cluster_centers_)
centroids.columns=venue_grouped.columns[1:]
centroids

Unnamed: 0,ATM,Accessories Store,Adult Boutique,African Restaurant,Airport,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Animal Shelter,Antique Shop,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Auto Dealership,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Shop,Bike Trail,Bistro,Board Shop,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Café,Cajun / Creole Restaurant,Camera Store,Campground,Canal,Candy Store,Caribbean Restaurant,Casino,Cheese Shop,Chinese Restaurant,Chiropractor,Chocolate Shop,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Science Building,College Theater,Comedy Club,Comfort Food Restaurant,Comic Shop,Community College,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Costume Shop,Credit Union,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Disc Golf,Discount Store,Dive Bar,Dive Shop,Dog Run,Donut Shop,Dumpling Restaurant,Duty-free Shop,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Exhibit,Eye Doctor,Fabric Shop,Fair,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Service,Food Stand,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,German Restaurant,Gift Shop,Golf Course,Golf Driving Range,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / 
Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Herbs & Spices Store,High School,History Museum,Hobby Shop,Home Service,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Inn,Insurance Office,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Kitchen Supply Store,Knitting Store,Korean Restaurant,Lake,Latin American Restaurant,Laundry Service,Lawyer,Library,Lingerie Store,Liquor Store,Lounge,Luggage Store,Malay Restaurant,Marijuana Dispensary,Market,Martial Arts Dojo,Massage Studio,Mattress Store,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Miscellaneous Shop,Mobile Phone Shop,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Movie Theater,Museum,Music Store,Music Venue,Nail Salon,Nature Preserve,Neighborhood,New American Restaurant,Newsstand,Night Market,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Nightlife,Outdoor Sculpture,Paper / Office Supplies Store,Park,Pawn Shop,Performing Arts Venue,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Pie Shop,Pier,Pilates Studio,Pizza Place,Playground,Plaza,Poke Place,Police Station,Polish Restaurant,Pool,Post Office,Poutine Place,Pub,Public Art,Radio Station,Ramen Restaurant,Record Shop,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Resort,Restaurant,Rock Club,Roller Rink,Russian Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shipping Store,Shoe Store,Shopping Mall,Shopping Plaza,Skate Park,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Social Club,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery 
Store,Steakhouse,Storage Facility,Student Center,Supermarket,Supplement Shop,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tanning Salon,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Thrift / Vintage Store,Toy / Game Store,Track,Trail,Train Station,Turkish Restaurant,Udon Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,0.000588,0.0005882353,0.0,5.421011e-20,0.001764706,0.003529412,0.004117647,0.001176471,0.015294,0.0,0.0,-2.168404e-19,0.0005882353,0.001176,0.013529,0.001176,0.000588,0.0,1.084202e-19,0.0,-2.168404e-19,-4.336809e-19,0.001176,0.0,0.0,0.029412,-8.673616999999999e-19,0.014118,-4.336809e-19,0.000588,0.0,1.301043e-18,0.0005882353,0.001176,-4.336809e-19,0.000588,0.0,0.000588,5.421011e-20,-1.084202e-19,0.0,-4.336809e-19,0.012941,0.003529,0.000588,-2.168404e-19,0.028824,0.012353,0.0,0.001176471,0.003529,1.084202e-19,-2.168404e-19,0.0,0.0,0.0,0.0,0.014706,-1.084202e-19,-2.168404e-19,-2.710505e-20,0.0,-2.168404e-19,0.005294,2.168404e-19,0.009411765,1.734723e-18,1.084202e-19,0.007059,0.0,0.000588,0.039412,0.081765,0.0,-2.168404e-19,0.0,1.084202e-19,0.0,0.0,0.01411765,0.0,0.004706,0.000588,0.0,0.0,-5.421011e-20,0.0,0.0005882353,0.001176471,0.005294,0.001765,0.0,0.007647,-2.168404e-19,0.001176,0.0,-8.673616999999999e-19,-4.336809e-19,0.0,0.0,0.011176,0.017059,0.0005882353,0.0,0.0,0.0,-4.336809e-19,0.0,-2.168404e-19,0.0,-2.168404e-19,0.0,0.0,0.000588,4.336809e-19,0.005882353,0.000588,0.013529,4.336809e-19,2.168404e-19,2.168404e-19,0.000588,0.0,0.0005882353,0.001176,0.0,0.01764706,0.000588,0.008824,0.0,8.673616999999999e-19,2.168404e-19,0.001765,0.0,0.001176,0.002353,0.0005882353,0.001765,0.002353,-8.673616999999999e-19,0.0,0.008823529,0.001765,0.005294,0.005294,0.002353,-5.421011e-20,-2.168404e-19,2.168404e-19,0.000588,-1.084202e-19,2.168404e-19,0.008824,0.0,0.0,0.006471,0.0,-2.168404e-19,0.07411765,0.0005882353,0.001176471,0.014118,0.001765,0.001764706,0.0005882353,1.084202e-19,2.168404e-19,-2.168404e-19,0.028235,0.007059,0.004117647,-2.168404e-19,0.001176,0.0005882353,0.0,1.084202e-19,-2.168404e-19,1.084202e-19,0.0,0.0,-2.168404e-19,0.0,2.168404e-19,0.0,0.0,0.002353,0.003529,0.0,0.001764706,0.007059,0.010588,0.0005882353,0.001176,0.0,0.001176,0.0005882353,0.003529,0.008824,0.004117647,0.0,1.734723e-18,0.001176,0.0,-2.168404e-19,-1.084202e-19,0.007647059,0.001176,0.0005882353,
0.007059,0.0,5.421011e-20,0.0,0.014118,0.0005882353,1.084202e-19,-2.168404e-19,0.003529,0.0,0.0,0.0,0.0,0.008235,0.0005882353,0.0,0.006471,0.0,0.007059,5.421011e-20,-1.084202e-19,0.005882,0.000588,0.001176471,0.0,0.0,0.012353,0.001176,0.007059,0.005294,-2.168404e-19,0.0005882353,-4.336809e-19,-1.084202e-19,0.0,0.001176,-2.710505e-20,0.0,0.001176,0.000588,0.006471,1.084202e-19,0.0,0.0005882353,0.004118,0.01,0.0,-5.421011e-20,0.004118,0.009412,0.021765,0.0,0.005294,0.0005882353,0.004118,0.021176,-8.673616999999999e-19,0.0,4.336809e-19,0.0,-5.421011e-20,2.168404e-19,0.0,0.0,5.421011e-20,0.004705882,0.0,-2.168404e-19,0.001765,0.015294,0.007647059,0.002353,0.001176,-4.336809e-19,0.001764706,0.007058824,0.006470588,2.168404e-19,0.0,0.001176,1.084202e-19,0.020588,0.0005882353,0.001765,0.0,-1.084202e-19,0.0,0.0,0.005294,0.014706,0.0,0.0,4.336809e-19,0.001176471,-3.469447e-18,2.168404e-19,-2.168404e-19,0.0,0.0,0.007647,8.673616999999999e-19,-1.734723e-18,0.013529,0.0,0.006470588,1.084202e-19,0.014706,0.008824,-2.168404e-19,0.0,0.0,0.012941,0.0,-4.336809e-19
1,0.010062,-5.421011e-20,-2.710505e-20,0.002573099,-1.084202e-19,0.0,0.0,-5.421011e-20,0.008175,0.001111,0.001111,0.0,-5.421011e-20,0.002698,0.0,0.00774,0.0069,1.084202e-19,0.001587302,-2.710505e-20,0.002222222,4.336809e-19,0.005556,-2.710505e-20,-2.710505e-20,0.019985,0.01262795,0.012072,0.001277139,0.0,0.0,0.01307103,-2.710505e-20,0.0,0.0,0.002911,0.0,0.001372,0.001461988,1.084202e-19,0.0,0.006273703,0.0,0.001634,0.0,0.0,0.001111,0.027106,0.0,-5.421011e-20,0.006939,0.001111111,0.008951051,0.005874,0.005556,-2.710505e-20,0.0,0.010147,0.001461988,0.0,0.001277139,-2.710505e-20,-2.168404e-19,0.0,0.00617284,4.336809e-19,0.01354155,0.002778212,0.0,0.003889,0.001587,0.001587,0.055365,-2.710505e-20,0.0,-5.421011e-20,0.002911126,-2.710505e-20,0.0,0.0,0.0,0.016797,0.003175,-2.710505e-20,0.0,0.0,0.001111,2.168404e-19,1.084202e-19,0.002388,0.002778,0.001111,0.0,0.001111111,0.004366,0.0,-4.336809e-19,0.00238825,0.0,0.003889,0.001111,0.0,-2.710505e-20,-2.710505e-20,0.004022,0.003684,0.0,0.0,0.0,-5.421011e-20,0.0,0.001277,0.003889,0.005835,0.005609539,-2.168404e-19,0.005,0.003889,0.0,0.001461988,0.00617284,0.001111,-2.710505e-20,-5.421011e-20,0.01407,-5.421011e-20,-8.673616999999999e-19,0.002911,0.001587,-5.421011e-20,0.001111111,0.001587302,0.002554,0.004022,0.02444,0.001462,-5.421011e-20,0.0,0.001587,0.007944674,0.001111111,0.0,0.012628,0.027095,0.00809,0.006887,0.0,0.002573099,0.004184682,0.004188,0.001587302,-1.084202e-19,0.0,-2.710505e-20,0.0,0.0,-2.710505e-20,0.0,-3.469447e-18,-5.421011e-20,-1.084202e-19,0.004637,0.0,2.168404e-19,0.001277139,0.002911126,0.004557964,0.002778212,0.006939,0.01569,0.0,-2.168404e-19,0.003175,-2.168404e-19,0.0,5.421011e-20,0.001111111,0.002911126,0.002778,0.001111111,2.168404e-19,0.0,-1.084202e-19,0.001111,0.0,0.0,0.003049,-2.710505e-20,-1.084202e-19,0.002222,0.0,-2.710505e-20,0.001634,0.0,0.012893,-2.168404e-19,0.017687,0.001587,0.0,-2.168404e-19,0.0,0.0,-2.710505e-20,0.0,1.084202e-19,4.336809e-19,0.001111,1.084202e-19,0.001462,0.0,0.001461988,
0.0,0.002222,-2.710505e-20,0.002911126,0.001111111,0.0,-2.710505e-20,-5.421011e-20,-5.421011e-20,0.001462,0.001111,-1.084202e-19,-2.710505e-20,0.073516,-2.710505e-20,0.003333,0.001461988,0.001277139,0.00561,0.013918,-1.084202e-19,0.001634,0.001634,0.049772,0.021749,0.003889,0.001462,0.0,-2.710505e-20,0.004545113,0.001587302,0.0,0.021717,0.001277139,-2.710505e-20,0.001277,0.001111,0.005133,0.001111111,0.0,-5.421011e-20,0.002573,0.0,-2.710505e-20,0.0,0.0,0.004809,0.014579,-2.710505e-20,0.016826,-5.421011e-20,0.0,0.008526,0.001461988,0.0,-2.168404e-19,-2.710505e-20,0.0,0.0,0.0,0.001462,0.001461988,-2.168404e-19,0.001111111,0.001111111,0.006173,0.00381,0.0,0.0,0.004366,0.002911126,-1.084202e-19,0.0,-4.336809e-19,0.001461988,-2.710505e-20,0.0,0.002911126,0.001277,0.001111111,0.009929,0.0,1.084202e-19,0.0,1.084202e-19,0.00985,0.001587,-2.710505e-20,0.001111,4.336809e-19,-5.421011e-20,0.0198971,0.00617284,0.0,-5.421011e-20,-2.710505e-20,0.001277,0.0,0.01588966,0.027339,0.001111,0.0,0.002911126,0.002911,0.004705,0.001111111,0.001462,-1.084202e-19,0.001277,-2.710505e-20,0.0
2,0.001523,0.0003030303,0.0003030303,1.6263029999999999e-19,0.0003030303,-6.505213e-19,-4.336809e-19,1.084202e-19,0.012206,0.000303,0.002056,0.0009090909,0.0003030303,0.000303,0.000817,0.004081,0.005907,0.001753247,0.0005050505,0.0003030303,0.0003030303,0.006122338,0.0038,0.0003030303,0.0003030303,0.020239,0.00682438,0.023617,0.001119671,0.000606,0.001285,0.003247133,5.421011e-20,0.005189,0.001818182,0.002591,0.001158,0.000606,1.084202e-19,0.001414141,0.002121,0.001119671,0.006559,0.000303,0.002389,0.0009090909,0.009466,0.014235,0.002151,1.084202e-19,0.021312,0.0008254963,0.000522466,0.001471,0.002777,0.0003030303,0.000505,0.012389,0.0006060606,0.0009090909,8.131516e-20,0.0003030303,0.002028762,0.002424,-5.421010999999999e-19,0.0006060606,0.006991439,-2.168404e-19,0.001515,0.00112,0.003666,0.010655,0.069426,0.0003030303,0.0009090909,0.0006060606,1.6263029999999999e-19,0.0003030303,0.000636,-1.734723e-18,0.000692,0.01314,0.007877,0.0003030303,0.000505,0.0005136107,0.000303,0.002424242,0.0006360306,0.002626,0.005086,0.000909,0.003419,0.0008441558,0.008602,0.000939,0.003899608,0.003774132,0.000522,0.003174,0.006954,0.001515,5.421011e-20,0.0003030303,0.001231,0.002173,0.001818182,0.002121,0.0009090909,0.0006060606,0.0009090909,0.000322,0.008248,0.010536,0.000522466,1.084202e-18,0.001928,0.000909,0.002323232,0.0005941771,0.001119671,0.000909,0.0003030303,0.0003030303,0.00922,0.0006060606,0.003846944,0.003406,0.003461,0.0005941771,0.005002986,0.001965718,0.001129,0.00145,0.009973,0.003388,0.0003030303,0.000909,0.000909,0.005567117,5.421011e-20,-8.673616999999999e-19,0.004005,0.014235,0.00924,0.01197,0.0005136107,0.001086597,3.2526069999999995e-19,0.003714,0.001111111,0.001594896,0.000606,0.0003030303,0.001515,0.000505,0.0003030303,0.0009090909,0.004836601,0.0003030303,0.0008441558,0.019592,0.003704,0.001440723,5.421011e-20,1.6263029999999999e-19,0.0008441558,0.0006060606,0.007374,0.010577,-4.336809e-19,0.001836268,0.000606,0.001743753,0.001515,0.0008254963,0.0006060606,1.
6263029999999999e-19,0.003636,5.421011e-20,0.002616745,0.001515,0.001594896,0.000333,0.000939,0.003413,0.001515,0.0003030303,0.0003030303,0.004724,0.001818,5.421011e-20,0.004683,0.000636,0.00716,0.002121212,0.02322,0.003784,-4.336809e-19,0.002424242,0.008435897,0.000303,0.0003030303,-3.2526069999999995e-19,0.001231464,0.002656233,0.003241,0.001086597,0.003333,0.001806,1.084202e-19,0.001515,0.008513,5.421011e-20,1.6263029999999999e-19,0.0006060606,0.001966,0.0003030303,0.0006060606,0.0006060606,0.002646,0.000303,0.0009090909,0.0003030303,0.020899,0.0003030303,0.001212,1.084202e-19,0.0006980971,0.015054,0.010713,0.0006060606,0.000522,0.0,0.032607,0.006902,0.000303,0.000909,-3.2526069999999995e-19,5.421011e-20,0.001375788,0.0006060606,0.001515,0.013858,8.131516e-20,0.0003030303,0.000939,0.002424,0.007518,0.001147186,0.001285,0.0003030303,0.010808,0.002602,0.0003030303,0.0003330003,0.001212,0.006037,0.029349,0.0003030303,0.00112,0.0003030303,0.002403,0.004775,0.004019139,0.000636,0.003157289,0.0003030303,0.0005136107,0.003363303,0.001515,0.000333,0.0003030303,4.336809e-19,5.421011e-20,0.0006060606,0.000625,0.006209,8.673616999999999e-19,0.000303,0.003904,0.003393349,0.0003030303,-8.673616999999999e-19,0.002746615,-4.336809e-19,0.0003030303,0.00901,0.0006360306,0.008188,1.084202e-19,0.006505,0.000927,0.001231464,0.002048,0.001086597,0.014853,0.003469,0.0003030303,0.00442,0.004644079,1.084202e-19,0.008996784,-5.421010999999999e-19,0.0009090909,0.0006060606,0.0003030303,0.003372,0.003772205,0.01019433,0.009354,0.000909,2.168404e-19,1.6263029999999999e-19,0.00299,0.003666,0.0006060606,0.00325,0.001212121,0.006778,0.0003030303,0.001818182
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04347826,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.130435,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04347826,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04347826,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04347826,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.26087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.04347826,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04347826,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04347826,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
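
As a sanity check on this interpretation, a k-means centroid is the column-wise mean of the rows assigned to its cluster, so each entry is an average venue frequency. Below is a minimal pure-Python sketch with made-up frequencies. The three categories, the values, and the helper name `centroids_from_labels` are illustrative assumptions, not data or code from this notebook.

```python
from collections import defaultdict

def centroids_from_labels(rows, labels):
    """Return {label: column-wise mean of the rows carrying that label}."""
    members = defaultdict(list)
    for row, label in zip(rows, labels):
        members[label].append(row)
    return {
        label: [sum(col) / len(group) for col in zip(*group)]
        for label, group in members.items()
    }

# Toy venue-frequency rows (each row sums to 1), not real Foursquare data.
categories = ["Coffee Shop", "Hotel", "Park"]
rows = [
    [0.50, 0.50, 0.00],
    [0.70, 0.30, 0.00],
    [0.20, 0.00, 0.80],
]
labels = [0, 0, 1]

centroids = centroids_from_labels(rows, labels)

# Rank the categories of each centroid from most to least frequent.
top_venues = {
    label: [name for _, name in sorted(zip(center, categories), reverse=True)]
    for label, center in centroids.items()
}
print(top_venues)
```

Ranking the entries of each centroid is the same idea the notebook applies to `kmeans.cluster_centers_`.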


In [34]:
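def return_most_common_venues(row, num_top_venues):
    # skip the first entry, which is the zipcode in rows of venue_grouped
    # (note: centroid rows have no zipcode, so this skips their first category, 'ATM')
    row_categories = row.iloc[1:]
    # sort venue categories from most to least frequent
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]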

In [35]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Cluster Labels']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
centroid_venues= pd.DataFrame(columns=columns)
centroid_venues['Cluster Labels'] = centroids.index
for ind in np.arange(kclusters):
    centroid_venues.iloc[ind, 1:] = return_most_common_venues(centroids.iloc[ind, :], num_top_venues)

centroid_venues

Unnamed: 0,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,0,Coffee Shop,Hotel,Cocktail Bar,Bakery,Breakfast Spot,Italian Restaurant,Sandwich Place,Seafood Restaurant,Sushi Restaurant,French Restaurant
1,1,Park,Coffee Shop,Pizza Place,Vietnamese Restaurant,Brewery,Grocery Store,Gas Station,Playground,Pub,Bakery
2,2,Coffee Shop,Pizza Place,Sandwich Place,Bar,Mexican Restaurant,Burger Joint,Park,Bakery,Ice Cream Shop,Pet Store
3,3,Park,Convenience Store,Coffee Shop,Pizza Place,Gym,Motel,Bus Station,Storage Facility,Baseball Field,Organic Grocery
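
The `try`/`except` labeling used for the column names is correct for up to 20 columns, but a 21st column would be labeled "21th". The sketch below shows one way to generate the suffix in general. `ordinal` and `venue_columns` are hypothetical helper names that I introduce here for illustration, not functions used elsewhere in the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    if 10 <= n % 100 <= 20:
        # 11th, 12th, 13th (and 111th, 112th, ...) are exceptions to the last-digit rule
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

def venue_columns(num_top_venues):
    """Build the 'Nth Most Common Venue' column names."""
    return [f"{ordinal(i)} Most Common Venue" for i in range(1, num_top_venues + 1)]

print(venue_columns(10))
```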


##### There are 17 neighborhoods in cluster 0, 9 in cluster 1, 33 in cluster 2, and 1 in cluster 3.  
Clusters 0 and 2 should both appeal to city dwellers: their most common venues are coffee shops, restaurants, and bars. Cluster 0 also ranks hotels and cocktail bars near the top, so it may cater to residents who prefer a more upscale lifestyle. 
With parks as the most frequent venue category, the neighborhoods in cluster 1 suit residents who enjoy the outdoors. The single neighborhood in cluster 3 appears to offer fewer amenities for a city lifestyle.

## Let's mark those neighborhoods on the map to see their locations.

First, find Seattle's location.

In [68]:
from API_KEY import API_KEY # my OpenCage API key, stored in a separate API_KEY module

In [4]:
 
address = "Seattle, Washington" # place name to geocode
url = 'https://api.opencagedata.com/geocode/v1/json' # OpenCage geocoding endpoint
response = requests.get(url, params={'q': address, 'key': API_KEY}) # requests URL-encodes the query parameters
obj = response.json() # parse the JSON response into a Python dictionary
results = obj['results'] # list of candidate matches
latitude = results[0]['geometry']['lat'] # latitude of the first match
longitude = results[0]['geometry']['lng'] # longitude of the first match
[latitude,longitude]

[47.6038321, -122.3300624]
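
When the request URL is assembled by hand, the place name should be URL-encoded, and the response may contain no matches. The sketch below illustrates both points with the standard library only, without making a network call. The endpoint is the one used in this notebook. The helper names `build_geocode_url` and `first_coordinates` and the sample response are my own assumptions for illustration.

```python
from urllib.parse import urlencode

OPENCAGE_ENDPOINT = "https://api.opencagedata.com/geocode/v1/json"

def build_geocode_url(address, api_key):
    """Return the request URL with the query parameters URL-encoded."""
    return f"{OPENCAGE_ENDPOINT}?{urlencode({'q': address, 'key': api_key})}"

def first_coordinates(response):
    """Return (lat, lng) of the first result, or None if there are no results."""
    results = response.get("results", [])
    if not results:
        return None
    geometry = results[0]["geometry"]
    return geometry["lat"], geometry["lng"]

# Hand-written response with the same shape as the fields read in the cell above.
sample = {"results": [{"geometry": {"lat": 47.6038321, "lng": -122.3300624}}]}

print(build_geocode_url("Seattle, Washington", "YOUR_KEY"))
print(first_coordinates(sample))
```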

Now add the neighborhoods to the map.

In [36]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)
# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add one circle marker per neighborhood, colored by cluster
for lat, lon, poi, cluster in zip(neighborhoods_venues_sorted['Latitude'], neighborhoods_venues_sorted['Longitude'], neighborhoods_venues_sorted['zipcode'], neighborhoods_venues_sorted['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color='black', # marker outline
        fill=True,
        fill_color=rainbow[cluster-1], # labels start at 0, so cluster 0 wraps around to the last color
        fill_opacity=1).add_to(map_clusters)
       
map_clusters
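
The cell above takes its colors from matplotlib's rainbow colormap and indexes them with `rainbow[cluster-1]`. As an alternative, the sketch below builds one hex color per cluster label with the standard library's `colorsys` module and indexes it directly by label. This is a different technique from the one in the notebook, and `cluster_palette` is a hypothetical helper name.

```python
import colorsys

def cluster_palette(n_clusters, saturation=0.85, value=0.9):
    """Return n_clusters hex colors with evenly spaced hues."""
    palette = []
    for i in range(n_clusters):
        r, g, b = colorsys.hsv_to_rgb(i / n_clusters, saturation, value)
        palette.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette

palette = cluster_palette(4)

# Index the palette by the 0-based cluster label itself.
color_of = {label: palette[label] for label in range(4)}
print(color_of)
```

Each value in `color_of` could be passed as `fill_color` to a marker, so that cluster 0 maps to the first color rather than the last.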

Not surprisingly, the neighborhoods in cluster 0 are concentrated in and around downtown Seattle.

### Now analyze the housing price data.

##### The dataset comes from https://www.zillow.com/seattle-wa/home-values/. Price data are available for only 34 of the 60 neighborhoods.

In [37]:
xl = pd.ExcelFile("seattle-wa.xls") # housing price data
xl.sheet_names

['All Homes',
 'Single Fam',
 'Condo',
 'Top Tier',
 'Middle Tier',
 'Bottom Tier',
 'Duplex',
 'Studio',
 'One Bed',
 'Two Bed',
 'Three Bed',
 'Four Bed',
 'Many Bed']

In [38]:
price_data=xl.parse('All Homes')
price_data.head()

Unnamed: 0,"Seattle, WA - All Homes",Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,Unnamed: 10,Unnamed: 11,Unnamed: 12,Unnamed: 13,Unnamed: 14,Unnamed: 15,Unnamed: 16,Unnamed: 17,Unnamed: 18,Unnamed: 19,Unnamed: 20,Unnamed: 21,Unnamed: 22,Unnamed: 23,Unnamed: 24,Unnamed: 25,Unnamed: 26,Unnamed: 27,Unnamed: 28,Unnamed: 29,Unnamed: 30,Unnamed: 31,Unnamed: 32,Unnamed: 33,Unnamed: 34,Unnamed: 35,Unnamed: 36,Unnamed: 37,Unnamed: 38,Unnamed: 39,Unnamed: 40,Unnamed: 41,Unnamed: 42,Unnamed: 43,Unnamed: 44,Unnamed: 45,Unnamed: 46,Unnamed: 47,Unnamed: 48
0,Statistic,,,Zillow Home Value Index,,,,,,Listings with price cut (%),,,,Median value per sq. ft. ($),,,,Median price cut (%),,,,Median list price ($),,,,Median sale price ($),,,,Median list price / sq. ft. ($),,,,Median rent list price ($),,,,Median rent list price / sq. ft. ($),,,,Homes foreclosed,,,,Zillow Rent Index,,,
1,Region Name,Region Type,Type,Current,Month Over Month,Quarter Over Quarter,Year Over Year,5 Year Annualized,10 Year Annualized,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year
2,Seattle,city,All Homes,741800,0.0119633,0.0159227,-0.0233292,0.0873,0.0581,0.071,-0.0621197,-0.132457,-0.0438624,638,0,0,0,0.0297,-1.5444e-05,-0.00332061,-0.006,698300,-0.00107296,0.000358166,0.00467626,713900,0.0106172,0.0323933,0.0143507,510,-0.00161673,-0.01217,0.00921816,2600,-0.0280374,-0.037037,0.04,2.1951,0.0286201,0.0173067,0.0569106,---,---,---,---,2650,0.00378788,-0.0221402,0.0114504
3,98101,zipcode,All Homes,687300,0.00758779,0.00737808,-0.0517746,0.083,0.0568,0.12,-0.0710112,-0.0895238,-0.0188889,800,0,0,0,---,---,---,---,735000,0.0809618,0.0777126,-0.109091,---,---,---,---,845,0.0295128,0.0287602,0.0420686,---,---,---,---,---,---,---,---,---,---,---,---,2410,-0.00823045,-0.0282258,-0.0474308
4,98102,zipcode,All Homes,725200,0.00868775,0.00729228,-0.0410737,0.0771,0.0525,0.0476,-0.0982143,-0.162907,-0.0624727,778,0,0,0,---,---,---,---,659000,-0.0237037,0.0544844,-0.174185,776800,0.200804,0.0685007,0.175901,633,0.000385881,-0.0160151,-0.0251632,---,---,---,---,2.7192,0.0119054,0.0228799,0.0574495,0,---,---,-0.0194,2630,0.0313725,0,0.0313725


In [39]:
temp=price_data.iloc[1:].reset_index(drop=True) # drop the first row, which holds the statistic group names
temp.head()

Unnamed: 0,"Seattle, WA - All Homes",Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,Unnamed: 10,Unnamed: 11,Unnamed: 12,Unnamed: 13,Unnamed: 14,Unnamed: 15,Unnamed: 16,Unnamed: 17,Unnamed: 18,Unnamed: 19,Unnamed: 20,Unnamed: 21,Unnamed: 22,Unnamed: 23,Unnamed: 24,Unnamed: 25,Unnamed: 26,Unnamed: 27,Unnamed: 28,Unnamed: 29,Unnamed: 30,Unnamed: 31,Unnamed: 32,Unnamed: 33,Unnamed: 34,Unnamed: 35,Unnamed: 36,Unnamed: 37,Unnamed: 38,Unnamed: 39,Unnamed: 40,Unnamed: 41,Unnamed: 42,Unnamed: 43,Unnamed: 44,Unnamed: 45,Unnamed: 46,Unnamed: 47,Unnamed: 48
0,Region Name,Region Type,Type,Current,Month Over Month,Quarter Over Quarter,Year Over Year,5 Year Annualized,10 Year Annualized,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year,Current,Month Over Month,Quarter Over Quarter,Year Over Year
1,Seattle,city,All Homes,741800,0.0119633,0.0159227,-0.0233292,0.0873,0.0581,0.071,-0.0621197,-0.132457,-0.0438624,638,0,0,0,0.0297,-1.5444e-05,-0.00332061,-0.006,698300,-0.00107296,0.000358166,0.00467626,713900,0.0106172,0.0323933,0.0143507,510,-0.00161673,-0.01217,0.00921816,2600,-0.0280374,-0.037037,0.04,2.1951,0.0286201,0.0173067,0.0569106,---,---,---,---,2650,0.00378788,-0.0221402,0.0114504
2,98101,zipcode,All Homes,687300,0.00758779,0.00737808,-0.0517746,0.083,0.0568,0.12,-0.0710112,-0.0895238,-0.0188889,800,0,0,0,---,---,---,---,735000,0.0809618,0.0777126,-0.109091,---,---,---,---,845,0.0295128,0.0287602,0.0420686,---,---,---,---,---,---,---,---,---,---,---,---,2410,-0.00823045,-0.0282258,-0.0474308
3,98102,zipcode,All Homes,725200,0.00868775,0.00729228,-0.0410737,0.0771,0.0525,0.0476,-0.0982143,-0.162907,-0.0624727,778,0,0,0,---,---,---,---,659000,-0.0237037,0.0544844,-0.174185,776800,0.200804,0.0685007,0.175901,633,0.000385881,-0.0160151,-0.0251632,---,---,---,---,2.7192,0.0119054,0.0228799,0.0574495,0,---,---,-0.0194,2630,0.0313725,0,0.0313725
4,98103,zipcode,All Homes,813200,0.0119215,0.0171974,-0.0279829,0.0846,0.0622,0.067,-0.0974053,-0.165688,-0.0399376,576,0,0,0,---,---,---,---,725000,-0.0129344,-3.44828e-05,-0.0189783,785000,0.0174984,0.016576,0.00692663,548,0.0247948,0.0247878,-0.020089,2650,-0.0185185,-0.0535714,0.06,2.1945,0.0291651,0.00126038,0.042246,---,---,---,---,2810,0.00716846,-0.0140351,0.0181159


In [40]:
new_header=temp.iloc[0] # the row that holds the column names
temp=temp.iloc[2:].reset_index(drop=True) # skip the header row and the citywide "Seattle" row
temp.columns=new_header
temp.head()

Unnamed: 0,Region Name,Region Type,Type,Current,Month Over Month,Quarter Over Quarter,Year Over Year,5 Year Annualized,10 Year Annualized,Current.1,Month Over Month.1,Quarter Over Quarter.1,Year Over Year.1,Current.2,Month Over Month.2,Quarter Over Quarter.2,Year Over Year.2,Current.3,Month Over Month.3,Quarter Over Quarter.3,Year Over Year.3,Current.4,Month Over Month.4,Quarter Over Quarter.4,Year Over Year.4,Current.5,Month Over Month.5,Quarter Over Quarter.5,Year Over Year.5,Current.6,Month Over Month.6,Quarter Over Quarter.6,Year Over Year.6,Current.7,Month Over Month.7,Quarter Over Quarter.7,Year Over Year.7,Current.8,Month Over Month.8,Quarter Over Quarter.8,Year Over Year.8,Current.9,Month Over Month.9,Quarter Over Quarter.9,Year Over Year.9,Current.10,Month Over Month.10,Quarter Over Quarter.10,Year Over Year.10
0,98101,zipcode,All Homes,687300,0.00758779,0.00737808,-0.0517746,0.083,0.0568,0.12,-0.0710112,-0.0895238,-0.0188889,800,0,0,0,---,---,---,---,735000,0.0809618,0.0777126,-0.109091,---,---,---,---,845,0.0295128,0.0287602,0.0420686,---,---,---,---,---,---,---,---,---,---,---,---,2410,-0.00823045,-0.0282258,-0.0474308
1,98102,zipcode,All Homes,725200,0.00868775,0.00729228,-0.0410737,0.0771,0.0525,0.0476,-0.0982143,-0.162907,-0.0624727,778,0,0,0,---,---,---,---,659000,-0.0237037,0.0544844,-0.174185,776800,0.200804,0.0685007,0.175901,633,0.000385881,-0.0160151,-0.0251632,---,---,---,---,2.7192,0.0119054,0.0228799,0.0574495,0,---,---,-0.0194,2630,0.0313725,0.0,0.0313725
2,98103,zipcode,All Homes,813200,0.0119215,0.0171974,-0.0279829,0.0846,0.0622,0.067,-0.0974053,-0.165688,-0.0399376,576,0,0,0,---,---,---,---,725000,-0.0129344,-3.44828e-05,-0.0189783,785000,0.0174984,0.016576,0.00692663,548,0.0247948,0.0247878,-0.020089,2650,-0.0185185,-0.0535714,0.06,2.1945,0.0291651,0.00126038,0.042246,---,---,---,---,2810,0.00716846,-0.0140351,0.0181159
3,98104,zipcode,All Homes,606800,0.0120578,0.0129127,-0.025664,0.0838,0.0474,---,---,---,---,654,0,0,0,---,---,---,---,693900,0.00734526,0.00734526,0.0280593,---,---,---,---,811,-0.00704918,0.0476694,-0.0541519,---,---,---,---,---,---,---,---,0,---,---,---,2410,0.00416667,-0.00413223,0.0168776
4,98105,zipcode,All Homes,991000,0.0048518,0.0117219,-0.0236257,0.0805,0.0594,0.0423,-0.101977,-0.125685,-0.135829,570,0,0,0,---,---,---,---,849900,0.039633,-0.102511,0.0301818,1100900,0.0016377,0.297772,0.082604,526,-0.00611888,-0.00904995,-0.0339314,2950,-0.0117253,0.0300279,0.0640216,---,---,---,---,0,---,---,---,3130,0.00320513,-0.0515152,0.00967742
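
The spreadsheet has a two-level header: the first row names a statistic group only in the group's first column, and the second row repeats sub-column names such as "Current", which is why those names are not unique after the step above. The sketch below shows one way to combine the two rows into unique names and to treat the `---` placeholder as missing. It works on plain Python lists, and `flatten_header`, `parse_cell`, and the `" | "` separator are assumptions for illustration, not part of the notebook.

```python
def flatten_header(groups, names, plain_groups=("Statistic",)):
    """Forward-fill group names and join them with the sub-column names."""
    flat = []
    current = None
    for group, name in zip(groups, names):
        if group not in (None, ""):
            current = group  # a new statistic group starts in this column
        if current is None or current in plain_groups:
            flat.append(name)  # identifier columns keep their plain name
        else:
            flat.append(f"{current} | {name}")
    return flat

def parse_cell(value):
    """Convert a cell to float, mapping the '---' placeholder to None."""
    if value in ("---", "", None):
        return None
    return float(value)

# First ten columns of the two header rows shown in the raw sheet.
groups = ["Statistic", None, None, "Zillow Home Value Index", None, None,
          None, None, None, "Listings with price cut (%)"]
names = ["Region Name", "Region Type", "Type", "Current", "Month Over Month",
         "Quarter Over Quarter", "Year Over Year", "5 Year Annualized",
         "10 Year Annualized", "Current"]

flat = flatten_header(groups, names)
print(flat)
```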


In [41]:
seattle_sale=temp.iloc[:,[0,3]].copy() # region name (zip code) and current Zillow Home Value Index
seattle_sale.columns=['zipcode','price']
seattle_sale.reset_index(drop=True,inplace=True)
seattle_sale.head()

Unnamed: 0,zipcode,price
0,98101,687300
1,98102,725200
2,98103,813200
3,98104,606800
4,98105,991000


In [42]:
seattle_sale.shape

(34, 2)

#### Now merge the neighborhoods_venues_sorted data with housing sale data.

In [43]:
seattle_mega=pd.merge(neighborhoods_venues_sorted,seattle_sale, on='zipcode')
seattle_mega['price']=seattle_mega['price'].astype('float')
seattle_mega

Unnamed: 0,Cluster Labels,zipcode,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,price
0,0,98101,47.61076,-122.336181,Hotel,Coffee Shop,Bakery,Breakfast Spot,Sushi Restaurant,Dumpling Restaurant,Theater,Market,French Restaurant,Rock Club,687300.0
1,2,98102,47.621611,-122.321227,Coffee Shop,Cocktail Bar,Ice Cream Shop,Café,Bar,Yoga Studio,Pizza Place,Mexican Restaurant,Sandwich Place,Italian Restaurant,725200.0
2,2,98103,47.673749,-122.343934,Coffee Shop,Zoo Exhibit,Bar,Ice Cream Shop,Pizza Place,Pub,Park,Burger Joint,Café,Japanese Restaurant,813200.0
3,0,98104,47.600708,-122.331334,Coffee Shop,Cocktail Bar,Vietnamese Restaurant,Hotel,Seafood Restaurant,Breakfast Spot,Sushi Restaurant,Italian Restaurant,Bookstore,Concert Hall,606800.0
4,2,98105,47.662934,-122.320552,Coffee Shop,Ice Cream Shop,Restaurant,Grocery Store,Bar,Bubble Tea Shop,Pub,Thai Restaurant,Pet Store,Korean Restaurant,991000.0
5,2,98106,47.516871,-122.35483,Coffee Shop,Pizza Place,Grocery Store,Burger Joint,Bank,Fried Chicken Joint,Pharmacy,Convenience Store,Playground,Fast Food Restaurant,524900.0
6,2,98107,47.664346,-122.38136,Brewery,Bar,Mexican Restaurant,Coffee Shop,Cocktail Bar,Ice Cream Shop,New American Restaurant,Clothing Store,Sushi Restaurant,Sandwich Place,782600.0
7,1,98108,47.567587,-122.322364,Brewery,Coffee Shop,Pizza Place,Food Truck,BBQ Joint,Bar,Pub,Taco Place,Grocery Store,Café,579500.0
8,2,98109,47.633123,-122.348679,Coffee Shop,Museum,Mexican Restaurant,Restaurant,Camera Store,Italian Restaurant,Gym / Fitness Center,Park,Pizza Place,Bakery,740900.0
9,0,98112,47.626709,-122.306787,Coffee Shop,Italian Restaurant,Cocktail Bar,Bakery,Sushi Restaurant,Park,Garden,Scenic Lookout,American Restaurant,Indian Restaurant,1249200.0
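
The merge above is an inner join by default, so any zip code present in only one of the two tables is dropped without notice. A minimal sketch of how an outer merge with `indicator=True` could list the unmatched zip codes; the frames `venues_demo` and `sale_demo` and all their values are made-up placeholders, not rows from the data above:

```python
import pandas as pd

# Hypothetical stand-ins for neighborhoods_venues_sorted and seattle_sale.
venues_demo = pd.DataFrame({
    "zipcode": ["00001", "00002", "00003"],
    "Cluster Labels": [0, 2, 1],
})
sale_demo = pd.DataFrame({
    "zipcode": ["00001", "00002", "00004"],
    "price": [500000, 600000, 700000],
})

# An outer merge keeps every zip code; the _merge column records where it came from.
check = pd.merge(venues_demo, sale_demo, on="zipcode", how="outer", indicator=True)

# Zip codes with venue data but no price, and the reverse.
no_price = check.loc[check["_merge"] == "left_only", "zipcode"].tolist()
no_venues = check.loc[check["_merge"] == "right_only", "zipcode"].tolist()

# The default inner merge keeps only the matched rows.
matched = pd.merge(venues_demo, sale_demo, on="zipcode")
print(no_price, no_venues, len(matched))
```

Running the same check on the real tables would show which neighborhoods were lost because they lack price information.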


In [44]:
cluster_price=seattle_mega[['Cluster Labels','price']].groupby(['Cluster Labels'],as_index=False).mean()
cluster_price

Unnamed: 0,Cluster Labels,price
0,0,787416.666667
1,1,688837.5
2,2,658410.526316
3,3,411600.0
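
A cluster mean can rest on very few neighborhoods, so it may help to report the group size and the median next to it. A sketch using pandas named aggregation on a made-up frame; `demo` and its values are placeholders, not the data above:

```python
import pandas as pd

# Hypothetical cluster labels and prices, for illustration only.
demo = pd.DataFrame({
    "Cluster Labels": [0, 0, 1, 2, 2, 2],
    "price": [600000.0, 800000.0, 500000.0, 400000.0, 450000.0, 650000.0],
})

# One row per cluster: number of neighborhoods, mean price, median price.
summary = (
    demo.groupby("Cluster Labels", as_index=False)
        .agg(n=("price", "count"),
             mean_price=("price", "mean"),
             median_price=("price", "median"))
)
print(summary)
```

A cluster with `n` equal to 1 has a mean that is just one neighborhood's price, which is worth keeping in mind when comparing clusters.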


In [45]:
# download the zip code boundary GeoJSON file from ArcGIS Hub
!wget --quiet https://opendata.arcgis.com/datasets/e6c555c6ae7542b2bdec92485892b6e6_113.geojson

In [60]:
zip_list=seattle_mega['zipcode'].tolist() #neighborhoods_venues_sorted['zipcode'].tolist() 
zip_list

['98101',
 '98102',
 '98103',
 '98104',
 '98105',
 '98106',
 '98107',
 '98108',
 '98109',
 '98112',
 '98115',
 '98116',
 '98117',
 '98118',
 '98119',
 '98121',
 '98122',
 '98125',
 '98126',
 '98133',
 '98134',
 '98136',
 '98144',
 '98146',
 '98148',
 '98155',
 '98164',
 '98166',
 '98168',
 '98177',
 '98178',
 '98188',
 '98198',
 '98199']

In [61]:
with open('e6c555c6ae7542b2bdec92485892b6e6_113.geojson', 'r') as jsonFile:
    geo_data = json.load(jsonFile)

In [62]:
tmp = geo_data
# keep only the GeoJSON features whose ZIP code is in zip_list
geozips = []
for i in range(len(tmp['features'])):
    if tmp['features'][i]['properties']['ZIPCODE'] in zip_list:
        geozips.append(tmp['features'][i])
# creating new JSON object
new_json = dict.fromkeys(['type','features'])
new_json['type'] = 'FeatureCollection'
new_json['features'] = geozips
# save updated JSON object
open("cleaned_geodata.json", "w").write(json.dumps(new_json, sort_keys=True, indent=4, separators=(',', ': ')))

3618622
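
The filtering step above can also be written as a small reusable function. The sketch below is one possible variant, not the notebook's own code: `filter_features` is a hypothetical helper, `demo_geo` is a tiny made-up FeatureCollection standing in for the ArcGIS file, and the output goes to a temporary directory so nothing in the working folder is overwritten.

```python
import json
import os
import tempfile

def filter_features(geojson, keep_values, prop="ZIPCODE"):
    """Return a new FeatureCollection keeping only features whose property is in keep_values."""
    keep = set(keep_values)  # set membership is cheap to test
    features = [f for f in geojson["features"] if f["properties"].get(prop) in keep]
    return {"type": "FeatureCollection", "features": features}

# Made-up data: three features with placeholder ZIP codes and no geometry.
demo_geo = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"ZIPCODE": "00001"}, "geometry": None},
        {"type": "Feature", "properties": {"ZIPCODE": "00002"}, "geometry": None},
        {"type": "Feature", "properties": {"ZIPCODE": "00003"}, "geometry": None},
    ],
}

filtered = filter_features(demo_geo, ["00001", "00003"])

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "demo_cleaned_geodata.json")
    # The with-block closes the file even if json.dump raises.
    with open(out_path, "w") as out_file:
        json.dump(filtered, out_file, sort_keys=True, indent=4)
    with open(out_path, "r") as in_file:
        reloaded = json.load(in_file)

kept_zips = [f["properties"]["ZIPCODE"] for f in reloaded["features"]]
print(kept_zips)
```

Writing through a `with` block differs from the `open(...).write(...)` call above only in that the file handle is closed explicitly.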

In [63]:
with open('cleaned_geodata.json', 'r') as jsonFile:
    seattle_geo = json.load(jsonFile)

In [64]:
type(seattle_geo)

dict

In [65]:
seattle_map = folium.Map(location=[latitude, longitude], zoom_start=12, tiles='Mapbox Bright')

In [66]:
# generate choropleth map.
# folium.Choropleth is the class-based replacement for the deprecated Map.choropleth method
folium.Choropleth(
    geo_data=seattle_geo,
    data=seattle_sale,
    columns=['zipcode', 'price'],
    key_on='feature.properties.ZIPCODE',
    fill_color='YlOrRd', 
    fill_opacity=0.7, 
    line_opacity=0.2,
    legend_name='House Price',
    #reset=True
).add_to(seattle_map)
seattle_map

In [67]:
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(seattle_mega['Latitude'], seattle_mega['Longitude'], seattle_mega['zipcode'], seattle_mega['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color='darkgreen', #rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=1).add_to(seattle_map)
seattle_map
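
In the marker loop above, `rainbow[cluster-1]` sends cluster 0 to the last color through negative indexing; the colors stay distinct, but the mapping is implicit. A sketch of an explicit label-to-color lookup, assuming four clusters (labels 0 to 3, as in the cluster price table); `kclusters_demo`, `palette` and `cluster_color` are illustrative names, not variables from the notebook:

```python
import numpy as np
from matplotlib import cm, colors

kclusters_demo = 4  # assumed number of clusters (labels 0..3)

# One evenly spaced rainbow color per cluster, converted to hex strings.
palette = [colors.to_hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters_demo))]

# Explicit label -> color lookup, with no reliance on negative-index wraparound.
cluster_color = {label: palette[label] for label in range(kclusters_demo)}
print(cluster_color)
```

With such a dictionary, the marker call could use `fill_color=cluster_color[cluster]`, and the same dictionary could label a legend.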


The map above shows the housing price (choropleth shading) and the amenity cluster (circle markers) of each neighborhood.  

## Results and Discussion

Seattle offers neighborhoods with varied characteristics that can satisfy the needs of different people. Both for people who enjoy metropolitan life and for people who prefer the beauty of nature, there are expensive houses around central Seattle and more affordable choices in other neighborhoods. In this sample, the average price per amenity cluster ranges from about 412,000 in cluster 3 to about 787,000 in cluster 0.  

There are two limitations to my data. First, only 34 neighborhoods have housing price information. Second, an ideal dataset for this analysis would contain amenity and housing price information for each housing transaction rather than zip-code-level averages. My study thus provides only limited guidance for people searching for a new home in the Seattle area. 