# Finding alternative cities to live in the Northwest US

## Table of contents
* [Introduction](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction <a name="introduction"></a>

#### Background
Over the past decade, Seattle has led other metropolitan cities as the fastest-growing city in the United States[1]. Such rapid growth, however, has left many existing residents feeling crowded and displaced[2], particularly those who were first drawn to the area for its "small city" character and affordable housing[3]. This real estate boom has not been limited to the greater Seattle area: housing prices across the entire state of Washington have undergone unprecedented growth[4].


#### Problem
The steep rise in the cost of living is starting to compel a number of Washington residents to seek alternative places to live, particularly retirees and those who can telecommute or whose job prospects aren't tied to a particular location. The question for this group is *how to even get started browsing prospective places to move*, as Washington State alone has 211 cities[5].

To answer this question, we'll start with the assumption that greater Seattle residents looking to move are still interested in living in the Northwest US and want to find alternative cities with amenities similar to those of their current one. Given this scope, we can sample all cities in Washington and the adjacent states (Oregon and Idaho) to create a kind of "fingerprint" of popular venues (such as certain types of restaurants, stores and natural areas) for each city, and then use this to identify similarities between cities. The findings of this exercise could then be used as a recommendation guide for further, in-person real estate research.

#### Audience
The primary audience of this study includes realtors and potential home buyers and renters in the Northwest (Washington/Idaho/Oregon) region. The findings could also be used by Northwest entrepreneurs looking to open new businesses, or as a way of fostering outreach and partnerships among Northwest municipal chambers of commerce.

## Data <a name="data"></a>

#### Sources

To obtain a list of Northwest cities, we'll scrape Wikipedia for a list of cities in Washington[5], Oregon[6] and Idaho[7]. We'll use the Foursquare venue recommendation API[8] to obtain a list of the most popular venues for each city and query location data (latitude/longitude) using the Mapquest Geocoding API[9] in order to map all the cities and visualize the clusters.

#### Tools

We'll use the following Python libraries as commented below.

In [6]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes 

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests 

from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import, pandas 1.0 and later)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans # k-means clustering, used in the analysis stage

!conda install -c conda-forge folium=0.5.0 --yes 

import folium # map rendering library

print('Libraries imported.')

Collecting package metadata: ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Collecting package metadata: ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Libraries imported.


#### Preparation

First, obtain a list of cities in each of Washington, Oregon and Idaho by scraping the Wikipedia pages on those topics. The lists of cities on those pages are structured in tables, so we can easily use Pandas to read in the HTML table and convert it to a dataframe. We'll set up a new dataframe to store the location data for each Northwest city.

In [7]:
# Get a list of cities of Washington State and prepare a dataframe
tables = pd.read_html("https://en.wikipedia.org/wiki/List_of_cities_and_towns_in_Washington")
df_wa = tables[0]
df_wa.columns = df_wa.iloc[0]
df_wa = df_wa.iloc[2:]
df_wa = df_wa[['Name']]
df_wa.rename(columns={'Name': 'City'}, inplace=True)
df_wa = df_wa.reset_index(drop=True)
df_wa['State']='WA'
df_wa['Latitude']=''
df_wa['Longitude']=''
df_wa

Unnamed: 0,City,State,Latitude,Longitude
0,Aberdeen,WA,,
1,Airway Heights,WA,,
2,Algona,WA,,
3,Anacortes,WA,,
4,Arlington,WA,,
5,Asotin,WA,,
6,Auburn,WA,,
7,Bainbridge Island,WA,,
8,Battle Ground,WA,,
9,Bellevue,WA,,


In [8]:
# Get a list of cities of Oregon State and prepare a dataframe
tables = pd.read_html("https://en.wikipedia.org/wiki/List_of_cities_in_Oregon")
df_or = tables[1]
df_or.columns = df_or.iloc[0]
df_or = df_or.iloc[2:]
df_or = df_or[['City']]
df_or['City'] = df_or['City'].str.replace('�', '')
df_or = df_or.reset_index(drop=True)
df_or['State']='OR'
df_or['Latitude']=''
df_or['Longitude']=''
df_or

Unnamed: 0,City,State,Latitude,Longitude
0,Salem,OR,,
1,Eugene,OR,,
2,Gresham,OR,,
3,Hillsboro,OR,,
4,Beaverton,OR,,
5,Bend,OR,,
6,Medford,OR,,
7,Springfield,OR,,
8,Corvallis,OR,,
9,Albany,OR,,


In [9]:
# Get a list of cities of Idaho State and prepare a dataframe
tables = pd.read_html("https://en.wikipedia.org/wiki/List_of_cities_in_Idaho")
df_id = tables[0]
df_id.columns = df_id.iloc[0]
df_id = df_id.iloc[2:]
df_id = df_id[['City']]
df_id['City'] = df_id['City'].str.replace('�', '')
df_id = df_id.reset_index(drop=True)
df_id['State']='ID'
df_id['Latitude']=''
df_id['Longitude']=''
df_id

Unnamed: 0,City,State,Latitude,Longitude
0,Meridian,ID,,
1,Nampa,ID,,
2,Idaho Falls,ID,,
3,Pocatello,ID,,
4,Caldwell,ID,,
5,Cœur d'Alene,ID,,
6,Twin Falls,ID,,
7,Lewiston,ID,,
8,Post Falls,ID,,
9,Rexburg,ID,,


Consolidate all the dataframes into a single one.

In [12]:
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
df_northwest = pd.concat([df_wa, df_or, df_id], ignore_index=True)
df_northwest.shape

(652, 4)

Next, use the Mapquest API to look up the location (longitude and latitude) of each city. This is called *geocoding*. Cache the results in CSV format so it's quick to load them again if needed.

In [13]:
# Add Mapquest credentials to run the following code
MAPQUEST_KEY = ''
MAPQUEST_SECRET = ''

In [24]:
for index, row in df_northwest.iterrows():
    location = row['City'] + "," + row['State']
    url = 'https://www.mapquestapi.com/geocoding/v1/address?key={}&inFormat=kvp&outFormat=json&location={}&thumbMaps=false'.format(
        MAPQUEST_KEY,
        location)
    response = requests.get(url).json()
    df_northwest.at[index,'Latitude'] = response['results'][0]['locations'][0]['latLng']['lat']
    df_northwest.at[index,'Longitude'] = response['results'][0]['locations'][0]['latLng']['lng']
    
df_northwest.to_csv('NorthwestCities.csv', sep=',',index=False)
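
The cell above writes the cache but never reads it back. The following is a minimal sketch of the reload step, assuming the same `NorthwestCities.csv` file name. The helper `load_or_build` and its `build` callback are hypothetical names introduced for illustration, and `fake_geocode` merely stands in for the geocoding loop so the example runs without an API key.

```python
import os
import tempfile

import pandas as pd


def load_or_build(path, build):
    """Return the cached dataframe at `path`, or call `build()` and cache its result."""
    if os.path.exists(path):
        return pd.read_csv(path)
    df = build()
    df.to_csv(path, index=False)
    return df


# Demo with a throwaway file and a stand-in for the geocoding loop.
calls = []


def fake_geocode():
    calls.append(1)  # count how often the expensive step actually runs
    return pd.DataFrame({'City': ['Aberdeen'], 'State': ['WA'],
                         'Latitude': [46.9755], 'Longitude': [-123.816]})


cache_path = os.path.join(tempfile.mkdtemp(), 'NorthwestCities.csv')
first = load_or_build(cache_path, fake_geocode)   # runs fake_geocode and writes the file
second = load_or_build(cache_path, fake_geocode)  # reads the file instead
print(len(calls))
```

A side benefit of reloading from CSV is that the coordinate columns come back as floats rather than the object columns created by initializing them with empty strings.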

Check the master dataframe.

In [26]:
df_northwest

Unnamed: 0,City,State,Latitude,Longitude
0,Aberdeen,WA,46.9755,-123.816
1,Airway Heights,WA,47.643,-117.593
2,Algona,WA,47.2791,-122.25
3,Anacortes,WA,48.5054,-122.632
4,Arlington,WA,48.1913,-122.126
5,Asotin,WA,46.3406,-117.049
6,Auburn,WA,47.3075,-122.226
7,Bainbridge Island,WA,47.6431,-122.527
8,Battle Ground,WA,45.7807,-122.548
9,Bellevue,WA,47.6137,-122.191


Map out the cities to ensure location data looks correct.

In [27]:
# Use geopy library to get the latitude and longitude values of Washington State
address = 'Washington State, USA'

geolocator = Nominatim(user_agent="northwest_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Washington are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Washington are 47.2868352, -120.2126139.


In [28]:
# create map of the Northwest US using latitude and longitude values
map_northwest = folium.Map(location=[latitude, longitude], zoom_start=5)

# add markers to map
for lat, lng, name in zip(df_northwest['Latitude'], df_northwest['Longitude'], df_northwest['City']):
    label = '{}'.format(name)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_northwest)  
    
map_northwest
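
Eyeballing the map can be complemented with a numerical check. The sketch below flags rows whose coordinates fall outside a rough bounding box around Washington, Oregon and Idaho. The box limits and the helper name `outside_northwest` are assumptions made for illustration, not part of the original notebook, and the sample rows are made up.

```python
import pandas as pd

# Rough bounding box for WA, OR and ID (approximate, chosen for illustration).
LAT_MIN, LAT_MAX = 41.5, 49.5
LNG_MIN, LNG_MAX = -125.5, -110.5


def outside_northwest(df):
    """Return rows whose coordinates fall outside the rough bounding box."""
    # The columns may hold strings or objects, so coerce them to numbers first;
    # values that cannot be parsed become NaN and are flagged as well.
    lat = pd.to_numeric(df['Latitude'], errors='coerce')
    lng = pd.to_numeric(df['Longitude'], errors='coerce')
    ok = lat.between(LAT_MIN, LAT_MAX) & lng.between(LNG_MIN, LNG_MAX)
    return df[~ok]


sample = pd.DataFrame({
    'City': ['Aberdeen', 'Bellevue', 'Misplaced'],
    'State': ['WA', 'WA', 'WA'],
    'Latitude': [46.9755, 47.6137, 39.0],     # last row is deliberately wrong
    'Longitude': [-123.816, -122.191, -77.0],
})
suspects = outside_northwest(sample)
print(suspects[['City', 'State']])
```

An ambiguous city name can geocode to a same-named place elsewhere in the country, which is the kind of error this check is meant to surface.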

Now we're ready to query the Foursquare API for the top venues of each city.

In [18]:
# To run the remainder of this notebook yourself, obtain Foursquare developer credentials and add them here
CLIENT_ID = '' # your Foursquare ID
CLIENT_SECRET = '' # your Foursquare Secret
VERSION = '' # Foursquare API version

Let's explore the first city in the dataframe to make sure everything is working correctly.

In [20]:
city_latitude = df_northwest.loc[0, 'Latitude'] # City latitude value
city_longitude = df_northwest.loc[0, 'Longitude'] # City longitude value

city_name = df_northwest.loc[0, 'City'] # Name
city_state = df_northwest.loc[0, 'State'] # State

print('Latitude and longitude values of {}, {} are: {}, {}.'.format(city_name,
                                                               city_state,
                                                               city_latitude, 
                                                               city_longitude))

Latitude and longitude values of Aberdeen, WA are: 46.97551, -123.815517.


In [21]:
# Get the top 100 venues within the default city radius
LIMIT = 100 # limit of number of venues returned by Foursquare API

# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    city_latitude, 
    city_longitude, 
    LIMIT)

results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5c9787a04c1f6729025af446'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': '$-$$$$', 'key': 'price'},
    {'name': 'Open now', 'key': 'openNow'}]},
  'suggestedRadius': 10000,
  'headerLocation': 'Aberdeen',
  'headerFullLocation': 'Aberdeen',
  'headerLocationGranularity': 'city',
  'totalResults': 71,
  'suggestedBounds': {'ne': {'lat': 46.98616500184368,
    'lng': -123.7037284668247},
   'sw': {'lat': 46.9434858956073, 'lng': -123.91647086388492}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4d828e4b5e70224bbc0ce608',
       'name': 'Oceana Spa',
       'location': {'address': '501 W Wishkah St',
        'lat': 46.971636655386625,
        'lng': -123.82257641754265,
        'labeledLatLngs

Looks good so far. Let's define a function to extract the category of a given venue.

In [29]:
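def get_category_type(row):
    # The category list lives under 'categories' or, after flattening, 'venue.categories'
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    # Return the name of the first listed category, or None if there is none
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']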

Next we'll structure the returned venue data into a dataframe and filter based on category.

In [30]:
# Clean the data and structure it as a dataframe
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Oceana Spa,Spa,46.971637,-123.822576
1,Rediviva Restaurant,New American Restaurant,46.974826,-123.817105
2,Sucher & Sons Star Wars Shop,Toy / Game Store,46.976254,-123.813901
3,Amore' Italian Restaurant,Italian Restaurant,46.973007,-123.817937
4,Billy's Bar and Grill,Bar,46.975505,-123.81354


Looks good. Now let's set up a function to do this across all our Northwest cities.

In [31]:
# Create a function to repeat the same process to all cities
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['City', 
                  'City Latitude', 
                  'City Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
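
The list comprehension in `getNearbyVenues` indexes `categories[0]` directly, which raises an `IndexError` if a venue comes back with an empty category list. The sketch below shows one tolerant way to flatten the items for a single city. The helper name `flatten_items` is hypothetical, and the payload is hand-written to mimic the shape of the response shown earlier, not real API output.

```python
def flatten_items(city, lat, lng, items):
    """Turn the 'items' list for one city into flat tuples, tolerating missing categories."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Use the first listed category when there is one, otherwise None.
        category = categories[0]['name'] if categories else None
        rows.append((city, lat, lng,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows


# Hand-written payload shaped like the response above (values are illustrative).
items = [
    {'venue': {'name': 'Oceana Spa',
               'location': {'lat': 46.9716, 'lng': -123.8226},
               'categories': [{'name': 'Spa'}]}},
    {'venue': {'name': 'Uncategorized Place',
               'location': {'lat': 46.97, 'lng': -123.82},
               'categories': []}},
]
rows = flatten_items('Aberdeen', 46.97551, -123.815517, items)
for row in rows:
    print(row)
```

Rows with a `None` category can then be dropped or kept explicitly before the one-hot encoding step, rather than aborting the whole run partway through the city list.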

And run it through the full list of Northwest cities.

In [32]:
# Run the function on each city and store in new dataframe
# CAUTION: You only get 950 Foursquare API calls per day with a "Sandbox Tier" (free) account
northwest_venues = getNearbyVenues(names=df_northwest['City'],
                                   latitudes=df_northwest['Latitude'],
                                   longitudes=df_northwest['Longitude'])

Aberdeen
Airway Heights
Algona
Anacortes
Arlington
Asotin
Auburn
Bainbridge Island
Battle Ground
Bellevue
Bellingham
Benton City
Bingen
Black Diamond
Blaine
Bonney Lake
Bothell
Bremerton
Brewster
Bridgeport
Brier
Buckley
Burien
Burlington
Camas
Carnation
Cashmere
Castle Rock
Centralia
Chehalis
Chelan
Cheney
Chewelah
Clarkston
Cle Elum
Clyde Hill
Colfax
College Place
Colville
Connell
Cosmopolis
Covington
Davenport
Dayton
Deer Park
Des Moines
DuPont
Duvall
East Wenatchee
Edgewood
Edmonds
Electric City
Ellensburg
Elma
Entiat
Enumclaw
Ephrata
Everett
Everson
Federal Way
Ferndale
Fife
Fircrest
Forks
George
Gig Harbor
Gold Bar
Goldendale
Grand Coulee
Grandview
Granger
Granite Falls
Harrington
Hoquiam
Ilwaco
Issaquah
Kahlotus
Kalama
Kelso
Kenmore
Kennewick
Kent
Kettle Falls
Kirkland
Kittitas
La Center
Lacey
Lake Forest Park
Lake Stevens
Lakewood
Langley
Leavenworth
Liberty Lake
Long Beach
Longview
Lynden
Lynnwood
Mabton
Maple Valley
Marysville
Mattawa
McCleary
Medical Lake
Medina
Mercer Islan

In [33]:
# Cache the results in case we need to reload the dataframe (for debugging purposes)
northwest_venues.to_csv('NorthwestVenues.csv', sep=',',index=False)

Run a sanity check on our dataframe of city venues to ensure everything looks in order.

In [34]:
# Check the size of dataframe
print(northwest_venues.shape)
northwest_venues.head()

(5934, 7)


Unnamed: 0,City,City Latitude,City Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Aberdeen,46.97551,-123.815517,Rediviva Restaurant,46.974826,-123.817105,New American Restaurant
1,Aberdeen,46.97551,-123.815517,Sucher & Sons Star Wars Shop,46.976254,-123.813901,Toy / Game Store
2,Aberdeen,46.97551,-123.815517,Billy's Bar and Grill,46.975505,-123.81354,Bar
3,Aberdeen,46.97551,-123.815517,Amore' Italian Restaurant,46.973007,-123.817937,Italian Restaurant
4,Aberdeen,46.97551,-123.815517,Breakwater Seafood And Chowder,46.975644,-123.811814,Seafood Restaurant


Now we can count the number of venues for each city. Some cities (for example, [bedroom communities](https://www.merriam-webster.com/dictionary/bedroom%20community)) have very few venue entries on Foursquare. Testing different limits showed that a city needs roughly 10 or more venue entries to have an adequate venue "profile" for meaningful clustering against other cities. Given that, we'll drop cities with fewer than 10 venues for the remainder of this study.

In [35]:
# Tally up the total venues per city
grouped = northwest_venues.groupby('City').count()
print('Original count of cities: {}'.format(len(grouped.index)))

# Drop cities with too little venue data (they skew the clustering results)
sparse_cities = grouped[grouped.Venue < 10].index
northwest_venues = northwest_venues[~northwest_venues['City'].isin(sparse_cities)]
grouped2 = northwest_venues.groupby('City').count()

print('Count of cities with at least 10 venues: {}'.format(len(grouped2.index)))

Original count of cities: 593
Count of cities with at least 10 venues: 180
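
Because `northwest_venues` records only the city name, any cities that share a name across states would be pooled into one group by `groupby('City')`. The toy sketch below applies the same 10-venue threshold per (city, state) pair instead. It assumes a `State` column has been carried into the venue table, which the notebook above does not do, and all rows are made up.

```python
import pandas as pd

MIN_VENUES = 10  # threshold used in this study

# Toy venue table; the 'State' column is an assumption for this sketch.
toy = pd.DataFrame({
    'City':  ['Aberdeen'] * 12 + ['Aberdeen'] * 3 + ['Brier'] * 2,
    'State': ['WA'] * 12 + ['ID'] * 3 + ['WA'] * 2,
    'Venue': ['venue {}'.format(i) for i in range(17)],
})

# Count venues per (City, State) and broadcast the count back onto each row.
counts = toy.groupby(['City', 'State'])['Venue'].transform('count')
kept = toy[counts >= MIN_VENUES]

print(kept.groupby(['City', 'State']).size())
```

Grouping this toy table by `City` alone would count 15 venues for a single "Aberdeen", so the three rows from the second state would be retained along with the other twelve.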


How many unique categories are there among the returned venues?

In [37]:
print('There are {} unique categories.'.format(len(northwest_venues['Venue Category'].unique())))

There are 308 unique categories.


## Methodology <a name="methodology"></a>

Now that we've gathered and prepped all the data we need, we're ready to analyze it. We'll use a popular unsupervised machine learning algorithm called [k-means clustering](https://en.wikipedia.org/wiki/K-means_clustering) that enables us to partition observations into a specified number of clusters in order to discover underlying patterns. For our data, we'll find the top 5 venue categories for each city (based on occurrences in the dataset), and use that as each city's vector profile for finding similarities with other cities.
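
To illustrate what the clustering step does with such profiles, here is a toy sketch using scikit-learn's `KMeans` on four made-up cities described by three made-up category shares. The numbers, the number of clusters, and the `random_state` are all assumptions for illustration and say nothing about the settings used in the actual analysis.

```python
import numpy as np
from sklearn.cluster import KMeans

# Each row is a made-up city profile: share of venues that are
# [coffee shops, breweries, parks]. Values are illustrative only.
profiles = np.array([
    [0.80, 0.10, 0.10],   # city A: coffee-heavy
    [0.75, 0.15, 0.10],   # city B: coffee-heavy
    [0.10, 0.10, 0.80],   # city C: park-heavy
    [0.05, 0.15, 0.80],   # city D: park-heavy
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(profiles)
labels = kmeans.labels_
print(labels)  # cluster ids are arbitrary; only the grouping matters
```

Cities A and B end up in one cluster and C and D in the other, because k-means groups rows whose vectors lie close together.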

First we need to calculate the average frequency of each venue category within each city. We can quickly do this with a Pandas dataframe by converting each venue category into a boolean (yes/no) column using [one-hot](https://en.wikipedia.org/wiki/One-hot) encoding.
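
To make the idea concrete before applying it to the full dataset, here is a toy sketch with two cities and three categories. The rows are made up, but the column names match the venue dataframe built above.

```python
import pandas as pd

toy = pd.DataFrame({
    'City': ['Aberdeen', 'Aberdeen', 'Aberdeen', 'Aberdeen', 'Brier', 'Brier'],
    'Venue Category': ['Bar', 'Bar', 'Spa', 'Park', 'Park', 'Park'],
})

# One column per category; each row has a single 1 marking its category.
onehot = pd.get_dummies(toy['Venue Category'], dtype=float)
onehot.insert(0, 'City', toy['City'])

# Averaging the indicator columns per city gives each category's
# share of that city's venues.
freq = onehot.groupby('City').mean()
print(freq)

# The most frequent categories for one city, highest first.
print(freq.loc['Aberdeen'].sort_values(ascending=False).head(2))
```

In this toy table, half of Aberdeen's four venues are bars, so its `Bar` frequency is 0.5, and every city's row sums to 1. These per-city frequency rows are the vectors that the clustering step compares.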

In [38]:
# one hot encoding
northwest_onehot = pd.get_dummies(northwest_venues[['Venue Category']], prefix="", prefix_sep="")

# Add city column back to dataframe
northwest_onehot['City'] = northwest_venues['City'] 

# move city column to the first column
fixed_columns = [northwest_onehot.columns[-1]] + list(northwest_onehot.columns[:-1])
northwest_onehot = northwest_onehot[fixed_columns]

# Check size of new dataframe
northwest_onehot.shape

(4272, 309)

The dataframe shape looks correct: its 309 columns are the 308 unique venue categories we counted earlier plus the City column.
Next we'll group the rows by city and take the mean of each category column, which gives the frequency of each category per city.

In [39]:
northwest_grouped = northwest_onehot.groupby('City').mean().reset_index()
northwest_grouped

Unnamed: 0,City,ATM,Accessories Store,Adult Boutique,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Arcade,Art Gallery,Art Museum,Art Studio,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Beach,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bistro,Board Shop,Boat Rental,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Station,Bus Stop,Business Service,Butcher,Café,Cajun / Creole Restaurant,Camera Store,Campground,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Carpet Store,Cheese Shop,Chinese Restaurant,Chiropractor,Chocolate Shop,Church,Clothing Store,Cocktail Bar,Coffee Shop,College Bookstore,Comedy Club,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Convention Center,Cosmetics Shop,Costume Shop,Credit Union,Cultural Center,Cupcake Shop,Dance Studio,Deli / Bodega,Dentist's Office,Department Store,Dessert Shop,Diner,Disc Golf,Discount Store,Distillery,Dive Bar,Doctor's Office,Dog Run,Donut Shop,Drugstore,Dry Cleaner,Eastern European Restaurant,Electronics Store,English Restaurant,Event Space,Fabric Shop,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Financial or Legal Service,Fish & Chips Shop,Fish Market,Fishing Spot,Fishing Store,Flea Market,Flower Shop,Fondue Restaurant,Food,Food & Drink Shop,Food Service,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Gaming Cafe,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gun Shop,Gym,Gym / Fitness Center,Hakka Restaurant,Halal Restaurant,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,High School,Historic Site,History Museum,Hobby Shop,Home Service,Hookah Bar,Hot Dog Joint,Hotel,Hotel Bar,Hunting Supply,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Inn,Insurance Office,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kids Store,Knitting Store,Korean Restaurant,Lake,Latin American Restaurant,Lawyer,Library,Light Rail Station,Lingerie Store,Liquor Store,Locksmith,Lounge,Marijuana Dispensary,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Miscellaneous Shop,Mobile Phone Shop,Mongolian Restaurant,Monument / Landmark,Motel,Motorsports Shop,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,Nail Salon,New American Restaurant,Nightclub,Noodle House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Repair Shop,Outdoor Sculpture,Outlet Store,Paella Restaurant,Paper / Office Supplies Store,Park,Pawn Shop,Performing Arts Venue,Persian Restaurant,Peruvian Restaurant,Pet Store,Pharmacy,Pie Shop,Pier,Pilates Studio,Pizza Place,Playground,Plaza,Poke Place,Pool,Pool Hall,Print Shop,Pub,RV Park,Record Shop,Recreation Center,Rental Car Location,Residential Building (Apartment / Condo),Resort,Restaurant,River,Road,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shipping Store,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Ski Lodge,Smoke Shop,Snack Place,Soccer Field,Social Club,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,State / Provincial Park,Steakhouse,Storage Facility,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tailor Shop,Tapas Restaurant,Tattoo Parlor,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Thrift / Vintage Store,Tour Provider,Tourist Information Center,Toy / Game Store,Track,Trail,Train,Train Station,Tree,Turkish Restaurant,University,Used Bookstore,Vacation Rental,Vape Store,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Watch Shop,Water Park,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store,Yoga Studio
0,Aberdeen,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.08,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.04,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Airway Heights,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Astoria,0.022727,0.0,0.0,0.0,0.0,0.0,0.068182,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.068182,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.113636,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.022727,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.022727,0.0,0.0,0.0,0.0,0.022727,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Auburn,0.0,0.0,0.041667,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Baker City,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Bandon,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.16,0.04,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.16,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Battle Ground,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Beaverton,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.019231,0.038462,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.057692,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.057692,0.0,0.0,0.019231,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.038462,0.038462,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.019231,0.0,0.0,0.0,0.0,0.019231,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0
8,Bellevue,0.0,0.017241,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.034483,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.017241,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.017241,0.0,0.103448,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.017241,0.017241,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051724,0.017241,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051724,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.017241,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.017241,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.017241,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0
9,Bellingham,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.047619,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.02381,0.0,0.0,0.0,0.095238,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0


In [40]:
# New dataframe size
northwest_grouped.shape

(180, 309)

Now we can find the five most common venues for each city.

In [41]:
num_top_venues = 5

for city in northwest_grouped['City']:
    print("----"+city+"----")
    temp = northwest_grouped[northwest_grouped['City'] == city].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Aberdeen----
                  venue  freq
0         Grocery Store  0.08
1           Coffee Shop  0.08
2                  Bank  0.08
3  Fast Food Restaurant  0.08
4     Convenience Store  0.08


----Airway Heights----
                 venue  freq
0       Sandwich Place  0.09
1                  Bar  0.09
2        Grocery Store  0.09
3  Japanese Restaurant  0.09
4                 Café  0.09


----Astoria ----
                 venue  freq
0              Brewery  0.11
1   Seafood Restaurant  0.09
2                  Bar  0.07
3  American Restaurant  0.07
4          Pizza Place  0.05


----Auburn----
                 venue  freq
0          Coffee Shop  0.12
1        Grocery Store  0.08
2  American Restaurant  0.08
3                 Café  0.08
4           Restaurant  0.04


----Baker City ----
                 venue  freq
0  American Restaurant  0.18
1   Chinese Restaurant  0.18
2            Multiplex  0.09
3                 Café  0.09
4     Motorsports Shop  0.09


----Bandon----
       

                 venue  freq
0                 Café  0.11
1               Lounge  0.05
2       Ice Cream Shop  0.05
3          Coffee Shop  0.05
4  American Restaurant  0.05


----Ellensburg----
                 venue  freq
0                  Bar  0.07
1            Bookstore  0.07
2        Grocery Store  0.07
3              Brewery  0.07
4  American Restaurant  0.07


----Enumclaw----
               venue  freq
0        Pizza Place  0.15
1              Diner  0.08
2               Café  0.08
3     Breakfast Spot  0.08
4  German Restaurant  0.08


----Ephrata----
                venue  freq
0            Pharmacy  0.13
1                 ATM  0.07
2    Asian Restaurant  0.07
3              Lawyer  0.07
4  Mexican Restaurant  0.07


----Estacada----
            venue  freq
0     Pizza Place  0.20
1           Diner  0.07
2            Café  0.07
3         Brewery  0.07
4  Breakfast Spot  0.07


----Eugene ----
              venue  freq
0       Coffee Shop  0.08
1               Bar  0.06
2  Su

4        History Museum   0.1


----Lynnwood----
                 venue  freq
0   Mexican Restaurant  0.09
1   Chinese Restaurant  0.09
2  Japanese Restaurant  0.06
3         Burger Joint  0.06
4     Sushi Restaurant  0.06


----Madras ----
                venue  freq
0  Mexican Restaurant  0.14
1    Sushi Restaurant  0.07
2         Coffee Shop  0.07
3               Diner  0.07
4   Other Repair Shop  0.07


----Manzanita----
                  venue  freq
0                   ATM  0.06
1        General Travel  0.06
2         Grocery Store  0.06
3  Marijuana Dispensary  0.06
4                Bakery  0.06


----Maple Valley----
                  venue  freq
0  Fast Food Restaurant  0.16
1          Burger Joint  0.05
2      Asian Restaurant  0.05
3           Gas Station  0.05
4    Mexican Restaurant  0.05


----Marysville----
                           venue  freq
0           Fast Food Restaurant  0.14
1                    Coffee Shop  0.10
2                    Pizza Place  0.10
3          

                 venue  freq
0  American Restaurant  0.08
1                  Bar  0.06
2          Coffee Shop  0.06
3       Sandwich Place  0.03
4  Sporting Goods Shop  0.03


----Sandpoint ----
                 venue  freq
0          Coffee Shop  0.12
1                  Bar  0.12
2  Sporting Goods Shop  0.06
3                  Pub  0.06
4   Mexican Restaurant  0.06


----Sandy----
                 venue  freq
0  Sporting Goods Shop  0.10
1          Pizza Place  0.10
2       Sandwich Place  0.05
3       Shipping Store  0.05
4            BBQ Joint  0.05


----Scappoose----
                        venue  freq
0                         ATM  0.07
1                      Lawyer  0.07
2                 Pizza Place  0.07
3  Construction & Landscaping  0.07
4                   Pet Store  0.07


----SeaTac----
                 venue  freq
0          Coffee Shop  0.25
1      Airport Service  0.15
2                Hotel  0.10
3  Rental Car Location  0.10
4                  ATM  0.10


----Seaside-

The raw data looks good. Now let's sort and structure it for further processing.

In [42]:
# Function to sort venues in descending order of frequency
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
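
To make the function's behavior concrete, here is a small usage sketch on a made-up row. The city name `Toyville` and its frequencies are hypothetical, and the function is repeated only so that the snippet runs on its own.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]  # skip the leading 'City' entry
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Hypothetical row shaped like one row of northwest_grouped:
# the city name first, then one frequency per venue category
toy_row = pd.Series(
    {"City": "Toyville", "Bar": 0.1, "Coffee Shop": 0.5, "Park": 0.3, "Bank": 0.05}
)

top3 = return_most_common_venues(toy_row, 3)
print(list(top3))
```

Note that when several categories share the same frequency, as in the Aberdeen output above, the order among the tied categories is not guaranteed, so tied venues may be listed in a different order from one run to the next.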

In [90]:
# Create a dataframe with top 5 venues for each city
num_top_venues = 5

indicators = ['st', 'nd', 'rd']

# Create columns according to number of top venues
columns = ['City']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # ranks beyond 3rd take the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# Create a new dataframe
cities_venues_sorted = pd.DataFrame(columns=columns)
cities_venues_sorted['City'] = northwest_grouped['City']

for ind in np.arange(northwest_grouped.shape[0]):
    cities_venues_sorted.iloc[ind, 1:] = return_most_common_venues(northwest_grouped.iloc[ind, :], num_top_venues)

cities_venues_sorted.head()

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Aberdeen,Fast Food Restaurant,Bank,Grocery Store,Coffee Shop,Convenience Store
1,Airway Heights,Grocery Store,Café,Chinese Restaurant,Park,Korean Restaurant
2,Astoria,Brewery,Seafood Restaurant,American Restaurant,Bar,Pizza Place
3,Auburn,Coffee Shop,Grocery Store,American Restaurant,Café,Bank
4,Baker City,American Restaurant,Chinese Restaurant,Motorsports Shop,Bagel Shop,Automotive Shop


Now we're ready to apply the K-means clustering algorithm. After trying out different `k` values (where `k`= *number of clusters*), I found the clusters to be most meaningful and interesting with around `k=6`. The output of the K-means algorithm is an array of cluster assignments for each row in our dataframe.

In [91]:
# Run K-means to break up into clusters
kclusters = 6

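# Keep only the numeric venue-frequency columns (the positional axis argument was removed in pandas 2.0)
northwest_grouped_clustering = northwest_grouped.drop(columns='City')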

# Run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(northwest_grouped_clustering)

# Check cluster labels generated for each row in the dataframe
kmeans.labels_[0:400]

array([4, 2, 1, 4, 1, 1, 4, 4, 4, 4, 4, 2, 2, 4, 4, 4, 2, 4, 2, 2, 4, 4,
       1, 1, 2, 2, 2, 2, 2, 4, 2, 1, 4, 4, 2, 3, 4, 2, 1, 1, 2, 4, 4, 4,
       4, 2, 2, 2, 2, 4, 4, 2, 2, 2, 4, 4, 3, 1, 4, 2, 2, 4, 2, 1, 2, 4,
       2, 4, 1, 2, 2, 4, 1, 3, 3, 4, 4, 4, 4, 4, 2, 4, 4, 2, 4, 2, 4, 1,
       4, 1, 3, 0, 2, 4, 4, 3, 3, 4, 4, 4, 4, 4, 4, 2, 4, 4, 2, 1, 2, 4,
       4, 3, 4, 4, 4, 4, 3, 2, 3, 1, 4, 4, 1, 4, 3, 5, 4, 1, 2, 4, 4, 4,
       2, 3, 1, 4, 2, 2, 4, 4, 2, 4, 4, 3, 4, 5, 4, 4, 4, 1, 4, 2, 4, 4,
       4, 2, 2, 1, 4, 1, 4, 4, 4, 4, 3, 4, 4, 4, 4, 1, 4, 4, 2, 1, 2, 2,
       2, 2, 4, 4])
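
The value of `k` above was chosen by inspecting the resulting clusters. One complementary, more quantitative check is to compare the mean silhouette score across candidate values of `k`. The sketch below is not part of the original analysis: `silhouette_by_k` is a hypothetical helper, and the feature matrix is synthetic stand-in data rather than the real venue frequencies.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def silhouette_by_k(features, k_values, random_state=0):
    """Return {k: mean silhouette score} for each candidate number of clusters."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(features)
        scores[k] = silhouette_score(features, labels)
    return scores

# Synthetic stand-in for the city-by-venue matrix: three well-separated blobs
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
demo_features = np.vstack([c + 0.3 * rng.standard_normal((20, 2)) for c in centers])

demo_scores = silhouette_by_k(demo_features, range(2, 7))
best_k = max(demo_scores, key=demo_scores.get)
print(demo_scores)
print("best k:", best_k)
```

In the notebook, `northwest_grouped_clustering` would take the place of `demo_features`. A higher score suggests tighter, better-separated clusters, but on sparse frequency data like ours the scores may be low across the board, so the result is best read alongside the qualitative inspection rather than instead of it.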

In [92]:
cities_venues_sorted.columns

Index(['City', '1st Most Common Venue', '2nd Most Common Venue',
       '3rd Most Common Venue', '4th Most Common Venue',
       '5th Most Common Venue'],
      dtype='object')

Now we can stitch the cluster labels back into our dataframe and combine them with the city location data. With all this info combined, we'll be ready to visualize the results.

In [93]:
# Create dataframe that includes the cluster and top 5 venues

# Add clustering labels
cities_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

northwest_merged = df_northwest

# Join the sorted venue table onto df_northwest, which holds latitude/longitude for each city
northwest_merged = northwest_merged.join(cities_venues_sorted.set_index('City'), on='City')

# Drop cities with no venue data
northwest_merged = northwest_merged.dropna()

northwest_merged

Unnamed: 0,City,State,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Aberdeen,WA,46.9755,-123.816,4.0,Fast Food Restaurant,Bank,Grocery Store,Coffee Shop,Convenience Store
1,Airway Heights,WA,47.643,-117.593,2.0,Grocery Store,Café,Chinese Restaurant,Park,Korean Restaurant
6,Auburn,WA,47.3075,-122.226,4.0,Coffee Shop,Grocery Store,American Restaurant,Café,Bank
8,Battle Ground,WA,45.7807,-122.548,4.0,Coffee Shop,Pizza Place,Hotel,Asian Restaurant,Discount Store
9,Bellevue,WA,47.6137,-122.191,4.0,Coffee Shop,Hotel,Mexican Restaurant,Women's Store,Bakery
10,Bellingham,WA,48.7549,-122.478,4.0,Coffee Shop,Café,Bookstore,Chinese Restaurant,Breakfast Spot
12,Bingen,WA,45.7148,-121.466,2.0,Mexican Restaurant,Train Station,Food Truck,Park,Pizza Place
14,Blaine,WA,48.994,-122.752,4.0,Coffee Shop,Diner,ATM,Steakhouse,Taco Place
16,Bothell,WA,47.7601,-122.205,4.0,Coffee Shop,Mexican Restaurant,Pub,Park,Sushi Restaurant
17,Bremerton,WA,47.5674,-122.633,4.0,Coffee Shop,Asian Restaurant,Cocktail Bar,Arcade,Mexican Restaurant
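
The join-then-drop step explains two details of the table above: the gaps in the row index and the cluster labels showing up as floats such as `4.0`. The following is a minimal sketch on made-up miniature frames (the city `Tinytown` is hypothetical), not the notebook's actual data.

```python
import pandas as pd

# Hypothetical miniature versions of df_northwest and cities_venues_sorted
locations = pd.DataFrame({
    "City": ["Aberdeen", "Auburn", "Tinytown"],
    "State": ["WA", "WA", "WA"],
})
venues = pd.DataFrame({
    "City": ["Aberdeen", "Auburn"],
    "Cluster Labels": [4, 4],
    "1st Most Common Venue": ["Fast Food Restaurant", "Coffee Shop"],
})

# DataFrame.join performs a left join by default: every location row is kept,
# and cities missing from `venues` get NaN in the joined columns.
joined = locations.join(venues.set_index("City"), on="City")

# The NaN forces the integer labels to float, hence values like 4.0
print(joined["Cluster Labels"].dtype)

# Dropping the NaN rows removes cities without venue data;
# the labels can then be cast back to integers.
merged = joined.dropna().astype({"Cluster Labels": int})
print(merged)
```

Casting the labels back to integers after `dropna()` is optional; the mapping code later in the notebook handles the float labels by calling `int(cluster)` instead.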


## Results and discussion <a name="results"></a>

Now we're ready to map out the data to get a feel for the results. We'll use the Python [Folium](https://python-visualization.github.io/folium/) library to render our clusters, using a distinct color for each.

In [94]:
# Create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=6)

# Set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# Add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(northwest_merged['Latitude'], northwest_merged['Longitude'], northwest_merged['City'], northwest_merged['Cluster Labels']):
    cluster = int(cluster)
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

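Intuitively, the results seem promising in that they reveal some patterns in our dataset. The clusters are
generally dispersed geographically, but they are not balanced in member count: the label array above puts roughly half of
the cities in cluster 4, while clusters 0 and 5 contain only one and two cities, respectively. Let's examine the individual
clusters to try and discern how and why they broke out the way they did.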
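
A quick way to see how the cities are spread across clusters is to count the members of each one. The sketch below uses only the first 22 labels from the `kmeans.labels_` output as a small stand-in; in the notebook the same call could be applied to `northwest_merged['Cluster Labels']`.

```python
import pandas as pd

# First 22 labels from the kmeans.labels_ output, used as a small stand-in
sample_labels = [4, 2, 1, 4, 1, 1, 4, 4, 4, 4, 4, 2, 2, 4, 4, 4, 2, 4, 2, 2, 4, 4]

# Count cities per cluster, ordered by cluster number
cluster_sizes = (
    pd.Series(sample_labels, name="Cluster Labels")
    .value_counts()
    .sort_index()
)
print(cluster_sizes)
```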

#### Cluster 0: Outlier city
This cluster consists of a single outlier city, Lynden WA. Although none of its top common venues seems uncommon in
itself, perhaps it was the combination of all 5 that proved particularly unique among our dataset.

In [132]:
northwest_merged.loc[northwest_merged['Cluster Labels'] == 0, northwest_merged.columns[[0] + [1] + list(range(5, northwest_merged.shape[1]))]]

Unnamed: 0,City,State,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
95,Lynden,WA,Bakery,Sandwich Place,Playground,Coffee Shop,Café


#### Cluster 1: Vacation destinations
Cluster 1 is characterized by hotels/resorts, restaurants and nightlife (bars, breweries, etc). Many of the cities on this
list are vacation destinations and/or popular weekend getaway spots.

In [139]:
# Cluster 1
northwest_merged.loc[northwest_merged['Cluster Labels'] == 1, northwest_merged.columns[[0] + [1] + list(range(5, northwest_merged.shape[1]))]]

Unnamed: 0,City,State,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
38,Colville,WA,American Restaurant,Mexican Restaurant,Other Repair Shop,Automotive Shop,Coffee Shop
43,Dayton,WA,American Restaurant,Hotel,Restaurant,Bakery,Asian Restaurant
74,Ilwaco,WA,Harbor / Marina,Seafood Restaurant,Café,Grocery Store,High School
91,Leavenworth,WA,Park,Hotel,Resort,Candy Store,American Restaurant
114,Mount Vernon,WA,American Restaurant,Bar,Thai Restaurant,Italian Restaurant,Indian Restaurant
141,Port Townsend,WA,Hotel,American Restaurant,Thai Restaurant,Café,Seafood Restaurant
146,Puyallup,WA,Hotel,Deli / Bodega,American Restaurant,Event Space,Coffee Shop
178,Stevenson,WA,American Restaurant,Restaurant,Coffee Shop,Seafood Restaurant,Food & Drink Shop
204,White Salmon,WA,Grocery Store,Café,Coffee Shop,Paper / Office Supplies Store,Bakery
244,Pendleton,OR,Hotel,Steakhouse,Café,Restaurant,Convenience Store


#### Cluster 2: Restaurant cities
*Pizza Place* is common among nearly all of the cities in this cluster. More broadly, *Pizza Place*, *Mexican Restaurant*, *Chinese Restaurant*, *American Restaurant*, and *Bar* all appear to group together here. The cities on this list tend to be larger than bedroom communities, but somewhat smaller than major urban centers.

In [138]:
# Cluster 2
northwest_merged.loc[northwest_merged['Cluster Labels'] == 2, northwest_merged.columns[[0] + [1] + list(range(5, northwest_merged.shape[1]))]]

Unnamed: 0,City,State,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
1,Airway Heights,WA,Grocery Store,Café,Chinese Restaurant,Park,Korean Restaurant
12,Bingen,WA,Mexican Restaurant,Train Station,Food Truck,Park,Pizza Place
23,Burlington,WA,Pizza Place,Pub,Farmers Market,Café,Convenience Store
25,Carnation,WA,Pizza Place,Café,Supermarket,Vietnamese Restaurant,Bar
33,Clarkston,WA,Restaurant,Pharmacy,Thai Restaurant,Mobile Phone Shop,Pizza Place
52,Ellensburg,WA,Brewery,American Restaurant,Bookstore,Bar,Grocery Store
55,Enumclaw,WA,Pizza Place,Pub,Bakery,Supermarket,Breakfast Spot
56,Ephrata,WA,Pharmacy,ATM,American Restaurant,Asian Restaurant,Park
60,Ferndale,WA,Bar,Indian Restaurant,Convenience Store,Fast Food Restaurant,Park
71,Granite Falls,WA,Asian Restaurant,Trail,Pizza Place,Mexican Restaurant,Steakhouse


#### Cluster 3: Fast food cities

The unifying characteristic among these cities is that *Fast Food Restaurant* is the prominent venue type. These are smaller cities and bedroom communities that tend to be located between larger cities with more amenities. At first glance, a "Fast food city" might not seem particularly attractive, but this cluster
deserves further exploration for prospective home buyers looking for more seclusion and lower real estate prices.

In [135]:
# Cluster 3 
northwest_merged.loc[northwest_merged['Cluster Labels'] == 3, northwest_merged.columns[[0] + [1] + list(range(5, northwest_merged.shape[1]))]]

Unnamed: 0,City,State,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
41,Covington,WA,Fast Food Restaurant,Sandwich Place,Automotive Shop,Big Box Store,Pharmacy
94,Longview,WA,Fast Food Restaurant,ATM,Pizza Place,Coffee Shop,Asian Restaurant
98,Maple Valley,WA,Fast Food Restaurant,Burger Joint,Mexican Restaurant,Seafood Restaurant,Gas Station
99,Marysville,WA,Fast Food Restaurant,Pizza Place,Sandwich Place,Coffee Shop,Supermarket
136,Pasco,WA,Fast Food Restaurant,Pizza Place,Pharmacy,Mexican Restaurant,Grocery Store
153,Richland,WA,Fast Food Restaurant,Sandwich Place,Garden Center,Hotel,Taco Place
224,Keizer,OR,Hotel,Pharmacy,Mexican Restaurant,Fast Food Restaurant,Optical Shop
226,Oregon City,OR,Fast Food Restaurant,Coffee Shop,Sandwich Place,ATM,Pharmacy
263,Prineville,OR,Bank,Sandwich Place,Burger Joint,Salad Place,Hotel
278,Seaside,OR,Hotel,Fast Food Restaurant,Candy Store,Breakfast Spot,Pizza Place


#### Cluster 4: Coffee shop cities
In addition to coffee shops being the prevalent venue type, the cities in this cluster are characterized
by a diverse set of amenities, indicative of larger urban centers.

In [136]:
# Cluster 4
northwest_merged.loc[northwest_merged['Cluster Labels'] == 4, northwest_merged.columns[[0] + [1] + list(range(5, northwest_merged.shape[1]))]]

Unnamed: 0,City,State,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Aberdeen,WA,Fast Food Restaurant,Bank,Grocery Store,Coffee Shop,Convenience Store
6,Auburn,WA,Coffee Shop,Grocery Store,American Restaurant,Café,Bank
8,Battle Ground,WA,Coffee Shop,Pizza Place,Hotel,Asian Restaurant,Discount Store
9,Bellevue,WA,Coffee Shop,Hotel,Mexican Restaurant,Women's Store,Bakery
10,Bellingham,WA,Coffee Shop,Café,Bookstore,Chinese Restaurant,Breakfast Spot
14,Blaine,WA,Coffee Shop,Diner,ATM,Steakhouse,Taco Place
16,Bothell,WA,Coffee Shop,Mexican Restaurant,Pub,Park,Sushi Restaurant
17,Bremerton,WA,Coffee Shop,Asian Restaurant,Cocktail Bar,Arcade,Mexican Restaurant
22,Burien,WA,Mexican Restaurant,Bakery,Ice Cream Shop,Italian Restaurant,New American Restaurant
24,Camas,WA,Coffee Shop,Mexican Restaurant,Brewery,ATM,Grocery Store


#### Cluster 5: More outliers

The yield for the final cluster was a couple more outlier cities, also in Washington state. These cities both have a
category in their top 5 venue types that is unique among our dataset: the *Gastropub* in Prosser, and the *Bowling Alley*
in Sedro-Woolley.

In [137]:
# Cluster 5
northwest_merged.loc[northwest_merged['Cluster Labels'] == 5, northwest_merged.columns[[0] + [1] + list(range(5, northwest_merged.shape[1]))]]

Unnamed: 0,City,State,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
144,Prosser,WA,Brewery,American Restaurant,Coffee Shop,Pizza Place,Gastropub
164,Sedro-Woolley,WA,American Restaurant,Pizza Place,Bowling Alley,Lawyer,Convenience Store
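
The observation that a venue category is unique to one city's top 5 can be checked programmatically. The sketch below is one possible approach, run on just four top-5 lists copied from the tables in this write-up, so "unique" here means unique within that small sample only.

```python
import pandas as pd

# Top-5 lists copied from four rows of the tables in this write-up; in the notebook
# the same long format could be built by melting cities_venues_sorted instead.
top5 = {
    "Prosser": ["Brewery", "American Restaurant", "Coffee Shop", "Pizza Place", "Gastropub"],
    "Sedro-Woolley": ["American Restaurant", "Pizza Place", "Bowling Alley", "Lawyer", "Convenience Store"],
    "Astoria": ["Brewery", "Seafood Restaurant", "American Restaurant", "Bar", "Pizza Place"],
    "Aberdeen": ["Fast Food Restaurant", "Bank", "Grocery Store", "Coffee Shop", "Convenience Store"],
}

# Long format: one row per (city, venue category) pair
long_form = pd.DataFrame(
    [(city, venue) for city, venues in top5.items() for venue in venues],
    columns=["City", "Venue"],
)

# Number of distinct cities that list each category in their top 5
cities_per_venue = long_form.groupby("Venue")["City"].nunique()
singletons = cities_per_venue[cities_per_venue == 1].index

# Categories that appear in exactly one city's top 5, grouped by that city
unique_by_city = (
    long_form[long_form["Venue"].isin(singletons)]
    .groupby("City")["Venue"]
    .apply(sorted)
    .to_dict()
)
print(unique_by_city)
```

In this four-city sample, *Lawyer* also comes out as unique to Sedro-Woolley, which is an artifact of the small sample (it appears in other cities' top 5 elsewhere in the output above). The check is only meaningful when run over all 180 cities.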


## Conclusion <a name="conclusion"></a>

Starting from a list of 653 total cities across the states of Washington, Oregon, and Idaho, we found 593 cities with
Foursquare venue data. A Foursquare query of venues in those cities yielded 5935 venues. However, it was necessary to
filter out cities with fewer than 10 venues, as their data profiles proved insufficient for meaningful clustering.
After filtering out those cities, only 180 cities remained, less than 30% of our original group of cities. The 180 cities used in the final analysis represented 4272 venues and 309 unique venue types. We used the k-means clustering algorithm to
group them into six clusters, but only four of those clusters were truly meaningful in terms of revealing
insights that we could use to answer the original question of our business problem: *how can Northwest residents identify similar cities as prospective places to move?* The results of our analysis provide one answer
to the question, with the caveat that many other Northwest cities (not to mention towns) could not be
included in the study for lack of adequate data.

Throughout the process of this study we uncovered limitations in comprehensively addressing the business problem at hand.
Nevertheless, we did find some interesting patterns among our refined dataset of larger Northwest cities with an adequate
amount of Foursquare venue data. Next steps in the process might be to supplement the data used to cluster cities with 
additional sources, such as the average home price and population size. With additional data like this, we might be able to
retain and cluster our full list of Northwest cities, while still providing finer-grained grouping patterns for cities with
ample Foursquare venue data.
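
As a rough illustration of that next step, the sketch below combines venue frequencies with two extra columns and rescales everything before clustering. All of it is an assumption rather than part of this study: the column names `median_home_price` and `population` are hypothetical, and the numbers are placeholders, not real figures for any city.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Placeholder values only: NOT real venue frequencies, prices, or populations
demo = pd.DataFrame({
    "City": ["A", "B", "C", "D", "E", "F"],
    "Coffee Shop": [0.25, 0.20, 0.05, 0.00, 0.22, 0.03],
    "Fast Food Restaurant": [0.02, 0.05, 0.20, 0.25, 0.03, 0.18],
    "median_home_price": [700_000, 650_000, 300_000, 280_000, 720_000, 310_000],
    "population": [120_000, 90_000, 20_000, 15_000, 110_000, 18_000],
})

features = demo.drop(columns="City")

# Venue frequencies lie between 0 and 1 while prices run into the hundreds of
# thousands; without rescaling, price would dominate the Euclidean distances
# that k-means relies on.
scaled = StandardScaler().fit_transform(features)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scaled)
demo["Cluster Labels"] = labels
print(demo[["City", "Cluster Labels"]])
```

Standardizing every column gives each feature equal weight, which is itself a modeling choice: with 300-odd venue columns and only two supplementary ones, the extra features might need to be weighted up to have a visible effect.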

## Footnotes

[1] https://www.seattletimes.com/seattle-news/data/114000-more-people-seattle-now-this-decades-fastest-growing-big-city-in-all-of-united-states/

[2] https://www.seattletimes.com/pacific-nw-magazine/surviving-seattles-sidewalks-pedestrian-rage-rises-as-the-population-grows/

[3] For example, https://www.seattletimes.com/opinion/after-14-years-ive-had-it-im-leaving-seattle/

[4] https://www.seattletimes.com/business/home-prices-rising-faster-in-washington-than-in-any-other-state/

[5] https://en.wikipedia.org/wiki/List_of_cities_and_towns_in_Washington

[6] https://en.wikipedia.org/wiki/List_of_cities_in_Oregon

[7] https://en.wikipedia.org/wiki/List_of_cities_in_Idaho

[8] https://developer.foursquare.com/docs/api/venues/explore

[9] https://developer.mapquest.com/documentation/geocoding-api/