# Capstone Project - The Battle of the Neighborhoods Portland Edition

## Introduction

This notebook analyzes the types of venues in the neighborhoods around Portland, OR, and how frequently each venue type occurs.

## Business Problem

My wife wants to open a Mexican restaurant in Portland, OR. We are originally from Chicago and have not found a good, authentic Mexican restaurant here, so she would like to fill that void. She asked me to analyze the neighborhoods and their venues through clustering to determine the best neighborhood for her restaurant.

## Data

The following data sources were used in the analysis:
 - Latitude and Longitude data from public.opendatasoft.com
 - Postal Code data from portlandneighborhood.com
 - Neighborhood venues data from Foursquare

## Methodology

Importing the Portland postal codes, neighborhoods, and geographic coordinates

In [1]:
# The code was removed by Watson Studio for sharing.

Unnamed: 0,Postal Code,Neighborhood,Latitude,Longitude
0,97201,"Arlington Heights, Corbett Terwilliger Lair Hi...",45.49894,-122.68781
1,97202,"Brooklyn, Creston Kenilworth, Eastmoreland, Ho...",45.481791,-122.64055
2,97203,"Cathedral Park, Portsmouth, St. Johns, Univers...",45.589689,-122.73875
3,97204,"Downtown Portland, Goose Hollow, Old Town Chin...",45.51854,-122.6755
4,97205,"Downtown Portland, Goose Hollow",45.52054,-122.68573


In [2]:
portland_df = df_data_1
portland_df

Unnamed: 0,Postal Code,Neighborhood,Latitude,Longitude
0,97201,"Arlington Heights, Corbett Terwilliger Lair Hi...",45.49894,-122.68781
1,97202,"Brooklyn, Creston Kenilworth, Eastmoreland, Ho...",45.481791,-122.64055
2,97203,"Cathedral Park, Portsmouth, St. Johns, Univers...",45.589689,-122.73875
3,97204,"Downtown Portland, Goose Hollow, Old Town Chin...",45.51854,-122.6755
4,97205,"Downtown Portland, Goose Hollow",45.52054,-122.68573
5,97206,"Brentwood Darlington, Foster Powell, Mt Scott ...",45.482341,-122.60007
6,97209,"Old Town Chinatown, Pearl District",45.52889,-122.68458
7,97210,"Hillside, Linnton, Northwest, Portland Northwe...",45.534839,-122.7095
8,97211,"Concordia, East Columbia, King, Sabin, Vernon,...",45.56544,-122.64635
9,97212,"Alameda, Beaumont Wilshire, Eliot, Grant Park,...",45.54424,-122.64353


Importing remaining libraries

In [3]:
# install folium before importing it
!conda install -c conda-forge folium=0.5.0
import json
import numpy as np
import pandas as pd  # used as pd below; the hidden first cell may already import it
import requests
import matplotlib.cm as cm
import matplotlib.colors as colors
import folium
from bs4 import BeautifulSoup
from geopy.geocoders import Nominatim
from pandas import json_normalize  # pandas.io.json.json_normalize is deprecated
from sklearn.cluster import KMeans

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.



Use geocoders to determine Portland's latitude and longitude

In [4]:
address = 'Portland, OR'
geolocator = Nominatim(user_agent='tor_explorer')
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographic coordinates of Portland are {},{}.'.format(latitude,longitude))

The geographic coordinates of Portland are 45.5202471,-122.6741949.


Creating a map with folium to visualize the different neighborhoods of Portland

In [5]:
map_portland = folium.Map(location=[latitude, longitude], zoom_start=10)

neighborhoods = portland_df

for lat, lng, neighborhood in zip(neighborhoods['Latitude'],neighborhoods['Longitude'],neighborhoods['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat,lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_portland)
    
map_portland

Using the Foursquare API to pull Portland venues

In [43]:
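CLIENT_ID = ''  # your Foursquare client ID (removed for sharing)
CLIENT_SECRET = ''  # your Foursquare client secret (removed for sharing)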
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

In [7]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
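
The request inside `getNearbyVenues` assumes that every call succeeds and that every venue has at least one category. A minimal sketch of a more defensive parsing step, assuming the response shape indexed above (`response`, then `groups`, then `items`); the helper name `extract_venues` and the `'Uncategorized'` fallback are hypothetical and not part of the original notebook:

```python
def extract_venues(payload, name, lat, lng):
    """Flatten one explore-style payload into rows; tolerate missing keys."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        categories = venue.get('categories') or []
        # Hypothetical fallback label for venues without a category
        category = categories[0]['name'] if categories else 'Uncategorized'
        rows.append((name, lat, lng, venue.get('name'),
                     location.get('lat'), location.get('lng'), category))
    return rows


# Hand-made payload mimicking the structure the notebook indexes into
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Starbucks',
               'location': {'lat': 45.49782, 'lng': -122.68569},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Unnamed Spot',
               'location': {'lat': 45.5, 'lng': -122.68},
               'categories': []}},
]}]}}

rows = extract_venues(sample, '97201', 45.49894, -122.68781)
empty = extract_venues({}, '97201', 45.49894, -122.68781)
```

Each returned tuple has the same seven fields, in the same order, as the columns assigned to `nearby_venues`, so the rows could be passed to `pd.DataFrame` unchanged.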

In [8]:
portland_venues = getNearbyVenues(names=portland_df['Neighborhood'],
                                    latitudes=portland_df['Latitude'],
                                    longitudes=portland_df['Longitude']
                                )

Arlington Heights, Corbett Terwilliger Lair Hill, Hillsdale, Homestead, Southwest Hills
Brooklyn, Creston Kenilworth, Eastmoreland, Hosford Abernethy, Reed, Sellwood Moreland
Cathedral Park, Portsmouth, St. Johns, University Park
Downtown Portland, Goose Hollow, Old Town Chinatown
Downtown Portland, Goose Hollow
Brentwood Darlington, Foster Powell, Mt Scott Arleta, Richmond, South Tabor, Woodstock
Old Town Chinatown, Pearl District
Hillside, Linnton, Northwest, Portland Northwest Industrial
Concordia, East Columbia, King, Sabin, Vernon, Woodlawn
Alameda, Beaumont Wilshire, Eliot, Grant Park, Hollywood, Irvington, King, Sabin
Center, Grant Park, Hollywood, Montavilla, Rose City Park, Roseway
Buckman, Hosford Abernethy, Kerns, Laurelhurst, Richmond, Sunnyside
Mt Tabor
Hazelwood, Mill Park
Arbor Lodge,  Bridgeton, Hayden Island, Humboldt, Kenton, Overlook, Piedmont
Cully
Arnold Creek, Ash Creek, Collins View, Corbett Terwilliger Lair Hill, Crestwood, Far Southwest, Hillsdale, Maplewood, M
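
Each query above covers only `radius=500` metres around a postal-code centroid, so coverage depends on how far apart the centroids are. A rough check using the standard haversine formula and the coordinates listed for 97204 and 97205 in the imported table; the function name and the 6,371 km mean Earth radius are assumptions of this sketch:

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# Centroids for postal codes 97204 and 97205 from the imported table
d = haversine_m(45.51854, -122.6755, 45.52054, -122.68573)
# Two 500 m search circles overlap when their centres are under 1000 m apart
circles_overlap = d < 2 * 500
print(round(d), circles_overlap)
```

These two downtown centroids are a little over 800 m apart, so their search circles overlap and the same venue can be returned for both postal codes. Larger outlying postal codes, by contrast, may have most of their venues outside the 500 m circle.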

Exploring the Portland Neighborhood venues

In [9]:
portland_venues.shape

(508, 7)

In [10]:
portland_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Arlington Heights, Corbett Terwilliger Lair Hi...",45.49894,-122.68781,Portland Aerial Tram - Upper Terminal,45.499376,-122.685137,Tram Station
1,"Arlington Heights, Corbett Terwilliger Lair Hi...",45.49894,-122.68781,Starbucks,45.49782,-122.68569,Coffee Shop
2,"Arlington Heights, Corbett Terwilliger Lair Hi...",45.49894,-122.68781,Casey Eye Institute,45.49867,-122.683774,Eye Doctor
3,"Arlington Heights, Corbett Terwilliger Lair Hi...",45.49894,-122.68781,Shell,45.502457,-122.688078,Gas Station
4,"Brooklyn, Creston Kenilworth, Eastmoreland, Ho...",45.481791,-122.64055,Gigantic Brewing Company,45.48506,-122.639577,Brewery


Determine number of venues per neighborhood

In [11]:
portland_venues.groupby("Neighborhood").count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Alameda, Beaumont Wilshire, Eliot, Grant Park, Hollywood, Irvington, King, Sabin",6,6,6,6,6,6
"Arbor Lodge, Bridgeton, Hayden Island, Humboldt, Kenton, Overlook, Piedmont",20,20,20,20,20,20
Ardenwald,9,9,9,9,9,9
"Arlington Heights, Corbett Terwilliger Lair Hill, Hillsdale, Homestead, Southwest Hills",4,4,4,4,4,4
"Arnold Creek, Ash Creek, Collins View, Corbett Terwilliger Lair Hill, Crestwood, Far Southwest, Hillsdale, Maplewood, Markham, Marshall Park, Multnomah, South Burlingame, West Portland Park",10,10,10,10,10,10
"Boise, Eliot, Overlook",13,13,13,13,13,13
"Brentwood Darlington, Foster Powell, Mt Scott Arleta, Richmond, South Tabor, Woodstock",8,8,8,8,8,8
"Brooklyn, Creston Kenilworth, Eastmoreland, Hosford Abernethy, Reed, Sellwood Moreland",9,9,9,9,9,9
"Buckman, Hosford Abernethy, Kerns, Laurelhurst, Richmond, Sunnyside",29,29,29,29,29,29
"Cathedral Park, Portsmouth, St. Johns, University Park",17,17,17,17,17,17


In [12]:
print('There are {} unique venue categories.'.format(len(portland_venues['Venue Category'].unique())))

There are 178 unique venue categories.


Creating the one-hot encoded dataframe needed for clustering

In [13]:
portland_onehot = pd.get_dummies(portland_venues[['Venue Category']], prefix="", prefix_sep="")
portland_onehot['Neighborhood'] = portland_venues['Neighborhood'] 
fixed_columns = [portland_onehot.columns[-1]] + list(portland_onehot.columns[:-1])
portland_onehot = portland_onehot[fixed_columns]
portland_grouped = portland_onehot.groupby('Neighborhood').mean().reset_index()
portland_grouped.head()

Unnamed: 0,Neighborhood,ATM,American Restaurant,Amphitheater,Antique Shop,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,Automotive Shop,...,Tram Station,Tree,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,"Alameda, Beaumont Wilshire, Eliot, Grant Park,...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Arbor Lodge, Bridgeton, Hayden Island, Humbol...",0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0
2,Ardenwald,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Arlington Heights, Corbett Terwilliger Lair Hi...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Arnold Creek, Ash Creek, Collins View, Corbett...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [14]:
portland_grouped.shape

(28, 179)

In [15]:
portland_grouped.columns

Index(['Neighborhood', 'ATM', 'American Restaurant', 'Amphitheater',
       'Antique Shop', 'Arts & Crafts Store', 'Asian Restaurant',
       'Athletics & Sports', 'Auto Garage', 'Automotive Shop',
       ...
       'Tram Station', 'Tree', 'Vegetarian / Vegan Restaurant', 'Video Store',
       'Vietnamese Restaurant', 'Whisky Bar', 'Wine Bar', 'Wine Shop',
       'Women's Store', 'Yoga Studio'],
      dtype='object', length=179)

Analyzing the mean frequency of Mexican restaurants by neighborhood

In [16]:
portland_grouped[["Neighborhood","Mexican Restaurant"]]

Unnamed: 0,Neighborhood,Mexican Restaurant
0,"Alameda, Beaumont Wilshire, Eliot, Grant Park,...",0.0
1,"Arbor Lodge, Bridgeton, Hayden Island, Humbol...",0.05
2,Ardenwald,0.111111
3,"Arlington Heights, Corbett Terwilliger Lair Hi...",0.0
4,"Arnold Creek, Ash Creek, Collins View, Corbett...",0.0
5,"Boise, Eliot, Overlook",0.153846
6,"Brentwood Darlington, Foster Powell, Mt Scott ...",0.0
7,"Brooklyn, Creston Kenilworth, Eastmoreland, Ho...",0.111111
8,"Buckman, Hosford Abernethy, Kerns, Laurelhurst...",0.034483
9,"Cathedral Park, Portsmouth, St. Johns, Univers...",0.0
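
Because each row of `portland_onehot` is a single venue, the grouped mean is the share of a neighborhood's venues that fall in a category. A small check that converts two of the shares above back into venue counts, using the per-neighborhood totals from the `groupby("Neighborhood").count()` output; the dictionary names are illustrative:

```python
# Venue totals from the groupby count output and the Mexican Restaurant
# shares from the table above (only rows shown in both tables)
venue_totals = {'Ardenwald': 9, 'Boise, Eliot, Overlook': 13}
mexican_share = {'Ardenwald': 0.111111, 'Boise, Eliot, Overlook': 0.153846}

# share = count / total, so count = share * total (rounded to undo truncation)
mexican_counts = {
    hood: round(mexican_share[hood] * venue_totals[hood])
    for hood in venue_totals
}
print(mexican_counts)
```

A share of 0.11 in Ardenwald therefore reflects a single restaurant among nine venues, so a neighborhood with few venues can look saturated on very little data.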


Analyzing the top 10 venues for each neighborhood

In [17]:
num_top_venues = 10

for hood in portland_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = portland_grouped[portland_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 4})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Alameda, Beaumont Wilshire, Eliot, Grant Park, Hollywood, Irvington, King, Sabin----
                       venue    freq
0               Optical Shop  0.1667
1                Coffee Shop  0.1667
2                  Pet Store  0.1667
3         Italian Restaurant  0.1667
4               Soccer Field  0.1667
5              Garden Center  0.1667
6                Music Store  0.0000
7  Middle Eastern Restaurant  0.0000
8          Mobile Phone Shop  0.0000
9       Mongolian Restaurant  0.0000


----Arbor Lodge,  Bridgeton, Hayden Island, Humboldt, Kenton, Overlook, Piedmont----
                    venue  freq
0             Video Store  0.10
1                Dive Bar  0.10
2                     ATM  0.05
3        Sushi Restaurant  0.05
4           Jewelry Store  0.05
5      Light Rail Station  0.05
6                     Gym  0.05
7             Gas Station  0.05
8  Furniture / Home Store  0.05
9      Mexican Restaurant  0.05


----Ardenwald----
                       venue    freq
0       

Create a dataframe with each neighborhood's top 10 most common venues

In [18]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [19]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = portland_grouped['Neighborhood']

for ind in np.arange(portland_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(portland_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Alameda, Beaumont Wilshire, Eliot, Grant Park,...",Optical Shop,Coffee Shop,Pet Store,Italian Restaurant,Soccer Field,Garden Center,Music Store,Middle Eastern Restaurant,Mobile Phone Shop,Mongolian Restaurant
1,"Arbor Lodge, Bridgeton, Hayden Island, Humbol...",Video Store,Dive Bar,ATM,Sushi Restaurant,Jewelry Store,Light Rail Station,Gym,Gas Station,Furniture / Home Store,Mexican Restaurant
2,Ardenwald,Business Service,Pizza Place,Health & Beauty Service,Gift Shop,Food Truck,Mexican Restaurant,Deli / Bodega,Coffee Shop,Office,Middle Eastern Restaurant
3,"Arlington Heights, Corbett Terwilliger Lair Hi...",Coffee Shop,Eye Doctor,Tram Station,Gas Station,ATM,Office,Middle Eastern Restaurant,Mobile Phone Shop,Mongolian Restaurant,Motorcycle Shop
4,"Arnold Creek, Ash Creek, Collins View, Corbett...",Convenience Store,Automotive Shop,Pizza Place,Locksmith,Lingerie Store,Shipping Store,Sandwich Place,Greek Restaurant,Hospital,Intersection
5,"Boise, Eliot, Overlook",Mexican Restaurant,Brewery,ATM,Bagel Shop,Scandinavian Restaurant,Laundromat,Hotel,Gym / Fitness Center,Furniture / Home Store,Bakery
6,"Brentwood Darlington, Foster Powell, Mt Scott ...",Food Truck,Market,Food,Home Service,Shopping Mall,Tree,Bus Stop,ATM,Movie Theater,Middle Eastern Restaurant
7,"Brooklyn, Creston Kenilworth, Eastmoreland, Ho...",Convenience Store,Home Service,Food Truck,Garden,Farmers Market,Brewery,Train Station,Mexican Restaurant,Auto Garage,Park
8,"Buckman, Hosford Abernethy, Kerns, Laurelhurst...",Pizza Place,Breakfast Spot,Thrift / Vintage Store,Sri Lankan Restaurant,Hardware Store,Bus Stop,Mexican Restaurant,Marijuana Dispensary,Lounge,Sandwich Place
9,"Cathedral Park, Portsmouth, St. Johns, Univers...",Food Truck,Convenience Store,Coffee Shop,Gym,Beer Bar,Soccer Field,Bar,Bank,BBQ Joint,Supermarket
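
The per-neighborhood printout earlier shows ranks 7 to 10 for the first neighborhood with a frequency of 0.0000, so `return_most_common_venues` pads short lists with categories that do not occur there. A sketch of a variant that keeps only categories actually present; the name `top_nonzero_venues` and the dict input are assumptions of this example (a row could be passed as `row.iloc[1:].to_dict()`):

```python
def top_nonzero_venues(freqs, num_top_venues=10):
    """Return up to num_top_venues categories with frequency > 0, highest first."""
    present = [(cat, f) for cat, f in freqs.items() if f > 0]
    # list.sort is stable, so ties keep their original column order
    present.sort(key=lambda pair: pair[1], reverse=True)
    return [cat for cat, _ in present[:num_top_venues]]


# Toy frequencies for one neighborhood (values are made up for illustration)
example = {
    'ATM': 0.0,
    'Coffee Shop': 0.25,
    'Eye Doctor': 0.25,
    'Mexican Restaurant': 0.0,
    'Tram Station': 0.5,
}
top = top_nonzero_venues(example)
print(top)
```

With this variant, "Mexican Restaurant" appears in a neighborhood's list only if at least one was found there, which matters when the lists are later used to judge competition.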


K-Means Clustering

In [30]:
# set number of clusters
kclusters = 9

portland_grouped_clustering = portland_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(portland_grouped_clustering)

# check cluster labels generated for each row in the dataframe
#kmeans.labels_[0:10]
kmeans.labels_

array([1, 1, 1, 8, 1, 1, 0, 1, 1, 0, 2, 0, 0, 1, 1, 1, 3, 1, 7, 0, 1, 1,
       4, 6, 0, 1, 5, 1], dtype=int32)
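
Before interpreting the clusters, it can help to look at how many neighborhoods landed in each one. A quick tally of the labels printed above, using only the standard library:

```python
from collections import Counter

# Labels copied from the kmeans.labels_ output above
labels = [1, 1, 1, 8, 1, 1, 0, 1, 1, 0, 2, 0, 0, 1, 1, 1, 3, 1, 7, 0, 1, 1,
          4, 6, 0, 1, 5, 1]

cluster_sizes = Counter(labels)
singletons = sorted(c for c, n in cluster_sizes.items() if n == 1)
print(cluster_sizes.most_common())
print(singletons)
```

Cluster 1 holds 15 of the 28 neighborhoods, cluster 0 holds 6, and the remaining seven clusters hold one neighborhood each, which may indicate that `kclusters = 9` is high for this dataset.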

Combine the portland df and top 10 neighborhood venues with the cluster label

In [31]:
# add clustering labels
# drop labels left over from an earlier run (no-op on the first run)
neighborhoods_venues_sorted.drop(columns=['Cluster Labels'], errors='ignore', inplace=True)
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

#portland_merged = portland_df

# join the sorted venues onto portland_df to add latitude/longitude for each neighborhood
portland_merged = portland_df.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

#clean up missing values and change cluster to int for mapping
portland_merged.dropna(axis=0, how='any', inplace=True)
portland_merged['Cluster Labels'] = portland_merged['Cluster Labels'].astype(int)

portland_merged # check the last columns!

Unnamed: 0,Postal Code,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,97201,"Arlington Heights, Corbett Terwilliger Lair Hi...",45.49894,-122.68781,8,Coffee Shop,Eye Doctor,Tram Station,Gas Station,ATM,Office,Middle Eastern Restaurant,Mobile Phone Shop,Mongolian Restaurant,Motorcycle Shop
1,97202,"Brooklyn, Creston Kenilworth, Eastmoreland, Ho...",45.481791,-122.64055,1,Convenience Store,Home Service,Food Truck,Garden,Farmers Market,Brewery,Train Station,Mexican Restaurant,Auto Garage,Park
2,97203,"Cathedral Park, Portsmouth, St. Johns, Univers...",45.589689,-122.73875,0,Food Truck,Convenience Store,Coffee Shop,Gym,Beer Bar,Soccer Field,Bar,Bank,BBQ Joint,Supermarket
3,97204,"Downtown Portland, Goose Hollow, Old Town Chin...",45.51854,-122.6755,1,Hotel,Coffee Shop,Park,Restaurant,Food Truck,Sandwich Place,Food Court,Jewelry Store,Asian Restaurant,Japanese Restaurant
4,97205,"Downtown Portland, Goose Hollow",45.52054,-122.68573,1,Coffee Shop,Hotel,Bookstore,Italian Restaurant,Sandwich Place,Cocktail Bar,Pizza Place,Clothing Store,Shoe Store,Theater
5,97206,"Brentwood Darlington, Foster Powell, Mt Scott ...",45.482341,-122.60007,0,Food Truck,Market,Food,Home Service,Shopping Mall,Tree,Bus Stop,ATM,Movie Theater,Middle Eastern Restaurant
6,97209,"Old Town Chinatown, Pearl District",45.52889,-122.68458,1,Coffee Shop,Dive Bar,Pizza Place,Café,Ice Cream Shop,Spa,Mediterranean Restaurant,Vietnamese Restaurant,Park,Cocktail Bar
7,97210,"Hillside, Linnton, Northwest, Portland Northwe...",45.534839,-122.7095,0,Park,Yoga Studio,Gym,Sporting Goods Shop,Food Truck,Café,Office,Thai Restaurant,Spa,Grocery Store
8,97211,"Concordia, East Columbia, King, Sabin, Vernon,...",45.56544,-122.64635,0,Bus Stop,Convenience Store,Bakery,Food Truck,Food,Park,Pizza Place,Taco Place,Tea Room,Thai Restaurant
9,97212,"Alameda, Beaumont Wilshire, Eliot, Grant Park,...",45.54424,-122.64353,1,Optical Shop,Coffee Shop,Pet Store,Italian Restaurant,Soccer Field,Garden Center,Music Store,Middle Eastern Restaurant,Mobile Phone Shop,Mongolian Restaurant


Visualize the k-means clusters

In [32]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=10)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(portland_merged['Latitude'], portland_merged['Longitude'], portland_merged['Neighborhood'], portland_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

Analyze each cluster

In [33]:
portland_merged.loc[portland_merged['Cluster Labels'] == 0, portland_merged.columns[[1] + list(range(4, portland_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,"Cathedral Park, Portsmouth, St. Johns, Univers...",0,Food Truck,Convenience Store,Coffee Shop,Gym,Beer Bar,Soccer Field,Bar,Bank,BBQ Joint,Supermarket
5,"Brentwood Darlington, Foster Powell, Mt Scott ...",0,Food Truck,Market,Food,Home Service,Shopping Mall,Tree,Bus Stop,ATM,Movie Theater,Middle Eastern Restaurant
7,"Hillside, Linnton, Northwest, Portland Northwe...",0,Park,Yoga Studio,Gym,Sporting Goods Shop,Food Truck,Café,Office,Thai Restaurant,Spa,Grocery Store
8,"Concordia, East Columbia, King, Sabin, Vernon,...",0,Bus Stop,Convenience Store,Bakery,Food Truck,Food,Park,Pizza Place,Taco Place,Tea Room,Thai Restaurant
10,"Center, Grant Park, Hollywood, Montavilla, Ros...",0,Food Truck,Brewery,Convenience Store,Street Food Gathering,Park,Chinese Restaurant,Bar,Office,Mobile Phone Shop,Mongolian Restaurant
12,Mt Tabor,0,Lake,Basketball Court,Amphitheater,Bus Stop,Cheese Shop,Mobile Phone Shop,Mongolian Restaurant,Motorcycle Shop,Mountain,Movie Theater


In [34]:
portland_merged.loc[portland_merged['Cluster Labels'] == 1, portland_merged.columns[[1] + list(range(4, portland_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,"Brooklyn, Creston Kenilworth, Eastmoreland, Ho...",1,Convenience Store,Home Service,Food Truck,Garden,Farmers Market,Brewery,Train Station,Mexican Restaurant,Auto Garage,Park
3,"Downtown Portland, Goose Hollow, Old Town Chin...",1,Hotel,Coffee Shop,Park,Restaurant,Food Truck,Sandwich Place,Food Court,Jewelry Store,Asian Restaurant,Japanese Restaurant
4,"Downtown Portland, Goose Hollow",1,Coffee Shop,Hotel,Bookstore,Italian Restaurant,Sandwich Place,Cocktail Bar,Pizza Place,Clothing Store,Shoe Store,Theater
6,"Old Town Chinatown, Pearl District",1,Coffee Shop,Dive Bar,Pizza Place,Café,Ice Cream Shop,Spa,Mediterranean Restaurant,Vietnamese Restaurant,Park,Cocktail Bar
9,"Alameda, Beaumont Wilshire, Eliot, Grant Park,...",1,Optical Shop,Coffee Shop,Pet Store,Italian Restaurant,Soccer Field,Garden Center,Music Store,Middle Eastern Restaurant,Mobile Phone Shop,Mongolian Restaurant
11,"Buckman, Hosford Abernethy, Kerns, Laurelhurst...",1,Pizza Place,Breakfast Spot,Thrift / Vintage Store,Sri Lankan Restaurant,Hardware Store,Bus Stop,Mexican Restaurant,Marijuana Dispensary,Lounge,Sandwich Place
13,"Hazelwood, Mill Park",1,Mobile Phone Shop,Coffee Shop,Big Box Store,Fast Food Restaurant,Discount Store,ATM,Sandwich Place,Buffet,Spa,Chinese Restaurant
14,"Arbor Lodge, Bridgeton, Hayden Island, Humbol...",1,Video Store,Dive Bar,ATM,Sushi Restaurant,Jewelry Store,Light Rail Station,Gym,Gas Station,Furniture / Home Store,Mexican Restaurant
15,Cully,1,Mexican Restaurant,Taco Place,Farm,Gas Station,Bar,Mediterranean Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Mongolian Restaurant,Motorcycle Shop
16,"Arnold Creek, Ash Creek, Collins View, Corbett...",1,Convenience Store,Automotive Shop,Pizza Place,Locksmith,Lingerie Store,Shipping Store,Sandwich Place,Greek Restaurant,Hospital,Intersection


In [35]:
portland_merged.loc[portland_merged['Cluster Labels'] == 2, portland_merged.columns[[1] + list(range(4, portland_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
25,Centennial,2,Cupcake Shop,Playground,ATM,Office,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Mongolian Restaurant,Motorcycle Shop,Mountain


In [36]:
portland_merged.loc[portland_merged['Cluster Labels'] == 3, portland_merged.columns[[1] + list(range(4, portland_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
21,"Forest Park, Northwest Heights",3,Insurance Office,ATM,Office,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Mongolian Restaurant,Motorcycle Shop,Mountain,Movie Theater


In [37]:
portland_merged.loc[portland_merged['Cluster Labels'] == 4, portland_merged.columns[[1] + list(range(4, portland_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
23,Linnton,4,Beer Garden,Farm,ATM,Music Store,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Mongolian Restaurant,Motorcycle Shop,Mountain


In [38]:
portland_merged.loc[portland_merged['Cluster Labels'] == 5, portland_merged.columns[[1] + list(range(4, portland_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
26,"Pleasant Valley, Powellhurst Gilbert",5,Dance Studio,Park,ATM,Office,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Mongolian Restaurant,Motorcycle Shop,Mountain


In [39]:
portland_merged.loc[portland_merged['Cluster Labels'] == 6, portland_merged.columns[[1] + list(range(4, portland_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,"Madison South, Parkrose Heights, Parkrose, Sum...",6,Intersection,Park,Mountain,ATM,Office,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Mongolian Restaurant,Motorcycle Shop


In [40]:
portland_merged.loc[portland_merged['Cluster Labels'] == 7, portland_merged.columns[[1] + list(range(4, portland_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
27,"Healy Heights, Homestead",7,Tennis Court,Bus Line,Carpet Store,ATM,Office,Middle Eastern Restaurant,Mobile Phone Shop,Mongolian Restaurant,Motorcycle Shop,Mountain


In [41]:
portland_merged.loc[portland_merged['Cluster Labels'] == 8, portland_merged.columns[[1] + list(range(4, portland_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Arlington Heights, Corbett Terwilliger Lair Hi...",8,Coffee Shop,Eye Doctor,Tram Station,Gas Station,ATM,Office,Middle Eastern Restaurant,Mobile Phone Shop,Mongolian Restaurant,Motorcycle Shop


## Results and Discussion

Our analysis shows that Portland offers a wide assortment of venues in each neighborhood. Food trucks are very popular in Portland; because many of them typically share one lot, they tend to sit outside the dense city center. The neighborhoods with the most coffee shops and restaurants are in the heart of the city or right off the highways leading out of it. These two clusters (0 and 1) are not ideal for opening a restaurant, let alone a Mexican restaurant. Clusters 2 through 6 all list a Mexican restaurant in their top 10 venues already, so I would avoid those areas as well. The two ideal places to open a Mexican restaurant would be clusters 7 and 8, that is, the Healy Heights, Arlington Heights, Corbett Terwilliger Lair Hill, Hillsdale, Homestead, or Southwest Hills neighborhoods.

## Conclusion

The purpose of this project was to cluster Portland neighborhoods by venue type to help my wife narrow down the search for an optimal location for a new Mexican restaurant. I gathered venue data from Foursquare and identified the top 10 venues per neighborhood. Clustering was then performed to create subgroups and identify the best neighborhoods in which to open a Mexican restaurant.