## Final Project: A Bakery in Portland, OR

### Question 1: Introduction/Business Problem

#### In this notebook, I explore neighborhoods in Portland, Oregon, to find a good location for a bakery/coffee shop. My target audience is people who want to start a local business, such as bakers or coffee shop owners. Location matters to this audience because they want a neighborhood without similar businesses, so there is less competition. They also want a true neighborhood bakery, where residents can easily pick up fresh bread and pastries, or grab a coffee and walk in a nearby park. The location should therefore be residential, in a neighborhood with a few parks.

### Question 2: Data

#### I will use Foursquare data to explore the neighborhoods in Portland, OR. I will get a list of venues for each neighborhood and look for neighborhoods where bakeries and coffee shops do not appear in that list. From those, I will choose the more residential neighborhoods, which means they won't have a high number of venues, though they should have some, especially parks, schools, playgrounds, and maybe a few restaurants. The venues in the neighborhood should also not include places like airports, night clubs, shopping malls, or distribution centers, as the audience wants a quiet neighborhood bakery.

#### Before I use the Foursquare data, I will obtain data about Portland neighborhoods. My data consists of a GeoJSON file from PortlandMaps.com, which lists each neighborhood and its geometry. I will use GeoPandas to get the centroid of each neighborhood, and then clean the data before I start using Foursquare.

In [1]:
import pandas as pd

#### Install Folium

In [2]:
!pip install folium



#### Import Folium and make an initial map of Portland

In [2]:
import folium

m = folium.Map(location=[45.5236, -122.6750])

In [3]:
m

#### Install and import the necessary libraries: requests, json, and geopandas. GeoPandas is a library that handles GeoJSON files. I needed it because the neighborhood data from PortlandMaps Open Data is in GeoJSON format, and I wanted to be able to put the data in a dataframe so that I could use it with the Foursquare data.

In [4]:
import requests
import json
import numpy as np

In [6]:
conda install geopandas

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.


Note: you may need to restart the kernel to use updated packages.


In [5]:
import geopandas as gpd

## Obtain the Data

#### Use geoJSON file from PortlandMaps Open Data: https://gis-pdx.opendata.arcgis.com/datasets/neighborhood-boundaries
#### Shows Neighborhood boundaries. Metadata is here: https://www.portlandmaps.com/metadata/index.cfm?&action=DisplayLayer&LayerID=53509

In [6]:
URL = "https://opendata.arcgis.com/datasets/1ef75e34b8504ab9b14bef0c26cade2c_3.geojson"
gdf = gpd.read_file(URL)

gdf

,OBJECTID,NAME,COMMPLAN,SHARED,COALIT,HORZ_VERT,Shape_Length,MAPLABEL,ID,geometry
0,1,LINNTON,,N,NWNW,HORZ,52741.719772,Linnton,1,"POLYGON ((-122.82371 45.60616, -122.82319 45.6..."
1,2,FOREST PARK/LINNTON,,Y,NWNW,,57723.635350,Forest Park/Linnton,2,"POLYGON ((-122.82319 45.60616, -122.82371 45.6..."
2,3,FOREST PARK,,N,NWNW,HORZ,82725.497522,Forest Park,3,"POLYGON ((-122.79159 45.54843, -122.79086 45.5..."
3,4,CATHEDRAL PARK,,N,NPNS,HORZ,11434.254777,Cathedral Park,4,"POLYGON ((-122.76461 45.58519, -122.76135 45.5..."
4,5,UNIVERSITY PARK,,N,NPNS,HORZ,11950.859827,University Park,5,"POLYGON ((-122.73855 45.58395, -122.74104 45.5..."
...,...,...,...,...,...,...,...,...,...,...
125,126,KENTON,ALBINA,N,NPNS,HORZ,19247.188225,Kenton,126,"POLYGON ((-122.67859 45.57721, -122.67853 45.5..."
126,127,BRIDGETON,,N,NPNS,HORZ,8635.720662,Bridgeton,127,"POLYGON ((-122.65704 45.60239, -122.65893 45.6..."
127,128,EAST COLUMBIA,,N,NPNS,HORZ,15397.269131,East Columbia,128,"POLYGON ((-122.66015 45.59948, -122.66041 45.5..."
128,129,SUNDERLAND ASSOCIATION OF NEIGHBORS,,N,CNN,HORZ,20706.496916,Sunderland,129,"POLYGON ((-122.64031 45.60116, -122.64095 45.6..."


#### Set OBJECTID as the index to make the table easier to read

In [7]:
gdf = gdf.set_index("OBJECTID")
gdf

OBJECTID,NAME,COMMPLAN,SHARED,COALIT,HORZ_VERT,Shape_Length,MAPLABEL,ID,geometry
1,LINNTON,,N,NWNW,HORZ,52741.719772,Linnton,1,"POLYGON ((-122.82371 45.60616, -122.82319 45.6..."
2,FOREST PARK/LINNTON,,Y,NWNW,,57723.635350,Forest Park/Linnton,2,"POLYGON ((-122.82319 45.60616, -122.82371 45.6..."
3,FOREST PARK,,N,NWNW,HORZ,82725.497522,Forest Park,3,"POLYGON ((-122.79159 45.54843, -122.79086 45.5..."
4,CATHEDRAL PARK,,N,NPNS,HORZ,11434.254777,Cathedral Park,4,"POLYGON ((-122.76461 45.58519, -122.76135 45.5..."
5,UNIVERSITY PARK,,N,NPNS,HORZ,11950.859827,University Park,5,"POLYGON ((-122.73855 45.58395, -122.74104 45.5..."
...,...,...,...,...,...,...,...,...,...
126,KENTON,ALBINA,N,NPNS,HORZ,19247.188225,Kenton,126,"POLYGON ((-122.67859 45.57721, -122.67853 45.5..."
127,BRIDGETON,,N,NPNS,HORZ,8635.720662,Bridgeton,127,"POLYGON ((-122.65704 45.60239, -122.65893 45.6..."
128,EAST COLUMBIA,,N,NPNS,HORZ,15397.269131,East Columbia,128,"POLYGON ((-122.66015 45.59948, -122.66041 45.5..."
129,SUNDERLAND ASSOCIATION OF NEIGHBORS,,N,CNN,HORZ,20706.496916,Sunderland,129,"POLYGON ((-122.64031 45.60116, -122.64095 45.6..."


#### Find the centroid of each neighborhood polygon

In [8]:
gdf['centroid'] = gdf.centroid
gdf['centroid']

OBJECTID
1      POINT (-122.79326 45.60379)
2      POINT (-122.78177 45.58063)
3      POINT (-122.79208 45.56438)
4      POINT (-122.75732 45.58737)
5      POINT (-122.73008 45.57635)
                  ...             
126    POINT (-122.69739 45.59465)
127    POINT (-122.66805 45.60298)
128    POINT (-122.66187 45.59390)
129    POINT (-122.63658 45.58387)
130    POINT (-122.71838 45.58719)
Name: centroid, Length: 130, dtype: geometry
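
#### As a rough illustration of what `gdf.centroid` computes, here is a minimal pure-Python sketch of the area-weighted centroid of a simple polygon (the shoelace formula). The function name `polygon_centroid` is hypothetical and not part of GeoPandas. The sketch ignores holes and multi-part geometries, and it treats longitude/latitude as flat x/y, which is only an approximation at neighborhood scale.

```python
from typing import Iterable, Tuple


def polygon_centroid(ring: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Area-weighted centroid of a simple polygon given as (x, y) vertices.

    The ring may be open or closed (first vertex repeated at the end).
    """
    points = list(ring)
    if points[0] != points[-1]:
        points.append(points[0])  # close the ring

    area2 = 0.0  # twice the signed area
    cx = 0.0
    cy = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    # Cx = sum / (6 * A) and A = area2 / 2, hence the factor of 3.
    return cx / (3.0 * area2), cy / (3.0 * area2)
```

#### For a unit square, `polygon_centroid([(0, 0), (1, 0), (1, 1), (0, 1)])` returns `(0.5, 0.5)`. Because the coordinates here are degrees rather than meters, newer GeoPandas versions may warn that centroids in a geographic coordinate system are approximate; projecting the data first would be the more careful option.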

In [9]:
gdf.head(10)

OBJECTID,NAME,COMMPLAN,SHARED,COALIT,HORZ_VERT,Shape_Length,MAPLABEL,ID,geometry,centroid
1,LINNTON,,N,NWNW,HORZ,52741.719772,Linnton,1,"POLYGON ((-122.82371 45.60616, -122.82319 45.6...",POINT (-122.79326 45.60379)
2,FOREST PARK/LINNTON,,Y,NWNW,,57723.63535,Forest Park/Linnton,2,"POLYGON ((-122.82319 45.60616, -122.82371 45.6...",POINT (-122.78177 45.58063)
3,FOREST PARK,,N,NWNW,HORZ,82725.497522,Forest Park,3,"POLYGON ((-122.79159 45.54843, -122.79086 45.5...",POINT (-122.79208 45.56438)
4,CATHEDRAL PARK,,N,NPNS,HORZ,11434.254777,Cathedral Park,4,"POLYGON ((-122.76461 45.58519, -122.76135 45.5...",POINT (-122.75732 45.58737)
5,UNIVERSITY PARK,,N,NPNS,HORZ,11950.859827,University Park,5,"POLYGON ((-122.73855 45.58395, -122.74104 45.5...",POINT (-122.73008 45.57635)
6,MC UNCLAIMED #14,,N,UNCLAIMED,,23667.613908,MC Unclaimed #14,6,"POLYGON ((-122.76461 45.58519, -122.76542 45.5...",POINT (-122.72747 45.55769)
7,PIEDMONT,ALBINA,N,NPNS,VERT,10849.327392,Piedmont,7,"POLYGON ((-122.67545 45.58659, -122.67593 45.5...",POINT (-122.67042 45.57644)
8,WOODLAWN,ALBINA,N,NECN,HORZ,8078.360994,Woodlawn,8,"POLYGON ((-122.66133 45.58112, -122.66136 45.5...",POINT (-122.65304 45.57257)
9,CULLY ASSOCIATION OF NEIGHBORS,,N,CNN,HORZ,18179.39209,Cully Association of Neighbors,9,"POLYGON ((-122.62053 45.57178, -122.62052 45.5...",POINT (-122.60151 45.56375)
10,ARBOR LODGE,ALBINA,N,NPNS,HORZ,9466.411504,Arbor Lodge,10,"POLYGON ((-122.67859 45.57721, -122.68210 45.5...",POINT (-122.69084 45.57215)


## Clean the Data

#### I have several columns that I don't need. I will drop COMMPLAN, SHARED, COALIT, HORZ_VERT, Shape_Length, MAPLABEL, and ID.

In [10]:
gdf.drop(columns=['COMMPLAN', 'SHARED', 'COALIT', 'HORZ_VERT', 'Shape_Length', 'MAPLABEL', 'ID'], inplace = True)
gdf.head()

OBJECTID,NAME,geometry,centroid
1,LINNTON,"POLYGON ((-122.82371 45.60616, -122.82319 45.6...",POINT (-122.79326 45.60379)
2,FOREST PARK/LINNTON,"POLYGON ((-122.82319 45.60616, -122.82371 45.6...",POINT (-122.78177 45.58063)
3,FOREST PARK,"POLYGON ((-122.79159 45.54843, -122.79086 45.5...",POINT (-122.79208 45.56438)
4,CATHEDRAL PARK,"POLYGON ((-122.76461 45.58519, -122.76135 45.5...",POINT (-122.75732 45.58737)
5,UNIVERSITY PARK,"POLYGON ((-122.73855 45.58395, -122.74104 45.5...",POINT (-122.73008 45.57635)


#### I need to convert the centroid to latitude and longitude columns, but its type is geometry. I'll convert it to a string so it's a little easier to work with.

In [11]:
gdf.dtypes

NAME          object
geometry    geometry
centroid    geometry
dtype: object

In [12]:
gdf["centroid"]=gdf["centroid"].astype("str")

In [13]:
gdf.dtypes

NAME          object
geometry    geometry
centroid      object
dtype: object

#### Remove the word "point" from the centroid column

In [14]:
gdf['centroid'] = gdf['centroid'].str.replace('POINT ', '')

In [15]:
gdf.head()

OBJECTID,NAME,geometry,centroid
1,LINNTON,"POLYGON ((-122.82371 45.60616, -122.82319 45.6...",(-122.7932636868726 45.60378993637875)
2,FOREST PARK/LINNTON,"POLYGON ((-122.82319 45.60616, -122.82371 45.6...",(-122.7817746512875 45.58063023423158)
3,FOREST PARK,"POLYGON ((-122.79159 45.54843, -122.79086 45.5...",(-122.7920776241818 45.56438278026928)
4,CATHEDRAL PARK,"POLYGON ((-122.76461 45.58519, -122.76135 45.5...",(-122.7573167006587 45.58736826406709)
5,UNIVERSITY PARK,"POLYGON ((-122.73855 45.58395, -122.74104 45.5...",(-122.730079200974 45.57635375668902)


#### Now I need to split the centroid into two columns: latitude and longitude. This will allow me to map the center of each Portland neighborhood.

In [16]:
# Create two empty lists for the results
Latitude = []
Longitude = []

# For each row in centroid
for row in gdf['centroid']:
    # Split the row by the space and append
    # everything before the space to longitude
    Longitude.append(row.split(' ')[0])
    # Split the row by the space and append
    # everything after the space to latitude
    Latitude.append(row.split(' ')[1])
    
gdf['Latitude'] = Latitude
gdf['Longitude'] = Longitude

In [17]:
gdf.head()

OBJECTID,NAME,geometry,centroid,Latitude,Longitude
1,LINNTON,"POLYGON ((-122.82371 45.60616, -122.82319 45.6...",(-122.7932636868726 45.60378993637875),45.60378993637875),(-122.7932636868726
2,FOREST PARK/LINNTON,"POLYGON ((-122.82319 45.60616, -122.82371 45.6...",(-122.7817746512875 45.58063023423158),45.58063023423158),(-122.7817746512875
3,FOREST PARK,"POLYGON ((-122.79159 45.54843, -122.79086 45.5...",(-122.7920776241818 45.56438278026928),45.56438278026928),(-122.7920776241818
4,CATHEDRAL PARK,"POLYGON ((-122.76461 45.58519, -122.76135 45.5...",(-122.7573167006587 45.58736826406709),45.58736826406709),(-122.7573167006587
5,UNIVERSITY PARK,"POLYGON ((-122.73855 45.58395, -122.74104 45.5...",(-122.730079200974 45.57635375668902),45.57635375668902),(-122.730079200974


#### Take those parentheses off the lat and long columns

In [18]:
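# regex=False treats the parenthesis as a literal character, not a regex pattern
gdf['Latitude'] = gdf['Latitude'].str.replace(')', '', regex=False)

In [19]:
gdf['Longitude'] = gdf['Longitude'].str.replace('(', '', regex=False)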

In [20]:
gdf.head()

OBJECTID,NAME,geometry,centroid,Latitude,Longitude
1,LINNTON,"POLYGON ((-122.82371 45.60616, -122.82319 45.6...",(-122.7932636868726 45.60378993637875),45.60378993637875,-122.7932636868726
2,FOREST PARK/LINNTON,"POLYGON ((-122.82319 45.60616, -122.82371 45.6...",(-122.7817746512875 45.58063023423158),45.58063023423158,-122.7817746512875
3,FOREST PARK,"POLYGON ((-122.79159 45.54843, -122.79086 45.5...",(-122.7920776241818 45.56438278026928),45.56438278026928,-122.7920776241818
4,CATHEDRAL PARK,"POLYGON ((-122.76461 45.58519, -122.76135 45.5...",(-122.7573167006587 45.58736826406709),45.58736826406709,-122.7573167006587
5,UNIVERSITY PARK,"POLYGON ((-122.73855 45.58395, -122.74104 45.5...",(-122.730079200974 45.57635375668902),45.57635375668902,-122.730079200974
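
#### The string-cleaning steps above (cast to string, strip "POINT", split on the space, strip the parentheses) can be condensed into one parsing helper. This is a minimal sketch, not the approach used in this notebook; `parse_point_wkt` is a hypothetical name, and it assumes the text looks like `POINT (lon lat)` with longitude first, as in the centroid column.

```python
import re
from typing import Tuple

# Matches "POINT (x y)" and captures the two numbers as text.
_POINT_RE = re.compile(r"^\s*POINT\s*\(\s*([^\s()]+)\s+([^\s()]+)\s*\)\s*$")


def parse_point_wkt(text: str) -> Tuple[float, float]:
    """Return (latitude, longitude) as floats from a 'POINT (lon lat)' string."""
    match = _POINT_RE.match(text)
    if match is None:
        raise ValueError(f"not a POINT string: {text!r}")
    lon, lat = (float(part) for part in match.groups())
    return lat, lon
```

#### The helper returns floats directly, so no separate type conversion is needed afterwards. With GeoPandas, a column of points also exposes `.x` and `.y` attributes, so reading longitude and latitude from the geometry before casting it to a string would avoid the string handling altogether.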


#### I'm converting the gdf to a df and dropping the geometry and centroid columns, but saving the variable in case I need it later to map the neighborhoods.

In [21]:
from pandas import DataFrame

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

print('Libraries imported.')

Libraries imported.


In [22]:
gdf.drop(columns=['geometry', 'centroid'], inplace = True)
gdf.head()

OBJECTID,NAME,Latitude,Longitude
1,LINNTON,45.60378993637875,-122.7932636868726
2,FOREST PARK/LINNTON,45.58063023423158,-122.7817746512875
3,FOREST PARK,45.56438278026928,-122.7920776241818
4,CATHEDRAL PARK,45.58736826406709,-122.7573167006587
5,UNIVERSITY PARK,45.57635375668902,-122.730079200974


In [23]:
df = pd.DataFrame(gdf)

#### Convert lat and long to floats

In [24]:
df.dtypes

NAME         object
Latitude     object
Longitude    object
dtype: object

In [25]:
df["Latitude"]=df["Latitude"].astype(float)

In [26]:
df["Longitude"]=df["Longitude"].astype(float)

In [27]:
df.dtypes

NAME          object
Latitude     float64
Longitude    float64
dtype: object

In [28]:
df.head(10)

OBJECTID,NAME,Latitude,Longitude
1,LINNTON,45.60379,-122.793264
2,FOREST PARK/LINNTON,45.58063,-122.781775
3,FOREST PARK,45.564383,-122.792078
4,CATHEDRAL PARK,45.587368,-122.757317
5,UNIVERSITY PARK,45.576354,-122.730079
6,MC UNCLAIMED #14,45.557689,-122.727474
7,PIEDMONT,45.576438,-122.670418
8,WOODLAWN,45.572565,-122.653037
9,CULLY ASSOCIATION OF NEIGHBORS,45.563753,-122.601509
10,ARBOR LODGE,45.572152,-122.690842


In [29]:
df.reset_index(drop=True, inplace=True)
df.head()


,NAME,Latitude,Longitude
0,LINNTON,45.60379,-122.793264
1,FOREST PARK/LINNTON,45.58063,-122.781775
2,FOREST PARK,45.564383,-122.792078
3,CATHEDRAL PARK,45.587368,-122.757317
4,UNIVERSITY PARK,45.576354,-122.730079


### Show a map of Portland with centroids of each neighborhood

In [30]:
# create map of Portland Neighborhoods using latitude and longitude values
m = folium.Map(location=[45.5236, -122.6750], zoom_start=12)

# add markers to map
for lat, lng, label in zip(df['Latitude'], df['Longitude'], df['NAME']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(m)  
    
m

## Use Foursquare to explore neighborhoods

#### Enter my Foursquare credentials

In [31]:
CLIENT_ID = 'TDKMGX0URSB410BSSHFVA2UOSPJGWPVY5MUA3ME5APTIGSTS' # your Foursquare ID
CLIENT_SECRET = 'LDINZ0KVRKCBSRBLXWE0WM2CGFFHHX3MDFLZBEGAGA5WFKKF' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value
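
#### Credentials typed into a cell travel with every shared copy of the notebook. A minimal sketch of reading them from environment variables instead is shown here; the function name and the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` are hypothetical choices for this example, not something Foursquare prescribes.

```python
import os
from typing import Mapping, Tuple


def load_foursquare_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Read the client ID and secret from environment variables.

    FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET are hypothetical
    variable names chosen for this sketch.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        # Name the missing variable so the fix is obvious to the reader.
        raise RuntimeError(f"set {missing.args[0]} before running the notebook") from None
```

#### With this in place, the cell would read `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()`, and the rest of the notebook would stay unchanged.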

#### Define a function that gets the venues within 500 m of each neighborhood centroid

In [32]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
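
#### The function above indexes straight into the response, so a neighborhood with no results, or a venue with an empty category list, may raise an error partway through the loop. Here is a hedged sketch of a more defensive version of the two inner steps. `explore_params` and `extract_venues` are hypothetical helper names, and the response shape is assumed to be the one the function above already relies on.

```python
from typing import Any, Dict, List, Optional, Tuple

VenueRow = Tuple[str, float, float, Optional[str], Optional[float], Optional[float], Optional[str]]


def explore_params(lat: float, lng: float, client_id: str, client_secret: str,
                   version: str = "20180605", radius: int = 500,
                   limit: int = 100) -> Dict[str, Any]:
    """Query parameters for the explore request, as a dict instead of a formatted URL."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
    }


def extract_venues(name: str, lat: float, lng: float, payload: Dict[str, Any]) -> List[VenueRow]:
    """Turn one explore response into rows, tolerating missing pieces."""
    groups = payload.get("response", {}).get("groups") or []
    items = groups[0].get("items", []) if groups else []
    rows: List[VenueRow] = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or []
        # Use None rather than failing when a venue has no category.
        category = categories[0].get("name") if categories else None
        rows.append((name, lat, lng, venue.get("name"),
                     location.get("lat"), location.get("lng"), category))
    return rows
```

#### The parameters could then be passed as `requests.get('https://api.foursquare.com/v2/venues/explore', params=explore_params(...))`, which lets `requests` handle the URL encoding, and each parsed response could go through `extract_venues` before being appended to `venues_list`.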

#### Run the "getNearbyVenues" function on each Portland neighborhood. Then print the size and show the first 20 rows

In [33]:
Portland_venues = getNearbyVenues(names=df['NAME'],
                                   latitudes=df['Latitude'],
                                   longitudes=df['Longitude']
                                  )

LINNTON
FOREST PARK/LINNTON
FOREST PARK
CATHEDRAL PARK
UNIVERSITY PARK
MC UNCLAIMED #14
PIEDMONT
WOODLAWN
CULLY ASSOCIATION OF NEIGHBORS
ARBOR LODGE
OVERLOOK
CONCORDIA
PARKROSE
SUMNER ASSOCIATION OF NEIGHBORS
ARGAY TERRACE
HUMBOLDT
KING
VERNON
WILKES COMMUNITY GROUP
BEAUMONT-WILSHIRE
SABIN COMMUNITY ASSOCIATION
ALAMEDA
BOISE
NORTHWEST HEIGHTS
ROSEWAY
MADISON SOUTH
ARGAY/WILKES COMMUNITY GROUP
BOISE/ELIOT
ELIOT
IRVINGTON COMMUNITY ASSOCIATION
SABIN COMMUNITY ASSN./IRVINGTON COMMUNITY ASSN.
ALAMEDA/IRVINGTON COMMUNITY ASSN.
ROSE CITY PARK
PARKROSE HEIGHTS ASSOCIATION OF NEIGHBORS
NORTHWEST DISTRICT ASSOCIATION
ALAMEDA/BEAUMONT-WILSHIRE
FOREST PARK/NORTHWEST DISTRICT ASSOCIATION
RUSSELL
ROSEWAY/MADISON SOUTH
GRANT PARK
MC UNCLAIMED #5
PEARL DISTRICT
GRANT PARK/HOLLYWOOD
HOLLYWOOD
WOODLAND PARK
LLOYD DISTRICT COMMUNITY ASSOCIATION
SULLIVAN'S GULCH
SULLIVAN'S GULCH/GRANT PARK
MONTAVILLA
HILLSIDE/NORTHWEST DISTRICT ASSN.
LAURELHURST
KERNS
LLOYD DISTRICT COMMUNITY ASSN./SULLIVAN'S GULCH
HILLS

In [64]:
print(Portland_venues.shape)
Portland_venues.head(20)

(2003, 6)


Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
LINNTON,45.60379,-122.793264,Subway,45.603014,-122.78859,Sandwich Place
LINNTON,45.60379,-122.793264,Shell,45.603473,-122.788861,Gas Station
LINNTON,45.60379,-122.793264,7-Eleven,45.602668,-122.788165,Convenience Store
FOREST PARK/LINNTON,45.58063,-122.781775,Forest Park Hardesty Trailhead,45.578979,-122.781064,Trail
CATHEDRAL PARK,45.587368,-122.757317,Cathedral Park,45.587744,-122.759822,Park
CATHEDRAL PARK,45.587368,-122.757317,Hoplandia Beer,45.589662,-122.755614,Beer Store
CATHEDRAL PARK,45.587368,-122.757317,Occidental Wursthaus,45.588864,-122.761344,German Restaurant
CATHEDRAL PARK,45.587368,-122.757317,Taqueria Y Panaderia Santa Cruz,45.590201,-122.755332,Mexican Restaurant
CATHEDRAL PARK,45.587368,-122.757317,The Great North,45.590399,-122.754684,Coffee Shop
CATHEDRAL PARK,45.587368,-122.757317,Occidental Brewing Company,45.588807,-122.76168,Brewery


In [65]:
Portland_venues.reset_index(level =['Neighborhood'], inplace = True) 
Portland_venues.head()

,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,LINNTON,45.60379,-122.793264,Subway,45.603014,-122.78859,Sandwich Place
1,LINNTON,45.60379,-122.793264,Shell,45.603473,-122.788861,Gas Station
2,LINNTON,45.60379,-122.793264,7-Eleven,45.602668,-122.788165,Convenience Store
3,FOREST PARK/LINNTON,45.58063,-122.781775,Forest Park Hardesty Trailhead,45.578979,-122.781064,Trail
4,CATHEDRAL PARK,45.587368,-122.757317,Cathedral Park,45.587744,-122.759822,Park


#### Count the venues in each neighborhood

In [66]:
Portland_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
ALAMEDA,4,4,4,4,4,4
ALAMEDA/BEAUMONT-WILSHIRE,4,4,4,4,4,4
ALAMEDA/IRVINGTON COMMUNITY ASSN.,6,6,6,6,6,6
ARBOR LODGE,11,11,11,11,11,11
ARDENWALD-JOHNSON CREEK,9,9,9,9,9,9
ARDENWALD-JOHNSON CREEK/WOODSTOCK,6,6,6,6,6,6
ARGAY TERRACE,12,12,12,12,12,12
ARGAY/WILKES COMMUNITY GROUP,5,5,5,5,5,5
ARLINGTON HEIGHTS,21,21,21,21,21,21
ARLINGTON HEIGHTS/SYLVAN-HIGHLANDS,22,22,22,22,22,22


#### I want to use only neighborhoods that are more residential, which means they will have fewer venues. So I am finding all neighborhoods that have 15 or more venues, and then I will remove those from the Portland_venues df.

In [67]:
venue_totals = Portland_venues.groupby('Neighborhood').count()

In [68]:
venue_totals_filtered = venue_totals[venue_totals['Venue'] >= 15] 
venue_totals_filtered.head()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
ARLINGTON HEIGHTS,21,21,21,21,21,21
ARLINGTON HEIGHTS/SYLVAN-HIGHLANDS,22,22,22,22,22,22
BEAUMONT-WILSHIRE,15,15,15,15,15,15
BOISE,83,83,83,83,83,83
BOISE/ELIOT,21,21,21,21,21,21


#### Looks like I have 41 neighborhoods with 15 or more venues to remove. I'll get a list of them so I can copy-paste and then drop them.

In [69]:
venue_totals_filtered.shape

(41, 6)

#### Put the busy neighborhoods into an array

In [70]:
busy_neighborhoods = venue_totals_filtered.index.values
busy_neighborhoods

array(['ARLINGTON HEIGHTS', 'ARLINGTON HEIGHTS/SYLVAN-HIGHLANDS',
       'BEAUMONT-WILSHIRE', 'BOISE', 'BOISE/ELIOT',
       'BUCKMAN COMMUNITY ASSOCIATION', 'CATHEDRAL PARK', 'CONCORDIA',
       'CRESTON-KENILWORTH', 'ELIOT', 'GOOSE HOLLOW FOOTHILLS LEAGUE',
       'GOOSE HOLLOW FOOTHILLS LEAGUE/SOUTHWEST HILLS RESIDENTIAL LEAGUE',
       'GRANT PARK/HOLLYWOOD', 'HAZELWOOD', 'HAZELWOOD/MILL PARK',
       'HILLSDALE', 'HOLLYWOOD',
       'HOSFORD-ABERNETHY NEIGHBORHOOD DISTRICT ASSN.', 'HUMBOLDT',
       'KERNS', 'KING', "LLOYD DISTRICT COMMUNITY ASSN./SULLIVAN'S GULCH",
       'LLOYD DISTRICT COMMUNITY ASSOCIATION', 'MONTAVILLA', 'MULTNOMAH',
       'NORTHWEST DISTRICT ASSOCIATION', 'OLD TOWN COMMUNITY ASSOCIATION',
       'OVERLOOK', 'PEARL DISTRICT', 'PORTLAND DOWNTOWN', 'RICHMOND',
       'ROSE CITY PARK', 'ROSEWAY',
       'SABIN COMMUNITY ASSN./IRVINGTON COMMUNITY ASSN.',
       'SELLWOOD-MORELAND IMPROVEMENT LEAGUE', "SULLIVAN'S GULCH",
       "SULLIVAN'S GULCH/GRANT PARK", 'SUN

In [72]:
# Set Neighborhood as the index so rows can be dropped by neighborhood name
Portland_venues = Portland_venues.set_index("Neighborhood")

In [73]:
portland_venues_low = Portland_venues.drop(['ARLINGTON HEIGHTS', 'ARLINGTON HEIGHTS/SYLVAN-HIGHLANDS',
       'BEAUMONT-WILSHIRE', 'BOISE', 'BOISE/ELIOT',
       'BUCKMAN COMMUNITY ASSOCIATION', 'CATHEDRAL PARK', 'CONCORDIA',
       'CRESTON-KENILWORTH', 'ELIOT', 'GOOSE HOLLOW FOOTHILLS LEAGUE',
       'GOOSE HOLLOW FOOTHILLS LEAGUE/SOUTHWEST HILLS RESIDENTIAL LEAGUE',
       'GRANT PARK/HOLLYWOOD', 'HAZELWOOD', 'HAZELWOOD/MILL PARK',
       'HILLSDALE', 'HOLLYWOOD',
       'HOSFORD-ABERNETHY NEIGHBORHOOD DISTRICT ASSN.', 'HUMBOLDT',
       'KERNS', 'KING', "LLOYD DISTRICT COMMUNITY ASSN./SULLIVAN'S GULCH",
       'LLOYD DISTRICT COMMUNITY ASSOCIATION', 'MONTAVILLA', 'MULTNOMAH',
       'NORTHWEST DISTRICT ASSOCIATION', 'OLD TOWN COMMUNITY ASSOCIATION',
       'OVERLOOK', 'PEARL DISTRICT', 'PORTLAND DOWNTOWN', 'RICHMOND',
       'ROSE CITY PARK', 'ROSEWAY',
       'SABIN COMMUNITY ASSN./IRVINGTON COMMUNITY ASSN.',
       'SELLWOOD-MORELAND IMPROVEMENT LEAGUE', "SULLIVAN'S GULCH",
       "SULLIVAN'S GULCH/GRANT PARK", 'SUNNYSIDE', 'VERNON',
       'WOODLAND PARK', 'WOODSTOCK'], axis=0)
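
#### Copying the neighborhood names into `drop` works, but the list has to be redone whenever the data or the threshold changes. A minimal sketch of doing the same filter directly from the counts follows. `keep_quiet_neighborhoods` is a hypothetical helper name, and the sketch assumes the dataframe has `Neighborhood` and `Venue` columns like `Portland_venues`.

```python
import pandas as pd


def keep_quiet_neighborhoods(venues: pd.DataFrame, threshold: int = 15) -> pd.DataFrame:
    """Keep only rows whose neighborhood has fewer than `threshold` venues."""
    # transform('count') repeats each neighborhood's venue count on every one of its rows.
    counts = venues.groupby("Neighborhood")["Venue"].transform("count")
    return venues[counts < threshold]
```

#### Called as `keep_quiet_neighborhoods(Portland_venues)`, this is intended to keep the same rows as the `drop` call above, under the assumption that the pasted list matches the neighborhoods with 15 or more venues.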

In [74]:
portland_venues_low.shape

(426, 6)

In [75]:
portland_venues_low

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
LINNTON,45.60379,-122.793264,Subway,45.603014,-122.78859,Sandwich Place
LINNTON,45.60379,-122.793264,Shell,45.603473,-122.788861,Gas Station
LINNTON,45.60379,-122.793264,7-Eleven,45.602668,-122.788165,Convenience Store
FOREST PARK/LINNTON,45.58063,-122.781775,Forest Park Hardesty Trailhead,45.578979,-122.781064,Trail
UNIVERSITY PARK,45.576354,-122.730079,Merlo Field,45.574739,-122.727743,College Soccer Field
UNIVERSITY PARK,45.576354,-122.730079,Chiles Center,45.575108,-122.72854,College Basketball Court
UNIVERSITY PARK,45.576354,-122.730079,Student Lead Urban Garden,45.57696,-122.733625,Garden
UNIVERSITY PARK,45.576354,-122.730079,Mago Hunt Recital Hall University of Portland,45.573198,-122.727912,Theater
MC UNCLAIMED #14,45.557689,-122.727474,Kelley Imaging Systems,45.557622,-122.731581,Paper / Office Supplies Store
PIEDMONT,45.576438,-122.670418,Black Rock Coffee Bar,45.577042,-122.668155,Coffee Shop


#### I've cut out the busiest neighborhoods, and now have 79 neighborhoods remaining. Next, I'd like to remove neighborhoods with coffee shops or bakeries, as I want to find a place with less competition.

In [76]:
# find the coffee shops in the remaining neighborhoods
coffee_shops = portland_venues_low.loc[portland_venues_low['Venue Category'] == 'Coffee Shop']
coffee_shops

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
PIEDMONT,45.576438,-122.670418,Black Rock Coffee Bar,45.577042,-122.668155,Coffee Shop
WOODLAWN,45.572565,-122.653037,Woodlawn Coffee and Pastry,45.571753,-122.657063,Coffee Shop
ARBOR LODGE,45.572152,-122.690842,Grindhouse Coffee,45.56987,-122.68711,Coffee Shop
ALAMEDA,45.549062,-122.635862,Guilder,45.54829,-122.641335,Coffee Shop
ALAMEDA/IRVINGTON COMMUNITY ASSN.,45.545149,-122.641661,Guilder,45.54829,-122.641335,Coffee Shop
RUSSELL,45.539319,-122.527556,Spinal Tap,45.543268,-122.52602,Coffee Shop
ROSEWAY/MADISON SOUTH,45.541872,-122.581062,Bebo's Coffee,45.545176,-122.579045,Coffee Shop
NORTH TABOR,45.526056,-122.60533,Seven Virtues Coffee Roasters,45.526354,-122.602358,Coffee Shop
NORTH TABOR,45.526056,-122.60533,Starbucks,45.522549,-122.606596,Coffee Shop
SOUTH PORTLAND,45.488039,-122.673813,Essence Coffee & Tea,45.484473,-122.676329,Coffee Shop


In [77]:
#get list of neighborhoods with coffee shops
remove_coffee_shops = coffee_shops.index.values
remove_coffee_shops

array(['PIEDMONT', 'WOODLAWN', 'ARBOR LODGE', 'ALAMEDA',
       'ALAMEDA/IRVINGTON COMMUNITY ASSN.', 'RUSSELL',
       'ROSEWAY/MADISON SOUTH', 'NORTH TABOR', 'NORTH TABOR',
       'SOUTH PORTLAND', 'HOMESTEAD', 'POWELLHURST-GILBERT',
       'POWELLHURST-GILBERT', 'MT. SCOTT-ARLETA', 'MAPLEWOOD',
       'ARDENWALD-JOHNSON CREEK'], dtype=object)

In [78]:
# drop rows with coffee shops
portland_venues_low = portland_venues_low.drop(['PIEDMONT', 'WOODLAWN', 'ARBOR LODGE', 'ALAMEDA',
       'ALAMEDA/IRVINGTON COMMUNITY ASSN.', 'RUSSELL',
       'ROSEWAY/MADISON SOUTH', 'NORTH TABOR', 'NORTH TABOR',
       'SOUTH PORTLAND', 'HOMESTEAD', 'POWELLHURST-GILBERT',
       'POWELLHURST-GILBERT', 'MT. SCOTT-ARLETA', 'MAPLEWOOD',
       'ARDENWALD-JOHNSON CREEK'], axis=0)
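
#### The coffee shop step, and the bakery step later on, can be expressed as one reusable filter: find the neighborhoods that contain any unwanted category, then keep the rows from all other neighborhoods. This is a sketch under the assumption that `Neighborhood` is a regular column (for example after `reset_index()`); `drop_neighborhoods_with` is a hypothetical helper name.

```python
from typing import Iterable

import pandas as pd


def drop_neighborhoods_with(venues: pd.DataFrame, categories: Iterable[str]) -> pd.DataFrame:
    """Remove every row of any neighborhood that has a venue in `categories`."""
    has_category = venues["Venue Category"].isin(list(categories))
    # unique() collapses repeats such as a neighborhood with two coffee shops.
    excluded = venues.loc[has_category, "Neighborhood"].unique()
    return venues[~venues["Neighborhood"].isin(excluded)]
```

#### For example, `drop_neighborhoods_with(portland_venues_low.reset_index(), ['Coffee Shop', 'Bakery'])` would handle both categories in one pass. Note that it matches category names exactly, so related categories would need to be listed explicitly.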

In [79]:
portland_venues_low

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
LINNTON,45.60379,-122.793264,Subway,45.603014,-122.78859,Sandwich Place
LINNTON,45.60379,-122.793264,Shell,45.603473,-122.788861,Gas Station
LINNTON,45.60379,-122.793264,7-Eleven,45.602668,-122.788165,Convenience Store
FOREST PARK/LINNTON,45.58063,-122.781775,Forest Park Hardesty Trailhead,45.578979,-122.781064,Trail
UNIVERSITY PARK,45.576354,-122.730079,Merlo Field,45.574739,-122.727743,College Soccer Field
UNIVERSITY PARK,45.576354,-122.730079,Chiles Center,45.575108,-122.72854,College Basketball Court
UNIVERSITY PARK,45.576354,-122.730079,Student Lead Urban Garden,45.57696,-122.733625,Garden
UNIVERSITY PARK,45.576354,-122.730079,Mago Hunt Recital Hall University of Portland,45.573198,-122.727912,Theater
MC UNCLAIMED #14,45.557689,-122.727474,Kelley Imaging Systems,45.557622,-122.731581,Paper / Office Supplies Store
CULLY ASSOCIATION OF NEIGHBORS,45.563753,-122.601509,Angel Food and Fun Mexican Restaurant,45.560074,-122.600868,Mexican Restaurant


In [80]:
portland_venues_low.shape

(305, 6)

In [81]:
# display venue counts grouped by neighborhood to see how many neighborhoods I have 
venue_totals_low = portland_venues_low.groupby('Neighborhood').count()
venue_totals_low

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
ALAMEDA/BEAUMONT-WILSHIRE,4,4,4,4,4,4
ARDENWALD-JOHNSON CREEK/WOODSTOCK,6,6,6,6,6,6
ARGAY TERRACE,12,12,12,12,12,12
ARGAY/WILKES COMMUNITY GROUP,5,5,5,5,5,5
ARNOLD CREEK,2,2,2,2,2,2
ASHCREEK,6,6,6,6,6,6
ASHCREEK/CRESTWOOD,6,6,6,6,6,6
BRENTWOOD-DARLINGTON,4,4,4,4,4,4
BRIDGETON,4,4,4,4,4,4
BRIDLEMILE,3,3,3,3,3,3


In [82]:
venue_totals_low.shape

(65, 6)

In [83]:
# find bakeries. There's only 1
bakeries = portland_venues_low.loc[portland_venues_low['Venue Category'] == 'Bakery']
bakeries

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
ARDENWALD-JOHNSON CREEK/WOODSTOCK,45.462998,-122.61921,Franz Bakery Outlet,45.462788,-122.615516,Bakery


In [87]:
# remove the bakery; errors='ignore' keeps the cell from failing if it is run a second time
portland_venues_low = portland_venues_low.drop(['ARDENWALD-JOHNSON CREEK/WOODSTOCK'], axis=0, errors='ignore')

In [88]:
# Display the remaining venue counts, grouped by neighborhood
venue_totals_low = portland_venues_low.groupby('Neighborhood').count()
venue_totals_low

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
ALAMEDA/BEAUMONT-WILSHIRE,4,4,4,4,4,4
ARGAY TERRACE,12,12,12,12,12,12
ARGAY/WILKES COMMUNITY GROUP,5,5,5,5,5,5
ARNOLD CREEK,2,2,2,2,2,2
ASHCREEK,6,6,6,6,6,6
ASHCREEK/CRESTWOOD,6,6,6,6,6,6
BRENTWOOD-DARLINGTON,4,4,4,4,4,4
BRIDGETON,4,4,4,4,4,4
BRIDLEMILE,3,3,3,3,3,3
BRIDLEMILE/SOUTHWEST HILLS RESIDENTIAL LEAGUE,1,1,1,1,1,1


In [89]:
venue_totals_low.shape

(64, 6)

In [90]:
portland_venues_low

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
LINNTON,45.60379,-122.793264,Subway,45.603014,-122.78859,Sandwich Place
LINNTON,45.60379,-122.793264,Shell,45.603473,-122.788861,Gas Station
LINNTON,45.60379,-122.793264,7-Eleven,45.602668,-122.788165,Convenience Store
FOREST PARK/LINNTON,45.58063,-122.781775,Forest Park Hardesty Trailhead,45.578979,-122.781064,Trail
UNIVERSITY PARK,45.576354,-122.730079,Merlo Field,45.574739,-122.727743,College Soccer Field
UNIVERSITY PARK,45.576354,-122.730079,Chiles Center,45.575108,-122.72854,College Basketball Court
UNIVERSITY PARK,45.576354,-122.730079,Student Lead Urban Garden,45.57696,-122.733625,Garden
UNIVERSITY PARK,45.576354,-122.730079,Mago Hunt Recital Hall University of Portland,45.573198,-122.727912,Theater
MC UNCLAIMED #14,45.557689,-122.727474,Kelley Imaging Systems,45.557622,-122.731581,Paper / Office Supplies Store
CULLY ASSOCIATION OF NEIGHBORS,45.563753,-122.601509,Angel Food and Fun Mexican Restaurant,45.560074,-122.600868,Mexican Restaurant


#### I have removed busy neighborhoods and neighborhoods that already have coffee shops or bakeries. Now I would like to find neighborhoods that have at least one park.

In [91]:
# find neighborhoods with parks.
parks = portland_venues_low.loc[portland_venues_low['Venue Category'] == 'Park']
parks

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
NORTHWEST HEIGHTS,45.5403,-122.771887,Forest Heights Park,45.543284,-122.776122,Park
MADISON SOUTH,45.541545,-122.574206,Glenhaven Park,45.54379,-122.579732,Park
ARGAY/WILKES COMMUNITY GROUP,45.550552,-122.510636,Wilkes Park,45.549828,-122.504396,Park
PARKROSE HEIGHTS ASSOCIATION OF NEIGHBORS,45.540394,-122.548186,Knott Park,45.540462,-122.545425,Park
GRANT PARK,45.539315,-122.629179,Grant Park,45.539932,-122.629707,Park
HILLSIDE/NORTHWEST DISTRICT ASSN.,45.528865,-122.705431,Swift Watch,45.532694,-122.705925,Park
HILLSIDE,45.525879,-122.715946,Pittock Mansion Gate Lodge,45.524843,-122.716442,Park
HILLSIDE,45.525879,-122.715946,Forest Park - Cumberland Trailhead,45.529606,-122.714889,Park
MT. TABOR,45.51431,-122.598677,Mt. Tabor Park,45.512723,-122.59429,Park
MILL PARK,45.512087,-122.540271,Mill Park,45.51078,-122.541206,Park


In [92]:
# get list of neighborhoods with parks
parks_list = parks.index.values
parks_list

array(['NORTHWEST HEIGHTS', 'MADISON SOUTH',
       'ARGAY/WILKES COMMUNITY GROUP',
       'PARKROSE HEIGHTS ASSOCIATION OF NEIGHBORS', 'GRANT PARK',
       'HILLSIDE/NORTHWEST DISTRICT ASSN.', 'HILLSIDE', 'HILLSIDE',
       'MT. TABOR', 'MILL PARK', 'BRIDLEMILE', 'FOSTER-POWELL',
       'LENTS/POWELLHURST-GILBERT',
       'HEALY HEIGHTS/SOUTHWEST HILLS RESIDENTIAL LEAGUE',
       'CENTENNIAL COMMUNITY ASSN./PLEASANT VALLEY', 'PLEASANT VALLEY',
       'SOUTHWEST HILLS RESIDENTIAL LEAGUE', 'HAYHURST', 'EASTMORELAND',
       'BRENTWOOD-DARLINGTON', 'MC UNCLAIMED #11', 'MC UNCLAIMED #11',
       'WEST PORTLAND PARK', 'WEST PORTLAND PARK', 'KENTON',
       'EAST COLUMBIA', 'PORTSMOUTH'], dtype=object)
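
#### The array above repeats a neighborhood once for each park it contains (HILLSIDE, MC UNCLAIMED #11, and WEST PORTLAND PARK each appear twice). A minimal sketch of de-duplicating the names, using a small stand-in for `portland_venues_low`; the rows below are illustrative and not the full data.

```python
import pandas as pd

# Toy stand-in for portland_venues_low, indexed by neighborhood (illustrative rows only)
venues = pd.DataFrame(
    {
        "Venue": [
            "Grant Park",
            "Pittock Mansion Gate Lodge",
            "Forest Park - Cumberland Trailhead",
            "Subway",
        ],
        "Venue Category": ["Park", "Park", "Park", "Sandwich Place"],
    },
    index=pd.Index(
        ["GRANT PARK", "HILLSIDE", "HILLSIDE", "LINNTON"], name="Neighborhood"
    ),
)

parks = venues.loc[venues["Venue Category"] == "Park"]

# Index.unique() keeps first-seen order and drops repeated names
parks_list = parks.index.unique().tolist()
print(parks_list)  # ['GRANT PARK', 'HILLSIDE']
```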

In [95]:
# move 'Neighborhood' from the index back to a column (skipped if that has already been done)
if Portland_venues.index.name == 'Neighborhood':
    Portland_venues.reset_index(inplace=True)

In [96]:
Portland_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,LINNTON,45.60379,-122.793264,Subway,45.603014,-122.78859,Sandwich Place
1,LINNTON,45.60379,-122.793264,Shell,45.603473,-122.788861,Gas Station
2,LINNTON,45.60379,-122.793264,7-Eleven,45.602668,-122.788165,Convenience Store
3,FOREST PARK/LINNTON,45.58063,-122.781775,Forest Park Hardesty Trailhead,45.578979,-122.781064,Trail
4,CATHEDRAL PARK,45.587368,-122.757317,Cathedral Park,45.587744,-122.759822,Park


#### Get the subset of Portland_venues that only has neighborhoods that aren't busy (fewer than 15 venues), have no coffee shops or bakeries, and have at least one park

In [97]:
# keep every venue in the neighborhoods collected in parks_list above
parks = Portland_venues[Portland_venues['Neighborhood'].isin(parks_list)]
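
#### The three criteria in the heading above could also be applied in one pass with `groupby(...).filter(...)`. This is only a sketch on made-up rows: the 15-venue threshold comes from the heading, while the helper name `is_candidate` and the category label `'Coffee Shop'` are assumptions that may not match the actual venue data.

```python
import pandas as pd

MAX_VENUES = 15                          # "not busy" threshold from the heading above
COMPETITORS = {"Coffee Shop", "Bakery"}  # assumed category labels for competing shops


def is_candidate(group: pd.DataFrame) -> bool:
    """True if a neighborhood is quiet, has no competitor, and has a park."""
    categories = set(group["Venue Category"])
    return (
        len(group) < MAX_VENUES
        and not (categories & COMPETITORS)
        and "Park" in categories
    )


# Toy stand-in for Portland_venues (neighborhoods A, B, C are placeholders)
venues = pd.DataFrame(
    {
        "Neighborhood": ["A", "A", "B", "B", "C"],
        "Venue Category": ["Park", "Pizza Place", "Park", "Bakery", "Gas Station"],
    }
)

# filter() keeps all rows of every group for which is_candidate returns True
candidates = venues.groupby("Neighborhood").filter(is_candidate)
candidate_names = candidates["Neighborhood"].unique().tolist()
print(candidate_names)  # ['A']
```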


In [98]:
parks

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
417,NORTHWEST HEIGHTS,45.5403,-122.771887,Forest Heights Park,45.543284,-122.776122,Park
418,NORTHWEST HEIGHTS,45.5403,-122.771887,Dinner,45.537473,-122.77673,Cafeteria
419,NORTHWEST HEIGHTS,45.5403,-122.771887,Ouickie,45.537483,-122.776815,Bridal Shop
449,MADISON SOUTH,45.541545,-122.574206,The Lumberyard,45.541498,-122.577589,Bike Shop
450,MADISON SOUTH,45.541545,-122.574206,Phở Oregon,45.540347,-122.578717,Vietnamese Restaurant
451,MADISON SOUTH,45.541545,-122.574206,Mekong Bistro,45.544365,-122.578317,Cambodian Restaurant
452,MADISON SOUTH,45.541545,-122.574206,Glenhaven Park,45.54379,-122.579732,Park
453,MADISON SOUTH,45.541545,-122.574206,Pulehu Pizza,45.541233,-122.577214,Pizza Place
454,MADISON SOUTH,45.541545,-122.574206,Pub @ the Yard,45.541435,-122.577363,Pub
455,ARGAY/WILKES COMMUNITY GROUP,45.550552,-122.510636,Round table pizza,45.554574,-122.510421,Pizza Place


In [99]:
# Display the remaining venue counts, grouped by neighborhood
parks_total = parks.groupby('Neighborhood').count()
parks_total

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
ARGAY/WILKES COMMUNITY GROUP,5,5,5,5,5,5
BRENTWOOD-DARLINGTON,4,4,4,4,4,4
BRIDLEMILE,3,3,3,3,3,3
CENTENNIAL COMMUNITY ASSN./PLEASANT VALLEY,7,7,7,7,7,7
EAST COLUMBIA,2,2,2,2,2,2
EASTMORELAND,1,1,1,1,1,1
FOSTER-POWELL,5,5,5,5,5,5
GRANT PARK,4,4,4,4,4,4
HAYHURST,3,3,3,3,3,3
HEALY HEIGHTS/SOUTHWEST HILLS RESIDENTIAL LEAGUE,2,2,2,2,2,2


In [100]:
parks_total.shape

(24, 6)
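
#### `groupby(...).count()` repeats the same number in every column because it counts non-null values column by column. If only the number of venues per neighborhood is needed, `size()` returns a single Series instead. A small sketch on placeholder rows (the venue names are made up):

```python
import pandas as pd

# Placeholder rows standing in for the parks dataframe
venues = pd.DataFrame(
    {
        "Neighborhood": ["GRANT PARK", "GRANT PARK", "EASTMORELAND"],
        "Venue": ["Venue 1", "Venue 2", "Venue 3"],
    }
)

# One number per neighborhood instead of one identical number per column
venue_counts = venues.groupby("Neighborhood").size().rename("Venue Count")
print(venue_counts.sort_values(ascending=False))

# The length of the Series is the number of distinct neighborhoods
print(len(venue_counts))
```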

In [103]:
parks = parks.set_index("Neighborhood")

#### Now that I've narrowed my list down to 24 neighborhoods, I looked at them individually. I need to drop a few more neighborhoods for special cases, e.g. one whose venues all belong to a zoo, one with a food truck that is a coffee shop, one with a Freightliner dealership, one with a museum, and one with a big intersection/construction site.

In [104]:
# drop the special-case neighborhoods described above
final_neighborhoods = parks.drop(['HILLSIDE', 'SOUTHWEST HILLS RESIDENTIAL LEAGUE',
                                  'KENTON', 'EAST COLUMBIA', 'MC UNCLAIMED #11',
                                  'PLEASANT VALLEY', 'HILLSIDE/NORTHWEST DISTRICT ASSN.'], axis=0)
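
#### One way to keep the reason for each manual exclusion next to the neighborhood name is a small dictionary. A sketch on toy data: the neighborhood names and the pairing of names with reasons are placeholders, not the actual mapping for the neighborhoods dropped above.

```python
import pandas as pd

# Hypothetical names and reasons, for illustration only
exclusions = {
    "NEIGHBORHOOD X": "all venues belong to a zoo",
    "NEIGHBORHOOD Y": "food truck that is a coffee shop",
}

# Toy stand-in for the parks dataframe, indexed by neighborhood
parks = pd.DataFrame(
    {"Venue Category": ["Zoo Exhibit", "Food Truck", "Park"]},
    index=pd.Index(
        ["NEIGHBORHOOD X", "NEIGHBORHOOD Y", "NEIGHBORHOOD Z"], name="Neighborhood"
    ),
)

# errors="ignore" lets the cell be re-run after the rows are already gone
final_neighborhoods = parks.drop(index=list(exclusions), errors="ignore")
remaining = final_neighborhoods.index.tolist()
print(remaining)  # ['NEIGHBORHOOD Z']
```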

In [105]:
final_neighborhoods

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
NORTHWEST HEIGHTS,45.5403,-122.771887,Forest Heights Park,45.543284,-122.776122,Park
NORTHWEST HEIGHTS,45.5403,-122.771887,Dinner,45.537473,-122.77673,Cafeteria
NORTHWEST HEIGHTS,45.5403,-122.771887,Ouickie,45.537483,-122.776815,Bridal Shop
MADISON SOUTH,45.541545,-122.574206,The Lumberyard,45.541498,-122.577589,Bike Shop
MADISON SOUTH,45.541545,-122.574206,Phở Oregon,45.540347,-122.578717,Vietnamese Restaurant
MADISON SOUTH,45.541545,-122.574206,Mekong Bistro,45.544365,-122.578317,Cambodian Restaurant
MADISON SOUTH,45.541545,-122.574206,Glenhaven Park,45.54379,-122.579732,Park
MADISON SOUTH,45.541545,-122.574206,Pulehu Pizza,45.541233,-122.577214,Pizza Place
MADISON SOUTH,45.541545,-122.574206,Pub @ the Yard,45.541435,-122.577363,Pub
ARGAY/WILKES COMMUNITY GROUP,45.550552,-122.510636,Round table pizza,45.554574,-122.510421,Pizza Place


#### I've narrowed it down to 17 neighborhoods (down from 130)

In [106]:
# Display the remaining venue counts, grouped by neighborhood
final_neighborhoods_grouped = final_neighborhoods.groupby('Neighborhood').count()
final_neighborhoods_grouped 

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
ARGAY/WILKES COMMUNITY GROUP,5,5,5,5,5,5
BRENTWOOD-DARLINGTON,4,4,4,4,4,4
BRIDLEMILE,3,3,3,3,3,3
CENTENNIAL COMMUNITY ASSN./PLEASANT VALLEY,7,7,7,7,7,7
EASTMORELAND,1,1,1,1,1,1
FOSTER-POWELL,5,5,5,5,5,5
GRANT PARK,4,4,4,4,4,4
HAYHURST,3,3,3,3,3,3
HEALY HEIGHTS/SOUTHWEST HILLS RESIDENTIAL LEAGUE,2,2,2,2,2,2
LENTS/POWELLHURST-GILBERT,8,8,8,8,8,8


### Create a final map showing ideal locations for a bakery/coffee shop

In [108]:
# get list of the final candidate neighborhoods
final_neighborhoods_list = final_neighborhoods_grouped.index.values
final_neighborhoods_list

array(['ARGAY/WILKES COMMUNITY GROUP', 'BRENTWOOD-DARLINGTON',
       'BRIDLEMILE', 'CENTENNIAL COMMUNITY ASSN./PLEASANT VALLEY',
       'EASTMORELAND', 'FOSTER-POWELL', 'GRANT PARK', 'HAYHURST',
       'HEALY HEIGHTS/SOUTHWEST HILLS RESIDENTIAL LEAGUE',
       'LENTS/POWELLHURST-GILBERT', 'MADISON SOUTH', 'MILL PARK',
       'MT. TABOR', 'NORTHWEST HEIGHTS',
       'PARKROSE HEIGHTS ASSOCIATION OF NEIGHBORS', 'PORTSMOUTH',
       'WEST PORTLAND PARK'], dtype=object)

In [110]:
df.head()

Unnamed: 0,NAME,Latitude,Longitude
0,LINNTON,45.60379,-122.793264
1,FOREST PARK/LINNTON,45.58063,-122.781775
2,FOREST PARK,45.564383,-122.792078
3,CATHEDRAL PARK,45.587368,-122.757317
4,UNIVERSITY PARK,45.576354,-122.730079


#### I need a dataframe with latitude and longitude for each chosen neighborhood, so I went back to the original list (df) and took the subset containing my chosen neighborhoods

In [113]:
# reuse the list built above; .copy() avoids a SettingWithCopyWarning on later in-place edits
df_final = df[df['NAME'].isin(final_neighborhoods_list)].copy()
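
#### Since every venue row already carries its neighborhood's centroid, the coordinates could also be taken from `final_neighborhoods` itself rather than from `df`. A sketch, assuming the column names shown in the tables above and reusing a few rows from them:

```python
import pandas as pd

# A few rows copied from the final_neighborhoods output shown earlier
final_neighborhoods = pd.DataFrame(
    {
        "Neighborhood Latitude": [45.5403, 45.5403, 45.541545],
        "Neighborhood Longitude": [-122.771887, -122.771887, -122.574206],
        "Venue": ["Forest Heights Park", "Dinner", "Glenhaven Park"],
    },
    index=pd.Index(
        ["NORTHWEST HEIGHTS", "NORTHWEST HEIGHTS", "MADISON SOUTH"],
        name="Neighborhood",
    ),
)

centroid_columns = ["Neighborhood", "Neighborhood Latitude", "Neighborhood Longitude"]
new_names = {
    "Neighborhood": "NAME",
    "Neighborhood Latitude": "Latitude",
    "Neighborhood Longitude": "Longitude",
}

# One row per neighborhood: keep the centroid columns and drop repeated rows
df_final = (
    final_neighborhoods.reset_index()[centroid_columns]
    .drop_duplicates()
    .rename(columns=new_names)
    .reset_index(drop=True)
)
print(df_final)
```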

In [114]:
df_final

Unnamed: 0,NAME,Latitude,Longitude
23,NORTHWEST HEIGHTS,45.5403,-122.771887
25,MADISON SOUTH,45.541545,-122.574206
26,ARGAY/WILKES COMMUNITY GROUP,45.550552,-122.510636
33,PARKROSE HEIGHTS ASSOCIATION OF NEIGHBORS,45.540394,-122.548186
39,GRANT PARK,45.539315,-122.629179
63,MT. TABOR,45.51431,-122.598677
71,MILL PARK,45.512087,-122.540271
81,BRIDLEMILE,45.491312,-122.726829
84,FOSTER-POWELL,45.49239,-122.590429
86,LENTS/POWELLHURST-GILBERT,45.491147,-122.55498


#### Reset the index

In [116]:
df_final.reset_index(drop=True, inplace=True)
df_final

Unnamed: 0,NAME,Latitude,Longitude
0,NORTHWEST HEIGHTS,45.5403,-122.771887
1,MADISON SOUTH,45.541545,-122.574206
2,ARGAY/WILKES COMMUNITY GROUP,45.550552,-122.510636
3,PARKROSE HEIGHTS ASSOCIATION OF NEIGHBORS,45.540394,-122.548186
4,GRANT PARK,45.539315,-122.629179
5,MT. TABOR,45.51431,-122.598677
6,MILL PARK,45.512087,-122.540271
7,BRIDLEMILE,45.491312,-122.726829
8,FOSTER-POWELL,45.49239,-122.590429
9,LENTS/POWELLHURST-GILBERT,45.491147,-122.55498


In [115]:
# create final map of narrowed down Portland neighborhoods using latitude and longitude values
m_final = folium.Map(location=[45.5236, -122.6750], zoom_start=12)

# add markers to map
for lat, lng, label in zip(df_final['Latitude'], df_final['Longitude'], df_final['NAME']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(m_final)

m_final
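
#### The map above is centered on fixed downtown coordinates. As an optional refinement, the view can be fitted to the markers themselves. A sketch, assuming `df_final` has the `Latitude` and `Longitude` columns shown earlier; three rows from that table are reused here, the name `m_fitted` is made up, and the folium import is guarded so the bounds calculation still runs where folium is not installed.

```python
import pandas as pd

# Three rows copied from the df_final output shown earlier
df_final = pd.DataFrame(
    {
        "NAME": ["NORTHWEST HEIGHTS", "MADISON SOUTH", "BRIDLEMILE"],
        "Latitude": [45.5403, 45.541545, 45.491312],
        "Longitude": [-122.771887, -122.574206, -122.726829],
    }
)

# South-west and north-east corners of the box containing every centroid
south_west = [float(df_final["Latitude"].min()), float(df_final["Longitude"].min())]
north_east = [float(df_final["Latitude"].max()), float(df_final["Longitude"].max())]

try:
    import folium
except ImportError:
    folium = None

if folium is not None:
    m_fitted = folium.Map(location=[45.5236, -122.6750], zoom_start=12)
    for lat, lng, name in zip(df_final["Latitude"], df_final["Longitude"], df_final["NAME"]):
        folium.CircleMarker(
            [lat, lng], radius=5, popup=name, color="blue", fill=True
        ).add_to(m_fitted)
    # Adjust the view so that all markers are visible
    m_fitted.fit_bounds([south_west, north_east])
```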