# Capstone Project

# Portland, Oregon Microbrewery Market (week 2)

### Applied Data Science Capstone for IBM/Coursera


## Table of Contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)


## Introduction: Business Problem <a name="introduction"></a>

Portland, the largest city in Oregon, is located in the northwest section of the state at the junction of the Willamette and Columbia rivers. It is a growing, attractive metropolis with a diverse economy, and the region continues to attract entrepreneurs, innovators and IT professionals.

Microbrewers have found the culture of Portland extremely friendly to their craft brews. In fact, the Oregon Craft Beer Association claims that **Portland**, with 117 breweries in the greater metro area as of 2018, **has more breweries than any other city in the world** (https://oregoncraftbeer.org/facts/).

The focus of this project is to discover whether or not any locations exist in Portland that are not oversaturated with microbreweries. This report will target specifically entrepreneurs interested in opening a **microbrewery** in **Portland**, Oregon.

Since Portland has more microbreweries than any other city in the world, we will look for **locations with no microbrewery in the vicinity**. We will interpret this to mean no other microbrewery within 500 meters, or approximately a 5-minute walk. We'll be **looking as close as possible to the center of Portland**. Additionally, we'll **exclude the suburbs** from our search because they are largely residential areas with fewer visitors and fewer businesses and industries. We'll also exclude other types of restaurants from our search, as patrons of microbreweries tend to choose among the microbreweries available rather than between a microbrewery and another type of restaurant.

##  Data <a name="data"></a>

Given how we've framed the problem the following factors will influence our decision:

  * The number of microbreweries centered around downtown Portland.
  * The distance between each microbrewery and its next closest neighbor.

The required information will be extracted from the following data sources:

  * Coordinates of downtown Portland and distances between microbreweries will be obtained using **GeoPy** (its Nominatim geocoder and its distance module).
  * The names, addresses and coordinates of microbreweries will be obtained through **Foursquare API**.

### Downtown Portland

We'll start by finding a latitude and longitude for downtown Portland using GeoPy's Nominatim to convert an address into coordinates.

First, we load the Nominatim function:

In [1]:
from geopy.geocoders import Nominatim # convert address into latitude and longitude

Then we run the function for the address 'Portland, OR'.

In [2]:
address = 'Portland, OR'

geolocator = Nominatim(user_agent="foursquare_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Portland are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Portland are 45.5202471, -122.6741949.
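A geocoder's `geocode` call returns `None` when it finds no match, in which case `location.latitude` above would raise an `AttributeError`. The following is a minimal sketch of a guard around that step; the helper name `coordinates_or_raise` is hypothetical and not part of the original notebook.

```python
def coordinates_or_raise(geocoder, address):
    """Return (latitude, longitude) for an address, or raise if nothing is found.

    `geocoder` is any object with a geopy-style .geocode(address) method.
    The helper name is illustrative only.
    """
    location = geocoder.geocode(address)
    if location is None:
        raise ValueError(f"No geocoding result for {address!r}")
    return location.latitude, location.longitude

# Possible usage with the geolocator defined above:
# latitude, longitude = coordinates_or_raise(geolocator, 'Portland, OR')
```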


Next we import the Folium library for map construction and create a map of Portland using the coordinates we derived in the last step.

In [3]:
# map construction library

import folium

In [4]:
# create map of Portland using latitude and longitude values

map_portland = folium.Map(location=[latitude, longitude], zoom_start = 13) 

map_portland

Now that we have a map centered on downtown Portland let's procure the data on microbreweries from Foursquare.

### Foursquare API



First we define our Foursquare credentials:

In [5]:
CLIENT_ID ="3U5ZTVO5FPT4Z1JJWRMPOCW1XG2DNSI3OR4JKIQFNIIDUZNC"
CLIENT_SECRET ="TQ5QY3SW5QTRW10SO4IGDLIAPO45IEUUHIIDENMJGQHNPEMR"
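Credentials written directly into a notebook are visible to anyone who reads it. One common alternative, sketched here under the assumption that the keys are stored in environment variables named `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` (hypothetical names chosen for this example), is to read them at run time:

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read the Foursquare keys from a mapping (by default the process environment).

    The variable names are assumptions made for this sketch.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(
            f"Set the environment variable {missing.args[0]} before running"
        ) from None

# Possible usage:
# CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()
```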

Next we define the other parameters needed to extract the microbrewery data, including version, radius and limit. Microbreweries have their own category id, so we'll use it to extract data on only microbreweries.

In [34]:
Version = '20180605'

categoryId = '50327c8591d4c4b30a586d5d'

radius = 6000

limit = 100

data = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&categoryId={}&v={}&radius={}&limit={}'\
.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, categoryId, Version, radius, limit)

data

'https://api.foursquare.com/v2/venues/search?client_id=3U5ZTVO5FPT4Z1JJWRMPOCW1XG2DNSI3OR4JKIQFNIIDUZNC&client_secret=TQ5QY3SW5QTRW10SO4IGDLIAPO45IEUUHIIDENMJGQHNPEMR&ll=45.5202471,-122.6741949&categoryId=50327c8591d4c4b30a586d5d&v=20180605&radius=6000&limit=100'
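The same request can also be described as a dictionary of query parameters instead of one long formatted string, which keeps each parameter visible on its own line. This is a sketch only: the function name `build_search_params` and the placeholder keys are hypothetical, while the endpoint and parameter names are the ones used in the URL above.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.foursquare.com/v2/venues/search"

def build_search_params(client_id, client_secret, latitude, longitude,
                        category_id, version="20180605", radius=6000, limit=100):
    """Collect the query parameters used in the search URL above."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": f"{latitude},{longitude}",
        "categoryId": category_id,
        "v": version,
        "radius": radius,
        "limit": limit,
    }

# Placeholder credentials; substitute real values before sending a request.
params = build_search_params("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET",
                             45.5202471, -122.6741949,
                             "50327c8591d4c4b30a586d5d")

# urlencode shows the query string that would be sent (the comma in ll is percent-encoded).
print(SEARCH_URL + "?" + urlencode(params))
```

With the Requests module, a dictionary like this can be passed as `requests.get(SEARCH_URL, params=params)`, which builds the query string itself.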

Let's import the Requests module and start unpacking our data.

In [35]:
import requests

results = requests.get(data).json()

results

{'meta': {'code': 200, 'requestId': '5e8ff8d89da7ee001b40a948'},
 'response': {'venues': [{'id': '48837e4ef964a52036511fe3',
    'name': 'Deschutes Brewery Portland Public House',
    'location': {'address': '210 NW 11th Ave',
     'crossStreet': 'at NW Davis St',
     'lat': 45.524544086316254,
     'lng': -122.68198230367531,
     'labeledLatLngs': [{'label': 'display',
       'lat': 45.524544086316254,
       'lng': -122.68198230367531}],
     'distance': 773,
     'postalCode': '97209',
     'cc': 'US',
     'city': 'Portland',
     'state': 'OR',
     'country': 'United States',
     'formattedAddress': ['210 NW 11th Ave (at NW Davis St)',
      'Portland, OR 97209',
      'United States']},
    'categories': [{'id': '50327c8591d4c4b30a586d5d',
      'name': 'Brewery',
      'pluralName': 'Breweries',
      'shortName': 'Brewery',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/brewery_',
       'suffix': '.png'},
      'primary': True}],
    'venuePage': {'i

Among other things, we get a name, address, latitude and longitude for each microbrewery.
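Before unpacking the venues it can help to confirm that the request succeeded. The sketch below relies only on the `meta.code` and `response.venues` fields visible in the output above; the helper name `extract_venues` is hypothetical.

```python
def extract_venues(payload):
    """Return the list of venues from a Foursquare search payload.

    Raises if the status code reported in payload['meta'] is not 200.
    """
    code = payload.get("meta", {}).get("code")
    if code != 200:
        raise RuntimeError(f"Foursquare request failed with code {code}")
    return payload["response"]["venues"]

# Possible usage with the results dictionary above:
# brew = extract_venues(results)
```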

Next we convert the data into a dataframe. We start by importing the json_normalize function we'll use to convert the data to table form.

In [36]:
from pandas import json_normalize # top-level import location in pandas 1.0 and later

We set a variable equal to the segments of the Foursquare data that contain the information we want. We transform that to a pandas dataframe then examine the first few rows of the dataframe to see what we have.

In [37]:
brew = results['response']['venues']

brew_df = json_normalize(brew)

brew_df.head()

Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.crossStreet,location.lat,location.lng,location.labeledLatLngs,...,location.country,location.formattedAddress,venuePage.id,delivery.id,delivery.url,delivery.provider.name,delivery.provider.icon.prefix,delivery.provider.icon.sizes,delivery.provider.icon.name,location.neighborhood
0,48837e4ef964a52036511fe3,Deschutes Brewery Portland Public House,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1586493668,False,210 NW 11th Ave,at NW Davis St,45.524544,-122.681982,"[{'label': 'display', 'lat': 45.52454408631625...",...,United States,"[210 NW 11th Ave (at NW Davis St), Portland, O...",36780574.0,,,,,,,
1,5e486a88fda02b000899b294,Gorges Beer: The Trailhead,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1586493668,False,2705 SE Ankeny St,,45.522343,-122.638284,"[{'label': 'display', 'lat': 45.522343, 'lng':...",...,United States,"[2705 SE Ankeny St, Portland, OR 97214, United...",,,,,,,,
2,5cbf7b91b399f7002c94b190,Away Days Brewing,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1586493668,False,1516 SE 10th Ave,,45.511986,-122.655981,"[{'label': 'display', 'lat': 45.51198593825523...",...,United States,"[1516 SE 10th Ave, Portland, OR 97214, United ...",,1406736.0,https://www.grubhub.com/restaurant/away-days-b...,grubhub,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_grubhub_20180129.png,
3,5b37dc162a7ab6002ca58d9f,Baerlic Brewing Beer Hall at the Barley Pod,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1586493668,False,6035 NE Halsey St,at NE 60th Ave,45.533899,-122.601326,"[{'label': 'display', 'lat': 45.53389890556244...",...,United States,"[6035 NE Halsey St (at NE 60th Ave), Portland,...",,,,,,,,
4,5aadc02ee07550183870d5ee,10 Barrel,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1586493668,False,1 N Center Court St,,45.531254,-122.667361,"[{'label': 'display', 'lat': 45.531254, 'lng':...",...,United States,"[1 N Center Court St, Portland, OR 97227, Unit...",,,,,,,,


In [38]:
brew_df.shape

(50, 25)

We won't need all those columns to plot the microbrewery locations on a map. Let's select only the columns we'll need and drop the `location.` prefix from the column headings.

In [39]:
brew_df = brew_df[['name', 'location.address', 'location.lat', 'location.lng']]

brew_df.columns = [column.split('.')[-1] for column in brew_df.columns]

brew_df.head()

| | name | address | lat | lng |
|---|---|---|---|---|
| 0 | Deschutes Brewery Portland Public House | 210 NW 11th Ave | 45.524544 | -122.681982 |
| 1 | Gorges Beer: The Trailhead | 2705 SE Ankeny St | 45.522343 | -122.638284 |
| 2 | Away Days Brewing | 1516 SE 10th Ave | 45.511986 | -122.655981 |
| 3 | Baerlic Brewing Beer Hall at the Barley Pod | 6035 NE Halsey St | 45.533899 | -122.601326 |
| 4 | 10 Barrel | 1 N Center Court St | 45.531254 | -122.667361 |


That's looking more like something we can work with. Let's check to see if we have empty cells or missing data.

In [40]:
brew_df.isnull().sum(axis = 0)

name       0
address    1
lat        0
lng        0
dtype: int64

Looks like we're missing just one address. We'll leave that one cell empty, as the lat and lng are the critical information for our mapping project.

Let's look at the data type to see if everything is in shape for mapping and calculations.

In [41]:
brew_df.dtypes

name        object
address     object
lat        float64
lng        float64
dtype: object

We need to convert the address column to string for use as labels in our map. The other data types look good for what we need.

In [14]:
brew_df.address = brew_df.address.astype(str)

Now we can plot our microbreweries on our Portland map and see where we are so far.

In [42]:
map_portland = folium.Map(location=[latitude, longitude], zoom_start = 11)

# add a red circle marker to represent downtown portland

folium.features.CircleMarker(
    [latitude, longitude],
    radius = 10,
    color ='red',
    popup ='PDX Downtown',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.5
).add_to(map_portland)

# add the micro breweries as blue circle markers

for lat, lng, label in zip(brew_df.lat, brew_df.lng, brew_df.address):
    folium.features.CircleMarker(
        [lat, lng],
        radius = 5,
        color ='blue',
        popup = folium.Popup(label),
        fill = True,
        fill_color = 'blue',
        fill_opacity = 0.5
    ).add_to(map_portland)

map_portland

Great. Now we have the microbreweries Foursquare returned for a 6 km radius around downtown Portland. We have the data we need and it's ready for analysis to create our report on prime locations for a new microbrewery!

## Methodology <a name="methodology"></a>

For this project we're focusing on finding areas in Portland where the density of microbreweries is low. In particular, we're looking for areas where we can locate a new microbrewery more than 500 meters from a competitor. We're making the assumption that if an existing microbrewery is at least 500 meters from its nearest competitor then there is likely to be space for a new microbrewery.

In the first step we collected the data we need for our analysis: the latitude and longitude of every microbrewery within 6km of downtown Portland (according to the Foursquare microbrewery category).

The next step for our analysis is calculating the distances between our microbreweries and downtown Portland, as well as the distance from each microbrewery to its nearest competitor. We'll use that data to segment our microbreweries into clusters (using k-means clustering) so we can determine what regions of the city might be more promising for a new microbrewery.

Our last step will focus on the most promising areas based on our k-means clustering analysis. We'll zero in on areas that meet requirements agreed to in discussion with our microbrewery entrepreneur. This will include areas where we are likely to find suitable and cost-effective locations that are more than 500 meters from any competitor. We'll prepare maps and tables of the locations that our client can use as a starting point for an 'on-the-ground' search of potential site locations.

## Analysis <a name="analysis"></a>

We want to be close enough to downtown Portland that our microbrewery is a viable option for travelers staying downtown, but not so close that we have multiple competitors less than 500 meters from us. So, let's calculate each microbrewery's distance from downtown. We'll use this data in our k-means analysis later on.

We'll start by importing GeoPy's distance module. Next, we'll store the coordinates for downtown Portland (PDX) that we obtained earlier.

In [16]:
from geopy import distance

PDX = (45.5202471, -122.6741949)

To calculate the distances from the microbreweries to the center of Portland, we set up a *for loop*. In the *for loop* we run the distance function for each pair of latitude and longitude. Then, we save the calculation to a list. Finally, we print the list to confirm our results.

In [43]:
dist_PDX = []

for i in range(len(brew_df['lat'])):
    loc_i = (brew_df['lat'][i], brew_df['lng'][i])
    
    dist = round((distance.great_circle(PDX, loc_i).km), 1)
    
    dist_PDX.append(dist)
    
print(dist_PDX)

[0.8, 2.8, 1.7, 5.9, 1.3, 5.5, 5.6, 5.9, 3.0, 4.9, 4.0, 3.5, 1.6, 4.6, 3.0, 1.0, 0.8, 1.4, 0.8, 2.0, 3.9, 2.1, 5.9, 0.8, 1.5, 1.4, 1.9, 1.8, 2.6, 5.3, 1.1, 6.2, 6.4, 1.1, 1.2, 2.2, 1.5, 2.0, 2.3, 3.3, 2.2, 4.8, 5.4, 3.5, 4.1, 4.8, 1.7, 2.6, 2.3, 0.9]
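GeoPy's `great_circle` treats the Earth as a sphere. To make the calculation less of a black box, here is a minimal sketch of the haversine formula for the same kind of distance. The radius constant of 6371.009 km is an assumption intended to mirror GeoPy's spherical model, and the function name is illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.009  # mean Earth radius; assumed to match geopy's spherical model

def great_circle_km(point_a, point_b, radius_km=EARTH_RADIUS_KM):
    """Haversine distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lng1 = map(math.radians, point_a)
    lat2, lng2 = map(math.radians, point_b)
    half_chord = (math.sin((lat2 - lat1) / 2) ** 2
                  + math.cos(lat1) * math.cos(lat2)
                  * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(half_chord))

PDX = (45.5202471, -122.6741949)
deschutes = (45.524544086316254, -122.68198230367531)

# Rounds to 0.8 km, in line with the first entry of the list above.
print(round(great_circle_km(PDX, deschutes), 1))
```

The unrounded result is about 0.77 km, close to the 773 meter `distance` value Foursquare reported for this venue.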


We also want to know the distances between our microbreweries. We're looking for the closest competition for each microbrewery, noting especially if the competitor is within 500 meters of the microbrewery.

To do this we set up two *for loops*. The first *for loop* is the base microbrewery we compare to all the other locations. Inside of it we set up a second *for loop* to iterate through the locations of all the microbreweries and run the distance function.

We don't want to run the distance function for a microbrewery on itself, so we set up an *if* clause in the second *for loop* to check for this condition. Each time the second *for loop* finishes it adds the minimum distance found (one reason to bypass running the distance function for a microbrewery on itself) to a list. Finally, we print the list to confirm the results.

In [44]:
nnbrew = []

for i in range(len(brew_df['lat'])):
    # base microbrewery to compare against all the others
    loc_i = (brew_df['lat'][i], brew_df['lng'][i])
    
    nnp = []
               
    for j in range(len(brew_df['lat'])):
        # skip comparing a microbrewery with itself (distance 0)
        if j != i:
            loc_j = (brew_df['lat'][j], brew_df['lng'][j])
            rng = round((distance.great_circle(loc_i, loc_j).km), 2)
            nnp.append(rng)
    
    # keep only the distance to the nearest rival
    nnbrew.append(min(nnp))
    
print(nnbrew)

[0.05, 0.46, 0.47, 0.02, 0.41, 1.26, 0.92, 0.02, 0.46, 1.09, 0.96, 0.7, 0.7, 1.1, 0.25, 0.04, 0.05, 0.37, 0.49, 0.36, 0.96, 0.59, 1.29, 0.55, 0.04, 0.08, 0.38, 0.39, 0.76, 1.22, 1.1, 1.6, 1.84, 0.04, 0.22, 0.38, 0.04, 0.45, 0.36, 0.25, 0.56, 0.92, 1.09, 1.16, 0.75, 0.75, 0.38, 0.64, 0.38, 0.2]
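The nested loops above can also be written as a small reusable function. This is a sketch rather than the notebook's own code: the function name is hypothetical, and the toy example uses planar points with Euclidean distance purely so it runs without GeoPy.

```python
import math

def nearest_neighbor_distances(points, dist_fn):
    """For each point, return the distance to its closest other point."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    return [
        min(dist_fn(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]

# Planar toy points, with Euclidean distance standing in for great_circle
toy_points = [(0, 0), (3, 4), (3, 5)]
print(nearest_neighbor_distances(toy_points, math.dist))  # [5.0, 1.0, 1.0]
```

For the microbrewery data, `dist_fn` could be `lambda a, b: distance.great_circle(a, b).km` with `points` built from the `lat` and `lng` columns.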


Those distances look accurate.

Next we want to add our distance calculations to our microbrewery dataframe. We'll do that by adding two new columns to the end of the dataframe. Then we'll view our progress.

In [45]:
brew_df['Dis from PDX Center'] = dist_PDX

brew_df['Closest Rival'] = nnbrew

brew_df.style

#print(brew_df.head())

| | name | address | lat | lng | Dis from PDX Center | Closest Rival |
|---|---|---|---|---|---|---|
| 0 | Deschutes Brewery Portland Public House | 210 NW 11th Ave | 45.5245 | -122.682 | 0.8 | 0.05 |
| 1 | Gorges Beer: The Trailhead | 2705 SE Ankeny St | 45.5223 | -122.638 | 2.8 | 0.46 |
| 2 | Away Days Brewing | 1516 SE 10th Ave | 45.512 | -122.656 | 1.7 | 0.47 |
| 3 | Baerlic Brewing Beer Hall at the Barley Pod | 6035 NE Halsey St | 45.5339 | -122.601 | 5.9 | 0.02 |
| 4 | 10 Barrel | 1 N Center Court St | 45.5313 | -122.667 | 1.3 | 0.41 |
| 5 | Leikam Tap Room | 5812 E Burnside St | 45.5226 | -122.604 | 5.5 | 1.26 |
| 6 | Look Long Brewing Company | | 45.5706 | -122.682 | 5.6 | 0.92 |
| 7 | The Barley Pod | 6035 NE Halsey St | 45.5338 | -122.602 | 5.9 | 0.02 |
| 8 | Migration Brewing | 2828 NE Glisan St | 45.5263 | -122.636 | 3.0 | 0.46 |
| 9 | Great Notion Brewing | 2204 NE Alberta St #101 | 45.5589 | -122.643 | 4.9 | 1.09 |


Nice.

The distance calculations will be useful for partitioning our microbreweries through k-means clustering. We'll start by importing the KMeans algorithm.

In [21]:
from sklearn.cluster import KMeans

Next we'll drop the columns we don't need to run the algorithm and store the truncated dataframe in a variable. Then we'll run the algorithm on that variable. Based on inspection of the microbrewery map, fewer clusters seems better than more clusters. 

We're looking for real estate where we could locate a microbrewery close to downtown but at least 500 meters from the nearest competitor. Whether the distance is 500 meters or 750 meters is really moot, so partitioning our data into lots of groups is overkill. We'll set our clusters, then, to three.

Finally, we view the first ten labels generated by the algorithm.

In [46]:
# keep only the two distance columns as clustering features

PDX_clusters = brew_df.drop(columns=['name', 'address', 'lat', 'lng'])

# run k-means clustering
kmeans = KMeans(n_clusters = 3, random_state = 0).fit(PDX_clusters)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([1, 0, 1, 2, 1, 2, 2, 2, 0, 2])
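K-means groups points by Euclidean distance, so a feature with a wider range has more influence on the clusters. Here the distance from downtown spans roughly 0.8 to 6.4 km, while the distance to the closest rival spans 0.02 to 1.84 km. One optional variation, which is not part of the original analysis, would be to standardize each column before clustering. A minimal sketch, with a hypothetical helper name:

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # a constant column carries no information for clustering
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# First five 'Dis from PDX Center' values from the table above
scaled = standardize([0.8, 2.8, 1.7, 5.9, 1.3])
print([round(z, 2) for z in scaled])
```

Whether scaling is desirable depends on the goal: leaving the columns unscaled, as the notebook does, lets distance from downtown drive the grouping.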

Now we'll add those labels to our main dataframe and inspect the results.

In [47]:
brew_df.insert(0, 'labels', kmeans.labels_)

brew_df

| | labels | name | address | lat | lng | Dis from PDX Center | Closest Rival |
|---|---|---|---|---|---|---|---|
| 0 | 1 | Deschutes Brewery Portland Public House | 210 NW 11th Ave | 45.524544 | -122.681982 | 0.8 | 0.05 |
| 1 | 0 | Gorges Beer: The Trailhead | 2705 SE Ankeny St | 45.522343 | -122.638284 | 2.8 | 0.46 |
| 2 | 1 | Away Days Brewing | 1516 SE 10th Ave | 45.511986 | -122.655981 | 1.7 | 0.47 |
| 3 | 2 | Baerlic Brewing Beer Hall at the Barley Pod | 6035 NE Halsey St | 45.533899 | -122.601326 | 5.9 | 0.02 |
| 4 | 1 | 10 Barrel | 1 N Center Court St | 45.531254 | -122.667361 | 1.3 | 0.41 |
| 5 | 2 | Leikam Tap Room | 5812 E Burnside St | 45.522624 | -122.604116 | 5.5 | 1.26 |
| 6 | 2 | Look Long Brewing Company | | 45.57057 | -122.681785 | 5.6 | 0.92 |
| 7 | 2 | The Barley Pod | 6035 NE Halsey St | 45.5338 | -122.601571 | 5.9 | 0.02 |
| 8 | 0 | Migration Brewing | 2828 NE Glisan St | 45.526289 | -122.636384 | 3.0 | 0.46 |
| 9 | 2 | Great Notion Brewing | 2204 NE Alberta St #101 | 45.558881 | -122.642771 | 4.9 | 1.09 |


Let's see how many microbreweries are in each of the clusters.

In [48]:
brew_df['labels'].value_counts()

1    26
2    13
0    11
Name: labels, dtype: int64
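Alongside the counts, a per-cluster average of the distance from downtown gives a quick numeric read on where each cluster sits. The sketch below uses only the first ten rows shown in the table above, so its numbers are illustrative rather than the full-cluster averages; the helper name is hypothetical.

```python
from collections import defaultdict

def mean_by_label(labels, values):
    """Average the values that share each cluster label."""
    groups = defaultdict(list)
    for label, value in zip(labels, values):
        groups[label].append(value)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}

# Labels and 'Dis from PDX Center' for the first ten rows shown above
labels = [1, 0, 1, 2, 1, 2, 2, 2, 0, 2]
dist_from_center = [0.8, 2.8, 1.7, 5.9, 1.3, 5.5, 5.6, 5.9, 3.0, 4.9]

for label, avg in sorted(mean_by_label(labels, dist_from_center).items()):
    print(label, round(avg, 2))
```

On the full dataframe, the pandas equivalent would be `brew_df.groupby('labels')[['Dis from PDX Center', 'Closest Rival']].mean()`.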

We suspect the largest cluster is the group of microbreweries packed around downtown Portland. The other two clusters are more likely to be areas of interest to our microbrewery entrepreneurs. Let's map our microbreweries again with a different color for each cluster to find out.

To assign a different color to each cluster we write a *for loop* that checks each label with a series of *if* statements and draws the marker in the matching cluster color.

In [49]:
map_cluster = folium.Map(location = PDX, zoom_start=12)

folium.features.CircleMarker(PDX,
    radius = 10,
    color ='red',
    popup ='PDX Downtown',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.5
).add_to(map_cluster)

# add markers to the map
#markers_colors = []

for lat, lon, cluster in zip(brew_df['lat'], brew_df['lng'],\
                                  brew_df['labels']):
        
        if cluster == 1:
            folium.CircleMarker([lat, lon],
            radius = 5,
            color = 'blue',
            fill = True,
            fill_color = 'blue',
            fill_opacity = 0.7).add_to(map_cluster)
            
        elif cluster == 2:
            folium.CircleMarker([lat, lon],
            radius = 5,
            color = 'red',
            fill = True,
            fill_color = 'red',
            fill_opacity = 0.7).add_to(map_cluster)
        
        else:
            folium.CircleMarker([lat, lon],
            radius = 5,
            color = 'green',
            fill = True,
            fill_color = 'green',
            fill_opacity = 0.7).add_to(map_cluster)
        
map_cluster
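The color assignment in the loop above can also be expressed as a dictionary lookup, which avoids repeating the marker code for each cluster. This is a sketch: the mapping mirrors the colors used above (0 green, 1 blue, 2 red), while the helper name and the gray fallback are assumptions.

```python
CLUSTER_COLORS = {0: 'green', 1: 'blue', 2: 'red'}

def cluster_color(label, default='gray'):
    """Look up the marker color for a cluster label; unknown labels get the default."""
    return CLUSTER_COLORS.get(int(label), default)

# Possible usage inside the mapping loop:
# for lat, lon, cluster in zip(brew_df['lat'], brew_df['lng'], brew_df['labels']):
#     color = cluster_color(cluster)
#     folium.CircleMarker([lat, lon], radius=5, color=color, fill=True,
#                         fill_color=color, fill_opacity=0.7).add_to(map_cluster)

print([cluster_color(label) for label in [1, 0, 1, 2]])  # ['blue', 'green', 'blue', 'red']
```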

Indeed, our largest cluster (designated in blue in the map) looks to be packed around downtown Portland. Our red and green clusters are farther away from downtown and more spread out. These areas will be of more interest to our microbrewery entrepreneurs.

Let's make tables of the red and green clusters that include the name, address, latitude and longitude of the microbreweries in each cluster. We'll provide these to our entrepreneurs as a starting point for further research.

In [50]:
import pandas as pd

red_c = brew_df[brew_df.labels == 2]

red_cs = red_c.sort_values(by = ['Closest Rival'], axis = 0, ascending = False).reset_index(drop = True)

red_cs.style

| | labels | name | address | lat | lng | Dis from PDX Center | Closest Rival |
|---|---|---|---|---|---|---|---|
| 0 | 2 | Double Mountain Brewery & Taproom | 4336 SE Woodstock Blvd | 45.479 | -122.618 | 6.4 | 1.84 |
| 1 | 2 | Fire on the Mountain | 3443 NE 57th Ave | 45.5481 | -122.605 | 6.2 | 1.6 |
| 2 | 2 | Breakside Brewery | 820 NE Dekum St | 45.5717 | -122.657 | 5.9 | 1.29 |
| 3 | 2 | Leikam Tap Room | 5812 E Burnside St | 45.5226 | -122.604 | 5.5 | 1.26 |
| 4 | 2 | 13 Virtues Brewing Co. | 6410 SE Milwaukie Ave | 45.4762 | -122.649 | 5.3 | 1.22 |
| 5 | 2 | Old Town Brewing | 5201 NE M L King Blvd | 45.5606 | -122.662 | 4.6 | 1.1 |
| 6 | 2 | Great Notion Brewing | 2204 NE Alberta St #101 | 45.5589 | -122.643 | 4.9 | 1.09 |
| 7 | 2 | Lemur's Sör Ház | 4635 NE 35th Ave | 45.5573 | -122.629 | 5.4 | 1.09 |
| 8 | 2 | Look Long Brewing Company | | 45.5706 | -122.682 | 5.6 | 0.92 |
| 9 | 2 | Lucky Labrador Tap Room | 1700 N Killingsworth St | 45.5626 | -122.685 | 4.8 | 0.92 |


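We want to give our entrepreneurs areas where the next closest brewery is more than 500 meters away. Every microbrewery in this cluster fits that description except the last two rows of the sorted table (indexes 11 and 12, beyond the ten rows displayed above): Baerlic Brewing Beer Hall at the Barley Pod and The Barley Pod. Both are listed at 6035 NE Halsey St and sit about 0.02 km apart, so they appear to be a single location, and we'll drop one of them.

We'll save the resulting dataframe as a .csv file so we can send it to our client.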

In [51]:
red_cs.drop([red_cs.index[12]], inplace = True)

red_cs.to_csv('brew_red_cluster.csv')

red_cs.style

| | labels | name | address | lat | lng | Dis from PDX Center | Closest Rival |
|---|---|---|---|---|---|---|---|
| 0 | 2 | Double Mountain Brewery & Taproom | 4336 SE Woodstock Blvd | 45.479 | -122.618 | 6.4 | 1.84 |
| 1 | 2 | Fire on the Mountain | 3443 NE 57th Ave | 45.5481 | -122.605 | 6.2 | 1.6 |
| 2 | 2 | Breakside Brewery | 820 NE Dekum St | 45.5717 | -122.657 | 5.9 | 1.29 |
| 3 | 2 | Leikam Tap Room | 5812 E Burnside St | 45.5226 | -122.604 | 5.5 | 1.26 |
| 4 | 2 | 13 Virtues Brewing Co. | 6410 SE Milwaukie Ave | 45.4762 | -122.649 | 5.3 | 1.22 |
| 5 | 2 | Old Town Brewing | 5201 NE M L King Blvd | 45.5606 | -122.662 | 4.6 | 1.1 |
| 6 | 2 | Great Notion Brewing | 2204 NE Alberta St #101 | 45.5589 | -122.643 | 4.9 | 1.09 |
| 7 | 2 | Lemur's Sör Ház | 4635 NE 35th Ave | 45.5573 | -122.629 | 5.4 | 1.09 |
| 8 | 2 | Look Long Brewing Company | | 45.5706 | -122.682 | 5.6 | 0.92 |
| 9 | 2 | Lucky Labrador Tap Room | 1700 N Killingsworth St | 45.5626 | -122.685 | 4.8 | 0.92 |
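Dropping a row by its fixed position works for this particular result but would need revisiting if the Foursquare data changed. A possible alternative, sketched here with plain dictionaries and a hypothetical helper name, is to keep only the first venue seen at each street address while leaving venues without an address untouched:

```python
def drop_repeated_addresses(rows):
    """Keep the first row for each address; rows without an address are always kept."""
    seen = set()
    kept = []
    for row in rows:
        address = row.get("address")
        has_address = isinstance(address, str) and address.strip() != ""
        if has_address and address in seen:
            continue  # a venue at this address was already kept
        if has_address:
            seen.add(address)
        kept.append(row)
    return kept

venues = [
    {"name": "Baerlic Brewing Beer Hall at the Barley Pod", "address": "6035 NE Halsey St"},
    {"name": "Look Long Brewing Company", "address": None},
    {"name": "The Barley Pod", "address": "6035 NE Halsey St"},
]
print([v["name"] for v in drop_repeated_addresses(venues)])
```

In pandas, `red_cs.drop_duplicates(subset='address')` is a close equivalent, with the caveat that it also treats multiple missing addresses as duplicates of one another.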


Let's create a map that includes only the microbreweries on our final red cluster list. We'll save this map so we can give it to our client as well.

In [52]:
map_red_cluster = folium.Map(location=[latitude, longitude], zoom_start = 12)

# add a blue circle marker to represent downtown portland

folium.features.CircleMarker(
    [latitude, longitude],
    radius = 10,
    color ='blue',
    popup ='PDX Downtown',
    fill = True,
    fill_color = 'blue',
    fill_opacity = 0.5
).add_to(map_red_cluster)

# add our red cluster with red circle markers

for lat, lng, label in zip(red_cs.lat, red_cs.lng, red_cs.address):
    folium.features.CircleMarker(
        [lat, lng],
        radius = 5,
        color ='red',
        popup = folium.Popup(label),
        fill = True,
        fill_color = 'red',
        fill_opacity = 0.5
    ).add_to(map_red_cluster)

map_red_cluster

In [30]:
map_red_cluster.save('map_red_cluster.html')

Let's create a similar table for our green cluster.

In [53]:
green_c = brew_df[brew_df.labels == 0]

green_cs = green_c.sort_values(by = ['Closest Rival'], ascending = False).reset_index(drop = True)

green_cs.style

| | labels | name | address | lat | lng | Dis from PDX Center | Closest Rival |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Great Notion Brewing | 2444 NW 28th Ave | 45.5399 | -122.709 | 3.5 | 1.16 |
| 1 | 0 | Hopworks Urban Brewery | 2944 SE Powell Blvd | 45.4969 | -122.635 | 4.0 | 0.96 |
| 2 | 0 | Little Beast Brewing Beer Garden | 3412 SE Division St | 45.5046 | -122.629 | 3.9 | 0.96 |
| 3 | 0 | Broadway Grill & Brewery | 1700 NE Broadway St | 45.5349 | -122.648 | 2.6 | 0.76 |
| 4 | 0 | Ruse Brewing | 4784 SE 17th Ave | 45.4877 | -122.648 | 4.1 | 0.75 |
| 5 | 0 | Hopworks BikeBar | 3947 N Williams Ave | 45.5513 | -122.667 | 3.5 | 0.7 |
| 6 | 0 | Culmination Brewing | 2117 NE Oregon St | 45.5289 | -122.644 | 2.6 | 0.64 |
| 7 | 0 | Gorges Beer: The Trailhead | 2705 SE Ankeny St | 45.5223 | -122.638 | 2.8 | 0.46 |
| 8 | 0 | Migration Brewing | 2828 NE Glisan St | 45.5263 | -122.636 | 3.0 | 0.46 |
| 9 | 0 | Ecliptic Brewing | 825 N Cook St | 45.5473 | -122.675 | 3.0 | 0.25 |


Looks like we have more microbreweries on this list that are closer together than 500 meters. Let's drop those rows from our final table.

Again, we'll save the final table to share with our client.

In [54]:
green_cs.drop(green_cs[green_cs['Closest Rival'] < .5].index, inplace = True)

green_cs.to_csv('brew_green_cluster.csv')

green_cs.style

| | labels | name | address | lat | lng | Dis from PDX Center | Closest Rival |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Great Notion Brewing | 2444 NW 28th Ave | 45.5399 | -122.709 | 3.5 | 1.16 |
| 1 | 0 | Hopworks Urban Brewery | 2944 SE Powell Blvd | 45.4969 | -122.635 | 4.0 | 0.96 |
| 2 | 0 | Little Beast Brewing Beer Garden | 3412 SE Division St | 45.5046 | -122.629 | 3.9 | 0.96 |
| 3 | 0 | Broadway Grill & Brewery | 1700 NE Broadway St | 45.5349 | -122.648 | 2.6 | 0.76 |
| 4 | 0 | Ruse Brewing | 4784 SE 17th Ave | 45.4877 | -122.648 | 4.1 | 0.75 |
| 5 | 0 | Hopworks BikeBar | 3947 N Williams Ave | 45.5513 | -122.667 | 3.5 | 0.7 |
| 6 | 0 | Culmination Brewing | 2117 NE Oregon St | 45.5289 | -122.644 | 2.6 | 0.64 |


Finally, we'll create a map for our green cluster and save it.

In [55]:
map_green_cluster = folium.Map(location=[latitude, longitude], zoom_start = 12)

# add a blue circle marker to represent downtown portland

folium.features.CircleMarker(
    [latitude, longitude],
    radius = 10,
    color ='blue',
    popup ='PDX Downtown',
    fill = True,
    fill_color = 'blue',
    fill_opacity = 0.5
).add_to(map_green_cluster)

# add our green cluster with green circle markers

for lat, lng, label in zip(green_cs.lat, green_cs.lng, green_cs.address):
    folium.features.CircleMarker(
        [lat, lng],
        radius = 5,
        color ='green',
        popup = folium.Popup(label),
        fill = True,
        fill_color = 'green',
        fill_opacity = 0.5
    ).add_to(map_green_cluster)

map_green_cluster

In [38]:
map_green_cluster.save('map_green_cluster.html')

## Results and Discussion <a name="results"></a>

The analysis illustrates that microbreweries in Portland are more densely distributed the closer we get to the center of downtown. The NE and SE neighborhood areas exhibit multiple areas where microbreweries are less densely distributed, thus holding the most promise for establishing a new microbrewery without a nearby competitor.

Narrowing our focus to the microbreweries in NE and SE Portland, we filtered out all microbreweries that had a competitor within 500 meters. The remaining microbreweries represent areas of lower competition for customers. These areas look particularly attractive because in the past five years the neighborhoods have seen renewed property development and renovation. Additionally, real estate in downtown Portland may be overdeveloped. Property in the surrounding neighborhoods could be obtained at a discount relative to downtown, providing a competitive advantage to a new microbrewery.

The best candidates for locating a new microbrewery are represented by the green and red clusters on our maps. These maps, and the tables used to generate them, are stored and available for our client's use. This information, however, should be supplemented by additional data relevant to selecting a prime location for a new microbrewery. The objective of this report is to identify areas where microbreweries appear not to be overcrowded. Certainly, an entrepreneur would want to look at real estate prices, zoning laws, available parking, etc., to determine if any of the locations identified through our analysis would be suitable. Our client should therefore use our recommendations as a starting point for more detailed analysis that could result in a location that has both no nearby competitors and several other factors in its favor.

## Conclusion <a name="conclusion"></a>

The focus of this project was to assist our client by narrowing their search for a prime location for a new microbrewery. By mapping the location of existing microbreweries provided through Foursquare data we got a general idea that microbreweries are concentrated around downtown Portland, with less density in the neighborhoods that surround downtown. K-means clustering generated promising locations in the surrounding neighborhoods in NE and SE Portland. Those clusters were further refined to create maps and a table of addresses for further exploration by the client.

A final decision on a prime location for a microbrewery should consider the location analysis in this project. Our client should also consider a number of other relevant factors, such as real estate availability and price, current trends in the microbrewery market, and neighborhood social and economic factors, when making a final decision. Those additional factors are beyond the scope of this project, the purpose of which was to analyze the relative location of existing microbreweries near downtown Portland.