# Business Problem and Background

    I represent a national donut chain, Galaxy Donuts, which is expanding into the state of Minnesota. Our focus is on finding a location in the city of Minneapolis that will maximize profitability for the company. The biggest problem identified by our new CEO is that competing donut shops have been established in the Minneapolis area for some time. However, based on our proprietary Donut Density Ratio (DDR), we know there is room for one location for our brand.
    
    We will need to analyze the 83 neighborhoods of Minneapolis proper and make the decision based on the optimal distance from competitor donut shops in and around Minneapolis. We also want to place our store in a neighborhood with a low crime rate. Finally, we want to place the store in a higher-density area that will ensure plenty of traffic, as well as other activities for our customers to do after they have their minds blown by our donuts.


# The Data

    I will use the Foursquare API to identify all donut shops in and around the city of Minneapolis. These will be shown as red dots on the map. Next, I will determine which neighborhoods are farthest from the existing donut shops.
    
    Once I have that list of neighborhoods, I will analyze them in more detail, excluding all residential and industrial space and leaving only commercial space. I will also map the crime rate by neighborhood, using open data published by the city of Minneapolis. Then I will use the Foursquare API to find businesses that might complement a donut shop (the list is still being determined, but will be included in the final report), and attempt to place the store in an area where the mean customer scores are highest, indicating a more satisfying experience/location.
    
    The final work product will be a map of Minneapolis, built from Foursquare API data, that marks all competitors. I will use all the data discussed above to place the coveted blue dot where Galaxy Donuts will open its newest location. All the findings will be presented in a formatted report with visuals, and I will also write the analysis code in a Jupyter Notebook and share it with my colleagues on my GitHub page.
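
The "farthest from existing donut shops" criterion can be sketched with a great-circle (haversine) distance. This is only a rough illustration under stated assumptions, not the analysis itself: the helper names `haversine_km` and `nearest_competitor_km`, the site labels, and the sample coordinates are all hypothetical, and a candidate is scored simply by the distance to its closest competitor.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def nearest_competitor_km(candidate, competitors):
    """Distance from a candidate (lat, lon) to its closest competitor."""
    return min(haversine_km(candidate[0], candidate[1], lat, lon) for lat, lon in competitors)

# Hypothetical inputs, for illustration only
competitors = [(44.955156, -93.277816), (44.915864, -93.232306)]
candidates = {"site_a": (44.986656, -93.258133), "site_b": (44.912674, -93.328876)}

# Rank candidates so the one farthest from any competitor comes first
ranked = sorted(
    candidates,
    key=lambda name: nearest_competitor_km(candidates[name], competitors),
    reverse=True,
)
print(ranked)
```

In practice the candidates would more likely be neighborhood centroids, and this distance score would be combined with the crime and density criteria rather than used alone.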


In [40]:
# I first import all the libraries I need for my initial analysis
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

In [41]:
#I bring in a csv titled 'Neighborhood Crime Stats' and convert it to a dataframe
mpls_nbrhd = pd.read_csv('NEIGHBORHOOD_CRIME_STATS.csv', delimiter = ',')

In [42]:
#I use the groupby method to get sum total of crimes by each neighborhood
mpls_nbrhd = mpls_nbrhd.groupby('neighborhood')['number'].sum()

In [43]:
#This displays the summed dataframe of Minneapolis neighborhoods
mpls_nbrhd.to_frame()

neighborhood,number
Armatage,193
Audubon Park,362
Bancroft,212
Beltrami,121
Bottineau,161
...,...
West Calhoun,185
Whittier,1894
Willard - Hay,848
Windom,318


In [44]:
#I reset the index to make the dataframe easier to manipulate and store it as mpls_nbrhd1
mpls_nbrhd1 = mpls_nbrhd.reset_index()
mpls_nbrhd1

Unnamed: 0,neighborhood,number
0,Armatage,193
1,Audubon Park,362
2,Bancroft,212
3,Beltrami,121
4,Bottineau,161
...,...,...
82,West Calhoun,185
83,Whittier,1894
84,Willard - Hay,848
85,Windom,318
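
To connect this table to the low-crime criterion, one possible next step is to shortlist the neighborhoods with the fewest reported incidents. The sketch below is only an assumption about how that might look: it uses a handful of rows copied from the table above instead of the full `mpls_nbrhd1` frame, and the median cutoff is an arbitrary choice. Note also that the table holds incident counts, not per-capita rates.

```python
import pandas as pd

# Toy rows copied from the summed table; the real frame would be mpls_nbrhd1
crime = pd.DataFrame({
    "neighborhood": ["Armatage", "Audubon Park", "Bancroft", "Beltrami", "Bottineau", "Whittier"],
    "number": [193, 362, 212, 121, 161, 1894],
})

# Keep neighborhoods at or below the median count (the median is an assumed threshold)
cutoff = crime["number"].median()
low_crime = crime[crime["number"] <= cutoff].sort_values("number")
print(low_crime)
```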


In [45]:
!conda install -c conda-forge folium=0.7.0 --yes
import folium

print('Folium installed and imported!')

Solving environment: done


  current version: 4.5.11
  latest version: 4.7.12

Please update conda by running

    $ conda update -n base -c defaults conda



# All requested packages already installed.

Folium installed and imported!


In [46]:
from folium import plugins
!wget --quiet https://opendata.arcgis.com/datasets/055ca54e5fcc47329f081c9ef51d038e_0.geojson -O 055ca54e5fcc47329f081c9ef51d038e_0.geojson

print('GeoJSON file downloaded!')

GeoJSON file downloaded!


In [47]:
# Minneapolis latitude and longitude values
latitude = 44.986656
longitude = -93.258133
mpls_neighborhood_geo = 'https://opendata.arcgis.com/datasets/055ca54e5fcc47329f081c9ef51d038e_0.geojson'

# Create map
mpls_map = folium.Map(
       location=[latitude,longitude],
       zoom_start=11.35)

mpls_map.choropleth(
    geo_data=mpls_neighborhood_geo,
    data=mpls_nbrhd1,
    columns=['neighborhood','number'],
    key_on='feature.properties.BDNAME',
    threshold_scale=[0, 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500],
    fill_color='YlOrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Crime Rate in Minneapolis',
    reset=True,
)
# display the map
mpls_map
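
The choropleth joins the crime table to the GeoJSON by exact name (`neighborhood` against `feature.properties.BDNAME`), so any spelling difference leaves a neighborhood unshaded. The summed table above is indexed 0 to 85, which is more rows than the 83 neighborhoods, so a quick check for unmatched names seems worthwhile. The sketch below uses a hypothetical miniature GeoJSON and made-up name lists; `unmatched_names` is an illustrative helper, not part of folium.

```python
# Hypothetical miniature GeoJSON; the real file is assumed to store names in properties.BDNAME
geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"BDNAME": "Armatage"}, "geometry": None},
        {"type": "Feature", "properties": {"BDNAME": "Whittier"}, "geometry": None},
        {"type": "Feature", "properties": {"BDNAME": "Windom"}, "geometry": None},
    ],
}

# Hypothetical names from the crime table
crime_names = ["Armatage", "Whittier", "Willard - Hay"]

def unmatched_names(table_names, geojson_dict, prop="BDNAME"):
    """Return (names only in the table, names only in the GeoJSON), each sorted."""
    geo_names = {feature["properties"][prop] for feature in geojson_dict["features"]}
    table = set(table_names)
    return sorted(table - geo_names), sorted(geo_names - table)

only_in_table, only_in_geojson = unmatched_names(crime_names, geojson)
print("In crime table only:", only_in_table)
print("In GeoJSON only:", only_in_geojson)
```

With the real data, the first list would point at rows to rename or drop before mapping, and the second at neighborhoods that have no crime figure.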

In [48]:
import json # library to handle JSON files
import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

In [49]:
CLIENT_ID = 'G5T2H5YFJAPACKLEQST13SJLTRRB4QMHFGJMDF0ZSRHSXPDX' # your Foursquare ID
CLIENT_SECRET = 'WB3CGZOIHBTSLJYERUI3VORHYWDEAF5R034JZRT3M4TVXRYC' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: G5T2H5YFJAPACKLEQST13SJLTRRB4QMHFGJMDF0ZSRHSXPDX
CLIENT_SECRET:WB3CGZOIHBTSLJYERUI3VORHYWDEAF5R034JZRT3M4TVXRYC


In [50]:
url = 'https://api.foursquare.com/v2/venues/search?categoryId=4bf58dd8d48988d148941735&intent=browse&radius=30000&client_id={}&client_secret={}&ll={},{}&v={}&query='.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION)
url

'https://api.foursquare.com/v2/venues/search?categoryId=4bf58dd8d48988d148941735&intent=browse&radius=30000&client_id=G5T2H5YFJAPACKLEQST13SJLTRRB4QMHFGJMDF0ZSRHSXPDX&client_secret=WB3CGZOIHBTSLJYERUI3VORHYWDEAF5R034JZRT3M4TVXRYC&ll=44.986656,-93.258133&v=20180604&query='
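
The same request URL can be assembled from a dictionary of named parameters, which makes each setting easier to read and change than one long format string. This is a sketch, not the notebook's method: `build_search_url` is a hypothetical helper, the credentials are placeholders, and passing `limit` assumes the endpoint accepts it (the notebook defines `LIMIT` but does not send it).

```python
from urllib.parse import urlencode

def build_search_url(client_id, client_secret, lat, lng, version,
                     category_id, radius_m, limit):
    """Assemble a Foursquare v2 venue-search URL from named parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",        # search centre as "lat,lng"
        "categoryId": category_id,   # donut shop category ID used in the notebook
        "intent": "browse",
        "radius": radius_m,          # metres
        "limit": limit,              # assumed to be accepted by the endpoint
    }
    return "https://api.foursquare.com/v2/venues/search?" + urlencode(params)

url = build_search_url(
    client_id="YOUR_CLIENT_ID",          # placeholder
    client_secret="YOUR_CLIENT_SECRET",  # placeholder
    lat=44.986656,
    lng=-93.258133,
    version="20180604",
    category_id="4bf58dd8d48988d148941735",
    radius_m=30000,
    limit=30,
)
print(url)
```

`urlencode` percent-encodes the comma in `ll`; the server decodes it back to the same value.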

In [51]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5db756116b9b490028966c77'},
 'response': {'venues': [{'id': '584c04838ee56059df135672',
    'name': "Dunkin'",
    'location': {'address': 'MSP Airport, Terminal 1',
     'crossStreet': 'Main Concourse',
     'lat': 44.88374364677225,
     'lng': -93.21137493853037,
     'labeledLatLngs': [{'label': 'display',
       'lat': 44.88374364677225,
       'lng': -93.21137493853037}],
     'distance': 12034,
     'postalCode': '55111',
     'cc': 'US',
     'city': 'Saint Paul',
     'state': 'MN',
     'country': 'United States',
     'formattedAddress': ['MSP Airport, Terminal 1 (Main Concourse)',
      'Saint Paul, MN 55111',
      'United States']},
    'categories': [{'id': '4bf58dd8d48988d148941735',
      'name': 'Donut Shop',
      'pluralName': 'Donut Shops',
      'shortName': 'Donuts',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/donuts_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1572296231

In [52]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.crossStreet,location.lat,location.lng,location.labeledLatLngs,...,location.country,location.formattedAddress,delivery.id,delivery.url,delivery.provider.name,delivery.provider.icon.prefix,delivery.provider.icon.sizes,delivery.provider.icon.name,location.neighborhood,venuePage.id
0,584c04838ee56059df135672,Dunkin',"[{'id': '4bf58dd8d48988d148941735', 'name': 'D...",v-1572296231,False,"MSP Airport, Terminal 1",Main Concourse,44.883744,-93.211375,"[{'label': 'display', 'lat': 44.88374364677225...",...,United States,"[MSP Airport, Terminal 1 (Main Concourse), Sai...",,,,,,,,
1,580fa20238facc4e6ffd1569,Angel Food Bakery & Donut Bar,"[{'id': '4bf58dd8d48988d16a941735', 'name': 'B...",v-1572296231,False,Concourse E,,44.88504,-93.212566,"[{'label': 'display', 'lat': 44.8850400628044,...",...,United States,"[Concourse E, Saint Paul, MN 55111, United Sta...",,,,,,,,
2,58517cc7739d855b44e53255,Dunkin',"[{'id': '4bf58dd8d48988d148941735', 'name': 'D...",v-1572296231,False,2425 Rice St,,45.013967,-93.106732,"[{'label': 'display', 'lat': 45.01396688648932...",...,United States,"[2425 Rice St, Roseville, MN 55113, United Sta...",,,,,,,,
3,51254ee3e4b0597635a30975,Glam Doll Donuts,"[{'id': '4bf58dd8d48988d148941735', 'name': 'D...",v-1572296231,False,2605 Nicollet Ave,at E 26th St.,44.955156,-93.277816,"[{'label': 'display', 'lat': 44.95515607246256...",...,United States,"[2605 Nicollet Ave (at E 26th St.), Minneapoli...",,,,,,,,
4,4b5864b8f964a520c35528e3,Donut Connection,"[{'id': '4bf58dd8d48988d148941735', 'name': 'D...",v-1572296231,False,1037 1st Ave E,,44.800545,-93.513115,"[{'label': 'display', 'lat': 44.80054540165129...",...,United States,"[1037 1st Ave E, Shakopee, MN 55379, United St...",,,,,,,,


In [53]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns].copy()  # copy so the column assignment below does not act on a view

# function that extracts the category of the venue
def get_category_type(row):
    # search results store categories under 'categories'; fall back to 'venue.categories' if that column is absent
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered

Unnamed: 0,name,categories,address,crossStreet,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,neighborhood,id
0,Dunkin',Donut Shop,"MSP Airport, Terminal 1",Main Concourse,44.883744,-93.211375,"[{'label': 'display', 'lat': 44.88374364677225...",12034,55111,US,Saint Paul,MN,United States,"[MSP Airport, Terminal 1 (Main Concourse), Sai...",,584c04838ee56059df135672
1,Angel Food Bakery & Donut Bar,Bakery,Concourse E,,44.88504,-93.212566,"[{'label': 'display', 'lat': 44.8850400628044,...",11868,55111,US,Saint Paul,MN,United States,"[Concourse E, Saint Paul, MN 55111, United Sta...",,580fa20238facc4e6ffd1569
2,Dunkin',Donut Shop,2425 Rice St,,45.013967,-93.106732,"[{'label': 'display', 'lat': 45.01396688648932...",12299,55113,US,Roseville,MN,United States,"[2425 Rice St, Roseville, MN 55113, United Sta...",,58517cc7739d855b44e53255
3,Glam Doll Donuts,Donut Shop,2605 Nicollet Ave,at E 26th St.,44.955156,-93.277816,"[{'label': 'display', 'lat': 44.95515607246256...",3833,55408,US,Minneapolis,MN,United States,"[2605 Nicollet Ave (at E 26th St.), Minneapoli...",,51254ee3e4b0597635a30975
4,Donut Connection,Donut Shop,1037 1st Ave E,,44.800545,-93.513115,"[{'label': 'display', 'lat': 44.80054540165129...",28871,55379,US,Shakopee,MN,United States,"[1037 1st Ave E, Shakopee, MN 55379, United St...",,4b5864b8f964a520c35528e3
5,Dunkin',Donut Shop,,,44.884108,-93.308363,"[{'label': 'display', 'lat': 44.88410763898639...",12082,55423,US,Minneapolis,MN,United States,"[Minneapolis, MN 55423, United States]",,5beee5d5625a66002c783fba
6,Mel-O-Glaze Bakery,Bakery,4800 28th Ave S,at E Minnehaha Pkwy,44.915864,-93.232306,"[{'label': 'display', 'lat': 44.91586386149392...",8138,55417,US,Minneapolis,MN,United States,"[4800 28th Ave S (at E Minnehaha Pkwy), Minnea...",,451f6c35f964a5209e3a1fe3
7,Dunkin',Donut Shop,9595 Zachary Ln N,,45.128772,-93.423023,"[{'label': 'display', 'lat': 45.12877176026772...",20454,55369,US,Maple Grove,MN,United States,"[9595 Zachary Ln N, Maple Grove, MN 55369, Uni...",,5b142c22c58ed7002c9219a1
8,Dunkin',Café,143 Snelling Ave N,,44.945622,-93.167318,"[{'label': 'display', 'lat': 44.9456218, 'lng'...",8486,55104,US,Saint Paul,MN,United States,"[143 Snelling Ave N, Saint Paul, MN 55104, Uni...",,5d9e2d3d3155f3000bd6a06c
9,Dunkin',Donut Shop,1420 Yankee Doodle Rd,at Pilot Knob Rd,44.833159,-93.168716,"[{'label': 'display', 'lat': 44.8331594, 'lng'...",18484,55121,US,Eagan,MN,United States,"[1420 Yankee Doodle Rd (at Pilot Knob Rd), Eag...",,5cb20be9a35f46002557ea67


In [54]:
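# Here I remove a single row (index label 22) that was obviously mislabelled in Foursquare.
# Caution: index labels depend on the order of the API response. In the listing below,
# Mucci's (an Italian restaurant) still appears at index 23, so filtering by name
# would be more robust than dropping a fixed index label.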
dataframe_filtered = dataframe_filtered.drop([22], axis=0)
dataframe_filtered.name

0                           Dunkin'
1     Angel Food Bakery & Donut Bar
2                           Dunkin'
3                  Glam Doll Donuts
4                  Donut Connection
5                           Dunkin'
6                Mel-O-Glaze Bakery
7                           Dunkin'
8                           Dunkin'
9                           Dunkin'
10                  Cardigan Donuts
11                 Glam Doll Donuts
12                The Thirsty Whale
13                       Sleepy V's
14            Bogart's Doughnut Co.
15               Puffy Cream Donuts
16         YoYo Donuts & Coffee Bar
17                          Dunkin'
18                          Dunkin'
19                          Dunkin'
20                          Dunkin'
21             Bogart’s Doughnut Co
23                          Mucci's
24                   Dunkin' Donuts
25                          Dunkin'
Name: name, dtype: object
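
Because Foursquare can return venues in a different order from one run to the next, removing a mislabelled venue by name tends to be more robust than dropping a fixed index label. A minimal sketch, assuming a small stand-in frame rather than the real `dataframe_filtered`; the exclusion list `not_donut_shops` is hypothetical:

```python
import pandas as pd

# Stand-in for the tail of dataframe_filtered; index labels mimic the listing above
venues = pd.DataFrame(
    {"name": ["Dunkin'", "Bogart's Doughnut Co.", "Mucci's", "Dunkin' Donuts"]},
    index=[20, 21, 23, 24],
)

# Hypothetical list of venues known not to be donut shops
not_donut_shops = ["Mucci's"]

# Keep every row whose name is not on the exclusion list
cleaned = venues[~venues["name"].isin(not_donut_shops)]
print(cleaned)
```

The filter still works if the venue moves to a different row, and it is a no-op if the venue is absent from the response.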

In [55]:
# getting the geojson from http://opendata.minneapolismn.gov outlining all 83 neighborhoods in Minneapolis
!wget --quiet https://opendata.arcgis.com/datasets/055ca54e5fcc47329f081c9ef51d038e_0.geojson
print('GeoJSON file downloaded!')

GeoJSON file downloaded!


In [56]:
#I create a simple dataframe so I can combine it with existing data and plot the shop location
galaxy = pd.DataFrame({
'lat':[44.912674],
'lon':[-93.328876],
'name':['Galaxy Donuts']
})
galaxy

Unnamed: 0,lat,lon,name
0,44.912674,-93.328876,Galaxy Donuts


In [57]:
#geojson file with mapping data for all 83 Minneapolis neighborhoods 
mpls_geo = 'https://opendata.arcgis.com/datasets/055ca54e5fcc47329f081c9ef51d038e_0.geojson'

venues_map = folium.Map(location=[latitude, longitude], zoom_start=11) # generate map centered around Minneapolis

# add the donut shops as red circle markers
for lat, lng, name in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.name):
    folium.CircleMarker(
        [lat, lng],
        radius=4,
        color='red',
        popup=name,
        fill = False,
        fill_color=None,
        fill_opacity=0.6
    ).add_to(venues_map)


# add Galaxy as a Big Blue Donut
for lat, lon, name in zip(galaxy.lat, galaxy.lon, galaxy.name):
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        color='blue',
        popup=name,
        fill = True,
        fill_color=None,
        fill_opacity=0.4
    ).add_to(venues_map)
    
# I use a choropleth-style overlay to visually represent the 83 Minneapolis neighborhoods.
# Seeing where the donut shops sit relative to the neighborhoods will be
# important for my final analysis.
venues_map.choropleth(
    geo_data=mpls_geo,
    data=mpls_nbrhd1,
    columns=['neighborhood','number'],
    key_on='feature.properties.BDNAME',
    threshold_scale=[0, 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500],
    fill_color='YlOrRd',
    fill_opacity=0.5,
    line_opacity=0.2,
    legend_name='Crime Rate in Minneapolis',
    reset=True,
)

# display map
venues_map