# Where should I open a coffee shop?

Phillz Coffee is looking to add a new location in San Francisco, but they aren't sure where to put it. Using the Foursquare API, I can estimate a good position for a new coffee shop based on the locations of existing coffee shops and the density of other venues. Ideally, a new shop would sit where there is no competing coffee shop nearby but where the density of other venues is high, which would indicate heavy foot traffic and therefore more customers.
    
I will first create a grid of test locations. This can be any arbitrary collection of points, but for simplicity I use a 5x5 evenly spaced grid whose central point coincides with the listed latitude and longitude for San Francisco. For each test location, I will determine its distance from the nearest coffee shop, as well as its distance from its 10th closest venue, which approximates the inverse density of venues around that location. I will then calculate a z-score for each of the two distances. Since a large distance from the nearest coffee shop is desirable and a small distance to the 10th closest venue is desirable, I will subtract the z-score of the latter from that of the former (z_coffee - z_10th_closest) to measure how fit a location is for a new coffee shop.
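
The scoring step described above can be sketched on made-up numbers. This is only a minimal illustration, not the notebook's actual computation: the distances are invented, the helper names `z_scores` and `fitness_scores` are hypothetical, and it assumes the z-scores are taken across all test locations using the population standard deviation.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize a list of numbers: (x - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

def fitness_scores(coffee_dists, tenth_venue_dists):
    """Combine the two z-scores as z_coffee - z_10th_closest for each test location."""
    z_coffee = z_scores(coffee_dists)
    z_tenth = z_scores(tenth_venue_dists)
    return [zc - zt for zc, zt in zip(z_coffee, z_tenth)]

# Invented distances in meters for four test locations (illustration only)
coffee_dists = [100, 300, 500, 700]
tenth_venue_dists = [400, 200, 400, 200]

scores = fitness_scores(coffee_dists, tenth_venue_dists)
# Index of the test location with the highest combined score
best = max(range(len(scores)), key=scores.__getitem__)
print(best, [round(s, 2) for s in scores])
```

In this toy data the last location wins because it is both the farthest from a coffee shop and among the closest to its 10th venue.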
    
To visualize the result, I will create a Folium map and color each test location on a scale from red (bad spot for a coffee shop) to green (good spot for a coffee shop). To determine the hue, I'll use the combined score of the two variables z_coffee and z_10th_closest described above, and use each point's position within the range of scores to assign it a color on the gradient of (r, g, b) values from (255, 0, 0) to (0, 255, 0).
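
As a rough sketch of the red-to-green mapping just described, the snippet below interpolates in a straight line between the two (r, g, b) endpoints. The function names `score_to_rgb` and `rgb_to_hex` are hypothetical, and straight-line RGB interpolation is an assumption; the notebook itself builds its gradient with the `colour` package, so the exact shades may differ.

```python
def score_to_rgb(score, min_score, max_score):
    """Linearly interpolate from red (255, 0, 0) at min_score to green (0, 255, 0) at max_score."""
    if max_score == min_score:
        t = 0.5  # all scores equal: use the midpoint color
    else:
        t = (score - min_score) / (max_score - min_score)
    t = min(1.0, max(0.0, t))  # clamp in case the score falls outside the range
    red = round(255 * (1 - t))
    green = round(255 * t)
    return (red, green, 0)

def rgb_to_hex(rgb):
    """Format an (r, g, b) tuple as a '#rrggbb' string."""
    return '#{:02x}{:02x}{:02x}'.format(*rgb)

# Lowest score maps to pure red, highest to pure green
print(rgb_to_hex(score_to_rgb(-2.0, -2.0, 3.0)))
print(rgb_to_hex(score_to_rgb(3.0, -2.0, 3.0)))
```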

See the rendered map below for the results and conclusion.

In [241]:
#make grid of city
#create 2 z-scores for each point, one for how coffee-less it is, and another for how busy it is
#pick places with biggest sum of those two
from scipy.stats import zscore
import random

In [242]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


In [243]:
address = 'San Francisco, CA'

geolocator = Nominatim(user_agent="sf_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of San Francisco are {}, {}.'.format(latitude, longitude))

The geographical coordinates of San Francisco are 37.7790262, -122.4199061.


In [244]:
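# create map of San Francisco using latitude and longitude values
map_sf = folium.Map(location=[latitude, longitude], zoom_start=13)

# build a num_points x num_points grid of test locations centered on the city,
# extending degree_radius degrees in each direction
degree_radius = 1/50
num_points = 5
lats = np.linspace(latitude-degree_radius, latitude+degree_radius, num_points)
longs = np.linspace(longitude-degree_radius, longitude+degree_radius, num_points)

# grid[j][i] holds the (latitude, longitude) pair (lats[i], longs[j])
grid = [[(lats[i], longs[j]) for i in range(len(lats))] for j in range(len(longs))]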
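
To get a feel for what a grid defined in degrees means on the ground, the spacing between neighboring test locations can be converted to meters. The sketch below uses the haversine formula with a spherical-Earth approximation; `haversine_m`, `center_lat` and `center_lng` are hypothetical names, and the center coordinates are the ones printed earlier.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in meters (approximation)

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (latitude, longitude) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

# Values taken from the notebook: city center, degree_radius = 1/50, num_points = 5
center_lat, center_lng = 37.7790262, -122.4199061
step = (2 * (1 / 50)) / (5 - 1)  # spacing between neighboring grid points, in degrees

north_south = haversine_m(center_lat, center_lng, center_lat + step, center_lng)
east_west = haversine_m(center_lat, center_lng, center_lat, center_lng + step)
print(round(north_south), round(east_west))
```

The east-west spacing comes out shorter than the north-south spacing because lines of longitude converge away from the equator. Either way, neighboring test locations are closer together than twice the 1000 m default search radius, so their searches cover overlapping areas.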


In [245]:
import os # credentials come from environment variables (names are placeholders) instead of being hard-coded
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

In [247]:
def getNearbyVenues(lat, lng, radius=1000, LIMIT=10):
    """Return name, distance and primary category for up to LIMIT venues within radius meters of (lat, lng)."""
    venues_list = []

    # create the API request URL
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        lng,
        radius,
        LIMIT)

    # make the GET request and pull the venue entries out of the response
    venues = requests.get(url).json()["response"]['groups'][0]['items']
    for venue in venues:
        venue_info = venue['venue']
        # keep only the fields used later on
        venue_info = {'name': venue_info['name'],
                      'distance': venue_info['location']['distance'],
                      'category': venue_info['categories'][0]['name']}
        venues_list.append(venue_info)

    return venues_list
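
The parsing step in the function above indexes straight into the response, so a venue without a category or an empty result would raise an exception. A more defensive variant might look like the sketch below. `parse_venues` is a hypothetical helper, the sample payload is hand-written to mimic the nesting used above rather than real API output, and sorting by distance is an added assumption so that list position reflects proximity.

```python
def parse_venues(payload):
    """Extract name, distance and primary category from an explore-style response dict."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    venues = []
    for item in items:
        venue = item.get('venue', {})
        categories = venue.get('categories') or []
        venues.append({
            'name': venue.get('name'),
            'distance': venue.get('location', {}).get('distance'),
            # fall back to None when a venue has no category listed
            'category': categories[0].get('name') if categories else None,
        })
    # order nearest-first so that positions in the list reflect distance
    # (venues without a distance go last)
    venues.sort(key=lambda v: (v['distance'] is None, v['distance'] or 0))
    return venues

# Hand-written payload mimicking the nesting used above (not real API output)
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'B', 'location': {'distance': 250}, 'categories': [{'name': 'Park'}]}},
    {'venue': {'name': 'A', 'location': {'distance': 120}, 'categories': []}},
]}]}}
print(parse_venues(sample))
```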

    

In [248]:
grid_venues = [[getNearbyVenues(grid[i][j][0], grid[i][j][1]) for i in range(num_points)] for j in range(num_points)]

In [249]:
# distance to the last venue returned for each test location; with LIMIT=10 this is taken
# as the distance to the 10th closest venue, assuming results are ordered nearest-first
tenth_furthest_grid = [[grid_venues[i][j][-1]['distance'] for i in range(len(grid_venues))] for j in range(len(grid_venues))]
tenth_furthest_grid

[[439, 436, 321, 283, 225],
 [454, 286, 475, 360, 342],
 [242, 296, 293, 283, 259],
 [289, 317, 292, 244, 200],
 [453, 236, 335, 235, 370]]

In [250]:
#now, how far away is the nearest coffee shop?
coffee = 'Coffee Shop'
def find_coffee_dist(lat, lng):
    venues = getNearbyVenues(lat, lng, LIMIT = 100)
    try:
        # position of the first venue in the list whose category is 'Coffee Shop'
        ind = [venue['category'] for venue in venues].index(coffee)
    except ValueError:
        # no coffee shop among the returned venues
        return -1
    return venues[ind]['distance']
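
The lookup above returns the first coffee shop in list order and uses -1 when none is found. One possible alternative, sketched here under stated assumptions, takes the minimum distance over all matching venues, so the answer does not depend on how the list is ordered. It returns `None` when there is no match, so that a missing value cannot be mistaken for a very short distance when z-scores are computed. `nearest_category_distance` and the sample venues are hypothetical.

```python
def nearest_category_distance(venues, category='Coffee Shop'):
    """Return the smallest distance among venues of the given category, or None if there is none."""
    distances = [v['distance'] for v in venues if v['category'] == category]
    return min(distances) if distances else None

# Made-up venues in the same shape that getNearbyVenues returns
sample_venues = [
    {'name': 'Cafe One', 'distance': 410, 'category': 'Coffee Shop'},
    {'name': 'Green Park', 'distance': 60, 'category': 'Park'},
    {'name': 'Cafe Two', 'distance': 90, 'category': 'Coffee Shop'},
]

print(nearest_category_distance(sample_venues))
print(nearest_category_distance(sample_venues, category='Bakery'))
```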
    

In [251]:
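# note: the loops read grid[i][j] but store the result at [j][i], so coffee_dist_grid[a][b]
# describes grid[b][a], whereas tenth_furthest_grid[a][b] describes grid[a][b]
coffee_dist_grid = [[find_coffee_dist(grid[i][j][0], grid[i][j][1]) for i in range(num_points)] for j in range(num_points)]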

In [252]:
coffee_dist_grid

[[468, 444, 311, 126, 526],
 [378, 222, 285, 458, 493],
 [339, 570, 405, 71, 509],
 [601, 556, 314, 185, 272],
 [371, 642, 215, 228, 539]]

In [253]:
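# note: zscore standardizes along axis 0 by default, i.e. each column of the grid separately;
# passing axis=None would standardize over all test locations at once
z_coffee = zscore(coffee_dist_grid) #higher is better
z_density = zscore(tenth_furthest_grid) #lower is better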

In [254]:
score_grid = z_coffee-z_density

In [255]:
score_grid

array([[-0.31286294, -2.1242318 ,  0.40875615, -0.70278587,  1.40785691],
       [-1.42469743, -1.37952079, -2.28441435,  0.04380922, -0.69670057],
       [ 0.49170646,  0.84061186,  2.36163561, -1.11557317,  0.72151788],
       [ 2.73324658,  0.42926443,  0.8848338 ,  0.62392395, -0.77606212],
       [-1.48739268,  2.23387631, -1.3708112 ,  1.15062587, -0.65661211]])
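
To turn the score grid into a shortlist, the cells can be ranked by score. The sketch below copies the values from the output above into a plain list (named `scores_from_output` here to avoid clashing with the notebook's variables) and uses a hypothetical helper `top_cells`. The row and column it reports are indices into the score grid only; mapping them back to coordinates depends on how the grids were indexed when they were built.

```python
# Scores copied from the score_grid output above
scores_from_output = [
    [-0.31286294, -2.1242318, 0.40875615, -0.70278587, 1.40785691],
    [-1.42469743, -1.37952079, -2.28441435, 0.04380922, -0.69670057],
    [0.49170646, 0.84061186, 2.36163561, -1.11557317, 0.72151788],
    [2.73324658, 0.42926443, 0.8848338, 0.62392395, -0.77606212],
    [-1.48739268, 2.23387631, -1.3708112, 1.15062587, -0.65661211],
]

def top_cells(scores, k=3):
    """Return the k highest-scoring (row, column, score) entries, best first."""
    cells = [(r, c, value) for r, row in enumerate(scores) for c, value in enumerate(row)]
    return sorted(cells, key=lambda cell: cell[2], reverse=True)[:k]

for row, col, value in top_cells(scores_from_output):
    print(row, col, round(value, 2))
```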

In [256]:
min_score = score_grid.min()
max_score = score_grid.max()
score_range = max_score - min_score
num_colors = num_points**2

In [257]:
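# map each score to an integer color index in [0, num_colors - 1]; the +0.1 keeps the top score below num_colors
color_mat = [[int(num_colors*(score_grid[i][j] - min_score)/(score_range+0.1)) for i in range(num_points)] for j in range(num_points)]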

In [258]:
from colour import Color
red = Color("red")
colors = list(red.range_to(Color("green"), num_colors))

In [260]:
for i in range(len(grid)):
    for j in range(len(grid[i])):
        folium.CircleMarker(
            [grid[i][j][0], grid[i][j][1]],
            radius=7,
            color='blue',
            fill=True,
            fill_color=colors[color_mat[i][j]].hex,
            fill_opacity=0.9,
            parse_html=False).add_to(map_sf)  
map_sf

# Results/Discussion/Conclusion

Above (or if not rendered, in a .png file in the repo) is the map illustrating the recommendation system for new coffee shop placement. The parameters of this model are highly flexible, from the density and reach of the test location grid, to the venue type, to the density metric, to the weighting of importance between venue density and coffee shop sparsity. Thus, one could apply the above methodology in any city, and for any type of venue. These types of methods could be used by city planners, business executives, outreach directors, and any number of other users who need to understand the gaps in the distributions of venues and services throughout a city.



In [None]:
!git add .
!git commit -m "Add final notebook"
!git push origin main

[main 129c223] Add final notebook
 Committer: Daniel Costa <dcosta@Daniels-MacBook-Pro-57.local>
 3 files changed, 410 insertions(+), 8 deletions(-)
 create mode 100644 .ipynb_checkpoints/Coffee Shop-checkpoint.ipynb
 create mode 100644 Coffee Shop.ipynb
