## Problem Description

In this report, I address the question of where to open a new cafe in Cambridge, UK. The intended audience is anyone interested in opening a cafe business in the Cambridge area. The general rationale behind the analysis is to balance exploration of areas with few cafes against exploitation of areas where cafes already cluster.

An area with relatively few cafes means less competition, but it may also mean that there are few potential customers. An area with many cafes may be near a tourist attraction and therefore have plenty of potential customers, but it will also be fairly competitive.

I will approach the problem along these two directions.
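
One way to make the competition side of this trade-off concrete is to count how many existing cafes lie within walking distance of a candidate site. The sketch below is only an illustration, not the method used in this report: the helper names `haversine_m` and `count_cafes_within`, the 300 m radius, and the candidate site are assumptions, while the three cafe coordinates are taken from the Foursquare results shown in the Data Description section.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_cafes_within(candidate, cafes, radius_m=300):
    """Count the cafes whose distance to the candidate site is at most radius_m."""
    return sum(
        1
        for lat, lng in cafes
        if haversine_m(candidate[0], candidate[1], lat, lng) <= radius_m
    )


cafes = [
    (52.198515, 0.121923),  # Hot Numbers
    (52.204278, 0.118949),  # Aromi
    (52.204327, 0.123427),  # Savino's
]
candidate = (52.2043, 0.1210)  # hypothetical candidate site, chosen for illustration

nearby = count_cafes_within(candidate, cafes)
print("cafes within 300 m:", nearby)
```

A low count suggests little competition, but, as noted above, it says nothing on its own about whether customers are present.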

## Data Description

The data are obtained through the Foursquare API and contain location information for cafes in Cambridge, UK.

Below is my code for obtaining the data, followed by an initial exploratory analysis.

In [1]:
!conda install -c conda-forge folium=0.5.0 --yes


In [3]:
# import necessary libraries
import folium # plotting library
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner

In [4]:
# sensitive Foursquare login info

Your credentials:
CLIENT_ID: BKNYVLZF5UBHV5JZDPLUTAZGFRMEJSUTWWA4JOQTUMOTCJCN
CLIENT_SECRET:2LHOBXUKK5UMYTYLK0RYEQFIZKN5XD1BXWCCLHJIQY1IGGO0
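
Because the client ID and secret grant access to the Foursquare account, it is generally safer to keep them out of the notebook and read them at run time. The following is a minimal sketch, assuming the credentials are stored in environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, the helper `load_credentials`, the demo values, and the `VERSION` date are illustrative assumptions, not the values used in this analysis. The v2 API expects the `v` parameter as a date in `YYYYMMDD` form.

```python
import os


def load_credentials(env=os.environ):
    """Return (client_id, client_secret) read from env, or raise if either is missing."""
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None


# Hypothetical API version date (YYYYMMDD); use the date you developed against.
VERSION = "20200801"

# Demonstration with a stand-in mapping instead of the real environment.
demo_env = {
    "FOURSQUARE_CLIENT_ID": "my-id",
    "FOURSQUARE_CLIENT_SECRET": "my-secret",
}
CLIENT_ID, CLIENT_SECRET = load_credentials(demo_env)
```

In the notebook itself, calling `load_credentials()` with no argument would read from the real environment, so the secret never appears in a cell or its output.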


In [5]:
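import requests

# Query parameters for the Foursquare v2 "explore" endpoint.
# CLIENT_ID, CLIENT_SECRET and VERSION are assumed to be defined in the credentials cell above.
request_parameters = {
    "client_id": CLIENT_ID,
    "client_secret": CLIENT_SECRET,
    "v": VERSION,
    "section": "coffee",       # restrict results to coffee venues
    "near": "Cambridge, UK",   # place name, geocoded by Foursquare
    "radius": 1000,            # search radius in metres
    "limit": 1000,             # maximum number of results requested
}

data = requests.get("https://api.foursquare.com/v2/venues/explore", params=request_parameters)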

In [6]:
d = data.json()["response"]
d.keys()

dict_keys(['suggestedFilters', 'geocode', 'headerLocation', 'headerFullLocation', 'headerLocationGranularity', 'query', 'totalResults', 'suggestedBounds', 'groups'])
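
The cells that follow assume the request succeeded. As a sketch, assuming the v2 JSON body carries a `meta` object with a numeric `code` next to `response`, the status could be checked before the payload is used. The helper name `unwrap_response` and both sample payloads are hypothetical and stand in for `data.json()`.

```python
def unwrap_response(payload):
    """Return payload["response"] if meta.code is 200, otherwise raise with the error detail."""
    meta = payload.get("meta", {})
    code = meta.get("code")
    if code != 200:
        detail = meta.get("errorDetail", "no detail provided")
        raise RuntimeError(f"Foursquare request failed (code={code}): {detail}")
    return payload["response"]


# Hypothetical payloads, shaped like a successful and a rejected reply.
ok_payload = {"meta": {"code": 200}, "response": {"groups": []}}
bad_payload = {
    "meta": {"code": 400, "errorDetail": "example error"},
    "response": {},
}

d_checked = unwrap_response(ok_payload)
print(list(d_checked.keys()))

try:
    unwrap_response(bad_payload)
except RuntimeError as err:
    print(err)
```

Failing early with the server's own message is usually easier to debug than a later `KeyError` on `"groups"`.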

In [29]:
# the venues are nested under the first result group
items = d["groups"][0]["items"]

df_raw = []
for item in items:
    venue = item["venue"]
    categories, uid, name, location = venue["categories"], venue["id"], venue["name"], venue["location"]
    # each venue is expected to carry exactly one category
    assert len(categories) == 1
    shortname = categories[0]["shortName"]

    # skip venues that lack a street address or a postcode
    if "address" not in location or "postalCode" not in location:
        continue
    address = location["address"]
    postalcode = location["postalCode"]
    lat = location["lat"]
    lng = location["lng"]

    datarow = (uid, name, shortname, address, postalcode, lat, lng)
    df_raw.append(datarow)

# one row per cafe that has complete address information
df = pd.DataFrame(df_raw, columns=["uid", "name", "shortname", "address", "postalcode", "lat", "lng"])
print("found %i cafes" % len(df))
df.head()

found 36 cafes


|   | uid | name | shortname | address | postalcode | lat | lng |
|---|-----|------|-----------|---------|------------|-----|-----|
| 0 | 5499607c498e80876defe57d | Hot Numbers | Coffee Shop | 4 Trumpington St | CB2 1QA | 52.198515 | 0.121923 |
| 1 | 5184c967498e0b8ab6cae4ba | Aromi | Café | 1 Bene’t Street | CB2 3QN | 52.204278 | 0.118949 |
| 2 | 4c01057d19d8c928e2258829 | Savino's | Café | 3 Emmanuel St | CB1 1NE | 52.204327 | 0.123427 |
| 3 | 4bb20522f964a52083b23ce3 | Michaelhouse Café | Café | St Michael's Church | CB2 1SU | 52.206008 | 0.118124 |
| 4 | 586a4fb4561ded113af769fd | Bould Brothers Coffee | Coffee Shop | 16 Round Church St | CB5 8AD | 52.208555 | 0.118677 |
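
The parsing cell silently drops venues without an address or postcode and stops with an `AssertionError` if a venue has more or fewer than one category. A more tolerant variant is sketched below as an alternative, not as the code that produced the table above. The helper `venue_to_row` is hypothetical, taking the first category instead of asserting exactly one is a design assumption, and the second sample venue is invented to exercise the skip path (the first reuses the Hot Numbers row).

```python
def venue_to_row(venue):
    """Flatten one venue dict into a row tuple, or return None if required fields are missing."""
    location = venue.get("location", {})
    categories = venue.get("categories", [])
    required = ("address", "postalCode", "lat", "lng")
    if not categories or any(key not in location for key in required):
        return None
    return (
        venue["id"],
        venue["name"],
        categories[0]["shortName"],  # assumption: the first category is the primary one
        location["address"],
        location["postalCode"],
        location["lat"],
        location["lng"],
    )


sample_items = [
    {"venue": {"id": "5499607c498e80876defe57d", "name": "Hot Numbers",
               "categories": [{"shortName": "Coffee Shop"}],
               "location": {"address": "4 Trumpington St", "postalCode": "CB2 1QA",
                            "lat": 52.198515, "lng": 0.121923}}},
    # invented venue with no address or postcode, so it is skipped
    {"venue": {"id": "hypothetical-id", "name": "No Address Cafe",
               "categories": [{"shortName": "Café"}],
               "location": {"lat": 52.2, "lng": 0.12}}},
]

parsed = [venue_to_row(item["venue"]) for item in sample_items]
rows = [row for row in parsed if row is not None]
skipped = len(parsed) - len(rows)
print("kept %i, skipped %i" % (len(rows), skipped))
```

Reporting the number of skipped venues makes it visible how much of the raw result set the address filter removes; the `rows` list can be passed to `pd.DataFrame` with the same column names as before.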


In [30]:
d["geocode"]

{'what': '',
 'where': 'cambridge uk',
 'center': {'lat': 52.2, 'lng': 0.11667},
 'displayString': 'Cambridge, Cambridgeshire, United Kingdom',
 'cc': 'GB',
 'geometry': {'bounds': {'ne': {'lat': 52.23097999999993,
    'lng': 0.17491199999983564},
   'sw': {'lat': 52.17158499999988, 'lng': 0.09852149999989024}}},
 'slug': 'cambridge-united-kingdom',
 'longId': '72057594040581877'}

In [38]:
cambridge_centre = d["geocode"]["center"]

from folium import plugins

# create map of Cambridge using latitude and longitude values
cambridge_map = folium.Map(location = [cambridge_centre["lat"], cambridge_centre["lng"]], zoom_start = 15)

def add_markers(df):
    for (j, row) in df.iterrows():
        label = folium.Popup(row["name"], parse_html = True)
        folium.CircleMarker(
            [row["lat"], row["lng"]],
            radius = 6,
            popup = label,
            color = 'blue',
            fill = True,
            fill_color = '#3186cc',
            fill_opacity = 0.7,
            parse_html = False).add_to(cambridge_map)

add_markers(df)
# as_matrix() was removed in pandas 1.0; .values works across versions
hm_data = df[["lat", "lng"]].values.tolist()
cambridge_map.add_child(plugins.HeatMap(hm_data))

cambridge_map

