# Capstone Project - Cafes in Helsinki


## Introduction / business problem

If someone wants to open a café in the city of Helsinki, where would you recommend that they open it?
The background of the problem is that a café needs enough customers to be profitable, and a café in the immediate proximity of existing ones has to compete with them for the same customers, so it is not worth setting one up there.

The intended audience is local restaurant entrepreneurs in Helsinki. They should care about this problem because the location of a new café has a significant impact on its expected returns.

# Data


The data used to solve this problem is geolocation data collected from Foursquare. It is stored in a single dataframe with one row per café. The essential field is the location of the café, given as a standard tuple (lat, lng), where lat stands for latitude and lng for longitude. Other metadata, such as name, category, address and postal code, is also collected, but it is not strictly necessary for the analysis. Example of the data:

| Identifier | Name | Shortname | Address | Postalcode | Latitude | Longitude |
|---------------|-------|-----------|-----------|---------------|-----------|-----------|
| 4ddd2d44b0fba481fc927360	| Patisserie Teemu & Markus | Bakery |	Yrjönkatu 25	| 00100 |	60.167899 |	24.938190 |
| 50f688d5e4b023d2f274b506	|	Kaffecentralen	|	Coffee Shop	|	Fredrikinkatu 59	|	00100	|	60.167580	|	24.932526	|
| 5aec747112f0a9002c9b92ab	|	La Torrefazione	|	Café	|	Mannerheimintie 22	|	00100	|	60.170721	|	24.936158	|
| 4b4cb879f964a520c0bb26e3	|	The Ounce	|	Tea Room	|	Fredrikinkatu 55	|	00100	|	60.167182	|	24.932993	|
| 4af1c9e2f964a52031e321e3	|	La Torrefazione	|	Coffee Shop	|	Aleksanterinkatu 50	|	00100	|	60.168877	|	24.943845	|

The data will be used in the following way: knowing the locations of the existing cafés, it is possible to apply an unsupervised learning technique such as kernel density estimation (KDE) to determine their area of influence, and then to place the new café outside that area.
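To make the KDE idea concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the method used later in this notebook (which relies on the folium heatmap): the function names `haversine_m` and `cafe_density`, the Gaussian kernel and the 150 m bandwidth are all choices made for this example, and the input is just the five sample rows from the table above.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cafe_density(lat, lng, cafes, bandwidth_m=150.0):
    """Gaussian kernel density estimate at (lat, lng), as a density per square metre.

    bandwidth_m is an assumed smoothing radius, not a value taken from the data.
    """
    norm = 1.0 / (2 * math.pi * bandwidth_m ** 2 * len(cafes))
    return norm * sum(
        math.exp(-0.5 * (haversine_m(lat, lng, c_lat, c_lng) / bandwidth_m) ** 2)
        for c_lat, c_lng in cafes
    )


# (lat, lng) of the five sample cafes shown in the table above
cafes = [
    (60.167899, 24.938190),
    (60.167580, 24.932526),
    (60.170721, 24.936158),
    (60.167182, 24.932993),
    (60.168877, 24.943845),
]

dense = cafe_density(60.167580, 24.932526, cafes)   # at Kaffecentralen, close to The Ounce
sparse = cafe_density(60.160000, 24.960000, cafes)  # arbitrary point away from the sample cafes
print(dense, sparse)
```

A low density value marks a point that is outside the estimated area of influence of the sample cafés. The bandwidth controls how far that influence is assumed to reach, so the result is sensitive to it.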

In [1]:
!conda install -c conda-forge folium=0.5.0 --yes


In [2]:
import pandas as pd
import folium

In [3]:
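import os
import requests

# Credentials are read from environment variables instead of being hard-coded.
# The variable names FOURSQUARE_CLIENT_ID / FOURSQUARE_CLIENT_SECRET are a convention chosen here.
request_parameters = {
    "client_id": os.environ["FOURSQUARE_CLIENT_ID"],
    "client_secret": os.environ["FOURSQUARE_CLIENT_SECRET"],
    "v": "20180605",       # API version date
    "section": "coffee",   # restrict results to coffee venues
    "near": "Helsinki",
    "radius": 1000,        # search radius in metres
    "limit": 50,           # maximum number of venues returned
}

data = requests.get("https://api.foursquare.com/v2/venues/explore", params=request_parameters)
data.raise_for_status()  # fail early on an HTTP error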

In [4]:
d = data.json()["response"]
d.keys()

dict_keys(['suggestedFilters', 'geocode', 'headerLocation', 'headerFullLocation', 'headerLocationGranularity', 'query', 'totalResults', 'suggestedBounds', 'groups'])

In [5]:
d["headerLocationGranularity"], d["headerLocation"], d["headerFullLocation"]

('city', 'Helsinki', 'Helsinki')

In [6]:
d["suggestedBounds"], d["totalResults"]

({'ne': {'lat': 60.178172775811575, 'lng': 24.952162607605953},
  'sw': {'lat': 60.160364637231424, 'lng': 24.921636447256862}},
 85)

In [7]:
d["geocode"]

{'what': '',
 'where': 'helsinki',
 'center': {'lat': 60.16952, 'lng': 24.93545},
 'displayString': 'Helsinki, Finland',
 'cc': 'FI',
 'geometry': {'bounds': {'ne': {'lat': 60.29788198177167,
    'lng': 25.2415192349548},
   'sw': {'lat': 60.12648197436881, 'lng': 24.8240229894074}}},
 'slug': 'helsinki-finland',
 'longId': '72057594038586161'}

In [8]:
d["groups"][0].keys()

dict_keys(['type', 'name', 'items'])

In [9]:
d["groups"][0]["type"], d["groups"][0]["name"]

('Recommended Places', 'recommended')

In [10]:
items = d["groups"][0]["items"]
print("number of items: %i" % len(items))
items[0]

number of items: 50


{'reasons': {'count': 0,
  'items': [{'summary': 'This spot is popular',
    'type': 'general',
    'reasonName': 'globalInteractionReason'}]},
 'venue': {'id': '4ddd2d44b0fba481fc927360',
  'name': 'Patisserie Teemu & Markus',
  'location': {'address': 'Yrjönkatu 25',
   'lat': 60.16789868904846,
   'lng': 24.938189914522386,
   'labeledLatLngs': [{'label': 'display',
     'lat': 60.16789868904846,
     'lng': 24.938189914522386}],
   'postalCode': '00100',
   'cc': 'FI',
   'city': 'Helsinki',
   'state': 'Uusimaa',
   'country': 'Suomi',
   'formattedAddress': ['Yrjönkatu 25', '00100 Helsinki', 'Suomi']},
  'categories': [{'id': '4bf58dd8d48988d16a941735',
    'name': 'Bakery',
    'pluralName': 'Bakeries',
    'shortName': 'Bakery',
    'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/bakery_',
     'suffix': '.png'},
    'primary': True}],
  'photos': {'count': 0, 'groups': []}},
 'referralId': 'e-5-4ddd2d44b0fba481fc927360-0'}

In [11]:
items[1]

{'reasons': {'count': 0,
  'items': [{'summary': 'This spot is popular',
    'type': 'general',
    'reasonName': 'globalInteractionReason'}]},
 'venue': {'id': '50f688d5e4b023d2f274b506',
  'name': 'Kaffecentralen',
  'location': {'address': 'Fredrikinkatu 59',
   'crossStreet': 'Kansakoulukatu',
   'lat': 60.167580051384675,
   'lng': 24.932525558737044,
   'labeledLatLngs': [{'label': 'display',
     'lat': 60.167580051384675,
     'lng': 24.932525558737044}],
   'postalCode': '00100',
   'cc': 'FI',
   'city': 'Helsinki',
   'state': 'Uusimaa',
   'country': 'Suomi',
   'formattedAddress': ['Fredrikinkatu 59 (Kansakoulukatu)',
    '00100 Helsinki',
    'Suomi']},
  'categories': [{'id': '4bf58dd8d48988d1e0931735',
    'name': 'Coffee Shop',
    'pluralName': 'Coffee Shops',
    'shortName': 'Coffee Shop',
    'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/coffeeshop_',
     'suffix': '.png'},
    'primary': True}],
  'photos': {'count': 0, 'groups': []}},
 'referra

# Methodology 

The methodology in this project consists of two parts:
* Exploratory Data Analysis: Visualise the coffee shops in Helsinki to understand which zone is most suitable for opening a café
* Modelling: To help people find the best location, I generated a heatmap that shows which zones have few cafés in the immediate vicinity and gives an overall picture of the area the cafés cover
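The heatmap is read visually in this notebook. As a rough numeric counterpart, the sketch below scans a grid over the `suggestedBounds` returned by the API and counts the cafés within a fixed radius of each cell centre. It is only an illustration: the function names, the 20 x 20 grid and the 200 m radius are assumptions made for this example, and the input is the five sample cafés from the Data section rather than the full result set.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def distance_m(lat1, lng1, lat2, lng2):
    """Approximate distance in metres using a local flat (equirectangular) projection.

    Accurate enough at city scale; not suitable for long distances.
    """
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lng2 - lng1) * math.cos(mean_lat) * EARTH_RADIUS_M
    dy = math.radians(lat2 - lat1) * EARTH_RADIUS_M
    return math.hypot(dx, dy)


def count_nearby(lat, lng, cafes, radius_m):
    """Number of cafes within radius_m of the point (lat, lng)."""
    return sum(1 for c_lat, c_lng in cafes if distance_m(lat, lng, c_lat, c_lng) <= radius_m)


def scan_grid(sw, ne, cafes, steps=20, radius_m=200.0):
    """Return (count, lat, lng) for every grid cell centre, fewest nearby cafes first."""
    cells = []
    for i in range(steps):
        lat = sw[0] + (i + 0.5) * (ne[0] - sw[0]) / steps
        for j in range(steps):
            lng = sw[1] + (j + 0.5) * (ne[1] - sw[1]) / steps
            cells.append((count_nearby(lat, lng, cafes, radius_m), lat, lng))
    return sorted(cells)


# (lat, lng) of the five sample cafes from the Data section
cafes = [
    (60.167899, 24.938190),
    (60.167580, 24.932526),
    (60.170721, 24.936158),
    (60.167182, 24.932993),
    (60.168877, 24.943845),
]

# Corners of the suggestedBounds reported by the API, rounded
sw = (60.160365, 24.921636)
ne = (60.178173, 24.952163)

ranked = scan_grid(sw, ne, cafes)
print("emptiest cell:", ranked[0])
print("busiest cell:", ranked[-1])
```

A count of zero only says that no café lies within the radius. It says nothing about foot traffic, so the emptiest cells still need to be checked against the map.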

In [12]:
rows = []
for item in items:
    venue = item["venue"]
    location = venue["location"]
    # Skip venues that have no postal code
    if "postalCode" not in location:
        continue
    # Each venue is expected to have exactly one category
    categories = venue["categories"]
    assert len(categories) == 1
    rows.append((
        venue["id"],
        venue["name"],
        categories[0]["shortName"],
        location["address"],
        location["postalCode"],
        location["lat"],
        location["lng"],
    ))
df = pd.DataFrame(rows, columns=["uid", "name", "shortname", "address", "postalcode", "lat", "lng"])
print("found %i cafes" % len(df))
df.head()

found 47 cafes


|   | uid | name | shortname | address | postalcode | lat | lng |
|---|-----|------|-----------|---------|------------|-----|-----|
| 0 | 4ddd2d44b0fba481fc927360 | Patisserie Teemu & Markus | Bakery | Yrjönkatu 25 | 00100 | 60.167899 | 24.938190 |
| 1 | 50f688d5e4b023d2f274b506 | Kaffecentralen | Coffee Shop | Fredrikinkatu 59 | 00100 | 60.167580 | 24.932526 |
| 2 | 5aec747112f0a9002c9b92ab | La Torrefazione | Café | Mannerheimintie 22 | 00100 | 60.170721 | 24.936158 |
| 3 | 4b4cb879f964a520c0bb26e3 | The Ounce | Tea Room | Fredrikinkatu 55 | 00100 | 60.167182 | 24.932993 |
| 4 | 556f2874498e103ac120a121 | Kissakahvila Helkatti | Pet Café | Fredrikinkatu 55 | 00100 | 60.167274 | 24.933142 |


In [13]:
helsinki_center = d["geocode"]["center"]
helsinki_center

{'lat': 60.16952, 'lng': 24.93545}

In [15]:
from folium import plugins

map_helsinki = folium.Map(location=[helsinki_center["lat"], helsinki_center["lng"]], zoom_start=14)

def add_markers(df):
    for (j, row) in df.iterrows():
        label = folium.Popup(row["name"], parse_html=True)
        folium.CircleMarker(
            [row["lat"], row["lng"]],
            radius=5,
            popup=label,
            color='blue',
            fill=True,
            fill_color='#3186cc',
            fill_opacity=0.7,
            parse_html=False).add_to(map_helsinki)

add_markers(df)
# [lat, lng] pairs for the heatmap layer
hm_data = df[["lat", "lng"]].values.tolist()
map_helsinki.add_child(plugins.HeatMap(hm_data))

map_helsinki



In [16]:
lat = 60.168749
lng = 24.945747
map_helsinki = folium.Map(location=[lat, lng], zoom_start=17)
add_markers(df)
folium.CircleMarker(
    [lat, lng],
    radius=15,
    popup="Our Cafe!",
    color='red',
    fill=True,
    fill_color='#3186cc',
    fill_opacity=0.7,
    parse_html=False).add_to(map_helsinki)
map_helsinki

# Results and Discussion 

The approach used in this project can help open any kind of shop in any city. For example, if a person is looking to open a different kind of business, such as a bike store, a simple change in the Foursquare call would return the locations of the existing bike stores and so help choose a site for the new shop.
As our aim was to find the best spot for a café, we chose the city center as the target area and, using the heatmap, found a spot with few coffee shops nearby and close to two busy streets: between Aleksanterinkatu and Mikonkatu.
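
One way to sanity-check a candidate spot is to measure its distance to the known cafés. The sketch below does this for the coordinates of the proposed location (60.168749, 24.945747) against the five sample cafés listed in the Data section. It is a hedged illustration: `haversine_m` is a helper written for this example, and the full dataset would need to be used for a real check.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# (name, lat, lng) of the five sample cafes from the Data section
cafes = [
    ("Patisserie Teemu & Markus", 60.167899, 24.938190),
    ("Kaffecentralen", 60.167580, 24.932526),
    ("La Torrefazione", 60.170721, 24.936158),
    ("The Ounce", 60.167182, 24.932993),
    ("La Torrefazione", 60.168877, 24.943845),
]

# Proposed location for the new cafe, as plotted on the map
candidate = (60.168749, 24.945747)

# Sort the sample cafes by distance from the candidate, nearest first
ranked = sorted(
    (haversine_m(candidate[0], candidate[1], lat, lng), name)
    for name, lat, lng in cafes
)
for dist, name in ranked:
    print("%6.0f m  %s" % (dist, name))
```

The nearest entries show which existing cafés the new one would compete with most directly.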


# Conclusion 

This project helps a person get a better understanding of the coffee economy in Helsinki. It is always helpful to make use of technology to stay one step ahead.

For any kind of shop, further value could be added by overlaying more map layers (e.g. public transportation routes or the nearest shopping malls), which could help predict more accurately whether a shop can succeed in a particular area.