# Capstone Project - The Battle of the Neighborhoods (Week 2)

## Table of Contents

1. Introduction: Business Problem
2. Foursquare API Explore Function
3. Data
4. Methodology
5. Analysis
6. Results and Discussion
7. Conclusion


## Introduction: Business Problem
In this project we will try to find an optimal location for opening a restraunt in **Amsterdam, Netherlands** near *Central Railway Station*. Our sole sole objective is to find location that is surrounded by least number of restaraunts and is nearest to *Central Railway Station* which is the famous transport center in Europe. Since there are lots of restaurants in this location we will try to detect locations that are not already crowded with restaurants. We are also particularly interested in areas with no restaurants in vicinity. We would also prefer locations as close to  *Central Railway Station* of Amsterdam as possible, assuming that first two conditions are met.

We will use our data science powers to generate a few most promissing neighborhoods based on this criteria. Advantages of each area will then be clearly expressed so that best possible final location can be chosen by stakeholders.We will use our data science powers to generate a few most promissing neighborhoods based on this criteria. Advantages of each area will then be clearly expressed so that best possible final location can be chosen by stakeholders.

## Data

Based on the definition of our problem, factors that will influence our decision are:

 - Number of existing restaurants in the neighborhood (any type of restaurant)
 - Distance of neighborhood from *Central Railway Station of Amsterdam*
 
We decided to use regularly spaced grid of locations, centered around *Central Railway Station of Amsterdam*, to define our neighborhoods.

Following data sources will be needed to extract/generate the required information:

 - centers of candidate areas will be generated algorithmically and approximate addresses of centers of those areas will be obtained using **Google Maps API reverse geocoding**
 - number of restaurants and their type and location in every neighborhood will be obtained using **Foursquare API**
 - coordinate of *Central Railway Station of Amsterdam* will be obtained using **Google Maps API geocoding**

Firstly we will do a general exploration using Foursquare API search query of food.

### Import necessary libraries

In [1]:
import requests # library to handle requests
import pandas as pd # library for data analsysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# tranforming json file into a pandas dataframe library
from pandas.io.json import json_normalize

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - geopy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    geopy-1.20.0               |             py_0          57 KB  conda-forge
    geographiclib-1.49         |             py_0          32 KB  conda-forge
    certifi-2019.6.16          |           py36_1         149 KB  conda-forge
    ca-certificates-2019.6.16  |       hecc5488_0         145 KB  conda-forge
    openssl-1.1.1c             |       h516909a_0         2.1 MB  conda-forge
    ------------------------------------------------------------
                                           Total:         2.5 MB

The following NEW packages will be INSTALLED:

    geographiclib:   1.49-py_0         conda-forge
    geopy:           1.20.0-py_0       conda-forge

The following packages will be UPDATED:

    ca-

### Foursquare Credentials and Version

In [2]:
CLIENT_ID = 'SYEWADAPWXPOGCEVE32FQSELE4CBCEABDFCQZKRKHS4NGOHE'
CLIENT_SECRET = 'VEDXRHYYJ1S34VOZZDH3APR5F2PHH32R2WUEMPX51AGUKAWL'
VERSION = '20190729'
LIMIT = 100
radius = 4000
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: SYEWADAPWXPOGCEVE32FQSELE4CBCEABDFCQZKRKHS4NGOHE
CLIENT_SECRET:VEDXRHYYJ1S34VOZZDH3APR5F2PHH32R2WUEMPX51AGUKAWL


#### To get geographical coordinates of Amsterdam Centraal, Netherlands

In [3]:
address = 'Amsterdam Centraal, Netherlands'

geolocator = Nominatim(user_agent="amsterdam_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print("The geographical coordinates of Amsterdam Centraal is {}, {}.".format(latitude, longitude))

The geographical coordinates of Amsterdam Centraal is 52.378901, 4.9005805.


#### Since we are planning to open a new restaurant, we will use Foursquare API to run a search query of food. 

In [4]:
search_query = 'food'
print(search_query + ' .... OK!')

food .... OK!


In [5]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=SYEWADAPWXPOGCEVE32FQSELE4CBCEABDFCQZKRKHS4NGOHE&client_secret=VEDXRHYYJ1S34VOZZDH3APR5F2PHH32R2WUEMPX51AGUKAWL&ll=52.378901,4.9005805&v=20190729&query=food&radius=4000&limit=100'

#### Get data in the form of json file

In [6]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5d3ef04ca30619002c094a76'},
 'response': {'venues': [{'id': '504dc9f3e4b00c01095766a0',
    'name': 'Food Corner',
    'location': {'address': 'Dam Square',
     'lat': 52.37352715384458,
     'lng': 4.893949262019628,
     'labeledLatLngs': [{'label': 'display',
       'lat': 52.37352715384458,
       'lng': 4.893949262019628}],
     'distance': 748,
     'cc': 'NL',
     'country': 'Nederland',
     'formattedAddress': ['Dam Square', 'Nederland']},
    'categories': [{'id': '4bf58dd8d48988d120951735',
      'name': 'Food Court',
      'pluralName': 'Food Courts',
      'shortName': 'Food Court',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/shops/food_foodcourt_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1564405836',
    'hasPerk': False},
   {'id': '5877c5c1e9dad12ac9e9aa0e',
    'name': 'STACH  food',
    'location': {'address': '1e Constantijn Huijgens',
     'crossStreet': 'Overtoom',
     'lat

In [7]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# tranform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

Unnamed: 0,categories,hasPerk,id,location.address,location.cc,location.city,location.country,location.crossStreet,location.distance,location.formattedAddress,location.labeledLatLngs,location.lat,location.lng,location.neighborhood,location.postalCode,location.state,name,referralId,venuePage.id
0,"[{'id': '4bf58dd8d48988d120951735', 'name': 'F...",False,504dc9f3e4b00c01095766a0,Dam Square,NL,,Nederland,,748,"[Dam Square, Nederland]","[{'label': 'display', 'lat': 52.37352715384458...",52.373527,4.893949,,,,Food Corner,v-1564405836,
1,"[{'id': '4bf58dd8d48988d1f5941735', 'name': 'G...",False,5877c5c1e9dad12ac9e9aa0e,1e Constantijn Huijgens,NL,Amsterdam,Nederland,Overtoom,2490,"[1e Constantijn Huijgens (Overtoom), Amsterdam...","[{'label': 'display', 'lat': 52.36285262494781...",52.362853,4.875043,,,Noord-Holland,STACH food,v-1564405836,
2,"[{'id': '4bf58dd8d48988d16a941735', 'name': 'B...",False,52d7ec76498e19770b7d649b,Stormsteeg 8,NL,Amsterdam,Nederland,,655,"[Stormsteeg 8, 1012 BD Amsterdam, Nederland]","[{'label': 'display', 'lat': 52.37406002111417...",52.37406,4.895095,,1012 BD,Noord-Holland,St. Anny Food,v-1564405836,
3,"[{'id': '50be8ee891d4fa8dcc7199a7', 'name': 'M...",False,4c9f2aa103133704e7ca6fd5,Jan van Galenstraat 4,NL,Amsterdam,Nederland,,2267,"[Jan van Galenstraat 4, 1051 KM Amsterdam, Ned...","[{'label': 'display', 'lat': 52.3762512, 'lng'...",52.376251,4.86749,,1051 KM,Noord-Holland,Food Center Amsterdam,v-1564405836,
4,"[{'id': '4bf58dd8d48988d120951735', 'name': 'F...",False,5cbb2697c876c80039d5e8d3,Nieuwezijds Voorburgwal 182,NL,Amsterdam,Nederland,,876,"[Nieuwezijds Voorburgwal 182, 1012 SJ Amsterda...","[{'label': 'display', 'lat': 52.373764, 'lng':...",52.373764,4.890815,,1012 SJ,Noord-Holland,The Food Department,v-1564405836,545894791.0


In [8]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns]

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered.head(10)

Unnamed: 0,name,categories,address,cc,city,country,crossStreet,distance,formattedAddress,labeledLatLngs,lat,lng,neighborhood,postalCode,state,id
0,Food Corner,Food Court,Dam Square,NL,,Nederland,,748,"[Dam Square, Nederland]","[{'label': 'display', 'lat': 52.37352715384458...",52.373527,4.893949,,,,504dc9f3e4b00c01095766a0
1,STACH food,Gourmet Shop,1e Constantijn Huijgens,NL,Amsterdam,Nederland,Overtoom,2490,"[1e Constantijn Huijgens (Overtoom), Amsterdam...","[{'label': 'display', 'lat': 52.36285262494781...",52.362853,4.875043,,,Noord-Holland,5877c5c1e9dad12ac9e9aa0e
2,St. Anny Food,Bakery,Stormsteeg 8,NL,Amsterdam,Nederland,,655,"[Stormsteeg 8, 1012 BD Amsterdam, Nederland]","[{'label': 'display', 'lat': 52.37406002111417...",52.37406,4.895095,,1012 BD,Noord-Holland,52d7ec76498e19770b7d649b
3,Food Center Amsterdam,Market,Jan van Galenstraat 4,NL,Amsterdam,Nederland,,2267,"[Jan van Galenstraat 4, 1051 KM Amsterdam, Ned...","[{'label': 'display', 'lat': 52.3762512, 'lng'...",52.376251,4.86749,,1051 KM,Noord-Holland,4c9f2aa103133704e7ca6fd5
4,The Food Department,Food Court,Nieuwezijds Voorburgwal 182,NL,Amsterdam,Nederland,,876,"[Nieuwezijds Voorburgwal 182, 1012 SJ Amsterda...","[{'label': 'display', 'lat': 52.373764, 'lng':...",52.373764,4.890815,,1012 SJ,Noord-Holland,5cbb2697c876c80039d5e8d3
5,STACH Food,Organic Grocery,Haarlemmerstraat 150,NL,Amsterdam,Nederland,,801,"[Haarlemmerstraat 150, 1013 EZ Amsterdam, Nede...","[{'label': 'display', 'lat': 52.38139046900278...",52.38139,4.889511,,1013 EZ,Noord-Holland,52a3050011d2ae60be6128a0
6,PLAYERS Food & Drinks,Restaurant,Kleine Gartmanplantsoen 25,NL,Amsterdam,Nederland,at Leidsekruisstraat,2087,[Kleine Gartmanplantsoen 25 (at Leidsekruisstr...,"[{'label': 'display', 'lat': 52.36311062437967...",52.363111,4.884,,1017 RP,Noord-Holland,4b3633ddf964a520093125e3
7,Homemade - Fresh Food & Drinks,Sandwich Place,Singel 447,NL,Amsterdam,Nederland,Amsterdam,1460,"[Singel 447 (Amsterdam), 1012 WP Amsterdam, Ne...","[{'label': 'display', 'lat': 52.36729984225847...",52.3673,4.890536,,1012 WP,Noord-Holland,4ad9d80cf964a520311b21e3
8,Halal Food,Falafel Restaurant,Damrak,NL,Amsterdam,Nederland,,526,"[Damrak, Amsterdam, Nederland]","[{'label': 'display', 'lat': 52.37524869072558...",52.375249,4.895655,Stadsdeel Centrum,,Noord-Holland,50a8eeb6e4b0740a188b900d
9,Vegan Junk Food Bar,Vegetarian / Vegan Restaurant,,NL,Amsterdam,Nederland,,1521,"[1017 BE Amsterdam, Nederland]","[{'label': 'display', 'lat': 52.36639117639653...",52.366391,4.891563,,1017 BE,Noord-Holland,5c24d2a0e4c459002c9c4e72


In [9]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=13) 

# add a red circle marker to represent the Taj
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Amsterdam Centraal',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

# add the restaurants as blue circle markers
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

# display map
venues_map

In [10]:
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/explore?client_id=SYEWADAPWXPOGCEVE32FQSELE4CBCEABDFCQZKRKHS4NGOHE&client_secret=VEDXRHYYJ1S34VOZZDH3APR5F2PHH32R2WUEMPX51AGUKAWL&ll=52.378901,4.9005805&v=20190729&radius=4000&limit=100'

## Get details of venues near Amsterdam Centraal

In [11]:
import requests

In [12]:
results = requests.get(url).json()

#### Get relevant part of JSON

In [13]:
items = results['response']['groups'][0]['items']
items[0]

{'reasons': {'count': 0,
  'items': [{'summary': 'This spot is popular',
    'type': 'general',
    'reasonName': 'globalInteractionReason'}]},
 'venue': {'id': '55c10ee5498ecb81fe809f9d',
  'name': 'Omelegg - City Centre',
  'location': {'address': 'Nieuwebrugsteeg 24',
   'crossStreet': 'warmoesstraat',
   'lat': 52.37606,
   'lng': 4.899802,
   'labeledLatLngs': [{'label': 'display', 'lat': 52.37606, 'lng': 4.899802}],
   'distance': 320,
   'postalCode': '1012 AH',
   'cc': 'NL',
   'city': 'Amsterdam',
   'state': 'Noord-Holland',
   'country': 'Nederland',
   'formattedAddress': ['Nieuwebrugsteeg 24 (warmoesstraat)',
    '1012 AH Amsterdam',
    'Nederland']},
  'categories': [{'id': '4bf58dd8d48988d143941735',
    'name': 'Breakfast Spot',
    'pluralName': 'Breakfast Spots',
    'shortName': 'Breakfast',
    'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/breakfast_',
     'suffix': '.png'},
    'primary': True}],
  'photos': {'count': 0, 'groups': []},
  'venue

In [14]:
dataframe = json_normalize(items) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories'] + [col for col in dataframe.columns if col.startswith('venue.location.')] + ['venue.id']
dataframe_filtered = dataframe.loc[:, filtered_columns]

# filter the category for each row
dataframe_filtered['venue.categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean columns
dataframe_filtered.columns = [col.split('.')[-1] for col in dataframe_filtered.columns]

dataframe_filtered.head(10)

Unnamed: 0,name,categories,address,cc,city,country,crossStreet,distance,formattedAddress,labeledLatLngs,lat,lng,neighborhood,postalCode,state,id
0,Omelegg - City Centre,Breakfast Spot,Nieuwebrugsteeg 24,NL,Amsterdam,Nederland,warmoesstraat,320,"[Nieuwebrugsteeg 24 (warmoesstraat), 1012 AH A...","[{'label': 'display', 'lat': 52.37606, 'lng': ...",52.37606,4.899802,,1012 AH,Noord-Holland,55c10ee5498ecb81fe809f9d
1,art'otel,Hotel,Prins Hendrikkade 33,NL,Amsterdam,Nederland,Martelaarsgracht 5,271,"[Prins Hendrikkade 33 (Martelaarsgracht 5), 10...","[{'label': 'display', 'lat': 52.37764, 'lng': ...",52.37764,4.89716,,1012 TM,Noord-Holland,51d532e9e4b098c3c66f49d9
2,Toastable,Sandwich Place,Nieuwendijk 6,NL,Amsterdam,Nederland,,392,"[Nieuwendijk 6, 1012 MK Amsterdam, Nederland]","[{'label': 'display', 'lat': 52.37879049945435...",52.37879,4.894805,,1012 MK,Noord-Holland,561e39f5498e059a294a7913
3,Jacketz,Restaurant,32 Nieuwendijk,NL,Amsterdam,Nederland,Engelsteeg,358,"[32 Nieuwendijk (Engelsteeg), 1012 ML Amsterda...","[{'label': 'display', 'lat': 52.37822855550258...",52.378229,4.895425,Stadsdeel Centrum,1012 ML,Noord-Holland,5900ec9898fbfc4fa94b2f39
4,SkyLounge Amsterdam,Hotel Bar,Oosterdoksstraat 4,NL,Amsterdam,Nederland,,385,"[Oosterdoksstraat 4, 1011 DK Amsterdam, Nederl...","[{'label': 'display', 'lat': 52.37686021496292...",52.37686,4.90516,,1011 DK,Noord-Holland,4de6588a18389f0558772bbf
5,Choux,Restaurant,De Ruijterkade 128,NL,Amsterdam,Nederland,,473,"[De Ruijterkade 128, 1011AC Amsterdam, Nederland]","[{'label': 'display', 'lat': 52.37769306062365...",52.377693,4.90726,,1011AC,Noord-Holland,55674fe7498e0a4781b0f3d8
6,Bierproeflokaal In de Wildeman,Bar,Kolksteeg 3,NL,Amsterdam,Nederland,,475,"[Kolksteeg 3, 1012 PT Amsterdam, Nederland]","[{'label': 'display', 'lat': 52.37622160710857...",52.376222,4.895134,,1012 PT,Noord-Holland,4a26ffccf964a52011811fe3
7,Ashoka Restaurant -Amsterdam Centrum,Indian Restaurant,Spuistraat 3G,NL,Amsterdam,Nederland,,476,"[Spuistraat 3G, 1012 SP Amsterdam, Nederland]","[{'label': 'display', 'lat': 52.37709413565575...",52.377094,4.894229,Stadsdeel Centrum,1012 SP,Noord-Holland,53e0cca8498ed3c8aa65b287
8,De Koffieschenkerij,Coffee Shop,Oudekerksplein 23,NL,Amsterdam,Nederland,Oudezijds Voorburgwal,560,"[Oudekerksplein 23 (Oudezijds Voorburgwal), 10...","[{'label': 'display', 'lat': 52.37404292238903...",52.374043,4.898427,,1012 GX,Noord-Holland,51d98583498e78da6626be60
9,Cannibale Royale,Burger Joint,Lange Niezel 15,NL,Amsterdam,Nederland,,449,"[Lange Niezel 15, 1012 GS Amsterdam, Nederland]","[{'label': 'display', 'lat': 52.37503133828516...",52.375031,4.89869,De Wallen,1012 GS,Noord-Holland,57910d45498eeaf3c1c08128


Since around 4km in radius Amsterdam Centraal, there are many restaraunts so will try to find most optimal location for it. We would also prefer locations as close to Amsterdam Centraal as possible.We will divide area around Amsterdam Centraal into grids upto 6km radius and try to find ares with least number of restraunts in those grids. Let's create latitude & longitude coordinates for centroids of our candidate neighborhoods. We will create a grid of cells covering our area of interest which is aprox. 12x12 killometers centered around Amsterdam Centraal.

In [15]:
centraal = [52.3791, 4.9003]

Let's create a hexagonal grid of cells: we offset every other row, and adjust vertical row spacing so that every cell center is equally distant from all it's neighborhoods.

In [16]:
!pip install shapely
import shapely.geometry

!pip install pyproj
import pyproj

import math

def lonlat_to_xy(lon, lat):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    xy = pyproj.transform(proj_latlon, proj_xy, lon, lat)
    return xy[0], xy[1]

def xy_to_lonlat(x, y):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    return lonlat[0], lonlat[1]

def calc_xy_distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)

print('Coordinate transformation check')
print('-------------------------------')
print('centraal longitude={}, latitude={}'.format(centraal[1], centraal[0]))
x, y = lonlat_to_xy(centraal[1], centraal[0])
print('centraal UTM X={}, Y={}'.format(x, y))
lo, la = xy_to_lonlat(x, y)
print('centraal longitude={}, latitude={}'.format(lo, la))

Collecting shapely
[?25l  Downloading https://files.pythonhosted.org/packages/38/b6/b53f19062afd49bb5abd049aeed36f13bf8d57ef8f3fa07a5203531a0252/Shapely-1.6.4.post2-cp36-cp36m-manylinux1_x86_64.whl (1.5MB)
[K     |████████████████████████████████| 1.5MB 2.3MB/s eta 0:00:01
[?25hInstalling collected packages: shapely
Successfully installed shapely-1.6.4.post2
Collecting pyproj
[?25l  Downloading https://files.pythonhosted.org/packages/16/59/43869adef45ce4f1cf7d5c3aef1ea5d65d449050abdda5de7a2465c5729d/pyproj-2.2.1-cp36-cp36m-manylinux1_x86_64.whl (11.2MB)
[K     |████████████████████████████████| 11.2MB 2.0MB/s eta 0:00:01
[?25hInstalling collected packages: pyproj
Successfully installed pyproj-2.2.1
Coordinate transformation check
-------------------------------
centraal longitude=4.9003, latitude=52.3791
centraal UTM X=-186556.6743409466, Y=5851350.986540761
centraal longitude=4.900300000000002, latitude=52.3791


In [17]:
centraal_x, centraal_y = lonlat_to_xy(centraal[1], centraal[0]) # centraal in Cartesian coordinates

k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_min = centraal_x - 6000
x_step = 600
y_min = centraal_y - 6000 - (int(21/k)*k*600 - 12000)/2
y_step = 600 * k 

latitudes = []
longitudes = []
distances_from_center = []
xs = []
ys = []
for i in range(0, int(21/k)):
    y = y_min + i * y_step
    x_offset = 300 if i%2==0 else 0
    for j in range(0, 21):
        x = x_min + j * x_step + x_offset
        distance_from_center = calc_xy_distance(centraal_x, centraal_y, x, y)
        if (distance_from_center <= 6001):
            lon, lat = xy_to_lonlat(x, y)
            latitudes.append(lat)
            longitudes.append(lon)
            distances_from_center.append(distance_from_center)
            xs.append(x)
            ys.append(y)

print(len(latitudes), 'candidate neighborhood centers generated.')


364 candidate neighborhood centers generated.


In [18]:
#!pip install folium

import folium

In [19]:
map_centraal = folium.Map(location=centraal, zoom_start=12)
folium.Marker(centraal, popup='Amsterdam Centraal').add_to(map_centraal)
for lat, lon in zip(latitudes, longitudes):
    folium.Circle([lat, lon], radius=300, color='blue', fill=False).add_to(map_centraal)
map_centraal

In [20]:
google_api_key='AIzaSyB8QdOk3UbmTiC4xoS4nGnM9ZvQYX9nVFk'
def get_address(api_key, latitude, longitude, verbose=False):
    try:
        url = 'https://maps.googleapis.com/maps/api/geocode/json?key={}&latlng={},{}'.format(api_key, latitude, longitude)
        response = requests.get(url).json()
        if verbose:
            print('Google Maps API JSON result =>', response)
        results = response['results']
        address = results[0]['formatted_address']
        return address
    except:
        return None

addr = get_address(google_api_key, centraal[0], centraal[1])
print('Reverse geocoding check')
print('-----------------------')
print('Address of [{}, {}] is: {}'.format(centraal[0], centraal[1], addr))


Reverse geocoding check
-----------------------
Address of [52.3791, 4.9003] is: None


In [21]:
google_api_key='AIzaSyB8QdOk3UbmTiC4xoS4nGnM9ZvQYX9nVFk'
print('Obtaining location addresses: ', end='')
addresses = []
for lat, lon in zip(latitudes, longitudes):
    address = get_address(google_api_key, lat, lon)
    if address is None:
        address = 'NO ADDRESS'
    address = address.replace(', Germany', '') # We don't need country part of address
    addresses.append(address)
    print(' .', end='')
print(' done.')

Obtaining location addresses:  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . done.


In [22]:
import pandas as pd

df_locations = pd.DataFrame({'Address': addresses,
                             'Latitude': latitudes,
                             'Longitude': longitudes,
                             'X': xs,
                             'Y': ys,
                             'Distance from center': distances_from_center})

df_locations.head(10)

Unnamed: 0,Address,Latitude,Longitude,X,Y,Distance from center
0,NO ADDRESS,52.326262,4.885943,-188356.674341,5845635.0,5992.495307
1,NO ADDRESS,52.327012,4.89461,-187756.674341,5845635.0,5840.3767
2,NO ADDRESS,52.327761,4.903278,-187156.674341,5845635.0,5747.173218
3,NO ADDRESS,52.328509,4.911946,-186556.674341,5845635.0,5715.767665
4,NO ADDRESS,52.329257,4.920615,-185956.674341,5845635.0,5747.173218
5,NO ADDRESS,52.330004,4.929283,-185356.674341,5845635.0,5840.3767
6,NO ADDRESS,52.33075,4.937953,-184756.674341,5845635.0,5992.495307
7,NO ADDRESS,52.329736,4.871882,-189256.674341,5846155.0,5855.766389
8,NO ADDRESS,52.330486,4.880549,-188656.674341,5846155.0,5604.462508
9,NO ADDRESS,52.331236,4.889217,-188056.674341,5846155.0,5408.326913


In [23]:
df_locations.shape

(364, 6)

In [24]:
df_locations.to_pickle('./locations.pkl')

In [25]:
food_category = '4d4b7105d754a06374d81259' # 'Root' category for all food-related venues

def is_restaurant(categories, specific_filter=None):
    restaurant_words = ['restaurant', 'diner', 'taverna', 'steakhouse']
    restaurant = False
    specific = False
    for c in categories:
        category_name = c[0].lower()
        category_id = c[1]
        for r in restaurant_words:
            if r in category_name:
                restaurant = True
        if 'fast food' in category_name:
            restaurant = False
        if not(specific_filter is None) and (category_id in specific_filter):
            specific = True
            restaurant = True
    return restaurant, specific

def get_categories(categories):
    return [(cat['name'], cat['id']) for cat in categories]

def format_address(location):
    address = ', '.join(location['formattedAddress'])
    address = address.replace(', India', '')
    return address

def get_venues_near_location(lat, lon, category, CLIENT_ID, CLIENT_SECRET, radius=500, limit=100):
    version = '20180729'
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&categoryId={}&radius={}&limit={}'.format(
        CLIENT_ID, CLIENT_SECRET, version, lat, lon, category, radius, limit)
    try:
        results = requests.get(url).json()['response']['groups'][0]['items']
        venues = [(item['venue']['id'],
                   item['venue']['name'],
                   get_categories(item['venue']['categories']),
                   (item['venue']['location']['lat'], item['venue']['location']['lng']),
                   format_address(item['venue']['location']),
                   item['venue']['location']['distance']) for item in results]        
    except:
        venues = []
    return venues

In [26]:
import pickle

def get_restaurants(lats, lons):
    restaurants = {}
    location_restaurants = []

    print('Obtaining venues around candidate locations:', end='')
    for lat, lon in zip(lats, lons):
        # Using radius=350 to meke sure we have overlaps/full coverage so we don't miss any restaurant (we're using dictionaries to remove any duplicates resulting from area overlaps)
        venues = get_venues_near_location(lat, lon, food_category, CLIENT_ID, CLIENT_SECRET, radius=350, limit=100)
        area_restaurants = []
        for venue in venues:
            venue_id = venue[0]
            venue_name = venue[1]
            venue_categories = venue[2]
            venue_latlon = venue[3]
            venue_address = venue[4]
            venue_distance = venue[5]
            is_res = is_restaurant(venue_categories)
            if is_res:
                x, y = lonlat_to_xy(venue_latlon[1], venue_latlon[0])
                restaurant = (venue_id, venue_name, venue_latlon[0], venue_latlon[1], venue_address, venue_distance, x, y)
                if venue_distance<=300:
                    area_restaurants.append(restaurant)
                restaurants[venue_id] = restaurant
        location_restaurants.append(area_restaurants)
        print(' .', end='')
    print(' done.')
    return restaurants, location_restaurants

# Try to load from local file system in case we did this before
restaurants = {}
location_restaurants = []
loaded = False
try:
    with open('restaurants_350.pkl', 'rb') as f:
        restaurants = pickle.load(f)
    with open('location_restaurants_350.pkl', 'rb') as f:
        location_restaurants = pickle.load(f)
    print('Restaurant data loaded.')
    loaded = True
except:
    pass

# If load failed use the Foursquare API to get the data
if not loaded:
    restaurants, location_restaurants = get_restaurants(latitudes, longitudes)
    
    # Let's persists this in local file system
    with open('restaurants_350.pkl', 'wb') as f:
        pickle.dump(restaurants, f)
    with open('location_restaurants_350.pkl', 'wb') as f:
        pickle.dump(location_restaurants, f)

Obtaining venues around candidate locations: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . done.


In [27]:
import numpy as np

print('Total number of restaurants:', len(restaurants))
print('Average number of restaurants in neighborhood:', np.array([len(r) for r in location_restaurants]).mean())

Total number of restaurants: 2156
Average number of restaurants in neighborhood: 5.293956043956044


## Analysis
Let's perform some basic explanatory data analysis and derive some additional info from our raw data. First let's count **the number of restaurants in every area candidate**:

In [28]:
location_restaurants_count = [len(res) for res in location_restaurants]

df_locations['Restaurants in area'] = location_restaurants_count

print('Average number of restaurants in every area with radius = 300m:', np.array(location_restaurants_count).mean())

df_locations.head(10)

Average number of restaurants in every area with radius = 300m: 5.293956043956044


Unnamed: 0,Address,Latitude,Longitude,X,Y,Distance from center,Restaurants in area
0,NO ADDRESS,52.326262,4.885943,-188356.674341,5845635.0,5992.495307,7
1,NO ADDRESS,52.327012,4.89461,-187756.674341,5845635.0,5840.3767,1
2,NO ADDRESS,52.327761,4.903278,-187156.674341,5845635.0,5747.173218,0
3,NO ADDRESS,52.328509,4.911946,-186556.674341,5845635.0,5715.767665,1
4,NO ADDRESS,52.329257,4.920615,-185956.674341,5845635.0,5747.173218,0
5,NO ADDRESS,52.330004,4.929283,-185356.674341,5845635.0,5840.3767,3
6,NO ADDRESS,52.33075,4.937953,-184756.674341,5845635.0,5992.495307,3
7,NO ADDRESS,52.329736,4.871882,-189256.674341,5846155.0,5855.766389,4
8,NO ADDRESS,52.330486,4.880549,-188656.674341,5846155.0,5604.462508,13
9,NO ADDRESS,52.331236,4.889217,-188056.674341,5846155.0,5408.326913,6


In [32]:
df=df_locations[(df_locations['Restaurants in area'] == 0) & (df_locations['Distance from center']<=1500)].reset_index()
df.drop('index',inplace=True,axis=1)
df

Unnamed: 0,Address,Latitude,Longitude,X,Y,Distance from center,Restaurants in area
0,NO ADDRESS,52.387173,4.88516,-187456.674341,5852390.0,1374.772708,0
1,NO ADDRESS,52.387923,4.893839,-186856.674341,5852390.0,1081.665383,0
2,NO ADDRESS,52.388673,4.902519,-186256.674341,5852390.0,1081.665383,0
3,NO ADDRESS,52.389422,4.911199,-185656.674341,5852390.0,1374.772708,0


In [33]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map centred around the Taj

# add a red circle marker to represent the Taj
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Amsterdam Centraal',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

# add the restaurants candidants as blue circle markers
for lat, lng in zip(df.Latitude, df.Longitude):
    folium.features.CircleMarker(
        [lat, lng],
        radius=10,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

# display map
venues_map

## Results

Our analysis shows that there are many restraunts near Amsterdam Centraal and we were mainly concerned about finding best suitable location that is not already crowded with restraunts and is nearest to Amsterdam Centraal. Total number (~2156 in our initial area of interest which was 12x12km around Amsterdam Centraal), there are large pockets of low restraunt density fairly close to Amsterdam Centraal. Highest concentration of restraunts was detected north from Amsterdam Centraal, so we found out the neighbourhoods with no restaraunt and distance from Amsterdam Centraal less than 1.5 km and found out 4 such suitable neighbourhoods.

Result of all these zones containing largest number of potential new restraunt location. Purpose of this analysis was to provide information on areas close to Amsterdam Centraal.It is entirely possible that there is a very good reason for small number of restaraunts in any of those areas, reasons which would make them unsuitable for a new restaurant regardless of lack of competition in the area. Recommended zones should therefore be considered only as a starting point for more detailed analysis which could eventually result in location which has not only no nearby competition but also other factors taken into account and all other relevant conditions met.

## Conclusion
Purpose of this project was to identify Amsterdam areas close to Amsterdam Centraal with low number of restaurants in order to aid stakeholders in narrowing down the search for optimal location for a new restaurant. By calculating restaurant density distribution from Foursquare data we have first identified coordinates and neighbourhoods that justify further analysis and then generated extensive collection of locations which satisfy some basic requirements regarding existing nearby restaurants.Final decission on optimal restaurant location will be made by stakeholders based on specific characteristics of neighborhoods and locations in every recommended zone, taking into consideration additional factors like attractiveness of each location (proximity to park or water), levels of noise / proximity to major roads, real estate availability, prices, social and economic dynamics of every neighborhood etc.