# Capstone Project - The Battle of the Neighborhoods (Week 2)

### Table of contents
#### 1 - Introduction: Business Problem
#### 2 - Data
#### 3 - Methodology
#### 4 - Analysis
#### 5 - Results and Discussion
#### 6 - Conclusion

### 1 - Introduction: Business Problem

#### We want to find all the theatres near “Rotunda do Marquês”, in Lisbon.
#### Then we pick the one with the highest rating on Foursquare and choose the restaurant closest to that theatre.

In [78]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

# transform JSON data into a pandas dataframe
try:
    from pandas import json_normalize # pandas >= 1.0
except ImportError:
    from pandas.io.json import json_normalize # older pandas versions

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle HTTP requests

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes 
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


### 2 - Data

#### Using a dataset to find the theatre coordinates

#### Load and explore the data from:
#### http://geodados.cm-lisboa.pt/datasets/teatros/data

In [2]:
import pandas as pd

url="https://raw.githubusercontent.com/a-teresa/Coursera_Capstone/master/Teatros.csv"

df_theatres = pd.read_csv(url)
df_theatres.head()

Unnamed: 0,X,Y,OBJECTID,COD_SIG,IDTIPO,INF_NOME,INF_MORADA,INF_TELEFONE,INF_FAX,INF_EMAIL,INF_SITE,INF_DESCRICAO,INF_FONTE,INF_MUNICIPAL,DTM_UPD,GlobalID
0,-9.192913,38.760653,1,1102316001001004,999,Teatro Armando Cortez,"Estrada da Pontinha, 7",+351 217 110 890 | +351 912 342 334,,geral@casadoartista.net,http://www.casadoartista.net/,Apoiarte - Lotação/Capacidade 300 lugares.,-,0,2017-05-16T10:31:43.000Z,68ac8659-d122-44ca-aec7-c1070e316642
1,-9.211664,38.694272,2,3200616034001001,999,Gabinete Curiosidades Karnart,"Avenida da Índia, 168",+351 213 466 411/+351 914 150 935,,geral@karnart.org,www.karnart.org,O Gabinete Curiosidades Karnart é a sede da KA...,www.karnart.org,1,2019-08-21T11:31:17.000Z,4685f80e-8009-463b-897d-a53b6df3a1bd
2,-9.202426,38.750718,3,807103013001012,999,Teatro Turim,"Estrada de Benfica, 723A",+351 217 606 666,,geral@teatroturim.com,www.teatroturim.com,,-,0,2012-11-22T00:00:00.000Z,1f5b122f-e5b6-48ae-b65d-cc7b44dd3f3d
3,-9.14578,38.712304,4,2801402006003001,999,Teatro do Bairro,"Rua Luz Soriano, 63",+351 213 473 358,,teatrodobairro.geral@sapo.pt,www.teatrodobairro.org,"O Teatro do Bairro naceu em 2011, e, criou-se ...",-,0,2019-03-12T17:24:17.000Z,647fde91-d7e5-46cd-99be-220047f3fd60
4,-9.131669,38.711071,5,3400404018001002,999,Café - Teatro Santiago Alquimista,"Rua de Santiago, 19",21 888 45 03,,mail@santiagoalquimista.com,www.santiagoalquimista.com/,,http://agendalx.pt/cgi-bin/iportal_agendalx,0,2018-11-23T12:32:57.000Z,8303a955-32b4-49f4-90d7-bf389c56df8a


##### Clean the data (keep only the coordinates, name, and address):

In [3]:
# drop the columns that are not needed for locating the theatres
columns_to_drop = ["DTM_UPD", "GlobalID", "INF_MUNICIPAL", "INF_FONTE", "INF_DESCRICAO", "INF_SITE", "INF_EMAIL", "INF_FAX", "INF_TELEFONE", "IDTIPO", "COD_SIG"]
df_theatres = df_theatres.drop(columns_to_drop, axis=1)

In [262]:
df_theatres.head()

Unnamed: 0,X,Y,OBJECTID,INF_NOME,INF_MORADA
0,-9.192913,38.760653,1,Teatro Armando Cortez,"Estrada da Pontinha, 7"
1,-9.211664,38.694272,2,Gabinete Curiosidades Karnart,"Avenida da Índia, 168"
2,-9.202426,38.750718,3,Teatro Turim,"Estrada de Benfica, 723A"
3,-9.14578,38.712304,4,Teatro do Bairro,"Rua Luz Soriano, 63"
4,-9.131669,38.711071,5,Café - Teatro Santiago Alquimista,"Rua de Santiago, 19"


### 3 - Methodology

In [5]:
!pip install shapely
import shapely.geometry

!pip install pyproj
import pyproj

import math

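# Caution: Lisbon (longitude of about -9.15 degrees) lies in UTM zone 29, not 33.
# zone=33 is the value that produced the outputs recorded below.
def lonlat_to_xy(lon, lat):
    """Project WGS84 longitude/latitude (degrees) to UTM x/y (metres)."""
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    xy = pyproj.transform(proj_latlon, proj_xy, lon, lat)
    return xy[0], xy[1]

def xy_to_lonlat(x, y):
    """Convert UTM x/y (metres) back to WGS84 longitude/latitude (degrees)."""
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    return lonlat[0], lonlat[1]

def calc_xy_distance(x1, y1, x2, y2):
    """Euclidean distance between two points in projected coordinates."""
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)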




In [13]:
lisbon_center = 38.7254678,-9.1497845
address = 'Rotunda do Marquês'
print('Coordinate of {}: {}'.format(address, lisbon_center))

Coordinate of Rotunda do Marquês: (38.7254678, -9.1497845)


In [14]:
print('Coordinate transformation check')
print('-------------------------------')
print('Lisbon center longitude={}, latitude={}'.format(lisbon_center[1], lisbon_center[0]))
x, y = lonlat_to_xy(lisbon_center[1], lisbon_center[0])
print('Lisbon center UTM X={}, Y={}'.format(x, y))
lo, la = xy_to_lonlat(x, y)
print('Lisbon center longitude={}, latitude={}'.format(lo, la))

Coordinate transformation check
-------------------------------
Lisbon center longitude=-9.1497845, latitude=38.7254678
Lisbon center UTM X=-1611513.2175671104, Y=4574244.947137361
Lisbon center longitude=-9.1497845, latitude=38.725467800000004
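
The negative UTM X value above is worth a second look: within its own zone, a UTM easting is measured from a false easting of 500 000 m at the central meridian, so it is normally positive. The sketch below uses only the standard library to estimate which 6-degree zone a longitude falls in. The helper names `utm_zone_for_longitude` and `utm_central_meridian` are illustrative and not part of the notebook, and the special zone exceptions around Norway and Svalbard are ignored.

```python
def utm_zone_for_longitude(lon_deg):
    """Return the standard 6-degree UTM zone number (1-60) for a longitude in degrees."""
    # Wrap the longitude into [-180, 180) so that 180 maps to zone 1
    lon = ((lon_deg + 180.0) % 360.0) - 180.0
    return int((lon + 180.0) // 6) + 1


def utm_central_meridian(zone):
    """Return the central meridian (degrees) of a UTM zone."""
    return zone * 6 - 183


lisbon_lon = -9.1497845  # longitude of Rotunda do Marquês used in this notebook
zone = utm_zone_for_longitude(lisbon_lon)
print('UTM zone:', zone, '| central meridian:', utm_central_meridian(zone))
```

Under these assumptions the roundabout falls in zone 29, whose central meridian is at -9 degrees, so passing `zone=29` to `pyproj.Proj` would probably give less distorted distances than `zone=33`. The recorded outputs in this notebook were produced with `zone=33`.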


In [15]:
lisbon_center_x, lisbon_center_y = lonlat_to_xy(lisbon_center[1], lisbon_center[0]) # City center in Cartesian coordinates

k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_min = lisbon_center_x - 3000
x_step = 300
y_min = lisbon_center_y - 3000 - (int(21/k)*k*300 - 6000)/2
y_step = 300 * k 

latitudes = []
longitudes = []
distances_from_center = []
xs = []
ys = []
for i in range(0, int(21/k)):
    y = y_min + i * y_step
    x_offset = 150 if i%2==0 else 0
    for j in range(0, 21):
        x = x_min + j * x_step + x_offset
        distance_from_center = calc_xy_distance(lisbon_center_x, lisbon_center_y, x, y)
        if (distance_from_center <= 3001):
            lon, lat = xy_to_lonlat(x, y)
            latitudes.append(lat)
            longitudes.append(lon)
            distances_from_center.append(distance_from_center)
            xs.append(x)
            ys.append(y)

print(len(latitudes), 'candidate neighborhood centers generated.')

364 candidate neighborhood centers generated.
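
The grid above shifts every other row by half a step and spaces rows by `300 * sqrt(3) / 2`, which is what makes each point equidistant from its six neighbours. The following sketch checks that property on a small grid using only the standard library. The functions `hex_grid` and `neighbour_distances` are hypothetical helpers written for this illustration, with the 300 m spacing taken from the cell above.

```python
import math


def hex_grid(x_min, y_min, n_cols, n_rows, spacing):
    """Generate points of a hexagonal grid, offsetting even rows by half a step."""
    k = math.sqrt(3) / 2  # ratio of row spacing to column spacing
    points = []
    for i in range(n_rows):
        y = y_min + i * spacing * k
        x_offset = spacing / 2 if i % 2 == 0 else 0.0
        for j in range(n_cols):
            points.append((x_min + j * spacing + x_offset, y))
    return points


def neighbour_distances(points, index):
    """Return the sorted distances from points[index] to every other point."""
    px, py = points[index]
    return sorted(math.hypot(px - qx, py - qy)
                  for n, (qx, qy) in enumerate(points) if n != index)


grid = hex_grid(0.0, 0.0, n_cols=5, n_rows=5, spacing=300.0)
centre_index = 12  # middle point of the 5 x 5 grid
six_nearest = neighbour_distances(grid, centre_index)[:6]
print([round(d, 3) for d in six_nearest])
```

With a circle of radius 150 m drawn around each point, as in the map cell that follows, neighbouring circles touch without overlapping.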


In [16]:
map_lisbon = folium.Map(location=lisbon_center, zoom_start=13)
folium.Marker(lisbon_center, popup='Rotunda do Marquês').add_to(map_lisbon)
for lat, lon in zip(latitudes, longitudes):
    #folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_lisbon) 
    folium.Circle([lat, lon], radius=150, color='blue', fill=False).add_to(map_lisbon)
    #folium.Marker([lat, lon]).add_to(map_lisbon)
map_lisbon

In [17]:
# instantiate a feature group for the theatre coordinates (X = longitude, Y = latitude)
theatres = folium.map.FeatureGroup()

# loop through the theatres and add each to the theatres feature group
for lat, lng in zip(df_theatres.Y, df_theatres.X):
    theatres.add_child(
        folium.features.CircleMarker(
            [lat, lng],
            radius=10, # marker radius in pixels
            color='red',
            fill=True,
            fill_color='yellow',
            fill_opacity=0.6
        )
    )

# add the theatres layer to the map
map_lisbon.add_child(theatres)
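
The map shows every theatre in the dataset, while the business problem asks only for those near “Rotunda do Marquês”. One possible way to narrow the list, sketched here with the standard library only, is to keep the theatres whose great-circle distance to the roundabout is at most 3000 m, the same radius used for the grid. The names `haversine_m` and `theatres_within` are hypothetical, the Earth is assumed to be a sphere of radius 6371 km, and the sample contains only the five rows shown by `df_theatres.head()`.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; assumes a spherical Earth


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def theatres_within(theatres, center, max_distance_m):
    """Return the names of the theatres within max_distance_m of center."""
    center_lat, center_lon = center
    return [name for name, lat, lon in theatres
            if haversine_m(center_lat, center_lon, lat, lon) <= max_distance_m]


lisbon_center = (38.7254678, -9.1497845)  # Rotunda do Marquês (latitude, longitude)

# (INF_NOME, Y, X) for the five rows shown by df_theatres.head()
sample_theatres = [
    ('Teatro Armando Cortez', 38.760653, -9.192913),
    ('Gabinete Curiosidades Karnart', 38.694272, -9.211664),
    ('Teatro Turim', 38.750718, -9.202426),
    ('Teatro do Bairro', 38.712304, -9.14578),
    ('Café - Teatro Santiago Alquimista', 38.711071, -9.131669),
]

nearby = theatres_within(sample_theatres, lisbon_center, 3000)
print(nearby)
```

The same filter could be applied to the full dataframe by iterating over `zip(df_theatres.INF_NOME, df_theatres.Y, df_theatres.X)`.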


### 5 - Results and Discussion

In [18]:
# instantiate a feature group for the theatres in the dataframe
theatres = folium.map.FeatureGroup()

# loop through the theatres and add each to the theatres feature group
for lat, lng in zip(df_theatres.Y, df_theatres.X):
    theatres.add_child(
        folium.features.CircleMarker(
            [lat, lng],
            radius=10, # marker radius in pixels
            color='red',
            fill=True,
            fill_color='yellow',
            fill_opacity=0.6
        )
    )

# add pop-up text to each marker on the map
# (separate names keep the grid lists `latitudes` and `longitudes` intact)
theatre_latitudes = list(df_theatres.Y)
theatre_longitudes = list(df_theatres.X)
labels = list(df_theatres.INF_NOME)

for lat, lng, label in zip(theatre_latitudes, theatre_longitudes, labels):
    folium.Marker([lat, lng], popup=label).add_to(map_lisbon)

# add the theatres layer to the map
map_lisbon.add_child(theatres)

https://pt.foursquare.com/explore?mode=share&near=Lisboa%2C%20Portugal&nearGeoId=72057594040194993&q=teatro&vne=38.799316%2C-9.046726&vsw=38.649321%2C-9.25478&share=1&rid=5e909b99660a9f001b36c179

In [265]:
# @hidden_cell
# Replace the placeholders with your own credentials; do not publish real values
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180604' # Foursquare API version date
LIMIT = 30 # maximum number of venues returned per request
RADIUS = 150
#print('Your credentials:')
#print('CLIENT_ID: ' + CLIENT_ID)
#print('CLIENT_SECRET:' + CLIENT_SECRET)

#### Looking for the best-rated theatre on Foursquare:

https://pt.foursquare.com/explore?mode=url&near=Lisboa%2C%20Portugal&nearGeoId=72057594040194993&q=teatro

##### 1. São Luiz Teatro Municipal, Rua António Maria Cardoso, 38


### Finding restaurants nearby:

In [138]:
#São Luiz Teatro Municipal
latitude = 38.709214
longitude = -9.142291 
print(latitude, longitude)

38.709214 -9.142291


In [241]:
search_query = 'food'
radius = 500
print(search_query + ' .... OK!')

food .... OK!


In [266]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
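
Formatting the values straight into the URL works for a simple query such as `food`, but a query containing spaces or accented characters would need escaping. A minimal sketch of an alternative, assuming the same endpoint and parameter names as the URL above, builds the query string with `urllib.parse.urlencode`. The function `build_search_url` is a hypothetical helper and the credentials are placeholders.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint taken from the URL used in this notebook
SEARCH_ENDPOINT = 'https://api.foursquare.com/v2/venues/search'


def build_search_url(client_id, client_secret, latitude, longitude,
                     version, query, radius, limit):
    """Build a venue search URL with every parameter value percent-encoded."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': '{},{}'.format(latitude, longitude),
        'v': version,
        'query': query,
        'radius': radius,
        'limit': limit,
    }
    return SEARCH_ENDPOINT + '?' + urlencode(params)


# Placeholder credentials and an example query that contains a space
example_url = build_search_url('YOUR_CLIENT_ID', 'YOUR_CLIENT_SECRET',
                               38.709214, -9.142291, '20180604',
                               'comida portuguesa', 500, 30)

# Decode the query string again to confirm the values survive the round trip
decoded = parse_qs(urlsplit(example_url).query)
print(decoded['query'][0])
print(decoded['ll'][0])
```

When using `requests`, passing the same dictionary through the `params` argument of `requests.get` has a similar effect.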


In [242]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e90b37347e0d60026052bc9'},
 'response': {'venues': [{'id': '59496bc6829b0c4ac05a18bd',
    'name': 'Re-Food',
    'location': {'address': 'R. do Teixeira 9',
     'lat': 38.714367,
     'lng': -9.14468,
     'labeledLatLngs': [{'label': 'display',
       'lat': 38.714367,
       'lng': -9.14468}],
     'distance': 428,
     'postalCode': '1200-089',
     'cc': 'PT',
     'city': 'Lisboa',
     'state': 'Lisboa',
     'country': 'Portugal',
     'formattedAddress': ['R. do Teixeira 9', '1200-089 Lisboa', 'Portugal']},
    'categories': [{'id': '56aa371be4b08b9a8d573550',
      'name': 'Food Service',
      'pluralName': 'Food Services',
      'shortName': 'Food Service',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/shops/foodanddrink_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1586541500',
    'hasPerk': False},
   {'id': '5b632bb1d1a402002c86573f',
    'name': 'Stairwell - Wine Bar & Creative Food'

In [243]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

Unnamed: 0,categories,hasPerk,id,location.address,location.cc,location.city,location.country,location.distance,location.formattedAddress,location.labeledLatLngs,location.lat,location.lng,location.postalCode,location.state,name,referralId
0,"[{'id': '56aa371be4b08b9a8d573550', 'name': 'F...",False,59496bc6829b0c4ac05a18bd,R. do Teixeira 9,PT,Lisboa,Portugal,428,"[R. do Teixeira 9, 1200-089 Lisboa, Portugal]","[{'label': 'display', 'lat': 38.714367, 'lng':...",38.714367,-9.14468,1200-089,Lisboa,Re-Food,v-1586541500
1,"[{'id': '4bf58dd8d48988d123941735', 'name': 'W...",False,5b632bb1d1a402002c86573f,"Rua da Misericórdia, 139",PT,Lisboa,Portugal,381,"[Rua da Misericórdia, 139, 1200 Lisboa, Portugal]","[{'label': 'display', 'lat': 38.712764, 'lng':...",38.712764,-9.143336,1200,Lisboa,Stairwell - Wine Bar & Creative Food,v-1586541500
2,"[{'id': '4bf58dd8d48988d1f5941735', 'name': 'G...",False,4d03e08ce350b60cb1dd7f42,,PT,,Portugal,418,[Portugal],"[{'label': 'display', 'lat': 38.71101467110674...",38.711015,-9.139775,,,Penhas Douradas Food,v-1586541500
3,"[{'id': '52e81612bcbc57f1066b79ff', 'name': 'H...",False,5a85f6a81f74405d2196c5bb,,PT,Lisboa,Portugal,146,"[1150 Lisboa, Portugal]","[{'label': 'display', 'lat': 38.71527, 'lng': ...",38.71527,-9.138209,1150,Lisboa,Ahmad Halal Foods Restaurante,v-1586541500
4,"[{'id': '4def73e84765ae376e57713a', 'name': 'P...",False,5a302f4fa35dce6b0677e5d8,"Rua Nova da Trindade, 10",PT,Lisboa,Portugal,418,"[Rua Nova da Trindade, 10, 1200-108 Lisboa, Po...","[{'label': 'display', 'lat': 38.71154, 'lng': ...",38.71154,-9.142247,1200-108,Lisboa,Ofício,v-1586541500


In [244]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
# copy the selection so the assignments below do not modify a view of `dataframe`
dataframe_filtered = dataframe.loc[:, filtered_columns].copy()

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered

Unnamed: 0,name,categories,address,cc,city,country,distance,formattedAddress,labeledLatLngs,lat,lng,postalCode,state,id
0,Re-Food,Food Service,R. do Teixeira 9,PT,Lisboa,Portugal,428,"[R. do Teixeira 9, 1200-089 Lisboa, Portugal]","[{'label': 'display', 'lat': 38.714367, 'lng':...",38.714367,-9.14468,1200-089,Lisboa,59496bc6829b0c4ac05a18bd
1,Stairwell - Wine Bar & Creative Food,Wine Bar,"Rua da Misericórdia, 139",PT,Lisboa,Portugal,381,"[Rua da Misericórdia, 139, 1200 Lisboa, Portugal]","[{'label': 'display', 'lat': 38.712764, 'lng':...",38.712764,-9.143336,1200,Lisboa,5b632bb1d1a402002c86573f
2,Penhas Douradas Food,Gourmet Shop,,PT,,Portugal,418,[Portugal],"[{'label': 'display', 'lat': 38.71101467110674...",38.711015,-9.139775,,,4d03e08ce350b60cb1dd7f42
3,Ahmad Halal Foods Restaurante,Halal Restaurant,,PT,Lisboa,Portugal,146,"[1150 Lisboa, Portugal]","[{'label': 'display', 'lat': 38.71527, 'lng': ...",38.71527,-9.138209,1150,Lisboa,5a85f6a81f74405d2196c5bb
4,Ofício,Portuguese Restaurant,"Rua Nova da Trindade, 10",PT,Lisboa,Portugal,418,"[Rua Nova da Trindade, 10, 1200-108 Lisboa, Po...","[{'label': 'display', 'lat': 38.71154, 'lng': ...",38.71154,-9.142247,1200-108,Lisboa,5a302f4fa35dce6b0677e5d8


In [260]:
s = dataframe_filtered['distance']
s.min()

146
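
`s.min()` returns only the smallest distance, not the venue it belongs to. The sketch below shows one way to recover the venue itself, using plain Python and the names and distances of the five rows shown above. Foursquare reports `distance` in metres from the `ll` point of the request. The helper `closest_venue` is a hypothetical name introduced for this example.

```python
# Name and distance (metres) of the five venues shown in the table above
venues_sample = [
    {'name': 'Re-Food', 'distance': 428},
    {'name': 'Stairwell - Wine Bar & Creative Food', 'distance': 381},
    {'name': 'Penhas Douradas Food', 'distance': 418},
    {'name': 'Ahmad Halal Foods Restaurante', 'distance': 146},
    {'name': 'Ofício', 'distance': 418},
]


def closest_venue(venues):
    """Return the venue record with the smallest reported distance."""
    return min(venues, key=lambda venue: venue['distance'])


closest = closest_venue(venues_sample)
print(closest['name'], closest['distance'])
```

On the dataframe itself, `dataframe_filtered.loc[dataframe_filtered['distance'].idxmin()]` should return the corresponding row.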

### 6 - Conclusion

#### The stakeholder found the answers to both questions: the theatre is São Luiz Teatro Municipal and the closest restaurant is Ahmad Halal Foods Restaurante.
#### The results would differ under different assumptions (e.g. a different starting point, a different search radius, or a different criterion for the best-rated theatre or the closest restaurant).
#### New answers can be obtained under new assumptions using the same data and a similar methodology.