# Peer-graded Assignment: Capstone Project

# The Battle of Neighborhoods (Week 1) 

## Introduction/Business Problem

In this project I will try to find the __best metro station__ to get off at if I want to go for an __ice cream__ and then either enjoy the sun in a nice __outdoor space__ or go to a __museum__ instead, in __Lisbon__, __Portugal__.

Because I may be limited in terms of FourSquare quota, I only want to check the __red__ and __green__ lines.

## Data

### Libraries

A number of libraries will be needed in this project.

In [1]:
import numpy as np # library for vectorized computation
import pandas as pd # library to process data as dataframes
import folium
import requests
import json # library to handle JSON files
print('Libraries imported.')

Libraries imported.


### Lisbon Metro Stations 

Lisbon Metro has 4 lines (Green, Blue, Yellow and Red) and 50 stations, 6 of which are interchange stations serving two lines:
1. Alameda (Red and Green Line)
2. Marques de Pombal (Blue and Yellow Line)
3. Campo Grande (Green and Yellow Line)
4. Baixa-Chiado (Blue and Green Line)
5. Sao Sebastiao (Blue and Red Line)
6. Saldanha (Red and Yellow Line)
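
Once the stations are in a table with one row per station and line, the interchange stations can be found by counting the distinct lines per station name. The snippet below is only a sketch on a small hand-made sample: the `sample` DataFrame is an assumption for illustration, not the real station file.

```python
import pandas as pd

# Hand-made sample, one row per (station, line) pair.
sample = pd.DataFrame({
    "Name": ["Areeiro", "Alameda", "Alameda", "Aeroporto"],
    "Line": ["green", "green", "red", "red"],
})

# Count the distinct lines serving each station.
lines_per_station = sample.groupby("Name")["Line"].nunique()

# Interchange stations are the ones served by more than one line.
interchanges = sorted(lines_per_station[lines_per_station > 1].index)
print(interchanges)
```

Run on the full station table, the same two steps should list the six stations above, provided each of them appears once per line in the file.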

Let's import the coordinates of all of the stations (a CSV file in my GitHub repository).

In [2]:
url = "https://raw.githubusercontent.com/AnaMariaAnaMaria/Coursera_Capstone/master/MetroLx.csv"
headers = ["Abbrevation","Name","Line","Latitude","Longitude"]
coordinates = pd.read_csv(url,names = headers,skiprows=1)
coordinates

Unnamed: 0,Abbrevation,Name,Line,Latitude,Longitude
0,AE,Areeiro,green,38.742222,-9.134167
1,AF,Alfornelos,blue,38.760278,-9.205
2,AH,Alto dos Moinhos,blue,38.749444,-9.179444
3,AL,Alvalade,green,38.753333,-9.143889
4,AM,Alameda,green,38.736667,-9.133889
5,AM,Alameda,red,38.736667,-9.133889
6,AN,Anjos,green,38.726111,-9.134722
7,AP,Aeroporto,red,38.768611,-9.128611
8,AR,Arroios,green,38.733056,-9.133889
9,AS,Amadora Este,blue,38.757778,-9.218056


Let's create a map of Lisbon with all of the stations and their respective line (colour).
If the map is not visible, it can also be seen at https://github.com/AnaMariaAnaMaria/Coursera_Capstone/blob/master/LisbonMetroStations.JPG

In [3]:
#lisbon coordinates:
lx_lat = 38.736946
lx_long = -9.142685

# create map
map_lx = folium.Map(location=[lx_lat, lx_long], zoom_start=12)

# add markers to the map
for lat, lon, poi, line in zip(coordinates['Latitude'], coordinates['Longitude'],coordinates['Name'],coordinates['Line']):
    label = folium.Popup(str(poi)+' ('+str(line)+' line)' , parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=line,
        fill=True,
        fill_color=line,
        fill_opacity=0.7).add_to(map_lx)

map_lx

As noted in the introduction, I will only look at the __Red__ and __Green__ lines to stay within my FourSquare quota.

In [4]:
# keep only the red and green lines by dropping the blue and yellow rows
two_lines = coordinates[~coordinates.Line.isin(['blue', 'yellow'])]
two_lines

Unnamed: 0,Abbrevation,Name,Line,Latitude,Longitude
0,AE,Areeiro,green,38.742222,-9.134167
3,AL,Alvalade,green,38.753333,-9.143889
4,AM,Alameda,green,38.736667,-9.133889
5,AM,Alameda,red,38.736667,-9.133889
6,AN,Anjos,green,38.726111,-9.134722
7,AP,Aeroporto,red,38.768611,-9.128611
8,AR,Arroios,green,38.733056,-9.133889
13,BC,Baixa-Chiado,green,38.710556,-9.139444
14,BV,Bela Vista,red,38.746667,-9.116944
17,CG,Campo Grande,green,38.76,-9.157778


#### Finding Venues in FourSquare

Let's use FourSquare to find what venues exist near each green or red line station.

My credentials are loaded from a json file.

In [5]:
#credentials as per suggestion in https://www.coursera.org/learn/applied-data-science-capstone/discussions/weeks/3/threads/VCjKK35VEemkuBJz3kVAHA
with open('secrets.json') as f: # the file is closed as soon as it has been read
    secrets = json.load(f)
CLIENT_ID = secrets['CLIENT_ID']
CLIENT_SECRET = secrets['CLIENT_SECRET']
VERSION = secrets['VERSION']

print('Credentials loaded')

Credentials loaded
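
If the credentials file is missing a key, the error only shows up later as a `KeyError`. A small check right after loading makes the problem easier to spot. This is a sketch: `load_credentials` and `REQUIRED_KEYS` are names I made up, and the values written to the temporary file are placeholders, not real credentials.

```python
import json
import os
import tempfile

# The three keys read from secrets.json in this notebook.
REQUIRED_KEYS = ("CLIENT_ID", "CLIENT_SECRET", "VERSION")

def load_credentials(path):
    """Read a JSON credentials file and check that the expected keys exist."""
    with open(path, encoding="utf-8") as f:
        secrets = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in secrets]
    if missing:
        raise KeyError("Missing keys in {}: {}".format(path, ", ".join(missing)))
    return secrets

# Demonstration with placeholder values in a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "secrets.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"CLIENT_ID": "your-id",
                   "CLIENT_SECRET": "your-secret",
                   "VERSION": "20180605"}, f)  # placeholder values
    demo = load_credentials(demo_path)

print(sorted(demo))
```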


I am only interested in specific categories (from https://developer.foursquare.com/docs/resources/categories):

In [6]:
venues_cat = [
 '4bf58dd8d48988d1c9941735' # Ice Cream Shop
,'4bf58dd8d48988d165941735' # Scenic Lookout
,'4bf58dd8d48988d15a941735' # Garden
,'4bf58dd8d48988d163941735' # Park
,'4bf58dd8d48988d1e2931735' # Art Gallery
,'4bf58dd8d48988d190941735' # History Museum
,'4bf58dd8d48988d18f941735' # Art Museum
,'4fceea171983d5d06c3e9823' # Aquarium
,'50aaa49e4b90af0d42d5de11' # Castle
,'4bf58dd8d48988d191941735' # Science Museum
,'4bf58dd8d48988d15b941735' # Farm
,'4bf58dd8d48988d166941735' # Sculpture Garden
,'56aa371be4b08b9a8d5734db' # Amphitheater
,'4bf58dd8d48988d1e2941735' # Beach
,'4bf58dd8d48988d181941735' # Museum
]
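
Since the question is about ice cream plus either an outdoor space or a museum, it may help to map each category ID to one of those three themes. The IDs and names below are copied from the list above, but the theme given to each one is my own assumption, and I left out the categories I found ambiguous (Aquarium, Castle, Farm, Sculpture Garden, Amphitheater, Beach). `CATEGORY_THEME` and `theme_of` are illustrative names.

```python
# Category IDs from venues_cat; the theme per ID is an assumption.
CATEGORY_THEME = {
    '4bf58dd8d48988d1c9941735': 'ice cream',  # Ice Cream Shop
    '4bf58dd8d48988d165941735': 'outdoor',    # Scenic Lookout
    '4bf58dd8d48988d15a941735': 'outdoor',    # Garden
    '4bf58dd8d48988d163941735': 'outdoor',    # Park
    '4bf58dd8d48988d1e2931735': 'museum',     # Art Gallery
    '4bf58dd8d48988d190941735': 'museum',     # History Museum
    '4bf58dd8d48988d18f941735': 'museum',     # Art Museum
    '4bf58dd8d48988d191941735': 'museum',     # Science Museum
    '4bf58dd8d48988d181941735': 'museum',     # Museum
}

def theme_of(category_id):
    """Return the theme for a category ID, or 'other' if it is not mapped."""
    return CATEGORY_THEME.get(category_id, 'other')

print(theme_of('4bf58dd8d48988d163941735'))  # Park
```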

In [7]:
# function that extracts nearby venues, adapted from the week 3 lab
def getNearbyVenues(names, latitudes, longitudes, radius=1000,LIMIT=100):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['id'],
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Metro Station', 
                  'Metro Station Latitude', 
                  'Metro Station Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category Id',
                  'Venue Category']
    
    return nearby_venues
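
The function above reads `v['venue']['categories'][0]` directly, which would raise an `IndexError` for any venue returned with an empty category list. I have not checked how often that happens, so the following is only a defensive sketch. `venue_row` is a hypothetical helper, and the two items are hand-written to mimic the fields the function reads, not real API responses.

```python
def venue_row(station, lat, lng, item):
    """Flatten one response item into a tuple.

    Returns None when the venue has no category, so the caller can skip it.
    """
    venue = item.get('venue', {})
    categories = venue.get('categories') or []
    if not categories:
        return None
    location = venue.get('location', {})
    return (station, lat, lng,
            venue.get('name'),
            location.get('lat'), location.get('lng'),
            categories[0]['id'], categories[0]['name'])

# Hand-written items shaped like the fields used in getNearbyVenues.
with_category = {'venue': {
    'name': 'Casa do Gelado',
    'location': {'lat': 38.744872, 'lng': -9.139545},
    'categories': [{'id': '4bf58dd8d48988d1c9941735', 'name': 'Ice Cream Shop'}]}}
without_category = {'venue': {
    'name': 'Uncategorised place',
    'location': {'lat': 38.74, 'lng': -9.13},
    'categories': []}}

rows = [venue_row('Areeiro', 38.742222, -9.134167, item)
        for item in (with_category, without_category)]
rows = [row for row in rows if row is not None]  # drop skipped venues
print(len(rows))
```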

In [8]:
lx_venues = getNearbyVenues(names=two_lines['Name'],
                                   latitudes=two_lines['Latitude'],
                                   longitudes=two_lines['Longitude']
                                  )

Areeiro
Alvalade
Alameda
Alameda
Anjos
Aeroporto
Arroios
Baixa-Chiado
Bela Vista
Campo Grande
Chelas
Cabo Ruivo
Cais do Sodre
Encarnacao
Intendente
Martim Moniz
Moscavide
Olaias
Oriente
Olivais
Roma
Rossio
Saldanha
Sao Sebastiao
Telheiras
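
Alameda is printed twice because it has one row per line (green and red), so it is queried twice with the same coordinates. Since I am worried about quota, one option would be to drop the duplicate rows before calling FourSquare. A minimal sketch on a hand-made sample (`stations` is not the real table):

```python
import pandas as pd

# Hand-made sample with Alameda listed once per line.
stations = pd.DataFrame({
    "Name": ["Alameda", "Alameda", "Anjos"],
    "Line": ["green", "red", "green"],
    "Latitude": [38.736667, 38.736667, 38.726111],
    "Longitude": [-9.133889, -9.133889, -9.134722],
})

# Keep one row per physical station, ignoring the line colour.
unique_stations = stations.drop_duplicates(subset=["Name", "Latitude", "Longitude"])
print(list(unique_stations["Name"]))
```

The trade-off is that the `Line` column of the kept row only reflects one of the two lines, which does not matter for the venue search itself.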


Let's save the list to a file (tab-separated, despite the .csv extension).

In [10]:
lx_venues.to_csv('test.csv', sep='\t', encoding='utf-8')
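
Because the file is written with a tab separator, it has to be read back with the same separator. The sketch below shows the round trip on a one-row sample, using an in-memory buffer instead of a real file. `venues` and `restored` are illustrative names.

```python
import io
import pandas as pd

venues = pd.DataFrame({"Venue": ["Casa do Gelado"],
                       "Venue Category": ["Ice Cream Shop"]})

# Write tab-separated text to an in-memory buffer.
buffer = io.StringIO()
venues.to_csv(buffer, sep="\t")
buffer.seek(0)

# Read it back: same separator, first column is the saved index.
restored = pd.read_csv(buffer, sep="\t", index_col=0)
print(restored)
```

Without `index_col=0` the saved index would come back as an extra column named `Unnamed: 0`.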

The list of venues has 2,103 rows and 8 columns.


In [11]:
lx_venues.shape

(2103, 8)

I am only interested in the categories mentioned above - venues_cat.

There are 147 venues in these categories.

In [12]:
lx_venues_new = lx_venues[lx_venues['Venue Category Id'].isin(venues_cat)]
lx_venues_new.shape

(147, 8)
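
A natural next step is to count how many venues of each category sit near each station. One way to do this is `pd.crosstab`. This is a sketch on a small hand-made sample with the same column names as `lx_venues_new`, not the real result.

```python
import pandas as pd

# Hand-made sample using two of the columns of lx_venues_new.
sample = pd.DataFrame({
    "Metro Station": ["Areeiro", "Areeiro", "Areeiro", "Alvalade"],
    "Venue Category": ["Ice Cream Shop", "Garden", "Ice Cream Shop", "Park"],
})

# Rows are stations, columns are categories, cells are venue counts.
counts = pd.crosstab(sample["Metro Station"], sample["Venue Category"])
print(counts)
```

Categories that do not occur near a station show up as 0, which makes stations easy to compare side by side.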

In [13]:
lx_venues_new

Unnamed: 0,Metro Station,Metro Station Latitude,Metro Station Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category Id,Venue Category
0,Areeiro,38.742222,-9.134167,FIB - il vero gelato italiano (geladosfib),38.744250,-9.134210,4bf58dd8d48988d1c9941735,Ice Cream Shop
2,Areeiro,38.742222,-9.134167,Jardim Fernando Pessa,38.743069,-9.137231,4bf58dd8d48988d15a941735,Garden
6,Areeiro,38.742222,-9.134167,Parque da Fonte Luminosa,38.737068,-9.132833,4bf58dd8d48988d163941735,Park
11,Areeiro,38.742222,-9.134167,Casa do Gelado,38.744872,-9.139545,4bf58dd8d48988d1c9941735,Ice Cream Shop
24,Areeiro,38.742222,-9.134167,Culturgest,38.740828,-9.142939,4bf58dd8d48988d1e2931735,Art Gallery
37,Areeiro,38.742222,-9.134167,Surf Gelados,38.739372,-9.137052,4bf58dd8d48988d1c9941735,Ice Cream Shop
51,Areeiro,38.742222,-9.134167,La Fabrica,38.736899,-9.141959,4bf58dd8d48988d1c9941735,Ice Cream Shop
61,Areeiro,38.742222,-9.134167,Jardim do Arco do Cego,38.735908,-9.142256,4bf58dd8d48988d15a941735,Garden
100,Alvalade,38.753333,-9.143889,Gelados Conchanata,38.753678,-9.142381,4bf58dd8d48988d1c9941735,Ice Cream Shop
143,Alvalade,38.753333,-9.143889,Jardim do Campo Grande,38.756770,-9.153823,4bf58dd8d48988d163941735,Park
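
The table holds both the station and the venue coordinates, so the walking distance can be approximated with the haversine formula (straight-line distance on a sphere). This is only a sketch: `haversine_m` is my own helper, and the Earth radius of 6,371 km is the usual mean value.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Areeiro station to FIB - il vero gelato italiano (first row of the table).
d = haversine_m(38.742222, -9.134167, 38.744250, -9.134210)
print(round(d))
```

For this pair the result is roughly 225 m, well inside the 1000 m radius used in the search.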


Let's see all the venues on a map:

In [14]:
# create map
map_lx = folium.Map(location=[lx_lat, lx_long], zoom_start=12)

# add markers to the map
for lat, lon, poi in zip(lx_venues_new['Venue Latitude'], lx_venues_new['Venue Longitude'], lx_venues_new['Venue']):
    label = folium.Popup(str(poi) , parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='blue',
        fill_opacity=0.7).add_to(map_lx)

map_lx

If the map is not visible, it can also be seen at https://github.com/AnaMariaAnaMaria/Coursera_Capstone/blob/master/LisbonVenues.JPG