# Living in the MotorValley
## Which is the best place to live for those who start working at Ferrari?
##### This is the final project of the Coursera Applied Data Science Capstone course
###### By Sebastian D'Amico
<br/><br/> 
### Introduction
Ferrari's headquarters are in Maranello, a small town about 18 km from Modena, with a population of 17,504 (as of 2017). It is known worldwide as the home of Ferrari and of the Scuderia Ferrari Formula One racing team. Several other towns surround Maranello; Modena, with roughly 184,000 inhabitants, is the closest and biggest one, and its shopping centers, university, nightlife and many other services lead most people joining Ferrari to look for a house there. Obviously, it is difficult to find all the services available in Modena in any of the smaller towns surrounding Maranello, but what are the main differences among these towns? The aim of this project is to classify Maranello and the surrounding towns in terms of available services and venues, to help people joining Ferrari judge, with real data, which place best suits their own requirements.
<br/><br/> 
### The data
Different data sources will be used for this project. First, I will take all the towns in the province of Modena (47 municipalities in total) from the following website:
https://zip-codes.nonsolocap.it/emilia-romagna/91-cap-province-of-modena/  
I will then try to use the Python Geocoder library to get the coordinates for each postal code. If that fails, I will manually extract the latitude and longitude of each town from Google Maps.  
On top of that, the distance from Ferrari will be associated with each town, so that people can also judge based on the time they will spend commuting to the office. If I cannot get the distance using the Google API, I will extract it manually.  
Finally, Foursquare will be used to explore each town, extracting the categories of all its venues to build a better picture of what can be found there.  
Plots and tables will help to analyze the data and to steer the analysis based on the results, while keeping as the main target what is described in the introduction. 
<br/><br/> 

## 1. Collecting Data

Here I start collecting data. The section is divided into subsections with the following main goals:  
1. Getting the main list of towns in the province of Modena, and potentially also the neighborhoods of Modena. For each location, latitude and longitude will be added using the Python Geocoder library if possible; otherwise, they will be extracted manually from Google Maps or Wikipedia. For each location, the distance to the Ferrari headquarters will also be added.
2. Using the Foursquare API, venues for all locations will be extracted. The top 10 categories will be associated with each location.
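
The "top 10 categories per location" step in goal 2 could be sketched as follows. This is only an illustration: the tiny `venues` table is made-up sample data, `top_categories` is a hypothetical helper name, and ranking by raw venue count (ties broken alphabetically) is one possible choice among several.

```python
import pandas as pd

# Made-up sample rows, using the same column names as the venue table
# built later in this notebook (illustrative data only).
venues = pd.DataFrame({
    "Location": ["Modena", "Modena", "Modena", "Maranello", "Maranello"],
    "Venue Category": ["Café", "Italian Restaurant", "Café",
                       "Museum", "Italian Restaurant"],
})


def top_categories(df, n=10):
    """Return the n most frequent venue categories for each location."""
    counts = (
        df.groupby(["Location", "Venue Category"])
          .size()
          .reset_index(name="Count")
          # most frequent first within each location, ties sorted by name
          .sort_values(["Location", "Count", "Venue Category"],
                       ascending=[True, False, True])
    )
    # keep the first n rows of every location group
    return counts.groupby("Location").head(n).reset_index(drop=True)


top = top_categories(venues, n=2)
print(top)
```

With the real data, `n=10` would give the ten most common categories for each town.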

### 1.1 The list of towns in the province of Modena

In [2]:
import pandas as pd

The data in the table below was collected using the following web resources:  
https://www.comune.modena.it/decentramento/il-decentramento-a-modena/la-frazioni-centri-di-periferia  
https://zip-codes.nonsolocap.it/emilia-romagna/91-cap-province-of-modena/  
https://www.coordinate-gps.it/  
https://www.mapdevelopers.com/distance_from_to.php  
Unfortunately, no API was available, so the data was collected manually.
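
As a sanity check on manually collected distances, the straight-line (great-circle) distance can be computed from the coordinates alone. The sketch below uses the haversine formula; the function name and the Earth radius constant are assumptions of this example, while the Ferrari coordinates are the ones used for the map in section 2.1. A straight-line distance is only a lower bound on any road distance, so it need not match the `DistanceToMaranello` column.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)

# Ferrari coordinates, as used for the map in section 2.1
FERRARI_LAT, FERRARI_LON = 44.531836, 10.863925


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Straight-line distance from the Modena coordinates in the table to Ferrari
modena_km = haversine_km(44.650177, 10.921732, FERRARI_LAT, FERRARI_LON)
print(round(modena_km, 1))
```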

In [25]:
df_location_raw = pd.read_csv('D:\\Users\\sebastian\\OneDrive\\_09.Istruzione\\Coursera\\AppliedDataScience\\4.AppliedDataScienceCapstone\\GitRepo\\FinalProject\\locations_csv.csv')

In [26]:
df_location_raw.head()

| | Location | Latitude | Longitude | DistanceToMaranello |
|---|---|---|---|---|
| 0 | Modena | 44.650177 | 10.921732 | 22.96 |
| 1 | Marzaglia | 44.650963 | 10.803609 | 20.87 |
| 2 | Cittanova | 44.650268 | 10.850284 | 21.8 |
| 3 | Cognento | 44.636302 | 10.871811 | 16.96 |
| 4 | Baggiovara | 44.607764 | 10.867751 | 11.27 |


In [27]:
df_location_raw.shape

(59, 4)

In [28]:
df_location_raw.describe()

| | Latitude | Longitude | DistanceToMaranello |
|---|---|---|---|
| count | 59.0 | 59.0 | 59.0 |
| mean | 44.566252 | 10.903604 | 32.081542 |
| std | 0.2042 | 0.147902 | 17.493013 |
| min | 44.179951 | 10.570198 | 0.0 |
| 25% | 44.423297 | 10.807728 | 18.19 |
| 50% | 44.588018 | 10.924393 | 29.9 |
| 75% | 44.714898 | 11.003846 | 47.895 |
| max | 44.913174 | 11.293905 | 66.37 |


### 1.2 Extracting venues from Foursquare

#### Initialize Foursquare credentials

In [29]:
CLIENT_ID = 'HDZNPM0PGJAW51WPPBQU521JFFGWY05PLE145J5BIEEQRCU2' # your Foursquare ID
CLIENT_SECRET = 'CKZG5MJELFPC5MPLCTRCYSH141XAEQJXCB024MWYES2CLKEA' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: HDZNPM0PGJAW51WPPBQU521JFFGWY05PLE145J5BIEEQRCU2
CLIENT_SECRET:CKZG5MJELFPC5MPLCTRCYSH141XAEQJXCB024MWYES2CLKEA
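
Credentials written directly in a notebook are published together with it. One common alternative is to read them from environment variables. The sketch below assumes the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, which are hypothetical names chosen for this example, not something defined by Foursquare.

```python
import os


def load_foursquare_credentials(env=None):
    """Read the Foursquare credentials from a mapping of environment variables.

    The variable names are hypothetical; pick whatever fits your setup.
    """
    env = os.environ if env is None else env
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first"
        )
    return client_id, client_secret
```

After setting the two variables in the shell, `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` would replace the hard-coded assignments.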


#### Let's create a function to get the venues given lat and long

In [30]:
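import requests  # HTTP client used for the API call below


def getNearbyVenues(names, latitudes, longitudes, radius=500):
    # NOTE: relies on the globals CLIENT_ID, CLIENT_SECRET, VERSION and LIMIT;
    # LIMIT is defined in the next cell, before the function is called.
    venues_list=[]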
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Location', 
                  'Location Latitude', 
                  'Location Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
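
The function above indexes straight into the response, so a location with no result groups, or a venue without categories, would raise an error. A more defensive way to extract the rows might look like the sketch below. It assumes the same payload layout the function relies on (`response` → `groups` → `items` → `venue`); `parse_explore_items` is a hypothetical helper name, and the payload at the end is made-up test data.

```python
def parse_explore_items(name, lat, lng, payload):
    """Turn one 'explore' JSON payload into a list of venue rows.

    Missing groups yield an empty list; missing fields become None.
    """
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or []
        category = categories[0].get("name") if categories else None
        rows.append((name, lat, lng, venue.get("name"),
                     location.get("lat"), location.get("lng"), category))
    return rows


# Made-up payload with one complete venue and one venue lacking categories
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Osteria Ermes",
               "location": {"lat": 44.649429, "lng": 10.92537},
               "categories": [{"name": "Italian Restaurant"}]}},
    {"venue": {"name": "Unnamed spot",
               "location": {"lat": 44.65, "lng": 10.92},
               "categories": []}},
]}]}}
rows = parse_explore_items("Modena", 44.650177, 10.921732, sample)
```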

#### Run the function above on each Location to get the venues

In [31]:
import requests
LIMIT = 500 # maximum number of venues requested per call to the Foursquare API
radius = 1500 # NOTE: defined but not passed below, so getNearbyVenues uses its default radius=500

location_venue = getNearbyVenues(names=df_location_raw['Location'],
                                   latitudes=df_location_raw['Latitude'],
                                   longitudes=df_location_raw['Longitude']
                                  )

Modena
Marzaglia
Cittanova
Cognento
Baggiovara
Portile
Paganine
San Damaso
Fossalta
Saliceto Panaro
Villanova
Ganaceto
Tre olmi
Bastiglia
Bomporto
Campogalliano
Camposanto
Carpi
Castelfranco Emilia
Castelnuovo Rangone
Castelvetro di Modena
Cavezzo
Concordia sulla Secchia
Fanano
Finale Emilia
Fiorano Modenese
Fiumalbo
Formigine
Frassinoro
Guiglia
Lama Mocogno
Maranello
Marano sul Panaro
Medolla
Mirandola
Montecreto
Montefiorino
Montese
Nonantola
Novi di Modena
Palagano
Pavullo nel Frignano
Pievepelago
Polinago
Prignano sulla Secchia
Ravarino
Riolunato
San Cesario sul Panaro
San Felice sul Panaro
San Possidonio
San Prospero
Sassuolo
Savignano sul Panaro
Serramazzoni
Sestola
Soliera
Spilamberto
Vignola
Zocca


In [37]:
print(location_venue.shape)
location_venue.head(5)

(415, 7)


| | Location | Location Latitude | Location Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 0 | Modena | 44.650177 | 10.921732 | Piazza della Pomposa | 44.649044 | 10.923808 | Plaza |
| 1 | Modena | 44.650177 | 10.921732 | Osteria Ermes | 44.649429 | 10.92537 | Italian Restaurant |
| 2 | Modena | 44.650177 | 10.921732 | La Tenda | 44.651706 | 10.919946 | Event Space |
| 3 | Modena | 44.650177 | 10.921732 | Tri Scalein | 44.65001 | 10.92391 | Café |
| 4 | Modena | 44.650177 | 10.921732 | Ristretto | 44.647324 | 10.922675 | Wine Bar |


## 2. Analyzing Data

In this section I analyze the data, starting with a visualization of the locations and then moving on to the analysis of the venues.

### 2.1 Locations

#### Creating the map

In [42]:
#MAPS LIBRARIES
import folium # map rendering library

In [54]:
# create a map centered on the Ferrari headquarters in Maranello
ferrari_lat = 44.531836
ferrari_lon = 10.863925
map_maranello = folium.Map(location=[ferrari_lat, ferrari_lon], zoom_start=9)

# add markers to map
for lat, lng, label in zip(df_location_raw['Latitude'], df_location_raw['Longitude'], df_location_raw['Location']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='green',
        fill=False,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_maranello)  

# add a single red marker for Ferrari (outside the loop, so it is added only once)
folium.Marker(
    location=[ferrari_lat, ferrari_lon],
    popup='Ferrari',
    icon=folium.Icon(color='red')
).add_to(map_maranello)

map_maranello