# The Battle of Neighborhoods - MADRID

### Introduction

I have selected the city of Madrid for my capstone project. The goal of this analysis is to segment and cluster the zones of the city in order to select the most suitable one for opening a restaurant.

Madrid is my city. Before starting, I would like to give a brief presentation of this great city.

The capital and economic center of Spain is a very diverse city with a rich culture. It is full of charming corners with great restaurants and places of interest such as the Prado Museum and the Santiago Bernabéu stadium.

Therefore, the objective of the project can be summarized in the following question: <b>What is the most suitable neighborhood to open a new restaurant?</b>

To answer this question, we begin by identifying the sources of information.
The following page, and more specifically the Excel file linked from it, will be used to obtain the neighborhoods and districts of the city of Madrid.

<b>Page:</b>

http://www.madrid.org/iestadis/fijas/clasificaciones/barrios.htm

<b>Excel:</b>

http://www.madrid.org/iestadis/fijas/clasificaciones/descarga/cobar18.xls


In the first part, we analyze and process this information. The section ends by looking up the latitude and longitude of each of these neighborhoods so that the Foursquare API can be used on them.

<b>Let's start</b>

### Import and Download Libraries

In [188]:
import numpy as np  # library to handle data in a vectorized manner

import pandas as pd  # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# Install folium; comment this line out if it is already installed
!conda install -c conda-forge folium=0.5.0 --yes
import folium  # map rendering library

print('Libraries imported.')


### Load Data as Dataframe

In [189]:
df = pd.read_excel('http://www.madrid.org/iestadis/fijas/clasificaciones/descarga/cobar18.xls') 
df.head()

Unnamed: 0,munic,distr,ldistr,barrio,descrip,secci
0,796,1,Centro,1,Palacio,1
1,796,1,Centro,1,Palacio,2
2,796,1,Centro,1,Palacio,3
3,796,1,Centro,1,Palacio,4
4,796,1,Centro,1,Palacio,6


### Cleaning Data and Pre-processing

In [190]:
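# Use English names for the district and neighborhood columns
df.rename(columns={'ldistr': 'Borough', 'descrip': 'Neighborhood'}, inplace=True)
# Cast the numeric codes to strings so they can be concatenated
df['munic'] = df['munic'].astype(str)
df['distr'] = df['distr'].astype(str)
df['barrio'] = df['barrio'].astype(str)
# Note: despite its name, 'PostalCode' is the municipality code followed by the district code
df['PostalCode'] = df['munic'] + df['distr']
# Drop the columns that are no longer needed
df = df.drop(['barrio', 'secci', 'distr', 'munic'], axis=1)
df.head()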

Unnamed: 0,Borough,Neighborhood,PostalCode
0,Centro,Palacio,7961
1,Centro,Palacio,7961
2,Centro,Palacio,7961
3,Centro,Palacio,7961
4,Centro,Palacio,7961
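
Because the code above concatenates `munic` and `distr` without padding, the resulting codes differ in length (`7961` for district 1, `79610` for district 10). As a minimal sketch, assuming a fixed-width code is preferred, the district could be zero-padded before concatenation. The column name `DistrictCode` and the toy values are illustrative assumptions, not part of the original notebook:

```python
import pandas as pd

# Toy rows mirroring two columns of the source spreadsheet (illustrative values).
toy = pd.DataFrame({"munic": [796, 796, 796], "distr": [1, 10, 21]})

# Zero-pad the district to two digits so every code has the same width.
# 'DistrictCode' is a hypothetical column name used only in this sketch.
toy["DistrictCode"] = (
    toy["munic"].astype(str) + toy["distr"].astype(str).str.zfill(2)
)

print(toy["DistrictCode"].tolist())
```

With fixed-width codes, sorting the strings also matches the numeric order of the districts, which is not the case for the unpadded version.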


In [191]:
df.shape

(2443, 3)

In [192]:
# Keep one row per (Borough, Neighborhood, PostalCode) combination
df2 = df.drop_duplicates().reset_index(drop=True)
df2.head()

Unnamed: 0,Borough,Neighborhood,PostalCode
0,Centro,Palacio,7961
1,Centro,Embajadores,7961
2,Centro,Cortes,7961
3,Centro,Justicia,7961
4,Centro,Universidad,7961


In [193]:
df2.shape

(131, 3)

In [194]:
# Collapse to one row per borough, joining its neighborhood names with commas
df3 = df2.groupby(['PostalCode', 'Borough'], sort=True).agg(', '.join)
df3.reset_index(inplace=True)
df3.shape

(21, 3)

In [195]:
df3.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,7961,Centro,"Palacio, Embajadores, Cortes, Justicia, Univer..."
1,79610,Latina,"Los Cármenes, Puerta del Angel, Lucero, Aluche..."
2,79611,Carabanchel,"Comillas, Opañel, San Isidro, Vista Alegre, Pu..."
3,79612,Usera,"Orcasitas, Orcasur, San Fermín, Almendrales, M..."
4,79613,Puente de Vallecas,"Entrevías, San Diego, Palomeras Bajas, Palomer..."
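
To make the aggregation step concrete, here is a small self-contained sketch of the same `groupby` plus `', '.join` pattern on a toy frame. The three rows are illustrative values taken from the tables above, not the full dataset:

```python
import pandas as pd

# Three sample rows: two neighborhoods in Centro, one in Latina.
toy = pd.DataFrame({
    "PostalCode": ["7961", "7961", "79610"],
    "Borough": ["Centro", "Centro", "Latina"],
    "Neighborhood": ["Palacio", "Cortes", "Aluche"],
})

# Group by borough and join the neighborhood names of each group into one string.
grouped = (
    toy.groupby(["PostalCode", "Borough"])["Neighborhood"]
    .agg(", ".join)
    .reset_index()
)

print(grouped)
```

Each group ends up as a single row whose `Neighborhood` cell lists every neighborhood of that borough.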


### Find Latitude and Longitude

In [196]:
from geopy.geocoders import Nominatim

def get_geocoder(borough_from_df):
    # Address to search for
    address = 'Madrid ' + borough_from_df + ', Spain'
    print('Search address', address)
    # Obtain the latitude and longitude
    geolocator = Nominatim(user_agent="ny_explorer")
    location = geolocator.geocode(address)
    if location is None:
        raise ValueError('No geocoding result for ' + address)
    latitude = location.latitude
    longitude = location.longitude
    return latitude, longitude
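
geopy's `geocode` returns `None` when an address cannot be resolved, and the public Nominatim service asks clients to stay at or below roughly one request per second. The following is a minimal sketch of a more defensive lookup under those assumptions. The names `geocode_borough`, `pause_seconds`, and `fake_geocode` are hypothetical; the geocoding callable is passed in so the example runs without network access:

```python
import time
from typing import Callable, Optional, Tuple


def geocode_borough(
    borough: str,
    geocode: Callable[[str], object],
    pause_seconds: float = 1.0,
) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) for a Madrid borough, or None if not found."""
    address = "Madrid {}, Spain".format(borough)
    location = geocode(address)
    time.sleep(pause_seconds)  # crude throttle between consecutive requests
    if location is None:
        return None
    return location.latitude, location.longitude


# Stand-in for a real geocoder, used only to demonstrate the function offline.
class _FakeLocation:
    latitude = 40.4
    longitude = -3.7


def fake_geocode(address):
    return _FakeLocation() if "Centro" in address else None


print(geocode_borough("Centro", fake_geocode, pause_seconds=0))
print(geocode_borough("Nowhere", fake_geocode, pause_seconds=0))
```

In the notebook, `Nominatim(user_agent=...).geocode` could be passed as the `geocode` argument. geopy also provides `geopy.extra.rate_limiter.RateLimiter`, which may be a better fit than a manual `time.sleep`.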

In [198]:
df3['Latitude'] = 0.0
df3['Longitude'] = 0.0

In [199]:
# Write each result back with .at, which sets single cells directly and
# avoids the chained-assignment (SettingWithCopy) warning in pandas
for i in range(len(df3)):
    df3.at[i, 'Latitude'], df3.at[i, 'Longitude'] = get_geocoder(df3.at[i, 'Borough'])

Search address Madrid Centro, Spain


Barrio Centro
Search address Madrid Latina, Spain
Barrio Latina
Search address Madrid Carabanchel, Spain
Barrio Carabanchel
Search address Madrid Usera, Spain
Barrio Usera
Search address Madrid Puente de Vallecas, Spain
Barrio Puente de Vallecas
Search address Madrid Moratalaz, Spain
Barrio Moratalaz
Search address Madrid Ciudad Lineal, Spain
Barrio Ciudad Lineal
Search address Madrid Hortaleza, Spain
Barrio Hortaleza
Search address Madrid Villaverde, Spain
Barrio Villaverde
Search address Madrid Villa de Vallecas, Spain
Barrio Villa de Vallecas
Search address Madrid Vicalvaro, Spain
Barrio Vicalvaro
Search address Madrid Arganzuela, Spain
Barrio Arganzuela
Search address Madrid San Blas, Spain
Barrio San Blas
Search address Madrid Barajas, Spain
Barrio Barajas
Search address Madrid Retiro, Spain
Barrio Retiro
Search address Madrid Salamanca, Spain
Barrio Salamanca
Search address Madrid Chamartin, Spain
Barrio Chamartin
Search address Madrid Tetuan, Spain
Barrio Tetuan
Search address M

In [200]:
df3

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,7961,Centro,"Palacio, Embajadores, Cortes, Justicia, Univer...",40.425356,-3.69819
1,79610,Latina,"Los Cármenes, Puerta del Angel, Lucero, Aluche...",40.411603,-3.749912
2,79611,Carabanchel,"Comillas, Opañel, San Isidro, Vista Alegre, Pu...",40.375855,-3.74091
3,79612,Usera,"Orcasitas, Orcasur, San Fermín, Almendrales, M...",40.37754,-3.715229
4,79613,Puente de Vallecas,"Entrevías, San Diego, Palomeras Bajas, Palomer...",40.381633,-3.668024
5,79614,Moratalaz,"Pavones, Horcajo, Marroquina, Media Legua, Fon...",40.400081,-3.631538
6,79615,Ciudad Lineal,"Ventas, Pueblo Nuevo, Quintana, Concepción, Sa...",40.43398,-3.657251
7,79616,Hortaleza,"Palomas, Piovera, Canillas, Pinar del Rey, Apo...",40.458139,-3.641003
8,79617,Villaverde,"Villaverde alto, Casco Histórico de Villaverde...",40.358858,-3.708645
9,79618,Villa de Vallecas,"Casco Histórico de Vallecas, Santa Eugenia, En...",40.373537,-3.614098
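
As an alternative to pre-filling the columns with `0.0` and assigning inside a loop, the coordinates can be collected first and joined to the frame in one step. The sketch below uses a hypothetical `coords` dictionary in place of live geocoding calls; the two coordinate pairs are copied from the table above:

```python
import pandas as pd

# Hypothetical lookup standing in for the geocoding function.
coords = {
    "Centro": (40.425356, -3.69819),
    "Latina": (40.411603, -3.749912),
}

toy = pd.DataFrame({"Borough": ["Centro", "Latina"]})

# Build a two-column frame of coordinates aligned on the same index, then join it.
latlng = pd.DataFrame(
    [coords[b] for b in toy["Borough"]],
    columns=["Latitude", "Longitude"],
    index=toy.index,
)
toy = toy.join(latlng)

print(toy)
```

This keeps the original frame untouched until all lookups have succeeded, so a failed lookup does not leave placeholder zeros behind.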


### Plot Result

In [201]:
from geopy.geocoders import Nominatim
address = 'Madrid, Spain'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Madrid are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Madrid are 40.4167047, -3.7035825.


In [202]:
map_madrid = folium.Map(location=[latitude, longitude], zoom_start=10)

# Add one circle marker per borough, labeled with its neighborhoods
for lat, lng, borough, neighbourhood in zip(df3['Latitude'], df3['Longitude'], df3['Borough'], df3['Neighborhood']):
    label = '{}, {}'.format(neighbourhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_madrid)
map_madrid