# Importance of the hospitality industry in Barcelona and Madrid - Report

#### By Michael Chipantasi

## 1. Introduction: Business Problem

### 1.1. Background

It is well known that **hospitality** has a big impact on countries' economies, especially in developed countries, where people have access to everything from Michelin-starred restaurants to bars and cafes.

Mediterranean cuisine is also enormously popular worldwide. It is one of the most influential culinary cultures and has been named one of the healthiest diets.

**Spanish cuisine**, for its part, has developed considerably over the last 20 years. The Spanish restaurant *El Celler de Can Roca* has been named *The Best Restaurant in the World* twice, in 2013 and 2015. Moreover, in the last 20 years many Spanish chefs have reached the top of the rankings and their names are recognized around the world, for example Joan Roca, Martín Berasategui, Dabiz Muñoz and Jordi Cruz, among other great chefs.

In the last year, the world has suffered a huge pandemic, **COVID-19**, that has changed everything. Hospitality businesses, from large and small restaurants to cafes and wine bars, have been hit hard by the restrictions that governments imposed to reduce the impact of the pandemic in their countries. These measures affected everyone's finances in different ways. For this reason, it is important to understand how the pandemic has influenced our lives.

### 1.2. Problem

In this project we will try to identify how important hospitality (restaurants) is to the economy of two of the most important cities in Spain, **Madrid** and **Barcelona**.

To gauge the importance of hospitality in each city, we need to determine how many restaurants each city has. Once we have the total number of restaurants, we will identify the top five kinds of restaurant or food that predominate in Madrid and Barcelona.

Finally, we will try to find which district has the most restaurants in each city and compare where they are located, to determine how leisure is distributed across Madrid and Barcelona.

Following these steps should help us understand which city was **more affected** by the lockdown that the Spanish Government imposed to reduce the level of COVID-19 infection in Barcelona and Madrid over the past year.

## 2. Data

### 2.1. Data source

To solve our problem, we will use the following data:
 
 - total number of restaurants
 - restaurants classified by kind of food
 - restaurants located by district

We need to find the above data for both Barcelona and Madrid. 

a. The **total number of restaurants** will give us an idea of how important hospitality is to the economy of the city.

b. **Classifying restaurants** by kind of food will give us some idea of how the customs of each city's residents differ.

c. **Locating restaurants by district** will help us determine where people have the best chance of finding a place to have a meal.

We will also use the following sources to extract the information required for this project:

 - Foursquare API --> number of restaurants
 - Google Maps API reverse geocoding --> location of the cities
 - Google Maps API geocoding --> location of the restaurants
 - Wikipedia --> district of the cities


To sum up, the following libraries are used to develop this project:

In [1]:
import numpy as np

import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json
import requests
import urllib.request
import random
from pandas.io.json import json_normalize

import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans
from bs4 import BeautifulSoup

In [26]:
!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim

Collecting package metadata (current_repodata.json): done
Solving environment: | 
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:

  - defaults/noarch::ibm-wsrt-py37main-main==custom=1937
  - conda-forge/linux-64::pytorch==1.8.0=cpu_py37hafa7651_0
  - defaults/noarch::ibm-wsrt-py37main-keep==0.0.0=1937
done

# All requested packages already installed.



In [23]:
!conda install -c conda-forge folium=0.5.0 --yes
import folium

Collecting package metadata (current_repodata.json): done
Solving environment: - 
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:

  - defaults/noarch::ibm-wsrt-py37main-main==custom=1937
  - conda-forge/linux-64::pytorch==1.8.0=cpu_py37hafa7651_0
  - defaults/noarch::ibm-wsrt-py37main-keep==0.0.0=1937
done

# All requested packages already installed.



### 2.2. Data cleaning

To start the project, **Wikipedia** was used to collect a **list of the districts** of both Barcelona and Madrid using BeautifulSoup.
Once the data was available, it was converted to a pandas DataFrame to make the later steps of the project easier.

* **BARCELONA**

In [2]:
url = "https://es.wikipedia.org/wiki/Distritos_de_Barcelona"
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page, "lxml")

In [3]:
all_tables = soup.find_all("table")
right_table = soup.find_all('table', class_='wikitable')

In [4]:
column_names = ['Number','District','Location','Size (km2)','Population','Density','Neighborhoods','Council']
bcn = pd.DataFrame(columns =column_names)

In [5]:
for tr_cell in right_table[0].find_all('tr'):
    row_data = []
    for td_cell in tr_cell.find_all('td'):
        row_data.append(td_cell.text.strip())
    if len(row_data)==8:
        bcn.loc[len(bcn)] = row_data

In [6]:
bcn

Unnamed: 0,Number,District,Location,Size (km2),Population,Density,Neighborhoods,Council
0,1,Ciutat Vella,,411,103 429,"25 159,09","El Raval (1), Barrio Gótico (2), La Barcelonet...",Jordi Rabassa (Barcelona en Comú)
1,2,Eixample,,746,265 910,"35 625,67","El Fort Pienc (5), Sagrada Familia (6), Dreta ...",Jordi Martí (Barcelona en Comú)
2,3,Sants-Montjuïc,,2268,184 091,811832,"Poble Sec (11), La Marina del Prat Vermell (12...",Marc Serra (Barcelona en Comú)
3,4,Les Corts,,602,81 974,"13 607,11","Les Corts (19), La Maternitat i Sant Ramon (2...",Xavier Marcé (PSC)
4,5,Sarrià-Sant Gervasi,,1991,149 260,749711,"Vallvidrera, el Tibidabo i les Planes (22), Sa...",Albert Batlle (PSC)
5,6,Gràcia,,419,121 798,"29 082,62","Vallcarca i els Penitents (28), El Coll (29), ...",Eloi Badia (Barcelona en Comú)
6,7,Horta - Guinardó,,1196,171 495,"14 342,64","Baix Guinardó (33), Can Baró (34), El Guinardó...",Rosa Alarcón (PSC)
7,8,Nou Barris,,805,170 669,"21 198,48","Vilapicina i La Torre Llobeta (44), Porta (45)...",Marga Marí-Klose (PSC)
8,9,Sant Andreu,,659,149 821,"22 724,25","La Trinitat Vella (57), Baró de Viver (58), El...",Lucía Martín (Barcelona en Comú)
9,10,Sant Martí,,1039,238 315,"22 943,58","El Camp de l'Arpa del Clot (64), El Clot (65),...",David Escudé (PSC)


* **MADRID**

In [7]:
url_2 = "https://en.wikipedia.org/wiki/Districts_of_Madrid"
page_2 = urllib.request.urlopen(url_2)
soup_2 = BeautifulSoup(page_2, "lxml")

In [8]:
all_tables_2 = soup_2.find_all("table")
right_table_2 = soup_2.find_all('table', class_='wikitable sortable')

In [9]:
column_names_2 = ['Number','District','Size (ha)','Population','Density','Location','Neighborhoods']
mad = pd.DataFrame(columns =column_names_2)

In [10]:
for tr_cell in right_table_2[0].find_all('tr'):
    row_data = []
    for td_cell in tr_cell.find_all('td'):
        row_data.append(td_cell.text.strip())
    if len(row_data)==7:
        mad.loc[len(mad)] = row_data

In [11]:
mad

Unnamed: 0,Number,District,Size (ha),Population,Density,Location,Neighborhoods
0,1,Centro,522.82,131928,252.34,,Palacio (11)Embajadores (12)Cortes (13)Justici...
1,2,Arganzuela,646.22,151965,235.16,,Imperial (21)Acacias (22)Chopera (23)Legazpi (...
2,3,Retiro,546.62,118516,216.82,,Pacífico (31)Adelfas (32)Estrella (33)Ibiza (3...
3,4,Salamanca,539.24,143800,266.67,,Recoletos (41)Goya (42)Fuente del Berro (43)Gu...
4,5,Chamartín,917.55,143424,156.31,,El Viso (51)Prosperidad (52)Ciudad Jardín (53)...
5,6,Tetuán,537.47,153789,286.13,,Bellas Vistas (61)Cuatro Caminos (62)Castillej...
6,7,Chamberí,467.92,137401,293.64,,Gaztambide (71)Arapiles (72)Trafalgar (73)Alma...
7,8,Fuencarral-El Pardo,23783.84,238756,10.04,,El Pardo (81)Fuentelarreina (82)Peñagrande (83...
8,9,Moncloa-Aravaca,4653.11,116903,25.12,,Casa de Campo (91)Argüelles (92)Ciudad Univers...
9,10,Latina,2542.72,233808,91.95,,Los Cármenes (101)Puerta del Ángel (102)Lucero...


The columns that were not useful for solving the project's problem were removed. For Barcelona, Number, Location, Density and Council were dropped from the data table, while for Madrid, Number, Location and Density were dropped.

* **BARCELONA**

In [24]:
bcn_districts = bcn.drop(['Number','Location','Density','Council'], axis=1)
bcn_districts.District = bcn_districts.District.replace({"Gràcia": "Gracia Barcelona"})
bcn_districts

Unnamed: 0,District,Size (km2),Population,Neighborhoods
0,Ciutat Vella,411,103 429,"El Raval (1), Barrio Gótico (2), La Barcelonet..."
1,Eixample,746,265 910,"El Fort Pienc (5), Sagrada Familia (6), Dreta ..."
2,Sants-Montjuïc,2268,184 091,"Poble Sec (11), La Marina del Prat Vermell (12..."
3,Les Corts,602,81 974,"Les Corts (19), La Maternitat i Sant Ramon (2..."
4,Sarrià-Sant Gervasi,1991,149 260,"Vallvidrera, el Tibidabo i les Planes (22), Sa..."
5,Gracia Barcelona,419,121 798,"Vallcarca i els Penitents (28), El Coll (29), ..."
6,Horta - Guinardó,1196,171 495,"Baix Guinardó (33), Can Baró (34), El Guinardó..."
7,Nou Barris,805,170 669,"Vilapicina i La Torre Llobeta (44), Porta (45)..."
8,Sant Andreu,659,149 821,"La Trinitat Vella (57), Baró de Viver (58), El..."
9,Sant Martí,1039,238 315,"El Camp de l'Arpa del Clot (64), El Clot (65),..."


* **MADRID**

In [15]:
mad_districts = mad.drop(['Number','Location','Density'], axis=1)
mad_districts.District = mad_districts.District.replace({"Retiro": "Retiro Madrid"})
mad_districts.District = mad_districts.District.replace({"Tetuán": "Tetuán Madrid"})
mad_districts

Unnamed: 0,District,Size (ha),Population,Neighborhoods
0,Centro,522.82,131928,Palacio (11)Embajadores (12)Cortes (13)Justici...
1,Arganzuela,646.22,151965,Imperial (21)Acacias (22)Chopera (23)Legazpi (...
2,Retiro Madrid,546.62,118516,Pacífico (31)Adelfas (32)Estrella (33)Ibiza (3...
3,Salamanca,539.24,143800,Recoletos (41)Goya (42)Fuente del Berro (43)Gu...
4,Chamartín,917.55,143424,El Viso (51)Prosperidad (52)Ciudad Jardín (53)...
5,Tetuán Madrid,537.47,153789,Bellas Vistas (61)Cuatro Caminos (62)Castillej...
6,Chamberí,467.92,137401,Gaztambide (71)Arapiles (72)Trafalgar (73)Alma...
7,Fuencarral-El Pardo,23783.84,238756,El Pardo (81)Fuentelarreina (82)Peñagrande (83...
8,Moncloa-Aravaca,4653.11,116903,Casa de Campo (91)Argüelles (92)Ciudad Univers...
9,Latina,2542.72,233808,Los Cármenes (101)Puerta del Ángel (102)Lucero...


As a result, the **data table** for each city consists of the following columns: *District, Size, Population, Neighborhoods*.
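The scraped Size and Population values are still strings, and the Barcelona table uses Spanish number formatting (for example `103 429` and `25 159,09`). Before doing any arithmetic on these columns they would need to be converted. The following is a minimal sketch of such a conversion. The helper name `parse_es_number` is hypothetical and not part of the original notebook, and it assumes that whitespace separates thousands and a comma marks the decimals.

```python
def parse_es_number(text):
    """Convert a string such as '25 159,09' to a float.

    Assumes whitespace (including non-breaking spaces) separates
    thousands and a comma marks the decimals.
    """
    compact = "".join(text.split())  # drop every whitespace character
    return float(compact.replace(",", "."))


population = parse_es_number("103 429")
density = parse_es_number("25 159,09")
print(population, density)
```

With pandas, the same helper could be applied to a whole column, for instance `bcn_districts['Population'].apply(parse_es_number)`.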

### 2.3. Refining data

Once the list of districts was defined for each city, a **geolocator** was used to determine the location of each district. To store this new information, two new columns were added to the districts table: *Latitude* and *Longitude*.

One more step was needed to refine the data. After locating each district, it turned out that the geolocator had not placed some districts correctly. The district *"Gràcia"* had to be changed to *"Gracia Barcelona"* to be located in the right spot. In Madrid, the following districts had to be renamed to place them correctly: *"Retiro"* to *"Retiro Madrid"* and *"Tetuán"* to *"Tetuán Madrid"*, as shown previously.

Finally, the **districts table** for each city contains the following information for every district: **District**, **Size**, **Population**, **Neighborhoods**, **Latitude** and **Longitude**.

* **BARCELONA**

In [32]:
bcn_districts['Latitude'] = bcn_districts['District'].apply(lambda x: geolocator.geocode(x).latitude)
bcn_districts['Longitude'] = bcn_districts['District'].apply(lambda x: geolocator.geocode(x).longitude)
bcn_districts

Unnamed: 0,District,Size (km2),Population,Neighborhoods,Latitude,Longitude
0,Ciutat Vella,411,103 429,"El Raval (1), Barrio Gótico (2), La Barcelonet...",41.374985,2.173277
1,Eixample,746,265 910,"El Fort Pienc (5), Sagrada Familia (6), Dreta ...",41.393689,2.163655
2,Sants-Montjuïc,2268,184 091,"Poble Sec (11), La Marina del Prat Vermell (12...",41.364762,2.154233
3,Les Corts,602,81 974,"Les Corts (19), La Maternitat i Sant Ramon (2...",41.385244,2.132863
4,Sarrià-Sant Gervasi,1991,149 260,"Vallvidrera, el Tibidabo i les Planes (22), Sa...",41.413039,2.10762
5,Gracia Barcelona,419,121 798,"Vallcarca i els Penitents (28), El Coll (29), ...",41.410171,2.155136
6,Horta - Guinardó,1196,171 495,"Baix Guinardó (33), Can Baró (34), El Guinardó...",41.428556,2.143617
7,Nou Barris,805,170 669,"Vilapicina i La Torre Llobeta (44), Porta (45)...",41.445815,2.179801
8,Sant Andreu,659,149 821,"La Trinitat Vella (57), Baró de Viver (58), El...",41.437439,2.196859
9,Sant Martí,1039,238 315,"El Camp de l'Arpa del Clot (64), El Clot (65),...",41.406782,2.203655


* **MADRID**

In [33]:
mad_districts['Latitude'] = mad_districts['District'].apply(lambda x: geolocator.geocode(x).latitude)
mad_districts['Longitude'] = mad_districts['District'].apply(lambda x: geolocator.geocode(x).longitude)
mad_districts

Unnamed: 0,District,Size (ha),Population,Neighborhoods,Latitude,Longitude
0,Centro,522.82,131928,Palacio (11)Embajadores (12)Cortes (13)Justici...,47.549025,1.732406
1,Arganzuela,646.22,151965,Imperial (21)Acacias (22)Chopera (23)Legazpi (...,40.398068,-3.693734
2,Retiro Madrid,546.62,118516,Pacífico (31)Adelfas (32)Estrella (33)Ibiza (3...,40.41115,-3.676057
3,Salamanca,539.24,143800,Recoletos (41)Goya (42)Fuente del Berro (43)Gu...,40.965157,-5.664018
4,Chamartín,917.55,143424,El Viso (51)Prosperidad (52)Ciudad Jardín (53)...,40.701869,-4.957008
5,Tetuán Madrid,537.47,153789,Bellas Vistas (61)Cuatro Caminos (62)Castillej...,40.460578,-3.698281
6,Chamberí,467.92,137401,Gaztambide (71)Arapiles (72)Trafalgar (73)Alma...,45.566267,5.920364
7,Fuencarral-El Pardo,23783.84,238756,El Pardo (81)Fuentelarreina (82)Peñagrande (83...,40.556346,-3.778591
8,Moncloa-Aravaca,4653.11,116903,Casa de Campo (91)Argüelles (92)Ciudad Univers...,40.439495,-3.744204
9,Latina,2542.72,233808,Los Cármenes (101)Puerta del Ángel (102)Lucero...,41.459526,13.012591
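
In the Madrid output above, several coordinates fall far from the rest (for example Centro at latitude 47.5 and Latina at longitude 13.0, while most districts sit near 40.4, -3.7), which suggests that some bare district names were matched to places outside Madrid. A minimal sketch of two possible safeguards follows: qualifying every query with the city name, and flagging results that lie too far from the city centre. The helper names, the approximate centre coordinates and the 30 km threshold are assumptions for illustration and are not part of the original notebook.

```python
from math import asin, cos, radians, sin, sqrt

MADRID_CENTRE = (40.4168, -3.7038)  # approximate, assumed for illustration


def qualified_query(district, city):
    """Build a geocoder query that names the city explicitly."""
    return f"{district}, {city}, Spain"


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two points."""
    lat1, lng1, lat2, lng2 = map(radians, (lat1, lng1, lat2, lng2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def is_plausible(lat, lng, centre=MADRID_CENTRE, max_km=30):
    """Return True when a point lies within max_km of the city centre."""
    return haversine_km(lat, lng, centre[0], centre[1]) <= max_km


print(qualified_query("Centro", "Madrid"))
print(is_plausible(40.398068, -3.693734))  # Arganzuela, from the table above
print(is_plausible(47.549025, 1.732406))   # Centro, from the table above
```

The qualified string could be passed to `geolocator.geocode(...)` in place of the bare district name. Whether Nominatim then resolves every district correctly is not verified here.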


## 3. Methodology

Once the dataset was defined, the analysis began.

First, a **map** of each city was created using **folium.Map** to visualise where every district is located. The geolocator was also used to obtain the coordinates of both cities. Then a blue circle marker with a pop-up was added to the map for each district to show exactly where it is situated.

* **BARCELONA**

In [34]:
barcelona = 'Barcelona, Barcelona'

geolocator = Nominatim(user_agent="Micar_21")
location = geolocator.geocode(barcelona)
latitude = location.latitude
longitude = location.longitude

In [35]:
map_bcn = folium.Map(location=[latitude, longitude], zoom_start = 10)

for lat, lng, district, neighborhood in zip(bcn_districts['Latitude'], bcn_districts['Longitude'], bcn_districts['District'], bcn_districts['Neighborhoods']):
    label = '{}'.format(district)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
    [lat, lng],
    radius=5,
    popup=label,
    color='blue',
    fill=True,
    fill_color='#3186cc',
    fill_opacity=0.7,
    parse_html=False).add_to(map_bcn)

map_bcn

* **MADRID**

In [36]:
madrid = 'Madrid, Madrid'

geolocator_2 = Nominatim(user_agent="Micar_21")
location_2 = geolocator_2.geocode(madrid)
latitude_2 = location_2.latitude
longitude_2 = location_2.longitude

In [37]:
map_mad = folium.Map(location=[latitude_2, longitude_2], zoom_start = 10)

for lat, lng, district, neighborhood in zip(mad_districts['Latitude'], mad_districts['Longitude'], mad_districts['District'], mad_districts['Neighborhoods']):
    label = '{}'.format(district)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
    [lat, lng],
    radius=5,
    popup=label,
    color='blue',
    fill=True,
    fill_color='#3186cc',
    fill_opacity=0.7,
    parse_html=False).add_to(map_mad)

map_mad

Secondly, the **Foursquare credentials** were defined: Client ID, Client Secret, Access Token and Version. This step made it possible to explore the venues that exist in each district.

In [38]:
CLIENT_ID = 'TILJIOO41TAHQHNOEJCU3BQJ5D4PHGUEXCPT0LKQNRJWSPYO' # your Foursquare ID
CLIENT_SECRET = 'P4QZGN4PGDZPXOYD55R2NBWVENSPLM3K4JB4YNZXZLC0K4GQ' # your Foursquare Secret
ACCESS_TOKEN = 'EVSQ1WHTTHP1S5G2UZNCCR0OPFR2V5IQIHLTJNYBGEHB1W0J' # your FourSquare Access Token
VERSION = '20180604'

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: TILJIOO41TAHQHNOEJCU3BQJ5D4PHGUEXCPT0LKQNRJWSPYO
CLIENT_SECRET:P4QZGN4PGDZPXOYD55R2NBWVENSPLM3K4JB4YNZXZLC0K4GQ


To query the Foursquare database, a **URL** had to be built from the personal credentials and the location to be explored. In this case, the location was given by the Latitude and Longitude columns of the districts table. The resulting URL looked like this: 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius{}&limit{}'.format(CLIENT_ID, CLIENT_SECRET, VERSION, lat, lng, radius, LIMIT). As a test, a URL for a single district was created to analyse the first district of each city.

* **BARCELONA**

In [39]:
bcn_districts.loc[0, 'District']
district_latitude = bcn_districts.loc[0, 'Latitude']
district_longitude = bcn_districts.loc[0, 'Longitude']
district_name = bcn_districts.loc[0, 'District']
print('Latitude and longitude values of {} are {}, {}.'.format(district_name, district_latitude, 
                                                               district_longitude))

Latitude and longitude values of Ciutat Vella are 41.3749846, 2.17327724224704.


In [40]:
LIMIT = 100
radius = 5000
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius{}&limit{}'.format(
    CLIENT_ID, CLIENT_SECRET, VERSION,district_latitude, district_longitude, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/explore?&client_id=TILJIOO41TAHQHNOEJCU3BQJ5D4PHGUEXCPT0LKQNRJWSPYO&client_secret=P4QZGN4PGDZPXOYD55R2NBWVENSPLM3K4JB4YNZXZLC0K4GQ&v=20180604&ll=41.3749846,2.17327724224704&radius5000&limit100'

In [41]:
results = requests.get(url).json()

def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
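
The next cell prints `nearby_venues.shape`, but the cell that builds `nearby_venues` is not shown. The following is a minimal sketch of how such a table could be built from the JSON response, assuming the same response/groups/items/venue layout that `getNearbyVenues` relies on later in this report. The function name `venues_to_frame` and the sample payload are hypothetical.

```python
import pandas as pd


def venues_to_frame(results):
    """Flatten an explore-style JSON response into one row per venue."""
    items = results["response"]["groups"][0]["items"]
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories", [])
        rows.append({
            "name": venue["name"],
            # Use the first category name, or None when the list is empty
            "categories": categories[0]["name"] if categories else None,
            "lat": venue["location"]["lat"],
            "lng": venue["location"]["lng"],
        })
    return pd.DataFrame(rows, columns=["name", "categories", "lat", "lng"])


# Hypothetical payload that mimics the structure of the real response
sample_results = {"response": {"groups": [{"items": [
    {"venue": {"name": "Venue A", "location": {"lat": 41.37, "lng": 2.17},
               "categories": [{"name": "Bar"}]}},
    {"venue": {"name": "Venue B", "location": {"lat": 41.38, "lng": 2.18},
               "categories": []}},
]}]}}

sample_venues = venues_to_frame(sample_results)
print(sample_venues.shape)
```

Unlike indexing `categories[0]` directly, this version does not fail when a venue has no category.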

In [43]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

30 venues were returned by Foursquare.


* **MADRID**

In [44]:
mad_districts.loc[0, 'District']
district_latitude_2 = mad_districts.loc[0, 'Latitude']
district_longitude_2 = mad_districts.loc[0, 'Longitude']
district_name_2 = mad_districts.loc[0, 'District']

print('Latitude and longitude values of {} are {}, {}.'.format(district_name_2, district_latitude_2, 
                                                               district_longitude_2))

Latitude and longitude values of Centro are 47.5490251, 1.7324062.


In [45]:
LIMIT =100
radius = 5000
url_2 = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius{}&limit{}'.format(
    CLIENT_ID, CLIENT_SECRET, VERSION,district_latitude_2, district_longitude_2, radius, LIMIT)
url  # note: this displays url (the Barcelona request); the Madrid request is stored in url_2

'https://api.foursquare.com/v2/venues/explore?&client_id=TILJIOO41TAHQHNOEJCU3BQJ5D4PHGUEXCPT0LKQNRJWSPYO&client_secret=P4QZGN4PGDZPXOYD55R2NBWVENSPLM3K4JB4YNZXZLC0K4GQ&v=20180604&ll=41.3749846,2.17327724224704&radius5000&limit100'

In [46]:
results_2 = requests.get(url_2).json()

def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [48]:
print('{} venues were returned by Foursquare.'.format(nearby_venues_2.shape[0]))

7 venues were returned by Foursquare.


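Although the limit was set to 100 venues and the radius to 5 km around each district's location, at most 30 venues were returned per district in both Barcelona and Madrid. A likely cause is that the request URL writes the radius and limit parameters without an = sign (it ends in radius5000&limit100), so the API probably ignores both values and falls back to its defaults.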
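
The URL printed earlier ends in `radius5000&limit100`, with no `=` between each parameter name and its value. A minimal sketch of building the query string with `urllib.parse.urlencode`, which inserts the separators automatically, is shown below. The function name `build_explore_url` and the placeholder credentials are hypothetical, and whether the API then returns more than 30 venues is not verified here.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Return an explore URL with every parameter written as name=value."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma between latitude and longitude readable
    return f"{EXPLORE_ENDPOINT}?{urlencode(params, safe=',')}"


example_url = build_explore_url(
    "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET", "20180604",
    41.3749846, 2.17327724224704, 5000, 100,
)
print(example_url)
```

Passing placeholders here also keeps the real credentials out of the printed output.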

After the URL was created, it was requested with **requests.get(URL).json()**. The response was parsed as JSON and used to extract information about the venues of each district, such as "venue name", "venue categories" and "venue location". All the venues were collected through getNearbyVenues and getNearbyVenues_2 for Barcelona and Madrid respectively.

* **BARCELONA**

In [49]:
def getNearbyVenues(names, latitudes, longitudes):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius{}&limit{}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng,
            radius,
        LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['District', 
                  'District Latitude', 
                  'District Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [50]:
bcn_venues = getNearbyVenues(names=bcn_districts['District'], latitudes=bcn_districts['Latitude'], longitudes=bcn_districts['Longitude'])

Ciutat Vella
Eixample
Sants-Montjuïc
Les Corts
Sarrià-Sant Gervasi
Gracia Barcelona
Horta - Guinardó
Nou Barris
Sant Andreu
Sant Martí


In [51]:
print(bcn_venues.shape)
bcn_venues.head()

(300, 7)


Unnamed: 0,District,District Latitude,District Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Ciutat Vella,41.374985,2.173277,macera,41.375589,2.170493,Cocktail Bar
1,Ciutat Vella,41.374985,2.173277,The Fish & Chips Shop,41.375965,2.174152,Bistro
2,Ciutat Vella,41.374985,2.173277,Cassette Bar,41.377324,2.173629,Bar
3,Ciutat Vella,41.374985,2.173277,Marea Alta,41.376484,2.175106,Seafood Restaurant
4,Ciutat Vella,41.374985,2.173277,Pizza Circus,41.377905,2.172911,Pizza Place


* **MADRID**

In [52]:
def getNearbyVenues_2(names, latitudes, longitudes):
    
    venues_list_2=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url_2 = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius{}&limit{}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng,
            radius,
        LIMIT)
            
        # make the GET request
        results_2 = requests.get(url_2).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list_2.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results_2])

    nearby_venues_2 = pd.DataFrame([item for venue_list_2 in venues_list_2 for item in venue_list_2])
    nearby_venues_2.columns = ['District', 
                  'District Latitude', 
                  'District Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues_2)

In [53]:
mad_venues = getNearbyVenues_2(names=mad_districts['District'], latitudes=mad_districts['Latitude'], longitudes=mad_districts['Longitude'])

Centro
Arganzuela
Retiro Madrid
Salamanca
Chamartín
Tetuán Madrid
Chamberí
Fuencarral-El Pardo
Moncloa-Aravaca
Latina
Carabanchel
Usera
Puente de Vallecas
Moratalaz
Ciudad Lineal
Hortaleza
Villaverde
Villa de Vallecas
Vicálvaro
San Blas-Canillejas
Barajas


In [64]:
print(mad_venues.shape)
mad_venues.head()

(586, 7)


Unnamed: 0,District,District Latitude,District Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Centro,47.549025,1.732406,Zara,47.612253,1.730619,Clothing Store
1,Centro,47.549025,1.732406,Gite Bellevue,47.553991,1.738166,Bed & Breakfast
2,Centro,47.549025,1.732406,Carrefour Contact,47.52555,1.812174,Supermarket
3,Centro,47.549025,1.732406,Pasteur TP,47.620298,1.758394,Construction & Landscaping
4,Centro,47.549025,1.732406,Alimentation Générale,47.542476,1.848846,Supermarket


Once all this information about the venues of each city was obtained, the following steps were run to solve the project problem:
* The venues were **grouped by District** to find out how many venues were collected for each district.

* **BARCELONA**

In [62]:
bcn_venues_districts = bcn_venues.groupby('District').count()
bcn_venues_districts = bcn_venues_districts.drop(['District Latitude','District Longitude','Venue Latitude','Venue Longitude','Venue Category'], axis=1)
bcn_venues_districts

District,Venue
Ciutat Vella,30
Eixample,30
Gracia Barcelona,30
Horta - Guinardó,30
Les Corts,30
Nou Barris,30
Sant Andreu,30
Sant Martí,30
Sants-Montjuïc,30
Sarrià-Sant Gervasi,30


* **MADRID**

In [63]:
mad_venues_districts = mad_venues.groupby('District').count()
mad_venues_districts = mad_venues_districts.drop(['District Latitude','District Longitude','Venue Latitude','Venue Longitude','Venue Category'], axis=1)
mad_venues_districts

District,Venue
Arganzuela,30
Barajas,30
Carabanchel,30
Centro,7
Chamartín,9
Chamberí,30
Ciudad Lineal,30
Fuencarral-El Pardo,30
Hortaleza,30
Latina,30
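
Counting venues per district can also be written more compactly. The sketch below uses `groupby(...).size()` on a small hypothetical sample with the same `District` and `Venue` column names as the venues tables, which avoids dropping the unused columns afterwards. The sample rows are illustrative and are not real results.

```python
import pandas as pd

# Hypothetical sample with the same column names as the venues tables
sample = pd.DataFrame({
    "District": ["Ciutat Vella", "Ciutat Vella", "Eixample"],
    "Venue": ["Venue A", "Venue B", "Venue C"],
})

# One entry per district, sorted from most to fewest venues
venues_per_district = (
    sample.groupby("District")
    .size()
    .sort_values(ascending=False)
    .rename("Venue")
)
print(venues_per_district)
```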


* Then, **venue_category** was created to show the total number of venues of each category in each city.

* **BARCELONA**

In [61]:
venue_category_bcn = bcn_venues.groupby('Venue Category').count() 
venue_category_bcn = venue_category_bcn.drop(['District Latitude','District Longitude', 'Venue','Venue Latitude','Venue Longitude'], axis=1)
venue_category_bcn = venue_category_bcn.sort_values(by=['District'], ascending=[False])
venue_category_bcn

Venue Category,District
Restaurant,16
Park,15
Tapas Restaurant,14
Spanish Restaurant,14
Mediterranean Restaurant,13
Hotel,10
Plaza,10
Scenic Lookout,7
Pizza Place,7
Bakery,7


* **MADRID**

In [65]:
venue_category_mad = mad_venues.groupby('Venue Category').count() 
venue_category_mad = venue_category_mad.drop(['District Latitude','District Longitude', 'Venue','Venue Latitude','Venue Longitude'], axis=1)
venue_category_mad = venue_category_mad.sort_values(by=['District'], ascending=[False])
venue_category_mad

Venue Category,District
Spanish Restaurant,60
Park,30
Restaurant,21
Italian Restaurant,17
Gym,17
Tapas Restaurant,17
Bar,16
Supermarket,16
Hotel,16
Pub,15


* The previous step helped to build a list of *Venue Category* values related only to **Hospitality**.

* **BARCELONA**

In [73]:
hospitality_bcn = bcn_venues[bcn_venues['Venue Category'].isin(['Argentinian Restaurant','Asian Restaurant','Bakery','Bar','Bed & Breakfast','Beer Bar','Bistro','Breakfast Spot','Burger Joint','Café','Chinese Restaurant','Cocktail Bar','Coffee Shop','Cupcake Shop','Deli/Bodega','Dessert Shop','Diner','Donut Shop','Empanada Restaurant','Food','Food & Drink Shop','Gastropub','German Restaurant','Hotel','Ice Cream Shop','Italian Restaurant','Japanese Restaurant','Mediterranean Restaurant','Mexican Restaurant','Pizza Place','Polish Restaurant','Restaurant','Sandwich Place','Seafood Restaurant','Snack Placce','Spanish Restaurant','Sushi Restaurant','Taco Place','Tapas Restaurant','Thai Restaurant','Vegetarian/Vegan Restaurant','Wine Bar'])]
print(hospitality_bcn.shape)
hospitality_bcn

(147, 7)


Unnamed: 0,District,District Latitude,District Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Ciutat Vella,41.374985,2.173277,macera,41.375589,2.170493,Cocktail Bar
1,Ciutat Vella,41.374985,2.173277,The Fish & Chips Shop,41.375965,2.174152,Bistro
2,Ciutat Vella,41.374985,2.173277,Cassette Bar,41.377324,2.173629,Bar
3,Ciutat Vella,41.374985,2.173277,Marea Alta,41.376484,2.175106,Seafood Restaurant
4,Ciutat Vella,41.374985,2.173277,Pizza Circus,41.377905,2.172911,Pizza Place
7,Ciutat Vella,41.374985,2.173277,El Pachuco,41.376369,2.169148,Mexican Restaurant
9,Ciutat Vella,41.374985,2.173277,My Fu*king Restaurant,41.377767,2.173236,Spanish Restaurant
10,Ciutat Vella,41.374985,2.173277,Frankie Gallo Cha Cha Cha,41.37845,2.172683,Pizza Place
15,Ciutat Vella,41.374985,2.173277,Cañete,41.379154,2.173092,Tapas Restaurant
17,Ciutat Vella,41.374985,2.173277,Trópico,41.377817,2.171247,Restaurant


* **MADRID**

In [72]:
hospitality_mad = mad_venues[mad_venues['Venue Category'].isin(['Argentinian Restaurant','Asian Restaurant','BBQ Joint','Bakery','Bar','Beer Bar','Beer Garden','Bistro','Brazilian Restaurant','Breakfast Spot','Burger Joint','Café','Chinese Restaurant','Cocktail Bar','Coffee Shop','Cuban Restaurant','Cupcake Shop','Deli/Bodega','Dessert Shop','Diner','Donut Shop','Dumpling Restaurant','Eastern European Restaurant','Fast Food Restaurant','Food Service','French Restaurant','Gastropub','Hotel','Hotel Bar','Ice Cream Shop','Indian Restaurant','Italian Restaurant','Japanese Restaurant','Juice Bar','Korean Restaurant','Latin American Restaurant','Mediterranean Restaurant','Mexican Restaurant','Middle Eastern Restaurant','Paella Restaurant','Pastry Shop','Persian Restaurant','Peruvian Restaurant','Pizza Place','Polish Restaurant','Restaurant','Sandwich Place','Seafood Restaurant','Snack Placce','Soup Place','Spanish Restaurant','Steakhouse','Sushi Restaurant','Tapas Restaurant','Thai Restaurant','Vegetarian/Vegan Restaurant','Venezuelan Restaurant','Wine Bar'])]
print(hospitality_mad.shape)
hospitality_mad

(319, 7)


Unnamed: 0,District,District Latitude,District Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
5,Centro,47.549025,1.732406,Le Beauharnais,47.542572,1.84909,Restaurant
7,Arganzuela,40.398068,-3.693734,Tres Cerditos,40.397316,-3.694184,Chinese Restaurant
10,Arganzuela,40.398068,-3.693734,Magasand Deli,40.396811,-3.691293,Restaurant
11,Arganzuela,40.398068,-3.693734,Las tinajas,40.396993,-3.697779,Tapas Restaurant
12,Arganzuela,40.398068,-3.693734,PanArte,40.399279,-3.694182,Bakery
14,Arganzuela,40.398068,-3.693734,Salón de Té Al Yabal,40.399015,-3.700249,Cocktail Bar
15,Arganzuela,40.398068,-3.693734,Trattoria In Crescendo,40.394582,-3.698388,Italian Restaurant
16,Arganzuela,40.398068,-3.693734,Havana Blues,40.40205,-3.698488,Cuban Restaurant
17,Arganzuela,40.398068,-3.693734,La Pequeña Graná,40.399574,-3.69855,Tapas Restaurant
18,Arganzuela,40.398068,-3.693734,El Quinto Pecado,40.400028,-3.694446,Gastropub
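
The first row of the Madrid output places *Centro* at latitude 47.549025 and longitude 1.732406, far from the 40.398068, -3.693734 listed for *Arganzuela*, which suggests that the district name was geocoded to a different place. A minimal sketch of a sanity check that could catch such rows is shown below. The function name `flag_outlier_districts`, the 1-degree tolerance and the use of the median as the city's reference point are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

def flag_outlier_districts(venues: pd.DataFrame, max_deg: float = 1.0) -> pd.DataFrame:
    """Return one row per district whose coordinates lie more than
    `max_deg` degrees from the median district coordinates."""
    districts = venues.drop_duplicates("District")[
        ["District", "District Latitude", "District Longitude"]
    ]
    lat_off = (districts["District Latitude"] - districts["District Latitude"].median()).abs()
    lon_off = (districts["District Longitude"] - districts["District Longitude"].median()).abs()
    return districts[(lat_off > max_deg) | (lon_off > max_deg)]

# Small sample. Centro and Arganzuela are copied from the output above;
# the Retiro coordinates are made up for illustration.
sample = pd.DataFrame({
    "District": ["Centro", "Arganzuela", "Arganzuela", "Retiro"],
    "District Latitude": [47.549025, 40.398068, 40.398068, 40.41],
    "District Longitude": [1.732406, -3.693734, -3.693734, -3.68],
})
outliers = flag_outlier_districts(sample)
print(outliers)
```

Running a check like this before querying venues would make it possible to correct a district's coordinates by hand instead of discovering the problem in the results.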


* Once the list of Hospitality venues was defined, it was grouped by District to identify which districts have the most restaurants, bars, etc.

* **BARCELONA**

In [70]:
hospitality_districts_bcn = hospitality_bcn.groupby('District').count()
hospitality_districts_bcn = hospitality_districts_bcn.drop(['District Latitude','District Longitude','Venue Latitude','Venue Longitude','Venue Category'], axis=1)
hospitality_districts_bcn = hospitality_districts_bcn.sort_values(by=['Venue'], ascending=[False])
hospitality_districts_bcn

District,Venue
Sant Martí,23
Les Corts,20
Ciutat Vella,18
Eixample,18
Gracia Barcelona,17
Horta - Guinardó,14
Sant Andreu,14
Nou Barris,13
Sarrià-Sant Gervasi,7
Sants-Montjuïc,3


* **MADRID**

In [67]:
hospitality_districts_mad = hospitality_mad.groupby('District').count()
hospitality_districts_mad = hospitality_districts_mad.drop(['District Latitude','District Longitude','Venue Latitude','Venue Longitude','Venue Category'], axis=1)
hospitality_districts_mad = hospitality_districts_mad.sort_values(by=['Venue'], ascending=[False])
hospitality_districts_mad

District,Venue
Tetuán Madrid,23
Hortaleza,22
Ciudad Lineal,21
Latina,21
Barajas,21
San Blas-Canillejas,19
Salamanca,18
Carabanchel,17
Fuencarral-El Pardo,17
Arganzuela,16
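
The group, count, drop and sort steps used for both cities can be written more compactly. The sketch below uses `Series.value_counts`, which counts rows per district and sorts them in descending order by default. It should give the same counts as the cells above as long as no venue names are missing. The helper name `venues_per_district` and the three-row `toy` frame are illustrative assumptions.

```python
import pandas as pd

def venues_per_district(hospitality: pd.DataFrame) -> pd.DataFrame:
    """Count venues per district, largest first."""
    counts = hospitality["District"].value_counts()
    # Name the index and the single column like the notebook's output.
    return counts.rename_axis("District").to_frame("Venue")

# Toy data with the same column names as the notebook's DataFrames.
toy = pd.DataFrame({
    "District": ["Latina", "Barajas", "Latina"],
    "Venue": ["A", "B", "C"],
    "Venue Category": ["Bar", "Café", "Hotel"],
})
per_district = venues_per_district(toy)
print(per_district)
```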


* Finally, the hospitality venues were grouped by category to identify which categories are most common in each city.

* **BARCELONA**

In [60]:
hospitality_categories_bcn = hospitality_bcn.groupby('Venue Category').count()
hospitality_categories_bcn = hospitality_categories_bcn.drop(['District','District Latitude','District Longitude','Venue Latitude','Venue Longitude'], axis=1)
hospitality_categories_bcn = hospitality_categories_bcn.sort_values(by=['Venue'], ascending=[False])
hospitality_categories_bcn

Venue Category,Venue
Restaurant,16
Spanish Restaurant,14
Tapas Restaurant,14
Mediterranean Restaurant,13
Hotel,10
Bakery,7
Pizza Place,7
Bar,6
Coffee Shop,5
Italian Restaurant,5


* **MADRID**

In [68]:
hospitality_categories_mad = hospitality_mad.groupby('Venue Category').count()
hospitality_categories_mad = hospitality_categories_mad.drop(['District','District Latitude','District Longitude','Venue Latitude','Venue Longitude'], axis=1)
hospitality_categories_mad = hospitality_categories_mad.sort_values(by=['Venue'], ascending=[False])
hospitality_categories_mad

Venue Category,Venue
Spanish Restaurant,60
Restaurant,21
Italian Restaurant,17
Tapas Restaurant,17
Bar,16
Hotel,16
Pizza Place,12
Café,12
Ice Cream Shop,11
Coffee Shop,10
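
To turn category counts like these into percentage shares, each count is divided by the total number of hospitality venues. A minimal sketch follows, using `value_counts(normalize=True)`. The helper name `top_category_shares` and the toy data are assumptions for illustration only.

```python
import pandas as pd

def top_category_shares(hospitality: pd.DataFrame, n: int = 5) -> pd.Series:
    """Percentage of venues in each of the n most common categories."""
    shares = hospitality["Venue Category"].value_counts(normalize=True)
    return (shares * 100).round(2).head(n)

# Toy data: three bars, two hotels and one café.
toy = pd.DataFrame({
    "Venue Category": ["Bar", "Bar", "Bar", "Hotel", "Hotel", "Café"],
})
shares = top_category_shares(toy, n=2)
print(shares)
```

Because the shares are computed over all hospitality venues before `head(n)` is applied, the top categories do not need to add up to 100%.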


## 4. Results

### 4.1. Barcelona

Barcelona is a Spanish coastal city in the northeast of the Iberian Peninsula. Its many neighborhoods are grouped into 10 districts: *Ciutat Vella*, *Eixample*, *Sants-Montjuïc*, *Les Corts*, *Sarrià-Sant Gervasi*, *Gràcia*, *Horta-Guinardó*, *Nou Barris*, *Sant Andreu* and *Sant Martí*.

Running the code described above revealed the following about Barcelona and its Hospitality industry:

* 1. The Foursquare API returned only 30 venues for each district of Barcelona, so the **total number of venues** collected for this study was **300**. Although this is a small sample, it gives an idea of how important Hospitality is in each district, because the same number of venues is compared in every district.


* 2. After grouping by venue category, it was easy to identify the categories related to Hospitality, and a list of them was made. As a result, 147 venues were classified as Hospitality, so **Hospitality represents 49%** of all the venues collected in Barcelona.


* 3. In terms of **Districts**, the most relevant for the Hospitality industry is ***Sant Martí***, which accounts for 15.64% of all Hospitality venues in Barcelona. In addition, 76.67% of this district's venues (23 of 30) are related to Hospitality. The top 5 districts by share of Hospitality venues are:

   * *Sant Martí* with 15.64% of Hospitality venues.
   * *Les Corts* with 13.60% of Hospitality venues.
   * *Ciutat Vella* with 12.24% of Hospitality venues.
   * *Eixample* with 12.24% of Hospitality venues.
   * *Gràcia* with 11.56% of Hospitality venues.


* 4. Regarding **type of food**, Barcelona shows some preference for **typical Mediterranean food**: *Spanish Restaurant* (9.52%), *Tapas Restaurant* (9.52%) and *Mediterranean Restaurant* (8.84%) are all among the top 4 venue categories. The top of the list is *Restaurant* (10.88%), a generic category that does not specify any type of food.
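
The Barcelona percentages above follow directly from the district counts shown earlier. As a rough check, the sketch below recomputes the city-wide share and the per-district shares from those counts. The variable names are illustrative, and the values are rounded to two decimals, so the last digit may differ slightly from the figures quoted in the text.

```python
# Hospitality venues per district, copied from the Barcelona output above.
bcn_counts = {
    "Sant Martí": 23, "Les Corts": 20, "Ciutat Vella": 18, "Eixample": 18,
    "Gracia Barcelona": 17, "Horta - Guinardó": 14, "Sant Andreu": 14,
    "Nou Barris": 13, "Sarrià-Sant Gervasi": 7, "Sants-Montjuïc": 3,
}
total_venues = 30 * len(bcn_counts)           # 30 venues collected per district
total_hospitality = sum(bcn_counts.values())  # hospitality venues in the city

# Share of hospitality venues among all venues collected in the city.
city_share = 100 * total_hospitality / total_venues
# Each district's share of the city's hospitality venues.
district_share = {
    d: round(100 * n / total_hospitality, 2) for d, n in bcn_counts.items()
}

print(f"Hospitality share of all venues: {city_share:.2f}%")
for district, share in list(district_share.items())[:5]:
    print(f"{district}: {share}%")
```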

### 4.2. Madrid

Madrid is the capital of Spain and its most populous city. It is located in the center of the Iberian Peninsula. Madrid has 131 neighborhoods, grouped into the following 21 districts: *Centro, Arganzuela, Retiro, Salamanca, Chamartín, Tetuán, Chamberí, Fuencarral-El Pardo, Moncloa-Aravaca, Latina, Carabanchel, Usera, Puente de Vallecas, Moratalaz, Ciudad Lineal, Hortaleza, Villaverde, Villa de Vallecas, Vicálvaro, San Blas-Canillejas* and *Barajas*.

As with Barcelona, the analysis revealed the following about Hospitality in Madrid:

* 1. The Foursquare API returned 30 venues for each district of Madrid except *Centro* and *Chamartín*, which received only 7 and 9 venues, respectively. This means that, in this project, Madrid has **589 venues** to analyse.


* 2. After grouping by category, 319 of the venues were found to be related to Hospitality. In other words, **Hospitality represents 54.15%** of all the venues collected in Madrid.


* 3. By **District**, ***Tetuán*** and ***Hortaleza*** are the two most prominent districts for Hospitality in Madrid: the first accounts for **7.21%** of the city's Hospitality venues and the second for 6.89%. Within each district, 76.67% and 73.33% of the venues, respectively, are related to Hospitality. The top 5 districts by share of Hospitality venues in Madrid are:

    * *Tetuán* with 7.21% of the Hospitality venues.
    * *Hortaleza* with 6.89% of the Hospitality venues.
    * *Ciudad Lineal* with 6.58% of the Hospitality venues.
    * *Latina* with 6.58% of the Hospitality venues.
    * *Barajas* with 6.58% of the Hospitality venues.


* 4. In terms of **type of food**, the most common type of venue is the **Spanish Restaurant**, with **18.80%** of all Hospitality venues. The top 5 categories in Madrid are:

    * Spanish Restaurant (18.80%)
    * Restaurant (6.58%)
    * Italian Restaurant (5.32%)
    * Tapas Restaurant (5.32%)
    * Bar (5.01%)
 

## 5. Discussion

### 5.1. Comparison of results

As the results show, Hospitality venues make up a larger proportion of all venues in Madrid than in Barcelona: 54.15% in Madrid against 49% in Barcelona. Taking the venues collected as a rough proxy for the tertiary sector, this suggests that Hospitality carries more weight in Madrid.

Another aspect to consider is the distribution across the districts of each city. Barcelona, a smaller city than Madrid, has only 10 districts, which could explain why each district's share of Hospitality venues is larger than in Madrid. *Sant Martí* has 15.64% of the Hospitality venues of Barcelona, followed closely by *Les Corts* and *Ciutat Vella* with 13.60% and 12.24%, respectively. In Madrid, *Tetuán* and *Hortaleza* are the districts with the largest shares of Hospitality venues, with 7.21% and 6.89%.
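
One way to make the effect of the number of districts explicit is to compare each leading district's share with the share it would have if venues were spread evenly, which is 100/10 = 10% for Barcelona and 100/21 ≈ 4.76% for Madrid. The sketch below does this with the counts reported earlier. The ratio is an illustrative measure introduced here, not one used in the original analysis.

```python
def concentration_ratio(top_count: int, total: int, n_districts: int) -> float:
    """Top district's share divided by the even-spread share (1 / n_districts)."""
    return (top_count / total) / (1 / n_districts)

# Counts taken from the outputs above: Sant Martí 23 of 147, Tetuán 23 of 319.
bcn_ratio = concentration_ratio(top_count=23, total=147, n_districts=10)
mad_ratio = concentration_ratio(top_count=23, total=319, n_districts=21)

print(f"Barcelona: {bcn_ratio:.2f}x the even share")
print(f"Madrid:    {mad_ratio:.2f}x the even share")
```

Under this measure both leading districts hold roughly 1.5 times their even share, which is consistent with the idea that the gap in raw percentages largely reflects the number of districts.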

The last aspect considered in this study is the type of food that predominates in each city. It is clear that Spanish restaurants are important in both cities. In Madrid, they are the most common option: 18.80% of the Hospitality venues are of this type. In Barcelona, however, Spanish restaurants are less dominant than in Madrid and rank second (9.52%). Barcelona shows a wider variety of food preferences: its top four categories each account for between roughly 9% and 11% of Hospitality venues.

### 5.2. Observations and recommendations

The main purpose of this project was to determine how important the Hospitality industry is in Madrid and Barcelona. It was also interesting to dig deeper into the data and analyse which districts have the greatest influence. Although the analysis was successful, future research could determine how **important the Hospitality industry is for each district**, that is, count the Hospitality venues in each district and divide by the district's total number of venues.
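
The per-district ratio proposed above could be computed along these lines. This is a sketch only: the function name `hospitality_ratio_by_district`, the shortened category set and the toy data are assumptions, while the column names `District` and `Venue Category` match the DataFrames used earlier.

```python
import pandas as pd

def hospitality_ratio_by_district(venues: pd.DataFrame, categories: set) -> pd.Series:
    """Percentage of each district's venues whose category is in `categories`."""
    is_hospitality = venues["Venue Category"].isin(categories)
    # The mean of a boolean column is the fraction of True values per district.
    ratio = is_hospitality.groupby(venues["District"]).mean() * 100
    return ratio.round(2).sort_values(ascending=False)

# Shortened, hypothetical category set; the notebook uses a much longer list.
HOSPITALITY = {"Restaurant", "Bar", "Hotel"}

toy = pd.DataFrame({
    "District": ["Latina", "Latina", "Barajas", "Barajas", "Barajas", "Barajas"],
    "Venue Category": ["Bar", "Park", "Hotel", "Gym", "Plaza", "Museum"],
})
ratios = hospitality_ratio_by_district(toy, HOSPITALITY)
print(ratios)
```

Applied to the full venue tables rather than the filtered hospitality ones, this would rank districts by how much of their own activity is Hospitality, independently of how many venues each district contributed.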

Another avenue for future analysis is to go deeper and repeat the same **analysis by neighborhood**. This would give a much more detailed picture of the importance of the Hospitality industry. Using neighborhoods instead of districts could also **increase the amount of data**, and a larger dataset generally yields more reliable and realistic results.

Finally, this approach can be applied to any town, city, country or continent. Future studies could compare more than two cities, for example the capitals of different European countries, such as Madrid, Paris, Berlin, Rome or Lisbon.

## 6. Conclusion

The objective of this project was to analyse the importance of the Hospitality industry in the two biggest cities of Spain: Barcelona and Madrid.

After defining the districts, searching for venues and selecting the relevant data, the following conclusions can be drawn:

* The Hospitality industry has a bigger presence in Madrid than in Barcelona: it represents 54.15% of the venues collected in Madrid, against 49% in Barcelona.

* The distribution of Hospitality venues across each city is significantly different. In Barcelona these venues are distributed proportionately among the districts; in contrast, Madrid shows a disproportionate distribution. One explanation is that the two cities have very different numbers of districts.

* Although local food is the predominant type of food in both cities, Mediterranean restaurants have more influence in Barcelona, while Spanish restaurants dominate in Madrid.

Based on these conclusions, it can be argued that the measures the Spanish government has taken to control the effects of COVID-19 have affected both cities considerably. Madrid is likely to have been the more affected, because almost 55% of the venues collected there belong to the Hospitality industry.

From a social perspective, citizens of both cities prefer local food based on the Mediterranean or Spanish diet over international food such as Thai, Japanese or Chinese.

Finally, it is easier to find a place to have a meal in Madrid if you are in *Tetuán*, *Hortaleza* or *Ciudad Lineal*; in Barcelona, your chances are best in these districts: *Sant Martí*, *Les Corts* or *Ciutat Vella*.