### Open file and prepare Data

In [2]:
import pandas as pd
import numpy as np
import pickle
import json

In [184]:
df = pd.read_csv("./data/515k-hotel-reviews-data-in-europe.zip")

In [10]:
def get_country(address):
    # The country is the last word of the address; "United Kingdom" spans two words
    country = address.split()[-1]
    if country == "Kingdom":
        return "United Kingdom"
    return country

df['Country'] = df.Hotel_Address.apply(get_country)
df.Country.value_counts()

United Kingdom    262301
Spain              60149
France             59928
Netherlands        57214
Austria            38939
Italy              37207
Name: Country, dtype: int64

In [11]:
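def get_city(address, country):
    # UK addresses put the city fifth from the end, before the two-part
    # postcode and "United Kingdom"; elsewhere it is the second-to-last word
    words = address.split()
    if country == "United Kingdom":
        return words[-5]
    return words[-2]

df['City'] = df[['Hotel_Address','Country']].apply(lambda x: get_city(x['Hotel_Address'], x['Country']), axis=1)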
df.City.value_counts()

London       262301
Barcelona     60149
Paris         59928
Amsterdam     57214
Vienna        38939
Milan         37207
Name: City, dtype: int64

### Scraping Hotel Prices
The dataset is missing a lot of information that I think is important for better insights, such as hotel prices.
To collect it, I first need the exact address of the Booking page that holds the information for each hotel.
This first step is done by scraping the result links of a web search (Google was the first choice).
Once I have the final Booking address of each hotel, a second scraping pass is needed to extract the desired information from each page.

In [12]:
hotel_city = df[['Hotel_Name','City']].groupby(['Hotel_Name','City']).count().reset_index()
scrap = hotel_city.apply(lambda x: ('Hotel ' + x['Hotel_Name'] + ' ' + x['City']), axis=1)
scrap.head()

0                    Hotel 11 Cadogan Gardens London
1                               Hotel 1K Hotel Paris
2    Hotel 25hours Hotel beim MuseumsQuartier Vienna
3                                    Hotel 41 London
4    Hotel 45 Park Lane Dorchester Collection London
dtype: object

This is the query I will send to a search engine. I wanted to use Google, but it has limitations, so I tried other engines. The one that worked best was 'startpage'. The idea is to take the first result that points to booking.com as the hotel's address on that site. Once I have gathered all the addresses, I will try to scrape the prices from each of them.

In [13]:
search_address = ['https://www.startpage.com/do/search?cmd=process_search&query=booking.com+'
                     +i.replace(" ", "+") for i in scrap]
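The list comprehension above only replaces spaces, so a hotel name containing a character such as `&` would alter the query string. As a hedged alternative, the sketch below builds the same startpage URL with the standard library's `urlencode`. The helper name `build_search_url` and the hotel name `A & B` are hypothetical and used only for illustration.

```python
from urllib.parse import urlencode

# Same endpoint and parameters as the list comprehension above.
BASE_URL = "https://www.startpage.com/do/search"


def build_search_url(query):
    """Return a startpage URL that searches for 'booking.com <query>'."""
    params = {"cmd": "process_search", "query": "booking.com " + query}
    # urlencode turns spaces into '+' and percent-encodes characters such as '&'.
    return BASE_URL + "?" + urlencode(params)


plain = build_search_url("Hotel 41 London")
# Hypothetical hotel name containing a reserved character.
tricky = build_search_url("Hotel A & B Paris")
print(plain)
print(tricky)
```

For names made only of letters, digits and spaces, this should give the same URL as the original expression; the difference only shows up for reserved characters.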

Once I have all the search addresses, I can start scraping the Booking addresses. I also save the intermediate objects in case I need to start over or reuse them at some other point of the analysis.

In [14]:
with open('./sav/hotel_city.sav', 'wb') as f:
    pickle.dump(hotel_city, f)
with open('./sav/search_address.sav', 'wb') as f:
    pickle.dump(search_address, f)

#### Scraping of Booking addresses

In [6]:
import requests
from bs4 import BeautifulSoup
import time

I will save all the addresses in a list, and I will also set a variable called `begin` that tells me where to start from, in case the scraping process stops and I need to resume.

In [16]:
bookingaddress = []

In [17]:
begin = 0
bookingaddress = bookingaddress[:begin]
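The manual `begin` bookkeeping can also be derived from a saved list of results. The following is a minimal sketch of that idea, not the method used in this notebook: the function `scrape_with_checkpoint`, its `every` parameter and the stand-in `fake_fetch` are all hypothetical names, and the demo uses no network access.

```python
import os
import pickle
import tempfile


def scrape_with_checkpoint(urls, fetch, checkpoint_path, every=50):
    """Apply `fetch` to each URL, resuming from a pickled list of earlier results."""
    results = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, "rb") as fh:
            results = pickle.load(fh)
    begin = len(results)  # plays the role of the manual `begin` variable
    for i, url in enumerate(urls[begin:], start=begin):
        results.append(fetch(url))
        # Save every `every` items and once more at the very end.
        if (i + 1) % every == 0 or i + 1 == len(urls):
            with open(checkpoint_path, "wb") as fh:
                pickle.dump(results, fh)
    return results


# Demo with a stand-in fetch function that records which URLs were requested.
calls = []


def fake_fetch(url):
    calls.append(url)
    return url.upper()


path = os.path.join(tempfile.mkdtemp(), "bookingaddress.sav")
first_run = scrape_with_checkpoint(["a", "b", "c"], fake_fetch, path, every=2)
second_run = scrape_with_checkpoint(["a", "b", "c", "d", "e"], fake_fetch, path, every=2)
print(second_run, calls)
```

In the demo, the second call fetches only the two new items. If a run is interrupted between checkpoints, the items fetched since the last save are simply fetched again on the next run.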

In [18]:
for i, search in enumerate(search_address[begin:]):
    page = requests.get(search)
    soup = BeautifulSoup(page.content, 'html.parser')
    booking_page = []

    # Keep only the first result link that points to a Booking.com hotel page
    for link in soup.find_all('a', href=True):
        if link['href'].startswith('https://www.booking.com/hotel/'):
            booking_page.append(link['href'])
            break

    # Force the Spanish version of the page (".es.html"); "" marks a hotel not found
    if len(booking_page) == 0:
        addhotel = ""
    elif not booking_page[0].endswith('.es.html'):
        addhotel = booking_page[0][:-5] + '.es.html'
    else:
        addhotel = booking_page[0]

    bookingaddress.append(addhotel)
    print(begin + i, addhotel)

0 https://www.booking.com/hotel/gb/number-eleven.es.html
1 https://www.booking.com/hotel/fr/1-k-hotel.en-gb.es.html
2 https://www.booking.com/hotel/at/25hours-wien.es.html
3 https://www.booking.com/hotel/gb/41clubredcarnations.es.html
4 https://www.booking.com/hotel/gb/parklane.en-gb.es.html
5 https://www.booking.com/hotel/gb/88-studios.en-gb.es.html
6 https://www.booking.com/hotel/fr/9hotel-republique.es.html
7 https://www.booking.com/hotel/fr/a-la-villa-madame.es.html
8 https://www.booking.com/hotel/es/abac-barcelona.es.html
9 https://www.booking.com/hotel/es/achotelsbarcelona.es.html
10 https://www.booking.com/hotel/es/ac-marriott-diagonal-lilla.es.html
11 https://www.booking.com/hotel/es/acirla.es.html
12 https://www.booking.com/hotel/it/ac-milano.es.html
13 https://www.booking.com/hotel/fr/ac-paris-porte-maillot-by-marriott.es.html
14 https://www.booking.com/hotel/es/ac-sants.es.html
15 https://www.booking.com/hotel/es/victoriasuites.es.html
16 https://www.booking.com/hotel/it/dor

134 https://www.booking.com/hotel/it/mirage-milano.es.html
135 https://www.booking.com/hotel/fr/comfort-paris-18eme-saint-pierre.es.html
136 https://www.booking.com/hotel/at/arenberg.es.html
137 https://www.booking.com/hotel/it/st-george-milano.es.html
138 https://www.booking.com/hotel/fr/best-western-plus-le-18-paris.es.html
139 https://www.booking.com/hotel/fr/lejardindecluny.es.html
140 https://www.booking.com/hotel/it/madisonhotel.es.html
141 https://www.booking.com/hotel/gb/maitrise-edgware-road.es.html
142 https://www.booking.com/hotel/fr/mercedes.es.html
143 https://www.booking.com/hotel/gb/moningtonhotel.en-gb.es.html
144 https://www.booking.com/hotel/fr/best-western-nouvel-orleans.es.html
145 https://www.booking.com/hotel/fr/opera-batignolles.es.html
146 https://www.booking.com/hotel/gb/epping-forest-hotel.es.html
147 https://www.booking.com/hotel/gb/palm.en-gb.es.html
148 https://www.booking.com/hotel/fr/bestwesternfr.es.html
149 https://www.booking.com/hotel/fr/best-western-

259 https://www.booking.com/hotel/gb/holidayinnkensington.en-gb.es.html
260 https://www.booking.com/hotel/gb/holiday-inn-london-kings-cross-bloomsbury.en-gb.es.html
261 https://www.booking.com/hotel/gb/crowne-plaza-london-the-city-london1.en-gb.es.html
262 https://www.booking.com/hotel/it/milan-city.es.html
263 https://www.booking.com/hotel/fr/holiday-inn-paris-republique.es.html
264 https://www.booking.com/hotel/fr/declic.es.html
265 https://www.booking.com/hotel/at/do-co-vienna.es.html
266 https://www.booking.com/hotel/gb/jarvis-regentspark.en-gb.es.html
267 https://www.booking.com/hotel/at/am-opernring.es.html
268 https://www.booking.com/hotel/at/triest.es.html
269 https://www.booking.com/hotel/nl/leurope.es.html
270 https://www.booking.com/hotel/gb/devonporthouse.en-gb.es.html
271 https://www.booking.com/hotel/at/wilhelmshof.es.html
272 https://www.booking.com/hotel/at/hotelkaiserfranzjoseph.es.html
273 https://www.booking.com/hotel/fr/derbyalma.es.html
274 https://www.booking.com/

385 https://www.booking.com/hotel/gb/great-northern-london.en-gb.es.html
386 https://www.booking.com/hotel/gb/great-st-helen.en-gb.es.html
387 https://www.booking.com/hotel/gb/grosvenor-house-london.en-gb.es.html
388 https://www.booking.com/hotel/gb/grosvenor-house-apartments-by-jumeirah-living.en-gb.es.html
389 https://www.booking.com/hotel/es/gran-via-678.es.html
390 https://www.booking.com/hotel/es/grandpasage.es.html
391 https://www.booking.com/hotel/fr/jules-et-jim.en-gb.es.html
392 https://www.booking.com/hotel/fr/aiglon.es.html
393 https://www.booking.com/hotel/fr/amastan-paris.es.html
394 https://www.booking.com/hotel/fr/arvor.es.html
395 https://www.booking.com/hotel/fr/balzac.es.html
396 https://www.booking.com/hotel/fr/fouquet-s-barriere-paris.es.html
397 https://www.booking.com/hotel/fr/baume-paris.es.html
398 https://www.booking.com/hotel/fr/beauchamps.es.html
399 https://www.booking.com/hotel/fr/bedford.es.html
400 https://www.booking.com/hotel/fr/bel-ami.es.html
401 http

517 https://www.booking.com/hotel/gb/londonolympia.en-gb.es.html
518 https://www.booking.com/hotel/gb/hilton-london-paddington.en-gb.es.html
519 https://www.booking.com/hotel/gb/hilton-london-tower-bridge.en-gb.es.html
520 https://www.booking.com/hotel/gb/hilton-wembley.en-gb.es.html
521 https://www.booking.com/hotel/it/hilton-milan.es.html
522 https://www.booking.com/hotel/fr/hilton-paris-opera.es.html
523 https://www.booking.com/hotel/at/hilton-vienna.es.html
524 https://www.booking.com/hotel/at/hilton-vienna-danube.es.html
525 https://www.booking.com/hotel/at/hilton-vienna-plaza.es.html
526 https://www.booking.com/hotel/nl/hiamsterdam.es.html
527 https://www.booking.com/hotel/nl/holiday-inn-amsterdam-arena-towers.es.html
528 https://www.booking.com/hotel/gb/holiday-inn-london-bloomsbury.en-gb.es.html
529 https://www.booking.com/hotel/gb/holiday-inn-london-brent-cross.en-gb.es.html
530 https://www.booking.com/hotel/gb/londoncamdenlock.en-gb.es.html
531 https://www.booking.com/hotel/g

649 https://www.booking.com/hotel/gb/indigo-london-tower-hill.en-gb.es.html
650 https://www.booking.com/hotel/fr/indigo-paris-opera.es.html
651 https://www.booking.com/hotel/at/gtcapricorno.es.html
652 https://www.booking.com/hotel/nl/jl-no76.es.html
653 https://www.booking.com/hotel/at/hotjohannstrausswien.es.html
654 https://www.booking.com/hotel/at/konig-von-ungarn.es.html
655 https://www.booking.com/hotel/at/hotelkaiserinelisabeth.es.html
656 https://www.booking.com/hotel/at/hotelkavalier.es.html
657 https://www.booking.com/hotel/fr/l-39-antoine.es.html
658 https://www.booking.com/hotel/fr/la-lanterne-paris.es.html
659 https://www.booking.com/hotel/gb/hotellaplacelondon.en-gb.es.html
660 https://www.booking.com/hotel/it/laspeziamilano.es.html
661 https://www.booking.com/hotel/fr/lavillasaintgermaindespres.es.html
662 https://www.booking.com/hotel/at/topazz.es.html
663 https://www.booking.com/hotel/at/landhaus-fuhrgassl-huber.es.html
664 https://www.booking.com/hotel/fr/le-10-bis.es

785 https://www.booking.com/hotel/fr/vignon.es.html
786 https://www.booking.com/hotel/es/vilamari.es.html
787 https://www.booking.com/hotel/es/villa-emilia.es.html
788 https://www.booking.com/hotel/fr/holiday-villa-lafayette-paris.es.html
789 https://www.booking.com/hotel/fr/villa-saxe-eiffel.es.html
790 https://www.booking.com/hotel/it/hotelvittoria.es.html
791 https://www.booking.com/hotel/nl/vondel.es.html
792 https://www.booking.com/hotel/es/vueling-bcn-by-hc.es.html
793 https://www.booking.com/hotel/it/wagnermilan.es.html
794 https://www.booking.com/hotel/at/wandl.es.html
795 https://www.booking.com/hotel/es/well-and-come.en-gb.es.html
796 https://www.booking.com/hotel/fr/hotelwestend.es.html
797 https://www.booking.com/hotel/fr/whistler.es.html
798 https://www.booking.com/hotel/gb/xanadu.en-gb.es.html
799 https://www.booking.com/hotel/gb/xenia.en-gb.es.html
800 https://www.booking.com/hotel/at/zeitgeist-vienna.es.html
801 https://www.booking.com/hotel/fr/d-orsay-paris.es.html
802

915 https://www.booking.com/hotel/fr/les-jardins-du-marais.es.html
916 https://www.booking.com/hotel/fr/les-matins-de-paris.es.html
917 https://www.booking.com/hotel/fr/les-plumes.es.html
918 https://www.booking.com/hotel/at/lindner-am-belvedere-wien.es.html
919 https://www.booking.com/hotel/fr/tulipinnlittlepalace.es.html
920 https://www.booking.com/hotel/gb/londonbridgehotel.en-gb.es.html
921 https://www.booking.com/hotel/gb/london-city-suites-by-montcalm.en-gb.es.html
922 https://www.booking.com/hotel/gb/london-elizabeth-hotel.en-gb.es.html
923 https://www.booking.com/hotel/gb/london-hilton-on-park-lane.en-gb.es.html
924 https://www.booking.com/hotel/gb/london-marriott-county-hall.en-gb.es.html
925 https://www.booking.com/hotel/gb/london-marriott-grosvenor-square.en-gb.es.html
926 https://www.booking.com/hotel/gb/london-marriott-kensington.en-gb.es.html
927 https://www.booking.com/hotel/gb/london-marriott-marble-arch.en-gb.es.html
928 https://www.booking.com/hotel/gb/london-marriott

1034 https://www.booking.com/hotel/fr/hotel-montfleuri.es.html
1035 https://www.booking.com/hotel/es/monument.es.html
1036 https://www.booking.com/hotel/nl/morgan-amp-mees.es.html
1037 https://www.booking.com/hotel/gb/mybloomsbury.es.html
1038 https://www.booking.com/hotel/gb/myhotelchelsea.en-gb.es.html
1039 https://www.booking.com/hotel/fr/my-home-in-paris.es.html
1040 https://www.booking.com/hotel/nl/nhcaransa.es.html
1041 https://www.booking.com/hotel/nl/nhamsterdamcentre.es.html
1042 https://www.booking.com/hotel/nl/nhmuseumquarter.es.html
1043 https://www.booking.com/hotel/nl/nhcitynorth.es.html
1044 https://www.booking.com/hotel/nl/nhschiller.es.html
1045 https://www.booking.com/hotel/nl/nh-amsterdam-zuid.es.html
1046 https://www.booking.com/hotel/es/nh-barcelona-stadium.es.html
1047 https://www.booking.com/hotel/nl/jollycarlton.es.html
1048 https://www.booking.com/hotel/nl/nhamsterdam.es.html
1049 https://www.booking.com/hotel/nl/nhbarbizon.es.html
1050 https://www.booking.com/

1157 https://www.booking.com/hotel/fr/phileas-hotel.es.html
1158 https://www.booking.com/hotel/nl/pillows-anna-van-den-vondel.es.html
1159 https://www.booking.com/hotel/fr/platine-hotel.es.html
1160 https://www.booking.com/hotel/fr/plazatoureiffel.es.html
1161 https://www.booking.com/hotel/es/guillermo-tell.es.html
1162 https://www.booking.com/hotel/gb/portobello-house.es.html
1163 https://www.booking.com/hotel/es/primero-primera.es.html
1164 https://www.booking.com/hotel/fr/princedegalleshotel.es.html
1165 https://www.booking.com/hotel/nl/pulitzer.es.html
1166 https://www.booking.com/hotel/es/barcelona-skipper.es.html
1167 https://www.booking.com/hotel/gb/bernardshawhotel.en-gb.es.html
1168 https://www.booking.com/hotel/fr/pullman-paris-bercy.es.html
1169 https://www.booking.com/hotel/fr/tour-eiffel.es.html
1170 https://www.booking.com/hotel/fr/tour-eiffel.es.html
1171 https://www.booking.com/hotel/it/qualys-hotel-nasco.es.html
1172 https://www.booking.com/hotel/fr/r-kipling.es.html
1

1281 https://www.booking.com/hotel/gb/renaissance-st-pancras-london.en-gb.es.html
1282 https://www.booking.com/hotel/gb/st-paul.en-gb.es.html
1283 https://www.booking.com/hotel/it/andersonhotel.es.html
1284 https://www.booking.com/hotel/it/businesspalace.en-gb.es.html
1285 https://www.booking.com/hotel/it/starhotels-echo.es.html
1286 https://www.booking.com/hotel/it/hotelritz.es.html
1287 https://www.booking.com/hotel/it/starhotels-tourist.es.html
1288 https://www.booking.com/hotel/gb/stauntonhotel.es.html
1289 https://www.booking.com/hotel/gb/staybridge-suites-london-stratford.en-gb.es.html
1290 https://www.booking.com/hotel/gb/staybridge-suites-london-vauxhall.en-gb.es.html
1291 https://www.booking.com/hotel/at/steigenberger-herrenhof.es.html
1292 https://www.booking.com/hotel/gb/strandpalace.en-gb.es.html
1293 https://www.booking.com/hotel/at/strandhotel-alte-donau.es.html
1294 https://www.booking.com/hotel/it/style.es.html
1295 https://www.booking.com/hotel/at/suitehoteloper.es.htm

1403 https://www.booking.com/hotel/gb/the-tophams.en-gb.es.html
1404 https://www.booking.com/hotel/nl/toren.en-gb.es.html
1405 https://www.booking.com/hotel/gb/thistletower.en-gb.es.html
1406 https://www.booking.com/hotel/gb/the-trafalgar-st-james-london-curio-collection-by-hilton.en-gb.es.html
1407 https://www.booking.com/hotel/gb/victoria-station.en-gb.es.html
1408 https://www.booking.com/hotel/gb/the-waldorf-hilton.en-gb.es.html
1409 https://www.booking.com/hotel/gb/the-wellesley.en-gb.es.html
1410 https://www.booking.com/hotel/gb/royalcourtapartments.en-gb.es.html
1411 https://www.booking.com/hotel/gb/the-westbridge-limited.en-gb.es.html
1412 https://www.booking.com/hotel/gb/westbury-hotel.en-gb.es.html
1413 https://www.booking.com/hotel/it/westinpalacemilano.es.html
1414 https://www.booking.com/hotel/fr/thewestinparis.es.html
1415 https://www.booking.com/hotel/gb/the-whitechapel.en-gb.es.html
1416 https://www.booking.com/hotel/es/the-wittmore.en-gb.es.html
1417 https://www.booking
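The link-selection step of the loop above (take the first `href` that starts with `https://www.booking.com/hotel/`) can be isolated and tried on a small HTML string. The sketch below swaps BeautifulSoup for the standard library's `html.parser` so that it runs without extra dependencies. The class `FirstBookingLink`, the function `first_booking_link` and the sample markup are hypothetical and do not reproduce a real startpage result page.

```python
from html.parser import HTMLParser

PREFIX = "https://www.booking.com/hotel/"


class FirstBookingLink(HTMLParser):
    """Remember the first <a href> whose target starts with PREFIX."""

    def __init__(self):
        super().__init__()
        self.link = ""

    def handle_starttag(self, tag, attrs):
        if tag == "a" and not self.link:
            href = dict(attrs).get("href") or ""
            if href.startswith(PREFIX):
                self.link = href


def first_booking_link(html):
    """Return the first Booking.com hotel link in `html`, or "" if there is none."""
    parser = FirstBookingLink()
    parser.feed(html)
    return parser.link


# Made-up markup, for illustration only.
sample = (
    '<a href="https://example.com/other">x</a>'
    '<a href="https://www.booking.com/hotel/gb/parklane.html">y</a>'
    '<a href="https://www.booking.com/hotel/gb/other.html">z</a>'
)
first = first_booking_link(sample)
missing = first_booking_link("<p>no links here</p>")
print(first)
```

Returning an empty string when nothing matches mirrors the `addhotel = ""` convention of the loop, which makes the hotels that still need a manual lookup easy to spot.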

Just after finishing the process, I noticed that some of the scraped addresses carry an **'en-gb'** suffix, so I remove it. The result is good, so I save the list. It still needs a little more work, because some addresses were not found and I need to look them up manually. I will also merge this list with the real hotel names and check that everything went fine. I will finish this step in Excel.

In [29]:
bookingaddress = [i.replace('.en-gb', '') for i in bookingaddress]
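The suffix appears because the scraping loop appended `.es.html` to links that already ended in `.en-gb.html`. A hedged way to normalise such links in one step is sketched below. It assumes that Booking URLs end in an optional language code followed by `.html`, which matches the addresses printed above but is not a documented rule; the name `to_spanish_url` is hypothetical.

```python
import re

# Assumed URL ending: zero or more ".<lang>" parts (e.g. ".en-gb", ".es") then ".html".
_LANG_SUFFIX = re.compile(r"(\.[a-z]{2}(-[a-z]{2})?)*\.html$")


def to_spanish_url(url):
    """Replace any trailing language code(s) with '.es'; leave other strings unchanged."""
    if not url:
        return url
    return _LANG_SUFFIX.sub(".es.html", url)


examples = [
    "https://www.booking.com/hotel/gb/parklane.en-gb.html",
    "https://www.booking.com/hotel/gb/parklane.en-gb.es.html",
    "https://www.booking.com/hotel/gb/number-eleven.html",
    "",
]
normalised = [to_spanish_url(u) for u in examples]
print(normalised)
```

A hotel slug that itself ended in a dot plus two letters would be misread as a language code, so the result is still worth checking by eye.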

In [37]:
hotel_city['Booking_Address'] = bookingaddress
hotel_city.to_excel("./data/hotel_city_booking.xlsx", index=False)  

In [55]:
with open('./sav/bookingaddress.sav', 'wb') as f:
    pickle.dump(bookingaddress, f)
with open('./sav/hotel_city_booking.sav', 'wb') as f:
    pickle.dump(hotel_city, f)

#### Scraping of other Booking information (complete dataset for loop)

In [3]:
head = {"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36"}
hotels = {}
name_scraped = []
address_scraped = []

df = pd.read_excel("./data/hotel_city_booking v2.xlsx")
bookingaddress = df.Booking_Address

In [4]:
hotels = pickle.load(open('./sav/hotels_dict.sav', 'rb'))
begin = 772

In [7]:
for num, url in enumerate(bookingaddress[begin:]):

    print(num + begin, url)

    # Hotels without a Booking address are read from Excel as NaN (a float)
    if isinstance(url, float):
        continue

    # Open Booking.com URL
    URL = url
    page = requests.get(URL, headers=head)
    soup = BeautifulSoup(page.content, 'html.parser')

    # Get Header (JSON-LD metadata block); skip the page if it is missing
    data = soup.select("[type='application/ld+json']")
    if len(data) == 0:
        continue
    hotel_header = json.loads(data[0].text)

    # Get Hotel Name
    name = hotel_header['name']
    name_scraped.append(name)
    address_scraped.append(url)

    # Get Price: first number after the "€" sign, with "." as thousands separator
    pricetag = hotel_header.get('priceRange')
    if pricetag is not None and "€" in pricetag:
        price = int(pricetag.split("€", 1)[1].split()[0].replace('.', ''))
    else:
        price = np.nan

    # Get Stars
    star_tag = soup.find(class_='star_track')
    if star_tag is not None:
        stars = star_tag['title']
    else:
        stars = np.nan

    # Get Facilities: first non-empty line is the section name, the rest its items
    facilities = {}
    raw = soup.find_all(class_='facilitiesChecklistSection')
    for i in raw:
        facilities_dept = [x for x in i.getText().split('\n') if x]
        if not facilities_dept:
            continue
        dept = facilities_dept[0]
        items = facilities_dept[1:]
        facilities[dept] = items

    # Append Information to Dictionary
    hotels[URL] = {'Name': name, 'Price': price, 'Stars': stars, 'Facilities': facilities, 'Header': hotel_header}

    time.sleep(3)

772 https://www.booking.com/hotel/fr/the-peninsula-paris.es.html
773 https://www.booking.com/hotel/es/the-serras.es.html
774 https://www.booking.com/hotel/it/tiziano.es.html
775 https://www.booking.com/hotel/it/hotel-tocq.es.html
776 https://www.booking.com/hotel/at/topazz.es.html
777 https://www.booking.com/hotel/fr/hoteldelatourdauvergne.es.html
778 https://www.booking.com/hotel/fr/trianon-rive-gauche.es.html
779 https://www.booking.com/hotel/nl/v-fizeaustraat.es.html
780 https://www.booking.com/hotel/nl/v-nesplein.es.html
781 https://www.booking.com/hotel/it/viu-milan.es.html
782 https://www.booking.com/hotel/fr/verneuil.es.html
783 https://www.booking.com/hotel/at/viennart.es.html
784 https://www.booking.com/hotel/at/hotelvienna.es.html
785 https://www.booking.com/hotel/fr/vignon.es.html
786 https://www.booking.com/hotel/es/vilamari.es.html
787 https://www.booking.com/hotel/es/villa-emilia.es.html
788 https://www.booking.com/hotel/fr/holiday-villa-lafayette-paris.es.html
789 https:

906 https://www.booking.com/hotel/fr/le-roch-amp-spa.es.html
907 https://www.booking.com/hotel/fr/le-saint-a-paris.es.html
908 https://www.booking.com/hotel/fr/lesenatparis.es.html
909 https://www.booking.com/hotel/fr/tourville.es.html
910 https://www.booking.com/hotel/fr/le-tsuba.es.html
911 https://www.booking.com/hotel/fr/legend.es.html
912 https://www.booking.com/hotel/it/leonardo-hotels-milan.es.html
913 https://www.booking.com/hotel/at/leonardo-vienna.es.html
914 https://www.booking.com/hotel/fr/les-jardins-de-la-villa.es.html
915 https://www.booking.com/hotel/fr/les-jardins-du-marais.es.html
916 https://www.booking.com/hotel/fr/les-matins-de-paris.es.html
917 https://www.booking.com/hotel/fr/les-plumes.es.html
918 https://www.booking.com/hotel/at/lindner-am-belvedere-wien.es.html
919 https://www.booking.com/hotel/fr/tulipinnlittlepalace.es.html
920 https://www.booking.com/hotel/gb/londonbridgehotel.es.html
921 https://www.booking.com/hotel/gb/london-city-suites-by-montcalm.es.ht

1029 https://www.booking.com/hotel/gb/mondrian-london.es.html
1030 https://www.booking.com/hotel/fr/d-argentine.es.html
1031 https://www.booking.com/hotel/fr/meyerhold-amp-spa.es.html
1032 https://www.booking.com/hotel/gb/montagu-place.es.html
1033 https://www.booking.com/hotel/gb/the-montcalm-royal-london-house.es.html
1034 https://www.booking.com/hotel/fr/hotel-montfleuri.es.html
1035 https://www.booking.com/hotel/es/monument.es.html
1036 https://www.booking.com/hotel/nl/morgan-amp-mees.es.html
1037 https://www.booking.com/hotel/gb/mybloomsbury.es.html
1038 https://www.booking.com/hotel/gb/myhotelchelsea.es.html
1039 https://www.booking.com/hotel/fr/my-home-in-paris.es.html
1040 https://www.booking.com/hotel/nl/nhcaransa.es.html
1041 https://www.booking.com/hotel/nl/nhamsterdamcentre.es.html
1042 https://www.booking.com/hotel/nl/nhmuseumquarter.es.html
1043 https://www.booking.com/hotel/nl/nhcitynorth.es.html
1044 https://www.booking.com/hotel/nl/nhschiller.es.html
1045 https://www.b

1156 https://www.booking.com/hotel/it/petit-palais-de-charme.es.html
1157 https://www.booking.com/hotel/fr/phileas-hotel.es.html
1158 https://www.booking.com/hotel/nl/pillows-anna-van-den-vondel.es.html
1159 https://www.booking.com/hotel/fr/platine-hotel.es.html
1160 https://www.booking.com/hotel/fr/plazatoureiffel.es.html
1161 https://www.booking.com/hotel/es/guillermo-tell.es.html
1162 https://www.booking.com/hotel/gb/portobello-house.es.html
1163 https://www.booking.com/hotel/es/primero-primera.es.html
1164 https://www.booking.com/hotel/fr/princedegalleshotel.es.html
1165 https://www.booking.com/hotel/nl/pulitzer.es.html
1166 https://www.booking.com/hotel/es/barcelona-skipper.es.html
1167 https://www.booking.com/hotel/gb/bernardshawhotel.es.html
1168 https://www.booking.com/hotel/fr/pullman-paris-bercy.es.html
1169 nan
1170 https://www.booking.com/hotel/fr/tour-eiffel.es.html
1171 https://www.booking.com/hotel/it/qualys-hotel-nasco.es.html
1172 https://www.booking.com/hotel/fr/r-kip

1285 https://www.booking.com/hotel/it/starhotels-echo.es.html
1286 https://www.booking.com/hotel/it/hotelritz.es.html
1287 https://www.booking.com/hotel/it/starhotels-tourist.es.html
1288 https://www.booking.com/hotel/gb/stauntonhotel.es.html
1289 https://www.booking.com/hotel/gb/staybridge-suites-london-stratford.es.html
1290 https://www.booking.com/hotel/gb/staybridge-suites-london-vauxhall.es.html
1291 https://www.booking.com/hotel/at/steigenberger-herrenhof.es.html
1292 https://www.booking.com/hotel/gb/strandpalace.es.html
1293 https://www.booking.com/hotel/at/strandhotel-alte-donau.es.html
1294 https://www.booking.com/hotel/it/style.es.html
1295 https://www.booking.com/hotel/at/suitehoteloper.es.html
1296 https://www.booking.com/hotel/fr/helzear-champs-elysa-c-es.es.html
1297 https://www.booking.com/hotel/fr/helzear-rive-gauche.es.html
1298 https://www.booking.com/hotel/es/sunotel-central.es.html
1299 https://www.booking.com/hotel/es/sunotel-club-central.es.html
1300 https://www.b

1413 https://www.booking.com/hotel/it/westinpalacemilano.es.html
1414 https://www.booking.com/hotel/fr/thewestinparis.es.html
1415 https://www.booking.com/hotel/gb/the-whitechapel.es.html
1416 https://www.booking.com/hotel/es/the-wittmore.es.html
1417 https://www.booking.com/hotel/it/the-ralph-suite.es.html
1418 https://www.booking.com/hotel/gb/the-zetter.es.html
1419 https://www.booking.com/hotel/gb/the-zetter-townhouse.es.html
1420 https://www.booking.com/hotel/gb/the-zetter-townhouse-marylebone.es.html
1421 https://www.booking.com/hotel/gb/mic-and-conference-centre.es.html
1422 https://www.booking.com/hotel/gb/euston-square-hitel.es.html
1423 https://www.booking.com/hotel/gb/thistlebloomsbury.es.html
1424 https://www.booking.com/hotel/gb/thistlehydepark.es.html
1425 https://www.booking.com/hotel/gb/kensingtongardens.es.html
1426 https://www.booking.com/hotel/gb/thistletrafalgar.es.html
1427 https://www.booking.com/hotel/gb/threadneedles-london.es.html
1428 https://www.booking.com/ho
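The price extraction in the loop above takes the first token after the `€` sign and strips the `.` thousands separators. Below is a small sketch of that step as a standalone function that returns NaN instead of raising when the text has no usable price. The name `parse_price` and the sample strings are made up for illustration and are not actual `priceRange` values from Booking.

```python
import math
import re


def parse_price(pricetag):
    """Return the first integer amount after '€' in `pricetag`, or NaN."""
    if not pricetag or "€" not in pricetag:
        return math.nan
    after = pricetag.split("€", 1)[1]
    # A run of digits, possibly grouped with '.' as thousands separator.
    match = re.search(r"\d[\d.]*", after)
    if match is None:
        return math.nan
    return int(match.group().replace(".", ""))


# Hypothetical inputs.
simple = parse_price("desde € 266")
thousands = parse_price("€ 1.250 por noche")
no_symbol = parse_price("sin precio")
empty = parse_price(None)
print(simple, thousands, no_symbol, empty)
```

Keeping the parsing in one function makes it easy to test against a few strings before launching a long scraping run.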

In [11]:
with open('./sav/hotels_dict.sav', 'wb') as f:
    pickle.dump(hotels, f)

In [10]:
len(hotels)

1476

In [44]:
Precio = [hotels[i]['Price'] for i in hotels]
Keys = list(hotels.keys())
Estrellas = [hotels[i]['Stars'] for i in hotels]
Nombre = [hotels[i]['Name'] for i in hotels]
Address = [hotels[i]['Header']['address']['streetAddress'] for i in hotels]

len(Precio), len(Keys), len(Estrellas), len(Nombre), len(Address)

(1476, 1476, 1476, 1476, 1476)

In [45]:
scraping = pd.DataFrame(zip(Keys, Nombre, Address, Precio, Estrellas))
scraping.head()

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | https://www.booking.com/hotel/gb/number-eleven... | 11 Cadogan Gardens | 11 Cadogan Gardens, Sloane Square, Kensington ... | 266.0 | hotel de 5 estrellas |
| 1 | https://www.booking.com/hotel/fr/1-k-hotel.es.... | 1K Paris | 13 Boulevard Du Temple, Le Marais - 3er distri... | 363.0 | hotel de 4 estrellas |
| 2 | https://www.booking.com/hotel/at/25hours-wien.... | 25hours Hotel beim MuseumsQuartier | Lerchenfelder Straße 1-3, 07. Neubau, 1070 Vie... | 117.0 | hotel de 4 estrellas |
| 3 | https://www.booking.com/hotel/gb/41clubredcarn... | 41 | 41 Buckingham Palace Road, Westminster Borough... | 366.0 | hotel de 5 estrellas |
| 4 | https://www.booking.com/hotel/gb/parklane.es.html | 45 Park Lane - Dorchester Collection | 45 Park Lane, Westminster Borough, Londres, W1... | 817.0 | hotel de 5 estrellas |


In [46]:
scraping.to_excel("./data/scraping.xlsx", index=False)