# Capstone

**Overview of the Case**

A non-profit organization in the Philippines wants to collect data about hotels, restaurants, tourist sites, etc. for tourists.

The organization hired data engineers to collect data from different sources and to preprocess the collected data.

The data will be used to help tourists plan a better itinerary when they visit the Philippines.

**Tasks**
* Gather data from different sources (at least 2 different sources) using Python
* Enumerate the sources of your data (e.g. name of website - website, PSA - Excel/CSV, etc.)
* Make sure that your sources are credible
* Upload all the files (CSV files and Jupyter files)

Sources:

www.tripadvisor.com

www.booking.com

www.ph.hotels.com


**Task 1: Prepare hotel data**

Extract hotel data (Note: at least 25 different hotels)
* Name of hotel
* Location of hotel (Note: Barangay, City/Municipality, Province)
* Amenities
* Price range (e.g. price per person, room, or hour)
* Number of available rooms (e.g. number of rooms for two people)
    **Enumerated the different room types each hotel offers**
* Other data (extract more data regarding hotels that can be used by tourists)
* Customer reviews per attribute of hotels from review sites (e.g. customer reviews regarding the customer service of hotels). At least 15 customer reviews. Make sure that each review is unique and posted by a different user (i.e. no duplicate reviews and no duplicate usernames).

Store the collected data in a CSV file. **Filename format:** Hotel.csv



In [1]:
# Import the required libraries
from lxml import html
from bs4 import BeautifulSoup as soup
import requests
import pandas as pd

In [2]:
req= requests.get('https://www.tripadvisor.com.ph/Hotels-g298573-Manila_Metro_Manila_Luzon-Hotels.html')

In [3]:
bsobj = soup(req.content,'lxml')
#Grab a hotel name:
hotel = []
for name in bsobj.findAll('div',{'class':'listing_title'}):
  hotel.append(name.text.replace('Sponsored ','').strip())
hotel

['1775 Adriatico Suites',
 'City Garden Suites',
 'Regency Grand Suites',
 'Eurotel Pedro Gil',
 'Executive Hotel',
 'Heroes Hotel',
 'Winford Manila Resort & Casino',
 'Go Hotels Otis-Manila',
 'Adriatico Arms Hotel',
 'JMM Grand Suites',
 'Hotel Kimberly Manila',
 'Go Hotels Ermita',
 'Red Planet Manila Binondo',
 'OYO 152 Sangco Condotel',
 'Aloha Hotel',
 'Leesons Residences',
 'Stay Malate',
 'Oriental Zen Suites',
 'Diamond Hotel Philippines',
 'White Knight Hotel Intramuros',
 'RedDoorz Plus @ Better Living Paranaque',
 'Ramada by Wyndham Manila Central',
 'Tropicana Suites',
 'Casa Blanca Apartment',
 'Fersal Hotel - Manila',
 'Hotel Sogo Quirino, Malate',
 'La Casarita',
 'Manila Manor Hotel',
 'Halina Hotel Avenida',
 'Hotel Sogo - Sta Mesa']

In [4]:
ratings = []
for rating in bsobj.findAll('a',{'class':'ui_bubble_rating'}):
  ratings.append(rating['alt'])
ratings

['4.5 of 5 bubbles',
 '4 of 5 bubbles',
 '4 of 5 bubbles',
 '3 of 5 bubbles',
 '3.5 of 5 bubbles',
 '5 of 5 bubbles',
 '3.5 of 5 bubbles',
 '4 of 5 bubbles',
 '4 of 5 bubbles',
 '3 of 5 bubbles',
 '4 of 5 bubbles',
 '3.5 of 5 bubbles',
 '4.5 of 5 bubbles',
 '4 of 5 bubbles',
 '3.5 of 5 bubbles',
 '4 of 5 bubbles',
 '5 of 5 bubbles',
 '4.5 of 5 bubbles',
 '4.5 of 5 bubbles',
 '4 of 5 bubbles',
 '3 of 5 bubbles',
 '4.5 of 5 bubbles',
 '4 of 5 bubbles',
 '3 of 5 bubbles',
 '3 of 5 bubbles',
 '3.5 of 5 bubbles',
 '3 of 5 bubbles',
 '2 of 5 bubbles',
 '2.5 of 5 bubbles',
 '2 of 5 bubbles']
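
The ratings above are stored as label strings, which cannot be sorted or averaged directly. A minimal sketch of converting them to numbers, assuming every label follows the `'<score> of 5 bubbles'` pattern seen in the output (the helper name `parse_rating` is hypothetical, not part of the notebook):

```python
import re

def parse_rating(label):
    """Return the numeric score from a label such as '4.5 of 5 bubbles', or None."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)\s+of\s+5\b", label)
    return float(match.group(1)) if match else None

# A few labels copied from the output above
ratings = ['4.5 of 5 bubbles', '4 of 5 bubbles', '3.5 of 5 bubbles']
scores = [parse_rating(r) for r in ratings]
print(scores)
```

Labels that do not match the pattern come back as `None`, so they can be spotted instead of silently breaking a later calculation.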

In [6]:
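price = []

for p in bsobj.findAll('div',{'class':'price-wrap'}):
  price.append(p.text.replace('₱',' ').strip())
# Build the data frame of hotel names, ratings and prices
d1 = {'Hotel':hotel,'Ratings':ratings,'Price':price}
df = pd.DataFrame.from_dict(d1)
# Preview the first five prices (last expression, so the notebook displays it)
price[:5]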

['1,833', '2,394', '1,902', '1,632', '4,526 3,369']
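
Some entries above hold two amounts in one string (for example `'4,526 3,369'`). The following sketch splits each string into integer amounts. It assumes the amounts are in pesos and that two amounts in one entry are an original and a reduced price, which the notebook does not confirm. The helper names are hypothetical.

```python
import re

def parse_prices(text):
    """Return every amount found in a scraped price string as an integer."""
    return [int(token.replace(",", "")) for token in re.findall(r"\d[\d,]*", text)]

def lowest_price(text):
    """Return the smallest amount in the string, or None if there is none."""
    amounts = parse_prices(text)
    return min(amounts) if amounts else None

# Values copied from the output above
sample = ['1,833', '2,394', '1,902', '1,632', '4,526 3,369']
parsed = [parse_prices(s) for s in sample]
print(parsed)
print(lowest_price('4,526 3,369'))
```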

In [87]:
#saved just in case the site changes
df.to_csv(r'D:\001 UPSKILLING, ARAL MODES,ETC\Data Science Training\SPARTA\Module 14 Python for Data Engineering\Final Capstone Project\df4.csv', index= False)
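
The absolute Windows path above ties the notebook to one machine. One possible alternative is to build the output path with `pathlib`, relative to a folder of your choice. The helper `output_path` and the folder name in the usage note are hypothetical.

```python
from pathlib import Path

def output_path(base_dir, filename="Hotel.csv"):
    """Create base_dir if it does not exist and return the full path to filename."""
    folder = Path(base_dir)
    folder.mkdir(parents=True, exist_ok=True)
    return folder / filename
```

With this helper the save step could read `df.to_csv(output_path('output', 'df4.csv'), index=False)`, which works the same way on Windows, macOS and Linux.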

## Get Details from Each Hotel
### Address, Amenities, Available Rooms, Reviews (at least 15)

# 1. 1775 Adriatico Suites

In [9]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1774042-Reviews-1775_Adriatico_Suites-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
# Get 1775 Adriatico Suites address
add = []
# The first matching span holds the street address
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)

#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    # Drop the 'hotel amenity ' label and any underscores from each entry
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['1775 Interior M. Adriatico Brgy. 699, Zone 076, Malate,, Manila, Luzon 1004 Philippines']
["Paid public parking nearby,Free High Speed Internet (WiFi),Outdoor pool,Fitness Center with Gym / Workout Room,Free breakfast,Children's television networks,Highchairs available,Airport transportation"]
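
The scraped address above contains an empty segment (`Malate,, Manila`). A small sketch of tidying it before it is stored, assuming commas are the only separators that need normalizing (the helper name `clean_address` is hypothetical):

```python
def clean_address(raw):
    """Collapse empty comma-separated segments and normalize the spacing."""
    parts = [part.strip() for part in raw.split(",")]
    return ", ".join(part for part in parts if part)

# Address copied from the output above
raw = '1775 Interior M. Adriatico Brgy. 699, Zone 076, Malate,, Manila, Luzon 1004 Philippines'
print(clean_address(raw))
```

The cleaned string still has to be split into barangay, city/municipality and province by hand or by further rules, since the site does not label those parts.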


In [10]:
req2= requests.get('https://www.booking.com/hotel/ph/adriatico-suites.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
availroom

['Two Bedroom Suite,Deluxe Room,Standard Room,Premier Family Room,Premiere Double Room,Premiere Junior Room,Prestige Poolside,Prestige Double Room,Premiere Accessible Room']

### Reviews

In [11]:
#1st 5 reviews
req3=requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1774042-Reviews-1775_Adriatico_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Joel D', 'Cyrene A', 'Nicole', 'Rae Shin', 'Marnella Bianca M']
['June 2021', 'June 2021', 'May 2021', 'June 2021', 'April 2021']
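
The stay dates above are plain text such as `'June 2021'`. If they need to be sorted or filtered later, they can be parsed with the standard library. This sketch assumes English month names and the default locale. The helper name `parse_stay_date` is hypothetical.

```python
from datetime import datetime

def parse_stay_date(text):
    """Parse a 'Month YYYY' string into a datetime set to the first day of that month."""
    return datetime.strptime(text.strip(), "%B %Y")

# Dates copied from the output above
dates = ['June 2021', 'June 2021', 'May 2021', 'June 2021', 'April 2021']
parsed = [parse_stay_date(d) for d in dates]
print([d.strftime("%Y-%m") for d in parsed])
```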


In [12]:
#2nd 5 reviews
req4=requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1774042-Reviews-or5-1775_Adriatico_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['ROMALYN', 'Lhyean De Guzman', 'Marianne V', 'Marc Eivan S', 'Sophia Stephanie Mosende']
['May 2021', 'May 2021', 'May 2021', 'April 2021', 'May 2021']


In [13]:
#3rd 5 reviews
req5=requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1774042-Reviews-or10-1775_Adriatico_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['April P', 'Stacy', 'Nicole', 'Kayen', 'Ces Dupitas']
['April 2021', 'April 2021', 'March 2021', 'February 2021', 'February 2021']
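
The three review cells above differ only in the `or5` and `or10` segment of the URL. A sketch of generating those page URLs from the first one, assuming the `-Reviews-or<offset>-` pattern seen in this notebook keeps holding (the site may change it at any time). The function name `review_page_urls` is hypothetical.

```python
def review_page_urls(first_page_url, pages=3, per_page=5):
    """Return the URLs of the first `pages` review pages, `per_page` reviews each."""
    marker = "-Reviews-"
    if marker not in first_page_url:
        raise ValueError("URL does not contain '-Reviews-'")
    head, tail = first_page_url.split(marker, 1)
    urls = []
    for page in range(pages):
        offset = page * per_page
        # The first page has no offset segment; later pages use or5, or10, ...
        segment = marker if offset == 0 else f"{marker}or{offset}-"
        urls.append(head + segment + tail)
    return urls

first = ('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1774042-Reviews-'
         '1775_Adriatico_Suites-Manila_Metro_Manila_Luzon.html')
for url in review_page_urls(first):
    print(url)
```

Looping over these URLs would let one cell replace the three near-identical ones for each hotel.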


In [14]:
df10= pd.concat([df2,df3,df4]).reset_index(drop=True)
df10

| | ReviewerName | ReviewDate | ReviewSummary |
|---|---|---|---|
| 0 | Joel D | June 2021 | Visited this place twice.. the Room rates ex... |
| 1 | Cyrene A | June 2021 | Love the place. The room were spacious and has... |
| 2 | Nicole | May 2021 | Definitely a hidden gem in the middle of city ... |
| 3 | Rae Shin | June 2021 | The place was very comforting. I have been sta... |
| 4 | Marnella Bianca M | April 2021 | Really appreciate everything from the staff to... |
| 5 | ROMALYN | May 2021 | Hi , thank you. We really enjoyed staying at y... |
| 6 | Lhyean De Guzman | May 2021 | Overall stay was so perfect that we actually p... |
| 7 | Marianne V | May 2021 | Overall the place is nice but the room that we... |
| 8 | Marc Eivan S | April 2021 | Stayed for 3 days and 2 nights over the course... |
| 9 | Sophia Stephanie Mosende | May 2021 | This place is awesome. I highly recommend it t... |
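
The task asks for reviews from different users, so it is worth checking the collected names for repeats before they are stored. A minimal sketch using the fifteen names printed by the three review cells (the helper name `duplicated_names` is hypothetical):

```python
from collections import Counter

def duplicated_names(names):
    """Return the names that occur more than once, in order of first appearance."""
    counts = Counter(name.strip() for name in names)
    return [name for name, n in counts.items() if n > 1]

# Names copied from the three review outputs for 1775 Adriatico Suites
reviewers = [
    'Joel D', 'Cyrene A', 'Nicole', 'Rae Shin', 'Marnella Bianca M',
    'ROMALYN', 'Lhyean De Guzman', 'Marianne V', 'Marc Eivan S',
    'Sophia Stephanie Mosende',
    'April P', 'Stacy', 'Nicole', 'Kayen', 'Ces Dupitas',
]
print(duplicated_names(reviewers))
```

A display name alone cannot prove that two entries are the same person, so a flagged name is a prompt to inspect the rows. On the data frame itself, `df10['ReviewerName'].duplicated(keep=False)` gives an equivalent check.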


In [15]:
#Pre-processing the data
#Concatenate values of reviews
df10['Full Review']=df10['ReviewerName']+' ; '+df10['ReviewDate']+' ; '+df10['ReviewSummary']
df5= df10['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df5[i] for i in range(0, len(df5), 1)}

#transpose data frame reviews
df6 = pd.DataFrame.from_dict(res_dct,orient="index").T

#Concatenate Address,Amenities,AvailableRooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

# Combine the hotel details and the review columns into one row
h1= pd.concat([df20,df6],axis=1)
h1

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"1775 Interior M. Adriatico Brgy. 699, Zone 076...","Paid public parking nearby,Free High Speed Int...","Two Bedroom Suite,Deluxe Room,Standard Room,Pr...",Joel D ; June 2021 ; Visited this place twice....,Cyrene A ; June 2021 ; Love the place. The roo...,Nicole ; May 2021 ; Definitely a hidden gem in...,Rae Shin ; June 2021 ; The place was very comf...,Marnella Bianca M ; April 2021 ; Really apprec...,"ROMALYN ; May 2021 ; Hi , thank you. We really...",Lhyean De Guzman ; May 2021 ; Overall stay was...,Marianne V ; May 2021 ; Overall the place is n...,Marc Eivan S ; April 2021 ; Stayed for 3 days ...,Sophia Stephanie Mosende ; May 2021 ; This pla...,April P ; April 2021 ; This is my second time ...,Stacy ; April 2021 ; Booking process is so eas...,Nicole ; March 2021 ; We're glad we found 1775...,Kayen ; February 2021 ; It was our 2nd time st...,Ces Dupitas ; February 2021 ; My partner and I...
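
The pre-processing cell above is repeated for every hotel. Its core step, turning the three review lists into `Review 1 … Review N` columns, can be sketched as one reusable function. The name `reviews_to_row` is hypothetical and the summaries below are placeholders, not the scraped text.

```python
def reviews_to_row(names, dates, summaries, sep=" ; "):
    """Build {'Review 1': 'name ; date ; summary', ...} from three parallel lists."""
    if not (len(names) == len(dates) == len(summaries)):
        raise ValueError("names, dates and summaries must have the same length")
    return {
        f"Review {i}": sep.join((name, date, text))
        for i, (name, date, text) in enumerate(zip(names, dates, summaries), start=1)
    }

row = reviews_to_row(
    ['Joel D', 'Cyrene A'],
    ['June 2021', 'June 2021'],
    ['first summary (placeholder)', 'second summary (placeholder)'],
)
print(row)
```

Passing the result to `pd.DataFrame([row])` gives a one-row frame that can be concatenated with the address, amenities and room columns as before.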


# 2. City Garden Suites

In [16]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d477891-Reviews-City_Garden_Suites-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
# Get City Garden Suites address
add = []
# The first matching span holds the street address
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    # Drop the 'hotel amenity ' label and any underscores from each entry
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['1158 A. Mabini Street Ermita, Manila, Luzon 1000 Philippines']
['Free parking,Free High Speed Internet (WiFi),Fitness Center with Gym / Workout Room,Free breakfast,Bicycle rental,Airport transportation,Business Center with Internet Access,Conference facilities']


In [17]:
req2= requests.get('https://www.booking.com/hotel/ph/city-garden-suites.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
availroom

['Deluxe Twin Room,Junior Suite,Superior Double Room,One Bedroom Suite Double,Superior Twin Room,One Bedroom Suite Twin,Standard Double Room No View,Standard Twin Room No View,Penthouse Suite Double with Balcony,Penthouse Suite Twin with Balcony,Deluxe King Room,Junior Suite,Superior Double Room,Superior Twin Room']

### Reviews

In [18]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d477891-Reviews-City_Garden_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['travelgamerhub', 'Hector Periquin', 'Wei A', 'Judit R', 'MarieGrace']
['August 2020', 'October 2019', 'January 2020', 'December 2019', 'December 2019']


In [19]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d477891-Reviews-or5-City_Garden_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Joven O', '581markjf', 'Stay425892', 'Julius M', 'Jamil A']
['December 2019', 'September 2019', 'September 2019', 'September 2019', 'August 2019']


In [20]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d477891-Reviews-or10-City_Garden_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)    

['Joel Castro', 'Encomiendero', 'Hopiah', 'MORE NA', 'JoMC']
['July 2019', 'July 2019', 'July 2019', 'June 2019', 'May 2019']


In [21]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

| | ReviewerName | ReviewDate | ReviewSummary |
|---|---|---|---|
| 0 | travelgamerhub | August 2020 | Stayed for 3 nights because I need to do my ch... |
| 1 | Hector Periquin | October 2019 | The hotel is not new but they are continually ... |
| 2 | Wei A | January 2020 | Stayed 5 nights 6 days Pros: Central location... |
| 3 | Judit R | December 2019 | I requested a non-smoking room as high as poss... |
| 4 | MarieGrace | December 2019 | I have tried staying at City Garden Suites for... |
| 5 | Joven O | December 2019 | Family of three for 6 days stay.It was a clean... |
| 6 | 581markjf | September 2019 | City garden Suites is my hotel of preference w... |
| 7 | Stay425892 | September 2019 | We stayed over the weekend with friends. We we... |
| 8 | Julius M | September 2019 | We booked for a business trip via phone, the f... |
| 9 | Jamil A | August 2019 | As a hotelier i know the standards and the rig... |


In [22]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#Concatenate Address,Amenities,AvailableRooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

h2= pd.concat([df20,df7],axis=1)
h2

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"1158 A. Mabini Street Ermita, Manila, Luzon 10...","Free parking,Free High Speed Internet (WiFi),F...","Deluxe Twin Room,Junior Suite,Superior Double ...",travelgamerhub ; August 2020 ; Stayed for 3 ni...,Hector Periquin ; October 2019 ; The hotel is ...,Wei A ; January 2020 ; Stayed 5 nights 6 days ...,Judit R ; December 2019 ; I requested a non-sm...,MarieGrace ; December 2019 ; I have tried stay...,Joven O ; December 2019 ; Family of three for ...,581markjf ; September 2019 ; City garden Suite...,Stay425892 ; September 2019 ; We stayed over t...,Julius M ; September 2019 ; We booked for a bu...,Jamil A ; August 2019 ; As a hotelier i know t...,Joel Castro ; July 2019 ; We had an incredible...,Encomiendero ; July 2019 ; Good quality custom...,Hopiah ; July 2019 ; The hotel is ver near to ...,MORE NA ; June 2019 ; We booked 2 rooms (2 sep...,JoMC ; May 2019 ; We had our reservation about...


# 3. Regency Grand Suites

In [23]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d6427322-Reviews-Regency_Grand_Suites-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
add = []
# The first matching span holds the street address
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    # Drop the 'hotel amenity ' label and any underscores from each entry
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['1622 Birch Tower Condominium Jorge Bocobo Street, Manila, Luzon 1004 Philippines']
['Free High Speed Internet (WiFi),Pool,Fitness Center with Gym / Workout Room,Bar / lounge,Children Activities (Kid / Family Friendly),Business Center with Internet Access,Concierge,Non-smoking hotel']


In [24]:
req2= requests.get('https://www.booking.com/hotel/ph/regencygrand-suit.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Deluxe Studio,Executive Twin Studio,One-Bedroom Suite,Premier Studio,Room Selected at Check In,Prestige Club Suite,Prestige Club Twin Bed Suite']


### Reviews

In [25]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d6427322-Reviews-Regency_Grand_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Gnej G', 'Rowrence', 'Ronald  L', 'FrequentFlier715362', 'Clarita Gamba-Macandili']
['December 2020', 'November 2020', 'August 2020', 'February 2020', 'December 2019']


In [26]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d6427322-Reviews-or5-Regency_Grand_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Dave_in_Barrie_11', 'rmanay346185', 'nganch551542', 'fletch531', 'saf7670']
['November 2019', 'November 2019', 'November 2019', 'November 2019', 'October 2019']


In [27]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d6427322-Reviews-or10-Regency_Grand_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['Nasser K', 'Monu K', 'J Chua', 'Johnz', 'saf7670']
['September 2019', 'August 2019', 'July 2019', 'June 2019', 'May 2019']
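
The username `saf7670` appears in both the second and third batches for this hotel, which conflicts with the requirement of no duplicate usernames. One way to handle it is to keep only the first review per username, as sketched below. The helper name `drop_repeat_reviewers` is hypothetical and the summaries are placeholders.

```python
def drop_repeat_reviewers(reviews):
    """Keep the first (name, date, summary) tuple for each reviewer name."""
    seen = set()
    unique = []
    for name, date, summary in reviews:
        key = name.strip().lower()  # compare names case-insensitively
        if key in seen:
            continue
        seen.add(key)
        unique.append((name, date, summary))
    return unique

# Names and dates copied from the outputs above; summaries are placeholders
reviews = [
    ('saf7670', 'October 2019', 'summary A (placeholder)'),
    ('Nasser K', 'September 2019', 'summary B (placeholder)'),
    ('saf7670', 'May 2019', 'summary C (placeholder)'),
]
print(drop_repeat_reviewers(reviews))
```

After filtering, fewer than 15 reviews may remain, in which case a further review page would have to be fetched to reach the required count.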


In [28]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

| | ReviewerName | ReviewDate | ReviewSummary |
|---|---|---|---|
| 0 | Gnej G | December 2020 | Super ganda ng ambiance lalo n pag need mo ng ... |
| 1 | Rowrence | November 2020 | The hotel stay was very pleasant and comfortab... |
| 2 | Ronald L | August 2020 | Excellent place a great location accessible fo... |
| 3 | FrequentFlier715362 | February 2020 | It is close to Mall, easy to get taxi, very he... |
| 4 | Clarita Gamba-Macandili | December 2019 | One of the best hotel in manila. Staffs are ni... |
| 5 | Dave_in_Barrie_11 | November 2019 | I'd give this a 2.5 if I could. I've stayed h... |
| 6 | rmanay346185 | November 2019 | The hotel met most of my expectations in terms... |
| 7 | nganch551542 | November 2019 | Great location, Robinson's Mall is only walkin... |
| 8 | fletch531 | November 2019 | Rooms are generally ok but it's luck of the dr... |
| 9 | saf7670 | October 2019 | I have often returned to the Regency Grand sin... |


In [29]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

# Combine the hotel details and the review columns into one row
h3= pd.concat([df20,df7],axis=1)
h3

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,1622 Birch Tower Condominium Jorge Bocobo Stre...,"Free High Speed Internet (WiFi),Pool,Fitness C...","Deluxe Studio,Executive Twin Studio,One-Bedroo...",Gnej G ; December 2020 ; Super ganda ng ambian...,Rowrence ; November 2020 ; The hotel stay was ...,Ronald L ; August 2020 ; Excellent place a gr...,FrequentFlier715362 ; February 2020 ; It is cl...,Clarita Gamba-Macandili ; December 2019 ; One ...,Dave_in_Barrie_11 ; November 2019 ; I'd give t...,rmanay346185 ; November 2019 ; The hotel met m...,"nganch551542 ; November 2019 ; Great location,...",fletch531 ; November 2019 ; Rooms are generall...,saf7670 ; October 2019 ; I have often returned...,Nasser K ; September 2019 ; Dang came to airpo...,Monu K ; August 2019 ; Incredible value for mo...,J Chua ; July 2019 ; Its really nice to feel t...,"Johnz ; June 2019 ; Strategic location, the ho...",saf7670 ; May 2019 ; I arrived in Manila after...


# 4. Eurotel Pedro Gil

In [2]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1175684-Reviews-Eurotel_Pedro_Gil-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
add = []
# The first matching span holds the street address
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    # Drop the 'hotel amenity ' label and any underscores from each entry
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['618 Pedro Gil Street Ermita, Manila, Luzon 1004 Philippines']
['Free parking,Free High Speed Internet (WiFi),Free breakfast,Airport transportation,Conference facilities,Meeting rooms,Massage,24-hour security']


In [3]:
req2= requests.get('https://www.booking.com/hotel/ph/eurotel-pedro-gil.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Standard Queen Room,Euro Suite 1,Euro Suite 2,Studio,Standard Twin Room']


In [4]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1175684-Reviews-Eurotel_Pedro_Gil-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['oliveoilvint', 'Cj118', 'R. Escala', 'Las Buganvillas', 'theroyalputri']
['March 2021', 'January 2021', 'October 2020', 'April 2020', 'February 2020']


In [5]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1175684-Reviews-or5-Eurotel_Pedro_Gil-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['John Ray L', 'Peter Paul Duran', 'MattandGene', 'fattyboy8', 'eesaahg']
['January 2020', 'October 2019', 'August 2019', 'July 2019', 'July 2019']


In [6]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1175684-Reviews-or10-Eurotel_Pedro_Gil-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['bimbo m', 'jean', 'Persioux Crowley', 'j n', 'joylovesben']
['May 2019', 'July 2018', 'May 2019', 'January 2019', 'January 2019']


In [7]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

| | ReviewerName | ReviewDate | ReviewSummary |
|---|---|---|---|
| 0 | oliveoilvint | March 2021 | I don't recommande at all. Especially for quar... |
| 1 | Cj118 | January 2021 | The location is very convenient. Service was s... |
| 2 | R. Escala | October 2020 | Stayed for 3days/2nights. Booked the room thru... |
| 3 | Las Buganvillas | April 2020 | I stayed in the hotel during the lockdown in M... |
| 4 | theroyalputri | February 2020 | Stayed here for 3 nights as it was the practic... |
| 5 | John Ray L | January 2020 | I chose to stay here because I had errands to ... |
| 6 | Peter Paul Duran | October 2019 | Room smelled like the comfort room. Asked the ... |
| 7 | MattandGene | August 2019 | Shopping mall across the street along with a v... |
| 8 | fattyboy8 | July 2019 | I stayed here due to the location My first ro... |
| 9 | eesaahg | July 2019 | Booked this hotel due to my father’s location ... |


In [8]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

# Combine the hotel details and the review columns into one row
h4= pd.concat([df20,df7],axis=1)
h4

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"618 Pedro Gil Street Ermita, Manila, Luzon 100...","Free parking,Free High Speed Internet (WiFi),F...","Standard Queen Room,Euro Suite 1,Euro Suite 2,...",oliveoilvint ; March 2021 ; I don't recommande...,Cj118 ; January 2021 ; The location is very co...,R. Escala ; October 2020 ; Stayed for 3days/2n...,Las Buganvillas ; April 2020 ; I stayed in the...,theroyalputri ; February 2020 ; Stayed here fo...,John Ray L ; January 2020 ; I chose to stay he...,Peter Paul Duran ; October 2019 ; Room smelled...,MattandGene ; August 2019 ; Shopping mall acro...,fattyboy8 ; July 2019 ; I stayed here due to t...,eesaahg ; July 2019 ; Booked this hotel due to...,bimbo m ; May 2019 ; I consider Eurotel at Ped...,jean ; July 2018 ; just opposite robinsons man...,Persioux Crowley ; May 2019 ; Worst is their a...,j n ; January 2019 ; I'm not picky when it com...,joylovesben ; January 2019 ; The place is aver...


# 5. Executive Hotel

In [31]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d446807-Reviews-Executive_Hotel-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
add = []
# The first matching span holds the street address
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    # Drop the 'hotel amenity ' label and any underscores from each entry
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['1630 A. Mabini Street Malate, Manila, Luzon 1000 Philippines']
['Parking,Free High Speed Internet (WiFi),Bar / lounge,Game room,Babysitting,Conference facilities,Meeting rooms,Massage']


In [62]:
req2= requests.get('https://ph.hotels.com/ho332608/executive-hotel-manila-philippines')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('ul',{'class':'mK9qzN'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
#avroom= ",".join(avroom)
#a comma is inserted wherever one room name runs straight into the next ('eB', 'mB')
availroom = [item.replace('eB','e,B') for item in avroom]
availroom = [item.replace('mB','m,B') for item in availroom]

print(availroom)

['Junior Suite,Basic Twin Room,Basic Double Room']
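The letter-pair replacements above have to be chosen by hand for each hotel. A more general sketch, assuming the scraped text has the room names glued together with no separator (as the `replace` calls suggest), inserts a separator wherever a lowercase letter is directly followed by an uppercase one. The helper name `split_room_names` is hypothetical.

```python
import re

def split_room_names(glued, sep=','):
    """Insert `sep` at every lowercase-to-uppercase boundary in `glued`."""
    # Zero-width match: position preceded by a-z and followed by A-Z
    return re.sub(r'(?<=[a-z])(?=[A-Z])', sep, glued)

# Glued input reconstructed from the output above (an assumption about the raw text)
print(split_room_names('Junior SuiteBasic Twin RoomBasic Double Room'))
```

This would wrongly split a name that contains an internal capital (for example a hypothetical "McKinley Suite"), so the result is still worth checking by eye.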


In [48]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d446807-Reviews-Executive_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Vangiedazo', 'Mr. E', 'Qas419', 'Andy S', 'Amer A']
['February 2020', 'December 2019', 'September 2019', 'September 2019', 'July 2019']
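The three loops in this cell are repeated for every review page of every hotel. One way to avoid the repetition is a single parsing function, sketched below. It reuses the class names from the cell above; the function name `parse_reviews` and the sample markup are assumptions made for illustration, and `html.parser` is used instead of `lxml` only so the sketch has one less dependency.

```python
from bs4 import BeautifulSoup

def parse_reviews(html_text):
    """Return a list of review dicts parsed from one review page."""
    page = BeautifulSoup(html_text, 'html.parser')
    names = [a.text.strip()
             for a in page.find_all('a', {'class': 'ui_header_link _1r_My98y'})]
    dates = [s.text.replace('Date of stay: ', '').strip()
             for s in page.find_all('span', {'class': '_34Xs-BQm'})]
    texts = [q.text.strip()
             for q in page.find_all('q', {'class': 'IRsGHoPm'})]
    # Fail loudly if the page layout changed and the lists no longer line up
    if not (len(names) == len(dates) == len(texts)):
        raise ValueError('name/date/summary counts differ: %d/%d/%d'
                         % (len(names), len(dates), len(texts)))
    return [{'ReviewerName': n, 'ReviewDate': d, 'ReviewSummary': t}
            for n, d, t in zip(names, dates, texts)]

# Hypothetical markup that only imitates the structure assumed above
SAMPLE_HTML = '''
<div><a class="ui_header_link _1r_My98y">User One</a>
<span class="_34Xs-BQm"><span>Date of stay:</span> May 2019</span>
<q class="IRsGHoPm"><span>Nice stay.</span></q></div>
<div><a class="ui_header_link _1r_My98y">User Two</a>
<span class="_34Xs-BQm"><span>Date of stay:</span> June 2019</span>
<q class="IRsGHoPm"><span>Clean room.</span></q></div>
'''
print(parse_reviews(SAMPLE_HTML))
```

With a function like this, each page becomes one call such as `pd.DataFrame(parse_reviews(req3.content))`, and a mismatch between the three lists is reported instead of surfacing later as a pandas error.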


In [49]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d446807-Reviews-or5-Executive_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Curtis C', 'Geminidreams', 'davey671', 'Amer A', 'Traveler89052']
['June 2019', 'April 2019', 'April 2019', 'March 2019', 'December 2018']


In [50]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d446807-Reviews-or10-Executive_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['schlesiw', 'Mark B', 'Steven P', 'paulrodriquez', 'Bruno A']
['April 2018', 'February 2018', 'December 2017', 'March 2017', 'December 2017']
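The three review cells differ only in the `-or5-` and `-or10-` part of the URL. A small sketch of building those page URLs from the hotel's base URL, assuming the offset always goes right after `-Reviews-` as it does in the URLs used above (the helper name `review_page_url` is hypothetical):

```python
def review_page_url(base_url, offset):
    """Return the URL of the review page that starts at review number `offset`."""
    if offset == 0:
        return base_url
    # Insert '-orN-' after the first '-Reviews-' only
    return base_url.replace('-Reviews-', '-Reviews-or{}-'.format(offset), 1)

base = ('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d446807-Reviews-'
        'Executive_Hotel-Manila_Metro_Manila_Luzon.html')
for offset in (0, 5, 10):
    print(review_page_url(base, offset))
```

Looping over `(0, 5, 10)` then replaces the three near-identical cells, and adding `15` fetches a fourth page when more reviews are needed.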


In [51]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Vangiedazo,February 2020,Ive been here for 5 days and the stay is relax...
1,Mr. E,December 2019,"Room is clean and the location is good, just a..."
2,Qas419,September 2019,The hotel is located in a crowded street near ...
3,Andy S,September 2019,I had a pleasant stay throughout... thanks to ...
4,Amer A,July 2019,"Very nice hotels and good value, also perfect ..."
5,Curtis C,June 2019,I recently stayed at the Executive Hotel on 8 ...
6,Geminidreams,April 2019,"Spent 5 days there. Checkin was quick, room wa..."
7,davey671,April 2019,Booked a room for 4 nights online and requeste...
8,Amer A,March 2019,Firstibul choosing an hotel is dependly what y...
9,Traveler89052,December 2018,The hotel is not what is use to be. The inter...
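The task requires that no username appears twice, and the combined list for this hotel contains `Amer A` at rows 4 and 8. A quick check like the sketch below can flag this before the reviews are flattened. It uses plain Python lists copied from the printed output so it runs on its own; the helper name `duplicate_usernames` is hypothetical.

```python
from collections import Counter

def duplicate_usernames(names):
    """Return usernames that occur more than once, in first-seen order."""
    counts = Counter(names)
    return [name for name in counts if counts[name] > 1]

# Usernames printed by the three review cells above
names = ['Vangiedazo', 'Mr. E', 'Qas419', 'Andy S', 'Amer A',
         'Curtis C', 'Geminidreams', 'davey671', 'Amer A', 'Traveler89052',
         'schlesiw', 'Mark B', 'Steven P', 'paulrodriquez', 'Bruno A']
print(duplicate_usernames(names))  # ['Amer A']
```

On the DataFrame itself, `df5[df5['ReviewerName'].duplicated(keep=False)]` shows the same rows.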


In [63]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#combine the two dataframes (hotel details and reviews) side by side
h5= pd.concat([df20,df7],axis=1)
h5

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"1630 A. Mabini Street Malate, Manila, Luzon 10...","Parking,Free High Speed Internet (WiFi),Bar / ...","Junior Suite,Basic Twin Room,Basic Double Room",Vangiedazo ; February 2020 ; Ive been here for...,Mr. E ; December 2019 ; Room is clean and the ...,Qas419 ; September 2019 ; The hotel is located...,Andy S ; September 2019 ; I had a pleasant sta...,Amer A ; July 2019 ; Very nice hotels and good...,Curtis C ; June 2019 ; I recently stayed at th...,Geminidreams ; April 2019 ; Spent 5 days there...,davey671 ; April 2019 ; Booked a room for 4 ni...,Amer A ; March 2019 ; Firstibul choosing an ho...,Traveler89052 ; December 2018 ; The hotel is n...,schlesiw ; April 2018 ; We spent a few nights ...,Mark B ; February 2018 ; The rooms are well fu...,"Steven P ; December 2017 ; Nice stay, very lar...",paulrodriquez ; March 2017 ; I did enjoy stayi...,Bruno A ; December 2017 ; I have been several ...
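Each hotel ends up as a one-row frame (`h4`, `h5`, ...) without a hotel-name column, although the task lists the name first and asks for a single `Hotel.csv`. The sketch below shows one possible row layout using only the standard library. The `Hotel` column, the helper names, and the placeholder review strings are assumptions; the address, amenities, and rooms are copied from the output above.

```python
import csv
import io

def build_hotel_row(name, address, amenities, rooms, reviews):
    """Return one flat record with 'Review 1', 'Review 2', ... columns."""
    row = {'Hotel': name, 'Address': address,
           'Amenities': amenities, 'AvailableRooms': rooms}
    for i, review in enumerate(reviews, start=1):
        row['Review {}'.format(i)] = review
    return row

def rows_to_csv(rows):
    """Serialise the records to CSV text; missing review columns stay empty."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval='')
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

rows = [build_hotel_row(
    'Executive Hotel',
    '1630 A. Mabini Street Malate, Manila, Luzon 1000 Philippines',
    'Parking,Free High Speed Internet (WiFi),Bar / lounge',
    'Junior Suite,Basic Twin Room,Basic Double Room',
    ['Reviewer A ; January 2020 ; placeholder text',   # placeholder reviews,
     'Reviewer B ; March 2020 ; placeholder text'])]   # not scraped data
print(rows_to_csv(rows))
```

With pandas, the equivalent step would be to add the name column to each one-row frame, stack them with `pd.concat([...], ignore_index=True)`, and call `to_csv('Hotel.csv', index=False)`.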


# 6. Heroes Hotel

In [64]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d12812181-Reviews-Heroes_Hotel-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (first matching span only)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    #strip the 'hotel amenity ' prefix and any underscores from the label
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['Florentino Torres Osmena Highway, Manila, Luzon 1017 Philippines']
['Free parking,Free High Speed Internet (WiFi),Free breakfast,Bicycle rental,Bicycles available,Children Activities (Kid / Family Friendly),Airport transportation,Business Center with Internet Access']


In [65]:
req2= requests.get('https://www.booking.com/hotel/ph/heroes.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Superior Double or Twin Room,Deluxe Queen Room,Budget Quadruple Room,Suite']


In [66]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d12812181-Reviews-Heroes_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Christopher W', 'James R', 'Timallalone', 'SerainaNonym', 'AngelaSept']
['January 2020', 'January 2020', 'January 2020', 'January 2020', 'December 2019']


In [67]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d12812181-Reviews-or5-Heroes_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Manoj Kumar', 'mnltabi', 'So Sardan', 'Saan aabot? Eatstraveltime', 'maricel c.']
['December 2019', 'October 2019', 'August 2019', 'August 2019', 'June 2019']


In [68]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d12812181-Reviews-or10-Heroes_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['Charlie', 'Megan Johnston', 'kateeeforkaterina', 'SpeedyG-7', 'Tadz R']
['July 2019', 'July 2019', 'July 2019', 'June 2019', 'June 2019']


In [69]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Christopher W,January 2020,This hotel has my preference due to the locati...
1,James R,January 2020,"Great welcome, fantastic staff,limited but ver..."
2,Timallalone,January 2020,Really enjoyed a week’s stay at this welcoming...
3,SerainaNonym,January 2020,It’s the best hotel i’ve been to in manila and...
4,AngelaSept,December 2019,Fantastic hotel for the price. Loved the sheet...
5,Manoj Kumar,December 2019,"Nice concept of DC and Marvel heroes, I had a ..."
6,mnltabi,October 2019,My cousins and I stayed here over the weekend ...
7,So Sardan,August 2019,I like this place. They have a lot of heroes a...
8,Saan aabot? Eatstraveltime,August 2019,Indeed always a heroes welcome whenever you st...
9,maricel c.,June 2019,the place is easy to find and what I liked the...


In [70]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#combine the two dataframes (hotel details and reviews) side by side
h6= pd.concat([df20,df7],axis=1)
h6

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"Florentino Torres Osmena Highway, Manila, Luzo...","Free parking,Free High Speed Internet (WiFi),F...","Superior Double or Twin Room,Deluxe Queen Room...",Christopher W ; January 2020 ; This hotel has ...,"James R ; January 2020 ; Great welcome, fantas...",Timallalone ; January 2020 ; Really enjoyed a ...,SerainaNonym ; January 2020 ; It’s the best ho...,AngelaSept ; December 2019 ; Fantastic hotel f...,Manoj Kumar ; December 2019 ; Nice concept of ...,mnltabi ; October 2019 ; My cousins and I stay...,So Sardan ; August 2019 ; I like this place. T...,Saan aabot? Eatstraveltime ; August 2019 ; Ind...,maricel c. ; June 2019 ; the place is easy to ...,Charlie ; July 2019 ; Decor was fantastic (con...,Megan Johnston ; July 2019 ; This hotel was so...,kateeeforkaterina ; July 2019 ; We booked a su...,SpeedyG-7 ; June 2019 ; Stayed here in mid Jun...,"Tadz R ; June 2019 ; The room is very clean, s..."


# 7. Winford Manila Resort & Casino

In [71]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d9732359-Reviews-Winford_Manila_Resort_Casino-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (first matching span only)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    #strip the 'hotel amenity ' prefix and any underscores from the label
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['MJC Drive San Lazaro Tourism Business Park, Manila, Luzon 1014 Philippines']
["Free parking,Free High Speed Internet (WiFi),Pool,Fitness Center with Gym / Workout Room,Free breakfast,Casino and Gambling,Evening entertainment,Children's television networks"]


In [72]:
req2= requests.get('https://www.booking.com/hotel/ph/winford-and-casino.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Deluxe Double Room,Executive Suite,Deluxe King Room']


In [73]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d9732359-Reviews-Winford_Manila_Resort_Casino-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['DGAnon', 'Michael M', 'jof69', 'General_koh', 'Ads P']
['January 2020', 'December 2019', 'December 2019', 'November 2019', 'October 2019']


In [74]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d9732359-Reviews-or5-Winford_Manila_Resort_Casino-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Lemontea291', 'Jason A', 'Cynthia Bryth', 'Diane B', 'MikaSF627']
['December 2018', 'October 2019', 'August 2019', 'August 2018', 'June 2019']
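The stay dates on this page are not in chronological order, and as plain strings they cannot be sorted meaningfully. A short sketch of parsing the `Month Year` strings so reviews can be ordered or filtered by date, assuming English month names and the default C locale (the helper name `parse_stay_date` is hypothetical):

```python
from datetime import datetime

def parse_stay_date(text):
    """Convert 'December 2018' to a datetime set to the first day of that month."""
    return datetime.strptime(text.strip(), '%B %Y')

dates = ['December 2018', 'October 2019', 'August 2019', 'August 2018', 'June 2019']
newest_first = sorted(dates, key=parse_stay_date, reverse=True)
print(newest_first)
```

The same function can be applied to the `ReviewDate` column with `df5['ReviewDate'].apply(parse_stay_date)` before sorting.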


In [75]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d9732359-Reviews-or10-Winford_Manila_Resort_Casino-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['Gayleesi', 'Vanity Helina Low', 'Joseph B', 'Judilynn N. Solidum', 'Mike V']
['June 2019', 'May 2019', 'May 2019', 'May 2019', 'March 2019']


In [76]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,DGAnon,January 2020,The overall experience leaves much to be desir...
1,Michael M,December 2019,First the restraunt Choi garden we went back t...
2,jof69,December 2019,My review is specifically for Choi Garden at W...
3,General_koh,November 2019,A decent hotel with good room size. The check ...
4,Ads P,October 2019,"As expected, I really had a recharging sleep a..."
5,Lemontea291,December 2018,"After 8 hours depart from my home, all fatigue..."
6,Jason A,October 2019,"Feedback: Hotel room & facilities are ok, we..."
7,Cynthia Bryth,August 2019,The have the worst parking scheme that I ever ...
8,Diane B,August 2018,Booked for 3 nights at the executive suite but...
9,MikaSF627,June 2019,We booked a Deluxe King room and sent a reques...


In [77]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#combine the two dataframes (hotel details and reviews) side by side
h7= pd.concat([df20,df7],axis=1)
h7

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"MJC Drive San Lazaro Tourism Business Park, Ma...","Free parking,Free High Speed Internet (WiFi),P...","Deluxe Double Room,Executive Suite,Deluxe King...",DGAnon ; January 2020 ; The overall experience...,Michael M ; December 2019 ; First the restraun...,jof69 ; December 2019 ; My review is specifica...,General_koh ; November 2019 ; A decent hotel w...,"Ads P ; October 2019 ; As expected, I really h...",Lemontea291 ; December 2018 ; After 8 hours de...,Jason A ; October 2019 ; Feedback: Hotel roo...,Cynthia Bryth ; August 2019 ; The have the wor...,Diane B ; August 2018 ; Booked for 3 nights at...,MikaSF627 ; June 2019 ; We booked a Deluxe Kin...,Gayleesi ; June 2019 ; Considering we live pra...,Vanity Helina Low ; May 2019 ; Perfect locatio...,Joseph B ; May 2019 ; We spent Mother's Day Lu...,Judilynn N. Solidum ; May 2019 ; I have been t...,"Mike V ; March 2019 ; nice rooms, super cheap ..."


# 8. Go Hotels Otis-Manila

In [78]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d4314691-Reviews-Go_Hotels_Otis_Manila-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (first matching span only)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    #strip the 'hotel amenity ' prefix and any underscores from the label
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['Robinsons Otis, 1536 Paz Guazon St. Paco, Manila, Luzon 1007 Philippines']
['Paid private parking on-site,Free High Speed Internet (WiFi),Car hire,Business Center with Internet Access,24-hour security,Baggage storage,24-hour check-in,24-hour front desk']


In [79]:
req2= requests.get('https://www.booking.com/hotel/ph/go-hotels-otis-manila.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Twin Room,Double Room,Hotel Care Package - Standard Twin Room,Hotel Care Package - Standard Double Room']


In [80]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d4314691-Reviews-Go_Hotels_Otis_Manila-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Katherine A', 'BonyF', 'Tharyldia Shane', 'Ann C', 'Nikki Baysa']
['February 2021', 'April 2021', 'March 2021', 'February 2021', 'February 2021']


In [81]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d4314691-Reviews-or5-Go_Hotels_Otis_Manila-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['nvilla', 'MariaEllenaP', 'Vanessa A', 'pawiks1925', 'timmymariano1626']
['December 2019', 'November 2019', 'November 2019', 'November 2019', 'October 2019']


In [82]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d4314691-Reviews-or10-Go_Hotels_Otis_Manila-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['1512mhdelPilar', 'Marty Hansen', 'Sherpa25882412712', 'JPRoblems', 'Gessa Mae L.']
['September 2019', 'September 2019', 'August 2019', 'August 2019', 'August 2019']


In [83]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Katherine A,February 2021,No Windows ( in my room) No hot water for show...
1,BonyF,April 2021,Please keep out of this hotel this is worst ho...
2,Tharyldia Shane,March 2021,"Well I booked Go hotels for my fiancé, he stay..."
3,Ann C,February 2021,Stayed there for 7 days for my quarantine Pro...
4,Nikki Baysa,February 2021,My parents went back to the Philippines for “f...
5,nvilla,December 2019,I have checked-in at several Go Hotel branches...
6,MariaEllenaP,November 2019,The hotel was on the fifth floor. The moment I...
7,Vanessa A,November 2019,"A good hotel, except for the security and park..."
8,pawiks1925,November 2019,Staff including guards and parking attendants ...
9,timmymariano1626,October 2019,We checked in at Go Hotels Otis last Oct 25 fo...


In [84]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#combine the two dataframes (hotel details and reviews) side by side
h8= pd.concat([df20,df7],axis=1)
h8

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"Robinsons Otis, 1536 Paz Guazon St. Paco, Mani...","Paid private parking on-site,Free High Speed I...","Twin Room,Double Room,Hotel Care Package - Sta...",Katherine A ; February 2021 ; No Windows ( in ...,BonyF ; April 2021 ; Please keep out of this h...,Tharyldia Shane ; March 2021 ; Well I booked G...,Ann C ; February 2021 ; Stayed there for 7 day...,Nikki Baysa ; February 2021 ; My parents went ...,nvilla ; December 2019 ; I have checked-in at ...,MariaEllenaP ; November 2019 ; The hotel was o...,"Vanessa A ; November 2019 ; A good hotel, exce...",pawiks1925 ; November 2019 ; Staff including g...,timmymariano1626 ; October 2019 ; We checked i...,1512mhdelPilar ; September 2019 ; This hotel s...,Marty Hansen ; September 2019 ; My 2nd stay in...,Sherpa25882412712 ; August 2019 ; Location is ...,JPRoblems ; August 2019 ; Hotel is located nea...,Gessa Mae L. ; August 2019 ; Reservation was q...


# 9. Adriatico Arms Hotel

In [98]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d310336-Reviews-Adriatico_Arms_Hotel-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (first matching span only)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    #strip the 'hotel amenity ' prefix and any underscores from the label
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['561 Julian Nakpil, Manila, Luzon 1004 Philippines']
['Free parking,Free High Speed Internet (WiFi),Bar / lounge,Car hire,24-hour security,Convenience store,24-hour front desk,Dry cleaning']


In [97]:
req2= requests.get('https://ph.hotels.com/ho576001/adriatico-arms-hotel-manila-philippines')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('ul',{'class':'mK9qzN'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
#avroom= ",".join(avroom)
#a semicolon is inserted wherever one room name runs straight into the next ('dS', 'sD', 'mD');
#semicolons are used instead of commas because the room descriptions themselves contain commas
availroom = [item.replace('dS','d;S') for item in avroom]
availroom = [item.replace('sD','s;D') for item in availroom]
availroom = [item.replace('mD','m;D') for item in availroom]
print(availroom)

['Standard Room, 1 Queen Bed;Standard Room, 2 Twin Beds;Deluxe Double Room, 2 Double Beds;Deluxe Triple Room;Deluxe Quadruple Room, 2 Double Beds']


In [99]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d310336-Reviews-Adriatico_Arms_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Nikko', 'Crazy Eagle', 'test', 'MissZ', 'BAI']
['February 2021', 'November 2018', 'July 2019', 'April 2019', 'February 2019']


In [100]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d310336-Reviews-or5-Adriatico_Arms_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Crazy Eagle', 'kevinbirch49', 'rowena L', 'Laarni A', 'Kristal K']
['December 2018', 'November 2018', 'June 2018', 'October 2018', 'September 2018']


In [101]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d310336-Reviews-or10-Adriatico_Arms_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['Sarah D', 'rowena L', 'Nathalie B', 'EbebPL', 'Keith R']
['August 2016', 'November 2016', 'November 2016', 'November 2016', 'August 2016']
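Across the three pages for this hotel, `Crazy Eagle` and `rowena L` each appear twice, so keeping all 15 rows would break the one-review-per-user requirement. Below is a sketch of filtering while collecting, so that it is clear when another page is needed. The helper name `keep_unique_reviews` is hypothetical, and comparing usernames case-insensitively is an assumption.

```python
def keep_unique_reviews(reviews, needed=15):
    """Keep the first review per username, stopping once `needed` are collected."""
    seen = set()
    kept = []
    for name, date in reviews:
        key = name.strip().lower()
        if key in seen:
            continue  # this user already has a review in the list
        seen.add(key)
        kept.append((name, date))
        if len(kept) == needed:
            break
    return kept

# (username, stay date) pairs as printed by the three review cells above
reviews = [
    ('Nikko', 'February 2021'), ('Crazy Eagle', 'November 2018'),
    ('test', 'July 2019'), ('MissZ', 'April 2019'), ('BAI', 'February 2019'),
    ('Crazy Eagle', 'December 2018'), ('kevinbirch49', 'November 2018'),
    ('rowena L', 'June 2018'), ('Laarni A', 'October 2018'),
    ('Kristal K', 'September 2018'), ('Sarah D', 'August 2016'),
    ('rowena L', 'November 2016'), ('Nathalie B', 'November 2016'),
    ('EbebPL', 'November 2016'), ('Keith R', 'August 2016'),
]
unique = keep_unique_reviews(reviews)
print(len(unique))  # 13, so two more reviews from a further page are needed
```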


In [102]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Nikko,February 2021,Adriatico Arms is a wonderful haven for those ...
1,Crazy Eagle,November 2018,This is a budget hotel that is comfortable and...
2,test,July 2019,Every time we pass trough Manila we stay at th...
3,MissZ,April 2019,Place is descent and affordable.the room is cl...
4,BAI,February 2019,The hotel is just walking distance from a larg...
5,Crazy Eagle,December 2018,This is a budget hotel that is comfortable and...
6,kevinbirch49,November 2018,Location fantastic every thing you need for a ...
7,rowena L,June 2018,As have said before. Fav place to stay and eat...
8,Laarni A,October 2018,"Location wise, good. Since it is just a few me..."
9,Kristal K,September 2018,This was my first time staying here and I real...


In [103]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#combine the two dataframes (hotel details and reviews) side by side
h9= pd.concat([df20,df7],axis=1)
h9

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"561 Julian Nakpil, Manila, Luzon 1004 Philippines","Free parking,Free High Speed Internet (WiFi),B...","Standard Room, 1 Queen Bed;Standard Room, 2 Tw...",Nikko ; February 2021 ; Adriatico Arms is a wo...,Crazy Eagle ; November 2018 ; This is a budget...,test ; July 2019 ; Every time we pass trough M...,MissZ ; April 2019 ; Place is descent and affo...,BAI ; February 2019 ; The hotel is just walkin...,Crazy Eagle ; December 2018 ; This is a budget...,kevinbirch49 ; November 2018 ; Location fantas...,rowena L ; June 2018 ; As have said before. Fa...,"Laarni A ; October 2018 ; Location wise, good....",Kristal K ; September 2018 ; This was my first...,Sarah D ; August 2016 ; Despite for it to be n...,rowena L ; November 2016 ; I have been using t...,Nathalie B ; November 2016 ; We picked this pl...,"EbebPL ; November 2016 ; Good prices, helpful ...",Keith R ; August 2016 ; I spent a week at Adri...
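
The pre-processing cell joins name, date and summary with ' ; ' and then spreads the reviews across `Review 1`, `Review 2`, and so on. As a sketch only, the same reshaping can be written with a plain dictionary, which makes the intended result easier to check by eye. `reviews_to_row` is a hypothetical name and the summaries are placeholders.

```python
def reviews_to_row(names, dates, summaries, sep=' ; '):
    """Return {'Review 1': 'name ; date ; summary', ...} for one hotel."""
    row = {}
    for i, parts in enumerate(zip(names, dates, summaries), start=1):
        row['Review ' + str(i)] = sep.join(parts)
    return row


row = reviews_to_row(
    ['Nikko', 'test'],
    ['February 2021', 'July 2019'],
    ['first placeholder text', 'second placeholder text'],
)
print(row)
```

A dictionary built this way could be passed to `pd.DataFrame([row])` to obtain the one-row table.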


# 10. JMM Grand Suites

In [104]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d4014256-Reviews-JMM_Grand_Suites-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    #Keep only the first match
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    #Drop the 'hotel amenity ' prefix and any underscores from the label
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['Jorge Bocobo Street Along Pedro Gil Street opposite Robinsons, Birch Tower 1622, Manila, Luzon 1004 Philippines']
['Secured parking,Free High Speed Internet (WiFi),Pool,Fitness Center with Gym / Workout Room,Restaurant,Billiards,Table tennis,Children Activities (Kid / Family Friendly)']


In [105]:
req2= requests.get('https://www.booking.com/hotel/ph/jmm-grand-suites.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Executive King Suite,One-Bedroom Apartment,Three-Bedroom Apartment,Studio,Two-Bedroom Apartment,Executive Room with Two Single Beds,Deluxe Studio']


In [106]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d4014256-Reviews-JMM_Grand_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Mark B', 'Paul H', 'LakePanorama', 'Resort589258', 'Myers T']
['June 2020', 'February 2020', 'December 2019', 'June 2019', 'June 2019']


In [107]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d4014256-Reviews-or5-JMM_Grand_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Jeff G', 'saf7670', 'Nor Wayne S', 'Angeline Rose', 'gwattya']
['March 2019', 'May 2019', 'January 2019', 'January 2019', 'January 2019']


In [108]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d4014256-Reviews-or10-JMM_Grand_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['vicpk', 'marley2828', 'Zevious', 'andrewcorica', 'AkiChanTells']
['January 2019', 'December 2018', 'December 2018', 'August 2018', 'August 2017']
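
The three review cells differ only in the `or5-` and `or10-` segment placed after `-Reviews-` in the URL. Assuming that pattern continues in steps of five, the page URLs can be generated instead of typed by hand. This is a sketch; `review_page_urls` is a hypothetical helper and the site may change its URL scheme.

```python
def review_page_urls(first_page_url, pages=3, per_page=5):
    """Build the URLs of consecutive review pages from the first page URL."""
    marker = '-Reviews-'
    if marker not in first_page_url:
        raise ValueError('URL does not contain ' + marker)
    head, tail = first_page_url.split(marker, 1)
    urls = []
    for page in range(pages):
        offset = page * per_page
        # The first page has no offset segment; later pages use or5-, or10-, ...
        segment = '' if offset == 0 else 'or' + str(offset) + '-'
        urls.append(head + marker + segment + tail)
    return urls


urls = review_page_urls(
    'https://www.tripadvisor.com.ph/Hotel_Review-g298573-d4014256-Reviews-'
    'JMM_Grand_Suites-Manila_Metro_Manila_Luzon.html'
)
for url in urls:
    print(url)
```

Looping over `urls` with a single scraping routine would replace the three near-identical cells per hotel.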


In [109]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Mark B,June 2020,"Booked to stay here via Agoda, this was a prep..."
1,Paul H,February 2020,We found JMM Grand Suites rather shabby and no...
2,LakePanorama,December 2019,JMM Grand Suites is a two-minute walk to the R...
3,Resort589258,June 2019,The location is very accessible in most amenit...
4,Myers T,June 2019,More frequently this has been my go to hotel i...
5,Jeff G,March 2019,Rented a 1 bedroom apt. for 2 people but hotel...
6,saf7670,May 2019,"I spent one night at the Regency Grand Suites,..."
7,Nor Wayne S,January 2019,We booked here in advance and we inquire every...
8,Angeline Rose,January 2019,Im so disapointed with their service..i will n...
9,gwattya,January 2019,Was allocated a 31st room floor where my husba...


In [110]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the reviews into one row
h10= pd.concat([df20,df7],axis=1)
h10

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,Jorge Bocobo Street Along Pedro Gil Street opp...,"Secured parking,Free High Speed Internet (WiFi...","Executive King Suite,One-Bedroom Apartment,Thr...",Mark B ; June 2020 ; Booked to stay here via A...,Paul H ; February 2020 ; We found JMM Grand Su...,LakePanorama ; December 2019 ; JMM Grand Suite...,Resort589258 ; June 2019 ; The location is ver...,Myers T ; June 2019 ; More frequently this has...,Jeff G ; March 2019 ; Rented a 1 bedroom apt. ...,saf7670 ; May 2019 ; I spent one night at the ...,Nor Wayne S ; January 2019 ; We booked here in...,Angeline Rose ; January 2019 ; Im so disapoint...,gwattya ; January 2019 ; Was allocated a 31st ...,vicpk ; January 2019 ; I have stayed here nume...,marley2828 ; December 2018 ; My fiance and I h...,Zevious ; December 2018 ; We’ve stayed here be...,andrewcorica ; August 2018 ; Do I recommend th...,AkiChanTells ; August 2017 ; JMM Grand Suites ...


# 11. Hotel Kimberly Manila

In [111]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d308324-Reviews-Hotel_Kimberly_Manila-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    #Keep only the first match
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    #Drop the 'hotel amenity ' prefix and any underscores from the label
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['770 Pedro Gil Street Malate, Manila, Luzon 1004 Philippines']
['Free public parking nearby,Free High Speed Internet (WiFi),Free breakfast,Kids stay free,Books, DVDs, music for children,Airport transportation,Business Center with Internet Access,Conference facilities']
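
One of the amenities printed above, 'Books, DVDs, music for children', contains commas itself, so the comma-joined string can no longer be split back into the original items. Below is a sketch of a joiner that picks a separator absent from every item, assuming ';' or '|' is acceptable in the CSV. `safe_join` is a hypothetical name.

```python
def safe_join(items, candidates=(',', ';', '|')):
    """Join items with the first candidate separator that no item contains."""
    for sep in candidates:
        if not any(sep in item for item in items):
            return sep.join(items)
    raise ValueError('every candidate separator occurs in at least one item')


amenities = ['Free breakfast', 'Kids stay free', 'Books, DVDs, music for children']
joined = safe_join(amenities)
print(joined)
```

Because the chosen separator can differ from hotel to hotel, a fixed separator such as ';' for every hotel may be simpler if none of the scraped labels contain it.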


In [112]:
req2= requests.get('https://www.booking.com/hotel/ph/kimberly-manila.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Premier Family Suite with Kitchen,Deluxe Family Room,Executive Suite,Superior Twin Room,Superior Triple Room,Superior Family Room,Deluxe Triple Room,Deluxe Twin Room,Superior Quadruple Room,Deluxe Quadruple Room,Family Suite,Kimberly Queen Loft Annex,Kimberly Family Loft Annex,Premier Studio Suite with Kitchen']


In [113]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d308324-Reviews-Hotel_Kimberly_Manila-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Donna', 'hanzpadillo0421', 'Lee', 'Lei delacruz', 'Kaitlyn']
['May 2021', 'May 2021', 'April 2021', 'April 2021', 'March 2021']


In [114]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d308324-Reviews-or5-Hotel_Kimberly_Manila-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['jerphlee', 'jojomanimbao', 'junjunpregoner', 'Mark M', 'Wally-Singapore']
['February 2021', 'October 2020', 'September 2020', 'July 2020', 'January 2020']


In [115]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d308324-Reviews-or10-Hotel_Kimberly_Manila-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['Paul V', 'Malay A', 'wanderforwows', 'cqfp123', 'AboAziz']
['February 2020', 'December 2019', 'January 2020', 'January 2020', 'January 2020']


In [116]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Donna,May 2021,The hotel staff did an extra mile in granting ...
1,hanzpadillo0421,May 2021,"thank you and GOD BLESS,I'm very much apprecia..."
2,Lee,April 2021,Location Location Location! An excellent Hotel...
3,Lei delacruz,April 2021,"Although we booked for quarantine purposes, we..."
4,Kaitlyn,March 2021,We had a late check in at the hotel and coming...
5,jerphlee,February 2021,The hotel reception and customer service is ve...
6,jojomanimbao,October 2020,I was extra because I came early in hotel. My ...
7,junjunpregoner,September 2020,The hotel is so clean and nice. There place is...
8,Mark M,July 2020,I've been looking for a place to stay after my...
9,Wally-Singapore,January 2020,I decided to give Hotel Kimberly try in Janua...


In [117]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the reviews into one row
h11= pd.concat([df20,df7],axis=1)
h11

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"770 Pedro Gil Street Malate, Manila, Luzon 100...","Free public parking nearby,Free High Speed Int...","Premier Family Suite with Kitchen,Deluxe Famil...",Donna ; May 2021 ; The hotel staff did an extr...,hanzpadillo0421 ; May 2021 ; thank you and GOD...,Lee ; April 2021 ; Location Location Location!...,Lei delacruz ; April 2021 ; Although we booked...,Kaitlyn ; March 2021 ; We had a late check in ...,jerphlee ; February 2021 ; The hotel reception...,jojomanimbao ; October 2020 ; I was extra beca...,junjunpregoner ; September 2020 ; The hotel is...,Mark M ; July 2020 ; I've been looking for a p...,Wally-Singapore ; January 2020 ; I decided to ...,Paul V ; February 2020 ; This hotel was part o...,Malay A ; December 2019 ; Just 9 km from the A...,wanderforwows ; January 2020 ; Check in was gr...,cqfp123 ; January 2020 ; Stayed 3 non-consecut...,AboAziz ; January 2020 ; - Hotel charged me ad...


# 12. Go Hotels Ermita

In [121]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d12548113-Reviews-Go_Hotels_Ermita-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    #Keep only the first match
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    #Drop the 'hotel amenity ' prefix and any underscores from the label
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['Lot 2 Block 36 3A, Mabini St., Malate, Manila, Luzon 1004 Philippines']
['Paid private parking on-site,Free High Speed Internet (WiFi),Restaurant,Children Activities (Kid / Family Friendly),Meeting rooms,Baggage storage,Non-smoking hotel,24-hour front desk']


In [120]:
req2= requests.get('https://ph.hotels.com/ho727299328/go-hotels-ermita-manila-philippines')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('ul',{'class':'mK9qzN'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
#The room names come back run together (e.g. 'RoomTwin'), so insert a comma between 'm' and 'T'
availroom = [item.replace('mT','m,T') for item in avroom]
print(availroom)

['Queen Room,Twin Room']
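
The `replace('mT','m,T')` call only works when one room name ends in "m" and the next starts with "T". A slightly more general sketch is to insert the separator wherever a lowercase letter is directly followed by an uppercase one. The input string below is an assumption about what the `<ul>` text looked like, and `split_run_together` is a hypothetical helper.

```python
import re


def split_run_together(text, sep=','):
    """Insert sep wherever a lowercase letter is directly followed by an uppercase one."""
    return re.sub(r'(?<=[a-z])(?=[A-Z])', sep, text)


rooms = split_run_together('Queen RoomTwin Room')
print(rooms)
```

This heuristic would also split words such as "WiFi", so iterating over the individual list items of the page is likely the safer route when the markup allows it.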


In [122]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d12548113-Reviews-Go_Hotels_Ermita-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['JC Deala', 'Unlad', 'Krysta', 'MPad', 'groovyguyabano88']
['May 2021', 'March 2021', 'April 2021', 'April 2021', 'October 2020']


In [123]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d12548113-Reviews-or5-Go_Hotels_Ermita-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Dora the explorer', 'Jimbo P', 'Grace', 'Jeremy B', 'Sheyna P']
['July 2020', 'February 2020', 'February 2020', 'January 2020', 'January 2020']


In [124]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d12548113-Reviews-or10-Go_Hotels_Ermita-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['Christine M', 'elle021628', 'MariaEllenaP', 'PreetyNagi', 'Valerie C']
['December 2019', 'December 2019', 'November 2019', 'November 2019', 'October 2019']


In [125]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,JC Deala,May 2021,"Stayed for 11 days quarantine, no housekeepi..."
1,Unlad,March 2021,Worst customer service ever. The reservations ...
2,Krysta,April 2021,Stayed here last on April 2021 because of its ...
3,MPad,April 2021,"The room was clean, did not see any bugs. Bed ..."
4,groovyguyabano88,October 2020,This is a great budget hotel. Has all basic ne...
5,Dora the explorer,July 2020,"Employees are great. Very professional, respo..."
6,Jimbo P,February 2020,"They had a good service, nice room, cold airco..."
7,Grace,February 2020,The room is nice and clean but too small esp t...
8,Jeremy B,January 2020,Good location and clean but very poor staff se...
9,Sheyna P,January 2020,Location is accessible. Poor internet connecti...


In [126]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the reviews into one row
h12= pd.concat([df20,df7],axis=1)
h12

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"Lot 2 Block 36 3A, Mabini St., Malate, Manila,...","Paid private parking on-site,Free High Speed I...","Queen Room,Twin Room",JC Deala ; May 2021 ; Stayed for 11 days quar...,Unlad ; March 2021 ; Worst customer service ev...,Krysta ; April 2021 ; Stayed here last on Apri...,"MPad ; April 2021 ; The room was clean, did no...",groovyguyabano88 ; October 2020 ; This is a gr...,Dora the explorer ; July 2020 ; Employees are ...,Jimbo P ; February 2020 ; They had a good serv...,Grace ; February 2020 ; The room is nice and c...,Jeremy B ; January 2020 ; Good location and cl...,Sheyna P ; January 2020 ; Location is accessib...,Christine M ; December 2019 ; Staffs are very ...,elle021628 ; December 2019 ; Firstly i would l...,MariaEllenaP ; November 2019 ; My sister and I...,PreetyNagi ; November 2019 ; We stayed here fo...,"Valerie C ; October 2019 ; Check in was swift,..."


# 13. Red Planet Manila Binondo

In [127]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d13401314-Reviews-Red_Planet_Manila_Binondo-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    #Keep only the first match
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    #Drop the 'hotel amenity ' prefix and any underscores from the label
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['251-61 Juan Luna Street Binondo, Manila, Luzon 1006 Philippines']
['Free parking,Free High Speed Internet (WiFi),Children Activities (Kid / Family Friendly),Business Center with Internet Access,24-hour security,Baggage storage,24-hour check-in,24-hour front desk']


In [128]:
req2= requests.get('https://www.booking.com/hotel/ph/red-planet-binondo.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Double Room,Twin Room']


In [129]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d13401314-Reviews-Red_Planet_Manila_Binondo-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Edelita T', 'RDT2418', 'Camille J', 'eugelavega', 'dimsons978']
['June 2021', 'May 2021', 'June 2021', 'May 2021', 'May 2021']


In [130]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d13401314-Reviews-or5-Red_Planet_Manila_Binondo-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['madelyneleilagauna', 'stevedoromal75', 'Carlos R', 'engelbertargel0001', 'Tessie T']
['May 2021', 'May 2021', 'May 2020', 'February 2021', 'December 2020']


In [131]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d13401314-Reviews-or10-Red_Planet_Manila_Binondo-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['Aida Ramos', 'ebrp1989', 'andrewpulmones', 'Ron', 'Lynne N']
['November 2020', 'November 2020', 'October 2020', 'January 2020', 'February 2020']


In [132]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Edelita T,June 2021,So clean and value for money! Staff were frien...
1,RDT2418,May 2021,Excellent hotel during mandatory quarantine. ...
2,Camille J,June 2021,Booking process was so helpful. They will repl...
3,eugelavega,May 2021,Linens are not replaced as per schedule. I sta...
4,dimsons978,May 2021,What I like most about is the cleanliness of t...
5,madelyneleilagauna,May 2021,"I arrived with my room really filthy, full of ..."
6,stevedoromal75,May 2021,"when I stya in your hotel, its so nice, clean,..."
7,Carlos R,May 2020,I chose Red Planet in Binondo because I had a ...
8,engelbertargel0001,February 2021,"Very accomodating, the palce is well maintain,..."
9,Tessie T,December 2020,"Thank you for the nice room Good location, rea..."


In [133]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the reviews into one row
h13= pd.concat([df20,df7],axis=1)
h13

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"251-61 Juan Luna Street Binondo, Manila, Luzon...","Free parking,Free High Speed Internet (WiFi),C...","Double Room,Twin Room",Edelita T ; June 2021 ; So clean and value for...,RDT2418 ; May 2021 ; Excellent hotel during ma...,Camille J ; June 2021 ; Booking process was so...,eugelavega ; May 2021 ; Linens are not replace...,dimsons978 ; May 2021 ; What I like most about...,madelyneleilagauna ; May 2021 ; I arrived with...,stevedoromal75 ; May 2021 ; when I stya in you...,Carlos R ; May 2020 ; I chose Red Planet in Bi...,engelbertargel0001 ; February 2021 ; Very acco...,Tessie T ; December 2020 ; Thank you for the n...,Aida Ramos ; November 2020 ; This is my 4th ti...,ebrp1989 ; November 2020 ; Only downside is th...,andrewpulmones ; October 2020 ; Housekeeping s...,Ron ; January 2020 ; Stayed here for a chinese...,Lynne N ; February 2020 ; Red Planet Manila - ...
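
Each hotel ends up as a one-row table like the one above, and the task stores all of them in Hotel.csv. A hotel whose page yields fewer reviews would have fewer `Review N` columns, so the rows may not share the same keys. The sketch below uses the standard `csv` module rather than the notebook's pandas calls, and leaves missing columns blank. `write_hotel_rows` and the sample rows are hypothetical.

```python
import csv
import io


def write_hotel_rows(rows, handle):
    """Write per-hotel dicts to CSV, leaving missing review columns blank."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)  # keep first-seen column order
    writer = csv.DictWriter(handle, fieldnames=fieldnames, restval='')
    writer.writeheader()
    writer.writerows(rows)
    return fieldnames


rows = [
    {'Address': 'placeholder address A', 'Review 1': 'r1', 'Review 2': 'r2'},
    {'Address': 'placeholder address B', 'Review 1': 'r1'},
]
buffer = io.StringIO()
columns = write_hotel_rows(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

For a real file, `buffer` would be replaced with `open('Hotel.csv', 'w', newline='')`.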


# 14. OYO 152 Sangco Condotel

In [134]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d17387323-Reviews-OYO_152_Sangco_Condotel-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    #Keep only the first match
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    #Drop the 'hotel amenity ' prefix and any underscores from the label
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['1820, San Pedro St, Malate, Manila, Luzon 1004 Philippines']
['Free parking,Children Activities (Kid / Family Friendly),Spa']


In [135]:
req2= requests.get('https://www.booking.com/hotel/ph/oyo-152-sangco-condotel.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Standard Double Room,Deluxe Double Room,Suite Family,Standard Bunk 4 Bed,Standard Bunk 6 Bed,Standard Bunk 8 Bed']


In [137]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d17387323-Reviews-OYO_152_Sangco_Condotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)


['Alvin O.', 'Arnaud J', 'Elodie01121']
['September 2020', 'March 2020', 'January 2020']
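
Only three reviewers came back for this hotel, short of the 15 reviews the task requires. The sketch below shows a check that could run before the reviews are combined. `check_reviews` is a hypothetical helper, the summaries are placeholders, and the default of 15 comes from the task description.

```python
def check_reviews(names, dates, summaries, minimum=15):
    """Return a list of problems; an empty list means the data passes."""
    problems = []
    lengths = {len(names), len(dates), len(summaries)}
    if len(lengths) != 1:
        problems.append('lists differ in length: %d names, %d dates, %d summaries'
                        % (len(names), len(dates), len(summaries)))
    if len(names) < minimum:
        problems.append('only %d reviews, need %d' % (len(names), minimum))
    if len(set(names)) != len(names):
        problems.append('duplicate reviewer names')
    return problems


problems = check_reviews(
    ['Alvin O.', 'Arnaud J', 'Elodie01121'],
    ['September 2020', 'March 2020', 'January 2020'],
    ['placeholder', 'placeholder', 'placeholder'],
)
print(problems)
```

A non-empty result signals that the hotel needs more review pages, a different source, or to be left out of the final file.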


In [141]:
#Combine the Reviews
#Only df2 belongs to this hotel. df3 and df4 were not rebuilt here and still hold
#the Red Planet Manila Binondo reviews from the previous section, so they are left out.
df5= df2.reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Alvin O.,September 2020,Excellent price and room despite lack of most ...
1,Arnaud J,March 2020,Bonne localisation restaurants et barc a proxi...
2,Elodie01121,January 2020,We had one night at Oyo 152 Hotel and it was s...


In [142]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the reviews into one row
h14= pd.concat([df20,df7],axis=1)
h14

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3
0,"1820, San Pedro St, Malate, Manila, Luzon 1004...","Free parking,Children Activities (Kid / Family...","Standard Double Room,Deluxe Double Room,Suite ...",Alvin O. ; September 2020 ; Excellent price an...,Arnaud J ; March 2020 ; Bonne localisation res...,Elodie01121 ; January 2020 ; We had one night ...


# 15. Aloha Hotel

In [143]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d320908-Reviews-Aloha_Hotel-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (only the first matching span is needed)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    ame.append(am.text.replace('hotel amenity ',am.text.replace('_','').strip()).strip())        
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['2150 Roxas Blvd Cor. Quirino Ave, Malate, Manila, Luzon 1004 Philippines']
['Parking,Free High Speed Internet (WiFi),Coffee shop,Nightclub / DJ,Children Activities (Kid / Family Friendly),Car hire,Business Center with Internet Access,Meeting rooms']


In [144]:
req2= requests.get('https://www.booking.com/hotel/ph/aloha.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Double or Twin Room with Sea View,Classic Double or Twin Room with City View']


In [145]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d320908-Reviews-Aloha_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Meander827892', 'eric', 'Babe2015', 'Victor C.', 'Jomer F']
['December 2019', 'December 2019', 'August 2019', 'September 2019', 'February 2019']


In [146]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d320908-Reviews-or5-Aloha_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Ayman', 'MAM', 'BobNarrs', '#LetsDoThis', 'Pablo C']
['November 2018', 'January 2019', 'September 2018', 'August 2018', 'November 2018']


In [147]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d320908-Reviews-or10-Aloha_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['Eriberto T', 'Joe Bee', 'E R', 'John Paulo S', 'Tony V']
['September 2018', 'May 2018', 'July 2017', 'July 2017', 'February 2018']
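
The three review cells above differ only in the `-or5-` and `-or10-` offsets placed after `-Reviews-` in the URL. Assuming that pattern holds for every hotel page (it is inferred from the URLs used in this notebook, not from any documented API), the page URLs could be generated instead of typed by hand. `review_page_urls` is a hypothetical helper; the `#REVIEWS` fragment is left out because URL fragments are not sent to the server.

```python
def review_page_urls(first_page_url, pages=3, per_page=5):
    """Build review page URLs by inserting -or<offset>- after -Reviews-."""
    marker = "-Reviews-"
    if marker not in first_page_url:
        raise ValueError("URL does not contain '-Reviews-'")
    head, tail = first_page_url.split(marker, 1)
    urls = []
    for page in range(pages):
        offset = page * per_page
        # The first page has no offset segment; later pages get -or5-, -or10-, ...
        middle = marker if offset == 0 else f"{marker}or{offset}-"
        urls.append(head + middle + tail)
    return urls


urls = review_page_urls(
    "https://www.tripadvisor.com.ph/Hotel_Review-g298573-d320908-Reviews-"
    "Aloha_Hotel-Manila_Metro_Manila_Luzon.html"
)
for url in urls:
    print(url)
```

Looping over such a list would let one block of scraping code serve all pages of a hotel.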


In [148]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Meander827892,December 2019,"I arrived and check in went smoothly, got the ..."
1,eric,December 2019,Cockroaches and pests abound the rooms. Food i...
2,Babe2015,August 2019,"Nice Location, Seaview, Sunset view, Good fo..."
3,Victor C.,September 2019,"Aloha Hotel was a great hotel once, not anymor..."
4,Jomer F,February 2019,The room is spacious but dont expect get a goo...
5,Ayman,November 2018,"Booked here several times ""ZEN Rooms Aloha Man..."
6,MAM,January 2019,Hotel maintenance was lacking and customer ser...
7,BobNarrs,September 2018,I really like this hotel. Although it is an ol...
8,#LetsDoThis,August 2018,The place is generally okay. Bathroom is good;...
9,Pablo C,November 2018,Very nice hotel the room was really clean mini...


In [149]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#combine the hotel-details dataframe and the reviews dataframe side by side
h15= pd.concat([df20,df7],axis=1)
h15

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"2150 Roxas Blvd Cor. Quirino Ave, Malate, Mani...","Parking,Free High Speed Internet (WiFi),Coffee...","Double or Twin Room with Sea View,Classic Doub...",Meander827892 ; December 2019 ; I arrived and ...,eric ; December 2019 ; Cockroaches and pests a...,"Babe2015 ; August 2019 ; Nice Location, Seavi...",Victor C. ; September 2019 ; Aloha Hotel was a...,Jomer F ; February 2019 ; The room is spacious...,Ayman ; November 2018 ; Booked here several ti...,MAM ; January 2019 ; Hotel maintenance was lac...,BobNarrs ; September 2018 ; I really like this...,#LetsDoThis ; August 2018 ; The place is gener...,Pablo C ; November 2018 ; Very nice hotel the ...,Eriberto T ; September 2018 ; The sun especial...,Joe Bee ; May 2018 ; Stayed at Aloha for it is...,"E R ; July 2017 ; For the price, you can withs...",John Paulo S ; July 2017 ; The breakfast buffe...,Tony V ; February 2018 ; Had a family gatherin...


# 16. Leesons Residences

In [150]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1427141-Reviews-Leesons_Residences-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (only the first matching span is needed)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    ame.append(am.text.replace('hotel amenity ',am.text.replace('_','').strip()).strip())        
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['944 Remedios Street, Malate, Manila, Luzon 1004 Philippines']
['Parking,Free High Speed Internet (WiFi),Wifi,Full body massage,24-hour security,Baggage storage,24-hour check-in,24-hour front desk']


In [151]:
req2= requests.get('https://www.booking.com/hotel/ph/leesons-residences.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Superior Double Room,Deluxe Room,King Room,One-Bedroom Suite']


In [152]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1427141-Reviews-Leesons_Residences-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['116zannes', 'fattyboy8', 'Dion11191984', 'Belle', 'Lai']
['September 2019', 'August 2019', 'May 2019', 'March 2019', 'February 2019']
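
The `ReviewDate` values are month-year strings such as `'September 2019'`. If the reviews later need to be sorted or filtered by date, they could be parsed with the standard library. This is a small sketch, assuming every value follows the `Month YYYY` pattern and that the default (English) locale is in use; `parse_stay_date` is a hypothetical name.

```python
from datetime import datetime


def parse_stay_date(text):
    """Parse 'Date of stay: September 2019' or 'September 2019' into (year, month)."""
    cleaned = text.replace("Date of stay: ", "").strip()
    parsed = datetime.strptime(cleaned, "%B %Y")
    return parsed.year, parsed.month


dates = ["September 2019", "August 2019", "May 2019", "March 2019", "February 2019"]
# Sort with the most recent stay first
newest_first = sorted(dates, key=parse_stay_date, reverse=True)
print(newest_first)
```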


In [153]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1427141-Reviews-or5-Leesons_Residences-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['J Ian Cepe', 'Maud Violain', 'Suraj', 'naLie’sTravel', 'Ershad Balosh']
['February 2019', 'January 2019', 'January 2019', 'January 2019', 'December 2018']


In [154]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1427141-Reviews-or10-Leesons_Residences-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['GrandTour28152677783', 'Christine T', 'ruby u', 'Wanderer828246', 'Nyl-ana A']
['November 2018', 'November 2018', 'October 2018', 'October 2018', 'October 2018']


In [155]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,116zannes,September 2019,I have been hosted in this excellent location ...
1,fattyboy8,August 2019,Nice rooms Staff a pit slow Lobby is very h...
2,Dion11191984,May 2019,"This place is suitable for travelers, students..."
3,Belle,March 2019,I arrived early and paid for early check-in pe...
4,Lai,February 2019,Room has enough space for solo traveller...sta...
5,J Ian Cepe,February 2019,"The room is not that great (no table, one chai..."
6,Maud Violain,January 2019,We came for a few days to Manila and decided t...
7,Suraj,January 2019,I really like the hotel. Its very convenient t...
8,naLie’sTravel,January 2019,Booked thru booking.com under zen rooms for my...
9,Ershad Balosh,December 2018,"IN the Heart of Makati, near to Robinsons Mani..."


In [156]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#combine the hotel-details dataframe and the reviews dataframe side by side
h16= pd.concat([df20,df7],axis=1)
h16

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"944 Remedios Street, Malate, Manila, Luzon 100...","Parking,Free High Speed Internet (WiFi),Wifi,F...","Superior Double Room,Deluxe Room,King Room,One...",116zannes ; September 2019 ; I have been hoste...,fattyboy8 ; August 2019 ; Nice rooms Staff a ...,Dion11191984 ; May 2019 ; This place is suitab...,Belle ; March 2019 ; I arrived early and paid ...,Lai ; February 2019 ; Room has enough space fo...,J Ian Cepe ; February 2019 ; The room is not t...,Maud Violain ; January 2019 ; We came for a fe...,Suraj ; January 2019 ; I really like the hotel...,naLie’sTravel ; January 2019 ; Booked thru boo...,Ershad Balosh ; December 2018 ; IN the Heart o...,GrandTour28152677783 ; November 2018 ; Courteo...,Christine T ; November 2018 ; its my second ti...,ruby u ; October 2018 ; the receptionist Emman...,Wanderer828246 ; October 2018 ; Rooms are supe...,Nyl-ana A ; October 2018 ; Rooms facilities ar...
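
The task asks that the reviews for a hotel come from different users, with no duplicates. A quick check along those lines is sketched below in plain Python; `find_duplicates` is a hypothetical helper and the sample names are placeholders, not scraped usernames.

```python
from collections import Counter


def find_duplicates(values):
    """Return the values that occur more than once, ignoring case and outer spaces."""
    counts = Counter(value.strip().lower() for value in values)
    return sorted(value for value, count in counts.items() if count > 1)


# Placeholder usernames for illustration only
reviewers = ["user_a", "User_B", "user_c", "user_b "]
print(find_duplicates(reviewers))
```

Running the same check on the review summaries, or on the usernames of two consecutive hotels, would also flag rows that were carried over by mistake.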


# 17. Stay Malate

In [157]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d13551807-Reviews-Stay_Malate-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (only the first matching span is needed)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    ame.append(am.text.replace('hotel amenity ',am.text.replace('_','').strip()).strip())        
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['1750 Adriatico Street corner Nakpil St. Dona Josefa Bldg., Manila, Luzon 1004 Philippines']
['Public wifi,Bar / lounge,Restaurant,Non-smoking hotel,Self-serve laundry,Air conditioning,Non-smoking rooms']


In [158]:
req2= requests.get('https://www.booking.com/hotel/ph/stay-amare-residences-malate.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Apartment']


In [159]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d13551807-Reviews-Stay_Malate-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['travelqueen', 'Star L', 'mariah shinas', 'Yvonne', 'Cheyenne H']
['March 2019', 'November 2018', 'June 2019', 'February 2019', 'December 2018']


In [160]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d13551807-Reviews-or5-Stay_Malate-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Rafal C', 'Azu G', 'Jumboaway']
['August 2018', 'July 2018', 'June 2018']
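
This page returned only three reviews, so this hotel has fewer review pages than the others. Because the notebook reuses names such as `df2`, `df3` and `df4` for every hotel, a combine step that expects three pages can silently pick up a frame left over from the previous hotel. One way to avoid that, sketched here in plain Python with hypothetical names and placeholder values, is to collect each hotel's pages in a fresh list and combine only what that list contains.

```python
def combine_pages(pages):
    """Merge per-page review dicts into one dict of equal-length lists."""
    combined = {"ReviewerName": [], "ReviewDate": [], "ReviewSummary": []}
    for number, page in enumerate(pages, start=1):
        # Every page must supply the same number of names, dates and summaries
        lengths = {len(page[key]) for key in combined}
        if len(lengths) != 1:
            raise ValueError(f"page {number} has columns of unequal length")
        for key in combined:
            combined[key].extend(page[key])
    return combined


# Placeholder values, not scraped data: one full page and one short page
hotel_pages = [
    {"ReviewerName": ["a", "b"], "ReviewDate": ["May 2019", "June 2019"],
     "ReviewSummary": ["first", "second"]},
    {"ReviewerName": ["c"], "ReviewDate": ["July 2018"],
     "ReviewSummary": ["third"]},
]
reviews = combine_pages(hotel_pages)
print(len(reviews["ReviewerName"]))
```

The length check also catches pages where one selector matched fewer elements than the others, which would otherwise make `pd.DataFrame.from_dict` raise an error.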


In [161]:
#Combine the Reviews
#Only two review pages were scraped for this hotel, so combine df2 and df3 only;
#df4 still holds the third page of the previous hotel (Leesons Residences)
df5= pd.concat([df2,df3]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,travelqueen,March 2019,My family and I stayed at this hostel a couple...
1,Star L,November 2018,For the price its one of the best hostel of ma...
2,mariah shinas,June 2019,It is indeed a great place to stay and even mi...
3,Yvonne,February 2019,This Guesthouse has roof top bar and a lounge ...
4,Cheyenne H,December 2018,I stayed here for 2 weeks while doing a course...
5,Rafal C,August 2018,This guesthouse has a central lounge area for ...
6,Azu G,July 2018,With my friend we stayed two nights in dorm. Y...
7,Jumboaway,June 2018,I'm used to stay here since 2009. Formerly cal...


In [162]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#combine the hotel-details dataframe and the reviews dataframe side by side
h17= pd.concat([df20,df7],axis=1)
h17

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8
0,1750 Adriatico Street corner Nakpil St. Dona J...,"Public wifi,Bar / lounge,Restaurant,Non-smokin...",Apartment,travelqueen ; March 2019 ; My family and I sta...,Star L ; November 2018 ; For the price its one...,mariah shinas ; June 2019 ; It is indeed a gre...,Yvonne ; February 2019 ; This Guesthouse has r...,Cheyenne H ; December 2018 ; I stayed here for...,Rafal C ; August 2018 ; This guesthouse has a ...,Azu G ; July 2018 ; With my friend we stayed t...,Jumboaway ; June 2018 ; I'm used to stay here ...


# 18. Oriental Zen Suites

In [163]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d8602181-Reviews-Oriental_Zen_Suites-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (only the first matching span is needed)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    ame.append(am.text.replace('hotel amenity ',am.text.replace('_','').strip()).strip())        
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['1545 A Mendoza St, Sampaloc, Manila, Luzon 1008 Philippines']
['Free parking,Free High Speed Internet (WiFi),Free breakfast,Bicycle rental,Children Activities (Kid / Family Friendly),Airport transportation,Business Center with Internet Access,24-hour security']


In [164]:
req2= requests.get('https://www.booking.com/hotel/ph/oriental-zen-suites.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Deluxe Room,Premier King Apartment,Standard Room,Superior Room']


In [165]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d8602181-Reviews-Oriental_Zen_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Alvin O.', 'Azraei', 'Daniella Blanco', 'Ronaldo Docot', 'wanbravo']
['September 2020', 'March 2020', 'February 2020', 'February 2020', 'December 2019']


In [166]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d8602181-Reviews-or5-Oriental_Zen_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['IslandRasta', 'All', 'GailCC4', 'Jackie C', 'Kei G']
['July 2019', 'December 2018', 'December 2017', 'December 2017', 'April 2018']


In [167]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d8602181-Reviews-or10-Oriental_Zen_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['Elvie C', 'Romeo L', 'Trek760249', 'lilly j', 'hadsbjafdasfvd']
['December 2017', 'March 2018', 'March 2018', 'March 2018', 'November 2017']


In [168]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Alvin O.,September 2020,"Excellent hotel for it's price, has all of the..."
1,Azraei,March 2020,Very clean and great value. Very recommended t...
2,Daniella Blanco,February 2020,Oriental Zen Suites is a highly Recommended s...
3,Ronaldo Docot,February 2020,"On February 21, we had our wedding preparation..."
4,wanbravo,December 2019,The room is tidy and clean and the amenities a...
5,IslandRasta,July 2019,I was welcomed by a rainy evening in Manila an...
6,All,December 2018,Nice location close to huge SM Lazaro shopping...
7,GailCC4,December 2017,My family and I visited the Philippines during...
8,Jackie C,December 2017,My family and I visited the Philippines for a ...
9,Kei G,April 2018,The room is neat and tidy. The staff are very ...


In [169]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#combine the hotel-details dataframe and the reviews dataframe side by side
h18= pd.concat([df20,df7],axis=1)
h18

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"1545 A Mendoza St, Sampaloc, Manila, Luzon 100...","Free parking,Free High Speed Internet (WiFi),F...","Deluxe Room,Premier King Apartment,Standard Ro...",Alvin O. ; September 2020 ; Excellent hotel fo...,Azraei ; March 2020 ; Very clean and great val...,Daniella Blanco ; February 2020 ; Oriental Zen...,Ronaldo Docot ; February 2020 ; On February 21...,wanbravo ; December 2019 ; The room is tidy an...,IslandRasta ; July 2019 ; I was welcomed by a ...,All ; December 2018 ; Nice location close to h...,GailCC4 ; December 2017 ; My family and I visi...,Jackie C ; December 2017 ; My family and I vis...,Kei G ; April 2018 ; The room is neat and tidy...,"Elvie C ; December 2017 ; Location is great, s...",Romeo L ; March 2018 ; The room is okay and av...,Trek760249 ; March 2018 ; This is a very nice ...,lilly j ; March 2018 ; I had a really great st...,hadsbjafdasfvd ; November 2017 ; Stayed here l...


# 19. Diamond Hotel Philippines

In [170]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d306018-Reviews-Diamond_Hotel_Philippines-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (only the first matching span is needed)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    ame.append(am.text.replace('hotel amenity ',am.text.replace('_','').strip()).strip())        
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['Roxas Boulevard cor. Dr. J. Quintos St., Manila, Luzon 1000 Philippines']
['Free parking,Free High Speed Internet (WiFi),Pool,Fitness Center with Gym / Workout Room,Free breakfast,Kids stay free,Airport transportation,Conference facilities']


In [171]:
req2= requests.get('https://www.booking.com/hotel/ph/diamond-philippines.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Diamond Club King Room - Non-Smoking,Executive Suite Non-Smoking,Deluxe King Room Non-Smoking,Deluxe Regency Non-Smoking,Premier King Room Non-smoking,Deluxe King Room - Smoking,Deluxe Twin Room Non-Smoking,Deluxe Twin Room Smoking,Deluxe Regency Smoking,Premier King Room Non-Smoking Pool View,Premier Twin Room Non-smoking,Executive Suite Smoking,Diamond Club Twin Room - Non-Smoking']


In [172]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d306018-Reviews-Diamond_Hotel_Philippines-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['adkens22', 'Anne N', 'mjbarnuevo', 'garciajonald92', 'mhondrew21']
['June 2021', 'May 2021', 'June 2021', 'May 2021', 'June 2021']


In [173]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d306018-Reviews-or5-Diamond_Hotel_Philippines-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['silvastephenbryan', 'lawrenceaureliuss', 'delrosariojovael25', 'alexanderpbantilanjr', 'bosslasaballa']
['May 2021', 'June 2021', 'May 2021', 'May 2021', 'May 2021']


In [174]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d306018-Reviews-or10-Diamond_Hotel_Philippines-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['joysellesecatin', 'mrrr2021', 'dextermendoza124', 'sunrisefiel', 'janartillaga']
['May 2021', 'May 2021', 'May 2021', 'May 2021', 'May 2021']


In [175]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,adkens22,June 2021,What I can say is that your hotel is so excel...
1,Anne N,May 2021,I would like to come back. Great stay. I had a...
2,mjbarnuevo,June 2021,One of the leading hotels in the Philippines t...
3,garciajonald92,May 2021,nice & clean hotel good behavior to there gues...
4,mhondrew21,June 2021,The staff are really accomodating and friendly...
5,silvastephenbryan,May 2021,Very good. very fast internet. delicious food....
6,lawrenceaureliuss,June 2021,I like it here it beutiful and still very good...
7,delrosariojovael25,May 2021,"Excellent experienced during stay, courteous s..."
8,alexanderpbantilanjr,May 2021,Overall stay was pleasant. The rooms were clea...
9,bosslasaballa,May 2021,Very accomodating and service is prompt. Staye...


In [176]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#combine the hotel-details dataframe and the reviews dataframe side by side
h19= pd.concat([df20,df7],axis=1)
h19

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"Roxas Boulevard cor. Dr. J. Quintos St., Manil...","Free parking,Free High Speed Internet (WiFi),P...","Diamond Club King Room - Non-Smoking,Executive...",adkens22 ; June 2021 ; What I can say is that ...,Anne N ; May 2021 ; I would like to come back....,mjbarnuevo ; June 2021 ; One of the leading ho...,garciajonald92 ; May 2021 ; nice & clean hotel...,mhondrew21 ; June 2021 ; The staff are really ...,silvastephenbryan ; May 2021 ; Very good. very...,lawrenceaureliuss ; June 2021 ; I like it here...,delrosariojovael25 ; May 2021 ; Excellent expe...,alexanderpbantilanjr ; May 2021 ; Overall stay...,bosslasaballa ; May 2021 ; Very accomodating a...,joysellesecatin ; May 2021 ; Bored staying upo...,mrrr2021 ; May 2021 ; On my arrival the securi...,dextermendoza124 ; May 2021 ; It's was perfect...,sunrisefiel ; May 2021 ; The diamond hotel for...,janartillaga ; May 2021 ; I guess there's noth...


# 20. White Knight Hotel Intramuros

In [190]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1953180-Reviews-White_Knight_Hotel_Intramuros-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (keep only the first matching span)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    ame.append(am.text.replace('hotel amenity ',am.text.replace('_','').strip()).strip())        
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['Plaza San Luis Complex General Luna Street corner Urdaneta Street Intramuros, Manila, Luzon 1002 Philippines']
['Free parking,Free High Speed Internet (WiFi),Bar / lounge,Bicycle rental,Nightclub / DJ,Babysitting,Airport transportation,Banquet room']


In [179]:
req2= requests.get('https://ph.hotels.com/ho396604/white-knight-hotel-manila-philippines')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('ul',{'class':'mK9qzN'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
#avroom= ",".join(avroom)
availroom=[]
# The room names come back run together (e.g. '...BedDeluxe...'), so insert a semicolon
# at each boundary; a comma cannot be used because the descriptions already contain commas
availroom = [item.replace('dD','d;D') for item in avroom]
availroom = [item.replace('dE','d;E') for item in availroom]
availroom = [item.replace('sS','s;S') for item in availroom]
print(availroom)

['Standard Room, 1 Queen Bed;Deluxe Room, 1 Queen Bed;Executive Room, 1 Queen Bed;Executive Twin Room, 2 Twin Beds;Suite']
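
The three chained `replace` calls above only cover the letter pairs that occur for this hotel (`dD`, `dE`, `sS`). A more general sketch, assuming the room names are always run together with a lowercase letter directly followed by an uppercase one, uses a single regular expression. The `raw` string below is reconstructed from the printed output by removing the semicolons, and the helper name `split_room_names` is hypothetical.

```python
import re


def split_room_names(raw):
    """Insert ';' wherever a lowercase letter is directly followed by an uppercase one."""
    return re.sub(r"(?<=[a-z])(?=[A-Z])", ";", raw)


# Assumed raw text, reconstructed from the output above.
raw = (
    "Standard Room, 1 Queen BedDeluxe Room, 1 Queen Bed"
    "Executive Room, 1 Queen BedExecutive Twin Room, 2 Twin BedsSuite"
)
rooms = split_room_names(raw)
print(rooms)
```

This would also split legitimate mixed-case words such as `WiFi`, so it suits room names better than amenity lists.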


In [180]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1953180-Reviews-White_Knight_Hotel_Intramuros-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Lynne N', 'Cynthia D', 'Landon014', 'Ash', 'Liam H']
['June 2021', 'December 2019', 'February 2020', 'March 2020', 'March 2020']


In [181]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1953180-Reviews-or5-White_Knight_Hotel_Intramuros-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Las Buganvillas', 'RobertoM', 'cyclingflanman', 'A.B. De Castro', 'mgnpunzalan']
['February 2020', 'January 2020', 'December 2019', 'December 2019', 'November 2019']


In [182]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1953180-Reviews-or10-White_Knight_Hotel_Intramuros-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['xanthipp', 'MaurizzioAlexandre', 'Eze E', 'Raechel Donahue', 'Daniel D']
['October 2019', 'October 2019', 'October 2019', 'October 2019', 'October 2019']
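
The three review cells above differ only in the `-or5-` and `-or10-` offsets in the URL. A small sketch of generating those URLs in a loop follows. The helper `review_page_urls` is hypothetical, and the offset pattern is inferred from the URLs used in this notebook, not from any documented API.

```python
def review_page_urls(hotel_url, pages=3, per_page=5):
    """Build the review-page URLs for one hotel (offsets 0, 5, 10, ...)."""
    marker = "-Reviews-"
    if marker not in hotel_url:
        raise ValueError("expected '-Reviews-' in the hotel URL")
    head, tail = hotel_url.split(marker, 1)
    urls = []
    for page in range(pages):
        offset = page * per_page
        # The first page has no offset segment; later pages use 'or5-', 'or10-', ...
        middle = marker if offset == 0 else f"{marker}or{offset}-"
        urls.append(f"{head}{middle}{tail}#REVIEWS")
    return urls


urls = review_page_urls(
    "https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1953180-Reviews-"
    "White_Knight_Hotel_Intramuros-Manila_Metro_Manila_Luzon.html"
)
for url in urls:
    print(url)
```

Looping over these URLs would replace the three copy-pasted cells with one and remove the need for separate `df2`, `df3` and `df4` variables.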


In [183]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Lynne N,June 2021,"Revisiting the Walled City of Intramuros, Mani..."
1,Cynthia D,December 2019,"Not only do they have dirty rooms, no water bu..."
2,Landon014,February 2020,Booked Queen Executive Room 6 months in advanc...
3,Ash,March 2020,The hotel is fortunate to have such amazing su...
4,Liam H,March 2020,After checking into hotel I was told they had ...
5,Las Buganvillas,February 2020,We went for drinks in the bar which has great ...
6,RobertoM,January 2020,Bad comunication: I send email before check in...
7,cyclingflanman,December 2019,I have stayed at this hotel many times. I love...
8,A.B. De Castro,December 2019,"Like most of the hotels within Intramuros, Whi..."
9,mgnpunzalan,November 2019,We stayed there for a photoshoot in Intramuros...


In [184]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the transposed reviews side by side
h20= pd.concat([df20,df7],axis=1)
h20

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,Plaza San Luis Complex General Luna Street cor...,"Free parking,Free High Speed Internet (WiFi),B...","Standard Room, 1 Queen Bed;Deluxe Room, 1 Quee...",Lynne N ; June 2021 ; Revisiting the Walled Ci...,Cynthia D ; December 2019 ; Not only do they h...,Landon014 ; February 2020 ; Booked Queen Execu...,Ash ; March 2020 ; The hotel is fortunate to h...,Liam H ; March 2020 ; After checking into hote...,Las Buganvillas ; February 2020 ; We went for ...,RobertoM ; January 2020 ; Bad comunication: I ...,cyclingflanman ; December 2019 ; I have stayed...,A.B. De Castro ; December 2019 ; Like most of ...,mgnpunzalan ; November 2019 ; We stayed there ...,xanthipp ; October 2019 ; We loved this quaint...,MaurizzioAlexandre ; October 2019 ; The staff ...,Eze E ; October 2019 ; I enjoyed my experience...,Raechel Donahue ; October 2019 ; In the Intram...,Daniel D ; October 2019 ; The location of this...


# 21. RedDoorz Plus @ Better Living Paranaque

In [191]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d14030677-Reviews-RedDoorz_Plus_Better_Living_Paranaque-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (keep only the first matching span)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    ame.append(am.text.replace('hotel amenity ',am.text.replace('_','').strip()).strip())        
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
# No amenities are listed on the site for this hotel, so the result is an empty string
print(amenity)

['153 Dona Soledad Avenue, Manila, Luzon 1711 Philippines']
['']


In [192]:
req2= requests.get('https://www.booking.com/hotel/ph/reddoorz-plus-better-living-paranaque.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Family Room,Double Room,Suite']


In [193]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d14030677-Reviews-RedDoorz_Plus_Better_Living_Paranaque-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['saf7670', 'AzitiZ']
['May 2019', 'June 2018']


In [194]:
#Combine the Reviews
#This hotel has only one review page, so use df2 alone;
#df3 and df4 still hold the previous hotel's reviews and must not be concatenated here
df5= df2.reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,saf7670,May 2019,"I hadn’t expected to be staying here, but my o..."
1,AzitiZ,June 2018,If you use booking sites that display this one...


In [195]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the transposed reviews side by side
h21= pd.concat([df20,df7],axis=1)
h21

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2
0,"153 Dona Soledad Avenue, Manila, Luzon 1711 Ph...",,"Family Room,Double Room,Suite",saf7670 ; May 2019 ; I hadn’t expected to be s...,AzitiZ ; June 2018 ; If you use booking sites ...
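
The names `df2`, `df3` and `df4` are reused for every hotel, so a hotel with fewer than three review pages can pick up frames left over from the previous hotel unless the combine step is adjusted. One way to rule that out, sketched here with plain lists so it runs without a network connection, is to build the review list fresh inside a function. The helper `collect_reviews` is hypothetical, and the summaries are the truncated strings shown in the outputs above.

```python
def collect_reviews(pages):
    """pages: list of (names, dates, summaries) tuples scraped for ONE hotel."""
    reviews = []  # created on every call, so nothing carries over between hotels
    for names, dates, summaries in pages:
        for name, date, summary in zip(names, dates, summaries):
            reviews.append(
                {"ReviewerName": name, "ReviewDate": date, "ReviewSummary": summary}
            )
    return reviews


# This hotel has a single review page with two reviews.
reddoorz_pages = [
    (
        ["saf7670", "AzitiZ"],
        ["May 2019", "June 2018"],
        [
            "I hadn’t expected to be staying here, but my o...",
            "If you use booking sites that display this one...",
        ],
    ),
]
reddoorz_reviews = collect_reviews(reddoorz_pages)
print(len(reddoorz_reviews), [r["ReviewerName"] for r in reddoorz_reviews])
```

The returned list of dictionaries can be passed to `pd.DataFrame(...)` to get the same three-column layout as `df5`.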


# 22. Ramada by Wyndham Manila Central

In [196]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d2322638-Reviews-Ramada_by_Wyndham_Manila_Central-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (keep only the first matching span)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    ame.append(am.text.replace('hotel amenity ',am.text.replace('_','').strip()).strip())        
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['Ongpin corner Quintin Paredes Sts Binondo, Manila, Luzon 1006 Philippines']
['Free parking,Free High Speed Internet (WiFi),Fitness Center with Gym / Workout Room,Free breakfast,Kids stay free,Highchairs available,Airport transportation,Business Center with Internet Access']
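
The task asks for the location broken down by area, while the scraped address is a single string. A rough sketch of splitting it follows, assuming every address ends with `..., <city>, <region> <4-digit postal code> <country>` as the addresses printed in this notebook do. The helper `split_address` and its field names are hypothetical. It does not recover the barangay, which appears inside the street part without a delimiter.

```python
import re

# Matches the last comma-separated part, e.g. "Luzon 1006 Philippines".
TAIL = re.compile(r"^(?P<region>.+?)\s+(?P<postal_code>\d{4})\s+(?P<country>.+)$")


def split_address(address):
    """Split 'street..., city, region 0000 country' into labelled fields."""
    parts = [p.strip() for p in address.split(",")]
    if len(parts) < 3:
        # Unexpected layout: keep the text as-is instead of guessing.
        return {"street": address.strip(), "city": None, "region": None,
                "postal_code": None, "country": None}
    match = TAIL.match(parts[-1])
    tail = match.groupdict() if match else {
        "region": parts[-1], "postal_code": None, "country": None}
    return {"street": ", ".join(parts[:-2]), "city": parts[-2], **tail}


ramada = split_address(
    "Ongpin corner Quintin Paredes Sts Binondo, Manila, Luzon 1006 Philippines"
)
print(ramada)
```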


In [198]:
req2= requests.get('https://www.booking.com/hotel/ph/ramada-central-manila.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Executive Double or Twin Room,Superior Double or Twin Room,Deluxe Double or Twin Room,One-Bedroom Suite,Special Offer - Superior Room (for Quarantine),Special Offer - Deluxe Double or Twin Room (for Quarantine),Special Offer - Executive Double or Twin Room (for Quarantine),Special Offer - One-Bedroom Suite (for Quarantine)']


In [197]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d2322638-Reviews-Ramada_by_Wyndham_Manila_Central-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Kaye Carsula', 'Trisha L', 'spilledmytrust', 'cherrydxb', 'sakubenjose']
['December 2020', 'January 2020', 'March 2020', 'January 2020', 'March 2020']


In [199]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d2322638-Reviews-or5-Ramada_by_Wyndham_Manila_Central-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['The Gypsy Club', 'brettclarkson2003', 'Tin Ley', 'D & E', 'DonnaMarie']
['March 2020', 'March 2020', 'March 2020', 'February 2020', 'February 2020']


In [200]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d2322638-Reviews-or10-Ramada_by_Wyndham_Manila_Central-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['billyukmanchester', 'REJR', 'Núria Freixes', 'Kevin', 'Heidi']
['February 2020', 'February 2020', 'February 2020', 'February 2020', 'February 2020']


In [201]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Kaye Carsula,December 2020,I booked this hotel as a requirement for isola...
1,Trisha L,January 2020,I stayed in a 2-bed room. Cons The bedsheet...
2,spilledmytrust,March 2020,We stayed in Ramada before the ECQ was impleme...
3,cherrydxb,January 2020,Got upgraded to a suite..river facing rooms. G...
4,sakubenjose,March 2020,Excellent location. Very clean hotel although...
5,The Gypsy Club,March 2020,Fantastic stay at this hotel! Right in the hea...
6,brettclarkson2003,March 2020,This place is fantastic i can recommend this a...
7,Tin Ley,March 2020,Great place to stay with friends and family. N...
8,D & E,February 2020,We didn't like the location and the old hotel ...
9,DonnaMarie,February 2020,Ramada Manila is located at the metro close to...


In [202]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the transposed reviews side by side
h22= pd.concat([df20,df7],axis=1)
h22

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"Ongpin corner Quintin Paredes Sts Binondo, Man...","Free parking,Free High Speed Internet (WiFi),F...","Executive Double or Twin Room,Superior Double ...",Kaye Carsula ; December 2020 ; I booked this h...,Trisha L ; January 2020 ; I stayed in a 2-bed ...,spilledmytrust ; March 2020 ; We stayed in Ram...,cherrydxb ; January 2020 ; Got upgraded to a s...,sakubenjose ; March 2020 ; Excellent location....,The Gypsy Club ; March 2020 ; Fantastic stay a...,brettclarkson2003 ; March 2020 ; This place is...,Tin Ley ; March 2020 ; Great place to stay wit...,D & E ; February 2020 ; We didn't like the loc...,DonnaMarie ; February 2020 ; Ramada Manila is ...,billyukmanchester ; February 2020 ; I keep on ...,REJR ; February 2020 ; The staff are very acco...,Núria Freixes ; February 2020 ; The room and s...,Kevin ; February 2020 ; My stay in the Ramada ...,Heidi ; February 2020 ; We were staying few ni...


# 23. Tropicana Suites

In [203]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1053326-Reviews-Tropicana_Suites-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (keep only the first matching span)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    ame.append(am.text.replace('hotel amenity ',am.text.replace('_','').strip()).strip())        
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['1630 Luis Maria Guerrero Street Malate, Manila, Luzon 1004 Philippines']
['Free parking,Free High Speed Internet (WiFi),Pool,Fitness Center with Gym / Workout Room,Free breakfast,Children Activities (Kid / Family Friendly),Airport transportation,Business Center with Internet Access']


In [204]:
req2= requests.get('https://www.booking.com/hotel/ph/tropicana-suites.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Studio,One-Bedroom Suite,One-Bedroom Suite,Studio,Superior Suite,Superior Suite,Two-Bedroom Suite,Two-Bedroom Suite']
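
The room list above repeats each room type, presumably because the page lists the same type more than once. A minimal sketch of removing the repeats while keeping the original order uses `dict.fromkeys`. The helper name `unique_in_order` is hypothetical, and the input list is the printed output split on commas.

```python
def unique_in_order(items):
    """Drop repeated values but keep the first occurrence of each, in order."""
    return list(dict.fromkeys(items))


rooms = [
    "Studio", "One-Bedroom Suite", "One-Bedroom Suite", "Studio",
    "Superior Suite", "Superior Suite", "Two-Bedroom Suite", "Two-Bedroom Suite",
]
deduped = ",".join(unique_in_order(rooms))
print(deduped)
```

In the notebook cell this would be applied to `avroom` before the `",".join(avroom)` step.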


In [205]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1053326-Reviews-Tropicana_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Molly', 'Rosemarie C', 'Apmzer', 'AnnMES', 'Bubba']
['February 2020', 'August 2019', 'October 2019', 'December 2018', 'September 2019']


In [206]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1053326-Reviews-or5-Tropicana_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Manutiger', 'agerges', 'Zena Marie D', 'Charysa Santos', 'Katherine']
['September 2019', 'August 2019', 'June 2019', 'June 2019', 'June 2019']


In [207]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1053326-Reviews-or10-Tropicana_Suites-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['Hillo', 'drmagsasaka', 'April H', 'Wally-Singapore', 'Peggy Giles S']
['June 2019', 'May 2019', 'March 2019', 'February 2019', 'March 2019']


In [208]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Molly,February 2020,Great hotel. Good size rooms cleaned daily and...
1,Rosemarie C,August 2019,Breakfast is not free. You have to give tips t...
2,Apmzer,October 2019,We spent two weeks in a 2 bedroom family suite...
3,AnnMES,December 2018,We booked a 2 bedroom unit. It has a living ro...
4,Bubba,September 2019,Spent 5 nights here at the Tropicana with the ...
5,Manutiger,September 2019,"Unfortunately, was unable to go due to family ..."
6,agerges,August 2019,suites consist of a living room and a bedroom ...
7,Zena Marie D,June 2019,I highly recommend staying in Tropicana Suites...
8,Charysa Santos,June 2019,We stayed for two nights at this hotel and the...
9,Katherine,June 2019,"Great service, staff was very friendly and qui..."


In [209]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the transposed reviews side by side
h23= pd.concat([df20,df7],axis=1)
h23

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"1630 Luis Maria Guerrero Street Malate, Manila...","Free parking,Free High Speed Internet (WiFi),P...","Studio,One-Bedroom Suite,One-Bedroom Suite,Stu...",Molly ; February 2020 ; Great hotel. Good size...,Rosemarie C ; August 2019 ; Breakfast is not f...,Apmzer ; October 2019 ; We spent two weeks in ...,AnnMES ; December 2018 ; We booked a 2 bedroom...,Bubba ; September 2019 ; Spent 5 nights here a...,"Manutiger ; September 2019 ; Unfortunately, wa...",agerges ; August 2019 ; suites consist of a li...,Zena Marie D ; June 2019 ; I highly recommend ...,Charysa Santos ; June 2019 ; We stayed for two...,"Katherine ; June 2019 ; Great service, staff w...",Hillo ; June 2019 ; The accommodations were ve...,drmagsasaka ; May 2019 ; The studio unit that ...,April H ; March 2019 ; Stayed two times during...,Wally-Singapore ; February 2019 ; I chose this...,Peggy Giles S ; March 2019 ; They offered to h...


# 24. Casa Blanca Apartment

In [210]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d12731306-Reviews-Casa_Blanca_Apartment-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (keep only the first matching span)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    ame.append(am.text.replace('hotel amenity ',am.text.replace('_','').strip()).strip())        
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['1447 Adriatico Street Corner Salas St., Manila, Luzon 1000 Philippines']
['Free High Speed Internet (WiFi),Salon,Convenience store,Gift shop,Non-smoking hotel,Shops,24-hour front desk,Air conditioning']


In [211]:
req2= requests.get('https://www.booking.com/hotel/ph/casa-blanca-apartment.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Standard Apartment']


In [212]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d12731306-Reviews-Casa_Blanca_Apartment-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['The50s']
['April 2018']


In [213]:
#Combine the Reviews
#This hotel has only one review, so use df2 alone;
#df3 and df4 still hold the previous hotel's reviews and must not be concatenated here
df5= df2.reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,The50s,April 2018,It has kitchenette but cutleries plates provid...


In [214]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the transposed reviews side by side
h24= pd.concat([df20,df7],axis=1)
h24

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1
0,"1447 Adriatico Street Corner Salas St., Manila...","Free High Speed Internet (WiFi),Salon,Convenie...",Standard Apartment,The50s ; April 2018 ; It has kitchenette but c...


# 25. Fersal Hotel - Manila

In [215]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1582450-Reviews-Fersal_Hotel_Manila-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (keep only the first matching span)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    ame.append(am.text.replace('hotel amenity ',am.text.replace('_','').strip()).strip())        
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['1455 A. Mendoza Street, corner Alvarez Street Sta Cruz, Manila, Luzon 0913 Philippines']
['Free parking,Free High Speed Internet (WiFi),Free breakfast,Car hire,Massage,Baggage storage,Concierge,24-hour front desk']


In [216]:
req2= requests.get('https://www.booking.com/hotel/ph/fersal-manila.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    #if avroom is  not None:
    avroom.append(av.text.strip())
        
avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Junior Suite,Deluxe Queen Room,Deluxe Double Room']


In [217]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1582450-Reviews-Fersal_Hotel_Manila-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['stivmoore', 'Jomer F', 'Leni-Nonoy L', 'maxim', 'Melvyn F']
['December 2019', 'December 2018', 'January 2019', 'February 2019', 'November 2018']


In [218]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1582450-Reviews-or5-Fersal_Hotel_Manila-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Carmie M', 'Elvie C', 'Angie B', 'stefyalbores', 'Wico P']
['October 2018', 'May 2018', 'June 2018', 'July 2017', 'May 2018']


In [219]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1582450-Reviews-or10-Fersal_Hotel_Manila-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['Rei S', 'bong2abas', 'Elvie C', 'Anurag C', 'Ezzen A']
['May 2018', 'April 2018', 'April 2018', 'January 2018', 'January 2017']
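
The three review cells above differ only in the URL: the second and third pages insert `-or5-` and `-or10-` right after `-Reviews`. As a rough sketch of that pattern (the helper name `review_page_urls` and its parameters are hypothetical, and it assumes TripAdvisor keeps paging in steps of 5), the page URLs could be generated instead of typed by hand:

```python
def review_page_urls(first_page_url, pages=3, page_size=5):
    """Return the URLs of the first `pages` review pages of one hotel."""
    marker = "-Reviews-"
    if marker not in first_page_url:
        raise ValueError("URL does not contain '-Reviews-'")
    head, tail = first_page_url.split(marker, 1)
    urls = []
    for page in range(pages):
        offset = page * page_size
        # Page 1 has no offset segment; later pages use -or5-, -or10-, ...
        segment = marker if offset == 0 else f"{marker}or{offset}-"
        urls.append(head + segment + tail)
    return urls


base = ("https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1582450-Reviews-"
        "Fersal_Hotel_Manila-Manila_Metro_Manila_Luzon.html#REVIEWS")
urls = review_page_urls(base)
for url in urls:
    print(url)
```

Each generated URL can then be passed to `requests.get` exactly as in the cells above.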


In [220]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,stivmoore,December 2019,Customer service was excellent and I had to co...
1,Jomer F,December 2018,We notify our booking that we will arrive late...
2,Leni-Nonoy L,January 2019,i was a bit disappointed with the way we were ...
3,maxim,February 2019,The restaurant in this budget hotel is a good ...
4,Melvyn F,November 2018,Ferzal Hotel Manila is always on my top list w...
5,Carmie M,October 2018,Friend recommended Fersal and we stayed for ...
6,Elvie C,May 2018,Come and check in at fersal manila promo is st...
7,Angie B,June 2018,This is our first time to stay at fersal hotel...
8,stefyalbores,July 2017,The accomodation was affordable and the facili...
9,Wico P,May 2018,Nice place to stay. Its worth the price you pa...
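
The task asks for reviews posted by different users, and the scraped pages for this hotel list `Elvie C` twice. A minimal sketch of one way to enforce that rule before the reviews are combined is shown here; the function name `unique_by_reviewer` is hypothetical, and keeping the first review per username is an assumption, not something the task specifies:

```python
def unique_by_reviewer(reviews):
    """Keep the first review from each username, preserving order."""
    seen = set()
    kept = []
    for name, date, summary in reviews:
        key = name.strip().lower()  # compare usernames case-insensitively
        if key in seen:
            continue
        seen.add(key)
        kept.append((name, date, summary))
    return kept


# Sample rows copied (truncated, as displayed) from the outputs above
reviews = [
    ("Elvie C", "May 2018", "Come and check in at fersal manila promo is st..."),
    ("Angie B", "June 2018", "This is our first time to stay at fersal hotel..."),
    ("Elvie C", "April 2018", "Before they offer 12 hr..."),
]
unique = unique_by_reviewer(reviews)
print([name for name, _, _ in unique])
```

With pandas, `df5.drop_duplicates(subset='ReviewerName')` would apply the same rule directly to the combined frame.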


In [221]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the transposed reviews side by side
h25= pd.concat([df20,df7],axis=1)
h25

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"1455 A. Mendoza Street, corner Alvarez Street ...","Free parking,Free High Speed Internet (WiFi),F...","Junior Suite,Deluxe Queen Room,Deluxe Double Room",stivmoore ; December 2019 ; Customer service w...,Jomer F ; December 2018 ; We notify our bookin...,Leni-Nonoy L ; January 2019 ; i was a bit disa...,maxim ; February 2019 ; The restaurant in this...,Melvyn F ; November 2018 ; Ferzal Hotel Manila...,Carmie M ; October 2018 ; Friend recommended ...,Elvie C ; May 2018 ; Come and check in at fers...,Angie B ; June 2018 ; This is our first time t...,stefyalbores ; July 2017 ; The accomodation wa...,Wico P ; May 2018 ; Nice place to stay. Its wo...,Rei S ; May 2018 ; The staffs are very acommod...,bong2abas ; April 2018 ; The hotel offers affo...,Elvie C ; April 2018 ; Before they offer 12 hr...,Anurag C ; January 2018 ; Hotel Fersal is loca...,Ezzen A ; January 2017 ; this hotel has a free...


# 26. Hotel Sogo Quirino, Malate

In [222]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d6500102-Reviews-Hotel_Sogo_Quirino_Malate-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (first matching element only)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    #Drop the 'hotel amenity ' label and any underscores
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
#Keep the first 8 amenities as one comma-separated string
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['2177 Madre Ignacia St. corner Quirino, Malate, Manila, Luzon Philippines']
['Free High Speed Internet (WiFi),Wifi,Babysitting,Children Activities (Kid / Family Friendly),Non-smoking hotel,Air conditioning,Room service,Family rooms']


In [225]:
req2= requests.get('https://ph.hotels.com/ho494445/hotel-sogo-quirino-manila-philippines')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('ul',{'class':'mK9qzN'}):
    avroom.append(av.text.strip())

#The room names come back run together, so insert a semicolon between them
#(not a comma, because the room descriptions already contain commas)
availroom = [item.replace('dE','d;E') for item in avroom]
print(availroom)

['Deluxe Room, 1 Queen Bed;Executive Room']
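
The `replace('dE','d;E')` step above only works when one room name ends in `d` and the next starts with `E`. A slightly more general sketch splits wherever a lowercase letter is immediately followed by an uppercase one. The helper name `split_room_names` is hypothetical, and the raw input strings are inferred from the printed outputs in this notebook rather than taken from the page itself:

```python
import re


def split_room_names(text):
    """Split concatenated room names at each lowercase-to-uppercase boundary."""
    # Zero-width split: nothing is consumed, the text is only cut at the boundary
    return re.split(r"(?<=[a-z])(?=[A-Z])", text)


rooms_sogo = split_room_names("Deluxe Room, 1 Queen BedExecutive Room")
rooms_halina = split_room_names("Standard RoomExecutive Room")
print(";".join(rooms_sogo))
print(";".join(rooms_halina))
```

The caveat is that any room name containing a camel-case word (for example `WiFi`) would also be split, so the result should still be checked by eye.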


In [226]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d6500102-Reviews-Hotel_Sogo_Quirino_Malate-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['zivino', 'kirstenj29', 'jamessC5102XP', 'Joseph J', 'Toto P']
['May 2018', 'April 2017', 'January 2017', 'August 2016', 'August 2016']


In [227]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d6500102-Reviews-or5-Hotel_Sogo_Quirino_Malate-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Wayne S', 'marvin m', 'Saichi P', 'Mackie M', 'chelseasayo']
['January 2016', 'December 2015', 'October 2015', 'August 2015', 'December 2014']


In [228]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d6500102-Reviews-or10-Hotel_Sogo_Quirino_Malate-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['quigman26', 'jaycemousr', 'JL90706', 'Nicca2014', 'BananaMango99']
['October 2014', 'October 2014', 'August 2014', 'September 2013', 'March 2014']


In [229]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,zivino,May 2018,"I spend about 12 night in malate sogo,at 18£ a..."
1,kirstenj29,April 2017,we checked in since it was the first hotel we ...
2,jamessC5102XP,January 2017,"room was great, but the back gate security gua..."
3,Joseph J,August 2016,My first experience going to hotel and it was ...
4,Toto P,August 2016,Me and may partner went there to stay over nig...
5,Wayne S,January 2016,Every time I told anyone I stayed here they la...
6,marvin m,December 2015,very clean room--air con worked--good bed--won...
7,Saichi P,October 2015,as for me this is the perfect place to unwind ...
8,Mackie M,August 2015,This line of hotels in the Philippines caters ...
9,chelseasayo,December 2014,Service ♡ Location ♡ Food ♡ Cc payment ♡ Servi...


In [230]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the transposed reviews side by side
h26= pd.concat([df20,df7],axis=1)
h26

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"2177 Madre Ignacia St. corner Quirino, Malate,...","Free High Speed Internet (WiFi),Wifi,Babysitti...","Deluxe Room, 1 Queen Bed;Executive Room",zivino ; May 2018 ; I spend about 12 night in ...,kirstenj29 ; April 2017 ; we checked in since ...,"jamessC5102XP ; January 2017 ; room was great,...",Joseph J ; August 2016 ; My first experience g...,Toto P ; August 2016 ; Me and may partner went...,Wayne S ; January 2016 ; Every time I told any...,marvin m ; December 2015 ; very clean room--ai...,Saichi P ; October 2015 ; as for me this is th...,Mackie M ; August 2015 ; This line of hotels i...,chelseasayo ; December 2014 ; Service ♡ Locati...,quigman26 ; October 2014 ; As a frequent trave...,jaycemousr ; October 2014 ; This is one of the...,JL90706 ; August 2014 ; Stayed here for 1 nigh...,Nicca2014 ; September 2013 ; My brother tell m...,BananaMango99 ; March 2014 ; I wanted somethin...
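
The pre-processing cell above turns each review into one `name ; date ; summary` string stored under a `Review N` key. The same idea can be sketched without pandas as follows. The function name `flatten_reviews` is hypothetical, and padding short lists with empty strings is an assumption made here so that uneven lists do not raise an error, as `pd.DataFrame.from_dict` would:

```python
from itertools import zip_longest


def flatten_reviews(names, dates, summaries, sep=" ; "):
    """Build {'Review 1': 'name ; date ; summary', ...} from three parallel lists."""
    row = {}
    triples = zip_longest(names, dates, summaries, fillvalue="")
    for i, parts in enumerate(triples, start=1):
        row[f"Review {i}"] = sep.join(parts)
    return row


# Sample values copied (truncated, as displayed) from the outputs above
names = ["zivino", "kirstenj29"]
dates = ["May 2018", "April 2017"]
summaries = ["I spend about 12 night in malate sogo,at 18£ a...",
             "we checked in since it was the first hotel we ..."]
row = flatten_reviews(names, dates, summaries)
for key, value in row.items():
    print(key, "->", value)
```

A dictionary built this way can be wrapped in a one-row frame with `pd.DataFrame([row])`.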


# 27. La Casarita

In [231]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d11926131-Reviews-La_Casarita-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (first matching element only)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    #Drop the 'hotel amenity ' label and any underscores
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
#Keep the first 8 amenities as one comma-separated string
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['333 San Rafael Street San Miguel District, Manila, Luzon 1005 Philippines']
['Paid private parking on-site,Parking,Wifi,Coffee shop,Children Activities (Kid / Family Friendly),Car hire,24-hour security,Non-smoking hotel']


In [232]:
req2= requests.get('https://www.booking.com/hotel/ph/la-casarita.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    avroom.append(av.text.strip())

avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Double Room,Quadruple Room,Economy Double Room,Apartment']


In [233]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d11926131-Reviews-La_Casarita-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Barin M', 'Minh Christine T', 'philip james r v']
['June 2019', 'January 2018', 'January 2017']


In [234]:
#Combine the Reviews
#This hotel has a single page of reviews, so only df2 is used;
#df3 and df4 still hold the previous hotel's reviews
df5= df2.reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Barin M,June 2019,Stayed here for 3 days and 2 nights for the 2-...
1,Minh Christine T,January 2018,Le taxi a eu beaucoup de difficultés à trouver...
2,philip james r v,January 2017,This establishment is located within the secur...


In [235]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the transposed reviews side by side
h27= pd.concat([df20,df7],axis=1)
h27

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3
0,"333 San Rafael Street San Miguel District, Man...","Paid private parking on-site,Parking,Wifi,Coff...","Double Room,Quadruple Room,Economy Double Room...",Barin M ; June 2019 ; Stayed here for 3 days a...,Minh Christine T ; January 2018 ; Le taxi a eu...,philip james r v ; January 2017 ; This establi...


# 28. Manila Manor Hotel

In [236]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1228862-Reviews-Manila_Manor_Hotel-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (first matching element only)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    #Drop the 'hotel amenity ' label and any underscores
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
#Keep the first 8 amenities as one comma-separated string
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['1660 J.Bocobo St Malate, Manila, Luzon 1004 Philippines']
['Paid public parking nearby,Free High Speed Internet (WiFi),Sauna,Coffee shop,Children Activities (Kid / Family Friendly),Conference facilities,Banquet room,Massage']


In [237]:
req2= requests.get('https://www.booking.com/hotel/ph/manila-manor.en-gb.html')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('a',{'class':'jqrt togglelink'}):
    avroom.append(av.text.strip())

avroom= ",".join(avroom)
availroom=[]
availroom.append(avroom)
print(availroom)

['Superior Double or Twin Room,Deluxe Double or Twin Room,Deluxe Suite,Room Selected at Check In']


In [238]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1228862-Reviews-Manila_Manor_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Noel Bryan C', 'Mbb404', 'Ayman', 'Jomer F', 'travelershenanigans']
['November 2019', 'August 2019', 'August 2018', 'June 2018', 'April 2018']


In [239]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1228862-Reviews-or5-Manila_Manor_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Meander64664528149', 'Алия Ф', 'Phil E', 'Froilan R', 'chrismV5917TT']
['November 2018', 'September 2018', 'September 2018', 'September 2018', 'August 2018']


In [240]:
#3rd 5 reviews
req5= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1228862-Reviews-or10-Manila_Manor_Hotel-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req5.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)
reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d4 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df4 = pd.DataFrame.from_dict(d4)

['flatheels', 'zivino', 'Romeo A', 'Phillip B', 'Luigi A']
['September 2017', 'June 2018', 'May 2018', 'February 2018', 'January 2018']


In [241]:
#Combine the Reviews
df5= pd.concat([df2,df3,df4]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Noel Bryan C,November 2019,It has the old Manila vibe on to it which I li...
1,Mbb404,August 2019,Lesson learned; this is gonna be my first and ...
2,Ayman,August 2018,"Start with Cockroach walking in my below, anot..."
3,Jomer F,June 2018,This is an old building besides of the bellagi...
4,travelershenanigans,April 2018,We stayed there for 1 night for the Script con...
5,Meander64664528149,November 2018,Despite my emphasis to book a quiet Deluxe roo...
6,Алия Ф,September 2018,"Unfortunately, my friend booked this hotel for..."
7,Phil E,September 2018,The photographs of this hotel needs to be chan...
8,Froilan R,September 2018,The worst hotel i've ever been. The only hotel...
9,chrismV5917TT,August 2018,Although the staff are really nice and the loc...


In [242]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the transposed reviews side by side
h28= pd.concat([df20,df7],axis=1)
h28

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"1660 J.Bocobo St Malate, Manila, Luzon 1004 Ph...","Paid public parking nearby,Free High Speed Int...","Superior Double or Twin Room,Deluxe Double or ...",Noel Bryan C ; November 2019 ; It has the old ...,Mbb404 ; August 2019 ; Lesson learned; this is...,Ayman ; August 2018 ; Start with Cockroach wal...,Jomer F ; June 2018 ; This is an old building ...,travelershenanigans ; April 2018 ; We stayed t...,Meander64664528149 ; November 2018 ; Despite m...,"Алия Ф ; September 2018 ; Unfortunately, my fr...",Phil E ; September 2018 ; The photographs of t...,Froilan R ; September 2018 ; The worst hotel i...,chrismV5917TT ; August 2018 ; Although the sta...,flatheels ; September 2017 ; Love the show? Th...,"zivino ; June 2018 ; Very dirty places,noise,W...",Romeo A ; May 2018 ; It's an old hotel and doe...,Phillip B ; February 2018 ; Arrived and room w...,Luigi A ; January 2018 ; We stay there para ma...


# 29. Halina Hotel Avenida

In [249]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d6012996-Reviews-Halina_Hotel_Avenida-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (first matching element only)
add = []
for ad in bsobj.findAll('div',{'class':'_1sPw_t0w _3sCS_WGO'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    #Drop the 'hotel amenity ' label and any underscores
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
#Keep the first 8 amenities as one comma-separated string
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['635 Katupusan St. cor Rizal Avenue, Sta Cruz/ Recto, Manila, Luzon Philippines']
['Free High Speed Internet (WiFi),Wifi,Restaurant,24-hour security,Baggage storage,24-hour check-in,24-hour front desk,Air conditioning']


In [251]:
req2= requests.get('https://ph.hotels.com/ho525458/halina-hotel-avenida-manila-philippines')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('ul',{'class':'mK9qzN'}):
    avroom.append(av.text.strip())

#The room names come back run together, so insert a semicolon between them
#(not a comma, because room descriptions can contain commas)
availroom = [item.replace('mE','m;E') for item in avroom]
print(availroom)

['Standard Room;Executive Room']


In [244]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d6012996-Reviews-Halina_Hotel_Avenida-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['ccompaneros', 'Jewel_of_the_Nile10']
['January 2017', 'November 2015']


In [252]:
#Combine the Reviews
#This hotel has a single page of reviews, so only df2 is used;
#df3 and df4 still hold the previous hotel's reviews
df5= df2.reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,ccompaneros,January 2017,"The staff were friendly and helpful, the locat..."
1,Jewel_of_the_Nile10,November 2015,Stayed here for a couple of nights on a weekda...


In [253]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the transposed reviews side by side
h29= pd.concat([df20,df7],axis=1)
h29

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2
0,"635 Katupusan St. cor Rizal Avenue, Sta Cruz/ ...","Free High Speed Internet (WiFi),Wifi,Restauran...",Standard Room;Executive Room,ccompaneros ; January 2017 ; The staff were fr...,Jewel_of_the_Nile10 ; November 2015 ; Stayed h...


# 30. Hotel Sogo - Sta Mesa

In [254]:
req1= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1976029-Reviews-Hotel_Sogo_Sta_Mesa-Manila_Metro_Manila_Luzon.html')
bsobj = soup(req1.content,'lxml')
#Get address (first matching element only)
add = []
for ad in bsobj.findAll('span',{'class':'_3ErVArsu jke2_wbp'}):
    add.append(ad.text.strip())
    break
print(add)
#Get amenities
ame = []
for am in bsobj.findAll('div',{'class':'_2rdvbNSg'}):
    #Drop the 'hotel amenity ' label and any underscores
    ame.append(am.text.replace('hotel amenity ','').replace('_','').strip())
#Keep the first 8 amenities as one comma-separated string
ame= ame[0:8]
ame= ",".join(ame)

amenity=[]
amenity.append(ame)
print(amenity)

['4166 R. Magsaysay Boulevard, Manila, Luzon 1016 Philippines']
['Valet parking,Free High Speed Internet (WiFi),Coffee shop,Massage,Concierge,Non-smoking hotel,24-hour front desk,Express check-in / check-out']


In [257]:
req2= requests.get('https://ph.hotels.com/ho493455/hotel-sogo-stamesa-manila-philippines')
bsobj = soup(req2.content,'lxml')
#Get available rooms
avroom=[]
for av in bsobj.findAll('ul',{'class':'mK9qzN'}):
    avroom.append(av.text.strip())

#The room names come back run together, so insert a semicolon between them
#(not a comma, because the room descriptions already contain commas)
availroom = [item.replace('dE','d;E') for item in avroom]
print(availroom)

['Deluxe Room, 1 Queen Bed;Executive Room, 1 Queen Bed']


In [255]:
#1st 5 reviews
req3= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1976029-Reviews-Hotel_Sogo_Sta_Mesa-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req3.content,'lxml')
#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
    
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d2 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df2 = pd.DataFrame.from_dict(d2)

['Agnes C', 'Rex Ebuenga', 'Clarxxx', 'Joshua D', 'rjoaquin1']
['June 2018', 'April 2018', 'December 2016', 'June 2016', 'June 2016']


In [258]:
#2nd 5 reviews
req4= requests.get('https://www.tripadvisor.com.ph/Hotel_Review-g298573-d1976029-Reviews-or5-Hotel_Sogo_Sta_Mesa-Manila_Metro_Manila_Luzon.html#REVIEWS')
bsobj = soup(req4.content,'lxml')

#Get reviews per username
reviewname=[]
for rn in bsobj.findAll('a',{'class':'ui_header_link _1r_My98y'}):
    reviewname.append(rn.text.strip())
print(reviewname)

reviewdate=[]
for rd in bsobj.findAll('span',{'class':'_34Xs-BQm'}):
    reviewdate.append(rd.text.replace('Date of stay: ','').strip())
print(reviewdate)

reviewsummary=[]
for rs in bsobj.findAll('q',{'class':'IRsGHoPm'}):
    reviewsummary.append(rs.text.strip())
    
d3 = {'ReviewerName':reviewname,'ReviewDate':reviewdate,'ReviewSummary':reviewsummary}
df3 = pd.DataFrame.from_dict(d3)

['Zhongjian', 'R0sstheB0ss', 'HUGH B']
['October 2015', 'December 2012', 'June 2012']


In [259]:
#Combine the Reviews
#This hotel has two pages of reviews, so only df2 and df3 are used;
#df4 still holds an earlier hotel's reviews
df5= pd.concat([df2,df3]).reset_index(drop=True)
df5

Unnamed: 0,ReviewerName,ReviewDate,ReviewSummary
0,Agnes C,June 2018,We checked into this budget hotel because we w...
1,Rex Ebuenga,April 2018,"I live in the area, however, my house is real..."
2,Clarxxx,December 2016,Don!t ever think to go to this very bad hotel....
3,Joshua D,June 2016,this hotel is very clean and the staff are unf...
4,rjoaquin1,June 2016,We booked a room at Hotel Sogo - Sta Mesa and ...
5,Zhongjian,October 2015,Located near Shoe Mart Centerpoint and the V. ...
6,R0sstheB0ss,December 2012,This hotel appears to operate as a short time ...
7,HUGH B,June 2012,I checked in to the hotel on the 5Th of June 2...


In [260]:
#Pre-processing the data
#Concatenate values of reviews
df5['Full Review']=df5['ReviewerName']+' ; '+df5['ReviewDate']+' ; '+df5['ReviewSummary']
df6= df5['Full Review']

#Convert to Dictionary e.g. Review 1
res_dct= {'Review '+str(i+1): df6[i] for i in range(0, len(df6), 1)}

#transpose data frame reviews
df7 = pd.DataFrame.from_dict(res_dct,orient="index").T

#concatenate address, amenities, availablerooms
conc={'Address':add,'Amenities':amenity,'AvailableRooms':availroom}
df20 = pd.DataFrame.from_dict(conc)

#Combine the hotel details and the transposed reviews side by side
h30= pd.concat([df20,df7],axis=1)
h30

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8
0,"4166 R. Magsaysay Boulevard, Manila, Luzon 101...","Valet parking,Free High Speed Internet (WiFi),...","Deluxe Room, 1 Queen Bed;Executive Room, 1 Que...",Agnes C ; June 2018 ; We checked into this bud...,"Rex Ebuenga ; April 2018 ; I live in the area,...",Clarxxx ; December 2016 ; Don!t ever think to ...,Joshua D ; June 2016 ; this hotel is very clea...,rjoaquin1 ; June 2016 ; We booked a room at Ho...,Zhongjian ; October 2015 ; Located near Shoe M...,R0sstheB0ss ; December 2012 ; This hotel appea...,HUGH B ; June 2012 ; I checked in to the hotel...


## Concatenate the data of each hotel

In [261]:
#Combine the data frames of all 30 hotels (h1 to h30) into one and reset the index
hoteldata= pd.concat([h1,h2,h3,h4,h5,h6,h7,h8,h9,h10,h11,h12,h13,h14,h15,h16,h17,h18,h19,h20,h21,h22,h23,h24,h25,h26,h27,h28,h29,h30]).reset_index(drop=True)
hoteldata

Unnamed: 0,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,Review 5,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,"1775 Interior M. Adriatico Brgy. 699, Zone 076...","Paid public parking nearby,Free High Speed Int...","Two Bedroom Suite,Deluxe Room,Standard Room,Pr...",Joel D ; June 2021 ; Visited this place twice....,Cyrene A ; June 2021 ; Love the place. The roo...,Nicole ; May 2021 ; Definitely a hidden gem in...,Rae Shin ; June 2021 ; The place was very comf...,Marnella Bianca M ; April 2021 ; Really apprec...,"ROMALYN ; May 2021 ; Hi , thank you. We really...",Lhyean De Guzman ; May 2021 ; Overall stay was...,Marianne V ; May 2021 ; Overall the place is n...,Marc Eivan S ; April 2021 ; Stayed for 3 days ...,Sophia Stephanie Mosende ; May 2021 ; This pla...,April P ; April 2021 ; This is my second time ...,Stacy ; April 2021 ; Booking process is so eas...,Nicole ; March 2021 ; We're glad we found 1775...,Kayen ; February 2021 ; It was our 2nd time st...,Ces Dupitas ; February 2021 ; My partner and I...
1,"1158 A. Mabini Street Ermita, Manila, Luzon 10...","Free parking,Free High Speed Internet (WiFi),F...","Deluxe Twin Room,Junior Suite,Superior Double ...",travelgamerhub ; August 2020 ; Stayed for 3 ni...,Hector Periquin ; October 2019 ; The hotel is ...,Wei A ; January 2020 ; Stayed 5 nights 6 days ...,Judit R ; December 2019 ; I requested a non-sm...,MarieGrace ; December 2019 ; I have tried stay...,Joven O ; December 2019 ; Family of three for ...,581markjf ; September 2019 ; City garden Suite...,Stay425892 ; September 2019 ; We stayed over t...,Julius M ; September 2019 ; We booked for a bu...,Jamil A ; August 2019 ; As a hotelier i know t...,Joel Castro ; July 2019 ; We had an incredible...,Encomiendero ; July 2019 ; Good quality custom...,Hopiah ; July 2019 ; The hotel is ver near to ...,MORE NA ; June 2019 ; We booked 2 rooms (2 sep...,JoMC ; May 2019 ; We had our reservation about...
2,1622 Birch Tower Condominium Jorge Bocobo Stre...,"Free High Speed Internet (WiFi),Pool,Fitness C...","Deluxe Studio,Executive Twin Studio,One-Bedroo...",Gnej G ; December 2020 ; Super ganda ng ambian...,Rowrence ; November 2020 ; The hotel stay was ...,Ronald L ; August 2020 ; Excellent place a gr...,FrequentFlier715362 ; February 2020 ; It is cl...,Clarita Gamba-Macandili ; December 2019 ; One ...,Dave_in_Barrie_11 ; November 2019 ; I'd give t...,rmanay346185 ; November 2019 ; The hotel met m...,"nganch551542 ; November 2019 ; Great location,...",fletch531 ; November 2019 ; Rooms are generall...,saf7670 ; October 2019 ; I have often returned...,Nasser K ; September 2019 ; Dang came to airpo...,Monu K ; August 2019 ; Incredible value for mo...,J Chua ; July 2019 ; Its really nice to feel t...,"Johnz ; June 2019 ; Strategic location, the ho...",saf7670 ; May 2019 ; I arrived in Manila after...
3,"618 Pedro Gil Street Ermita, Manila, Luzon 100...","Free parking,Free High Speed Internet (WiFi),F...","Standard Queen Room,Euro Suite 1,Euro Suite 2,...",oliveoilvint ; March 2021 ; I don't recommande...,Cj118 ; January 2021 ; The location is very co...,R. Escala ; October 2020 ; Stayed for 3days/2n...,Las Buganvillas ; April 2020 ; I stayed in the...,theroyalputri ; February 2020 ; Stayed here fo...,John Ray L ; January 2020 ; I chose to stay he...,Peter Paul Duran ; October 2019 ; Room smelled...,MattandGene ; August 2019 ; Shopping mall acro...,fattyboy8 ; July 2019 ; I stayed here due to t...,eesaahg ; July 2019 ; Booked this hotel due to...,bimbo m ; May 2019 ; I consider Eurotel at Ped...,jean ; July 2018 ; just opposite robinsons man...,Persioux Crowley ; May 2019 ; Worst is their a...,j n ; January 2019 ; I'm not picky when it com...,joylovesben ; January 2019 ; The place is aver...
4,"1630 A. Mabini Street Malate, Manila, Luzon 10...","Parking,Free High Speed Internet (WiFi),Bar / ...","Junior Suite,Basic Twin Room,Basic Double Room",Vangiedazo ; February 2020 ; Ive been here for...,Mr. E ; December 2019 ; Room is clean and the ...,Qas419 ; September 2019 ; The hotel is located...,Andy S ; September 2019 ; I had a pleasant sta...,Amer A ; July 2019 ; Very nice hotels and good...,Curtis C ; June 2019 ; I recently stayed at th...,Geminidreams ; April 2019 ; Spent 5 days there...,davey671 ; April 2019 ; Booked a room for 4 ni...,Amer A ; March 2019 ; Firstibul choosing an ho...,Traveler89052 ; December 2018 ; The hotel is n...,schlesiw ; April 2018 ; We spent a few nights ...,Mark B ; February 2018 ; The rooms are well fu...,"Steven P ; December 2017 ; Nice stay, very lar...",paulrodriquez ; March 2017 ; I did enjoy stayi...,Bruno A ; December 2017 ; I have been several ...
5,"Florentino Torres Osmena Highway, Manila, Luzo...","Free parking,Free High Speed Internet (WiFi),F...","Superior Double or Twin Room,Deluxe Queen Room...",Christopher W ; January 2020 ; This hotel has ...,"James R ; January 2020 ; Great welcome, fantas...",Timallalone ; January 2020 ; Really enjoyed a ...,SerainaNonym ; January 2020 ; It’s the best ho...,AngelaSept ; December 2019 ; Fantastic hotel f...,Manoj Kumar ; December 2019 ; Nice concept of ...,mnltabi ; October 2019 ; My cousins and I stay...,So Sardan ; August 2019 ; I like this place. T...,Saan aabot? Eatstraveltime ; August 2019 ; Ind...,maricel c. ; June 2019 ; the place is easy to ...,Charlie ; July 2019 ; Decor was fantastic (con...,Megan Johnston ; July 2019 ; This hotel was so...,kateeeforkaterina ; July 2019 ; We booked a su...,SpeedyG-7 ; June 2019 ; Stayed here in mid Jun...,"Tadz R ; June 2019 ; The room is very clean, s..."
6,"MJC Drive San Lazaro Tourism Business Park, Ma...","Free parking,Free High Speed Internet (WiFi),P...","Deluxe Double Room,Executive Suite,Deluxe King...",DGAnon ; January 2020 ; The overall experience...,Michael M ; December 2019 ; First the restraun...,jof69 ; December 2019 ; My review is specifica...,General_koh ; November 2019 ; A decent hotel w...,"Ads P ; October 2019 ; As expected, I really h...",Lemontea291 ; December 2018 ; After 8 hours de...,Jason A ; October 2019 ; Feedback: Hotel roo...,Cynthia Bryth ; August 2019 ; The have the wor...,Diane B ; August 2018 ; Booked for 3 nights at...,MikaSF627 ; June 2019 ; We booked a Deluxe Kin...,Gayleesi ; June 2019 ; Considering we live pra...,Vanity Helina Low ; May 2019 ; Perfect locatio...,Joseph B ; May 2019 ; We spent Mother's Day Lu...,Judilynn N. Solidum ; May 2019 ; I have been t...,"Mike V ; March 2019 ; nice rooms, super cheap ..."
7,"Robinsons Otis, 1536 Paz Guazon St. Paco, Mani...","Paid private parking on-site,Free High Speed I...","Twin Room,Double Room,Hotel Care Package - Sta...",Katherine A ; February 2021 ; No Windows ( in ...,BonyF ; April 2021 ; Please keep out of this h...,Tharyldia Shane ; March 2021 ; Well I booked G...,Ann C ; February 2021 ; Stayed there for 7 day...,Nikki Baysa ; February 2021 ; My parents went ...,nvilla ; December 2019 ; I have checked-in at ...,MariaEllenaP ; November 2019 ; The hotel was o...,"Vanessa A ; November 2019 ; A good hotel, exce...",pawiks1925 ; November 2019 ; Staff including g...,timmymariano1626 ; October 2019 ; We checked i...,1512mhdelPilar ; September 2019 ; This hotel s...,Marty Hansen ; September 2019 ; My 2nd stay in...,Sherpa25882412712 ; August 2019 ; Location is ...,JPRoblems ; August 2019 ; Hotel is located nea...,Gessa Mae L. ; August 2019 ; Reservation was q...
8,"561 Julian Nakpil, Manila, Luzon 1004 Philippines","Free parking,Free High Speed Internet (WiFi),B...","Standard Room, 1 Queen Bed;Standard Room, 2 Tw...",Nikko ; February 2021 ; Adriatico Arms is a wo...,Crazy Eagle ; November 2018 ; This is a budget...,test ; July 2019 ; Every time we pass trough M...,MissZ ; April 2019 ; Place is descent and affo...,BAI ; February 2019 ; The hotel is just walkin...,Crazy Eagle ; December 2018 ; This is a budget...,kevinbirch49 ; November 2018 ; Location fantas...,rowena L ; June 2018 ; As have said before. Fa...,"Laarni A ; October 2018 ; Location wise, good....",Kristal K ; September 2018 ; This was my first...,Sarah D ; August 2016 ; Despite for it to be n...,rowena L ; November 2016 ; I have been using t...,Nathalie B ; November 2016 ; We picked this pl...,"EbebPL ; November 2016 ; Good prices, helpful ...",Keith R ; August 2016 ; I spent a week at Adri...
9,Jorge Bocobo Street Along Pedro Gil Street opp...,"Secured parking,Free High Speed Internet (WiFi...","Executive King Suite,One-Bedroom Apartment,Thr...",Mark B ; June 2020 ; Booked to stay here via A...,Paul H ; February 2020 ; We found JMM Grand Su...,LakePanorama ; December 2019 ; JMM Grand Suite...,Resort589258 ; June 2019 ; The location is ver...,Myers T ; June 2019 ; More frequently this has...,Jeff G ; March 2019 ; Rented a 1 bedroom apt. ...,saf7670 ; May 2019 ; I spent one night at the ...,Nor Wayne S ; January 2019 ; We booked here in...,Angeline Rose ; January 2019 ; Im so disapoint...,gwattya ; January 2019 ; Was allocated a 31st ...,vicpk ; January 2019 ; I have stayed here nume...,marley2828 ; December 2018 ; My fiance and I h...,Zevious ; December 2018 ; We’ve stayed here be...,andrewcorica ; August 2018 ; Do I recommend th...,AkiChanTells ; August 2017 ; JMM Grand Suites ...


In [262]:
#Save an intermediate copy in case the site changes
#A raw string (r'...') keeps the backslashes in the Windows path literal
hoteldata.to_csv(r'D:\001 UPSKILLING, ARAL MODES,ETC\Data Science Training\SPARTA\Module 14 Python for Data Engineering\Final Capstone Project\hotel1.csv', index= False)

In [265]:
#Read the hotel names, ratings, and prices saved in the first part
fhotel= pd.read_csv(r'D:\001 UPSKILLING, ARAL MODES,ETC\Data Science Training\SPARTA\Module 14 Python for Data Engineering\Final Capstone Project\df4.csv')
fhotel

Unnamed: 0,Hotel,Ratings,Price
0,1775 Adriatico Suites,4.5 of 5 bubbles,1833
1,City Garden Suites,4 of 5 bubbles,2394
2,Regency Grand Suites,4 of 5 bubbles,1902
3,Eurotel Pedro Gil,3 of 5 bubbles,1632
4,Executive Hotel,3.5 of 5 bubbles,"4,526 3,369"
5,Heroes Hotel,5 of 5 bubbles,3220
6,Winford Manila Resort & Casino,3.5 of 5 bubbles,7091
7,Go Hotels Otis-Manila,4 of 5 bubbles,2846
8,Adriatico Arms Hotel,4 of 5 bubbles,"2,784 2,443"
9,JMM Grand Suites,3 of 5 bubbles,1656


In [268]:
finalhotel= pd.concat([fhotel,hoteldata],axis=1)

In [269]:
finalhotel

Unnamed: 0,Hotel,Ratings,Price,Address,Amenities,AvailableRooms,Review 1,Review 2,Review 3,Review 4,...,Review 6,Review 7,Review 8,Review 9,Review 10,Review 11,Review 12,Review 13,Review 14,Review 15
0,1775 Adriatico Suites,4.5 of 5 bubbles,1833,"1775 Interior M. Adriatico Brgy. 699, Zone 076...","Paid public parking nearby,Free High Speed Int...","Two Bedroom Suite,Deluxe Room,Standard Room,Pr...",Joel D ; June 2021 ; Visited this place twice....,Cyrene A ; June 2021 ; Love the place. The roo...,Nicole ; May 2021 ; Definitely a hidden gem in...,Rae Shin ; June 2021 ; The place was very comf...,...,"ROMALYN ; May 2021 ; Hi , thank you. We really...",Lhyean De Guzman ; May 2021 ; Overall stay was...,Marianne V ; May 2021 ; Overall the place is n...,Marc Eivan S ; April 2021 ; Stayed for 3 days ...,Sophia Stephanie Mosende ; May 2021 ; This pla...,April P ; April 2021 ; This is my second time ...,Stacy ; April 2021 ; Booking process is so eas...,Nicole ; March 2021 ; We're glad we found 1775...,Kayen ; February 2021 ; It was our 2nd time st...,Ces Dupitas ; February 2021 ; My partner and I...
1,City Garden Suites,4 of 5 bubbles,2394,"1158 A. Mabini Street Ermita, Manila, Luzon 10...","Free parking,Free High Speed Internet (WiFi),F...","Deluxe Twin Room,Junior Suite,Superior Double ...",travelgamerhub ; August 2020 ; Stayed for 3 ni...,Hector Periquin ; October 2019 ; The hotel is ...,Wei A ; January 2020 ; Stayed 5 nights 6 days ...,Judit R ; December 2019 ; I requested a non-sm...,...,Joven O ; December 2019 ; Family of three for ...,581markjf ; September 2019 ; City garden Suite...,Stay425892 ; September 2019 ; We stayed over t...,Julius M ; September 2019 ; We booked for a bu...,Jamil A ; August 2019 ; As a hotelier i know t...,Joel Castro ; July 2019 ; We had an incredible...,Encomiendero ; July 2019 ; Good quality custom...,Hopiah ; July 2019 ; The hotel is ver near to ...,MORE NA ; June 2019 ; We booked 2 rooms (2 sep...,JoMC ; May 2019 ; We had our reservation about...
2,Regency Grand Suites,4 of 5 bubbles,1902,1622 Birch Tower Condominium Jorge Bocobo Stre...,"Free High Speed Internet (WiFi),Pool,Fitness C...","Deluxe Studio,Executive Twin Studio,One-Bedroo...",Gnej G ; December 2020 ; Super ganda ng ambian...,Rowrence ; November 2020 ; The hotel stay was ...,Ronald L ; August 2020 ; Excellent place a gr...,FrequentFlier715362 ; February 2020 ; It is cl...,...,Dave_in_Barrie_11 ; November 2019 ; I'd give t...,rmanay346185 ; November 2019 ; The hotel met m...,"nganch551542 ; November 2019 ; Great location,...",fletch531 ; November 2019 ; Rooms are generall...,saf7670 ; October 2019 ; I have often returned...,Nasser K ; September 2019 ; Dang came to airpo...,Monu K ; August 2019 ; Incredible value for mo...,J Chua ; July 2019 ; Its really nice to feel t...,"Johnz ; June 2019 ; Strategic location, the ho...",saf7670 ; May 2019 ; I arrived in Manila after...
3,Eurotel Pedro Gil,3 of 5 bubbles,1632,"618 Pedro Gil Street Ermita, Manila, Luzon 100...","Free parking,Free High Speed Internet (WiFi),F...","Standard Queen Room,Euro Suite 1,Euro Suite 2,...",oliveoilvint ; March 2021 ; I don't recommande...,Cj118 ; January 2021 ; The location is very co...,R. Escala ; October 2020 ; Stayed for 3days/2n...,Las Buganvillas ; April 2020 ; I stayed in the...,...,John Ray L ; January 2020 ; I chose to stay he...,Peter Paul Duran ; October 2019 ; Room smelled...,MattandGene ; August 2019 ; Shopping mall acro...,fattyboy8 ; July 2019 ; I stayed here due to t...,eesaahg ; July 2019 ; Booked this hotel due to...,bimbo m ; May 2019 ; I consider Eurotel at Ped...,jean ; July 2018 ; just opposite robinsons man...,Persioux Crowley ; May 2019 ; Worst is their a...,j n ; January 2019 ; I'm not picky when it com...,joylovesben ; January 2019 ; The place is aver...
4,Executive Hotel,3.5 of 5 bubbles,"4,526 3,369","1630 A. Mabini Street Malate, Manila, Luzon 10...","Parking,Free High Speed Internet (WiFi),Bar / ...","Junior Suite,Basic Twin Room,Basic Double Room",Vangiedazo ; February 2020 ; Ive been here for...,Mr. E ; December 2019 ; Room is clean and the ...,Qas419 ; September 2019 ; The hotel is located...,Andy S ; September 2019 ; I had a pleasant sta...,...,Curtis C ; June 2019 ; I recently stayed at th...,Geminidreams ; April 2019 ; Spent 5 days there...,davey671 ; April 2019 ; Booked a room for 4 ni...,Amer A ; March 2019 ; Firstibul choosing an ho...,Traveler89052 ; December 2018 ; The hotel is n...,schlesiw ; April 2018 ; We spent a few nights ...,Mark B ; February 2018 ; The rooms are well fu...,"Steven P ; December 2017 ; Nice stay, very lar...",paulrodriquez ; March 2017 ; I did enjoy stayi...,Bruno A ; December 2017 ; I have been several ...
5,Heroes Hotel,5 of 5 bubbles,3220,"Florentino Torres Osmena Highway, Manila, Luzo...","Free parking,Free High Speed Internet (WiFi),F...","Superior Double or Twin Room,Deluxe Queen Room...",Christopher W ; January 2020 ; This hotel has ...,"James R ; January 2020 ; Great welcome, fantas...",Timallalone ; January 2020 ; Really enjoyed a ...,SerainaNonym ; January 2020 ; It’s the best ho...,...,Manoj Kumar ; December 2019 ; Nice concept of ...,mnltabi ; October 2019 ; My cousins and I stay...,So Sardan ; August 2019 ; I like this place. T...,Saan aabot? Eatstraveltime ; August 2019 ; Ind...,maricel c. ; June 2019 ; the place is easy to ...,Charlie ; July 2019 ; Decor was fantastic (con...,Megan Johnston ; July 2019 ; This hotel was so...,kateeeforkaterina ; July 2019 ; We booked a su...,SpeedyG-7 ; June 2019 ; Stayed here in mid Jun...,"Tadz R ; June 2019 ; The room is very clean, s..."
6,Winford Manila Resort & Casino,3.5 of 5 bubbles,7091,"MJC Drive San Lazaro Tourism Business Park, Ma...","Free parking,Free High Speed Internet (WiFi),P...","Deluxe Double Room,Executive Suite,Deluxe King...",DGAnon ; January 2020 ; The overall experience...,Michael M ; December 2019 ; First the restraun...,jof69 ; December 2019 ; My review is specifica...,General_koh ; November 2019 ; A decent hotel w...,...,Lemontea291 ; December 2018 ; After 8 hours de...,Jason A ; October 2019 ; Feedback: Hotel roo...,Cynthia Bryth ; August 2019 ; The have the wor...,Diane B ; August 2018 ; Booked for 3 nights at...,MikaSF627 ; June 2019 ; We booked a Deluxe Kin...,Gayleesi ; June 2019 ; Considering we live pra...,Vanity Helina Low ; May 2019 ; Perfect locatio...,Joseph B ; May 2019 ; We spent Mother's Day Lu...,Judilynn N. Solidum ; May 2019 ; I have been t...,"Mike V ; March 2019 ; nice rooms, super cheap ..."
7,Go Hotels Otis-Manila,4 of 5 bubbles,2846,"Robinsons Otis, 1536 Paz Guazon St. Paco, Mani...","Paid private parking on-site,Free High Speed I...","Twin Room,Double Room,Hotel Care Package - Sta...",Katherine A ; February 2021 ; No Windows ( in ...,BonyF ; April 2021 ; Please keep out of this h...,Tharyldia Shane ; March 2021 ; Well I booked G...,Ann C ; February 2021 ; Stayed there for 7 day...,...,nvilla ; December 2019 ; I have checked-in at ...,MariaEllenaP ; November 2019 ; The hotel was o...,"Vanessa A ; November 2019 ; A good hotel, exce...",pawiks1925 ; November 2019 ; Staff including g...,timmymariano1626 ; October 2019 ; We checked i...,1512mhdelPilar ; September 2019 ; This hotel s...,Marty Hansen ; September 2019 ; My 2nd stay in...,Sherpa25882412712 ; August 2019 ; Location is ...,JPRoblems ; August 2019 ; Hotel is located nea...,Gessa Mae L. ; August 2019 ; Reservation was q...
8,Adriatico Arms Hotel,4 of 5 bubbles,"2,784 2,443","561 Julian Nakpil, Manila, Luzon 1004 Philippines","Free parking,Free High Speed Internet (WiFi),B...","Standard Room, 1 Queen Bed;Standard Room, 2 Tw...",Nikko ; February 2021 ; Adriatico Arms is a wo...,Crazy Eagle ; November 2018 ; This is a budget...,test ; July 2019 ; Every time we pass trough M...,MissZ ; April 2019 ; Place is descent and affo...,...,Crazy Eagle ; December 2018 ; This is a budget...,kevinbirch49 ; November 2018 ; Location fantas...,rowena L ; June 2018 ; As have said before. Fa...,"Laarni A ; October 2018 ; Location wise, good....",Kristal K ; September 2018 ; This was my first...,Sarah D ; August 2016 ; Despite for it to be n...,rowena L ; November 2016 ; I have been using t...,Nathalie B ; November 2016 ; We picked this pl...,"EbebPL ; November 2016 ; Good prices, helpful ...",Keith R ; August 2016 ; I spent a week at Adri...
9,JMM Grand Suites,3 of 5 bubbles,1656,Jorge Bocobo Street Along Pedro Gil Street opp...,"Secured parking,Free High Speed Internet (WiFi...","Executive King Suite,One-Bedroom Apartment,Thr...",Mark B ; June 2020 ; Booked to stay here via A...,Paul H ; February 2020 ; We found JMM Grand Su...,LakePanorama ; December 2019 ; JMM Grand Suite...,Resort589258 ; June 2019 ; The location is ver...,...,Jeff G ; March 2019 ; Rented a 1 bedroom apt. ...,saf7670 ; May 2019 ; I spent one night at the ...,Nor Wayne S ; January 2019 ; We booked here in...,Angeline Rose ; January 2019 ; Im so disapoint...,gwattya ; January 2019 ; Was allocated a 31st ...,vicpk ; January 2019 ; I have stayed here nume...,marley2828 ; December 2018 ; My fiance and I h...,Zevious ; December 2018 ; We’ve stayed here be...,andrewcorica ; August 2018 ; Do I recommend th...,AkiChanTells ; August 2017 ; JMM Grand Suites ...
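Some values in the Price column above contain two numbers with thousands separators (for example "4,526 3,369"), so the column cannot be treated as numeric as it is. The notebook does not say what the two numbers mean, so the sketch below only extracts every number and keeps the lowest and highest. The helper `parse_prices` and the columns `PriceMin` and `PriceMax` are hypothetical names, and the sample rows are copied from the table above.

```python
import pandas as pd

def parse_prices(value):
    """Return every whole number found in a raw price string as a list of ints."""
    # Remove thousands separators, then split on whitespace.
    tokens = str(value).replace(',', '').split()
    return [int(t) for t in tokens if t.isdigit()]

# Sample rows copied from the table above.
sample = pd.DataFrame({
    'Hotel': ['1775 Adriatico Suites', 'Executive Hotel', 'Adriatico Arms Hotel'],
    'Price': ['1833', '4,526 3,369', '2,784 2,443'],
})

parsed = sample['Price'].apply(parse_prices)
# Keep the lowest and highest number; leave the cell empty if nothing was parsed.
sample['PriceMin'] = parsed.apply(lambda p: min(p) if p else None)
sample['PriceMax'] = parsed.apply(lambda p: max(p) if p else None)
print(sample)
```

Keeping both values avoids guessing which of the two numbers is the price a tourist would actually pay.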


In [270]:
#Save the combined data as the final output
finalhotel.to_csv(r'D:\001 UPSKILLING, ARAL MODES,ETC\Data Science Training\SPARTA\Module 14 Python for Data Engineering\Final Capstone Project\Hotel.csv', index= False)

#### Notes:
The reviews are under the columns named Review (review number), e.g. Review 1.

Each review consists of the username, the date, and the review text.

The reviews might be lengthy: comments on a specific attribute sometimes appear only in the latter part, so I decided to take everything. The reviews could also be used for sentiment analysis, but that would be a very different module xD

Some hotels did not meet the minimum requirement of 15 reviews because they have too few reviews.

The number of hotels scraped is 30, which is more than the requirement of 25.
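The task asks for at least 15 reviews per hotel, each posted by a different user, and the final table shows the same display name twice in some rows (for example "Nicole" in row 0 and "saf7670" in row 2). A quick check like the one below could flag such rows. This is a minimal sketch, not part of the original notebook: the function name `review_stats` and the sample data are made up, and it assumes every review cell follows the "username ; date ; review" layout described in these notes.

```python
import pandas as pd

def review_stats(df):
    """Count the reviews and the repeated reviewer names in each hotel row."""
    review_cols = [c for c in df.columns if c.startswith('Review ')]
    records = []
    for _, row in df[review_cols].iterrows():
        # Missing reviews are NaN/None, so keep only the string cells.
        texts = [t for t in row if isinstance(t, str)]
        # The username is the text before the first ' ; ' separator.
        names = [t.split(' ; ')[0] for t in texts]
        records.append({
            'ReviewCount': len(texts),
            'DuplicateNames': len(names) - len(set(names)),
        })
    return pd.DataFrame(records, index=df.index)

# Made-up sample data, for illustration only.
sample = pd.DataFrame({
    'Hotel': ['Hotel A', 'Hotel B'],
    'Review 1': ['User A ; May 2021 ; Good.', 'User C ; June 2019 ; Fine.'],
    'Review 2': ['User A ; March 2021 ; Good again.', None],
})
stats = review_stats(sample)
print(stats)
```

Rows with `ReviewCount` below 15 or `DuplicateNames` above 0 would need a second look. Two different users can share a display name, so a match is only a hint, not proof of a duplicate.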