# Applying Setniment Analysis on Booking Data

#### Data API: [Booking data API - RapidAPI](https://rapidapi.com/tipsters/api/booking-com)

**Import required libraries**

In [131]:
import pandas as pd
import requests

**Request API access**

In [118]:
url = "https://booking-com.p.rapidapi.com/v1/hotels/reviews"

querystring = {"sort_type":"SORT_MOST_RELEVANT","hotel_id":"1676161","locale":"en-gb","language_filter":"en-gb,de,fr","customer_type":"solo_traveller,review_category_group_of_friends"}

headers = {
	"X-RapidAPI-Key": "2850e10abcmshfd4aefea12432e5p1d19d4jsn6b8952a03dac",
	"X-RapidAPI-Host": "booking-com.p.rapidapi.com"
}

response = requests.get(url, headers=headers, params=querystring)


booking = response.json()
print(booking)

{'result': [{'travel_purpose': 'leisure', 'helpful_vote_count': 0, 'review_hash': '0e3047be91e84701', 'countrycode': 'ae', 'review_id': 4192864805, 'user_new_badges': [], 'pros': 'The hotel was remarkable. Its location was great and the hotel services and staff were more than amazing especially kristela! She was so so helpful and very professional.', 'cons': '', 'anonymous': '', 'reviewng': 1, 'title': 'Exceptional', 'hotel_id': 1676161, 'languagecode': 'en-gb', 'is_moderated': 0, 'is_incentivised': 0, 'author': {'nr_reviews': 0, 'type': 'review_category_group_of_friends', 'type_string': 'Group of friends', 'age_group': '', 'helpful_vote_count': 0, 'name': 'Suha', 'avatar': 'https://lh3.googleusercontent.com/-XdUIqdMkCWA/AAAAAAAAAAI/AAAAAAAAAAA/4252rscbv5M/photo.jpg?sz=5064', 'countrycode': 'ae', 'user_id': 163290578, 'city': ''}, 'tags': [], 'stayed_room_info': {'room_name': 'Superior Double or Twin Room with City View', 'num_nights': 1, 'checkout': '2023-09-10', 'room_id': 167616104,

**Considering `pros` and `cons` as the hotel reviews. `pros` and `cons` belong to the pair with `result` as its key 
Here we worked on extracting `pros` and `cons` and append them on reviews list**

In [120]:
#create empty list
review = []
#loop through every index in the list of pairs under the key `result`
for i in range(len(booking)):
    #append the pros and the cons of every result in the reviews list
    review.append(booking['result'][i]['pros'])
    review.append(booking['result'][i]['cons'])

**Create a dataframe called `df_booking` to hold the reviews list.**

In [121]:
df_booking = pd.DataFrame(review)
df_booking

Unnamed: 0,0
0,The hotel was remarkable. Its location was gre...
1,
2,1) Great Location \n2) Spacious Rooms\n3) Bre...
3,1) Toaster during breakfast was broken during ...
4,"Super service friendly staff, rich breakfast, ..."
5,Nothing.


**Column label needs to be changed to `Reviews`**

In [122]:
#rename column `0`
df_booking = df_booking.rename(columns={0 : 'Reviews'})

**Add a new column to the `df_booking` dataframe to hold the labels of the review's sentiment.**

In [123]:
df_booking['Label'] = None

In [124]:
df_booking

Unnamed: 0,Reviews,Label
0,The hotel was remarkable. Its location was gre...,
1,,
2,1) Great Location \n2) Spacious Rooms\n3) Bre...,
3,1) Toaster during breakfast was broken during ...,
4,"Super service friendly staff, rich breakfast, ...",
5,Nothing.,


**The dataframe contains empty reviews that we need to get rid of**

In [125]:
#keep the rows with non-empty reviews (after stripping it if its still empty drop it)
df_booking = df_booking[df_booking['Reviews'].str.strip() != ""]
#reorder the index
df_booking = df_booking.reset_index(drop=True)
df_booking

Unnamed: 0,Reviews,Label
0,The hotel was remarkable. Its location was gre...,
1,1) Great Location \n2) Spacious Rooms\n3) Bre...,
2,1) Toaster during breakfast was broken during ...,
3,"Super service friendly staff, rich breakfast, ...",
4,Nothing.,


**The second review has escape characters and numbers that needs to be cleaned.**

In [126]:
#replace every escape character with dot
df_booking['Reviews'] = df_booking['Reviews'].str.replace('\n', '.')
#replace every digit followed by `\` with nothing using regex
df_booking['Reviews'] = df_booking['Reviews'].str.replace(r'\d+\)', '')

print(df_booking) 

                                             Reviews Label
0  The hotel was remarkable. Its location was gre...  None
1   Great Location . Spacious  Rooms. Breakfast ....  None
2   Toaster during breakfast was broken during ou...  None
3  Super service friendly staff, rich breakfast, ...  None
4                                           Nothing.  None


  df_booking['Reviews'] = df_booking['Reviews'].str.replace(r'\d+\)', '')


In [127]:
df_booking

Unnamed: 0,Reviews,Label
0,The hotel was remarkable. Its location was gre...,
1,Great Location . Spacious Rooms. Breakfast ....,
2,Toaster during breakfast was broken during ou...,
3,"Super service friendly staff, rich breakfast, ...",
4,Nothing.,


# Sentiment analysis

>**TextBlob is a Python library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as sentiment analysis**

In [128]:
!pip install textblob



In [129]:
!python -m textblob.download_corpora

[nltk_data] Downloading package brown to /Users/rare/nltk_data...
[nltk_data]   Package brown is already up-to-date!
[nltk_data] Downloading package punkt to /Users/rare/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to /Users/rare/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /Users/rare/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package conll2000 to /Users/rare/nltk_data...
[nltk_data]   Package conll2000 is already up-to-date!
[nltk_data] Downloading package movie_reviews to
[nltk_data]     /Users/rare/nltk_data...
[nltk_data]   Package movie_reviews is already up-to-date!
Finished.


- **create a list to store the sentiment labels of each tweet in the dataframe `df`**
- **create `blob` object**
- **use the blob object to call the`.sentiment.polarity` function and store the numeric value in the `sentiment_score`**
- **used chained conditional statements to decide the labels and append them to the `sentiment_list`**
- **add the `sentiment_list` as a column in the `df['label']` already existing column**

In [130]:
from textblob import TextBlob

review_labels = []
for review in df_booking['Reviews']:
    blob = TextBlob(review)
    review_score = blob.sentiment.polarity

    if review_score > 0:
        review_labels.append("Positive")
    elif review_score < 0:
        review_labels.append("Negative")
    else:
        review_labels.append("Neutral")

df_booking['Label'] = review_labels

df_booking

Unnamed: 0,Reviews,Label
0,The hotel was remarkable. Its location was gre...,Positive
1,Great Location . Spacious Rooms. Breakfast ....,Positive
2,Toaster during breakfast was broken during ou...,Negative
3,"Super service friendly staff, rich breakfast, ...",Positive
4,Nothing.,Neutral


# **THE END**