# Tutorial: getting starting with Hugging Face, NLP, NVIDIA CUDA-accelerated PyTorch, sentiment analysis. 

- Download the official Booking.com accommodation reviews data from Hugging Face
- Get introduced to the tools used for natural language processind (NLP): Hugging Face Transformers, PyTorch, and pre-trained language models
- Fine-tune a pre-trained language model on the Booking.com accommodation review dataset to classify the sentiment of the review 

## Downloading the data
Dataset: https://huggingface.co/datasets/Booking-com/accommodation-reviews

- Manual method: Clicking through the website and downloading files one by one (cumbersome)
- Automatic method: Integrate the URL in the code and download automatically (convenient)



In [1]:
import polars as pl
from pathlib import Path

# Set save directory to Kaggle working environment
save_data_dir = Path('/kaggle/working/rectour24/')

# Create directory if it doesn't exist
if not save_data_dir.exists():
    save_data_dir.mkdir(parents=True)

hf_data_dir = 'hf://datasets/Booking-com/accommodation-reviews/rectour24/'

# names of files, as specified in the repository
data_names = [
    'train_reviews',
    'val_reviews',
    'test_reviews'
]

for data_name in data_names:
    data = pl.read_csv(f"{hf_data_dir}{data_name}.csv")
    data.write_csv(save_data_dir / f"{data_name}.csv")
    print(f"{data_name} saved")

train_reviews saved
val_reviews saved
test_reviews saved


## Introduction to the tools
### Hugging Face Transformers
Hugging Face Transformers is an open-source library that revolutionized Natural Language Processing (NLP) by making state-of-the-art language models easily accessible. The library provides thousands of pre-trained models covering tasks like text classification, translation, summarization, and more. Its intuitive API allows seamless integration with both PyTorch and TensorFlow, supporting rapid experimentation, training, and deployment in real-world applications. Recent transformer architectures, like BERT, RoBERTa, and GPT, are the backbone of many breakthroughs in NLP. 

Read more about HF Transformers [here](https://huggingface.co/docs/transformers/en/index).

### PyTorch
PyTorch is a powerful, flexible deep learning framework popular for research and production. It offers dynamic computation graphs ("define-by-run"), making it intuitive to debug and iterate through models. PyTorch seamlessly integrates with GPU acceleration, allowing scalable training of sophisticated neural networks. The framework underpins many NLP, computer vision, and reinforcement learning projects worldwide.

Read more about PyTorch [here](https://docs.pytorch.org/docs/stable/index.html).

### Sentiment Analysis
Sentiment analysis is a fundamental task in NLP focused on determining the emotional tone or opinion expressed in a piece of text—classifying it as positive, negative, or neutral. It is widely used in fields such as social media monitoring, product feedback, finance, and customer service. Modern sentiment analysis models leverage deep learning to capture subtle nuances in text, achieving higher accuracy than rule-based approaches.

Read more about sentiment analysis [here](https://www.geeksforgeeks.org/machine-learning/what-is-sentiment-analysis/). 

### NLP model
This tutorial uses the `cardiffnlp/twitter-roberta-base-sentiment-latest model`—a specialized variant of RoBERTa fine-tuned for sentiment analysis on Twitter data. Developed by Cardiff NLP, this model is trained to recognize sentiment in short, informal, social media-style texts, making it well-suited to analyzing tweets, reviews, and similar content. RoBERTa itself is an optimized version of BERT, designed for high performance through longer training and improved data preprocessing. The CardiffNLP model outputs probabilities for negative, neutral, and positive classes, and can be easily loaded via the Hugging Face Transformers API for inference and fine-tuning.

Read more about this model [here](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest).


## Start the analysis
### Import libraries

In [2]:
!uv pip install --quiet transformers
import polars as pl
from transformers import AutoModelForSequenceClassification, AutoTokenizer, AutoConfig
import numpy as np
from scipy.special import softmax
from tqdm.notebook import tqdm
import torch
import gc
from pathlib import Path

import matplotlib.pyplot as plt
import seaborn as sns

### Load the data

In [3]:
# load dataset
data_dir = "/kaggle/working/rectour24/"

train_reviews = pl.read_csv(data_dir + "train_reviews.csv")
valid_reviews = pl.read_csv(data_dir + "val_reviews.csv")
test_reviews = pl.read_csv(data_dir + "test_reviews.csv")

### Brief exploratory data analysis

The dataset contains the following columns:

| 'review_id'| 'accommodation_id' |  'review_title' | 'review_positive' |  'review_negative' |  'review_score' |  'review_helpful_votes' |

And the files contain the following number of rows:
- `train_reviews`: 1,628,989
- `valid_reviews`: 203,787
- `test_reviews`: 199,138

In [4]:
train_reviews

review_id,accommodation_id,review_title,review_positive,review_negative,review_score,review_helpful_votes
str,i64,str,str,str,f64,i64
"""bf762eec-0e44-42ff-a066-6be55a…",489020669,"""Nice &amp; friendly , Plenty o…","""Really nice staff. Good food.F…","""woken up 2.30 in the morning b…",10.0,0
"""3f1a116f-38ed-4fe8-9086-fd71b0…",1533822482,,"""The staff was helpful and the …","""They only placed one wash clot…",9.0,0
"""2cfd21e7-4e2d-4a31-be9d-9e22c7…",222537300,,"""&quot;Home Sweet Home&quot; is…",,10.0,3
"""a240f502-0ee3-47e3-964b-786b56…",644485349,,"""We havent stayed in a b&amp;b …",,9.0,0
"""5086e380-21d4-4d5c-be6f-f2d04a…",-192152850,,"""Location bed and pillows where…",,10.0,0
…,…,…,…,…,…,…
"""58b117be-d221-4db7-8d6a-29656c…",2013313501,,"""Great location for walking uph…","""Starbucks closes at 5pm and I …",10.0,0
"""99c40951-637f-4faf-913c-773e45…",-617199527,,"""The hotel is located in the ci…",,10.0,0
"""c63fdfb0-2c67-4801-b1f8-bdff2e…",1745110448,"""Great stay for a great price""","""Loved the view and the room (v…","""Nothing I can think of""",7.0,0
"""65a309cf-17d8-4828-aa7a-7ab46b…",-1740040086,"""Great Historic Location""","""Character property with a spac…",,9.0,0


## Setup the NLP Model and PyTorch

In [5]:
# model definition
# Preprocess text (username and link placeholders)
def preprocess(text):
    new_text = []
    for t in text.split(" "):
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)
MODEL = f"cardiffnlp/twitter-roberta-base-sentiment-latest"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
config = AutoConfig.from_pretrained(MODEL)
# PT
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

model.config.use_cache = False
model.resize_token_embeddings(len(tokenizer))


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

config.json:   0%|          | 0.00/929 [00:00<?, ?B/s]

vocab.json: 0.00B [00:00, ?B/s]

merges.txt: 0.00B [00:00, ?B/s]

special_tokens_map.json:   0%|          | 0.00/239 [00:00<?, ?B/s]

2025-11-06 23:56:59.095036: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1762473419.288822      19 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1762473419.338357      19 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered


pytorch_model.bin:   0%|          | 0.00/501M [00:00<?, ?B/s]

Some weights of the model checkpoint at cardiffnlp/twitter-roberta-base-sentiment-latest were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


RobertaForSequenceClassification(
  (roberta): RobertaModel(
    (embeddings): RobertaEmbeddings(
      (word_embeddings): Embedding(50265, 768, padding_idx=1)
      (position_embeddings): Embedding(514, 768, padding_idx=1)
      (token_type_embeddings): Embedding(1, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): RobertaEncoder(
      (layer): ModuleList(
        (0-11): 12 x RobertaLayer(
          (attention): RobertaAttention(
            (self): RobertaSdpaSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): RobertaSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
         

In [6]:
def sentiment_score(text):
    # Ensure device is defined
    if 'device' not in globals():
        global device
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    
    text = preprocess(text)
    encoded_input = tokenizer(text, return_tensors='pt').to(device)
    with torch.no_grad():  # Disable gradient calculation for inference
        output = model(**encoded_input)
    scores = output.logits[0]
    scores = softmax(scores.cpu().numpy())  # Move to CPU before converting to numpy
    return scores


In [7]:
# test
sentiment_score("Covid cases are increasing fast!")

model.safetensors:   0%|          | 0.00/501M [00:00<?, ?B/s]

array([0.72357666, 0.228679  , 0.04774431], dtype=float32)

In [8]:
sentiment_score("AI is increasing worker productivity")

array([0.00744043, 0.07173729, 0.92082226], dtype=float32)

In [9]:
sentiment_score("AI is killing worker productivity")

array([0.83918566, 0.14199126, 0.01882305], dtype=float32)

In [10]:
# Batch processing
def sentiment_score_batch(texts):
    # Ensure device is defined
    if 'device' not in globals():
        global device
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    
    preprocessed_texts = [preprocess(text) for text in texts]
    encoded_input = tokenizer(
        preprocessed_texts,
        return_tensors='pt',
        padding=True,
        truncation=True,
        max_length=514,
    ).to(device)

    with torch.no_grad():
        output = model(**encoded_input)
        scores = output.logits
        scores = softmax(scores.cpu().numpy(), axis=1)  # Move to CPU and apply softmax

    torch.cuda.empty_cache()
    del encoded_input
    gc.collect()

    return scores

# Process in batches
batch_size = 5000  # Adjust this based on your GPU memory capacity
review_title_scores = []

# Select top 10% of the rows to reduce time
num_rows = train_reviews.height 
top_10_percent = int(num_rows * 0.1)  
head_10pc_train_reviews = train_reviews.head(top_10_percent)

# for i in tqdm(range(0, len(train_reviews), batch_size)):
    # batch_texts = train_reviews['review_title'].to_numpy()[i:i+batch_size]
for i in tqdm(range(0, len(head_10pc_train_reviews), batch_size)):
    batch_texts = head_10pc_train_reviews['review_title'].to_numpy()[i:i+batch_size]

    # Identify None values
    none_mask = [text is None for text in batch_texts]

    # Filter out None values for processing
    valid_texts = [text for text in batch_texts if text is not None]

    if valid_texts:
        # Get sentiment scores for non-None values
        batch_scores = sentiment_score_batch(valid_texts)
    else:
        batch_scores = np.array([]).reshape(0, 3)  # Empty array for empty valid_texts

    # Initialize full scores array with zeros for None values
    full_scores = np.zeros((len(batch_texts), 3))

    # Insert the computed scores back into the appropriate positions
    valid_idx = 0
    for j, is_none in enumerate(none_mask):
        if not is_none:
            full_scores[j] = batch_scores[valid_idx]
            valid_idx += 1

    review_title_scores.append(full_scores)

review_title_scores = np.vstack(review_title_scores)

  0%|          | 0/33 [00:00<?, ?it/s]

In [11]:
review_title_scores.shape

(162898, 3)

In [12]:
head_10pc_train_reviews = (
    head_10pc_train_reviews.with_columns(
        review_title_negative=pl.Series(review_title_scores[:, 0]),
        review_title_neutral=pl.Series(review_title_scores[:, 1]),
        review_title_positive=pl.Series(review_title_scores[:, 2]),
    )
)

In [13]:
head_10pc_train_reviews

review_id,accommodation_id,review_title,review_positive,review_negative,review_score,review_helpful_votes,review_title_negative,review_title_neutral,review_title_positive
str,i64,str,str,str,f64,i64,f64,f64,f64
"""bf762eec-0e44-42ff-a066-6be55a…",489020669,"""Nice &amp; friendly , Plenty o…","""Really nice staff. Good food.F…","""woken up 2.30 in the morning b…",10.0,0,0.006272,0.030829,0.962899
"""3f1a116f-38ed-4fe8-9086-fd71b0…",1533822482,,"""The staff was helpful and the …","""They only placed one wash clot…",9.0,0,0.0,0.0,0.0
"""2cfd21e7-4e2d-4a31-be9d-9e22c7…",222537300,,"""&quot;Home Sweet Home&quot; is…",,10.0,3,0.0,0.0,0.0
"""a240f502-0ee3-47e3-964b-786b56…",644485349,,"""We havent stayed in a b&amp;b …",,9.0,0,0.0,0.0,0.0
"""5086e380-21d4-4d5c-be6f-f2d04a…",-192152850,,"""Location bed and pillows where…",,10.0,0,0.0,0.0,0.0
…,…,…,…,…,…,…,…,…,…
"""7ed297ea-b6fa-42cf-8f47-24e640…",-2144097448,,"""Located on the waterfront and …",,10.0,0,0.0,0.0,0.0
"""d647a0c4-d912-4d5a-9722-01c23a…",-1084815414,"""Good location but maintenance …","""-Bed was super comfortable. I …","""-Not clear if there&#39;s a cl…",6.0,0,0.079638,0.328265,0.592097
"""9234e4bc-56c2-4b7a-b7ea-af9b0f…",-437051311,"""Good""","""Location , pool , staff""","""The massage and hammam were a …",10.0,0,0.052735,0.262828,0.684437
"""b01c9f20-0473-45f1-a79b-3d9a6b…",-1556971278,"""Loved T-Bird!""","""The hotel is retro and older c…","""The only drawbacks to the room…",10.0,0,0.004704,0.017485,0.977811


In [14]:
valid_reviews

review_id,accommodation_id,review_title,review_positive,review_negative,review_score,review_helpful_votes
str,i64,str,str,str,f64,i64
"""bc875f5f-d6b3-4257-8657-c32438…",1715638058,"""Simple, quiet, clean accommoda…","""Simple, quiet, clean accommoda…",,7.0,0
"""1255bfb6-4474-47c5-ab95-039bde…",392754076,"""Great location, nice gardens a…","""Great pool with shade, nice ga…","""Room was nice just a little da…",8.0,0
"""07b6070b-ea7b-4780-badd-517080…",-1438158622,"""Lovely relaxing hotel, great v…","""The food was lovely and the po…","""My one complaint is that the t…",10.0,0
"""4f350d78-4540-41c1-b5e5-13d9e1…",-1703391740,"""Excellent - Highly recommend b…","""The place is hard to find. But…",,9.0,0
"""5f25926b-f16b-41b6-836f-b5f016…",653876828,,"""Hotel was very nice, clean, co…","""Location was ok, just a bit of…",9.0,0
…,…,…,…,…,…,…
"""960bd395-7b0c-4343-9173-e6c250…",198865257,"""Excellent location to shops, c…","""Excellent location and value f…",,10.0,0
"""1e3f012e-59cb-49ba-9f94-0c7033…",972221007,"""Modern, clean and high spec ap…","""Super modern and clean apartme…","""Na""",8.0,0
"""beb7ebb9-e650-41b9-adfa-eadcf2…",1767990297,"""Modern apartment s for a low p…","""Amazing modern apartment s, in…","""Nothing not to like.. Everythi…",10.0,0
"""a8bf9d76-5f68-4553-aecd-dbccf6…",693269630,,"""I enjoyed my stay staff were v…",,10.0,0


In [15]:
def sentiment_analysis_batch(texts:np.ndarray, batch_size:int):
    review_scores = []

    for i in tqdm(range(0, len(texts), batch_size)):
        batch_texts = texts[i:i+batch_size]

        # Identify None values
        none_mask = [text is None for text in batch_texts]

        # Filter out None values for processing
        valid_texts = [text for text in batch_texts if text is not None]

        if valid_texts:
            # Get sentiment scores for non-None values
            batch_scores = sentiment_score_batch(valid_texts)
        else:
            batch_scores = np.array([]).reshape(0, 3)  # Empty array for empty valid_texts

        # Initialize full scores array with zeros for None values
        full_scores = np.zeros((len(batch_texts), 3))

        # Insert the computed scores back into the appropriate positions
        valid_idx = 0
        for j, is_none in enumerate(none_mask):
            if not is_none:
                full_scores[j] = batch_scores[valid_idx]
                valid_idx += 1

        review_scores.append(full_scores)

    review_scores = np.vstack(review_scores)

    # torch.cuda.empty_cache()
    # gc.collect()

    return review_scores

In [16]:
# Select top 10% of the rows to reduce time
num_rows = valid_reviews.height 
top_10_percent = int(num_rows * 0.1)  
head_10pc_valid_reviews = valid_reviews.head(top_10_percent)

# valid_review_title_scores = sentiment_analysis_batch(valid_reviews['review_title'].to_numpy(), batch_size=256)
valid_review_title_scores = sentiment_analysis_batch(head_10pc_valid_reviews['review_title'].to_numpy(), batch_size=5000)

  0%|          | 0/5 [00:00<?, ?it/s]

In [17]:
head_10pc_valid_reviews

review_id,accommodation_id,review_title,review_positive,review_negative,review_score,review_helpful_votes
str,i64,str,str,str,f64,i64
"""bc875f5f-d6b3-4257-8657-c32438…",1715638058,"""Simple, quiet, clean accommoda…","""Simple, quiet, clean accommoda…",,7.0,0
"""1255bfb6-4474-47c5-ab95-039bde…",392754076,"""Great location, nice gardens a…","""Great pool with shade, nice ga…","""Room was nice just a little da…",8.0,0
"""07b6070b-ea7b-4780-badd-517080…",-1438158622,"""Lovely relaxing hotel, great v…","""The food was lovely and the po…","""My one complaint is that the t…",10.0,0
"""4f350d78-4540-41c1-b5e5-13d9e1…",-1703391740,"""Excellent - Highly recommend b…","""The place is hard to find. But…",,9.0,0
"""5f25926b-f16b-41b6-836f-b5f016…",653876828,,"""Hotel was very nice, clean, co…","""Location was ok, just a bit of…",9.0,0
…,…,…,…,…,…,…
"""5834bd5c-75bb-46e0-8b24-7263ef…",-1807522361,"""Wonderful and convenient locat…","""Wonderful and convenient locat…",,10.0,0
"""8b0116ff-85bc-4112-94e2-dd3ffd…",1024241486,"""pleasant for short stay""","""convenient location, nice rece…","""rooms were clean but very shab…",6.0,0
"""c280569b-4996-4ef5-bd66-123ba3…",-958690930,"""We had a wonderful 2 nights at…","""Delicious breakfast with a bea…","""Nothing""",8.0,0
"""9e4a6e16-05cc-4bd1-9725-0675db…",-375121988,,"""Location close to city""",,8.0,0


In [18]:
valid_review_title_scores.shape

(20378, 3)

In [19]:
# Select top 10% of the rows to reduce time
num_rows = valid_reviews.height 
top_10_percent = int(num_rows * 0.1)  
head_10pc_test_reviews = test_reviews.head(top_10_percent)

# test_review_title_scores = sentiment_analysis_batch(test_reviews['review_title'].to_numpy(), batch_size=256)
test_review_title_scores = sentiment_analysis_batch(head_10pc_test_reviews['review_title'].to_numpy(), batch_size=5000)

  0%|          | 0/5 [00:00<?, ?it/s]

In [20]:
head_10pc_valid_reviews = (
    head_10pc_valid_reviews.with_columns(
        review_title_negative=pl.Series(valid_review_title_scores[:, 0]),
        review_title_neutral=pl.Series(valid_review_title_scores[:, 1]),
        review_title_positive=pl.Series(valid_review_title_scores[:, 2]),
    )
)

head_10pc_test_reviews = (
    head_10pc_test_reviews.with_columns(
        review_title_negative=pl.Series(test_review_title_scores[:, 0]),
        review_title_neutral=pl.Series(test_review_title_scores[:, 1]),
        review_title_positive=pl.Series(test_review_title_scores[:, 2]),
    )
)

In [21]:
head_10pc_valid_reviews

review_id,accommodation_id,review_title,review_positive,review_negative,review_score,review_helpful_votes,review_title_negative,review_title_neutral,review_title_positive
str,i64,str,str,str,f64,i64,f64,f64,f64
"""bc875f5f-d6b3-4257-8657-c32438…",1715638058,"""Simple, quiet, clean accommoda…","""Simple, quiet, clean accommoda…",,7.0,0,0.033174,0.35094,0.615886
"""1255bfb6-4474-47c5-ab95-039bde…",392754076,"""Great location, nice gardens a…","""Great pool with shade, nice ga…","""Room was nice just a little da…",8.0,0,0.002874,0.014642,0.982485
"""07b6070b-ea7b-4780-badd-517080…",-1438158622,"""Lovely relaxing hotel, great v…","""The food was lovely and the po…","""My one complaint is that the t…",10.0,0,0.006497,0.012862,0.980641
"""4f350d78-4540-41c1-b5e5-13d9e1…",-1703391740,"""Excellent - Highly recommend b…","""The place is hard to find. But…",,9.0,0,0.00617,0.013217,0.980612
"""5f25926b-f16b-41b6-836f-b5f016…",653876828,,"""Hotel was very nice, clean, co…","""Location was ok, just a bit of…",9.0,0,0.0,0.0,0.0
…,…,…,…,…,…,…,…,…,…
"""5834bd5c-75bb-46e0-8b24-7263ef…",-1807522361,"""Wonderful and convenient locat…","""Wonderful and convenient locat…",,10.0,0,0.002825,0.012105,0.98507
"""8b0116ff-85bc-4112-94e2-dd3ffd…",1024241486,"""pleasant for short stay""","""convenient location, nice rece…","""rooms were clean but very shab…",6.0,0,0.019125,0.224177,0.756698
"""c280569b-4996-4ef5-bd66-123ba3…",-958690930,"""We had a wonderful 2 nights at…","""Delicious breakfast with a bea…","""Nothing""",8.0,0,0.003422,0.009615,0.986963
"""9e4a6e16-05cc-4bd1-9725-0675db…",-375121988,,"""Location close to city""",,8.0,0,0.0,0.0,0.0


In [22]:
head_10pc_test_reviews

review_id,accommodation_id,review_title,review_positive,review_negative,review_score,review_helpful_votes,review_title_negative,review_title_neutral,review_title_positive
str,i64,str,str,str,f64,i64,f64,f64,f64
"""d50f830f-fd60-492f-924b-46f948…",-663110570,,"""The cabin was very comfortable…","""Could have done with a couple …",7.0,0,0.0,0.0,0.0
"""1f03e5f0-f15e-4c7b-997e-0490ce…",-558978085,,"""The breakfast was awesome. The…","""Nothing.""",10.0,0,0.0,0.0,0.0
"""01121198-c633-47df-aa4a-b42ef0…",1477624081,"""Great place to stay in central…","""I highly recommend this spot i…","""Nothing.""",10.0,0,0.004134,0.042763,0.953103
"""73e3f388-3f51-4091-ad3a-bb6db1…",-1273110867,,"""The room, its location (near Ž…",,9.0,0,0.0,0.0,0.0
"""f181394d-35a0-4e27-9cdc-84427c…",1244380562,"""delightful apartment in a perf…","""great location and everything …","""need to plan where to park bef…",10.0,1,0.004232,0.0117,0.984068
…,…,…,…,…,…,…,…,…,…
"""a882a471-a285-4472-bb91-3d0971…",-41728243,"""I would definitely stay at the…","""The food was very good, especi…","""I lost a pair of favourite ear…",8.0,0,0.005828,0.256105,0.738067
"""f118d0e2-cc2c-475d-a6b7-c87e3c…",-720025442,"""Always appreciate a profession…","""safe parking at location. prof…","""A little walk to station (10 m…",8.0,0,0.011126,0.061646,0.927228
"""94aa67cd-9513-408b-83ac-9e3198…",-385108466,"""Overnight stay""","""This was just a stopover on ou…","""The breakfast was disappointin…",8.0,0,0.02524,0.869788,0.104971
"""2ff6a325-76fe-4884-97ac-53df87…",1810667136,,"""Ordinary romm at a good price.…",,8.0,0,0.0,0.0,0.0


## Credits
The code used in this notebook is taken from https://github.com/ty1260/rectour2024_challenge.git