<a href="https://colab.research.google.com/github/LuluW8071/Text-Sentiment-Analysis/blob/main/Text-Sentiment-Analysis-using-BERT.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Text-Sentiment-Analysis-using-BERT

In [1]:
!pip install "transformers[torch]" datasets evaluate seaborn gdown nltk scikit-learn






## 1. Download and Load the dataset

The script below downloads a dataset that combines the Yelp Polarity Dataset and the IMDb Movie Dataset. The Yelp Polarity data was preprocessed by keeping only the columns needed for sentiment analysis and was then merged with the IMDb Movie Dataset.

In [2]:
import os
import gdown
import zipfile

file_url = 'https://drive.google.com/uc?id=1Jp3D5gdxGrwa5dHbr4p-pECrD8wi7vik'
file_name = 'sentiment_dataset.zip'

if not os.path.exists("./dataset/sentiment_combined.csv"):
    # Download the file from Google Drive
    gdown.download(file_url, file_name, quiet=False)
    extract_dir = './dataset'

    # Extract the zip file
    with zipfile.ZipFile(file_name, 'r') as zip_ref:
        zip_ref.extractall(extract_dir)

    # Remove the zip file after extraction
    os.remove(file_name)
    print("Files extracted successfully to:", extract_dir)

>__Note:__</br>
**BERT** (Bidirectional Encoder Representations from Transformers) can be fine-tuned on a relatively small dataset and still yield good results for many tasks. It is already pre-trained on large text corpora, which gives it strong contextual understanding, and regularization techniques such as dropout help limit overfitting during fine-tuning.

>So we can take just `20000` samples and fine-tune the **BERT** model on them for our purpose.

In [3]:
import random
import pandas as pd
import numpy as np

# Reduce to 10000 samples if you want your model to train faster (while loss may increase)
samples = 20000

# Read dataset and take random 20000 samples
df = pd.read_csv("dataset/sentiment_combined.csv")
df = df.sample(n=samples, random_state=random.randint(0, 100))

# Reset the index
df.reset_index(drop=True, inplace=True)
df.head(), df.shape

(                                              review sentiment
 0  I was here briefly for a work assignment (I'm ...  positive
 1  No stars would be ideal.\nWe have been going h...  negative
 2  Awesome banquet meal! \u00a312 each every Sund...  positive
 3  I had to watch this film because the plot was ...  negative
 4  I don't get my nails and toes done often. It s...  positive,
 (20000, 2))

In [4]:
df['review'][0]

"I was here briefly for a work assignment (I'm a photographer) BUT if I could afford to stay here (I'm sure it's pricey) I would.  It's massive and gorgeous.  The location is out in the middle of nowhere, it's surrounded by nature and a golf course.  Everything from the ceiling to the bathrooms to the service and food I experienced, was fantastic!\\n\\nI really want to shout out to the kind of customer service we received. First, the place is huge so it's really easy to get lost, especially if you are looking for a specific place where an event is being held. The front desk people weren't really interested in helping but some lady pushing a cleaner cart took me pretty much all the way to where I needed.  Then when I got to the inside venue (we had cocktails outside, dinner/awards inside), the staff were still setting up the tables and adjusting the light and as a photographer, I was worried about how dark it was so they actually worked with me to get in the most low light they could!  

In [5]:
df['sentiment'].value_counts()

sentiment
negative    10061
positive     9939
Name: count, dtype: int64

## 2. Text Pre-Processing

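- Clean up the text data by removing punctuation, extra whitespace, and digits.
- The NLTK stop-word list is downloaded in the cell below, but `preprocess_string` does not tokenize the text or remove stop words; the model's tokenizer receives the full cleaned sentence.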

In [6]:
import re
import nltk
from nltk.corpus import stopwords
nltk.download('stopwords')
from collections import Counter

# Precompile regular expressions for faster preprocessing
non_word_chars_pattern = re.compile(r"[^\w\s]")
whitespace_pattern = re.compile(r"\s+")
digits_pattern = re.compile(r"\d")
username_pattern = re.compile(r"@([^\s]+)")
hashtags_pattern = re.compile(r"#\d+")
br_pattern = re.compile(r'<br\s*/?>\s*<br\s*/?>')

def preprocess_string(s):
    # Remove every character that is not a word character (letter, digit, underscore) or whitespace
    s = non_word_chars_pattern.sub('', s)
    # Collapse runs of whitespace (including newlines) into a single space
    s = whitespace_pattern.sub(' ', s)
    # Remove digits
    s = digits_pattern.sub('', s)
    # Remove usernames and hashtags (no-ops here: '@' and '#' were already stripped above)
    s = username_pattern.sub('', s)
    s = hashtags_pattern.sub('', s)
    # Remove doubled <br /> tags (also a no-op here: '<', '/' and '>' were already stripped)
    s = br_pattern.sub('', s)
    # Remove leftover URL, retweet and markup fragments.
    # NOTE: these are plain substring replacements, so "rt" and "br" are also removed
    # inside ordinary words (e.g. "cart" -> "ca", "briefly" -> "iefly").
    s = s.replace("https", "")
    s = s.replace("http", "")
    s = s.replace("rt", "")
    s = s.replace("-", "")
    s = s.replace("br", "")
    # Remove any remaining newline characters
    s = s.replace("\n", "")
    return s

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\mk473\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
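
The substring replacements in `preprocess_string` also hit ordinary words, which is why the cleaned sample further down reads `iefly` and `ca` instead of `briefly` and `cart`. The following is a minimal sketch of one possible alternative, not the notebook's method: it removes URLs, tags, usernames and hashtags before stripping punctuation and uses word boundaries for `rt`. The function name `preprocess_string_safe`, the patterns and the sample string are all assumptions made for illustration.

```python
import re

# Hypothetical alternative to preprocess_string; every name and pattern here is illustrative.
URL_PATTERN = re.compile(r"https?://\S+")
BR_TAG_PATTERN = re.compile(r"<br\s*/?>", re.IGNORECASE)
USERNAME_PATTERN = re.compile(r"@\w+")
HASHTAG_PATTERN = re.compile(r"#\w+")
RETWEET_PATTERN = re.compile(r"\brt\b", re.IGNORECASE)
ESCAPED_NEWLINE_PATTERN = re.compile(r"\\n")
NON_WORD_PATTERN = re.compile(r"[^\w\s]")
DIGITS_PATTERN = re.compile(r"\d+")
WHITESPACE_PATTERN = re.compile(r"\s+")

def preprocess_string_safe(s):
    # Remove structured tokens first, while their marker characters still exist
    s = URL_PATTERN.sub(" ", s)
    s = BR_TAG_PATTERN.sub(" ", s)
    s = USERNAME_PATTERN.sub(" ", s)
    s = HASHTAG_PATTERN.sub(" ", s)
    # Word boundaries keep "rt" inside words such as "cart" intact
    s = RETWEET_PATTERN.sub(" ", s)
    # Literal backslash-n sequences left in the raw text become spaces
    s = ESCAPED_NEWLINE_PATTERN.sub(" ", s)
    # Strip punctuation and digits, then collapse whitespace
    s = NON_WORD_PATTERN.sub("", s)
    s = DIGITS_PATTERN.sub("", s)
    return WHITESPACE_PATTERN.sub(" ", s).strip()

# Made-up sample text, not taken from the dataset
sample = "I was here briefly<br /><br />RT @bob the cart was great!\\n\\nSee https://example.com #42"
print(preprocess_string_safe(sample))
```

Whether this heavier cleaning helps at all is an open question: the DistilBERT tokenizer can handle punctuation and casing by itself, so it may be worth comparing against lightly cleaned text.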


In [7]:
from tqdm.notebook import tqdm_notebook

preprocessed_reviews = []

# Apply preprocessing
for review in tqdm_notebook(df['review'], desc='Preprocessing'):
    preprocessed_review = preprocess_string(review)
    preprocessed_reviews.append(preprocessed_review)

# Assign the preprocessed reviews back to the 'review' column
df['review'] = preprocessed_reviews


In [8]:
df['review'][0], df['sentiment'][0]

('I was here iefly for a work assignment Im a photographer BUT if I could afford to stay here Im sure its pricey I would Its massive and gorgeous The location is out in the middle of nowhere its surrounded by nature and a golf course Everything from the ceiling to the bathrooms to the service and food I experienced was fantasticnnI really want to shout out to the kind of customer service we received First the place is huge so its really easy to get lost especially if you are looking for a specific place where an event is being held The front desk people werent really interested in helping but some lady pushing a cleaner ca took me pretty much all the way to where I needed Then when I got to the inside venue we had cocktails outside dinnerawards inside the staff were still setting up the tables and adjusting the light and as a photographer I was worried about how dark it was so they actually worked with me to get in the most low light they could As far as set updecorfood I was there for

## 3. Mapping `sentiment` column to numeric values

In [9]:
# Map 'positive' to 1 & 'negative' to 0
df['sentiment'] = df['sentiment'].replace({'positive': 1, 'negative': 0})
df.head()



                                              review  sentiment
0  I was here iefly for a work assignment Im a ph...          1
1  No stars would be idealnWe have been going her...          0
2  Awesome banquet meal ua each every Sunday Had ...          1
3  I had to watch this film because the plot was ...          0
4  I dont get my nails and toes done often It sho...          1


## 4. Splitting the dataset into train and test sets

In [10]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(df['review'],
                                                    df['sentiment'],
                                                    test_size=0.2)

len(X_train), len(X_test)

(16000, 4000)

In [11]:
X_train, X_test, y_train, y_test = list(X_train), list(X_test), list(y_train), list(y_test)
X_train[:2], y_train[:2]

(['William S Ha as Jim Treen the most eligible bachelor in Canyon City is finally getting hitched to pretty blonde waitress Leona Hutton as Molly Stewa His fiancée doesnt know it but Mr Ha is secretly the western towns Most Wanted bandit However Ha is planning to go straight due to his marriage plans Unfounately Ms Hutton discovers Has secret stash whilst cleaning up his untidy cabin so she calls off the wedding Next Hutton succumbs to the charms of mining swindler Frank Borzage as W Sloane Carey  Serviceable enteainment from superstar Ha he was ranked no less than  at the box office by Quigley Publications for the years  and  ahead of Mary Pickford The principles perform capably Later on Frank Borzage was quite a director and Leona Hutton a suicide   A Knight of the Trails  William S Ha William S Ha Leona Hutton Frank Borzage',
  'Worst place ever I feel like I was at a homeless shelter I dont try to sound like a rude person but this is my opinion about this place Food has no taste to

## 5. Preparing data using a custom Dataset

In [12]:
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification
from transformers import Trainer, TrainingArguments

# Setting device agnostic code
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(device)

cuda


In [13]:
class data(Dataset):
  def __init__(self, encodings, labels):
    self.encodings = encodings
    self.labels = labels

  def __getitem__(self, index):
    item = {key: torch.tensor(val[index]) for key, val in self.encodings.items()}
    item['labels'] = torch.tensor(self.labels[index])
    return item

  def __len__(self):
    return len(self.labels)

## 6. Load PreTrained BERT Model

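**BERT** (Bidirectional Encoder Representations from Transformers) is a pre-trained language representation model developed by researchers at Google. This notebook loads `distilbert-base-uncased`, a DistilBERT checkpoint: a smaller, faster model distilled from BERT that uses the same kind of encoder blocks with fewer layers.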

<img src = "https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Attention_diagram_transformer.png?ssl=1">

- BERT's architecture consists of multiple `encoder transformer blocks` stacked together.
- Each transformer block includes `multi-head self-attention` and a `feed-forward neural network`.
- `Multi-head self-attention` lets BERT weigh the importance of each word based on its context, capturing long-range dependencies effectively.
- The output of the `attention mechanism` then passes through a non-linear transformation in the `feed-forward neural network`.
- `Layer normalization` and `residual connections` stabilize training and help gradients flow through each transformer block.
- `Positional embeddings` preserve word order in sequences, since self-attention on its own is order-agnostic.

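>BERT is pre-trained on a large text corpus using tasks like masked language modeling and next sentence prediction. Fine-tuning on a specific task adds a small task-specific head (here, a classification layer) on top of the pre-trained model and trains on labeled data; by default all weights are updated, not only the final layers.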
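
To make the attention step in the list above concrete, here is a minimal NumPy sketch of single-head scaled dot-product self-attention. It is a toy illustration under stated assumptions, not BERT's actual implementation: the sizes, the random weights and the function names are made up, and real models use several heads plus learned projections.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    # Project the token vectors into queries, keys and values
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Similarity of every token with every other token, scaled by sqrt(head size)
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Each row becomes a probability distribution over the tokens
    weights = softmax(scores, axis=-1)
    # Every output vector is a weighted mix of all value vectors
    return weights @ v, weights

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 4, 8, 8          # toy sizes chosen for illustration
x = rng.normal(size=(seq_len, d_model))     # stand-in for token embeddings
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))

out, weights = self_attention(x, w_q, w_k, w_v)
print(out.shape, weights.sum(axis=-1))
```

Each row of `weights` sums to 1, so every token's output is a context-dependent average over the whole sequence.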

### [Explanation Video on BERT](https://www.youtube.com/watch?v=6ahxPTLZxU8)

In [14]:
from huggingface_hub import notebook_login

# Paste hugging face token with write permission enabled and log in
notebook_login()


In [15]:
model_name = 'distilbert-base-uncased'
tokenizer = DistilBertTokenizerFast.from_pretrained(model_name)



## 7. Tokenize and Create Encoded Dataset

In [16]:
from transformers import DataCollatorWithPadding

data_collator = DataCollatorWithPadding(tokenizer)

# Tokenize with truncation and padding and create dataset from tokenized data
train_encoding = tokenizer(X_train, truncation=True, padding=True)
test_encoding = tokenizer(X_test, truncation=True, padding=True)

train_dataset = data(train_encoding, y_train)
test_dataset = data(test_encoding, y_test)

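
Calling the tokenizer with `padding=True` pads every sequence in that call to the longest one, whereas a padding collator such as `DataCollatorWithPadding` pads each batch only to the longest sequence in that batch. The pure-Python sketch below illustrates the difference with made-up token ids; `pad_batch` is a hypothetical helper written for this illustration and is not part of `transformers`.

```python
def pad_batch(batch, pad_id=0):
    """Pad every sequence to the length of the longest one in this batch."""
    max_len = max(len(seq) for seq in batch)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
    # 1 marks a real token, 0 marks padding
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# Hypothetical token ids for three reviews of different lengths
sequences = [[101, 2023, 102], [101, 2023, 2003, 2307, 102], [101, 102]]

# Padding everything at once: all rows are as long as the longest review
whole = pad_batch(sequences)
# Padding a batch that holds only the two short reviews: rows stay short
small = pad_batch([sequences[0], sequences[2]])

print(len(whole["input_ids"][0]), len(small["input_ids"][0]))
```

With reviews of very different lengths, per-batch padding can noticeably reduce the number of padding tokens the model has to process.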

## 8. Fine-Tuning DistilBERT

Fine-tuning BERT, a pre-trained language model, allows us to adapt it to specific NLP tasks such as text classification, named entity recognition, sentiment analysis, and question answering.


<img src = "https://raw.githubusercontent.com/LuluW8071/Text-Sentiment-Analysis/dfa065d8169ae9d26460114e612118f5628d7dd3//assets/BERT-Fine-tuning-pipeline.png">

In [17]:
batch_size = 2

training_args = TrainingArguments(
    output_dir=f"{model_name}-finetuned-sentiment",
    num_train_epochs=1,                              # No of epochs to train
    per_device_train_batch_size=batch_size,          # Batch size for training per device
    per_device_eval_batch_size=batch_size,           # Batch size for evaluation per device
    learning_rate=2e-5,                              # Learning rate for optimizer
    warmup_steps=400,                                # No of warmup steps for the learning rate scheduler
    weight_decay=0.01,                               # Weight decay coefficient for regularization
    logging_dir='./logs',                            # Directory for logging training information
    load_best_model_at_end=True,                     # Whether to load the best model from checkpoints at the end of training
    logging_steps=400,                               # Log training metrics every `logging_steps` steps
    save_steps=400,                                  # Save model checkpoints every `save_steps` steps
    save_total_limit=2,                              # Save no of checkpoints
    evaluation_strategy="steps",                     # When to run evaluation during training: "no", "steps" or "epoch"
    fp16=True,                                       # Floating point 16 precision
    push_to_hub=True,                                # Save checkpoint in Hugging Face Hub
    report_to="tensorboard",                         # Enable TensorBoard integration
)
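
The `learning_rate` and `warmup_steps` settings interact through the learning-rate scheduler. Assuming the `Trainer` default of a linear schedule with warmup, the rate rises linearly from 0 to `2e-5` over the first 400 steps and then decays linearly to 0 at the last step. The sketch below reproduces that shape in plain Python; `linear_schedule_lr` is a hypothetical helper, and the total of 8000 steps assumes 16000 training samples, a batch size of 2 and one epoch.

```python
def linear_schedule_lr(step, base_lr=2e-5, warmup_steps=400, total_steps=8000):
    """Learning rate at a given scheduler step for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup: ramp up from 0 to base_lr
        return base_lr * step / warmup_steps
    # Decay: go from base_lr down to 0 at total_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

for step in (0, 200, 400, 4200, 8000):
    print(step, linear_schedule_lr(step))
```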


## 9. Train the DistilBERT Model

In [18]:
from evaluate import load

accuracy_metric = load("accuracy")

# Convert logits to class predictions and compute accuracy
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    accuracy = accuracy_metric.compute(predictions=predictions, references=labels)
    return accuracy
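
As a rough illustration of what `compute_metrics` receives and returns, the sketch below repeats the same two steps (argmax over the logits, then the fraction of matching labels) with plain NumPy. The logits and labels are invented values, not outputs from this notebook.

```python
import numpy as np

# Hypothetical logits for four reviews; columns are [negative, positive]
logits = np.array([[2.1, -1.3],
                   [-0.4, 0.9],
                   [0.2, 0.1],
                   [-1.5, 2.2]])
labels = np.array([0, 1, 1, 1])

# The predicted class is the index of the larger logit in each row
predictions = np.argmax(logits, axis=-1)
# Accuracy is the share of predictions that match the reference labels
accuracy = float((predictions == labels).mean())
print(predictions, accuracy)
```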

In [19]:
# Label mapping
label_mapping = {0: 'negative', 1: 'positive'}

model = DistilBertForSequenceClassification.from_pretrained(model_name,
                                                            num_labels=2)

# Override the model configuration for custom labels
model.config.id2label = label_mapping
model.config.label2id = {v: k for k, v in label_mapping.items()}


trainer = Trainer(
    model=model,                      # The instantiated Transformers model to be trained
    args=training_args,               # Training arguments, defined above
    train_dataset=train_dataset,      # Training dataset
    eval_dataset=test_dataset,        # Evaluation dataset
    tokenizer=tokenizer,              # Tokenizer
    data_collator=data_collator,      # Data collator
    compute_metrics=compute_metrics,  # Function to compute metrics
)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['pre_classifier.bias', 'pre_classifier.weight', 'classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [20]:
# Trainer already handles device placement and mixed precision through Accelerate,
# so no separate Accelerator object is needed
trainer.train()

You're using a DistilBertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


{'loss': 0.5386, 'learning_rate': 1.97e-05, 'epoch': 0.05}


{'eval_loss': 0.4660336971282959, 'eval_accuracy': 0.8885, 'eval_runtime': 22.0264, 'eval_samples_per_second': 181.601, 'eval_steps_per_second': 90.8, 'epoch': 0.05}
{'loss': 0.5019, 'learning_rate': 1.8963157894736844e-05, 'epoch': 0.1}


{'eval_loss': 0.41572555899620056, 'eval_accuracy': 0.8925, 'eval_runtime': 22.7431, 'eval_samples_per_second': 175.878, 'eval_steps_per_second': 87.939, 'epoch': 0.1}
{'loss': 0.549, 'learning_rate': 1.7910526315789477e-05, 'epoch': 0.15}


{'eval_loss': 0.6199649572372437, 'eval_accuracy': 0.874, 'eval_runtime': 22.5851, 'eval_samples_per_second': 177.108, 'eval_steps_per_second': 88.554, 'epoch': 0.15}
{'loss': 0.4432, 'learning_rate': 1.6857894736842106e-05, 'epoch': 0.2}


{'eval_loss': 0.41191366314888, 'eval_accuracy': 0.91175, 'eval_runtime': 22.8698, 'eval_samples_per_second': 174.903, 'eval_steps_per_second': 87.452, 'epoch': 0.2}
{'loss': 0.4479, 'learning_rate': 1.580526315789474e-05, 'epoch': 0.25}


{'eval_loss': 0.357967346906662, 'eval_accuracy': 0.91325, 'eval_runtime': 22.8923, 'eval_samples_per_second': 174.731, 'eval_steps_per_second': 87.366, 'epoch': 0.25}
{'loss': 0.4485, 'learning_rate': 1.475263157894737e-05, 'epoch': 0.3}


{'eval_loss': 0.28124547004699707, 'eval_accuracy': 0.92475, 'eval_runtime': 22.882, 'eval_samples_per_second': 174.81, 'eval_steps_per_second': 87.405, 'epoch': 0.3}
{'loss': 0.3102, 'learning_rate': 1.3700000000000003e-05, 'epoch': 0.35}


{'eval_loss': 0.3488783836364746, 'eval_accuracy': 0.92575, 'eval_runtime': 22.6802, 'eval_samples_per_second': 176.365, 'eval_steps_per_second': 88.182, 'epoch': 0.35}
{'loss': 0.4025, 'learning_rate': 1.2650000000000001e-05, 'epoch': 0.4}


{'eval_loss': 0.35251837968826294, 'eval_accuracy': 0.926, 'eval_runtime': 22.8855, 'eval_samples_per_second': 174.783, 'eval_steps_per_second': 87.392, 'epoch': 0.4}
{'loss': 0.3593, 'learning_rate': 1.1597368421052632e-05, 'epoch': 0.45}


{'eval_loss': 0.36412572860717773, 'eval_accuracy': 0.92525, 'eval_runtime': 22.6271, 'eval_samples_per_second': 176.779, 'eval_steps_per_second': 88.39, 'epoch': 0.45}
{'loss': 0.418, 'learning_rate': 1.0544736842105263e-05, 'epoch': 0.5}


{'eval_loss': 0.36467546224594116, 'eval_accuracy': 0.9235, 'eval_runtime': 22.7277, 'eval_samples_per_second': 175.997, 'eval_steps_per_second': 87.998, 'epoch': 0.5}
{'loss': 0.3285, 'learning_rate': 9.492105263157896e-06, 'epoch': 0.55}


{'eval_loss': 0.2993335425853729, 'eval_accuracy': 0.9365, 'eval_runtime': 22.7508, 'eval_samples_per_second': 175.818, 'eval_steps_per_second': 87.909, 'epoch': 0.55}
{'loss': 0.3695, 'learning_rate': 8.439473684210527e-06, 'epoch': 0.6}


{'eval_loss': 0.3591051697731018, 'eval_accuracy': 0.92075, 'eval_runtime': 22.7279, 'eval_samples_per_second': 175.995, 'eval_steps_per_second': 87.997, 'epoch': 0.6}
{'loss': 0.3301, 'learning_rate': 7.386842105263159e-06, 'epoch': 0.65}


{'eval_loss': 0.41080164909362793, 'eval_accuracy': 0.923, 'eval_runtime': 22.6988, 'eval_samples_per_second': 176.221, 'eval_steps_per_second': 88.111, 'epoch': 0.65}
{'loss': 0.3333, 'learning_rate': 6.336842105263158e-06, 'epoch': 0.7}


{'eval_loss': 0.3121863603591919, 'eval_accuracy': 0.93775, 'eval_runtime': 23.7829, 'eval_samples_per_second': 168.188, 'eval_steps_per_second': 84.094, 'epoch': 0.7}
{'loss': 0.2931, 'learning_rate': 5.2842105263157896e-06, 'epoch': 0.75}


{'eval_loss': 0.3154749870300293, 'eval_accuracy': 0.94, 'eval_runtime': 22.7378, 'eval_samples_per_second': 175.918, 'eval_steps_per_second': 87.959, 'epoch': 0.75}
{'loss': 0.2846, 'learning_rate': 4.2315789473684215e-06, 'epoch': 0.8}


{'eval_loss': 0.3384411036968231, 'eval_accuracy': 0.9375, 'eval_runtime': 23.6456, 'eval_samples_per_second': 169.165, 'eval_steps_per_second': 84.582, 'epoch': 0.8}
{'loss': 0.2903, 'learning_rate': 3.178947368421053e-06, 'epoch': 0.85}


{'eval_loss': 0.30547046661376953, 'eval_accuracy': 0.93875, 'eval_runtime': 22.7999, 'eval_samples_per_second': 175.439, 'eval_steps_per_second': 87.72, 'epoch': 0.85}
{'loss': 0.311, 'learning_rate': 2.1263157894736844e-06, 'epoch': 0.9}


{'eval_loss': 0.29965731501579285, 'eval_accuracy': 0.9385, 'eval_runtime': 22.4892, 'eval_samples_per_second': 177.863, 'eval_steps_per_second': 88.932, 'epoch': 0.9}
{'loss': 0.3016, 'learning_rate': 1.0763157894736843e-06, 'epoch': 0.95}


{'eval_loss': 0.2926977872848511, 'eval_accuracy': 0.94, 'eval_runtime': 23.7508, 'eval_samples_per_second': 168.416, 'eval_steps_per_second': 84.208, 'epoch': 0.95}
{'loss': 0.287, 'learning_rate': 2.3684210526315793e-08, 'epoch': 1.0}


{'eval_loss': 0.29275792837142944, 'eval_accuracy': 0.93975, 'eval_runtime': 24.4119, 'eval_samples_per_second': 163.854, 'eval_steps_per_second': 81.927, 'epoch': 1.0}
{'train_runtime': 881.0343, 'train_samples_per_second': 18.16, 'train_steps_per_second': 9.08, 'train_loss': 0.37740666103363035, 'epoch': 1.0}


TrainOutput(global_step=8000, training_loss=0.37740666103363035, metrics={'train_runtime': 881.0343, 'train_samples_per_second': 18.16, 'train_steps_per_second': 9.08, 'train_loss': 0.37740666103363035, 'epoch': 1.0})

## 10. Sentiment Prediction using custom text


In [21]:
# Tokenize text, run it through the model and print the predicted sentiment
def predict_sentiment(model, tokenizer, text, device):
    model.eval()  # Disable dropout so predictions are deterministic
    tokenized = tokenizer(text, truncation=True, padding=True, return_tensors='pt').to(device)
    with torch.no_grad():  # No gradients are needed for inference
        outputs = model(**tokenized)

    probs = F.softmax(outputs.logits, dim=-1)
    preds = torch.argmax(outputs.logits, dim=-1).item()
    probs_max = probs.max().item()

    prediction = "Positive" if preds == 1 else "Negative"
    print(f'{text}\nSentiment: {prediction}\tProbability: {probs_max*100:.2f}%\n', end="-"*50 + "\n")
    # return prediction, probs_max
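
The "Probability" printed by `predict_sentiment` is the softmax probability of the predicted class, so with two classes it is always at least 50%. The NumPy sketch below shows that calculation on a single pair of invented logits; it mirrors the logic of the function above without needing the model.

```python
import numpy as np

def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical logits in the order [negative, positive]
logits = np.array([-2.0, 3.0])

probs = softmax(logits)
pred = int(np.argmax(probs))
label = "Positive" if pred == 1 else "Negative"
print(f"Sentiment: {label}\tProbability: {probs[pred] * 100:.2f}%")
```

A high value here reflects how far apart the two logits are; it should not be read as a calibrated confidence without further checks.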

In [22]:
texts = [
    "The traffic was horrendous this morning; I was stuck in it for over an hour.",
    "I was extremely disappointed with the quality of the product; it didn't meet my expectations at all.",
    "The customer service at the restaurant was very good; the staff went above and beyond to make us feel welcome.",
    "My recent stay at Paradise Resort was absolutely fantastic! From the moment I arrived, I was greeted with warm smiles and excellent service. The room was spacious, beautifully decorated, and spotlessly clean. I loved the breathtaking view from my balcony overlooking the pool and tropical gardens. The dining options were exceptional, and the resort's facilities were top-notch, offering everything from a fitness center to guided nature walks. Overall, Paradise Resort exceeded all my expectations, and I can't wait to return for another memorable stay!",
    "The movie started off promising, but it quickly went downhill. The plot was confusing, the acting was mediocre, and the ending was unsatisfying.",
    "I had a terrible experience at the restaurant last night. The food was cold, the service was slow, and the staff was rude.",
    "Despite the initial skepticism, I was pleasantly surprised by the performance of the new smartphone. Its sleek design, impressive camera quality, and fast processing speed exceeded my expectations.",
    "The concert was absolutely amazing! The energy of the performers, the enthusiasm of the crowd, and the quality of the music made it an unforgettable experience.",
    "I had high hopes for the book, but it turned out to be a disappointment. The characters were one-dimensional, the plot was predictable, and the writing style was uninspired.",
    "The presentation was well-prepared and delivered with confidence. The speaker engaged the audience effectively and provided valuable insights on the topic.",
    "The service at the hotel was impeccable. The staff was attentive, courteous, and always willing to assist with any request.",
    "The weather during our vacation was dreadful; it rained every day, and we were stuck indoors for most of the trip.",
    "The hiking trail offered breathtaking views of the mountains and lush forests. It was a challenging but rewarding experience.",
    "The customer support team was unhelpful and incompetent. They were unable to resolve my issue and seemed indifferent to my concerns.",
    "The play was a delightful blend of humor, drama, and suspense. The talented cast delivered stellar performances, and the storyline kept me engaged from start to finish.",
    "The new restaurant in town has quickly become my favorite dining spot. The food is delicious, the atmosphere is cozy, and the service is outstanding.",
]

for text in texts:
  predict_sentiment(model, tokenizer, text, device)

The traffic was horrendous this morning; I was stuck in it for over an hour.
Sentiment: Negative	Probability: 99.57%
--------------------------------------------------
I was extremely disappointed with the quality of the product; it didn't meet my expectations at all.
Sentiment: Negative	Probability: 99.78%
--------------------------------------------------
The customer service at the restaurant was very good; the staff went above and beyond to make us feel welcome.
Sentiment: Positive	Probability: 99.25%
--------------------------------------------------
My recent stay at Paradise Resort was absolutely fantastic! From the moment I arrived, I was greeted with warm smiles and excellent service. The room was spacious, beautifully decorated, and spotlessly clean. I loved the breathtaking view from my balcony overlooking the pool and tropical gardens. The dining options were exceptional, and the resort's facilities were top-notch, offering everything from a fitness center to guided nature 

In [23]:
# Examples of complex reviews that contain both positive and negative sentiment
texts = ["Despite facing numerous challenges and setbacks, the team worked tirelessly and managed to exceed all expectations, achieving remarkable success. However, despite their best efforts, the project encountered multiple setbacks, ultimately leading to its failure and significant financial losses.",
         "The hotel room was clean and comfortable, and the amenities were well-maintained. However, the noise from the nearby construction site was disruptive due to which i could not focus when working.",
         "The movie had an intriguing plot and captivating visuals, but the sound quality was poor, making it difficult to fully enjoy the experience."]
for text in texts:
  predict_sentiment(model, tokenizer, text, device)

Despite facing numerous challenges and setbacks, the team worked tirelessly and managed to exceed all expectations, achieving remarkable success. However, despite their best efforts, the project encountered multiple setbacks, ultimately leading to its failure and significant financial losses.
Sentiment: Negative	Probability: 98.70%
--------------------------------------------------
The hotel room was clean and comfortable, and the amenities were well-maintained. However, the noise from the nearby construction site was disruptive due to which i could not focus when working.
Sentiment: Negative	Probability: 99.86%
--------------------------------------------------
The movie had an intriguing plot and captivating visuals, but the sound quality was poor, making it difficult to fully enjoy the experience.
Sentiment: Negative	Probability: 99.97%
--------------------------------------------------


In [23]:
# Breaking the above examples down into parts
texts = ["Despite facing numerous challenges and setbacks, the team worked tirelessly and managed to exceed all expectations, achieving remarkable success.",
         "However, despite their best efforts, the project encountered multiple setbacks, ultimately leading to its failure and significant financial losses.",
         "The hotel room was clean and comfortable, and the amenities were well-maintained.",
         "However, the noise from the nearby construction site was disruptive due to which i could not focus when working."]

for text in texts:
  predict_sentiment(model, tokenizer, text, device)

Despite facing numerous challenges and setbacks, the team worked tirelessly and managed to exceed all expectations, achieving remarkable success.
Sentiment: Positive	Probability: 99.13%
--------------------------------------------------
However, despite their best efforts, the project encountered multiple setbacks, ultimately leading to its failure and significant financial losses.
Sentiment: Negative	Probability: 98.69%
--------------------------------------------------
The hotel room was clean and comfortable, and the amenities were well-maintained.
Sentiment: Positive	Probability: 99.32%
--------------------------------------------------
However, the noise from the nearby construction site was disruptive due to which i could not focus when working.
Sentiment: Negative	Probability: 99.62%
--------------------------------------------------


Looks like **BERT** classifies each part correctly once the mixed reviews are split up: the positive parts (clean and comfortable room, well-maintained amenities) are labelled Positive and the negative parts (disruptive noise from construction) are labelled Negative. On the full mixed reviews, however, the model has to pick one of two labels, and in all three examples above it chose Negative with more than 98% probability, so a single prediction does not expose the positive aspects.

Overall, the fine-tuned model handles clear-cut text well, while reviews with mixed sentiment are better analysed part by part, as done above, than with a single binary label.
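
The part-by-part analysis above was done by splitting the reviews by hand. A minimal sketch of automating it follows, assuming a naive regex sentence splitter and any `classify` callable that returns a label for one sentence. `split_sentences`, `summarize_sentiment` and `toy_classify` are hypothetical names; `toy_classify` is a keyword rule standing in for the model so that the sketch runs on its own.

```python
import re
from collections import Counter

def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' when followed by whitespace
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def summarize_sentiment(text, classify):
    """Classify each sentence separately and count how often each label occurs."""
    labels = [classify(sentence) for sentence in split_sentences(text)]
    return labels, Counter(labels)

# Stand-in classifier (a keyword rule, NOT the fine-tuned model)
def toy_classify(sentence):
    return "Negative" if "noise" in sentence.lower() else "Positive"

review = ("The hotel room was clean and comfortable, and the amenities were well-maintained. "
          "However, the noise from the nearby construction site was disruptive.")

labels, counts = summarize_sentiment(review, toy_classify)
print(labels, dict(counts))
```

To use the fine-tuned model instead, `classify` could wrap `predict_sentiment` once that function returns its prediction rather than only printing it.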

## 11. Evaluate & Plot Confusion Matrix

In [24]:
from sklearn.metrics import confusion_matrix
from tqdm.auto import tqdm
import seaborn as sns
import matplotlib.pyplot as plt


In [None]:
# Predict on the X_test dataset and plot the confusion matrix
def predict_sentiment_and_evaluate(model, tokenizer, X_test, y_test, device):
  model.eval()  # Disable dropout for inference
  predictions = []

  for text in tqdm(X_test):
    # Tokenize and forward pass to model
    tokenized = tokenizer(text, truncation=True, padding=True, return_tensors='pt').to(device)
    with torch.no_grad():  # No gradients are needed for inference
      outputs = model(**tokenized)

    # Predicted class id (0 = negative, 1 = positive)
    preds = torch.argmax(outputs.logits, dim=-1).item()
    predictions.append(preds)

  # Confusion Matrix
  cm = confusion_matrix(y_test, predictions)
  plt.figure(figsize=(6, 4))
  sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", cbar=False,
              xticklabels=['Negative', 'Positive'],
              yticklabels=['Negative', 'Positive'])
  plt.xlabel('Predicted')
  plt.ylabel('Actual')
  plt.title('Confusion Matrix')
  plt.show()

In [None]:
predict_sentiment_and_evaluate(model, tokenizer, X_test, y_test, device)
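
A confusion matrix also yields the usual summary metrics. The sketch below derives accuracy, precision, recall and F1 for the positive class from a 2x2 matrix in scikit-learn's layout (rows are actual labels, columns are predicted labels, ordered negative then positive). The counts are invented for illustration and are not results from this notebook; `binary_metrics` is a hypothetical helper.

```python
import numpy as np

def binary_metrics(cm):
    """Metrics for the positive class from a 2x2 confusion matrix [[tn, fp], [fn, tp]]."""
    tn, fp = cm[0]
    fn, tp = cm[1]
    accuracy = (tp + tn) / cm.sum()
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that were found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts, not taken from the model's predictions
cm = np.array([[90, 10],
               [5, 95]])

metrics = binary_metrics(cm)
print(metrics)
```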