# NLP Project - Model Testing
## Sentiment Analysis with BERT on Movie Reviews

[IMDB Dataset of 50K Movie Reviews](https://www.kaggle.com/datasets/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews)

Team Members:
- Chihabeddine Zitouni
- Patrick Molina
- Małgorzata Gierdewicz

In [1]:
import unittest
import pandas as pd
from transformers import BertTokenizer
import subprocess
import sys

import Utils as utils

## 1. Unit Testing

In [2]:
tc = unittest.TestCase()

- Test cleaning HTML from text

In [3]:
raw_text = "<p>Hello!! This is <b>GREAT</b> movie. :)</p>"
expected = "hello this is great movie"

result = utils.clean_text(raw_text)
tc.assertEqual(result, expected)
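
The implementation of `utils.clean_text` is not shown in this notebook. As a minimal sketch of a function that would satisfy the check above, the stand-in below strips HTML tags, lowercases the text, drops punctuation, and collapses whitespace. The regular expressions and the order of the steps are assumptions for illustration, not the project's actual code.

```python
import re


def clean_text(text: str) -> str:
    """Hypothetical stand-in for utils.clean_text (assumed behaviour)."""
    # Replace HTML tags such as <p> or </b> with a space.
    text = re.sub(r"<[^>]+>", " ", text)
    # Lowercase so that "GREAT" and "great" compare equal.
    text = text.lower()
    # Replace anything that is not a letter, digit, or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Collapse runs of whitespace and trim both ends.
    return " ".join(text.split())
```

With this stand-in, the sample review in the cell above reduces to `hello this is great movie`, which matches the `expected` string used in the assertion.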

- Test label mapping

In [4]:
df = pd.DataFrame({'sentiment': ['positive', 'negative', 'positive']})
mapping = {'positive': 1, 'negative': 0}
mapped = utils.map_labels(df.copy(), 'sentiment', mapping)
tc.assertListEqual(mapped['sentiment'].tolist(), [1, 0, 1])

## 2. Flake8

In [5]:
%pip install flake8



In [6]:
result = subprocess.run([sys.executable, "-m", "flake8", "Utils.py"], capture_output=True, text=True)
print(result.stdout)




In [7]:
result = subprocess.run([sys.executable, "-m", "flake8", "train_model.py"], capture_output=True, text=True)
print(result.stdout)
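
The cells above print only `stdout`, so an empty output can mean either "no issues" or "the tool failed to start", because a missing module reports its error on `stderr` with a non-zero exit code. The sketch below keeps the exit code and both streams so the two cases can be told apart. The names `CheckResult`, `run_module`, and `report` are hypothetical helpers, not part of the project.

```python
import subprocess
import sys
from typing import NamedTuple


class CheckResult(NamedTuple):
    returncode: int
    stdout: str
    stderr: str


def run_module(module: str, *args: str) -> CheckResult:
    """Run `python -m <module> <args>` and keep the exit code and both streams."""
    proc = subprocess.run(
        [sys.executable, "-m", module, *args],
        capture_output=True,
        text=True,
    )
    return CheckResult(proc.returncode, proc.stdout, proc.stderr)


def report(result: CheckResult) -> str:
    """Summarize a run: silent success is stated explicitly, anything else is shown."""
    if result.returncode == 0 and not result.stdout.strip():
        return "no issues reported"
    # Either the tool found problems (stdout) or it failed to run (stderr).
    return (result.stdout + result.stderr).strip()
```

In this notebook it could be called as `print(report(run_module("flake8", "Utils.py")))`, and the same call works for `mypy`.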




## 3. mypy

In [8]:
%pip install mypy







In [9]:
result = subprocess.run([sys.executable, "-m", "mypy", "Utils.py"], capture_output=True, text=True)
print(result.stdout)

[1m[92mSuccess: no issues found in 1 source file[0m



In [10]:
result = subprocess.run([sys.executable, "-m", "mypy", "train_model.py"], capture_output=True, text=True)
print(result.stdout)

[1m[92mSuccess: no issues found in 1 source file[0m



## 4. Model Testing with Extra Analysis

In [15]:
import torch
from transformers import BertTokenizer, BertForSequenceClassification

In [22]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

Using device: cuda


In [None]:
MODEL_PATH = './models/sentiment_analysis_model_3EP_1705_153705.pth'
TOKENIZER_PATH = './models/tokenizer'
TEST_DATA_PATH = 'cleaned_splitted_data/test_dataset.csv'
MAX_LENGTH = 128

In [None]:
tokenizer = BertTokenizer.from_pretrained(TOKENIZER_PATH)
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
model.load_state_dict(torch.load(MODEL_PATH, map_location=device))
model.to(device)
model.eval()

In [None]:
test_set = pd.read_csv(TEST_DATA_PATH)
test_set.head()

10000

In [None]:
edge_cases = test_set.copy()
edge_cases = edge_cases.iloc[0:0]

Unnamed: 0,review,sentiment


In [40]:
# Collect every test review whose predicted label differs from its true label.
misclassified = []
for index, row in test_set.iterrows():
    text = row['review']
    real_value = row['sentiment']

    # predict_sentiment receives the tokenizer and the raw text,
    # so the review is not tokenized separately here.
    predicted_class = utils.predict_sentiment(model, tokenizer, device, MAX_LENGTH, text)

    # Convert the predicted class name to the dataset's numeric label
    # (1 = positive, 0 = negative).
    predicted_value = 1 if predicted_class == 'Positive' else 0

    if predicted_value != real_value:
        misclassified.append(row)

# Build the edge-case frame once instead of concatenating inside the loop.
edge_cases = pd.concat([edge_cases, pd.DataFrame(misclassified)], ignore_index=True)

In [51]:
edge_cases.head()

Unnamed: 0,review,sentiment
0,i really liked this summerslam due to the look...,1
1,not many television shows appeal to quite as m...,1
2,the film quickly gets to a major chase scene w...,0
3,jane austen would definitely approve of this o...,1
4,expectations were somewhat high for me when i ...,0


In [49]:
print(f"Model accuracy on test set:\n{100 - (len(edge_cases)/len(test_set)*100)} %")

Model accuracy on test set:
78.41 %
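
The accuracy above is the share of test rows that did not end up in `edge_cases`. As a minimal sketch of the same calculation on plain lists, the helper below also splits the errors into false positives and false negatives, which helps when looking at the edge cases. The function name `summarize_predictions` is hypothetical, and the label convention (1 = positive, 0 = negative) is taken from the mapping used in the unit test.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def summarize_predictions(
    predicted: Sequence[int], actual: Sequence[int]
) -> Tuple[float, List[int], Counter]:
    """Return accuracy in percent, the misclassified positions, and error counts by kind."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("predicted and actual must be non-empty and of equal length")

    # Positions where the prediction disagrees with the true label.
    wrong = [i for i, (p, a) in enumerate(zip(predicted, actual)) if p != a]

    # Accuracy = correct predictions / all predictions, as a percentage.
    accuracy = 100.0 * (len(actual) - len(wrong)) / len(actual)

    # A wrong prediction of 1 is a false positive; a wrong prediction of 0 is a false negative.
    kinds = Counter(
        "false_positive" if predicted[i] == 1 else "false_negative" for i in wrong
    )
    return accuracy, wrong, kinds
```

For example, `summarize_predictions([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])` gives an accuracy of `60.0` with one false positive and one false negative.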


- Test with a personal review

In [50]:
my_review = "This is a great movie. I love it!"
predicted_class = utils.predict_sentiment(model, tokenizer, device, MAX_LENGTH, my_review)
print(f"Predicted sentiment for my review: {predicted_class}")

Predicted sentiment for my review: Positive
