![image](car.jpeg)

## Before you start

To complete the project, you may need to install the Hugging Face libraries `transformers` and `evaluate`.

In [110]:
!pip install transformers
!pip install evaluate

from transformers import logging
logging.set_verbosity(logging.WARNING)

In [111]:
!pip install pandas
import pandas as pd

# Load the car reviews dataset
file = "data/car_reviews.csv"
df = pd.read_csv(file, delimiter=';')

In [112]:
df.head()

| Unnamed: 0 | Review | Class |
|---|---|---|
| 0 | I am very satisfied with my 2014 Nissan NV SL.... | POSITIVE |
| 1 | The car is fine. It's a bit loud and not very ... | NEGATIVE |
| 2 | My first foreign car. Love it, I would buy ano... | POSITIVE |
| 3 | I've come across numerous reviews praising the... | NEGATIVE |
| 4 | I've been dreaming of owning an SUV for quite ... | POSITIVE |


In [113]:
# Put the car reviews and their associated sentiment labels in two lists
reviews = df['Review'].tolist()
real_labels = df['Class'].tolist()
real_labels

['POSITIVE', 'NEGATIVE', 'POSITIVE', 'NEGATIVE', 'POSITIVE']

In [114]:
from transformers import pipeline
sentiment_pipeline = pipeline('sentiment-analysis', model='distilbert-base-uncased-finetuned-sst-2-english')

predicted_labels = sentiment_pipeline(reviews)
predicted_labels

[{'label': 'POSITIVE', 'score': 0.9293975830078125},
 {'label': 'POSITIVE', 'score': 0.8654279708862305},
 {'label': 'POSITIVE', 'score': 0.9994640946388245},
 {'label': 'NEGATIVE', 'score': 0.9935314059257507},
 {'label': 'POSITIVE', 'score': 0.9986565113067627}]

In [115]:
for review, prediction, label in zip(reviews, predicted_labels, real_labels):
    print(f"Review: {review}\n Real Sentiment Label: {label}\n Predicted Sentiment Label: {prediction['label']}\n Confidence of Prediction: {prediction['score']:.4f}\n")

Review: I am very satisfied with my 2014 Nissan NV SL. I use this van for my business deliveries and personal use. Camping, road trips, etc. We dont have any children so I store most of the seats in my warehouse. I wanted the passenger van for the rear air conditioning. We drove our van from Florida to California for a Cross Country trip in 2014. We averaged about 18 mpg. We drove thru a lot of rain and It was a very comfortable and stable vehicle. The V8 Nissan Titan engine is a 500k mile engine. It has been tested many times by delivery and trucking companies. This is why Nissan gives you a 5 year or 100k mile bumper to bumper warranty. Many people are scared about driving this van because of its size. But with front and rear sonar sensors, large mirrors and the back up camera. It is easy to drive. The front and rear sensors also monitor the front and rear sides of the bumpers making it easier to park close to objects. Our Nissan NV is a Tow Monster. It pulls our 5000 pound travel tr

In [116]:
# Load accuracy and F1 score metrics
import evaluate
accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

In [117]:
# Map categorical sentiment labels into integer labels 0,1
references = [1 if label == "POSITIVE" else 0 for label in real_labels]
predictions = [1 if label['label'] == "POSITIVE" else 0 for label in predicted_labels]
references

[1, 0, 1, 0, 1]

In [118]:
accuracy_result1 = accuracy.compute(references=references, predictions=predictions)
print(accuracy_result1)
accuracy_result = accuracy_result1['accuracy']
accuracy_result

{'accuracy': 0.8}


0.8

In [119]:
f1_result1 = f1.compute(references=references, predictions=predictions)
print(f1_result1)
f1_result = f1_result1['f1']
f1_result

{'f1': 0.8571428571428571}


0.8571428571428571
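
As a sanity check, both scores can be reproduced by hand from the label lists. The sketch below is only illustrative: it hard-codes the integer labels implied by the outputs above (assuming the dataset holds exactly these five reviews), and the names `refs`, `preds`, `manual_accuracy`, and `manual_f1` are introduced here and are not part of the notebook.

```python
# Integer labels copied from the outputs above (1 = POSITIVE, 0 = NEGATIVE)
refs = [1, 0, 1, 0, 1]
preds = [1, 1, 1, 0, 1]

pairs = list(zip(refs, preds))

# Count the confusion-matrix cells that accuracy and F1 depend on
tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # true positives
fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # false positives
fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # false negatives
correct = sum(1 for r, p in pairs if r == p)

manual_accuracy = correct / len(pairs)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
manual_f1 = 2 * precision * recall / (precision + recall)

print(manual_accuracy, manual_f1)
```

With one false positive (the second review) and no false negatives, precision is 3/4 and recall is 1, which gives the same accuracy of 0.8 and F1 of about 0.857 reported by `evaluate`.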

In [120]:
from transformers import pipeline

translation_pipeline = pipeline("translation", model="Helsinki-NLP/opus-mt-en-es")

In [121]:
first_review = reviews[0]

# Split the text into sentences
sentences = first_review.split('.')

# Take the first two sentences, strip surrounding whitespace, rejoin them
# with periods, and append a final period. Joining on '.' without a space
# leaves no space after the first period.
first_two_sentences = '.'.join([sentence.strip() for sentence in sentences[:2]]) + '.'

first_two_sentences

'I am very satisfied with my 2014 Nissan NV SL.I use this van for my business deliveries and personal use.'
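
The output above has no space after the first period, because the pieces are joined with `'.'`. One possible alternative, shown as a minimal sketch, is to split after sentence-ending punctuation with a regular expression so that each piece keeps its own period. The names `sample`, `pieces`, and `first_two_spaced` are illustrative, and the regex is a simple heuristic that would also split after abbreviations such as "etc.".

```python
import re

# Sample text copied from the start of the first review
sample = (
    "I am very satisfied with my 2014 Nissan NV SL. "
    "I use this van for my business deliveries and personal use. "
    "Camping, road trips, etc."
)

# Split on whitespace that follows '.', '!' or '?';
# the lookbehind keeps the punctuation attached to each sentence
pieces = re.split(r"(?<=[.!?])\s+", sample)

# Rejoin the first two sentences with a single space between them
first_two_spaced = " ".join(pieces[:2])
print(first_two_spaced)
```

Feeding a differently spaced string to the translation pipeline may change the translation and therefore the BLEU score, so the results below apply only to the original `first_two_sentences`.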

In [122]:
translated_text = translation_pipeline(first_two_sentences)
print(translated_text)
translated_review = translated_text[0]['translation_text']
translated_review

[{'translation_text': 'Estoy muy satisfecho con mi Nissan NV SL 2014.Utilizo esta camioneta para mis entregas de negocios y uso personal.'}]


'Estoy muy satisfecho con mi Nissan NV SL 2014.Utilizo esta camioneta para mis entregas de negocios y uso personal.'

In [123]:
# Load the Spanish reference translations from file
with open("data/reference_translations.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()
spanish_references = [line.strip() for line in lines]
spanish_references

['Estoy muy satisfecho con mi Nissan NV SL 2014. Utilizo esta camioneta para mis entregas comerciales y uso personal.',
 'Estoy muy satisfecho con mi Nissan NV SL 2014. Uso esta furgoneta para mis entregas comerciales y uso personal.']

In [124]:
# Load and calculate BLEU score metric
bleu = evaluate.load("bleu")
bleu_score = bleu.compute(references=[spanish_references], predictions=[translated_review])
print(bleu_score)
print(f"BLEU Score: {bleu_score['bleu']:.4f}")

{'bleu': 0.8232490471721702, 'precisions': [0.9090909090909091, 0.8571428571428571, 0.8, 0.7368421052631579], 'brevity_penalty': 1.0, 'length_ratio': 1.0476190476190477, 'translation_length': 22, 'reference_length': 21}
BLEU Score: 0.8232
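
The headline BLEU value can be recovered from the other fields in the printed result. With uniform weights, BLEU is the brevity penalty multiplied by the geometric mean of the n-gram precisions. The sketch below copies the numbers from the output above; `geometric_mean` and `manual_bleu` are names introduced here for illustration.

```python
import math

# Values copied from the printed BLEU result above
precisions = [0.9090909090909091, 0.8571428571428571, 0.8, 0.7368421052631579]
brevity_penalty = 1.0

# Geometric mean of the 1-gram to 4-gram precisions, computed in log space
geometric_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))

manual_bleu = brevity_penalty * geometric_mean
print(f"Manual BLEU: {manual_bleu:.4f}")
```

The brevity penalty is 1.0 here because the translation (22 tokens) is not shorter than the reference length (21 tokens), so the score is driven entirely by the n-gram precisions.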


In [125]:
from transformers import pipeline

context = reviews[1]
question = "What did he like about the brand?"

qa_pipeline = pipeline('question-answering', model="deepset/minilm-uncased-squad2")
qa_result = qa_pipeline(question=question, context=context)
answer = qa_result['answer']
print(answer)

ride quality, reliability


In [126]:
from transformers import pipeline

last_review = reviews[-1]

summarize_pipeline = pipeline('summarization', model="cnicu/t5-small-booksum")

summarize_result = summarize_pipeline(last_review, min_length=50, max_length=55)
summarized_text = summarize_result[0]['summary_text']
summarized_text

'the Nissan Rogue provides me with the desired SUV experience without burdening me with an exorbitant payment; the financial arrangement is quite reasonable. I have hauled 12 bags of mulch in the back with the seats down and could have held more. I find'
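
The summary above stops mid-sentence ("I find"), which is likely a consequence of the `max_length=55` cap on generated tokens. If a cleaner ending is wanted, one option is to drop the trailing fragment after the last sentence-ending mark. `trim_to_last_sentence` is a hypothetical helper written for this sketch, not part of `transformers`.

```python
import re

def trim_to_last_sentence(text: str) -> str:
    """Drop any trailing fragment after the last '.', '!' or '?'."""
    matches = list(re.finditer(r"[.!?]", text))
    if not matches:
        # No sentence boundary found: return the text unchanged
        return text
    return text[: matches[-1].end()]

# Summary string copied from the output above
summary = (
    "the Nissan Rogue provides me with the desired SUV experience without "
    "burdening me with an exorbitant payment; the financial arrangement is "
    "quite reasonable. I have hauled 12 bags of mulch in the back with the "
    "seats down and could have held more. I find"
)

trimmed_summary = trim_to_last_sentence(summary)
print(trimmed_summary)
```

Trimming shortens the text, so any length-based check on the summary should be applied to the untrimmed `summarized_text`.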