Use LLMs to solve diverse language tasks for a car dealership company.
1. Sentiment analysis of customer reviews
2. Text translation 
3. Extractive QA
4. Text summarization

In [1]:
# install Hugging Face libraries: `transformers` and `evaluate`.
!pip install transformers
!pip install evaluate

from transformers import logging
logging.set_verbosity(logging.WARNING)

Defaulting to user installation because normal site-packages is not writeable
Defaulting to user installation because normal site-packages is not writeable


In [2]:
import pandas as pd
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformers import pipeline   # used for the pipeline-based approach (In [8])
import evaluate

# 1: Sentiment Analysis
Use a pre-trained LLM to classify the sentiment of the five car reviews in the car_reviews.csv dataset, and evaluate the classification accuracy and F1 score of predictions.
Store the model outputs in predicted_labels, then extract the labels and map them onto a list of {0,1} integer binary labels called predictions.
Store the calculated metrics in accuracy_result and f1_result.

In [3]:
# `error_bad_lines` was deprecated in pandas 1.3 and removed in 2.0; `on_bad_lines='skip'` replaces it
df = pd.read_csv("data/car_reviews.csv", delimiter=';', on_bad_lines='skip')
df

Unnamed: 0,Review,Class
0,I am very satisfied with my 2014 Nissan NV SL....,POSITIVE
1,The car is fine. It's a bit loud and not very ...,NEGATIVE
2,"My first foreign car. Love it, I would buy ano...",POSITIVE
3,I've come across numerous reviews praising the...,NEGATIVE
4,I've been dreaming of owning an SUV for quite ...,POSITIVE


In [4]:
# Pull the review texts and their gold labels out of the DataFrame
reviews = df['Review'].tolist()
classes = df['Class'].tolist()
classes

['POSITIVE', 'NEGATIVE', 'POSITIVE', 'NEGATIVE', 'POSITIVE']

In [5]:
# USING AutoClasses
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

inputs = tokenizer(reviews, return_tensors='pt', padding=True, truncation=True)
inputs

{'input_ids': tensor([[ 101, 1045, 2572,  ..., 2009, 1012,  102],
        [ 101, 1996, 2482,  ...,    0,    0,    0],
        [ 101, 2026, 2034,  ...,    0,    0,    0],
        [ 101, 1045, 1005,  ...,    0,    0,    0],
        [ 101, 1045, 1005,  ...,    0,    0,    0]]), 'attention_mask': tensor([[1, 1, 1,  ..., 1, 1, 1],
        [1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 0, 0, 0]])}
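
The zeros in `input_ids` and `attention_mask` above come from padding: shorter reviews are filled up to the length of the longest one in the batch, and the mask marks which positions hold real tokens. A minimal pure-Python sketch of that idea follows. `pad_batch` is a hypothetical helper, the token ids are toy values, and the pad id of 0 is assumed from the output above; the real tokenizer does more (special tokens, truncation to the model's maximum length).

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad token-id lists to a common length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)             # fill with the pad id
        attention_mask.append([1] * len(seq) + [0] * n_pad)  # 1 = real token, 0 = padding
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# Toy batch: one 4-token sequence and one 3-token sequence
toy_batch = pad_batch([[101, 1045, 2572, 102], [101, 1996, 102]])
print(toy_batch["input_ids"])
print(toy_batch["attention_mask"])
```

The model ignores positions where the mask is 0, so padding should not change the prediction for a shorter review.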

In [6]:
outputs = model(**inputs)
logits = outputs.logits
print(logits.size())
print(logits)

torch.Size([5, 2])
tensor([[-1.2737,  1.3038],
        [-0.8886,  0.9725],
        [-3.6580,  3.8730],
        [ 2.7305, -2.3037],
        [-3.2034,  3.4078]], grad_fn=<AddmmBackward0>)


In [7]:
predictions = torch.argmax(logits, dim = 1).tolist()
predicted_labels = ["POSITIVE" if pred_class == 1 else "NEGATIVE" for pred_class in predictions]

print(predicted_labels)
print(predictions)

['POSITIVE', 'POSITIVE', 'POSITIVE', 'NEGATIVE', 'POSITIVE']
[1, 1, 1, 0, 1]


In [8]:
# USING PIPELINES
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
text_classifier = pipeline("text-classification", model = model_name)
predicted_labels = text_classifier(reviews)

predictions = [1 if pred['label'] == 'POSITIVE' else 0 for pred in predicted_labels]

print(predicted_labels)
print(predictions)


[{'label': 'POSITIVE', 'score': 0.9293975830078125}, {'label': 'POSITIVE', 'score': 0.8654279708862305}, {'label': 'POSITIVE', 'score': 0.9994640946388245}, {'label': 'NEGATIVE', 'score': 0.9935314059257507}, {'label': 'POSITIVE', 'score': 0.9986565113067627}]
[1, 1, 1, 0, 1]
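
The pipeline scores above are the softmax of the logits printed by the AutoClasses approach, and the label is the argmax. The sketch below reproduces that relationship in plain Python from the rounded logits shown earlier. It assumes index 0 is NEGATIVE and index 1 is POSITIVE, as in the mapping used in In [7]; `softmax` and `logit_rows` are illustrative names.

```python
import math

def softmax(row):
    """Turn a list of logits into probabilities that sum to 1."""
    shifted = [x - max(row) for x in row]   # subtract the max for numerical stability
    exps = [math.exp(x) for x in shifted]
    total = sum(exps)
    return [e / total for e in exps]

# Logits copied (rounded to 4 decimals) from the output of In [6]
logit_rows = [
    [-1.2737, 1.3038],
    [-0.8886, 0.9725],
    [-3.6580, 3.8730],
    [2.7305, -2.3037],
    [-3.2034, 3.4078],
]
labels = ["NEGATIVE", "POSITIVE"]  # assumed index-to-label mapping

results = []
for row in logit_rows:
    probs = softmax(row)
    best = max(range(len(probs)), key=probs.__getitem__)  # argmax
    results.append((labels[best], probs[best]))

for label, score in results:
    print(label, round(score, 4))
```

Because the logits are rounded, the scores agree with the pipeline output only to about three decimal places.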


In [9]:
# Evaluation
accuracy_metric = evaluate.load("accuracy")
f1_metric = evaluate.load("f1")

references = [1 if curr_class == "POSITIVE" else 0 for curr_class in classes]

accuracy_result = accuracy_metric.compute(predictions = predictions, references = references)
f1_result = f1_metric.compute(predictions = predictions, references = references)

print(accuracy_result)
print(f1_result)

{'accuracy': 0.8}
{'f1': 0.8571428571428571}
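
To see where these two numbers come from, the metrics can be recomputed by hand from the confusion counts. This is a minimal sketch using the predictions and references shown above; `y_true` and `y_pred` are illustrative names, and POSITIVE (1) is treated as the positive class for F1.

```python
y_true = [1, 0, 1, 0, 1]   # references built from the Class column
y_pred = [1, 1, 1, 0, 1]   # model predictions

# Confusion counts for the positive class
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, f1)
```

The single error is the second review, a false positive: precision drops to 3/4 while recall stays at 1, which gives F1 = 6/7.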


# 2: Text Translation
The company has recently begun attracting customers from Spain. Extract the first two sentences of the first review in the dataset and pass them to an English-to-Spanish translation LLM. Calculate the BLEU score to assess translation quality, using the content of reference_translations.txt as references.
Store the translated text generated by the LLM in translated_review.
Store the BLEU score metric result in bleu_score.

In [10]:
# trans_input = ["I am very satisfied with my 2014 Nissan NV SL.", "I use this van for my business deliveries and personal use."]

trans_input = reviews[0].split('.')[:2]
print(trans_input)

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-es")

['I am very satisfied with my 2014 Nissan NV SL', ' I use this van for my business deliveries and personal use']
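
Note that `split('.')` drops the periods and leaves a leading space on the second sentence. One possible alternative, sketched below, splits on whitespace that follows sentence-ending punctuation so each sentence keeps its period. `first_sentences` is a hypothetical helper, and the third sentence in `sample` is a stand-in for the rest of the review; the regex is a simple heuristic that would also split after abbreviations such as "Dr.".

```python
import re

def first_sentences(text, n=2):
    """Return the first n sentences, keeping their end punctuation."""
    # Split on whitespace that follows '.', '!' or '?'
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return parts[:n]

sample = (
    "I am very satisfied with my 2014 Nissan NV SL. "
    "I use this van for my business deliveries and personal use. "
    "A third sentence stands in for the rest of the review."
)
sentences = first_sentences(sample)
print(sentences)
```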


In [11]:
outputs = translator(trans_input)
# Join the two translated sentences back into one review
translated_review = ". ".join(sentence['translation_text'] for sentence in outputs)
translated_review

'Estoy muy satisfecho con mi Nissan NV SL 2014. Uso esta camioneta para mis entregas de negocios y uso personal'

In [12]:
# Evaluation

with open('data/reference_translations.txt', 'r') as file:
    # Read one reference translation per line; wrap them in an outer list
    # because BLEU expects a list of references for each prediction
    trans_references = [[line.strip() for line in file.readlines()]]
print(trans_references)

bleu_metric = evaluate.load("bleu")
bleu_score = bleu_metric.compute(predictions = [translated_review], references = trans_references)
print(bleu_score)

[['Estoy muy satisfecho con mi Nissan NV SL 2014. Utilizo esta camioneta para mis entregas comerciales y uso personal.', 'Estoy muy satisfecho con mi Nissan NV SL 2014. Uso esta furgoneta para mis entregas comerciales y uso personal.']]
{'bleu': 0.7671176261207451, 'precisions': [0.9047619047619048, 0.85, 0.7368421052631579, 0.6111111111111112], 'brevity_penalty': 1.0, 'length_ratio': 1.0, 'translation_length': 21, 'reference_length': 21}
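
The fields in this result fit together as follows: BLEU is the geometric mean of the 1- to 4-gram precisions, multiplied by a brevity penalty that only kicks in when the translation is shorter than the reference. The sketch below recombines the reported values in plain Python; it illustrates the standard formula with uniform n-gram weights and is not the library's own code.

```python
import math

# Values copied from the bleu_score output above
precisions = [0.9047619047619048, 0.85, 0.7368421052631579, 0.6111111111111112]
translation_length = 21
reference_length = 21

# Brevity penalty: 1 if the translation is at least as long as the reference
if translation_length >= reference_length:
    brevity_penalty = 1.0
else:
    brevity_penalty = math.exp(1 - reference_length / translation_length)

# Geometric mean of the n-gram precisions (uniform weights of 1/4)
geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))

bleu = brevity_penalty * geo_mean
print(bleu)
```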


# 3: Extractive QA
The second review in the dataset emphasizes brand aspects. Load an extractive QA LLM such as "deepset/minilm-uncased-squad2", ask it the question "What did he like about the brand?" with the review as context, and obtain an answer.

In [13]:
question = "What did he like about the brand?"
context = reviews[1]
context

"The car is fine. It's a bit loud and not very powerful. On one hand, compared to its peers, the interior is well-built. The transmission failed a few years ago, and the dealer replaced it under warranty with no issues. Now, about 60k miles later, the transmission is failing again. It sounds like a truck, and the issues are well-documented. The dealer tells me it is normal, refusing to do anything to resolve the issue. After owning the car for 4 years, there are many other vehicles I would purchase over this one. Initially, I really liked what the brand is about: ride quality, reliability, etc. But I will not purchase another one. Despite these concerns, I must say, the level of comfort in the car has always been satisfactory, but not worth the rest of issues found."

In [14]:
qa_model = pipeline("question-answering", model = "deepset/minilm-uncased-squad2")
answer = qa_model(question = question, context = context)
answer

{'score': 0.47736144065856934,
 'start': 569,
 'end': 594,
 'answer': 'ride quality, reliability'}
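
In an extractive QA result, `start` and `end` are character offsets into the context, so slicing the context with them returns the answer span. The sketch below illustrates this on a shortened context (only the sentence containing the answer), so its offsets differ from the 569 and 594 reported above; `short_context` and `span` are illustrative names.

```python
# Shortened context: just the sentence that contains the answer
short_context = (
    "Initially, I really liked what the brand is about: "
    "ride quality, reliability, etc."
)
answer_text = "ride quality, reliability"

# Locate the answer and express it as character offsets, like the pipeline does
start = short_context.find(answer_text)
end = start + len(answer_text)

span = short_context[start:end]   # slicing with the offsets recovers the answer
print(start, end, span)
```

On the full review, `context[answer['start']:answer['end']]` should return the same string as `answer['answer']`.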

# 4: Text Summarization
Summarize the last review in the dataset to approximately 50-55 tokens. Store the result in the variable summarized_text.

In [5]:
long_review = reviews[-1]
print(f"Original text:\n{long_review}")

summarization_model = pipeline("summarization", model="facebook/bart-large-cnn")
outputs = summarization_model(long_review, min_length = 50, max_length=55, clean_up_tokenization_spaces=True)
summarized_text = outputs[0]['summary_text']

print(f"Summarized text:\n{summarized_text}")

Original text:
I've been dreaming of owning an SUV for quite a while, but I've been driving cars that were already paid for during an extended period. I ultimately made the decision to transition to a brand-new car, which, of course, involved taking on new payments. However, given that I don't drive extensively, I was inclined to avoid a substantial financial commitment. The Nissan Rogue provides me with the desired SUV experience without burdening me with an exorbitant payment; the financial arrangement is quite reasonable. Handling and styling are great; I have hauled 12 bags of mulch in the back with the seats down and could have held more. I am VERY satisfied overall. I find myself needing to exercise extra caution when making lane changes, particularly owing to the blind spots resulting from the small side windows situated towards the rear of the vehicle. To address this concern, I am actively engaged in making adjustments to my mirrors and consciously reducing the frequency of la