![image](car.jpeg)

**Car-ing is sharing**, an auto dealership offering car sales and rentals, is taking its services to the next level thanks to **Large Language Models (LLMs)**.

As their newly recruited AI and NLP developer, you've been asked to prototype a chatbot app with multiple functionalities that not only assist customers but also provide support to human agents in the company.

The solution should receive textual prompts and use a variety of pre-trained Hugging Face LLMs to handle a series of tasks, such as classifying the sentiment of a car review, answering a customer question, and summarizing or translating text.


In [88]:
# Import necessary packages
import pandas as pd
import torch
from transformers import pipeline
import evaluate

from transformers import logging
logging.set_verbosity(logging.WARNING)

# Sentiment analysis

In [89]:
reviews = pd.read_csv("data/car_reviews.csv", sep=";")
label_map = {"POSITIVE": 1, "NEGATIVE": 0}
reviews["Class"] = reviews["Class"].map(label_map)
reviews

Unnamed: 0,Review,Class
0,I am very satisfied with my 2014 Nissan NV SL....,1
1,The car is fine. It's a bit loud and not very ...,0
2,"My first foreign car. Love it, I would buy ano...",1
3,I've come across numerous reviews praising the...,0
4,I've been dreaming of owning an SUV for quite ...,1


In [90]:
# Load a DistilBERT model fine-tuned on SST-2 for binary sentiment classification
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)


Device set to use cpu


In [91]:
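# Classify every review; truncation=True clips inputs that exceed the model's maximum length
predicted_labels = classifier(reviews["Review"].tolist(), truncation=True)
# Convert "POSITIVE"/"NEGATIVE" labels to 1/0 so they match the Class column
predictions = [label_map[p["label"]] for p in predicted_labels]
true_class = reviews["Class"].tolist()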

In [92]:
accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

accuracy_dict = accuracy.compute(predictions=predictions, references=true_class)
f1_dict = f1.compute(predictions=predictions, references=true_class)

accuracy_result = accuracy_dict["accuracy"]
f1_result = f1_dict["f1"]

print("Accuracy:", accuracy_result)
print("F1 score:", f1_result)

Accuracy: 0.8
F1 score: 0.8571428571428571
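
To make the two scores concrete, the sketch below recomputes accuracy and F1 in plain Python. It assumes the dataset holds only the five reviews displayed above. The `predictions` list is hypothetical: the notebook does not print the per-review predictions, so this is just one assignment that is consistent with the reported scores (a single negative review misclassified as positive).

```python
# True labels for the five reviews displayed above (1 = positive, 0 = negative).
true_class = [1, 0, 1, 0, 1]
# Hypothetical predictions (assumption): one negative review predicted as positive.
predictions = [1, 1, 1, 0, 1]

pairs = list(zip(predictions, true_class))
tp = sum(p == 1 and t == 1 for p, t in pairs)  # true positives
fp = sum(p == 1 and t == 0 for p, t in pairs)  # false positives
fn = sum(p == 0 and t == 1 for p, t in pairs)  # false negatives
correct = sum(p == t for p, t in pairs)

accuracy = correct / len(true_class)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print("Accuracy:", accuracy)
print("F1 score:", f1)
```

With four of five reviews correct, accuracy is 4/5. With three true positives, one false positive, and no false negatives, F1 works out to 6/7, which matches the value reported above.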


# Translation

In [93]:
translator = pipeline("translation_en_to_es", model="Helsinki-NLP/opus-mt-en-es")

Device set to use cpu


In [94]:
text = reviews['Review'][0]
# Naive sentence split on "."; strip each piece so the rejoined text has single spaces
sentences = [s.strip() for s in text.split('.')]
first_two = '. '.join(sentences[:2]) + '.'
print(first_two)

I am very satisfied with my 2014 Nissan NV SL. I use this van for my business deliveries and personal use.


In [95]:
with open("data/reference_translations.txt", "r", encoding="utf-8") as f:
    ref_text = f.read()

In [96]:
ref_text

'Estoy muy satisfecho con mi Nissan NV SL 2014. Utilizo esta camioneta para mis entregas comerciales y uso personal.\nEstoy muy satisfecho con mi Nissan NV SL 2014. Uso esta furgoneta para mis entregas comerciales y uso personal.'

In [97]:
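# Note: the full review (`text`) is passed here rather than `first_two`,
# so the translation is cut off once it reaches max_length=55 tokens.
translation = translator(text, max_length=55)
translated_review = translation[0]['translation_text']
print(translated_review)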

Your input_length: 365 is bigger than 0.9 * max_length: 55. You might consider increasing your max_length manually, e.g. translator('...', max_length=400)


Estoy muy satisfecho con mi 2014 Nissan NV SL. Utilizo esta furgoneta para mis entregas de negocios y uso personal. Camping, viajes por carretera, etc. No tenemos ningún niño así que guardo la mayoría de los asientos en mi almacén. Quería


In [98]:
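# ref_text holds two reference translations separated by a newline;
# wrapping it as [[ref_text]] passes them to BLEU as one combined reference string.
reference = [[ref_text]]

bleu = evaluate.load("bleu")

bleu_score = bleu.compute(predictions=[translated_review], references=reference)
print("BLEU score:", bleu_score["bleu"])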

BLEU score: 0.2519515618425061
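
The reference file contains two alternative translations, one per line. BLEU can score a prediction against several references, so one option is to split the file contents into separate strings. The sketch below only shows that splitting step, using the file contents printed earlier. The variable name `references` is an assumption, not part of the original notebook.

```python
# Contents of data/reference_translations.txt as printed in the notebook output.
ref_text = (
    "Estoy muy satisfecho con mi Nissan NV SL 2014. Utilizo esta camioneta "
    "para mis entregas comerciales y uso personal.\n"
    "Estoy muy satisfecho con mi Nissan NV SL 2014. Uso esta furgoneta "
    "para mis entregas comerciales y uso personal."
)

# One inner list per prediction; each inner list holds every acceptable
# reference translation for that prediction.
references = [[line.strip() for line in ref_text.splitlines() if line.strip()]]

for ref in references[0]:
    print(ref)
```

Passing this list as `references=references` to `bleu.compute` would compare the prediction against both translations. The resulting score would differ from the one reported above, so no value is shown here.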


# Question & Answer

In [99]:
q_a_model = pipeline("question-answering", model="deepset/minilm-uncased-squad2")

Some weights of the model checkpoint at deepset/minilm-uncased-squad2 were not used when initializing BertForQuestionAnswering: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cpu


In [100]:
context = reviews['Review'][1]
question = "What did he like about the brand?"

result = q_a_model(question=question, context=context)
answer = result['answer']

print(f"Answer: {answer} (score: {result['score']:.4f})")

Answer: ride quality, reliability (score: 0.4774)
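
The question-answering pipeline returns a confidence score alongside the answer. Since the chatbot is also meant to support human agents, one possible use of that score is to hand off low-confidence answers. The sketch below is illustrative only: the helper `answer_or_escalate` and the 0.5 threshold are assumptions, and the `result` dictionary simply reuses the answer and score printed above.

```python
def answer_or_escalate(result, min_score=0.5):
    """Return the model's answer, or a hand-off message when confidence is low.

    `min_score` is a hypothetical threshold; a suitable value would need tuning.
    """
    if result["score"] >= min_score:
        return result["answer"]
    return "Not confident enough; forwarding to a human agent."


# Answer and score as printed in the cell output above.
result = {"answer": "ride quality, reliability", "score": 0.4774}

print(answer_or_escalate(result))                 # below the default threshold
print(answer_or_escalate(result, min_score=0.4))  # above a looser threshold
```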


# Summarization

In [101]:
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

Device set to use cpu


In [102]:
text = reviews['Review'][4]

summary = summarizer(text, max_length=55, min_length=50, do_sample=False)
summarized_text = summary[0]['summary_text']

print("Summary:", summarized_text)

Summary: The Nissan Rogue provides me with the desired SUV experience without burdening me with an exorbitant payment. Handling and styling are great; I have hauled 12 bags of mulch in the back with the seats down and could have held more. The engine delivers strong
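
The summary above stops mid-sentence because generation ends once `max_length` is reached. A minimal post-processing sketch, assuming it is acceptable to drop the unfinished trailing sentence, is shown below. The helper `trim_to_last_sentence` is hypothetical and uses a naive punctuation search, so abbreviations or decimals at the end of the text could trip it up.

```python
def trim_to_last_sentence(text):
    """Cut text after its last sentence-ending punctuation mark, if any."""
    last = max(text.rfind(ch) for ch in ".!?")
    # rfind returns -1 when a character is absent; keep the text unchanged then.
    return text[: last + 1] if last != -1 else text


# Summary text as printed in the cell output above.
summarized_text = (
    "The Nissan Rogue provides me with the desired SUV experience without "
    "burdening me with an exorbitant payment. Handling and styling are great; "
    "I have hauled 12 bags of mulch in the back with the seats down and could "
    "have held more. The engine delivers strong"
)

print("Summary:", trim_to_last_sentence(summarized_text))
```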
