![image](car.jpeg)

**Car-ing is sharing**, an auto dealership company for car sales and rental, is taking their services to the next level thanks to **Large Language Models (LLMs)**.

As their newly recruited AI and NLP developer, you've been asked to prototype a chatbot app with multiple functionalities that not only assist customers but also provide support to human agents in the company.

The solution should receive textual prompts and use a variety of pre-trained Hugging Face LLMs to respond to a series of tasks, e.g. classifying the sentiment in a car’s text review, answering a customer question, summarizing or translating text, etc.


In [64]:
# Import necessary packages
import pandas as pd
import torch
from transformers import pipeline
from transformers import logging
import pandas as pd
from sklearn.metrics import accuracy_score, f1_score
import evaluate
logging.set_verbosity(logging.WARNING)

### Classify car reviews

In [65]:
sentiment_model = pipeline(task="sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")

Device set to use cpu


In [66]:
# Load the dataset with proper error handling
file_path = "data/car_reviews.csv"
df = pd.read_csv(file_path, delimiter=";")

# Put the car reviews and their associated sentiment labels in two lists
reviews = df['Review'].tolist()
true_labels = df['Class'].tolist()

# Run sentiment analysis on each review
predicted_labels = sentiment_model(reviews)

# Convert model outputs to binary labels: POSITIVE -> 1, NEGATIVE -> 0
predictions = [1 if pred['label'] == 'POSITIVE' else 0 for pred in predicted_labels]
references = [1 if label == "POSITIVE" else 0 for label in true_labels]

# Calculate evaluation metrics
accuracy_result = accuracy_score(true_labels, predictions)
f1_result = f1_score(references, predictions)

print(f"Accuracy: {accuracy_result}")
print(f"F1 Score: {f1_result}")

Accuracy: 0.0
F1 Score: 0.8571428571428571


### Translate a car review

In [67]:
# Importing the necessary translation pipeline from Hugging Face's transformers
translator = pipeline(task="translation_en_to_es", model="Helsinki-NLP/opus-mt-en-es")

Device set to use cpu


In [68]:
# Select the first review from the reviews list
first_review = reviews[0]

# Translate the selected review from English to Spanish with a maximum length of 27 tokens
translated_review = translator(first_review, max_length=27)[0]['translation_text']

Your input_length: 365 is bigger than 0.9 * max_length: 27. You might consider increasing your max_length manually, e.g. translator('...', max_length=400)


In [69]:
translated_review

'Estoy muy satisfecho con mi 2014 Nissan NV SL. Uso esta furgoneta para mis entregas de negocios y uso personal.'

In [70]:
# Open the reference translations file and read all lines
with open("data/reference_translations.txt", 'r') as file:
    lines = file.readlines()

# Clean up the lines by stripping any extra spaces or newline characters
references = [line.strip() for line in lines]

# Load the BLEU score evaluation metric from the evaluate library
bleu = evaluate.load("bleu")

# Compute the BLEU score by comparing the translated review to the reference translations
bleu_score = bleu.compute(predictions=[translated_review], references=[references])

# Print the computed BLEU score (a measure of translation quality)
print(bleu_score['bleu'])

0.6022774485691839


### Ask a question about a car review

In [71]:
# Importing the question-answering pipeline from Hugging Face's transformers
qa_model = pipeline("question-answering", model="deepset/minilm-uncased-squad2")

Some weights of the model checkpoint at deepset/minilm-uncased-squad2 were not used when initializing BertForQuestionAnswering: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cpu


In [72]:
# Select the second review from the reviews list to use as context
context = reviews[1]

# Define the question you want to ask about the review
question = "What did he like about the brand?"

# Use the question-answering model to find the answer based on the provided context (review)
qa_output = qa_model(question=question, context=context)

# Extract the answer from the model's output
answer = qa_output['answer']

# Print the extracted answer to the question
print(answer)

ride quality, reliability


### Summarize and analyze a car review

In [73]:
# Importing the summarization pipeline from Hugging Face's transformers
summarizer = pipeline(task="summarization", model="facebook/bart-large-cnn")

Device set to use cpu


In [74]:
# Select the last review from the reviews list to summarize
summary = summarizer(reviews[-1], max_length=53)

# Extract the summarized text from the model's output
summarized_text = summary[0]["summary_text"]

Your min_length=56 must be inferior than your max_length=53.


In [75]:
summarized_text

'The Nissan Rogue provides me with the desired SUV experience without burdening me with an exorbitant payment. Handling and styling are great; I have hauled 12 bags of mulch in the back with the seats down and could have held more. The engine'