![image](car.jpeg)

**Car-ing is sharing**, an auto dealership company for car sales and rental, is taking their services to the next level thanks to **Large Language Models (LLMs)**.

As their newly recruited AI and NLP developer, you've been asked to prototype a chatbot app with multiple functionalities that not only assist customers but also provide support to human agents in the company.

The solution should receive textual prompts and use a variety of pre-trained Hugging Face LLMs to respond to a series of tasks, e.g. classifying the sentiment in a car’s text review, answering a customer question, summarizing or translating text, etc.


## Before you start

In order to complete the project you may wish to install some Hugging Face libraries such as `transformers` and `evaluate`.

In [1]:
!pip install transformers
!pip install evaluate==0.4.0
!pip install datasets==2.10.0
!pip install sentencepiece==0.1.97

from transformers import logging
logging.set_verbosity(logging.WARNING)

Defaulting to user installation because normal site-packages is not writeable
Defaulting to user installation because normal site-packages is not writeable
Collecting evaluate==0.4.0
  Downloading evaluate-0.4.0-py3-none-any.whl.metadata (9.4 kB)
Downloading evaluate-0.4.0-py3-none-any.whl (81 kB)
Installing collected packages: evaluate
[0mSuccessfully installed evaluate-0.4.0
Defaulting to user installation because normal site-packages is not writeable
Collecting datasets==2.10.0
  Downloading datasets-2.10.0-py3-none-any.whl.metadata (20 kB)
Downloading datasets-2.10.0-py3-none-any.whl (469 kB)
Installing collected packages: datasets
[0mSuccessfully installed datasets-2.10.0
Defaulting to user installation because normal site-packages is not writeable


In [2]:
import pandas as pd
import evaluate
from transformers import pipeline

In [3]:
# Start your code here!
import pandas as pd
from transformers import pipeline
import evaluate

# Task 1: Classify car reviews
car_reviews = pd.read_csv("data/car_reviews.csv", delimiter=";")
#print(car_reviews)

# Load a sentiment analysis model
sentiment_pipeline = pipeline("text-classification", model="distilbert-base-uncased-finetuned-sst-2-english")

# Classify sentiment for each review
predicted_labels = [sentiment_pipeline(review)[0]['label'] for review in car_reviews['Review']]
#print(predicted_labels)
predictions = [1 if label == "POSITIVE" else 0 for label in predicted_labels]
#print(predictions)

# Convert "Positive" and "Negative" labels to 1 and 0
true_labels = car_reviews['Class'].map({"POSITIVE": 1, "NEGATIVE": 0})
#print(true_labels)

# Calculate accuracy and F1 score
accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")
accuracy_result = accuracy.compute(references=true_labels, predictions=predictions)
f1_result = f1.compute(references=true_labels, predictions=predictions)

# Extract the actual score from the result dictionaries
accuracy_score = accuracy_result['accuracy']
f1_score = f1_result['f1']
numeric_labels = [1 if label == 'POSITIVE' else 0 for label in predicted_labels]
predicted_labels = numeric_labels

# Display classification results
print(accuracy_score)
print(f1_score)

Downloading:   0%|          | 0.00/629 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/268M [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/48.0 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/232k [00:00<?, ?B/s]

Xformers is not installed correctly. If you want to use memorry_efficient_attention to accelerate training use the following command to install Xformers
pip install xformers.


Downloading builder script:   0%|          | 0.00/4.20k [00:00<?, ?B/s]

Downloading builder script:   0%|          | 0.00/6.77k [00:00<?, ?B/s]

0.8
0.8571428571428571


In [4]:
# Task 2: Translate part of the first review into Spanish and calculate BLEU score

# Extract the first two sentences of the first review
first_review = car_reviews.iloc[0]['Review']
first_two_sentences = ". ".join(first_review.split(". ")[:2])
#print(first_review)
#print(first_two_sentences)

# Load model 
translator = pipeline(task="translation_en_to_es", model = "Helsinki-NLP/opus-mt-en-es")

# Translate English to Spanish
translated_review = translator(first_two_sentences, clean_up_tokenization_spaces=True)
print(translated_review[0]["translation_text"])

# Compute BLEU score 
reference_translations = pd.read_csv("data/reference_translations.txt", delimiter="\t")

bleu = evaluate.load("bleu")
bleu_scores = bleu.compute(references=[[ref] for ref in reference_translations.iloc[:, 0]], predictions=[translated_review[0]["translation_text"]])
bleu_score = pd.to_numeric(bleu_scores['bleu'])
print(bleu_scores)
print(bleu_score)

Downloading:   0%|          | 0.00/1.47k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/312M [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/293 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/44.0 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/802k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/826k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/1.59M [00:00<?, ?B/s]

Estoy muy satisfecho con mi Nissan NV SL 2014. Uso esta camioneta para mis entregas de negocios y uso personal


Downloading builder script:   0%|          | 0.00/5.94k [00:00<?, ?B/s]

Downloading extra modules:   0%|          | 0.00/1.55k [00:00<?, ?B/s]

Downloading extra modules:   0%|          | 0.00/3.34k [00:00<?, ?B/s]

{'bleu': 0.6712403123245676, 'precisions': [0.8571428571428571, 0.75, 0.631578947368421, 0.5], 'brevity_penalty': 1.0, 'length_ratio': 1.0, 'translation_length': 21, 'reference_length': 21}
0.6712403123245676


In [5]:
# Task 3: Extractive QA using "deepset/minilm-uncased-squad2"

# Load the extractive QA model
qa_pipeline = pipeline("question-answering", model="deepset/minilm-uncased-squad2")

# Define the question and context (second review)
question = "What did he like about the brand?"
context = car_reviews.iloc[1]['Review']

# Get the answer using the QA model
answer = qa_pipeline(question=question, context=context)["answer"]

# Display extracted answer
#print(context)
#print(question)
print(answer)

Downloading:   0%|          | 0.00/477 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/133M [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/107 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/232k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/112 [00:00<?, ?B/s]

ride quality, reliability


In [7]:
# Task 4: Summarization and Bias Analysis

# Load a summarization model (e.g., "facebook/bart-large-cnn")
summarization_pipeline = pipeline("summarization", model="facebook/bart-large-cnn")

# Extract the last review
last_review = car_reviews.iloc[-1]['Review']
#print(last_review)

# Generate summary with a length of approximately 50-55 tokens
summarized_text = summarization_pipeline(last_review, max_length=55, min_length=50)
print(summarized_text)

toxicity_metric = evaluate.load("toxicity")
toxicity_results = toxicity_metric.compute(predictions=summarized_text[0]['summary_text'], aggregation="maximum")

regard_metric = evaluate.load("regard")
regard_results = regard_metric.compute(data = summarized_text[0]['summary_text'])

toxicity_results_list = [toxicity_results['max_toxicity']]

print(toxicity_results_list)
print(regard_results)




[{'summary_text': 'The Nissan Rogue provides me with the desired SUV experience without burdening me with an exorbitant payment. Handling and styling are great; I have hauled 12 bags of mulch in the back with the seats down and could have held more. The engine delivers strong'}]
[0.24305784702301025]
{'regard': [[{'label': 'neutral', 'score': 0.8498192429542542}, {'label': 'positive', 'score': 0.11421503871679306}, {'label': 'negative', 'score': 0.021512044593691826}, {'label': 'other', 'score': 0.014453746378421783}], [{'label': 'neutral', 'score': 0.7731531262397766}, {'label': 'positive', 'score': 0.1952335685491562}, {'label': 'negative', 'score': 0.01756386086344719}, {'label': 'other', 'score': 0.014049401506781578}], [{'label': 'neutral', 'score': 0.8415986895561218}, {'label': 'positive', 'score': 0.1203446015715599}, {'label': 'negative', 'score': 0.022679010406136513}, {'label': 'other', 'score': 0.015377769246697426}], [{'label': 'neutral', 'score': 0.6143266558647156}, {'la