### Installing required packages

In [20]:
# PyTorch is published on PyPI as "torch"; the "pytorch" package is a placeholder that fails to build.
!pip install torch
!pip install transformers
!pip install pandas
!pip install evaluate


### Imports


In [21]:
import pandas as pd
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForQuestionAnswering
import evaluate

### Data imports


# Step 1: Load the car reviews dataset

In [22]:


file_path = '/content/car_reviews.csv'
df = pd.read_csv(file_path, sep=";")
reviews = df['Review'].tolist()
true_labels = df['Class'].tolist()

# Step 2: Perform sentiment classification


In [23]:
classifier = pipeline('sentiment-analysis', model='distilbert-base-uncased-finetuned-sst-2-english')
sentiment_predictions = classifier(reviews)

# Display results for each review


In [24]:
for review, true_label, prediction in zip(reviews, true_labels, sentiment_predictions):
    print(f"Review: {review}")
    print(f"True Sentiment: {true_label}")
    print(f"Predicted: {prediction['label']} (Confidence: {prediction['score']:.4f})\n")

Review: I am very satisfied with my 2014 Nissan NV SL. I use this van for my business deliveries and personal use. Camping, road trips, etc. We dont have any children so I store most of the seats in my warehouse. I wanted the passenger van for the rear air conditioning. We drove our van from Florida to California for a Cross Country trip in 2014. We averaged about 18 mpg. We drove thru a lot of rain and It was a very comfortable and stable vehicle. The V8 Nissan Titan engine is a 500k mile engine. It has been tested many times by delivery and trucking companies. This is why Nissan gives you a 5 year or 100k mile bumper to bumper warranty. Many people are scared about driving this van because of its size. But with front and rear sonar sensors, large mirrors and the back up camera. It is easy to drive. The front and rear sensors also monitor the front and rear sides of the bumpers making it easier to park close to objects. Our Nissan NV is a Tow Monster. It pulls our 5000 pound travel tr

# Convert labels to binary format


In [25]:
binary_true_labels = [1 if label == "POSITIVE" else 0 for label in true_labels]
binary_predictions = [1 if pred['label'] == "POSITIVE" else 0 for pred in sentiment_predictions]
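The comprehensions above map anything that is not exactly "POSITIVE" to 0, so a lowercase or misspelled label would silently count as negative. A minimal sketch of a stricter mapping follows. The helper name `to_binary` and the assumption that the negative class is spelled "NEGATIVE" are illustrative and not taken from the notebook.

```python
def to_binary(labels, positive="POSITIVE", negative="NEGATIVE"):
    """Map sentiment strings to 1/0 and raise on anything unexpected.

    `positive` and `negative` are assumed spellings; adjust them to the
    values actually present in the dataset.
    """
    mapping = {positive: 1, negative: 0}
    binary = []
    for index, label in enumerate(labels):
        # Normalize whitespace and case before the lookup
        key = str(label).strip().upper()
        if key not in mapping:
            raise ValueError(f"Unexpected label {label!r} at position {index}")
        binary.append(mapping[key])
    return binary


# Hypothetical labels, for illustration only
example = to_binary(["POSITIVE", "negative ", "Positive"])
print(example)
```

Failing loudly on an unknown label makes data problems visible before they distort the accuracy and F1 figures.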

# Step 3: Evaluate accuracy and F1 score


In [26]:
accuracy_metric = evaluate.load("accuracy")
f1_metric = evaluate.load("f1")
accuracy_score = accuracy_metric.compute(predictions=binary_predictions, references=binary_true_labels)['accuracy']
f1_score = f1_metric.compute(predictions=binary_predictions, references=binary_true_labels)['f1']

print(f"Accuracy: {accuracy_score}")
print(f"F1 Score: {f1_score}")

Accuracy: 0.8
F1 Score: 0.8571428571428571
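To make the two metrics concrete, here is a small sketch that computes accuracy and binary F1 by hand from confusion counts. The labels are invented toy values, not the notebook's data, and `accuracy_and_f1` is a hypothetical helper rather than part of the `evaluate` library.

```python
def accuracy_and_f1(references, predictions):
    """Compute accuracy and F1 for the positive class (label 1)."""
    pairs = list(zip(references, predictions))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # true positives
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # false positives
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # false negatives
    correct = sum(1 for r, p in pairs if r == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return accuracy, f1


# Toy labels for illustration: one negative review misclassified as positive
toy_references = [1, 1, 1, 0, 0]
toy_predictions = [1, 1, 1, 1, 0]
toy_accuracy, toy_f1 = accuracy_and_f1(toy_references, toy_predictions)
print(f"Accuracy: {toy_accuracy:.3f}, F1: {toy_f1:.3f}")
```

With a single false positive among five items, precision is 3/4 and recall is 1, so F1 (their harmonic mean) comes out higher than accuracy. On a small, imbalanced sample the two numbers can diverge noticeably.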


# Step 4: Translate first review to Spanish


In [27]:

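translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-es")
first_review_text = reviews[0]
# max_length=27 caps the generated translation at 27 tokens, so only the opening
# sentences of the review are translated (this is what triggers the length warning).
translated_text = translator(first_review_text, max_length=27)[0]['translation_text']
print(f"Translated Review: {translated_text}")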

Your input_length: 365 is bigger than 0.9 * max_length: 27. You might consider increasing your max_length manually, e.g. translator('...', max_length=400)


Translated Review: Estoy muy satisfecho con mi 2014 Nissan NV SL. Uso esta furgoneta para mis entregas de negocios y uso personal.


# Step 5: Load reference translations from file for BLEU score


In [28]:
# Read with an explicit encoding so accented Spanish characters load correctly
with open("/content/reference_translations.txt", "r", encoding="utf-8") as file:
    ref_translations = [line.strip() for line in file]

# Compute BLEU score
bleu_metric = evaluate.load("bleu")
bleu_result = bleu_metric.compute(predictions=[translated_text], references=[ref_translations])
print(f"BLEU Score: {bleu_result['bleu']}")

BLEU Score: 0.6022774485691839
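For intuition about what the BLEU number measures, the following is a simplified sketch of sentence-level BLEU: clipped n-gram precisions for orders 1 to 4, combined by a geometric mean and scaled by a brevity penalty. It assumes whitespace tokenization and uses the reference length closest to the prediction length, so its values will not necessarily match the `evaluate` implementation. The function names and example sentences are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(prediction, references, max_order=4):
    """Simplified BLEU for one prediction against one or more references."""
    pred = prediction.split()
    refs = [reference.split() for reference in references]
    log_precisions = []
    for n in range(1, max_order + 1):
        pred_counts = ngram_counts(pred, n)
        # For each n-gram, keep the largest count seen in any single reference
        max_ref_counts = Counter()
        for ref in refs:
            for gram, count in ngram_counts(ref, n).items():
                max_ref_counts[gram] = max(max_ref_counts[gram], count)
        # Clip each predicted n-gram count by its reference maximum
        clipped = sum(min(count, max_ref_counts[gram]) for gram, count in pred_counts.items())
        total = sum(pred_counts.values())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize predictions shorter than the closest reference
    ref_len = min((len(ref) for ref in refs), key=lambda length: (abs(length - len(pred)), length))
    brevity_penalty = 1.0 if len(pred) > ref_len else math.exp(1 - ref_len / len(pred))
    return brevity_penalty * math.exp(sum(log_precisions) / max_order)


# Hypothetical sentences, for illustration only
score_same = simple_bleu("estoy muy satisfecho con mi coche", ["estoy muy satisfecho con mi coche"])
score_partial = simple_bleu("a b c d e", ["a b c d f"])
print(score_same, score_partial)
```

Because the score rewards exact n-gram overlap, a fluent translation that words things differently from the references can still score well below 1.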


# Step 6: Extractive QA from the second review


In [29]:
qa_model = "deepset/minilm-uncased-squad2"
tokenizer = AutoTokenizer.from_pretrained(qa_model)
model = AutoModelForQuestionAnswering.from_pretrained(qa_model)

# Define question and context
context_text = reviews[1]
query = "What did he like about the brand?"

# Tokenize input
inputs = tokenizer(query, context_text, return_tensors="pt")

# Perform inference to get answer span
with torch.no_grad():
    outputs = model(**inputs)
start_position = torch.argmax(outputs.start_logits)
end_position = torch.argmax(outputs.end_logits) + 1
answer_ids = inputs['input_ids'][0][start_position:end_position]

# Decode answer
extracted_answer = tokenizer.decode(answer_ids)
print(f"Extracted Answer: {extracted_answer}")

Some weights of the model checkpoint at deepset/minilm-uncased-squad2 were not used when initializing BertForQuestionAnswering: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Extracted Answer: ride quality, reliability


# Step 7: Summarize the last review


In [30]:
summarizer = pipeline("summarization", model="cnicu/t5-small-booksum")
final_review_text = reviews[-1]
summary_output = summarizer(final_review_text, max_length=53, min_length=50, do_sample=False)[0]['summary_text']

print(f"Summarized Review: {summary_output}")

Summarized Review: the Nissan Rogue provides me with the desired SUV experience without burdening me with an exorbitant payment; the financial arrangement is quite reasonable. I have hauled 12 bags of mulch in the back with the seats down and could have held more.
