## Aspect-Based Sentiment Analysis (ABSA)

ABSA is used to identify the different aspects of a given target entity (such as a product or service) and the sentiment expressed towards each aspect in customer reviews or other text data. 

ABSA is commonly divided into four subtasks:

Subtask 1: It involves identifying the aspect terms present in a sentence about a pre-identified entity, such as a restaurant. The goal is to extract the distinct terms that refer to specific aspects of the target entity.

For example, in “I liked the service and the staff, but not the food”, the aspect terms are “service”, “staff”, and “food”; in “The food was nothing much, but I loved the staff”, they are “food” and “staff”. Multi-word aspect terms (e.g., “hard disk”) should be treated as single terms (e.g., in “The hard disk is very noisy” the only aspect term is “hard disk”).
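To make the multi-word rule concrete, here is a minimal sketch of a lexicon lookup that returns each aspect term together with its character offsets. The function `find_aspect_terms` and the small `lexicon` are hypothetical illustrations, not part of this notebook or of any ABSA library, and real Subtask 1 systems learn to extract terms rather than match a fixed list. Longer terms are tried first so that “hard disk” is reported once rather than also as “disk”.

```python
import re
from typing import List, Tuple


def find_aspect_terms(sentence: str, lexicon: List[str]) -> List[Tuple[str, int, int]]:
    """Return (term, start, end) for each lexicon term found in the sentence."""
    found: List[Tuple[str, int, int]] = []
    taken = [False] * len(sentence)  # characters already claimed by a longer term
    # Try longer terms first so multi-word terms win over their sub-terms.
    for term in sorted(lexicon, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, sentence, flags=re.IGNORECASE):
            start, end = match.span()
            if not any(taken[start:end]):
                taken[start:end] = [True] * (end - start)
                found.append((sentence[start:end], start, end))
    # Report the terms in the order they appear in the sentence.
    return sorted(found, key=lambda item: item[1])


lexicon = ["hard disk", "disk", "service", "staff", "food"]  # hypothetical term list
print(find_aspect_terms("The hard disk is very noisy", lexicon))
# → [('hard disk', 4, 13)]
```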


Subtask 2: It involves determining the polarity (positive, negative, neutral, or conflict) of each aspect term in a given sentence. Given a set of aspect terms, the task is to identify the polarity of each term from the sentiment expressed towards it.

For example:

“I loved their fajitas” → {fajitas: positive}
“I hated their fajitas, but their salads were great” → {fajitas: negative, salads: positive}
“The fajitas are their first plate” → {fajitas: neutral}
“The fajitas were great to taste, but not to see” → {fajitas: conflict}
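One way to read the four labels is as a summary of the individual opinions expressed about a single aspect term. The sketch below illustrates that reading; `combine_polarities` is a hypothetical helper, and the aggregation rule is an assumption made for illustration rather than the dataset's annotation guideline.

```python
from typing import Iterable


def combine_polarities(opinions: Iterable[str]) -> str:
    """Summarise the opinions about one aspect term as a single polarity label."""
    sentiments = {o for o in opinions if o in ("positive", "negative")}
    if sentiments == {"positive", "negative"}:
        return "conflict"  # both positive and negative sentiment were expressed
    if sentiments:
        return sentiments.pop()  # exactly one of "positive" / "negative"
    return "neutral"  # the term is mentioned without any sentiment


# "The fajitas were great to taste, but not to see"
print(combine_polarities(["positive", "negative"]))  # → conflict
```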

Subtask 3: It involves identifying the aspect categories discussed in a given sentence from a predefined set of categories such as price, food, service, ambience, and anecdotes/miscellaneous. The aspect categories are coarser than the aspect terms of subtask 1 and may not necessarily occur as terms in the sentence.

For example, given the set of aspect categories {food, service, price, ambience, anecdotes/miscellaneous}:

“The restaurant was too expensive” → {price}
“The restaurant was expensive, but the menu was great” → {price, food}

Subtask 4: It involves determining the polarity of each pre-identified aspect category (e.g., food, price). The goal is to identify the polarity of each category from the sentiment expressed towards it in a given sentence.

For example:

“The restaurant was too expensive” → {price: negative}
“The restaurant was expensive, but the menu was great” → {price: negative, food: positive}

In [1]:
# Import the libraries required for this task

import os
import sys
import time
from typing import Tuple

# vector LLM toolkit
import kscope
import numpy as np
import pandas as pd
import sklearn.metrics
import torch
import tqdm.notebook  # explicit submodule import; tqdm.notebook.tqdm is used below
import transformers
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer

# Print version information - check you are using correct environment
print("Python version: " + sys.version)
print("PyTorch version: " + torch.__version__)
print("Transformers version: " + transformers.__version__)

Python version: 3.9.15 (main, Nov 24 2022, 08:29:02) 
[Clang 14.0.6 ]
PyTorch version: 1.13.1
Transformers version: 4.27.3


## Getting Started

Next, we connect to the Kaleidoscope service, which gives us access to the large language model OPT-175B. We also check which models are available to us.

In [2]:
# Establish a client connection to the Kaleidoscope service
client = kscope.Client(gateway_host="llm.cluster.local", gateway_port=3001)

In [3]:
# Check which models are available for use
client.models

['OPT-175B', 'OPT-6.7B']

In [4]:
# Check which model instances are active
client.model_instances

[{'id': 'b11f3264-9c03-4114-9d56-d39a0fa63640',
  'name': 'OPT-175B',
  'state': 'ACTIVE'},
 {'id': 'c5ea8da7-3384-4c7b-b47f-95ed1a485897',
  'name': 'OPT-6.7B',
  'state': 'ACTIVE'}]

In [5]:
# For this notebook, we will be focusing on OPT-175B

model = client.load_model("OPT-175B")
# If this model is not actively running, it will get launched in the background.
# In this case, wait until it moves into an "ACTIVE" state before proceeding.
while model.state != "ACTIVE":
    time.sleep(1)
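
The loop above waits indefinitely if the model never becomes active. One possible variant adds a timeout. `wait_until_active` and `FakeModel` are hypothetical helpers written for this sketch, and the `"LAUNCHING"` state name is made up; the only thing assumed about the real model object is the `state` attribute already used above.

```python
import time


def wait_until_active(model, timeout_s: float = 600.0, poll_s: float = 1.0) -> None:
    """Poll model.state until it is "ACTIVE", or raise after timeout_s seconds."""
    deadline = time.monotonic() + timeout_s
    while model.state != "ACTIVE":
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Model did not become ACTIVE within {timeout_s} s")
        time.sleep(poll_s)


class FakeModel:
    """Stand-in for a model handle: reports "LAUNCHING" twice, then "ACTIVE"."""

    def __init__(self) -> None:
        self._states = iter(["LAUNCHING", "LAUNCHING"])

    @property
    def state(self) -> str:
        return next(self._states, "ACTIVE")


# Returns after two short polls because the fake model then reports "ACTIVE".
wait_until_active(FakeModel(), timeout_s=5.0, poll_s=0.01)
```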

### Loading model and tokenizer

We need to instantiate a tokenizer to obtain the appropriate token indices for our labels.
NOTE: All OPT models, regardless of size, use the same tokenizer. However, if you want to use a different type of model, a different tokenizer may be needed.

In [6]:
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")

In [7]:
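class CustomDataset(Dataset):
    """
    Dataset wrapper around a pandas dataframe of ABSA examples.

    Attributes
    ----------
    df : pandas dataframe
        Dataset in the format of a pandas dataframe.
        Ensure it has columns named text, sentence_with_full_prompt, and aspect_term_polarity.

    Methods
    -------
    __getitem__(index)
        Returns the tuple (text, text_prompt, polarity) for the row at position index.
    __len__()
        Returns the number of rows in the dataframe.
    """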

    def __init__(self, df: pd.DataFrame) -> None:
        self.df = df

    def __getitem__(self, index: int) -> Tuple[str, str, str]:
        row = self.df.iloc[index]
        text_prompt = row["sentence_with_full_prompt"]
        polarity = row["aspect_term_polarity"]
        text = row["text"]
        return text, text_prompt, polarity

    def __len__(self) -> int:
        return len(self.df)

### Prompt Examples

The next section gives some prompt examples that demonstrate how to set up a prompt for the zero-shot and few-shot settings.


#### Prompt Setup for Input and Zero-shot examples

* `df['sentence_with_prompt'] = 'Sentence: ' + df['text'] + ' ' + 'Sentiment on ' + df['aspect_term'] + ' is'`


* `df['sentence_with_prompt'] = 'Sentence: ' + df['text'] + ' ' + 'Sentiment on ' + df['aspect_term'] + ' is positive or negative? It is'`


* `df['sentence_with_prompt'] = 'Answer the question using the sentence provided. \nQuestion: What is the sentiment on ' + df['aspect_term'] + ' - positive, negative, or neutral?' + '\nSentence: ' + df['text'] + '\nAnswer:'`


* `df['sentence_with_prompt'] = 'Sentence: ' + df['text'] + ' ' + 'The sentiment associated with ' + df['aspect_term'] + ' is'`


* `df['sentence_with_prompt'] = '\nSentence: ' + df['text'] + ' ' + '\nQuestion: What is the sentiment on ' + df['aspect_term'] + '? \nAnswer:'`
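The templates above operate on whole dataframe columns. For a single example, the same string construction can be sketched as a plain function. `build_zero_shot_prompt` and its `style` argument are hypothetical and simply mirror the first and third templates; they are not part of the notebook's code.

```python
def build_zero_shot_prompt(text: str, aspect_term: str, style: str = "completion") -> str:
    """Build a zero-shot prompt for one sentence and one aspect term."""
    if style == "completion":
        # Mirrors: 'Sentence: ' + text + ' ' + 'Sentiment on ' + aspect_term + ' is'
        return f"Sentence: {text} Sentiment on {aspect_term} is"
    if style == "question":
        # Mirrors the question-answering template with an explicit label list.
        return (
            "Answer the question using the sentence provided. \n"
            f"Question: What is the sentiment on {aspect_term} - positive, negative, or neutral?\n"
            f"Sentence: {text}\n"
            "Answer:"
        )
    raise ValueError(f"Unknown prompt style: {style!r}")


print(build_zero_shot_prompt("The hard disk is very noisy", "hard disk"))
# → Sentence: The hard disk is very noisy Sentiment on hard disk is
```

Both styles stop right before the label, so the first token the model generates can be compared with the expected polarity.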

#### Few-shot Examples

The examples below show few-shot demonstrations. They are prepended to the input sentence and final question, which follow one of the formats above, so that the model answers in the same pattern. The first two are two-shot and three-shot examples in which the model completes "Sentiment on `<aspect term>` is" to produce its response.

* `demonstrations = 'Sentence: Albert Einstein was one of the greatest intellects of his time. Sentiment on Albert Einstein is positive. \nSentence: The sweet lassi was excellent as was the lamb chettinad and the garlic naan but the rasamalai was forgettable. Sentiment on rasmalai is negative. '`


* `demonstrations = "Sentence: Their pizza was good, but the service was bad. Sentiment on pizza is positive. \nSentence: I charge it at night and skip taking the cord with me because it's too heavy. Sentiment on cord is negative. \nSentence: My suggestion is to eat family style because you'll want to try the other dishes. Sentiment on dishes is neutral. "`

The examples below offer several other alternatives for formatting of the prompt to produce a response to the ABSA task from the model in a few-shot setting. These include both sentence completion and question-based forms.

* `demonstrations = 'Sentence: Albert Einstein was one of the greatest intellects of his time. Sentiment on Albert Einstein is positive or negative? It is positive. \nSentence: The sweet lassi was excellent as was the lamb chettinad and the garlic naan but the rasamalai was forgettable. Sentiment on rasmalai is positive or negative? It is negative. '`


* `demonstrations = 'Sentence: Albert Einstein was one of the greatest intellects of his time. The sentiment associated with Albert Einstein is positive. \nSentence: The sweet lassi was excellent as was the lamb chettinad and the garlic naan but the rasamalai was forgettable. The sentiment associated with rasmalai is negative. '`


* `demonstrations = 'Sentence: Albert Einstein was one of the greatest intellects of his time. The sentiment associated with Albert Einstein is positive. \nSentence: The sweet lassi was excellent as was the lamb chettinad and the garlic naan but the rasamalai was forgettable. The sentiment associated with rasmalai is negative. \nSentence: I had my software installed and ready to go, but the system crashed. The sentiment associated with software is neutral. '`


* `demonstrations = 'Sentence: Albert Einstein was one of the greatest intellects of his time. \nQuestion: What is the sentiment on Albert Einstein? \nAnswer: positive \n\nSentence: The sweet lassi was excellent as was the lamb chettinad and the garlic naan but the rasamalai was forgettable. \nQuestion: What is the sentiment on rasmalai? \nAnswer: negative '`
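Because every demonstration follows the same pattern as the final query, the demonstration string can also be assembled from labelled examples instead of being written by hand. The sketch below assumes the "Sentiment on `<aspect term>` is" completion format; `build_few_shot_prompt` and its argument names are hypothetical.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    demos: List[Tuple[str, str, str]], text: str, aspect_term: str
) -> str:
    """Prepend labelled (sentence, aspect term, polarity) demonstrations to a query."""
    lines = [
        f"Sentence: {sentence} Sentiment on {term} is {polarity}."
        for sentence, term, polarity in demos
    ]
    # The query repeats the pattern but stops before the label,
    # leaving the model to complete it.
    lines.append(f"Sentence: {text} Sentiment on {aspect_term} is")
    return "\n".join(lines)


demos = [
    ("Their pizza was good, but the service was bad.", "pizza", "positive"),
    ("Their pizza was good, but the service was bad.", "service", "negative"),
]
print(build_few_shot_prompt(demos, "The hard disk is very noisy", "hard disk"))
```

Passing an empty `demos` list reduces this to the zero-shot completion prompt.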

In this notebook, we report accuracy for two approaches: zero-shot and few-shot prompting.

* zero-shot approach [`zero-shot`] (i.e., give the model the input sentence and ask for the sentiment).
* few-shot approach [`few-shot`] (i.e., give the model some example sentences with their sentiment so that it can infer the pattern, then ask for the sentiment of a new example).

In [8]:
generation_type = "few-shot"  # other option here "zero-shot"

In [9]:
# This CSV file contains customer reviews of laptops collected in 2014; about 469 examples remain after filtering
path = "resources/absa_datasets/"
# Below you can add more datasets like 'Laptop_Train_v2.csv', 'Restaurants_Train_v2.csv', and
# 'Restaurants_Test_Gold.csv'
datasets_list = [os.path.join(path, "Laptops_Test_Gold.csv")]

In [10]:
for d in tqdm.notebook.tqdm(range(len(datasets_list))):
    df = pd.read_csv(datasets_list[d])

    print("----------------------------------------------------------------")
    df.info()
    print()

    # Drop rows with a missing aspect term or polarity
    df = df.dropna(axis=0, how="any", subset=["aspect_term", "aspect_term_polarity"])

    # Set the prompt format for the input sentence (adapted from one of the examples above)
    df["sentence_with_prompt"] = (
        "Sentence: "
        + df["text"]
        + " "
        + "Sentiment on "
        + df["aspect_term"]
        + " is positive, negative, or neutral? It is"
    )

    # Keep only the instances whose polarity is positive or negative
    df = df.loc[
        df["aspect_term_polarity"].str.contains("positive") | df["aspect_term_polarity"].str.contains("negative")
    ]

    demonstrations = "Sentence: Albert Einstein was one of the greatest intellects of his time. Sentiment on Albert Einstein is positive, negative, or neutral? It is positive. \nSentence: The sweet lassi was excellent as was the lamb chettinad and the garlic naan but the rasamalai was forgettable. Sentiment on rasmalai is positive, negative, or neutral? It is negative. \nSentence: I had my software installed and ready to go, but the system crashed. Sentiment on software is positive, negative, or neutral? It is neutral."  # noqa E501

    # for few-shot, we give more context to the model to improve the model performance and generalizability.
    if generation_type == "few-shot":
        df["sentence_with_full_prompt"] = demonstrations + "\n" + df["sentence_with_prompt"]
    elif generation_type == "zero-shot":
        df["sentence_with_full_prompt"] = df["sentence_with_prompt"]
    else:
        raise ValueError("Invalid generation type: Please select from zero-shot or few-shot.")

    df.info()

    # Construct the dataloader with custom created dataset as input
    data = CustomDataset(df)
    dataloader = DataLoader(data, batch_size=20)

    # initialize predictions and labels
    predictions = []
    labels = []

    # Send each batch of prompts to the vector-hosted OPT-175B model
    # For a discussion of the configuration parameters see:
    # src/reference_implementations/prompting_vector_llms/CONFIG_README.md
    for text, text_prompt, polarity in tqdm.notebook.tqdm(dataloader):
        gen_text = model.generate(
            text_prompt, {"max_tokens": 1, "top_k": 4, "top_p": 1.0, "rep_penalty": 1.0, "temperature": 1.0}
        ).generation["tokens"]
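        # Note that we take the model's generated response and attempt to match it to one of the labels in
        # our label space. Any other token is not a valid label; be aware that the confusion matrix below is
        # restricted to labels_order, so such predictions are left out of it rather than counted as errors.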
        first_predicted_tokens = [tokens[0].strip().lower() for tokens in gen_text]
        predictions.extend(first_predicted_tokens)
        labels.extend(list(polarity))

    # The labels associated with the dataset
    labels_order = ["positive", "negative"]

    cm = sklearn.metrics.confusion_matrix(np.array(labels), np.array(predictions), labels=labels_order)

    FP = cm.sum(axis=0) - np.diag(cm)
    FN = cm.sum(axis=1) - np.diag(cm)
    TP = np.diag(cm)

    recall = TP / (TP + FN)
    precision = TP / (TP + FP)
    f1 = 2 * (precision * recall) / (precision + recall)
    print(f"Prediction Accuracy: {TP.sum()/(cm.sum())}")

    print(f"Confusion Matrix with ordering {labels_order}")
    print(cm)
    print("========================================================")
    for label_index, label_name in enumerate(labels_order):
        print(
            f"Label: {label_name}, F1: {f1[label_index]}, Precision: {precision[label_index]}, "
            f"Recall: {recall[label_index]}"
        )

  0%|          | 0/1 [00:00<?, ?it/s]

----------------------------------------------------------------
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1032 entries, 0 to 1031
Data columns (total 6 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   id                    800 non-null    object 
 1   text                  1032 non-null   object 
 2   aspect_term           654 non-null    object 
 3   aspect_term_polarity  654 non-null    object 
 4   aspect_term_from      654 non-null    float64
 5   aspect_term_to        654 non-null    float64
dtypes: float64(2), object(4)
memory usage: 48.5+ KB

<class 'pandas.core.frame.DataFrame'>
Int64Index: 469 entries, 0 to 1031
Data columns (total 8 columns):
 #   Column                     Non-Null Count  Dtype  
---  ------                     --------------  -----  
 0   id                         314 non-null    object 
 1   text                       469 non-null    object 
 2   aspect_term                469 no

  0%|          | 0/16 [00:00<?, ?it/s]

Prediction Accuracy: 0.6535211267605634
Confusion Matrix with ordering ['positive', 'negative']
[[169  95]
 [ 28  63]]
Label: positive, F1: 0.7331887201735358, Precision: 0.8578680203045685, Recall: 0.6401515151515151
Label: negative, F1: 0.5060240963855421, Precision: 0.3987341772151899, Recall: 0.6923076923076923
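
As a cross-check, the per-label metrics can be recomputed from the printed confusion matrix alone. Rows are true labels and columns are predictions, which is the convention `sklearn.metrics.confusion_matrix` uses. The sketch below is plain Python; `per_label_metrics` is a hypothetical helper, not part of the notebook, and it assumes every label has at least one true and one predicted example so that no division by zero occurs.

```python
from typing import Dict, List


def per_label_metrics(cm: List[List[int]], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Compute precision, recall and F1 per label from a confusion matrix.

    cm[i][j] counts examples with true label i that were predicted as label j.
    """
    metrics = {}
    for i, label in enumerate(labels):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp  # true label i, predicted as something else
        fp = sum(row[i] for row in cm) - tp  # predicted as i, true label differs
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


cm = [[169, 95], [28, 63]]  # matrix printed above, ordering ['positive', 'negative']
for label, scores in per_label_metrics(cm, ["positive", "negative"]).items():
    print(label, {name: round(value, 4) for name, value in scores.items()})
```

The F1 values agree with the output above. Note also that the matrix sums to 355 examples: when `labels` is passed, `sklearn.metrics.confusion_matrix` only counts samples whose true and predicted values both appear in that list, so predictions outside `labels_order` do not enter the reported accuracy. Dividing the number of correct predictions by `len(labels)` instead would count them as errors.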
