<a href="https://colab.research.google.com/github/tabaraei/aspect-based-sentiment-analysis/blob/main/ABSA.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Use the Born classifier and the explanations it provides to understand why an opinion is positive or negative, then compare the results with a pre-trained model.

[Born Classifier](https://bornrule.eguidotti.com) is a text classification algorithm inspired by the notion of superposition of states in quantum
physics. Born provides good classification performance, explainability, and computational efficiency. In this
project, the goal is to exploit the Born explanations for Aspect-Based Sentiment Analysis (ABSA). In
particular, the main idea is to proceed as follows:
1. Perform a sentiment analysis classification of documents using Born.
2. Extract the explanation features for each pair of document and predicted label.
3. Analyze the explanatory features in order to group them into candidate aspects.
4. Associate each aspect with a specific sentence or portion of the text.
5. Predict the sentiment for the sentence or text portion using the trained Born classifier.
6. Assign a (potentially different) sentiment to each sentence or text portion according to its aspect.
7. Finally, evaluate the quality of the results for each aspect.
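
Step 4 can be prototyped without any model. The sketch below is only one possible heuristic, not the method used later in this notebook: it splits a text into clauses on punctuation and the word "but", then maps each aspect term to the first clause that mentions it. The helper names `split_into_clauses` and `assign_aspects_to_clauses` are hypothetical.

```python
import re

def split_into_clauses(text):
    """Split text on sentence punctuation, commas, semicolons and the word 'but'."""
    parts = re.split(r"[.!?;,]|\bbut\b", text, flags=re.IGNORECASE)
    return [part.strip() for part in parts if part.strip()]

def assign_aspects_to_clauses(text, aspects):
    """Map each aspect term to the first clause that mentions it (None if absent)."""
    clauses = split_into_clauses(text)
    assignment = {}
    for aspect in aspects:
        assignment[aspect] = next(
            (clause for clause in clauses if aspect.lower() in clause.lower()),
            None,
        )
    return assignment

example = assign_aspects_to_clauses(
    "The service was slow but the ambiance was pleasant.",
    ["service", "ambiance"],
)
print(example)
```

Each clause can then be scored separately, which is what allows two aspects in the same sentence to receive different sentiments.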

## Dataset

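Any dataset that supports ABSA can be used; see, for example, the [Papers with Code list of ABSA datasets](https://paperswithcode.com/datasets?task=aspect-based-sentiment-analysis&page=1). This notebook uses the restaurants subset of SemEval-2014 Task 4.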

In [26]:
%%capture
!pip install datasets

In [30]:
from datasets import load_dataset

dataset = load_dataset('alexcadillon/SemEval2014Task4', 'restaurants')  # use 'laptops' for the laptop reviews
train_data, test_data = dataset['train'], dataset['test']

train_data.shape, test_data.shape

((3041, 4), (800, 4))

In [35]:
x = train_data[2]
x

{'sentenceId': '1634',
 'text': "The food is uniformly exceptional, with a very capable kitchen which will proudly whip up whatever you feel like eating, whether it's on the menu or not.",
 'aspectTerms': [{'term': 'food',
   'polarity': 'positive',
   'from': '4',
   'to': '8'},
  {'term': 'kitchen', 'polarity': 'positive', 'from': '55', 'to': '62'},
  {'term': 'menu', 'polarity': 'neutral', 'from': '141', 'to': '145'}],
 'aspectCategories': [{'category': 'food', 'polarity': 'positive'}]}
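
Note that the `from` and `to` offsets are stored as strings. A minimal sketch of how they could be converted to integers and checked against the text is shown below; `aspect_spans` is a hypothetical helper, and the entry is copied from the first training example.

```python
def aspect_spans(entry):
    """Return (term, polarity, start, end) tuples with integer character offsets."""
    spans = []
    for aspect in entry["aspectTerms"]:
        start, end = int(aspect["from"]), int(aspect["to"])
        spans.append((aspect["term"], aspect["polarity"], start, end))
    return spans

entry = {
    "text": "But the staff was so horrible to us.",
    "aspectTerms": [
        {"term": "staff", "polarity": "negative", "from": "8", "to": "13"}
    ],
}

for term, polarity, start, end in aspect_spans(entry):
    # The slice [start:end] should reproduce the annotated term exactly
    print(term, polarity, entry["text"][start:end] == term)
```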

In [39]:
for aspect in x['aspectTerms']:
    print(aspect['term'], aspect['polarity'])

[aspect['term'] for aspect in x['aspectTerms']]

food positive
kitchen positive
menu neutral


['food', 'kitchen', 'menu']

In [41]:
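import pandas as pd

def compute_overall_sentiment(aspect_sentiments):
    """Majority vote over positive/negative aspect labels; ties (including no aspects) give 'neutral'."""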
    pos_count = aspect_sentiments.count('positive')
    neg_count = aspect_sentiments.count('negative')

    if pos_count > neg_count:
        return 'positive'
    elif neg_count > pos_count:
        return 'negative'
    else:
        return 'neutral'

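def create_dataset(dataset):
    """Build one row per document with its aspect terms, their polarities and a document-level label."""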
    data = []
    for entry in dataset:
        document = entry['text']
        aspects = [aspect['term'] for aspect in entry['aspectTerms']]
        aspect_sentiments = [aspect['polarity'] for aspect in entry['aspectTerms']]
        overall_sentiment = compute_overall_sentiment(aspect_sentiments)
        data.append({
            'document': document,
            'aspects': aspects,
            'aspect_sentiments': aspect_sentiments,
            'overall_sentiment': overall_sentiment
        })
    return pd.DataFrame(data)

df = create_dataset(train_data)
df

| | document | aspects | aspect_sentiments | overall_sentiment |
|---|---|---|---|---|
| 0 | But the staff was so horrible to us. | [staff] | [negative] | negative |
| 1 | To be completely fair, the only redeeming fact... | [food] | [positive] | positive |
| 2 | The food is uniformly exceptional, with a very... | [food, kitchen, menu] | [positive, positive, neutral] | positive |
| 3 | Where Gabriela personaly greets you and recomm... | [] | [] | neutral |
| 4 | For those that go once and don't enjoy it, all... | [] | [] | neutral |
| ... | ... | ... | ... | ... |
| 3036 | But that is highly forgivable. | [] | [] | neutral |
| 3037 | From the appetizers we ate, the dim sum and ot... | [appetizers, dim sum, foods, food] | [positive, positive, positive, positive] | positive |
| 3038 | When we arrived at 6:00 PM, the restaurant was... | [] | [] | neutral |
| 3039 | Each table has a pot of boiling water sunken i... | [table, pot of boiling water, meats, vegetable... | [neutral, neutral, neutral, neutral, neutral, ... | neutral |


In [29]:
train_data[0]

{'sentenceId': '3121',
 'text': 'But the staff was so horrible to us.',
 'aspectTerms': [{'term': 'staff',
   'polarity': 'negative',
   'from': '8',
   'to': '13'}],
 'aspectCategories': [{'category': 'service', 'polarity': 'negative'}]}

In [None]:
sample_text = train_data[2]['text']
sample_text

"The food is uniformly exceptional, with a very capable kitchen which will proudly whip up whatever you feel like eating, whether it's on the menu or not."

In [None]:
dataset = [
    {
        'sentenceId': '1634',
        'text': "The food is uniformly exceptional, with a very capable kitchen which will proudly whip up whatever you feel like eating, whether it's on the menu or not.",
        'aspectTerms': [
            {'term': 'food', 'polarity': 'positive', 'from': '4', 'to': '8'},
            {'term': 'kitchen', 'polarity': 'positive', 'from': '55', 'to': '62'},
            {'term': 'menu', 'polarity': 'neutral', 'from': '141', 'to': '145'}
        ],
        'aspectCategories': [{'category': 'food', 'polarity': 'positive'}]
    },
    {
        'sentenceId': '1635',
        'text': "The service was slow but the ambiance was pleasant.",
        'aspectTerms': [
            {'term': 'service', 'polarity': 'negative', 'from': '4', 'to': '11'},
            {'term': 'ambiance', 'polarity': 'positive', 'from': '29', 'to': '37'}
        ],
        'aspectCategories': [{'category': 'service', 'polarity': 'negative'}]
    }
]

import pandas as pd

# Flatten aspect terms into a DataFrame
rows = []
for item in dataset:
    text = item['text']
    for aspect in item['aspectTerms']:
        rows.append({
            'sentenceId': item['sentenceId'],
            'text': text,
            'aspect': aspect['term'],
            'polarity': aspect['polarity']
        })

# Create DataFrame
df = pd.DataFrame(rows)
df

| | sentenceId | text | aspect | polarity |
|---|---|---|---|---|
| 0 | 1634 | The food is uniformly exceptional, with a very... | food | positive |
| 1 | 1634 | The food is uniformly exceptional, with a very... | kitchen | positive |
| 2 | 1634 | The food is uniformly exceptional, with a very... | menu | neutral |
| 3 | 1635 | The service was slow but the ambiance was plea... | service | negative |
| 4 | 1635 | The service was slow but the ambiance was plea... | ambiance | positive |


## NLTK

In [None]:
sample_text = "I love this product! It's amazing."

In [9]:
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

_ = nltk.download('vader_lexicon', quiet=True)
sentiment_analyzer = SentimentIntensityAnalyzer()
sentiment_analyzer.polarity_scores(sample_text)

{'neg': 0.0, 'neu': 0.266, 'pos': 0.734, 'compound': 0.8516}
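
To compare VADER with the dataset labels, the `compound` score has to be mapped to a class. A minimal sketch is given below, assuming the ±0.05 cut-off commonly suggested in the VADER documentation; `label_from_compound` is a hypothetical helper and the threshold may need tuning for this dataset.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

# Compound score taken from the output above
print(label_from_compound(0.8516))
```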

## RoBERTa

In [None]:
sample_text = "I love this product! It's amazing."

In [22]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from scipy.special import softmax

MODEL = "cardiffnlp/twitter-roberta-base-sentiment"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

In [23]:
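encoded_text = tokenizer(sample_text, return_tensors='pt')
output = model(**encoded_text)[0][0]  # logits for the single input sentence
scores = softmax(output.detach().numpy())  # convert logits to probabilities
neg, neu, pos = scores  # label order: negative, neutral, positive
scores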

array([0.00212159, 0.00545376, 0.9924246 ], dtype=float32)
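
The three scores follow the order negative, neutral, positive, as unpacked in the cell above. A small sketch of turning them into a single predicted label follows; `label_from_scores` is a hypothetical helper, not part of `transformers`.

```python
LABELS = ("negative", "neutral", "positive")

def label_from_scores(scores, labels=LABELS):
    """Return the label with the highest probability together with that probability."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best_index], float(scores[best_index])

# Scores copied from the output above
print(label_from_scores([0.00212159, 0.00545376, 0.9924246]))
```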

## Born

In [None]:
%%capture
!pip install bornrule

In [None]:
from bornrule import BornClassifier

In [None]:
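from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Assumption: X and y were not defined in the original cell. Here they are built
# from the document-level labels produced by create_dataset above.
data = create_dataset(train_data)

# Split the data
docs_train, docs_test, y_train, y_test = train_test_split(
    data['document'], data['overall_sentiment'], test_size=0.2, random_state=42
)

# Vectorize the text (Born expects non-negative features such as TF-IDF)
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(docs_train)
X_test = vectorizer.transform(docs_test)

# Train the classifier
clf = BornClassifier()
clf.fit(X_train, y_train)

# Evaluate the model
y_pred = clf.predict(X_test)
print(classification_report(y_test, y_pred))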
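
Once the classifier is trained, step 2 of the plan needs the most influential words per class. The sketch below shows only the ranking logic in plain Python. It assumes the explanation is available as a feature-by-class weight matrix (the `bornrule` documentation describes an `explain()` method, but the exact shape used here is an assumption), and the toy weights are invented for illustration, not produced by a trained model.

```python
def top_features_per_class(weights, feature_names, class_names, k=3):
    """Return the k highest-weighted features per class.

    weights[i][j] is the (assumed) weight of feature i for class j.
    Features with a non-positive weight are skipped.
    """
    top = {}
    for j, class_name in enumerate(class_names):
        ranked = sorted(
            range(len(feature_names)),
            key=lambda i: weights[i][j],
            reverse=True,
        )
        top[class_name] = [feature_names[i] for i in ranked[:k] if weights[i][j] > 0]
    return top

# Toy weights, invented for illustration only
feature_names = ["horrible", "exceptional", "menu", "slow"]
class_names = ["negative", "positive"]
weights = [
    [0.9, 0.0],
    [0.0, 0.8],
    [0.1, 0.2],
    [0.6, 0.0],
]

print(top_features_per_class(weights, feature_names, class_names, k=2))
```

The resulting word lists are the raw material for step 3, where explanatory features are grouped into candidate aspects.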

## References

- Guidotti, E., & Ferrara, A. (2022). Text classification with Born’s rule. Advances in Neural Information
Processing Systems.
- Schouten, K., & Frasincar, F. (2015). Survey on aspect-level sentiment analysis. IEEE Transactions on
Knowledge and Data Engineering, 28(3), 813-830. [link](https://ieeexplore.ieee.org/document/7286808)
- Rana, T. A., & Cheah, Y. N. (2016). Aspect extraction in sentiment analysis: comparative analysis and survey.
Artificial Intelligence Review, 46(4), 459-483. [link](https://link.springer.com/article/10.1007/s10462-016-9472-z)