In this notebook we will demonstrate aspect-based sentiment analysis using [VADER](https://github.com/cjhutto/vaderSentiment) and [Stanford Core NLP](https://stanfordnlp.github.io/CoreNLP/index.html).<br>
<br>**VADER Sentiment Analysis**: VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains. (source: [GitHub](https://github.com/cjhutto/vaderSentiment))<br>
Stanford NLP has a live demo of aspect-based sentiment analysis [here](http://nlp.stanford.edu:8080/sentiment/rntnDemo.html).<br><br>
**Stanford Core NLP**: "Most sentiment prediction systems work just by looking at words in isolation, giving positive points for positive words and negative points for negative words and then summing up these points. That way, the order of words is ignored and important information is lost. In constrast, our new deep learning model actually builds up a representation of whole sentences based on the sentence structure. It computes the sentiment based on how words compose the meaning of longer phrases. This way, the model is not as easily fooled as previous models."(source: [Stanford Core NLP](https://nlp.stanford.edu/sentiment/index.html).)
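
To make the "words in isolation" approach from the quote concrete, here is a minimal sketch of a bag-of-words scorer. The lexicon `TOY_LEXICON` and the function `bag_of_words_score` are made up for illustration and are not part of VADER or CoreNLP; the only point is that summing per-word points gives the same score for any word order.

```python
# Toy lexicon: hypothetical word scores chosen for illustration only.
TOY_LEXICON = {"amazing": 2, "good": 1, "bad": -1, "terrible": -2}


def bag_of_words_score(text):
    """Sum per-word points, ignoring word order and sentence structure."""
    words = text.lower().replace(".", " ").split()
    return sum(TOY_LEXICON.get(word, 0) for word in words)


original = "Except the amazing fried chicken everything else tastes very bad."
reordered = "Except the bad fried chicken everything else tastes very amazing."

# Both orderings contain the same words, so the summed score is identical
# even though the two sentences say opposite things about the chicken.
print(bag_of_words_score(original), bag_of_words_score(reordered))
```

A structure-aware model, as described in the quote, is meant to tell these two sentences apart.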

In [1]:
!pip install vaderSentiment==3.3.2
!pip install pycorenlp==0.3.0

Collecting vaderSentiment==3.3.2
  Downloading vaderSentiment-3.3.2-py2.py3-none-any.whl (125 kB)
Installing collected packages: vaderSentiment
Successfully installed vaderSentiment-3.3.2
Collecting pycorenlp==0.3.0
  Downloading pycorenlp-0.3.0.tar.gz (1.3 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: pycorenlp
  Building wheel for pycorenlp (setup.py) ... done
  Created wheel for pycorenlp: filename=pycorenlp-0.3.0-py3-none-an

### Importing the necessary packages

In [2]:
from pprint import pprint

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
import re
import string

import nltk
nltk.download('punkt')
nltk.download('vader_lexicon')
from nltk.tokenize import word_tokenize, RegexpTokenizer

from pycorenlp import StanfordCoreNLP
import json

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package vader_lexicon to /root/nltk_data...


Let's analyze these three sentences.

In [3]:
positive = "This fried chicken tastes very good. It is juicy and perfectly cooked."
negative = "This fried chicken tasted bad. It is dry and overcooked."
ambiguous = "Except the amazing fried chicken everything else at the restaurant tastes very bad."

### VaderSentiment
The compound score ranges from -1 to 1, with -1 being the most negative and 1 being the most positive.

In [4]:
def sentiment_analyzer_scores(text):
    sentiment_analyzer = SentimentIntensityAnalyzer()
    score = sentiment_analyzer.polarity_scores(text)
    pprint(text)
    pprint(score)
    print("-"*30)

In [5]:
print("Positive:")
sentiment_analyzer_scores(positive)

print("Negative:")
sentiment_analyzer_scores(negative)

print("Ambiguous:")
sentiment_analyzer_scores(ambiguous)

Positive:
'This fried chicken tastes very good. It is juicy and perfectly cooked.'
{'compound': 0.8122, 'neg': 0.0, 'neu': 0.575, 'pos': 0.425}
------------------------------
Negative:
'This fried chicken tasted bad. It is dry and overcooked.'
{'compound': -0.5423, 'neg': 0.28, 'neu': 0.72, 'pos': 0.0}
------------------------------
Ambiguous:
('Except the amazing fried chicken everything else at the restaurant tastes '
 'very bad.')
{'compound': 0.0018, 'neg': 0.204, 'neu': 0.592, 'pos': 0.204}
------------------------------


As expected, the sentiment analyzer performed well on the positive and negative cases. For the ambiguous sentence, it calculated a compound sentiment close to 0, i.e., neutral.<br>
However, the sentence reads as a negative comment overall.
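
One way to turn the compound score into a label is to apply cut-offs. The sketch below uses the ±0.05 thresholds suggested in the VADER README; the helper name `label_from_compound` is an assumption made for illustration, and the scores passed in are the ones printed above.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Compound scores copied from the output above.
scores = [("positive", 0.8122), ("negative", -0.5423), ("ambiguous", 0.0018)]
for name, compound in scores:
    print(name, "->", label_from_compound(compound))
```

With these cut-offs the ambiguous sentence falls in the neutral band, which is the behaviour discussed above.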

In [6]:
def get_word_sentiment(text):
    sentiment_analyzer = SentimentIntensityAnalyzer()

    tokenized_text = nltk.word_tokenize(text)

    positive_words = []
    neutral_words = []
    negative_words = []
    for word in tokenized_text:
        # Score each token on its own (once) and bucket it by compound score.
        compound = sentiment_analyzer.polarity_scores(word)['compound']
        if compound >= 0.1:
            positive_words.append(word)
        elif compound <= -0.1:
            negative_words.append(word)
        else:
            neutral_words.append(word)
    print(text)
    print('Positive:', positive_words)
    print('Negative:', negative_words)
    print('Neutral:', neutral_words)
    print("-"*30)

In [7]:
get_word_sentiment(positive)
get_word_sentiment(negative)
get_word_sentiment(ambiguous)

This fried chicken tastes very good. It is juicy and perfectly cooked.
Positive: ['good', 'perfectly']
Negative: []
Neutral: ['This', 'fried', 'chicken', 'tastes', 'very', '.', 'It', 'is', 'juicy', 'and', 'cooked', '.']
------------------------------
This fried chicken tasted bad. It is dry and overcooked.
Positive: []
Negative: ['bad']
Neutral: ['This', 'fried', 'chicken', 'tasted', '.', 'It', 'is', 'dry', 'and', 'overcooked', '.']
------------------------------
Except the amazing fried chicken everything else at the restaurant tastes very bad.
Positive: ['amazing']
Negative: ['bad']
Neutral: ['Except', 'the', 'fried', 'chicken', 'everything', 'else', 'at', 'the', 'restaurant', 'tastes', 'very', '.']
------------------------------


### Stanford Core NLP
Before executing the code, we need to start the Stanford Core NLP server on our local machine.<br> To do that, follow the steps below (tested on Debian; they should work on other distributions too):
1. Download the Stanford Core NLP model from [here](https://stanfordnlp.github.io/CoreNLP/#download).
2. Unzip the folder
3. cd into the folder<br>
    ```cd stanford-corenlp-4.0.0/```
4. Start the server using this command:<br>
    ```java -mx5g -cp "./*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -timeout 10000```
<br><br>
If Java is not installed on your system, please install it from the official [Oracle](https://www.oracle.com/in/java/technologies/javase-downloads.html) page.
<br><br>

### Stanford Core NLP
## The same can be done in Colab using the commands below
1. Download the Stanford Core NLP model
```
wget 'https://nlp.stanford.edu/software/stanford-corenlp-4.5.4.zip'
```
Note: The latest version may differ, so you may need to get the current link from the download page linked above.

2. Unzip the file
```
unzip stanford-corenlp-4.5.4.zip
```

3. Move into the directory
```
cd stanford-corenlp-4.5.4
```

4. Start the server using this command:
```
java -mx5g -cp "./*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -timeout 10000
```

## Or just run the cell below and change the version as required

In [8]:
%%shell

version='4.5.4'
wget https://nlp.stanford.edu/software/stanford-corenlp-${version}.zip
unzip stanford-corenlp-${version}.zip

--2023-09-04 13:24:15--  https://nlp.stanford.edu/software/stanford-corenlp-4.5.4.zip
Resolving nlp.stanford.edu (nlp.stanford.edu)... 171.64.67.140
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:443... connected.
HTTP request sent, awaiting response... 302 FOUND
Location: https://downloads.cs.stanford.edu/nlp/software/stanford-corenlp-4.5.4.zip [following]
--2023-09-04 13:24:16--  https://downloads.cs.stanford.edu/nlp/software/stanford-corenlp-4.5.4.zip
Resolving downloads.cs.stanford.edu (downloads.cs.stanford.edu)... 171.64.64.22
Connecting to downloads.cs.stanford.edu (downloads.cs.stanford.edu)|171.64.64.22|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 506470124 (483M) [application/zip]
Saving to: ‘stanford-corenlp-4.5.4.zip’


2023-09-04 13:25:48 (5.27 MB/s) - ‘stanford-corenlp-4.5.4.zip’ saved [506470124/506470124]

Archive:  stanford-corenlp-4.5.4.zip
   creating: stanford-corenlp-4.5.4/
  inflating: stanford-corenlp-4.5.4/Makefile 



In [33]:
%%shell
cd stanford-corenlp-4.5.4/
nohup java -mx5g -cp './*' edu.stanford.nlp.pipeline.StanfordCoreNLPServer -timeout 10000 --port 9001 > corenlp.log 2>&1 &
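
Because the server is started in the background, it may not be ready to accept requests immediately. The following is a minimal sketch of a readiness check, assuming the server answers plain HTTP requests on the port chosen above; `wait_for_server` is a hypothetical helper and its retry settings are arbitrary.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url, retries=30, delay=2.0):
    """Poll `url` until something answers over HTTP or the retries run out."""
    for _ in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status.
            return True
        except (urllib.error.URLError, OSError):
            # Nothing is listening yet; wait and try again.
            time.sleep(delay)
    return False
```

In this notebook it could be called as `wait_for_server('http://localhost:9001')` before creating the client; if it returns `False`, `corenlp.log` is the place to look.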



In [32]:
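# Optional cleanup: stops any running CoreNLP server. Run this only before
# (re)starting the server; running it afterwards kills the server that the
# following cells depend on.
!ps aux | grep java
!killall java
!ps aux | grep java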

root        2027  0.0  0.0   7372  3548 ?        S    13:29   0:00 /bin/bash -c ps aux | grep java
root        2029  0.0  0.0   6480  2400 ?        S    13:29   0:00 grep java
java: no process found
root        2031  0.0  0.0   7372  3468 ?        S    13:29   0:00 /bin/bash -c ps aux | grep java
root        2033  0.0  0.0   6480  2372 ?        S    13:29   0:00 grep java


In [36]:
nlp = StanfordCoreNLP('http://localhost:9001')

def get_sentiment(text):
    res = nlp.annotate(text,
                       properties={'annotators': 'sentiment',
                                   'outputFormat': 'json',
                                   'timeout': 10000,
                       })
    # Depending on the environment, annotate() may return the raw JSON string
    # or an already-parsed dict; normalise to a dict.
    if isinstance(res, str):
        res = json.loads(res)
    # Only the first sentence of the text is reported.
    first = res['sentences'][0]
    print(text)
    print('Sentiment:', first['sentiment'])
    print('Sentiment score:', first['sentimentValue'])
    # The distribution has five classes, indexed 0 (very negative) to 4 (very positive).
    print('Sentiment distribution (0-v. negative, 4-v. positive):', first['sentimentDistribution'])
    print("-"*30)

In [37]:
get_sentiment(positive)
get_sentiment(negative)
get_sentiment(ambiguous)

This fried chicken tastes very good. It is juicy and perfectly cooked.
Sentiment: Negative
Sentiment score: 1
Sentiment distribution (0-v. negative, 4-v. positive): [0.12830923698552, 0.37878858949882, 0.30518256344905, 0.17180670417797, 0.01591290588864]
------------------------------
This fried chicken tasted bad. It is dry and overcooked.
Sentiment: Negative
Sentiment score: 1
Sentiment distribution (0-v. negative, 4-v. positive): [0.35691292388455, 0.38793571113551, 0.18201904294799, 0.04194609175503, 0.03118623027692]
------------------------------
Except the amazing fried chicken everything else at the restaurant tastes very bad.
Sentiment: Negative
Sentiment score: 1
Sentiment distribution (0-v. negative, 4-v. positive): [0.12830923590495, 0.37878858881094, 0.30518256399302, 0.1718067054989, 0.01591290579219]
------------------------------
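
`get_sentiment` reads only `res['sentences'][0]`, so for a two-sentence input the second sentence is never reported. The sketch below lists the label for every sentence in a response. `per_sentence_sentiment` is a hypothetical helper, and `sample_response` is a hand-written stand-in that mimics only the fields read above; it is not real server output.

```python
def per_sentence_sentiment(res):
    """Return (sentence number, label, value) for each sentence in a response dict."""
    return [
        (number, sentence["sentiment"], int(sentence["sentimentValue"]))
        for number, sentence in enumerate(res["sentences"], start=1)
    ]


# Hand-written stand-in with the same keys as the fields read above.
sample_response = {
    "sentences": [
        {"sentiment": "Positive", "sentimentValue": "3"},
        {"sentiment": "Negative", "sentimentValue": "1"},
    ]
}

for number, label, value in per_sentence_sentiment(sample_response):
    print(number, label, value)
```

Passing the parsed server response instead of `sample_response` would show whether the second sentence of the positive and negative examples is scored differently from the first.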


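Here the model labels the ambiguous sentence as Negative, whereas VADER scored it as close to neutral.<br>
Note, however, that in this run the positive example is also labeled Negative, and only the first sentence of each input is reported, so these results should be read with caution.<br>
The code in this notebook has been adapted from this [article](https://towardsdatascience.com/sentiment-analysis-beyond-words-6ca17a6c1b54). See the code below for Colab.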