<a href="https://colab.research.google.com/github/Viny2030/NLP/blob/main/01_Aspect_Based_Sentiment_analysis_(2).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this notebook we will demonstrate aspect-based sentiment analysis using [VADER](https://github.com/cjhutto/vaderSentiment) and [Stanford Core NLP](https://stanfordnlp.github.io/CoreNLP/index.html).<br>
<br>**VADER Sentiment Analysis**: VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains. (source: [github](https://github.com/cjhutto/vaderSentiment))<br>
Stanford NLP has a live demo of aspect-based sentiment analysis [here](http://nlp.stanford.edu:8080/sentiment/rntnDemo.html).<br><br>
**Stanford Core NLP**: "Most sentiment prediction systems work just by looking at words in isolation, giving positive points for positive words and negative points for negative words and then summing up these points. That way, the order of words is ignored and important information is lost. In contrast, our new deep learning model actually builds up a representation of whole sentences based on the sentence structure. It computes the sentiment based on how words compose the meaning of longer phrases. This way, the model is not as easily fooled as previous models." (source: [Stanford Core NLP](https://nlp.stanford.edu/sentiment/index.html))
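
To make the quoted point concrete, here is a minimal sketch of a word-counting scorer. The tiny lexicon and the function name `bag_of_words_score` are made up for illustration and are not part of VADER or Core NLP. Because the score is a plain sum, any reordering of the same words receives the same score:

```python
# Toy lexicon: hypothetical valences chosen only for this illustration.
TOY_LEXICON = {"good": 1, "amazing": 2, "bad": -1, "terrible": -2}

def bag_of_words_score(text):
    """Sum per-word valences, ignoring word order and sentence structure."""
    words = text.lower().replace(".", " ").replace(",", " ").split()
    return sum(TOY_LEXICON.get(word, 0) for word in words)

a = "The food was good, the service was bad"
b = "The food was bad, the service was good"
# Both sentences contain the same words, so the scorer cannot tell them apart.
print(bag_of_words_score(a), bag_of_words_score(b))
```

The two sentences say opposite things about the food, yet a scorer of this kind has no way to distinguish them.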


In [1]:
!pip install vaderSentiment
!pip install pycorenlp

Collecting vaderSentiment
  Downloading vaderSentiment-3.3.2-py2.py3-none-any.whl.metadata (572 bytes)
Downloading vaderSentiment-3.3.2-py2.py3-none-any.whl (125 kB)
Installing collected packages: vaderSentiment
Successfully installed vaderSentiment-3.3.2
Collecting pycorenlp
  Downloading pycorenlp-0.3.0.tar.gz (1.3 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: pycorenlp
  Building wheel for pycorenlp (setup.py) ... done
  Created wheel for pycorenlp: filename=pycorenlp-0.3.0-py3-none-any.whl size=2121 sha256=5774d9c1cf9a8ecaf360c8bc9d749e851fc898a5e7c37e001de5b8d99cc06d0c
  Stored in directory: /root/.cache/pip/wheels/68/91/be/b83633256a1655afb34c5ea44b3290af84417a144e1f13e56f
Successfully built pycorenlp
Installing collected packages: pycorenlp
Successfully installed pycorenlp-0.3.0


### Importing the necessary packages

In [2]:
from pprint import pprint

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
import re
import string

import nltk
nltk.download('punkt')
nltk.download('vader_lexicon')
from nltk.tokenize import word_tokenize, RegexpTokenizer

from pycorenlp import StanfordCoreNLP

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package vader_lexicon to /root/nltk_data...


Let's analyze these three sentences.

In [3]:
positive = "This fried chicken tastes very good. It is juicy and perfectly cooked."
negative = "This fried chicken tasted bad. It is dry and overcooked."
ambiguous = "Except the amazing fried chicken everything else at the restaurant tastes very bad."

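### VADER Sentiment
VADER's `compound` score ranges from -1 (most negative) to 1 (most positive). The `neg`, `neu`, and `pos` values give the proportions of the text that fall into each category.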
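
For intuition about where the -1 to 1 range comes from: as far as we can tell from the vaderSentiment source, the summed word valences are squashed with x / sqrt(x² + α), with α = 15. The sketch below re-implements only that normalization step; the name `normalize_valence` is ours, and the input totals are arbitrary.

```python
import math

def normalize_valence(total, alpha=15):
    """Map an unbounded valence sum into the open interval (-1, 1)."""
    return total / math.sqrt(total * total + alpha)

# Arbitrary totals, just to show how the curve flattens toward -1 and 1.
for total in (-10, -1, 0, 1, 10):
    print(total, round(normalize_valence(total), 4))
```

Large positive or negative sums approach the ends of the range without reaching them, which is why a single strongly worded sentence can already score close to 1 or -1.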

In [4]:
def sentiment_analyzer_scores(text):
    sentiment_analyzer = SentimentIntensityAnalyzer()
    score = sentiment_analyzer.polarity_scores(text)
    pprint(text)
    pprint(score)
    print("-"*30)

In [5]:
print("Positive:")
sentiment_analyzer_scores(positive)

print("Negative:")
sentiment_analyzer_scores(negative)

print("Ambiguous:")
sentiment_analyzer_scores(ambiguous)

Positive:
'This fried chicken tastes very good. It is juicy and perfectly cooked.'
{'compound': 0.8122, 'neg': 0.0, 'neu': 0.575, 'pos': 0.425}
------------------------------
Negative:
'This fried chicken tasted bad. It is dry and overcooked.'
{'compound': -0.5423, 'neg': 0.28, 'neu': 0.72, 'pos': 0.0}
------------------------------
Ambiguous:
('Except the amazing fried chicken everything else at the restaurant tastes '
 'very bad.')
{'compound': 0.0018, 'neg': 0.204, 'neu': 0.592, 'pos': 0.204}
------------------------------


As expected, the sentiment analyzer performed well on the positive and negative cases. For the ambiguous sentence, the positive and negative proportions cancel out (`pos` and `neg` are both 0.204), so the compound score is close to 0, i.e., neutral.<br>
However, a human reader would take it as a negative comment.
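
The vaderSentiment README suggests turning the compound score into a label with thresholds of ±0.05. A minimal sketch of that rule follows; the function name `label_from_compound` is ours, and the scores are copied from the output above:

```python
def label_from_compound(compound, threshold=0.05):
    """Turn a VADER compound score into a coarse label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

# Compound scores copied from the output above.
scores = {"positive": 0.8122, "negative": -0.5423, "ambiguous": 0.0018}
for name, compound in scores.items():
    print(name, "->", label_from_compound(compound))
```

Under this rule the ambiguous sentence is labelled neutral, which is the mismatch discussed above.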

In [6]:
def get_word_sentiment(text):
    """Print the tokens of `text` grouped by their individual VADER polarity."""
    sentiment_analyzer = SentimentIntensityAnalyzer()

    tokenized_text = nltk.word_tokenize(text)

    positive_words = []
    neutral_words = []
    negative_words = []
    for word in tokenized_text:
        # Score each token once and bucket it by its compound score.
        compound = sentiment_analyzer.polarity_scores(word)['compound']
        if compound >= 0.1:
            positive_words.append(word)
        elif compound <= -0.1:
            negative_words.append(word)
        else:
            neutral_words.append(word)
    print(text)
    print('Positive:', positive_words)
    print('Negative:', negative_words)
    print('Neutral:', neutral_words)
    print("-"*30)

In [8]:
import nltk
nltk.download('punkt_tab')

[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.


True

In [9]:
get_word_sentiment(positive)
get_word_sentiment(negative)
get_word_sentiment(ambiguous)

This fried chicken tastes very good. It is juicy and perfectly cooked.
Positive: ['good', 'perfectly']
Negative: []
Neutral: ['This', 'fried', 'chicken', 'tastes', 'very', '.', 'It', 'is', 'juicy', 'and', 'cooked', '.']
------------------------------
This fried chicken tasted bad. It is dry and overcooked.
Positive: []
Negative: ['bad']
Neutral: ['This', 'fried', 'chicken', 'tasted', '.', 'It', 'is', 'dry', 'and', 'overcooked', '.']
------------------------------
Except the amazing fried chicken everything else at the restaurant tastes very bad.
Positive: ['amazing']
Negative: ['bad']
Neutral: ['Except', 'the', 'fried', 'chicken', 'everything', 'else', 'at', 'the', 'restaurant', 'tastes', 'very', '.']
------------------------------
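
If the word groups are needed for further processing rather than for display, the same bucketing can return its result. This is only a sketch: `bucket_words` and its `score_fn` parameter are hypothetical, and the stub scores below are invented so that the example runs without the VADER lexicon.

```python
def bucket_words(tokens, score_fn, threshold=0.1):
    """Group tokens into positive/negative/neutral lists by score_fn(token)."""
    buckets = {"positive": [], "negative": [], "neutral": []}
    for token in tokens:
        score = score_fn(token)
        if score >= threshold:
            buckets["positive"].append(token)
        elif score <= -threshold:
            buckets["negative"].append(token)
        else:
            buckets["neutral"].append(token)
    return buckets

# Stand-in scorer with made-up values; in the notebook the scorer would wrap
# SentimentIntensityAnalyzer().polarity_scores(word)['compound'] instead.
stub_scores = {"amazing": 0.6, "bad": -0.5}
tokens = "Except the amazing fried chicken everything else tastes very bad".split()
result = bucket_words(tokens, lambda w: stub_scores.get(w, 0.0))
print(result["positive"], result["negative"])
```

Passing the scorer in as an argument also means the analyzer can be created once and reused across calls.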


### Stanford Core NLP
Before executing the code, we need to start the Stanford Core NLP server on our local machine.<br> To do that, follow the steps below (tested on Debian; they should work on other distributions too):
1. Download Stanford Core NLP from [here](https://stanfordnlp.github.io/CoreNLP/#download).
2. Unzip the folder.
3. cd into the folder:<br>
    ```cd stanford-corenlp-4.0.0/```
4. Start the server using this command:<br>
    ```java -mx5g -cp "./*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -timeout 10000```
<br><br>
If you do not have Java installed on your system, please install it from the official [Oracle](https://www.oracle.com/in/java/technologies/javase-downloads.html) page.
<br><br>
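
Before sending text to the server, it can help to confirm that something is actually listening on the expected port. The sketch below uses only the standard library. `server_is_listening` is a hypothetical helper, and it checks for any TCP listener, not specifically for Core NLP.

```python
import socket

def server_is_listening(host="localhost", port=9000, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False

if not server_is_listening():
    print("No server on localhost:9000; start Core NLP before calling it.")
```

Port 9000 is the server's default; pass a different `port` if the server was started with `-port`.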



In [10]:
# Client for the Core NLP server described above (default port 9000).
nlp = StanfordCoreNLP('http://localhost:9000')

def get_sentiment(text):
    # Ask the server to run the sentiment annotator and return JSON.
    res = nlp.annotate(text,
                       properties={'annotators': 'sentiment',
                                   'outputFormat': 'json',
                                   'timeout': 1000,
                       })
    # Only the first sentence of `text` is reported.
    print(text)
    print('Sentiment:', res['sentences'][0]['sentiment'])
    print('Sentiment score:', res['sentences'][0]['sentimentValue'])
    print('Sentiment distribution (0-v. negative, 4-v. positive):', res['sentences'][0]['sentimentDistribution'])
    print("-"*30)
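
Since `get_sentiment` reads only `res['sentences'][0]`, a text with several sentences is reported by its first sentence alone. A minimal sketch of summarizing every sentence follows, assuming the response has the same keys used above (`sentiment`, `sentimentValue`, `sentimentDistribution`). `summarize_sentences` and `CLASS_NAMES` are hypothetical names, and `fake_res` is a hand-written stand-in with invented numbers, not real server output.

```python
# Index-to-label mapping for the five sentiment classes (0 to 4).
CLASS_NAMES = ["very negative", "negative", "neutral", "positive", "very positive"]

def summarize_sentences(res):
    """Return (sentiment, value, most likely class) for every sentence in res."""
    summary = []
    for sentence in res["sentences"]:
        distribution = sentence["sentimentDistribution"]
        # Index of the class with the highest probability.
        best = max(range(len(distribution)), key=distribution.__getitem__)
        summary.append((sentence["sentiment"],
                        int(sentence["sentimentValue"]),
                        CLASS_NAMES[best]))
    return summary

# Hand-written stand-in for a server response; the numbers are invented.
fake_res = {"sentences": [
    {"sentiment": "Positive", "sentimentValue": "3",
     "sentimentDistribution": [0.02, 0.08, 0.20, 0.55, 0.15]},
    {"sentiment": "Negative", "sentimentValue": "1",
     "sentimentDistribution": [0.10, 0.60, 0.20, 0.08, 0.02]},
]}
for row in summarize_sentences(fake_res):
    print(row)
```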

In [18]:
!wget https://nlp.stanford.edu/software/stanford-corenlp-full-2018-10-05.zip
!unzip stanford-corenlp-full-2018-10-05.zip

--2024-12-08 23:12:41--  https://nlp.stanford.edu/software/stanford-corenlp-full-2018-10-05.zip
Resolving nlp.stanford.edu (nlp.stanford.edu)... 171.64.67.140
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:443... connected.
HTTP request sent, awaiting response... 302 FOUND
Location: https://downloads.cs.stanford.edu/nlp/software/stanford-corenlp-full-2018-10-05.zip [following]
--2024-12-08 23:12:41--  https://downloads.cs.stanford.edu/nlp/software/stanford-corenlp-full-2018-10-05.zip
Resolving downloads.cs.stanford.edu (downloads.cs.stanford.edu)... 171.64.64.22
Connecting to downloads.cs.stanford.edu (downloads.cs.stanford.edu)|171.64.64.22|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 393239982 (375M) [application/zip]
Saving to: ‘stanford-corenlp-full-2018-10-05.zip’


2024-12-08 23:13:52 (5.36 MB/s) - ‘stanford-corenlp-full-2018-10-05.zip’ saved [393239982/393239982]

Archive:  stanford-corenlp-full-2018-10-05.zip
   creating: stanford-

In [21]:
!java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000

Error: Could not find or load main class edu.stanford.nlp.pipeline.StanfordCoreNLPServer
Caused by: java.lang.ClassNotFoundException: edu.stanford.nlp.pipeline.StanfordCoreNLPServer
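
The `ClassNotFoundException` above is what Java reports when the Core NLP jars are not on the classpath. A likely cause here is that `-cp "*"` was resolved against the notebook's working directory rather than the unzipped folder. The sketch below shows one possible way around it: launching the server as a background process from inside that folder. The directory name `stanford-corenlp-full-2018-10-05` is an assumption based on the zip file name, and `build_server_command` and `start_server` are hypothetical helpers, not part of pycorenlp.

```python
import subprocess

def build_server_command(port=9000, timeout_ms=15000, memory="4g"):
    """Build the java command line used to start the Core NLP server."""
    return ["java", f"-mx{memory}", "-cp", "*",
            "edu.stanford.nlp.pipeline.StanfordCoreNLPServer",
            "-port", str(port), "-timeout", str(timeout_ms)]

def start_server(corenlp_dir, **kwargs):
    """Start the server in the background with corenlp_dir as working directory."""
    # cwd makes the "*" classpath refer to the jars inside corenlp_dir.
    # Popen returns immediately, so the notebook cell does not block.
    return subprocess.Popen(build_server_command(**kwargs), cwd=corenlp_dir)

print(" ".join(build_server_command()))
# Assumed folder name; adjust it to whatever unzip actually created:
# server = start_server("stanford-corenlp-full-2018-10-05")
```

The server needs some time to load its models after the process starts, so the first request may have to wait.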


In [23]:
from pycorenlp import StanfordCoreNLP

nlp = StanfordCoreNLP('http://localhost:9000')  # Make sure this port matches the server

def get_sentiment(text):
    res = nlp.annotate(text,
                       properties={'annotators': 'sentiment',
                                   'outputFormat': 'json',
                                   'timeout': 1000,
                       })
    print(text)
    print('Sentiment:', res['sentences'][0]['sentiment'])
    print('Sentiment score:', res['sentences'][0]['sentimentValue'])
    print('Sentiment distribution (0-v. negative, 4-v. positive):', res['sentences'][0]['sentimentDistribution'])
    print("-"*30)

In [24]:
get_sentiment(positive)
get_sentiment(negative)
get_sentiment(ambiguous)

Exception: Check whether you have started the CoreNLP server e.g.
$ cd stanford-corenlp-full-2015-12-09/ 
$ java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer

In this run the call failed because the Core NLP server was not running on port 9000. Once the server is up, the model is expected to classify the ambiguous sentence correctly, which VADER failed to do.<br>
The code in this notebook has been adapted from this [article](https://towardsdatascience.com/sentiment-analysis-beyond-words-6ca17a6c1b54). See the code below for Colab.