# DaCy and Sentiment
DaCy does not currently include its own tools for sentiment extraction, but a couple of good tools already exist. DaCy therefore wraps these in its framework. If you use them in a publication or similar, be sure to credit the original authors.

In [1]:
import dacy
import spacy

## BertTone
--- 

BertTone is a model trained by DaNLP; actually, it is two models: one for classifying polarity (whether a sentence is positive, negative or neutral) and one for classifying subjectivity (whether a text is subjective or not).

To read more about BertTone, as well as its performance compared with other models, see DaNLP's [GitHub](https://github.com/alexandrainst/danlp/blob/master/docs/docs/tasks/sentiment_analysis.md).

Here I will show a simple use case of both models. If you wish to inspect the TransformerData, e.g. to see the wordpieces used, you can check out `doc._.berttone_subj_trf_data` or `doc._.berttone_pol_trf_data`.

In [2]:
nlp = spacy.blank("da")
nlp = dacy.sentiment.add_berttone_subjectivity(nlp)

Model bert.subjective exists in /Users/au561649/.danlp/bert.subjective


In [3]:
texts = ["Analysen viser, at økonomien bliver forfærdelig dårlig", 
         "Jeg tror alligvel, det bliver godt"]

docs = nlp.pipe(texts)

for doc in docs:
    print(doc._.subjectivity)
    print(doc._.subjectivity_prop)

objective
{'prop': array([1., 0.], dtype=float32), 'labels': ['objective', 'subjective']}
subjective
{'prop': array([0., 1.], dtype=float32), 'labels': ['objective', 'subjective']}
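
The `*_prop` attributes keep probabilities and labels in two parallel sequences. A minimal sketch of pairing them up is shown here; `label_probabilities` is a hypothetical helper, not part of DaCy, and the example dict is a stand-in copied from the output above so that the snippet runs without loading any model.

```python
from typing import Dict, Sequence


def label_probabilities(prop: Dict[str, Sequence]) -> Dict[str, float]:
    """Pair each label with its probability from a `*_prop`-style dict."""
    return {label: float(p) for label, p in zip(prop["labels"], prop["prop"])}


# Stand-in for doc._.subjectivity_prop, copied from the output above.
# A plain list replaces the NumPy array so the sketch has no dependencies.
example = {"prop": [0.0, 1.0], "labels": ["objective", "subjective"]}

probs = label_probabilities(example)
print(probs)                       # label -> probability
print(max(probs, key=probs.get))   # most probable label
```

With a real pipeline you would pass `doc._.subjectivity_prop` instead of `example`.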


In [4]:
nlp = dacy.sentiment.add_berttone_polarity(nlp)

docs = nlp.pipe(texts)

for doc in docs:
    print(doc._.polarity)
    print(doc._.polarity_prop)

Model bert.polarity exists in /Users/au561649/.danlp/bert.polarity
negative
{'prop': array([0.002, 0.008, 0.99 ], dtype=float32), 'labels': ['positive', 'neutral', 'negative']}
positive
{'prop': array([0.854, 0.146, 0.001], dtype=float32), 'labels': ['positive', 'neutral', 'negative']}
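
If a single number is more convenient than three class probabilities, one possible convention is to subtract the negative probability from the positive one. This is only an illustrative sketch: `polarity_score` is a hypothetical helper, the convention is an assumption of mine rather than something BertTone or DaCy defines, and the inputs are stand-ins copied from the output above.

```python
def polarity_score(prop):
    """Return P(positive) - P(negative), a value between -1 and 1."""
    probs = {label: float(p) for label, p in zip(prop["labels"], prop["prop"])}
    return probs["positive"] - probs["negative"]


LABELS = ["positive", "neutral", "negative"]

# Stand-ins for doc._.polarity_prop of the two example texts.
first = {"prop": [0.002, 0.008, 0.99], "labels": LABELS}
second = {"prop": [0.854, 0.146, 0.001], "labels": LABELS}

print(polarity_score(first))   # strongly negative
print(polarity_score(second))  # clearly positive
```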


## BertEmotion
---

Similar to BertTone, BertEmotion is a model trained by DaNLP, and it is also actually two models: one for classifying whether a text is emotionally laden or not, and one for emotion classification. The possible emotions are:

- "Glæde/Sindsro"
- "Tillid/Accept"
- "Forventning/Interrese"
- "Overasket/Målløs"
- "Vrede/Irritation"
- "Foragt/Modvilje"
- "Sorg/trist"
- "Frygt/Bekymret"

Their TransformerData can be accessed using `bertemotion_laden_trf_data` for the model predicting whether a text is emotionally laden and `bertemotion_emo_trf_data` for the model predicting emotion. As above, you can always use the `*_prop` suffix to extract the probabilities of each label.

In [5]:
nlp = dacy.sentiment.add_bertemotion_laden(nlp)  # whether a text is emotionally laden
nlp = dacy.sentiment.add_bertemotion_emo(nlp)    # what emotion is portrayed

Model bert.noemotion exists in /Users/au561649/.danlp/bert.noemotion
Model bert.emotion exists in /Users/au561649/.danlp/bert.emotion


In [6]:
texts = ['bilen er flot', 
         'jeg ejer en rød bil og det er en god bil', 
         'jeg ejer en rød bil men den er gået i stykker', 
         "Ifølge TV udsendelsen så bliver vejret skidt imorgen",  
         "Fuck jeg hader bare Hitler. Han er bare så FUCKING træls!",
         "Har i set at Tesla har landet en raket på månen? Det er vildt!!",
         "Nu må vi altså få ændret noget",
         "En sten kan ikke flyve. Morlille kan ikke flyve. Ergo er morlille en sten!"]

docs = nlp.pipe(texts)

for doc in docs:
    print(doc._.laden)
    print("\t", doc._.emotion)

Emotional
	 Tillid/Accept
Emotional
	 Tillid/Accept
Emotional
	 Sorg/trist
Emotional
	 Frygt/Bekymret
Emotional
	 Sorg/trist
Emotional
	 Overasket/Målløs
Emotional
	 Forventning/Interrese
Emotional
	 Foragt/Modvilje


The attentive reader would rightly wonder why the output is always Emotional. I wondered about this myself, so I tried a long range of examples. Unfortunately, I have not yet found one that yields the "No emotion" tag. I have reported this as a bug on DaNLP's [GitHub](https://github.com/alexandrainst/danlp/issues/122). As things stand, however, I would not recommend using the model in practical applications. This is especially clear in the next output, which also prints the probabilities:

In [7]:
docs = nlp.pipe(texts)
for doc in docs:
    print(doc._.laden_prop)

{'prop': array([1., 0.], dtype=float32), 'labels': ['Emotional', 'No emotion']}
{'prop': array([0.831, 0.169], dtype=float32), 'labels': ['Emotional', 'No emotion']}
{'prop': array([1., 0.], dtype=float32), 'labels': ['Emotional', 'No emotion']}
{'prop': array([1., 0.], dtype=float32), 'labels': ['Emotional', 'No emotion']}
{'prop': array([1., 0.], dtype=float32), 'labels': ['Emotional', 'No emotion']}
{'prop': array([1., 0.], dtype=float32), 'labels': ['Emotional', 'No emotion']}
{'prop': array([0.999, 0.001], dtype=float32), 'labels': ['Emotional', 'No emotion']}
{'prop': array([0.997, 0.003], dtype=float32), 'labels': ['Emotional', 'No emotion']}
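
The probabilities above can be summarised with a quick sanity check. The sketch below is not part of DaCy; it copies the eight probability pairs from the output into plain lists, counts how often each label wins, and reports the least confident winning probability.

```python
from collections import Counter

LABELS = ["Emotional", "No emotion"]

# Probabilities copied from the doc._.laden_prop output above.
laden_props = [
    [1.0, 0.0],
    [0.831, 0.169],
    [1.0, 0.0],
    [1.0, 0.0],
    [1.0, 0.0],
    [1.0, 0.0],
    [0.999, 0.001],
    [0.997, 0.003],
]


def predicted_label(prop, labels=LABELS):
    """Return the label with the highest probability."""
    best = max(range(len(prop)), key=lambda i: prop[i])
    return labels[best]


counts = Counter(predicted_label(p) for p in laden_props)
lowest_confidence = min(max(p) for p in laden_props)

print(counts)             # how often each label is predicted
print(lowest_confidence)  # weakest winning probability
```

A classifier that never predicts one of its two labels on such varied input is a warning sign worth checking before relying on it.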


## DaVader

---

DaVader is a Danish sentiment model developed using [Vader](https://github.com/cjhutto/vaderSentiment) and the dictionary lists of [SentiDa](https://github.com/guscode/sentida) and [AFINN](https://github.com/fnielsen/afinn). This adaptation was developed by the Center for Humanities Computing Aarhus and by the author of this package. It is a lexicon- and rule-based sentiment analysis tool that predicts sentiment valence (the degree to which a text is positive or negative), as opposed to BertTone, which simply predicts whether a text is positive, negative or neutral.

An additional advantage of it being rule-based is that it is transparent (the entire lexicon can be found in the sentiment folder) and very fast compared to transformer-based approaches.
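
To give a feel for what "lexicon and rule-based" means, here is a deliberately tiny sketch. It is not DaVader: the lexicon entries and their scores are made up for illustration, and the single negation rule is a simplification of the richer rule set that Vader-style tools apply.

```python
# Hypothetical toy lexicon; NOT the DaVader lexicon or its actual values.
TOY_LEXICON = {"flot": 2.0, "god": 1.5, "skidt": -1.5, "dårlig": -2.0}
NEGATIONS = {"ikke"}


def toy_valence(text: str) -> float:
    """Sum lexicon scores, flipping the sign of a word preceded by a negation."""
    tokens = [token.strip(".,!?") for token in text.lower().split()]
    total = 0.0
    for i, token in enumerate(tokens):
        score = TOY_LEXICON.get(token, 0.0)
        if i > 0 and tokens[i - 1] in NEGATIONS:
            score = -score  # rule: a preceding negation inverts the valence
        total += score
    return total


print(toy_valence("bilen er flot"))       # positive valence
print(toy_valence("bilen er ikke flot"))  # negated, so negative valence
print(toy_valence("jeg ejer en rød bil")) # no lexicon words, so zero
```

Because every score comes from a table lookup and an explicit rule, it is easy to trace why a given text received its valence.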

In [8]:
from spacy.tokens import Doc
from dacy.sentiment import da_vader_getter

Doc.set_extension("vader_da", getter=da_vader_getter)

texts = ['bilen er flot', 'jeg ejer en rød bil og det er en god bil', 'jeg ejer en rød bil men den er gået i stykker']

docs = nlp.pipe(texts)

for doc in docs:
    print(doc._.vader_da)

FileNotFoundError: [Errno 2] No such file or directory: '/Users/au561649/.virtualenvs/NLP/lib/python3.8/site-packages/dacy/sentiment/vader_lexicon_da.csv'
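
The traceback above indicates that the lexicon file was not found inside the installed package. A minimal sketch for checking whether a data file ships with an installed package follows; `package_file` is a hypothetical helper, and the relative path is taken from the error message rather than from any documented DaCy layout.

```python
from importlib.util import find_spec
from pathlib import Path
from typing import Optional


def package_file(package: str, relative_path: str) -> Optional[Path]:
    """Return the path to a file inside an installed package, or None if missing."""
    spec = find_spec(package)  # locates the package without importing it
    if spec is None or spec.origin is None:
        return None
    candidate = Path(spec.origin).parent / relative_path
    return candidate if candidate.is_file() else None


# Relative path assumed from the FileNotFoundError message above.
print(package_file("dacy", "sentiment/vader_lexicon_da.csv"))
```

If this prints `None`, the installed distribution likely does not contain the lexicon, and reinstalling or upgrading the package may be worth trying.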