# Evaluating Your Dictionary-based Sentiment Analyzer

## Importance to project

- Later you will build a neural network-based sentiment analyzer and compare its performance with that of your dictionary-based sentiment analyzer. To make this comparison, you must first evaluate the dictionary-based sentiment analyzer.
- NLP specialists often evaluate their models in terms of accuracy, precision, and recall. Knowing these values gives you feedback on the quality of your models' predictions and hints about which parts of a model should be fine-tuned.
- This kind of evaluation applies not only to sentiment analyzers but to any type of classifier.
- NLP specialists like to see their results in both textual and visual form. For the visual form, they tend to use a confusion matrix.

## Workflow

1. Convert the sentiment scores in small_corpus.csv into sentiment values: categorize each review as positive, negative, or neutral on the basis of its sentiment score.

   - If you want to get the same results as the reference solution, use the following categories:
   - If the sentiment score of the review is over 0.2, its sentiment value is positive.
   - If the sentiment score of the review is lower than or equal to 0.2 and higher than or equal to -0.2, its sentiment value is neutral.
   - If the sentiment score of the review is under -0.2, its sentiment value is negative.
2. Convert the review ratings into rating classes: categorize each review as positive, negative, or neutral on the basis of its rating.

   - If you want to get the same results as the reference solution, use the following categories:
   - If the review rating is 5, its rating class is positive.
   - If the review rating is 2, 3, or 4, its rating class is neutral.
   - If the review rating is 1, its rating class is negative.
3. Add two columns to the table and export your results to the data file. One column should contain the sentiment values of the reviews; the other should contain the rating classes.

4. Evaluate your dictionary-based sentiment analyzer in two steps:

   - 4.1 Calculate the following values:
      - What proportion of all reviews does your dictionary-based sentiment analyzer categorize correctly as positive, negative, or neutral? (This value is called accuracy.)
      - What is the ratio of correct predictions per category? For example, of all the reviews your sentiment analyzer categorizes as positive, how many are actually positive? (This value is called precision.)
      - What portion of the reviews in each category is categorized correctly? For example, 300 reviews are positive, and your sentiment analyzer categorizes only 200 of those 300 as positive. (This value is called recall.)
      - It is recommended to use the metrics module of the scikit-learn package to perform these tasks.
   - 4.2 Write a textual summary of the performance of your sentiment analyzer.
      - It is recommended to use the classification_report function of the metrics module.
5. Illustrate the evaluation of your sentiment analyzer. Create a confusion matrix in Altair.
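
The three metrics from step 4.1 can be worked through by hand on a toy example. The sketch below uses six invented labels (not taken from the corpus) and a hypothetical helper named `evaluate`. It only illustrates the definitions; the metrics module of scikit-learn remains the recommended tool for the actual task.

```python
from collections import Counter

# Toy labels, invented for illustration only (not taken from the corpus).
y_true = ["positive", "positive", "positive", "neutral", "neutral", "negative"]
y_pred = ["positive", "neutral", "positive", "neutral", "positive", "neutral"]


def evaluate(y_true, y_pred):
    """Return overall accuracy and per-class precision and recall."""
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    labels = sorted(set(y_true) | set(y_pred))
    # Accuracy: correctly categorized reviews divided by all reviews.
    accuracy = sum(pairs[(label, label)] for label in labels) / len(y_true)
    per_class = {}
    for label in labels:
        correct = pairs[(label, label)]
        predicted_as_label = sum(pairs[(t, label)] for t in labels)
        actually_label = sum(pairs[(label, p)] for p in labels)
        per_class[label] = {
            # Precision: correct predictions among everything predicted as this class.
            "precision": correct / predicted_as_label if predicted_as_label else 0.0,
            # Recall: correct predictions among everything that truly is this class.
            "recall": correct / actually_label if actually_label else 0.0,
        }
    return accuracy, per_class


accuracy, per_class = evaluate(y_true, y_pred)
print("accuracy:", accuracy)
for label, scores in per_class.items():
    print(label, scores)
```

In this toy data the analyzer never predicts the negative class, so the sketch reports a precision of 0.0 for it rather than dividing by zero; that fallback is a choice made here for illustration.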

## Deliverable

The deliverable for this milestone is a Jupyter Notebook, which documents your workflow with the following items:

- sentiment values of the reviews
- rating classes
- textual evaluation containing precision, recall, and accuracy
- confusion matrix

Upload a link to your deliverable in the Submit Your Work section and click Submit. After you submit, the author's solution and peer solutions will appear on the page for you to examine.

In [59]:
import nltk
import pandas as pd

In [60]:
df = pd.read_csv("..\\data\\processed\\dictionary_based_sentiment_with_negation.tsv", sep="\t")
df

Unnamed: 0,rating,review,review sentiment
0,1.0,Yet another garbage CoD game. Zombies is unpla...,-0.008081
1,1.0,$80? .... No way. This is NOT worth $80. $80?....,0.033333
2,1.0,One of the worst games ever. I bought and down...,-0.154938
3,1.0,I did a lot of homework before I decided to by...,0.032363
4,1.0,"I am really into RPG games, I loved Skyrim, Bo...",-0.141755
...,...,...,...
7495,5.0,"Worked good, girlfriend loves this game. Five ...",0.250000
7496,5.0,This is my 3rd Mystery PI game and I've enjoye...,0.077381
7497,5.0,work like brand new wont brake any time soon,0.400000
7498,5.0,This remote works fantastic. I love it. Five S...,0.277778


In [61]:
ratings = list(df["rating"])
reviews = list(df["review"])
reviews = [str(e) for e in reviews]
sentiment = list(df["review sentiment"])

In [62]:
def get_rating_class(rating):
    if rating > 4:
        return "positive"
    elif 2 <= rating <= 4:
        return "neutral"
    else:
        return "negative"

In [63]:
def get_sentiment_value(sentiment):
    if sentiment > 0.2:
        return "positive"
    elif -0.2 <= sentiment <= 0.2:
        return "neutral"
    else:
        return "negative"

In [64]:
def check_status(e):
    if e[0] == e[1]:
        return "OK"
    else:
        return "CHECK"

In [65]:
rating_classes = [get_rating_class(e) for e in ratings]
sentiment_values = [get_sentiment_value(e) for e in sentiment]
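
The list comprehensions above call the mapping function once per row. A vectorized variant is sketched below with `numpy.select`, using invented scores that include both boundary values. The variable names are hypothetical, and the thresholds are the ones from the workflow description.

```python
import numpy as np

# Invented scores that include both boundary values (0.2 and -0.2).
scores = np.array([0.25, 0.2, 0.0, -0.2, -0.21])

# Same thresholds as get_sentiment_value: strictly above 0.2 is positive,
# strictly below -0.2 is negative, everything else (boundaries included) is neutral.
sentiment_labels = np.select(
    [scores > 0.2, scores < -0.2],
    ["positive", "negative"],
    default="neutral",
)
print(sentiment_labels.tolist())
```

Checking the boundary values this way is a quick test that a score of exactly 0.2 or -0.2 ends up in the neutral class, as the workflow requires.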

In [66]:
# just for marking those reviews that should be checked
check = [check_status(e) for e in zip(rating_classes, sentiment_values)]

new_df = pd.DataFrame(
    {
        "ratings": ratings,
        "sentiment_value": sentiment_values,
        "rating_class": rating_classes,
        "status": check,
        "reviews": reviews,
    }
)


In [67]:
# df = pd.DataFrame(
#     {
#         "rating": ratings,
#         "review": reviews,
#         "review sentiment": review_sentiments,
#     }
# )

In [68]:
new_df

Unnamed: 0,ratings,sentiment_value,rating_class,status,reviews
0,1.0,neutral,negative,CHECK,Yet another garbage CoD game. Zombies is unpla...
1,1.0,neutral,negative,CHECK,$80? .... No way. This is NOT worth $80. $80?....
2,1.0,neutral,negative,CHECK,One of the worst games ever. I bought and down...
3,1.0,neutral,negative,CHECK,I did a lot of homework before I decided to by...
4,1.0,neutral,negative,CHECK,"I am really into RPG games, I loved Skyrim, Bo..."
...,...,...,...,...,...
7495,5.0,positive,positive,OK,"Worked good, girlfriend loves this game. Five ..."
7496,5.0,neutral,positive,CHECK,This is my 3rd Mystery PI game and I've enjoye...
7497,5.0,positive,positive,OK,work like brand new wont brake any time soon
7498,5.0,positive,positive,OK,This remote works fantastic. I love it. Five S...


In [69]:
# Save the table that contains the sentiment values and rating classes.
new_df.to_csv("..\\data\\processed\\dict_based_sent_w_neg_w_classes.tsv", index=False, sep="\t")

In [70]:
# evaluation
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report

acc = accuracy_score(rating_classes, sentiment_values)
print("Accuracy = {}\n".format(acc))
print("Classification_report:\n ")
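target_names = ["negative", "neutral", "positive"]
# Pass labels explicitly so the rows of the report are guaranteed to match target_names.
print(
    classification_report(
        rating_classes, sentiment_values, labels=target_names, target_names=target_names
    )
)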

Accuracy = 0.6286666666666667

Classification_report:
 
              precision    recall  f1-score   support

    negative       0.65      0.08      0.14      1500
     neutral       0.63      0.93      0.75      4500
    positive       0.60      0.29      0.39      1500

    accuracy                           0.63      7500
   macro avg       0.63      0.43      0.43      7500
weighted avg       0.63      0.63      0.56      7500
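
Neutral reviews make up 4500 of the 7500 rows, so accuracy alone can be misleading. As a rough point of comparison, the sketch below computes the accuracy of a hypothetical baseline that assigns every review the most frequent rating class. It uses only the support counts from the report above.

```python
# Class supports copied from the classification report above.
support = {"negative": 1500, "neutral": 4500, "positive": 1500}

total = sum(support.values())
majority_label = max(support, key=support.get)

# A baseline that always predicts the majority class is right exactly
# as often as that class occurs among the true labels.
baseline_accuracy = support[majority_label] / total
print(majority_label, baseline_accuracy)
```

The baseline reaches 0.6, so the measured accuracy of about 0.63 is only slightly higher. The per-class recall values in the report (0.08 for negative, 0.29 for positive) show where the analyzer falls short.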



In [71]:
import altair as alt
import numpy as np
from sklearn.metrics import confusion_matrix

In [72]:
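labels = ["negative", "neutral", "positive"]
cm = confusion_matrix(rating_classes, sentiment_values, labels=labels)
# cm[i, j] counts reviews with true class i that were predicted as class j.
# cm.ravel() walks the matrix row by row, so the row index is the true class.
predicted_idx, true_idx = np.meshgrid(range(0, 3), range(0, 3))
source = pd.DataFrame({"true": true_idx.ravel(), 
                       "predicted": predicted_idx.ravel(), 
                       "number": cm.ravel()})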

In [73]:
chart = (
    alt.Chart(source)
    .mark_rect()
    .encode(x="true:O", y="predicted:O", color="number:Q", tooltip=["number"])
    .interactive()
    .properties(width=800, height=500)
)
chart.save(".\\plots\\01\\confusion_matrix.html")
chart
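
The chart above labels its axes with the class indices 0, 1, and 2. A possible alternative, sketched here on invented toy labels, builds the long-form table with the class names themselves so that the axes become readable. The lists `y_true` and `y_pred` are hypothetical stand-ins for `rating_classes` and `sentiment_values`.

```python
from collections import Counter

import pandas as pd

labels = ["negative", "neutral", "positive"]

# Toy labels, invented for illustration only.
y_true = ["negative", "neutral", "neutral", "positive", "positive", "neutral"]
y_pred = ["neutral", "neutral", "neutral", "positive", "neutral", "positive"]

# Count every (true, predicted) pair; pairs that never occur count as 0.
pairs = Counter(zip(y_true, y_pred))

# One row per cell of the confusion matrix, keyed by class name.
source = pd.DataFrame(
    [
        {"true": t, "predicted": p, "number": pairs[(t, p)]}
        for t in labels
        for p in labels
    ]
)
print(source)
```

Passing this table to `alt.Chart(source).mark_rect()` with `x="predicted:N"`, `y="true:N"`, and `color="number:Q"` should then display the class names on both axes.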