# Evaluate a real classifier

This notebook shows two sentiment tools from NLTK. The first is a Naive Bayes classifier trained on the `movie_reviews` corpus. The second is VADER, a rule-based analyzer built on a sentiment lexicon; it is not trained here.

See how the scikit-learn library (`sklearn`) is used to evaluate the Naive Bayes classifier.

At the end there is an example of how to use VADER on custom sentences.


In [8]:
import nltk
from nltk.corpus import movie_reviews
from nltk.classify import NaiveBayesClassifier
from nltk.classify.util import accuracy
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.sentiment import SentimentIntensityAnalyzer
from sklearn.metrics import classification_report, confusion_matrix
import random


In [None]:
# Download required NLTK datasets
nltk.download('movie_reviews')
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('vader_lexicon')

[nltk_data] Downloading package movie_reviews to
[nltk_data]     C:\Users\victo\AppData\Roaming\nltk_data...
[nltk_data]   Package movie_reviews is already up-to-date!
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\victo\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\victo\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package vader_lexicon to
[nltk_data]     C:\Users\victo\AppData\Roaming\nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!

In [None]:
# Preprocess the data
stop_words = set(stopwords.words('english'))

def extract_features(words):
    return {word: True for word in words if word.lower() not in stop_words}
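
To make the shape of the feature dictionary concrete, here is a minimal sketch that runs without any NLTK data. The small `demo_stop_words` set is a hypothetical stand-in for `stopwords.words('english')`, and `extract_features_demo` and the token list are made up for illustration.

```python
# Hypothetical stand-in for the NLTK English stop-word list (assumption).
demo_stop_words = {"the", "was", "a", "and", "but"}

def extract_features_demo(words, stop_words=demo_stop_words):
    """Bag-of-words presence features: every non-stop-word maps to True."""
    return {word: True for word in words if word.lower() not in stop_words}

# Made-up tokens, not taken from the movie_reviews corpus.
tokens = ["The", "plot", "was", "ludicrous", "but", "the", "acting", "was", "great"]
features = extract_features_demo(tokens)
print(features)
```

Only the stop-word comparison is lowercased; the dictionary keys keep the casing of the input tokens, and repeated words collapse into a single key.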


In [None]:
# Prepare the dataset
documents = [(list(movie_reviews.words(fileid)), category)
             for category in movie_reviews.categories()
             for fileid in movie_reviews.fileids(category)]
random.shuffle(documents)  # Shuffle so both categories end up in the training and test sets

# Feature extraction
feature_sets = [(extract_features(words), category) for (words, category) in documents]

# Split the data into training and testing sets
train_size = int(len(feature_sets) * 0.8)
train_set, test_set = feature_sets[:train_size], feature_sets[train_size:]

# Train a Naive Bayes Classifier
classifier = NaiveBayesClassifier.train(train_set)
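
The shuffle above is not seeded, so the accuracy and the report change from run to run. If reproducible numbers are wanted, one option is a seeded split along these lines. This is a sketch, not part of the original notebook: `split_dataset`, the seed value, and the toy data are all hypothetical.

```python
import random

def split_dataset(items, train_fraction=0.8, seed=42):
    """Shuffle a copy of items with a fixed seed, then split into train/test."""
    shuffled = list(items)                 # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # private RNG, so the order repeats
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Toy data standing in for feature_sets (assumption).
toy_items = list(range(10))
train_a, test_a = split_dataset(toy_items)
train_b, test_b = split_dataset(toy_items)
print(len(train_a), len(test_a))
print(train_a == train_b and test_a == test_b)
```

Using a private `random.Random(seed)` instance keeps the split repeatable without changing the state of the global random generator.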

In [None]:
# Evaluate the classifier
print("\nNaive Bayes Classifier Evaluation:")
print(f"Accuracy: {accuracy(classifier, test_set) * 100:.2f}%")
classifier.show_most_informative_features(10)






Naive Bayes Classifier Evaluation:
Accuracy: 71.25%
Most Informative Features
               ludicrous = True              neg : pos    =     12.8 : 1.0
               depiction = True              pos : neg    =     11.2 : 1.0
                  avoids = True              pos : neg    =     10.5 : 1.0
                   stark = True              pos : neg    =     10.5 : 1.0
               strongest = True              pos : neg    =     10.5 : 1.0
                 idiotic = True              neg : pos    =     10.4 : 1.0
              astounding = True              pos : neg    =      9.8 : 1.0
                    slip = True              pos : neg    =      9.8 : 1.0
                religion = True              pos : neg    =      9.6 : 1.0
                    3000 = True              neg : pos    =      9.5 : 1.0


In [None]:
# Prepare predictions and true labels for sklearn metrics
y_true = [label for (_, label) in test_set]
y_pred = [classifier.classify(features) for (features, _) in test_set]
# Evaluate using sklearn metrics
print("\nClassification Report:")
print(classification_report(y_true, y_pred))



Classification Report:
              precision    recall  f1-score   support

         neg       0.94      0.43      0.59       193
         pos       0.65      0.98      0.78       207

    accuracy                           0.71       400
   macro avg       0.80      0.70      0.68       400
weighted avg       0.79      0.71      0.69       400



In [12]:
# Confusion Matrix
print("Confusion Matrix:")
print(confusion_matrix(y_true, y_pred))

# VADER Sentiment Analysis on custom examples
sia = SentimentIntensityAnalyzer()
example_sentences = [
    "I absolutely loved this movie! The acting was fantastic.",
    "This was the worst film I have ever seen.",
    "The plot was predictable, but the cinematography was beautiful.",
    "I wouldn't recommend it. It was boring and too long."
]



Confusion Matrix:
[[ 83 110]
 [  5 202]]
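
The figures in the classification report can be recomputed by hand from this matrix. The sketch below assumes scikit-learn's layout, where rows are true labels and columns are predicted labels in sorted label order (`neg`, `pos`). The helper name `per_class_metrics` is hypothetical.

```python
# Confusion matrix copied from the output above:
# rows = true labels, columns = predicted labels, order = ("neg", "pos").
labels = ["neg", "pos"]
matrix = [[83, 110],
          [5, 202]]

def per_class_metrics(matrix, index):
    """Precision, recall and F1 for the class at position `index`."""
    true_positive = matrix[index][index]
    predicted_total = sum(row[index] for row in matrix)  # column sum
    actual_total = sum(matrix[index])                    # row sum
    precision = true_positive / predicted_total if predicted_total else 0.0
    recall = true_positive / actual_total if actual_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

results = {label: per_class_metrics(matrix, i) for i, label in enumerate(labels)}
total = sum(sum(row) for row in matrix)
accuracy_value = sum(matrix[i][i] for i in range(len(labels))) / total

for label, (p, r, f1) in results.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy_value:.4f}")
```

For `neg`, precision is 83 / (83 + 5) and recall is 83 / (83 + 110). The low recall reflects the 110 negative reviews that the classifier labelled as positive.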


In [13]:
print("\nVADER Sentiment Analysis:")
for sentence in example_sentences:
    score = sia.polarity_scores(sentence)
    sentiment = "positive" if score['compound'] > 0 else "negative"
    print(f"Sentence: {sentence}\nSentiment: {sentiment} (Score: {score['compound']})\n")


VADER Sentiment Analysis:
Sentence: I absolutely loved this movie! The acting was fantastic.
Sentiment: positive (Score: 0.8436)

Sentence: This was the worst film I have ever seen.
Sentiment: negative (Score: -0.6249)

Sentence: The plot was predictable, but the cinematography was beautiful.
Sentiment: positive (Score: 0.7469)

Sentence: I wouldn't recommend it. It was boring and too long.
Sentiment: negative (Score: -0.5283)
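
The cell above labels every compound score that is not strictly positive as negative, including a score of exactly 0.0. A common convention for VADER is to treat scores between -0.05 and 0.05 as neutral. The sketch below applies that rule to the scores printed above; the function name `label_from_compound` is hypothetical, and the threshold is a convention rather than something this notebook defines.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score to positive / neutral / negative."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

# Compound scores copied from the output above, plus 0.0 as an edge case.
for compound in [0.8436, -0.6249, 0.7469, -0.5283, 0.0]:
    print(compound, label_from_compound(compound))
```

The four example sentences keep the same labels under this rule; only scores close to zero are treated differently.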



# Exercise:

Create your own gold standard and compute precision, recall, and F1 both manually and with scikit-learn, then check that the results match.