### Exercise 1: Sentiment Analysis using VADER

In [1]:
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

In [2]:
# Initialize the analyzer
vader_analyzer = SentimentIntensityAnalyzer()

### 1. What do the sentiment scores (positive, neutral, negative, and compound) represent?

VADER provides four sentiment scores for a sentence, returned by `polarity_scores` under the keys `pos`, `neu`, `neg`, and `compound`:

pos (positive): The proportion of the text that carries positive sentiment.

neu (neutral): The proportion of the text that is neutral.

neg (negative): The proportion of the text that carries negative sentiment.

The three proportions sum to approximately 1.

compound: A normalized score that summarizes the overall sentiment, ranging from -1 (most negative) to +1 (most positive). It is computed by summing the valence scores of the words found in the lexicon, adjusting them according to VADER's rules (punctuation, capitalization, intensifiers, negation, etc.), and then normalizing the sum into the range [-1, 1].
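The normalization step can be sketched as follows. This is a minimal illustration, assuming the form used in the vaderSentiment reference implementation, where a valence sum `x` is mapped to `x / sqrt(x^2 + alpha)` with `alpha = 15`. The function name `normalize_valence` and the sample sums are hypothetical and chosen only for the example.

```python
import math


def normalize_valence(total_valence: float, alpha: float = 15.0) -> float:
    """Squash an unbounded valence sum into the range [-1, 1]."""
    score = total_valence / math.sqrt(total_valence * total_valence + alpha)
    # Clamp to guard against floating-point overshoot at the extremes.
    return max(-1.0, min(1.0, score))


# Hypothetical valence sums, not taken from any real sentence.
for total in (-6.0, -1.0, 0.0, 1.0, 6.0):
    print(f"sum={total:+.1f} -> compound={normalize_valence(total):+.4f}")
```

Because of the square root in the denominator, the compound score moves quickly for small sums and flattens out as the sum grows, so very strong sentences approach but do not exceed +1 or -1.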

### 2. How can you classify a sentence as positive, negative, or neutral based on the compound score?

Use the following thresholds (recommended by VADER's authors):

Positive: compound >= 0.05

Neutral: -0.05 < compound < 0.05

Negative: compound <= -0.05
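These thresholds can be wrapped in a small helper so the same rule is not repeated in every loop. This is a sketch only: the name `classify_compound` and the `threshold` parameter are assumptions for illustration and are not part of the VADER library.

```python
def classify_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER compound score to a sentiment label."""
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"


# Illustrative compound values, including both boundary cases.
for value in (0.70, 0.05, 0.0, -0.05, -0.42):
    print(f"{value:+.2f} -> {classify_compound(value)}")
```

Both boundaries are inclusive, so a compound of exactly 0.05 counts as positive and exactly -0.05 counts as negative.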

In [3]:
# Sample sentences for VADER
vader_sentences = [
    "I am so happy with the service.",
    "This movie was a waste of time.",
    "It was an okay experience.",
    "Best purchase I've made in years!",
    "I don't like this app, it's too slow."
]

In [4]:
print("--- VADER Sentiment Analysis ---")
for sentence in vader_sentences:
    scores = vader_analyzer.polarity_scores(sentence)
    compound = scores['compound']
    if compound >= 0.05:
        sentiment = 'Positive'
    elif compound <= -0.05:
        sentiment = 'Negative'
    else:
        sentiment = 'Neutral'
    print(f"Sentence: {sentence}\nScores: {scores}\nPredicted Sentiment: {sentiment}\n")

--- VADER Sentiment Analysis ---
Sentence: I am so happy with the service.
Scores: {'neg': 0.0, 'neu': 0.559, 'pos': 0.441, 'compound': 0.6948}
Predicted Sentiment: Positive

Sentence: This movie was a waste of time.
Scores: {'neg': 0.318, 'neu': 0.682, 'pos': 0.0, 'compound': -0.4215}
Predicted Sentiment: Negative

Sentence: It was an okay experience.
Scores: {'neg': 0.0, 'neu': 0.678, 'pos': 0.322, 'compound': 0.2263}
Predicted Sentiment: Positive

Sentence: Best purchase I've made in years!
Scores: {'neg': 0.0, 'neu': 0.527, 'pos': 0.473, 'compound': 0.6696}
Predicted Sentiment: Positive

Sentence: I don't like this app, it's too slow.
Scores: {'neg': 0.232, 'neu': 0.768, 'pos': 0.0, 'compound': -0.2755}
Predicted Sentiment: Negative



### Exercise 2: Sentiment Analysis using Huggingface Transformers

### 1. What are the labels provided by the Huggingface model for sentiment analysis?
The default model loaded by the Huggingface sentiment analysis pipeline (distilbert-base-uncased-finetuned-sst-2-english) provides two labels and has no neutral class:

POSITIVE

NEGATIVE

### 2. How do the confidence scores relate to the model's prediction?
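The confidence score is the softmax probability that the model assigns to the label it predicts. It lies between 0 and 1, and with only two labels the score reported for the winning label is always at least 0.5.

A higher score (e.g., 0.98) means the model strongly favors the predicted label.

A lower score (e.g., 0.55) means the two labels are almost equally likely. A high score is not a guarantee of correctness: it measures how strongly the model prefers one label, not whether that label is right.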
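The relationship between raw model outputs and the reported score can be sketched with a plain softmax. This is an illustration under assumptions: the logits are made up, the helper names `softmax` and `top_label` are hypothetical, and the label order (NEGATIVE first, POSITIVE second) is assumed rather than read from the model's configuration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, labels=("NEGATIVE", "POSITIVE")):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": labels[best], "score": probs[best]}


# Hypothetical logits: one clear-cut case and one borderline case.
print(top_label([-4.0, 4.5]))
print(top_label([0.1, 0.3]))
```

A wide gap between the two logits gives a score close to 1, while nearly equal logits give a score just above 0.5.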

In [5]:
from transformers import pipeline



In [6]:
# Load the pipeline
hf_pipeline = pipeline("sentiment-analysis")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Device set to use cpu


In [7]:
# Sample sentences for Huggingface
hf_sentences = [
    "I love this new phone.",
    "I had a terrible experience with customer support.",
    "The movie was not bad, but not great either.",
    "Absolutely loved the restaurant!",
    "The product arrived damaged, very disappointed."
]

In [8]:
print("--- Huggingface Sentiment Analysis ---")
for sentence in hf_sentences:
    result = hf_pipeline(sentence)[0]
    print(f"Sentence: {sentence}\nLabel: {result['label']}, Score: {result['score']:.4f}\n")


--- Huggingface Sentiment Analysis ---
Sentence: I love this new phone.
Label: POSITIVE, Score: 0.9998

Sentence: I had a terrible experience with customer support.
Label: NEGATIVE, Score: 0.9995

Sentence: The movie was not bad, but not great either.
Label: NEGATIVE, Score: 0.9963

Sentence: Absolutely loved the restaurant!
Label: POSITIVE, Score: 0.9999

Sentence: The product arrived damaged, very disappointed.
Label: NEGATIVE, Score: 0.9998



### Exercise 3: Compare VADER and Huggingface

In [9]:

# Reuse the Huggingface sample sentences so both models score the same text
combined_sentences = hf_sentences

print("--- Comparison of VADER and Huggingface ---")
for sentence in combined_sentences:
    vader_score = vader_analyzer.polarity_scores(sentence)
    hf_result = hf_pipeline(sentence)[0]

    # Determine VADER sentiment
    compound = vader_score['compound']
    if compound >= 0.05:
        vader_sentiment = 'Positive'
    elif compound <= -0.05:
        vader_sentiment = 'Negative'
    else:
        vader_sentiment = 'Neutral'

    print(f"Sentence: {sentence}")
    print(f"VADER: Sentiment={vader_sentiment}, Compound={compound}")
    print(f"Huggingface: Sentiment={hf_result['label']}, Score={hf_result['score']:.4f}\n")


--- Comparison of VADER and Huggingface ---
Sentence: I love this new phone.
VADER: Sentiment=Positive, Compound=0.6369
Huggingface: Sentiment=POSITIVE, Score=0.9998

Sentence: I had a terrible experience with customer support.
VADER: Sentiment=Negative, Compound=-0.1027
Huggingface: Sentiment=NEGATIVE, Score=0.9995

Sentence: The movie was not bad, but not great either.
VADER: Sentiment=Negative, Compound=-0.5448
Huggingface: Sentiment=NEGATIVE, Score=0.9963

Sentence: Absolutely loved the restaurant!
VADER: Sentiment=Positive, Compound=0.6689
Huggingface: Sentiment=POSITIVE, Score=0.9999

Sentence: The product arrived damaged, very disappointed.
VADER: Sentiment=Negative, Compound=-0.7425
Huggingface: Sentiment=NEGATIVE, Score=0.9998



### 1. How do the results of VADER and Huggingface compare in terms of sentiment classification?
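VADER is rule-based and uses a lexicon of words with pre-assigned sentiment values. It works well for simple and clear-cut sentiment in texts like reviews, tweets, etc.

Huggingface uses a fine-tuned transformer, which in general captures context and semantics better. On these five sentences, however, the two methods agree on every label. Both classify “The movie was not bad, but not great either.” as negative, although the sentence is arguably mixed: VADER's compound is -0.5448, and the Huggingface model has no neutral label to fall back on. The main visible difference is the strength of the signal. VADER scores “I had a terrible experience with customer support.” at only -0.1027, close to its neutral band, while Huggingface reports a confidence of 0.9995.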

### 2. Which method provides a more accurate prediction for complex sentences (e.g., sentences with sarcasm)?
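Transformer models such as the Huggingface pipeline generally give better predictions for complex sentences, sarcasm, or double negatives because they are context-aware.

VADER struggles with sarcasm because it doesn't “understand” context; it combines word-level scores using a fixed set of heuristics. None of the sample sentences above is sarcastic, so this answer rests on how the two methods work rather than on the output shown here.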

### 3. Which method is faster? Why might that be the case?
VADER is faster as it’s lightweight and purely rule-based, with no model loading or GPU acceleration required.

Huggingface is slower because:

It loads a large pre-trained transformer model.

It runs each sentence through deep neural networks.

It benefits from hardware acceleration (GPU/TPU), although it does not require it: the run above used the CPU.
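The speed difference can be measured rather than assumed. The sketch below is a minimal timing harness; `time_classifier` and `dummy_classifier` are hypothetical names, and the stand-in classifier exists only so the block runs on its own.

```python
import time


def time_classifier(classify, sentences, repeats=3):
    """Return the best wall-clock time in seconds over several passes."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for sentence in sentences:
            classify(sentence)
        best = min(best, time.perf_counter() - start)
    return best


# Stand-in classifier (hypothetical); replace it with a real model call.
def dummy_classifier(sentence):
    return "positive" if "love" in sentence.lower() else "negative"


sample = ["I love this new phone.", "The product arrived damaged."]
elapsed = time_classifier(dummy_classifier, sample)
print(f"{elapsed * 1000:.3f} ms for {len(sample)} sentences")
```

In this notebook, passing `vader_analyzer.polarity_scores` and then `hf_pipeline` as `classify` would time both methods on the same sentences. The measurement covers inference only; the one-time cost of downloading and loading the transformer model is not included.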

### Exercise 4: Evaluating Sentiment Analysis Performance

In [10]:
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
import pandas as pd

In [15]:
# Step 1: Create a Test Dataset
test_data = pd.DataFrame({
    'Sentence': [
        "I really enjoyed the movie.",
        "The service was terrible.",
        "It was just fine, nothing special.",
        "I absolutely hated the product.",
        "This is the best experience ever."
    ],
    'True Sentiment': ['positive', 'negative', 'neutral', 'negative', 'positive']
})


In [16]:
# Step 2: Perform Sentiment Analysis
# Re-run this cell before the loop below; otherwise predictions are appended twice
vader_preds = []
hf_preds = []


In [17]:
for sentence in test_data['Sentence']:
    # VADER prediction
    compound = vader_analyzer.polarity_scores(sentence)['compound']
    if compound >= 0.05:
        vader_preds.append('positive')
    elif compound <= -0.05:
        vader_preds.append('negative')
    else:
        vader_preds.append('neutral')

    # Huggingface prediction
    label = hf_pipeline(sentence)[0]['label'].lower()
    if label == 'positive':
        hf_preds.append('positive')
    else:
        hf_preds.append('negative')  # the default model returns only POSITIVE or NEGATIVE, so it never predicts 'neutral'

In [None]:
# Step 3: Calculate Evaluation Metrics
vader_metrics = precision_recall_fscore_support(test_data['True Sentiment'], vader_preds, average='weighted', zero_division=0)
hf_metrics = precision_recall_fscore_support(test_data['True Sentiment'], hf_preds, average='weighted', zero_division=0)

print("--- Evaluation Metrics ---")
print(f"VADER - Accuracy: {accuracy_score(test_data['True Sentiment'], vader_preds):.2f}, Precision: {vader_metrics[0]:.2f}, Recall: {vader_metrics[1]:.2f}, F1: {vader_metrics[2]:.2f}")
print(f"Huggingface - Accuracy: {accuracy_score(test_data['True Sentiment'], hf_preds):.2f}, Precision: {hf_metrics[0]:.2f}, Recall: {hf_metrics[1]:.2f}, F1: {hf_metrics[2]:.2f}")


--- Evaluation Metrics ---
VADER - Accuracy: 0.80, Precision: 0.67, Recall: 0.80, F1: 0.72
Huggingface - Accuracy: 0.80, Precision: 0.67, Recall: 0.80, F1: 0.72


### 1. How do the models perform in terms of accuracy, precision, recall, and F1-score?
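On this five-sentence test set the two models score identically: accuracy 0.80, weighted precision 0.67, weighted recall 0.80, and weighted F1 0.72. An accuracy of 0.80 on five sentences means each model misclassified exactly one. The Huggingface model has no neutral label, so it necessarily misses “It was just fine, nothing special.”, and VADER's identical weighted precision is consistent with it missing the same sentence.

In general, VADER performs well on simple, short texts where sentiment is explicit, while transformer models tend to achieve higher accuracy, precision, recall, and F1-score on complex or subtle text. A test set this small cannot show that difference.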

### 2. Which model performs better in predicting positive sentiment? Negative sentiment?
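Positive sentiment: Both models perform well. The metrics above are consistent with both labeling the two positive test sentences correctly.

Negative sentiment: The same holds for the two negative test sentences. On harder inputs a transformer model usually has the advantage: VADER may misclassify subtle or indirect negativity (e.g., “not worth the price”) when the wording falls outside its lexicon and rules.

The weighted averages printed above do not break the results down by class, so a per-class report is needed to answer this question precisely.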
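Per-class figures make the positive versus negative comparison explicit. The sketch below computes them in plain Python. The true labels are the ones from the test dataset above, while `y_pred` is a hypothetical prediction list used only for illustration, and `per_class_metrics` is a made-up helper name.

```python
def per_class_metrics(y_true, y_pred, labels):
    """Compute precision, recall, and F1 separately for each label."""
    pairs = list(zip(y_true, y_pred))
    metrics = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Report 0.0 when a ratio is undefined (no predictions or no samples).
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


y_true = ['positive', 'negative', 'neutral', 'negative', 'positive']
# Hypothetical predictions: the neutral sentence is labeled positive.
y_pred = ['positive', 'negative', 'positive', 'negative', 'positive']

results = per_class_metrics(y_true, y_pred, ['positive', 'negative', 'neutral'])
for label, scores in results.items():
    print(label, {name: round(value, 2) for name, value in scores.items()})
```

With scikit-learn, calling `precision_recall_fscore_support` with `average=None` and an explicit `labels` list should give the same per-class breakdown as arrays.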

### 3. What might cause discrepancies between the two models' predictions?
Several factors can lead to differences:

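Contextual Understanding: Huggingface models the whole sentence, including grammar and word order; VADER handles context only through fixed heuristics for negation, intensifiers, punctuation, and capitalization.

E.g., VADER's negation rule catches a simple case such as “I’m not happy”, but it can miss a negation that sits far from the sentiment word or is phrased indirectly.

Label Set: With the thresholds above, VADER can output neutral; the default Huggingface model outputs only POSITIVE or NEGATIVE, so the two must disagree whenever VADER predicts neutral.

Model Type: VADER uses a fixed wordlist and rules; Huggingface uses a model pre-trained on a large text corpus and fine-tuned on labeled sentiment data.

Sentence Length and Complexity: VADER performs better on short, clear statements. Huggingface handles long and nuanced sentences better.

Sarcasm and Slang: Huggingface can detect sarcasm/slang better (to an extent), whereas VADER might misinterpret them.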
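A practical way to investigate such discrepancies is to list the sentences on which the two models disagree and inspect them by hand. The sketch below assumes two parallel lists of predicted labels; the helper name `find_disagreements` and the prediction values are hypothetical.

```python
def find_disagreements(sentences, preds_a, preds_b):
    """Return (sentence, pred_a, pred_b) for every row where the labels differ."""
    return [
        (sentence, a, b)
        for sentence, a, b in zip(sentences, preds_a, preds_b)
        # Compare case-insensitively so 'negative' matches 'NEGATIVE'.
        if a.lower() != b.lower()
    ]


sentences = ["It was just fine, nothing special.", "The service was terrible."]
# Hypothetical predictions for illustration only.
model_a = ["neutral", "negative"]
model_b = ["NEGATIVE", "NEGATIVE"]

disagreements = find_disagreements(sentences, model_a, model_b)
for sentence, a, b in disagreements:
    print(f"{sentence!r}: model A={a}, model B={b}")
```

In this notebook the same helper could be called with `test_data['Sentence']`, `vader_preds`, and `hf_preds`.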
