Sentiment Analysis

1️⃣ Install NLP Libraries

In [3]:
%pip install transformers vaderSentiment pandas


Collecting vaderSentiment
  Downloading vaderSentiment-3.3.2-py2.py3-none-any.whl.metadata (572 bytes)
Downloading vaderSentiment-3.3.2-py2.py3-none-any.whl (125 kB)
Installing collected packages: vaderSentiment
Successfully installed vaderSentiment-3.3.2
Note: you may need to restart the kernel to use updated packages.


In [10]:
from transformers import pipeline
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
import pandas as pd

# Load cleaned CBE dataset
df = pd.read_csv(r"C:\10 Kifia Tasks\Week-2\Customer-Experience-Analytics-for-Fintech-Apps\notebooks\web_scrape\CBE_bank_reviews_clean.csv")

# Initialize sentiment analyzers
vader_analyzer = SentimentIntensityAnalyzer()
bert_analyzer = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")

# Classify a review with VADER by thresholding its compound score (range -1 to 1)
def get_vader_sentiment(text):
    score = vader_analyzer.polarity_scores(text)["compound"]
    if score > 0.05:
        return "positive"
    if score < -0.05:
        return "negative"
    return "neutral"

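# Classify a review with DistilBERT fine-tuned on SST-2; the label is "POSITIVE" or "NEGATIVE"
def get_bert_sentiment(text):
    # truncation=True keeps reviews longer than the model's input limit from raising an error
    return bert_analyzer(text, truncation=True)[0]["label"]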

# Apply both sentiment models
df["vader_sentiment"] = df["review"].apply(get_vader_sentiment)
df["bert_sentiment"] = df["review"].apply(get_bert_sentiment)

# Save results
df.to_csv("CBE_sentiment_analysis.csv", index=False)
print("✅ Sentiment analysis completed! Results saved in CBE_sentiment_analysis.csv.")


Device set to use cpu


✅ Sentiment analysis completed! Results saved in CBE_sentiment_analysis.csv.
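The VADER helper in the cell above turns a continuous compound score into one of three labels. A minimal standalone sketch of that thresholding step follows, with no dependency on vaderSentiment so it can be checked in isolation. The function name `label_from_compound` and the `threshold` parameter are illustrative and not part of the notebook; the sample scores are made up. It mirrors the strict comparisons used in the cell, so a score of exactly 0.05 counts as neutral.

```python
def label_from_compound(score: float, threshold: float = 0.05) -> str:
    """Map a compound score in [-1, 1] to a three-way sentiment label.

    Hypothetical helper: mirrors the strict > / < comparisons of the notebook cell.
    """
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Made-up scores, used only to show the three branches
for sample in (0.62, 0.05, -0.30):
    print(sample, label_from_compound(sample))
```

Raising `threshold` widens the neutral band, which may be worth trying if many short reviews end up labelled positive or negative on weak evidence.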


🔹 Aggregation

In [1]:
import pandas as pd

# Load sentiment analysis results
df = pd.read_csv("CBE_sentiment_analysis.csv")

# Convert sentiment labels to numerical values
# (VADER yields three classes; the SST-2 DistilBERT model only yields POSITIVE/NEGATIVE)
sentiment_mapping = {"positive": 1, "neutral": 0, "negative": -1}
df["vader_sentiment_score"] = df["vader_sentiment"].map(sentiment_mapping)
df["bert_sentiment_score"] = df["bert_sentiment"].map({"POSITIVE": 1, "NEGATIVE": -1})

# Aggregate sentiment by bank and rating
aggregated_df = df.groupby(["bank", "rating"]).agg(
    mean_vader_sentiment=("vader_sentiment_score", "mean"),
    mean_bert_sentiment=("bert_sentiment_score", "mean"),
    count_reviews=("review", "count")
).reset_index()

# Save aggregated results
aggregated_df.to_csv("aggregated_sentiment.csv", index=False)
print("✅ Sentiment aggregation completed! Saved in aggregated_sentiment.csv.")


✅ Sentiment aggregation completed! Saved in aggregated_sentiment.csv.
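To make the aggregated columns easier to read: with labels mapped to 1, 0, and -1, the mean for a (bank, rating) group is the number of positive reviews minus the number of negative ones, divided by the group size. The sketch below reproduces that calculation in plain Python on a handful of made-up rows. The function `mean_score_by_group` and the toy data are hypothetical and only illustrate the arithmetic; they are not the notebook's pandas code.

```python
from collections import defaultdict

# Same label-to-score convention as the aggregation cell (case-insensitive here)
SCORE = {"positive": 1, "neutral": 0, "negative": -1}


def mean_score_by_group(rows):
    """rows: iterable of (bank, rating, label).

    Returns {(bank, rating): (mean_score, review_count)}.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [score_sum, count]
    for bank, rating, label in rows:
        entry = totals[(bank, rating)]
        entry[0] += SCORE[label.lower()]
        entry[1] += 1
    return {key: (total / n, n) for key, (total, n) in totals.items()}


# Made-up rows for illustration only
toy_rows = [
    ("CBE", 5, "positive"),
    ("CBE", 5, "positive"),
    ("CBE", 5, "neutral"),
    ("CBE", 1, "NEGATIVE"),
    ("CBE", 1, "POSITIVE"),
]
result = mean_score_by_group(toy_rows)
print(result)
```

One difference from the pandas version: an unexpected label raises a `KeyError` in this sketch, whereas `Series.map` would produce `NaN` and the group mean would skip that row.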
