# Advanced Sentiment Analysis Insights

This notebook takes a closer look at the sentiment model's performance, focusing on error drivers, temporal trends, and probability calibration.

**Theme:** Asiimov (Orange, Black, White)

In [1]:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
from collections import Counter
import re

# Define Asiimov Color Palette
asiimov_colors = {
    'orange': '#FF9900',
    'black': '#1a1a1a',
    'white': '#ffffff',
    'grey': '#5c5c5c',
    'light_grey': '#d1d1d1'
}

# Set default template
pio.templates["asiimov"] = go.layout.Template(
    layout=go.Layout(
        colorway=[asiimov_colors['orange'], asiimov_colors['black'], asiimov_colors['grey']],
        plot_bgcolor=asiimov_colors['white'],
        paper_bgcolor=asiimov_colors['white'],
        font={'color': asiimov_colors['black']},
        title={'font': {'color': asiimov_colors['black']}},
        xaxis={'gridcolor': '#e0e0e0', 'showgrid': True},
        yaxis={'gridcolor': '#e0e0e0', 'showgrid': True}
    )
)
pio.templates.default = "asiimov"

print("Libraries loaded and Asiimov theme defined.")

In [2]:
# Load and Preprocess Data
df = pd.read_csv('cs2_10k_predictions.csv')

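# voted_up is the reviewer's vote: 1 = Recommended, 0 = Not Recommended
df['actual_label'] = df['voted_up'].astype(int)
# Flag rows where the model's prediction matches the actual vote
df['is_correct'] = df['actual_label'] == df['predicted_label']
# timestamp_created is a Unix epoch in seconds
df['timestamp'] = pd.to_datetime(df['timestamp_created'], unit='s')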

print("Data loaded. Rows:", len(df))

## Insight 1: Keyword Error Analysis
What specific words are associated with the model's mistakes? We analyze the most frequent terms in False Positives and False Negatives to understand what confuses the model.

In [3]:
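# Custom stopword list: common function words, game/platform names, and generic
# sentiment words, removed so that topical terms surface in the counts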
STOPWORDS = set([
    'the', 'a', 'an', 'and', 'or', 'but', 'is', 'are', 'was', 'were', 'of', 'in', 'on', 'at', 'to', 'for', 
    'with', 'by', 'it', 'this', 'that', 'i', 'you', 'he', 'she', 'they', 'we', 'my', 'your', 'his', 'her', 
    'their', 'our', 'game', 'play', 'cs2', 'cs', 'counter', 'strike', 'valve', 'steam', 'be', 'have', 'has',
    'not', 'no', 'so', 'just', 'like', 'good', 'bad', 'very', 'much', 'get', 'got', 'do', 'does', 'did',
    'can', 'will', 'would', 'if', 'when', 'from', 'out', 'up', 'down', 'about', 'than', 'then', 'now', 'go',
    'global', 'offensive', 'fps', 'playing', 'played', 'time', 'really', 'even', 'still', 'one', 'all', 'me', 'im', 'its',
    'review', '10', '0', 'best', 'worst', 'trash', 'shit', 'fun', 'great', 'nice', 'love', 'recommend', 'dont', 'cant'
])

def get_top_words(texts, n=10):
    all_text = " ".join(texts).lower()
    words = re.findall(r'\b\w+\b', all_text)
    words = [w for w in words if w not in STOPWORDS and len(w) > 2]
    return Counter(words).most_common(n)

# False Positives: Actual Negative (0), Predicted Positive (1)
# dropna() keeps missing reviews from being counted as the literal word "nan"
fp_reviews = df[(df['actual_label'] == 0) & (df['predicted_label'] == 1)]['clean_review'].dropna().astype(str).tolist()
fp_data = pd.DataFrame(get_top_words(fp_reviews), columns=['Word', 'Count'])
fp_data['Error Type'] = 'False Positive (Predicted Pos, Actual Neg)'

# False Negatives: Actual Positive (1), Predicted Negative (0)
fn_reviews = df[(df['actual_label'] == 1) & (df['predicted_label'] == 0)]['clean_review'].dropna().astype(str).tolist()
fn_data = pd.DataFrame(get_top_words(fn_reviews), columns=['Word', 'Count'])
fn_data['Error Type'] = 'False Negative (Predicted Neg, Actual Pos)'

error_keywords = pd.concat([fp_data, fn_data])

fig = px.bar(
    error_keywords,
    x='Count',
    y='Word',
    color='Error Type',
    orientation='h',
    title='Top Keywords in Prediction Errors',
    color_discrete_map={
        'False Positive (Predicted Pos, Actual Neg)': asiimov_colors['orange'],
        'False Negative (Predicted Neg, Actual Pos)': asiimov_colors['black']
    },
    barmode='group'
)
fig.update_layout(yaxis={'categoryorder':'total ascending'})
fig.show()

**Observation:**
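Words like "cheaters", "hackers", and "fix" appear frequently in errors. This suggests the model struggles when the review text and the vote point in different directions: a player complains about a specific issue (such as cheaters) yet still votes 'Recommended', or praises the game yet votes 'Not Recommended' over a single grievance. Raw counts alone do not show whether these words are more common in errors than in correctly classified reviews, so the pattern should be read as a lead rather than a conclusion.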
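One way to check whether a keyword is genuinely tied to mistakes, and not simply common everywhere, is to compare the error rate of reviews containing it with the overall error rate. The sketch below is a minimal illustration on a toy DataFrame. The helper name `keyword_error_rates` and the toy rows are hypothetical, and the column names `clean_review` and `is_correct` are assumed to match the ones used earlier in this notebook.

```python
import re
import pandas as pd

def keyword_error_rates(df, keywords, text_col="clean_review", correct_col="is_correct"):
    """Compare the error rate of reviews containing each keyword with the overall error rate."""
    text = df[text_col].fillna("").astype(str).str.lower()
    baseline = 1.0 - df[correct_col].mean()  # overall share of wrong predictions
    rows = []
    for kw in keywords:
        # Whole-word match, so "fix" does not match "prefix"
        mask = text.str.contains(rf"\b{re.escape(kw)}\b", regex=True)
        n_reviews = int(mask.sum())
        error_rate = 1.0 - df.loc[mask, correct_col].mean() if n_reviews else float("nan")
        lift = error_rate / baseline if baseline > 0 else float("nan")
        rows.append({
            "Word": kw,
            "Reviews": n_reviews,
            "Error Rate": error_rate,
            "Lift vs Baseline": lift,
        })
    return pd.DataFrame(rows)

# Toy data standing in for the real predictions file (hypothetical rows)
toy = pd.DataFrame({
    "clean_review": [
        "too many cheaters but still fun",
        "cheaters everywhere",
        "great maps",
        "please fix the servers",
        "love it",
        "prefix test",
    ],
    "is_correct": [False, True, True, False, True, True],
})

result = keyword_error_rates(toy, ["cheaters", "fix"])
print(result)
```

A lift well above 1 indicates that reviews mentioning the word are misclassified more often than average. With small review counts the estimate is noisy, so the `Reviews` column is worth reading alongside the rate.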

## Insight 2: Temporal Sentiment Trend
How has the share of reviews the model predicts as positive changed over time? We look at the monthly average of positive predictions.

In [4]:
# Resample by calendar month to see trends
# Note: the 'ME' (month-end) alias requires pandas >= 2.2; older versions use 'M'
monthly_sentiment = df.set_index('timestamp').resample('ME')['predicted_label'].mean().reset_index()
monthly_sentiment.columns = ['Date', 'Positive Sentiment Rate']

fig = px.line(
    monthly_sentiment,
    x='Date',
    y='Positive Sentiment Rate',
    title='Trend of Positive Sentiment Over Time (Model Prediction)',
    color_discrete_sequence=[asiimov_colors['orange']]
)
fig.update_yaxes(range=[0, 1])
fig.show()

**Observation:**
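Each point is the fraction of that month's reviews the model labels positive. Significant dips might correlate with controversial updates or ban waves, although the chart alone cannot confirm a cause, and months with few reviews will produce noisy values.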
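Because the trend above is built from predictions, a dip could reflect either a real change in player sentiment or a change in how well the model performs. A rough way to separate the two is to put the predicted and actual monthly rates side by side, together with the number of reviews per month. The sketch below uses synthetic data. The column names `timestamp`, `actual_label`, and `predicted_label` are assumed to match this notebook, and the 85% agreement rate is an arbitrary illustration, not a measured result.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the notebook's df (assumption: same column names)
rng = np.random.default_rng(0)
n = 600
toy = pd.DataFrame({
    "timestamp": pd.to_datetime("2023-10-01")
                 + pd.to_timedelta(rng.integers(0, 180, n), unit="D"),
    "actual_label": rng.integers(0, 2, n),
})
# Simulated predictions that agree with the actual label about 85% of the time
flip = rng.random(n) < 0.15
toy["predicted_label"] = np.where(flip, 1 - toy["actual_label"], toy["actual_label"])

# Group by calendar month and compare predicted vs. actual positive rates
monthly = (
    toy.assign(month=toy["timestamp"].dt.to_period("M"))
       .groupby("month")
       .agg(
           predicted_rate=("predicted_label", "mean"),
           actual_rate=("actual_label", "mean"),
           reviews=("actual_label", "count"),
       )
)
# Positive gap: the model is more optimistic than the reviewers that month
monthly["gap"] = monthly["predicted_rate"] - monthly["actual_rate"]
print(monthly.round(3))
```

If the gap stays near zero while both rates fall, the dip is more likely a genuine sentiment shift. A gap that widens in particular months points toward the model rather than the players.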

## Insight 3: Probability Calibration
Is the model's confidence reliable? When it predicts a 90% probability of being positive, is it actually positive 90% of the time?

In [5]:
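# Bin probabilities into 10 equal-width bins over the full [0, 1] range
# (an integer bins=10 would span only the observed min/max of predicted_prob)
df['prob_bin'] = pd.cut(df['predicted_prob'], bins=np.linspace(0, 1, 11), include_lowest=True)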

# Calculate actual positive rate for each bin
calibration = df.groupby('prob_bin', observed=False).agg(
    avg_pred_prob=('predicted_prob', 'mean'),
    actual_pos_rate=('actual_label', 'mean'),
    count=('actual_label', 'count')
).reset_index()

# Filter out empty bins
calibration = calibration[calibration['count'] > 0]

fig = px.line(
    calibration,
    x='avg_pred_prob',
    y='actual_pos_rate',
    markers=True,
    title='Model Probability Calibration (Reliability Diagram)',
    labels={'avg_pred_prob': 'Mean Predicted Probability', 'actual_pos_rate': 'Actual Positive Fraction'},
    color_discrete_sequence=[asiimov_colors['black']]
)

# Add a perfect calibration line
fig.add_shape(
    type="line", line=dict(dash="dash", color=asiimov_colors['grey']),
    x0=0, x1=1, y0=0, y1=1
)

fig.show()

**Observation:**
A perfectly calibrated model would follow the diagonal dashed line. Deviations indicate over-confidence or under-confidence. If the curve is below the diagonal, the model is over-estimating the probability of the positive class; if it is above the diagonal, the model is under-estimating it.
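The reliability diagram can be summarized in a single number. Two common choices are the expected calibration error (ECE), which averages the gap between predicted probability and observed positive rate across bins, weighted by bin size, and the Brier score, which is the mean squared error of the probabilities. The sketch below is a minimal NumPy version on hand-made toy values. The function names are illustrative, and it assumes `y_prob` is the predicted probability of the positive class, as `predicted_prob` appears to be in this notebook.

```python
import numpy as np

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Size-weighted average of |mean predicted prob - observed positive rate| per bin."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Using only the interior edges maps every probability to a bin index 0..n_bins-1
    bin_idx = np.digitize(y_prob, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():
            gap = abs(y_prob[mask].mean() - y_true[mask].mean())
            ece += mask.mean() * gap  # mask.mean() is the share of samples in this bin
    return float(ece)

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))

# Toy values: four confident-positive predictions (three correct), two confident-negative
y_prob = [0.85, 0.85, 0.85, 0.85, 0.15, 0.15]
y_true = [1, 1, 1, 0, 0, 0]

ece = expected_calibration_error(y_true, y_prob)
brier = brier_score(y_true, y_prob)
print(f"ECE: {ece:.4f}, Brier score: {brier:.4f}")
```

Both metrics are 0 for perfect calibration on the ECE side and perfect predictions on the Brier side, and lower is better. ECE depends on the number of bins, so it should be compared only between runs that use the same binning.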