## Task 3: Employee Score Calculation

In [1]:
import pandas as pd

# Load dataset with sentiment labels
df = pd.read_csv("test_with_sentiment.csv")

# Convert date column to datetime
df['date'] = pd.to_datetime(df['date'], errors='coerce')

# Extract year-month for monthly grouping
df['year_month'] = df['date'].dt.to_period('M')

# Map sentiment labels to numerical scores
sentiment_score_map = {
    "Positive": 1,
    "Negative": -1,
    "Neutral": 0
}

df['sentiment_score'] = df['sentiment_label'].map(sentiment_score_map)

# Aggregate monthly sentiment score per employee
monthly_employee_scores = (
    df.groupby(['from', 'year_month'])['sentiment_score']
      .sum()
      .reset_index()
      .rename(columns={'sentiment_score': 'monthly_sentiment_score'})
)

# Save results
monthly_employee_scores.to_csv("employee_monthly_sentiment_scores.csv", index=False)

# Preview results
monthly_employee_scores.head()


index,from,year_month,monthly_sentiment_score
0,bobette.riner@ipgdirect.com,2010-01,1
1,bobette.riner@ipgdirect.com,2010-02,7
2,bobette.riner@ipgdirect.com,2010-03,6
3,bobette.riner@ipgdirect.com,2010-04,3
4,bobette.riner@ipgdirect.com,2010-05,2
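
One caveat about the cell above: `errors='coerce'` turns unparseable dates into `NaT`, and `.map()` turns labels missing from the dictionary into `NaN`. `groupby` drops rows with a missing key by default and `sum` skips `NaN`, so such rows would leave the totals without any warning. The sketch below counts them on a small made-up frame; the data and the names `raw`, `n_bad_dates`, and `n_unmapped` are hypothetical and not part of the original notebook.

```python
import pandas as pd

# Hypothetical toy data: one unparseable date and one label with the wrong case.
raw = pd.DataFrame({
    "from": ["a@example.com"] * 3,
    "date": ["2010-01-05", "not a date", "2010-01-20"],
    "sentiment_label": ["Positive", "Negative", "positive"],
})

raw["date"] = pd.to_datetime(raw["date"], errors="coerce")
raw["year_month"] = raw["date"].dt.to_period("M")
raw["sentiment_score"] = raw["sentiment_label"].map(
    {"Positive": 1, "Negative": -1, "Neutral": 0}
)

# Rows that would silently fall out of the monthly totals.
n_bad_dates = int(raw["date"].isna().sum())
n_unmapped = int(raw["sentiment_score"].isna().sum())
print(f"unparseable dates: {n_bad_dates}, unmapped labels: {n_unmapped}")

# Only the first row contributes: the NaT row is dropped, the NaN score is skipped.
grouped = raw.groupby(["from", "year_month"])["sentiment_score"].sum().reset_index()
print(grouped)
```

If either count is non-zero on the real data, it may be worth inspecting those rows before aggregating.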


In [2]:
monthly_employee_scores.groupby('from').head(5)

index,from,year_month,monthly_sentiment_score
0,bobette.riner@ipgdirect.com,2010-01,1
1,bobette.riner@ipgdirect.com,2010-02,7
2,bobette.riner@ipgdirect.com,2010-03,6
3,bobette.riner@ipgdirect.com,2010-04,3
4,bobette.riner@ipgdirect.com,2010-05,2
24,don.baughman@enron.com,2010-01,5
25,don.baughman@enron.com,2010-02,6
26,don.baughman@enron.com,2010-03,2
27,don.baughman@enron.com,2010-04,9
28,don.baughman@enron.com,2010-05,16


Each message is assigned a numerical sentiment score based on its label: +1 for Positive, −1 for Negative, and 0 for Neutral. Message timestamps are converted to a year-month period so that messages can be grouped by calendar month. Scores are then summed per employee and month; because each (employee, month) group is summed independently, the total starts from zero every month rather than carrying over. The resulting monthly sentiment score is the number of positive messages minus the number of negative messages the employee sent in that month.
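
To make the aggregation concrete, here is a minimal sketch on a toy frame that follows the same steps (label mapping, year-month period, grouped sum). The addresses, dates, and labels are invented for illustration; only the column names match the notebook.

```python
import pandas as pd

# Hypothetical toy data, not taken from test_with_sentiment.csv.
toy = pd.DataFrame({
    "from": ["a@example.com"] * 4 + ["b@example.com"] * 2,
    "date": pd.to_datetime([
        "2010-01-05", "2010-01-20", "2010-02-01", "2010-02-15",
        "2010-01-10", "2010-01-11",
    ]),
    "sentiment_label": [
        "Positive", "Positive", "Negative", "Neutral",
        "Negative", "Negative",
    ],
})

score_map = {"Positive": 1, "Negative": -1, "Neutral": 0}
toy["sentiment_score"] = toy["sentiment_label"].map(score_map)
toy["year_month"] = toy["date"].dt.to_period("M")

# Each (sender, month) group is summed on its own, so nothing carries over.
toy_monthly = (
    toy.groupby(["from", "year_month"])["sentiment_score"]
       .sum()
       .reset_index()
       .rename(columns={"sentiment_score": "monthly_sentiment_score"})
)
print(toy_monthly)
```

Here `a@example.com` scores 2 in January (two positives) and -1 in February (one negative, one neutral), while `b@example.com` scores -2 in January.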

In [3]:
df[
    (df['from'] == 'bobette.riner@ipgdirect.com') &
    (df['year_month'] == '2010-02')
][['sentiment_label', 'sentiment_score']]

index,sentiment_label,sentiment_score
156,Neutral,0
347,Positive,1
354,Positive,1
807,Neutral,0
828,Positive,1
1065,Positive,1
1111,Positive,1
1170,Neutral,0
1330,Negative,-1
1504,Positive,1


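The spot check above lists the individual messages behind one employee-month so that the aggregated value can be compared against its inputs. The ten rows shown contain six Positive, one Negative, and three Neutral messages, which sum to 5, whereas the aggregated table reports 7 for bobette.riner@ipgdirect.com in 2010-02. The displayed output may be truncated, so this difference should be reconciled before relying on the scores. Because every month is summed independently, scores do not carry over from one month to the next, which keeps monthly trends separate and supports employee ranking and risk analysis.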
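
A spot check like this can also be done programmatically rather than by eye. The sketch below assumes frames shaped like `df` and `monthly_employee_scores`; the helper name `monthly_score_matches` and the toy data are hypothetical and are not part of the original notebook.

```python
import pandas as pd

def monthly_score_matches(messages, monthly, sender, month):
    """Return True if the re-summed message scores equal the stored monthly score."""
    period = pd.Period(month, freq="M")
    recomputed = messages.loc[
        (messages["from"] == sender) & (messages["year_month"] == period),
        "sentiment_score",
    ].sum()
    stored = monthly.loc[
        (monthly["from"] == sender) & (monthly["year_month"] == period),
        "monthly_sentiment_score",
    ]
    # Require exactly one stored row so duplicates or gaps also fail the check.
    return bool(len(stored) == 1 and recomputed == stored.iloc[0])

# Hypothetical toy data for demonstration.
messages = pd.DataFrame({
    "from": ["a@example.com"] * 3,
    "year_month": pd.PeriodIndex(["2010-01", "2010-02", "2010-02"], freq="M"),
    "sentiment_score": [1, 1, -1],
})
monthly = (
    messages.groupby(["from", "year_month"])["sentiment_score"]
            .sum()
            .reset_index()
            .rename(columns={"sentiment_score": "monthly_sentiment_score"})
)

ok = monthly_score_matches(messages, monthly, "a@example.com", "2010-02")

# A deliberately altered copy shows the check failing.
tampered = monthly.copy()
tampered.loc[1, "monthly_sentiment_score"] = 5
bad = monthly_score_matches(messages, tampered, "a@example.com", "2010-02")
print(ok, bad)
```

Running the same helper over every (employee, month) pair in the real data would flag any total that does not match its underlying messages.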