# 🧠 Leaders vs Laggards Analysis: Quartile-Based Fund Comparison

This notebook compares funds in the top ESG-score quartile (leaders, at or above the 75th percentile) with funds in the bottom quartile (laggards, at or below the 25th percentile) to quantify the gap in average annual return between the two sustainability extremes. This kind of benchmark-style contrast is commonly used in institutional ESG score validation, decarbonization strategies, and portfolio screening.

---
**TECHNIQUES USED:**
- Quartile Segmentation (P75 and P25 ESG thresholds)
- Group Labeling (Leader vs Laggard)
- Return Mean Comparison via Bar Chart
- Significance Check (Welch's t-test on Leader vs Laggard returns)
---

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import ttest_ind  # used for the t-test in Step 6

In [None]:
# 📥 Step 1: Load fund-level ESG and return data
df = pd.read_csv('../data/fund_esg_scores_and_returns.csv')

In [None]:
# 🧮 Step 2: Calculate ESG quartiles
p75 = df['ESG_Score'].quantile(0.75)
p25 = df['ESG_Score'].quantile(0.25)
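As a quick check on what these thresholds mean, the sketch below computes P25 and P75 on a made-up score series (the values are hypothetical, not taken from the fund dataset). By default, pandas `Series.quantile` interpolates linearly between neighboring observations, so a threshold does not have to coincide with any actual score.

```python
import pandas as pd

# Hypothetical ESG scores, for illustration only.
scores = pd.Series([10, 20, 30, 40, 50, 60, 70, 80], name="ESG_Score")

# Default interpolation is linear, so thresholds can fall between scores.
p25_demo = scores.quantile(0.25)
p75_demo = scores.quantile(0.75)
print(p25_demo, p75_demo)

# interpolation="lower" / "higher" snap a threshold to an observed score instead.
p25_obs = scores.quantile(0.25, interpolation="lower")
p75_obs = scores.quantile(0.75, interpolation="higher")
print(p25_obs, p75_obs)
```

Whether interpolated or observed thresholds are preferable depends on how the score is reported; the notebook itself uses the default.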

In [None]:
# 🏷️ Step 3: Tag top and bottom quartiles
def assign_group(score):
    if score >= p75:
        return 'Leader'
    elif score <= p25:
        return 'Laggard'
    else:
        return 'Middle'

df['Group'] = df['ESG_Score'].apply(assign_group)
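Because the thresholds are inclusive (`>=` and `<=`), tied scores at a cut-off can push a group above 25% of funds. A minimal sketch of an equivalent vectorized labeling with `numpy.select`, followed by a group-share check, assuming a made-up DataFrame `demo` that only mimics the `ESG_Score` column:

```python
import numpy as np
import pandas as pd

# Hypothetical scores with ties at the low end, for illustration only.
demo = pd.DataFrame({"ESG_Score": [20, 20, 20, 40, 50, 60, 70, 80]})

lo = demo["ESG_Score"].quantile(0.25)
hi = demo["ESG_Score"].quantile(0.75)

# Conditions are checked in order; rows matching neither fall to the default.
demo["Group"] = np.select(
    [demo["ESG_Score"] >= hi, demo["ESG_Score"] <= lo],
    ["Leader", "Laggard"],
    default="Middle",
)

# Share of funds per group; ties at a threshold can make this differ from 25%.
shares = demo["Group"].value_counts(normalize=True)
print(shares)
```

Running the same `value_counts(normalize=True)` on the real `df['Group']` column is a cheap way to confirm that the two extreme groups are roughly the size the analysis assumes.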

In [None]:
# 📊 Step 4: Compare Mean Returns
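# groupby sorts labels alphabetically (Laggard first); reindex so Leader comes first and gets the green bar in Step 5
group_means = df[df['Group'].isin(['Leader', 'Laggard'])].groupby('Group')['Annual_Return_%'].mean().reindex(['Leader', 'Laggard'])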
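The return gap itself is the difference between the two group means. A minimal sketch on made-up numbers (the `demo` frame and its values are hypothetical and only mimic the `Group` and `Annual_Return_%` columns):

```python
import pandas as pd

# Hypothetical labeled funds, for illustration only.
demo = pd.DataFrame({
    "Group": ["Leader", "Leader", "Laggard", "Laggard", "Middle"],
    "Annual_Return_%": [8.0, 9.0, 4.0, 5.0, 6.0],
})

# Keep only the two extremes, then average returns within each group.
extremes = demo[demo["Group"].isin(["Leader", "Laggard"])]
demo_means = extremes.groupby("Group")["Annual_Return_%"].mean()

# Leader-minus-Laggard gap, expressed in percentage points.
gap_pp = demo_means["Leader"] - demo_means["Laggard"]
print(f"Return gap: {gap_pp:.2f} percentage points")
```

Selecting by label (`demo_means["Leader"]`) rather than by position keeps the sign of the gap correct regardless of how the groups are ordered.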

In [None]:
# 📈 Step 5: Visualize Return Gap
plt.figure(figsize=(7, 5))
group_means.plot(kind='bar', color=['forestgreen', 'orangered'])
plt.title('Leaders vs Laggards: Avg Annual Return')
plt.ylabel('Annual Return (%)')
plt.xticks(rotation=0)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.tight_layout()
plt.show()

In [None]:
# 📋 Step 6: Summary Statistics and t-test
leaders = df[df['Group'] == 'Leader']['Annual_Return_%']
laggards = df[df['Group'] == 'Laggard']['Annual_Return_%']

# Summary table
summary = pd.DataFrame({
    'Count': [leaders.count(), laggards.count()],
    'Mean': [leaders.mean(), laggards.mean()],
    'Std Dev': [leaders.std(), laggards.std()]
}, index=['Leader', 'Laggard'])
display(summary)

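# Welch's t-test: equal_var=False does not assume equal variances;
# nan_policy='omit' skips missing returns, matching how count/mean/std above treat NaN
t_stat, p_val = ttest_ind(leaders, laggards, equal_var=False, nan_policy='omit')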
print(f"t-statistic: {t_stat:.3f}, p-value: {p_val:.4f}")
if p_val < 0.05:
    print("✅ Statistically significant difference in returns between Leaders and Laggards.")
else:
    print("⚠️ No statistically significant difference detected.")
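A p-value alone says nothing about how large the gap is. The sketch below adds two common companions to the test: Cohen's d as an effect size and an approximate 95% confidence interval for the mean difference, using the Welch standard error and Welch-Satterthwaite degrees of freedom. The arrays `lead` and `lagg` are hypothetical samples, not values from the fund dataset.

```python
import numpy as np
from scipy import stats

# Hypothetical return samples (percent), for illustration only.
lead = np.array([8.0, 9.0, 7.5, 8.5, 9.5])
lagg = np.array([4.0, 5.0, 6.0, 4.5, 5.5])

diff = lead.mean() - lagg.mean()

# Cohen's d with a pooled standard deviation (sample variances, ddof=1).
n1, n2 = len(lead), len(lagg)
v1, v2 = lead.var(ddof=1), lagg.var(ddof=1)
pooled_sd = np.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
cohens_d = diff / pooled_sd

# Welch standard error and Welch-Satterthwaite degrees of freedom.
se = np.sqrt(v1 / n1 + v2 / n2)
dof = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)

# Two-sided 95% confidence interval for the difference in means.
t_crit = stats.t.ppf(0.975, dof)
ci_low, ci_high = diff - t_crit * se, diff + t_crit * se

print(f"Mean difference: {diff:.2f} pp, Cohen's d: {cohens_d:.2f}")
print(f"95% CI for the difference: [{ci_low:.2f}, {ci_high:.2f}]")
```

To apply this to the notebook's data, `lead` and `lagg` could be replaced with `leaders.dropna().to_numpy()` and `laggards.dropna().to_numpy()`.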