# 📊 Lead Scoring CRM Dashboard

This notebook shows how a CRM analyst or account manager can use Python and pandas to prioritize high-value leads with custom scoring logic.

We will:
- Load CRM-style lead data
- Calculate a custom Priority Score
- Visualize top leads and key trends
- Summarize the business insights


## 📥 Step 1: Load the Dataset

In [None]:
import pandas as pd

df = pd.read_csv('/content/leads_dataset.csv')
df.head()
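
Before scoring, it can help to confirm that the file contains the columns the later steps rely on. The sketch below is a minimal check that assumes the column names used in this notebook. The `missing_columns` helper and the two-row `sample` frame are hypothetical and stand in for the real CSV.

```python
import pandas as pd

# Columns referenced by the scoring and plotting cells in this notebook
REQUIRED_COLUMNS = [
    'Lead_ID', 'Region', 'Lifecycle_Stage', 'Engagement_Score',
    'Deal_Size (€)', 'Last_Activity (days ago)',
]

def missing_columns(frame: pd.DataFrame, required=REQUIRED_COLUMNS) -> list:
    """Return the required column names absent from the frame (hypothetical helper)."""
    return [name for name in required if name not in frame.columns]

# Hypothetical sample that lacks the Region column
sample = pd.DataFrame({
    'Lead_ID': [1, 2],
    'Lifecycle_Stage': ['SQL', 'MQL'],
    'Engagement_Score': [80, 55],
    'Deal_Size (€)': [12000, 4000],
    'Last_Activity (days ago)': [3, 40],
})

print(missing_columns(sample))  # ['Region']
```

An empty list means every column needed by Steps 2 to 4 is present.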

## 🧮 Step 2: Define Scoring Logic
We will combine engagement, deal size, recent activity, and lifecycle stage into one `Priority_Score`.
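
As a worked example of the weighted sum, consider one hypothetical lead. The component values below are invented for illustration. The weights (0.4, 0.3, 0.2, 0.1) and the MQL lifecycle weight of 70 are the ones used in the scoring cell of this step, and each component is assumed to be on a 0 to 100 scale.

```python
# Hypothetical component scores for a single lead, each assumed to be on a 0-100 scale
components = {
    'Engagement_Score': 80.0,
    'Recency_Score': 90.0,
    'Deal_Size_Score': 50.0,
    'Lifecycle_Weight': 70.0,  # weight assigned to an MQL in this notebook
}

# Weights taken from the Priority_Score formula in this notebook
weights = {
    'Engagement_Score': 0.4,
    'Recency_Score': 0.3,
    'Deal_Size_Score': 0.2,
    'Lifecycle_Weight': 0.1,
}

# 80*0.4 + 90*0.3 + 50*0.2 + 70*0.1 = 32 + 27 + 10 + 7
priority_score = sum(components[name] * weights[name] for name in weights)
print(round(priority_score, 2))  # 76.0
```

Because the weights sum to 1, the score stays between 0 and 100 as long as every component does.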

In [None]:
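# Lifecycle weights (0-100): stages closer to a sale weigh more
lifecycle_weights = {'SQL': 100, 'MQL': 70, 'New': 50, 'Customer': 30, 'Lost': 10}
# Stages missing from the mapping become NaN, which propagates into Priority_Score
df['Lifecycle_Weight'] = df['Lifecycle_Stage'].map(lifecycle_weights)

# Min-max normalize deal size to 0-100 (assumes at least two distinct deal sizes)
deal_size = df['Deal_Size (€)']
df['Deal_Size_Score'] = (deal_size - deal_size.min()) / (deal_size.max() - deal_size.min()) * 100

# Invert recency so that recent activity scores higher (assumes the largest value is above 0)
days_ago = df['Last_Activity (days ago)']
df['Recency_Score'] = (1 - days_ago / days_ago.max()) * 100

# Weighted Priority Score (assumes Engagement_Score is already on a 0-100 scale)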
df['Priority_Score'] = (
    df['Engagement_Score'] * 0.4 +
    df['Recency_Score'] * 0.3 +
    df['Deal_Size_Score'] * 0.2 +
    df['Lifecycle_Weight'] * 0.1
)

df[['Lead_ID', 'Priority_Score']].head()
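
The min-max normalization in this step divides by the range of the column, which is zero when every lead has the same deal size. A minimal sketch of a guarded version follows. `min_max_scale` is a hypothetical helper, and returning 0 for a constant column is an assumption, not the notebook's behavior.

```python
import pandas as pd

def min_max_scale(values: pd.Series, upper: float = 100.0) -> pd.Series:
    """Rescale a numeric Series to the range 0..upper (hypothetical helper)."""
    span = values.max() - values.min()
    if span == 0:
        # Assumption: a constant column carries no ranking signal, so score it as 0
        return pd.Series(0.0, index=values.index)
    return (values - values.min()) / span * upper

# Hypothetical deal sizes in euros
deal_sizes = pd.Series([1000, 5500, 10000])
print(min_max_scale(deal_sizes).tolist())         # [0.0, 50.0, 100.0]
print(min_max_scale(pd.Series([7, 7])).tolist())  # [0.0, 0.0]
```

The same guard could apply to the recency calculation, which divides by the column maximum.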

## ⭐ Step 3: Top 10 Priority Leads

In [None]:
df_top = df.sort_values(by='Priority_Score', ascending=False).head(10)
df_top
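
An alternative to sorting the whole frame is `DataFrame.nlargest`, which returns the rows with the largest values in a column. The sketch below uses a hypothetical four-row frame with invented IDs and scores.

```python
import pandas as pd

# Hypothetical leads; only the column names come from this notebook
leads = pd.DataFrame({
    'Lead_ID': ['L1', 'L2', 'L3', 'L4'],
    'Priority_Score': [55.0, 91.5, 72.0, 88.0],
})

# Keep the two highest-scoring leads, ordered from highest to lowest
top_two = leads.nlargest(2, 'Priority_Score')
print(top_two['Lead_ID'].tolist())  # ['L2', 'L4']
```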

## 📊 Step 4: Visual Insights

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_theme(style='whitegrid')  # set_theme is the current name; sns.set is an alias for it

# Priority Score by Region
plt.figure(figsize=(10, 6))
region_scores = df.groupby('Region')['Priority_Score'].mean().sort_values(ascending=False)
sns.barplot(x=region_scores.values, y=region_scores.index, palette='Blues_d')
plt.title('Average Priority Score by Region')
plt.xlabel('Avg Score')
plt.ylabel('Region')
plt.tight_layout()
plt.show()
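
An average can be driven by a handful of leads, so it may be worth reading this chart alongside the number of leads per region. The sketch below uses a hypothetical frame. The scores and the Bretagne row are invented, and only the column names come from this notebook.

```python
import pandas as pd

# Hypothetical leads with invented scores
leads = pd.DataFrame({
    'Region': ['Île-de-France', 'Île-de-France', 'PACA', 'Bretagne'],
    'Priority_Score': [80.0, 60.0, 75.0, 40.0],
})

# Mean score and lead count per region, highest average first
region_summary = (
    leads.groupby('Region')['Priority_Score']
    .agg(avg_score='mean', lead_count='count')
    .sort_values('avg_score', ascending=False)
)
print(region_summary)
```

In this sample PACA ranks first on the strength of a single lead, which is the kind of case the count column is meant to expose.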

In [None]:
# Engagement vs Deal Size by Lifecycle
plt.figure(figsize=(10, 6))
sns.scatterplot(
    data=df,
    x='Deal_Size (€)',
    y='Engagement_Score',
    hue='Lifecycle_Stage',
    palette='Set2',
    edgecolor='k', alpha=0.8
)
plt.title('Engagement vs Deal Size by Lifecycle Stage')
plt.xlabel('Deal Size (€)')
plt.ylabel('Engagement Score')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')  # place the legend outside the plot area
plt.tight_layout()
plt.show()

In [None]:
# Priority Score by Lifecycle Stage
plt.figure(figsize=(8, 5))
lifecycle_avg = df.groupby('Lifecycle_Stage')['Priority_Score'].mean().sort_values(ascending=False)
sns.barplot(x=lifecycle_avg.index, y=lifecycle_avg.values, palette='pastel')
plt.title('Avg Priority Score by Lifecycle Stage')
plt.xlabel('Lifecycle Stage')
plt.ylabel('Avg Score')
plt.tight_layout()
plt.show()

## 🧠 Step 5: Business Insights
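- **Regions** such as Île-de-France and PACA may hold more engaged, high-value leads; check this against the regional averages in Step 4.
- **SQLs** have the highest average scores, as expected, but some individual MQLs also score highly and are worth a closer look.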
- Use this scoring to prioritize follow-up and decide who should be contacted next.
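
The point about high-potential MQLs can be made concrete by listing the MQLs whose score is at or above the median SQL score. This is only a sketch: the median-SQL threshold is an assumption chosen for illustration, and the data is hypothetical.

```python
import pandas as pd

# Hypothetical leads with invented scores
leads = pd.DataFrame({
    'Lead_ID': ['L1', 'L2', 'L3', 'L4', 'L5', 'L6'],
    'Lifecycle_Stage': ['SQL', 'SQL', 'SQL', 'MQL', 'MQL', 'MQL'],
    'Priority_Score': [85.0, 70.0, 78.0, 81.0, 52.0, 78.0],
})

# Assumed threshold: the median score among SQLs
sql_median = leads.loc[leads['Lifecycle_Stage'] == 'SQL', 'Priority_Score'].median()

# MQLs that score at least as high as the typical SQL
high_potential_mqls = leads[
    (leads['Lifecycle_Stage'] == 'MQL') & (leads['Priority_Score'] >= sql_median)
].sort_values('Priority_Score', ascending=False)

print(sql_median)                               # 78.0
print(high_potential_mqls['Lead_ID'].tolist())  # ['L4', 'L6']
```

A stricter or looser cutoff, such as a fixed score or a different quantile, may suit a given sales team better.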