# Procurement KPI Analytics - Supplier Performance Deep Dive

**Objective**: Conduct comprehensive supplier performance analysis to optimize supplier relationships and drive strategic sourcing decisions.

**Key Analysis Areas**:
- Individual supplier performance scorecards
- Supplier risk assessment and classification
- Supplier relationship analysis and portfolio optimization
- Performance benchmarking and competitive analysis
- Supplier development and improvement opportunities
- Strategic supplier identification and partnership potential
- Supplier lifecycle and relationship management insights

**Input**: Feature-engineered procurement dataset with supplier performance metrics
**Output**: Detailed supplier analysis, scorecards, recommendations, and strategic insights

---

## 1. Setup and Data Loading

In [1]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.figure_factory as ff
import warnings
from datetime import datetime, timedelta
from typing import Dict, List, Tuple, Any
import math

# Configure display and warnings
pd.set_option('display.max_columns', None)
pd.set_option('display.float_format', '{:.2f}'.format)
warnings.filterwarnings('ignore')

# Set plotting themes
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

# Custom color palette for supplier analysis
SUPPLIER_COLORS = {
    'excellent': '#2E8B57',  # Sea Green
    'good': '#4682B4',       # Steel Blue
    'average': '#DAA520',    # Goldenrod
    'poor': '#DC143C',       # Crimson
    'primary': '#1f77b4',    # Default blue
    'secondary': '#ff7f0e'   # Default orange
}

print("Supplier Performance Analysis environment initialized")
print(f"Analysis timestamp: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

Supplier Performance Analysis environment initialized
Analysis timestamp: 2025-07-08 21:54:15


In [2]:
# Load feature-engineered dataset
try:
    df = pd.read_csv('../data/processed/procurement_features_engineered.csv')
    print("Feature-engineered dataset loaded successfully")
    print(f"Dataset shape: {df.shape[0]:,} rows x {df.shape[1]} columns")
except FileNotFoundError:
    print("Error: Feature-engineered dataset not found.")
    print("Expected file: '../data/processed/procurement_features_engineered.csv'")
    print("Please run the feature engineering notebook first.")

# Convert date columns
date_columns = ['Order_Date', 'Delivery_Date']
for col in date_columns:
    if col in df.columns:
        df[col] = pd.to_datetime(df[col], errors='coerce')

# Supplier overview
print(f"\nSupplier Portfolio Overview:")
print(f"Total active suppliers: {df['Supplier'].nunique():,}")
print(f"Total orders analyzed: {len(df):,}")
print(f"Data period: {df['Order_Date'].min().strftime('%Y-%m-%d')} to {df['Order_Date'].max().strftime('%Y-%m-%d')}")
print(f"Total procurement value: ${df['total_negotiated_value'].sum():,.2f}")
print(f"Average orders per supplier: {len(df) / df['Supplier'].nunique():.1f}")

Feature-engineered dataset loaded successfully
Dataset shape: 777 rows x 63 columns

Supplier Portfolio Overview:
Total active suppliers: 5
Total orders analyzed: 777
Data period: 2022-01-01 to 2024-01-01
Total procurement value: $45,373,696.39
Average orders per supplier: 155.4


## 2. Supplier Portfolio Analysis

In [3]:
# Create comprehensive supplier portfolio analysis
def create_supplier_portfolio(df: pd.DataFrame) -> pd.DataFrame:
    """
    Create comprehensive supplier portfolio analysis.
    """
    print("Creating Supplier Portfolio Analysis:")
    print("=" * 50)
    
    # Aggregate supplier-level metrics
    supplier_portfolio = df.groupby('Supplier').agg({
        'PO_ID': 'count',
        'total_negotiated_value': ['sum', 'mean', 'std'],
        'cost_savings': ['sum', 'mean'],
        'savings_percentage': ['mean', 'std'],
        'lead_time_days': ['mean', 'std', 'min', 'max'],
        'defect_rate': ['mean', 'std'],
        'Quantity': 'sum',
        'Defective_Units': 'sum',
        'Order_Date': ['min', 'max'],
        'Item_Category': 'nunique'
    }).round(2)
    
    # Flatten column names
    supplier_portfolio.columns = ['_'.join(col).strip() if col[1] else col[0] for col in supplier_portfolio.columns]
    
    # Rename columns for clarity
    column_mapping = {
        'PO_ID_count': 'total_orders',
        'total_negotiated_value_sum': 'total_spend',
        'total_negotiated_value_mean': 'avg_order_value',
        'total_negotiated_value_std': 'order_value_variability',
        'cost_savings_sum': 'total_savings',
        'cost_savings_mean': 'avg_savings_per_order',
        'savings_percentage_mean': 'avg_savings_rate',
        'savings_percentage_std': 'savings_consistency',
        'lead_time_days_mean': 'avg_lead_time',
        'lead_time_days_std': 'lead_time_consistency',
        'lead_time_days_min': 'best_lead_time',
        'lead_time_days_max': 'worst_lead_time',
        'defect_rate_mean': 'avg_defect_rate',
        'defect_rate_std': 'quality_consistency',
        'Quantity_sum': 'total_units',
        'Defective_Units_sum': 'total_defective_units',
        'Order_Date_min': 'first_order_date',
        'Order_Date_max': 'last_order_date',
        'Item_Category_nunique': 'categories_served'
    }
    
    supplier_portfolio = supplier_portfolio.rename(columns=column_mapping)
    
    # Calculate additional metrics
    total_portfolio_spend = supplier_portfolio['total_spend'].sum()
    supplier_portfolio['spend_share'] = (supplier_portfolio['total_spend'] / total_portfolio_spend * 100).round(2)
    
    # Relationship duration in days
    supplier_portfolio['relationship_duration_days'] = (
        supplier_portfolio['last_order_date'] - supplier_portfolio['first_order_date']
    ).dt.days
    
    # Order frequency (orders per month)
    supplier_portfolio['order_frequency_monthly'] = (
        supplier_portfolio['total_orders'] / 
        (supplier_portfolio['relationship_duration_days'] / 30 + 1)
    ).round(2)
    
    # Quality metrics
    supplier_portfolio['overall_defect_rate'] = (
        supplier_portfolio['total_defective_units'] / supplier_portfolio['total_units'] * 100
    ).round(2)
    
    # Performance consistency score (lower is better - less variability)
    supplier_portfolio['performance_consistency'] = (
        (supplier_portfolio['lead_time_consistency'].fillna(0) / supplier_portfolio['avg_lead_time'].fillna(1)) +
        (supplier_portfolio['quality_consistency'].fillna(0) / (supplier_portfolio['avg_defect_rate'].fillna(0.1) + 0.1))
    ).round(2)
    
    # Sort by total spend
    supplier_portfolio = supplier_portfolio.sort_values('total_spend', ascending=False)
    
    print(f"Portfolio analysis created for {len(supplier_portfolio)} suppliers")
    
    return supplier_portfolio

# Create supplier portfolio
supplier_portfolio = create_supplier_portfolio(df)

# Display top suppliers
print("\nTOP 10 SUPPLIERS BY SPEND:")
print("=" * 50)
top_suppliers = supplier_portfolio.head(10)[[
    'total_spend', 'spend_share', 'total_orders', 'avg_savings_rate', 
    'avg_lead_time', 'avg_defect_rate', 'categories_served'
]]
display(top_suppliers)

# Portfolio summary statistics
print("\nSUPPLIER PORTFOLIO SUMMARY:")
print("=" * 50)
print(f"Total suppliers: {len(supplier_portfolio)}")
print(f"Top 5 supplier concentration: {supplier_portfolio.head(5)['spend_share'].sum():.1f}%")
print(f"Top 10 supplier concentration: {supplier_portfolio.head(10)['spend_share'].sum():.1f}%")
print(f"Suppliers with >5% spend share: {(supplier_portfolio['spend_share'] > 5).sum()}")
print(f"Average relationship duration: {supplier_portfolio['relationship_duration_days'].mean():.0f} days")
print(f"Average savings rate across portfolio: {supplier_portfolio['avg_savings_rate'].mean():.2f}%")
print(f"Average lead time across portfolio: {supplier_portfolio['avg_lead_time'].mean():.1f} days")

Creating Supplier Portfolio Analysis:
Portfolio analysis created for 5 suppliers

TOP 10 SUPPLIERS BY SPEND:


Unnamed: 0_level_0,total_spend,spend_share,total_orders,avg_savings_rate,avg_lead_time,avg_defect_rate,categories_served
Supplier,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
Beta_Supplies,9858665.9,21.73,156,7.83,11.27,8.27,5
Epsilon_Group,9851156.06,21.71,166,8.04,10.87,2.61,5
Delta_Logistics,9236240.47,20.36,171,7.81,10.85,10.87,5
Gamma_Co,8587921.71,18.93,143,7.98,10.19,4.5,5
Alpha_Inc,7839712.25,17.28,141,8.21,10.61,1.89,5



SUPPLIER PORTFOLIO SUMMARY:
Total suppliers: 5
Top 5 supplier concentration: 100.0%
Top 10 supplier concentration: 100.0%
Suppliers with >5% spend share: 5
Average relationship duration: 722 days
Average savings rate across portfolio: 7.97%
Average lead time across portfolio: 10.8 days


In [4]:
# Create supplier portfolio visualization
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=(
        'Spend Concentration (Top 20 Suppliers)',
        'Supplier Performance Distribution',
        'Relationship Duration vs Order Frequency',
        'Savings Rate vs Lead Time Performance'
    ),
    specs=[[{"type": "bar"}, {"type": "histogram"}],
           [{"type": "scatter"}, {"type": "scatter"}]]
)

# 1. Top 20 suppliers by spend
top_20 = supplier_portfolio.head(20)
fig.add_trace(
    go.Bar(
        x=top_20.index,
        y=top_20['spend_share'],
        name='Spend Share %',
        marker_color=SUPPLIER_COLORS['primary']
    ),
    row=1, col=1
)

# 2. Average savings rate distribution
fig.add_trace(
    go.Histogram(
        x=supplier_portfolio['avg_savings_rate'],
        nbinsx=25,
        name='Savings Rate Distribution',
        marker_color=SUPPLIER_COLORS['good']
    ),
    row=1, col=2
)

# 3. Relationship duration vs order frequency
fig.add_trace(
    go.Scatter(
        x=supplier_portfolio['relationship_duration_days'],
        y=supplier_portfolio['order_frequency_monthly'],
        mode='markers',
        name='Duration vs Frequency',
        marker=dict(
            size=supplier_portfolio['total_spend'] / supplier_portfolio['total_spend'].max() * 30 + 5,
            color=supplier_portfolio['avg_savings_rate'],
            colorscale='Viridis',
            showscale=True
        )
    ),
    row=2, col=1
)

# 4. Savings rate vs lead time
fig.add_trace(
    go.Scatter(
        x=supplier_portfolio['avg_lead_time'],
        y=supplier_portfolio['avg_savings_rate'],
        mode='markers',
        name='Lead Time vs Savings',
        marker=dict(
            size=supplier_portfolio['total_spend'] / supplier_portfolio['total_spend'].max() * 30 + 5,
            color=supplier_portfolio['avg_defect_rate'],
            colorscale='Reds',
            showscale=True
        )
    ),
    row=2, col=2
)

# Update layout
fig.update_xaxes(title_text="Supplier", row=1, col=1, tickangle=45)
fig.update_xaxes(title_text="Savings Rate (%)", row=1, col=2)
fig.update_xaxes(title_text="Relationship Duration (Days)", row=2, col=1)
fig.update_xaxes(title_text="Average Lead Time (Days)", row=2, col=2)

fig.update_yaxes(title_text="Spend Share (%)", row=1, col=1)
fig.update_yaxes(title_text="Frequency", row=1, col=2)
fig.update_yaxes(title_text="Order Frequency (Monthly)", row=2, col=1)
fig.update_yaxes(title_text="Average Savings Rate (%)", row=2, col=2)

fig.update_layout(
    height=800,
    title_text="Supplier Portfolio Dashboard",
    showlegend=False
)

fig.show()

## 3. Supplier Performance Scorecards

In [5]:
# Create comprehensive supplier scorecards
def create_supplier_scorecards(supplier_portfolio: pd.DataFrame) -> pd.DataFrame:
    """
    Create detailed supplier performance scorecards with weighted scores.
    """
    print("Creating Supplier Performance Scorecards:")
    print("=" * 50)
    
    scorecards = supplier_portfolio.copy()
    
    # Define scoring weights (must sum to 100)
    weights = {
        'cost_performance': 30,    # Cost savings and efficiency
        'delivery_performance': 25, # Lead time and reliability
        'quality_performance': 25,  # Defect rate and consistency
        'relationship_value': 20   # Strategic value and partnership
    }
    
    print(f"Scoring methodology (weights): {weights}")
    
    # 1. Cost Performance Score (0-100)
    # Higher savings rate is better
    savings_percentile = scorecards['avg_savings_rate'].rank(pct=True, ascending=True)
    scorecards['cost_performance_score'] = (savings_percentile * 100).round(1)
    
    # 2. Delivery Performance Score (0-100)
    # Lower lead time is better, higher consistency (lower std) is better
    lead_time_percentile = scorecards['avg_lead_time'].rank(pct=True, ascending=False)
    consistency_percentile = scorecards['lead_time_consistency'].fillna(0).rank(pct=True, ascending=False)
    scorecards['delivery_performance_score'] = ((lead_time_percentile * 0.7 + consistency_percentile * 0.3) * 100).round(1)
    
    # 3. Quality Performance Score (0-100)
    # Lower defect rate is better
    quality_percentile = scorecards['avg_defect_rate'].rank(pct=True, ascending=False)
    scorecards['quality_performance_score'] = (quality_percentile * 100).round(1)
    
    # 4. Relationship Value Score (0-100)
    # Based on spend volume, relationship duration, and category coverage
    spend_percentile = scorecards['total_spend'].rank(pct=True, ascending=True)
    duration_percentile = scorecards['relationship_duration_days'].rank(pct=True, ascending=True)
    category_percentile = scorecards['categories_served'].rank(pct=True, ascending=True)
    
    scorecards['relationship_value_score'] = (
        (spend_percentile * 0.5 + duration_percentile * 0.3 + category_percentile * 0.2) * 100
    ).round(1)
    
    # 5. Calculate Overall Supplier Score
    scorecards['overall_supplier_score'] = (
        scorecards['cost_performance_score'] * weights['cost_performance'] / 100 +
        scorecards['delivery_performance_score'] * weights['delivery_performance'] / 100 +
        scorecards['quality_performance_score'] * weights['quality_performance'] / 100 +
        scorecards['relationship_value_score'] * weights['relationship_value'] / 100
    ).round(1)
    
    # 6. Create Performance Tier Classification
    scorecards['performance_tier'] = pd.cut(
        scorecards['overall_supplier_score'],
        bins=[0, 40, 60, 80, 100],
        labels=['Poor', 'Average', 'Good', 'Excellent']
    )
    
    # 7. Create Strategic Classification based on spend and performance
    def classify_strategic_value(row):
        if row['spend_share'] >= 5 and row['overall_supplier_score'] >= 80:
            return 'Strategic Partner'
        elif row['spend_share'] >= 5 and row['overall_supplier_score'] >= 60:
            return 'Key Supplier'
        elif row['spend_share'] >= 5:
            return 'High Risk Supplier'
        elif row['overall_supplier_score'] >= 80:
            return 'Development Opportunity'
        elif row['overall_supplier_score'] >= 60:
            return 'Standard Supplier'
        else:
            return 'Performance Concern'
    
    scorecards['strategic_classification'] = scorecards.apply(classify_strategic_value, axis=1)
    
    # Sort by overall score
    scorecards = scorecards.sort_values('overall_supplier_score', ascending=False)
    
    print(f"Scorecards created for {len(scorecards)} suppliers")
    
    return scorecards

# Create supplier scorecards
supplier_scorecards = create_supplier_scorecards(supplier_portfolio)

# Display top performing suppliers
print("\nTOP 10 PERFORMING SUPPLIERS:")
print("=" * 50)
top_performers = supplier_scorecards.head(10)[[
    'overall_supplier_score', 'performance_tier', 'strategic_classification',
    'cost_performance_score', 'delivery_performance_score', 
    'quality_performance_score', 'relationship_value_score',
    'total_spend', 'spend_share'
]]
display(top_performers)

# Performance tier distribution
print("\nPERFORMANCE TIER DISTRIBUTION:")
print("=" * 50)
tier_distribution = supplier_scorecards['performance_tier'].value_counts()
tier_spend_distribution = supplier_scorecards.groupby('performance_tier')['spend_share'].sum()

for tier in ['Excellent', 'Good', 'Average', 'Poor']:
    if tier in tier_distribution.index:
        count = tier_distribution[tier]
        spend_pct = tier_spend_distribution.get(tier, 0)
        print(f"{tier}: {count} suppliers ({count/len(supplier_scorecards)*100:.1f}%) | {spend_pct:.1f}% of spend")

# Strategic classification distribution
print("\nSTRATEGIC CLASSIFICATION DISTRIBUTION:")
print("=" * 50)
strategic_distribution = supplier_scorecards['strategic_classification'].value_counts()
for classification, count in strategic_distribution.items():
    print(f"{classification}: {count} suppliers ({count/len(supplier_scorecards)*100:.1f}%)")

Creating Supplier Performance Scorecards:
Scoring methodology (weights): {'cost_performance': 30, 'delivery_performance': 25, 'quality_performance': 25, 'relationship_value': 20}
Scorecards created for 5 suppliers

TOP 10 PERFORMING SUPPLIERS:


Unnamed: 0_level_0,overall_supplier_score,performance_tier,strategic_classification,cost_performance_score,delivery_performance_score,quality_performance_score,relationship_value_score,total_spend,spend_share
Supplier,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
Alpha_Inc,83.0,Excellent,Strategic Partner,100.0,80.0,100.0,40.0,7839712.25,17.28
Gamma_Co,69.2,Good,Key Supplier,60.0,100.0,60.0,56.0,8587921.71,18.93
Epsilon_Group,66.8,Good,Key Supplier,80.0,40.0,80.0,64.0,9851156.06,21.71
Beta_Supplies,43.6,Average,High Risk Supplier,40.0,32.0,40.0,68.0,9858665.9,21.73
Delta_Logistics,37.4,Poor,High Risk Supplier,20.0,48.0,20.0,72.0,9236240.47,20.36



PERFORMANCE TIER DISTRIBUTION:
Excellent: 1 suppliers (20.0%) | 17.3% of spend
Good: 2 suppliers (40.0%) | 40.6% of spend
Average: 1 suppliers (20.0%) | 21.7% of spend
Poor: 1 suppliers (20.0%) | 20.4% of spend

STRATEGIC CLASSIFICATION DISTRIBUTION:
Key Supplier: 2 suppliers (40.0%)
High Risk Supplier: 2 suppliers (40.0%)
Strategic Partner: 1 suppliers (20.0%)


In [6]:
# Create supplier scorecard visualization
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=(
        'Overall Performance Score Distribution',
        'Performance Tier by Spend Share',
        'Strategic Classification',
        'Performance Dimensions Radar (Top 5)'
    ),
    specs=[[{"type": "histogram"}, {"type": "scatter"}],
           [{"type": "pie"}, {"type": "scatterpolar"}]]
)

# 1. Overall performance score distribution
fig.add_trace(
    go.Histogram(
        x=supplier_scorecards['overall_supplier_score'],
        nbinsx=20,
        name='Performance Score',
        marker_color=SUPPLIER_COLORS['primary']
    ),
    row=1, col=1
)

# 2. Performance tier by spend share
tier_colors = {
    'Excellent': SUPPLIER_COLORS['excellent'],
    'Good': SUPPLIER_COLORS['good'],
    'Average': SUPPLIER_COLORS['average'],
    'Poor': SUPPLIER_COLORS['poor']
}

for tier in ['Excellent', 'Good', 'Average', 'Poor']:
    tier_data = supplier_scorecards[supplier_scorecards['performance_tier'] == tier]
    if len(tier_data) > 0:
        fig.add_trace(
            go.Scatter(
                x=tier_data['overall_supplier_score'],
                y=tier_data['spend_share'],
                mode='markers',
                name=tier,
                marker=dict(
                    color=tier_colors.get(tier, SUPPLIER_COLORS['primary']),
                    size=8
                )
            ),
            row=1, col=2
        )

# 3. Strategic classification pie chart
strategic_counts = supplier_scorecards['strategic_classification'].value_counts()
fig.add_trace(
    go.Pie(
        labels=strategic_counts.index,
        values=strategic_counts.values,
        name="Strategic Classification"
    ),
    row=2, col=1
)

# 4. Radar chart for top 5 suppliers
top_5_suppliers = supplier_scorecards.head(5)
performance_dimensions = ['cost_performance_score', 'delivery_performance_score', 
                         'quality_performance_score', 'relationship_value_score']
dimension_labels = ['Cost Performance', 'Delivery Performance', 
                   'Quality Performance', 'Relationship Value']

for i, (supplier, row) in enumerate(top_5_suppliers.iterrows()):
    fig.add_trace(
        go.Scatterpolar(
            r=[row[dim] for dim in performance_dimensions] + [row[performance_dimensions[0]]],
            theta=dimension_labels + [dimension_labels[0]],
            fill='toself',
            name=supplier[:15] + '...' if len(supplier) > 15 else supplier,
            opacity=0.7
        ),
        row=2, col=2
    )

# Update layout
fig.update_layout(
    height=900,
    title_text="Supplier Performance Scorecard Dashboard",
    polar=dict(
        radialaxis=dict(
            visible=True,
            range=[0, 100]
        )
    )
)

fig.show()

## 4. Supplier Risk Assessment

In [7]:
# Comprehensive supplier risk assessment
def assess_supplier_risks(supplier_scorecards: pd.DataFrame, df: pd.DataFrame) -> pd.DataFrame:
    """
    Conduct comprehensive supplier risk assessment.
    """
    print("Conducting Supplier Risk Assessment:")
    print("=" * 50)
    
    risk_assessment = supplier_scorecards.copy()
    
    # 1. Concentration Risk
    risk_assessment['concentration_risk'] = 'Low'
    risk_assessment.loc[risk_assessment['spend_share'] >= 20, 'concentration_risk'] = 'Critical'
    risk_assessment.loc[risk_assessment['spend_share'] >= 10, 'concentration_risk'] = 'High'
    risk_assessment.loc[risk_assessment['spend_share'] >= 5, 'concentration_risk'] = 'Medium'
    
    # 2. Performance Risk
    risk_assessment['performance_risk'] = 'Low'
    risk_assessment.loc[risk_assessment['overall_supplier_score'] <= 40, 'performance_risk'] = 'Critical'
    risk_assessment.loc[risk_assessment['overall_supplier_score'] <= 60, 'performance_risk'] = 'High'
    risk_assessment.loc[risk_assessment['overall_supplier_score'] <= 80, 'performance_risk'] = 'Medium'
    
    # 3. Quality Risk
    risk_assessment['quality_risk'] = 'Low'
    risk_assessment.loc[risk_assessment['avg_defect_rate'] >= 10, 'quality_risk'] = 'Critical'
    risk_assessment.loc[risk_assessment['avg_defect_rate'] >= 5, 'quality_risk'] = 'High'
    risk_assessment.loc[risk_assessment['avg_defect_rate'] >= 2, 'quality_risk'] = 'Medium'
    
    # 4. Delivery Risk
    avg_lead_time = risk_assessment['avg_lead_time'].median()
    risk_assessment['delivery_risk'] = 'Low'
    risk_assessment.loc[risk_assessment['avg_lead_time'] >= avg_lead_time * 2, 'delivery_risk'] = 'Critical'
    risk_assessment.loc[risk_assessment['avg_lead_time'] >= avg_lead_time * 1.5, 'delivery_risk'] = 'High'
    risk_assessment.loc[risk_assessment['avg_lead_time'] >= avg_lead_time * 1.2, 'delivery_risk'] = 'Medium'
    
    # 5. Volatility Risk (based on performance consistency)
    high_volatility_threshold = risk_assessment['performance_consistency'].quantile(0.8)
    risk_assessment['volatility_risk'] = 'Low'
    risk_assessment.loc[risk_assessment['performance_consistency'] >= high_volatility_threshold, 'volatility_risk'] = 'High'
    risk_assessment.loc[risk_assessment['performance_consistency'] >= risk_assessment['performance_consistency'].median(), 'volatility_risk'] = 'Medium'
    
    # 6. Relationship Risk (new or short-term relationships)
    short_relationship_threshold = 90  # days
    risk_assessment['relationship_risk'] = 'Low'
    risk_assessment.loc[risk_assessment['relationship_duration_days'] <= short_relationship_threshold, 'relationship_risk'] = 'High'
    risk_assessment.loc[risk_assessment['relationship_duration_days'] <= 180, 'relationship_risk'] = 'Medium'
    
    # 7. Calculate Composite Risk Score
    risk_weights = {
        'concentration_risk': 25,
        'performance_risk': 20,
        'quality_risk': 20,
        'delivery_risk': 15,
        'volatility_risk': 10,
        'relationship_risk': 10
    }
    
    risk_mapping = {
        'Low': 1,
        'Medium': 2,
        'High': 3,
        'Critical': 4
    }
    
    # Calculate weighted risk score
    composite_risk = 0
    for risk_type, weight in risk_weights.items():
        risk_scores = risk_assessment[risk_type].map(risk_mapping)
        composite_risk += risk_scores * (weight / 100)
    
    risk_assessment['composite_risk_score'] = composite_risk.round(2)
    
    # 8. Overall Risk Level
    risk_assessment['overall_risk_level'] = pd.cut(
        risk_assessment['composite_risk_score'],
        bins=[0, 1.5, 2.5, 3.5, 4],
        labels=['Low Risk', 'Medium Risk', 'High Risk', 'Critical Risk']
    )
    
    # 9. Risk Priority Score (combines risk and business impact)
    risk_assessment['risk_priority_score'] = (
        risk_assessment['composite_risk_score'] * risk_assessment['spend_share'] / 100
    ).round(2)
    
    print(f"Risk assessment completed for {len(risk_assessment)} suppliers")
    
    return risk_assessment

# Conduct risk assessment
supplier_risk_assessment = assess_supplier_risks(supplier_scorecards, df)

# Display high-risk suppliers
print("\nHIGH-RISK SUPPLIERS (Critical and High Risk):")
print("=" * 50)
high_risk_suppliers = supplier_risk_assessment[
    supplier_risk_assessment['overall_risk_level'].isin(['Critical Risk', 'High Risk'])
].sort_values('risk_priority_score', ascending=False)

if len(high_risk_suppliers) > 0:
    high_risk_display = high_risk_suppliers[[
        'overall_risk_level', 'composite_risk_score', 'risk_priority_score',
        'spend_share', 'concentration_risk', 'performance_risk', 
        'quality_risk', 'delivery_risk'
    ]].head(10)
    display(high_risk_display)
else:
    print("No high-risk suppliers identified!")

# Risk distribution summary
print("\nRISK LEVEL DISTRIBUTION:")
print("=" * 50)
risk_distribution = supplier_risk_assessment['overall_risk_level'].value_counts()
risk_spend_distribution = supplier_risk_assessment.groupby('overall_risk_level')['spend_share'].sum()

for risk_level in ['Critical Risk', 'High Risk', 'Medium Risk', 'Low Risk']:
    if risk_level in risk_distribution.index:
        count = risk_distribution[risk_level]
        spend_pct = risk_spend_distribution.get(risk_level, 0)
        print(f"{risk_level}: {count} suppliers ({count/len(supplier_risk_assessment)*100:.1f}%) | {spend_pct:.1f}% of spend")

# Top risk priorities
print("\nTOP 10 RISK PRIORITIES (by risk-weighted spend):")
print("=" * 50)
top_risk_priorities = supplier_risk_assessment.nlargest(10, 'risk_priority_score')[[
    'risk_priority_score', 'overall_risk_level', 'spend_share', 
    'overall_supplier_score', 'strategic_classification'
]]
display(top_risk_priorities)

Conducting Supplier Risk Assessment:
Risk assessment completed for 5 suppliers

HIGH-RISK SUPPLIERS (Critical and High Risk):
No high-risk suppliers identified!

RISK LEVEL DISTRIBUTION:
Critical Risk: 0 suppliers (0.0%) | 0.0% of spend
High Risk: 0 suppliers (0.0%) | 0.0% of spend
Medium Risk: 4 suppliers (80.0%) | 82.7% of spend
Low Risk: 1 suppliers (20.0%) | 17.3% of spend

TOP 10 RISK PRIORITIES (by risk-weighted spend):


Unnamed: 0_level_0,risk_priority_score,overall_risk_level,spend_share,overall_supplier_score,strategic_classification
Supplier,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
Epsilon_Group,0.38,Medium Risk,21.71,66.8,Key Supplier
Beta_Supplies,0.36,Medium Risk,21.73,43.6,High Risk Supplier
Delta_Logistics,0.36,Medium Risk,20.36,37.4,High Risk Supplier
Gamma_Co,0.31,Medium Risk,18.93,69.2,Key Supplier
Alpha_Inc,0.23,Low Risk,17.28,83.0,Strategic Partner


## 5. Supplier Development Opportunities

In [8]:
# Identify supplier development and optimization opportunities
def identify_development_opportunities(supplier_risk_assessment: pd.DataFrame) -> Dict[str, pd.DataFrame]:
    """
    Identify supplier development and optimization opportunities.
    """
    print("Identifying Supplier Development Opportunities:")
    print("=" * 50)
    
    opportunities = {}
    
    # 1. High-Potential Suppliers (good performance, low spend share)
    high_potential = supplier_risk_assessment[
        (supplier_risk_assessment['overall_supplier_score'] >= 75) &
        (supplier_risk_assessment['spend_share'] < 2) &
        (supplier_risk_assessment['total_orders'] >= 3)
    ].sort_values('overall_supplier_score', ascending=False)
    
    opportunities['high_potential'] = high_potential[[
        'overall_supplier_score', 'spend_share', 'total_orders',
        'avg_savings_rate', 'avg_lead_time', 'avg_defect_rate'
    ]].head(10)
    
    # 2. Underperforming Strategic Suppliers (high spend, poor performance)
    underperforming_strategic = supplier_risk_assessment[
        (supplier_risk_assessment['spend_share'] >= 5) &
        (supplier_risk_assessment['overall_supplier_score'] < 70)
    ].sort_values('spend_share', ascending=False)
    
    opportunities['underperforming_strategic'] = underperforming_strategic[[
        'spend_share', 'overall_supplier_score', 'performance_tier',
        'cost_performance_score', 'delivery_performance_score', 'quality_performance_score'
    ]].head(10)
    
    # 3. Cost Optimization Opportunities (below average savings)
    avg_savings_rate = supplier_risk_assessment['avg_savings_rate'].mean()
    cost_optimization = supplier_risk_assessment[
        (supplier_risk_assessment['avg_savings_rate'] < avg_savings_rate) &
        (supplier_risk_assessment['spend_share'] >= 1)
    ].sort_values('total_spend', ascending=False)
    
    # Calculate potential savings
    cost_optimization['potential_additional_savings'] = (
        cost_optimization['total_spend'] * 
        (avg_savings_rate - cost_optimization['avg_savings_rate']) / 100
    ).round(0)
    
    opportunities['cost_optimization'] = cost_optimization[[
        'total_spend', 'avg_savings_rate', 'potential_additional_savings',
        'spend_share', 'overall_supplier_score'
    ]].head(10)
    
    # 4. Quality Improvement Opportunities
    quality_improvement = supplier_risk_assessment[
        (supplier_risk_assessment['avg_defect_rate'] > 2) &
        (supplier_risk_assessment['spend_share'] >= 1)
    ].sort_values('avg_defect_rate', ascending=False)
    
    opportunities['quality_improvement'] = quality_improvement[[
        'avg_defect_rate', 'total_defective_units', 'spend_share',
        'quality_performance_score', 'overall_supplier_score'
    ]].head(10)
    
    # 5. Delivery Performance Improvement
    median_lead_time = supplier_risk_assessment['avg_lead_time'].median()
    delivery_improvement = supplier_risk_assessment[
        (supplier_risk_assessment['avg_lead_time'] > median_lead_time * 1.5) &
        (supplier_risk_assessment['spend_share'] >= 1)
    ].sort_values('avg_lead_time', ascending=False)
    
    opportunities['delivery_improvement'] = delivery_improvement[[
        'avg_lead_time', 'worst_lead_time', 'lead_time_consistency',
        'delivery_performance_score', 'spend_share'
    ]].head(10)
    
    # 6. New Supplier Evaluation (short relationships with good performance)
    new_suppliers = supplier_risk_assessment[
        (supplier_risk_assessment['relationship_duration_days'] <= 180) &
        (supplier_risk_assessment['overall_supplier_score'] >= 70) &
        (supplier_risk_assessment['total_orders'] >= 2)
    ].sort_values('overall_supplier_score', ascending=False)
    
    opportunities['new_supplier_evaluation'] = new_suppliers[[
        'relationship_duration_days', 'overall_supplier_score', 'total_orders',
        'avg_savings_rate', 'avg_lead_time', 'avg_defect_rate'
    ]].head(10)
    
    return opportunities

# Identify development opportunities
development_opportunities = identify_development_opportunities(supplier_risk_assessment)

# Display opportunities
print("\nSUPPLIER DEVELOPMENT OPPORTUNITIES:")
print("=" * 60)

for opportunity_type, data in development_opportunities.items():
    print(f"\n{opportunity_type.upper().replace('_', ' ')} ({len(data)} suppliers):")
    print("-" * 50)
    if len(data) > 0:
        display(data.head(5))
    else:
        print("No opportunities identified in this category")

# Calculate total opportunity value
print("\nOPPORTUNITY VALUE SUMMARY:")
print("=" * 50)

if len(development_opportunities['cost_optimization']) > 0:
    total_cost_opportunity = development_opportunities['cost_optimization']['potential_additional_savings'].sum()
    print(f"Total Cost Optimization Opportunity: ${total_cost_opportunity:,.0f}")

high_potential_count = len(development_opportunities['high_potential'])
underperforming_count = len(development_opportunities['underperforming_strategic'])
quality_issues_count = len(development_opportunities['quality_improvement'])

print(f"High-Potential Suppliers for Growth: {high_potential_count}")
print(f"Underperforming Strategic Suppliers: {underperforming_count}")
print(f"Suppliers with Quality Issues: {quality_issues_count}")
print(f"New Suppliers Under Evaluation: {len(development_opportunities['new_supplier_evaluation'])}")

Identifying Supplier Development Opportunities:

SUPPLIER DEVELOPMENT OPPORTUNITIES:

HIGH POTENTIAL (0 suppliers):
--------------------------------------------------
No opportunities identified in this category

UNDERPERFORMING STRATEGIC (4 suppliers):
--------------------------------------------------


Unnamed: 0_level_0,spend_share,overall_supplier_score,performance_tier,cost_performance_score,delivery_performance_score,quality_performance_score
Supplier,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Beta_Supplies,21.73,43.6,Average,40.0,32.0,40.0
Epsilon_Group,21.71,66.8,Good,80.0,40.0,80.0
Delta_Logistics,20.36,37.4,Poor,20.0,48.0,20.0
Gamma_Co,18.93,69.2,Good,60.0,100.0,60.0



COST OPTIMIZATION (2 suppliers):
--------------------------------------------------


Unnamed: 0_level_0,total_spend,avg_savings_rate,potential_additional_savings,spend_share,overall_supplier_score
Supplier,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
Beta_Supplies,9858665.9,7.83,14196.0,21.73,43.6
Delta_Logistics,9236240.47,7.81,15147.0,20.36,37.4



QUALITY IMPROVEMENT (4 suppliers):
--------------------------------------------------


Unnamed: 0_level_0,avg_defect_rate,total_defective_units,spend_share,quality_performance_score,overall_supplier_score
Supplier,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
Delta_Logistics,10.87,19678.0,20.36,20.0,37.4
Beta_Supplies,8.27,13838.0,21.73,40.0,43.6
Gamma_Co,4.5,7034.0,18.93,60.0,69.2
Epsilon_Group,2.61,4682.0,21.71,80.0,66.8



DELIVERY IMPROVEMENT (0 suppliers):
--------------------------------------------------
No opportunities identified in this category

NEW SUPPLIER EVALUATION (0 suppliers):
--------------------------------------------------
No opportunities identified in this category

OPPORTUNITY VALUE SUMMARY:
Total Cost Optimization Opportunity: $29,343
High-Potential Suppliers for Growth: 0
Underperforming Strategic Suppliers: 4
Suppliers with Quality Issues: 4
New Suppliers Under Evaluation: 0


## 6. Strategic Supplier Recommendations

In [9]:
# Generate strategic supplier recommendations
def generate_supplier_recommendations(supplier_risk_assessment: pd.DataFrame, 
                                     development_opportunities: Dict) -> Dict[str, List[str]]:
    """
    Generate strategic supplier management recommendations.
    """
    print("Generating Strategic Supplier Recommendations:")
    print("=" * 50)
    
    recommendations = {
        'immediate_actions': [],
        'strategic_initiatives': [],
        'risk_mitigation': [],
        'performance_improvement': [],
        'portfolio_optimization': []
    }
    
    # Analyze current state
    critical_risk_suppliers = len(supplier_risk_assessment[
        supplier_risk_assessment['overall_risk_level'] == 'Critical Risk'
    ])
    
    strategic_partners = len(supplier_risk_assessment[
        supplier_risk_assessment['strategic_classification'] == 'Strategic Partner'
    ])
    
    top_5_concentration = supplier_risk_assessment.head(5)['spend_share'].sum()
    
    avg_performance_score = supplier_risk_assessment['overall_supplier_score'].mean()
    
    # Generate immediate action recommendations
    if critical_risk_suppliers > 0:
        recommendations['immediate_actions'].append(
            f"URGENT: Address {critical_risk_suppliers} critical risk suppliers immediately"
        )
    
    if len(development_opportunities['underperforming_strategic']) > 0:
        underperforming_spend = development_opportunities['underperforming_strategic']['spend_share'].sum()
        recommendations['immediate_actions'].append(
            f"Initiate performance improvement plans for strategic suppliers representing {underperforming_spend:.1f}% of spend"
        )
    
    if len(development_opportunities['cost_optimization']) > 0:
        total_opportunity = development_opportunities['cost_optimization']['potential_additional_savings'].sum()
        recommendations['immediate_actions'].append(
            f"Launch cost optimization initiatives with potential ${total_opportunity:,.0f} in additional savings"
        )
    
    # Strategic initiative recommendations
    if strategic_partners < 3:
        recommendations['strategic_initiatives'].append(
            "Develop strategic partnerships with top-performing, high-spend suppliers"
        )
    
    if len(development_opportunities['high_potential']) >= 5:
        recommendations['strategic_initiatives'].append(
            f"Scale up relationships with {len(development_opportunities['high_potential'])} high-potential suppliers"
        )
    
    if top_5_concentration > 60:
        recommendations['strategic_initiatives'].append(
            f"Diversify supplier base - top 5 suppliers represent {top_5_concentration:.1f}% of spend"
        )
    
    # Risk mitigation recommendations
    high_concentration_suppliers = supplier_risk_assessment[
        supplier_risk_assessment['concentration_risk'].isin(['Critical', 'High'])
    ]
    
    if len(high_concentration_suppliers) > 0:
        recommendations['risk_mitigation'].append(
            f"Develop backup suppliers for {len(high_concentration_suppliers)} high-concentration dependencies"
        )
    
    volatile_suppliers = supplier_risk_assessment[
        supplier_risk_assessment['volatility_risk'] == 'High'
    ]
    
    if len(volatile_suppliers) > 0:
        recommendations['risk_mitigation'].append(
            f"Implement performance monitoring for {len(volatile_suppliers)} volatile suppliers"
        )
    
    # Performance improvement recommendations
    if len(development_opportunities['quality_improvement']) > 0:
        recommendations['performance_improvement'].append(
            f"Launch quality improvement programs with {len(development_opportunities['quality_improvement'])} suppliers"
        )
    
    if len(development_opportunities['delivery_improvement']) > 0:
        recommendations['performance_improvement'].append(
            f"Optimize delivery performance with {len(development_opportunities['delivery_improvement'])} suppliers"
        )
    
    if avg_performance_score < 70:
        recommendations['performance_improvement'].append(
            f"Overall supplier performance below target ({avg_performance_score:.1f}/100) - implement comprehensive improvement program"
        )
    
    # Portfolio optimization recommendations
    excellent_tier_pct = (supplier_risk_assessment['performance_tier'] == 'Excellent').mean() * 100
    if excellent_tier_pct < 25:
        recommendations['portfolio_optimization'].append(
            f"Increase excellent-tier suppliers from {excellent_tier_pct:.1f}% to 25% of portfolio"
        )
    
    categories_per_supplier = supplier_risk_assessment['categories_served'].mean()
    if categories_per_supplier < 1.5:
        recommendations['portfolio_optimization'].append(
            "Consolidate suppliers to increase category coverage and leverage"
        )
    
    if len(development_opportunities['new_supplier_evaluation']) > 0:
        recommendations['portfolio_optimization'].append(
            f"Evaluate {len(development_opportunities['new_supplier_evaluation'])} promising new suppliers for expanded partnerships"
        )
    
    return recommendations

# Generate recommendations
supplier_recommendations = generate_supplier_recommendations(supplier_risk_assessment, development_opportunities)

# Display recommendations
print("\nSTRATEGIC SUPPLIER MANAGEMENT RECOMMENDATIONS:")
print("=" * 60)

for category, recs in supplier_recommendations.items():
    if recs:
        print(f"\n{category.upper().replace('_', ' ')}:")
        print("-" * 40)
        for i, rec in enumerate(recs, 1):
            print(f"  {i}. {rec}")

# Create action priority matrix
print("\n\nACTION PRIORITY MATRIX:")
print("=" * 60)

# High Impact, High Urgency
print("\nHIGH IMPACT + HIGH URGENCY (Do First):")
for rec in supplier_recommendations['immediate_actions']:
    print(f"  • {rec}")

# High Impact, Lower Urgency
print("\nHIGH IMPACT + LOWER URGENCY (Schedule):")
for rec in supplier_recommendations['strategic_initiatives']:
    print(f"  • {rec}")

# Lower Impact, High Urgency
print("\nLOWER IMPACT + HIGH URGENCY (Delegate):")
for rec in supplier_recommendations['risk_mitigation']:
    print(f"  • {rec}")

# Lower Impact, Lower Urgency
print("\nLOWER IMPACT + LOWER URGENCY (Monitor):")
for rec in supplier_recommendations['performance_improvement']:
    print(f"  • {rec}")
for rec in supplier_recommendations['portfolio_optimization']:
    print(f"  • {rec}")

Generating Strategic Supplier Recommendations:

STRATEGIC SUPPLIER MANAGEMENT RECOMMENDATIONS:

IMMEDIATE ACTIONS:
----------------------------------------
  1. Initiate performance improvement plans for strategic suppliers representing 82.7% of spend
  2. Launch cost optimization initiatives with potential $29,343 in additional savings

STRATEGIC INITIATIVES:
----------------------------------------
  1. Develop strategic partnerships with top-performing, high-spend suppliers
  2. Diversify supplier base - top 5 suppliers represent 100.0% of spend

PERFORMANCE IMPROVEMENT:
----------------------------------------
  1. Launch quality improvement programs with 4 suppliers
  2. Overall supplier performance below target (60.0/100) - implement comprehensive improvement program

PORTFOLIO OPTIMIZATION:
----------------------------------------
  1. Increase excellent-tier suppliers from 20.0% to 25% of portfolio


ACTION PRIORITY MATRIX:

HIGH IMPACT + HIGH URGENCY (Do First):
  • Initiate p

## 7. Export Supplier Analysis Results

In [11]:
# Export comprehensive supplier analysis results
import os
import json

# Ensure output directories exist
os.makedirs('../data/processed', exist_ok=True)
os.makedirs('../reports', exist_ok=True)

print("Exporting Supplier Analysis Results:")
print("=" * 50)

# 1. Export supplier scorecards
scorecards_path = '../data/processed/supplier_scorecards.csv'
supplier_risk_assessment.to_csv(scorecards_path, index=True)
print(f"Supplier scorecards exported to: {scorecards_path}")

# 2. Export development opportunities
opportunities_path = '../data/processed/supplier_development_opportunities.json'
# Convert DataFrames to dictionaries for JSON serialization
opportunities_export = {}
for key, df in development_opportunities.items():
    opportunities_export[key] = df.to_dict('index')

with open(opportunities_path, 'w') as f:
    json.dump(opportunities_export, f, indent=2, default=str)
print(f"Development opportunities exported to: {opportunities_path}")

# 3. Create comprehensive supplier analysis report
report_path = '../reports/supplier_performance_analysis_report.txt'
with open(report_path, 'w') as f:
    f.write("SUPPLIER PERFORMANCE ANALYSIS REPORT\n")
    f.write("=" * 60 + "\n")
    f.write(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n")
    #f.write(f"Analysis Period: {df['Order_Date'].min().strftime('%Y-%m-%d')} to {df['Order_Date'].max().strftime('%Y-%m-%d')}\n")
    if 'Order_Date' in df.columns:
        f.write(f"Analysis Period: {df['Order_Date'].min().strftime('%Y-%m-%d')} to {df['Order_Date'].max().strftime('%Y-%m-%d')}\n")
    else:
        f.write("Analysis Period: Full dataset period\n")
    f.write(f"Total Suppliers Analyzed: {len(supplier_risk_assessment)}\n")
    f.write(f"Total Procurement Value: ${supplier_risk_assessment['total_spend'].sum():,.2f}\n")
    
    f.write("\nEXECUTIVE SUMMARY:\n")
    f.write("-" * 40 + "\n")
    f.write(f"Average Supplier Performance Score: {supplier_risk_assessment['overall_supplier_score'].mean():.1f}/100\n")
    f.write(f"Top 5 Supplier Concentration: {supplier_risk_assessment.head(5)['spend_share'].sum():.1f}%\n")
    f.write(f"Critical Risk Suppliers: {(supplier_risk_assessment['overall_risk_level'] == 'Critical Risk').sum()}\n")
    f.write(f"Strategic Partners Identified: {(supplier_risk_assessment['strategic_classification'] == 'Strategic Partner').sum()}\n")
    
    f.write("\nPERFORMANCE TIER DISTRIBUTION:\n")
    f.write("-" * 40 + "\n")
    tier_dist = supplier_risk_assessment['performance_tier'].value_counts()
    for tier, count in tier_dist.items():
        pct = count / len(supplier_risk_assessment) * 100
        f.write(f"  {tier}: {count} suppliers ({pct:.1f}%)\n")
    
    f.write("\nRISK ASSESSMENT SUMMARY:\n")
    f.write("-" * 40 + "\n")
    risk_dist = supplier_risk_assessment['overall_risk_level'].value_counts()
    for risk_level, count in risk_dist.items():
        pct = count / len(supplier_risk_assessment) * 100
        f.write(f"  {risk_level}: {count} suppliers ({pct:.1f}%)\n")
    
    f.write("\nTOP 10 PERFORMING SUPPLIERS:\n")
    f.write("-" * 40 + "\n")
    top_10 = supplier_risk_assessment.head(10)
    for supplier, row in top_10.iterrows():
        f.write(f"  {supplier}:\n")
        f.write(f"    Overall Score: {row['overall_supplier_score']:.1f}/100\n")
        f.write(f"    Spend Share: {row['spend_share']:.2f}%\n")
        f.write(f"    Classification: {row['strategic_classification']}\n")
        f.write(f"    Risk Level: {row['overall_risk_level']}\n")
        f.write("\n")
    
    f.write("\nSTRATEGIC RECOMMENDATIONS:\n")
    f.write("-" * 40 + "\n")
    for category, recs in supplier_recommendations.items():
        if recs:
            f.write(f"\n{category.upper().replace('_', ' ')}:\n")
            for i, rec in enumerate(recs, 1):
                f.write(f"  {i}. {rec}\n")
    
    f.write("\nDEVELOPMENT OPPORTUNITIES SUMMARY:\n")
    f.write("-" * 40 + "\n")
    for opp_type, data in development_opportunities.items():
        f.write(f"  {opp_type.replace('_', ' ').title()}: {len(data)} suppliers\n")
    
    if len(development_opportunities['cost_optimization']) > 0:
        total_cost_opp = development_opportunities['cost_optimization']['potential_additional_savings'].sum()
        f.write(f"\nTotal Cost Optimization Opportunity: ${total_cost_opp:,.0f}\n")

print(f"Comprehensive report exported to: {report_path}")

# 4. Create supplier performance summary for dashboard
dashboard_summary = {
    'analysis_date': datetime.now().strftime('%Y-%m-%d'),
    'total_suppliers': len(supplier_risk_assessment),
    'avg_performance_score': supplier_risk_assessment['overall_supplier_score'].mean(),
    'top_5_concentration': supplier_risk_assessment.head(5)['spend_share'].sum(),
    'strategic_partners': (supplier_risk_assessment['strategic_classification'] == 'Strategic Partner').sum(),
    'critical_risk_suppliers': (supplier_risk_assessment['overall_risk_level'] == 'Critical Risk').sum(),
    'performance_tiers': supplier_risk_assessment['performance_tier'].value_counts().to_dict(),
    'risk_levels': supplier_risk_assessment['overall_risk_level'].value_counts().to_dict(),
    'top_10_suppliers': supplier_risk_assessment.head(10)[[
        'overall_supplier_score', 'spend_share', 'strategic_classification'
    ]].to_dict('index')
}

dashboard_path = '../data/processed/supplier_dashboard_summary.json'
with open(dashboard_path, 'w') as f:
    json.dump(dashboard_summary, f, indent=2, default=str)
print(f"Dashboard summary exported to: {dashboard_path}")

print(f"\nSupplier Performance Analysis Complete!")
print(f"Files generated:")
print(f"  1. {scorecards_path} - Detailed supplier scorecards")
print(f"  2. {opportunities_path} - Development opportunities (JSON)")
print(f"  3. {report_path} - Comprehensive analysis report")
print(f"  4. {dashboard_path} - Dashboard summary data")

print(f"\nReady for next phase: Predictive Modeling (Notebook 06)")

Exporting Supplier Analysis Results:
Supplier scorecards exported to: ../data/processed/supplier_scorecards.csv
Development opportunities exported to: ../data/processed/supplier_development_opportunities.json
Comprehensive report exported to: ../reports/supplier_performance_analysis_report.txt
Dashboard summary exported to: ../data/processed/supplier_dashboard_summary.json

Supplier Performance Analysis Complete!
Files generated:
  1. ../data/processed/supplier_scorecards.csv - Detailed supplier scorecards
  2. ../data/processed/supplier_development_opportunities.json - Development opportunities (JSON)
  3. ../reports/supplier_performance_analysis_report.txt - Comprehensive analysis report
  4. ../data/processed/supplier_dashboard_summary.json - Dashboard summary data

Ready for next phase: Predictive Modeling (Notebook 06)


---

## Supplier Performance Deep Dive Complete!

**Major Accomplishments:**
- Created comprehensive supplier portfolio analysis with 15+ performance metrics
- Developed weighted performance scorecards across 4 key dimensions
- Conducted detailed risk assessment with 6 risk categories and composite scoring
- Identified specific development opportunities across 6 strategic areas
- Generated actionable strategic recommendations with priority matrix
- Classified suppliers into strategic categories and performance tiers

**Key Business Insights Delivered:**
- **Supplier Portfolio Optimization**: Top performers, underperformers, and high-potential suppliers
- **Risk Mitigation Strategies**: Critical risk suppliers and concentration dependencies
- **Cost Optimization Opportunities**: Quantified savings potential from underperforming suppliers
- **Strategic Partnership Identification**: Suppliers worthy of deeper strategic relationships
- **Performance Improvement Roadmap**: Specific actions for quality, delivery, and cost improvements

**Strategic Supplier Classifications:**
- **Strategic Partners**: High-spend, high-performance suppliers for deeper partnerships
- **Key Suppliers**: Important suppliers requiring active management
- **High Risk Suppliers**: High-spend suppliers with performance concerns
- **Development Opportunities**: High-performing suppliers with growth potential
- **Performance Concerns**: Suppliers requiring immediate attention or replacement

**Actionable Recommendations Generated:**
- Immediate actions for critical risk suppliers
- Strategic initiatives for portfolio optimization
- Risk mitigation strategies for concentration management
- Performance improvement programs for key suppliers
- Portfolio optimization for competitive advantage

**Ready for Next Phase:**
- **Predictive Modeling** (Notebook 06) - Build forecasting models for supplier performance
- **Interactive Dashboard Creation** - Transform insights into executive dashboards
- **Supplier Relationship Management** - Implement strategic supplier programs

**Strategic Value Delivered:**
- Data-driven supplier relationship management
- Risk-based supplier portfolio optimization
- Performance-based supplier development programs
- Strategic sourcing decision support
- Competitive advantage through supplier excellence

---