# Pareto Analysis (80/20 Rule) in Supply Chain

The **Pareto Principle** states that roughly 80% of effects come from 20% of causes. The exact split is a rule of thumb rather than a law, but in supply chain management it is a practical way to focus effort on the few products, suppliers, and routes that drive most of the results.

## Learning Objectives
1. Understand the Pareto Principle and its applications in supply chain
2. Learn ABC analysis for inventory categorization
3. Create Pareto charts for data visualization
4. Apply 80/20 thinking to supplier management and product analysis
5. Make data-driven decisions using Pareto insights

In [None]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import sys

# Add src directory to path
sys.path.append('../src')

# Set plotting style
plt.style.use('default')
sns.set_palette("Set2")
plt.rcParams['figure.figsize'] = (12, 6)

print("📊 Pareto Analysis Environment Ready!")
print("Let's explore the 80/20 principle in supply chain management!")

## 1. What is the 80/20 Rule?

The Pareto Principle, also known as the 80/20 rule, suggests that:
- **80% of outcomes result from 20% of inputs**
- **80% of problems come from 20% of causes**
- **80% of sales come from 20% of customers**

### Common Supply Chain Applications:
- 80% of inventory value from 20% of SKUs
- 80% of revenue from 20% of products
- 80% of quality issues from 20% of suppliers
- 80% of transportation costs from 20% of routes
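
To make the ratio concrete, the sketch below measures how much of a total comes from the top 20% of items. The SKU values are invented for illustration, and `top_share` is a hypothetical helper written for this example; it is not part of the `pareto_analysis` module.

```python
import pandas as pd


def top_share(values: pd.Series, item_fraction: float = 0.2) -> float:
    """Return the share of the total contributed by the top `item_fraction` of items."""
    ranked = values.sort_values(ascending=False)
    # Round to the nearest whole item, but always include at least one.
    n_top = max(1, round(len(ranked) * item_fraction))
    return float(ranked.iloc[:n_top].sum() / ranked.sum())


# Illustrative inventory values for ten SKUs (invented numbers).
inventory_value = pd.Series(
    [500, 300, 50, 40, 30, 25, 20, 15, 12, 8],
    index=[f"SKU-{i:02d}" for i in range(1, 11)],
)

share = top_share(inventory_value)
print(f"Top 20% of SKUs hold {share:.0%} of inventory value")
```

Real data rarely lands on exactly 80/20, so it is worth running the same check with other values of `item_fraction` (for example 0.1 or 0.3) to see how concentrated the data actually is.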

In [None]:
# Import our Pareto analysis module
from pareto_analysis.pareto_basics import ParetoAnalyzer, create_sample_supply_chain_data

# Create sample supply chain data
sample_data = create_sample_supply_chain_data()

print("📦 Sample Supply Chain Data Generated:")
print(f"\n🏷️  Product Data: {len(sample_data['product_data'])} products")
print(sample_data['product_data'].head())

print(f"\n🏭 Supplier Data: {len(sample_data['supplier_data'])} suppliers")
print(sample_data['supplier_data'].head())

## 2. ABC Analysis - Product Revenue Classification

ABC analysis is a practical application of the Pareto principle. Products are classified into three categories:
- **Category A**: High-value items (typically ~20% of items, ~80% of value)
- **Category B**: Medium-value items (typically ~30% of items, ~15% of value)
- **Category C**: Low-value items (typically ~50% of items, ~5% of value)
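
The classification can be sketched with plain pandas as shown below. This is a minimal illustration, not the implementation inside `ParetoAnalyzer`: the function name `classify_abc`, the cutoffs of 80% and 95% cumulative value, and the rule that an item crossing a cutoff still counts toward the higher class are all assumptions, and the revenue figures are invented.

```python
import pandas as pd


def classify_abc(values: pd.Series, a_cutoff: float = 80.0, b_cutoff: float = 95.0) -> pd.DataFrame:
    """Rank items by value and label each one A, B, or C by cumulative share."""
    ranked = values.sort_values(ascending=False).to_frame("value")
    ranked["cumulative_percent"] = ranked["value"].cumsum() / ranked["value"].sum() * 100
    # Share accumulated before each item, so the item that crosses a cutoff
    # still counts toward the higher class.
    share_before = ranked["cumulative_percent"].shift(fill_value=0.0)
    ranked["abc_class"] = "C"
    ranked.loc[share_before < b_cutoff, "abc_class"] = "B"
    ranked.loc[share_before < a_cutoff, "abc_class"] = "A"
    return ranked


# Illustrative annual revenue for ten products (invented numbers).
revenue = pd.Series(
    [400, 250, 170, 70, 50, 25, 15, 10, 6, 4],
    index=[f"P{i:02d}" for i in range(1, 11)],
)

classified = classify_abc(revenue)
print(classified)
print(classified.groupby("abc_class")["value"].agg(["count", "sum"]))
```

With these numbers, three of the ten products fall into Category A and carry 82% of the revenue. Other cutoffs (for example 70/90) are also common in practice, so treat the thresholds as parameters to tune rather than fixed values.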

In [None]:
# Perform ABC analysis on product revenue
analyzer = ParetoAnalyzer()
product_abc = analyzer.load_data(sample_data['product_data']).abc_analysis('annual_revenue', 'product_id')

# Get summary statistics
summary = analyzer.get_pareto_summary()

print("📈 ABC Analysis Results - Product Revenue:")
print("=" * 50)
print(f"Key Finding: {summary['interpretation']}")
print(f"Pareto Ratio: {summary['pareto_ratio']} (compared with the theoretical 20/80)")

# Show ABC distribution
abc_stats = analyzer.analysis_results['abc_analysis']['stats']
print("\n📊 ABC Category Breakdown:")
for category in ['A', 'B', 'C']:
    if category in abc_stats.index:
        row = abc_stats.loc[category]
        print(f"Category {category}: {int(row['item_count'])} items ({row['max_item_percent']:.1f}% of products) = ${row['total_value']:,.0f} ({row['value_percent']:.1f}% of revenue)")

In [None]:
# Create Pareto chart for product revenue
analyzer.pareto_chart('annual_revenue', 'product_id', 
                     title='Product Revenue Pareto Analysis', 
                     figsize=(14, 8))

print("💡 Interpretation of Pareto Chart:")
print("   📊 Blue bars: Individual product revenues (sorted from highest to lowest)")
print("   📈 Red line: Cumulative percentage of total revenue")
print("   🟢 Green lines: 80% revenue line and 20% products line")
print("   🎯 Products to the left of intersection are your 'vital few'")

## 3. Supplier Quality Analysis

Let's apply Pareto analysis to supplier quality issues. This helps identify which suppliers require immediate attention.
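
The Pareto charts in this notebook combine bars sorted from largest to smallest with a cumulative-percentage line. The sketch below shows one way to build such a chart with plain matplotlib. It is an illustration under stated assumptions: `plot_pareto` is a hypothetical helper, not `ParetoAnalyzer.pareto_chart`, and the supplier names and issue counts are invented.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_pareto(values: pd.Series, title: str = "Pareto chart"):
    """Draw sorted bars with a cumulative-percentage line on a second y-axis."""
    ranked = values.sort_values(ascending=False)
    cumulative_percent = ranked.cumsum() / ranked.sum() * 100
    labels = [str(label) for label in ranked.index]

    fig, ax_bars = plt.subplots(figsize=(10, 5))
    ax_bars.bar(labels, ranked.to_numpy(), color="tab:blue")
    ax_bars.set_ylabel("Value")
    ax_bars.set_title(title)

    # The second y-axis shares the x-axis and carries the cumulative curve.
    ax_line = ax_bars.twinx()
    ax_line.plot(labels, cumulative_percent.to_numpy(), color="tab:red", marker="o")
    ax_line.set_ylim(0, 105)
    ax_line.set_ylabel("Cumulative % of total")
    ax_line.axhline(80, color="tab:green", linestyle="--")  # 80% reference line

    fig.tight_layout()
    return fig, cumulative_percent


# Illustrative quality-issue counts per supplier (invented numbers).
issues = pd.Series({"SUP-A": 42, "SUP-B": 27, "SUP-C": 9, "SUP-D": 5, "SUP-E": 3})
fig, cumulative = plot_pareto(issues, title="Supplier quality issues (illustrative)")
print(cumulative.round(1))
```

In a notebook the figure renders when the cell finishes; in a script, call `fig.savefig(...)` or `plt.show()` to see it. The suppliers to the left of the point where the red line crosses the 80% reference are the ones to investigate first.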

In [None]:
# Analyze supplier quality issues
supplier_analyzer = ParetoAnalyzer()
supplier_abc = supplier_analyzer.load_data(sample_data['supplier_data']).abc_analysis('quality_issues', 'supplier_id')

supplier_summary = supplier_analyzer.get_pareto_summary()

print("🏭 Supplier Quality Issues - ABC Analysis:")
print("=" * 50)
print(f"Key Finding: {supplier_summary['interpretation']}")

# Show top problem suppliers
top_10_suppliers = supplier_abc.head(10)[['supplier_id', 'quality_issues', 'cumulative_percent', 'abc_class']]
print("\n⚠️  Top 10 Problem Suppliers:")
for idx, row in top_10_suppliers.iterrows():
    print(f"{row['supplier_id']}: {int(row['quality_issues'])} issues ({row['cumulative_percent']:.1f}% cumulative) - Category {row['abc_class']}")

# Create Pareto chart for supplier issues
supplier_analyzer.pareto_chart('quality_issues', 'supplier_id', 
                              title='Supplier Quality Issues Pareto Analysis',
                              figsize=(14, 8))

## 4. Comparative Analysis Across Categories

Let's compare different applications of Pareto analysis side by side:

In [None]:
# Analyze multiple dimensions
profit_analyzer = ParetoAnalyzer()
profit_abc = profit_analyzer.load_data(sample_data['product_data']).abc_analysis('profit', 'product_id')
profit_summary = profit_analyzer.get_pareto_summary()

spend_analyzer = ParetoAnalyzer()
spend_abc = spend_analyzer.load_data(sample_data['supplier_data']).abc_analysis('annual_spend', 'supplier_id')
spend_summary = spend_analyzer.get_pareto_summary()

# Create comparison table
comparison_data = {
    'Analysis Type': ['Product Revenue', 'Product Profit', 'Supplier Quality Issues', 'Supplier Spend'],
    'Items in Category A': [summary['a_category_items'], 
                           profit_summary['a_category_items'],
                           supplier_summary['a_category_items'],
                           spend_summary['a_category_items']],
    '% Items in A': [f"{summary['a_category_item_percent']:.0f}%",
                     f"{profit_summary['a_category_item_percent']:.0f}%",
                     f"{supplier_summary['a_category_item_percent']:.0f}%",
                     f"{spend_summary['a_category_item_percent']:.0f}%"],
    '% Value in A': [f"{summary['a_category_value_percent']:.0f}%",
                     f"{profit_summary['a_category_value_percent']:.0f}%",
                     f"{supplier_summary['a_category_value_percent']:.0f}%",
                     f"{spend_summary['a_category_value_percent']:.0f}%"],
    'Actual Ratio': [summary['pareto_ratio'],
                     profit_summary['pareto_ratio'],
                     supplier_summary['pareto_ratio'],
                     spend_summary['pareto_ratio']]
}

comparison_df = pd.DataFrame(comparison_data)
print("📋 Pareto Analysis Comparison:")
print("=" * 70)
print(comparison_df.to_string(index=False))

print("\n🎯 Management Insights:")
print("   📈 Focus inventory management on high-revenue Category A products")
print("   💰 Prioritize profit improvement for Category A profit drivers")
print("   🏭 Implement quality improvement programs for problem suppliers")
print("   🤝 Develop strategic partnerships with high-spend suppliers")
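
A comparison like the one above comes down to one question per metric: what fraction of items is needed to reach 80% of the total? The sketch below answers it for two columns of a small, invented table. `items_needed_for_share` is a hypothetical helper and may not match how `get_pareto_summary` derives its `pareto_ratio`.

```python
import numpy as np
import pandas as pd


def items_needed_for_share(values: pd.Series, target_share: float = 0.8) -> float:
    """Fraction of items (largest first) needed to reach `target_share` of the total."""
    ranked = values.sort_values(ascending=False).to_numpy(dtype=float)
    cumulative_share = np.cumsum(ranked) / ranked.sum()
    # Position of the first item at which the running share reaches the target.
    n_items = int(np.searchsorted(cumulative_share, target_share, side="left")) + 1
    return min(n_items, len(ranked)) / len(ranked)


# Illustrative data: revenue is concentrated, profit is spread evenly (invented numbers).
products = pd.DataFrame({
    "revenue": [900, 500, 250, 100, 100, 60, 40, 30, 15, 5],
    "profit": [120, 110, 100, 95, 90, 85, 80, 75, 70, 65],
})

concentration = {col: items_needed_for_share(products[col]) for col in products.columns}
for metric, fraction in concentration.items():
    print(f"{metric}: {fraction:.0%} of items account for 80% of the total")
```

Here 30% of the products cover 80% of revenue, while 80% of them are needed to cover 80% of profit. A metric spread this evenly gains little from ABC prioritization, which is why it helps to check each dimension separately.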

## 5. Advanced Pareto Visualizations

Let's create more sophisticated visualizations to better understand the data patterns:

In [None]:
# Create a comprehensive dashboard
fig, axes = plt.subplots(2, 2, figsize=(16, 12))
fig.suptitle('Supply Chain Pareto Analysis Dashboard', fontsize=16, fontweight='bold')

# 1. ABC Distribution for Products
product_abc_counts = product_abc['abc_class'].value_counts().sort_index()
colors = ['#FF6B6B', '#4ECDC4', '#45B7D1']
axes[0,0].pie(product_abc_counts.values, labels=[f'Category {cat}' for cat in product_abc_counts.index], 
              autopct='%1.1f%%', colors=colors)
axes[0,0].set_title('Product ABC Distribution by Count')

# 2. Revenue by ABC Category
product_revenue_by_abc = product_abc.groupby('abc_class')['annual_revenue'].sum()
axes[0,1].bar(product_revenue_by_abc.index, product_revenue_by_abc.values, color=colors)
axes[0,1].set_title('Total Revenue by ABC Category')
axes[0,1].set_ylabel('Revenue ($)')
axes[0,1].tick_params(axis='y', rotation=0)

# Format y-axis
axes[0,1].yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x/1000:.0f}K'))

# 3. Supplier Issues Distribution
supplier_abc_counts = supplier_abc['abc_class'].value_counts().sort_index()
axes[1,0].pie(supplier_abc_counts.values, labels=[f'Category {cat}' for cat in supplier_abc_counts.index], 
              autopct='%1.1f%%', colors=colors)
axes[1,0].set_title('Supplier ABC Distribution by Count')

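# 4. Heat map of Product Categories vs ABC Class
# Category labels are assigned at random here purely for illustration, so the
# heat map shows no real pattern. We work on a copy so the random column does
# not end up in the CSV saved at the end of the notebook.
np.random.seed(42)
categories = ['Electronics', 'Clothing', 'Home', 'Sports', 'Automotive']
product_with_category = product_abc.copy()
product_with_category['category'] = np.random.choice(categories, len(product_with_category))

# Create cross-tabulation (row percentages: share of each ABC class within a category)
category_abc_cross = pd.crosstab(product_with_category['category'], product_with_category['abc_class'], normalize='index') * 100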
sns.heatmap(category_abc_cross, annot=True, fmt='.1f', cmap='RdYlBu_r', ax=axes[1,1])
axes[1,1].set_title('Product Category vs ABC Class (%)')
axes[1,1].set_ylabel('Product Category')
axes[1,1].set_xlabel('ABC Class')

plt.tight_layout()
plt.show()

print("📊 Dashboard Insights:")
highest_revenue_category = product_revenue_by_abc.idxmax()
print(f"   🏆 Category {highest_revenue_category} generates the most revenue: ${product_revenue_by_abc[highest_revenue_category]:,.0f}")
print(f"   📈 This represents {product_revenue_by_abc[highest_revenue_category]/product_revenue_by_abc.sum()*100:.1f}% of total revenue")

## 6. Practical Business Applications

Now let's see how to apply these insights to make business decisions:

In [None]:
# Generate actionable insights
print("🎯 ACTIONABLE PARETO INSIGHTS FOR SUPPLY CHAIN MANAGEMENT")
print("=" * 65)

# 1. Product Management Recommendations
category_a_products = product_abc[product_abc['abc_class'] == 'A']
category_c_products = product_abc[product_abc['abc_class'] == 'C']

print("\n📦 PRODUCT MANAGEMENT:")
print(f"   ✅ FOCUS AREAS (Category A - {len(category_a_products)} products):")
print("      • Maintain high service levels (95-99% availability)")
print("      • Invest in demand forecasting accuracy")
print("      • Negotiate better supplier terms for high-volume items")
print("      • Monitor these products weekly")

print(f"\n   🔄 OPTIMIZATION AREAS (Category C - {len(category_c_products)} products):")
print("      • Consider consolidating slow-moving items")
print("      • Reduce inventory levels (accept lower service levels)")
print("      • Review monthly rather than weekly")
print("      • Evaluate discontinuation of lowest performers")

# 2. Supplier Management Recommendations
problem_suppliers = supplier_abc[supplier_abc['abc_class'] == 'A']
good_suppliers = supplier_abc[supplier_abc['abc_class'] == 'C']

print("\n🏭 SUPPLIER MANAGEMENT:")
print(f"   🚨 IMMEDIATE ACTION REQUIRED ({len(problem_suppliers)} suppliers):")
for _, supplier in problem_suppliers.head(3).iterrows():
    print(f"      • {supplier['supplier_id']}: {int(supplier['quality_issues'])} issues - Schedule quality audit")

print(f"\n   ⭐ RECOGNITION CANDIDATES ({len(good_suppliers)} suppliers):")
# Pick the Category C suppliers with the fewest issues, regardless of row order
for _, supplier in good_suppliers.nsmallest(3, 'quality_issues').iterrows():
    print(f"      • {supplier['supplier_id']}: {int(supplier['quality_issues'])} issues - Consider partnership expansion")

# 3. Inventory Investment Recommendations
total_revenue = product_abc['annual_revenue'].sum()
category_a_revenue = category_a_products['annual_revenue'].sum()

print("\n💰 INVENTORY INVESTMENT STRATEGY:")
print("   📊 Current Situation:")
print(f"      • Total products: {len(product_abc)}")
print(f"      • Category A products: {len(category_a_products)} ({len(category_a_products)/len(product_abc)*100:.1f}%)")
print(f"      • Category A revenue share: {category_a_revenue/total_revenue*100:.1f}%")
print("   💡 Recommendations:")
print("      • Allocate 70-80% of inventory budget to Category A items")
print("      • Implement tighter inventory controls for Category A")
print("      • Use simpler replenishment rules for Category C")

## 7. Interactive Pareto Analysis Tool

Let's create a simple tool to perform Pareto analysis on any dataset:

In [None]:
def quick_pareto_analysis(data, value_col, item_col=None, top_n=10):
    """
    Quick Pareto analysis function for any dataset.
    """
    analyzer = ParetoAnalyzer()
    result = analyzer.load_data(data).abc_analysis(value_col, item_col)
    summary = analyzer.get_pareto_summary()
    
    print(f"📊 Quick Pareto Analysis - {value_col}")
    print("=" * 50)
    print(f"Finding: {summary['interpretation']}")
    print(f"Actual ratio: {summary['pareto_ratio']}")
    
    # Show top performers
    if item_col:
        top_items = result.head(top_n)[[item_col, value_col, 'cumulative_percent', 'abc_class']]
        print(f"\nTop {top_n} items:")
        for idx, row in top_items.iterrows():
            print(f"  {row[item_col]}: {row[value_col]:,.0f} ({row['cumulative_percent']:.1f}% cum.) [Class {row['abc_class']}]")
    
    return result, summary

# Test the tool with our data
print("🛠️ Testing Quick Pareto Analysis Tool:")
print()

# Analyze product revenue
product_result, product_summary = quick_pareto_analysis(
    sample_data['product_data'], 'annual_revenue', 'product_id', top_n=5
)

print("\n" + "-"*50)

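# Analyze supplier spend (stored under spend_* names so the quality-issue
# results kept in supplier_summary from Section 3 are not overwritten)
spend_result, spend_summary = quick_pareto_analysis(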
    sample_data['supplier_data'], 'annual_spend', 'supplier_id', top_n=5
)

## 8. Key Takeaways and Next Steps

Congratulations! You've learned how to apply the Pareto principle to supply chain management. Here are the key takeaways:

In [None]:
print("🎓 KEY TAKEAWAYS FROM PARETO ANALYSIS:")
print("=" * 50)

takeaways = [
    "📊 The 80/20 rule helps prioritize limited resources effectively",
    "🎯 ABC analysis enables strategic inventory management",
    "📈 Pareto charts visualize where to focus improvement efforts",
    "🏭 Different dimensions (revenue, quality, cost) may have different patterns",
    "💡 Regular Pareto analysis helps adapt to changing business conditions",
    "🔧 Simple Python tools can automate complex business analyses",
    "📋 Visual dashboards make insights accessible to all stakeholders"
]

for i, takeaway in enumerate(takeaways, 1):
    print(f"{i}. {takeaway}")

print("\n🚀 NEXT STEPS IN YOUR LEARNING JOURNEY:")
print("   📚 Explore optimization_problems/ for mathematical modeling")
print("   🔬 Apply these techniques to your organization's real data")
print("   🎯 Build automated Pareto analysis dashboards")
print("   📊 Combine with other analytics techniques (forecasting, clustering)")
print("   🤝 Share insights with business stakeholders")

print("\n💾 Don't forget to save interesting analyses for future reference!")

In [None]:
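# Save analysis results (create the output directory if it does not exist yet)
from pathlib import Path

output_dir = Path('../data')
output_dir.mkdir(parents=True, exist_ok=True)
product_abc.to_csv(output_dir / 'product_abc_analysis.csv', index=False)
supplier_abc.to_csv(output_dir / 'supplier_abc_analysis.csv', index=False)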

print("💾 Analysis results saved!")
print("   📄 product_abc_analysis.csv - Product ABC classification")
print("   📄 supplier_abc_analysis.csv - Supplier ABC classification")
print("\n🎉 Ready for the next notebook: 03_inventory_optimization.ipynb")