# Advanced Matplotlib and Seaborn - Part 3: Visualization Best Practices

**Week 5 Thursday - May 8, 2025**

## Learning Objectives
By the end of this session, you will be able to:
1. Apply design principles for professional business visualizations
2. Implement data storytelling techniques for impactful presentations
3. Optimize visualizations for performance and scalability
4. Create accessible, publication-ready charts for diverse audiences

## Why Visualization Best Practices Matter

**Business Impact:**
- **Decision Making**: Clear visuals lead to better business decisions
- **Communication**: Effective charts bridge technical and business teams
- **Credibility**: Professional visualizations enhance trust and authority
- **Accessibility**: Inclusive design reaches broader audiences

**Career Benefits:**
- **Executive Presentations**: Board-ready visualization skills
- **Data Storytelling**: Transform data into compelling narratives
- **Professional Standards**: Industry-level visualization quality
- **Cross-functional Impact**: Bridge data science and business strategy

## Setup and Data Preparation

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
from datetime import datetime
import zipfile
import requests
from io import BytesIO
from matplotlib.gridspec import GridSpec
import matplotlib.patches as patches
from matplotlib.colors import LinearSegmentedColormap
import colorsys

# Configure display settings
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 10)
warnings.filterwarnings('ignore')

# Set random seed for reproducibility
np.random.seed(42)

print("✅ Libraries imported successfully!")
print("🎨 Ready to explore visualization best practices")

In [None]:
# Create synthetic, Olist-style sample data for the best practices demonstration
def create_best_practices_data():
    """
    Generate a synthetic orders DataFrame (not the real Olist dataset) for the demonstrations below
    """
    print("📊 Creating sample data for best practices demonstration...")
    
    np.random.seed(42)
    
    # Sample parameters
    n_orders = 5000
    states = ['SP', 'RJ', 'MG', 'RS', 'PR', 'SC', 'BA', 'GO', 'PE', 'CE']
    categories = ['electronics', 'home_garden', 'sports_leisure', 'health_beauty', 
                 'fashion_bags', 'computers', 'auto', 'toys', 'furniture', 'books']
    payment_types = ['credit_card', 'boleto', 'debit_card', 'voucher']
    
    # Generate date range with seasonality
    start_date = datetime(2023, 1, 1)
    end_date = datetime(2024, 12, 31)
    date_range = pd.date_range(start_date, end_date, freq='D')
    
    # Create orders data
    orders_data = []
    for i in range(n_orders):
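        # np.random.choice returns a numpy.datetime64, which has no .month or .to_period;
        # wrap it in pd.Timestamp so the attribute access below works
        order_date = pd.Timestamp(np.random.choice(date_range))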
        
        # Add seasonality effects
        seasonal_multiplier = 1.3 if order_date.month in [11, 12] else 1.0
        state_weight = np.random.choice([1.2, 1.0, 0.8], p=[0.3, 0.5, 0.2])  # Random per-order multiplier (drawn independently of customer_state)
        
        order = {
            'order_id': f'order_{i}',
            'customer_id': f'customer_{np.random.randint(1, 1500)}',
            'order_date': order_date,
            'order_month': order_date.to_period('M'),
            'customer_state': np.random.choice(states, p=[0.35, 0.12, 0.10, 0.08, 0.07, 0.06, 0.05, 0.04, 0.04, 0.09]),
            'primary_category': np.random.choice(categories),
            'order_value': np.random.lognormal(3.2, 0.9) * seasonal_multiplier * state_weight,
            'freight_value': np.random.gamma(2, 8),
            'item_count': np.random.poisson(2) + 1,
            'review_score': np.random.choice([1, 2, 3, 4, 5], p=[0.04, 0.06, 0.15, 0.30, 0.45]),
            'payment_type': np.random.choice(payment_types, p=[0.73, 0.19, 0.06, 0.02]),
            'delivery_days': np.random.gamma(3, 4) + 1
        }
        orders_data.append(order)
    
    df = pd.DataFrame(orders_data)
    
    # Add derived metrics
    df['freight_ratio'] = df['freight_value'] / df['order_value']
    df['value_tier'] = pd.cut(df['order_value'], 
                             bins=[0, 50, 150, 500, float('inf')],
                             labels=['Budget', 'Standard', 'Premium', 'Luxury'])
    df['satisfaction_level'] = pd.cut(df['review_score'],
                                     bins=[0, 2, 3, 4, 5],
                                     labels=['Poor', 'Fair', 'Good', 'Excellent'])
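    # Labels follow the holiday multiplier above (Nov-Dec), so October is grouped with 'Q2-Q3 (Regular)'
    df['season'] = df['order_date'].apply(lambda x: 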
        'Q4 (Holiday)' if x.month in [11, 12] else
        'Q1 (Post-Holiday)' if x.month in [1, 2, 3] else
        'Q2-Q3 (Regular)'
    )
    
    return df

# Create the dataset
olist_data = create_best_practices_data()

print(f"✅ Sample data created: {len(olist_data):,} orders")
print(f"📅 Date range: {olist_data['order_date'].min().date()} to {olist_data['order_date'].max().date()}")
print(f"🌎 States: {olist_data['customer_state'].nunique()}")
print(f"📦 Categories: {olist_data['primary_category'].nunique()}")
print(f"💰 Average order value: R$ {olist_data['order_value'].mean():.2f}")
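The generator above builds one order at a time in a Python loop, which is fine for 5,000 rows but scales poorly. As a rough sketch of the vectorized alternative (relevant to the performance objective of this session), the function below draws whole columns at once. The name `make_orders_vectorized` is hypothetical, only a few of the columns are reproduced, and because it uses NumPy's `default_rng` the values will not match the loop-based dataset.

```python
import numpy as np
import pandas as pd


def make_orders_vectorized(n_orders=5000, seed=42):
    """Hypothetical, simplified vectorized variant of the data generator."""
    rng = np.random.default_rng(seed)

    # Pick random positions in the daily date range instead of looping
    dates = pd.date_range("2023-01-01", "2024-12-31", freq="D")
    order_date = dates[rng.integers(0, len(dates), size=n_orders)]

    # Same holiday uplift as the loop version: 1.3x in November and December
    seasonal = np.where(order_date.month.isin([11, 12]), 1.3, 1.0)

    return pd.DataFrame({
        "order_id": [f"order_{i}" for i in range(n_orders)],
        "order_date": order_date,
        "order_value": rng.lognormal(3.2, 0.9, size=n_orders) * seasonal,
        "review_score": rng.choice([1, 2, 3, 4, 5], size=n_orders,
                                   p=[0.04, 0.06, 0.15, 0.30, 0.45]),
    })


demo_orders = make_orders_vectorized(n_orders=1000)
print(demo_orders.head())
```

The remaining columns (state, category, freight, and so on) could be added the same way, one array-valued draw per column.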

## 1. Design Principles for Business Visualizations (10 minutes)

### Core Design Principles

**1. Clarity Over Complexity**
- Prioritize understanding over aesthetics
- Remove unnecessary elements (chart junk)
- Focus on the key message

**2. Accessibility and Inclusion**
- Color-blind friendly palettes
- Sufficient contrast ratios
- Clear typography and sizing

**3. Professional Standards**
- Consistent branding and styling
- Appropriate chart types for data
- Publication-ready quality

**4. Business Context**
- Meaningful titles and labels
- Relevant benchmarks and targets
- Actionable insights highlighted
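
The four principles above can be sketched in a single small chart. This is an illustrative example, not part of the Olist analysis: the delivery figures, the 12-day target, and the function name `plot_delivery_vs_target` are all hypothetical. It removes non-data ink (clarity), uses a colorblind-safe accent plus direct value labels (accessibility), and puts the takeaway in the title alongside a benchmark line (business context).

```python
import matplotlib.pyplot as plt

# Hypothetical figures, for illustration only
states = ["SP", "RJ", "MG", "RS", "PR"]
avg_delivery_days = [9.5, 13.2, 12.1, 14.8, 11.0]
target_days = 12.0  # hypothetical service-level target


def plot_delivery_vs_target(states, values, target):
    fig, ax = plt.subplots(figsize=(8, 4.5))

    # Accent only the bars that miss the target; keep the rest neutral gray
    colors = ["#D55E00" if v > target else "#999999" for v in values]
    bars = ax.bar(states, values, color=colors)

    # Direct labels mean the reader does not depend on color alone
    ax.bar_label(bars, fmt="%.1f", padding=2)

    # Benchmark line gives the numbers business context
    ax.axhline(target, color="black", linestyle="--", linewidth=1)
    ax.text(-0.4, target, f"Target: {target:.0f} days", va="bottom", ha="left")

    # Remove chart junk: unused spines and tick marks
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    ax.tick_params(length=0)

    # The title states the insight rather than just naming the metric
    n_missed = sum(v > target for v in values)
    ax.set_title(
        f"{n_missed} of {len(values)} states miss the {target:.0f}-day delivery target",
        loc="left", fontweight="bold",
    )
    ax.set_ylabel("Average delivery time (days)")
    return fig, ax


fig, ax = plot_delivery_vs_target(states, avg_delivery_days, target_days)
```

In a notebook the figure renders at the end of the cell; in a script, call `plt.show()` or `fig.savefig(...)` afterwards.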

In [None]:
# 1. Color Theory and Accessibility

# Define accessible color palettes
def create_accessible_palettes():
    """
    Create colorblind-friendly and professional color palettes
    """
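    # Every palette has six colors so that each of the six bars gets a distinct color
    palettes = {
        # Okabe-Ito subset; matplotlib's default (tab10) colors are not colorblind-safe
        'colorblind_safe': ['#0072B2', '#E69F00', '#009E73', '#D55E00', '#CC79A7', '#56B4E9'],
        'corporate_blue': ['#003f5c', '#2f4b7c', '#665191', '#a05195', '#d45087', '#f95d6a'],
        'high_contrast': ['#000000', '#E69F00', '#56B4E9', '#009E73', '#F0E442', '#0072B2'],
        # White is left out because it disappears against a white background
        'print_friendly': ['#000000', '#333333', '#666666', '#999999', '#BBBBBB', '#DDDDDD'],
        'viridis_custom': ['#440154', '#414487', '#2a788e', '#22a884', '#7ad151', '#fde725']
    }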
    return palettes

# Demonstrate color accessibility
fig, axes = plt.subplots(2, 3, figsize=(18, 10))
fig.suptitle('Color Accessibility and Professional Palettes\nChoosing the Right Colors for Business Visualizations', 
             fontsize=16, fontweight='bold', y=0.98)

palettes = create_accessible_palettes()

# Sample data for demonstration
sample_categories = olist_data['primary_category'].value_counts().head(6)

# Hide any axes left over in the 2x3 grid (five palettes, six panels)
for ax in axes.ravel()[len(palettes):]:
    ax.set_visible(False)

# Plot one panel per palette
for ax, (name, colors) in zip(axes.ravel(), palettes.items()):
    # Create bar chart with current palette; a thin dark edge keeps pale bars visible
    bars = ax.bar(range(len(sample_categories)), sample_categories.values, 
                  color=colors[:len(sample_categories)],
                  edgecolor='black', linewidth=0.5)
    
    ax.set_title(f'{name.replace("_", " ").title()}\nPalette', 
                fontsize=12, fontweight='bold')
    ax.set_xticks(range(len(sample_categories)))
    ax.set_xticklabels([cat[:8] + '...' if len(cat) > 8 else cat 
                       for cat in sample_categories.index], rotation=45, ha='right')
    ax.set_ylabel('Order Count')
    
    # Add an illustrative, hand-assigned accessibility rating (not computed from the colors)
    accessibility_score = "High" if name in ['colorblind_safe', 'high_contrast', 'viridis_custom'] else "Medium"
    score_color = 'green' if accessibility_score == "High" else 'orange'
    ax.text(0.02, 0.98, f'Accessibility: {accessibility_score}', 
            transform=ax.transAxes, fontsize=10, fontweight='bold',
            bbox=dict(boxstyle='round', facecolor=score_color, alpha=0.3),
            verticalalignment='top')

plt.tight_layout(rect=[0, 0.03, 1, 0.95])
plt.show()

print("🎨 Color Accessibility Guidelines:")
print("• Use colorblind-safe palettes (color vision deficiency affects ~8% of men and ~0.5% of women)")
print("• Ensure sufficient contrast (WCAG AA: at least 4.5:1 for text, 3:1 for graphical elements)")
print("• Test visualizations in grayscale")
print("• Combine color with other visual cues (patterns, shapes, labels)")
print("• Corporate palettes should maintain accessibility standards")
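
The contrast guideline printed above can be checked numerically. The sketch below implements the WCAG 2.x relative-luminance and contrast-ratio formulas in plain Python and applies them to the `high_contrast` palette values from this notebook, measured against a white background. The helper names (`hex_to_rgb`, `relative_luminance`, `contrast_ratio`) are hypothetical; treat this as a quick screening aid rather than a full accessibility audit.

```python
def hex_to_rgb(hex_color):
    """Convert '#RRGGBB' to a tuple of floats in [0, 1]."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))


def relative_luminance(hex_color):
    """WCAG 2.x relative luminance of an sRGB color."""
    def linearize(c):
        # Undo the sRGB gamma encoding for one channel
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in hex_to_rgb(hex_color))
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio, from 1:1 (identical) to 21:1 (black on white)."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


# Values copied from the 'high_contrast' palette defined earlier
high_contrast = ['#000000', '#E69F00', '#56B4E9', '#009E73', '#F0E442', '#0072B2']

for color in high_contrast:
    ratio = contrast_ratio(color, "#FFFFFF")
    verdict = "meets 3:1 for graphics" if ratio >= 3 else "add an outline or a label"
    print(f"{color}: {ratio:.2f}:1 against white ({verdict})")
```

Light colors such as the yellow in this palette fall well below 3:1 on white, which is why the guidelines also recommend combining color with outlines, patterns, or direct labels.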

In [None]:
# Final Summary and Transition to Group Assignment

print("🎓 VISUALIZATION BEST PRACTICES MASTERY COMPLETE!")
print("=" * 70)

print("\n📊 WHAT YOU'VE LEARNED TODAY:")
print("\n🎨 DESIGN PRINCIPLES:")
print("• Color theory and accessibility guidelines")
print("• Typography hierarchy and readability standards")
print("• Chart selection decision framework")
print("• Professional styling and branding")

print("\n📖 DATA STORYTELLING:")
print("• Context-Conflict-Resolution narrative structure")
print("• Progressive disclosure techniques")
print("• Strategic annotations and highlighting")
print("• Actionable insights and recommendations")

print("\n⚡ PERFORMANCE OPTIMIZATION:")
print("• Data sampling for large datasets")
print("• Efficient aggregation strategies")
print("• Memory-efficient visualization techniques")
print("• Export optimization for different formats")

print("\n♿ ACCESSIBILITY & INCLUSION:")
print("• Colorblind-friendly design patterns")
print("• High contrast and multi-modal encoding")
print("• Universal design principles")
print("• Cultural sensitivity considerations")

print("\n🚀 READY FOR THE MAJOR GROUP ASSIGNMENT:")
print("• Create a comprehensive Olist e-commerce dashboard")
print("• Apply statistical visualization techniques")
print("• Implement multi-plot layouts and complex dashboards")
print("• Follow professional best practices and accessibility guidelines")
print("• Tell a compelling data story with actionable insights")

print("\n🎯 SUCCESS CRITERIA FOR YOUR DASHBOARD:")
print("• ✅ Executive summary with key KPIs")
print("• ✅ Statistical analysis using Seaborn")
print("• ✅ Multi-plot layouts with GridSpec")
print("• ✅ Professional styling and accessibility")
print("• ✅ Clear data story with business insights")
print("• ✅ Actionable recommendations")

print("\n💡 TIPS FOR GROUP SUCCESS:")
print("• Divide work by dashboard sections (KPIs, trends, analysis, insights)")
print("• Maintain consistent styling across all visualizations")
print("• Focus on business insights, not just technical execution")
print("• Test your dashboard with different audiences")
print("• Practice your presentation and story flow")

print("\n🏆 YOU'RE NOW EQUIPPED TO CREATE WORLD-CLASS VISUALIZATIONS!")
print("Let's move to the collaborative group work session...")