# Week 11: Marketing Performance Analysis & Optimization
# Wednesday Python - Matplotlib Fundamentals for Marketing Analytics
# Part 1: Matplotlib Fundamentals for Marketing Campaign Visualization

## Business Context
**Scenario**: You're a data analyst for Jumia Nigeria's marketing team. The CMO wants visual insights into campaign performance to optimize the Q4 marketing budget allocation.

**Dataset**: Marketing campaign data from a Nigerian e-commerce platform, including leads, conversions, and revenue metrics.

## Learning Objectives
1. Create basic matplotlib plots for marketing data visualization
2. Customize visualizations for business presentations
3. Apply chart types appropriate for marketing analytics
4. Tell compelling data stories with visualizations

## 1. Setup and Data Loading

First, let's import the necessary libraries and load our marketing campaign data.

In [None]:
# Import essential libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from matplotlib.ticker import FuncFormatter
import seaborn as sns

# Set style for professional-looking plots
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

# Configure matplotlib for better display
plt.rcParams['figure.figsize'] = (12, 8)
plt.rcParams['font.size'] = 12
plt.rcParams['axes.titlesize'] = 16
plt.rcParams['axes.labelsize'] = 14
plt.rcParams['xtick.labelsize'] = 12
plt.rcParams['ytick.labelsize'] = 12
plt.rcParams['legend.fontsize'] = 12

# Load marketing campaign data
df = pd.read_csv('datasets/marketing_campaign_data.csv')

# Convert date columns to datetime
df['first_contact_date'] = pd.to_datetime(df['first_contact_date'])
df['won_date'] = pd.to_datetime(df['won_date'])

# Display first few rows and basic info
print("Dataset Shape:", df.shape)
print("\nFirst few rows:")
display(df.head())
print("\nDataset Info:")
df.info()

## 2. Matplotlib Fundamentals - Basic Plot Types

Let's create the foundational plots that every marketing analyst needs to master.

### 2.1 Bar Charts - Marketing Channel Performance

In [None]:
# Calculate MQLs by marketing channel
channel_mqls = df['origin'].value_counts().sort_values(ascending=False)

# Create bar chart for MQL volume by channel
fig, ax = plt.subplots(figsize=(12, 7))

# Create bars
bars = ax.bar(channel_mqls.index, channel_mqls.values, 
             color=['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd'])

# Customize the chart
ax.set_title('Marketing Qualified Leads by Channel - Nigerian E-commerce', 
             fontsize=16, fontweight='bold', pad=20)
ax.set_xlabel('Marketing Channel', fontsize=14, fontweight='bold')
ax.set_ylabel('Number of MQLs', fontsize=14, fontweight='bold')

# Add value labels on bars
for bar in bars:
    height = bar.get_height()
    ax.text(bar.get_x() + bar.get_width()/2., height + 0.5,
            f'{int(height):,}',
            ha='center', va='bottom', fontweight='bold')

# Format y-axis to show commas
ax.yaxis.set_major_formatter(FuncFormatter(lambda x, p: f'{int(x):,}'))

# Add grid for better readability
ax.grid(axis='y', alpha=0.3, linestyle='--')

# Remove top and right spines for cleaner look
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

plt.tight_layout()
plt.show()

# Print insights
print("\n=== MARKETING CHANNEL INSIGHTS ===")
print(f"Total MQLs: {channel_mqls.sum():,}")
print(f"Best performing channel: {channel_mqls.index[0]} ({channel_mqls.iloc[0]:,} MQLs)")
print(f"Share of top channel: {channel_mqls.iloc[0]/channel_mqls.sum()*100:.1f}%")

### 2.2 Line Charts - Marketing Funnel Timeline

In [None]:
# Create monthly MQL trends
df['contact_month'] = df['first_contact_date'].dt.to_period('M')
monthly_mqls = df.groupby('contact_month').size().reset_index(name='mql_count')
monthly_mqls['contact_month'] = monthly_mqls['contact_month'].dt.to_timestamp()

# Create line chart
fig, ax = plt.subplots(figsize=(14, 8))

# Plot the line with markers
line = ax.plot(monthly_mqls['contact_month'], monthly_mqls['mql_count'], 
              marker='o', linewidth=3, markersize=8, color='#2E86AB',
              markerfacecolor='#A23B72', markeredgewidth=2, markeredgecolor='white')

# Customize the chart
ax.set_title('Monthly MQL Generation Trend - Nigerian E-commerce Platform', 
             fontsize=16, fontweight='bold', pad=20)
ax.set_xlabel('Month', fontsize=14, fontweight='bold')
ax.set_ylabel('Number of MQLs', fontsize=14, fontweight='bold')

# Format x-axis for dates
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
plt.setp(ax.xaxis.get_majorticklabels(), rotation=45, ha='right')

# Format y-axis
ax.yaxis.set_major_formatter(FuncFormatter(lambda x, p: f'{int(x):,}'))

# Add grid
ax.grid(True, alpha=0.3, linestyle='--')

# Highlight peak and trough
max_mql_idx = monthly_mqls['mql_count'].idxmax()
min_mql_idx = monthly_mqls['mql_count'].idxmin()

# Annotate both extremes; .loc is used because idxmax/idxmin return index labels
for label, idx, offset in [('Peak', max_mql_idx, (10, 20)), ('Low', min_mql_idx, (10, -30))]:
    ax.annotate(f'{label}: {monthly_mqls.loc[idx, "mql_count"]:,}',
                xy=(monthly_mqls.loc[idx, 'contact_month'],
                    monthly_mqls.loc[idx, 'mql_count']),
                xytext=offset, textcoords='offset points',
                bbox=dict(boxstyle='round,pad=0.5', fc='yellow', alpha=0.7),
                arrowprops=dict(arrowstyle='->', connectionstyle='arc3,rad=0'))

plt.tight_layout()
plt.show()

# Print insights
print("\n=== MONTHLY TREND INSIGHTS ===")
peak_month = monthly_mqls.loc[monthly_mqls['mql_count'].idxmax()]
low_month = monthly_mqls.loc[monthly_mqls['mql_count'].idxmin()]
print(f"Peak month: {peak_month['contact_month'].strftime('%B %Y')} ({peak_month['mql_count']:,} MQLs)")
print(f"Lowest month: {low_month['contact_month'].strftime('%B %Y')} ({low_month['mql_count']:,} MQLs)")
print(f"Variability (CV): {monthly_mqls['mql_count'].std()/monthly_mqls['mql_count'].mean()*100:.1f}%")
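
The variability figure printed by the cell above is a coefficient of variation (CV): the sample standard deviation divided by the mean, expressed as a percentage. As a rough sketch, the same calculation can be reproduced with the standard library. The function name `coefficient_of_variation` and the monthly counts are made up for illustration and do not come from the dataset. `statistics.stdev` is the sample standard deviation, which matches the default behaviour of pandas `Series.std()`.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100


# Hypothetical monthly MQL counts, for illustration only
example_counts = [120, 150, 90, 180, 160]
cv = coefficient_of_variation(example_counts)
print(f"Variability (CV): {cv:.1f}%")
```

A higher CV means lead volume swings more from month to month relative to its average, which is worth flagging before committing a fixed monthly budget.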

### 2.3 Pie Charts - Lead Origin Distribution

In [None]:
# Calculate lead origin distribution
origin_counts = df['origin'].value_counts()

# Create pie chart
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 8))

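# Standard pie chart
colors = ['#FF6B6B', '#4ECDC4', '#45B7D1', '#96CEB4', '#FECA57']
# Graduated wedge offsets sized to the number of origins
# (a fixed-length tuple raises ValueError when the category count differs)
explode = np.linspace(0.05, 0, len(origin_counts))
wedges, texts, autotexts = ax1.pie(origin_counts.values, 
                                   labels=origin_counts.index,
                                   autopct='%1.1f%%',
                                   startangle=90,
                                   colors=colors,
                                   explode=explode)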

# Customize pie chart
ax1.set_title('Lead Origin Distribution', fontsize=16, fontweight='bold')
plt.setp(autotexts, size=12, weight='bold')
plt.setp(texts, size=12, weight='bold')

# Donut chart (pie chart with donut hole)
wedges2, texts2, autotexts2 = ax2.pie(origin_counts.values,
                                    labels=origin_counts.index,
                                    autopct='%1.1f%%',
                                    startangle=90,
                                    colors=colors,
                                    pctdistance=0.85,
                                    wedgeprops=dict(width=0.4))

# Add center text to donut
ax2.text(0, 0, f'\nTotal Leads\n{origin_counts.sum():,}', 
         horizontalalignment='center',
         verticalalignment='center',
         fontsize=14, fontweight='bold')

ax2.set_title('Lead Origin (Donut View)', fontsize=16, fontweight='bold')
plt.setp(autotexts2, size=12, weight='bold')
plt.setp(texts2, size=12, weight='bold')

plt.suptitle('Marketing Lead Sources - Nigerian E-commerce Analysis', 
             fontsize=18, fontweight='bold', y=1.05)
plt.tight_layout()
plt.show()

# Print insights
print("\n=== LEAD ORIGIN INSIGHTS ===")
for origin, count in origin_counts.items():
    percentage = count / origin_counts.sum() * 100
    print(f"{origin}: {count:,} leads ({percentage:.1f}%)")
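
Pie and donut charts become hard to read when there are many small wedges. One possible remedy, sketched below under assumptions, is to fold every category below a minimum share into a single "Other" wedge before plotting. The helper name `group_small_slices`, the 5% threshold, and the example counts are all hypothetical and are not part of the dataset or of matplotlib.

```python
def group_small_slices(counts, min_share=0.05, other_label="Other"):
    """Merge categories whose share of the total is below min_share.

    counts is a mapping of label -> count. The result keeps the large
    categories in descending order and appends one combined entry.
    """
    total = sum(counts.values())
    grouped = {}
    other_total = 0
    for label, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        if total > 0 and count / total >= min_share:
            grouped[label] = count
        else:
            other_total += count
    if other_total > 0:
        grouped[other_label] = grouped.get(other_label, 0) + other_total
    return grouped


# Hypothetical lead counts per origin, for illustration only
example_counts = {
    "organic_search": 500,
    "paid_search": 300,
    "social": 150,
    "email": 30,
    "referral": 20,
}
grouped = group_small_slices(example_counts)
print(grouped)
```

With a pandas Series such as `origin_counts`, the same helper could be called on `origin_counts.to_dict()`, and the returned labels and values passed to `ax.pie`.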

## 3. Advanced Plot Customization

Now let's explore more advanced customization techniques for professional marketing presentations.

### 3.1 Conversion Funnel Visualization

In [None]:
# Calculate conversion funnel metrics
total_mqls = len(df)
converted_leads = len(df[df['won_date'].notna()])
revenue_generated = df[df['won_date'].notna()]['declared_monthly_revenue'].sum()

funnel_data = [
    ('Marketing Qualified Leads', total_mqls),
    ('Closed Deals', converted_leads),
    ('Active Revenue Generators', len(df[df['declared_monthly_revenue'] > 0]))
]

# Create funnel chart using horizontal bars
fig, ax = plt.subplots(figsize=(14, 10))

# Prepare data
stages = [item[0] for item in funnel_data]
values = [item[1] for item in funnel_data]
colors = ['#FF6B6B', '#4ECDC4', '#45B7D1']

# Create horizontal bar chart
bars = ax.barh(stages, values, color=colors, height=0.6)

# Add conversion rates between stages
conversion_rates = [100, values[1]/values[0]*100, values[2]/values[1]*100]

for i, (bar, rate) in enumerate(zip(bars, conversion_rates)):
    width = bar.get_width()
    
    # Add value labels
    ax.text(width + width*0.01, bar.get_y() + bar.get_height()/2,
            f'{int(width):,}',
            ha='left', va='center', fontsize=14, fontweight='bold')
    
    # Add conversion rate labels
    if i > 0:
        ax.text(width/2, bar.get_y() + bar.get_height()/2 + 0.3,
                f'{rate:.1f}% conversion',
                ha='center', va='bottom', fontsize=12,
                bbox=dict(boxstyle='round,pad=0.3', fc='yellow', alpha=0.7))

# Customize the chart
ax.set_title('Marketing Conversion Funnel - Nigerian E-commerce', 
             fontsize=18, fontweight='bold', pad=30)
ax.set_xlabel('Number of Leads/Customers', fontsize=14, fontweight='bold')

# Format x-axis
ax.xaxis.set_major_formatter(FuncFormatter(lambda x, p: f'{int(x):,}'))

# Add grid
ax.grid(axis='x', alpha=0.3, linestyle='--')

# Remove spines
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

# Add insights text box
insights_text = f"""Key Metrics:
• Total MQLs: {total_mqls:,}
• Conversion Rate: {converted_leads/total_mqls*100:.1f}%
• Revenue Generated: ₦{revenue_generated:,.0f}
• Avg Revenue per Deal: ₦{revenue_generated/converted_leads:,.0f}"""

ax.text(0.98, 0.02, insights_text,
        transform=ax.transAxes, fontsize=11,
        verticalalignment='bottom', horizontalalignment='right',
        bbox=dict(boxstyle='round,pad=0.7', fc='lightblue', alpha=0.8))

plt.tight_layout()
plt.show()

print("\n=== CONVERSION FUNNEL INSIGHTS ===")
print(f"Overall MQL to Deal Conversion: {converted_leads/total_mqls*100:.2f}%")
print(f"Total Revenue Generated: ₦{revenue_generated:,.0f}")
print(f"Average Revenue per Converted Deal: ₦{revenue_generated/converted_leads:,.0f}")
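
The funnel cell computes each stage's conversion rate as a percentage of the stage before it. A minimal sketch of that step-to-step calculation follows, with a guard for an empty preceding stage so the division cannot fail. The function name `stage_conversion_rates` and the example stage sizes are hypothetical.

```python
def stage_conversion_rates(values):
    """Return each stage's size as a percentage of the previous stage.

    The first stage is reported as 100%. A stage that follows an empty
    stage is reported as 0.0 instead of raising ZeroDivisionError.
    """
    if not values:
        return []
    rates = [100.0]
    for previous, current in zip(values, values[1:]):
        rates.append(current / previous * 100 if previous else 0.0)
    return rates


# Hypothetical stage sizes: MQLs, closed deals, revenue-generating accounts
example_values = [8000, 840, 800]
rates = stage_conversion_rates(example_values)
for stage_rate in rates:
    print(f"{stage_rate:.1f}%")
```

Note that each rate is relative to the previous stage, not to the top of the funnel, so a high final percentage can still correspond to a small share of the original leads.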

### 3.2 Scatter Plot - Revenue vs Catalog Size Analysis

In [None]:
# Analyze relationship between declared catalog size and revenue
revenue_data = df[df['declared_monthly_revenue'] > 0].copy()

# Create scatter plot
fig, ax = plt.subplots(figsize=(14, 10))

# Create scatter with different colors for business segments
segments = revenue_data['business_segment'].unique()
colors_map = plt.cm.Set3(np.linspace(0, 1, len(segments)))

for i, segment in enumerate(segments):
    segment_data = revenue_data[revenue_data['business_segment'] == segment]
    ax.scatter(segment_data['declared_product_catalog_size'], 
              segment_data['declared_monthly_revenue'],
              c=[colors_map[i]], label=segment, alpha=0.7, s=60)

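# Add trend line (rows with a missing catalog size are dropped first,
# because np.polyfit cannot fit data containing NaN)
trend_data = revenue_data.dropna(subset=['declared_product_catalog_size'])
z = np.polyfit(trend_data['declared_product_catalog_size'], 
              trend_data['declared_monthly_revenue'], 1)
p = np.poly1d(z)
x_line = np.sort(trend_data['declared_product_catalog_size'].to_numpy())
ax.plot(x_line, p(x_line),
        "r--", alpha=0.8, linewidth=2, label='Trend Line')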

# Customize the chart
ax.set_title('Product Catalog Size vs Monthly Revenue by Business Segment', 
             fontsize=16, fontweight='bold', pad=20)
ax.set_xlabel('Declared Product Catalog Size', fontsize=14, fontweight='bold')
ax.set_ylabel('Declared Monthly Revenue (₦)', fontsize=14, fontweight='bold')

# Format axes
ax.yaxis.set_major_formatter(FuncFormatter(lambda x, p: f'₦{int(x/1000)}K'))
ax.xaxis.set_major_formatter(FuncFormatter(lambda x, p: f'{int(x)}'))

# Add grid
ax.grid(True, alpha=0.3, linestyle='--')

# Add legend
ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left', title='Business Segments')

# Add correlation coefficient
correlation = revenue_data['declared_product_catalog_size'].corr(revenue_data['declared_monthly_revenue'])
ax.text(0.02, 0.98, f'Correlation: {correlation:.3f}',
        transform=ax.transAxes, fontsize=12,
        verticalalignment='top',
        bbox=dict(boxstyle='round,pad=0.5', fc='yellow', alpha=0.7))

plt.tight_layout()
plt.show()

# Print insights
print("\n=== REVENUE ANALYSIS INSIGHTS ===")
print(f"Correlation between catalog size and revenue: {correlation:.3f}")
print(f"Average catalog size: {revenue_data['declared_product_catalog_size'].mean():.0f} products")
print(f"Average monthly revenue: ₦{revenue_data['declared_monthly_revenue'].mean():,.0f}")
print(f"Highest revenue segment: {revenue_data.groupby('business_segment')['declared_monthly_revenue'].mean().idxmax()}")
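
The y-axis formatter in the cell above always divides by 1,000, so a value of 2,500,000 is shown in thousands rather than millions. As one possible alternative, the sketch below picks the unit from the size of the value. The function name `format_naira` and the thresholds are assumptions for illustration. The `(value, pos)` signature is the one `FuncFormatter` calls its function with.

```python
def format_naira(value, pos=None):
    """Format a number as naira, choosing M, K, or plain units by size.

    pos is the tick position passed by matplotlib's FuncFormatter;
    it is not used here.
    """
    abs_value = abs(value)
    if abs_value >= 1_000_000:
        return f"₦{value / 1_000_000:.1f}M"
    if abs_value >= 1_000:
        return f"₦{value / 1_000:.0f}K"
    return f"₦{value:.0f}"


# Illustrative values only
for amount in (2_500_000, 45_000, 750):
    print(format_naira(amount))
```

To use it on a chart, pass the function to the formatter already imported in the setup cell: `ax.yaxis.set_major_formatter(FuncFormatter(format_naira))`.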

## 4. Professional Presentation Techniques

Let's create presentation-ready charts with advanced formatting and annotations.

### 4.1 Executive Dashboard - Multi-Panel Visualization

In [None]:
# Create comprehensive dashboard
fig = plt.figure(figsize=(20, 16))
fig.suptitle('Marketing Performance Executive Dashboard - Nigerian E-commerce', 
             fontsize=20, fontweight='bold', y=0.95)

# Define grid layout
gs = fig.add_gridspec(3, 3, hspace=0.3, wspace=0.3)

# 1. Top-left: Channel Performance (Bar Chart)
ax1 = fig.add_subplot(gs[0, 0])
channel_performance = df.groupby('origin').agg({
    'mql_id': 'count',
    'declared_monthly_revenue': 'sum'
}).rename(columns={'mql_id': 'MQL Count'})

channel_performance.sort_values('MQL Count', ascending=True).plot(
    kind='barh', y='MQL Count', ax=ax1, color='#FF6B6B')
ax1.set_title('MQL Volume by Channel', fontweight='bold')
ax1.set_xlabel('Number of MQLs')

# 2. Top-middle: Conversion Rate (Pie Chart)
ax2 = fig.add_subplot(gs[0, 1])
conversion_data = [len(df) - len(df[df['won_date'].notna()]), len(df[df['won_date'].notna()])]
ax2.pie(conversion_data, labels=['Not Converted', 'Converted'], autopct='%1.1f%%',
        colors=['#FF9999', '#66BB6A'], startangle=90)
ax2.set_title('Overall Conversion Rate', fontweight='bold')

# 3. Top-right: Business Segment Distribution (Donut Chart)
ax3 = fig.add_subplot(gs[0, 2])
segment_counts = df['business_segment'].value_counts().head(6)
ax3.pie(segment_counts.values, labels=segment_counts.index, autopct='%1.1f%%',
        wedgeprops=dict(width=0.4))
ax3.set_title('Top Business Segments', fontweight='bold')

# 4. Middle: Monthly Trends (Line Chart)
ax4 = fig.add_subplot(gs[1, :])
monthly_trends = df.groupby(df['first_contact_date'].dt.to_period('M')).agg({
    'mql_id': 'count',
    'won_date': 'count'
}).rename(columns={'mql_id': 'MQLs', 'won_date': 'Conversions'})

monthly_trends.index = monthly_trends.index.to_timestamp()
ax4.plot(monthly_trends.index, monthly_trends['MQLs'], 
         marker='o', label='MQLs', linewidth=3)
ax4.plot(monthly_trends.index, monthly_trends['Conversions'], 
         marker='s', label='Conversions', linewidth=3)
ax4.set_title('Monthly Marketing Trends', fontweight='bold')
ax4.set_xlabel('Month')
ax4.set_ylabel('Count')
ax4.legend()
ax4.grid(True, alpha=0.3)

# 5. Bottom-left: Revenue by Segment (Bar Chart)
ax5 = fig.add_subplot(gs[2, 0])
revenue_by_segment = df[df['declared_monthly_revenue'] > 0].groupby(
    'business_segment')['declared_monthly_revenue'].sum().sort_values(ascending=True).tail(8)
revenue_by_segment.plot(kind='barh', ax=ax5, color='#4ECDC4')
ax5.set_title('Revenue by Business Segment', fontweight='bold')
ax5.set_xlabel('Monthly Revenue (₦)')

# 6. Bottom-middle: Lead Type Distribution (Bar Chart)
ax6 = fig.add_subplot(gs[2, 1])
lead_types = df['lead_type'].value_counts()
ax6.bar(lead_types.index, lead_types.values, color='#FFB347')
ax6.set_title('Lead Type Distribution', fontweight='bold')
ax6.set_ylabel('Count')
plt.setp(ax6.xaxis.get_majorticklabels(), rotation=45, ha='right')

# 7. Bottom-right: Key Metrics (Text Box)
ax7 = fig.add_subplot(gs[2, 2])
ax7.axis('off')  # Hide axes

# Calculate key metrics
total_mqls = len(df)
converted_deals = len(df[df['won_date'].notna()])
conversion_rate = converted_deals / total_mqls * 100
total_revenue = df[df['declared_monthly_revenue'] > 0]['declared_monthly_revenue'].sum()
avg_deal_size = total_revenue / converted_deals

metrics_text = f"""
KEY PERFORMANCE INDICATORS

📊 Lead Generation
   Total MQLs: {total_mqls:,}
   Conversion Rate: {conversion_rate:.1f}%

💰 Revenue Metrics
   Total Revenue: ₦{total_revenue:,.0f}
   Avg Deal Size: ₦{avg_deal_size:,.0f}

🎯 Top Channels
   1. {df['origin'].value_counts().index[0]}
   2. {df['origin'].value_counts().index[1]}

📈 Growth Trend
   Monthly Avg: {monthly_trends['MQLs'].mean():.0f} MQLs
"""

ax7.text(0.05, 0.95, metrics_text,
        transform=ax7.transAxes, fontsize=12,
        verticalalignment='top', fontfamily='monospace',
        bbox=dict(boxstyle='round,pad=1', fc='lightgray', alpha=0.8))

# Add watermark/date
fig.text(0.99, 0.01, f'Generated: {pd.Timestamp.now().strftime("%Y-%m-%d %H:%M")}', 
         ha='right', fontsize=8, alpha=0.7)

plt.show()

# Print executive summary
print("\n" + "="*60)
print("EXECUTIVE SUMMARY - MARKETING PERFORMANCE")
print("="*60)
print(f"• Total Marketing Qualified Leads Generated: {total_mqls:,}")
print(f"• Overall Conversion Rate: {conversion_rate:.2f}%")
print(f"• Total Revenue Generated: ₦{total_revenue:,.0f}")
print(f"• Average Deal Size: ₦{avg_deal_size:,.0f}")
print(f"• Best Performing Channel: {df['origin'].value_counts().index[0]}")
print(f"• Top Revenue Segment: {revenue_by_segment.index[-1]}")
print("="*60)

## Summary

In this session, we've covered:

1. **Basic Plot Types**:
   - Bar charts for categorical comparisons
   - Line charts for time series analysis
   - Pie/donut charts for composition analysis

2. **Advanced Customization**:
   - Funnel visualizations for conversion analysis
   - Scatter plots for relationship analysis
   - Color coding and annotations

3. **Professional Presentations**:
   - Multi-panel dashboards
   - Executive summary visualizations
   - Business-focused labeling and insights

### Key Takeaways for Marketing Analytics:
- **Choose the right chart type** for your data story
- **Focus on actionable insights** rather than just showing data
- **Use consistent color schemes** that align with brand guidelines
- **Include context and benchmarks** for meaningful comparisons
- **Label clearly** and remove unnecessary visual clutter

### Next Steps:
In Part 2, we'll explore more advanced visualization techniques including:
- Interactive plot elements
- Advanced storytelling techniques
- Custom color schemes for brand consistency
- Exporting charts for presentations and reports