# Data Analysis - Module 5
## Data Visualization with Matplotlib & Seaborn

**Your Role:** Data Analyst at a B2B SaaS Company

**Your Mission:** Tell compelling stories with data through visualizations.

**Why this matters:**
- A well-chosen chart lets readers spot trends and outliers far faster than scanning a table of numbers
- Stakeholders remember charts, not spreadsheets
- Good visualizations reveal patterns hidden in numbers
- The right chart can change a business decision

**This module covers:**
- Matplotlib fundamentals (the foundation)
- Quick plots with Pandas
- Beautiful statistical plots with Seaborn
- Choosing the right chart type
- Customizing colors, labels, and styles
- Multi-plot figures and subplots
- Advanced visualizations for business insights
- Saving publication-ready figures

**Libraries used:**
- `matplotlib` - The foundation of Python visualization
- `seaborn` - Statistical visualization made beautiful
- `pandas` - Quick plotting from DataFrames

**Time to complete:** ~90 minutes

---

# SETUP: Import Libraries and Load Data

In [None]:
# Core libraries
import pandas as pd
import numpy as np

# Visualization libraries
import matplotlib.pyplot as plt
import seaborn as sns

# Display settings
pd.set_option('display.max_columns', 15)

# Seaborn style (makes plots look better)
sns.set_style('whitegrid')
sns.set_palette('husl')

# For Jupyter: display plots inline
%matplotlib inline

# Load datasets
df = pd.read_csv('../dataset/TechFlow.csv')
daily = pd.read_csv('../dataset/daily_metrics.csv')
daily['Date'] = pd.to_datetime(daily['Date'])

print(f"Main data: {df.shape}")
print(f"Daily metrics: {daily.shape}")
print(f"\nMatplotlib version: {plt.matplotlib.__version__}")
print(f"Seaborn version: {sns.__version__}")

---
# PART 1: Matplotlib Fundamentals

Matplotlib is the foundation that both Pandas plotting and Seaborn build on, so understand it first.

## 1.1 Basic Line Plot

**Create a simple line plot**

```python
# Simple data
x = [1, 2, 3, 4, 5]
y = [10, 20, 15, 25, 30]

# Create figure and plot
plt.figure(figsize=(10, 6))
plt.plot(x, y)
plt.title('My First Plot')
plt.xlabel('X Axis')
plt.ylabel('Y Axis')
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


**Anatomy of a matplotlib figure:**
- `plt.figure()` - Creates the canvas
- `figsize=(width, height)` - Size in inches
- `plt.plot()` - Draws the data
- `plt.show()` - Displays the figure
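
The same figure can also be built with matplotlib's object-oriented interface, which the subplot and dashboard sections later in this module rely on. A minimal sketch, reusing the sample values from 1.1; the variable names `fig`, `ax`, and `line` are illustrative choices:

```python
import matplotlib.pyplot as plt

# Same sample data as the first plot
x = [1, 2, 3, 4, 5]
y = [10, 20, 15, 25, 30]

# plt.subplots() returns the Figure (the canvas) and one Axes (the plot area)
fig, ax = plt.subplots(figsize=(10, 6))

# Methods on ax mirror the plt.* calls: plot, set_title, set_xlabel, set_ylabel
line, = ax.plot(x, y)
ax.set_title('My First Plot')
ax.set_xlabel('X Axis')
ax.set_ylabel('Y Axis')

plt.show()
```

With a single plot the two styles produce the same result. Holding on to `ax` pays off once a figure has several plot areas, because each call then says exactly which one it changes.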

## 1.2 Customizing Line Plots

**Add multiple lines with style**

```python
# Multiple datasets
x = [1, 2, 3, 4, 5]
y1 = [10, 20, 15, 25, 30]
y2 = [5, 15, 20, 18, 25]

plt.figure(figsize=(10, 6))

# Customize each line
plt.plot(x, y1, color='#2ecc71', linewidth=2, linestyle='-', marker='o', label='Product A')
plt.plot(x, y2, color='#e74c3c', linewidth=2, linestyle='--', marker='s', label='Product B')

plt.title('Sales Comparison', fontsize=14, fontweight='bold')
plt.xlabel('Month')
plt.ylabel('Revenue ($K)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 1.3 Bar Charts

**Create a bar chart**

```python
# Data
categories = ['Basic', 'Standard', 'Enterprise']
values = [15, 22, 13]
colors = ['#3498db', '#2ecc71', '#9b59b6']

plt.figure(figsize=(10, 6))
plt.bar(categories, values, color=colors, edgecolor='black', linewidth=1.2)

plt.title('Customers by Subscription Plan', fontsize=14, fontweight='bold')
plt.xlabel('Plan')
plt.ylabel('Number of Customers')

# Add value labels on bars
for i, v in enumerate(values):
    plt.text(i, v + 0.5, str(v), ha='center', fontweight='bold')

plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


**Horizontal bar chart**

```python
# Better for long category names
industries = df['Industry'].value_counts().head(8)

plt.figure(figsize=(10, 6))
plt.barh(industries.index, industries.values, color='#3498db')
plt.xlabel('Number of Customers')
plt.title('Customers by Industry', fontsize=14, fontweight='bold')
plt.gca().invert_yaxis()  # Highest at top
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 1.4 Pie Charts (Use Sparingly!)

**Create a pie chart**

```python
# Data
plan_counts = df['SubscriptionPlan'].value_counts()

plt.figure(figsize=(8, 8))
plt.pie(
    plan_counts.values, 
    labels=plan_counts.index,
    autopct='%1.1f%%',
    colors=['#3498db', '#2ecc71', '#9b59b6'],
    explode=(0.05, 0, 0),  # Explode first slice
    shadow=True,
    startangle=90
)
plt.title('Subscription Plan Distribution', fontsize=14, fontweight='bold')
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


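**Note:** Readers judge bar lengths more accurately than slice angles, so a bar chart is usually clearer for comparing categories. Reserve pie charts for a handful of slices that add up to a whole.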
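
To compare the two chart types directly, the sketch below draws the same counts as a pie and as a bar chart side by side. The counts are the sample values from section 1.3, not figures read from the dataset, and the variable names are illustrative:

```python
import matplotlib.pyplot as plt

# Sample values from section 1.3 (illustration only)
plans = ['Basic', 'Standard', 'Enterprise']
counts = [15, 22, 13]

# One row, two columns: pie on the left, bar chart on the right
fig, (ax_pie, ax_bar) = plt.subplots(1, 2, figsize=(12, 5))

ax_pie.pie(counts, labels=plans, autopct='%1.1f%%', startangle=90)
ax_pie.set_title('Pie: compare slice angles')

bars = ax_bar.bar(plans, counts, color='#3498db')
ax_bar.set_title('Bar: compare bar heights')
ax_bar.set_ylabel('Number of Customers')

plt.tight_layout()
plt.show()
```

Basic (15) and Enterprise (13) are close in size. The gap is usually easier to see in the bar heights than in the slice angles.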

---
# PART 2: Quick Plotting with Pandas

Pandas has built-in plotting through `.plot()`, which draws with matplotlib under the hood.
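
Because matplotlib does the drawing, `.plot()` hands back a matplotlib `Axes`, so the customization calls from Part 1 still apply. A minimal sketch with made-up revenue values (the `sample` Series is an assumption for illustration, not part of the dataset):

```python
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.axes import Axes

# Made-up values standing in for a revenue column
sample = pd.Series([120, 340, 560, 410, 290, 380, 450], name='MonthlyRevenue')

# Series.plot() returns the matplotlib Axes it drew on
ax = sample.plot(kind='hist', bins=5, figsize=(8, 4), edgecolor='black')
print(isinstance(ax, Axes))

# From here on it is plain matplotlib
ax.set_title('Histogram from a Pandas Series')
ax.set_xlabel('Monthly Revenue ($)')
plt.show()
```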

## 2.1 Histogram - Distribution of Values

**Revenue distribution histogram**

```python
# Plot histogram directly from DataFrame
df['MonthlyRevenue'].plot(
    kind='hist', 
    bins=15, 
    figsize=(10, 6),
    color='#3498db',
    edgecolor='black',
    alpha=0.7
)

plt.title('Distribution of Monthly Revenue', fontsize=14, fontweight='bold')
plt.xlabel('Monthly Revenue ($)')
plt.ylabel('Number of Customers')
plt.axvline(df['MonthlyRevenue'].mean(), color='red', linestyle='--', label=f"Mean: ${df['MonthlyRevenue'].mean():.0f}")
plt.legend()
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 2.2 Box Plot - Distribution Summary

**Box plot of NPS scores**

```python
df['NPS_Score'].plot(
    kind='box',
    figsize=(8, 6),
    vert=True
)

plt.title('NPS Score Distribution', fontsize=14, fontweight='bold')
plt.ylabel('NPS Score')
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 2.3 Scatter Plot - Relationship Between Variables

**Revenue vs Tenure scatter**

```python
df.plot(
    kind='scatter',
    x='TenureMonths',
    y='MonthlyRevenue',
    figsize=(10, 6),
    alpha=0.6,
    s=100,  # marker size
    c='NPS_Score',  # color by NPS
    colormap='RdYlGn',
    colorbar=True
)

plt.title('Revenue vs Tenure (colored by NPS)', fontsize=14, fontweight='bold')
plt.xlabel('Tenure (Months)')
plt.ylabel('Monthly Revenue ($)')
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


---
# PART 3: Beautiful Plots with Seaborn

Seaborn builds on matplotlib and works directly with DataFrames: you pass `data=df` plus column names, and it handles the grouping, aggregation, and styling.

## 3.1 Count Plot - Categorical Counts

**Count customers by subscription plan**

```python
plt.figure(figsize=(10, 6))
sns.countplot(
    data=df, 
    x='SubscriptionPlan',
    palette='viridis',
    order=['Basic', 'Standard', 'Enterprise']
)

plt.title('Customer Count by Subscription Plan', fontsize=14, fontweight='bold')
plt.xlabel('Plan')
plt.ylabel('Count')
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


**Count with hue (split by another variable)**

```python
plt.figure(figsize=(10, 6))
sns.countplot(
    data=df, 
    x='SubscriptionPlan',
    hue='Cancelled',
    palette=['#2ecc71', '#e74c3c'],
    order=['Basic', 'Standard', 'Enterprise']
)

plt.title('Cancellation by Subscription Plan', fontsize=14, fontweight='bold')
plt.xlabel('Plan')
plt.ylabel('Count')
plt.legend(title='Cancelled', labels=['Active', 'Churned'])
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 3.2 Box Plot - Compare Distributions

**Revenue by subscription plan**

```python
plt.figure(figsize=(10, 6))
sns.boxplot(
    data=df,
    x='SubscriptionPlan',
    y='MonthlyRevenue',
    palette='Set2',
    order=['Basic', 'Standard', 'Enterprise']
)

plt.title('Revenue Distribution by Plan', fontsize=14, fontweight='bold')
plt.xlabel('Subscription Plan')
plt.ylabel('Monthly Revenue ($)')
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 3.3 Violin Plot - Distribution Shape

**NPS distribution by plan**

```python
plt.figure(figsize=(10, 6))
sns.violinplot(
    data=df,
    x='SubscriptionPlan',
    y='NPS_Score',
    palette='muted',
    order=['Basic', 'Standard', 'Enterprise']
)

plt.title('NPS Score Distribution by Plan', fontsize=14, fontweight='bold')
plt.xlabel('Subscription Plan')
plt.ylabel('NPS Score')
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 3.4 Scatter Plot with Regression Line

**Revenue vs Seats with trend line**

```python
plt.figure(figsize=(10, 6))
sns.regplot(
    data=df,
    x='SeatCount',
    y='MonthlyRevenue',
    scatter_kws={'alpha': 0.6, 's': 80},
    line_kws={'color': 'red', 'linewidth': 2}
)

plt.title('Revenue vs Seat Count (with trend)', fontsize=14, fontweight='bold')
plt.xlabel('Number of Seats')
plt.ylabel('Monthly Revenue ($)')
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 3.5 Heatmap - Correlation Matrix

**Correlation heatmap**

```python
# Select numeric columns
numeric_cols = ['MonthlyRevenue', 'SeatCount', 'TenureMonths', 'AvgWeeklyLogins', 
                'NPS_Score', 'SupportTicketsRaised', 'Cancelled']

# Calculate correlation
corr = df[numeric_cols].corr()

# Create heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(
    corr,
    annot=True,
    fmt='.2f',
    cmap='RdYlGn',
    center=0,
    square=True,
    linewidths=0.5
)

plt.title('Correlation Matrix', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


---
# PART 4: Advanced Seaborn Visualizations

## 4.1 Pair Plot - All Relationships at Once

**Pairwise relationships**

```python
# Select a few columns (pairplot can be slow with many)
cols = ['MonthlyRevenue', 'SeatCount', 'TenureMonths', 'NPS_Score']

sns.pairplot(
    df[cols + ['SubscriptionPlan']],
    hue='SubscriptionPlan',
    palette='Set1',
    diag_kind='kde',
    height=2.5
)

plt.suptitle('Pairwise Relationships by Plan', y=1.02, fontsize=14, fontweight='bold')
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 4.2 Joint Plot - Two Variables with Distributions

**Revenue vs Tenure with marginal distributions**

```python
sns.jointplot(
    data=df,
    x='TenureMonths',
    y='MonthlyRevenue',
    kind='hex',  # hexbin for density
    height=8,
    cmap='Blues'
)

plt.suptitle('Revenue vs Tenure Distribution', y=1.02)
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 4.3 Facet Grid - Multiple Small Plots

**Revenue distribution by plan (faceted)**

```python
g = sns.FacetGrid(
    df, 
    col='SubscriptionPlan', 
    col_order=['Basic', 'Standard', 'Enterprise'],
    height=4,
    aspect=1.2
)
g.map(sns.histplot, 'MonthlyRevenue', kde=True, color='steelblue')
g.set_titles('{col_name} Plan')
g.figure.suptitle('Revenue Distribution by Plan', y=1.05, fontsize=14, fontweight='bold')
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 4.4 Bar Plot with Error Bars

**Average revenue by plan with confidence interval**

```python
plt.figure(figsize=(10, 6))
sns.barplot(
    data=df,
    x='SubscriptionPlan',
    y='MonthlyRevenue',
    order=['Basic', 'Standard', 'Enterprise'],
    palette='Blues_d',
    errorbar='ci'  # confidence interval
)

plt.title('Average Revenue by Plan (with 95% CI)', fontsize=14, fontweight='bold')
plt.xlabel('Subscription Plan')
plt.ylabel('Average Monthly Revenue ($)')
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run
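
The error bars drawn by `errorbar='ci'` are bootstrap intervals: Seaborn resamples each group's values with replacement and takes percentiles of the resampled means. A rough sketch of that idea in NumPy, using made-up revenue values for a single plan and 1,000 resamples. This illustrates the technique and is not Seaborn's exact internals:

```python
import numpy as np

# Made-up monthly revenue values for one plan (illustration only)
revenue = np.array([120, 150, 135, 160, 145, 170, 155, 140, 165, 130], dtype=float)

rng = np.random.default_rng(42)  # fixed seed so the run is repeatable
n_boot = 1000

# Resample with replacement and record the mean of each resample
boot_means = np.array([
    rng.choice(revenue, size=revenue.size, replace=True).mean()
    for _ in range(n_boot)
])

# The middle 95% of the resampled means forms the interval
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])

print(f"Mean: {revenue.mean():.1f}")
print(f"95% CI: ({ci_low:.1f}, {ci_high:.1f})")
```

A narrow interval means the group average is pinned down well. A wide one usually signals few customers in the group or very uneven revenue.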


---
# PART 5: Multi-Plot Figures (Subplots)

## 5.1 Creating Subplots

**2x2 grid of plots**

```python
# Create 2x2 subplot grid
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Plot 1: Revenue by Plan (bar)
sns.barplot(data=df, x='SubscriptionPlan', y='MonthlyRevenue', 
            order=['Basic', 'Standard', 'Enterprise'], ax=axes[0, 0], palette='Blues')
axes[0, 0].set_title('Avg Revenue by Plan')

# Plot 2: NPS Distribution (histogram)
sns.histplot(data=df, x='NPS_Score', bins=10, ax=axes[0, 1], color='green')
axes[0, 1].set_title('NPS Score Distribution')

# Plot 3: Tenure vs Revenue (scatter)
sns.scatterplot(data=df, x='TenureMonths', y='MonthlyRevenue', 
                hue='Cancelled', ax=axes[1, 0], palette=['blue', 'red'])
axes[1, 0].set_title('Tenure vs Revenue')

# Plot 4: Customers by Industry (bar)
industry_counts = df['Industry'].value_counts().head(6)
axes[1, 1].barh(industry_counts.index, industry_counts.values, color='purple')
axes[1, 1].set_title('Top Industries')

plt.suptitle('Customer Dashboard', fontsize=16, fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run
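
How you index `axes` depends on the grid shape, which is a common source of errors when adapting the 2x2 example. A small sketch of the three cases (the variable names are illustrative):

```python
import matplotlib.pyplot as plt

fig_grid, axes_grid = plt.subplots(2, 2)  # 2-D array: index with [row, col]
fig_row, axes_row = plt.subplots(1, 3)    # 1-D array: index with [col]
fig_one, ax_one = plt.subplots()          # a single Axes object, not an array

print(axes_grid.shape)  # (2, 2)
print(axes_row.shape)   # (3,)

# flatten() turns the grid into a flat sequence, handy for looping over panels
for i, ax in enumerate(axes_grid.flatten()):
    ax.set_title(f'Panel {i + 1}')

# Keep only the grid figure open for display
plt.close(fig_row)
plt.close(fig_one)
plt.tight_layout()
plt.show()
```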


---
# PART 6: Time Series Visualization

## 6.1 Line Plot Over Time

**Daily revenue trend**

```python
# Filter for one customer
customer_1001 = daily[daily['CustomerID'] == 1001]

plt.figure(figsize=(12, 6))
plt.plot(customer_1001['Date'], customer_1001['Revenue'], 
         marker='o', linewidth=2, color='#3498db')

plt.title('Daily Revenue - Customer 1001', fontsize=14, fontweight='bold')
plt.xlabel('Date')
plt.ylabel('Revenue ($)')
plt.xticks(rotation=45)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 6.2 Multiple Time Series

**Compare two customers**

```python
plt.figure(figsize=(12, 6))

for cust_id in [1001, 1004]:
    cust_data = daily[daily['CustomerID'] == cust_id]
    plt.plot(cust_data['Date'], cust_data['Revenue'], 
             marker='o', linewidth=2, label=f'Customer {cust_id}')

plt.title('Daily Revenue Comparison', fontsize=14, fontweight='bold')
plt.xlabel('Date')
plt.ylabel('Revenue ($)')
plt.legend()
plt.xticks(rotation=45)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 6.3 Area Chart

**Stacked area chart**

```python
# Pivot data for stacked plot
pivot_data = daily.pivot(index='Date', columns='CustomerID', values='Revenue')

plt.figure(figsize=(12, 6))
plt.stackplot(pivot_data.index, pivot_data[1001], pivot_data[1004],
              labels=['Customer 1001', 'Customer 1004'],
              colors=['#3498db', '#e74c3c'], alpha=0.7)

plt.title('Stacked Daily Revenue', fontsize=14, fontweight='bold')
plt.xlabel('Date')
plt.ylabel('Revenue ($)')
plt.legend(loc='upper left')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run
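
`pivot` raises an error when a date and customer pair appears more than once, and it leaves `NaN` wherever a customer has no row for a date. If your data has either problem, one possible workaround is `pivot_table` with a sum plus `fillna(0)`. The sketch below uses a small made-up frame; `daily_sample` and its values are assumptions for illustration:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up data: customer 1001 has two rows on Jan 2, customer 1004 has none on Jan 3
daily_sample = pd.DataFrame({
    'Date': pd.to_datetime(['2024-01-01', '2024-01-01', '2024-01-02',
                            '2024-01-02', '2024-01-02', '2024-01-03']),
    'CustomerID': [1001, 1004, 1001, 1001, 1004, 1001],
    'Revenue': [100.0, 80.0, 60.0, 40.0, 90.0, 110.0],
})

# Sum duplicate rows, then treat missing days as zero revenue
pivot_sample = daily_sample.pivot_table(
    index='Date', columns='CustomerID', values='Revenue', aggfunc='sum'
).fillna(0)
print(pivot_sample)

fig, ax = plt.subplots(figsize=(8, 4))
ax.stackplot(pivot_sample.index, pivot_sample[1001], pivot_sample[1004],
             labels=['Customer 1001', 'Customer 1004'], alpha=0.7)
ax.set_ylabel('Revenue ($)')
ax.legend(loc='upper left')
plt.show()
```

Whether summing duplicates and filling gaps with zero is right depends on what the rows mean in your data, so check that before applying it.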


---
# PART 7: Customization and Styling

## 7.1 Color Palettes

**Seaborn color palettes**

```python
# Show available palettes
fig, axes = plt.subplots(3, 2, figsize=(12, 8))

palettes = ['deep', 'muted', 'pastel', 'dark', 'colorblind', 'Set2']

for ax, palette in zip(axes.flatten(), palettes):
    sns.barplot(x=['A', 'B', 'C', 'D', 'E'], y=[5, 4, 3, 4, 5], 
                palette=palette, ax=ax)
    ax.set_title(f"Palette: {palette}")
    ax.set_ylabel('')

plt.suptitle('Seaborn Color Palettes', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 7.2 Custom Colors

**Using custom colors**

```python
# Define company colors
company_colors = {
    'primary': '#2C3E50',
    'secondary': '#3498DB',
    'success': '#27AE60',
    'warning': '#F39C12',
    'danger': '#E74C3C'
}

fig, ax = plt.subplots(figsize=(10, 6))

plans = ['Basic', 'Standard', 'Enterprise']
values = [15, 22, 13]
colors = [company_colors['danger'], company_colors['warning'], company_colors['success']]

bars = ax.bar(plans, values, color=colors, edgecolor='black')

# Add value labels
for bar, val in zip(bars, values):
    ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.5, 
            str(val), ha='center', fontweight='bold', fontsize=12)

ax.set_title('Customers by Plan (Custom Colors)', fontsize=14, fontweight='bold')
ax.set_ylabel('Count')
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


## 7.3 Saving Figures

**Save to file**

```python
# Create a plot
fig, ax = plt.subplots(figsize=(10, 6))
sns.barplot(data=df, x='SubscriptionPlan', y='MonthlyRevenue', 
            order=['Basic', 'Standard', 'Enterprise'], palette='Blues', ax=ax)
ax.set_title('Revenue by Plan', fontsize=14, fontweight='bold')

# Save as PNG (high resolution)
fig.savefig('revenue_by_plan.png', dpi=300, bbox_inches='tight')

# Save as PDF (vector - perfect for printing)
fig.savefig('revenue_by_plan.pdf', bbox_inches='tight')

print("Figures saved!")
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run
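
When figures are saved in a loop or a script rather than viewed inline, it helps to confirm the file was written and to close the figure so it stops using memory. A minimal sketch, assuming a temporary output folder and the sample counts from section 1.3 (the file name and folder are illustrative):

```python
import os
import tempfile
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(['Basic', 'Standard', 'Enterprise'], [15, 22, 13], color='#3498db')
ax.set_title('Customers by Plan')

# Write into a temporary folder so the working directory stays clean
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, 'customers_by_plan.png')
fig.savefig(png_path, dpi=300, bbox_inches='tight')

# Release the figure once it is on disk
plt.close(fig)

print(f"Saved: {os.path.exists(png_path)}, size: {os.path.getsize(png_path)} bytes")
```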


---
# PART 8: Business Dashboard Example

**Complete business dashboard**

```python
# Create comprehensive dashboard
fig = plt.figure(figsize=(16, 12))

# Define grid
gs = fig.add_gridspec(3, 3, hspace=0.3, wspace=0.3)

# KPI Cards (text in top row)
ax_kpi1 = fig.add_subplot(gs[0, 0])
ax_kpi1.text(0.5, 0.6, f"${df['MonthlyRevenue'].sum():,.0f}", fontsize=24, fontweight='bold', 
             ha='center', va='center', color='#2ecc71')
ax_kpi1.text(0.5, 0.3, 'Total Monthly Revenue', fontsize=12, ha='center', va='center')
ax_kpi1.axis('off')
ax_kpi1.set_title('Revenue', fontsize=14, fontweight='bold')

ax_kpi2 = fig.add_subplot(gs[0, 1])
ax_kpi2.text(0.5, 0.6, f"{len(df)}", fontsize=24, fontweight='bold', 
             ha='center', va='center', color='#3498db')
ax_kpi2.text(0.5, 0.3, 'Total Customers', fontsize=12, ha='center', va='center')
ax_kpi2.axis('off')
ax_kpi2.set_title('Customers', fontsize=14, fontweight='bold')

ax_kpi3 = fig.add_subplot(gs[0, 2])
churn_rate = df['Cancelled'].mean() * 100
ax_kpi3.text(0.5, 0.6, f"{churn_rate:.1f}%", fontsize=24, fontweight='bold', 
             ha='center', va='center', color='#e74c3c')
ax_kpi3.text(0.5, 0.3, 'Churn Rate', fontsize=12, ha='center', va='center')
ax_kpi3.axis('off')
ax_kpi3.set_title('Churn', fontsize=14, fontweight='bold')

# Revenue by Plan
ax1 = fig.add_subplot(gs[1, 0])
sns.barplot(data=df, x='SubscriptionPlan', y='MonthlyRevenue', 
            order=['Basic', 'Standard', 'Enterprise'], palette='Blues', ax=ax1)
ax1.set_title('Avg Revenue by Plan')

# NPS Distribution
ax2 = fig.add_subplot(gs[1, 1])
sns.histplot(data=df, x='NPS_Score', bins=10, color='green', ax=ax2)
ax2.axvline(df['NPS_Score'].mean(), color='red', linestyle='--', label='Mean')
ax2.set_title('NPS Score Distribution')
ax2.legend()

# Churn by Plan
ax3 = fig.add_subplot(gs[1, 2])
churn_by_plan = df.groupby('SubscriptionPlan')['Cancelled'].mean() * 100
churn_by_plan = churn_by_plan.reindex(['Basic', 'Standard', 'Enterprise'])
ax3.bar(churn_by_plan.index, churn_by_plan.values, color=['#e74c3c', '#f39c12', '#27ae60'])
ax3.set_title('Churn Rate by Plan (%)')
ax3.set_ylabel('Churn Rate (%)')

# Top Industries
ax4 = fig.add_subplot(gs[2, 0])
industry_counts = df['Industry'].value_counts().head(5)
ax4.barh(industry_counts.index, industry_counts.values, color='#9b59b6')
ax4.set_title('Top 5 Industries')
ax4.invert_yaxis()

# Tenure vs Revenue
ax5 = fig.add_subplot(gs[2, 1])
colors = df['Cancelled'].map({0: '#2ecc71', 1: '#e74c3c'})
ax5.scatter(df['TenureMonths'], df['MonthlyRevenue'], c=colors, alpha=0.6)
ax5.set_xlabel('Tenure (Months)')
ax5.set_ylabel('Revenue ($)')
ax5.set_title('Tenure vs Revenue')

# Correlation Heatmap
ax6 = fig.add_subplot(gs[2, 2])
cols = ['MonthlyRevenue', 'SeatCount', 'NPS_Score', 'Cancelled']
corr = df[cols].corr()
sns.heatmap(corr, annot=True, fmt='.2f', cmap='RdYlGn', center=0, ax=ax6, square=True)
ax6.set_title('Correlation Matrix')

plt.suptitle('Customer Analytics Dashboard', fontsize=18, fontweight='bold', y=0.98)
plt.show()
```

In [None]:
# ↓ Type the code below, then press Shift+Enter to run


---
# PRACTICE: Business Scenarios

### Q1: Create a bar chart of customer count by Industry

In [None]:
# Your answer:


### Q2: Create a histogram of TenureMonths

In [None]:
# Your answer:


### Q3: Create a box plot of NPS_Score by SubscriptionPlan

In [None]:
# Your answer:


### Q4: Create a scatter plot of SeatCount vs MonthlyRevenue

In [None]:
# Your answer:


### Q5: Create a correlation heatmap for numeric columns

In [None]:
# Your answer:


### Q6: Create a count plot showing Cancelled by Industry

In [None]:
# Your answer:


### Q7: Create a 2x2 subplot dashboard with 4 different charts

In [None]:
# Your answer:


---
# CHEAT SHEET

## Matplotlib Basics
```python
plt.figure(figsize=(10, 6))      # Create figure
plt.plot(x, y)                    # Line plot
plt.bar(x, y)                     # Bar chart
plt.barh(x, y)                    # Horizontal bar
plt.scatter(x, y)                 # Scatter plot
plt.hist(data, bins=20)           # Histogram
plt.pie(values, labels=labels)    # Pie chart
plt.title('Title')                # Add title
plt.xlabel('X Label')             # X axis label
plt.ylabel('Y Label')             # Y axis label
plt.legend()                      # Show legend
plt.show()                        # Display
```

## Subplots
```python
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
axes[0, 0].plot(x, y)             # Top-left
axes[0, 1].bar(x, y)              # Top-right
axes[1, 0].scatter(x, y)          # Bottom-left
axes[1, 1].hist(data)             # Bottom-right
plt.tight_layout()
```

## Seaborn Plots
```python
sns.countplot(data=df, x='col')           # Count categories
sns.barplot(data=df, x='cat', y='num')    # Mean with CI
sns.boxplot(data=df, x='cat', y='num')    # Box plot
sns.violinplot(data=df, x='cat', y='num') # Violin plot
sns.histplot(data=df, x='col')            # Histogram
sns.scatterplot(data=df, x='x', y='y')    # Scatter
sns.regplot(data=df, x='x', y='y')        # Scatter + trend
sns.heatmap(corr, annot=True)             # Heatmap
sns.pairplot(df)                          # All pairs
```

## Common Parameters
```python
hue='column'              # Color by category
palette='Blues'           # Color scheme
order=['A', 'B', 'C']     # Category order
alpha=0.7                 # Transparency
edgecolor='black'         # Border color
linewidth=2               # Line thickness
```

## Saving
```python
plt.savefig('plot.png', dpi=300, bbox_inches='tight')
plt.savefig('plot.pdf', bbox_inches='tight')
```

## Chart Selection Guide
| Question | Chart Type |
|----------|------------|
| How is data distributed? | Histogram, Box plot |
| Compare categories? | Bar chart |
| Show relationship? | Scatter plot |
| Track over time? | Line chart |
| Show proportions? | Pie, Stacked bar |
| Find correlations? | Heatmap, Pair plot |

---
## Module 5 Complete!

**You now know how to:**
- Create plots with matplotlib (line, bar, scatter, pie)
- Quick plot from pandas DataFrames
- Use seaborn for statistical visualizations
- Create heatmaps and correlation matrices
- Build multi-plot dashboards with subplots
- Visualize time series data
- Customize colors, labels, and styles
- Save publication-ready figures

**Key Takeaways:**
1. Choose the right chart for your question
2. Less is more - don't overcomplicate
3. Labels and titles are essential
4. Seaborn makes statistical plots easy
5. Always consider your audience

**The complete Pandas Training Series is finished!**