# Module 06: Scientific Best Practices for Data Visualization

**Estimated Time**: 60 minutes  
**Difficulty**: Intermediate

## Learning Objectives

By the end of this module, you will be able to:
- Create publication-quality figures for research papers
- Understand DPI, sizing, and aspect ratios
- Apply color theory and accessibility principles
- Recognize and avoid common visualization mistakes
- Choose the right chart type for your data
- Apply the data-ink ratio principle
- Follow journal-specific formatting guidelines

---

In [None]:
# Import required libraries
%matplotlib inline

import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
from matplotlib import rcParams
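import warnings
from pathlib import Path

warnings.filterwarnings("ignore")
np.random.seed(42)

# Create the output directory used by the savefig calls in this module
Path("../notebooks/outputs").mkdir(parents=True, exist_ok=True)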

print("Libraries imported successfully!")

## Part 1: Publication-Quality Figure Settings

### Understanding DPI (Dots Per Inch)

- **Screen viewing**: 72-96 DPI (typical displays; Matplotlib's default `figure.dpi` is 100)
- **Presentations**: 150 DPI (PowerPoint, Google Slides)
- **Print/Publications**: 300-600 DPI (journals, posters)
- **High-quality prints**: 600+ DPI

### Figure Sizes for Different Media

- **Single column (journals)**: 3.5 inches wide
- **Double column**: 7 inches wide
- **Full page**: 8.5 × 11 inches
- **Poster**: 36 × 48 inches (or larger)
- **Slide**: 10 × 7.5 inches (4:3) or 13.33 × 7.5 inches (16:9)
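
Physical size and resolution together determine the pixel dimensions of an exported image: pixels = inches × DPI. The helper below is a small illustration of that arithmetic; the name `pixel_dimensions` is hypothetical and not part of Matplotlib.

```python
def pixel_dimensions(width_in, height_in, dpi):
    """Return the (width, height) in pixels of a figure rendered at `dpi`."""
    return round(width_in * dpi), round(height_in * dpi)


# Single-column journal figure at print resolution
print(pixel_dimensions(3.5, 2.5, 300))  # (1050, 750)

# The same figure at a typical screen resolution
print(pixel_dimensions(3.5, 2.5, 96))   # (336, 240)
```

The same 3.5-inch figure therefore carries roughly ten times as many pixels at 300 DPI as at 96 DPI, which is why print figures tolerate smaller fonts and thinner lines.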

### Font Sizes

- **Axis labels**: 10-12 pt
- **Tick labels**: 8-10 pt
- **Title**: 12-14 pt
- **Legend**: 8-10 pt
- **Annotations**: 8-10 pt

In [None]:
# Example 1: Publication-ready figure settings

# Set publication parameters
pub_params = {
    "figure.figsize": (3.5, 2.5),  # Single column width
    "figure.dpi": 300,  # High resolution
    "font.size": 8,  # Base font size
    "axes.labelsize": 9,  # Axis label size
    "axes.titlesize": 10,  # Title size
    "xtick.labelsize": 7,  # X-tick labels
    "ytick.labelsize": 7,  # Y-tick labels
    "legend.fontsize": 7,  # Legend font size
    "font.family": "sans-serif",  # Font family
    "font.sans-serif": ["Arial"],  # Specific font
    "axes.linewidth": 0.5,  # Axis line width
    "xtick.major.width": 0.5,  # Tick width
    "ytick.major.width": 0.5,
    "lines.linewidth": 1.0,  # Line width
    "savefig.dpi": 300,  # Save DPI
    "savefig.bbox": "tight",  # Tight bounding box
    "savefig.pad_inches": 0.05,  # Minimal padding
}

# Apply settings
plt.rcParams.update(pub_params)

# Create sample data
x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = np.cos(x)

# Create publication-quality figure
fig, ax = plt.subplots()

ax.plot(x, y1, "b-", label="sin(x)", linewidth=1.2)
ax.plot(x, y2, "r--", label="cos(x)", linewidth=1.2)

ax.set_xlabel("X values (radians)", fontsize=9)
ax.set_ylabel("Y values", fontsize=9)
ax.set_title("Trigonometric Functions", fontsize=10, fontweight="bold")
ax.legend(frameon=True, loc="upper right")
ax.grid(True, alpha=0.3, linewidth=0.5)

plt.tight_layout()
plt.savefig("../notebooks/outputs/publication_figure.png", dpi=300)
plt.show()

print("Publication-quality figure created!")
print("Figure size: 3.5 × 2.5 inches (single column)")
print("Resolution: 300 DPI")
print("Font: Arial, 7-10 pt")

# Reset to default settings
plt.rcParams.update(plt.rcParamsDefault)
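
An alternative to updating `rcParams` globally and resetting them afterwards is `plt.rc_context`, which scopes settings to a `with` block. The sketch below uses a shortened, illustrative `pub_params` dictionary; the `matplotlib.use("Agg")` line is only there so the snippet runs without a display and can be dropped inside a notebook.

```python
import matplotlib

matplotlib.use("Agg")  # headless-safe; omit inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# A subset of the publication settings, for illustration
pub_params = {
    "figure.figsize": (3.5, 2.5),
    "font.size": 8,
    "axes.labelsize": 9,
    "lines.linewidth": 1.0,
}

font_size_before = plt.rcParams["font.size"]

# The settings apply only inside the `with` block
with plt.rc_context(pub_params):
    fig, ax = plt.subplots()
    x = np.linspace(0, 10, 100)
    ax.plot(x, np.sin(x))
    ax.set_xlabel("X values (radians)")
    font_size_inside = plt.rcParams["font.size"]

# On exit, the previous settings are restored without an explicit reset
font_size_after = plt.rcParams["font.size"]
plt.close(fig)
```

Because the previous values are restored even if the block raises an exception, this pattern avoids leaking publication settings into later cells.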

In [None]:
# Example 2: Comparing screen vs print settings

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

# Screen optimized (left)
axes[0].plot(x, y, linewidth=2)
axes[0].set_title("Screen Optimized (96 DPI, 12pt)", fontsize=12)
axes[0].set_xlabel("X", fontsize=11)
axes[0].set_ylabel("sin(x)", fontsize=11)
axes[0].grid(True, alpha=0.3)
axes[0].text(
    0.5,
    0.5,
    "Large fonts\nThick lines\nGood for screens",
    transform=axes[0].transAxes,
    fontsize=10,
    ha="center",
    bbox=dict(boxstyle="round", facecolor="wheat", alpha=0.5),
)

# Print optimized (right)
axes[1].plot(x, y, linewidth=0.8)
axes[1].set_title("Print Optimized (300 DPI, 8pt)", fontsize=9)
axes[1].set_xlabel("X", fontsize=8)
axes[1].set_ylabel("sin(x)", fontsize=8)
axes[1].tick_params(labelsize=7)
axes[1].grid(True, alpha=0.3, linewidth=0.5)
axes[1].text(
    0.5,
    0.5,
    "Smaller fonts\nThinner lines\nGood for print",
    transform=axes[1].transAxes,
    fontsize=7,
    ha="center",
    bbox=dict(boxstyle="round", facecolor="lightblue", alpha=0.5),
)

plt.tight_layout()
plt.show()

print("Key differences:")
print("  Screen: Larger fonts (readable at normal viewing distance)")
print("  Print: Smaller fonts (high resolution allows finer detail)")

## Part 2: Color Theory and Accessibility

### Color Blindness Statistics
- **8%** of men have some form of color vision deficiency
- **0.5%** of women
- Most common: Red-green color blindness

### Accessible Color Principles
1. **Don't rely on color alone** - use markers, line styles, patterns
2. **Use colorblind-friendly palettes**
3. **Ensure sufficient contrast** (WCAG: at least 4.5:1 for body text; 3:1 for large text and graphical elements)
4. **Test your visualizations** with colorblind simulators
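
Contrast can be checked numerically rather than by eye. The following is a minimal sketch of the WCAG 2.x contrast-ratio formula in plain Python; the function names `relative_luminance` and `contrast_ratio` are illustrative, and colors are assumed to be given as `#RRGGBB` strings.

```python
def relative_luminance(hex_color):
    """Relative luminance of an sRGB color given as '#RRGGBB' (WCAG 2.x definition)."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma encoding to get linear-light values
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1 (identical) to 21 (black on white)."""
    lum_a, lum_b = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


print(round(contrast_ratio("#000000", "#FFFFFF"), 1))  # 21.0
print(round(contrast_ratio("#0173B2", "#FFFFFF"), 2))  # blue line color on a white background
```

A result can then be compared against a threshold such as 3:1 before committing to a text or line color.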

### Choosing Colors
- **Sequential**: One color, varying intensity (0 → 100)
- **Diverging**: Two colors with neutral midpoint (-100 ← 0 → +100)
- **Qualitative**: Distinct colors for categories
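
As a rough illustration of the three categories, the sketch below draws one built-in Matplotlib colormap of each kind as a color strip. The choice of `viridis`, `RdBu`, and `tab10` is an assumption made for the example; other maps of the same class would work equally well.

```python
import matplotlib

matplotlib.use("Agg")  # headless-safe; omit inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# One example colormap per data type (all three names ship with Matplotlib)
examples = {
    "Sequential": "viridis",
    "Diverging": "RdBu",
    "Qualitative": "tab10",
}

gradient = np.linspace(0, 1, 256).reshape(1, -1)  # 1 × 256 strip of values

fig, axes = plt.subplots(len(examples), 1, figsize=(6, 2.4))
for ax, (label, cmap_name) in zip(axes, examples.items()):
    ax.imshow(gradient, aspect="auto", cmap=cmap_name)
    ax.set_title(f"{label}: {cmap_name}", fontsize=9, loc="left")
    ax.set_axis_off()

fig.tight_layout()

# A diverging map is near-white at its midpoint, which marks the neutral value
midpoint_rgba = plt.get_cmap("RdBu")(0.5)
```

In a notebook, ending the cell with `plt.show()` displays the three strips.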

In [None]:
# Example 1: Bad vs Good color choices

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

categories = ["A", "B", "C", "D", "E"]
values = [23, 45, 56, 34, 41]

# BAD: Red-green combination (problematic for colorblind)
bad_colors = ["red", "green", "blue", "yellow", "purple"]
axes[0].bar(categories, values, color=bad_colors)
axes[0].set_title(
    "BAD: Red-Green Palette\n(Problematic for colorblind)",
    fontsize=12,
    fontweight="bold",
    color="darkred",
)
axes[0].set_ylabel("Values")

# GOOD: Colorblind-friendly palette
good_colors = ["#0173B2", "#DE8F05", "#029E73", "#CC78BC", "#949494"]
axes[1].bar(categories, values, color=good_colors)
axes[1].set_title(
    "GOOD: Colorblind-Friendly Palette\n(Safe for all viewers)",
    fontsize=12,
    fontweight="bold",
    color="darkgreen",
)
axes[1].set_ylabel("Values")

plt.tight_layout()
plt.show()

print("Colorblind-friendly palette (seaborn 'colorblind'):")
print("  Blue:   #0173B2")
print("  Orange: #DE8F05")
print("  Green:  #029E73")
print("  Pink:   #CC78BC")
print("  Gray:   #949494")

In [None]:
# Example 2: Using markers and line styles for accessibility

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

x = np.linspace(0, 10, 50)

# BAD: Color only
for i, color in enumerate(["red", "green", "blue"]):
    axes[0].plot(x, np.sin(x + i), color=color, linewidth=2, label=f"Series {i+1}")
axes[0].set_title(
    "BAD: Color Only\n(Hard to distinguish if colorblind)",
    fontsize=12,
    fontweight="bold",
    color="darkred",
)
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# GOOD: Color + markers + line style
colors_safe = ["#0173B2", "#DE8F05", "#029E73"]
markers = ["o", "s", "^"]
linestyles = ["-", "--", "-."]

for i, (color, marker, ls) in enumerate(zip(colors_safe, markers, linestyles)):
    axes[1].plot(
        x,
        np.sin(x + i),
        color=color,
        marker=marker,
        linewidth=2,
        linestyle=ls,
        markersize=4,
        markevery=5,
        label=f"Series {i+1}",
    )
axes[1].set_title(
    "GOOD: Color + Markers + Line Style\n(Multiple visual cues)",
    fontsize=12,
    fontweight="bold",
    color="darkgreen",
)
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("Best practice: Use multiple visual channels!")
print("  1. Color (for those who can see it)")
print("  2. Markers (shapes are universally distinguishable)")
print("  3. Line style (dashed, dotted, etc.)")
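
Rather than passing color, marker, and line style to every `plot` call, the combination can be set once as a property cycle. This sketch uses the `cycler` package that Matplotlib depends on; the variable name `accessible_cycle` is illustrative.

```python
import matplotlib

matplotlib.use("Agg")  # headless-safe; omit inside a notebook
import matplotlib.pyplot as plt
import numpy as np
from cycler import cycler

# Adding cyclers of equal length pairs their entries one-to-one
accessible_cycle = (
    cycler(color=["#0173B2", "#DE8F05", "#029E73"])
    + cycler(linestyle=["-", "--", "-."])
    + cycler(marker=["o", "s", "^"])
)

fig, ax = plt.subplots(figsize=(6, 4))
ax.set_prop_cycle(accessible_cycle)

x = np.linspace(0, 10, 50)
for i in range(3):
    # No per-line styling needed: each plot call takes the next combination
    ax.plot(x, np.sin(x + i), markevery=5, label=f"Series {i + 1}")

ax.legend()

# Collect the styles that were actually applied to each line
linestyles = [line.get_linestyle() for line in ax.get_lines()]
markers = [line.get_marker() for line in ax.get_lines()]
```

The same cycle could be made the default for a whole session through the `axes.prop_cycle` rcParam.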

## Part 3: Common Visualization Mistakes

### The Seven Deadly Sins of Data Visualization

1. **Misleading axes** (truncated or reversed)
2. **3D charts** (distortion, hard to read)
3. **Dual Y-axes** (can be manipulated)
4. **Pie charts** (hard to compare angles)
5. **Excessive decoration** (chartjunk)
6. **Wrong chart type** (doesn't match data)
7. **Missing context** (no labels, units, or scale)

In [None]:
# Example 1: Misleading truncated axis

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

categories = ["Product A", "Product B"]
sales = [100, 105]  # Only 5% difference

# MISLEADING: Truncated Y-axis
axes[0].bar(categories, sales, color=["blue", "red"])
axes[0].set_ylim(95, 106)  # Starts at 95, not 0!
axes[0].set_title(
    "MISLEADING: Truncated Axis\n(Makes 5% look like 100%)",
    fontsize=12,
    fontweight="bold",
    color="darkred",
)
axes[0].set_ylabel("Sales")

# HONEST: Full axis from zero
axes[1].bar(categories, sales, color=["blue", "red"])
axes[1].set_ylim(0, 120)
axes[1].set_title(
    "HONEST: Full Axis from Zero\n(Shows true 5% difference)",
    fontsize=12,
    fontweight="bold",
    color="darkgreen",
)
axes[1].set_ylabel("Sales")
axes[1].axhline(y=0, color="black", linewidth=0.8)

plt.tight_layout()
plt.show()

print("Golden Rule: Bar charts should start at zero!")
print("Exception: Line charts can use truncated axes if clearly labeled.")
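
The distortion can be quantified with what Tufte calls the lie factor: the size of the effect shown in the graphic divided by the size of the effect in the data. The function below is a minimal sketch for the two-bar case; its name and signature are illustrative. With the axis starting at 95, the bars are drawn 5 and 10 units tall, so the second appears twice the height of the first.

```python
def lie_factor(value_a, value_b, axis_start):
    """Ratio of the change the bars appear to show to the change in the data."""
    # Relative change in the underlying numbers
    data_effect = (value_b - value_a) / value_a
    # Relative change in drawn bar height, measured from where the axis starts
    height_a = value_a - axis_start
    height_b = value_b - axis_start
    shown_effect = (height_b - height_a) / height_a
    return shown_effect / data_effect


print(lie_factor(100, 105, axis_start=95))  # truncated axis: about 20
print(lie_factor(100, 105, axis_start=0))   # axis from zero: 1
```

A value of 1 means the graphic is proportionate to the data; the truncated version overstates the difference about twentyfold.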

In [None]:
# Example 2: Pie charts vs better alternatives

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

categories = ["A", "B", "C", "D", "E"]
values = [23, 25, 18, 20, 14]

# POOR: Pie chart (hard to compare)
axes[0].pie(
    values, labels=categories, autopct="%1.1f%%", colors=plt.cm.Set3(range(len(categories)))
)
axes[0].set_title(
    "POOR: Pie Chart\n(Hard to compare angles)", fontsize=12, fontweight="bold", color="darkred"
)

# BETTER: Horizontal bar chart
axes[1].barh(categories, values, color="steelblue")
axes[1].set_title(
    "BETTER: Bar Chart\n(Easy to compare lengths)",
    fontsize=12,
    fontweight="bold",
    color="darkgreen",
)
axes[1].set_xlabel("Value")
axes[1].grid(True, alpha=0.3, axis="x")

plt.tight_layout()
plt.show()

print("When to use pie charts: Almost never!")
print("Exception: Only when you have 2-3 categories and want to show 'part of whole'")
print("Better alternatives: Bar charts, treemaps, waffle charts")

In [None]:
# Example 3: Chartjunk - too much decoration

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# BAD: Excessive decoration (chartjunk)
axes[0].plot(
    x,
    y,
    "ro-",
    linewidth=3,
    markersize=15,
    markeredgewidth=2,
    markeredgecolor="black",
    markerfacecolor="yellow",
)
axes[0].set_facecolor("lightgray")
axes[0].grid(True, linewidth=2, color="white")
axes[0].set_title(
    "BAD: Chartjunk\n(Too much decoration)",
    fontsize=14,
    fontweight="bold",
    color="white",
    bbox=dict(boxstyle="round", facecolor="red", pad=0.5),
)
for i, (xi, yi) in enumerate(zip(x, y)):
    axes[0].annotate(
        f"{yi}",
        xy=(xi, yi),
        xytext=(0, 10),
        textcoords="offset points",
        ha="center",
        fontsize=12,
        fontweight="bold",
        bbox=dict(boxstyle="round", facecolor="yellow"),
    )

# GOOD: Clean, minimal design
axes[1].plot(x, y, "o-", linewidth=1.5, markersize=6, color="steelblue")
axes[1].set_title(
    "GOOD: Clean Design\n(High data-ink ratio)", fontsize=12, fontweight="bold", color="darkgreen"
)
axes[1].set_xlabel("X values")
axes[1].set_ylabel("Y values")
axes[1].grid(True, alpha=0.3, linestyle=":")
axes[1].spines["top"].set_visible(False)
axes[1].spines["right"].set_visible(False)

plt.tight_layout()
plt.show()

print("Edward Tufte's principle: Maximize data-ink ratio")
print("  Remove: Unnecessary decorations, 3D effects, excessive colors")
print("  Keep: Data, labels, minimal grid, subtle styling")

## Part 4: Choosing the Right Chart Type

### Decision Tree for Chart Selection

**What do you want to show?**

1. **Comparison**
   - Few items: Bar chart
   - Many items: Horizontal bar chart
   - Over time: Line chart

2. **Distribution**
   - Single variable: Histogram, box plot
   - Multiple groups: Violin plot, overlapping histograms
   - Relationship: Scatter plot

3. **Composition (Part-to-Whole)**
   - Static: Stacked bar chart
   - Over time: Stacked area chart
   - 2-3 parts: Pie chart (reluctantly)

4. **Relationship**
   - Two variables: Scatter plot
   - Three variables: Bubble chart
   - Many variables: Heatmap, parallel coordinates

5. **Trend Over Time**
   - Continuous: Line chart
   - Discrete: Bar chart
   - Multiple series: Multi-line chart
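
The decision tree can also be written down as a lookup table, which is handy when chart choice has to be documented or automated. This is only a sketch: `CHART_GUIDE` and `suggest_chart` are hypothetical names, and the suggestions are starting points rather than rules.

```python
# The decision tree above as a nested lookup table (goal -> situation -> chart)
CHART_GUIDE = {
    "comparison": {
        "few items": "bar chart",
        "many items": "horizontal bar chart",
        "over time": "line chart",
    },
    "distribution": {
        "single variable": "histogram or box plot",
        "multiple groups": "violin plot or overlapping histograms",
    },
    "composition": {
        "static": "stacked bar chart",
        "over time": "stacked area chart",
    },
    "relationship": {
        "two variables": "scatter plot",
        "three variables": "bubble chart",
        "many variables": "heatmap or parallel coordinates",
    },
    "trend": {
        "continuous": "line chart",
        "discrete": "bar chart",
        "multiple series": "multi-line chart",
    },
}


def suggest_chart(goal, situation):
    """Look up a starting-point chart type; raise ValueError for unknown keys."""
    try:
        return CHART_GUIDE[goal][situation]
    except KeyError:
        raise ValueError(
            f"No suggestion for goal={goal!r}, situation={situation!r}"
        ) from None


print(suggest_chart("comparison", "over time"))          # line chart
print(suggest_chart("relationship", "three variables"))  # bubble chart
```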

In [None]:
# Example: Same data, different chart types

# Create sample data
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales = [45, 52, 48, 63, 71, 65]

fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Bar chart - Good for discrete comparisons
axes[0, 0].bar(months, sales, color="steelblue", alpha=0.8)
axes[0, 0].set_title("Bar Chart: Best for\nDiscrete Comparisons", fontsize=11, fontweight="bold")
axes[0, 0].set_ylabel("Sales (Thousands)")
axes[0, 0].grid(True, alpha=0.3, axis="y")

# Line chart - Good for trends over time
axes[0, 1].plot(months, sales, "o-", linewidth=2, markersize=8, color="orangered")
axes[0, 1].set_title("Line Chart: Best for\nTrends Over Time", fontsize=11, fontweight="bold")
axes[0, 1].set_ylabel("Sales (Thousands)")
axes[0, 1].grid(True, alpha=0.3)

# Area chart - Good for cumulative or part-to-whole
axes[1, 0].fill_between(range(len(months)), sales, alpha=0.6, color="green")
axes[1, 0].plot(months, sales, "o-", linewidth=2, color="darkgreen")
axes[1, 0].set_title("Area Chart: Best for\nCumulative Values", fontsize=11, fontweight="bold")
axes[1, 0].set_ylabel("Sales (Thousands)")
axes[1, 0].grid(True, alpha=0.3)

# Scatter (not ideal for this data)
axes[1, 1].scatter(range(len(months)), sales, s=100, alpha=0.6, color="purple")
axes[1, 1].set_title("Scatter Plot: NOT ideal\nfor time series", fontsize=11, fontweight="bold")
axes[1, 1].set_ylabel("Sales (Thousands)")
axes[1, 1].set_xticks(range(len(months)))
axes[1, 1].set_xticklabels(months)
axes[1, 1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("For this monthly sales data:")
print("  ✓ Line chart: Best (shows trend clearly)")
print("  ✓ Bar chart: Good (emphasizes individual months)")
print("  ~ Area chart: OK (if showing cumulative)")
print("  ✗ Scatter plot: Poor (implies no connection between points)")

## Part 5: The Data-Ink Ratio

**Edward Tufte's Principle**: Maximize the proportion of ink devoted to data.

### Data-Ink Ratio Formula
```
Data-Ink Ratio = Data-Ink / Total Ink Used
```

### How to Improve Data-Ink Ratio
1. Remove chart borders (spines)
2. Remove gridlines or make them subtle
3. Remove unnecessary labels
4. Lighten non-data elements
5. Use direct labeling instead of legends when possible
6. Remove redundant information

In [None]:
# Example: Progressive improvement of data-ink ratio

x = np.linspace(0, 10, 50)
y = np.sin(x)

fig, axes = plt.subplots(1, 3, figsize=(16, 4))

# Default (cluttered)
axes[0].plot(x, y, "b-", linewidth=2)
axes[0].set_title("Default\n(Low data-ink ratio)", fontsize=11, fontweight="bold")
axes[0].set_xlabel("X values")
axes[0].set_ylabel("Y values")
axes[0].grid(True)

# Improved (subtle grid)
axes[1].plot(x, y, "b-", linewidth=2)
axes[1].set_title("Improved\n(Subtle grid)", fontsize=11, fontweight="bold")
axes[1].set_xlabel("X values")
axes[1].set_ylabel("Y values")
axes[1].grid(True, alpha=0.3, linestyle=":")
axes[1].spines["top"].set_visible(False)
axes[1].spines["right"].set_visible(False)

# Optimal (maximum data-ink)
axes[2].plot(x, y, "b-", linewidth=2.5)
axes[2].set_title("Optimal\n(High data-ink ratio)", fontsize=11, fontweight="bold")
axes[2].set_xlabel("X values")
axes[2].set_ylabel("Y values")
axes[2].spines["top"].set_visible(False)
axes[2].spines["right"].set_visible(False)
axes[2].spines["left"].set_color("gray")
axes[2].spines["bottom"].set_color("gray")
axes[2].tick_params(colors="gray")
axes[2].grid(False)

plt.tight_layout()
plt.show()

print("Progressive improvements:")
print("  1. Remove top and right spines")
print("  2. Make grid subtle or remove it")
print("  3. Lighten non-data elements (gray)")
print("  4. Increase data line weight")
print("  Result: Eyes drawn to data, not decoration")
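
Direct labeling, listed above as an alternative to legends, can be done with `ax.annotate`. The sketch below places each series name just past its last data point, in the line's own color, so the reader does not have to look back and forth between a legend and the data. The data and colors are illustrative.

```python
import matplotlib

matplotlib.use("Agg")  # headless-safe; omit inside a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
series = {"sin(x)": np.sin(x), "cos(x)": np.cos(x)}
colors = {"sin(x)": "#0173B2", "cos(x)": "#DE8F05"}

fig, ax = plt.subplots(figsize=(6, 3.5))
for name, y in series.items():
    ax.plot(x, y, color=colors[name], linewidth=2)
    # Label sits 5 points to the right of the final data point
    ax.annotate(
        name,
        xy=(x[-1], y[-1]),
        xytext=(5, 0),
        textcoords="offset points",
        color=colors[name],
        va="center",
    )

ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
ax.set_xlabel("X values")
ax.set_ylabel("Y values")
```

This works best with a handful of series whose end points are well separated; with many overlapping lines a legend may still be the clearer choice.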

## Part 6: Journal-Specific Guidelines

Different journals have different requirements. Always check author guidelines!

### Common Journal Requirements

**Nature**
- Single column: 89 mm wide
- Double column: 183 mm wide
- Max height: 247 mm
- Font: Arial 6-7 pt
- Format: TIFF, EPS, or PDF
- Resolution: 300 DPI (color), 600 DPI (line art)

**Science**
- Single column: 5.5 cm (2.16 inches)
- Double column: 12 cm (4.72 inches)
- Font: Helvetica, Arial, or Times
- Format: EPS, PDF
- Resolution: 300-600 DPI

**PLOS**
- Width: 8.3 cm or 17.3 cm
- Font: Any readable sans-serif
- Format: TIFF, EPS, PDF
- Resolution: 300-600 DPI
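
Journals usually state widths in millimetres or centimetres, while Matplotlib's `figsize` is in inches. A small conversion helper avoids rounding mistakes; `mm_to_inches` is a hypothetical name, and the widths below are copied from the lists above rather than from the journals' current guidelines.

```python
MM_PER_INCH = 25.4


def mm_to_inches(mm):
    """Convert millimetres to inches for use in Matplotlib's figsize."""
    return mm / MM_PER_INCH


# Column widths in millimetres, as listed above
column_widths_mm = {
    "Nature (single)": 89,
    "Nature (double)": 183,
    "PLOS (narrow)": 83,
    "PLOS (wide)": 173,
}

for journal, width_mm in column_widths_mm.items():
    print(f"{journal:16} {width_mm:>4} mm = {mm_to_inches(width_mm):.2f} in")
```

The result can be passed straight to Matplotlib, for example `figsize=(mm_to_inches(89), mm_to_inches(60))`.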

In [None]:
# Example: Creating figures for different journals

def create_journal_figure(width_inches, height_inches, journal_name, dpi=300):
    """Create a figure with journal-specific dimensions"""
    
    fig, ax = plt.subplots(figsize=(width_inches, height_inches), dpi=dpi)
    
    # Sample data
    x = np.linspace(0, 10, 100)
    y = np.sin(x)
    
    ax.plot(x, y, 'b-', linewidth=1.2)
    ax.set_xlabel('X values', fontsize=8)
    ax.set_ylabel('Y values', fontsize=8)
    ax.set_title(f'{journal_name} Format\n{width_inches:.2f} × {height_inches:.2f} inches',
                fontsize=9, fontweight='bold')
    ax.tick_params(labelsize=7)
    ax.grid(True, alpha=0.3, linewidth=0.5)
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    
    plt.tight_layout()
    
    # Save
    filename = f'../notebooks/outputs/{journal_name.lower()}_figure.png'
    plt.savefig(filename, dpi=dpi, bbox_inches='tight')
    
    return fig

# Create figures for different journals
journals = [
    ('Nature (Single)', 3.5, 2.5),
    ('Science (Single)', 2.16, 2.16),
    ('PLOS (Single)', 3.27, 2.5)
]

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

for ax, (name, width, height) in zip(axes, journals):
    # Create mock visualization
    x = np.linspace(0, 10, 100)
    ax.plot(x, np.sin(x), 'b-', linewidth=1.5)
    ax.set_title(f'{name}\n{width}" × {height}"', fontsize=10, fontweight='bold')
    ax.set_xlabel('X', fontsize=8)
    ax.set_ylabel('Y', fontsize=8)
    ax.tick_params(labelsize=7)
    ax.grid(True, alpha=0.3)
    
    # Show dimensions with rectangle
    rect = plt.Rectangle((0.05, 0.05), 0.9, 0.9,
                         transform=ax.transAxes,
                         fill=False, edgecolor='red',
                         linewidth=2, linestyle='--')
    ax.add_patch(rect)

plt.tight_layout()
plt.show()

print("\nJournal figure specifications:")
for name, width, height in journals:
    print(f"  {name:20} {width}\" × {height}\" (at 300 DPI)")

## Part 7: Complete Example - Publication-Quality Figure

In [None]:
# Create publication-quality multi-panel figure

# Set publication parameters
plt.rcParams.update(
    {
        "font.size": 8,
        "axes.labelsize": 9,
        "axes.titlesize": 10,
        "xtick.labelsize": 7,
        "ytick.labelsize": 7,
        "legend.fontsize": 7,
        "font.family": "sans-serif",
        "font.sans-serif": ["Arial"],
        "axes.linewidth": 0.5,
        "lines.linewidth": 1.2,
    }
)

# Create data
np.random.seed(42)
x = np.linspace(0, 10, 100)
y1 = np.sin(x) + np.random.normal(0, 0.1, 100)
y2 = np.cos(x) + np.random.normal(0, 0.1, 100)

# Create figure
fig = plt.figure(figsize=(7, 5), dpi=300)
gs = fig.add_gridspec(2, 2, hspace=0.3, wspace=0.3)

# Panel A: Time series
ax1 = fig.add_subplot(gs[0, :])
ax1.plot(x, y1, color="#0173B2", linewidth=1.2, label="Signal A")
ax1.plot(x, y2, color="#DE8F05", linewidth=1.2, label="Signal B")
ax1.set_xlabel("Time (s)", fontsize=9)
ax1.set_ylabel("Amplitude (mV)", fontsize=9)
ax1.set_title("A", fontsize=11, fontweight="bold", loc="left")
ax1.legend(frameon=False, loc="upper right")
ax1.spines["top"].set_visible(False)
ax1.spines["right"].set_visible(False)
ax1.grid(True, alpha=0.2, linewidth=0.5, linestyle=":")

# Panel B: Scatter
ax2 = fig.add_subplot(gs[1, 0])
ax2.scatter(y1, y2, s=15, alpha=0.6, color="#029E73", edgecolors="none")
ax2.set_xlabel("Signal A (mV)", fontsize=9)
ax2.set_ylabel("Signal B (mV)", fontsize=9)
ax2.set_title("B", fontsize=11, fontweight="bold", loc="left")
ax2.spines["top"].set_visible(False)
ax2.spines["right"].set_visible(False)
ax2.grid(True, alpha=0.2, linewidth=0.5, linestyle=":")

# Panel C: Histogram
ax3 = fig.add_subplot(gs[1, 1])
ax3.hist(y1, bins=20, color="#0173B2", alpha=0.7, edgecolor="white", linewidth=0.5)
ax3.set_xlabel("Signal A (mV)", fontsize=9)
ax3.set_ylabel("Frequency", fontsize=9)
ax3.set_title("C", fontsize=11, fontweight="bold", loc="left")
ax3.spines["top"].set_visible(False)
ax3.spines["right"].set_visible(False)
ax3.grid(True, alpha=0.2, linewidth=0.5, linestyle=":", axis="y")

# Save figure
plt.savefig("../notebooks/outputs/publication_quality_figure.png", dpi=300, bbox_inches="tight")
plt.savefig("../notebooks/outputs/publication_quality_figure.pdf", bbox_inches="tight")

plt.show()

print("Publication-quality figure created!")
print("\nFeatures:")
print("  ✓ 7-inch width (double column)")
print("  ✓ 300 DPI resolution")
print("  ✓ Arial font, 7-9 pt (11 pt panel labels)")
print("  ✓ Colorblind-friendly palette")
print("  ✓ Panel labels (A, B, C)")
print("  ✓ Minimal design, high data-ink ratio")
print("  ✓ Saved as PNG and PDF")
print("\nReady for submission to journals!")

# Reset parameters
plt.rcParams.update(plt.rcParamsDefault)
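
Before submitting, it can be worth confirming that an exported file really has the intended dimensions. The sketch below saves a figure to a temporary directory and reads the PNG back to check its pixel size. Note that `bbox_inches="tight"` crops or pads the canvas to the drawn content, so a file saved that way can differ from `figsize`; this example omits it deliberately.

```python
import tempfile
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # headless-safe; omit inside a notebook
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np

width_in, height_in, dpi = 3.5, 2.5, 300

fig, ax = plt.subplots(figsize=(width_in, height_in))
x = np.linspace(0, 10, 100)
ax.plot(x, np.sin(x), color="#0173B2")
fig.tight_layout()

with tempfile.TemporaryDirectory() as tmp_dir:
    png_path = Path(tmp_dir) / "size_check.png"
    # No bbox_inches="tight" here, so the canvas is saved at exactly figsize
    fig.savefig(png_path, dpi=dpi)
    # imread returns an array shaped (rows, columns, channels)
    height_px, width_px = mpimg.imread(str(png_path)).shape[:2]

print(f"{width_px} × {height_px} px at {dpi} DPI")
print(f"= {width_px / dpi:.2f} × {height_px / dpi:.2f} inches")
plt.close(fig)
```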

## Key Takeaways

### Publication Quality Checklist

**Technical Requirements:**
- [ ] Correct dimensions for target publication
- [ ] 300 DPI for color, 600 DPI for line art
- [ ] Appropriate font (Arial, Helvetica) at 8-10 pt
- [ ] Saved in required format (TIFF, EPS, PDF)

**Design Principles:**
- [ ] Colorblind-friendly palette
- [ ] Multiple visual channels (color + markers + line style)
- [ ] High data-ink ratio (minimal decoration)
- [ ] Clear, informative labels with units
- [ ] Appropriate chart type for data

**Accessibility:**
- [ ] Sufficient contrast (4.5:1 for body text; 3:1 minimum for large text and graphical elements)
- [ ] Not relying on color alone
- [ ] Readable font sizes
- [ ] Alternative text for screen readers (when digital)

**Scientific Integrity:**
- [ ] No misleading axes or scales
- [ ] Error bars where appropriate
- [ ] Sample sizes indicated
- [ ] Statistical significance marked
- [ ] Source data available
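
As an illustration of the error-bar and sample-size items, the sketch below plots group means with the standard error of the mean and states n in each tick label. The data are synthetic, and the group names and effect sizes are made up for the example.

```python
import matplotlib

matplotlib.use("Agg")  # headless-safe; omit inside a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)

# Synthetic measurements for illustration only: three groups, n = 12 each
groups = ["Control", "Low dose", "High dose"]
samples = [rng.normal(loc=mu, scale=1.0, size=12) for mu in (5.0, 5.8, 7.1)]

means = np.array([s.mean() for s in samples])
# Standard error of the mean = sample standard deviation / sqrt(n)
sems = np.array([s.std(ddof=1) / np.sqrt(s.size) for s in samples])

fig, ax = plt.subplots(figsize=(3.5, 2.5))
positions = np.arange(len(groups))
ax.errorbar(positions, means, yerr=sems, fmt="o", capsize=3, color="#0173B2")
ax.set_xticks(positions)
ax.set_xticklabels([f"{g}\n(n = {s.size})" for g, s in zip(groups, samples)])
ax.set_xlim(-0.5, len(groups) - 0.5)
ax.set_ylabel("Response (arbitrary units)")
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
fig.tight_layout()
```

Whichever measure is plotted (standard deviation, standard error, or a confidence interval), the figure caption should say which one the bars represent.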

### What's Next
In **Module 07**, you'll apply everything you've learned:
- Complete capstone project
- End-to-end data analysis
- Combining all visualization libraries
- Creating a comprehensive visual report
- Professional data storytelling

---

## Exercises

### Exercise 1: Fix the Bad Visualizations
Create intentionally bad visualizations with these flaws, then fix them:
1. Truncated axis making small differences look huge
2. Using only red-green colors (problematic for colorblind)
3. 3D pie chart (double sin!)
4. Chart with excessive decoration

Show before and after for each.

In [None]:
# Your code here

### Exercise 2: Publication Figure for Nature
Create a publication-ready figure following Nature guidelines:
- Single column width (89 mm = 3.5 inches)
- 300 DPI
- Arial font, 6-7 pt
- Two panels (A and B)
- Colorblind-friendly
- Save as both PNG and PDF

In [None]:
# Your code here

### Exercise 3: Data-Ink Ratio Challenge
Take a cluttered default plot and progressively improve it:
1. Start with maximum decoration
2. Remove unnecessary elements one by one
3. Create 4 versions showing progression
4. Explain what you removed and why

In [None]:
# Your code here

### Exercise 4: Chart Type Selection
Given this data, create 5 different visualizations:
- Sales data for 5 products across 4 quarters

Use:
1. Grouped bar chart
2. Stacked bar chart
3. Line chart
4. Heatmap
5. Small multiples

Discuss which is best for different insights.

In [None]:
# Your code here

---

**Congratulations!** You've mastered scientific best practices for data visualization. You can now create professional, publication-quality figures that communicate your findings clearly and honestly.

**Next**: Module 07 - Capstone Project (Apply Everything You've Learned)