# Week 8 Workshop: Visualization Principles

## Creating Publication-Ready Figures with Budget Execution Data

**Student Name:** _____________________

**Date:** _____________________

---

### Workshop Objectives

1. Create 5 different chart types with real budget data
2. Apply a consistent, professional color palette
3. Add proper titles, labels, and annotations
4. Export publication-ready figures

---

## Setup

In [None]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import os

# Suppress warnings
import warnings
warnings.filterwarnings('ignore')

# Create figures directory for exports
os.makedirs('figures', exist_ok=True)

print("Libraries loaded successfully!")

In [None]:
# Configure matplotlib for publication quality
plt.style.use('seaborn-v0_8-whitegrid')

# Custom settings
plt.rcParams['figure.figsize'] = (10, 6)
plt.rcParams['figure.dpi'] = 100
plt.rcParams['savefig.dpi'] = 300
plt.rcParams['font.size'] = 11
plt.rcParams['font.family'] = 'sans-serif'
plt.rcParams['axes.spines.top'] = False
plt.rcParams['axes.spines.right'] = False
plt.rcParams['axes.titlesize'] = 12
plt.rcParams['axes.labelsize'] = 10

print("Matplotlib configured for publication quality!")
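
If you would rather not change the global settings for the whole notebook, the same kind of configuration can be scoped to a single figure. The sketch below is one possible approach using `plt.rc_context`; the dictionary name `PUBLICATION_RC` is a hypothetical name chosen for this example and is not part of the workshop template.

```python
import matplotlib.pyplot as plt

# Hypothetical name for this example: a subset of the settings used above.
PUBLICATION_RC = {
    'figure.figsize': (10, 6),
    'savefig.dpi': 300,
    'font.size': 11,
    'axes.spines.top': False,
    'axes.spines.right': False,
}

# The settings apply only inside the `with` block and are restored afterwards.
with plt.rc_context(PUBLICATION_RC):
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [2, 4, 8])
    inside_size = plt.rcParams['font.size']  # font size in effect inside the block

plt.close(fig)  # free the demo figure
```

Scoping settings this way is mainly useful when one chart needs different defaults from the rest of the notebook.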

In [None]:
# Load the dataset
# UPDATE THE PATH if your file is in a different location
df = pd.read_csv('EJECUCION_PRESUPUESTAL.csv')

# Display basic information
print(f"Dataset shape: {df.shape}")
print(f"\nColumns: {df.columns.tolist()}")

In [None]:
# Preview the data
df.head()

In [None]:
# Check data types
df.info()

In [None]:
# Prepare sample data for visualization
# (Use these sample tables if the columns in your dataset differ from what the charts expect)

np.random.seed(42)

# Budget by category
categories = ['Education', 'Health', 'Infrastructure', 'Security', 'Social Programs', 'Administration']
budget_by_category = pd.DataFrame({
    'Category': categories,
    'Budget_Approved': [450, 380, 290, 220, 160, 100],
    'Budget_Executed': [420, 350, 310, 195, 145, 95]
})
budget_by_category['Execution_Rate'] = (budget_by_category['Budget_Executed'] / 
                                         budget_by_category['Budget_Approved'] * 100).round(1)

# Monthly execution data
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
monthly_data = pd.DataFrame({
    'Month': months,
    'Month_Num': range(1, 13),
    'Execution_Rate': [7.8, 15.9, 24.5, 33.2, 41.8, 49.5, 58.2, 66.8, 75.1, 83.5, 91.8, 99.2],
    'Target': [8.3, 16.7, 25.0, 33.3, 41.7, 50.0, 58.3, 66.7, 75.0, 83.3, 91.7, 100.0]
})

# Department execution data
dept_data = pd.DataFrame({
    'Department': ['Finance', 'Human Resources', 'IT', 'Operations', 'Marketing', 'Legal', 'R&D', 'Facilities'],
    'Execution_Pct': [95.2, 88.7, 92.1, 78.4, 85.6, 91.3, 82.5, 89.9],
    'Budget': [120, 85, 150, 200, 95, 60, 180, 70]
})

# Budget composition by type
composition = pd.DataFrame({
    'Year': [2020, 2021, 2022, 2023, 2024],
    'Personnel': [45, 44, 46, 47, 48],
    'Operations': [25, 26, 24, 23, 22],
    'Investment': [20, 19, 21, 22, 23],
    'Debt_Service': [10, 11, 9, 8, 7]
})

print("Sample data prepared!")
print(f"\nDatasets created:")
print(f"  - budget_by_category: {budget_by_category.shape}")
print(f"  - monthly_data: {monthly_data.shape}")
print(f"  - dept_data: {dept_data.shape}")
print(f"  - composition: {composition.shape}")
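
If you prefer to build the category table from your own file rather than the sample data, an aggregation along the following lines may work. This is only a sketch: the rows are made up for illustration, and the column names `sector`, `approved`, and `executed` are hypothetical, so replace them with whatever `df.columns` shows for your dataset.

```python
import pandas as pd

# Made-up rows standing in for the real file; column names are hypothetical.
raw = pd.DataFrame({
    'sector': ['Education', 'Education', 'Health', 'Health'],
    'approved': [200.0, 250.0, 180.0, 200.0],
    'executed': [190.0, 230.0, 170.0, 180.0],
})

# Sum line items per category, then rename to the columns the charts expect.
summary = (
    raw.groupby('sector', as_index=False)[['approved', 'executed']]
       .sum()
       .rename(columns={
           'sector': 'Category',
           'approved': 'Budget_Approved',
           'executed': 'Budget_Executed',
       })
)

# Same execution-rate formula as the sample data cell.
summary['Execution_Rate'] = (
    summary['Budget_Executed'] / summary['Budget_Approved'] * 100
).round(1)
```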

---

# Part 1: Data Preparation and Color Palette (15 minutes)

---

## Task 1.1: Explore the Data

In [None]:
# Explore the budget by category data
budget_by_category

In [None]:
# Summary statistics
budget_by_category.describe()

## Task 1.2: Define Your Color Palette

In [None]:
# TODO: Define your professional color palette
# Choose colors that work well together and serve different purposes

# YOUR CODE HERE
COLORS = {
    'primary': ___,      # Main data color (e.g., '#2C3E50', 'steelblue')
    'secondary': ___,    # Comparison data (e.g., '#3498DB', 'lightsteelblue')
    'accent': ___,       # Highlighting important values (e.g., '#1ABC9C')
    'warning': ___,      # Below target / negative (e.g., '#E74C3C', 'coral')
    'neutral': ___,      # Reference lines, text (e.g., '#95A5A6', 'gray')
}

# Sequential palette for stacked charts (light to dark)
SEQUENTIAL = [___, ___, ___, ___]  # e.g., ['#D5E8D4', '#97D077', '#5FAD41', '#2E7D32']

print("Color palette defined!")
print(f"Primary: {COLORS['primary']}")
print(f"Secondary: {COLORS['secondary']}")
print(f"Accent: {COLORS['accent']}")
print(f"Warning: {COLORS['warning']}")
print(f"Neutral: {COLORS['neutral']}")

In [None]:
# Visualize your palette
fig, ax = plt.subplots(figsize=(10, 2))

# Show main colors
for i, (name, color) in enumerate(COLORS.items()):
    ax.add_patch(plt.Rectangle((i, 0), 1, 1, facecolor=color))
    ax.text(i + 0.5, -0.2, name, ha='center', va='top', fontsize=9)
    ax.text(i + 0.5, 0.5, color, ha='center', va='center', fontsize=8, 
            color='white' if name in ['primary', 'warning'] else 'black')

ax.set_xlim(0, len(COLORS))
ax.set_ylim(-0.5, 1)
ax.axis('off')
ax.set_title('My Color Palette', fontsize=12, loc='left')
plt.tight_layout()
plt.show()
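
The palette preview above hard-codes white text for `primary` and `warning`, which may not suit every palette. One possible alternative is to pick the text color from the background's brightness. The helper below is a rough heuristic, not an accessibility check: `readable_text_color` and its `threshold` parameter are hypothetical names introduced for this sketch, and the weights are the common Rec. 601 luma coefficients.

```python
from matplotlib.colors import to_rgb

def readable_text_color(background, threshold=0.5):
    """Return 'white' or 'black' for text placed on `background` (heuristic)."""
    r, g, b = to_rgb(background)  # accepts hex codes and named colors
    brightness = 0.299 * r + 0.587 * g + 0.114 * b  # Rec. 601 luma weights
    return 'black' if brightness > threshold else 'white'
```

In the preview loop you could then pass `color=readable_text_color(color)` instead of the fixed list of names.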

---

# Part 2: Create 5 Chart Types (75 minutes)

---

## Chart 1: Horizontal Bar Chart - Budget by Category (15 minutes)

Create a clean horizontal bar chart showing budget allocation.

**Requirements:**
- Sort by value (largest at top)
- Use primary color
- Add value labels
- Clean title

In [None]:
# TODO: Create horizontal bar chart

fig, ax = plt.subplots(figsize=(10, 6))

# Sort data
sorted_data = budget_by_category.sort_values('Budget_Approved', ascending=True)

# YOUR CODE HERE
# Create horizontal bars
bars = ax.barh(
    ___,  # y: categories
    ___,  # width: budget values
    color=___,  # Use COLORS['primary']
    edgecolor='none'
)

# Add value labels
for bar, value in zip(bars, sorted_data['Budget_Approved']):
    ax.text(
        bar.get_width() + 5,
        bar.get_y() + bar.get_height()/2,
        f'${value}M',
        va='center',
        fontsize=10
    )

# Remove unnecessary elements
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['bottom'].set_visible(False)
ax.set_xticks([])

# Title and labels
ax.set_title('Budget Allocation by Category (Millions USD)', fontsize=12, loc='left', pad=10)

# Add source
ax.text(0, -0.12, 'Source: datos.gov.co - Budget Execution Data', 
        transform=ax.transAxes, fontsize=8, color='gray')

plt.tight_layout()
plt.show()
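
As an alternative to the manual `ax.text` loop for value labels, recent matplotlib versions (3.4 and later) provide `Axes.bar_label`. The sketch below uses made-up values rather than the workshop data, so it does not fill in the blanks above.

```python
import matplotlib.pyplot as plt

# Illustrative values only, not the workshop data.
names = ['A', 'B', 'C']
values = [120, 80, 45]

fig, ax = plt.subplots(figsize=(6, 3))
bars = ax.barh(names, values, color='steelblue')

# bar_label places one text label at the end of each bar.
labels = ax.bar_label(
    bars,
    labels=[f'${v}M' for v in values],
    padding=4,      # gap in points between bar end and label
    fontsize=10,
)
ax.set_xticks([])
```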

In [None]:
# TODO: Export Chart 1
# Uncomment and run after your chart looks good

# fig.savefig('figures/chart1_budget_by_category.png', dpi=300, bbox_inches='tight',
#             facecolor='white', edgecolor='none')
# fig.savefig('figures/chart1_budget_by_category.svg', format='svg', bbox_inches='tight')
# print("Chart 1 exported!")

## Chart 2: Grouped Bar Chart - Approved vs Executed (15 minutes)

Create a grouped bar chart comparing approved and executed budgets.

**Requirements:**
- Two bars per category (approved and executed)
- Use primary and secondary colors
- Add legend
- Show execution rate

In [None]:
# TODO: Create grouped bar chart

fig, ax = plt.subplots(figsize=(12, 6))

# Set up bar positions
x = np.arange(len(budget_by_category))
width = 0.35

# YOUR CODE HERE
# Create bars for approved budget
bars1 = ax.bar(
    x - width/2,
    ___,  # Budget_Approved values
    width,
    label='Approved',
    color=___  # Use COLORS['secondary']
)

# Create bars for executed budget
bars2 = ax.bar(
    x + width/2,
    ___,  # Budget_Executed values
    width,
    label='Executed',
    color=___  # Use COLORS['primary']
)

# Add execution rate labels above each pair
for i, (approved, executed, rate) in enumerate(zip(
    budget_by_category['Budget_Approved'],
    budget_by_category['Budget_Executed'],
    budget_by_category['Execution_Rate']
)):
    ax.text(i, max(approved, executed) + 10, f'{rate}%',
            ha='center', va='bottom', fontsize=9, fontweight='bold')

# Customize
ax.set_xticks(x)
ax.set_xticklabels(budget_by_category['Category'])
ax.set_ylabel('Budget (Millions USD)')
ax.set_title('Approved vs Executed Budget by Category', fontsize=12, loc='left', pad=10)
ax.legend(frameon=False)

# Remove top and right spines
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

# Add light horizontal gridlines
ax.yaxis.grid(True, linestyle='-', linewidth=0.5, color='lightgray', alpha=0.7)
ax.set_axisbelow(True)

# Add source
ax.text(0, -0.15, 'Source: datos.gov.co - Budget Execution Data', 
        transform=ax.transAxes, fontsize=8, color='gray')

plt.tight_layout()
plt.show()
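
The `x - width/2` and `x + width/2` positions above are the two-series case of a more general pattern. If you ever need three or more bars per group, offsets can be computed as in this sketch; `group_offsets` is a hypothetical helper name, not a matplotlib function.

```python
import numpy as np

def group_offsets(n_series, bar_width):
    """Offsets that centre `n_series` adjacent bars on each tick position."""
    return (np.arange(n_series) - (n_series - 1) / 2) * bar_width

# Two series with width 0.35 reproduce the positions used in Chart 2:
# x - 0.175 for the first series and x + 0.175 for the second.
two_series = group_offsets(2, 0.35)
```

Each series `k` is then drawn at `x + group_offsets(n, width)[k]`.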

In [None]:
# TODO: Export Chart 2
# Uncomment and run after your chart looks good

# fig.savefig('figures/chart2_approved_vs_executed.png', dpi=300, bbox_inches='tight',
#             facecolor='white', edgecolor='none')
# fig.savefig('figures/chart2_approved_vs_executed.svg', format='svg', bbox_inches='tight')
# print("Chart 2 exported!")

## Chart 3: Line Chart - Monthly Execution Trend (15 minutes)

Create a line chart showing monthly budget execution vs target.

**Requirements:**
- Two lines: actual and target
- Highlight months below target
- Clean gridlines
- Annotate final value

In [None]:
# TODO: Create line chart

fig, ax = plt.subplots(figsize=(12, 5))

# YOUR CODE HERE

# Plot target line (subtle, in background)
ax.plot(
    monthly_data['Month'],
    monthly_data['Target'],
    linestyle='--',
    linewidth=1.5,
    color=___,  # Use COLORS['neutral']
    label='Target'
)

# Plot actual execution (prominent)
ax.plot(
    monthly_data['Month'],
    monthly_data['Execution_Rate'],
    linestyle='-',
    linewidth=2.5,
    color=___,  # Use COLORS['primary']
    label='Actual'
)

# Highlight months below target with markers
below_target = monthly_data[monthly_data['Execution_Rate'] < monthly_data['Target']]
ax.scatter(
    below_target['Month'],
    below_target['Execution_Rate'],
    color=___,  # Use COLORS['warning']
    s=50,
    zorder=5,
    label='Below Target'
)

# Minimal gridlines (horizontal only)
ax.yaxis.grid(True, linestyle='-', linewidth=0.5, color='lightgray', alpha=0.7)
ax.set_axisbelow(True)

# Labels and title
ax.set_ylabel('Execution Rate (%)')
ax.set_title('Monthly Budget Execution vs Target', fontsize=12, loc='left', pad=10)
ax.legend(frameon=False, loc='lower right')

# Annotate final value
final_rate = monthly_data['Execution_Rate'].iloc[-1]
ax.annotate(
    f'{final_rate}%',
    xy=(11, final_rate),
    xytext=(11.3, final_rate),
    fontsize=10,
    fontweight='bold',
    color=COLORS['primary']
)

# Add source
ax.text(0, -0.12, 'Source: datos.gov.co - Budget Execution Data', 
        transform=ax.transAxes, fontsize=8, color='gray')

plt.tight_layout()
plt.show()
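
The below-target filter in Chart 3 can also be expressed as an explicit gap column, which may be handy if you want to annotate how far each month is from target. The sketch repeats the sample values from the setup cell so that it runs on its own; `monthly_gap` and `Gap` are names introduced here for illustration.

```python
import pandas as pd

# Same sample values as monthly_data in the setup cell.
monthly_gap = pd.DataFrame({
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
              'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'],
    'Execution_Rate': [7.8, 15.9, 24.5, 33.2, 41.8, 49.5,
                       58.2, 66.8, 75.1, 83.5, 91.8, 99.2],
    'Target': [8.3, 16.7, 25.0, 33.3, 41.7, 50.0,
               58.3, 66.7, 75.0, 83.3, 91.7, 100.0],
})

# Negative gap means the month is below target (rounded to avoid float noise).
monthly_gap['Gap'] = (monthly_gap['Execution_Rate'] - monthly_gap['Target']).round(1)
below_months = monthly_gap.loc[monthly_gap['Gap'] < 0, 'Month'].tolist()
```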

In [None]:
# TODO: Export Chart 3
# Uncomment and run after your chart looks good

# fig.savefig('figures/chart3_monthly_execution.png', dpi=300, bbox_inches='tight',
#             facecolor='white', edgecolor='none')
# fig.savefig('figures/chart3_monthly_execution.svg', format='svg', bbox_inches='tight')
# print("Chart 3 exported!")

## Chart 4: Stacked Bar Chart - Budget Composition (15 minutes)

Create a stacked bar chart showing budget composition over years.

**Requirements:**
- Sequential color palette (light to dark)
- Percentage labels within segments
- Legend
- Clean styling

In [None]:
# TODO: Create stacked bar chart

fig, ax = plt.subplots(figsize=(10, 6))

# Prepare data
years = composition['Year']
components = ['Personnel', 'Operations', 'Investment', 'Debt_Service']

# YOUR CODE HERE

# Create stacked bars
bottom = np.zeros(len(years))

for i, component in enumerate(components):
    values = composition[component].to_numpy()  # plain array keeps `bottom += values` a pure NumPy operation
    ax.bar(
        years,
        values,
        bottom=bottom,
        label=component.replace('_', ' '),
        color=___,  # Use SEQUENTIAL[i]
        edgecolor='white',
        linewidth=0.5
    )
    
    # Add percentage labels in center of each segment
    for j, (year, value) in enumerate(zip(years, values)):
        if value > 5:  # Only label if segment is big enough
            ax.text(
                year,
                bottom[j] + value/2,
                f'{value}%',
                ha='center',
                va='center',
                fontsize=9,
                color='white' if i > 1 else 'black'
            )
    
    bottom += values

# Customize
ax.set_ylabel('Percentage of Total Budget')
ax.set_title('Budget Composition Over Time', fontsize=12, loc='left', pad=10)
# Anchor the legend's upper-left corner just outside the axes so it cannot cover the last bar
ax.legend(frameon=False, loc='upper left', bbox_to_anchor=(1.02, 1))

# Set y-axis limit to 100
ax.set_ylim(0, 100)

# Remove spines
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

# Add source
ax.text(0, -0.12, 'Source: datos.gov.co - Budget Execution Data', 
        transform=ax.transAxes, fontsize=8, color='gray')

plt.tight_layout()
plt.show()
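
Because Chart 4 fixes the y-axis at 100, it is worth confirming that each year's components actually sum to 100 before plotting. The sketch below does that check and also shows where each segment starts, which is what the running `bottom` array tracks in the loop above. It repeats the sample values so it runs on its own; `comp`, `tops`, and `bottoms` are names introduced for this example.

```python
import numpy as np
import pandas as pd

# Same sample values as `composition` in the setup cell.
comp = pd.DataFrame({
    'Year': [2020, 2021, 2022, 2023, 2024],
    'Personnel': [45, 44, 46, 47, 48],
    'Operations': [25, 26, 24, 23, 22],
    'Investment': [20, 19, 21, 22, 23],
    'Debt_Service': [10, 11, 9, 8, 7],
})
parts = ['Personnel', 'Operations', 'Investment', 'Debt_Service']

# Each row should total 100 percent.
row_totals = comp[parts].sum(axis=1)
totals_ok = bool(np.allclose(row_totals, 100))

# Top edge of each segment is the running total; the bottom edge is top minus height.
tops = comp[parts].cumsum(axis=1)
bottoms = tops - comp[parts]
```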

In [None]:
# TODO: Export Chart 4
# Uncomment and run after your chart looks good

# fig.savefig('figures/chart4_budget_composition.png', dpi=300, bbox_inches='tight',
#             facecolor='white', edgecolor='none')
# fig.savefig('figures/chart4_budget_composition.svg', format='svg', bbox_inches='tight')
# print("Chart 4 exported!")

## Chart 5: Small Multiples - Department Comparison (15 minutes)

Create a small multiples visualization comparing department execution.

**Requirements:**
- Consistent scales across all subplots
- Highlight departments below the 90% target
- Reference line at 90%
- Minimal styling

In [None]:
# TODO: Create small multiples

# Calculate grid dimensions
n_depts = len(dept_data)
n_cols = 4
n_rows = (n_depts + n_cols - 1) // n_cols

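# squeeze=False always returns a 2D array, so flatten() also works for a single row or column
fig, axes = plt.subplots(n_rows, n_cols, figsize=(14, 6), squeeze=False)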
axes = axes.flatten()

# Target threshold
target = 90

# YOUR CODE HERE

for i, (idx, row) in enumerate(dept_data.iterrows()):
    ax = axes[i]
    
    # Choose color based on whether above/below target
    bar_color = ___ if row['Execution_Pct'] >= target else ___  # primary vs warning
    
    # Create single bar
    ax.barh([0], [row['Execution_Pct']], color=bar_color, height=0.5)
    
    # Add target reference line
    ax.axvline(x=target, color=COLORS['neutral'], linestyle='--', linewidth=1)
    
    # Add value label
    ax.text(row['Execution_Pct'] + 1, 0, f"{row['Execution_Pct']}%",
            va='center', fontsize=9, fontweight='bold')
    
    # Department name as title
    ax.set_title(row['Department'], fontsize=10, loc='left', pad=5)
    
    # Consistent x-axis
    ax.set_xlim(0, 105)
    ax.set_ylim(-0.5, 0.5)
    
    # Remove y-axis and simplify
    ax.set_yticks([])
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.spines['left'].set_visible(False)

# Hide empty subplots
for j in range(i + 1, len(axes)):
    axes[j].set_visible(False)

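# Add overall title
fig.suptitle('Budget Execution by Department (Target: 90%)', fontsize=12, y=1.02)

# Add source (figure-level, since there are several axes)
fig.text(0.01, -0.02, 'Source: datos.gov.co - Budget Execution Data',
         fontsize=8, color='gray')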

plt.tight_layout()
plt.show()
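
Chart 5 keeps the scales consistent by calling `set_xlim` inside the loop. Another option, sketched below with made-up values, is to let matplotlib share the x-axis across panels with `sharex=True`, so that a single `set_xlim` call applies to every subplot.

```python
import matplotlib.pyplot as plt

# Illustrative values only, not the workshop data.
values = {'Unit A': 95.0, 'Unit B': 78.0, 'Unit C': 88.5}

fig, axes = plt.subplots(1, len(values), figsize=(9, 2),
                         sharex=True, squeeze=False)

for ax, (name, pct) in zip(axes.flat, values.items()):
    ax.barh([0], [pct], height=0.5)
    ax.set_title(name, fontsize=10, loc='left')
    ax.set_yticks([])

# With a shared x-axis, setting the limits once updates all panels.
axes.flat[0].set_xlim(0, 105)
```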

In [None]:
# TODO: Export Chart 5
# Uncomment and run after your chart looks good

# fig.savefig('figures/chart5_department_comparison.png', dpi=300, bbox_inches='tight',
#             facecolor='white', edgecolor='none')
# fig.savefig('figures/chart5_department_comparison.svg', format='svg', bbox_inches='tight')
# print("Chart 5 exported!")

---

# Part 3: Publication-Ready Formatting (20 minutes)

---

## Checklist: Review Your Charts

For each chart, verify:

| Requirement | Chart 1 | Chart 2 | Chart 3 | Chart 4 | Chart 5 |
|-------------|---------|---------|---------|---------|----------|
| Descriptive title | [ ] | [ ] | [ ] | [ ] | [ ] |
| Clear axis labels | [ ] | [ ] | [ ] | [ ] | [ ] |
| Consistent font sizes | [ ] | [ ] | [ ] | [ ] | [ ] |
| Color from palette | [ ] | [ ] | [ ] | [ ] | [ ] |
| Appropriate legend | [ ] | [ ] | [ ] | [ ] | [ ] |
| No unnecessary gridlines | [ ] | [ ] | [ ] | [ ] | [ ] |
| Source attribution | [ ] | [ ] | [ ] | [ ] | [ ] |

In [None]:
# List exported files
import os

if os.path.exists('figures'):
    files = os.listdir('figures')
    print(f"Exported files ({len(files)}):")
    for f in sorted(files):
        size = os.path.getsize(f'figures/{f}') / 1024
        print(f"  - {f} ({size:.1f} KB)")
else:
    print("No figures exported yet.")
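
Since every chart is exported twice with the same options, the commented `savefig` calls could be wrapped in a small helper. This is a sketch under the assumption that you want the same PNG and SVG settings for all five charts; `export_figure` is a hypothetical name, not part of matplotlib.

```python
from pathlib import Path

import matplotlib.pyplot as plt

def export_figure(fig, stem, folder='figures'):
    """Save `fig` as <folder>/<stem>.png and <folder>/<stem>.svg; return both paths."""
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for ext in ('png', 'svg'):
        path = out_dir / f'{stem}.{ext}'
        # The file format is inferred from the extension.
        fig.savefig(path, dpi=300, bbox_inches='tight',
                    facecolor='white', edgecolor='none')
        paths.append(path)
    return paths
```

For example, `export_figure(fig, 'chart1_budget_by_category')` run right after the Chart 1 cell would write both files into `figures/`.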

---

# Part 4: Critical Analysis (10 minutes)

---

## Critical Analysis Questions

### Question 1: Most Effective Chart Type

Which chart type was most effective for communicating the budget story? Why?

_Your answer:_

---

### Question 2: Insight from Multiple Charts

What insight would be missed if you only used one chart type?

_Your answer:_

---

### Question 3: One Chart for Decision-Maker

If you could only show one chart to a decision-maker, which would you choose and why?

_Your answer:_

---

### Question 4: Additional Data

What additional data would make these visualizations more impactful?

_Your answer:_

---

## Color Palette Documentation

Document your color choices:

| Purpose | Hex Code | Rationale |
|---------|----------|----------|
| Primary | | |
| Secondary | | |
| Accent | | |
| Warning | | |
| Neutral | | |

---

## Summary

### Key Takeaways

1. **Data-Ink Ratio:** Remove unnecessary elements to let data speak

2. **Chart Selection:** Choose the right chart for the data and message

3. **Color Usage:** Use color purposefully, not decoratively

4. **Consistency:** Apply the same styling across all charts

5. **Export Quality:** High DPI for print, vector for presentations

---

## Submission Checklist

- [ ] All 5 charts created and styled
- [ ] Color palette defined and documented
- [ ] 10 files exported (5 PNG + 5 SVG)
- [ ] Critical analysis questions answered
- [ ] Source attribution on all charts

---

*End of Workshop*