# Week 9 Workshop: Plotly Exercises

## Budget Execution Dataset - Interactive Visualization

**Student Name:** _____________________

**Date:** _____________________

---

### Workshop Objectives

1. Master Plotly Express chart creation
2. Add hover information and interactivity
3. Customize chart appearance and layout
4. Prepare charts for a Streamlit dashboard

---

## Setup

In [None]:
# Import libraries
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Suppress warnings to keep the notebook output clean (this also hides deprecation warnings)
import warnings
warnings.filterwarnings('ignore')

print("Libraries loaded successfully!")

In [None]:
# Create Budget Execution dataset
np.random.seed(42)

departments = ['Education', 'Health', 'Infrastructure', 'Security', 'Social Services', 'Environment']
years = [2021, 2022, 2023, 2024]
categories = ['Personnel', 'Operations', 'Investment', 'Transfers']

data = []
for dept in departments:
    for year in years:
        for cat in categories:
            approved = np.random.randint(100000, 5000000)
            execution_rate = np.random.uniform(0.6, 1.0)
            executed = int(approved * execution_rate)
            
            data.append({
                'department': dept,
                'year': year,
                'category': cat,
                'budget_approved': approved,
                'budget_executed': executed,
                'execution_rate': round(execution_rate * 100, 1),
                'project_count': np.random.randint(5, 50)
            })

df = pd.DataFrame(data)

print(f"Dataset shape: {df.shape}")
print(f"Columns: {df.columns.tolist()}")

In [None]:
# Preview the data
df.head(10)

In [None]:
# Summary statistics
df.describe()

---

# Part 1: Basic Plotly Charts (20 minutes)

---

## Task 1.1: Bar Chart - Budget by Department

Create a horizontal bar chart showing total executed budget by department.

In [None]:
# Aggregate data by department
dept_totals = df.groupby('department').agg({
    'budget_approved': 'sum',
    'budget_executed': 'sum'
}).reset_index()

dept_totals['execution_rate'] = (dept_totals['budget_executed'] / dept_totals['budget_approved'] * 100).round(1)
dept_totals = dept_totals.sort_values('budget_executed', ascending=True)

dept_totals

In [None]:
# TODO: Create horizontal bar chart
# Requirements:
# - Use px.bar() with orientation='h'
# - Show executed budget by department
# - Add hover data for approved budget and execution rate
# - Add appropriate title and axis labels

fig = px.bar(
    dept_totals,
    x=___,  # budget_executed
    y=___,  # department
    orientation=___,  # 'h' for horizontal
    hover_data=[___],  # list of additional columns
    title='Total Budget Executed by Department',
    color='execution_rate',
    color_continuous_scale='RdYlGn'
)

fig.update_layout(
    xaxis_title='Executed Budget ($)',
    yaxis_title='',
    coloraxis_colorbar_title='Exec Rate %'
)

fig.show()

## Task 1.2: Grouped Bar Chart - Approved vs Executed

In [None]:
# Reshape data for grouped bar chart
dept_melted = dept_totals.melt(
    id_vars=['department'],
    value_vars=['budget_approved', 'budget_executed'],
    var_name='budget_type',
    value_name='amount'
)

# Clean up labels
dept_melted['budget_type'] = dept_melted['budget_type'].replace({
    'budget_approved': 'Approved',
    'budget_executed': 'Executed'
})

dept_melted.head()

In [None]:
# TODO: Create grouped bar chart comparing approved vs executed
# Requirements:
# - Use px.bar() with barmode='group'
# - Color by budget_type
# - Add appropriate title

fig = px.bar(
    dept_melted,
    x=___,  # department
    y=___,  # amount
    color=___,  # budget_type
    barmode=___,  # 'group'
    title='Budget Comparison: Approved vs Executed by Department',
    color_discrete_map={'Approved': '#636EFA', 'Executed': '#00CC96'}
)

fig.update_layout(
    xaxis_title='Department',
    yaxis_title='Budget ($)',
    legend_title='Budget Type',
    xaxis_tickangle=45
)

fig.show()

## Task 1.3: Line Chart - Budget Trend Over Time

In [None]:
# Aggregate by year
year_totals = df.groupby('year').agg({
    'budget_approved': 'sum',
    'budget_executed': 'sum',
    'project_count': 'sum'
}).reset_index()

year_totals['execution_rate'] = (year_totals['budget_executed'] / year_totals['budget_approved'] * 100).round(1)

year_totals

In [None]:
# TODO: Create line chart showing budget trend
# Requirements:
# - Use px.line() to show both approved and executed over time
# - Add markers to the lines
# - Add hover information

# First, reshape data
year_melted = year_totals.melt(
    id_vars=['year', 'execution_rate'],
    value_vars=['budget_approved', 'budget_executed'],
    var_name='budget_type',
    value_name='amount'
)
year_melted['budget_type'] = year_melted['budget_type'].replace({
    'budget_approved': 'Approved',
    'budget_executed': 'Executed'
})

fig = px.line(
    year_melted,
    x=___,  # year
    y=___,  # amount
    color=___,  # budget_type
    markers=___,  # True to show markers
    title='Budget Trend Over Time',
    hover_data=['execution_rate']
)

fig.update_layout(
    xaxis_title='Year',
    yaxis_title='Budget ($)',
    legend_title='Budget Type'
)

fig.show()

## Task 1.4: Pie Chart - Budget Distribution by Category

In [None]:
# Aggregate by category
cat_totals = df.groupby('category').agg({
    'budget_executed': 'sum'
}).reset_index()

cat_totals['percentage'] = (cat_totals['budget_executed'] / cat_totals['budget_executed'].sum() * 100).round(1)

cat_totals

In [None]:
# TODO: Create pie chart showing budget distribution by category
# Requirements:
# - Use px.pie()
# - Show percentages in labels
# - Add hover information with actual values

fig = px.pie(
    cat_totals,
    values=___,  # budget_executed
    names=___,  # category
    title='Budget Distribution by Category',
    hole=0.3  # Creates a donut chart (optional)
)

fig.update_traces(
    textposition='inside',
    textinfo='percent+label'
)

fig.show()

---

# Part 2: Advanced Plotly Features (20 minutes)

---

## Task 2.1: Scatter Plot with Size and Color Encoding

In [None]:
# TODO: Create scatter plot with multiple encodings
# Requirements:
# - x = budget_approved
# - y = budget_executed
# - color = department
# - size = project_count
# - hover_name = category
# - Add a reference line (y = x) to show perfect execution

fig = px.scatter(
    df,
    x=___,
    y=___,
    color=___,
    size=___,
    hover_name=___,
    hover_data=['year', 'execution_rate'],
    title='Budget Execution: Approved vs Executed',
    opacity=0.7
)

# Add reference line (y = x)
max_val = max(df['budget_approved'].max(), df['budget_executed'].max())
fig.add_trace(
    go.Scatter(
        x=[0, max_val],
        y=[0, max_val],
        mode='lines',
        name='Perfect Execution',
        line=dict(dash='dash', color='gray')
    )
)

fig.update_layout(
    xaxis_title='Approved Budget ($)',
    yaxis_title='Executed Budget ($)',
    legend_title='Department'
)

fig.show()

## Task 2.2: Faceted Charts

In [None]:
# TODO: Create faceted bar chart by year
# Requirements:
# - Use facet_col to create separate panels by year
# - Show budget by department for each year

fig = px.bar(
    df.groupby(['department', 'year'])['budget_executed'].sum().reset_index(),
    x='department',
    y='budget_executed',
    facet_col=___,  # 'year'
    title='Budget Execution by Department (by Year)',
    color='budget_executed',
    color_continuous_scale='Viridis'
)

fig.update_layout(height=400)
fig.update_xaxes(tickangle=45)

fig.show()

## Task 2.3: Heatmap
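
The pivot in this task averages the row-level `execution_rate` values, which weights every category equally regardless of its budget. That can differ from the overall rate (total executed divided by total approved). A small worked sketch with hypothetical numbers shows the gap:

```python
import pandas as pd

# Hypothetical department with one large and one small budget line
toy_rate = pd.DataFrame({
    "department": ["A", "A"],
    "budget_approved": [1000, 100],
    "budget_executed": [900, 50],
})
toy_rate["execution_rate"] = (
    toy_rate["budget_executed"] / toy_rate["budget_approved"] * 100
)  # 90% and 50%

# Unweighted: simple mean of the two rates
mean_of_rates = toy_rate.groupby("department")["execution_rate"].mean()

# Weighted: total executed over total approved
sums = toy_rate.groupby("department")[["budget_approved", "budget_executed"]].sum()
overall_rate = sums["budget_executed"] / sums["budget_approved"] * 100

print(round(mean_of_rates["A"], 1))  # 70.0
print(round(overall_rate["A"], 1))   # 86.4
```

Either measure can be reasonable for a heatmap; the point is to label the chart so readers know which one they are looking at.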

In [None]:
# Create pivot table for heatmap (simple mean of the per-category execution rates)
heatmap_data = df.groupby(['department', 'year'])['execution_rate'].mean().reset_index()
heatmap_pivot = heatmap_data.pivot(index='department', columns='year', values='execution_rate')

heatmap_pivot

In [None]:
# TODO: Create heatmap showing execution rate by department and year
# Requirements:
# - Use px.imshow() for heatmap
# - Use RdYlGn color scale (red=low, green=high)
# - Show values in cells

fig = px.imshow(
    heatmap_pivot,
    text_auto=___,  # True to show values, '.1f' for 1 decimal
    color_continuous_scale=___,  # 'RdYlGn'
    aspect='auto',
    title='Execution Rate by Department and Year (%)',
    labels={'color': 'Execution Rate %'}
)

fig.update_layout(
    xaxis_title='Year',
    yaxis_title='Department'
)

fig.show()

---

# Part 3: Customization and Styling (10 minutes)

---

## Task 3.1: Custom Color Palette and Theme

In [None]:
# Define custom color palette
custom_colors = {
    'Education': '#667eea',
    'Health': '#00b894',
    'Infrastructure': '#fdcb6e',
    'Security': '#e17055',
    'Social Services': '#74b9ff',
    'Environment': '#00cec9'
}

# TODO: Create bar chart with custom colors
fig = px.bar(
    dept_totals.sort_values('budget_executed', ascending=False),
    x='department',
    y='budget_executed',
    color='department',
    color_discrete_map=___,  # custom_colors
    title='Budget by Department (Custom Colors)'
)

# Remove legend (redundant with x-axis)
fig.update_layout(showlegend=False)

fig.show()

## Task 3.2: Adding Annotations

In [None]:
# Create line chart with annotations
fig = px.line(
    year_totals,
    x='year',
    y='budget_executed',
    markers=True,
    title='Budget Trend with Annotations'
)

# Find the year with highest execution rate
best_year = year_totals.loc[year_totals['execution_rate'].idxmax()]

# Add annotation
fig.add_annotation(
    x=best_year['year'],
    y=best_year['budget_executed'],
    text=f"Best execution rate: {best_year['execution_rate']}%",
    showarrow=True,
    arrowhead=2,
    ax=50,
    ay=-50
)

fig.show()

---

# Part 4: Dashboard-Ready Charts (10 minutes)

---

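Create the four charts you'll use in your Streamlit dashboard. Each function takes a DataFrame and returns a Plotly figure, so the same function can be reused on filtered data.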
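
Because each chart function accepts a DataFrame, a dashboard can filter the data first and pass the result straight in. The helper below is a hypothetical sketch of that filtering step (`filter_budget` is not part of the workshop files); in a Streamlit app the selections would typically come from widgets such as `st.multiselect`, and the resulting figure would be displayed with `st.plotly_chart(fig)`.

```python
import pandas as pd

def filter_budget(data, departments=None, years=None):
    """Return rows matching the selected departments and years.

    Hypothetical helper: None or an empty list means "no filter".
    """
    mask = pd.Series(True, index=data.index)
    if departments:
        mask &= data["department"].isin(departments)
    if years:
        mask &= data["year"].isin(years)
    return data[mask]

# Usage on a small hypothetical table
toy_d = pd.DataFrame({
    "department": ["A", "B", "A"],
    "year": [2021, 2021, 2022],
    "budget_executed": [1, 2, 3],
})
subset = filter_budget(toy_d, departments=["A"], years=[2022])
print(subset["budget_executed"].tolist())  # [3]
```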

In [None]:
# CHART 1: Department comparison (approved vs executed)

def create_dept_comparison_chart(data):
    """
    Create grouped bar chart comparing approved vs executed budget by department.
    """
    dept_data = data.groupby('department').agg({
        'budget_approved': 'sum',
        'budget_executed': 'sum'
    }).reset_index()
    
    dept_melted = dept_data.melt(
        id_vars=['department'],
        value_vars=['budget_approved', 'budget_executed'],
        var_name='type',
        value_name='amount'
    )
    dept_melted['type'] = dept_melted['type'].replace({
        'budget_approved': 'Approved',
        'budget_executed': 'Executed'
    })
    
    fig = px.bar(
        dept_melted,
        x='department',
        y='amount',
        color='type',
        barmode='group',
        title='Budget by Department: Approved vs Executed',
        color_discrete_map={'Approved': '#636EFA', 'Executed': '#00CC96'}
    )
    
    fig.update_layout(
        xaxis_title='',
        yaxis_title='Budget ($)',
        legend_title='',
        xaxis_tickangle=45
    )
    
    return fig

# Test it
create_dept_comparison_chart(df).show()

In [None]:
# CHART 2: Year trend line chart

def create_year_trend_chart(data):
    """
    Create line chart showing budget trend over years.
    """
    # TODO: Complete this function
    # Aggregate by year, reshape data, create line chart
    
    year_data = data.groupby('year').agg({
        'budget_approved': 'sum',
        'budget_executed': 'sum'
    }).reset_index()
    
    year_melted = year_data.melt(
        id_vars=['year'],
        value_vars=['budget_approved', 'budget_executed'],
        var_name='type',
        value_name='amount'
    )
    year_melted['type'] = year_melted['type'].replace({
        'budget_approved': 'Approved',
        'budget_executed': 'Executed'
    })
    
    fig = px.line(
        ___,  # year_melted
        x=___,  # 'year'
        y=___,  # 'amount'
        color=___,  # 'type'
        markers=True,
        title='Budget Trend Over Time'
    )
    
    fig.update_layout(
        xaxis_title='Year',
        yaxis_title='Budget ($)',
        legend_title=''
    )
    
    return fig

# Test it
create_year_trend_chart(df).show()

In [None]:
# CHART 3: Category distribution pie chart

def create_category_pie_chart(data):
    """
    Create pie chart showing budget distribution by category.
    """
    # TODO: Complete this function
    
    cat_data = data.groupby('category')['budget_executed'].sum().reset_index()
    
    fig = px.pie(
        cat_data,
        values=___,  # 'budget_executed'
        names=___,  # 'category'
        title='Budget Distribution by Category',
        hole=0.3
    )
    
    fig.update_traces(textposition='inside', textinfo='percent+label')
    
    return fig

# Test it
create_category_pie_chart(df).show()

In [None]:
# CHART 4: Scatter plot (approved vs executed)

def create_execution_scatter(data):
    """
    Create scatter plot of approved vs executed budget.
    """
    # TODO: Complete this function
    
    fig = px.scatter(
        data,
        x=___,  # 'budget_approved'
        y=___,  # 'budget_executed'
        color=___,  # 'department'
        size=___,  # 'execution_rate'
        hover_name='category',
        hover_data=['year', 'project_count'],
        title='Budget Execution Analysis',
        opacity=0.7
    )
    
    # Add reference line
    max_val = max(data['budget_approved'].max(), data['budget_executed'].max())
    fig.add_trace(
        go.Scatter(
            x=[0, max_val],
            y=[0, max_val],
            mode='lines',
            name='Perfect Execution',
            line=dict(dash='dash', color='gray')
        )
    )
    
    fig.update_layout(
        xaxis_title='Approved Budget ($)',
        yaxis_title='Executed Budget ($)'
    )
    
    return fig

# Test it
create_execution_scatter(df).show()

---

# Summary

---

## Plotly Express Quick Reference

### Chart Types
```python
px.bar()      # Bar charts (vertical/horizontal)
px.line()     # Line charts
px.scatter()  # Scatter plots
px.pie()      # Pie/donut charts
px.imshow()   # Heatmaps
px.histogram()  # Histograms
px.box()      # Box plots
```

### Common Parameters
```python
x, y          # Axis columns
color         # Color by category
size          # Size by value (scatter)
hover_name    # Main hover label
hover_data    # Additional hover info
facet_col     # Split into columns
facet_row     # Split into rows
animation_frame  # Add slider/animation
title         # Chart title
```

### Layout Customization
```python
fig.update_layout(
    title='Title',
    xaxis_title='X Label',
    yaxis_title='Y Label',
    showlegend=True,
    height=500,
    width=800
)
```

---

**Next:** Build your Streamlit dashboard using `streamlit_app_starter.py`


---

## Checklist Before Submission

- [ ] All code cells execute without errors
- [ ] Created horizontal bar chart (Task 1.1)
- [ ] Created grouped bar chart (Task 1.2)
- [ ] Created line chart (Task 1.3)
- [ ] Created pie chart (Task 1.4)
- [ ] Created scatter plot with encodings (Task 2.1)
- [ ] Created faceted chart (Task 2.2)
- [ ] Created heatmap (Task 2.3)
- [ ] All 4 dashboard functions completed (Part 4)

---

*End of Workshop*