# Introduction to Visualization with PanelBox

This notebook introduces the PanelBox visualization API. You will learn how to:
- Create basic and panel-specific charts
- Apply themes and customize appearance
- Export charts to multiple formats

**Prerequisites:** Tutorial 01 (Panel Data Structures)  
**Duration:** ~60–80 minutes

In [None]:
import sys
import os
sys.path.insert(0, '../../../')  # make panelbox importable

import numpy as np
import pandas as pd
import plotly.io as pio

# Set Plotly renderer for Jupyter
pio.renderers.default = 'notebook'

# PanelBox version
import panelbox
print(f'PanelBox version: {panelbox.__version__}')

# PanelBox visualization
from panelbox.visualization import (
    ChartFactory,
    ChartRegistry,
)
from panelbox.visualization.api import (
    create_panel_charts,
    create_residual_diagnostics,
    create_comparison_charts,
    export_charts,
    export_charts_multiple_formats,
)
from panelbox.visualization.themes import Theme, get_theme, list_themes, register_theme

print('Setup complete.')

## 1. Visualization API Overview

PanelBox provides a layered visualization system:

| Layer | Class/Module | Purpose |
|---|---|---|
| High-level API | `panelbox.visualization.api` | Convenience functions for common workflows |
| Factory | `ChartFactory` | Central chart creation with theme management |
| Registry | `ChartRegistry` | Declarative chart registration via `@register_chart` |
| Chart classes | `plotly/`, `quantile/`, `spatial_plots.py`, `var_plots.py` | Individual chart implementations |

The typical workflow is:
```python
# Option A — convenience API (recommended)
charts = create_residual_diagnostics(model_results, theme='academic')

# Option B — direct factory
chart = ChartFactory.create('qq_plot', data=data, theme='professional')
```
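To make the layering concrete, here is a minimal, self-contained sketch of the registry-plus-factory pattern. It is not PanelBox's implementation: `SketchRegistry`, `SketchFactory`, and `DemoQQ` are hypothetical names invented for illustration, and the real `ChartRegistry` and `ChartFactory` may work differently.

```python
from typing import Callable, Dict, List, Type


class SketchRegistry:
    """Hypothetical registry mapping a chart name to the class that builds it."""

    _charts: Dict[str, Type] = {}

    @classmethod
    def register(cls, name: str) -> Callable[[Type], Type]:
        # Used as a class decorator: stores the class under `name`, returns it unchanged.
        def decorator(chart_cls: Type) -> Type:
            cls._charts[name] = chart_cls
            return chart_cls
        return decorator

    @classmethod
    def get(cls, name: str) -> Type:
        if name not in cls._charts:
            raise KeyError(f"unknown chart type: {name!r}")
        return cls._charts[name]

    @classmethod
    def list_charts(cls) -> List[str]:
        return sorted(cls._charts)


@SketchRegistry.register("demo_qq")
class DemoQQ:
    """Hypothetical chart class; it only stores its inputs."""

    def __init__(self, theme: str) -> None:
        self.theme = theme
        self.data = None

    def create(self, data: dict) -> "DemoQQ":
        self.data = data
        return self


class SketchFactory:
    """Hypothetical factory: looks up the class by name, then builds the chart."""

    @staticmethod
    def create(name: str, data: dict, theme: str = "professional") -> DemoQQ:
        chart_cls = SketchRegistry.get(name)
        return chart_cls(theme=theme).create(data)


chart = SketchFactory.create("demo_qq", data={"residuals": [0.1, -0.2]}, theme="academic")
print(SketchRegistry.list_charts(), chart.theme)
```

The point of the pattern is that callers only need a chart name and a theme name; adding a new chart type means decorating one class, with no change to the factory.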

In [None]:
# Show all registered chart types
registered = ChartRegistry.list_charts()
print(f'Total registered chart types: {len(registered)}')
for name in sorted(registered):
    print(f'  • {name}')

## 2. Basic Interactive Charts with Plotly

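PanelBox wraps Plotly to provide consistent styling and easy theme switching. Before using the PanelBox chart classes, this section builds a few figures with plain Plotly (`plotly.express` and `plotly.graph_objects`) so the underlying objects are familiar.
We start by loading the classic **Grunfeld dataset** (10 firms × 20 years, balanced panel).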

In [None]:
# Load Grunfeld dataset (10 firms, 20 years)
from panelbox.datasets import load_grunfeld

data = load_grunfeld()
print(data.head())
print(f'\nShape: {data.shape}')
print(f'Firms: {data["firm"].nunique()}')
print(f'Years: {data["year"].nunique()}')
print(f'Variables: {data.columns.tolist()}')
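The "balanced panel" claim can be checked directly with pandas. The sketch below uses a small toy frame rather than the Grunfeld data, and the helper names `presence_matrix` and `is_balanced` are illustrative, not part of PanelBox.

```python
import pandas as pd


def presence_matrix(df: pd.DataFrame, entity: str, time: str) -> pd.DataFrame:
    """Boolean entity-by-time grid: True where at least one observation exists."""
    return pd.crosstab(df[entity], df[time]).astype(bool)


def is_balanced(df: pd.DataFrame, entity: str, time: str) -> bool:
    """A panel is balanced when every (entity, time) cell is observed."""
    return bool(presence_matrix(df, entity, time).to_numpy().all())


# Toy panel: firm 2 is missing the year 2001.
toy = pd.DataFrame({
    "firm": [1, 1, 1, 2, 2],
    "year": [2000, 2001, 2002, 2000, 2002],
})
grid = presence_matrix(toy, "firm", "year")
print(grid)
print("Balanced:", is_balanced(toy, "firm", "year"))

# Adding the missing observation makes the panel balanced.
full = pd.concat([toy, pd.DataFrame({"firm": [2], "year": [2001]})], ignore_index=True)
print("Balanced after filling the gap:", is_balanced(full, "firm", "year"))
```

On the loaded dataset the equivalent call would be `is_balanced(data, 'firm', 'year')`. Note that this check ignores duplicate rows: a cell observed twice still counts as present.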

In [None]:
import plotly.express as px

# Scatter plot: Investment vs. Firm Value, colored by firm
fig_scatter = px.scatter(
    data,
    x='value',
    y='invest',
    color=data['firm'].astype(str),
    title='Investment vs. Firm Value (Grunfeld Data)',
    labels={'value': 'Firm Value', 'invest': 'Investment', 'color': 'Firm'},
    template='plotly_white'
)
fig_scatter.update_layout(
    font=dict(family='Arial', size=13),
    title_font_size=16,
)
fig_scatter.show()

In [None]:
import plotly.graph_objects as go

# Line chart: average investment over time
firm_avg = data.groupby('year')['invest'].mean().reset_index()

fig_line = go.Figure()
fig_line.add_trace(go.Scatter(
    x=firm_avg['year'],
    y=firm_avg['invest'],
    mode='lines+markers',
    name='Average Investment',
    line=dict(color='#1f77b4', width=2),
    marker=dict(size=6),
))
fig_line.update_layout(
    title='Average Investment Over Time',
    xaxis_title='Year',
    yaxis_title='Average Investment',
    template='plotly_white',
    font=dict(family='Arial', size=13),
)
fig_line.show()

In [None]:
# Bar chart: average investment by firm
firm_invest = data.groupby('firm')['invest'].mean().reset_index()

fig_bar = px.bar(
    firm_invest,
    x=firm_invest['firm'].astype(str),
    y='invest',
    title='Average Investment by Firm',
    labels={'x': 'Firm', 'invest': 'Average Investment'},
    template='plotly_white',
    color_discrete_sequence=['#1f77b4'],
)
fig_bar.update_layout(
    xaxis_title='Firm',
    font=dict(family='Arial', size=13),
)
fig_bar.show()

## 3. Panel-Specific Visualizations

PanelBox provides charts designed specifically for panel data analysis:
- **Entity Effects Plot**: estimated fixed effects per entity (requires estimated model)
- **Time Effects Plot**: temporal fixed effects over years
- **Between-Within Decomposition**: how much variance is cross-sectional vs. temporal
- **Panel Structure Plot**: shows which (entity, time) cells have observations

The entity effects chart needs an estimated model, so we fit a Fixed Effects model first. The between-within decomposition works directly on the data.

In [None]:
from panelbox.models.static import FixedEffects

np.random.seed(42)

# Estimate a Fixed Effects model on the Grunfeld data
model_fe = FixedEffects(
    formula='invest ~ value + capital',
    data=data,
    entity_col='firm',
    time_col='year',
    entity_effects=True,
)
results_fe = model_fe.fit()
print('Model estimated.')
print(f'Coefficients:\n{results_fe.params}')
print(f'R²: {results_fe.rsquared:.4f}')

In [None]:
# Entity effects chart, built here from mean residuals per entity
# (PanelResults exposes .resid and .entity_index; compute αᵢ = mean(εᵢ) per firm).
# Caveat: if .resid holds within-transformed residuals, these per-firm means are
# zero by construction and the chart will be flat.
from panelbox.visualization.api import create_entity_effects_plot

resid_df = pd.DataFrame({
    'firm': results_fe.entity_index,
    'resid': results_fe.resid,
})
entity_effects_data = (
    resid_df.groupby('firm')['resid']
    .mean()
    .reset_index()
    .rename(columns={'firm': 'entity_id', 'resid': 'effect'})
)

# Pass pre-computed dict directly — create_entity_effects_plot accepts a dict
effects_dict = {
    'entity_id': entity_effects_data['entity_id'].tolist(),
    'effect': entity_effects_data['effect'].tolist(),
}

entity_chart = create_entity_effects_plot(panel_results=effects_dict, theme='academic')
entity_chart.figure.show()
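For comparison, the textbook within estimator recovers entity effects as $\hat{\alpha}_i = \bar{y}_i - \hat{\beta}\,\bar{x}_i$ rather than from residual means. The sketch below works this through on simulated data with plain NumPy and pandas. It makes no PanelBox calls, every name in it is illustrative, and it says nothing about how `FixedEffects` computes or stores its residuals.

```python
import numpy as np
import pandas as pd

# Simulated panel: 4 firms, 6 years, one regressor, known entity effects.
rng = np.random.default_rng(0)
n_firms, n_years = 4, 6
firm = np.repeat(np.arange(n_firms), n_years)
true_alpha = np.array([-2.0, 0.0, 1.0, 3.0])
x = rng.normal(size=firm.size)
y = true_alpha[firm] + 1.5 * x + rng.normal(scale=0.1, size=firm.size)
toy = pd.DataFrame({"firm": firm, "x": x, "y": y})

# Within transformation: subtract each firm's own mean from x and y.
firm_means = toy.groupby("firm")[["x", "y"]].transform("mean")
demeaned = toy[["x", "y"]] - firm_means

# OLS slope on the demeaned data (single regressor, no intercept).
beta_hat = (demeaned["x"] * demeaned["y"]).sum() / (demeaned["x"] ** 2).sum()

# Recover entity effects: alpha_i = mean(y_i) - beta * mean(x_i).
means = toy.groupby("firm")[["x", "y"]].mean()
alpha_hat = means["y"] - beta_hat * means["x"]

# Within residuals average to (numerically) zero inside every firm.
within_resid = demeaned["y"] - beta_hat * demeaned["x"]
resid_means = within_resid.groupby(toy["firm"]).mean()

print("beta_hat:", round(beta_hat, 3))
print("alpha_hat:", alpha_hat.round(2).tolist())
print("max |mean within residual|:", float(resid_means.abs().max()))
```

If the entity effects chart looks flat, comparing it against effects computed this way is a quick check on which kind of residual the results object exposes.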

In [None]:
# Between-within variance decomposition
# calculate_between_within requires a DataFrame with MultiIndex (entity, time)
from panelbox.visualization.api import create_between_within_plot

data_mi = data.set_index(['firm', 'year'])  # create MultiIndex

bw_chart = create_between_within_plot(
    panel_data=data_mi,
    variables=['invest', 'value', 'capital'],
    theme='academic',
    style='stacked',
)
# Returns a BaseChart; display via .figure
bw_chart.figure.show()
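The decomposition behind this chart can be sketched by splitting the total sum of squares into a between-entity part and a within-entity part. This is a minimal illustration in pandas with a hypothetical helper `between_within_shares`; PanelBox's own calculation may use a different normalization (for example variances or standard deviations).

```python
import pandas as pd


def between_within_shares(df: pd.DataFrame, entity: str, variable: str) -> dict:
    """Split the total sum of squares of `variable` into between and within shares."""
    x = df[variable].astype(float)
    entity_mean = x.groupby(df[entity]).transform("mean")
    grand_mean = x.mean()
    # Between: how far each entity's mean sits from the grand mean.
    between_ss = ((entity_mean - grand_mean) ** 2).sum()
    # Within: how far each observation sits from its own entity's mean.
    within_ss = ((x - entity_mean) ** 2).sum()
    total_ss = between_ss + within_ss
    return {"between": between_ss / total_ss, "within": within_ss / total_ss}


# Two firms with very different levels but small year-to-year movement.
toy = pd.DataFrame({
    "firm": [1, 1, 1, 2, 2, 2],
    "year": [2000, 2001, 2002] * 2,
    "invest": [10.0, 11.0, 12.0, 50.0, 51.0, 52.0],
})
shares = between_within_shares(toy, "firm", "invest")
print(shares)  # between share is 2400 / 2404 here, so between dominates
```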

**Interpretation:**
- **High "between" share** → most variation across firms → cross-section dimension dominates
- **High "within" share** → most variation within firms → time dimension dominates

## 4. Themes and Customization

PanelBox ships with three built-in themes:

| Theme | Use case |
|---|---|
| `academic` | Academic papers, 300 DPI |
| `professional` | Business reports |
| `presentation` | Slide decks, high contrast |

Use `list_themes()` to see all available themes and `get_theme()` to load one.

In [None]:
# List available themes
print('Available themes:', list_themes())

# Inspect academic theme
academic = get_theme('academic')
print(f'\nTheme name: {academic.name}')
print(f'Colors: {academic.color_scheme[:5]}')
print(f'Font: {academic.font_config}')
print(f'Plotly template: {academic.plotly_template}')

In [None]:
# Compare Q-Q plots across all three built-in themes
# create_residual_diagnostics returns BaseChart objects; display via .figure
diag_academic = create_residual_diagnostics(results_fe, theme='academic')
diag_professional = create_residual_diagnostics(results_fe, theme='professional')
diag_presentation = create_residual_diagnostics(results_fe, theme='presentation')

print('Theme: academic')
diag_academic['qq_plot'].figure.show()

print('Theme: professional')
diag_professional['qq_plot'].figure.show()

print('Theme: presentation')
diag_presentation['qq_plot'].figure.show()

In [None]:
# register_theme takes a single Theme object; the theme's .name is its registry key
custom_theme = Theme(
    name='custom_demo',
    color_scheme=['#2E86AB', '#A23B72', '#F18F01', '#C73E1D', '#3B1F2B'],
    font_config={'family': 'Arial', 'size': 13, 'color': '#222222'},
    layout_config={'margin': {'l': 60, 'r': 40, 't': 60, 'b': 60}},
    plotly_template='plotly_white',
)
register_theme(custom_theme)  # registers under 'custom_demo'

print('Available themes after registration:', list_themes())

# Use the custom theme by name
diag_custom = create_residual_diagnostics(results_fe, theme='custom_demo')
diag_custom['qq_plot'].figure.show()
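To see what the theme fields amount to on the Plotly side, the sketch below turns a theme-like object into a dictionary of Plotly layout settings. `SketchTheme` and `theme_to_layout` are hypothetical stand-ins that mirror the field names used above; how PanelBox actually applies a `Theme` is not shown in this notebook and may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SketchTheme:
    """Hypothetical stand-in mirroring the Theme fields used in this notebook."""
    name: str
    color_scheme: List[str]
    font_config: Dict[str, Any] = field(default_factory=dict)
    layout_config: Dict[str, Any] = field(default_factory=dict)
    plotly_template: str = "plotly_white"


def theme_to_layout(theme: SketchTheme) -> Dict[str, Any]:
    """Build keyword arguments suitable for a Plotly figure's update_layout()."""
    layout = dict(theme.layout_config)           # e.g. margin settings
    layout["template"] = theme.plotly_template   # base Plotly template
    layout["colorway"] = list(theme.color_scheme)  # default trace color cycle
    layout["font"] = dict(theme.font_config)     # global font settings
    return layout


demo = SketchTheme(
    name="custom_demo",
    color_scheme=["#2E86AB", "#A23B72", "#F18F01"],
    font_config={"family": "Arial", "size": 13, "color": "#222222"},
    layout_config={"margin": {"l": 60, "r": 40, "t": 60, "b": 60}},
)
layout = theme_to_layout(demo)
print(layout)
# With Plotly installed, this could be applied as: fig.update_layout(**layout)
```

`template`, `colorway`, `font`, and `margin` are standard Plotly layout properties, so the resulting dictionary also works on figures created outside PanelBox.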

## 5. Exporting Charts

PanelBox supports exporting charts to:
- **PNG / JPEG / WebP** — raster images for web and emails
- **SVG / PDF** — vector formats for publications
- **HTML** — interactive, self-contained files

> **Note:** Static image export requires `kaleido`:
> ```
> pip install kaleido
> ```

In [None]:
# Create output directories
os.makedirs('../outputs/charts/png', exist_ok=True)
os.makedirs('../outputs/charts/svg', exist_ok=True)
os.makedirs('../outputs/charts/pdf', exist_ok=True)

# The chart object returned by create_residual_diagnostics is a BaseChart (PlotlyChartBase)
# Use .save_image() for static formats; .figure.write_html() for interactive HTML
qq_chart = diag_academic['qq_plot']

try:
    # Export PNG via .save_image() (requires kaleido)
    qq_chart.save_image('../outputs/charts/png/01_qq_plot.png', width=900, height=600, scale=1)
    print('PNG exported: ../outputs/charts/png/01_qq_plot.png')

    # Export SVG
    qq_chart.save_image('../outputs/charts/svg/01_qq_plot.svg', width=900, height=600)
    print('SVG exported: ../outputs/charts/svg/01_qq_plot.svg')

    # Export PDF
    qq_chart.save_image('../outputs/charts/pdf/01_qq_plot.pdf', width=900, height=600)
    print('PDF exported: ../outputs/charts/pdf/01_qq_plot.pdf')
except Exception as e:
    print(f'Static export skipped (kaleido may not be installed): {e}')

# Export interactive HTML — use the underlying Plotly figure's write_html()
qq_chart.figure.write_html('../outputs/charts/01_qq_plot.html', include_plotlyjs=True)
print('HTML exported: ../outputs/charts/01_qq_plot.html')
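The `try`/`except` above reports every failure as a possible missing `kaleido`. A more explicit option is to check for the package before exporting and drop the static formats if it is absent. The sketch below uses only the standard library; `kaleido_available` and `usable_formats` are hypothetical helpers, not PanelBox functions.

```python
import importlib.util
from typing import Iterable, List

# Formats that need a static rendering engine; HTML does not.
STATIC_FORMATS = {"png", "jpeg", "webp", "svg", "pdf"}


def kaleido_available() -> bool:
    """True if the kaleido package can be found, without importing it."""
    return importlib.util.find_spec("kaleido") is not None


def usable_formats(requested: Iterable[str], has_kaleido: bool) -> List[str]:
    """Keep every requested format that can be written in this environment."""
    return [
        fmt for fmt in requested
        if has_kaleido or fmt.lower() not in STATIC_FORMATS
    ]


formats = usable_formats(["png", "svg", "html"], kaleido_available())
print("Formats that can be exported here:", formats)
```

Checking up front keeps genuine export errors (bad path, invalid size) visible instead of folding them into the "kaleido may not be installed" message.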

In [None]:
# Batch export with export_charts()
# export_charts() expects BaseChart objects (which have .save_image())
# The diagnostic charts from create_residual_diagnostics are BaseChart instances
charts_to_export = {
    'qq_plot': diag_academic['qq_plot'],
    'residual_vs_fitted': diag_academic['residual_vs_fitted'],
    'scale_location': diag_academic['scale_location'],
}

try:
    exported_paths = export_charts(
        charts=charts_to_export,
        output_dir='../outputs/charts/png',
        format='png',
        prefix='01_batch_',
        width=900,
        height=600,
        scale=1,
    )
    print('Batch export complete:')
    for name, path in exported_paths.items():
        print(f'  {name}: {path}')
except Exception as e:
    print(f'Batch export skipped (kaleido may not be installed): {e}')
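Before running a batch export it can help to preview which files would be written. The sketch below derives one path per chart name from an output directory, a prefix, and a format. The naming scheme (`prefix` + chart name + extension) is an assumption made for illustration; the file names that `export_charts()` actually produces may differ, and `plan_export_paths` is a hypothetical helper.

```python
from pathlib import Path
from typing import Dict, Iterable


def plan_export_paths(
    names: Iterable[str],
    output_dir: str,
    fmt: str,
    prefix: str = "",
) -> Dict[str, Path]:
    """Map each chart name to the file path it would be written to (assumed scheme)."""
    out = Path(output_dir)
    ext = fmt.lower().lstrip(".")
    return {name: out / f"{prefix}{name}.{ext}" for name in names}


paths = plan_export_paths(
    ["qq_plot", "residual_vs_fitted", "scale_location"],
    output_dir="../outputs/charts/png",
    fmt="png",
    prefix="01_batch_",
)
for name, path in paths.items():
    print(f"{name}: {path}")
```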

## 6. Complete Example: Exploratory Data Analysis of a Panel

We now combine everything learned to perform a full EDA on the Grunfeld dataset.

In [None]:
import plotly.graph_objects as go
from panelbox.visualization.plotly.correlation import CorrelationHeatmapChart
from panelbox.visualization.api import create_panel_structure_plot

os.makedirs('../outputs/charts/png', exist_ok=True)

# --- 1. Investment trajectories for each firm over time ---
fig_traj = go.Figure()
for firm_id in sorted(data['firm'].unique()):
    firm_data = data[data['firm'] == firm_id].sort_values('year')
    fig_traj.add_trace(go.Scatter(
        x=firm_data['year'],
        y=firm_data['invest'],
        mode='lines+markers',
        name=f'Firm {firm_id}',
        marker=dict(size=4),
    ))
fig_traj.update_layout(
    title='Investment Trajectories by Firm',
    xaxis_title='Year',
    yaxis_title='Investment',
    template='plotly_white',
    font=dict(family='Arial', size=12),
)
fig_traj.show()

# --- 2. Panel structure heatmap ---
# create_panel_structure_plot needs DataFrame with MultiIndex (entity, time)
data_mi = data.set_index(['firm', 'year'])
structure_chart = create_panel_structure_plot(data_mi, theme='academic')
structure_chart.figure.show()

try:
    structure_chart.save_image('../outputs/charts/png/01_panel_structure.png',
                               width=900, height=500)
    print('Panel structure chart saved.')
except Exception as e:
    print(f'Static export skipped: {e}')

# --- 3. Correlation heatmap using CorrelationHeatmapChart ---
corr_matrix = data[['invest', 'value', 'capital']].corr()
corr_chart = CorrelationHeatmapChart()
corr_chart.create({'correlation_matrix': corr_matrix})
corr_chart.figure.show()

# --- 4. Residual diagnostics from FE model ---
print('\n--- Residual Diagnostics (Fixed Effects on Grunfeld) ---')
diag = create_residual_diagnostics(results_fe, theme='academic')
print(f'Available diagnostic charts: {list(diag.keys())}')
diag['qq_plot'].figure.show()
diag['residual_vs_fitted'].figure.show()

print('\nEDA complete!')

## Summary

In this notebook you learned:

1. **API architecture**: `ChartFactory`, `ChartRegistry`, and the convenience functions
2. **Available charts**: 35 registered chart types across 9 categories
3. **Basic charts**: scatter, line, and bar charts built directly with Plotly
4. **Panel-specific charts**: entity effects, between-within decomposition, panel structure
5. **Themes**: `academic`, `professional`, `presentation`; how to register a custom theme
6. **Export**: single PNG/SVG/PDF/HTML, batch export with `export_charts()`

**Next:** Notebook 02 — Visual Diagnostics (Q-Q plots, residual analysis, ACF/PACF)