# Automated Report Generation with PanelBox

In many real-world workflows, we need to:
- Produce **consistent, reproducible** analysis reports
- Share results with **non-technical stakeholders** (HTML)
- Submit results to **academic journals** (LaTeX)
- **Automate** monthly/weekly analyses without manual effort

PanelBox's `ReportManager` integrates model results, charts, and validation tests
into professional reports with a single function call.

**Topics:**
1. ReportManager architecture
2. Single-model validation report (HTML)
3. Multi-model comparison report (HTML)
4. Full diagnostic report (charts + tests)
5. Template customization (Jinja2)
6. LaTeX and Markdown export
7. Controlling report file size
8. Complete automated pipeline

**Prerequisites:** Notebooks 01–03; Tutorial 03 and 06  
**Duration:** ~180–200 minutes

In [None]:
import sys
import os
sys.path.insert(0, '../../../')  # make panelbox importable
sys.path.insert(0, '../')       # make utils importable

import numpy as np
import pandas as pd
import plotly.io as pio
pio.renderers.default = 'notebook'

np.random.seed(42)

# PanelBox models
from panelbox.models.static import PooledOLS, FixedEffects, RandomEffects

# Visualization API
from panelbox.visualization.api import create_residual_diagnostics, create_comparison_charts

# Report system
from panelbox.report import ReportManager
from panelbox.report.validation_transformer import ValidationTransformer
from panelbox.validation import ValidationReport
from panelbox.report.exporters.latex_exporter import LaTeXExporter
from panelbox.report.exporters.markdown_exporter import MarkdownExporter

# Local data generators
from utils.data_generators import generate_panel_data, generate_heteroskedastic_panel

# Output directories
os.makedirs('../outputs/charts/png', exist_ok=True)
os.makedirs('../outputs/charts/pdf', exist_ok=True)
os.makedirs('../outputs/reports/html', exist_ok=True)
os.makedirs('../outputs/reports/latex', exist_ok=True)

print('Setup complete.')

## 1. ReportManager Architecture

The PanelBox report system consists of an orchestrator, `ReportManager`, that delegates to three helper components:

```
ReportManager (orchestrator)
│
├── TemplateManager — loads Jinja2 templates from panelbox/templates/
├── AssetManager   — collects CSS/JS assets; inlines or links them
└── CSSManager     — compiles and optionally minifies stylesheets
```

**Report types:**
- `'validation'` → uses `templates/validation/interactive/index.html`
- `'residuals'` → uses `templates/residuals/interactive/index.html`
- `'comparison'` → uses `templates/comparison/interactive/index.html`
- `'master'`     → uses `templates/master/index.html` (multi-section)

**Convenience methods:**
```python
report_mgr = ReportManager()

html = report_mgr.generate_validation_report(validation_data, title='...')
html = report_mgr.generate_comparison_report(comparison_data, title='...')
html = report_mgr.generate_residual_report(residual_data, title='...')

report_mgr.save_report(html, '../outputs/reports/html/report.html')
```

Data preparation uses transformer classes:
```python
from panelbox.report.validation_transformer import ValidationTransformer
val_data = ValidationTransformer(validation_report).transform()
```
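
To make the division of labour concrete, here is a toy sketch of the orchestrator pattern. It is not PanelBox's implementation: the `Toy*` classes are hypothetical, `string.Template` from the standard library stands in for Jinja2, and the asset layer is left out.

```python
from string import Template


class ToyTemplateManager:
    """Holds named templates (a stand-in for loading Jinja2 files from disk)."""

    def __init__(self, templates: dict[str, str]) -> None:
        self._templates = templates

    def get(self, name: str) -> Template:
        return Template(self._templates[name])


class ToyCSSManager:
    """Joins stylesheet fragments into one string."""

    def __init__(self) -> None:
        self._layers: list[str] = []

    def add(self, css: str) -> None:
        self._layers.append(css.strip())

    def compile(self) -> str:
        return "\n".join(self._layers)


class ToyReportManager:
    """Orchestrator: asks the helpers for a template and CSS, then renders."""

    def __init__(self, templates: ToyTemplateManager, css: ToyCSSManager) -> None:
        self.templates = templates
        self.css = css

    def generate_report(self, template: str, context: dict[str, str]) -> str:
        page = self.templates.get(template)
        return page.substitute(css=self.css.compile(), **context)


templates = ToyTemplateManager(
    {"validation": "<html><head><style>$css</style></head>"
                   "<body><h1>$title</h1></body></html>"}
)
css = ToyCSSManager()
css.add("h1 { color: #1A3A5C; }")

toy_mgr = ToyReportManager(templates, css)
toy_html = toy_mgr.generate_report("validation", {"title": "Demo report"})
print(toy_html)
```

The point of the pattern is that callers only talk to the orchestrator; templates and styles can change without touching the calling code.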

In [None]:
# Inspect ReportManager — list available methods
import inspect

report_mgr = ReportManager()

print('=== ReportManager public methods ===')
for name, method in inspect.getmembers(report_mgr, predicate=inspect.ismethod):
    if not name.startswith('_'):
        sig = inspect.signature(method)
        print(f'  {name}{sig}')

# Inspect available templates
import panelbox
templates_dir = os.path.join(os.path.dirname(panelbox.__file__), 'templates')
print(f'\n=== Templates directory: {templates_dir} ===')
for root, dirs, files in os.walk(templates_dir):
    for fname in sorted(files):
        rel = os.path.relpath(os.path.join(root, fname), templates_dir)
        print(f'  • {rel}')

## 2. Single-Model Validation Report

The most common use case: estimate one model, prepare validation data,
and produce a self-contained HTML report.

The report includes:
- Model summary (coefficients, standard errors, p-values, R²)
- Validation test results
- Residual diagnostic charts
- Recommendations based on test outcomes

**Workflow:**
1. Estimate model → `results`
2. Create `ValidationReport` (or use `results.validate()`)
3. Transform via `ValidationTransformer(val_report).transform()`
4. Pass to `report_mgr.generate_validation_report(val_data, ...)`
5. Save with `report_mgr.save_report(html, path)`
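
Step 3 is where test results become plain data that a template can render. The sketch below illustrates the idea with a hypothetical `ToyTestResult` class and `summarize_tests()` helper; it is not the `ValidationTransformer` code. Treating p ≥ 0.05 as a pass is an assumption of the sketch. Only the `total_tests` and `pass_rate_formatted` keys are taken from this notebook; the other keys are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyTestResult:
    name: str
    pvalue: float


def summarize_tests(tests: list[ToyTestResult], alpha: float = 0.05) -> dict:
    """Count tests whose null hypothesis is not rejected at level alpha."""
    total = len(tests)
    passed = sum(t.pvalue >= alpha for t in tests)
    rate = passed / total if total else 0.0
    return {
        "total_tests": total,
        "total_passed": passed,          # hypothetical key
        "total_failed": total - passed,  # hypothetical key
        "pass_rate": rate,               # hypothetical key
        "pass_rate_formatted": f"{rate:.1%}",
    }


# Made-up p-values, for illustration only
toy_tests = [
    ToyTestResult("serial correlation", 0.41),
    ToyTestResult("heteroskedasticity", 0.003),
    ToyTestResult("cross-sectional dependence", 0.27),
    ToyTestResult("specification", 0.08),
]
toy_summary = summarize_tests(toy_tests)
print(toy_summary)
```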

In [None]:
# --- Generate data and estimate Fixed Effects model ---
rng = np.random.default_rng(42)

# Panel: 40 entities, 15 periods (600 obs)
n_ent, n_per = 40, 15
entities = np.repeat(np.arange(1, n_ent + 1), n_per)
periods  = np.tile(np.arange(1, n_per + 1), n_ent)
alpha    = np.repeat(rng.normal(0, 1, n_ent), n_per)  # entity fixed effects
x1 = rng.normal(0, 1, n_ent * n_per)
x2 = rng.normal(0, 1, n_ent * n_per)
y  = alpha + 1.5 * x1 - 0.8 * x2 + rng.normal(0, 1, n_ent * n_per)

df = pd.DataFrame({'entity': entities, 'time': periods,
                   'x1': x1, 'x2': x2, 'y': y})

# Estimate Fixed Effects
model_fe = FixedEffects(
    formula='y ~ x1 + x2',
    data=df,
    entity_col='entity',
    time_col='time',
    entity_effects=True,
)
results_fe = model_fe.fit()

print('Fixed Effects Model estimated.')
print(f'Coefficients:\n{results_fe.params}')
print(f'R²: {results_fe.rsquared:.4f}')
print(f'N:  {results_fe.nobs}')

In [None]:
# --- Run validation tests and transform to template data ---

# Use results.validate() to run actual statistical tests
val_report = results_fe.validate(tests='all', verbose=True)

# Show the validation summary
print(val_report.summary(verbose=False))

# Transform to template-ready dict
transformer = ValidationTransformer(val_report)
val_data = transformer.transform(include_charts=True)

print('\nValidation data keys:', list(val_data.keys()))
print('Summary keys:', list(val_data.get('summary', {}).keys()))
print(f"Total tests: {val_data['summary']['total_tests']}")
print(f"Pass rate:   {val_data['summary']['pass_rate_formatted']}")

In [None]:
# --- Generate Single-Model HTML Report ---

html_validation = report_mgr.generate_validation_report(
    validation_data=val_data,
    title='Fixed Effects Model — Validation Report',
    subtitle='Simulated Panel Data (40 entities × 15 periods)',
    interactive=True,
)

# Save report
output_path = '../outputs/reports/html/04_single_model_report.html'
report_mgr.save_report(html_validation, output_path, overwrite=True)

print(f'Report saved: {output_path}')
print(f'File size:    {os.path.getsize(output_path) / 1024:.1f} KB')

In [None]:
# Display report preview in notebook
from IPython.display import IFrame, display, HTML

display(HTML('<p><strong>Report preview (scroll to see all sections):</strong></p>'))
try:
    display(IFrame(src='../outputs/reports/html/04_single_model_report.html',
                   width='100%', height=500))
except Exception:
    display(HTML(f'<a href="../outputs/reports/html/04_single_model_report.html" target="_blank">'
                 f'Open report in new tab</a>'))

## 3. Multi-Model Comparison Report

When selecting the best specification, comparing multiple models in a single report
is more effective than switching between separate outputs.

The comparison report includes:
- Coefficient comparison chart (side-by-side bars)
- Model fit comparison (R², within-R², adjusted R²)
- Information criteria (AIC, BIC) comparison
- Navigation tabs per model

**Workflow:**
1. Estimate multiple models
2. `create_comparison_charts(results_list, names, theme)` → charts dict
3. Build `comparison_data = {'comparison_charts': charts, 'models_info': [...], ...}`
4. `report_mgr.generate_comparison_report(comparison_data, title='...')`
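
If `comparison_data` carries a "best model" label, it is safer to derive it than to type it in. The sketch below ranks models with the textbook Gaussian formulas AIC = n·ln(RSS/n) + 2k and BIC = n·ln(RSS/n) + k·ln(n), which omit a constant shared by all models. The function name, the RSS values, and the parameter counts are hypothetical, and comparing a GLS-based Random Effects fit through its RSS is a simplification. Whether PanelBox results expose AIC/BIC directly is not assumed here.

```python
import math


def gaussian_aic_bic(rss: float, n_obs: int, n_params: int) -> tuple[float, float]:
    """AIC and BIC for a Gaussian linear model, up to a shared constant."""
    fit_term = n_obs * math.log(rss / n_obs)
    aic = fit_term + 2 * n_params
    bic = fit_term + n_params * math.log(n_obs)
    return aic, bic


# Hypothetical fit statistics: (residual sum of squares, estimated parameters).
# Fixed Effects counts 40 entity intercepts plus 2 slopes.
toy_fits = {
    "Pooled OLS": (1210.0, 3),
    "Fixed Effects": (540.0, 42),
    "Random Effects": (640.0, 4),
}
n_obs = 600

scores = {name: gaussian_aic_bic(rss, n_obs, k) for name, (rss, k) in toy_fits.items()}
best_aic = min(scores, key=lambda name: scores[name][0])
best_bic = min(scores, key=lambda name: scores[name][1])
print(f"best by AIC: {best_aic}; best by BIC: {best_bic}")
```

With these made-up numbers the two criteria disagree, because BIC penalizes the 40 extra intercepts more heavily than AIC does. That is a reason to report both rather than a single winner.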

In [None]:
# --- Estimate OLS, FE, RE on the same dataset ---

model_ols = PooledOLS(
    formula='y ~ x1 + x2',
    data=df,
    entity_col='entity',
    time_col='time',
)
results_ols = model_ols.fit()

model_re = RandomEffects(
    formula='y ~ x1 + x2',
    data=df,
    entity_col='entity',
    time_col='time',
)
results_re = model_re.fit()

print('Models estimated:')
print(f'  OLS  β_x1 = {results_ols.params["x1"]:.4f}')
print(f'  FE   β_x1 = {results_fe.params["x1"]:.4f}')
print(f'  RE   β_x1 = {results_re.params["x1"]:.4f}')
print()
print(f'  OLS  R²   = {results_ols.rsquared:.4f}')
print(f'  FE   R²   = {results_fe.rsquared:.4f}')
print(f'  RE   R²   = {results_re.rsquared:.4f}')

In [None]:
# --- Create comparison charts ---

model_names = ['Pooled OLS', 'Fixed Effects', 'Random Effects']

comparison_charts = create_comparison_charts(
    results_list=[results_ols, results_fe, results_re],
    names=model_names,
    theme='academic',
)

print('Comparison chart types:', list(comparison_charts.keys()))

# Chart objects expose .figure (a Plotly Figure) for display
comparison_charts['coefficients'].figure.show()

In [None]:
# --- Build comparison_data and generate HTML report ---

comparison_data = {
    'comparison_charts': comparison_charts,
    'models_info': [
        {'name': 'Pooled OLS', 'estimator': 'PooledOLS',
         'nobs': results_ols.nobs, 'r_squared': results_ols.rsquared},
        {'name': 'Fixed Effects', 'estimator': 'FixedEffects',
         'nobs': results_fe.nobs, 'r_squared': results_fe.rsquared},
        {'name': 'Random Effects', 'estimator': 'RandomEffects',
         'nobs': results_re.nobs, 'r_squared': results_re.rsquared},
    ],
    # Hard-coded labels for this demo; they are not computed from the fitted models
    'best_model_aic': 'Fixed Effects',
    'best_model_bic': 'Fixed Effects',
}

html_comparison = report_mgr.generate_comparison_report(
    comparison_data=comparison_data,
    title='Panel Model Comparison Report',
    subtitle='Pooled OLS vs. Fixed Effects vs. Random Effects',
    interactive=True,
)

output_path_cmp = '../outputs/reports/html/04_comparison_report.html'
report_mgr.save_report(html_comparison, output_path_cmp, overwrite=True)

print(f'Comparison report saved: {output_path_cmp}')
print(f'File size: {os.path.getsize(output_path_cmp) / 1024:.1f} KB')

## 4. Full Diagnostic Report

The residuals report template includes:
- Model summary
- All residual diagnostic charts (Q-Q, Residuals vs. Fitted, ACF/PACF, etc.)
- Interpretation guide per chart

We use the **heteroskedastic** dataset to demonstrate a report that flags issues.

The `generate_residual_report()` convenience method expects a `residual_data` dict
with the chart HTML already embedded. We build it manually by:
1. Generating diagnostic charts → Plotly figures
2. Converting figures to HTML strings → `fig.to_html(full_html=False, include_plotlyjs='cdn')`
3. Passing chart HTML in the context
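
Before looking at the charts, it helps to know what pattern they should reveal. The standard-library sketch below draws errors whose standard deviation is 0.5·|x1|, as in the dataset generated in the next cell, then compares the error variance above and below the median of |x1|. The median split is an informal check in the spirit of a Goldfeld-Quandt comparison, not one of PanelBox's validation tests, and it uses its own random stream rather than the notebook's data.

```python
import random
import statistics

rng = random.Random(99)

# Error structure: standard deviation 0.5 * |x1|, with x1 ~ N(2, 1)
x = [rng.gauss(2.0, 1.0) for _ in range(500)]
errors = [rng.gauss(0.0, 1.0) * 0.5 * abs(xi) for xi in x]

# Split the errors at the median of |x1| and compare the two variances
cutoff = statistics.median(abs(xi) for xi in x)
low = [e for xi, e in zip(x, errors) if abs(xi) <= cutoff]
high = [e for xi, e in zip(x, errors) if abs(xi) > cutoff]

variance_ratio = statistics.pvariance(high) / statistics.pvariance(low)
print(f"variance ratio (high |x1| / low |x1|): {variance_ratio:.2f}")
```

A ratio well above 1 corresponds to the funnel shape expected in the Residuals vs. Fitted and Scale-Location charts.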

In [None]:
# --- Heteroskedastic panel dataset ---

rng_h = np.random.default_rng(99)

n_ent_h, n_per_h = 50, 10
entities_h = np.repeat(np.arange(1, n_ent_h + 1), n_per_h)
periods_h  = np.tile(np.arange(1, n_per_h + 1), n_ent_h)
alpha_h    = np.repeat(rng_h.normal(0, 0.5, n_ent_h), n_per_h)
x1_h       = rng_h.normal(2, 1, n_ent_h * n_per_h)

# Heteroskedastic errors: standard deviation proportional to |x1| (variance ∝ x1²)
sigma_h    = 0.5 * np.abs(x1_h)
eps_h      = rng_h.normal(0, 1, n_ent_h * n_per_h) * sigma_h
y_h        = alpha_h + 1.5 * x1_h + eps_h

df_h = pd.DataFrame({'entity': entities_h, 'time': periods_h, 'x1': x1_h, 'y': y_h})

results_h = FixedEffects(
    formula='y ~ x1', data=df_h,
    entity_col='entity', time_col='time', entity_effects=True,
).fit()

print('Heteroskedastic model estimated.')
print(f'Coefficient x1: {results_h.params["x1"]:.4f}  (true: 1.500)')
print(f'R²: {results_h.rsquared:.4f}')

In [None]:
# --- Create residual diagnostic charts for the heteroskedastic model ---

diag_h = create_residual_diagnostics(results_h, theme='academic')

print('Diagnostic charts available:', list(diag_h.keys()))

# Chart objects expose .figure (a Plotly Figure) for display
diag_h['residual_vs_fitted'].figure.show()
diag_h['scale_location'].figure.show()

In [None]:
# --- Build residual_data and generate diagnostic HTML report ---

# Convert chart objects to HTML strings for embedding
charts_html = {
    name: chart.figure.to_html(full_html=False, include_plotlyjs='cdn')
    for name, chart in diag_h.items()
}

residual_data = {
    'model_info': {
        'model_type': 'Fixed Effects',
        'formula': 'y ~ x1',
        'nobs': results_h.nobs,
        'n_entities': results_h.n_entities,
        'n_periods': results_h.n_periods,
        'r_squared': results_h.rsquared,
    },
    'residual_charts': charts_html,
    'chart_names': list(diag_h.keys()),
}

try:
    html_diag = report_mgr.generate_residual_report(
        residual_data=residual_data,
        title='Residual Diagnostics — Heteroskedastic Panel',
        subtitle='Heteroskedastic dataset: error SD proportional to |x1|',
        interactive=True,
    )
    output_path_diag = '../outputs/reports/html/04_diagnostic_report.html'
    report_mgr.save_report(html_diag, output_path_diag, overwrite=True)
    print(f'Diagnostic report saved: {output_path_diag}')
    print(f'File size: {os.path.getsize(output_path_diag) / 1024:.1f} KB')
except Exception as e:
    print(f'Residual report: {e}')
    print('Saving diagnostic charts as standalone HTML instead.')
    # Fallback: export each chart individually
    for name, chart in diag_h.items():
        path = f'../outputs/reports/html/04_diag_{name}.html'
        chart.figure.write_html(path, include_plotlyjs='cdn')
        print(f'  Saved: {path}')

## 5. Template Customization (Jinja2)

PanelBox reports use **Jinja2** templates stored in `panelbox/templates/`.
You can customize reports by:

1. **Override context variables** — pass different titles, dates, metadata
2. **Inject custom CSS**: pass a list of CSS strings via the `custom_css` parameter of `generate_report()`
3. **Inspect template source** — copy to a local folder and adapt

> **Important:** Do not edit the original template files inside the installed library.
> Instead, copy the desired template to a local folder and load it with a custom template directory.

Below we demonstrate **custom CSS injection** to apply organization branding.
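
Conceptually, CSS injection means placing an extra `<style>` block after the report's own styles so that the later rules win. The sketch below shows that idea on a plain HTML string. `inject_css()` is a hypothetical helper written for this illustration; how `ReportManager` handles `custom_css` internally is not assumed.

```python
def inject_css(html: str, css: str) -> str:
    """Insert a <style> block just before </head>, after any earlier styles."""
    style_block = f"<style>\n{css.strip()}\n</style>\n"
    head_close = html.find("</head>")
    if head_close == -1:
        # No <head> section: prepend the styles rather than failing
        return style_block + html
    return html[:head_close] + style_block + html[head_close:]


base_html = (
    "<html><head><style>h1 { color: black; }</style></head>"
    "<body><h1>Report</h1></body></html>"
)
brand_css = "h1 { color: #1A3A5C; }"

branded_html = inject_css(base_html, brand_css)
print(branded_html)
```

Because both rules have the same specificity, the browser applies the one that appears last, which is the injected brand color.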

In [None]:
# --- Inspect template directory structure ---

import panelbox
templates_dir = os.path.join(os.path.dirname(panelbox.__file__), 'templates')

print(f'Templates directory: {templates_dir}')
print()
print('Template files:')
for root, dirs, files in os.walk(templates_dir):
    for fname in sorted(files):
        rel = os.path.relpath(os.path.join(root, fname), templates_dir)
        print(f'  • {rel}')

In [None]:
# --- Custom CSS injection for branded report ---

# Define organization CSS overrides
custom_css = """
:root {
    --primary-color: #1A3A5C;   /* Dark blue brand color */
    --accent-color:  #F4A261;   /* Orange accent */
}
h1, h2, h3 { color: #1A3A5C; }
.report-header { background-color: #1A3A5C; color: white; }
"""

# Re-use the val_data from Section 2 (FE model on clean data)
html_branded = report_mgr.generate_report(
    report_type='validation',
    template='validation/interactive/index.html',
    context={
        **val_data,
        'report_title': 'Branded Report — My Research Institute',
        'report_subtitle': 'Fixed Effects Model on Simulated Panel Data',
    },
    embed_assets=True,
    include_plotly=True,
    custom_css=[custom_css],
)

output_path_branded = '../outputs/reports/html/04_branded_report.html'
report_mgr.save_report(html_branded, output_path_branded, overwrite=True)
print(f'Branded report saved: {output_path_branded}')
print(f'File size: {os.path.getsize(output_path_branded) / 1024:.1f} KB')

## 6. LaTeX and Markdown Export

For **academic publications**, PanelBox can export results to:
- **`.tex` file** — LaTeX source with `booktabs` regression tables
- **`.md` file** — Markdown with pipe tables (for GitHub, Quarto, etc.)

Both exporters use the same data format:
```python
coefficients = [
    {'variable': 'x1', 'coefficient': 1.42, 'std_error': 0.05,
     't_statistic': 28.4, 'pvalue': 0.000, 'stars': '***'},
    ...
]
model_info = {'model_type': 'Fixed Effects', 'nobs': 600, 'r_squared': 0.75}
```
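
To show how such a list of dicts maps onto a table, here is a minimal renderer for a Markdown pipe table. `to_markdown_table()` is a hypothetical helper, not the `MarkdownExporter` implementation, and the column layout (estimate with stars, standard error in parentheses) is one common convention rather than the exporter's guaranteed output.

```python
def to_markdown_table(coefficients: list[dict]) -> str:
    """Render coefficient rows as a Markdown pipe table."""
    lines = [
        "| Variable | Coefficient | Std. Error |",
        "|---|---:|---:|",
    ]
    for row in coefficients:
        estimate = f"{row['coefficient']:.3f}{row.get('stars', '')}"
        lines.append(f"| {row['variable']} | {estimate} | ({row['std_error']:.3f}) |")
    return "\n".join(lines)


# Made-up values in the same shape as the format described above
toy_coefficients = [
    {"variable": "x1", "coefficient": 1.42, "std_error": 0.05, "stars": "***"},
    {"variable": "x2", "coefficient": -0.81, "std_error": 0.04, "stars": "***"},
]
toy_table = to_markdown_table(toy_coefficients)
print(toy_table)
```

A LaTeX renderer follows the same loop; only the row separator and the surrounding `booktabs` rules change.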

> **Note:** PDF compilation requires `pdflatex` installed:
> ```bash
> sudo apt install texlive-latex-recommended
> ```

In [None]:
# --- Build coefficient data from results_fe ---

def build_coefficient_table(results):
    """Convert PanelResults to the list-of-dicts format for exporters."""
    rows = []
    for var in results.params.index:
        coef  = results.params[var]
        se    = results.std_errors[var]
        tstat = results.tvalues[var]
        pval  = results.pvalues[var]
        if pval < 0.01:
            stars = '***'
        elif pval < 0.05:
            stars = '**'
        elif pval < 0.10:
            stars = '*'
        else:
            stars = ''
        rows.append({
            'variable':    var,
            'coefficient': coef,
            'std_error':   se,
            't_statistic': tstat,
            'pvalue':      pval,
            'stars':       stars,
        })
    return rows

coef_data = build_coefficient_table(results_fe)
model_meta = {
    'model_type': 'Fixed Effects',
    'nobs':       results_fe.nobs,
    'r_squared':  results_fe.rsquared,
    'r_squared_within': results_fe.rsquared_within,
    'r_squared_between': results_fe.rsquared_between,
}

print(f'Coefficient rows prepared: {len(coef_data)}')
for row in coef_data:
    print(f"  {row['variable']:15s} {row['coefficient']:+.4f}  ({row['stars']})")

In [None]:
# --- Export to LaTeX ---

latex_exporter = LaTeXExporter()

latex_content = latex_exporter.export_regression_table(
    coefficients=coef_data,
    model_info=model_meta,
    caption='Fixed Effects Regression Results — Simulated Panel',
    label='tab:fe_main',
)

tex_path = '../outputs/reports/latex/04_regression_table.tex'
latex_exporter.save(
    latex_content=latex_content,
    output_path=tex_path,
    overwrite=True,
    add_preamble=True,   # wrap table in full LaTeX document
)

print(f'LaTeX file saved: {tex_path}')
print(f'File size: {os.path.getsize(tex_path) / 1024:.1f} KB')
print()
print('LaTeX content preview:')
print(latex_content[:600])

In [None]:
# --- Compile LaTeX to PDF (optional — requires pdflatex) ---

import subprocess

try:
    result = subprocess.run(
        ['pdflatex', '-interaction=nonstopmode',
         '-output-directory', '../outputs/reports/latex/', tex_path],
        capture_output=True, text=True, timeout=60
    )
    if result.returncode == 0:
        pdf_path = tex_path.replace('.tex', '.pdf')
        print(f'PDF compiled successfully: {pdf_path}')
    else:
        print('LaTeX compilation encountered errors. Last 500 chars of log:')
        print(result.stdout[-500:])
except FileNotFoundError:
    print('pdflatex not found. To compile PDF, install:')
    print('  sudo apt install texlive-latex-recommended')
except subprocess.TimeoutExpired:
    print('LaTeX compilation timed out (> 60 s).')

In [None]:
# --- Export to Markdown ---

markdown_exporter = MarkdownExporter()

md_content = markdown_exporter.export_regression_table(
    coefficients=coef_data,
    model_info=model_meta,
    title='Fixed Effects Regression Results',
)

# Saved alongside the .tex file: the setup cell creates no separate Markdown directory
md_path = '../outputs/reports/latex/04_regression_table.md'
markdown_exporter.save(md_content, md_path, overwrite=True)

print(f'Markdown report saved: {md_path}')
print()
print('Markdown content:')
print(md_content)

## 7. Controlling Report File Size

A self-contained HTML report with Plotly.js **embedded** can be **3–5 MB**.
For sharing via email or a slow server, use the CDN option:

| Option | Size | Trade-off |
|---|---|---|
| `include_plotly=True` | ~3–5 MB | Self-contained, works offline |
| `include_plotly=False` | ~50–100 KB | Requires Plotly.js from CDN |
| Charts as PNG (static) | ~100–300 KB | No interactivity |

The `save_report()` method replaces an existing file only if `overwrite=True`.
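
The overwrite guard is easy to reproduce in your own helper scripts. The sketch below is a stand-alone illustration using only the standard library. `save_text()` is hypothetical, and raising `FileExistsError` is an assumption of the sketch: this notebook does not show what `save_report()` does when the file exists and `overwrite` is false.

```python
import tempfile
from pathlib import Path


def save_text(text: str, path: Path, overwrite: bool = False) -> Path:
    """Write text to path, refusing to replace an existing file unless asked."""
    if path.exists() and not overwrite:
        raise FileExistsError(f"{path} exists; pass overwrite=True to replace it")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "reports" / "demo.html"
    save_text("<html>v1</html>", target)
    try:
        save_text("<html>v2</html>", target)  # second write without overwrite
        blocked = False
    except FileExistsError:
        blocked = True
    save_text("<html>v2</html>", target, overwrite=True)
    final_text = target.read_text(encoding="utf-8")
    size_kb = target.stat().st_size / 1024

print(blocked, final_text, f"{size_kb:.3f} KB")
```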

In [None]:
# --- Compare file sizes: embedded vs. CDN Plotly.js ---

# 1. Inline Plotly.js (already saved above)
inline_path = '../outputs/reports/html/04_single_model_report.html'
inline_size_kb = os.path.getsize(inline_path) / 1024

# 2. No inline Plotly (CDN)
html_cdn = report_mgr.generate_report(
    report_type='validation',
    template='validation/interactive/index.html',
    context={
        **val_data,
        'report_title': 'CDN Plotly Report',
    },
    embed_assets=True,
    include_plotly=False,   # Plotly NOT embedded; loaded from CDN
)
cdn_path = '../outputs/reports/html/04_cdn_report.html'
report_mgr.save_report(html_cdn, cdn_path, overwrite=True)
cdn_size_kb = os.path.getsize(cdn_path) / 1024

print(f'Inline Plotly.js:  {inline_size_kb:.0f} KB')
print(f'CDN Plotly.js:     {cdn_size_kb:.0f} KB')
reduction = (1 - cdn_size_kb / inline_size_kb) * 100
print(f'Size reduction:    {reduction:.0f}%')
print()
print('Tip: CDN reports require an internet connection to display charts.')

## 8. Complete Automated Pipeline

The final section encapsulates the full workflow in a reusable function.
In practice, you would call this from a cron job, a GitHub Action, or a scheduled script:

```bash
python generate_monthly_report.py --data monthly_sales.csv --output reports/
```

**Pipeline steps:**
1. Load data
2. Estimate Pooled OLS, Fixed Effects, Random Effects
3. Build validation data
4. Create visualization charts
5. Generate HTML validation + comparison reports
6. Export LaTeX regression table
7. Save all outputs to structured directories
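
The command line shown above needs a small entry point around the pipeline function. The sketch below covers only the argument handling with `argparse`. The `--data` and `--output` flags come from the example command; the `--title` flag and the function names are additions made for this illustration, and the actual data loading and pipeline call are left as a comment.

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Generate panel analysis reports.")
    parser.add_argument("--data", type=Path, required=True, help="input CSV file")
    parser.add_argument("--output", type=Path, default=Path("reports"),
                        help="root directory for generated files")
    # Hypothetical extra flag, not part of the example command
    parser.add_argument("--title", default="Panel Data Analysis Report")
    return parser


def main(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    # A real script would read args.data with pandas here and pass the
    # DataFrame, args.output and args.title to run_panel_analysis_pipeline().
    return args


demo_args = main(["--data", "monthly_sales.csv", "--output", "reports/"])
print(demo_args.data, demo_args.output, demo_args.title)
```

Accepting `argv` as a parameter keeps `main()` testable from a notebook or a unit test without touching `sys.argv`.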

In [None]:
def run_panel_analysis_pipeline(
    data: pd.DataFrame,
    entity_col: str,
    time_col: str,
    dependent: str,
    regressors: list,
    output_dir: str,
    report_title: str = 'Panel Data Analysis Report',
    theme: str = 'academic',
) -> dict:
    """
    Full automated panel analysis pipeline.

    Parameters
    ----------
    data        : DataFrame with panel data (entity, time, variables)
    entity_col  : name of entity column
    time_col    : name of time column
    dependent   : name of dependent variable
    regressors  : list of regressor column names
    output_dir  : root output directory (reports/html and reports/latex created inside)
    report_title: title for all generated reports
    theme       : visualization theme ('academic', 'professional', etc.)

    Returns
    -------
    dict with paths of all generated files
    """
    os.makedirs(f'{output_dir}/reports/html',  exist_ok=True)
    os.makedirs(f'{output_dir}/reports/latex', exist_ok=True)
    os.makedirs(f'{output_dir}/charts/png',    exist_ok=True)

    formula = f"{dependent} ~ {' + '.join(regressors)}"

    # --- 1. Estimate models ---
    print('[1/5] Estimating models...')
    res_ols = PooledOLS(
        formula=formula, data=data,
        entity_col=entity_col, time_col=time_col
    ).fit()
    res_fe = FixedEffects(
        formula=formula, data=data,
        entity_col=entity_col, time_col=time_col, entity_effects=True
    ).fit()
    res_re = RandomEffects(
        formula=formula, data=data,
        entity_col=entity_col, time_col=time_col
    ).fit()

    # --- 2. Run validation tests ---
    print('[2/5] Running validation tests...')
    val_rpt = res_fe.validate(tests='all')
    n_tests = (len(val_rpt.specification_tests) + len(val_rpt.serial_tests)
               + len(val_rpt.het_tests) + len(val_rpt.cd_tests))
    n_failed = len(val_rpt.get_failed_tests())
    print(f'       Tests run: {n_tests}, failed: {n_failed}')
    val_data_pipe = ValidationTransformer(val_rpt).transform(include_charts=True)

    # --- 3. Create charts ---
    print('[3/5] Creating charts...')
    diag_charts = create_residual_diagnostics(res_fe, theme=theme)
    comp_charts = create_comparison_charts(
        results_list=[res_ols, res_fe, res_re],
        names=['Pooled OLS', 'Fixed Effects', 'Random Effects'],
        theme=theme,
    )
    # Export chart PNGs (skip silently if kaleido missing)
    for name, chart in diag_charts.items():
        try:
            chart.figure.write_image(f'{output_dir}/charts/png/{name}.png',
                                     width=900, height=600)
        except Exception:
            pass

    # --- 4. Generate HTML reports ---
    print('[4/5] Generating HTML reports...')
    mgr = ReportManager()
    date_str = pd.Timestamp.now().strftime('%Y-%m-%d')

    html_v = mgr.generate_validation_report(
        validation_data=val_data_pipe,
        title=f'{report_title} — Validation',
        subtitle=f'Generated {date_str}',
    )
    path_html_v = f'{output_dir}/reports/html/validation_report.html'
    mgr.save_report(html_v, path_html_v, overwrite=True)

    comp_data_pipe = {
        'comparison_charts': comp_charts,
        'models_info': [
            {'name': 'Pooled OLS', 'estimator': 'PooledOLS',
             'nobs': res_ols.nobs, 'r_squared': res_ols.rsquared},
            {'name': 'Fixed Effects', 'estimator': 'FixedEffects',
             'nobs': res_fe.nobs, 'r_squared': res_fe.rsquared},
            {'name': 'Random Effects', 'estimator': 'RandomEffects',
             'nobs': res_re.nobs, 'r_squared': res_re.rsquared},
        ],
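        # Hard-coded placeholders: not derived from the fitted models' AIC/BIC.
        # Replace with computed labels before relying on them in production runs.
        'best_model_aic': 'Fixed Effects',
        'best_model_bic': 'Fixed Effects',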
    }
    html_c = mgr.generate_comparison_report(
        comparison_data=comp_data_pipe,
        title=f'{report_title} — Comparison',
        subtitle=f'Generated {date_str}',
    )
    path_html_c = f'{output_dir}/reports/html/comparison_report.html'
    mgr.save_report(html_c, path_html_c, overwrite=True)

    # --- 5. Export LaTeX ---
    print('[5/5] Generating LaTeX table...')
    coefs_pipe = [
        {
            'variable':    var,
            'coefficient': res_fe.params[var],
            'std_error':   res_fe.std_errors[var],
            't_statistic': res_fe.tvalues[var],
            'pvalue':      res_fe.pvalues[var],
        }
        for var in res_fe.params.index
    ]
    meta_pipe = {
        'model_type': 'Fixed Effects',
        'nobs': res_fe.nobs,
        'r_squared': res_fe.rsquared,
    }
    latex_exp = LaTeXExporter()
    latex_src = latex_exp.export_regression_table(
        coefficients=coefs_pipe,
        model_info=meta_pipe,
        caption=report_title,
        label='tab:results',
    )
    path_tex = f'{output_dir}/reports/latex/regression_table.tex'
    latex_exp.save(latex_src, path_tex, overwrite=True, add_preamble=True)

    # --- Summary ---
    output_files = {
        'html_validation': path_html_v,
        'html_comparison': path_html_c,
        'latex': path_tex,
    }
    print()
    print('Pipeline complete. Generated files:')
    for label, path in output_files.items():
        size_kb = os.path.getsize(path) / 1024
        print(f'  [{label}] {path} ({size_kb:.0f} KB)')

    return output_files

In [None]:
# --- Run the pipeline on a fresh dataset ---

# Simulated monthly sales panel (35 entities × 12 periods)
df_pipeline = generate_panel_data(n_individuals=35, n_periods=12, n_covariates=2, seed=123)
# generate_panel_data returns a MultiIndex df; reset index for flat columns
df_pipeline = df_pipeline.reset_index()

generated = run_panel_analysis_pipeline(
    data=df_pipeline,
    entity_col='entity',
    time_col='time',
    dependent='y',
    regressors=['x1', 'x2'],
    output_dir='../outputs',
    report_title='Monthly Panel Analysis — February 2026',
    theme='professional',
)

print('\nAll reports generated successfully!')

In [None]:
# --- Display the final validation report in the notebook ---

from IPython.display import IFrame, display, HTML

display(HTML('<h3>Pipeline output — Validation Report:</h3>'))
try:
    display(IFrame(src=generated['html_validation'], width='100%', height=550))
except Exception:
    display(HTML(f'<a href="{generated["html_validation"]}" target="_blank">'
                 f'Open validation report</a>'))

display(HTML('<h3>Pipeline output — Comparison Report:</h3>'))
try:
    display(IFrame(src=generated['html_comparison'], width='100%', height=550))
except Exception:
    display(HTML(f'<a href="{generated["html_comparison"]}" target="_blank">'
                 f'Open comparison report</a>'))

## Summary

| Capability | Method | Output format |
|---|---|---|
| Single-model report | `ReportManager.generate_validation_report(val_data, title)` | HTML |
| Multi-model comparison | `ReportManager.generate_comparison_report(comp_data, title)` | HTML |
| Full diagnostics | `ReportManager.generate_residual_report(residual_data, title)` | HTML |
| Custom branding | `generate_report(..., custom_css=[css_str])` | HTML |
| LaTeX regression table | `LaTeXExporter().export_regression_table(coefs, model_info)` | `.tex` |
| PDF compilation | `pdflatex` via `subprocess` | `.pdf` |
| Markdown table | `MarkdownExporter().export_regression_table(coefs, model_info)` | `.md` |
| Automated pipeline | `run_panel_analysis_pipeline(...)` | HTML + LaTeX |

**File size guide:**
- `include_plotly=True`: 3–5 MB (offline, self-contained)
- `include_plotly=False`: ~50–100 KB (CDN, requires internet)

**Data preparation:**
- `ValidationTransformer(val_report).transform()` → dict for `generate_validation_report()`
- `create_comparison_charts(results_list, names, theme)` → dict for `generate_comparison_report()`

**Congratulations!** You have completed the PanelBox Visualization Series (Notebooks 01–04).  
You can now create, diagnose, visualize, compare, and report panel data analyses
using PanelBox's complete visualization and report generation stack.