🌊 PyWaveletPanel

Wavelet-Based Panel Data Econometrics in Python



PyWaveletPanel is a Python library for wavelet-based panel data analysis. It implements econometric methods from five papers, providing tools for scale-by-scale panel regression, structural break detection, and panel unit root testing, together with journal-quality tables and publication-grade plots.

This document is a complete usage guide with full syntax for every public function and class.

Links: GitHub Repository · PyPI · Issue Tracker

📑 Implemented Papers

| # | Paper | Method | Module |
|---|---|---|---|
| 1 | Bada et al. (2021): A Wavelet Method for Panel Models with Jump Discontinuities | SAW Estimator, Post-SAW | structural_breaks |
| 2 | Karlsson et al. (2020): Unveiling Time-dependent Dynamics: Oil Prices & Exchange Rates | MODWT Panel OLS | panel_regression |
| 3 | Almasri et al. (2016): Wavelet-based Panel Unit-root Test with Structural Breaks | WDWT, WMODWT | unit_root |
| 4 | Gallegati et al. (2015): Productivity and Unemployment: Scale-by-scale Panel Analysis | Scale-by-scale Panel FE | panel_regression |
| 5 | Li & Shukur (2013): Testing for Unit Roots in Panel Data Using Wavelet Ratio | Wavelet Ratio IPS | unit_root |

🚀 Installation

Install the latest release from PyPI:

pip install pywaveletpanel

Or install from source for development:

git clone https://github.com/merwanroudane/pywaveletpanel.git
cd pywaveletpanel

# Install in development mode (recommended — makes `import pywaveletpanel` work anywhere)
pip install -e .

# Or install dependencies only
pip install -r requirements.txt

Note: If you skip the pip install step, run scripts from the repo root or add the repo directory to PYTHONPATH; otherwise import pywaveletpanel fails with ModuleNotFoundError.

Dependencies: numpy>=1.20, scipy>=1.7, pandas>=1.3, statsmodels>=0.13, matplotlib>=3.5, PyWavelets>=1.1, rich>=12.0, tabulate>=0.9. Requires Python ≥ 3.9.

⚡ Quick Start

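import numpy as np
from pywaveletpanel import WaveletPanelOLS, set_journal_style

set_journal_style()

# Synthetic balanced panel in stacked layout (N entities, T periods each)
N, T = 7, 64
rng = np.random.default_rng(0)
entity_ids, time_ids = np.repeat(np.arange(N), T), np.tile(np.arange(T), N)
X = rng.standard_normal(N * T)
y = -0.5 * X + rng.standard_normal(N * T)

model = WaveletPanelOLS(wavelet='sym4', level=3, robust=True)
result = model.fit(y=y, X=X, entity_ids=entity_ids, time_ids=time_ids,
                   regressor_names=['Productivity'])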

print(result.summary())     # journal-quality console table
result.plot()               # scale-dependent coefficient forest plot
print(result.to_latex())    # LaTeX export
df = result.summary_df()    # tidy DataFrame

📐 Data Conventions

Two distinct data layouts are used across the library:

| Layout | Used by | Shape | Description |
|---|---|---|---|
| Stacked panel | WaveletPanelOLS, SAWEstimator, PostSAWEstimator | (N*T,) for y; (N*T,) or (N*T, k) for X | One row per (entity, time) observation. Paired with entity_ids (length N*T) and optional time_ids. |
| Matrix panel | All unit-root tests, simulate_panel_ar1 | (N, T) | Each row is one entity's full time series. |
  • Balanced panels only: every entity must have the same number of periods T. WaveletPanelOLS.fit raises ValueError on unbalanced data.
  • If time_ids is omitted, observations are assumed already sorted in time order within each entity.
  • X.ndim == 1 is automatically reshaped to a single column (N*T, 1).
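The two layouts can be converted into each other with plain NumPy. The sketch below is independent of the library: the helper names stacked_to_matrix and matrix_to_stacked are hypothetical, and the balance check only mirrors the documented ValueError behaviour rather than reproducing the library's code.

```python
import numpy as np

def stacked_to_matrix(y, entity_ids, time_ids=None):
    """Reshape a stacked (N*T,) vector into an (N, T) matrix panel."""
    y = np.asarray(y, dtype=float)
    entity_ids = np.asarray(entity_ids)
    entities, counts = np.unique(entity_ids, return_counts=True)
    if np.any(counts != counts[0]):
        raise ValueError("unbalanced panel: entities have different T")
    if time_ids is None:
        # Stable sort keeps the existing within-entity time order
        order = np.argsort(entity_ids, kind="stable")
    else:
        # lexsort uses the LAST key as primary: sort by entity, then time
        order = np.lexsort((np.asarray(time_ids), entity_ids))
    return y[order].reshape(len(entities), counts[0])

def matrix_to_stacked(panel):
    """Flatten an (N, T) matrix into y, entity_ids, time_ids."""
    panel = np.asarray(panel, dtype=float)
    n, t = panel.shape
    return panel.ravel(), np.repeat(np.arange(n), t), np.tile(np.arange(t), n)

panel = np.arange(12.0).reshape(3, 4)            # N=3 entities, T=4 periods
y, entity_ids, time_ids = matrix_to_stacked(panel)

# Shuffle the rows, then recover the matrix using the identifiers
shuffle = np.random.default_rng(0).permutation(y.size)
restored = stacked_to_matrix(y[shuffle], entity_ids[shuffle], time_ids[shuffle])
```

Passing time_ids makes the conversion robust to row order; without it, the sketch relies on the same "already sorted within entity" assumption stated above.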

📖 API Reference

1. Wavelet Transforms (wavelets)

Low-level transforms. All operate on a 1-D series x of shape (T,).

haar_dwt(x, level=1) -> (V_J, W)

Decimated Haar Discrete Wavelet Transform up to level J.

| Parameter | Type | Default | Description |
|---|---|---|---|
| x | ndarray (T,) | | Input series (ideally length divisible by 2**level; odd lengths are boundary-reflected). |
| level | int | 1 | Decomposition level J. |

Returns: V_J — scaling (approximation) coefficients at level J; W — list [W_1, …, W_J] of detail coefficients (each halves in length per level).

haar_idwt(V_J, W) -> x

Inverse Haar DWT. Reconstructs the signal from coarsest scaling coefficients V_J and detail list W.
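To make the forward/inverse pair concrete, here is a minimal NumPy sketch of a decimated Haar transform and its inverse. It is not the library's implementation: the function names are hypothetical, the sign convention for the detail coefficients is an assumption, and the sketch requires the length to be divisible by 2**level instead of reflecting the boundary.

```python
import numpy as np

def haar_dwt_sketch(x, level=1):
    """Decimated Haar DWT: returns (V_J, [W_1, ..., W_J])."""
    v = np.asarray(x, dtype=float)
    if v.size % (2 ** level) != 0:
        raise ValueError("length must be divisible by 2**level in this sketch")
    details = []
    for _ in range(level):
        even, odd = v[0::2], v[1::2]
        details.append((odd - even) / np.sqrt(2.0))   # W_j (differences)
        v = (even + odd) / np.sqrt(2.0)               # V_j (local averages)
    return v, details

def haar_idwt_sketch(v_j, details):
    """Invert haar_dwt_sketch, coarsest level first."""
    v = np.asarray(v_j, dtype=float)
    for w in reversed(details):
        out = np.empty(2 * v.size)
        out[0::2] = (v - w) / np.sqrt(2.0)
        out[1::2] = (v + w) / np.sqrt(2.0)
        v = out
    return v

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
v2, w = haar_dwt_sketch(x, level=2)
x_back = haar_idwt_sketch(v2, w)
```

Each level halves the number of coefficients, and because the transform is orthonormal the total energy of V_J and the detail coefficients equals the energy of x.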

haar_modwt(x, level=1) -> (V_J, W)

Maximal-Overlap Haar DWT — translation-invariant, no downsampling (every level returns T coefficients). Uses rescaled, circularly-filtered Haar coefficients.

Returns: V_J of shape (T,); W list of (T,) arrays.
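The "rescaled, circularly-filtered" idea can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the library's code: haar_modwt_sketch is a hypothetical name, and the filters are the standard rescaled Haar pair (1/2, -1/2) and (1/2, 1/2) applied with a circular lag of 2**(j-1) at level j.

```python
import numpy as np

def haar_modwt_sketch(x, level=1):
    """Haar MODWT with circular filtering: returns (V_J, [W_1, ..., W_J])."""
    v = np.asarray(x, dtype=float)
    details = []
    for j in range(1, level + 1):
        lagged = np.roll(v, 2 ** (j - 1))    # lagged[t] = v[t - 2**(j-1)], wrapping around
        details.append((v - lagged) / 2.0)   # W_j, same length as x
        v = (v + lagged) / 2.0               # V_j, same length as x
    return v, details

rng = np.random.default_rng(1)
x = rng.standard_normal(50)                  # no dyadic-length requirement
v3, w = haar_modwt_sketch(x, level=3)

energy = np.sum(v3 ** 2) + sum(np.sum(d ** 2) for d in w)
x_back = v3 + sum(w)                         # V_{j-1} = V_j + W_j at every level
```

Unlike the decimated transform, every level keeps T coefficients, and the energy of x is still split exactly across V_J and the W_j.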

haar_imodwt(V_J, W) -> x

Inverse Haar MODWT.

modwt(x, wavelet="haar", level=1) -> (V_J, W)

General MODWT using any PyWavelets filter (non-decimated / stationary transform).

| Parameter | Type | Default | Description |
|---|---|---|---|
| x | ndarray (T,) | | Input series. |
| wavelet | str | "haar" | Filter name from pywt.wavelist(). Use "sym4" for LA(8). |
| level | int | 1 | Decomposition level J. |

la8_modwt(x, level=4) -> (V_J, W)

Convenience wrapper: modwt(x, wavelet="sym4", level=level) — the LA(8) filter from Gallegati et al. (2015) and Karlsson et al. (2020).

modwt_mra(x, wavelet="sym4", level=4) -> dict

MODWT-based Multiresolution Analysis. Decomposes x into additive components such that x ≈ D1 + D2 + … + DJ + SJ (implemented via PyWavelets SWT with reflective padding).

Returns a dict with:

  • Keys 'D1', 'D2', …, 'DJ' → detail-component arrays (length T).
  • Key 'SJ' (e.g. 'S4') → smooth/trend component.
  • Key 'labels' → dict mapping each scale name to a frequency-band string (e.g. 'D1' → '2–4 periods').

from pywaveletpanel import modwt_mra
comp = modwt_mra(x, wavelet='sym4', level=4)
print(comp['labels'])        # {'D1': '2–4 periods', ..., 'S4': '>32 periods (trend)'}
trend = comp['S4']
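The band strings follow the usual dyadic rule: detail level j covers periods from 2**j to 2**(j+1), and the smooth covers everything longer. A small pure-Python sketch of that rule is shown below; scale_labels is a hypothetical helper written for illustration, not a library function.

```python
def scale_labels(level, unit="periods"):
    """Map D1..DJ and SJ to their dyadic period bands."""
    labels = {}
    for j in range(1, level + 1):
        labels[f"D{j}"] = f"{2 ** j}–{2 ** (j + 1)} {unit}"
    labels[f"S{level}"] = f">{2 ** (level + 1)} {unit} (trend)"
    return labels

labels = scale_labels(4)
for name, band in labels.items():
    print(name, band)
```

Changing unit to "years" or "months" gives the bands for annual or monthly data at the same level.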

pad_dyadic(x, mode="reflect") -> x_padded

Pads x to the next power-of-two length. mode is any NumPy pad mode ("reflect", "constant", "edge"). Returns x unchanged if already dyadic.
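A minimal sketch of dyadic padding with numpy.pad follows. It assumes that the extra samples are appended at the end of the series; where the library actually places the padding is not stated here, and pad_dyadic_sketch is a hypothetical name.

```python
import numpy as np

def pad_dyadic_sketch(x, mode="reflect"):
    """Pad x at the end up to the next power-of-two length."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Smallest power of two >= n
    target = 1 if n <= 1 else 1 << (n - 1).bit_length()
    if target == n:
        return x                      # already dyadic: returned unchanged
    return np.pad(x, (0, target - n), mode=mode)

x = np.arange(100.0)
padded = pad_dyadic_sketch(x)         # length 100 -> 128
```

With mode="reflect" the padded values mirror the series around its last observation without repeating it, so the first padded value equals the second-to-last original value.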


2. Panel Regression (panel_regression)

class WaveletPanelOLS

Scale-by-scale wavelet panel regression with fixed effects (Papers 2, 4). Decomposes each variable via modwt_mra, then runs a fixed-effects OLS at each scale.

Constructor

WaveletPanelOLS(
    wavelet="sym4",          # filter; 'sym4'=LA(8) (Papers 2,4), 'haar' (Papers 1,3)
    level=3,                 # decomposition level J
    robust=True,             # Newey-West HAC standard errors
    nw_lags=None,            # NW lag truncation; None = automatic rule-of-thumb
    include_aggregate=True,  # also estimate on raw (non-decomposed) data
)

.fit(y, X, entity_ids, time_ids=None, regressor_names=None) -> ScaleRegressionResult

| Parameter | Type | Description |
|---|---|---|
| y | ndarray (N*T,) | Dependent variable (stacked). |
| X | ndarray (N*T,) or (N*T, k) | Regressors (stacked). |
| entity_ids | ndarray (N*T,) | Entity identifiers. |
| time_ids | ndarray (N*T,), optional | Time identifiers (used to sort within entity). |
| regressor_names | list[str], optional | Defaults to ['x1', 'x2', …]. |

Raises ValueError if the panel is unbalanced.
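The "decompose, then run fixed-effects OLS at each scale" idea can be illustrated without the library. The sketch below makes several simplifying assumptions: it uses raw Haar MODWT coefficients rather than the MRA components that WaveletPanelOLS is described as using, it handles a single regressor, it works on (N, T) matrices for brevity, and it reports slopes only (no standard errors). All function names are hypothetical.

```python
import numpy as np

def haar_modwt_components(x, level):
    """Return [W_1, ..., W_J, V_J] from a circular Haar MODWT."""
    v, out = np.asarray(x, dtype=float), []
    for j in range(1, level + 1):
        lagged = np.roll(v, 2 ** (j - 1))
        out.append((v - lagged) / 2.0)
        v = (v + lagged) / 2.0
    return out + [v]

def within_slope(y, x):
    """Fixed-effects slope for one regressor: demean each entity, then OLS."""
    yd = y - y.mean(axis=1, keepdims=True)
    xd = x - x.mean(axis=1, keepdims=True)
    return float(np.sum(xd * yd) / np.sum(xd ** 2))

rng = np.random.default_rng(0)
N, T, J, beta = 7, 128, 3, -0.5
alpha = rng.standard_normal((N, 1))                  # entity fixed effects
x = rng.standard_normal((N, T))
y = alpha + beta * x + 0.1 * rng.standard_normal((N, T))

# Decompose every entity's series, then regroup the pieces by scale
x_scales = [np.vstack(c) for c in zip(*(haar_modwt_components(r, J) for r in x))]
y_scales = [np.vstack(c) for c in zip(*(haar_modwt_components(r, J) for r in y))]

names = [f"D{j}" for j in range(1, J + 1)] + [f"S{J}"]
slopes = {n: within_slope(ys, xs) for n, ys, xs in zip(names, y_scales, x_scales)}
```

In this simulation the true relationship is the same at every scale, so all slopes come out close to beta; with real data, differences across scales are exactly what the scale-by-scale regression is meant to reveal.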

@dataclass ScaleRegressionResult

Returned by WaveletPanelOLS.fit.

Attributes: scale_results (dict[str, dict]), aggregate_result (dict | None), scale_labels (dict[str, str]), n_entities, n_periods, wavelet, level, regressor_names.

Each per-scale result dict contains: coef, se, t_stat, pvalue (arrays of length k), plus r_squared, adj_r_squared, residuals, nobs, df.

Methods

| Method | Returns | Description |
|---|---|---|
| .summary(decimals=3) | str | Rendered journal-quality table (columns: Aggregate, SJ, DJ…D1). |
| .summary_df() | pd.DataFrame | Tidy long-format results (one row per scale × regressor). |
| .to_latex(decimals=3) | str | LaTeX table environment. |
| .plot(figsize=(10,6), **kwargs) | plt.Figure | Forest plot of coefficients by scale. |

from pywaveletpanel import WaveletPanelOLS

model = WaveletPanelOLS(wavelet='sym4', level=3, robust=True, nw_lags=None)
res = model.fit(y, X, entity_ids, time_ids, regressor_names=['Productivity'])

print(res.summary(decimals=3))
res.summary_df().to_csv('scale_results.csv', index=False)
fig = res.plot(figsize=(10, 7))

3. Structural Breaks (structural_breaks)

class SAWEstimator

Structure-Adapted Wavelet estimator for detecting breaks in panel coefficients (Paper 1). First-differences out fixed effects, expands the cross-sectional coefficient estimates γ̂_t in a Haar basis, hard-thresholds the detail coefficients, and reads breaks off the reconstructed piecewise-constant path.

Constructor

SAWEstimator(
    threshold_method="adaptive",  # see note below
    kappa_adjustment=True,        # log-log correction to kappa (eq. 3.1); affects the analytic threshold
    min_segment_length=2,         # minimum periods between consecutive breaks
)

Threshold: the noise in γ̂_t is estimated robustly from the finest detail level (median-absolute-deviation, Donoho & Johnstone 1994) and the universal threshold σ_w·√(2 log T) is applied. threshold_method="universal" additionally takes the max with the analytic threshold from Theorem 2. This MAD calibration is what keeps detection from over-segmenting when γ̂_t, being a cross-sectional average, has a much lower noise floor than the per-observation residual.
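The MAD-calibrated universal threshold can be sketched as follows. This is an illustration only: it uses a plain orthonormal Haar transform rather than the structure-adapted basis of the SAW method, the coefficient path gamma_hat is simulated, and the helper names are hypothetical.

```python
import numpy as np

def haar_details(x, level):
    """Detail coefficients [W_1, ..., W_J] of an orthonormal Haar DWT."""
    v, out = np.asarray(x, dtype=float), []
    for _ in range(level):
        out.append((v[1::2] - v[0::2]) / np.sqrt(2.0))
        v = (v[0::2] + v[1::2]) / np.sqrt(2.0)
    return out

def mad_universal_threshold(finest_detail, n_periods):
    """sigma_w * sqrt(2 log T), with sigma_w from the median absolute deviation."""
    sigma_w = np.median(np.abs(finest_detail)) / 0.6745   # robust noise scale
    return sigma_w * np.sqrt(2.0 * np.log(n_periods))

rng = np.random.default_rng(0)
T = 64
# Simulated coefficient path: one jump from 1.0 to 3.0 at t = 20, small noise
gamma_hat = np.where(np.arange(T) < 20, 1.0, 3.0) + 0.05 * rng.standard_normal(T)

details = haar_details(gamma_hat, level=6)
thr = mad_universal_threshold(details[0], T)
kept = [np.where(np.abs(w) > thr, w, 0.0) for w in details]   # hard thresholding
n_kept = sum(int(np.count_nonzero(w)) for w in kept)
```

Because the finest-level coefficients are dominated by noise, their median is barely affected by the jump, so the threshold stays near the noise level and only the few coefficients that straddle the break survive.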

.detect(y, X, entity_ids, time_ids=None, regressor_names=None) -> BreakDetectionResult

Inputs use the stacked-panel layout (see Data Conventions).

class PostSAWEstimator

Re-estimates the panel model on the stability intervals found by SAWEstimator, achieving the oracle property (Paper 1, Theorem 3).

Constructor

PostSAWEstimator(
    variance_type="homoskedastic",  # 'homoskedastic' | 'cross_hetero' | 'time_hetero' | 'both'
    chow_test=True,                 # run Chow tests between consecutive intervals
)

.fit(y, X, entity_ids, time_ids=None, breaks=None) -> dict

breaks is a BreakDetectionResult from SAWEstimator.detect. Returns a dict with keys:

  • interval_results: {regressor_idx: [ {interval, coef, se, t_stat, pvalue, nobs}, … ]}
  • chow_tests: {(p, seg_i, seg_j): {F_stat, pvalue, break_time}}
  • full_coefficients: ndarray (T, k), the time-varying coefficient path
  • n_entities, n_periods
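For readers unfamiliar with the Chow test reported in chow_tests, the sketch below computes the textbook F statistic for equality of coefficients across two segments. It is an assumption that the library uses this classical form; the helper names are hypothetical and the p-value step is omitted.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def chow_f_stat(y1, X1, y2, X2):
    """Classical Chow F statistic for equal coefficients in two segments."""
    k = X1.shape[1]
    rss_pooled = rss(np.concatenate([y1, y2]), np.vstack([X1, X2]))
    rss_split = rss(y1, X1) + rss(y2, X2)
    dof = len(y1) + len(y2) - 2 * k
    return ((rss_pooled - rss_split) / k) / (rss_split / dof)

rng = np.random.default_rng(0)
n = 200
X1, X2 = rng.standard_normal((n, 1)), rng.standard_normal((n, 1))
e1, e2 = 0.5 * rng.standard_normal(n), 0.5 * rng.standard_normal(n)

y1 = 1.0 * X1[:, 0] + e1
y2_break = 3.0 * X2[:, 0] + e2      # slope changes between segments
y2_same = 1.0 * X2[:, 0] + e2       # slope unchanged

F_break = chow_f_stat(y1, X1, y2_break, X2)
F_same = chow_f_stat(y1, X1, y2_same, X2)
```

A large F relative to the F(k, n1 + n2 - 2k) distribution indicates that the coefficients differ between the two stability intervals.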

@dataclass BreakDetectionResult

Attributes: n_breaks (dict[int,int]), break_locations (dict[int, list[int]]), stability_intervals (dict[int, list[(start,end)]]), coefficients (dict[int, list[float]]), threshold (float), wavelet_coeffs (dict[int, ndarray]), n_entities, n_periods, regressor_names.

Methods: .summary() -> str, .plot(figsize=(14,5), **kwargs) -> plt.Figure, .total_breaks() -> int.

from pywaveletpanel import SAWEstimator, PostSAWEstimator

saw = SAWEstimator(threshold_method='adaptive', min_segment_length=2)
breaks = saw.detect(y, X, entity_ids, time_ids, regressor_names=['AT_share'])

print(breaks.summary())
print("Total breaks:", breaks.total_breaks())
breaks.plot()

post = PostSAWEstimator(variance_type='both', chow_test=True)
final = post.fit(y, X, entity_ids, time_ids, breaks)
for (p, i, j), c in final['chow_tests'].items():
    print(f"reg {p}, {i}->{j}: F={c['F_stat']:.2f}, p={c['pvalue']:.4f}")

4. Unit Root Tests (unit_root)

All test classes share the signature .test(data, n_mc=10000, seed=None) -> UnitRootResult, where data is a matrix panel of shape (N, T) and critical values are obtained by n_mc Monte Carlo replications under H0 (independent random walks).

H0: all entities have a unit root. H1: at least some entities are stationary.
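The Monte Carlo step can be sketched generically: simulate independent random walks under H0, compute the statistic on each replication, and read critical values off the empirical quantiles. The statistic used below is an unstandardised level-1 Haar energy ratio averaged across entities, chosen only in the spirit of the wavelet ratio test; it is an assumption for illustration and not the library's exact statistic. T must be even in this sketch.

```python
import numpy as np

def mean_haar_energy_ratio(panel):
    """Mean over entities of (level-1 Haar scaling energy / total energy)."""
    panel = np.asarray(panel, dtype=float)
    even, odd = panel[..., 0::2], panel[..., 1::2]
    v1 = (even + odd) / np.sqrt(2.0)
    ratio = np.sum(v1 ** 2, axis=-1) / np.sum(panel ** 2, axis=-1)
    return ratio.mean(axis=-1)

def mc_critical_values(n_entities, n_periods, n_mc=2000, seed=None):
    """Left-tail critical values from random walks simulated under H0."""
    rng = np.random.default_rng(seed)
    shocks = rng.standard_normal((n_mc, n_entities, n_periods))
    walks = np.cumsum(shocks, axis=-1)            # independent random walks
    stats = mean_haar_energy_ratio(walks)         # one statistic per replication
    cv = {a: float(np.quantile(stats, a)) for a in (0.01, 0.05, 0.10)}
    return cv, stats

N, T = 5, 128
cv, null_stats = mc_critical_values(N, T, n_mc=2000, seed=0)

data = np.random.default_rng(1).standard_normal((N, T))   # stationary panel
stat = float(mean_haar_energy_ratio(data))
pvalue = float(np.mean(null_stats <= stat))                # left-tail p-value
reject_5pct = stat <= cv[0.05]
```

Under a unit root almost all energy sits in the low-frequency (scaling) coefficients, so the ratio is close to 1; a stationary series spreads energy across frequencies, which pushes the ratio down and into the left-tail rejection region.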

| Class | Constructor | Statistic | Test direction | Reference |
|---|---|---|---|---|
| WaveletRatioIPS | WaveletRatioIPS() | Mean Fan–Gençay wavelet ratio S_NT | left-tail (reject if S_NT ≤ CV) | Li & Shukur (2013) |
| WaveletWaldDWT | WaveletWaldDWT() | W_DWT = T·tr[(H'H)⁻¹E'E] − N | right-tail (reject if stat ≥ CV) | Almasri et al. (2016) |
| WaveletWaldMODWT | WaveletWaldMODWT() | MODWT analogue of W_DWT | right-tail | Almasri et al. (2016) |
| PanelADF | PanelADF(trend="c") | Mean ADF t-stat (IPS) | left-tail | Im, Pesaran & Shin (2003) |

PanelADF accepts trend: "c" (constant, default), "ct" (constant + trend), or any other value for no deterministic term.

.test parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| data | ndarray (N, T) | | Panel matrix, one entity per row. |
| n_mc | int | 10000 | Monte Carlo replications for critical values. |
| seed | int, optional | None | RNG seed for reproducibility. |

@dataclass UnitRootResult

Attributes: test_name (str), statistic (float), pvalue (float), critical_values ({0.01, 0.05, 0.10 → float}), reject_null ({level → bool}), n_entities, n_periods, individual_stats (ndarray | None, per-entity stats where applicable).

Method: .summary() -> str.

import numpy as np
from pywaveletpanel import (
    WaveletRatioIPS, WaveletWaldDWT, WaveletWaldMODWT, PanelADF,
    plot_unit_root_comparison,
)
from pywaveletpanel.tables import UnitRootTable

data = np.random.randn(5, 128)   # (N, T)

res_adf    = PanelADF(trend='c').test(data, n_mc=5000, seed=0)
res_wr     = WaveletRatioIPS().test(data, n_mc=5000, seed=0)
res_wdwt   = WaveletWaldDWT().test(data, n_mc=5000, seed=0)
res_wmodwt = WaveletWaldMODWT().test(data, n_mc=5000, seed=0)

print(UnitRootTable.from_multiple_results(
    [res_adf, res_wr, res_wdwt, res_wmodwt]).render())
plot_unit_root_comparison([res_adf, res_wr, res_wdwt, res_wmodwt])

5. Tables (tables)

Four table builders. Each renders to console (via rich, falling back to tabulate) and exports to LaTeX/HTML/DataFrame. Console rendering auto-highlights significance stars and reject/accept decisions.

class RegressionTable

| Member | Signature | Description |
|---|---|---|
| classmethod | RegressionTable.from_scale_result(result, decimals=3) | Build from a ScaleRegressionResult. |
| method | .render() -> str | Console table. |
| method | .to_latex() -> str | LaTeX. |
| method | .to_html() -> str | Bootstrap-styled HTML. |
| method | .to_dataframe() -> pd.DataFrame | Underlying frame. |

class UnitRootTable

| Member | Signature | Description |
|---|---|---|
| classmethod | UnitRootTable.from_single_result(result) | Single UnitRootResult. |
| classmethod | UnitRootTable.from_multiple_results(results, title="") | Side-by-side comparison from a list of results. |
| method | .render() -> str / .to_latex() -> str | Output. |

class BreakTable

| Member | Signature | Description |
|---|---|---|
| classmethod | BreakTable.from_break_result(result) | From a BreakDetectionResult. |
| method | .render() -> str / .to_latex() -> str | Output. |

class SimulationTable

| Member | Signature | Description |
|---|---|---|
| classmethod | SimulationTable.from_simulation(results, title="Monte Carlo Simulation Results") | results = {test_name: {scenario: rejection_rate}}. |
| method | .render() -> str / .to_latex() -> str | Output. |

from pywaveletpanel.tables import SimulationTable
sim = SimulationTable.from_simulation({
    "WDWT":   {"rho=1.00": 0.051, "rho=0.95": 0.62},
    "WMODWT": {"rho=1.00": 0.049, "rho=0.95": 0.71},
})
print(sim.render())

6. Visualisation (visualization)

All plotting functions return a matplotlib.figure.Figure and accept an optional save_path to write a 300-dpi image.

set_journal_style()

Applies the light journal/paper publication theme globally to matplotlib (white background, serif fonts, subtle grey grid, Okabe-Ito colorblind-safe palette). Call once at the top of a script.

plot_wavelet_decomposition(x, components, title="MODWT Multiresolution Decomposition", time_index=None, figsize=(14,10), save_path=None)

Stacked panels of the original series and each MRA component. components is the dict returned by modwt_mra.

plot_scale_coefficients(result, figsize=(10,7), ci_level=0.05, save_path=None, **kwargs)

Forest plot of coefficients per scale with confidence intervals; significant points highlighted. result is a ScaleRegressionResult. (Also reachable via result.plot().)

plot_structural_breaks(result, figsize=(14,5), time_index=None, save_path=None, **kwargs)

Step-function coefficient paths with vertical break lines. result is a BreakDetectionResult. (Also reachable via result.plot().)

plot_unit_root_comparison(results, figsize=(12,6), save_path=None)

Two-panel grouped bar chart (p-values and 5% decisions) across a list of UnitRootResult.

plot_loess_by_country(x_dict, y_dict, xlabel="Productivity growth", ylabel="Unemployment rate", title="Nonparametric Loess Fit by Country", span=0.5, figsize=(16,10), save_path=None)

Per-country scatter with a smoothed fit. x_dict/y_dict map country_name → array. span (0–1) controls smoothing window.

import numpy as np
from pywaveletpanel import set_journal_style, modwt_mra, plot_wavelet_decomposition

set_journal_style()
series = np.cumsum(np.random.default_rng(0).standard_normal(256))  # placeholder series
comp = modwt_mra(series, wavelet='sym4', level=4)
plot_wavelet_decomposition(series, comp, title="Oil Price Decomposition",
                           save_path="decomp.png")

7. Utilities (utils)

Lower-level helpers (importable from pywaveletpanel.utils).

| Function | Signature | Description |
|---|---|---|
| newey_west_se | newey_west_se(X, residuals, n_lags=None) -> ndarray | HAC standard errors (Bartlett kernel). n_lags=None → floor(4·(T/100)^(2/9)). |
| fixed_effects_transform | fixed_effects_transform(y, entity_ids) -> ndarray | Within (entity-demeaning) transform. |
| first_difference | first_difference(y, entity_ids, time_ids=None) -> (dy, mask) | First-difference within each entity. |
| ols_fit | ols_fit(y, X, robust=True, n_lags=None) -> dict | OLS with optional NW SEs; returns coef, se, t_stat, pvalue, r_squared, adj_r_squared, residuals, nobs. |
| panel_fixed_effects_ols | panel_fixed_effects_ols(y, X, entity_ids, robust=True, n_lags=None) -> dict | Core FE panel estimator (adds df). |
| significance_stars | significance_stars(pvalue) -> str | ***/**/*/"". |
| format_coef | format_coef(value, pvalue, decimals=4) -> str | Coefficient + stars. |
| simulate_panel_ar1 | simulate_panel_ar1(N, T, rho=1.0, cross_corr=0.0, seed=None) -> ndarray (N,T) | AR(1) panel; rho=1 → unit root, cross_corr adds equi-correlation. |
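As background for the newey_west_se row, the following sketch shows the textbook Bartlett-kernel HAC calculation together with the floor(4·(T/100)^(2/9)) lag rule. It is not the library's code: the function names are hypothetical, no small-sample correction is applied, and the sample size passed to the lag rule is taken to be the number of rows of X, which is an assumption.

```python
import math
import numpy as np

def nw_auto_lags(n_obs):
    """Rule-of-thumb truncation lag: floor(4 * (n/100)^(2/9))."""
    return int(math.floor(4.0 * (n_obs / 100.0) ** (2.0 / 9.0)))

def newey_west_se_sketch(X, residuals, n_lags=None):
    """HAC standard errors with Bartlett weights (no small-sample scaling)."""
    X = np.asarray(X, dtype=float)
    u = np.asarray(residuals, dtype=float)
    if n_lags is None:
        n_lags = nw_auto_lags(X.shape[0])
    xu = X * u[:, None]                      # score contributions x_t * u_t
    S = xu.T @ xu                            # lag-0 (White) term
    for lag in range(1, n_lags + 1):
        weight = 1.0 - lag / (n_lags + 1.0)  # Bartlett kernel weight
        gamma = xu[lag:].T @ xu[:-lag]       # lag-l autocovariance of the scores
        S += weight * (gamma + gamma.T)
    bread = np.linalg.inv(X.T @ X)
    return np.sqrt(np.diag(bread @ S @ bread))

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = X @ np.array([1.0, 2.0]) + rng.standard_normal(n)
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta

se_hac = newey_west_se_sketch(X, resid)             # automatic lag choice
se_white = newey_west_se_sketch(X, resid, n_lags=0) # reduces to White (HC0)
```

Setting n_lags=0 drops all autocovariance terms, which is a convenient sanity check against heteroskedasticity-only standard errors.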
from pywaveletpanel.utils import simulate_panel_ar1
# Near-integrated panel with cross-sectional dependence
data = simulate_panel_ar1(N=10, T=200, rho=0.95, cross_corr=0.3, seed=42)
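For intuition about what such a simulator produces, here is a self-contained sketch of an AR(1) panel with equi-correlated shocks. The one-factor construction (a common shock weighted by sqrt(cross_corr)) and the zero-history initial condition are assumptions made for illustration; the library's own generator may differ, and simulate_panel_ar1_sketch is a hypothetical name.

```python
import numpy as np

def simulate_panel_ar1_sketch(N, T, rho=1.0, cross_corr=0.0, seed=None):
    """(N, T) AR(1) panel whose shocks have pairwise correlation cross_corr."""
    rng = np.random.default_rng(seed)
    common = rng.standard_normal(T)              # shared factor f_t
    idio = rng.standard_normal((N, T))           # idiosyncratic shocks u_it
    shocks = np.sqrt(cross_corr) * common + np.sqrt(1.0 - cross_corr) * idio
    data = np.zeros((N, T))
    data[:, 0] = shocks[:, 0]
    for t in range(1, T):
        data[:, t] = rho * data[:, t - 1] + shocks[:, t]
    return data

panel = simulate_panel_ar1_sketch(N=10, T=200, rho=0.95, cross_corr=0.3, seed=42)

# With rho=0 the panel equals its shocks, so the correlation is easy to check
noise = simulate_panel_ar1_sketch(N=4, T=5000, rho=0.0, cross_corr=0.5, seed=0)
corr = np.corrcoef(noise)
mean_offdiag = (corr.sum() - np.trace(corr)) / (corr.size - corr.shape[0])
```

Each shock has unit variance by construction, so cross_corr can be read directly as the pairwise correlation between entities.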

📊 Scale Interpretation

Annual data (J=3):

| Scale | Period | Interpretation |
|---|---|---|
| D1 | 2–4 years | Short-run / business cycle |
| D2 | 4–8 years | Business cycle |
| D3 | 8–16 years | Medium-run |
| S3 | >16 years | Long-run trend |

Monthly data (J=4):

| Scale | Period | Interpretation |
|---|---|---|
| D1 | 2–4 months | Very short-run |
| D2 | 4–8 months | Short-run |
| D3 | 8–16 months | Medium-run |
| D4 | 16–32 months | Long-run |
| S4 | >32 months | Trend |

Unit Root Test Comparison

Test Robust to cross-dep? Robust to breaks? Reference
IPS (ADF) Im, Pesaran & Shin (2003)
Wavelet Ratio IPS Partial Li & Shukur (2013)
WDWT Almasri et al. (2016)
WMODWT Almasri et al. (2016)

📁 Examples

Run from the repo root (or after pip install -e .):

python examples/example_scale_regression.py   # Papers 2, 4
python examples/example_structural_breaks.py   # Paper 1
python examples/example_unit_root.py           # Papers 3, 5

🏗️ Library Architecture

pywaveletpanel/
├── wavelets.py             # Haar DWT/MODWT, LA(8), MODWT-MRA, dyadic padding
├── panel_regression.py     # WaveletPanelOLS, ScaleRegressionResult
├── structural_breaks.py    # SAWEstimator, PostSAWEstimator, BreakDetectionResult
├── unit_root.py            # WaveletRatioIPS, WaveletWaldDWT/MODWT, PanelADF, UnitRootResult
├── tables.py               # RegressionTable, UnitRootTable, BreakTable, SimulationTable
├── visualization.py        # set_journal_style + 5 plot functions
└── utils.py                # Newey-West HAC, panel transforms, OLS, MC simulation

📚 References

  1. Bada, O., Kneip, A., Liebl, D., Mensinger, T., Gualtieri, J. & Sickles, R.C. (2021). A Wavelet Method for Panel Models with Jump Discontinuities in the Parameters. arXiv:2109.10950v1.
  2. Karlsson, H.K., Månsson, K. & Sjölander, P. (2020). Unveiling the Time-dependent Dynamics between Oil Prices and Exchange Rates: A Wavelet-based Panel Analysis. The Energy Journal, 41(1), 87–106.
  3. Almasri, A., Månsson, K., Sjölander, P. & Shukur, G. (2016). A wavelet-based panel unit-root test in the presence of an unknown structural break and cross-sectional dependency. Applied Economics, DOI:10.1080/00036846.2016.1231908.
  4. Gallegati, M., Gallegati, M., Ramsey, J.B. & Semmler, W. (2015). Productivity and unemployment: a scale-by-scale panel data analysis for the G7 countries. Studies in Nonlinear Dynamics & Econometrics, DOI:10.1515/snde-2014-0053.
  5. Li, Y. & Shukur, G. (2013). Testing for Unit Roots in Panel Data Using a Wavelet Ratio Method. Computational Economics, 41, 59–69.

👤 Author

Dr. Merwan Roudane — 📧 merwanroudane920@gmail.com — 🔗 github.com/merwanroudane

📄 License

MIT License — see LICENSE.

@software{roudane2024pywaveletpanel,
  author = {Roudane, Merwan},
  title  = {PyWaveletPanel: Wavelet-Based Panel Data Econometrics in Python},
  year   = {2024},
  url    = {https://github.com/merwanroudane/pywaveletpanel}
}
