Self-supervised event prediction in multivariate time series via causal JEPA.
HEPA is a single, compact (~2.16M-parameter) architecture that pretrains on multivariate time series with no labels, then finetunes a horizon-conditioned predictor for downstream event forecasting. The same encoder transfers across spacecraft telemetry, server metrics, ECG, water-distribution attacks, and turbofan-engine degradation.
The paper makes two contributions:
- Predictor-finetuning as the downstream abstraction. After self-supervised pretraining, freeze the encoder and finetune only the horizon-conditioned predictor + a shared event head. The resulting probability surface p(t, Delta_t) is monotone in Delta_t by construction (discrete-hazard CDF) and supports any event-prediction metric (AUPRC, h-AUROC, PA-F1, RMSE).
- One architecture across N datasets and M domains. The same 2.16M-param model attains competitive results on every benchmark in Table 1 - no per-dataset tuning, no architecture changes.
Paper: forthcoming - NeurIPS 2026 (link TBD).
```
pip install -e .
```

PyTorch >= 2.0 is required. CPU works for the unit tests; a single GPU is recommended for full pretraining.
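A quick way to confirm the environment before training (plain PyTorch, nothing HEPA-specific):

```python
import torch

# HEPA requires PyTorch >= 2.0; CUDA is optional but recommended for pretraining.
assert tuple(int(x) for x in torch.__version__.split(".")[:2]) >= (2, 0)
print(torch.__version__, "| CUDA available:", torch.cuda.is_available())
```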
```
# 1. Tell HEPA where your data lives (default: ~/.hepa/data)
export HEPA_DATA_DIR=/path/to/data

# 2. Get download instructions for any supported dataset
python scripts/download_data.py FD001

# 3. Pretrain + finetune + evaluate on C-MAPSS FD001
python scripts/train.py --dataset FD001 --seed 0
```

The `train.py` script:
- Loads the dataset bundle (pretrain stream, finetune entities).
- Pretrains HEPA with the JEPA objective (50 epochs by default).
- Freezes the encoder; finetunes the predictor + event head with positive-weighted BCE (both losses are sketched below).
- Reports pooled AUPRC, pooled AUROC, h-AUROC (the paper's primary metric), and a per-horizon breakdown.
- Saves the probability surface (`*.npz`) and metrics (`*.json`).
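For concreteness, here is a minimal sketch of the two training losses. Plain MSE for the JEPA stage and class-frequency positive weighting for the BCE stage are assumptions; the paper's exact choices (including the SIGReg target-encoder sync) may differ, and all tensor names are illustrative.

```python
import torch
import torch.nn.functional as F

def jepa_loss(pred_emb: torch.Tensor, target_emb: torch.Tensor) -> torch.Tensor:
    """Pretraining: predict the target-encoder embedding from the context.

    pred_emb:   predictor output for (h_t, Delta_t), shape (B, d).
    target_emb: target-encoder embedding of x(t : t+Delta_t], shape (B, d).
    Plain MSE is an assumption; the paper's objective may differ.
    """
    return F.mse_loss(pred_emb, target_emb.detach())

def finetune_loss(hazard_logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Finetuning: positive-weighted BCE on per-horizon hazard logits.

    hazard_logits: event-head outputs, shape (B, K) for K horizons.
    labels:        binary event indicators per horizon, shape (B, K).
    """
    pos = labels.float().sum().clamp(min=1.0)
    neg = labels.numel() - pos
    # Up-weight the rare positive class (one common choice of pos_weight).
    return F.binary_cross_entropy_with_logits(
        hazard_logits, labels.float(), pos_weight=neg / pos
    )
```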
| Component | Spec | Role |
|---|---|---|
| Patch embed | group P=16 timesteps, project to d=256 | reduce sequence length |
| Context encoder | causal Transformer, L=2, h=4, d=256, RevIN + sin PE | x[0:t] -> h_t |
| Target encoder | bidirectional Transformer + attention pool, periodic hard-sync from encoder (SIGReg) | x(t : t+Delta_t] -> h* |
| Predictor | MLP(d+1, 256, 256, d) on (h_t, Delta_t) | predicted future embedding |
| Event head | LayerNorm + Linear(d, 1) | per-horizon hazard logit |
Total: ~2.16M parameters.

CDF parameterization (discrete hazards over the horizon grid Delta_t_1 < ... < Delta_t_K):

```
lambda_k        = sigmoid(head(predictor(h_t, Delta_t_k)))
p(t, Delta_t_k) = 1 - prod_{j<=k} (1 - lambda_j)
```
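A runnable sketch of the predictor, event head, and hazard-to-CDF computation, with module sizes taken from the table above; the GELU activations and the K=8 horizon grid are illustrative assumptions:

```python
import torch
import torch.nn as nn

d, K = 256, 8  # embedding width from the table; K horizons is an illustrative choice

# Predictor: MLP(d+1, 256, 256, d) on the concatenation (h_t, Delta_t).
predictor = nn.Sequential(
    nn.Linear(d + 1, 256), nn.GELU(),
    nn.Linear(256, 256), nn.GELU(),
    nn.Linear(256, d),
)
# Event head: LayerNorm + Linear(d, 1) -> per-horizon hazard logit.
event_head = nn.Sequential(nn.LayerNorm(d), nn.Linear(d, 1))

h_t = torch.randn(1, d)                    # context embedding at time t
horizons = torch.arange(1, K + 1).float()  # Delta_t_1 < ... < Delta_t_K

# lambda_k = sigmoid(head(predictor(h_t, Delta_t_k)))
inp = torch.cat([h_t.expand(K, d), horizons.unsqueeze(1)], dim=1)  # (K, d+1)
hazards = torch.sigmoid(event_head(predictor(inp))).squeeze(1)     # (K,)

# p(t, Delta_t_k) = 1 - prod_{j<=k} (1 - lambda_j): a discrete-hazard CDF,
# monotone non-decreasing in Delta_t by construction.
p = 1.0 - torch.cumprod(1.0 - hazards, dim=0)
assert torch.all(p[1:] >= p[:-1])
```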
All hyperparameters live in `hepa/utils/config.py` under the `PROTOCOL` dict.
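Assuming the path above, a run's hyperparameters can be inspected or patched in place before launching; the key name below is hypothetical:

```python
from hepa.utils.config import PROTOCOL  # location per the README

print(sorted(PROTOCOL))          # list the real key names
PROTOCOL["pretrain_epochs"] = 5  # hypothetical key: shorten for a smoke test
```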
| Dataset | Domain | Channels | Loader |
|---|---|---|---|
| FD001-004 | Turbofan engine degradation | 14 | hepa.data.cmapss |
| SMAP | Spacecraft telemetry | 25 | hepa.data.smap |
| PSM | Server metrics | 25 | hepa.data.psm |
| MBA | Cardiac ECG arrhythmia | 2 | hepa.data.mba |
| GECCO | Drinking water quality | 9 | hepa.data.gecco |
| BATADAL | Water-distribution attacks | ~43 | hepa.data.batadal |
| TEP | Chemical-plant faults | 52 | hepa.data.tep |
| ETTm1 | Electricity transformer load | 7 | hepa.data.ettm1 |
| Weather | Climate forecasting | 21 | hepa.data.weather |
| BeijingAQ | Air-quality / public health | 11 | hepa.data.beijing_aq |
| VIX | Financial volatility | 6 | hepa.data.vix |
HEPA does not redistribute data. See `python scripts/download_data.py` for source URLs and expected file layouts.
See Table 1 in the paper for the full per-dataset h-AUROC (mean ± std over 3 seeds) against the strongest published baselines.
```
hepa/
  model/        encoder, target_encoder, predictor, event_head, hepa wrapper
  data/         per-dataset loaders + central path config
  training/     pretrain loop, finetune loop, JEPA + BCE losses
  evaluation/   AUPRC / AUROC / h-AUROC / ECE / Brier, surface I/O
  utils/        PROTOCOL dict, seeding helper
scripts/
  train.py            single-entry pretrain + finetune + evaluate
  download_data.py    per-dataset download instructions
tests/
  test_model.py       forward-pass shape tests
  test_metrics.py     metric correctness tests
```
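The test files follow pytest naming conventions; assuming pytest is installed, they can be run from the repo root:

```python
# Equivalent to running `pytest tests/ -q` from the shell (requires pytest).
import pytest

raise SystemExit(pytest.main(["-q", "tests/"]))
```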
```bibtex
@article{hepa2026,
  title   = {HEPA: A Self-Supervised Horizon-Conditioned Event Predictive
             Architecture for Time Series},
  author  = {Anonymous Authors},
  journal = {arXiv preprint},
  year    = {2026}
}
```

MIT - see LICENSE.