# AETHER

**Process-JEPA: Extending LeCun's Joint Embedding Architecture to Business Event Prediction**

[Why JEPA?](#why-jepa) • [The Problem](#the-problem) • [How It Works](#how-it-works) • [Quick Start](#quick-start) • [Architecture](#architecture)
## Why JEPA?

If JEPA can learn to predict the physical world from images and video, can it predict how business processes unfold?
AETHER is a JEPA implementation for discrete business event sequences. It adapts ideas from Yann LeCun's Joint Embedding Predictive Architecture to enterprise workflow prediction — purchase-to-pay, order-to-cash, and procurement processes.
The key ideas from the JEPA ecosystem that AETHER adapts:
| JEPA Concept | Original Domain | AETHER Application |
|---|---|---|
| Joint Embedding | Images (I-JEPA), Video (V-JEPA) | Business event sequences |
| Latent-space prediction | Pixel masking, frame prediction | Event transition: f(z_t, action, variant) → z_{t+1} |
| Energy-based scoring | LeCun's EBM framework (2006) | Process conformance anomaly detection |
| SIGReg loss | LeJEPA (Balestriero & LeCun, 2025) | Latent collapse prevention via eigenvalue regularization |
| VICReg loss | VICReg (Bardes, Ponce & LeCun, 2022) | Variance-Invariance-Covariance as alternative regularizer |
The novel contribution: AETHER combines these JEPA components with epistemic uncertainty decomposition and adaptive governance. The model decomposes uncertainty into what's reducible (epistemic) vs. what's inherently random (aleatoric), and uses that decomposition to dynamically tighten or relax governance thresholds — no static thresholds, no manual tuning. The system earns trust through demonstrated calibration.
"A possible path towards building a world model is to learn hierarchical representations of the world that capture both short-term and long-term dependencies." — LeCun, A Path Towards Autonomous Machine Intelligence (2022)
AETHER explores the complementary question: can JEPA model enterprise workflows, where the "world" is a structured sequence of business events?
## The Problem

Every AI governance system today uses static thresholds:
- Flag if `confidence < 0.90`
- Review if `drift > 0.15`
- Block if `uncertainty > 0.80`
These break immediately. A well-calibrated model gets held back by thresholds tuned for a bad one. A degrading model sails through gates set during its best day.
Worse: not all uncertainty is equal. A model that's uncertain because it hasn't seen enough data (epistemic) should trigger more review — human judgment helps. A model that's uncertain because the process is inherently random (aleatoric) should not trigger more review — no amount of human oversight reduces coin-flip randomness.
No existing system makes this distinction.
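To make the distinction concrete, here is a minimal sketch — not AETHER's actual `UncertaintyDecomposer` — of the standard ensemble-based decomposition: total predictive entropy splits into an aleatoric term (the average per-member entropy) and an epistemic term (the disagreement between members).

```ts
// Illustrative sketch: decompose predictive uncertainty over an ensemble of
// K next-event distributions. total = aleatoric + epistemic (in nats).

function entropy(p: number[]): number {
  return -p.reduce((h, pi) => (pi > 0 ? h + pi * Math.log(pi) : h), 0);
}

/** Mean of per-member distributions: p̄(y) = (1/K) Σ_k p_k(y). */
function meanDistribution(members: number[][]): number[] {
  const K = members.length;
  return members[0].map((_, y) => members.reduce((s, p) => s + p[y], 0) / K);
}

function decompose(members: number[][]) {
  const total = entropy(meanDistribution(members));                  // H[p̄]
  const aleatoric =
    members.reduce((s, p) => s + entropy(p), 0) / members.length;    // E_k H[p_k]
  const epistemic = total - aleatoric;                               // mutual information
  return { total, aleatoric, epistemic, epistemicRatio: epistemic / Math.max(total, 1e-9) };
}

// Members agree on a 50/50 outcome: pure aleatoric — more review won't help.
decompose([[0.5, 0.5], [0.5, 0.5]]); // epistemic ≈ 0
// Members confidently disagree: mostly epistemic — review helps.
decompose([[0.9, 0.1], [0.1, 0.9]]); // epistemic ≈ 0.37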
## How It Works

Every governance threshold is computed from a base value and independently modulated factors:

```
effective_threshold = base × mode_factor × uncertainty_factor × calibration_factor
min_floor           = 0.50 + 0.05 × log(vocab_size / 20) / log(4)   # v3: vocabulary-aware
```
Each factor is independently computed and composable:
| Factor | What it captures | Effect |
|---|---|---|
| Mode | Operational context (flexible → strict → forbidden) | Symbolic governance from PromptSpeak modes |
| Uncertainty | Epistemic ratio of total uncertainty | Only reducible uncertainty tightens governance |
| Calibration | Recent ECE/MCE/Brier score | Poorly calibrated models get tighter oversight |
| Vocabulary | Activity taxonomy complexity (v3) | High-vocab datasets get conservative floors |
The key insight: aleatoric uncertainty is ignored in governance tightening. This is the formal contribution. It means the system won't waste human attention on inherently random outcomes.
v3 addition: The vocabulary-aware minimum floor prevents regressions on high-activity datasets. At 80+ activities, the floor rises to match static thresholds, implementing a "do no harm" principle.
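A minimal sketch of the modulation, assuming illustrative factor shapes and clamping bounds — AETHER's actual factors live in `aether.config.ts` and the governance module:

```ts
// Mirrors the formula above. Coefficient values are those from aether.config.ts;
// the exact factor shapes and the 0.99 ceiling are assumptions for illustration.
const COEFFICIENTS = { modeStrength: 0.3, uncertaintyStrength: 0.5, calibrationStrength: 0.4 };

type Mode = "flexible" | "strict";

function effectiveThreshold(opts: {
  base: number;           // e.g. BASE_THRESHOLDS.reviewGateAutoPass = 0.92
  mode: Mode;
  epistemicRatio: number; // epistemic / total uncertainty, in [0, 1]
  ece: number;            // recent Expected Calibration Error, in [0, 1]
  vocabSize: number;      // number of distinct activities
}): number {
  // Strict mode tightens; flexible leaves the base alone.
  const modeFactor = opts.mode === "strict" ? 1 + COEFFICIENTS.modeStrength : 1;
  // Only the *reducible* (epistemic) share of uncertainty tightens governance.
  const uncertaintyFactor = 1 + COEFFICIENTS.uncertaintyStrength * opts.epistemicRatio;
  // Poor calibration tightens governance.
  const calibrationFactor = 1 + COEFFICIENTS.calibrationStrength * opts.ece;

  const raw = opts.base * modeFactor * uncertaintyFactor * calibrationFactor;

  // v3 vocabulary-aware floor: rises with taxonomy complexity ("do no harm").
  const minFloor = 0.5 + (0.05 * Math.log(opts.vocabSize / 20)) / Math.log(4);

  return Math.min(Math.max(raw, minFloor), 0.99); // bounded to prevent pathology
}
```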
Trust is earned slowly and lost quickly:

```
SUPERVISED    ──[10 calibrated windows]──> GUIDED
GUIDED        ──[20 calibrated windows]──> COLLABORATIVE
COLLABORATIVE ──[50 calibrated windows]──> AUTONOMOUS

Any level ──[1 critical miss]──────────> immediate demotion
Any level ──[immutable violation]──────> reset to SUPERVISED
```

This mirrors real-world trust: it takes months to build and seconds to destroy.
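A sketch of the ladder as a state machine. The promotion windows come from the diagram above; demoting exactly one level per critical miss is an assumption for illustration:

```ts
const LEVELS = ["SUPERVISED", "GUIDED", "COLLABORATIVE", "AUTONOMOUS"] as const;
type Level = (typeof LEVELS)[number];

// Consecutive well-calibrated windows required to reach the next level.
const WINDOWS_TO_PROMOTE: Record<Level, number | null> = {
  SUPERVISED: 10, GUIDED: 20, COLLABORATIVE: 50, AUTONOMOUS: null,
};

class AutonomyController {
  private level: Level = "SUPERVISED";
  private calibratedStreak = 0;

  onWindow(wellCalibrated: boolean): Level {
    this.calibratedStreak = wellCalibrated ? this.calibratedStreak + 1 : 0;
    const needed = WINDOWS_TO_PROMOTE[this.level];
    if (needed !== null && this.calibratedStreak >= needed) {
      this.level = LEVELS[LEVELS.indexOf(this.level) + 1]; // slow promotion
      this.calibratedStreak = 0;
    }
    return this.level;
  }

  onCriticalMiss(): Level {
    const i = LEVELS.indexOf(this.level);
    this.level = LEVELS[Math.max(i - 1, 0)]; // immediate demotion (assumed: one level)
    this.calibratedStreak = 0;
    return this.level;
  }

  onImmutableViolation(): Level {
    this.level = "SUPERVISED"; // full reset
    this.calibratedStreak = 0;
    return this.level;
  }
}
```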
Some constraints never relax, regardless of trust level or calibration:
- Forbidden mode → always block
- Sensitive data patterns (SSN, API keys, private keys) → always hold
- Dempster-Shafer conflict > 0.7 → always review
- Circuit breaker floor → 3+ consecutive failures = block
- Uncertainty ceiling > 0.95 → always hold
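A sketch of the safety floor as a short-circuit check that runs before any adaptive logic; the signal names are illustrative:

```ts
type Decision = "allow" | "hold" | "review" | "block";

interface Signals {
  mode: "flexible" | "strict" | "forbidden";
  containsSensitiveData: boolean; // SSN, API keys, private keys, …
  dsConflict: number;             // Dempster-Shafer conflict mass
  consecutiveFailures: number;
  totalUncertainty: number;
}

// Returns a forced decision if any immutable rule fires, else null so the
// request falls through to the adaptive (modulated) gates.
function immutableFloor(s: Signals): Decision | null {
  if (s.mode === "forbidden") return "block";
  if (s.containsSensitiveData) return "hold";
  if (s.dsConflict > 0.7) return "review";
  if (s.consecutiveFailures >= 3) return "block"; // circuit breaker
  if (s.totalUncertainty > 0.95) return "hold";   // uncertainty ceiling
  return null;
}
```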
## Quick Start

```bash
# TypeScript MCP server
git clone https://github.com/christopherbailey/aether.git
cd aether
npm install
npm run build
npm test                          # 99 tests — governance, modulation, bridge, tools, vocab-aware
```

```bash
# Python core
pip install -e ".[dev]"           # Editable install with test dependencies
python -m pytest core/tests/ -v  # 303 tests — encoder, world model, critic, training, data
```

```bash
# Terminal 1: Python inference server
python -m core.inference.server   # Starts on localhost:8712

# Terminal 2: MCP server (connects to Claude, Cursor, etc.)
npm start
```

That's it. AETHER exposes 6 MCP tools that any AI assistant can call to get uncertainty-aware predictions and governance decisions.
## Architecture

```
                        MCP Tools (6)
     predict_next_event · predict_outcome · get_calibration
  get_autonomy_level · get_effective_thresholds · evaluate_gate
                             │
                    TypeScript MCP Server
     ├── Governance Modulation      ← aether.config.ts
     ├── Autonomy Controller        (asymmetric trust)
     ├── Immutable Constraints      (safety floor)
                             │
                     HTTP bridge (:8712)
                             │
                    Python FastAPI Server
     ├── EventEncoder           (activity + time + context → 128D)
     ├── TransitionModel        (JEPA predictor: z_t → z_{t+1})
     ├── EnergyScorer           (energy-based anomaly scoring)
     ├── HierarchicalPredictor  (activity / phase / outcome)
     ├── LatentVariable         (Gumbel-Softmax path variants)
     ├── UncertaintyDecomposer  (epistemic vs. aleatoric)
     ├── CalibrationTracker     (ECE / MCE / Brier)
     └── ConformalPredictor     (distribution-free prediction sets)
```
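A sketch of the energy-scoring idea, using squared L2 distance in latent space as the energy. This is an illustration: the actual `EnergyScorer` may use a learned head, and the real predictor also conditions on action and variant:

```ts
type Latent = number[]; // 128-D latent state

// Squared L2 distance as the energy: low energy = conformant transition,
// high energy = anomaly.
function energy(predicted: Latent, actual: Latent): number {
  return predicted.reduce((e, p, i) => e + (p - actual[i]) ** 2, 0);
}

function isAnomalous(
  zT: Latent,
  zNext: Latent,
  predict: (z: Latent) => Latent, // stand-in for f(z_t, action, variant) → ẑ_{t+1}
  threshold: number,
): boolean {
  return energy(predict(zT), zNext) > threshold;
}
```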
**Python core (`core/`)**

| Module | Purpose |
|---|---|
| `encoder/` | Event → 128-D latent state via vocabularies + Time2Vec + causal transformer |
| `world_model/` | JEPA-style transition model with energy scoring and hierarchical predictions |
| `critic/` | Epistemic/aleatoric decomposition, calibration tracking, adaptive conformal inference |
| `training/` | VICReg + SIGReg loss functions, multi-loss training loop, checkpoints |
| `inference/` | FastAPI server with `/predict`, `/calibration`, `/health` endpoints |
| `data/` | Unified pipeline for SAP, BPI 2019, OCEL 2.0, and CSV event logs |
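A minimal client sketch against the inference server. The endpoint paths come from the table above; the request and response shapes shown here are assumptions, not the server's actual schema:

```ts
const BASE = process.env.AETHER_PYTHON_URL ?? "http://localhost:8712";

// Liveness check against the documented /health endpoint.
async function health(): Promise<boolean> {
  const res = await fetch(`${BASE}/health`);
  return res.ok;
}

// POST a case prefix to /predict. The payload shape is hypothetical.
async function predictNext(events: Array<{ activity: string; timestamp: string }>) {
  const res = await fetch(`${BASE}/predict`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ events }),
  });
  if (!res.ok) throw new Error(`predict failed: ${res.status}`);
  return res.json();
}
```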
**TypeScript MCP server (`mcp-server/`)**

| Module | Purpose |
|---|---|
| `governance/` | Compositional modulation, autonomy state machine, immutable safety |
| `bridge/` | HTTP client to the Python server with conservative fallbacks |
| `tools/` | 6 MCP tools for predictions, calibration, and governance decisions |
| `types/` | Full type system mirroring the Python structures |
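A sketch of what a conservative fallback can mean in practice: if the Python server is unreachable, fail closed by reporting maximum uncertainty so governance tightens instead of relaxing. Field names are illustrative:

```ts
interface Prediction {
  nextActivity: string | null;
  epistemicRatio: number;
  totalUncertainty: number;
}

const CONSERVATIVE_FALLBACK: Prediction = {
  nextActivity: null,
  epistemicRatio: 1.0,   // treat all uncertainty as reducible → tighten review
  totalUncertainty: 1.0, // trips the 0.95 uncertainty ceiling → hold
};

async function predictWithFallback(
  fetchPrediction: () => Promise<Prediction>,
): Promise<Prediction> {
  try {
    return await fetchPrediction();
  } catch {
    return CONSERVATIVE_FALLBACK; // never fail open
  }
}
```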
## Configuration

All governance tuning lives in one file: `mcp-server/src/governance/aether.config.ts`

```ts
export const BASE_THRESHOLDS = {
  driftThreshold: 0.15,        // Concept drift detection
  reviewGateAutoPass: 0.92,    // Auto-pass confidence
  threatActivation: 0.60,      // Threat level activation
  conformanceDeviation: 0.05,  // Process conformance
  sayDoGap: 0.20,              // Say-Do consistency
  knowledgePromotion: 0.75,    // Knowledge promotion score
};

export const COEFFICIENTS = {
  modeStrength: 0.3,         // Governance mode sensitivity
  uncertaintyStrength: 0.5,  // Epistemic uncertainty sensitivity
  calibrationStrength: 0.4,  // Calibration quality sensitivity
};
```

Every threshold is bounded to prevent pathological behavior. See `aether.config.ts` for the full configuration.
Environment variables:

| Variable | Default | Description |
|---|---|---|
| `AETHER_PYTHON_URL` | `http://localhost:8712` | Python inference server URL |
| `AETHER_BPI2019_PATH` | — | Path to the BPI 2019 dataset JSON file |
## MCP Tools

AETHER exposes 6 tools via the Model Context Protocol:

| Tool | Description |
|---|---|
| `predict_next_event` | Next-activity predictions with uncertainty decomposition and conformal sets |
| `predict_outcome` | Case outcome prediction (on-time, rework, remaining hours) |
| `get_calibration` | Current model calibration metrics (ECE, MCE, Brier) |
| `get_autonomy_level` | Trust state: SUPERVISED → GUIDED → COLLABORATIVE → AUTONOMOUS |
| `get_effective_thresholds` | All 6 adaptive thresholds with the full modulation breakdown |
| `evaluate_gate` | Allow/hold/block decision with an audit trail |
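For orientation, this is the shape of a standard MCP `tools/call` request targeting one of these tools. The envelope comes from the MCP spec; the argument fields are hypothetical — check each tool's input schema for the real ones:

```ts
// Hypothetical JSON-RPC payload for invoking evaluate_gate over MCP.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "evaluate_gate",
    arguments: {
      caseId: "PO-2024-00147",                 // hypothetical field
      proposedAction: "auto_approve_invoice",  // hypothetical field
    },
  },
};
```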
Add to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "aether": {
      "command": "node",
      "args": ["/path/to/aether/mcp-server/dist/index.js"]
    }
  }
}
```

## References

- JEPA — LeCun, 2022. *A Path Towards Autonomous Machine Intelligence*. The foundational architecture.
- LeJEPA — Balestriero & LeCun, 2025. *Provable and Scalable Self-Supervised Learning Without the Heuristics* (arXiv:2511.08544). SIGReg regularization via the Epps-Pulley test over random projections; AETHER uses the eigenvalue formulation. (Official repo)
- I-JEPA — Assran et al., CVPR 2023. Joint embedding for images.
- V-JEPA — Bardes et al., 2024. Joint embedding for video.
- VICReg — Bardes, Ponce & LeCun, ICLR 2022. Variance-Invariance-Covariance Regularization.
- Energy-Based Models — LeCun et al., 2006. A Tutorial on Energy-Based Learning. The theoretical framework for AETHER's anomaly scoring.
- Adaptive Conformal Inference — Gibbs & Candès, NeurIPS 2021. Distribution-free prediction sets.
- Law of Total Variance — Classic result; the basis for the epistemic/aleatoric uncertainty decomposition.
- Time2Vec — Kazemi et al., ICLR 2019. Continuous temporal encoding.
## Evaluation

AETHER has been evaluated on 10 process-mining datasets across 5 domains:
| Dataset | Domain | Cases | MCC Improvement | Notes |
|---|---|---|---|---|
| Road Traffic Fine | Government | 30,074 | +266% | Scale validation (150K total) |
| SAP Workflow | Enterprise | 2,896 | +31.3% | Best enterprise result |
| Wearable Tracker | Retail | 218 | +17.8% | O2C process |
| Sepsis | Healthcare | 210 | +2.3% | Clinical workflows |
| BPI 2019 | Finance | 500 | +0.6% | Procurement |
| BPIC 2012 | Finance | 500 | +0.4% | Loan applications |
| Judicial | Legal | 5 | 0.0% | Novel domain |
| BPI 2018 | Government | 2,000 | -2.2% | v3 floor applied |
| NetSuite 2025 | Finance | 274 | -3.3% | High class imbalance |
| SAP BSP669 | Enterprise | 767 | -24.0% | 77 activities (v3 candidate) |
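For reference, MCC is the Matthews correlation coefficient; for a binary confusion matrix it is

$$
\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP+FP)\,(TP+FN)\,(TN+FP)\,(TN+FN)}}
$$

It ranges over $[-1, 1]$, with 0 meaning chance-level prediction, which is why relative changes are reported above.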
Key findings:
- AETHER improves or matches MCC on 7/10 datasets
- Largest improvement at scale: Road Traffic Fine (+266% on 150K cases)
- v3 vocabulary-aware floor reduces regressions on high-activity datasets
See `docs/BENCHMARK_COMPARISON.md` for the detailed analysis.
## Testing

```bash
npm test                         # TypeScript: 99 tests
python -m pytest core/tests/ -v  # Python: 303 tests

npm run test:coverage            # TypeScript coverage report
npm run test:python:coverage     # Python coverage report
npm run test:all                 # Run everything
```

CI runs automatically on push to `main` and on all PRs via GitHub Actions.
## License

MIT — Christopher Bailey, 2026