Models ML

Models — Analytical / ML Approaches (index)

A rigorous, skeptical survey of the analytical and machine-learning model family relevant to a conditional probabilistic seismic forecasting product. This page is an index: each method now lives on its own deep sub-page with intuition and history, governing equations and derivation, training/estimation, strengths and limitations, role in operational forecasting, a diagram, and a References section with DOIs.

The verdict, stated up front (honesty over hype). As of 2026, no machine-learning model has been shown to reliably beat a well-fit ETAS for short-term earthquake forecasting under fair, prospective, CSEP-style testing. This is not a vibe — it is the explicit conclusion of the most rigorous benchmark to date (EarthquakeNPP; see RECAST-and-FERN and Detection-vs-Forecasting). ML does add genuine value, but in specific, honest places: better catalogs upstream (detection), multivariate covariate ingestion, learned spatial anisotropy, and inference speed. This is why the product ships an ETAS-class core (see Models-Classical and Models-Employed) with any neural model gated as a challenger that must beat ETAS in our own prospective Evaluation-and-Tests and be calibrated before it reaches the public map.

The forecasting target (shared by every method)

A seismicity catalog is a realization of a marked spatio-temporal point process. Every method — statistical or neural — is ultimately estimating the conditional intensity function $\lambda^*(t,x,y,m \mid \mathcal{H}_t)$, the instantaneous expected rate of events given history. The log-likelihood over $[0,T]$,

$$\log \mathcal{L} = \sum_{i=1}^{n} \log \lambda^*(t_i) \;-\; \int_0^T \lambda^*(\tau)\, d\tau,$$

contains a compensator / survival term (the integral) that makes a model probabilistic and calibratable rather than a regressor. A forecasting system that does not evaluate this term is not doing point-process forecasting — the structural root of most ML-forecasting failures. The full treatment is in Temporal-Point-Processes.
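To make the role of the compensator concrete, here is a minimal sketch of the log-likelihood above for a purely temporal Hawkes process. It assumes an exponential kernel $\alpha\beta e^{-\beta \Delta t}$, chosen only because its integral has a closed form; this is not the Omori-type kernel ETAS uses, and the function name `hawkes_loglik` and the parameters `mu`, `alpha`, `beta` are illustrative, not taken from the product's code.

```python
import math

def hawkes_loglik(times, T, mu, alpha, beta):
    """Log-likelihood on [0, T] of a temporal Hawkes process (illustrative sketch).

    Assumed intensity: lambda*(t) = mu + sum_{t_j < t} alpha * beta * exp(-beta * (t - t_j)).
    `times` must be sorted and lie in [0, T].
    """
    log_sum = 0.0
    decay = 0.0   # sum_{j < i} exp(-beta * (t_i - t_j)), updated recursively
    prev = None
    for t in times:
        if prev is not None:
            decay = math.exp(-beta * (t - prev)) * (decay + 1.0)
        # First term of the likelihood: log-intensity at each observed event.
        log_sum += math.log(mu + alpha * beta * decay)
        prev = t
    # Second term: the compensator, i.e. the integral of the intensity over [0, T].
    compensator = mu * T + alpha * sum(1.0 - math.exp(-beta * (T - t)) for t in times)
    return log_sum - compensator

if __name__ == "__main__":
    events = [0.5, 1.2, 3.0]
    print(hawkes_loglik(events, T=5.0, mu=0.3, alpha=0.4, beta=1.5))
```

Dropping the `compensator` line turns this into a sum of log-intensities that a model can inflate without penalty, which is the failure mode described above. With `alpha = 0` the expression reduces to the homogeneous Poisson log-likelihood $n \log \mu - \mu T$.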

flowchart TD
    TPP[Temporal Point Processes<br/>conditional intensity + likelihood] --> RMTPP[RMTPP<br/>RNN intensity]
    TPP --> NHP[Neural Hawkes<br/>continuous-time LSTM]
    TPP --> THP[Transformer Hawkes<br/>attention]
    RMTPP --> EQ[Earthquake-specific neural TPPs]
    NHP --> EQ
    THP --> EQ
    EQ --> RECAST[RECAST and FERN]
    CNN[CNN spatial models<br/>DeVries cautionary tale] -. spatial .-> EQ
    GRN[Graph and Recurrent networks] -. structure .-> EQ
    DET[Detection vs Forecasting<br/>the hard line] -. upstream catalogs .-> TPP
    RECAST --> GATE{Beats ETAS in prospective CSEP<br/>AND calibrated?}
    GATE -- yes --> PUB[Reaches the public map]
    GATE -- no --> ETAS[ETAS-class core stays]
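
The gate at the bottom of the diagram can be sketched as a comparison of joint log-likelihoods on the same prospective test window. This is a hypothetical illustration, not the product's actual gating code: the function names, the `min_gain` threshold, and the reduction of calibration to a single boolean are all assumptions. The real criteria live in Evaluation-and-Tests.

```python
def information_gain_per_event(ll_challenger, ll_baseline, n_events):
    """Mean log-likelihood advantage of the challenger per observed event."""
    if n_events <= 0:
        raise ValueError("need at least one observed event")
    return (ll_challenger - ll_baseline) / n_events

def gate(ll_challenger, ll_etas, n_events, is_calibrated, min_gain=0.0):
    """Hypothetical gate: promote only if the challenger gains information AND is calibrated."""
    gain = information_gain_per_event(ll_challenger, ll_etas, n_events)
    if gain > min_gain and is_calibrated:
        return "public map"
    return "ETAS core stays"

if __name__ == "__main__":
    # Illustrative numbers only: a challenger with a higher log-likelihood
    # that fails the calibration check does not pass.
    print(gate(-95.0, -100.0, n_events=50, is_calibrated=False))
```

Both conditions are conjunctive by design: a likelihood gain without calibration, or calibration without a gain, leaves the ETAS-class core in place.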

The methods — one line each

Temporal-Point-Processes — the unifying framework: conditional intensity $\lambda^*$, the compensator, the log-likelihood, thinning/simulation, and residual analysis. Every other method is a special case.
RMTPP — Recurrent Marked Temporal Point Process (Du et al., 2016): the first neural TPP, an RNN that embeds event history into a vector and parameterizes the intensity.
Neural-Hawkes-Process — Mei & Eisner (2017): a continuous-time LSTM whose hidden state decays between events, generalizing the Hawkes self-excitation to learned, non-additive dynamics.
Transformer-Hawkes-Process — Zuo et al. (2020) and self-attentive Hawkes: attention over the event history for long-range dependencies, with the same likelihood/compensator machinery.
RECAST-and-FERN — the two neural TPPs built specifically for earthquakes; the honest benchmark evidence (EarthquakeNPP) on where they match, and where they do not yet beat, ETAS.
CNN-Spatial-Models — CNN spatial forecasting and the canonical cautionary tale: DeVries (2018) vs. Mignan & Broccardo (2019) — why a single neuron matched a deep net, and the leakage/AUC lessons.
Graph-and-Recurrent-Networks — GNN/RNN/LSTM approaches; where graph structure and recurrence genuinely help (associations, upstream catalogs) and where they underwhelm for forecasting.
Detection-vs-Forecasting — the hard line: deep learning is transformative for detection and phase-picking (PhaseNet, EQTransformer) but that is not forecasting; the two must not be conflated.
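
As an illustration of the thinning/simulation step listed under Temporal-Point-Processes, the following is a minimal sketch of Ogata-style thinning for a temporal Hawkes process. It assumes an exponential kernel $\alpha\beta e^{-\beta \Delta t}$ so that the intensity only decays between events and its current value is a valid upper bound. The name `simulate_hawkes_thinning` and its parameters are illustrative; a real ETAS simulator would also draw magnitudes and locations.

```python
import math
import random

def simulate_hawkes_thinning(T, mu, alpha, beta, seed=0):
    """Simulate event times on [0, T) by thinning (illustrative sketch).

    Assumed intensity: lambda*(t) = mu + sum_{t_j < t} alpha * beta * exp(-beta * (t - t_j)).
    Requires mu > 0; alpha < 1 keeps the process subcritical.
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # self-exciting part of the intensity at the current time t
    while True:
        # The intensity is non-increasing until the next event, so its
        # current value bounds it from above on the whole waiting interval.
        lam_bar = mu + excitation
        wait = rng.expovariate(lam_bar)
        t += wait
        if t >= T:
            break
        excitation *= math.exp(-beta * wait)  # decay to the candidate time
        lam = mu + excitation
        # Accept the candidate with probability lam / lam_bar.
        if rng.random() * lam_bar <= lam:
            events.append(t)
            excitation += alpha * beta  # each accepted event raises the intensity
    return events

if __name__ == "__main__":
    catalog = simulate_hawkes_thinning(T=50.0, mu=0.5, alpha=0.5, beta=1.0, seed=1)
    print(len(catalog), "events simulated")
```

Simulated catalogs of this kind are what a catalog-based test compares against the observed one; the same loop also makes clear why a model without an evaluable intensity cannot be simulated from this way.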

Where ML helps, where it does not (honest synthesis)

| ML genuinely helps | ML has not beaten the classical baseline |
| --- | --- |
| Detection / phase-picking → better, more complete catalogs upstream (Detection-vs-Forecasting) | Short-term rate forecasting vs. a well-fit ETAS under fair prospective CSEP testing |
| Ingesting multivariate covariates a parametric ETAS cannot easily absorb | Any claim resting on AUC / classification framings (calibration-blind; banned as a primary metric) |
| Learned spatial anisotropy and flexible kernels | Anything trained or evaluated with temporal leakage (the DeVries lesson) |
| Inference speed and scalable conditioning | Deterministic "yes/no" prediction (impossible, never claimed) |

The discipline that keeps this honest is in Evaluation-and-Tests and Honest-Limits.
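
A small numerical sketch of why AUC is called calibration-blind above: any strictly monotone distortion of the forecast probabilities leaves the ranking, and therefore the AUC, unchanged, while a proper score such as the log loss changes. The labels and probabilities below are made-up toy values, and the helper names `auc` and `log_loss` are local to this example.

```python
import math

def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def log_loss(probs, labels):
    """Mean negative log-likelihood of binary outcomes under the forecast probabilities."""
    total = sum(math.log(p) if y == 1 else math.log(1.0 - p) for p, y in zip(probs, labels))
    return -total / len(labels)

# Toy data (hypothetical): 1 = at least one event in the space-time bin.
labels = [0, 0, 0, 1, 0, 1, 0, 1]
probs = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.15, 0.80]
# Strictly monotone distortion: same ranking, very different probabilities.
distorted = [p ** 3 for p in probs]

if __name__ == "__main__":
    print("AUC     :", auc(probs, labels), auc(distorted, labels))
    print("log loss:", log_loss(probs, labels), log_loss(distorted, labels))
```

The two AUC values are identical while the log losses differ, so a forecast can rank bins well and still publish probabilities that are systematically wrong. That is the reason a likelihood-based score, not AUC, has to be the primary metric for a product that publishes probabilities.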

See also: Models-Classical · Models-Employed · Temporal-Point-Processes · Methodology-History · Evaluation-and-Tests · Honest-Limits · References · Glossary

⚠️ Disclaimer — read this. CAOS_SEISMIC produces probabilistic forecasts, not predictions. It is an independent research and education tool. It is NOT an official earthquake early-warning or civil-protection system, it does NOT predict when, where, or how large an earthquake will be, and it must NOT be used for life-safety, emergency, or evacuation decisions. Every number it publishes is a bounded, calibrated probability conditioned on the present state of seismicity — never an alarm, a countdown, or a "safe" state. A single outcome neither confirms nor refutes a probabilistic forecast.

It complements, and does not replace or speak for, official agencies. Always follow your national seismological and civil-protection authorities (e.g. USGS, INGV, GeoNet, JMA, and in Chile CSN, with SENAPRED for civil protection). The software is provided "as is", without warranty of any kind (MIT License); the authors accept no liability for its use. Data are courtesy of their providers (USGS/ANSS, ISC/ISC-GEM, Global CMT, EMSC, CSN, and others) under their respective licenses and attribution terms. See Honest-Limits for the full epistemic context.

CAOS_SEISMIC · seismic.fasl-work.com · source · MIT
