ETAS Model
The model to beat. ETAS is a self-exciting point process in which every earthquake — main, fore, or after — can trigger its own offspring, which in turn trigger theirs, so that observed seismicity is a superposition of a stationary background and an "epidemic" of cascading triggered events. It is the de-facto operational baseline for short-term earthquake forecasting: any candidate model, classical or neural, must demonstrate positive, significant information gain over a well-fit ETAS in prospective CSEP testing before it can claim forecasting skill. v0 of this product ships an ETAS-class core (see Models-Employed).
Honest framing. ETAS produces a conditional intensity — an instantaneous expected rate, given the history — never a deterministic prediction. Integrated over a horizon and combined with the Gutenberg–Richter magnitude term, it yields a bounded, calibrated probability shown next to its long-term baseline. Even at the peak of an active sequence, the absolute probability of a large event in the next day is usually well under a few percent; the relative gain over background can be large, but the absolute number stays small.
Conventions.
- Intuition and history
- The conditional intensity
- Background term $\mu(x,y)$
- Utsu productivity $k(m)$
- Omori–Utsu temporal kernel $g(t)$
- Spatial kernel $f(r\mid m)$
- Branching ratio and the two stability gates
- Estimation — MLE and stochastic declustering
- The dual-catalog rule
- From intensity to the published probability
- Assumptions and honest limits
- Role in operational earthquake forecasting
- Worked illustration
- Structure of the process (diagram)
- References
The classical Omori–Utsu law describes the aftershocks of one mainshock. But aftershocks have aftershocks, foreshocks are simply mainshocks of smaller events that came first, and real catalogs are a tangle of overlapping cascades. Yosihiko Ogata's insight (1988) was to stop labelling events as "main" or "after" and instead model seismicity as a self-exciting point process — a Hawkes process — in which every event, regardless of label, contributes a burst of elevated rate to its space–time neighborhood, and those bursts superpose.
The name is deliberately epidemiological. Each earthquake is like an infected individual: it "infects" its surroundings with a temporarily elevated probability of further earthquakes (offspring), each of which can infect again. The whole observed sequence is an epidemic of triggering on top of a steady background of spontaneous (independent) events driven by long-term tectonic loading. Ogata (1998) extended the temporal model to space–time, adding a spatial triggering kernel so the model forecasts where as well as how many and how big.
ETAS is "physics-free" in the sense that it encodes no fault mechanics directly — only the empirical Gutenberg–Richter, Omori–Utsu, and Utsu productivity scaling laws. Yet it remains, decades on, the model that neural and physics-based forecasters are measured against, because it captures the dominant, most predictable feature of seismicity — clustering — with a handful of interpretable parameters (see Methodology-History and Models-ML).
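ETAS is defined by its conditional intensity

$$\lambda(t, x, y \mid \mathcal{H}_t) = \mu(x, y) + \sum_{i:\, t_i < t} k(m_i)\, g(t - t_i)\, f(x - x_i,\, y - y_i \mid m_i).$$

Each past event $i$ (occurring at time $t_i$, location $(x_i, y_i)$, with magnitude $m_i$) contributes a burst built from three factors:

- $k(m_i)$: productivity, how many offspring an event of magnitude $m_i$ seeds (§4),
- $g(t - t_i)$: temporal decay, the Omori–Utsu kernel, how that burst fades in time (§5),
- $f(x - x_i, y - y_i \mid m_i)$: spatial spread, where the offspring land relative to the parent (§6),

on top of the background rate $\mu(x, y)$ of spontaneous events (§3).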
The temporal-only special case (Ogata 1988), useful for intuition and for 1-D testing, is

$$\lambda(t \mid \mathcal{H}_t) = \mu + \sum_{i:\, t_i < t} k(m_i)\, g(t - t_i).$$
References. Ogata, Y. (1988), Statistical models for earthquake occurrences and residual analysis for point processes, J. Am. Stat. Assoc. 83(401), 9–27, doi:10.1080/01621459.1988.10478560; Ogata, Y. (1998), Space–time point-process models for earthquake occurrences, Ann. Inst. Statist. Math. 50(2), 379–402, doi:10.1023/A:1003403601725.
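A minimal Python sketch of the temporal-only intensity may help make the sum concrete. It assumes the productivity $k(m) = A\,e^{\alpha(m - M_0)}$ and a normalized Omori–Utsu density $g(t) = \frac{p-1}{c}(1 + t/c)^{-p}$; the function names, the three-event catalog, and all parameter values are hypothetical and chosen only for illustration, not fitted to any region.

```python
import math

def productivity(m, A, alpha, M0):
    """Utsu productivity k(m) = A * exp(alpha * (m - M0))."""
    return A * math.exp(alpha * (m - M0))

def omori_density(t, c, p):
    """Normalized Omori-Utsu kernel g(t) = (p-1)/c * (1 + t/c)^(-p), for t >= 0 and p > 1."""
    return (p - 1.0) / c * (1.0 + t / c) ** (-p)

def temporal_intensity(t, events, mu, A, alpha, M0, c, p):
    """lambda(t | H_t) = mu + sum over strictly earlier events of k(m_i) * g(t - t_i)."""
    rate = mu
    for t_i, m_i in events:
        if t_i < t:  # only the history before t contributes
            rate += productivity(m_i, A, alpha, M0) * omori_density(t - t_i, c, p)
    return rate

# Made-up catalog: (time in days, magnitude). Not real data.
events = [(0.0, 6.0), (0.5, 4.2), (2.0, 4.8)]
# Made-up parameters: mu in events/day, c in days.
params = dict(mu=0.1, A=0.2, alpha=1.5, M0=4.0, c=0.01, p=1.1)

quiet = temporal_intensity(-1.0, events, **params)   # before any event: background only
active = temporal_intensity(1.0, events, **params)   # one day into the sequence
print(f"rate before sequence: {quiet:.3f} /day, one day in: {active:.3f} /day")
```

Before the first event the rate equals the background $\mu$; afterwards every earlier event adds its own decaying burst, so the rate stays elevated long after the largest shock.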
The background is the rate of spontaneous earthquakes — those not triggered by any catalogued predecessor, driven by long-term tectonic loading. It is taken time-independent and factored as
with
The number of direct offspring an event seeds grows exponentially with its magnitude (the Utsu productivity law):

$$k(m) = A\, e^{\alpha (m - M_0)},$$

where

- $A$ (or $K_0$): the aftershock-occurrence rate at the reference magnitude $m = M_0$,
- $\alpha$: the productivity parameter, i.e. how strongly a larger event seeds more aftershocks. A larger $\alpha$ means big events dominate the triggering; a smaller $\alpha$ means even small events contribute appreciably.
- $M_0$: a reference / threshold magnitude (often set at or above $M_c$).
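The burst from each past event decays in time as the modified Omori law. Written as a normalized density in elapsed time $t$ (integrating to 1 over $t \in [0, \infty)$, which requires $p > 1$), it is

$$g(t) = \frac{p - 1}{c} \left(1 + \frac{t}{c}\right)^{-p}, \qquad t \ge 0,$$

with

- $c$: the Omori–Utsu time offset (hours to a day),
- $p$: the decay exponent (typically $p \approx 1$–1.3).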
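As a quick check of the normalization, the sketch below assumes the kernel has the form $g(t) = \frac{p-1}{c}(1 + t/c)^{-p}$, whose integral up to time $t$ is $G(t) = 1 - (1 + t/c)^{1-p}$. It compares a numerical integral against that closed form and prints the fraction of direct offspring expected within a few horizons. The values of `c` and `p` are illustrative assumptions only.

```python
import math

def omori_density(t, c, p):
    """Normalized Omori-Utsu density g(t) = (p-1)/c * (1 + t/c)^(-p)."""
    return (p - 1.0) / c * (1.0 + t / c) ** (-p)

def omori_cdf(t, c, p):
    """Closed-form integral of g from 0 to t: G(t) = 1 - (1 + t/c)^(1-p)."""
    return 1.0 - (1.0 + t / c) ** (1.0 - p)

def trapezoid(f, a, b, n):
    """Composite trapezoid rule with n equal steps on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

c, p = 0.01, 1.1  # c in days; illustrative values, not a fit
numeric = trapezoid(lambda t: omori_density(t, c, p), 0.0, 10.0, 200_000)
closed = omori_cdf(10.0, c, p)
print(f"integral of g over 10 days: numeric={numeric:.6f}, closed form={closed:.6f}")

for horizon_days in (1.0, 10.0, 100.0, 1000.0):
    share = omori_cdf(horizon_days, c, p)
    print(f"share of direct offspring within {horizon_days:g} days: {share:.3f}")
```

With $p$ this close to 1 the tail is heavy: for these values only a bit over a third of an event's direct offspring are expected within the first day, which is why long forecast horizons and long catalogs both matter when fitting $p$.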
This is the same kernel as in the standalone Omori–Utsu law; ETAS's contribution is to attach one such decaying burst to every event and sum them, which is what produces the secondary aftershocks and bursts that a single Omori law cannot. The same short-term incompleteness caveat applies (see Omori-Utsu-Law §incompleteness).
Offspring cluster around their parent, within a zone that grows with the parent's magnitude (a larger rupture seeds aftershocks over a larger area). Ogata's (1998) inverse-power spatial kernel, normalized to integrate to 1 over the plane, is

$$f(x, y \mid m) = \frac{q - 1}{\pi\, \zeta(m)^2} \left(1 + \frac{x^2 + y^2}{\zeta(m)^2}\right)^{-q}, \qquad \zeta(m) = D\, e^{\gamma (m - M_0)},$$

where $r = \sqrt{x^2 + y^2}$ is the distance from the parent and

- $D$: the characteristic triggering length at the reference magnitude $M_0$,
- $\gamma$: the magnitude scaling of the aftershock-zone size,
- $q$: the tail exponent, controlling how far triggering reaches (larger $q$ = tighter clustering).
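The kernel can be sampled in closed form, which is convenient when simulating offspring locations. The sketch below assumes the isotropic form $f(r \mid m) = \frac{q-1}{\pi \zeta^2}(1 + r^2/\zeta^2)^{-q}$ with $\zeta(m) = D\,e^{\gamma(m - M_0)}$, so that the fraction of offspring within distance $r$ is $F(r) = 1 - (1 + r^2/\zeta^2)^{1-q}$. All function names and the values in `KERNEL` are hypothetical.

```python
import math
import random

def bandwidth(m, D, gamma, M0):
    """Magnitude-dependent length scale zeta(m) = D * exp(gamma * (m - M0))."""
    return D * math.exp(gamma * (m - M0))

def spatial_density(r, m, D, gamma, q, M0):
    """Kernel value per unit area at distance r from a parent of magnitude m."""
    z2 = bandwidth(m, D, gamma, M0) ** 2
    return (q - 1.0) / (math.pi * z2) * (1.0 + r * r / z2) ** (-q)

def radial_cdf(r, m, D, gamma, q, M0):
    """Fraction of offspring within distance r: 1 - (1 + r^2/zeta^2)^(1-q)."""
    z2 = bandwidth(m, D, gamma, M0) ** 2
    return 1.0 - (1.0 + r * r / z2) ** (1.0 - q)

def sample_offset(m, D, gamma, q, M0, rng):
    """Draw one offspring offset (dx, dy) by inverting the radial CDF."""
    u = rng.random()  # uniform on [0, 1)
    z = bandwidth(m, D, gamma, M0)
    r = z * math.sqrt((1.0 - u) ** (-1.0 / (q - 1.0)) - 1.0)
    theta = rng.uniform(0.0, 2.0 * math.pi)  # isotropic direction
    return r * math.cos(theta), r * math.sin(theta)

KERNEL = dict(D=1.0, gamma=0.5, q=1.8, M0=4.0)  # D in km; illustrative values
rng = random.Random(0)

for m in (5.0, 6.0, 7.0):
    z = bandwidth(m, KERNEL["D"], KERNEL["gamma"], KERNEL["M0"])
    # Median distance solves F(r) = 1/2.
    r_median = z * math.sqrt(2.0 ** (1.0 / (KERNEL["q"] - 1.0)) - 1.0)
    offsets = [sample_offset(m, rng=rng, **KERNEL) for _ in range(5000)]
    inside = sum(math.hypot(dx, dy) <= r_median for dx, dy in offsets) / len(offsets)
    print(f"M{m}: zeta={z:.2f} km, median distance={r_median:.2f} km, sampled share inside={inside:.3f}")
```

The sampled share inside the median radius should sit near one half for every magnitude, while the radius itself grows with $m$ through $\zeta(m)$.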
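The bandwidth $\zeta(m) = D\, e^{\gamma (m - M_0)}$ sets the size of the triggering zone and grows exponentially with the parent's magnitude.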
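The branching ratio $n$ is the expected number of direct offspring per event, averaged over the magnitude distribution.

Because $g$ and $f$ each integrate to 1, an event of magnitude $m$ seeds $k(m)$ direct offspring on average. Weighting by the Gutenberg–Richter magnitude density $\beta\, e^{-\beta (m - M_0)}$ for $m \ge M_0$ gives

$$n = \int_{M_0}^{\infty} A\, e^{\alpha (m - M_0)}\, \beta\, e^{-\beta (m - M_0)}\, \mathrm{d}m = \frac{A \beta}{\beta - \alpha}.$$

This integral converges only if $\alpha < \beta$, which leads to two stability gates: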
- Finite branching ($\alpha < \beta$). If $\alpha \ge \beta$ the productivity × magnitude integral diverges: the largest events dominate without bound and the model is improper. The constraint $\alpha < \beta$ (with $\beta = b \ln 10$) is a genuine real-world fitting gotcha; $\alpha$ and $b$ must be estimated consistently.
- Subcriticality / stationarity ($n < 1$). Given $\alpha < \beta$, the branching ratio must satisfy $n < 1$ for the cascade to die out:
  - $n < 1$ → subcritical: each generation is smaller; the sequence is stationary and the expected total number of descendants per event is finite, $1/(1-n)$.
  - $n = 1$ → critical.
  - $n > 1$ → supercritical: explosive, non-physical for a real finite catalog; a fit with $n \ge 1$ is rejected as a mis-fit.
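The two gates are easy to automate as a post-fit check. The sketch below assumes an unbounded Gutenberg–Richter magnitude distribution above $M_0$ and productivity $A\,e^{\alpha(m - M_0)}$, for which the branching ratio has the closed form $n = A\beta/(\beta - \alpha)$. A fit with a maximum-magnitude cutoff would need a different expression. The function names and parameter values are hypothetical.

```python
import math

def branching_ratio(A, alpha, b):
    """n = A * beta / (beta - alpha) with beta = b * ln(10); infinite if alpha >= beta."""
    beta = b * math.log(10.0)
    if alpha >= beta:
        return math.inf  # the productivity x magnitude integral diverges
    return A * beta / (beta - alpha)

def check_stability(A, alpha, b):
    """Evaluate both gates and the expected family size 1 / (1 - n)."""
    beta = b * math.log(10.0)
    n = branching_ratio(A, alpha, b)
    finite_branching = alpha < beta                 # gate 1
    subcritical = finite_branching and n < 1.0      # gate 2
    family_size = 1.0 / (1.0 - n) if subcritical else math.inf
    return {
        "beta": beta,
        "n": n,
        "finite_branching": finite_branching,
        "subcritical": subcritical,
        "family_size": family_size,
    }

# Three made-up parameter sets: acceptable, gate-1 failure, gate-2 failure.
accepted = check_stability(A=0.2, alpha=1.5, b=1.0)
improper = check_stability(A=0.2, alpha=2.5, b=1.0)
explosive = check_stability(A=0.6, alpha=1.5, b=1.0)

for label, result in (("accepted", accepted), ("improper", improper), ("explosive", explosive)):
    print(label, {key: (round(val, 3) if isinstance(val, float) else val)
                  for key, val in result.items()})
```

Checking gate 1 before computing $n$ matters: when $\alpha \ge \beta$ the closed form returns a negative or undefined number that could otherwise be mistaken for a subcritical fit.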
The expected family size
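Maximum likelihood. The parameters ($\mu$, $A$, $\alpha$, $c$, $p$, $D$, $\gamma$, $q$) are estimated by maximizing the log-likelihood of the catalog over the observation window $[0, T]$ and the study region (written $A$ under the integral, not to be confused with the productivity constant):

$$\ln L = \sum_{i} \ln \lambda(t_i, x_i, y_i \mid \mathcal{H}_{t_i}) - \int_{0}^{T}\!\!\int_{A} \lambda(t, x, y \mid \mathcal{H}_t)\, \mathrm{d}x\, \mathrm{d}y\, \mathrm{d}t .$$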
The first (sum) term rewards high intensity where events occurred; the second (compensator / integral) term penalizes total predicted intensity over the whole space–time volume and is what makes the fit a calibrated probability model rather than a regressor. The same log-likelihood is reused as the L-test score in CSEP evaluation.
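For the temporal-only model the compensator has a closed form, which makes the two terms easy to see in code. The sketch below assumes $k(m) = A\,e^{\alpha(m - M_0)}$ and the normalized Omori–Utsu density with integral $G(t) = 1 - (1 + t/c)^{1-p}$, so the compensator is $\mu T + \sum_i k(m_i)\,G(T - t_i)$. The function names, catalog, and parameters are hypothetical, and the double loop is $O(N^2)$, so this is for illustration rather than production fitting.

```python
import math

def productivity(m, A, alpha, M0):
    return A * math.exp(alpha * (m - M0))

def omori_density(t, c, p):
    return (p - 1.0) / c * (1.0 + t / c) ** (-p)

def omori_cdf(t, c, p):
    return 1.0 - (1.0 + t / c) ** (1.0 - p)

def temporal_log_likelihood(events, T, mu, A, alpha, M0, c, p):
    """ln L = sum_i ln lambda(t_i) - integral_0^T lambda(t) dt for temporal ETAS.

    `events` is a time-sorted list of (t_i, m_i) with 0 <= t_i <= T.
    """
    log_sum = 0.0
    compensator = mu * T  # background contribution to the integral
    for i, (t_i, m_i) in enumerate(events):
        rate = mu
        for t_j, m_j in events[:i]:
            if t_j < t_i:  # only strictly earlier events trigger
                rate += productivity(m_j, A, alpha, M0) * omori_density(t_i - t_j, c, p)
        log_sum += math.log(rate)
        # Expected offspring of event i that fall inside the window [t_i, T].
        compensator += productivity(m_i, A, alpha, M0) * omori_cdf(T - t_i, c, p)
    return log_sum - compensator

events = [(0.0, 6.0), (0.5, 4.2), (2.0, 4.8)]  # made-up (days, magnitude)
shared = dict(A=0.2, alpha=1.5, M0=4.0, c=0.01, p=1.1)

ll_plausible = temporal_log_likelihood(events, T=10.0, mu=0.1, **shared)
ll_inflated = temporal_log_likelihood(events, T=10.0, mu=50.0, **shared)
print(f"ln L with mu=0.1: {ll_plausible:.2f}; with mu=50: {ll_inflated:.2f}")
```

Raising $\mu$ to an absurd value increases every $\ln \lambda(t_i)$ term slightly but inflates the compensator far more, so the log-likelihood drops. That is the penalty on total predicted intensity described above.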
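Stochastic declustering / EM. A second, complementary estimator (Zhuang, Ogata & Vere-Jones 2002) assigns to each event a probability of being background versus triggered:

$$\varphi_i = \frac{\mu(x_i, y_i)}{\lambda(t_i, x_i, y_i \mid \mathcal{H}_{t_i})}, \qquad 1 - \varphi_i = \text{probability that event } i \text{ was triggered}.$$

Iterating (an EM scheme) between these weights and the parameters recovers the background field $\mu(x,y)$.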
Bayesian / scalable inference. For uncertainty quantification and scalability the product can use
a Bayesian posterior (e.g. INLAbru, or simulation-based ETAS), giving the full parameter covariance
needed for honest forecast bounds rather than a single point estimate (see
Technical-Architecture and Evaluation-and-Tests). Software in the ecosystem includes the R
ETAS/bayesianETAS packages and pyetas.
References. Ogata, Y. (1988), doi:10.1080/01621459.1988.10478560; Zhuang, J., Ogata, Y. & Vere-Jones, D. (2002), Stochastic declustering of space–time earthquake occurrences, J. Am. Stat. Assoc. 97(458), 369–380, doi:10.1198/016214502760046925.
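The background-versus-triggered weights from the stochastic-declustering paragraph above can be sketched for the temporal-only case, where the weight of event $i$ reduces to $\mu / \lambda(t_i)$. This is only the weighting step under fixed parameters, not the full Zhuang et al. algorithm (which also re-estimates the spatial background from the weights). The kernel forms, function names, catalog, and parameter values are assumptions for illustration.

```python
import math

def productivity(m, A, alpha, M0):
    return A * math.exp(alpha * (m - M0))

def omori_density(t, c, p):
    return (p - 1.0) / c * (1.0 + t / c) ** (-p)

def background_probabilities(events, mu, A, alpha, M0, c, p):
    """Return phi_i = mu / lambda(t_i) for each event of a time-sorted catalog."""
    probs = []
    for i, (t_i, _m_i) in enumerate(events):
        triggered_rate = sum(
            productivity(m_j, A, alpha, M0) * omori_density(t_i - t_j, c, p)
            for t_j, m_j in events[:i]
            if t_j < t_i
        )
        probs.append(mu / (mu + triggered_rate))
    return probs

events = [(0.0, 6.0), (0.5, 4.2), (2.0, 4.8), (400.0, 4.1)]  # made-up (days, magnitude)
params = dict(mu=0.1, A=0.2, alpha=1.5, M0=4.0, c=0.01, p=1.1)

phi = background_probabilities(events, **params)
for (t_i, m_i), weight in zip(events, phi):
    print(f"t={t_i:6.1f} d  M{m_i}: P(background)={weight:.3f}")

# The weights sum to the expected number of background events in the catalog.
print(f"expected background events: {sum(phi):.2f} of {len(events)}")
```

The first event has no predecessors and is background with probability 1; events shortly after the large shock get low weights, and the isolated late event drifts back toward background.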
A subtle but decisive pipeline rule distinguishes a correct ETAS deployment from a naive one:
- Background $\mu(x,y)$ is estimated on a declustered catalog (Zaliapin–Ben-Zion nearest-neighbour, or Gardner–Knopoff windowing as a cross-check), so the background reflects independent mainshocks, not aftershock-inflated rates.
- The conditional triggering term is fit on the full, un-declustered catalog, because aftershock/foreshock triggering is the predictable short-term signal; declustering it away would discard exactly what ETAS exists to model.
Forecasts are scored on the non-declustered catalog, because the product deliberately forecasts clustering. A consequence that must be stated, not hidden: because most scored events are aftershocks, a model that merely reproduces Omori decay already passes the consistency tests — so skill is established only by winning a comparison test against a real ETAS baseline (positive information gain per earthquake; see Evaluation-and-Tests), never by passing consistency tests alone (see Models-Employed).
Reference. Zaliapin, I. & Ben-Zion, Y. (2020), Earthquake declustering using the nearest-neighbor approach in space-time-magnitude domain, J. Geophys. Res. Solid Earth 125, e2018JB017120, doi:10.1029/2018JB017120.
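ETAS gives the conditional intensity $\lambda$, an instantaneous rate, not a probability. The published number is obtained by integrating the intensity over the forecast horizon $[t, t + \Delta]$ and the region to get an expected count, scaling it by the Gutenberg–Richter exceedance $\Phi(M^*)$ for the target magnitude $M^*$, and applying the Poisson wrapper:

$$N = \Phi(M^*)\; \mathbb{E}\!\left[\int_{t}^{t+\Delta}\!\!\int_{A} \lambda(s, x, y \mid \mathcal{H}_s)\, \mathrm{d}x\, \mathrm{d}y\, \mathrm{d}s\right], \qquad P = 1 - e^{-N}.$$

In practice the integral is evaluated by simulating a large ensemble of synthetic catalogs (≥ 10,000 per day) from the fitted intensity, which also yields the over-dispersion-honest, catalog-based uncertainty bounds that a naive Poisson interval would understate (see Evaluation-and-Tests and Models-Employed).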
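A short sketch of the last step, from expected count to probability. It assumes the exceedance takes the Gutenberg–Richter form $\Phi(M^*) = 10^{-b(M^* - M_0)}$ and shows two estimators: the Poisson wrapper $P = 1 - e^{-N}$ on the mean count, and a catalog-based average over an ensemble of simulated event counts. The ensemble here is a made-up, deliberately over-dispersed stand-in for real simulation output, and all names and values are hypothetical.

```python
import math

def gr_exceedance(m_target, b, M0):
    """Fraction of events at or above m_target under Gutenberg-Richter: 10^(-b (m* - M0))."""
    return 10.0 ** (-b * (m_target - M0))

def poisson_probability(expected_count):
    """P(at least one event) = 1 - exp(-N); expm1 keeps precision for small N."""
    return -math.expm1(-expected_count)

def ensemble_probability(counts, exceedance):
    """Average over catalogs of P(at least one of n events reaches the target magnitude)."""
    hits = [1.0 - (1.0 - exceedance) ** n for n in counts]
    return sum(hits) / len(hits)

phi = gr_exceedance(6.5, b=1.0, M0=4.0)

# Stand-in for 10,000 simulated catalogs: most are quiet, a few are very active.
counts = [2] * 9000 + [32] * 1000
mean_count = sum(counts) / len(counts)

p_poisson = poisson_probability(mean_count * phi)
p_ensemble = ensemble_probability(counts, phi)

print(f"mean simulated count of M>=4.0 events: {mean_count:.1f}")
print(f"exceedance Phi(6.5): {phi:.5f}")
print(f"Poisson wrapper on the mean:  {p_poisson:.4%}")
print(f"catalog-based ensemble value: {p_ensemble:.4%}")
```

Both numbers stay small even with several events expected in the horizon, and in this made-up ensemble they differ slightly because the simulated counts are not Poisson distributed. That gap is what the catalog-based bounds are meant to capture.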
- Isotropic, magnitude-scaled triggering. The standard spatial kernel is circular, whereas real aftershock zones are elongated along the rupture. For great earthquakes this is a known simplification; anisotropic / finite-fault ETAS variants exist and matter for subduction megathrusts (a flagged design decision for a Chile build; see Models-Employed).
- Separable magnitude. Magnitudes are drawn from a fixed GR distribution independent of history; in reality $b$ varies in space and time, and that variation is carried separately with uncertainty.
- Stationary background. $\mu(x,y)$ is assumed time-independent, an assumption violated near swarms, slow-slip episodes, and induced/injection-driven seismicity, where the background itself moves.
- Post-mainshock incompleteness. The highest-stakes moment is exactly where the catalog is worst; without an incompleteness-aware likelihood, ETAS under-forecasts aftershocks right after a large event (see Omori-Utsu-Law §incompleteness).
- Not a predictor. ETAS forecasts a rate; whether a small rupture cascades into a great earthquake depends on unmeasurably fine details of the crust, and a single outcome neither validates nor invalidates a probabilistic forecast (see Honest-Limits).
ETAS is the gold-standard physics-free short-term baseline and the v0 core of this product:
- It is what real operational systems run: USGS Operational Aftershock Forecasting and OEF-Italy (an ensemble that includes ETAS) publish calibrated probabilities with uncertainty as scheduled services.
- It is the benchmark gate: any neural temporal point process or hybrid must beat a well-fit ETAS in prospective CSEP testing, and as of 2026 no pure neural model has robustly done so on the public benchmarks (see Models-ML). This is a requirement, not a claim that ML can never help.
- It supplies the conditional intensity that, combined with GR and the Poisson wrapper, produces the daily, region-scoped, horizon-specific probabilities the product publishes (see Models-Employed and Pipeline).
The product ships a region-refit space–time ETAS with the full hygiene pipeline — per-region
Consider a temporal ETAS with reference
Stability check (gate 1). Is
Since
Forecast amplitude. Suppose an
flowchart TD
BG[Background mu of x,y<br/>spontaneous events<br/>smoothed seismicity, declustered] --> SUM[Conditional intensity<br/>lambda of t,x,y given history]
subgraph TRIG[Triggering: summed over every past event i]
P[Productivity k of m = A exp alpha m - M0]
T[Temporal g of t = Omori–Utsu kernel]
S[Spatial f of r given m, zeta = D exp gamma m - M0]
P --> KER[k x g x f burst per parent]
T --> KER
S --> KER
end
KER --> SUM
SUM --> GATE{Stability gates}
GATE -- gate 1: alpha < beta --> OK1[finite branching]
GATE -- gate 2: n < 1 --> OK2[subcritical / stationary]
GATE -- n >= 1 or alpha >= beta --> REJ[reject fit: mis-fit]
OK1 --> SIM[Simulate >= 10,000 catalogs over horizon]
OK2 --> SIM
SIM --> N[Expected count x GR exceedance Phi of M*]
N --> PR[Probability P = 1 - exp - N<br/>+ over-dispersion-honest bounds]
PR --> CSEP[Scored prospectively in CSEP<br/>must beat smoothed-seismicity AND ETAS]
Each past event contributes a separable burst (productivity × temporal × spatial) on top of the background; the summed intensity, gated for stability and simulated forward, becomes the published probability and is scored in CSEP.
- Ogata, Y. (1988), Statistical models for earthquake occurrences and residual analysis for point processes, J. Am. Stat. Assoc. 83(401), 9–27, doi:10.1080/01621459.1988.10478560.
- Ogata, Y. (1998), Space–time point-process models for earthquake occurrences, Ann. Inst. Statist. Math. 50(2), 379–402, doi:10.1023/A:1003403601725.
- Zhuang, J., Ogata, Y. & Vere-Jones, D. (2002), Stochastic declustering of space–time earthquake occurrences, J. Am. Stat. Assoc. 97(458), 369–380, doi:10.1198/016214502760046925.
- Helmstetter, A., Kagan, Y. Y. & Jackson, D. D. (2007), High-resolution time-independent grid-based forecast for M ≥ 5 earthquakes in California, Seismol. Res. Lett. 78(1), 78–86, doi:10.1785/gssrl.78.1.78.
- Zaliapin, I. & Ben-Zion, Y. (2020), Earthquake declustering using the nearest-neighbor approach in space-time-magnitude domain, J. Geophys. Res. Solid Earth 125, e2018JB017120, doi:10.1029/2018JB017120.
- Reasenberg, P. A. & Jones, L. M. (1989), Earthquake hazard after a mainshock in California, Science 243(4895), 1173–1176, doi:10.1126/science.243.4895.1173.
- Jordan, T. H. et al. (2011), Operational earthquake forecasting: state of knowledge and guidelines for utilization (ICEF Report), Annals of Geophysics 54(4), 315–391, doi:10.4401/ag-5350.
- Mizrahi, L., Nandan, S. & Wiemer, S. (2024), Developing, testing, and communicating earthquake forecasts: current practices and future directions, Reviews of Geophysics 62, doi:10.1029/2023RG000823.
See also: Gutenberg-Richter-Law · Omori-Utsu-Law · Models-Classical · Models-ML · Models-Employed · Evaluation-and-Tests · Glossary
⚠️ Disclaimer (read this). CAOS_SEISMIC produces probabilistic forecasts, not predictions. It is an independent research and education tool. It is NOT an official earthquake early-warning or civil-protection system, it does NOT predict when, where, or how large an earthquake will be, and it must NOT be used for life-safety, emergency, or evacuation decisions. Every number it publishes is a bounded, calibrated probability conditioned on the present state of seismicity, never an alarm, a countdown, or a "safe" state. A single outcome neither confirms nor refutes a probabilistic forecast. It complements, and does not replace or speak for, official agencies; always follow your national seismological and civil-protection authorities (e.g. USGS, INGV, CSN in Chile with SENAPRED for civil protection, GeoNet, JMA). The software is provided "as is", without warranty of any kind (MIT License); the authors accept no liability for its use. Data are courtesy of their providers (USGS/ANSS, ISC/ISC-GEM, Global CMT, EMSC, CSN, and others) under their respective licenses and attribution terms. See Honest-Limits for the full epistemic context.
CAOS_SEISMIC · seismic.fasl-work.com · source · MIT
Conditional probabilistic seismic forecasting — forecasts, never predictions.