-
Notifications
You must be signed in to change notification settings - Fork 0
Models Employed
The model stack this product actually runs: a calibrated ETAS reference as the defensible core, a smoothed-seismicity Poisson null as the mandatory baseline, a Reasenberg–Jones transparent fallback, and a gated, context-conditioned neural temporal point process (with a CNN as the spatial-context encoder, never a standalone classifier) as a challenger that only ships if it beats ETAS in our own prospective CSEP harness. This page records the architecture, the regime/tile partitioning, the target definition, and how calibration and bounds are produced.
Framing (non-negotiable). This is a forecaster, never a predictor. It outputs bounded, calibrated, conditional probabilities scoped to region × magnitude band × horizon, always shown next to a long-term baseline, evaluated CSEP-style. No deterministic call, no alarm, no countdown, no "safe" state. The product creed, carried verbatim into the app:
Earthquakes cannot be predicted, but their probability can be forecast — reported honestly, with uncertainty, evaluated against reality, never as an alarm and never as a promise of safety.
The building-block models referenced below are defined in full on the Models — Classical and Models — Analytical / ML pages; this page is about how they are assembled.
- The model stack at a glance
- Layer 1 — ETAS reference (the v0 product and the floor)
- Layer 2 — Smoothed-seismicity Poisson null (the mandatory baseline)
- Layer 3 — Reasenberg–Jones transparent fallback
- Layer 4 — The gated, context-conditioned neural TPP
- Regime and tile partitioning
- Target definition
- Calibration and bounds
- The daily-inference flow
- Why this beats a simplistic baseline, by design
- References
The system is a layered conditional estimator. Every layer emits a conditional intensity
flowchart TD
subgraph Inputs["Inputs (snapshotted at issue time t)"]
CAT["Catalog spine<br/>ComCat + regional low-Mc network<br/>(full, un-declustered)"]
DCAT["Declustered catalog<br/>(Zaliapin–Ben-Zion / Gardner–Knopoff)"]
ENR["Static enrichers (gated)<br/>Slab2 · faults · GNSS strain · tidal ΔCFS"]
end
subgraph Hygiene["Hygiene pipeline"]
MC["Time-varying Mc(x,y,t)<br/>MAXC+GFT / EMR"]
MW["Mw homogenization<br/>(ISC-GEM / GCMT anchor)"]
BV["b-value (Aki–Utsu, binned)"]
end
subgraph Models["Model layers (same 0.1 grid)"]
NULL["L2 · Smoothed-seismicity<br/>Poisson NULL (mu(x,y))"]
ETAS["L1 · Space-time ETAS<br/>REFERENCE / v0 core"]
RJ["L3 · Reasenberg–Jones<br/>transparent FALLBACK"]
NPP["L4 · Gated neural TPP<br/>(Hawkes bias + CNN context encoder)<br/>FEATURE-FLAGGED"]
end
subgraph Gate["CSEP gate + calibration"]
CSEP["Prospective CSEP harness<br/>N/M/S/CL consistency + IGPE vs Poisson AND ETAS"]
CAL["Calibration (isotonic/Platt)<br/>reliability diagram — release blocker"]
end
OUT["Public artifact<br/>per-cell P(>=1 event >= M*) for 1d/2d/7d<br/>+ P10/median/P90 bounds + baseline"]
CAT --> MC --> MW --> BV
DCAT --> NULL
BV --> ETAS
BV --> RJ
NULL --> ETAS
ENR -.gated.-> NPP
BV --> NPP
ETAS --> CSEP
NULL --> CSEP
RJ --> CSEP
NPP -.must beat ETAS.-> CSEP
CSEP --> CAL --> OUT
| Layer | Model | Status | Role |
|---|---|---|---|
| L1 | Space–time ETAS (region-refit) | Ships in v0 | The calibrated, defensible reference and primary estimator. |
| L2 | Smoothed-seismicity Poisson | Always on | The time-independent spatial background |
| L3 | Reasenberg–Jones | Always on | Transparent operational fallback and sanity check (the shape USGS OAF runs). |
| L4 | Context-conditioned neural TPP | Gated / feature-flagged | A challenger; reaches the public map only if it beats ETAS in our prospective CSEP harness and is calibrated. |
v0 ships ETAS-class only. A region-refit space–time ETAS with the full hygiene pipeline that
simplistic baselines omit — per-region
The space–time conditional intensity (canonical Ogata-1998 separable kernels):
-
Utsu productivity:
$k(m) = K, e^{\alpha(m - M_0)}$ -
Omori–Utsu temporal decay:
$g(t) = \dfrac{p-1}{c}\left(1 + \dfrac{t}{c}\right)^{-p}$ -
Spatial kernel (Ogata 1998):
$f(r \mid m) = \dfrac{q-1}{\pi,\zeta(m)^2}\left(1 + \dfrac{r^2}{\zeta(m)^2}\right)^{-q}$ , with$\zeta(m) = D, e^{\gamma(m - M_0)}$ -
Magnitudes follow Gutenberg–Richter,
$f(m) = \beta e^{-\beta(m - M_c)}$ ,$\beta = b\ln 10$ .
Two stability gates, kept separate (rejecting a fit on either):
-
$\alpha < \beta$ is required for the productivity × magnitude integral to converge (finite branching ratio$n$ ). - Given that,
$n < 1$ is the subcritical / stationary condition. Reject any fit with$n \ge 1$ (supercritical = mis-fit).
The model is fit by maximizing the point-process log-likelihood
$$\ln L = \sum_i \ln \lambda(t_i, x_i, y_i \mid \mathcal{H}_{t_i})
- \int_0^T!!\int_A \lambda(t, x, y), dx, dy, dt$$
on the full, un-declustered catalog (triggering is the predictable signal), with the
background
(Full theory: Models — Classical §3. Ogata 1988, doi:10.1080/01621459.1988.10478560; Ogata 1998, doi:10.1023/A:1003403601725.)
A stationary, time-independent estimate of where earthquakes occur, obtained by smoothing a declustered catalog with an adaptive kernel (Helmstetter–Kagan–Jackson):
with bandwidth
This layer plays a dual role, which is why it is always on:
- It is the spatial background
$\mu(x,y)$ that feeds the ETAS reference. - It is the mandatory Poisson null — the time-independent reference that any time-dependent
model (ETAS or neural) must beat in a comparison test before it can claim skill. It is also the
principled cold-start floor: where recent seismicity is sparse or zero, the conditional rate
floors to
$\mu(x,y)$ rather than a hard-coded per-day constant. A blank cell never reads as "safe."
Implementation note (do not hard-code the exponent). The kernel exponent
$s$ and normalization$C(d)$ vary across the Helmstetter–Kagan–Jackson family ($s = 1$ and$s = 3/2$ both appear). Pin$s$ ,$C(d)$ , and the neighbour count to a specific reference implementation (HKJ 2007 / Werner 2011 or the pyCSEP/floatCSEP code) before coding; do not treat$s = 3/2$ as a verified universal constant.
(Full theory: Models — Classical §7. Helmstetter, Kagan & Jackson 2007, doi:10.1785/gssrl.78.1.78.)
The most transparent "tomorrow's earthquakes" model, kept as a sanity check and a transparent operational baseline (the shape USGS OAF runs):
Its purpose in the stack is interpretability and robustness: when a felt mainshock occurs, R-J gives a directly auditable "probability of a larger event in the next N days" that any seismologist can reproduce by hand, which cross-checks the ETAS reference. The STEP production system (Gerstenberger et al. 2005) — which wraps R-J + a background term into gridded short-interval shaking-probability maps — is the closest analogue to this product's daily gridded output shape.
(Full theory: Models — Classical §4–5. Reasenberg & Jones 1989, doi:10.1126/science.243.4895.1173; Page et al. 2016, doi:10.1785/0120160073.)
The "stronger model" is never the default. It is delivered as a feature-flagged challenger and reaches the public map only if it beats ETAS in our prospective CSEP harness (positive IGPE, T-test CI excluding zero) and is calibrated. Otherwise it stays behind the flag.
A conditional spatio-temporal Neural Point Process in the FERN spirit (Zlydenko et al. 2023): keep the additive background + summed-triggering skeleton of ETAS, and replace the fixed kernels with small learned components:
$$\lambda(x, y, t \mid \mathcal{H}_t) = \mu(x, y)
- \sum_{i:,t_i < t} T_\theta(t - t_i); S_\phi(x - x_i, y - y_i;, m_i,, \mathbf{c}_i),$$
where
This is the load-bearing design decision that avoids the DeVries trap (see Models — Analytical / ML §5). A CNN is used only to encode spatial context $\mathbf{c}i$ — gridded fields such as Slab2 subduction geometry, distance-to-fault, GNSS strain rate, and smoothed background density — into a feature vector that conditions the point-process intensity $S\phi$. It is never a standalone per-cell "will there be an aftershock here?" binary classifier scored by AUC. The survival-term / point-process likelihood is retained end-to-end, so the output remains a proper, calibratable probability — exactly what a per-cell CNN classifier throws away.
flowchart LR
GRID["Gridded spatial context<br/>Slab2 · faults · GNSS strain · mu(x,y)"] --> CNN["CNN spatial-context encoder"]
CNN --> CVEC["context vector c_i"]
HIST["Event history H_t<br/>(t_i, x_i, y_i, m_i)"] --> ENC["Triggering encoder<br/>T_theta, S_phi (Hawkes skeleton)"]
CVEC --> ENC
ENC --> LAM["Conditional intensity<br/>lambda(t,x,y | H_t)"]
LAM --> LL["Point-process log-likelihood<br/>(survival term retained)"]
LL --> PROB["Proper, calibratable probability"]
FERN's reported 4–12% IGPE gain came mostly from (a) ingesting sub-$M_c$ events and (b)
learned spatial anisotropy aligned with fault traces — the two real gaps in classical ETAS —
not from network depth. So neural R&D targets exactly those two proven levers; covariate
availability for the chosen region (not the architecture) is the binding resource. FERN's own
caveats are release-blockers for us: it was not CSEP-tested, gave no uncertainty
quantification, and its test period ended before Tohoku
flowchart TD
A["Neural challenger fit (temporal split)"] --> B{"Passes CSEP consistency?<br/>N / M / S / CL"}
B -- No --> X["Stays behind feature flag"]
B -- Yes --> C{"Beats ETAS on IGPE?<br/>paired T-test CI excludes 0"}
C -- No --> X
C -- Yes --> D{"Calibrated?<br/>reliability diagram / PIT"}
D -- No --> X
D -- Yes --> E["Eligible for the public map"]
Honest experimental record (full provenance: the repo experiment register E6, E10–E11). The challenger was trained on the real global catalog on a laptop RTX 4070.
First run — seismicity-only context. Gate ≈ 0 by construction: with no external covariates the neural is essentially an ETAS-equivalent, so it adds nothing. Honest, expected, and consistent with the field.
Geodetic context — and a calibration bug. We then wired the real GNSS strain rate covariate (NGL
MIDAS, 20,168 stations; strain high on active margins — Japan 95, California 154 nstrain/yr — and low in
stable interiors — Central US 3.6). The first gate was catastrophically negative because the neural
over-forecast the count ~490× (122,148 vs 248 observed): its absolute-rate calibration was broken (the
training compensator constrained the background only at event locations, leaving the forecast-grid integral
free to inflate). Diagnosed and fixed with a fit-time rate_cal scalar that anchors the integrated rate to
the training rate on the same grid the forecast uses.
Calibrated result — the context-neural beats ETAS on the in-loop gate.
| neural (calibrated) | ETAS | observed | |
|---|---|---|---|
| forecast count (30-day global) | 218.9 | 173.7 | 248 |
| N-test | PASS (q = 0.028) | — | — |
| IGPE vs ETAS | +0.075 nats/eq · gate passed | — | — |
So the GNSS-strain-conditioned neural, once calibrated, beats ETAS on a single-window leakage-free gate and passes the N-test — the spatial-placement (shape) gain was a robust +0.05–0.07 across runs. This is more than the published reference repos achieved.
The honesty that defines this product. That is a single-window in-loop gate, not the
pseudo-prospective back-analysis. Single-window / retrospective wins do not generalise — precisely why
no neural point process has beaten ETAS prospectively (EarthquakeNPP) and why the reference repos' wins
evaporate under rigorous testing. So we do not claim "we beat ETAS." We claim an encouraging,
calibrated signal that the geodetic context helps, now under pseudo-prospective validation (the neural
recondition path scores it across many weekly windows). It reaches the public map only if it survives that
test, the §5.4 gate, and calibration — never on a single window.
(Full theory: Models — Analytical / ML §4–5, §9. Zlydenko et al. 2023, doi:10.1038/s41598-023-38033-9; Dascher-Cousineau et al. 2023, doi:10.1029/2023GL103909; Stockman, Lawson & Werner 2026, arXiv:2410.08226.)
The map is fit fine, reported coarse, and parameters are regionalized rather than global.
- Fine fit / score grid: regular 0.1° × 0.1° spatial cells with 0.1-magnitude bins (matching the CSEP California convention), so the S-test and M-test can resolve where and what size. This is the resolution at which ETAS is fit and CSEP-tested.
- Coarse display: the UI aggregates the fine grid into H3 hexbins / tectonic polygons. Fit fine, report coarse.
-
Tectonic regionalization (the regime partition): parameters are not global. Subduction
zones violate the isotropic-kernel / point-source assumptions of generic ETAS for great
earthquakes, so a subduction build (e.g. Chile) needs Slab2 geometry, possibly anisotropic /
finite-fault triggering, and its own network's
$M_c$ — California generic parameters are never reused. Each tectonic regime gets its own fitted ETAS and its own region-appropriate$M^*$ /$M_{\max}$ . - Borrow strength where data are sparse: rather than shrinking cells until they are empty, sparse regions borrow strength via the smoothed-seismicity background (Layer 2) and hierarchical / empirical-Bayes pooling with regionalized priors, so cells with few events inherit a sensible prior from their tectonic neighbourhood.
The binding design decision. The model produces the full conditional magnitude distribution (via
the GR /
Spatial cell: 0.1° × 0.1° (fit/score), aggregated to region for display (§6).
Magnitude: forecast the distribution, threshold for display. The exceedance probability — the public number — is
- Forecasting the full distribution (not a single binary) keeps the GR information the model already computes, enables the M-test and CRPS, and lets the UI offer multiple thresholds.
-
$M_{\max}$ is specified per region (sourced from regional hazard models): it bounds the exceedance integral and sets the tail probability of the rare large events that dominate impact. It is an explicit, documented assumption with sensitivity reported. -
$b$ -value spatial/temporal variation propagates into the tail and is carried with uncertainty.
Horizon: three rolling horizons re-issued daily — 1 day, 2 days, 7 days — each at the
region's
The public formula
$P = 1 - e^{-N}$ never changes. Only the quality of$\lambda$ improves as the model improves. The number is always shown next to its long-term baseline, so a user reads "X% vs. Y% baseline," never an unanchored figure.
The public probability is recalibrated (isotonic / Platt) and validated with a reliability diagram per horizon ("when we said 5%, it happened ~5% of the time"). An uncalibrated probability does not ship. The number is always rendered next to the climatological / Poisson baseline. Because the reliability diagram is dominated by quiet cells, calibration is validated specifically in the cold-start regime, not only during active sequences. The public reliability diagram is the single most credibility-building artifact the product ships.
The UI ships an optimistic (P10) / expected / pessimistic (P90) triad — the empirically best uncertainty design (Schneider et al. 2022). Those bounds are a genuine epistemic + aleatory decomposition, sourced from:
- ETAS parameter uncertainty — MLE covariance / bootstrap, or a Bayesian posterior (INLAbru gives a fast posterior, not just speed).
-
$M_c$ and$b$ -value estimation uncertainty, propagated through the exceedance integral. - Structural / model-selection uncertainty (ETAS-variant choice).
- Over-dispersion. Regional seismicity is over-dispersed relative to Poisson (variance ≫ mean, from clustering), so the pessimistic bound must be wider than a naive Poisson quantile (negative-binomial behaviour; handled via the catalog-based number test in pyCSEP). A Poisson-only band systematically under-warns at the tail — and the Schneider result holds only if the bounds are real.
(Schneider et al. 2022, doi:10.5194/nhess-22-1499-2022; Kagan 2017, doi:10.1093/gji/ggx300. Scoring and over-dispersion handling: Evaluation.)
A strict forecast clock makes temporal leakage structurally impossible: at each daily issue
time
flowchart TD
S1["1 · Snapshot inputs at issue time t<br/>ComCat + regional network, slice (-inf, t)<br/>persist immutable, versioned artifact"]
S2["2 · Hygiene<br/>Mc(x,y,t) -> Mw homogenize -> dual-catalog decluster<br/>+ post-mainshock incompleteness correction"]
S3["3 · Fit/update region-refit ETAS<br/>0.1 grid; enforce alpha<beta; reject n>=1"]
S4["4 · Generate >=10,000 synthetic next-day catalogs<br/>-> 1d/2d/7d probabilities at M* + P10/median/P90"]
S5["5 · Calibrate (isotonic/Platt) + QA-gate"]
S6["6 · Write one compact artifact<br/>per-cell rates + baseline + bounds + coverage mask + versions"]
S7["7 · Log immutably (forecast + input snapshot)"]
S8["8 · Serve stateless/read-only via a thin static viewer"]
S1 --> S2 --> S3 --> S4 --> S5 --> S6 --> S7 --> S8
Two operational guards carried from the design:
-
Post-mainshock incompleteness. Right after a large mainshock — exactly when the forecast
matters most — the catalog is grossly incomplete and naive ETAS underestimates productivity. The
daily job uses an incompleteness-aware likelihood with a time-dependent
$M_c(t)$ , not a flat threshold. Under-forecasting here is both a credibility and an (indirect) safety-communication failure. -
QA gating. A single bad / duplicated / retracted event near
$M^*$ can swing a public probability. Auto-publish only if input-catalog sanity, artifact integrity, and a rolling-window N-test drift monitor on the forecaster itself all pass. On a data outage or a failed run, serve "unavailable" with a staleness banner — never silently serve a stale or corrupted artifact.
A coarse-rectangle, hand-binned-Hawkes, AUC-scored classifier fails on the exact axes this design fixes:
| Failure mode of a simplistic baseline | This design |
|---|---|
| No |
Rolling per-region |
| No declustering decision | Explicit dual-catalog rule (Zaliapin–Ben-Zion background / full catalog for triggering) |
| Mixed |
|
| Binary ROC-AUC, no null, calibration-blind | CSEP N/M/S/L/CL consistency + IGPE T/W comparison vs. Poisson and ETAS; reliability diagram; AUC banned as a primary metric |
| Coarse rectangles (un-S-testable) | Fine 0.1° grid (S-testable); region-level only for display |
| Hand-set constants, no CIs | MLE / Bayesian fit, propagated parameter + structural + over-dispersion uncertainty |
| Arbitrary per-day floor | Principled smoothed-seismicity background + hierarchical pooling for cold-start |
| Soft feature/label leak | Forecast-clock causal cutoff (features |
Information gain over a Poisson reference is state-dependent (reported in nats, the CSEP
unit, not bits): positive and large during active aftershock sequences (probability gains up to
orders of magnitude on peak days), near zero in quiet periods, with a modest all-period average.
For context, time-independent model-vs-smoothed-seismicity IGPE in prospective California CSEP is
only about −0.7 to +0.5 nats. We report the gain honestly as state-dependent — never as a
fabricated round figure — and always in nats. The public exceedance formula
- Ogata, Y. (1988). Statistical models for earthquake occurrences and residual analysis for point processes. JASA 83(401), 9–27. doi:10.1080/01621459.1988.10478560
- Ogata, Y. (1998). Space–time point-process models for earthquake occurrences. Ann. Inst. Statist. Math. 50(2), 379–402. doi:10.1023/A:1003403601725
- Reasenberg, P.A. & Jones, L.M. (1989). Earthquake hazard after a mainshock in California. Science 243(4895), 1173–1176. doi:10.1126/science.243.4895.1173
- Page, M.T. et al. (2016). Three ingredients for improved global aftershock forecasts. BSSA 106(5), 2290–2301. doi:10.1785/0120160073
- Gerstenberger, M.C., Wiemer, S., Jones, L.M. & Reasenberg, P.A. (2005). Real-time forecasts of tomorrow's earthquakes in California. Nature 435, 328–331. doi:10.1038/nature03622
- Helmstetter, A., Kagan, Y.Y. & Jackson, D.D. (2007). High-resolution time-independent grid-based forecast for M ≥ 5 earthquakes in California. SRL 78(1), 78–86. doi:10.1785/gssrl.78.1.78
- Werner, M.J., Helmstetter, A., Jackson, D.D. & Kagan, Y.Y. (2011). High-resolution long-term and short-term earthquake forecasts for California. BSSA 101, 1630–1648. doi:10.1785/0120090340
- Zhuang, J., Ogata, Y. & Vere-Jones, D. (2002). Stochastic declustering of space–time earthquake occurrences. JASA 97(458), 369–380. doi:10.1198/016214502760046925
- Zaliapin, I. & Ben-Zion, Y. (2020). Earthquake declustering using the nearest-neighbor approach in space-time-magnitude domain. JGR Solid Earth 125, e2018JB017120. doi:10.1029/2018JB017120
- Zlydenko, O. et al. (2023). A neural encoder for earthquake rate forecasting. Scientific Reports 13. doi:10.1038/s41598-023-38033-9
- Dascher-Cousineau, K., Shchur, O., Brodsky, E.E. & Günnemann, S. (2023). Using deep learning for flexible and scalable earthquake forecasting (RECAST). GRL 50, e2023GL103909. doi:10.1029/2023GL103909
- Stockman, S., Lawson, D. & Werner, M.J. (2026, accepted). EarthquakeNPP: A Benchmark for Earthquake Forecasting with Neural Point Processes. TMLR. arXiv:2410.08226
- Hayes, G.P. et al. (2018). Slab2, a comprehensive subduction zone geometry model. Science 362:58–61. doi:10.1126/science.aat4723
- Schneider, M. et al. (2022). Bridging the gap between earthquake forecasts and uncertainty communication. NHESS 22(4), 1499–1518. doi:10.5194/nhess-22-1499-2022
- Kagan, Y.Y. (2017). Worldwide earthquake forecasts. GJI 211(1), 335–345. doi:10.1093/gji/ggx300
- Wiemer, S. & Wyss, M. (2000). Minimum magnitude of completeness in earthquake catalogs. BSSA 90(4), 859–869. doi:10.1785/0119990114
- Jordan, T.H. et al. (2011). Operational Earthquake Forecasting (ICEF Report). Annals of Geophysics 54(4), 315–391. doi:10.4401/ag-5350
See also: Models — Classical · Models — Analytical / ML · Evaluation · Methodology · Data & Pipelines.
⚠️ Disclaimer — read this. CAOS_SEISMIC produces probabilistic forecasts, not predictions. It is an independent research and education tool. It is NOT an official earthquake early-warning or civil-protection system, it does NOT predict when, where, or how large an earthquake will be, and it must NOT be used for life-safety, emergency, or evacuation decisions. Every number it publishes is a bounded, calibrated probability conditioned on the present state of seismicity — never an alarm, a countdown, or a "safe" state. A single outcome neither confirms nor refutes a probabilistic forecast.It complements, and does not replace or speak for, official agencies — always follow your national seismological and civil-protection authorities (e.g. USGS, INGV, CSN (Chile, SENAPRED for civil protection), GeoNet, JMA). The software is provided "as is", without warranty of any kind (MIT License); the authors accept no liability for its use. Data are courtesy of their providers (USGS/ANSS, ISC/ISC-GEM, Global CMT, EMSC, CSN, and others) under their respective licenses and attribution terms. See Honest-Limits for the full epistemic context.
CAOS_SEISMIC · seismic.fasl-work.com · source · MIT
Conditional probabilistic seismic forecasting — forecasts, never predictions.
Overview
Methodology & History
Classical models
- Models-Classical · index
- Gutenberg-Richter-Law
- Omori-Utsu-Law
- ETAS-Model
- Reasenberg-Jones-Model
- STEP-Model
- EEPAS-Model
- Smoothed-Seismicity
- Brownian-Passage-Time
- Rate-and-State-and-Coulomb
ML & analytical methods
- Models-ML · index
- Temporal-Point-Processes
- RMTPP
- Neural-Hawkes-Process
- Transformer-Hawkes-Process
- RECAST-and-FERN
- CNN-Spatial-Models
- Graph-and-Recurrent-Networks
- Detection-vs-Forecasting
Models employed
Data
Architecture
Evaluation
Progress
Reference