Methodology History
Earthquakes cannot be predicted, but their probability can be forecast — reported honestly, with uncertainty, evaluated against reality, never as an alarm and never as a promise of safety.
This page is the authoritative methodological record of CAOS_SEISMIC. It does two things:
- It traces how the methodology of short-term earthquake forecasting actually evolved — the century-long arc from the first empirical scaling laws to self-exciting point processes, the CSEP testing revolution, and the neural / machine-learning era with its honest verdict — naming the key papers, people, and why each step mattered.
- It documents how this product's own methodology was reasoned out on top of that history: why ETAS is the calibrated reference, why a stationary Poisson model is the mandatory null, why the "stronger model" is a gated, context-conditioned neural temporal point process, why global-context-conditioning is the central thesis, why every number is scored under CSEP, and why the framing is relentlessly honest.
Every equation is real and every non-trivial claim is anchored to peer-reviewed literature, with a
DOI where one is registered. A magnitude convention note: magnitudes are homogenized to moment magnitude ($M_w$).
- Part I: The historical arc of the field
- 1. The empirical scaling laws (1894–1965): statistics, not timing
- 2. Operational aftershock forecasting (1989–2016): from law to service
- 3. ETAS (1988–1998): the self-exciting point process
- 4. Smoothed seismicity and the spatial background
- 5. CSEP (2006–): the testing revolution that made forecasts honest
- 6. Operational Earthquake Forecasting and the ICEF doctrine (2009–2011)
- 7. The neural / machine-learning era (2016–2026) and its honest verdict
- Part II: How this product's methodology was reasoned out
- 8. The forecasting target: a conditional intensity
- 9. Why ETAS is the calibrated reference
- 10. Why a stationary Poisson model is the mandatory null
- 11. Why the stronger model is a gated, context-conditioned neural TPP
- 12. Why global-context-conditioning is the thesis
- 13. Why CSEP is the only acceptable scoring framework
- 14. Why the framing stays honest
- References
The methodology of earthquake forecasting did not arrive as a single theory. It accreted, over more than a century, out of three robust empirical regularities, a point-process formalism that fused them, a testing culture that made claims falsifiable, and — most recently — a machine-learning wave whose honest contribution turned out to be narrower than its hype. The sequence below is roughly chronological, but the deeper logic is epistemic: each step is an answer to "how do we make an honest, testable statement about the future of a system we cannot deterministically predict?"
flowchart TD
GR["Gutenberg–Richter 1944<br/>magnitude–frequency law"] --> ETAS
OM["Omori 1894 / Utsu 1961<br/>aftershock decay law"] --> ETAS
PROD["Utsu productivity<br/>aftershock abundance ∝ magnitude"] --> ETAS
RJ["Reasenberg–Jones 1989<br/>operational aftershock probabilities"] --> STEP["STEP 2005<br/>gridded short-term maps"]
RJ --> OAF["USGS OAF<br/>operational service"]
ETAS["ETAS<br/>Ogata 1988 / 1998<br/>self-exciting point process"] --> SMOOTH["Smoothed seismicity<br/>Helmstetter et al. 2007<br/>spatial background μ(x,y)"]
ETAS --> CSEP["CSEP / RELM 2006–<br/>prospective testing"]
SMOOTH --> CSEP
STEP --> CSEP
CSEP --> ICEF["ICEF / OEF 2011<br/>operational doctrine"]
ICEF --> NEURAL["Neural TPP era<br/>2016–2026<br/>honest verdict: gated challenger"]
CSEP --> NEURAL
The foundation of every honest forecast is a small set of empirical scaling laws that describe the
statistics of populations of earthquakes — not the timing of any single event. This distinction
is the whole game: the laws let us say "the rate of
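The number of earthquakes falls off exponentially with magnitude, which is, to first order, a power law in seismic moment:

$$\log_{10} N(\ge M) = a - bM.$$

Here $N(\ge M)$ is the number of events with magnitude at least $M$, $a$ sets the overall activity level, and $b$ (typically near 1) sets the ratio of small to large events.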
The law supplies the magnitude term of every forecast: given a rate of small events, it sets the
expected rate of large ones. Crucially, because the distribution is self-similar (no characteristic
scale), there is no "gap" that signals a specific large event is "due." Gutenberg & Richter (1944)
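established the relation for California; Aki (1965) gave the maximum-likelihood estimator for $b$,
later refined for finite magnitude bins of width $\Delta M$ (the Aki–Utsu / Tinti–Mulargia binning correction, Tinti & Mulargia 1987):

$$\hat b = \frac{\log_{10} e}{\bar M - \left(M_c - \Delta M / 2\right)},$$

where $\bar M$ is the mean magnitude of the events at or above the completeness magnitude $M_c$.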
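Why it mattered: it turned the raw fact "small quakes are common, big ones rare" into a
quantitative, estimable distribution, the first ingredient of a forecast. Its central weakness, the
strong bias of $\hat b$ when the completeness magnitude $M_c$ is misestimated (Wiemer & Wyss 2000; Woessner & Wiemer 2005), is a catalog-quality problem that has to be handled before the law is applied.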
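A minimal sketch of the binned maximum-likelihood $b$-value estimate, assuming the common Aki–Utsu form $\hat b = \log_{10} e \,/\, (\bar M - (M_c - \Delta M/2))$ and Aki's first-order standard error $\hat b/\sqrt{n}$. The function names and the synthetic-catalog helper are illustrative, not part of the product's code.

```python
import math
import random


def b_value_aki_utsu(mags, m_c, dm=0.1):
    """Aki (1965) maximum-likelihood b-value with the half-bin (Utsu) correction.

    mags: magnitudes binned to width dm; m_c: completeness magnitude.
    Returns (b_hat, sigma_b, n_used), where sigma_b = b_hat / sqrt(n) is
    Aki's first-order standard error.
    """
    used = [m for m in mags if m >= m_c - 1e-9]
    n = len(used)
    if n < 2:
        raise ValueError("need at least two events at or above m_c")
    mean_m = sum(used) / n
    b_hat = math.log10(math.e) / (mean_m - (m_c - dm / 2.0))
    return b_hat, b_hat / math.sqrt(n), n


def synthetic_gr_catalog(n, b, m_c, dm=0.1, seed=0):
    """Hypothetical test helper: draw n Gutenberg-Richter magnitudes, binned to dm."""
    rng = random.Random(seed)
    beta = b * math.log(10.0)
    mags = []
    for _ in range(n):
        m = (m_c - dm / 2.0) + rng.expovariate(beta)  # continuous magnitude
        mags.append(round(m / dm) * dm)               # report to the nearest bin
    return mags


if __name__ == "__main__":
    catalog = synthetic_gr_catalog(20000, b=1.0, m_c=2.0)
    b_hat, sigma_b, n_used = b_value_aki_utsu(catalog, m_c=2.0)
    print(f"b = {b_hat:.3f} +/- {sigma_b:.3f} from {n_used} events")
```

Passing a cutoff below the true completeness level lets under-recorded small events into the mean and pulls $\hat b$ down, which is the bias the paragraph above warns about.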
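Fusakichi Omori observed in 1894 that aftershock rate decays as roughly $1/t$. Utsu (1961) generalized this to the modified Omori (Omori–Utsu) law

$$n(t) = \frac{K}{(t + c)^{p}},$$

with cumulative form (for $p \neq 1$)

$$N(t) = \int_0^t n(s)\, ds = \frac{K}{1 - p}\Big[(t + c)^{1-p} - c^{1-p}\Big].$$

Here $t$ is the time since the mainshock, $K$ is the productivity, $c$ is a small time offset that keeps the rate finite at $t = 0$, and $p$ is the decay exponent (typically close to 1).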
Why it mattered: it is the temporal kernel that every modern short-term model (Reasenberg–Jones,
STEP, ETAS) embeds. The honesty caveat it carries forward: right after a large mainshock, exactly
the highest-stakes moment, the catalog is at its most incomplete, so early aftershock rates are systematically undercounted.
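The third regularity completes the trio: the number of aftershocks a mainshock produces grows
exponentially with its magnitude,

$$\kappa(M) = A\, e^{\alpha (M - M_c)},$$

where $A$ is the productivity at the completeness magnitude $M_c$ and $\alpha$ controls how steeply aftershock abundance grows with magnitude.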
The honest takeaway from §1. These laws are statistical and conditional. They are the reason short-term forecasting has real skill, and the reason that skill is always a statement about probability over a population, never about the timing of one rupture.
The first step from "law" to "operational forecast" was made by Reasenberg & Jones (1989) in
Science. They multiplied the Gutenberg–Richter magnitude term by the modified-Omori time decay to
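get the rate of aftershocks of magnitude $\ge M$ at time $t$ after a mainshock of magnitude $M_m$:

$$\lambda(t, M) = 10^{\,a + b\,(M_m - M)}\,(t + c)^{-p}.$$

Treating events as a non-homogeneous Poisson process, the probability of one or more such
aftershocks in a window $[S, T]$ is

$$P = 1 - \exp\!\left[-\int_S^T \lambda(t, M)\, dt\right].$$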
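A minimal sketch of the Reasenberg–Jones calculation, assuming the rate $10^{a + b(M_m - M)}(t + c)^{-p}$ integrated in closed form over the window. The default parameters are the generic California values commonly quoted from Reasenberg & Jones (1989), with $t$ and $c$ in days; the function names are illustrative.

```python
import math


def rj_expected_count(m_main, m_min, t_start, t_end,
                      a=-1.67, b=0.91, c=0.05, p=1.08):
    """Expected number of aftershocks with magnitude >= m_min in [t_start, t_end].

    Times are in days after the mainshock. Defaults are assumed to be the
    generic California values from Reasenberg & Jones (1989).
    """
    if t_end < t_start or t_start < 0:
        raise ValueError("need 0 <= t_start <= t_end")
    k = 10.0 ** (a + b * (m_main - m_min))
    if abs(p - 1.0) < 1e-12:
        integral = math.log((t_end + c) / (t_start + c))
    else:
        integral = ((t_end + c) ** (1.0 - p) - (t_start + c) ** (1.0 - p)) / (1.0 - p)
    return k * integral


def rj_probability(m_main, m_min, t_start, t_end, **params):
    """Probability of one or more such aftershocks (non-homogeneous Poisson)."""
    n = rj_expected_count(m_main, m_min, t_start, t_end, **params)
    return -math.expm1(-n)  # 1 - exp(-n), accurate for small n


if __name__ == "__main__":
    for m_main in (5.5, 6.5, 7.5):
        prob = rj_probability(m_main, m_min=5.0, t_start=0.0, t_end=7.0)
        print(f"M{m_main} mainshock: P(M>=5 within 7 days) = {prob:.3f}")
```

The output is a probability over a window, never a time or place for a specific event, which is the shape every later model in this history keeps.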
Why it mattered: this is the first model that produced an actual public probability — "the chance of a damaging aftershock in the next week" — rather than a description of past statistics. It is the direct ancestor of the USGS Operational Aftershock Forecast (OAF) service, whose lineage runs Reasenberg & Jones (1989) → Page et al. (2016), who introduced global generic parameters by tectonic regime and, critically, modeled intersequence variability so that forecasts come with uncertainty bounds, not point estimates.
STEP (Short-Term Earthquake Probability; Gerstenberger, Wiemer, Jones & Reasenberg 2005, Nature) wrapped Reasenberg–Jones clustering plus a background term into hourly, gridded shaking-probability maps for California. STEP is the methodological template for the output shape of a daily gridded probabilistic product — "tomorrow's earthquakes" as a map — exactly the form this product takes (at daily cadence).
The decisive theoretical unification was Yosihiko Ogata's Epidemic-Type Aftershock Sequence
(ETAS) model — temporal in Ogata (1988, JASA), spatio-temporal in Ogata (1998, AISM). ETAS is a
self-exciting (Hawkes) marked point process: a stationary background rate plus a sum over all
past events, each of which triggers its own offspring with Omori-time decay, Gutenberg–Richter
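magnitudes, and a spatial kernel. The conditional intensity, the instantaneous expected rate at $(t, x, y)$ given the history $\mathcal H_t$ of all earlier events, is, in one common parametrization,

$$\lambda(t, x, y \mid \mathcal H_t) = \mu(x, y) + \sum_{i:\, t_i < t} \kappa(M_i)\, g(t - t_i)\, f(x - x_i,\, y - y_i \mid M_i),$$

$$\kappa(M) = A\, e^{\alpha (M - M_c)}, \qquad g(t) = \frac{p - 1}{c}\Big(1 + \frac{t}{c}\Big)^{-p},$$

with the Ogata (1998) inverse-power spatial kernel

$$f(x, y \mid M) = \frac{q - 1}{\pi D\, e^{\gamma (M - M_c)}}\left(1 + \frac{x^2 + y^2}{D\, e^{\gamma (M - M_c)}}\right)^{-q}.$$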
So ETAS is literally (background) + (Utsu productivity) × (Omori–Utsu decay) × (spatial spread), with Gutenberg–Richter magnitudes, stitched into one equation. It is the canonical example of turning the three scaling laws of §1 into a forecast. The model is fit by maximizing the point-process log-likelihood
$$\ln L = \sum_i \ln \lambda(t_i, x_i, y_i \mid \mathcal H_{t_i}) - \int_0^T\!\!\int_A \lambda(t, x, y \mid \mathcal H_t)\, dx\, dy\, dt,$$
and the background field
A subtlety with real operational consequences is stability, which requires two logically separate conditions:
- Finite branching: the productivity × magnitude integral converges only if $\alpha < \beta$ (with $\beta = b \ln 10$). If $\alpha \ge \beta$ the largest events dominate and the model is improper.
- Subcriticality: given $\alpha < \beta$, the branching ratio $n$ (expected direct offspring per event) must satisfy $n < 1$. A fit with $n \ge 1$ is supercritical (explosive) and is rejected as a mis-fit.
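The two conditions can be checked mechanically after a fit. The sketch below assumes one common parametrization: productivity $A e^{\alpha (M - M_c)}$, time and space kernels normalized to integrate to one, and unbounded Gutenberg–Richter magnitudes with density $\beta e^{-\beta (M - M_c)}$. Under those assumptions the branching ratio has the closed form $n = A\beta / (\beta - \alpha)$. The function names are illustrative.

```python
import math


def branching_ratio(A, alpha, b):
    """Expected direct offspring per event, averaged over GR magnitudes.

    Assumes productivity A*exp(alpha*(M - Mc)), normalized kernels, and an
    untruncated Gutenberg-Richter law. Returns math.inf when alpha >= beta,
    because the productivity-times-magnitude integral then diverges.
    """
    beta = b * math.log(10.0)
    if alpha >= beta:
        return math.inf
    return A * beta / (beta - alpha)


def check_stability(A, alpha, b):
    """Classify a fitted parameter set as improper, supercritical, or subcritical."""
    n = branching_ratio(A, alpha, b)
    if math.isinf(n):
        return "improper"
    return "subcritical" if n < 1.0 else "supercritical"


if __name__ == "__main__":
    for A, alpha in [(0.2, 1.5), (0.5, 2.0), (0.1, 2.5)]:
        n = branching_ratio(A, alpha, b=1.0)
        print(f"A={A}, alpha={alpha}: n={n:.3f} -> {check_stability(A, alpha, 1.0)}")
```

With a truncated magnitude distribution the integral is always finite, so the first condition would instead show up as a branching ratio that depends strongly on the chosen maximum magnitude.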
Why it mattered: ETAS is fully generative (you can simulate synthetic next-day catalogs and compute any test on them), has only ~5–8 interpretable parameters (little overfitting headroom), and has decades of prospective track record. It is the de-facto operational baseline for short-term forecasting — the model any challenger must beat. This is exactly the role it plays in this product (Part II, §9).
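To make the conditional intensity and its log-likelihood concrete, here is a sketch of the purely temporal case (Ogata 1988), assuming productivity $A e^{\alpha (M - M_c)}$ and a normalized Omori–Utsu kernel $(p - 1)\, c^{\,p-1} (t + c)^{-p}$ with $p > 1$. The spatial kernel is omitted for brevity, and all names are illustrative rather than the product's API.

```python
import math


def _kappa(m, A, alpha, m_c):
    """Utsu productivity: expected direct offspring of a magnitude-m event."""
    return A * math.exp(alpha * (m - m_c))


def etas_intensity(t, times, mags, mu, A, alpha, c, p, m_c):
    """Temporal ETAS rate at time t, using only events strictly before t."""
    rate = mu
    for t_i, m_i in zip(times, mags):
        if t_i >= t:
            break  # times are assumed sorted in ascending order
        omori = (p - 1.0) * c ** (p - 1.0) * (t - t_i + c) ** (-p)
        rate += _kappa(m_i, A, alpha, m_c) * omori
    return rate


def etas_compensator(T, times, mags, mu, A, alpha, c, p, m_c):
    """Integral of the intensity over [0, T], in closed form (requires p > 1)."""
    total = mu * T
    for t_i, m_i in zip(times, mags):
        if t_i >= T:
            break
        total += _kappa(m_i, A, alpha, m_c) * (1.0 - (c / (T - t_i + c)) ** (p - 1.0))
    return total


def etas_log_likelihood(times, mags, T, mu, A, alpha, c, p, m_c):
    """Point-process log-likelihood: sum of log-rates minus the compensator."""
    log_sum = sum(
        math.log(etas_intensity(t_i, times, mags, mu, A, alpha, c, p, m_c))
        for t_i in times
    )
    return log_sum - etas_compensator(T, times, mags, mu, A, alpha, c, p, m_c)


if __name__ == "__main__":
    times, mags = [1.0, 1.2, 4.0], [5.5, 3.1, 3.4]
    params = dict(mu=0.2, A=0.3, alpha=1.5, c=0.01, p=1.2, m_c=3.0)
    print("rate just after the M5.5:", etas_intensity(1.01, times, mags, **params))
    print("log-likelihood:", etas_log_likelihood(times, mags, T=10.0, **params))
```

Setting `A = 0` reduces the model to a stationary Poisson process, which gives a quick sanity check on any implementation.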
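ETAS needs a stationary background field $\mu(x, y)$. Following Helmstetter, Kagan & Jackson (2007), it is estimated by smoothing past epicenters with an adaptive kernel whose bandwidth $d_i$ is the distance from event $i$ to its $k$-th nearest neighbor:

$$\mu(x, y) \propto \sum_i K_{d_i}(r_i), \qquad K_d(r) = \frac{C(d)}{(r^2 + d^2)^{s}},$$

where $r_i$ is the distance from $(x, y)$ to epicenter $i$.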
The catalog is declustered first so the map reflects the long-term average, not transient bursts. This was the best-performing time-independent spatial forecast in the RELM/CSEP experiments.
Implementation honesty. The kernel exponent $s$ and normalization $C(d)$ vary across the Helmstetter–Kagan–Jackson family of papers (forms with $s = 1$ and $s = 3/2$ both appear), and the neighbor count is a region-tuned hyperparameter. These are pinned to a reference implementation, not hard-coded as universal constants.
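A minimal sketch of adaptive-bandwidth smoothing, under stated assumptions: planar coordinates in km, a power-law kernel $C(d) / (r^2 + d^2)^{s}$ with $s > 1$ so that it can be normalized on the plane ($C(d) = (s - 1)\, d^{\,2s-2} / \pi$), and a bandwidth equal to the distance to the $k$-th nearest neighbor with a floor. The names, the floor value, and the brute-force neighbor search are illustrative choices, not the reference implementation.

```python
import math


def power_law_kernel(r, d, s=1.5):
    """Planar power-law kernel that integrates to 1 over the plane (needs s > 1)."""
    c_d = (s - 1.0) * d ** (2.0 * s - 2.0) / math.pi
    return c_d / (r * r + d * d) ** s


def adaptive_bandwidths(points, k=2, d_min=0.5):
    """Bandwidth per event: distance to its k-th nearest neighbor, floored at d_min."""
    bandwidths = []
    for i, (xi, yi) in enumerate(points):
        dists = sorted(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        )
        d = dists[min(k, len(dists)) - 1] if dists else d_min
        bandwidths.append(max(d, d_min))
    return bandwidths


def smoothed_density(x, y, points, bandwidths, s=1.5):
    """Normalized spatial density at (x, y): mean of the per-event kernels."""
    total = 0.0
    for (xi, yi), d in zip(points, bandwidths):
        total += power_law_kernel(math.hypot(x - xi, y - yi), d, s)
    return total / len(points)


if __name__ == "__main__":
    epicenters = [(0, 0), (1, 0), (0, 1), (1, 1), (50, 50)]  # declustered, in km
    bw = adaptive_bandwidths(epicenters, k=2)
    print("bandwidths:", [round(d, 2) for d in bw])
    print("inside cluster:", smoothed_density(0.5, 0.5, epicenters, bw))
    print("between clusters:", smoothed_density(25.0, 25.0, epicenters, bw))
```

The adaptive bandwidth is what keeps the map sharp where events are dense and broad where they are sparse; multiplying the density by a long-term event rate turns it into a background rate.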
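Why it mattered for us: smoothed seismicity plays two roles in this product. It is the spatial
background $\mu(x, y)$ of ETAS (§9), and it is the stationary Poisson null that every time-dependent forecast must beat (§10).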
Forecasting only became a science — rather than a sequence of post-hoc success stories — when it became prospectively falsifiable. The Regional Earthquake Likelihood Models (RELM) experiment and its successor, the Collaboratory for the Study of Earthquake Predictability (CSEP), established the culture: a testing center controls the input data and runs the models (not the modelers), evaluating each forecast against future seismicity it never saw. This removes hindsight tuning and makes results reproducible.
CSEP scores a gridded Poisson forecast by the joint log-likelihood over space–magnitude bins,
$$L(\Omega \mid \Lambda) = \sum_{\text{bins } i}\Big[-\lambda_i + \omega_i \ln \lambda_i - \ln(\omega_i!)\Big],$$
and applies a battery of consistency tests (Schorlemmer et al. 2007; Zechar, Gerstenberger & Rhoades 2010):
| Test | Question |
|---|---|
| N-test | Is the total number of forecast events consistent with what occurred? |
| M-test | Is the magnitude distribution consistent? |
| S-test | Is the spatial distribution consistent? |
| L-test / CL-test | Is the joint (pseudo-)likelihood consistent? (CL conditions on the observed number of events.) |
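As an illustration of the gridded score and the simplest consistency test, the sketch below computes the Poisson joint log-likelihood and the two N-test quantiles in the form given by Zechar et al. (2010): $\delta_1 = P(N \ge N_{\text{obs}})$ and $\delta_2 = P(N \le N_{\text{obs}})$ under a Poisson law with the forecast total. It is a hand-rolled stdlib version for exposition; the function names are illustrative and are not pyCSEP's API.

```python
import math


def poisson_joint_log_likelihood(forecast, observed):
    """Sum over bins of -lambda_i + omega_i*ln(lambda_i) - ln(omega_i!)."""
    ll = 0.0
    for lam, omega in zip(forecast, observed):
        if lam <= 0.0:
            if omega > 0:
                return -math.inf  # an event occurred where the forecast said "impossible"
            continue
        ll += -lam + omega * math.log(lam) - math.lgamma(omega + 1)
    return ll


def poisson_cdf(k, lam):
    """P(N <= k) for N ~ Poisson(lam)."""
    if k < 0:
        return 0.0
    if lam <= 0.0:
        return 1.0
    terms = (math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1)) for i in range(k + 1))
    return min(1.0, sum(terms))


def n_test(forecast, observed):
    """Return (delta1, delta2); a very small value of either flags inconsistency."""
    n_fore, n_obs = sum(forecast), sum(observed)
    delta1 = 1.0 - poisson_cdf(n_obs - 1, n_fore)  # chance of at least n_obs events
    delta2 = poisson_cdf(n_obs, n_fore)            # chance of at most n_obs events
    return delta1, delta2


if __name__ == "__main__":
    rates = [0.5, 1.0, 0.2]   # expected events per space-magnitude bin
    counts = [0, 2, 0]        # observed events per bin
    print("joint log-likelihood:", poisson_joint_log_likelihood(rates, counts))
    print("N-test quantiles:", n_test(rates, counts))
```

A small $\delta_1$ means the forecast under-predicted the count and a small $\delta_2$ means it over-predicted; neither says anything about skill relative to another model.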
Passing consistency tests is necessary but not sufficient. Skill is established only by winning a comparison test against a real baseline, via the information gain per earthquake (IGPE, in nats):
$$I_N(A, B) = \frac{1}{N}\sum_{i=1}^{N}\Big(\ln \lambda_{A}(k_i) - \ln \lambda_{B}(k_i)\Big) - \frac{\hat N_A - \hat N_B}{N},$$
tested with a paired T-test ($T = I_N/(s/\sqrt N)$) and the non-parametric W-test. All of this is implemented in the community toolkit pyCSEP (Savran et al. 2022).
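A sketch of the information gain per earthquake and its paired T statistic, assuming each model supplies the forecast rate in the bin that contains each observed event, plus its total expected count. The function name and inputs are illustrative; the significance decision (comparing $T$ with a Student-$t$ critical value on $N - 1$ degrees of freedom) is left to a statistics library.

```python
import math


def information_gain(rates_a, rates_b, n_hat_a, n_hat_b):
    """Information gain per earthquake of forecast A over B, in nats.

    rates_a[i], rates_b[i]: each model's rate in the bin holding observed event i.
    n_hat_a, n_hat_b: each model's total expected number of events.
    Returns (igpe, t_statistic).
    """
    n = len(rates_a)
    if n < 2 or n != len(rates_b):
        raise ValueError("need matching rate lists for at least two events")
    diffs = [math.log(a) - math.log(b) for a, b in zip(rates_a, rates_b)]
    mean_diff = sum(diffs) / n
    igpe = mean_diff - (n_hat_a - n_hat_b) / n
    # Sample standard deviation of the per-event log-rate differences.
    s = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    if s == 0.0:
        t_stat = 0.0 if igpe == 0.0 else math.copysign(math.inf, igpe)
    else:
        t_stat = igpe / (s / math.sqrt(n))
    return igpe, t_stat


if __name__ == "__main__":
    model = [0.2, 0.4, 0.1, 0.3]      # challenger rates at the observed events
    baseline = [0.1, 0.2, 0.1, 0.1]   # reference rates at the same events
    igpe, t_stat = information_gain(model, baseline, n_hat_a=5.0, n_hat_b=5.0)
    print(f"IGPE = {igpe:.3f} nats, T = {t_stat:.2f}")
```

The second term matters: a model cannot buy information gain simply by inflating its rates everywhere, because the excess expected count is subtracted back out.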
Why it mattered: CSEP is the difference between "we forecast the earthquake" (a cherry-picked anecdote) and "across many days, our stated probabilities matched observed frequencies" (a calibrated, auditable claim). It is the testing backbone of this product (§13), and the reason a public reliability diagram is the single most credibility-building artifact the product can ship.
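The reliability diagram mentioned above is a simple table before it is a plot. A sketch, assuming one issued probability and one binary outcome per forecast, with bin edges chosen to resolve the low probabilities typical of this problem; the edges, function names, and the synthetic record are illustrative assumptions.

```python
import random


def reliability_table(probs, outcomes, edges=(0.0, 0.01, 0.03, 0.1, 0.3, 1.0)):
    """Group forecasts by stated probability and compare with observed frequency.

    Returns rows of (bin_low, bin_high, count, mean_stated_probability,
    observed_frequency); empty bins are skipped.
    """
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        is_last = hi == edges[-1]
        idx = [i for i, p in enumerate(probs) if lo <= p < hi or (is_last and p == hi)]
        if not idx:
            continue
        mean_p = sum(probs[i] for i in idx) / len(idx)
        freq = sum(outcomes[i] for i in idx) / len(idx)
        rows.append((lo, hi, len(idx), mean_p, freq))
    return rows


def simulate_calibrated_record(n, levels=(0.005, 0.02, 0.05, 0.2, 0.5), seed=1):
    """Hypothetical helper: outcomes drawn so that the stated probabilities are exact."""
    rng = random.Random(seed)
    probs = [rng.choice(levels) for _ in range(n)]
    outcomes = [1 if rng.random() < p else 0 for p in probs]
    return probs, outcomes


if __name__ == "__main__":
    probs, outcomes = simulate_calibrated_record(50000)
    for lo, hi, count, mean_p, freq in reliability_table(probs, outcomes):
        print(f"[{lo:.2f}, {hi:.2f}]  n={count:6d}  stated={mean_p:.3f}  observed={freq:.3f}")
```

A calibrated forecaster has `stated` close to `observed` in every populated bin; sparse bins need confidence intervals before any conclusion is drawn.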
The 2009 L'Aquila earthquake — and the criminal trials that followed a miscommunication of low probability — provoked the field's most important governance document. Italy appointed the International Commission on Earthquake Forecasting for Civil Protection (ICEF), chaired by Thomas H. Jordan; its report (Jordan et al. 2011) is the canonical statement of Operational Earthquake Forecasting (OEF): the authoritative dissemination of time-dependent, regularly updated probabilistic forecasts for civil-protection decision-making.
The ICEF report fixed the field's vocabulary, and ours. A prediction is "a deterministic statement that a future earthquake will or will not occur" (yes/no); a forecast is "a probability (greater than zero but less than one)" that such an event occurs. The report's single most important honesty constraint: short-term probabilities "may vary over orders of magnitude but typically remain low in an absolute sense (< 1 % per day)."
Real OEF systems run today and are honest, calibrated, and validated:
- USGS OAF (Reasenberg–Jones → Page et al. 2016 global) publishes probabilities and expected counts with uncertainty.
- OEF-Italy (INGV) runs an ensemble of three clustering models (two ETAS flavors plus a STEP model), updated every midnight and after each $M_L \ge 3.5$ event. Its 10-year prospective validation (Spassiani, Falcone, Murru & Marzocchi 2023) found it broadly reliable, with a documented underestimation during the 2016–2017 Central Italy sequence caused by post-mainshock catalog incompleteness, exactly the §1.2 limit.
Why it mattered: the field's answer to L'Aquila was not "stop forecasting." It was "do probabilistic forecasting properly and communicate it honestly." That doctrine is the spine of this product's framing (§14) and its governance posture.
The deep-learning wave reached seismology with high expectations. The honest verdict, established by the field's most rigorous benchmarks, is sobering — and clarifying.
The cautionary tale. DeVries, Viegas, Wattenberg & Meade (2018, Nature) trained a deep network (6 hidden layers, 13,451 free parameters, 12 stress features) to forecast aftershock spatial patterns and reported AUC 0.85 versus 0.58 for the classical Coulomb stress criterion. Mignan & Broccardo (2019, Nature) then matched that 0.85 with a 2-parameter logistic regression ("one neuron") using a single physically simple feature. The deep net's apparent advantage was an artifact of massive over-parameterization (versus only ~199 effective mainshocks), a per-cell "computer-vision" framing that inflated the apparent sample size, and an inappropriate metric (AUC is invariant to calibration and degenerates into a region classifier on rare per-cell tasks).
The decisive benchmark. EarthquakeNPP (Stockman, Lawson & Werner, TMLR 2026) benchmarked five modern neural point processes (NSTPP, DeepSTPP, AutoSTPP, DSTPP, SMASH) against ETAS on California 1971–2021 with strict chronological splits and CSEP consistency tests. None outperformed ETAS. ETAS won spatial log-likelihood consistently; on ComCat, ETAS passed the spatial test at ~92 % while the best NPP managed only ~69 % — and the spatial test is exactly where forecasting value lives. The crucial methodological fix: earlier neural-TPP-for-earthquakes work had used non-chronological splits (which inflate metrics because earthquakes trigger one another) and had excluded the giant 2011 Tohoku sequence; once temporal splits and the big sequences are restored, the apparent neural advantage evaporates.
Where ML genuinely helps. The honest, evidence-grounded reading is that neural value is concentrated in two real ETAS gaps, not in "deep-learning magic":
- Multivariate covariate ingestion — ETAS uses one catalog; a neural conditional intensity can fuse geodesy/InSAR, pore-pressure data, multiple catalogs, and sub-$M_c$ events. FERN (Zlydenko et al. 2023) — an ETAS-generalizing encoder that replaces fixed kernels with MLPs — reported a 4–12 % information-gain improvement, but its own authors note the gain came mostly from ingesting sub-$M_c$ events, that it was not CSEP-tested, provided no uncertainty quantification, and its test period ended before Tohoku.
- Learned spatial anisotropy — fault-aligned triggering structure recovered without explicit fault inputs.
RECAST (Dascher-Cousineau et al. 2023) — a GRU-based neural TPP — improves on temporal ETAS only
when the training catalog is large (
Detection is not forecasting. ML waveform models (PhaseNet, EQTransformer, PhaseNO, the SeisLM foundation model) are mature and production-grade for detection, phase-picking, and characterization — they build better, more complete catalogs, which is the biggest near-term realizable lever for both ETAS and any neural forecaster. But they tell you what already happened; they do not forecast. SeisLM's "foreshock–aftershock" task is retrospective classification of existing waveforms relative to a known mainshock. This product keeps the detection/forecasting line explicit so that detection branding never implies prediction.
The honest verdict (2026). No pure machine-learning model has robustly beaten a well-fit ETAS for short-term forecasting under fair, prospective CSEP-style testing. This justifies shipping ETAS-class as the calibrated reference and gating any neural model behind a CSEP win — but it is stated as "on the benchmarks to date," not as an unconditional law that ML can never add skill.
Part I is the field's history. Part II is this product's history of decisions — what was chosen, and the reasoning behind each choice. The through-line: build the strongest calibrated, testable conditional estimator, ship it honestly, and never let a more flexible model reach the public until it has earned its place against the established baseline.
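The first decision was to formulate the problem as a marked spatio-temporal point process and to
estimate its conditional intensity $\lambda(t, x, y, m \mid \mathcal H_t)$, fit by maximizing the point-process log-likelihood of §3 (the sum of log-intensities at the observed events minus the integral of the intensity over the study volume).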
The integral ("compensator") term is what makes a model probabilistic and calibratable rather
than a regressor. A system that drops it — by reframing forecasting as classification ("will there be
an
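The public number is then an exceedance probability. The expected number of events above a target
magnitude $M^*$ in a region $R$ over a horizon $[t, t + \Delta]$ is

$$\hat N(M^*) = \mathbb{E}\left[\int_t^{t + \Delta}\!\!\int_R \int_{M^*}^{\infty} \lambda(s, x, y, m \mid \mathcal H_s)\, dm\, dx\, dy\, ds\right],$$

and, under the Poisson approximation, the published probability is

$$P(\ge 1 \text{ event}) = 1 - e^{-\hat N(M^*)}.$$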
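A worked sketch of the exceedance calculation, under simplifying assumptions that are mine rather than the product's: the rate above $M_c$ is constant over the horizon, Gutenberg–Richter scaling with a given $b$ carries it to the target magnitude, and the count is treated as Poisson. The function names are illustrative.

```python
import math


def expected_count(rate_per_day_above_mc, m_c, m_target, horizon_days, b=1.0):
    """Expected number of events with magnitude >= m_target over the horizon.

    Assumes a constant rate over the horizon and Gutenberg-Richter scaling.
    """
    if m_target < m_c:
        raise ValueError("target magnitude must not be below completeness")
    scale = 10.0 ** (-b * (m_target - m_c))
    return rate_per_day_above_mc * scale * horizon_days


def exceedance_probability(n_expected):
    """P(at least one event) = 1 - exp(-N), computed stably for small N."""
    return -math.expm1(-n_expected)


if __name__ == "__main__":
    for label, rate in [("quiet", 0.5), ("active sequence", 50.0)]:
        n = expected_count(rate, m_c=3.0, m_target=6.0, horizon_days=7.0)
        print(f"{label}: N = {n:.4f}, P(M>=6 in 7 days) = {exceedance_probability(n):.4f}")
```

A hundredfold increase in rate produces a hundredfold relative gain, while the absolute probability can still be well below one: the pattern the ICEF report describes.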
The product forecasts the full conditional magnitude distribution and thresholds it for display
(per-region
ETAS (§3) is shipped as the calibrated, defensible core for three reasons that no alternative matches at v0:
- It is fully generative. Simulating $\ge 10{,}000$ synthetic next-day catalogs produces bounded probabilities for the 1-day / 2-day / 7-day horizons and lets every CSEP test run on the ensemble.
- It has few interpretable parameters (~5–8), so there is little overfitting headroom and the fit is defensible to a senior seismologist.
- It is the field's prospective baseline. Decades of CSEP track record make it the honest yardstick.
The product's ETAS is not a toy. It runs the full hygiene pipeline a simplistic baseline omits:
rolling per-region
Every time-dependent forecast is required to beat a stationary, time-independent Poisson model
built from smoothed seismicity (§4). This is non-negotiable for an honest reason: most of any map's
area-and-time is quiet, and on quiet days a time-dependent model should read near climatology. If a
"sophisticated" model cannot beat the smoothed-seismicity null on the IGPE comparison test with a
confidence interval excluding zero, it has added no skill — it has only added complexity. The null is
also the principled cold-start floor: where recent seismicity is sparse or zero, the conditional
rate floors to the smoothed-seismicity background rather than to zero.
This discipline is the direct lesson of the DeVries/Mignan episode (§7): always include a trivially simple baseline. Here the trivial baselines are time-independent Poisson, smoothed seismicity, and ETAS — and any model that does not beat all three is not shipped.
The "stronger model" is not a different default; it is a gated challenger. The product ships ETAS-class as the reference and treats any neural model as a candidate that reaches the public map only if it beats ETAS in the product's own prospective CSEP harness (positive IGPE, T-test confidence interval excluding zero) and is calibrated (reliability diagram, not just sharpness). Calibration is a release blocker.
The chosen challenger class is a conditional spatio-temporal neural temporal point process with a Hawkes inductive bias, in the spirit of FERN: keep the additive background + summed-triggering skeleton of ETAS, replace the fixed kernels with small MLPs / attention, and model magnitude explicitly (a real gap in most neural point processes, flagged by EarthquakeNPP). The design targets the two proven neural levers from §7 — covariate ingestion and learned anisotropy — rather than chasing "deep-learning magic" that the benchmarks show does not exist for this problem. A spatial CNN appears in the architecture only as a context encoder for static geophysical fields, never as a standalone aftershock-pattern CNN (the refuted DeVries approach is documented as a lesson, not a template).
The gate is the methodology. What makes the challenger honest is not the architecture but the gate: strict temporal splits, proper scoring, a CSEP win over ETAS and the Poisson null, and a calibration check — every one of them a guardrail distilled from the failures in §7.
The product's central scientific thesis is global-context-conditioning: rather than fitting an isolated regional model, it treats any country as a view into one global field, trains on global seismicity plus every feasible global covariate, and runs inference across many countries at once — high-seismicity (Chile, Japan, Indonesia, Mexico, Turkey, California, New Zealand) and low-seismicity (UK, Germany, Australia, Brazil).
The reasoning is twofold:
- A region-only model is the opposite of the purpose. The interesting, testable question is whether a conditional forecaster, given enough global context (tectonic regime, slab geometry, plate-boundary type, strain rate), generalizes across tectonic settings — not whether it can be over-fit to one well-instrumented region.
- Comparing across high- and low-seismicity regions is an explicit evaluation goal: it is the way to detect and quantify bias toward high-seismicity zones, which a single-region build cannot surface at all.
Mechanically, this is implemented as a tiled global field: the globe is partitioned into interior tiles (which own cells for aggregation) plus halos (which carry edge events so triggering is edge-correct), each tile fit with a tectonic-regime prior anchored on the USGS OAF tectonic-regime study (Page et al. 2016). Five regimes — subduction interface, intraslab, crustal/strike-slip, intraplate, ridge — assigned per event from Slab2 and plate-boundary geometry, with a self-contained heuristic fallback when the enricher grids are absent. The tiled ETAS stays the calibrated reference the neural model must beat; the neural challenger's job is to do better than per-tile ETAS by exploiting the global covariate context that per-tile ETAS cannot absorb. This is precisely the neural lever §7 identified as real.
The product adopts CSEP / pyCSEP as the only scoring framework — standard tests, not bespoke metrics — because using the same yardstick the field uses is what makes the product defensible: reviewers can dispute the model, not the test code. The methodology around scoring carries several non-obvious but load-bearing decisions:
- Catalog-based tests are the primary path. Regional seismicity is over-dispersed relative to Poisson (variance $\gg$ mean, because of clustering), so Poisson grid tests over-reject during aftershock sequences. The product emits both a gridded-rate forecast (Poisson tests) and a catalog-based forecast (Savran et al. 2020; over-dispersion-honest tests), and the pessimistic uncertainty bound is deliberately wider than a naive Poisson interval (negative-binomial behavior; Kagan 2017).
- Skill = winning a comparison test, in nats, against smoothed seismicity and ETAS, never a consistency-test pass reported as if it were skill.
- A strict forecast clock makes temporal leakage structurally impossible: at each daily issue time the model is handed only the catalog slice $(-\infty, t)$, the forecast is sealed with an immutable input snapshot (catalog version, $M_c$ grid, declustering choice, parameters), then the clock advances. The testing-mode hierarchy is explicit: true prospective > pseudo-prospective > retrospective out-of-sample > in-sample (Mizrahi et al. 2024).
- AUC / accuracy are banned as primary skill metrics, the direct lesson of DeVries/Mignan (§7). ROC appears only as a communication aid (the Molchan diagram / area skill score), never as a skill claim.
- The headline credibility artifact is the live reliability diagram ("when we said 5 %, it happened ~5 % of the time"), updated as the prospective record accrues.
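The forecast clock in the list above can be sketched as a loop that only ever exposes the catalog slice strictly before the issue time and records a hash of exactly what the model saw. Everything here is a hypothetical illustration: the event dictionaries, the `model` callable, the config keys, and the use of SHA-256 over canonical JSON are assumptions, not the product's actual storage format.

```python
import hashlib
import json
from bisect import bisect_left


def catalog_slice(events, issue_time):
    """Events strictly before issue_time; events must be sorted by 'time'."""
    times = [e["time"] for e in events]
    return events[:bisect_left(times, issue_time)]


def seal_snapshot(visible_events, config):
    """Deterministic fingerprint of the inputs a forecast was allowed to see."""
    payload = json.dumps({"events": visible_events, "config": config}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_forecast_clock(events, issue_times, model, config):
    """Issue one sealed forecast per issue time, never exposing future events."""
    records = []
    for t in sorted(issue_times):
        visible = catalog_slice(events, t)
        records.append({
            "issue_time": t,
            "n_visible": len(visible),
            "snapshot_sha256": seal_snapshot(visible, config),
            "forecast": model(visible, t),
        })
    return records


if __name__ == "__main__":
    catalog = [{"time": 0.5, "mag": 3.1}, {"time": 1.0, "mag": 4.0},
               {"time": 1.5, "mag": 3.3}, {"time": 2.2, "mag": 5.1}]
    config = {"catalog_version": "demo", "m_c": 3.0, "declustering": "none"}
    toy_model = lambda visible, t: {"expected_events_next_day": 0.1 * len(visible)}
    for record in run_forecast_clock(catalog, [1.0, 2.0, 3.0], toy_model, config):
        print(record["issue_time"], record["n_visible"], record["snapshot_sha256"][:12])
```

Because the model function never receives the full catalog, leakage would require bypassing the harness rather than making a mistake inside the model, and the stored hash lets a reviewer confirm later that the inputs were not changed after the fact.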
The information gain over a Poisson reference is reported as state-dependent (large during active aftershock sequences, near zero in quiet periods, with a modest all-period average), in nats — never a fabricated round figure, never bits.
The final methodological commitment is the honest framing itself, treated as a first-class part of the product rather than a disclaimer bolted on at the end. It rests on a clear epistemic basis: deterministic prediction is effectively impossible because whether a small rupture cascades into a great earthquake depends on unmeasurably fine details of the crust (Geller, Jackson, Kagan & Mulargia 1997), with self-organized criticality (Bak & Tang 1989) as the leading explanatory framework, not settled physics. What is achievable is OEF: conditional probabilities that typically remain low in absolute terms even when the relative gain during a sequence is large.
The product therefore commits, in code and copy, to:
- Never an alarm, never a "safe" state. No alarm dots, no countdown, no binary call. Red is reserved for model quality, never danger. Every number is shown next to its long-term baseline, with horizon and magnitude threshold always attached.
- Honest, real uncertainty (P10 / median / P90), over-dispersion-aware, with a staleness banner and coverage mask — blank never reads as "safe."
- Complement, never compete. It is an independent research/education tool that complements official OEF (USGS, INGV, CSN, GeoNet, JMA), never an authoritative civil-protection alarm.
- A single outcome neither validates nor invalidates a probabilistic forecast — carried into copy with the 2019 Ridgecrest worked example (a ~3 % first-week forecast was not wrong when the ~3 % outcome occurred).
The reason this is methodology and not marketing is the field's own cautionary tale: the L'Aquila harm was a communication failure — false reassurance — not a failure to predict. The honest framing is the engineered antidote to both over-reassurance and over-alarm. See the companion page Honest Limits for the full epistemics.
The product's creed, carried verbatim into the app: Earthquakes cannot be predicted, but their probability can be forecast — reported honestly, with uncertainty, evaluated against reality, never as an alarm and never as a promise of safety.
Canonical, peer-reviewed sources (USGS / ISC / CSEP documentation and journal articles). DOIs given where registered.
- Aki, K. (1965). Maximum likelihood estimate of b in the formula log N = a − bM and its confidence limits. Bull. Earthq. Res. Inst. 43, 237–239.
- Bak, P. & Tang, C. (1989). Earthquakes as a self-organized critical phenomenon. JGR 94(B11), 15635–15637. doi:10.1029/JB094iB11p15635
- Dascher-Cousineau, K. et al. (2023). Using deep learning for flexible and scalable earthquake forecasting (RECAST). GRL 50, e2023GL103909. doi:10.1029/2023GL103909
- DeVries, P. M. R., Viegas, F., Wattenberg, M. & Meade, B. J. (2018). Deep learning of aftershock patterns following large earthquakes. Nature 560, 632–634. doi:10.1038/s41586-018-0438-y
- Dieterich, J. (1994). A constitutive law for rate of earthquake production and its application to earthquake clustering. JGR 99(B2), 2601–2618. doi:10.1029/93JB02581
- Geller, R. J., Jackson, D. D., Kagan, Y. Y. & Mulargia, F. (1997). Earthquakes cannot be predicted. Science 275(5306), 1616–1617. doi:10.1126/science.275.5306.1616
- Gerstenberger, M. C., Wiemer, S., Jones, L. M. & Reasenberg, P. A. (2005). Real-time forecasts of tomorrow's earthquakes in California (STEP). Nature 435, 328–331. doi:10.1038/nature03622
- Gutenberg, B. & Richter, C. F. (1944). Frequency of earthquakes in California. BSSA 34, 185–188.
- Helmstetter, A., Kagan, Y. Y. & Jackson, D. D. (2007). High-resolution time-independent grid-based forecast for M ≥ 5 earthquakes in California. SRL 78(1), 78–86. doi:10.1785/gssrl.78.1.78
- Jordan, T. H. et al. (2011). Operational Earthquake Forecasting: State of Knowledge and Guidelines for Utilization (ICEF Report). Annals of Geophysics 54(4), 315–391. doi:10.4401/ag-5350
- Kagan, Y. Y. (2017). Worldwide earthquake forecasts. GJI 211(1), 335–345. doi:10.1093/gji/ggx300
- Mignan, A. & Broccardo, M. (2019). One neuron versus deep learning in aftershock prediction. Nature 575, E1–E3. doi:10.1038/s41586-019-1582-8
- Mizrahi, L. et al. (2024). Developing, testing, and communicating earthquake forecasts: Current practices and future directions. Reviews of Geophysics 62. doi:10.1029/2023RG000823
- Ogata, Y. (1988). Statistical models for earthquake occurrences and residual analysis for point processes. JASA 83(401), 9–27. doi:10.1080/01621459.1988.10478560
- Ogata, Y. (1998). Space–time point-process models for earthquake occurrences. Ann. Inst. Statist. Math. 50(2), 379–402. doi:10.1023/A:1003403601725
- Page, M. T. et al. (2016). Three ingredients for improved global aftershock forecasts. BSSA 106(5), 2290–2301. doi:10.1785/0120160073
- Reasenberg, P. A. & Jones, L. M. (1989). Earthquake hazard after a mainshock in California. Science 243(4895), 1173–1176. doi:10.1126/science.243.4895.1173
- Savran, W. H. et al. (2020). Pseudoprospective evaluation of UCERF3-ETAS forecasts during the 2019 Ridgecrest sequence. BSSA 110(4), 1799–1817. doi:10.1785/0120200026
- Savran, W. H. et al. (2022). pyCSEP: A Python toolkit for earthquake forecast developers. SRL 93(5), 2858–2870. doi:10.1785/0220220033
- Schorlemmer, D., Gerstenberger, M. C., Wiemer, S., Jackson, D. D. & Rhoades, D. A. (2007). Earthquake likelihood model testing. SRL 78(1), 17–29. doi:10.1785/gssrl.78.1.17
- Shcherbakov, R., Turcotte, D. L. & Rundle, J. B. (2004). A generalized Omori's law for earthquake aftershock decay. GRL 31, L11613. doi:10.1029/2004GL019808
- Spassiani, I., Falcone, G., Murru, M. & Marzocchi, W. (2023). Operational Earthquake Forecasting in Italy: validation after 10 yr of operativity. GJI 234(3), 2501–2518.
- Stockman, S., Lawson, D. & Werner, M. J. (2026, accepted). EarthquakeNPP: A benchmark for earthquake forecasting with neural point processes. TMLR. arXiv:2410.08226
- Tinti, S. & Mulargia, F. (1987). Confidence intervals of b-values for grouped magnitudes. BSSA 77(6), 2125–2134.
- Wiemer, S. & Wyss, M. (2000). Minimum magnitude of completeness in earthquake catalogs. BSSA 90(4), 859–869. doi:10.1785/0119990114
- Woessner, J. & Wiemer, S. (2005). Assessing the quality of earthquake catalogues. BSSA 95(2), 684–698. doi:10.1785/0120040007
- Zaliapin, I. & Ben-Zion, Y. (2020). Earthquake declustering using the nearest-neighbor approach in space–time–magnitude domain. JGR Solid Earth 125, e2018JB017120. doi:10.1029/2018JB017120
- Zechar, J. D., Gerstenberger, M. C. & Rhoades, D. A. (2010). Likelihood-based tests for evaluating space–rate–magnitude earthquake forecasts. BSSA 100(3), 1184–1195. doi:10.1785/0120090192
- Zhuang, J., Ogata, Y. & Vere-Jones, D. (2002). Stochastic declustering of space–time earthquake occurrences. JASA 97(458), 369–380. doi:10.1198/016214502760046925
- Zlydenko, O. et al. (2023). A neural encoder for earthquake rate forecasting (FERN). Scientific Reports 13. doi:10.1038/s41598-023-38033-9
Collaboratory for the Study of Earthquake Predictability (CSEP): https://cseptesting.org. USGS Operational Aftershock Forecasting: https://earthquake.usgs.gov/data/oaf/.
See also: Honest Limits — the epistemics of why deterministic prediction is impossible, why absolute probabilities stay small, and the L'Aquila communication lesson.
⚠️ Disclaimer: read this. CAOS_SEISMIC produces probabilistic forecasts, not predictions. It is an independent research and education tool. It is NOT an official earthquake early-warning or civil-protection system, it does NOT predict when, where, or how large an earthquake will be, and it must NOT be used for life-safety, emergency, or evacuation decisions. Every number it publishes is a bounded, calibrated probability conditioned on the present state of seismicity, never an alarm, a countdown, or a "safe" state. A single outcome neither confirms nor refutes a probabilistic forecast. It complements, and does not replace or speak for, official agencies; always follow your national seismological and civil-protection authorities (e.g. USGS, INGV, CSN (Chile, SENAPRED for civil protection), GeoNet, JMA). The software is provided "as is", without warranty of any kind (MIT License); the authors accept no liability for its use. Data are courtesy of their providers (USGS/ANSS, ISC/ISC-GEM, Global CMT, EMSC, CSN, and others) under their respective licenses and attribution terms. See Honest Limits for the full epistemic context.
CAOS_SEISMIC · seismic.fasl-work.com · source · MIT