Detection vs Forecasting
Machine learning has transformed one half of observational seismology: detection — picking phases, associating arrivals, locating and characterizing events, and building far more complete catalogs. PhaseNet, EQTransformer, PhaseNO, SeisBench, and SeisLM are mature, production-grade, and genuinely excellent. None of them forecast. They tell you, with superhuman speed and recall, what already happened. Forecasting estimates what will happen. This page draws that line precisely, explains why the two tasks are different in kind, and states the product rule that keeps detection branding from ever implying prediction.
Why this is its own page. Conflating "AI is great at earthquakes" (true, for detection) with "AI predicts earthquakes" (false, as a calibrated forecaster beating ETAS — see RECAST and FERN) is the most common public misunderstanding in the field. Keeping the line explicit is a matter of scientific honesty and of not over-claiming in a life-safety-adjacent domain.
- Two different questions
- Why detection is "easy" and forecasting is "hard"
- The detection toolkit (mature, production-grade)
- The connection that is real: better catalogs help forecasting
- The trap: foundation models and "foreshock classification"
- The honest ETAS-vs-NPP verdict, restated here
- The product copy rule
- References
Two different questions
| | Detection / characterization | Forecasting |
|---|---|---|
| Question | What just happened? (is there an event; where; how big; which phase) | What will happen? (calibrated probability of future events) |
| Input | Waveforms / multi-station data after an event | Catalog history up to now |
| Target | A label on observed data | A probability over the future |
| Ground truth | Available almost immediately | Only revealed by waiting |
| Right metric | Precision/recall, pick error, location RMS | CSEP consistency + comparison tests, proper scoring, calibration |
| ML maturity (2026) | Production — superhuman recall and speed | Research — no NPP reliably beats ETAS prospectively |
The tasks share waveforms and catalogs as raw material, but they are different in kind: one classifies signal that exists, the other assigns probability to events that do not yet exist. A tool that excels at the first tells you nothing about its skill at the second.
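The "Right metric" row can be made concrete with a minimal sketch. The function names, the 0.1 s matching tolerance, and all numbers below are assumptions for illustration; a real evaluation would use CSEP-style tests rather than these toy scores.

```python
import math

def pick_precision_recall(predicted, reference, tol=0.1):
    """Match predicted pick times to reference picks within +/- tol seconds.

    Each reference pick can be matched at most once (greedy, in time order).
    """
    unmatched = sorted(reference)
    true_pos = 0
    for t in sorted(predicted):
        hit = next((r for r in unmatched if abs(r - t) <= tol), None)
        if hit is not None:
            unmatched.remove(hit)
            true_pos += 1
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    return precision, recall

def brier_score(probabilities, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)

def log_score(probabilities, outcomes):
    """Mean negative log-likelihood of binary outcomes (lower is better)."""
    total = 0.0
    for p, o in zip(probabilities, outcomes):
        total -= math.log(p if o else 1.0 - p)
    return total / len(outcomes)

# Detection: the labels exist now, so scoring is immediate.
precision, recall = pick_precision_recall(
    predicted=[10.02, 25.50, 40.31], reference=[10.00, 40.25, 61.00]
)

# Forecasting: outcomes only exist after the forecast windows have closed.
forecast_p = [0.02, 0.10, 0.01, 0.30]
observed = [0, 0, 0, 1]
brier = brier_score(forecast_p, observed)
nll = log_score(forecast_p, observed)
```

The detection score compares two lists that both exist today. The forecast scores cannot be computed until `observed` has been filled in by waiting, which is the practical meaning of the "Ground truth" row.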
flowchart LR
W["Raw waveforms<br/>(multi-station)"] --> DET["DETECTION ML<br/>PhaseNet · EQTransformer · PhaseNO"]
DET --> CAT["Better, more complete catalog<br/>(lower, more stable Mc)"]
CAT --> FC["FORECASTING model<br/>ETAS / point process"]
FC --> P["Calibrated probability<br/>of FUTURE events"]
DET -. "does NOT" .-> P
style DET fill:#e8f4ff,stroke:#3a7
style FC fill:#fff4e8,stroke:#a73
The dashed arrow is the whole message: detection feeds the catalog that forecasting consumes — it does not itself produce a forecast.
Why detection is "easy" and forecasting is "hard"
The asymmetry is not about effort; it is about information.
- Detection has the signal in hand. A P-wave arrival is physically present in the waveform. The task is pattern recognition on data that contains the answer; abundant labelled training data exists (millions of picked arrivals), and ground truth is available the moment the event is reviewed. This is exactly the regime where deep learning excels.
- Forecasting is fighting irreducible randomness. The future catalog is not determined by the present one. The best-supported physics says short-term seismicity is a clustered point process: the rate is forecastable (Omori–Utsu aftershock decay, Utsu productivity), but which fault ruptures when and at what size is governed by chaotic, threshold-sensitive nonlinear dynamics. Worse, magnitude is approximately memoryless given that an event occurs (Gutenberg–Richter), so "how big is the next one" carries little learnable signal (see Honest Limits).
So detection can approach a ceiling set by data quality, while forecasting is bounded by a physical limit on predictability itself. No amount of model capacity converts a clustered, partly chaotic process into a deterministic one — the honest target is a bounded, calibrated probability, never an alarm.
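What "the rate is forecastable" means can be sketched with the Omori–Utsu law, $n(t) = K/(t+c)^p$, and the Gutenberg–Richter exceedance probability. The parameter values are invented for illustration and not fitted to any catalog; the function names are hypothetical.

```python
import math

def omori_utsu_expected_count(t_start, t_end, K, c, p):
    """Expected number of aftershocks in [t_start, t_end] days after a mainshock.

    Integrates the Omori-Utsu rate n(t) = K / (t + c)**p in closed form.
    """
    if abs(p - 1.0) < 1e-12:
        return K * (math.log(t_end + c) - math.log(t_start + c))
    return K * ((t_end + c) ** (1.0 - p) - (t_start + c) ** (1.0 - p)) / (1.0 - p)

def gr_exceedance_probability(m, m_min, b=1.0):
    """P(M >= m | M >= m_min) under Gutenberg-Richter: 10**(-b * (m - m_min)).

    Note the signature: no event history is passed in. In this idealisation
    the size of the next event does not depend on what came before.
    """
    return 10.0 ** (-b * (m - m_min))

# Illustrative parameters only.
K, c, p = 50.0, 0.05, 1.1
day1 = omori_utsu_expected_count(0.0, 1.0, K, c, p)    # first day
day10 = omori_utsu_expected_count(9.0, 10.0, K, c, p)  # tenth day

# Chance that any given M>=3 event is in fact M>=6.
p_m6 = gr_exceedance_probability(6.0, 3.0)
```

The count functions give a rate that decays in a known way, which is what a forecast can use. The magnitude function returns the same small number for every event, which is why "how big is the next one" offers so little to learn.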
The detection toolkit (mature, production-grade)
These are the tools that built modern catalogs. Every one is detection / characterization, not forecasting:
| Tool | Task | Architecture | Maturity | Forecasting? |
|---|---|---|---|---|
| PhaseNet (Zhu & Beroza, 2019) | P/S phase picking | U-Net CNN (~268k params) | Production | No — detection |
| EQTransformer (Mousavi et al., 2020) | Joint detection + P/S picking | CNN + BiLSTM + attention | Production | No — detection |
| PhaseNO (Sun et al., 2023) | Multi-station picking / association | Fourier / Graph Neural Operator | Maturing | No — detection/association |
| SeisBench (Woollam et al., 2022) | Benchmark/toolbox + datasets (ETHZ, GEOFON, STEAD) | Infrastructure | Production | No — infrastructure |
| SeisLM (Liu et al., 2024) | Foundation model: detection, P/S, onset, foreshock/aftershock classification | Wav2Vec2-style transformer | Research | No — retrospective characterization |
A few details worth keeping in mind:
- PhaseNet reframes picking as image segmentation (a U-Net over the three-component waveform), outputting per-sample P/S/noise probabilities. Small, fast, ubiquitous.
- EQTransformer does joint detection and picking with attention, and is a workhorse for building dense modern catalogs.
- PhaseNO uses a graph neural operator over multiple stations at once (a station-graph approach — see Graph & Recurrent Networks), improving association.
- SeisBench is the shared toolbox/benchmark that makes these comparable and reproducible — it forecasts nothing; it is plumbing.
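To see why per-sample probabilities are a detection output and nothing more, here is a minimal sketch of turning such a trace into picks. It is not PhaseNet's own post-processing; the function name, the 0.5 threshold, and the toy trace are assumptions.

```python
def picks_from_probabilities(prob, sampling_rate_hz, threshold=0.5):
    """Return (time_s, probability) for each local maximum above threshold.

    `prob` is a per-sample probability trace for one phase, as a picker with
    per-sample outputs would produce. A plateau yields one pick, at its first sample.
    """
    picks = []
    for i in range(1, len(prob) - 1):
        if prob[i] >= threshold and prob[i] > prob[i - 1] and prob[i] >= prob[i + 1]:
            picks.append((i / sampling_rate_hz, prob[i]))
    return picks

# Toy P-phase trace sampled at 100 Hz with one clear peak at sample 5.
p_trace = [0.0, 0.01, 0.05, 0.30, 0.80, 0.95, 0.70, 0.20, 0.02, 0.0]
p_picks = picks_from_probabilities(p_trace, sampling_rate_hz=100.0)
```

Every pick is a timestamp inside the recorded window. Nothing in the output refers to a time after the last sample.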
The connection that is real: better catalogs help forecasting
These tools lower and stabilize the magnitude of completeness $M_c$.
Detection is not merely "adjacent" to forecasting: indirectly, it is the single biggest realizable near-term lever on forecast quality.
Why a lower $M_c$ matters:
- More events constrain the triggering parameters. ETAS's Omori $p, c$, Utsu productivity $\alpha$, and the spatial kernel are all estimated from event counts; more (smaller) events sharpen those estimates, especially in the crucial hours after a mainshock, when the catalog is grossly incomplete and naive ETAS under-forecasts productivity.
- Sub-$M_c$ events are where the neural gain lives. FERN's 4–12% IGPE improvement came specifically from ingesting sub-completeness-magnitude events, i.e. from the extra detections that better ML provides. The forecasting gain traced to data, surfaced by detection, not to architectural depth.
So the honest causal story is: ML helps forecasting mostly by building a better catalog upstream, and only secondarily (and conditionally) through a better forecasting model. The detection tools are indispensable inputs to the pipeline — they are just not the forecaster.
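The "more events sharpen the estimates" argument can be illustrated with the simplest catalog parameter, the Gutenberg–Richter b-value, as a stand-in for a full ETAS fit. The sketch uses Aki's maximum-likelihood estimator and its $b/\sqrt{N}$ standard error on a synthetic catalog; the catalog, the seed, and the two completeness levels are assumptions.

```python
import math
import random

def aki_b_value(magnitudes, m_c, bin_width=0.0):
    """Aki (1965) maximum-likelihood b-value for events with M >= m_c.

    Returns (b, standard_error, n_events); the error uses Aki's b / sqrt(n).
    Set bin_width to the catalog's magnitude binning (Utsu's correction).
    """
    above = [m for m in magnitudes if m >= m_c]
    n = len(above)
    mean_mag = sum(above) / n
    b = math.log10(math.e) / (mean_mag - (m_c - bin_width / 2.0))
    return b, b / math.sqrt(n), n

# Synthetic Gutenberg-Richter catalog with b = 1 above M 1.0 (illustration only).
rng = random.Random(42)
beta = 1.0 * math.log(10.0)
catalog = [1.0 + rng.expovariate(beta) for _ in range(20000)]

# The same sequence seen through two pipelines with different completeness.
b_coarse, err_coarse, n_coarse = aki_b_value(catalog, m_c=2.5)
b_dense, err_dense, n_dense = aki_b_value(catalog, m_c=1.5)
```

Lowering $M_c$ by one magnitude unit at $b = 1$ keeps roughly ten times as many events, so the standard error shrinks by about $\sqrt{10}$. The same counting logic applies to the ETAS triggering parameters, although fitting them is more involved than this one-line estimator.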
The trap: foundation models and "foreshock classification"
The most seductive way to blur the line is a seismic foundation model that lists "foreshock / aftershock classification" among its capabilities. It sounds predictive. It is not.
SeisLM (Liu, Münchmeyer, Laurenti, Marone, de Hoop & Dokmanić, 2024; arXiv:2410.15765) is a self-supervised, Wav2Vec2-style transformer foundation model for seismic waveforms (SeisLM-base 11.4M / SeisLM-large 90.7M parameters; masked-reconstruction contrastive pretraining). It is strong at detection, phase-picking, onset regression — and at a task labelled "foreshock–aftershock classification." Read that task carefully:
Its foreshock/aftershock "classification" labels an existing waveform relative to a known mainshock — i.e. it decides, after the mainshock is known, whether a given recorded event sits before or after it. This is retrospective characterization, not prediction. The model is not told a waveform is a foreshock before the mainshock exists, because that information does not exist yet.
The distinction is total. "This recorded wiggle, given we now know the mainshock, was a foreshock" is a labelling task on the past. "This wiggle happening now is a foreshock, so a mainshock is coming" is a forecasting claim about the future — and no foundation model makes it reliably, because the foreshock-vs-ordinary-swarm distinction is not separable in real time (the great majority of swarms are not foreshocks). SeisLM's own positioning is explicit: it performs detection / characterization and does not perform earthquake forecasting.
The rule of thumb: if a model needs to already know the mainshock to assign its label, it is doing retrospective characterization, not forecasting — no matter how predictive the label name sounds.
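The rule of thumb can be read directly off a function signature. The sketch below is not how SeisLM works internally (it classifies from waveforms); it only shows, with hypothetical names, what information the label itself depends on.

```python
from typing import Optional

def retrospective_label(event_time: float, mainshock_time: Optional[float]) -> str:
    """Label an event relative to a mainshock, which must already be known."""
    if mainshock_time is None:
        # Real-time situation: no mainshock has been identified, so the
        # foreshock/aftershock question cannot even be posed.
        return "undetermined"
    return "foreshock" if event_time < mainshock_time else "aftershock"

# The same event, before and after the catalog contains the mainshock.
label_in_real_time = retrospective_label(event_time=100.0, mainshock_time=None)
label_in_hindsight = retrospective_label(event_time=100.0, mainshock_time=250.0)
```

`mainshock_time` is a required piece of the label's definition. Any task whose ground truth takes that argument is a task about the past.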
The honest ETAS-vs-NPP verdict, restated here
For completeness, the forecasting-side verdict that anchors this whole wiki, stated once more in the detection context:
- As of 2026, no neural point process reliably beats a well-fit ETAS for short-term forecasting under fair, prospective, CSEP-style testing. The decisive benchmark — EarthquakeNPP (Stockman, Lawson & Werner, accepted TMLR 2026; arXiv:2410.08226) — tested five modern NPPs on California (1971–2021) with strict chronological splits and found none outperformed ETAS; ETAS won spatial log-likelihood consistently, with the authors concluding "current NPP implementations are not yet suitable for practical earthquake forecasting."
- The honest pro-neural exceptions (FERN, RECAST) win only narrowly and conditionally, and trace their gains to data ingestion and spatial flexibility, not depth.
- Therefore this product ships an ETAS-class core (Models — Classical, Models — Employed) and gates any neural challenger behind a prospective CSEP win.
The two halves of the verdict reinforce the line: ML is mature on the detection side, and the forecasting side still belongs to the calibrated classical core. Detection feeds forecasting; it does not replace it.
Scope honesty. "NPPs do not beat ETAS" is established on the California benchmark to date, not as a universal law. We do not claim ML can never add forecasting skill — only that, today, it has not been shown to, prospectively.
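A gate of the kind described above could be sketched as an information-gain-per-event comparison. This is a simplified stand-in under stated assumptions: the function names are hypothetical, the interval uses a normal approximation rather than the paired Student's t used in CSEP-style comparison tests, and the log-likelihood values are invented, not benchmark results.

```python
import math

def information_gain_per_event(loglik_challenger, loglik_baseline):
    """Mean per-event log-likelihood difference and a rough 95% interval.

    Inputs are per-event log-likelihoods of the same held-out events under
    each model. The interval uses a normal approximation.
    """
    diffs = [a - b for a, b in zip(loglik_challenger, loglik_baseline)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    half_width = 1.96 * math.sqrt(var / n)
    return mean, mean - half_width, mean + half_width

def challenger_passes_gate(loglik_challenger, loglik_baseline):
    """Hypothetical gate: ship only if the whole interval is above zero."""
    _, lower, _ = information_gain_per_event(loglik_challenger, loglik_baseline)
    return lower > 0.0

# Toy per-event log-likelihoods (invented numbers).
etas = [-2.1, -1.8, -2.5, -2.0, -1.9, -2.3]
npp = [-2.0, -1.9, -2.6, -1.9, -2.0, -2.2]
gain, low, high = information_gain_per_event(npp, etas)
ships = challenger_passes_gate(npp, etas)
```

In this toy case the challenger wins on some events and loses on others, the interval straddles zero, and the gate stays closed. The design choice is that a tie goes to the classical core.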
The product copy rule
A single, non-negotiable rule for every public surface of this product:
Never let detection-model branding imply prediction. Detection / phase-picking / association tools are used to build the input catalog; the forecast is produced by the point-process model and scored CSEP-style. Saying or implying otherwise — "our AI detects earthquakes before they happen," "foreshock detection warns of mainshocks" — is the kind of overclaim that gets seismology papers rebutted in Nature, and in a life-safety-adjacent domain it is irresponsible.
Concretely, in the app and docs:
- Detection tools are described as catalog builders in the pipeline, never as forecasters.
- Any model output that is a probability of a future event is shown as a bounded, calibrated forecast next to its long-term baseline, never as an alarm, countdown, or "safe" state (see the product disclaimer and Honest Limits).
- "Foreshock" language is only ever used retrospectively, with the explicit caveat that foreshocks are not identifiable as such in real time.
This is the same honesty discipline that runs through the evaluation and employed-models pages: forecast, never predict; calibrate, never alarm; and keep the detection/forecasting line bright.
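One way to honor the "forecast next to its baseline" rule in copy is to generate the sentence from the numbers. The sketch assumes Poisson-distributed counts within the window, and the function names and expected counts are hypothetical.

```python
import math

def poisson_probability(expected_count):
    """P(at least one event) for a Poisson count with the given mean."""
    return 1.0 - math.exp(-expected_count)

def forecast_statement(expected_now, expected_baseline, magnitude, days):
    """Format a forecast next to its long-term baseline, with no alarm wording."""
    p_now = poisson_probability(expected_now)
    p_base = poisson_probability(expected_baseline)
    ratio = p_now / p_base
    return (
        f"Probability of M>={magnitude:.1f} in the next {days} days: "
        f"{p_now:.2%} (long-term baseline {p_base:.2%}, ratio {ratio:.1f}x)"
    )

# Invented expected counts for illustration.
text = forecast_statement(expected_now=0.02, expected_baseline=0.002, magnitude=5.0, days=7)
print(text)
```

Showing the absolute probability, the baseline, and their ratio together keeps a large relative increase from being read as a high absolute risk.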
References
- Zhu, W. & Beroza, G.C. (2019). PhaseNet: a deep-neural-network-based seismic arrival-time picking method. Geophysical Journal International 216(1), 261–273. doi:10.1093/gji/ggy423
- Mousavi, S.M., Ellsworth, W.L., Zhu, W., Chuang, L.Y. & Beroza, G.C. (2020). Earthquake Transformer — an attentive deep-learning model for simultaneous earthquake detection and phase picking. Nature Communications 11, 3952. doi:10.1038/s41467-020-17591-w
- Sun, H., Ross, Z.E., Zhu, W. & Azizzadenesheli, K. (2023). Phase Neural Operator for Multi-Station Picking of Seismic Arrivals (PhaseNO). Geophysical Research Letters 50, e2023GL106434. doi:10.1029/2023GL106434
- Woollam, J., Münchmeyer, J., Tilmann, F., Rietbrock, A., Lange, D., Bornstein, T., Diehl, T., Giunchi, C., Haslinger, F., Jozinović, D., Michelini, A., Saul, J. & Soto, H. (2022). SeisBench — A Toolbox for Machine Learning in Seismology. Seismological Research Letters 93(3), 1695–1709. doi:10.1785/0220210324
- Liu, T., Münchmeyer, J., Laurenti, L., Marone, C., de Hoop, M.V. & Dokmanić, I. (2024). SeisLM: a Foundation Model for Seismic Waveforms. arXiv:2410.15765
- Mousavi, S.M. & Beroza, G.C. (2022). Deep-learning seismology. Science 377, eabm4470. doi:10.1126/science.abm4470
- Stockman, S., Lawson, D. & Werner, M.J. (2026, accepted). EarthquakeNPP: A Benchmark for Earthquake Forecasting with Neural Point Processes. TMLR. arXiv:2410.08226
- Zlydenko, O., Elidan, G., Hassidim, A., Kukliansky, D., Matias, Y., Meade, B., Molchanov, A., Nevo, A. & Bar-Sinai, Y. (2023). A neural encoder for earthquake rate forecasting. Scientific Reports 13, 12350. doi:10.1038/s41598-023-38033-9
- Mignan, A. & Broccardo, M. (2020). Neural network applications in earthquake prediction (1994–2019): Meta-analytic and statistical insights on their limitations. Seismological Research Letters 91(4), 2330–2342. doi:10.1785/0220200021
- Jordan, T.H., Chen, Y.-T., Gasparini, P., Madariaga, R., Main, I., Marzocchi, W., Papadopoulos, G., Sobolev, G., Yamaoka, K. & Zschau, J. (2011). Operational Earthquake Forecasting: State of Knowledge and Guidelines for Utilization (ICEF Report). Annals of Geophysics 54(4), 315–391. doi:10.4401/ag-5350
- CSEP / pyCSEP — Collaboratory for the Study of Earthquake Predictability. https://cseptesting.org · https://github.com/SCECcode/pycsep
See also: Models — ML · RECAST and FERN · CNN Spatial Models · Graph & Recurrent Networks · Data Sources · Pipeline · Evaluation · Honest Limits.
⚠️ Disclaimer: read this. CAOS_SEISMIC produces probabilistic forecasts, not predictions. It is an independent research and education tool. It is NOT an official earthquake early-warning or civil-protection system, it does NOT predict when, where, or how large an earthquake will be, and it must NOT be used for life-safety, emergency, or evacuation decisions. Every number it publishes is a bounded, calibrated probability conditioned on the present state of seismicity, never an alarm, a countdown, or a "safe" state. A single outcome neither confirms nor refutes a probabilistic forecast. It complements, and does not replace or speak for, official agencies; always follow your national seismological and civil-protection authorities (e.g. USGS, INGV, CSN in Chile (with SENAPRED for civil protection), GeoNet, JMA). The software is provided "as is", without warranty of any kind (MIT License); the authors accept no liability for its use. Data are courtesy of their providers (USGS/ANSS, ISC/ISC-GEM, Global CMT, EMSC, CSN, and others) under their respective licenses and attribution terms. See Honest-Limits for the full epistemic context.
CAOS_SEISMIC · seismic.fasl-work.com · source · MIT
Conditional probabilistic seismic forecasting — forecasts, never predictions.