Glossary

Felipe Santibañez-Leal edited this page Jun 17, 2026 · 1 revision

Glossary — Definitions

A precise, sourced definitions reference for every domain term used across CAOS_SEISMIC. Terms are grouped by theme; within each group they read in a natural conceptual order. Equations are real LaTeX (GitHub renders MathJax); references are canonical, peer-reviewed, with DOIs. For the models that use these terms in context see Models-Classical, Models-ML and Models-Employed; for the tests see Evaluation-and-Tests.

Conventions used throughout: magnitudes are homogenized to moment magnitude $M_w$ where possible; $M_c$ is the magnitude of completeness; $b$ is the Gutenberg–Richter slope; $\beta = b\ln 10$; information gains are reported in nats (natural-log units), the CSEP convention — never bits.



Core framing

Forecast vs. prediction

The field's organizing distinction (ICEF; Jordan et al., 2011). A prediction is a deterministic statement that an event will or will not occur in a given region, time window and magnitude range. A forecast assigns a probability strictly between 0 and 1 to such an event. CAOS_SEISMIC is strictly a forecaster: every number it publishes is a probability in $(0, 1)$, never a binary call. Jordan, T. H., et al. (2011), Operational Earthquake Forecasting (ICEF Report), Annals of Geophysics 54(4), 315–391, doi:10.4401/ag-5350.

Operational Earthquake Forecasting (OEF)

The authoritative, deployed practice of issuing time-dependent, conditional earthquake probabilities as a scheduled service. OEF is what is achievable (as opposed to deterministic prediction). Real systems include the USGS Operational Aftershock Forecast and OEF-Italy (INGV). Jordan et al. (2011), doi:10.4401/ag-5350.

Conditional probabilistic forecast

A probability of one or more qualifying events, conditioned on the present state of the system (the recent catalog and covariates), scoped to a region × magnitude band × horizon. "Conditional" means the number changes day to day as the observed history $\mathcal{H}_t$ changes; it is not an unconditional climatological rate.

Self-organized criticality (SOC)

The leading explanatory framework (not settled physics) for why deterministic prediction is effectively impossible: the crust sits near a critical state in which a small rupture may or may not cascade into a large one, governed by unmeasurably fine details. Presented as a leading hypothesis, alongside coexisting characteristic-earthquake / partial-predictability views. Bak, P., & Tang, C. (1989), Earthquakes as a self-organized critical phenomenon, JGR 94(B11), 15635–15637, doi:10.1029/JB094iB11p15635.

Predictability ceiling

The empirical fact that short-term forecasting has a hard skill ceiling: the best operational models (ETAS family) beat a time-independent Poisson baseline by only a modest information gain, and mostly within aftershock-rich windows. Any claim beyond "modest, well-calibrated conditional probabilities" is overclaiming. Geller, R. J., Jackson, D. D., Kagan, Y. Y., & Mulargia, F. (1997), Earthquakes Cannot Be Predicted, Science 275(5306), 1616–1617, doi:10.1126/science.275.5306.1616.

Absolute vs. relative probability

A crucial honesty distinction. The relative gain over the long-term background during an active sequence can be large (one to three orders of magnitude); the absolute probability of a large event in the next day usually remains well under a few percent (< 1 %/day for great events). Every number is therefore shown next to its baseline so "elevated" and "still small" are visible at once.


Magnitude, completeness, and the size distribution

Gutenberg–Richter (GR) law

The frequency–magnitude distribution of earthquakes is, to first order, a power law: $$\log_{10} N(\ge M) = a - b\,M, \qquad M \ge M_c.$$ Equivalently the magnitude density above completeness is exponential, $f(M) = \beta\, e^{-\beta (M - M_c)}$ with $\beta = b\ln 10$. It supplies the magnitude term of every forecast. Here $a$ is a productivity/rate constant and $b$ the slope.

b-value

The slope $b$ of the GR law, typically near 1 but never hard-coded. A lower $b$ means relatively more large events (a heavier tail). It is estimated by the Aki–Utsu maximum-likelihood estimator with the Utsu/Tinti–Mulargia binning correction: $$\hat b = \frac{\log_{10} e}{\bar m - \left(M_c - \tfrac{\Delta M}{2}\right)},$$ where $\bar m$ is the mean magnitude of events $\ge M_c$ and $\Delta M$ the bin width. The estimator is strongly biased if $M_c$ is mis-estimated, so $M_c$ and $b$ are re-estimated jointly on a rolling space–time window with propagated uncertainty. Aki, K. (1965), Bull. Earthq. Res. Inst. 43, 237–239; Tinti, S., & Mulargia, F. (1987), BSSA 77(6), 2125–2134.
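The estimator above can be sketched in a few lines. This is a minimal illustration, assuming magnitudes are already binned to width $\Delta M$ and that $M_c$ is supplied by the caller; the function name `aki_utsu_b` and the first-order standard error $b/\sqrt{N}$ are illustrative choices, not the project's pipeline code.

```python
import math

def aki_utsu_b(mags, mc, dm=0.1):
    """Aki-Utsu maximum-likelihood b-value with the half-bin correction.

    mags : iterable of magnitudes (assumed binned to width dm)
    mc   : magnitude of completeness
    Returns (b_hat, standard_error).
    """
    # Small tolerance so binned values like 1.9999999 still count as >= Mc.
    above = [m for m in mags if m >= mc - 1e-9]
    if len(above) < 2:
        raise ValueError("need at least two events at or above Mc")
    mean_m = sum(above) / len(above)
    b_hat = math.log10(math.e) / (mean_m - (mc - dm / 2.0))
    # First-order uncertainty (Aki, 1965): b / sqrt(N). An assumption here;
    # it ignores the uncertainty in Mc itself.
    return b_hat, b_hat / math.sqrt(len(above))
```

Because the denominator depends directly on $M_c$, shifting `mc` by a single bin changes `b_hat` noticeably, which is why the two are re-estimated jointly.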

Magnitude of completeness ($M_c$)

The lowest magnitude above which (essentially) all events in a catalog are reliably detected. A model trained below $M_c$ learns the network's detection limits, not the Earth. $M_c$ varies in space and time and is estimated per region on the training window only (a single global $M_c$ injects fake non-stationarity). Estimators: maximum-curvature (MAXC) + goodness-of-fit (GFT), cross-checked with EMR. The +0.2 MAXC correction is California-tuned, not universal — it must be re-validated per region. Wiemer, S., & Wyss, M. (2000), BSSA 90(4), 859–869, doi:10.1785/0119990114; Woessner, J., & Wiemer, S. (2005), BSSA 95(2), 684–698, doi:10.1785/0120040007.

MAXC (maximum-curvature) and GFT (goodness-of-fit test)

Two standard $M_c$ estimators. MAXC takes $M_c$ as the magnitude of the peak of the non-cumulative frequency–magnitude distribution (often with a +0.2 offset). GFT finds the magnitude above which a synthetic GR distribution fits the data to a chosen confidence. Wiemer & Wyss (2000), doi:10.1785/0119990114.
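As a rough sketch of MAXC only (GFT is not shown), assuming magnitudes are already binned to width `dm`: the estimate is the most populated bin of the non-cumulative distribution plus an optional offset. The name `mc_maxc` and the tie-breaking rule (lowest magnitude wins) are assumptions made for this example.

```python
from collections import Counter

def mc_maxc(mags, dm=0.1, correction=0.2):
    """Maximum-curvature Mc: modal magnitude bin plus an optional offset."""
    if not mags:
        raise ValueError("empty catalog")
    # Map each magnitude to an integer bin index to avoid float-key issues.
    counts = Counter(round(m / dm) for m in mags)
    # Highest count wins; ties resolve to the lowest magnitude bin (assumption).
    peak_bin = max(counts, key=lambda k: (counts[k], -k))
    return round(peak_bin * dm + correction, 6)
```

The `correction` default of 0.2 mirrors the offset mentioned above; it is a parameter precisely because it should not be treated as universal.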

EMR (Entire-Magnitude-Range method)

A maximum-likelihood $M_c$ estimator that models the whole magnitude range — GR above $M_c$ and a detection (incompleteness) function below it — yielding $M_c$ with an uncertainty estimate. Used as a cross-check on MAXC/GFT. Woessner & Wiemer (2005), doi:10.1785/0120040007.

Short-term aftershock incompleteness

Immediately after a large mainshock — exactly the highest-stakes, highest-traffic moment — the catalog is grossly incomplete: $M_c$ spikes for hours to days as small events are buried in coda. A naive fit then underestimates productivity precisely when a large aftershock is most likely. The real-time update therefore uses a time-dependent completeness $M_c(t)$ (incompleteness-aware / detrended-likelihood ETAS) rather than a flat threshold.

Moment magnitude ($M_w$) and homogenization

$M_w$ is the non-saturating magnitude derived from seismic moment. Catalogs mix magnitude types (ML, mb, Ms, Md, Mw) with different saturation and physics, so the pipeline homogenizes all magnitudes to $M_w$-equivalent via total-least-squares (TLS) regression anchored on the ISC-GEM / GCMT overlap, keeping the native magType field. A wrong conversion shifts the entire GR distribution and every rate forecast.

Maximum magnitude ($M_{\max}$)

The upper magnitude bound that truncates the exceedance integral and sets the tail probability of the rare, high-impact events that dominate risk. It is specified per region from regional hazard models and carried as an explicit, documented assumption with sensitivity reported — never left undefined.

Exceedance probability

The single public number: the probability of at least one event above a target magnitude $M^*$ in a region over a horizon. For a non-homogeneous Poisson process with expected count $N_{\ge M^*}$, $$P(\ge 1\text{ event} \ge M^*) = 1 - e^{-N_{\ge M^*}}, \qquad N_{\ge M^*} = \iint \lambda(t,x,y\mid\mathcal H_t)\,\Phi(M^*)\,dx\,dy\,dt, \quad \Phi(M^*) = 10^{-b(M^* - M_c)}.$$ The formula $P = 1 - e^{-N}$ never changes; only the quality of $\lambda$ improves.
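A worked sketch of the formula, under the simplifying assumption that $b$ and $M_c$ are constant over the region and horizon so that the GR factor comes out of the integral. The argument `n_above_mc` stands for the integrated intensity (the expected number of events at or above $M_c$); the function name is hypothetical.

```python
import math

def exceedance_probability(n_above_mc, b, m_target, mc):
    """P(at least one event >= m_target) from the expected count >= Mc."""
    gr_fraction = 10.0 ** (-b * (m_target - mc))   # fraction of events >= m_target
    n_target = n_above_mc * gr_fraction            # expected count >= m_target
    # -expm1(-N) equals 1 - exp(-N) but stays accurate when N is tiny.
    return -math.expm1(-n_target)
```

For example, 10 expected events above $M_c = 3$ with $b = 1$ give an expected 0.1 events above $M^* = 5$, and hence a probability a little under 10 %.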


Point processes and triggering

Point process

A stochastic model for events (points) scattered in time (and space and magnitude). A marked spatio-temporal point process attaches a magnitude (the "mark") to each space–time point. All models in this project are special cases of a marked point process.

Conditional intensity $\lambda(t,x,y\mid\mathcal H_t)$

The instantaneous expected rate of events at time $t$, location $(x,y)$, given the entire observed history $\mathcal H_t$ up to $t$. It is the central object: the conditional intensity is the model. ETAS, Reasenberg–Jones, STEP and neural TPPs are all different functional forms of $\lambda$. The point-process log-likelihood maximized when fitting is $$\ln L = \sum_i \ln \lambda(t_i,x_i,y_i\mid\mathcal H_{t_i}) - \int_0^T\!\!\int_A \lambda(t,x,y\mid\mathcal H_t)\,dx\,dy\,dt.$$

History $\mathcal H_t$

The set of all events (times, locations, magnitudes) observed strictly before $t$. The forecast clock guarantees the model only ever sees $\mathcal H_t = \lbrace\text{events in } (-\infty, t)\rbrace$.

Hawkes process (self-exciting process)

A point process in which each event increases the intensity of future events — a stationary background plus the summed, decaying contribution of all past events. ETAS is the seismology-specific Hawkes process. Hawkes (1971); for ETAS see Ogata (1988).

Omori–Utsu law

The empirical power-law decay of aftershock rate after a mainshock: $$n(t) = \frac{K}{(t + c)^{p}}, \qquad p \approx 1,$$ with productivity $K$, time offset $c$ (hours–day), and decay exponent $p$. Its cumulative form (for $p \ne 1$) integrates analytically. Utsu, T., Ogata, Y., & Matsu'ura, R. S. (1995); Ogata, Y. (1988), JASA 83(401), 9–27, doi:10.1080/01621459.1988.10478560.
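The cumulative form mentioned above follows from integrating $n(t)$ between two times. A minimal sketch, with the $p = 1$ limit handled separately; the function name `omori_count` is illustrative.

```python
import math

def omori_count(K, c, p, t1, t2):
    """Expected number of aftershocks between t1 and t2 (same time unit as c)."""
    if t2 < t1 or t1 < 0.0:
        raise ValueError("require 0 <= t1 <= t2")
    if abs(p - 1.0) < 1e-9:
        # Limit p -> 1: the integral of K / (t + c) is logarithmic.
        return K * math.log((t2 + c) / (t1 + c))
    # General case: K * [(t1 + c)^(1-p) - (t2 + c)^(1-p)] / (p - 1)
    return K * ((t1 + c) ** (1.0 - p) - (t2 + c) ** (1.0 - p)) / (p - 1.0)
```

The two branches agree in the limit, so values of $p$ very close to 1 do not produce a discontinuity in the expected count.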

Utsu productivity

The scaling of the number of direct offspring with mainshock magnitude, $k(m) = K\,e^{\alpha(m - M_0)}$, where $\alpha$ controls how strongly larger events trigger more aftershocks and $M_0$ is a reference magnitude.

Branching ratio ($n$)

The expected number of direct offspring (triggered events) per event in an ETAS / Hawkes process. It controls stability — see subcriticality.

Subcriticality / stationarity gate

ETAS stability requires two logically separate conditions. (1) Finite branching: the productivity × magnitude integral converges only if $\alpha < \beta$ (with $\beta = b\ln 10$); if $\alpha \ge \beta$ it diverges. (2) Subcriticality: given $\alpha < \beta$, the branching ratio must satisfy $n < 1$. A fit with $n \ge 1$ is supercritical (explosive, non-stationary) and signals a mis-fit; it is rejected.
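The gate can be sketched numerically. The closed form below is an assumption-laden illustration: it uses the time kernel $(1 + t/c)^{-p}$ from the ETAS entry (which integrates to $c/(p-1)$ for $p > 1$), an unbounded GR law above $M_0 = M_c$, and a spatial kernel that integrates to one, giving $n = K \cdot \frac{c}{p-1} \cdot \frac{\beta}{\beta - \alpha}$. A truncated magnitude range or a finite time horizon would change the expression. Function names are hypothetical.

```python
import math

def branching_ratio(K, alpha, c, p, b):
    """Branching ratio n under the stated simplifying assumptions."""
    beta = b * math.log(10.0)
    if alpha >= beta:
        return math.inf      # condition (1) fails: magnitude integral diverges
    if p <= 1.0:
        return math.inf      # time kernel not integrable over an infinite horizon
    time_integral = c / (p - 1.0)
    magnitude_factor = beta / (beta - alpha)
    return K * time_integral * magnitude_factor

def passes_stationarity_gate(K, alpha, c, p, b):
    """True only if both conditions hold, i.e. n is finite and below 1."""
    return branching_ratio(K, alpha, c, p, b) < 1.0
```

Checking `alpha >= beta` first keeps the two conditions separate, mirroring the order in which they are stated above.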

Background rate $\mu(x,y)$

The stationary, time-independent part of the intensity — where earthquakes occur independently of recent triggering. Estimated by smoothing a declustered catalog (see smoothed seismicity), it both feeds ETAS and serves as the floor for cold-start cells.


Classical and operational models

ETAS (Epidemic-Type Aftershock Sequence)

The de-facto operational baseline: a self-exciting Hawkes point process combining a stationary background, Utsu productivity, the Omori–Utsu time kernel, and a spatial kernel into one conditional intensity, $$\lambda(t,x,y\mid\mathcal H_t) = \mu(x,y) + \sum_{i:\,t_i<t} K\,e^{\alpha(M_i-M_0)}\, \big(1 + \tfrac{t-t_i}{c}\big)^{-p}\, f(x-x_i, y-y_i\mid M_i).$$ Any candidate model must beat a well-fit ETAS in prospective CSEP testing before it can claim forecasting skill. Fit by MLE (or Bayesian / INLAbru) on the full, un-declustered catalog. Ogata, Y. (1988), JASA 83(401), 9–27, doi:10.1080/01621459.1988.10478560; Ogata, Y. (1998), Ann. Inst. Statist. Math. 50(2), 379–402, doi:10.1023/A:1003403601725.
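To make the structure of the intensity concrete, here is a purely temporal sketch: the spatial kernel is dropped and the background is a scalar. This is a simplification for illustration, not the space–time model itself; the function name and argument layout are assumptions.

```python
import math

def etas_temporal_intensity(t, history, mu, K, alpha, c, p, m0):
    """Temporal ETAS rate at time t.

    history : iterable of (t_i, M_i) pairs
    Only events strictly before t contribute, matching the sum over t_i < t.
    """
    rate = mu
    for t_i, m_i in history:
        if t_i < t:
            productivity = K * math.exp(alpha * (m_i - m0))
            decay = (1.0 + (t - t_i) / c) ** (-p)
            rate += productivity * decay
    return rate
```

Evaluating at exactly the time of an event returns the pre-event rate, since that event is not yet part of the history.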

Ogata-1998 spatial kernel

The inverse-power spatial decay used in space–time ETAS, $$f(x,y\mid M_i) = \frac{q-1}{\pi\,\zeta^2}\left(1 + \frac{r^2}{\zeta^2}\right)^{-q}, \qquad \zeta = D\,e^{\gamma(M_i - M_0)},$$ with magnitude-dependent bandwidth via $D$ and $\gamma$, decay $q$, and $r$ the distance to the parent event. Ogata (1998), doi:10.1023/A:1003403601725.

Reasenberg–Jones (R-J)

The most transparent operational aftershock model: a GR magnitude term times a modified-Omori time decay, $$\lambda(t,M) = \frac{10^{\,a + b(M_m - M)}}{(t+c)^{p}}, \qquad N = \int \lambda\,dt, \qquad P(\ge 1) = 1 - e^{-N}.$$ It is the transparent fallback / sanity check alongside ETAS; the USGS OAF runs both. Reasenberg, P. A., & Jones, L. M. (1989), Science 243(4895), 1173–1176, doi:10.1126/science.243.4895.1173; global tectonic-regime extension Page, M. T., et al. (2016), BSSA 106(5), 2290–2301, doi:10.1785/0120160073.
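The three expressions above chain together directly. In this sketch $\lambda(t, M)$ is read as the rate of aftershocks at or above magnitude $M$, with $M_m$ the mainshock magnitude; parameter values must come from a regional fit and none are hard-coded here. Function names are illustrative.

```python
import math

def rj_expected_count(a, b, c, p, m_main, m_min, t1, t2):
    """Expected aftershocks >= m_min between t1 and t2 (days after mainshock)."""
    rate_factor = 10.0 ** (a + b * (m_main - m_min))
    if abs(p - 1.0) < 1e-9:
        time_integral = math.log((t2 + c) / (t1 + c))
    else:
        time_integral = ((t1 + c) ** (1.0 - p) - (t2 + c) ** (1.0 - p)) / (p - 1.0)
    return rate_factor * time_integral

def rj_probability(a, b, c, p, m_main, m_min, t1, t2):
    """P(at least one aftershock >= m_min in [t1, t2])."""
    n = rj_expected_count(a, b, c, p, m_main, m_min, t1, t2)
    return -math.expm1(-n)   # 1 - exp(-N), stable for small N
```

Because the only inputs are four parameters and the mainshock magnitude, every published number can be reproduced by hand, which is what makes the model useful as a sanity check.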

STEP (Short-Term Earthquake Probability)

The production reference for the output shape: it wraps Reasenberg–Jones clustering plus a background term into gridded shaking-probability maps — i.e. a "one-inference-per-short-interval" probabilistic regional map, exactly the form of this product. Gerstenberger, M. C., Wiemer, S., Jones, L. M., & Reasenberg, P. A. (2005), Nature 435, 328–331, doi:10.1038/nature03622.

USGS OAF (Operational Aftershock Forecast)

The USGS scheduled cloud service that monitors ComCat and issues calibrated aftershock probabilities (running both Reasenberg–Jones and ETAS), with the first forecast roughly 30 minutes after a significant event. A real, honest operational analogue. Page et al. (2016), doi:10.1785/0120160073.

OEF-Italy

INGV's operational forecasting system, an ensemble of three distinct models — ETAS + ETES + STEP — running as a true prospective service for ~10 years. Its validation found it broadly reliable, with a documented underestimation during the 2016–2017 Central Italy (Amatrice–Norcia) sequence caused by post-mainshock catalog incompleteness. Spassiani, I., Falcone, G., Murru, M., & Marzocchi, W. (2023), GJI 234(3), 2501–2518.

EEPAS (Every Earthquake a Precursor According to Scale)

A medium-term (months–years) precursory-scaling model: each event of precursor magnitude $M_p$ contributes a rate density that is a product of magnitude, time, and location densities governed by scaling relations. It sits outside the short-term primary window — a feature/context source, not a core model. Its published density constants contain known typos across papers, so any use must pin to a reference implementation (pyCSEP / floatCSEP). Rhoades, D. A., & Evison, F. F. (2004), Pure Appl. Geophys. 161, 47–72, doi:10.1007/s00024-003-2434-9.

Smoothed seismicity

A stationary, time-independent estimate of where earthquakes occur, obtained by smoothing a declustered catalog with an adaptive kernel (the Helmstetter–Kagan–Jackson adaptive power-law kernel, bandwidth set by the distance to the $n$-th nearest neighbour, $n \sim 6$). It serves two roles: (a) the spatial background $\mu(x,y)$ feeding ETAS, and (b) the mandatory stationary Poisson null that any time-dependent model must beat. The exact kernel exponent and normalization are pinned to a reference implementation, not hard-coded. Helmstetter, A., Kagan, Y. Y., & Jackson, D. D. (2007), SRL 78(1), 78–86, doi:10.1785/gssrl.78.1.78.

BPT / renewal model (Brownian Passage Time)

A long-term, time-dependent recurrence model used where paleoseismic data constrain a fault's mean recurrence interval. The inter-event-time density is inverse-Gaussian, $$f(t;\mu,\alpha) = \sqrt{\frac{\mu}{2\pi\,\alpha^2 t^3}}\;\exp\!\left(-\frac{(t-\mu)^2}{2\,\mu\,\alpha^2 t}\right),$$ with mean recurrence $\mu$ and aperiodicity $\alpha$ (coefficient of variation). With few observed cycles $\alpha$ is poorly constrained and the gain over plain Poisson is marginal; no renewal skill is claimed where data do not support it. Matthews, M. V., Ellsworth, W. L., & Reasenberg, P. A. (2002), BSSA 92, 2233–2250, doi:10.1785/0120010267.
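In practice the density is used through the conditional probability of rupture in a window, given the time elapsed since the last event. The sketch below uses the closed-form inverse-Gaussian CDF, $F(t) = \Phi(u_1) + e^{2/\alpha^2}\,\Phi(-u_2)$ with $u_{1,2} = \big(\sqrt{t/\mu} \mp \sqrt{\mu/t}\big)/\alpha$. It assumes moderate aperiodicity (very small $\alpha$ overflows the exponential); function names are hypothetical.

```python
import math

def _std_normal_cdf(x):
    # erfc keeps precision in the lower tail, where erf would cancel.
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def bpt_cdf(t, mu, alpha):
    """Probability that the recurrence interval is <= t."""
    if t <= 0.0:
        return 0.0
    u1 = (math.sqrt(t / mu) - math.sqrt(mu / t)) / alpha
    u2 = (math.sqrt(t / mu) + math.sqrt(mu / t)) / alpha
    return _std_normal_cdf(u1) + math.exp(2.0 / alpha ** 2) * _std_normal_cdf(-u2)

def bpt_conditional_probability(elapsed, window, mu, alpha):
    """P(event in (elapsed, elapsed + window] | no event up to elapsed)."""
    survival = 1.0 - bpt_cdf(elapsed, mu, alpha)
    if survival <= 0.0:
        return 1.0
    gain = bpt_cdf(elapsed + window, mu, alpha) - bpt_cdf(elapsed, mu, alpha)
    return gain / survival
```

Comparing this value with the Poisson probability $1 - e^{-\text{window}/\mu}$ shows how much (or how little) the renewal assumption changes the answer for a given $\alpha$.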

Coulomb stress change ($\Delta\mathrm{CFS}$)

The change in stress driving a fault toward or away from failure from a nearby slip event: $$\Delta\mathrm{CFS} = \Delta\tau - \mu'\,\Delta\sigma_n,$$ where $\Delta\tau$ is the shear-stress change, $\Delta\sigma_n$ the normal-stress change, and $\mu'$ the effective friction. Coulomb "lobes" promote or suppress triggering. King, G. C. P., Stein, R. S., & Lin, J. (1994), BSSA 84, 935–953.

Rate-and-state friction

Dieterich's constitutive theory predicting how the seismicity rate responds to a stress step, deriving the Omori-like $1/t$ decay from first principles. Immediately after a Coulomb stress step the rate $R$ jumps exponentially relative to the reference rate $r$, $R/r = \exp(\Delta\mathrm{CFS}/(A\sigma))$, and then relaxes over the characteristic aftershock duration $t_a = A\sigma/\dot\tau_r$, where $A$ is the direct-effect friction parameter, $\sigma$ the effective normal stress, and $\dot\tau_r$ the reference stressing rate. It provides the functional shape for stress-based covariates (including tides). Dieterich, J. (1994), JGR 99(B2), 2601–2618, doi:10.1029/93JB02581.

Nucleation duration

The time a rupture takes to nucleate. The decisive argument for why daily tides barely trigger ordinary earthquakes: if nucleation ($\sim$ years on crustal faults, $\propto A\sigma/\dot\tau$) far exceeds the forcing period, the oscillating stress averages out. Beeler, N. M., & Lockner, D. A. (2003), JGR 108(B8), 2391, doi:10.1029/2001JB001518.


Machine-learning approaches

Temporal point process (TPP)

The unifying language: a model of an event stream defined by its conditional intensity $\lambda(t\mid\mathcal H_t)$. ETAS is the parametric, physics-informed member; neural TPPs are the flexible, learned members. Fit by maximizing the point-process log-likelihood; scored by the same likelihood on held-out, time-causal data.

Neural temporal point process (NTPP / neural TPP)

A TPP whose intensity (or its components) is parameterized by a neural network (RNN, LSTM, attention/transformer), learning event dynamics from data. Foundational architectures (validated on non-seismic streams; their log-likelihood wins do not automatically transfer to seismicity) include RMTPP (Du et al., 2016, doi:10.1145/2939672.2939875), the Neural Hawkes Process (Mei & Eisner, 2017, arXiv:1612.09328), Self-Attentive Hawkes (Zhang et al., 2020, arXiv:1907.07561), and the Transformer Hawkes Process (Zuo et al., 2020, PMLR v119).

Hawkes inductive bias

A design principle for the project's gated neural challenger: keep the additive background + summed-triggering skeleton of a Hawkes/ETAS process, but replace the fixed kernels with small MLPs / attention, and model magnitude explicitly. This anchors the network to known physics rather than learning triggering from scratch (the FERN spirit).

RECAST

A GRU-based (recurrent) encoder–decoder neural TPP for earthquakes. It improves on temporal ETAS only when the training catalog is large ($\gtrsim 10^4$ events); on smaller catalogs it merely matches ETAS. Dascher-Cousineau, K., et al. (2023), GRL 50, e2023GL103909, doi:10.1029/2023GL103909.

FERN

An ETAS-generalizing neural encoder (MLPs replace the fixed kernels). The FERN+ variant (which ingests sub-$M_c$ events) reports a 4–12 % information-gain-per-earthquake improvement and learns fault-aligned anisotropy ~1000× faster than ETAS. Crucially, the authors note it was not CSEP-tested, provided no uncertainty quantification, and its test period ended before the 2011 Tohoku $M_w$ 9.0. The gain came from two real ETAS gaps (sub-$M_c$ events + spatial flexibility), not network depth. Zlydenko, O., et al. (2023), Sci. Rep. 13, doi:10.1038/s41598-023-38033-9.

EarthquakeNPP

The decisive 2026 benchmark: five modern neural point processes (NSTPP, DeepSTPP, AutoSTPP, DSTPP, SMASH) on California 1971–2021 with strict chronological splits and CSEP consistency tests. None outperformed ETAS — the gap is largest on the spatial test (best NPP ≈ 68.6 % vs ETAS ≈ 92.0 % pass rate), exactly where forecasting value lives. It also repaired a data-leakage flaw in earlier work (non-chronological splits + excluding the Tohoku sequence inflate metrics). The honest conclusion: no pure ML / NPP has robustly beaten ETAS in prospective CSEP as of 2026. Stockman, S., Lawson, D., & Werner, M. J. (2026), TMLR, arXiv:2410.08226.

The DeVries trap (over-parameterization + wrong metric)

The canonical cautionary tale: a deep net (~13,451 parameters, AUC 0.85 for aftershock spatial pattern) was matched by a 2-parameter logistic regression ("one neuron") on a single physical feature. Root causes — massive over-parameterization vs ~199 effective mainshocks, a per-cell "computer-vision" framing inflating the apparent sample size, and AUC being the wrong metric for rate forecasting. DeVries, P. M. R., et al. (2018), Nature 560, 632–634, doi:10.1038/s41586-018-0438-y; rebuttal Mignan, A., & Broccardo, M. (2019), Nature 575, E1–E3, doi:10.1038/s41586-019-1582-8.

Detection is not forecasting

The hard line between two distinct tasks. ML waveform models (PhaseNet, EQTransformer, PhaseNO, SeisBench, the SeisLM foundation model) are mature/production for phase-picking, detection, association and characterization — they build better, more complete catalogs (lower, more stable $M_c$), which helps both ETAS and neural forecasters. But they do not forecast. Product copy keeps the detection/forecasting line explicit so detection branding never implies prediction. SeisLM: arXiv:2410.15765.

Gated challenger

A non-default model kept behind a feature flag. The neural challenger reaches the public map only if it beats ETAS in the project's own prospective CSEP harness (positive IGPE, T-test CI excluding zero) and passes calibration (a release blocker). Otherwise ETAS remains what ships.


Declustering and the dual catalog

Declustering

Separating "independent" mainshocks from triggered foreshocks/aftershocks. Methods: Gardner–Knopoff space–time windowing (transparent cross-check) and Zaliapin–Ben-Zion nearest-neighbour (primary).

Dual-catalog rule

The most common pipeline mistake, made explicit: feed the declustered catalog only to the stationary Poisson / smoothed-seismicity background $\mu(x,y)$; feed the full, un-declustered catalog to the conditional / ETAS model, because triggering is the predictable signal. Declustering the conditional input would throw away exactly what the model forecasts.

Gardner–Knopoff windows

A windowing declustering method with magnitude-dependent space and time windows. OpenQuake hmtk coefficients: $L(M)=10^{0.1238M+0.983}$ km; $T(M)=10^{0.032M+2.7389}$ d for $M\ge6.5$, else $10^{0.5409M-0.547}$ d. Used as a transparent cross-check for the background.
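The window coefficients quoted above translate directly into a small helper. This is only the window lookup, not a full declustering pass; the function name `gk_windows` is illustrative.

```python
def gk_windows(m):
    """Gardner-Knopoff space (km) and time (days) windows for magnitude m."""
    dist_km = 10.0 ** (0.1238 * m + 0.983)
    if m >= 6.5:
        time_days = 10.0 ** (0.032 * m + 2.7389)
    else:
        time_days = 10.0 ** (0.5409 * m - 0.547)
    return dist_km, time_days
```

With these coefficients an $M$ 5 event gets a window of roughly 40 km and 144 days, while an $M$ 7 event gets a time window of roughly 918 days.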

Zaliapin–Ben-Zion nearest-neighbour (NND)

A declustering and feature method based on the Baiesi–Paczuski nearest-neighbour proximity $$\eta_{ij} = t_{ij}\,(r_{ij})^{d_f}\,10^{-b\,m_i},$$ decomposed into rescaled time $T_j = t_{ij}\,10^{-q b m_i}$ and rescaled distance $R_j = (r_{ij})^{d_f}\,10^{-(1-q)b m_i}$ ($q\approx0.5$). $\log_{10}\eta$ is bimodal (a background mode and a clustered mode), which separates independent events from triggered ones; $\eta$, $T$, $R$ are computed as ML features, not just keep/drop labels. Zaliapin, I., & Ben-Zion, Y. (2020), JGR Solid Earth 125, e2018JB017120, doi:10.1029/2018JB017120; Zaliapin et al. (2008), PRL 101, 018501, doi:10.1103/PhysRevLett.101.018501.
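A brute-force sketch of the proximity search, for illustration only. It assumes planar coordinates in km (real catalogs need great-circle or 3-D distances), a fractal dimension `d_f = 1.6` chosen as a placeholder, and a small distance floor to avoid zero proximities; all names are hypothetical, and the quadratic loop is not suitable for large catalogs.

```python
import math

def nearest_neighbour(events, b=1.0, d_f=1.6, q=0.5):
    """For each event j, find the earlier event i minimizing eta_ij.

    events : list of (t_days, x_km, y_km, magnitude), sorted by time
    Returns a list of (parent_index, eta, T, R); events with no earlier
    candidate get (None, inf, inf, inf).
    """
    results = []
    for j, (tj, xj, yj, _mj) in enumerate(events):
        best = (None, math.inf, math.inf, math.inf)
        for i in range(j):
            ti, xi, yi, mi = events[i]
            dt = tj - ti
            if dt <= 0.0:
                continue  # only strictly earlier events can be parents
            r = max(math.hypot(xj - xi, yj - yi), 1e-3)  # distance floor (assumption)
            T = dt * 10.0 ** (-q * b * mi)
            R = r ** d_f * 10.0 ** (-(1.0 - q) * b * mi)
            eta = T * R  # equals dt * r**d_f * 10**(-b * mi)
            if eta < best[1]:
                best = (i, eta, T, R)
        results.append(best)
    return results
```

Returning $T$ and $R$ alongside $\eta$ keeps them available as features rather than collapsing everything into a keep/drop label.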


Evaluation: CSEP, tests, and scoring

CSEP (Collaboratory for the Study of Earthquake Predictability)

The community-endorsed framework (grown out of RELM, California) for prospective earthquake forecast testing. Using its standard tests — not bespoke metrics — is what makes the product defensible. Schorlemmer, D., et al. (2007), SRL 78(1), 17–29, doi:10.1785/gssrl.78.1.17.

pyCSEP

The community Python toolkit implementing every CSEP test (catalog access, both forecast representations, the grid tests and catalog tests). Using it means reviewers can dispute the model, not the test code. Savran, W. H., et al. (2022), SRL 93(5), 2858–2870, doi:10.1785/0220220033; docs.cseptesting.org.

Gridded-rate forecast

One forecast representation: a Poisson expected count per space–magnitude cell, evaluated with the Poisson consistency tests. Directly comparable to the published CSEP California daily benchmark.

Catalog-based forecast

The other representation: an ensemble of $\ge 10{,}000$ synthetic catalogs per day, evaluated with empirical, non-Poisson tests. This is the correct primary path for a clustered daily forecast because the Poisson grid tests over-reject during sequences. Savran, W. H., et al. (2020), BSSA 110(4), 1799–1817, doi:10.1785/0120200026.

Over-dispersion

Regional seismicity has variance $\gg$ mean (because of clustering) — it is over-dispersed relative to Poisson. Consequence: Poisson grid tests over-reject during aftershock sequences, and the pessimistic uncertainty bound must be wider than a naive Poisson interval (negative-binomial behaviour). Kagan, Y. Y. (2017), GJI 211(1), 335–345, doi:10.1093/gji/ggx300.

Consistency test

A test of whether one model is calibrated (its forecast is statistically consistent with what happened). Necessary but not sufficient for skill. The CSEP consistency tests are N / M / S / L / CL, all built on the Poisson joint log-likelihood $L(\Omega\mid\Lambda) = \sum_i [-\lambda_i + \omega_i\ln\lambda_i - \ln(\omega_i!)]$.

N-test (number)

Tests whether the total forecast count matches the observed count, via Poisson quantile scores $\delta_1 = 1 - F(N_{obs}-1\mid N_{fore})$ (small ⇒ observed too many, forecast too low) and $\delta_2 = F(N_{obs}\mid N_{fore})$ (small ⇒ too few), with $F$ the Poisson CDF. Zechar, J. D., Gerstenberger, M. C., & Rhoades, D. A. (2010), BSSA 100(3), 1184–1195, doi:10.1785/0120090192.
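The two quantile scores can be computed with nothing more than a Poisson CDF. This standard-library sketch is for illustration; an evaluation that needs to be defensible should call pyCSEP's own implementation. Function names here are hypothetical.

```python
import math

def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean)."""
    if k < 0:
        return 0.0
    if mean <= 0.0:
        return 1.0
    # Work in log space so large counts do not overflow the factorial.
    terms = (math.exp(i * math.log(mean) - mean - math.lgamma(i + 1))
             for i in range(k + 1))
    return min(1.0, math.fsum(terms))

def n_test(n_obs, n_fore):
    """Return (delta1, delta2) for an observed and a forecast count."""
    delta1 = 1.0 - poisson_cdf(n_obs - 1, n_fore)  # P(X >= n_obs)
    delta2 = poisson_cdf(n_obs, n_fore)            # P(X <= n_obs)
    return delta1, delta2
```

A small `delta1` flags an under-forecast and a small `delta2` an over-forecast; the forecast is usually rejected when either falls below half the chosen significance level.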

M-test (magnitude)

Tests whether the forecast magnitude distribution (the GR shape) matches the observed magnitudes; quantile $\kappa$. Zechar et al. (2010), doi:10.1785/0120090192.

S-test (spatial)

Tests whether the forecast spatial distribution matches where events actually occurred; quantile $\zeta$. This is exactly the dimension where neural models most often fail relative to ETAS.

L-test and CL-test (likelihood / conditional likelihood)

The L-test scores the joint pseudo-likelihood (quantile $\gamma$). The CL-test conditions on the observed number $N_{obs}$ and is preferred over the raw L-test, which correlates strongly with the N-test. The L-test is not deprecated in current pyCSEP, but the conditional variant and Poisson/negative-binomial number-test variants are preferred. Werner, M. J., et al. (2011), BSSA 101(4), 1630–1648, doi:10.1785/0120090340.

Comparison test

A test of whether model A is better than model B — the only thing that establishes skill. Skill is claimed only by winning a comparison test against a real baseline (smoothed-seismicity and ETAS) with a confidence interval excluding zero.

Information gain per earthquake (IGPE)

The comparison metric, in nats: $$I_N(A,B) = \frac{1}{N}\sum_{i=1}^{N}\big(\ln\lambda_A(k_i) - \ln\lambda_B(k_i)\big) - \frac{\hat N_A - \hat N_B}{N}.$$ It is state-dependent — positive and large during active sequences (gains up to orders of magnitude on peak days), near zero in quiet periods, with a modest all-period average. For scale, time-independent California CSEP contrasts give only about −0.7 to +0.5 nats; there is no stable "2–3" steady-state value. Rhoades, D. A., et al. (2011), Acta Geophysica 59(4), 728–747, doi:10.2478/s11600-011-0013-5.

Nats

Natural-logarithm units of information — the CSEP convention for information gain (a gain in $\ln$ space). All gains in this project are reported in nats, never bits ($\log_2$).

T-test and W-test

The paired tests on per-earthquake information gain. The T-test is the paired Student-t, $T = I_N(A,B)/(s/\sqrt N) \sim t_{N-1}$; the W-test is the non-parametric Wilcoxon signed-rank companion. Skill is claimed only when IGPE is positive with a T-test CI excluding zero, corroborated by the W-test. Rhoades et al. (2011), doi:10.2478/s11600-011-0013-5.
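A minimal sketch of the IGPE and its T statistic, following the formulas in this glossary. Inputs are the two models' forecast rates in the cells where the observed events fell, plus each model's total expected count; the function name is illustrative, and the confidence interval (which needs a Student-t quantile) is left to a statistics library.

```python
import math

def igpe_and_t(rates_a, rates_b, n_hat_a, n_hat_b):
    """Return (IGPE in nats, T statistic, degrees of freedom)."""
    n = len(rates_a)
    if n != len(rates_b) or n < 2:
        raise ValueError("need two equal-length rate lists with N >= 2")
    # Per-earthquake log-rate differences.
    x = [math.log(a) - math.log(b) for a, b in zip(rates_a, rates_b)]
    mean_x = sum(x) / n
    igpe = mean_x - (n_hat_a - n_hat_b) / n
    # Sample variance of the per-event differences.
    s2 = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    if s2 == 0.0:
        raise ValueError("zero variance: T statistic undefined")
    t_stat = igpe / math.sqrt(s2 / n)
    return igpe, t_stat, n - 1
```

Natural logarithms are used throughout, so the returned gain is in nats, consistent with the convention stated above.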

Pseudo-likelihood

The catalog-based analogue of the L-test statistic, $\hat L_{obs} = \sum_{i=1}^{N_{obs}} \log\hat\lambda_s(k_i) - \bar N$, evaluated against the empirical distribution from the synthetic-catalog ensemble (the over-dispersion-honest path).

Proper scoring rule

A scoring rule that is optimized (in expectation) only by the true probability — it cannot be gamed by miscalibration. The project reports the logarithmic score $\mathrm{LogS}(p,y) = -\ln p(y)$, the Brier score, and CRPS. Gneiting, T., & Raftery, A. E. (2007), JASA 102(477), 359–378, doi:10.1198/016214506000001437.

Brier score (BS)

A proper score for the bounded binary exceedance output, $\mathrm{BS} = \frac{1}{T}\sum_t (p_t - y_t)^2$. It decomposes into reliability − resolution + uncertainty (Murphy 1973), separating calibration from discrimination. Originates with Brier, G. W. (1950).
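The decomposition can be checked numerically. This sketch groups forecasts by their exact probability value, in which case reliability − resolution + uncertainty reproduces the Brier score exactly; with coarser binning the identity is only approximate. Function names are illustrative.

```python
from collections import defaultdict

def brier_score(p, y):
    """Mean squared difference between forecast probabilities and outcomes."""
    return sum((pi - yi) ** 2 for pi, yi in zip(p, y)) / len(p)

def murphy_decomposition(p, y):
    """Return (reliability, resolution, uncertainty), grouping by exact p."""
    n = len(p)
    base_rate = sum(y) / n
    groups = defaultdict(list)
    for pi, yi in zip(p, y):
        groups[pi].append(yi)
    reliability = sum(len(g) * (pk - sum(g) / len(g)) ** 2
                      for pk, g in groups.items()) / n
    resolution = sum(len(g) * (sum(g) / len(g) - base_rate) ** 2
                     for g in groups.values()) / n
    uncertainty = base_rate * (1.0 - base_rate)
    return reliability, resolution, uncertainty
```

Reliability is the term recalibration can reduce; resolution depends on how well the model separates active from quiet periods and cannot be fixed after the fact.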

CRPS (Continuous Ranked Probability Score)

A proper score for a full predictive distribution, $\mathrm{CRPS} = \int \big(F(x) - \mathbf 1\lbrace x \ge y\rbrace\big)^2\,dx$, where $F$ is the forecast CDF. Suited to forecasting the whole magnitude/count distribution rather than a single binary.
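For an integer-valued forecast such as an event count, the CRPS integral reduces to a sum over unit intervals. A minimal sketch, assuming the forecast is given as a probability mass function over counts 0, 1, 2, ... that sums to one; the function name is hypothetical.

```python
def crps_counts(pmf, y):
    """CRPS of a count forecast.

    pmf : pmf[k] is the forecast probability of exactly k events
    y   : observed (integer) count
    """
    cdf = 0.0
    score = 0.0
    for k, pk in enumerate(pmf):
        cdf += pk
        indicator = 1.0 if k >= y else 0.0
        score += (cdf - indicator) ** 2   # contribution of the interval [k, k+1)
    # If y lies beyond the listed support, the CDF stays at its final value
    # while the indicator is still 0 on each remaining unit interval.
    score += max(0, y - len(pmf)) * cdf ** 2
    return score
```

For a forecast that puts all its mass on one count, the score equals the absolute error, which is a convenient sanity check.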

Molchan diagram

An alarm-style / ROC view (a communication aid, never a primary skill metric): miss rate $\nu = 1 - H$ plotted against alarm fraction $\tau$ (fraction of space–time placed under alarm). A perfect model hugs the axes; a random one sits on the anti-diagonal. Zechar, J. D., & Jordan, T. H. (2008), GJI 172(2), 715–724, doi:10.1111/j.1365-246X.2007.03676.x.

Area Skill Score (ASS)

The normalized area above the Molchan trajectory: 1 = perfect, 0.5 = random, 0 = perfectly unskilled. Zechar, J. D., & Jordan, T. H. (2010), PAGEOPH 167, 893–906.

ROC / AUC (and why AUC is banned as a primary metric)

ROC plots hit rate against false-alarm rate; AUC is the area under it. AUC is banned as a primary forecasting metric: it is invariant to monotone rescaling, hence blind to the calibration of the very probabilities a forecast publishes, and on a rare per-cell-per-day task it degenerates into a region classifier (the DeVries trap). Shown only as a communication aid.

Testing-mode hierarchy

The credibility ordering: true prospective (gold) > pseudo-prospective (the primary back-analysis mode) > retrospective out-of-sample (weak) > in-sample (fit diagnostic only). Mizrahi, L., et al. (2024), Reviews of Geophysics 62, doi:10.1029/2023RG000823.

Pseudo-prospective testing

Back-analysis that mimics a true prospective run via the forecast clock: at each daily issue time the model is handed only the catalog slice $(-\infty, t)$, the forecast is sealed, then the clock advances — making temporal leakage structurally impossible.

Forecast clock

The strict driver enforcing pseudo-prospective discipline: features come only from $(-\infty, t)$, the label is the window $[t, t+H)$, the forecast is sealed at $t$, then the clock advances. It turns leakage avoidance from a matter of discipline into a structural guarantee.
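The discipline described above can be sketched as a generator that hands out only the permitted slices. This is an illustration of the idea rather than the project's driver: events are assumed to be tuples whose first element is the origin time, and the name `forecast_clock` and its signature are hypothetical.

```python
def forecast_clock(events, start, end, horizon=1.0):
    """Yield (t_issue, history, label) for each issue time in [start, end).

    history : events strictly before t_issue      -> features
    label   : events in [t_issue, t_issue + H)    -> scoring target
    """
    k = 0
    # Step by integer multiples of the horizon to avoid float accumulation.
    while start + k * horizon < end:
        t_issue = start + k * horizon
        history = [e for e in events if e[0] < t_issue]
        label = [e for e in events if t_issue <= e[0] < t_issue + horizon]
        yield t_issue, history, label
        k += 1
```

A model called inside this loop never receives an event from its own label window, so leakage would require bypassing the generator rather than forgetting a filter.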

Leakage (and the five failure modes)

Any way information from the future, or a retroactively improved catalog, contaminates a forecast and inflates its apparent skill. The five engineered-against modes: temporal leakage, catalog-revision leakage, $M_c$-inconsistency leakage, region/parameter snooping, and multiple-testing inflation. Mizrahi et al. (2024), doi:10.1029/2023RG000823.

Input-state snapshot

The immutable, versioned record — for each daily issue — of the exact catalog state, the $M_c$ grid, the declustering choice, the model version, and all parameters. It makes any past forecast byte-reproducible and scorable against the catalog as it was at issue time, never a retroactively improved one. Reproducibility here is existential, not a nicety.


Calibration and uncertainty

Calibration

The property that forecast probabilities match observed frequencies ("when we said 5 %, it happened ~5 % of the time"). Calibration is a release blocker — an uncalibrated probability does not ship. The public probability is recalibrated (isotonic / Platt) and validated per horizon.

Reliability diagram

A plot of forecast probability against observed frequency; the diagonal is perfect calibration. It is the single most credibility-building artifact the product ships, and is validated specifically in the cold-start / quiet regime, which dominates the diagram.

Isotonic / Platt recalibration

Post-hoc monotone recalibration maps (isotonic regression / logistic Platt scaling) that adjust raw model probabilities so the reliability diagram lands on the diagonal.
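As an illustration of the isotonic option only (Platt scaling is not shown), the pool-adjacent-violators algorithm fits a non-decreasing step function from raw probabilities to observed frequencies. This is a bare-bones sketch with hypothetical names; a production pipeline would more likely use an established library implementation and fit on held-out, time-causal data.

```python
def isotonic_fit(p_raw, y):
    """Pool-adjacent-violators fit.

    Returns a list of (upper_raw_probability, calibrated_value) steps,
    sorted by raw probability and non-decreasing in calibrated value.
    """
    blocks = []  # each block: [sum_of_outcomes, count, max_raw_probability]
    for p, yi in sorted(zip(p_raw, y)):
        if blocks and blocks[-1][2] == p:
            # Identical raw probabilities must share one calibrated value.
            blocks[-1][0] += yi
            blocks[-1][1] += 1
        else:
            blocks.append([float(yi), 1, p])
        # Merge backwards while the monotonicity constraint is violated.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c, p_max = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
            blocks[-1][2] = p_max
    return [(b[2], b[0] / b[1]) for b in blocks]

def isotonic_apply(model, p):
    """Map a raw probability through the fitted step function."""
    for upper, value in model:
        if p <= upper:
            return value
    return model[-1][1]
```

The fitted map is monotone by construction, so recalibration can change the probabilities but never the ranking of days or cells.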

PIT (Probability Integral Transform)

A calibration diagnostic: under a well-calibrated forecast, the forecast CDF evaluated at the observation is uniformly distributed. Deviations from uniformity reveal miscalibration.
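For count forecasts the CDF is a step function, so the plain PIT cannot be uniform; a common workaround is the randomized PIT, which draws uniformly between the CDF values just below and at the observed count. The sketch below assumes a Poisson count forecast purely for illustration (the function names and the Poisson choice are assumptions, not the project's forecast distribution).

```python
import math
import random


def poisson_cdf(k, lam):
    """P(N <= k) for a Poisson count with mean lam (k < 0 gives 0)."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return min(total, 1.0)


def randomized_pit(observed, lam, rng=random):
    """Randomized PIT for a discrete forecast: uniform on [F(k-1), F(k)]."""
    lower = poisson_cdf(observed - 1, lam)
    upper = poisson_cdf(observed, lam)
    return lower + rng.random() * (upper - lower)
```

Collected over many issues, a histogram of these values should look flat if the forecast is well calibrated; a U shape suggests bounds that are too narrow and a hump suggests bounds that are too wide.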

Epistemic vs. aleatory uncertainty

Epistemic uncertainty is reducible (parameter and model uncertainty); aleatory uncertainty is the irreducible randomness of the process. The published bounds must be a real decomposition of both — not a cosmetic Poisson interval.

Uncertainty bounds (optimistic / expected / pessimistic — P10 / median / P90)

The published triad. The bounds are sourced from ETAS parameter uncertainty (MLE covariance / bootstrap / INLAbru posterior), propagated $M_c$ and $b$ uncertainty, structural / model-selection uncertainty, and the over-dispersion (negative-binomial) correction. A pessimistic bound that is only a Poisson quantile systematically under-warns at the tail. In the user study cited here, the bounds design was empirically the most effective of the designs tested for communicating "surprises." Schneider, M., et al. (2022), NHESS 22(4), 1499–1518, doi:10.5194/nhess-22-1499-2022.
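The under-warning of a Poisson-only pessimistic bound can be seen by comparing P90 quantiles at the same mean. This sketch covers the over-dispersion term only (not the parameter or structural components), and the mean and dispersion values in the usage note are hypothetical.

```python
import math


def poisson_pmf(n, mean):
    """Poisson probability of exactly n events (mean must be positive)."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def negbin_pmf(n, mean, size):
    """Negative binomial with the given mean; variance = mean + mean**2 / size."""
    p = size / (size + mean)
    log_pmf = (math.lgamma(n + size) - math.lgamma(size) - math.lgamma(n + 1)
               + size * math.log(p) + n * math.log1p(-p))
    return math.exp(log_pmf)


def quantile(pmf, q, max_n=100000):
    """Smallest n whose cumulative probability reaches q."""
    total = 0.0
    for n in range(max_n):
        total += pmf(n)
        if total >= q:
            return n
    raise ValueError("quantile not reached within max_n")
```

With an expected count of 5, `quantile(lambda n: poisson_pmf(n, 5.0), 0.9)` gives 8, while the strongly over-dispersed `quantile(lambda n: negbin_pmf(n, 5.0, 1.0), 0.9)` gives 12: same mean, noticeably higher pessimistic bound.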

Cold-start / low-seismicity regime

The dominant regime (most of any map's area-and-time is quiet). The conditional rate floors to the principled smoothed-seismicity background, never an arbitrary per-day constant; strength is borrowed spatially via hierarchical / empirical-Bayes pooling. Three honest, visually distinct UI states: low-but-poorly-constrained (wide bounds), genuinely quiescent (tight bounds near baseline), and no data / out-of-coverage (explicit mask). Blank must never read as "safe."


Tidal triggering and stress

Tidal triggering

The small, regime-dependent modulation of earthquake rate by solid-Earth and ocean tides. Tidal stresses (~0.1–10 kPa) are ~$10^{-3}$–$10^{-4}$ of earthquake stress drops (~1–10 MPa), so tides can only advance/retard a rupture already near failure, never cause one. A real but small correlation (~0.5–1 % global rate excess; up to factor ~2–3 only for shallow ocean-loaded thrusts). Useless as a standalone predictor; encoded as a regularized covariate that may shrink to ~0. Métivier, L., et al. (2009), EPSL 278, 370–375, doi:10.1016/j.epsl.2008.12.024; Cochran, E. S., Vidale, J. E., & Tanaka, S. (2004), Science 306, 1164–1166, doi:10.1126/science.1103961.

Tidal Coulomb Failure Stress (TCFS)

The scalar resolved onto a fault from the tidal stress tensor, $\Delta\mathrm{CFS}(t) = \Delta\tau(t) + \mu_f\,\Delta\sigma_n(t)$ ($\mu_f\approx0.4$ default), used to drive the rate-and-state covariate $R/r = \exp(\Delta\mathrm{CFS}(t)/(A\sigma))$.
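A direct transcription of the two formulas, as a minimal sketch. The function names are hypothetical; the sign convention assumed is that $\Delta\sigma_n$ is positive for unclamping (consistent with the plus sign in the definition), and the $A\sigma$ value in the usage note is illustrative only.

```python
import math


def delta_cfs(d_tau, d_sigma_n, mu_f=0.4):
    """Tidal Coulomb stress: d_tau + mu_f * d_sigma_n (same units for both)."""
    return d_tau + mu_f * d_sigma_n


def rate_ratio(d_cfs, a_sigma):
    """Rate-and-state ratio R / r = exp(dCFS / (A * sigma))."""
    return math.exp(d_cfs / a_sigma)
```

For example, a 1 kPa shear increment with 2 kPa of unclamping gives `delta_cfs(1.0, 2.0)` of 1.8 kPa; with a hypothetical $A\sigma$ of 10 kPa, `rate_ratio(1.8, 10.0)` is `exp(0.18)`, roughly a 20 % rate increase, and a zero stress change leaves the rate unchanged.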

Tidal constituents (M2, S2, O1, K1, Mf)

The dominant periodic components of the tide: semidiurnal M2 (12.421 h) and S2 (12.000 h), diurnal O1 (25.819 h) and K1 (23.934 h), and the long-period fortnightly band (Mf at ~13.66 d; the ~14.77 d end of the band is the lunisolar MSf constituent). The fortnightly band, where the period approaches nucleation timescales, is where clean correlation is most expected.

Ocean tidal loading (OTL)

The crustal stress from the weight of ocean tides. At coastal/subduction margins OTL dominates the body tide — skipping it is the single biggest tidal-modeling error. Computed with SPOTL (nloadf/hartid) and a global ocean-tide model (TPXO/GOT/FES); the body tide via pygtide (ETERNA PREDICT).

Schuster test

The standard test for tidal phase selectivity: each event is a unit vector at its tidal phase angle; the length of the vector sum gives a p-value against the null of random occurrence ("Schuster walk"). Caveat: the p-value depends on $N$ — huge catalogs make tiny effects "significant," so report effect size (rate-modulation amplitude), not just $p$; declustering matters (aftershocks share the mainshock's tidal phase).
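A minimal sketch of the test, using the usual large-$N$ form $p = \exp(-R^2/N)$, where $R$ is the length of the vector sum. The function name is hypothetical. The mean resultant length $R/N$ is returned alongside $p$ as a simple effect size: for a weak sinusoidal rate modulation of relative amplitude $a$, its expected value is about $a/2$.

```python
import math


def schuster_test(phases_rad):
    """Return (p_value, mean resultant length) for tidal phase angles."""
    n = len(phases_rad)
    c = sum(math.cos(a) for a in phases_rad)
    s = sum(math.sin(a) for a in phases_rad)
    r_squared = c * c + s * s  # squared length of the Schuster walk
    p_value = math.exp(-r_squared / n)
    return p_value, math.sqrt(r_squared) / n
```

Note how the caveat above shows up in the formula: at fixed mean resultant length $\rho = R/N$, the exponent is $-N\rho^2$, so $p$ shrinks with catalog size even when the modulation amplitude stays the same. The phases should come from a declustered catalog.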

Nonvolcanic tremor / slow-slip events (SSE) / LFE

Slow-earthquake phenomena that respond to the same ~kPa tidal stresses far more strongly than ordinary earthquakes — the strongest, least-disputed tidal signal. Tremor rate is exponential in tidal shear stress; fortnightly (Mf) modulation of tremor acts as an in-situ stressmeter. Kept as a separate channel from the fast-earthquake model where tremor/SSE catalogs exist. van der Elst, N. J., et al. (2016), PNAS 113, 8601–8606, doi:10.1073/pnas.1524316113; Rubinstein, J. L., et al. (2008), Science 319, 186–189, doi:10.1126/science.1150558.


Geometry, regimes, and tiling

Tectonic regime

A classification of the stress/faulting environment that conditions the global model. The project uses five regimes — subduction interface, intraslab, crustal / strike-slip, intraplate, and ridge — each with its own ETAS priors anchored on the USGS OAF tectonic-regime study. Page, M. T., et al. (2016), BSSA 106(5), 2290–2301, doi:10.1785/0120160073.

Tile (interior + halo)

The unit of the tiled global forecaster: the globe is partitioned into tiles, each fit on the events in its interior plus its halo (a buffer that makes triggering edge-correct) while owning only its interior cells (for aggregation). Per-tile fitting avoids the global $O(N^2)$ cost and lets each tile use its regime prior; a supercritical or data-thin tile falls back to its smoothed null.
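The interior/halo split can be sketched with a simple latitude-longitude box. This is an illustration under stated assumptions: the `Tile` class, its field names, and the degree-based halo are hypothetical, the fitting set is read as interior plus halo, and antimeridian wrap-around and polar tiles are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    lon_min: float
    lon_max: float
    lat_min: float
    lat_max: float
    halo_deg: float  # hypothetical buffer width in degrees

    def owns(self, lon, lat):
        """True if the point lies in the interior (half-open box)."""
        return (self.lon_min <= lon < self.lon_max
                and self.lat_min <= lat < self.lat_max)

    def fits_on(self, lon, lat):
        """True if an event here is used for fitting (interior plus halo)."""
        h = self.halo_deg
        return (self.lon_min - h <= lon < self.lon_max + h
                and self.lat_min - h <= lat < self.lat_max + h)
```

Half-open interiors mean every cell is owned by exactly one tile, so aggregating per-tile results never double-counts, while an event in the halo can still contribute triggering to a neighbouring tile's fit.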

View (region as a view into the global field)

In the global scope, any country/region is a view into one global forecast field — not a separately trained model. This is what makes cross-country bias comparison (high- vs low-seismicity) a first-class evaluation goal.

Slab2

The USGS global subduction-zone geometry model (depth-to-slab, dip, strike, interface distance), a primary enricher for subduction regimes. Hayes, G. P., et al. (2018), Science 362, 58–61, doi:10.1126/science.aat4723.

Finite-fault / anisotropic triggering

For great subduction earthquakes the isotropic point-source kernel of generic ETAS breaks down; fault-aligned (anisotropic) and finite-rupture triggering is needed. A key reason a subduction build must not reuse generic California parameters.


Product and UI terms

Probability field

The default Monitoring view: a continuous expected-count/rate surface on a perceptually-uniform sequential colormap (viridis/magma-class — not a red traffic-light ramp), with a calibrated numeric legend. The honest object is "where the conditional rate is elevated relative to its own long-term baseline," never "alerts."

Baseline (long-term / climatological)

The stationary background rate against which every forecast is shown, both as a ratio ($R = \lambda_\text{forecast}/\lambda_\text{baseline}$, e.g. "≈ 4× the usual rate") and an absolute expected count (so a 10× ratio on a near-zero baseline still reads "still very unlikely").
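The ratio and the absolute count answer different questions, which a small worked sketch makes concrete. The function name and the per-day rate units are assumptions, and the probability of at least one event assumes a Poisson count over the horizon, which ignores the over-dispersion discussed elsewhere in this glossary.

```python
import math


def describe_cell(rate_forecast, rate_baseline, horizon_days):
    """Return (ratio, expected count, P(at least one event)) for one cell.

    Rates are in events per day; a Poisson count over the horizon is assumed.
    """
    ratio = rate_forecast / rate_baseline
    expected = rate_forecast * horizon_days
    prob = -math.expm1(-expected)  # 1 - exp(-expected), stable for tiny rates
    return ratio, expected, prob
```

With a baseline of $10^{-5}$ events per day and a forecast of $10^{-4}$, `describe_cell(1e-4, 1e-5, 7)` reports a ratio of about 10 but an expected count of only $7\times10^{-4}$ over the week, so the honest reading is "ten times the usual rate, still very unlikely."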

Horizon

The forecast window: 1 day / 2 days / 7 days, re-issued daily, always visible in the legend. A probability with no horizon is meaningless. Long horizons have near-zero gain outside active sequences.

Calibration badge (traffic-light, model quality only)

A compact always-present badge driven by the CSEP tests. The green/amber/red triad is reserved exclusively for model quality (green = within CSEP consistency, amber = borderline, red = rejected/under-tested) — the only place red appears, describing model quality, never earthquake danger.

Coverage mask / staleness banner

Honest degradation signals. The coverage mask hatches regions outside the validated footprint (blank never means safe). The staleness banner shows "generated {UTC} · next run {UTC}" — with one inference per day, the user must know the data's age; a failed run degrades visibly rather than silently serving a stale artifact.

Compact artifact

The single small (few hundred KB – few MB), gzipped, committed JSON file the daily job produces and the static viewer consumes: per-cell rates for {1d,2d,7d} × {P10,median,P90}, the baseline, a CSEP test summary + reliability points, the coverage mask, and provenance. The raw ~6.48 M-cell global grid is never shipped to the browser.
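One possible shape for such an artifact, as a hedged sketch: every key name below is illustrative rather than the project's schema, and the encoding choices (sorted keys, compact separators, zeroed gzip timestamp) are assumptions made so that identical inputs produce identical bytes.

```python
import gzip
import json


def build_artifact(cells, baseline, csep_summary, reliability, mask, provenance):
    """Assemble the artifact dict; all key names here are illustrative."""
    return {
        "horizons": ["1d", "2d", "7d"],
        "quantiles": ["P10", "median", "P90"],
        "cells": cells,              # per-cell rates, horizon -> quantile triple
        "baseline": baseline,        # long-term rate per cell
        "csep": csep_summary,        # test summary
        "reliability": reliability,  # points for the reliability diagram
        "coverage_mask": mask,
        "provenance": provenance,
    }


def encode_artifact(artifact):
    """Serialize to gzipped JSON with deterministic bytes."""
    payload = json.dumps(artifact, separators=(",", ":"), sort_keys=True)
    # mtime=0 keeps the gzip header from embedding the current time.
    return gzip.compress(payload.encode("utf-8"), mtime=0)
```

Deterministic bytes matter for a committed file: a rerun with unchanged inputs then produces no spurious diff in the repository.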

Git-as-data publishing

The deploy pattern: the daily job commits the compact artifact (scoped git add results/ — never git add -A) and pushes to the public repo; the static site auto-rebuilds. No processing backend on the request path.


Data and catalogs

FDSN

The international standard web-service API (event / station / dataselect) for seismological data, exposed by USGS, ISC, EarthScope/IRIS, EMSC, and regional networks. Accessed uniformly via ObsPy's Client. The event service caps each request at 20,000 events (tile larger queries) and supports the updatedafter parameter for incremental deltas.
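One way to stay under the per-request cap is to bisect the time window until each piece is small enough. This sketch is an assumption about how "tiling" might be done (the source does not say whether tiles are temporal or spatial), and `count_fn` is a hypothetical callable standing in for whatever returns the number of matching events in a window.

```python
from datetime import datetime, timedelta

FDSN_EVENT_CAP = 20000  # per-request cap stated above


def split_windows(start, end, count_fn, cap=FDSN_EVENT_CAP):
    """Recursively bisect [start, end) until each window holds <= cap events."""
    if count_fn(start, end) <= cap or end - start <= timedelta(seconds=1):
        return [(start, end)]
    mid = start + (end - start) / 2
    return (split_windows(start, mid, count_fn, cap)
            + split_windows(mid, end, count_fn, cap))
```

The returned windows are contiguous and half-open, so fetching each one in turn and concatenating the results covers the original range without overlap.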

ComCat (ANSS Comprehensive Earthquake Catalog)

The USGS real-time, no-auth, daily-current global catalog — the daily inference spine. The magType field is first-class (mixing magnitude types distorts the GR tail). US Government work, public domain (non-US networks keep their own attribution).

ISC-GEM

The Global Instrumental Earthquake Catalogue (1904–2021, $M\ge5.5$, Mw-homogenized, relocated; v12.1, doi:10.31905/d808b825) — the long-term homogeneous anchor for b-value, large-event recurrence, and the magnitude-conversion overlap. Licensed CC-BY-SA 3.0 (share-alike: a redistributed derived catalog must keep the license + attribution).

GCMT (Global Centroid Moment Tensor)

The catalog of centroid moment tensors for ~all $M\gtrsim5$ events since 1976 (Mw, nodal planes, P/T axes) — the mechanism enricher and a Mw anchor. NDK format.

Regional networks (CSN, SCEDC/NCEDC, GeoNet, INGV, JMA)

The source of short-horizon skill (low, stable $M_c$): Chile → CSN (via EarthScope/IRIS, net C/C1; mandatory attribution); California → SCEDC/NCEDC; New Zealand → GeoNet (CC-BY 3.0 NZ); Italy → INGV ISIDe; Japan → JMA / NIED Hi-net (registration-gated, internal-only, not redistributable). EMSC SeismicPortal is an independent dedup cross-check.

Enrichers

Static / slow-moving geophysical covariates that are upside, not foundation (the catalog dominates skill), each gated on measured incremental information gain over a catalog-only ETAS: Slab2 (subduction geometry) > GEM Active Faults + Bird PB2002 plate model > GNSS strain (Nevada Geodetic Lab MIDAS, feeding the background term) > focal-mechanism stress. InSAR and heat flow are deferred.

Manifest / provenance

The versioned record stamping every pipeline stage (source catalog versions, query params, retrieved-at timestamps, row counts, checksums, conversion coefficients, $M_c$ grid version, declustering choice, model + parameter versions, config hash, code git SHA, issue timestamp). The raw data is rebuildable from manifests and is never versioned.
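Two of the listed fields, the config hash and the checksum, can be sketched with the standard library. The function names and manifest keys are illustrative, not the project's schema, and SHA-256 over canonical JSON is an assumed choice.

```python
import hashlib
import json


def config_hash(config):
    """SHA-256 of the canonical JSON form of a config dict."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def stage_manifest(stage, config, payload, row_count, retrieved_at):
    """One manifest entry for a pipeline stage; payload is raw bytes."""
    return {
        "stage": stage,
        "config_hash": config_hash(config),
        "checksum": hashlib.sha256(payload).hexdigest(),
        "row_count": row_count,
        "retrieved_at": retrieved_at,
    }
```

Sorting keys before hashing makes the hash depend on the configuration's content rather than on dictionary insertion order, so two runs with the same settings stamp the same value.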


See also: Methodology-History · Models-Classical · Models-ML · Models-Employed · Data-Sources · Technical-Architecture · Pipeline · Evaluation-and-Tests · Honest-Limits · Changelog-and-Progress.
