
STEP Model

Felipe Santibañez-Leal edited this page Jun 17, 2026 · 1 revision

STEP Model — Short-Term Earthquake Probability

STEP (Short-Term Earthquake Probability) is the model that turned aftershock statistics into a map. Where Reasenberg–Jones answers "how many aftershocks, and how likely is a large one" with no sense of place, STEP answers "how likely is strong shaking here, in the next interval" — a near-real-time, gridded probability surface updated as the seismicity evolves. It is the canonical production template for "tomorrow's earthquakes" probabilistic maps, and therefore the closest published analogue of this product's daily gridded output. This page is a deep, self-contained treatment of that one model: intuition and history, the governing equations and how they assemble into a map, parameter estimation, assumptions, strengths and limitations, its operational role, a worked sketch, a structural diagram, and references.

Honest framing. STEP produces a bounded, calibrated probability surface, never a deterministic prediction and never an alarm. Each cell carries a probability strictly in $(0,1)$ of exceeding a shaking threshold over a fixed short horizon — not a statement that an earthquake will strike there. A single outcome neither confirms nor refutes a probabilistic map. See Honest-Limits.


Table of contents

  1. Intuition and history
  2. What STEP computes
  3. The three rate components
  4. Blending the components per cell
  5. From rate to shaking probability (the GMPE step)
  6. Parameter estimation
  7. Assumptions and failure modes
  8. Strengths and limitations
  9. Role in operational earthquake forecasting
  10. Worked sketch of a single cell
  11. How STEP informs this product
  12. References

1. Intuition and history

By the early 2000s the Reasenberg–Jones model gave reliable aggregate aftershock probabilities, and seismic networks in California were dense enough to locate small earthquakes within minutes. The missing piece was space: a forecast that says "a damaging aftershock is 100× more likely near the rupture than 50 km away" is far more actionable than a single regional number.

STEP, developed by Matthew Gerstenberger, Stefan Wiemer, Lucile Jones and Paul Reasenberg (USGS + SCEC + ETH Zürich), was the first system to deliver this operationally. Published in Nature in 2005 under the title "Real-time forecasts of tomorrow's earthquakes in California", it produced hourly maps of the probability of experiencing strong shaking (Modified Mercalli intensity $\ge$ VI) on a ~10-km grid covering California, automatically updated as new events were located. It was, in effect, the first running instance of what the field now calls Operational Earthquake Forecasting (OEF).

Two design choices made STEP both useful and honest:

  1. It separates a slowly-varying background rate from a fast, event-driven clustering rate, so the map is dominated by the background in quiet times and lights up around fresh sequences.
  2. It converts the seismicity rate into a shaking probability via a ground-motion model — the quantity people actually care about — rather than reporting an abstract event rate.

STEP's architecture was later adopted, in spirit, by OEF-Italy, where a STEP-type component sits in an ensemble with ETAS and ETES.

Why "tomorrow's earthquakes" is the right framing. STEP never claims to know which fault will move or when the next event strikes. It quantifies, cell by cell, the elevated probability that the current state of seismicity confers on the next interval — which is exactly the honest thing a short-term forecast can say.


2. What STEP computes

STEP's output is a grid of cells; for each cell $s_j$ and each forecast interval $[t, t + \Delta t]$ it produces:

  1. A conditional earthquake rate $\lambda(s_j, t)$ — the expected rate of events (above a reference magnitude, with a Gutenberg–Richter magnitude distribution) in that cell, given the current catalog.
  2. A probability of strong shaking — the probability that ground motion in that cell exceeds a chosen intensity threshold (originally MMI $\ge$ VI) during the interval, obtained by pushing the rate through a ground-motion prediction equation (GMPE).

The pipeline is therefore: catalog → per-cell rate (background + clustering) → magnitude distribution → ground-motion conversion → exceedance probability map.

flowchart TD
    CAT["Real-time catalog<br/>(located events)"] --> BG["Background rate<br/>μ(s) — smoothed, time-independent"]
    CAT --> CL["Clustering rate<br/>Σ Reasenberg–Jones contributions<br/>of recent events"]
    BG --> SUM["Per-cell conditional rate<br/>λ(s,t) = μ(s) + Σ R–J"]
    CL --> SUM
    SUM --> GR["Gutenberg–Richter<br/>magnitude distribution"]
    GR --> GMPE["GMPE / ground-motion model<br/>P(shaking ≥ threshold)"]
    GMPE --> MAP["Gridded shaking-probability map<br/>(updated each interval)"]

3. The three rate components

The clustering rate in STEP is built entirely on the Reasenberg–Jones temporal–magnitude law

$$\lambda_{\text{RJ}}(t, M) = \frac{10^{\,a + b\,(M_m - M)}}{(t + c)^{\,p}},$$

but STEP estimates the parameters three ways simultaneously and keeps all three:

| Component | How parameters are obtained | When it is most informative |
| --- | --- | --- |
| Generic | Fixed regional $(a, b, c, p)$ from past sequences in the tectonic setting. | The instant a new mainshock is located, before its aftershocks accumulate. |
| Sequence-specific | $(a, c, p)$ refit by maximum likelihood to the ongoing sequence. | Once a productive sequence has enough aftershocks to constrain the fit. |
| Spatially-varying | Parameters estimated per grid cell, capturing along-rupture variation in productivity and decay. | Where the aftershock distribution is spatially non-uniform (most large ruptures). |

Each component yields its own estimate of the clustering rate at a cell. The background rate $\mu(s_j)$ is a separate, slowly-varying, time-independent term — a smoothed-seismicity surface that represents the long-term spatial distribution of events (see Models-Classical §7 for the smoothed-seismicity construction).
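As a minimal sketch of the Reasenberg–Jones law above, the snippet below evaluates the instantaneous rate and the expected number of events over a forecast window, using the closed-form time integral of $(t + c)^{-p}$. The function names `rj_rate` and `rj_expected_count` are illustrative choices, not part of any STEP code base, and times are assumed to be in days since the mainshock.

```python
import math


def rj_rate(t_days, mag, mainshock_mag, a, b, c, p):
    """Rate (events/day) of magnitude >= mag at t_days after the mainshock."""
    return 10.0 ** (a + b * (mainshock_mag - mag)) / (t_days + c) ** p


def rj_expected_count(t_start, t_end, mag, mainshock_mag, a, b, c, p):
    """Expected number of events >= mag in [t_start, t_end] (days).

    Integrates the rate in closed form; p == 1 is the logarithmic special case.
    """
    productivity = 10.0 ** (a + b * (mainshock_mag - mag))
    if abs(p - 1.0) < 1e-12:
        time_integral = math.log((t_end + c) / (t_start + c))
    else:
        time_integral = (
            (t_end + c) ** (1.0 - p) - (t_start + c) ** (1.0 - p)
        ) / (1.0 - p)
    return productivity * time_integral


if __name__ == "__main__":
    # Illustrative parameters only; not calibrated to any region.
    n = rj_expected_count(2.0, 3.0, 4.0, 6.0, a=-1.7, b=0.9, c=0.05, p=1.1)
    print(f"expected M>=4.0 events on day 3: {n:.3f}")
```

The same two functions serve all three components; only the source of $(a, b, c, p)$ changes between generic, sequence-specific and spatially-varying estimates.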


4. Blending the components per cell

The key operational idea is that STEP does not commit to one parameterization. For each cell it selects the most informative of the three clustering estimates available there, and adds the background:

$$\lambda(s_j, t) = \mu(s_j) \;+\; \lambda_{\text{cluster}}(s_j, t),$$

where $\lambda_{\text{cluster}}$ is taken from the spatially-varying estimate where data permit, backing off to sequence-specific, then to generic where the local data are too sparse to constrain a fit. The result is a rate surface that is maximally specific where data are rich and gracefully generic where they are not — a per-cell version of the regime hierarchy described for Reasenberg–Jones §5.

In quiet periods $\lambda_{\text{cluster}} \to 0$ and the map is just the background $\mu(s_j)$; when a sequence is active, the clustering term dominates the cells around the rupture and the map "lights up" there.
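The back-off order can be sketched in a few lines. This is an illustration under stated assumptions, not STEP's actual selection logic: the `CellEstimates` container is hypothetical, and "data too sparse to constrain a fit" is represented simply by `None`, leaving the criterion that decides it outside the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CellEstimates:
    """Rates for one cell and one interval (hypothetical container)."""
    background: float                          # mu(s_j), always present
    generic: float                             # regional generic estimate, always present
    sequence_specific: Optional[float] = None  # None until the sequence fit is constrained
    spatially_varying: Optional[float] = None  # None where local data are too sparse


def cluster_rate(est: CellEstimates) -> float:
    """Pick the most specific clustering estimate that is available."""
    for candidate in (est.spatially_varying, est.sequence_specific):
        if candidate is not None:
            return candidate
    return est.generic


def conditional_rate(est: CellEstimates) -> float:
    """lambda(s_j, t) = mu(s_j) + lambda_cluster(s_j, t)."""
    return est.background + cluster_rate(est)


if __name__ == "__main__":
    sparse_cell = CellEstimates(background=0.001, generic=0.02)
    rich_cell = CellEstimates(0.001, 0.02, sequence_specific=0.03, spatially_varying=0.05)
    print(conditional_rate(sparse_cell), conditional_rate(rich_cell))
```

Keeping the generic estimate mandatory and the other two optional mirrors the text: a cell always has a usable rate, and it becomes more specific only when the data allow.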

flowchart LR
    SV["Spatially-varying<br/>per-cell fit"] -->|"enough local data?"| PICK{Select most<br/>informative}
    SS["Sequence-specific<br/>fit"] --> PICK
    GEN["Generic<br/>regional"] --> PICK
    PICK --> CL["λ_cluster(s,t)"]
    MU["Background μ(s)"] --> ADD["λ(s,t) = μ(s) + λ_cluster(s,t)"]
    CL --> ADD

5. From rate to shaking probability (the GMPE step)

STEP reports shaking probability, not event rate, because shaking is what damages buildings and what civil protection acts on. Two conversions are layered on top of the rate.

5.1 Magnitude distribution

The cell rate $\lambda(s_j, t)$ is distributed over magnitude by the Gutenberg–Richter law: the expected rate of events of magnitude $\ge M$ in the cell scales as $10^{-b(M - M_{\text{ref}})}$ relative to the reference rate. Differencing this cumulative rate between adjacent magnitude bins gives the rate of events in each bin for the cell over the interval.
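A short sketch of that magnitude split follows. The bin width and the upper magnitude `mag_max` at which the distribution is truncated are assumptions introduced here for illustration; the text above does not specify either, and the function names are hypothetical.

```python
def gr_rate_at_least(rate_ref, mag, mag_ref, b):
    """Cumulative rate of events with magnitude >= mag, given the rate at mag_ref."""
    return rate_ref * 10.0 ** (-b * (mag - mag_ref))


def gr_binned_rates(rate_ref, mag_ref, mag_max, b, bin_width=0.1):
    """Rate in each bin [m, m + bin_width) between mag_ref and mag_max.

    Returns a list of (bin_lower_edge, rate) pairs. Each bin rate is the
    difference of the cumulative rate at its two edges.
    """
    n_bins = round((mag_max - mag_ref) / bin_width)
    bins = []
    for i in range(n_bins):
        lower = mag_ref + i * bin_width
        upper = lower + bin_width
        rate = gr_rate_at_least(rate_ref, lower, mag_ref, b) - gr_rate_at_least(
            rate_ref, upper, mag_ref, b
        )
        bins.append((lower, rate))
    return bins


if __name__ == "__main__":
    # Illustrative numbers only.
    for lower, rate in gr_binned_rates(0.02, 4.0, 4.5, b=0.9):
        print(f"M {lower:.1f}: {rate:.5f} events/day")
```

The binned rates sum to the reference rate minus the small cumulative rate above `mag_max`, so the truncation only matters if `mag_max` is set too low for the region.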

5.2 Ground-motion conversion and exceedance

For an event of magnitude $M$ at distance $r$ from a target site, a ground-motion prediction equation (GMPE) gives the expected shaking intensity $I(M, r)$ (e.g. peak ground acceleration or Modified Mercalli intensity) with its scatter. Integrating over the magnitude distribution and over the spatial source distribution gives the rate $\Lambda_{\ge I^*}(s_j, t)$ at which shaking at cell $s_j$ exceeds a threshold $I^*$ during the interval. Under the Poisson assumption, the exceedance probability is

$$P\big(I \ge I^* \text{ at } s_j \text{ in } [t, t+\Delta t]\big) = 1 - \exp\!\big(-\Lambda_{\ge I^*}(s_j, t)\,\Delta t\big),$$

the same $1 - e^{-N}$ form used throughout probabilistic seismic hazard. STEP's original threshold was MMI $\ge$ VI ("strong" shaking, the onset of potential damage). The map is the field of these per-cell probabilities.
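The two layers can be sketched as follows, assuming the integrals are replaced by a sum over discrete (rate, magnitude, distance) sources and that the scatter about the predicted intensity is normal with standard deviation `sigma`. The `toy_intensity` relation and its coefficients are hypothetical placeholders, not a published GMPE; a real map would substitute a model calibrated for the region.

```python
import math


def toy_intensity(mag, dist_km):
    """HYPOTHETICAL intensity relation with placeholder coefficients."""
    return 2.0 + 1.5 * mag - 3.0 * math.log10(dist_km + 10.0)


def prob_exceed_given_event(mag, dist_km, threshold, sigma, median_intensity):
    """P(I >= threshold | M, r), assuming normal scatter about the prediction."""
    z = (threshold - median_intensity(mag, dist_km)) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def shaking_exceedance_rate(sources, threshold, sigma, median_intensity):
    """Sum rate * P(exceed) over (rate_per_day, magnitude, distance_km) sources."""
    return sum(
        rate * prob_exceed_given_event(m, r, threshold, sigma, median_intensity)
        for rate, m, r in sources
    )


def poisson_exceedance_probability(rate_per_day, dt_days):
    """1 - exp(-rate * dt), computed with expm1 for accuracy at small rates."""
    return -math.expm1(-rate_per_day * dt_days)


if __name__ == "__main__":
    # Illustrative sources only: (events/day, magnitude, distance in km).
    sources = [(0.010, 4.5, 10.0), (0.004, 5.5, 10.0), (0.001, 6.5, 25.0)]
    lam = shaking_exceedance_rate(sources, threshold=6.0, sigma=0.7,
                                  median_intensity=toy_intensity)
    print(f"P(I >= VI in 1 day) = {poisson_exceedance_probability(lam, 1.0):.4f}")
```

Passing the intensity relation in as an argument keeps the Poisson step independent of the ground-motion model, which is the part that has to change from region to region.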

Why this matters. The GMPE step is what makes STEP a hazard product rather than a seismicity product. It also means STEP inherits the GMPE's uncertainty — an honest map carries that scatter through to the published probability.


6. Parameter estimation

  • Clustering parameters $(a, c, p)$: Omori–Utsu maximum likelihood (Ogata, 1983) on the ongoing sequence for the sequence-specific and spatially-varying components; regional priors for the generic component. See Reasenberg-Jones-Model §6.
  • $b$-value: Aki–Utsu MLE above the completeness magnitude $M_c$, never hard-coded — see Models-Classical §1.
  • Background $\mu(s)$: adaptive-kernel smoothed seismicity on a declustered catalog (Models-Classical §7).
  • GMPE: a region-appropriate ground-motion model with its aleatory scatter; the choice of GMPE is itself a modelling decision that must match the tectonic setting.
  • Early incompleteness: as with R–J, the minutes-to-hours after a large mainshock are incomplete; the fit must start past the time-of-completeness or model $M_c(t)$, or productivity is underestimated exactly when the map matters most.
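The Aki–Utsu estimator named in the list above can be sketched as below, assuming magnitudes binned to width `bin_width` (hence the half-bin correction to $M_c$) and using the first-order standard error $b/\sqrt{N}$. The function name and the error formula chosen are illustrative assumptions, not a statement of how STEP implemented the fit.

```python
import math


def b_value_aki_utsu(magnitudes, m_c, bin_width=0.1):
    """Maximum-likelihood b-value from events at or above completeness m_c.

    Returns (b, standard_error, n_used). Uses Utsu's half-bin correction:
    b = log10(e) / (mean(M) - (m_c - bin_width / 2)).
    """
    complete = [m for m in magnitudes if m >= m_c - 1e-9]
    if len(complete) < 2:
        raise ValueError("need at least two events at or above m_c")
    mean_mag = sum(complete) / len(complete)
    denom = mean_mag - (m_c - bin_width / 2.0)
    if denom <= 0.0:
        raise ValueError("mean magnitude does not exceed the corrected m_c")
    b = math.log10(math.e) / denom
    return b, b / math.sqrt(len(complete)), len(complete)


if __name__ == "__main__":
    # Illustrative catalog fragment only; the 1.8 and 1.9 events fall below m_c.
    mags = [2.0, 2.1, 2.0, 2.4, 3.1, 2.2, 2.0, 2.7, 1.8, 2.3, 2.0, 2.5, 1.9]
    b, err, n = b_value_aki_utsu(mags, m_c=2.0)
    print(f"b = {b:.2f} +/- {err:.2f} from {n} events")
```

Because events below $M_c$ are discarded before averaging, an $M_c$ set too low biases $b$ downward, which is why the completeness estimate has to be revisited during the early, incomplete part of a sequence.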

7. Assumptions and failure modes

| Assumption | What breaks it |
| --- | --- |
| Clustering is aftershock-type, captured by Reasenberg–Jones. | Swarms, slow-slip-driven and induced sequences are not aftershock decay; STEP mis-fits them. |
| Background is stationary over the relevant horizon. | Transient background changes (fluid injection, volcanic unrest) violate it. |
| Non-homogeneous Poisson counts per cell. | Real counts are over-dispersed; secondary triggering is only partially captured. |
| GMPE transfers to the region and threshold. | A California-tuned GMPE does not describe subduction shaking; the wrong GMPE biases the whole map. |
| Generic parameters are regime-appropriate. | Reusing California generics outside California is invalid (the same pitfall as R–J). |
| Catalog is complete above $M_c$ per cell. | Early/near-source incompleteness suppresses the very cells that should light up. |

The signature failure mode is mainshock/foreshock anticipation: STEP is aftershock-dominated by design, so it has little skill at flagging the next large independent event before any clustering signal exists. It is excellent at mapping the elevated hazard after a sequence starts.


8. Strengths and limitations

Strengths

  • Gridded and spatial — the first operational map of short-term shaking probability; the output shape this product targets.
  • Hazard-relevant — reports shaking, not abstract rate, via the GMPE step.
  • Graceful data dependence — per-cell selection of generic / sequence-specific / spatial parameters means it degrades sensibly where data are sparse.
  • Real-time and automatic — designed to update each interval with no human in the loop.

Limitations

  • Aftershock-dominated — weak for anticipating independent mainshocks/foreshocks by design.
  • No secondary triggering in the R–J core — large aftershocks' own sequences are under-modelled relative to ETAS.
  • California-tuned origins — generic parameters and GMPE must be re-derived for any other region.
  • GMPE-limited — map accuracy is capped by the ground-motion model's fidelity and scatter.

In direct CSEP-style comparison ETAS is the more complete generator, but STEP-type models often excel specifically during vigorous aftershock sequences, which is why operational ensembles keep both — see Evaluation-and-Tests.


9. Role in operational earthquake forecasting

STEP is a foundational operational system, not a paper model:

  • It ran as a near-real-time California service producing hourly gridded shaking-probability maps — the first instance of operational earthquake forecasting in the modern sense.
  • Its architecture (background + Reasenberg–Jones clustering → GMPE → gridded exceedance) is the template that this product's daily gridded map follows.
  • OEF-Italy embeds a STEP-type component in a validated three-model ensemble (with ETAS and ETES) that runs as a scheduled civil-protection service and publishes calibrated probabilities with uncertainty; its long-term validation found the ensemble broadly reliable.
  • Across a decade-long California experiment in which dozens of next-day models were scored with pyCSEP, the honest finding is that no single model dominates: STEP-type models shine during aftershock sequences while ETAS is the consistent generalist — the empirical justification for running an ensemble.

STEP's enduring contribution is conceptual as much as computational: it established that the right operational deliverable is a calibrated, gridded probability surface conditioned on present seismicity — never an alarm, never a countdown.


10. Worked sketch of a single cell

Illustrative only — not a forecast.

Consider a cell ~10 km from an $M_m = 6.0$ rupture, two days into the sequence, with the daily forecast interval $[2\,\text{d}, 3\,\text{d}]$.

  1. Clustering rate (R–J core). Using generic $a = -1.7$, $b = 0.9$, $c = 0.05$ d, $p = 1.1$, the expected number of $M \ge 4.0$ events in the whole sequence over that day is $10^{-1.7 + 0.9(6.0 - 4.0)} \int_2^3 (t + 0.05)^{-1.1}\, dt$. The magnitude term is $10^{0.1} \approx 1.26$; the time integral has the closed form $\big(2.05^{-0.1} - 3.05^{-0.1}\big)/0.1 \approx 0.36$, giving $\approx 0.46$ events of $M \ge 4.0$ in the sequence that day.
  2. Spatial share. The spatially-varying component assigns a fraction of that count to this cell based on the local aftershock density, say 5 %, i.e. $\approx 0.023$ events of $M \ge 4.0$ in the cell that day.
  3. Add background. Add the (small) stationary background $\mu(s)$ for the cell.
  4. GMPE → shaking. Push the cell's magnitude-distributed rate through the GMPE to get the rate $\Lambda_{\ge \text{VI}}$ of MMI $\ge$ VI shaking, then $P = 1 - e^{-\Lambda_{\ge \text{VI}}}$.

The map is this calculation repeated over every cell, dominated by the cells hugging the rupture and fading to background away from it.

Read it honestly. A cell showing "2 % chance of MMI ≥ VI today" is a probability, scored over many cells and many days (see Evaluation-and-Tests) — not a promise about that one cell on that one day.


11. How STEP informs this product

CAOS_SEISMIC adopts STEP's output shape and architecture, not its California parameters:

  • The product emits a gridded, calibrated probability surface per cycle, exactly STEP's deliverable form — see Models-Employed, Pipeline and Technical-Architecture.
  • The internal rate is assembled as background + time-dependent clustering, with the clustering term supplied by ETAS (more complete than R–J's single-trigger core) and cross-checked by Reasenberg–Jones.
  • The magnitude tail comes from a rolling Gutenberg–Richter fit, and the per-cell exceedance probability uses the same $1 - e^{-N}$ machinery STEP pioneered (Methodology-History).
  • All parameters and any ground-motion conversion are derived for the operating tectonic regime; California generics and California GMPEs are never reused unchanged.
  • Every map cell is CSEP-scored (Evaluation-and-Tests) and the surface is published as a bounded probability, never an alarm — see Honest-Limits.

References

  1. Gerstenberger, M. C., Wiemer, S., Jones, L. M. & Reasenberg, P. A. (2005). Real-time forecasts of tomorrow's earthquakes in California. Nature 435, 328–331. doi:10.1038/nature03622
  2. Reasenberg, P. A. & Jones, L. M. (1989). Earthquake hazard after a mainshock in California. Science 243(4895), 1173–1176. doi:10.1126/science.243.4895.1173
  3. Page, M. T., van der Elst, N., Hardebeck, J., Felzer, K. & Michael, A. J. (2016). Three ingredients for improved global aftershock forecasts. Bulletin of the Seismological Society of America 106(5), 2290–2301. doi:10.1785/0120160073
  4. Ogata, Y. (1983). Estimation of the parameters in the modified Omori formula for aftershock frequencies by the maximum likelihood procedure. Journal of Physics of the Earth 31(2), 115–124. doi:10.4294/jpe1952.31.115
  5. Helmstetter, A., Kagan, Y. Y. & Jackson, D. D. (2007). High-resolution time-independent grid-based forecast for M ≥ 5 earthquakes in California. Seismological Research Letters 78(1), 78–86. doi:10.1785/gssrl.78.1.78
  6. Schorlemmer, D., Gerstenberger, M. C., Wiemer, S., Jackson, D. D. & Rhoades, D. A. (2007). Earthquake likelihood model testing. Seismological Research Letters 78(1), 17–29. doi:10.1785/gssrl.78.1.17
  7. Jordan, T. H., Chen, Y.-T., Gasparini, P., Madariaga, R., Main, I., Marzocchi, W., Papadopoulos, G., Sobolev, G., Yamaoka, K. & Zschau, J. (2011). Operational earthquake forecasting: state of knowledge and guidelines for utilization. Annals of Geophysics 54(4), 315–391. doi:10.4401/ag-5350

Related pages: Models-Classical · Reasenberg-Jones-Model · EEPAS-Model · Models-Employed · Methodology-History · Pipeline · Evaluation-and-Tests · Honest-Limits · Glossary.
