Gutenberg Richter Law

Felipe Santibañez-Leal edited this page Jun 17, 2026 · 1 revision

Gutenberg–Richter Law — the frequency–magnitude relation

The magnitude term of every earthquake forecast. The Gutenberg–Richter (GR) law states that, in any sufficiently large space–time–magnitude window, the number of earthquakes falls off exponentially with magnitude. It is the single most robust empirical regularity in seismology, and it is the device that converts a forecast rate of events above one reference magnitude into a rate above any target magnitude — without it, a conditional intensity $\lambda$ has no way to say how big the events it forecasts will be.

Honest framing. Nothing on this page is a prediction. The GR law is a statistical statement about populations of earthquakes; it says nothing about when or where the next event of a given size will occur. It supplies the magnitude distribution that any conditional forecaster (see Omori-Utsu-Law, ETAS-Model) multiplies into its time–space rate to produce a bounded, calibrated probability. A prediction is a deterministic statement that an event will or will not occur; a forecast gives a probability strictly in $(0,1)$ (Jordan et al., 2011).

Conventions. Magnitudes are homogenized to moment magnitude $M_w$ where possible; $M_c$ is the magnitude of completeness; $b$ is the GR slope; $\beta = b\ln 10$.


Table of contents

  1. Intuition and history
  2. Governing equation
  3. The exponential magnitude density (derivation)
  4. The b-value and what it means physically
  5. Estimating b — the Aki maximum-likelihood estimator
  6. The Utsu / Tinti–Mulargia binning correction
  7. Uncertainty of the b-value estimate
  8. Magnitude of completeness $M_c$
  9. Departures at large magnitude — tapering and $M_{\max}$
  10. Assumptions and failure modes
  11. Role in operational earthquake forecasting
  12. Worked illustration
  13. Estimation pipeline (diagram)
  14. References

1. Intuition and history

In 1944 Beno Gutenberg and Charles Richter, working on the seismicity of California, observed that if you count earthquakes by size and plot the logarithm of the cumulative count against magnitude, the points fall on a straight line over a wide range of magnitudes. Small earthquakes are overwhelmingly more common than large ones, and the ratio between the count at one magnitude and the count one unit higher is approximately constant. A region that produces one $M\,6$ per decade produces of order ten $M\,5$, a hundred $M\,4$, and a thousand $M\,3$: an exponential fall-off in magnitude, which is a clean power law in event size.

This is not a curiosity of California. The same straight line, with a slope near 1, appears in essentially every tectonic region, in induced seismicity, in laboratory rock-fracture acoustic emissions, and across roughly seven orders of magnitude in seismic moment. It is the empirical signature of a scale-invariant fracture process: the crust has no preferred earthquake size, and ruptures of all sizes are nucleated by the same physics, with only their final extent setting their magnitude. This self-similarity is the reason the GR law is the magnitude backbone of every statistical-seismology forecast (see Methodology-History).

The law is frequency–magnitude, not a recurrence model: it tells you the proportions of event sizes in a population, not the timing of any one event. Timing comes from the temporal models (Omori-Utsu-Law, ETAS-Model); GR tells those models how to distribute their forecast rate across magnitude.


2. Governing equation

The cumulative form, as Gutenberg and Richter wrote it:

$$\log_{10} N(\ge M) = a - b\,M$$

  • $N(\ge M)$ — the number (or, divided by the window duration, the rate) of earthquakes with magnitude $\ge M$ in the space–time window.
  • $a$ — the productivity or total seismicity of the window. It absorbs the catalog length, the area, and the overall activity level; $10^{a}$ is the rate of events with $M \ge 0$ implied by the fit (an extrapolation, not an observation).
  • $b$ — the b-value, the slope of the log-frequency-versus-magnitude line. Globally $b \approx 1.0$, meaning each unit increase in magnitude reduces the event count by a factor of $10^{b} \approx 10$.

Because magnitude is itself logarithmic in seismic energy/moment, the GR law is a power law in the physical size of events: $N(\ge M) \propto 10^{-bM}$, and seismic moment $M_0 \propto 10^{1.5 M_w}$, so $N(\ge M_0) \propto M_0^{-2b/3}$ — a power law with exponent $\tfrac{2}{3}b \approx 0.67$ for $b=1$. This connects the seismological b-value to the more general physics-of-fracture exponent.
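As a quick numerical check of the scaling above, here is a minimal Python sketch of the cumulative law. The values $a = 6$, $b = 1$ are illustrative assumptions (chosen so that one $M \ge 6$ event is expected per window), not fitted parameters, and the function names are hypothetical.

```python
def gr_cumulative_count(a: float, b: float, magnitude: float) -> float:
    """Cumulative GR count N(>= M) = 10**(a - b*M) for the window."""
    return 10.0 ** (a - b * magnitude)


def moment_power_law_exponent(b: float) -> float:
    """Exponent 2b/3 of N(>= M0) ~ M0**(-2b/3), using M0 ~ 10**(1.5*Mw)."""
    return 2.0 * b / 3.0


if __name__ == "__main__":
    a, b = 6.0, 1.0  # illustrative values only (assumption)
    for m in (3.0, 4.0, 5.0, 6.0):
        print(f"N(>= {m:.0f}) = {gr_cumulative_count(a, b, m):g}")
    print("moment exponent:", round(moment_power_law_exponent(b), 2))
```

Each step of one magnitude unit changes the count by the factor $10^{b}$, which is the constant ratio described in §1.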

Reference. Gutenberg, B. & Richter, C. F. (1944), Frequency of earthquakes in California, Bull. Seismol. Soc. Am. 34(4), 185–188.


3. The exponential magnitude density (derivation)

The cumulative law is equivalent to an exponential probability density for magnitude above completeness — the form a forecaster actually uses, because a point process needs a normalized density, not a cumulative count.

Start from the cumulative count and convert it to a survival function. Above $M_c$, the probability that a randomly chosen event exceeds magnitude $M$ is the ratio of counts:

$$P(\text{mag} \ge M \mid \text{mag} \ge M_c) = \frac{N(\ge M)}{N(\ge M_c)} = \frac{10^{a - bM}}{10^{a - bM_c}} = 10^{-b(M - M_c)} = e^{-\beta (M - M_c)}, \qquad \beta = b\ln 10 .$$

This is the survival function of an exponential distribution shifted to start at $M_c$. Differentiating, $f(M) = -\,\mathrm{d}P/\mathrm{d}M$, gives the probability density of magnitude:

$$\boxed{\,f(M) = \beta\, e^{-\beta (M - M_c)}, \qquad \beta = b\ln 10, \quad M \ge M_c\,}$$

So the GR slope $b$ becomes the rate parameter $\beta = b\ln 10$ of a (shifted) exponential distribution. This is the object that multiplies into the conditional intensity: a forecaster computes a rate of events $\ge M_c$ from its time–space model, then distributes that rate over magnitude with $f(M)$ to obtain the exceedance probability at any target magnitude $M^\ast$:

$$\Phi(M^\ast) \equiv P(\text{mag} \ge M^\ast \mid \text{mag} \ge M_c) = 10^{-b(M^\ast - M_c)} .$$

The published forecast probability is then $P(\ge 1\ \text{event} \ge M^\ast) = 1 - e^{-N\,\Phi(M^\ast)}$, where $N$ is the expected count $\ge M_c$ over the horizon (see Models-Employed and Evaluation-and-Tests).
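The two formulas above translate directly into code. The sketch below is a minimal illustration, not the product's implementation; the function names are hypothetical, and the demo values ($N = 0.5$, $b = 1$, $M_c = 2.5$, $M^\ast = 4.5$) are assumptions chosen only to exercise the functions.

```python
import math


def exceedance_factor(b: float, m_target: float, m_c: float) -> float:
    """Phi(M*) = 10**(-b*(M* - Mc)), valid for M* >= Mc."""
    if m_target < m_c:
        raise ValueError("target magnitude must be at or above completeness")
    return 10.0 ** (-b * (m_target - m_c))


def prob_at_least_one(expected_count: float, b: float,
                      m_target: float, m_c: float) -> float:
    """P(>= 1 event >= M*) = 1 - exp(-N * Phi(M*))."""
    rate = expected_count * exceedance_factor(b, m_target, m_c)
    return -math.expm1(-rate)  # numerically stable form of 1 - exp(-rate)


if __name__ == "__main__":
    # Illustrative inputs only (assumptions, not forecast output).
    p = prob_at_least_one(expected_count=0.5, b=1.0, m_target=4.5, m_c=2.5)
    print(f"P(at least one M >= 4.5) = {p:.4%}")
```

Using `math.expm1` avoids the loss of precision that `1 - math.exp(-rate)` suffers when the expected count in the tail is very small, which is the usual case for large target magnitudes.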


4. The b-value and what it means physically

The b-value sets the ratio of small to large events: each unit of magnitude changes the count by a factor of $10^{b}$. A high $b$ ($\gtrsim 1.2$) means relatively many small events and few large ones; a low $b$ ($\lesssim 0.8$) means a heavier tail of large events. Several physical drivers are documented:

  • Differential stress. Laboratory and field studies find $b$ decreases as differential stress increases — the b-value behaves like an inverse stress-meter. Locked, highly stressed asperities tend to show locally low $b$.
  • Depth and faulting style. $b$ tends to vary systematically with depth and with the tectonic regime (normal-faulting regions often show higher $b$ than thrust regions).
  • Heterogeneity / fault maturity. More heterogeneous, immature fault zones tend toward higher $b$.

These are real geophysical signals. The danger — and the single most cited pitfall of b-value work — is that an apparent change in $b$ can equally well be an artifact of a mis-estimated $M_c$ (see §8). Any claim that $b$ has changed in time or space must first rule out a change in catalog completeness. The product treats $b(t)$ and $M_c(t)$ as jointly monitored quantities: drift in $b$ that coincides with drift in $M_c$ is flagged as a likely catalog/network artifact, not a tectonic signal.
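The joint-monitoring rule described above can be sketched as a small classifier. Everything here is hypothetical: the `WindowEstimate` container, the function name, the two-sigma threshold, and the 0.1-unit tolerance on $M_c$ are assumptions made for illustration, not the product's actual settings.

```python
import math
from dataclasses import dataclass


@dataclass
class WindowEstimate:
    """b-value, its standard error, and Mc for one rolling window."""
    b: float
    sigma_b: float
    m_c: float


def classify_b_drift(previous: WindowEstimate, current: WindowEstimate,
                     n_sigma: float = 2.0, mc_tolerance: float = 0.1) -> str:
    """Label a change in b between two windows (thresholds are assumptions)."""
    # Combine the two standard errors in quadrature.
    sigma = math.hypot(previous.sigma_b, current.sigma_b)
    b_changed = abs(current.b - previous.b) > n_sigma * sigma
    # Small epsilon guards against float round-off in magnitude differences.
    mc_changed = abs(current.m_c - previous.m_c) >= mc_tolerance - 1e-9
    if not b_changed:
        return "stable"
    if mc_changed:
        return "likely catalog artifact"
    return "candidate tectonic signal"


if __name__ == "__main__":
    before = WindowEstimate(b=1.00, sigma_b=0.05, m_c=2.5)
    after = WindowEstimate(b=0.80, sigma_b=0.05, m_c=2.7)
    print(classify_b_drift(before, after))
```

A change in $b$ is only treated as a candidate signal when completeness held steady; any coincident shift in $M_c$ takes priority as the explanation.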


5. Estimating b — the Aki maximum-likelihood estimator

The slope $b$ should not be read off a least-squares fit to the cumulative plot: the cumulative points are correlated (each is a running sum), and least squares is biased and over-weights the rare large events. Aki (1965) gave the correct estimator by maximum likelihood.

Derivation. Take the magnitude density $f(M) = \beta e^{-\beta(M - M_c)}$ for $M \ge M_c$. For $n$ independent events with magnitudes $M_1,\dots,M_n$ above $M_c$, the log-likelihood is

$$\ell(\beta) = \sum_{i=1}^{n} \ln f(M_i) = n\ln\beta - \beta \sum_{i=1}^{n}(M_i - M_c) .$$

Setting $\mathrm{d}\ell/\mathrm{d}\beta = 0$:

$$\frac{n}{\beta} - \sum_{i=1}^{n}(M_i - M_c) = 0 \;\Longrightarrow\; \hat\beta = \frac{1}{\bar M - M_c}, \qquad \bar M = \frac1n\sum_i M_i .$$

Converting back with $b = \beta/\ln 10$ gives the Aki estimator:

$$\boxed{\,\hat b = \frac{\log_{10} e}{\bar M - M_c} \approx \frac{0.4343}{\bar M - M_c}\,}$$

where $\bar M$ is the mean magnitude of events with $M \ge M_c$. The estimator depends only on the mean magnitude above completeness, which is a sufficient statistic for $\beta$. It assumes independent magnitudes drawn from a continuous, unbounded exponential distribution above $M_c$; the binning and upper-bound departures are treated in §6 and §9.
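To see the estimator at work, the following sketch draws synthetic magnitudes from the exponential density by inverse-transform sampling and then recovers $b$ with the Aki formula. It assumes continuous (unbinned) magnitudes; the function names, sample size, and seed are illustrative choices, not part of any catalog workflow.

```python
import math
import random

LOG10_E = math.log10(math.e)  # about 0.4343


def sample_gr_magnitudes(n: int, b: float, m_c: float, seed: int = 0) -> list:
    """Draw n magnitudes from f(M) = beta*exp(-beta*(M - Mc)), M >= Mc."""
    beta = b * math.log(10.0)
    rng = random.Random(seed)
    # 1 - U lies in (0, 1], so the logarithm is always finite.
    return [m_c - math.log(1.0 - rng.random()) / beta for _ in range(n)]


def aki_b_value(magnitudes, m_c: float) -> float:
    """Aki (1965) maximum-likelihood b for continuous magnitudes >= Mc."""
    above = [m for m in magnitudes if m >= m_c]
    if not above:
        raise ValueError("no events at or above the completeness magnitude")
    mean_excess = sum(above) / len(above) - m_c
    if mean_excess <= 0.0:
        raise ValueError("mean magnitude must exceed Mc")
    return LOG10_E / mean_excess


if __name__ == "__main__":
    synthetic = sample_gr_magnitudes(n=20000, b=1.0, m_c=2.5, seed=1)
    print(f"recovered b = {aki_b_value(synthetic, 2.5):.3f} (true b = 1.0)")
```

Because the estimator uses only the mean excess above $M_c$, it needs a single pass over the catalog slice and no binning or regression.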

Reference. Aki, K. (1965), Maximum likelihood estimate of $b$ in the formula $\log N = a - bM$ and its confidence limits, Bull. Earthq. Res. Inst. 43, 237–239.


6. The Utsu / Tinti–Mulargia binning correction

Real catalogs report magnitudes rounded to a finite resolution $\Delta M$ (commonly 0.1). The continuous Aki estimator is then slightly biased, because a magnitude reported as $M$ actually represents the bin $[M - \tfrac{\Delta M}{2},\, M + \tfrac{\Delta M}{2})$. Correcting for the discretization (Utsu 1965; Tinti & Mulargia 1987) shifts the effective completeness by half a bin:

$$\boxed{\,\hat b = \frac{\log_{10} e}{\bar M - \left(M_c - \tfrac{\Delta M}{2}\right)}\,}$$

This is the Aki–Utsu estimator and is the form the product uses. The correction is modest for $\Delta M = 0.1$ (it lowers $\hat b$ by roughly 10% when $b \approx 1$), grows with steeper $b$ and with the coarse binning found in some historical catalogs, and costs nothing to apply.
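A minimal sketch of the binned estimator follows, with the uncorrected Aki value obtained by passing `delta_m=0`. The function name, the tiny catalog, and the round-off tolerance are assumptions for illustration.

```python
import math


def aki_utsu_b_value(magnitudes, m_c: float, delta_m: float = 0.1) -> float:
    """Aki-Utsu b: log10(e) / (mean(M) - (Mc - delta_m/2)) for binned magnitudes."""
    tol = 1e-9  # tolerate float round-off in magnitudes reported at Mc
    above = [m for m in magnitudes if m >= m_c - tol]
    if not above:
        raise ValueError("no events at or above the completeness magnitude")
    mean_m = sum(above) / len(above)
    denominator = mean_m - (m_c - delta_m / 2.0)
    if denominator <= 0.0:
        raise ValueError("mean magnitude must exceed the shifted completeness")
    return math.log10(math.e) / denominator


if __name__ == "__main__":
    catalog = [2.5, 2.7, 3.0, 3.4]  # toy magnitudes binned at 0.1 (assumption)
    print("uncorrected:", round(aki_utsu_b_value(catalog, 2.5, delta_m=0.0), 3))
    print("corrected:  ", round(aki_utsu_b_value(catalog, 2.5, delta_m=0.1), 3))
```

The half-bin shift enlarges the denominator, so the corrected value is always lower than the uncorrected one for the same sample.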

References. Utsu, T. (1965), A method for determining the value of $b$ in a formula $\log n = a - bM$…, Geophys. Bull. Hokkaido Univ. 13, 99–103; Tinti, S. & Mulargia, F. (1987), Confidence intervals of b values for grouped magnitudes, Bull. Seismol. Soc. Am. 77(6), 2125–2134.


7. Uncertainty of the b-value estimate

A b-value with no error bar is not usable in a forecast — its uncertainty propagates straight into the magnitude tail and therefore into the published probability of a large event. Two standard results:

  • Aki's own confidence limits. For the MLE, the standard error scales as $\sigma_{\hat b} \approx \hat b / \sqrt{n}$, so a stable estimate needs many events. A common rule of thumb is 50–100 events above $M_c$ for a usable $\hat b$.

  • Shi & Bolt (1982) sample standard deviation, which accounts for the spread of the observed magnitudes:

$$\sigma_{\hat b} = 2.30\, \hat b^{2}\, \sqrt{\frac{\sum_{i=1}^{n}(M_i - \bar M)^{2}}{n\,(n-1)}} .$$

The product re-estimates $\hat b$ on a rolling space–time window, carries $\sigma_{\hat b}$, and propagates it (together with the $M_c$ uncertainty, §8) into the optimistic/expected/pessimistic bounds of the forecast. $b$ is never hard-coded to 1.
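The two standard errors above can be sketched as follows. This is an illustrative implementation with hypothetical function names; it expects only the magnitudes already selected at or above $M_c$, together with a $\hat b$ computed from them.

```python
import math


def aki_sigma_b(b_hat: float, n: int) -> float:
    """Aki's large-sample standard error, b_hat / sqrt(n)."""
    if n < 1:
        raise ValueError("need at least one event")
    return b_hat / math.sqrt(n)


def shi_bolt_sigma_b(magnitudes, b_hat: float) -> float:
    """Shi & Bolt (1982): 2.30 * b^2 * sqrt(sum((Mi - mean)^2) / (n*(n-1)))."""
    n = len(magnitudes)
    if n < 2:
        raise ValueError("need at least two events")
    mean_m = sum(magnitudes) / n
    sum_sq = sum((m - mean_m) ** 2 for m in magnitudes)
    return 2.30 * b_hat ** 2 * math.sqrt(sum_sq / (n * (n - 1)))


if __name__ == "__main__":
    toy = [2.5, 3.0, 3.5]  # far too few events for real use (illustration only)
    print("Aki:     ", round(aki_sigma_b(1.0, len(toy)), 3))
    print("Shi-Bolt:", round(shi_bolt_sigma_b(toy, 1.0), 3))
```

For magnitudes that really are exponential, the sample standard deviation is close to $1/\hat\beta$, and the two expressions give nearly the same number; a large gap between them is itself a hint that the sample departs from GR.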

Reference. Shi, Y. & Bolt, B. A. (1982), The standard error of the magnitude–frequency $b$ value, Bull. Seismol. Soc. Am. 72(5), 1677–1687.


8. Magnitude of completeness $M_c$

$M_c$ is the lowest magnitude at which (almost) all events in a space–time volume are detected and reported. Below $M_c$ the catalog rolls off — small events are missed in the noise or in the coda of larger ones — and the GR straight line bends. This is the load-bearing nuisance parameter of the whole law: the Aki–Utsu $\hat b$ is strongly biased if $M_c$ is set too low (you fit the rolled-off part and get a too-shallow $b$) or wastefully high (you throw away data and inflate the variance).

Common estimators of $M_c$:

| Estimator | Idea |
|---|---|
| Maximum curvature (MAXC) | $M_c$ = the magnitude bin of the peak of the non-cumulative frequency–magnitude distribution; fast, but biased low in gradually-rolling-off catalogs, so a +0.2 correction is often added. |
| Goodness-of-fit (GFT) | The smallest $M_c$ for which a GR model fits the data to a chosen residual level (e.g. 90–95 %); Wiemer & Wyss (2000). |
| b-value stability (MBS) | The $M_c$ above which $\hat b$ stabilizes within its uncertainty; Cao & Gao (2002), Woessner & Wiemer (2005). |
| EMR, Lilliefors | Likelihood / distributional variants that model the detected and undetected parts jointly. |

The +0.2 MAXC correction is not universal. It was calibrated for California catalogs and must be re-validated per region (cross-checked with GFT/EMR and a direct look at the frequency–magnitude distribution), taking the conservative value. Treating it as a fixed constant is a known way to bias $M_c$ — and therefore $b$ — across regions.
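A minimal sketch of the maximum-curvature estimator makes the role of the correction explicit. The function name and the toy catalog are hypothetical, the `correction` argument is deliberately a parameter rather than a constant, and the GFT/EMR cross-checks mentioned above are not implemented here.

```python
from collections import Counter


def mc_maxc(magnitudes, delta_m: float = 0.1, correction: float = 0.2) -> float:
    """Maximum-curvature Mc: modal magnitude bin plus a regional correction."""
    if not magnitudes:
        raise ValueError("empty catalog")
    # Index bins by integer multiples of delta_m to avoid float-key problems.
    counts = Counter(round(m / delta_m) for m in magnitudes)
    # Ties go to the higher bin, the conservative choice for Mc.
    peak_bin = max(counts, key=lambda k: (counts[k], k))
    return round(peak_bin * delta_m + correction, 6)


if __name__ == "__main__":
    # Toy non-cumulative distribution peaking at M 1.2 (assumption).
    catalog = [1.0] * 2 + [1.1] * 5 + [1.2] * 9 + [1.3] * 7 + [1.4] * 4
    print("MAXC, no correction:", mc_maxc(catalog, correction=0.0))
    print("MAXC + 0.2:         ", mc_maxc(catalog, correction=0.2))
```

Keeping the correction as an argument lets each region carry its own validated value instead of inheriting one calibrated elsewhere.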

A critical operational subtlety is short-term, post-mainshock incompleteness: immediately after a large earthquake, small events are buried in the coda and $M_c$ spikes for hours to days. A flat $M_c$ then treats the undetected events as if they had not occurred, so a naive fit underestimates productivity at exactly the highest-stakes moment. The product therefore uses a time-dependent $M_c(t)$ after large events (see Omori-Utsu-Law §incompleteness and Models-Employed). The $M_c$ grid is stored as a first-class versioned artifact alongside each daily catalog snapshot.

References. Wiemer, S. & Wyss, M. (2000), Minimum magnitude of completeness in earthquake catalogs…, Bull. Seismol. Soc. Am. 90(4), 859–869, doi:10.1785/0119990114; Woessner, J. & Wiemer, S. (2005), Assessing the quality of earthquake catalogues: estimating the magnitude of completeness and its uncertainty, Bull. Seismol. Soc. Am. 95(2), 684–698, doi:10.1785/0120040007.


9. Departures at large magnitude — tapering and $M_{\max}$

The pure exponential cannot hold to infinity: a fault of finite length cannot host an arbitrarily large earthquake, so the magnitude distribution must be bounded or tapered near a maximum magnitude $M_{\max}$. Two standard remedies:

  • Truncated GR: the exponential is cut at a hard $M_{\max}$, renormalized over $[M_c, M_{\max}]$.
  • Tapered GR (Kagan): the survival function is multiplied by an exponential taper in seismic moment, $P(\ge M_0) \propto (M_0/M_t)^{-2b/3}\exp\!\big((M_t - M_0)/M_{cm}\big)$, where $M_t$ is the threshold moment and $M_{cm}$ the corner moment, giving a smooth roll-off controlled by the corresponding corner magnitude rather than a hard cut.

This tail matters disproportionately for a forecaster: the rare, high-impact events live in it, and $M_{\max}$ bounds the exceedance integral and therefore sets the tail probability of the events that dominate impact. The product sources $M_{\max}$ per region from regional hazard models, treats it as an explicit documented assumption, and reports sensitivity to it (see Models-Employed). A characteristic-earthquake bump (Schwartz & Coppersmith 1984), where a fault ruptures preferentially in similar-size events, is a further proposed departure from pure GR at large $M$; its existence remains contested by proponents of the pure power law.
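The two remedies can be compared against the pure exponential with a short sketch. The function names and the demo values ($b = 1$, $M_c = 2.5$, $M_{\max} = 8.0$, corner magnitude 8.0) are assumptions. Moments are computed as $10^{1.5 M_w}$ only up to a constant, which is safe here because the tapered formula depends only on moment ratios.

```python
import math


def pure_gr_survival(m: float, b: float, m_c: float) -> float:
    """Unbounded GR: P(mag >= m | mag >= Mc) = 10**(-b*(m - Mc))."""
    return 10.0 ** (-b * (m - m_c)) if m > m_c else 1.0


def truncated_gr_survival(m: float, b: float, m_c: float, m_max: float) -> float:
    """Exponential cut at m_max and renormalized over [Mc, m_max]."""
    if m <= m_c:
        return 1.0
    if m >= m_max:
        return 0.0
    beta = b * math.log(10.0)
    tail = math.exp(-beta * (m_max - m_c))
    return (math.exp(-beta * (m - m_c)) - tail) / (1.0 - tail)


def tapered_gr_survival(m: float, b: float, m_c: float, m_corner: float) -> float:
    """Kagan-style taper: (M0/Mt)**(-2b/3) * exp((Mt - M0)/Mcm)."""
    if m <= m_c:
        return 1.0
    # Relative moments; the additive constant of the moment scale cancels.
    m0, mt, mcm = (10.0 ** (1.5 * x) for x in (m, m_c, m_corner))
    return (m0 / mt) ** (-2.0 * b / 3.0) * math.exp((mt - m0) / mcm)


if __name__ == "__main__":
    for mag in (5.0, 7.0, 7.9):
        print(mag,
              f"{pure_gr_survival(mag, 1.0, 2.5):.3e}",
              f"{truncated_gr_survival(mag, 1.0, 2.5, 8.0):.3e}",
              f"{tapered_gr_survival(mag, 1.0, 2.5, 8.0):.3e}")
```

Well below the bound the three curves are nearly indistinguishable; the choice between them only matters in the last magnitude unit or so, which is exactly where the high-impact events sit.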

References. Kagan, Y. Y. (2002), Seismic moment distribution revisited, Geophys. J. Int. 148(3), 520–541, doi:10.1046/j.1365-246x.2002.01594.x; Schwartz, D. P. & Coppersmith, K. J. (1984), Fault behavior and characteristic earthquakes, J. Geophys. Res. 89(B7), 5681–5698, doi:10.1029/JB089iB07p05681.


10. Assumptions and failure modes

  • Self-similarity above $M_c$. Magnitudes are exponentially distributed above completeness. This breaks down near $M_{\max}$ (needs tapering, §9) and can break locally where a characteristic earthquake dominates.
  • Stationarity and completeness. The estimator assumes a stationary, complete catalog over the window. Real $b$ varies with stress, depth and fault maturity (a true signal) and with $M_c$ mis-estimation (an artifact). Distinguishing the two is the central discipline of b-value work.
  • Independence. The Aki MLE assumes events are independent draws from the magnitude distribution. Strong clustering does not bias the magnitude estimate much (magnitudes of aftershocks still follow GR), but it does mean the count term $a$ is non-stationary — which is exactly why the temporal models exist.
  • Sample size. Below ~50 events the estimate is noisy; the uncertainty (§7) must be carried, never suppressed.

11. Role in operational earthquake forecasting

GR is the magnitude term of every forecast in this system, and it plays three concrete roles:

  1. Distributing rate over magnitude. Every conditional model (Omori-Utsu-Law, ETAS-Model) produces a rate of events $\ge M_c$. GR's $\Phi(M^\ast) = 10^{-b(M^\ast - M_c)}$ turns that into the exceedance rate at the displayed threshold, and hence the published probability $1 - e^{-N\Phi}$.
  2. Enabling the magnitude consistency test (M-test). Because the forecast carries a full magnitude distribution, CSEP can score whether the predicted mix of event sizes matches reality, not just the total count.
  3. Setting the tail. $b$ and $M_{\max}$ together control the probability of the rare large events that dominate impact; their uncertainty is propagated into the forecast bounds.

Because $b$ enters the tail exponentially, a small error in $b$ (or in $M_c$, which biases $b$) is amplified into the probability of a large event. This is why the product re-estimates $M_c$ and $b$ on a rolling window every inference, propagates both uncertainties, and monitors $M_c(t)$ and $b(t)$ for drift as an early warning of catalog/network breakage.


12. Worked illustration

Suppose a rolling window above $M_c = 2.5$ contains $n = 400$ events with mean magnitude $\bar M = 2.93$, reported at $\Delta M = 0.1$. The Aki–Utsu estimate is

$$\hat b = \frac{0.4343}{\bar M - (M_c - \Delta M/2)} = \frac{0.4343}{2.93 - (2.5 - 0.05)} = \frac{0.4343}{0.48} \approx 0.90 .$$

Its standard error follows from §7. The Shi–Bolt form needs the sample spread of the magnitudes, which for exponentially distributed magnitudes is close to $1/\hat\beta = \bar M - (M_c - \Delta M/2) = 0.48$, so $\sigma_{\hat b} \approx 2.30\,\hat b^{2} \times 0.48/\sqrt n \approx 2.30 \times 0.81 \times 0.48 / 20 \approx 0.045$, the same as Aki's $\hat b/\sqrt n = 0.90/20$. Hence $b = 0.90 \pm 0.045$ at one standard error, or about $\pm 0.09$ at two.

Now say a time–space model forecasts an expected $N = 2.0$ events $\ge M_c$ over the next day in a region. The exceedance factor at a target $M^\ast = 5.0$ is

$$\Phi(5.0) = 10^{-b(5.0 - 2.5)} = 10^{-0.90 \times 2.5} = 10^{-2.25} \approx 5.6\times10^{-3},$$

so the expected number $\ge 5$ is $N\Phi = 2.0 \times 5.6\times10^{-3} = 1.1\times10^{-2}$, and the probability of at least one $M\ge 5$ event is

$$P(\ge 1,\ M\ge 5) = 1 - e^{-0.011} \approx 1.1\% .$$

Crucially, the uncertainty on $b$ is amplified in the tail: one standard error ($\pm 0.045$) moves $\Phi(5.0)$ by a factor of $10^{\pm 0.045 \times 2.5} = 10^{\pm 0.11} \approx 1.3$, and the two-standard-error range ($\pm 0.09$) by $10^{\pm 0.22} \approx 1.7$. The probability of the large event therefore swings by almost a factor of two across the 95% range of the b-value alone, even with 400 events. That is exactly why this page insists on estimating, bounding, and propagating $b$ rather than assuming it.
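The propagation step can be reproduced with a few lines of Python. This sketch takes the inputs of the illustration, uses Aki's $\hat b/\sqrt{n}$ from §7 as the standard error, and evaluates the probability at $\hat b$ and at $\hat b \pm 2\sigma_{\hat b}$; the variable names and the choice of a two-sigma band are assumptions, not the product's bound definition.

```python
import math

# Inputs of the worked illustration.
M_C, DELTA_M, N_EVENTS, MEAN_M = 2.5, 0.1, 400, 2.93
EXPECTED_COUNT, M_TARGET = 2.0, 5.0

# Aki-Utsu estimate and Aki's large-sample standard error (section 7).
b_hat = math.log10(math.e) / (MEAN_M - (M_C - DELTA_M / 2.0))
sigma_b = b_hat / math.sqrt(N_EVENTS)


def prob_large_event(b: float) -> float:
    """P(>= 1 event >= M_TARGET) = 1 - exp(-N * 10**(-b*(M* - Mc)))."""
    phi = 10.0 ** (-b * (M_TARGET - M_C))
    return -math.expm1(-EXPECTED_COUNT * phi)


# A lower b means a heavier tail, hence the pessimistic (higher) probability.
band = {
    "pessimistic (b - 2 sigma)": prob_large_event(b_hat - 2.0 * sigma_b),
    "expected (b)": prob_large_event(b_hat),
    "optimistic (b + 2 sigma)": prob_large_event(b_hat + 2.0 * sigma_b),
}

if __name__ == "__main__":
    print(f"b = {b_hat:.3f} +/- {sigma_b:.3f}")
    for label, p in band.items():
        print(f"{label}: {p:.2%}")
```

Evaluating the full probability at the shifted b-values, rather than scaling the central probability by a fixed factor, keeps the bounds inside $(0,1)$ even when the expected count in the tail is not small.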


13. Estimation pipeline (diagram)

flowchart TD
    A[Catalog slice up to issue time t] --> B[Estimate Mc<br/>MAXC + 0.2, cross-check GFT / EMR]
    B --> C{Recent large<br/>mainshock?}
    C -- yes --> D[Use time-dependent Mc of t<br/>post-mainshock incompleteness]
    C -- no --> E[Use rolling Mc]
    D --> F[Select events with M >= Mc]
    E --> F
    F --> G[Aki–Utsu MLE<br/>b-hat = 0.4343 / mean M - Mc - dM/2]
    G --> H[Shi–Bolt sigma_b<br/>+ Aki confidence limits]
    H --> I[Magnitude density<br/>f of M = beta exp - beta M - Mc]
    I --> J[Exceedance factor<br/>Phi of M* = 10^- b M* - Mc]
    J --> K[Feeds conditional intensity<br/>ETAS / Omori-Utsu rate -> probability]
    B --> L[Store Mc grid + b + sigma_b<br/>versioned artifact per daily run]

The output $\Phi(M^\ast)$ and the magnitude density $f(M)$ are consumed by the time–space models (Omori-Utsu-Law, ETAS-Model) and by the evaluation harness; the $M_c$ grid, $b$, and $\sigma_b$ are persisted with every daily snapshot so any past forecast is reproducible.


References

  1. Gutenberg, B. & Richter, C. F. (1944), Frequency of earthquakes in California, Bull. Seismol. Soc. Am. 34(4), 185–188.
  2. Aki, K. (1965), Maximum likelihood estimate of $b$ in the formula $\log N = a - bM$ and its confidence limits, Bull. Earthq. Res. Inst. 43, 237–239.
  3. Utsu, T. (1965), A method for determining the value of $b$…, Geophys. Bull. Hokkaido Univ. 13, 99–103.
  4. Tinti, S. & Mulargia, F. (1987), Confidence intervals of $b$ values for grouped magnitudes, Bull. Seismol. Soc. Am. 77(6), 2125–2134.
  5. Shi, Y. & Bolt, B. A. (1982), The standard error of the magnitude–frequency $b$ value, Bull. Seismol. Soc. Am. 72(5), 1677–1687.
  6. Wiemer, S. & Wyss, M. (2000), Minimum magnitude of completeness in earthquake catalogs: examples from Alaska, the western United States, and Japan, Bull. Seismol. Soc. Am. 90(4), 859–869, doi:10.1785/0119990114.
  7. Woessner, J. & Wiemer, S. (2005), Assessing the quality of earthquake catalogues: estimating the magnitude of completeness and its uncertainty, Bull. Seismol. Soc. Am. 95(2), 684–698, doi:10.1785/0120040007.
  8. Kagan, Y. Y. (2002), Seismic moment distribution revisited: I. Magnitude distribution, Geophys. J. Int. 148(3), 520–541, doi:10.1046/j.1365-246x.2002.01594.x.
  9. Schwartz, D. P. & Coppersmith, K. J. (1984), Fault behavior and characteristic earthquakes: examples from the Wasatch and San Andreas fault zones, J. Geophys. Res. 89(B7), 5681–5698, doi:10.1029/JB089iB07p05681.
  10. Jordan, T. H. et al. (2011), Operational earthquake forecasting: state of knowledge and guidelines for utilization (ICEF Report), Annals of Geophysics 54(4), 315–391, doi:10.4401/ag-5350.

See also: Omori-Utsu-Law · ETAS-Model · Models-Classical · Models-Employed · Evaluation-and-Tests · Glossary
