Technical Architecture
The end-to-end system: one global field computed offline, exposed as country views, refined
by tectonic-regime tiling, produced by local-GPU compute, published via git-as-data, and
served by a static web viewer on GitHub Pages at
seismic.fasl-work.com. A single daily job at 03:00 local time
recomputes and republishes the forecast.
Design principle. Heavy compute runs once per day, offline; the runtime is stateless and read-only. There is no server that computes anything on the request path — the "API" is just static JSON assets behind a CDN. This is the offline builder → compact committed artifact → thin static viewer contract, with the single change that the builder runs once per day instead of once-ever.
flowchart LR
subgraph COMPUTE["Always-on GPU laptop (compute host)"]
direction TB
SCHED["Task Scheduler 03:00<br/>scripts/daily"]
FETCH["fetch: ComCat delta<br/>+ regional FDSN + EMSC"]
HYG["hygiene: Mc -> homogenize<br/>-> dual-catalog decluster"]
FIT["regime-tiled ETAS fit/condition<br/>(+ gated GPU neural challenger)"]
SIM["simulate >=10k synthetic catalogs<br/>-> per-cell rate field"]
CAL["calibrate + bounds + QA gate"]
ART["ONE compact H3 artifact<br/>(few hundred KB - few MB)"]
SCHED --> FETCH --> HYG --> FIT --> SIM --> CAL --> ART
end
subgraph GIT["Public repo (GitHub)"]
RES["results/forecast-YYYY-MM-DD.json.gz<br/>results/index.json + manifests + CSEP"]
end
subgraph WEB["Static web (GitHub Pages)"]
PAGES["seismic.fasl-work.com<br/>Vite + React SPA viewer"]
end
ART -->|"scoped git add results/<br/>commit + push"| RES
RES -->|"Pages auto-publishes on push"| PAGES
PAGES -->|"fetch committed JSON<br/>(no backend)"| USER["Browser:<br/>map + charts"]
Three layers, cleanly separated:
- Compute host — Felipe's always-on GPU laptop runs the daily Python job (fetch → hygiene → fit → simulate → calibrate → compact artifact).
- Publishing — the job commits the compact artifact to the public repo (git-as-data). A push is the publish event.
- Serving — GitHub Pages auto-publishes on push; the static SPA fetches the committed JSON directly. No request-path backend exists.
CAOS_SEISMIC computes one global conditional-rate field. A country is not a separate model — it is a view into that single global field. This is a deliberate decision: a region-only model is the opposite of the product's purpose, which is to forecast globally and compare behaviour across tectonic settings.
- One global field, fit on a fine grid. The conditional rate $\lambda$ is computed per cell on a fine grid (~0.1°, fine enough for the CSEP S-test to run) for horizons {1 d, 2 d, 7 d}, then rendered at region granularity in the viewer.
- Country views are slices. A per-country map (and the no-map summary) is a windowed read of the global field, joined with that country's context layers (faults, stations). The same artifact serves every view.
- Inference spans high- and low-seismicity countries on purpose. High-seismicity (Chile, Japan, Indonesia, Mexico, Türkiye, California, New Zealand) and low-seismicity (United Kingdom, Germany, Australia, Brazil) are all in the inference footprint, so that bias toward high-seismicity zones can be measured — an explicit evaluation goal, not an afterthought.
- Coverage is honest. Cells outside the validated footprint render as an explicit "out of coverage" mask. Blank never means safe.
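A minimal sketch of the "country view is a slice" idea follows. It assumes the global field is a mapping from cell centre `(lat, lon)` to an expected count and that a country is approximated by a bounding box. Both are illustrative simplifications (the published artifact is H3-indexed), and every name and number here is hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Cell = Tuple[float, float]                 # (lat, lon) of a cell centre; hypothetical layout
BBox = Tuple[float, float, float, float]   # (lat_min, lat_max, lon_min, lon_max)


def country_view(
    field: Dict[Cell, float],
    validated: Set[Cell],
    bbox: BBox,
) -> Dict[Cell, Optional[float]]:
    """Windowed read of the single global field for one bounding box.

    Cells inside the box but outside the validated footprint map to None,
    so a viewer can draw an explicit "out of coverage" mask instead of
    leaving them blank.
    """
    lat_min, lat_max, lon_min, lon_max = bbox
    view: Dict[Cell, Optional[float]] = {}
    for (lat, lon), rate in field.items():
        if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max:
            view[(lat, lon)] = rate if (lat, lon) in validated else None
    return view


# Illustrative values only, not real forecast output.
global_field = {(-33.45, -70.65): 0.012, (-33.45, -70.55): 0.009, (51.5, -0.1): 0.0001}
validated_cells = {(-33.45, -70.65), (51.5, -0.1)}
chile_view = country_view(global_field, validated_cells, (-56.0, -17.0, -76.0, -66.0))
```

No model is re-run for the country: the view only filters and masks values that already exist in the global field.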
A globally uniform ETAS is wrong: subduction interface, intraslab, crustal, intraplate, and ridge
seismicity obey different productivity and triggering. The global field is therefore conditioned by
tectonic regime and computed per tile, which also avoids the global
flowchart TD
GLOBE["Global catalog + enrichers"] --> TILE["iterate_tiles(globe, tile_deg, halo_deg)"]
TILE --> T1["Tile (interior + halo)"]
T1 --> REG["assign_regime(lat, lon, depth)<br/>via Slab2 / PB2002 (lazy) or heuristic"]
REG --> PRIOR["REGIME_PRIORS (Page et al. 2016)<br/>seed the ETAS MLE start"]
PRIOR --> FITT["Fit ETAS per tile on HALO events<br/>+ smoothed background"]
FITT --> GATE{"Both ETAS stability gates pass?<br/>alpha < beta and n < 1"}
GATE -->|yes| OWN["Route each INTERIOR cell<br/>to its owning tile"]
GATE -->|"no (supercritical/thin)"| FALL["Fall back to smoothed Poisson null<br/>(recorded in params_used)"]
FALL --> OWN
OWN --> CONCAT["Concatenate per-tile expected counts<br/>-> ONE global field"]
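The interior/halo split in the diagram can be sketched as below. This is an assumption-laden illustration, not the project's `iterate_tiles`: the `Tile` class and `iterate_tiles_sketch` are hypothetical, tiles are plain latitude/longitude squares, and antimeridian wrap and the poles are ignored.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class Tile:
    lat0: float   # south edge of the interior
    lon0: float   # west edge of the interior
    size: float   # interior width in degrees
    halo: float   # extra margin in degrees used only for fitting

    def owns(self, lat: float, lon: float) -> bool:
        """Half-open interior, so each point has exactly one owning tile."""
        return (self.lat0 <= lat < self.lat0 + self.size
                and self.lon0 <= lon < self.lon0 + self.size)

    def in_halo(self, lat: float, lon: float) -> bool:
        """Interior plus margin: events here feed the tile's ETAS fit."""
        return (self.lat0 - self.halo <= lat < self.lat0 + self.size + self.halo
                and self.lon0 - self.halo <= lon < self.lon0 + self.size + self.halo)


def iterate_tiles_sketch(tile_deg: float, halo_deg: float) -> Iterator[Tile]:
    """Yield tiles covering the globe (no wrap-around handling)."""
    for i in range(int(180 / tile_deg)):
        for j in range(int(360 / tile_deg)):
            yield Tile(-90 + i * tile_deg, -180 + j * tile_deg, tile_deg, halo_deg)


tiles = list(iterate_tiles_sketch(tile_deg=10.0, halo_deg=2.0))
point = (12.5, 21.0)
owners = [t for t in tiles if t.owns(*point)]      # aggregation: one owner
fitters = [t for t in tiles if t.in_halo(*point)]  # fitting: may be several
```

An event near a tile edge contributes to the fit of every tile whose halo contains it, but its cell is counted once, by the owning tile.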
- Five regimes (subduction interface, intraslab, crustal / strike-slip, intraplate, ridge), assigned per event from Slab2 and the Bird PB2002 plate model (loaded lazily from the enricher store; a self-contained heuristic fallback records its source when the grids are absent).
- Tiles carry a halo. Each tile has an interior (cells it owns for aggregation) and a halo (extra events for edge-correct triggering). Fitting on halo events removes boundary artifacts; routing interior cells to their owning tile prevents double-counting.
- Per-regime priors seed the ETAS MLE start, anchored on Page et al. (2016), the USGS Operational Aftershock Forecasting tectonic-regime study (DOI 10.1785/0120160073).
- Stability gates are enforced per tile. Both ETAS gates, productivity convergence ($\alpha < \beta$, with $\beta = b\ln 10$, so the branching ratio $n$ is finite) and the subcritical condition ($n < 1$), are checked per tile. A supercritical or data-thin tile falls back to its smoothed-seismicity null, recorded in params_used. ETAS remains the calibrated reference any neural challenger must beat.
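The two gates can be illustrated with a small check. The sketch assumes one common ETAS parameterization, which may differ from the project's: productivity $k(m) = A\,e^{\alpha(m - m_0)}$ with normalized temporal and spatial kernels, and an unbounded Gutenberg-Richter magnitude density $\beta e^{-\beta(m - m_0)}$. Under those assumptions the branching ratio is $n = A\beta/(\beta - \alpha)$ when $\alpha < \beta$. The function name and the parameter values are hypothetical.

```python
import math


def etas_gates(A: float, alpha: float, b: float) -> dict:
    """Evaluate both stability gates for one tile's fitted parameters.

    Assumes k(m) = A * exp(alpha * (m - m0)) and an unbounded exponential
    magnitude law, so n = A * beta / (beta - alpha) whenever alpha < beta.
    """
    beta = b * math.log(10)
    converges = alpha < beta                                   # gate 1: n is finite
    n = A * beta / (beta - alpha) if converges else math.inf
    subcritical = n < 1.0                                      # gate 2: n < 1
    return {"beta": beta, "n": n, "use_etas": converges and subcritical}


# Illustrative parameter sets, not fitted values.
stable = etas_gates(A=0.2, alpha=1.8, b=1.0)          # passes both gates
supercritical = etas_gates(A=0.3, alpha=1.8, b=1.0)   # n >= 1: use the Poisson null
divergent = etas_gates(A=0.1, alpha=2.5, b=1.0)       # alpha >= beta: use the Poisson null
```

A tile for which `use_etas` is false would be routed to the smoothed-seismicity null, with the decision recorded alongside its parameters.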
Compute runs on Felipe's always-on GPU laptop, not a shared VPS. The rationale: the output is compute-light (one small artifact/day) but the model benefits from headroom, and a local-first, GPU-bound research compute matches the family's archetype.
- ETAS is CPU, seconds-to-minutes. The reference model needs no GPU; the daily job mostly conditions on a recent window, so it stays a single-digit-minute job.
- The GPU is upside. It accelerates (a) the gated neural challenger — a conditional spatio-temporal Neural Temporal Point Process with a Hawkes inductive bias (additive background + summed triggering, fixed kernels replaced by small networks, magnitude modelled explicitly; a CNN is used only as the spatial-context encoder, never as a standalone aftershock predictor) — and (b) vectorized simulation of the ≥10k (→100k) synthetic catalogs.
- Compute is no longer the ceiling, but the publish bar is fixed. A heavier model reaches the public map only if it beats ETAS in the prospective-CSEP harness (positive information gain with a T-test CI excluding zero) and passes calibration. Otherwise it stays behind a feature flag and ETAS ships. The GPU changes what we can afford to build, train, and test — not the bar for what we publish.
- Portable fallback. A parallel .sh + systemd-timer unit keeps the same job portable to a VPS unchanged if uptime ever needs to improve. The VPS is an optional fallback compute host, never a web backend.
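The publish bar in the list above (positive information gain with a confidence interval excluding zero, plus calibration) can be sketched as a paired comparison in the style of the CSEP T-test. The inputs are each model's log rate at the observed events and each model's total expected count. The function names are hypothetical, and the default critical value 1.96 is a normal approximation that only suits large samples; for small samples the Student-t quantile with N-1 degrees of freedom should be passed instead.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def information_gain_ci(
    log_rate_a: Sequence[float],
    log_rate_b: Sequence[float],
    n_hat_a: float,
    n_hat_b: float,
    t_crit: float = 1.96,
) -> Tuple[float, float, float]:
    """Information gain per earthquake of model A over B, with a CI.

    gain = mean(ln rate_A - ln rate_B at observed events) - (N_A - N_B) / N
    """
    n = len(log_rate_a)
    if n < 2 or n != len(log_rate_b):
        raise ValueError("need two equally long samples with at least 2 events")
    diffs = [a - b for a, b in zip(log_rate_a, log_rate_b)]
    gain = mean(diffs) - (n_hat_a - n_hat_b) / n
    half_width = t_crit * stdev(diffs) / math.sqrt(n)
    return gain, gain - half_width, gain + half_width


def challenger_may_ship(ci_low: float, calibrated: bool) -> bool:
    """Ship only if the whole interval is above zero and calibration passes."""
    return calibrated and ci_low > 0.0


# Illustrative log rates at four observed events, not real scores.
challenger = [-1.0, -1.2, -0.9, -1.1]
etas = [-2.0, -2.1, -2.0, -2.2]
gain, low, high = information_gain_ci(challenger, etas, n_hat_a=4.0, n_hat_b=4.0)
```

If `challenger_may_ship` returns false, the challenger stays behind its feature flag and ETAS remains the published model.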
The forecast is published by committing the compact artifact to the public repo. A push is the publish event; the static host rebuilds from it. Content updates once per day as one small, append-only commit — no per-day PR (a daily data commit is low-conflict; a PR would be noise).
Scoped auto-commit is the #1 safety rule. The job runs on a machine that holds raw data/,
features/, models/, the .venv, and the working .env. The auto-commit is therefore an
explicit allowlist — git add results/ manifests/ — and never git add -A / git add ..
Defense in depth:
| Layer | Control |
|---|---|
| Allowlist | git add results/ manifests/ only |
| .gitignore | blocks data/, features/, models/, .venv/, .env, raw grids/weights |
| Pre-push hook | hard-fails if any path outside the allowlist is staged |
| Credential | a dedicated least-privilege deploy key / PAT scoped to this one repo, in the local credential store only, never committed |
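The hook's core check can be sketched as a pure function over a list of repository-relative paths. This is a hypothetical illustration rather than the project's hook; in practice the list of staged paths could come from `git diff --cached --name-only`.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

ALLOWED_ROOTS = ("results", "manifests")  # mirrors the allowlist in the table above


def paths_outside_allowlist(staged: Iterable[str]) -> List[str]:
    """Return every staged path that does not live under an allowed root."""
    offenders = []
    for path in staged:
        parts = PurePosixPath(path).parts
        # Reject empty paths, foreign top-level directories, and any ".."
        # segment that could escape an allowed root.
        if not parts or parts[0] not in ALLOWED_ROOTS or ".." in parts:
            offenders.append(path)
    return offenders


# Hypothetical staged paths for illustration.
staged_paths = [
    "results/forecast-2025-01-01.json.gz",
    "manifests/forecast-2025-01-01.json",
    ".env",
    "data/raw_catalog.csv",
    "results/../.env",
]
bad = paths_outside_allowlist(staged_paths)
```

A hook built on this would exit non-zero whenever `bad` is non-empty, so a stray `git add -A` cannot reach the public repo.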
Reproducibility under auto-commit. Each daily commit carries the input-state snapshot in its
manifests (catalog version, Mc grid, declustering choice, model + parameter versions, issue
timestamp), so any past forecast is byte-reproducible months later and pseudo-prospective CSEP
scoring is honest. Raw inputs stay local (gitignored), rebuildable from the manifests. See
Pipeline for the manifest schema.
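As an illustration only (the actual schema is defined on the Pipeline page), an input-state snapshot could be modelled as an immutable record with a deterministic digest. The field names below paraphrase the items listed above and are hypothetical, as are the example values.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InputSnapshot:
    catalog_version: str
    mc_grid_version: str
    declustering: str
    model_version: str
    params_version: str
    issued_utc: str  # issue timestamp, ISO 8601


def snapshot_digest(snapshot: InputSnapshot) -> str:
    """SHA-256 over a canonical JSON form (sorted keys, fixed separators)."""
    canonical = json.dumps(asdict(snapshot), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Placeholder values, not a real run.
snap = InputSnapshot("catalog-v1", "mc-v1", "method-a", "etas-v1", "params-v1",
                     "2025-01-01T03:00:00Z")
digest = snapshot_digest(snap)
```

Because the serialization is canonical, two runs with identical inputs give the same digest, and any change in an input version changes it.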
The web app is a pure presentation layer with zero processing backend. It fetches committed static files and renders maps and charts.
| Layer | Choice |
|---|---|
| Frontend | Vite + React + TypeScript SPA |
| i18n | EN-first, then ES (EN is the source of truth) |
| Theme | light / dark; dark-technical palette (bg #0d1117, surface #161b22, accent #58a6ff) |
| Map (Monitoring only, lazy-loaded) | MapLibre GL JS base map + deck.gl overlay (MapboxOverlay, interleaved: true) rendering the H3 probability field |
| No-map charts | Observable Plot / hand-rolled SVG (zero map-library bundle) |
| Data "API" | static asset paths, e.g. /data/latest.json, /data/forecast/{date}.json.gz, /data/calibration.json (never a service) |
| Host | GitHub Pages, custom domain seismic.fasl-work.com |
Page set (six routes): Introduction · The problem · Methodology (tabbed, with LaTeX) · Implementation (pipeline SVG) · Back-analysis (retrospective CSEP) · Monitoring (the live daily field). Routes 1–4 are lightweight; the heavy MapLibre + deck.gl bundle is code-split behind the Back-analysis / Monitoring routes.
Monitoring is honesty-first. The default view is a world probability field (not alarm dots), on a perceptually-uniform sequential colormap with a calibrated numeric legend; a P10 / median / P90 bounds triad; an always-visible horizon selector (1 d / 2 d / 7 d); a mandatory baseline comparison (ratio and absolute expected count); a calibration badge driven by the live CSEP tests (the only place a green/amber/red triad appears — describing model quality, never danger); and a staleness banner plus coverage mask. The client switches horizon / bound / region among arrays already present in the single artifact — no extra round-trips. Why a WebGL base map: an interleaved single-canvas field with legible labels requires WebGL, so the heavy data goes through deck.gl on MapLibre rather than vector layers or a non-WebGL base map.
The full UI specification, banned anti-patterns, and the honest-framing rules live in the dedicated web-app pages of this wiki and the product repository.
The Windows Task Scheduler is the "cron." scripts/daily fires at 03:00 local time with
Run whether user is logged on or not, Wake the computer to run, and a missed-run catch-up
(run on next wake if a fire was skipped).
sequenceDiagram
participant TS as Task Scheduler (03:00)
participant Job as scripts/daily
participant Up as Upstream FDSN (ComCat/EMSC)
participant Repo as Public repo
participant Pages as GitHub Pages
TS->>Job: fire (wake + catch-up)
Job->>Up: fetch updatedafter delta + cross-check
Job->>Job: hygiene (Mc -> homogenize -> dual decluster)
Job->>Job: regime-tiled ETAS condition/refit (+ gated challenger)
Job->>Job: simulate >=10k catalogs -> rate field + bounds
Job->>Job: calibrate + rolling CSEP
Job->>Job: QA gate (freshness, counts, no spike, N-test drift, size)
alt QA pass
Job->>Repo: scoped git add results/ manifests/ -> commit -> push
Repo->>Pages: auto-publish on push
else QA fail
Job-->>Repo: commit nothing; last-good stays up
Note over Pages: staleness banner makes the late/stale forecast obvious
end
Cadence. Daily (cron): fetch → hygiene → condition/refit ETAS on the recent window → simulate → calibrate → write + commit for {1 d, 2 d, 7 d} × {P10, median, P90}. Slower (weekly/monthly or event-triggered): full re-fit / re-training of model parameters and the gated neural challenger on the GPU; CSEP back-analysis refresh. Because the daily job mostly conditions on recent events, it stays a single-digit-minute job.
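One way to read the {P10, median, P90} triad is as percentiles of the simulated per-cell event counts for a given horizon. The sketch below takes that reading as an assumption (the bounds could equally be derived from parameter uncertainty), uses a simple nearest-rank percentile, and invents all names and numbers.

```python
import math
from typing import Dict, Sequence


def nearest_rank(sorted_values: Sequence[float], q: float) -> float:
    """Nearest-rank percentile, q in (0, 100], of an already sorted sequence."""
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def cell_summary(sim_counts: Sequence[int]) -> Dict[str, float]:
    """Summarize one cell and horizon across all synthetic catalogs."""
    if not sim_counts:
        raise ValueError("need at least one simulated catalog")
    ordered = sorted(sim_counts)
    n = len(ordered)
    return {
        "p10": nearest_rank(ordered, 10),
        "median": nearest_rank(ordered, 50),
        "p90": nearest_rank(ordered, 90),
        "expected": sum(ordered) / n,                       # mean count
        "p_any": sum(1 for c in ordered if c >= 1) / n,     # P(at least one event)
    }


# Ten toy catalogs for one cell; a real run would use >= 10k.
summary = cell_summary([0, 0, 1, 0, 3, 0, 0, 1, 0, 0])
```

Storing these arrays for every horizon in the one artifact is what would let the client switch horizon or bound without another fetch.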
Operational QA gate (no silent corrupt publishes). The job does not auto-publish a stale or anomalous artifact. It gates on: fetch succeeded, event counts in range, no duplicate/retracted event near the magnitude threshold, rolling N-test drift within bounds, and a sane artifact size. On failure it commits nothing and leaves the last-good artifact up; the staleness banner does the rest. A product that serves a stale or corrupted artifact is worse than one that says "unavailable" — degrade visibly, never silently.
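The gate can be sketched as a function that returns the reasons a run must not publish. The `RunStats` record is hypothetical and the thresholds are placeholders chosen for illustration, not the project's tuned values.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RunStats:
    fetch_ok: bool
    event_count: int
    suspect_events_near_threshold: int  # duplicates/retractions near the magnitude cut
    n_test_drift: float                 # rolling N-test drift statistic
    artifact_bytes: int


def qa_failures(
    stats: RunStats,
    count_range: Tuple[int, int] = (10, 50_000),              # placeholder bounds
    max_drift: float = 2.0,                                   # placeholder bound
    size_range: Tuple[int, int] = (100_000, 10_000_000),      # placeholder bounds
) -> List[str]:
    """Return every failed check; an empty list means the run may publish."""
    failures = []
    if not stats.fetch_ok:
        failures.append("fetch failed")
    if not count_range[0] <= stats.event_count <= count_range[1]:
        failures.append("event count out of range")
    if stats.suspect_events_near_threshold > 0:
        failures.append("duplicate/retracted event near magnitude threshold")
    if abs(stats.n_test_drift) > max_drift:
        failures.append("N-test drift out of bounds")
    if not size_range[0] <= stats.artifact_bytes <= size_range[1]:
        failures.append("artifact size not sane")
    return failures


good_run = RunStats(True, 1_200, 0, 0.4, 900_000)
bad_run = RunStats(False, 3, 1, 5.0, 12)
```

The job would commit and push only when `qa_failures(...)` is empty; otherwise it commits nothing and the last-good artifact stays up.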
Laptop reliability vs VPS. Sleep, reboots, or a closed lid can miss a run and let the forecast
go stale. Mitigations: Task Scheduler wake + catch-up, and a UI that degrades honestly (the
staleness banner — "generated {UTC} · next run {UTC}" — and the coverage mask make a late forecast
obvious). This is acceptable because the product is an independent research/education tool, not an
official civil-protection alarm. If higher uptime is later required, the portable daily.sh moves
to the VPS unchanged.
| Concern | Resolution |
|---|---|
| Cost | No runtime backend; serving cost scales with bytes, not compute. GitHub Pages hosts the static viewer for free. |
| Simplicity | Pages auto-publishes on push — one less moving part than a VPS pull loop. |
| Honesty | Stateless read-only runtime + immutable issue-time logging make pseudo-prospective CSEP scoring structurally honest. |
| Safety | Scoped allowlist + .gitignore + pre-push hook + least-privilege repo-scoped credential keep secrets and heavy data out of the public repo. |
| Ambition | The local GPU lets us train and validate a stronger model without raising the publish bar (CSEP win + calibration stays fixed). |
| Deploy class | A narrow github-pages exception: static + public + no-backend + git-updated → Pages; anything with a backend or private → VPS. |
See also: Data Sources — the global data layer this system consumes · Pipeline — the versioned pipeline DAG that turns data into the compact artifact.
⚠️ Disclaimer: read this. CAOS_SEISMIC produces probabilistic forecasts, not predictions. It is an independent research and education tool. It is NOT an official earthquake early-warning or civil-protection system, it does NOT predict when, where, or how large an earthquake will be, and it must NOT be used for life-safety, emergency, or evacuation decisions. Every number it publishes is a bounded, calibrated probability conditioned on the present state of seismicity, never an alarm, a countdown, or a "safe" state. A single outcome neither confirms nor refutes a probabilistic forecast. It complements, and does not replace or speak for, official agencies: always follow your national seismological and civil-protection authorities (e.g. USGS, INGV, CSN in Chile (with SENAPRED for civil protection), GeoNet, JMA). The software is provided "as is", without warranty of any kind (MIT License); the authors accept no liability for its use. Data are courtesy of their providers (USGS/ANSS, ISC/ISC-GEM, Global CMT, EMSC, CSN, and others) under their respective licenses and attribution terms. See Honest-Limits for the full epistemic context.
CAOS_SEISMIC · seismic.fasl-work.com · source · MIT
Conditional probabilistic seismic forecasting — forecasts, never predictions.