Rates Engine v0.5.0-rc.60
Pre-release [v0.5.0-rc.60] - 2026-05-20
Added
- ratesengine-ops trim-galexie-archive operator (#7
implementation step 2b: second half of ADR-0027 §Step 2).
DESTRUCTIVE subcommand that deletes LCM files from the local hot
tier (galexie-archive MinIO) whose entire ledger range is below
the operator-specified --older-than-ledger N, after verifying
upstream presence in the cold tier. Five-layer safety stack:
(1) --dry-run is the default when neither --dry-run nor
--commit is set; actual deletion requires an explicit
--commit; (2) --verify-upstream is the default: every
candidate is HEAD'd against cold before being marked for
deletion, and --no-verify-upstream is a documented escape hatch
for restore-from-backup workflows; (3) --max-files caps
deletions per run (default 100000), so a typo cannot trim the
full archive in one shot; (4) --older-than-ledger is
required (no implicit cutoff); (5) the cold tier MUST be
configured (the command refuses to run otherwise, because trim
without a cold fallback is unrecoverable data loss). Rollback is
mechanical:
ratesengine-ops rehydrate-galexie-archive -from N -to N
re-fetches from cold. Deletion uses per-object DeleteObject (vs bulk
DeleteObjects) so a partial failure leaves a clear position
cursor; the operator re-runs --dry-run to see what's left.
Promotes aws-sdk-go-v2/{aws,config,credentials,service/s3}
from transitive to direct dependencies (already in our tree
via go-stellar-sdk), needed because the SDK's
datastore.DataStore interface lacks a Delete method. Tests
cover the safety primitives (default verify-upstream, default
no-commit, --commit opt-in, uint32 overflow guard) and
splitBucketPath (SDK-compatible bucket/prefix parsing). The
full ADR-0027 §Step 2 is now complete (rehydrate from §2a +
trim from §2b); §Steps 3-5 (flag-flip in r1's TOML, first bulk
trim, monthly cadence) are operator-gated and follow.
- ratesengine-ops rehydrate-galexie-archive operator (#7
implementation step 2a: first half of ADR-0027 §Step 2).
Non-destructive subcommand that copies LCM files for a ledger
range from the configured cold tier (storage.s3_cold_*) back
into the local hot tier (the storage.s3_bucket_archive MinIO
bucket). Idempotent via PutFileIfNotExists: files already
present in hot are skipped, not refetched. -dry-run reports
the file list + skipped / would-copy / missing-in-cold counts
without writing. Use cases: recover from accidental trim,
pre-warm hot before a planned backfill, cold-tier integrity
spot check (the missing_in_cold counter surfaces files that
genuinely never landed upstream). Refuses to run when cold tier
isn't configured. The path-enumeration logic uses the SDK's
DataStoreSchema.GetObjectKeyFromSequenceNumber and steps by
LedgersPerFile so each schema-aligned file is visited once;
a defensive fallback handles LedgersPerFile == 0 (a
malformed schema would otherwise infinite-loop). Tests cover:
alignment of -from down to file boundary, no-duplicates,
zero-LPF fallback, single-LPF (the Galexie default), flag
parsing (4 cases). The destructive trim operator (the
second half of §Step 2) follows in a separate commit; it
needs delete capability which the SDK's datastore.DataStore
doesn't expose, so it'll wire AWS SDK v2's s3.Client
directly.
- StorageConfig cold-tier fields + LedgerstreamConfig wires
them (#7 implementation step 1c). New TOML fields in
[storage]: s3_cold_endpoint, s3_cold_region,
s3_cold_bucket_archive, s3_cold_access_key_env,
s3_cold_secret_key_env. All default to empty, so every
pre-ADR-0027 deployment continues to use the legacy
single-source path byte-for-byte. StorageConfig.ColdTieringEnabled()
returns true iff s3_cold_bucket_archive is set (the
LCM_TIER_ENABLED=false of ADR-0027 §Step 1 expressed as a
field presence). pipeline.LedgerstreamConfig populates
ledgerstream.Config.ColdDataStore when tiering is enabled
and the caller is reading the archive bucket; the live
bucket (galexie-live) is the rolling near-tip working set
authored locally and is never tiered. Tests cover the
no-cold-tier default, the cold-tier-archive path, the
cold-tier-skipped-for-live-bucket guard, and the
ColdTieringEnabled truth table. ADR-0027 §Step 1 is now
complete in code; §Steps 2-5 (trim + rehydrate operators,
flag-flip on r1, bulk trim, monthly cadence) follow as
separate commits.
- ledgerstream.Stream gains an opt-in tiered read path
(#7 implementation step 1b). Config learns a new optional
ColdDataStore datastore.DataStoreConfig field; when set,
Stream constructs a TieredDataStore wrapping
the hot (Config.DataStore) + cold (Config.ColdDataStore)
underlying stores, builds a BufferedStorageBackend directly
on top, and drives the LCM iteration with a loop that mirrors
the SDK's ingest.ApplyLedgerMetadata shape: same bounded /
unbounded validation, same max(2, range.From) clamp, same
GetLedger-per-ledger sequence, same WithMetrics wrap when a
registry is provided. When ColdDataStore is zero-valued
(the default), the legacy single-source path through
ingest.ApplyLedgerMetadata is used unchanged, which keeps it
backward compatible with every existing caller. This satisfies ADR-0027
§Sequencing step 1 ("Land the dual-source read path behind a
LCM_TIER_ENABLED=false flag"); the flag here is the
presence/absence of ColdDataStore rather than a separate
bool. Operator-facing config wiring (parsing the cold-tier TOML
block + populating cfg.ColdDataStore) is the next step.
- ledgerstream.TieredDataStore: two-tier datastore.DataStore
fallback chain (#7 implementation step 1). Satisfies the SDK's
datastore.DataStore interface; composes a hot + cold
underlying store. Reads try hot first, fall through to cold on
IsNotFound errors only; transient errors (network timeouts,
auth failures, throttling) propagate immediately so a
misconfigured hot endpoint surfaces as the operator's problem
rather than being masked by a slow cold path that always
succeeds. Writes (PutFile, PutFileIfNotExists) target hot
exclusively (cold is read-only by design: production cold is
aws-public-blockchain, the AWS Open Data Sponsorship bucket).
ListFilePaths unions hot + cold with hot-wins dedup so a
backfill spanning the tier boundary sees every partition.
Optional Prometheus metrics: ratesengine_ledgerstream_tier_read_total{outcome="hot"|"cold"|"both_missing"}
and ratesengine_ledgerstream_cold_read_duration_seconds. Not
yet wired into ledgerstream.Stream's Config; that integration
is the next step (still behind the planned LCM_TIER_ENABLED
feature flag per ADR-0027 §Sequencing).
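The hot-then-cold read rule of TieredDataStore can be illustrated with a minimal sketch. It does not use the SDK: Store, mapStore, tiered, ErrNotFound and ErrTimeout are hypothetical names invented for this example, and the real datastore.DataStore interface has more methods. The outcome strings reuse the metric label values listed above.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinels standing in for the SDK's not-found check and for
// any transient failure (timeout, auth, throttling).
var (
	ErrNotFound = errors.New("object not found")
	ErrTimeout  = errors.New("request timed out")
)

// Store is a deliberately tiny read interface used only in this sketch.
type Store interface {
	Get(key string) ([]byte, error)
}

// mapStore is an in-memory Store; when failure is set every Get returns it.
type mapStore struct {
	objects map[string][]byte
	failure error
}

func (m mapStore) Get(key string) ([]byte, error) {
	if m.failure != nil {
		return nil, m.failure
	}
	data, ok := m.objects[key]
	if !ok {
		return nil, ErrNotFound
	}
	return data, nil
}

// tiered reads hot first and consults cold only when hot reports not-found.
type tiered struct {
	hot, cold Store
}

// Get returns the data plus an outcome label ("hot", "cold", "both_missing").
// The label is empty when a non-not-found error is propagated.
func (t tiered) Get(key string) ([]byte, string, error) {
	data, err := t.hot.Get(key)
	if err == nil {
		return data, "hot", nil
	}
	if !errors.Is(err, ErrNotFound) {
		// Transient or auth failure: surface it instead of masking it with cold.
		return nil, "", err
	}
	data, err = t.cold.Get(key)
	if err == nil {
		return data, "cold", nil
	}
	if errors.Is(err, ErrNotFound) {
		return nil, "both_missing", err
	}
	return nil, "", err
}

func main() {
	hot := mapStore{objects: map[string][]byte{"a": []byte("hot-a")}}
	cold := mapStore{objects: map[string][]byte{"a": []byte("cold-a"), "b": []byte("cold-b")}}
	store := tiered{hot: hot, cold: cold}
	for _, key := range []string{"a", "b", "c"} {
		_, outcome, err := store.Get(key)
		fmt.Println(key, outcome, err)
	}
}
```

Returning early on anything other than not-found is the point of the design: a broken hot endpoint fails loudly rather than silently turning every read into a slow cold read.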
Docs
- docs/operations/lcm-cache-tiering.md: operator runbook for
ADR-0027 §Steps 3-5 (#7 implementation companion).
Step-by-step playbook for the operator-gated transition: TOML flag-flip
(Step 3), first bulk trim with chunked 1M-ledger invocations +
per-chunk pool monitoring (Step 4), and the monthly cadence
caveat (Step 5: timer not yet shipped, pending an
--older-than-duration mode that resolves tip at run time).
Includes pre-flight checklist, cutoff-ledger computation
formula (TIP - 90 × 17280), rollback playbook via the
rehydrate operator, and a "common failure modes" catalogue
(cold tier check fails, cold.Exists warnings, pool capacity
rise during trim, indexer cold.GetFile errors). Metrics
reference points operators at
ratesengine_ledgerstream_tier_read_total{outcome=...} and
ratesengine_ledgerstream_cold_read_duration_seconds for
real-time visibility.
- ADR-0027 (Proposed): LCM cache tiering — local
galexie-archive as hot, aws-public-blockchain as cold (#7
design pass). R1's ZFS pool is at 93% (12.5 TB used, 1.35 TB
free) with the 2026-05-17 SEV showing what structural-tight
headroom costs. The biggest single tier-able lever is the
4.96 TB data/minio dataset (mostly galexie-archive's
genesis→tip LCM mirror); the AWS Open Data Sponsorship publishes
the same data at sub-15ms for in-region readers and ~80 ms
per-GET (amortised over 64-ledger partitions) for r1. ADR proposes a
90 d hot window in local galexie-archive with cold reads
falling back to AWS, a HEAD-verify-before-delete trim operator
(ratesengine-ops trim-galexie-archive), and a five-step
rollout that lands the dual-source read path under a feature
flag before any deletion happens. Recovers ~3-4 TB at the 90 d
cutoff, unblocking #30 (composite index on the 2.7B-row
trades hypertable) and #35 (the SEV-frozen Soroban-era
backfill resume). History-archive offload + galexie-live
promotion-cadence tuning + PostgreSQL chunk retention beyond
current policy are explicitly out of scope as separate ADRs.
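As a rough illustration of the runbook's cutoff formula (TIP - 90 × 17280) and the verify-before-delete rule, the sketch below computes a cutoff and plans a trim over an in-memory file list. It is not the operator's code: archiveFile, cutoffLedger, planTrim and the inCold callback are hypothetical, and skipping (rather than aborting on) a candidate that is missing from cold is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// ledgersPerDay is the constant from the runbook formula TIP - 90 × 17280.
const ledgersPerDay = 17280

// cutoffLedger returns tip - days*ledgersPerDay. It rejects windows that reach
// back past the start of the chain, where the uint32 subtraction would wrap.
func cutoffLedger(tip, days uint32) (uint32, error) {
	window := uint64(days) * ledgersPerDay
	if window >= uint64(tip) {
		return 0, errors.New("hot window is not smaller than the tip ledger")
	}
	return tip - uint32(window), nil
}

// archiveFile is a hypothetical description of one LCM file in the hot tier.
type archiveFile struct {
	Key        string
	LastLedger uint32 // highest ledger sequence stored in the file
}

// planTrim picks the files a run may delete: the whole file must sit below
// cutoff, cold must already hold the key, and at most maxFiles are selected.
func planTrim(files []archiveFile, cutoff uint32, maxFiles int, inCold func(key string) bool) []string {
	var plan []string
	for _, f := range files {
		if len(plan) >= maxFiles {
			break
		}
		if f.LastLedger >= cutoff {
			continue // part of the file is still inside the hot window
		}
		if !inCold(f.Key) {
			continue // never delete what cold cannot serve back
		}
		plan = append(plan, f.Key)
	}
	return plan
}

func main() {
	cutoff, err := cutoffLedger(60_000_000, 90)
	if err != nil {
		panic(err)
	}
	fmt.Println("cutoff:", cutoff) // cutoff: 58444800

	cold := map[string]bool{"a": true, "c": true}
	files := []archiveFile{
		{Key: "a", LastLedger: 58_000_063},
		{Key: "b", LastLedger: 58_000_127}, // missing upstream
		{Key: "c", LastLedger: 59_000_000}, // still hot
	}
	plan := planTrim(files, cutoff, 100000, func(k string) bool { return cold[k] })
	fmt.Println("would delete:", plan) // would delete: [a]
}
```

Doing the window arithmetic in uint64 before narrowing keeps a small tip or a large day count from wrapping into a cutoff near the top of the uint32 range.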
Changed
- buildPoolsQuery reads from pools_per_source_1h (#25 phase
2). Replaces three trades-hypertable scans (vol_24h CTE,
last_px DISTINCT-ON CTE, outer FROM trades) with a single CAGG
scan + GROUP BY. The XLM-fallback semantics for unpriced trades
are preserved exactly (priced trades contribute their stored
usd_volume; trades with an XLM leg fall back to
base_amount × XLM/USD or quote_amount × XLM/USD;
pure-token-token unpriced trades contribute 0; pre-#25 returned NULL, but
the handler scan collapsed NULL and "0" identically, so
client-visible behaviour is unchanged). Trade-off: last_trade_at lags by
up to one CAGG refresh interval (5 min); acceptable for a pools
discovery surface. After this commit ships, #23's
CachedMarketsReader SWR layer becomes a latency nicety rather
than load-bearing: refresh fills stop paying the 8-30s
trades-scan cost. Integration test bootstrap force-refreshes the new
CAGG alongside prices_1m. Operator note: the CAGG was
created WITH NO DATA in migration 0036; the 5-minute policy
only refreshes the last 7 days. Run
CALL refresh_continuous_aggregate('pools_per_source_1h', NULL, NULL)
once on r1 after the 0036 migration applies, to backfill the
14d-window's historical buckets so /v1/pools sees the full pool
set immediately rather than ramping up over a week.
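The XLM-fallback rule described above can be read as a three-way choice per trade. The sketch below expresses it in Go for illustration only: the trade struct and usdContribution are hypothetical, the real logic lives in SQL over the CAGG columns, and float64 is used for brevity where production code would more likely keep decimal values.

```go
package main

import "fmt"

// trade is a hypothetical, simplified row; the field names are illustrative.
type trade struct {
	USDVolume   *float64 // nil when the trade carried no stored usd_volume
	BaseIsXLM   bool
	QuoteIsXLM  bool
	BaseAmount  float64
	QuoteAmount float64
}

// usdContribution returns what one trade adds to a pool's USD volume:
// the stored value when priced, the XLM leg times XLM/USD when one leg is
// XLM, and 0 for unpriced token-token trades.
func usdContribution(t trade, xlmUSD float64) float64 {
	switch {
	case t.USDVolume != nil:
		return *t.USDVolume
	case t.BaseIsXLM:
		return t.BaseAmount * xlmUSD
	case t.QuoteIsXLM:
		return t.QuoteAmount * xlmUSD
	default:
		return 0
	}
}

func main() {
	priced := 12.5
	trades := []trade{
		{USDVolume: &priced},
		{BaseIsXLM: true, BaseAmount: 100},
		{QuoteIsXLM: true, QuoteAmount: 8},
		{BaseAmount: 3, QuoteAmount: 4}, // token-token, unpriced
	}
	total := 0.0
	for _, t := range trades {
		total += usdContribution(t, 0.25)
	}
	fmt.Println(total) // 39.5
}
```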
Added
- Per-source pools continuous aggregate: pools_per_source_1h
(#25, migration 0036). The durable backing for /v1/pools.
Pre-#25 the handler's buildPoolsQuery scanned the full trades
hypertable for ts >= NOW() - 24h grouped by (source, base, quote),
measured at 8-30s; #23 wrapped it in SWR (sub-ms warm,
~8s cold first hit). This CAGG pre-aggregates per
(source, base_asset, quote_asset, 1h bucket):
sum_usd_priced, sum_base_unpriced / sum_quote_unpriced
(Phase-1 vs needs-XLM-fallback splits), trade_count, and
last(quote_amount/base_amount, ts) for the per-pool latest
price. Refresh policy every 5 minutes covering the last 7 days
(over-refresh tolerates late-arriving backfilled trades — the
#38 router/defindex run is the current example). Storage:
~3-4M rows steady-state (~hundreds of MB) — small enough to keep
no retention so operators can later widen the window past 24h.
Handler refactor to read from the CAGG ships in a follow-up
commit (this commit lands the migration first so the CAGG can
materialize cleanly before the handler depends on it). After
refactor, #23's SWR becomes a latency nicety rather than
load-bearing.
Changed
- node-exporter consolidation: legacy → Debian package (#33).
The pre-#33 state was two units fighting over :9100: a hand-rolled
node_exporter.service (custom unit + /usr/local/bin/node_exporter,
2024 binary) running, and the apt-installed
prometheus-node-exporter.service perpetually failing because
the port was taken. Cut over to the Debian package live on r1:
configured /etc/default/prometheus-node-exporter with the
legacy's exact flags (--collector.systemd --collector.processes --collector.textfile.directory=/var/lib/node_exporter/textfile_collector,
which preserves every existing textfile metric: archive_completeness.prom,
sla_probe.prom, galexie_archive_tip_lag.prom), stopped + disabled
the legacy unit, restarted the package. Live-verified: 13 textfile
metric lines visible, 3127 node_* metrics serving,
prometheus-node-exporter no longer in the failed-unit list.
Codified in Ansible (10-observability.yml): apt-install the
package, template ARGS=, enable, idempotent stop+disable of any
pre-existing legacy unit. Legacy unit file + binary deliberately
retained for zero-downtime rollback (systemctl stop prometheus-node-exporter && systemctl start node_exporter).
- /v1/diagnostics/ingestion pregenerated server-side (#16).
Background goroutine Server.StartIngestionSnapshotRefresh
builds the full ingestion-diagnostics snapshot every 15 s into
an atomic.Pointer[ingestionSnapshotEntry]; the handler reads
the atomic and writes it sub-ms instead of the previous ~417 ms
inline build (7 parallel DB-filler goroutines + post-fillers
coverage projection). Inline build remains as the cold-start
fallback (the atomic is nil until the first refresh fires).
Cadence (15 s) matches the existing Cache-Control: max-age=15
header. Refresh uses a detached context.Background()-derived
ctx (//nolint:gosec,contextcheck; intentional, since the parent is
the api process lifetime, not any request). Launched alongside
the existing prewarmCaches goroutine in cmd/ratesengine-api.
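The pregenerated-snapshot pattern described above (background refresh into an atomic pointer, inline build as the cold-start fallback) might look roughly like the following. This is a sketch under assumptions, not the server's code: snapshot, server, build, refreshOnce, startRefresh and serve are hypothetical names, and the build counter exists only to make the behaviour observable.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// snapshot is a hypothetical stand-in for ingestionSnapshotEntry.
type snapshot struct {
	Body []byte
}

type server struct {
	snap   atomic.Pointer[snapshot] // nil until the first refresh
	builds atomic.Int64             // counts expensive builds, for illustration
}

// build stands in for the expensive snapshot build.
func (s *server) build() *snapshot {
	n := s.builds.Add(1)
	return &snapshot{Body: []byte(fmt.Sprintf(`{"build":%d}`, n))}
}

// refreshOnce builds a snapshot and publishes it with a single atomic store.
func (s *server) refreshOnce() { s.snap.Store(s.build()) }

// startRefresh republishes on every tick until ctx is cancelled.
func (s *server) startRefresh(ctx context.Context, every time.Duration) {
	go func() {
		ticker := time.NewTicker(every)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				s.refreshOnce()
			}
		}
	}()
}

// serve returns the pregenerated body, or builds inline on a cold start.
func (s *server) serve() []byte {
	if cur := s.snap.Load(); cur != nil {
		return cur.Body
	}
	return s.build().Body
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	srv := &server{}
	fmt.Println(string(srv.serve())) // cold start: built inline
	srv.startRefresh(ctx, 10*time.Millisecond)
	time.Sleep(50 * time.Millisecond)
	fmt.Println(string(srv.serve())) // normally served from the published snapshot
}
```

Because readers only ever load a fully built pointer, the request path needs no lock and never observes a half-written snapshot.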
Added
- galexie-archive tip-lag alert (#31) — defense-in-depth for
#26. Adds a Prometheus textfile-collector metric
(galexie_archive_tip_lag_ledgers and friends) computed every
5 min by galexie-archive-tip-lag.{service,timer} running
/usr/local/bin/galexie-archive-tip-lag. The accompanying alert
pages (ratesengine_galexie_archive_tip_lag_severe) within hours
if the hourly galexie-archive-fill.timer silently breaks: the
exact failure class that let #26 go undetected for 23 days.
Rules added to BOTH deploy/monitoring/rules/galexie-archive.yml
and configs/prometheus/rules.r1/galexie-archive.yml (wave-96
dual-dir). Runbook at
docs/operations/runbooks/galexie-archive-tip-lag.md. Codified
in Ansible (07-galexie.yml: copy script + install
.j2-templated unit + enable timer). Live on r1 (current lag
9,388 ledgers — well below the warn threshold of 5,000 sustained
for 30 min). Three alert variants: _high (P3, warn 5 k for
30 m), _severe (P1, page 50 k for 30 m), _metric_stale (P3,
the metric file hasn't refreshed in 30 m — the alert canary).
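For readers unfamiliar with the textfile-collector pattern, the sketch below shows one way a lag gauge could be rendered and written so that node_exporter never reads a partial file. It is not the shipped /usr/local/bin/galexie-archive-tip-lag script: renderTipLag and writeAtomically are hypothetical, the HELP text is invented, and defining lag as network tip minus archive tip (clamped at zero) is an assumption. Only the metric name comes from the entry above.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// renderTipLag formats the lag gauge in the Prometheus text exposition format.
// Lag is assumed to be networkTip - archiveTip, clamped at zero.
func renderTipLag(networkTip, archiveTip uint32) string {
	var lag uint32
	if networkTip > archiveTip {
		lag = networkTip - archiveTip
	}
	return fmt.Sprintf(
		"# HELP galexie_archive_tip_lag_ledgers Ledgers between network tip and archive tip.\n"+
			"# TYPE galexie_archive_tip_lag_ledgers gauge\n"+
			"galexie_archive_tip_lag_ledgers %d\n", lag)
}

// writeAtomically writes body to a temp file in the target directory and
// renames it into place, so a scrape never sees a half-written .prom file.
func writeAtomically(path, body string) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".tip-lag-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // harmless once the rename has succeeded
	if _, err := tmp.WriteString(body); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	// CreateTemp uses mode 0600; widen it so a separate exporter user can read.
	if err := os.Chmod(tmp.Name(), 0o644); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

func main() {
	body := renderTipLag(58_009_388, 58_000_000)
	target := filepath.Join(os.TempDir(), "galexie_archive_tip_lag.prom")
	if err := writeAtomically(target, body); err != nil {
		panic(err)
	}
	fmt.Print(body)
}
```

The temp file deliberately lacks the .prom suffix, so the collector ignores it until the rename publishes the finished file.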