Rates Engine v0.5.0-rc.108
Pre-release [v0.5.0-rc.108] - 2026-06-10
Tested against Stellar Protocol 23 (Whisk).
Operator notes:
- The census + retention-scope completeness fixes take effect once the indexer
(and the ratesengine-ops binary) are deployed via deploy.yml. Until then
the live census still records both-zero no-ops.
- The trades table must have NO retention policy (migration 0031: keep raw
forever). If timescaledb_information.jobs shows a policy_retention on
trades, it's drift: run remove_retention_policy('trades').
Added
- GET /v1/assets/{asset_id}/supply + explorer supply panel (ADR-0034).
Exposes the live decode-at-ingest supply: Σmint − Σburn − Σclawback from the
supply_flowslake, current to the latest ledger with no rollup refresh.
Resolves a Soroban contract id (C…) directly, a classic asset via the
operator's SAC wrappers (404 if unmapped), and native/XLM from the ledger
header total_coins (source=ledger_total_coins). Amounts are decimal
strings (ADR-0003). The API server gains a pooled clickhouse.SupplyReader
(nil when ClickHouse isn't configured → endpoint 503s; non-fatal at boot).
The explorer's Supply tab now leads with a live "On-chain supply" section
(total + mint/burn/clawback breakdown) for every token, not just the
handful with an ADR-0011 asset_supply_history snapshot, degrading
gracefully (section omitted) when the endpoint 404s/503s.
- Real-time per-token supply via decode-at-ingest (ADR-0034). Token supply
is now a pure SQL sum over a new stellar.supply_flows table instead of a
periodically-refreshed rollup. The blocker for real-time supply was that the
amount lives in the event body as a raw i128 XDR scval that ClickHouse can't
decode — so supply required a 16-min Go batch recompute (ch-supply), stale
by up to the refresh interval. Now the indexer decodes the i128 amount at
ingest (DecodeSupplyAmount) for every mint/burn/clawback event and writes
a decoded row to supply_flows (ReplacingMergeTree, ORDER BY contract_id
first for fast per-token reads; event-identity suffix → idempotent under the
lake's drop→heal / re-backfill). The real-time dual-sink feeds it inline, so
a token's supply (Σmint − Σburn − Σclawback, SupplyForContract) is always
current with no refresh job and no read-time XDR decode. History is
seeded once from the existing lake via scripts/ops/ch-supply-flows-seed.sh
(windowed + resumable wrapper over ch-supply -seed-flows: a single-shot
all-history seed exceeds the 1h CH read timeout and, lacking an ORDER BY,
leaves scattered holes; windowing bounds each read); thereafter the dual-sink
keeps it live. The decode logic is shared between ingest and the seed so both
produce identical amounts.
- ClickHouse Tier-1 raw lake (ADR-0034, migration in progress). New
columnar storage tier for the OLAP-scale firehose (every ledger/tx/op/
event), moving it off Postgres where billion-row bulk reprocessing was
infeasible. Ships the Tier-1 schema (deploy/clickhouse/tier1_schema.sql),
the internal/storage/clickhouse structural sink + LCM extractor (reuses
the proven ingest/CensusLedger/sorobanevents.Capture walk; stores raw
XDR, no SCVal decoding), and the ratesengine-ops ch-backfill command
(-parallel N for concurrent range-walkers, the historic-backfill
throughput unlock). The ratesengine-ops ch-gate command runs the §6 gates
over a backfilled range: it census-walks galexie, asserts the extractor
matches the decoder-independent census oracle, then reads the range back out
of ClickHouse and asserts the stored + actual row counts both equal the
census; it also reports compressed bytes/ledger + a full-history footprint
projection. Gated: a 100k-ledger sample must pass throughput +
completeness-vs-census before any full historic walk. See
docs/architecture/clickhouse-migration-plan.md +
docs/architecture/clickhouse-tier1-decoder.md +
docs/architecture/clickhouse-phase4-decoder-adapter.md.
  - Fixed an extractor bug before any full walk: claimAtomCount decoded
    CreatePassiveSellOffer via the wrong OperationResultTr union arm
    (GetManageSellOfferResult, always ok=false for that op type) and
    silently undercounted classic_trade_effect_count vs the census on every
    crossing passive offer. Now uses GetCreatePassiveSellOfferResult,
    matching sdex.decode + dispatcher.census; covered by a new
    per-op-variant test.
- ADR-0033: completeness verification model. Three independently
provable claims (substrate continuity, recognition, projection
reconciliation) replace threshold-based coverage as the
100%-confidence signal. See docs/adr/0033-completeness-verification-model.md.
- ledger_ingest_log substrate-continuity record (ADR-0033 Phase 2).
Migration 0051. One row per fully-processed ledger, written
post-persist by the live indexer, carrying the LCM-derived census
(soroban_event_count, classic_trade_effect_count, both counted
decoder-independently from the LedgerCloseMeta) plus the header
hash-chain anchors. New ratesengine-ops census-backfill -from -to
populates history. Storage queries FindLedgerIngestGaps (contiguity)
and VerifyLedgerHashChain (cryptographic linkage) are Claim 1 of the
completeness model; both run over the narrow record, never a trades
scan. Once a ledger is recorded with its census, "zero events for
contract C here" is a proven quiet period, which is what lets the
confidence signal stop guessing sparsity thresholds.
- Recognition check (ADR-0033 Phase 3 / Claim 2a). New
ratesengine-ops verify-recognition -from -to pulls every distinct
(contract_id, topic_0_sym) shape from soroban_events and runs each
through the production decoder chain's real Matches() (no
hand-maintained topic list to drift). Any shape no decoder handles —
e.g. a topic a WASM upgrade added that we'd silently drop — is listed
and the command exits non-zero (cron/CI-gateable). Backed by
dispatcher.Recognize (side-effect-free), Store.DistinctSorobanTopicSamples,
and internal/completeness.AuditRecognition.
- Projection reconciliation (ADR-0033 Phase 4 / Claim 2b). New
ratesengine-ops verify-reconciliation -from -to [-source S]
re-derives, per ledger, how many trades rows the real decoder would
emit from soroban_events (deterministic recomputation) and diffs
that against the rows actually present, localizing any projector drop
(or phantom row) to an exact ledger. Covers soroswap/aquarius/phoenix/
comet (seeds soroswap pairs via RPC). Backed by
completeness.ReDeriveOutputCounts/ReconcileCounts and
Store.CountRowsByLedger. Correlation sources reconcile correctly
because each logical record's events share one (ledger, tx, op).
- SDEX / classic reconciliation (ADR-0033 Phase 5 / Claim 2b classic).
verify-reconciliation now also covers SDEX, which predates Soroban
and has no soroban_events: its expected count comes from the
LCM-derived classic_trade_effect_count census in ledger_ingest_log
(one ClaimAtom = one trade), gated on the substrate record being
continuous over the range (else it tells you to run census-backfill
first). The existing hubble-check (per-ledger SDEX-vs-Hubble counts + amount
cross-check) remains the external defense-in-depth anchor.
- Completeness watermark verdict (ADR-0033 Phase 6 / headline).
ratesengine-ops compute-completeness derives the per-source
completeness WATERMARK, the highest ledger where substrate continuity +
hash chain (Claim 1) AND projection reconciliation (Claim 2b) both
hold from genesis, plus a system recognition verdict (Claim 2a), and
writes them to the new completeness_snapshots table (migration 0052).
/v1/diagnostics/ingestion overlays completeness_pct /
completeness_watermark / completeness_complete onto each source
row, and the status page renders completeness_pct as the headline
(falling back to gap-free coverage when not yet computed). Unlike
density/gap_free this uses NO sparsity threshold (a single proven gap
pins it), so it is the honest 100%-confidence signal. MinGapSizeOverride
is now documented as alerting-cadence only, off the confidence path.
- Projection reconciliation extended to all per-ledger sources +
multi-output fix (ADR-0033 future work). verify-reconciliation and
compute-completeness now drive off a shared catalogue covering every
source that writes a per-ledger table — trades (soroswap/aquarius/
phoenix/comet), oracles (reflector ×3 / redstone), cctp/rozo/defindex,
and blend's four tables — plus sdex via the LCM census. The re-derive
now buckets outputs by EventKind() (ReDeriveOutputCountsByKind +
SumKinds) and reconciles each table against only the kinds that
route to it, fixing a latent overcount where multi-output sources
(soroswap/phoenix/comet also emit skim/liquidity/stake events to other
tables) were compared whole against trades alone. Recognition gaps
are now attributed per-source for contract-pinned sources (oracles),
with a system recognition snapshot for gaps on unowned contracts.
(sep41/band/soroswap-router remain out of scope; documented in the
catalogue.) Also chunk-prunes those queries via SorobanEventsTimeBound.
- Incremental completeness verify + hourly timer (ADR-0033 standing guard).
compute-completeness gains -from <ledger>: verify only [from, tip],
trusting [genesis, from] as previously verified (substrate hash-chain,
recognition shape scan, and projection reconcile all scoped to the window);
the watermark still extends to tip when the window is clean.
scripts/ops/completeness-incremental.sh computes from = min(watermark) from the prior
snapshots, so each run re-checks only new ledgers: minutes, not the hours a
full genesis→tip sweep takes. It is READ-ONLY on served data (recomputes
completeness_snapshots only) and exits non-zero with the failing source +
range if a source regresses; repair (ch-rebuild over the range) stays a
deliberate action. Wired as ratesengine-completeness.{service,timer} (hourly,
niced). This is the runtime data-driven guard that keeps "verified 100%" true
as the tip advances; it complements the PR-time lint-pk-discriminators.
- lint-pk-discriminators CI guard. A new scripts/ci lint that parses
per-source table PKs and fails the build if a table that can receive multiple
same-key events per operation lacks a per-event discriminator (the coarse-PK
data-loss class); wired into verify.sh + ci.yml. Guards against
reintroducing the silent-drop bug fixed below for trades/blend/defindex.
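
The per-token supply entries earlier in this section describe supply as Σmint − Σburn − Σclawback over decoded i128 amounts, served as decimal strings. A minimal Go sketch of that arithmetic follows, assuming each decoded row carries an event kind plus the signed-high / unsigned-low 64-bit halves of the XDR Int128Parts type. The Flow type and both function names are hypothetical illustrations, not the engine's DecodeSupplyAmount or SupplyForContract.

```go
package main

import (
	"fmt"
	"math/big"
)

// Flow is a hypothetical decoded supply row: the event kind plus the i128
// amount split into its signed high and unsigned low 64-bit halves.
type Flow struct {
	Kind string // "mint", "burn", or "clawback"
	Hi   int64
	Lo   uint64
}

// I128ToBig combines the two halves into hi*2^64 + lo.
func I128ToBig(hi int64, lo uint64) *big.Int {
	v := new(big.Int).Lsh(big.NewInt(hi), 64)
	return v.Add(v, new(big.Int).SetUint64(lo))
}

// NetSupply returns Σmint − Σburn − Σclawback as a decimal string.
// Rows with any other kind are ignored.
func NetSupply(flows []Flow) string {
	total := new(big.Int)
	for _, f := range flows {
		amt := I128ToBig(f.Hi, f.Lo)
		switch f.Kind {
		case "mint":
			total.Add(total, amt)
		case "burn", "clawback":
			total.Sub(total, amt)
		}
	}
	return total.String()
}

func main() {
	flows := []Flow{
		{Kind: "mint", Lo: 1000},
		{Kind: "burn", Lo: 250},
		{Kind: "clawback", Lo: 50},
	}
	fmt.Println(NetSupply(flows)) // prints 700
}
```

Arbitrary-precision integers are used here because an i128 does not fit a native Go integer, and returning a string avoids any float rounding on the way out.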
Changed
- Sources panel shows "Entries 24h" instead of "Trades 24h". The
old column came from a GROUP BY source scan over the trades
hypertable whose error was swallowed — so any timeout under load
silently rendered every source 0, and it was structurally 0 for the
many registered sources that don't write trades (oracles, bridges,
FX). It's replaced by a universal per-source trailing-24h event count
sourced from increase(ratesengine_source_events_total[24h]) (the
same counter that backs active_sources) via a new
StatusBackend.SourceEntries24h: cheap, reliable, and non-zero for
every active source whether on-chain or external. New entries_24h
field on /v1/diagnostics/ingestion sources[]; the silent-VWAP
highlight now keys off it too.
- Status-page on-chain coverage is now honest about what it's
measuring (ADR-0033). A source's coverage figure is only shown as a
trustworthy bar once its completeness watermark (completeness_pct)
has been computed — the substrate+projection-verified signal. Until
then the page falls back to gap_free_pct, a liveness proxy ("no
large interior gap detected") that reads ~100% for sources that are
merely sparse or only partially indexed (e.g. phoenix-liquidity at
18 of 11.3M ledgers). Those unverified figures are now rendered muted
and tagged "unverified · N% gap-free" with an explanatory tooltip,
instead of a green ~100% bar that overstated completeness. Because we
cannot distinguish "sparse-but-complete" from "incomplete" without the
watermark, we never dress an unverified figure up as verified coverage.
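
The fallback rule for the coverage figure can be sketched as a small selection function. This is an illustration only: the CoverageView type, the coverageFor name, and the wording of the verified label are assumptions, and a nil pointer stands in for "completeness_pct not yet computed".

```go
package main

import "fmt"

// CoverageView is a hypothetical view model for one source row.
type CoverageView struct {
	Pct      float64
	Verified bool
	Label    string
}

// coverageFor prefers the completeness watermark figure when it has been
// computed (non-nil). Otherwise it falls back to the gap-free liveness
// proxy, which is always marked unverified.
func coverageFor(completenessPct *float64, gapFreePct float64) CoverageView {
	if completenessPct != nil {
		return CoverageView{
			Pct:      *completenessPct,
			Verified: true,
			Label:    fmt.Sprintf("%.1f%% verified", *completenessPct),
		}
	}
	return CoverageView{
		Pct:      gapFreePct,
		Verified: false,
		Label:    fmt.Sprintf("unverified · %.0f%% gap-free", gapFreePct),
	}
}

func main() {
	// No watermark yet: the proxy is shown, tagged as unverified.
	fmt.Println(coverageFor(nil, 100).Label)

	// Watermark computed: the verified figure wins even if it is lower.
	pct := 87.5
	fmt.Println(coverageFor(&pct, 100).Label)
}
```

Keeping the Verified flag separate from the percentage lets the renderer choose muted styling without re-deriving which signal it was handed.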
Fixed
- Real-time projector CH feed-switch no longer risks silent loss
(ADR-0034 #10). The dual-sink (clickhouse.LiveSink) is best-effort:
it drops whole ledgers under buffer pressure and a flush can partially
fail, so the CH lake can have holes near the tip, and the prior
ch-live-catchup only extended [CH_max+1, tip], which can never re-fill
a hole the sink already wrote past (verified: 48 orphaned ledgers,
[62939016,62939063]). Reading the projector forward from CH with the raw
ledgerstream tip as its bound would skip such holes and lose their protocol
events (the cursor advances unconditionally). Three changes make the
feed-switch safe by construction: (1) Sink.Flush now writes
stellar.ledgers last, making a ledgers row a per-ledger commit marker
(present ⟹ all of that ledger's tables are already durable); (2) the
projector clamps its CH-mode upper bound to ContiguousWatermark (the
highest ledger with no hole below it) so an unhealed drop stalls the
source at the hole instead of skipping it; (3) ch-live-catchup.sh
gap-scans stellar.ledgers and back-fills holes below CH_max, not just
the tip. Net: the lake self-heals and the projector never reads ahead of
provably-complete CH.
  - Also: the no-contract-prefilter DEX/lending projector sources
(soroswap/aquarius/phoenix/comet/blend/cctp/rozo/defindex) now exclude the
CAP-67 classic-token firehose (transfer/mint/burn/clawback/
approve/set_authorized, ~99.8% of all events under V4 meta) at the SQL
layer on both read paths. A caught-up source reads a tiny window so it never
mattered, but a far-behind source's 10k-ledger catch-up window was streaming
~5M firehose rows it only discarded via Decoder.Matches, blowing the 60s
cycle budget and wedging the source (aquarius was stuck ~92k ledgers behind,
deadlock-storming the trades table). Exclude-only and audited lossless:
every one of the eight decoders was checked against the six symbols;
set_admin is deliberately retained because blend dispatches on it.
- trades no longer silently drops multi-trade-per-op trades
(aquarius, comet). The ADR-0033 projection reconciliation found
aquarius emitting 5 trade events in one operation (a multi-pool swap)
but only 2 rows landing: the decoders keyed the row on the raw
op_index, so every trade after the first in an op collided on the
trades PK (source, ledger, tx_hash, op_index, ts) and was dropped
by ON CONFLICT. They now fan out via canonical.FanoutOpIndex(op, event_index)
(op in the high 16 bits, the Phase-1 event_index in the
low 16), matching the stride pattern SDEX already used. Forward fix;
historical collided ops need re-backfill (delete-then-replay) to
recover. All four event-based trade sources are now fanned out:
aquarius/comet by the event's own index, soroswap by the swap
event's index (RawPair.Swap), phoenix by the swap's first-field
event index (RawSwap.EventIndex). Phoenix's 8-field buffer
emits-and-clears on completion, so router multi-hop segments into
separate swaps correctly — it was the same op_index collision, not a
merge (the old "multihops split on op_index naturally" assumption was
wrong).
- soroban_events no longer silently drops events from multi-event
operations. event_index was hardcoded to 0 at capture, so every
contract event in one operation collided on the
(ledger_close_time, ledger, tx_hash, op_index, event_index) PK and
the writer's ON CONFLICT DO NOTHING kept only the first: Phoenix
(8 events per swap in one op) was archiving 1 of 8. A real
event_index is now threaded from the dispatcher's per-op event walk
through events.Event into Capture/Reconstruct, and
StreamSorobanEvents orders by it for deterministic replay. This is
the precondition for using soroban_events as a completeness oracle
(ADR-0033 Phase 1). Note: rows captured before this fix are missing
the collided events; affected ranges need re-backfilling — the
ADR-0033 reconciliation will surface exactly which.
- /v1/markets no longer returns 500 on unparseable trades rows.
A single stray row with base_asset='test' 500ed every markets
request on 2026-06-01, tripping page-tier api_error_rate_critical and
slo_availability_burn_fast until the row was hand-deleted.
The scanner now skips rows whose base/quote fail
canonical.ParseAsset, logs a WARN, and bumps the new
ratesengine_markets_skipped_rows_total counter so operators
can find and remove the offending row without serving 500s to
every consumer.
- SDEX census counts real trades, not both-zero no-op crosses. The
projection census (claimAtomCount) counted EVERY claim atom — including the
both-zero no-op crosses stellar-core emits when an offer is touched in matching
but both legs round to 0 (dust offers / integer-rounding artifacts; ~1–2% of
SDEX claims). The decoder correctly drops those (one-side-zero KEPT), so the
census over-counted vs COUNT(trades), violating its own invariant and
showing a spurious SDEX projection Δ. realTradeCount now mirrors the decoder
exactly (skip both-zero), in both mirrored copies (dispatcher/census.go +
clickhouse/extract.go). Going forward the live census equals the served trade
count; the historical retention window re-records once to match.
- SDEX projection reconcile floors at the actual retained boundary. trades
is drop_chunks-managed, and retentionStart = tip-1.5M is ~100d at the
current ledger rate — ~10d / 150k ledgers below the oldest retained chunk. The
reconcile compared census>0 vs served=0 over that strip, manufacturing a
100%/20% "gap" in the lowest windows for rows retention deliberately dropped.
New store.MinLedger + retentionFloor scope the reconcile to where served
data actually begins; full-history coverage rests on the substrate (ADR-0033).
- blend_positions/blend_emissions/blend_admin/defindex_flows no
longer silently drop multi-event-per-op rows. Same coarse-PK class as the
trades fanout above, on the per-source entity tables: their PKs lacked a
per-event discriminator, so a second same-kind event in one operation collided
on ON CONFLICT and was dropped. Migrations 0053–0055 add event_index (and,
for blend_positions, (asset, user_address)) to the PKs; the decoders +
sinks thread the in-tx event_index through. Forward fix; collided historical
rows recover via re-derive from the lake.
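
The coarse-PK collisions described in this section, and the fan-out that avoids them, can be shown with a small simulation. This is a sketch under stated assumptions: fanoutOpIndex is a hypothetical stand-in for canonical.FanoutOpIndex (whose real signature is not shown here), both inputs are assumed to fit in 16 bits, and rowKey omits the timestamp column for brevity.

```go
package main

import "fmt"

// fanoutOpIndex packs the operation index into the high 16 bits and the
// per-event index into the low 16 bits of one 32-bit value.
// Hypothetical stand-in; the 16-bit input widths are an assumption.
func fanoutOpIndex(op, eventIndex uint16) uint32 {
	return uint32(op)<<16 | uint32(eventIndex)
}

// rowKey mirrors the shape of a coarse primary key (timestamp omitted).
type rowKey struct {
	Ledger  uint32
	TxHash  string
	OpIndex uint32
}

// distinctRows counts how many rows would survive insert-or-skip
// semantics, where a repeated key is silently dropped.
func distinctRows(keys []rowKey) int {
	seen := make(map[rowKey]struct{})
	for _, k := range keys {
		seen[k] = struct{}{}
	}
	return len(seen)
}

func main() {
	const ledger, tx, op = 100, "abc", 3

	var raw, fanned []rowKey
	for ev := uint16(0); ev < 5; ev++ { // five trade events in one operation
		raw = append(raw, rowKey{ledger, tx, op})
		fanned = append(fanned, rowKey{ledger, tx, fanoutOpIndex(op, ev)})
	}

	// Keyed on the raw op index all five collide; fanned out, all survive.
	fmt.Println(distinctRows(raw), distinctRows(fanned)) // prints "1 5"
}
```

Because the operation index sits in the high bits, fanned-out values still sort by operation first and by event within the operation.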