Releases: StellarIndex/stellar-index

Rates Engine v0.5.0-rc.108

10 Jun 14:10

Pre-release

[v0.5.0-rc.108] — 2026-06-10

Tested against Stellar Protocol 23 (Whisk).

Operator notes:

  • The census + retention-scope completeness fixes take effect once the indexer
    (and the ratesengine-ops binary) are deployed via deploy.yml. Until then
    the live census still records both-zero no-ops.
  • The trades table must have NO retention policy (migration 0031: keep raw
    forever). If timescaledb_information.jobs shows a policy_retention on
    trades, it's drift; remove it with remove_retention_policy('trades').

Added

  • GET /v1/assets/{asset_id}/supply + explorer supply panel (ADR-0034).
    Exposes the live decode-at-ingest supply: Σmint − Σburn − Σclawback from the
    supply_flows lake, current to the latest ledger with no rollup refresh.
    Resolves a Soroban contract id (C…) directly, a classic asset via the
    operator's SAC wrappers (404 if unmapped), and native/XLM from the ledger
    header total_coins (source=ledger_total_coins). Amounts are decimal
    strings (ADR-0003). The API server gains a pooled clickhouse.SupplyReader
    (nil when ClickHouse isn't configured → endpoint 503s; non-fatal at boot).
    The explorer's Supply tab now leads with a live "On-chain supply" section
    (total + mint/burn/clawback breakdown) for every token — not just the
    handful with an ADR-0011 asset_supply_history snapshot — degrading
    gracefully (section omitted) when the endpoint 404s/503s.

  • Real-time per-token supply via decode-at-ingest (ADR-0034). Token supply
    is now a pure SQL sum over a new stellar.supply_flows table instead of a
    periodically-refreshed rollup. The blocker for real-time supply was that the
    amount lives in the event body as a raw i128 XDR scval that ClickHouse can't
    decode — so supply required a 16-min Go batch recompute (ch-supply), stale
    by up to the refresh interval. Now the indexer decodes the i128 amount at
    ingest (DecodeSupplyAmount) for every mint/burn/clawback event and writes
    a decoded row to supply_flows (ReplacingMergeTree, ORDER BY contract_id
    first for fast per-token reads; event-identity suffix → idempotent under the
    lake's drop→heal / re-backfill). The real-time dual-sink feeds it inline, so
    a token's supply (Σmint − Σburn − Σclawback, SupplyForContract) is always
    current with no refresh job and no read-time XDR decode. History is
    seeded once from the existing lake via scripts/ops/ch-supply-flows-seed.sh
    (windowed + resumable wrapper over ch-supply -seed-flows — a single-shot
    all-history seed exceeds the 1h CH read timeout and, lacking an ORDER BY,
    leaves scattered holes; windowing bounds each read); thereafter the dual-sink
    keeps it live. The decode logic is shared between ingest and the seed so both
    produce identical amounts.
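
    The decode-then-sum idea can be sketched as follows. This is a minimal
    illustration, not the project's DecodeSupplyAmount or SupplyForContract:
    the flow type and both function names are hypothetical, and the only
    assumption about the wire format is that an i128 arrives as a signed
    high 64-bit half plus an unsigned low 64-bit half.

```go
package main

import (
	"fmt"
	"math/big"
)

// i128ToBig combines the two halves of an i128 (signed high 64 bits,
// unsigned low 64 bits) into one arbitrary-precision integer.
func i128ToBig(hi int64, lo uint64) *big.Int {
	v := new(big.Int).Lsh(big.NewInt(hi), 64)
	return v.Add(v, new(big.Int).SetUint64(lo))
}

// flow is a hypothetical decoded supply_flows row.
type flow struct {
	Kind   string // "mint", "burn", or "clawback"
	Amount *big.Int
}

// supply returns sum(mint) - sum(burn) - sum(clawback) as a decimal string,
// so no precision is lost to floating point.
func supply(flows []flow) string {
	total := new(big.Int)
	for _, f := range flows {
		switch f.Kind {
		case "mint":
			total.Add(total, f.Amount)
		case "burn", "clawback":
			total.Sub(total, f.Amount)
		}
	}
	return total.String()
}

func main() {
	flows := []flow{
		{"mint", i128ToBig(1, 0)}, // 2^64, too large for a uint64
		{"burn", i128ToBig(0, 5)},
		{"clawback", i128ToBig(0, 10)},
	}
	fmt.Println(supply(flows))
}
```

    Because the amount is stored already decoded, the read side reduces to a
    signed sum, which is why no read-time XDR decode is needed.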

  • ClickHouse Tier-1 raw lake (ADR-0034, migration in progress). New
    columnar storage tier for the OLAP-scale firehose (every ledger/tx/op/
    event), moving it off Postgres where billion-row bulk reprocessing was
    infeasible. Ships the Tier-1 schema (deploy/clickhouse/tier1_schema.sql),
    the internal/storage/clickhouse structural sink + LCM extractor (reuses
    the proven ingest/CensusLedger/sorobanevents.Capture walk; stores raw
    XDR, no SCVal decoding), and the ratesengine-ops ch-backfill command
    (-parallel N for concurrent range-walkers — the historic-backfill
    throughput unlock). The ratesengine-ops ch-gate command runs the §6 gates
    over a backfilled range: it census-walks galexie, asserts the extractor
    matches the decoder-independent census oracle, then reads the range back out
    of ClickHouse and asserts the stored + actual row counts both equal the
    census; it also reports compressed bytes/ledger + a full-history footprint
    projection. Gated: a 100k-ledger sample must pass throughput +
    completeness-vs-census before any full historic walk. See
    docs/architecture/clickhouse-migration-plan.md +
    docs/architecture/clickhouse-tier1-decoder.md +
    docs/architecture/clickhouse-phase4-decoder-adapter.md.

    • Fixed an extractor bug before any full walk: claimAtomCount decoded
      CreatePassiveSellOffer via the wrong OperationResultTr union arm
      (GetManageSellOfferResult, always ok=false for that op type) and
      silently undercounted classic_trade_effect_count vs the census on every
      crossing passive offer. Now uses GetCreatePassiveSellOfferResult,
      matching sdex.decode + dispatcher.census; covered by a new
      per-op-variant test.
  • ADR-0033 — completeness verification model. Three independently
    provable claims (substrate continuity, recognition, projection
    reconciliation) replace threshold-based coverage as the
    100%-confidence signal. See docs/adr/0033-completeness-verification-model.md.

  • ledger_ingest_log substrate-continuity record (ADR-0033 Phase 2).
    Migration 0051. One row per fully-processed ledger, written
    post-persist by the live indexer, carrying the LCM-derived census
    (soroban_event_count, classic_trade_effect_count — counted
    decoder-independently from the LedgerCloseMeta) plus the header
    hash-chain anchors. New ratesengine-ops census-backfill -from -to
    populates history. Storage queries FindLedgerIngestGaps (contiguity)
    and VerifyLedgerHashChain (cryptographic linkage) are Claim 1 of the
    completeness model — both run over the narrow record, never a trades
    scan. Once a ledger is recorded with its census, "zero events for
    contract C here" is a proven quiet period, which is what lets the
    confidence signal stop guessing sparsity thresholds.
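
    A rough sketch of the two Claim 1 checks over a narrow per-ledger record.
    The row type and function names below are hypothetical simplifications,
    not the real FindLedgerIngestGaps / VerifyLedgerHashChain queries (which
    run in storage); the sketch assumes rows arrive sorted by sequence.

```go
package main

import (
	"bytes"
	"fmt"
)

// ingestRow is a hypothetical, simplified ledger_ingest_log row.
type ingestRow struct {
	Seq      uint32
	Hash     []byte // this ledger's header hash
	PrevHash []byte // previous-ledger hash recorded in this header
}

// gap is an inclusive range of missing ledger sequences.
type gap struct{ From, To uint32 }

// findGaps reports every run of missing sequences between consecutive rows.
func findGaps(rows []ingestRow) []gap {
	var gaps []gap
	for i := 1; i < len(rows); i++ {
		if rows[i].Seq > rows[i-1].Seq+1 {
			gaps = append(gaps, gap{rows[i-1].Seq + 1, rows[i].Seq - 1})
		}
	}
	return gaps
}

// firstChainBreak returns the first adjacent ledger whose PrevHash does not
// equal the prior row's Hash. The bool is false when the chain is intact.
func firstChainBreak(rows []ingestRow) (uint32, bool) {
	for i := 1; i < len(rows); i++ {
		adjacent := rows[i].Seq == rows[i-1].Seq+1
		if adjacent && !bytes.Equal(rows[i].PrevHash, rows[i-1].Hash) {
			return rows[i].Seq, true
		}
	}
	return 0, false
}

func main() {
	rows := []ingestRow{
		{Seq: 100, Hash: []byte{0xa1}, PrevHash: []byte{0xa0}},
		{Seq: 101, Hash: []byte{0xa2}, PrevHash: []byte{0xa1}},
		{Seq: 104, Hash: []byte{0xa5}, PrevHash: []byte{0xa4}},
		{Seq: 105, Hash: []byte{0xa6}, PrevHash: []byte{0xff}}, // bad link
	}
	fmt.Println(findGaps(rows))
	fmt.Println(firstChainBreak(rows))
}
```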

  • Recognition check (ADR-0033 Phase 3 / Claim 2a). New
    ratesengine-ops verify-recognition -from -to pulls every distinct
    (contract_id, topic_0_sym) shape from soroban_events and runs each
    through the production decoder chain's real Matches() (no
    hand-maintained topic list to drift). Any shape no decoder handles —
    e.g. a topic a WASM upgrade added that we'd silently drop — is listed
    and the command exits non-zero (cron/CI-gateable). Backed by
    dispatcher.Recognize (side-effect-free), Store.DistinctSorobanTopicSamples,
    and internal/completeness.AuditRecognition.
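
    The recognition audit amounts to running each observed shape through
    every decoder's own match predicate. The sketch below assumes a
    simplified decoder interface; the real Matches() signature, the shape
    type, and the placeholder contract id "CPOOL" are all hypothetical.

```go
package main

import "fmt"

// shape is one distinct (contract_id, topic_0_sym) pair seen in the lake.
type shape struct{ ContractID, Topic0 string }

// decoder is a hypothetical stand-in for a production decoder.
type decoder interface {
	Matches(s shape) bool
}

// topicDecoder recognizes a fixed topic set on one contract.
type topicDecoder struct {
	contract string
	topics   map[string]bool
}

func (d topicDecoder) Matches(s shape) bool {
	return s.ContractID == d.contract && d.topics[s.Topic0]
}

// unrecognized returns every shape that no decoder claims.
func unrecognized(shapes []shape, decoders []decoder) []shape {
	var missing []shape
	for _, s := range shapes {
		handled := false
		for _, d := range decoders {
			if d.Matches(s) {
				handled = true
				break
			}
		}
		if !handled {
			missing = append(missing, s)
		}
	}
	return missing
}

// exampleAudit simulates a contract upgrade that added a "rebalance" topic.
func exampleAudit() []shape {
	decoders := []decoder{
		topicDecoder{contract: "CPOOL", topics: map[string]bool{"swap": true, "deposit": true}},
	}
	shapes := []shape{{"CPOOL", "swap"}, {"CPOOL", "deposit"}, {"CPOOL", "rebalance"}}
	return unrecognized(shapes, decoders)
}

func main() {
	for _, s := range exampleAudit() {
		fmt.Printf("unrecognized: %s %s\n", s.ContractID, s.Topic0)
	}
	// A CLI wrapper would exit non-zero when the list is non-empty.
}
```

    Asking the decoders themselves, rather than keeping a separate topic
    list, is what removes the second source of truth that could drift.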

  • Projection reconciliation (ADR-0033 Phase 4 / Claim 2b). New
    ratesengine-ops verify-reconciliation -from -to [-source S]
    re-derives, per ledger, how many trades rows the real decoder would
    emit from soroban_events (deterministic recomputation) and diffs
    that against the rows actually present — localizing any projector drop
    (or phantom row) to an exact ledger. Covers soroswap/aquarius/phoenix/
    comet (seeds soroswap pairs via RPC). Backed by
    completeness.ReDeriveOutputCounts / ReconcileCounts and
    Store.CountRowsByLedger. Correlation sources reconcile correctly
    because each logical record's events share one (ledger, tx, op).
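
    The diff step can be pictured as a per-ledger comparison of two count
    maps. This is a sketch only: drift and reconcile are hypothetical names,
    not the signatures of completeness.ReconcileCounts or
    Store.CountRowsByLedger.

```go
package main

import (
	"fmt"
	"sort"
)

// drift records one ledger where re-derived and stored counts disagree.
type drift struct {
	Ledger           uint32
	Expected, Actual int
}

// reconcile compares expected rows per ledger (re-derived from raw events)
// with the rows actually stored, returning mismatches in ledger order.
func reconcile(expected, actual map[uint32]int) []drift {
	var out []drift
	for ledger, e := range expected {
		if a := actual[ledger]; a != e {
			out = append(out, drift{ledger, e, a}) // dropped or extra rows
		}
	}
	for ledger, a := range actual {
		if _, ok := expected[ledger]; !ok && a != 0 {
			out = append(out, drift{ledger, 0, a}) // phantom rows
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Ledger < out[j].Ledger })
	return out
}

func main() {
	expected := map[uint32]int{10: 2, 11: 1, 12: 3}
	actual := map[uint32]int{10: 2, 12: 3, 13: 1}
	for _, d := range reconcile(expected, actual) {
		fmt.Printf("ledger %d: expected %d, found %d\n", d.Ledger, d.Expected, d.Actual)
	}
}
```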

  • SDEX / classic reconciliation (ADR-0033 Phase 5 / Claim 2b classic).
    verify-reconciliation now also covers SDEX, which predates Soroban
    and has no soroban_events: its expected count comes from the
    LCM-derived classic_trade_effect_count census in ledger_ingest_log
    (one ClaimAtom = one trade), gated on the substrate record being
    continuous over the range (else it tells you to run census-backfill
    first). The existing hubble-check (per-ledger SDEX-vs-Hubble counts
    + amount cross-check) remains the external defense-in-depth anchor.

  • Completeness watermark verdict (ADR-0033 Phase 6 / headline).
    ratesengine-ops compute-completeness derives the per-source
    completeness WATERMARK, the highest ledger where substrate continuity
    + hash chain (Claim 1) AND projection reconciliation (Claim 2b) both
    hold from genesis, plus a system recognition verdict (Claim 2a), and
    writes them to the new completeness_snapshots table (migration 0052).
    /v1/diagnostics/ingestion overlays completeness_pct /
    completeness_watermark / completeness_complete onto each source
    row, and the status page renders completeness_pct as the headline
    (falling back to gap-free coverage when not yet computed). Unlike
    density/gap_free this uses NO sparsity threshold (a single proven gap
    pins it), so it is the honest 100%-confidence signal. MinGapSizeOverride
    is now documented as alerting-cadence only, off the confidence path.
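
    One way to read "holds from genesis" is as a prefix scan that stops at
    the first ledger failing either claim. The sketch below assumes
    per-ledger verdicts are already available; ledgerCheck and watermark are
    hypothetical names, and how compute-completeness actually derives the
    value is not shown in these notes.

```go
package main

import "fmt"

// ledgerCheck is a hypothetical per-ledger verdict.
type ledgerCheck struct {
	Seq         uint32
	SubstrateOK bool // continuity + hash chain (Claim 1)
	ReconcileOK bool // projection reconciliation (Claim 2b)
}

// watermark returns the highest ledger such that every ledger from genesis
// up to it is present and passes both checks. The bool is false when even
// the genesis ledger fails. checks must be sorted by Seq.
func watermark(genesis uint32, checks []ledgerCheck) (uint32, bool) {
	var wm uint32
	found := false
	next := genesis
	for _, c := range checks {
		if c.Seq != next || !c.SubstrateOK || !c.ReconcileOK {
			break // a single proven gap or drift pins the watermark
		}
		wm, found = c.Seq, true
		next++
	}
	return wm, found
}

func main() {
	checks := []ledgerCheck{
		{50, true, true},
		{51, true, true},
		{52, true, false}, // projector drop here
		{53, true, true},
	}
	fmt.Println(watermark(50, checks))
}
```

    Note that ledger 53 passing does not move the watermark past 52: there
    is no threshold that lets a later clean ledger outvote an earlier gap.
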
  • Projection reconciliation extended to all per-ledger sources +
    multi-output fix (ADR-0033 future work). verify-reconciliation and
    compute-completeness now drive off a shared catalogue covering every
    source that writes a per-ledger table — trades (soroswap/aquarius/
    phoenix/comet), oracles (reflector ×3 / redstone), cctp/rozo/defindex,
    and blend's four tables — plus sdex via the LCM census. The re-derive
    now buckets outputs by EventKind() (ReDeriveOutputCountsByKind +
    SumKinds) and reconciles each table against only the kinds that
    route to it, fixing a latent overcount where multi-output sources
    (soroswap/phoenix/comet also emit skim/liquidity/stake events to other
    tables) were compared whole against trades alone. Recognition gaps
    are now attributed per-source for contract-pinned sources (oracles),
    with a system recognition snapshot for gaps on unowned contracts.
    (sep41/band/soroswap-router remain out of scope — documented in the
    catalogue.) Also chunk-prunes those queries via SorobanEventsTimeBound.

  • Incremental completeness verify + hourly timer (ADR-0033 standing guard).
    compute-completeness gains -from <ledger>: verify only [from, tip],
    trusting [genesis, from] as previously verified (substrate hash-chain,
    recognition shape scan, and projection reconcile all scoped to the window);
    the watermark still extends to tip when the window is clean.
    scripts/ops/completeness-incremental.sh computes from = min(watermark) from the prior
    snapshots, so each run re-checks only new ledgers — minutes, not the hours a
    full genesis→t...

Rates Engine v0.5.0-rc.107

01 Jun 11:38

Pre-release

[v0.5.0-rc.107] — 2026-06-01

Tested against Stellar Protocol 23 (Whisk).

Pre-deploy operator note: aggregator restart picks up new oracle gap-detector targets. coverage_pct values populate after the first 30-min cycle.

Fixed

  • Oracle sources (band, redstone, reflector-dex/cex/fx) now have
    gap-detector targets sliced from the unified oracle_updates
    hypertable. Pre-rc.107 these sources showed n/a on the
    backfill_coverage listing because no per-source target existed.
    Same shape as the rc.104 Soroban-DEX trade targets: shared
    hypertable + per-source WhereFilter. Result: customer-facing
    coverage_pct now populates for ALL Soroban sources with a
    per-source hypertable. defindex + soroswap-router remain n/a
    because they're log-only sinks (no per-ledger hypertable rows
    to scan).

Rates Engine v0.5.0-rc.106

01 Jun 11:28

Pre-release

[v0.5.0-rc.106] — 2026-06-01

Tested against Stellar Protocol 23 (Whisk).

Pre-deploy operator note: api restart picks up the coverage_pct
semantic fix. No migrations.

Fixed

  • coverage_pct now reflects gap-free-ness, not event-density.
    ADR-0031 Phase 2 deprecated the legacy cursor-derived
    coverage_pct and the status page fell back to rendering
    density_pct. density_pct = distinct_ledgers / expected_ledgers
    over [genesis, tip] — for sparse sources (Soroban oracles
    pushing once per hour, low-volume DEXes), density is naturally
    <1% and the UI was reading that as "1% covered". User feedback
    on r1 2026-06-01: that's a misleading metric.
    Fix: coverage_pct = gap_free_pct = 1 - max_gap_ledgers /
    expected_ledgers. 1.0 means the indexer hasn't skipped any
    ledger in this source's window, which is what "coverage" intuitively
    means. Sparse sources hit 100% as long as ingest is healthy.
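
    The two formulas side by side, as a small sketch. The function names
    and the sample numbers are illustrative assumptions; how the gap
    detector measures max_gap_ledgers for a sparse source is outside this
    sketch and is simply taken as an input.

```go
package main

import "fmt"

// densityPct is the old signal: the share of ledgers that carried at least
// one event for the source.
func densityPct(distinctLedgers, expectedLedgers uint64) float64 {
	if expectedLedgers == 0 {
		return 0
	}
	return float64(distinctLedgers) / float64(expectedLedgers)
}

// gapFreePct is the new coverage_pct: 1 - max_gap_ledgers / expected_ledgers.
func gapFreePct(maxGapLedgers, expectedLedgers uint64) float64 {
	if expectedLedgers == 0 {
		return 0
	}
	return 1 - float64(maxGapLedgers)/float64(expectedLedgers)
}

func main() {
	// Hypothetical hourly oracle: events in 1,700 of 1,000,000 ledgers,
	// and no ingest gap reported by the detector.
	const expected = 1_000_000
	fmt.Printf("density:  %.2f%%\n", 100*densityPct(1_700, expected))
	fmt.Printf("gap-free: %.2f%%\n", 100*gapFreePct(0, expected))
}
```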

Rates Engine v0.5.0-rc.105

01 Jun 10:52

Pre-release

[v0.5.0-rc.105] — 2026-06-01

Tested against Stellar Protocol 23 (Whisk).

Pre-deploy operator note: indexer restart picks up the poller skip-doesn't-mean-stale fix.

Fixed

  • ratesengine_external_poller_stale falsely firing on chainlink.
    Live-r1 incident 2026-06-01: chainlink poller
    reports ~36 min stale shortly after every indexer restart,
    even though it's polling correctly every 30s. Root cause:
    the runner's "skipped" branch (when the poller returns
    nil, nil, nil — by convention meaning "polled successfully
    but no new feed data") did NOT update
    ratesengine_external_poller_last_success_unix. Chainlink's
    Ethereum feeds update at most once per hour, so the vast
    majority of its 30-second polls naturally take the skip path.
    The alert read this as "the poller hasn't successfully
    reached upstream in 30+ min" — wrong: the poller IS
    reaching upstream, just finding nothing new.
    Fix: bump LastSuccessUnix on the skipped path too — the
    outcome="skipped" counter still distinguishes skip from
    success, but the timestamp tracks "last time we polled at all"
    not "last time we got an event."
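
    The shape of the fix, sketched with a simplified poll signature. The
    real poller returns three values; here a nil slice with a nil error
    stands in for "polled fine, nothing new", and the runner type, field
    names, and feed name are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// runner tracks poll outcomes and the last time upstream was reached.
type runner struct {
	lastSuccessUnix int64
	outcomes        map[string]int
}

func newRunner() *runner { return &runner{outcomes: map[string]int{}} }

// runOnce records one poll. Both "success" and "skipped" mean upstream was
// reached, so both bump lastSuccessUnix; only the counter tells them apart.
func (r *runner) runOnce(poll func() ([]string, error), nowUnix int64) {
	updates, err := poll()
	switch {
	case err != nil:
		r.outcomes["error"]++
	case len(updates) == 0:
		r.outcomes["skipped"]++
		r.lastSuccessUnix = nowUnix // the fix: skip is not stale
	default:
		r.outcomes["success"]++
		r.lastSuccessUnix = nowUnix
	}
}

// exampleRun polls three times: new data, nothing new, upstream error.
func exampleRun() (lastSuccess int64, skipped int) {
	r := newRunner()
	r.runOnce(func() ([]string, error) { return []string{"ETH/USD"}, nil }, 1000)
	r.runOnce(func() ([]string, error) { return nil, nil }, 1030)
	r.runOnce(func() ([]string, error) { return nil, errors.New("upstream timeout") }, 1060)
	return r.lastSuccessUnix, r.outcomes["skipped"]
}

func main() {
	last, skipped := exampleRun()
	fmt.Println("last success:", last, "skipped polls:", skipped)
}
```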

Rates Engine v0.5.0-rc.104

01 Jun 10:32

Pre-release

[v0.5.0-rc.104] — 2026-06-01

Tested against Stellar Protocol 23 (Whisk).

Pre-deploy operator note: aggregator restart on r1 picks up the new per-source gap-detector targets. Migration 0048 (source_coverage_snapshots) was hand-applied during the incident and needs to be marked at version 48 in schema_migrations (already done on r1).

Fixed

  • Coverage snapshot rows for Soroban-DEX sources.
    ADR-0031 Phase 2 removed the cursor-derived density and
    routed /v1/diagnostics/ingestion's coverage listing through
    source_coverage_snapshots. The gap detector targets covered
    SDEX (via source = 'sdex' WhereFilter on trades) but not the
    Soroban-DEX sources (aquarius, soroswap, phoenix, comet) that
    also land in the unified trades hypertable. Result on r1
    2026-06-01: API reported 0% coverage for all four. Added the
    matching per-source targets with appropriate genesis ledgers
    and 100K-ledger sparsity overrides — matches the SDEX shape.

Rates Engine v0.5.0-rc.103

01 Jun 09:59

Pre-release

[v0.5.0-rc.103] — 2026-06-01

Tested against Stellar Protocol 23 (Whisk).

Pre-deploy operator note: indexer restart picks up 8 workers from 4. No migrations.

Fixed

  • PersistWorkers bumped 4 → 8. rc.102 with 4 workers gave
    ~5 ledgers/min on r1 vs the ~10 ledgers/min network rate;
    doubling the concurrent drain lifts processing throughput above
    the network rate so the live cursor catches up and stays close
    to the SLA-freshness threshold.

Rates Engine v0.5.0-rc.102

01 Jun 09:49

Pre-release

[v0.5.0-rc.102] — 2026-06-01

Tested against Stellar Protocol 23 (Whisk).

Pre-deploy operator note: indexer restart picks up the
4-worker parallel drain. No migrations.

Fixed

  • PersistEvents parallel drain (4 workers). Live-r1 incident
    2026-06-01: even after rc.101's batch-INSERT fix, the indexer
    cursor advanced at ~1 ledger/min vs ~10/min network rate.
    Root cause: the single-goroutine drain meant only one PG
    roundtrip in flight at a time; the indexer's ProcessLedger
    goroutine was blocked on events <- ev waiting for that one
    worker to drain. With 4 worker goroutines sharing the same
    channel (Go's channel semantics handle concurrent receive
    safely), the events channel drains 4× faster; the existing
    PG pool of 25 conns carries the concurrent INSERTs. Each worker
    maintains its own 200-row trade batch + 200ms flush ticker.
    Per-event ordering within a source is not preserved across
    workers; the trades hypertable's PK (source, ledger, tx_hash,
    op_index, ts) makes that irrelevant for correctness.
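
    The pattern is a plain fan-out over one shared channel. A minimal
    sketch, with int events standing in for the real event type; drain and
    countDrained are hypothetical names, not the PersistEvents code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// drain starts `workers` goroutines that all receive from the same channel
// and returns once the channel is closed and fully consumed.
func drain(events <-chan int, workers int, handle func(int)) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for ev := range events { // concurrent receive is safe
				handle(ev)
			}
		}()
	}
	wg.Wait()
}

// countDrained pushes n events through the pool and counts handled ones.
func countDrained(n, workers int) int64 {
	events := make(chan int, n)
	for i := 0; i < n; i++ {
		events <- i
	}
	close(events)
	var handled int64
	drain(events, workers, func(int) { atomic.AddInt64(&handled, 1) })
	return handled
}

func main() {
	fmt.Println(countDrained(1000, 4))
}
```

    Each event is delivered to exactly one worker, but nothing orders the
    workers relative to each other, which is why the write path has to be
    idempotent and order-insensitive.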

Rates Engine v0.5.0-rc.101

01 Jun 09:21

Pre-release

[v0.5.0-rc.101] — 2026-06-01

Tested against Stellar Protocol 23 (Whisk).

Pre-deploy operator note: indexer restart picks up the batched-INSERT trade path. No migrations.

Fixed

  • Trade-insert throughput lifted ~40× via batch INSERT.
    Live-r1 incident 2026-06-01: per-INSERT roundtrip cost capped
    sustained trade throughput at ~5 trades/sec on the live indexer,
    despite PostgreSQL handling 9000+ single-row INSERTs/sec in a raw
    loop (verified). The bottleneck was the serial drain loop in
    pipeline.PersistEvents: one event dequeue → one HandleEvent →
    one InsertTrade roundtrip, no overlap. With ~300 events per
    mainnet ledger, the cap meant ~1.8 ledgers/min processed vs the
    ~10 ledgers/min network rate, accumulating multi-hour lag.

    New Store.BatchInsertTrades writes N rows in one statement
    (INSERT … VALUES (…), (…), … ON CONFLICT DO NOTHING); same
    idempotency, same per-source source_entry_counts UPSERT semantic,
    same TradeInsertOutcomeTotal metrics. PersistEvents now
    buffers trade events up to 200 rows OR 200 ms (whichever first),
    flushes via the batch path, falls back per-row on a batch DB
    error. Non-trade events (oracle updates, supply observations,
    log-only events) stay on the single-row HandleEvent path.
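
    A size-or-time buffer of this kind is commonly written as a select over
    the input channel and a ticker. The sketch below is an assumption about
    the general shape, not the PersistEvents implementation; batchLoop and
    flushSizes are hypothetical names, and the per-row fallback on a batch
    error is left out for brevity.

```go
package main

import (
	"fmt"
	"time"
)

// batchLoop buffers events and flushes when the batch reaches maxRows or
// when the ticker fires, whichever comes first. It returns after the input
// channel is closed and the final partial batch has been flushed.
func batchLoop(in <-chan string, maxRows int, maxWait time.Duration, flush func([]string)) {
	batch := make([]string, 0, maxRows)
	ticker := time.NewTicker(maxWait)
	defer ticker.Stop()

	emit := func() {
		if len(batch) == 0 {
			return
		}
		flush(batch)
		batch = make([]string, 0, maxRows) // fresh slice: flush may retain the old one
	}

	for {
		select {
		case ev, ok := <-in:
			if !ok {
				emit()
				return
			}
			batch = append(batch, ev)
			if len(batch) >= maxRows {
				emit()
			}
		case <-ticker.C:
			emit()
		}
	}
}

// flushSizes feeds n events through batchLoop with a very long ticker so
// only the row limit and channel close trigger flushes.
func flushSizes(n, maxRows int) []int {
	in := make(chan string, n)
	for i := 0; i < n; i++ {
		in <- fmt.Sprintf("trade-%d", i)
	}
	close(in)
	var sizes []int
	batchLoop(in, maxRows, time.Hour, func(rows []string) {
		sizes = append(sizes, len(rows))
	})
	return sizes
}

func main() {
	fmt.Println(flushSizes(450, 200))
}
```

    In a live setting the ticker interval would be the 200 ms bound, so a
    quiet period never leaves a partial batch waiting for the row limit.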

Rates Engine v0.5.0-rc.100

01 Jun 08:29

Pre-release

[v0.5.0-rc.100] — 2026-06-01

Tested against Stellar Protocol 23 (Whisk).

Pre-deploy operator note: aggregator restart picks up the cadence-aware gap-detector. No migrations. After deploy, confirm pg_stat_activity shows no concurrent DISTINCT ledger FROM trades scans accumulating across cycles.

Fixed

  • Gap-detector no longer pile-drives postgres on huge tables.
    Live r1 incident 2026-05-29: three concurrent SELECT DISTINCT ledger
    FROM trades WHERE source='sdex' scans accumulated over
    successive gap-detector cycles because the Go-side ctx timeout
    didn't propagate to PostgreSQL — the queries kept running and
    starved trade-insert latency, lighting the slo_latency_burn
    page. Two complementary fixes:
    1. Per-target ScanCadence override. New
      GapDetectorTarget.ScanCadence lets huge-table targets opt
      into a longer scan cadence than the global 30-min interval.
      SDEX trades and soroban_events now scan every 6 hours; light
      targets keep the 30-min cadence for fast signal.
    2. SQL SET LOCAL statement_timeout backstop.
      CountDistinctLedgers and FindPerSourceLedgerGaps now wrap
      their query in a transaction with a 5-min PG-side timeout.
      If Go-side cancellation fails (the F-0020-cousin failure mode
      we just observed), PostgreSQL itself aborts the query —
      in-flight scans can no longer leak across cycles.
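
    Both fixes can be sketched with the standard library alone. Everything
    here is illustrative: the target struct, the due method, and the exact
    query text are assumptions, and the $1 placeholder presumes a
    PostgreSQL driver that uses that style. SET LOCAL only lasts for the
    enclosing transaction, so the timeout cannot leak into other pool users.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"time"
)

// target is a hypothetical gap-detector target. A zero ScanCadence means
// "use the global interval".
type target struct {
	Name        string
	ScanCadence time.Duration
}

// due reports whether enough time has passed since the last scan.
func (t target) due(sinceLastScan, global time.Duration) bool {
	cadence := t.ScanCadence
	if cadence == 0 {
		cadence = global
	}
	return sinceLastScan >= cadence
}

// countDistinctLedgers runs the scan inside a transaction with a
// server-side timeout, so PostgreSQL aborts the query even if client-side
// cancellation never reaches it.
func countDistinctLedgers(ctx context.Context, db *sql.DB, source string) (int64, error) {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return 0, err
	}
	defer tx.Rollback() // no-op once Commit has succeeded

	if _, err := tx.ExecContext(ctx, "SET LOCAL statement_timeout = '5min'"); err != nil {
		return 0, err
	}
	var n int64
	err = tx.QueryRowContext(ctx,
		"SELECT count(DISTINCT ledger) FROM trades WHERE source = $1", source).Scan(&n)
	if err != nil {
		return 0, err
	}
	return n, tx.Commit()
}

// exampleDue checks a light and a heavy target 45 minutes after their
// last scan, with a 30-minute global interval.
func exampleDue() (light, heavy bool) {
	global := 30 * time.Minute
	elapsed := 45 * time.Minute
	light = target{Name: "oracle_updates"}.due(elapsed, global)
	heavy = target{Name: "trades/sdex", ScanCadence: 6 * time.Hour}.due(elapsed, global)
	return light, heavy
}

func main() {
	light, heavy := exampleDue()
	fmt.Println("light target due:", light, "heavy target due:", heavy)
}
```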

Rates Engine v0.5.0-rc.99

29 May 12:04

Pre-release

[v0.5.0-rc.99] — 2026-05-29

Tested against Stellar Protocol 23 (Whisk).

Pre-deploy operator note: api + ops binary restart. Indexer + aggregator unchanged (projector is opt-in and defaults to off). After api restart, run ratesengine-ops sep1-refresh -older-than 0 once to repopulate every issuer's sep1_payload JSONB column with the new Currencies shape; until then the per-asset overlay fields are empty.

Changed

  • /v1/assets/{id} SEP-1 overlay reads from DB instead of live
    HTTPS. Pre-rc.99 the asset-detail handler called
    metadata.Cache.Resolve(home_domain) on every uncached request,
    which dominated p95 (~4s long tail on cold issuers — drove the
    slo_latency_burn_medium page 2026-05-29 11:30). The handler now
    reads the issuers.sep1_payload JSONB column populated by the
    ratesengine-ops sep1-refresh cron, which is what /v1/issuers
    already did. The sep1-refresh cron is extended to persist
    Currencies (per-asset metadata) so the overlay's Name /
    Description / Image / AnchorAsset fields stay populated on the
    next cron run.
  • ADR-0029, ADR-0031, ADR-0032 promoted to Accepted. Phase 6
    of the projection-architecture rollout completes the
    documentation contract — three ADRs now describe the single
    writer per data domain (projector for Soroban-derived, direct
    for trades), the single data-derived coverage signal, and the
    raw soroban_events landing zone they share. CLAUDE.md gains
    Invariant 7 ("One writer per data domain") summarising the
    contract for future agents.

Added

  • ADR-0032 Phase 5 — projector-replay operator subcommand.
    Single SQL cursor-rewind:
    ratesengine-ops projector-replay -source <name> -from <ledger>.
    The projector goroutine catches up on its next cycle (≤ 5 s)
    and re-projects forward to the live tip. Replaces the family of
    *-backfill subcommands deleted in this release. New
    projector-replay runbook captures the new operator flow.

Removed

  • ADR-0032 Phase 5 — dead-code deletion. Removed eight
    redundant ratesengine-ops subcommands (~1500 LoC):
    cctp-backfill, rozo-backfill, soroswap-skim-backfill,
    comet-liquidity-backfill, phoenix-backfill, blend-backfill,
    sep41-transfers-backfill, drain-cascade-window. All replaced
    by projector-replay + the projector goroutine. Also removed
    the cascade-window-drain runbook (superseded by
    projector-replay). Runbook + alert references updated.

Changed

  • ADR-0032 Phase 4: projector becomes sole writer for
    Soroban-derived events. New [ingestion.projector] persist_per_source
    knob (default true = Phase 3 parallel mode); flipping to
    false switches the dispatcher's events-goroutine to
    pipeline.SinkModeSkipProjected so it stops writing the
    Soroban-derived event subset. The projector becomes single
    writer-of-record for trades, blend_*, phoenix_*,
    comet_*, soroswap_skim, cctp_events, rozo_events,
    sep41_*, oracle_updates (reflector + redstone). Non-projected
    events (sdex, external CEX/FX, band, supply-observer
    LedgerEntry observations) continue through the events-goroutine
    unchanged. New pipeline.IsProjectedEvent is the dispatch
    contract — table-driven test pins it.

Added

  • ADR-0032 Phase 3 — projector scaffold in parallel mode. New
    internal/projector component tails soroban_events (the
    ADR-0029 raw-event landing zone) and invokes each protocol's
    existing Go decoder, then routes decoded consumer.Events
    through pipeline.HandleEvent (newly exported) to the same
    per-source persisters the dispatcher uses. Phase 3 runs in
    parallel with the dispatcher's existing per-source sinks — both
    writers race for the same per-source PKs and ON CONFLICT DO
    NOTHING absorbs duplicates, so projector lag versus the live
    tip can be measured before Phase 4 flips the writer primary.
    New [ingestion.projector] enabled config knob defaults to off;
    cmd/ratesengine-indexer/main.go wires + drains the goroutine
    on shutdown.
  • Projector observability. Four new metrics
    (ratesengine_projector_lag_ledgers, _runs_total,
    _events_decoded_total, _cycle_duration_seconds) plus a
    paired alert (ratesengine_projector_lag_high +
    ratesengine_projector_error_rate_high, both P3) and the
    projector-lag runbook.