Rates Engine v0.5.0-rc.58

Pre-release

github-actions released this 19 May 21:34
· 697 commits to main since this release

[v0.5.0-rc.58] — 2026-05-19

Added

  • ratesengine-ops scan-soroban-events — in-infra ground-truth
    event dumper (#28).
    Streams a bounded galexie ledger range and
    prints every Soroban contract event as one JSON line
    (contract_id, decoded topic[], body map keys + value types),
    optionally filtered to topic[0]==STR and/or one contract. A
    catch-all dispatcher.Decoder reuses the dispatcher's
    LCM→events.Event extraction, so it answers "what does protocol
    X actually emit on-chain" without BigQuery (the
    hubble-soroban-events analogue, which needs GCP we don't have).
    No DB writes. Built to unblock the defindex decoder
    re-derivation (#28) — discover real contract addresses + event
    schemas before writing/auditing a decoder — but reusable for the
    whole granular-coverage mission. Bundles into rc.58.

Fixed

  • /v1/issuers spurious 500 under concurrency (#34). A
    client-canceled request mid-ListIssuers surfaces from lib/pq as
    canceling statement due to user request (SQLSTATE 57014). The
    handler checked handlerTimedOut but was missing the
    clientAborted guard the canonical pattern (and
    handleObservations) uses — so a client abort fell through to a
    generic Issuers list failed 500 + ERROR log, polluting the
    5xx rate and SLA availability (it was the sole sla-probe
    SLA-harness blocker post-#32, ~4 % under the probe's concurrency;
    external sequential requests were always 200). Added
    if clientAborted(r, err) { return } as the first error check
    (matches envelope.go's documented ordering). Regression test
    TestHandleIssuersList_ClientAbortedNo500; the existing
    generic-error→500 test still passes (fix is scoped to client
    abort only). go test -race green.

  • galexie-archive 23-day mirror stall — healed + scheduled
    catch-up so it can't silently recur (#26).
    r1's durable
    full-mirror (ADR-0016) had silently fallen ~346 k ledgers /
    23 days behind (held genesis→62,296,694, nothing after
    2026-04-26): the live appender kept galexie-live current but
    the hardened galexie-archive-fill catch-up script (already
    built + installed) was only ever invoked by hand — nothing
    scheduled it. Healed the gap (mc mirror from live, 57.8 GiB,
    all partitions now complete; aws-public-blockchain is the
    durable upstream so nothing was at permanent risk, but the
    full-mirror guarantee was broken and local WASM-walks /
    backfills past the stall failed). Standing fix:
    galexie-archive-fill.{service,timer} (hourly, oneshot,
    root for the local+aws-public mc aliases) — added to
    deploy/systemd/ + Ansible
    (roles/archival-node/templates/systemd/*.j2 +
    tasks/07-galexie.yml), and installed + enabled + test-fired
    on r1 immediately (test run: Result=success, "needs work
    (missing): 0"). A stall is now repaired within ~1 h instead of
    weeks. Defense-in-depth lag alert split to a follow-up.
    Bundles into rc.58.

  • /v1/observations ~8 s → 503 fixed via CachedHistoryReader
    SWR (#29).
    The status page polls
    ?asset=native&quote=fiat:USD every ~2 min; that pair has zero
    direct trades (fiat:USD is an aggregator proxy, never a stored
    quote_asset) yet LatestTradePerSource is an unbounded
    DISTINCT ON (source) … ORDER BY source, ts DESC over the
    2.7 B-row trades hypertable — no time bound → no chunk
    exclusion → ~8 s even for an empty result → the handler's 8 s
    ceiling 503s. New internal/api/v1/history_cache.go
    SWR-caches LatestTradePerSource only (every other
    HistoryReader method passes through), wired at 2 m TTL in
    cmd/ratesengine-api/main.go. Mirrors the proven #22/#23
    pattern with one deliberate change: the cold fill is
    detached (own 30 s budget) so it outlives the 8 s request
    ceiling — the first caller(s) still 503 (bounded by their own
    ctx) but the fill warms the cache out-of-band, so the next poll
    is fast. Zero correctness loss (the exact query result,
    including a legitimate empty slice, is cached). Also corrected
    the HistoryReader.LatestTradePerSource doc, which falsely
    claimed a (base_asset,quote_asset,source,ts DESC) index that
    was never created. The real query-cheapening (create that
    index) is deferred (#30 — multi-GB on a 2.7 B-row hypertable,
    r1 disk-constrained). go test -race green (4 new tests incl.
    the detached-cold-fill-warms case). Bundles into rc.58.

  • defindex decoder re-derived from real on-chain schema —
    was decoding nothing (#28).
    The decoder + its docs/tests were
    written against paltalabs/defindex tag 1.0.0
    (("DeFindexVault",…){depositor, amounts:Vec<i128>,
    df_tokens_minted}); mainnet never deployed that. The watched
    contract addresses run Blend strategy code (deployed WASM
    11329c24…988) and emit ("BlendStrategy","deposit"|"withdraw")
    with body ScvMap{from:Address, amount:i128},
    confirmed from real LCM via the new scan-soroban-events.
    Rewrote
    internal/sources/defindex/{events,decode,dispatcher_adapter,consumer}.go
    to the real schema,
    dispatched by topic across every BlendStrategy emitter (not
    the mislabeled 3-contract set — comet/aquarius shared-emitter
    topology, captures all Blend autocompound instances). Deleted
    the fictional MainnetVault* / MainnetVaultWASMHash / factory
    consts; regenerated tests from the real schema (go test -race
    green, incl. the contract-from case mainnet actually emits).
    BackfillSafe stays false until live-verify on r1 + WASM
    re-audit vs 11329c24…988 (defindex.md "Resolution"). Source
    key kept defindex (rename to blend-strategy deferred —
    product-taxonomy, not correctness). Bundles into rc.58.

  • WASM-history audit: soroswap-router PASS → BackfillSafe: true;
    defindex FAIL → stays gated; defindex genesis corrected (#6, #28).
    The 2026-05-19 r1 wasm-history walk +
    byte-level disassembly resolved both Phase-A router sources.
    soroswap-router: a single immutable WASM hash
    (4c3db3eb...07) over the contract's entire on-chain life
    [50_746_272→tip], zero mid-life upgrades, both decoded
    function exports present, no event surface — BackfillSafe
    flipped true. defindex: audit FAILED — the decoder was
    written against paltalabs/defindex tag 1.0.0 (vault hash
    0f3073...8f3a) but mainnet runs 11329c24...988, whose
    deposit/withdraw topic + body schema differ (the
    DeFindexVault topic and every documented body field are
    absent from the sha256-verified deployed bytes;
    aggregator_exposures is empty on r1, corroborating that live
    defindex decoding matches nothing). BackfillSafe stays
    false; the gate did its job. Re-deriving the decoder from the
    deployed contract is Task #28. Independently,
    sourceGenesisLedger["defindex"] corrected from the
    provisional 51_499_545 to the walk-exact factory first-deploy
    57_056_338 (#10-class precision; orthogonal to the decoder
    fault — an honest genesis makes density read correctly, not
    falsely). Audit logs:
    docs/operations/wasm-audits/{soroswap-router,defindex}.md.
    Bundles into rc.58.

  • CachedCoinsReader single-asset SWR — fixes /v1/assets/{id}
    ~3.9 s (#24).
    The coin-extension path was entirely uncached pass-through:
    every /v1/assets/{id} ran the ~13 s
    whole-asset-universe listCoinsBaseSelect query (its CTEs
    aggregate ALL pairs even for one asset — structural, all CTEs are
    already time-bounded) via GetCoinByAssetID/GetNativeCoinRow,
    plus ~7 more uncached fan-out calls incl. the ~5.8 s
    trades WHERE base OR quote=$1 scan
    (GetCoinTradeCount24h/GetCoinMarketsCount). Added a generic
    swr[T] single-value stale-while-revalidate helper (free
    function — Go methods can't be type-parametric; new swrEntry
    map under the existing mutex) that is the proven, race-clean #22
    fetchRows/refreshRows logic made type-parametric, and wired
    all 9 per-asset single-value coin methods through it
    (GetCoinBySlug/ByAssetID/NativeCoinRow/TopMarkets/
    PriceHistory24h/7d/MarketsCount/ATH/TradeCount24h):
    serve stale instantly, single-flighted request-ctx-independent
    background refresh, keep-stale-on-error, cold-miss blocks,
    ttl<=0 still passes through. Zero correctness loss.
    go test -race clean (new generic-SWR
    serves-stale-under-20-concurrent / single-flight /
    keeps-stale-on-error tests + all
    existing coins tests still green). Deeper follow-up logged
    (asset-filter pushdown into the CTEs so the query itself is
    cheap). Bundles into rc.58.

  • CachedMarketsReader stale-while-revalidate — fixes /v1/pools
    ~8 s cold (#23).
    Post-rc.57 sweep surfaced /v1/pools at
    ~8 s/ok 87 %, caused by buildPoolsQuery's OUTER
    FROM trades … ts>=14 d GROUP BY source,base,quote
    raw-trades enumeration (same disease
    class as #20). Unlike #20 it cannot be query-rewritten —
    verification ruled out every candidate per-source pre-aggregate
    (prices_* collapse source; price_source_contributions is
    curated/sparse — 5 sources/10 pairs, would make /v1/pools
    return ~10 pairs; market_observations doesn't exist). So the
    unavoidable per-source scan is moved off the request path:
    fetchPools/fetchPairs now stale-while-revalidate (serve stale
    immediately on expiry + one single-flighted, request-ctx-
    independent background refresh, keep-stale-on-error, cold-miss
    still blocks) — the exact proven coins_cache.go #22 pattern,
    mirrored. Zero correctness loss (full per-source coverage
    from raw trades preserved). go test -race clean across new
    fetchPools SWR tests (serves-stale-under-20-concurrent,
    single-flight, keeps-stale-on-error); existing cold-path tests
    still green (SWR only changes the expired path). New
    stale/refresh_error markets cache-op outcomes. A per-source
    pools CAGG (so the background query itself is cheap) is a logged
    follow-up. Bundles into rc.58.