Rates Engine v0.5.0-rc.62

Pre-release

github-actions released this 20 May 18:32

aca6016

[v0.5.0-rc.62] — 2026-05-20

Fixed

Prewarm extended to all 7 readers fired by /v1/assets/{id}
(full deferred-#37). The partial fix in rc.61 covered only
GetCoinByAssetID per verified asset; live measurement on r1
post-rc.61 confirmed the canonical-form path still dropped to
2.2s on first hit because the other SIX readers fired by the
handler (GetCoinTopMarkets(id, 5),
GetCoinPriceHistory24h, GetCoinPriceHistory7d,
GetCoinMarketsCount, GetCoinTradeCount24h, GetCoinATH)
cold-filled per request. Subsequent hits served sub-ms only
because the first request populated all 7 SWR slots. Full fix:
new prewarmAssetDetail(ctx, logger, coins, assetID) helper
that calls EVERY one of the 7 readers per verified canonical
asset_id PLUS native. Limit=5 on GetCoinTopMarkets matches
the handler's literal — per the
feedback_prewarm_handler_drift memory, any drift in
args (limit, order, sources) means a different SWR cache key
→ silent miss → the same bug class. Each reader's prewarm
call logs at Debug; transient failures don't block subsequent
readers. Net effect post-deploy: every verified-currency
canonical-form lookup (and native) should land sub-200ms on
FIRST hit, not just subsequent.

(#34 residual). Postgres can issue SQLSTATE 57014
(canceling statement due to user request) from server-side
statement_timeout / lock_timeout /
idle_in_transaction_session_timeout, none of which trip
clientAborted (the HTTP request context is alive) or
handlerTimedOut (the per-call context has not hit its deadline). The
error reached the issuers handler as a bare 57014 and fell
through to a 500. Combined with the sla-probe's threshold
set at exactly 99.0%, a single transient per ~430-sample
burst put availability under threshold and fired
ratesengine_sla_probe_unit_failed_alert. Fix: new
transientStorageErr(err) helper in envelope.go that
classifies the three transient classes (SQLSTATE 57014 not
carried by context cancellation, driver: bad connection
after pool retries exhausted, and EOF / broken-pipe network
blips) and returns 503 with the issuers-transient problem
type. Standard handler ordering preserved: clientAborted →
handlerTimedOut → transientStorageErr → 500. The 503 still
registers as a non-2xx for the sla-probe's availability
metric, so the operator runbook applies: either configure
the probe to count only 5xx-but-not-503 as failures, or
loosen the threshold to 98.5%. The cleaner long-term path is
operator-facing.

Density formula no longer over-credits via interior-gap
bridging (user-reported). Status-page reported 100% density
for soroswap-router and defindex while the #38 historical
backfill was only ~78% through their range. Root cause:
extendWithLiveTail (diagnostics_ingestion.go:1056) was
bridging interior gaps between two backfill intervals whenever
the upper bracket's start ≤ liveTop, on the assumption that
live ingest had walked the gap. That assumption is FALSE for
sources added to enabled_sources after live ingest had
already crossed the gap-end ledger — exactly the case for
soroswap-router + defindex which were enabled at rc.5x while
the live cursor was already at ~62.5M; the interior
[60M, 62.5M] gap got false credit. Fix: remove interior-gap
bridging entirely. Live-tail credit is now head-band only
— from the top of the backfill union up to min(liveTop, tip).
The previously-protected edge case ("disjoint high
gap-backfill island silently capping density at ~96%") becomes
honest under-coverage: operators close such gaps with a
targeted backfill rather than silent live-credit. Density on
defindex/router should now drop from the false 100% to the
honest ~78%-and-rising as #38 progresses. Two existing tests
that locked in the bridging behaviour were updated to the new
honesty-first policy.
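
The head-band-only rule can be illustrated with a small sketch. It is not the code in diagnostics_ingestion.go: the `span` type, inclusive ledger ranges, the `start` parameter, and the choice to give no live credit when there is no backfill at all are assumptions made for the example. What it shows is that interior gaps between backfill intervals earn nothing, and live credit runs only from the top of the backfill union up to min(liveTop, tip).

```go
package main

import "sort"

// span is an inclusive ledger range [lo, hi]; lo <= hi is assumed.
type span struct{ lo, hi uint64 }

func minU(a, b uint64) uint64 {
	if a < b {
		return a
	}
	return b
}

func maxU(a, b uint64) uint64 {
	if a > b {
		return a
	}
	return b
}

// mergeSpans returns the sorted union of the input ranges, joining ranges
// that overlap or are directly adjacent.
func mergeSpans(in []span) []span {
	s := append([]span(nil), in...)
	sort.Slice(s, func(i, j int) bool { return s[i].lo < s[j].lo })
	out := s[:0]
	for _, cur := range s {
		if n := len(out); n > 0 && cur.lo <= out[n-1].hi+1 {
			out[n-1].hi = maxU(out[n-1].hi, cur.hi)
			continue
		}
		out = append(out, cur)
	}
	return out
}

// coveredLedgers counts ledgers in [start, tip] that are credited:
// every backfilled ledger, plus the head band above the backfill union
// up to min(liveTop, tip). Interior gaps are never bridged.
func coveredLedgers(backfill []span, start, liveTop, tip uint64) uint64 {
	if tip < start || len(backfill) == 0 {
		return 0
	}
	merged := mergeSpans(backfill)
	var covered uint64
	for _, s := range merged {
		lo, hi := maxU(s.lo, start), minU(s.hi, tip)
		if lo <= hi {
			covered += hi - lo + 1
		}
	}
	top := merged[len(merged)-1].hi
	head := minU(liveTop, tip)
	if bandLo := maxU(top+1, start); head > top && bandLo <= head {
		covered += head - bandLo + 1
	}
	return covered
}

// density is the credited fraction of [start, tip].
func density(backfill []span, start, liveTop, tip uint64) float64 {
	if tip < start {
		return 0
	}
	return float64(coveredLedgers(backfill, start, liveTop, tip)) / float64(tip-start+1)
}
```

With backfill ranges [0, 59] and [80, 89], a live cursor at 99 and a tip of 99, this credits 60 + 10 backfilled ledgers plus the 10-ledger head band, for 80 of 100. The interior gap [60, 79] stays uncredited until a targeted backfill closes it.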
