Rates Engine v0.5.0-rc.62
Pre-release
[v0.5.0-rc.62] — 2026-05-20
Fixed
- Prewarm extended to all 7 readers fired by /v1/assets/{id}
(full deferred-#37). The partial fix in rc.61 covered only
GetCoinByAssetID per verified asset; live measurement on r1
post-rc.61 confirmed the canonical-form path still dropped to
2.2s on first hit because the other SIX readers fired by the
handler (GetCoinTopMarkets(id, 5),
GetCoinPriceHistory24h, GetCoinPriceHistory7d,
GetCoinMarketsCount, GetCoinTradeCount24h, GetCoinATH)
cold-filled per request. Subsequent hits served sub-ms only
because the first request populated all 7 SWR slots. Full fix:
new prewarmAssetDetail(ctx, logger, coins, assetID) helper
that calls EVERY one of the 7 readers per verified canonical
asset_id PLUS native. Limit=5 on GetCoinTopMarkets matches
the handler's literal: per the
feedback_prewarm_handler_drift memory, any drift in
args (limit, order, sources) means a different SWR cache key
→ silent miss → the same bug class. Each reader's prewarm
call logs at Debug; transient failures don't block subsequent
readers. Net effect post-deploy: every verified-currency
canonical-form lookup (and native) should land sub-200ms on
FIRST hit, not just subsequent.
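The prewarm loop described in this entry might look roughly like the sketch below. It is an illustration under assumptions, not the shipped helper: the Coins interface, the error-only return types, the debugLogger interface, and the fake used in demo are all hypothetical, since the note names the readers but not their signatures. The fake records a key that includes the call arguments, to show why the limit passed to GetCoinTopMarkets has to match the handler's literal 5.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// topMarketsLimit mirrors the handler's literal limit. Any other value would
// map to a different cache key and leave the handler's slot cold.
const topMarketsLimit = 5

// debugLogger is the one logging method this sketch needs; *slog.Logger
// satisfies it.
type debugLogger interface {
	Debug(msg string, args ...any)
}

// Coins is a hypothetical reader interface. The release note names the
// readers but not their signatures or return types.
type Coins interface {
	GetCoinByAssetID(ctx context.Context, id string) error
	GetCoinTopMarkets(ctx context.Context, id string, limit int) error
	GetCoinPriceHistory24h(ctx context.Context, id string) error
	GetCoinPriceHistory7d(ctx context.Context, id string) error
	GetCoinMarketsCount(ctx context.Context, id string) error
	GetCoinTradeCount24h(ctx context.Context, id string) error
	GetCoinATH(ctx context.Context, id string) error
}

// prewarmAssetDetail calls all seven readers for one asset. A failing reader
// is logged at Debug and skipped so later readers still run. It returns the
// number of readers that succeeded.
func prewarmAssetDetail(ctx context.Context, logger debugLogger, coins Coins, assetID string) int {
	steps := []struct {
		name string
		call func() error
	}{
		{"GetCoinByAssetID", func() error { return coins.GetCoinByAssetID(ctx, assetID) }},
		{"GetCoinTopMarkets", func() error { return coins.GetCoinTopMarkets(ctx, assetID, topMarketsLimit) }},
		{"GetCoinPriceHistory24h", func() error { return coins.GetCoinPriceHistory24h(ctx, assetID) }},
		{"GetCoinPriceHistory7d", func() error { return coins.GetCoinPriceHistory7d(ctx, assetID) }},
		{"GetCoinMarketsCount", func() error { return coins.GetCoinMarketsCount(ctx, assetID) }},
		{"GetCoinTradeCount24h", func() error { return coins.GetCoinTradeCount24h(ctx, assetID) }},
		{"GetCoinATH", func() error { return coins.GetCoinATH(ctx, assetID) }},
	}
	warmed := 0
	for _, s := range steps {
		if err := s.call(); err != nil {
			logger.Debug("prewarm reader failed", "reader", s.name, "asset_id", assetID, "err", err)
			continue
		}
		warmed++
	}
	return warmed
}

// fakeCoins records the cache key each successful call would populate and
// fails the reader named in failOn.
type fakeCoins struct {
	keys   []string
	failOn string
}

func (f *fakeCoins) rec(name, args string) error {
	if name == f.failOn {
		return errors.New("transient failure")
	}
	f.keys = append(f.keys, name+"/"+args)
	return nil
}

func (f *fakeCoins) GetCoinByAssetID(_ context.Context, id string) error {
	return f.rec("GetCoinByAssetID", id)
}
func (f *fakeCoins) GetCoinTopMarkets(_ context.Context, id string, limit int) error {
	return f.rec("GetCoinTopMarkets", fmt.Sprintf("%s/%d", id, limit))
}
func (f *fakeCoins) GetCoinPriceHistory24h(_ context.Context, id string) error {
	return f.rec("GetCoinPriceHistory24h", id)
}
func (f *fakeCoins) GetCoinPriceHistory7d(_ context.Context, id string) error {
	return f.rec("GetCoinPriceHistory7d", id)
}
func (f *fakeCoins) GetCoinMarketsCount(_ context.Context, id string) error {
	return f.rec("GetCoinMarketsCount", id)
}
func (f *fakeCoins) GetCoinTradeCount24h(_ context.Context, id string) error {
	return f.rec("GetCoinTradeCount24h", id)
}
func (f *fakeCoins) GetCoinATH(_ context.Context, id string) error {
	return f.rec("GetCoinATH", id)
}

// countingLogger counts Debug calls instead of writing them anywhere.
type countingLogger struct{ n int }

func (c *countingLogger) Debug(string, ...any) { c.n++ }

// demo prewarms "native" while one reader fails.
func demo() (warmed, logged int, keys []string) {
	coins := &fakeCoins{failOn: "GetCoinPriceHistory7d"}
	logger := &countingLogger{}
	warmed = prewarmAssetDetail(context.Background(), logger, coins, "native")
	return warmed, logger.n, coins.keys
}

func main() {
	warmed, logged, keys := demo()
	fmt.Println(warmed, logged, keys)
}
```

In this sketch a failure in one reader costs only that reader's slot; the remaining readers are still called, which is the non-blocking behaviour the entry describes.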
- (#34 residual). Postgres can issue SQLSTATE 57014
(canceling statement due to user request) from server-side
statement_timeout / lock_timeout /
idle_in_transaction_session_timeout, none of which trip
clientAborted (the HTTP request context is alive) or
handlerTimedOut (the per-call context hasn't hit its deadline). The
error reached the issuers handler as a bare 57014 and fell
through to a 500. Combined with the sla-probe's threshold
set at exactly 99.0%, a single transient per ~430-sample
burst put availability under threshold and fired
ratesengine_sla_probe_unit_failed_alert. Fix: new
transientStorageErr(err) helper in envelope.go that
classifies the three transient classes (SQLSTATE 57014 not
carried by context cancellation, driver: bad connection
after pool retries exhausted, and EOF / broken-pipe network
blips) and returns 503 with the issuers-transient problem
type. Standard handler ordering preserved: clientAborted →
handlerTimedOut → transientStorageErr → 500. The 503 still
registers as a non-2xx for the sla-probe's availability
metric, so the operator runbook is: configure the probe to
count 5xx-but-not-503 as failures, OR loosen the threshold
to 98.5%. The cleaner long-term path is operator-facing.
- Density formula no longer over-credits via interior-gap
bridging (user-reported). Status-page reported 100% density
for soroswap-router and defindex while the #38 historical
backfill was only ~78% through their range. Root cause:
extendWithLiveTail (diagnostics_ingestion.go:1056) was
bridging interior gaps between two backfill intervals whenever
the upper bracket's start ≤ liveTop, on the assumption that
live ingest had walked the gap. That assumption is FALSE for
sources added to enabled_sources after live ingest had
already crossed the gap-end ledger — exactly the case for
soroswap-router + defindex which were enabled at rc.5x while
the live cursor was already at ~62.5M; the interior
[60M, 62.5M] gap got false credit. Fix: remove interior-gap
bridging entirely. Live-tail credit is now head-band only:
from the top of the backfill union up to min(liveTop, tip).
The previously-protected edge case ("disjoint high
gap-backfill island silently capping density at ~96%") becomes
honest under-coverage: operators close such gaps with a
targeted backfill rather than silent live-credit. Density on
defindex/router should now drop from the false 100% to the
honest ~78%-and-rising as #38 progresses. Two existing tests
that locked in the bridging behaviour were updated to the new
honesty-first policy.
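The head-band-only rule from the density entry can be sketched as follows. This is an illustration under assumptions, not the code in diagnostics_ingestion.go: the interval type, the coverage function, the inclusive ledger ranges, and the choice of [from, tip] as the denominator are all hypothetical. With no backfill intervals the sketch grants no credit, which is a simplification rather than documented behaviour.

```go
package main

import (
	"fmt"
	"sort"
)

// interval is an inclusive ledger range [lo, hi].
type interval struct{ lo, hi uint64 }

// coverage returns (covered, total) ledgers for the range [from, tip].
// Credit comes from the union of backfill intervals plus one head band that
// runs from just above the top of that union up to min(liveTop, tip).
// Gaps between backfill intervals earn no credit.
// Assumes every interval lies within [from, tip].
func coverage(backfill []interval, from, liveTop, tip uint64) (covered, total uint64) {
	total = tip - from + 1
	if len(backfill) == 0 {
		return 0, total // simplification: no backfill, no credit
	}
	sorted := append([]interval(nil), backfill...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].lo < sorted[j].lo })

	cur := sorted[0]
	for _, iv := range sorted[1:] {
		if iv.lo <= cur.hi+1 { // overlapping or adjacent: extend the run
			if iv.hi > cur.hi {
				cur.hi = iv.hi
			}
			continue
		}
		covered += cur.hi - cur.lo + 1 // close the run; the gap is not credited
		cur = iv
	}
	covered += cur.hi - cur.lo + 1

	head := liveTop
	if tip < head {
		head = tip
	}
	if head > cur.hi { // live-tail credit applies above the union only
		covered += head - cur.hi
	}
	return covered, total
}

func main() {
	// Two backfill islands with an interior gap of 20 ledgers (160..179).
	backfill := []interval{{100, 159}, {180, 189}}
	covered, total := coverage(backfill, 100, 199, 199)
	fmt.Printf("covered=%d total=%d density=%.2f\n", covered, total, float64(covered)/float64(total))
}
```

In the example, the upper island starts at 180, which is below liveTop, so the old bridging rule would have credited the 160..179 gap as well. Under the head-band rule the gap stays uncredited until a targeted backfill closes it.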