Stellar Index v0.5.0-rc.109

Pre-release

@github-actions github-actions released this 17 Jun 09:24
· 159 commits to main since this release

[v0.5.0-rc.109] — 2026-06-17

Changed

  • Full web redesign — a unified light-mode design system across all three
    surfaces (explorer, status, dashboard).
    Modern, minimal, tech-forward:
    Inter + JetBrains Mono (now actually loaded via next/font — they were
    referenced but silently falling back to system-ui), a semantic token system
    (brand / surface / line / ink / up / down / warn / bad / ok), hairline
    borders over heavy shadows, generous whitespace, and one confident blue
    accent. Dark mode removed (light only for now). New shared component
    library (web/explorer/src/components/ui) + style guide
    (docs/architecture/design-system.md + /dev/styleguide). The status page
    is unified with the site UX, and the customer dashboard was fleshed out
    into a real product surface (sidebar shell + Overview/Keys/Usage/Settings on
    live API data). Fixed latent bugs found en route: -DEFAULT-suffixed colour
    classes (generated no CSS) and off-palette chart colours.

Fixed

  • SEP-41 supply: mint + clawback silently dropped post-P23 (data loss).
    sep41_supply.decodeCounterparty read the counterparty from a FIXED topic
    index (mint/clawback → topic[2]) matching the legacy admin-prefixed SAC
    shape. CAP-67 / Whisk (mainnet 2025-09-03) replaced that with
    ["mint", to, sep0011_asset] — counterparty at topic[1], a String at
    topic[2] — so AsAddressStrkey errored on the String and the whole row was
    dropped. r1-lake-verified: 99.96% of recent mints + 100% of clawbacks are the
    CAP-67 shape, all lost; total_supply under-counted for every watched SEP-41
    token. Now shape-aware (topic[2] is an Address ⇒ legacy/topic[2], else
    CAP-67/bare-spec/topic[1]); burn was already correct. The old back-compat
    test passed on a fabricated shape mainnet never emits; replaced with a
    lake-faithful shape matrix. Historical recovery (re-derive from the lake) is
    a deferred operator job. (audit-2026-06-14)
  • Explorer pagination dropped rows at page boundaries. Contract-event and
    account tx/op listings cursored on ledger_seq only, but many rows can share
    one ledger (a busy AMM emits >limit events/ledger), so a page boundary inside
    a ledger silently skipped the remainder. Now a composite keyset cursor
    (opaque next_cursor/cursor, ClickHouse tuple comparison). Ledger listing
    keeps its integer before cursor, which was already correct.
    (audit-2026-06-14, A11)
  • Explorer UI: result_code rendered every op red. The API emits
    result_code as a JSON number (0 = success) but the TS typed it as string
    and regex-tested it; success now derives from === 0. Also: account
    source_account links 404'd (pointed at /issuers/{g}, which static-exports
    only ~100 issuers) — added a /accounts?id= query-param page; and
    total_coins (~1e18 stroops) lost precision through Number() — now
    BigInt-divided (ADR-0003). (audit-2026-06-14, A17)
  • SDK Envelope.Pagination round-trip drift (A14-01). The Go client typed
    Pagination as a value with omitempty — a no-op on a struct — while the
    server uses *Pagination, so re-encoding a non-list response emitted
    "pagination":{} where the server omits it. Changed to *Pagination (matches
    the wire; nil ⇒ absent). Pre-v1 SDK; consumers nil-check before .Next.
  • S3 credential env field corrupted by its own override (A16-01).
    [storage] s3_access_key_env/s3_secret_key_env hold the NAME of the env var
    carrying the credential (buildS3Client does os.Getenv(name)), but
    ApplyEnvOverrides + an env: tag overwrote the name with the env var's VALUE,
    so os.Getenv("AKIA…")→"" silently dropped S3 static creds for the
    trim/rehydrate-galexie-archive ops commands. Removed the override + tag (the
    fields are names with defaults; export STELLARINDEX_S3_ACCESS_KEY=<key> and
    it resolves through the name). Latent (the indexer hot path uses the AWS
    default chain).
  • Generated API reference could silently drift on main (A19-02). The
    spec→rendered-reference sync check was PR-only (path-filtered CI), so a
    direct-to-main push that edited openapi/ without make docs-api slipped a
    stale reference onto main (66 vs 73 paths). Added the diff as a lint-docs.sh
    section so verify.sh catches it pre-push on every commit, and regenerated
    the reference.
  • Projector decode panic could crash-loop the live indexer (X9). The
    projector's per-source goroutine ran decoders on raw lake rows (incl.
    historical/upgraded-WASM shapes) with no recover — the dispatcher path has
    one via pipeline.ProcessLedger, but the projector didn't inherit it. A
    panic on one poison row crashed the whole stellarindex-indexer, and since
    the cursor doesn't advance past the bad row, restart re-read it into a
    crash-loop. Per-row recover now demotes a panic to a counted soft-fail
    (extracted to a unit-tested processEventSafely). (audit-2026-06-14, X9)
  • API-key revocation could silently no-op under the Postgres backend (X6).
    /v1/account/keys (mint/list/revoke) was wired unconditionally to the Redis
    store, but under auth_backend=postgres the runtime validator authenticates
    from Postgres — disjoint stores, so a DELETE here removed the Redis record
    while the live Postgres row kept authenticating (a "revoked" key stays live).
    Latent on r1 (default redis backend, where writer+validator agree). The Redis
    account-keys surface is now disabled under the Postgres backend with a loud
    log; the Postgres-backed /v1/dashboard/keys (invalidates the cache on
    revoke) is the source of truth there. (audit-2026-06-14, X6)
  • Magic-link login could email-bomb an inbox. POST /v1/auth/login sent
    an email per accepted request, bounded only by the global anon per-IP
    rate-limit (60/min) — enough to flood a victim inbox / burn the email-send
    quota. Added an optional LoginThrottle (per-IP + per-target-email Redis
    sliding window, default 10/h IP + 5/h email); over quota the send is skipped
    but the generic 200 is still returned (no enumeration/throttle signal), and a
    Redis blip falls open. (audit-2026-06-14, A12)
  • Migration down of 0031/0040 re-armed retention (data-loss footgun). The
    down migrations re-added
    add_retention_policy('trades'/'oracle_updates', 90 days), the exact
    mechanism of the "rogue retention" drift ADR-0034 forbids; one migrate down
    crossing 31/40 would schedule deletion of >90d raw rows. Both downs are now
    documented no-ops (forward-only). (audit-2026-06-14, A15)
  • Hot hypertables encoded a 1-day chunk interval. trades (and
    soroban_events / blend_auctions / phoenix_*) were created with
    chunk_time_interval => 1 day; trades reached 3445 chunks → per-INSERT
    ON CONFLICT walked all chunks → ~6 inserts/s + lock-table pressure. The r1
    fix was operational (merge_chunks), so a fresh bring-up re-accrued it. New
    migration 0062 widens them to 7 days (affects future chunks only).
    (audit-2026-06-14, A15)
  • k6 99-spike alert silence was a no-op.
    test/load/scenarios/lib/alertmanager.js defaulted to matcher names
    (APIHighLatencyP95/APIHighErrorRate) that match NO deployed alert, so the
    planned-burst silence never applied and on-call would be paged. Fixed to the
    real stellarindex_api_* alert names. (audit-2026-06-14, A20)
  • projector-replay silently no-oped — the rewind called UpsertCursor,
    whose monotonic-forward guard (F-0020) matched zero rows on a backward
    write; the command printed success while the cursor stayed at tip. New
    dedicated RewindCursor store method (backward-only UPDATE; errors on
    missing row) wired into the subcommand. Found when the blend
    TRUNCATE+replay re-derive wrote nothing.
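
The Envelope.Pagination entry above turns on an encoding/json detail: omitempty never omits a struct value, only a nil pointer. A minimal sketch with stand-in types (valueEnvelope, pointerEnvelope, and the string-typed Next field are assumptions for illustration, not the SDK's definitions):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Pagination is a stand-in for the SDK type. The Next field is named in the
// entry above; its string type is an assumption.
type Pagination struct {
	Next string `json:"next,omitempty"`
}

// valueEnvelope mirrors the old client shape: omitempty has no effect on a
// struct value, so the key is always emitted.
type valueEnvelope struct {
	Pagination Pagination `json:"pagination,omitempty"`
}

// pointerEnvelope mirrors the server shape: a nil pointer is omitted.
type pointerEnvelope struct {
	Pagination *Pagination `json:"pagination,omitempty"`
}

// encode marshals v and returns the JSON text.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(valueEnvelope{}))   // {"pagination":{}}
	fmt.Println(encode(pointerEnvelope{})) // {}

	// With the pointer form, consumers nil-check before reading Next.
	env := pointerEnvelope{}
	if env.Pagination != nil {
		fmt.Println(env.Pagination.Next)
	}
}
```

The pointer form costs callers one nil check but lets a decoded non-list response re-encode to the same bytes the server sent.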

Added

  • Network explorer (ADR-0038) — a read API + UI over the certified
    ClickHouse Tier-1 lake: GET /v1/ledgers, /ledgers/{seq}/transactions,
    /tx/{hash}, /operations, /contracts/{c}, /accounts/{g}/transactions +
    /operations, and /search. Classic XDR is decoded to clean JSON
    (internal/xdrjson, amounts as strings per ADR-0003) and the served reads
    use the lake's bloom skip-indexes (tx_hash, source_account, contract_id).
    Next.js static-export UI: ledger / tx / contract / account pages + ⌘K
    search. Account activity is sourced/submitted scope only (participant index
    is Phase B/C). The two /accounts/{g}/* paths ship with OpenAPI
    AccountTransactions/AccountOperations schemas (scope, next_cursor).
  • GET /v1/coverage — public per-source completeness verdicts
    (ADR-0033): the three claims (substrate/recognition/projection), the
    verified-to watermark, and the headline complete boolean, served from
    completeness_snapshots. The trust story as an API: consumers can audit
    the "every protocol, verified complete" claim themselves. Feeds the
    explorer Coverage center.
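
The explorer listings page with an opaque next_cursor, which the pagination fix describes as a composite keyset cursor. A minimal in-memory sketch of that idea, assuming a hypothetical (ledger_seq, index) row key and a base64-encoded JSON token; the real cursor layout and the ClickHouse query are not given in these notes:

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// Row is a hypothetical listing row. (LedgerSeq, Index) is assumed unique and
// rows are served newest-first.
type Row struct {
	LedgerSeq uint32
	Index     uint32
}

// Cursor is the composite keyset position of the last row on a page.
type Cursor struct {
	LedgerSeq uint32 `json:"l"`
	Index     uint32 `json:"i"`
}

// Encode renders the cursor as an opaque URL-safe token.
func (c Cursor) Encode() string {
	b, _ := json.Marshal(c) // cannot fail for two integers
	return base64.RawURLEncoding.EncodeToString(b)
}

// DecodeCursor parses a token produced by Encode.
func DecodeCursor(s string) (Cursor, error) {
	var c Cursor
	b, err := base64.RawURLEncoding.DecodeString(s)
	if err != nil {
		return c, err
	}
	err = json.Unmarshal(b, &c)
	return c, err
}

// before is the tuple comparison (ledger_seq, index) < (c.l, c.i).
func before(r Row, c Cursor) bool {
	if r.LedgerSeq != c.LedgerSeq {
		return r.LedgerSeq < c.LedgerSeq
	}
	return r.Index < c.Index
}

// Page returns up to limit rows strictly after the cursor, plus the token for
// the next page ("" when exhausted). rows must be sorted descending.
func Page(rows []Row, token string, limit int) ([]Row, string, error) {
	start := 0
	if token != "" {
		c, err := DecodeCursor(token)
		if err != nil {
			return nil, "", err
		}
		for start < len(rows) && !before(rows[start], c) {
			start++
		}
	}
	end := start + limit
	if end > len(rows) {
		end = len(rows)
	}
	page := rows[start:end]
	next := ""
	if end < len(rows) && len(page) > 0 {
		last := page[len(page)-1]
		next = Cursor{LedgerSeq: last.LedgerSeq, Index: last.Index}.Encode()
	}
	return page, next, nil
}

func main() {
	// Five events share ledger 100; a page size of 2 splits that ledger.
	rows := []Row{{100, 4}, {100, 3}, {100, 2}, {100, 1}, {100, 0}, {99, 0}}
	token, total := "", 0
	for {
		page, next, err := Page(rows, token, 2)
		if err != nil {
			panic(err)
		}
		total += len(page)
		if next == "" {
			break
		}
		token = next
	}
	fmt.Println(total) // 6
}
```

A cursor on ledger_seq alone would resume at ledger 99 after the first page and skip the remaining three rows of ledger 100; the second tuple element is what keeps them.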

Changed

  • Rebrand: Rates Engine -> Stellar Index (ADR-0037; the same-day
    interim name "Stellar Atlas", ADR-0036, was found taken and never
    shipped durably). Module path
    github.com/StellarIndex/stellar-index; binaries stellarindex-*; env
    vars STELLARINDEX_*; Prometheus namespace stellarindex_*; domain
    stellarindex.io. Repositioned as a protocol explorer for the Stellar
    network (pricing API remains a flagship product) evolving toward a
    comprehensive blockchain explorer. Historical archives (ADRs 0001-0035,
    discovery, audits, dated entries below) intentionally keep the old name.
    Migration plan + r1 cutover: docs/operations/stellar-index-migration.md.

Removed

  • BREAKING (API/SDK, SemVer-major): cross-chain / multi-network asset wire
    shapes removed — the public API + Go SDK are now Stellar-only.
    Part of the Stellar-focus refactor (docs/architecture/stellar-focus-refactor-plan.md,
    Unit D / Tier 3). Removed: the GlobalAssetView.networks[] array,
    VerifiedCurrencyListItem.networks[] + network_count, the NetworkView
    and PerNetworkAssetView schemas/types, the GET /v1/assets/{asset_id}/{network}
    per-network drill-down route, and the ?network= query param on /v1/assets.
    The verified-currency catalogue (internal/currency/data/seed.yaml) is now a
    pure Stellar-asset trust registry: every non-Stellar networks: entry was
    stripped, so each browseable entry carries at most one (stellar) network
    entry. Reference-only coins (BTC/ETH/…/USDT) keep their coingecko_id /
    coinmarketcap_id mappings — the proposal-scoped divergence/aggregator
    reference-price pipeline is unaffected. Pre-v1, no production consumers.
  • Cross-chain market-cap cache (internal/currency/marketcap) removed. The
    CoinGecko-backed presentation-only cache (and its refresher goroutine + the
    MarketCaps server option + the /v1/diagnostics/ingestion market_cap
    state section) populated a CMC-style market_cap_usd for non-Stellar coins.
    It was never read by divergence/aggregate. Catalogue crypto/stablecoin
    rows no longer carry a catalogue-level market cap (their per-Stellar-asset F2
    fields on /v1/assets/{asset_id} remain the canonical source). The legit
    Stellar-native market cap (AssetDetail.market_cap_usd, circulating supply ×
    price) and the fiat M2 × FX market cap are unchanged.

Fixed

  • ledgerstream: a bounded range of exactly one ledger is valid. The
    tiered-path range validation rejected To() == From(), but the SDK models
    a single-ledger bounded range as a first-class concept
    (ledgerbackend.SingleLedgerRange) and the walk loop handles it as one
    iteration. Practical impact: ch-live-catchup's tip-extend failed every
    time its 10-minute timer fired exactly one ledger behind the galexie tip
    (ch-backfill: invalid end value for bounded range — ~half of r1 runs
    flapped red on 2026-06-11). Inverted ranges (To < From) are still
    rejected.
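
The range rule in the ledgerstream entry above can be sketched as follows. validateBounded and walk are hypothetical names for illustration, not the package's functions; only the rule itself (To == From is valid, To < From is not) comes from the entry:

```go
package main

import (
	"errors"
	"fmt"
)

var errInvalidEnd = errors.New("invalid end value for bounded range")

// validateBounded accepts any inclusive range with from <= to, so a range of
// exactly one ledger is valid; only inverted ranges are rejected.
func validateBounded(from, to uint32) error {
	if to < from {
		return fmt.Errorf("%w: to=%d < from=%d", errInvalidEnd, to, from)
	}
	return nil
}

// walk visits every ledger in the inclusive range and returns how many it
// visited. The explicit break avoids uint32 overflow when to is the max value.
func walk(from, to uint32) (int, error) {
	if err := validateBounded(from, to); err != nil {
		return 0, err
	}
	n := 0
	for seq := from; ; seq++ {
		n++
		if seq == to {
			break
		}
	}
	return n, nil
}

func main() {
	n, err := walk(100, 100) // tip-extend exactly one ledger behind
	fmt.Println(n, err)      // 1 <nil>

	_, err = walk(101, 100) // inverted range
	fmt.Println(err != nil) // true
}
```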

  • loki (r1): chunk storage moved off the root filesystem to the ZFS pool
    (/tmp/loki → data/loki @ /var/lib/loki) + 30-day retention.
    The quickstart-scaffold config stored Loki chunks on the 49 GB root via
    /tmp/loki — the same failure class as the 2026-06-11
    ClickHouse-logs-on-root fill and the 2026-05-10 root-full SEV-2 — grew
    without bound (no compactor/retention configured), and lost all log
    history on every reboot (/tmp is wiped). Storage now lives on the
    data/loki ZFS dataset with retention_period: 720h enforced by the
    compactor; log_level codified at warn (matching what r1 actually ran)
    instead of the scaffold's debug. Applied live on r1 2026-06-11 with the
    existing 21 days of chunks migrated intact.

  • sla-probe: measure the ≤30 s RFP freshness target on /v1/price/tip,
    not /v1/price.
    The probe held /v1/price to the Freighter RFP's 30 s
    price-freshness target, but that surface serves the most recent CLOSED
    bucket (ADR-0015 cross-region byte-identical contract): 60 s prices_1m
    buckets + the CAGG refresh policy's 30 s end_offset + a 30 s schedule
    interval make its observed_at structurally 30–150 s old. Result: the
    probe failed every run since metrics began (≥14 days of Prometheus
    history), drowning real regressions. The probe now also hits
    /v1/price/tip — the rolling-window surface built to deliver the RFP
    promise (sub-second observed_at) — and applies the 30 s target there,
    while /v1/price is held to a structural 150 s bound
    (-closed-bucket-freshness-target) that still catches the closed-bucket
    pipeline falling behind (the 2026-06-02/03 chunk-perf regression read
    166–186 s and would fail it). Per-endpoint freshness targets are recorded
    in the JSON evidence as freshness_target_sec.

  • soroswap-router: distinct swaps in one op were collapsed by a coarse PK
    (migration 0056).
    A single InvokeContract op can carry multiple genuinely
    distinct router swaps (an aggregator splitting a trade, or a batch to several
    recipients); the PK (ledger_close_time, ledger, tx_hash, op_index) dropped
    all but one via ON CONFLICT. The completeness honesty guard confirmed 106
    real swaps lost across pubnet history (not auth-tree dup-noise). Added a
    per-call discriminator, call_sig (RouterSwap.CallSig(), a 128-bit content
    hash of function|recipient|path|amount_in|amount_out), to the PK: distinct
    swaps get distinct keys (all stored); auth-tree duplicates of the same call
    hash equal and still dedup. Operator runbook: stop indexer → migrate → deploy
    the call_sig sink → TRUNCATE → ch-rebuild -contract-calls -sources
    soroswap-router -write. Last of the coarse-PK class (lint allowlist now OK:).
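
A content-hash discriminator of the kind described above might look like the sketch below. The struct fields, the "," path separator, and the choice of SHA-256 truncated to 16 bytes are all assumptions; the entry only states that the hash is 128 bits over function|recipient|path|amount_in|amount_out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// RouterSwap is a hypothetical stand-in for the decoded router call.
// Amounts are kept as decimal strings.
type RouterSwap struct {
	Function  string
	Recipient string
	Path      []string
	AmountIn  string
	AmountOut string
}

// CallSig returns a 128-bit content hash as 32 hex characters.
// Assumption: SHA-256 truncated to its first 16 bytes.
func (s RouterSwap) CallSig() string {
	payload := strings.Join([]string{
		s.Function,
		s.Recipient,
		strings.Join(s.Path, ","),
		s.AmountIn,
		s.AmountOut,
	}, "|")
	sum := sha256.Sum256([]byte(payload))
	return hex.EncodeToString(sum[:16])
}

func main() {
	// Two distinct swaps carried by the same op (placeholder values).
	a := RouterSwap{"swap_exact_tokens_for_tokens", "G_ALICE", []string{"XLM", "USDC"}, "100", "12"}
	b := RouterSwap{"swap_exact_tokens_for_tokens", "G_BOB", []string{"XLM", "USDC"}, "100", "12"}
	dup := a // the same call surfaced twice by the auth tree

	fmt.Println(a.CallSig() != b.CallSig())   // true: both rows are stored
	fmt.Println(a.CallSig() == dup.CallSig()) // true: the duplicate still dedups
}
```

Because the hash covers only call content, two byte-identical calls in one op would still collapse to one row under this sketch.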

  • Completeness census for the event-less ContractCall sources (band,
    soroswap-router) now counts distinct served-PK identities, not raw events.

    The auth tree surfaces the same authorized call at multiple CallPaths for
    multi-entry (co-signed) / nested-auth txs; the served tier dedups them via
    ON CONFLICT, so a raw-event census over-counted and reported a phantom
    projection Δ (soroswap-router: 107 of 157.3k). The census dedups on the same
    (tx_hash, op_index[, ts]) grain. An honesty guard logs any collision whose
    row content differs — that would be the coarse PK collapsing genuinely
    distinct rows (a schema-grain defect), surfaced loudly rather than buried.
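
The census rule above (count distinct served-PK identities, and flag any collision whose content differs) can be sketched as follows. CallEvent, its Content field, and the Census function are hypothetical stand-ins, and the optional ts component of the key is omitted:

```go
package main

import "fmt"

// CallEvent is a hypothetical decoded ContractCall event. Content stands in
// for a canonical rendering of the row's columns.
type CallEvent struct {
	TxHash  string
	OpIndex uint32
	Content string
}

// pk is the served-tier identity grain (tx_hash, op_index).
type pk struct {
	TxHash  string
	OpIndex uint32
}

// Census returns the number of distinct identities and every event that
// collided with an earlier one while carrying different content.
func Census(events []CallEvent) (distinct int, conflicts []CallEvent) {
	seen := make(map[pk]string)
	for _, e := range events {
		k := pk{e.TxHash, e.OpIndex}
		prev, ok := seen[k]
		if !ok {
			seen[k] = e.Content
			continue
		}
		if prev != e.Content {
			// Honesty guard: same key, different row. A too-coarse PK
			// would silently drop this row.
			conflicts = append(conflicts, e)
		}
	}
	return len(seen), conflicts
}

func main() {
	events := []CallEvent{
		{"tx1", 0, "swap A"},
		{"tx1", 0, "swap A"}, // auth-tree duplicate: same call, second CallPath
		{"tx2", 0, "swap B"},
		{"tx2", 0, "swap C"}, // genuinely distinct row under the same key
	}
	distinct, conflicts := Census(events)
	fmt.Println(distinct, len(conflicts)) // 2 1
}
```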

  • soroswap-router swaps with an unrepresentable deadline were silently
    dropped.
    The router deadline arg is a user-supplied u64; some calls pass a
    sentinel/garbage value (≈3e18 s → year ~99 billion, or one that overflows
    int64 to a BC year) that lands outside Postgres's timestamptz range, so
    Postgres rejected the whole INSERT (SQLSTATE 22008). The swap itself is a real,
    successful token movement, so InsertSoroswapRouterSwap now NULLs an
    out-of-range deadline_ts instead of dropping the row. This affected both the
    live indexer and every backfill — ≈24% of historical router calls (30.7k of
    157.3k) were unstorable. Forward-fixes live ingest on the next indexer deploy.
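
A minimal sketch of the NULL-on-out-of-range rule above. The deadlineTS helper and the exact bound are assumptions: the bound is set conservatively at the start of year 294276, the last year PostgreSQL documents as representable for timestamptz. A nil result is meant to be written as SQL NULL.

```go
package main

import (
	"fmt"
	"time"
)

// maxDeadlineUnix is a conservative upper bound in Unix seconds.
var maxDeadlineUnix = uint64(time.Date(294276, time.January, 1, 0, 0, 0, 0, time.UTC).Unix())

// deadlineTS converts a user-supplied u64 deadline (Unix seconds) into a
// nullable timestamp. Values past the bound, including those that would wrap
// negative as int64, yield nil so the swap row itself is still stored.
func deadlineTS(deadline uint64) *time.Time {
	if deadline > maxDeadlineUnix {
		return nil
	}
	t := time.Unix(int64(deadline), 0).UTC()
	return &t
}

func main() {
	fmt.Println(deadlineTS(1750000000) != nil)                // true: ordinary deadline
	fmt.Println(deadlineTS(3000000000000000000) == nil)       // true: sentinel near 3e18 s
	fmt.Println(deadlineTS(18446744073709551615) == nil)      // true: would overflow int64
}
```

Comparing in the unsigned domain before converting means no out-of-range value ever reaches the int64 conversion.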

Added

  • ch-rebuild -contract-calls — lake-replay write path for the event-less
    ContractCall sources (band, soroswap-router).
    These emit no Soroban events,
    so neither the event pass nor the ADR-0032 projector can rebuild them. The new
    pass streams the lake's InvokeContract ops (filtered on the contract's bytes in
    body_xdr; stellar.operations has no contract_id column), runs each
    source's ContractCallDecoder, and writes the decoded events through the
    production sink (idempotent ON CONFLICT). It shares the exact decode path
    (forEachContractCallEvent) with the completeness projection census, so a
    written-row re-verify reconciles to Δ=0. This is the ADR-0034 successor to the
    retired backfill-router MinIO walk (which under-produced — it pre-dated the
    auth-tree-roots extraction and missed router calls nested inside aggregator
    contracts).