Stellar Index v0.7.0
[v0.7.0] — 2026-07-02
Added
- CCTP mint_and_forward is decoded (board #31). The CctpForwarder
contract emits a fifth event our decoder didn't handle; those events
reached the lake but never cctp_events. Schema reverse-engineered from
real mainnet events (single Symbol topic; body map {amount: i128,
forward_recipient: Address, token: Address}), golden-tested against the
actual lake fixture; no migration needed (the generic cctp_events shape
fits). The recognition blind spot is closed too: cctp's three contracts
are now pinned in the reconciliation catalogue, so a future unhandled
cctp topic caps THIS source's verdict instead of vanishing into the
system-wide bucket. New docs/protocols/cctp.md records the full inventory.
Historical catch-up (operator): projector-replay -source cctp -from 62403000.
- Phoenix is contract-identity gated (ADR-0040 §1 mechanism 2, CS-026).
Matches() now requires the emitting contract to be in the curated mainnet
set (phoenix.MainnetGatedSet: the page-verified 11 pools + 3 stake
contracts; multihop excluded — it emits no events) — phoenix topics are
plain string tuples any pubnet contract can forge, and were previously
attributed on shape alone. The factory's creation events predate the lake,
so the in-code seed is the trust root (wired into gatedSources for the
protocol_contracts warm + future live-upsert anyway). Reject test pins the
injection vector closed; operator rollout steps (deploy → re-derive →
verdict watch) in the register. Defindex deliberately did NOT ship: the
lake now shows 88+22 emitters vs the 57 verified three weeks ago, and its
create events don't carry vault addresses; gating on raw emitter lists
would bake potential look-alikes into the trust root, so it moves to the
ADR-0040 §3 cross-check enumeration (pages + ADR updated with the
evidence).
- Explorer error boundaries: global-error.tsx (own html/body, inline
styles) + a shared design-system RouteError + 19 per-segment error.tsx
wrappers across the data-heavy routes; previously ONE boundary existed and
a render throw white-screened the route. Verified with a forced throw in a
real browser.
- Commercial funnel (LC-060/061/062/064/065): the pricing API is now in
the primary nav + a homepage product section; the dashboard's first-request
example is a copy-pasteable curl that actually works (the old
/v1/price/XLM-USD 404s, verified live); Bearer is the one taught auth
header (X-API-Key mentioned once as the alternative, matching middleware
precedence); Business tier consistently 60,000 req/min (backend truth);
billing copy no longer promises self-service that doesn't exist.
- The agent skill library (.claude/skills/, indexed in CLAUDE.md): nine
executable skills encoding this repo's procedures AND its incident-corpus
judgment so sonnet-class agents (and humans) work at standard without
tribal knowledge. Construction skills (/add-onchain-source,
/add-cex-connector, /add-endpoint, /add-metric) each end in the
machine checks that catch the historical failure mode (lockstep test,
contract test, guard chain); ops skills (/cut-release, /deploy-r1)
encode the release/deploy discipline + rollback; /review-stellarindex
distills the F-####/CS-### corpus into per-subsystem adversarial
checklists, each check citing its incident; /diagnose-stellarindex turns
the runbook corpus into triage decision trees with the exact r1 commands
and prior wrong turns; /verify-done is the pre-completion gate stack
every other skill terminates in (including the new staged-content check
from this session's own 6161dd5 near-miss).
- Two delegation-ready structural specs (docs/architecture/):
storage-layering-spec.md eliminates the 13 verified upward imports from
storage/timescale into compute/sources via storage-owned *Row types +
caller-side conversion, in 4 grouped commits, gated by the new
storage-purity import-lint rule that is the actual payoff; and
wiring-decomposition-spec.md extracts the api binary's inline adapters
(main.go 3,338 → <800 lines target), collapses the Options/Server
triple-touch via embedding (rejecting a DI container as clever), groups the
ops CLI's 55-case switch into a declarative subcommand table (rejecting
cobra), and scopes the optional per-source pipeline registry now that the
lockstep guard exists.
- ADR-0043 (Proposed) + scripts/ops/restore-drill.sh: the DR answer to
CS-110/111/112. Design: pgBackRest gains an offsite encrypted repo2
(templated into the ansible role, gated off until the operator reviews the
rendered diff; refuses to render repo2 without its cipher pass); the CH lake
is protected by drilled RE-DERIVE + daily DDL/tail push instead of multi-TiB
full backups (the lake is derived data — the raw LCM exists in two archives;
the full-backup decision deliberately waits on the drill's measured
throughput). The restore drill is a non-destructive scratch restore on r1
(throwaway postgres on :5499, tip-lag + hash-chain + window row-count
verification, optional CH re-derive RTO measurement) appending to an
append-only evidence log. Runbook footguns fixed: --stanza=main →
stellarindex (CS-114) and dr-activation's false "Drilled" claim replaced
with the honest status (CS-113).
- ADR-0042 (Proposed): the v1 wire shape. The decision package for the
public flip, awaiting @ash sign-off: execute the Unit-D Tier-3 cross-chain
wire collapse pre-flip (rejecting the freeze fallback — pre-v1 with zero
consumers is the only free moment), give the dual-shape /v1/assets/{slug}
an explicit kind discriminator (catalogue/stellar_asset, oneOf +
typed SDK union, explorer stops shape-sniffing), and define the v1.0 freeze
contract: spec = the contract, SDK-coverage register = honest SDK scope,
explorer surfaces marked x-stability: experimental at v1.0.
- Ansible: non-root services + the missing system user (CS-118/119/122).
The stellarindex user is now created FIRST in the role (a clean apply
previously FAILED chowning to a user that never existed); the api /
indexer / aggregator daemons and six timer oneshots run
User=stellarindex with the hardened-unit settings ported into the
role's real templates; env files go 0640 root:stellarindex;
archive-completeness deliberately stays root (documented follow-up).
Patroni's REST API now defaults to the private interface and REFUSES to
render without basic-auth credentials (assert + unconditional auth block)
so it can never land unauthenticated on 0.0.0.0. Ordered r1 migration
steps live in the operator register; deploy workflow verified compatible.
- stellarindex-ops verify-served-values: the data-truth harness. The
recurring audit theme was "code-correct ≠ data-correct" (CS-010: XLM market
cap read +58% until hand-sampled). The new subcommand reconciles a curated
set of SERVED values against independent ground truth — XLM total/circulating
supply vs the SDF lumen API, USDC-on-Stellar supply vs Stellar Expert — and
emits node_exporter textfile gauges (served_value_{ok,rel_err,last_run_unix})
with two alerts in both rule trees (drift sustained two daily runs; harness
dark 48h) + runbook. Its FIRST live run caught three things: its own unit
bug (F2 supply fields are base-unit strings — fixed), the standing CS-010
config gap (XLM circulating 47% off until sdf_reserve_accounts is set;
the alert now stands as pressure), and a NEW finding: served USDC supply is
85% below Stellar Expert (under investigation). Price cross-checks stay with
the divergence worker; lake↔served counts stay with compute-completeness.
Changed
- Explorer builds fail hard instead of baking fallback HTML. New
buildFetch.ts (bounded 429-aware retry, per-build memo, incident-history
contract): a build-time fetch failure for a promised entity now FAILS
next build. This is the class behind baked "Asset not found" pages and the
XLM/WXLM 330× price incident. ~200 lines of per-page scaffolding deleted;
the new layer immediately caught two real pre-existing baking bugs
(mixed-case slug variants; issuer fetches timing out under build
concurrency). Full 3,830-page build green against the live API.
- Four D3 duplication extractions (net −LoC, behavior-preserving,
CAPABILITY-INVENTORY updated): wsclient.Loop (the ~50-line WS reconnect
loop duplicated across binance/kraken/coinbase/bitstamp; venue behavior
preserved via hooks), internal/httpx WriteJSON/WriteProblem (dashboard
handler copies), ratelimit.FixedWindowCounter (login/signup throttles,
Redis key bytes unchanged), canonical.SafeUnixSeconds/Millis (three
decoder timestamp-clamp copies; bound-checks the raw u64 before the cast,
the router deadline_ts wrap-negative class).
- The explorer now derives every wire type from the generated OpenAPI
contract. src/api/types.ts (generated, CI-drift-checked) was imported
nowhere; all consumed shapes were hand-typed across hooks.ts,
explorer-shared.tsx, and ~20 pages, and an API field rename shipped to prod
undetected. All hand interfaces are now aliases into
components['schemas'] (35 files, −448 net lines; ~90 call sites gained
honest null-narrowing, zero !/as casts), so spec drift is a tsc
failure. Eight // SPEC-GAP intersections remain where the HANDLER serves
fields the spec under-documents — tracked for spec-side fixes.
Fixed
- The explorer-surface OpenAPI gaps are closed. Every field the handlers
serve that the generated-types migration had to bridge with SPEC-GAP
intersections is now in the spec: the Asset coin-overlay block (slug,
class, change_1h/7d_pct, first/last_seen_ledger, observation_count,
markets/trade counts, price_history_24h/7d, ath, top_markets,
issuer_scam_reason) + type enum values global/external;
GlobalAssetView.class (required); Source.class enum gains
bridge/lending/router (all live in the registry); issuers list rows gain
org_verified + scam_reason (detail gains scam_reason); ContractEvent gains
contract_id; /account/me documents the session-cookie user/account shape;
the protocols bespoke block and evolving diagnostics fields are
documented as described-loose surfaces per ADR-0042's experimental tier.
SDK types mirror every addition (contract test green); all three artifacts
regenerated; zero SPEC-GAP markers remain. The surviving intersections
are re-labeled for what they now are: required-narrowing over spec-optional
fields.
- Classic-asset supply was silently SAC-only: the trustline/claimable/LP
observers never matched their watched set. Root-caused from the
verify-served-values USDC finding (served 40M vs Stellar Expert 265.9M):
the three observers compare decoded keys in CODE:ISSUER form, but the
config (correctly, per its own docs) supplies CODE-ISSUER. The raw
strings went straight into the watched sets, so all three observers
observed nothing since they shipped and every classic asset's served
supply degraded to its Soroban-wrapped slice. Cross-checked against the
lake: net SAC supply_flows for USDC ≈ 272.9M vs SE 265.9M — the lake was
right; the served tier was missing the entire classic trustline component.
Fix: supply.CanonicalizeWatchedClassic (one home, loud error on
unparseable entries so a config typo can never silently zero a supply
component again) applied in all three observer constructors + regression
test pinning dash-in/colon-match. Operator follow-ups (deploy, historical
state seed, harness watch) are in the register — served values heal only
after both.
- CS-089: the Chainlink divergence reference now rejects stale rounds. It
read latestAnswer() (no timestamp at all), so a frozen feed was served as
a fresh reference, able to both mask a real divergence and fabricate a false
one. Now calls latestRoundData(), decodes updatedAt, and rejects rounds
older than the feed's MaxAge as ErrPriceUnavailable (reference
unavailable, feeding the CS-088 no_reference machinery). Defaults: 3h for
crypto feeds (≤1h heartbeat), 76h for the FX feeds (24h heartbeat + they
pause over market closes, so a Friday round is legitimately ~72h old on
Sunday). Operator override via new [divergence.chainlink.feeds]
max_age_hours. A proxy answering the legacy 32-byte shape now fails loudly
instead of decoding garbage.
- CS-084 (High): the -ch completeness projection reconcile is now strict
per-ledger. The production path compared window TOTALS (Σ expected vs Σ
served), so a real drop in ledger L netting against a phantom overcount
elsewhere reported complete=true. The per-ledger maps were already
computed on both sides; only the comparison collapsed them. All three
reconcile branches (event re-derive, SDEX census, ContractCall census) now
compare per-ledger via completeness.ReconcileCounts. The four oracle
sources (reflector-dex/cex/fx, redstone) opt out via a documented
aggregateReconcile reason (legacy backfill vintages keyed
oracle_updates.ledger by the oracle-timestamp ledger, so a strict compare
would false-flag the vintage boundary) and keep the totals compare.
Verified empirically on r1: per-ledger lake-vs-served counts for cctp match
exactly across 200k ledgers once the decoder's topic set is applied. The
same spot-check surfaced a NEW finding tracked separately: CCTP contracts
emit mint_and_forward, which the decoder does not handle.
- The contract test's first run caught real three-way drift, all fixed:
the spec's Price schema documented ~19 asset-enrichment fields
(market_cap_usd, top_markets, ath, supplies, sparklines…) that the
/v1/price handler has never served, now trimmed to the honest 8-field
PriceSnapshot shape; /v1/pools items were documented as a bare untyped
object, now a real PoolRow schema; the healthz schema omitted the
always-served uptime/status_root; the Asset schema omitted the served
change_24h_pct; Source/MarketRow omitted the served stats + sparkline
fields. SDK (pkg/client): gained the served-but-missing fields:
PriceSnapshot.confidence (+factors), HistorySeries.price_type,
Source.on_chain + stats, Market.last_price + sparkline, LendingPool
30d net-flow fields, Issuer.org_verified (CS-100 was never mirrored),
Account.key_prefix, KeyCreated.key_prefix, Health.checks/status_root;
and lost AssetDetail.is_experimental, which no handler and no spec ever
served (SDK invention; it always decoded to false). Stale rek_ prefix
examples in auth comments updated to sip_.
- Load-test production guard had a collapsed host list. The two rebrand
sweeps mapped both legacy hosts (api.ratesengine.net, api.ratesengine.io)
onto api.stellarindex.io, leaving the guard in the Makefile and
test/load/scenarios/lib/env.js with a duplicate entry, so the legacy domains
(which may still route to production) were unguarded against accidental k6
targeting. Restored the distinct legacy hosts; verified prod + legacy are
refused and the documented staging target still passes.
- Per-row handler fan-out is now concurrency-bounded. The catalogue
market-cap/price fills (/v1/assets listings, /v1/assets/verified) and the
slug-expansion markets merge spawned one goroutine + one DB round-trip per
row with no cap: safe only because the verified catalogue is small today,
but a latent connection-pool-exhaustion vector as it grows. All five sites
now go through a shared forEachBounded helper (cap 16, the same bound as
the price batch, which was already correct). Race-tested that the bound is
actually respected.
Security
- main is now protected (CS-097) and lint baselines are growth-guarded
(CS-098). Two repo rulesets: main-integrity blocks force-pushes and
branch deletion for everyone (no bypass); main-required-checks makes the 12
core CI jobs required status checks, with a repository-admin bypass so the
operator's direct-push workflow keeps working (the push-triggered CI run on
main stays as the tripwire for that path; its ci.yml comment now says so).
New scripts/ci/lint-baseline-growth.sh (wired into the import-checks job)
fails any change that GROWS scripts/ci/*.baseline or the KNOWN_INERT
metric allowlist unless the commit carries an explicit Baseline-Growth:
trailer, closing the "edit the gate's own allowlist in the same commit"
bypass. Probe-tested all three paths (clean / undeclared growth / declared).
- Middleware rejections (401/403/429) are no longer shared-cacheable.
Four problem+json writers — auth 401s (writeAuthProblem), per-key policy
403s (writeKeyPolicyDenied), signup email-verification 403s, and monthly-
quota 429s — never overrode the route directive the CacheControl middleware
pre-sets, so on publicly-cacheable routes (e.g. /v1/price) a per-key/per-IP
denial carried public, max-age, s-maxage and a shared cache keyed on the
URL could store one caller's rejection and replay it to everyone. All four
now set Cache-Control: no-store (matching every other problem writer), the
cachecontrol.go invariant doc now enumerates them, and a regression test
drives all four rejection paths through the real CacheControl composition.
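
The shape of the cacheable-rejection fix can be sketched as follows. All names here (cacheControl, requireBearer, writeProblem) and the max-age values are hypothetical stand-ins, not the project's middleware; the sketch only shows a rejection writer overriding a directive that an outer middleware pre-set.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// cacheControl mimics a route-level middleware that pre-sets a public
// caching directive before the inner handlers run (values are illustrative).
func cacheControl(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Cache-Control", "public, max-age=5, s-maxage=5")
		next.ServeHTTP(w, r)
	})
}

// writeProblem writes a problem+json rejection and overrides any pre-set
// directive so a shared cache never stores a per-caller denial.
func writeProblem(w http.ResponseWriter, status int, title string) {
	w.Header().Set("Cache-Control", "no-store")
	w.Header().Set("Content-Type", "application/problem+json")
	w.WriteHeader(status)
	fmt.Fprintf(w, `{"title":%q,"status":%d}`, title, status)
}

// requireBearer rejects requests that carry no Authorization header.
func requireBearer(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") == "" {
			writeProblem(w, http.StatusUnauthorized, "missing credentials")
			return
		}
		next.ServeHTTP(w, r)
	})
}

// rejectionHeaders drives an unauthenticated request through the composed
// chain and returns the status code and Cache-Control header of the reply.
func rejectionHeaders() (int, string) {
	ok := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "ok")
	})
	h := cacheControl(requireBearer(ok))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/v1/price", nil))
	return rec.Code, rec.Header().Get("Cache-Control")
}

func main() {
	code, cc := rejectionHeaders()
	fmt.Println(code, cc)
}
```

Driving the request through the composed chain, rather than calling the writer directly, is what exercises the override; a writer tested in isolation would pass even if it never touched the pre-set header.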
Removed
- The dead consumer.Orchestrator seam (−896 LoC). The
per-source-goroutine runner + Source/CursorStore/Cursor types had zero
production callers and were exactly the RPC-era topology
docs/architecture/ingest-pipeline.md forbids, yet doc.go presented them
as a reference template. consumer.Event (the load-bearing contract) moved
to event.go untouched; doc.go now states the retirement + points at the
dispatcher path. Stale comments advertising the deleted <source>-backfill
subcommands fixed in sorobanevents/timescale. Follow-up:
stellarindex_source_lag_ledgers lost its only (never-production) setter;
retirement folded into the docs-integrity sweep.
- ADR-0041: ingest durability semantics. Settles CS-028's cursor question:
the ledgerstream cursor is a RESUME HINT, not a durability claim — the
ADR-0033 completeness verdict (strict per-ledger since CS-084) is the
durability claim, with the lake as the heal source. Consequences shipped
with the ADR: clickhouse_live_sink + clickhouse_projector_source now
default to true (r1 already ran both; the certified-lake substrate
must not be opt-in for the coverage claim to mean anything; explicit
opt-out documented for CH-less deployments), and the previously-unalerted
ch_live_sink_ledgers_total{outcome="dropped"} counter gains a two-tier
alert (ticket at 10m of drops, page at 1h sustained) in BOTH rule trees +
a new runbook (ch-live-sink-drops.md).
- ADR-0040: completing contract-identity gating (CS-026). Design for the
four still-ungated decoders: phoenix + defindex ship as curated-set /
factory-descended childgate registries (both already enumerated in
docs/protocols/ — the "waiting on team data" framing was stale), aquarius
gets a lake-derived enumeration procedure, and comet — the no-factory hard
case — gets a WASM-code-hash gate design (audited hash set + off-hot-path
registry sweep). Includes the rollout preconditions (seed before gate
binary, lake re-derive, verdict green) that prevent a fail-closed gate from
dropping live trades.
- Prometheus rule-tree semantic differ (scripts/ci/lint-rule-equivalence,
wired into make monitoring-check). The multi-host and r1-overlay rule
trees are hand-maintained near-copies; file pairing was checked but nothing
enforced that paired rules stay semantically equivalent — a threshold fixed
in one tree silently diverged the other (the api.yml header has warned about
this since F-1222). The differ compares every paired rule's expr (job labels
normalized), for, and labels; the two genuine host-shape divergences
(redis replica expectation, scrape-job list) live in a shrink-only
rule-equivalence.baseline covered by the CS-098 growth guard.
Probe-verified: a one-line for: change in one tree fails with a precise
diagnosis.
- Pipeline lockstep guard (internal/pipeline/lockstep_ast_test.go). The
five hand-synced wiring sites (HandleEvent / IsProjectedEvent /
tradeFromEvent / projector buildSource / dispatcher registration) had no
machine check — the IsProjectedEvent comment cited an "ADR-0030 lint guard"
that never existed, and drift is silent data loss (F-1316). The new test
AST-walks the switches and every projected source package's consumer.Event
implementations: a projected event without a persist arm, a source package
event missing from IsProjectedEvent, a stale entry after a rename, or an
IsProjectedEvent package with no registry case now fails CI. Probe-verified
(removing rozo.Event from IsProjectedEvent fails with the exact F-1316
diagnosis).
- SDK↔OpenAPI contract test (pkg/client/spec_contract_test.go). Three
gates: every SDK method's route must exist in the spec; every spec operation
must be either SDK-covered or explicitly allowlisted with a reason (new
endpoints now fail CI until consciously triaged); and for covered endpoints
the spec's data schema properties must exactly match the SDK payload
struct's JSON tags in both directions. Closes the third edge of the
route↔spec↔SDK triangle (lint-docs.sh already reconciles routes↔spec).
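
The third gate of the contract test (spec properties must match the SDK struct's JSON tags in both directions) might look roughly like the sketch below. The struct, the property list, and both helper names are invented for illustration; the real test reads the OpenAPI document and the pkg/client types, which are not shown here.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
	"strings"
)

// PriceSnapshot is a hypothetical SDK payload struct used only for this sketch.
type PriceSnapshot struct {
	Asset      string  `json:"asset"`
	Price      string  `json:"price"`
	Confidence float64 `json:"confidence,omitempty"`
}

// jsonFields collects the JSON names of a struct's exported fields.
func jsonFields(v any) map[string]bool {
	out := map[string]bool{}
	t := reflect.TypeOf(v)
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() {
			continue
		}
		name, _, _ := strings.Cut(f.Tag.Get("json"), ",")
		if name == "-" {
			continue // field is excluded from JSON
		}
		if name == "" {
			name = f.Name // encoding/json falls back to the field name
		}
		out[name] = true
	}
	return out
}

// diffFields reports drift in both directions, sorted for stable output.
func diffFields(specProps []string, sdk map[string]bool) (missingInSDK, missingInSpec []string) {
	spec := map[string]bool{}
	for _, p := range specProps {
		spec[p] = true
		if !sdk[p] {
			missingInSDK = append(missingInSDK, p)
		}
	}
	for name := range sdk {
		if !spec[name] {
			missingInSpec = append(missingInSpec, name)
		}
	}
	sort.Strings(missingInSDK)
	sort.Strings(missingInSpec)
	return missingInSDK, missingInSpec
}

func main() {
	// Assumed spec property list; a real test would parse it from the spec.
	specProps := []string{"asset", "price", "market_cap_usd"}
	inSDK, inSpec := diffFields(specProps, jsonFields(PriceSnapshot{}))
	fmt.Println("in spec but not SDK:", inSDK)
	fmt.Println("in SDK but not spec:", inSpec)
}
```

Checking both directions is what separates this from a coverage check: a one-way comparison would catch an SDK that lags the spec but not a spec that documents fields no handler serves.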
Documentation
- Docs-integrity sweep: the institutional-knowledge layer agrees with
itself again. docs/architecture/overview.md now EXISTS (CLAUDE.md and
engineering-standards.md cited it for months; it routes to the real docs);
the CS-129 kubectl-on-a-systemd-fleet commands in insert-errors +
all-ingestion-down are systemd/psql; the CS-008 finding-ID collision is
re-IDed with a register note; the remediation STATUS deferred list and
launch-todo carry staleness banners naming what shipped since they were
written; the coverage tracker's header/table contradiction is annotated;
and the never-emitted stellarindex_source_lag_ledgers gauge (its only
setter was the deleted Orchestrator) is removed from obs + docs, with the
two archived runbooks that cited it scrubbed to historical prose.
- CAGG price-math verified vs the exact engine; twap column marked dead.
The prices_* continuous aggregates compute vwap with the per-row form
sum((quote/base)*base)/sum(base) instead of the exact sum(quote)/sum(base);
measured on r1 the divergence is ≤ 1.0e-16 relative (40,565 1h-bucket
comparisons) — below the 12-decimal wire truncation, so no rematerialization.
New aggregates must use the exact single-division form (migrations/README.md
rule 8). The CAGGs' twap column is an equal-weight mean, not time-weighted,
and is read by nothing; it is documented as do-not-use in the TWAP/OHLC
methodology doc (/v1/twap computes real TWAP on demand from raw trades).
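
The two vwap forms in the CAGG entry are algebraically identical and differ only by rounding in the extra per-row divide and multiply. The sketch below compares them in float64 on made-up trades; the aggregates may well use a different numeric type, so this illustrates the mechanism only and does not reproduce the r1 measurement. It assumes every base amount is positive.

```go
package main

import (
	"fmt"
	"math"
)

// trade holds one fill's base and quote amounts (sample data, not lake data).
type trade struct{ base, quote float64 }

// vwapPerRow mirrors sum((quote/base)*base)/sum(base): each row computes a
// price and re-weights it, adding a divide and a multiply per row.
func vwapPerRow(ts []trade) float64 {
	var num, den float64
	for _, t := range ts {
		num += (t.quote / t.base) * t.base
		den += t.base
	}
	return num / den
}

// vwapExact mirrors sum(quote)/sum(base): a single division at the end.
func vwapExact(ts []trade) float64 {
	var q, b float64
	for _, t := range ts {
		q += t.quote
		b += t.base
	}
	return q / b
}

// relDiff is the relative difference of a against the reference value b.
func relDiff(a, b float64) float64 { return math.Abs(a-b) / math.Abs(b) }

func main() {
	ts := []trade{{100, 12.5}, {3, 0.37}, {7000.5, 880.1}}
	fmt.Println("per-row:", vwapPerRow(ts))
	fmt.Println("exact:  ", vwapExact(ts))
	fmt.Println("relative difference:", relDiff(vwapPerRow(ts), vwapExact(ts)))
}
```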