feat(core/db): drop redundant idx_core_transactions_tx_hash #205
The shipped tx-hash lookup (GetTx) uses lower(tx_hash) = lower($1) and
rides idx_core_transactions_tx_hash_lower. The plain tx_hash index is
not referenced by any query in pkg/core or pkg/etl; it is pure write-amp
and disk overhead.
Drops online with CONCURRENTLY to avoid blocking block ingestion on a
live validator (a non-concurrent DROP would take ACCESS EXCLUSIVE on
core_transactions). Rollback recreates the index online.
Operators running ad-hoc WHERE tx_hash = '...' in psql should switch
to WHERE lower(tx_hash) = lower('...'); the functional index has been
the canonical shipped pattern for a while.
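The migration file itself is not reproduced in this description. As a rough sketch of the statement pair it implies, assuming the old index was a plain btree on `tx_hash` (the helper name `dropIndexMigration` is hypothetical and not taken from the repository):

```go
package main

import "fmt"

// dropIndexMigration returns the forward and rollback statements for
// removing an index without blocking writers. Hypothetical helper for
// illustration only.
func dropIndexMigration(table, column, index string) (up, down string) {
	// DROP INDEX CONCURRENTLY avoids the ACCESS EXCLUSIVE lock that a
	// plain DROP INDEX takes on the table.
	up = fmt.Sprintf("DROP INDEX CONCURRENTLY IF EXISTS %s;", index)
	// The rollback rebuilds the index online as well.
	down = fmt.Sprintf("CREATE INDEX CONCURRENTLY IF NOT EXISTS %s ON %s (%s);", index, table, column)
	return up, down
}
```

Calling `dropIndexMigration("core_transactions", "tx_hash", "idx_core_transactions_tx_hash")` yields the two statements. Postgres refuses to run either `CONCURRENTLY` form inside a transaction block, so whatever runs the migration has to execute them outside one.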
Evidence
Code path audit

| query | predicate | index used |
|---|---|---|
| GetTx | lower(tx_hash) = lower($1) | idx_core_transactions_tx_hash_lower |
| GetBlockTransactions | block_id = $1 | idx_core_transactions_block_id |
| GetRecentTxs | order by created_at desc | idx_core_transactions_created_at |
| TotalTxResults | count(tx_hash) | any |

No query in pkg/core or pkg/etl does exact-case tx_hash = $1 against core_transactions. Other tx_hash = $1 lookups in the tree target different tables (core_etl_tx, core_rewards, core_ern, core_mead, core_pie) and are untouched.
Live evidence — 20 independent validator nodes, Postgres 15.16
Stats cumulative since pg_postmaster_start_time ≈ 2026-04-13 (~3.2 days).
Per node (the universal unit — operators run 1, 3, 30 nodes):
| metric (per node) | value |
|---|---|
| idx_core_transactions_tx_hash size | ~5.20 GiB |
| idx_core_transactions_tx_hash scans | 0 (on all 20/20) |
| idx_core_transactions_tx_hash_lower scans > 0 | 14/20 nodes |
| core_transactions rows | ~58.5–58.8M |
| core_transactions inserts / day | ~50–120K |
| redundant btree inserts / day eliminated | ~50–120K |

Per node: ~5.20 GiB reclaimed; one fewer btree touched per core_transactions insert — smaller insert-path WAL, fewer dirty buffers, faster block apply. The write-amp reduction scales with a node's share of chain throughput, not with operator size.
EXPLAIN (ANALYZE, BUFFERS) — val001, val002, val020
```
Limit (cost=0.69..8.71 rows=1 width=557) (actual time=0.6–1.1 ms)
-> Index Scan using idx_core_transactions_tx_hash_lower on core_transactions
Index Cond: (lower(tx_hash) = lower($1))
```
Planner picks the functional index on every node tested. Execution takes roughly 1 ms or less (0.6–1.1 ms).
Our rollout plan
Canary one validator first, 24h soak, then staggered fleet rollout. Will report back if anything looks off.
Canary report — val001, T+24h
Index state
Chain state
Logs (last 24h)
No regressions observed. Ready to stagger-roll the remaining 19 nodes once this PR merges.
* Reward authority rotation primitive (PRs 222 + 225 + 228 bundled)
Three logical chunks bundled onto a single branch off main:
PR1 — Schema (mjp-reward-pools-schema):
- New core_reward_pools table keyed by Solana RM pubkey, with a
text[] authorities column (gin-indexed for @> containment).
- launchpad_authority_rm seed table mapping every known launchpad-
derived per-mint claim authority → its Solana reward manager
state account. Used by both the migration backfill and PR2's
wire-compat replay logic.
- core_rewards.rewards_manager_pubkey FK column; claim_authorities
column dropped (reads now alias coalesce(p.authorities, '{}')
via LEFT JOIN on core_reward_pools).
- Backfill creates one pool per RM (per-RM authority union across
all rewards referencing it via launchpad lookup). Rows whose
authorities don't match any launchpad RM stay NULL — there are
no synthetic mig_<md5> identifiers.
- Live finalizeCreateReward (legacy proto shape, brief PR1-only
window) does launchpad lookup → bind to existing pool only;
never upserts. NULL fallback if no match or pool missing.
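The bind-only rule for live legacy-shape rewards can be sketched as follows. This is a minimal illustration, not the repository's code: the function name is hypothetical, plain maps stand in for the `launchpad_authority_rm` and `core_reward_pools` tables, and taking the first authority that resolves to an existing pool is an assumption.

```go
package main

// bindRewardToPool looks up each claim authority in the launchpad seed
// mapping and binds only when a pool for that reward manager already
// exists. It never creates or modifies a pool. A nil result stands for
// the NULL rewards_manager_pubkey fallback.
func bindRewardToPool(
	claimAuthorities []string,
	launchpadAuthorityRM map[string]string, // claim authority -> RM pubkey
	pools map[string][]string, // RM pubkey -> pool authorities
) *string {
	for _, authority := range claimAuthorities {
		rm, ok := launchpadAuthorityRM[authority]
		if !ok {
			continue // no launchpad match for this authority
		}
		if _, exists := pools[rm]; exists {
			return &rm // bind to the existing pool
		}
	}
	return nil // no match or pool missing: stay NULL, no upsert
}
```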
PR2 — CometBFT transactions (mjp-reward-pools-tx):
- New body+signature envelope: Tx { TxBody body; signatures[] }.
Reward and RewardPool messages move to the new shape.
- CreateRewardPool / SetRewardPoolAuthorities txs gated by
real-RM-shape pubkey + signer ∈ current pool authorities.
- CreateReward proto reserves tags 4-6 (former claim_authorities,
deadline, signature) and uses tag 7 for rewards_manager_pubkey.
DeleteReward reserves tags 2-3.
- Wire-compat layer (rewards_legacy.go): legacy bytes are
REJECTED at CheckTx/ProcessProposal (no new legacy txs
accepted) but ACCEPTED at FinalizeBlock for block-sync replay
of historical chain state. Replay uses launchpad lookup to
bind legacy rewards to the same RM the migration produced.
- Defense-in-depth re-validation at finalize for both pool txs
(block-sync replay skips ProcessProposal / CheckTx).
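The phase-dependent handling of legacy bytes amounts to a small admission rule. The sketch below uses hypothetical names (`Phase`, `admitTx`, `ErrLegacyRejected`) and only models the accept/reject decision, not decoding or replay:

```go
package main

import "errors"

// Phase identifies which ABCI step is handling a transaction.
type Phase int

const (
	CheckTx Phase = iota
	ProcessProposal
	FinalizeBlock
)

// ErrLegacyRejected is a hypothetical error for new legacy-encoded txs.
var ErrLegacyRejected = errors.New("legacy reward encoding is no longer accepted")

// admitTx rejects legacy-encoded transactions on the paths that admit new
// transactions, but lets them through at FinalizeBlock so historical
// blocks still replay during block sync.
func admitTx(phase Phase, isLegacy bool) error {
	if !isLegacy {
		return nil
	}
	if phase == FinalizeBlock {
		return nil // replay of already-committed chain state
	}
	return ErrLegacyRejected
}
```

Because block-sync replay reaches only the `FinalizeBlock` branch, any invariant checked solely at `CheckTx` or `ProcessProposal` would be skipped there, which is why the pool transactions are re-validated at finalize.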
PR3 — Validator endpoint cutover (mjp-reward-pools-endpoints):
- GetRewardAttestation restored from the #215 kill-switch. Auth
check uses dbReward.ClaimAuthorities, which is sourced from
coalesce(p.authorities, '{}') — so rotating an authority out
via SetRewardPoolAuthorities immediately revokes attestation
rights. RewardClaim.RewardAddress is intentionally NOT set
(Solana reward manager program expects 2-piece RewardID:
Specifier disbursement_id).
- GetRewardSenderAttestation / GetDeleteRewardSenderAttestation
dispatch by RM: pool-gated if pool exists, else fall back to
the legacy validator/AAO trust set (AUDIO path).
- AUDIO RM denylist on validateRewardsManagerPubkey: prevents an
attacker from creating a pool for the AUDIO RM and inheriting
AUDIO sender attestations. Per-env constants in
pkg/core/config/rewards.go (dev/prod populated; stage left
empty intentionally).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
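A minimal sketch of the AUDIO RM denylist, assuming an empty per-environment constant means the check is not enforced. The pubkey values are placeholders (the real constants live in pkg/core/config/rewards.go and are not shown here), and the function names are hypothetical:

```go
package main

import "errors"

// Placeholder values standing in for the real per-environment constants.
const (
	devAudioRM  = "DevAudioRMPlaceholder"
	prodAudioRM = "ProdAudioRMPlaceholder"
)

// ErrReservedRM is a hypothetical error for the denylisted AUDIO RM.
var ErrReservedRM = errors.New("rewards_manager_pubkey is reserved")

// audioRMFor returns the AUDIO reward manager for an environment, or ""
// when none is configured.
func audioRMFor(env string) string {
	switch env {
	case "dev":
		return devAudioRM
	case "prod":
		return prodAudioRM
	default:
		return "" // e.g. stage: intentionally empty
	}
}

// checkNotAudioRM rejects pool creation for the AUDIO RM so a pool can
// never shadow the legacy AUDIO sender-attestation path.
func checkNotAudioRM(env, rm string) error {
	audio := audioRMFor(env)
	if audio != "" && rm == audio {
		return ErrReservedRM
	}
	return nil
}
```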
* Tighten reward-pool gating: union replay, AUDIO-only fallback
Three review-driven fixes on the bundle branch:
1. Replay/migration apphash divergence (#1).
UpsertSyntheticRewardPool was a hard overwrite, which produced
pool.authorities = last-replayed-reward.authorities on a from-genesis
block-sync — diverging from the migration backfill, which UNIONs
authorities across every legacy reward referencing the RM. Production
data has at most one authority per reward today, so the bug doesn't
currently manifest, but it's cheap insurance against future drift
(multi-authority rewards, debug keys, etc.). The DO UPDATE clause now
unions existing pool authorities with the incoming set.
Renamed the query to UpsertLegacyReplayRewardPool to reflect its
actual (and only) caller — the mig_<md5> shape was already gone (#5).
2. senderGateForRM AUDIO-only fallback (#2).
The legacy validator/AAO trust set used to be the fallback for ANY
RM without a pool. That was a quietly-permissive seam — any caller
could request validator-signed attestations for an arbitrary unknown
RM. Now the fallback applies only when the requested RM equals the
configured AUDIO RM; every other no-pool RM gets
ErrSenderGateUnknownRM, which the handlers map to InvalidArgument.
3. Stale doc comment in rewards_legacy.go (#6) saying the file did
"synthetic-pool fallback for create" — predates the mig_<md5>
removal. Updated to describe the launchpad-lookup behavior.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
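The difference between overwrite and union on replay is easiest to see in code. The sketch below is an in-memory stand-in for the `DO UPDATE` clause; the function name is hypothetical, and sorting the result is an assumption made here so that the output is deterministic:

```go
package main

import (
	"sort"
	"strings"
)

// unionAuthorities merges incoming authorities into the existing set.
// Lowercasing, de-duplicating and sorting make the result independent of
// the order in which legacy rewards are replayed, unlike a hard overwrite
// that keeps only the last reward's authorities.
func unionAuthorities(existing, incoming []string) []string {
	seen := make(map[string]struct{}, len(existing)+len(incoming))
	for _, a := range existing {
		seen[strings.ToLower(a)] = struct{}{}
	}
	for _, a := range incoming {
		seen[strings.ToLower(a)] = struct{}{}
	}
	out := make([]string, 0, len(seen))
	for a := range seen {
		out = append(out, a)
	}
	sort.Strings(out)
	return out
}
```

Replaying rewards with authorities `0xB` then `0xa` gives the same pool as replaying `0xA` then `0xb`, which is the property the migration backfill and a from-genesis block sync need to share.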
* CreateRewardPool: require ed25519 signature from RM keypair
Closes the pool-creation frontrunning vector. Today's
validateCreateRewardPool only requires signer ∈ initial_authorities,
which an attacker satisfies trivially by listing themselves. After a
new reward manager is initialized on Solana, an observer who watches
init events can race the legitimate launchpad operator's
CreateRewardPool and register a pool with attacker-chosen authorities;
the legitimate operator is then locked out (PK conflict on
rewards_manager_pubkey), and the attacker controls every reward and
sender attestation under the RM.
Defense rests on a property of the existing system: the Solana
rewardManagerState account is a deterministic ed25519 keypair, derived
by the launchpad relay as
Keypair.fromSeed(sha256(launchpadDeterministicSecret ||
'audius-launchpad' ||
'reward-manager' ||
mint))
(see apps/.../solana-relay/.../launchpad/launch_coin.ts). The
launchpad has the secret and can re-derive the keypair at will; an
attacker who lacks the secret cannot. The 32-byte rewardManagerState
public key IS what cometbft has been carrying as
rewards_manager_pubkey — so we already have an ed25519 verification
key in hand at validate time.
This commit:
1. Adds CreateRewardPool.rm_owner_signature (proto tag 3, bytes).
2. Defines a canonical signing payload in pkg/rewards:
"audius:create-reward-pool:" + chain_id + ":" +
rm_pubkey_b58 + ":" + sorted_lowercased_authorities.join(",")
and a SignCreateRewardPool helper for client-side use.
3. validateCreateRewardPool and finalizeCreateRewardPool each call
verifyRewardPoolOwnerSignature, which decodes rm_pubkey from
base58 and runs ed25519.Verify against the canonical payload.
Defense-in-depth at finalize matches the existing pattern for
replay-time invariants.
4. Updates SDK example (examples/rewards/main.go) and integration
tests to populate the signature. New unit tests cover positive
verification, canonicalization invariance, foreign-keypair
rejection, cross-chain replay, mismatched authorities, malformed
signature length, and rm_pubkey shape errors.
The existing signer ∈ initial_authorities check is retained alongside
the new ed25519 gate. They're independent: the ed25519 sig proves
control of the RM keypair (frontrunning defense); the membership
check enforces the existing "you can't create a pool you have no
membership in" property. Operationally, the launchpad relay holds
both the per-mint claim authority eth key (envelope signer + the only
initial authority) and the RM ed25519 keypair, so producing both
signatures is symmetric.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
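The derivation and signature check described in this commit can be sketched with the standard library alone. Several details are assumptions of the sketch rather than facts from the repository: the byte encoding of the secret and mint, lowercasing before sorting, and the function names. Base58 decoding of the RM pubkey is left to the caller, since it needs a third-party package.

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"sort"
	"strings"
)

// deriveRMKey follows the launchpad derivation quoted above:
// seed = sha256(secret || "audius-launchpad" || "reward-manager" || mint).
func deriveRMKey(secret, mint []byte) ed25519.PrivateKey {
	h := sha256.New()
	h.Write(secret)
	h.Write([]byte("audius-launchpad"))
	h.Write([]byte("reward-manager"))
	h.Write(mint)
	return ed25519.NewKeyFromSeed(h.Sum(nil)) // sha256 output is the 32-byte seed
}

// canonicalPayload builds the signing string from this commit message.
func canonicalPayload(chainID, rmPubkeyB58 string, authorities []string) []byte {
	auths := make([]string, len(authorities))
	for i, a := range authorities {
		auths[i] = strings.ToLower(a)
	}
	sort.Strings(auths)
	return []byte("audius:create-reward-pool:" + chainID + ":" +
		rmPubkeyB58 + ":" + strings.Join(auths, ","))
}

// signCreateRewardPool is the client-side half: sign with the RM keypair.
func signCreateRewardPool(rmKey ed25519.PrivateKey, chainID, rmPubkeyB58 string, authorities []string) []byte {
	return ed25519.Sign(rmKey, canonicalPayload(chainID, rmPubkeyB58, authorities))
}

// verifyOwnerSignature is the validator-side half. rmPub is assumed to be
// the already base58-decoded form of rmPubkeyB58.
func verifyOwnerSignature(rmPub ed25519.PublicKey, sig []byte, chainID, rmPubkeyB58 string, authorities []string) bool {
	if len(rmPub) != ed25519.PublicKeySize {
		return false // ed25519.Verify panics on a wrong-length key
	}
	return ed25519.Verify(rmPub, canonicalPayload(chainID, rmPubkeyB58, authorities), sig)
}
```

An observer without the launchpad secret cannot reproduce `deriveRMKey`, so listing themselves in `initial_authorities` is no longer enough to register a pool.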
* Move rm_owner_signature to envelope, sign body bytes
Restructuring on top of the previous commit: the ed25519
rm_owner_signature moves from CreateRewardPool.rm_owner_signature (tag
3, signing a custom canonical string) to
RewardPoolMessage.rm_owner_signature (envelope-level, signing the same
ProtoMarshal(body) bytes the secp256k1 envelope signature covers).
Why:
- One encoding to maintain instead of two. Cross-language clients
(the TS launchpad relay) now sign the same bytes for both
signatures; no separate domain-separated string format to keep in
sync.
- Body bytes implicitly cover deadline_block_height + the action
oneof discriminator. The earlier custom string didn't include
deadline; stale-deadline replay was technically possible (though
blocked by pool PK uniqueness).
- Future fields added to RewardPoolBody / CreateRewardPool are
automatically covered without revving the signing scheme.
Not included: chain_id in the body. Cross-chain replay isn't a
concrete threat — each environment's launchpad uses a different
deterministic secret, so the same rewards_manager_pubkey cannot be
derived on more than one chain. A captured CreateRewardPool replayed
on another chain refers to an RM that doesn't exist there.
Other changes:
- pkg/common.ProtoSignableBytes (new): exports the deterministic-
marshal helper so verifyRewardPoolOwnerSignature can hash the
same bytes ProtoSign / ProtoRecover use.
- SDK signAndSendRewardPool takes an rmOwnerSig parameter; the
CreateRewardPool wrapper accepts an ed25519.PrivateKey and signs
body bytes locally. SetRewardPoolAuthorities passes nil — rotation
is gated by current pool authorities, no RM signature needed.
- Removed pkg/rewards.SignCreateRewardPool /
CanonicalCreateRewardPoolPayload / CreateRewardPoolOwnerSignatureDomain
— replaced by the body-bytes signing path.
- Updated unit tests, integration tests, and example to populate
rmKey at the SDK call site rather than constructing a signed
message struct.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
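With both signatures covering the same marshaled body, the client side reduces to the sketch below. The names are hypothetical and `body` stands for the deterministic `ProtoMarshal(body)` output; the secp256k1 step itself is omitted because it needs a third-party package, so only its input digest is shown. The length check is a defensive choice in this sketch, since `ed25519.Sign` panics on a key of the wrong size.

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"errors"
)

// ErrBadRMKey is a hypothetical typed error for malformed keys.
var ErrBadRMKey = errors.New("rm key must be a 64-byte ed25519 private key")

// signBodyWithRMKey signs the body bytes directly. ed25519 hashes
// internally, so the bytes are not pre-hashed.
func signBodyWithRMKey(rmKey ed25519.PrivateKey, body []byte) ([]byte, error) {
	if len(rmKey) != ed25519.PrivateKeySize {
		return nil, ErrBadRMKey // nil, truncated, or a public key by mistake
	}
	return ed25519.Sign(rmKey, body), nil
}

// envelopeDigest is the input to the secp256k1 envelope signature:
// sha256 over the same body bytes.
func envelopeDigest(body []byte) [32]byte {
	return sha256.Sum256(body)
}

// roundTrip generates a throwaway keypair, signs body, and verifies it.
func roundTrip(body []byte) (bool, error) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand.Reader
	if err != nil {
		return false, err
	}
	sig, err := signBodyWithRMKey(priv, body)
	if err != nil {
		return false, err
	}
	return ed25519.Verify(pub, body, sig), nil
}
```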
* Renumber reward-pools migration 00033 → 00034
main shipped a different 00033 (drop_redundant_tx_hash_index, #205)
while this branch was open. Bump ours to 00034 to keep migration
ordering unambiguous; content is unchanged.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* Address PR #254 review feedback
Five Copilot-flagged items from the latest review:
1. Proto signature comments now spell out the actual pre-hashing:
- RewardMessage.signature: secp256k1 over sha256(body bytes).
- RewardPoolMessage.signature: same.
- RewardPoolMessage.rm_owner_signature: ed25519 over body bytes
directly (ed25519 hashes internally — do NOT pre-hash).
Lets non-Go clients reproduce signatures without reading
pkg/common/crypto.go.
2. Split validateRewardsManagerPubkey:
- validateRewardsManagerPubkeyShape (new): non-empty, no whitespace,
base58, 32 bytes. Pure shape. For read paths and rotation paths.
- validateRewardsManagerPubkey (existing): shape + AUDIO denylist.
Only for write paths (CreateRewardPool, CreateReward).
Switched call sites:
- validateSetRewardPoolAuthorities / finalizeSetRewardPoolAuthorities
→ Shape. SetAuthorities targets an existing pool; AUDIO has no
pool by construction, so checkPoolAuthorization surfaces the case
as "pool not found" rather than the misleading "is reserved".
- GetRewardPool → Shape. Probing GetRewardPool(AudioRM) now returns
a clean NotFound instead of InvalidArgument.
- GetRewardSenderAttestation /
GetDeleteRewardSenderAttestation → add Shape validation up front
so malformed pubkeys return a clear InvalidArgument instead of
falling through to ErrSenderGateUnknownRM (which is for valid-
shape-but-unmapped RMs).
3. Removed the stale "chain_id is covered by signed body bytes"
reference in validateCreateRewardPool's comment — the body
doesn't carry chain_id, and that's intentional (cross-chain replay
isn't a threat because per-env launchpad secrets prevent the same
rewards_manager_pubkey from existing on more than one chain).
4. SDK CreateRewardPool now validates rmKey length up front and
returns a typed error instead of panicking inside ed25519.Sign for
callers that pass nil / hex-decode-wrong / public-key-by-mistake.
5. GetRewardAttestation now TrimSpaces eth_recipient_address,
reward_address, and claim_authority at the boundary so
surrounding whitespace returns a clean InvalidArgument here
instead of a confusing hex-decode error deeper in
RewardClaim.Compile.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
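The shape/denylist split and the sender-gate ordering can be sketched together. Everything below is illustrative: the names are hypothetical, the gate is reduced to a string label, and a small base58 decoder is written inline so the example needs no third-party package.

```go
package main

import (
	"errors"
	"math/big"
	"strings"
)

const b58Alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

var (
	ErrBadShape            = errors.New("rewards_manager_pubkey is malformed")
	ErrSenderGateUnknownRM = errors.New("sender gate: unknown reward manager")
)

// decodeBase58 is a minimal Bitcoin-alphabet decoder.
func decodeBase58(s string) ([]byte, error) {
	n := new(big.Int)
	radix := big.NewInt(58)
	for _, r := range s {
		i := strings.IndexRune(b58Alphabet, r)
		if i < 0 {
			return nil, ErrBadShape
		}
		n.Mul(n, radix)
		n.Add(n, big.NewInt(int64(i)))
	}
	zeros := 0 // each leading '1' encodes one leading zero byte
	for zeros < len(s) && s[zeros] == '1' {
		zeros++
	}
	return append(make([]byte, zeros), n.Bytes()...), nil
}

// validateShape is the pure shape check: non-empty, no whitespace,
// base58, 32 bytes. No denylist.
func validateShape(rm string) error {
	if rm == "" || strings.ContainsAny(rm, " \t\r\n") {
		return ErrBadShape
	}
	b, err := decodeBase58(rm)
	if err != nil || len(b) != 32 {
		return ErrBadShape
	}
	return nil
}

// senderGate validates shape first, then dispatches: pool-gated if a pool
// exists, legacy trust set only for the configured AUDIO RM, otherwise
// ErrSenderGateUnknownRM.
func senderGate(rm, audioRM string, poolExists bool) (string, error) {
	if err := validateShape(rm); err != nil {
		return "", err // malformed input gets its own clear error
	}
	switch {
	case poolExists:
		return "pool", nil
	case audioRM != "" && rm == audioRM:
		return "legacy", nil
	default:
		return "", ErrSenderGateUnknownRM
	}
}
```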
* Address PR #254 review feedback (round 2)
Six items from raymondjacobson:
1. examples/rewards/main.go: drop REWARDS_MANAGER_SECRET_HEX env var
and generate a fresh ed25519 keypair inline. Strip the explainer
comments — the simpler example is self-documenting.
2. pkg/core/config/rewards.go: remove the staging-specific AUDIO RM
constant and the long comment about why staging is empty. The
AudioRewardsManagerPubkey() switch no longer special-cases stage,
so staging falls through to "" via the default branch, which the
denylist treats as "no enforcement." The reward_pools_test save/
restore no longer touches StageAudioRewardsManagerPubkey.
3. 00034_reward_pools.sql backfill comment: drop "leaked-key" framing,
replace with neutral "additional entries."
4 + 6. Sweep PR1/PR2/PR3/PR #225 references out of all bundle code
and comments — these labeled stacked-PR boundaries that no longer
exist now that the work is bundled. Phrasing now describes what
the code does, not which PR introduced it. Touched: connect.go,
reward_pools.go, rewards.go, rewards_legacy.go, reads.sql,
migration, proto, integration test.
5. reward_pools.go: drop the case-insensitive contains() helper and
use slices.Contains across all call sites. Pool authorities are
already canonicalized (lowercase) on write via
CanonicalAuthorities, so callers just lowercase the needle. Removes
~10 lines and a custom helper in favor of stdlib.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
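The membership check after dropping the custom helper is roughly the following. This is a sketch assuming pool authorities are already stored lowercase, as the commit message states; the function name is hypothetical.

```go
package main

import (
	"slices"
	"strings"
)

// isPoolAuthority reports whether addr is a current pool authority.
// authorities is assumed to be canonicalized (lowercase) on write, so
// only the needle needs lowercasing.
func isPoolAuthority(authorities []string, addr string) bool {
	return slices.Contains(authorities, strings.ToLower(addr))
}
```

Because the check reads the pool's current authorities, an address rotated out of the pool stops matching on the next call.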
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
The core_tx_stats table carries two btree indexes covering (tx_hash):
  "core_tx_stats_tx_hash_key" UNIQUE CONSTRAINT, btree (tx_hash)
  "idx_core_tx_hash" btree (tx_hash)
The UNIQUE constraint already provides a btree on tx_hash and dominates
the non-unique idx_core_tx_hash for every selectivity scenario.
pg_stat_user_indexes confirms idx_core_tx_hash accumulated zero scans
across all 20 OpenAudio fleet nodes after months of production traffic.
Each node holds ~5.20 GiB in this single redundant index. Total
dead-weight: ~104 GiB fleet-wide.
DROP CONCURRENTLY to avoid ACCESS EXCLUSIVE on core_tx_stats during
ingest. Rollback recreates online.
Sibling PR #205 covers the analogous drop on core_transactions
(idx_core_transactions_tx_hash, superseded by the functional lower()
index). The two are independent and can land in either order.
Co-authored-by: Ray Jacobson <ray@audius.co>
Drops online with `CONCURRENTLY` to avoid blocking block ingestion (a non-concurrent `DROP INDEX` would take `ACCESS EXCLUSIVE` on `core_transactions`). Rollback recreates the index online.
Operators running ad-hoc `WHERE tx_hash = '…'` in psql should switch to `WHERE lower(tx_hash) = lower('…')`; the functional index has been the canonical shipped pattern for a while.
Fleet evidence and `EXPLAIN` plans in comment below.