feat(core/db): drop redundant idx_core_transactions_tx_hash by RolfAris · Pull Request #205 · OpenAudio/go-openaudio

RolfAris · 2026-04-16T10:35:37Z

2026-05-02 update: This PR is now linked into a sibling-PR campaign at #218. Step 1 of 2; the parallel drop of idx_core_tx_hash on core_tx_stats is at #217. RFC has the consolidated zero-scan evidence across both tables. Either PR can land first.
The shipped tx-hash lookup (GetTx) uses lower(tx_hash) = lower($1) and rides idx_core_transactions_tx_hash_lower. The plain tx_hash index is not referenced by any query in pkg/core or pkg/etl; it is pure write-amp and disk overhead.

Drops online with CONCURRENTLY to avoid blocking block ingestion (a non-concurrent DROP INDEX would take ACCESS EXCLUSIVE on core_transactions). Rollback recreates the index online.

Operators running ad-hoc WHERE tx_hash = '…' in psql should switch to WHERE lower(tx_hash) = lower('…') — the functional index has been the canonical shipped pattern for a while.
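To make the difference between the two predicates concrete, here is a small Go sketch of their matching semantics for ASCII hex hashes. The function names and the sample hash are made up for illustration; the sketch only mirrors what `tx_hash = $1` and `lower(tx_hash) = lower($1)` compare, it does not query Postgres.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesExact mirrors the predicate tx_hash = $1 (case-sensitive).
func matchesExact(stored, input string) bool {
	return stored == input
}

// matchesLower mirrors lower(tx_hash) = lower($1) for ASCII input.
func matchesLower(stored, input string) bool {
	return strings.ToLower(stored) == strings.ToLower(input)
}

func main() {
	stored := "0xAbC123" // hypothetical stored hash with mixed case
	input := "0xabc123"  // what an operator might paste into psql

	fmt.Println("exact:", matchesExact(stored, input))
	fmt.Println("lower:", matchesLower(stored, input))
}
```

The lower-cased form matches regardless of the case the hash was stored or typed in, which is why ad-hoc lookups written that way can use the functional index.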

Fleet evidence and EXPLAIN plans in comment below.


RolfAris · 2026-04-16T10:36:00Z

Evidence

Code path audit (`pkg/core/db/sql/reads.sql`, `main` @ 2026-04-16)

Every core_transactions access:

| query | predicate | index used |
| --- | --- | --- |
| `GetTx` | `lower(tx_hash) = lower($1)` | `idx_core_transactions_tx_hash_lower` |
| `GetBlockTransactions` | `block_id = $1` | `idx_core_transactions_block_id` |
| `GetRecentTxs` | `order by created_at desc` | `idx_core_transactions_created_at` |
| `TotalTxResults` | `count(tx_hash)` | any |

No query in pkg/core or pkg/etl does exact-case tx_hash = $1 against core_transactions. Other tx_hash = $1 lookups in the tree target different tables (core_etl_tx, core_rewards, core_ern, core_mead, core_pie) and are untouched.

Live evidence — 20 independent validator nodes, Postgres 15.16

Stats cumulative since pg_postmaster_start_time ≈ 2026-04-13 (~3.2 days).

Per node (the universal unit — operators run 1, 3, 30 nodes):

| metric (per node) | value |
| --- | --- |
| `idx_core_transactions_tx_hash` size | ~5.20 GiB |
| `idx_core_transactions_tx_hash` scans | 0 (on all 20/20) |
| `idx_core_transactions_tx_hash_lower` scans > 0 | 14/20 nodes |
| `core_transactions` rows | ~58.5–58.8M |
| `core_transactions` inserts / day | ~50–120K |
| redundant btree inserts / day eliminated | ~50–120K |

Per node: ~5.20 GiB reclaimed; one fewer btree touched per core_transactions insert — smaller insert-path WAL, fewer dirty buffers, faster block apply. The write-amp reduction scales with a node's share of chain throughput, not with operator size.
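The zero-scan check behind these numbers can be sketched in Go as follows. This is an assumed shape, not the tooling actually used: `indexStat` and `zeroScanEverywhere` are hypothetical names, and the query string is one plausible way to read `idx_scan` from `pg_stat_user_indexes`. The counter only covers the period since statistics were last reset, so the result is only as strong as the observation window.

```go
package main

import "fmt"

// indexStatsQuery reads the cumulative scan counter for one index by name.
// Shown for context only; this sketch does not execute it.
const indexStatsQuery = `SELECT idx_scan FROM pg_stat_user_indexes WHERE indexrelname = $1`

// indexStat is one node's reading of the counter (hypothetical type).
type indexStat struct {
	Node  string
	Scans int64
}

// zeroScanEverywhere reports whether every sampled node shows zero scans.
// An empty sample is treated as "no evidence", not as a pass.
func zeroScanEverywhere(stats []indexStat) bool {
	if len(stats) == 0 {
		return false
	}
	for _, s := range stats {
		if s.Scans != 0 {
			return false
		}
	}
	return true
}

func main() {
	// Three of the node names mentioned in this thread, with the reported value of 0.
	stats := []indexStat{
		{Node: "val001", Scans: 0},
		{Node: "val002", Scans: 0},
		{Node: "val020", Scans: 0},
	}
	fmt.Println("safe to consider dropping:", zeroScanEverywhere(stats))
}
```

Treating an empty sample as a failure is a deliberate choice here: a missing measurement should never read as proof that an index is unused.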

`EXPLAIN (ANALYZE, BUFFERS)` — val001, val002, val020

```
Limit (cost=0.69..8.71 rows=1 width=557) (actual time=0.6–1.1 ms)
-> Index Scan using idx_core_transactions_tx_hash_lower on core_transactions
Index Cond: (lower(tx_hash) = lower($1))
```

Planner picks the functional index on every node tested. Sub-ms exec.

Our rollout plan

Canary one validator first, 24h soak, then staggered fleet rollout. Will report back if anything looks off.

RolfAris · 2026-04-17T11:01:25Z

Canary report — val001, T+24h

DROP INDEX CONCURRENTLY IF EXISTS idx_core_transactions_tx_hash applied to val001 at 2026-04-16T10:42Z. 24h observation complete, posted automatically from the canary host.

Index state

idx_core_transactions_tx_hash: GONE
idx_core_transactions_tx_hash_lower: present (5313 MB)
Planner still selects the functional index for GetTx (EXPLAIN match: 1)
_lower scans: 583 → 742 (+159 in the 24h window)

Chain state

live=true, height=23042673, total_tx=59478326
peers healthy: 71/71

Logs (last 24h)

Postgres / migration / core_transactions errors: 0

No regressions observed. Ready to stagger-roll the remaining 19 nodes once this PR merges.


* Reward authority rotation primitive (PRs 222 + 225 + 228 bundled)

Three logical chunks bundled onto a single branch off main:

PR1: Schema (mjp-reward-pools-schema):
- New core_reward_pools table keyed by Solana RM pubkey, with a text[] authorities column (gin-indexed for @> containment).
- launchpad_authority_rm seed table mapping every known launchpad-derived per-mint claim authority → its Solana reward manager state account. Used by both the migration backfill and PR2's wire-compat replay logic.
- core_rewards.rewards_manager_pubkey FK column; claim_authorities column dropped (reads now alias coalesce(p.authorities, '{}') via LEFT JOIN on core_reward_pools).
- Backfill creates one pool per RM (per-RM authority union across all rewards referencing it via launchpad lookup). Rows whose authorities don't match any launchpad RM stay NULL; there are no synthetic mig_<md5> identifiers.
- Live finalizeCreateReward (legacy proto shape, brief PR1-only window) does launchpad lookup → bind to existing pool only; never upserts. NULL fallback if no match or pool missing.

PR2: CometBFT transactions (mjp-reward-pools-tx):
- New body+signature envelope: Tx { TxBody body; signatures[] }. Reward and RewardPool messages move to the new shape.
- CreateRewardPool / SetRewardPoolAuthorities txs gated by real-RM-shape pubkey + signer ∈ current pool authorities.
- CreateReward proto reserves tags 4-6 (former claim_authorities, deadline, signature) and uses tag 7 for rewards_manager_pubkey. DeleteReward reserves tags 2-3.
- Wire-compat layer (rewards_legacy.go): legacy bytes are REJECTED at CheckTx/ProcessProposal (no new legacy txs accepted) but ACCEPTED at FinalizeBlock for block-sync replay of historical chain state. Replay uses launchpad lookup to bind legacy rewards to the same RM the migration produced.
- Defense-in-depth re-validation at finalize for both pool txs (block-sync replay skips ProcessProposal / CheckTx).

PR3: Validator endpoint cutover (mjp-reward-pools-endpoints):
- GetRewardAttestation restored from the #215 kill-switch. Auth check uses dbReward.ClaimAuthorities, which is sourced from coalesce(p.authorities, '{}'), so rotating an authority out via SetRewardPoolAuthorities immediately revokes attestation rights. RewardClaim.RewardAddress is intentionally NOT set (Solana reward manager program expects 2-piece RewardID: Specifier disbursement_id).
- GetRewardSenderAttestation / GetDeleteRewardSenderAttestation dispatch by RM: pool-gated if pool exists, else fall back to the legacy validator/AAO trust set (AUDIO path).
- AUDIO RM denylist on validateRewardsManagerPubkey: prevents an attacker from creating a pool for the AUDIO RM and inheriting AUDIO sender attestations. Per-env constants in pkg/core/config/rewards.go (dev/prod populated; stage left empty intentionally).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Tighten reward-pool gating: union replay, AUDIO-only fallback

Three review-driven fixes on the bundle branch:

1. Replay/migration apphash divergence (#1). UpsertSyntheticRewardPool was a hard overwrite, which produced pool.authorities = last-replayed-reward.authorities on a from-genesis block-sync, diverging from the migration backfill, which UNIONs authorities across every legacy reward referencing the RM. Production data has at most one authority per reward today, so the bug doesn't currently manifest, but it's cheap insurance against future drift (multi-authority rewards, debug keys, etc.). The DO UPDATE clause now unions existing pool authorities with the incoming set. Renamed the query to UpsertLegacyReplayRewardPool to reflect its actual (and only) caller; the mig_<md5> shape was already gone (#5).

2. senderGateForRM AUDIO-only fallback (#2). The legacy validator/AAO trust set used to be the fallback for ANY RM without a pool. That was a quietly-permissive seam: any caller could request validator-signed attestations for an arbitrary unknown RM. Now the fallback applies only when the requested RM equals the configured AUDIO RM; every other no-pool RM gets ErrSenderGateUnknownRM, which the handlers map to InvalidArgument.

3. Stale doc comment in rewards_legacy.go (#6) saying the file did "synthetic-pool fallback for create", which predates the mig_<md5> removal. Updated to describe the launchpad-lookup behavior.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* CreateRewardPool: require ed25519 signature from RM keypair

Closes the pool-creation frontrunning vector. Today's validateCreateRewardPool only requires signer ∈ initial_authorities, which an attacker satisfies trivially by listing themselves. After a new reward manager is initialized on Solana, an observer who watches init events can race the legitimate launchpad operator's CreateRewardPool and register a pool with attacker-chosen authorities; the legitimate operator is then locked out (PK conflict on rewards_manager_pubkey), and the attacker controls every reward and sender attestation under the RM.

Defense rests on a property of the existing system: the Solana rewardManagerState account is a deterministic ed25519 keypair, derived by the launchpad relay as Keypair.fromSeed(sha256(launchpadDeterministicSecret || 'audius-launchpad' || 'reward-manager' || mint)) (see apps/.../solana-relay/.../launchpad/launch_coin.ts). The launchpad has the secret and can re-derive the keypair at will; an attacker who lacks the secret cannot. The 32-byte rewardManagerState public key IS what cometbft has been carrying as rewards_manager_pubkey, so we already have an ed25519 verification key in hand at validate time.

This commit:

1. Adds CreateRewardPool.rm_owner_signature (proto tag 3, bytes).
2. Defines a canonical signing payload in pkg/rewards: "audius:create-reward-pool:" + chain_id + ":" + rm_pubkey_b58 + ":" + sorted_lowercased_authorities.join(",") and a SignCreateRewardPool helper for client-side use.
3. validateCreateRewardPool and finalizeCreateRewardPool each call verifyRewardPoolOwnerSignature, which decodes rm_pubkey from base58 and runs ed25519.Verify against the canonical payload. Defense-in-depth at finalize matches the existing pattern for replay-time invariants.
4. Updates SDK example (examples/rewards/main.go) and integration tests to populate the signature.

New unit tests cover positive verification, canonicalization invariance, foreign-keypair rejection, cross-chain replay, mismatched authorities, malformed signature length, and rm_pubkey shape errors.

The existing signer ∈ initial_authorities check is retained alongside the new ed25519 gate. They're independent: the ed25519 sig proves control of the RM keypair (frontrunning defense); the membership check enforces the existing "you can't create a pool you have no membership in" property. Operationally, the launchpad relay holds both the per-mint claim authority eth key (envelope signer + the only initial authority) and the RM ed25519 keypair, so producing both signatures is symmetric.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Move rm_owner_signature to envelope, sign body bytes

Restructuring on top of the previous commit: the ed25519 rm_owner_signature moves from CreateRewardPool.rm_owner_signature (tag 3, signing a custom canonical string) to RewardPoolMessage.rm_owner_signature (envelope-level, signing the same ProtoMarshal(body) bytes the secp256k1 envelope signature covers).

Why:
- One encoding to maintain instead of two. Cross-language clients (the TS launchpad relay) now sign the same bytes for both signatures; no separate domain-separated string format to keep in sync.
- Body bytes implicitly cover deadline_block_height + the action oneof discriminator. The earlier custom string didn't include deadline; stale-deadline replay was technically possible (though blocked by pool PK uniqueness).
- Future fields added to RewardPoolBody / CreateRewardPool are automatically covered without revving the signing scheme.

Not included: chain_id in the body. Cross-chain replay isn't a concrete threat: each environment's launchpad uses a different deterministic secret, so the same rewards_manager_pubkey cannot be derived on more than one chain. A captured CreateRewardPool replayed on another chain refers to an RM that doesn't exist there.

Other changes:
- pkg/common.ProtoSignableBytes (new): exports the deterministic-marshal helper so verifyRewardPoolOwnerSignature can hash the same bytes ProtoSign / ProtoRecover use.
- SDK signAndSendRewardPool takes an rmOwnerSig parameter; the CreateRewardPool wrapper accepts an ed25519.PrivateKey and signs body bytes locally. SetRewardPoolAuthorities passes nil: rotation is gated by current pool authorities, no RM signature needed.
- Removed pkg/rewards.SignCreateRewardPool / CanonicalCreateRewardPoolPayload / CreateRewardPoolOwnerSignatureDomain, replaced by the body-bytes signing path.
- Updated unit tests, integration tests, and example to populate rmKey at the SDK call site rather than constructing a signed message struct.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Renumber reward-pools migration 00033 → 00034

main shipped a different 00033 (drop_redundant_tx_hash_index, #205) while this branch was open. Bump ours to 00034 to keep migration ordering unambiguous; content is unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Address PR #254 review feedback

Six Copilot-flagged items from the latest review:

1. Proto signature comments now spell out the actual pre-hashing:
   - RewardMessage.signature: secp256k1 over sha256(body bytes).
   - RewardPoolMessage.signature: same.
   - RewardPoolMessage.rm_owner_signature: ed25519 over body bytes directly (ed25519 hashes internally, so do NOT pre-hash).
   Lets non-Go clients reproduce signatures without reading pkg/common/crypto.go.

2. Split validateRewardsManagerPubkey:
   - validateRewardsManagerPubkeyShape (new): non-empty, no whitespace, base58, 32 bytes. Pure shape. For read paths and rotation paths.
   - validateRewardsManagerPubkey (existing): shape + AUDIO denylist. Only for write paths (CreateRewardPool, CreateReward).
   Switched call sites:
   - validateSetRewardPoolAuthorities / finalizeSetRewardPoolAuthorities → Shape. SetAuthorities targets an existing pool; AUDIO has no pool by construction, so checkPoolAuthorization surfaces the case as "pool not found" rather than the misleading "is reserved".
   - GetRewardPool → Shape. Probing GetRewardPool(AudioRM) now returns a clean NotFound instead of InvalidArgument.
   - GetRewardSenderAttestation / GetDeleteRewardSenderAttestation → add Shape validation up front so malformed pubkeys return a clear InvalidArgument instead of falling through to ErrSenderGateUnknownRM (which is for valid-shape-but-unmapped RMs).

3. Removed the stale "chain_id is covered by signed body bytes" reference in validateCreateRewardPool's comment. The body doesn't carry chain_id, and that's intentional (cross-chain replay isn't a threat because per-env launchpad secrets prevent the same rewards_manager_pubkey from existing on more than one chain).

4. SDK CreateRewardPool now validates rmKey length up front and returns a typed error instead of panicking inside ed25519.Sign for callers that pass nil / hex-decode-wrong / public-key-by-mistake.

5. GetRewardAttestation now TrimSpaces eth_recipient_address, reward_address, and claim_authority at the boundary so surrounding whitespace returns a clean InvalidArgument here instead of a confusing hex-decode error deeper in RewardClaim.Compile.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #254 review feedback (round 2)

Six items from raymondjacobson:

1. examples/rewards/main.go: drop REWARDS_MANAGER_SECRET_HEX env var and generate a fresh ed25519 keypair inline. Strip the explainer comments; the simpler example is self-documenting.

2. pkg/core/config/rewards.go: remove the staging-specific AUDIO RM constant and the long comment about why staging is empty. The AudioRewardsManagerPubkey() switch no longer special-cases stage, so staging falls through to "" via the default branch, which the denylist treats as "no enforcement." The reward_pools_test save/restore no longer touches StageAudioRewardsManagerPubkey.

3. 00034_reward_pools.sql backfill comment: drop "leaked-key" framing, replace with neutral "additional entries."

4 + 6. Sweep PR1/PR2/PR3/PR #225 references out of all bundle code and comments. These labeled stacked-PR boundaries that no longer exist now that the work is bundled. Phrasing now describes what the code does, not which PR introduced it. Touched: connect.go, reward_pools.go, rewards.go, rewards_legacy.go, reads.sql, migration, proto, integration test.

5. reward_pools.go: drop the case-insensitive contains() helper and use slices.Contains across all call sites. Pool authorities are already canonicalized (lowercase) on write via CanonicalAuthorities, so callers just lowercase the needle. Removes ~10 lines and a custom helper in favor of stdlib.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

The core_tx_stats table carries two btree indexes covering (tx_hash):

    "core_tx_stats_tx_hash_key" UNIQUE CONSTRAINT, btree (tx_hash)
    "idx_core_tx_hash" btree (tx_hash)

The UNIQUE constraint already provides a btree on tx_hash and dominates the non-unique idx_core_tx_hash for every selectivity scenario. pg_stat_user_indexes confirms idx_core_tx_hash accumulated zero scans across all 20 OpenAudio fleet nodes after months of production traffic.

Each node holds ~5.20 GiB in this single redundant index. Total dead-weight: ~104 GiB fleet-wide.

DROP CONCURRENTLY to avoid ACCESS EXCLUSIVE on core_tx_stats during ingest. Rollback recreates online.

Sibling PR #205 covers the analogous drop on core_transactions (idx_core_transactions_tx_hash, superseded by the functional lower() index). The two are independent and can land in either order.

Co-authored-by: Ray Jacobson <ray@audius.co>

This was referenced May 1, 2026

feat(core/db): drop redundant idx_core_tx_hash on core_tx_stats #217

Merged

RFC: Drop two zero-scan redundant Postgres indexes (campaign for #205 + #217) #218

Closed

RolfAris and others added 2 commits May 5, 2026 16:10

Merge remote-tracking branch 'origin/main' into feat/drop-redundant-tx-hash-index

a5dd47b

Merge branch 'main' into feat/drop-redundant-tx-hash-index

9cc8b91

raymondjacobson self-requested a review May 11, 2026 22:58

raymondjacobson approved these changes May 11, 2026

View reviewed changes

raymondjacobson merged commit 42aea5d into OpenAudio:main May 11, 2026
4 checks passed

raymondjacobson merged 3 commits into OpenAudio:main from RolfAris:feat/drop-redundant-tx-hash-index
