docs: partitioning runbook for high-volume tables #50

Merged: github-actions[bot] merged 1 commit into main from docs/indexer-partitioning on May 10, 2026
@satyakwok
Member

Summary

Tier 3 partitioning lands as a runbook + DDL templates rather than an auto-migration. Three reasons:

  1. Partition migration locks the table for the duration of `INSERT SELECT`. On a 50M+ row table that means a read-only window lasting from minutes to tens of minutes. Auto-running on container boot would block every consumer for that whole time without warning.
  2. The trigger thresholds (50M / 100M rows, p95 latency >200ms, autovacuum lag) need operator judgement, not a hardcoded check.
  3. Drizzle doesn't model `PARTITION BY` declaratively, so the schema migration would have to be raw SQL anyway. Cleaner to keep the SQL in docs where it can be reviewed step-by-step.
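
To make the second reason concrete, here is a minimal sketch of an advisory check that only reports which documented signals are crossed and leaves the decision to the operator. The row thresholds and the 200ms p95 figure come from this PR; the function name, the field names, and the choice of `>=` for row counts are assumptions for illustration. Autovacuum lag is left out because the PR gives no number for it.

```typescript
// Advisory only: reports crossed signals, never starts a migration.
// Thresholds are taken from the PR text; all identifiers are hypothetical.
const ROW_THRESHOLDS = {
  transactions: 50_000_000,
  logs: 100_000_000,
  token_transfers: 50_000_000,
} as const;
const P95_LATENCY_MS = 200;

type TableName = keyof typeof ROW_THRESHOLDS;

interface TableStats {
  table: TableName;
  rowCount: number;
  p95LatencyMs: number;
}

function migrationSignals(stats: TableStats): string[] {
  const signals: string[] = [];
  const rowLimit = ROW_THRESHOLDS[stats.table];
  // Assumption: reaching the threshold exactly already counts as a signal.
  if (stats.rowCount >= rowLimit) {
    signals.push(`${stats.table}: ${stats.rowCount} rows >= ${rowLimit}`);
  }
  if (stats.p95LatencyMs > P95_LATENCY_MS) {
    signals.push(
      `${stats.table}: p95 ${stats.p95LatencyMs}ms > ${P95_LATENCY_MS}ms`,
    );
  }
  return signals;
}
```

An empty result means neither signal fired. A non-empty result is a prompt to open the runbook, not an instruction to migrate.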

Strategy

Range-partition by `block_height`, 1M blocks per partition (~11.6 days at 1s blocks). Affected tables:

| Table | Trigger threshold |
| --- | --- |
| `transactions` | 50M rows |
| `logs` | 100M rows |
| `token_transfers` | 50M rows |

`addresses` and `blocks` stay non-partitioned (bounded sizes).
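
The partition arithmetic can be sketched as follows. The 1M-block width comes from the strategy above; the function names and the zero-based partition index are assumptions made for illustration, not part of the runbook. At 1s blocks one partition spans 1,000,000 s / 86,400 s per day, which is about 11.57 days.

```typescript
// Sketch only: the partition width is from the PR; names are hypothetical.
const BLOCKS_PER_PARTITION = 1_000_000;
const SECONDS_PER_DAY = 86_400;

interface PartitionBounds {
  index: number;      // 0 for blocks [0, 1M), 1 for [1M, 2M), and so on
  fromHeight: number; // inclusive lower bound (FOR VALUES FROM)
  toHeight: number;   // exclusive upper bound (FOR VALUES TO)
}

// Map a block height to the range partition that would hold it.
function partitionFor(blockHeight: number): PartitionBounds {
  if (!Number.isInteger(blockHeight) || blockHeight < 0) {
    throw new RangeError(`invalid block height: ${blockHeight}`);
  }
  const index = Math.floor(blockHeight / BLOCKS_PER_PARTITION);
  return {
    index,
    fromHeight: index * BLOCKS_PER_PARTITION,
    toHeight: (index + 1) * BLOCKS_PER_PARTITION,
  };
}

// Wall-clock span of one partition for a given block time.
function partitionSpanDays(blockTimeSeconds: number): number {
  return (BLOCKS_PER_PARTITION * blockTimeSeconds) / SECONDS_PER_DAY;
}
```

For example, `partitionFor(1_234_567)` yields index 1 with bounds 1,000,000 (inclusive) to 2,000,000 (exclusive), which matches how PostgreSQL treats `FROM` and `TO` in range partitions.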

What lands

`docs/PARTITIONING.md` — full runbook including:

  • When-to-migrate signals
  • Per-table recipe (LOCK + RENAME + recreate as PARTITION BY + INSERT SELECT + index recreation + verify + DROP legacy)
  • Weekly auto-extender SQL for staying ahead of writes
  • Drizzle compatibility notes (existing schema.ts unchanged; PostgreSQL handles partition routing and pruning transparently)
  • Rollback procedure
  • pg_partman tradeoff discussion
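
As an illustration of the weekly auto-extender idea, the sketch below generates the DDL for partitions that should exist ahead of the chain tip. It is not the SQL from `docs/PARTITIONING.md`: the `<table>_p<index>` child naming, the function name, and the default lookahead are assumptions. The table names and the 1M-block width come from this PR.

```typescript
// Hypothetical helper: builds CREATE TABLE ... PARTITION OF statements for
// the current partition and a few partitions ahead of the chain tip.
const BLOCKS_PER_PARTITION = 1_000_000;

type PartitionedTable = "transactions" | "logs" | "token_transfers";

function extenderDdl(
  table: PartitionedTable,
  currentHeight: number,
  lookaheadPartitions = 2, // assumption: keep two partitions of slack
): string[] {
  if (!Number.isInteger(currentHeight) || currentHeight < 0) {
    throw new RangeError(`invalid block height: ${currentHeight}`);
  }
  const current = Math.floor(currentHeight / BLOCKS_PER_PARTITION);
  const statements: string[] = [];
  for (let i = current; i <= current + lookaheadPartitions; i++) {
    const from = i * BLOCKS_PER_PARTITION;
    const to = from + BLOCKS_PER_PARTITION;
    // IF NOT EXISTS keeps a repeated weekly run idempotent.
    statements.push(
      `CREATE TABLE IF NOT EXISTS ${table}_p${i} PARTITION OF ${table} ` +
        `FOR VALUES FROM (${from}) TO (${to});`,
    );
  }
  return statements;
}
```

At 1s blocks a week adds 604,800 blocks, which is less than one partition width, so a weekly run with any lookahead of at least one partition stays ahead of writes under that assumption.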

Test plan

  • On a staging clone of the production indexer DB, run the full `transactions` recipe end-to-end; verify row count parity, and check that sample-query plans show partition pruning (only the relevant partitions appear in `EXPLAIN` output)
  • Confirm Drizzle migration tracking still works after raw-SQL partition migration (one no-op Drizzle migration recorded post-hoc)
  • Verify with `EXPLAIN ANALYZE` that the planner, for the `/address/:addr/txs` query, uses `txs_from_block_idx` on the relevant partition only
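
The pruning checks above could be partly automated with a small helper that lists which child partitions a text-format `EXPLAIN` plan touches. This is a sketch under stated assumptions: children are named `<parent>_p<index>` (a hypothetical convention), the parent name contains only word characters, and the plan is in PostgreSQL's default text format.

```typescript
// Sketch: extract the child partitions named by scan nodes in a text plan.
// The `<parent>_p<index>` naming scheme is an assumption for illustration.
function scannedPartitions(planText: string, parent: string): string[] {
  // Scan nodes that name a relation. "Bitmap Index Scan on <index>" is not
  // matched on purpose because it names an index, not a table.
  const scanNode =
    /(?:Seq Scan|Bitmap Heap Scan|Index(?: Only)? Scan(?: Backward)? using \S+) on (\S+)/;
  const child = new RegExp(`^${parent}_p\\d+$`);
  const found = new Set<string>();
  for (const line of planText.split("\n")) {
    const name = scanNode.exec(line)?.[1];
    if (name !== undefined && child.test(name)) {
      found.add(name);
    }
  }
  return Array.from(found).sort();
}
```

A query filtered to a single block range should yield exactly one partition name. A longer list suggests the filter is not letting the planner prune on `block_height`.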

@github-actions bot enabled auto-merge (squash) on May 10, 2026 20:56
@github-actions bot merged commit 9d52a20 into main on May 10, 2026
5 checks passed