docs: partitioning runbook for high-volume tables by satyakwok · Pull Request #50 · Sentriscloud/indexer

satyakwok · 2026-05-10T20:55:54Z

Summary

Tier 3 partitioning lands as a runbook + DDL templates rather than an auto-migration. Three reasons:

1. Partition migration locks the table for the duration of the `INSERT SELECT`. On a 50M+ row table, that means a read-only window of minutes to tens of minutes. Auto-running on container boot would block every consumer for that whole time without warning.
2. The trigger thresholds (50M / 100M rows, p95 latency >200ms, autovacuum lag) need operator judgement, not a hardcoded check.
3. Drizzle doesn't model `PARTITION BY` declaratively, so the schema migration would have to be raw SQL anyway. It is cleaner to keep the SQL in docs where it can be reviewed step by step.
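
To make the second reason concrete, here is a minimal TypeScript sketch of an advisory check that only reports which signals have fired and leaves the decision to the operator. The names `firedSignals`, `TableStats`, and `ROW_THRESHOLDS` are hypothetical, the `>=` comparison on row counts is an assumption, and autovacuum lag is left out because this PR gives no numeric threshold for it.

```typescript
// Advisory only: reports which migration signals have fired.
// It never starts a migration; the operator makes that call.

type PartitionCandidate = "transactions" | "logs" | "token_transfers";

// Row-count thresholds as listed under Strategy in this PR.
const ROW_THRESHOLDS: Record<PartitionCandidate, number> = {
  transactions: 50_000_000,
  logs: 100_000_000,
  token_transfers: 50_000_000,
};

// p95 latency limit mentioned in the summary (>200ms).
const P95_LATENCY_LIMIT_MS = 200;

interface TableStats {
  table: PartitionCandidate;
  rowCount: number;     // supplied by the operator, e.g. from COUNT(*) or an estimate
  p95LatencyMs: number; // supplied by whatever monitoring is in place
}

function firedSignals(stats: TableStats): string[] {
  const fired: string[] = [];
  // Assumption: reaching the threshold exactly counts as a signal.
  if (stats.rowCount >= ROW_THRESHOLDS[stats.table]) {
    fired.push("row_count");
  }
  if (stats.p95LatencyMs > P95_LATENCY_LIMIT_MS) {
    fired.push("p95_latency");
  }
  return fired;
}
```

A non-empty result is a prompt to open the runbook, not a trigger to migrate.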

Strategy

Range-partition by `block_height`, 1M blocks per partition (~11.6 days at 1s blocks). Affected tables:

| Table | Trigger threshold |
| --- | --- |
| `transactions` | 50M rows |
| `logs` | 100M rows |
| `token_transfers` | 50M rows |

`addresses` and `blocks` stay non-partitioned (bounded sizes).
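
The partition arithmetic can be sketched in a few lines of TypeScript. This assumes partitions are aligned at multiples of 1M starting from block 0 and that the upper bound is exclusive, as PostgreSQL range partitions are. `partitionFor` and `partitionSpanDays` are hypothetical helper names, not part of the runbook.

```typescript
// Partition width from the strategy above: 1M blocks per partition.
const BLOCKS_PER_PARTITION = 1_000_000;

interface PartitionRange {
  index: number; // 0-based partition number (assumes alignment at block 0)
  from: number;  // inclusive lower bound
  to: number;    // exclusive upper bound, matching PostgreSQL range semantics
}

// Maps a block height to the partition that would hold it.
function partitionFor(blockHeight: number): PartitionRange {
  if (blockHeight < 0 || Math.floor(blockHeight) !== blockHeight) {
    throw new RangeError("blockHeight must be a non-negative integer");
  }
  const index = Math.floor(blockHeight / BLOCKS_PER_PARTITION);
  const from = index * BLOCKS_PER_PARTITION;
  return { index, from, to: from + BLOCKS_PER_PARTITION };
}

// How many days of chain history one partition covers at a given block time.
function partitionSpanDays(blockTimeSeconds: number): number {
  const secondsPerDay = 86_400;
  return (BLOCKS_PER_PARTITION * blockTimeSeconds) / secondsPerDay;
}
```

At 1s blocks, `partitionSpanDays(1)` evaluates to 1,000,000 / 86,400, or roughly 11.57 days.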

What lands

`docs/PARTITIONING.md` — full runbook including:

- When-to-migrate signals
- Per-table recipe (LOCK + RENAME + recreate as PARTITION BY + INSERT SELECT + index recreation + verify + DROP legacy)
- Weekly auto-extender SQL for staying ahead of writes
- Drizzle compatibility notes (existing `schema.ts` unchanged; PostgreSQL handles partition routing and pruning transparently)
- Rollback procedure
- `pg_partman` tradeoff discussion
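
As an illustration of the weekly auto-extender idea, the following TypeScript sketch generates `CREATE TABLE ... PARTITION OF` statements for the current partition and the next few. It is not the SQL from `docs/PARTITIONING.md`: the function name `extenderStatements`, the `<table>_p<N>` partition naming, and the `lookahead` parameter are all assumptions made for the example.

```typescript
// Partition width from the PR: 1M blocks.
const PARTITION_SIZE_BLOCKS = 1_000_000;

// Only plain lowercase identifiers are interpolated into SQL in this sketch.
const SAFE_IDENT = /^[a-z_][a-z0-9_]*$/;

// Builds idempotent DDL for the partition containing `headBlock`
// plus `lookahead` partitions after it.
function extenderStatements(table: string, headBlock: number, lookahead: number): string[] {
  if (!SAFE_IDENT.test(table)) {
    throw new Error("unexpected table name: " + table);
  }
  if (headBlock < 0 || lookahead < 0) {
    throw new RangeError("headBlock and lookahead must be non-negative");
  }
  const current = Math.floor(headBlock / PARTITION_SIZE_BLOCKS);
  const statements: string[] = [];
  for (let i = current; i <= current + lookahead; i++) {
    const from = i * PARTITION_SIZE_BLOCKS;
    const to = from + PARTITION_SIZE_BLOCKS; // exclusive upper bound
    // Hypothetical naming scheme: <table>_p<N>.
    statements.push(
      `CREATE TABLE IF NOT EXISTS ${table}_p${i} PARTITION OF ${table} ` +
        `FOR VALUES FROM (${from}) TO (${to});`
    );
  }
  return statements;
}
```

With a weekly cadence and 1s blocks, the chain advances about 604,800 blocks between runs, so a `lookahead` of 1 should already leave at least one full partition of headroom; a larger value only adds margin.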

Test plan

- On a staging clone of the production indexer DB, run the full `transactions` recipe end-to-end; verify row-count parity and that sample-query plans show partition pruning (only the relevant partitions are scanned)
- Confirm Drizzle migration tracking still works after the raw-SQL partition migration (one no-op Drizzle migration recorded post-hoc)
- Verify with `EXPLAIN ANALYZE` that the planner picks `txs_from_block_idx` on the relevant partition only for `/address/:addr/txs`
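
The parity and pruning checks above could be partly scripted. The TypeScript sketch below works on values the operator has already collected (two `COUNT(*)` results and the text output of `EXPLAIN`). Both helper names are hypothetical, and the `<parent>_p<N>` partition naming is an assumption that would need to match whatever the runbook actually uses.

```typescript
// Only plain lowercase identifiers are accepted as the parent table name.
const PLAIN_IDENTIFIER = /^[a-z_][a-z0-9_]*$/;

// Returns the distinct partitions that appear as scan targets in EXPLAIN text.
// Assumes partitions are named <parent>_p<N>.
function partitionsInPlan(explainText: string, parent: string): string[] {
  if (!PLAIN_IDENTIFIER.test(parent)) {
    throw new Error("unexpected table name: " + parent);
  }
  // Scan nodes print "... on <relation> ..."; index names follow "using"
  // and are therefore not picked up by this pattern.
  const scanTarget = new RegExp("\\bon (" + parent + "_p\\d+)\\b", "g");
  const found: string[] = [];
  let match = scanTarget.exec(explainText);
  while (match !== null) {
    const name = match[1];
    if (name !== undefined && found.indexOf(name) === -1) {
      found.push(name);
    }
    match = scanTarget.exec(explainText);
  }
  return found;
}

// Row-count parity between the legacy table and the partitioned one.
// Counts are compared as strings because drivers often return COUNT(*) as text.
function rowCountsMatch(legacyCount: string | number, partitionedCount: string | number): boolean {
  return String(legacyCount) === String(partitionedCount);
}
```

If pruning works for a query bounded to one block range, `partitionsInPlan` should return a single partition name; a long list suggests the query is not constrained on `block_height`.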


github-actions Bot enabled auto-merge (squash) May 10, 2026 20:56

github-actions Bot merged commit 9d52a20 into main May 10, 2026
5 checks passed

github-actions[bot] merged 1 commit into `main` from `docs/indexer-partitioning`