feat: queries benchmarking and optimization (#68) #534

Merged
cameri merged 5 commits into cameri:main from archief2910:feature/68-queries-optimization
Apr 19, 2026

@archief2910 archief2910 commented Apr 19, 2026

Description

Adds PostgreSQL indexes aligned with the relay's hottest query paths (NIP-01 REQ filters, vanish checks, retention purge, pending invoices), following Heroku's PostgreSQL indexing guidance.

  • Migration 20260420_120000_add_hot_path_indexes.js: creates three indexes with CREATE INDEX CONCURRENTLY IF NOT EXISTS. The migration is non-transactional because PostgreSQL does not allow CREATE INDEX CONCURRENTLY inside a transaction block; building concurrently means the index builds do not block writes to the tables. The composite index on events (event_pubkey, event_kind, event_created_at DESC, event_id) is non-partial, matching EventRepository.findByFilters, which does not filter on deleted_at. The other two are partial: events (deleted_at) WHERE deleted_at IS NOT NULL and invoices (created_at) WHERE status = 'pending'.
  • src/scripts/benchmark-queries.ts + npm run db:benchmark: read-only EXPLAIN (ANALYZE, BUFFERS, VERBOSE, FORMAT JSON) harness for the representative queries (loads .env automatically via node --env-file-if-exists=.env).
  • scripts/verify-index-impact.ts + npm run db:verify-index-impact: seeds test data, drops/recreates the new indexes, prints median-of-N EXPLAIN ANALYZE before/after numbers so reviewers can reproduce the speed-up.
  • InvoiceRepository.findPendingInvoices: now ORDER BY created_at ASC for deterministic FIFO polling — aligned with the partial pending-invoice index.
  • Docs: CONFIGURATION.md and README.md describe the indexes and how to run the benchmarks (no dotenv/config dependency).
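
As a rough illustration of the migration bullet above, the sketch below shows one way such a non-transactional Knex migration could be laid out. It is not the code merged in this PR: the index names and column lists come from the description and benchmark output, while the `RawRunner` interface and the `hotPathIndexStatements` helper are hypothetical stand-ins so the example runs without a database. In a real migration file, `config`, `up` and `down` would be exported, and Knex's per-migration `config = { transaction: false }` is what disables the wrapping transaction.

```typescript
// Minimal stand-in for the part of a Knex instance this sketch needs
// (hypothetical; a real migration receives a full Knex object).
interface RawRunner {
  raw(sql: string): Promise<unknown>;
}

// Index definitions as described in the PR text.
function hotPathIndexStatements(): string[] {
  return [
    "CREATE INDEX CONCURRENTLY IF NOT EXISTS events_active_pubkey_kind_created_at_idx " +
      "ON events (event_pubkey, event_kind, event_created_at DESC, event_id)",
    "CREATE INDEX CONCURRENTLY IF NOT EXISTS events_deleted_at_partial_idx " +
      "ON events (deleted_at) WHERE deleted_at IS NOT NULL",
    "CREATE INDEX CONCURRENTLY IF NOT EXISTS invoices_pending_created_at_idx " +
      "ON invoices (created_at) WHERE status = 'pending'",
  ];
}

// CREATE INDEX CONCURRENTLY cannot run inside a transaction block,
// so the migration opts out of Knex's default per-migration transaction.
const config = { transaction: false };

async function up(knex: RawRunner): Promise<void> {
  for (const sql of hotPathIndexStatements()) {
    await knex.raw(sql);
  }
}

async function down(knex: RawRunner): Promise<void> {
  const indexNames = [
    "invoices_pending_created_at_idx",
    "events_deleted_at_partial_idx",
    "events_active_pubkey_kind_created_at_idx",
  ];
  for (const indexName of indexNames) {
    await knex.raw("DROP INDEX CONCURRENTLY IF EXISTS " + indexName);
  }
}
```

IF NOT EXISTS makes the migration safe to re-run, but a concurrent build that fails partway leaves an invalid index of the same name behind; that index has to be dropped before a retry can rebuild it.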

Related Issue

Closes #68


Motivation and Context

Public relays scale on Postgres; subscription REQ queries, per-event vanish checks, retention purges, and invoice polling must stay cheap as the events table grows. Targeted indexes reduce sequential scans and sort work; partial indexes on low-cardinality tails (soft-deleted rows, pending invoices) keep those indexes small.


Benchmark results

Environment. Postgres 14 (docker postgres:14-alpine), seeded via npm run db:verify-index-impact -- --events 200000 --pubkeys 500 --runs 5: events = 200,000 rows, invoices = 1,000 pending rows, synthetic kind distribution [0,1,1,1,1,1,3,4,7,7,1059,62], ~2% soft-deleted. Identical schema to production after npm run db:migrate.
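
To make the dataset shape concrete, here is a minimal sketch of how rows with that kind distribution and roughly 2% soft deletes could be generated. The real seeding logic in verify-index-impact.ts is not shown in this PR description, so everything here is an assumption: the `SyntheticEvent` shape, the `syntheticEvents` helper, the deterministic assignment (the real script may randomise), and the every-49th-row rule used to approximate 2%.

```typescript
// Kind distribution quoted in the environment description.
const KIND_DISTRIBUTION = [0, 1, 1, 1, 1, 1, 3, 4, 7, 7, 1059, 62];

// Hypothetical row shape, reduced to the fields that matter for the indexes.
interface SyntheticEvent {
  pubkeyIndex: number; // which of the N synthetic pubkeys authored the row
  kind: number;
  softDeleted: boolean;
}

// Deterministic generator: each pubkey receives whole passes over the
// distribution, so every pubkey sees every kind in the quoted proportions.
function syntheticEvents(count: number, pubkeys: number): SyntheticEvent[] {
  const rows: SyntheticEvent[] = [];
  const slots = KIND_DISTRIBUTION.length;
  for (let i = 0; i < count; i++) {
    rows.push({
      pubkeyIndex: Math.floor(i / slots) % pubkeys,
      kind: KIND_DISTRIBUTION[i % slots],
      // 49 shares no factor with 12, so soft deletes spread across all kinds;
      // 1 row in 49 is about 2%.
      softDeleted: i % 49 === 0,
    });
  }
  return rows;
}
```

With this distribution, 5 of every 12 rows are kind 1 and only 1 in 12 is kind 62, which is why the vanish-check query touches so few rows per pubkey.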

Before vs after (median of 5 EXPLAIN ANALYZE runs, from npm run db:verify-index-impact)

| Query (production shape) | Before | After | Speed-up | Plan change |
|---|---|---|---|---|
| REQ authors + kind ORDER BY created_at DESC, event_id ASC LIMIT 500 | 0.61 ms | 0.46 ms | 1.31× | events_event_pubkey_index → events_active_pubkey_kind_created_at_idx |
| hasActiveRequestToVanish (pubkey + kind=62 + deleted_at IS NULL) LIMIT 1 | 0.04 ms | 0.06 ms | 0.71× | events_event_pubkey_index → events_active_pubkey_kind_created_at_idx (composite pays off when the sample pubkey has more rows; for a cold pubkey the single-column index is already ~40 µs) |
| Purge scan, soft-deleted rows LIMIT 500 | 2.30 ms | 1.86 ms | 1.24× | Seq Scan → events_deleted_at_partial_idx (bitmap) |
| findPendingInvoices ORDER BY created_at ASC LIMIT 500 | 0.42 ms | 0.25 ms | 1.69× | Seq Scan + Sort → invoices_pending_created_at_idx (Index Scan, no sort) |
| REQ kind + time range ORDER BY created_at DESC, event_id ASC LIMIT 500 | 1.64 ms | 1.67 ms | 0.98× | No change: already served by the existing events_event_created_at_index; kept in the benchmark to confirm that it did not regress |

The two cases where the new index does not speed things up are expected: the vanish-check query is already sub-millisecond on any single-column pubkey index, and the time-range REQ is covered by the pre-existing events_event_created_at_index. The new composite index wins on every query shape where the planner has a choice.
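
For readers who want to check the arithmetic, the following sketch shows how a median-of-N timing and a before/after speed-up ratio can be computed. The function names are hypothetical and the per-run timings are made up for illustration; only the resulting medians (2.30 ms and 1.86 ms) correspond to the purge-scan row of the table. The table's ratios appear to be derived from unrounded timings, so recomputing them from the rounded values shown may differ in the last digit.

```typescript
// Median of a list of timings in ms. For an even count, averages the two middle values.
function median(samples: number[]): number {
  if (samples.length === 0) throw new Error("median of empty sample set");
  const sorted = samples.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Speed-up as before / after, so values below 1 mean the query got slower.
function speedUp(beforeMs: number, afterMs: number): number {
  return beforeMs / afterMs;
}

// Made-up per-run timings (ms) for five runs; not measurements from this PR.
const medianBeforeMs = median([2.41, 2.3, 2.28, 2.35, 2.3]);
const medianAfterMs = median([1.86, 1.9, 1.8, 1.86, 1.95]);
console.log(speedUp(medianBeforeMs, medianAfterMs).toFixed(2) + "×");
```

Using the median rather than the mean keeps a single slow run (for example a cold cache on the first execution) from dominating the reported number.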

npm run db:benchmark output (same DB, 3 runs each)

Nostream query benchmark
------------------------
events rows:          200,000
invoices rows:        1,000
sample pubkey:        467325d61e8d8005…
runs per query:       3
kind (REQ/time):      1
limit:                500
time horizon (days):  7

• REQ: authors + kinds ORDER BY created_at DESC
  exec (min/avg/max): 293 µs / 887 µs / 2.02 ms
  planning:           94 µs
  rows returned:      134
  node types:         Limit, Sort, Bitmap Heap Scan, Bitmap Index Scan
  indexes used:       events_active_pubkey_kind_created_at_idx

• REQ: kind + created_at time range
  exec (min/avg/max): 1.27 ms / 2.22 ms / 4.09 ms
  planning:           170 µs
  rows returned:      500
  node types:         Limit, Incremental Sort, Index Scan
  indexes used:       events_event_created_at_index

• hasActiveRequestToVanish (pubkey + kind=62 + not deleted)
  exec (min/avg/max): 29 µs / 32 µs / 36 µs
  planning:           69 µs
  rows returned:      0
  node types:         Limit, Index Scan
  indexes used:       events_active_pubkey_kind_created_at_idx

• Purge scan (soft-deleted rows)
  exec (min/avg/max): 978 µs / 1.09 ms / 1.30 ms
  planning:           61 µs
  rows returned:      500
  node types:         Limit, Bitmap Heap Scan, Bitmap Index Scan
  indexes used:       events_deleted_at_partial_idx

• Purge scan (expired events)
  exec (min/avg/max): 25 µs / 141 µs / 371 µs
  planning:           39 µs
  rows returned:      0
  node types:         Limit, Index Scan
  indexes used:       events_expires_at_index

• findPendingInvoices (status=pending ORDER BY created_at)
  exec (min/avg/max): 111 µs / 140 µs / 190 µs
  planning:           30 µs
  rows returned:      500
  node types:         Limit, Index Scan
  indexes used:       invoices_pending_created_at_idx

Every new index shows up in a plan: events_active_pubkey_kind_created_at_idx, events_deleted_at_partial_idx, invoices_pending_created_at_idx. The findPendingInvoices case went from Seq Scan + Sort to a pure Index Scan — that is the largest latency win and it grows with the size of the invoices table.
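
The "node types" and "indexes used" lines in the output above can be derived from the JSON that EXPLAIN (FORMAT JSON) returns by walking the plan tree. The sketch below shows one way to do that; it is not the code in benchmark-queries.ts, and the interface and function names are hypothetical. The keys "Node Type", "Index Name", "Plans", "Planning Time" and "Execution Time" are the ones PostgreSQL emits in its JSON plan format.

```typescript
// Subset of a PostgreSQL EXPLAIN (FORMAT JSON) plan node used here.
interface PlanNode {
  "Node Type": string;
  "Index Name"?: string; // present on index scan nodes
  Plans?: PlanNode[]; // child nodes
}

interface ExplainResult {
  Plan: PlanNode;
  "Planning Time"?: number; // ms, reported when ANALYZE is used
  "Execution Time"?: number; // ms, reported when ANALYZE is used
}

interface PlanSummary {
  nodeTypes: string[];
  indexesUsed: string[];
}

// Depth-first walk collecting node types and index names in first-seen order.
function summarizePlan(result: ExplainResult): PlanSummary {
  const nodeTypes: string[] = [];
  const indexesUsed: string[] = [];
  const visit = (node: PlanNode): void => {
    if (nodeTypes.indexOf(node["Node Type"]) === -1) nodeTypes.push(node["Node Type"]);
    const indexName = node["Index Name"];
    if (indexName !== undefined && indexesUsed.indexOf(indexName) === -1) indexesUsed.push(indexName);
    for (const child of node.Plans ?? []) visit(child);
  };
  visit(result.Plan);
  return { nodeTypes, indexesUsed };
}

// Hand-written plan mirroring the shape of the first benchmark entry.
const examplePlan: ExplainResult = {
  Plan: {
    "Node Type": "Limit",
    Plans: [{
      "Node Type": "Sort",
      Plans: [{
        "Node Type": "Bitmap Heap Scan",
        Plans: [{
          "Node Type": "Bitmap Index Scan",
          "Index Name": "events_active_pubkey_kind_created_at_idx",
        }],
      }],
    }],
  },
};
const exampleSummary = summarizePlan(examplePlan);
console.log(exampleSummary.nodeTypes.join(", "));
console.log(exampleSummary.indexesUsed.join(", "));
```

Checking index names programmatically like this makes it easy to assert in a script that a given index actually appears in a plan, rather than relying on reading the plan by eye.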

Reproduce

docker run -d --name bench-pg -e POSTGRES_USER=nostr_ts_relay -e POSTGRES_PASSWORD=nostr_ts_relay \
  -e POSTGRES_DB=nostr_ts_relay -p 127.0.0.1:55432:5432 postgres:14-alpine

DB_HOST=127.0.0.1 DB_PORT=55432 DB_USER=nostr_ts_relay DB_PASSWORD=nostr_ts_relay DB_NAME=nostr_ts_relay \
  npx knex migrate:latest

DB_HOST=127.0.0.1 DB_PORT=55432 DB_USER=nostr_ts_relay DB_PASSWORD=nostr_ts_relay DB_NAME=nostr_ts_relay \
  npm run db:verify-index-impact -- --events 200000 --pubkeys 500 --runs 5

DB_HOST=127.0.0.1 DB_PORT=55432 DB_USER=nostr_ts_relay DB_PASSWORD=nostr_ts_relay DB_NAME=nostr_ts_relay \
  npm run db:benchmark -- --runs 5

How Has This Been Tested?

Automated (this branch)

| Command | Result |
|---|---|
| npm run build:check | Pass (tsc --project tsconfig.build.json --noEmit) |
| npm run lint | Pass |
| npm run test:unit | 910 passing |
| npm run knip | Pass |

Database / indexes (real Postgres, numbers above)

  • npm run db:migrate — applies the new migration concurrently alongside the 27 pre-existing ones.
  • npm run db:benchmark -- --runs 5 — read-only EXPLAIN harness (no writes).
  • npm run db:verify-index-impact -- --events 200000 --pubkeys 500 --runs 5 — seeds a throwaway dataset, drops/recreates the new indexes, prints the before/after table shown above.

Screenshots (if appropriate)

[2 screenshots]

Types of changes

  • Non-functional change (docs, style, minor refactor)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would change existing functionality)

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my code changes.
  • I added a changeset.
  • All new and existing tests passed.

@archief2910 archief2910 changed the title feat : queries benchmarking and optimization (#68) feat: queries benchmarking and optimization (#68) Apr 19, 2026

coveralls commented Apr 19, 2026

Coverage Status

coverage: 69.161% (-1.3%) from 70.415% — archief2910:feature/68-queries-optimization into cameri:main

Copilot AI left a comment

Pull request overview

Adds operational tooling and schema changes to benchmark and optimize PostgreSQL “hot path” queries for relay workloads (NIP-01 subscriptions, vanish checks, retention purge, invoice polling).

Changes:

  • Adds a read-only EXPLAIN (ANALYZE, BUFFERS) benchmark script and wires it into npm run db:benchmark.
  • Introduces a non-transactional Knex migration creating three partial indexes via CREATE INDEX CONCURRENTLY IF NOT EXISTS.
  • Documents the new indexes and benchmarking workflow in README.md and CONFIGURATION.md, and adds a changeset entry.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 11 comments.

| File | Description |
|---|---|
| src/scripts/benchmark-queries.ts | New benchmark harness that runs representative EXPLAIN ANALYZE query shapes and summarizes planner/index usage. |
| package.json | Adds db:benchmark script entry. |
| migrations/20260420_120000_add_hot_path_indexes.js | Adds concurrent partial indexes intended for hot query paths on events and invoices. |
| README.md | Adds a section describing how to run the benchmark and what it measures. |
| CONFIGURATION.md | Documents key indexes and includes benchmark invocation instructions. |
| .changeset/hot-path-indexes-benchmark.md | Release note entry for the new indexes/tooling/docs. |

Comment thread .changeset/hot-path-indexes-benchmark.md
Comment thread src/scripts/benchmark-queries.ts
Comment thread src/scripts/benchmark-queries.ts Outdated
Comment thread migrations/20260420_120000_add_hot_path_indexes.js Outdated
Comment thread src/scripts/benchmark-queries.ts
Comment thread README.md Outdated
Comment thread package.json Outdated
Comment thread CONFIGURATION.md Outdated
Comment thread migrations/20260420_120000_add_hot_path_indexes.js Outdated
Comment thread migrations/20260420_120000_add_hot_path_indexes.js Outdated
@cameri cameri self-assigned this Apr 19, 2026
@cameri cameri merged commit a07b0f6 into cameri:main Apr 19, 2026
14 checks passed
@archief2910 archief2910 deleted the feature/68-queries-optimization branch April 19, 2026 17:24

Development

Successfully merging this pull request may close these issues.

Optimize queries

4 participants