
Performance improvements#7

Merged
lovelaced merged 11 commits into main from mku-perf-improvements-3
Feb 18, 2026
Conversation

@michalkucharczyk
Contributor

Performance: TimescaleDB migration + parallel batch writer

Motivation

We ran jip-3-spammer against v0.3.0 with 300 nodes at realistic
rates (~258K events/s total) and hit performance problems pretty quickly — the single
batch writer couldn't keep up, events started dropping, and the write buffer filled up
within seconds of the nodes connecting.

Why

The original PostgreSQL schema uses a single flat events table with 6+ indexes. At 3M events/s from 1024 nodes, every INSERT must update all indexes, all writes hit the same table, and aggregate queries (dashboards, stats) scan the entire table. This doesn't scale.

Database: PostgreSQL → TimescaleDB

TimescaleDB is a PostgreSQL extension — same Postgres, just with time-series superpowers.

Hypertable with automatic chunking
The events table is split into 1-hour chunks automatically. Queries like "events in the last hour" scan 1-2 chunks instead of the whole table. Old data is dropped per-chunk (DROP TABLE, instant) instead of DELETE + vacuum.

32 hash partitions on node_id
Each 1-hour chunk is further split into 32 sub-chunks by hashing node_id. This spreads writes from 1024 nodes across 32 parallel physical tables — 32x less lock contention on indexes and WAL. Queries filtering by node_id only scan 1/32 of each chunk. This is a DB-internal detail, transparent to application code.
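At migration time, the chunking and hash partitioning described above boil down to two TimescaleDB calls. A minimal sketch (table and column names here are assumptions, not copied from the actual migration; `add_dimension` must run while the hypertable is still empty):

```sql
-- Hypothetical migration sketch: turn a plain table into a hypertable
-- with 1-hour time chunks, then add a 32-way hash dimension on node_id.
SELECT create_hypertable(
    'events', 'timestamp',
    chunk_time_interval => INTERVAL '1 hour'
);
SELECT add_dimension('events', 'node_id', number_partitions => 32);
```

From the application's point of view nothing changes: inserts and queries still target `events`, and TimescaleDB routes them to the right physical chunk.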

Continuous aggregates (pre-computed rollups)
Instead of running COUNT(*) over billions of rows on every API request:

  • event_stats_1m — per-minute counts per node/type, refreshed every 2 min
  • event_stats_1h — per-hour counts, built from the 1m aggregate (not raw events)

These are incrementally maintained by TimescaleDB — only changed chunks get re-aggregated.
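A sketch of what the per-minute rollup and its refresh policy look like, assuming hypothetical table and column names (the real view definitions live in the migration):

```sql
-- Hypothetical sketch of the per-minute rollup.
CREATE MATERIALIZED VIEW event_stats_1m
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 minute', "timestamp") AS bucket,
       node_id, event_type, count(*) AS event_count
FROM events
GROUP BY bucket, node_id, event_type;

-- Re-materialize recently changed buckets every 2 minutes.
SELECT add_continuous_aggregate_policy('event_stats_1m',
    start_offset      => INTERVAL '10 minutes',
    end_offset        => INTERVAL '1 minute',
    schedule_interval => INTERVAL '2 minutes');

-- event_stats_1h is defined the same way, but selects FROM event_stats_1m
-- (a hierarchical continuous aggregate) instead of from raw events.
```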

Data retention pyramid

Tier               Resolution             Retention
Raw events         Full JSONB payload     7 days
1-min aggregates   Counts per node/type   30 days
1-hour aggregates  Counts per node/type   365 days

After 7 days raw event data is gone, but you still know how many events each node sent — per-minute for 30 days, per-hour for a year.

Compression after 2 hours
Columnar compression grouped by (node_id, event_type), ordered by timestamp DESC. Typical 10-20x compression ratio. Queries can skip irrelevant segments without decompressing.
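The compression and retention settings above map onto TimescaleDB policies roughly like this (a hedged sketch with assumed table names, not the actual migration):

```sql
-- Hypothetical policy sketch: columnar compression of chunks older than
-- 2 hours, segmented by (node_id, event_type), ordered by timestamp DESC.
ALTER TABLE events SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'node_id, event_type',
    timescaledb.compress_orderby   = '"timestamp" DESC'
);
SELECT add_compression_policy('events', INTERVAL '2 hours');

-- Retention pyramid: drop raw chunks after 7 days; each aggregate keeps
-- its own, longer window.
SELECT add_retention_policy('events', INTERVAL '7 days');
SELECT add_retention_policy('event_stats_1m', INTERVAL '30 days');
SELECT add_retention_policy('event_stats_1h', INTERVAL '365 days');
```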

Other schema changes

  • event_type: INTEGER → SMALLINT (130 types fit in 2 bytes, saving ~500GB/day at full throughput)
  • id BIGSERIAL PRIMARY KEY → event_id BIGINT (no PK — hypertables don't support it, and ON CONFLICT dedup is too expensive at 3M/s)
  • Indexes reduced from 6+ to 2 (each index costs write throughput)
  • No per-row triggers (catastrophic at high throughput) — app-level batch stats instead
  • Docker image: postgres:18-alpine → timescale/timescaledb:latest-pg16

Ingestion: Work-stealing batch writer pool

Single writer replaced with 32 parallel workers sharing an Arc<Mutex<Receiver>>:

  • Each worker drains events into a local batch (up to 16k events) then flushes via COPY BINARY
  • Node stats aggregated per-worker and flushed every 5s (additive, concurrent-safe)
  • Channel buffer: 5M events to absorb bursts from 1024 nodes
  • Removed single-row store_event() — everything is batched

Server improvements

  • Per-event logging downgraded from debug! to trace! to reduce log noise
  • wait_for_connections() watch channel for deterministic test synchronization
  • Partition health check removed (TimescaleDB handles this)

Tests

  • 13 new data-driven API tests: insert real events, query endpoints, validate results
  • Event encoding added for WorkPackageReceived, Authorized, Refined, GuaranteeBuilt
  • All test setups updated for TimescaleDB

Bug fixes

  • da_stats INTEGER overflow: Status fields (num_shards, num_preimages, preimages_size) cast to BIGINT instead of INTEGER

Replace PostgreSQL partitioned schema with TimescaleDB hypertable:
- 1-hour chunks with 32 hash partitions on node_id
- Continuous aggregates: event_stats_1m (2min refresh), event_stats_1h
- Compression after 2h (segmentby node_id + event_type)
- Retention: raw 7d, 1m aggs 30d, 1h aggs 365d
- event_type SMALLINT (was INTEGER), event_id BIGINT (no PK)
- No per-row triggers (app-level batch stats instead)
- Event types lookup table with convenience view
- Docker image: timescale/timescaledb:latest-pg16
- Remove all partition management code (ensure_partitions_exist,
  spawn_partition_maintenance, shutdown, check_partition_health,
  PartitionHealth struct)
- Add store_events_batch() with COPY BINARY for >10 events,
  simple INSERT for <=10 events
- Add update_node_stats() using unnest() for concurrent-safe
  batch updates
- Update event_type from i32 to i16 (SMALLINT)
- Update store_nodes_connected_batch() with address parameter
- Add ping(), get_node_by_id(), get_cores_telemetry_agg()
- Remove PartitionHealth export from lib.rs
- Adapt all query methods for parameterized DurationPreset intervals
Replace single-task batch writer with parallel workers:
- Arc<Mutex<Receiver>> shared across all workers
- Each worker drains events into local batch (up to 16k)
  then flushes via store.store_events_batch()
- Timeout-based accumulation (100ms) prevents tiny flushes
- Separate stats flusher task aggregates node counts every 5s
- node_connected() now takes address parameter
- flush() sends sentinel to all workers and waits for responses
- Channel buffer: 5M events
- health.rs: remove partition_check() (no partitions in TimescaleDB)
- main.rs: remove partition health check (5 checks instead of 6)
- rate_limiter.rs: make MAX_CONNECTIONS pub for test access
- Restructure TelemetryServer to store TcpListener (enables port 0 binding)
- Add with_options() constructor with no_rate_limit parameter
- Add local_addr(), wait_for_connections() for deterministic test sync
- Add connection_watch channel for tracking connection count changes
- Remove BytesMutExt trait in favor of bytes::Buf
- Remove read timeouts (handled by TCP keepalive)
- api.rs: remove secondary_interval() params from store query call sites
…eBuilt

These events had stub encoding (0 bytes) which caused data-driven
tests to silently fail — events were sent as empty payloads and never stored.
- Add 13 data-driven tests validating JSONB query paths against real events
- Add now_jce_micros() helper for realistic test timestamps
- Update all test setup functions to use port 0 + local_addr() pattern
- Set test cache TTL to zero to avoid stale cache hits
- Use realistic JCE-relative timestamps so events pass time-window filters
@lovelaced lovelaced merged commit cc275c3 into main Feb 18, 2026
4 checks passed