
Sub-second end-to-end latency #69

@NikolayS

Description


Context

PgQue's end-to-end latency (latency #3: producer INSERT → consumer visibility) is currently bounded by the pg_cron tick period (default 1 second). A discussion today (2026-04-18) on LinkedIn between Nikolay Samokhvalov (PgQue author) and Hannu Krosing (ex-Skype / Google database engineer; relevant because Skype originated PgQ) surfaced several techniques for breaking past the 1-second floor, most of them without requiring upstream pg_cron changes.

See the LinkedIn comments under today's HN front-page post covering the R4–R7 bench work.

The ideas

1. Single cron job with internal pg_sleep loop (Hannu Krosing)

A single pg_cron callout fires every 1s but internally loops pg_sleep() 100× at 10ms intervals (or 1000× at 1ms for kHz delivery). One schedule slot covers the full second of tick activity.

Trade-offs:

  • Pros: works today; no pg_cron changes needed; single worker per queue.
  • Cons: heavy per-slot work; if a sleep cycle overruns, handover to the next slot gets messy.
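
A minimal sketch of this approach, assuming pg_cron >= 1.5 (which adds second-granularity schedules) and a hypothetical per-queue tick routine pgque.tick(text); the real pgque entry point and parameters may differ:

```sql
-- Hypothetical loop wrapper: one pg_cron slot per second, ~100 ticks inside it.
CREATE OR REPLACE FUNCTION pgque.tick_loop(q text, iterations int DEFAULT 100, pause float8 DEFAULT 0.01)
RETURNS void
LANGUAGE plpgsql
AS $$
DECLARE
  i int;
BEGIN
  FOR i IN 1..iterations LOOP
    PERFORM pgque.tick(q);    -- one tick's worth of work (hypothetical API)
    PERFORM pg_sleep(pause);  -- 10 ms pause -> ~100 ticks per 1-second slot
  END LOOP;
END;
$$;

-- A single ordinary schedule slot then covers the whole second:
SELECT cron.schedule('pgque-tick-myqueue', '1 seconds',
                     $$SELECT pgque.tick_loop('myqueue')$$);
```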

2. Two (or more) coordinating cron jobs with a shared advisory lock (Hannu Krosing, refinement)

Register N pg_cron jobs at 1-second cadence, each naturally offset, coordinating via a common advisory lock. 10 jobs = 10 ticks/sec (100ms granularity); 100 jobs = 10ms granularity.

Trade-offs:

  • Pros: no pg_cron API changes; scales naturally; clean handover via advisory lock.
  • Cons: cron.job registration overhead grows with N; pg_cron may dedupe or throttle at scale.
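
A rough sketch under the same assumptions (pg_cron >= 1.5, hypothetical pgque.tick(text)); the advisory-lock key is arbitrary, and the transaction-level try-lock releases itself even if a tick errors out:

```sql
-- Each of N identical jobs ticks once per run; their start times drift apart
-- naturally, so N jobs at 1-second cadence approximate N ticks/second.
CREATE OR REPLACE FUNCTION pgque.tick_once(q text)
RETURNS void
LANGUAGE plpgsql
AS $$
BEGIN
  -- Skip this run if another job is mid-tick; the xact lock auto-releases.
  IF pg_try_advisory_xact_lock(42001) THEN
    PERFORM pgque.tick(q);  -- hypothetical API
  END IF;
END;
$$;

-- Register 10 jobs for ~100 ms effective granularity (job names are illustrative):
SELECT cron.schedule(format('pgque-tick-myqueue-%s', n), '1 seconds',
                     $$SELECT pgque.tick_once('myqueue')$$)
FROM generate_series(1, 10) AS n;
```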

3. Support function that yields when the next pg_cron run is pending (Hannu Krosing)

A pgque-provided function, called from pg_cron, that tight-loops doing tick work, checks whether the next pg_cron run is pending, and exits gracefully so the new run takes over immediately. The pre-pg_cron Skype/PgQ ticker did immediate handover this way.

Trade-offs:

  • Pros: simplest mental model; minimal new code.
  • Cons: requires reading pg_cron internal state from inside a job; not a stable API.
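
pg_cron does not expose a stable "next run is pending" signal, so the sketch below approximates it with a wall-clock budget just under the schedule period; handover happens because the function returns right before the next run fires. pgque.tick(text) is again hypothetical:

```sql
CREATE OR REPLACE FUNCTION pgque.tick_until_handover(q text, budget interval DEFAULT interval '950 milliseconds')
RETURNS void
LANGUAGE plpgsql
AS $$
DECLARE
  started timestamptz := clock_timestamp();
BEGIN
  LOOP
    PERFORM pgque.tick(q);  -- one tick's worth of work (hypothetical API)
    EXIT WHEN clock_timestamp() - started >= budget;  -- yield before the next slot
    PERFORM pg_sleep(0.005);  -- small pause between ticks (5 ms)
  END LOOP;
END;
$$;
```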

4. Upstream: pg_cron supports sub-second scheduling (Nikolay Samokhvalov)

Longer-term: ask the pg_cron project to support sub-second schedules, e.g. '100 milliseconds'. 10 ticks/sec is enough for many workloads.

Trade-offs:

  • Pros: cleanest semantics; no workarounds.
  • Cons: upstream dependency; not available today.

Concern (Nikolay): metadata-table bloat at high tick rates

Ticking more often means pgque.subscription and pgque.tick (and potentially pgque.queue) are UPDATEd / INSERTed far more frequently. Under any held-xmin condition (idle-in-tx, long-running writer, stale logical replication slot, physical standby with hot_standby_feedback=on), those dead tuples can't be vacuumed → index bloat → next_batch / finish_batch lookups slow down → the very latency we're trying to minimize gets worse.

This is the motivation behind #61 and the rotation fix in #62. The R7 bench confirmed this at a 1-second tick: pgque with PR #62 keeps peak dead tuples in subscription+tick at ≤ 1000, while upstream pgq reaches 21k+. At 10× or 100× higher tick rates, bloat scales linearly; PR #62's rotation is what makes sub-second ticking viable at all.

Any high-frequency-tick design must budget for metadata rotation cadence that scales with tick rate (e.g., at 10ms ticking, rotation cadence should drop from 30s to ~3s to keep per-table dead-tuple peak bounded). See #66 for bench methodology.
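
A rough way to budget that cadence (illustrative arithmetic, not bench numbers): peak dead tuples per metadata table ≈ rows written per tick × ticks per second × rotation interval, so shrinking the rotation interval as the tick rate rises is what keeps that product, and therefore the peak, bounded.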

Next steps

Provenance

  • LinkedIn discussion, 2026-04-18, under today's HN front-page post of pgque.com / the R4–R7 bench work.
  • Hannu Krosing's public writeups on the Skype-era PgQ tick design are a secondary reference.
  • Ideas 1, 2, 3 credited to Hannu Krosing. Bloat concern and idea 4 credited to Nikolay Samokhvalov.
