NikolayS/pgque

PgQue – PgQ, universal edition

Zero-bloat PostgreSQL queue. No extensions. No daemon. One SQL file.



PgQue brings back PgQ — one of the most proven PostgreSQL queue architectures ever built — in a form that fits modern Postgres.

PgQ was originally designed at Skype, with an architecture meant to serve 1B users, and it ran in very large self-managed PostgreSQL installations for years. That knowledge is mostly forgotten ancient art now — real database kung fu from the era when people solved brutal scale problems without cargo-culting another distributed system into the stack.

PgQue takes that battle-tested core and repackages it as an extension-free, managed-Postgres-friendly project.

The anti-extension. Pure SQL + PL/pgSQL that works on any Postgres 14+ — including RDS, Cloud SQL, AlloyDB, Supabase, Neon, and every other managed provider. No C extension, no shared_preload_libraries, no provider approval, no restart. Just \i and go.

Why PgQue

Most PostgreSQL queues rely on SKIP LOCKED plus DELETE or UPDATE. That works nicely in toy examples and then quietly turns into dead tuples, VACUUM pressure, index bloat, and performance drift under sustained load.

PgQue avoids that whole class of problems. It uses snapshot-based batching and TRUNCATE-based table rotation instead of per-row deletion. So the hot path stays predictable over time:

  • Zero bloat by design — no dead tuples in the main queue path
  • No performance decay under sustained load — it does not get slower just because it has been running for months
  • Built for heavy-loaded systems — this is exactly the kind of abuse the original PgQ architecture was made for
  • Real PostgreSQL guarantees — ACID transactions, transactional enqueue/consume patterns, WAL, backups, replication, SQL visibility, and the rest of the Postgres toolbox
  • Works on managed Postgres — no custom server build, no C extension, no separate daemon process

This is the key point: PgQue gives you queue semantics inside Postgres, with Postgres durability and transactional behavior, without paying the usual bloat tax most in-database queues eventually pay.
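
The batching model is easy to see in miniature. Below is a toy Python sketch of the idea only — PgQue itself is SQL/PL/pgSQL, and the names here are made up for illustration: a tick snapshots the current end of the event log, and a batch is everything committed between two consecutive ticks.

```python
# Toy model of snapshot-based batching (illustration only, not PgQue's code):
# a "tick" closes a batch containing every event recorded since the previous
# tick; consumers read whole batches instead of locking and deleting rows.

class ToyQueue:
    def __init__(self):
        self.events = []      # (event_id, payload), append-only
        self.ticks = [0]      # end-of-log snapshot at each tick
        self.next_id = 1

    def send(self, payload):
        self.events.append((self.next_id, payload))
        self.next_id += 1

    def tick(self):
        # Snapshot the current end of the log; the new batch is the slice
        # between the previous snapshot and this one.
        self.ticks.append(len(self.events))

    def batch(self, batch_id):
        lo, hi = self.ticks[batch_id - 1], self.ticks[batch_id]
        return self.events[lo:hi]

q = ToyQueue()
q.send({"order_id": 1})
q.send({"order_id": 2})
q.tick()                              # batch 1 = events 1..2
q.send({"order_id": 3})
q.tick()                              # batch 2 = event 3
print([e[0] for e in q.batch(1)])     # [1, 2]
print([e[0] for e in q.batch(2)])     # [3]
```

Because batch boundaries are fixed at tick time, a batch's contents never change under a consumer — which is why the hot path needs no per-row locking or deletion.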

Comparison

Features compared across PgQue, PgQ, PGMQ, River, Que, and pg-boss (the notes below spell out how each tool fares):

  • Snapshot-based batching (no row locks)
  • Zero bloat under sustained load
  • No external daemon or worker binary
  • Pure SQL install, managed Postgres ready
  • Language-agnostic SQL API
  • Multiple independent consumers (fan-out)
  • Built-in retry with backoff (⚠️ partial/indirect in some tools)
  • Built-in dead letter queue (⚠️ partial/indirect in some tools)

Notes:

  • PgQ is the original Skype-era queue engine (~2007) that PgQue is derived from. Same snapshot/rotation architecture, but requires C extensions and an external daemon (pgqd) — unavailable on managed Postgres. PgQue removes both constraints.
  • No external daemon: PgQue uses pg_cron for ticker/maintenance; PGMQ uses visibility timeouts. Both run entirely inside PostgreSQL. River requires a Go binary, Que requires Ruby, pg-boss requires Node.js.
  • Que uses advisory locks (not SKIP LOCKED) — no dead tuples from claiming, but completed jobs are still DELETEd. Brandur's famous bloat post was about Que at Heroku. Ruby-only.
  • PGMQ retry is via visibility timeout re-delivery (read_ct tracking) — no configurable backoff or max attempts built in.
  • pg-boss fan-out uses publish()/subscribe() with copy-per-queue semantics, not a shared event log with independent cursors.
  • Category difference: River, Que, and pg-boss are job queue frameworks with worker executors, priority queues, and per-job lifecycle management. PgQue is an event/message queue optimized for high-throughput streaming with fan-out. See also: Oban (Elixir), graphile-worker (Node.js), solid_queue (Ruby/Rails), good_job (Ruby).

What genuinely differentiates PgQue

1. Zero event-table bloat under sustained load (structural, not tuning-dependent)

SKIP LOCKED queues (PGMQ, River, pg-boss, Oban, graphile-worker) all use UPDATE + DELETE on rows, creating dead tuples that require VACUUM. Under sustained load this causes documented, reproducible production failures:

  • Brandur/Heroku (2015): single open transaction caused 60k job backlog in one hour
  • PlanetScale (2026): death spiral at 800 jobs/sec with shared analytics queries
  • River issue #59: autovacuum starvation documented at Heroku
  • Oban Pro shipped table partitioning specifically to mitigate bloat
  • PGMQ/Tembo ships aggressive autovacuum settings baked into their container image

PgQue's TRUNCATE rotation creates zero dead tuples in event tables by construction. No tuning required, immune to xmin horizon pinning. This matters most at sustained high throughput or when the queue database is shared with OLAP workloads.

2. Native fan-out (multiple independent consumers on a shared event log)

Each registered consumer maintains its own cursor position and independently receives all events. This is fundamentally different from competing-consumers (SKIP LOCKED) where each job goes to one worker. pg-boss has fan-out but it is copy-per-queue (one INSERT per subscriber per event). PgQue's model is position-in-shared-log — no data duplication, atomic batch boundaries, late subscribers catch up.

This makes PgQue more analogous to Kafka topics than to a job queue.
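
The position-in-shared-log model can be sketched in a few lines of Python (illustration only; the function names are made up for this sketch): one shared append-only log, one cursor per consumer, no per-subscriber copies.

```python
# Toy model of fan-out (illustration only, not PgQue's code): every consumer
# sees every event; receive() advances only that consumer's cursor, so late
# subscribers can still catch up from the start of the log.

log = []        # shared append-only event log
cursors = {}    # consumer name -> position in the log

def subscribe(name, from_start=True):
    cursors[name] = 0 if from_start else len(log)

def send(event):
    log.append(event)

def receive(name, limit=100):
    pos = cursors[name]
    batch = log[pos:pos + limit]
    cursors[name] = pos + len(batch)   # advance only this consumer's cursor
    return batch

subscribe("audit_logger")
send("order.created")
send("order.shipped")
subscribe("analytics")               # late subscriber still sees everything
print(receive("audit_logger"))       # ['order.created', 'order.shipped']
print(receive("analytics"))          # ['order.created', 'order.shipped']
```

Contrast with competing consumers, where the two receive() calls would have split the events between workers rather than each seeing both.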

When to use PgQue vs. a job queue

PgQue is an event/message queue. River, graphile-worker, pg-boss, and Oban are job queue frameworks. They are different categories:

  • Choose PgQue when you want event-driven fan-out, zero-maintenance bloat behavior, language-agnostic SQL API, and you do not need per-job priorities/scheduling/worker frameworks
  • Choose a job queue when you need per-job lifecycle management, sub-3ms latency, priority queues, cron scheduling, unique jobs, and deep ecosystem integration (Elixir/Go/Node.js/Ruby)

Installation

Requirements: PostgreSQL 14+. pg_cron is optional and recommended.

begin;
\i sql/pgque.sql
commit;

With pg_cron installed, start the built-in ticker and maintenance jobs:

select pgque.start();

Without pg_cron, installation still works. You just drive ticking and maintenance from an external scheduler or your app:

# every 1-2 seconds
psql -c "select pgque.ticker()"

# for demos/tests, if you need an immediate batch without waiting,
# force a tick threshold first for that queue
psql -c "select pgque.force_tick('orders')"
psql -c "select pgque.ticker()"

# every 30 seconds
psql -c "select pgque.maint()"
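
If you would rather drive this from application code than from psql in an external cron, the loop is simple. A hedged Python sketch: the two SQL calls are PgQue's real functions, but `run_sql` is an assumed stand-in for your driver (for example, a wrapper around a psycopg `cursor.execute`).

```python
# Minimal external ticker/maintenance driver (sketch, not part of PgQue).
# `run_sql` is any callable that executes one SQL statement; it is injected
# so the loop itself stays driver-agnostic and testable.

import time

def drive(run_sql, tick_every=1.5, maint_every=30.0, iterations=None):
    """Call pgque.ticker() every tick_every seconds and pgque.maint() every
    maint_every seconds (maintenance also fires on the first pass)."""
    last_maint = 0.0
    n = 0
    while iterations is None or n < iterations:
        run_sql("select pgque.ticker()")
        now = time.monotonic()
        if now - last_maint >= maint_every:
            run_sql("select pgque.maint()")
            last_maint = now
        n += 1
        if iterations is None or n < iterations:
            time.sleep(tick_every)

# Dry run with a recorder instead of a database connection:
calls = []
drive(calls.append, tick_every=0, maint_every=0, iterations=2)
print(calls)   # ticker, maint, ticker, maint
```

In production you would call `drive(run_sql)` with no iteration limit and let it loop forever alongside your workers.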

For now, treat installation as initial setup. Upgrade/reinstall guarantees are still being tightened.

To uninstall:

\i sql/pgque_uninstall.sql

Project status

PgQue is early-stage as a product and API layer.

PgQ itself is rock solid — battle-tested in very large systems over many years. What's new here is the packaging, modernization, managed-Postgres compatibility, and the higher-level PgQue API around that core.

The default install intentionally stays small in v0.1. Additional APIs live under sql/experimental/ until they are worth promoting into the default install. See blueprints/PHASES.md.

Quick start

-- transaction 1: create queue + consumer
select pgque.create_queue('orders');
select pgque.subscribe('orders', 'processor');

-- transaction 2: send a message
select pgque.send('orders', '{"order_id": 42, "total": 99.95}'::jsonb);

-- transaction 3: advance the queue if you are not using pg_cron
-- force_tick() is handy in demos/tests to avoid waiting for lag/count thresholds
select pgque.force_tick('orders');
select pgque.ticker();

-- transaction 4: receive and process messages
select * from pgque.receive('orders', 'processor', 100);

-- transaction 5: acknowledge the batch, using the batch_id returned by receive()
select pgque.ack(1);

Important: send/tick/receive should be separate transactions. That's not a PgQue quirk so much as PgQ's snapshot-based design doing exactly what it is supposed to do.

Usage examples

Send with event type

select pgque.send('orders', 'order.created',
  '{"order_id": 42}'::jsonb);

select pgque.send('orders', 'order.shipped',
  '{"order_id": 42, "tracking": "1Z999AA10123456784"}'::jsonb);

Batch send

select pgque.send_batch('orders', 'order.created', array[
  '{"order_id": 1}'::jsonb,
  '{"order_id": 2}'::jsonb,
  '{"order_id": 3}'::jsonb
]);

Experimental SQL

The repo also contains additional SQL under sql/experimental/ for features that are not part of the default v0.1 install yet:

  • delayed delivery
  • dead letter queue tooling
  • observability helpers

Those pieces are being kept out of the default install until the API surface settles.

Fan-out with multiple consumers

select pgque.subscribe('orders', 'audit_logger');
select pgque.subscribe('orders', 'notification_sender');
select pgque.subscribe('orders', 'analytics_pipeline');

select * from pgque.receive('orders', 'audit_logger', 100);
select * from pgque.receive('orders', 'notification_sender', 100);

Transactional exactly-once-ish processing

Do the queue read, your writes, and the ack in one transaction:

begin;
  create temp table msgs as
    select * from pgque.receive('orders', 'processor', 100);

  insert into processed_orders (order_id, status)
  select (payload::jsonb->>'order_id')::int, 'done'
  from msgs;

  select pgque.ack((select batch_id from msgs limit 1));  -- all rows from one receive() share a batch_id
commit;

If the transaction rolls back, your writes roll back and the ack rolls back too.
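
Why this turns at-least-once delivery into exactly-once-ish processing is easiest to see in a toy model (illustration only, not PgQue code): the cursor advance (the ack) and the side effects become visible together, or not at all.

```python
# Toy model of transactional consume (illustration only): side effects and
# the ack share one "transaction". A failure rolls back both, so the batch
# is redelivered rather than lost or half-applied.

class ToyConsumer:
    def __init__(self, events):
        self.events = events
        self.cursor = 0          # committed consumer position (the "ack")
        self.processed = []      # committed side effects

    def process_batch(self, handler, size=2):
        batch = self.events[self.cursor:self.cursor + size]
        staged = [handler(e) for e in batch]   # may raise -> nothing commits
        # "commit": ack and side effects become visible together
        self.processed.extend(staged)
        self.cursor += len(batch)

def boom(event):
    raise RuntimeError(event)

c = ToyConsumer(["e1", "e2", "e3"])
try:
    c.process_batch(boom)        # handler fails mid-batch
except RuntimeError:
    pass
print(c.cursor, c.processed)     # 0 [] -- nothing committed; batch redelivered
c.process_batch(str.upper)
print(c.cursor, c.processed)     # 2 ['E1', 'E2']
```

The "exactly-once-ish" hedge remains: if the commit succeeds but your process dies before learning that, you still need idempotent handlers for true safety.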

Recurring jobs with pg_cron

select cron.schedule('daily_report',
  '0 9 * * *',
  $$select pgque.send('jobs', 'report.generate',
      '{"type": "daily"}'::jsonb)$$);

Client libraries

PgQue is SQL-first, so any PostgreSQL driver works. On top of that, dedicated client libraries already exist or are being built around the API.

Python (pgque-py)

Built on psycopg 3. Typical usage:

from pgque import PgqueClient, Consumer

client = PgqueClient(conn)
client.send("orders", {"order_id": 42})

consumer = Consumer(dsn, queue="orders", name="processor", poll_interval=30)

@consumer.on("order.created")
def handle_order(msg):
    process_order(msg.payload)

consumer.start()

Go (pgque-go)

Built on pgx/v5. Typical usage:

client, _ := pgque.Connect(ctx, "postgresql://localhost/mydb")

consumer := client.NewConsumer("orders", "processor")
consumer.Handle("order.created", func(ctx context.Context, msg pgque.Message) error {
    return processOrder(msg)
})
consumer.Start(ctx)

Any language

If your language can talk to PostgreSQL, you can use PgQue immediately:

select pgque.send('orders', '{"order_id": 42}'::jsonb);
select * from pgque.receive('orders', 'processor', 100);
select pgque.ack(batch_id);

Function reference

Publishing

| Function | Returns | Description |
|----------|---------|-------------|
| pgque.send(queue, payload) | bigint | Send a message with default type |
| pgque.send(queue, type, payload) | bigint | Send with explicit event type |
| pgque.send_batch(queue, type, payloads[]) | bigint[] | Batch send in a single transaction |
| pgque.send_at(queue, type, payload, deliver_at) | bigint | Delayed/scheduled delivery |

Consuming

| Function | Returns | Description |
|----------|---------|-------------|
| pgque.receive(queue, consumer, max_return) | setof pgque.message | Receive up to N messages from the next batch |
| pgque.ack(batch_id) | integer | Finish batch and advance consumer position |
| pgque.subscribe(queue, consumer) | integer | Register a consumer |
| pgque.unsubscribe(queue, consumer) | integer | Unregister a consumer |

Queue management

| Function | Returns | Description |
|----------|---------|-------------|
| pgque.create_queue(queue) | integer | Create a new queue |
| pgque.drop_queue(queue) | integer | Drop a queue |

Lifecycle

| Function | Returns | Description |
|----------|---------|-------------|
| pgque.start() | void | Create pg_cron ticker + maintenance jobs when pg_cron is available |
| pgque.stop() | void | Remove pg_cron jobs |
| pgque.status() | table | Diagnostic dashboard |
| pgque.ticker() | bigint | Manual ticker for all queues |
| pgque.maint() | integer | Manual maintenance runner |
| pgque.uninstall() | void | Stop jobs and drop schema |

PgQ-native API

The original PgQ primitives are still there for advanced use:

| Function | Description |
|----------|-------------|
| pgque.insert_event(queue, type, data) | Low-level event insert |
| pgque.next_batch(queue, consumer) | Get next batch ID |
| pgque.get_batch_events(batch_id) | Get all events in a batch |
| pgque.finish_batch(batch_id) | Mark batch complete |
| pgque.event_retry(batch_id, event_id, seconds) | Schedule event retry |

Benchmarks

Preliminary results on a laptop (Apple Silicon, 10 cores, 24 GiB RAM, PostgreSQL 18.3, synchronous_commit=off per-session). Full methodology: NikolayS/pgq#1.

| Scenario | Throughput | Per core |
|----------|------------|----------|
| PL/pgSQL single insert/TX, ~100 B, 16 clients | 85,836 ev/s | ~8.6k ev/s |
| PL/pgSQL batched 100k/TX, ~100 B | 80,515 ev/s | ~8.1k ev/s |
| PL/pgSQL batched 100k/TX, ~2 KiB | 48,899 ev/s (91.5 MiB/s) | ~4.9k ev/s |
| Consumer read rate, 100k batch, ~100 B | ~2.4M ev/s | ~240k ev/s |
| Consumer read rate, 100k batch, ~2 KiB | ~305k ev/s (568 MiB/s) | ~30.5k ev/s |

Key takeaways:

  • Zero bloat under sustained load — 30-minute sustained test showed zero dead tuple growth in event tables
  • Batching matters — throughput jumps hard when you stop doing one tiny transaction per event
  • Consumer side is not the bottleneck — reads are much faster than writes
  • You keep Postgres guarantees — transactional semantics, WAL durability options, backups, replication, SQL introspection

synchronous_commit=off can be set per session or per transaction for queue-heavy workloads if that trade-off makes sense for your system.

Architecture

PgQue keeps PgQ's proven core architecture and adds a modern API layer:

  • Snapshot-based batch isolation — each batch contains exactly the events committed between two ticks
  • Three rotating event tables — the main queue path uses 3-table TRUNCATE rotation instead of row-by-row churn
  • Separate retry table — retries are stored outside the hot event path and re-inserted later by maintenance
  • Separate delayed-delivery table — scheduled messages wait outside the hot path until due
  • Separate dead letter queue — exhausted messages move aside cleanly instead of poisoning normal flow
  • Multiple independent consumers — each consumer tracks its own position
  • Optional pg_cron scheduler — replaces the old external pgqd daemon when available; otherwise call SQL functions manually
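
The rotation step above can be sketched in a few lines of Python (illustration only — the real mechanism is three physical tables plus TRUNCATE, and the class and method names here are made up):

```python
# Toy model of 3-table TRUNCATE rotation (illustration only): writes go to
# the "current" table; once every consumer has moved past the batches held
# in the oldest table, that table is truncated (an O(1) operation) and
# reused for new writes. No row is ever deleted individually, so no dead
# tuples are created.

class ToyRotation:
    def __init__(self):
        self.tables = [[], [], []]   # three event tables
        self.current = 0             # index of the table taking writes

    def insert(self, event):
        self.tables[self.current].append(event)

    def rotate(self, all_consumers_past_oldest):
        oldest = (self.current + 1) % 3
        if not all_consumers_past_oldest:
            return False             # keep the oldest table until consumers catch up
        self.tables[oldest].clear()  # the TRUNCATE stand-in
        self.current = oldest        # reuse it for new writes
        return True

r = ToyRotation()
r.insert("e1")
r.rotate(True)                       # old table wiped wholesale, not row by row
r.insert("e2")
print([len(t) for t in r.tables])    # [1, 1, 0]
```

The check before truncating is the important part: rotation waits until no registered consumer still needs the oldest table, which is why lagging consumers delay cleanup rather than losing events.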

See SPECx.md for the full specification and SPEC.md for PgQ internals.

Contributing

See SPECx.md for the specification and implementation plan. New code should follow red/green TDD: write the failing test first, then fix it.

License

Apache-2.0. See LICENSE.

PgQue includes code derived from PgQ (ISC license, Marko Kreen / Skype Technologies OU). See NOTICE.
