From 7f3135dc35dc3c87dff11b645db42ea16c6aebdb Mon Sep 17 00:00:00 2001
From: Justin George
Date: Thu, 13 Nov 2025 16:49:04 -0800
Subject: [PATCH] Update for rewriting and schema sharding

---
 docs/configuration/pgdog.toml/.pages         |  1 +
 docs/configuration/pgdog.toml/rewrite.md     | 45 +++++++++++++++++++
 .../pgdog.toml/sharded_schemas.md            | 26 +++++++----
 docs/features/sharding/sharding-functions.md |  9 ++++
 4 files changed, 73 insertions(+), 8 deletions(-)
 create mode 100644 docs/configuration/pgdog.toml/rewrite.md

diff --git a/docs/configuration/pgdog.toml/.pages b/docs/configuration/pgdog.toml/.pages
index a94c0ea..014c9b4 100644
--- a/docs/configuration/pgdog.toml/.pages
+++ b/docs/configuration/pgdog.toml/.pages
@@ -2,5 +2,6 @@ title: "pgdog.toml"
 nav:
   - 'general.md'
   - 'databases.md'
+  - 'rewrite.md'
   - '...'
   - 'plugins.md'
diff --git a/docs/configuration/pgdog.toml/rewrite.md b/docs/configuration/pgdog.toml/rewrite.md
new file mode 100644
index 0000000..4d63942
--- /dev/null
+++ b/docs/configuration/pgdog.toml/rewrite.md
@@ -0,0 +1,45 @@
+---
+icon: material/alpha-r-box-outline
+---
+
+# Rewrite
+
+The `rewrite` section controls PgDog's automatic SQL rewrites for sharded clusters. It affects shard-key updates and multi-row INSERT statements, and can be toggled globally or per-policy.
+
+## Options
+
+```toml
+[rewrite]
+enabled = false
+shard_key = "error"
+split_inserts = "error"
+```
+
+| Field | Description | Default |
+| --- | --- | --- |
+| `enabled` | Master toggle; when `false`, PgDog parses but never applies rewrite plans. | `false` |
+| `shard_key` | Behaviour when an `UPDATE` changes a sharding key.<br>`error` rejects the statement.<br>`rewrite` migrates the row between shards.<br>`ignore` forwards it unchanged. | `"error"` |
+| `split_inserts` | Behaviour when a sharded table receives a multi-row `INSERT`.<br>`error` rejects the statement.<br>`rewrite` fans the rows out to their shards.<br>`ignore` forwards it unchanged. | `"error"` |
+
+!!! note "Two-phase commit"
+    PgDog recommends enabling [`general.two_phase_commit`](general.md#two_phase_commit) when either policy is set to `rewrite`. Without it, rewrites are committed shard-by-shard and can leave partial changes if a shard fails.
+
+## Runtime overrides
+
+The admin database exposes these toggles via `SET`:
+
+```postgresql
+SET rewrite_enabled TO true; -- mirrors [rewrite].enabled
+SET rewrite_shard_key_updates TO rewrite; -- error | rewrite | ignore
+SET rewrite_split_inserts TO rewrite; -- error | rewrite | ignore
+```
+
+Switches apply to subsequent sessions once the cluster reloads configuration. Session-level overrides allow canary testing before persisting them in `pgdog.toml`.
+
+## Limitations
+
+* Shard-key rewrites require the `WHERE` clause to resolve to a single row; otherwise PgDog rolls back and raises `rewrite.shard_key="rewrite" is not yet supported ...`.
+* Split INSERT rewrites must run outside explicit transactions so PgDog can orchestrate per-shard `BEGIN`/`COMMIT` cycles. Inside a transaction PgDog returns `25001` and leaves the client transaction intact.
+* Both features fall back to `error` semantics while `rewrite.enabled = false` or when PgDog cannot determine a target shard.
+
+See [feature docs](../../features/sharding/sharding-functions.md#rewrite-behaviour) for walkthroughs of these flows.
diff --git a/docs/configuration/pgdog.toml/sharded_schemas.md b/docs/configuration/pgdog.toml/sharded_schemas.md
index 416356d..3a84b3d 100644
--- a/docs/configuration/pgdog.toml/sharded_schemas.md
+++ b/docs/configuration/pgdog.toml/sharded_schemas.md
@@ -29,9 +29,11 @@ SELECT * FROM customer_a.users -- Prefixed with the schema name.
 WHERE admin = true;
 ```
 
+You can add multiple entries per database. Mappings are matched by schema name first; if none match, PgDog falls back to a default rule.
+
 ## Default shard
 
-For queries that don't specify a schema or for which a mapping doesn't exist, the default behavior is to send it to all shards. If this is not desirable, you can configure a default shard, like so:
+For queries that don't specify a schema or for which a mapping doesn't exist, the default behavior is to send it to all shards. If this is not desirable, add an entry without a `name` to choose a default shard:
 
 ```toml
 [[sharded_schemas]]
@@ -39,17 +41,25 @@ database = "prod"
 shard = 0
 ```
 
-For example, the following queries will be sent to shard zero:
+PgDog now sends any unmapped schema to shard zero, including plain references (`SELECT * FROM pg_stat_activity`) and schemas created after the mapping file was generated.
 
-```postgresql
--- No schema specified.
-SELECT * FROM pg_stat_activity;
+## Broadcast mappings
 
--- Schema isn't mapped in the config.
-SELECT * FROM customer_c.users
-WHERE admin = true;
+If you need a single configuration entry to cover “all shards”, set `all = true`. PgDog still accepts a `name` for documentation purposes, but ignores the shard number and forwards the query to every shard:
+
+```toml
+[[sharded_schemas]]
+database = "prod"
+name = "reporting"
+all = true
 ```
 
+This is useful for schemas that host reference tables replicated everywhere.
+
+## DDL routing
+
+Schema mappings apply to both DDL and DML. Fully-qualified statements such as `CREATE TABLE customer_b.users (...)` use the same shard resolution as regular queries, keeping schema changes aligned across shards.
+
 ## Manual routing
 
 If you need to query a specific shard or can't specify the schema name in the query, you can add it to a comment, for example:
diff --git a/docs/features/sharding/sharding-functions.md b/docs/features/sharding/sharding-functions.md
index 982b7ee..568c598 100644
--- a/docs/features/sharding/sharding-functions.md
+++ b/docs/features/sharding/sharding-functions.md
@@ -180,6 +180,15 @@ shard = 0
 
 This will send all queries that don't specify a schema or use a schema without a mapping to shard zero.
 
+## Rewrite behaviour
+
+PgDog can transparently move writes between shards when [`rewrite`](../../configuration/pgdog.toml/rewrite.md) is enabled.
+
+* **Shard-key updates** (`rewrite.shard_key = "rewrite"`) delete the matching row from its current shard and re-insert it on the shard implied by the new key. Exactly one row must match the `WHERE` clause; PgDog aborts rewrites that affect multiple rows or unresolved shards.
+* **Split INSERTs** (`rewrite.split_inserts = "rewrite"`) decompose multi-row `INSERT` statements so each shard receives only the rows it owns. PgDog opens per-shard transactions and can escalate to two-phase commit when configured to preserve atomicity across shards.
+
+Both features require `rewrite.enabled = true`, operate only on sharded tables, and fall back to returning errors when PgDog cannot determine a safe rewrite plan. Running them alongside [`general.two_phase_commit`](../../configuration/pgdog.toml/general.md#two_phase_commit) is recommended to guarantee atomic outcomes.
+
 ## Read more
 
 - [COPY command](copy.md)
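
For reference, the settings this patch documents could be combined into one `pgdog.toml` fragment along the following lines. This is a minimal sketch under stated assumptions, not a recommended production configuration. The `[rewrite]` keys and the two `[[sharded_schemas]]` entries are copied from the docs added above. The boolean form of `two_phase_commit` under `[general]` is an assumption, since the patch only links to that option without showing its value.

```toml
# Assumption: two_phase_commit is a boolean under [general];
# the patch links to the option but does not show its syntax.
[general]
two_phase_commit = true

# Enable both rewrite policies (keys as documented in rewrite.md).
[rewrite]
enabled = true
shard_key = "rewrite"
split_inserts = "rewrite"

# Broadcast mapping: queries against the reporting schema go to every shard.
[[sharded_schemas]]
database = "prod"
name = "reporting"
all = true

# Default shard: an entry without a name catches unmapped schemas.
[[sharded_schemas]]
database = "prod"
shard = 0
```

Pairing the `rewrite` policies with two-phase commit follows the recommendation in the added note; without it, the docs state that rewrites are committed shard-by-shard and can leave partial changes if a shard fails.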