digoal
2023-04-10
PostgreSQL , PolarDB , standby , 物理从库 , logical decoding
PostgreSQL 16 preview - standby 支持 logical decoding
也就是说, PG 16 可以在物理流复制从库上开启logical decoding. 那么问题来了:
1、在哪里创建逻辑复制slot?
不知道, 猜测可以直接在standby上创建. 因为每个节点的slot信息都是文件形式存储的, 不需要修改catalog信息.
2、如何避免冲突问题: 例如 上游wal_level改成小于logical
如果上游primary节点的wal level改成了minimal或者replica, 那么standby 的logical slot将被设置为invalid, 因为上游产生的wal不足以解析逻辑数据.
3、如何避免冲突问题: 例如 如果 decoding wal pages 需要的catalog被primary节点清掉了怎么办?
standby 的logical slot将被设置为invalid, 因为catalog老的用于decoding wal page的tuple version可能不存在了, 因此不能正确的解析老的wal page.
4、standby 的logical slot的位点信息会不会同步给primary
这个问题不存在, 因为每个节点的slot信息都是文件形式存储的, 没有存储在catalog里面, 所以并不需要把slot位点同步给primary, primary也查不到standby的slot信息.
5、cascading replication的场景: 当物理从库仅仅完成wal receive and flush后, 如果就立即decoding对应的wal, 逻辑上可能存在问题, 怎么解决?
在cascading replication的场景, 物理流复制从库(standby)启动wal sender的时机做了区分: 当wal flush完成后即可启动physical wal sender, 只有当apply完成后才启动logical wal sender.
Physical walsenders can't send data until it's been flushed;
logical walsenders can't decode and send data until it's been applied.
On the standby, the WAL is flushed first, which will only wake up physical walsenders;
and then applied, which will only wake up logical walsenders.
Allow logical decoding on standbys
author Andres Freund <andres@anarazel.de>
Sat, 8 Apr 2023 09:20:01 +0000 (02:20 -0700)
committer Andres Freund <andres@anarazel.de>
Sat, 8 Apr 2023 09:20:05 +0000 (02:20 -0700)
commit 0fdab27ad68a059a1663fa5ce48d76333f1bd74c
tree 3d29650901130428712f71cbb8b57cb19f738f1a tree
parent e101dfac3a53c20bfbf1ca85d30a368c2954facf commit | diff
Allow logical decoding on standbys
Unsurprisingly, this requires wal_level = logical to be set on the primary and
standby. The infrastructure added in 26669757b6a ensures that slots are
invalidated if the primary's wal_level is lowered.
Creating a slot on a standby waits for a xl_running_xact record to be
processed. If the primary is idle (and thus not emitting xl_running_xact
records), that can take a while. To make that faster, this commit also
introduces the pg_log_standby_snapshot() function. By executing it on the
primary, completion of slot creation on the standby can be accelerated.
Note that logical decoding on a standby does not itself enforce that required
catalog rows are not removed. The user has to use physical replication slots +
hot_standby_feedback or other measures to prevent that. If catalog rows
required for a slot are removed, the slot is invalidated.
See 6af1793954e for an overall design of logical decoding on a standby.
Bumps catversion, for the addition of the pg_log_standby_snapshot() function.
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de> (in an older version)
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: FabrÌzio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
For cascading replication, wake physical and logical walsenders separately
author Andres Freund <andres@anarazel.de>
Sat, 8 Apr 2023 07:24:24 +0000 (00:24 -0700)
committer Andres Freund <andres@anarazel.de>
Sat, 8 Apr 2023 08:06:00 +0000 (01:06 -0700)
commit e101dfac3a53c20bfbf1ca85d30a368c2954facf
tree 1a5c755d9eb0a2f6f1d6251d9dd9f185654534cb tree
parent 26669757b6a7665c1069e77e6472bd8550193ca6 commit | diff
For cascading replication, wake physical and logical walsenders separately
Physical walsenders can't send data until it's been flushed; logical
walsenders can't decode and send data until it's been applied. On the
standby, the WAL is flushed first, which will only wake up physical
walsenders; and then applied, which will only wake up logical
walsenders.
Previously, all walsenders were awakened when the WAL was flushed. That
was fine for logical walsenders on the primary; but on the standby the
flushed WAL would have been not applied yet, so logical walsenders were
awakened too early.
Per idea from Jeff Davis and Amit Kapila.
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAA4eK1+zO5LUeisabX10c81LU-fWMKO4M9Wyg1cdkbW7Hqh6vQ@mail.gmail.com
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=6af1793954e
Add info in WAL records in preparation for logical slot conflict handling
author Andres Freund <andres@anarazel.de>
Sun, 2 Apr 2023 19:32:19 +0000 (12:32 -0700)
committer Andres Freund <andres@anarazel.de>
Sun, 2 Apr 2023 19:32:19 +0000 (12:32 -0700)
commit 6af1793954e8c5e753af83c3edb37ed3267dd179
tree f3f3f39f3d46a6ed6f21b9b9473fcfc93657ff05 tree
parent ab73291d263049f00a394b37940ed197616fc3f8 commit | diff
Add info in WAL records in preparation for logical slot conflict handling
This commit only implements one prerequisite part for allowing logical
decoding. The commit message contains an explanation of the overall design,
which later commits will refer back to.
Overall design:
1. We want to enable logical decoding on standbys, but replay of WAL
from the primary might remove data that is needed by logical decoding,
causing error(s) on the standby. To prevent those errors, a new replication
conflict scenario needs to be addressed (as much as hot standby does).
2. Our chosen strategy for dealing with this type of replication slot
is to invalidate logical slots for which needed data has been removed.
3. To do this we need the latestRemovedXid for each change, just as we
do for physical replication conflicts, but we also need to know
whether any particular change was to data that logical replication
might access. That way, during WAL replay, we know when there is a risk of
conflict and, if so, if there is a conflict.
4. We can't rely on the standby's relcache entries for this purpose in
any way, because the startup process can't access catalog contents.
5. Therefore every WAL record that potentially removes data from the
index or heap must carry a flag indicating whether or not it is one
that might be accessed during logical decoding.
Why do we need this for logical decoding on standby?
First, let's forget about logical decoding on standby and recall that
on a primary database, any catalog rows that may be needed by a logical
decoding replication slot are not removed.
This is done thanks to the catalog_xmin associated with the logical
replication slot.
But, with logical decoding on standby, in the following cases:
- hot_standby_feedback is off
- hot_standby_feedback is on but there is no a physical slot between
the primary and the standby. Then, hot_standby_feedback will work,
but only while the connection is alive (for example a node restart
would break it)
Then, the primary may delete system catalog rows that could be needed
by the logical decoding on the standby (as it does not know about the
catalog_xmin on the standby).
So, it’s mandatory to identify those rows and invalidate the slots
that may need them if any. Identifying those rows is the purpose of
this commit.
Implementation:
When a WAL replay on standby indicates that a catalog table tuple is
to be deleted by an xid that is greater than a logical slot's
catalog_xmin, then that means the slot's catalog_xmin conflicts with
the xid, and we need to handle the conflict. While subsequent commits
will do the actual conflict handling, this commit adds a new field
isCatalogRel in such WAL records (and a new bit set in the
xl_heap_visible flags field), that is true for catalog tables, so as to
arrange for conflict handling.
The affected WAL records are the ones that already contain the
snapshotConflictHorizon field, namely:
- gistxlogDelete
- gistxlogPageReuse
- xl_hash_vacuum_one_page
- xl_heap_prune
- xl_heap_freeze_page
- xl_heap_visible
- xl_btree_reuse_page
- xl_btree_delete
- spgxlogVacuumRedirect
Due to this new field being added, xl_hash_vacuum_one_page and
gistxlogDelete do now contain the offsets to be deleted as a
FLEXIBLE_ARRAY_MEMBER. This is needed to ensure correct alignment.
It's not needed on the others struct where isCatalogRel has
been added.
This commit just introduces the WAL format changes mentioned above. Handling
the actual conflicts will follow in future commits.
Bumps XLOG_PAGE_MAGIC as the several WAL records are changed.
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de> (in an older version)
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Handle logical slot conflicts on standby
author Andres Freund <andres@anarazel.de>
Sat, 8 Apr 2023 06:11:28 +0000 (23:11 -0700)
committer Andres Freund <andres@anarazel.de>
Sat, 8 Apr 2023 07:05:44 +0000 (00:05 -0700)
commit 26669757b6a7665c1069e77e6472bd8550193ca6
tree b60b91e8fa8276744b171a7a5714ffb5b8918f66 tree
parent be87200efd9308ccfe217ce8828f316e93e370da commit | diff
Handle logical slot conflicts on standby
During WAL replay on the standby, when a conflict with a logical slot is
identified, invalidate such slots. There are two sources of conflicts:
1) Using the information added in 6af1793954e, logical slots are invalidated if
required rows are removed
2) wal_level on the primary server is reduced to below logical
Uses the infrastructure introduced in the prior commit. FIXME: add commit
reference.
Change InvalidatePossiblyObsoleteSlot() to use a recovery conflict to
interrupt use of a slot, if called in the startup process. The new recovery
conflict is added to pg_stat_database_conflicts, as confl_active_logicalslot.
See 6af1793954e for an overall design of logical decoding on a standby.
Bumps catversion for the addition of the pg_stat_database_conflicts column.
Bumps PGSTAT_FILE_FORMAT_ID for the same reason.
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20230407075009.igg7be27ha2htkbt@awork3.anarazel.de
Support invalidating replication slots due to horizon and wal_level
author Andres Freund <andres@anarazel.de>
Sat, 8 Apr 2023 05:40:27 +0000 (22:40 -0700)
committer Andres Freund <andres@anarazel.de>
Sat, 8 Apr 2023 05:40:27 +0000 (22:40 -0700)
commit be87200efd9308ccfe217ce8828f316e93e370da
tree f269cd86fdfebf3a15ef3559904f5863caede055 tree
parent 2ed16aacf1af1e1a26bffb121a19d1ad5f5177f0 commit | diff
Support invalidating replication slots due to horizon and wal_level
Needed for logical decoding on a standby. Slots need to be invalidated because
of the horizon if rows required for logical decoding are removed. If the
primary's wal_level is lowered from 'logical', logical slots on the standby
need to be invalidated.
The new invalidation methods will be used in a subsequent commit.
Logical slots that have been invalidated can be identified via the new
pg_replication_slots.conflicting column.
See 6af1793954e for an overall design of logical decoding on a standby.
Bumps catversion for the addition of the new pg_replication_slots column.
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20230407075009.igg7be27ha2htkbt@awork3.anarazel.de