Skip to content

Latest commit

 

History

History
341 lines (250 loc) · 16.4 KB

20230410_07.md

File metadata and controls

341 lines (250 loc) · 16.4 KB

PostgreSQL 16 preview - standby 支持 logical decoding

作者

digoal

日期

2023-04-10

标签

PostgreSQL , PolarDB , standby , 物理从库 , logical decoding


背景

PostgreSQL 16 preview - standby 支持 logical decoding

也就是说, PG 16 可以在物理流复制从库上开启logical decoding. 那么问题来了:

1、在哪里创建逻辑复制slot?

不知道, 猜测可以直接在standby上创建. 因为每个节点的slot信息都是文件形式存储的, 不需要修改catalog信息.

2、如何避免冲突问题: 例如 上游wal_level改成小于logical

如果上游primary节点的wal level改成了minimal或者replica, 那么standby 的logical slot将被设置为invalid, 因为上游产生的wal不足以解析逻辑数据.

3、如何避免冲突问题: 例如 如果 decoding wal pages 需要的catalog被primary节点清掉了怎么办?

standby 的logical slot将被设置为invalid, 因为catalog老的用于decoding wal page的tuple version可能不存在了, 因此不能正确的解析老的wal page.

4、standby 的logical slot的位点信息会不会同步给primary

这个问题不存在, 因为每个节点的slot信息都是文件形式存储的, 没有存储在catalog里面, 所以并不需要把slot位点同步给primary, primary也查不到standby的slot信息.

5、cascading replication的场景: 当物理从库仅仅完成wal receive and flush后, 如果就立即decoding对应的wal, 逻辑上可能存在问题, 怎么解决?

在cascading replication的场景, 物理流复制从库(standby)启动wal sender的时机做了区分: 当wal flush完成后即可启动physical wal sender, 只有当apply完成后才启动logical wal sender.

Physical walsenders can't send data until it's been flushed;
logical walsenders can't decode and send data until it's been applied.
On the standby, the WAL is flushed first, which will only wake up physical walsenders;
and then applied, which will only wake up logical walsenders.

相关patch

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=0fdab27ad68a059a1663fa5ce48d76333f1bd74c

Allow logical decoding on standbys  
author	Andres Freund <andres@anarazel.de>	  
Sat, 8 Apr 2023 09:20:01 +0000 (02:20 -0700)  
committer	Andres Freund <andres@anarazel.de>	  
Sat, 8 Apr 2023 09:20:05 +0000 (02:20 -0700)  
commit	0fdab27ad68a059a1663fa5ce48d76333f1bd74c  
tree	3d29650901130428712f71cbb8b57cb19f738f1a	tree  
parent	e101dfac3a53c20bfbf1ca85d30a368c2954facf	commit | diff  
Allow logical decoding on standbys  
  
Unsurprisingly, this requires wal_level = logical to be set on the primary and  
standby. The infrastructure added in 26669757b6a ensures that slots are  
invalidated if the primary's wal_level is lowered.  
  
Creating a slot on a standby waits for a xl_running_xact record to be  
processed. If the primary is idle (and thus not emitting xl_running_xact  
records), that can take a while.  To make that faster, this commit also  
introduces the pg_log_standby_snapshot() function. By executing it on the  
primary, completion of slot creation on the standby can be accelerated.  
  
Note that logical decoding on a standby does not itself enforce that required  
catalog rows are not removed. The user has to use physical replication slots +  
hot_standby_feedback or other measures to prevent that. If catalog rows  
required for a slot are removed, the slot is invalidated.  
  
See 6af1793954e for an overall design of logical decoding on a standby.  
  
Bumps catversion, for the addition of the pg_log_standby_snapshot() function.  
  
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>  
Author: Andres Freund <andres@anarazel.de> (in an older version)  
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)  
Reviewed-by: Andres Freund <andres@anarazel.de>  
Reviewed-by: FabrÌzio de Royes Mello <fabriziomello@gmail.com>  
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>  
Reviewed-By: Robert Haas <robertmhaas@gmail.com>  

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=e101dfac3a53c20bfbf1ca85d30a368c2954facf

For cascading replication, wake physical and logical walsenders separately  
author	Andres Freund <andres@anarazel.de>	  
Sat, 8 Apr 2023 07:24:24 +0000 (00:24 -0700)  
committer	Andres Freund <andres@anarazel.de>	  
Sat, 8 Apr 2023 08:06:00 +0000 (01:06 -0700)  
commit	e101dfac3a53c20bfbf1ca85d30a368c2954facf  
tree	1a5c755d9eb0a2f6f1d6251d9dd9f185654534cb	tree  
parent	26669757b6a7665c1069e77e6472bd8550193ca6	commit | diff  
For cascading replication, wake physical and logical walsenders separately  
  
Physical walsenders can't send data until it's been flushed; logical  
walsenders can't decode and send data until it's been applied. On the  
standby, the WAL is flushed first, which will only wake up physical  
walsenders; and then applied, which will only wake up logical  
walsenders.  
  
Previously, all walsenders were awakened when the WAL was flushed. That  
was fine for logical walsenders on the primary; but on the standby the  
flushed WAL would have been not applied yet, so logical walsenders were  
awakened too early.  
  
Per idea from Jeff Davis and Amit Kapila.  
  
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>  
Reviewed-By: Jeff Davis <pgsql@j-davis.com>  
Reviewed-By: Robert Haas <robertmhaas@gmail.com>  
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>  
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>  
Discussion: https://postgr.es/m/CAA4eK1+zO5LUeisabX10c81LU-fWMKO4M9Wyg1cdkbW7Hqh6vQ@mail.gmail.com  

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=6af1793954e

Add info in WAL records in preparation for logical slot conflict handling  
author	Andres Freund <andres@anarazel.de>	  
Sun, 2 Apr 2023 19:32:19 +0000 (12:32 -0700)  
committer	Andres Freund <andres@anarazel.de>	  
Sun, 2 Apr 2023 19:32:19 +0000 (12:32 -0700)  
commit	6af1793954e8c5e753af83c3edb37ed3267dd179  
tree	f3f3f39f3d46a6ed6f21b9b9473fcfc93657ff05	tree  
parent	ab73291d263049f00a394b37940ed197616fc3f8	commit | diff  
Add info in WAL records in preparation for logical slot conflict handling  
  
This commit only implements one prerequisite part for allowing logical  
decoding. The commit message contains an explanation of the overall design,  
which later commits will refer back to.  
  
Overall design:  
  
1. We want to enable logical decoding on standbys, but replay of WAL  
from the primary might remove data that is needed by logical decoding,  
causing error(s) on the standby. To prevent those errors, a new replication  
conflict scenario needs to be addressed (as much as hot standby does).  
  
2. Our chosen strategy for dealing with this type of replication slot  
is to invalidate logical slots for which needed data has been removed.  
  
3. To do this we need the latestRemovedXid for each change, just as we  
do for physical replication conflicts, but we also need to know  
whether any particular change was to data that logical replication  
might access. That way, during WAL replay, we know when there is a risk of  
conflict and, if so, if there is a conflict.  
  
4. We can't rely on the standby's relcache entries for this purpose in  
any way, because the startup process can't access catalog contents.  
  
5. Therefore every WAL record that potentially removes data from the  
index or heap must carry a flag indicating whether or not it is one  
that might be accessed during logical decoding.  
  
Why do we need this for logical decoding on standby?  
  
First, let's forget about logical decoding on standby and recall that  
on a primary database, any catalog rows that may be needed by a logical  
decoding replication slot are not removed.  
  
This is done thanks to the catalog_xmin associated with the logical  
replication slot.  
  
But, with logical decoding on standby, in the following cases:  
  
- hot_standby_feedback is off  
- hot_standby_feedback is on but there is no a physical slot between  
  the primary and the standby. Then, hot_standby_feedback will work,  
  but only while the connection is alive (for example a node restart  
  would break it)  
  
Then, the primary may delete system catalog rows that could be needed  
by the logical decoding on the standby (as it does not know about the  
catalog_xmin on the standby).  
  
So, it’s mandatory to identify those rows and invalidate the slots  
that may need them if any. Identifying those rows is the purpose of  
this commit.  
  
Implementation:  
  
When a WAL replay on standby indicates that a catalog table tuple is  
to be deleted by an xid that is greater than a logical slot's  
catalog_xmin, then that means the slot's catalog_xmin conflicts with  
the xid, and we need to handle the conflict. While subsequent commits  
will do the actual conflict handling, this commit adds a new field  
isCatalogRel in such WAL records (and a new bit set in the  
xl_heap_visible flags field), that is true for catalog tables, so as to  
arrange for conflict handling.  
  
The affected WAL records are the ones that already contain the  
snapshotConflictHorizon field, namely:  
  
- gistxlogDelete  
- gistxlogPageReuse  
- xl_hash_vacuum_one_page  
- xl_heap_prune  
- xl_heap_freeze_page  
- xl_heap_visible  
- xl_btree_reuse_page  
- xl_btree_delete  
- spgxlogVacuumRedirect  
  
Due to this new field being added, xl_hash_vacuum_one_page and  
gistxlogDelete do now contain the offsets to be deleted as a  
FLEXIBLE_ARRAY_MEMBER. This is needed to ensure correct alignment.  
It's not needed on the others struct where isCatalogRel has  
been added.  
  
This commit just introduces the WAL format changes mentioned above. Handling  
the actual conflicts will follow in future commits.  
  
Bumps XLOG_PAGE_MAGIC as the several WAL records are changed.  
  
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>  
Author: Andres Freund <andres@anarazel.de> (in an older version)  
Author: Amit Khandekar <amitdkhan.pg@gmail.com>  (in an older version)  
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>  
Reviewed-by: Andres Freund <andres@anarazel.de>  
Reviewed-by: Robert Haas <robertmhaas@gmail.com>  
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>  
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>  

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=26669757b6a7665c1069e77e6472bd8550193ca6

Handle logical slot conflicts on standby  
author	Andres Freund <andres@anarazel.de>	  
Sat, 8 Apr 2023 06:11:28 +0000 (23:11 -0700)  
committer	Andres Freund <andres@anarazel.de>	  
Sat, 8 Apr 2023 07:05:44 +0000 (00:05 -0700)  
commit	26669757b6a7665c1069e77e6472bd8550193ca6  
tree	b60b91e8fa8276744b171a7a5714ffb5b8918f66	tree  
parent	be87200efd9308ccfe217ce8828f316e93e370da	commit | diff  
Handle logical slot conflicts on standby  
  
During WAL replay on the standby, when a conflict with a logical slot is  
identified, invalidate such slots. There are two sources of conflicts:  
1) Using the information added in 6af1793954e, logical slots are invalidated if  
   required rows are removed  
2) wal_level on the primary server is reduced to below logical  
  
Uses the infrastructure introduced in the prior commit. FIXME: add commit  
reference.  
  
Change InvalidatePossiblyObsoleteSlot() to use a recovery conflict to  
interrupt use of a slot, if called in the startup process. The new recovery  
conflict is added to pg_stat_database_conflicts, as confl_active_logicalslot.  
  
See 6af1793954e for an overall design of logical decoding on a standby.  
  
Bumps catversion for the addition of the pg_stat_database_conflicts column.  
Bumps PGSTAT_FILE_FORMAT_ID for the same reason.  
  
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>  
Author: Andres Freund <andres@anarazel.de>  
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)  
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>  
Reviewed-by: Andres Freund <andres@anarazel.de>  
Reviewed-by: Robert Haas <robertmhaas@gmail.com>  
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>  
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>  
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>  
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>  
Discussion: https://postgr.es/m/20230407075009.igg7be27ha2htkbt@awork3.anarazel.de  

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=be87200efd9308ccfe217ce8828f316e93e370da

Support invalidating replication slots due to horizon and wal_level  
author	Andres Freund <andres@anarazel.de>	  
Sat, 8 Apr 2023 05:40:27 +0000 (22:40 -0700)  
committer	Andres Freund <andres@anarazel.de>	  
Sat, 8 Apr 2023 05:40:27 +0000 (22:40 -0700)  
commit	be87200efd9308ccfe217ce8828f316e93e370da  
tree	f269cd86fdfebf3a15ef3559904f5863caede055	tree  
parent	2ed16aacf1af1e1a26bffb121a19d1ad5f5177f0	commit | diff  
Support invalidating replication slots due to horizon and wal_level  
  
Needed for logical decoding on a standby. Slots need to be invalidated because  
of the horizon if rows required for logical decoding are removed. If the  
primary's wal_level is lowered from 'logical', logical slots on the standby  
need to be invalidated.  
  
The new invalidation methods will be used in a subsequent commit.  
  
Logical slots that have been invalidated can be identified via the new  
pg_replication_slots.conflicting column.  
  
See 6af1793954e for an overall design of logical decoding on a standby.  
  
Bumps catversion for the addition of the new pg_replication_slots column.  
  
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>  
Author: Andres Freund <andres@anarazel.de>  
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)  
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>  
Reviewed-by: Andres Freund <andres@anarazel.de>  
Reviewed-by: Robert Haas <robertmhaas@gmail.com>  
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>  
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>  
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>  
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>  
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>  
Discussion: https://postgr.es/m/20230407075009.igg7be27ha2htkbt@awork3.anarazel.de  

digoal's wechat