2.25.2.0-b69

@myang2021 myang2021 tagged this 27 Feb 17:01
Summary:
In parts 1 and 2, I made changes to write the PG-generated invalidation
messages of a DDL statement to the new table `pg_yb_invalidation_messages` at
the same time that the DDL statement increments the catalog version. For each
database, the table `pg_yb_invalidation_messages` maintains a small history of
invalidation messages, one row per catalog version, each with an expiration
time (10 seconds by default). The idea is that as long as one heartbeat
succeeds within the 10-second window, a new catalog version and its associated
invalidation messages will be propagated to tservers, where a longer history of
catalog versions and invalidation messages can be maintained in memory.

This diff adds the contents of the `pg_yb_invalidation_messages` table to the
heartbeat response, alongside the existing contents of the
`pg_yb_catalog_version` table. `pg_yb_invalidation_messages` may become bulky
when there are many databases and/or large invalidation messages, so to reduce
the number of reads, the table is read only when we detect a change in the
`pg_yb_catalog_version` table (via the existing fingerprint mechanism). Only
one read of `pg_yb_invalidation_messages` is attempted after
`pg_yb_catalog_version` is read successfully. So even though the writes to
`pg_yb_catalog_version` and `pg_yb_invalidation_messages` are
atomic/transactional, a heartbeat response may carry the new contents of
`pg_yb_catalog_version` but not those of `pg_yb_invalidation_messages`: this
happens when the read of `pg_yb_catalog_version` succeeds but the single read
of `pg_yb_invalidation_messages` fails.
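
The read path described above can be sketched roughly as follows. This is a
minimal Python model, not the actual master code: `HeartbeatResponder`,
`fingerprint`, the two reader callbacks, and the dictionary keys are all
hypothetical names, and the SHA-256 hash only stands in for the existing
fingerprint mechanism. The sketch also assumes the catalog versions are always
included in the response, which may differ from the real behavior.

```python
import hashlib


def fingerprint(versions: dict) -> str:
    """Stand-in fingerprint: a hash over the sorted catalog version rows."""
    return hashlib.sha256(repr(sorted(versions.items())).encode()).hexdigest()


class HeartbeatResponder:
    """Toy model of the master side building a heartbeat response."""

    def __init__(self, read_versions, read_inval_messages):
        # Both arguments are hypothetical callbacks that read the two tables.
        self.read_versions = read_versions
        self.read_inval_messages = read_inval_messages
        self.last_fingerprint = None

    def build_response(self) -> dict:
        versions = self.read_versions()
        response = {"db_catalog_versions": versions}
        new_fp = fingerprint(versions)
        if new_fp == self.last_fingerprint:
            # No change in pg_yb_catalog_version: skip the bulky read.
            return response
        self.last_fingerprint = new_fp
        try:
            # Exactly one attempt; there is no retry within this heartbeat.
            response["db_catalog_inval_messages"] = self.read_inval_messages()
        except OSError:
            # The response then carries the new versions without messages.
            pass
        return response
```

In this model a failed read leaves `db_catalog_inval_messages` out of the
response while the new catalog versions still go out, which is the partial
case described above.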

For example, if we execute 1 DDL every 11 seconds, then because the expiration time
is 10 seconds by default, each read of the `pg_yb_invalidation_messages` table
returns 1 row for the DB where the DDL is executed: the next DDL's invocation of
`yb_increment_db_catalog_version_with_inval_messages` deletes the previous row,
which has expired by then. On the other hand, if we execute 1 DDL every second, then
each read of `pg_yb_invalidation_messages` returns a history of 10 rows for the
given DB, because when the 11th DDL is executed the 1st row has expired and is
deleted.
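
The two cases above can be reproduced with a small simulation. This is only a
sketch: `InvalMessagesTable`, `run_ddl`, and `history_sizes` are hypothetical
names, time is modeled in whole seconds, and the sketch assumes a row counts as
expired once its age reaches the expiration time and that expired rows are
deleted only when the next DDL runs.

```python
from dataclasses import dataclass, field

EXPIRATION_SECS = 10  # default expiration time mentioned above


@dataclass
class InvalMessagesTable:
    """Toy model of one database's rows in pg_yb_invalidation_messages."""

    rows: list = field(default_factory=list)  # (current_version, message_time)
    current_version: int = 1

    def run_ddl(self, now: int) -> None:
        # Assumption: expired rows are purged as part of the version bump.
        self.rows = [(v, t) for (v, t) in self.rows if now - t < EXPIRATION_SECS]
        self.current_version += 1
        self.rows.append((self.current_version, now))


def history_sizes(interval_secs: int, num_ddls: int) -> list:
    """Row count seen after each DDL when DDLs run every interval_secs."""
    table = InvalMessagesTable()
    sizes = []
    for i in range(num_ddls):
        table.run_ddl(now=i * interval_secs)
        sizes.append(len(table.rows))
    return sizes


print(history_sizes(11, 5))  # one DDL every 11 seconds
print(history_sizes(1, 15))  # one DDL every second
```

With an 11-second interval the history never grows beyond 1 row, and with a
1-second interval it grows to 10 rows and stays there.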

Sample annotated session showing how pg_yb_invalidation_messages is updated along with
pg_yb_catalog_version:

```
yb1$ ./bin/yb-ctl create --rf 1 --tserver_flags TEST_yb_enable_invalidation_messages=true
Creating cluster.
Waiting for cluster to be ready.
----------------------------------------------------------------------------------------------------
| Node Count: 1 | Replication Factor: 1                                                            |
----------------------------------------------------------------------------------------------------
| JDBC                : jdbc:postgresql://127.0.0.1:5433/yugabyte                                  |
| YSQL Shell          : bin/ysqlsh                                                                 |
| YCQL Shell          : bin/ycqlsh                                                                 |
| YEDIS Shell         : bin/redis-cli                                                              |
| Web UI              : http://127.0.0.1:7000/                                                     |
| Cluster Data        : /net/dev-server-myang/share/yugabyte-data                                  |
----------------------------------------------------------------------------------------------------

For more info, please use: yb-ctl status
yb1$ ysqlsh
ysqlsh (15.2-YB-2.25.2.0-b0)
Type "help" for help.

-- initial state
yugabyte=# select * from pg_yb_catalog_version;
 db_oid | current_version | last_breaking_version
--------+-----------------+-----------------------
      1 |               1 |                     1
      4 |               1 |                     1
      5 |               1 |                     1
  13515 |               1 |                     1
  13516 |               1 |                     1
(5 rows)

yugabyte=# select * from pg_yb_invalidation_messages;
 db_oid | current_version | message_time | messages
--------+-----------------+--------------+----------
(0 rows)

-- create table does not increment catalog version so its associated invalidation messages
-- are not inserted into pg_yb_invalidation_messages.
yugabyte=# create table foo(id int);
CREATE TABLE
yugabyte=# select * from pg_yb_catalog_version;
 db_oid | current_version | last_breaking_version
--------+-----------------+-----------------------
      1 |               1 |                     1
      4 |               1 |                     1
      5 |               1 |                     1
  13515 |               1 |                     1
  13516 |               1 |                     1
(5 rows)

yugabyte=# select * from pg_yb_invalidation_messages;
 db_oid | current_version | message_time | messages
--------+-----------------+--------------+----------
(0 rows)

-- each alter table increments catalog version so its associated invalidation messages
-- are inserted into pg_yb_invalidation_messages.
yugabyte=# alter table foo add column id1 int; alter table foo add column id2 int; alter table foo add column id3 int;
ALTER TABLE
ALTER TABLE
ALTER TABLE
yugabyte=# select * from pg_yb_catalog_version;
 db_oid | current_version | last_breaking_version
--------+-----------------+-----------------------
      1 |               1 |                     1
      4 |               1 |                     1
      5 |               1 |                     1
  13515 |               4 |                     1
  13516 |               1 |                     1
(5 rows)

-- each new catalog version has a row with its associated invalidation messages
yugabyte=# select * from pg_yb_invalidation_messages;
 db_oid | current_version | message_time |
                          messages

--------+-----------------+--------------+--------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
------------------
  13515 |               2 |   1739985961 |
\x07700000733e2c00cb3400007651cba7000000000000000006700000733e2c00cb340000f7d687bd000000000
000000037700000733e2c00cb340000465708532d7000000000000036700000733e2c00cb34000021e2d2ca2d70000000000000fe000000733e2c00cb3400000040000
07882f33e01000000
  13515 |               3 |   1739985961 |
\x07700000733e2c00cb340000c370bca3000000000000000006700000733e2c00cb3400006c96b878000000000
000000037700000733e2c00cb340000465708532d7000000000000036700000733e2c00cb34000021e2d2ca2d70000000000000fe000000733e2c00cb3400000040000
07882f33e01000000
  13515 |               4 |   1739985961 |
\x07700000733e2c00cb34000062d2bacf000000000000000006700000733e2c00cb34000089da71bf000000000
000000037700000733e2c00cb340000465708532d7000000000000036700000733e2c00cb34000021e2d2ca2d70000000000000fe000000733e2c00cb3400000040000
07882f33e01000000
(3 rows)

yugabyte=#

```

**Upgrade/Rollback safety:**

The new field `db_catalog_inval_messages_data` in `TSHeartbeatResponsePB` is
optional and will be ignored by an old tserver that does not have it. Its
purpose is the incremental catalog cache refresh optimization, so if it is
ignored, the PG backends spawned by an old tserver simply won't do the
optimization. Likewise, if a new tserver receives a `TSHeartbeatResponsePB`
from an old master, the new field `db_catalog_inval_messages_data` will not be
set and the new tserver just won't do the incremental catalog cache refresh
optimization.
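
As an illustration of why a missing field is safe, the receiving side can fall
back to a full refresh whenever the messages are absent. The sketch below is
hypothetical and is not taken from the tserver code: `plan_refresh` and its
return values are invented names, and the rule that an incremental refresh
needs messages for every intermediate catalog version is an assumption.

```python
from typing import Optional


def plan_refresh(local_version: int,
                 new_version: int,
                 inval_messages: Optional[dict]) -> str:
    """Pick a catalog cache refresh strategy for one database.

    inval_messages maps catalog version -> message bytes, or is None when
    the optional heartbeat field was not set (for example, by an old master).
    """
    if new_version <= local_version:
        return "none"  # already up to date
    if inval_messages is None:
        return "full"  # field absent: no optimization, still correct
    needed = range(local_version + 1, new_version + 1)
    if all(v in inval_messages for v in needed):
        return "incremental"
    return "full"  # a gap in the history also falls back to a full refresh
```

For example, `plan_refresh(1, 4, None)` yields `"full"`, while supplying
messages for versions 2, 3, and 4 yields `"incremental"`.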

Test Plan:
(1) the default
YB_EXTRA_MASTER_FLAGS="--TEST_yb_enable_invalidation_messages=true --log_ysql_catalog_versions=true --vmodule=catalog_manager=2,heartbeater=2,master_heartbeat_service=2,pg_catversions=2" YB_EXTRA_TSERVER_FLAGS="--TEST_yb_enable_invalidation_messages=true --log_ysql_catalog_versions=true --vmodule=heartbeater=2,tablet_server=2,pg_catversions=2" ./yb_build.sh --cxx-test pg_catalog_version-test

(2) with --enable_heartbeat_pg_catalog_versions_cache=true
YB_EXTRA_MASTER_FLAGS="--TEST_yb_enable_invalidation_messages=true --log_ysql_catalog_versions=true --enable_heartbeat_pg_catalog_versions_cache=true --vmodule=catalog_manager=2,heartbeater=2,master_heartbeat_service=2,pg_catversions=2" YB_EXTRA_TSERVER_FLAGS="--TEST_yb_enable_invalidation_messages=true --log_ysql_catalog_versions=true --vmodule=heartbeater=2,tablet_server=2,pg_catversions=2" ./yb_build.sh --cxx-test pg_catalog_version-test

In both cases, check the test logs for lines showing that invalidation messages
are read and set in the heartbeat response, and that they are received on the
tserver side:

```
[m-1] I0219 02:08:38.033061 2652477 master_heartbeat_service.cc:460] vlog2: TSHeartbeat: responding (to ts 8e40c9c3f02943e3a1111264bf74f9f7) db catalog versions: db_catalog_versions { db_oid: 1 current_version: 1 last_breaking_version: 1 } db_catalog_versions { db_oid: 4 current_version: 1 last_breaking_version: 1 } db_catalog_versions { db_oid: 5 current_version: 1 last_breaking_version: 1 } db_catalog_versions { db_oid: 13515 current_version: 1 last_breaking_version: 1 } db_catalog_versions { db_oid: 13516 current_version: 1 last_breaking_version: 1 } db_catalog_versions { db_oid: 16384 current_version: 2 last_breaking_version: 1 }) db inval messages: db_catalog_inval_messages { db_oid: 16384 current_version: 2 message_list: "Pp\000\000qy(\000\000@\000\000@\301\353\n\000\000\000\000\000\000\000\000Op\000\000qy(\000\000@\000\000\305\366\351\242\000\000\000\000\000\000\000\000Pp\000\000qy(\000\000@\000\000G\242S{rp\000\000\000\000\000\000Op\000\000qy(\000\000@\000\000\360\230\2021rp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000J4\027\233rp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\00029X\237rp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000\204\237\234\023rp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\000\002Q}$rp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000\325\031I,rp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\000S+\326Orp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000\014\350L\363rp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\000\361\367\247\350rp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000\354\273\251erp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\000\204\240\0360rp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000?S\313\306rp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\000\023\020\336\274rp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000\307ng\242rp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\000\363\317\236\214rp\000\000\000\
000\000\000\007p\000\000qy(\000\000@\000\000\027\340 \035rp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\000K\243-\036rp\000\000\000\000\000\0007p\000\000qy(\000\000@\000\000FW\010Srp\000\000\000\000\000\0006p\000\000qy(\000\000@\000\000\360\230\2021rp\000\000\000\000\000\000\373\000\000\000qy(\000\000@\000\0000\n\000\000\362I\221\365lU\000\000\373\000\000\000qy(\000\000@\000\0000\n\000\000\362I\221\365lU\000\000\376\000\000\000qy(\000\000@\000\000\000@\000\000\362I\221\365\001\000\000\000\373\000\000\000qy(\000\000@\000\0000\n\000\000\362I\221\365lU\000\000" }

[ts-1] I0219 02:08:38.034925 2652211 heartbeater.cc:561] vlog1: TryHeartbeat: got master db catalog version data: db_catalog_versions { db_oid: 1 current_version: 1 last_breaking_version: 1 } db_catalog_versions { db_oid: 4 current_version: 1 last_breaking_version: 1 } db_catalog_versions { db_oid: 5 current_version: 1 last_breaking_version: 1 } db_catalog_versions { db_oid: 13515 current_version: 1 last_breaking_version: 1 } db_catalog_versions { db_oid: 13516 current_version: 1 last_breaking_version: 1 } db_catalog_versions { db_oid: 16384 current_version: 2 last_breaking_version: 1 } db inval messages: db_catalog_inval_messages { db_oid: 16384 current_version: 2 message_list: "Pp\000\000qy(\000\000@\000\000@\301\353\n\000\000\000\000\000\000\000\000Op\000\000qy(\000\000@\000\000\305\366\351\242\000\000\000\000\000\000\000\000Pp\000\000qy(\000\000@\000\000G\242S{rp\000\000\000\000\000\000Op\000\000qy(\000\000@\000\000\360\230\2021rp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000J4\027\233rp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\00029X\237rp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000\204\237\234\023rp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\000\002Q}$rp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000\325\031I,rp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\000S+\326Orp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000\014\350L\363rp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\000\361\367\247\350rp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000\354\273\251erp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\000\204\240\0360rp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000?S\313\306rp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\000\023\020\336\274rp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000\307ng\242rp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\000\363\317\236\214rp\000\000\000\000\000\000\007p\000\000qy(\000\000@\000\000\027\
340 \035rp\000\000\000\000\000\000\006p\000\000qy(\000\000@\000\000K\243-\036rp\000\000\000\000\000\0007p\000\000qy(\000\000@\000\000FW\010Srp\000\000\000\000\000\0006p\000\000qy(\000\000@\000\000\360\230\2021rp\000\000\000\000\000\000\373\000\000\000qy(\000\000@\000\0000\n\000\000\362I\221\365lU\000\000\373\000\000\000qy(\000\000@\000\0000\n\000\000\362I\221\365lU\000\000\376\000\000\000qy(\000\000@\000\000\000@\000\000\362I\221\365\001\000\000\000\373\000\000\000qy(\000\000@\000\0000\n\000\000\362I\221\365lU\000\000" }

```

Reviewers: hsunder, kfranz, sanketh, mihnea

Reviewed By: hsunder, kfranz

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D42023